Namespaces and the Interface Principle

This is the original article substantially as first published. See the book Exceptional C++ (Addison-Wesley, 2000) for the most current version of this article. The versions in the book have been revised and expanded since their initial appearance in print. The book versions also incorporate corrections, new material, and conformance to the final ANSI/ISO C++ standard.

This article appeared in C++ Report, 11(3), March 1999.

Use namespaces wisely. If you put a class into a namespace, be sure to put all helper functions and operators into the same namespace too. If you don't, you may discover surprising effects in your code.

The Interface Principle Revisited

Exactly one year ago, in the March 1998 issue,^[1] I dedicated the entire column to proposing and explaining the Interface Principle:

The Interface Principle
For a class X, all functions, including free functions, that both
(a) "mention" X, and
(b) are "supplied with" X
are logically part of X, because they form part of the interface of X.

In passing, that column showed why the "supplied with" requirement could be usefully interpreted as meaning "supplied in the same header or namespace with." It also described Koenig name lookup in detail, and showed that Koenig lookup had to operate the way it does because of the Interface Principle.

This column goes further, and shows why you should always put helper functions, especially operators, in the same namespace as a class. That's a simple rule, but it's not entirely obvious why it's necessary, so I'll begin with a motivating example to demonstrate a hidden pitfall that comes from the way C++ namespaces and name lookup interact.

Consider the following simple program, based on code e-mailed to me by Astute Reader Darin Adler. It supplies a class C, in namespace N, and an operation on that class. Notice that the operator+ is in the global namespace, not in namespace N. Does that matter? Isn't the code valid as written anyway?

// Example 1: Will this compile?
//

// in some library header
namespace N { class C {}; }
int operator+(int i, N::C) { return i+1; }

// a mainline to exercise it
#include <numeric>
int main() {
  N::C a[10];
  std::accumulate(a, a+10, 0);
}

Before reading on, stop and consider: Will this program compile?^[2] Is it portable?

Digression: Recap of a Familiar Inheritance Issue

For a moment, let's put Example 1 aside and consider what looks like (but isn't) a completely unrelated example:

// Example 2a: Hiding a name
//             from a base class
//
#include <string>

struct B {
  int f( int );
  int f( double );
  int g( int );
};

struct D : public B {
private:
  int g( std::string, bool );
};

D   d;
int i;
d.f(i);    // ok, means B::f(int)
d.g(i);    // error: g takes 2 args

Most of us should be used to seeing this kind of name hiding, although the fact that the last line won't compile surprises most new C++ programmers. In short, when we declare a function named g in the derived class D, it automatically hides all functions with the same name in all direct and indirect base classes. It doesn't matter a whit that D::g "obviously" can't be the function that the programmer meant to call (not only does D::g have the wrong signature, but it's private and therefore inaccessible to boot), because B::g is hidden and can't be considered by name lookup.

To see what's really going on, let's look in a little more detail at what the compiler does when it encounters the function call d.g(i). First, it looks in the immediate scope, in this case the scope of class D, and makes a list of all functions it can find that are named g (regardless of whether they're accessible or even take the right number of parameters). Only if it doesn't find any does it then continue "outward" into the next enclosing scope and repeat--in this case, the scope of the base class B--until eventually it either runs out of scopes without having found a function with the right name or else finds a scope that contains at least one candidate function. If a scope is found that has one or more candidate functions, the compiler then stops searching and works with the candidates that it's found, performing overload resolution and applying access rules.

There are very good reasons why the language must work this way.^[3] To take the extreme case, it makes intuitive sense that a member function that's a near-exact match ought to be preferred over a global function that would have been a perfect match had we considered the parameter types only.
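The "extreme case" can be illustrated with a minimal sketch. The names f, Widget, call_f, and demo_member_preferred are hypothetical and appear nowhere in the original examples; they exist only to show that lookup stops at the class scope even though the global function matches the argument exactly.

```cpp
// Hypothetical names, for illustration only.
int f(int) { return 1; }             // global: exact match for f(42)

struct Widget {
    int f(double) { return 2; }      // member: needs an int -> double conversion
    int call_f() { return f(42); }   // lookup finds Widget::f in class scope and stops
};

int demo_member_preferred() {
    Widget w;
    return w.call_f();               // calls Widget::f(double), so the result is 2
}
```

Because the search ends as soon as one scope yields a function named f, the global f(int) never becomes a candidate.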

Of course, there are the two usual ways around the name-hiding problem in Example 2a. First, the calling code can simply say which one it wants, and force the compiler to look in the right scope:

// Example 2b: Asking for a name
//             from a base class
//
D   d;
int i;
d.f(i);    // ok, means B::f(int)
d.B::g(i); // ok, asks for B::g(int)

Second, and usually more appropriate, the designer of class D can make B::g visible with a using-declaration. This allows the compiler to consider B::g in the same scope as D::g for the purposes of name lookup and subsequent overload resolution:

// Example 2c: Un-hiding a name
//             from a base class
//
struct D : public B {
  using B::g;
private:
  int g( std::string, bool );
};

Either of these gets around the hiding problem in the original Example 2a code.
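For readers who want to try both workarounds side by side, here is a self-contained sketch. It assumes trivial function bodies that return distinct numbers, renames the using-declaration variant to D2 so that both derived classes can coexist, and wraps the calls in two hypothetical helper functions; none of that is in the original examples.

```cpp
#include <string>

struct B {
    int f(int)    { return 1; }
    int f(double) { return 2; }
    int g(int)    { return 3; }
};

// As in Example 2a: D::g hides B::g.
struct D : public B {
private:
    int g(std::string, bool) { return 4; }
};

// As in Example 2c (renamed D2 here): the using-declaration un-hides B::g.
struct D2 : public B {
    using B::g;
private:
    int g(std::string, bool) { return 4; }
};

// Example 2b style: the caller names the base-class scope explicitly.
int call_qualified() { D d;  int i = 0; return d.B::g(i); }

// Example 2c style: B::g(int) now takes part in overload resolution.
int call_unhidden()  { D2 d; int i = 0; return d.g(i); }
```

Both helpers end up in B::g(int) and return 3; only the place where the fix is written differs (the call site versus the class definition).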

Back On Topic: Name Hiding in Nested Namespaces

So, with that example behind us, let's go back and tackle Example 1 again:

// Example 1: Will this compile?
//

// in some library header
namespace N { class C {}; }
int operator+(int i, N::C) { return i+1; }

// a mainline to exercise it
#include <numeric>
int main() {
  N::C a[10];
  std::accumulate(a, a+10, 0);
}

Well, at first glance, it sure looks legal. But will it compile? The answer is probably surprising: Maybe it will compile, or maybe not. It depends entirely on your implementation, and I know of standard-conforming implementations that will compile this program correctly and equally standard-conforming implementations that won't. Gather 'round, and I'll show you why.

The key to understanding the answer is understanding what the compiler has to do inside std::accumulate. The std::accumulate template looks something like this:

namespace std {
template<class Iter, class T>
inline T accumulate( Iter first,
                       Iter last,
                       T    value ) {
    while( first != last ) {
      value = value + *first;   // 1
      ++first;
    }
    return value;
}
}

The code in Example 1 actually calls std::accumulate<N::C*,int>. In line 1 above, how should the compiler interpret the expression value + *first? Well, it's got to look for an operator+ that takes an int and an N::C (or parameters that can be converted to int and N::C). Hey, it just so happens that we have just such an operator+(int,N::C) at global scope! Look, there it is! Cool. So everything must be fine, right?

The problem is that the compiler may or may not be able to see the operator+(int,N::C) at global scope, depending on what other functions have already been seen to be declared in namespace std at the point where std::accumulate<N::C*,int> is instantiated.

To see why, consider that the same name hiding that we observed with derived classes happens with any nested scopes, including namespaces, and consider where the compiler starts looking for a suitable operator+. (Now I'm going to reuse my explanation from the previous section, only with a few names substituted:) First, it looks in the immediate scope, in this case the scope of namespace std, and makes a list of all functions it can find that are named operator+ (regardless of whether they're accessible or even take the right number of parameters). Only if it doesn't find any does it then continue "outward" into the next enclosing scope and repeat--in this case, the scope of the next enclosing namespace outside std, which happens to be the global scope--until eventually it either runs out of scopes without having found a function with the right name or else finds a scope that contains at least one candidate function. If a scope is found that has one or more candidate functions, the compiler then stops searching and works with the candidates that it's found, performing overload resolution and applying access rules.
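The same hiding can be reproduced without touching namespace std. The sketch below uses a hypothetical namespace Lib as a stand-in for std and an ordinary function h instead of operator+, so that it compiles everywhere and the effect shows up as a return value rather than as an error.

```cpp
// Hypothetical names (h, Lib, call_h), for illustration only.
int h(int) { return 1; }              // global scope: exact match for h(42)

namespace Lib {                       // stand-in for std
    int h(double) { return 2; }       // any h declared here hides ::h inside Lib
    int call_h() { return h(42); }    // lookup stops in Lib and never reaches ::h
}

int demo_namespace_hiding() {
    return Lib::call_h();             // calls Lib::h(double), so the result is 2
}
```

Had Lib::h taken some unrelated class type instead of double, the call would fail to compile rather than fall back to ::h, which is the situation Example 1 runs into with operator+.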

In short, whether Example 1 will compile depends entirely on whether this implementation's version of the standard header numeric: a) declares an operator+ (any operator+, suitable or not, accessible or not); or b) includes any other standard header that does so. Unlike Standard C, Standard C++ does not specify which standard headers may or may not include each other, so when you include numeric you may or may not get header iterator too, for example, which does define several operator+ functions. I know of C++ products that won't compile Example 1, others that will compile Example 1 but balk once you add the line #include <vector>, and so on.

Some Fun With Compilers

It's bad enough that the compiler can't find the right function if there happens to be another operator+ in the way, but typically the operator+ that does get encountered in a standard header is a template, and compilers generate notoriously difficult-to-read error messages when templates are involved. For example, one popular implementation reports the following errors when compiling Example 1 (note that in this implementation the header numeric does in fact include the header iterator):

error C2784: 'class std::reverse_iterator<`template-parameter-1', `template-parameter-2', `template-parameter-3', `template-parameter-4', `template-parameter-5'> __cdecl std::operator +(template-parameter-5, const class std::reverse_iterator< `template-parameter-1', `template-parameter-2', `template-parameter-3', `template-parameter-4', `template-parameter-5'>&)' : could not deduce template argument for 'template-parameter-5' from 'int'
error C2677: binary '+' : no global operator defined which takes type 'class N::C' (or there is no acceptable conversion)

Imagine the poor programmer's confusion:

o The first error message is unreadable. The compiler is merely complaining (as clearly as it can) that it did find an operator+ but can't figure out how to use it in an appropriate way, but that doesn't help the poor programmer. "Huh?" saith the programmer, scratching his forelock, "when did I ever ask for a reverse_iterator anywhere?"

o The second message is a flagrant lie, and it's the compiler vendor's fault (although perhaps an understandable mistake, because the message was probably right in most of the cases where it came up before people began to use namespaces widely). It's close to the correct message "no operator found which takes...," but that doesn't help the poor programmer either. "Huh?" saith the programmer, indignant with ire, "there is too a global operator defined that takes type 'class N::C'!"

How is a mortal programmer ever to decipher what's going wrong here? And, once he does, how loudly is he likely to curse the author of class N::C? Best to avoid the problem completely, as we shall now see.

The Solution

When we encountered this problem in the familiar guise of base/derived name hiding, we had two possible solutions: either have the calling code explicitly say which function it wants (Example 2b), or write a using-declaration to make the desired function visible in the right scope (Example 2c). Neither solution works well in this case; the first is possible^[4] but places an unacceptable burden on the programmer, while the second is impossible, because we are not allowed to add declarations of our own to namespace std.
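To make the burden of the first option concrete, here is a sketch of what note 4 describes: calling the four-argument form of std::accumulate and saying explicitly which operator+ is wanted. The function object GlobalPlus and the helper sum_with_predicate are hypothetical names introduced for this illustration.

```cpp
#include <numeric>

namespace N { class C {}; }
int operator+(int i, N::C) { return i + 1; }   // still at global scope, as in Example 1

// Hypothetical function object that names the global operator explicitly.
struct GlobalPlus {
    int operator()(int i, N::C c) const { return ::operator+(i, c); }
};

int sum_with_predicate() {
    N::C a[10];
    // The fourth argument replaces the "value + *first" expression inside
    // std::accumulate, so no operator+ lookup happens in namespace std.
    return std::accumulate(a, a + 10, 0, GlobalPlus());
}
```

This works (the sum is 10), but every caller would have to write or find something like GlobalPlus at every call site.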

The real solution is to put our operator+ where it has always truly belonged and should have been put in the first place: in namespace N.

// Example 1b: Solution
//

// in some library header
namespace N {
  class C {};
  int operator+(int i, N::C) { return i+1; }
}

// a mainline to exercise it
#include <numeric>
int main() {
  N::C a[10];
  std::accumulate(a, a+10, 0); // now ok
}

This code is portable and will compile on all conforming compilers, regardless of what happens to be already defined in std or any other namespace. Now that the operator+ is in the same namespace as the second parameter, when the compiler tries to resolve the "+" call inside std::accumulate it is able to see the right operator+ because of Koenig lookup. Recall that Koenig lookup says that, in addition to looking in all the usual scopes, the compiler shall also look in the scopes of the function's parameter types to see if it can find a match. N::C is in namespace N, so the compiler looks in namespace N, and happily finds exactly what it needs no matter how many other operator+'s happen to be lying around and cluttering up namespace std.
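The effect of Koenig lookup can be checked in isolation with a sketch that does not depend on what any particular standard library header declares. Here the hypothetical namespace Lib stands in for std: it deliberately contains an unrelated operator+ (on a made-up type Noise) and a hand-written accumulate modeled on the one shown earlier.

```cpp
namespace N {
    class C {};
    int operator+(int i, C) { return i + 1; }   // lives with C, as in Example 1b
}

namespace Lib {                                  // hypothetical stand-in for std
    struct Noise {};
    int operator+(Noise, Noise) { return 0; }    // unrelated operator+ "in the way"

    template<class Iter, class T>
    T accumulate(Iter first, Iter last, T value) {
        while (first != last) {
            // Ordinary lookup stops at Lib::operator+(Noise, Noise), but
            // Koenig lookup also searches namespace N because *first is an N::C.
            value = value + *first;
            ++first;
        }
        return value;
    }
}

int sum_with_koenig_lookup() {
    N::C a[10];
    return Lib::accumulate(a, a + 10, 0);        // ten elements, each adds 1
}
```

The clutter in Lib is harmless here: N::operator+(int, C) joins the candidate set through the argument type and is the only viable match, so the sum comes out as 10.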

Conclusion

The problem arose because Example 1 did not follow the Interface Principle:

The Interface Principle
For a class X, all functions, including free functions, that both
(a) "mention" X, and
(b) are "supplied with" X
are logically part of X, because they form part of the interface of X.

If an operation, even a free function (and especially an operator) mentions a class and is intended to form part of the interface of a class, then always be sure to supply it with the class--which means, among other things, to put it in the same namespace as the class. The problem in Example 1 arose because we wrote a class C and put part of its interface in a different namespace. Making sure that the class and the interface stay together is The Right Thing To Do in any case, and is a simple way of avoiding complex name lookup problems later on when other people try to use your class.

Use namespaces wisely. Either put all of a class inside the same namespace -- including things that to innocent eyes don't look like they're part of the class, such as free functions that mention the class -- or don't put the class in a namespace at all. Your users will thank you.

Notes

1. H. Sutter. "What's In a Class?" (C++ Report, 10(3), Mar 1998).

2. In case you're wondering whether there might be a potential portability problem depending on whether the implementation of std::accumulate invokes operator+(int,N::C) or operator+(N::C,int), there isn't; the standard says that it must be the former, so Example 1 is providing an operator+ with the correct signature.

3. For example, you might think that, if none of the functions found in an inner scope were usable, then it could be okay to let the compiler start searching further enclosing scopes; that would, however, produce surprising results in some cases (consider the case where there's a function that would be an exact match in an outer scope, but there's a function in an inner scope that's a close match requiring only a few parameter conversions). Or, you might think that the compiler should just make a list of all functions with the required name in all scopes, and then perform overload resolution across scopes; but, alas, that too has its pitfalls (consider that a member function ought to be preferred over a global function, rather than result in a possible ambiguity).

4. By requiring the programmer to use the version of std::accumulate that takes a predicate and explicitly say which one he wants each time... a good way to lose customers.

Copyright © 2009 Herb Sutter