Namespaces and the Interface Principle

This article appeared in C++ Report, 11(3), March 1999.
Use namespaces wisely. If you put a class into a namespace, be sure to put all helper functions and operators into the same namespace too. If you don't, you may discover surprising effects in your code.

The Interface Principle Revisited

Exactly one year ago, in the March 1998 issue,[1] I dedicated the entire column to proposing and explaining the Interface Principle:
In passing, that column showed why the "supplied with" requirement could be usefully interpreted as meaning "supplied in the same header or namespace with." It also described Koenig name lookup in detail, and showed that Koenig lookup had to operate the way it does because of the Interface Principle.

This column goes further, and shows why you should always put helper functions, especially operators, in the same namespace as a class. That's a simple rule, but it's not entirely obvious why it's necessary, so I'll begin with a motivating example to demonstrate a hidden pitfall that comes from the way C++ namespaces and name lookup interact.

Consider the following simple program, based on code e-mailed to me by Astute Reader Darin Adler. It supplies a class C, in namespace N, and an operation on that class. Notice that the operator+ is in the global namespace, not in namespace N. Does that matter? Isn't the code valid as written anyway?
Before reading on, stop and consider: Will this program compile?[2] Is it portable?

Digression: Recap of a Familiar Inheritance Issue

For a moment, let's put Example 1 aside and consider what looks like (but isn't) a completely unrelated example:

// Example 2a: Hiding a name
struct D : public B {
D d;

Most of us should be used to seeing this kind of name hiding, although the fact that the last line won't compile surprises most new C++ programmers. In short, when we declare a function named g in the derived class D, it automatically hides all functions with the same name in all direct and indirect base classes. It doesn't matter a whit that D::g "obviously" can't be the function that the programmer meant to call (not only does D::g have the wrong signature, but it's private and therefore inaccessible to boot), because B::g is hidden and can't be considered by name lookup.

To see what's really going on, let's look in a little more detail at what the compiler does when it encounters the function call d.g(i). First, it looks in the immediate scope, in this case the scope of class D, and makes a list of all functions it can find that are named g (regardless of whether they're accessible or even take the right number of parameters). Only if it doesn't find any at all does it then continue "outward" into the next enclosing scope and repeat--in this case, the scope of the base class B--until eventually it either runs out of scopes without having found a function with the right name or else finds a scope that contains at least one candidate function. If a scope is found that has one or more candidate functions, the compiler then stops searching and works with the candidates that it's found, performing overload resolution and applying access rules.
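The lookup rule just described can be observed in a small self-contained sketch. The class and function names below are illustrative assumptions, not the article's Example 2a; unlike that example, the derived-class g here is public and callable, so the code compiles and shows which function the compiler settles on.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class: g(int) would be an exact match for an int argument.
struct Base {
    std::string g(int) { return "Base::g(int)"; }
};

// Hypothetical derived class: declaring any g at all hides every Base::g.
struct Derived : public Base {
    std::string g(double) { return "Derived::g(double)"; }
};

// Lookup for g starts in Derived, finds a candidate there, and stops.
// Base::g(int) is never considered, so the int argument is converted to double.
inline std::string which_g_is_called() {
    Derived d;
    int i = 1;
    return d.g(i);
}

// Small self-check that can be called from a test or from main().
inline void check_name_hiding() {
    assert(which_g_is_called() == "Derived::g(double)");
}
```

Even though Base::g(int) is the better match on parameter types alone, it is never a candidate, because the search ended in the first scope that contained a function named g.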
There are very good reasons why the language must work this way.[3] To take the extreme case, it makes intuitive sense that a member function that's a near-exact match ought to be preferred over a global function that would have been a perfect match had we considered the parameter types only.

Of course, there are the two usual ways around the name-hiding problem in Example 2a. First, the calling code can simply say which one it wants, and force the compiler to look in the right scope:

// Example 2b: Asking for a name

Second, and usually more appropriate, the designer of class D can make B::g visible with a using-declaration. This allows the compiler to consider B::g in the same scope as D::g for the purposes of name lookup and subsequent overload resolution:

// Example 2c: Un-hiding a name

Either of these gets around the hiding problem in the original Example 2a code.

Back On Topic: Name Hiding in Nested Namespaces

So, with that example behind us, let's go back and tackle Example 1 again:

// Example 1: Will this compile?
// in some library header
// a mainline to exercise it

Well, at first glance, it sure looks legal. But will it compile? The answer is probably surprising: Maybe it will compile, or maybe not. It depends entirely on your implementation, and I know of standard-conforming implementations that will compile this program correctly and equally standard-conforming implementations that won't. Gather 'round, and I'll show you why.

The key to understanding the answer is understanding what the compiler has to do inside std::accumulate. The std::accumulate template looks something like this:

namespace std {

The code in Example 1 actually calls std::accumulate<N::C*,int>. In line 1 above, how should the compiler interpret the expression value + *first? Well, it's got to look for an operator+ that takes an int and an N::C (or parameters that can be converted to int and N::C). Hey, it just so happens that we have just such an operator+(int,N::C) at global scope!
Look, there it is! Cool. So everything must be fine, right?

The problem is that the compiler may or may not be able to see the operator+(int,N::C) at global scope, depending on what other functions have already been seen to be declared in namespace std at the point where std::accumulate<N::C*,int> is instantiated.

To see why, consider that the same name hiding that we observed with derived classes happens with any nested scopes, including namespaces, and consider where the compiler starts looking for a suitable operator+. (Now I'm going to reuse my explanation from the previous section, only with a few names substituted:) First, it looks in the immediate scope, in this case the scope of namespace std, and makes a list of all functions it can find that are named operator+ (regardless of whether they're accessible or even take the right number of parameters). Only if it doesn't find any at all does it then continue "outward" into the next enclosing scope and repeat--in this case, the scope of the next enclosing namespace outside std, which happens to be the global scope--until eventually it either runs out of scopes without having found a function with the right name or else finds a scope that contains at least one candidate function. If a scope is found that has one or more candidate functions, the compiler then stops searching and works with the candidates that it's found, performing overload resolution and applying access rules.

In short, whether Example 1 will compile depends entirely on whether this implementation's version of the standard header numeric:

a) declares an operator+ (any operator+, suitable or not, accessible or not); or
b) includes any other standard header that does so.

Unlike Standard C, Standard C++ does not specify which standard headers may or may not include each other, so when you include numeric you may or may not get header iterator too, for example, which does define several operator+ functions.
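The same effect can be reproduced without relying on any particular standard library. The sketch below is an illustration under stated assumptions: the namespaces quiet and noisy, the type Unrelated, and the function add_to are hypothetical names standing in for "a namespace that declares no operator+" and "a namespace that happens to declare one". It uses ordinary functions rather than a template in std, so it is not a reconstruction of Example 1 itself.

```cpp
#include <cassert>

namespace N {
    // Hypothetical stand-in for the article's class C.
    struct C { int v; };
}

// As in Example 1, the operator is at global scope rather than in N.
inline int operator+(int i, const N::C& c) { return i + c.v; }

namespace quiet {
    // No operator+ is declared in this namespace, so lookup continues
    // outward to the global scope and finds ::operator+(int, const N::C&).
    inline int add_to(int value, const N::C& c) { return value + c; }
}

namespace noisy {
    struct Unrelated {};
    // Any operator+ at all in this namespace is enough to stop the search here.
    Unrelated operator+(Unrelated, Unrelated);

    // The identical function body would not compile in this namespace:
    // ordinary lookup stops at noisy::operator+, and argument-dependent
    // lookup searches N, which contains no operator+.
    // inline int add_to(int value, const N::C& c) { return value + c; }
}

// Small self-check that can be called from a test or from main().
inline void check_namespace_hiding() {
    assert(quiet::add_to(1, N::C{41}) == 42);
}
```

The only difference between the two namespaces is the presence of an unrelated operator+, which is exactly the kind of difference that varies from one standard library implementation to another.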
I know of C++ products that won't compile Example 1, others that will compile Example 1 but balk once you add the line #include <vector>, and so on.

Some Fun With Compilers

It's bad enough that the compiler can't find the right function if there happens to be another operator+ in the way, but typically the operator+ that does get encountered in a standard header is a template, and compilers generate notoriously difficult-to-read error messages when templates are involved. For example, one popular implementation reports the following errors when compiling Example 1 (note that in this implementation the header numeric does in fact include the header iterator):
Imagine the poor programmer's confusion:

o The first error message is unreadable. The compiler is merely complaining (as clearly as it can) that it did find an operator+ but can't figure out how to use it in an appropriate way, but that doesn't help the poor programmer. "Huh?" saith the programmer, scratching his forelock, "when did I ever ask for a reverse_iterator anywhere?"

o The second message is a flagrant lie, and it's the compiler vendor's fault (although perhaps an understandable mistake, because the message was probably right in most of the cases where it came up before people began to use namespaces widely). It's close to the correct message "no operator found which takes...," but that doesn't help the poor programmer either. "Huh?" saith the programmer, indignant with ire, "there is too a global operator defined that takes type 'class N::C'!"

How is a mortal programmer ever to decipher what's going wrong here? And, once he does, how loudly is he likely to curse the author of class N::C? Best to avoid the problem completely, as we shall now see.

The Solution

When we encountered this problem in the familiar guise of base/derived name hiding, we had two possible solutions: either have the calling code explicitly say which function it wants (Example 2b), or write a using-declaration to make the desired function visible in the right scope (Example 2c). Neither solution works in this case; the first is possible[4] but places an unacceptable burden on the programmer, while the second is impossible.

The real solution is to put our operator+ where it has always truly belonged and should have been put in the first place: in namespace N.

// Example 1b: Solution
// in some library header
// a mainline to exercise it

This code is portable and will compile on all conforming compilers, regardless of what happens to be already defined in std or any other namespace.
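A minimal sketch of this arrangement follows. The member v, its default value, and the function name sum_ten are assumptions made for illustration; the article's own Example 1b listing may differ in detail.

```cpp
#include <cassert>
#include <numeric>

namespace N {
    // Hypothetical stand-in for the article's class C.
    struct C {
        int v = 1;
    };

    // The operator lives in the same namespace as C, so argument-dependent
    // (Koenig) lookup finds it from inside std::accumulate.
    inline int operator+(int i, const C& c) { return i + c.v; }
}

// Sums ten default-constructed C objects, each holding the value 1.
inline int sum_ten() {
    N::C a[10];
    return std::accumulate(a, a + 10, 0);
}

// Small self-check that can be called from a test or from main().
inline void check_sum() {
    assert(sum_ten() == 10);
}
```

Because the operator is found through the namespace of its N::C argument, the result does not depend on which operator+ declarations the implementation's headers happen to place in std.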
Now that the operator+ is in the same namespace as the second parameter, when the compiler tries to resolve the "+" call inside std::accumulate it is able to see the right operator+ because of Koenig lookup. Recall that Koenig lookup says that, in addition to looking in all the usual scopes, the compiler shall also look in the scopes of the function's parameter types to see if it can find a match. N::C is in namespace N, so the compiler looks in namespace N, and happily finds exactly what it needs no matter how many other operator+'s happen to be lying around and cluttering up namespace std.

Conclusion

The problem arose because Example 1 did not follow the Interface Principle:
If an operation, even a free function (and especially an operator), mentions a class and is intended to form part of the interface of a class, then always be sure to supply it with the class--which means, among other things, to put it in the same namespace as the class.

The problem in Example 1 arose because we wrote a class C and put part of its interface in a different namespace. Making sure that the class and the interface stay together is The Right Thing To Do in any case, and is a simple way of avoiding complex name lookup problems later on when other people try to use your class.

Use namespaces wisely. Either put all of a class inside the same namespace -- including things that to innocent eyes don't look like they're part of the class, such as free functions that mention the class -- or don't put the class in a namespace at all. Your users will thank you.
Notes

1. H. Sutter. "What's In a Class?" (C++ Report, 10(3), Mar 1998).

2. In case you're wondering whether there might be a potential portability problem depending on whether the implementation of std::accumulate invokes operator+(int,N::C) or operator+(N::C,int), there isn't; the standard says that it must be the former, so Example 1 is providing an operator+ with the correct signature.

3. For example, you might think that, if none of the functions found in an inner scope were usable, then it could be okay to let the compiler start searching further enclosing scopes; that would, however, produce surprising results in some cases (consider the case where there's a function that would be an exact match in an outer scope, but there's a function in an inner scope that's a close match requiring only a few parameter conversions). Or, you might think that the compiler should just make a list of all functions with the required name in all scopes, and then perform overload resolution across scopes; but, alas, that too has its pitfalls (consider that a member function ought to be preferred over a global function, rather than result in a possible ambiguity).

4. By requiring the programmer to use the version of std::accumulate that takes a predicate and explicitly say which one he wants each time... a good way to lose customers.
Copyright © 2009 Herb Sutter