Uses and Abuses of Inheritance, Part 1

This article appeared in C++ Report, 10(9), October 1998.
Inheritance is often overused, even by experienced developers. Always minimize coupling: If a class relationship can be expressed in more than one way, use the weakest relationship that's practical. Given that inheritance is nearly the strongest relationship you can express in C++ (second only to friendship), it's only really appropriate when there is no equivalent weaker alternative.

In this column, the spotlight is on private inheritance, and one real (if obscure) use for protected inheritance. In the following column, I'll start by covering public inheritance, and then bring things together by discussing some multiple-inheritance issues and techniques.

A Motivating Example

Here's an example to help illustrate some of the issues. The following template provides list-management functions, including the ability to manipulate list elements at specific list locations:

// Example 1

Consider the following code, which shows two ways to write a MySet class in terms of MyList. Assume that all important elements are shown:

// Example 1(a)

Before reading on, give these alternatives some thought, and consider these questions:

1. Is there any difference between MySet1 and MySet2?
2. More generally, what is the difference between nonpublic inheritance and containment?
3. Which version of MySet would you prefer -- MySet1 or MySet2?

Nonpublic Inheritance vs. Containment

The answer to Question 1 is straightforward: There is no substantial difference between MySet1 and MySet2. They are functionally identical.

Question 2 gets us right down to business:

o Nonpublic inheritance should always express IS-IMPLEMENTED-IN-TERMS-OF (with only one rare exception, which I'll cover shortly). It makes the using class depend upon the public and protected parts of the used class.
o Containment always expresses HAS-A and, therefore, IS-IMPLEMENTED-IN-TERMS-OF. It makes the using class depend upon only the public parts of the used class.
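The two alternatives compared above can be made concrete with a small sketch. It is not the column's original listing: the member names Insert, Access, Add, Get and Size, and the vector-backed storage, are assumptions chosen to match the functions the text mentions. MySet1 uses private inheritance and MySet2 uses containment; to client code they should look the same.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed stand-in for MyList<T>; the interface is illustrative only.
template <class T>
class MyList {
public:
    bool Insert(const T& value, std::size_t index) {
        if (index > items_.size()) return false;
        items_.insert(items_.begin() + static_cast<std::ptrdiff_t>(index), value);
        return true;
    }
    T Access(std::size_t index) const { return items_.at(index); }
    std::size_t Size() const { return items_.size(); }
private:
    std::vector<T> items_;
};

// Variant 1: IS-IMPLEMENTED-IN-TERMS-OF via private inheritance.
template <class T>
class MySet1 : private MyList<T> {
public:
    bool Add(const T& value) {
        for (std::size_t i = 0; i < this->Size(); ++i)
            if (this->Access(i) == value) return false;  // already present
        return this->Insert(value, this->Size());
    }
    T Get(std::size_t index) const { return this->Access(index); }
    using MyList<T>::Size;  // a using-declaration exposes Size unchanged
};

// Variant 2: IS-IMPLEMENTED-IN-TERMS-OF via containment.
template <class T>
class MySet2 {
public:
    bool Add(const T& value) {
        for (std::size_t i = 0; i < impl_.Size(); ++i)
            if (impl_.Access(i) == value) return false;  // already present
        return impl_.Insert(value, impl_.Size());
    }
    T Get(std::size_t index) const { return impl_.Access(index); }
    std::size_t Size() const { return impl_.Size(); }  // forwarding function
private:
    MyList<T> impl_;
};

// Self-check: both variants behave identically from the caller's side.
inline void CheckSetsBehaveAlike() {
    MySet1<int> a;
    MySet2<int> b;
    assert(a.Add(7) && b.Add(7));
    assert(!a.Add(7) && !b.Add(7));  // duplicates are rejected
    assert(a.Size() == 1 && b.Size() == 1);
    assert(a.Get(0) == 7 && b.Get(0) == 7);
}
```

In this sketch the only visible difference between the two is how Size is exposed: a using-declaration in MySet1, a one-line forwarding function in MySet2.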
It's easy to show that inheritance is a superset of single containment -- that is, there's nothing we can do with a single MyList<T> member that we couldn't do if we inherited from MyList<T>. Of course, using inheritance does limit us to having just one MyList<T> (as a subobject); if we needed to have multiple instances of MyList<T>, we would have to use containment instead.

That being the case, what are the extra things we can do if we use inheritance that we can't do if we use containment? In other words, why use nonpublic inheritance? Here are five reasons, in rough order from most to least common. Interestingly, the final item points out a useful(?) application of protected inheritance:

o We need access to a protected member. This applies to protected member functions[1] in general, and to protected constructors in particular.
o We need to override a virtual function. This is one of inheritance's classic raisons d'être.[2] Often we want to override in order to customize the used class' behaviour. Sometimes, however, there's no other choice: If the used class is abstract -- that is, it has at least one pure virtual function that has not yet been overridden -- we must inherit and override because we can't instantiate directly.
o We need to construct the used object before, or destroy it after, another base subobject. If the slightly longer object lifetime matters, there's no way to get it other than using inheritance. This can be necessary when the used class provides a lock of some sort, such as a critical section or a database transaction, which must cover the entire lifetime of another base subobject.
o We need to share a common virtual base class, or override the construction of a virtual base class. The first part applies if the using class has to inherit from one of the same virtual bases as the used class.
If it does not, the second part may still apply: The most-derived class is responsible for initializing all virtual base classes, and so if we need to use a different constructor or different constructor parameters for a virtual base, then we must inherit.

There is one additional feature we can get using nonpublic inheritance, and it's the only one that doesn't model IS-IMPLEMENTED-IN-TERMS-OF:

o We need "controlled polymorphism" -- LSP IS-A, but in certain code only. Public inheritance should always model IS-A as per the Liskov Substitution Principle (LSP).[3] Nonpublic inheritance can express a restricted form of IS-A, even though most people identify IS-A with public inheritance alone. Given class Derived : private Base, from the point of view of outside code, a Derived object IS-NOT-A Base, and so of course can't be used polymorphically as a Base because of the access restrictions imposed by private inheritance. However, inside Derived's own member functions and friends only, a Derived object can indeed be used polymorphically as a Base (you can supply a pointer or reference to a Derived object where a Base object is expected), because members and friends have the necessary access. If instead of private inheritance you use protected inheritance, then the IS-A relationship is additionally visible to further-derived classes, which means subclasses can also make use of the polymorphism.

That's as complete a list as I can make of reasons to use nonpublic inheritance. (In fact, just one additional point would make this a complete list of all reasons to use any kind of inheritance: "We need public inheritance to express IS-A." More on that in the next column.)

So What About MySet?

All of this brings us to Question 3: Which version of MySet would you prefer -- MySet1 or MySet2? Let's analyze the code in Example 1 and see whether any of the above criteria apply:

o MyList has no protected members, so we don't need to inherit to gain access to them.
o MyList has no virtual functions, so we don't need to inherit to override them.
o MySet has no other potential base classes, so the MyList object doesn't need to be constructed before, or destroyed after, another base subobject.
o MyList has no virtual base classes that MySet might need to share or whose construction it might need to override.
o MySet IS-NOT-A MyList, not even within MySet's member functions and friends.

This last point is interesting, because it points out a (minor) disadvantage of inheritance: Even had one of the other criteria been true, so that we would use inheritance, we would have to be careful that members and friends of MySet wouldn't accidentally use a MySet polymorphically as a MyList -- a remote possibility, maybe, but sufficiently subtle that if it did ever happen it would probably keep the poor programmer who encountered it confused for hours.

In short, MySet should not inherit from MyList. Using inheritance where containment is just as effective only introduces gratuitous coupling and needless dependencies, and that's never a good idea. Unfortunately, in the real world I still see programmers -- even experienced ones -- who implement relationships like MySet's using inheritance.

Astute readers will have noticed that the inheritance-based version of MySet does offer one (fairly trivial) advantage over the containment-based version: Using inheritance, you only need to write a using-declaration to expose the unchanged Size function. Using containment, to get the same effect you have to explicitly write a simple forwarding function.

But What If We Do Need To Inherit?

Of course, sometimes inheritance will be appropriate. For example:

// Example 2: Sometimes you need to inherit

If we need to override a virtual function like Func1 or access a protected member like Func2, inheritance is necessary.
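As a rough illustration of that situation, here is a minimal sketch. The function bodies, return values, and the Compute member are assumptions made only so the effect is observable, and Func3 is left public here for the same reason. The point is that the derived class can reach the protected Func2, and that Base's own Func3 picks up the overridden Func1.

```cpp
#include <cassert>

// Hypothetical Base in the spirit of Example 2; bodies are illustrative.
class Base {
public:
    virtual ~Base() {}
    virtual int Func1() { return 1; }      // customization point
    int Func3() { return Func1() + 100; }  // implemented in terms of Func1
protected:
    bool Func2() { return true; }          // reachable only via inheritance
};

// Inherits privately: overrides Func1 and uses the protected Func2.
class Derived : private Base {
public:
    int Func1() override { return Func2() ? 42 : 0; }
    int Compute() { return Func3(); }      // Base::Func3 calls our Func1
};

// Self-check: overriding Func1 changes what Base::Func3 computes.
inline void CheckOverrideChangesBase() {
    Base plain;
    Derived custom;
    assert(plain.Func3() == 101);
    assert(custom.Compute() == 142);
}
```

Neither the override nor the call to Func2 would be possible if Derived merely held a Base member.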
Example 2 illustrates why overriding a virtual function may be necessary for reasons other than allowing polymorphism: Here Base is implemented in terms of Func1 (Func3 uses Func1 in its implementation), and so the only way to get the right behaviour is to override Func1. Even when inheritance is necessary, however, is the following the right way to do it?

// Example 2(a)

This code allows Derived to override Base::Func1, which is good. Unfortunately, it also grants access to Base::Func2 to all members of Derived, and there's the rub: Maybe only a few, or just one, of Derived's member functions really need access to Base::Func2. By using inheritance like this, we've needlessly made all of Derived's members depend upon Base's protected interface.

Clearly inheritance is necessary, but wouldn't it be nice to introduce only as much coupling as we really need? Well, we can do better with a little judicious engineering:

// Example 2(b)

This design is much better, because it nicely separates and encapsulates the dependencies on Base. Derived only depends directly on Base's public interface, and on DerivedImpl's public interface. Why is this design more successful? Primarily because it follows the fundamental "one class, one responsibility" design guideline. In Example 2(a), Derived was responsible for both customizing Base and implementing itself in terms of Base. In Example 2(b), those concerns are nicely separated out.

Variants On Containment

Containment has some advantages of its own. First, it allows having multiple instances of the used class, which isn't possible with inheritance.[4] If you need to both derive and have multiple instances, just use the same idiom as in Example 2(b): Derive a helper class (like DerivedImpl) to do whatever needs the inheritance, then contain multiple copies of the helper class.
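The helper-class idiom just described might look roughly like this. It is a sketch under assumptions: the function bodies, the Compute member, and the choice to re-expose Func3 with a using-declaration are illustrative. DerivedImpl is the only class that touches Base's protected interface, and Derived simply contains as many helpers as it needs.

```cpp
#include <cassert>

// Hypothetical Base in the spirit of Example 2; bodies are illustrative.
class Base {
public:
    virtual ~Base() {}
    virtual int Func1() { return 1; }
    int Func3() { return Func1() + 100; }  // implemented in terms of Func1
protected:
    bool Func2() { return true; }
};

// Helper: everything that needs inheritance lives here and nowhere else.
class DerivedImpl : private Base {
public:
    int Func1() override { return Func2() ? 42 : 0; }
    using Base::Func3;  // expose just what the containing class needs
};

// Derived depends only on DerivedImpl's public interface, and can hold
// more than one helper, which direct inheritance would not allow.
class Derived {
public:
    int Compute() { return first_.Func3() + second_.Func3(); }
private:
    DerivedImpl first_;
    DerivedImpl second_;
};

// Self-check: each helper contributes its customized Func3 result.
inline void CheckHelperIdiom() {
    Derived d;
    assert(d.Compute() == 284);
}
```

None of Derived's own member functions can call Func2 here, so the dependency on Base's protected interface stays confined to DerivedImpl.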
Second, having the used class be a data member gives additional flexibility: The member can be hidden behind a compiler firewall inside a Pimpl[5] (whereas base class definitions must always be visible), and it can be easily converted to a pointer if it needs to be changed at runtime (whereas inheritance hierarchies are static and fixed at compile time).

Finally, here's a third useful way to rewrite MySet2 from Example 1(b) to use containment in a more generic way:

// Example 1(c): Generic containment

Instead of just choosing to be IMPLEMENTED-IN-TERMS-OF MyList<T> only, we now have the flexibility of having MySet IMPLEMENTABLE-IN-TERMS-OF any class that supports the required Add, Get and other functions that we need. The C++ standard library uses this very technique for its stack and queue templates, which are by default IMPLEMENTED-IN-TERMS-OF a deque but are also IMPLEMENTABLE-IN-TERMS-OF any other class that provides the required services.

Specifically, different user code may choose to instantiate MySet using implementations with different performance characteristics -- for example, if I know I'm going to write code that does many more inserts than searches, I'd want to use an implementation that optimizes inserts. We haven't lost any ease of use, either: Under Example 1(b), client code could simply write MySet2<int> to instantiate a set of ints, and that's still true with Example 1(c) because MySet3<int> is just a synonym for MySet3<int,MyList<int> >, thanks to the default template parameter.

This kind of flexibility is more difficult to achieve with inheritance, primarily because inheritance tends to fix an implementation decision at design time. It is possible to write Example 1(c) to inherit from Impl, but here the tighter coupling isn't necessary and should be avoided.

Conclusion

In general, it's a good idea to prefer less inheritance. Use containment wherever possible, and inheritance only in the specific situations in which it's needed.
Large inheritance hierarchies in general, and deep ones in particular, are confusing to understand and therefore difficult to maintain. Inheritance is a design-time decision and trades off a lot of runtime flexibility. In the next installment, I'll focus on public and multiple inheritance. Public/LSP inheritance is the clearest form of inheritance to design and understand, and it should account for the vast majority of inheritance in most projects; more about that when we return.
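As a closing illustration of the generic containment described under Variants On Containment, here is a minimal sketch. The Insert, Access and Size member names, the vector-backed MyList, and the deque-backed MyDequeList alternative are all assumptions made for the example. MySet3 takes its implementation as a template parameter that defaults to MyList<T>, in the same way std::stack defaults to a deque.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <type_traits>
#include <vector>

// Assumed default implementation (vector-backed).
template <class T>
class MyList {
public:
    bool Insert(const T& value, std::size_t index) {
        if (index > items_.size()) return false;
        items_.insert(items_.begin() + static_cast<std::ptrdiff_t>(index), value);
        return true;
    }
    T Access(std::size_t index) const { return items_.at(index); }
    std::size_t Size() const { return items_.size(); }
private:
    std::vector<T> items_;
};

// Hypothetical alternative with the same interface (deque-backed).
template <class T>
class MyDequeList {
public:
    bool Insert(const T& value, std::size_t index) {
        if (index > items_.size()) return false;
        items_.insert(items_.begin() + static_cast<std::ptrdiff_t>(index), value);
        return true;
    }
    T Access(std::size_t index) const { return items_.at(index); }
    std::size_t Size() const { return items_.size(); }
private:
    std::deque<T> items_;
};

// IMPLEMENTABLE-IN-TERMS-OF any Impl offering Insert, Access and Size.
template <class T, class Impl = MyList<T> >
class MySet3 {
public:
    bool Add(const T& value) {
        for (std::size_t i = 0; i < impl_.Size(); ++i)
            if (impl_.Access(i) == value) return false;  // already present
        return impl_.Insert(value, impl_.Size());
    }
    T Get(std::size_t index) const { return impl_.Access(index); }
    std::size_t Size() const { return impl_.Size(); }
private:
    Impl impl_;
};

// The default template argument makes the short spelling a synonym.
static_assert(std::is_same<MySet3<int>, MySet3<int, MyList<int> > >::value,
              "MySet3<int> defaults to MyList<int>");

// Self-check: both instantiations behave the same to callers.
inline void CheckGenericContainment() {
    MySet3<int> a;
    MySet3<int, MyDequeList<int> > b;
    assert(a.Add(3) && b.Add(3));
    assert(!a.Add(3) && !b.Add(3));
    assert(a.Size() == 1 && b.Size() == 1);
    assert(a.Get(0) == 3 && b.Get(0) == 3);
}
```

Client code that only ever writes MySet3<int> never needs to know the second parameter exists.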
Notes

1. I say "member functions" because you would never write a class that has a public or protected member variable, right? (Regardless of the poor example set by some libraries.)
2. See also S. Meyers, Effective C++, 2nd edition (Addison-Wesley, 1998), under the index entry "French, gratuitous use of."
3. See www.objectmentor.com for several good papers describing LSP.
4. For those who revel in unuseful obscurities: Yes, it's technically possible to have the same class appear as a base class more than once (indirectly), but it's not useful because even if you do that there's no way to refer to any of those base's nonstatic members. At any rate, even if it were possible, it would be unmaintainable -- the whole point of this article is that containment is much cleaner.
5. H. Sutter. "More About the Compiler-Firewall Idiom," (C++ Report, 10(7), July-August 1998).
Copyright © 2009 Herb Sutter