Widgets, Inheritance, and Writing Safe Code

This article appeared in C++ Report, 12(3), March 2000.

Overview

This column answers the following questions:

o Can any arbitrary class be made exception-safe without modifying its structure?

o If not (that is, if exception safety does affect a class's design), is there any simple change that always works to let us make any arbitrary class exception-safe?

o Are there exception safety consequences to the way we choose to express relationships between classes? Specifically, does it matter whether we choose to express a relationship using inheritance or using delegation?

The Cargill Widget Example

To begin, consider an exception safety challenge proposed by Tom Cargill:

    // Example 1: The Cargill Widget Example
    //
    class Widget
    {
      // ...
    private:
      T1 t1_;
      T2 t2_;
    };

Assume that any T1 or T2 operation might throw. Without changing the structure of the Widget class, is it possible to write a strongly exception-safe Widget::operator=( const Widget& )? (Remember that "strongly exception-safe" means that, if an exception is thrown, program state is unchanged; that is, the operation must be atomic with respect to the entire program, either a complete success or no effect.) Why or why not? Think about this question for a moment before reading on.

Analyzing the Cargill Widget Example

In short: No, exception safety can't be achieved in general without changing the structure of Widget. In Example 1, it's not possible to write a safe Widget::operator=() at all -- we cannot guarantee the Widget object will even be in a consistent final state if an exception is thrown, because there's no way that we can change the state of both the t1_ and t2_ members atomically. Say that our Widget::operator=() attempts to change t1_, then attempts to change t2_ (one or the other member has to be done first; it doesn't really matter which in this case). The problem is twofold:

1. If the attempt to change t1_ throws, t1_ must be unchanged. That is, making Widget::operator=() exception-safe relies fundamentally on the exception safety guarantees provided by T1, namely that T1::operator=() -- or whatever mutating function we are using -- either succeeds or does not change its target. This comes close to requiring the strong guarantee of T1::operator=(). (The same reasoning applies to T2::operator=().)

2. If the attempt to change t1_ succeeds, but the attempt to change t2_ throws, we've entered a "halfway" state and cannot in general roll back the change already made to t1_. For example, what if our attempt to reassign t1_'s old value also fails? Then the Widget object can't even guarantee recovery to a consistent state that maintains Widget's invariants.

Therefore, the way Widget is structured in Example 1, its operator=() cannot be made strongly exception-safe. (See the accompanying sidebar "A Simpler but Still Difficult Widget" for a simpler example that has a subset of the above problems.)

Our goal is to write a Widget::operator=() that is strongly exception-safe, without making any assumptions about the exception safety of any T1 or T2 operation. Can it be done? Or is all lost?
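
The "halfway" state described in point 2 above can be made concrete with a small sketch. This is only an illustration under assumptions: the column leaves T1 and T2 unspecified, so the two member types below are hypothetical stand-ins, and T2's assignment is rigged to throw every time. The helper function name is likewise invented for the demonstration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical member types; the column does not define T1 or T2.
struct T1 {
    int value;
};

struct T2 {
    int value;
    explicit T2(int v) : value(v) {}
    T2(const T2&) = default;
    // Rigged for illustration: assignment always fails.
    T2& operator=(const T2&) {
        throw std::runtime_error("T2 assignment failed");
    }
};

// Same shape as Example 1: both members are held by value.
class Widget {
public:
    Widget(int a, int b) : t1_{a}, t2_(b) {}

    // Naive memberwise assignment: t1_ first, then t2_.
    Widget& operator=(const Widget& other) {
        t1_ = other.t1_;  // succeeds and overwrites the old value
        t2_ = other.t2_;  // throws after t1_ has already changed
        return *this;
    }

    int first() const { return t1_.value; }
    int second() const { return t2_.value; }

private:
    T1 t1_;
    T2 t2_;
};

// Returns true if the failed assignment left the target half-updated:
// t1_ holds the new value while t2_ still holds the old one.
bool assignment_leaves_halfway_state() {
    Widget a(1, 2);
    const Widget b(10, 20);
    try {
        a = b;
    } catch (const std::runtime_error&) {
        return a.first() == 10 && a.second() == 2;
    }
    return false;
}
```

Calling assignment_leaves_halfway_state() returns true here: after the exception, the target is neither its old self nor a copy of the source.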

A General Technique: Using the Pimpl Idiom

The good news is that, even though Widget::operator=() can't be made strongly exception-safe without changing Widget's structure, the following simple transformation always works to enable almost strongly exception-safe assignment: Hold the member objects by pointer instead of by value, preferably all behind a single pointer with a Pimpl transformation (for more details, including an analysis of the costs of using Pimpls and how to minimize those costs, see Items 26 to 30 of Exceptional C++ [1]).

Example 2 illustrates the general exception safety-promoting transformation (alternatively, the pimpl_ could be held as a bald pointer or using some other pointer-managing object):

    // Example 2: The general solution to
    //            Cargill's Widget Example
    //
    class Widget
    {
      // ...
    private:
      class WidgetImpl;
      auto_ptr<WidgetImpl> pimpl_;
      // ... provide destruction, copy construction,
      //     and copy assignment ...
    };

    class Widget::WidgetImpl
    {
    public:
      // ...
      T1 t1_;
      T2 t2_;
    };

Aside: Note that if you use an auto_ptr member, then:

a) you must either provide the definition of WidgetImpl with the definition of Widget, or, if you want to keep hiding WidgetImpl, you must write your own destructor for Widget even if it's a trivial destructor;[2] and

b) you should also provide your own copy construction and assignment for Widget, because normally you don't want transfer-of-ownership semantics for class members.

If you have a different kind of smart pointer available, consider using that instead of auto_ptr, but the principles being described here remain important.

Now we can easily implement a nonthrowing Swap(), which means we can easily implement exception-safe copy assignment that nearly meets the strong guarantee.
First, provide the nonthrowing Swap() function that swaps the guts (state) of two objects -- note that this function can provide the nothrow guarantee, that no exceptions will be thrown under any circumstances, because no auto_ptr operation is permitted to throw exceptions:[3]

    void Widget::Swap( Widget& other ) throw()
    {
      auto_ptr<WidgetImpl> temp( pimpl_ );
      pimpl_ = other.pimpl_;
      other.pimpl_ = temp;
    }

Second, implement the common exception-safe form of operator=() using the "create a temporary and swap" idiom:

    Widget& Widget::operator=( const Widget& other )
    {
      Widget temp( other ); // do all the work off to the side

      Swap( temp );    // then "commit" the work using
      return *this;    // nonthrowing operations only
    }

This is nearly strongly exception-safe. It doesn't quite guarantee that, if an exception is thrown, program state will remain entirely unchanged; do you see why? It's because, when we create the temporary Widget object and therefore its pimpl_'s t1_ and t2_ members, the creation of those members (and/or their destruction if we fail) may cause side effects, such as changing a global variable, and there's no way we can know about or control that. More on this issue in the next column.

A Potential Objection, and Why It's Unreasonable

Some may leap upon this with the ardent battle cry: "Aha, so this proves exception safety is unattainable in general, because you can't solve the general problem of making any arbitrary class strongly exception-safe without changing the class." (I raise this point only because some people have indeed raised this objection.)

Such a conclusion seems unreasonable to me. The Pimpl transformation, a minor structural change, is indeed the solution to the general problem. Like most implementation goals, exception safety affects a class's design, period. Just as one wouldn't expect to make a class work polymorphically without accepting the slight change to inherit from the necessary base class, one wouldn't expect to make a class work in an exception-safe way without accepting the slight change to hold its members at arm's length.
To illustrate, consider three statements:

o Unreasonable Statement #1: "Polymorphism doesn't work in C++ because you can't make an arbitrary class usable in place of a Base& without changing it (to derive from Base)."

o Unreasonable Statement #2: "STL containers don't work in C++ because you can't make an arbitrary class usable in an STL container without changing it (to provide an assignment operator)."

o Unreasonable Statement #3: "Exception safety doesn't work in C++ because you can't make an arbitrary class exception-safe without changing it (to put the internals in a Pimpl class)."

The above arguments are equally fruitless, and the Pimpl transformation is indeed the general solution to writing classes that give useful exception safety guarantees (indeed, nearly the strong guarantee) without requiring any knowledge of the safety of class data members.

So, what have we learned?

Conclusion 1: Exception Safety Affects a Class's Design

Exception safety is never "just an implementation detail." The Pimpl transformation is a minor structural change, but still a change.

Conclusion 2: You Can Always Make Your Code (Nearly) Strongly Exception-Safe

There's an important principle here: Just because a class you use isn't in the least exception-safe is no reason that your code that uses it can't be strongly exception-safe (except for side effects).

Anybody can use a class that lacks a strongly exception-safe copy assignment operator and make that use strongly exception-safe, except that of course if Widget operations cause side effects (such as changing a global variable) there's no way we can know about or control that; more on that in the next column.
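
As a rough illustration of Conclusion 2, the sketch below wraps a deliberately unsafe class in user code that holds it behind a pointer. Everything here is an assumption made for the demonstration: the Widget shown is a hypothetical stand-in whose assignment damages its target before failing, the static fail flag is a test hook, and std::unique_ptr is used in place of auto_ptr (which was deprecated in C++11 and removed in C++17).

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical Widget with no useful exception safety of its own.
class Widget {
public:
    static bool fail;  // test hook (assumption): makes copying throw

    explicit Widget(int v) : value_(v) {}
    Widget(const Widget& other) : value_(other.value_) {
        if (fail) throw std::runtime_error("Widget copy failed");
    }
    // Unsafe on purpose: damages the target before it can fail.
    Widget& operator=(const Widget& other) {
        value_ = -1;
        if (fail) throw std::runtime_error("Widget assignment failed");
        value_ = other.value_;
        return *this;
    }
    int value() const { return value_; }

private:
    int value_;
};
bool Widget::fail = false;

// User-side wrapper: holds the Widget at arm's length, behind a pointer,
// and never calls Widget::operator=.
class MyClass {
public:
    explicit MyClass(int v) : w_(new Widget(v)) {}
    MyClass(const MyClass& other) : w_(new Widget(*other.w_)) {}

    void Swap(MyClass& other) noexcept { w_.swap(other.w_); }

    MyClass& operator=(const MyClass& other) {
        MyClass temp(other);  // may throw, but *this is untouched so far
        Swap(temp);           // commit with nonthrowing operations only
        return *this;
    }
    int value() const { return w_->value(); }

private:
    std::unique_ptr<Widget> w_;
};

// Returns true if a failed MyClass assignment left the target unchanged.
bool wrapper_unchanged_after_failed_assignment() {
    MyClass a(1);
    const MyClass b(2);
    Widget::fail = true;
    bool threw = false;
    try {
        a = b;
    } catch (const std::runtime_error&) {
        threw = true;
    }
    Widget::fail = false;
    return threw && a.value() == 1;
}

// Returns true if assignment copies the value when nothing throws.
bool wrapper_assignment_succeeds() {
    MyClass a(1);
    const MyClass b(2);
    a = b;
    return a.value() == 2;
}
```

The wrapper's guarantee does not depend on Widget::operator=() at all; the only Widget operation that can throw during assignment is the copy construction, and it runs before the target is touched.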

The "hide the details behind a pointer" technique can be done equally well by either the Widget implementer or the Widget user -- it's just that if it's done by the Widget implementer it's always safe, and the user won't have to do this:

    // Example 3: What the user has to do if
    //            the Widget author doesn't
    //
    class MyClass
    {
      // ...
    public:
      MyClass& operator=( const MyClass& other )
      {
        MyClass temp( other ); // do all the work off to the side

        Swap( temp );    // then "commit" the work using
        return *this;    // nonthrowing operations only
      }

      // ... provide destruction, copy construction,
      //     and copy assignment ...
    };

Conclusion 3: Use Pointers Judiciously

Scott Meyers writes:[4]

"When I give talks on EH, I teach people two things:

"- POINTERS ARE YOUR ENEMIES, because they lead to the kinds of problems that auto_ptr is designed to eliminate."

To wit, bald pointers should normally be owned by manager objects that own the pointed-at resource and perform automatic cleanup. Then Scott continues:

"- POINTERS ARE YOUR FRIENDS, because operations on pointers can't throw.

"Then I tell them to have a nice day :-)"

Scott captures a fundamental dichotomy well. Fortunately, in practice you can and should get the best of both worlds:

o Use pointers because they are your friends, because operations on pointers can't throw.

o Keep them friendly by wrapping them in manager objects like auto_ptrs, because this guarantees cleanup. This doesn't compromise the nonthrowing advantages of pointers because auto_ptr operations never throw either (and you can always get at the real pointer inside an auto_ptr whenever you need to, for example by calling auto_ptr::get()).

Indeed, often the best way to implement the Pimpl idiom is as shown in Example 2 above, by using a pointer (in order to take advantage of nonthrowing operations) while still wrapping the dynamic resource safely in a manager object (in this example, an auto_ptr). Just remember that if you do use auto_ptr, your class must provide its own destruction, copy construction, and copy assignment with the right semantics, or you can disable copy construction and assignment if those don't make sense for the class.
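
For readers on current compilers, the Pimpl arrangement of Example 2 might look roughly like the sketch below. It is an adaptation under stated assumptions rather than the column's listing: std::unique_ptr replaces auto_ptr (deprecated in C++11, removed in C++17), the implementation class is nested and defined inline for brevity instead of being hidden in a separate file, T1 and T2 are hypothetical stand-ins, and T2::fail_copies is a test hook that lets the demonstration force a failure.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical member types; T2's copy constructor can be made to throw.
struct T1 {
    int value;
};

struct T2 {
    static bool fail_copies;  // test hook (assumption)
    int value;
    explicit T2(int v) : value(v) {}
    T2(const T2& other) : value(other.value) {
        if (fail_copies) throw std::runtime_error("T2 copy failed");
    }
};
bool T2::fail_copies = false;

class Widget {
public:
    Widget(int a, int b) : pimpl_(new Impl(a, b)) {}

    // The class supplies its own copy construction with deep-copy
    // semantics; the manager object handles destruction.
    Widget(const Widget& other) : pimpl_(new Impl(*other.pimpl_)) {}
    ~Widget() = default;

    // Exchanging two pointers cannot throw.
    void Swap(Widget& other) noexcept { pimpl_.swap(other.pimpl_); }

    Widget& operator=(const Widget& other) {
        Widget temp(other);  // do all the work off to the side
        Swap(temp);          // then commit using nonthrowing operations
        return *this;
    }

    int first() const { return pimpl_->t1_.value; }
    int second() const { return pimpl_->t2_.value; }

private:
    struct Impl {
        Impl(int a, int b) : t1_{a}, t2_(b) {}
        T1 t1_;
        T2 t2_;
    };
    std::unique_ptr<Impl> pimpl_;
};

// Returns true if a failed assignment left the target untouched.
bool failed_assignment_leaves_target_unchanged() {
    Widget a(1, 2);
    const Widget b(10, 20);
    T2::fail_copies = true;
    bool threw = false;
    try {
        a = b;
    } catch (const std::runtime_error&) {
        threw = true;
    }
    T2::fail_copies = false;
    return threw && a.first() == 1 && a.second() == 2;
}

// Returns true if assignment copies both members when nothing throws.
bool successful_assignment_copies_members() {
    Widget a(1, 2);
    const Widget b(10, 20);
    a = b;
    return a.first() == 10 && a.second() == 20;
}
```

As in the column, the guarantee is "nearly" strong: the sketch says nothing about side effects that constructing or destroying the temporary's members might have elsewhere in the program.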

We'll now apply what we've learned by using the above to analyze the best way to express a common class relationship.

Background: What's "IIITO"?

By coining expressions like "HAS-A," "IS-A," and "USES-A," we have developed a convenient shorthand for describing many types of code relationships.

"IS-A," or more precisely "IS-SUBSTITUTABLE-FOR-A," is usually used to describe public inheritance that preserves substitutability according to the Liskov Substitution Principle (LSP), as all public inheritance ought. For example, "D IS-A B" means that code that accepts objects of the base class B by pointer or reference can seamlessly use objects of the publicly derived class D instead.

That's not the only meaning of IS-A in C++, of course; IS-A can also describe unrelated (by inheritance) classes that support the same interface and can therefore be used interchangeably in templated code that uses that common interface. In that context, for example, "X IS-A Y" -- or, "X IS-SUBSTITUTABLE-FOR-A Y" -- communicates that templated code that accepts objects of type Y will also accept objects of type X, since both X and Y support the same interface. Nathan Myers has dubbed this the "Generic Liskov Substitution Principle" (GLSP).[5] Clearly both kinds of substitutability depend on the context in which the objects are actually used, but the point is that IS-A can be implemented in different ways.

An equally common code relationship is IS-IMPLEMENTED-IN-TERMS-OF, or IIITO for short. A type T IIITO another type U if T uses U in its implementation in some form. Now, saying "uses... in some form" certainly leaves a lot of latitude, and this can run the gamut from T being an adapter or proxy or wrapper for U, to T simply using U incidentally to implement some details of T's own services.

Typically "T IIITO U" means that either T HAS-A U, as shown in Example 4(a):

    // Example 4(a): "T IIITO U" using HAS-A
    //
    class T
    {
      // ...
    private:
      U* u_;
    };

or that T is derived from U nonpublicly, as shown in Example 4(b):[6]

    // Example 4(b): "T IIITO U" using derivation
    //
    class T : private U
    {
      // ...
    };

This brings us to the natural questions: When we have a choice, which is the better way to implement IIITO? What are the tradeoffs? When should we consider using each one?

How To Implement IIITO: Inheritance or Delegation?

As I've argued before, inheritance is often overused, even by experienced developers. A sound rule of software engineering is to always minimize coupling: If a relationship can be expressed in more than one way, use the weakest relationship that's practical. Given that inheritance is nearly the strongest relationship we can express in C++ (second only to friendship), it's only really appropriate when there is no equivalent weaker alternative. If you can express a class relationship using delegation alone, you should always prefer that.

The "minimize coupling" principle clearly has a direct effect on the robustness (or fragility) of your code, on how long your compile times are, and on other observable consequences. What's interesting is that the choice between inheritance and delegation for IIITO turns out to have exception safety implications. In hindsight, it should not be surprising that the "minimize coupling" principle also relates to exception safety, because a design's coupling has a direct impact on its possible exception safety.

The coupling principle states:

Lower coupling promotes program correctness (including exception safety), and tight coupling reduces the maximum possible program correctness (including exception safety).

This is only natural. After all, the less tightly real-world objects are related, the less effect they necessarily have on each other.
That's why we put firewalls in buildings and bulkheads in ships; if there's a failure in one compartment, the more we've isolated the compartments the less likely the failure is to spread to other compartments before things can be brought back under control.

Now let's return to Examples 4(a) and 4(b) and consider again a class T that is IIITO another type U. Specifically, consider the copy assignment operator: How does the choice of how to express the IIITO relationship affect how we write T::operator=()?

Exception Safety Consequences

First, consider how we would have to write T::operator=() if the IIITO relationship is expressed using HAS-A. We of course have the good habit of using the common "do all the work off to the side, then commit using nonthrowing operations only" technique to maximize exception safety, and so we would write something like the following:

    // Example 5(a): "T IIITO U" using HAS-A
    //
    T& T::operator=( const T& other )
    {
      U* temp = new U( *other.u_ );   // do all the work
                                      //  off to the side

      delete u_;      // then "commit" the work using
      u_ = temp;      //  nonthrowing operations only
      return *this;
    }

This is pretty good: Without making any assumptions about U, we can write a T::operator=() that is "nearly" strongly exception-safe except for possible side effects of U. (Again, more about this in the next column.)

Even if the U object were contained by value instead of by pointer, it could be easily transformed into being held by pointer as above. The U object could also be put into a Pimpl using the transformation described in the first part of this article. It is precisely the fact that delegation (HAS-A) gives us this flexibility that allows us to easily write a fairly exception-safe T::operator=() without making any assumptions about U.

Next, consider how the problem changes once the relationship between T and U involves any kind of inheritance:

    // Example 5(b): "T IIITO U" using derivation
    //
    T& T::operator=( const T& other )
    {
      U::operator=( other );  // ???
      return *this;
    }

The problem is the call to U::operator=().
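
To make the contrast between Examples 5(a) and 5(b) concrete, here is a small sketch that runs both forms against the same misbehaving U. It rests on several assumptions introduced only for illustration: U is a hypothetical class with two fields whose assignment can fail after changing the first one, U::fail is a test hook that makes U's copy constructor and assignment throw, and the two T variants are given distinct names (THasA and TDerived) so they can coexist in one file.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical U: copying can throw, and assignment can fail midway.
class U {
public:
    static bool fail;  // test hook (assumption)

    U(int a, int b) : a_(a), b_(b) {}
    U(const U& other) : a_(other.a_), b_(other.b_) {
        if (fail) throw std::runtime_error("U copy failed");
    }
    U& operator=(const U& other) {
        a_ = other.a_;  // target already modified...
        if (fail) throw std::runtime_error("U assignment failed");
        b_ = other.b_;  // ...before the failure point
        return *this;
    }
    int a() const { return a_; }
    int b() const { return b_; }

private:
    int a_;
    int b_;
};
bool U::fail = false;

// HAS-A, in the style of Example 5(a): never calls U::operator=.
class THasA {
public:
    THasA(int a, int b) : u_(new U(a, b)) {}
    THasA(const THasA& other) : u_(new U(*other.u_)) {}
    ~THasA() { delete u_; }

    THasA& operator=(const THasA& other) {
        U* temp = new U(*other.u_);  // do all the work off to the side
        delete u_;                   // then commit using
        u_ = temp;                   // nonthrowing operations only
        return *this;
    }
    int a() const { return u_->a(); }
    int b() const { return u_->b(); }

private:
    U* u_;
};

// Derivation, in the style of Example 5(b): must go through U::operator=.
class TDerived : private U {
public:
    TDerived(int a, int b) : U(a, b) {}
    TDerived& operator=(const TDerived& other) {
        U::operator=(other);  // may throw after partly modifying *this
        return *this;
    }
    using U::a;
    using U::b;
};

// Helper: attempts x = y while U is set to fail; reports whether it threw.
template <class T>
bool assign_with_failure(T& x, const T& y) {
    U::fail = true;
    bool threw = false;
    try {
        x = y;
    } catch (const std::runtime_error&) {
        threw = true;
    }
    U::fail = false;
    return threw;
}

// Returns true if the HAS-A target kept its old state after the failure.
bool has_a_target_unchanged() {
    THasA x(1, 2);
    const THasA y(10, 20);
    return assign_with_failure(x, y) && x.a() == 1 && x.b() == 2;
}

// Returns true if the derived target was left half-modified.
bool derived_target_half_modified() {
    TDerived x(1, 2);
    const TDerived y(10, 20);
    return assign_with_failure(x, y) && x.a() == 10 && x.b() == 2;
}
```

With the same faulty U, the HAS-A version keeps its old state when the copy fails, while the derived version inherits whatever partial damage U::operator=() leaves behind.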
As alluded to earlier in a sidebar to this article (there speaking of a similar case), if U::operator=() can throw in such a way that it has already started to modify the target, there is no way to write a strongly exception-safe T::operator=() unless U provides suitable facilities through some other function. (But if U can do that, why doesn't it do so for U::operator=()?) In other words, now T's ability to make an exception safety guarantee for its own member function T::operator=() depends implicitly on U's own safety and guarantees.

But, again, should this be surprising? No, it shouldn't, because Example 5(b) uses the tightest possible relationship, and hence the highest possible coupling, to express the connection between T and U.

Summary

Looser coupling promotes program correctness (including exception safety), and tight coupling reduces the maximum possible program correctness (including exception safety).

Inheritance is often overused, even by experienced developers. See Item 24 in Exceptional C++ [1] for more information about many other reasons, besides exception safety, why and how you should use delegation instead of inheritance wherever possible. Always minimize coupling: If a class relationship can be expressed in more than one way, use the weakest relationship that's practical. In particular, only use inheritance where delegation alone won't suffice.

Next time: A new method of reasoning about program correctness, including but not limited to exception safety. Specifically, it will help to address more clearly exactly what is meant by "nearly strongly exception-safe" in Example 2 and Conclusion 2; Example 2 turns out to be an important example because it demonstrates the strongest guarantee a class can make for its own operator=() without relying on any guarantees from the objects that it uses. That's a pretty fundamental concept, and next time I'll give it a name.

Notes

1. H. Sutter. Exceptional C++ (Addison-Wesley, 2000).

2. If you use the automatically compiler-generated destructor, that destructor will be defined in every translation unit, and therefore the definition of WidgetImpl must be visible in every translation unit.

3. Note that replacing the three-line body of Swap() with the single line "swap( pimpl_, other.pimpl_ );" is not guaranteed to work correctly, because std::swap() will not necessarily work correctly for auto_ptrs.

4. Scott Meyers, private communication.

5. For more about Nathan's comments and how GLSP applies to templates like char_traits, see Item 3 in Exceptional C++ [1].

6. Arguably, public derivation also models IIITO incidentally, but the primary meaning of public derivation is still IS-SUBSTITUTABLE-FOR-A.

Copyright © 2009 Herb Sutter