The Joy of Pimpls (or, More About the Compiler-Firewall Idiom)

This article appeared in C++ Report, 10(7), July/August 1998.

In my previous column,[1] I briefly outlined the Pimpl Idiom (or, Compiler Firewall Idiom) and showed how it and other techniques can be used to minimize compile-time dependencies. There are many interesting aspects of pimpls that I didn't cover last time; to make up for that, I'll devote this month's column to these engaging little classes.

Pimpls: A Recap

In short, the core issue is that when anything in a C++ class definition changes--even private members--all users of that class definition must be recompiled. To reduce these dependencies, a common technique is to use an opaque pointer to an implementation class, the eponymous "pimpl," to hide some of the internal details:

  class X {

This is a variant of the handle/body idiom. As documented by Coplien, handle/body was described as being primarily useful for reference counting of a shared implementation, but it also has more general implementation-hiding uses.[2] For convenience, from now on I'll call X the "visible class" and XImpl the "pimpl class."

One big advantage of this idiom is that it breaks compile-time dependencies. First, system builds run faster because using a pimpl can eliminate extra #includes as demonstrated in the previous article. I have worked on projects where converting just a few widely-visible classes to use pimpls has halved the system's build time. Second, it localizes the build impact of code changes because the parts of a class that reside in the pimpl can be freely changed--that is, members can be freely added or removed--without recompiling client code.

Here are the questions I'll cover this time:

1. What should go into XImpl?
2. Does XImpl require a "back pointer" to the X object?
3. How can we overcome the space overhead (for storing one or two pointers, and possibly even more "wasted" space)?
4. How can we overcome the performance overhead (of the extra allocations and indirections)?

"What's In a Pimpl?"[3]

First, what should go into XImpl?
There are four main alternative disciplines:

1. Put all private data (but not functions) into XImpl.
2. Put all private members into XImpl.
3. Put all private and protected members into XImpl.
4. Make XImpl entirely the class that X would have been, and write X as only the public interface made up entirely of simple forwarding functions (another handle/body variant).

Before reading on, consider: What are the advantages/drawbacks? How would you choose among them?

Option 1 (Score: 6 / 10): Put all private data (but not functions) into XImpl. This is a good start, because now we can forward-declare any class which only appears as a data member (rather than #include the class's actual definition, which would make client code depend on that too). Still, we can usually do better.

Option 2 (Score: 10 / 10): Put all private members into XImpl. This is (almost) my usual practice these days. After all, in C++, the phrase "client code shouldn't and doesn't care about these parts" is spelled "private," and privates are always hidden.[4] There are three caveats, the first of which is the reason for my "almost" above:

1. You can't hide virtual member functions in the pimpl, even if the virtual functions are private. If the virtual function overrides one inherited from a base class, then it must appear in the actual derived class. If the virtual function is not inherited, then it must still appear in the visible class in order to be available for overriding by further derived classes.[5]
2. Functions in the pimpl may require a "back pointer" to the visible object if they need to in turn use visible functions, which adds another level of indirection. (By convention such a back pointer is usually named self_ where I've worked.)
3. Often the best compromise is to use Option 2, and additionally put into XImpl only those non-private functions that need to be called by the private ones (see the "back pointer" comments below).

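As a minimal sketch of Option 2 (not the column's own listing), the following single file stands in for both the header and the implementation file. The names pimpl_ and self_ follow the conventions mentioned above; everything else (Increment, Name, Describe, Bump, count_) is hypothetical, and the code assumes a C++11 compiler.

```cpp
#include <cassert>
#include <string>

// ---- what would live in x.h ----
class X {
public:
    X();
    ~X();
    X(const X&) = delete;               // copying omitted to keep the sketch short
    X& operator=(const X&) = delete;

    int Increment();                    // forwards to a hidden private function
    std::string Name() const;           // visible function the pimpl calls back into
    std::string Describe() const;       // forwards to the pimpl

private:
    class XImpl;                        // forward declaration only; no extra #includes
    XImpl* pimpl_;                      // opaque pointer to the hidden half
};

// ---- what would live in x.cpp ----
class X::XImpl {
public:
    explicit XImpl(X* self) : self_(self), count_(0) {
        assert(self != nullptr);
    }

    // Formerly a private member function of X.
    int Bump() { return ++count_; }

    // Needs a function of the visible class, so it goes through the back pointer.
    std::string Describe() const {
        return self_->Name() + ":" + std::to_string(count_);
    }

private:
    X*  self_;                          // back pointer to the visible object
    int count_;                         // formerly private data of X
};

X::X() : pimpl_(new XImpl(this)) {}
X::~X() { delete pimpl_; }
int X::Increment() { return pimpl_->Bump(); }
std::string X::Name() const { return "X"; }
std::string X::Describe() const { return pimpl_->Describe(); }
```

In this arrangement, adding or removing members of XImpl touches only the implementation file, so code that includes the header would not need to be recompiled. The call to self_->Name() is the extra indirection that the second caveat describes.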
Option 3 (Score: 0 / 10): Put all private and protected members into XImpl. Taking this extra step to include protected members is actually wrong. Protected members should never go into a pimpl, since putting them there just emasculates them. After all, protected members exist specifically to be seen and used by derived classes, and so aren't nearly as useful if derived classes can't see or use them.

Option 4 (Score: 10 / 10 in restricted cases): Make XImpl entirely the class that X would have been, and write X as only the public interface made up entirely of simple forwarding functions (another handle/body variant). This is useful in a few restricted cases, and has the benefit of avoiding a back pointer since all services are available within the pimpl class. The chief drawback is that it normally makes the visible class useless for any inheritance, as either a base or a derived class.

Does XImpl Require a Back Pointer?

Does the pimpl require a back pointer to the visible object? The answer is: Sometimes, unhappily, yes. After all, what we're doing is (somewhat artificially) splitting each object into two halves for the purposes of hiding one part.

Consider: Whenever a function in the visible class is called, usually some function or data in the hidden half is needed to complete the request. That's fine and reasonable. What's perhaps not as obvious at first is that often a function in the pimpl must call a function in the visible class, usually because the called function is public or virtual. One way to minimize this is to use Option 4 (above) judiciously for the functions concerned... that is, implement Option 2 and additionally put inside the pimpl any non-private functions that are used by private functions.

What About the Space Overhead?

"What space overhead?" you ask? Well, we now need space for at least one extra pointer (and possibly two, if there's a back pointer in XImpl) for every X object.
This typically adds at least four (or eight) bytes on many popular systems, and possibly as many as 14 bytes or more depending on alignment requirements! For example, try the following program on your favourite compiler:

  struct X1 { char c; };

On many popular compilers that use 32-bit pointers, this prints:

  1 8

On these compilers, the overhead of storing one extra pointer was actually seven bytes, not four. Why? Because the platform on which the compiler is running either requires a pointer to be stored on a four-byte boundary, or else performs much more poorly if the pointer isn't stored on such a boundary. Knowing this, the compiler allocates three bytes of unused/empty space inside each X2 object, which means the cost of adding a pointer member was actually seven bytes, not four. If a back pointer is also needed, then the total storage overhead can be as high as 14 bytes on a 32-bit machine, as high as 30 bytes on a 64-bit machine, and so on.

How do we get around this space overhead? The short answer is: We can't eliminate it, but sometimes we can minimize it.

The longer answer is: There's a downright reckless way to eliminate it that you should never ever use (and don't tell anyone that you heard it from me), and there's usually a nonportable but correct way to minimize it. The utterly reckless "space optimization" happens to be the same as the utterly reckless "performance optimization," so I've moved that discussion off to the side; see the accompanying box.

If (and only if) the space difference is actually important in your program, then the nonportable but correct way to minimize the pointer overhead is to use compiler-specific #pragmas. Many compilers will let you override the default alignment/packing for a given class; see your vendor's documentation for details.

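The size comparison and the packing override described above can be sketched roughly as follows. X1 is taken from the text; X2 (a char plus one pointer) is an assumed counterpart, and X2Packed, kOverhead and kOverheadPacked are hypothetical names. The sketch assumes a C++11 compiler that accepts #pragma pack, as GCC, Clang and MSVC do; other compilers may spell this differently or ignore it.

```cpp
#include <cstddef>

struct X1 { char c; };                       // from the text: data only
struct X2 { char c; void* pimpl_; };         // assumed counterpart: data plus one pointer

// Compiler-specific packing override (an assumption about your toolchain).
#pragma pack(push, 1)
struct X2Packed { char c; void* pimpl_; };   // same members, no padding requested
#pragma pack(pop)

// Bytes added by the pointer member, including any alignment padding.
constexpr std::size_t kOverhead       = sizeof(X2) - sizeof(X1);
constexpr std::size_t kOverheadPacked = sizeof(X2Packed) - sizeof(X1);

static_assert(sizeof(X1) == 1, "a lone char occupies one byte");
static_assert(kOverhead >= sizeof(void*), "the pointer itself has to fit");
static_assert(kOverheadPacked <= kOverhead, "packing never adds padding");
```

On a typical target with 32-bit pointers, sizeof(X2) is 8 and kOverhead is 7, which matches the output quoted above; with 64-bit pointers the usual figures are 16 and 15. Where the pragma is honoured, kOverheadPacked shrinks to the size of the pointer alone.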
If your platform only "prefers" (rather than "enforces") pointer alignment and your compiler offers this feature, then on a 32-bit platform you can eliminate as much as six bytes of overhead per X object, at the (usually minuscule) cost of runtime performance because actually using the pointer will be slightly less efficient. Before you even consider anything like this, though, always follow the age-old sage advice: First make it right, then make it fast. Never optimize--neither for speed, nor for size--until your profiler and other tools tell you that you should.

What About the Performance Overhead?

Using the Pimpl idiom can have a performance overhead for two main reasons: For one thing, each X construction/destruction must now allocate/deallocate memory for its XImpl object, which is typically a relatively expensive operation.[6] For another, each access of a member in the pimpl can require at least one extra indirection.[7]

How do we get around this performance overhead? The short answer is: Use the Fast Pimpl idiom, which I'll cover next. (There's also a downright reckless way to eliminate it that you should never ever use; see the accompanying box.)

The Fast Pimpl Idiom

The main performance issue here is that space for the pimpl objects is being allocated from the free store. In general, the right way to address allocation performance for a specific class is to overload operator new for that class and use a fixed-size allocator, because fixed-size allocators can be made much more efficient than general-purpose allocators.

  // file x.h

"Aha!" you say. "We've found the holy grail--the Fast Pimpl!" you say. Well, yes, but hold on a minute and think about how this will work and what it will cost you.

Your favourite advanced C++ or general-purpose programming textbook has the details about how to write efficient fixed-size [de]allocation functions, so I won't cover that again here.
I will talk about usability: One technique is to put the [de]allocation functions in a generic fixed-size allocator template, perhaps something like this:

  template<size_t S>

Because the private details are likely to use statics, however, there could be problems if Deallocate is ever called from a static object's destructor. Probably safer is a singleton that manages a separate free list for each request size (or, as an efficiency tradeoff, a separate free list for each request size "bucket"; e.g., one list for blocks of size 0-8, another for blocks of size 9-16, etc.):

  class FixedAllocator {

Let's throw in a helper base class to encapsulate the calls. This works because derived classes "inherit" these overloaded base operators:

  struct FastAllocation {

Now, you can easily write as many Fast Pimpls as you like:

  // Want this one to be a Fast Pimpl?

But Beware!

This is nice and all, but don't just use the Fast Pimpl willy-nilly. You're getting extra allocation speed, but as usual you should never forget the cost: Managing separate free lists for objects of specific sizes usually means incurring a space efficiency penalty because any free space is fragmented (more than usual) across several lists.

A final reminder: As with any other optimization, use pimpls in general and fast pimpls in particular only after profiling and experience prove that the extra performance boost is really needed in your situation.

In the next column, I'll cover the uses and abuses of non-public inheritance. Stay tuned.

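As a supplementary sketch, and not the column's own listing, here is one way the FixedAllocator singleton and the FastAllocation helper described in the Fast Pimpl discussion might fit together. Only the names FixedAllocator, FastAllocation, Deallocate and XImpl come from the text; Instance, Allocate, FreeCount, the 16-byte bucket size and the XImpl members are assumptions. The sketch assumes C++11, is not thread-safe, and never returns blocks to the general-purpose allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical singleton keeping one free list per 16-byte size bucket.
class FixedAllocator {
public:
    static FixedAllocator& Instance() {
        // Deliberately never destroyed, so Deallocate stays usable even when
        // it is called from the destructor of a static object.
        static FixedAllocator* instance = new FixedAllocator;
        return *instance;
    }

    void* Allocate(std::size_t size) {
        std::size_t bucket = BucketFor(size);
        if (bucket >= lists_.size()) lists_.resize(bucket + 1, nullptr);
        if (Node* node = lists_[bucket]) {       // reuse a previously freed block
            lists_[bucket] = node->next;
            return node;
        }
        // Free list empty: get a block of the bucket's full size.
        return ::operator new((bucket + 1) * kGranularity);
    }

    void Deallocate(void* p, std::size_t size) {
        if (!p) return;
        std::size_t bucket = BucketFor(size);
        assert(bucket < lists_.size());          // block must have come from Allocate
        // Every block holds at least kGranularity bytes, enough for a Node.
        lists_[bucket] = new (p) Node{lists_[bucket]};
    }

    // Number of blocks currently parked on the free list that serves `size`.
    std::size_t FreeCount(std::size_t size) const {
        std::size_t bucket = BucketFor(size), n = 0;
        if (bucket < lists_.size())
            for (Node* p = lists_[bucket]; p; p = p->next) ++n;
        return n;
    }

private:
    struct Node { Node* next; };
    static constexpr std::size_t kGranularity = 16;

    FixedAllocator() = default;
    static std::size_t BucketFor(std::size_t size) {
        return size == 0 ? 0 : (size - 1) / kGranularity;
    }

    std::vector<Node*> lists_;                   // lists_[i] serves sizes up to (i+1)*16
};

// Helper base class: derived classes pick up these class-specific operators.
struct FastAllocation {
    static void* operator new(std::size_t s) {
        return FixedAllocator::Instance().Allocate(s);
    }
    static void operator delete(void* p, std::size_t s) {
        FixedAllocator::Instance().Deallocate(p, s);
    }
};

// A pimpl class opts in simply by deriving from the helper.
struct XImpl : FastAllocation {
    int data[4];                                 // placeholder for the hidden members
};
```

With this in place, new XImpl and delete on an XImpl* go through the per-size free lists instead of the general-purpose allocator. The blocks that stay parked on those lists are the fragmentation cost the "But Beware!" section warns about.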
Notes

1. H. Sutter, "Pimpls: Beauty Marks You Can Depend On" (C++ Report, May 1998).
2. J. Coplien. Advanced C++ Programming Styles and Idioms (Addison-Wesley, 1992).
3. Please don't email me jokes about this subheading. I can imagine most of the answers.
4. Except in some liberal European countries.
5. Making a virtual private is usually not a good idea, anyway. The point of a virtual function is to allow a derived class to redefine it, and a common redefinition technique is to call the base class' version (not possible, if it's private) for most of the functionality.
6. Compared to most other common operations in C++, such as function calls. Note that here I'm specifically talking about the cost of using a general-purpose allocator, which is what you typically get with the built-in operator new and malloc.
7. If the hidden member being accessed itself uses a back pointer to call a function in the visible class, there will be multiple indirections.
8. This completely hides the pimpl class, but of course clients must still be recompiled if sizeofximpl changes.
9. All right, I'll fess up: There actually is a (not very portable, but pretty safe) way to put the pimpl class right into the main class like this, thus avoiding all space and time overhead. It involves creating a "max_align" struct that guarantees maximal alignment, and defining the pimpl member as union { max_align dummy; char pimpl_[sizeofximpl]; }; -- this will guarantee sufficient alignment. For all the gory details, do a search for "max_align" on the web or on DejaNews. However, I still strongly urge you not to go down this path, because using a "max_align" solves only this first issue #1 and does not address issues #2 through #5. You Have Been Warned.
10. See H. Sutter, "Exception-Safe Generic Containers" (C++ Report, September 1997) and H. Sutter, "More Exception-Safe Generic Containers" (C++ Report, November/December 1997).
11. See GotW #23, and Advice From the C++ Experts in the October 1997 C++ Report.

Copyright © 2009 Herb Sutter