Standard Library News, Part 1: Vectors and Deques

Vectors and Deques

On the
blog

November 4: Other Concurrency Sessions at PDC
November 3: PDC'09: Tutorial & Panel

October 26: Hoare on Testing
October 23: Deprecating export Considered for ISO C++0x

This is the original article substantially as first published. See the book More Exceptional C++ (Addison-Wesley, 2002) for the most current version of this article. The versions in the book have been revised and expanded since their initial appearance in print. The book versions also incorporate corrections, new material, and conformance to the final ANSI/ISO C++ standard.

Standard Library News, Part 1: Vectors and Deques

This article appeared in C++ Report, 11(7), July/August 1999.

NOTE: The "In Most Cases, Prefer Using deque" section in this latest revision was significantly expanded after the article went to press.

This column and the next will present news updates about the standard library containers vector, deque, set, multiset, map, and multimap.

"Updates?" you might ask. "What updates? Isn't the C++ Standard done?" Yes, the C++ Standard is done, and in this "maintenance" phase the ANSI/ISO standards committees are still meeting (albeit only twice a year) to produce defect reports and clarifications in response to questions about the Standard. Why should you care about these results? Because many of the clarifications include useful information about how to use C++ effectively, as we shall see in this column and the next one.

This time, my focus is on news about std::vector and std::deque. This column will answer the following questions:

o Can you use a vector interchangeably with a C-style array (say, with an existing function that expects an array)? How? What are the issues?

o Why should you usually prefer using a deque instead of a vector?

o How do you shrink a vector whose capacity has grown larger than you need? In particular, how do you "shrink-to-fit" (so that the vector's size is just big enough to hold only its current contents) or clear a vector's internal storage entirely (so that the vector is truly empty, with no contents and little or no under-the-covers preallocated storage)? Can the same technique be used for other containers?

o What's happening with std::vector<bool>, whose problems I discussed in the previous column?^[1]

Using a vector As an Array: A Motivating Example

C++ programmers are strongly encouraged to use the std::vector template instead of C-style arrays. Using a vector is easier and safer than using an array, because a vector is a better-abstracted and better-encapsulated form of container; for example, a vector can grow and shrink as needed whereas an array has a fixed size.

Nonetheless, there's still a lot of code out there that is designed to work with C-style arrays. For example, consider the following code:

// Example 1(a): A function that operates
//               on a C-style array
//
int FindCustomer(
    const char* szName,     // name to search for
    Customer*   pCustomers, // pointer to beginning
                            // of a Customer array
    size_t      nLength )   // with nLength entries
{
    // performs a (possibly optimized) search and
    // returns the index at which the specified
    // name was found
}

To use the above function we might create, populate, and pass in an array as follows. Note that we could equivalently pass just c as the second parameter, which would be the same as passing &c[0] because the array name would automatically be converted into a pointer to the first element in the array:

// Example 1(b): Using an array
//
Customer c[100];

//-- populate the contents of c somehow ---

int i = FindCustomer( "Fred Jones", &c[0], 100 );

Example 1(b) is all well and good, but as modern C++ programmers aren't we supposed to be using vectors instead of arrays? Indeed, wouldn't it be nice if we could have the best of both worlds: use a vector for its convenience and safety, yet still use the vector's contents with functions that expect an array? This desire comes up most often when an organization is migrating to C++, and people want to write new parts of the system in the "new and better" way while still tying in easily with a legacy code base. So let's use a vector... our first attempt could look something like this:

// Example 1(c): Using a vector -- will this work?
//
vector<Customer> c;

//-- populate the contents of c somehow ---

int i = FindCustomer( "Fred Jones", &c[0], c.size() );

In Example 1(c), when it comes time to call FindCustomer(), the idea is to pass in the address of the first element, and the number of elements--which is pretty much just what we did in Example 1(b).

Before reading on, stop for a moment and consider these questions: What potential benefits does the code in Example 1(c) have over the code in Example 1(b)? Do you see any potential problems with Example 1(c), or do you think that it will work as intended?

A Fly In the Ointment

I didn't just pull Example 1(c) out of a hat. It turns out that there are lot of real-world C++ programmers who have done exactly this kind of thing, and with good reason, because code like that in Example 1(c) illustrates some of the important benefits that come from using a vector:

o We don't have to know the potential size of c in advance. In many C programs, you'll see arrays being allocated pessimistically so that they'll be big enough for most intended uses; unfortunately, this means that space is being needlessly wasted much of the time when the full size isn't needed after all. (Worse, it means the program breaks if it ever end up needing more space than we'd thought.) With a C++ vector, we can grow the container as needed instead of just guessing. Also, if we do happen to know in advance how many objects we're going to put into the vector, we can simply say so (by first calling c.reserve(100), if we want to reserve space for 100 elements), so there's no performance disadvantage with a vector compared to an array because we have a way to avoid repeated reallocations as we insert new elements into the vector.

o We don't have to separately track the actual length of the array so that we can pass it as the final (nLength) parameter to FindCustomer(). Yes, you can use a C array and use workarounds that make remembering or recalculating the length easier,^[2] but none of those workarounds are as easy and safe as just writing c.size().

In short, Example 1(c) indeed works on every standard library implementation that I know about, and is likely to work on all that exist today.

The only (minor, and as you'll soon see, pretty much moot) fly in the ointment is that the C++ Standard as it's currently written doesn't guarantee that Example 1(c) will actually work portably on every C++ platform you might buy. The reason is pretty simple: Example 1(c) assumes that the internal contents of the vector are stored contiguously in the same format as is an array, but today's Standard doesn't require vendors to implement vector that way. If, for example, a vector implementation were to play games and store its contents in some other way (such as contiguously but in backwards order, or in sequential order but with an extra byte of housekeeping information between elements), then Example 1(c) would in fact fail miserably.

Having said that, here's the good news: I believe that we can now safely say that Example 1(c) is entirely safe and portable, for the following reasons:

o It should work in practice today. Again, I don't know of a commercial vector implementation that doesn't store its elements contiguously.

o It will be guaranteed to work in practice tomorrow. The standards committee agrees that this is a natural thing to want to do, and it intends to put wording in the first Technical Corrigendum to the effect that the elements inside a vector must indeed be stored contiguously and so will be usable with code that expects an array. The library vendors know that this clarification is coming (after all, the major standard library vendors were with us in the room when this was being discussed), and so they will make sure that new versions of their vector implementations won't introduce any incompatibilities that would make them nonconforming in light of this clarification.

Bottom line: Prefer using std::vector (or std::deque; see below) instead of arrays.

In Most Cases, Prefer Using deque

The C++ Standard, in section 23.1.1, offers some advice on which containers to prefer. It says:

"vector is the type of sequence that should be used by default. ... deque is the data structure of choice when most insertions and deletions take place at the beginning or at the end of the sequence."

I'd like to present an amiably dissenting point of view: I recommend that you consider preferring deque by default instead of vector, especially when the contained type is a class or struct and not a builtin type, unless you really need the container's memory to be contiguous.

vector and deque offer nearly identical interfaces and are generally interchangeable. deque also offers push_front() and pop_front(), which vector does not. (True, vector offers capacity() and reserve(), which deque does not, but that's no loss--those functions are actually a weakness of vector, as I'll demonstrate in a moment.)

The main structural difference between vector and deque lies in the way the two containers organize their internal storage: Under the covers, a deque allocates its storage in pages, or "chunks," with a fixed number of contained elements in each page; this why a deque is often compared to (and pronounced as) a "deck" of cards, although its name originally came from "double-ended queue" because of is ability to insert elements efficiently at either end of the sequence. On the other hand, a vector allocates a contiguous block of memory, and can only insert elements efficiently at the end of the sequence.

The paged organization of a deque offers significant benefits:

1. A deque offers constant-time insert() and erase() operations at the front of the container, whereas a vector does not--hence the note in the Standard about using a deque if you need to insert or erase at both ends of the sequence.

2. A deque uses memory in a more operating system-friendly way. For example, a 10-megabyte vector uses a single 10-megabyte block of memory; on some operating systems that single block can be less efficient in practice than a 10-megabyte deque that can fit in a series of smaller blocks of memory.

3. A deque is easier to use, and inherently more efficient for growth, than a vector. The only operations supplied by vector that deque doesn't have are capacity() and reserve()--and that's because deque doesn't need them! For vector, calling reserve() before a large number of push_back()s can eliminate reallocating ever-larger versions of the same buffer every time it finds out that the current one isn't big enough after all. A deque has no such problem, and having a deque::reserve() before a large number of push_back()s would not eliminate any allocations (or any other work) because none of the allocations are redundant; the deque has to allocate the same number of extra pages whether it does it all at once or as elements are actually appended.

Interestingly, the C++ Standard itself contains an indication that it disagrees with its own note about preferring to use a vector by default: Even for the stack adapter, which can only grow in one direction and so never needs insertion in the middle or at the other end, the default implementation must always be a deque:

template <class T, class Container = deque<T> >
class stack {
// ...
};

Some readers are going to be concerned about my advice to prefer deque, saying perhaps: "But deque is a more complex container than vector, and so deque must be five or ten times less efficient compared to vector, right?" As always, beware premature optimization before you actually measure: I have found the "deque must be inefficient" assumption to be generally untrue on popular implementations. Using MSVC 5.0 SP3 with the current patches for its bundled Dinkumware C++ Standard Library implementation (probably by far the most widely-used compiler and library configuration), I tested the performance of the following operations, in order:

Operation        Description
grow                first, perform 1,000,000 push_back()s
traverse           then traverse, simply incrementing iterators, from begin() to end()
at                     then access each element in turn using at()
shuffle             then random_shuffle() the entire container
sort                  then sort() the entire container (for list, uses list::sort())

I tested each operation against several standard containers including deque. I expected vector to outperform deque in traverse and at, and deque to win in grow for reason #3 above. In fact, note that deque did outperform vector on the grow test even though in fairness I gave vector special treatment by calling vector::reserve(1000000) first to avoid any reallocations. All timings varied less than 5% across multiple runs, and all runs executed fully in-memory without paging or other disk accesses.

First, consider what happened for containers of a simple builtin type, int:

Times in ms        grow     traverse               at       shuffle            sort
vector<int>       1015               55             219             621           3624
deque<int>           761             265             605           1370           7820
list<int>           1595             420             n/a             n/a         16070

Here list was always the slowest, and the difference between vector and deque wasn't as great as several people had led me to expect. Of course, the performance differences between the containers will fade when you use a contained type that is more expensive to manipulate. What's interesting is that most of the differences go away even when the contained type is as simple (and common) as a struct E { int i; char buf[100]; };:

Times in ms        grow     traverse               at       shuffle            sort
vector<E>           3253               60             519           3825         17546
deque<E>             3471             274             876           4950         21766
list<E>               3740             497             n/a             n/a         15134

Now deque's performance disadvantage for even an intensive operation like sort is less than 25%.

Finally, note that the popular library implementation that I tested has since been revised and now includes a streamlined version of deque with simplified iterators. I do not yet have a copy of that library, but it will be interesting to see how much of the deque disadvantage in even the raw iterator traverse and element-accessing at tests will remain compared to vector.

So, for the three reasons cited earlier: Consider preferring deque by default in your programs, especially when the contained type is a class or struct and not a builtin type, unless the actual performance difference in your situation is known to be important or you specifically require the contained elements to be contiguous in memory (which typically means that you intend to pass the contents to a function that expects an array).

The Incredible Shrinking vector

As already noted, it's nice that vector manages its own storage, and that we can give it hints about how much capacity to keep ready under the covers as it grows. If we have a vector<T> c that we're about to grow to contain 100 elements, we can first call c.reserve(100) and incur at most a single reallocation hit. This allows optimally efficient growth.

But what if we're doing the opposite? What if we're using a vector that's pretty big, and then we remove elements that we no longer need and want the vector to shrink to fit; that is, we want it to get rid of the now-unneeded extra capacity? You might think that the following would work:

// Example 2(a): A naive attempt at
// shrinking a vector.
//
vector<Customer> c( 10000 );
// ...now c.capacity() >= 10000...

// erase all but the first 10 elements
c.erase( c.begin()+10, c.end() );

// the following line does NOT shrink
// c's internal buffer to fit!
c.reserve( 10 );

// ...now c.capacity() is still >= 10000 as before

Example 2(a) does not do what you might expect because calling reserve() will never shrink the vector's capacity; it can only increase the capacity, or do nothing if the capacity is already sufficient.

// Example 2(b): The right way to
// shrink-to-fit a vector.
//
vector<Customer> c( 10000 );
// ...now c.capacity() >= 10000...

// erase all but the first 10 elements
c.erase( c.begin()+10, c.end() );

// the following line DOES shrink c's
// internal buffer to fit (or close)
vector<Customer>( c ).swap( c );

// ...now c.capacity() == c.size(), or
// perhaps a little more than c.size()

Do you see how the shrink-to-fit statement works? It's a little subtle:

1. First, we create a temporary (unnamed) vector<Customer> and initialize it to hold the same contents as c. The salient difference between the temporary vector and c is that, while c still carries around a lot of extra capacity in its oversize internal buffer, the temporary vector has just enough capacity to hold its copy of c's contents. (Some implementations may choose to round up the capacity slightly to their next larger internal "chunk size," with the result that the capacity actually ends up being slightly larger than the size of the contents.)

2. Next, we call swap() to exchange the internals of c with the temporary vector. Now the temporary vector owns the oversize buffer with the extra capacity that we're trying to get rid of, and c owns the "rightsized" buffer with the just-enough capacity.

3. Finally, the temporary vector goes out of scope, carrying away the old oversize buffer with it; the old buffer is deleted when the temporary vector is destroyed. Now all we're left with is c itself, but now c has a "rightsized" capacity.

Note that this procedure is not needlessly inefficient. Even if vector had a special-purpose shrink_to_fit() member function, it would have to do pretty much all of the same work just described above.

If you want to completely clear a vector, so that it has no contents and no extra capacity at all, the code is nearly identical; you just initialize the temporary vector to be empty instead of making it a copy of c:

// Example 2(c): The right way to
// clear out a vector.
//
vector<Customer> c( 10000 );
// ...now c.capacity() >= 10000...

// the following line makes
// c be truly empty
vector<Customer>().swap( c );

// ...now c.capacity() == 0, unless the
// implementation happens to enforce a
// minimum size even for empty vectors

Again, note that the vector implementation you're using may choose to make even empty vectors have some slight capacity, but now you're guaranteed that c's capacity will be the smallest possible allowed by your implementation: it will be the same as the capacity of an empty vector.

These techniques work for deque, too, but you don't need to do this kind of thing for list, set, map, multiset, or multimap because they allocate storage on an "exactly-as-needed" basis and so should never have excess capacity lying around.

Update About vector<bool>

In the previous column, I spent time showing why including the vector<bool> specialization in the Standard was a questionable idea; among other problems, std::vector<bool> doesn't even meet the Standard's own container requirements. As of the April standards meeting in Dublin, vector<bool> is again under active discussion, and several countries consider it to be a defect. It won't be resolved soon, but my advice is that you stop using it in most cases. I'd like to propose an easier workaround, though: In the previous column, I suggested using a vector<char> or vector<int> instead, and casting bools to and from chars or ints when inserting and removing them from the container. A superior workaround is to use a deque<bool>; it's a far simpler solution, and it fits with my advice earlier in this column to prefer deque over vector anyway.

Summary

It's safe to use vectors interchangeably with C-style arrays, and you should always prefer to use vectors or deques instead of arrays. Get in the habit of always using deque instead of vector whenever you don't need all of the contained elements to be contiguous in memory. Use the swap-with-a-temporary-container idiom if you want to shrink an existing vector or deque. Finally, unless you know that you really need the space optimization, always use deque<bool> instead of vector<bool>.

Next time I'll cover news about the associative containers set, multiset, map, and multimap. Stay tuned.

Notes

1. H. Sutter. "When Is a Container Not a Container?" C++ Report, 11(5), May 1999.

2. The most typical approaches are to use a #defined name for the size of the array, or use a macro that expands to sizeof(c)/sizeof(c[0]) to compute the size of the array. I think you'll agree that both of these are less elegant and less maintainable than simply writing c.size().

Standard Library News, Part 1: Vectors and Deques

Using a vector As an Array: A Motivating Example

A Fly In the Ointment

In Most Cases, Prefer Using deque

Times in ms grow traverse at shuffle sortvector<int> 1015 55 219 621 3624deque<int> 761 265 605 1370 7820list<int> 1595 420 n/a n/a 16070

Times in ms grow traverse at shuffle sortvector<E> 3253 60 519 3825 17546deque<E> 3471 274 876 4950 21766list<E> 3740 497 n/a n/a 15134

The Incredible Shrinking vector

Update About vector<bool>

Summary

Notes

Copyright © 2009 Herb Sutter

Times in ms grow traverse at shuffle sort
vector<int> 1015 55 219 621 3624
deque<int> 761 265 605 1370 7820
list<int> 1595 420 n/a n/a 16070

Times in ms grow traverse at shuffle sort
vector<E> 3253 60 519 3825 17546
deque<E> 3471 274 876 4950 21766
list<E> 3740 497 n/a n/a 15134