A while ago, I sat down to write a few allocators for C++ objects and/or STL containers. That is, inspired by an article on STL-compatible allocators, I sat down to write one such allocator, but made it possible to feed it memory from a variety of what I chose to call pools. One such pool I modelled after what I dimly recall having read in the code of Python's allocator.
You can find all of that effort in the memory sub-directory in fhtagn!’s trunk — but more important than code, right now, is another question: why should I be interested?
Well, this Python-inspired allocator is somewhat clever in that it tries to satisfy the most common allocation requests from pre-allocated pools. It internally manages pools for different object sizes, grouping objects into size classes of two to the power of N: 1, 2, 4, 8, …, 256 bytes. Each pool is then little more than an array of entries of that fixed size.
For any allocation request, the allocator will determine the smallest of the 2^N size classes large enough to fit the object, and will take memory from the pool that handles this size. If the requested size exceeds the largest size class, it'll fall back on the heap.
This allocation scheme works very well for Python. In Python, all objects (including integers) are allocated through this allocator individually, which means that this very fast allocation from pre-allocated pools improves performance drastically.
But how does this approach measure up in C++?
To find out, I wrote a little benchmark program for allocators. It tries to imitate very naive usage of STL containers, and therefore uses STL's vector very simply: it'll fill a vector with a random number of entries, then drain a random number of entries again (from anywhere within the vector, not just the beginning or end), and then repeat the process several times.
The reasoning behind this approach is to simulate a relatively healthy balance between user stupidity and vector's own approach to allocating smartly: when vector runs out of memory, it'll usually allocate as much memory again as it currently uses, doubling its capacity. That's a simple but relatively efficient method for anticipating what the user might do next.
The interesting part about this is that it flies in the face of the size-based pool approach: vector, doubling its capacity each time it runs out of memory, will pretty soon request blocks far larger than the size-based pools provide for, thus triggering the fallback of allocating from the heap. Once the vector has grown to that stage, there should be no performance difference between STL's default allocator and the size-based pool approach.
In that sense, the tests try to account for the worst case the size-based pools can encounter. But as it's often impossible to determine the size of a container before filling it, this usage of vector is also moderately realistic. On balance, I think these tests should paint a pretty realistic picture of the performance of this size-based pool approach [1].
[1] Note that the same usage of e.g. STL's lists or maps would favour the size-based pool approach more.