There are quite a few extra issues that come up when you implement a usable allocator. I'm surprised the article didn't mention them. Here are just a few:<p>Alignment: different systems have different minimum alignment requirements. Most require that all allocations be 8- or 16-byte aligned.<p>Buffer overruns: using object headers is risky, since a buffer overrun can corrupt your heap metadata. You'll either need to validate heap metadata before trusting it (e.g. keep a checksum) or store it elsewhere. Per-object headers also waste quite a bit of space for small objects.<p>Size segregation: this isn't absolutely essential, but most real allocators serve allocations for each size class from a different block. This is nice for locality (similarly sized objects are more likely to be the same type, accessed together, etc). You can also use per-page or per-block bitmaps to track which objects are free/allocated. This eliminates the need for per-object headers.<p>Internal frees: many programs will free a pointer that is internal to an allocated object. This is especially likely with C++, because of how classes using inheritance are represented.<p>Double/invalid frees: you'll need a way of detecting double/invalid frees, or these will quickly lead to heap corruption. Aborting on the first invalid free isn't a great idea: depending on how you interpose your custom allocator, the dynamic linker may allocate from its own private heap and then free those objects through your custom allocator.<p>Thread safety: at the very least, you need to lock the heap when satisfying an allocation. If you want good performance, objects handed to different threads need to come from different cache lines, or you'll end up with false sharing. 
Thread-segregated heaps also reduce contention, but you need a way of dealing with cross-thread frees (thread A allocates p, passes it to B, which frees it).<p>The HeapLayers library is very useful for building portable custom allocators: <a href="https://github.com/emeryberger/Heap-Layers" rel="nofollow">https://github.com/emeryberger/Heap-Layers</a>. The library includes easily reusable components (like freelists, size classes, bitmaps, etc.) for building stable, fast allocators. HeapLayers is used to implement the Hoard memory allocator, a high-performance allocator optimized for parallel programs.
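<p>To make a few of the points above concrete, here is a rough C sketch of alignment rounding plus a single size-class block that tracks free slots with a bitmap instead of per-object headers. Everything in it is made up for illustration: `align_up`, `block_t`, `block_alloc`, `block_free`, the 32-byte object size and the 64-slot block are my own assumptions, not anything from the article or from HeapLayers. It is not thread-safe and handles only one size class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t n, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical sizes: one size class of 32-byte objects, 64 per block. */
enum { OBJ_SIZE = 32, OBJ_COUNT = 64 };

typedef struct {
    _Alignas(16) unsigned char slots[OBJ_SIZE * OBJ_COUNT];
    uint64_t used; /* bit i set => slot i is allocated; no per-object header */
} block_t;

/* First-fit scan of the bitmap; returns NULL when the block is full. */
static void *block_alloc(block_t *b) {
    for (unsigned i = 0; i < OBJ_COUNT; i++) {
        uint64_t bit = (uint64_t)1 << i;
        if (!(b->used & bit)) {
            b->used |= bit;
            return b->slots + (size_t)i * OBJ_SIZE;
        }
    }
    return NULL;
}

/* Returns false, and changes nothing, for a pointer outside the block,
 * an interior pointer, or a slot that is already free (double free). */
static bool block_free(block_t *b, void *p) {
    uintptr_t base = (uintptr_t)b->slots;
    uintptr_t addr = (uintptr_t)p;
    if (addr < base || addr >= base + sizeof b->slots)
        return false;                    /* not from this block */
    size_t off = (size_t)(addr - base);
    if (off % OBJ_SIZE != 0)
        return false;                    /* interior pointer */
    uint64_t bit = (uint64_t)1 << (off / OBJ_SIZE);
    if (!(b->used & bit))
        return false;                    /* double free */
    b->used &= ~bit;
    return true;
}
```

Because the metadata lives in the bitmap rather than next to the objects, an overrun of one object can't directly trash it, and `block_free` can report a bad pointer to the caller rather than aborting. A real allocator would still need locking or per-thread blocks on top of this.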