The best thing I learned in algorithms class is:<p>For all realizable n, ln(n) is less than 40.<p>So when amortizing over a large window (say, a block size of 512), it can be the case that, <i>for any realizable n</i>, the logarithmic amortized costs are completely dominated by the constant non-amortized costs (adding an element to a block in this case)... making the operation effectively constant-time.
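<p>A back-of-the-envelope version of that argument, taking the 512-element block size and the "ln(n) stays under 40" bound above (my numbers, just to make the claim concrete):

    \frac{\ln n}{512} < \frac{40}{512} \approx 0.08

so the logarithmic part contributes less than a tenth of an operation per push, and the constant per-element work dominates.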
While I very much appreciate and respect the author's detailed analysis, I am still not 100% convinced without a corresponding benchmark result. If it's really O(log n) vs O(1), it should be very easy to verify in a micro-benchmark.<p>Instead, the author mentions:<p>> The tests that I did run, initially the libstdc++ and the libc++ versions have roughly the same performance, but the performance diverges after a number of insertions somewhere between 6,400 and 12,800. After that the runtime of the libc++ version is roughly 2-3 times that of the libstdc++ version. (...) the outcomes of which depend on the compiler version anyway.<p>This does not seem right.
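<p>For what it's worth, here is a minimal sketch of the kind of micro-benchmark I mean (my own, not from the article; loop sizes are arbitrary). If push_front is amortized O(1) the ns/op column should stay flat as n grows; if it is amortized O(log n) it should creep up:

    #include <chrono>
    #include <cstdio>
    #include <deque>

    int main() {
        for (std::size_t n = 1000000; n <= 64000000; n *= 2) {
            std::deque<int> d;
            auto start = std::chrono::steady_clock::now();
            for (std::size_t i = 0; i < n; ++i)
                d.push_front(static_cast<int>(i));
            auto stop = std::chrono::steady_clock::now();
            double ns = std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
            // Per-operation cost; compare across rows rather than trusting absolute numbers.
            std::printf("n = %zu  ns/op = %.2f\n", n, ns / n);
        }
    }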
Anyone who programs low-latency software knows that std::deque is an absolutely horrible deque implementation performance-wise, due to how much it allocates and the pointer chasing involved. In most cases it's better to use a vector-backed deque (ring buffer) for anything performance-sensitive.
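<p>By "vector-backed deque" I mean something like the following fixed-capacity ring buffer (a rough sketch with hypothetical names; a real one would handle growth, exceptions, and non-trivial element types): one contiguous allocation, no per-block pointer chasing.

    #include <cstddef>
    #include <vector>

    // Minimal fixed-capacity ring buffer; assumes capacity > 0 and a
    // default-constructible T. Indices wrap around the underlying vector.
    template <typename T>
    class RingDeque {
    public:
        explicit RingDeque(std::size_t capacity) : buf_(capacity) {}

        void push_back(const T& v) {   // caller must check !full()
            buf_[(head_ + size_) % buf_.size()] = v;
            ++size_;
        }
        void push_front(const T& v) {  // caller must check !full()
            head_ = (head_ + buf_.size() - 1) % buf_.size();
            buf_[head_] = v;
            ++size_;
        }
        void pop_front() { head_ = (head_ + 1) % buf_.size(); --size_; }
        void pop_back()  { --size_; }

        T& operator[](std::size_t i) { return buf_[(head_ + i) % buf_.size()]; }
        std::size_t size() const { return size_; }
        bool full()  const { return size_ == buf_.size(); }
        bool empty() const { return size_ == 0; }

    private:
        std::vector<T> buf_;
        std::size_t head_ = 0;
        std::size_t size_ = 0;
    };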
CppReference says the time complexity is constant: <a href="https://en.cppreference.com/w/cpp/container/deque/push_front" rel="nofollow">https://en.cppreference.com/w/cpp/container/deque/push_front</a><p>So does this mean:
- We're talking about different complexities (amortized vs. something else)
- The libc++ implementation isn't standards-conformant
- The analysis in the C++ standard is incorrect
- Something else
std::deque is odd in that there's no standard-mandated size for the blocks it uses. libstdc++ uses about 512 bytes, libc++ 4096 bytes, and MS STL around 64 bytes. This leaves a bunch of room for optimizing for certain cases: smaller blocks are more insert friendly, larger ones more op[]/read friendly. But push_front after a pop_front should be O(1) on all of them; otherwise it is moving/copying data to make room. Another nice/similar data structure is Boost's devector, which adds a pointer to a vector to track front capacity.
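<p>If you want to see the block size your own toolchain uses, one rough way (my sketch, not from the article) is to hand std::deque a logging allocator and watch the allocation sizes; the size that keeps repeating as you push is the block, the odd ones out are the map of block pointers:

    #include <cstddef>
    #include <cstdio>
    #include <deque>
    #include <new>

    template <typename T>
    struct LoggingAlloc {
        using value_type = T;
        LoggingAlloc() = default;
        template <typename U> LoggingAlloc(const LoggingAlloc<U>&) {}
        T* allocate(std::size_t n) {
            std::printf("allocate %zu bytes\n", n * sizeof(T));
            return static_cast<T*>(::operator new(n * sizeof(T)));
        }
        void deallocate(T* p, std::size_t) { ::operator delete(p); }
    };
    template <typename T, typename U>
    bool operator==(const LoggingAlloc<T>&, const LoggingAlloc<U>&) { return true; }
    template <typename T, typename U>
    bool operator!=(const LoggingAlloc<T>&, const LoggingAlloc<U>&) { return false; }

    int main() {
        std::deque<int, LoggingAlloc<int>> d;
        for (int i = 0; i < 2000; ++i) d.push_back(i);  // enough ints to span several blocks
    }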
Very nice post!<p>Small comment: strictly speaking, big-O notation is for upper bounds. If you are stating lower bounds, you should ideally use big-Omega notation. But Omegas are harder to format in plain text, so it may be better to abuse notation and stick with big-O...
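<p>For reference, the textbook definitions (standard ones, nothing specific to this article):

    f(n) \in O(g(n))      \iff \exists c > 0, n_0 : f(n) \le c \cdot g(n) \text{ for all } n \ge n_0
    f(n) \in \Omega(g(n)) \iff \exists c > 0, n_0 : f(n) \ge c \cdot g(n) \text{ for all } n \ge n_0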
So iiuc:<p>The libc++ implementation is bad. The libstdc++ implementation is fine. The issue with the former is that it doesn't keep enough spare capacity, so it has to shift elements around too often.<p>Actually, I think push_front is even worse than stated: O(n). Consider an array with capacity N+1 that contains N elements. Now if you alternate push_front and pop_back, every push_front will cause a memmove of N elements.<p>Oh, and to make a fourth addition to this comment: it's kind of funny that the code is so unreadable that the author found it more helpful to look at the generated assembler.
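<p>A quick way to stress exactly that pattern (hypothetical sketch, sizes are mine): pre-fill the deque, then alternate push_front and pop_back a fixed number of times. If push_front is O(1) amortized the per-pair cost stays flat as the pre-filled size grows; if it degrades to O(n) it grows roughly linearly:

    #include <chrono>
    #include <cstdio>
    #include <deque>

    int main() {
        const int pairs = 100000;
        for (std::size_t n = 100000; n <= 1600000; n *= 2) {
            std::deque<int> d(n, 0);   // pre-filled with n elements
            auto start = std::chrono::steady_clock::now();
            for (int i = 0; i < pairs; ++i) {
                d.push_front(i);
                d.pop_back();
            }
            auto stop = std::chrono::steady_clock::now();
            double ns = std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
            std::printf("n = %zu  ns per push/pop pair = %.1f\n", n, ns / pairs);
        }
    }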
The most important thing to know about std::deque is that the implementation in MSVC is really just <i>awfully bad</i>. So, if your code might need to be built there, you should consider using an implementation from somewhere other than std::.
Interesting!<p>It reminds me of the hash maps that are said to be amortized O(1) but can be O(n) for some sequences of operations in various languages like Python and JS: <a href="https://blog.toit.io/hash-maps-that-dont-hate-you-1a96150b492a?gi=1fee8f53b528" rel="nofollow">https://blog.toit.io/hash-maps-that-dont-hate-you-1a96150b49...</a>