Can anyone share some practical examples where it's a requirement to maintain 1M+ elements in a data structure, and streams and iterators are not a viable alternative?

Using a vector as an example: O(n) operations are very fast in practice when n < X, so how big does X need to get before it's worth reaching for specialized structures like skip lists and radix-balanced trees?
How about cases where the data structure needs to support "interesting" query operations, e.g. spatial queries: "find me all the neighbouring elements within a radius r of element x". Not super-specialised, but you probably need something like a quadtree, a hash table over grid cells, or other indices to help partition the search space. There can easily be 1M+ elements if each element is an address in a state or national address database, a component of some utility network, or the location of an app user. There are use cases where you'd want to execute these kinds of queries with low latency, so jamming everything into RAM in an appropriate data structure could be a good fit.
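For intuition, here's a minimal sketch of the grid-bucket version in Python. The class name and cell size are invented for illustration; a real system would more likely use a quadtree, an R-tree, or a database spatial index, but the partitioning idea is the same:

    import math
    from collections import defaultdict

    class GridIndex:
        """Toy spatial index: hash points into square cells of side
        `cell`, so a radius query only scans the few nearby cells."""
        def __init__(self, cell):
            self.cell = cell
            self.buckets = defaultdict(list)  # (cx, cy) -> [(id, x, y), ...]

        def insert(self, pid, x, y):
            key = (int(x // self.cell), int(y // self.cell))
            self.buckets[key].append((pid, x, y))

        def within(self, x, y, r):
            """Return ids of all points within distance r of (x, y)."""
            reach = int(math.ceil(r / self.cell))
            cx, cy = int(x // self.cell), int(y // self.cell)
            hits = []
            for dx in range(-reach, reach + 1):
                for dy in range(-reach, reach + 1):
                    for pid, px, py in self.buckets.get((cx + dx, cy + dy), []):
                        if (px - x) ** 2 + (py - y) ** 2 <= r * r:
                            hits.append(pid)
            return hits

With 1M points and a cell size on the order of the typical query radius, within() touches a handful of buckets instead of scanning every element.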
Any sort of DNA processing. There is a lot of research into succinct data structures supporting a very specific set of operations for DNA processing, since our ability to generate DNA data is accelerating much faster than processing or memory speeds.
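To give a flavour of the space pressure: even the most naive trick, packing bases into 2 bits instead of a byte each, is a 4x saving, and the succinct structures in the literature (FM-indexes, wavelet trees) go far beyond that. A toy Python sketch, illustrative only:

    # Pack a DNA string into 2 bits per base.
    CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
    BASE = "ACGT"

    def pack(seq):
        bits = 0
        for ch in seq:
            bits = (bits << 2) | CODE[ch]
        return bits, len(seq)

    def unpack(bits, n):
        out = []
        for _ in range(n):
            out.append(BASE[bits & 3])
            bits >>= 2
        return "".join(reversed(out))

    packed, n = pack("GATTACA")
    assert unpack(packed, n) == "GATTACA"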
Here’s one example:

In the IPv4 Internet, there are about 830,000 prefixes (network addresses) announced, and the number is growing [1]. Routers at ISPs and sophisticated customers maintain full (“default-free”) routing tables with at least one entry for each prefix.

These tables are used for packet forwarding decisions, so lookups have to be fast. Traditionally, a radix tree is used, but some routers use other data structures or specialized hardware.

[1] https://www.cidr-report.org/as2.0/
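The lookup is a longest-prefix match, which is why a plain hash table is a poor fit. A minimal Python sketch of the trie idea, one node per bit (production routers use compressed multibit tries or TCAMs, and the next-hop labels here are made up):

    import ipaddress

    class PrefixTrie:
        """Toy binary trie for IPv4 longest-prefix match."""
        def __init__(self):
            self.root = {}  # node: {0: child, 1: child, "hop": next_hop}

        def add(self, cidr, next_hop):
            net = ipaddress.ip_network(cidr)
            addr = int(net.network_address)
            node = self.root
            for i in range(net.prefixlen):
                bit = (addr >> (31 - i)) & 1
                node = node.setdefault(bit, {})
            node["hop"] = next_hop

        def lookup(self, ip):
            addr = int(ipaddress.ip_address(ip))
            node, best = self.root, None
            for i in range(32):
                if "hop" in node:          # remember longest match so far
                    best = node["hop"]
                bit = (addr >> (31 - i)) & 1
                if bit not in node:
                    break
                node = node[bit]
            else:
                if "hop" in node:          # exact /32 match
                    best = node["hop"]
            return best

    t = PrefixTrie()
    t.add("10.0.0.0/8", "hop-A")
    t.add("10.1.0.0/16", "hop-B")
    assert t.lookup("10.1.2.3") == "hop-B"  # more specific prefix wins
    assert t.lookup("10.9.9.9") == "hop-A"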
Not sure if this fits the example you have in mind: https://medium.com/pinterest-engineering/an-update-on-pixie-pinterests-recommendation-system-6f273f737e1b

> The ultimate goal is to fit the entire Pinterest graph in memory. Obviously we can’t fit the entire 100 billion edges in RAM, but once we prune down to 20 billion edges, we end up with a graph that’s 150GB.
Do you mean specifically 1M elements in memory?

Because something like a relational database could have 1M elements in a B-tree but the whole data structure doesn't have to be in memory at the same time.

Or operating system page tables as another example.
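Right, and the reason that works is B-tree fan-out. A back-of-envelope sketch (the branching factor of 500 is made up, but it's in the right ballpark for a node of a few KB):

    import math

    # How many node reads to find 1 key among N, with ~500 keys per node?
    for n in (10**6, 10**9):
        height = math.ceil(math.log(n, 500))
        print(f"{n:>13,} keys -> ~{height} node reads")

One million keys need only ~3 node reads per lookup, so just the nodes on the search path (plus whatever's cached) have to be in memory at once.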