> The compiler optimizes for data locality<p>> So, we have a single array in which every entry has the key and the value paired together. But, during lookups, we only care about the keys. The way the data is structured, we keep loading the values into the cache, wasting cache space and cycles. One way to improve that is to split this into two arrays: one for the keys and one for the values.<p>Someone recently proposed this for LLVM: <a href="https://discourse.llvm.org/t/rfc-add-a-new-structure-layout-optimization-pass/80596" rel="nofollow">https://discourse.llvm.org/t/rfc-add-a-new-structure-layout-...</a><p>Also, I think what you mean by data locality here is really data layout optimization, which, as you also mention, is a hard problem. But if the goal is just to improve cache locality, I think classic loop interchange also qualifies, though it's not enabled by default in LLVM despite having been there for quite a while.
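<p>To make the two ideas concrete, here is a minimal C sketch of the key/value split from the quoted passage and of loop interchange. It shows the transformations done by hand, not what any LLVM pass actually emits. The names (Entry, find_paired, find_split, sum_column_outer, sum_row_outer) and the 56-byte payload are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: key and value stored side by side (array of structs). */
typedef struct {
    uint32_t key;
    uint64_t value[7]; /* payload that a lookup never reads */
} Entry;

/* Paired layout: the scan only reads keys, but each entry it touches
 * drags the unused payload into the cache along with the key. */
ptrdiff_t find_paired(const Entry *entries, size_t n, uint32_t key) {
    assert(entries != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (entries[i].key == key)
            return (ptrdiff_t)i;
    return -1;
}

/* Split layout: keys are contiguous, so every byte fetched during the
 * scan is a key. The matching value lives at the same index in a
 * separate values array (not needed for the search itself). */
ptrdiff_t find_split(const uint32_t *keys, size_t n, uint32_t key) {
    assert(keys != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (keys[i] == key)
            return (ptrdiff_t)i;
    return -1;
}

/* Before interchange: the inner loop walks down a column of a
 * row-major matrix, so consecutive accesses are `cols` elements apart. */
long sum_column_outer(const int *m, size_t rows, size_t cols) {
    long sum = 0;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            sum += m[i * cols + j];
    return sum;
}

/* After interchange: same result, but the inner loop now walks
 * memory contiguously, one row at a time. */
long sum_row_outer(const int *m, size_t rows, size_t cols) {
    long sum = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            sum += m[i * cols + j];
    return sum;
}
```

Both pairs of functions return the same results; only the memory access pattern differs. The split changes the data layout, while the interchange leaves the data alone and changes the traversal order, which is the distinction the comment draws.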