I chased that rabbit hole briefly, and it's not at all clear that the hashed value is required to be <= UINT32_MAX. The closest thing is a claim by the same author as this post:<p>> It seems obvious that on 32-bit and 64-bit systems, the function should not give different results<p>and a commit that masks the result down to its low 32 bits in an implementation elsewhere.<p>Well, maybe that would be convenient, but overall it seems unimportant. The tool writing the table and the tool reading it have to agree, but cross compilation is absolutely full of hazards like this anyway.<p>The code looks fine to me, for what that's worth. I can see the assignment in the if being contentious.
I suspect the author of the hash function thought this wouldn't add more than 4 bits:<p><pre><code> h = (h << 4) + *name++;
</code></pre>
But as one should know, two n-bit numbers can create an n+1-bit result when added due to carry.
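To make the carry concrete, here is a rough sketch that runs the same loop with a 64-bit and a 32-bit accumulator. The names elf_hash_wide, elf_hash_narrow and SAMPLE_NAME are made up for illustration, and the folding step around the shift-and-add line is the usual ELF hash shape as I recall it, not a copy of any particular implementation.

```c
#include <stdint.h>

/* Hypothetical input: six bytes that fill the low 28 bits of h with ones,
   then one byte whose addition carries out of bit 31. */
static const unsigned char SAMPLE_NAME[] = {
    0xff, 0x0f, 0x0f, 0x0f, 0x0f, 0x0f, 0x12, 0x00
};

/* Accumulator forced to 64 bits, mimicking a 64-bit unsigned long. */
uint64_t elf_hash_wide(const unsigned char *name)
{
    uint64_t h = 0, g;
    while (*name) {
        h = (h << 4) + *name++;   /* the add can carry into bit 32 */
        g = h & 0xf0000000u;      /* only inspects bits 28..31 */
        if (g)
            h ^= g >> 24;
        h &= ~g;                  /* clears bits 28..31, leaves bit 32 alone */
    }
    return h;
}

/* Same loop with a 32-bit accumulator: the carry simply wraps away. */
uint32_t elf_hash_narrow(const unsigned char *name)
{
    uint32_t h = 0, g;
    while (*name) {
        h = (h << 4) + *name++;
        g = h & 0xf0000000u;
        if (g)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}
```

For SAMPLE_NAME the accumulator is 0x0fffffff before the last byte, so the shift gives 0xfffffff0 and adding 0x12 yields 0x100000002 in the wide version but 0x2 in the narrow one. Truncating the wide result to 32 bits makes the two agree for this input.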
Back when ELF was designed, architectures larger than 32 bits were <i>extremely</i> uncommon: either obsolete (36 and 40 bit) or expensive and exotic (Cray), so in neither case part of the ELF design space. So not a huge surprise.<p>I remember thinking at the time that it was an oversight, but it took more than another decade for that to even matter.
I have a question: what should I read for an introduction to the implementation/internals/design of hash functions?<p>I would like to go beyond my current understanding, which is basically “they’re effectively one-way functions”, and be able to participate in discussions of articles such as this one.
If someone checked in that code, it would definitely fail my code review. I understand back in the day it was different, but today there should be a lot of named intermediates. Additionally, `long` and any such keywords should not make it into any commit unless the commit explains
1) why it's needed and
2) how, with any standard-conforming implementation, it couldn't possibly cause a bug.<p>As always in C programming, the bugs arise from people doing stuff that any sane guideline tells them not to do.
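For what it's worth, a version in that style might look roughly like the sketch below. This is only one possible reading of "named intermediates" with a fixed-width type instead of `long`. The function name elf_hash_u32 and the variable names are invented here, and it assumes the usual ELF hash folding step around the shift-and-add line.

```c
#include <stdint.h>

/* Sketch only: fixed-width accumulator, one named value per step. */
uint32_t elf_hash_u32(const unsigned char *name)
{
    uint32_t hash = 0;
    for (; *name != '\0'; ++name) {
        const uint32_t shifted    = hash << 4;                  /* make room for 4 new bits */
        const uint32_t mixed      = shifted + *name;            /* wraps modulo 2^32 */
        const uint32_t top_nibble = mixed & UINT32_C(0xf0000000);
        const uint32_t folded     = mixed ^ (top_nibble >> 24); /* no-op when top_nibble is 0 */
        hash = folded & ~top_nibble;                            /* keep the result within 28 bits */
    }
    return hash;
}
```

Because XOR with zero changes nothing, the fold can run unconditionally, which also removes the assignment inside the `if`. With `uint32_t` the result is the same whatever the width of `long` on the platform.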
ELF is way too complex and not really adapted to current needs anymore.<p>We should start to deprecate DT_NEEDED and make dlopen/dlsym/dlclose (maybe dlvsym) hard symbols in the loader.<p>And game devs should stop using main(), since some genius glibc dev added a new libc_start_main version in 2.34. Namely, any game executable linked against glibc 2.34 or later will refuse to load on a system with an older glibc.<p>Actually, game binaries should be pure ELF64 binaries (not using main()) which "libdl" (dlopen/dlsym/dlclose) everything they need from the system. And of course, as much as possible should be statically linked (I think this is what unity is doing, but unreal/godot have a big issue: the static libstdc++ which, as of late, does not libdl anything from the system).