TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

How to implement a hash table in C (2021)

169 点作者 benhoyt11 个月前

13 条评论

commandersaki11 个月前
There was a fantastic benchmark of C and C++ hash tables doing the rounds a few weeks ago, it&#x27;s pretty fun reading: <a href="https:&#x2F;&#x2F;jacksonallan.github.io&#x2F;c_cpp_hash_tables_benchmark&#x2F;" rel="nofollow">https:&#x2F;&#x2F;jacksonallan.github.io&#x2F;c_cpp_hash_tables_benchmark&#x2F;</a>.<p>Unless I really didn&#x27;t want to introduce dependencies, or reduce code size, I think I&#x27;d use an off the shelf hash table implementation these days. It&#x27;s still a fun exercise building your own though.
评论 #40889003 未加载
akira250111 个月前
I really like the API design of libjudy for general hash table like interfaces. It prevents you from having to hash a key twice just to do &quot;check if key is not present, then set value, otherwise leave original value in hash.&quot; The pattern also meshes better with the way iterators naturally work.<p>Also, in terms of this table, if you add the computed key hash value to the stored hash entry, you can check that the value of the hashes match before you do the full strcmp. If you have a weak hash you might also get a benefit from checking that the first characters match before calling the full strcmp.<p>It would also make rehashing easier since you already have the full key available to you and don&#x27;t have to use your internal set function to move entries into the new table. In the posted implementation the big-O semantics of a rehash are worst case.<p>Anyways.. &quot;man hsearch.3&quot; if you want something cheap and easy.
评论 #40889051 未加载
SloopJon11 个月前
I was noodling in this area recently, trying to speed up some code similar to the tr utility:<p><pre><code> $ echo abcdef |tr abc ghi ghidef </code></pre> For an eight-bit character set, I found that building an array to map every character improved on linear search, even for short replacement strings and relatively short input strings.<p>There isn&#x27;t as easy a win for Unicode, so I played with some simple hash tables. Although the conventional wisdom is to use the high-order bits of a hash function&#x27;s output, FNV-1a is not so good for short inputs of one or two bytes. (I used the 32-bit variant.) It was better just to use the low-order bits.
iamevn11 个月前
if (size + size &lt; size) {<p>I know size_t wraps around on overflow so this would actually work, but this still makes me do a double take.
suprjami11 个月前
It&#x27;s the ihih guy! Thanks for being one of the few C bloggers out there.
评论 #40888864 未加载
评论 #40888893 未加载
MrVandemar11 个月前
Enjoyable article and thorough run through. I didn&#x27;t read all of it, but it really took me back to learning &#x27;C&#x27; in comp sci in the early 90s. Great times.
Joker_vD11 个月前
&gt; but it is non-ideal that I’m only allowing half the range of size_t.<p>I am fairly certain that in C it&#x27;s actually impossible to have an object whose size is larger than half of the range of size_t unless ptrdiff_t is wider than size_t, which normally isn&#x27;t. Unless, of course, C standard decided to make subtracting two valid pointers into the same array (or one past the end) a potential UB, just because.
评论 #40895158 未加载
dang11 个月前
Discussed at the time:<p><i>How to implement a hash table in C</i> - <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=26590234">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=26590234</a> - March 2021 (156 comments)
foresto11 个月前
In case anyone here is not already familiar with it, gperf is a perfect hash function generator.<p><a href="https:&#x2F;&#x2F;www.gnu.org&#x2F;software&#x2F;gperf&#x2F;manual&#x2F;gperf.html" rel="nofollow">https:&#x2F;&#x2F;www.gnu.org&#x2F;software&#x2F;gperf&#x2F;manual&#x2F;gperf.html</a>
glouwbug11 个月前
Open hash tables are typically the way to go in C. They drop the added complexity of managing a forward singly list by turning getters and setters typically into what is an array operation
aronhegedus11 个月前
This came up on an interview test where I had to implement this from scratch! Fun to read
a-dub11 个月前
might be neat if there were acceleration wins to be had with instructions for computing hash ints from machine ints and strings.<p>or even better, a full fledged linear probing hash lookup instruction.
timo-e-aus-e11 个月前
I played around with C++ when I was at university. Then never touched it again. So, with a grin I stumble over things like<p>&quot;void* ht_get(...)&quot;<p>Wait. What? A void pointer? Interesting... I have no clue.<p>I like articles like these. For someone not familiar with C it&#x27;s a perfect level. In terms of explanation and the code itself.
评论 #40888398 未加载
评论 #40889622 未加载
评论 #40904262 未加载
评论 #40888639 未加载