TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

You could have invented futexes

144 pointsby dmartoabout 2 years ago

10 comments

Animatsabout 2 years ago
It&#x27;s more like UNIX and Linux being way behind on threading technology. User space mutexes were in QNX over two decades ago. And on the UNIVAC 1108 half a century ago.[1] Here&#x27;s a implementation from 1972, by John Walker.<p><pre><code> . DIJKSTRA P FUNCTION . . . LA,U A0,&lt;QUEUE&gt; . LMJ X11,P . &lt;RETURN&gt; X5 DESTROYED . P* TS QHEAD,A0 LOCK THE QUEUE LX X5,QN,A0 LOAD QUEUE COUNT ANX,U X5,1 BACK UP THE COUNT SX X5,QN,A0 REPLACE THE COUNT IN THE QUEUE TN X5 DO WE NEED TO DEACTIVATE HIM ? J PDONE NO. SKIP DEACTIVATION ON TSQ=0 LX X5,QHL,A0 LOAD BACK LINK OF QUEUE SX X5,QHL,X4 PUT INTO BACK LINK OF ACTIVITY SX X4,QFL,X5 CHAIN ACTIVITY TO LAST ACTIVITY SA A0,QFL,X4 CHAIN HEAD TO NEW ACTIVITY SX X4,QHL,A0 MAKE THE NEW ACTIVITY LAST ON QUEUE CTS QHEAD,A0 RELEASE PROTECTION ON QUEUE HEAD SCHDACT* DACT$ . DEACTIVATE PROCESS OFF ON TSQ C$TSQ QHEAD,A0 WAIT FOR C$TSA OFF J 0,X11 RETURN AFTER ACTIVATION . PDONE CTS QHEAD,A0 UNLOCK THE QUEUE J 0,X11 RETURN </code></pre> And that followed Djykstra&#x27;s paper, published in Dutch in the late 1960s.<p>[1] <a href="https:&#x2F;&#x2F;www.fourmilab.ch&#x2F;documents&#x2F;univac&#x2F;fang&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.fourmilab.ch&#x2F;documents&#x2F;univac&#x2F;fang&#x2F;</a>
评论 #35710644 未加载
评论 #35715909 未加载
评论 #35713321 未加载
kazinatorabout 2 years ago
Hashing an address to map to a wait queue was done by early Unix kernels.<p>The address was called a &quot;wait channel&quot; (wchan for short).<p>There was no object declared at the address; any object or function could be waited on.<p>The <i>ps</i> utility still has a field called &quot;wchan&quot; you can select, which tells you where the process is waiting.<p>Hashing is basically way to add properties to an object (identified by an address) without physically extending the structure located at that address.<p>We can give any address a wait queue, for instance.<p>E.g. Javascript object properties are like this. If we have some <i>futex</i> object in Javascript, and give it a futex.waitQueue property, it&#x27;s basically the same: the futex thing is hashed somehow to access that property.<p>The main point of futexes is efficiency in the face of user&#x2F;kernel separation. Futexes are intended to be used in a double-checked pattern, avoiding a call into the kernel. User space can check the state of the futex memory location and avoid calling wait. The wait operation in the kernel will do the same check again (this time atomically) possibly avoiding the wait.. If that weren&#x27;t done, there would be a lost-wakeup problem in the use of mutexes.<p>Futexes are fast in the uncontended case; hence the F. Fast means in the context of the expensive user&#x2F;kernel boundary. You would never need futexes without such a boundary; they are not used in the kernel. Futexes inside a single protection space are indeed a non-invention; the invention is the idea that something outside of the protected space can inspect the value and act on it.
评论 #35710736 未加载
compumikeabout 2 years ago
Just a reminder that glibc &#x2F; pthread has a known futex-related threading bug in the wild that has been open for three years: <a href="https:&#x2F;&#x2F;sourceware.org&#x2F;bugzilla&#x2F;show_bug.cgi?id=25847" rel="nofollow">https:&#x2F;&#x2F;sourceware.org&#x2F;bugzilla&#x2F;show_bug.cgi?id=25847</a><p>It was discussed here two years ago: <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=24958504" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=24958504</a><p>But I only recently believe I ran into it in the wild when testing some high-concurrency multithreading code for several days. It would run a few hundred million operations successfully and then hang, CPU busy, on pthread_cond_wait().
评论 #35716499 未加载
RustyRussellabout 2 years ago
Oh, blast from the past!<p>The original futex code used physical addresses, and pinned the page, which is simple: if you can share memory somehow, you can use a futex.<p>Of course, most people use pthreads and increasingly the idea of a Linux userspace locking primitive gave way to a glibc-pthread accelerator...
评论 #35709141 未加载
rektideabout 2 years ago
Omg the multi-threaded code examples are amazing. The active line is highlighted. Amazingly illuminating, so so good.
评论 #35708887 未加载
jiveturkeyabout 2 years ago
Apparently Solaris uses park() to the same effect. I can&#x27;t readily find a history of either. Anyone know which came first, Solaris or Linux? In part I&#x27;m reacting to @Animats statement that UNIX is behind.
评论 #35710317 未加载
评论 #35710990 未加载
alex-moonabout 2 years ago
I really loved the little UI element in this showing the concurrent executing paths as highlighted lines in the code (with soothing little transitions no less).
评论 #35713478 未加载
fenaerabout 2 years ago
There is a typo in the first code example, “memoy” instead of “memory”
评论 #35716080 未加载
jesse__about 2 years ago
Thanks for writing this up -- I&#x27;ll be coming back to it when I need to implement mutexes.<p>As an aside -- it would be interesting to see what the analog on Windows is. I know from experience the platform-provided mutex implementation on win32 is horrendously slow, and it would be nice to know if it&#x27;s even possible on that platform to implement a &#x27;real&#x27; mutex that doesn&#x27;t suck. Anyone know of a reference implementation of mutexes on Windows?
评论 #35709153 未加载
评论 #35709577 未加载
评论 #35709281 未加载
评论 #35709136 未加载
评论 #35711752 未加载
评论 #35709129 未加载
评论 #35709126 未加载
bullenabout 2 years ago
Why write workarounds instead of simply using arrays of 64 byte atomic structures for your data?<p>That way you avoid cache misses and cache invalidation at the same time. And your software can run at full speed across multiple threads&#x2F;cores.
评论 #35711419 未加载