I enjoyed the content, but could have done without the constant hyping up of the edginess of lock-free data structures. I mean yes, like almost any heavily optimized structure, there are trade-offs that prevent this optimization from being globally applicable. But being borderline aroused at the "danger" and rule-breaking is tiresome and strikes me as juvenile.
I've done a little bit of "lock-free" programming in Rust, but it's for very specialized situations.[1] This allocates and releases bits in a bitmap. The bitmap represents the slots in use in the Vulkan bindless texture index, which resides in the GPU. You can't read those slots from the CPU side to see if an entry is in use, so in-use slots in that table have to be tracked with an external bitmap.

This has no unsafe code. It's all done with compare and swap. There is locking here, but it's down at the hardware level, within the compare-and-swap instruction. This is cleaner and more portable than relying on cross-CPU ordering of operations.

[1] https://github.com/John-Nagle/rust-vulkan-bindless/blob/main/alloc/src/bitalloc.rs
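A minimal sketch of the CAS-based bitmap approach described above, using only safe code. This is a hypothetical illustration with made-up names, not the linked crate's actual code, and it uses a single 64-bit word for brevity:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Tiny slot allocator: each bit marks one slot as in use.
struct BitAlloc {
    bits: AtomicU64, // one 64-slot word for brevity
}

impl BitAlloc {
    fn new() -> Self {
        Self { bits: AtomicU64::new(0) }
    }

    /// Claim the lowest free slot, or None if full.
    fn alloc(&self) -> Option<u32> {
        loop {
            let cur = self.bits.load(Ordering::Relaxed);
            if cur == u64::MAX {
                return None; // all 64 slots taken
            }
            let slot = cur.trailing_ones(); // index of lowest zero bit
            let next = cur | (1u64 << slot);
            // CAS: only one thread wins the race for this bit.
            if self
                .bits
                .compare_exchange_weak(cur, next, Ordering::AcqRel, Ordering::Relaxed)
                .is_ok()
            {
                return Some(slot);
            }
            // Lost the race; reload and retry.
        }
    }

    /// Release a slot so it can be claimed again.
    fn free(&self, slot: u32) {
        self.bits.fetch_and(!(1u64 << slot), Ordering::Release);
    }
}

fn main() {
    let a = BitAlloc::new();
    let s0 = a.alloc().unwrap();
    let s1 = a.alloc().unwrap();
    assert_eq!((s0, s1), (0, 1));
    a.free(s0);
    assert_eq!(a.alloc().unwrap(), 0); // freed slot is reused
}
```

The CAS retry loop is the whole trick: a lost race costs one reload, never a blocked thread.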
The claim that the lock-free array is faster than the locked variant is suspicious. The lock-free array performs a CAS for every operation, and that is going to dominate.[1] A plain mutex would do two CAS operations (or just one if it is a spinlock), so the order-of-magnitude difference is not explainable by the lock-free property alone.

Of course, if the mutex array is doing a linear scan to find the insertion point, that would explain the difference, but: a) I can't see the code for the alternative, and b) there is no reason why the mutex variant can't use a free list.

Remember:

- Lock-free doesn't automatically mean faster (though it has other properties that might be desirable even if slower).

- Never trust a benchmark you didn't falsify yourself.

[1] When uncontended; when contended, cache-coherence cost will dominate everything else, lock-free or not.
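To make the "one CAS if it is a spinlock" point concrete, here is a minimal spinlock sketch: acquiring it is a single compare-and-swap, releasing it a single store, so per-operation atomic traffic is roughly the same as the lock-free path. A hypothetical illustration, not a production lock (no backoff, no fairness):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Acquire = one CAS (plus retries only under contention); release = one store.
struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    const fn new() -> Self {
        Self { locked: AtomicBool::new(false) }
    }

    fn lock(&self) {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop(); // politely spin until the holder releases
        }
    }

    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

fn main() {
    let lock = SpinLock::new();
    lock.lock();
    assert!(lock.locked.load(Ordering::Relaxed)); // held after exactly one CAS
    lock.unlock();
    assert!(!lock.locked.load(Ordering::Relaxed)); // released with a single store
}
```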
> AtomicUsize: Used for indexing and freelist linkage. It's a plain old number, except it's watched 24/7 by the CPU's race condition alarm.

Is it, though? Aren't atomic load/store instructions the actually important thing? I know the type system ensures that `AtomicUsize` can only be accessed using atomic instructions, but saying it's being watched by the CPU is inaccurate.
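A small sketch of the distinction being made: nothing "watches" the value; the type just forces every access through an atomic instruction with an explicit ordering. (On x86-64, a `Relaxed` load or store typically compiles to an ordinary `MOV`, while a `fetch_add` becomes a `LOCK XADD`.)

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static COUNTER: AtomicUsize = AtomicUsize::new(0);

fn main() {
    // A Relaxed store: an ordinary store instruction, just one the
    // type system guarantees can't be torn or mixed with plain writes.
    COUNTER.store(41, Ordering::Relaxed);

    // A read-modify-write: this is where an actual locked/atomic
    // instruction is emitted. Returns the previous value.
    let prev = COUNTER.fetch_add(1, Ordering::SeqCst);
    assert_eq!(prev, 41);

    assert_eq!(COUNTER.load(Ordering::Relaxed), 42);
}
```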
Pretty interesting.

I finally bit the bullet and learned Rust in the last few months and ended up really liking it, but I have to admit that it's a bit lower level than I generally work in.

I have generally avoided locks by making very liberal use of Tokio channels, though that isn't for performance reasons or anything: I just find locks really hard to reason about for anything but extremely trivial use cases, and channels are a more natural abstraction for me.

I've never really considered what goes into these lock-free structures, but that might be one of my next "unemployment projects" after I finish my current one.
Is the advantage of a freelist over just an array of the values (implemented like a ring buffer) that you don't have to consume values in order? It just seems like throwing in a pointer lookup would add a lot of latency for something that's so latency-sensitive.
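Sketching the question: a freelist threads the "next free" links through the free slots themselves, so slots can be released in any order, which a ring buffer can't do without compaction. A hypothetical single-threaded illustration of the data-structure shape (the concurrency is orthogonal):

```rust
const NIL: usize = usize::MAX; // sentinel: end of the freelist

struct Slab<T> {
    slots: Vec<Option<T>>, // Some(value) = occupied, None = free
    next: Vec<usize>,      // next[i] = next free slot after i
    head: usize,           // first free slot, or NIL
}

impl<T> Slab<T> {
    fn with_capacity(n: usize) -> Self {
        Self {
            slots: (0..n).map(|_| None).collect(),
            // Chain every slot into the freelist: 0 -> 1 -> ... -> NIL.
            next: (0..n).map(|i| if i + 1 < n { i + 1 } else { NIL }).collect(),
            head: if n == 0 { NIL } else { 0 },
        }
    }

    /// Pop a slot off the freelist; O(1) regardless of free order.
    fn insert(&mut self, v: T) -> Option<usize> {
        if self.head == NIL {
            return None;
        }
        let i = self.head;
        self.head = self.next[i];
        self.slots[i] = Some(v);
        Some(i)
    }

    /// Release any slot, in any order: push it back on the freelist.
    fn remove(&mut self, i: usize) -> Option<T> {
        let v = self.slots[i].take()?;
        self.next[i] = self.head;
        self.head = i;
        Some(v)
    }
}

fn main() {
    let mut slab = Slab::with_capacity(3);
    let a = slab.insert("tex-a").unwrap();
    let _b = slab.insert("tex-b").unwrap();
    let _ = slab.remove(a); // free out of order, before "tex-b"
    assert_eq!(slab.insert("tex-c"), Some(a)); // freed slot reused immediately
}
```

Note the "pointer lookup" here is just an index into an array that's likely already in cache, so the latency cost is small compared to the flexibility gained.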
This is the kind of stuff that you shouldn't have to reinvent yourself but should be able to reuse from a good library, or even the standard library.

How would this compare to the lock-free abstractions that come with e.g. the java.util.concurrent package? It has a lot of useful primitives and data structures. I expect the memory overhead is probably worse for those.

Support for this is one of the big reasons Java and the JVM have been a popular choice for companies building middleware and data-processing frameworks for the last few decades. Exactly the kind of stuff that the author of this article is proposing you could build with this. Things like Kafka, Lucene, Spark, Hadoop, Flink, Beam, etc.
You can go one step further if:

- you don't reallocate the array

- you don't allow updating/removing past inserted values

In essence it becomes a log: a Vec&lt;OnceCell&lt;T&gt;&gt; or a Vec&lt;UnsafeCell&lt;Option&lt;T&gt;&gt;&gt;. It works well, but only for a bounded array, so applications like messaging or inter-thread communication are not a perfect fit.

It's a fixed-size vector that can be read at the same time it's being written to. It's not a common need.
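A minimal sketch of that bounded write-once "log" idea, using `OnceLock` (the thread-safe flavor of `OnceCell`) so it stays entirely in safe code. Names are hypothetical:

```rust
use std::sync::OnceLock;

/// Fixed-size vector whose cells can each be written exactly once,
/// and read concurrently while later cells are still being filled.
struct Log<T> {
    cells: Vec<OnceLock<T>>,
}

impl<T> Log<T> {
    fn new(cap: usize) -> Self {
        Self { cells: (0..cap).map(|_| OnceLock::new()).collect() }
    }

    /// Publish a value at `i`; returns Err(v) if the cell was already written.
    fn publish(&self, i: usize, v: T) -> Result<(), T> {
        self.cells[i].set(v)
    }

    /// Read a cell; None if it hasn't been written yet.
    fn get(&self, i: usize) -> Option<&T> {
        self.cells[i].get()
    }
}

fn main() {
    let log = Log::new(4);
    log.publish(0, "first").unwrap();
    assert_eq!(log.get(0), Some(&"first"));
    assert!(log.publish(0, "dup").is_err()); // write-once: second set fails
    assert_eq!(log.get(3), None);            // not yet written
}
```

Because cells are never updated or removed, readers never see a torn or stale value; the only synchronization is the once-flag inside each cell.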
Hope you enjoy this article on lock-free programming in Rust.

I used humor and analogies in the article not just to be entertaining, but to make difficult concepts like memory ordering and atomics more approachable and memorable.
Wow, that's a huge speed improvement.

I wonder if this is connected to that Rust optimization bounty post we saw the other day, where they couldn't get the safe Rust decoder closer than 5% to their C implementation. Maybe that's just the cost of safety.
> NOTE: In this snippet we ignore the ABA problem

The article doesn't go into details, but this is a subtle way to mess up when writing lock-free data structures:

https://en.wikipedia.org/wiki/ABA_problem
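The classic mitigation is a tagged (versioned) head: pack a generation counter next to the index so that a slot freed and reused between your load and your CAS no longer compares equal. A hypothetical sketch with a made-up layout (high 32 bits = tag, low 32 = index):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

struct TaggedHead {
    word: AtomicU64,
}

impl TaggedHead {
    fn pack(tag: u32, index: u32) -> u64 {
        ((tag as u64) << 32) | index as u64
    }

    /// Replace the head index, returning the old one.
    fn swap_head(&self, new_index: u32) -> u32 {
        loop {
            let cur = self.word.load(Ordering::Acquire);
            let (tag, old_index) = ((cur >> 32) as u32, cur as u32);
            // Bump the tag on every successful update. Even if `old_index`
            // comes back around (A -> B -> A), the tag differs, so the CAS
            // fails and we retry instead of silently corrupting the list.
            let next = Self::pack(tag.wrapping_add(1), new_index);
            if self
                .word
                .compare_exchange_weak(cur, next, Ordering::AcqRel, Ordering::Acquire)
                .is_ok()
            {
                return old_index;
            }
        }
    }
}

fn main() {
    let h = TaggedHead { word: AtomicU64::new(TaggedHead::pack(0, 5)) };
    assert_eq!(h.swap_head(9), 5);
    // Tag advanced along with the index.
    assert_eq!(h.word.load(Ordering::Relaxed), TaggedHead::pack(1, 9));
}
```

The other common fixes are deferred reclamation schemes (epochs, hazard pointers) that simply never reuse a node while a reader might still hold its old pointer.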
Obligatory video from Jon Gjengset, "Crust of Rust: Atomics and Memory Ordering": https://youtu.be/rMGWeSjctlY?si=iDhOLFj4idOOKby8