I enjoyed the content, but could have done without the constant hyping up of the edginess of lock-free data structures. I mean yes, like almost any heavily optimized structure, there are trade-offs that prevent this optimization from being globally applicable. But being borderline aroused at the "danger" and rule-breaking is tiresome and strikes me as juvenile.
I've done a little bit of "lock-free" programming in Rust, but it's for very specialized situations.[1] This allocates and releases bits in a bitmap. The bitmap represents the slots in use in the Vulkan bindless texture index, which resides in the GPU. You can't read those slots from the CPU side to see if an entry is in use, so in-use slots in that table have to be tracked with an external bitmap.

This has no unsafe code. It's all done with compare and swap. There is locking here, but it's down at the hardware level, within the compare-and-swap instruction. This is cleaner and more portable than relying on cross-CPU ordering of operations.

[1] https://github.com/John-Nagle/rust-vulkan-bindless/blob/main/alloc/src/bitalloc.rs
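A minimal sketch of the CAS-based bitmap approach described above, using only safe code. This is a hypothetical illustration with made-up names, not the linked crate's actual code, and it uses a single 64-bit word for brevity:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Tiny slot allocator: each bit marks one slot as in use.
struct BitAlloc {
    bits: AtomicU64, // one 64-slot word for brevity
}

impl BitAlloc {
    fn new() -> Self {
        Self { bits: AtomicU64::new(0) }
    }

    /// Claim the lowest free slot, or None if full.
    fn alloc(&self) -> Option<u32> {
        loop {
            let cur = self.bits.load(Ordering::Relaxed);
            if cur == u64::MAX {
                return None; // all 64 slots taken
            }
            let slot = cur.trailing_ones(); // index of lowest zero bit
            let next = cur | (1u64 << slot);
            // CAS: only one thread wins the race for this bit.
            if self
                .bits
                .compare_exchange_weak(cur, next, Ordering::AcqRel, Ordering::Relaxed)
                .is_ok()
            {
                return Some(slot);
            }
            // Lost the race; reload and retry.
        }
    }

    /// Release a slot so it can be claimed again.
    fn free(&self, slot: u32) {
        self.bits.fetch_and(!(1u64 << slot), Ordering::Release);
    }
}

fn main() {
    let a = BitAlloc::new();
    let s0 = a.alloc().unwrap();
    let s1 = a.alloc().unwrap();
    assert_eq!((s0, s1), (0, 1));
    a.free(s0);
    assert_eq!(a.alloc().unwrap(), 0); // freed slot is reused
}
```

The CAS retry loop is the whole trick: a lost race costs one reload, never a blocked thread.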
The claim that the lock-free array is faster than the locked variant is suspicious. The lock-free array performs a CAS for every operation, and that is going to dominate.[1] A plain mutex would do two CAS operations (or just one if it is a spinlock), so the order-of-magnitude difference is not explainable by the lock-free property alone.

Of course, if the mutex array is doing a linear scan to find the insertion point, that would explain the difference, but: a) I can't see the code for the alternative, and b) there is no reason why the mutex variant can't use a free list.

Remember:

- Lock-free doesn't automatically mean faster (though it has other properties that might be desirable even if slower).

- Never trust a benchmark you didn't falsify yourself.

[1] When uncontended; when contended, cache-coherence cost will dominate everything else, lock-free or not.
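To make the "one CAS if it is a spinlock" point concrete, here is a minimal spinlock sketch: acquiring it is a single compare-and-swap, releasing it a single store, so per-operation atomic traffic is roughly the same as the lock-free path. A hypothetical illustration, not a production lock (no backoff, no fairness):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Acquire = one CAS (plus retries only under contention); release = one store.
struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    const fn new() -> Self {
        Self { locked: AtomicBool::new(false) }
    }

    fn lock(&self) {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop(); // politely spin until the holder releases
        }
    }

    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

fn main() {
    let lock = SpinLock::new();
    lock.lock();
    assert!(lock.locked.load(Ordering::Relaxed)); // held after exactly one CAS
    lock.unlock();
    assert!(!lock.locked.load(Ordering::Relaxed)); // released with a single store
}
```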
> AtomicUsize: Used for indexing and freelist linkage. It's a plain old number, except it's watched 24/7 by the CPU's race condition alarm.

Is it, though? Aren't atomic load/store instructions the actually important thing? I know the type system ensures that `AtomicUsize` can only be accessed using atomic instructions, but saying it's being watched by the CPU is inaccurate.
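A small sketch of the distinction being made: nothing "watches" the value; the type just forces every access through an atomic instruction with an explicit ordering. (On x86-64, a `Relaxed` load or store typically compiles to an ordinary `MOV`, while a `fetch_add` becomes a `LOCK XADD`.)

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static COUNTER: AtomicUsize = AtomicUsize::new(0);

fn main() {
    // A Relaxed store: an ordinary store instruction, just one the
    // type system guarantees can't be torn or mixed with plain writes.
    COUNTER.store(41, Ordering::Relaxed);

    // A read-modify-write: this is where an actual locked/atomic
    // instruction is emitted. Returns the previous value.
    let prev = COUNTER.fetch_add(1, Ordering::SeqCst);
    assert_eq!(prev, 41);

    assert_eq!(COUNTER.load(Ordering::Relaxed), 42);
}
```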
Pretty interesting.

I finally bit the bullet and learned Rust in the last few months and ended up really liking it, but I have to admit that it's a bit lower level than I generally work in.

I have generally avoided locks by making very liberal use of Tokio channels, though that isn't for performance reasons or anything: I just find locks really hard to reason about for anything but extremely trivial use cases, and channels are a more natural abstraction for me.

I've never really considered what goes into these lock-free structures, but that might be one of my next "unemployment projects" after I finish my current one.
Is the advantage of a freelist over just an array of the values (implemented like a ring buffer) that you don't have to consume values in order? It just seems like throwing in a pointer lookup would add a lot of latency for something that's so latency-sensitive.
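Sketching the question: a freelist threads the "next free" links through the free slots themselves, so slots can be released in any order, which a ring buffer can't do without compaction. A hypothetical single-threaded illustration of the data-structure shape (the concurrency is orthogonal):

```rust
const NIL: usize = usize::MAX; // sentinel: end of the freelist

struct Slab<T> {
    slots: Vec<Option<T>>, // Some(value) = occupied, None = free
    next: Vec<usize>,      // next[i] = next free slot after i
    head: usize,           // first free slot, or NIL
}

impl<T> Slab<T> {
    fn with_capacity(n: usize) -> Self {
        Self {
            slots: (0..n).map(|_| None).collect(),
            // Chain every slot into the freelist: 0 -> 1 -> ... -> NIL.
            next: (0..n).map(|i| if i + 1 < n { i + 1 } else { NIL }).collect(),
            head: if n == 0 { NIL } else { 0 },
        }
    }

    /// Pop a slot off the freelist; O(1) regardless of free order.
    fn insert(&mut self, v: T) -> Option<usize> {
        if self.head == NIL {
            return None;
        }
        let i = self.head;
        self.head = self.next[i];
        self.slots[i] = Some(v);
        Some(i)
    }

    /// Release any slot, in any order: push it back on the freelist.
    fn remove(&mut self, i: usize) -> Option<T> {
        let v = self.slots[i].take()?;
        self.next[i] = self.head;
        self.head = i;
        Some(v)
    }
}

fn main() {
    let mut slab = Slab::with_capacity(3);
    let a = slab.insert("tex-a").unwrap();
    let _b = slab.insert("tex-b").unwrap();
    let _ = slab.remove(a); // free out of order, before "tex-b"
    assert_eq!(slab.insert("tex-c"), Some(a)); // freed slot reused immediately
}
```

Note the "pointer lookup" here is just an index into an array that's likely already in cache, so the latency cost is small compared to the flexibility gained.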
This is the kind of stuff that you shouldn't have to reinvent yourself but should be able to reuse from a good library, or even the standard library.

How would this compare to the lock-free abstractions that come with e.g. the java.util.concurrent package? It has a lot of useful primitives and data structures. I expect the memory overhead is probably worse for those.

Support for this is one of the big reasons Java and the JVM have been a popular choice for companies building middleware and data-processing frameworks for the last few decades. Exactly the kind of stuff that the author of this article is proposing you could build with this. Things like Kafka, Lucene, Spark, Hadoop, Flink, Beam, etc.
You can go one step further if:

- you don't reallocate the array

- you don't allow updating/removing past inserted values

In essence it becomes a log: a Vec&lt;OnceCell&lt;T&gt;&gt; or a Vec&lt;UnsafeCell&lt;Option&lt;T&gt;&gt;&gt;. It works well, but only for a bounded array, so applications like messaging or inter-thread communication are not a perfect fit.

It's a fixed-size vector that can be read at the same time it's being written to. It's not a common need.
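A minimal sketch of that bounded write-once "log" idea, using `OnceLock` (the thread-safe flavor of `OnceCell`) so it stays entirely in safe code. Names are hypothetical:

```rust
use std::sync::OnceLock;

/// Fixed-size vector whose cells can each be written exactly once,
/// and read concurrently while later cells are still being filled.
struct Log<T> {
    cells: Vec<OnceLock<T>>,
}

impl<T> Log<T> {
    fn new(cap: usize) -> Self {
        Self { cells: (0..cap).map(|_| OnceLock::new()).collect() }
    }

    /// Publish a value at `i`; returns Err(v) if the cell was already written.
    fn publish(&self, i: usize, v: T) -> Result<(), T> {
        self.cells[i].set(v)
    }

    /// Read a cell; None if it hasn't been written yet.
    fn get(&self, i: usize) -> Option<&T> {
        self.cells[i].get()
    }
}

fn main() {
    let log = Log::new(4);
    log.publish(0, "first").unwrap();
    assert_eq!(log.get(0), Some(&"first"));
    assert!(log.publish(0, "dup").is_err()); // write-once: second set fails
    assert_eq!(log.get(3), None);            // not yet written
}
```

Because cells are never updated or removed, readers never see a torn or stale value; the only synchronization is the once-flag inside each cell.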
Hope you enjoy this article on lock-free programming in Rust.

I used humor and analogies in the article not just to be entertaining, but to make difficult concepts like memory ordering and atomics more approachable and memorable.
Wow, that's a huge speed improvement.

I wonder if this is connected to that Rust optimization bounty post we saw the other day, where they couldn't get the safe Rust decoder closer than 5% to their C implementation. Maybe that's just the cost of safety.
> NOTE: In this snippet we ignore the ABA problem

The article doesn't go into details, but this is a subtle way to mess up when writing lock-free data structures:

https://en.wikipedia.org/wiki/ABA_problem
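The classic mitigation is a tagged (versioned) head: pack a generation counter next to the index so that a slot freed and reused between your load and your CAS no longer compares equal. A hypothetical sketch with a made-up layout (high 32 bits = tag, low 32 = index):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

struct TaggedHead {
    word: AtomicU64,
}

impl TaggedHead {
    fn pack(tag: u32, index: u32) -> u64 {
        ((tag as u64) << 32) | index as u64
    }

    /// Replace the head index, returning the old one.
    fn swap_head(&self, new_index: u32) -> u32 {
        loop {
            let cur = self.word.load(Ordering::Acquire);
            let (tag, old_index) = ((cur >> 32) as u32, cur as u32);
            // Bump the tag on every successful update. Even if `old_index`
            // comes back around (A -> B -> A), the tag differs, so the CAS
            // fails and we retry instead of silently corrupting the list.
            let next = Self::pack(tag.wrapping_add(1), new_index);
            if self
                .word
                .compare_exchange_weak(cur, next, Ordering::AcqRel, Ordering::Acquire)
                .is_ok()
            {
                return old_index;
            }
        }
    }
}

fn main() {
    let h = TaggedHead { word: AtomicU64::new(TaggedHead::pack(0, 5)) };
    assert_eq!(h.swap_head(9), 5);
    // Tag advanced along with the index.
    assert_eq!(h.word.load(Ordering::Relaxed), TaggedHead::pack(1, 9));
}
```

The other common fixes are deferred reclamation schemes (epochs, hazard pointers) that simply never reuse a node while a reader might still hold its old pointer.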
Obligatory video from Jon Gjengset, "Crust of Rust: Atomics and Memory Ordering": https://youtu.be/rMGWeSjctlY?si=iDhOLFj4idOOKby8