Nice article, though I think that any intro-level material on lock-free programming should always include a "don't try this at home for anything important" warning. Until you have some experience with this stuff you will almost certainly make mistakes, but these mistakes might only manifest themselves as crashes in extremely rare circumstances.<p>I wrote my first lock-free code in 2004 based on reading some papers by Maged Michael from IBM. I wrote a lock-free FIFO in PowerPC assembly, and was convinced it was safe and robust. When I emailed Maged about it, he pointed out that if a thread was suspended on one specific instruction and some specific memory was unmapped before it could run again, the program could crash. I was amazed; I had thought hard about this algorithm, but had completely missed that possibility.<p>Some other specific notes about the article:<p>> Basically, if some part of your program satisfies the following conditions, then that part can rightfully be considered lock-free.<p>There are actually several levels of lock-freedom defined in the literature: lock-freedom, wait-freedom, and obstruction-freedom. For more info see: <a href="http://en.wikipedia.org/wiki/Non-blocking_algorithm" rel="nofollow">http://en.wikipedia.org/wiki/Non-blocking_algorithm</a><p>> Processors such as PowerPC and ARM expose load-link/store-conditional instructions, which effectively allow you to implement your own RMW primitive at a low level, though this is not often done.<p>One benefit of load-linked/store-conditional (often abbreviated LL/SC) is that it avoids the ABA problem (<a href="http://en.wikipedia.org/wiki/ABA_problem" rel="nofollow">http://en.wikipedia.org/wiki/ABA_problem</a>). 
In practice this doesn't matter that much since x86 doesn't support LL/SC, but I just think it's an interesting factoid to know.<p>> For instance, PowerPC and ARM processors can change the order of memory stores relative to the instructions themselves, but the x86/64 family of processors from Intel and AMD cannot.<p>(I've edited my reply here since my original assertion was incorrect.) It's true that x86/64 won't reorder stores with respect to other stores (see <a href="http://en.wikipedia.org/wiki/Memory_ordering" rel="nofollow">http://en.wikipedia.org/wiki/Memory_ordering</a> for details), but it <i>will</i> let a load be reordered ahead of an earlier store to a different address, so memory barriers are still required in some situations. However, I believe that the atomic instructions ("lock cmpxchg" and "lock xadd") imply full barriers on x86.
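<p>To make the load-reordering point concrete, here is a rough sketch of the classic store-buffering litmus test in C++11. The function name count_both_zero is made up for illustration, and this is only a sketch, not a rigorous test harness. Each thread stores to its own flag and then loads the other thread's flag. With sequentially consistent atomics the C++ memory model forbids both loads from seeing 0; as far as I know, compilers targeting x86 usually get that guarantee by emitting an implicitly locked xchg (or a mov followed by mfence) for the store.

```cpp
#include <atomic>
#include <thread>

// Store-buffering litmus test (hypothetical helper, for illustration only).
// Runs the two-thread experiment `iterations` times and returns how many
// runs ended with BOTH loads observing 0.
int count_both_zero(int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_seq_cst);   // store to own flag...
            r1 = y.load(std::memory_order_seq_cst);  // ...then load the other
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });

        // join() synchronizes with thread completion, so reading r1 and r2
        // afterwards is race-free.
        t1.join();
        t2.join();

        // With seq_cst on every access, at least one thread must see the
        // other's store, so this outcome is forbidden.
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

<p>With seq_cst everywhere, count_both_zero should always return 0. If you weaken all four accesses to std::memory_order_relaxed, the language no longer rules out the "both zero" outcome, and on x86 it can in principle show up because a load is allowed to pass an earlier store to a different address. Whether you actually observe it depends heavily on timing, so don't read anything into a clean run.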