Slight nitpick - the definition of "race condition" on Wikipedia [0] is:<p><pre><code> [...] the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events
</code></pre>
If we take the first example - Parallel BFS - the correctness of the output could be considered the "system's substantive behavior". Properly implemented atomic checks (as demonstrated) still guarantee correct output under every possible ordering of events. Therefore, the system's "substantive behavior" is <i>not</i> dependent on the sequence or timing of other uncontrollable events, and therefore there is no "race condition" involved.<p>Of course, the term "race condition" here is used colloquially, for the sake of familiarity to the reader - the article correctly recognizes that the appropriate term for this kind of behavior is "non-determinism".<p>[0] <a href="https://en.wikipedia.org/wiki/Race_condition" rel="nofollow">https://en.wikipedia.org/wiki/Race_condition</a>
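To make the point concrete, here is my own minimal sketch (in Java, not the article's code) of the kind of atomic check I mean: whichever thread wins the compare-and-set becomes the parent, so the specific tree can vary from run to run, but every possible winner sits on the same BFS level, so the output is always a correct BFS tree.<p><pre><code>  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.List;
  import java.util.concurrent.atomic.AtomicIntegerArray;

  // Sketch: claim each vertex's parent with a compare-and-set.
  // Any thread may win the race for a given vertex, but all candidate
  // parents on a frontier are at the same depth, so the tree is valid.
  class AtomicBfsSketch {
      static List<Integer> expand(int[][] adj, List<Integer> frontier,
                                  AtomicIntegerArray parent) {
          List<Integer> next = new ArrayList<>();
          // In a real implementation this outer loop is split across threads.
          for (int u : frontier) {
              for (int v : adj[u]) {
                  // -1 means "unvisited"; the CAS claims v exactly once.
                  if (parent.compareAndSet(v, -1, u)) {
                      next.add(v);
                  }
              }
          }
          return next;
      }

      public static void main(String[] args) {
          int[][] adj = { {1, 2}, {0, 3}, {0, 3}, {1, 2} }; // a 4-cycle
          AtomicIntegerArray parent =
                  new AtomicIntegerArray(new int[] {-1, -1, -1, -1});
          parent.set(0, 0); // root is its own parent
          List<Integer> frontier = Arrays.asList(0);
          while (!frontier.isEmpty()) {
              frontier = expand(adj, frontier, parent);
          }
          // Serially this prints [0, 0, 0, 1]; with the frontier split across
          // threads, vertex 3's parent could be 1 or 2 - either tree is valid.
          System.out.println(parent);
      }
  }
</code></pre>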
This reminds me of Kuper and Newton's LVars paper that introduces lattice variables in Haskell: <a href="https://users.soe.ucsc.edu/~lkuper/papers/lvars-fhpc13.pdf" rel="nofollow">https://users.soe.ucsc.edu/~lkuper/papers/lvars-fhpc13.pdf</a> <a href="http://dx.doi.org/10.1145/2502323.2502326" rel="nofollow">http://dx.doi.org/10.1145/2502323.2502326</a>
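For those who haven't read it, here is a very rough Java-flavoured sketch of the core idea (the paper's interface is in Haskell and also has threshold reads, which I'm omitting): the only write an LVar allows is a lattice join, so the final state is the same no matter how concurrent writes interleave.<p><pre><code>  import java.util.Collections;
  import java.util.Set;
  import java.util.TreeSet;
  import java.util.concurrent.atomic.AtomicReference;

  // Join-only cell: the lattice is sets of ints ordered by inclusion,
  // and put() is a least-upper-bound (set union). Concurrent puts in any
  // order always converge to the same final contents.
  class JoinOnlyCell {
      private final AtomicReference<Set<Integer>> state =
              new AtomicReference<>(Set.of());

      void put(int x) {
          state.accumulateAndGet(Set.of(x), (a, b) -> {
              Set<Integer> joined = new TreeSet<>(a); // lub = union
              joined.addAll(b);
              return Collections.unmodifiableSet(joined);
          });
      }

      // In the paper, reads are "threshold reads"; returning the whole
      // state after all writers finish is a simplification.
      Set<Integer> read() { return state.get(); }
  }
</code></pre>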
> I’m not talking about data races. Data races are typically bugs, by definition.<p>One notable exception is the Racy Single-Check Idiom: <a href="http://javaagile.blogspot.com/2013/05/the-racy-single-check-idiom.html" rel="nofollow">http://javaagile.blogspot.com/2013/05/the-racy-single-check-...</a><p>It is particularly suitable for lazy initialization in code that is typically (but not necessarily) executed single-threaded, and is famously used in Java’s <i>String.hashCode()</i> implementation.
I appreciate a provocative title, but I think the practical lesson in almost all cases is the inverse: fixing race conditions can introduce performance bottlenecks.
Tell me about non-guaranteed order of operations in GPU reductions and floating-point results changing slightly between two runs. Yes, it's useful and you get the goddamn FP32 TFLOPS, but damn, it makes testing, validating, and qualifying systems harder. And yes, I know one shouldn't rely on or test for exact equality, but not knowing the actual order of FP operations makes numerical analysis of the error harder (you just have to assume the worst case for every reduction, ugh).<p>EDIT: and don't get me started on tensor cores and clever tricks to have them do 'fp32-alike' accuracy. Yes, wonderful magic, but how do you reason about these new objects without a whole new slew of tools?
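A tiny CPU-side illustration of the same effect (my own sketch, nothing GPU-specific): summing identical 32-bit floats in two different orders usually does not give bit-identical results, which is exactly why a reduction whose order changes between runs breaks equality-based checks.<p><pre><code>  import java.util.Collections;
  import java.util.List;
  import java.util.Random;
  import java.util.stream.Collectors;
  import java.util.stream.IntStream;

  // Float addition is not associative, so the reduction order matters.
  // Shuffling stands in for whatever order a parallel reduction
  // happened to use on a given run.
  class ReductionOrder {
      public static void main(String[] args) {
          Random rng = new Random(42);
          List<Float> xs = IntStream.range(0, 1_000_000)
                  .mapToObj(i -> (rng.nextFloat() - 0.5f) * 1e6f)
                  .collect(Collectors.toList());

          float inOrder = 0f;
          for (float x : xs) inOrder += x;

          Collections.shuffle(xs, new Random(7));
          float shuffled = 0f;
          for (float x : xs) shuffled += x;

          // Typically prints two slightly different sums.
          System.out.println(inOrder + " vs " + shuffled);
      }
  }
</code></pre>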
If you're just doing BFS, why do you care who the parent is? Why not just choose the parent to be the predecessor? I.e., if you visit 4 from 1, then 1 is the parent. Why do you need to check a list of potential parents?