It's not that Arrays are not thread safe; it's just that the code was written in a non-thread-safe way.<p>Writing<p><pre><code> x[i] -= 1
</code></pre>
actually means<p><pre><code> x[i] = x[i] - 1
</code></pre>
So, there's a read, a subtraction, and a write, and they all happen sequentially. Since they are not in a transaction or protected by a mutex, nothing guarantees that other threads don't mutate `x[i]` in the meantime.<p>This has nothing to do with Ruby, and nothing to do with multicore, either. Even on a single CPU core, threads might interleave and cause unexpected behavior.
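To make the interleaving concrete, here is a minimal sketch that replays one bad schedule by hand. It uses no real threads, so the result is the same on every run; the method name `replay_lost_update` and the starting value of 10 are only illustrative assumptions.

```ruby
# Deterministic replay of one unlucky interleaving of `x[i] -= 1`.
# The two "threads" are simulated with local variables, not Thread objects.
def replay_lost_update
  x = [10]
  i = 0

  read_by_a = x[i]      # thread A reads 10
  read_by_b = x[i]      # thread B reads 10 before A has written back
  x[i] = read_by_a - 1  # thread A writes 9
  x[i] = read_by_b - 1  # thread B also writes 9; A's decrement is lost

  x[i]
end

puts replay_lost_update # prints 9, not 8
```

Two decrements ran, but the stored value dropped by only one. With real threads the same schedule can occur whenever the scheduler switches between the read and the write.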
Under the semantics described here, Java or C# core classes aren't "thread-safe" either, and I'd expect the <i>vast</i> majority of standard libraries to completely fail the test (potential winners: Clojure using an immutable collection bound to an atom, since atoms have compare-and-swap semantics; and probably Haskell somehow). The example code requires performing the following actions atomically:<p>* Loading an instance-local collection<p>* Fetching a numeric value from the collection<p>* Incrementing or decrementing the numeric value<p>* Putting the updated value back<p>Even if the collection is "synchronized" (each method call takes a collection-local lock), because the value is altered outside the collection there's no way for the change to be atomic unless it's wrapped in a transaction block or protected by a lock. As far as I can tell, the only ways for core classes to "be thread-safe" in the sense of the example (while keeping mutable-collection semantics) would be either to have collections dedicated specifically to the operation, such as Java 6's AtomicIntegerArray (<a href="http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/atomic/AtomicIntegerArray.html" rel="nofollow">http://docs.oracle.com/javase/6/docs/api/java/util/concurren...</a>), or to have <i>the collection</i> apply the operation internally, e.g. a Hash#apply(key, &operation) used roughly like this:<p><pre><code>def decrease(item)
  @stock.apply(item) { |from| from - 1 }
end</code></pre>
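A rough sketch of what such an `apply` could look like, assuming a small wrapper class rather than a change to Hash itself. `LockedHash`, its `apply` method, and `run_locked_hash_demo` are hypothetical names made up for illustration; nothing like this exists in Ruby's core Hash.

```ruby
# Hypothetical wrapper: LockedHash and #apply are illustrative names,
# not part of Ruby's core Hash API.
class LockedHash
  def initialize(initial = {})
    @data = initial.dup
    @lock = Mutex.new
  end

  # Runs the read, the block, and the write under a single lock, so no
  # other thread can touch the key between the read and the write.
  def apply(key)
    @lock.synchronize { @data[key] = yield(@data[key]) }
  end

  def [](key)
    @lock.synchronize { @data[key] }
  end
end

# 10 threads each decrement the same key 100 times, starting from 1000.
def run_locked_hash_demo
  stock = LockedHash.new({ shirts: 1000 })
  threads = Array.new(10) do
    Thread.new { 100.times { stock.apply(:shirts) { |from| from - 1 } } }
  end
  threads.each(&:join)
  stock[:shirts]
end

puts run_locked_hash_demo # prints 0
```

The block runs while the lock is held, so it should stay short and must not call back into the same `LockedHash` (Ruby's Mutex is not reentrant).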
Hi. OP here.<p>Thanks for the comments about atomicity vs. thread-safety. Absolutely on point. The article started out demonstrating what happened with concurrent Array mutation, but then I put in that += operation and didn't address it. Sorry for not making the distinction. Atomicity is absolutely a different issue than a thread-safe collection. I'm publishing something new tomorrow that addresses this point.<p>To bring things back to code, the point I was originally trying to make is that this code is not thread-safe.<p><pre><code>array = []
threads = []
10.times do
  threads << Thread.new do
    100.times { array.push(rand) }
  end
end
threads.each(&:join)
# 10 threads each inserted 100 values, result should be 1000
puts array.size
</code></pre>
Specifically, too many Ruby programmers assume without a second thought that this operation is thread-safe:<p><pre><code> array.push(item)
</code></pre>
But there's no such guarantee. This shows up nicely when the code example is run on an implementation with no global lock; try it on JRuby.
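For comparison, here are two ways the 10-thread example could be made safe regardless of the implementation, sketched under the assumption that only the final count matters. The method names `push_with_mutex` and `push_with_queue` are made up for this sketch.

```ruby
# Option 1: guard every push on the shared Array with one Mutex.
def push_with_mutex
  array = []
  lock = Mutex.new
  threads = Array.new(10) do
    Thread.new do
      100.times { lock.synchronize { array.push(rand) } }
    end
  end
  threads.each(&:join)
  array.size
end

# Option 2: collect into a Queue, which is designed for use across
# threads, then drain it into a plain Array after all threads finish.
def push_with_queue
  queue = Queue.new
  threads = Array.new(10) do
    Thread.new { 100.times { queue.push(rand) } }
  end
  threads.each(&:join)
  Array.new(queue.size) { queue.pop }.size
end

puts push_with_mutex # prints 1000
puts push_with_queue # prints 1000
```

The Mutex version keeps the Array but serializes every push; the Queue version avoids explicit locking in the calling code at the cost of a final copy.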
The article seems to be wrong in several respects. First, the issue described has nothing to do with arrays; the same problem happens when using a plain number:<p><pre><code>class Inventory
  def initialize(nb)
    @nb_items = nb
  end

  def decrease
    @nb_items -= 1
  end

  def nb_items
    @nb_items
  end
end

@inventory = Inventory.new(4000)
threads = Array.new
400.times do
  threads << Thread.new do
    10.times do
      @inventory.decrease
    end
  end
end
threads.each(&:join)
puts @inventory.nb_items
</code></pre>
Second, the mutex in the OP's code synchronizes the whole block passed to a thread, i.e. there's no parallelism at all (the second thread waits until the first one finishes, and so on). It should rather be something like:<p><pre><code>class Inventory
  def initialize(nb)
    @nb_items = nb
    @lock = Mutex.new
  end

  def decrease
    # Hold the lock only for the read-modify-write itself.
    @lock.synchronize do
      @nb_items -= 1
    end
  end

  def nb_items
    @nb_items
  end
end

@inventory = Inventory.new(4000)
threads = Array.new
400.times do
  threads << Thread.new do
    10.times do
      @inventory.decrease
    end
  end
end
threads.each(&:join)
puts @inventory.nb_items</code></pre>
Nice write-up. It's easy to fall into the trap of assuming that things are "thread-safe" when writing under the iron fist of a Global Interpreter Lock.<p>Ruby threads seem to be a fairly narrow topic on which to base a book; I look forward to seeing what the book covers.
This still doesn't explain why the MRI implementation is accidentally thread-safe. Why doesn't the interpreter switch threads after reading the value from the hash but before storing the updated value?
Does Ruby have a spec? If not, I don't really see this as a problem... synchronization, especially in a dynamic language that's already really slow, would just add more overhead.