Wilson's highly readable "Uniprocessor Garbage Collection Techniques" is a far better overview: <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.5038" rel="nofollow">http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138....</a> You could write a decent GC with just the info in that paper.<p>It doesn't cover parallel or real-time garbage collection (which both get far more complicated); for those, you want Jones et. al's _The Garbage Collection Handbook_ (<a href="http://www.amazon.com/The-Garbage-Collection-Handbook-Management/dp/1420082795/" rel="nofollow">http://www.amazon.com/The-Garbage-Collection-Handbook-Manage...</a>) and plenty of time to explore its bibliography. (His older book is also good, but doesn't cover those topics.)
The reference counting metaphor leaves out a critical aspect of reference-counting: cyclical structures and how to collect them. Books never write their names in other books so this doesn't appear in his discussion, though it's one of the more important complications for implementations.
One of the most fun discussions of the complications of reference counting and how nuanced you have to be comes from the Python docs:<p><a href="http://docs.python.org/extending/extending.html#reference-counts" rel="nofollow">http://docs.python.org/extending/extending.html#reference-co...</a><p>It relates "a true story. An older version of Python contained variants of this bug and someone spent a considerable amount of time in a C debugger to figure out why his __del__() methods would fail" in a considerable amount of detail.
<i>At the December 2008 Kyushu Ruby 01 Conference, I asked the audience "How many of you here have some interest in garbage collection?" Out of 200 people, only 3 raised their hands</i><p>Can you imagine how many people would raise their hands if you asked that question NOW at a ruby conference? Crazy how times change.
Another great GC discussion for LuaJit: <a href="http://wiki.luajit.org/New-Garbage-Collector" rel="nofollow">http://wiki.luajit.org/New-Garbage-Collector</a>
In the article he states "[for Copying] Conservative GC is unable to determine the difference between pointer and integer values. If an integer value was overwritten the result would be disastrous. It is likely that the result of 1+1 would change occasionally."<p>However reading through the RHG, it is shown that Ruby allocates memory for objects in 20 byte blocks and that this results in all pointers therefore being multiples of four. This is handy as it allows the direct usage of literals such as FixNum (an always odd 'pointer' that you shift right by one bit to access the value) and Symbol (a 'pointer' that always has 0xff as the trailing byte that you shift right 1 byte to access the unique ID) without requiring object creation.<p>With this in mind, can someone enlighten me as to why Copying could not be used inside Ruby?
It seems as though it would be trivial for the GC to differentiate between literals and pointers as otherwise they would not be much use as literals.
I don't see how a Copying GC can be faster than Mark and Sweep. Transferring a book may be easy, but copying an object can be quite costly. I can't imagine copying almost all objects everytime the GC runs.
I recommend Elise Huard's recent talk on Ruby GC: <a href="http://skillsmatter.com/podcast/home/ruby-bin-men" rel="nofollow">http://skillsmatter.com/podcast/home/ruby-bin-men</a><p>(Also, I'd love to see the Hungarian Folk Dance interpretation of different GC approaches <a href="http://t.co/l8ADbEQR" rel="nofollow">http://t.co/l8ADbEQR</a>)
nice introduction, but it seems to be missing some really important concepts<p>- generational GC<p>- concurrent GC, the parts of GC which can be concurrent (which depends on your algorithm), and the tradeoffs needed to let your program run concurrently with the GC without stopping the world.
very nice article. given its likely target audience, though, he really needs to explain explicitly why mark-and-sweep can't compact as it goes, rather than leaving a fragmented heap.