And this, kids and kid-ettes, is why having someone who knows their way around your C compiler, assembler, and debugger on your team can be worthwhile, even if you're working in a high-level language like Ruby.
This is why I hate garbage collection -- it tends to break horribly as soon as someone does anything unexpected. Using 800 kB of stack is an incredibly dumb thing to do (not as bad on a 64-bit system as it is on a 32-bit system, admittedly), but it still shouldn't cause a huge drop in performance.
The local fix is nice, but I would be tempted to offer a solution for all code that does the same thing, namely putting a big chunk of GC-opaque data on the stack. It might take the form of telling the garbage collector to skip that address range.
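To make the "skip this range" idea concrete, here is a toy model of a conservative stack scan, not Ruby's actual collector. The stack is a plain list of integer words, a word counts as a root if it equals a known heap address, and `scan_stack`, `opaque_ranges`, and the addresses are all made-up names for illustration. The buffer is 100,000 words, roughly the 800 kB mentioned above on a 64-bit machine.

```python
def scan_stack(stack, heap_addrs, opaque_ranges=()):
    """Conservatively scan stack words, skipping registered opaque ranges.

    opaque_ranges holds half-open (start, end) index pairs into `stack`.
    Returns (roots_found, words_examined).
    """
    roots = set()
    examined = 0
    skip = sorted(opaque_ranges)
    i = 0  # current stack index
    k = 0  # current position in the sorted skip list
    while i < len(stack):
        # Drop ranges that end at or before the current index.
        while k < len(skip) and skip[k][1] <= i:
            k += 1
        # If we are inside an opaque range, jump straight to its end.
        if k < len(skip) and skip[k][0] <= i:
            i = skip[k][1]
            continue
        examined += 1
        if stack[i] in heap_addrs:  # word looks like a heap pointer
            roots.add(stack[i])
        i += 1
    return roots, examined


# Hypothetical layout: two real pointers around a 100,000-word opaque buffer.
heap_addrs = {0x1000, 0x2000}
stack = [0x1000] + [0] * 100_000 + [0x2000]
buffer_range = (1, 100_001)

roots_all, n_all = scan_stack(stack, heap_addrs)
roots_skip, n_skip = scan_stack(stack, heap_addrs, [buffer_range])
print(n_all, n_skip)
```

Both scans find the same two roots, but the second one only looks at the words outside the registered buffer. A real collector would also need a way to unregister the range when the frame returns, which this sketch leaves out.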
Well, one of my friends sent me a formula. Let x be the fraction of total running time that Ruby now spends in GC; then<p>(1 - 2x) * 14 = 1 - x => x = 13/27<p>which means Ruby now spends 13/27 of its time on GC. Fast enough? (Before the patch, it spent 26/27 of its time on GC, which is really bad.)
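For anyone who wants to check the arithmetic, here is a small sketch using exact fractions. My reading of the formula, which the comment does not spell out, is that the non-GC work is the same before and after the patch, the patched run is 14 times faster overall, and the GC share before the patch is taken to be twice the share after it. The names `speedup` and `gc_share_ratio` are mine.

```python
from fractions import Fraction

speedup = 14        # assumed: total time before / total time after
gc_share_ratio = 2  # assumed: GC share before / GC share after

# Non-GC work is equal in both runs:
#   (1 - gc_share_ratio * x) * speedup = 1 - x
# Solving for x:
#   x = (speedup - 1) / (gc_share_ratio * speedup - 1)
x_after = Fraction(speedup - 1, gc_share_ratio * speedup - 1)
x_before = gc_share_ratio * x_after

# Sanity check: both sides of the original equation agree.
lhs = (1 - gc_share_ratio * x_after) * speedup
rhs = 1 - x_after

print(x_after, x_before, lhs == rhs)
```

Under those assumptions the GC share drops from about 96% of the run to about 48%.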