It already handles long lists without recursion:<p><pre><code> case Lisp_Cons:
   {
     register struct Lisp_Cons *ptr = XCONS (obj);
     if (CONS_MARKED_P (ptr))
       break;
     CHECK_ALLOCATED_AND_LIVE (live_cons_p);
     CONS_MARK (ptr);
     /* If the cdr is nil, avoid recursion for the car.  */
     if (EQ (ptr->u.s.u.cdr, Qnil))
       {
         obj = ptr->u.s.car;
         cdr_count = 0;
         goto loop;
       }
     mark_object (ptr->u.s.car);
     obj = ptr->u.s.u.cdr;
     cdr_count++;
     if (cdr_count == mark_object_loop_halt)
       emacs_abort ();
     goto loop;
   }
</code></pre>
The "goto loop" chases the <i>cdr</i> pointer iteratively. So with regard to list structure, this will only blow the stack if you have deep <i>car</i> recursion. When conses are used to represent nested lists, you don't get that much depth on the CAR side.<p>I'm just wondering what is the exact scenario, involving what structure.<p>Of course there can be other kinds of linked objects. E.g. a huge graph structure, where the GC happens to traverse a very long path through the graph before bottoming out somewhere.<p>BTW, "[i]f the cdr is nil, avoid recursion for the car." Initial reaction: good frickin' idea and I'm stealing it immediately. Though, wait, not sure what it buys you in practical terms; who the heck builds a deep structure of conses where the linkage is through <i>car</i> and the <i>cdr</i>-s are nil.<p>A better test might be for the <i>cdr</i> being any atom at all. Though obviously some kinds of atoms are structures with pointers that can have depth behind them, many kinds of atoms are shallow. If a <i>cons</i> has any atom whatsoever as its <i>cdr</i>, then maybe we should recurse on the <i>cdr</i> and tail loop on the <i>car</i>.<p>Analysis required.
Interesting. So that's why Emacs wigs out sometimes when trying to open very large files: it has a recursive GC that gets into trouble when it runs out of stack space. Assuming I read the first statement correctly.<p>Seems like a welcome change. I'm not too well versed in the internals, but does 4.6MB seem reasonable for a memory footprint? I mean, compared to Slack, not bad.<p>> <i>The naive version of this scheme needs about 4.6MB of overhead on my current 20MB Emacs heap, but it should be possible to reduce the overhead significantly by taking advantage of the block allocation we do for conses and other types --- we can put whole blocks on the queue instead of pointers to individual block parts, so we can get away with a much smaller queue.</i>
> <i>We need to fix GC being deeply recursive once and for all. Tweaking stack sizes on various platforms and trying to spot-fix GC for the occasional deeply recursive structure is annoying. Here's my proposal:</i><p>There is still a default stack size limit of 8 megabytes on Linux,
even though consumer machines nowadays have RAM measured in double-digit gigabytes.<p>On an Ubuntu 16.x machine:<p><pre><code> $ ulimit -a | grep stack
 stack size              (kbytes, -s) 8192
 $ free -h
                total        used        free      shared  buff/cache   available
 Mem:             15G        4.1G        2.8G        600M        8.7G         10G
 Swap:            14G        626M         13G
</code></pre>
This 8192 kbytes has not changed in at least 15 years, unlike that 15G. Slight disconnect there.<p>This 8192 is just the soft limit; applications can raise it up to the root-imposed hard limit.<p>Unfortunately, that tends to be only 4X larger than the 8192:<p><pre><code> $ ulimit -s 32769
bash: ulimit: stack size: cannot modify limit: Operation not permitted
$ ulimit -s 32768
$ # OK
</code></pre>
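The same dance works from inside a process via the standard getrlimit/setrlimit calls; a minimal sketch (error handling kept terse):<p><pre><code> #include <stdio.h>
 #include <sys/resource.h>

 int main (void)
 {
   struct rlimit rl;
   if (getrlimit (RLIMIT_STACK, &rl) != 0)
     return 1;
   printf ("soft: %ld  hard: %ld\n",
           (long) rl.rlim_cur, (long) rl.rlim_max);
   /* Raise the soft limit as far as the hard limit allows; going
      past rlim_max fails for unprivileged processes, exactly like
      the ulimit attempt above.  */
   rl.rlim_cur = rl.rlim_max;
   return setrlimit (RLIMIT_STACK, &rl) == 0 ? 0 : 1;
 }
</code></pre>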
Clearly, the hard limit needs revision too.<p>Threads are another issue; if you have a "can run in any thread" garbage collector, it has to live within a thread stack, and you can't have huge stacks for each of a large number of threads. A thread's stack cannot be temporarily extended, either: a thread stack is just another mmap, and something else in the address space may be sitting right where it would need to grow. (Don't think Emacs has this problem. :)
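For what it's worth, a thread's stack size does have to be picked once, at creation time, which is exactly why sizing for worst-case GC recursion is awkward; a minimal pthread sketch (gc_thread here is a stand-in for whatever would recurse deeply):<p><pre><code> #include <pthread.h>
 #include <stddef.h>

 static void *
 gc_thread (void *arg)
 {
   (void) arg;
   /* ...deeply recursive marking would live here...  */
   return NULL;
 }

 int main (void)
 {
   pthread_attr_t attr;
   pthread_t tid;
   pthread_attr_init (&attr);
   /* The size is fixed here, before the thread exists; 64 MB for
      one thread is fine, but not for each of thousands.  */
   pthread_attr_setstacksize (&attr, 64 * 1024 * 1024);
   if (pthread_create (&tid, &attr, gc_thread, NULL) != 0)
     return 1;
   pthread_join (tid, NULL);
   pthread_attr_destroy (&attr);
   return 0;
 }
</code></pre>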
Why don't they just use the Deutsch-Schorr-Waite algorithm to store the mark stack in the heap itself with pointer-reversal tricks? This would remove the additional overhead.
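For the curious, a minimal sketch of the pointer-reversal walk over a hypothetical two-pointer cell (the real Lisp_Cons has no spare flag bits laid out like this; an actual implementation would have to steal tag bits from the pointers themselves):<p><pre><code> #include <stdbool.h>
 #include <stddef.h>

 /* Hypothetical cell: two pointers plus a mark bit and one extra
    "did_car" bit recording which child we descended through.  */
 struct cell
 {
   struct cell *car, *cdr;
   bool marked;
   bool did_car;
 };

 /* Deutsch-Schorr-Waite marking: O(1) extra space.  The path back
    to the root is threaded through the cells themselves by
    reversing the pointer we descended through, then restoring it
    on the way back up.  */
 static void
 dsw_mark (struct cell *root)
 {
   struct cell *prev = NULL, *cur = root;

   for (;;)
     {
       if (cur != NULL && !cur->marked)
         {
           /* Advance: mark CUR, descend into its car, and leave
              the parent pointer behind in cur->car.  */
           struct cell *next = cur->car;
           cur->marked = true;
           cur->did_car = false;
           cur->car = prev;
           prev = cur;
           cur = next;
         }
       else if (prev == NULL)
         break;                 /* back above the root: done */
       else if (!prev->did_car)
         {
           /* Car side finished: restore prev->car, then swing into
              the cdr, reversing that pointer instead.  */
           struct cell *parent = prev->car;
           prev->car = cur;
           cur = prev->cdr;
           prev->cdr = parent;
           prev->did_car = true;
         }
       else
         {
           /* Both sides finished: restore prev->cdr and retreat
              one step up the reversed-pointer chain.  */
           struct cell *parent = prev->cdr;
           prev->cdr = cur;
           cur = prev;
           prev = parent;
         }
     }
 }
</code></pre>
The price is extra writes to every cell and pointers that are temporarily scrambled mid-walk, which rules out concurrent or incremental marking while a reversal is in flight.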
Did something similar a while ago for our not-so-great D GC.
<a href="https://github.com/dlang/druntime/pull/1073" rel="nofollow">https://github.com/dlang/druntime/pull/1073</a>
I wonder if a hybrid approach is possible. Run the GC recursively, but arrange for the segfault handler to run on an alternate stack (sigaltstack). If a segfault occurs due to insufficient stack space, do the rest of the GC using the "queue draining loop" approach, within the small amount of alternate-stack memory, then longjmp to some point in the original GC to bail out of the blown stack.
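A minimal sketch of that scaffolding, assuming hypothetical mark_object_recursive and drain_mark_queue routines; a real handler would also have to check that the faulting address actually sits near the stack guard before treating a random segfault as an overflow:<p><pre><code> #include <setjmp.h>
 #include <signal.h>

 extern void mark_object_recursive (void); /* hypothetical fast path */
 extern void drain_mark_queue (void);      /* hypothetical iterative rescue */

 static sigjmp_buf gc_overflow_jmp;

 static void
 segv_handler (int sig)
 {
   (void) sig;
   /* We're on the alternate stack here; abandon the blown C stack
      and resume at the sigsetjmp below, which also restores the
      signal mask (savesigs was 1).  */
   siglongjmp (gc_overflow_jmp, 1);
 }

 void
 gc_mark_roots (void)
 {
   static char altstack[64 * 1024];
   stack_t ss = { .ss_sp = altstack, .ss_size = sizeof altstack };
   struct sigaction sa;

   sigaltstack (&ss, NULL);
   sa.sa_handler = segv_handler;
   sa.sa_flags = SA_ONSTACK;
   sigemptyset (&sa.sa_mask);
   sigaction (SIGSEGV, &sa, NULL);

   if (sigsetjmp (gc_overflow_jmp, 1) == 0)
     mark_object_recursive ();  /* normal recursive marking */
   else
     drain_mark_queue ();       /* stack blown: finish iteratively */
 }
</code></pre>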
Super dumb question: why not just use a queue, and visit the objects iteratively?<p>E.g. each visited object appends/prepends its children to the queue of objects to visit, and you repeatedly drain it until it's empty.<p>That's what I had to do many years ago for the GC I was working on.<p>Then any special cases (like the cons case referenced elsewhere) just become performance optimizations rather than "don't crash" correctness fixes.
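A minimal sketch of that scheme over a hypothetical two-field cell, with a grow-on-demand worklist (LIFO here, though FIFO marks the same set):<p><pre><code> #include <stdbool.h>
 #include <stdlib.h>

 struct cell { struct cell *car, *cdr; bool marked; };

 struct worklist { struct cell **items; size_t len, cap; };

 static void
 push (struct worklist *w, struct cell *c)
 {
   if (c == NULL || c->marked)
     return;                    /* nil, or already queued/visited */
   c->marked = true;            /* mark on push: each cell queued once */
   if (w->len == w->cap)
     {
       w->cap = w->cap ? 2 * w->cap : 1024;
       w->items = realloc (w->items, w->cap * sizeof *w->items);
       if (w->items == NULL)
         abort ();              /* a real GC needs a graceful fallback */
     }
   w->items[w->len++] = c;
 }

 static void
 mark_iteratively (struct cell *root)
 {
   struct worklist w = { 0 };
   push (&w, root);
   while (w.len > 0)
     {
       struct cell *c = w.items[--w.len];
       push (&w, c->car);       /* children go on the worklist... */
       push (&w, c->cdr);       /* ...instead of on the C stack */
     }
   free (w.items);
 }
</code></pre>
The remaining wrinkle is what to do when growing the worklist itself fails mid-GC; that's where tricks like pointer reversal or a pre-reserved buffer come in.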
This talks about possible generational GC. Incremental/generational was available many years ago with a port to the Boehm collector. I never understood rejecting it and generally spending effort on a bespoke collector. (I don't remember whether the Ravenbrook system was available at the time, or if it was unsuitable for some reason.)