Ok, so there have been a lot of comments on what causes this which are not correct, and a few that are pretty close. Here is the actual explanation of what is happening and what is causing the leak.<p>For context, I used to be on TC39 (the ECMAScript standards committee) and spent many, many years working on JSC, specifically on the GC and closure modeling.<p>First off: this is not due to eval. In ES3.1 or ES5 (alas, I can't recall which) we (TC39) clarified the semantics of eval so that it only evaluates in the containing scope if it is called directly - essentially turning it into a pseudo-operator (implementations today generally implement a direct eval as `if (target function == real eval function) { do eval } else { call the function }`). Calling eval in any way other than `eval(<expression>)` will not invoke the scope-capturing behavior of eval (this is a strict requirement to allow fast access to non-local variables).<p>The function being reported as exhibiting the bad/unexpected behavior in the post is:<p><pre><code> function demo() {
const bigArrayBuffer = new ArrayBuffer(100_000_000);
const id = setTimeout(/* timeout closure */ () => {
console.log(bigArrayBuffer.byteLength);
}, 1000);
return /* cleanup closure */ () => clearTimeout(id);
}
</code></pre>
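(As an aside, the direct-vs-indirect eval distinction mentioned above is easy to see in a small sketch; setting `x` on `globalThis` here is just so the global binding is visible from both call forms:)

```javascript
globalThis.x = "global";

function direct() {
  const x = "local";
  return eval("x");   // direct call form: evaluates in the local scope
}

function indirect() {
  const x = "local";
  const e = eval;     // any other call form is an "indirect" eval
  return e("x");      // evaluates in the global scope only
}

console.log(direct(), indirect()); // "local" "global"
```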
If we were to follow the spec language fairly explicitly, the behavior of this function is as follows (eliding the exact semantics of everything other than the creation of function objects and closures):<p><pre><code> 1. enter the function
2. env = create an empty lexical environment object
(I may use "activation" by accident because
that was the spec language when I was first
working on JS engines)
a) set the parent scope of env to the internal scope reference of the
callee (in this case because demo is a global function this will be
the global object)
b) add a property "bigArrayBuffer" to env, setting the value
to undefined
c) add a property "id" to env, setting the value to undefined
3. evaluate `new ArrayBuffer(100_000_000)` and assign the result
to the "bigArrayBuffer" property of env
4. Construct a function object for the timeout closure, and set its
internal scope reference to *env* (i.e. capture the containing
scope)
 5. call setTimeout, passing the function from (4) and 1000
    as arguments, and assign the result to the "id" property
    on the env object
6. construct the cleanup closure, and set the internal scope
property to env
</code></pre>
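Those steps can be sketched as a rough model, with plain objects standing in for the spec-internal environment records (the shapes and names here are illustrative only - nothing an engine actually exposes):

```javascript
function modelDemo(globalEnv) {
  // steps 1-2: a fresh environment record whose parent is the
  // callee's scope, with both bindings initialized to undefined
  const env = {
    parent: globalEnv,
    bigArrayBuffer: undefined,
    id: undefined,
  };

  // step 3
  env.bigArrayBuffer = new ArrayBuffer(100_000_000);

  // step 4: the timeout closure captures env itself, not just the
  // individual variables it mentions
  const timeoutClosure = {
    scope: env,
    body: () => env.bigArrayBuffer.byteLength,
  };

  // step 5: a stand-in for setTimeout's return value
  env.id = 42;

  // step 6: the cleanup closure captures the SAME env
  const cleanupClosure = { scope: env, body: () => env.id };

  return cleanupClosure;
}

// The cleanup closure alone keeps the whole env - buffer included - reachable:
const cleanup = modelDemo({});
console.log(cleanup.scope.bigArrayBuffer.byteLength); // 100000000
```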
The result of this is that we end up with the following set of objects:<p><pre><code> globalObject = {.....}
demo = Function { @scope: globalObject }
<demo_env> (not directly exposed anywhere) =
LexicalEnvironment { @scope: demo.@scope,
bigArrayBuffer: big array,
id: number
}
<timeout closure> = Function { @scope: demo_env }
<cleanup closure> = Function { @scope: demo_env }
</code></pre>
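The sharing of demo_env between the two closures is observable in ordinary code: closures created in the same scope hold the same environment record, not copies of the variables, so a write through one is visible through the other:

```javascript
function makePair() {
  let shared = 0;  // one binding, in one environment record
  const write = () => { shared += 1; };
  const read = () => shared;
  // Both closures reference the same environment record.
  return { write, read };
}

const pair = makePair();
pair.write();
pair.write();
console.log(pair.read()); // 2
```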
At which point you can see that as long as either closure is live, the reference to bigArrayBuffer is reachable and it is therefore kept alive.<p>Now, I was originally confused by this report: I know JSC at least does free-variable analysis to reduce false captures (I can't imagine v8 doesn't; not sure about SM these days), and because I hadn't properly read the example code I was wondering "why is this being kept alive?". Having actually read the code and written out the above, it's hopefully obvious to everyone now.<p>The language semantics of JS mean that all closures in a given scope chain share that scope chain, so if one closure captures a variable, every closure in the chain keeps that capture alive, and there is not a lot the JS engine can do to limit that.<p>There are some steps that could be taken to mitigate or reduce this, but that kind of flow analysis can become expensive. A real issue JS engines have is that the overwhelming majority of JS runs a tiny number of times and is extremely latency sensitive (this is why JSC has put so much effort into parsing + interpreter perf), so any real data-flow analysis is too expensive for such code; and by the time code is hot enough to warrant that kind of analysis, the overall program state has reached a point where you cannot retroactively remove closure references, so they remain.<p>Now, something you _could_ try as a developer in this kind of scenario is to use `let` in a nested block scope to reduce the sharing of scopes, e.g.<p><pre><code> function demo() {
let id;
{
let bigArrayBuffer = new ArrayBuffer(100_000_000);
id = setTimeout(/* timeout closure */ () => {
console.log(bigArrayBuffer.byteLength);
}, 1000);
}
return /* cleanup closure */ () => clearTimeout(id);
}
</code></pre>
which might resolve the issue in this particular kind of case.<p>In principle an engine could introduce logic to track exactly how many live closures reference a captured variable, but this is also tricky, as you could easily end up with something like:<p><pre><code> function f() {
let x = new GiantObject;
return (a) => {
if (a) return (g) => { g(x) }
return (g) => { g(null); }
}
}
y = f() // y needs to keep x alive
 y = y(someValue) // you get a new closure which
                  // may or may not be the one
                  // referencing x
</code></pre>
This is something you _could_ support, but there's a lot of complexity in ensuring correct behavior and maintaining performance in all the common cases, and it's possibly just not worth it given the JS capturing model.<p>There are also a few things that would likely be relatively easy/low-cost for a JS engine and would remove some cases of excessive capture, but they'd still just be helping super-trivial cases like this reduced example code, not necessarily any actual real-world examples.
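In the same spirit as the block-scoping trick above, one developer-side mitigation (a sketch, not something the post prescribes) is to give the big value its own scope chain by allocating it inside a separate function, so the cleanup closure's chain never reaches it at all:

```javascript
// Only scheduleLog's environment references the buffer; demo's
// environment - the one the cleanup closure captures - never does.
function scheduleLog(buf) {
  return setTimeout(() => console.log(buf.byteLength), 1000);
}

function demo() {
  const id = scheduleLog(new ArrayBuffer(100_000_000));
  // This closure's scope chain contains only `id`, so once the
  // timeout fires (or is cleared) the buffer can be collected.
  return () => clearTimeout(id);
}

const cleanup = demo();
cleanup(); // cancels the timeout; nothing is logged
```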