It's all about fixed resource allocations. The default stack size for user-land threads tends to be 1MB or 2MB, and there are also (smaller) kernel stacks.

In implementations of languages like Scheme or Haskell, where no contiguous stack may be allocated at all, or where stacks are allocated in chunks, the cost of a co-routine can be very small -- as small as the cost of a closure. If you take that to the limit and allocate every call frame on the heap, and if you make heavy use of closures/continuations, then you end up with more GC pressure, because instead of freeing each frame on return you have to free many of them via the GC.

In terms of ease of programming against async events, the spectrum runs from threads at one end to callback hell at the other. Callback hell can be ameliorated by allowing lambdas and closures, but you still end up with indentation and/or paren/brace hell. Co-routines sit somewhere in the middle of the spectrum, but closer to threads than to callback hell. Await sits closer to callback hell, with nice syntax to make things easier on the programmer.

Ultimately it's all about how to represent state.

In threaded programming, program state is largely implicit in the call stack (including all the local variables). In callback hell, program state is largely explicit, in the form of the context argument that gets passed to the callback functions (or that they close over), with a little bit of state implicit in the extant event registrations.

Threaded programming is easier because the programmer doesn't have to think about how to compactly represent program state, but the cost is higher resource consumption, especially memory. More memory consumption == more cache pressure == more cache misses == slower performance.

Callback hell is much harder on the programmer because it forces the programmer to be explicit about program state, but that also lets the programmer compress that state, using fewer resources, which lets the program handle many more clients at once and also run faster than threaded programming.

Everything in the async I/O space over the last 30 years has been about striking the right balance between minimizing program state on the one hand and minimizing programmer pain on the other. Continuations, delimited continuations, await, and so on -- all of them are about finding that sweet spot.
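To make the state point concrete, here's a minimal sketch in Python. The names (AsyncDb, CallbackDb, fetch_user, fetch_orders) are invented stand-ins for real I/O, not any particular library. The await version keeps the intermediate state in ordinary local variables across suspension points; the callback version has to carry the same state by hand, here via a closure:

    import asyncio

    # Toy async API -- invented names standing in for real I/O.
    class AsyncDb:
        async def fetch_user(self, user_id):
            await asyncio.sleep(0)                    # pretend network hop
            return {"id": user_id, "name": "alice"}

        async def fetch_orders(self, user_id):
            await asyncio.sleep(0)
            return ["order-1", "order-2"]

    # Await style: the state this request needs (user, orders) sits in
    # plain local variables; the runtime keeps it alive across each
    # suspension point, much as a call stack would for a thread.
    async def handle_request(db, user_id):
        user = await db.fetch_user(user_id)
        orders = await db.fetch_orders(user["id"])
        return f"{user['name']}: {len(orders)} orders"

    # Toy callback API -- same invented operations, results delivered
    # by invoking a callback instead of returning.
    class CallbackDb:
        def fetch_user(self, user_id, cb):
            cb({"id": user_id, "name": "alice"})

        def fetch_orders(self, user_id, cb):
            cb(["order-1", "order-2"])

    # Callback style: the same per-request state has to be carried
    # explicitly (here by closing over `user`), because nothing survives
    # between callbacks except what the programmer chooses to keep.
    def handle_request_cb(db, user_id, done):
        def on_user(user):
            def on_orders(orders):
                done(f"{user['name']}: {len(orders)} orders")
            db.fetch_orders(user["id"], on_orders)
        db.fetch_user(user_id, on_user)

    if __name__ == "__main__":
        print(asyncio.run(handle_request(AsyncDb(), 42)))
        handle_request_cb(CallbackDb(), 42, print)

Both versions do the same work; the only difference is where `user` lives between steps, which is exactly the implicit-vs-explicit state tradeoff above.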