I think Java's issue is not really due to garbage collection directly, but to overuse of references (pointers), which leads to very complicated object graphs.<p>This in turn leads to a lot of pointer chasing, cache thrashing and typically significantly higher memory requirements. It's sad how bad a reputation this has "earned" for garbage collection in general.<p>Even with just a few CPU cores, if you tried to manage a similar number of object pointers with reference counting instead of garbage collection, I believe it'd be at least an <i>order of magnitude slower</i> than modern GC algorithms. With tens of CPU cores, reference counting would get almost no actual work done. Garbage collection would outperform it by a very large margin.<p>Manual memory management wouldn't fare much better, because memory allocation is glacial: "malloc" (and often also "free") are <i>very</i> slow. Allocators typically need to traverse a complicated data structure to find a suitable free memory block and to mark it used. Fortunately, memory allocation algorithms can at least scale with the number of CPU cores.<p>Reference counting is problematic with more than one CPU core. If multiple cores can access the same reference-counted object, naive reference counting very quickly saturates the CPU core interconnect with cache coherence traffic. Every time a reference count is increased (retained/reserved/locked) or decreased (released), an atomic operation is required, because all cores must see the same value in their local caches. The reference count changes need to be communicated to <i>all</i> other CPU cores [1]. This performance penalty can be partially worked around by minimizing the number of reference count alterations and by layering hacks on top of the simple case, but at the cost of added complexity and reduced safety.<p>In Java, you can't have an array of Objects; for non-primitive types you get an array of references (pointers) instead, and each object the array points to carries the overhead of a java.lang.Object.
The number of pointers is gigantic in pretty much any JVM heap. This is especially bad for small objects, which unfortunately tend to dominate nearly any heap.<p>You can of course get around this by "rotating" arrays of Objects: instead of an array of Objects, you use an Object of arrays. Each semantic Object field becomes a primitive-type array in a holder Object.<p>This fixes performance and memory consumption at the cost of flexibility, code size and readability.<p>Garbage collection extends object lifetimes and scales linearly with the number of CPU cores, but needs periodic housekeeping. Reference counting keeps object lifetimes short, but is either unsafe with multiple CPU cores or needs slow atomic operations that result in expensive inter-core cache coherence traffic, making it ultimately scale badly.<p>There's no silver bullet.<p>[1]: Real cache coherence protocols are quite a bit more complicated than that, see <a href="http://en.wikipedia.org/wiki/Cache_coherence" rel="nofollow">http://en.wikipedia.org/wiki/Cache_coherence</a>
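<p>To make the "rotating" idea above concrete, here is a minimal sketch. The names Point, Points and LayoutDemo are made up for illustration and don't come from any library, and the two-field point is just an assumed example type. It only shows the two layouts side by side; it is not a benchmark, and actual gains depend on the JVM and the workload.

```java
// Hypothetical example types, invented for illustration only.

// Array-of-Objects layout: every Point is a separate heap object with its
// own object header, and a Point[] only stores references to them.
final class Point {
    final double x;
    final double y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

// "Rotated" layout: one holder object with one primitive array per field.
final class Points {
    final double[] xs;
    final double[] ys;

    Points(int size) {
        xs = new double[size];
        ys = new double[size];
    }

    void set(int i, double x, double y) {
        xs[i] = x;
        ys[i] = y;
    }

    // Walks a single primitive array; no per-element reference to follow.
    double sumX() {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum;
    }
}

final class LayoutDemo {
    // Same computation over the reference layout: one dereference per element.
    static double sumX(Point[] points) {
        double sum = 0.0;
        for (Point p : points) {
            sum += p.x;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 4;
        Point[] arrayOfObjects = new Point[n];
        Points objectOfArrays = new Points(n);
        for (int i = 0; i < n; i++) {
            arrayOfObjects[i] = new Point(i, 2.0 * i);
            objectOfArrays.set(i, i, 2.0 * i);
        }
        System.out.println(sumX(arrayOfObjects));   // prints 6.0
        System.out.println(objectOfArrays.sumX());  // prints 6.0
    }
}
```

<p>With n elements, the first layout has n + 1 heap objects and n references; the second has three objects regardless of n. The price is visible too: there is no longer a single "point" value you can pass around, only an index into the holder.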