I always had a beef with this paper; so many of the issues are much better solved in Haskell, yet he doesn't introduce it directly until he has something to criticize. Unfortunately, the critique reveals a very poor understanding.<p>First point: lazy evaluation incurs a mutation when a thunk is reduced to WHNF (weak head normal form) and overwritten with its value. However, this mutation occurs at most once per thunk, and after that the value is immutable, just like in SML. Furthermore, the update is not directly exposed to the programmer, so the compiler can implement it in a way that works efficiently in the presence of generational GC. See the GHC papers for how.<p>Second point: on polymorphic overloading he claims that Haskell "allow run-time resolution of overloading". Mark P. Jones and others have shown how this overhead can be largely eliminated via partial evaluation applied to the type dictionaries. Polymorphic overloading is pervasive in Haskell code because it's incredibly useful, so the "apparently small gains" remark is complete and utter nonsense.<p>In many ways O'Caml is a better SML than SML, but Haskell, and especially the GHC variant, has a type system that is light years ahead of both.<p>In the modern world there are many variations on the theme, but Rust (which borrows many ideas from these languages) might become the most popular yet.<p>EDIT: typo
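<p>To make the two points concrete, here is a minimal sketch in Haskell. All the names in it (Shape, Square, ShapeDict, totalArea, twice and so on) are made up for illustration, and the explicit ShapeDict record is only a conceptual model of dictionary passing, not what GHC literally generates. The first part shows a thunk being evaluated at most once; the second shows an overloaded function, a hand-written dictionary-passing version of it, and a SPECIALIZE pragma asking GHC for a dictionary-free copy at one type.

```haskell
import Debug.Trace (trace)

-- Point 1: a thunk is evaluated (and updated in place) at most once.
-- `x` is used twice below, but the trace message is emitted a single time.
twice :: Int
twice =
  let x = trace "evaluating thunk" (sum [1 .. 100]) :: Int
  in x + x

-- Point 2: overloading via a type class (hypothetical example class).
class Shape a where
  area :: a -> Double

newtype Square = Square Double
newtype Circle = Circle Double

instance Shape Square where
  area (Square s) = s * s

instance Shape Circle where
  area (Circle r) = pi * r * r

-- Overloaded function: conceptually it receives a Shape dictionary at run time.
totalArea :: Shape a => [a] -> Double
totalArea = sum . map area
-- Ask GHC for a copy specialized to Square, with the dictionary resolved
-- at compile time (takes effect when optimization is enabled).
{-# SPECIALIZE totalArea :: [Square] -> Double #-}

-- Conceptual model only: the dictionary made explicit as a record.
newtype ShapeDict a = ShapeDict { areaD :: a -> Double }

squareDict :: ShapeDict Square
squareDict = ShapeDict (\(Square s) -> s * s)

totalAreaD :: ShapeDict a -> [a] -> Double
totalAreaD dict = sum . map (areaD dict)

main :: IO ()
main = do
  print twice
  print (totalArea [Square 1, Square 2])
  print (totalAreaD squareDict [Square 1, Square 2])
```

<p>Running this, "evaluating thunk" shows up on stderr once even though x is demanded twice. The totalAreaD version is roughly what the partial-evaluation argument starts from: once the dictionary argument is known statically, the call to areaD can be resolved and inlined, which is what the SPECIALIZE pragma requests for totalArea at the Square type.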