NaN-boxing is morally equivalent to using tagged unions to represent values of any type. It optimizes for fast floating-point operations, and for fitting a larger number of meaningful values into 64 bits.<p>Most tagged-union approaches use some sort of punning so that at least one type of value is directly meaningful, without any bit manipulation. For example, some Common Lisp implementations choose their tag bits for cons cells such that NIL is represented as all zeros. Since NIL is the only falsey value and everything else is truthy, any bit pattern can be used directly in a conditional test, with the correct semantics.<p>NaN-boxing chooses a different kind of punning for the tags, to optimize a different use case. First, every bit pattern can be used directly as a 64-bit float with technically correct semantics: anything that's NaN-boxed really is not a floating-point number. I'm given to understand that modern architectures are slow to mix float and bit ops, so it's nice that you don't need to mask the tags off your floats before operating on them.<p>Second, the (double-precision) float is usually the only type that requires the full 64 bits to have meaning. In many applications, 50ish bits is 'good enough' for ints and pointers (with some extra ops to handle overflow), but the float format is mandated by a standard and doesn't scale down gracefully. A conventional tagged union can't hold a 64-bit float directly without spilling into an extra machine word. Unless, of course, the tags are punned to be part of the float value itself. And that's exactly what NaN-boxing is.
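<p>To make the punning concrete, here is a minimal sketch in Python, with plain integers standing in for 64-bit machine words. The layout is an assumption picked for illustration (a quiet-NaN prefix, 3 tag bits, 48 payload bits), and the names `QNAN`, `TAG_INT`, `TAG_PTR`, `box_int` and friends are all hypothetical; real runtimes each choose their own scheme.

```python
import math
import struct

# Assumed layout (hypothetical, not taken from any particular runtime):
#   bits 62..51 all set -> quiet NaN; bits 50..48 -> tag; bits 47..0 -> payload
QNAN = 0x7FF8_0000_0000_0000
TAG_SHIFT = 48
TAG_MASK = 0x7
PAYLOAD_BITS = 48
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1

TAG_FLOAT = 0  # tag 0 is reserved for the single canonical "real" NaN
TAG_INT = 1    # hypothetical tag values
TAG_PTR = 2


def float_to_bits(x: float) -> int:
    """Reinterpret a double as its 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def box_float(x: float) -> int:
    # Ordinary floats are stored unchanged. Genuine NaNs are collapsed to
    # one canonical pattern so they can never be mistaken for a tagged value.
    return QNAN if math.isnan(x) else float_to_bits(x)


def box_int(n: int) -> int:
    # Only 48 bits of payload are available, so larger ints must be rejected
    # (or promoted to some other representation).
    if not -(1 << (PAYLOAD_BITS - 1)) <= n < (1 << (PAYLOAD_BITS - 1)):
        raise OverflowError("int does not fit in the 48-bit payload")
    return QNAN | (TAG_INT << TAG_SHIFT) | (n & PAYLOAD_MASK)


def box_ptr(addr: int) -> int:
    if not 0 <= addr <= PAYLOAD_MASK:
        raise OverflowError("address does not fit in the 48-bit payload")
    return QNAN | (TAG_PTR << TAG_SHIFT) | addr


def tag_of(bits: int) -> int:
    # Anything without the full quiet-NaN prefix is an ordinary float
    # (this includes both infinities, whose quiet bit is clear).
    if (bits & QNAN) != QNAN:
        return TAG_FLOAT
    return (bits >> TAG_SHIFT) & TAG_MASK


def unbox_int(bits: int) -> int:
    payload = bits & PAYLOAD_MASK
    # Sign-extend from 48 bits.
    if payload >= 1 << (PAYLOAD_BITS - 1):
        payload -= 1 << PAYLOAD_BITS
    return payload


def unbox_ptr(bits: int) -> int:
    return bits & PAYLOAD_MASK
```

<p>With this layout, `box_float` never touches the bits of an ordinary double, so float arithmetic needs no masking, while `bits_to_float(box_int(42))` is a NaN as far as the FPU is concerned. The price is the range check in `box_int` and `box_ptr`, which is the "extra ops to handle overflow" part.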