I wonder: since only thirteen factorials (0! through 12!) fit into an int32, there are only thirteen valid inputs, so is it faster or slower to use a very small lookup table? <a href="https://godbolt.org/g/9ii0S0" rel="nofollow">https://godbolt.org/g/9ii0S0</a> This seems like the sort of thing a good compiler could figure out on its own, but I don't fully understand what clang is doing with this loop: <a href="https://godbolt.org/g/ze5ycb" rel="nofollow">https://godbolt.org/g/ze5ycb</a>
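For anyone who wants to try the comparison locally, here is a minimal sketch of the two variants in C. It is not necessarily the code behind the godbolt links above; the names factorial_table and factorial_loop are made up for illustration, and the table version assumes the caller only ever passes 0 through 12.

```c
#include <assert.h>
#include <stdint.h>

/* 0! through 12!. 13! = 6227020800 no longer fits in an int32_t. */
static const int32_t FACTORIAL_TABLE[13] = {
    1, 1, 2, 6, 24, 120, 720, 5040, 40320,
    362880, 3628800, 39916800, 479001600
};

/* Hypothetical name: lookup-table variant.
   Assumes 0 <= n <= 12; the assert is compiled out with -DNDEBUG. */
int32_t factorial_table(int n) {
    assert(n >= 0 && n <= 12);
    return FACTORIAL_TABLE[n];
}

/* Hypothetical name: plain loop variant.
   Only meaningful for 0 <= n <= 12; larger n overflows int32_t. */
int32_t factorial_loop(int n) {
    int32_t result = 1;
    for (int i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}
```

Both functions should return the same value for every n from 0 to 12, so either can be dropped into a benchmark or pasted into Compiler Explorer to compare the generated assembly. Which one wins probably depends on the compiler, the flags, and whether the table is already in cache, so I would measure rather than guess.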