No wonder. There is wisdom in circular intervals.

https://www.amazon.com/Geometry-Biological-Interdisciplinary-Applied-Mathematics/dp/0387989927

Neural networks learn bad habits from the (-∞, ∞) range: much like high-degree polynomial fits, they tend to develop big coefficients whose terms cancel each other out to produce precise answers.

I've watched all the states of an LSTM creep up in magnitude as it scans a document, and that is really just wrong.

It makes me think the worry about FP16 is a joke: you get some regularization for free by not letting the numbers get too big.
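For the polynomial analogy, here is a toy numpy sketch of my own (not from the book or the comment): an exact high-degree fit hits every data point, but only via coefficients far larger than any data value, whose terms mostly cancel.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 12)
    y = np.sin(2 * np.pi * x) + 0.01 * rng.standard_normal(x.size)

    # Exact degree-11 interpolation via the square Vandermonde system.
    V = np.vander(x)
    coeffs = np.linalg.solve(V, y)

    print("largest |data value|:  ", np.abs(y).max())       # ~1
    print("largest |coefficient|:", np.abs(coeffs).max())   # far larger than the data
    print("residual at the nodes:", np.abs(V @ coeffs - y).max())  # ~0: big terms cancel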
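The state creep is easy to check for yourself. A minimal monitoring sketch (assumes PyTorch; the random weights and random input are stand-ins for a trained model scanning a real document):

    import torch

    torch.manual_seed(0)
    lstm = torch.nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
    doc = torch.randn(1, 2000, 32)   # stand-in for a long document

    h = torch.zeros(1, 1, 64)
    c = torch.zeros(1, 1, 64)
    peaks = []
    with torch.no_grad():
        for t in range(doc.size(1)):
            _, (h, c) = lstm(doc[:, t:t+1, :], (h, c))
            peaks.append(c.abs().max().item())  # peak cell-state magnitude at step t

    print("peak |cell state| at step 100: ", peaks[99])
    print("peak |cell state| at step 2000:", peaks[-1])  # creep = growth between these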
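And a quick numpy check of the FP16 point: float16 tops out at 65504, so magnitudes can't grow without bound; they lose precision well before the ceiling and overflow past it, which is the "don't let the numbers get too big" constraint in hardware.

    import numpy as np

    print(np.finfo(np.float16).max)   # 65504.0, the float16 ceiling
    acts = np.array([1e3, 6e4, 7e4, 1e9], dtype=np.float32)
    print(acts.astype(np.float16))    # the last two overflow to inf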