The second chapter, "High-Dimensional Space", talks about the problem of spikey spheres[0] (how most of the mass is near the surface). I made an IPython notebook to illustrate it[1].<p>[0] <a href="http://www.penzba.co.uk/cgi-bin/PvsNP.py?SpikeySpheres" rel="nofollow">http://www.penzba.co.uk/cgi-bin/PvsNP.py?SpikeySpheres</a><p>[1] <a href="http://nbviewer.ipython.org/urls/gist.github.com/SnippyHolloW/9025964/raw/b2d266e7e19d64e0343fd899dfbc3e8ddc889269/SpikeySpheres?create=1" rel="nofollow">http://nbviewer.ipython.org/urls/gist.github.com/SnippyHollo...</a>
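For anyone who wants the "mass near the surface" claim in a few lines: here is a minimal sketch, not the code from the linked notebook. It assumes a unit ball and uses only the fact that volume scales as r^d, so the inner ball of radius 1 - eps holds (1 - eps)^d of the total. The function name `shell_fraction` and the choice eps = 0.01 are mine, picked for illustration.

```python
def shell_fraction(d, eps):
    """Fraction of a d-dimensional unit ball's volume lying within eps of its surface."""
    # Volume scales as r**d, so the concentric ball of radius (1 - eps)
    # contains (1 - eps)**d of the total volume; the rest is the outer shell.
    return 1.0 - (1.0 - eps) ** d


# Illustrative dimensions only; eps = 0.01 is an arbitrary choice.
for d in (2, 3, 10, 100, 1000):
    print(d, shell_fraction(d, 0.01))
```

The fraction climbs toward 1 as d grows, which is the effect the chapter describes.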
"Please do not put solutions to exercises online as it is important for students to work out solutions for themselves rather than copy them from the internet."<p>I find crowdsourced solutions for honest autodidacts very valuable.
Anybody who doesn't read that first chapter to the end is going to be very confused.<p>> To make it easier to read we use E^2(1-x) for (E(1-x))^2 and E(1-x)^2 for E((1-x)^2).<p>Why change that notation? It seems to introduce confusion on purpose.<p>On page 14 they don't use that notation: they write om^2(x+y) = om^2(x) + om^2(y), but according to their notation note that should really be om^2(x+y) = (om (x+y))^2.<p>Not trying to knock what seems like a really neat introduction; I just don't understand the need to define ridiculously unconventional notation and then not use it consistently, which introduces a lot of confusion.
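To show why the two readings can't be used interchangeably, here is a small sketch of the quoted convention. The three equally likely values of x are made up for illustration, and `E` is just a plain average over them, not anything from the book.

```python
from fractions import Fraction

# Hypothetical example: x takes each of these values with equal probability.
xs = [Fraction(0), Fraction(1, 2), Fraction(1)]


def E(values):
    """Expectation under the uniform distribution on the listed values."""
    return sum(values) / len(values)


# The book's E^2(1-x): take the expectation first, then square it.
square_of_mean = E([1 - x for x in xs]) ** 2

# The book's E(1-x)^2: square first, then take the expectation.
mean_of_square = E([(1 - x) ** 2 for x in xs])

print(square_of_mean, mean_of_square, mean_of_square - square_of_mean)
```

The two quantities differ by the variance of 1 - x, so mixing up the two notations changes the value whenever x is not constant.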
This is cool, but how can you write a book about data science without mentioning causal inference or experimental design? Most people who do data science are not applying black-box algorithms to clean data. They are actively manipulating and shaping the data, coming up with theories, and testing those theories. Inference is more important, in theory and in practice, for data scientists than theoretical models of graph formation and some of the other topics covered in this book.
I find the title to be linkbaity and misleading (quite disappointing for decorated computer scientists like Hopcroft and Kannan).<p>Based on the table of contents, a more accurate title would be "Modern Foundations of Theoretical Computer Science with an Eye Towards Machine Learning", and even that gives disproportionately large weight to machine learning.