Foundations of Data Science [pdf]

117 points by necrodome over 10 years ago

6 comments

snippyhollow over 10 years ago
The second chapter, "High-Dimensional Space", talks about the problem of spikey spheres [0] (how most of the mass is near the surface). I made an IPython notebook to illustrate it [1].

[0] http://www.penzba.co.uk/cgi-bin/PvsNP.py?SpikeySpheres

[1] http://nbviewer.ipython.org/urls/gist.github.com/SnippyHolloW/9025964/raw/b2d266e7e19d64e0343fd899dfbc3e8ddc889269/SpikeySpheres?create=1
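The "most of the mass is near the surface" remark can be checked with a few lines of arithmetic. This is a minimal sketch, not taken from the book or the linked notebook: it assumes a unit ball and uses only the fact that volume scales as the radius raised to the dimension. The function name `shell_fraction` and the 1% shell width are illustrative choices.

```python
def shell_fraction(dim: int, eps: float) -> float:
    """Fraction of a unit ball's volume lying within eps of its surface."""
    # Volume scales as r**dim, so the inner ball of radius (1 - eps)
    # holds (1 - eps)**dim of the total volume; the rest is in the shell.
    return 1.0 - (1.0 - eps) ** dim


if __name__ == "__main__":
    # Illustrative dimensions only; any positive integers work.
    for dim in (2, 10, 100, 1000):
        frac = shell_fraction(dim, 0.01)
        print(f"d={dim:5d}  outer 1% shell holds {frac:.4f} of the volume")
```

As the dimension grows, the thin outer shell accounts for nearly all of the volume, which is the effect the comment describes.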
princehonest over 10 years ago
"Please do not put solutions to exercises online as it is important for students to work out solutions for themselves rather than copy them from the internet."

I find crowdsourced solutions for honest autodidacts very valuable.
thomaskcr over 10 years ago
Anybody who doesn't read that first chapter to the end is going to be very confused.

> To make it easier to read we use E^2(1-x) for (E(1-x))^2 and E(1-x)^2 for E((1-x)^2).

Why change that notation? That seems to be purposefully introducing confusion.

On page 14 they don't use that notation (om^2(x+y) = om^2(x) + om^2(y) -- according to their notation note that should really be om^2(x+y) = (om (x+y))^2).

Not trying to knock what seems like a really neat introduction. I just don't understand the need to define ridiculously unconventional notation and then not use it consistently, which introduces a lot of confusion.
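The two quantities in the quoted notation note really are different numbers, which is why the shorthand matters. A minimal sketch, assuming a made-up distribution where x is 0 or 1 with equal probability (the distribution and the helper name `expect` are hypothetical, not from the book):

```python
from fractions import Fraction

# Hypothetical example distribution: x is 0 or 1, each with probability 1/2.
outcomes = [(Fraction(0), Fraction(1, 2)), (Fraction(1), Fraction(1, 2))]


def expect(f):
    """Expected value of f(x) under the example distribution above."""
    return sum(p * f(x) for x, p in outcomes)


# What the quoted note writes as E^2(1-x): the square of the expectation.
square_of_expectation = expect(lambda x: 1 - x) ** 2

# What the quoted note writes as E(1-x)^2: the expectation of the square.
expectation_of_square = expect(lambda x: (1 - x) ** 2)

if __name__ == "__main__":
    print("(E(1-x))^2 =", square_of_expectation)
    print("E((1-x)^2) =", expectation_of_square)
```

Exact fractions are used so the comparison does not depend on floating-point rounding.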
asquidy over 10 years ago
This is cool, but how can you write a book about data science without mentioning causal inference or experimental design? Most people who do data science are not applying black-box algorithms to clean data. They are actively manipulating and shaping the data, coming up with theories, and testing those theories. Inference is more important in theory and in practice for data scientists than theoretical models of graph formation and some of the other topics covered in this book.
kiyoto over 10 years ago
I find the title to be linkbaity and misleading (quite disappointing for decorated computer scientists like Hopcroft and Kannan).

Based on the table of contents, a more accurate title would be "Modern Foundations of Theoretical Computer Science with an Eye Towards Machine Learning", and even that gives disproportionately large weight to machine learning.
devilsdounut over 10 years ago
Looks pretty academic. I see no mention of data cleaning or more practical considerations in the table of contents.