TechEcho
Lean, mean data science machine

13 points by jeroenjanssens over 11 years ago

1 comment

gjreda over 11 years ago
I'm super interested in the chapter on creating reusable command line tools.

I've found the command line to be ideal for performing a lot of simple, memory-intensive tasks (filtering/munging/sorting/etc. a massive text file).

However, after data collection (and munging), data science is typically A LOT of _exploratory_ analysis. I think it's extremely important that all practitioners approach analysis with the mindset of making it easily reproducible (and if possible, flexible - don't hard code date ranges, file paths, etc.).

I tend to stick with IPython Notebook (and heavily recommend it). I fear that heavy analysis at the command line would consist of too many one-liners and thus be difficult to read and maintain.
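
As a rough illustration of the "don't hard code date ranges, file paths" point, here is a minimal Python sketch of a reusable filter that takes its date range as flags and reads CSV from standard input. Everything specific in it is an assumption made for the example, not something from the thread or the book: the `--start`, `--end`, and `--date-column` flags, the default column name `date`, and ISO-formatted dates in the data.

```python
import argparse
import csv
import sys
from datetime import date


def parse_args(argv=None):
    # All flag names here are hypothetical; the point is that nothing is hard coded.
    parser = argparse.ArgumentParser(
        description="Keep CSV rows whose date falls inside a range."
    )
    parser.add_argument("--start", type=date.fromisoformat, required=True,
                        help="first day to keep, as YYYY-MM-DD")
    parser.add_argument("--end", type=date.fromisoformat, required=True,
                        help="last day to keep, as YYYY-MM-DD")
    parser.add_argument("--date-column", default="date",
                        help="name of the column holding ISO dates")
    return parser.parse_args(argv)


def filter_rows(rows, start, end, date_column):
    # Yield only the rows whose date lies in the inclusive range [start, end].
    for row in rows:
        if start <= date.fromisoformat(row[date_column]) <= end:
            yield row


def main(argv=None, infile=None, out=None):
    # Default to stdin/stdout so the tool composes with pipes.
    args = parse_args(argv)
    infile = sys.stdin if infile is None else infile
    out = sys.stdout if out is None else out
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
    writer.writeheader()
    writer.writerows(filter_rows(reader, args.start, args.end, args.date_column))
```

Saved under a hypothetical name such as `filter_dates.py`, with `main()` called under the usual `if __name__ == "__main__":` guard, it could be used as `python filter_dates.py --start 2013-12-01 --end 2013-12-31 < events.csv`. The same `main` function can also be imported and called from a notebook, which keeps the one-liner readable and the analysis repeatable with a different range or input file.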