Ask HN: What's your favorite command-line tool for working with data?

8 points by jeroenjanssens, almost 11 years ago
About a year ago, I wrote a blog post about command-line tools for data science [1]. Thanks to HN, I received a lot of valuable comments and pointers to other great command-line tools! In the past 10 months, I have been writing a book titled Data Science at the Command Line [2]. Ever since that blog post, I've been discovering new tools. On the one hand, that's quite frustrating because it's difficult to keep up and include everything in the book. On the other hand, it's fantastic to see that the command line is still very popular!

In order to gain a better overview of what's available, I thought it'd be nice to ask on HN what your favorite tools are to work with data. Many new tools have been developed in the past year, but your favorite one may just be 10 years old. You may think that I'm too late with this question because the book is already finished, but fortunately the book also discusses the underlying concepts which haven't changed too much in the past forty years.

I'm very much looking forward to hearing about your favorite command-line tools. Bonus points if you reply in CSV format "command,url,reason\n", so I can easily scrape the comments :)

Thanks!

PS. For those who are interested, next Wednesday, I'll be doing a webcast about this topic [3], where I might share the outcome of this discussion.

[1] https://news.ycombinator.com/item?id=6412190
[2] http://shop.oreilly.com/product/0636920032823.do
[3] http://www.oreilly.com/pub/e/3115
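
The "command,url,reason" format requested above could be scraped along the following lines. This is only a minimal sketch, not anything the poster describes using: the function names `parse_reply` and `parse_comment` are hypothetical, and the check for what counts as a URL field is a rough heuristic chosen for illustration.

```python
def parse_reply(line):
    """Split one 'command,url,reason' line into its three fields.

    Returns None when the line does not look like a CSV-style reply.
    """
    # Split on the first two commas only, so that commas inside the
    # free-text reason stay part of the reason field.
    parts = [field.strip() for field in line.split(",", 2)]
    if len(parts) != 3:
        return None
    command, url, reason = parts
    # Heuristic (an assumption): a URL field has no spaces and contains a dot.
    if not command or " " in url or "." not in url:
        return None
    return {"command": command, "url": url, "reason": reason}


def parse_comment(text):
    """Collect every line of a comment that parses as a CSV-style reply."""
    rows = []
    for line in text.splitlines():
        row = parse_reply(line)
        if row is not None:
            rows.append(row)
    return rows
```

Splitting by hand rather than with the standard `csv` module is a deliberate choice here: replies typed into a comment box rarely quote their fields, so a reason that contains commas would otherwise be broken into extra columns.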

11 comments

ycombover, almost 11 years ago
GNU sed 4.2.1 awk 3.0.4 and grep 2.4.2, http://www.git-scm.com/ , Bundled with Windows Git (which needs an updated find)

Python with pandas, http://pandas.pydata.org/ , If I need HDF5 or time series

ffmpeg, ffmpeg.org, If I'm generating animations

* I look forward to your book :)

fhuszar, almost 11 years ago
jq,http://stedolan.github.io/jq/,best tool to handle JSON files in command line
# The tool that immediately comes to my mind is jq, a tool to transform and process JSON objects. It's one of those powerful tools that is super easy to learn, and once I started using it I just couldn't live without it. The only negative thing I have to say is that it does not have good native support to transform between JSON and CSV.
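
For the JSON-to-CSV step mentioned above, one possible workaround is a few lines of Python using only the standard library. This is a rough sketch under stated assumptions, not a jq feature or the commenter's method: it assumes the input is a JSON array of flat objects, and the function name `json_to_csv` is made up for illustration.

```python
import csv
import io
import json


def json_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)

    # Collect column names in the order they are first seen, so that
    # objects with differing keys still share one header row.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    # Keys missing from a record are written as empty cells.
    writer.writerows(records)
    return out.getvalue()
```

In a pipeline, jq could first reduce the data to such an array of flat objects, and a script like this would then read that output and write the CSV. Nested objects are outside the scope of this sketch.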

vram22, almost 11 years ago
After all your command-line data munging (possibly in a Unix pipeline), if you want to convert the resulting text to PDF (without leaving the command line :-), check this post:

[xtopdf] PDFWriter can create PDF from standard input:

http://jugad2.blogspot.in/2013/12/xtopdf-pdfwriter-can-create-pdf-from.html

It needs xtopdf and ReportLab (use v1.17) and Python (use 2.2 or higher).

Online overview of xtopdf: http://slid.es/vasudevram/xtopdf

xtopdf on Bitbucket:

https://bitbucket.org/vasudevram/xtopdf

hashtag, almost 11 years ago
Clickable:

[1] https://news.ycombinator.com/item?id=6412190

[2] http://shop.oreilly.com/product/0636920032823.do

[3] http://www.oreilly.com/pub/e/3115

gexos, almost 11 years ago
CSVKit: https://github.com/onyxfish/csvkit and The R Project for Statistical Computing: http://www.r-project.org/

crasshopper, almost 11 years ago
Jeroen, I'm just reading your [1] for the first time now. Are you aware of Dirk Eddelbuettel's `littler`? I believe that might overlap with your Rio tool to some degree.

ole_tange, almost 11 years ago
histogram, https://github.com/ole-tange/tangetools/blob/master/histogram/histogram, So you got this table and you are not really in the mood for firing up GNUplot/a spreadsheet/R but you would like a quick bar chart here in the terminal. cat data | histogram
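
The idea of a quick bar chart in the terminal can be illustrated with a short Python sketch. This is not the `histogram` script linked above and makes no claim about how that tool works or what input it expects: the function name `bar_chart`, its parameters, and the (label, value) input shape are all assumptions made for illustration.

```python
def bar_chart(rows, width=40, mark="#"):
    """Return one text line per (label, value) pair, with a scaled bar."""
    if not rows:
        return []
    largest = max(value for _, value in rows)
    label_width = max(len(label) for label, _ in rows)
    lines = []
    for label, value in rows:
        # Scale each bar relative to the largest value in the table.
        length = round(width * value / largest) if largest > 0 else 0
        lines.append(f"{label.ljust(label_width)} {mark * length} {value}")
    return lines
```

Printing the returned lines, for instance with `print("\n".join(bar_chart(rows)))`, gives a rough chart without leaving the shell. The bars are scaled to the largest value so the chart fits within a fixed width.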

kazinator, almost 11 years ago
txr, http://nongnu.org.txr, Use it all the time and like it a lot! That keeps me interested in working on it. Started five years ago and still at it today, more than 1500 commits later, and 27000 LOC.

hellageek, almost 11 years ago
I like cat. Always a good start to a pipe chain for a quick look at a small data set.

roycoding, almost 11 years ago
jq has proven useful for dealing with JSON. A nice way to reduce or reformat your data.

http://stedolan.github.io/jq/

ibstudios, almost 11 years ago
Interactive Ruby Shell.