TechEcho

5 comments

astowaway, over 1 year ago

GNU awk recently got built-in CSV support, which is quite nice imo, though certainly less featureful than qsv appears to be.

snidane, over 1 year ago

This looks great!

Please consider removing any implicit network calls, like the initial "Checking GitHub for updates...". This by itself will keep people from adopting the tool or even trying it any further. It is similar to GNU parallel's --citation, which, albeit a small thing, will scare many people off.

Consider adding pivot and unpivot operations. Mlr gets the syntax quite right, but is unusable since it doesn't work in streaming mode and tries to load everything into memory, despite claiming otherwise.

Consider adding a basic sum command. Sum is the most common data operation, which could warrant its own special optimized command instead of offloading it to an external math processor like Lua or Python. Even better if it had group-by (-by) and window-by (-over) capability, e.g. 'qsv sum col1,col2 -by col3,col4'. Brimdata's zq utility is the only one I know that does this quite right, but it is quite clunky to use.

Consider adding a laminate command: essentially adding a new column with a constant value. This could probably be achieved by a join with a single-row file, but why not make this common operation easier to use?

Consider an option to concatenate CSV files with mismatched headers. cat rows or cat columns complains about the mismatch. One of the most common problems with handling CSVs is schema evolution. I and many others would appreciate being able to merge similar CSVs together easily.

Conversions to and from other standard formats would be appreciated (Parquet, Ion, fixed-width, Avro, etc.). Other compression formats as well, especially zstd.

It would be nice if the tool made it easy to embed the output of external commands. Built-in Lua and Python support is nice, but probably not sufficient. I'd like to be able to run a jq command on a single column and merge the result back as another column, for example.

Inspiration:
 - csvquote: https://news.ycombinator.com/item?id=31351393
 - teip: https://github.com/greymd/teip
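For readers wondering what the grouped sum and the mismatched-header concatenation suggested above might look like, here is a minimal Python sketch. It is not qsv's implementation, and 'qsv sum ... -by' is the commenter's proposed syntax rather than an existing command. The function names `sum_by` and `cat_rows_union` are hypothetical, and the sketch assumes numeric columns parse as floats and that empty cells count as 0.

```python
import csv
import io
from collections import defaultdict


def sum_by(rows, sum_cols, by_cols):
    """Stream over dict rows, keeping only one accumulator per group in memory."""
    totals = defaultdict(lambda: [0.0] * len(sum_cols))
    for row in rows:
        key = tuple(row[c] for c in by_cols)
        acc = totals[key]
        for i, col in enumerate(sum_cols):
            # Assumption: empty cells are treated as 0.
            acc[i] += float(row[col] or 0)
    return dict(totals)


def cat_rows_union(texts):
    """Concatenate CSV texts with differing headers, filling gaps with ''.

    Two passes: the first reads only each header line to build the union
    of column names, the second streams the rows out in that order.
    """
    header = []
    for text in texts:
        for name in next(csv.reader(io.StringIO(text)), []):
            if name not in header:
                header.append(name)
    yield header
    for text in texts:
        for row in csv.DictReader(io.StringIO(text)):
            yield [row.get(name) or "" for name in header]
```

Memory use for `sum_by` grows with the number of distinct groups, not with the number of rows, which is the property the streaming complaint above is about.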


alchemist1e9, over 1 year ago

Wow! This looks like a really complete set of operations, and extremely useful.

foehrenwald, over 1 year ago

Related: https://github.com/johnkerl/miller

I wonder who really uses these tools, and for what, given that R and Python data science tools are available?


dima55, over 1 year ago

An incomplete list of other similar tools: https://github.com/dkogan/vnlog/#description

Qsv: Efficient CSV CLI Toolkit
