Processing 256M publication records using Awk and parallel scripting

9 points by ketanmaheshwari about 5 years ago

I participated in an organizational data challenge in 2018 and chose Awk to process the massive data and solve several interesting challenges.

The repo is: https://github.com/ketancmaheshwari/SMC18

A report detailing the approach and results is here: https://github.com/ketancmaheshwari/SMC18/blob/master/report/SMC18_DataChallenge4.pdf

2 comments

yesenadam about 5 years ago

You might try posting this as a regular HN news item to get more responses, or writing a blog post about it and linking to that. (Awk lover here)

mraza007 about 5 years ago

Hey, that's pretty cool. You should definitely write about it in a blog post.