What Open Source projects are similar? It seems like data transformation tasks like this are a universal problem. I'd just do map/reduce, but that's the advantage that comes with being a developer. Great tools for non-developers to do limited programming tasks are incredibly useful.
Check out Diffbot @ <a href="http://www.diffbot.com/" rel="nofollow">http://www.diffbot.com/</a> -- you can set up repeat crawls and extract data into CSV/Excel or JSON with the Crawlbot API, extract data automatically with the Automatic APIs, or use the Custom API Toolkit. You can get a free trial account at <a href="https://www.diffbot.com/plans/trial" rel="nofollow">https://www.diffbot.com/plans/trial</a> to try it out, and paid plans are listed at <a href="http://www.diffbot.com/pricing/" rel="nofollow">http://www.diffbot.com/pricing/</a>.
This general category of tool seems very useful. Apparently there are many Yahoo Pipes followers. <a href="http://www.makeuseof.com/tag/12-best-yahoo-pipes-alternatives-look/" rel="nofollow">http://www.makeuseof.com/tag/12-best-yahoo-pipes-alternative...</a><p>Support for email and HTML tables seems to be a distinguishing feature of this one.
I've seen something similar more than once on HN, but my Google-fu is failing. What other similar services are out there that'll let you smartly scrape a page into a spreadsheet?<p>In the past, I've tried Google spreadsheets with the "ImportXML" function, but got frustrated after a bit and resorted to Python.
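For what it's worth, the "resort to Python" route can be fairly short when the data already sits in a plain HTML table. Below is a minimal sketch using only the standard library. The sample markup and the names `TableParser` and `html_table_to_csv` are made up for illustration, and it assumes well-formed, non-nested tables; real pages usually need more care (nested tables, `colspan`, JavaScript-rendered content).

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Hypothetical helper: collects the cell text of every <tr> as a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built, or None outside a <tr>
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in sloppy markup


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up sample input; in practice this would be the downloaded page source.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget, large</td><td>24.50</td></tr>
</table>
"""

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE_HTML))
```

The CSV output opens directly in Excel or Google spreadsheets. The `csv` module handles quoting, so a cell containing a comma stays a single column.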