TechEcho

1 comment

Hah, I recently had a project with an ex-cofounder (non-technical) where he wanted me to make a scraper for some sites.

He said he wanted to scrape the metadata, and he even suggested this library. I could hardly see what I needed to do, or why his current programmers couldn't do it.

After going back and forth for what seemed like an eternity, it turned out he didn't want metadata at all. When he used a term like "title" he meant an h1 at a particular position on some pages and a div on other pages; "description" was a div sibling of the h1 on site 1 but a span with a randomly generated id on another site, and so on.

It was easy enough to do in the end. As is often the case, the difficulty was in communication. But it did highlight one point: generally, the metadata of a modern web page is not that interesting to scrape.
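To make the "title means something different on every site" problem concrete, here is a minimal sketch of per-site extraction rules. It is not the code from that project. The site names, class names, and sample markup are all hypothetical, and it assumes well-formed markup so that the standard library's `xml.etree.ElementTree` can parse it; real pages would usually need a forgiving HTML parser.

```python
import xml.etree.ElementTree as ET


def text_of(el):
    """Return the stripped text content of an element, or None if it is missing."""
    return "".join(el.itertext()).strip() if el is not None else None


def next_sibling(root, target):
    """Return the element directly after `target` under the same parent, if any."""
    for parent in root.iter():
        children = list(parent)
        if target in children:
            i = children.index(target)
            return children[i + 1] if i + 1 < len(children) else None
    return None


def scrape_site1(root):
    # Hypothetical site 1: "title" is the first h1, and "description"
    # is the div that immediately follows it.
    h1 = root.find(".//h1")
    desc = next_sibling(root, h1) if h1 is not None else None
    if desc is not None and desc.tag != "div":
        desc = None
    return {"title": text_of(h1), "description": text_of(desc)}


def scrape_site2(root):
    # Hypothetical site 2: "title" is a div, and "description" is a span
    # whose id is randomly generated, so it is located by position instead.
    title = root.find(".//div[@class='headline']")
    desc = root.find(".//div[@class='post']/span")
    return {"title": text_of(title), "description": text_of(desc)}


# One rule function per site; the host names are made up for illustration.
RULES = {
    "site1.example": scrape_site1,
    "site2.example": scrape_site2,
}


def scrape(site, markup):
    """Parse well-formed markup and apply the rule registered for `site`."""
    return RULES[site](ET.fromstring(markup))


SITE1 = "<html><body><h1>Hello</h1><div>First summary</div></body></html>"
SITE2 = (
    "<html><body><div class='post'>"
    "<div class='headline'>World</div>"
    "<span id='x9f3k'>Second summary</span>"
    "</div></body></html>"
)

if __name__ == "__main__":
    print(scrape("site1.example", SITE1))
    print(scrape("site2.example", SITE2))
```

Keeping one small function per site keeps each site's quirks isolated, so a layout change on one site only touches one rule.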

A library to easily scrape metadata from an article on the web
