Hi HN<p>As in the title, I am wondering why anyone would use Puppeteer, Selenium, or another browser emulator for web scraping (other than for JS execution) when there are already tens, if not hundreds, of SaaS offerings for "scraping-as-a-service".<p>A handful of examples: Scrapehero, Webrobots, Apify, Scrapingbee, Scrapinghub, Promptcloud<p>Leaving aside the ones that require a setup fee or have ridiculous pricing models, why would anyone want to set up Puppeteer/Selenium/other scraping bots instead of using one of the "scraping-as-a-service" platforms?
Probably because people who are doing web scraping aren't professional scrapers, they're just programmers who need some data quickly. And since they're already familiar with Selenium, they think that's the state of the art. I've never seen an ad for a scraping service, so I also didn't know that they existed.
My main concern is pricing. Many websites use anti-scraping technologies, so scraping the raw HTML doesn't work anymore; you need to load everything and execute the JS. For example, I have seen some that can detect headless/Puppeteer mode too. I ended up creating my own scraping infra using vanilla Chrome...<p>Current SaaS platforms charge by request count. If I need to load every resource on a page, each page costs many requests instead of one, and the cost gets too high.
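<p>To make the pricing point concrete, here is a rough back-of-the-envelope sketch. Every number in it is made up for illustration (the price, the page count, and the 80 sub-requests per page are assumptions, not figures from any real provider), and `estimate_cost` is just a hypothetical helper name:

```python
def estimate_cost(pages: int, requests_per_page: int, price_per_1000_requests: float) -> float:
    """Estimate the bill under a per-request pricing model.

    All inputs are hypothetical; plug in your provider's real numbers.
    """
    total_requests = pages * requests_per_page
    return total_requests / 1000 * price_per_1000_requests


# Assumed workload: 10,000 pages at an assumed $1.00 per 1,000 requests.
html_only = estimate_cost(pages=10_000, requests_per_page=1, price_per_1000_requests=1.0)

# Same pages, but each one pulls an assumed 80 sub-resources (JS, CSS, XHR, images).
full_render = estimate_cost(pages=10_000, requests_per_page=80, price_per_1000_requests=1.0)

print(f"HTML only:   ${html_only:.2f}")
print(f"Full render: ${full_render:.2f}")
```

Under these assumed numbers the bill scales linearly with the number of requests each page triggers, which is why per-request pricing hurts once you have to render the whole page. Whether a given provider bills per page or per sub-request is something to check in their pricing terms.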