
Show HN: Cloakbits - Headless web scraping with bypass for anti-bot WAFs

8 points by proszkinasenne2, over 4 years ago
A growing number of companies offer anti-bot protection SaaS that protects websites from scraping by automated bots based on Puppeteer/Selenium. Most of them rely on browser properties such as headers, JavaScript properties (window.*, navigator.*), and behavior analysis to build device/user fingerprints and match them against a database of "whitelisted" fingerprints (typical user behavior, settings, device properties, etc.).

For the past few months, together with two other devs, I have worked on a customized Puppeteer/Playwright scraping backend. It's essentially a drop-in replacement for the default Chrome/FF binaries. We managed to get through the Coinbase, Amazon, and AliExpress login pages in headless mode without getting a CAPTCHA or any other verification. We are planning to roll out a beta version. If you are interested in getting beta access, leave us details about your use case here: https://a90eq67iroz.typeform.com/to/FAkWnrtv

The motivation for our project is that open-source solutions such as puppeteer-extra-stealth cover only a small portion of what popular anti-bot software such as Akamai Bot Manager or Imperva uses to detect and ban emulated browsers.
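
To make the fingerprinting idea in the post concrete, here is a minimal sketch of the kind of property check a detection script might run. Everything in it is an assumption for illustration: the `Fingerprint` fields, the four signals, and the threshold are hypothetical and are not taken from Akamai, Imperva, or Cloakbits. Commercial products presumably combine far more signals, including the behavior analysis mentioned above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Fingerprint:
    """Hypothetical snapshot of properties a detection script might collect."""
    user_agent: str                                      # User-Agent header
    webdriver: bool                                      # navigator.webdriver
    languages: List[str] = field(default_factory=list)  # navigator.languages
    plugin_count: int = 0                                # navigator.plugins.length


def suspicious_signals(fp: Fingerprint) -> List[str]:
    """Return a description of each property that looks automated."""
    signals = []
    if fp.webdriver:
        signals.append("navigator.webdriver is true")
    if "HeadlessChrome" in fp.user_agent:
        signals.append("user agent advertises HeadlessChrome")
    if not fp.languages:
        signals.append("navigator.languages is empty")
    if fp.plugin_count == 0:
        signals.append("no plugins reported")
    return signals


def looks_automated(fp: Fingerprint, threshold: int = 2) -> bool:
    """Flag the client when at least `threshold` signals fire (arbitrary cutoff)."""
    return len(suspicious_signals(fp)) >= threshold


# Example inputs: one automated-looking client, one ordinary-looking client.
headless = Fingerprint(user_agent="Mozilla/5.0 HeadlessChrome/88.0", webdriver=True)
regular = Fingerprint(
    user_agent="Mozilla/5.0 Chrome/88.0",
    webdriver=False,
    languages=["en-US", "en"],
    plugin_count=3,
)
```

Calling `looks_automated(headless)` flags the first client, while `regular` trips none of the checks. A handful of static checks like these is easy to satisfy, which is why a detector that stops here would be weak and why the post stresses that real products look at much more.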

2 comments

dryja, over 4 years ago
We regularly scrape competitor websites to get insights on product availability and pricing. However, one of the competitors installed a script that shows us a "Pardon our interruption" page. I guess it's because of bot detection. Unfortunately, that Puppeteer plugin doesn't make any difference. We work around this by using the Oxylabs service. It's pricey, but as long as you don't mind paying extra (and scrape at low frequency) you can use it as an alternative.
jjgreen, over 4 years ago
This is an invitation to beta-test, not a question -- so why Ask HN?