Headless Chrome support in Cloud Functions and App Engine

333 points by idoco, almost 7 years ago

22 comments

rmdashrfroot, almost 7 years ago
I wrote a collection of Dockerfiles for images running Python 2.7 or Python 3.6 + Selenium with either Chrome or Firefox, using Xvfb for the X display (necessary for running Selenium headlessly).
https://github.com/seanpianka/docker-python-xvfb-selenium-chrome-firefox
Using this in conjunction with AWS Step Functions, Lambda, and ECS, it cost merely cents a month to run a headless scraper task in the cloud.
mostlystatic, almost 7 years ago
I tried to use Headless Chrome on Cloud Functions for a project I'm working on, but even on the fastest instances, loading pages was sometimes really slow (pages timing out after waiting for 60s).
It seems JS execution was sometimes taking a long time, so I guess that was preventing requests from being made. In a single-CPU cloud function you have network requests, JavaScript execution, rendering, and the Node process controlling the browser all competing for resources.
That being said, it was super simple to get started!
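One way to keep a slow page load from consuming a whole invocation is to put an explicit deadline around it. The following is a minimal sketch of that idea, not anything from the comment itself: `load_page` is a hypothetical coroutine function standing in for whatever drives the browser, and the deadline value is up to the caller.

```python
import asyncio


async def load_with_deadline(load_page, url, seconds):
    """Await load_page(url), giving up after `seconds`.

    `load_page` is a hypothetical coroutine function (an assumption for
    this sketch), not a real library call. Returns the page result, or
    None if the deadline passed first.
    """
    try:
        return await asyncio.wait_for(load_page(url), timeout=seconds)
    except asyncio.TimeoutError:
        # The pending load is cancelled by wait_for before we get here.
        return None
```

A deadline only bounds the wait; it does not make a CPU-starved page load any faster.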
iamjustlooking, almost 7 years ago
Among other things, we use Puppeteer for screenshot generation on GKE on Google Cloud at https://screenshots.cloud/, scaling running instances up and down depending on demand. We keep browser instances running constantly because the startup time is significant. I'll be interested to see what the startup time is for Puppeteer on this, and will definitely be giving it a try.
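The "keep the browser running" approach can be approximated inside a single process by launching once and reusing the instance on later calls. This is a rough sketch under assumptions: `launch` is a hypothetical zero-argument callable that starts a browser, and whether a given platform keeps the process alive between requests is not something the comment states.

```python
_browser = None  # lives as long as the process does


def get_browser(launch):
    """Return a cached browser, launching one only on first use.

    `launch` is a hypothetical callable (an assumption for this sketch)
    that starts and returns a browser handle.
    """
    global _browser
    if _browser is None:
        _browser = launch()  # pay the startup cost once
    return _browser
```

Every call after the first skips the launch, which is where the startup time mentioned above would otherwise be spent.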
bergie, almost 7 years ago
Nice to see this concept get into the big cloud platforms. We built something similar a couple of years ago, primarily to get a sandbox for some compute jobs we were running on Heroku:
https://github.com/flowhub/jsjob
k__, almost 7 years ago
Does this mean that HC is preinstalled in the runtime?
Because as far as I know you can already run HC with other FaaS solutions, but having this out of the box would be really nice.
mrskitch, almost 7 years ago
If Google Cloud ain't your jam, then check out browserless (https://browserless.io/). It can be considerably cheaper in certain situations, and we've been up and running for almost a year. Happy to answer questions if anyone has any.
EDIT: We've got stuff on GH: https://github.com/joelgriffith/browserless, and startup is under 100ms most of the time. Fonts and other things "just work", plus there's a slew of REST APIs for common stuff. Selenium WebDriver support landing soon!
schappim, almost 7 years ago
I wish Google would support Ruby!
If you do too, check out the petition over at https://www.serverless-ruby.org
antoncohen, almost 7 years ago
I think it would be pretty cool to use Cloud Functions as a Selenium Grid, sort of like Zalenium (https://github.com/zalando/zalenium) does with Kubernetes. If you could parallelize end-to-end tests enough, you could get massive burstable capacity to run them.
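The parallelization idea boils down to splitting a test list into batches and handing each batch to its own function invocation. Here is a minimal sketch of the splitting step only; the function name and the round-robin strategy are assumptions, and dispatching the batches is left out.

```python
def shard(tests, workers):
    """Split `tests` round-robin into `workers` batches.

    Each batch would go to one function invocation; how it is
    dispatched is outside this sketch.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Slice i takes items i, i + workers, i + 2 * workers, ...
    return [tests[i::workers] for i in range(workers)]
```

Round-robin keeps batch sizes within one test of each other, but it ignores how long each test runs, so a duration-aware split may balance better.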
wslh, almost 7 years ago
Not perfect, but HtmlUnit, anyone? I've used it for scraping in the past with mixed results.
tnolet, almost 7 years ago
I've been screwing around with running Headless Chrome & Puppeteer on Lambda/Serverless/FaaS solutions. It's all a bit of a mixed bag. You CAN run Headless Chrome on AWS Lambda, but the cost involved is pretty crazy, as you need ~1500 MB of RAM to comfortably run any code with Chrome.
Google Cloud of course has "inside knowledge", and I would love to switch to them for my SaaS https://checklyhq.com, were it not that Google Cloud Functions is offered in just four (!) regions...
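The memory figure matters because function platforms of this kind typically bill by memory multiplied by run time (GB-seconds). A back-of-the-envelope sketch follows; the run duration, invocation count, and price per GB-second are all inputs the caller must supply, since none of them appear in the comment and real rates should come from the provider's current price list.

```python
def gb_seconds(memory_mb, seconds):
    """Billable GB-seconds for one run at the given memory size."""
    return memory_mb / 1024 * seconds


def compute_cost(memory_mb, seconds_per_run, runs, price_per_gb_second):
    """Total compute cost for `runs` invocations.

    `price_per_gb_second` is an assumption supplied by the caller;
    this sketch ignores per-request fees and free tiers.
    """
    return gb_seconds(memory_mb, seconds_per_run) * runs * price_per_gb_second
```

Because the cost is linear in memory, a function that needs about 1.5 GB costs roughly twelve times as much per second as one that fits in 128 MB.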
pwaai, almost 7 years ago
That's it... I'm moving to GCP.
Sorry, but Rekognition rekt it for any type of computer vision on AWS.
Great infrastructure... after all, I do have an AWS Solutions Architect Associate certification... which means jack shit.
Great move by GCP. I'm also very pleased with Firebase and its integration with Cloud Functions...
BUT my biggest reservation with serverless, still in 2018, is the cold start time...
I built a token-based API on AWS Lambda, and registering and signing up took forever when the app was not at peak. That was 2014, tho.
guiomie, almost 7 years ago
OK, I was just looking into this a week ago and was going to spin up a VM to do headless. Now I get to keep my Firebase project in Cloud Functions only. Much cleaner architecture.
defied, almost 7 years ago
My company provides a similar service, with both Chrome and Firefox headless support for automated testing/screenshots: https://testingbot.com/support/getting-started/headless.html
We run each test in a new VM, running on our own private cloud (dedicated servers).
Note: we use the Selenium protocol for this, not yet Puppeteer.
patd, almost 7 years ago
I'm currently using headless Chrome for my latest project, www.blockedby.com (still in alpha stage, looking for feedback).
I've been looking at a non-local solution. I'm using Python, and this article hints that Puppeteer is not the only way to invoke this. But I don't see any documentation on the DevTools protocol.
Does anyone know if it's supported? Or any providers that do support it?
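For context on the DevTools protocol question: commands are JSON messages carrying an `id`, a `method`, and `params`, normally sent over a WebSocket to Chrome's debugging endpoint. The sketch below only builds such messages in Python and opens no connection; the helper name is an assumption, and it says nothing about whether Cloud Functions exposes that endpoint.

```python
import itertools
import json

_ids = itertools.count(1)  # each command needs a unique id


def cdp_message(method, **params):
    """Serialize one DevTools protocol command as JSON text."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


# Two commands from the protocol's Page domain.
navigate = cdp_message("Page.navigate", url="https://example.com")
screenshot = cdp_message("Page.captureScreenshot", format="png")
```

Responses come back with the same `id`, which is how a client matches a reply to the command that triggered it.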
kwerk, almost 7 years ago
I've tested HC on GCF and GAE Standard with the IO launch. Sadly they're 3-10x slower than App Engine Flex (same VM size on App Engine Flex vs. Standard). Even a screenshot of google.com takes 6+ seconds on GCF / GAE Standard vs. 2 seconds for Flex. I hope they fix this, as spinning down to zero is important for me, but the latency is too high right now.
nojvek, almost 7 years ago
I don't know what this means for browserless.io, but I hope he still retains a strong niche and has a desirable product.
eknkc, almost 7 years ago
Tried HC just yesterday on Cloud Functions. For some reason, it runs extremely slowly. I did some comparisons to AWS Lambda with similar memory / CPU sizes, and basic "load page and screenshot" jobs would take 2x-3x more time on Google Cloud Functions.
I'll dig deeper soon, but this is a bad start.
dakom, almost 7 years ago
Can someone please explain what the advantage of running a snapshot service via GCF would be vs. App Engine Standard (w/ Node)?
isuckatcoding, almost 7 years ago
I wonder how the features/pricing compete with Browserless?
techsin101, almost 7 years ago
How Google is behaving recently... I can feel it becoming Oracle. I want to stay 10 miles away from it. Learned my lesson with Google Maps.
benatkin, almost 7 years ago
I'm not buying it. Google Cloud just moved to Node 8 earlier this year, as the post says, but now it's Node 10. It's just not good tech, it's unnecessary lock-in. Docker on anything is better; this is similar: https://zeit.co/blog/serverless-docker
jancurn, almost 7 years ago
It's nice to see serverless platforms adding support for headless Chrome. But there's still one problem with AWS Lambda / Cloud Functions / Zeit Now: the run time is limited to a few minutes. If you want to run any longer job, e.g. a web crawler, you need to either spin up the instances yourself or use a platform like Apify, which allows running arbitrarily long jobs, provides pre-built Docker images for headless Chrome or Xvfb, and provides an SDK to simplify state persistence, access to proxies, etc.
For example, a simple actor to convert HTML to PDF looks like this:
https://www.apify.com/jancurn/url-to-pdf
More info:
https://www.apify.com/docs/actor
https://www.apify.com/docs/sdk/apify-runtime-js/latest
https://www.apify.com/library?type=acts
Disclaimer: I'm a co-founder of Apify
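The run-time limit and the need for state persistence can be illustrated with a crawl that works in bounded steps and hands back whatever is left. This is a generic sketch, not the Apify SDK: `fetch` is a hypothetical callable that returns newly discovered URLs, and saving `remaining` between invocations is left to the caller.

```python
import time
from collections import deque


def crawl_step(pending, fetch, budget_seconds, clock=time.monotonic):
    """Process URLs until the time budget runs out.

    `fetch` is a hypothetical callable (url -> list of new URLs).
    Returns (done, remaining) so the caller can persist `remaining`
    and resume in a later invocation.
    """
    queue = deque(pending)
    done = []
    deadline = clock() + budget_seconds
    while queue and clock() < deadline:
        url = queue.popleft()
        queue.extend(fetch(url))  # newly found URLs join the queue
        done.append(url)
    return done, list(queue)
```

The sketch checks the budget only between fetches and does no de-duplication, so a real crawler would also need a visited set and a per-request timeout.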