Apple might be running a web crawler written in Go

169 points by beltex, over 10 years ago

15 comments

unwind, over 10 years ago
Not at all sure why it's relevant which programming languages they use at Apple.

Is this considered interesting since it somehow proves the mainstream adoption of Go? Or since Apple, a competitor of Google's in cell phones, are using one of Google's technologies? Or what?

The takeaway for me was that Apple has a whole /8 subnet to themselves. That's just ... immense, for a single company. Gaah.

EDIT: Mis-typed the netmask, I meant /8 but typed /24. Fixed.

christop, over 10 years ago
I noticed this as well a couple of weeks ago.

They're still actively visiting every page on the websites that are associated with our iOS apps.

Today alone (starting shortly before 8am CET) they've crawled over 8000 pages on https://trails.io and http://offmaps.com, without sending conditional caching headers (our pages don't change *too* often).

    $ grep '^17\.' /var/log/nginx/*.log | grep -E 'Go|Fetcher' | wc -l
    8254

Note that they don't appear to be scraping any URLs available from within the apps themselves, but rather the company/support websites linked in our App Store listings.

I guess they're automatically scanning for objectionable content, since these websites are linked from the App Store and the iTunes website?

brycekahle, over 10 years ago
The Go http client definitely has a bug that doesn't maintain the User-Agent across redirected requests.

https://code.google.com/p/go/source/browse/src/net/http/client.go#338

pkulak, over 10 years ago
Go is a fantastic language for writing a crawler. I wrote the Showyou crawler in Go and it's both one of our highest-load processes (and it's very efficient, trust me) and one of our most stable.

bkeroack, over 10 years ago
I've seen golang mentioned in Apple job listings. I didn't think this was a secret.

13, over 10 years ago
I've got several thousand requests for this in my logs with an IP address in the same /24.

Two weeks ago it was using a different user agent:

    Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Fetcher/0.1)

This string is mostly from iOS 6.0 beta 4, but with the same suffix the author discovered.

pvsnp, over 10 years ago
It looks more likely that someone's learning Go and going through the crawler example in the tutorials. Of course I may be wrong here. The trigger for me was the term "Fetcher" here: http://tour.golang.org/#73

Fun exercise though.

lifeforms, over 10 years ago
If Apple coders are reading, I noticed exactly the same issue of User-Agent header not being used on redirects.

You can solve this by (ab?)using the CheckRedirect functionality of http.Client to set the User-Agent again.

Here's an example: https://github.com/lifeforms/httpcheck/commit/07440d952d166002ab759873e2885e78e8fa5c61 (The program is nothing special, just a little thing I use internally for monitoring)

0x0, over 10 years ago
So they are "apple-maps'ing" search as well, hilariously using one of Google's own tools in the process.

Would be interesting to know if/when any deals with bing/google/duckduckgo for search are expiring, like they were with google maps. That's probably when we will see an apple search engine.

Makes sense for them to fully control another part of the backend for spotlight and siri.

yourad_io, over 10 years ago
> So far, I have seen requests from two IPs: 17.147.18.33 (7 on 2014-10-15) and 17.147.18.33 (7 on 2014-10-15)

One of these should be .35, judging by the logs that follow ;)

struct, over 10 years ago
Perhaps a proxy designed for iOS testing?

sean9999, over 10 years ago
This just in: Someone at Apple was doing stuff

jdalgetty, over 10 years ago
I've been seeing this as well.

jezfromfuture, over 10 years ago
Way to get someone sacked at Apple.

hartator, over 10 years ago
I am doubtful Apple wants to compete with Google in search.

Or maybe it's a way for Steve Jobs to get posthumous revenge on Google for Android!