Tech Echo
Googlebot’s JavaScript random() function is deterministic

311 points by TomAnthony, over 7 years ago

15 comments

geocar, over 7 years ago
I wrote about something like this previously [1].

Ads often have something like this attached to the end:

    load("https://adserver/track.gif?{stuff}&_=" + Math.random())

My ad server collected multiple tracking pixels for various events, and after five pixels I could fingerprint the browser (Firefox, Chrome, MSIE, etc.) and identify someone fiddling with the user agent string, or using a proxy server to mask this information.

[1]: http://geocar.sdf1.org/browser-verification.html
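The cache-busting pattern in the comment above can be sketched as a small helper. This is only an illustration: `trackingPixelUrl`, the `adserver.example` host, and the `event` parameter are hypothetical names, and the `rng` argument exists only so the random source can be swapped out.

```javascript
// Hypothetical helper: builds a tracking-pixel URL with a cache-busting "_" parameter.
// `rng` defaults to Math.random, as in the snippet quoted in the comment.
function trackingPixelUrl(base, params, rng = Math.random) {
  const query = new URLSearchParams(params);
  query.set("_", String(rng()));
  return `${base}?${query.toString()}`;
}

// In an engine that seeds Math.random differently per load, the "_" value changes each time.
// In an engine that always starts from the same seed, the same values recur on every load,
// which is what makes the sequence usable as a fingerprint on the server side.
const pixelUrl = trackingPixelUrl("https://adserver.example/track.gif", { event: "view" });
console.log(pixelUrl);
```

Because the server sees the raw `_` values, a handful of requests from one page load is enough to compare the observed sequence against known generators.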
jchw, over 7 years ago
Seems reasonable to me. My guess is that it's not performance, but rather predictability, that matters here. Being able to detect when a page meaningfully changes is probably useful for Google, and a good implementation of Math.random() would potentially thwart that. Especially seeing how many pages have the magic constant in them...

Also, probably useful for determining two pages are the same, which may be needed to help prevent the crawler from crawling a million paths into a SPA that don't actually exist, for example.
sevensor, over 7 years ago
Seems like the expectation on PRNGs being expressed in this thread is a bit unrealistic. They're always deterministic. The fact that this PRNG is also always seeded the same makes it easier to fingerprint, but that has no bearing on whether the PRNG is deterministic.
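The distinction between "deterministic" and "always seeded the same" can be illustrated with a small seeded generator. This is a sketch using a mulberry32-style algorithm chosen purely for illustration; it is not the generator any particular browser or crawler uses, and `mulberry32` and `take` are names introduced here.

```javascript
// A mulberry32-style PRNG: the whole output sequence is a function of the seed.
function mulberry32(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    // Scale the unsigned 32-bit result into [0, 1)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Collect the first n outputs of a generator.
function take(rng, n) {
  return Array.from({ length: n }, () => rng());
}

// Same seed: identical sequences. Different seed: a different sequence,
// produced by exactly the same deterministic algorithm.
const sameSeedA = take(mulberry32(42), 5);
const sameSeedB = take(mulberry32(42), 5);
const otherSeed = take(mulberry32(43), 5);
console.log(sameSeedA, sameSeedB, otherSeed);
```

An engine normally varies the seed between runs, so the determinism goes unnoticed; fixing the seed is what makes the sequence recognizable.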
lolc, over 7 years ago
Maybe Googlebot forks its JS-engine from a pre-initialized image. That would explain the unchanging seed.
endymi0n, over 7 years ago
This has been known for a while and used to cause huge problems for our tracking: https://github.com/snowplow/snowplow-javascript-tracker/issues/499
amelius, over 7 years ago
Nice to see fingerprinting used *against* Google (instead of *by* Google). But this is easy to fix, so I don't expect this to work for much longer.
kmbriedis, over 7 years ago
Every random() function is kind of deterministic, but this is still a very interesting discovery!
nukeop, over 7 years ago
What's a plausible real-world use case for this if one wanted to exploit this to game SEO? Can it even be exploited in any way?
herodotus, over 7 years ago
For those (like me) who are not that familiar with JavaScript, the JavaScript spec for Math.random says: "...The implementation selects the initial seed to the random number generation algorithm; it cannot be chosen or reset by the user." Furthermore, the seed usually changes. It seems that Google has modified their JavaScript library, perhaps by allowing an explicit Math.Seed function.
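Standard JavaScript has no seeding call such as the `Math.Seed` guessed at above, but `Math.random` itself is an ordinary writable property, so a page or an embedder can swap in a seeded generator. The following is a minimal sketch of that idea only; `installSeededRandom` is a hypothetical name, the LCG constants are the well-known Numerical Recipes ones, and nothing here is confirmed to be what Googlebot actually does.

```javascript
// Hypothetical helper, not a standard API: replaces Math.random with a seeded LCG
// and returns a function that puts the original implementation back.
function installSeededRandom(seed) {
  const original = Math.random;
  let state = seed >>> 0;
  Math.random = function seededRandom() {
    // 32-bit linear congruential step; adequate for illustration only
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
  return function restore() {
    Math.random = original;
  };
}

// Every "page load" that installs the same seed sees the same values.
const restoreFirst = installSeededRandom(1);
const firstLoad = [Math.random(), Math.random()];
restoreFirst();

const restoreSecond = installSeededRandom(1);
const secondLoad = [Math.random(), Math.random()];
restoreSecond();

console.log(firstLoad, secondLoad);
```

Scripts on the page keep calling `Math.random()` as usual and cannot tell from the call itself that the seed was fixed; only the repeated values give it away.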
some1else, over 7 years ago
I employed a seed-based deterministic random function in a WebWorker once, to noisily but predictably drive an animation. I suppose one could use the same approach to have a decent non-deterministic source alongside Google's patched seedrandom.
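One way to carry a random source that does not depend on how `Math.random` was seeded is to draw from the Web Crypto API instead. This is a hedged sketch: `secureRandom` is a hypothetical helper, and whether a given crawler's `crypto.getRandomValues` is itself non-deterministic is an assumption that would need to be checked.

```javascript
// Uses the Web Crypto API (available in browsers and workers);
// falls back to Node's built-in webcrypto when no global `crypto` exists.
const cryptoSource = globalThis.crypto ?? require("node:crypto").webcrypto;

// Hypothetical helper: a float in [0, 1) that ignores Math.random's seed entirely.
function secureRandom() {
  const buffer = new Uint32Array(1);
  cryptoSource.getRandomValues(buffer);
  // Scale the unsigned 32-bit value into [0, 1)
  return buffer[0] / 4294967296;
}

console.log(secureRandom(), secureRandom());
```

A page could then use a seeded generator where repeatability is wanted (such as the animation described above) and `secureRandom` where it is not.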
onion2k, over 7 years ago
I guess they used the XKCD method of random number generation: https://xkcd.com/221/

It is actually truly random if you have a fair die.
yueq, over 7 years ago
def roll(): return 4
est, over 7 years ago
This makes me wonder if Chrome Headless has enough entropy for random() if running on a Linux server.
partycoder, over 7 years ago
PRNGs are deterministic. Even the PRNG shipped in processors available through the RDSEED/RDRAND instructions is deterministic.

Unless you are using some form of entropy, e.g. dedicated hardware, that will be the case.
h000per, over 7 years ago
The Google queries for "roll a dice" and "flip a coin" aren't actually random either. They seem to be based on the current time.