
Obama's New Robots.txt

56 points, by r00k, over 16 years ago

6 comments

tlrobinson, over 16 years ago
The vast majority of the entries in Bush's robots.txt were filtering out the plain-text versions that are linked at the bottom of the HTML versions and contain identical content. This prevents duplicates from showing up in searches, and it is likely done automatically by whatever software they use to manage the content.

Want proof? Pick any of the entries ending in "/text", for example "/911/911day/text", then search Google with the "/text" removed, like this: "site:whitehouse.gov inurl:/911/911day", and you can still see the page in the Google cache (at least until Google's index is updated).

If you want to view it as a metaphor, fine, but there's no evidence that Bush's administration was trying to hide anything on their website, as this article implies. If they wanted to hide it, why would they put it on there in the first place?
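[Editor's note: a minimal sketch, not part of the original thread, showing how a Disallow entry of the kind described in this comment blocks the plain-text duplicate while leaving the HTML page crawlable. The robots.txt body and URLs are illustrative, built around the "/911/911day/text" example above, and are not the actual 2009 file.]

```python
# Sketch only: illustrative robots.txt rules, not the real whitehouse.gov file.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /911/911day/text
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The plain-text duplicate is blocked from crawling...
print(parser.can_fetch("*", "http://www.whitehouse.gov/911/911day/text"))  # False
# ...while the HTML page with the same content stays crawlable.
print(parser.can_fetch("*", "http://www.whitehouse.gov/911/911day/"))      # True
```

Note that excluding duplicate URLs this way only affects crawling and search indexing; the pages themselves stay publicly reachable, which is the commenter's point that nothing was actually being hidden.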
miketheburrito, over 16 years ago
This is a great and semi-metaphorical comparison (woohoo transparency!), but to be fair, the Obama administration hasn't done anything yet, so there isn't even anything to hide at this point.
nir, over 16 years ago
Having /includes/ under the document root - and trying to fix that with a robots.txt entry (??) - wouldn't reflect well on Obama, if these entries actually meant anything :)
gojomo, over 16 years ago
Why aren't we allowed to crawl their JS and CSS?

What are they trying to hide?
dejb, over 16 years ago
I'm more interested in what CMS they are using. Any ideas?
jamesv, over 16 years ago
/firstlady/newborn/text !?