科技回声

To anyone blaming the current administration, note that the robots.txt was identical before the election too: <a href="https://web.archive.org/web/20161101000359/https://petitions.whitehouse.gov/robots.txt" rel="nofollow">https://web.archive.org/web/20161101000359/https://petitions...</a>

Isn't there a built-in search page for these petitions? What good would it do to have these petitions indexed by Google? To be honest, I don't really want petitions influenced by SEO.

I would point out that robots.txt is optional; you don't have to follow it. It would be easy enough for one of us to extract the text of each petition with a simple spider, put it on a website with links back to the original, and let Google search that. The petitions are public documents for public consumption. Even if the White House tried to sue, it wouldn't be their content; it's the content of the person who created it. Otherwise they would be legally suggesting that they are petitioning themselves... Then again, IANAL, just a human capable of reasoning through things logically, which rarely has any bearing on lawsuits. ;)
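To illustrate the "optional" part, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `EXAMPLE_ROBOTS_TXT`, the `example-spider` user agent, the petition URL path, and the `is_allowed` helper are all assumptions made up for illustration; they are not the contents of the real file. The point is that a crawler has to ask the parser voluntarily, and nothing on the server side enforces the answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the real site's robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical petition URL; the path is made up.
url = "https://petitions.whitehouse.gov/petition/example"

# A well-behaved spider would check this and skip the page when it is False.
# A spider that never calls is_allowed() is not stopped by anything.
allowed = is_allowed(EXAMPLE_ROBOTS_TXT, "example-spider", url)
```

Whether ignoring the file is wise is a separate question from whether it is technically possible; the check above is a courtesy the crawler chooses to perform.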

Someone should create a scraper/aggregator with links back and a synopsis, so Google does spider the content.

Whitehouse.gov petitions are blocked from search results

4 comments
