Why Search Crawlers Sometimes Ask for URLs That Never Were Part of Your Site

2 points by cskau, nearly 14 years ago

1 comment

cskau, nearly 14 years ago
I'm setting up a new site and noticed that within seconds of starting the server I was getting hits in the log like:

  67.195.112.231 - - [2011-06-17 18:49:37] "GET /SlurpConfirm404/starsong/pro-road.htm HTTP/1.0" 404 18 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
  67.195.115.174 - - [2011-06-17 16:53:30] "GET /SlurpConfirm404/drugstore.htm HTTP/1.0" 404 18 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

My initial thought was that I was being crawled by a spam bot masquerading as a Yahoo crawler, but after a bit of googling I found a couple of blogs speculating about the nature of these strange requests.

My best guess is that they use these requests to check whether your server happily answers even the most obscure request instead of returning a 404, which would make it look like a spider trap or content farm.
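If that guess is right, the same check is easy to run against your own server. Below is a minimal Python sketch, not Yahoo's actual method: it requests a random path that should not exist and looks at the status code that comes back. The function names (`make_probe_path`, `classify`, `probe`) and the three result labels are made up for illustration; only the `SlurpConfirm404` prefix comes from the log lines above.

```python
import secrets
import urllib.error
import urllib.request


def make_probe_path(prefix: str = "SlurpConfirm404") -> str:
    """Build a path that almost certainly does not exist on the target site."""
    # Hypothetical naming scheme: a fixed prefix plus 16 random hex characters.
    return f"/{prefix}/{secrets.token_hex(8)}.htm"


def classify(status: int) -> str:
    """Interpret the status code returned for a path that should not exist."""
    if status in (404, 410):
        return "ok"            # server correctly reports the page as missing
    if 200 <= status < 300:
        return "soft-404"      # server answers arbitrary URLs with content
    return "inconclusive"      # e.g. 403 or 5xx; needs a closer look


def probe(base_url: str, timeout: float = 5.0) -> str:
    """Request one nonexistent URL under base_url and classify the response."""
    url = base_url.rstrip("/") + make_probe_path()
    try:
        # urlopen follows redirects, so this is the status of the final response.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is still what we want.
        status = err.code
    return classify(status)
```

Calling `probe("http://localhost:8000")` against the server from the log above should return `"ok"`, since it answered those requests with a 404. A result of `"soft-404"` would mean the server serves content for any URL, which is the behaviour I suspect the crawler is testing for.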