Greg Linden's (MSFT) comments on a recent Google paper on this:<p><a href="http://glinden.blogspot.com/2009/01/how-google-crawls-deep-web.html" rel="nofollow">http://glinden.blogspot.com/2009/01/how-google-crawls-deep-w...</a>
There are many startups attempting to do this, including pipl.com and <a href="http://cazoodle.com/" rel="nofollow">http://cazoodle.com/</a>, among others. Here's some research about it: <a href="http://www-sal.cs.uiuc.edu/~kcchang/" rel="nofollow">http://www-sal.cs.uiuc.edu/~kcchang/</a>
I believe I've seen this type of crawling in action in request logs. For example, Yahoo might try to request "news.ycombinator.com/user?id=britney_spears", even though it's not linked to from anywhere.