Algorithmic search is sinking

53 points by McKittrick, over 14 years ago

15 comments

jeremydavid, over 14 years ago
"The only way to combat this and return trust and quality to search is by taking an editorial stand and having humans identify the best sites for every category."

There are billions of webpages. Who is going to do this review?

Is someone honestly going to review http://stackoverflow.com/questions/4300234/how-might-union-find-data-structures-be-applied-to-kruskals-algorithm and put it in the category of "How Union/Find data structures can be applied to Kruskal's algorithm?"?

No.

The closest thing to an editorialized web is www.dmoz.org, and that hasn't been properly updated in years (and never will be) because it failed.

Search has to be done with algorithms - there are just too many search queries to do it any other way. Udi Manber, Google's VP of Engineering, stated that 20-25% of all queries made each day have never been seen before: http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php.

tptacek, over 14 years ago
... says the guy with the (crowdsourced) curated search engine.

DrJosiah, over 14 years ago
Use sentiment analysis to discover the intent of a link, and whether the destination should get more link juice. Positive sentiment: positive link juice. Negative sentiment: zero link juice.

Alternatively, for negative reviews, etc., use rel="nofollow".

To claim that algorithmic search is dead completely ignores the *volume* that Google is doing, or the fact that they are making $billions in algorithmic search and ad placement. How much do curated places make?

Also, not to rain on anyone's parade or anything (just kidding, I'm going to rain it down), it would take decades of 10k people churning through pages to get even 1% of the *new* content that Google discovers *daily*.

You all saw the "24 hours of unique video uploaded to YouTube every minute of every day" figure from a year or two ago, right? Imagine that, only text, and produced by 10x-1000x as many people at 10x-1000x the volume, posting to forums, newsgroups, social networking sites, blogs, etc., every minute of every day. Because of this, you can't just review a site; you have to review the content on each page of the site. That's going to kill any curated engine in the long term.

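As a rough illustration of the "positive sentiment: positive link juice, negative sentiment: zero link juice" idea above, here is a minimal Python sketch of a sentiment-weighted PageRank. It is only a toy under stated assumptions: the per-link sentiment scores in [-1, 1] are presumed to come from some upstream classifier, the function name `weighted_pagerank` and the example site names are hypothetical, and the simplified power iteration is not a description of how Google actually ranks pages.

```python
def weighted_pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank where each link carries a sentiment score in [-1, 1].

    links: list of (source, target, sentiment) tuples.
    Links with sentiment <= 0 pass on no rank at all.
    """
    nodes = sorted({n for s, t, _ in links for n in (s, t)})
    out = {n: {} for n in nodes}
    for source, target, sentiment in links:
        weight = max(sentiment, 0.0)  # negative sentiment: zero link juice
        if weight > 0:
            out[source][target] = out[source].get(target, 0.0) + weight

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1 - damping) / len(nodes) for n in nodes}
        for source in nodes:
            total = sum(out[source].values())
            if total == 0:
                # No usable outgoing links: spread this page's rank evenly.
                share = damping * rank[source] / len(nodes)
                for n in nodes:
                    new_rank[n] += share
            else:
                for target, weight in out[source].items():
                    new_rank[target] += damping * rank[source] * weight / total
        rank = new_rank
    return rank


# Hypothetical example: one praised shop, one complained-about shop.
example_links = [
    ("blog", "good_shop", 0.9),
    ("forum", "good_shop", 0.5),
    ("forum", "bad_shop", -0.8),  # an angry review still links to the shop
]
ranks = weighted_pagerank(example_links)
```

In this toy graph the angry link from `forum` contributes nothing, so `bad_shop` ends up ranked below `good_shop` even though both are linked to.
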
jules, over 14 years ago
Ironic that he's proposing to have people solve a problem that arose because people were being manipulated. Of course *your* people cannot be manipulated.

No, the solution to this problem is for GetSatisfaction et al. to use rel=nofollow. It's as simple as that. And arguably Google could improve its algorithm by taking negativity into account.

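For readers unfamiliar with rel=nofollow, the following Python sketch shows how a crawler might separate links that are meant to pass ranking credit from those that are not. It uses only the standard-library `html.parser` module; the class and function names (`FollowedLinkParser`, `split_links`) and the example URLs are hypothetical, and real crawlers handle many more cases.

```python
from html.parser import HTMLParser


class FollowedLinkParser(HTMLParser):
    """Collects <a href> targets, split by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel is a space-separated token list, e.g. rel="nofollow noopener".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, nofollow) link lists for an HTML fragment."""
    parser = FollowedLinkParser()
    parser.feed(html)
    parser.close()
    return parser.followed, parser.nofollow


page = (
    '<p>Loved <a href="http://good.example/">this shop</a>, but '
    '<a rel="nofollow" href="http://bad.example/">this one</a> ripped me off.</p>'
)
followed, ignored = split_links(page)
```

A complaint site that marks its outbound links this way lets users name a bad retailer without handing it any ranking credit.
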
Vivtek, over 14 years ago
I have heard about search engine spam before and sort of discounted it - but you know, if you search on something that's not a technical topic or something equally specific like a band name, that is, you're searching on a general topic that is of interest to the mundanes, then there really is a whole lot of spam on Google.

My case from this week was that I wanted plans for a bookcase. I searched, therefore, on "build a bookcase". There was exactly one useful link on Google's front page (a Popular Mechanics link), and the rest were regurgitated spam that I could improve on with a Markov chain algorithm.

I've read that as long as people click on ads, Google has no motivation to clean up spam, but surely this can't be the best even for Google?

mixmax, over 14 years ago
Or maybe our algorithms just aren't good enough.

Suppose you use Bayesian filtering on the text surrounding the links to determine whether the connection is good or bad. With a reasonable amount of data it should be possible.

*Note:* I'm not an algorithms guy, I do business and strategy and a wee bit of programming, so maybe the example isn't good, but I think the point is.

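The Bayesian-filtering suggestion above can be sketched as a tiny naive Bayes classifier over the words surrounding a link. This is only an illustration under stated assumptions: the four training snippets are invented, the class name `LinkContextClassifier` is hypothetical, and a usable filter would need far more data and better tokenization.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,!?;:\"'()").lower() for w in text.split())
    return [w for w in words if w]


class LinkContextClassifier:
    """Naive Bayes over link context, with add-one (Laplace) smoothing."""

    LABELS = ("good", "bad")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def log_score(self, text, label):
        # Assumes every label has at least one training example.
        vocab = set(self.word_counts["good"]) | set(self.word_counts["bad"])
        total = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in tokenize(text):
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total + len(vocab)))
        return score

    def classify(self, text):
        return max(self.LABELS, key=lambda label: self.log_score(text, label))


# Invented training data: text found around links, labeled by hand.
clf = LinkContextClassifier()
clf.train("great service, highly recommend this shop", "good")
clf.train("excellent helpful guide, recommend", "good")
clf.train("terrible scam, avoid this retailer", "bad")
clf.train("awful rude seller, never again, avoid", "bad")
```

With this toy data, `clf.classify("I recommend this excellent shop")` comes out as `"good"` and `clf.classify("avoid this scam")` as `"bad"`; a link whose context is classified `"bad"` could then be given no ranking credit.
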
jeffmiller, over 14 years ago
The core problem with having humans identify the best sites is that it doesn't scale. It's probably OK for big topics like travel or healthcare, but it shafts those users who are searching for long tail topics.

idheitmann, over 14 years ago
The mystery of the PageRank algorithm is not only a defense against gaming, it's a defense against competition. Other than stylistic differences (a la Bing), it seems difficult to differentiate a new service when nobody understands the details of the standard one.

As a net addict, I regularly find myself frustrated because I can't figure out how to get meaningful information out of Google instead of sites trying to sell to me. And if I can't think off the top of my head of a website that will act as a relevant portal for that kind of info, then there isn't really any alternative to Google.

At least, not that I know of yet: can anyone suggest one?

Google has done amazing things for our ability to get what we want, and fast, but it is also slowly eroding our independence from it and our ability to educate ourselves by other means.

Here's hoping they prove worthy stewards once they own all the information on the planet.

zmmmmm, over 14 years ago
An awful lot seems to be getting made out of this one story, and there's really precious little else cited in the post other than generic claims of gloom and doom about search. Google's been fighting spam sites for a long time before this, and the battle certainly waxes and wanes, but I'm sceptical that it's actually being lost; it's just a constant struggle.

Now if you tell me that there is value in social search we could have a totally different discussion, but it's more about the persuasive power of personal recommendation than algorithms not working any more.

fonosip, over 14 years ago
It is sinking, but for a different reason. The web is getting away from Google, getting locked up in apps or walled gardens like Facebook or iTunes.

jellicle, over 14 years ago
I don't know if Skrenta's approach is perfect (can spammers make slashtags? I'll bet they can!) but Google's is clearly failing.

Giant swathes of Google searches are now overrun with datafog spammers. Ehow, squidoo, hubpages, wikihow, buzzle, how-wiki, ezinearticles, bukisa, wisegeek, articlesnatch, healthblurbs, associatedcontent - all these and thousands more domains filled with spam semi-automatically generated by legions of Indians for a few cents per page.

There's not one word of useful information on any of those domains. But apparently they serve a lot of ads for Google, so they don't get delisted.

JoachimSchipper, over 14 years ago
To everyone talking about "sentiment analysis": that's not easy. Sentences like "John has stupidly said foo[1], and even went so far as to say bar[2] (which was demolished by Jane[3] and Jan[4]); he's now capitulated[5]" would be quite difficult to parse. The following articles may also be instructive: all by the same author, all quite critical, but with links with quite different intentions.

http://scienceblogs.com/goodmath/2009/05/dembski_responds.php
http://scienceblogs.com/goodmath/2009/12/id_garbage_csi_as_non-computab.php
http://scienceblogs.com/goodmath/2009/08/quick_critique_dembski_and_mar.php

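To make the difficulty concrete, here is a small Python sketch of what a naive word-list scorer does with the example sentence above. The word lists and function names are hypothetical and deliberately crude; the point is only that a sentence-level score gets stamped onto every link in the sentence, including links [3] and [4], which the writer presumably cites approvingly.

```python
import re

# Hypothetical, deliberately tiny sentiment word lists.
NEGATIVE = {"stupidly", "demolished", "capitulated"}
POSITIVE = {"great", "excellent", "correct"}

SENTENCE = (
    "John has stupidly said foo[1], and even went so far as to say bar[2] "
    "(which was demolished by Jane[3] and Jan[4]); he's now capitulated[5]"
)


def sentence_score(sentence):
    """Count positive words minus negative words in the whole sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def score_links(sentence):
    """Assign the sentence-level score to every [n] reference in it."""
    refs = re.findall(r"\[(\d+)\]", sentence)
    score = sentence_score(sentence)
    return {int(ref): score for ref in refs}


link_scores = score_links(SENTENCE)
```

Every one of the five links receives the same score of -3 here, so the approach cannot tell the criticized sources from the ones doing the criticizing; getting that right needs some understanding of who did what to whom.
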
maheshs, over 14 years ago
>> Algorithmic search is sinking

I think we need a better algorithm.

ergo98, over 14 years ago
There is little rigor behind most of the claims of the NYTimes story: the targeted site already negates any PageRank benefit of their links (they do implement nofollow), and the definitive example seems to be nothing more than good SEO of the site in question (most of the other front and second page sites are pretty mediocre as well, clearly with little web competition in the keyword space).

In any case, go to a shopping-specific (sub)site if shopping. A Google search is a terrible way to find either products or retailers.

rorrr, over 14 years ago
So there's some shitty retailer. What does this have to do with algorithmic search?
