Machine Learning Applied to Google's Rankings

21 points by randfish over 15 years ago

6 comments

ramanujan over 15 years ago
1) Rand Moz and other SEO people should do an extremely thorough study of 23andMe's site. Their product may be of questionable value, but if there is one site which has the skeleton key to SEO, it is an ecommerce site run by Google's Wife. Any kind of convention or trick that they use is likely to be preferred by Google.

2) I've messed around with this problem myself a bit. In general, predicting rank as a function of page properties is equivalent to replicating Google's own search ranking (i.e. if your predicted rank \hat{Y} equals the true rank Y for input features X, then you can basically rank pages as Google does from signals on web pages, though of course you'll be doing it in batch without all the semi-realtime crawling that Google now does).

That said, you can pretty easily get something decent that will (a) give you an overall estimate of rank and (b) at least tell you quantitatively whether a given feature impacts rankings. This can settle a lot of debates among SEO people.

3) Specific proposal: calculate a non-parametric measure of correlation between empirical page rank and each of the features mentioned in this post (http://www.seomoz.org/article/search-ranking-factors) on a sample of, say, 100k keywords. Examination of individual scatterplots will also be informative.

Now you can do a more abstract analysis. Construct a table where rows correspond to features and there are two columns: the empirical non-parametric correlation with PageRank and the estimate in the SEOmoz post on ranking factors of that feature's importance.

Make a scatterplot here (and calculate just one more non-parametric correlation) to see how good the experts were at determining how much each feature contributed to rank.
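For anyone who wants to try the proposal in point 3, here is a minimal sketch of one non-parametric correlation, Spearman's rho, in plain Python. The comment does not say which measure to use, so Spearman is an assumption, and the `positions` and `inbound_links` values are made-up toy numbers, not real ranking data.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data: result position (1 = top) and one page feature.
positions = [1, 2, 3, 4, 5]
inbound_links = [900, 400, 450, 120, 30]

rho = spearman(positions, inbound_links)
print(round(rho, 3))  # -0.9: more links goes with a better (lower) position
```

The same `spearman` function could be reused for the "one more correlation" step, comparing the per-feature correlations against the experts' importance estimates.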
paraschopra over 15 years ago
Though the analysis is interesting, it is not "Machine Learning". There is no test/training data set, no prediction, no model selection. It is just plain old, but extremely useful, correlation analysis.

If someone is interested in a similar kind of correlation (and regression) analysis for website conversion rate, have a look at the study I did recently: http://www.wingify.com/case-studies/predictive-web-analytics-conversion-case-study.php
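To make the distinction concrete, here is a rough sketch of what a train/test split adds on top of correlation analysis: fit on one part of the data, then measure prediction error on rows the model never saw. Everything here is illustrative; the data is synthetic, and `holdout_split`, `fit_line`, and `mse` are hypothetical helper names, not anything from the SEOmoz article.

```python
import random

def holdout_split(rows, test_fraction, seed):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def mse(model, xs, ys):
    """Mean squared prediction error of a (slope, intercept) model."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic (feature, target) rows: a noisy line, purely for illustration.
noise = random.Random(0)
rows = [(float(x), 2.0 * x + 1.0 + noise.gauss(0, 0.5)) for x in range(20)]

train, test = holdout_split(rows, test_fraction=0.25, seed=1)
model = fit_line([x for x, _ in train], [y for _, y in train])

# The held-out error is the number that says whether the model predicts at all.
print("train MSE:", mse(model, [x for x, _ in train], [y for _, y in train]))
print("test MSE:", mse(model, [x for x, _ in test], [y for _, y in test]))
```

Model selection would then mean comparing several candidate models on held-out error rather than on how well they fit the training rows.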
jgrahamc over 15 years ago
If you're going to throw around a term like 'machine learning' then it would be nice if you were to explain what you were doing. The article says:

"We (well, technically, Ben) run them through a machine learning model that maps to the search results and produces a result that's considerably better correlated with rankings than any single metric."
kurtosis over 15 years ago
If anyone is interested in what a real "machine learning" approach to the problem of learning a ranking function from data looks like, see this paper:

Burges et al., Learning to Rank using Gradient Descent: http://research.microsoft.com/apps/pubs/?id=69183

This is most definitely an active research area, though, and the papers citing this one should be pretty interesting.
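As a rough illustration of the idea, here is a toy sketch of pairwise learning to rank with gradient descent. It uses a pairwise logistic loss on score differences, in the spirit of that paper, but with a plain linear scoring function instead of a neural network; that simplification, the function names, and the three made-up documents are all assumptions for illustration.

```python
import math

def score(w, x):
    """Linear scoring function: higher score means ranked higher."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pairwise_loss(w, pairs):
    """Mean of log(1 + exp(-(s_better - s_worse))) over (better, worse) pairs."""
    total = 0.0
    for better, worse in pairs:
        diff = score(w, better) - score(w, worse)
        total += math.log1p(math.exp(-diff))
    return total / len(pairs)

def train(pairs, n_features, lr=0.1, epochs=200):
    """Batch gradient descent on the pairwise loss, starting from zero weights."""
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for better, worse in pairs:
            diff = score(w, better) - score(w, worse)
            # d/d(diff) of log(1 + exp(-diff)) is -1 / (1 + exp(diff)).
            g = -1.0 / (1.0 + math.exp(diff))
            for k in range(n_features):
                grad[k] += g * (better[k] - worse[k])
        w = [wi - lr * gk / len(pairs) for wi, gk in zip(w, grad)]
    return w

# Made-up feature vectors, listed in their known correct order (best first).
docs = [[3.0, 0.0], [2.0, 1.0], [1.0, 0.5]]
pairs = [(docs[i], docs[j]) for i in range(len(docs)) for j in range(i + 1, len(docs))]

w = train(pairs, n_features=2)
ranked = sorted(docs, key=lambda x: score(w, x), reverse=True)
print("loss before:", pairwise_loss([0.0, 0.0], pairs))
print("loss after: ", pairwise_loss(w, pairs))
print("recovered order matches:", ranked == docs)
```

The point of the pairwise setup is that the model is trained on which of two results should rank higher, rather than on an absolute rank value.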
carbocation over 15 years ago
If you look at their model vs the correct result, their error appears logarithmic. This is what I would expect from a linear model that is trying to approximate a function known to be logarithmic. (The 1-10 PageRank values we see are logarithms of the actual internal Google values, or so it is said.)
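A quick way to see the kind of pattern being described is to fit a straight line to a function that is known to be logarithmic and look at the residuals. This is only a sketch on synthetic data (`log10(x)` for x from 1 to 100), not a reconstruction of SEOmoz's model.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Synthetic "true" values that are logarithmic in the underlying quantity.
xs = list(range(1, 101))
ys = [math.log10(x) for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# The error is not random scatter: the line overshoots at both ends and
# undershoots in between, which is the signature of a curved target.
print("residual at x=1:   ", residuals[0])
print("largest residual:  ", max(residuals))
print("residual at x=100: ", residuals[-1])
```

A residual plot with this kind of systematic curve, instead of noise around zero, suggests the model has the wrong functional form for the target.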
meatbag over 15 years ago
This submission implies that SEOmoz is at least slightly interested in peer review, which would be a very good thing for them.