Q: Why is PageRank so complicated?

2 points by Trindaz over 14 years ago
A potentially silly question that might hurt our pending YCombinator application, but I've got to ask it anyway:

I really have no idea how PageRank works, and I'm interested in where the magic lies. Is it in the way they create an index that can be searched quickly even though it has billions of entries? I've read a lot about how it's based on the number of links pointing to pages, but isn't that just a matter of creating a big graph of all web pages and counting the links between them?

I'd appreciate any info on specifically which technical problems were so ingeniously solved to create PageRank.

4 comments

pbhjpbhj over 14 years ago
http://en.wikipedia.org/wiki/PageRank is pretty good; if you want to see the original patent, there's a link there.

Of course there are many alterations now to the original ranking algo.

Good quality links with varied text from a wide variety of high-authority domains, along with a well structured internal link graph, should be all the magic you need to get a high PR.

SEOmoz's annual ranking factors report is a good read on the practicalities.
Gibbon over 14 years ago
If you want to read up on PageRank and other search algorithms, this book goes into more detail: http://www.amazon.com/Googles-PageRank-Beyond-Science-Rankings/dp/0691122024

Larry joked that Sergey just wanted to see how cool he was by measuring how many people were linking to him.
cperciva over 14 years ago
PageRank isn't complicated. It's just an eigenvector.
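
To make the eigenvector remark concrete, here is a minimal sketch of power iteration on a toy link graph. It is not Google's implementation; the graph `toy_web`, the function name `pagerank`, the damping factor 0.85, and the even redistribution of rank from pages without outlinks are all assumptions chosen for illustration.

```python
def pagerank(links, c=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank on a small link graph.

    links: dict mapping each page to the list of pages it links to.
    c: damping factor (0.85 is an assumed, commonly quoted value).
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages with no outlinks is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links[p])
        new = {p: (1.0 - c) / n + c * dangling / n for p in pages}
        # Each page passes a share of its rank to every page it links to.
        for p in pages:
            for q in links[p]:
                new[q] += c * rank[p] / len(links[p])
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical four-page web: "c" is linked from a, b and d; nothing links to "d".
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

The fixed point of this loop is the dominant eigenvector of the damped link matrix, which is why simply counting inbound links is not the same thing: a link from a highly ranked page is worth more than a link from an obscure one.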
korch over 14 years ago
It's not just about eigenvectors, though that is the coolest part of it. The "magic" part is that PageRank is also about stochastic processes, where the ergodic theorem shows that a PageRank vector exists at arbitrary scale. So it doesn't matter how big or complex the link structure of the web gets; it will scale.

See this:

"For any matrix A = [cP + (1-c)E]' where P is an n×n row-stochastic matrix, E is a nonnegative n×n rank-one row-stochastic matrix, and 0 ≤ c ≤ 1, the second eigenvalue of A has modulus less than or equal to c. Furthermore, if P has at least two irreducible closed subsets, the second eigenvalue is equal to c.

This statement has implications for the convergence rate of the standard PageRank algorithm as the web scales, for the stability of PageRank to perturbations to the link structure of the web, for the detection of Google spammers, and for the design of algorithms to speed up PageRank."

http://answers.google.com/answers/threadview/id/379557.html
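
The convergence claim in that quote can be checked numerically on a tiny example. The sketch below is an illustration under stated assumptions, not code from the quoted source: `P` is a hypothetical 4-page web made of two closed pairs of pages (each page links to itself and its partner), `E` is taken to be the uniform matrix with every entry 1/n, c = 0.85, and the helper names `step` and `change_ratios` are made up. Reading the prime in A = [cP + (1-c)E]' as a transpose, each power-iteration step multiplies the current vector by A, and the size of the change between successive steps should shrink by a factor equal to the second eigenvalue.

```python
def step(x, P, c):
    """One power-iteration step x -> A x, with A = [cP + (1-c)E]'
    and E assumed to be the uniform matrix (every entry 1/n)."""
    n = len(x)
    total = sum(x)
    return [
        c * sum(P[i][j] * x[i] for i in range(n)) + (1.0 - c) * total / n
        for j in range(n)
    ]


def change_ratios(P, c, x0, steps=20):
    """Ratios of successive L1 changes ||x_{k+1} - x_k|| / ||x_k - x_{k-1}||."""
    ratios = []
    x_prev, x = x0, step(x0, P, c)
    prev_delta = sum(abs(a - b) for a, b in zip(x, x_prev))
    for _ in range(steps):
        x_prev, x = x, step(x, P, c)
        delta = sum(abs(a - b) for a, b in zip(x, x_prev))
        ratios.append(delta / prev_delta)
        prev_delta = delta
    return ratios


# Hypothetical web with two irreducible closed subsets: {0, 1} and {2, 3}.
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
]
c = 0.85
ratios = change_ratios(P, c, [1.0, 0.0, 0.0, 0.0])
print(ratios[-1])
```

For this particular `P` the ratios settle at c after the first step, matching the "second eigenvalue is equal to c" case. In practical terms, the number of iterations needed for a given accuracy depends on c and not on how many pages the graph has.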