A potentially silly question that might hurt our pending YCombinator application, I've got to ask it anyway:<p>I really have no idea how PageRank works, and I'm interested in where the magic lies. Is it in the way they create an index that can quickly be searched even though it's got billions of entries? I read lots about how it's based on the number of links pointing to pages, but isn't that just a matter of creating a big graph of all web pages and counting the links between them?<p>I'd appreciate any info on specifically what technical problems have been so ingeniously solved to create PageRank.
<a href="http://en.wikipedia.org/wiki/PageRank" rel="nofollow">http://en.wikipedia.org/wiki/PageRank</a> is pretty good, if you want to see the original patent then there's a link there.<p>Of course there are many alterations now to the original ranking algo.<p>Good quality links with varied text from a wide variety of high-authority domains along with a well structured internal link graph should be all the magic you need to get a high PR.<p>SeoMOZ's annual ranking factors report is a good read on the practicalities.
If you want to read up on PageRank and other search algorithms, this book: <a href="http://www.amazon.com/Googles-PageRank-Beyond-Science-Rankings/dp/0691122024" rel="nofollow">http://www.amazon.com/Googles-PageRank-Beyond-Science-Rankin...</a> goes into more detail.<p>Larry joked that Sergei just wanted to see how cool he was by measuring how many people were linking to him.
It's not just about eigenvectors, though that is the coolest part of it. The "magic" part is that PageRank is also about stochastic processes, which use the Ergodic theorem to show that their PageRank exists at an arbitrary scale. Therefore it doesn't matter how big or complex the link structure on the web gets—it will scale.<p>See this:<p>"For any matrix A = [cP + (1-c)E]' where P is an n×n row-stochastic
matrix, E is a nonnegative n×n rank-one row-stochastic matrix, and 0
=< c =< 1, the second eigenvalue of A has modulus less than or equal
to c. Furthermore, if P has at least two irreducible closed subsets,
the second eigenvalue is equal to c.<p>This statement has implications for the convergence rate of the
standard PageRank algorithm as the web scales, for the stability of
PageRank to perturbations to the link structure of the web, for the
detection of Google spammers, and for the design of algorithms to
speed up PageRank."<p><a href="http://answers.google.com/answers/threadview/id/379557.html" rel="nofollow">http://answers.google.com/answers/threadview/id/379557.html</a>