This is the same technology used by the webgrepper tool [<a href="http://blekko.com/webgrep" rel="nofollow">http://blekko.com/webgrep</a>], a grep over the source of web pages.<p>Disclaimer: I work at blekko and I developed the webgrepper.<p>As a side note, we have used this for various other purposes. Some fun ones: storing a big music collection (to extract metadata via a mapjob), citizenship test Q&A (to pick random questions), and the 'joke of the day' (which is, of course, our internal "hello world" example for new employees).