Googling "query searchengine" and the like doesn't give a good answer, so I'll try here.<p>Search engines generally do a good job of surfacing relevant data and links, but they rarely take you all the way to the goal.<p>I have several web apps I'm working on, and I would like to query Google or some other search engine for a subject, then go through the resulting links and present the relevant information.<p>I could obviously write something to do this, but is it allowed?
Or do I have to pay Google for this service?
If so, is there some automated way to do this?
Yahoo! have a great search API: <a href="http://developer.yahoo.com/search/" rel="nofollow">http://developer.yahoo.com/search/</a><p>Google quietly re-introduced their search API a few months ago under the guise of the "Ajax Search API that doesn't require JavaScript" - it requires you to provide a referrer which is a bit weird, but it should work OK: <a href="http://code.google.com/apis/ajaxsearch/documentation/#fonje" rel="nofollow">http://code.google.com/apis/ajaxsearch/documentation/#fonje</a>
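To make the referrer requirement concrete, here is a minimal sketch that only builds the request and does not send it. The endpoint URL, the `v` and `q` parameters, and the `build_search_request` helper are assumptions for illustration; check the linked documentation for the actual values before relying on any of them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for the JavaScript-free AJAX Search API; verify against the docs.
ENDPOINT = "http://ajax.googleapis.com/ajax/services/search/web"


def build_search_request(query: str, referer: str) -> Request:
    """Build (but do not send) a GET request that carries a Referer header."""
    # "v" and "q" are assumed parameter names for the API version and the query.
    params = urlencode({"v": "1.0", "q": query})
    return Request(f"{ENDPOINT}?{params}", headers={"Referer": referer})


# The referer is a placeholder for the page of your own web app that issues the search.
request = build_search_request("query searchengine", "http://example.com/my-webapp")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; the response format is not covered here.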
Google doesn't allow automated queries. An old SOAP API key might still work. (At least it still works at a company I used to work for, but I think they also had a special deal with Google.)
Once upon a time there was a Google API to do that, but it's not available anymore. And I'm pretty sure scraping their result pages is against the TOS and will get you banned pretty fast.
There is a Perl module that you can use to search Google: <a href="http://search.cpan.org/~bstilwell/Net-Google-1.0.1/lib/Net/Google/Search.pm" rel="nofollow">http://search.cpan.org/~bstilwell/Net-Google-1.0.1/lib/Net/G...</a><p>I'd imagine other languages have similar libraries.
I wrote a page scraper for Google some time back. It's actually easier to parse their pages than Yahoo's, IMO. You will get blocked for hours if you send hundreds of queries a minute, though.
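Whichever service you end up querying, spacing out requests is the usual way to stay under a rate limit. Below is a minimal sketch of a client-side throttle; the `Throttle` class and the 30-per-minute figure are hypothetical, not a threshold published by any search engine, and throttling does not make scraping permitted where the TOS forbids it.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, max_per_minute: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock    # injectable so the class can be tested without real delays
        self._sleep = sleep
        self._last = None      # time at which the previous call was allowed to proceed

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay:
            self._sleep(delay)
        self._last = now + delay
        return delay


# Hypothetical limit: at most 30 queries per minute, i.e. one every 2 seconds.
throttle = Throttle(max_per_minute=30)
```

Call `throttle.wait()` immediately before each query; the first call returns at once and later calls block only as long as needed.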
Seconding simonw's Yahoo recommendation. The API is great and the results are often as good as Google's. If you scrape Google searches, they <i>will</i> eventually block you.