I thought an archive for RSS feeds (similar to the Wayback Machine) should exist, but I couldn't find one.<p>RSS would be a lowest common denominator for the web. When you pull an RSS feed, though, you can somehow only access the last 20 posts or so. I don't think this limit is imposed by the protocol (it's RDF after all) or by custom, but the fact is that you cannot get an old article from an RSS feed.<p>Feedburner (the RSS kings) doesn't seem to have an archive, at least not one I could find.<p>Do you know if such a thing exists? The main advantage of RSS over scraping is that there's little noise due to formatting, TOCs, ads, etc.
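To illustrate the "last 20 posts" problem: an archive would have to poll each feed repeatedly and keep every item it has ever seen, since older items drop out of the feed. Below is a rough sketch of that merge step, not a description of any existing service. It assumes plain RSS 2.0 without XML namespaces (RDF-based RSS 1.0 would need namespace handling), and the names `parse_items`, `merge_snapshot`, and `sample_feed` are made up for this example.

```python
import xml.etree.ElementTree as ET


def parse_items(rss_xml):
    """Return (key, title) pairs for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        # Prefer <guid>; fall back to <link> when a feed omits it.
        key = item.findtext("guid") or item.findtext("link")
        if key:
            items.append((key, item.findtext("title", default="")))
    return items


def merge_snapshot(archive, rss_xml):
    """Add items not yet in the archive dict; return how many were new."""
    added = 0
    for key, title in parse_items(rss_xml):
        if key not in archive:
            archive[key] = title
            added += 1
    return added


def sample_feed(ids):
    """Build a tiny hypothetical feed, standing in for one polled snapshot."""
    items = "".join(
        f"<item><guid>post-{i}</guid><title>Post {i}</title></item>" for i in ids
    )
    return f'<rss version="2.0"><channel><title>demo</title>{items}</channel></rss>'


# Two polls of the same feed: the window has moved on between them.
archive = {}
merge_snapshot(archive, sample_feed([1, 2, 3]))
merge_snapshot(archive, sample_feed([3, 4, 5]))
print(sorted(archive))  # ['post-1', 'post-2', 'post-3', 'post-4', 'post-5']
```

The archive ends up holding all five posts even though neither snapshot contained more than three. The catch is that this only works from the moment polling starts; anything that scrolled out of the feed before then is still lost.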
> The main advantage of RSS over scraping is that there's little noise due to formatting, TOCs, ads, etc.<p>How would you get past the fact that not everyone publishes the entire piece of content in the RSS feed?<p>Slightly off topic, but related... I've often wondered why no one has created an RSS delivery protocol like IMAP. I've considered writing something many times, but it seems like it could get crushed by a few changes in already existing readers.