In many European countries, we have legal deposit laws that require the national library to run archival web crawls, at least of the country's TLD and sometimes, on a best-effort basis, of material outside the TLD that's considered relevant (based on language, for example).<p>These laws specify that the crawls are stored untampered and guarantee that the results can be considered valid evidence.<p>Interestingly, that's how the Internet Archive's Heritrix crawler came about - the Nordic national libraries were saddled with this requirement but didn't really have the technical infrastructure to implement it. They formed a coalition among themselves, brought the Internet Archive into it (the IIPC[0]), and used it to fund development of Heritrix.<p>[0] <a href="http://netpreserve.org/" rel="nofollow">http://netpreserve.org/</a>