This is Bret Taylor, CTO of Facebook.<p>There are a couple of things I want to clarify. First, we genuinely support data portability: we want users to be able to use their data in other applications without restriction. Our new data policies, which we deployed at f8, clearly reflect this (<a href="http://developers.facebook.com/policy/" rel="nofollow">http://developers.facebook.com/policy/</a>):<p><pre><code> "Users give you their basic account information when they connect with your application. For all other data, you must obtain explicit consent from the user who provided the data to us before using it for any purpose other than displaying it back to the user on your application."
</code></pre>
Basically, users have complete control over their data, and as long as user gives an application explicit consent, Facebook doesn't get in the way of the user using their data in your applications beyond basic protections like selling data to ad networks and other sleazy data collectors.<p>Crawling is a bit of special case. We have a privacy control enabling users to decide whether they want their profile page to show up in search engines. Many of the other "crawlers" don't really meet user expectations. As Blake mentioned in his response on Pete's blog post, some sleazy crawlers simply aggregate user data en masse and then sell it, which we view as a threat to user privacy.<p>Pete's post did bring up some real issues with the way we were handling things. In particular, I think it was bad for us to stray from Internet standards and conventions by having an robots.txt that was open and a separate agreement with additional restrictions. This was just a lapse of judgment.<p>We are updating our robots.txt to explicitly allow the crawlers of search engines that we currently allow to index Facebook content and disallow all other crawlers. We will whitelist crawlers when legitimate companies contact us who want to crawl us (presumably search engines). For other purposes, we really want people using our API because it has explicit controls around privacy and has important additional requirements that we feel are important when a company is using users' data from Facebook (e.g., we require that you have a privacy policy and offer users the ability to delete their data from your service).<p>This robots.txt change should be deployed today. The change will make our robots.txt abide by conventions and standards, which I think is the main legitimate complaint in Pete's post.