I'm looking for libraries or papers that describe strategies or algorithms for automatically linking data sets together.<p>I recall a company, since sold, that was doing this for government and police data a while back.<p>E.g., running this on your own personal data to link bank transactions with GPS/Foursquare logins, emailed receipts, browser history, etc.<p>E.g., linking one SQLite table with another table, or SQLite table -> disk files -> email.
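<p>To make the SQLite-table-to-table case concrete, here is a minimal sketch of the most naive version of what I mean: pairing bank transactions with emailed receipts when the amounts match exactly and the dates fall within a couple of days of each other. The table names, column names, sample rows, and the `link_transactions_to_receipts` helper are all made up for illustration; real data would presumably need fuzzier matching than this.

```python
import sqlite3


def link_transactions_to_receipts(conn, max_days=2):
    """Pair each transaction with receipts of the same amount dated within max_days."""
    query = """
        SELECT t.id, r.id
        FROM transactions AS t
        JOIN receipts AS r
          ON t.amount_cents = r.amount_cents
         AND ABS(julianday(t.posted_on) - julianday(r.sent_on)) <= ?
        ORDER BY t.id, r.id
    """
    return conn.execute(query, (max_days,)).fetchall()


# Hypothetical schema and rows, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY, posted_on TEXT, amount_cents INTEGER, description TEXT
    );
    CREATE TABLE receipts (
        id INTEGER PRIMARY KEY, sent_on TEXT, amount_cents INTEGER, merchant TEXT
    );
    INSERT INTO transactions VALUES
        (1, '2024-03-02', 1250, 'POS 4471 COFFEE'),
        (2, '2024-03-05', 8999, 'ONLINE PURCHASE');
    INSERT INTO receipts VALUES
        (10, '2024-03-01', 1250, 'Corner Coffee'),
        (11, '2024-03-20', 1250, 'Corner Coffee'),
        (12, '2024-03-05', 8999, 'Example Shop');
""")

links = link_transactions_to_receipts(conn)
print(links)  # list of (transaction_id, receipt_id) pairs
```

Exact amount plus a date window is only a blocking rule. What I'm hoping exists is something that discovers rules like this automatically and scores candidate pairs across very different sources.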