How do big companies (Google, Facebook, Amazon, etc.) perform record linkage (deduplication, entity resolution, master data management)? Did they develop their own algorithms, and if so, are there any published papers describing them? Are any of their implementations open source, or do any of them use commercial software?