This is very interesting for many reasons. First we have the privacy stance, which is a tremendous step for big G. Whoever managed to push this through the "machine" of internal office politics deserves applause. The very fact of acknowledging that users might want to keep control of their data locally rather than rsync everything all the time is a big step; it takes us off the "give me all your data" train that we have been on for some time.<p>Talking about specific applications of your users' data makes a lot more sense: "If you share X with us, you're helping to build a better model Y that helps you with Z." Then the prompt "Do you want to share X?" is far more meaningful than the current generic prompts like "App V wants to access all your data W?", which don't tell you anything.<p>The anonymisation-by-aggregation aspect is interesting on its own since it provides a practical approach we can use today without having to wait for homomorphic encryption. There will probably still be "data leakage", but I can see how aggregation can be fundamentally better than trying to share anonymized data by fuzzing identifiers, randomization, and binning, which are notoriously hard to pull off and suffer from de-anonymisation attacks by cross-linking with other datasets.<p>Research-wise this could be a whole new field. Let's revisit all the ML algorithms and look at the ones that lend themselves to federated updates. Perhaps certain ML algorithms have been overlooked historically because they are not "cutting edge" but lend themselves better to distributed model updates? (I bet this is already a thing...)<p>The communication complexity aspects are also very interesting since they force us to think about the bandwidth needed to communicate model updates and how to batch training. 
For high-bandwidth settings we could consider training a model from scratch, for medium bandwidth you can send model updates regularly, but what would be particularly interesting to see is async and VERY low-bandwidth updates, like just a few MB exchanged once in a while when connectivity is available.
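<p>To make the aggregation idea concrete, here is a toy sketch of averaging model updates across devices. It is not the scheme from the announcement; the function names (local_update, federated_average, run_rounds, make_clients), the tiny linear model, and the 50% "online" probability are all assumptions picked for illustration. The point is that only a small delta leaves each device, never the raw data, and that a round can proceed with whichever devices happen to be connected.

```python
import random

def local_update(weights, data, lr=0.1):
    """Run one pass of SGD on-device for y = w*x + b; return only the weight delta."""
    w, b = weights
    for x, y in data:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return [w - weights[0], b - weights[1]]

def federated_average(weights, deltas, sizes):
    """Apply the average of client deltas, weighted by each client's example count."""
    total = sum(sizes)
    return [
        wi + sum(d[i] * n for d, n in zip(deltas, sizes)) / total
        for i, wi in enumerate(weights)
    ]

def run_rounds(clients, rounds=50, seed=0):
    """Server loop: each round, aggregate deltas from the clients that are online."""
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    for _ in range(rounds):
        # Assumption: each client is reachable with probability 0.5 per round;
        # fall back to the first client so every round has a participant.
        online = [c for c in clients if rng.random() < 0.5] or clients[:1]
        deltas = [local_update(weights, c) for c in online]
        weights = federated_average(weights, deltas, [len(c) for c in online])
    return weights

def make_clients(n_clients=5, n_points=20, seed=1):
    """Hypothetical private datasets, all drawn from the line y = 2x + 1."""
    rng = random.Random(seed)
    clients = []
    for _ in range(n_clients):
        xs = [rng.uniform(-1, 1) for _ in range(n_points)]
        clients.append([(x, 2.0 * x + 1.0) for x in xs])
    return clients

if __name__ == "__main__":
    print(run_rounds(make_clients()))
```

Each client here sends two floats per round. A real model would send far more, which is where the bandwidth questions above come in: how often to sync, and how much of the update is worth transmitting.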