> AI researcher Kate Crawford was quick to ask Bard itself where its dataset came from. The answer caught her attention: Bard said one of its data sources was Gmail.

Did they find anything? There's a lot of hand-wringing at the start, then a big focus on how Google can't deny that emails are in their training data, and they finish by interviewing Bard. Google's response makes sense given that they're working with multi-terabyte text corpora: Bard has probably seen Gmail content in the form of emails that were published publicly and got picked up along with everything else in the crawl, and claiming otherwise would be confidently wrong.

It would be interesting if they had a "Q_rsqrt in Copilot" moment here, but they don't. There seems to be no evidence that Google uses private data in Bard.

> Society should be having a robust discussion on these questions, but this is not possible if such discussion is inhibited by key players like Google.

How is Google inhibiting this discussion?
The whole asking-Bard thing towards the end is completely meaningless and I'd argue irresponsible. They even say

> But of course, the observation that Bard consistently makes these claims can’t be seen as evidence one way or the other
and then go on to quote a bunch of stuff Bard said.

If I had to speculate, it sounds like Bard could have been trained on anonymized Gmail data (they could have some kind of PII-removal tool they run first; that's common, though I wouldn't trust it too much), or a model is pretrained on Gmail and fine-tuned on something else (hard to see a reason for that). Anyway, Google is acting suspicious, but pretending the chatbot's "opinion" has any bearing is disingenuous.
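To make the PII-removal speculation concrete, here's a rough sketch of the kind of regex-based scrubbing pass I mean, run over raw text before it goes into a training corpus. Everything here is hypothetical (the patterns, the placeholder tokens, the scrub_pii name); it says nothing about Google's actual pipeline, and a regex-only pass is exactly the kind of tool I wouldn't trust too much:

    import re

    # Illustrative only: redact the most obvious PII patterns from raw text
    # before adding it to a training corpus. A real pipeline would use far
    # more sophisticated detection (NER models, allowlists, audits, etc.).
    EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
    PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
    SSN_RE   = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

    def scrub_pii(text: str) -> str:
        """Replace obvious PII patterns with placeholder tokens."""
        text = EMAIL_RE.sub("[EMAIL]", text)
        text = PHONE_RE.sub("[PHONE]", text)
        text = SSN_RE.sub("[SSN]", text)
        return text

    if __name__ == "__main__":
        sample = "Reach me at jane.doe@example.com or +1 (555) 123-4567."
        print(scrub_pii(sample))
        # -> Reach me at [EMAIL] or [PHONE].

Even in the best case, something like this only strips surface identifiers; the content of the email (who you are, what you're negotiating, who you know) is still in the text, which is why "anonymized" training data wouldn't settle the question anyway.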