
Ask HN: How to prevent LLMs from building a profile of us?

2 points by ntech about 2 years ago
Perhaps similar to what Google (implicitly) does with search, but I assume LLMs would learn more deeply about us: our thoughts, questions, and so on. What I'm thinking of is, for example, circumventing it by using a second LLM to rephrase our prompt, generalize it, narrow it down, etc. Is it even possible?

Edit: Quotations -> Questions. Though it would learn our writing style too.
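A minimal sketch of that idea, assuming a locally hosted rewriter exposed as a plain local_llm(prompt) callable; the remote URL and the response field name are made up for illustration:

    # Sketch: route every prompt through a local rewriter before it
    # reaches the remote LLM, so the provider only sees a generalized,
    # restyled version of what you actually typed.
    import requests

    REMOTE_URL = "https://api.example.com/v1/chat"  # hypothetical endpoint

    def rephrase(prompt, local_llm):
        """Ask a LOCAL model to strip identifying detail and restyle."""
        instruction = (
            "Rewrite the following question in neutral, generic wording. "
            "Remove names, places, and unusual phrasing; keep the meaning:\n\n"
        )
        return local_llm(instruction + prompt)

    def ask(prompt, local_llm):
        sanitized = rephrase(prompt, local_llm)
        resp = requests.post(REMOTE_URL, json={"prompt": sanitized}, timeout=60)
        resp.raise_for_status()
        return resp.json()["answer"]  # response shape is an assumption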

2 comments

eternalban about 2 years ago
I stopped using ChatGPT very early on, as I caught myself sharing my thoughts. It's not just the association with my primary email account; at this point it is clear that our word patterns almost uniquely identify us as well.

So I suggest at minimum two safeguards:

1 - The account should be tied to an email that is only used for that service.

2 - Text should be pre-processed to obfuscate your personal written-word idiosyncrasies: some sort of locally executable text-similarity tool, trained on your normal output (just feed it your HN comments, emails, etc.), which can help create a semantically equivalent text from your original text that the tool deems 'distant' from your normal writing style. Use that output for prompting.

p.s.

A Girl Has A Name: Detecting Authorship Obfuscation, 2020
https://aclanthology.org/2020.acl-main.203.pdf

"Authorship attribution aims to identify the author of a text based on the stylometric analysis. Authorship obfuscation, on the other hand, aims to protect against authorship attribution by modifying a text's style. In this paper, we evaluate the stealthiness of state-of-the-art authorship obfuscation methods under an adversarial threat model. An obfuscator is stealthy to the extent an adversary finds it challenging to detect whether or not a text modified by the obfuscator is obfuscated – a decision that is key to the adversary interested in authorship attribution. We show that the existing authorship obfuscation methods are not stealthy as their obfuscated texts can be identified with an average F1 score of 0.87."
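A rough sketch of safeguard 2, using character n-gram TF-IDF as a crude stand-in for a real stylometric model; the paraphrase argument is a placeholder for whatever locally executable rewriter you trust, and the loop simply keeps the candidate that drifts furthest from your own corpus:

    # Sketch: keep paraphrasing until the candidate text is
    # stylometrically "far" from a corpus of your own writing.
    # Character n-grams are a common (if crude) stylometric feature.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def style_distance(candidate, own_corpus):
        vec = TfidfVectorizer(analyzer="char", ngram_range=(2, 4))
        mat = vec.fit_transform(own_corpus + [candidate])
        sims = cosine_similarity(mat[-1], mat[:-1])
        return 1.0 - sims.max()  # distance to the closest of your own texts

    def obfuscate(text, own_corpus, paraphrase, tries=5):
        best, best_dist = text, style_distance(text, own_corpus)
        for _ in range(tries):
            candidate = paraphrase(best)  # placeholder local rewriter
            dist = style_distance(candidate, own_corpus)
            if dist > best_dist:
                best, best_dist = candidate, dist
        return best

As the linked paper shows, obfuscated text that is itself detectable is only half a defense, so a check like this is a floor, not a guarantee.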
gostsamo about 2 years ago
Or you can introduce noise and bad data into the dataset. The question is whether it is really an issue at the moment.
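A toy sketch of the noise approach, interleaving decoy queries with real ones so any profile built from the traffic is diluted; the decoy pool and the send function are placeholders:

    # Sketch: pollute any usage profile by mixing decoy prompts into
    # real traffic at a fixed ratio, with jittered timing so the
    # decoys do not stand out as machine-generated.
    import random
    import time

    DECOYS = [  # placeholder decoy pool; vary topics widely
        "Best tomato varieties for a balcony garden?",
        "Explain the rules of curling.",
        "How do tides work?",
    ]

    def ask_with_noise(prompt, send, noise_ratio=2):
        for _ in range(noise_ratio):
            send(random.choice(DECOYS))
            time.sleep(random.uniform(1, 10))  # jitter between decoys
        return send(prompt)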