TechEcho

OpenAI Personal Data Removal Request Form

174 points by ciccionamente about 2 years ago

23 comments

alwayseasy about 2 years ago
Wait, how can we verify this is OpenAI's form and not some random form on the internet?

Edit: OK, the link can be found in part 4 of https://openai.com/policies/privacy-policy
discreteevent about 2 years ago
> we need clear evidence that the model has knowledge of the data subject conditioned on the prompts

We have a system that may have information about you and may even distort information about you. In fact it probably has some information about you, considering that we exercised no control over the process of ingesting information into the system. Furthermore, we don't understand or control our system well enough to remove that information or even discover it. However, we still released the system to the world, and now we expect you to test it with various prompts and hope that you get lucky before some other person does.
EMM_386 about 2 years ago
Does anyone have any idea how this is handled from a technical perspective?

The data isn't sitting in some database somewhere, it's inside of a large language model. It's not like they can just execute a DELETE statement or do an entirely new training run.

Are they intercepting the outputs with something like a moderation server as a go-between? In that case, the data would still technically exist in the model, it just wouldn't be returned.

Maybe using fine-tuning?
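
A minimal sketch of the go-between idea raised above, assuming a simple post-processing filter sits between the model and the user. Nothing here reflects how OpenAI actually handles removal requests; `BLOCKED_NAMES`, `build_pattern`, and `filter_output` are hypothetical names chosen for illustration. As the comment notes, the underlying model is untouched and only the returned text changes.

```python
import re

# Hypothetical list of names covered by removal requests (illustrative only).
BLOCKED_NAMES = ["Jane Example", "John Q. Sample"]


def build_pattern(names):
    """Compile one case-insensitive pattern matching any blocked name.

    Returns None when the list is empty, because an empty alternation
    would match at every position in the text.
    """
    if not names:
        return None
    # Longest names first so a longer name wins over a shorter prefix.
    escaped = sorted((re.escape(n) for n in names), key=len, reverse=True)
    return re.compile("|".join(escaped), re.IGNORECASE)


def filter_output(text, pattern, placeholder="[removed]"):
    """Redact blocked names from model output before it reaches the user."""
    if pattern is None:
        return text
    return pattern.sub(placeholder, text)


pattern = build_pattern(BLOCKED_NAMES)
raw = "According to my training data, jane example lives in Springfield."
print(filter_output(raw, pattern))
```

A plain string match like this misses paraphrases, nicknames, and misspellings, which is one reason a real system would likely need something more than a blocklist.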
mbgerring about 2 years ago
Putting the onus on the user to find a “relevant prompt” is bullshit. I don't care how large the training data set is, you can search it and remove data about me or authored by me if you have my personal information, much faster than I can “prove” my data is in there by trying to summon it out of the machine.

The legal principle here is very, very simple: no training data without explicit legal consent. Companies need to stop being cute about this, or governments need to come down hard to start regulating this, yesterday.
gumballindie about 2 years ago
> a request does not guarantee that information about you will be removed from ChatGPT outputs

Oh, I am pretty sure that if you don't remove all data you'll pay for it. Looking forward to hefty fines for OpenAI.
cj about 2 years ago
"Relevant prompts" should not be a required field. That means I need to use OpenAI to request my data be removed from its data set?

Is there a way to remove PII without having to use their service?
thomas34298 about 2 years ago
Somewhat related, I previously completed the form found in the help section titled "How your data is used to improve model performance" to opt out of providing training data to OpenAI: https://help.openai.com/en/articles/5722486-how-your-data-is-used-to-improve-model-performance

I received a confirmation in February that my data had been excluded from model training. However, recently, after the addition of the new Data Controls feature, I noticed that I was suddenly opted in again in the settings. I've tried contacting them about it via Discord and e-mail so that they can clarify whether the exclusion is still valid, but it seems like I'm getting ignored.
Nocturium about 2 years ago
Wouldn't it be easier if they published a list of where they scraped their data from in the first place? Filling out forms, scanning ID and sending it, only to learn they didn't capture any of your data, seems like such a waste of time.

On the other hand, they already know which sites they scraped. So publish the list, maybe with a handy lookup portal where you can enter URLs to see if they got scraped.

I prefer an opt-in model, but that's not likely to happen any time soon, so this seems reasonable while this gets legally sorted out. Just because something is transmitted publicly doesn't mean it's without copyright. Otherwise any song broadcast on radio is up for grabs to be resold by anyone receiving it.
chmod775 about 2 years ago
Reminder that you have no obligation to use their stupid form if you don't like it and all their weird requirements.

You can just send them snail mail or e-mail and they'll have to process that too. You can find templates for that all around the internet.
nilsb about 2 years ago
A request for a list of personal data they’re processing would be interesting. How would they even comply with such a request?
mstolpm about 2 years ago
I'm wondering: How can I be certain that the model contains any personal data about me (or someone else not famous)?

For a public figure, of course there is lots of information in the training data, all public data. But when asked about me or my brother, ChatGPT either refuses to answer OR hallucinates the hell out of it. Then, nearly everything is wrong and the output resembles the answer to a prompt like: "Create a short bio for a fictional character named xx, living in yy and working as zz." (Okay, often yy and zz are wrong too.)

Requesting to delete these hallucinated facts seems quite stubborn and ineffective?
sashank_1509 about 2 years ago
I frankly don’t get this privacy argument at all. If I browse Facebook and look at pictures you uploaded and end up learning something from those pictures, what am I supposed to do? Undergo brain surgery?

It feels like anything that you release on the internet publicly is fair game. If, however, you didn’t release it in public, put it behind a password, and then OpenAI somehow got access to it and trained on it, I can see the argument here. But if you put up data on your own, I don’t see why you should be able to prevent others from accessing that data. If you don’t want others using it, don’t put it out there.
aorth about 2 years ago
According to the Washington Post's reporting on the contents of the C4 dataset last week, I see that ChatGPT has over 100,000 tokens from my personal blogs. Does that qualify as "personal data" in this context? Everything I've written is clearly labeled as CC-BY so it would be nice to at least be acknowledged.

https://www.washingtonpost.com/technology/interactive/2023/ai-chatbot-learning/
Raed667 about 2 years ago
I filled it out weeks ago and still have no response.

Has anyone had any success with this?
blazespin about 2 years ago
Interestingly, I find ChatGPT very reticent about giving out PII of anyone not notable. This might not be a huge issue.
sensanaty about 2 years ago
Here's to hoping this forces their hand into making the mass data-hoovering completely opt-in so they can keep these pesky normies from bothering them again in the future.
YeBanKo about 2 years ago
This has to be opt-in, not opt-out. A company should not be allowed to use personal data you did not explicitly provide it.
fogoflove about 2 years ago
Just looked, and I'm guessing this is an EU thing? No option for US in dropdown.
johntiger1 about 2 years ago
I filled out a Google Docs form to opt out once. Which one is it?
peter_retief about 2 years ago
How would anyone know what information they have on them?
tekchip about 2 years ago
You can't finish this form without selecting "country whose laws apply". The US isn't in the list so you can't submit the form. California and Illinois have GDPR-like laws. This is pretty messed up.
gnufx about 2 years ago
So they seem to be confirming they're processing personal data. The UK ICO "at a glance" guidance for that includes:

* You must identify valid grounds under the UK GDPR (known as a ‘lawful basis’) for collecting and using personal data.

* You must use personal data in a way that is fair. This means you must not process the data in a way that is unduly detrimental, unexpected or misleading to the individuals concerned.

* You must be clear, open and honest with people from the start about how you will use their personal data.
blazespin about 2 years ago
You'll want to fact check this, but I find GPT4 is OK at this kinda stuff.

> If you've been asked to remove data in accordance to a gdpr request, is it best effort or do you have to strictly comply with the request?

ChatGPT: I am not a lawyer, but I can offer some general information about GDPR and data removal requests. The General Data Protection Regulation (GDPR) is a legal framework that sets guidelines for the collection, processing, and storage of personal data of individuals within the European Union (EU). Under GDPR, individuals have the "right to be forgotten," also known as the "right to erasure." This means that they can request the deletion of their personal data under specific circumstances.

When a data controller (the organization that determines the purposes and means of processing personal data) receives a valid GDPR request to remove personal data, they are generally required to strictly comply with the request. However, there are some exceptions, and the extent of the obligation may depend on the specific circumstances surrounding the request.

The GDPR requires data controllers to respond to data removal requests without undue delay, and in any event within one month of receipt of the request. This period may be extended by two further months where necessary, taking into account the complexity and number of requests.

It's important to remember that GDPR compliance is not solely about "best effort." Organizations are expected to take reasonable and appropriate measures to ensure compliance with the regulation. This includes having processes in place to handle requests and promptly remove personal data where required. Failure to comply with GDPR can result in significant fines and penalties.

However, it's also essential to consider that there are circumstances under which a data controller may not be required to comply with a data removal request. These exceptions can include situations where the processing of personal data is necessary for exercising the right of freedom of expression and information, compliance with a legal obligation, or for reasons of public interest.

In any case, it's crucial to consult with legal counsel or a data protection officer to better understand your organization's specific obligations under GDPR and ensure proper compliance.
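
The response window quoted above (one month, extendable by two further months) can be turned into dates with a rough sketch like the following. It assumes a simple calendar-month rule that clamps to the last day of shorter months; how the period is counted in practice is a legal question this does not settle. `add_months` and `response_deadlines` are hypothetical helper names.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the same day-of-month `months` later, clamped to month end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def response_deadlines(received):
    """Return (standard, extended) deadlines for a request received on `received`.

    Assumption: standard = one month after receipt; extended = one month
    plus two further months, i.e. three months after receipt.
    """
    return add_months(received, 1), add_months(received, 3)


standard, extended = response_deadlines(date(2023, 5, 4))
print("standard deadline:", standard)
print("extended deadline:", extended)
```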