TechEcho

Show HN: AI tool reads privacy policies, tells you which sites sell your info.

145 points by michaelaiello over 13 years ago

31 comments

michaelaiello over 13 years ago
I've been a member @ HN for quite a while, but I usually tend to focus on security and privacy topics. One of my good friends is visiting, and we wanted to work on something challenging together. Both of us find privacy policies overly confusing and annoying, so we decided to tackle the problem. We built a tool that crawls for privacy policies and uses guided machine learning to analyze them. We would love any feedback you have.
pkulak over 13 years ago
You can go so much farther with this. How about letting me paste in any TOS and have it analyzed for important bits or things out of the ordinary? I'd love having my own personal robot lawyer to read over all the stuff I sign or agree to!
carbocation over 13 years ago
The site is loading extremely slowly. Might I suggest you turn off, or dramatically reduce, the KeepAliveTimeout?
Header:
    Connection: Keep-Alive
    Date: Fri, 11 Nov 2011 01:27:19 GMT
    Keep-Alive: timeout=15, max=100
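For anyone who wants to check this from a script rather than by eye, here is a minimal Python sketch that pulls the `timeout` and `max` values out of a Keep-Alive header value like the one quoted above. The function name `parse_keep_alive` is made up for illustration, and the sketch only handles simple `name=integer` parameters.

```python
def parse_keep_alive(value):
    """Parse a Keep-Alive header value such as 'timeout=15, max=100'.

    Returns a dict of the integer parameters found; anything that is not
    of the form name=<digits> is ignored.
    """
    params = {}
    for part in value.split(","):
        name, sep, raw = part.strip().partition("=")
        raw = raw.strip()
        if sep and raw.isdigit():
            params[name.strip().lower()] = int(raw)
    return params


if __name__ == "__main__":
    print(parse_keep_alive("timeout=15, max=100"))
```

A 15-second timeout means each idle connection can hold a server worker for up to 15 seconds, which is why lowering it tends to help when a site is under sudden load.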
martey over 13 years ago
What about inconsistent privacy policies?
For example, http://www.privacyparrot.com/privacy states that they never "share any information about you" but then has an offhand mention about the site using Google Analytics.
saalweachter over 13 years ago
On a more serious note, I have two real questions:
1. So, having crawled a boatload of privacy policies, what fraction of them say that they'll sell your data?
2. Are you worried that the lawyers will find your tool and tweak their policies to beat it?
dylangs1030 over 13 years ago
I love this idea, but here's how I think it could be improved:
1. Consider porting the entire service to a browser extension for Chrome or Firefox, and making the homepage more of an information/FAQ center.
2. Demo video. A good demo video explaining why "John Doe" should worry about his personal information being sold would be more convincing - this is how you get your service to less savvy internet users who aren't primarily concerned with privacy.
3. Find a way around inconsistencies. It would be better to report if a website *actually* sells/uses your personal information rather than returning a simple search result with TOS findings. A website can tweak or flat out lie. You should try to account for this.
4. Are you planning to commercialize this in any way? How do you plan to fund it, if at all?
jeremyarussell over 13 years ago
Just curious whether it could highlight the offending phrases it used to figure out the difference between selling, not selling, and bankruptcy selling. That way, when we put in our revisions, we can better help it learn.
Also, if you aren't planning on making this a commercially viable product, could you release the source code? Things like this make the world better and safer (not to mention easier and more fun). All in all, though, it was rather interesting. (Still trying out websites, and I see myself doing this until the end of the day at work.)
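One way the highlighting asked for above could work, as a rough Python sketch: split the policy into sentences and report which sentences match a phrase tied to each category. The category names, the regular expressions and the function `find_evidence` are all assumptions made for illustration. The tool itself is described as using guided machine learning, so this keyword matcher is a stand-in, not its method.

```python
import re

# Hypothetical phrase patterns per category; not taken from the real tool.
CATEGORY_PATTERNS = {
    "sells": r"\bmay sell\b|\bwe sell\b",
    "does_not_sell": r"\b(?:do|does|will) not sell\b|\bnever sell\b",
    "bankruptcy": r"\bbankruptcy\b|\bmerger\b|\bacquisition\b",
}


def find_evidence(policy_text):
    """Return {category: [sentences that matched that category's pattern]}."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", policy_text.strip())
    evidence = {category: [] for category in CATEGORY_PATTERNS}
    for sentence in sentences:
        for category, pattern in CATEGORY_PATTERNS.items():
            if re.search(pattern, sentence, re.IGNORECASE):
                evidence[category].append(sentence)
    return evidence


if __name__ == "__main__":
    sample = (
        "We will not sell your email address. "
        "In the event of a bankruptcy, user data may be transferred."
    )
    for category, hits in find_evidence(sample).items():
        print(category, hits)
```

Showing the matched sentences next to the verdict would also give people submitting corrections something concrete to agree or disagree with.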
jasonkester over 13 years ago
Of the two sites of mine that I checked, one came up as "Danger! Warning! They're going to sell your information in case of a Bankruptcy!!!"
Why?
Reading one of the submitter's comments below, it seems to lump "sold the entire company, therefore the user database went with it" into the same category as "we're running out of money, so let's sell everybody's email addresses to spammers."
They're not in any way related. I'd suggest splitting out those two categories, as I suspect it will drop that "bankruptcy email fire sale" category down to somewhere near 0%.
jrockway over 13 years ago
Why does it automatically add www in front of what I type in the URL box? If I type in news.ycombinator.com, it says "www.news.ycombinator.com does not exist".
Well, yes. That's why I didn't type that.
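A minimal Python sketch of one way to avoid the behaviour described above: try the host exactly as typed first, and only fall back to a `www.` variant when the input is a bare two-label name. The function name `candidate_hosts` and the two-label heuristic are assumptions for illustration (the heuristic would misjudge names such as `example.co.uk`), and nothing here reflects how the site actually resolves names.

```python
from urllib.parse import urlsplit


def candidate_hosts(user_input):
    """Return hostnames to try, in order, for text typed into a URL box."""
    text = user_input.strip()
    # urlsplit only fills in the network location when '//' is present.
    if "//" not in text:
        text = "//" + text
    host = (urlsplit(text).hostname or "").rstrip(".")
    if not host:
        return []
    candidates = [host]  # always try exactly what the user typed first
    # Assumed heuristic: only bare 'example.com'-style names get a www. fallback.
    if host.count(".") == 1 and not host.startswith("www."):
        candidates.append("www." + host)
    return candidates


if __name__ == "__main__":
    for typed in ("news.ycombinator.com", "facebook.com"):
        print(typed, "->", candidate_hosts(typed))
```

With this ordering, `news.ycombinator.com` is looked up as typed and never rewritten to `www.news.ycombinator.com`.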
jakubw over 13 years ago
It'd be interesting to have a browser extension based on this popping up a warning on websites with a suspicious privacy policy.
route66 over 13 years ago
In a galaxy far away there was once conceived the idea of a machine-readable privacy policy ... checking the interwebs reveals that http://www.w3.org/P3P/ was updated for the last time in 2007.
After some more searching: http://www.cdt.org/paper/looking-back-p3p-lessons-future points to more information about that path from the past. What would the p3p.xml of facebook look like?
On another note: www dot freeprivacypolicy dot com [1] seems to generate the kind of privacy policies the site featured in this post sets out to parse. There is humor in that.
[1] don't want to feed page rank as privacyparrot says: "Your information may be sold during a bankruptcy"
raheemm over 13 years ago
I entered facebook and got two results, one saying facebook.com does not sell; the other saying www.facebook.com does sell. See http://www.privacyparrot.com/search?search=facebook.com
ams6110 over 13 years ago
Follow-up idea: read mutual fund prospectuses and identify anything "out of the ordinary."
roryokane over 13 years ago
A bug report: if someone already added “.com” to their results because of no result, don’t offer to add it again if there are still no results.
Example: http://www.privacyparrot.com/search?search=http://www.noSuch.com
Also, I suggest when you suggest adding “.com”, you strip spaces from the search as well. For instance, I searched for “Less Wrong” and found nothing, and you suggested “Less Wrong.com”. That doesn’t exist, but “LessWrong.com” does.
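Both points in the report above can be covered by one small rule, sketched here in Python: remove whitespace before building the suggestion, and offer nothing when the query already ends in ".com". The function name `suggest_dot_com` is hypothetical, and the site's real suggestion logic is not known.

```python
def suggest_dot_com(query):
    """Return a '.com' suggestion for a search with no results, or None."""
    compact = "".join(query.split())  # "Less Wrong" -> "LessWrong"
    if not compact:
        return None
    if compact.lower().rstrip("/").endswith(".com"):
        # ".com" is already present: only worth suggesting if removing
        # the spaces actually changed what the user typed.
        return compact if compact != query else None
    return compact + ".com"


if __name__ == "__main__":
    for typed in ("Less Wrong", "Less Wrong.com", "http://www.noSuch.com"):
        print(repr(typed), "->", suggest_dot_com(typed))
```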
simonbrown over 13 years ago
I'm not a lawyer but...
How can someone trust you to parse the policies correctly? What if someone sues you for incorrectly interpreting a policy which they then use to make a decision?
SoftwareMaven over 13 years ago
Is it really possible to keep user data safe during a bankruptcy? It is a tangible asset that may provide value to creditors.
I really want (for my co) the answer to be "yes".
samg_ over 13 years ago
I am just learning some of these machine learning tools and am rapt, so forgive me for asking, but would you be able to explain a little about what you are doing?
How are you generating features? Stanford parser? Are you using logistic regression or something more advanced?
I love the idea. I am interested in applying some of these concepts myself. Do you have any ideas that you are not able to pursue yourself, that I might take a crack at?
rednaught over 13 years ago
Would it be possible to also identify sites that share your information?
A great example: facebook.com Does not sell your private information.
But they obviously do share information, and while this is apparent to most users, how many sites practice the same without users being aware of it?
Also, any plans to capture changes in privacy policies over time? Oftentimes, site owners do not proactively notify users when their policies or legalese have changed.
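Tracking changes over time, as asked about above, could start from nothing more than a line diff between two stored snapshots of the same policy. A minimal Python sketch using the standard library's `difflib.ndiff`; the function name `policy_changes` and the idea of storing one snapshot per crawl are assumptions, not something the tool is known to do.

```python
import difflib


def policy_changes(old_text, new_text):
    """Return (added_lines, removed_lines) between two policy snapshots."""
    added, removed = [], []
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        # ndiff prefixes each line with '+ ', '- ', '  ' or '? ' (hint lines).
        if line.startswith("+ "):
            added.append(line[2:])
        elif line.startswith("- "):
            removed.append(line[2:])
    return added, removed


if __name__ == "__main__":
    old = "We never sell your data.\nWe use cookies."
    new = "We may sell your data to partners.\nWe use cookies."
    added, removed = policy_changes(old, new)
    print("added:", added)
    print("removed:", removed)
```

The added and removed lines could then be fed back through the classifier, so a site is flagged only when a change actually touches the selling or sharing language.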
simcop2387 over 13 years ago
It does not appear to like subdomains. I've been trying to get it to visit http://news.ycombinator.com and see what it thinks. But I keep getting back, We were unable to connect to http://www.news.ycombinator.com. If it exists, please try again later.
a3_nm over 13 years ago
> See if a site sells your personal information.
Rather, see if a site tells you that it sells your personal information. It's an important difference.
rblackwater over 13 years ago
This is very cool. Maybe you could make a scriptlet bookmark that pops something on the page you are viewing. Here's one that will redirect you to the privacy parrot page:
    javascript:location.href="http://www.privacyparrot.com/privacy-policy-for-"+location.host;
user24 over 13 years ago
Would be useful to reproduce the part of the terms which causes the parrot to reach the conclusion it does.
giulivo over 13 years ago
I tried the policizer with some copied-and-pasted policies, but it frequently told me "CAN SELL" just because the text did not include any specifics regarding selling and bankruptcy.
Is that the intended behaviour?
omouse over 13 years ago
Where's the code?
Seriously, where's the code? Your server seems to be getting hit pretty hard. Would be nice to be able to hack on it and to be able to host a mirror.
JacobIrwin over 13 years ago
I wonder if Facebook does (http://tinypic.com/r/wjctow/5) hmmm... that's interesting..
01PH over 13 years ago
Thank you so much for the effort. This is really something useful. Would love to see more projects on making privacy themes more accessible.
dodo53 over 13 years ago
Who knew the singularity would start as an arms race between AI writing obfuscated legal documents and AI decoding them :oP
michaelfeathers over 13 years ago
What are the AI tool's terms of service?
saalweachter over 13 years ago
bool CompanySellsPrivateData(string privacy_policy) { return true; }
rwaliany over 13 years ago
it says doubleclick.net does not sell my private information...
lucian303 over 13 years ago
facebook.com Does not sell your private information.
www.facebook.com Can sell your private information.