Does HN have anti-duplication protection?

4 points by mothcamp over 2 years ago
Six months ago, I published part one of my NLP course and submitted this link: https://news.ycombinator.com/item?id=31421232

This morning, I wanted to share that I released the FULL course (same URL) but every time I hit submit, it redirects me to my previous submission.

Is this some anti-duplication protection in action? Does my account not have posting privileges?

5 comments

mindcrime over 2 years ago
Yes, there is at least *some* automatic anti-duplication stuff going on. The easiest way to see this in action is to re-submit an existing URL with *the exact same URL* within a certain period of time, and notice that your submission just automatically becomes an upvote on the existing submission.

That said, the anti-dupe mechanism doesn't catch all dupes, and from what I can recall of things said by dang, pg, etc. in the past, I think that is intentional. In particular, dupes are explicitly considered OK after a certain period of time. You can see this by noting that certain links have been submitted to HN, and sometimes discussed in detail, on 5, 10, or even 15 unique occasions.

I believe whatever automatic anti-duplicate detection they have doesn't do much besides look for an exact match on the URL, though. It was known at one time that you could submit a dupe and get it to go through by just adding some extra stuff to the query string, for example. What I can't speak to at all is how much effort (if any) the mods put into *manually* detecting and remediating dupes. I can't recall any of the mods ever addressing that point explicitly. My suspicion is that they do spend at least some cycles on that, though I can't prove it. And I may very well be wrong.

All this is totally unofficial, mind you. It's just based on my recollections from various times this topic has been discussed in the past, and my own empirical observations. YMMV.
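To make the behaviour described above concrete, here is a minimal Python sketch of exact-match URL deduplication. It is a toy model and not HN's actual code: the `SubmissionIndex` class, its method names, and the one-year `DEDUP_WINDOW` are all assumptions made for illustration (the comment above only says "a certain period of time").

```python
from datetime import datetime, timedelta

# Hypothetical window; the real value (if any) is not stated in this thread.
DEDUP_WINDOW = timedelta(days=365)


class SubmissionIndex:
    """Toy model of exact-match URL dedup (an illustration, not HN's code)."""

    def __init__(self, window=DEDUP_WINDOW):
        self.window = window
        self._by_url = {}   # exact URL string -> (item_id, submitted_at)
        self._points = {}   # item_id -> points
        self._next_id = 1

    def submit(self, url, now):
        """Return (item_id, is_new).

        A resubmission of the exact same URL string inside the window
        becomes an upvote on the existing item instead of a new item.
        """
        existing = self._by_url.get(url)
        if existing is not None and now - existing[1] < self.window:
            item_id = existing[0]
            self._points[item_id] += 1
            return item_id, False

        # No match, or the earlier submission is old enough: new item.
        item_id = self._next_id
        self._next_id += 1
        self._by_url[url] = (item_id, now)
        self._points[item_id] = 1
        return item_id, True

    def points(self, item_id):
        return self._points[item_id]
```

Because the lookup key is the raw URL string, a resubmission with anything extra in the query string counts as a different URL in this model, and a resubmission after the window has passed creates a fresh item.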
wskish over 2 years ago
I noticed a lot of dups on the HN Summary bot (https://github.com/jiggy-ai/hn_summary) so was wondering if we needed an embedding similarity search to filter them. So I checked the database of recent stories and found 194 instances of duplicates with the exact same Story Title or Story URL in the last few days that the bot has been running.

These were all story items that made it into the /topstories Hacker News API endpoint:

https://gist.github.com/wskish/c8c6dbcb1c036882f3eb11b0660c0ac4
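A check like the one described above could look roughly like this in Python. This is only a sketch: the `find_exact_duplicates` helper is hypothetical, and the record shape (dicts with `id`, `title`, and `url`, mirroring the field names of Hacker News API items) is an assumption, since the bot's actual database schema is not shown here.

```python
from collections import defaultdict


def find_exact_duplicates(stories):
    """Group story ids that share an exact title or an exact URL.

    `stories` is assumed to be an iterable of dicts with "id", "title",
    and optionally "url" keys (Ask HN style posts may have no URL).
    """
    by_title = defaultdict(list)
    by_url = defaultdict(list)
    for story in stories:
        if story.get("title"):
            by_title[story["title"]].append(story["id"])
        if story.get("url"):
            by_url[story["url"]].append(story["id"])

    # Keep only the keys that more than one story maps to.
    dup_titles = {t: ids for t, ids in by_title.items() if len(ids) > 1}
    dup_urls = {u: ids for u, ids in by_url.items() if len(ids) > 1}
    return dup_titles, dup_urls
```

Exact matching like this only finds the easy cases; near-duplicates with reworded titles or slightly different URLs would still need something like the embedding similarity search mentioned above.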
Normille over 2 years ago
Judging by the countless times the same stories get posted here, I'd very much doubt there's any automatic de-duplication going on.

But, if the system is stopping you submitting the same URL again, why not just put a meaningless query string on the URL so it's different from last time? E.g.:

https://www.nlpdemystified.org/?blah

BTW, I don't know if that will work. Just a thought.
Tomte over 2 years ago
Yes. The former submission got enough attention, so it shouldn't be submitted for a year.

Solution: write a separate release announcement (there's certainly more to tell than just "done"?), link to the course from there, and submit the announcement.
PaulHoule over 2 years ago
Yeah it tries to block dupes, but funny the other day people were complaining that the same Elon Musk tweet got submitted at least 5 times in 30 minutes…