Ask HN: What's up with the ChatGPT spam here lately?

19 points by pona-a 10 months ago
Over the past few days I have noticed a large uptick in comments that are probably ChatGPT-generated. The accounts have low or negative karma, were registered in the past few months, started posting less than a week ago, and seem to just rephrase the title or contents of a post, with some faux "questions" at the end.

Has anyone found a reasonable heuristic to block them? Could someone collect a small dataset to train a classifier? If HN becomes a target for this, manual moderation may quickly prove insufficient.
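The signals listed in the post can be combined into a rough score. The following is only a sketch under stated assumptions: the `Account` fields, the thresholds (karma of 1 or less, 180 days, 7 days, 60% title overlap), and the function names are all hypothetical choices for illustration, not anything HN is known to use.

```python
import re
from dataclasses import dataclass

DAY = 86400  # seconds


@dataclass
class Account:
    karma: int
    created: int     # Unix time the account was registered
    first_post: int  # Unix time of the account's first comment


def _tokens(text):
    """Lowercased set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def title_overlap(comment, title):
    """Fraction of the title's words that reappear in the comment."""
    title_words = _tokens(title)
    if not title_words:
        return 0.0
    return len(title_words & _tokens(comment)) / len(title_words)


def suspicion_score(account, comment, title, now):
    """Count how many of the post's signals fire (0 to 5). Thresholds are guesses."""
    score = 0
    if account.karma <= 1:                    # low or negative karma
        score += 1
    if now - account.created < 180 * DAY:     # registered in the past few months
        score += 1
    if now - account.first_post < 7 * DAY:    # started posting under a week ago
        score += 1
    if title_overlap(comment, title) >= 0.6:  # mostly rephrases the title
        score += 1
    if comment.rstrip().endswith("?"):        # closes with a question
        score += 1
    return score
```

A score like this would only be useful for ranking comments for human review; each signal alone also matches plenty of genuine new users.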

5 comments

mtmail 10 months ago
Can you list examples? Or better, report them to the moderators ('Contact' link on the page footer)? I've reported some in the past, months ago, but haven't seen any recently.
low_tech_love 10 months ago
Never noticed it, but I'm interested; can you link some examples?
mediumsmart 10 months ago
I think you can just feed the AI real HN comments (as the style to use for generating) to avoid detection.

Besides, how would the classifier scheme work? Validate the input or prune the threads? Good luck with either approach.
ilt 10 months ago
@dang
syndicatedjelly 10 months ago
It's a valid concern that you've raised about the potential increase in ChatGPT-generated comments on HN. Here are some thoughts and potential solutions:

1. *Heuristic Identification*:
   - *Account Age and Karma*: As you mentioned, new accounts with low or negative karma could be a red flag. Filtering out comments from these accounts might help, although it might also block new, genuine users.
   - *Comment Content*: Look for patterns in the comments, such as generic or overly formal language, repetition, and lack of personal experience or detailed technical knowledge.
   - *Engagement Metrics*: Check the engagement these comments receive. Comments that are ignored or downvoted could be another indicator.

2. *Training a Classifier*:
   - *Data Collection*: You'd need a dataset of known AI-generated comments and genuine comments. This could be challenging but necessary for creating an effective classifier.
   - *Features*: Potential features for the classifier could include linguistic cues, metadata (account age, karma), and engagement metrics (upvotes, downvotes, replies).
   - *Community Involvement*: Encourage the community to flag suspected AI-generated comments. This could provide more data for training and improve the classifier's accuracy.

3. *Manual Moderation*:
   - While manual moderation might not be scalable, especially if the volume increases, it is still crucial for edge cases where automated methods might fail.
   - Moderators could focus on verifying flagged comments rather than monitoring all comments, making the process more efficient.

4. *Community Guidelines*:
   - Clear guidelines about AI-generated content could help. Encourage transparency if users are experimenting with AI-generated comments and provide a proper context.

5. *Technical Solutions*:
   - *CAPTCHA*: Implementing CAPTCHAs during account creation or before posting could deter automated systems from flooding the site.
   - *Rate Limiting*: Limiting the number of posts or comments a new account can make in a short period could reduce the impact of spam accounts.

By combining these approaches, HN can better manage the influx of AI-generated content and maintain the quality of discussions.
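The rate-limiting idea for new accounts could look roughly like the sliding-window sketch below. The class name, the default limits (3 posts per hour for accounts younger than 7 days), and the in-memory storage are all assumptions made for illustration; nothing here reflects how HN actually throttles accounts.

```python
from collections import defaultdict, deque


class NewAccountRateLimiter:
    """Sliding-window post limit applied only to recently created accounts."""

    def __init__(self, max_posts=3, window_seconds=3600, new_account_age=7 * 86400):
        self.max_posts = max_posts
        self.window_seconds = window_seconds
        self.new_account_age = new_account_age
        self._recent = defaultdict(deque)  # user -> timestamps of accepted posts

    def allow(self, user, account_created, now):
        """Return True if `user` may post at Unix time `now`."""
        # Established accounts are not throttled by this limiter.
        if now - account_created >= self.new_account_age:
            return True
        recent = self._recent[user]
        # Forget posts that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_posts:
            return False
        recent.append(now)
        return True
```

Rejected attempts are not recorded, so a blocked account regains its allowance as soon as its oldest accepted post leaves the window.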