
Idea: Idiot Filter

5 points by atte over 13 years ago
I think an "idiot filter" for search results would save me a lot of time. Particularly when I'm digging through forums for information, I tend to skip over entries with poor grammar and spelling. Occasionally someone just speaks poor English, but more often this is an indicator that the submitter is unintelligent (or drunk) and their response will not be useful to me.

A simple idiot filter might just work as a layer above Google and snip out results with a high percentage of grammatical and spelling errors. A more refined one (probably a browser plugin) would act on content within pages and hide or dim unintelligible blocks. If the focus was on forums, I don't think it would be too hard to come up with an algorithm for guessing what encompasses a single user's submission by analyzing the structure of the page.

What do you guys think? Anyone want to work on it with me?
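The "layer above Google" idea could be sketched very simply: score each result's text by the fraction of tokens not found in a word list, and drop results above some threshold. The tiny `WORDS` set and the `0.3` threshold below are illustrative assumptions, not tuned values; a real version would use a full dictionary (e.g. a Hunspell word list).

```python
import re

# Stand-in dictionary; a real filter would load a full word list.
WORDS = {
    "the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog",
    "this", "is", "a", "test", "of", "filter",
}

def misspelling_rate(text: str) -> float:
    """Fraction of tokens not found in the word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in WORDS)
    return unknown / len(tokens)

def keep(text: str, threshold: float = 0.3) -> bool:
    """Keep a result only if its misspelling rate is under the threshold."""
    return misspelling_rate(text) <= threshold
```

As the comments below point out, a pure spelling heuristic has obvious failure modes (dyslexic authors, phone typos), so the threshold would matter a great deal.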

8 comments

ChuckMcM over 13 years ago
It's an interesting concept, although the definition of 'idiot' is not very precise. As others have pointed out, sometimes brilliant people can't compose grammatically correct English. That being said...

It's fairly easy to identify forums on the web (they have a form which is generally very common, inspired by phpBB way back when). And you could identify users, take the sum of all their contributions, and try to generate some sort of 'evolved' karma score for their posts. Things you might consider are things that academics use: how many times was the post referred to (similar to citations in papers), what sort of traffic follows the posting (similar to counterpoint papers), etc. But even if you end up with a perfect score, you won't benefit until you've been able to process several postings. If poor-quality posts are the norm in your particular research area, you will still deal with a lot of junk while the algorithm is learning that it *is* junk.

Finding a way to predict that a posting is going to score high on the suppression scale as it's being posted would be helpful, but new posters appear quite rapidly, mitigating the benefit significantly.
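The 'evolved' karma idea above could be sketched as a per-user score combining citation-like references and follow-on traffic, normalized by the number of posts seen. The weights here are arbitrary placeholders, and the zero-history branch reflects the cold-start problem the comment raises.

```python
def karma_score(citations: int, follow_traffic: int, posts_seen: int) -> float:
    """Hypothetical 'evolved karma': references to a user's posts weighted
    more heavily than raw follow-on traffic, averaged over posts seen."""
    if posts_seen == 0:
        # Cold start: no history yet, so the score cannot help.
        return 0.0
    return (2.0 * citations + 0.5 * follow_traffic) / posts_seen
```

The cold-start branch is exactly the limitation noted above: the score is useless until several postings have been processed.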
devs1010 over 13 years ago
Hey, I'm working on an open source project that I think could have an application here. It's something I've termed a "web gatherer": basically, it provides a framework for crawling web pages and then has workflows where custom code is written to determine certain things about each page. If a page meets the criteria programmed for that workflow, it is added to the results queue; the others are filtered out. I'm planning to implement an NLP component at some point using one of the open source NLP libraries available. Overall, I think of this project as sort of a web scraper / search engine that sits above the base layer (such as Google) and can be used to refine results. Anyway, you may be interested; if so, feel free to contact me: https://github.com/devs1010/WebGatherer---Scraper-and-Analyzer
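The workflow model described above reduces to a predicate applied to each crawled page: matches go to a results queue, the rest are dropped. This is a minimal sketch under that assumption; the `Page` and `Workflow` names are illustrative, not taken from the linked project.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

@dataclass
class Page:
    url: str
    text: str

# A workflow is just custom code deciding whether a page qualifies.
Workflow = Callable[[Page], bool]

def gather(pages: Iterable[Page], workflow: Workflow) -> List[Page]:
    """Run the workflow predicate over crawled pages; keep only matches."""
    return [p for p in pages if workflow(p)]
```

An NLP-based filter would slot in as just another `Workflow` predicate over the page text.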
dlitz over 13 years ago
You'd end up filtering out really good blogs like ERV, because its author objects to apostrophes and sometimes writes like a LOLcat: http://scienceblogs.com/erv/2009/12/drug_resistant_prions_via_quas.php
johnl over 13 years ago
I search forums for DIY home projects and have found I need at least 10 responses to my question before I arrive at a result I feel comfortable with. Going back over the responses with the overview gained from the search, I can now see that responses I originally thought were poor actually weren't. I keep thinking something like a do-it-yourself thread builder that you build, save, and share while you do your Google search (sort of a Tumblr, except that you access multiple sites) might be a better approach than an exclusion approach.
glimcat over 13 years ago
NLP is hard, particularly for highly general problems conducted on small samples of text.

Here's a problem case that you will find to be very common: a 20-second post with the right answer to a difficult problem, written by someone who's busy and typing on their phone. It's riddled with typos, weird corrections, transposition errors, etc., but it's still something you'd want to be a high-ranking result.
gujk over 13 years ago
Done.

http://www.chrisfinke.com/addons/youtube-comment-snob/
meatsock over 13 years ago
This could be accomplished more simply by counting the number and size of the avatars on the forum in question.
mrkmcknz over 13 years ago
I know some highly intelligent people who are dyslexic. How would you tackle that?