TechEcho

10 comments

wadkar over 7 years ago

The fakebox doesn’t detect fake news; it detects articles which are factual/real, and everything else is labeled as “fake”. Where’s the dataset? How did you verify the ground truth? Where are the annotation/labeling guidelines? What’s the definition of factual/real articles? The dataset appears to be created by the author, which isn’t necessarily wrong, but to paraphrase Karl Popper (in the context of human knowledge and scientific endeavors): There are no ‘pure’ facts available; all observations are functions of subjective factors such as interests, expectations, wishes etc. http://plato.stanford.edu/entries/popper/#GrowHumaKnow

raker over 7 years ago

This article is the 5%. A more accurate way of detecting "fake news" would be interesting, but I fail to see how such a thing could be designed, past simple detection of wishy-washy and avoidant word patterns.

bagrow over 7 years ago

Accuracy is not a sufficient measure of a classifier. Better to report precision and recall, or any number of other combination measures. https://en.m.wikipedia.org/wiki/Evaluation_of_binary_classifiers
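To make the point above concrete, here is a minimal sketch of how precision, recall, and F1 fall out of a confusion matrix. The labels and predictions are made up for illustration and are not from the article's dataset; the function names are hypothetical helpers, not part of any library.

```python
def confusion_counts(y_true, y_pred, positive="fake"):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1), using 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: 3 fake and 5 real articles (assumption, for illustration only).
y_true = ["fake", "fake", "fake", "real", "real", "real", "real", "real"]
y_pred = ["fake", "real", "real", "real", "real", "real", "real", "fake"]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

In this toy case accuracy looks middling while recall on the "fake" class is only one in three, which is the kind of gap a single accuracy figure hides.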

minimaxir over 7 years ago

The OP does not state the label distribution of the training data; it's entirely likely that the split is not balanced 50/50, which would make "95% accuracy" misleading as an indicator of quality. This is one of the reasons why I recommend that Medium thought pieces disclose their data and code instead of just saying "I did AI magic!" to sell a product (and they do charge for their product on their website).
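As a rough illustration of why an undisclosed label split matters, the sketch below scores a trivial "classifier" that always predicts the majority class. The 95/5 split is purely hypothetical, chosen to mirror the headline figure; it says nothing about the article's actual data.

```python
from collections import Counter

# Hypothetical, deliberately imbalanced label set (assumption, not the OP's data).
y_true = ["real"] * 95 + ["fake"] * 5

# Baseline that ignores the input and always predicts the most common label.
majority_label, _ = Counter(y_true).most_common(1)[0]
y_pred = [majority_label] * len(y_true)

# Overall accuracy of the baseline.
correct = sum(t == p for t, p in zip(y_true, y_pred))
baseline_accuracy = correct / len(y_true)

# Recall on the minority "fake" class: how many fake articles were caught.
fake_total = sum(t == "fake" for t in y_true)
fake_caught = sum(t == "fake" and p == "fake" for t, p in zip(y_true, y_pred))
fake_recall = fake_caught / fake_total

print(f"majority label: {majority_label}")
print(f"baseline accuracy: {baseline_accuracy:.2f}")
print(f"recall on 'fake': {fake_recall:.2f}")
```

Under this assumed split the do-nothing baseline reaches 95% accuracy while catching no fake articles at all, so an accuracy figure only means something next to the class distribution it was measured on.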

richdougherty over 7 years ago

> I found myself drifting in my own interpretation of fake news, getting angry as I came across articles that I didn’t agree with, fighting hard against the urge to only pick ones I thought were right. What was right or wrong anyway?

A good question and I'm not surprised he went a bit crazy.

https://plato.stanford.edu/entries/truth/

> The problem of truth is in a way easy to state: what truths are, and what (if anything) makes them true. But this simple statement masks a great deal of controversy. Whether there is a metaphysical problem of truth at all, and if there is, what kind of theory might address it, are all standing issues in the theory of truth. We will see a number of distinct ways of answering these questions.

thetall0ne over 7 years ago

The model is not based on domains, just the text of the article. Can confirm there was an even number of real and notreal news examples. The data set was eventually broken into two categories: written with bias, or without bias. For example, a NYT Opinion piece was considered notreal news.

txsh over 7 years ago

He’s not detecting fake news. He’s detecting articles that match the writing style of a handful of publications and labeling everything else “fake”.

peterwwillis over 7 years ago

What the....? The author describes a "fake news detector AI" that is actually a "typically legitimate source of news" data model, combined with a fake news domain blacklist. It doesn't detect fake news. It detects whether a story possibly came from a source you find to typically be legitimate. This article is fake news.

tantalor over 7 years ago

Where's the demo?

mirekrusin over 7 years ago

He needs to release/train at least 3 versions with whitelist-blacklist variations for RT, Al Jazeera, and Fox News.

How I trained fake news detection AI with 95% accuracy, and almost went crazy
