Take these results with a grain of salt. There's a large class imbalance in this dataset, and ROC curves can be misleading in this case. The test set contains 269 positive examples and 8482 negative examples.<p>From [1]:<p>> Class imbalance can cause ROC curves to be poor visualizations of classifier performance. For instance, if only 5 out of 100 individuals have the disease, then we would expect the five positive cases to have scores close to the top of our list. If our classifier generates scores that rank these 5 cases as uniformly distributed in the top 15, the ROC graph will look good (Fig. 4a). However, if we had used a threshold such that the top 15 were predicted to be true, 10 of them would be FPs, which is not reflected in the ROC curve. This poor performance is reflected in the PR curve, however.
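A quick way to see that effect with scikit-learn (toy numbers for the quoted 5-in-100 scenario, not the paper's data; the exact ranks I give the five positives are just one way to spread them through the top 15):

    import numpy as np
    from sklearn.metrics import roc_auc_score, average_precision_score

    # Toy version of the quoted scenario: 100 individuals, 5 positives,
    # and a classifier that spreads the 5 positives uniformly through the top 15 ranks.
    n = 100
    y = np.zeros(n, dtype=int)
    y[[2, 5, 8, 11, 14]] = 1              # positives at ranks 3, 6, 9, 12, 15
    scores = np.linspace(1.0, 0.0, n)     # strictly decreasing, so index == rank

    print(roc_auc_score(y, scores))            # ~0.94 -- the ROC view looks great
    print(average_precision_score(y, scores))  # ~0.33 -- the PR view does not
    # Calling the top 15 positive means precision 5/15; only the PR curve surfaces that.

And 269 positives against 8482 negatives (~3%) is even more skewed than the 5-in-100 example.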
<p>The authors seem to be aware of this in the supplement and also evaluate performance by a hazard ratio they define (sketched in code at the end of this comment):<p>> We calculated the ratio of the observed cancer incidence in the top 10% of patients over the incidence in the middle 80% and referred to this metric as the top decile hazard ratio. We calculated the ratio of the observed cancer incidence in the bottom 10% of patients over the incidence in the middle 80% and referred to this metric as the bottom decile hazard ratio.<p>However, binning is a form of p-hacking [2]. And I'm still wondering why they don't just post the Precision-Recall curves.<p>[1] <a href="https://doi.org/10.1038/nmeth.3945" rel="nofollow">https://doi.org/10.1038/nmeth.3945</a><p>[2] <a href="https://doi.org/10.1080/09332480.2006.10722771" rel="nofollow">https://doi.org/10.1080/09332480.2006.10722771</a><p>[Edit] to add link to [2]
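For what it's worth, the decile metric they describe boils down to something like this (a rough sketch, not the authors' code; the function name, the exact 10%/80%/10% cut, and ignoring ties are my assumptions):

    import numpy as np

    def decile_ratios(risk, outcome):
        """Observed cancer incidence in the top and bottom 10% of predicted
        risk, each divided by the incidence in the middle 80%."""
        order = np.argsort(risk)                   # ascending predicted risk
        outcome = np.asarray(outcome, dtype=float)[order]
        k = len(outcome) // 10                     # decile size (assumes >= 10 patients)
        bottom, middle, top = outcome[:k], outcome[k:-k], outcome[-k:]
        mid = middle.mean()
        return top.mean() / mid, bottom.mean() / mid

The 10/80/10 cut points are free parameters, which is exactly the binning concern above.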