Monday, April 10, 2023

Machine learning can easily produce false positives when the test set is wrongly used

Just et al., in Nature Human Behaviour, suggested that ML can identify suicidal ideation extremely well from fMRI, and there were plenty of reasons to be skeptical. Today a retraction and an analysis of what went wrong came out.



Read the refutation here https://www.nature.com/articles/s41562-023-01560-6 (the retracted paper is available here https://www.nature.com/articles/s41562-017-0234-y.epdf if you need it for didactic purposes)

So what went wrong? The authors apparently used the test data to select features. Obvious mistake. A reminder for everyone into ML: never use the test set for *anything* but testing.
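To see how much damage this kind of leakage can do, here is a minimal sketch on pure noise. It is not a reconstruction of the retracted paper's pipeline: the sample sizes, the mean-difference feature ranking, the nearest-centroid classifier and all function names are assumptions chosen for illustration. The labels are unrelated to the features, so an honest evaluation should land near chance.

```python
import random


def make_noise_data(n_samples, n_features, rng):
    """Gaussian noise features with alternating labels that carry no signal."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]
    return X, y


def class_means(X, y, rows, feats):
    """Per-class mean of each feature in `feats`, using only the samples in `rows`."""
    means = {}
    for label in (0, 1):
        members = [i for i in rows if y[i] == label]
        means[label] = [sum(X[i][f] for i in members) / len(members) for f in feats]
    return means


def select_features(X, y, rows, k):
    """Keep the k features whose class means differ most on the samples in `rows`."""
    all_feats = range(len(X[0]))
    means = class_means(X, y, rows, all_feats)
    gaps = [abs(a - b) for a, b in zip(means[0], means[1])]
    return sorted(all_feats, key=lambda f: gaps[f], reverse=True)[:k]


def accuracy(X, y, train_rows, test_rows, feats):
    """Nearest-centroid classifier: centroids from train rows, scored on test rows."""
    means = class_means(X, y, train_rows, feats)
    correct = 0
    for i in test_rows:
        dist = {
            label: sum((X[i][f] - m) ** 2 for f, m in zip(feats, means[label]))
            for label in (0, 1)
        }
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(test_rows)


def run_experiment(n_repeats=10, n_samples=40, n_features=2000, k=20, seed=0):
    """Average test accuracy with leaky vs. clean feature selection."""
    rng = random.Random(seed)
    leaky, clean = [], []
    for _ in range(n_repeats):
        X, y = make_noise_data(n_samples, n_features, rng)
        train_rows = list(range(n_samples // 2))
        test_rows = list(range(n_samples // 2, n_samples))
        # Leaky: the test samples take part in choosing the features.
        leaky_feats = select_features(X, y, train_rows + test_rows, k)
        # Clean: only the training samples are used to choose features.
        clean_feats = select_features(X, y, train_rows, k)
        leaky.append(accuracy(X, y, train_rows, test_rows, leaky_feats))
        clean.append(accuracy(X, y, train_rows, test_rows, clean_feats))
    return sum(leaky) / n_repeats, sum(clean) / n_repeats


leaky_acc, clean_acc = run_experiment()
print(f"leaky selection: {leaky_acc:.2f}  clean selection: {clean_acc:.2f}")
```

With many features and few samples, the leaky variant should report accuracy far above chance even though there is nothing to learn, while the clean variant should hover around 50%. The only difference between the two is which rows are allowed to influence feature selection.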

Only practical way to do so in medicine? Lock away the test set until the algorithm is registered.


Side note: it took 3 years to go through the process of demonstrating that the paper was wrong. Journals need procedures to accelerate this.

BTW, pay attention: Confound Removal in Machine Learning Leads to Leakage https://arxiv.org/abs/2210.09232

Now a quick survey (not by us, but by the source of inspiration for this post) for people in medical ML: which proportion of papers have some kind of leakage? https://twitter.com/KordingLab/status/1645140456655970304
