Machine learning: a friend or a foe for science?
How machine learning is affecting scientific reproducibility, and what we can do about it
Reproducibility is fundamental to scientific progress, but the increasing use of machine learning is undermining it. Why is reproducibility important? Why does the use of machine learning have this problematic side effect? How can we solve it?
Not everything that shines is a diamond
In 2016 the scientific journal Nature published the results of a survey in which 1,576 researchers answered a brief questionnaire about reproducibility in research. More than 70% of the scientists had tried and failed to reproduce another researcher’s experiment. More than 50% of the researchers in the survey said that there is a reproducibility crisis.
The problem concerns all scientific disciplines, from medicine to biology, from economics to physics, from psychology to chemistry. The scientists named two main causes: pressure to publish (“publish or perish”) and selective reporting. Others pointed out that low statistical power and technical difficulties can also play a role. In fact, the p-value and other statistical methods are under scrutiny, as researchers look for better ways to analyze data.
Researchers reported that fewer than 40% of attempts to replicate academic findings succeed. Moreover, many undergraduate students in the laboratory are frustrated by failed replications, which can lead to burnout. In addition, even when a scientist was able to replicate the findings, the results were often much less impressive than in the original paper (the effect size was much smaller than reported). In fact, what looks like a breakthrough finding often turns out to be a far more modest result.
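To see how selective reporting and low statistical power can, together, make a replicated effect look smaller than the published one, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than data from the survey: the true effect of 0.2, the 20 samples per group, the known standard deviation of 1, and the rule that only results with p < 0.05 get "published". The function name `simulate_published_effects` is hypothetical.

```python
import random
import statistics


def simulate_published_effects(true_effect=0.2, n_per_group=20,
                               n_studies=5000, seed=0):
    """Simulate many small two-group studies and keep only the significant ones.

    All parameter values are illustrative assumptions, not survey data.
    """
    rng = random.Random(seed)
    # Standard error of a difference in means when both groups have sd = 1.
    se = (2 / n_per_group) ** 0.5
    all_effects, published = [], []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        effect = statistics.fmean(treated) - statistics.fmean(control)
        all_effects.append(effect)
        # Two-sided z-test at the 5% level (known variance, a simplification).
        if abs(effect / se) > 1.96:
            published.append(effect)
    return all_effects, published


all_effects, published = simulate_published_effects()
print(f"true effect:            {0.2:.2f}")
print(f"mean over all studies:  {statistics.fmean(all_effects):.2f}")
print(f"mean of published only: {statistics.fmean(published):.2f}")
print(f"share reaching p<0.05:  {len(published) / len(all_effects):.0%}")
```

Under these assumptions, the average over all simulated studies stays close to the true effect, while the average over the "published" subset is considerably larger. A faithful replication would then tend to find a smaller effect than the original paper, even though nobody made an error.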
“The definition of insanity is doing the same thing over and over again and expecting different results.” (attributed to Einstein)