Can scientific discoveries made using machine learning and big data be trusted? New research suggests not always. Photo by Stephen Shaver/UPI
Feb. 18 (UPI) -- Researchers at Rice University want scientists to continue double-checking discoveries made using machine learning.
Until machine-learning systems are capable of self-critique, the researchers warn, the systems' predictions can't be fully trusted.
"Work is underway on next-generation machine-learning systems that will assess the uncertainty and reproducibility of their predictions," Rice University statistician Genevera Allen said in a news release.
Machine-learning systems are designed to make predictions about future data given what they've learned by analyzing current datasets.
"A lot of these techniques are designed to always make a prediction," Allen said. "They never come back with 'I don't know,' or 'I didn't discover anything,' because they aren't made to."
Machine-learning systems are currently used to develop cancer drugs targeting patients with similar genomic profiles.
"People have applied machine learning to genomic data from clinical cohorts to find groups, or clusters, of patients with similar genomic profiles," Allen said.
But often, these studies produce uncorroborated results -- findings that aren't easily replicated.
"The clusters discovered in one study are completely different than the clusters found in another," Allen said. "Why? Because most machine-learning techniques today always say, 'I found a group.' Sometimes, it would be far more useful if they said, 'I think some of these are really grouped together, but I'm uncertain about these others.'"
Allen shared her analysis of machine-learning systems and their biases at the 2019 Annual Meeting of the American Association for the Advancement of Science, held over the weekend in Washington, D.C.