May 25 (UPI) -- Researchers have found a way to train a medical diagnostics algorithm without sharing patient data, according to a new study.
Medical researchers are beginning to use machine learning algorithms to help analyze diagnostic images like X-rays and MRI scans, but these artificial intelligence systems must be trained with real data -- a tall task if patients' privacy rights are to be adequately protected.
Previously, scientists have tried to anonymize the data, but some critics suggest current privacy protection strategies are insufficient.
"These processes have often proven inadequate in terms of protecting patients' health data," Daniel Rueckert, professor of artificial intelligence in healthcare and medicine at the Technical University of Munich in Germany, said in a press release.
For the new study -- published this week in the journal Nature Machine Intelligence -- scientists trained an algorithm designed to identify pneumonia in pediatric X-ray images by loaning it out to other medical institutions.
By sharing the algorithm instead of collecting health data, researchers ensured private data never left the clinics where they were collected.
"For our algorithm we used federated learning, in which the deep learning algorithm is shared -- and not the data," said first author Alexander Ziller, researcher at the TUM Institute of Radiology.
"Our models were trained in the various hospitals using the local data and then returned to us. Thus, the data owners did not have to share their data and retained complete control," Ziller said.
To prevent users from deducing the institutions where the algorithm was trained, scientists used a method called secure aggregation.
"We combined the algorithms in encrypted form and only decrypted them after they were trained with the data of all participating institutions," said project leader and first author Georgios Kaissis of the TUM Institute of Medical Informatics, Statistics and Epidemiology.
Researchers also used what's known as differential privacy to prevent individual data from being pulled from the dataset.
"Ultimately, statistical correlations can be extracted from the data records, but not the contributions of individual persons," Kaissis said.
The authors of the new study acknowledged that while their techniques aren't new, the efforts mark the first time such methods have been used for large-scale machine learning with real clinical data.
When researchers pitted the results of their federated learning algorithm against the interpretations of specialized radiologists, they found the model was as good as or better than humans at diagnosing different types of pneumonia in children.
If adopted by other researchers, the federated learning methods detailed in the latest study could help relieve privacy concerns and encourage greater cooperation among hospitals, clinics and other medical institutions, the researchers said.
They also cautioned, however, that algorithms, whether they're designed to predict hurricane pathways or screen for cancer, are only as good as their data.
"And we can only obtain these data by properly protecting patient privacy," said Rueckert. "This shows that, with data protection, we can do much more for the advancement knowledge than many people think."