Researchers at the Whitehead Institute for Biomedical Research in Massachusetts say that, armed with just a computer, an Internet connection and publicly accessible online resources, they were able to identify nearly 50 individuals who had submitted genetic material as participants in genomic studies.
Under certain circumstances, the full names and identities of genomic research participants can be determined even when their genetic information is held in databases in de-identified form, a Whitehead release said Thursday.
The Whitehead researchers said they used publicly available genetic information and an algorithm they developed to identify some of the people who donated their DNA to the 1000 Genomes Project.
They analyzed unique genetic markers on the Y chromosomes of men whose genetic material was collected. Because the Y chromosome is transmitted from father to son, as are family surnames, there is a strong correlation between surnames and the DNA on the Y chromosome.
Publicly accessible databases house Y chromosome data by surname. With surnames in hand, the researchers queried other information sources, including Internet record search engines, obituaries, genealogical websites and public demographic data, to identify nearly 50 people in the United States who were participants in the genomes project.
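The surname-inference step can be illustrated with a toy sketch. This is not the researchers' algorithm, whose details are not given here. It assumes a small, invented table of Y chromosome marker profiles indexed by surname, and it guesses a surname by counting how many markers differ between a query profile and each record. The function names, the mismatch threshold and all data values are hypothetical.

```python
from collections import Counter

# Hypothetical surname-indexed Y chromosome records: (surname, {marker: repeat count}).
# Every surname and value here is invented for illustration only.
RECORDS = [
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}),
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 13}),
    ("Jones", {"DYS19": 15, "DYS390": 23, "DYS391": 10, "DYS393": 12}),
    ("Brown", {"DYS19": 16, "DYS390": 25, "DYS391": 11, "DYS393": 14}),
]


def marker_distance(a, b):
    """Count shared markers whose repeat counts differ (infinite if none are shared)."""
    shared = a.keys() & b.keys()
    if not shared:
        return float("inf")
    return sum(1 for marker in shared if a[marker] != b[marker])


def guess_surname(profile, records, max_mismatches=1):
    """Return the most common surname among close matches, or None if there are none."""
    hits = Counter(
        surname
        for surname, markers in records
        if marker_distance(profile, markers) <= max_mismatches
    )
    if not hits:
        return None
    return hits.most_common(1)[0][0]


query = {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}
print(guess_surname(query, RECORDS))
```

In this sketch a surname is only a lead. As the researchers describe, narrowing a surname down to one person required combining it with other public records.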
"This is an important result that points out the potential for breaches of privacy in genomics studies," said Whitehead's Yaniv Erlich, who led the research team that published the study in the journal Science.
"We also hope that this study will eventually result in better security algorithms, better policy guidelines and better legislation to help mitigate some of the risks described."