March 2 (UPI) -- DNA is nature's hard drive, capable of storing, replicating and transmitting massive amounts of information. Researchers in New York found a way to use DNA like an actual computer hard drive, successfully storing, replicating and retrieving several digital files.
A pair of scientists from Columbia University and the New York Genome Center selected five files -- including a computer operating system and computer virus -- and compressed them into a master file. They transcribed the master file into short strings of binary code, combinations of ones and zeros.
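As a rough illustration of that first step, the sketch below bundles several files into one compressed archive and cuts it into short, equal-length binary strings. The function names, the tar-plus-zlib packaging and the 32-byte segment size are assumptions made for this example, not details confirmed by the article.

```python
import io
import tarfile
import zlib

SEGMENT_BYTES = 32  # assumed segment size, chosen only for illustration


def build_master_file(files: dict[str, bytes]) -> bytes:
    """Bundle several named files into one compressed archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return zlib.compress(buffer.getvalue())


def split_into_segments(master: bytes, size: int = SEGMENT_BYTES) -> list[bytes]:
    """Cut the master file into equal-length segments, zero-padding the last.

    A real scheme would also record the original length so the padding
    can be removed after decoding.
    """
    padded = master + b"\x00" * (-len(master) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def to_bit_string(segment: bytes) -> str:
    """Render a segment as a string of ones and zeros."""
    return "".join(f"{byte:08b}" for byte in segment)
```

Calling `split_into_segments(build_master_file({"a.txt": b"hello"}))` yields a list of 32-byte segments, each of which `to_bit_string` turns into 256 ones and zeros.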
The researchers then randomly compiled the strings into so-called droplets using fountain codes. The droplets were translated into four DNA nucleotide bases -- A, G, C and T. The erasure-correcting algorithm ensured no letter combinations known to cause errors were used, and also assigned a barcode to each droplet to aid file retrieval and reassembly.
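The droplet step can be sketched in simplified form. In the code below, each droplet is the XOR of a randomly chosen handful of segments, prefixed with the random seed that picked them, so the seed doubles as the barcode. Every two bits become one nucleotide, and strands with long single-letter runs or lopsided G+C content are thrown away. The bit-to-base mapping, the screening thresholds and the uniform choice of one to three segments are assumptions for illustration. Real fountain codes draw the number of segments from a carefully tuned distribution, which this sketch does not reproduce.

```python
import random

BASES = "ACGT"            # assumed mapping: 00->A, 01->C, 10->G, 11->T
MAX_RUN = 3               # assumed limit on identical consecutive bases
GC_RANGE = (0.45, 0.55)   # assumed acceptable fraction of G and C
SEED_BYTES = 4            # assumed barcode length


def bytes_to_dna(data: bytes) -> str:
    """Translate each pair of bits into one nucleotide."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def passes_screen(strand: str) -> bool:
    """Reject long single-base runs and unbalanced G+C content."""
    if any(base * (MAX_RUN + 1) in strand for base in BASES):
        return False
    gc = sum(strand.count(base) for base in "GC") / len(strand)
    return GC_RANGE[0] <= gc <= GC_RANGE[1]


def make_droplet(segments: list[bytes], seed: int) -> bytes:
    """XOR a seed-determined random subset of segments; prefix the seed."""
    rng = random.Random(seed)
    degree = rng.randint(1, min(3, len(segments)))  # simplified stand-in
    chosen = rng.sample(range(len(segments)), degree)
    payload = bytearray(len(segments[0]))
    for index in chosen:
        for i, byte in enumerate(segments[index]):
            payload[i] ^= byte
    return seed.to_bytes(SEED_BYTES, "big") + bytes(payload)


def encode(segments: list[bytes], wanted: int, max_tries: int = 100_000) -> list[str]:
    """Generate droplets until `wanted` strands pass the screen or tries run out."""
    seed_source = random.Random(0)
    strands: list[str] = []
    for _ in range(max_tries):
        if len(strands) == wanted:
            break
        seed = seed_source.getrandbits(8 * SEED_BYTES)
        strand = bytes_to_dna(make_droplet(segments, seed))
        if passes_screen(strand):
            strands.append(strand)
    return strands
```

Because any droplet that fails the screen is simply discarded and another is drawn, the encoder never has to repair a bad sequence. It only needs to keep drawing until enough acceptable strands exist.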
The coding process produced 72,000 DNA strands, each 200 bases long. Researchers sent the DNA file to Twist Bioscience, a startup in San Francisco that turns digital DNA into biological DNA. Two weeks later, the company sent the researchers a vial containing their DNA strands.
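A quick back-of-the-envelope calculation puts those numbers in perspective. With four possible bases, each position can carry at most two bits, so the strand count and length reported above set a ceiling on how much data the vial could hold. The figure below is only that theoretical ceiling. The usable payload would be smaller, since barcodes and other overhead take up some of the bases.

```python
STRANDS = 72_000         # from the article
BASES_PER_STRAND = 200   # from the article
BITS_PER_BASE = 2        # four bases -> log2(4) bits, a theoretical ceiling

total_bases = STRANDS * BASES_PER_STRAND
max_bits = total_bases * BITS_PER_BASE
max_bytes = max_bits // 8

print(f"{total_bases:,} bases")              # 14,400,000 bases
print(f"{max_bytes / 1e6:.1f} MB ceiling")   # 3.6 MB ceiling
```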
Researchers Yaniv Erlich and Dina Zielinski used standard DNA sequencing to re-digitize their DNA. A special program helped them translate the nucleotide sequences back into binary code. They recovered their files with zero errors.
According to the pair's calculations -- detailed in the journal Science -- their method can pack 215 petabytes of data into a single gram of DNA, a new record.
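The translation back from letters to bits can be sketched as the mirror image of the encoding. The example below assumes a two-bits-per-base mapping (A=00, C=01, G=10, T=11) and a 16-base barcode at the front of each strand. Both are illustrative choices, not details given in the article. Reassembling the original files from the recovered droplets is a further step that this sketch leaves out.

```python
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}  # assumed mapping
BARCODE_BASES = 16  # assumed: 4 barcode bytes at 4 bases per byte


def dna_to_bytes(strand: str) -> bytes:
    """Pack every four nucleotides back into one byte."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASE_TO_BITS[base]
        out.append(byte)
    return bytes(out)


def split_read(strand: str) -> tuple[int, bytes]:
    """Separate a sequenced strand into its barcode and its payload."""
    barcode = int.from_bytes(dna_to_bytes(strand[:BARCODE_BASES]), "big")
    payload = dna_to_bytes(strand[BARCODE_BASES:])
    return barcode, payload
```

For instance, `dna_to_bytes("ACGT")` gives the single byte `0x1b`, whose bits are 00 01 10 11.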
"We believe this is the highest-density data-storage device ever created," Erlich, a computer science professor at Columbia Engineering, said in a news release.
The scientists also showed the DNA strands -- and the embedded files -- could be copied virtually without limit through polymerase chain reaction without introducing any coding errors.
Though DNA synthesis is currently quite pricey, the costs may shrink as technologies improve.
"We can do more of the heavy lifting on the computer to take the burden off time-intensive molecular coding," Erlich said.