DNA Data Storage

By Luca Grittini, S5ENA, ESF

Deoxyribonucleic acid (DNA) carries our genetic code. It is made up of molecules known as nucleotides, each of which has a phosphate group (these link up to form the backbone that joins nucleotides together), a pentose sugar and a nitrogenous base. There are four types of nucleotide, which differ only in their nitrogenous base: adenine (A), cytosine (C), guanine (G) and thymine (T).

Each link in the DNA double helix consists of two nucleotides whose nitrogenous bases are held together by hydrogen bonds. Adenine (A) always bonds with thymine (T) and cytosine (C) always bonds with guanine (G), forming two kinds of base pair: A-T and C-G. This is where it gets interesting. Traditional computers use the binary system, i.e., 1s and 0s, to represent text, images, audio, etc. We can, for example, replace these 1s and 0s with the two base pairs: the C-G pair can represent 0 while the A-T pair can represent 1.
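To make the idea concrete, here is a minimal Python sketch of that mapping. It is only an illustration of the scheme described above, not how real DNA storage systems work in practice, and the names BIT_TO_BASE, BASE_TO_BIT, encode, complement and decode are made up for this example. It assumes that one strand is written with C for every 0 and A for every 1, so that the opposite strand automatically carries the matching G or T.

```python
# Illustrative only: bit 0 -> C-G pair, bit 1 -> A-T pair, as in the text.
# We write one strand (C for 0, A for 1); its partner strand is the complement.
BIT_TO_BASE = {"0": "C", "1": "A"}
# Either base of a pair identifies the pair, so both strands decode the same way.
BASE_TO_BIT = {"C": "0", "G": "0", "A": "1", "T": "1"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def encode(text: str) -> str:
    """Turn text into a strand of bases, one base per bit."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(BIT_TO_BASE[bit] for bit in bits)


def complement(strand: str) -> str:
    """Return the partner strand (A<->T, C<->G)."""
    return "".join(COMPLEMENT[base] for base in strand)


def decode(strand: str) -> str:
    """Turn a strand of bases back into text, 8 bases per byte."""
    bits = "".join(BASE_TO_BIT[base] for base in strand)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")
```

For example, the letter "A" is 01000001 in binary, so encode("A") gives the strand CACCCCCA, whose partner strand is GTGGGGGT. Decoding either strand gives "A" back, because each base on its own tells us which base pair it belongs to.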

Using this mechanism, digital information can be encoded into DNA, stored, and later read back by sequencing, instead of being kept in large server rooms. DNA is so good at storing digital information that some say 1 gram of the stuff could contain up to 17 exabytes (1 exabyte = 1000 petabytes). So, with this technology, we would not have to make compromises about what to keep and what to throw away. We could save everything from cat videos to research papers. In addition to storing information at very high density, DNA has an estimated half-life of 521 years, meaning that it is exceptionally durable and can stay intact for an extended period.
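These two numbers are easier to appreciate with a little arithmetic. The Python sketch below simply plugs in the figures quoted above (17 exabytes per gram and a 521-year half-life). It assumes that DNA breaks down by simple exponential decay, which is a simplification, and the names fraction_intact and grams_needed are invented for this example.

```python
# Figures quoted in the article; treat them as rough estimates.
HALF_LIFE_YEARS = 521
BYTES_PER_GRAM = 17 * 10**18  # 17 exabytes, with 1 exabyte = 10**18 bytes


def fraction_intact(years: float) -> float:
    """Fraction of the DNA expected to remain after a number of years,
    assuming simple exponential decay (an assumption of this sketch)."""
    return 0.5 ** (years / HALF_LIFE_YEARS)


def grams_needed(num_bytes: float) -> float:
    """Mass of DNA needed to hold num_bytes at the quoted density."""
    return num_bytes / BYTES_PER_GRAM
```

With these assumptions, fraction_intact(521) is 0.5 and fraction_intact(1042) is 0.25, so a quarter of the DNA would still be intact after more than a thousand years. Likewise, grams_needed(10**15) shows that a whole petabyte would need only about 0.00006 grams of DNA.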

It is worth noting that DNA storage is not without flaws. The technology needed to write data into DNA and to sequence it back out is expensive and inefficient. But these encoding and sequencing techniques keep advancing, and it is only a matter of time before digital storage in DNA is the new norm.

Thank you for reading! I would love to hear your opinions on the topic in the comment section.