If you stretched out all the DNA in a single human cell, it would be more than 6 feet long. But only a sliver of that DNA makes proteins, the biological machinery for life. In 2003, the Human Genome Project quantified just how little: Only 1% to 2% of our DNA — about 1.5 inches out of those 6.5 feet — encodes genes for protein. The noncoding sequences that make up the other 98% are often referred to as "junk DNA," a term coined in 1972 by the geneticist Susumu Ohno, who suggested that just as the fossil record is full of extinct species, our genomes are full of extinct or badly copied genes damaged by mutations.
But even though 98% of the genome is noncoding, it isn't precisely dead weight. In 2012, a consortium of hundreds of scientists working on the Encyclopedia of DNA Elements (ENCODE) project reported that at least 80% of the genome is "active" in the sense that some of the DNA is being transcribed into RNA, even if that RNA isn't then being translated into proteins. Still, there's little evidence that most of this RNA from broken genes does anything.
However, some of the noncoding sequences, making up roughly 8% to 15% of our DNA, aren't junk at all. They serve important purposes, regulating which genes in cells are active and how much protein they produce. Researchers are still discovering new ways that noncoding DNA does this, but it's clear that human biology is massively influenced by the noncoding regions, which don't directly code for proteins but still mold their production. Mutations in these regions, for example, have been linked to diseases or disorders as varied as autism, tremors and liver dysfunction.
Moreover, by comparing human genomes to those of chimpanzees and other animals, scientists have learned that noncoding regions may be a big part of what makes us uniquely human: It's possible that gene regulation by noncoding DNA differentiates species more than genes and proteins themselves do.
Researchers are also finding that new mutations can sometimes confer new abilities on noncoding sequences, which makes them a kind of resource for future evolution. As a result, what deserves the label "junk DNA" can be controversial. Scientists have clearly started to clean out the junk drawer since 1972 — but how much to keep in there is still up for debate.
What's New and Noteworthy
Scientists have been working to understand a type of noncoding DNA known as "transposons" or "jumping genes." These snippets can hop around the genome, making copies of themselves that sometimes get inserted into sequences of DNA. Transposons have increasingly been found to be critical to tuning gene expression, or determining which coding genes get turned on to be transcribed into RNA and then translated into proteins. In part for that reason, they are proving to be important to an organism's development and survival. When researchers engineered mice to lack transposons, half of the animals' pups died before birth. Transposons have also left marks on the evolution of life. Quanta has reported that they can jump between species, such as from herring to smelt and from snakes to frogs, sometimes even providing some kind of benefit, for instance by protecting fish from freezing in ice-cold waters.
Geneticists are also investigating "short tandem repeats," in which a stretch of DNA only one to six base pairs long is heavily repeated, sometimes dozens of times in a row. Scientists suspected that they help regulate genes because these sequences, which make up about 5% of the human genome, have been linked to conditions like Huntington's disease and cancer. In a study covered by Quanta in February, researchers unraveled one possible way that short tandem repeats could regulate genes: by helping to convene transcription factors, which then help turn on the protein-making machinery.
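To make the definition of a short tandem repeat concrete, here is a minimal sketch of how one might spot such runs in a string of DNA letters. It is an illustration only, not the method used in the study: the function name, the `min_copies` threshold and the toy sequence are all assumptions made for this example, and real repeat-finding tools also handle imperfect repeats and sequencing errors.

```python
import re


def find_short_tandem_repeats(sequence, min_copies=5, max_unit=6):
    """Return (start, unit, copies) for each back-to-back run of a short unit.

    Hypothetical helper for illustration; min_copies is an arbitrary cutoff.
    """
    # ([ACGT]{1,6}?) lazily captures the shortest candidate unit, and
    # \1{n,} then demands at least min_copies - 1 further identical copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Toy sequence (made up): a three-letter unit repeated eight times in a row.
toy = "TTGA" + "CAG" * 8 + "TCCA"
print(find_short_tandem_repeats(toy))  # [(4, 'CAG', 8)]
```

The sketch reports where each run starts, which unit is repeated and how many copies sit side by side, which is the property that distinguishes a short tandem repeat from the same short sequence merely scattered around the genome.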
Then there are "pseudogenes," the remnants of working genes that were duplicated and then degraded by subsequent mutations. However, as Quanta reported in 2021, scientists have been finding that sometimes pseudogenes don't remain pseudo or junk; instead, they evolve new functions and become genetic regulators, sometimes even regulating the very gene from which they were copied.