Junk DNA (non-functional DNA) is a DNA sequence that has no relevant biological function.
transposons and viruses—but it is possible that some organisms have substantial amounts of junk DNA.[3]
All protein-coding regions of genes are generally considered functional elements of genomes. Several kinds of non-protein-coding regions are generally considered functional as well: genes for ribosomal RNA and transfer RNA, regulatory sequences controlling expression of those genes, origins of replication (in all species), and centromeres, telomeres, and scaffold attachment regions (in eukaryotes). (See Non-coding DNA for more information.)
It is difficult to determine whether other regions of the genome are functional or nonfunctional. There is considerable controversy over which criteria should be used to identify function. Many scientists have an evolutionary view of the genome and they prefer criteria based on whether DNA sequences are preserved by natural selection.[4][5][6] Other scientists dispute this view or have different interpretations of the data.[7][8][9]
The history of junk DNA
The idea that only a fraction of the human genome could be functional dates back to the late 1940s. The estimated mutation rate in humans suggested that if a large fraction of those mutations were deleterious, the human species could not survive such a mutation load (genetic load). This led J.B.S. Haldane, one of the founders of population genetics, and Nobel laureate Hermann Muller to predict that only a small percentage of the human genome contains functional DNA elements (genes) that can be destroyed by mutation.[10][11] (See Genetic load for more information.)
In 1966 Muller reviewed these predictions and concluded that the human genome could only contain about 30,000 genes based on the number of deleterious mutations that the species could tolerate.[12] Similar predictions were made by other leading experts in molecular evolution who concluded that the human genome could not contain more than 40,000 genes and that less than 10% of the genome was functional.[13][14][4][15]
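The genetic load argument can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions, not the method used by Haldane or Muller: it assumes the standard textbook approximation that, with independent and multiplicative effects, equilibrium mean fitness relative to a mutation-free individual is exp(-U), where U is the number of new deleterious mutations per genome per generation. The function names and the input value of 100 new mutations per generation are hypothetical and chosen for illustration only.

```python
import math

# NOTE: every numeric input here is an illustrative assumption,
# not a figure taken from the sources cited in this article.


def deleterious_mutations_per_generation(new_mutations, functional_fraction,
                                         deleterious_share=1.0):
    """Return U: new mutations per generation that hit functional DNA
    and are harmful (deleterious_share is the assumed harmful share)."""
    return new_mutations * functional_fraction * deleterious_share


def mean_relative_fitness(u):
    """Approximate equilibrium mean fitness relative to a mutation-free
    individual, exp(-U), assuming independent multiplicative effects."""
    return math.exp(-u)


def offspring_per_pair_for_replacement(u):
    """Offspring a pair would need so that, on average, two survive
    selection against deleterious mutations (simplified assumption)."""
    return 2.0 / mean_relative_fitness(u)


if __name__ == "__main__":
    assumed_new_mutations = 100  # hypothetical per-generation count
    for fraction in (0.01, 0.1, 0.5, 1.0):
        u = deleterious_mutations_per_generation(assumed_new_mutations, fraction)
        needed = offspring_per_pair_for_replacement(u)
        print(f"functional fraction {fraction:>4}: U = {u:6.1f}, "
              f"offspring per pair needed = {needed:.3g}")
```

Under these assumptions the required number of offspring grows exponentially with the functional fraction of the genome, which is the qualitative point of the argument: a genome in which most sites are functional would impose a load that no human population could bear.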
The size of genomes in various species was known to vary considerably and there did not seem to be a correlation between genome size and the complexity of the species. Even closely related species could have very different genome sizes. This observation led to what came to be known as the C-value paradox.[16] The paradox was resolved with the discovery of repetitive DNA and the observation that most of the differences in genome size could be attributed to repetitive DNA.[16][17] Some scientists thought that most of the repetitive DNA was involved in regulating gene expression but many scientists thought that the excess repetitive DNA was nonfunctional.[18][16][19][20][21]
At about the same time (late 1960s), the newly developed technique of C0t analysis was refined to include RNA:DNA hybridization. This led to the discovery that considerably less than 10% of the human genome was complementary to mRNA and that this DNA was in the unique (non-repetitive) fraction. The finding confirmed the predictions made from genetic load arguments and was consistent with the idea that much of the repetitive DNA is nonfunctional.[22][23][24]
The idea that large amounts of eukaryotic genomes could be nonfunctional conflicted with the prevailing view of evolution in 1968, since it seemed likely that nonfunctional DNA would be eliminated by natural selection. The development of the neutral theory and the nearly neutral theory provided a way out of this problem, since these theories allowed for the preservation of slightly deleterious nonfunctional DNA in accordance with fundamental principles of population genetics.[14][13][25]
The term "junk DNA" began to be used in the late 1950s,[26] but Susumu Ohno popularized it in a 1972 paper titled "So much 'junk' DNA in our genome",[27] in which he summarized the evidence that had accumulated by then.[27] In a second paper that same year, he concluded that 90% of mammalian genomes consisted of nonfunctional DNA.[4] The case for junk DNA was summarized in a lengthy paper by David Comings in 1972, in which he listed four reasons for proposing junk DNA:[28]
some organisms have a lot more DNA than they seem to require (C-value paradox),
current estimates of the number of genes (in 1972) are much less than the number that can be accommodated,
the mutation load would be too large if all the DNA were functional, and
some junk DNA clearly exists.
The discovery of introns in the 1970s seemed to confirm the views of junk DNA proponents because it meant that genes were very large and that even huge genomes could not accommodate large numbers of genes. The proponents of junk DNA tended to dismiss intron sequences as mostly nonfunctional DNA (junk), but junk DNA opponents advanced a number of hypotheses attributing functions of various sorts to intron sequences.[29][30][31][32][33]