User:Genome42/sandbox

Source: Wikipedia, the free encyclopedia.

Junk DNA

Junk DNA is DNA that does not have a function. The term was used in the 1960s[1] but became popular with the publication of the 1972 paper "So much 'junk' DNA in our genome" by Susumu Ohno.[2]

Functional DNA is DNA that is currently under purifying selection. This is the definition used by Dan Graur in his textbook "Molecular and Genome Evolution."

"Functional DNA refers to any segment in the genome whose selected-effect function is that for which it was selected and/or by which it is maintained. Most functional sequences are maintained by purifying selection."[3]

This definition of function is called the maintenance function.[4][5] From this it follows that nonfunctional DNA, or junk DNA, is any segment in the genome that is NOT maintained by purifying selection. Many similar definitions have been published but they all have in common the idea that junk DNA is DNA that does not have a function and this means that it is not under negative selective pressure.[6][7][8]


How much of the human genome is junk?

Most of this article is about the human genome but the arguments for function and junk apply to other genomes.

The data on functional and nonfunctional DNA elements in the human genome is covered in many other articles so this is just a brief summary.

Genes (main article genes) There are approximately 20,000 protein-coding genes in the human genome. The number of noncoding genes is disputed with values ranging from about 5,000 to more than 100,000.

Arguments against junk DNA

Some scientists are convinced that junk DNA does not exist. For example, Peter Larsen declared in 2018 that,

"There is no such thing as 'junk DNA.' Indeed, a suite of discoveries made over the past few decades have put to rest this misnomer and have identified many important roles that so-called junk DNA provides to both genome and function."[9]

This is a widely held point of view although most of these authors don't explain why obvious examples of junk DNA, such as pseudogenes and broken bits of transposons, don't qualify as junk DNA.


Mutation load

The idea of excess DNA in some species started with the realization that the expected number of mutations in a species would lead to extinction if the entire genome were full of functional DNA. This is a reference to mutation load, also called genetic load.

By the late 1960s it was apparent that much of the DNA in humans had to be invisible to mutations and only a small percentage could be devoted to genes and other functional elements. The connection between the mutation load argument and junk DNA appeared in the paper by Susumu Ohno in 1972 where he said,

"All in all, it appears that the calculations made by Muller, Kimura and others are not far off the mark and that at least 90% of our genomic DNA is 'junk' or 'garbage' of various sorts."[10]


The C-Value Paradox


Mutation load

.... see "Gene: Mutation" ....

Most mutations are due to DNA replication errors. The DNA replication complex is highly accurate, and newly replicated DNA will have only about one error for every 10 billion base pairs replicated (10⁻¹⁰ per bp per replication); the estimates in various publications range from 10⁻⁹ to 10⁻¹¹.[13][14][15][16][17] The overall replication error rate is the product of (1) the intrinsic error rate of the polymerization reaction, (2) the fraction of those errors that escape proofreading, and (3) the fraction that escape correction by repair enzymes following DNA replication.[13]

The extraordinary accuracy of DNA replication means that mutations will be rare in unicellular organisms with small genomes, such as bacteria. However, when mutations occur they will likely affect genes since genes take up a large part of the bacterial genome. Such mutations have a good chance of being deleterious.

The overall DNA replication error rate applies to all cell divisions in multicellular organisms.[18] This means a much greater chance of a mutation being passed on to the daughter cells in various tissues in species with large genomes. In humans, for example, an overall error rate of 10⁻¹⁰ means that there will be 0.62 mutations every time a cell divides (assuming cells are diploid and a genome size of 3.1 × 10⁹ bp). Spontaneous somatic cell mutations are responsible for many human diseases, including cancer.[19][Note 1]
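The 0.62 figure follows directly from the numbers quoted above. The following is a minimal sketch of that arithmetic, using only the error rate (10⁻¹⁰ per bp per replication), the haploid genome size (3.1 × 10⁹ bp), and the assumption of diploid cells; the function and variable names are illustrative and do not come from any cited source.

```python
def mutations_per_division(error_rate_per_bp, haploid_genome_bp, ploidy=2):
    """Expected number of new mutations each time one cell copies its DNA.

    error_rate_per_bp : overall replication error rate (per bp per replication)
    haploid_genome_bp : size of one copy of the genome in base pairs
    ploidy            : number of genome copies replicated (2 for diploid cells)
    """
    return error_rate_per_bp * haploid_genome_bp * ploidy


# Values taken from the text above.
expected = mutations_per_division(1e-10, 3.1e9)
print(f"{expected:.2f} mutations per cell division")
```

Changing the error rate to either end of the published range (10⁻⁹ or 10⁻¹¹) shifts the result by a factor of ten in each direction, which shows how sensitive the estimate is to that one parameter.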

In multicellular species, the mutation rate per generation can be calculated from the DNA replication error rate knowing the number of cell divisions that occur in germline cells.[20][14] It can also be observed directly by sequencing the genomes of each parent and their offspring. These two values agree in humans, leading to an estimate of about 100 new mutations in every newborn baby.[16][Note 2] Mutation rates can also be calculated by comparing the genome sequences of two closely related species, such as humans and chimpanzees, and these rates are roughly the same as those obtained by the two other methods.[21][22][23][24][25]

Since the phylogenetic rate only measures the neutral mutation rate, the agreement of the three estimates means that most of the human and chimpanzee genomes is evolving at the neutral rate, an observation that's consistent with the idea that most of the genome is junk.

Genes occupy about 45% of the human genome so in every newborn child there will be approximately 45 new mutations in genes and 55 new mutations elsewhere. If a large fraction of those mutations were deleterious then the human species could not survive such a mutation load (genetic load). This led to predictions in the late 1940s by one of the founders of population genetics, J.B.S. Haldane, and by Nobel laureate Hermann Muller, that only a small percentage of the human genome contains functional DNA elements that can be destroyed by mutation.[26][27]
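The 45/55 split can be reproduced with a short calculation. This sketch assumes, as the text implicitly does, that new mutations land uniformly across the genome, so the share falling in genes equals the share of the genome occupied by genes; the function name is illustrative.

```python
def split_new_mutations(total_new_mutations, genic_fraction):
    """Split new mutations between genes and the rest of the genome.

    Assumes mutations are spread uniformly over the genome, so the
    expected share inside genes equals the genic fraction of the genome.
    """
    in_genes = total_new_mutations * genic_fraction
    elsewhere = total_new_mutations - in_genes
    return in_genes, elsewhere


# About 100 new mutations per newborn, with genes covering about 45% of the genome.
in_genes, elsewhere = split_new_mutations(100, 0.45)
print(f"in genes: {in_genes:.0f}, elsewhere: {elsewhere:.0f}")
```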

In 1966 Muller reviewed these predictions and concluded that the human genome could only contain about 30,000 genes based on the known mutation rate and the number of deleterious mutations that the species could tolerate.[28] Similar predictions were made by other leading experts in molecular evolution who concluded that the human genome could not contain more than 40,000 genes and that less than 10% of the genome was functional.[20][29][30] These predictions were confirmed with the publication of the human genome sequence.

The connection between the mutation load argument and junk DNA appeared in a paper by Susumu Ohno in 1972 where he said,

"All in all, it appears that the calculations made by Muller, Kimura and others are not far off the mark and that at least 90% of our genomic DNA is 'junk' or 'garbage' of various sorts."[31]

Several hundred thousand human genomes have been sequenced, making it possible to analyze the regions that are subject to purifying selection, that is, sequences that seem to be protected from mutations because such mutations are very deleterious. The results show that only a small percentage of the genome (less than 10%) seems to be functional by this criterion. Less than half of the sites subject to purifying selection lie within genes and these are concentrated in coding regions, the regions specifying functional noncoding RNAs, and intron splice sites. Other sites subject to purifying selection include regulatory sequences.[32][33][34]

Notes

  1. ^ The somatic cell mutation rate in humans is probably higher than the germline rate. (Lynch, 2010; Soek et al., 2017; Ju et al., 2017; Tomasetti and Vogelstein, 2017)
  2. ^ Males contribute more mutations than females and the number of mutations increases with the age of the father. (Jónsson et al., 2017)

Transposon-related sequences

Selfish DNA

It's important to note that selfish DNA is functional DNA although its function resides at the level of the gene and not the individual. This means that selfish DNA is not junk DNA and the two terms are not synonyms.[35]

Molecular evolution

The early proponents of junk DNA were well aware of the controversy they were initiating and how it would affect those whose standard view of evolution was restricted to natural selection. For example, Thomas Jukes wrote the following in a letter to Francis Crick in 1979.[36]

"I am sure that you realize how frightfully angry a lot of people will be if you say that much of the DNA is junk. The geneticists will be angry because they think that DNA is sacred. The Darwinian evolutionists will be outraged because they believe every change in DNA that is accepted in evolution is necessarily an adaptive change. To suggest anything else is an insult to the sacred memory of Darwin."

DRAFT SECTION

Junk DNA stub for Non-Coding DNA

Junk DNA is DNA that has no biologically relevant function, such as pseudogenes and fragments of once active transposons. Bacterial genomes have very little junk DNA[37] but some eukaryotic genomes may have a substantial amount of junk DNA.[38] The exact amount of nonfunctional DNA in humans and other species with large genomes has not been determined and there is considerable controversy in the scientific literature.[39][40] See the article on Junk DNA for more information.

The nonfunctional DNA in bacterial genomes is mostly located in the intergenic fraction of non-coding DNA but in eukaryotic genomes it may also be found within introns (see Introns). It's important to note that there are many examples of functional DNA elements in non-coding DNA (see above) and there are no scientists who claim that all non-coding DNA is junk.

Origin of introns (Feb. 25, 2023)

(Added to INTRON on Feb. 25, 2023)

The current view is that following the formation of the first eukaryotic cell, group II introns from the bacterial endosymbiont invaded the host genome. In the beginning these self-splicing introns excised themselves from the mRNA precursor but over time some of them lost that ability and their excision had to be aided in trans by other group II introns. Eventually a number of specific trans-acting introns evolved and these became the precursors to the snRNAs of the spliceosome. The efficiency of splicing was improved by association with stabilizing proteins to form the primitive spliceosome.[41][42][43][44]

Remove Prokaryotic Cell Diagram

A label diagram explaining the different parts of a Prokaryotic genome

The prokaryotic cell diagram was created by Ulissesrp and inserted into this article on May 29, 2020.

The label under the diagram says, "A label diagram explaining the different parts of a prokaryotic genome" but the diagram does not explain the different parts of a prokaryotic genome. Instead it describes different parts of a prokaryotic cell, and some of those labels are incorrect.

The prokaryotic cell diagram stacks on top of the Genetics series banner, which should normally be at the top of the article.

I propose deleting the prokaryotic cell diagram in a few days unless anyone objects.

Are introns mostly junk?

The size of individual introns is not conserved in closely related species and the average length of introns varies widely, from less than 100 nucleotides in some species to more than several thousand nucleotides in plants and mammals.[44] As a general rule, the length of introns correlates with genome size, suggesting that expansions and contractions of genome size affect introns and intergenic regions equally. Thus, the arguments for junk DNA apply to introns as well as the rest of the genome.


[45]

[46]

[47]

Highly repetitive DNA

Highly repetitive DNA consists of short stretches of DNA that are repeated many times in tandem (one after the other). The repeat segments are usually between 2 bp and 10 bp but longer ones are known. Highly repetitive DNA is rare in prokaryotes but common in eukaryotes, especially those with large genomes. It is sometimes called satellite DNA.

Most of the highly repetitive DNA is found in centromeres and telomeres (see above) and most of it is functional although some might be redundant. The other significant fraction resides in short tandem repeats (STRs; also called microsatellites) consisting of short stretches of a simple repeat such as ATC. There are about 350,000 STRs in the human genome and they are scattered throughout the genome with an average length of about 25 repeats.[48][49]

Variations in the number of STR repeats can cause genetic diseases when they lie within a gene but most of these regions appear to be non-functional junk DNA where the number of repeats can vary considerably from individual to individual. This is why these length differences are used extensively in DNA fingerprinting.


Untranslated regions

The standard biochemistry and molecular biology textbooks describe non-coding nucleotides in mRNA located between the 5' end of the gene and the translation initiation codon. These regions are called 5'-untranslated regions or 5'-UTRs. Similar regions called 3'-untranslated regions (3'-UTRs) are found at the end of the gene. The 5'-UTRs and 3'-UTRs are very short in bacteria but they can be several hundred nucleotides in length in eukaryotes. They contain short elements that control the initiation of translation (5'-UTRs) and transcription termination (3'-UTRs) as well as regulatory elements that may control mRNA stability, processing, and targeting to different regions of the cell.[50][51][52]

Defining the genome

It's very difficult to come up with a precise definition of "genome." It usually refers to the DNA (or sometimes RNA) molecules that carry the genetic information in an organism but sometimes it is difficult to decide which molecules to include in the definition; for example, bacteria usually have one or two large DNA molecules (chromosomes) that contain all of the essential genetic material but they also contain smaller extrachromosomal plasmid molecules that carry important genetic information. The definition of 'genome' that's commonly used in the scientific literature is usually restricted to the large chromosomal DNA molecules in bacteria.[53]

Eukaryotic genomes are even more difficult to define because almost all eukaryotic species contain nuclear chromosomes plus extra DNA molecules in the mitochondria. In addition, algae and plants have chloroplast DNA. Most textbooks make a distinction between the nuclear genome and the organelle (mitochondria and chloroplast) genome so when they speak of, say, the human genome they are only referring to the genetic material in the nucleus.[54][55] This is the most common usage of 'genome' in the scientific literature.

Most eukaryotes are diploid, meaning that there are two copies of each chromosome in the nucleus but the 'genome' refers to only one copy of each chromosome. Some eukaryotes have distinctive sex chromosomes such as the X and Y chromosomes of mammals so the technical definition of the genome must include both copies of the sex chromosomes. When referring to the standard reference genome of humans, for example, it consists of one copy of each of the 23 autosomes plus one X chromosome and one Y chromosome.[56]

Conflicting definitions of 'gene'

There are many different ways to use the term "gene" based on different aspects of their inheritance, selection, biological function, or molecular structure but most of the definitions fall into two categories, the Mendelian gene or the molecular gene (Orgogozo et al., 2016).[57][58][59]

The Mendelian gene is the classical gene of genetics and it refers to any heritable trait. This is the gene described in "The Selfish Gene" (Dawkins). More thorough discussions of this version of a gene can be found in the articles on Genetics and Gene-centered view of evolution. This article focuses on the molecular gene, the gene that's described in terms of DNA sequence. There are many different definitions of this gene, some of which are misleading or incorrect.



There are lots of different ways to use the term "gene." Richard Dawkins, for example, wrote a book called "The Selfish Gene"[60] where 'gene' simply meant any part of the chromosome that was subject to natural selection. This 'gene' is often referred to as the "Mendelian gene" whereas the physical gene described in this article is called the "molecular gene." [61]

The very first edition of the textbook "Molecular Biology of the Gene" (1965) described two kinds of molecular gene: protein-coding genes and those that specified functional RNA molecules such as ribosomal RNA and tRNA (noncoding genes).[62] But the idea of two kinds of genes dates back to the late 1950s when Jacob and Monod speculated that regulatory genes might produce repressor RNAs.[63]

This idea of two kinds of genes is still part of the definition of a gene in most textbooks. For example,

"The primary function of the genome is to produce RNA molecules. Selected portions of the DNA nucleotide sequence are copied into a corresponding RNA nucleotide sequence, which either encodes a protein (if it is an mRNA) or forms a 'structural' RNA, such as a transfer RNA (tRNA) or ribosomal RNA (rRNA) molecule. Each region of the DNA helix that produces a functional RNA molecule constitutes a gene."[64]
"We define a gene as a DNA sequence that is transcribed. This definition includes genes that do not encode proteins (not all transcripts are messenger RNA). The definition normally excludes regions of the genome that control transcription but are not themselves transcribed. We will encounter some exceptions to our definition of a gene - surprisingly, there is no definition that is entirely satisfactory."[65]
"A gene is a DNA sequence that codes for a diffusible product. This product may be protein (as is the case in the majority of genes) or may be RNA (as is the case of genes that code for tRNA and rRNA). The crucial feature is that the product diffuses away from its site of synthesis to act elsewhere."[66]

The important parts of such definitions are: (1) that a gene corresponds to a transcription unit; (2) that genes produce both mRNA and noncoding RNAs; and (3) regulatory sequences control gene expression but are not part of the gene itself. However, there's one other important part of the definition and it is emphasized in Kostas Kampourakis' book "Making Sense of Genes."

"Therefore in this book I will consider genes as DNA sequences encoding information for functional products, be it proteins or RNA molecules. With 'encoding information,' I mean that the DNA sequence is used as a template for the production of an RNA molecule or a protein that performs some function."[57]

The emphasis on function is essential because there are stretches of DNA that produce non-functional transcripts and they don't qualify as genes. These include obvious examples such as transcribed pseudogenes as well as less obvious examples such as junk RNA produced as noise due to transcription errors. In order to qualify as a true gene, by this definition, one has to prove that the transcript has a biological function.[57]

Early speculations on the size of a typical gene were based on high resolution genetic mapping and on the size of proteins and RNA molecules. A length of 1500 base pairs seemed reasonable at the time (1965).[62] This was based on the idea that the gene was the DNA that was directly responsible for production of the functional product. The discovery of introns in the 1970s meant that many eukaryotic genes were much larger than the size of the functional product would imply. Typical mammalian protein-coding genes, for example, are about 62,000 base pairs in length (transcribed region) and since there are about 20,000 of them they occupy about 35-40% of the mammalian genome (including the human genome).[67][68][69]

In spite of the fact that both protein-coding genes and noncoding genes have been known for more than 50 years, there are still a number of textbooks, websites, and scientific publications that define a gene as a DNA sequence that specifies a protein. In other words, the definition is restricted to protein-coding genes. Here's an example from a recent article in American Scientist.

What Is a Gene, Really?
... to truly assess the potential significance of de novo genes, we relied on a strict definition of the word "gene" with which nearly every expert can agree. First, in order for a nucleotide sequence to be considered a true gene, an open reading frame (ORF) must be present. The ORF can be thought of as the "gene itself"; it begins with a starting mark common for every gene and ends with one of three possible finish line signals. One of the key enzymes in this process, the RNA polymerase, zips along the strand of DNA like a train on a monorail, transcribing it into its messenger RNA form. This point brings us to our second important criterion: A true gene is one that is both transcribed and translated. That is, a true gene is first used as a template to make transient messenger RNA, which is then translated into a protein.[70]

This restricted definition is so common that it has spawned many recent articles that criticize this "standard definition" and call for a new expanded definition that includes noncoding genes.[71][72][73] However, this so-called "new" definition has been around for more than half a century and it's not clear why some modern writers are ignoring noncoding genes.

There are exceptions to the standard definition of a gene; for example, some viruses have an RNA genome. The one important exception concerns bacterial operons where a contiguous stretch of DNA containing multiple protein-coding regions is transcribed into one large mRNA. Scientists usually refer to each of the coding regions as separate genes in this case. The only significant controversy over the definition of a gene is whether to include the regulatory sequences that control transcription of the gene. The general consensus among scientists is that regulatory elements control the expression of a gene but are not part of the gene.

Repeat sequences, transposons and viral elements

Virus DNA

There are two main types of viruses, DNA viruses and RNA viruses. Some RNA viruses are called retroviruses in eukaryotes because the RNA is 'retrotranscribed' into DNA as part of the life cycle. In prokaryotes, these viruses are called bacteriophage or phage.

Sometimes the viral genome can become incorporated into the host genome, either as part of the normal life cycle or by accident. The viral sequence will then be passed on to daughter cells following DNA replication and cell division. If the insertion occurs in the germ line of multicellular species then the viral genome will be inherited in the next generation and the viral DNA may become fixed in the genome by random genetic drift.[74]

The viral genome usually contains virus-specific genes that are transcribed and translated, which means that this DNA doesn't qualify as 'non-coding' in the strictest sense of the word, but, with some exceptions, the viral DNA evolves at the neutral rate of evolution[75] so it soon becomes non-functional and qualifies as junk DNA. The exceptions include a few retroviral genes that have secondarily become essential in the life of the host.[74]

DNA viruses and their degenerate descendants occupy about 3-4% of the human genome and RNA virus fragments take up about 9%.[76] Viral DNAs have inserted into introns and also the spaces between genes (intergenic DNA). Since introns take up a substantial portion of the genome, the viral DNA elements are about equally distributed between introns and intergenic DNA. [77]


Mobile genetic elements in the cell (left) and how they can be acquired (right)

Alu sequences, classified as a short interspersed nuclear element, are the most abundant mobile elements in the human genome. Some examples have been found of SINEs exerting transcriptional control of some protein-encoding genes.[78][79][80]

reverse transcription of retrovirus genomes into the genomes of germ cells. Mutation within these retro-transcribed sequences can inactivate the viral genome.[81]

Over 8% of the human genome is made up of (mostly decayed) endogenous retrovirus sequences, as part of the over 42% fraction that is recognizably derived from retrotransposons, while another 3% can be identified as the remains of DNA transposons. Much of the remaining half of the genome that is currently without an explained origin is expected to have originated in transposable elements that were active so long ago (> 200 million years) that random mutations have rendered them unrecognizable.[82] Genome size variation in at least two kinds of plants is mostly the result of retrotransposon sequences.[83][84]

Protein-coding genes

The human genome contains somewhere between 19,000 and 20,000 protein-coding genes. [85][86][87][88] These genes contain an average of 10 introns and the average size of an intron is about 6 kb (6,000 bp).[89] This means that the average size of a protein-coding gene is about 62 kb and these genes take up about 40% of the genome.[90]
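These figures can be checked against each other with a rough calculation. The sketch below uses the numbers quoted above (10 introns of about 6 kb each, an average gene of about 62 kb, roughly 20,000 genes) together with a haploid genome size of 3.1 × 10⁹ bp; the names are illustrative, and the 2 kb left over for exons is simply what the quoted averages imply, not an independently sourced value.

```python
def genome_fraction(n_genes, mean_gene_bp, genome_bp):
    """Fraction of the genome covered by n_genes of the given average length."""
    return n_genes * mean_gene_bp / genome_bp


INTRONS_PER_GENE = 10
MEAN_INTRON_BP = 6_000
MEAN_GENE_BP = 62_000      # average protein-coding gene, from the text
GENOME_BP = 3.1e9          # haploid human genome size used in this article

intron_bp_per_gene = INTRONS_PER_GENE * MEAN_INTRON_BP   # 60 kb of intron
exon_bp_per_gene = MEAN_GENE_BP - intron_bp_per_gene     # remainder implied for exons

fraction = genome_fraction(20_000, MEAN_GENE_BP, GENOME_BP)
print(f"intron: {intron_bp_per_gene} bp, exon: {exon_bp_per_gene} bp per gene")
print(f"protein-coding genes cover about {fraction:.0%} of the genome")
```

Using 19,000 genes instead of 20,000 lowers the result only slightly, so the 40% estimate is not very sensitive to the exact gene count.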


Exon sequences consist of coding DNA and untranslated regions (UTRs) at either end of the mature mRNA. The total amount of coding DNA is about 1-2% of the genome.[91][89]


Many people divide the genome into coding and non-coding DNA based on the idea that coding DNA is the most important functional component of the genome. About 98-99% of the human genome is non-coding DNA.

References

For references with author credit

{{cite web}}: Empty citation (help)

For references without author credit

{{cite web}}: Empty citation (help) access-date= 2023-02-28

Citing a symposium volume.

Bloggs, Fred (January 1, 2001). "Chapter 2: The History of the Bloggs Family". In Doe, John (ed.). Big Compilation Book with Many Chapters and Distinct Chapter Authors. Book Publishers. pp. 100–110.

Link to subsections within an article. Junk DNA section of Non-coding_DNA

This is the first citation to Alberts et al. 1994 textbook.[92] This is the second citation.[92]

Shortened footnote template (sfn). Refers to the first reference in the list that corresponds to the same author name and date (e.g. Gould (2002) pp. 1-10)[93]

Alberts et al. 1994 textbook[94]

Amaral et al. (2023) (human genome catalogue) [88]

Abascal et al. (2018) [85]

Besenbacher et al. (2019) (mutation rates in great apes)[95]

Bishop (1974)[96]

Britten and Davidson (1969)[97]

Britten and Kohne (1968)[98]

Brown (2018) (Genomes 4)[99]

Brown (2018) (Genomes 4: Chapt. 12 Transcriptomics)[100]

Brzović and Šustar (2020)[101]

Cavalier-Smith (1978)[102]

Cavalier-Smith (1980)[103]

Cavalier-Smith (1991) (introns)[104]

Comings (1972) (book)[105]

Comings (1972) (book review)[106]

Coyne (2009) [107]

Crick (1978)(introns)[108]

Dawkins (1976) (The Selfish Gene)[109]

Dawkins and Wong (2016)[110]

De Parseval and Heidmann (2005) (ERVs)[111]

Doolittle (1978)(introns)[112]

Doolittle (1991) (origin of introns) [113]

Doolittle (2013)[114]

Doolittle and Sapienza (1980)[115]

Dover (1980)[116]

Dover and Doolittle (1980)[117]

Dukler et al. (2022) (genetic load)[118]

Echols and Goodman (1991) (DNA replication)[119]

Eddy (2012)[120]

Elliot et al. (2014)[121]

ENCODE (2012)[122]

ENCODE cartoon[123]

ENCODE EMBL video[124]

ENCODE The Guardian video[125]

ENCODE Maher blog (2012)[126]

Ensembl Homo sapiens[127]

Francis and Wörheide (2017) (50% genes)[128]

Galeota-Sprung et al. (2020)[129]

Gericke and Hagberg (2007) (gene definitions)[130]

Germain et al. (2014)[131]

Gil and Latorre (2012) (junk DNA in bacteria)[132]

Gilbert (1978)(introns)[133]

Gilbert (1985)(introns)[134]

Gould (2002) [135]

Graur (2016) (textbook) [136]

Graur (2017)[137]

Graur et al. (2013)[138]

Graur et al. (2015)[139]

Gregory (2005)[140]

Gymrek et al. (2016) (STRs)[141]

Haldane (1949)[26]

Halldorson et al. (2022)(genetic load)[142]

Häsler et al. (2007) (Alus not junk)[143]

Hatje et al. (2019)[91]

Haerty and Ponting (2014)[144]

Hopkin (2009) (gene definition)[145]

Hoyt et al. (2022) (T2T sequence)[146]

Hubé and Francastel (2015) (introns)[147]

Irimia and Roy (2014) (origin of introns)[44]

Jain (1980)[148]

Jensen (2001) (orthologs and paralogs)[149]

Jensen et al. (2013) (pervasive transcription)[150]

Johnson (2019) (ERVs)[151]

Judson (1996) (The Eighth Day of Creation)[152]

Jukes (1979) (letter to Crick)[153]

Kampourakis (2017)[57]

Keightley (2012) (mutation rates)[154]

Kimura (1968)[20]

Kimura and Ohta (1971)[155]

King and Jukes (1969)[29]

Kirchberger et al. (2020) (bacterial genomes)[156]

Kronenberg et al. (2018) (great ape genomes)

Kunkel (2009) (DNA replication)[157]

Lander et al. (2001) (human genome)[158]

Larsen (2018)[159]

Lewin (1974)[160]

Lewin (1974b)[161]

Lewin (1974c)(Cell editorial)[162]

Lewin (2004) (Genes VIII)

Leypold and Speicher (2021) (sequence conservation)[163]

Linquist (2022)[164]

Linquist et al. (2020)[165]

Lynch (2016) (~100 mutations per newborn)[166]

Lynch et al. (2016) (mutation rate)[167]

Mattick (2023)[168]

Mattick and Dinger (2013)[169]

McHughen (2020)[170]

Moorjani et al. (2016) (primate molecular clock)[171]

Moran et al. (2012) (Principles of Biochemistry)

Morange (2014) (junk DNA controversy)[172]

Morange (2020)(intron history)[173]

Mortola and Long (2021) (gene definition/birth)[174]

Muller (1950)[27]

Muller, H.J. (1966)[28]

Nachman (2004) (mutation rate history)[175]

Neil and Fairbrother (2019) (intron function)[176]

Nelson et al. (2004)[177]

Nowak and Waclaw (2017) (review of mutations cause cancer)[178]

Niu and Jiang (2013)[179]

O'Brian (1973)[180]

Ohno (1972) (So much 'Junk' DNA)[181]

Ohno (1972) (Genetic Simplicity)[30]

Ohno (1972) (regulatory sequences)[182]

Ohno and Yomo (1991)[183]

Ohta (1973) [184]

Ohta (1998) [185]

Ohta and Kimura (1971)(30,000 genes)[186]

Omenn et al. (2020) [87]

Orgel and Crick (1980)[187]

Orgel, Crick and Sapienza (1980)[188]

Orgogozo et al. (2016) (Mendelian vs Molecular Gene)

Palazzo and Gregory (2014)[38]

Palazzo and Kejiou (2022) (molecular biologists)[189]

Palazzo and Lee (2015)[190]

Pearson (2006) (gene definition)[191]

Pennisi (2007) (gene definition)[192]

Piovesan et al. (2019)[193]

Piovesan et al. (2019) (length weight of human genome)[194]

Ponicsan et al. (2010) (SINE function)[195]

Ponting (2017)[196]

Ponting and Hardison (2011)[197]

Ponting and Haerty (2022)[198]


Ségurel et al. (2014) (mutation rates)[199]

Scally (2016) (human mutation rate)[200]

Scally and Durbin (2012) (human mutation rate)[201]

Sharp (1991) ("Five easy pieces")[202]

Sverdlov (2017) (junk RNA)[203]

Sweet (2022)(junk DNA history thesis)[204]

Thomas (1971) (C-value Paradox)[205]

Yu et al. (2002) (minimal introns not junk)[206]

van Bakel et al. (2011) (pervasive transcription)[207]

Wade and Grainger (2018) (spurious transcription)[208]

Walters et al. (2009) (SINE functions)[209]

Watson (1965) (Molecular Biology of the Gene)[210]

Wong et al. (2000) (are introns junk?)[211]

Zhou et al. (2021) (DNA replication)[212]

Press release: Yale 2012 [213]

  1. .
  2. ^ Ohno S (1972). "So much "junk" DNA in our genome". Brookhaven symposia in biology. 23: 366–370.
  3. .
  4. .
  5. doi:10.1371/journal.pgen.1008702.
  6. .
  7. .
  8. .
  9. .
  10. .
  11. .
  12. .
  13. ^ .
  14. ^ .
  15. .
  16. ^ .
  17. .
  18. .
  19. .
  20. ^ a b c Kimura, Motoo (1968). "Evolutionary rate at the molecular level" (PDF). Nature. 217: 624–626.
  21. .
  22. .
  23. doi:10.1371/journal.pbio.2000744.
  24. .
  25. .
  26. ^ .
  27. ^ a b Muller, Hermann J (1950). "Our load of mutations" (PDF). American journal of human genetics. 2: 111–175.
  28. ^ a b Muller HJ (1966). "The gene material as the initiator and the organizing basis of life". American Naturalist. 100: 493–517.
  29. ^ a b King JL, and Jukes TH (1969). "Non-Darwinian evolution". Science. 164: 788–798.
  30. ^ .
  31. .
  32. .
  33. .
  34. .
  35. .
  36. ^ "Thomas Jukes letter to Francis Crick". The Francis Crick Papers, National Library of Medicine (USA). Retrieved May 17, 2022.
  37. doi:10.3390/genes3040634.
  38. ^ PMID 24809441.
  39. .
  40. .
  41. .
  42. .
  43. ^ Sharp PA (1991). ""Five easy pieces."(role of RNA catalysis in cellular processes)". Science. 254: 663–664.
  44. ^ .
  45. .
  46. .
  47. .
  48. .
  49. .
  50. ^ Alberts B, Bray D, Lewis J, Raff M, Roberts K, Watson JD (1994). Molecular Biology of the Cell, 3rd edition. London, UK: Garland Publishing Inc.
  51. ^ Lewin B (2004). Genes VIII. Upper Saddle River, NJ, USA: Pearson/Prentice Hall.
  52. ^ Moran L, Horton HR, Scrimgeour KG, Perry MD (2012). Principles of Biochemistry Fifth Edition. Upper Saddle River, NJ, USA: Pearson.
  53. .
  54. .
  55. .
  56. ^ "Ensembl Human Assembly and gene annotation (GRCh38)". Ensembl. Retrieved May 30, 2022.
  57. ^ a b c d Kampourakis K (2017). Making Sense of Genes. Cambridge, UK: Cambridge University Press.
  58. .
  59. ^ Meunier, Robert (2022). "Stanford Encyclopedia of Philosophy: Gene". Stanford Encyclopedia of Philosophy. Retrieved 2023-02-28.
  60. ^ Dawkins R (1976). The selfish gene. Oxford, UK: Oxford University Press.
  61. ^ Orgogozo V, Peluffo AE, Morizot B (2016). "Chapter One-The “Mendelian Gene” and the “Molecular Gene”: Two Relevant Concepts of Genetic Units". Current topics in developmental biology. 119: 1–26. doi:10.1016/bs.ctdb.2016.03.002.
  62. ^ a b Watson JD (1965). Molecular Biology of the Gene. New York, NY, USA: W.A. Benjamin, Inc.
  63. ^ Judson HF (1996). The Eighth Day of Creation (Expanded Edition). Plainview, NY (USA): Cold Spring Harbor Laboratory Press.
  64. .
  65. ^ Moran LA, Horton HR, Scrimgeour KG, Perry MD (2012). Principles of Biochemistry: Fifth Edition. Upper Saddle River, NJ, USA: Pearson.
  66. ^ Lewin B (2004). Genes VIII. Upper Saddle River, NJ, USA: Pearson/Prentice Hall.
  67. doi:10.1186/s13104-019-4137-z.
  68. doi:10.3390/ijms16034429.
  69. .
  70. ^ Mortola E, Long M (2021). "Turning Junk into Us: How Genes Are Born". American Scientist. 109: 174–182.
  71. .
  72. ^ Pearson H (2006). "What Is a Gene?". Nature. 441: 399–401.
  73. .
  74. ^ .
  75. .
  76. doi:10.1126/science.abk3112.
  77. .
  78. .
  79. .
  80. .
  81. .
  82. .
  83. .
  84. .
  85. ^ .
  86. .
  87. ^ .
  88. ^ .
  89. ^ doi:10.1186/s13104-019-4343-8.
  90. .
  91. ^ .
  92. ^ .
  93. ^ Gould 2002, pp. 1–10.
  94. .
  95. .
  96. .
  97. .
  98. .
  99. .
  100. .
  101. .
  102. ^ Cavalier-Smith, Thomas (1978). "Nuclear volume control by nucleoskeletal DNA, selection for cell volume and cell growth rate, and the solution of the DNA C-value paradox". Journal of Cell Science. 34: 247–278.
  103. .
  104. .
  105. ^ Comings DE (1972). "The structure and function of chromatin". Advances in human genetics. Springer. pp. 237–431.
  106. ^ Comings, DE (1972). "Review of Evolution of Genetics Systems". American Journal of Human Genetics. 25: 340–342.
  107. OCLC 233549529.
  108. .
  109. ^ Dawkins R (1976). The selfish gene. Oxford, UK: Oxford University Press.
  110. ^ Dawkins R, and Wong Y (2016). "The Humped Bladderwort's Tale". The Ancestor's Tale 2nd ed. Weidenfeld & Nicolson.
  111. .
  112. .
  113. .
  114. .
  115. .
  116. ^ Dover, G (1980). "Ignorant DNA?". Nature. 285: 618–619.
  117. ^ Dover G, and Doolittle WF (1980). "Modes of genome evolution". Nature. 288: 646–647.
  118. .
  119. .
  120. .
  121. .
  122. doi:10.1038/nature11247.
  123. ^ "The Story of You: ENCODE and the human genome". YouTube. Nature/Illumina. 2012.
  124. ^ "ENCODE: Encyclopedia of DNA Elements". YouTube. European Molecular Biology Laboratories. 2012.
  125. ^ "What the Encode project tells us about the human genome and 'junk DNA'". YouTube. The Guardian. 2012.
  126. ^ Maher, Brendan. "Fighting about ENCODE and junk". Nature News Blog. Nature.
  127. ^ "Human assembly and gene annotation". Ensembl. 2022. Retrieved 2023-02-28.
  128. PMID 28633296.
  129. .
  130. .
  131. .
  132. doi:10.3390/genes3040634.
  133. .
  134. .
  135. .
  136. ^ Graur, Dan (2016). "Eukaryotic Genome Evolution". Molecular and Genome Evolution. Sinauer Associates, Inc.
  137. ^ Graur, Dan (2017). "Rubbish DNA: The functionless fraction of the human genome". In Saitou, Naruya (ed.). Evolution of the Human Genome I. Springer. pp. 19–60.
  138. PMID 23431001.
  139. .
  140. ^ Gregory, TR (2005). "Genome Size Evolution in Animals". The Evolution of the Genome. Elsevier. pp. 3–87.
  141. .
  142. .
  143. .
  144. .
  145. .
  146. .
  147. doi:10.3390/ijms16034429.
  148. ^ Jain, HK (1980). "Incidental DNA". Nature. 288: 647–648.
  149. doi:10.1186/gb-2001-2-8-interactions1002.
  150. .
  151. .
  152. ^ Judson HF (1996). The Eighth Day of Creation (Expanded Edition). Plainview, NY (USA): Cold Spring Harbor Laboratory Press.
  153. ^ "Thomas Jukes letter to Francis Crick". The Francis Crick Papers, National Library of Medicine (USA). 1979. Retrieved May 17, 2022.
  154. .
  155. .
  156. .
  157. .
  158. .
  159. .
  160. ^ Lewin, Benjamin (1974). "Chapter 4: Sequences of Eukaryotic DNA". Gene Expression-2: Eukaryotic Chromosomes. John Wiley & Sons.
  161. ^ Lewin, Benjamin (1974). "Chapter 5: Transcription and Processing of RNA". Gene Expression-2: Eukaryotic Chromosomes. John Wiley & Sons.
  162. .
  163. .
  164. .
  165. doi:10.1371/journal.pgen.1008702.
  166. .
  167. .
  168. .
  169. doi:10.1186/1877-6566-7-2.
  170. ^ McHughen A (2020). DNA Demystified: Unraveling the Double Helix. New York, New York, USA: Oxford University Press.
  171. doi:10.1371/journal.pbio.2000744.
  172. ^ Morange, Michel (2014). "Genome as a Multipurpose Structure Built by Evolution". Perspectives in Biology and Medicine. 57: 162–171. doi:10.1353/pbm.2014.0008.
  173. ^ Morange, Michel (2020). "Chapter 17: Split Genes and Splicing". The Black Box of Biology: A History of the Molecular Revolution. Harvard University Press.
  174. ^ Mortola E, Long M (2021). "Turning Junk into Us: How Genes Are Born". American Scientist. 109: 174–182.
  175. ^ Nachman, Michael W (2004). "Haldane and the first estimates of the human mutation rate". Journal of Genetics. 83: 231–233.
  176. .
  177. .
  178. .
  179. .
  180. ^ O'Brien, S.J. (1973). "On estimating functional gene number in eukaryotes". Nature New Biology. 242: 52–54.
  181. ^ Ohno S (1972). "So much "junk" DNA in our genome". Brookhaven symposia in biology. 23: 366–370.
  182. .
  183. .
  184. .
  185. ^ Ohta, Tomoko (1998). "Evolution by nearly-neutral mutations". Genetica. 102: 83–90.
  186. ^ Ohta T, and Kimura M (1971). "Functional organization of genetic material as a product of molecular evolution". Nature. 233: 118–119.
  187. .
  188. .
  189. doi:10.3389/fgene.2022.831068.
  190. doi:10.3389/fgene.2015.00002.
  191. ^ Pearson H (2006). "What Is a Gene?". Nature. 441: 399–401.
  192. .
  193. doi:10.1186/s13104-019-4343-8.
  194. doi:10.1186/s13104-019-4137-z.
  195. .
  196. doi:10.1186/s12915-017-0411-5.
  197. .
  198. .
  199. .
  200. .
  201. .
  202. ^ Sharp PA (1991). ""Five easy pieces."(role of RNA catalysis in cellular processes)". Science. 254: 663–664.
  203. .
  204. ^ Sweet, Amalia (2022). Requiem for a Gene: The Problem of Junk DNA for the Molecular Paradigm (MA). University of Chicago.
  205. .
  206. .
  207. doi:10.1371/journal.pbio.1001102.
  208. .
  209. .
  210. ^ Watson JD (1965). Molecular Biology of the Gene. New York, NY, USA: W.A. Benjamin, Inc.
  211. .
  212. .
  213. ^ Colleen Shaddox (2012). "Junk no more". Yale School of Medicine. Retrieved 2023-04-07. Hopefully, ENCODE will help put an end to the notion of junk DNA.