C-value

C-value is the amount, in picograms, of DNA contained within a haploid nucleus of a eukaryotic organism. In some cases (notably among diploid organisms), the terms C-value and genome size are used interchangeably; however, in polyploids the C-value may represent two or more genomes contained within the same nucleus. Greilhuber et al.^[1]

Origin of the term

Many authors have incorrectly assumed that the 'C' in "C-value" refers to "characteristic", "content", or "complement". Even among authors who have attempted to trace the origin of the term, there has been some confusion because Hewson Swift did not define it explicitly when he coined it in 1950.^[2] In his original paper, Swift appeared to use the designations "1C value", "2C value", etc., in reference to "classes" of DNA content (e.g., Gregory 2001,^[3] 2002^[4]); however, Swift explained in personal correspondence to Prof. Michael D. Bennett in 1975 that "I am afraid the letter C stood for nothing more glamorous than 'constant', i.e., the amount of DNA that was characteristic of a particular genotype" (quoted in Bennett and Leitch 2005^[5]). This is in reference to the report in 1948 by Vendrely and Vendrely of a "remarkable constancy in the nuclear DNA content of all the cells in all the individuals within a given animal species" (translated from the original French).^[6] Swift's study of this topic related specifically to variation (or lack thereof) among chromosome sets in different cell types within individuals, but his notation evolved into "C-value" in reference to the haploid DNA content of individual species and retains this usage today.

Variation among species

C-values vary enormously among species. In animals they range more than 3,300-fold, and in land plants they differ by a factor of about 1,000.

organ complexity, geographical distribution, or extinction risk (for recent reviews, see Bennett and Leitch 2005;^[5] Gregory 2005^[7]).

The C-value enigma or C-value paradox is the complex puzzle surrounding the extensive variation in nuclear

humans.

Some prefer the term C-value enigma because it explicitly includes all of the questions that will need to be answered if a complete understanding of genome size evolution is to be achieved (Gregory 2005). Moreover, the term paradox implies a lack of understanding of one of the most basic features of eukaryotic genomes: namely that they are composed primarily of non-coding DNA. Some have claimed that the term paradox also has the unfortunate tendency to lead authors to seek simple one-dimensional solutions to what is, in actuality, a multi-faceted puzzle.^[8] For these reasons, in 2003 the term "C-value enigma" was endorsed in preference to "C-value paradox" at the Second Plant Genome Size Discussion Meeting and Workshop at the Royal Botanic Gardens, Kew, UK,^[8] and an increasing number of authors have begun adopting this term.

C-value paradox

In 1948, Roger and Colette Vendrely reported a "remarkable constancy in the nuclear DNA content of all the cells in all the individuals within a given animal species",

salamanders may contain 40 times more DNA than those of humans.^[11] Given that C-values were assumed to be constant because genetic information is encoded by DNA, and yet bore no relationship to presumed gene number, this was understandably considered paradoxical; the term "C-value paradox" was used to describe this situation by C.A. Thomas Jr. in 1971.

The discovery of

transposable elements).^[12]

C-value enigma

The term "C-value enigma" represents an update of the more common but outdated term "C-value paradox" (Thomas 1971), being ultimately derived from the term "C-value" (Swift 1950) in reference to haploid nuclear DNA contents. The term was coined by Canadian biologist Dr. T. Ryan Gregory of the University of Guelph in 2000/2001. In general terms, the C-value enigma relates to the issue of variation in the amount of non-coding DNA found within the genomes of different eukaryotes.

The C-value enigma, unlike the older C-value paradox, is explicitly defined as a series of independent but equally important component questions, including:

What types of non-coding DNA are found in different eukaryotic genomes, and in what proportions?
From where does this non-coding DNA come, and how is it spread and/or lost from genomes over time?
What effects, or perhaps even functions, does this non-coding DNA have for organisms?
Why do some species exhibit remarkably streamlined chromosomes, while others possess massive amounts of non-coding DNA?

Calculating C-values

Table 1: Relative Molecular Masses of Nucleotides†
Nucleotide	Chemical formula	Relative molecular mass (Da)
2′-deoxyadenosine 5′-monophosphate	C₁₀H₁₄N₅O₆P	331.2213
2′-deoxythymidine 5′-monophosphate	C₁₀H₁₅N₂O₈P	322.2079
2′-deoxyguanosine 5′-monophosphate	C₁₀H₁₄N₅O₇P	347.2207
2′-deoxycytidine 5′-monophosphate	C₉H₁₄N₃O₇P	307.1966

†Source of table: Doležel et al., 2003.^[13]

The formulas for converting the number of nucleotide pairs (or base pairs) to picograms of DNA and vice versa are:^[13]

genome size (bp) = (0.978 × 10⁹) × DNA content (pg)
DNA content (pg) = genome size (bp) / (0.978 × 10⁹)
1 pg = 978 Mbp
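As a minimal sketch, the two conversions can be written as a pair of Python functions. The function names and the constant name are illustrative choices and do not come from the cited source.

```python
# Conversion factor from the formulas above: 1 pg of DNA ≈ 0.978 × 10⁹ bp.
BP_PER_PG = 0.978e9  # illustrative constant name


def pg_to_bp(dna_content_pg: float) -> float:
    """Convert DNA content in picograms to genome size in base pairs."""
    return dna_content_pg * BP_PER_PG


def bp_to_pg(genome_size_bp: float) -> float:
    """Convert genome size in base pairs to DNA content in picograms."""
    return genome_size_bp / BP_PER_PG


if __name__ == "__main__":
    # Example values chosen only for illustration.
    print(f"{pg_to_bp(1.0) / 1e6:.0f} Mbp per pg")
    print(f"{bp_to_pg(3.0e9):.3f} pg for a 3.0 Gbp genome")
```

The factor assumes a GC content of 50%, so results for real genomes are approximations.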

By using the data in Table 1, relative masses of nucleotide pairs can be calculated as follows: A/T = 615.383 and G/C = 616.3711, bearing in mind that formation of one phosphodiester linkage involves a loss of one H₂O molecule. Further, phosphates of nucleotides in the DNA chain are acidic, so at physiologic pH the H⁺ ion is dissociated. Provided the ratio of A/T to G/C pairs is 1:1 (the GC-content is 50%), the mean relative mass of one nucleotide pair is 615.8771.
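The arithmetic behind these pair masses can be sketched in Python as follows. The relative masses used for water (18.0153) and hydrogen (1.0079) are standard values assumed here; they are not stated in the text, so the last decimal place may differ slightly from the published figures.

```python
# Relative molecular masses of the four nucleotides (Table 1).
DAMP, DTMP, DGMP, DCMP = 331.2213, 322.2079, 347.2207, 307.1966

# Assumed standard values (not given in the text above).
WATER = 18.0153    # H2O lost per phosphodiester linkage
HYDROGEN = 1.0079  # H+ dissociated from each phosphate at physiologic pH


def pair_mass(nucleotide_a: float, nucleotide_b: float) -> float:
    """Relative mass of one nucleotide pair within double-stranded DNA.

    Each of the two nucleotides loses one H2O (linkage) and one H+ (phosphate).
    """
    return nucleotide_a + nucleotide_b - 2 * (WATER + HYDROGEN)


at_pair = pair_mass(DAMP, DTMP)
gc_pair = pair_mass(DGMP, DCMP)
mean_pair = (at_pair + gc_pair) / 2  # 1:1 ratio of A/T to G/C pairs

print(f"A/T = {at_pair:.3f}, G/C = {gc_pair:.3f}, mean = {mean_pair:.3f}")
```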

The relative molecular mass may be converted to an absolute value by multiplying it by the atomic mass unit (1 u) in picograms. Thus, 615.8771 is multiplied by 1.660539 × 10⁻¹² pg. Consequently, the mean mass per nucleotide pair would be 1.023 × 10⁻⁹ pg, and 1 pg of DNA would represent 0.978 × 10⁹ base pairs (978 Mbp).^[13]

No species has a GC-content of exactly 50% (equal amounts of A/T and G/C nucleotide pairs), as assumed by Doležel et al. However, because a G/C pair is heavier than an A/T pair by only about 1/6 of 1%, the effect of variation in GC content is small. The actual GC content varies between species, between chromosomes, and between isochores (sections of a chromosome with similar GC content). Adjusting Doležel's calculation for GC content, the theoretical number of base pairs per picogram ranges from 977.0317 Mbp/pg for 100% GC content to 978.6005 Mbp/pg for 0% GC content (A/T pairs, being lighter, give more Mbp/pg), with a midpoint of 977.8155 Mbp/pg for 50% GC content.
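The GC adjustment described above can be sketched as a single function that weights the two pair masses by GC fraction and inverts the mass per base pair. The function name `mbp_per_pg` is a hypothetical helper; the pair masses and the atomic mass unit are the values quoted in this section.

```python
ATOMIC_MASS_UNIT_PG = 1.660539e-12  # 1 u in picograms, as used above
AT_PAIR_MASS = 615.383              # relative mass of an A/T pair
GC_PAIR_MASS = 616.3711             # relative mass of a G/C pair


def mbp_per_pg(gc_fraction: float = 0.5) -> float:
    """Megabase pairs per picogram of DNA for a given GC fraction (0 to 1)."""
    if not 0.0 <= gc_fraction <= 1.0:
        raise ValueError("gc_fraction must lie between 0 and 1")
    # Mean relative mass of one nucleotide pair, weighted by GC content.
    mean_pair_mass = gc_fraction * GC_PAIR_MASS + (1.0 - gc_fraction) * AT_PAIR_MASS
    pg_per_bp = mean_pair_mass * ATOMIC_MASS_UNIT_PG
    return 1.0 / pg_per_bp / 1e6


if __name__ == "__main__":
    for gc in (0.0, 0.5, 1.0):
        print(f"GC {gc:.0%}: {mbp_per_pg(gc):.4f} Mbp/pg")
```

Because the density is the reciprocal of a linear function of GC content, it is not itself exactly linear in GC content, although over this narrow range the difference is negligible.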

Human C-values

The human genome^[14] varies in size; however, the current estimate of the nuclear haploid size of the reference human genome^[15] is 3,031,042,417 bp for the X gamete and 2,932,228,937 bp for the Y gamete. The X gamete and Y gamete both contain 22 autosomes, whose combined lengths make up the majority of the genome in both gametes. The X gamete contains an X chromosome, while the Y gamete contains a Y chromosome. The larger size of the X chromosome is responsible for the difference in the size of the two gametes. When the gametes are combined, the XX female zygote has a size of 6,062,084,834 bp while the XY male zygote has a size of 5,963,271,354 bp. However, the base pairs of the XX female zygote are distributed among 2 homologous groups of 23 heterologous chromosomes each, while the base pairs of the XY male zygote are distributed among 2 homologous groups of 22 heterologous chromosomes each plus 2 heterologous chromosomes. Although each zygote has 46 chromosomes, 23 chromosomes of the XX female zygote are heterologous while 24 chromosomes of the XY male zygote are heterologous. As a result, the C-value for the XX female zygote is 3.099361 pg while the C-value for the XY male zygote is 3.157877 pg.

The human genome's GC content is about 41%.^[16] Accounting for the autosomal, X, and Y chromosomes,^[17] human haploid GC contents are 40.97460% for X gametes, and 41.01724% for Y gametes.

Summarizing these numbers:

Table 2: Human Genome Size
Cell	Chromosome description	Type	Ploidy	Base pairs (bp)	GC content (%)	Density (Mbp/pg)	Mass (pg)	C-value (pg)
Sperm or egg	23 heterologous chromosomes	X Gamete	Haploid	3,031,042,417	40.97460%	977.9571	3.099361	3.099361
Sperm only	23 heterologous chromosomes	Y Gamete	Haploid	2,932,228,937	41.01724%	977.9564	2.998323	2.998323
Zygote	46 chromosomes consisting of 2 homologous sets of 23 heterologous chromosomes each	XX Female	Diploid	6,062,084,834	40.97460%	977.9571	6.198723	3.099361
Zygote	46 chromosomes consisting of 2 homologous sets of 22 heterologous chromosomes each plus 2 heterologous chromosomes	XY Male	Mostly diploid	5,963,271,354	40.99557%	977.9567	6.097684	3.157877
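The Mass column of Table 2 can be reproduced from the base-pair counts and GC contents with a short Python sketch. The function name `dna_mass_pg` and the dictionary layout are illustrative assumptions; the constants are the pair masses and atomic mass unit given under "Calculating C-values".

```python
ATOMIC_MASS_UNIT_PG = 1.660539e-12  # 1 u in picograms
AT_PAIR_MASS = 615.383              # relative mass of an A/T pair
GC_PAIR_MASS = 616.3711             # relative mass of a G/C pair


def dna_mass_pg(base_pairs: int, gc_percent: float) -> float:
    """Mass in picograms of double-stranded DNA with the given GC content."""
    gc = gc_percent / 100.0
    mean_pair_mass = gc * GC_PAIR_MASS + (1.0 - gc) * AT_PAIR_MASS
    return base_pairs * mean_pair_mass * ATOMIC_MASS_UNIT_PG


# (base pairs, GC content in percent), copied from Table 2.
GENOMES = {
    "X gamete": (3_031_042_417, 40.97460),
    "Y gamete": (2_932_228_937, 41.01724),
    "XX zygote": (6_062_084_834, 40.97460),
    "XY zygote": (5_963_271_354, 40.99557),
}

masses = {name: dna_mass_pg(bp, gc) for name, (bp, gc) in GENOMES.items()}

for name, mass in masses.items():
    print(f"{name}: {mass:.6f} pg")
```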

References

[1] Greilhuber et al. (2005). PMID 15596473.

[2] Swift (1950). PMID 14808154.

[3] Gregory (2001). PMID 11325054.

[4] Gregory (2002). PMID 11913657.

[5] Bennett MD, Leitch IJ (2005). "Genome size evolution in plants". In T.R. Gregory (ed.). The Evolution of the Genome. San Diego: Elsevier. pp. 89–162.

[6] Vendrely and Vendrely (1948). S2CID 22272730.

[7] Gregory T.R. (2005). "Genome size evolution in animals". In T.R. Gregory (ed.). The Evolution of the Genome. San Diego: Elsevier. pp. 3–87.

[8] "Second Plant Genome Size Discussion Meeting and Workshop". Archived from the original on 2008-12-01. Retrieved 2015-04-19.

[9] S2CID 22272730.

[10] ISBN 978-0544859937.

[11] "Animal Genome Size Database". Retrieved 14 May 2013.

[12] PMID 18514361.

[13] Doležel et al. (2003). PMID 12541287.

[14] PMID 11237011.

[15] "Assembly Statistics for GRCh38.p2". Genome Reference Consortium. 8 December 2014. Retrieved 8 February 2015.

[16] ISBN 978-3-540-37654-5. Archived from the original (PDF) on 24 September 2015. Retrieved 8 February 2015.

[17] Kokocinski, Felix. "Bioinformatics work notes". GC content of human chromosomes. Archived from the original on 10 February 2015. Retrieved 8 February 2015.

External links

Animal Genome Size Database

Plant DNA C-values Database

Fungal Genome Size Database


