Base pair

A base pair (bp) is a fundamental unit of double-stranded

genetic information encoded within each strand of DNA. The regular structure and data redundancy provided by the DNA double helix make DNA well suited to the storage of genetic information, while base-pairing between DNA and incoming nucleotides provides the mechanism through which DNA polymerase replicates DNA and RNA polymerase transcribes DNA into RNA. Many DNA-binding proteins can recognize specific base-pairing patterns that identify particular regulatory regions of genes.

Intramolecular base pairs can occur within single-stranded nucleic acids. This is particularly important in RNA molecules (e.g.,

structures. In addition, base-pairing between transfer RNA (tRNA) and messenger RNA (mRNA) forms the basis for the molecular recognition events that result in the nucleotide sequence of mRNA becoming translated into the amino acid sequence of proteins via the genetic code.

The size of an individual

TtC (trillion tons of carbon).^[8]

Hydrogen bonding and stability

Top, a G.C base pair with three hydrogen bonds. Bottom, an A.T base pair with two hydrogen bonds. Non-covalent hydrogen bonds between the bases are shown as dashed lines. The wiggly lines stand for the connection to the pentose sugar and point in the direction of the minor groove.

Hydrogen bonding is the chemical interaction that underlies the base-pairing rules described above. Appropriate geometrical correspondence of hydrogen bond donors and acceptors allows only the "right" pairs to form stably. DNA with high GC-content is more stable than DNA with low GC-content. Crucially, however, stacking interactions are primarily responsible for stabilising the double-helical structure. The contribution of Watson–Crick base pairing to global structural stability is minimal, but its role in the specificity underlying complementarity is of maximal importance, because this specificity underlies the template-dependent processes of the central dogma (e.g. DNA replication).^[9]

The bigger nucleobases, adenine and guanine, are members of a class of double-ringed chemical structures called purines; the smaller nucleobases, cytosine and thymine (and uracil), are members of a class of single-ringed chemical structures called pyrimidines. Purines are complementary only with pyrimidines: pyrimidine–pyrimidine pairings are energetically unfavorable because the molecules are too far apart for hydrogen bonding to be established; purine–purine pairings are energetically unfavorable because the molecules are too close, leading to overlap repulsion. Purine–pyrimidine base-pairing of AT or GC or UA (in RNA) results in proper duplex structure. The only other purine–pyrimidine pairings would be AC and GT and UG (in RNA); these pairings are mismatches because the patterns of hydrogen donors and acceptors do not correspond. The GU pairing, with two hydrogen bonds, does occur fairly often in RNA (see wobble base pair).
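The pairing rules in the paragraph above can be sketched as a small lookup. This is a minimal illustration, not a structural model: the function name `classify_pair` and the category labels are assumptions made for this example, and the sketch ignores whether a base belongs to DNA or RNA.

```python
# Bases grouped by ring structure, as described in the text.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}

# Canonical pairs (A-T, G-C, and A-U in RNA) and the G-U wobble pair.
WATSON_CRICK = {frozenset("AT"), frozenset("GC"), frozenset("AU")}
WOBBLE = {frozenset("GU")}


def classify_pair(a: str, b: str) -> str:
    """Return a label for the pairing of two single-letter bases (hypothetical helper)."""
    a, b = a.upper(), b.upper()
    valid = PURINES | PYRIMIDINES
    if a not in valid or b not in valid:
        raise ValueError(f"unknown base in pair: {a!r}, {b!r}")
    pair = frozenset((a, b))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    if a in PURINES and b in PURINES:
        return "purine-purine mismatch"
    if a in PYRIMIDINES and b in PYRIMIDINES:
        return "pyrimidine-pyrimidine mismatch"
    # Remaining cases are purine-pyrimidine pairs whose donor/acceptor
    # patterns do not correspond (for example A-C or G-T).
    return "purine-pyrimidine mismatch"
```

For example, `classify_pair("G", "C")` gives `"Watson-Crick"`, while `classify_pair("G", "T")` gives `"purine-pyrimidine mismatch"`.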

Paired DNA and RNA molecules are comparatively stable at room temperature, but the two nucleotide strands will separate above a

transcribed genes, are comparatively GC-poor (for example, see TATA box). GC content and melting temperature must also be taken into account when designing primers for PCR.^{[citation needed]}
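As a rough illustration of how GC content and melting temperature might be estimated for a short primer, the sketch below computes the GC fraction and applies the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C). The Wallace rule is not taken from this article; it is a common approximation intended only for short oligonucleotides, and the function names are assumptions for this example.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> float:
    """Approximate melting temperature in degrees Celsius (Wallace rule).

    Assumption: only a rough estimate for short oligonucleotides;
    it ignores salt concentration and nearest-neighbour effects.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


if __name__ == "__main__":
    primer = "ATCGATTGAGCTCTAGCG"
    print(gc_content(primer), wallace_tm(primer))
```

Dedicated tools use more detailed thermodynamic models, so values from this sketch should be treated as a first approximation only.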

Examples

The following DNA sequences illustrate paired double-stranded patterns. By convention, the top strand is written from the 5′-end to the 3′-end; thus, the bottom strand is written 3′ to 5′.

A base-paired DNA sequence:

ATCGATTGAGCTCTAGCG

TAGCTAACTCGAGATCGC

The corresponding RNA sequence, in which uracil is substituted for thymine in the RNA strand:

AUCGAUUGAGCUCUAGCG

UAGCUAACUCGAGAUCGC
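The pairing shown in the examples above can be reproduced with a simple character translation. The sketch below is a minimal illustration; the names `complement` and `reverse_complement` are chosen for this example, and only the four standard bases of each alphabet are handled.

```python
# Translation tables for the canonical pairs A-T / G-C (DNA) and A-U / G-C (RNA).
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def complement(seq: str, rna: bool = False) -> str:
    """Return the paired strand, position by position (same left-to-right order)."""
    table = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    return seq.upper().translate(table)


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the paired strand rewritten in its own 5' to 3' direction."""
    return complement(seq, rna)[::-1]


if __name__ == "__main__":
    top = "ATCGATTGAGCTCTAGCG"
    print(top)
    print(complement(top))          # bottom strand, written 3' to 5'
    print(reverse_complement(top))  # same strand, written 5' to 3'
```

`complement` returns the bottom strand in the 3′ to 5′ orientation used in the examples, whereas `reverse_complement` gives the same strand in the conventional 5′ to 3′ direction.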

Base analogs and intercalators

Chemical analogs of nucleotides can take the place of proper nucleotides and establish non-canonical base-pairing, leading to errors (mostly

5-bromouracil, which resembles thymine but can base-pair to guanine in its enol form.^[10]

Other chemicals, known as

polyaromatic compounds and are known or suspected carcinogens. Examples include ethidium bromide and acridine.^[11]^{[citation needed]}

Mismatch repair

Mismatched base pairs can be generated by errors of DNA replication and as intermediates during homologous recombination. The process of mismatch repair ordinarily must recognize and correctly repair a small number of base mispairs within a long sequence of normal DNA base pairs. To repair mismatches formed during DNA replication, several distinctive repair processes have evolved to distinguish between the template strand and the newly formed strand so that only the newly inserted incorrect nucleotide is removed (in order to avoid generating a mutation).^[12] The proteins employed in mismatch repair during DNA replication, and the clinical significance of defects in this process are described in the article DNA mismatch repair. The process of mispair correction during recombination is described in the article gene conversion.

Length measurements

The following abbreviations are commonly used to describe the length of a DNA or RNA molecule:

bp = base pair—one bp corresponds to approximately 3.4
daltons
for DNA and RNA respectively.
kb (= kbp) = kilo–base-pair = 1,000 bp
Mb (= Mbp) = mega–base-pair = 1,000,000 bp
Gb (= Gbp) = giga–base-pair = 1,000,000,000 bp

For single-stranded DNA/RNA, units of nucleotides are used, abbreviated nt (or knt, Mnt, Gnt), as they are not paired. To distinguish between units of computer storage and bases, kbp, Mbp, Gbp, etc. may be used for base pairs.
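The length abbreviations listed above differ only by powers of 1,000, which can be sketched as a small conversion table. The helper names `to_bp` and `format_bp` are assumptions for this example, and the output uses the unambiguous kbp/Mbp/Gbp forms.

```python
# Multipliers for the length abbreviations, in base pairs.
UNIT_FACTORS = {
    "bp": 1,
    "kb": 10**3, "kbp": 10**3,
    "Mb": 10**6, "Mbp": 10**6,
    "Gb": 10**9, "Gbp": 10**9,
}


def to_bp(value: float, unit: str) -> int:
    """Convert a length such as (3200, "Mb") to a whole number of base pairs."""
    return round(value * UNIT_FACTORS[unit])


def format_bp(n_bp: int) -> str:
    """Render a base-pair count using the largest unit that fits."""
    for unit in ("Gbp", "Mbp", "kbp"):
        factor = UNIT_FACTORS[unit]
        if n_bp >= factor:
            return f"{n_bp / factor:g} {unit}"
    return f"{n_bp} bp"


if __name__ == "__main__":
    print(to_bp(3200, "Mb"))         # human genome estimate quoted in the references
    print(format_bp(2_860_000_000))  # the "2.86 Gb" figure quoted in the references
```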

The centimorgan is also often used to express distance along a chromosome, but the number of base pairs it corresponds to varies widely. In the human genome, one centimorgan corresponds on average to about 1 million base pairs.^[14]^[15]

Unnatural base pair (UBP)

An unnatural base pair (UBP) is a designed subunit (or nucleobase) of DNA which is created in a laboratory and does not occur in nature. DNA sequences have been described which use newly created nucleobases to form a third base pair, in addition to the two base pairs found in nature, A-T (adenine – thymine) and G-C (guanine – cytosine). A few research groups have been searching for a third base pair for DNA, including teams led by Steven A. Benner, Philippe Marliere, Floyd E. Romesberg and Ichiro Hirao.^[16] Some new base pairs based on alternative hydrogen bonding, hydrophobic interactions and metal coordination have been reported.^[17]^[18]^[19]^[20]

In 1989 Steven Benner (then working at the Swiss Federal Institute of Technology in Zurich) and his team led the way by incorporating modified forms of cytosine and guanine into DNA molecules in vitro.^[21] The nucleotides, which encoded RNA and proteins, were successfully replicated in vitro. Since then, Benner's team has been trying to engineer cells that can make foreign bases from scratch, obviating the need for a feedstock.^[22]

In 2002, Ichiro Hirao's group in Japan developed an unnatural base pair between 2-amino-8-(2-thienyl)purine (s) and pyridine-2-one (y) that functions in transcription and translation, for the site-specific incorporation of non-standard amino acids into proteins.^[23] In 2006, they created 7-(2-thienyl)imidazo[4,5-b]pyridine (Ds) and pyrrole-2-carbaldehyde (Pa) as a third base pair for replication and transcription.^[24] Afterward, Ds and 4-[3-(6-aminohexanamido)-1-propynyl]-2-nitropyrrole (Px) were found to form a high-fidelity pair in PCR amplification.^[25]^[26] In 2013, they applied the Ds-Px pair to DNA aptamer generation by in vitro selection (SELEX) and demonstrated that the genetic alphabet expansion significantly augments DNA aptamer affinities to target proteins.^[27]

In 2012, a group of American scientists led by Floyd Romesberg, a chemical biologist at the

aromatic rings that form a (d5SICS–dNaM) complex or base pair in DNA.^[22]^[28] His team designed a variety of in vitro or "test tube" templates containing the unnatural base pair and they confirmed that it was efficiently replicated with high fidelity in virtually all sequence contexts using the modern standard in vitro techniques, namely PCR amplification of DNA and PCR-based applications.^[19] Their results show that for PCR and PCR-based applications, the d5SICS–dNaM unnatural base pair is functionally equivalent to a natural base pair, and when combined with the other two natural base pairs used by all organisms, A–T and G–C, they provide a fully functional and expanded six-letter "genetic alphabet".^[28]

In 2014 the same team from the Scripps Research Institute reported that they had synthesized a plasmid (a stretch of circular DNA) containing natural T-A and C-G base pairs along with the best-performing UBP that Romesberg's laboratory had designed, and had inserted it into cells of the common bacterium E. coli, which successfully replicated the unnatural base pairs through multiple generations.^[16] The transfection did not hamper the growth of the E. coli cells, and the plasmid showed no sign of losing its unnatural base pairs to the cells' natural DNA repair mechanisms. This is the first known example of a living organism passing along an expanded genetic code to subsequent generations.^[28]^[29] Romesberg said he and his colleagues created 300 variants to refine the design of nucleotides that would be stable enough and would be replicated as easily as the natural ones when the cells divide. This was in part achieved by the addition of a supportive algal gene that expresses a nucleotide triphosphate transporter which efficiently imports the triphosphates of both d5SICS and dNaM (d5SICSTP and dNaMTP) into E. coli bacteria.^[28] Then, the natural bacterial replication pathways use them to accurately replicate a plasmid containing d5SICS–dNaM. Other researchers were surprised that the bacteria replicated these human-made DNA subunits.^[30]

The successful incorporation of a third base pair is a significant breakthrough toward the goal of greatly expanding the number of amino acids which can be encoded by DNA, from the existing 20 amino acids to a theoretically possible 172, thereby expanding the potential for living organisms to produce novel proteins.^[16] The artificial strings of DNA do not encode anything yet, but scientists speculate they could be designed to manufacture new proteins which could have industrial or pharmaceutical uses.^[31] Experts said the synthetic DNA incorporating the unnatural base pair raises the possibility of life forms based on a different DNA code.^[30]^[31]
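To see why a third base pair enlarges the coding space, the following sketch counts the three-letter codons available for a four-letter and a six-letter alphabet. It assumes triplet codons as in the standard genetic code; the letters X and Y are hypothetical placeholders for the two unnatural bases, and the figure of 172 amino acids quoted above is taken from the cited source and is not derived here.

```python
from itertools import product


def codon_count(alphabet_size: int, codon_length: int = 3) -> int:
    """Number of distinct codons of a given length over an alphabet."""
    return alphabet_size ** codon_length


def all_codons(alphabet: str, codon_length: int = 3) -> list[str]:
    """Enumerate every codon explicitly (useful for small alphabets)."""
    return ["".join(c) for c in product(alphabet, repeat=codon_length)]


if __name__ == "__main__":
    natural = "ACGT"
    expanded = natural + "XY"  # X and Y are placeholder letters, not real base names
    print(codon_count(len(natural)), codon_count(len(expanded)))
```

With four letters there are 64 possible triplets, and with six letters there are 216.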

Non-canonical base pairing

Wobble base pairs

Comparison of Hoogsteen to Watson–Crick base pairs.^[32]

In addition to the canonical pairing, some conditions can also favour base-pairing with alternative base orientation, and number and geometry of hydrogen bonds. These pairings are accompanied by alterations to the local backbone shape.^{[citation needed]}

The most common of these is the

codons during translation^[33] and during the charging of tRNAs by some tRNA synthetases.^[34] They have also been observed in the secondary structures of some RNA sequences.^[35]

Additionally, Hoogsteen base pairing (typically written as A•U/T and G•C) can exist in some DNA sequences (e.g. CA and TA dinucleotides) in dynamic equilibrium with standard Watson–Crick pairing.^[32] Hoogsteen pairs have also been observed in some protein–DNA complexes.^[36]

In addition to these alternative base pairings, a wide range of base-base hydrogen bonding is observed in RNA secondary and tertiary structure.^[37] These bonds are often necessary for the precise, complex shape of an RNA, as well as its binding to interaction partners.^[37]

References

[1] ISSN 0365-110X.

[2] ISBN 978-0-387-25579-8.

[3] Moran LA (2011-03-24). "The total size of the human genome is very likely to be ~3,200 Mb". Sandwalk.blogspot.com. Retrieved 2012-07-16.

[4] "The finished length of the human genome is 2.86 Gb". Strategicgenomics.com. 2006-06-12. Retrieved 2012-07-16.

[5] PMID 15496913.

[6] S2CID 31624366.

[7] ISSN 0362-4331. Archived from the original on 2022-01-01. Retrieved 2015-07-18.

[8] "The Biosphere: Diversity of Life". Aspen Global Change Institute. Basalt, CO. Archived from the original on 2014-11-10. Retrieved 2015-07-19.

[9] PMID 16449200.

[10] PMID 13922323.

[11] ISBN 978-1-284-10449-3. Each mutagenic event in the presence of an acridine results in the addition or removal of a single base pair.

[12] PMID 34171627.

[13] ISBN 978-0-8153-4432-2.

[14] "NIH ORDR – Glossary – C". Rarediseases.info.nih.gov. Archived from the original on 2012-07-17. Retrieved 2012-07-16.

[15] ISBN 978-0-7167-4366-8. ...in humans 1 centimorgan on average represents a distance of about 7.5x10⁵ base pairs.

[16] Fikes BJ (May 8, 2014). "Life engineered with expanded genetic code". San Diego Union Tribune. Archived from the original on 9 May 2014. Retrieved 8 May 2014.

[17] PMID 21842904.

[18] PMID 22121213.

[19] PMID 22773812.

[20] ISSN 0366-7022.

[21] doi:10.1021/ja00203a067.

[22] Callaway E (May 7, 2014). "Scientists Create First Living Organism With 'Artificial' DNA". Nature News. Huffington Post. Retrieved 8 May 2014.

[23] S2CID 22055476.

[24] S2CID 6494156.

[25] PMID 19073696.

[26] PMID 22121213.

[27] S2CID 23329867.

[28] PMID 24805238.

[29] Sample I (May 7, 2014). "First life forms to pass on artificial DNA engineered by US scientists". The Guardian. Retrieved 8 May 2014.

[30] "Scientists create first living organism containing artificial DNA". The Wall Street Journal. Fox News. May 8, 2014. Retrieved 8 May 2014.

[31] Pollack A (May 7, 2014). "Scientists Add Letters to DNA's Alphabet, Raising Hope and Fear". New York Times. Retrieved 8 May 2014.

[32] PMID 21270796.

[33] S2CID 27022506.

[34] S2CID 205239383.

[35] PMID 29122970.

[36] PMID 12466549.

[37] PMID 12831880.

Further reading

Watson JD, Baker TA, Bell SP, Gann A, Levine M, Losick R (2004). Molecular Biology of the Gene (5th ed.). Pearson Benjamin Cummings: CSHL Press. (See esp. ch. 6 and 9)

Sigel A, Sigel H, Sigel RK, eds. (2012). Interplay between Metal Ions and Nucleic Acids. Metal Ions in Life Sciences. Vol. 10. Springer. S2CID 92951134.

Clever GH, Shionoya M (2012). "Alternative DNA Base Pairing through Metal Coordination". Interplay between Metal Ions and Nucleic Acids. Metal Ions in Life Sciences. Vol. 10. pp. 269–294. PMID 22210343.

Megger DA, Megger N, Mueller J (2012). "Metal-Mediated Base Pairs in Nucleic Acids with Purine- and Pyrimidine-Derived Nucleosides". Interplay between Metal Ions and Nucleic Acids. Metal Ions in Life Sciences. Vol. 10. pp. 295–317. PMID 22210344.

External links


DAN—webserver version of the EMBOSS tool for calculating melting temperatures

Retrieved from "https://en.wikipedia.org/w/index.php?title=Base_pair&oldid=1187784531#Length_measurements"
