Chloroplast DNA

Source: Wikipedia, the free encyclopedia.

Interactive gene map of chloroplast DNA, marking tRNAs, ribosomal proteins, NADH dehydrogenase, cytochromes, ATP synthase, initiation factor 1, replication origin regions, and introns.


Chloroplast DNA (cpDNA) is the DNA located in chloroplasts, which are photosynthetic organelles found within the cells of some eukaryotic organisms. Chloroplasts, like other types of plastid, contain a genome separate from that in the cell nucleus. Chloroplast DNA was identified biochemically in 1959,[1] and its existence was confirmed by electron microscopy in 1962.[2] The discoveries that the chloroplast contains ribosomes[3] and performs protein synthesis[4] revealed that the chloroplast is genetically semi-autonomous. The first complete chloroplast genome sequences were published in 1986: that of Nicotiana tabacum (tobacco) by Sugiura and colleagues, and that of Marchantia polymorpha (liverwort) by Ozeki et al.[5][6] Since then, chloroplast DNAs from a great number of species have been sequenced.

Molecular structure

The 154 kb chloroplast DNA map of a model flowering plant (Arabidopsis thaliana: Brassicaceae) showing genes and inverted repeats.

base pairs long.[7][8][9] They can have a contour length of around 30–60 micrometers, and have a mass of about 80–130 million daltons.[10]
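
As a rough consistency check on these figures, contour length and mass can be estimated from genome size in base pairs. The sketch below assumes textbook values for double-stranded B-form DNA that are not stated in this article: a rise of about 0.34 nm per base pair and an average mass of about 650 daltons per base pair. The function and constant names are illustrative only.

```python
# Assumed textbook values for double-stranded B-form DNA (not from this article).
RISE_NM_PER_BP = 0.34     # helical rise per base pair, in nanometers
MASS_DA_PER_BP = 650.0    # average mass of one base pair, in daltons


def contour_length_um(base_pairs: int) -> float:
    """Estimate the contour length of a DNA molecule in micrometers."""
    return base_pairs * RISE_NM_PER_BP / 1000.0


def mass_megadaltons(base_pairs: int) -> float:
    """Estimate the mass of a DNA molecule in millions of daltons."""
    return base_pairs * MASS_DA_PER_BP / 1e6


# The 154 kb Arabidopsis thaliana chloroplast genome from the map caption above.
arabidopsis_bp = 154_000
print(f"contour length: {contour_length_um(arabidopsis_bp):.1f} micrometers")
print(f"mass: {mass_megadaltons(arabidopsis_bp):.1f} million daltons")
```

Under these assumptions, a 154 kb genome falls inside both the 30–60 micrometer and the 80–130 million dalton ranges quoted above.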

Most chloroplasts have their entire chloroplast genome combined into a single large ring, though those of

coding DNA
, have also been found.

Chloroplast DNA has long been thought to have a circular structure, but some evidence suggests that chloroplast DNA more commonly takes a linear shape.

corn chloroplasts has been observed to be in branched linear form rather than individual circles.[11]

Inverted repeats

Many chloroplast DNAs contain two inverted repeats, which separate a long single copy section (LSC) from a short single copy section (SSC).[9]

The inverted repeats vary widely in length, ranging from 4,000 to 25,000 base pairs each.[11] Inverted repeats in plants tend to be at the upper end of this range, each being 20,000–25,000 base pairs long.[9][13]
The inverted repeat regions usually contain three
reduced to contain as few as four or as many as over 150 genes.[11]
While a given pair of inverted repeats is rarely completely identical, the two repeats are always very similar to each other, apparently as a result of concerted evolution.[11]
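
To illustrate what "inverted" means here, the following sketch builds a toy genome laid out as LSC, first repeat, SSC, second repeat, and checks that one repeat is the reverse complement of the other. All sequences are invented and far shorter than real ones, and the helper names are hypothetical.

```python
# Toy illustration only: sequences are invented and far shorter than real ones.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


lsc = "ATGGCATTAGCCGTAACG"        # long single copy section (hypothetical)
ir_a = "GGATCCTTAAGC"              # first inverted repeat (hypothetical)
ssc = "TTGACA"                     # short single copy section (hypothetical)
ir_b = reverse_complement(ir_a)    # second repeat: same sequence, opposite orientation

genome = lsc + ir_a + ssc + ir_b

# Reading the second repeat on the opposite strand recovers the first exactly.
print(reverse_complement(ir_b) == ir_a)

# A single substitution in one copy leaves the pair similar but not identical.
ir_b_variant = "A" + ir_b[1:]
print(percent_identity(reverse_complement(ir_b_variant), ir_a))
```

In this toy layout the two repeats flank the SSC, so together they separate it from the LSC as described above.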

The inverted repeat regions are highly

peas and a few red algae[11] have since lost the inverted repeats.[13][14] Others, like the red alga Porphyra, have flipped one of their inverted repeats (making them direct repeats).[11] It is possible that the inverted repeats help stabilize the rest of the chloroplast genome, as chloroplast DNAs that have lost some of the inverted repeat segments tend to be rearranged more often.[14]

Nucleoids

Each chloroplast contains around 100 copies of its DNA in young leaves, declining to 15–20 copies in older leaves.

nucleoids which can contain several identical chloroplast DNA rings. Many nucleoids can be found in each chloroplast.[10]

Although chloroplast DNA is not associated with true histones,[16] red algae have a histone-like chloroplast protein (HC), encoded by the chloroplast DNA, that tightly packs each chloroplast DNA ring into a nucleoid.[17]

In primitive red algae, the chloroplast DNA nucleoids are clustered in the center of a chloroplast, while in green plants and green algae, the nucleoids are dispersed throughout the stroma.[17]

Gene content and plastid gene expression

More than 5,000 chloroplast genomes have been sequenced and are accessible via the NCBI organelle genome database.[18] The first chloroplast genomes were sequenced in 1986, from tobacco (Nicotiana tabacum)[19] and liverwort (Marchantia polymorpha).[20] Comparison of the gene sequences of the cyanobacterium Synechocystis with those of the chloroplast genome of Arabidopsis confirmed the endosymbiotic origin of the chloroplast.[21][22] It also demonstrated the significant extent of gene transfer from the cyanobacterial ancestor to the nuclear genome.

In most plant species, the chloroplast genome encodes approximately 120 genes.

Rubisco subunit and 28 photosynthetic thylakoid proteins are encoded within the chloroplast genome.[25]

Chloroplast genome reduction and gene transfer

Over time, many parts of the chloroplast genome were transferred to the nuclear genome of the host, a process called endosymbiotic gene transfer. As a result, the chloroplast genome is heavily reduced compared to that of free-living cyanobacteria. Chloroplasts may contain 60–100 genes, whereas cyanobacteria often have more than 1,500 genes in their genome.[27] Conversely, there are only a few known instances where genes have been transferred to the chloroplast from various donors, including bacteria.[28][29][30]

Endosymbiotic gene transfer is how we know about the

green algal derived chloroplast at some point, which was subsequently replaced by the red chloroplast.[31]

In land plants, some 11–14% of the DNA in their nuclei can be traced back to the chloroplast,[32] up to 18% in Arabidopsis, corresponding to about 4,500 protein-coding genes.[33] There have been a few recent transfers of genes from the chloroplast DNA to the nuclear genome in land plants.[8]

Proteins encoded by the chloroplast

Of the approximately 3,000 proteins found in chloroplasts, some 95% are encoded by nuclear genes. Many of the chloroplast's protein complexes consist of subunits from both the chloroplast genome and the host's nuclear genome. As a result,

retrograde signaling.[34]

Protein synthesis

Protein synthesis within chloroplasts relies on an RNA polymerase coded by the chloroplast's own genome, which is related to RNA polymerases found in bacteria. Chloroplasts also contain a mysterious second RNA polymerase that is encoded by the plant's nuclear genome. The two RNA polymerases may recognize and bind to different kinds of promoters within the chloroplast genome.[35] The ribosomes in chloroplasts are similar to bacterial ribosomes.[36]

RNA editing in plastids

RNA editing is the insertion, deletion, and substitution of nucleotides in an mRNA transcript prior to translation to protein. The highly oxidative environment inside chloroplasts increases the rate of mutation, so post-transcriptional repairs are needed to conserve functional sequences. The chloroplast editosome substitutes C → U and U → C at very specific locations on the transcript. This can change the codon for an amino acid or restore a non-functional pseudogene by adding an AUG start codon or removing a premature UAA stop codon.[37]
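
The two kinds of substitution described above can be sketched on a toy transcript. The sequence, the edit positions, and the function name are all invented for illustration; real editing sites are selected by the editosome, not by a lookup table.

```python
ALLOWED_EDITS = {("C", "U"), ("U", "C")}


def edit_transcript(mrna: str, edits: dict) -> str:
    """Apply substitutions to an mRNA string.

    `edits` maps a 0-based position to an (expected_base, new_base) pair.
    Only C-to-U and U-to-C substitutions are accepted.
    """
    bases = list(mrna)
    for position, (expected, new) in edits.items():
        if (expected, new) not in ALLOWED_EDITS:
            raise ValueError(f"unsupported edit {expected}->{new}")
        if bases[position] != expected:
            raise ValueError(f"expected {expected} at position {position}")
        bases[position] = new
    return "".join(bases)


def codons(mrna: str) -> list:
    """Split an mRNA string into consecutive three-base codons."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - len(mrna) % 3, 3)]


# Hypothetical unedited transcript: no AUG start codon, premature UAA stop.
unedited = "ACGGCUUAAGGA"
edited = edit_transcript(unedited, {1: ("C", "U"), 6: ("U", "C")})

print(codons(unedited))  # ['ACG', 'GCU', 'UAA', 'GGA']
print(codons(edited))    # ['AUG', 'GCU', 'CAA', 'GGA']
```

Here the C → U edit turns ACG into the start codon AUG, and the U → C edit turns the premature stop codon UAA into CAA.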

The editosome recognizes and binds to a cis sequence upstream of the editing site. The distance between the binding site and editing site varies by gene and by the proteins involved in the editosome. Hundreds of different PPR proteins from the nuclear genome are involved in the RNA editing process. These proteins consist of 35-mer repeated amino acids, the sequence of which determines the cis binding site for the edited transcript.[37]

Basal land plants such as liverworts, mosses and ferns have hundreds of different editing sites, while flowering plants typically have between thirty and forty. Parasitic plants such as Epifagus virginiana show a loss of RNA editing resulting in a loss of function for photosynthesis genes.[38]

DNA replication

Leading model of cpDNA replication

Chloroplast DNA replication via multiple D loop mechanisms. Adapted from Krishnan NM, Rao BJ's paper "A comparative approach to elucidate chloroplast genome replication."

The mechanism for chloroplast DNA (cpDNA) replication has not been conclusively determined, but two main models have been proposed. Scientists have attempted to observe chloroplast replication via

replication forks
open up, allowing replication machinery to replicate the DNA. As replication continues, the forks grow and eventually converge. The new cpDNA structures separate, creating daughter cpDNA chromosomes.

In addition to the early microscopy experiments, this model is also supported by the amounts of

amino group is lost and is a mutation that often results in base changes. When adenine is deaminated, it becomes hypoxanthine (H). Hypoxanthine can bind to cytosine, and when the HC base pair is replicated, it becomes a GC (thus, an A → G base change).[41]

Over time, base changes in the DNA sequence can arise from deamination mutations. When adenine is deaminated, it becomes hypoxanthine, which can pair with cytosine. During replication, the cytosine will pair with guanine, causing an A → G base change.
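
The A → G change described above can be traced step by step in a small sketch. It is deliberately simplified: strands are copied position by position, ignoring antiparallel orientation, and the sequence and function names are invented for illustration.

```python
# Pairing rules used when a template strand is copied.
# H (hypoxanthine, i.e. deaminated adenine) pairs with cytosine.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G", "H": "C"}


def deaminate_adenine(strand: str, position: int) -> str:
    """Replace the adenine at `position` with hypoxanthine (H)."""
    if strand[position] != "A":
        raise ValueError("only adenine is deaminated in this sketch")
    return strand[:position] + "H" + strand[position + 1:]


def copy_strand(template: str) -> str:
    """Return the strand synthesized against `template`, position by position."""
    return "".join(PAIRING[base] for base in template)


original = "TTACG"                        # hypothetical single-stranded stretch
damaged = deaminate_adenine(original, 2)  # A at index 2 becomes H
first_copy = copy_strand(damaged)         # H templates a C
second_copy = copy_strand(first_copy)     # that C templates a G

print(original)     # TTACG
print(damaged)      # TTHCG
print(first_copy)   # AACGC
print(second_copy)  # TTGCG
```

After two rounds of copying, the position that originally held A holds G, while every other position is unchanged.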

In cpDNA, there are several A → G deamination gradients. DNA becomes susceptible to deamination events when it is single stranded. When replication forks form, the strand not being copied is single stranded, and thus at risk for A → G deamination. Therefore, gradients in deamination indicate that replication forks were most likely present and the direction that they initially opened (the highest gradient is most likely nearest the start site because it was single stranded for the longest amount of time).[39] This mechanism is still the leading theory today; however, a second theory suggests that most cpDNA is actually linear and replicates through homologous recombination. It further contends that only a minority of the genetic material is kept in circular chromosomes while the rest is in branched, linear, or other complex structures.[39][12]

Alternative model of replication

One of the main competing models for cpDNA asserts that most cpDNA is linear and participates in

bacteriophage T4.[12] It has been established that some plants have linear cpDNA, such as maize, and that more still contain complex structures that scientists do not yet understand;[12] however, the predominant view today is that most cpDNA is circular. When the original experiments on cpDNA were performed, scientists did notice linear structures; however, they attributed these linear forms to broken circles.[12] If the branched and complex structures seen in cpDNA experiments are real and not artifacts of concatenated circular DNA or broken circles, then a D-loop mechanism of replication is insufficient to explain how those structures would replicate.[12] At the same time, homologous recombination does not explain the multiple A → G gradients seen in plastomes.[39]
This shortcoming is one of the biggest weaknesses of the linear-structure model.

Protein targeting and import

The movement of so many chloroplast genes to the nucleus means that many chloroplast proteins that were supposed to be translated in the chloroplast are now synthesized in the cytoplasm. This means that these proteins must be directed back to the chloroplast and imported through at least two chloroplast membranes.[42]

Curiously, around half of the protein products of transferred genes aren't even targeted back to the chloroplast. Many became

topologically outside of the cell, because to reach the chloroplast from the cytosol, you have to cross the cell membrane, just like if you were headed for the extracellular space. In those cases, chloroplast-targeted proteins do initially travel along the secretory pathway).[43]

Because the cell acquiring a chloroplast

protein targeting system to avoid having chloroplast proteins being sent to the wrong organelle.[42]

Cytoplasmic translation and N-terminal transit sequences

carboxyl group
(CO2H) is at the right.

N-terminal transit sequences are also called presequences
ribosomes synthesize polypeptides from the N-terminus to the C-terminus.[44]

Chloroplast transit peptides exhibit huge variation in length and

acidic amino acids like aspartic acid and glutamic acid.[46] In an aqueous solution, the transit sequence forms a random coil.[42]

Not all chloroplast proteins include an N-terminal cleavable transit peptide, though.

Phosphorylation, chaperones, and transport

After a chloroplast

Phosphorylation changes the polypeptide's shape,

14-3-3 proteins only bind to chloroplast preproteins.[47] It is also bound by the heat shock protein Hsp70 that keeps the polypeptide from folding prematurely.[42] This is important because it prevents chloroplast proteins from assuming their active form and carrying out their chloroplast functions in the wrong place—the cytosol.[47][50] At the same time, they have to keep just enough shape so that they can be recognized and imported into the chloroplast.[47]

The heat shock protein and the 14-3-3 proteins together form a cytosolic guidance complex that makes it easier for the chloroplast polypeptide to get imported into the chloroplast.[42]

Alternatively, if a chloroplast preprotein's transit peptide is not phosphorylated, a chloroplast preprotein can still attach to a heat shock protein or

TOC complex on the outer chloroplast membrane using GTP energy.[42]

The translocon on the outer chloroplast membrane (TOC)

The

outer chloroplast envelope. Five subunits of the TOC complex have been identified—two GTP-binding proteins Toc34 and Toc159, the protein import tunnel Toc75, plus the proteins Toc64[42] and Toc12.[45]

The first three proteins form a core complex that consists of one Toc159, four to five Toc34s, and four Toc75s that form four holes in a disk 13

kilodaltons. The other two proteins, Toc64 and Toc12, are associated with the core complex but are not part of it.[45]

Toc34 and 33

pea plant. Toc34 has three almost identical molecules (shown in slightly different shades of green), each of which forms a dimer with one of its adjacent molecules. Part of a GDP molecule binding site is highlighted in pink.[51]