Codon usage bias

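Codon usage bias refers to differences in the frequency of occurrence of synonymous codons in coding DNA. A codon is a series of three nucleotides (a triplet) that encodes a specific amino acid residue in a polypeptide chain or signals the termination of translation (stop codons).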

There are 64 different codons (61 codons encoding amino acids and 3 stop codons) but only 20 different translated amino acids. This surplus of codons allows many amino acids to be encoded by more than one codon. Because of this redundancy, the genetic code is said to be degenerate. The genetic codes of different organisms are often biased towards using one of the several codons that encode the same amino acid over the others; that is, one codon occurs at a greater frequency than expected by chance. How such biases arise is a much-debated area of molecular evolution. Codon usage tables detailing genomic codon usage bias for organisms in GenBank and RefSeq can be found in the HIVE-Codon Usage Tables (HIVE-CUTs) project,^[1] which contains two distinct databases, CoCoPUTs and TissueCoCoPUTs. Together, these two databases provide comprehensive, up-to-date codon, codon pair and dinucleotide usage statistics for all organisms with available sequence information and for 52 human tissues, respectively.^[2]^[3]
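The counts quoted above (64 codons, 61 sense codons, 3 stop codons, 20 amino acids) can be checked directly, and the same table gives a simple way to measure how unevenly synonymous codons are used in a sequence. The following is a minimal sketch, assuming the standard genetic code (NCBI translation table 1); the function names `count_codons` and `synonymous_fractions` and the example sequence are illustrative, not taken from any particular tool.

```python
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1); bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))  # "*" marks a stop codon

# Degeneracy: how many codons encode each amino acid (or stop).
DEGENERACY = Counter(CODON_TABLE.values())


def count_codons(cds):
    """Count in-frame codons in a coding sequence (DNA or RNA letters)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def synonymous_fractions(cds):
    """For each amino acid, the share of each synonymous codon actually used."""
    counts = count_codons(cds)
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    fractions = defaultdict(dict)
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        fractions[aa][codon] = n / totals[aa]
    return dict(fractions)


if __name__ == "__main__":
    sense = sum(n for aa, n in DEGENERACY.items() if aa != "*")
    # total codons, sense codons, stop codons, amino acids
    print(len(CODON_TABLE), sense, DEGENERACY["*"], len(DEGENERACY) - 1)
    # Toy sequence: leucine is encoded twice by CTG and once by CTA.
    print(synonymous_fractions("ATGCTGCTGCTATAA")["L"])
```

In the toy sequence, leucine has six synonymous codons available but only two are used, in a 2:1 ratio; a real analysis would pool counts over many genes before comparing them with the frequencies expected by chance.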

It is generally acknowledged that codon biases reflect the contributions of 3 main factors,

host cell. The suggestion has been made that these codon biases play a role in the temporal regulation of their late proteins.^[13]

The nature of the codon usage-tRNA optimization has been fiercely debated. It is not clear whether codon usage drives tRNA evolution or vice versa. At least one mathematical model has been developed in which codon usage and tRNA expression co-evolve in feedback fashion (i.e., codons already present at high frequencies drive up the expression of their corresponding tRNAs, and tRNAs normally expressed at high levels drive up the frequency of their corresponding codons). However, this model does not yet seem to have experimental confirmation. Another problem is that the evolution of tRNA genes has been a very inactive area of research.

Contributing factors

Different factors have been proposed to be related to codon usage bias, including gene expression level (reflecting selection for optimizing the translation process by tRNA abundance), guanine-cytosine skew (GC skew, reflecting strand-specific mutational bias), amino acid conservation, protein hydropathy, transcriptional selection, RNA stability, optimal growth temperature, hypersaline adaptation, and dietary nitrogen.^[14]^[15]^[16]^[17]^[18]^[19]

Evolutionary theories

Mutational bias versus selection

Although the mechanism of codon bias selection remains controversial, possible explanations for this bias fall into two general categories. One explanation revolves around the selectionist theory, in which codon bias contributes to the efficiency and/or accuracy of protein expression and therefore undergoes

ribosomes and potentially the rate of initiation for messenger RNAs (mRNAs).^[20]

The second explanation attributes codon usage to mutational bias, a theory which posits that codon bias exists because of nonrandomness in the mutational patterns. In other words, some codons undergo more changes and therefore reach lower equilibrium frequencies; these are known as “rare” codons. Different organisms also exhibit different mutational biases, and there is growing evidence that genome-wide GC content is the most significant parameter in explaining codon bias differences between organisms. Additional studies have demonstrated that codon biases can be statistically predicted in

coding regions and further supporting the mutation bias model. However, this model alone cannot fully explain why preferred codons are recognized by more abundant tRNAs.^[20]
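Because GC content is named above as the main parameter separating organisms, it can help to see how it is measured. The sketch below computes overall GC content and, as an added assumption not mentioned in the text, GC content at third codon positions (often abbreviated GC3), where most synonymous variation occurs. The function names are illustrative.

```python
def gc_content(seq):
    """Fraction of G and C among the A/C/G/T(U) letters of a sequence."""
    seq = seq.upper().replace("U", "T")
    letters = [b for b in seq if b in "ACGT"]  # skip N and other symbols
    if not letters:
        return float("nan")
    return sum(b in "GC" for b in letters) / len(letters)


def gc3_content(cds):
    """GC fraction at third codon positions of an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return gc_content(cds[2:usable:3])


if __name__ == "__main__":
    cds = "ATGCTGTAA"  # toy sequence: codons ATG, CTG, TAA
    print(gc_content(cds), gc3_content(cds))
```

Comparing these values between coding and non-coding regions of the same genome is one rough way to ask whether a codon preference simply follows the background nucleotide composition.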

Mutation-selection-drift balance model

To reconcile the evidence from both mutational pressures and selection, the prevailing hypothesis for codon bias is the mutation-selection-drift balance model. This hypothesis states that selection favors major codons over minor codons, but minor codons are able to persist due to mutation pressure and genetic drift. It also suggests that selection is generally weak, but that selection intensity increases with higher expression and stronger functional constraints on coding sequences.^[20]
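The balance described above can be illustrated with a textbook two-codon simplification, which is not stated in the text and is offered here only as a sketch. Assume one preferred and one unpreferred synonymous codon, a scaled selection coefficient S (population size times selective advantage, with the exact factor depending on ploidy convention), and a mutation bias k equal to the rate of mutation away from the preferred codon divided by the rate towards it. Under weak mutation, the expected equilibrium frequency of the preferred codon is then 1 / (1 + k * exp(-S)). The names `preferred_codon_frequency`, `scaled_selection` and `mutation_bias` are hypothetical.

```python
import math


def preferred_codon_frequency(scaled_selection, mutation_bias=1.0):
    """Equilibrium frequency of the preferred codon in a two-codon model.

    scaled_selection: S, selection strength scaled by population size
                      (assumed convention; 0 means no selection).
    mutation_bias:    k, mutation rate away from the preferred codon
                      divided by the rate towards it.
    """
    return 1.0 / (1.0 + mutation_bias * math.exp(-scaled_selection))


if __name__ == "__main__":
    # With mutation biased 2:1 against the preferred codon, weak selection
    # (small S) leaves it in the minority; stronger selection, as proposed
    # for highly expressed genes, makes it the major codon.
    for s in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(s, round(preferred_codon_frequency(s, mutation_bias=2.0), 3))
```

With S = 0 the frequency is set by mutation alone, and minor codons never disappear entirely for any finite S, which matches the qualitative claims of the model.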

Consequences of codon composition

Effect on RNA secondary structure

Because

ribosome-binding site or initiation codon can inhibit translation, and mRNA folding at the 5’ end generates a large amount of variation in protein levels.^[21]

Effect on transcription or gene expression

codon optimization, has traditionally been used for expression of a heterologous gene. However, new strategies for optimization of heterologous expression consider global nucleotide content such as local mRNA folding, codon pair bias, a codon ramp, codon harmonization or codon correlations.^[24]^[25] With the number of nucleotide changes introduced, artificial gene synthesis is often necessary for the creation of such an optimized gene.
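One simple reading of traditional codon optimization is to replace every codon with the synonymous codon used most often in a reference set of host genes. The sketch below implements only that naive rule, under stated assumptions: the standard genetic code, sequences containing only A, C, G, T or U, and a made-up pair of reference sequences standing in for real host genes. It does not attempt the newer strategies listed above (mRNA folding, codon pair bias, ramps, harmonization), and the helper names are hypothetical.

```python
from collections import Counter

# Standard genetic code (NCBI translation table 1); bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def split_codons(cds):
    """Split a coding sequence into in-frame codons (DNA alphabet)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def translate(cds):
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def preferred_codons(reference_cds_list):
    """Most frequent codon per amino acid in the reference sequences."""
    counts = Counter(c for cds in reference_cds_list for c in split_codons(cds))
    best = {}
    for codon, aa in CODON_TABLE.items():
        # Ties keep the codon seen first in table order.
        if counts[codon] > counts.get(best.get(aa), 0):
            best[aa] = codon
    return best


def optimize(cds, best):
    """Swap each codon for the preferred synonym; keep it if none is known."""
    return "".join(best.get(CODON_TABLE[c], c) for c in split_codons(cds))


if __name__ == "__main__":
    # Hypothetical host reference genes, invented for illustration only.
    reference = ["ATGCTGCTGAAAGAATAA", "ATGCTGAAAGAAGAGTAA"]
    best = preferred_codons(reference)
    query = "ATGTTAAAGGAGTAG"
    optimized = optimize(query, best)
    print(optimized)
    print(translate(query) == translate(optimized))
```

The protein sequence is unchanged by construction, but nearly every codon may be rewritten, which is why a gene optimized this way is usually synthesized rather than produced by mutating the original.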

Specialized codon bias is further seen in some endogenous genes such as those involved in amino acid starvation. For example, amino acid biosynthetic enzymes preferentially use codons that are poorly adapted to normal tRNA abundances, but have codons that are adapted to tRNA pools under starvation conditions. Thus, codon usage can introduce an additional level of transcriptional regulation for appropriate gene expression under specific cellular conditions.^[25]

Effect on speed of translation elongation

Generally speaking, for highly expressed genes, translation elongation rates are faster along transcripts with higher codon adaptation to tRNA pools, and slower along transcripts with rare codons. This correlation between codon translation rates and cognate tRNA concentrations provides additional modulation of translation elongation rates, which can provide several advantages to the organism. Specifically, codon usage can allow for global regulation of these rates, and rare codons may contribute to the accuracy of translation at the expense of speed.^[26]

Effect on protein folding

synonymous mutations have been shown to have significant consequences in the folding process of the nascent protein and can even change substrate specificity of enzymes. These studies suggest that codon usage influences the speed at which polypeptides emerge vectorially from the ribosome, which may further impact protein folding pathways throughout the available structural space.^[26]

Methods of analysis

In the field of

DNA vaccines. Several software packages are available online for this purpose (refer to external links).
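Several of the tools listed under External links (for example E-CAI and CAIcal) are built around the Codon Adaptation Index (CAI). As a rough illustration of what such packages compute, the sketch below follows the usual definition: each codon gets a weight equal to its count in a reference gene set divided by the count of the most used synonymous codon, and a gene's CAI is the geometric mean of the weights of its codons. The standard genetic code is assumed, single-codon amino acids (Met, Trp) and stop codons are excluded, and replacing zero counts with 0.5 is an assumption made here to avoid zero weights. The function names and toy sequences are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1); bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

SYNONYMS = defaultdict(list)  # amino acid -> list of its codons
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)


def split_codons(cds):
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_cds_list, zero_count=0.5):
    """Weight of each codon relative to its most used synonym."""
    counts = Counter(c for cds in reference_cds_list for c in split_codons(cds))
    weights = {}
    for aa, codons in SYNONYMS.items():
        if aa == "*" or len(codons) == 1:
            continue  # stop codons, Met and Trp offer no synonymous choice
        adjusted = {c: counts[c] or zero_count for c in codons}
        top = max(adjusted.values())
        for c in codons:
            weights[c] = adjusted[c] / top
    return weights


def cai(cds, weights):
    """Geometric mean of codon weights; NaN if no scorable codon is present."""
    logs = [math.log(weights[c]) for c in split_codons(cds) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


if __name__ == "__main__":
    # Invented reference set: CTG is used 3x vs CTA 1x, AAA 2x vs AAG 1x.
    weights = relative_adaptiveness(["CTGCTGCTGCTA", "AAAAAAAAG"])
    print(cai("CTGAAA", weights))  # only top codons
    print(cai("CTAAAG", weights))  # only less used codons
```

A gene built entirely from the most used codons scores 1, and the score falls as rarer codons appear; real tools differ in their reference sets and in how they treat missing codons.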

References

[1] PMID 28865429.

[2] S2CID 139104807.

[3] PMID 31982380.

[4] PMID 21646514.

[5] doi:10.1146/annurev-genom-082908-150001.

[6] hdl:20.500.12210/34500.

[7] PMID 8709146.

[8] S2CID 8582630.

[9] PMID 10570992.

[10] PMID 10784043.

[11] PMID 31174582.

[12] PMID 10858656.

[13] PMID 26504241.

[14] PMID 11719972.

[15] PMID 12364606.

[16] PMID 18397532.

[17] PMID 23637123.

[18] PMID 9724767.

[19] PMID 27842572.

[20] S2CID 7085012.

[21] PMID 22921354.

[22] PMID 17017124.

[23] S2CID 202555575.

[24] PMID 29624661.

[25] PMID 21102527.

[26] PMID 24688635.

[27] S2CID 21862217.

[28] PMID 6175758.

[29] PMID 20453079.

[30] PMID 3547335.

[31] Peden J (2005-04-15). "Codon usage indices". Correspondence Analysis of Codon Usage. SourceForge. Retrieved 2010-10-20.

[32] PMID 18940873.

External links

Composition Analysis Toolkit: estimating codon usage bias and its statistical significance

HIVE-Codon Usage Table database

Codon Usage Database

CodonW

GCUA - General Codon Usage Analysis

Graphical Codon Usage Analyser

JCat - Java Codon Usage Adaptation Tool

INCA - Interactive Codon Analysis software

ACUA - Automated Codon Usage Analysis Tool

OPTIMIZER - Codon usage optimization

HEG-DB - Highly Expressed Genes Database

E-CAI - Expected value of Codon Adaptation Index

CAIcal - Set of tools to assess codon usage adaptation

scRCA - Automatic determination of translational codon usage bias

Online Synonymous Codon Usage Analyses with the ade4 and seqinR packages

Genetic Algorithm Simulation for Codon Optimization
