Protein isoform
A protein isoform, or "protein variant",
The discovery of isoforms could explain the discrepancy between the small number of protein coding regions of genes revealed by the
Definition
A single gene can produce multiple proteins that differ in both structure and composition;[4][5] this process is regulated by the alternative splicing of mRNA, though it is not clear to what extent it affects the diversity of the human proteome, as the abundance of mRNA transcript isoforms does not necessarily correlate with the abundance of protein isoforms.[6] Three-dimensional protein structure comparisons can help determine which isoforms, if any, represent functional protein products, and the structures of most isoforms in the human proteome have been predicted by AlphaFold and publicly released at isoform.io.[7] The specificity of translated isoforms derives from the protein's structure and function, as well as the cell type and developmental stage in which they are produced.[4][5] Determining specificity becomes more complicated when a protein has multiple subunits and each subunit has multiple isoforms.
For example, 5' AMP-activated protein kinase (AMPK), an enzyme that performs different roles in human cells, has three subunits:[8]
- α, the catalytic domain, has two isoforms, α1 and α2, which are encoded by PRKAA1 and PRKAA2
- β, a regulatory domain, has two isoforms, β1 and β2, which are encoded by PRKAB1 and PRKAB2
- γ, a regulatory domain, has three isoforms, γ1, γ2, and γ3, which are encoded by PRKAG1, PRKAG2, and PRKAG3
In human skeletal muscle, the preferred form is α2β2γ1.[8] But in the human liver, the most abundant form is α1β2γ1.[8]
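The combinatorial complexity described above can be made concrete with a short sketch. It assumes that each AMPK complex contains exactly one isoform of each of the three subunits; the dictionary and function names are illustrative only and do not come from any database or library.

```python
from itertools import product

# Subunit isoforms as listed in the text above.
AMPK_SUBUNIT_ISOFORMS = {
    "α": ["α1", "α2"],
    "β": ["β1", "β2"],
    "γ": ["γ1", "γ2", "γ3"],
}

def enumerate_complexes(subunits):
    """Return every combination that takes one isoform per subunit.

    Assumption: a complex holds exactly one α, one β, and one γ isoform.
    """
    return ["".join(combo) for combo in product(*subunits.values())]

complexes = enumerate_complexes(AMPK_SUBUNIT_ISOFORMS)
print(len(complexes))  # 2 * 2 * 3 = 12 possible combinations
print(complexes)
```

Under this assumption there are 12 possible complexes, of which α2β2γ1 and α1β2γ1 are the two tissue-specific forms mentioned above.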
Mechanism
The primary mechanisms that produce protein isoforms are alternative splicing and variable promoter usage, though modifications due to genetic changes, such as mutations and polymorphisms, are sometimes also considered distinct isoforms.[9]
Alternative splicing is the main
Because splicing is a process that occurs between transcription and translation, its primary effects have mainly been studied through genomics techniques—for example, microarray analyses and RNA sequencing have been used to identify alternatively spliced transcripts and measure their abundances.[9] Transcript abundance is often used as a proxy for the abundance of protein isoforms, though proteomics experiments using gel electrophoresis and mass spectrometry have demonstrated that the correlation between transcript and protein counts is often low, and that one protein isoform is usually dominant.[11] One 2015 study states that the cause of this discrepancy likely occurs after translation, though the mechanism is essentially unknown.[12] Consequently, although alternative splicing has been implicated as an important link between variation and disease, there is no conclusive evidence that it acts primarily by producing novel protein isoforms.[11]
Alternative splicing generally describes a tightly regulated process in which alternative transcripts are intentionally generated by the splicing machinery. However, such transcripts are also produced by splicing errors in a process called "noisy splicing," and are also potentially translated into protein isoforms. Although ~95% of multi-exonic genes are thought to be alternatively spliced, one study on noisy splicing observed that most of the different low-abundance transcripts are noise, and predicts that most alternative transcript and protein isoforms present in a cell are not functionally relevant.[13]
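As a toy illustration of how alternative splicing can yield more than one protein from a single gene, the sketch below models exon skipping, one common form of alternative splicing. The exon sequences are hypothetical and chosen only for this example; each is a multiple of three nucleotides long so that skipping the middle exon keeps the reading frame. The codon table is a small subset of the standard genetic code covering only the codons used here.

```python
# Subset of the standard genetic code; "*" marks a stop codon.
CODON_TABLE = {
    "ATG": "M", "GCT": "A", "AAA": "K", "GGT": "G",
    "CTG": "L", "GAA": "E", "TAA": "*",
}

# Hypothetical exons (not taken from any real gene).
EXONS = {
    "E1": "ATGGCT",     # Met-Ala
    "E2": "AAAGGT",     # Lys-Gly, the exon that may be skipped
    "E3": "CTGGAATAA",  # Leu-Glu-stop
}

def splice(exon_names):
    """Join the chosen exons, in order, into one coding sequence."""
    return "".join(EXONS[name] for name in exon_names)

def translate(cds):
    """Translate a coding sequence codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

full_isoform = translate(splice(["E1", "E2", "E3"]))
skipped_isoform = translate(splice(["E1", "E3"]))
print(full_isoform, skipped_isoform)
```

The two transcripts share their first and last exons but encode proteins of different length and composition, which is the sense in which they are isoforms of one gene product.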
Other transcriptional and post-transcriptional regulatory steps can also produce different protein isoforms.[14] Variable promoter usage occurs when the transcriptional machinery of a cell (RNA polymerase, transcription factors, and other enzymes) begins transcription at different promoters, the regions of DNA near a gene that serve as initial binding sites, resulting in slightly modified transcripts and protein isoforms.
Characteristics
Generally, one protein isoform is labeled as the canonical sequence based on criteria such as its prevalence and similarity to orthologous (or functionally analogous) sequences in other species.[15] Isoforms are assumed to have similar functional properties, as most have similar sequences and share some to most exons with the canonical sequence. However, some isoforms show much greater divergence (for example, through trans-splicing) and can share few to no exons with the canonical sequence. In addition, they can have different biological effects (in an extreme case, one isoform can promote cell survival while another promotes cell death) or can have similar basic functions but differ in their subcellular localization.[16] A 2016 study, however, functionally characterized all the isoforms of 1,492 genes and determined that most isoforms behave as "functional alloforms." The authors concluded that isoforms behave like distinct proteins after observing that the functions of most isoforms did not overlap.[17] Because the study was conducted on cells in vitro, it is not known whether the isoforms in the expressed human proteome share these characteristics. Additionally, because the function of each isoform must generally be determined separately, most identified and predicted isoforms still have unknown functions.
Related concepts
Glycoform
A glycoform is an isoform of a protein that differs only with respect to the number or type of attached
Examples
- G-actin: despite its conserved nature, it has a varying number of isoforms (at least six in mammals).
- Creatine kinase, the presence of which in the blood can be used as an aid in the diagnosis of myocardial infarction, exists in 3 isoforms.
- Hyaluronan synthase, the enzyme responsible for the production of hyaluronan, has three isoforms in mammalian cells.
- UDP-glucuronosyltransferase, an enzyme superfamily responsible for the detoxification pathway of many drugs, environmental pollutants, and toxic endogenous compounds, has 16 known isoforms encoded in the human genome.[18]
- G6PDA: in the cells of any tissue, the normal ratio of the active isoforms G6PDA and G6PDG is 1:1. Hyperplasia preserves this ratio, whereas only one of the two isoforms is found in neoplasia.[19]
- Monoamine oxidase, a family of enzymes that catalyze the oxidation of monoamines, exists in two isoforms, MAO-A and MAO-B.
References
- S2CID 2724843.
- PMID 19740416.
- ISBN 9783527636778.
- PMID 2891362.
- PMID 3304142.
- PMID 27104977.
- PMID 36519529.
- PMID 26711141.
- S2CID 54560052.
- PMID 25784052.
- PMID 27712956.
- PMID 25657249.
- PMID 21151575.
- PMID 23443629.
- PMID 25265570.
- PMID 20943952.
- PMID 26871637.
- PMID 17263731.
- Pathoma, Fundamentals of Pathology