Alternative splicing
Alternative splicing, or alternative RNA splicing, or differential splicing, is a splicing process during gene expression that allows a single gene to code for multiple proteins. In this process, particular exons of a gene may be included within or excluded from the final, processed messenger RNA (mRNA) produced from that gene.[1] This means the exons are joined in different combinations, leading to different (alternative) mRNA strands. Consequently, the proteins translated from alternatively spliced mRNAs usually contain differences in their amino acid sequence and, often, in their biological functions (see Figure).
Biologically relevant alternative splicing occurs as a normal phenomenon in eukaryotes, where it increases the number of proteins that can be encoded by the genome.[1] In humans, it is widely believed that ~95% of multi-exonic genes are alternatively spliced to produce functional alternative products from the same gene,[2] but many scientists believe that most of the observed splice variants are due to splicing errors and that the actual number of biologically relevant alternatively spliced genes is much lower.[3][4]
Alternative splicing enables the regulated generation of multiple mRNA and protein products from a single gene.[5]
There are numerous modes of alternative splicing observed, of which the most common is exon skipping. In this mode, a particular exon may be included in mRNAs under some conditions or in particular tissues, and omitted from the mRNA in others.[1]
The production of alternatively spliced mRNAs is regulated by a system of
Abnormal variations in splicing are also implicated in disease; a large proportion of human genetic disorders result from splicing variants.[6] Abnormal splicing variants are also thought to contribute to the development of cancer,[8][9][10][11] and splicing factor genes are frequently mutated in different types of cancer.[11]
Discovery
Alternative splicing was first observed in 1977.
In 1981, the first example of alternative splicing in a
Since then, many other examples of biologically relevant alternative splicing have been found in eukaryotes.
In 2021, it was discovered that the genome of adenovirus type 2, the adenovirus in which alternative splicing was first identified, was able to produce a much greater variety of mRNA than previously thought.[21] By using next-generation sequencing technology, researchers were able to update the human adenovirus type 2 transcriptome and report 904 unique mRNAs, produced by the virus through a complex pattern of alternative splicing. Very few of these splice variants have been shown to be functional, a point that the authors raise in their paper.
- "An outstanding question is what roles the menagerie of novel RNAs play or whether they are spurious molecules generated by an overloaded splicing machinery."[21]
Modes
Five basic modes of alternative splicing are generally recognized.[1][2][6][22]
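- Exon skipping or cassette exon: in this case, an exon may be spliced out of the primary transcript or retained. This is the most common mode in mammalian pre-mRNAs.[22]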
- Mutually exclusive exons: One of two exons is retained in mRNAs after splicing, but not both.
- Alternative donor site: An alternative 5' splice junction (donor site) is used, changing the 3' boundary of the upstream exon.
- Alternative acceptor site: An alternative 3' splice junction (acceptor site) is used, changing the 5' boundary of the downstream exon.
- Intron retention: A sequence may be spliced out as an intron or simply retained. This is distinguished from exon skipping because the retained sequence is not flanked by introns. If the retained intron is in the coding region, the intron must encode amino acids in frame with the neighboring exons, or a stop codon or a shift in the reading frame will cause the protein to be non-functional. This is the rarest mode in mammals but the most common in plants.[22][23]
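The five modes listed above can be illustrated by comparing the exon coordinates of two transcripts of the same gene. The following is a minimal sketch under stated assumptions, not an established classification tool: the function name `classify_event`, the representation of exons as inclusive `(start, end)` tuples on the forward strand, and the simple rules are all hypothetical choices made for illustration. It only handles transcript pairs that differ by a single event; anything else is reported as "complex".

```python
def classify_event(iso_a, iso_b):
    """Guess the splicing mode that distinguishes two isoforms.

    Each isoform is a list of (start, end) exon coordinates, inclusive,
    on the forward strand (an assumption of this sketch).
    """
    only_a = sorted(set(iso_a) - set(iso_b))
    only_b = sorted(set(iso_b) - set(iso_a))

    if not only_a and not only_b:
        return "identical"

    # One isoform carries a single extra exon, the other nothing unique.
    if len(only_a) + len(only_b) == 1:
        return "exon skipping"

    if len(only_a) == 1 and len(only_b) == 1:
        (s1, e1), (s2, e2) = only_a[0], only_b[0]
        if s1 == s2:
            # Same exon start, different 3' boundary of the upstream exon.
            return "alternative donor site"
        if e1 == e2:
            # Same exon end, different 5' boundary of the downstream exon.
            return "alternative acceptor site"
        if e1 < s2 or e2 < s1:
            # Two non-overlapping exons, each used by only one isoform.
            return "mutually exclusive exons"

    # Two exons in one isoform correspond to one long exon in the other.
    for merged, split in ((only_a, only_b), (only_b, only_a)):
        if len(merged) == 1 and len(split) == 2:
            (s, e) = merged[0]
            (s1, e1), (s2, e2) = split
            if s == s1 and e == e2 and e1 < s2:
                return "intron retention"

    return "complex"
```

For example, `classify_event([(1, 100), (201, 300)], [(1, 300)])` reports intron retention, because the sequence between positions 100 and 201 is spliced out in one transcript and kept in the other.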
In addition to these primary modes of alternative splicing, there are two other main mechanisms by which different mRNAs may be generated from the same gene; multiple
These modes describe basic splicing mechanisms, but may be inadequate to describe complex splicing events. For instance, the figure to the right shows 3 spliceforms from the mouse hyaluronidase 3 gene. Comparing the exonic structure shown in the first line (green) with the one in the second line (yellow) shows intron retention, whereas the comparison between the second and the third spliceform (yellow vs. blue) exhibits exon skipping. A model nomenclature to uniquely designate all possible splicing patterns has recently been proposed.[22]
Mechanisms
General splicing mechanism
When the pre-mRNA has been transcribed from the
The typical eukaryotic nuclear intron has consensus sequences defining important regions. Each intron has the sequence GU at its 5' end. Near the 3' end there is a branch site. The nucleotide at the branchpoint is always an A; the consensus around this sequence varies somewhat. In humans the branch site consensus sequence is yUnAy.[24] The branch site is followed by a series of pyrimidines – the polypyrimidine tract – then by AG at the 3' end.[6]
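The consensus elements described above can be expressed as a simple sequence scan. This is a rough sketch only: the function name `find_intron_signals`, the 50-nucleotide search window, and the 0.7 pyrimidine-fraction threshold are assumptions chosen for illustration, not values taken from the literature, and real splice-site prediction relies on far richer models.

```python
import re

# Human branch site consensus yUnAy: y = pyrimidine (C or U), n = any base.
BRANCH_SITE = re.compile(r"[CU]U[ACGU]A[CU]")


def find_intron_signals(intron, window=50, min_py_fraction=0.7):
    """Report which consensus elements a candidate intron contains.

    `window` and `min_py_fraction` are illustrative assumptions.
    DNA input is accepted; T is converted to U.
    """
    rna = intron.upper().replace("T", "U")
    result = {
        "gu_5prime": rna.startswith("GU"),
        "ag_3prime": rna.endswith("AG"),
        "branch_a": None,      # index of the branchpoint A, if found
        "py_fraction": None,   # pyrimidine content between branch site and AG
        "py_rich": False,
    }

    # Look for the branch site near the 3' end, excluding the final AG.
    tail_start = max(0, len(rna) - window)
    matches = list(BRANCH_SITE.finditer(rna, tail_start, len(rna) - 2))
    if matches:
        last = matches[-1]                  # match closest to the 3' end
        result["branch_a"] = last.start() + 3
        tract = rna[last.end():len(rna) - 2]
        if tract:
            fraction = sum(base in "CU" for base in tract) / len(tract)
            result["py_fraction"] = fraction
            result["py_rich"] = fraction >= min_py_fraction
    return result
```

The function returns a dictionary rather than a single yes/no answer so that a weak or missing element, such as a poor polypyrimidine tract, can be inspected separately.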
Splicing of mRNA is performed by an RNA and protein complex known as the spliceosome, containing snRNPs designated U1, U2, U4, U5, and U6 (U3 is not involved in mRNA splicing).[25] U1 binds to the 5' GU and U2, with the assistance of the U2AF protein factors, binds to the branchpoint A within the branch site. The complex at this stage is known as the spliceosome A complex. Formation of the A complex is usually the key step in determining the ends of the intron to be spliced out, and defining the ends of the exon to be retained.[6] (The U nomenclature derives from their high uridine content).
The U4,U5,U6 complex binds, and U6 replaces the U1 position. U1 and U4 leave. The remaining complex then performs two
Regulatory elements and proteins
Splicing is regulated by
There are two major types of cis-acting RNA sequence elements present in pre-mRNAs and they have corresponding trans-acting
In general, the determinants of splicing work in an inter-dependent manner that depends on context, so that the rules governing how splicing is regulated form a splicing code.[30] The presence of a particular cis-acting RNA sequence element may increase the probability that a nearby site will be spliced in some cases, but decrease the probability in other cases, depending on context. The context within which regulatory elements act includes cis-acting context that is established by the presence of other RNA sequence features, and trans-acting context that is established by cellular conditions. For example, some cis-acting RNA sequence elements influence splicing only if multiple elements are present in the same region so as to establish context. As another example, a cis-acting element can have opposite effects on splicing, depending on which proteins are expressed in the cell (e.g., neuronal versus non-neuronal PTB). The adaptive significance of splicing silencers and enhancers is attested by studies showing that there is strong selection in human genes against mutations that produce new silencers or disrupt existing enhancers.[31][32]
DNA methylation and alternative splicing in social insects
CpG DNA methylation has been shown to play a role in regulating alternative splicing in social insects.[33][34] In honey bees (Apis mellifera), CpG DNA methylation seems to regulate exon skipping, based on the first few genomic studies[35][36] conducted after the honey bee genome became available.[37] CpG DNA methylation has also been reported to regulate alternative splicing more extensively, affecting not only exon skipping but also intron retention and other splicing events.[38]
Examples
Exon skipping: Drosophila dsx
Pre-mRNAs from the D. melanogaster gene dsx contain 6 exons. In males, exons 1, 2, 3, 5, and 6 are joined to form the mRNA, which encodes a transcriptional regulatory protein required for male development. In females, exons 1, 2, 3, and 4 are joined, and a polyadenylation signal in exon 4 causes cleavage of the mRNA at that point. The resulting mRNA encodes a transcriptional regulatory protein required for female development.[39]
This is an example of exon skipping. The intron upstream from exon 4 has a polypyrimidine tract that doesn't match the consensus sequence well, so that U2AF proteins bind poorly to it without assistance from splicing activators. This 3' splice acceptor site is therefore not used in males. Females, however, produce the splicing activator Transformer (Tra) (see below). The SR protein Tra2 is produced in both sexes and binds to an ESE in exon 4; if Tra is present, it binds to Tra2 and, along with another SR protein, forms a complex that assists U2AF proteins in binding to the weak polypyrimidine tract. U2 is recruited to the associated branchpoint, and this leads to inclusion of exon 4 in the mRNA.[39][40]
Alternative acceptor sites: Drosophila Transformer
Pre-mRNAs of the Transformer (Tra) gene of Drosophila melanogaster undergo alternative splicing via the alternative acceptor site mode. The gene Tra encodes a protein that is expressed only in females. The primary transcript of this gene contains an intron with two possible acceptor sites. In males, the upstream acceptor site is used. This causes a longer version of exon 2 to be included in the processed transcript, including an early stop codon. The resulting mRNA encodes a truncated protein product that is inactive. Females produce the master sex determination protein Sex lethal (Sxl). The Sxl protein is a splicing repressor that binds to an ISS in the RNA of the Tra transcript near the upstream acceptor site, preventing U2AF protein from binding to the polypyrimidine tract. This prevents the use of this junction, shifting the spliceosome binding to the downstream acceptor site. Splicing at this point bypasses the stop codon, which is excised as part of the intron. The resulting mRNA encodes an active Tra protein, which itself is a regulator of alternative splicing of other sex-related genes (see dsx above).[1]
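The regulatory cascade described in the two Drosophila examples (Sxl controls Tra splicing, and Tra together with Tra2 controls dsx splicing) can be written as a small boolean model. This is a deliberately simplified sketch: the function names are hypothetical, each protein is treated as simply present or absent, and quantitative effects are ignored.

```python
def tra_is_active(sxl_present):
    """Return True if the Tra mRNA encodes an active protein.

    Sxl blocks the upstream acceptor site, so the downstream site is used
    and the early stop codon is removed with the intron.
    """
    acceptor = "downstream" if sxl_present else "upstream"
    return acceptor == "downstream"


def dsx_exons(tra_active, tra2_present=True):
    """Return the dsx exons joined in the mature mRNA.

    Exon 4 is included only when Tra and Tra2 together help U2AF bind
    the weak polypyrimidine tract upstream of exon 4.
    """
    if tra_active and tra2_present:
        return (1, 2, 3, 4)       # female-specific mRNA
    return (1, 2, 3, 5, 6)        # male-specific mRNA


def dsx_isoform(sxl_present):
    """Chain the two decisions: Sxl -> Tra -> dsx."""
    return dsx_exons(tra_is_active(sxl_present))
```

Under this model, `dsx_isoform(True)` gives the female exon set and `dsx_isoform(False)` gives the male one, matching the text.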
Exon definition: Fas receptor
Multiple isoforms of the Fas receptor protein are produced by alternative splicing. Two normally occurring isoforms in humans are produced by an exon-skipping mechanism. An mRNA including exon 6 encodes the membrane-bound form of the Fas receptor, which promotes apoptosis, or programmed cell death. Increased expression of Fas receptor in skin cells chronically exposed to the sun, and absence of expression in skin cancer cells, suggests that this mechanism may be important in elimination of pre-cancerous cells in humans.[41] If exon 6 is skipped, the resulting mRNA encodes a soluble Fas protein that does not promote apoptosis. The inclusion or skipping of the exon depends on two antagonistic proteins, TIA-1 and polypyrimidine tract-binding protein (PTB).
- The 5' donor site in the intron downstream from exon 6 in the pre-mRNA agrees only weakly with the consensus sequence, and is not usually bound by the U1 snRNP. If U1 does not bind, the exon is skipped (see "a" in accompanying figure).
- Binding of TIA-1 protein to an intronic splicing enhancer site stabilizes binding of the U1 snRNP.[6] The resulting 5' donor site complex assists in binding of the splicing factor U2AF to the 3' splice site upstream of the exon, through a mechanism that is not yet known (see b).[42]
- Exon 6 contains a pyrimidine-rich exonic splicing silencer, ure6, where PTB can bind. If PTB binds, it inhibits the effect of the 5' donor complex on the binding of U2AF to the acceptor site, resulting in exon skipping (see c).
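The three cases above amount to a small decision rule for exon 6. The sketch below encodes that rule as boolean logic; the function names are hypothetical, binding is treated as all-or-nothing, and the model assumes that U1 binds the weak donor site only when TIA-1 is bound, which simplifies the "not usually bound" wording of the text.

```python
def fas_exon6_included(tia1_bound, ptb_bound):
    """Return True if exon 6 is included in the Fas mRNA.

    Simplifying assumption: the weak 5' donor site is bound by U1 only
    when TIA-1 stabilizes it.
    """
    u1_bound = tia1_bound
    # The donor complex helps U2AF bind the upstream 3' splice site,
    # unless PTB bound to the ure6 silencer blocks that effect.
    u2af_bound = u1_bound and not ptb_bound
    return u1_bound and u2af_bound


def fas_isoform(tia1_bound, ptb_bound):
    """Name the protein isoform encoded by the resulting mRNA."""
    if fas_exon6_included(tia1_bound, ptb_bound):
        return "membrane-bound Fas receptor"
    return "soluble Fas"
```

Only the combination of TIA-1 bound and PTB not bound yields the membrane-bound, pro-apoptotic form in this model.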
This mechanism is an example of exon definition in splicing. A spliceosome assembles on an intron, and the snRNP subunits fold the RNA so that the 5' and 3' ends of the intron are joined. However, recently studied examples such as this one show that there are also interactions between the ends of the exon. In this particular case, these exon definition interactions are necessary to allow the binding of core splicing factors prior to assembly of the spliceosomes on the two flanking introns.[42]
Repressor-activator competition: HIV-1 tat exon 2
Adaptive significance
Genuine alternative splicing occurs in both protein-coding genes and non-coding genes to produce multiple products (proteins or non-coding RNAs). External information is needed in order to decide which product is made, given a DNA sequence and the initial transcript. Since the methods of regulation are inherited, this provides novel ways for mutations to affect gene expression.[10]
Alternative splicing may provide evolutionary flexibility. A single point mutation may cause a given exon to be occasionally excluded or included from a transcript during splicing, allowing production of a new
Research based on the Human Genome Project and other genome sequencing has shown that humans have only about 30% more genes than the roundworm Caenorhabditis elegans, and only about twice as many as the fly Drosophila melanogaster. This finding led to speculation that the perceived greater complexity of humans, or vertebrates generally, might be due to higher rates of alternative splicing in humans than are found in invertebrates.[49][50] However, a study on samples of 100,000 expressed sequence tags (EST) each from human, mouse, rat, cow, fly (D. melanogaster), worm (C. elegans), and the plant Arabidopsis thaliana found no large differences in frequency of alternatively spliced genes among humans and any of the other animals tested.[51] Another study, however, proposed that these results were an artifact of the different numbers of ESTs available for the various organisms. When they compared alternative splicing frequencies in random subsets of genes from each organism, the authors concluded that vertebrates do have higher rates of alternative splicing than invertebrates.[52]
Disease
Changes in the RNA processing machinery may lead to mis-splicing of multiple transcripts, while single-nucleotide alterations in splice sites or cis-acting splicing regulatory sites may lead to differences in splicing of a single gene, and thus in the mRNA produced from a mutant gene's transcripts. A study in 2005 involving probabilistic analyses indicated that greater than 60% of human disease-causing mutations affect splicing rather than directly affecting coding sequences.[53] A more recent study indicates that one-third of all hereditary diseases are likely to have a splicing component.[26] Regardless of exact percentage, a number of splicing-related diseases do exist.[54] As described below, a prominent example of splicing-related diseases is cancer.
Abnormally spliced mRNAs are also found in a high proportion of cancerous cells.[8][9][11] Combined RNA-Seq and proteomics analyses have revealed striking differential expression of splice isoforms of key proteins in important cancer pathways.
In fact, alternative splicing is reduced in cancerous cells compared to normal ones, and the types of splicing differ; for instance, cancerous cells show higher levels of intron retention than normal cells, but lower levels of exon skipping.
One example of a specific splicing variant associated with cancers is in one of the human
Another example is the Ron (
Overexpression of a truncated splice variant of the
Recent provocative studies point to a key function of chromatin structure and histone modifications in alternative splicing regulation. These insights suggest that epigenetic regulation determines not only what parts of the genome are expressed but also how they are spliced.[67]
Genome-scale (transcriptome-wide) analysis
Transcriptome-wide analysis of alternative splicing is typically performed by high-throughput RNA-sequencing, most commonly by short-read sequencing, such as with Illumina instrumentation. Long-read sequencing, such as with Nanopore or PacBio instrumentation, is even more informative. Transcriptome-wide analyses can, for example, be used to measure the amount of deviating alternative splicing, such as in a cancer cohort.[68]
Deep sequencing technologies have been used to conduct genome-wide analyses of both unprocessed and processed mRNAs, thus providing insights into alternative splicing. For example, results from the use of deep sequencing indicate that, in humans, an estimated 95% of transcripts from multiexon genes undergo alternative splicing, with a number of pre-mRNA transcripts spliced in a tissue-specific manner.
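As a toy illustration of how a genome-wide estimate of this kind could be tallied, the sketch below counts the fraction of multi-exon genes that have more than one distinct exon structure among their observed transcripts. The function name, the input layout (a dictionary mapping a gene name to a list of transcripts, each a list of `(start, end)` exons), and the example data are all assumptions made for illustration; published estimates rest on read-level evidence and statistical filtering that this sketch omits.

```python
def fraction_alternatively_spliced(transcripts_by_gene):
    """Fraction of multi-exon genes with more than one exon structure.

    `transcripts_by_gene` maps a gene name to a list of transcripts;
    each transcript is a list of (start, end) exon coordinates.
    """
    multi_exon_genes = 0
    alternatively_spliced = 0
    for transcripts in transcripts_by_gene.values():
        # Collapse duplicate transcripts into distinct exon structures.
        structures = {tuple(t) for t in transcripts}
        if not any(len(s) > 1 for s in structures):
            continue  # single-exon gene: cannot be alternatively spliced
        multi_exon_genes += 1
        if len(structures) > 1:
            alternatively_spliced += 1
    if multi_exon_genes == 0:
        return 0.0
    return alternatively_spliced / multi_exon_genes


# Hypothetical example data, not real annotations.
example = {
    "geneA": [[(1, 100), (201, 300), (401, 500)], [(1, 100), (401, 500)]],
    "geneB": [[(1, 100), (201, 300)], [(1, 100), (201, 300)]],
    "geneC": [[(1, 500)]],
}
```

With the example data, one of the two multi-exon genes has two distinct structures, so the function returns 0.5; the single-exon gene is excluded from the denominator.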
More historically, alternatively spliced transcripts have been found by comparing
In microarray analysis, arrays of DNA fragments representing individual
CLIP (Cross-linking and immunoprecipitation) uses UV radiation to link proteins to RNA molecules in a tissue during splicing. A trans-acting splicing regulatory protein of interest is then precipitated using specific antibodies. When the RNA attached to that protein is isolated and cloned, it reveals the target sequences for that protein.[7] Another method for identifying RNA-binding proteins and mapping their binding to pre-mRNA transcripts is "Microarray Evaluation of Genomic Aptamers by shift (MEGAshift)".[72] This method involves an adaptation of the "Systematic Evolution of Ligands by Exponential Enrichment (SELEX)" method[73] together with a microarray-based readout. Use of the MEGAshift method has provided insights into the regulation of alternative splicing by allowing for the identification of sequences in pre-mRNA transcripts surrounding alternatively spliced exons that mediate binding to different splicing factors, such as ASF/SF2 and PTB.[74] This approach has also been used to aid in determining the relationship between RNA secondary structure and the binding of splicing factors.[28]
Use of reporter assays makes it possible to find the splicing proteins involved in a specific alternative splicing event by constructing reporter genes that will express one of two different fluorescent proteins depending on the splicing reaction that occurs. This method has been used to isolate mutants affecting splicing and thus to identify novel splicing regulatory proteins inactivated in those mutants.[7]
Recent advancements in protein structure prediction have facilitated the development of new tools for genome annotation and alternative splicing analysis. For instance, isoform.io, a platform guided by protein structure predictions, has evaluated hundreds of thousands of isoforms of human protein-coding genes assembled from numerous RNA sequencing experiments across a variety of human tissues. This comprehensive analysis has led to the identification of numerous isoforms with more confidently predicted structure and potentially superior function compared to canonical isoforms in the latest human gene database. By integrating structural predictions with expression and evolutionary evidence, this approach has demonstrated the potential of protein structure prediction as a tool for refining the annotation of the human genome.[75]
Databases
There is a collection of alternative splicing databases.[76][77][78] These databases are useful for finding genes whose pre-mRNAs undergo alternative splicing, for finding alternative splicing events, and for studying the functional impact of alternative splicing.
- AspicDB database
- Intronerator database
- ProSAS database
References
- S2CID 23576288.
- S2CID 9228930.
- S2CID 52113302.
- PMID 27712956.
- S2CID 215805109.
- S2CID 14883495.
- PMID 18245441.
- PMID 17416541.
- PMID 19266097.
- PMID 19048051.
- S2CID 22943729.
- S2CID 2099968.
- PMID 269380.
- PMID 3017190.
- S2CID 44642349.
- S2CID 39704416.
- S2CID 4318349.
- PMID 6952224.
- S2CID 13208589.
- S2CID 13829976.
- PMID 33239457.
- Sammeth M, Foissac S, Guigó R (August 2008). Brent MR (ed.). "A general definition and nomenclature for alternative splicing events". PLOS Computational Biology. 4 (8): e1000147. PMID 18688268.
- PMID 15341630.
- PMID 18285363.
- ISBN 978-0-12-175551-5.
- PMID 21685335.
- PMID 19959365.
- PMID 19861426.
- PMID 18369186.
- S2CID 2398858.
- PMID 18204002.
- PMID 15340491.
- .
- ISBN 9780128025864.
- PMID 21072239.
- PMID 22978521.
- PMID 17073008.
- PMID 23852726.
- PMID 8769651.
- PMID 11421359.
- S2CID 23772719.
- PMID 16109372.
- PMID 14703516.
- PMID 11526107.
- PMID 16717195.
- PMID 24951248.
- PMID 24244129.
- PMID 17916237.
- S2CID 19165121.
- Roest Crollius H, Jaillon O, Bernot A, Dasilva C, Bouneau L, Fischer C, et al. (June 2000). "Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence". Nature Genetics. 25 (2): 235–8. S2CID 44052050.
- Brett D, Pospisil H, Valcárcel J, Reich J, Bork P (January 2002). "Alternative splicing and genome complexity". Nature Genetics. 30 (1): 29–30. S2CID 2724843.
- PMID 17158149.
- S2CID 30174458.
- PMID 19918805.
- PMID 24802673.
- PMID 25109687.
- PMID 21619627.
- PMID 29324392.
- PMID 18054115.
- PMID 16364913.
- PMID 15048092.
- S2CID 17128174.
- PMID 24459410.
DESPITE THE IMPORTANCE OF NUMEROUS PSYCHOSOCIAL FACTORS, AT ITS CORE, DRUG ADDICTION INVOLVES A BIOLOGICAL PROCESS: the ability of repeated exposure to a drug of abuse to induce changes in a vulnerable brain that drive the compulsive seeking and taking of drugs, and loss of control over drug use, that define a state of addiction. ... A large body of literature has demonstrated that such ΔFosB induction in D1-type NAc neurons increases an animal's sensitivity to drug as well as natural rewards and promotes drug self-administration, presumably through a process of positive reinforcement
- S2CID 19157711.
ΔFosB is an essential transcription factor implicated in the molecular and behavioral pathways of addiction following repeated drug exposure. The formation of ΔFosB in multiple brain regions, and the molecular pathway leading to the formation of AP-1 complexes is well understood. The establishment of a functional purpose for ΔFosB has allowed further determination as to some of the key aspects of its molecular cascades, involving effectors such as GluR2 (87,88), Cdk5 (93) and NFkB (100). Moreover, many of these molecular changes identified are now directly linked to the structural, physiological and behavioral changes observed following chronic drug exposure (60,95,97,102). New frontiers of research investigating the molecular roles of ΔFosB have been opened by epigenetic studies, and recent advances have illustrated the role of ΔFosB acting on DNA and histones, truly as a molecular switch (34). As a consequence of our improved understanding of ΔFosB in addiction, it is possible to evaluate the addictive potential of current medications (119), as well as use it as a biomarker for assessing the efficacy of therapeutic interventions (121,122,124). Some of these proposed interventions have limitations (125) or are in their infancy (75). However, it is hoped that some of these preliminary findings may lead to innovative treatments, which are much needed in addiction.
- PMID 23020045.
For these reasons, ΔFosB is considered a primary and causative transcription factor in creating new neural connections in the reward centre, prefrontal cortex, and other regions of the limbic system. This is reflected in the increased, stable and long-lasting level of sensitivity to cocaine and other drugs, and tendency to relapse even after long periods of abstinence. These newly constructed networks function very efficiently via new pathways as soon as drugs of abuse are further taken
- PMID 21459101.
- PMID 21215366.
- PMID 36821799.
- PMID 22705790.
- S2CID 8689111.
- PMID 15610736.
- PMID 19956082.
- PMID 2200121.
- PMID 20015017.
- PMID 36519529.
- PMID 28855263.
- PMID 32976589.
- PMID 23161672.
External links
- A General Definition and Nomenclature for Alternative Splicing Events at SciVee
- AStalavista (Alternative Splicing landscape visualization tool), a method for the computationally exhaustive classification of Alternative Splicing Structures Archived 2009-12-11 at the Wayback Machine
- IsoPred: computationally predicted isoform functions
- Stamms-lab.net: Research Group dealing with alternative Splicing issues and mis-splicing in human diseases
- Alternative Splicing of ion channels in the brain, connected to mental and neurological diseases
- BIPASS: Web Services in Alternative Splicing