ORF1ab
Replicase polyprotein | |
---|---|
Symbol | rep |
UniProt | P0C6X7 |

Replicase polyprotein | |
---|---|
Symbol | rep |
UniProt | P0DTD1 |

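ORF1ab (also ORF1a/b) refers collectively to two open reading frames (ORFs), ORF1a and ORF1b, found in the genomes of nidoviruses, a group that includes coronaviruses. The two ORFs are related by a programmed ribosomal frameshift that allows the ribosome to continue translating past the stop codon at the end of ORF1a, in a -1 reading frame. The resulting polyproteins are known as pp1a and pp1ab.[1][2][3][4]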
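The effect of a -1 frameshift on the reading frame can be illustrated with a toy model. The sketch below is a simplification, not a model of the real mechanism: the RNA string, the function name `read_codons`, and the `slip_after` parameter are all hypothetical, and the "slip" is represented simply as the ribosome stepping back one nucleotide at a chosen position.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Hypothetical toy RNA, not a real viral sequence. In the starting frame
# it reads AUG GCU UUA AAC GGG UUU GCG GUG UAA ..., so translation would
# normally end at the UAA stop codon.
TOY_RNA = "AUGGCUUUAAACGGGUUUGCGGUGUAAGUGCAGCC"


def read_codons(rna, start=0, slip_after=None):
    """Return the codons read from `start` until a stop codon or the end.

    If `slip_after` is given, the reader steps back one nucleotide once it
    has consumed exactly that many nucleotides (a -1 frameshift).
    """
    codons = []
    pos = start
    while pos + 3 <= len(rna):
        codon = rna[pos:pos + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
        pos += 3
        if slip_after is not None and pos == slip_after:
            pos -= 1  # re-read the last nucleotide: -1 reading frame
    return codons


in_frame = read_codons(TOY_RNA)                 # stops at UAA
shifted = read_codons(TOY_RNA, slip_after=12)   # slips after AAC
print(len(in_frame), "codons without the frameshift:", " ".join(in_frame))
print(len(shifted), "codons with the frameshift:", " ".join(shifted))
```

In this toy sequence the unshifted read ends at the stop codon, while the shifted read no longer sees that stop in frame and continues to the end of the string, which is the sense in which the frameshift lets translation continue from ORF1a into ORF1b.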
Expression
Genomic organisation of isolate Wuhan-Hu-1, the earliest sequenced sample of SARS-CoV-2, indicating the location of ORF1a and ORF1b

NCBI genome ID | 86693 |
---|---|
Genome size | 29,903 bases |
Year of completion | 2020 |

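ORF1a is the first open reading frame at the 5' end of the genome. Nidoviruses rely on subgenomic RNAs to express many of the genes in their relatively large RNA genomes (typically 27-32 kb for coronaviruses[1]), but ORF1ab is translated directly from the genomic RNA.[5] ORF1ab sequences have been observed in noncanonical subgenomic RNAs, though their functional significance is unclear.[5]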
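A programmed ribosomal frameshift at the end of ORF1a produces the longer polyprotein pp1ab; it occurs at a slippery sequence followed by a pseudoknot RNA secondary structure.[1] Frameshift efficiency has been measured at 20-50% for murine coronavirus[6] and 45-70% in SARS-CoV-2,[7] yielding a stoichiometry of roughly 1.5 to 2 times as much pp1a as pp1ab protein expressed.[2]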
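The link between frameshift efficiency and polyprotein stoichiometry can be sketched with simple arithmetic. The example below assumes an idealised model that is not taken from the cited sources: every ribosome that frameshifts yields one pp1ab, every ribosome that does not yields one pp1a, and differences in protein stability or premature termination are ignored. The function names are hypothetical.

```python
def pp1a_to_pp1ab_ratio(frameshift_efficiency):
    """Ratio of pp1a to pp1ab under the idealised model.

    A fraction f of ribosomes frameshift (pp1ab) and 1 - f do not (pp1a),
    so the ratio is (1 - f) / f.
    """
    if not 0 < frameshift_efficiency <= 1:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return (1 - frameshift_efficiency) / frameshift_efficiency


def efficiency_for_ratio(ratio):
    """Inverse relation: the efficiency that gives a pp1a:pp1ab ratio."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return 1 / (1 + ratio)


for f in (0.20, 0.45, 0.50, 0.70):
    print(f"efficiency {f:.0%} -> pp1a:pp1ab = {pp1a_to_pp1ab_ratio(f):.2f}")

for r in (1.5, 2.0):
    print(f"ratio {r} -> efficiency = {efficiency_for_ratio(r):.1%}")
```

Under this simplification, a ratio of 1.5 to 2 corresponds to an efficiency of about 33-40%, whereas the upper end of the measured ranges would give ratios near or below 1. The measured efficiencies and the quoted stoichiometry come from different sources, so they need not coincide under such a simple model.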
Processing
The polyproteins pp1a and pp1ab contain about 13 to 17 nonstructural proteins.[3] They undergo auto-proteolysis, releasing the nonstructural proteins through the action of internal cysteine protease domains.[1][2][3]
In coronaviruses, there are a total of 16 nonstructural proteins; pp1a contains nsp1-11, and pp1ab contains nsp1-10 and nsp12-16. The papain-like protease domain or domains in nsp3 perform the N-terminal cleavages, and the 3CL protease (also known as the main protease, nsp5) performs the remaining cleavages, from nsp5 through the polyprotein C-terminus.[1][2] Proteins nsp12-16, the C-terminal components of the pp1ab polyprotein, contain the core enzymatic activities necessary for viral replication.[1] After proteolytic processing, several of the nonstructural proteins assemble into a large protein complex known as the replicase-transcriptase complex (RTC), which performs genome replication and transcription.[1][2]
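The composition of the two coronavirus polyproteins can be written out as a small data structure. This is only an illustrative sketch: it assumes the usual coronavirus layout in which pp1a carries nsp1-11 and pp1ab carries nsp1-10 followed by nsp12-16, and the variable names are hypothetical.

```python
# Assumed coronavirus layout: pp1a = nsp1-11, pp1ab = nsp1-10 + nsp12-16.
ALL_NSPS = [f"nsp{i}" for i in range(1, 17)]
PP1A = [f"nsp{i}" for i in range(1, 12)]
PP1AB = [f"nsp{i}" for i in range(1, 11)] + [f"nsp{i}" for i in range(12, 17)]

# Proteins produced only when the frameshift occurs, and only when it does not.
only_pp1ab = [nsp for nsp in PP1AB if nsp not in PP1A]
only_pp1a = [nsp for nsp in PP1A if nsp not in PP1AB]
shared = [nsp for nsp in PP1A if nsp in PP1AB]

print("shared by both polyproteins:", ", ".join(shared))
print("only in pp1a:", ", ".join(only_pp1a))
print("only in pp1ab:", ", ".join(only_pp1ab))
```

Laid out this way, the proteins found only in pp1ab are nsp12-16, the ones described above as carrying the core enzymatic activities, so their production depends on the frameshift.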
Components
Core replicase domains
A set of five core replicase domains is conserved across nidoviruses: from ORF1a, the main protease flanked on either end by transmembrane domains; and from ORF1b, a nucleotidyltransferase domain known as NiRAN, RNA-dependent RNA polymerase (RdRp), a zinc-binding domain, and a helicase.[3][9] (This is sometimes considered seven domains, counting the transmembrane regions separately.[4]) In addition, an endoribonuclease domain is found in all nidoviruses that infect vertebrate hosts. Arteriviruses, which have smaller genomes than the other nidovirus lineages, lack methyltransferases as well as the proofreading exoribonuclease, a domain that is conserved in nidoviruses with larger genomes.[3] This proofreading functionality is thought to be required for sufficient fidelity to replicate large RNA genomes, but may also play additional roles in some viruses.[9]
In coronaviruses, pp1a and pp1ab together contain sixteen nonstructural proteins, which have the following functions:[1][2][10][11]
Nonstructural protein | Function |
---|---|
nonstructural protein 1 | Host cell translation inhibition; interferon inhibition; not present in Gammacoronavirus |
nonstructural protein 2 | Unknown; binds prohibitin |
nonstructural protein 3 | Multi-domain protein with one or two papain-like protease domains for polyprotein processing; interferon antagonist; multiple other roles |
nonstructural protein 4 | Double-membrane vesicle formation |
nonstructural protein 5 | 3CL protease for polyprotein processing; interferon inhibition |
nonstructural protein 6 | Double-membrane vesicle formation |
nonstructural protein 7 | Cofactor for RdRp; forms complex with nsp8 and nsp12 |
nonstructural protein 8 | Cofactor for RdRp; forms complex with nsp7 and nsp12 |
nonstructural protein 9 | Single-stranded RNA binding |
nonstructural protein 10 | Cofactor for nsp14 and nsp16 |
nonstructural protein 11 | Unknown |
nonstructural protein 12 | RNA-dependent RNA polymerase (RdRp) and nucleotidyltransferase |
nonstructural protein 13 | Helicase; RNA triphosphatase |
nonstructural protein 14 | Proofreading exonuclease; RNA cap formation; guanosine N7-methyltransferase |
nonstructural protein 15 | Endoribonuclease; immune evasion function |
nonstructural protein 16 | Ribose 2'-O-methyltransferase; RNA cap formation |
Evolution
The structure and organization of the genome, including ORF1a, ORF1b, and the frameshift between them, is conserved among nidoviruses, although some have non-canonical genomes involving gene fusions.[4] The largest known nidovirus, planarian secretory cell nidovirus (PSCNV), with a 41 kb genome, has a non-canonical genome structure in which ORF1a, ORF1b, and downstream ORFs containing structural proteins are fused and expressed as a single large ORF encoding a polyprotein of over 13,000 amino acids.[4][12] In these non-canonical genomes, other frameshift locations or stop codon readthrough may be used to regulate the stoichiometry of viral proteins.[4]
Nidoviruses vary widely in genome size, from arteriviruses with typically 12-15 kb genomes to coronaviruses at 27-32 kb. Their evolutionary history has been of research interest in understanding the replication of very large RNA genomes despite the relatively low-fidelity replication mechanism of the viral RNA-dependent RNA polymerase (RdRp).[4] The larger nidovirus genomes (above around 20 kb[3]) encode a proofreading exoribonuclease (nsp14 in coronaviruses) thought to be required for replication fidelity.[9][1]
Among coronaviruses, the SARS-CoV-2 genome has been sequenced many times, resulting in the identification of thousands of distinct variants. In a World Health Organization analysis from July 2020, ORF1ab was the most frequently mutated gene, followed by the S gene encoding the spike protein. The most commonly mutated protein within ORF1ab was papain-like protease (nsp3), and the single most commonly observed missense mutation was in RNA-dependent RNA polymerase.[13] Some PCR tests that detect COVID-19 analyze the specimen for the ORF1ab gene, among others.[14]
References
- PMID 32661197.
- PMID 33116300.
- PMID 28174054.
- PMID 33413979.
- PMID 33713597.
- PMID 26919232.
- S2CID 221624633.
- PMID 24348241.
- PMID 31440227.
- PMID 33242646.
- PMID 31967327.
- S2CID 53872740.
- PMID 32742035.
- Richardson, Robin (August 22, 2021). "Open Wide". The Marshall News Messenger. pp. A1, A2. Retrieved 21 November 2022.