ORF1ab

Replicase polyprotein
Identifiers
Organism	SARS-CoV-2
Symbol	rep
UniProt	P0DTD1

Replicase polyprotein
Identifiers
Organism	SARS-CoV
Symbol	rep
UniProt	P0C6X7

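ORF1ab (also ORF1a/b) refers collectively to two open reading frames (ORFs), ORF1a and ORF1b, found in the genomes of nidoviruses, a group of viruses that includes coronaviruses. The two ORFs are related by a programmed ribosomal frameshift that allows the ribosome to continue translating past the stop codon at the end of ORF1a, in a -1 reading frame. The resulting polyproteins are known as pp1a and pp1ab.^[1]^[2]^[3]^[4]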
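
As an illustration of what reading "in a -1 reading frame" means, the following minimal Python sketch splits a toy RNA string into codons, once in the original frame and once with a one-nucleotide backward slip. The sequence, the function names, and the slip position are hypothetical and chosen only for illustration; the toy string merely embeds the heptamer UUUAAAC, which is commonly described as the coronavirus slippery sequence, and is not a viral sequence.

```python
STOPS = {"UAA", "UAG", "UGA"}

def codons(seq, start=0):
    """Split seq into complete triplets beginning at index start."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]

def read_with_minus1_shift(seq, slip_index):
    """Read frame-0 codons up to slip_index, then back up one nucleotide.

    slip_index is the 0-based index (a multiple of 3) where the next
    frame-0 codon would have started; after the slip, reading resumes
    at slip_index - 1, so one nucleotide is read twice.
    """
    return codons(seq[:slip_index]) + codons(seq, slip_index - 1)

def read_until_stop(codon_list):
    """Return codons up to, but not including, the first stop codon."""
    out = []
    for codon in codon_list:
        if codon in STOPS:
            break
        out.append(codon)
    return out

# Toy sequence (hypothetical): frame 0 runs into UAA, the -1 frame does not.
toy = "AUGGCUUUAAACGGGUAAGCUGCAUCC"

frame0 = read_until_stop(codons(toy))
shifted = read_until_stop(read_with_minus1_shift(toy, 12))

print("frame 0 :", frame0)
print("-1 shift:", shifted)
```

In this toy case the unshifted read terminates at the stop codon, while the shifted read continues past it, loosely analogous to the relationship between pp1a and pp1ab.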

Expression

Genomic information
Genomic organisation of isolate Wuhan-Hu-1, the earliest sequenced sample of SARS-CoV-2, indicating the location of ORF1a and ORF1b
NCBI genome ID	86693
Genome size	29,903 bases
Year of completion	2020

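ORF1a is the first open reading frame at the 5' end of the genome. Together, ORF1a and ORF1b occupy about two-thirds of the genome. Nidoviruses express the genes downstream of ORF1ab from subgenomic RNAs, a strategy associated with their relatively large RNA genomes (typically 27-32 kb for coronaviruses^[1]), but ORF1ab is translated directly from the genomic RNA.^[5] ORF1ab sequences have been observed in noncanonical subgenomic RNAs, though their functional significance is unclear.^[5]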

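A programmed ribosomal frameshift allows translation to continue past the stop codon that terminates ORF1a, in a -1 reading frame, producing the longer polyprotein pp1ab. The frameshift occurs at a slippery sequence followed by a pseudoknot RNA secondary structure.^[1] Frameshifting efficiency has been measured at 20-50% for murine coronavirus^[6] and 45-70% for SARS-CoV-2,^[7] yielding a stoichiometry of roughly 1.5 to 2 times as much pp1a as pp1ab protein expressed.^[2]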
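
How a frameshifting efficiency translates into a protein ratio depends on what is being counted. The sketch below is a deliberately simplified model, not taken from the cited sources: it assumes every ribosome that initiates on ORF1a reaches the frameshift site, a fraction `efficiency` of them shift and make pp1ab, and the rest terminate and make pp1a. It reports two ratios under those assumptions: whole pp1a polyproteins per pp1ab polyprotein, and ORF1a-encoded products (made by every ribosome) per ORF1b-encoded product (made only by shifted ribosomes). The function and key names are hypothetical.

```python
def frameshift_stoichiometry(efficiency):
    """Return product ratios for a given -1 frameshift efficiency.

    efficiency: fraction of translating ribosomes that shift frame
    (strictly between 0 and 1). Simplifying assumption: no ribosome
    drop-off before the frameshift site and no differential degradation.
    """
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be strictly between 0 and 1")
    pp1ab = efficiency        # ribosomes that shift and continue into ORF1b
    pp1a = 1.0 - efficiency   # ribosomes that stop at the end of ORF1a
    return {
        "pp1a_per_pp1ab": pp1a / pp1ab,
        "orf1a_per_orf1b_products": 1.0 / efficiency,
    }

# Efficiencies spanning the ranges quoted in the text.
for e in (0.20, 0.45, 0.50, 0.70):
    ratios = frameshift_stoichiometry(e)
    print(f"efficiency {e:.2f}: "
          f"pp1a/pp1ab = {ratios['pp1a_per_pp1ab']:.2f}, "
          f"ORF1a/ORF1b products = {ratios['orf1a_per_orf1b_products']:.2f}")
```

Measured ratios in infected cells can differ from either figure, since the model ignores protein stability and other effects.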

Processing

Figure: replicase-transcriptase complex.^[8]

The polyproteins pp1a and pp1ab contain about 13 to 17 nonstructural proteins.^[3] They undergo auto-proteolysis, releasing the nonstructural proteins through the action of internal cysteine protease domains.^[1]^[2]^[3]

In coronaviruses, there are a total of 16 nonstructural proteins; pp1a contains nsp1-11, and pp1ab contains nsp1-10 and nsp12-16. Processing is carried out by two proteases: the papain-like protease domain in nsp3 performs the cleavages up to nsp4, and the 3CL protease (also known as the main protease, nsp5) performs the remaining cleavages of nsp5 through the polyprotein C-terminus.^[1]^[2] Proteins nsp12-16, the C-terminal components of the pp1ab polyprotein, contain the core enzymatic activities necessary for viral replication.^[1] After proteolytic processing, several of the nonstructural proteins assemble into a large protein complex known as the replicase-transcriptase complex (RTC), which performs genome replication and transcription.^[1]^[2]
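
The division of labour between the two proteases can be tabulated with a short Python sketch. It assumes the commonly cited coronavirus layout in which pp1a carries nsp1-11 and pp1ab carries nsp1-10 followed by nsp12-16, and in which the papain-like protease in nsp3 handles the junctions up to nsp3/nsp4 while the 3CL protease handles all later ones. The names `PP1A`, `PP1AB`, `junctions`, and `protease_for` are hypothetical, and details vary between coronaviruses (for example, some have two papain-like protease domains).

```python
# Assumed nsp order in each polyprotein (illustrative layout).
PP1A = [f"nsp{i}" for i in range(1, 12)]                      # nsp1..nsp11
PP1AB = ([f"nsp{i}" for i in range(1, 11)]                    # nsp1..nsp10
         + [f"nsp{i}" for i in range(12, 17)])                # nsp12..nsp16

def junctions(nsps):
    """Return adjacent (upstream, downstream) pairs, one per cleavage site."""
    return list(zip(nsps, nsps[1:]))

def protease_for(junction):
    """Assign a junction to a protease by the number of its upstream nsp."""
    upstream_number = int(junction[0][3:])
    # Junctions nsp1/2, nsp2/3, nsp3/4 are assigned to the papain-like protease.
    return "PLpro (nsp3)" if upstream_number <= 3 else "3CLpro (nsp5)"

def cleavage_summary(nsps):
    """Count cleavage sites per protease for one polyprotein."""
    counts = {"PLpro (nsp3)": 0, "3CLpro (nsp5)": 0}
    for junction in junctions(nsps):
        counts[protease_for(junction)] += 1
    return counts

for name, nsps in (("pp1a", PP1A), ("pp1ab", PP1AB)):
    print(name, cleavage_summary(nsps))
```

Under these assumptions, the summary shows that most cleavage sites in either polyprotein fall to the 3CL protease.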

Components

Core replicase domains

A set of five conserved domains is found in all nidoviruses: from ORF1a, the main protease flanked on either end by transmembrane domains; and from ORF1b, a nucleotidyltransferase domain known as NiRAN, RNA-dependent RNA polymerase (RdRp), a zinc-binding domain, and a helicase.^[3]^[9] (This is sometimes considered seven domains, counting the transmembrane regions separately.^[4]) In addition, an endoribonuclease domain is found in all nidoviruses that infect vertebrate hosts. Arteriviruses, which have smaller genomes than the other nidovirus lineages, also lack methyltransferases as well as a proofreading exoribonuclease, a domain that is conserved in nidoviruses with larger genomes.^[3] This proofreading functionality is thought to be required for sufficient fidelity to replicate large RNA genomes, but may also play additional roles in some viruses.^[9]

Coronaviruses

In coronaviruses, pp1a and pp1ab together contain sixteen nonstructural proteins, which have the following functions:^[1]^[2]^[10]^[11]

Nonstructural proteins derived from coronavirus pp1a and pp1ab proteins
Nonstructural protein	Function
nonstructural protein 1	Inhibition of host cell translation; interferon inhibition; not present in Gammacoronavirus
nonstructural protein 2	Unknown; binds prohibitin
nonstructural protein 3	Multi-domain protein with one or two papain-like protease domains for polyprotein processing; interferon antagonist; multiple other roles
nonstructural protein 4	Double-membrane vesicle formation
nonstructural protein 5	3CL protease for polyprotein processing; interferon inhibition
nonstructural protein 6	Double-membrane vesicle formation
nonstructural protein 7	Cofactor for RdRp; forms complex with nsp8 and nsp12
nonstructural protein 8	Cofactor for RdRp; forms complex with nsp7 and nsp12
nonstructural protein 9	Single-stranded RNA binding
nonstructural protein 10	Cofactor for nsp14 and nsp16
nonstructural protein 11	Unknown
nonstructural protein 12	RNA-dependent RNA polymerase (RdRp) and nucleotidyltransferase
nonstructural protein 13	Helicase, RNA triphosphatase
nonstructural protein 14	Proofreading exonuclease, RNA cap formation, guanosine N7-methyltransferase
nonstructural protein 15	Endoribonuclease; immune evasion
nonstructural protein 16	Ribose 2'-O-methyltransferase, RNA cap formation

Evolution

The structure and organization of the genome, including ORF1a, ORF1b, and the programmed ribosomal frameshift between them, is conserved among nidoviruses, with rare exceptions involving gene fusions.^[4] The largest known nidovirus, planarian secretory cell nidovirus (PSCNV), with a 41 kb genome, has a non-canonical genome structure in which ORF1a, ORF1b, and downstream ORFs containing structural proteins are fused and expressed as a single large ORF encoding a polyprotein of over 13,000 amino acids.^[4]^[12] In these non-canonical genomes, other frameshift locations or stop codon readthrough may be used to regulate the stoichiometry of viral proteins.^[4]

Nidoviruses vary widely in genome size, from arteriviruses with typically 12-15 kb genomes to coronaviruses at 27-32 kb. Their evolutionary history has been of research interest for understanding how very large RNA genomes are replicated despite the relatively low-fidelity replication mechanism of the viral RNA-dependent RNA polymerase (RdRp).^[4] The larger nidovirus genomes (above around 20 kb^[3]) encode a proofreading exoribonuclease (nsp14 in coronaviruses) thought to be required for replication fidelity.^[9]^[1]

The SARS-CoV-2 genome has been sequenced many times, resulting in the identification of thousands of distinct variants. In a World Health Organization analysis from July 2020, ORF1ab was the most frequently mutated gene, followed by the S gene encoding the spike protein. The most commonly mutated protein within ORF1ab was papain-like protease (nsp3), and the single most commonly observed missense mutation was in RNA-dependent RNA polymerase.^[13] Some PCR tests that detect COVID-19 analyze the specimen for the ORF1ab gene, among others.^[14]

References

[1] PMID 32661197.
[2] PMID 33116300.
[3] PMID 28174054.
[4] PMID 33413979.
[5] PMID 33713597.
[6] PMID 26919232.
[7] S2CID 221624633.
[8] PMID 24348241.
[9] PMID 31440227.
[10] PMID 33242646.
[11] PMID 31967327.
[12] S2CID 53872740.
[13] PMID 32742035.
[14] Richardson, Robin (August 22, 2021). "Open Wide". The Marshall News Messenger. pp. A1, A2. Retrieved 21 November 2022.

