Protein superfamily

Source: Wikipedia, the free encyclopedia.

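A protein superfamily is the largest grouping (clade) of proteins for which common ancestry can be inferred. The term protein clan is commonly used for protease and glycosyl hydrolase superfamilies based on the MEROPS and CAZy classification systems.^[2]^[3]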

Identification

Superfamilies of proteins are identified using a number of methods. Closely related members can be identified by different methods to those needed to group the most evolutionarily divergent members.

Sequence similarity

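Historically, the similarity of different amino acid sequences has been the most common method of inferring homology. The most conserved sequence regions of a protein often correspond to functionally important regions such as catalytic sites and binding sites, since these regions are less tolerant to sequence changes.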

Using sequence similarity to infer homology has several limitations. There is no minimum level of sequence similarity guaranteed to produce identical structures. Over long periods of evolution, related proteins may show no detectable sequence similarity to one another. Sequences with many insertions and deletions can also be difficult to align, which makes it hard to identify the homologous sequence regions. In the PA clan of proteases, for example, not a single residue is conserved through the superfamily, not even those in the catalytic triad. Conversely, the individual families that make up a superfamily are defined on the basis of their sequence alignment, for example the C04 protease family within the PA clan.

Nevertheless, sequence similarity is the most commonly used form of evidence to infer relatedness, since the number of known sequences vastly outnumbers the number of known tertiary structures.^[6] In the absence of structural information, sequence similarity constrains the limits of which proteins can be assigned to a superfamily.^[6]
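As a rough illustration of how sequence similarity is quantified, the following minimal Python sketch computes percent identity between two sequences that have already been aligned, and lists the alignment columns that are fully conserved. The toy alignment and the function names are hypothetical and chosen only for illustration; real analyses use dedicated alignment tools and substitution matrices rather than simple identity counts.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def conserved_columns(alignment):
    """Indices of columns where every sequence has the same non-gap residue."""
    return [
        i
        for i, column in enumerate(zip(*alignment))
        if column[0] != "-" and len(set(column)) == 1
    ]


# Hypothetical toy alignment (not taken from any real protein family).
toy_alignment = [
    "GDSGGPLV",
    "GDSGSPLI",
    "GDSG-PMV",
]

identity_01 = percent_identity(toy_alignment[0], toy_alignment[1])
conserved = conserved_columns(toy_alignment)
```

Here the first two toy sequences are identical at 6 of 8 aligned positions, and the conserved columns are the kind of region that, in a real family, might correspond to a catalytic or binding site.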

Structural similarity

Figure: structures of West Nile virus protease (1fp7), exfoliatin toxin (1exf), HtrA protease (1l1j), snake venom plasminogen activator (1bqy), chloroplast protease (4fln) and equine arteritis virus protease (1mbm).

Structural alignment programs, such as DALI, use the 3D structure of a protein of interest to find proteins with similar folds.^[10] However, on rare occasions, related proteins may evolve to be structurally dissimilar,^[11] and relatedness can then only be inferred by other methods.^[12]^[13]^[14]
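One way to see why 3D structure can be compared independently of how a protein is oriented in space is to compare matrices of internal distances. The minimal Python sketch below is not DALI's algorithm; it only illustrates the general idea of a distance-matrix comparison, assuming two structures with the same number of residues in a known one-to-one correspondence. The coordinates and function names are hypothetical.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between all points (e.g. C-alpha atoms)."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def mean_distance_difference(coords_a, coords_b):
    """Mean absolute difference between corresponding internal distances."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of points")
    da = distance_matrix(coords_a)
    db = distance_matrix(coords_b)
    n = len(coords_a)
    pairs = n * (n - 1) // 2
    if pairs == 0:
        return 0.0
    total = sum(
        abs(da[i][j] - db[i][j]) for i in range(n) for j in range(i + 1, n)
    )
    return total / pairs


# Hypothetical toy coordinates, in angstroms.
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 1.0)]

# The same toy structure rotated about the z axis and translated.
angle = math.radians(40.0)
moved = [
    (
        x * math.cos(angle) - y * math.sin(angle) + 5.0,
        x * math.sin(angle) + y * math.cos(angle) - 2.0,
        z + 1.0,
    )
    for x, y, z in toy
]

# A distorted copy, stretched along x.
stretched = [(2.0 * x, y, z) for x, y, z in toy]

same_fold_score = mean_distance_difference(toy, moved)
distorted_score = mean_distance_difference(toy, stretched)
```

Because internal distances do not change under rotation or translation, the moved copy scores essentially zero while the stretched copy does not. Real structural alignment must additionally work out which residues correspond, which this sketch assumes is already known.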

Mechanistic similarity

Some catalytic mechanisms have convergently evolved multiple times independently, and so occur in separate superfamilies,^[18]^[19]^[20] while some superfamilies display a range of different (though often chemically similar) mechanisms.^[15]^[21]

Evolutionary significance

Protein superfamilies represent the current limits of our ability to identify common ancestry. They are the largest evolutionary grouping based on direct evidence that is currently possible. They are therefore amongst the most ancient evolutionary events currently studied. Some superfamilies have members present in all kingdoms of life, indicating that the last common ancestor of that superfamily was in the last universal common ancestor of all life (LUCA).^[23]

Superfamily members may be in different species, with the ancestral protein being the form of the protein that existed in the ancestral species (orthology). Conversely, the proteins may be in the same species, but evolved from a single protein whose gene was duplicated in the genome (paralogy).

Diversification

A majority of proteins contain multiple domains. Between 66% and 80% of eukaryotic proteins have multiple domains, compared with about 40% to 60% of prokaryotic proteins.^[5] Over time, many of the superfamilies of domains have mixed together. In fact, it is very rare to find "consistently isolated superfamilies".^[5]^[1] When domains do combine, the N- to C-terminal domain order (the "domain architecture") is typically well conserved. Additionally, the number of domain combinations seen in nature is small compared to the number of possibilities, suggesting that selection acts on all combinations.^[5]
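The comparison between observed and possible domain combinations can be made concrete with a small count. The Python sketch below assumes a hypothetical toy set of proteins, each described by its N- to C-terminal domain order; the protein names and their architectures are invented for illustration and are not real annotation data.

```python
# Hypothetical toy annotations: protein name -> domain architecture (N- to C-terminal).
proteins = {
    "protA": ("kinase",),
    "protB": ("SH3", "SH2", "kinase"),
    "protC": ("SH3", "SH2", "kinase"),
    "protD": ("SH2", "kinase"),
    "protE": ("PH",),
}


def multidomain_fraction(proteins):
    """Fraction of proteins that contain more than one domain."""
    multi = sum(1 for domains in proteins.values() if len(domains) > 1)
    return multi / len(proteins)


def observed_adjacent_pairs(proteins):
    """Set of ordered (N-terminal, C-terminal) pairs of neighbouring domains."""
    pairs = set()
    for domains in proteins.values():
        pairs.update(zip(domains, domains[1:]))
    return pairs


def possible_ordered_pairs(proteins):
    """Number of ordered pairs that could be formed from the domain types present."""
    kinds = {d for domains in proteins.values() for d in domains}
    return len(kinds) ** 2


fraction = multidomain_fraction(proteins)
observed = observed_adjacent_pairs(proteins)
possible = possible_ordered_pairs(proteins)
```

In this toy set only 2 of the 16 possible ordered pairs of neighbouring domains are observed, which mirrors, at a much smaller scale, the pattern of few observed combinations relative to the number of possibilities.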

Examples

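α/β hydrolase superfamily: Members share an α/β sheet, containing 8 strands connected by helices, with catalytic triad residues in the same order. Activities include proteases, lipases, peroxidases, esterases, epoxide hydrolases and dehalogenases.^[25]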
Alkaline phosphatase superfamily: Members share an αβα sandwich structure^[26] as well as performing common promiscuous reactions by a common mechanism.^[27]
Globin superfamily: Members share an 8-alpha helix globular globin fold.^[28]^[29]
Immunoglobulin superfamily: Members share a sandwich-like structure of two sheets of antiparallel β strands (Ig-fold), and are involved in recognition, binding, and adhesion.^[30]^[31]
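PA clan: Members share a chymotrypsin-like double β-barrel fold and similar proteolysis mechanisms but sequence identity of <10%. The clan contains both cysteine and serine proteases (different nucleophiles).^[2]^[32]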
Ras superfamily: Members share a common catalytic G domain of a 6-strand β sheet surrounded by 5 α-helices.^[33]
RSH superfamily: Members share the capability to hydrolyze and/or synthesize ppGpp alarmones in the stringent response.^[34]
Serpin superfamily: Members share a high-energy, stressed fold which can undergo a large conformational change, which is typically used to inhibit serine proteases and cysteine proteases by disrupting their structure.^[9]
TIM barrel superfamily: Members share a large α₈β₈ barrel structure. It is one of the most common protein folds, and the monophyly of this superfamily is still contested.^[35]^[36]

Protein superfamily resources

Several biological databases document protein superfamilies and protein folds, for example:

Pfam - Protein families database of alignments and HMMs
PROSITE - Database of protein domains, families and functional sites
PIRSF - SuperFamily Classification System
PASS2 - Protein Alignment as Structural Superfamilies v2
SUPERFAMILY - Library of HMMs representing superfamilies and database of (superfamily and family) annotations for all completely sequenced organisms
CATH - Classifications of protein structures into superfamilies, families and domains

Similarly, there are algorithms that search the PDB for proteins with structural homology to a target structure, for example:

DALI - Structural alignment based on a distance alignment matrix method

References

[1] PMID 20457744.
[2] PMID 22086950.
[3] PMID 8687420.
[4] "Clustal FAQ #Symbols". Clustal. Archived from the original on 24 October 2016. Retrieved 8 December 2014.
[5] S2CID 13762291.
[6] PMID 11752317.
[7] PMID 15954844.
[8] PMID 22427707.
[9] PMID 11435447.
[10] PMID 27131377.
[11] PMID 19325884.
[12] S2CID 14936647.
[13] PMID 15604105.
[14] PMID 20591649.
[15] ISBN 9789402410679.
[16] PMID 26781812.
[17] PMID 26097079.
[18] PMID 23382230.
[19] PMID 12691742.
[20] PMID 25575902.
[21] PMID 24271399.
[22] PMID 15741509.
[23] S2CID 25258028.
[24] PMID 19508187.
[25] PMID 10607665.
[26] "SCOP". Archived from the original on 29 July 2014. Retrieved 28 May 2014.
[27] PMID 22885024.
[28] ISBN 978-0815323051.
[29] PMID 2926816.
[30] PMID 7932691.
[31] PMID 8574878.
[32] PMID 3186696.
[33] S2CID 6636339.
[34] PMID 21858139.
[35] PMID 12206759.
[36] doi:10.1016/S0959-440X(05)80114-9.

External links

Media related to Protein superfamilies at Wikimedia Commons


Retrieved from "https://en.wikipedia.org/w/index.php?title=Protein_superfamily&oldid=1187408315"