Protein family
A protein family is a group of evolutionarily related proteins. In many cases, a protein family has a corresponding gene family, in which each gene encodes a corresponding protein with a 1:1 relationship. The term "protein family" should not be confused with family as it is used in taxonomy.
Proteins in a family descend from a common ancestor and typically have similar
Currently, over 60,000 protein families have been defined,[5] although ambiguity in the definition of "protein family" leads different researchers to report widely varying numbers.
Terminology and usage
As with many biological terms, the use of "protein family" is somewhat context-dependent: it may indicate large groups of proteins with the lowest possible level of detectable sequence similarity, very narrow groups of proteins with almost identical sequence, function, and three-dimensional structure, or any kind of group in between. To distinguish between these situations, the term
Protein domains and motifs
The concept of protein family was conceived when very few protein structures or sequences were known. At the time, the majority of proteins that were structurally understood were small, single-domain proteins such as myoglobin, hemoglobin, and cytochrome c. Since then, many proteins have been found with multiple independent structural and functional units or domains. Due to evolutionary shuffling, different domains in a protein have evolved independently. This has led to a focus on families of protein domains. A number of online resources are devoted to identifying and cataloging such domains.[12][13]
Different regions of a protein have differing functional constraints (features critical to the structure and function of the protein). For example, the
Evolution of protein families
According to current consensus, protein families arise in two ways. First, the separation of a parent species into two genetically isolated descendant species allows a gene/protein to independently accumulate variations (
Certain gene/protein families, especially in
Use and importance of protein families
As the total number of sequenced proteins increases and interest expands in
Protein family resources
Many
- Pfam - Protein families database of alignments and HMMs
- PROSITE - Database of protein domains, families and functional sites
- PIRSF - SuperFamily Classification System
- PASS2 - Protein Alignment as Structural Superfamilies v2 - PASS2@NCBS[17]
- SUPERFAMILY - Library of HMMs representing superfamilies and database of (superfamily and family) annotations for all completely sequenced organisms
- CATH - Classifications of protein structures into superfamilies, families and domains
Similarly, many database-searching algorithms exist, for example:
- BLAST - DNA sequence similarity search
- BLASTp - Protein sequence similarity search
- OrthoFinder - Method for clustering proteins into families (orthogroups)[18][19]
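The general idea of grouping proteins into families by sequence similarity can be illustrated with a minimal sketch. It is not the algorithm used by BLAST or OrthoFinder: it assumes the sequences are already aligned to equal length, scores each pair by simple positional identity, and links pairs above an arbitrary threshold (single-linkage clustering). The names `identity`, `cluster_families`, `TOY`, and the 0.5 threshold are all hypothetical, and the sequences are toy placeholders rather than curated database records.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_families(seqs, threshold=0.5):
    """Single-linkage clustering: two sequences are linked if identity >= threshold."""
    # Union-find structure: every sequence starts as its own group.
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Merge the groups of every pair that passes the (assumed) threshold.
    for a, b in combinations(seqs, 2):
        if identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    # Collect the members of each group.
    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Toy, pre-aligned sequences chosen for illustration only.
TOY = {
    "fam1_a": "MVLSPADKTNVKAAWGKVGA",
    "fam1_b": "MVLSAADKTNVKAAWSKVGG",
    "fam2_a": "GDVEKGKKIFIMKCSQCHTV",
    "fam2_b": "GDVAKGKKTFVQKCAQCHTV",
}

families = cluster_families(TOY)
print(families)
```

With these toy inputs the two `fam1` sequences end up in one group and the two `fam2` sequences in another. Real tools replace the positional identity score with alignment-based scores and statistical significance, and the choice of threshold is one reason different resources report different numbers of families.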
See also
- Gene family
- Genome annotation
- Sequence clustering
References
- EMBL-EBI. Retrieved 2023-11-14.
- ISBN 9781118743089.
- PMID 23749753.
- PMID 27881430.
- PMID 12620116.
- PMID 4435228.
- S2CID 40304076.
- PMID 181273.
- PMID 15954844.
- PMID 15140831.
- S2CID 85641264.
- PMID 33680357.
- ISBN 9781118743089.
- PMID 11806833.
- PMC 10089649.
- PMID 21999478.
- PMID 22123743.
- PMID 26243257.
- PMID 31727128.
External links
- Media related to Protein families at Wikimedia Commons