Orphan gene
Orphan genes,
In some cases, a gene can be classified as an orphan gene due to undersampling of the existing genome space. While it is possible that homologues exist for a given gene, that gene will still be classified as an orphan if the organisms harbouring homologues have not yet been discovered and had their genomes
Some approaches characterize all microbial genes as part of one of two classes of genes. One class is characterized by conservation or partial conservation across lineages, whereas the other (represented by orphan genes) is characterized by evolutionarily instantaneous rates of gene turnover/replacement with a negligible effect on fitness when such genes are either gained or lost. These orphan genes primarily derive from mobile genetic elements and tend to be 'passively selfish', often devoid of cellular functions (which is why they experience little selective pressure in their gain or loss from genomes) but persist in the biosphere due to their transient movement across genomes.[14][15]
Evolution
Orphan genes evolve more rapidly than other genes. They often originate through two primary mechanisms: de novo gene birth, where new genes emerge from non-coding sequences within the genome, and horizontal gene transfer, the acquisition of genetic material from another organism.
Biologists believe orphan genes may play a crucial role in developing species-specific traits, environmental adaptations, or responses to changing ecological niches. These functional innovations necessitate rapid evolutionary changes to optimize their efficacy within the organism's biology.
Multiple studies have supported these evolutionary theories regarding orphan genes. Domazet-Loso and Tautz[16] conducted a study focusing on orphan genes in Drosophila, revealing that these genes evolve at a faster pace compared to conserved genes. This finding suggests a potential correlation between evolutionary rate and gene novelty. Similarly, Tautz and Domazet-Loso[17] presented evidence indicating a substantial contribution of orphan genes to phenotypic diversity and adaptation across different species. Their research underscores the crucial role of orphan genes in driving evolutionary innovation and shaping biological diversity.
History
Orphan genes were first discovered when the yeast genome-sequencing project began in 1996.[2] Orphan genes accounted for an estimated 26% of the yeast genome, but it was believed that these genes could be classified with homologues when more genomes were sequenced.[3] At the time, gene duplication was considered the only serious model of gene evolution[2][4][18] and there were few sequenced genomes for comparison, so a lack of detectable homologues was thought to be most likely due to a lack of sequencing data and not due to a true lack of homology.[3] However, orphan genes continued to persist as the quantity of sequenced genomes grew,[3][19] eventually leading to the conclusion that orphan genes are ubiquitous across genomes.[2] Estimates of the percentage of genes that are orphans vary enormously between species and between studies; 10-30% is a commonly cited figure.[3]
The study of orphan genes emerged largely after the turn of the century. In 2003, a study of Caenorhabditis briggsae and related species compared over 2000 genes.[3] The authors proposed that orphan genes must be evolving too quickly for their homologues to be detected and are consequently sites of very rapid evolution.[3] In 2005, Wilson examined 122 bacterial species to test whether the large number of orphan genes in many species was legitimate.[19] The study found that the orphan genes were legitimate and played a role in bacterial adaptation. The term "taxonomically-restricted genes" was introduced into the literature to make orphan genes seem less "mysterious."[19]
In 2008, a yeast protein of established functionality, BSC4, was found to have evolved de novo from non-coding sequences whose homology was still detectable in sister species.[20]
In 2009, an orphan gene was discovered to regulate an internal biological network: the orphan gene QQS from Arabidopsis thaliana modifies plant composition.[21] The QQS orphan protein interacts with a conserved transcription factor; these data explain the compositional changes (increased protein) that are induced when QQS is engineered into diverse species.[22] In 2011, a comprehensive genome-wide study of the extent and evolutionary origins of orphan genes in plants was conducted in the model plant Arabidopsis thaliana.[23]
Identification
Genes can be tentatively classified as orphans if no orthologous proteins can be found in nearby species.[10]
One method used to estimate nucleotide or protein sequence similarity indicative of homology (i.e. similarity due to common origin) is the
The systematic detection of homology to annotate orphan genes is called phylostratigraphy.[28] Phylostratigraphy generates a phylogenetic tree in which the homology is calculated between all genes of a focal species and the genes of other species. The earliest common ancestor for a gene determines the age, or phylostratum, of the gene. The term "orphan" is sometimes used only for the youngest phylostratum containing only a single species, but when interpreted broadly as a taxonomically-restricted gene, it can refer to all but the oldest phylostratum, with the gene orphaned within a larger clade.
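The assignment step described above can be sketched in a few lines of Python. This is a minimal illustration, not a published pipeline: the clade table `LINEAGE`, the function names, and the toy hit sets are all hypothetical, and the homology search that would produce the hit sets is assumed to have been run already. Each gene is placed in the narrowest clade on the focal species' lineage that contains every species in which a homologue was detected.

```python
from typing import List, Set, Tuple

# Hypothetical nested clades along the lineage of a focal species,
# ordered from narrowest (youngest stratum) to broadest (oldest stratum).
# Each entry holds the full set of species belonging to that clade.
Lineage = List[Tuple[str, Set[str]]]

LINEAGE: Lineage = [
    ("S. cerevisiae", {"S. cerevisiae"}),
    ("Saccharomyces", {"S. cerevisiae", "S. paradoxus"}),
    ("Fungi", {"S. cerevisiae", "S. paradoxus", "N. crassa"}),
    ("Eukaryota", {"S. cerevisiae", "S. paradoxus", "N. crassa", "H. sapiens"}),
]


def assign_phylostratum(hit_species: Set[str], lineage: Lineage) -> str:
    """Return the narrowest clade that contains every species with a detected homologue."""
    for clade_name, members in lineage:
        if hit_species <= members:
            return clade_name
    raise ValueError("hit set contains a species outside every clade in the lineage")


def is_taxonomically_restricted(stratum: str, lineage: Lineage) -> bool:
    """Broad sense of 'orphan': any stratum other than the oldest one."""
    return stratum != lineage[-1][0]


# Toy hit sets (assumed output of an earlier homology search).
hits = {
    "geneA": {"S. cerevisiae"},                  # no homologue outside the focal species
    "geneB": {"S. cerevisiae", "S. paradoxus"},  # restricted to the genus
    "geneC": {"S. cerevisiae", "H. sapiens"},    # shared with a distant eukaryote
}

strata = {gene: assign_phylostratum(species, LINEAGE) for gene, species in hits.items()}
```

In this toy data, `geneA` is an orphan in the narrow sense, `geneB` is orphaned only within the larger clade, and `geneC` falls in the oldest stratum and is not an orphan under either reading.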
Homology detection failure accounts for a majority of classified orphan genes.[8] Some scientists have attempted to recover some homology by using more sensitive methods, such as remote homology detection. In one study, remote homology detection techniques were used to demonstrate that a sizable fraction of orphan genes (over 15%) still exhibited remote homology despite being missed by conventional homology detection techniques, and that their functions were often related to the functions of nearby genes at genomic loci.[29]
Sources
Orphan genes arise from multiple sources, predominantly through de novo origination, duplication and rapid divergence, and horizontal gene transfer.[2]
De novo gene birth
Novel orphan genes continually arise de novo from non-coding sequences.[30] These novel genes may be sufficiently beneficial to be swept to fixation by selection or, more likely, they will fade back into the non-genic background. The latter outcome is supported by research in Drosophila showing that young genes are more likely to go extinct.[31]
De novo genes were once thought to be a near impossibility due to the complex and potentially fragile intricacies of creating and maintaining functional polypeptides,[18] but research from the past 10 years or so has found multiple examples of de novo genes, some of which are associated with important biological processes, particularly testes function in animals. De novo genes were also found in fungi and plants.[20][32][33][5][34][35][11][36]
For young orphan genes, it is sometimes possible to find homologous
Duplication and divergence
The duplication and divergence model for orphan genes involves a new gene being created from some duplication or divergence event and undergoing a period of rapid evolution in which all detectable similarity to the originally duplicated gene is lost.[2] While this explanation is consistent with current understandings of duplication mechanisms,[2] the number of mutations needed to lose detectable similarity is large enough that such complete divergence should be rare,[2][26] and the evolutionary mechanism by which a gene duplicate could be sequestered and diverge so rapidly remains unclear.[2][40]
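A rough feel for the scale of divergence involved can be had from a simple substitution model. The sketch below is an illustration under stated assumptions, not the method used in the cited studies: it assumes a 20-state equal-rate (Jukes-Cantor-style) model of amino acid substitution, the function names are hypothetical, and real homology searches score alignments rather than raw identity. Under this model, the expected fraction of identical sites after d substitutions per site is 1/20 + (19/20)·exp(−20d/19), which can be inverted to ask how many substitutions per site correspond to a given identity.

```python
import math

STATES = 20  # amino acid alphabet size (assumption: all substitutions equally likely)


def expected_identity(subs_per_site: float) -> float:
    """Expected fraction of identical sites after the given number of substitutions per site."""
    k = STATES
    return 1.0 / k + (k - 1.0) / k * math.exp(-k * subs_per_site / (k - 1.0))


def substitutions_for_identity(identity: float) -> float:
    """Invert the model: substitutions per site needed to reach a given expected identity."""
    k = STATES
    p = 1.0 - identity  # proportion of differing sites
    if not 0.0 <= p < (k - 1.0) / k:
        raise ValueError("identity must lie above the 1/20 saturation floor and at most 1")
    return -(k - 1.0) / k * math.log(1.0 - k * p / (k - 1.0))


# Example: substitutions per site implied by a drop to 25% identity
# (25% is an arbitrary illustrative value, not a detection threshold from the source).
d_25 = substitutions_for_identity(0.25)
```

Because identity saturates at 1/20 under this model, the number of substitutions required grows steeply as identity approaches that floor, which is the intuition behind treating complete loss of detectable similarity as an uncommon outcome.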
Horizontal gene transfer
Another explanation for how orphan genes arise is horizontal gene transfer, in which the gene is acquired from a separate, unknown lineage rather than inherited from an ancestor.[2] This explanation for the origin of orphan genes is especially relevant in bacteria and archaea, where horizontal gene transfer is common.
Protein characteristics
Orphan genes tend to be very short (~6 times shorter than mature genes), and some are weakly expressed, tissue specific and simpler in codon usage and amino acid composition.[41] Orphan genes tend to encode more intrinsically disordered proteins,[42][43][44] although some structure has been found in one of the best characterized orphan genes.[45] Of the tens of thousands of enzymes of primary or specialized metabolism that have been characterized to date, none are orphans, or even of restricted lineage; apparently, catalysis requires hundreds of millions of years of evolution.[41]
Biological functions
The evolutionary role and biological significance of orphan genes remain subjects of ongoing research and debate. Orphan genes are considered important in evolution and speciation because of their potential to produce novel genes and functions.[46] They are theorized to play a critical role in the evolution of species, as they allow organisms to respond to changes in their environment and develop new adaptations rapidly.[47]
Orphan genes can have diverse functions, ranging from basic metabolic functions to complex regulatory processes. For example, some orphan genes are involved in the regulation of growth and development, while others play a role in the response to environmental stresses.[48]
Emergence and controversy
Some scientists propose that many orphan genes may not play a direct evolutionary role. They argue that genomes contain non-functional open reading frames (ORFs) which might produce spurious polypeptides not maintained by natural selection. Because such sequences are not conserved across species, they are likely to be unique to a single species and are therefore categorized as orphan genes.[49]
Functional significance through research
Contrary to the view that they are evolutionary noise, emerging studies have illustrated the functional importance of orphan genes:
- Drosophila melanogaster: The orphan gene "pipsqueak" was shown to be essential for larval development. Experiments demonstrated that mutations in this gene lead to significant developmental issues, establishing its critical evolutionary role.[50]
- Arabidopsis thaliana: The QQS orphan gene in Arabidopsis plays a novel regulatory role in starch content, responding to various environmental and developmental stimuli. It also interacts with a conserved transcription factor NF-YC4, which is crucial in regulating carbon and nitrogen allocation, showing its broad significance across different plant species.[21]
- Yeast (Saccharomyces cerevisiae): In yeast, the orphan gene BSC4 is associated with DNA repair. Deletion of this gene increases sensitivity to DNA-damaging agents, highlighting its role in maintaining genomic integrity.[45]
- Tobacco: In tobacco, QQS was found to induce the activity of RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase), an enzyme critical for the initial step of carbon fixation.[51]
These examples not only confirm the functionality of some orphan genes but also suggest their potential involvement in the emergence of novel phenotypes, thereby contributing to species-specific adaptations.
Implications
Orphan genes have garnered interest across multiple scientific disciplines such as evolutionary biology and medicine, due to their nature and potential implications.[52]
In evolutionary biology, orphan genes diverge from traditional models of gene evolution and provide valuable insights into the process of de novo gene origination and lineage-specific adaptation. The term "de novo gene" specifically denotes the emergence of a functional gene without ancestral genetic material, whether as a protein-coding gene or a functional RNA molecule.[53] This understanding of de novo genes, coupled with the study of orphan genes, enriches Charles Darwin's traditional model of evolution, also called Darwinism or Darwinian theory, by revealing additional mechanisms through which genetic diversity and adaptation can occur. By showing that de novo genes can arise from non-genic sequences and contribute to lineage-specific adaptation, this research expands our understanding of the creative forces of evolution, adding depth and complexity to Darwin's foundational principles.
In
Orphan genes have the potential to serve as biomarkers for disease diagnosis, prognosis, and treatment response. Their lineage-specific nature and expression patterns may provide valuable information for personalized medicine approaches, enabling more accurate and targeted interventions for individuals affected by various diseases. Thus, harnessing the potential of orphan genes in understanding human health has significant implications for advancing biomedical research and improving clinical outcomes.
References
- PMID 10498776.
- S2CID 31738556.
- PMID 19716618.
- ISBN 978-3-642-86659-3.
- PMID 18550802.
- PMID 19064677.
- PMID 19531232.
- PMID 33137085.
- PMID 32066524.
- PMID 23348040.
- PMID 24146629.
- PMID 24391509.
- PMID 23034216.
- S2CID 21799266.
- S2CID 231820647.
- PMID 14525923.
- PMID 21878963. Retrieved 26 April 2024.
- PMID 860134.
- PMID 16079329.
- PMID 18493065.
- PMID 19154206.
- PMID 26554020.
- PMID 21332978.
- PMID 9254694.
- "NCBI BLAST homepage". National Center for Biotechnology Information. National Institutes of Health, U.S. Department of Health and Human Services.
- PMID 17408474.
- PMID 25312911.
- PMID 18029048.
- PMID 26257768.
- PMID 26323763.
- PMID 24554240.
- PMID 24457212.
- PMID 16777968.
- PMID 19733073.
- PMID 21164016.
- PMID 23593031.
- PMID 24650912.
- PMID 26758516.
- S2CID 247240062.
- PMID 15475113.
- PMID 25151064.
- PMID 25843649.
- PMID 28642936.
- PMID 30026186.
- PMID 29033289.
- PMID 30919281.
- PMID 26677845.
- PMID 25151064.
- PMID 33193682.
- PMID 9774480.
- PMID 35193747.
- Fakhar, A. Z., Liu, J., Pajerowska-Mukhtar, K. M., & Mukhtar, M. S. (Year). "The Lost and Found: Unraveling the Functions of Orphan Genes." Journal Name, Volume(Issue), Page numbers.
- Schmitz, J. F., & Bornberg-Bauer, E. (2017). Fact or fiction: updates on how protein-coding genes might emerge de novo from previously non-coding DNA. F1000Res, 6, 57. doi: 10.12688/f1000research.9736.1.