Endogenous retrovirus

Source: Wikipedia, the free encyclopedia.
Dendrogram of various classes of endogenous retroviruses

Endogenous retroviruses (ERVs) are endogenous viral elements in the genome that closely resemble and can be derived from retroviruses. They are abundant in the genomes of jawed vertebrates, and they comprise up to 5–8% of the human genome (lower estimates of ~1%).[1][2]

ERVs are a vertically inherited

retroelements due to their integration and reverse-transcription into the nuclear genome of the host cell.

Researchers have suggested that retroviruses evolved from a type of transposon called a retrotransposon, a Class I element;[7] these genes can mutate and instead of moving to another location in the genome they can become exogenous or pathogenic. This means that not all ERVs may have originated as an insertion by a retrovirus but that some may have been the source for the genetic information in the retroviruses they resemble.[8] When integration of viral DNA occurs in the germ-line, it can give rise to an ERV, which can later become fixed in the gene pool of the host population.[1][9]

Formation

The replication cycle of a

somatic cells, but occasional infection of germline cells (cells that produce eggs and sperm) can also occur. Rarely, retroviral integration may occur in a germline cell that goes on to develop into a viable organism. This organism will carry the inserted retroviral genome as an integral part of its own genome: an "endogenous" retrovirus (ERV) that may be inherited by its offspring as a novel allele. Many ERVs have persisted in the genome of their hosts for millions of years. However, most of these have acquired inactivating mutations during host DNA replication and are no longer capable of producing the virus. ERVs can also be partially excised from the genome by a process known as recombinational deletion, in which recombination between the identical sequences that flank newly integrated retroviruses results in deletion of the internal, protein-coding regions of the viral genome.
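The recombinational deletion described above can be illustrated with a toy string model. This is only a sketch: the function name, the LTR sequence, and the flanking and internal sequences are all hypothetical placeholders, and real recombination acts on DNA molecules rather than on exact string matches.

```python
def recombinational_deletion(locus: str, ltr: str) -> str:
    """Collapse the first LTR-internal-LTR provirus in `locus` to a solo LTR.

    Toy model: the two flanking LTR copies are assumed to be identical,
    as they are for a newly integrated provirus.
    """
    first = locus.find(ltr)
    if first == -1:
        return locus  # no LTR present
    second = locus.find(ltr, first + len(ltr))
    if second == -1:
        return locus  # only one LTR copy, nothing to recombine with
    # Recombination between the two identical LTRs removes the internal
    # region together with one LTR copy, leaving a single (solo) LTR.
    return locus[:first] + ltr + locus[second + len(ltr):]


# Hypothetical sequences, for illustration only.
example_ltr = "TGTAGCA"
provirus_locus = "AAAA" + example_ltr + "GGGCCCGGG" + example_ltr + "TTTT"
solo_ltr_locus = recombinational_deletion(provirus_locus, example_ltr)
```

In this model the host flanking sequence is untouched and only one LTR copy remains at the locus.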

The general retrovirus genome consists of three genes vital for the invasion, replication, escape, and spreading of its viral genome. These three genes are gag (encodes for structural proteins for the viral core), pol (encodes for

polyproteins. To carry out its life cycle, the retrovirus relies heavily on the host cell's machinery. Protease cleaves peptide bonds of the viral polyproteins, making the separate proteins functional. Reverse transcriptase synthesizes viral DNA from the viral RNA in the host cell's cytoplasm before it enters the nucleus. Integrase guides the integration of viral DNA into the host genome.[9][10]

Over time, the genomes of ERVs not only acquire point mutations, but also shuffle and recombine with other ERVs.[11] ERVs with a decayed env sequence become more likely to propagate.[12]

Role in genomic evolution

Diagram displaying the integration of viral DNA into a host genome

Endogenous retroviruses can play an active role in shaping genomes. Most studies in this area have focused on the genomes of humans and higher primates, but other vertebrates, such as mice and sheep, have also been studied in depth.

enhancers, often contributing to the transcriptome by producing tissue-specific variants. In addition, the retroviral proteins themselves have been co-opted to serve novel host functions, particularly in reproduction and development. Recombination between homologous retroviral sequences has also contributed to gene shuffling and the generation of genetic variation. Furthermore, in the instance of potentially antagonistic effects of retroviral sequences, repressor genes have co-evolved to combat them.

About 90% of endogenous retroviruses are solo LTRs, lacking all open reading frames (ORFs). Solo LTRs and LTRs associated with complete retroviral sequences have been shown to act as transcriptional elements on host genes. Their range of action is mainly by insertion into the 5' UTRs of protein coding genes; however, they have been known to act upon genes up to 70–100 kb away.[13][17][18][19] The majority of these elements are inserted in the sense direction to their corresponding genes, but there has been evidence[20] of LTRs acting in the antisense direction and as a bidirectional promoter for neighboring genes.[21][22] In a few cases, the LTR functions as the major promoter for the gene.

For example, in humans AMY1C has a complete ERV sequence in its promoter region; the associated LTR confers salivary-specific expression of the digestive enzyme amylase.[23] Also, the primary promoter for bile acid-CoA:amino acid N-acyltransferase (BAAT), which codes for an enzyme that is integral in bile metabolism, is of LTR origin.[18][24]

The insertion of a solo ERV-9 LTR may have produced a functional open reading frame, causing the rebirth of the human immunity-related GTPase gene (IRGM).[25] ERV insertions have also been shown to generate alternative splice sites either by direct integration into the gene, as with the human leptin hormone receptor, or driven by the expression of an upstream LTR, as with the phospholipase A-2 like protein.[26]

Most of the time, however, the LTR functions as one of many alternate promoters, often conferring tissue-specific expression related to reproduction and development. In fact, 64% of known LTR-promoted transcription variants are

NOS3), interleukin-2 receptor B (IL2RB), and another mediator of estrogen synthesis, HSD17B1, are also alternatively regulated by LTRs that confer placental expression, but their specific functions are not yet known.[24][29] The high degree of reproductive expression is thought to be an aftereffect of the method by which they were endogenized; however, it may also be due to a lack of DNA methylation in germ-line tissues.[24]

The best-characterized instance of placental protein expression comes not from an alternatively promoted host gene but from a complete co-option of a retroviral protein. Retroviral fusogenic env proteins, which play a role in the entry of the virion into the host cell, have had an important impact on the development of the mammalian

syncytiotrophoblasts.[15] These multinucleated cells are mainly responsible for maintaining nutrient exchange and separating the fetus from the mother's immune system.[15] It has been suggested that the selection and fixation of these proteins for this function have played a critical role in the evolution of viviparity.[30]

In addition, the insertion of ERVs and their respective LTRs has the potential to induce chromosomal rearrangement due to recombination between viral sequences at inter-chromosomal loci. These rearrangements have been shown to induce gene duplications and deletions that largely contribute to genome plasticity and dramatically change the dynamic of gene function.

class II MHC genes have a high density of HERV elements as compared to other multi-locus-gene families.[26] It has been shown that HERVs have contributed to the formation of extensively duplicated duplicon blocks that make up the HLA class I family of genes.[32] More specifically, HERVs primarily occupy regions within and between the break points between these blocks, suggesting that considerable duplication and deletion events, typically associated with unequal crossover, facilitated their formation.[33] The generation of these blocks, inherited as immunohaplotypes, acts as a protective polymorphism against a wide range of antigens that may have imbued humans with an advantage over other primates.[32]

The characteristic of

orthologs in mouse TSCs. TSCs were observed because they reflect the initial cells that develop in the fetal placenta. Despite their tangible similarities, enhancer and repressed regions were mostly species-specific. However, most promoter sequences were conserved between mouse and rat. The researchers proposed that ERVs influenced species-specific placental evolution through mediation of placental growth, immunosuppression, and cell fusion.[34]

Another example of ERVs exploiting cellular mechanisms involves p53, a tumor suppressor gene (TSG). DNA damage and cellular stress induce the p53 pathway, which results in apoptosis. Using chromatin immunoprecipitation with sequencing, thirty percent of all p53-binding sites were found to be located within copies of a few primate-specific ERV families. A study suggested that this benefits retroviruses because p53's mechanism provides a rapid induction of transcription, which leads to the exit of viral RNA from the host cell.[7]
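A figure such as the share of p53-binding sites lying in ERV copies comes from intersecting ChIP-seq peak coordinates with an ERV annotation. The sketch below shows that calculation in its simplest form. The coordinates are made up for illustration (they are not data from the cited study), the function names are hypothetical, and any overlap is counted as a hit, which is a simplification of "located within".

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates [start, end)
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_in_ervs(peaks: List[Interval], ervs: List[Interval]) -> float:
    """Fraction of binding-site peaks that overlap any annotated ERV interval."""
    if not peaks:
        return 0.0
    hits = sum(1 for peak in peaks if any(overlaps(peak, erv) for erv in ervs))
    return hits / len(peaks)


# Hypothetical annotation and peak calls, for illustration only.
erv_annotation = [("chr1", 100, 600), ("chr2", 1000, 1500)]
binding_peaks = [
    ("chr1", 150, 250), ("chr1", 550, 650), ("chr2", 1200, 1300),
    ("chr1", 700, 800), ("chr2", 100, 200), ("chr2", 1500, 1600),
    ("chr3", 100, 600), ("chr1", 0, 100), ("chr1", 600, 700),
    ("chr2", 900, 1000),
]
share = fraction_in_ervs(binding_peaks, erv_annotation)
```

The nested loop is quadratic and only suitable for small examples; genome-scale analyses typically use sorted intervals or an interval tree.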

Finally, the insertion of ERVs or ERV elements into genic regions of host DNA, or overexpression of their transcriptional variants, has a much higher potential to produce deleterious effects than positive ones. Their appearance into the genome has created a

KRAB domain, exist in high copy number in vertebrate genomes, and their range of functions is limited to transcriptional roles.[35] It has been shown in mammals, however, that the diversification of these genes was due to multiple duplication and fixation events in response to new retroviral sequences or their endogenous copies to repress their transcription.[19]

Role in disease

The majority of ERVs that occur in vertebrate genomes are ancient, inactivated by mutation, and have reached genetic fixation in their host species. For these reasons, they are extremely unlikely to have negative effects on their hosts except under unusual circumstances. Nevertheless, it is clear from studies in birds and non-human mammal species including mice, cats and koalas, that younger (i.e., more recently integrated) ERVs can be associated with disease.[36] The number of active ERVs in the genome of mammals is negatively related to their body size, suggesting a contribution to Peto's paradox through cancer pathogenesis.[37] This has led researchers to propose a role for ERVs in several forms of human cancer and autoimmune disease, although conclusive evidence is lacking.[38][39][40][41]

Neurological disorders

In humans, ERVs have been proposed to be involved in

ERVWE1, or "syncytin", gene, which is derived from an ERV insertion, has been reported, along with the presence of an "MS-associated retrovirus" (MSRV), in patients with the disease.[42][43] Human ERVs (HERVs) have also been implicated in ALS[44] and addiction.[45][46][47]

In 2004 it was reported that antibodies to HERVs were found in greater frequency in the

Immunity

ERVs have been found to be associated with disease not only through disease-causing relations, but also through immunity. The frequency of ERVs in long terminal repeats (LTRs) likely correlates with viral adaptations to take advantage of immunity signaling pathways that promote viral transcription and replication. A study done in 2016 investigated the benefit of ancient viral DNA integrated into a host through gene regulation networks induced by

CD14+ macrophages.[1]

HERVs also play various roles shaping the human innate immunity response, with some sequences activating the system and others suppressing it. They may also protect from exogenous retroviral infections: the virus-like transcripts can activate pattern recognition receptors, and the proteins can interfere with active retroviruses. A gag protein from HERV-K(HML2) has been shown to mix with HIV Gag, impairing HIV capsid formation as a result.[52]

Gene regulation

Another proposed idea was that ERVs from the same family played a role in recruiting multiple genes into the same network of regulation. It was found that MER41 elements provided additional redundant regulatory enhancement to the genes located near STAT1 binding sites.[1]

Role in medicine

Porcine endogenous retrovirus

For humans,

CRISPR-Cas9, removed all 62 retroviruses from the pig genome.[53] The consequences of cross-species transmission remain unexplored and have dangerous potential.[54]

Researchers have indicated that infection of human tissues by PERVs is very possible, especially in immunosuppressed individuals. An immunosuppressed condition could potentially permit more rapid and tenacious replication of viral DNA, and the virus would later have less difficulty adapting to human-to-human transmission. Although known infectious pathogens present in the donor organ or tissue can be eliminated by breeding pathogen-free herds, unknown retroviruses can be present in the donor. These retroviruses are often latent and asymptomatic in the donor, but can become active in the recipient. Some examples of endogenous viruses that can infect and multiply in human cells are from baboons (BaEV), cats (RD114), and mice.[50]

There are three different classes of PERVs, PERV-A, PERV-B, and PERV-C. PERV-A and PERV-B are

ecotropic and does not replicate in human cells. The major differences between the classes are in the receptor binding domain of the env protein and the long terminal repeats (LTRs) that influence the replication of each class. PERV-A and PERV-B display LTRs that have repeats in the U3 region. However, PERV-A and PERV-C show repeatless LTRs. Researchers found that PERVs in culture actively adapted to the repeat structure of their LTR in order to match the best replication performance a host cell could support. At the end of their study, researchers concluded that the repeatless PERV LTR evolved from the repeat-harboring LTR. This likely occurred through insertional mutation and was demonstrated using data on LTR and env/Env. It is thought that the generation of repeatless LTRs could reflect an adaptation process of the virus, changing from an exogenous to an endogenous lifestyle.[55]

A clinical study performed in 1999 sampled 160 patients who had been treated with different living pig tissues and observed no evidence of a persistent PERV infection in 97% of the patients for whom a sufficient amount of DNA was available for PCR amplification of PERV sequences. The study noted, however, that retrospective studies are limited in their ability to determine the true incidence of infection or associated clinical symptoms. It suggested using closely monitored prospective trials, which would provide a more complete and detailed evaluation of the possible cross-species PERV transmission and a comparison of the PERV.[56]

Human endogenous retroviruses

Human endogenous retroviruses (HERV) comprise a significant part of the human genome, with approximately 98,000 ERV elements and fragments making up 5–8%.[1] According to a study published in 2005, no HERVs capable of replication had been identified; all appeared to be defective, containing major deletions or nonsense mutations (not true for HERV-K). This is because most HERVs are merely traces of original viruses, having first integrated millions of years ago. An analysis of HERV integrations is ongoing as part of the 100,000 Genomes Project.[57]

A 2023 study found that HERVs can be awakened from a dormant state and contribute to aging, an effect that could be blocked by neutralizing antibodies.[58][59]

Human endogenous retroviruses were originally discovered when human genomic libraries were screened under low-stringency conditions using either probes from animal retroviruses or oligonucleotides with similarity to virus sequences.[1]

Classification

HERVs are classified based on their homologies to animal retroviruses. Families belonging to Class I are similar in sequence to mammalian

superfamily. There are more Class I families known to exist.[1][11] The families themselves are named in a less uniform manner, with a mixture of naming based on an exogenous retrovirus, the priming tRNA (HERV-W, HERV-K), or some neighboring gene (HERV-ADP), clone number (HERV-S71), or some amino acid motif (HERV-FRD). A proposed nomenclature aims to clean up the sometimes paraphyletic standards.[6]

Origin

Sometime during human evolution, exogenous progenitors of HERVs inserted themselves into germ line cells and then replicated along with the host's genes by exploiting the host's cellular mechanisms. Because of their distinct genomic structure, HERVs were subjected to many rounds of amplification and transposition, which led to a more widespread distribution of retroviral DNA.[1]

Nevertheless, one family of viruses has been active since the divergence of

HERV-K (HML2), makes up less than 1% of HERV elements but is one of the most studied. There are indications it has even been active in the past few hundred thousand years, e.g., some human individuals carry more copies of HML2 than others.[60] Traditionally, age estimates of HERVs are made by comparing the 5' and 3' LTR of a HERV; however, this method is only relevant for full-length HERVs. A recent method, called cross-sectional dating,[61] uses variations within a single LTR to estimate the ages of HERV insertions. This method is more precise in estimating HERV ages and can be used for any HERV insertion. Cross-sectional dating has been used to suggest that two members of HERV-K (HML2), HERV-K106 and HERV-K116, were active in the last 800,000 years and that HERV-K106 may have infected modern humans 150,000 years ago.[62] However, the absence of known infectious members of the HERV-K (HML2) family, and the lack of elements with a full coding potential within the published human genome sequence, suggest to some that the family is less likely to be active at present. In 2006 and 2007, researchers working independently in France and the US recreated functional versions of HERV-K (HML2).[63][64]
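The traditional 5'/3' LTR comparison mentioned above rests on a simple argument: the two LTRs of a provirus are identical at the time of integration and then accumulate substitutions independently, so a divergence K between them corresponds to an age of roughly T = K / (2r), where r is the substitution rate per site per year. The sketch below works through that arithmetic. It assumes the two LTRs are already aligned, uses the Jukes-Cantor correction as one common choice (not necessarily what the cited studies used), and leaves the rate as a caller-supplied assumption; all function names are hypothetical.

```python
import math


def p_distance(ltr5: str, ltr3: str) -> float:
    """Proportion of differing sites between two aligned LTRs (gap columns skipped)."""
    if len(ltr5) != len(ltr3):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(ltr5.upper(), ltr3.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance: K = -3/4 * ln(1 - 4p/3)."""
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age(ltr5: str, ltr3: str, rate_per_site_per_year: float) -> float:
    """Estimated years since integration, T = K / (2r).

    The factor 2 reflects that both LTRs have been diverging independently
    since integration. The rate is an assumption chosen by the caller.
    """
    k = jukes_cantor(p_distance(ltr5, ltr3))
    return k / (2.0 * rate_per_site_per_year)
```

For example, two aligned 100-site LTRs differing at 5 sites give p = 0.05, and with an assumed rate of 1e-9 substitutions per site per year the estimate is on the order of 26 million years. Because the method needs both LTRs, it cannot be applied to solo LTRs, which is the limitation the text notes.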

Expression of HERV proteins

The expression of HERV-K, a biologically active family of HERV, produces proteins found in placenta. Furthermore, the expression of the envelope genes of

HERV-FRD (ERVFRD-1) produces syncytins, which are important for the generation of the syncytiotrophoblast cell layer during placentogenesis by inducing cell-cell fusion.[65] The HUGO Gene Nomenclature Committee (HGNC) approves gene symbols for transcribed human ERVs.[66]

Functional impact

MER41.AIM2 is an HERV that regulates the transcription of AIM2 (Absent in Melanoma 2), which encodes a sensor of foreign cytosolic DNA. This acts as a binding site for AIM2, meaning that it is necessary for the transcription of AIM2. Researchers showed this by deleting MER41.AIM2 in HeLa cells using CRISPR/Cas9, which led to an undetectable transcript level of AIM2 in the modified HeLa cells. The control cells, which still contained the MER41.AIM2 ERV, showed normal amounts of AIM2 transcript. In terms of immunity, researchers concluded that MER41.AIM2 is necessary for an inflammatory response to infection.[67]

Activation by exogenous viruses

Considerable evidence indicates that HERVs can be reactivated by viral infections, such as:

1) retroviruses –

(HTLV-1);

2) RNA viruses – influenza A virus, hepatitis C virus (HCV), severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2);

3) DNA viruses –

human cytomegalovirus (CMV), Kaposi's sarcoma-associated herpesvirus (KSHV).[68]

Several studies have shown that EBV is able to transactivate the expression of the normally inactive HERV-K18 Env protein, e.g., interacting with resting

EBNA-2. In-depth analysis completed the picture identifying the EBV latent membrane protein LMP-2A as a strong candidate for the HERV-K18 transactivation. HERV-K18 has also been reported to have superantigen activity (i.e. polyclonal T and B cell activation regardless of the specificity of their antigen receptor).[69]

It has also been shown that in vitro binding of the EBV gp350 protein caused activation of MSRV env and syncytin-1 in B cells, monocytes, macrophages, and astrocytes, cells that are involved in the pathogenesis of multiple sclerosis.[70] Monocytes, especially after their differentiation into macrophages, appeared to be the most responsive to EBV gp350, expressing even higher levels of HERV-W env than B cells. This finding is concordant with another study, which demonstrated that during infectious mononucleosis EBV promoted the strongest activation of HERV-W/MSRV expression in monocytes compared to other blood cell types.[71]

Immune response to HERVs

Despite having been integrated into genomes of vertebrates for millions of years, ERVs represent an intermediate stage between exogenous viruses and the host genome; it is suggested that immunological tolerance to HERV-derived proteins and peptides is imperfect due to the epigenetic silencing of HERV in the thymus and bone marrow, which prevents deletion of all HERV-specific T and B cells.

HERV-K (HML-2) elements which integrated most recently are the most intact and biologically active forms.[69] HERV-K env and HERV-H env, considered to be a new class of tumor-associated antigens, have been found to promote strong cytotoxic T-cell responses in patients with various types of cancers.[72][73][74]

On a level of the innate immune sensing of nucleic acids, single-stranded RNA (

dsRNA) derived from endogenous retroviruses are recognized by pattern recognition receptors (PRRs).

TLR-3, RIG-I and MDA5; RIG-I and MDA5 are known to induce a type I IFN response.[75][76]

When retrotranscribed into DNA, retroviruses can be sensed by cyclic GMP-AMP

The recognition of nucleic acids through PRRs provides a very efficient strategy to fight viral infections, while at the same time exposing the host to risk, because self-nucleic acids may be recognized and autoimmunity promoted.

On a protein level, a direct interaction between TLRs and certain HERV proteins has been shown. For example, the surface unit of HERV-W Env (also known as Multiple sclerosis-associated retroviral element (MSRV) env) was found to bind to

Th1-like type of Th cell differentiation.[77]

Immunological studies have shown some evidence for T cell immune responses against HERVs in HIV-infected individuals.[78] The hypothesis that HIV induces HERV expression in HIV-infected cells led to the proposal that a vaccine targeting HERV antigens could specifically eliminate HIV-infected cells. The potential advantage of this novel approach is that, by using HERV antigens as surrogate markers of HIV-infected cells, it could circumvent the difficulty inherent in directly targeting notoriously diverse and fast-mutating HIV antigens.[78]

Techniques for characterizing ERVs

Whole genome sequencing

Example: A porcine ERV (PERV) Chinese-born minipig isolate, PERV-A-BM, was sequenced completely, along with different breeds and cell lines, in order to understand its genetic variation and evolution. The observed number of nucleotide substitutions among the different genome sequences helped researchers estimate when PERV-A-BM was integrated into its host genome, which was found to be evolutionarily earlier than for the isolates from European-born pigs.[54]

Chromatin immunoprecipitation with sequencing (ChIP-seq)

This technique is used to find histone marks indicative of promoters and enhancers, which are binding sites for DNA-binding proteins, as well as repressed regions and trimethylation.

embryonic stem cells (ESCs) and early embryogenesis.[7]

Applications

Constructing phylogenies

Because most HERVs have no function, are selectively neutral, and are very abundant in primate genomes, they easily serve as phylogenetic markers for linkage analysis. They can be exploited by comparing the integration site polymorphisms or the evolving proviral nucleotide sequences of orthologs. To estimate when integration occurred, researchers used distances from each phylogenetic tree to find the rate of molecular evolution at each particular locus. ERVs are also abundant in the genomes of many species (e.g. plants, insects, mollusks, fish, rodents, domestic pets, and livestock), so this approach can be applied to a variety of phylogenetic questions.[9]
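The use of integration sites as markers can be sketched with a presence/absence table. The underlying reasoning is that an ERV found at the same genomic position in two species is most simply explained by a single insertion in a common ancestor. The table below is entirely hypothetical (made-up species labels and loci), and counting shared insertions is a deliberately crude stand-in for a real phylogenetic method.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical presence (1) / absence (0) calls for five made-up ERV loci.
presence: Dict[str, List[int]] = {
    "species_A": [1, 1, 1, 0, 1],
    "species_B": [1, 1, 1, 0, 0],
    "species_C": [1, 1, 0, 1, 0],
    "species_D": [1, 0, 0, 0, 0],
}


def shared_insertions(a: List[int], b: List[int]) -> int:
    """Number of loci at which both species carry the insertion."""
    return sum(x and y for x, y in zip(a, b))


def closest_pair(profiles: Dict[str, List[int]]) -> Tuple[str, str]:
    """Pair of species sharing the most insertions (first such pair on ties)."""
    return max(
        combinations(sorted(profiles), 2),
        key=lambda pair: shared_insertions(profiles[pair[0]], profiles[pair[1]]),
    )


pair = closest_pair(presence)
```

With this table, species_A and species_B share three insertions, more than any other pair, so they would be grouped together first. A locus present in every species, like the first one here, is consistent with an insertion that predates all of the splits.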

Designating the age of provirus and the time points of species separation events

This is accomplished by comparing different HERVs from different evolutionary periods. For example, this was done for different hominoids, which ranged from humans to apes and to monkeys. This is difficult to do with PERVs because of the large diversity present.[55]

Further research

Epigenetic variability

Researchers could analyze individual epigenomes and transcriptomes to study the reactivation of dormant transposable elements through epigenetic release and their potential associations with human disease, and to explore the specifics of gene regulatory networks.[7]

Immunological problems of xenotransplantation

Little is known about an effective way to overcome hyperacute rejection (HAR), which follows the activation of complement initiated by xenoreactive antibodies recognizing galactosyl-alpha1-3galactosyl (alpha-Gal) antigens on the donor epithelium.[50]

Risk factors of HERVs in gene therapy

Because retroviruses are able to recombine with each other and with other endogenous DNA sequences, it would be beneficial for gene therapy to explore the potential risks HERVs can cause, if any. Also, this ability of HERVs to recombine can be manipulated for site-directed integration by including HERV sequences in retroviral vectors.[1]

HERV gene expression

Researchers believe that RNA and proteins encoded by HERV genes should continue to be explored for putative functions in cell physiology and in pathological conditions. Such work would help define more precisely the biological significance of the proteins synthesized.[1]

References

  1. ^ PMID 15044706.
  2. .
  3. .
  4. .
  5. . NBK19468. Retrieved 2021-02-22.
  6. ^ .
  7. ^ .
  8. . It appears that the transition from nonviral retrotransposon to retrovirus has occurred independently at least eight times, and the source of the envelope gene responsible for infectious ability can now be traced to a virus in at least four of these instances. This suggests that potentially, any LTR retrotransposon can become a virus through the acquisition of existing viral genes.
  9. ^ .
  10. .
  11. ^ .
  12. .
  13. ^ .
  14. .
  15. ^ .
  16. .
  17. .
  18. ^ .
  19. ^ .
  20. .
  21. .
  22. .
  23. .
  24. ^ .
  25. .
  26. ^ .
  27. .
  28. .
  29. .
  30. .
  31. .
  32. ^ .
  33. .
  34. ^ .
  35. .
  36. .
  37. .
  38. .
  39. .
  40. .
  41. .
  42. ^ Mameli G, Astone V, Arru G, Marconi S, Lovato L, Serra C, et al. (January 2007). "Brains and peripheral blood mononuclear cells of multiple sclerosis (MS) patients hyperexpress MS-associated retrovirus/HERV-W endogenous retrovirus, but not Human herpesvirus 6". The Journal of General Virology. 88 (Pt 1): 264–274. PMID 17170460.
  43. .
  44. ^ "Reactivated Virus May Contribute to ALS". 2016-01-23.
  45. PMID 30249655.
  46. ^ Rob Picheta (25 September 2018). "Addiction may stem from ancient retrovirus, study says". CNN. Retrieved 2019-10-13.
  47. ISSN 0013-0613. Retrieved 2019-10-13.
  48. .
  49. ^ Fox D (2010). "The Insanity Virus". Discover. Retrieved 2011-02-17.
  50. ^ S2CID 33977939.
  51. .
  52. .
  53. .
  54. ^ .
  55. ^ .
  56. .
  57. ^ "Genomics England › Integrated Pathogens and Mobile Elements GeCIP Domain". Archived from the original on 2019-10-13. Retrieved 2019-10-13.
  58. ^ "Aging and Retroviruses". Science. Archived from the original on 17 February 2023. Retrieved 17 February 2023.
  59. S2CID 232060038.
  60. .
  61. .
  62. .
  63. .
  64. .
  65. .
  66. .
  67. .
  68. .
  69. ^ .
  70. .
  71. .
  72. ^ .
  73. .
  74. .
  75. ^ .
  76. .
  77. .
  78. ^ .
