Chimera (molecular biology)

Source: Wikipedia, the free encyclopedia.

In molecular biology, and particularly in high-throughput DNA sequencing, a chimera is a single DNA sequence that arises when multiple transcripts or DNA sequences are joined. Chimeras can be considered artifacts and filtered out of the data during processing[1] to prevent spurious inferences of biological variation.[2] However, chimeras should not be confused with chimeric reads, which structural variant callers generally use to detect structural variation events[3] and which do not always indicate the presence of a chimeric transcript or gene.

In a different context, the deliberate creation of artificial chimeras can also be a useful tool in molecular biology. For example, in protein engineering, "chimeragenesis" (forming chimeras between proteins that are encoded by homologous cDNAs)[4] is one of the "two major techniques used to manipulate cDNA sequences".[4] For gene fusions that occur through natural processes, see chimeric genes and fusion genes.

Description

Transcript chimera

A chimera can occur as a single cDNA sequence originating from two transcripts. It is usually considered a contaminant in transcript and expressed sequence tag (EST) databases, hence the name EST chimera.[5] It is estimated that approximately 1% of all transcripts in the National Center for Biotechnology Information's UniGene database contain a "chimeric sequence".[6]

PCR chimera

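A chimera can also be an artifact of PCR amplification. It occurs when the extension of an amplicon is aborted and the aborted product acts as a primer in the next PCR cycle: the aborted product anneals to the wrong template and continues to extend, thereby synthesizing a single sequence sourced from two different templates.[7]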

PCR chimeras are an important issue in metabarcoding, where DNA sequences from environmental samples are used to assess biodiversity. A chimera is a novel sequence that will most probably not match any known organism, so it may be interpreted as a new species, inflating the apparent diversity.

PCR chimeras also occur in DNA sequencing. In this case, the most common mechanism of chimera formation is that incomplete extension during PCR produces partial sequence strands that can act as primers in subsequent PCR cycles on similar but non-identical sequences. Extension of such hybrid priming events forms chimeric sequences.[1]
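The formation mechanism described above can be illustrated with a minimal Python sketch. The function name form_chimera, the toy templates and the breakpoint are hypothetical, and the sketch assumes the two templates are already aligned and of equal length; it models only the joining step, not real PCR kinetics.

```python
def form_chimera(template_a: str, template_b: str, stop: int) -> str:
    """Join the first `stop` bases copied from template_a with the rest of template_b.

    Models an aborted extension product (template_a[:stop]) that anneals to
    template_b in a later cycle and is extended to the end of that template.
    Assumes both templates are aligned and of equal length (toy simplification).
    """
    if len(template_a) != len(template_b):
        raise ValueError("templates must be aligned to the same length")
    if not 0 < stop < len(template_a):
        raise ValueError("stop must fall strictly inside the template")
    partial = template_a[:stop]          # incomplete extension product
    return partial + template_b[stop:]   # extension continues on the other template


# Two similar but non-identical toy templates (hypothetical sequences).
a = "ACGTACGTACGTACGTACGT"
b = "ACGTTCGTACGAACGTACCT"
chimera = form_chimera(a, b, 10)
print(chimera)
```

The resulting sequence matches neither template exactly, which is why it can be mistaken for a distinct biological sequence.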

Several computational methods have been devised to detect and remove chimeras, including:

  • CHECK_CHIMERA of the Ribosomal Database Project[8]
  • ChimeraSlayer in QIIME[9][7]
  • uchime in usearch[10]
  • removeBimeraDenovo() in dada2[11]
  • Bellerophon[12]
  • CATCh[13]
  • DECIPHER[14]
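As a rough illustration of the idea behind reference-based chimera checking, the following Python sketch tests whether a query is explained better by two reference "parents" joined at a breakpoint than by any single reference. It is a toy under stated assumptions (pre-aligned, equal-length sequences; mismatch counts as the only score) and is not the algorithm used by any of the tools listed above. The names check_chimera, hamming and min_gain are hypothetical.

```python
from itertools import permutations


def hamming(x: str, y: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(p != q for p, q in zip(x, y))


def check_chimera(query: str, references: dict, min_gain: int = 2):
    """Return a two-parent model if it beats the best single parent by
    at least `min_gain` mismatches, otherwise None.

    Toy assumption: the query and all references are aligned and equal in length.
    """
    n = len(query)
    if any(len(seq) != n for seq in references.values()):
        raise ValueError("all sequences must have the same aligned length")

    # Best explanation using one reference only.
    best_single = min(hamming(query, seq) for seq in references.values())

    # Best explanation using a prefix of one reference and a suffix of another.
    best_model = None
    for (name_a, seq_a), (name_b, seq_b) in permutations(references.items(), 2):
        for k in range(1, n):
            d = hamming(query[:k], seq_a[:k]) + hamming(query[k:], seq_b[k:])
            if best_model is None or d < best_model[0]:
                best_model = (d, name_a, name_b, k)

    if best_model is None:  # fewer than two references supplied
        return None
    d, name_a, name_b, k = best_model
    gain = best_single - d
    if gain >= min_gain:
        return {"parents": (name_a, name_b), "breakpoint": k, "gain": gain}
    return None


# Hypothetical reference sequences and a query built from both of them.
refs = {
    "A": "ACGTACGTACGTACGTACGT",
    "B": "ACCTAGGTACTTACGAACGA",
}
query = refs["A"][:8] + refs["B"][8:]
result = check_chimera(query, refs)
print(result)
```

The reported breakpoint is only one of several positions consistent with the data, because the parents are identical in the region between their nearest distinguishing bases. Real tools additionally handle alignment, abundance information and scoring thresholds, which this sketch omits.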

Chimeric read

A read is a sequence of nucleic acids determined through high-throughput DNA or RNA sequencing, corresponding to a DNA or RNA fragment. A chimeric read, or split read, is a read in which multiple subsections align to different positions in a reference genome.[15] Chimeric reads are not always a sign of a PCR chimera and are often used to detect structural variations.[3]
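In SAM-formatted alignments, the parts of a chimeric read are typically reported as separate records: the extra parts carry the supplementary-alignment FLAG bit (0x800), and records may list the other parts in an SA:Z: tag. The sketch below collects the names of such reads from SAM text lines. The function name chimeric_read_names and the example records are hypothetical, and it assumes the aligner follows these conventions from the SAM specification.

```python
SUPPLEMENTARY = 0x800  # SAM FLAG bit marking a supplementary alignment


def chimeric_read_names(sam_lines):
    """Return the set of read names that have a supplementary record or an SA tag."""
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record (11 mandatory fields)
        flag = int(fields[1])
        has_sa = any(tag.startswith("SA:Z:") for tag in fields[11:])
        if flag & SUPPLEMENTARY or has_sa:
            names.add(fields[0])
    return names


# Hypothetical records: read1 is split between chr1 and chr5, read2 aligns once.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t50M50S\t*\t0\t0\t*\t*\tSA:Z:chr5,2000,+,50S50M,60,0;",
    "read1\t2048\tchr5\t2000\t60\t50H50M\t*\t0\t0\t*\t*\tSA:Z:chr1,100,+,50M50S,60,0;",
    "read2\t0\tchr1\t500\t60\t100M\t*\t0\t0\t*\t*",
]
split_reads = chimeric_read_names(example)
print(split_reads)
```

A structural variant caller would go further and compare the positions and orientations of the parts; this sketch only identifies which reads are split.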

Examples

  • "The first mRNA transcript isolated for..." the human gene C2orf3 "...was part of an artificial chimera..."
  • CYP2C17 was thought to be a human gene, but "...is now considered an artefact based on a chimera of CYP2C18 and CYP2C19."[16]
  • Researchers have created receptor chimeras in their studies of Oncostatin M.

References

  1. ^ a b "Chimeras". www.drive5.com. Retrieved 2022-10-27.
  2. S2CID 88955007.
  3. ^ .
  4. ^ . p. 424
  5. ^ Nelson C. "EST Assembly for the Creation of Oligonucleotide Probe Targets" (PDF). Agilent Technologies. Archived from the original (PDF) on 23 February 2012. Retrieved May 12, 2009.
  6. ^ PMID 21212162.
  7. .
  8. ^ "Chimera checking sequences with QIIME". Quantitative Insights Into Microbial Ecology (QIIME). Retrieved 2019-01-10.
  9. ^ Edgar R. "UCHIME algorithm". drive5.com. Retrieved 2019-01-10.
  10. ^ "removeBimeraDenovo function". R Documentation. www.rdocumentation.org. Retrieved 2019-01-10.
  11. PMID 15073015.
  12. .
  13. .
  14. ^ "SAM Format specifications" (PDF). Retrieved 2023-05-31.
  15. ^ "Entrez Gene: CYP2C18 cytochrome P450, family 2, subfamily C, polypeptide 18". National Center for Biotechnology Information. Retrieved May 12, 2009.