CRISPR

Source: Wikipedia, the free encyclopedia.
Cascade (CRISPR-associated complex for antiviral defense)
UniProt: P38036

CRISPR (

acquired immunity.[2][3][4][5] CRISPR is found in approximately 50% of sequenced bacterial genomes and nearly 90% of sequenced archaea.[6]

Diagram of the CRISPR prokaryotic antiviral defense mechanism[7]

Cas9 (or "CRISPR-associated protein 9") is an enzyme that uses CRISPR sequences as a guide to recognize and open up specific strands of DNA that are complementary to the CRISPR sequence. Cas9 enzymes together with CRISPR sequences form the basis of a technology known as CRISPR-Cas9 that can be used to edit genes within organisms.[8][9] This editing process has a wide variety of applications, including basic biological research, development of biotechnological products, and treatment of diseases.[10][11] The development of the CRISPR-Cas9 genome editing technique was recognized by the Nobel Prize in Chemistry in 2020, which was awarded to Emmanuelle Charpentier and Jennifer Doudna.[12][13]

History

Repeated sequences

The discovery of clustered DNA repeats took place independently in three parts of the world. The first description of what would later be called CRISPR is from Osaka University researcher Yoshizumi Ishino and his colleagues in 1987. They accidentally cloned part of a CRISPR sequence together with their target, the "iap" gene (isozyme conversion of alkaline phosphatase), from the genome of Escherichia coli.[14][15] The organization of the repeats was unusual. Repeated sequences are typically arranged consecutively, without different sequences interspersed between them.[11][15] They did not know the function of the interrupted clustered repeats.

In 1993, researchers of Mycobacterium tuberculosis in the Netherlands published two articles about a cluster of interrupted direct repeats (DR) in that bacterium. They recognized the diversity of the sequences that intervened between the direct repeats among different strains of M. tuberculosis[16] and used this property to design a typing method named spoligotyping, which is still in use today.[17][18]

Archaeoglobus fulgidus were transcribed into the long RNA molecules that were subsequently processed into unit-length small RNAs, plus some longer forms of 2, 3, or more spacer-repeat units.[23][24]

In 2005, yogurt researcher Rodolphe Barrangou discovered that Streptococcus thermophilus, after iterative phage challenges, develops increased phage resistance, and this enhanced resistance is due to the incorporation of additional CRISPR spacer sequences.[25] The Danish food company Danisco, which at that time Barrangou worked for, then developed phage-resistant S. thermophilus strains for use in yogurt production. Danisco was later bought out by DuPont, which "owns about 50 percent of the global dairy culture market" and the technology went mainstream.[26]

CRISPR-associated systems

A major addition to the understanding of CRISPR came with Jansen's observation that the prokaryote repeat cluster was accompanied by a set of homologous genes that make up CRISPR-associated systems or cas genes. Four cas genes (cas 1–4) were initially recognized. The Cas proteins showed helicase and nuclease motifs, suggesting a role in the dynamic structure of the CRISPR loci.[27] In this publication, the acronym CRISPR was used as the universal name of this pattern. However, the CRISPR function remained enigmatic.

Simplified diagram of a CRISPR locus. The three major components of a CRISPR locus are shown: cas genes, a leader sequence, and a repeat-spacer array. Repeats are shown as gray boxes and spacers are colored bars. The arrangement of the three components is not always as shown.[28][29] In addition, several CRISPRs with similar sequences can be present in a single genome, only one of which is associated with cas genes.[30]

In 2005, three independent research groups showed that some CRISPR spacers are derived from phage DNA and extrachromosomal DNA such as plasmids.[31][32][33] In effect, the spacers are fragments of DNA gathered from viruses that previously tried to attack the cell. The source of the spacers was a sign that the CRISPR-cas system could have a role in adaptive immunity in bacteria.[28][34] All three studies proposing this idea were initially rejected by high-profile journals, but eventually appeared in other journals.[35]

The first publication[32] proposing a role of CRISPR-Cas in microbial immunity, by Mojica and collaborators at the University of Alicante, predicted a role for the RNA transcript of spacers on target recognition in a mechanism that could be analogous to the RNA interference system used by eukaryotic cells. Koonin and colleagues extended this RNA interference hypothesis by proposing mechanisms of action for the different CRISPR-Cas subtypes according to the predicted function of their proteins.[36]

Experimental work by several groups revealed the basic mechanisms of CRISPR-Cas immunity. In 2007, the first experimental evidence that CRISPR was an adaptive immune system was published.

S. epidermidis targeted DNA and not RNA to prevent conjugation. This finding was at odds with the proposed RNA-interference-like mechanism of CRISPR-Cas immunity, although a CRISPR-Cas system that targets foreign RNA was later found in Pyrococcus furiosus.[11][38] A 2010 study showed that CRISPR-Cas cuts both strands of phage and plasmid DNA in S. thermophilus.[40]

Cas9

A simpler CRISPR system from Streptococcus pyogenes relies on the protein Cas9. The Cas9 endonuclease is a four-component system that includes two small molecules: crRNA and trans-activating CRISPR RNA (tracrRNA).[41][42] In 2012, Jennifer Doudna and Emmanuelle Charpentier re-engineered the Cas9 endonuclease into a more manageable two-component system by fusing the two RNA molecules into a "single-guide RNA" that, when combined with Cas9, could find and cut the DNA target specified by the guide RNA.[43] This contribution was so significant that it was recognized by the Nobel Prize in Chemistry in 2020. By manipulating the nucleotide sequence of the guide RNA, the artificial Cas9 system could be programmed to target any DNA sequence for cleavage.[43] Another group of collaborators comprising Virginijus Šikšnys together with Gasiūnas, Barrangou, and Horvath showed that Cas9 from the S. thermophilus CRISPR system can also be reprogrammed to target a site of their choosing by changing the sequence of its crRNA. These advances fueled efforts to edit genomes with the modified CRISPR-Cas9 system.[18]
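The idea that the guide RNA sequence alone specifies the target can be illustrated with a minimal sketch. It assumes the 5'-NGG-3' PAM of S. pyogenes Cas9 and a 20-nucleotide guide, neither of which is stated above, and it scans only the strand it is given. The function names find_cas9_targets and guide_rna are hypothetical.

```python
def find_cas9_targets(dna: str, guide_len: int = 20, pam_suffix: str = "GG"):
    """Return (start, protospacer, pam) for every site followed by an NGG PAM.

    Assumption: the PAM is the 3 bases immediately 3' of the protospacer,
    and only the strand passed in is scanned.
    """
    dna = dna.upper()
    sites = []
    for start in range(len(dna) - guide_len - 3 + 1):
        pam = dna[start + guide_len : start + guide_len + 3]
        if pam[1:] == pam_suffix:  # first PAM base is N (any base)
            sites.append((start, dna[start : start + guide_len], pam))
    return sites


def guide_rna(protospacer: str) -> str:
    """Guide sequence: same letters as the protospacer, with U instead of T."""
    return protospacer.replace("T", "U")


if __name__ == "__main__":
    toy = "ACGTACGTACGTACGTACGT" + "TGG" + "AAA"  # toy sequence, not a real gene
    for start, proto, pam in find_cas9_targets(toy):
        print(start, guide_rna(proto), pam)
```

Changing the protospacer passed to guide_rna is the only step needed to retarget the sketch, which mirrors the reprogramming described above.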

Groups led by

opportunistic pathogen Candida albicans,[49][50] zebrafish (Danio rerio),[51] fruit flies (Drosophila melanogaster),[52][53] ants (Harpegnathos saltator[54] and Ooceraea biroi[55]), mosquitoes (Aedes aegypti[56]), nematodes (Caenorhabditis elegans),[57] plants,[58] mice (Mus musculus domesticus),[59][60] monkeys[61] and human embryos.[62]

CRISPR has been modified to make programmable transcription factors that allow specific genes to be targeted and activated or silenced.[63]

A diagram of the CRISPR nucleases Cas12a and Cas9 with the position of DNA cleavage shown relative to their PAM sequences in a zoom-in

The CRISPR-Cas9 system has shown to make effective gene edits in Human

HBB) in 28 out of 54 embryos. Four out of the 28 embryos were successfully recombined using a donor template given by the scientists. The scientists showed that during DNA recombination of the cleaved strand, the homologous endogenous sequence HBD competes with the exogenous donor template. DNA repair in human embryos is much more complicated and particular than in derived stem cells.[64]

Cas12a

In 2015, the nuclease Cas12a (formerly known as Cpf1[65]) was characterized in the CRISPR-Cpf1 system of the bacterium Francisella novicida.[66][67] Its original name, from a TIGRFAMs protein family definition built in 2012, reflects the prevalence of its CRISPR-Cas subtype in the Prevotella and Francisella lineages. Cas12a showed several key differences from Cas9 including: causing a 'staggered' cut in double stranded DNA as opposed to the 'blunt' cut produced by Cas9, relying on a 'T rich' PAM (providing alternative targeting sites to Cas9), and requiring only a CRISPR RNA (crRNA) for successful targeting. By contrast, Cas9 requires both crRNA and a trans-activating crRNA (tracrRNA).

These differences may give Cas12a some advantages over Cas9. For example, Cas12a's small crRNAs are ideal for multiplexed genome editing, as more of them can be packaged in one vector than can Cas9's sgRNAs. The sticky 5′ overhangs left by Cas12a can also be used for DNA assembly that is much more target-specific than traditional restriction enzyme cloning.[68] Finally, Cas12a cleaves DNA 18–23 base pairs downstream from the PAM site. This means there is no disruption to the recognition sequence after repair, and so Cas12a enables multiple rounds of DNA cleavage. By contrast, since Cas9 cuts only 3 base pairs upstream of the PAM site, the NHEJ pathway results in indel mutations that destroy the recognition sequence, thereby preventing further rounds of cutting. In theory, repeated rounds of DNA cleavage should increase the opportunity for the desired genomic editing to occur.[69] A distinctive feature of Cas12a, as compared to Cas9, is that after cutting its target, Cas12a remains bound to the target and then cleaves other ssDNA molecules indiscriminately.[70] This property is called "collateral cleavage" or "trans-cleavage" activity and has been exploited for the development of various diagnostic technologies.[71][72]
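The cut geometry described above can be written out as a small sketch. It uses 0-based indices on the PAM-containing strand and assumes a 4-base Cas12a PAM located 5' of the protospacer and a Cas9 PAM located 3' of it; the PAM length, the function names, and the choice not to model which strand receives which Cas12a cut are all assumptions made for illustration.

```python
def cas9_cut(pam_start: int) -> int:
    """Blunt cut position (index between bases), 3 bp upstream of the PAM."""
    return pam_start - 3


def cas12a_cuts(pam_start: int, pam_len: int = 4,
                near: int = 18, far: int = 23) -> tuple[int, int]:
    """Two staggered cut positions, counted downstream from the end of the PAM."""
    pam_end = pam_start + pam_len
    return pam_end + near, pam_end + far


def overhang_length(cuts: tuple[int, int]) -> int:
    """Length of the single-stranded overhang left by a staggered cut."""
    return abs(cuts[1] - cuts[0])


if __name__ == "__main__":
    print(cas9_cut(pam_start=20))             # 17: inside the 20-nt protospacer
    cuts = cas12a_cuts(pam_start=0)
    print(cuts, overhang_length(cuts))        # (22, 27) 5
```

Under these assumptions the Cas9 cut falls 3 bp from the PAM, while the Cas12a cuts fall at the PAM-distal end of the target, which is the basis of the argument about repeated rounds of cleavage.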

Cas13

In 2016, the nuclease Cas13a (formerly known as C2c2) from the bacterium Leptotrichia shahii was characterized. Cas13 is an RNA-guided RNA endonuclease, which means that it does not cleave DNA, but only single-stranded RNA. Cas13 is guided by its crRNA to a ssRNA target, which it binds and cleaves. Similar to Cas12a, Cas13 remains bound to the target and then cleaves other ssRNA molecules indiscriminately.[73] This collateral cleavage property has been exploited for the development of various diagnostic technologies.[74][75][76]

In 2021, Dr. Hui Yang characterized novel miniature Cas13 protein (mCas13) variants, Cas13X and Cas13Y. Characterization of mCas13 using a small portion of the N gene sequence from SARS-CoV-2 as a target revealed the sensitivity and specificity of mCas13 coupled with RT-LAMP for detection of SARS-CoV-2 in both synthetic and clinical samples, compared with other available standard tests such as RT-qPCR (1 copy/μL).[77]

Locus structure

Repeats and spacers

The CRISPR array is made up of an AT-rich leader sequence followed by short repeats that are separated by unique spacers.[78] CRISPR repeats typically range in size from 28 to 37 base pairs (bp), though there can be as few as 23 bp and as many as 55 bp.[79] Some show dyad symmetry, implying the formation of a secondary structure such as a stem-loop ('hairpin') in the RNA, while others are predicted to be unstructured. The size of spacers in different CRISPR arrays is typically 32 to 38 bp (range 21 to 72 bp).[79] New spacers can appear rapidly as part of the immune response to phage infection.[80] There are usually fewer than 50 units of the repeat-spacer sequence in a CRISPR array.[79]
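The leader, repeat, spacer layout can be made concrete with a minimal sketch that recovers spacers from an array when the repeat sequence is already known. It assumes exact, non-overlapping repeat copies; real repeats can carry mismatches, which this sketch ignores. The sequences and the name split_array are made up for illustration.

```python
def split_array(array: str, repeat: str) -> list[str]:
    """Return the spacers lying between consecutive exact copies of the repeat."""
    parts = array.split(repeat)
    # parts[0] is the leader-side flank and parts[-1] the trailing flank;
    # everything in between is bounded by a repeat on both sides.
    return [p for p in parts[1:-1] if p]


if __name__ == "__main__":
    leader = "ATATATAT"                            # toy AT-rich leader
    repeat = "GAGTTCCCCGCGCCAGCGGGGATAAACCG"        # toy 29-bp repeat
    spacers = ["ACTG" * 8, "CATG" * 8]             # toy 32-bp spacers
    array = leader + "".join(repeat + s for s in spacers) + repeat
    for s in split_array(array, repeat):
        print(len(s), s)
```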

CRISPR RNA structures

  • CRISPR-DR2: Secondary structure taken from the Rfam database. Family RF01315.
  • CRISPR-DR5: Secondary structure taken from the Rfam database. Family RF01318.
  • CRISPR-DR6: Secondary structure taken from the Rfam database. Family RF01319.
  • CRISPR-DR8: Secondary structure taken from the Rfam database. Family RF01321.
  • CRISPR-DR9: Secondary structure taken from the Rfam database. Family RF01322.
  • CRISPR-DR19: Secondary structure taken from the Rfam database. Family RF01332.
  • CRISPR-DR41: Secondary structure taken from the Rfam database. Family RF01350.
  • CRISPR-DR52: Secondary structure taken from the Rfam database. Family RF01365.
  • CRISPR-DR57: Secondary structure taken from the Rfam database. Family RF01370.
  • CRISPR-DR65: Secondary structure taken from the Rfam database. Family RF01378.

Cas genes and CRISPR subtypes

Small clusters of cas genes are often located next to CRISPR repeat-spacer arrays. Collectively the 93 cas genes are grouped into 35 families based on sequence similarity of the encoded proteins. 11 of the 35 families form the cas core, which includes the protein families Cas1 through Cas9. A complete CRISPR-Cas locus has at least one gene belonging to the cas core.[81]

CRISPR-Cas systems fall into two classes. Class 1 systems use a complex of multiple Cas proteins to degrade foreign nucleic acids. Class 2 systems use a single large Cas protein for the same purpose. Class 1 is divided into types I, III, and IV; class 2 is divided into types II, V, and VI.

phylogeny of Cas1 proteins generally agrees with the classification system,[84] but exceptions exist due to module shuffling.[81] Many organisms contain multiple CRISPR-Cas systems, suggesting that they are compatible and may share components.[85][86] The sporadic distribution of the CRISPR-Cas subtypes suggests that the CRISPR-Cas system is subject to horizontal gene transfer during microbial evolution.

Signature genes and their putative functions for the major and minor CRISPR-cas types
Class | Cas type | Cas subtype | Signature protein | Function | Reference
1 | I | | Cas3 | Single-stranded DNA nuclease (HD domain) and ATP-dependent helicase | [87][88]
1 | I | I-A | Cas8a, Cas5 | Cas8 is a subunit of the interference module that is important in targeting of invading DNA by recognizing the PAM sequence. Cas5 is required for processing and stability of crRNAs. | [84][89]
1 | I | I-B | Cas8b | |
1 | I | I-C | Cas8c | |
1 | I | I-D | Cas10d | Contains a domain homologous to the palm domain of nucleic acid polymerases and nucleotide cyclases | [90][91]
1 | I | I-E | Cse1, Cse2 | |
1 | I | I-F | Csy1, Csy2, Csy3 | Type IF-3 have been implicated in CRISPR-associated transposons | [84]
1 | I | I-G[Note 1] | GSU0054 | | [92]
1 | III | | Cas10 | Homolog of Cas10d and Cse1. Binds CRISPR target RNA and promotes stability of the interference complex | [91][93]
1 | III | III-A | Csm2 | Not determined | [84]
1 | III | III-B | Cmr5 | Not determined | [84]
1 | III | III-C | Cas10 or Csx11 | | [84][93]
1 | III | III-D | Csx10 | | [84]
1 | III | III-E | | | [92]
1 | III | III-F | | | [92]
1 | IV | | Csf1 | | [92]
1 | IV | IV-A | | | [92]
1 | IV | IV-B | | | [92]
1 | IV | IV-C | | | [92]
2 | II | | Cas9 | Nucleases RuvC and HNH together produce DSBs, and separately can produce single-strand breaks. Ensures the acquisition of functional spacers during adaptation. | [94][95]
2 | II | II-A | Csn2 | Ring-shaped DNA-binding protein. Involved in primed adaptation in Type II CRISPR system. | [96]
2 | II | II-B | Cas4 | Endonuclease that works with cas1 and cas2 to generate spacer sequences | [97]
2 | II | II-C | | Characterized by the absence of either Csn2 or Cas4 | [98]
2 | V | | Cas12 | Nuclease RuvC. Lacks HNH. | [82][99]
2 | V | V-A | Cas12a (Cpf1) | Auto-processing pre-crRNA activity for multiplex gene regulation | [92][100]
2 | V | V-B | Cas12b (C2c1) | | [92]
2 | V | V-C | Cas12c (C2c3) | | [92]
2 | V | V-D | Cas12d (CasY) | | [92]
2 | V | V-E | Cas12e (CasX) | | [92]
2 | V | V-F | Cas12f (Cas14, C2c10) | | [92]
2 | V | V-G | Cas12g | | [92]
2 | V | V-H | Cas12h | | [92]
2 | V | V-I | Cas12i | | [92]
2 | V | V-K[Note 2] | Cas12k (C2c5) | Type V-K have been implicated in CRISPR-associated transposons. | [92]
2 | V | V-U | C2c4, C2c8, C2c9 | | [92]
2 | VI | | Cas13 | RNA-guided RNase | [82][101]
2 | VI | VI-A | Cas13a (C2c2) | | [92]
2 | VI | VI-B | Cas13b | | [92]
2 | VI | VI-C | Cas13c | | [92]
2 | VI | VI-D | Cas13d | | [92]
2 | VI | VI-X | Cas13x.1 | RNA dependent RNA polymerase, Prophylactic RNA-virus inhibition | [102]
2 | VI | VI-Y | | | [102]
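As a rough illustration of how the type-level signature proteins in the table could be used, the following sketch maps a set of Cas proteins found at a locus to a class and type. It covers only the six type-level signature proteins listed above, ignores subtypes, and is not a substitute for a real classification pipeline; the names CRISPR_TYPES and classify are hypothetical.

```python
# Type -> (class, signature protein), taken from the type-level rows of the table.
CRISPR_TYPES = {
    "I": (1, "Cas3"),
    "III": (1, "Cas10"),
    "IV": (1, "Csf1"),
    "II": (2, "Cas9"),
    "V": (2, "Cas12"),
    "VI": (2, "Cas13"),
}


def classify(cas_proteins: set[str]) -> list[tuple[int, str]]:
    """Return sorted (class, type) pairs whose signature protein is present."""
    return sorted(
        (cls, cas_type)
        for cas_type, (cls, signature) in CRISPR_TYPES.items()
        if signature in cas_proteins
    )


if __name__ == "__main__":
    print(classify({"Cas1", "Cas2", "Cas9"}))    # a type II locus
    print(classify({"Cas1", "Cas3", "Cas13"}))   # two systems in one genome
```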

Mechanism

The stages of CRISPR immunity for each of the three major types of adaptive immunity.
(1) Acquisition begins by recognition of invading DNA by Cas1 and Cas2 and cleavage of a protospacer.
(2) The protospacer is ligated to the direct repeat adjacent to the leader sequence and
(3) single strand extension repairs the CRISPR and duplicates the direct repeat. The crRNA processing and interference stages occur differently in each of the three major CRISPR systems.
(4) The primary CRISPR transcript is cleaved by cas genes to produce crRNAs.
(5) In type I systems Cas6e/Cas6f cleave at the junction of ssRNA and dsRNA formed by hairpin loops in the direct repeat. Type II systems use a trans-activating (tracr) RNA to form dsRNA, which is cleaved by Cas9 and RNaseIII. Type III systems use a Cas6 homolog that does not require hairpin loops in the direct repeat for cleavage.
(6) In type II and type III systems secondary trimming is performed at either the 5' or 3' end to produce mature crRNAs.
(7) Mature crRNAs associate with Cas proteins to form interference complexes.
(8) In type I and type II systems, interactions between the protein and PAM sequence are required for degradation of invading DNA. Type III systems do not require a PAM for successful degradation and in type III-A systems basepairing occurs between the crRNA and mRNA rather than the DNA, targeted by type III-B systems.
The CRISPR genetic locus provides bacteria with a defense mechanism to protect them from repeated phage infections.
Transcripts of the CRISPR Genetic Locus and Maturation of pre-crRNA
3D Structure of the CRISPR-Cas9 Interference Complex
CRISPR-Cas9 as a Molecular Tool Introduces Targeted Double Strand DNA Breaks.
Double-strand DNA breaks introduced by CRISPR-Cas9 allow further genetic manipulation by exploiting endogenous DNA repair mechanisms.

CRISPR-Cas immunity is a natural process of bacteria and archaea.[103] CRISPR-Cas prevents bacteriophage infection, conjugation and natural transformation by degrading foreign nucleic acids that enter the cell.[38]

Spacer acquisition

When a microbe is invaded by a bacteriophage, the first stage of the immune response is to capture phage DNA and insert it into a CRISPR locus in the form of a spacer. Cas1 and Cas2 are found in both types of CRISPR-Cas immune systems, which indicates that they are involved in spacer acquisition. Mutation studies confirmed this hypothesis, showing that removal of Cas1 or Cas2 stopped spacer acquisition without affecting the CRISPR immune response.[104][105][106][107][108]

Multiple Cas1 proteins have been characterised and their structures resolved.

integrases that bind to DNA in a sequence-independent manner.[85] Representative Cas2 proteins have been characterised and possess either (single strand) ssRNA-[112] or (double strand) dsDNA-[113][114] specific endoribonuclease activity.

In the I-E system of E. coli, Cas1 and Cas2 form a complex where a Cas2 dimer bridges two Cas1 dimers.[115] In this complex Cas2 performs a non-enzymatic scaffolding role,[115] binding double-stranded fragments of invading DNA, while Cas1 binds the single-stranded flanks of the DNA and catalyses their integration into CRISPR arrays.[116][117][118] New spacers are usually added at the beginning of the CRISPR next to the leader sequence, creating a chronological record of viral infections.[119] In E. coli a histone-like protein called integration host factor (IHF), which binds to the leader sequence, is responsible for the accuracy of this integration.[120] IHF also enhances integration efficiency in the type I-F system of Pectobacterium atrosepticum,[121] but in other systems different host factors may be required.[122]

Protospacer adjacent motifs (PAM)

Bioinformatic analysis of regions of phage genomes that were excised as spacers (termed protospacers) revealed that they were not randomly selected but instead were found adjacent to short (3–5 bp) DNA sequences termed

leader sequence.[127][130]

New spacers are added to a CRISPR array in a directional manner,[31] occurring preferentially,[80][123][124][131][132] but not exclusively, adjacent[126][129] to the leader sequence. Analysis of the type I-E system from E. coli demonstrated that the first direct repeat adjacent to the leader sequence is copied, with the newly acquired spacer inserted between the first and second direct repeats.[107][128]
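The leader-proximal insertion described above can be sketched as a simple data structure. The sketch models only the preferred case (insertion next to the leader, with one extra repeat copy appearing in the array) and ignores the exceptions noted above; the class name CrisprArray and the toy sequences are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CrisprArray:
    leader: str
    repeat: str
    spacers: list[str] = field(default_factory=list)  # leader-proximal first

    def acquire(self, protospacer: str) -> None:
        """Insert a new spacer next to the leader.

        In sequence terms the first repeat is duplicated and the new spacer
        sits between the two copies; here that is implicit, because
        sequence() always emits one more repeat than there are spacers.
        """
        self.spacers.insert(0, protospacer)

    def sequence(self) -> str:
        units = "".join(self.repeat + s for s in self.spacers)
        return self.leader + units + self.repeat


if __name__ == "__main__":
    arr = CrisprArray(leader="ATATAT", repeat="GTTTCAGAC", spacers=["AAACCC"])
    arr.acquire("GGGTTT")          # most recent infection ends up first
    print(arr.spacers)
    print(arr.sequence())
```

Reading the spacers list from the leader outward then gives the most recent acquisitions first, which is the chronological record mentioned earlier.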

The PAM sequence appears to be important during spacer insertion in type I-E systems. That sequence contains a strongly conserved final nucleotide (nt) adjacent to the first nt of the protospacer. This nt becomes the final base in the first direct repeat.[108][133][134] This suggests that the spacer acquisition machinery generates single-stranded overhangs in the second-to-last position of the direct repeat and in the PAM during spacer insertion. However, not all CRISPR-Cas systems appear to share this mechanism as PAMs in other organisms do not show the same level of conservation in the final position.[130] It is likely that in those systems, a blunt end is generated at the very end of the direct repeat and the protospacer during acquisition.

Insertion variants

Analysis of Sulfolobus solfataricus CRISPRs revealed further complexities to the canonical model of spacer insertion, as one of its six CRISPR loci inserted new spacers randomly throughout its CRISPR array, as opposed to inserting closest to the leader sequence.[129]

Multiple CRISPRs contain many spacers to the same phage. The mechanism that causes this phenomenon was discovered in the type I-E system of E. coli. A significant enhancement in spacer acquisition was detected where spacers already target the phage, even when they contain mismatches to the protospacer. This 'priming' requires the Cas proteins involved in both acquisition and interference to interact with each other. Newly acquired spacers that result from the priming mechanism are always found on the same strand as the priming spacer.[108][133][134] This observation led to the hypothesis that the acquisition machinery slides along the foreign DNA after priming to find a new protospacer.[134]

Biogenesis

CRISPR-RNA (crRNA), which later guides the Cas nuclease to the target during the interference step, must be generated from the CRISPR sequence. The crRNA is initially transcribed as part of a single long transcript encompassing much of the CRISPR array.[29] This transcript is then cleaved by Cas proteins to form crRNAs. The mechanism to produce crRNAs differs among CRISPR-Cas systems. In type I-E and type I-F systems, the proteins Cas6e and Cas6f respectively, recognise stem-loops[135][136][137] created by the pairing of identical repeats that flank the crRNA.[138] These Cas proteins cleave the longer transcript at the edge of the paired region, leaving a single crRNA along with a small remnant of the paired repeat region.
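A minimal sketch of this processing step follows, assuming every repeat in the transcript is cut at the same fixed offset so that each crRNA carries repeat remnants on both sides of its spacer. The cut_offset parameter, the function name, and the toy RNA sequences are illustrative assumptions, not measured values.

```python
def process_pre_crrna(transcript: str, repeat: str, cut_offset: int) -> list[str]:
    """Cut a long pre-crRNA once inside every repeat and return the pieces
    that are bounded by two cuts (one spacer plus repeat remnants)."""
    cuts = []
    pos = transcript.find(repeat)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = transcript.find(repeat, pos + len(repeat))
    return [transcript[a:b] for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    repeat = "GUUCAC"                      # toy repeat (RNA letters)
    spacers = ["AAAAUUUU", "CCCCGGGG"]     # toy spacers
    pre_crrna = "".join(repeat + s for s in spacers) + repeat
    for crrna in process_pre_crrna(pre_crrna, repeat, cut_offset=4):
        print(crrna)
```

Each returned fragment has the form (end of one repeat) + spacer + (start of the next repeat), matching the description of a crRNA that keeps a small remnant of the repeat region.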

Type III systems also use Cas6; however, their repeats do not produce stem-loops. Cleavage instead occurs by the longer transcript wrapping around the Cas6 to allow cleavage just upstream of the repeat sequence.[139][140][141]

Type II systems lack the Cas6 gene and instead utilize RNaseIII for cleavage. Functional type II systems encode an extra small RNA that is complementary to the repeat sequence, known as a trans-activating crRNA (tracrRNA).[41] Transcription of the tracrRNA and the primary CRISPR transcript results in base pairing and the formation of dsRNA at the repeat sequence, which is subsequently targeted by RNaseIII to produce crRNAs. Unlike the other two systems, the crRNA does not contain the full spacer, which is instead truncated at one end.[94]

CrRNAs associate with Cas proteins to form ribonucleoprotein complexes that recognize foreign nucleic acids. CrRNAs show no preference between the coding and non-coding strands, which is indicative of an RNA-guided DNA-targeting system.[5][40][104][108][142][143][144] The type I-E complex (commonly referred to as Cascade) requires five Cas proteins bound to a single crRNA.[145][146]

Interference

During the interference stage in type I systems, the PAM sequence is recognized on the crRNA-complementary strand and is required along with crRNA annealing. In type I systems correct base pairing between the crRNA and the protospacer signals a conformational change in Cascade that recruits Cas3 for DNA degradation.

Type II systems rely on a single multifunctional protein, Cas9, for the interference step.[94] Cas9 requires both the crRNA and the tracrRNA to function and cleaves DNA using its dual HNH and RuvC/RNaseH-like endonuclease domains. Basepairing between the PAM and the phage genome is required in type II systems. However, the PAM is recognized on the same strand as the crRNA (the opposite strand to type I systems).

Type III systems, like type I, require six or seven Cas proteins binding to crRNAs.[147][148] The type III systems analysed from S. solfataricus and P. furiosus both target the mRNA of phages rather than the phage DNA genome,[86][148] which may make these systems uniquely capable of targeting RNA-based phage genomes.[85] Type III systems were also found to target DNA in addition to RNA using a different Cas protein in the complex, Cas10.[149] The DNA cleavage was shown to be transcription dependent.[150]

The mechanism for distinguishing self from foreign DNA during interference is built into the crRNAs and is therefore likely common to all three systems. Throughout the distinctive maturation process of each major type, all crRNAs contain a spacer sequence and some portion of the repeat at one or both ends. It is the partial repeat sequence that prevents the CRISPR-Cas system from targeting the chromosome as base pairing beyond the spacer sequence signals self and prevents DNA cleavage.[151] RNA-guided CRISPR enzymes are classified as type V restriction enzymes.
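The self versus foreign logic can be sketched in a few lines. To keep it short, the sketch writes every sequence in the same orientation as the crRNA and in DNA letters, so that "base pairing" reduces to string equality, and it treats pairing as all-or-nothing; both are simplifying assumptions, and the function and parameter names are hypothetical.

```python
def classify_target(protospacer: str, flank: str,
                    spacer: str, repeat_tag: str) -> str:
    """Decide how a crRNA (repeat_tag + spacer) treats a candidate site.

    protospacer: the candidate site itself
    flank:       the bases next to the site, on the repeat_tag side
    """
    if protospacer != spacer:
        return "no match"      # spacer does not pair, nothing happens
    if flank == repeat_tag:
        return "self"          # pairing extends into the repeat: CRISPR array
    return "foreign"           # spacer pairs, flank does not: cleave


if __name__ == "__main__":
    spacer, tag = "ACGTTGCA", "GTTC"   # toy sequences
    print(classify_target("ACGTTGCA", "GTTC", spacer, tag))  # self
    print(classify_target("ACGTTGCA", "AAGG", spacer, tag))  # foreign
    print(classify_target("ACGTTGGA", "AAGG", spacer, tag))  # no match
```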

Evolution

CRISPR associated protein Cas2 (adaptation RNase)
Crystal structure of a hypothetical protein tt1823 from Thermus thermophilus
Identifiers: Symbol CRISPR_Cas2; Pfam PF09827; InterPro IPR019199; CDD cd09638

CRISPR-associated protein CasA/Cse1 (Type I effector DNase)
Identifiers: Symbol CRISPR_Cse1; Pfam PF09481; InterPro IPR013381; CDD cd09729

CRISPR associated protein CasC/Cse3/Cas6 (Type I effector RNase)
Crystal structure of a crispr-associated protein from Thermus thermophilus
Identifiers: Symbol CRISPR_assoc; Pfam PF08798; Pfam clan CL0362; InterPro IPR010179; CDD cd09727

The cas genes in the adaptor and effector modules of the CRISPR-Cas system are believed to have evolved from two different ancestral modules. A transposon-like element called casposon encoding the Cas1-like integrase and potentially other components of the adaptation module was inserted next to the ancestral effector module, which likely functioned as an independent innate immune system.[152] The highly conserved cas1 and cas2 genes of the adaptor module evolved from the ancestral module while a variety of class 1 effector cas genes evolved from the ancestral effector module.[153] The evolution of these various class 1 effector module cas genes was guided by various mechanisms, such as duplication events.[154] On the other hand, each type of class 2 effector module arose from subsequent independent insertions of mobile genetic elements.[155] These mobile genetic elements took the place of the multiple gene effector modules to create single gene effector modules that produce large proteins which perform all the necessary tasks of the effector module.[155] The spacer regions of CRISPR-Cas systems are taken directly from foreign mobile genetic elements and thus their long-term evolution is hard to trace.[156] The non-random evolution of these spacer regions has been found to be highly dependent on the environment and the particular foreign mobile genetic elements it contains.[157]

CRISPR-Cas can immunize bacteria against certain phages and thus halt transmission. For this reason, Koonin described CRISPR-Cas as a Lamarckian inheritance mechanism.[158] However, this was disputed by a critic who noted, "We should remember [Lamarck] for the good he contributed to science, not for things that resemble his theory only superficially. Indeed, thinking of CRISPR and other phenomena as Lamarckian only obscures the simple and elegant way evolution really works".[159] But as more recent studies have been conducted, it has become apparent that the acquired spacer regions of CRISPR-Cas systems are indeed a form of Lamarckian evolution because they are genetic mutations that are acquired and then passed on.[160] On the other hand, the evolution of the Cas gene machinery that facilitates the system evolves through classic Darwinian evolution.[160]

Coevolution

Analysis of CRISPR sequences revealed

commensal bacteria. CRISPR-Cas-mediated gene regulation may contribute to the regulation of endogenous bacterial genes, particularly during interaction with eukaryotic hosts. For example, Francisella novicida uses a unique, small, CRISPR-Cas-associated RNA (scaRNA) to repress an endogenous transcript encoding a bacterial lipoprotein that is critical for F. novicida to dampen host response and promote virulence.[162]

The basic model of CRISPR evolution is newly incorporated spacers driving phages to mutate their genomes to avoid the bacterial immune response, creating diversity in both the phage and host populations. To resist a phage infection, the sequence of the CRISPR spacer must correspond perfectly to the sequence of the target phage gene. Phages can continue to infect their hosts given point mutations in the spacer.[151] Similar stringency is required in the PAM, or the bacterial strain remains phage sensitive.[124][151]

Rates

A study of 124 S. thermophilus strains showed that 26% of all spacers were unique and that different CRISPR loci showed different rates of spacer acquisition.[123] Some CRISPR loci evolve more rapidly than others, which allowed the strains' phylogenetic relationships to be determined. A comparative genomic analysis showed that E. coli and S. enterica evolve much more slowly than S. thermophilus. Strains of the former two species that diverged 250,000 years ago still contained the same spacer complement.[163]

Metagenomic analysis of two acid-mine-drainage biofilms showed that one of the analyzed CRISPRs contained extensive deletions and spacer additions versus the other biofilm, suggesting a higher phage activity/prevalence in one community than the other.[80] In the oral cavity, a temporal study determined that 7–22% of spacers were shared over 17 months within an individual while less than 2% were shared across individuals.[132]
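Figures such as "7–22% of spacers were shared" depend on how sharing is counted, which is not specified here. The sketch below shows one plausible convention, the number of distinct spacers present in both samples divided by the number present in either; this definition and the function name are assumptions, not the method of the cited study.

```python
def shared_fraction(spacers_a, spacers_b) -> float:
    """Fraction of distinct spacers found in both samples (Jaccard index)."""
    a, b = set(spacers_a), set(spacers_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    month_0 = ["s1", "s2", "s3"]     # toy spacer identifiers
    month_17 = ["s2", "s3", "s4"]
    print(f"{shared_fraction(month_0, month_17):.0%}")
```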

From the same environment, a single strain was tracked using PCR primers specific to its CRISPR system. Broad-level results of spacer presence/absence showed significant diversity. However, this CRISPR added three spacers over 17 months,[132] suggesting that even in an environment with significant CRISPR diversity some loci evolve slowly.

CRISPRs were analysed from the metagenomes produced for the

streptococcal species and contained ≈15,000 spacers, 50% of which were unique. Similar to the targeted studies of the oral cavity, some showed little evolution over time.[164]

CRISPR evolution was studied in chemostats using S. thermophilus to directly examine spacer acquisition rates. In one week, S. thermophilus strains acquired up to three spacers when challenged with a single phage.[165] During the same interval, the phage developed single-nucleotide polymorphisms that became fixed in the population, suggesting that targeting had prevented phage replication absent these mutations.[165]

Another S. thermophilus experiment showed that phages can infect and replicate in hosts that have only one targeting spacer. Yet another showed that sensitive hosts can exist in environments with high phage titres.[166] The chemostat and observational studies suggest many nuances to CRISPR and phage (co)evolution.

Identification

CRISPRs are widely distributed among bacteria and archaea[90] and show some sequence similarities.[138] Their most notable characteristic is their repeating spacers and direct repeats. This characteristic makes CRISPRs easily identifiable in long sequences of DNA, since the number of repeats decreases the likelihood of a false positive match.[167]
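
The repeat-and-spacer signature can be searched for directly. The sketch below is a deliberately naive illustration, not the method of any published tool: it assumes exact repeats of a known fixed length, and the function name, parameters and toy sequence are all hypothetical. Requiring several consecutive copies (`min_repeats`) reflects the point that more repeats make a chance match less likely.

```python
def find_crispr_arrays(seq, repeat_len=8, min_spacer=6, max_spacer=12, min_repeats=3):
    """Naive scan for arrays of an exact repeat separated by short spacers.

    Returns a list of (repeat, [start positions]) tuples.
    """
    arrays = []
    i = 0
    while i + repeat_len <= len(seq):
        repeat = seq[i:i + repeat_len]
        starts = [i]
        pos = i
        while True:
            # The next copy must start within the allowed spacer distance.
            lo = pos + repeat_len + min_spacer
            hi = pos + repeat_len + max_spacer
            nxt = seq.find(repeat, lo, hi + repeat_len)
            if nxt == -1:
                break
            starts.append(nxt)
            pos = nxt
        if len(starts) >= min_repeats:
            arrays.append((repeat, starts))
            i = starts[-1] + repeat_len  # continue after the reported array
        else:
            i += 1
    return arrays

# Toy locus: four copies of one repeat separated by three spacers.
seq = ("CCCC" + "GTTTTAGA" + "ACGTACGTAC" + "GTTTTAGA" + "TTGCAAGGCT"
       + "GTTTTAGA" + "CATCATGGA" + "GTTTTAGA" + "GGGG")
for repeat, starts in find_crispr_arrays(seq):
    spacers = [seq[a + len(repeat):b] for a, b in zip(starts, starts[1:])]
    print(repeat, starts, spacers)
```

Practical detection software additionally has to cope with unknown repeat lengths and imperfect repeat copies, which this sketch ignores.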

Analysis of CRISPRs in metagenomic data is more challenging, as CRISPR loci do not typically assemble, due to their repetitive nature or through strain variation, which confuses assembly algorithms. Where many reference genomes are available, polymerase chain reaction (PCR) can be used to amplify CRISPR arrays and analyse spacer content.[123][132][168][169][170][171] However, this approach yields information only for specifically targeted CRISPRs and for organisms with sufficient representation in public databases to design reliable PCR primers. Degenerate repeat-specific primers can be used to amplify CRISPR spacers directly from environmental samples; amplicons containing two or three spacers can then be computationally assembled to reconstruct long CRISPR arrays.[171]
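
The idea of chaining short amplicons into a longer array can be sketched as follows. This is an illustrative simplification under stated assumptions: each amplicon is already reduced to an ordered tuple of spacer identifiers, every spacer occurs once in a single linear array, and the names (`assemble_array`, `s1` to `s6`) are hypothetical.

```python
def assemble_array(amplicons):
    """Chain short ordered spacer tuples into one array via shared spacers."""
    successor = {}
    has_predecessor = set()
    for amp in amplicons:
        for left, right in zip(amp, amp[1:]):
            # Each spacer may be followed by only one other spacer.
            if successor.setdefault(left, right) != right:
                raise ValueError(f"conflicting successors for {left}")
            has_predecessor.add(right)
    # The array starts at the only spacer that never appears on the right.
    starts = [s for s in successor if s not in has_predecessor]
    if len(starts) != 1:
        raise ValueError("expected exactly one array start")
    array = [starts[0]]
    while array[-1] in successor:
        array.append(successor[array[-1]])
        if len(array) > len(successor) + 1:
            raise ValueError("cycle detected")
    return array

amplicons = [("s3", "s4", "s5"), ("s1", "s2"), ("s2", "s3"), ("s4", "s5", "s6")]
print(assemble_array(amplicons))  # ['s1', 's2', 's3', 's4', 's5', 's6']
```

Environmental samples contain many strains with diverging arrays, so a real reconstruction would need to handle branching and conflicting orders rather than raise an error.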

The alternative is to extract and reconstruct CRISPR arrays from shotgun metagenomic data. This is computationally more difficult, particularly with second generation sequencing technologies (e.g. 454, Illumina), as the short read lengths prevent more than two or three repeat units appearing in a single read. CRISPR identification in raw reads has been achieved using purely de novo identification[172] or by using direct repeat sequences in partially assembled CRISPR arrays from contigs (overlapping DNA segments that together represent a consensus region of DNA)[164] and direct repeat sequences from published genomes[173] as a hook for identifying direct repeats in individual reads.
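
Using a known direct repeat as a hook can be sketched in a few lines. The example assumes exact repeat matches and reads that contain at least two repeat copies; the function name `spacers_from_reads`, the repeat and the reads are hypothetical toy values rather than output of the cited methods.

```python
from collections import Counter

def spacers_from_reads(reads, direct_repeat):
    """Count spacers that are flanked by two copies of a known direct repeat."""
    counts = Counter()
    n = len(direct_repeat)
    for read in reads:
        # Collect the position of every exact repeat copy in this read.
        hits = []
        pos = read.find(direct_repeat)
        while pos != -1:
            hits.append(pos)
            pos = read.find(direct_repeat, pos + 1)
        for left, right in zip(hits, hits[1:]):
            spacer = read[left + n:right]
            if spacer:  # skip adjacent or overlapping repeat hits
                counts[spacer] += 1
    return counts

dr = "GTTTTAGA"  # toy direct repeat
reads = [
    "ACGT" + dr + "AAACCCGGG" + dr + "TT",
    dr + "AAACCCGGG" + dr + "TTTGGGCCC" + dr,
    "ACGTACGTACGT",  # read without any repeat copy
]
print(spacers_from_reads(reads, dr))
```

Because short reads rarely hold more than two or three repeat units, each read contributes only one or two spacers, and spacers at read ends that are flanked by a single repeat are not recovered by this sketch.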

Use by phages

Another way for bacteria to defend against phage infection is by having chromosomal islands. A subtype of chromosomal islands called phage-inducible chromosomal island (PICI) is excised from a bacterial chromosome upon phage infection and can inhibit phage replication.[174] PICIs are induced, excised, replicated, and finally packaged into small capsids by certain staphylococcal temperate phages. PICIs use several mechanisms to block phage reproduction. In the first mechanism, PICI-encoded Ppi differentially blocks phage maturation by binding or interacting specifically with phage TerS, thereby blocking formation of the phage TerS/TerL complex responsible for phage DNA packaging. In the second mechanism, PICI CpmAB redirects the phage capsid morphogenetic protein so that 95% of the capsids made are SaPI-sized; phage DNA can package only one third of its genome in these small capsids, so the resulting phage particles are nonviable.[175] The third mechanism involves two proteins, PtiA and PtiB, that target LtrC, which is responsible for the production of virion and lysis proteins. This interference mechanism is modulated by a modulatory protein, PtiM, which binds to one of the interference-mediating proteins, PtiA, and hence achieves the required level of interference.[176]

One study showed that lytic ICP1 phage, which specifically targets Vibrio cholerae serogroup O1, has acquired a CRISPR-Cas system that targets a V. cholerae PICI-like element. The system has 2 CRISPR loci and 9 Cas genes. It seems to be homologous to the I-F system found in Yersinia pestis. Moreover, like the bacterial CRISPR-Cas system, ICP1 CRISPR-Cas can acquire new sequences, which allows phage and host to co-evolve.[177][178]

Certain archaeal viruses were shown to carry mini-CRISPR arrays containing one or two spacers. It has been shown that spacers within the virus-borne CRISPR arrays target other viruses and plasmids, suggesting that mini-CRISPR arrays represent a mechanism of heterotypic superinfection exclusion and participate in interviral conflicts.[171]

Applications

CRISPR gene editing

CRISPR technology has been applied in the food and farming industries to engineer probiotic cultures and to immunize industrial cultures (for yogurt, for instance) against infections. It is also being used in crops to enhance yield, drought tolerance and nutritional value.[179][180][181] CRISPR gene editing has also become a widely used tool for scientific research. Amplifying or knocking out gene products is a reliable way to identify genes of interest for pharmaceutical development, or simply to better understand the complexities of a genome.

By the end of 2014, some 1,000 research papers had been published that mentioned CRISPR.

biofuels, and genetically modify crop strains.[183] Hsu and his colleagues state that the ability to manipulate genetic sequences allows for reverse engineering that can positively affect biofuel production.[184] CRISPR can also be used to change mosquitoes so they cannot transmit diseases such as malaria.[185] CRISPR-based approaches using Cas12a have recently been used to successfully modify a broad range of plant species.[186]

In July 2019, CRISPR was used to experimentally treat a patient with a genetic disorder. The patient was a 34-year-old woman with sickle cell disease.[187]

In February 2020, progress was made on HIV treatments with 60–80% of the integrated viral DNA removed in mice and some being completely free from the virus after edits involving both LASER ART, a new anti-retroviral therapy, and CRISPR.[188]

In March 2020, CRISPR-modified virus was injected into a patient's eye in an attempt to treat Leber congenital amaurosis.[189]

In the future, CRISPR gene editing could potentially be used to create new species or revive extinct species from closely related ones.[190]

CRISPR-based re-evaluations of claims for gene-disease relationships have led to the discovery of potentially important anomalies.[191][192]

In July 2021, CRISPR gene editing of hiPSCs was used to study the role of MBNL proteins associated with DM1.[193]

CRISPR has shown promise against a range of disorders, including those of the nervous system, the circulatory system and stem cells, as well as blood disorders and muscular degeneration. The tool has enabled advances in both therapeutic and biomedical settings; some of the applications are discussed below.

1.1 β-Hemoglobinopathies

β-Hemoglobinopathies are genetic disorders caused by mutations that alter the structure of hemoglobin or substitute amino acids in the globin chains. The affected red blood cells (RBCs) cause a string of complications, such as heart failure, obstruction of blood vessels, growth defects and eye problems.[194] To study gene therapy for β-hemoglobinopathies, patients' multipotent cells were edited ex vivo and transferred into a mouse model, which resulted in mRNA expression and correction of the gene. RBC half-life also increased.

1.2 Hemophilia

Hemophilia is a disorder of the blood in which clotting factors do not work properly. There are two types, hemophilia A and hemophilia B. In CRISPR-Cas9 approaches, a vector is inserted into bacteria;[195] the vector used is an adenoviral vector, which helps correct the affected genes. CRISPR has thereby raised hopes of treating hemophilia by correcting the underlying genes.

1.3 Neurological disorders

In neurological disorders, CRISPR gene editing is used to suppress mutations that cause a gain of function and to repair mutations that cause a loss of function.[196] The gene editing tool has also gained a foothold in in vivo applications for investigating molecular pathways.

1.4 Blindness

Eye disorders have been difficult for doctors to treat. The retinal tissue of the eye is, moreover, shielded from the body's immune response. The most common eye diseases worldwide are cataract and retinitis pigmentosa (RP). These are caused by a missense mutation in the alpha chain that leads to permanent blindness. The CRISPR approach is to target the gene coding for the retinal protein and edit the genome in order to restore vision.

1.5 Cardiovascular diseases

CRISPR technology has also been applied to diseases of the heart. Deposition of cholesterol in the walls of the arteries leads to blockage of blood flow. This is caused by a mutation in the low-density lipoprotein cholesterol (LDLC) receptor, which results in higher levels of cholesterol being released into the blood.[197] It can be treated by deleting a base pair in exon 4 of the LDLC receptor gene. This is a nonsense mutation.

Applications of CRISPR in agriculture

CRISPR was first successfully applied in plants in 2013. CRISPR-Cas9 has since become an influential tool for editing crop genomes and has made its mark on present breeding systems.[198]

2.1 Boosting the yield

To increase yield in cereals, the balance of cytokinin is changed. Cytokinin oxidase/dehydrogenase (CKX) is an enzyme that degrades cytokinin,[199] so the gene that codes for this enzyme was knocked out to increase yield.

2.2 Enhancing quality

Grains contain a high amount of the polysaccharide amylose. To decrease the amylose content, CRISPR is used to alter amino acids, which leads to lower production of the saccharide. Wheat also contains gluten proteins, to which some people are intolerant; gluten can cause a disorder called coeliac disease.[200] The gene editing tool targets the gluten genes, which results in low gluten production in wheat.

2.3 Resistance to disease

Biotic stress in plants can be reduced by using CRISPR tools. Bacterial infection of rice activates the transcription of genes[201] whose products make the plant susceptible to disease; by using CRISPR, scientists were able to generate resistant lines.

General applications of CRISPR

3.1 Gene Therapy

About 6,000 genetic disorders have been discovered to date, and most of them still have no treatment. The role of gene therapy is to substitute exogenous DNA for defective genes and to edit the mutated sequence.[202] This therapy has had a large impact on medical biotechnology.

3.2 Base editing

There are two types of base editor:

Cytidine base editor, a novel approach in which cytidine (C) is changed to thymidine (T).

Adenine base editor (ABE),[203] in which adenine (A) is changed to guanine (G).

The mutations are installed directly in cellular DNA, so a donor template is not required. Base editors can only edit point mutations, and can correct only up to four point mutations.[204] To address this limitation, a technique known as Cas9 fusion has been introduced to extend the scope of genome editing.
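
The two conversions can be illustrated on a plain sequence string. This is a toy sketch only: the label `CBE` for the cytidine editor, the function name `base_edit` and the notion of a fixed editing window given as string indices are assumptions made for illustration, and the sketch ignores guide RNA targeting and strand choice.

```python
# Conversion performed by each editor type: (base read, base written).
BASE_EDITS = {"CBE": ("C", "T"), "ABE": ("A", "G")}

def base_edit(seq, editor, window):
    """Convert every target base inside window (start, end), 0-based, end exclusive."""
    source, product = BASE_EDITS[editor]
    start, end = window
    edited = seq[start:end].replace(source, product)
    return seq[:start] + edited + seq[end:]

print(base_edit("GACCTAGC", "CBE", (2, 5)))  # GATTTAGC
print(base_edit("GACCTAGC", "ABE", (0, 3)))  # GGCCTAGC
```

Note that every matching base inside the window is converted, not only the intended one, and that no donor sequence appears anywhere in the function.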

3.3 Gene silencing and activating

Furthermore, the CRISPR-Cas9 protein can modulate genes of interest by either activating or silencing them.[205] A catalytically inactive ("dead") form of the Cas9 endonuclease, called dCas9, is used to silence or activate the expression of genes.

Limitations in Applications of CRISPR

Researchers face many challenges in gene editing.[206] The major hurdles to clinical application are ethical issues and delivery to the target site. Because the components of the CRISPR system are taken from bacteria, they provoke an immune response when transferred into host cells. Physical, chemical and viral vectors are used as vehicles to deliver the complex into the host.[citation needed] These can give rise to complications such as cell damage that leads to cell death. In the case of viral vectors, the capacity of the virus is small and the Cas9 protein is large. To overcome this, new methods were developed that use smaller Cas9 variants taken from bacteria. A great deal of work is still needed to improve the system.

CRISPR as diagnostic tool

Schematic flowchart of molecular detection methods for COVID-19 virus[207]

CRISPR-associated nucleases have been shown to be useful as a tool for molecular testing due to their ability to specifically target nucleic acid sequences in a high background of non-target sequences.

plant pathogens by molecular typing of the pathogen's CRISPRs can be used in agriculture as demonstrated by Shen et al., 2020.[211]: 553

By coupling CRISPR-based diagnostics to additional enzymatic processes, the detection of molecules beyond nucleic acids is possible. One example of a coupled technology is SHERLOCK-based Profiling of IN vitro Transcription (SPRINT). SPRINT can be used to detect a variety of substances, such as metabolites in patient samples or contaminants in environmental samples, with high throughput or with portable point-of-care devices.[76] CRISPR-Cas platforms are also being explored for detection[71][72][207][212][213] and inactivation of SARS-CoV-2, the virus that causes COVID-19.[214] Two comprehensive diagnostic tests, AIOD-CRISPR and SHERLOCK, have been identified for SARS-CoV-2.[215] The SHERLOCK test is based on a fluorescently labelled reporter RNA and can detect 10 copies per microliter.[216] AIOD-CRISPR allows robust and highly sensitive visual detection of the viral nucleic acid.[217]

Notes

  1. ^ Subtype I-G was previously known as subtype I-U.[84]
  2. ^ Subtype V-K was previously known as subtype V-U5.[92]

References

  1. PMID 25123481.
  2. ^ .
  3. .
  4. ^ .
  5. ^ .
  6. .
  7. .
  8. .
  9. .
  10. ^ CRISPR-CAS9, TALENS and ZFNS – the battle in gene editing https://www.ptglab.com/news/blog/crispr-cas9-talens-and-zfns-the-battle-in-gene-editing/
  11. ^ PMID 24906146.
  12. ^ "Press release: The Nobel Prize in Chemistry 2020". Nobel Foundation. Retrieved 7 October 2020.
  13. ^ Wu KJ, Peltier E (7 October 2020). "Nobel Prize in Chemistry Awarded to 2 Scientists for Work on Genome Editing – Emmanuelle Charpentier and Jennifer A. Doudna developed the Crispr tool, which can alter the DNA of animals, plants and microorganisms with high precision". The New York Times. Retrieved 7 October 2020.
  14. S2CID 234597200.
  15. ^ .
  16. .
  17. .
  18. ^ .
  19. ^ .
  20. .
  21. .
  22. .
  23. .
  24. .
  25. .
  26. ^ Molteni M, Huckins G (1 August 2020). "The WIRED Guide to Crispr". Condé Nast. Wired Magazine.
  27. S2CID 23196085.
  28. ^ .
  29. ^ .
  30. .
  31. ^ .
  32. ^ .
  33. ^ .
  34. .
  35. .
  36. .
  37. .
  38. ^ .
  39. .
  40. ^ .
  41. ^ .
  42. .
  43. ^ .
  44. .
  45. .
  46. .
  47. .
  48. .
  49. .
  50. .
  51. .
  52. .
  53. .
  54. .
  55. .
  56. .
  57. .
  58. .
  59. .
  60. .
  61. .
  62. .
  63. .
  64. .
  65. .
  66. .
  67. .
  68. .
  69. ^ "Cpf1 Nuclease". abmgood.com. Retrieved 2017-12-14.
  70. PMID 29449511.
  71. ^ .
  72. ^ .
  73. .
  74. ^ .
  75. .
  76. ^ .
  77. .
  78. .
  79. ^ .
  80. ^ .
  81. ^ .
  82. ^ .
  83. .
  84. ^ .
  85. ^ .
  86. ^ .
  87. .
  88. .
  89. .
  90. ^ .
  91. ^ .
  92. ^ .
  93. ^ .
  94. ^ .
  95. .
  96. .
  97. .
  98. .
  99. .
  100. .
  101. .
  102. ^ .
  103. .
  104. ^ .
  105. .
  106. .
  107. ^ .
  108. ^ .
  109. .
  110. .
  111. .
  112. .
  113. .
  114. .
  115. ^ .
  116. .
  117. .
  118. .
  119. .
  120. .
  121. .
  122. .
  123. ^ .
  124. ^ .
  125. .
  126. ^ .
  127. ^ .
  128. ^ .
  129. ^ .
  130. ^ .
  131. .
  132. ^ .
  133. ^ .
  134. ^ .
  135. .
  136. .
  137. .
  138. ^ .
  139. .
  140. .
  141. .
  142. .
  143. .
  144. .
  145. .
  146. .
  147. .
  148. ^ .
  149. .
  150. .
  151. ^ .
  152. .
  153. .
  154. .
  155. ^ .
  156. .
  157. .
  158. .
  159. .
  160. ^ .
  161. .
  162. .
  163. .
  164. ^ .
  165. ^ .
  166. .
  167. . Table 1: Web resources for CRISPR analysis
  168. .
  169. .
  170. .
  171. ^ .
  172. .
  173. .
  174. .
  175. .
  176. .
  177. .
  178. .
  179. ^ "What is CRISPR and How does it work?". Livescience.Tech. 30 April 2018. Archived from the original on 2020-01-24. Retrieved 2019-12-14.
  180. PMID 34681400.
  181. .
  182. .
  183. ^ .
  184. .
  185. .
  186. .
  187. ^ "In A 1st, Doctors In U.S. Use CRISPR Tool To Treat Patient With Genetic Disorder". NPR.org. Retrieved 2019-07-31.
  188. ^ National Institute on Drug Abuse (2020-02-14). "Antiretroviral Therapy Combined With CRISPR Gene Editing Can Eliminate HIV Infection in Mice". National Institute on Drug Abuse. Retrieved 2020-11-15.
  189. ^ "In A 1st, Scientists Use Revolutionary Gene-Editing Tool To Edit Inside A Patient". NPR.org.
  190. ^ The-Crispr (2019-07-15). "Listen Radiolab CRISPR podcast". The Crispr. Archived from the original on 2019-07-15. Retrieved 2019-07-15.
  191. S2CID 90757972.
  192. .
  193. .
  194. doi:10.29011/2576-9588.100111 (inactive 2024-03-27).
  195. .
  196. .
  197. .
  198. .
  199. .
  200. .
  201. .
  202. , retrieved 2023-12-10
  203. .
  204. .
  205. .
  206. .
  207. ^ .
  208. .
  209. .
  210. .
  211. ^ .
  212. .
  213. .
  214. .
  215. .
  216. .
  217. .

External links

Protein Data Bank

  • Overview of all the structural information available in the PDB for UniProt: Q46901 (CRISPR system Cascade subunit CasA) at the PDBe-KB.
  • Overview of all the structural information available in the PDB for UniProt: P76632 (CRISPR system Cascade subunit CasB) at the PDBe-KB.
  • Overview of all the structural information available in the PDB for UniProt: Q46899 (CRISPR system Cascade subunit CasC) at the PDBe-KB.
  • Overview of all the structural information available in the PDB for UniProt: Q46898 (CRISPR system Cascade subunit CasD) at the PDBe-KB.
  • Overview of all the structural information available in the PDB for UniProt: Q46897 (CRISPR system Cascade subunit CasE) at the PDBe-KB.
This page is based on the copyrighted Wikipedia article: CRISPR. Articles are available under the CC BY-SA 3.0 license; additional terms may apply.