PROGRAMMABLE NUCLEASES AND METHODS OF USE

Information

  • Patent Application
  • Publication Number
    20230167454
  • Date Filed
    August 11, 2022
  • Date Published
    June 01, 2023
Abstract
Provided herein, in certain embodiments, are programmable nucleases, guide nucleic acids, and complexes thereof. Certain programmable nucleases provided herein comprise a RuvC domain. Also provided herein are nucleic acids encoding said programmable nucleases and guide nucleic acids. Also provided herein are methods of genome editing, methods of regulating gene expression, and methods of detecting nucleic acids with said programmable nucleases and guide nucleic acids.
Description
SEQUENCE LISTING

This application incorporates by reference a Sequence Listing XML submitted via the USPTO patent electronic filing system. The Sequence Listing XML, entitled 203477-734301US_Sequence_Listing.xml, was created on Aug. 1, 2022, and is 3,349,159 bytes in size.


BACKGROUND

Certain programmable nucleases can be used for genome editing of nucleic acid sequences or detection of nucleic acid sequences. There is a need for high-efficiency programmable nucleases that are capable of working under various sample conditions and can be used for both genome editing and diagnostics.


SUMMARY

In various aspects, the present disclosure provides a composition comprising: a) a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and b) a guide nucleic acid or a nucleic acid encoding said guide nucleic acid, wherein said guide nucleic acid comprises a region comprising a nucleotide sequence that is complementary to a target nucleic acid sequence and an additional region, wherein said region and said additional region are heterologous to each other.


In some aspects, the additional region of the guide nucleic acid comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the guide nucleic acid comprises a sequence comprising at least 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the guide nucleic acid comprises a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the programmable CasΦ nuclease comprises nickase activity. In some aspects, the programmable CasΦ nuclease comprises double-strand cleavage activity. In some aspects, the programmable CasΦ nuclease comprises at least 90% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107.


In some aspects, the programmable CasΦ nuclease comprises at least 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the programmable CasΦ nuclease comprises at least 98% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the programmable CasΦ nuclease comprises a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the guide nucleic acid does not comprise a tracrRNA. In some aspects, the programmable CasΦ nuclease does not require a tracrRNA. In some aspects, the programmable CasΦ nuclease comprises greater nickase activity when complexed with the guide nucleic acid at a temperature from about 20° C. to about 25° C., as compared with complex formation at a temperature of about 37° C. In some aspects, the guide nucleic acid comprises at least 98% sequence identity to SEQ ID NO: 54. In some aspects, the guide nucleic acid comprises at least 98% sequence identity to SEQ ID NO: 57. In some aspects, the programmable CasΦ nuclease comprises greater nickase activity when complexed with the guide nucleic acid comprising a sequence comprising at least 98% sequence identity to SEQ ID NO: 57, as compared to when complexed with a guide nucleic acid comprising SEQ ID NO: 49.
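The identity tiers recited above (at least 85%, 90%, 95%, or 98%) can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned and padded with `-` gap characters, and it assumes identity is counted as identical columns divided by all alignment columns; this section does not specify the calculation, so that convention is an assumption. The function names and the toy peptide strings are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment (gaps written as '-').

    Assumption: identical, non-gap columns divided by all alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_tier(identity: float, tiers=(98, 95, 90, 85)):
    """Return the highest recited threshold that the identity meets, or None."""
    for tier in tiers:
        if identity >= tier:
            return tier
    return None


# Toy example: one substitution in ten aligned residues.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(identity, highest_tier(identity))
```

In practice the alignment itself would come from a dedicated alignment tool, and the choice of tool and denominator can shift the reported percentage.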


In some aspects, the programmable CasΦ nuclease exhibits greater nicking activity as compared to double stranded cleavage activity. In some aspects, the programmable CasΦ nuclease exhibits greater double stranded cleavage activity as compared to nicking activity. In some aspects, the programmable CasΦ nuclease comprises a single active site in a RuvC domain that is capable of catalyzing pre-crRNA processing and nicking or cleaving of nucleic acids. In some aspects, the programmable CasΦ nuclease recognizes a protospacer adjacent motif (PAM) of 5′-TBN-3′, wherein B is one or more of C, G, or T. In some aspects, the programmable CasΦ nuclease recognizes a protospacer adjacent motif (PAM) of 5′-TTTN-3′.


In various aspects, the present disclosure provides a method of modifying a target nucleic acid sequence, the method comprising: contacting a target nucleic acid sequence with a programmable CasΦ nuclease comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and a guide nucleic acid, wherein the programmable CasΦ nuclease cleaves the target nucleic acid sequence, thereby modifying the target nucleic acid sequence.


In some aspects, the programmable CasΦ nuclease introduces a double-stranded break in the target nucleic acid sequence. In some aspects, the programmable CasΦ nuclease comprises double-strand cleavage activity. In some aspects, the programmable CasΦ nuclease cleaves a single-strand of the target nucleic acid sequence. In some aspects, the programmable CasΦ nuclease comprises nickase activity. In some aspects, the programmable CasΦ nuclease exhibits greater nicking activity as compared to double stranded cleavage activity. In some aspects, the programmable CasΦ nuclease exhibits greater double stranded cleavage activity as compared to nicking activity. In some aspects, the target nucleic acid is DNA. In some aspects, the target nucleic acid is double-stranded DNA. In some aspects, the programmable CasΦ nuclease cleaves a non-target strand of the double-stranded DNA, wherein the non-target strand is non-complementary to the guide nucleic acid. In some aspects, the programmable CasΦ nuclease does not cleave a target strand of the double-stranded DNA, wherein the target strand is complementary to the guide nucleic acid.


In some aspects, the programmable CasΦ nuclease comprises at least 90% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the programmable CasΦ nuclease comprises at least 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the programmable CasΦ nuclease comprises at least 98% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the programmable CasΦ nuclease comprises a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the guide nucleic acid comprises a sequence comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the guide nucleic acid comprises a sequence comprising at least 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the guide nucleic acid comprises a sequence selected from the group consisting of SEQ ID NOs: 48 to 86.


In some aspects, the guide nucleic acid does not comprise a tracrRNA. In some aspects, the target nucleic acid sequence comprises a mutated sequence or a sequence associated with a disease. In some aspects, the mutated sequence is removed after the programmable CasΦ nuclease cleaves the target nucleic acid sequence. In some aspects, the target nucleic acid sequence is in a human cell. In some aspects, the method is performed in vivo. In some aspects, the method is performed ex vivo. In some aspects, the method further comprises inserting a donor polynucleotide into the target nucleic acid sequence at the site of cleavage.


In various aspects, the present disclosure provides a method of introducing a break in a target nucleic acid, the method comprising: contacting the target nucleic acid with: (a) a first guide nucleic acid comprising a region that binds to a first programmable nickase comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107; and (b) a second guide nucleic acid comprising a region that binds to a second programmable nickase comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, wherein the first guide nucleic acid comprises a first additional region that binds to the target nucleic acid and wherein the second guide nucleic acid comprises a second additional region that binds to the target nucleic acid and wherein the first additional region of the first guide nucleic acid and the second additional region of the second guide nucleic acid bind opposing strands of the target nucleic acid. In some aspects, the first programmable nickase, the second programmable nickase, or both comprise at least 90% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107.


In some aspects, the first programmable nickase, the second programmable nickase, or both comprise at least 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the first programmable nickase, the second programmable nickase, or both comprise a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the first guide nucleic acid, the second guide nucleic acid, or both comprise a sequence comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the first guide nucleic acid, the second guide nucleic acid, or both comprise a sequence comprising at least 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the first guide nucleic acid, the second guide nucleic acid, or both comprise a sequence selected from the group consisting of SEQ ID NOs: 48 to 86.


In some aspects, the first programmable nickase and the second programmable nickase exhibit greater nicking activity as compared to double stranded cleavage activity. In some aspects, the first programmable nickase and the second programmable nickase nick the target nucleic acid at two different sites. In some aspects, the target nucleic acid comprises double stranded DNA. In some aspects, the two different sites are on opposing strands of the double stranded DNA. In some aspects, the target nucleic acid comprises a mutated sequence or a sequence associated with a disease. In some aspects, the mutated sequence is removed after the first programmable nickase and the second programmable nickase nick the target nucleic acid. In some aspects, the target nucleic acid is in a cell. In some aspects, the method is performed in vivo. In some aspects, the method is performed ex vivo. In some aspects, the first programmable nickase and the second programmable nickase are the same. In some aspects, the first programmable nickase and the second programmable nickase are different.
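The requirement that the two guide nucleic acids bind opposing strands of the target can be illustrated with a short Python sketch. It assumes spacers are written as DNA-alphabet strings, that a spacer binds the strand carrying its reverse complement, and that exact matching is enough for illustration. The target sequence, the function names, and the spacer choices are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def bound_strand(top: str, spacer: str):
    """Return 'top' or 'bottom' for the strand the spacer hybridizes to, else None.

    `top` is the top strand written 5' to 3'. A spacer hybridizes to the
    strand that contains its reverse complement.
    """
    top, spacer = top.upper(), spacer.upper()
    if revcomp(spacer) in top:
        return "top"
    if spacer in top:  # the reverse complement then lies on the bottom strand
        return "bottom"
    return None


def binds_opposing_strands(top: str, spacer_1: str, spacer_2: str) -> bool:
    """True when both spacers find a site and the sites are on different strands."""
    strands = {bound_strand(top, spacer_1), bound_strand(top, spacer_2)}
    return strands == {"top", "bottom"}


# Hypothetical 30-nt target and two spacers chosen from it.
target = "GATTACAGGCTTCAAGTGCCTTAGACCGTA"
spacer_a = target[2:12]            # same sequence as top strand: binds bottom
spacer_b = revcomp(target[16:26])  # complementary to top strand: binds top
print(binds_opposing_strands(target, spacer_a, spacer_b))
```

A real design step would also account for PAM placement and the spacing between the two nick sites, which this sketch leaves out.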


In various aspects, the present disclosure provides a method of detecting a target nucleic acid in a sample, the method comprising contacting a sample comprising a target nucleic acid with (a) a programmable CasΦ nuclease comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107; (b) a guide RNA comprising a region that binds to the programmable CasΦ nuclease and an additional region that binds to the target nucleic acid; and (c) a labeled single stranded DNA reporter that does not bind the guide RNA; cleaving the labeled single stranded DNA reporter by the programmable CasΦ nuclease to release a detectable label; and detecting the target nucleic acid by measuring a signal from the detectable label.


In some aspects, the target nucleic acid is single stranded DNA. In some aspects, the target nucleic acid is double stranded DNA. In some aspects, the target nucleic acid is a viral nucleic acid. In some aspects, the target nucleic acid is bacterial nucleic acid. In some aspects, the target nucleic acid is from a human cell. In some aspects, the target nucleic acid is a fetal nucleic acid. In some aspects, the sample is derived from a subject's saliva, blood, serum, plasma, urine, aspirate, or biopsy sample. In some aspects, the programmable CasΦ nuclease comprises at least 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the programmable CasΦ nuclease comprises a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107.


In some aspects, the guide RNA comprises at least about 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the guide RNA comprises a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the sample comprises a phosphate buffer, a Tris buffer, or a HEPES buffer. In some aspects, the sample comprises a pH of 7 to 9. In some aspects, the sample comprises a pH of 7.5 to 8. In some aspects, the sample comprises a salt concentration of 25 nM to 200 mM. In some aspects, the single stranded DNA reporter comprises an ssDNA-fluorescence quenching DNA reporter. In some aspects, the ssDNA-fluorescence quenching DNA reporter is a universal ssDNA-fluorescence quenching DNA reporter. In some aspects, the programmable CasΦ nuclease exhibits PAM-independent cleaving.


In various aspects, the present disclosure provides a method of modulating transcription of a gene in a cell, the method comprising: introducing into a cell comprising a target nucleic acid sequence: (i) a fusion polypeptide or a nucleic acid encoding the fusion polypeptide, wherein the fusion polypeptide comprises: (a) a dCasΦ polypeptide comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, wherein the dCasΦ polypeptide is enzymatically inactive; and (b) a polypeptide comprising transcriptional regulation activity; and (ii) a guide nucleic acid, or a nucleic acid comprising a nucleotide sequence encoding the guide nucleic acid, wherein the guide nucleic acid comprises a region that binds to the dCasΦ polypeptide and an additional region that binds to the target nucleic acid; wherein transcription of the gene is modulated through the fusion polypeptide acting on the target nucleic acid sequence.


In some aspects, the dCasΦ polypeptide comprises at least 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some aspects, the guide nucleic acid comprises at least about 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the guide nucleic acid comprises a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some aspects, the polypeptide comprising transcriptional regulation activity comprises transcription activation activity.


In some aspects, the polypeptide comprising transcriptional regulation activity comprises transcription repressor activity. In some aspects, the polypeptide comprising transcriptional regulation activity comprises an activity selected from the group consisting of transcription activation activity, transcription repression activity, nuclease activity, transcription release factor activity, histone modification activity, histone acetyltransferase activity, nucleic acid association activity, DNA methylase activity, direct or indirect DNA demethylase activity, methyltransferase activity, demethylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, deaminase activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, and demyristoylation activity.


In various aspects, the present disclosure provides a composition comprising: a) a Cas nuclease or nucleic acid encoding said Cas nuclease, and b) a guide nucleic acid or a nucleic acid encoding said guide nucleic acid, wherein said guide nucleic acid comprises a region comprising a nucleotide sequence that is complementary to a target nucleic acid sequence and an additional region, wherein said region and said additional region are heterologous to each other; wherein the Cas nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving a target nucleic acid. In some aspects, the same active site in the RuvC domain catalyzes the processing of the pre-crRNA and the cleaving of the target nucleic acid. In some aspects, the Cas nuclease is the programmable CasΦ nuclease as disclosed herein. In some aspects, the Cas nuclease recognizes a protospacer adjacent motif (PAM) of 5′-TBN-3′, wherein B is one or more of C, G, or T. In some aspects, the Cas nuclease recognizes a protospacer adjacent motif (PAM) of 5′-TTTN-3′. In some aspects, the Cas nuclease recognizes a protospacer adjacent motif (PAM) of 5′-TTN-3′. In some aspects, the Cas nuclease recognizes a protospacer adjacent motif (PAM) of 5′-GTTB-3′, wherein B is C, G, or T. In some aspects, the Cas nuclease recognizes a protospacer adjacent motif (PAM) of 5′-GTTK-3′, 5′-VTTK-3′, 5′-VTTS-3′, 5′-TTTS-3′ or 5′-VTTN-3′, where K is G or T, V is A, C or G, and S is C or G. In some aspects, the composition is used in any of the above methods.
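The degenerate PAM motifs recited above use standard IUPAC nucleotide codes (B = C, G, or T; K = G or T; V = A, C, or G; S = C or G; N = any base). A minimal Python sketch of how a candidate site can be tested against such a motif follows. The function names are hypothetical, and the sketch checks only the motif letters, not where the PAM sits relative to the target sequence.

```python
from itertools import product

# IUPAC codes used by the PAM motifs in this section.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "K": "GT", "V": "ACG", "S": "CG", "N": "ACGT",
}


def matches_pam(site: str, motif: str) -> bool:
    """True when each base of `site` is allowed by the code at that motif position."""
    site, motif = site.upper(), motif.upper()
    return len(site) == len(motif) and all(
        base in IUPAC[code] for base, code in zip(site, motif)
    )


def expand_pam(motif: str):
    """List every concrete sequence covered by a degenerate motif."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in motif.upper()))]


print(matches_pam("TCA", "TBN"))   # B allows C
print(matches_pam("TAA", "TBN"))   # B excludes A
print(len(expand_pam("TBN")))      # 3 choices for B times 4 for N
```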


In various aspects, the present disclosure provides the use of a programmable CasΦ nuclease to modify a target nucleic acid sequence according to any one of the above methods. In various aspects, the present disclosure provides the use of a first programmable nickase and a second programmable nickase to introduce a break in a target nucleic acid according to any one of the above methods. In various aspects, the present disclosure provides the use of a programmable CasΦ nuclease to detect a target nucleic acid in a sample according to any one of the above methods. In various aspects, the present disclosure provides the use of a dCasΦ polypeptide to modulate transcription of a gene in a cell according to any one of the above methods. In some aspects, the region is a spacer region and the additional region is a repeat region. In some aspects, the region is a repeat region and the additional region is a spacer region. In some aspects, the repeat region comprises a GAC sequence, optionally wherein the GAC sequence is at the 3′ end of the repeat region. In some aspects, the repeat region comprises a hairpin, optionally wherein the hairpin is in the 3′ portion of the repeat region. In some aspects, the hairpin comprises a double-stranded stem portion and a single-stranded loop portion. In some aspects, a strand of the stem portion comprises a CYC sequence and the other strand of the stem portion comprises a GRG sequence, wherein Y and R are complementary. In some aspects, the G of the GAC sequence is in the stem portion of the hairpin. In some aspects, each strand of the stem portion comprises 3, 4 or 5 nucleotides. In some aspects, the loop portion comprises between 2 and 8 nucleotides, optionally wherein the loop portion comprises 4 nucleotides. In some aspects, the guide nucleic acid comprises at least 98% sequence identity to SEQ ID NO: 54.
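The repeat-region features described above (a GAC at the 3′ end, a hairpin whose stem includes the G of that GAC, stem strands of 3 to 5 nucleotides, a loop of 2 to 8 nucleotides, and CYC/GRG stem sequences) can be checked with a short Python sketch. It assumes Watson-Crick pairing only, with no G-U wobble pairs, and it assumes the 3′ stem strand ends exactly at the G of the terminal GAC. The toy repeat sequence is hypothetical and is not one of SEQ ID NOs: 48 to 86.

```python
import re

# Watson-Crick RNA pairs only; wobble pairs are ignored in this sketch.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def three_prime_hairpin(repeat: str, stem_lengths=(3, 4, 5), loop_lengths=range(2, 9)):
    """Return (stem_5p, loop, stem_3p) for a 3' hairpin ending at the G of GAC, or None."""
    repeat = repeat.upper().replace("T", "U")
    if not repeat.endswith("GAC"):
        return None
    stem3_end = len(repeat) - 2  # slice end that keeps the G of GAC in the stem
    for stem in stem_lengths:
        for loop in loop_lengths:
            start = stem3_end - (2 * stem + loop)
            if start < 0:
                continue
            stem_5p = repeat[start:start + stem]
            loop_seq = repeat[start + stem:start + stem + loop]
            stem_3p = repeat[start + stem + loop:stem3_end]
            # The strands pair antiparallel, so compare against the reversed 3' strand.
            if all((a, b) in PAIRS for a, b in zip(stem_5p, reversed(stem_3p))):
                return stem_5p, loop_seq, stem_3p
    return None


def has_cyc_grg(strand_a: str, strand_b: str) -> bool:
    """True when one strand holds CYC and the other holds GRG with Y and R complementary."""
    partner = {"C": "G", "U": "A"}
    for x, y in ((strand_a, strand_b), (strand_b, strand_a)):
        for m in re.finditer(r"(?=C([CU])C)", x):
            if f"G{partner[m.group(1)]}G" in y:
                return True
    return False


hairpin = three_prime_hairpin("AUUGCUCUUCGGAGAC")  # hypothetical repeat
print(hairpin, has_cyc_grg(hairpin[0], hairpin[2]))
```

The search returns the first stem and loop combination that pairs perfectly; a structure-prediction tool would be the appropriate choice for real sequences.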


In some aspects, the repeat region is between 15 and 50 nucleotides in length, preferably wherein the repeat region is between 19 and 37 nucleotides in length. In some aspects, the spacer region is between 15 and 50 nucleotides in length, between 15 and 40 nucleotides in length, or between 15 and 35 nucleotides in length, preferably wherein the spacer region is between 16 and 30 nucleotides in length. In some aspects, the spacer region is between 16 and 20 nucleotides in length. In some aspects, the programmable CasΦ nuclease forms a complex with a divalent metal ion, preferably wherein the divalent metal ion is Mg²⁺.
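The length ranges above can be expressed as a small Python check. The sketch treats each "between X and Y" range as inclusive of both endpoints, which is an assumption, and the function and key names are hypothetical.

```python
# Length ranges recited above, treated as inclusive (an assumption).
REPEAT_RANGES = {"repeat_15_50": (15, 50), "repeat_19_37": (19, 37)}
SPACER_RANGES = {
    "spacer_15_50": (15, 50),
    "spacer_15_40": (15, 40),
    "spacer_15_35": (15, 35),
    "spacer_16_30": (16, 30),
    "spacer_16_20": (16, 20),
}


def check_guide_lengths(repeat: str, spacer: str) -> dict:
    """Report which recited length ranges the repeat and spacer fall within."""
    report = {}
    for name, (low, high) in REPEAT_RANGES.items():
        report[name] = low <= len(repeat) <= high
    for name, (low, high) in SPACER_RANGES.items():
        report[name] = low <= len(spacer) <= high
    return report


# Hypothetical 16-nt repeat and 18-nt spacer.
print(check_guide_lengths("AUUGCUCUUCGGAGAC", "ACGU" * 4 + "AC"))
```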


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the programmable CasΦ nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises a RuvC-like domain which matches PFAM family PF07282 and does not match PFAM family PF18516, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and wherein a) the programmable CasΦ nuclease comprises a RuvC-like domain which matches PFAM family PF07282 and does not match PFAM family PF18516; b) the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; c) a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; d) the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; and e) the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In some aspects, the same active site in the RuvC domain or RuvC-like domain catalyzes the processing of the pre-crRNA and the cleaving of the target nucleic acid. In some aspects, the programmable CasΦ nuclease is fused or linked to one or more nuclear localization signals (NLS). In some aspects, the one or more NLS are fused or linked to the N-terminus of the programmable CasΦ nuclease; the one or more NLS are fused or linked to the C-terminus of the programmable CasΦ nuclease; or the one or more NLS are fused or linked to the N-terminus and the C-terminus of the programmable CasΦ nuclease. In some cases, an aspect comprises the programmable CasΦ nuclease or a nucleic acid described herein and a gRNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease.


In some cases, an aspect comprises the programmable CasΦ nuclease or a nucleic acid described herein and a cell, preferably wherein the cell is a eukaryotic cell. In some cases, an aspect comprises the programmable CasΦ nuclease or a nucleic acid described herein and a gRNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease and a cell, preferably wherein the cell is a eukaryotic cell. In some cases, an aspect comprises a eukaryotic cell comprising the programmable CasΦ nuclease or a nucleic acid described herein.


In some aspects, the cell further comprises a gRNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, preferably wherein the cell is a eukaryotic cell.


In some cases, an aspect comprises a vector comprising a nucleic acid described herein. In some aspects, the vector is a viral vector.


In some aspects, the programmable CasΦ nuclease recognizes a protospacer adjacent motif (PAM) of 5′-TTN-3′. In some aspects, the programmable CasΦ nuclease recognizes a protospacer adjacent motif (PAM) of 5′-GTTB-3′, wherein B is C, G, or T. In some aspects, the Cas nuclease recognizes a protospacer adjacent motif (PAM) of 5′-TTN-3′, optionally wherein the PAM is 5′-TTN-3′. In some aspects, the Cas nuclease recognizes a protospacer adjacent motif (PAM) of 5′-GTTK-3′, 5′-VTTK-3′, 5′-VTTS-3′, 5′-TTTS-3′ or 5′-VTTN-3′, where K is G or T, V is A, C or G, and S is C or G. In some aspects, the Cas nuclease recognizes a protospacer adjacent motif (PAM) of 5′-GTTB-3′, wherein B is C, G, or T.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the programmable CasΦ nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of the target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.
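The phrase "staggered cut with a 5′ overhang" can be made concrete with a short Python sketch that classifies a double-strand break from the two nick positions. Both positions are given in top-strand coordinates, as the number of bases to the left of the nick with the top strand written 5′ to 3′. The function names and example positions are hypothetical; this section does not state where CasΦ cuts relative to the PAM or the spacer.

```python
def overhang_type(top_cut: int, bottom_cut: int) -> str:
    """Classify the ends left by one nick on each strand.

    When the top strand is cut to the left of the bottom-strand cut, the
    protruding single strands end in 5' termini, giving 5' overhangs.
    """
    if top_cut < bottom_cut:
        return "5' overhang"
    if top_cut > bottom_cut:
        return "3' overhang"
    return "blunt"


def single_stranded_region(top: str, top_cut: int, bottom_cut: int) -> str:
    """Top-strand sequence lying between the two nicks (empty for a blunt cut)."""
    low, high = sorted((top_cut, bottom_cut))
    return top[low:high]


# Hypothetical positions on a 6-bp duplex: top nick after base 1, bottom nick after base 5.
print(overhang_type(1, 5), single_stranded_region("GAATTC", 1, 5))
```

The length of the overhang is the distance between the two nick positions, which is four nucleotides in the example.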


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises a RuvC-like domain which matches PFAM family PF07282 and does not match PFAM family PF18516, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of the target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the programmable CasΦ nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of the target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the programmable CasΦ nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises a RuvC-like domain which matches PFAM family PF07282 and does not match PFAM family PF18516, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the programmable CasΦ nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of a target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises a RuvC-like domain which matches PFAM family PF07282 and does not match PFAM family PF18516, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of a target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of a target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the programmable CasΦ nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises a RuvC-like domain which matches PFAM family PF07282 and does not match PFAM family PF18516, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the programmable CasΦ nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of the target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises a RuvC-like domain which matches PFAM family PF07282 and does not match PFAM family PF18516, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of the target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of the target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the programmable CasΦ nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises a RuvC-like domain which matches PFAM family PF07282 and does not match PFAM family PF18516, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47, SEQ ID NO. 105, and SEQ ID NO. 107, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the programmable CasΦ nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of a target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises a RuvC-like domain which matches PFAM family PF07282 and does not match PFAM family PF18516, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of a target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable CasΦ nuclease cleaves both strands of a target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; the programmable CasΦ nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable CasΦ nuclease does not require a tracrRNA to cleave the target nucleic acid. In some aspects, the same active site in the RuvC domain or RuvC-like domain catalyzes the processing of the pre-crRNA and the cleaving of the target nucleic acid.


In some aspects, the programmable CasΦ nuclease is fused or linked to one or more NLS. In some aspects, the one or more NLS are fused or linked to the N-terminus of the programmable CasΦ nuclease; the one or more NLS are fused or linked to the C-terminus of the programmable CasΦ nuclease; or the one or more NLS are fused or linked to the N-terminus and the C-terminus of the programmable CasΦ nuclease.


In some cases, an aspect comprises the programmable CasΦ nuclease or a nucleic acid described herein and a gRNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease. In some aspects, the first region comprises a seed region comprising between 10 and 16 nucleosides. In some aspects, the seed region comprises 16 nucleosides. In some cases, an aspect comprises the programmable CasΦ nuclease or a nucleic acid described herein and a cell, preferably wherein the cell is a eukaryotic cell.


In some cases, an aspect comprises the programmable CasΦ nuclease or a nucleic acid described herein and a gRNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease and a cell, preferably wherein the cell is a eukaryotic cell. In some aspects, the first region comprises a seed region comprising between 10 and 16 nucleosides. In some aspects, the seed region comprises 16 nucleosides.


In some aspects, a eukaryotic cell comprises the programmable CasΦ nuclease or a nucleic acid described herein. In some aspects, the cell further comprises a gRNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease. In some aspects, the first region comprises a seed region comprising between 10 and 16 nucleosides. In some aspects, the seed region comprises 16 nucleosides. In some aspects, a vector comprises a nucleic acid described herein. In some aspects, the vector is a viral vector.


In various aspects, the present disclosure provides a guide nucleic acid, or a nucleic acid encoding said guide nucleic acid, comprising a sequence that is the same as or differs by no more than 5, 4, 3, 2, or 1 nucleotides from: a sequence from Tables A to AH; or a sequence comprising a repeat sequence from Table 2 and a spacer sequence from Tables A to H. In some aspects, the guide nucleic acid comprises a sequence from Tables A to AH; or a sequence comprising a repeat sequence from Table 2 and a spacer sequence from Tables A to H. In some aspects, the guide nucleic acid comprises RNA and/or DNA. In some aspects, the guide nucleic acid is a guide RNA. Some aspects further comprise a complex comprising the guide nucleic acid and a programmable CasΦ nuclease. Some aspects comprise a eukaryotic cell comprising the guide nucleic acid. In some aspects, the eukaryotic cell further comprises a programmable CasΦ nuclease. Some aspects further comprise a vector encoding the guide nucleic acid. In some aspects, the vector is a viral vector.
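The criterion above, that a sequence "differs by no more than 5, 4, 3, 2, or 1 nucleotides" from a reference, can be illustrated with a position-by-position comparison. The following is a minimal sketch only. It assumes that both sequences have equal length and that differences are counted as substitutions (insertions and deletions would require an alignment step). The function names and the example sequences are hypothetical and are not taken from Tables A to AH or Table 2.

```python
def count_mismatches(candidate: str, reference: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares sequences of equal length")
    return sum(1 for a, b in zip(candidate.upper(), reference.upper()) if a != b)


def within_mismatch_limit(candidate: str, reference: str, limit: int = 5) -> bool:
    """Return True if the candidate differs from the reference at no more than `limit` positions."""
    return count_mismatches(candidate, reference) <= limit


# Hypothetical sequences, for illustration only.
reference = "ACGUACGUACGUACGUACGU"
candidate = "ACGUACGAACGUACGUACCU"

print(count_mismatches(candidate, reference))
print(within_mismatch_limit(candidate, reference, limit=5))
```

The `limit` parameter can be set to 5, 4, 3, 2, or 1 to mirror the alternatives recited above.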


In various aspects, the present disclosure provides a method of introducing a first modification in a first gene and a second modification in a second gene, the method comprising contacting a cell with a CasΦ nuclease; a first guide RNA that is at least partially complementary to an equal length portion of the first gene; and a second guide RNA that is at least partially complementary to an equal length portion of the second gene. In some aspects, the CasΦ nuclease is a CasΦ 12 nuclease. In some aspects, the CasΦ 12 nuclease comprises or consists of an amino acid sequence of SEQ ID NO: 12. In some aspects, the first and/or second modification comprises an insertion of a nucleotide, a deletion of a nucleotide or a combination thereof. In some aspects, the first and/or second modification comprises an epigenetic modification. In some aspects, the first and/or second modification results in a reduction in the expression of the first gene and/or second gene, respectively. In some aspects, the reduction in the expression is at least about a 10% reduction, at least about a 20% reduction, at least about a 30% reduction, at least about a 40% reduction, at least about a 50% reduction, at least about a 60% reduction, at least about a 70% reduction, at least about an 80% reduction, or at least about a 90% reduction. In some aspects, the method comprises contacting the cell with three different guide RNAs targeting three different genes.
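The percentage reductions in expression recited above can be related to measured values with simple arithmetic. The sketch below is illustrative only. It assumes that expression is quantified as a single positive number for an untreated baseline and for the treated cell, it does not model the tolerance implied by "about", and the function names and input values are hypothetical.

```python
# Reduction levels recited in the text, in percent.
THRESHOLDS = (10, 20, 30, 40, 50, 60, 70, 80, 90)


def percent_reduction(baseline: float, treated: float) -> float:
    """Percent reduction in expression relative to a positive baseline."""
    if baseline <= 0:
        raise ValueError("baseline expression must be positive")
    return (baseline - treated) * 100.0 / baseline


def highest_threshold_met(baseline: float, treated: float):
    """Return the largest recited threshold that the reduction reaches, or None."""
    reduction = percent_reduction(baseline, treated)
    met = [t for t in THRESHOLDS if reduction >= t]
    return max(met) if met else None


# Hypothetical measurements, for illustration only.
print(percent_reduction(100.0, 35.0))
print(highest_threshold_met(100.0, 35.0))
```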


In various aspects, the present disclosure provides a programmable CasΦ nuclease or a nucleic acid encoding said programmable CasΦ nuclease, wherein said programmable CasΦ nuclease comprises at least 85% sequence identity to SEQ ID NO: 12. In some aspects, the programmable CasΦ nuclease comprises at least 90% sequence identity to SEQ ID NO: 12. In some aspects, the programmable CasΦ nuclease comprises at least 95% sequence identity to SEQ ID NO: 12. In some aspects, the programmable CasΦ nuclease comprises at least 98% sequence identity to SEQ ID NO: 12. In some aspects, the programmable CasΦ nuclease comprises or consists of an amino acid sequence of SEQ ID NO: 12. In some aspects, the programmable CasΦ nuclease comprises at least 85% sequence identity to SEQ ID NO: 18. In some aspects, the programmable CasΦ nuclease comprises at least 90% sequence identity to SEQ ID NO: 18. In some aspects, the programmable CasΦ nuclease comprises at least 95% sequence identity to SEQ ID NO: 18. In some aspects, the programmable CasΦ nuclease comprises at least 98% sequence identity to SEQ ID NO: 18. In some aspects, the programmable CasΦ nuclease comprises or consists of an amino acid sequence of SEQ ID NO: 18. In some aspects, the programmable CasΦ nuclease comprises at least 85% sequence identity to SEQ ID NO: 32. In some aspects, the programmable CasΦ nuclease comprises at least 90% sequence identity to SEQ ID NO: 32. In some aspects, the programmable CasΦ nuclease comprises at least 95% sequence identity to SEQ ID NO: 32. In some aspects, the programmable CasΦ nuclease comprises at least 98% sequence identity to SEQ ID NO: 32. In some aspects, the programmable CasΦ nuclease comprises or consists of an amino acid sequence of SEQ ID NO: 32.
In some aspects, the programmable CasΦ nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease. In some aspects, a complex comprising the programmable CasΦ nuclease and the guide RNA binds to the target sequence. In some aspects, the programmable CasΦ nuclease does not require a tracrRNA to cleave a target nucleic acid. In some aspects, the programmable CasΦ nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving a target nucleic acid.
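The sequence identity levels recited above (at least 85%, 90%, 95%, or 98%) can be illustrated with a short calculation. The sketch below is not a definition of sequence identity for this disclosure. It assumes two sequences that have already been aligned, with "-" marking gaps, and it uses one common convention (identical columns divided by alignment length); other conventions give different values. The function names and the example peptide fragments are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def identity_tiers_met(identity: float, tiers=(85, 90, 95, 98)):
    """Return the recited identity levels that a computed identity reaches."""
    return [t for t in tiers if identity >= t]


# Hypothetical aligned fragments: one substitution and one gap over 20 columns.
query = "MKTAYLAKQR-ISFVKSHFS"
reference = "MKTAYIAKQRQISFVKSHFS"

identity = percent_identity(query, reference)
print(identity, identity_tiers_met(identity))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the final percentage step.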


In various aspects, the present disclosure provides a composition comprising the programmable CasΦ nuclease disclosed herein or a nucleic acid encoding said programmable nuclease, and a guide nucleic acid comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease. In some aspects, the first region comprises a seed region comprising between 10 and 16 nucleosides. In some aspects, the seed region comprises 16 nucleosides. In some aspects, the composition comprises the programmable CasΦ nuclease or a nucleic acid encoding said programmable nuclease and a cell, preferably wherein the cell is a eukaryotic cell. In various aspects, the present disclosure provides a programmable CasΦ nuclease disclosed herein or a nucleic acid encoding said programmable nuclease, and a guide nucleic acid comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease and a cell, preferably wherein the cell is a eukaryotic cell. In some aspects, the first region comprises a seed region comprising between 10 and 16 nucleosides. In some aspects, the seed region comprises 16 nucleosides.


In various aspects, the present disclosure provides a eukaryotic cell comprising the programmable CasΦ nuclease disclosed herein or a nucleic acid encoding said programmable nuclease. In some aspects, the cell further comprises a guide nucleic acid comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease. In some aspects, the first region comprises a seed region comprising between 10 and 16 nucleosides. In some aspects, the seed region comprises 16 nucleosides.
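The relationship between the first region of the guide nucleic acid and the target sequence, and the recited seed region length, can be sketched as follows. This is an illustration under stated assumptions: the first region is treated as RNA that is perfectly complementary (Watson-Crick pairing, antiparallel) to a DNA target strand written 5′ to 3′, and "between 10 and 16" is treated as inclusive of both endpoints. The function names and the example sequence are hypothetical.

```python
# RNA base that pairs with each DNA base.
RNA_PARTNER_OF_DNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def complementary_rna(target_dna: str) -> str:
    """Return the RNA (5'->3') that base-pairs with a DNA strand given 5'->3'."""
    return "".join(RNA_PARTNER_OF_DNA[base] for base in reversed(target_dna.upper()))


def is_complementary(first_region_rna: str, target_dna: str) -> bool:
    """True if the RNA region is perfectly complementary to the DNA target strand."""
    return first_region_rna.upper() == complementary_rna(target_dna)


def seed_length_in_range(seed_length: int, low: int = 10, high: int = 16) -> bool:
    """True if the seed length lies in the recited range (endpoints assumed inclusive)."""
    return low <= seed_length <= high


# Hypothetical target strand, for illustration only.
print(complementary_rna("GATTACA"))
print(seed_length_in_range(16))
```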


In various aspects, the present disclosure provides a vector comprising the nucleic acid encoding a programmable nuclease as disclosed herein. In some aspects, the vector is a viral vector. In some aspects, the vector further comprises a nucleic acid encoding a guide nucleic acid, wherein the guide nucleic acid comprises a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable CasΦ nuclease. In some aspects, the guide nucleic acid is a guide RNA. In some aspects, the vector further comprises a donor polynucleotide.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable nuclease; a complex comprising the programmable nuclease and the guide RNA binds to the target sequence; the programmable nuclease comprises a RuvC domain, wherein the RuvC domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable nuclease cleaves both strands of the target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; and the programmable nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable nuclease; a complex comprising the programmable nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable nuclease; a complex comprising the programmable nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable nuclease cleaves both strands of a target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; the programmable nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; and the programmable nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable nuclease cleaves both strands of the target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; and the programmable nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable nuclease does not require a tracrRNA to cleave the target nucleic acid.


In various aspects, the present disclosure provides a programmable nuclease or a nucleic acid encoding said programmable nuclease, wherein said programmable nuclease is a Type V CRISPR/Cas enzyme nuclease and comprises between 400 and 900 amino acids, and wherein the programmable nuclease is capable of binding to a guide RNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable nuclease, wherein the first region comprises a seed region comprising between 10 and 16 nucleosides; a complex comprising the programmable nuclease and the guide RNA binds to the target sequence; the RuvC-like domain is capable of processing a pre-crRNA and cleaving the target nucleic acid; the programmable nuclease cleaves both strands of a target nucleic acid comprising the target sequence, wherein the strand break is a staggered cut with a 5′ overhang; the programmable nuclease is capable of cleaving the second region of the guide RNA in mammalian cells; and the programmable nuclease does not require a tracrRNA to cleave the target nucleic acid. In some aspects, the same active site in the RuvC domain or RuvC-like domain catalyzes the processing of the pre-crRNA and the cleaving of the target nucleic acid. In some aspects, the programmable nuclease is fused or linked to one or more NLS.


In various aspects, the programmable nuclease disclosed herein or the nucleic acid encoding said programmable nuclease is fused to one or more NLS. In some aspects, the one or more NLS are fused or linked to the N-terminus of the programmable nuclease. In some aspects, the one or more NLS are fused or linked to the C-terminus of the programmable nuclease; or the one or more NLS are fused or linked to the N-terminus and the C-terminus of the programmable nuclease.


In various aspects, the present disclosure provides a composition comprising a programmable nuclease disclosed herein or a nucleic acid encoding the programmable nuclease; and a gRNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable nuclease. In some aspects, the first region comprises a seed region comprising between 10 and 16 nucleosides. In some aspects, the seed region comprises 16 nucleosides. In some aspects, the programmable nuclease or a nucleic acid disclosed herein is comprised in a cell, preferably wherein the cell is a eukaryotic cell. In some aspects, the composition comprising the programmable nuclease or a nucleic acid disclosed herein further comprises a gRNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable nuclease and a cell, preferably wherein the cell is a eukaryotic cell. In some aspects, the first region comprises a seed region comprising between 10 and 16 nucleosides. In some aspects, the seed region comprises 16 nucleosides.


In various aspects, the present disclosure provides a eukaryotic cell comprising a programmable nuclease disclosed herein or a nucleic acid molecule encoding said programmable nuclease. In some aspects, the cell further comprises a gRNA comprising a first region that is complementary to a target nucleic acid sequence in a eukaryotic genome and a second region that binds to the programmable nuclease. In some aspects, the first region comprises a seed region comprising between 10 and 16 nucleosides. In some aspects, the seed region comprises 16 nucleosides. In some aspects, the nucleic acid disclosed herein is comprised in a vector. In some aspects, the vector is a viral vector.


In some aspects, the present disclosure provides a complex comprising a first programmable CasΦ nuclease and a second programmable CasΦ nuclease. In some aspects, the first programmable CasΦ nuclease and the second programmable CasΦ nuclease are the same programmable CasΦ nuclease. In some aspects, the dimer comprises a first programmable CasΦ nuclease and a second programmable CasΦ nuclease. In some aspects, the composition comprises a first programmable CasΦ nuclease and a second programmable CasΦ nuclease.


In various aspects, the present disclosure provides a method of modifying a cell comprising a target nucleic acid, comprising introducing a composition comprising a programmable CasΦ nuclease, programmable nuclease or a cas nuclease to a cell, wherein the programmable CasΦ nuclease, programmable nuclease or the cas nuclease cleaves the target nucleic acid, thereby modifying the cell.


In various aspects, the disclosure provides a method of modifying a cell comprising a target nucleic acid, comprising introducing to the cell (i) the programmable CasΦ nuclease or programmable nuclease disclosed herein and (ii) a guide nucleic acid, wherein the programmable CasΦ nuclease or programmable Cas nuclease cleaves the target nucleic acid, thereby modifying the cell. In some aspects, the guide nucleic acid is a guide RNA. In some aspects, the method further comprises introducing a donor polynucleotide to the cell. In some aspects, the method comprises inserting the donor polynucleotide into the target nucleic acid at the site of cleavage. In some aspects, the cell is a eukaryotic cell, preferably a human cell. In some aspects, the cell is a T cell. In some aspects, the cell is a CAR-T cell. In some aspects, the cell is a stem cell. In some aspects, the cell is a hematopoietic stem cell. In some aspects, the stem cell is a pluripotent stem cell, preferably an induced pluripotent stem cell. In some aspects, the disclosure provides a modified cell obtained or obtainable by the method disclosed herein. In some aspects, the disclosure provides a modified human cell obtained or obtainable by the methods herein. In some aspects, the modified cell is a eukaryotic cell, preferably a human cell. In some aspects, the cell is a T cell. In some aspects, the T cell is a CAR-T cell. In some aspects, the cell is a stem cell. In some aspects, the cell is a hematopoietic stem cell. In some aspects, the cell is a pluripotent stem cell, preferably an induced pluripotent stem cell.


In some aspects, the method comprises the use of a CasΦ nuclease to introduce a first modification in a first gene and a second modification in a second gene according to the methods disclosed herein. In some aspects, the method comprises the use of a programmable CasΦ nuclease, programmable nuclease or a cas nuclease to modify a cell according to the methods disclosed herein. In some aspects, the method comprises lipid nanoparticle delivery of a nucleic acid encoding the programmable CasΦ nuclease, programmable nuclease or cas nuclease, and the guide nucleic acid. In some aspects, the nucleic acid further comprises a donor polynucleotide. In some aspects, the nucleic acid is a viral vector. In some aspects, the viral vector is an AAV vector.


INCORPORATION BY REFERENCE

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.





BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:



FIG. 1 illustrates results of a cis-cleavage assay on CasΦ polypeptides to assess programmable nickase activity. The results showed that CasΦ orthologs comprise programmable nickase activity. The assay was performed on five CasΦ polypeptides, designated CasΦ.2, CasΦ.11, CasΦ.17, CasΦ.18, and CasΦ.12, in FIG. 1. For the assay, each of the CasΦ polypeptides was complexed with a guide nucleic acid at room temperature for 20 minutes to form a ribonucleoprotein (RNP) complex. The RNP complexes for each of the CasΦ polypeptides were separately incubated at 37° C. for 60 minutes with plasmid DNA targeted by the guide nucleic acids. The graph shows the percentage of plasmids that developed nicks (single-stranded breaks) or linearized (double-stranded breaks) during the 60 minute incubation, as measured by gel-electrophoresis. The data showed that CasΦ.2, CasΦ.11, CasΦ.17, and CasΦ.18 acted as programmable nickases. CasΦ.17 and CasΦ.18 produced only nicked product. CasΦ.2 and CasΦ.11 generated some linearized product but primarily nicked intermediate. CasΦ.12 generated almost entirely linearized product.



FIG. 2A and FIG. 2B illustrate results of a cis-cleavage assay on CasΦ polypeptides to assess the effect of crRNA repeat sequence and RNP complexing temperature on the programmable nickase activity of CasΦ polypeptides. Each of three proteins (designated CasΦ.11, CasΦ.17 and CasΦ.18 in FIG. 2A and FIG. 2B) was tested for its ability to nick plasmid DNA when complexed with one of four crRNAs comprising the repeat sequences of CasΦ.2, CasΦ.7, CasΦ.10 and CasΦ.18 (abbreviated j2, j7, j10, and j18, respectively, in FIG. 2A and FIG. 2B). FIG. 2C illustrates the alignment of CasΦ.2, CasΦ.7, CasΦ.10, and CasΦ.18 repeat sequences showing conserved (highlighted in black) and diverged nucleotides. For the assay, the RNP complex formation of each of the CasΦ polypeptides with the guide nucleic acid was performed at either room temperature or at 37° C. The incubation of the RNP complex with the input plasmid DNA that comprised the target sequence for the guide nucleic acids was carried out for 60 minutes at 37° C. FIG. 2A shows the percentage of input plasmid DNA that was nicked by RNP complexes assembled at room temperature. The data showed that crRNAs comprising repeat sequences from all tested CasΦ polypeptides supported nickase activity by CasΦ.11, CasΦ.17, and CasΦ.18; the only exception was the CasΦ.17/CasΦ.2-repeat pairing.



FIG. 2B shows the percentage of input plasmid DNA that was nicked by RNP complexes assembled at 37° C. The data showed that the activity of each protein is completely abolished when complexed with crRNAs comprising a repeat sequence from CasΦ.2 or CasΦ.10. FIG. 2D shows corresponding data for CasΦ.2, CasΦ.4, CasΦ.6, CasΦ.9, CasΦ.10, CasΦ.12 and CasΦ.13 for the experiment shown in FIG. 2A and FIG. 2B. FIG. 2D also shows the percentage of input plasmid DNA that was linearized by CasΦ.2, CasΦ.4, CasΦ.6, CasΦ.9, CasΦ.10, CasΦ.11, CasΦ.12, CasΦ.13, CasΦ.17 and CasΦ.18 when complexed with one of four crRNAs j2, j7, j10 and j18, as described above.



FIG. 3A illustrates the cleavage pattern for the control that comprised no CasΦ polypeptide. In the absence of CasΦ polypeptide, the target DNA remained uncut and resulted in complete sequencing of both target and non-target strands. FIGS. 3B-3E illustrate results of a cis-cleavage assay and sequencing run demonstrating that CasΦ nickases cleave the non-target strand of a double-stranded DNA target. A cis-cleavage assay was performed with four CasΦ polypeptides, CasΦ.12, CasΦ.2, CasΦ.11, and CasΦ.18, and a control comprising no CasΦ polypeptide, on a supercoiled plasmid DNA comprising a protospacer immediately downstream of a TTTN PAM sequence. The resulting DNA from the assay was Sanger sequenced using forward and reverse primers. The forward primer comprised the sequence of the target strand (TS) of the DNA sequence, while the reverse primer comprised the sequence of the non-target strand (NTS). If a strand had been cleaved by the CasΦ polypeptide being assayed, the sequencing signal would drop off from the cleavage site. FIG. 3B illustrates the cleavage pattern for CasΦ.12 protein, which comprises double-stranded DNA cleavage activity. As shown in the figure, the sequencing signal dropped off on both the target and the non-target strands (as shown by arrows) demonstrating cleavage of both strands. FIG. 3C illustrates the cleavage pattern for CasΦ.2, which predominantly nicks DNA as illustrated in FIG. 1. The sequencing signal dropped off only on the non-target strand (bottom arrow) demonstrating nicking of the non-target strand. FIG. 3D illustrates the cleavage pattern for CasΦ.11. As illustrated in FIG. 1, CasΦ.11 only nicks DNA after 60 minutes of incubation with plasmid DNA. The sequencing signal dropped off on the non-target strand (bottom arrow), thus demonstrating that CasΦ.11 nicks the non-target strand. FIG. 3E illustrates the cleavage pattern for CasΦ.18. As illustrated in FIG. 1, CasΦ.18 only nicks DNA after 60 minutes of incubation with plasmid DNA. The sequencing signal dropped off on the non-target strand (bottom arrow), thus demonstrating that CasΦ.18 nicks the non-target strand.
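For illustration only, the arrangement described above (a protospacer immediately downstream of a TTTN PAM) can be located in a sequence with a short script. This is a minimal sketch under stated assumptions: the function name `find_pam_sites`, the default protospacer length of 20 nucleotides, and the toy input sequence are hypothetical choices and are not taken from the disclosed assay. Only the strand supplied is searched.

```python
import re


def find_pam_sites(seq, pam_regex="TTT[ACGT]", protospacer_len=20):
    """Return (pam_start, pam, protospacer) for each PAM followed by a full protospacer.

    The look-ahead makes matches zero-width, so overlapping PAM sites are all reported.
    """
    pattern = re.compile(rf"(?=({pam_regex})([ACGT]{{{protospacer_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq.upper())]


# Toy input (hypothetical): one TTTA PAM followed by at least 20 nucleotides.
toy_sequence = "GGTTTACACAGCTTGTCTGTAAGCGGATGCC"
for start, pam, protospacer in find_pam_sites(toy_sequence):
    print(start, pam, protospacer)
```

Changing `pam_regex` (for example to `[GT]TTG`) adapts the same search to other PAM patterns; the sketch does not model cleavage or strand preference.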



FIGS. 4A-4B illustrate results of a cis-cleavage assay on CasΦ polypeptides to assess the effect of crRNA repeat and target sequence on the programmable nickase and double strand DNA cleavage activity of CasΦ polypeptides. The heat map in FIG. 4A shows cleavage products for 60 minute in vitro plasmid cleavage reactions of 12 CasΦ orthologs paired with 10 crRNA repeat sequences. Except for 0, all Repeat and CasΦ axis labels refer to Cas12Φ system numbers. Repeat 0 is a negative control including the CasΦ.18 crRNA repeat sequence and a non-targeting spacer sequence. With rare exceptions, preference for nicking or linearizing target DNA is not affected by crRNA repeat or target DNA sequence. Raw data for CasΦ.12 and CasΦ.18 targeting spacer 1 (boxes) are shown in FIG. 4B. FIG. 4B shows the raw gel data used to generate a subset of the heat map from FIG. 4A. CasΦ.12 predominantly linearizes plasmid DNA (i.e. cleaves both strands of a double strand DNA target) whereas CasΦ.18 primarily does not proceed beyond the first strand nicking.



FIGS. 5A-5C illustrate the structural conservation of CasΦ crRNA repeats. FIG. 5A shows the structure of the crRNA repeats for CasΦ.1, CasΦ.2, CasΦ.7, CasΦ.11, CasΦ.12, CasΦ.13, CasΦ.18, and CasΦ.32. These structures were calculated using an online RNA prediction tool (https://rna.urmc.rochester.edu/RNAstructureWeb/Servers/Predict1/Predict1.html) using default parameters at 37° C. The sequences of these repeats are provided in TABLE 2. FIG. 5B shows the consensus structure of the crRNA as determined by the LocARNA tool using the crRNA repeats from CasΦ.1, CasΦ.2, CasΦ.4, CasΦ.7, CasΦ.10, CasΦ.11, CasΦ.12, CasΦ.13, CasΦ.17, CasΦ.18, CasΦ.19, CasΦ.21, CasΦ.22, CasΦ.23, CasΦ.24, CasΦ.25, CasΦ.26, CasΦ.27, CasΦ.28, CasΦ.29, CasΦ.30, CasΦ.31, CasΦ.32, CasΦ.33, CasΦ.35 and CasΦ.41.



FIG. 5C shows a further refined consensus structure of the crRNA determined by the LocARNA tool. The LocARNA tool aligns RNA sequences while considering consensus secondary structure of the RNA sequence.



FIGS. 6A-6C illustrate the optimal PAM preferences for CasΦ.2, CasΦ.4, CasΦ.11, CasΦ.12 and CasΦ.18. An in vitro cleavage assay was performed using a linear DNA target. Starting with a TTTA PAM, each position was varied one by one to the other 3 nucleotides for a total of 12 variants in addition to parental TTTA. FIG. 6A shows a heat map which illustrates the absolute levels of double strand cleavage (or nicking for CasΦ.18). FIG. 6B shows the data from FIG. 6A after normalization to the parental TTTA PAM as 100%. FIG. 6C shows the optimal PAM preferences of these CasΦ polypeptides with a summary of the data shown in FIG. 6A and FIG. 6B.
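The PAM panel described for FIGS. 6A-6C (each position of the parental TTTA varied one at a time to the other three nucleotides) and the normalization used for FIG. 6B can be sketched as follows. This is an illustrative sketch only; the function names `single_position_variants` and `normalize_to_parental` are hypothetical, and no measured values from the figures are reproduced.

```python
def single_position_variants(parental, alphabet="ACGT"):
    """Return every sequence differing from `parental` at exactly one position."""
    variants = []
    for i, original in enumerate(parental):
        for base in alphabet:
            if base != original:
                variants.append(parental[:i] + base + parental[i + 1:])
    return variants


def normalize_to_parental(cleavage_by_pam, parental="TTTA"):
    """Express each PAM's cleavage as a percentage of the parental PAM's cleavage."""
    reference = cleavage_by_pam[parental]
    if reference == 0:
        raise ValueError("parental PAM shows no cleavage; cannot normalize")
    return {pam: 100.0 * value / reference for pam, value in cleavage_by_pam.items()}


pam_variants = single_position_variants("TTTA")
print(len(pam_variants))  # 4 positions x 3 alternative bases
```

With a four-nucleotide PAM this yields the 12 single-position variants tested alongside the parental sequence.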



FIG. 7 illustrates that CasΦ polypeptides rapidly nick supercoiled DNA. CasΦ polypeptides were assembled with their native repeat crRNAs targeting one of two targets (S1, TATTAAATACTCGTATTGCTGTTCGATTAT (SEQ ID NO: 108), or S2, CACAGCTTGTCTGTAAGCGGATGCCATATG (SEQ ID NO: 109)) immediately downstream of a GTTG or TTTG PAM. Reactions were initiated with the addition of supercoiled target DNA and stopped after 1, 3, 6, 15, 30 and 60 mins. The cleavage was quantified by agarose gel analysis as nicked (left column) or linear (right column). Error bars are +/− SEM of duplicate time courses.



FIGS. 8A-8B illustrate that CasΦ polypeptides prefer full-length repeats and spacers from 16 to 20 nucleotides. crRNA panels varying in repeat and spacer length were tested for their ability to support CasΦ polypeptides spacer cleavage. Two different CasΦ repeats that function across CasΦ orthologs were utilized. FIG. 8A shows results of the assay for nicking (top) or linearization (bottom) as influenced by the length of the crRNA repeat. 19 nucleotides was the shortest repeat still supporting cleaving activity. FIG. 8B shows results for nicking (top) or linearization (bottom) as influenced by the length of the crRNA spacer. The optimal spacer length varied by target but is generally 16 to 20 nucleotides.



FIGS. 9A-9B illustrate CasΦ.12 cleavage in HEK293T cells and the effect of changing the spacer length on this cleavage. FIG. 9A provides a schematic of how CasΦ.12 cleavage activity was assessed in HEK293T cells. An Ac-GFP-expressing HEK293T cell line was transfected with a plasmid expressing CasΦ.12 and its crRNA targeting the Ac-GFP gene. CasΦ.12 cleavage was assessed by the reduction in Ac-GFP-expressing cells as measured by flow cytometry. As shown in FIG. 9B, varying the spacer length varied the degree of CasΦ.12 cleavage. CasΦ.12 has a preference for a spacer length of 17 to 22 nucleotides in HEK293T cells, but longer spacers (up to 30 nucleotides were tested) also supported CasΦ.12 cleavage.



FIGS. 10A-10B illustrate that the CasΦ disclosed herein are a novel family of Cas nucleases. As shown in FIG. 10A, the InterPro database did not recognize CasΦ.2 as a protein family member. As a positive control, the InterPro database identified Acidaminococcus sp. (strain BV3L6) as a Cas12a protein family member, as shown in FIG. 10B.



FIG. 11 illustrates the raw HMM for PF07282.



FIG. 12 illustrates the raw HMM for PF18516.



FIGS. 13A-13C illustrate the cleavage activity of CasΦ.19-CasΦ.48.



FIGS. 14A-14C illustrate the PAM requirement of CasΦ polypeptides. FIG. 14A shows the PAM requirement of CasΦ.2, CasΦ.4, CasΦ.11 and CasΦ.12. FIG. 14B shows the PAM requirement of CasΦ.20, CasΦ.26, CasΦ.32, CasΦ.38 and CasΦ.45. FIG. 14C shows the cleavage products from the assessment of the PAM requirement for CasΦ.20, CasΦ.24 and CasΦ.25. FIG. 14D shows the quantification of the raw data shown in FIG. 14C.



FIG. 15 illustrates endogenous gene editing in HEK293T cells.



FIGS. 16A-16L illustrate endogenous gene editing in CHO cells. FIG. 16A shows CasΦ.12 mediated generation of insertion or deletion mutations (indel) in the endogenous Bak1, Bax and Fut8 genes. FIG. 16B shows the DNA donor oligos used to assess CasΦ.12 mediated gene editing via the homology directed repair pathway. FIG. 16C shows the detection of indels following delivery of CasΦ.12. FIG. 16D shows the sequence analysis for the data in FIG. 16C.



FIG. 16E shows the detection of incorporated donor template following delivery of CasΦ.12 and a donor oligo. Further examples of CasΦ.12 mediated generation of indel mutations are shown in FIG. 16F, FIG. 16G and FIG. 16H for Bak1, Bax and Fut8 genes, respectively. FIG. 16I shows the DNA donor oligos used to assess CasΦ.12 mediated gene editing via the homology directed repair pathway. FIG. 16J shows the frequency of HDR in CHO cells following delivery of either Cas9 and a gRNA targeting Bax, CasΦ.12 and a gRNA targeting Bax or CasΦ.12 and a gRNA targeting Fut8. FIG. 16K and FIG. 16L show the frequency of indel mutations and HDR, respectively, detected in CHO cells following delivery of CasΦ.12 and AAV6 DNA donors at the indicated number of viral genomes per cell (1×10{circumflex over ( )}5, 3×10{circumflex over ( )}5, or 1×10{circumflex over ( )}6).



FIG. 17 illustrates endogenous gene editing in K562 cells.



FIGS. 18A-18E illustrate endogenous gene editing in primary cells. FIG. 18A shows a flow cytometry analysis of T cells that have received CasΦ.12 with or without a gRNA targeting the beta-2 microglobulin gene. FIG. 18B shows the modification detected in K562 cells and T cells following delivery of CasΦ.12 and a gRNA targeting the beta-2 microglobulin gene.



FIG. 18C shows the sequence analysis of the T cell population which received CasΦ.12 and the gRNA targeting the beta-2 microglobulin gene. FIG. 18D shows a flow cytometry analysis of T cells that have received CasΦ.12 with a gRNA targeting the T Cell Receptor Alpha Constant gene. FIG. 18E shows the sequence analysis of cell populations that received CasΦ.12 with a gRNA targeting the T Cell Receptor Alpha Constant gene. FIG. 18F shows the quantification of indels detected by sequence analysis.



FIG. 19 illustrates the cleavage of the second DNA strand by CasΦ nucleases in a separable reaction step to the cleavage of the first DNA strand.



FIG. 20 illustrates the trans cleavage of ssDNA by CasΦ nucleases in a detection assay.



FIGS. 21A-21B illustrate that the efficiency of CasΦ.12-mediated editing is comparable to that of Cas9. FIG. 21A shows the frequency of indel mutations and quantification of B2M knockout cells from flow cytometry panels in FIG. 21B.



FIGS. 22A-22B illustrate the identification of optimized gRNAs for genome editing with CasΦ.12 in CHO cells. FIG. 22A shows the frequency of indel mutations induced by CasΦ.12 polypeptides complexed with a 2′fluoro modified gRNA. FIG. 22B shows further CasΦ.12 RNP complexes that can mediate genome editing in CHO cells.



FIGS. 23A-23H illustrate minimal off-target CasΦ.12-mediated genome editing in CHO and HEK293 cells. FIGS. 23A-23F show indel validation at potential off-target sites identified by in silico computational prediction. FIG. 23A shows CasΦ.12 targeting Fut8, FIG. 23B shows CasΦ.12 targeting BAX, FIG. 23C shows Cas9 targeting BAX, FIG. 23D shows Cas9 targeting Fut8, FIG. 23E shows Cas9 targeting Bak1 and FIG. 23F shows CasΦ.12 targeting Bak1. FIG. 23G shows off-target analysis using the unbiased GUIDE-seq procedure, with CasΦ.12 and guides targeting human Fut8 in HEK293 cells. FIG. 23H shows off-target analysis using the unbiased GUIDE-seq procedure, with Cas9 and guides targeting human Fut8 in HEK293 cells.



FIGS. 24A-24B illustrate CasΦ.12-mediated genome editing via homology directed repair (HDR). FIG. 24A shows CasΦ.12-mediated gene editing via the HDR pathway. FIG. 24B shows a schematic of the donor oligonucleotide.



FIGS. 25A-25E illustrate the ability of CasΦ.12 to target multiple genes. FIG. 25A shows the percentage of B2M and TRAC knockout after CasΦ.12-mediated genome editing with gRNAs with a repeat length of 20 nucleotides and a spacer length of 20 nucleotides. FIG. 25B shows the percentage of B2M and TRAC knockout after CasΦ.12-mediated genome editing with gRNAs with a repeat length of 20 nucleotides and a spacer length of 17 nucleotides. FIG. 25C shows corresponding flow cytometry panels for B2M and TRAC knockout with different gRNAs. FIG. 25D shows the percentage of TRAC knockout after CasΦ.12-mediated genome editing with modified gRNAs of different spacer lengths (repeat length of 20 nucleotides and a spacer length of 17 or 20 nucleotides). FIG. 25E shows a corresponding flow cytometry panel for TRAC knockout after CasΦ.12-mediated genome editing.



FIGS. 26A-26D illustrate the extended seed region of CasΦ.12. FIG. 26A and FIG. 26B show that no indel mutations or CD3 knockout occur when there is a single or double mismatch in the first 1-16 nucleotides from the 5′ end of the spacer. FIG. 26C and FIG. 26D provide schematics of the gRNAs with mismatches.
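As an illustration of the seed-region observation above, mismatches falling within the first 16 nucleotides from the 5′ end of a spacer can be flagged with a simple position-wise comparison. This is a hedged sketch: the function name `seed_mismatch_positions`, the example sequences, and the assumption that the spacer and the intended matching sequence are written in the same sense and alphabet are all hypothetical and do not represent the experimental design of FIGS. 26A-26D.

```python
def seed_mismatch_positions(spacer, intended, seed_length=16):
    """Return 1-based positions, counted from the 5' end, where `spacer` differs
    from `intended` within the first `seed_length` nucleotides."""
    if len(spacer) != len(intended):
        raise ValueError("sequences must have the same length")
    window = min(seed_length, len(spacer))
    return [i + 1 for i in range(window) if spacer[i] != intended[i]]


# Hypothetical 20-nucleotide sequences: differences at positions 3 and 18.
spacer = "ACGTACGTACGTACGTACGT"
intended = "ACTTACGTACGTACGTATGT"
print(seed_mismatch_positions(spacer, intended))
```

Only the difference inside the 16-nucleotide window is reported; the difference at position 18 lies outside the window and is ignored by this sketch.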



FIGS. 27A-27B illustrate the ability of CasΦ.12 to mediate genome editing in CHO cells with modified gRNAs.



FIGS. 28A-28B illustrate the ability of CasΦ.12 to mediate genome editing with gRNAs with variations in repeat and spacer length. FIG. 28A shows the frequency of CasΦ.12-mediated indel mutations using gRNA of different repeat lengths. FIG. 28B shows the frequency of CasΦ.12-mediated indel mutations using gRNA of different spacer lengths.



FIGS. 29A-29E illustrate exemplary gRNAs for targeting CD3, B2M and PD1 with CasΦ.12 in human primary T cells. FIG. 29F shows the screening of gRNAs targeting TRAC.



FIG. 29H shows the screening of gRNAs targeting B2M. FIG. 29G and FIG. 29I show flow cytometry panels of exemplary gRNAs targeting TRAC and B2M, respectively.



FIGS. 30A-30J illustrate that delivery of either CasΦ.12 RNPs or CasΦ.12 mRNA leads to efficient genome editing. FIG. 30A and FIG. 30B show flow cytometry panels of CasΦ.12 RNP complexes targeting B2M and TRAC in T cells, which are quantified in FIG. 30C and FIG. 30D.



FIG. 30E and FIG. 30F show the quantification of indels detected by sequence analysis with delivery of CasΦ.12 RNPs. FIG. 30G and FIG. 30I show the frequency of indel mutations after delivery of CasΦ.12 mRNA and the quantification of B2M knockout cells. FIG. 30H shows an exemplary FACS panel for two data points in FIG. 30G. FIG. 30J shows the distribution of the size of indel mutations induced by CasΦ.12 or Cas9.



FIG. 31 illustrates that CasΦ.12 can process its own guide RNA in mammalian cells.



FIGS. 32A-32E illustrate CasΦ polypeptide-induced cleavage patterns. FIG. 32A shows that CasΦ polypeptides generated nicked and linearized plasmid DNA. FIG. 32B shows a schematic of the cut sites on the target and non-target strand. FIG. 32C shows sequence analysis of the non-target strand and target strand, which is represented in FIG. 32D. FIG. 32E shows a table of cut sites and overhangs of the different CasΦ polypeptides.



FIG. 33 illustrates the ability of CasΦ RNP complexes to knockout multiple genes simultaneously. T cells were nucleofected with RNP complexes of CasΦ.12 and gRNAs targeting B2M, TRAC or PDCD1 and the percentage knockout was measured using flow cytometry.



FIG. 34 illustrates the ability of CasΦ.12 RNP complexes to mediate high efficiency genome editing of PCSK9 in mouse Hepa1-6 cells. 95 CasΦ gRNAs were used along with Cas9, as a control. CasΦ.12 RNP complexes induced a maximum indel frequency of 48%, whereas Cas9 RNP complexes induced a maximum indel frequency of 22%.



FIGS. 35A-35F illustrate the ability of a CasΦ.12 all-in-one vector to mediate genome editing in Hepa1-6 mouse hepatoma cells. FIG. 35A shows a plasmid map of the AAV encoding the CasΦ polypeptide sequence and gRNA sequence. FIG. 35B illustrates repeat truncations.



FIG. 35C shows efficient transfection with AAV. FIG. 35D shows the frequency of CasΦ.12 induced indel mutations. FIG. 35E and FIG. 35F show the frequency of CasΦ.12 induced indel mutations with different gRNA containing repeat and spacer sequences of different lengths.



FIG. 36 illustrates the optimization of LNP delivery of mRNA encoding CasΦ and gRNA. A range of N/P ratios were tested and the frequency of indel mutations was determined.



FIG. 37 illustrates CasΦ-mediated genome editing of CD34+ hematopoietic stem cells. Cells were nucleofected with either RNP complexes containing CasΦ.12 polypeptides and a B2M-targeting guide, or a mixture of CasΦ.12 mRNA and B2M-targeting guide and the frequency of indel mutations was determined.



FIG. 38 illustrates CasΦ-mediated genome editing of induced pluripotent stem cells. Cells were nucleofected with RNP complexes (CasΦ.12 polypeptides and gRNAs targeting either the B2M locus or targeting a CIITA locus) and the frequency of indel mutations was determined.



FIG. 39 illustrates CasΦ-mediated genome editing of the CIITA locus in K562 cells. Cells were nucleofected with RNP complexes (CasΦ polypeptides and gRNAs targeting CIITA) and the frequency of indel mutations was determined by NGS.





DETAILED DESCRIPTION

The present disclosure provides methods, compositions, systems, and kits comprising programmable CasΦ nucleases. An illustrative composition comprises a programmable CasΦ nuclease or a nucleic acid encoding the programmable CasΦ nuclease, wherein the programmable CasΦ nuclease comprises at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47 and SEQ ID NO. 105. In some embodiments, the composition further comprises a guide nucleic acid or a nucleic acid encoding the guide nucleic acid, wherein the guide nucleic acid comprises a region comprising a nucleotide sequence that is complementary to a target nucleic acid sequence and an additional region, wherein the region and the additional region are heterologous to each other. As used herein, the term “heterologous” may be used to describe or indicate that a first sequence is different from a second sequence and that the two sequences do not naturally occur together. As used herein, the term “heterologous” may be used to describe that a first moiety (e.g., a first sequence) is different from a second moiety (e.g., a second sequence) and, as such, the two moieties do not naturally occur together and are engineered to be a part of one entity. For example, a guide nucleic acid sequence comprising a region and an additional region that are heterologous to each other may indicate that the guide nucleic acid sequence is engineered to include the region and the additional region. The programmable CasΦ nuclease and the guide nucleic acid may be complexed together in a ribonucleoprotein complex. Alternatively, compositions consistent with the present disclosure include nucleic acids encoding the programmable CasΦ nuclease and the guide nucleic acid. In some embodiments, the guide nucleic acid comprises a sequence with at least about 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 48 to 86. In some embodiments, the programmable CasΦ nuclease is SEQ ID NO: 12 or SEQ ID NO: 105. In some embodiments, the programmable CasΦ nuclease comprises nickase activity. In some embodiments, the programmable CasΦ nuclease comprises double-strand cleavage activity. As used herein, CasΦ may be referred to as Cas12j or Cas14u.
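For illustration, a percent sequence identity threshold such as the 85% recited above can be checked computationally once two sequences have been aligned. The following is a minimal sketch under explicit assumptions: the inputs are already aligned to equal length with "-" marking gaps, and identity is counted as identical residues divided by alignment columns (one of several conventions in use). The function names are hypothetical, and this sketch is not the method by which sequence identity is determined in the present disclosure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding identical, non-gap residues.

    Assumes the two inputs are pre-aligned and of equal length; columns that
    are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=85.0):
    """True if the aligned pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions (for example, dividing by the length of the shorter sequence or excluding gapped columns) give different values for the same alignment, so the denominator should be stated whenever a figure is reported.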


Also disclosed herein are compositions, methods, and systems for modifying a target nucleic acid sequence. An illustrative method for modifying a target nucleic acid sequence comprises contacting a target nucleic acid sequence with a programmable CasΦ nuclease comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47 and SEQ ID NO. 105, and a guide nucleic acid, wherein the programmable CasΦ nuclease cleaves the target nucleic acid sequence, thereby modifying the target nucleic acid sequence. In some embodiments, the programmable CasΦ nuclease introduces a double-stranded break in the target nucleic acid. In some embodiments, the programmable CasΦ nuclease introduces a single-stranded break.


Also disclosed herein are compositions, methods, and systems for modifying a target nucleic acid sequence comprising use of two or more programmable CasΦ nickases. An illustrative method for introducing a break in a target nucleic acid comprises contacting the target nucleic acid with: (a) a first guide nucleic acid comprising a region that binds to a first programmable nickase comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47 and SEQ ID NO. 105; and (b) a second guide nucleic acid comprising a region that binds to a second programmable nickase comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47 and SEQ ID NO. 105, wherein the first guide nucleic acid comprises an additional region that binds to the target nucleic acid and wherein the second guide nucleic acid comprises an additional region that binds to the target nucleic acid and wherein the additional region of the first guide nucleic acid and the additional region of the second guide nucleic acid bind opposing strands of the target nucleic acid.
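The requirement above that two guide nucleic acids address opposing strands of the target can be illustrated by checking on which strand of a duplex each site sequence appears. This is a simplified sketch: the function names, the toy duplex, and the toy site sequences are hypothetical, and the check only locates sequences by exact string match on the given strand and its reverse complement; it does not model guide binding, PAM requirements, or nicking.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def strands_containing(site, top_strand):
    """Return the strands ('top', 'bottom') on which `site` occurs, read 5' to 3'."""
    site, top = site.upper(), top_strand.upper()
    found = []
    if site in top:
        found.append("top")
    if site in reverse_complement(top):
        found.append("bottom")
    return found


def on_opposing_strands(site_a, site_b, top_strand):
    """True if one site lies on the top strand and the other on the bottom strand."""
    a = strands_containing(site_a, top_strand)
    b = strands_containing(site_b, top_strand)
    return ("top" in a and "bottom" in b) or ("bottom" in a and "top" in b)


# Toy duplex (hypothetical), top strand written 5' to 3'.
top = "ATGCCGTACGGATTC"
print(on_opposing_strands("GCCGTA", "GAATCC", top))
```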


Also disclosed herein are compositions, methods, and systems for detecting a target nucleic acid in a sample. An illustrative method for detecting a target nucleic acid in a sample comprises contacting the sample comprising the target nucleic acid with (a) a programmable CasΦ nuclease comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47 and SEQ ID NO. 105; (b) a guide RNA comprising a region that binds to the programmable CasΦ nuclease and an additional region that binds to the target nucleic acid; and (c) a labeled, single stranded DNA reporter that does not bind the guide RNA; cleaving the labeled single stranded DNA reporter by the programmable CasΦ nuclease to release a detectable label; and detecting the target nucleic acid by measuring a signal from the detectable label.
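As one hypothetical way of "detecting the target nucleic acid by measuring a signal from the detectable label", the measured reporter signal of a sample can be compared with replicate no-target control reactions. The sketch below assumes a decision rule of control mean plus three sample standard deviations. This rule, the parameter `k`, and the function names are assumptions made for illustration and are not taken from the disclosure.

```python
from statistics import mean, stdev


def detection_threshold(no_target_signals, k=3.0):
    """Assumed threshold: mean of no-target controls plus k sample standard deviations.

    Requires at least two control measurements, since stdev is undefined for one.
    """
    return mean(no_target_signals) + k * stdev(no_target_signals)


def target_detected(sample_signal, no_target_signals, k=3.0):
    """True if the sample's reporter signal exceeds the assumed control threshold."""
    return sample_signal > detection_threshold(no_target_signals, k)
```

With hypothetical control readings of 100, 102, 98, 101, and 99 arbitrary fluorescence units, a sample reading of 150 would be called detected and a reading of 103 would not.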


Also disclosed herein are compositions, methods, and systems for modulating transcription of a gene in a cell. An illustrative method of modulating transcription of a gene in a cell comprises introducing into a cell comprising a target nucleic acid sequence: (i) a fusion polypeptide or a nucleic acid encoding the fusion polypeptide, wherein the fusion polypeptide comprises: (a) a dCasΦ polypeptide comprising at least 85% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1 to 47 and SEQ ID NO. 105, wherein the dCasΦ polypeptide is enzymatically inactive; and (b) a polypeptide comprising transcriptional regulation activity; and (ii) a guide nucleic acid, or a nucleic acid comprising a nucleotide sequence encoding the guide nucleic acid, wherein the guide nucleic acid comprises a region that binds to the dCasΦ polypeptide and an additional region that binds to the target nucleic acid; wherein transcription of the gene is modulated through the fusion polypeptide acting on the target nucleic acid sequence.


Also disclosed is use of a programmable CasΦ nuclease to modify a target nucleic acid sequence according to any of the methods described herein. Also disclosed is use of a first programmable nickase and a second programmable nickase to introduce a break in a target nucleic acid according to any of the methods described herein. Also disclosed is use of a programmable CasΦ nuclease to detect a target nucleic acid in a sample according to any of the methods described herein. Also disclosed is use of a dCasΦ polypeptide to modulate transcription of a gene in a cell according to any of the methods described herein.


Programmable Nucleases

The present disclosure provides methods and compositions comprising programmable nucleases. The programmable nucleases can be complexed with a guide nucleic acid of the disclosure for targeting a target nucleic acid for detection, editing, modification, or regulation of the target nucleic acid.


The programmable nuclease can be used for detecting a target nucleic acid. For example, in certain embodiments, when the programmable nuclease is complexed with the guide nucleic acid and the target nucleic acid hybridizes to the guide nucleic acid, trans-cleavage of a single stranded DNA (ssDNA), such as an ssDNA reporter, by the programmable nuclease is activated. Detection of trans-cleavage of ssDNA can be used to determine the presence of a target nucleic acid in a sample.


The programmable nuclease can be used for editing or modifying a target nucleic acid, for example, by site-specific cleavage of a target sequence, donor nucleic acid insertion, or a combination thereof.


The programmable nuclease can be used for gene regulation of a target nucleic acid, for example, using a catalytically inactive programmable nuclease in combination with a polypeptide comprising gene regulation activity.


In some embodiments, the programmable nuclease is a programmable nuclease comprising site-specific nucleic acid cleavage activity. In some embodiments, the programmable nuclease is a programmable nuclease comprising double-strand DNA cleavage activity. In some embodiments, the programmable nuclease is a programmable nickase. In some embodiments, the programmable nuclease is a programmable DNA nickase. In some embodiments, the programmable nuclease is a programmable nuclease comprising a catalytically inactive nuclease domain. In some embodiments, the programmable nuclease comprising a catalytically inactive nuclease domain can include at least 1, at least 2, at least 3, at least 4, or at least 5 mutations relative to a wild type nuclease domain. Said mutations may be present within the cleaving or active site of the nuclease.


In some embodiments, the programmable nuclease is a programmable DNA nuclease. In some embodiments, the programmable nuclease is a Type V CRISPR/Cas enzyme, wherein a Type V CRISPR/Cas enzyme comprises a single active site or catalytic domain in a single RuvC domain. The RuvC domain is typically near the C-terminus of the enzyme. A single RuvC domain may comprise RuvC subdomains, for example RuvCI, RuvCII and RuvCIII. As used herein, a “Type V CRISPR/Cas enzyme” or “Type V cas nuclease” or “Type V cas effector” may be used to describe a family of enzymes or a member thereof having diverse N-terminal structures and often comprising a conserved single catalytic RuvC-like endonuclease domain that is C-terminal of the N-terminal structures, derived from the TnpB protein encoded by autonomous or non-autonomous transposons. The terms “RuvC domain” and “RuvC-like domain” are used interchangeably for Type V CRISPR/Cas enzymes, Type V cas nucleases and Type V cas effectors. In some embodiments, the Type V CRISPR/Cas enzyme is a CasΦ nuclease. A CasΦ polypeptide can function as an endonuclease that catalyzes cleavage at a specific sequence in a target nucleic acid. A programmable CasΦ nuclease of the present disclosure may have a single active site in a RuvC domain that is capable of catalyzing pre-crRNA processing and nicking or cleaving of nucleic acids. This compact catalytic site may render the programmable CasΦ nuclease especially advantageous for genome engineering and for enabling new functionalities for genome manipulation.


In some embodiments, the RuvC domain is a RuvC-like domain. Various RuvC-like domains are known in the art and are easily identified using online tools such as InterPro (https://www.ebi.ac.uk/interpro/). For example, a RuvC-like domain may be a domain which shares homology with a region of TnpB proteins of the IS605 and other related families of transposons, as described in review articles such as Shmakov et al. (Nature Reviews Microbiology volume 15, pages 169-182 (2017)) and Koonin E. V. and Makarova K. S. (2019, Phil. Trans. R. Soc., B 374:20180087). In some embodiments, the RuvC-like domain shares homology with the transposase IS605, OrfB, C-terminal. A transposase IS605, OrfB, C-terminal is easily identified by the skilled person using bioinformatics tools, such as PFAM (Finn et al. (Nucleic Acids Res. 2014 Jan. 1; 42(Database issue): D222-D230); El-Gebali et al. (2019) Nucleic Acids Res. doi:10.1093/nar/gky995). PFAM is a database of protein families in which each entry is composed of a seed alignment which forms the basis to build a profile hidden Markov model (HMM) using the HMMER software (hmmer.org). It is readily accessible via pfam.xfam.org, maintained by EMBL-EBI, which easily allows an amino acid sequence to be analyzed against the current release of PFAM (e.g. version 33.1 from May 2020), but local builds can also be implemented using publicly and freely available database files and tools. A transposase IS605, OrfB, C-terminal is easily identified by the skilled person using the HMM PF07282. PF07282 is reproduced for reference in FIG. 11 (accession number PF07282.12). The skilled person would also be able to identify a RuvC domain, for example with the HMM PF18516, using the PFAM tool. PF18516 is reproduced for reference in FIG. 12 (accession number PF18516.2). In some embodiments, the programmable CasΦ nuclease comprises a RuvC-like domain which matches PFAM family PF07282 but does not match PFAM family PF18516, as assessed using the PFAM tool (e.g. using PFAM version 33.1, and the HMM accession numbers PF07282.12 and PF18516.2). PFAM searches should ideally be performed using an E-value cut-off set at 1.0.
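By way of example only, the "matches PF07282 but does not match PF18516" criterion can be applied to the results of a local HMMER search. The sketch below assumes that the input is the per-target table written by `hmmscan --tblout`, in which the second whitespace-separated column is the HMM accession and the fifth column is the full-sequence E-value. It further assumes that a "match" means an E-value at or below the 1.0 cut-off. The function names are hypothetical, and the table rows used in testing would be supplied by the user.

```python
def best_evalues(tblout_text):
    """Map each HMM accession (version suffix removed) to its lowest full-sequence E-value."""
    best = {}
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the commented header/footer of the table
        fields = line.split()
        accession = fields[1].split(".")[0]   # e.g. "PF07282.12" -> "PF07282"
        evalue = float(fields[4])             # full-sequence E-value column
        best[accession] = min(evalue, best.get(accession, float("inf")))
    return best


def matches_ruvc_like_criterion(tblout_text, cutoff=1.0):
    """True if PF07282 is hit at or below the cut-off and PF18516 is not (assumed rule)."""
    best = best_evalues(tblout_text)

    def hit(accession):
        return best.get(accession, float("inf")) <= cutoff

    return hit("PF07282") and not hit("PF18516")
```

Stripping the version suffix lets the same check work with other PFAM releases, although the disclosure refers specifically to accession numbers PF07282.12 and PF18516.2.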


In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 1%, at least 2%, at least 3%, at least 4%, at least 5%, at least 6%, at least 7%, at least 8%, at least 9%, at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 21%, at least 22%, at least 23%, at least 24%, at least 25%, at least 26%, at least 27%, at least 28%, at least 29%, at least 30%, at least 31%, at least 32%, at least 33%, at least 34%, at least 35%, at least 36%, at least 37%, at least 38%, at least 39%, at least 40%, at least 41%, at least 42%, at least 43%, at least 44%, at least 45%, at least 46%, at least 47%, at least 48%, at least 49%, at least 50%, at least 51%, at least 52%, at least 53%, at least 54%, at least 55%, at least 56%, at least 57%, at least 58%, at least 59%, at least 60%, at least 61%, at least 62%, at least 63%, at least 64%, at least 65%, at least 66%, at least 67%, at least 68%, at least 69%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 20%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 25%. 
In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 30%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 35%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 40%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 45%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 50%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 55%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 60%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 65%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 70%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 75%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 80%. 
In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 85%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 90%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of at least 95%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of 100%. In some embodiments, a programmable nuclease described herein—or a programmable nuclease and guide RNA combination described herein—has an editing efficiency of 42%. In some embodiments, said editing efficiency is determined by analyzing the frequency of indel mutations in a nucleic acid or gene knockout.
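As a simple illustration of determining editing efficiency from the frequency of indel mutations, the following sketch computes the percentage of sequencing reads at a target site that carry an indel. It assumes that read counts have already been obtained by an upstream analysis, which is not shown. The function names and the read-count inputs are hypothetical.

```python
def editing_efficiency(indel_reads, total_reads):
    """Percentage of reads at the target site that contain an indel."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= indel_reads <= total_reads:
        raise ValueError("indel_reads must be between 0 and total_reads")
    return 100.0 * indel_reads / total_reads


def meets_efficiency(indel_reads, total_reads, minimum_percent):
    """True if the computed editing efficiency is at least `minimum_percent`."""
    return editing_efficiency(indel_reads, total_reads) >= minimum_percent
```

For instance, 420 indel-containing reads out of 1000 total reads would correspond to an editing efficiency of 42%, which satisfies an "at least 40%" threshold but not an "at least 45%" threshold.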


In some embodiments, a programmable nuclease described herein has a primary amino acid sequence length of less than 1500 amino acids, less than 1450 amino acids, less than 1400 amino acids, less than 1350 amino acids, less than 1300 amino acids, less than 1250 amino acids, less than 1200 amino acids, less than 1150 amino acids, less than 1100 amino acids, less than 1050 amino acids, less than 1000 amino acids, less than 950 amino acids, less than 900 amino acids, less than 850 amino acids, or less than 800 amino acids.


In some examples, a programmable nuclease described herein is a Type V cas nuclease. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 20%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 25%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 30%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 35%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 40%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 45%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 50%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 55%.


In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 60%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 65%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 70%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 75%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 80%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 85%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 90%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of at least 95%. In some examples, the Type V cas nuclease, or a composition comprising the Type V cas nuclease, has an editing efficiency of 100%.


In some examples, a programmable nuclease described herein has a primary amino acid sequence length of less than 850 amino acids. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 20%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 25%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 30%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 35%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 40%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 45%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 50%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 55%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 60%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 65%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 70%. 
In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 75%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 80%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 85%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 90%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of at least 95%. In some examples, the programmable nuclease having a primary amino acid sequence length of less than 850 amino acids has an editing efficiency of 100%.


TABLE 1 provides amino acid sequences of illustrative CasΦ polypeptides that can be used in compositions and methods of the disclosure.









TABLE 1
CasΦ Amino Acid Sequences

Name
SEQ ID NO
Amino Acid Sequence





CasΦ.1
 1
MADTPTLFTQFLRHHLPGQRFRKDILKQAGRILANKGEDATI




AFLRGKSEESPPDFQPPVKCPIIACSRPLTEWPIYQASVAIQGY




VYGQSLAEFEASDPGCSKDGLLGWFDKTGVCTDYFSVQGLN




LIFQNARKRYIGVQTKVTNRNEKRHKKLKRINAKRIAEGLPE




LTSDEPESALDETGHLIDPPGLNTNIYCYQQVSPKPLALSEVN




QLPTAYAGYSTSGDDPIQPMVTKDRLSISKGQPGYIPEHQRA




LLSQKKHRRMRGYGLKARALLVIVRIQDDWAVIDLRSLLRN




AYWRRIVQTKEPSTITKLLKLVTGDPVLDATRMVATFTYKPG




IVQVRSAKCLKNKQGSKLFSERYLNETVSVTSIDLGSNNLVA




VATYRLVNGNTPELLQRFTLPSHLVKDFERYKQAHDTLEDSI




QKTAVASLPQGQQTEIRMWSMYGFREAQERVCQELGLADG




SIPWNVMTATSTILTDLFLARGGDPKKCMFTSEPKKKKNSKQ




VLYKIRDRAWAKMYRTLLSKETREAWNKALWGLKRGSPDY




ARLSKRKEELARRCVNYTISTAEKRAQCGRTIVALEDLNIGFF




HGRGKQEPGWVGLFTRKKENRWLMQALHKAFLELAHHRG




YHVIEVNPAYTSQTCPVCRHCDPDNRDQHNREAFHCIGCGFR




GNADLDVATHNIAMVAITGESLKRARGSVASKTPQPLAAE





CasΦ.2
 2
MPKPAVESEFSKVLKKHFPGERFRSSYMKRGGKILAAQGEE




AVVAYLQGKSEEEPPNFQPPAKCHVVTKSRDFAEWPIMKAS




EAIQRYIYALSTTERAACKPGKSSESHAAWFAATGVSNHGYS




HVQGLNLIFDHTLGRYDGVLKKVQLRNEKARARLESINASR




ADEGLPEIKAEEEEVATNETGHLLQPPGINPSFYVYQTISPQA




YRPRDEIVLPPEYAGYVRDPNAPIPLGVVRNRCDIQKGCPGYI




PEWQREAGTAISPKTGKAVTVPGLSPKKNKRMRRYWRSEKE




KAQDALLVTVRIGTDWVVIDVRGLLRNARWRTIAPKDISLN




ALLDLFTGDPVIDVRRNIVTFTYTLDACGTYARKWTLKGKQ




TKATLDKLTATQTVALVAIDLGQTNPISAGISRVTQENGALQ




CEPLDRFTLPDDLLKDISAYRIAWDRNEEELRARSVEALPEA




QQAEVRALDGVSKETARTQLCADFGLDPKRLPWDKMSSNT




TFISEALLSNSVSRDQVFFTPAPKKGAKKKAPVEVMRKDRT




WARAYKPRLSVEAQKLKNEALWALKRTSPEYLKLSRRKEEL




CRRSINYVIEKTRRRTQCQIVIPVIEDLNVRFFHGSGKRLPGW




DNFFTAKKENRWFIQGLHKAFSDLRTHRSFYVFEVRPERTSIT




CPKCGHCEVGNRDGEAFQCLSCGKTCNADLDVATHNLTQV




ALTGKTMPKREEPRDAQGTAPARKTKKASKSKAPPAEREDQ




TPAQEPSQTS





CasΦ.3
 3
MYILEMADLKSEPSLLAKLLRDRFPGKYWLPKYWKLAEKKR




LTGGEEAACEYMADKQLDSPPPNFRPPARCVILAKSRPFEDW




PVHRVASKAQSFVIGLSEQGFAALRAAPPSTADARRDWLRS




HGASEDDLMALEAQLLETIMGNAISLHGGVLKKIDNANVKA




AKRLSGRNEARLNKGLQELPPEQEGSAYGADGLLVNPPGLN




LNIYCRKSCCPKPVKNTARFVGHYPGYLRDSDSILISGTMDR




LTIIEGMPGHIPAWQREQGLVKPGGRRRRLSGSESNMRQKVD




PSTGPRRSTRSGTVNRSNQRTGRNGDPLLVEIRMKEDWVLL




DARGLLRNLRWRESKRGLSCDHEDLSLSGLLALFSGDPVIDP




VRNEVVFLYGEGIIPVRSTKPVGTRQSKKLLERQASMGPLTLI




SCDLGQTNLIAGRASAISLTHGSLGVRSSVRIELDPEIIKSFERL




RKDADRLETEILTAAKETLSDEQRGEVNSHEKDSPQTAKASL




CRELGLHPPSLPWGQMGPSTTFIADMLISHGRDDDAFLSHGE




FPTLEKRKKFDKRFCLESRPLLSSETRKALNESLWEVKRTSSE




YARLSQRKKEMARRAVNFVVEISRRKTGLSNVIVNIEDLNVR




IFHGGGKQAPGWDGFFRPKSENRWFIQAIHKAFSDLAAHHGI




PVIESDPQRTSMTCPECGHCDSKNRNGVRFLCKGCGASMDA




DFDAACRNLERVALTGKPMPKPSTSCERLLSATTGKVCSDHS




LSHDAIEKAS





CasΦ.4
 4
MEKEITELTKIRREFPNKKFSSTDMKKAGKLLKAEGPDAVRD




FLNSCQEIIGDFKPPVKTNIVSISRPFEEWPVSMVGRAIQEYYF




SLTKEELESVHPGTSSEDHKSFFNITGLSNYNYTSVQGLNLIF




KNAKAIYDGTLVKANNKNKKLEKKFNEINHKRSLEGLPIITP




DFEEPFDENGHLNNPPGINRNIYGYQGCAAKVFVPSKHKMV




SLPKEYEGYNRDPNLSLAGFRNRLEIPEGEPGHVPWFQRMDI




PEGQIGHVNKIQRFNFVHGKNSGKVKFSDKTGRVKRYHHSK




YKDATKPYKFLEESKKVSALDSILAIITIGDDWVVFDIRGLYR




NVFYRELAQKGLTAVQLLDLFTGDPVIDPKKGVVTFSYKEG




VVPVFSQKIVPRFKSRDTLEKLTSQGPVALLSVDLGQNEPVA




ARVCSLKNINDKITLDNSCRISFLDDYKKQIKDYRDSLDELEI




KIRLEAINSLETNQQVEIRDLDVFSADRAKANTVDMFDIDPN




LISWDSMSDARVSTQISDLYLKNGGDESRVYFEINNKRIKRS




DYNISQLVRPKLSDSTRKNLNDSIWKLKRTSEEYLKLSKRKL




ELSRAVVNYTIRQSKLLSGINDIVIILEDLDVKKKFNGRGIRDI




GWDNFFSSRKENRWFIPAFHKAFSELSSNRGLCVIEVNPAWT




SATCPDCGFCSKENRDGINFTCRKCGVSYHADIDVATLNIAR




VAVLGKPMSGPADRERLGDTKKPRVARSRKTMKRKDISNST




VEAMVTA





CasΦ.5
 5
MDMLDTETNYATETPAQQQDYSPKPPKKAQRAPKGFSKKA




RPEKKPPKPITLFTQKHFSGVRFLKRVIRDASKILKLSESRTITF




LEQAIERDGSAPPDVTPPVHNTIMAVTRPFEEWPEVILSKALQ




KHCYALTKKIKIKTWPKKGPGKKCLAAWSARTKIPLIPGQVQ




ATNGLFDRIGSIYDGVEKKVTNRNANKKLEYDEAIKEGRNPA




VPEYETAYNIDGTLINKPGYNPNLYITQSRTPRLITEADRPLVE




KILWQMVEKKTQSRNQARRARLEKAAHLQGLPVPKFVPEK




VDRSQKIEIRIIDPLDKIEPYMPQDRMAIKASQDGHVPYWQRP




FLSKRRNRRVRAGWGKQVSSIQAWLTGALLVIVRLGNEAFL




ADIRGALRNAQWRKLLKPDATYQSLFNLFTGDPVVNTRTNH




LTMAYREGVVNIVKSRSFKGRQTREHLLTLLGQGKTVAGVS




FDLGQKHAAGLLAAHFGLGEDGNPVFTPIQACFLPQRYLDSL




TNYRNRYDALTLDMRRQSLLALTPAQQQEFADAQRDPGGQ




AKRACCLKLNLNPDEIRWDLVSGISTMISDLYIERGGDPRDV




HQQVETKPKGKRKSEIRILKIRDGKWAYDFRPKIADETRKAQ




REQLWKLQKASSEFERLSRYKINIARAIANWALQWGRELSG




CDIVIPVLEDLNVGSKFFDGKGKWLLGWDNRFTPKKENRWF




IKVLHKAVAELAPHRGVPVYEVMPHRTSMTCPACHYCHPTN




REGDRFECQSCHVVKNTDRDVAPYNILRVAVEGKTLDRWQ




AEKKPQAEPDRPMILIDNQES





CasΦ.6
 6
MDMLDTETNYATETPAQQQDYSPKPPKKAQRAPKGFSKKA




RPEKKPPKPITLFTQKHFSGVRFLKRVIRDASKILKLSESRTITF




LEQAIERDGSAPPDVTPPVHNTIMAVTRPFEEWPEVILSKALQ




KHCYALTKKIKIKTWPKKGPGKKCLAAWSARTKIPLIPGQVQ




ATNGLFDRIGSIYDGVEKKVTNRNANKKLEYDEAIKEGRNPA




VPEYETAYNIDGTLINKPGYNPNLYITQSRTPRLITEADRPLVE




KILWQMVEKKTQSRNQARRARLEKAAHLQGLPVPKFVPEK




VDRSQKIEIRIIDPLDKIEPYMPQDRMAIKASQDGHVPYWQRP




FLSKRRNRRVRAGWGKQVSSIQAWLTGALLVIVRLGNEAFL




ADIRGALRNAQWRKLLKPDATYQSLFNLFTGDPVVNTRTNH




LTMAYREGVVDIVKSRSFKGRQTREHLLTLLGQGKTVAGVS




FDLGQKHAAGLLAAHFGLGEDGNPVFTPIQACFLPQRYLDSL




TNYRNRYDALTLDMRRQSLLALTPAQQQEFADAQRDPGGQ




AKRACCLKLNLNPDEIRWDLVSGISTMISDLYIERGGDPRDV




HQQVETKPKGKRKSEIRILKIRDGKWAYDFRPKIADETRKAQ




REQLWKLQKASSEFERLSRYKINIARAIANWALQWGRELSG




CDIVIPVLEDLNVGSKFFDGKGKWLLGWDNRFTPKKENRWF




IKVLHKAVAELAPHKGVPVYEVMPHRTSMTCPACHYCHPTN




REGDRFECQSCHVVKNTDRDVAPYNILRVAVEGKTLDRWQ




AEKKPQAEPDRPMILIDNQES





CasΦ.7
 7
MSSLPTPLELLKQKHADLFKGLQFSSKDNKMAGKVLKKDGE




EAALAFLSERGVSRGELPNFRPPAKTLVVAQSRPFEEFPIYRV




SEAIQLYVYSLSVKELETVPSGSSTKKEHQRFFQDSSVPDFGY




TSVQGLNKIFGLARGIYLGVITRGENQLQKAKSKHEALNKKR




RASGEAETEFDPTPYEYMTPERKLAKPPGVNHSIMCYVDISV




DEFDFRNPDGIVLPSEYAGYCREINTAIEKGTVDRLGHLKGG




PGYIPGHQRKESTTEGPKINFRKGRIRRSYTALYAKRDSRRVR




QGKLALPSYRHHMMRLNSNAESAILAVIFFGKDWVVFDLRG




LLRNVRWRNLFVDGSTPSTLLGMFGDPVIDPKRGVVAFCYK




EQIVPVVSKSITKMVKAPELLNKLYLKSEDPLVLVAIDLGQT




NPVGVGVYRVMNASLDYEVVTRFALESELLREIESYRQRTN




AFEAQIRAETFDAMTSEEQEEITRVRAFSASKAKENVCHRFG




MPVDAVDWATMGSNTIHIAKWVMRHGDPSLVEVLEYRKDN




EIKLDKNGVPKKVKLTDKRIANLTSIRLRFSQETSKHYNDTM




WELRRKHPVYQKLSKSKADFSRRVVNSIIRRVNHLVPRARIV




FIIEDLKNLGKVFHGSGKRELGWDSYFEPKSENRWFIQVLHK




AFSETGKHKGYYIIECWPNWTSCTCPKCSCCDSENRHGEVFR




CLACGYTCNTDFGTAPDNLVKIATTGKGLPGPKKRCKGSSK




GKNPKIARSSETGVSVTESGAPKVKKSSPTQTSQSSSQSAP





CasΦ.8
 8
MNKIEKEKTPLAKLMNENFAGLRFPFAIIKQAGKKLLKEGEL




KTIEYMTGKGSIEPLPNFKPPVKCLIVAKRRDLKYFPICKASC




EIQSYVYSLNYKDFMDYFSTPMTSQKQHEEFFKKSGLNIEYQ




NVAGLNLIFNNVKNTYNGVILKVKNRNEKLKKKAIKNNYEF




EEIKTFNDDGCLINKPGINNVIYCFQSISPKILKNITHLPKEYND




YDCSVDRNIIQKYVSRLDIPESQPGHVPEWQRKLPEFNNTNN




PRRRRKWYSNGRNISKGYSVDQVNQAKIEDSLLAQIKIGED




WIILDIRGLLRDLNRRELISYKNKLTIKDVLGFFSDYPIIDIKKN




LVTFCYKEGVIQVVSQKSIGNKKSKQLLEKLIENKPIALVSID




LGQTNPVSVKISKLNKINNKISIESFTYRFLNEEILKEIEKYRK




DYDKLELKLINEA





CasΦ.9
 9
MDMLDTETNYATETPSQQQDYSPKPPKKDRRAPKGFSKKAR




PEKKPPKPITLFTQKHFSGVRFLKRVIRDASKILKLSESRTITFL




EQAIERDGSAPPDVTPPVHNTIMAVTRPFEEWPEVILSKALQK




HCYALTKKIKIKTWPKKGPGKKCLAAWSARTKIPLIPGQVQA




TNGLFDRIGSIYDGVEKKVTNRNANKKLEYDEAIKEGRNPAV




PEYETAYNIDGTLINKPGYNPNLYITQSRTPRLITEADRPLVEK




ILWQMVEKKTQSRNQARRARLEKAAHLQGLPVPKFVPEKV




DRSQKIEIRIIDPLDKIEPYMPQDRMAIKASQDGHVPYWQRPF




LSKRRNRRVRAGWGKQVSSIQAWLTGALLVIVRLGNEAFLA




DIRGALRNAQWRKLLKPDATYQSLFNLFTGDPVVNTRTNHL




TMAYREGVVDIVKSRSFKGRQTREHLLTLLGQGKTVAGVSF




DLGQKHAAGLLAAHFGLGEDGNPVFTPIQACFLPQRYLDSLT




NYRNRYDALTLDMRRQSLLALTPAQQQEFADAQRDPGGQA




KRACCLKLNLNPDEIRWDLVSGISTMISDLYIERGGDPRDVH




QQVETKPKGKRKSEIRILKIRDGKWAYDFRPKIADETRKAQR




EQLWKLQKASSEFERLSRYKINIARAIANWALQWGRELSGC




DIVIPVLEDLNVGSKFFDGKGKWLLGWDNRFTPKKENRWFI




KVLHKAVAELAPHRGVPVYEVMPHRTSMTCPACHYCHPTN




REGDRFECQSCHVVKNTDRDVAPYNILRVAVEGKTLDRWQ




AEKKPQAEPDRPMILIDNQES





CasΦ.10
10
MDMLDTETNYATETPSQQQDYSPKPPKKDRRAPKGFSKKAR




PEKKPPKPITLFTQKHFSGVRFLKRVIRDASKILKLSESRTITFL




EQAIERDGSAPPDVTPPVHNTIMAVTRPFEEWPEVILSKALQK




HCYALTKKIKIKTWPKKGPGKKCLAAWSARTKIPLIPGQVQA




TNGLFDRIGSIYDGVEKKVTNRNANKKLEYDEAIKEGRNPAV




PEYETAYNIDGTLINKPGYNPNLYITQSRTPRLITEADRPLVEK




ILWQMVEKKTQSRNQARRARLEKAAHLQGLPVPKFVPEKV




DRSQKIEIRIIDPLDKIEPYMPQDRMAIKASQDGHVPYWQRPF




LSKRRNRRVRAGWGKQVSSIQAWLTGALLVIVRLGNEAFLA




DIRGALRNAQWRKLLKPDATYQSLFNLFTGDPVVNTRTNHL




TMAYREGVVNIVKSRSFKGRQTREHLLTLLGQGKTVAGVSF




DLGQKHAAGLLAAHFGLGEDGNPVFTPIQACFLPQRYLDSLT




NYRNRYDALTLDMRRQSLLALTPAQQQEFADAQRDPGGQA




KRACCLKLNLNPDEIRWDLVSGISTMISDLYIERGGDPRDVH




QQVETKPKGKRKSEIRILKIRDGKWAYDFRPKIADETRKAQR




EQLWKLQKASSEFERLSRYKINIARAIANWALQWGRELSGC




DIVIPVLEDLNVGSKFFDGKGKWLLGWDNRFTPKKENRWFI




KVLHKAVAELAPHRGVPVYEVMPHRTSMTCPACHYCHPTN




REGDRFECQSCHVVKNTDRDVAPYNILRVAVEGKTLDRWQ




AEKKPQAEPDRPMILIDNQES





CasΦ.11
11
MSNKTTPPSPLSLLLRAHFPGLKFESQDYKIAGKKLRDGGPE




AVISYLTGKGQAKLKDVKPPAKAFVIAQSRPFIEWDLVRVSR




QIQEKIFGIPATKGRPKQDGLSETAFNEAVASLEVDGKSKLNE




ETRAAFYEVLGLDAPSLHAQAQNALIKSAISIREGVLKKVEN




RNEKNLSKTKRRKEAGEEATFVEEKAHDERGYLIHPPGVNQ




TIPGYQAVVIKSCPSDFIGLPSGCLAKESAEALTDYLPHDRMT




IPKGQPGYVPEWQHPLLNRRKNRRRRDWYSASLNKPKATCS




KRSGTPNRKNSRTDQIQSGRFKGAIPVLMRFQDEWVIIDIRGL




LRNARYRKLLKEKSTIPDLLSLFTGDPSIDMRQGVCTFIYKAG




QACSAKMVKTKNAPEILSELTKSGPVVLVSIDLGQTNPIAAK




VSRVTQLSDGQLSHETLLRELLSNDSSDGKEIARYRVASDRL




RDKLANLAVERLSPEHKSEILRAKNDTPALCKARVCAALGL




NPEMIAWDKMTPYTEFLATAYLEKGGDRKVATLKPKNRPE




MLRRDIKFKGTEGVRIEVSPEAAEAYREAQWDLQRTSPEYLR




LSTWKQELTKRILNQLRHKAAKSSQCEVVVMAFEDLNIKMM




HGNGKWADGGWDAFFIKKRENRWFMQAFHKSLTELGAHK




GVPTIEVTPHRTSITCTKCGHCDKANRDGERFACQKCGFVAH




ADLEIATDNIERVALTGKPMPKPESERSGDAKKSVGARKAAF




KPEEDAEAAE





CasΦ.12
12
MIKPTVSQFLTPGFKLIRNHSRTAGLKLKNEGEEACKKFVRE




NEIPKDECPNFQGGPAIANIIAKSREFTEWEIYQSSLAIQEVIFT




LPKDKLPEPILKEEWRAQWLSEHGLDTVPYKEAAGLNLIIKN




AVNTYKGVQVKVDNKNKNNLAKINRKNEIAKLNGEQEISFE




EIKAFDDKGYLLQKPSPNKSIYCYQSVSPKPFITSKYHNVNLP




EEYIGYYRKSNEPIVSPYQFDRLRIPIGEPGYVPKWQYTFLSK




KENKRRKLSKRIKNVSPILGIICIKKDWCVFDMRGLLRTNHW




KKYHKPTDSINDLFDYFTGDPVIDTKANVVRFRYKMENGIV




NYKPVREKKGKELLENICDQNGSCKLATVDVGQNNPVAIGL




FELKKVNGELTKTLISRHPTPIDFCNKITAYRERYDKLESSIKL




DAIKQLTSEQKIEVDNYNNNFTPQNTKQIVCSKLNINPNDLP




WDKMISGTHFISEKAQVSNKSEIYFTSTDKGKTKDVMKSDY




KWFQDYKPKLSKEVRDALSDIEWRLRRESLEFNKLSKSREQ




DARQLANWISSMCDVIGIENLVKKNNFFGGSGKREPGWDNF




YKPKKENRWWINAIHKALTELSQNKGKRVILLPAMRTSITCP




KCKYCDSKNRNGEKFNCLKCGIELNADIDVATENLATVAITA




QSMPKPTCERSGDAKKPVRARKAKAPEFHDKLAPSYTVVLR




EAV





CasΦ.13
13
MRQPAEKTAFQVFRQEVIGTQKLSGGDAKTAGRLYKQGKM




EAAREWLLKGARDDVPPNFQPPAKCLVVAVSHPFEEWDISK




TNHDVQAYIYAQPLQAEGHLNGLSEKWEDTSADQHKLWFE




KTGVPDRGLPVQAINKIAKAAVNRAFGVVRKVENRNEKRRS




RDNRIAEHNRENGLTEVVREAPEVATNADGFLLHPPGIDPSIL




SYASVSPVPYNSSKHSFVRLPEEYQAYNVEPDAPIPQFVVED




RFAIPPGQPGYVPEWQRLKCSTNKHRRMRQWSNQDYKPKA




GRRAKPLEFQAHLTRERAKGALLVVMRIKEDWVVFDVRGL




LRNVEWRKVLSEEAREKLTLKGLLDLFTGDPVIDTKRGIVTF




LYKAEITKILSKRTVKTKNARDLLLRLTEPGEDGLRREVGLV




AVDLGQTHPIAAAIYRIGRTSAGALESTVLHRQGLREDQKEK




LKEYRKRHTALDSRLRKEAFETLSVEQQKEIVTVSGSGAQIT




KDKVCNYLGVDPSTLPWEKMGSYTHFISDDFLRRGGDPNIV




HFDRQPKKGKVSKKSQRIKRSDSQWVGRMRPRLSQETAKAR




MEADWAAQNENEEYKRLARSKQELARWCVNTLLQNTRCIT




QCDEIVVVIEDLNVKSLHGKGAREPGWDNFFTPKTENRWFIQ




ILHKTFSELPKHRGEHVIEGCPLRTSITCPACSYCDKNSRNGE




KFVCVACGATFHADFEVATYNLVRLATTGMPMPKSLERQG




GGEKAGGARKARKKAKQVEKIVVQANANVTMNGASLHSP





CasΦ.14
14
MSSLPTPLELLKQKHADLFKGLQFSSKDNKMAGKVLKKDGE




EAALAFLSERGVSRGELPNFRPPAKTLVVAQSRPFEEFPIYRV




SEAIQLYVYSLSVKELETVPSGSSTKKEHQRFFQDSSVPDFGY




TSVQGLNKIFGLARGIYLGVITRGENQLQKAKSKHEALNKKR




RASGEAETEFDPTPYEYMTPERKLAKPPGVNHSIMCYVDISV




DEFDFRNPDGIVLPSEYAGYCREINTAIEKGTVDRLGHLKGG




PGYIPGHQRKESTTEGPKINFRKGRIRRSYTALYAKRDSRRVR




QGKLALPSYRHHMMRLNSNAESAILAVIFFGKDWVVFDLRG




LLRNVRWRNLFVDGSTPSTLLGMFGDPVIDPKRGVVAFCYK




EQIVPVVSKSITKMVKAPELLNKLYLKSEDPLVLVAIDLGQT




NPVGVGVYRVMNASLDYEVVTRFALESELLREIESYRQRTN




AFEAQIRAETFDAMTSEEQEEITRVRAFSASKAKENVCHRFG




MPVDAVDWATMGSNTIHIAKWVMRHGDPSLVEVLEYRKDN




EIKLDKNGVPKKVKLTDKRIANLTSIRLRFSQETSKHYNDTM




WELRRKHPVYQKLSKSKADFSRRVVNSIIRRVNHLVPRARIV




FIIEDLKNLGKVFHGSGKRELGWDSYFEPKSENRWFIQVLHK




AFSETGKHKGYYIIECWPNWTSCTCPKCSCCDSENRHGEVFR




CLACGYTCNTDFGTAPDNLVKIATTGKGLPGPKKRCKGSSK




GKNPKIARSSETGVSVTESGAPKVKKSSPTQTSQSSSQSAP


CasΦ.15
15
MIKPTVSQFLTPGFKLIRNHSRTAGLKLKNEGEEACKKFVRE







NEIPKDECPNFQGGPAIANIIAKSREFTEWEIYQSSLAIQEVIFT




LPKDKLPEPILKEEWRAQWLSEHGLDTVPYKEAAGLNLIIKN




AVNTYKGVQVKVDNKNKNNLAKINRKNEIAKLNGEQEISFE




EIKAFDDKGYLLQKPSPNKSIYCYQSVSPKPFITSKYHNVNLP




EEYIGYYRKSNEPIVSPYQFDRLRIPIGEPGYVPKWQYTFLSK




KENKRRKLSKRIKNVSPILGIICIKKDWCVFDMRGLLRTNHW




KKYHKPTDSINDLFDYFTGDPVIDTKANVVRFRYKMENGIV




NYKPVREKKGKELLENICDQNGSCKLATVDVGQNNPVAIGL




FELKKVNGELTKTLISRHPTPIDFCNKITAYRERYDKLESSIKL




DAIKQLTSEQKIEVDNYNNNFTPQNTKQIVCSKLNINPNDLP




WDKMISGTHFISEKAQVSNKSEIYFTSTDKGKTKDVMKSDY




KWFQDYKPKLSKEVRDALSDIEWRLRRESLEFNKLSKSREQ




DARQLANWISSMCDVIGIENLVKKNNFFGGSGKREPGWDNF




YKPKKENRWWINAIHKALTELSQNKGKRVILLPAMRTSITCP




KCKYCDSKNRNGEKFNCLKCGIELNADIDVATENLATVAITA




QSMPKPTCERSGDAKKPVRARKAKAPEFHDKLAPSYTVVLR




EAV





CasΦ.16
16
MSNKTTPPSPLSLLLRAHFPGLKFESQDYKIAGKKLRDGGPE




AVISYLTGKGQAKLKDVKPPAKAFVIAQSRPFIEWDLVRVSR




QIQEKIFGIPATKGRPKQDGLSETAFNEAVASLEVDGKSKLNE




ETRAAFYEVLGLDAPSLHAQAQNALIKSAISIREGVLKKVEN




RNEKNLSKTKRRKEAGEEATFVEEKAHDERGYLIHPPGVNQ




TIPGYQAVVIKSCPSDFIGLPSGCLAKESAEALTDYLPHDRMT




IPKGQPGYVPEWQHPLLNRRKNRRRRDWYSASLNKPKATCS




KRSGTPNRKNSRTDQIQSGRFKGAIPVLMRFQDEWVIIDIRGL




LRNARYRKLLKEKSTIPDLLSLFTGDPSIDMRQGVCTFIYKAG




QACSAKMVKTKNAPEILSELTKSGPVVLVSIDLGQTNPIAAK




VSRVTQLSDGQLSHETLLRELLSNDSSDGKEIARYRVASDRL




RDKLANLAVERLSPEHKSEILRAKNDTPALCKARVCAALGL




NPEMIAWDKMTPYTEFLATAYLEKGGDRKVATLKPKNRPE




MLRRDIKFKGTEGVRIEVSPEAAEAYREAQWDLQRTSPEYLR




LSTWKQELTKRILNQLRHKAAKSSQCEVVVMAFEDLNIKMM




HGNGKWADGGWDAFFIKKRENRWFMQAFHKSLTELGAHK




GVPTIEVTPHRTSITCTKCGHCDKANRDGERFACQKCGFVAH




ADLEIATDNIERVALTGKPMPKPESERSGDAKKSVGARKAAF




KPEEDAEAAE





CasΦ.17
17
MYSLEMADLKSEPSLLAKLLRDRFPGKYWLPKYWKLAEKK




RLTGGEEAACEYMADKQLDSPPPNFRPPARCVILAKSRPFED




WPVHRVASKAQSFVIGLSEQGFAALRAAPPSTADARRDWLR




SHGASEDDLMALEAQLLETIMGNAISLHGGVLKKIDNANVK




AAKRLSGRNEARLNKGLQELPPEQEGSAYGADGLLVNPPGL




NLNIYCRKSCCPKPVKNTARFVGHYPGYLRDSDSILISGTMD




RLTIIEGMPGHIPAWQREQGLVKPGGRRRRLSGSESNMRQKV




DPSTGPRRSTRSGTVNRSNQRTGRNGDPLLVEIRMKEDWVL




LDARGLLRNLRWRESKRGLSCDHEDLSLSGLLALFSGDPVID




PVRNEVVFLYGEGIIPVRSTKPVGTRQSKKLLERQASMGPLT




LISCDLGQTNLIAGRASAISLTHGSLGVRSSVRIELDPEIIKSFE




RLRKDADRLETEILTAAKETLSDEQRGEVNSHEKDSPQTAKA




SLCRELGLHPPSLPWGQMGPSTTFIADMLISHGRDDDAFLSH




GEFPTLEKRKKFDKRFCLESRPLLSSETRKALNESLWEVKRTS




SEYARLSQRKKEMARRAVNFVVEISRRKTGLSNVIVNIEDLN




VRIFHGGGKQAPGWDGFFRPKSENRWFIQAIHKAFSDLAAH




HGIPVIESDPQRTSMTCPECGHCDSKNRNGVRFLCKGCGASM




DADFDAACRNLERVALTGKPMPKPSTSCERLLSATTGKVCS




DHSLSHDAIEKAS





CasΦ.18
18
MEKEITELTKIRREFPNKKFSSTDMKKAGKLLKAEGPDAVRD




FLNSCQEIIGDFKPPVKTNIVSISRPFEEWPVSMVGRAIQEYYF




SLTKEELESVHPGTSSEDHKSFFNITGLSNYNYTSVQGLNLIF




KNAKAIYDGTLVKANNKNKKLEKKFNEINHKRSLEGLPIITP




DFEEPFDENGHLNNPPGINRNIYGYQGCAAKVFVPSKHKMV




SLPKEYEGYNRDPNLSLAGFRNRLEIPEGEPGHVPWFQRMDI




PEGQIGHVNKIQRFNFVHGKNSGKVKFSDKTGRVKRYHHSK




YKDATKPYKFLEESKKVSALDSILAIITIGDDWVVFDIRGLYR




NVFYRELAQKGLTAVQLLDLFTGDPVIDPKKGVVTFSYKEG




VVPVFSQKIVPRFKSRDTLEKLTSQGPVALLSVDLGQNEPVA




ARVCSLKNINDKITLDNSCRISFLDDYKKQIKDYRDSLDELEI




KIRLEAINSLETNQQVEIRDLDVFSADRAKANTVDMFDIDPN




LISWDSMSDARVSTQISDLYLKNGGDESRVYFEINNKRIKRS




DYNISQLVRPKLSDSTRKNLNDSIWKLKRTSEEYLKLSKRKL




ELSRAVVNYTIRQSKLLSGINDIVIILEDLDVKKKFNGRGIRDI




GWDNFFSSRKENRWFIPAFHKTFSELSSNRGLCVIEVNPAWT




SATCPDCGFCSKENRDGINFTCRKCGVSYHADIDVATLNIAR




VAVLGKPMSGPADRERLGDTKKPRVARSRKTMKRKDISNST




VEAMVTA





CasΦ.19
19
MLVRTSTLVQDNKNSRSASRAFLKKPKMPKNKHIKEPTELA




KLIRELFPGQRFTRAINTQAGKILKHKGRDEVVEFLKNKGIDK




EQFMDFRPPTKARIVATSGAIEEFSYLRVSMAIQECCFGKYKF




PKEKVNGKLVLETVGLTKEELDDFLPKKYYENKKSRDRFFL




KTGICDYGYTYAQGLNEIFRNTRAIYEGVFTKVNNRNEKRRE




KKDKYNEERRSKGLSEEPYDEDESATDESGHLINPPGVNLNI




WTCEGFCKGPYVTKLSGTPGYEVILPKVFDGYNRDPNEIISC




GITDRFAIPEGEPGHIPWHQRLEIPEGQPGYVPGHQRFADTGQ




NNSGKANPNKKGRMRKYYGHGTKYTQPGEYQEVFRKGHRE




GNKRRYWEEDFRSEAHDCILYVIHIGDDWVVCDLRGPLRDA




YRRGLVPKEGITTQELCNLFSGDPVIDPKHGVVTFCYKNGLV




RAQKTISAGKKSRELLGALTSQGPIALIGVDLGQTEPVGARAF




IVNQARGSLSLPTLKGSFLLTAENSSSWNVFKGEIKAYREAID




DLAIRLKKEAVATLSVEQQTEIESYEAFSAEDAKQLACEKFG




VDSSFILWEDMTPYHTGPATYYFAKQFLKKNGGNKSLIEYIP




YQKKKSKKTPKAVLRSDYNIACCVRPKLLPETRKALNEAIRI




VQKNSDEYQRLSKRKLEFCRRVVNYLVRKAKKLTGLERVII




AIEDLKSLEKFFTGSGKRDNGWSNFFRPKKENRWFIPAFHKA




FSELAPNRGFYVIECNPARTSITDPDCGYCDGDNRDGIKFECK




KCGAKHHTDLDVAPLNIAIVAVTGRPMPKTVSNKSKRERSG




GEKSVGASRKRNHRKSKANQEMLDATSSAAE





CasΦ.20
20
MPKIKKPTEISLLRKEVFPDLHFAKDRMRAASLVLKNEGREA




AIEYLRVNHEDKPPNFMPPAKTPYVALSRPLEQWPIAQASIAI




QKYIFGLTKDEFSATKKLLYGDKSTPNTESRKRWFEVTGVPN




FGYMSAQGLNAIFSGALARYEGVVQKVENRNKKRFEKLSEK




NQLLIEEGQPVKDYVPDTAYHTPETLQKLAENNHVRVEDLG




DMIDRLVHPPGIHRSIYGYQQVPPFAYDPDNPKGIILPKAYAG




YTRKPHDIIEAMPNRLNIPEGQAGYIPEHQRDKLKKGGRVKR




LRTTRVRVDATETVRAKAEALNAEKARLRGKEAILAVFQIEE




DWALIDMRGLLRNVYMRKLIAAGELTPTTLLGYFTETLTLDP




RRTEATFCYHLRSEGALHAEYVRHGKNTRELLLDLTKDNEKI




ALVTIDLGQRNPLAAAIFRVGRDASGDLTENSLEPVSRMLLP




QAYLDQIKAYRDAYDSFRQNIWDTALASLTPEQQRQILAYE




AYTPDDSKENVLRLLLGGNVMPDDLPWEDMTKNTHYISDR




YLADGGDPSKVWFVPGPRKRKKNAPPLKKPPKPRELVKRSD




HNISHLSEFRPQLLKETRDAFEKAKIDTERGHVGYQKLSTRK




DQLCKEILNWLEAEAVRLTRCKTMVLGLEDLNGPFFNQGKG




KVRGWVSFFRQKQENRWIVNGFRKNALARAHDKGKYILEL




WPSWTSQTCPKCKHVHADNRHGDDFVCLQCGARLHADAEV




ATWNLAVVAIQGHSLPGPVREKSNDRKKSGSARKSKKANES




GKVVGAWAAQATPKRATSKKETGTARNPVYNPLETQASCP




AP





CasΦ.21
21
MTPSPQIARLVETPLAAALKAHHPGKKFRSDYLKKAGKILKD




QGVEAAMAHLDGKDQAEPPNFKPPAKCRIVARSREFSEWPI




VKASVEIQKYIYGLTLEERKACDPGKSSASHKAWFAKTGVN




TFGYSSVQGFNLIFGHTLGRYDGVLVKTENLNKKRAEKNER




FRAKALAEGRAEPVCPPLVTATNDTGQDVTLEDGRVVRPGQ




LLQPPGINPNIYAYQQVSPKAYVPGIIELPEEFQGYSRDPNAVI




LPLVPRDRLSIPKGQPGYVPEPHREGLTGRKDRRMRRYYETE




RGTKLKRPPLTAKGRADKANEALLVVVRIDSDWVVMDVRG




LLRNARWRRLVSKEGITLNGLLDLFTGDPVLNPKDCSVSRDT




GDPVNDPRHGVVTFCYKLGVVDVCSKDRPIKGFRTKEVLER




LTSSGTVGMVSIDLGQTNPVAAAVSRVTKGLQAETLETFTLP




DDLLGKVRAYRAKTDRMEEGFRRNALRKLTAEQQAEITRYN




DATEQQAKALVCSTYGIGPEEVPWERMTSNTTYISDHILDHG




GDPDTVFFMATKRGQNKPTLHKRKDKAWGQKFRPAISVETR




LARQAAEWELRRASLEFQKLSVWKTELCRQAVNYVMERTK




KRTQCDVIIPVIEDLPVPLFHGSGKRDPGWANFFVHKRENRW




FIDGLHKAFSELGKHRGIYVFEVCPQRTSITCPKCGHCDPDNR




DGEKFVCLSCQATLNADLDVATTNLVRVALTGKVMPRSERS




GDAQTPGPARKARTGKIKGSKPTSAPQGATQTDAKAHLSQT




GV





CasΦ.22
22
MTPSPQIARLVETPLAAALKAHHPGKKFRSDYLKKAGKILKD




QGVEAAMAHLDGKDQAEPPNFKPPAKCRIVARSREFSEWPI




VKASVEIQKYIYGLTLEERKACDPGKSSASHKAWFAKTGVN




TFGYSSVQGFNLIFGHTLGRYDGVLVKTENLNKKRAEKNER




FRAKALAEGRAEPVCPPLVTATNDTGQDVTLEDGRVVRPGQ




LLQPPGINPNIYAYQQVSPKAYVPGIIELPEEFQGYSRDPNAVI




LPLVPRDRLSIPKGQPGYVPEPHREGLTGRKDRRMRRYYETE




RGTKLKRPPLTAKGRADKANEALLVVVRIDSDWVVMDVRG




LLRNARWRRLVSKEGITLNGLLDLFTGDPVLNPKDCSVSRDT




GDPVNDPRHGVVTFCYKLGVVDVCSKDRPIKGFRTKEVLER




LTSSGTVGMVSIDLGQTNPVAAAVSRVTKGLQAETLETFTLP




DDLLGKVRAYRAKTDRMEEGFRRNALRKLTAEQQAEITRYN




DATEQQAKALVCSTYGIGPEEVPWERMTSNTTYISDHILDHG




GDPDTVFFMATKRGQNKPTLHKRKDKAWGQKFRPAISVETR




LARQAAEWELRRASLEFQKLSVWKTELCRQAVNYVMERTK




KRTQCDVIIPVIEDLPVPLFHGSGKRDPGWANFFVHKRENRW




FIDGLHKAFSELGKHRGIYVFEVCPQRTSITCPKCGHCDPDNR




DGEKFVCLSCQATLHADLDVATTNLVRVALTGKVMPRSERS




GDAQTPGPARKARTGKIKGSKPTSAPQGATQTDAKAHLSQT




GV





CasΦ.23
23
MKTEKPKTALTLLREEVFPGKKYRLDVLKEAGKKLSTKGRE




ATIEFLTGKDEERPQNFQPPAKTSIVAQSRPFDQWPIVQVSLA




VQKYIYGLTQSEFEANKKALYGETGKAISTESRRAWFEATGV




DNFGFTAAQGINPIFSQAVARYEGVIKKVENRNEKKLKKLTK




KNLLRLESGEEIEDFEPEATFNEEGRLLQPPGANPNIYCYQQIS




PRIYDPSDPKGVILPQIYAGYDRKPEDIISAGVPNRLAIPEGQP




GYIPEHQRAGLKTQGRIRCRASVEAKARAAILAVVHLGEDW




VVLDLRGLLRNVYWRKLASPGTLTLKGLLDFFTGGPVLDAR




RGIATFSYTLKSAAAVHAENTYKGKGTREVLLKLTENNSVA




LVTVDLGQRNPLAAMIARVSRTSQGDLTYPESVEPLTRLFLP




DPFLEEVRKYRSSYDALRLSIREAAIASLTPEQQAEIRYIEKFS




AGDAKKNVAEVFGIDPTQLPWDAMTPRTTYISDLFLRMGGD




RSRVFFEVPPKKAKKAPKKPPKKPAGPRIVKRTDGMIARLREI




RPRLSAETNKAFQEARWEGERSNVAFQKLSVRRKQFARTVV




NHLVQTAQKMSRCDTVVLGIEDLNVPFFHGRGKYQPGWEG




FFRQKKENRWLINDMHKALSERGPHRGGYVLELTPFWTSLR




CPKCGHTDSANRDGDDFVCVKCGAKLHSDLEVATANLALV




AITGQSIPRPPREQSSGKKSTGTARMKKTSGETQGKGSKACV




SEALNKIEQGTARDPVYNPLNSQVSCPAP





CasΦ.24
24
VYNPDMKKPNNIRRIREEHFEGLCFGKDVLTKAGKIYEKDGE




EAAIDFLMGKDEEDPPNFKPPAKTTIVAQSRPFDQWPIYQVS




QAVQERVFAYTEEEFNASKEALFSGDISSKSRDFWFKTNNIS




DQGIGAQGLNTILSHAFSRYSGVIKKVENRNKKRLKKLSKKN




QLKIEEGLEILEFKPDSAFNENGLLAQPPGINPNIYGYQAVTPF




VFDPDNPGDVILPKQYEGYSRKPDDIIEKGPSRLDIPKGQPGY




VPEHQRKNLKKKGRVRLYRRTPPKTKALASILAVLQIGKDW




VLFDMRGLLRSVYMREAATPGQISAKDLLDTFTGCPVLNTR




TGEFTFCYKLRSEGALHARKIYTKGETRTLLTSLTSENNTIAL




VTVDLGQRNPAAIMISRLSRKEELSEKDIQPVSRRLLPDRYLN




ELKRYRDAYDAFRQEVRDEAFTSLCPEHQEQVQQYEALTPE




KAKNLVLKHFFGTHDPDLPWDDMTSNTHYIANLYLERGGDP




SKVFFTRPLKKDSKSKKPRKPTKRTDASISRLPEIRPKMPEDA




RKAFEKAKWEIYTGHEKFPKLAKRVNQLCREIANWIEKEAK




RLTLCDTVVVGIEDLSLPPKRGKGKFQETWQGFFRQKFENR




WVIDTLKKAIQNRAHDKGKYVLGLAPYWTSQRCPACGFIHK




SNRNGDHFKCLKCEALFHADSEVATWNLALVAVLGKGITNP




DSKKPSGQKKTGTTRKKQIKGKNKGKETVNVPPTTQEVEDII




AFFEKDDETVRNPVYKPTGT





CasΦ.25
25
MKKPNNIRRIREEHFEGLCFGKDVLTKAGKIYEKDGEEAAID




FLMGKDEEDPPNFKPPAKTTIVAQSRPFDQWPIYQVSQAVQE




RVFAYTEEEFNASKEALFSGDISSKSRDFWFKTNNISDQGIGA




QGLNTILSHAFSRYSGVIKKVENRNKKRLKKLSKKNQLKIEE




GLEILEFKPDSAFNENGLLAQPPGINPNIYGYQAVTPFVFDPD




NPGDVILPKQYEGYSRKPDDIIEKGPSRLDIPKGQPGYVPEHQ




RKNLKKKGRVRLYRRTPPKTKALASILAVLQIGKDWVLFDM




RGLLRSVYMREAATPGQISAKDLLDTFTGCPVLNTRTGEFTF




CYKLRSEGALHARKIYTKGETRTLLTSLTSENNTIALVTVDL




GQRNPAAIMISRLSRKEELSEKDIQPVSRRLLPDRYLNELKRY




RDAYDAFRQEVRDEAFTSLCPEHQEQVQQYEALTPEKAKNL




VLKHFFGTHDPDLPWDDMTSNTHYIANLYLERGGDPSKVFF




TRPLKKDSKSKKPRKPTKRTDASISRLPEIRPKMPEDARKAFE




KAKWEIYTGHEKFPKLAKRVNQLCREIANWIEKEAKRLTLC




DTVVVGIEDLSLPPKRGKGKFQETWQGFFRQKFENRWVIDT




LKKAIQNRAHDKGKYVLGLAPYWTSQRCPACGFIHKSNRNG




DHFKCLKCEALFHADSEVATWNLALVAVLGKGITNPDSKKP




SGQKKTGTTRKKQIKGKNKGKETVNVPPTTQEVEDIIAFFEK




DDETVRNPVYKPTGT





CasΦ.26
26
VIKTHFPAGRFRKDHQKTAGKKLKHEGEEACVEYLRNKVSD




YPPNFKPPAKGTIVAQSRPFSEWPIVRASEAIQKYVYGLTVAE




LDVFSPGTSKPSHAEWFAKTGVENYGYRQVQGLNTIFQNTV




NRFKGVLKKVENRNKKSLKRQEGANRRRVEEGLPEVPVTVE




SATDDEGRLLQPPGVNPSIYGYQGVAPRVCTDLQGFSGMSV




DFAGYRRDPDAVLVESLPEGRLSIPKGERGYVPEWQRDPERN




KFPLREGSRRQRKWYSNACHKPKPGRTSKYDPEALKKASAK




DALLVSISIGEDWAIIDVRGLLRDARRRGFTPEEGLSLNSLLG




LFTEYPVFDVQRGLITFTYKLGQVDVHSRKTVPTFRSRALLES




LVAKEEIALVSVDLGQTNPASMKVSRVRAQEGALVAEPVHR




MFLSDVLLGELSSYRKRMDAFEDAIRAQAFETMTPEQQAEIT




RVCDVSVEVARRRVCEKYSISPQDVPWGEMTGHSTFIVDAV




LRKGGDESLVYFKNKEGETLKFRDLRISRMEGVRPRLTKDTR




DALNKAVLDLKRAHPTFAKLAKQKLELARRCVNFIEREAKR




YTQCERVVFVIEDLNVGFFHGKGKRDRGWDAFFTAKKENR




WVIQALHKAFSDLGLHRGSYVIEVTPQRTSMTCPRCGHCDK




GNRNGEKFVCLQCGATLHADLEVATDNIERVALTGKAMPKP




PVRERSGDVQKAGTARKARKPLKPKQKTEPSVQEGSSDDGV




DKSPGDASRNPVYNPSDTLSI





CasΦ.27
27
MAKAKTLAALLRELLPGQHLAPHHRWVANKLLMTSGDAAA




FVIGKSVSDPVRGSFRKDVITKAGRIFKKDGPDAAAAFLDGK




WEDRPPNFQPPAKAAIVAISRSFDEWPIVKVSCAIQQYLYALP




VQEFESSVPEARAQAHAAWFQDTGVDDCNFKSTQGLNAIFN




HGKRTYEGVLKKAQNRNDKKNLRLERINAKRAEAGQAPLV




AGPDESPTDDAGCLLHPPGINANIYCYQQVSPRPYEQSCGIQL




PPEYAGYNRLSNVAIPPMPNRLDIPQGQPGYVPEHHRHGIKK




FGRVRKRYGVVPGRNRDADGKRTRQVLTEAGAAAKARDSV




LAVIRIGDDWTVVDLRGLLRNAQWRKLVPDGGITVQGLLDL




FTGDPVIDPRRGVVTFIYKADSVGIHSEKVCRGKQSKNLLER




LCAMPEKSSTRLDCARQAVALVSVDLGQRNPVAARFSRVSL




AEGQLQAQLVSAQFLDDAMVAMIRSYREEYDRFESLVREQA




KAALSPEQLSEIVRHEADSAESVKSCVCAKFGIDPAGLSWDK




MTSGTWRIADHVQAAGGDVEWFFFKTCGKGKEIKTVRRSDF




NVAKQFRLRLSPETRKDWNDAIWELKRGNPAYVSFSKRKSE




FARRVVNDLVHRARRAVRCDEVVFAIEDLNISFFHGKGQRQ




MGWDAFFEVKQENRWFIQALHKAFVERATHKGGYVLEVAP




ARTSTTCPECRHCDPESRRGEQFCCIKCRHTCHADLEVATFNI




EQVALTGVSLPKRLSSTLL





CasΦ.28
28
MSKEKTPPSAYAILKAKHFPDLDFEKKHKMMAGRMFKNGA




SEQEVVQYLQGKGSESLMDVKPPAKSPILAQSRPFDEWEMV




RTSRLIQETIFGIPKRGSIPKRDGLSETQFNELVASLEVGGKPM




LNKQTRAIFYGLLGIKPPTFHAMAQNILIDLAINIRKGVLKKV




DNLNEKNRKKVKRIRDAGEQDVMVPAEVTAHDDRGYLNHP




PGVNPTIPGYQGVVIPFPEGFEGLPSGMTPVDWSHVLVDYLP




HDRLSIPKGSPGYIPEWQRPLLNRHKGRRHRSWYANSLNKPR




KSRTEEAKDRQNAGKRTALIEAERLKGVLPVLMRFKEDWLII




DARGLLRNARYRGVLPEGSTLGNLIDLFSDSPRVDTRRGICTF




LYRKGRAYSTKPVKRKESKETLLKLTEKSTIALVSIDLGQTNP




LTAKLSKVRQVDGCLVAEPVLRKLIDNASEDGKEIARYRVA




HDLLRARILEDAIDLLGIYKDEVVRARSDTPDLCKERVCRFL




GLDSQAIDWDRMTPYTDFIAQAFVAKGGDPKVVTIKPNGKP




KMFRKDRSIKNMKGIRLDISKEASSAYREAQWAIQRESPDFQ




RLAVWQSQLTKRIVNQLVAWAKKCTQCDTVVLAFEDLNIG




MMHGSGKWANGGWNALFLHKQENRWFMQAFHKALTELS




AHKGIPTIEVLPHRTSITCTQCGHCHPGNRDGERFKCLKCEFL




ANTDLEIATDNIERVALTGLPMPKGERSSAKRKPGGTRKTKK




SKHSGNSPLAAE





CasΦ.29
29
MEKAGPTSPLSVLIHKNFEGCRFQIDHLKIAGRKLAREGEAA




AIEYLLDKKCEGLPPNFQPPAKGNVIAQSRPFTEWAPYRASV




AIQKYIYSLSVDERKVCDPGSSSDSHEKWFKQTGVQNYGYT




HVQGLNLIFKHALARYDGVLKKVDNRNEKNRKKAERVNSF




RREEGLPEEVFEEEKATDETGHLLQPPGVNHSIYCYQSVRPK




PFNPRKPGGISLPEAYSGYSLKPQDELPIGSLDRLSIPPGQPGY




VPEWQRSQLTTQKHRRKRSWYSAQKWKPRTGRTSTFDPDR




LNCARAQGAILAVVRIHEDWVVFDVRGLLRNALWRELAGK




GLTVRDLLDFFTGDPVVDTKRGVVTFTYKLGKVDVHSLRTV




RGKRSKKVLEDLTLSSDVGLVTIDLGQTNVLAADYSKVTRSE




NGELLAVPLSKSFLPKHLLHEVTAYRTSYDQMEEGFRRKALL




TLTEDQQVEVTLVRDFSVESSKTKLLQLGVDVTSLPWEKMS




SNTTYISDQLLQQGADPASLFFDGERDGKPCRHKKKDRTWA




YLVRPKVSPETRKALNEALWALKNTSPEFESLSKRKIQFSRR




CMNYLLNEAKRISGCGQVVFVIEDLNVRVHHGRGKRAIGWD




NFFKPKRENRWFMQALHKAASELAIHRGMHIIEACPARSSIT




CPKCGHCDPENRCSSDREKFLCVKCGAAFHADLEVATFNLR




KVALTGTALPKSIDHSRDGLIPKGARNRKLKEPQANDEKACA





CasΦ.30
30
MKEQSPLSSVLKSNFPGKKFLSADIRVAGRKLAQLGEAAAVE




YLSPRQRDSVPNFRPPAFCTVVAKSRPFEEWPIYKASVLLQE




QIYGMTGQEFEERCGSIPTSLSGLRQWASSVGLGAAMEGLH




VQGMNLMVKNAINRYKGVLVKVENRNKKLVEANEAKNSS




REERGLPPLRPPELGSAFGPDGRLVNPPGIDKSIRLYQGVSPV




PVVKTTGRPTVHRLDIPAGEKGHVPLWQREAGLVKEGPRRR




RMWYSNSNLKRSRKDRSAEASEARKADSVVVRVSVKEDWV




DIDVRGLLRNVAWRGIERAGESTEDLLSLFSGDPVVDPSRDS




VVFLYKEGVVDVLSKKVVGAGKSRKQLEKMVSEGPVALVS




CDLGQTNYVAARVSVLDESLSPVRSFRVDPREFPSADGSQGV




VGSLDRIRADSDRLEAKLLSEAEASLPEPVRAEIEFLRSERPSA




VAGRLCLKLGIDPRSIPWEKMGSTTSFISEALSAKGSPLALHD




GAPIKDSRFAHAARGRLSPESRKALNEALWERKSSSREYGVI




SRRKSEASRRMANAVLSESRRLTGLAVVAVNLEDLNMVSKF




FHGRGKRAPGWAGFFTPKMENRWFIRSIHKAMCDLSKHRGI




TVIESRPERTSISCPECGHCDPENRSGERFSCKSCGVSLHADFE




VATRNLERVALTGKPMPRRENLHSPEGATASRKTRKKPREA




TASTFLDLRSVLSSAENEGSGPAARAG





CasΦ.31
31
MLPPSNKIGKSMSLKEFINKRNFKSSIIKQAGKILKKEGEEAV




KKYLDDNYVEGYKKRDFPITAKCNIVASNRKIEDFDISKFSSF




IQNYVFNLNKDNFEEFSKIKYNRKSFDELYKKIANEIGLEKPN




YENIQGEIAVIRNAINIYNGVLKKVENRNKKIQEKNQSKDPPK




LLSAFDDNGFLAERPGINETIYGYQSVRLRHLDVEKDKDIIVQ




LPDIYQKYNKKSTDKISVKKRLNKYNVDEYGKLISKRRKERI




NKDDAILCVSNFGDDWIIFDARGLLRQTYRYKLKKKGLCIKD




LLNLFTGDPIINPTKTDLKEALSLSFKDGIINNRTLKVKNYKK




CPELISELIRDKGKVAMISIDLGQTNPISYRLSKFTANNVAYIE




NGVISEDDIVKMKKWREKSDKLENLIKEEAIASLSDDEQREV




RLYENDIADNTKKKILEKFNIREEDLDFSKMSNNTYFIRDCLK




NKNIDESEFTFEKNGKKLDPTDACFAREYKNKLSELTRKKIN




EKIWEIKKNSKEYHKISIYKKETIRYIVNKLIKQSKEKSECDDII




VNIEKLQIGGNFFGGRGKRDPGWNNFFLPKEENRWFINACH




KAFSELAPHKGIIVIESDPAYTSQTCPKCENCDKENRNGEKFK




CKKCNYEANADIDVATENLEKIAKNGRRLIKNFDQLGERLPG




AEMPGGARKRKPSKSLPKNGRGAGVGSEPELINQSPSQVIA





CasΦ.32
32
VPDKKETPLVALCKKSFPGLRFKKHDSRQAGRILKSKGEGAA




VAFLEGKGGTTQPNFKPPVKCNIVAMSRPLEEWPIYKASVVI




QKYVYAQSYEEFKATDPGKSEAGLRAWLKATRVDTDGYFN




VQGLNLIFQNARATYEGVLKKVENRNSKKVAKIEQRNEHRA




ERGLPLLTLDEPETALDETGHLRHRPGINCSVFGYQHMKLKP




YVPGSIPGVTGYSRDPSTPIAACGVDRLEIPEGQPGYVPPWDR




ENLSVKKHRRKRASWARSRGGAIDDNMLLAVVRVADDWA




LLDLRGLLRNTQYRKLLDRSVPVTIESLLNLVTNDPTLSVVK




KPGKPVRYTATLIYKQGVVPVVKAKVVKGSYVSKMLDDTT




ETFSLVGVDLGVNNLIAANALRIRPGKCVERLQAFTLPEQTV




EDFFRFRKAYDKHQENLRLAAVRSLTAEQQAEVLALDTFGP




EQAKMQVCGHLGLSVDEVPWDKVNSRSSILSDLAKERGVD




DTLYMFPFFKGKGKKRKTEIRKRWDVNWAQHFRPQLTSETR




KALNEAKWEAERNSSKYHQLSIRKKELSRHCVNYVIRTAEK




RAQCGKVIVAVEDLHHSFRRGGKGSRKSGWGGFFAAKQEG




RWLMDALFGAFCDLAVHRGYRVIKVDPYNTSRTCPECGHC




DKANRDRVNREAFICVCCGYRGNADIDVAAYNIAMVAITGV




SLRKAARASVASTPLESLAAE





CasΦ.33
33
MSKTKELNDYQEALARRLPGVRHQKSVRRAARLVYDRQGE




DAMVAFLDGKEVDEPYTLQPPAKCHILAVSRPIEEWPIARVT




MAVQEHVYALPVHEVEKSRPETTEGSRSAWFKNSGVSNHG




VTHAQTLNAILKNAYNVYNGVIKKVENRNAKKRDSLAAKN




KSRERKGLPHFKADPPELATDEQGYLLQPPSPNSSVYLVQQH




LRTPQIDLPSGYTGPVVDPRSPIPSLIPIDRLAIPPGQPGYVPLH




DREKLTSNKHRRMKLPKSLRAQGALPVCFRVFDDWAVVDG




RGLLRHAQYRRLAPKNVSIAELLELYTGDPVIDIKRNLMTFR




FAEAVVEVTARKIVEKYHNKYLLKLTEPKGKPVREIGLVSID




LNVQRLIALAIYRVHQTGESQLALSPCLHREILPAKGLGDFDK




YKSKFNQLTEEILTAAVQTLTSAQQEEYQRYVEESSHEAKAD




LCLKYSITPHELAWDKMTSSTQYISRWLRDHGWNASDFTQIT




KGRKKVERLWSDSRWAQELKPKLSNETRRKLEDAKHDLQR




ANPEWQRLAKRKQEYSRHLANTVLSMAREYTACETVVIAIE




NLPMKGGFVDGNGSRESGWDNFFTHKKENRWMIKDIHKAL




SDLAPNRGVHVLEVNPQYTSQTCPECGHRDKANRDPIQRERF




CCTHCGAQRHADLEVATHNIAMVATTGKSLTGKSLAPQRLQ




EAAE





CasΦ.41
34
VLLSDRIQYTDPSAPIPAMTVVDRRKIKKGEPGYVPPFMRKN




LSTNKHRRMRLSRGQKEACALPVGLRLPDGKDGWDFIIFDG




RALLRACRRLRLEVTSMDDVLDKFTGDPRIQLSPAGETIVTC




MLKPQHTGVIQQKLITGKMKDRLVQLTAEAPIAMLTVDLGE




HNLVACGAYTVGQRRGKLQSERLEAFLLPEKVLADFEGYRR




DSDEHSETLRHEALKALSKRQQREVLDMLRTGADQARESLC




YKYGLDLQALPWDKMSSNSTFIAQHLMSLGFGESATHVRYR




PKRKASERTILKYDSRFAAEEKIKLTDETRRAWNEAIWECQR




ASQEFRCLSVRKLQLARAAVNWTLTQAKQRSRCPRVVVVV




EDLNVRFMHGGGKRQEGWAGFFKARSEKRWFIQALHKAYT




ELPTNRGIHVMEVNPARTSITCTKCGYCDPENRYGEDFHCRN




PKCKVRGGHVANADLDIATENLARVALSGPMPKAPKLK





CasΦ.34
35
MTPSFGYQMIIVTPIHHASGAWATLRLLFLNPKTSGVMLGMT




KTKSAFALMREEVFPGLLFKSADLKMAGRKFAKEGREAAIE




YLRGKDEERPANFKPPAKGDIIAQSRPFDQWPIVQVSQAIQK




YIFGLTKAEFDATKTLLYGEGNHPTTESRRRWFEATGVPDFG




FTSAQGLNAIFSSALARYEGVIQKVENRNEKRLKKLSEKNQR




LVEEGHAVEAYVPETAFHTLESLKALSEKSLVPLDDLMDKID




RLAQPPGINPCLYGYQQVAPYIYDPENPRGVVLPDLYLGYCR




KPDDPITACPNRLDIPKGQPGYIPEHQRGQLKKHGRVRRFRY




TNPQAKARAKAQTAILAVLRIDEDWVVMDLRGLLRNVYFRE




VAAPGELTARTLLDTFTGCPVLNLRSNVVTFCYDIESKGALH




AEYVRKGWATRNKLLDLTKDGQSVALLSVDLGQRHPVAVM




ISRLKRDDKGDLSEKSIQVVSRTFADQYVDKLKRYRVQYDA




LRKEIYDAALVSLPPEQQAEIRAYEAFAPGDAKANVLSVMFQ




GEVSPDELPWDKMNTNTHYISDLYLRRGGDPSRVFFVPQPST




PKKNAKKPPAPRKPVKRTDENVSHMPEFRPHLSNETREAFQ




KAKWTMERGNVRYAQLSRFLNQIVREANNWLVSEAKKLTQ




CQTVVWAIEDLHVPFFHGKGKYHETWDGFFRQKKEDRWFV




NVFHKAISERAPNKGEYVMEVAPYRTSQRCPVCGFVDADNR




HGDHFKCLRCGVELHADLEVATWNIALVAVQGHGIAGPPRE




QSCGGETAGTARKGKNIKKNKGLADAVTVEAQDSEGGSKK




DAGTARNPVYIPSESQVNCPAP





CasΦ.35
36
MKPKTPKPPKTPVAALIDKHFPGKRFRASYLKSVGKKLKNQ




GEDVAVRFLTGKDEERPPNFQPPAKSNIVAQSRPIEEWPIHKV




SVAVQEYVYGLTVAEKEACSDAGESSSSHAAWFAKTGVENF




GYTSVQGLNKIFPPTFNRFDGVIKKVENRNEKKRQKATRINE




AKRNKGQSEDPPEAEVKATDDAGYLLQPPGINHSVYGYQSIT




LCPYTAEKFPTIKLPEEYAGYHSNPDAPIPAGVPDRLAIPEGQ




PGHVPEEHRAGLSTKKHRRVRQWYAMANWKPKPKRTSKPD




YDRLAKARAQGALLIVIRIDEDWVVVDARGLLRNVRWRSLG




KREITPNELLDLFTGDPVLDLKRGVVTFTYAEGVVNVCSRST




TKGKQTKVLLDAMTAPRDGKKRQIGMVAVDLGQTNPIAAE




YSRVGKNAAGTLEATPLSRSTLPDELLREIALYRKAHDRLEA




QLREEAVLKLTAEQQAENARYVETSEEGAKLALANLGVDTS




TLPWDAMTGWSTCISDHLINHGGDTSAVFFQTIRKGTKKLET




IKRKDSSWADIVRPRLTKETREALNDFLWELKRSHEGYEKLS




KRLEELARRAVNHVVQEVKWLTQCQDIVIVIEDLNVRNFHG




GGKRGGGWSNFFTVKKENRWFMQALHKAFSDLAAHRGIPV




LEVYPARTSITCLGCGHCDPENRDGEAFVCQQCGATFHADLE




VATRNIARVALTGEAMPKAPAREQPGGAKKRGTSRRRKLTE




VAVKSAEPTIHQAKNQQLNGTSRDPVYKGSELPAL





CasΦ.43
37
MSEITDLLKANFKGKTFKSADMRMAGRILKKSGAQAVIKYL




SDKGAVDPPDFRPPAKCNIIAQSRPFDEWPICKASMAIQQHIY




GLTKNEFDESSPGTSSASHEQWFAKTGVDTHGFTHVQGLNLI




FQHAKKRYEGVIKKVENYNEKERKKFEGINERRSKEGMPLL




EPRLRTAFGDDGKFAEKPGVNPSIYLYQQTSPRPYDKTKHPY




VHAPFELKEITTIPTQDDRLKIPFGAPGHVPEKHRSQLSMAKH




KRRRAWYALSQNKPRPPKDGSKGRRSVRDLADLKAASLAD




AIPLVSRVGFDWVVIDGRGLLRNLRWRKLAHEGMTVEEML




GFFSGDPVIDPRRNVATFIYKAEHATVKSRKPIGGAKRAREEL




LKATASSDGVIRQVGLISVDLGQTNPVAYEISRMHQANGELV




AEHLEYGLLNDEQVNSIQRYRAAWDSMNESFRQKAIESLSM




EAQDEIMQASTGAAKRTREAVLTMFGPNATLPWSRMSSNTT




CISDALIEVGKEEETNFVTSNGPRKRTDAQWAAYLRPRVNPE




TRALLNQAVWDLMKRSDEYERLSKRKLEMARQCVNFVVAR




AEKLTQCNNIGIVLENLVVRNFHGSGRRESGWEGFFEPKREN




RWFMQVLHKAFSDLAQHRGVMVFEVHPAYSSQTCPACRYV




DPKNRSSEDRERFKCLKCGRSFNADREVATFNIREIARTGVG




LPKPDCERSRGVQTTGTARNPGRSLKSNKNPSEPKRVLQSKT




RKKITSTETQNEPLATDLKT





CasΦ.44
38
MTPKTESPLSALCKKHFPGKRFRTNYLKDAGKILKKHGEDA




VVAFLSDKQEDEPANFCPPAKVHILAQSRPFEDWPINLASKAI




QTYVYGLTADERKTCEPGTSKESHDRWFKETGVDHHGFTSV




QGLNLIFKHTLNRYDGVIKKVETRNEKRRSSVVRINEKKAAE




GLPLIAAEAEETAFGEDGRLLQPPGVNHSIYCFQQVSPQPYSS




KKHPQVVLPHAVQGVDPDAPIPVGRPNRLDIPKGQPGYVPE




WQRPHLSMKCKRVRMWYARANWRRKPGRRSVLNEARLKE




ASAKGALPIVLVIGDDWLVMDARGLLRSVFWRRVAKPGLSL




SELLNVTPTGLFSGDPVIDPKRGLVTFTSKLGVVAVHSRKPTR




GKKSKDLLLKMTKPTDDGMPRHVGMVAIDLGQTNPVAAEY




SRVVQSDAGTLKQEPVSRGVLPDDLLKDVARYRRAYDLTEE




SIRQEAIALLSEGHRAEVTKLDQTTANETKRLLVDRGVSESLP




WEKMSSNTTYISDCLVALGKTDDVFFVPKAKKGKKETGIAV




KRKDHGWSKLLRPRTSPEARKALNENQWAVKRASPEYERLS




RRKLELGRRCVNHIIQETKRWTQCEDIVVVLEDLNVGFFHGS




GKRPDGWDNFFVSKRENRWFIQVLHKAFGDLATHRGTHVIE




VHPARTSITCIKCGHCDAGNRDGESFVCLASACGDRRHADLE




VATRNVARVAITGERMPPSEQARDVQKAGGARKRKPSARN




VKSSYPAVEPAPASP





CasΦ.36
39
MSDNKMKKLSKEEKPLTPLQILIRKYIDKSQYPSGFKTTIIKQ




AGVRIKSVKSEQDEINLANWIISKYDPTYIKRDFNPSAKCQIIA




TSRSVADFDIVKMSNKVQEIFFASSHLDKNVFDIGKSKSDHD




SWFERNNVDRGIYTYSNVQGMNLIFSNTKNTYLGVAVKAQN




KFSSKMKRIQDINNFRITNHQSPLPIPDEIKIYDDAGFLLNPPG




VNPNIFGYQSCLLKPLENKEIISKTSFPEYSRLPADMIEVNYKI




SNRLKFSNDQKGFIQFKDKLNLFKINSQELFSKRRRLSGQPIL




LVASFGDDWVVLDGRGLLRQVYYRGIAKPGSITISELLGFFT




GDPIVDPIRGVVSLGFKPGVLSQETLKTTSARIFAEKLPNLVL




NNNVGLMSIDLGQTNPVSYRLSEITSNMSVEHICSDFLSQDQI




SSIEKAKTSLDNLEEEIAIKAVDHLSDEDKINFANFSKLNLPED




TRQSLFEKYPELIGSKLDFGSMGSGTSYIADELIKFENKDAFY




PSGKKKFDLSFSRDLRKKLSDETRKSYNDALFLEKRTNDKYL




KNAKRRKQIVRTVANSLVSKIEELGLTPVINIENLAMSGGFFD




GRGKREKGWDNFFKVKKENRWVMKDFHKAFSELSPHHGVI




VIESPPYCTSVTCTKCNFCDKKNRNGHKFTCQRCGLDANAD




LDIATENLEKVAISGKRMPGSERSSDERKVAVARKAKSPKGK




AIKGVKCTITDEPALLSANSQDCSQSTS





CasΦ.37
40
MALSLAEVRERHFKGLRFRSSYLKRAGKILKKEGEAACVAY




LTGKDEESPPNFKPPAKCDVVAQSRPFEEWPIVQASVAVQSY




VYGLTKEAFEAFNPGTTKQSHEACLAATGIDTCGYSNVQGL




NLIFRQAKNRYEGVITKVENRNKKAKKKLTRKNEWRQKNG




HSELPEAPEELTFNDEGRLLQPPGINPSLYTYQQISPTPWSPKD




SSILPPQYAGYERDPNAPIPFGVAKDRLTIASGCPGYIPEWMR




TAGEKTNPRTQKKFMHPGLSTRKNKRMRLPRSVRSAPLGAL




LVTIHLGEDWLVLDVRGLLRNARWRGVAPKDISTQGLLNLF




TGDPVIDTRRGVVTFTYKPETVGIHSRTWLYKGKQTKEVLEK




LTQDQTVALVAIDLGQTNPVSAAASRVSRSGENLSIETVDRF




FLPDELIKELRLYRMAHDRLEERIREESTLALTEAQQAEVRAL




EHVVRDDAKNKVCAAFNLDAASLPWDQMTSNTTYLSEAIL




AQGVSRDQVFFTPNPKKGSKEPVEVMRKDRAWVYAFKAKL




SEETRKAKNEALWALKRASPDYARLSKRREELCRRSVNMVI




NRAKKRTQCQVVIPVLEDLNIGFFHGSGKRLPGWDNFFVAK




KENRWLMNGLHKSFSDLAVHRGFYVFEVMPHRTSITCPACG




HCDSENRDGEAFVCLSCKRTYHADLDVATHNLTQVAGTGLP




MPEREHPGGTKKPGGSRKPESPQTHAPILHRTDYSESADRLG




s





CasΦ.45
41
QAVIKYLSDKGAVDPPDFRPPAKCNIIAQSRPFDEWPICKASM




AIQQHIYGLTKNEFDESSPGTSSASHEQWFAKTGVDTHGFTH




VQGLNLIFQHAKKRYEGVIKKVENYNEKERKKFEGINERRSK




EGMPLLEPRLRTAFGDDGKFAEKPGVNPSIYLYQQTSPRPYD




KTKHPYVHAPFELKEITTIPTQDDRLKIPFGAPGHVPEKHRSQ




LSMAKHKRRRAWYALSQNKPRPPKDGSKGRRSVRDLADLK




AASLADAIPLVSRVGFDWVVIDGRGLLRNLRWRKLAHEGMT




VEEMLGFFSGDPVIDPRRNVATFIYKAEHATVKSRKPIGGAK




RAREELLKATASSDGVIRQVGLISVDLGQTNPVAYEISRMHQ




ANGELVAEHLEYGLLNDEQVNSIQRYRAAWDSMNESFRQK




AIESLSMEAQDEIMQASTGAAKRTREAVLTMFGPNATLPWS




RMSSNTTCISDALIEVGKEEETNFVTSNGPRKRTDAQWAAYL




RPRVNPETRALLNQAVWDLMKRSDEYERLSKRKLEMARQC




VNFVVARAEKLTQCNNIGIVLENLVVRNFHGSGRRESGWEG




FFEPKRENRWFMQVLHKAFSDLAQHRGVMVFEVHPAYSSQ




TCPACRYVDPKNRSSEDRERFKCLKCGRSFNADREVATFNIR




EIARTGVGLPKPDCERSRDVQTPGTARKSGRSLKSQDNLSEP




KRVLQSKTRKKITSTETQNEPLATDLKT





CasΦ.38
42
MIKEQSELSKLIEKYYPGKKFYSNDLKQAGKHLKKSEHLTAK




ESEELTVEFLKSCKEKLYDFRPPAKALIISTSRPFEEWPIYKAS




ESIQKYIYSLTKEELEKYNISTDKTSQENFFKESLIDNYGFANV




SGLNLIFQHTKAIYDGVLKKVNNRNNKILKKYKRKIEEGIEID




SPELEKAIDESGHFINPPGINKNIYCYQQVSPTIFNSFKETKIICP




FNYKRNPNDIIQKGVIDRLAIPFGEPGYIPDHQRDKVNKHKK




RIRKYYKNNENKNKDAILAKINIGEDWVLFDLRGLLRNAYW




RKLIPKQGITPQQLLDMFSGDPVIDPIKNNITFIYKESIIPIHSESI




IKTKKSKELLEKLTKDEQIALVSIDLGQTNPVAARFSRLSSDL




KPEHVSSSFLPDELKNEICRYREKSDLLEIEIKNKAIKMLSQEQ




QDEIKLVNDISSEELKNSVCKKYNIDNSKIPWDKMNGFTTFIA




DEFINNGGDKSLVYFTAKDKKSKKEKLVKLSDKKIANSFKPK




ISKETREILNKITWDEKISSNEYKKLSKRKLEFARRATNYLIN




QAKKATRLNNVVLVVEDLNSKFFHGSGKREDGWDNFFIPKK




ENRWFIQALHKSLTDVSIHRGINVIEVRPERTSITCPKCGCCD




KENRKGEDFKCIKCDSVYHADLEVATFNIEKVAITGESMPKP




DCERLGGEESIG





CasΦ.39
43
VAFLDGKEVDEPYTLQPPAKCHILAVSRPIEEWPIARVTMAV




QEHVYALPVHEVEKSRPETTEGSRSAWFKNSGVSNHGVTHA




QTLNAILKNAYNVYNGVIKKVENRNAKKRDSLAAKNKSRER




KGLPHFKADPPELATDEQGYLLQPPSPNSSVYLVQQHLRTPQ




IDLPSGYTGPVVDPRSPIPSLIPIDRLAIPPGQPGYVPLHDREKL




TSNKHRRMKLPKSLRAQGALPVCFRVFDDWAVVDGRGLLR




HAQYRRLAPKNVSIAELLELYTGDPVIDIKRNLMTFRFAEAV




VEVTARKIVEKYHNKYLLKLTEPKGKPVREIGLVSIDLNVQR




LIALAIYRVHQTGESQLALSPCLHREILPAKGLGDFDKYKSKF




NQLTEEILTAAVQTLTSAQQEEYQRYVEESSHEAKADLCLKY




SITPHELAWDKMTSSTQYISRWLRDHGWNASDFTQITKGRK




KVERLWSDSRWAQELKPKLSNETRRKLEDAKHDLQRANPE




WQRLAKRKQEYSRHLANTVLSMAREYTACETVVIAIENLPM




KGGFVDGNGSRESGWDNFFTHKKENRWMIKDIHKALSDLAP




NRGVHVLEVNPQYTSQTCPECGHRDKANRDPIQRERFCCTH




CGAQRHADLEVATHNIAMVATTGKSLTGKSLAPQRLQ





CasΦ.42
44
LEIPEGEPGHVPWFQRMDIPEGQIGHVNKIQRFNFVHGKNSG




KVKFSDKTGRVKRYHHSKYKDATKPYKFLEESKKVSALDSI




LAIITIGDDWVVFDIRGLYRNVFYRELAQKGLTAVQLLDLFT




GDPVIDPKKGIITFSYKEGVVPVFSQKIVSRFKSRDTLEKLTSQ




GPVALLSVDLGQNEPVAARVCSLKNINDKIALDNSCRIPFLD




DYKKQIKDYRDSLDELEIKIRLEAINSLDVNQQVEIRDLDVFS




ADRAKASTVDMFDIDPNLISWDSMSDARFSTQISDLYLKNGG




DESRVYFEINNKRIKRSDYNISQLVRPKLSDSTRKNLNDSIWK




LKRTSEEYLKLSKRKLELSRAVVNYTIRQSKLLSGINDIVIILE




DLDVKKKFNGRGIRDIGWDNFFSSRKENRWFIPAFHKSFSEL




SSNRGLCVIEVNPAWTSATCPDCGFCSKENRDGINFTCRKCG




VSYHADIDVATLNIARVAVLGKPMSGPADRERLGGTKKPRV




ARSRKDMKRKDISNGTVEVMVTA





CasΦ.46
45
IPSFGYLDRLKIAKGQPGYIPEWQRETINPSKKVRRYWATNH




EKIRNAIPLVVFIGDDWVIIDGRGLLRDARRRKLADKNTTIEQ




LLEMVSNDPVIDSTRGIATLSYVEGVVPVRSFIPIGEKKGREY




LEKSTQKESVTLLSVDIGQINPVSCGVYKVSNGCSKIDFLDKF




FLDKKHLDAIQKYRTLQDSLEASIVNEALDEIDPSFKKEYQNI




NSQTSNDVKKSLCTEYNIDPEAISWQDITAHSTLISDYLIDNNI




TNDVYRTVNKAKYKTNDFGWYKKFSAKLSKEAREALNEKI




WELKIASSKYKKLSVRKKEIARTIANDCVKRAETYGDNVVV




AMESLTKNNKVMSGRGKRDPGWHNLGQAKVENRWFIQAIS




SAFEDKATHHGTPVLKVNPAYTSQTCPSCGHCSKDNRSSKD




RTIFVCKSCGEKFNADLDVATYNIAHVAFSGKKLSPPSEKSSA




TKKPRSARKSKKSRKS





CasΦ.47
46
SPIEKLLNGLLVKITFGNDWIICDARGLLDNVQKGIIHKSYFT




NKSSLVDLIDLFTCNPIVNYKNNVVTFCYKEGVVDVKSFTPI




KSGPKTQENLIKKLKYSRFQNEKDACVLGVGVDVGVTNPFA




INGFKMPVDESSEWVMLNEPLFTIETSQAFREEIMAYQQRTD




EMNDQFNQQSIDLLPPEYKVEFDNLPEDINEVAKYNLLHTLN




IPNNFLWDKMSNTTQFISDYLIQIGRGTETEKTITTKKGKEKIL




TIRDVNWFNTFKPKISEETGKARTEIKRDLQKNSDQFQKLAK




SREQSCRTWVNNVTEEAKIKSGCPLIIFVIEALVKDNRVFSGK




GHRAIGWHNFGKQKNERRWWVQAIHKAFQEQGVNHGYPVI




LCPPQYTSQTCPKCNHVDRDNRSGEKFKCLKYGWIGNADLD




VGAYNIARVAITGKALSKPLEQKKIKKAKNKT





CasΦ.48
47
LLDNVQKGIIHKSYFTNKSSLVDLIDLFTCNPIVNYKNNVVTF




CYKEGVVDVKSFTPIKSGPKTQENLIKKLKYSRFQNEKDACV




LGVGVDVGVTNPFAINGFKMPVDESSEWVMLNEPLFTIETSQ




AFREEIMAYQQRTDEMNDQFNQQSIDLLPPEYKVEFDNLPED




INEVAKYNLLHTLNIPNNFLWDKMSNTTQFISDYLIQIGRGTE




TEKTITTKKGKEKILTIRDVNWFNTFKPKISEETGKARTEIKR




DLQKNSDQFQKLAKSREQSCRTWVNNVTEEAKIKSGCPLIIF




VIEALVKDNRVFSGKGHRAIGWHNFGKQKNERRWWVQAIH




KAFQEQGVNHGYPVILCPPQYTSQTCPKCNHVDRDNRSGEK




FKCLKYGWIGNADLDVGAYNIARVAITGKALSKPLEQKKIK




KAKNKT





CasΦ.49
105
MIKPTVSQFLTPGFKLIRNHSRTAGLKLKNEGEEACKKFVRE




NEIPKDECPNFQGGPAIANIIAKSREFTEWEIYQSSLAIQEVIFT




LPKDKLPEPILKEEWRAQWLSEHGLDTVPYKEAAGLNLIIKN




AVNTYKGVQVKVDNKNKNNLAKINRKNEIAKLNGEQEISFE




EIKAFDDKGYLLQKPSPNKSIYCYQSVSPKPFITSKYHNVNLP




EEYIGYYRKSNEPIVSPYQFDRLRIPIGEPGYVPKWQYTFLSK




KENKRRKLSKRIKNVSPILGIICIKKDWCVFDMRGLLRTNHW




KKYHKPTDSINDLFDYFTGDPVIDTKANVVRFRYKMENGIV




NYKPVREKKGKELLENICDQNGSCKLATVDVGQNNPVAIGL




FELKKVNGELTKTLISRHPTPIDFCNKITAYRERYDKLESSIKL




DAIKQLTSEQKIEVDNYNNNFTPQNTKQIVCSKLNINPNDLP




WDKMISGTHFISEKAQVSNKSEIYFTSTDKGKTKDVMKSDY




KWFQDYKPKLSKEVRDALSDIEWRLRRESLEFNKLSKSREQ




DARQLANWISSMCDVIGIENLVKKNNFFGGSGKREPGWDNF




YKPKKENRWWINAIHKALTELSQNKGKRVILLPAMRTSITCP




KCKYCDSKNRNGEKFNCLKCGIELNADIDVATENLATVAITA




QSMPKPTCERSGDAKKPVRARKAKAPEFHDKLAPSYTVVLR




EAVKRPAATKKAGQAKKKKEF




(Underlined sequence is Nuclear Localization Signal; SEQ ID NO: 106)





CasΦ.12 with NLS Signals
107
SNAPKKKRKVGIHGVPAAMIKPTVSQFLTPGFKLIRNHSRT




AGLKLKNEGEEACKKFVRENEIPKDECPNFQGGPAIANIIAKS




REFTEWEIYQSSLAIQEVIFTLPKDKLPEPILKEEWRAQWLSE




HGLDTVPYKEAAGLNLIIKNAVNTYKGVQVKVDNKNKNNL




AKINRKNEIAKLNGEQEISFEEIKAFDDKGYLLQKPSPNKSIY




CYQSVSPKPFITSKYHNVNLPEEYIGYYRKSNEPIVSPYQFDR




LRIPIGEPGYVPKWQYTFLSKKENKRRKLSKRIKNVSPILGIICI




KKDWCVFDMRGLLRTNHWKKYHKPTDSINDLFDYFTGDPVI




DTKANVVRFRYKMENGIVNYKPVREKKGKELLENICDQNGS




CKLATVDVGQNNPVAIGLFELKKVNGELTKTLISRHPTPIDFC




NKITAYRERYDKLESSIKLDAIKQLTSEQKIEVDNYNNNFTPQ




NTKQIVCSKLNINPNDLPWDKMISGTHFISEKAQVSNKSEIYF




TSTDKGKTKDVMKSDYKWFQDYKPKLSKEVRDALSDIEWR




LRRESLEFNKLSKSREQDARQLANWISSMCDVIGIENLVKKN




NFFGGSGKREPGWDNFYKPKKENRWWINAIHKALTELSQNK




GKRVILLPAMRTSITCPKCKYCDSKNRNGEKFNCLKCGIELN




ADIDVATENLATVAITAQSMPKPTCERSGDAKKPVRARKAK




APEFHDKLAPSYTVVLREAVKRPAATKKAGQAKKKKEF




(Underlined sequences are Nuclear Localization Signals; SEQ ID NO: 112 and 106)









In some embodiments, any of the programmable CasΦ nucleases of the present disclosure (e.g., any one of SEQ ID NO: 1 to 47, 105, or 107, or fragments or variants thereof) may include a nuclear localization signal (NLS). In some cases, one or more NLSs are fused or linked to the N-terminus of the programmable CasΦ nuclease. In some embodiments, one or more NLSs are fused or linked to the C-terminus of the programmable CasΦ nuclease. In some embodiments, one or more NLSs are fused or linked to the N-terminus and the C-terminus of the programmable CasΦ nuclease. In some embodiments, the link between the NLS and the programmable CasΦ nuclease comprises a tag. In some cases, said NLS may have a sequence of KRPAATKKAGQAKKKKEF (SEQ ID NO: 106). The NLS can be selected to match the cell type of interest; for example, several NLSs are known to be functional in different types of eukaryotic cells, e.g., in mammalian cells. Suitable NLSs include the SV40 large T antigen NLS (PKKKRKV, SEQ ID NO: 110) and the c-Myc NLS (PAAKRVKLD, SEQ ID NO: 111). In some embodiments, an NLS may be the SV40 large T antigen NLS or the c-Myc NLS. NLSs that are functional in plant cells are described in Chang et al. (Plant Signal Behav. 2013 October; 8(10):e25976). In some embodiments, an NLS sequence can be selected from the following consensus sequences: KR(K/R)R; K(K/R)RK; (P/R)XXKR(^DE)(K/R); KRX(W/F/Y)XXAF (SEQ ID NO: 2489); (R/P)XXKR(K/R)(^DE); LGKR(K/R)(W/F/Y) (SEQ ID NO: 2490); KRX10-12K(KR)(KR); or KRX10-12K(KR)X(K/R).
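By way of illustration only, the consensus sequences listed above may be expressed as regular expressions and used to scan an amino acid sequence for candidate NLS motifs. The following Python sketch is not part of the disclosed compositions. It assumes that X denotes any residue, that (A/B) and (AB) denote either of the listed residues, that (^DE) denotes any residue other than D or E, and that X10-12 denotes a spacer of 10 to 12 residues. The names `NLS_CONSENSUS` and `find_nls_motifs` are hypothetical.

```python
import re

# Consensus NLS motifs from the text, translated to regular expressions.
# Assumptions: X = any residue, (A/B) or (AB) = either residue,
# (^DE) = any residue except D or E, X10-12 = a 10 to 12 residue spacer.
NLS_CONSENSUS = {
    "KR(K/R)R": r"KR[KR]R",
    "K(K/R)RK": r"K[KR]RK",
    "(P/R)XXKR(^DE)(K/R)": r"[PR][A-Z]{2}KR[^DE][KR]",
    "KRX(W/F/Y)XXAF": r"KR[A-Z][WFY][A-Z]{2}AF",
    "(R/P)XXKR(K/R)(^DE)": r"[RP][A-Z]{2}KR[KR][^DE]",
    "LGKR(K/R)(W/F/Y)": r"LGKR[KR][WFY]",
    "KRX10-12K(KR)(KR)": r"KR[A-Z]{10,12}K[KR][KR]",
    "KRX10-12K(KR)X(K/R)": r"KR[A-Z]{10,12}K[KR][A-Z][KR]",
}


def find_nls_motifs(sequence):
    """Return (motif name, start index, matched text) for each consensus hit."""
    seq = sequence.upper()
    hits = []
    for name, pattern in NLS_CONSENSUS.items():
        # finditer reports non-overlapping matches, scanning left to right
        for match in re.finditer(pattern, seq):
            hits.append((name, match.start(), match.group()))
    return hits


if __name__ == "__main__":
    # Scan the nucleoplasmin NLS of SEQ ID NO: 106 for consensus hits
    for hit in find_nls_motifs("KRPAATKKAGQAKKKKEF"):
        print(hit)
```

A hit under this sketch identifies only a candidate motif; it does not establish that the motif functions as an NLS in a given cell type.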


In some embodiments, the nucleoplasmin NLS (KRPAATKKAGQAKKKKEF (SEQ ID NO: 106)) is linked or fused to the C-terminus of the programmable CasΦ nuclease. In some embodiments, the SV40 NLS (PKKKRKVGIHGVPAA) (SEQ ID NO: 112) is linked or fused to the N-terminus of the programmable CasΦ nuclease. In preferred embodiments, the nucleoplasmin NLS (SEQ ID NO: 106) is linked or fused to the C-terminus of the programmable CasΦ nuclease and the SV40 NLS (SEQ ID NO: 112) is linked or fused to the N-terminus of the programmable CasΦ nuclease.
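By way of illustration only, the arrangement described above (SV40 NLS at the N-terminus, nucleoplasmin NLS at the C-terminus) may be sketched as a simple concatenation of amino acid sequences. The following Python sketch assumes a direct fusion without any linker or tag, which is only one of the arrangements contemplated herein. The function name `add_nls` is hypothetical. It is noted that SEQ ID NO: 107, as listed in the table above, carries three additional residues (SNA) ahead of the SV40 NLS, which this sketch does not add.

```python
SV40_NLS = "PKKKRKVGIHGVPAA"              # SEQ ID NO: 112 as given in the text
NUCLEOPLASMIN_NLS = "KRPAATKKAGQAKKKKEF"  # SEQ ID NO: 106 as given in the text


def add_nls(nuclease, n_term_nls=SV40_NLS, c_term_nls=NUCLEOPLASMIN_NLS):
    """Concatenate optional N- and C-terminal NLS sequences onto a nuclease.

    Pass None or "" to leave a terminus unmodified. No linker or tag is
    modeled here; that is an assumption of this sketch.
    """
    return (n_term_nls or "") + nuclease + (c_term_nls or "")


if __name__ == "__main__":
    # "MIKPTVSQ" is a short placeholder, not a complete nuclease sequence
    print(add_nls("MIKPTVSQ"))
    print(add_nls("MIKPTVSQ", n_term_nls=None))
```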


In some embodiments, the CasΦ nuclease comprises more than 200 amino acids, more than 300 amino acids, or more than 400 amino acids. In some embodiments, the CasΦ nuclease comprises less than 1500 amino acids, less than 1000 amino acids, or less than 900 amino acids. In some embodiments, the CasΦ nuclease comprises between 200 and 1500 amino acids, between 300 and 1000 amino acids, or between 400 and 900 amino acids. In preferred embodiments, the CasΦ nuclease comprises between 400 and 900 amino acids.


“Percent identity” and “% identity” can refer to the extent to which two sequences (nucleotide or amino acid) have the same residue at the same positions in an alignment. For example, “an amino acid sequence is X % identical to SEQ ID NO: Y” can refer to % identity of the amino acid sequence to SEQ ID NO: Y and means that X % of the residues in the amino acid sequence are identical to the residues of the sequence disclosed in SEQ ID NO: Y. Generally, computer programs can be employed for such calculations. Illustrative programs that compare and align pairs of sequences include ALIGN (Myers and Miller, Comput Appl Biosci. 1988 March; 4(1):11-7), FASTA (Pearson and Lipman, Proc Natl Acad Sci USA. 1988 April; 85(8):2444-8; Pearson, Methods Enzymol. 1990; 183:63-98), gapped BLAST (Altschul et al., Nucleic Acids Res. 1997 Sep. 1; 25(17):3389-402), BLASTP, BLASTN, or GCG (Devereux et al., Nucleic Acids Res. 1984 Jan. 11; 12(1 Pt 1):387-95).
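By way of illustration only, percent identity may be computed from a pairwise alignment that has already been produced by one of the programs above. The following Python sketch assumes that both sequences are supplied as aligned strings of equal length with "-" marking gaps, and that the denominator is the number of residues in the query sequence, consistent with the example definition above. Alignment programs differ in how they treat gaps and which denominator they report, so values from this sketch may not match the output of any particular program. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_query, aligned_ref, gap="-"):
    """Percent of query residues identical to the aligned reference residue.

    Both inputs must come from the same pairwise alignment, so they have
    equal length, with `gap` marking insertions and deletions.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same residue (not a gap)
    matches = sum(
        1 for q, r in zip(aligned_query, aligned_ref) if q == r and q != gap
    )
    # Denominator: residues of the query sequence (gap columns excluded)
    query_length = sum(1 for q in aligned_query if q != gap)
    if query_length == 0:
        raise ValueError("query contains no residues")
    return 100.0 * matches / query_length


if __name__ == "__main__":
    # Toy alignment with one substitution (K to R) across six residues
    print(round(percent_identity("MIKPTV", "MIRPTV"), 1))
```

Under this choice of denominator, a query that lacks residues present in the reference is not penalized for the missing residues; dividing by the alignment length instead would penalize gaps on either side.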


A CasΦ polypeptide or a variant thereof can comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity with any one of SEQ ID NO: 1 to SEQ ID NO: 47, SEQ ID NO: 105, and SEQ ID NO: 107.


A programmable nuclease or nickase of the present disclosure can comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity with any one of SEQ ID NO: 1 to SEQ ID NO: 47, SEQ ID NO: 105, and SEQ ID NO: 107.


Compositions and methods of the disclosure can comprise a programmable nuclease comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to SEQ ID NO: 2.


Compositions and methods of the disclosure can comprise a programmable nuclease comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to SEQ ID NO: 4.


Compositions and methods of the disclosure can comprise a programmable nuclease comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to SEQ ID NO: 11.


Compositions and methods of the disclosure can comprise a programmable nuclease comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to SEQ ID NO: 17.


Compositions and methods of the disclosure can comprise a programmable nuclease comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to SEQ ID NO: 18.


Compositions and methods of the disclosure can comprise a programmable polypeptide or nuclease comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to SEQ ID NO: 12.


Compositions and methods of the disclosure can comprise a programmable polypeptide or nuclease comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to SEQ ID NO: 105.


Compositions and methods of the disclosure can comprise a programmable polypeptide or nuclease comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to SEQ ID NO: 107.


In some embodiments, the programmable nuclease comprises a sequence with at least 70% identity to SEQ ID NO: 2. In some embodiments, the programmable nuclease comprises a sequence with at least 75% identity to SEQ ID NO: 2. In some embodiments, the programmable nuclease comprises a sequence with at least 80% identity to SEQ ID NO: 2. In some embodiments, the programmable nuclease comprises a sequence with at least 85% identity to SEQ ID NO: 2. In some embodiments, the programmable nuclease comprises a sequence with at least 90% identity to SEQ ID NO: 2. In some embodiments, the programmable nuclease comprises a sequence with at least 92% identity to SEQ ID NO: 2. In some embodiments, the programmable nuclease comprises a sequence with at least 95% identity to SEQ ID NO: 2. In some embodiments, the programmable nuclease comprises a sequence with at least 97% identity to SEQ ID NO: 2. In some embodiments, the programmable nuclease comprises a sequence with at least 98% identity to SEQ ID NO: 2. In some embodiments, the programmable nuclease comprises a sequence with at least 99% identity to SEQ ID NO: 2. In some embodiments, the programmable nuclease comprises a sequence of SEQ ID NO: 2.


In some embodiments, the programmable nuclease comprises a sequence with at least 70% identity to SEQ ID NO: 4. In some embodiments, the programmable nuclease comprises a sequence with at least 75% identity to SEQ ID NO: 4. In some embodiments, the programmable nuclease comprises a sequence with at least 80% identity to SEQ ID NO: 4. In some embodiments, the programmable nuclease comprises a sequence with at least 85% identity to SEQ ID NO: 4. In some embodiments, the programmable nuclease comprises a sequence with at least 90% identity to SEQ ID NO: 4. In some embodiments, the programmable nuclease comprises a sequence with at least 92% identity to SEQ ID NO: 4. In some embodiments, the programmable nuclease comprises a sequence with at least 95% identity to SEQ ID NO: 4. In some embodiments, the programmable nuclease comprises a sequence with at least 97% identity to SEQ ID NO: 4. In some embodiments, the programmable nuclease comprises a sequence with at least 98% identity to SEQ ID NO: 4. In some embodiments, the programmable nuclease comprises a sequence with at least 99% identity to SEQ ID NO: 4. In some embodiments, the programmable nuclease comprises a sequence of SEQ ID NO: 4.


In some embodiments, the programmable nuclease comprises a sequence with at least 70% identity to SEQ ID NO: 11. In some embodiments, the programmable nuclease comprises a sequence with at least 75% identity to SEQ ID NO: 11. In some embodiments, the programmable nuclease comprises a sequence with at least 80% identity to SEQ ID NO: 11. In some embodiments, the programmable nuclease comprises a sequence with at least 85% identity to SEQ ID NO: 11. In some embodiments, the programmable nuclease comprises a sequence with at least 90% identity to SEQ ID NO: 11. In some embodiments, the programmable nuclease comprises a sequence with at least 92% identity to SEQ ID NO: 11. In some embodiments, the programmable nuclease comprises a sequence with at least 95% identity to SEQ ID NO: 11. In some embodiments, the programmable nuclease comprises a sequence with at least 97% identity to SEQ ID NO: 11. In some embodiments, the programmable nuclease comprises a sequence with at least 98% identity to SEQ ID NO: 11. In some embodiments, the programmable nuclease comprises a sequence with at least 99% identity to SEQ ID NO: 11. In some embodiments, the programmable nuclease comprises a sequence of SEQ ID NO: 11.


In some embodiments, the programmable nuclease comprises a sequence with at least 70% identity to SEQ ID NO: 12. In some embodiments, the programmable nuclease comprises a sequence with at least 75% identity to SEQ ID NO: 12. In some embodiments, the programmable nuclease comprises a sequence with at least 80% identity to SEQ ID NO: 12. In some embodiments, the programmable nuclease comprises a sequence with at least 85% identity to SEQ ID NO: 12. In some embodiments, the programmable nuclease comprises a sequence with at least 90% identity to SEQ ID NO: 12. In some embodiments, the programmable nuclease comprises a sequence with at least 92% identity to SEQ ID NO: 12. In some embodiments, the programmable nuclease comprises a sequence with at least 95% identity to SEQ ID NO: 12. In some embodiments, the programmable nuclease comprises a sequence with at least 97% identity to SEQ ID NO: 12. In some embodiments, the programmable nuclease comprises a sequence with at least 98% identity to SEQ ID NO: 12. In some embodiments, the programmable nuclease comprises a sequence with at least 99% identity to SEQ ID NO: 12. In some embodiments, the programmable nuclease comprises a sequence of SEQ ID NO: 12.


In some embodiments, the programmable nuclease comprises a sequence with at least 70% identity to SEQ ID NO: 17. In some embodiments, the programmable nuclease comprises a sequence with at least 75% identity to SEQ ID NO: 17. In some embodiments, the programmable nuclease comprises a sequence with at least 80% identity to SEQ ID NO: 17. In some embodiments, the programmable nuclease comprises a sequence with at least 85% identity to SEQ ID NO: 17. In some embodiments, the programmable nuclease comprises a sequence with at least 90% identity to SEQ ID NO: 17. In some embodiments, the programmable nuclease comprises a sequence with at least 92% identity to SEQ ID NO: 17. In some embodiments, the programmable nuclease comprises a sequence with at least 95% identity to SEQ ID NO: 17. In some embodiments, the programmable nuclease comprises a sequence with at least 97% identity to SEQ ID NO: 17. In some embodiments, the programmable nuclease comprises a sequence with at least 98% identity to SEQ ID NO: 17. In some embodiments, the programmable nuclease comprises a sequence with at least 99% identity to SEQ ID NO: 17. In some embodiments, the programmable nuclease comprises a sequence of SEQ ID NO: 17.


In some embodiments, the programmable nuclease comprises a sequence with at least 70% identity to SEQ ID NO: 18. In some embodiments, the programmable nuclease comprises a sequence with at least 75% identity to SEQ ID NO: 18. In some embodiments, the programmable nuclease comprises a sequence with at least 80% identity to SEQ ID NO: 18. In some embodiments, the programmable nuclease comprises a sequence with at least 85% identity to SEQ ID NO: 18. In some embodiments, the programmable nuclease comprises a sequence with at least 90% identity to SEQ ID NO: 18. In some embodiments, the programmable nuclease comprises a sequence with at least 92% identity to SEQ ID NO: 18. In some embodiments, the programmable nuclease comprises a sequence with at least 95% identity to SEQ ID NO: 18. In some embodiments, the programmable nuclease comprises a sequence with at least 97% identity to SEQ ID NO: 18. In some embodiments, the programmable nuclease comprises a sequence with at least 98% identity to SEQ ID NO: 18. In some embodiments, the programmable nuclease comprises a sequence with at least 99% identity to SEQ ID NO: 18. In some embodiments, the programmable nuclease comprises a sequence of SEQ ID NO: 18.


In some embodiments, the programmable nuclease comprises a sequence with at least 70% identity to SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 75% identity to SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 80% identity to SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 85% identity to SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 90% identity to SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 92% identity to SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 95% identity to SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 97% identity to SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 98% identity to SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 99% identity to SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence of SEQ ID NO: 105.


In some embodiments, the programmable nuclease comprises a sequence with at least 70% identity to the N-terminal 717 amino acid residues of SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 75% identity to the N-terminal 717 amino acid residues of SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 80% identity to the N-terminal 717 amino acid residues of SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 85% identity to the N-terminal 717 amino acid residues of SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 90% identity to the N-terminal 717 amino acid residues of SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 95% identity to the N-terminal 717 amino acid residues of SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 98% identity to the N-terminal 717 amino acid residues of SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence with at least 99% identity to the N-terminal 717 amino acid residues of SEQ ID NO: 105. In some embodiments, the programmable nuclease comprises a sequence of the N-terminal 717 amino acid residues of SEQ ID NO: 105.


In some embodiments, the programmable nuclease comprises a sequence with at least 70% identity to SEQ ID NO: 106. In some embodiments, the programmable nuclease comprises a sequence with at least 75% identity to SEQ ID NO: 106. In some embodiments, the programmable nuclease comprises a sequence with at least 80% identity to SEQ ID NO: 106. In some embodiments, the programmable nuclease comprises a sequence with at least 85% identity to SEQ ID NO: 106. In some embodiments, the programmable nuclease comprises a sequence with at least 90% identity to SEQ ID NO: 106. In some embodiments, the programmable nuclease comprises a sequence with at least 95% identity to SEQ ID NO: 106. In some embodiments, the programmable nuclease comprises a sequence with at least 98% identity to SEQ ID NO: 106. In some embodiments, the programmable nuclease comprises a sequence with at least 99% identity to SEQ ID NO: 106. In some embodiments, the programmable nuclease comprises a sequence of SEQ ID NO: 106.


In some embodiments, the programmable nuclease comprises a sequence with at least 70% identity to SEQ ID NO: 107. In some embodiments, the programmable nuclease comprises a sequence with at least 75% identity to SEQ ID NO: 107. In some embodiments, the programmable nuclease comprises a sequence with at least 80% identity to SEQ ID NO: 107. In some embodiments, the programmable nuclease comprises a sequence with at least 85% identity to SEQ ID NO: 107. In some embodiments, the programmable nuclease comprises a sequence with at least 90% identity to SEQ ID NO: 107. In some embodiments, the programmable nuclease comprises a sequence with at least 95% identity to SEQ ID NO: 107. In some embodiments, the programmable nuclease comprises a sequence with at least 98% identity to SEQ ID NO: 107. In some embodiments, the programmable nuclease comprises a sequence with at least 99% identity to SEQ ID NO: 107. In some embodiments, the programmable nuclease comprises a sequence of SEQ ID NO: 107.


The programmable nucleases disclosed herein can be codon optimized for expression in a specific cell, for example, a bacterial cell, a plant cell, a eukaryotic cell, an animal cell, a mammalian cell, or a human cell. In some embodiments, the programmable nuclease is codon optimized for a human cell.


The programmable nucleases presented in TABLE 1 or variants or fragments thereof comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO: 107 can comprise nicking activity. Compositions and methods of the disclosure can comprise a programmable nickase comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO: 107. Compositions and methods of the disclosure can comprise a programmable nickase comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 2. Compositions and methods of the disclosure can comprise a programmable nickase comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 4. Compositions and methods of the disclosure can comprise a programmable nickase comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 11. Compositions and methods of the disclosure can comprise a programmable nickase comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 17. Compositions and methods of the disclosure can comprise a programmable nuclease comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 18.


The programmable nucleases presented in TABLE 1 or variants thereof comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO: 107 can comprise double-strand DNA cleavage activity. Compositions and methods of the disclosure can comprise a programmable nuclease capable of introducing a double-strand break in a target DNA sequence and comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO: 107. Compositions and methods of the disclosure can comprise a programmable nuclease with double-strand DNA cleaving activity and comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 12. Compositions and methods of the disclosure can comprise a programmable nuclease with double-strand DNA cleaving activity and comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 2. Compositions and methods of the disclosure can comprise a programmable nickase comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 4. Compositions and methods of the disclosure can comprise a programmable nuclease with double-strand DNA cleaving activity and comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 11.


The programmable nucleases presented in TABLE 1 or variants thereof comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with any one of SEQ ID NO: 1-SEQ ID NO: 47 and SEQ ID NO. 105 can comprise nickase activity and double-strand DNA cleavage activity. The ratio of the nickase activity to the double-strand DNA cleavage activity can be modulated depending on the reaction conditions including, for example, the RNP complexing temperature and the crRNA repeat sequence in the guide nucleic acid. In some embodiments, nickase activity is reduced when the RNP complexing temperature is room temperature, for example 20 to 22° C., compared to when the RNP complexing temperature is 37° C. In some embodiments, the double-strand DNA cleavage activity is insensitive to RNP complexing at 37° C. compared to room temperature, or the double-strand DNA cleavage activity is reduced by 10%, 20% or 30% when complexed with a guide RNA at room temperature as compared to when complexed at 37° C. In a preferred embodiment, double-strand cleavage activity is similar when the RNP complexing temperature is room temperature and when it is 37° C.


The programmable nucleases presented in TABLE 1 or variants thereof comprising at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO: 107 can comprise reduced or substantially no nucleic acid cleavage activity.


In some embodiments, the N-terminal amino acid sequence of the programmable nuclease is not MISKMIKPTV (SEQ ID NO: 113). In some embodiments, the programmable nuclease does not include the amino acid sequence MISKMIKPTV (SEQ ID NO: 114).


In some embodiments, the N-terminal amino acid sequence of the programmable nuclease is not MISK (SEQ ID NO: 115). In some embodiments, the programmable nuclease does not include the amino acid sequence MISK (SEQ ID NO: 115).


In some embodiments, a composition comprises a first programmable nuclease described herein and a second programmable nuclease described herein. In some embodiments, a complex comprises a first programmable nuclease described herein and a second programmable nuclease described herein. In preferred embodiments, a complex comprises a first programmable nuclease described herein and a second programmable nuclease described herein, wherein the first and second programmable nucleases are the same programmable nuclease. In some embodiments, the first and second programmable nucleases form a dimer. In some preferred embodiments, the first and second programmable nucleases form a homodimer.


In some embodiments, a dimer comprises a first programmable nuclease described herein and a second programmable nuclease described herein. In preferred embodiments, the dimer is a homodimer wherein the first and second programmable nucleases are the same.


In some embodiments, a programmable nuclease may be a programmable nickase. The present disclosure provides compositions of programmable nickases, capable of introducing a break in a single strand of a double stranded DNA (dsDNA) (“nicking”). In some embodiments the programmable nickase is a programmable DNA nickase. Said programmable nickases can be coupled to a guide nucleic acid that targets a particular region of interest in the dsDNA. In some embodiments, two programmable nickases are combined and delivered together to generate two strand breaks. For example, a first programmable nickase can be targeted to and nicks a first region of dsDNA and a second programmable nickase can be targeted to and nicks a second region of the same dsDNA on the opposing strand. When combined and delivered together to generate nicks on opposing strands of the dsDNA, two strand breaks in the dsDNA can be generated. The strand breaks can be repaired and rejoined by non-homologous end joining (NHEJ) or homology directed repair (HDR). Thus, two programmable nickases disclosed herein can be combined to selectively edit nucleic acid sequences. This can be useful in any genome editing method, for example, used for therapeutic applications to treat a disease or disorder, or for agricultural applications.


In some embodiments, a programmable nuclease as disclosed herein can be used for genome editing purposes to generate strand breaks in order to excise a region of DNA or to subsequently introduce a region of DNA (e.g., donor DNA).


In some embodiments, the programmable nucleases (e.g., nickases) disclosed herein can be used in DNA Endonuclease Targeted CRISPR TransReporter (DETECTR) assays. In some embodiments, the programmable nuclease is a programmable nickase. A DETECTR assay can utilize the trans-cleavage abilities of some programmable nucleases to achieve fast and high-fidelity detection of a target nucleic acid in a sample. The target nucleic acid can be DNA or RNA. For example, following target DNA extraction from a biological sample, crRNA comprising a portion that is complementary to the target DNA of interest can bind to the target DNA sequence, initiating indiscriminate ssDNase activity by the programmable nuclease. In some embodiments, the extracted DNA is amplified by PCR or isothermal amplification reactions before contacting the DNA to the programmable nuclease complexed with a guide RNA. Upon hybridization with the target DNA, the trans-cleavage activity of the programmable nuclease is activated, which can then cleave an ssDNA fluorescence-quenching (FQ) reporter molecule. Cleavage of the reporter molecule can provide a fluorescent readout indicating the presence of the target DNA in the sample. In some embodiments, the programmable nucleases disclosed herein can be combined, or multiplexed, with other programmable nucleases in a DETECTR assay. The principles of the DETECTR assay are described in Chen et al. (Science 2018 Apr. 27; 360(6387):436-439) and can be modified to facilitate the use of the programmable nucleases described herein. In some embodiments, the programmable nucleases disclosed herein can be used in a specific high-sensitivity enzymatic reporter unlocking (SHERLOCK) assay. The principles of the SHERLOCK assay are described in Kellner et al. (Nat Protoc. 2019 October; 14(10):2986-3012) and can be modified to facilitate the use of the programmable nucleases described herein. 
Thus, some embodiments provide a method of detecting a target nucleic acid in a sample, the method comprising: contacting a sample comprising a target nucleic acid with (a) a programmable CasΦ nuclease disclosed herein, (b) a guide RNA comprising a region that binds to the programmable CasΦ nuclease and an additional region that binds to the target nucleic acid, and (c) a detector nucleic acid that does not bind the guide RNA; cleaving the detector nucleic acid by the programmable CasΦ nuclease; and detecting the target nucleic acid by measuring a signal produced by the cleavage of the detector nucleic acid. In preferred embodiments, the detector nucleic acid is a single stranded DNA reporter.


The programmable nucleases of the present disclosure can show enhanced activity, as measured by enhanced cleavage of an ssDNA-FQ reporter, under certain conditions in the presence of the target DNA. For example, the programmable nucleases of the present disclosure can have variable levels of activity based on a buffer formulation, a pH level, temperature, or salt. Buffers consistent with the present disclosure include phosphate buffers, Tris buffers, and HEPES buffers. Programmable nucleases of the present disclosure can show optimal activity in phosphate buffers, Tris buffers, and HEPES buffers.


Programmable nucleases can also exhibit varying levels of nickase or double-stranded cleavage activity at different pH levels. For example, enhanced cleavage can be observed between pH 7 and pH 9. In some embodiments, programmable nucleases of the present disclosure exhibit enhanced cleavage at about pH 7, about pH 7.1, about pH 7.2, about pH 7.3, about pH 7.4, about pH 7.5, about pH 7.6, about pH 7.7, about pH 7.8, about pH 7.9, about pH 8, about pH 8.1, about pH 8.2, about pH 8.3, about pH 8.4, about pH 8.5, about pH 8.6, about pH 8.7, about pH 8.8, about pH 8.9, about pH 9, from pH 7 to 7.5, from pH 7.5 to 8, from pH 8 to 8.5, from pH 8.5 to 9, or from pH 7 to 8.5.


In some embodiments, the programmable nucleases of the present disclosure exhibit enhanced cleavage of ssDNA-FQ reporters at a temperature of 25° C. to 50° C. in the presence of target DNA. For example, the programmable nucleases of the present disclosure can exhibit enhanced cleavage of an ssDNA-FQ reporter at about 25° C., about 26° C., about 27° C., about 28° C., about 29° C., about 30° C., about 31° C., about 32° C., about 33° C., about 34° C., about 35° C., about 36° C., about 37° C., about 38° C., about 39° C., about 40° C., about 41° C., about 42° C., about 43° C., about 44° C., about 45° C., about 46° C., about 47° C., about 48° C., about 49° C., about 50° C., from 30° C. to 40° C., from 35° C. to 45° C., or from 35° C. to 40° C.


The programmable nucleases of the present disclosure may not be sensitive to salt concentrations in a sample in the presence of the target DNA. Advantageously, said programmable nucleases can be active and capable of cleaving ssDNA-FQ-reporter sequences under varying salt concentrations from 25 nM salt to 200 mM salt. Various salts are consistent with this property of the programmable nucleases disclosed herein, including NaCl or KCl. The programmable nucleases of the present disclosure can be active at salt concentrations of from 25 nM to 500 nM salt, from 500 nM to 1000 nM salt, from 1000 nM to 2000 nM salt, from 2000 nM to 3000 nM salt, from 3000 nM to 4000 nM salt, from 4000 nM to 5000 nM salt, from 5000 nM to 6000 nM salt, from 6000 nM to 7000 nM salt, from 7000 nM to 8000 nM salt, from 8000 nM to 9000 nM salt, from 9000 nM to 0.01 mM salt, from 0.01 mM to 0.05 mM salt, from 0.05 mM to 0.1 mM salt, from 0.1 mM to 10 mM salt, from 10 mM to 100 mM salt, or from 100 mM to 500 mM salt. Thus, the programmable nucleases of the present disclosure can exhibit cleavage activity independent of the salt concentration in a sample.
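
Because the salt concentration sub-ranges recited above mix nM and mM units, the following minimal Python sketch normalizes each bound to nM and checks that consecutive sub-ranges share an endpoint. The names used (NM_PER_UNIT, to_nm, SUB_RANGES, is_contiguous) are hypothetical and serve only to illustrate the unit arithmetic, for example that 0.01 mM equals 10,000 nM.

```python
from decimal import Decimal

# Conversion factors to nanomolar; Decimal avoids floating-point rounding.
NM_PER_UNIT = {"nM": Decimal(1), "mM": Decimal(1_000_000)}


def to_nm(value: str, unit: str) -> Decimal:
    """Convert a concentration given as a string to nanomolar."""
    return Decimal(value) * NM_PER_UNIT[unit]


# Sub-ranges as recited in the paragraph above: (lower bound, upper bound).
SUB_RANGES = [
    (("25", "nM"), ("500", "nM")),
    (("500", "nM"), ("1000", "nM")),
    (("1000", "nM"), ("2000", "nM")),
    (("2000", "nM"), ("3000", "nM")),
    (("3000", "nM"), ("4000", "nM")),
    (("4000", "nM"), ("5000", "nM")),
    (("5000", "nM"), ("6000", "nM")),
    (("6000", "nM"), ("7000", "nM")),
    (("7000", "nM"), ("8000", "nM")),
    (("8000", "nM"), ("9000", "nM")),
    (("9000", "nM"), ("0.01", "mM")),
    (("0.01", "mM"), ("0.05", "mM")),
    (("0.05", "mM"), ("0.1", "mM")),
    (("0.1", "mM"), ("10", "mM")),
    (("10", "mM"), ("100", "mM")),
    (("100", "mM"), ("500", "mM")),
]


def is_contiguous(sub_ranges) -> bool:
    """True if each upper bound equals the next sub-range's lower bound."""
    normalized = [(to_nm(*lo), to_nm(*hi)) for lo, hi in sub_ranges]
    return all(
        prev_hi == next_lo
        for (_, prev_hi), (next_lo, _) in zip(normalized, normalized[1:])
    )


lowest_nm = to_nm(*SUB_RANGES[0][0])
highest_nm = to_nm(*SUB_RANGES[-1][1])
print(is_contiguous(SUB_RANGES), lowest_nm, highest_nm)
```

Under these conversions, the recited sub-ranges run from 25 nM to 500 mM.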


Programmable nucleases of the present disclosure can be capable of cleaving any ssDNA-FQ reporter, regardless of its sequence. The programmable nucleases provided herein can, thus, be capable of cleaving a universal ssDNA-FQ reporter. In some embodiments, the programmable nucleases provided herein cleave a homopolymer ssDNA-FQ reporter comprising 5 to 20 adenines, 5 to 20 thymines, 5 to 20 cytosines, or 5 to 20 guanines. Programmable nucleases of the present disclosure, thus, are capable of cleaving ssDNA-FQ reporters that are also cleaved by other programmable nucleases, as disclosed elsewhere herein, allowing for facile multiplexing of multiple programmable nickases and programmable nucleases in a single assay having a single ssDNA-FQ reporter.


Programmable nucleases of the present disclosure can bind a wild type protospacer adjacent motif (PAM) or a mutant PAM in a target DNA. In some embodiments, the programmable CasΦ nucleases of the present disclosure recognize and bind a protospacer adjacent motif (PAM) of 5′-TBN-3′, where B is one or more of C, G, or T. For example, programmable CasΦ nucleases of the present disclosure may recognize and bind a protospacer adjacent motif (PAM) of 5′-TTTN-3′. As another example, programmable CasΦ nucleases of the present disclosure may recognize and bind a protospacer adjacent motif (PAM) of 5′-TTN-3′. In some embodiments, the PAM is 5′-TTTA-3′, 5′-GTTK-3′, 5′-VTTK-3′, 5′-VTTS-3′, 5′-TTTS-3′ or 5′-VTTN-3′, where K is G or T, V is A, C or G, and S is C or G. In some embodiments, the PAM is 5′-GTTB-3′, wherein B is C, G, or T.
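
The degenerate PAM notation above can be illustrated with the following minimal Python sketch, which locates candidate PAM sites in a DNA string. The letter definitions for B, K, V and S follow the paragraph above, and N is taken to be any of A, C, G or T under the standard IUPAC convention. The function name find_pam_sites and the example DNA string are hypothetical. The sketch scans only the strand supplied, read 5′ to 3′, and does not model the protospacer position or the guide nucleic acid.

```python
import re

# Degenerate nucleotide codes used in the PAM definitions above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any nucleotide (standard IUPAC convention)
    "B": "CGT",
    "K": "GT",
    "V": "ACG",
    "S": "CG",
}


def find_pam_sites(dna: str, pam: str) -> list[int]:
    """Return 0-based start positions where `pam` matches `dna` (5' to 3')."""
    # Build one character class per PAM position, e.g. "GTTK" -> [G][T][T][GT].
    body = "".join(f"[{IUPAC[code]}]" for code in pam.upper())
    # A lookahead lets overlapping matches be reported.
    pattern = re.compile(f"(?={body})")
    return [match.start() for match in pattern.finditer(dna.upper())]


# Hypothetical target strand used only for illustration.
target = "ACGTTGAATTTCC"
for pam in ("GTTK", "TTTS", "VTTN", "NTTN"):
    print(pam, find_pam_sites(target, pam))
```

A lookahead pattern is used so that overlapping candidate sites, which can occur with permissive PAMs such as 5′-NTTN-3′, are all reported.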


In some embodiments of the present disclosure, the programmable CasΦ nucleases recognize and bind a PAM of 5′-NTTN-3′.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 2, the programmable CasΦ nuclease or a variant recognizes a 5′-GTTK-3′ PAM. In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 2, the programmable CasΦ nuclease or a variant recognizes a 5′-NTTN-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 4, the programmable CasΦ nuclease or a variant recognizes a 5′-VTTK-3′ PAM. In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 4, the programmable CasΦ nuclease or a variant recognizes a 5′-NTTN-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 11, the programmable CasΦ nuclease or a variant recognizes a 5′-VTTS-3′ PAM. In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 11, the programmable CasΦ nuclease or a variant recognizes a 5′-NTTN-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 12, the programmable CasΦ nuclease or a variant recognizes a 5′-TTTS-3′ PAM. In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 12, the programmable CasΦ nuclease or a variant recognizes a 5′-NTTN-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 18, the programmable CasΦ nuclease or a variant recognizes a 5′-VTTN-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 20, the programmable CasΦ nuclease or a variant recognizes a 5′-NTNN-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 20, the programmable CasΦ nuclease or a variant recognizes a 5′-TTN-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 26, the programmable CasΦ nuclease or a variant recognizes a 5′-NTTG-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 32, the programmable CasΦ nuclease or a variant recognizes a 5′-GTTB-3′ PAM, wherein B is C, G, or T.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 42, the programmable CasΦ nuclease or a variant recognizes a 5′-GTTN-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 41, the programmable CasΦ nuclease or a variant recognizes a 5′-NTTN-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 24, the programmable CasΦ nuclease or a variant recognizes a 5′-NTNN-3′ PAM.


In some embodiments, when the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 25, the programmable CasΦ nuclease or a variant recognizes a 5′-NTNN-3′ PAM.
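
The PAM motifs listed above use IUPAC degenerate nucleotide codes (K, S, V, B, N). As a minimal sketch of how such a motif could be checked against a candidate site, the following assumes standard IUPAC meanings for those codes; the function names `matches_pam` and `find_pam_sites`, and the dictionary `PAM_BY_SEQ_ID`, are hypothetical helpers introduced for illustration and are not part of the disclosure. The dictionary holds only an illustrative subset of the narrower PAMs stated above.

```python
# Standard IUPAC nucleotide codes for the symbols used in the PAM motifs.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT",    # G or T
    "S": "CG",    # C or G
    "V": "ACG",   # A, C, or G (not T)
    "B": "CGT",   # C, G, or T (not A)
    "N": "ACGT",  # any base
}

# Illustrative subset: narrower PAM stated in the text for some SEQ ID NOs.
PAM_BY_SEQ_ID = {
    2: "GTTK", 4: "VTTK", 11: "VTTS", 12: "TTTS",
    18: "VTTN", 26: "NTTG", 32: "GTTB", 42: "GTTN",
}


def matches_pam(site: str, pam: str) -> bool:
    """Return True if a candidate site (5'->3') satisfies a degenerate PAM."""
    site = site.upper()
    if len(site) != len(pam):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, pam))


def find_pam_sites(seq: str, pam: str) -> list:
    """Return 0-based start indices of every PAM match on the given strand."""
    k = len(pam)
    return [i for i in range(len(seq) - k + 1) if matches_pam(seq[i:i + k], pam)]


if __name__ == "__main__":
    print(matches_pam("GTTG", PAM_BY_SEQ_ID[2]))    # GTTK accepts G or T last
    print(find_pam_sites("AAGTTGCCTTTGA", "NTTG"))
```

The scan covers only the strand supplied; a search of both strands would also scan the reverse complement.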


The programmable nucleases and other reagents (e.g., a guide nucleic acid) can be formulated in a buffer disclosed herein. A wide variety of buffered solutions are compatible with the methods, compositions, reagents, enzymes, and kits disclosed herein. Buffers are compatible with different programmable nucleases described herein. Any of the methods, compositions, reagents, enzymes, or kits disclosed herein may comprise a buffer. These buffers may be compatible with the other reagents, samples, and support mediums as described herein for detection of an ailment, such as a disease, cancer, or genetic disorder, or genetic information, such as for phenotyping, genotyping, or determining ancestry. A buffer, as described herein, can enhance the cis- or trans-cleavage rates of any of the programmable nucleases described herein. The buffer can increase the discrimination of the programmable nucleases for the target nucleic acid. The methods as described herein can be performed in the buffer.


In some embodiments, a buffer may comprise one or more of a buffering agent, a salt, a crowding agent, or a detergent, or any combination thereof. A buffer may comprise a reducing agent. A buffer may comprise a competitor. Exemplary buffering agents include HEPES, TRIS, MES, ADA, PIPES, ACES, MOPSO, BIS-TRIS propane, BES, MOPS, TES, DIPSO, Trizma, TRICINE, GLY-GLY, HEPPS, BICINE, TAPS, AMPD, AMPSO, CHES, CAPSO, AMP, CAPS, phosphate, citrate, acetate, imidazole, or any combination thereof. A buffering agent may be compatible with a programmable nuclease. A buffer compatible with a programmable nuclease may comprise a buffering agent at a concentration of from 1 mM to 200 mM, from 10 mM to 30 mM, or about 20 mM. A composition (e.g., a composition comprising a programmable nuclease) may have a pH of from 2.5 to 3.5, from 3 to 4, from 3.5 to 4.5, from 4 to 5, from 4.5 to 5.5, from 5 to 6, from 5.5 to 6.5, from 6 to 7, from 6.5 to 7.5, from 7 to 8, from 7.5 to 8.5, from 8 to 9, from 8.5 to 9.5, from 9 to 10, or from 9.5 to 10.5.


A buffer may comprise a salt. Exemplary salts include NaCl, KCl, magnesium acetate, potassium acetate, CaCl2 and MgCl2. A buffer may comprise potassium acetate, magnesium acetate, sodium chloride, magnesium chloride, or any combination thereof. A buffer compatible with a programmable nuclease may comprise a salt at a concentration of from 5 mM to 100 mM. A buffer compatible with a programmable nuclease may comprise a salt at a concentration of from 5 mM to 10 mM. In some embodiments, a buffer compatible with a programmable nuclease comprises a salt from 1 mM to 60 mM. In some embodiments, a buffer compatible with a programmable nuclease comprises a salt from 1 mM to 10 mM. In some embodiments, a buffer compatible with a programmable nuclease comprises a salt at about 105 mM. In some embodiments, a buffer compatible with a programmable nuclease comprises a salt at about 55 mM. In some embodiments, a buffer compatible with a programmable nuclease comprises a salt at about 7 mM. In some embodiments, a buffer compatible with a programmable nuclease comprises a salt, wherein the salt comprises potassium acetate and magnesium acetate. In some embodiments, a buffer compatible with a programmable nuclease comprises a salt, wherein the salt comprises sodium chloride and magnesium chloride. In some embodiments, a buffer compatible with a programmable nuclease comprises a salt, wherein the salt comprises potassium chloride and magnesium chloride.


A buffer may comprise a crowding agent. Exemplary crowding agents include glycerol and bovine serum albumin. A buffer may comprise glycerol. A crowding agent may reduce the volume of solvent available for other molecules in the solution, thereby increasing the effective concentrations of said molecules. A buffer compatible with a programmable nuclease may comprise a crowding agent at a concentration of from 0.01% (v/v) to 10% (v/v). A buffer compatible with a programmable nuclease may comprise a crowding agent at a concentration of from 0.5% (v/v) to 10% (v/v).


A buffer may comprise a detergent. Exemplary detergents include Tween, Triton-X, and IGEPAL. A buffer may comprise Tween, Triton-X, or any combination thereof. A buffer compatible with a programmable nuclease may comprise Triton-X. A buffer compatible with a programmable nuclease may comprise IGEPAL CA-630. In some embodiments, a buffer compatible with a programmable nuclease comprises a detergent at a concentration of 2% (v/v) or less. A buffer compatible with a programmable nuclease may comprise a detergent at a concentration of from 0.00001% (v/v) to 0.01% (v/v). A buffer compatible with a programmable nuclease may comprise a detergent at a concentration of about 0.01% (v/v).


A buffer may comprise a reducing agent. Exemplary reducing agents include dithiothreitol (DTT), β-mercaptoethanol (BME), and tris(2-carboxyethyl)phosphine (TCEP). A buffer compatible with a programmable nuclease may comprise DTT. A buffer compatible with a programmable nuclease may comprise a reducing agent at a concentration of from 0.01 mM to 100 mM. A buffer compatible with a programmable nuclease may comprise a reducing agent at a concentration of from 0.1 mM to 10 mM. A buffer compatible with a programmable nuclease may comprise a reducing agent at a concentration of from 0.5 mM to 2 mM. A buffer compatible with a programmable nuclease may comprise a reducing agent at a concentration of about 1 mM.


A buffer compatible with a programmable nuclease may comprise a competitor. Exemplary competitors compete with the target nucleic acid or the reporter nucleic acid for cleavage by the programmable nuclease. Exemplary competitors include heparin, imidazole, and salmon sperm DNA. A buffer compatible with a programmable nuclease may comprise a competitor at a concentration of from 1 μg/mL to 100 μg/mL. A buffer compatible with a programmable nuclease may comprise a competitor at a concentration of from 40 μg/mL to 60 μg/mL.
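
The concentration ranges given above for buffer components can be treated as a simple specification against which a recipe is checked. The following is a minimal sketch under stated assumptions: the key names, the `RANGES` table, and the function `out_of_range` are hypothetical and introduced only for illustration, and the table records only the broadest range quoted above for each component. Salts are left out because the text gives several separate values for them (for example, from 5 mM to 100 mM, from 1 mM to 60 mM, and about 105 mM) rather than one enclosing range.

```python
# Broadest range quoted in the text for each component (hypothetical key names).
RANGES = {
    "buffering_agent_mM": (1.0, 200.0),
    "crowding_agent_pct_vv": (0.01, 10.0),
    "detergent_pct_vv": (0.0, 2.0),        # "2% (v/v) or less"
    "reducing_agent_mM": (0.01, 100.0),
    "competitor_ug_per_mL": (1.0, 100.0),
}


def out_of_range(recipe: dict) -> list:
    """Return the names of components whose value lies outside the quoted range."""
    problems = []
    for name, value in recipe.items():
        if name not in RANGES:
            continue  # component not covered by this sketch (e.g., salts)
        low, high = RANGES[name]
        if not (low <= value <= high):
            problems.append(name)
    return problems


# Example recipe built from the "about" values mentioned in the text.
example = {
    "buffering_agent_mM": 20.0,
    "detergent_pct_vv": 0.01,
    "reducing_agent_mM": 1.0,
    "competitor_ug_per_mL": 50.0,
}

if __name__ == "__main__":
    print(out_of_range(example))
```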


In some embodiments, a programmable CasΦ nuclease is described as a “nickase” if the predominant cleavage product is a nicked nucleic acid when the target nucleic acid is a double-stranded nucleic acid. In some embodiments, a programmable CasΦ nuclease cleaves both strands of a double-stranded target nucleic acid. In some embodiments, the target nucleic acid is DNA. In some embodiments, the target nucleic acid is double-stranded DNA.


Where a programmable CasΦ nuclease disclosed herein cleaves both strands of a double-stranded target nucleic acid, the strand break may be a staggered cut with a 5′ overhang. In some embodiments, the 5′ overhang is an overhang of between 5 and 10 nucleotides. In some embodiments, the 5′ overhang is an overhang of 5 or 6 nucleotides. In some embodiments, the 5′ overhang is an overhang of 9 or 10 nucleotides.
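
The relationship between the two cut positions and the resulting overhang can be illustrated with a short calculation. This is a sketch under an assumed coordinate convention, not a description of how cut sites are determined in the disclosure: both cut positions are counted along the top strand written 5′ to 3′, and a cut at position p falls between bases p-1 and p. The function name `overhang` is hypothetical.

```python
def overhang(top_cut: int, bottom_cut: int) -> tuple:
    """Classify the duplex end produced by one cut on each strand.

    Both positions use top-strand coordinates (top strand written 5'->3');
    a cut at position p falls between bases p-1 and p. If the bottom strand
    is cut further along than the top strand, the protruding single strand
    on each fragment ends in a 5' terminus, i.e. a 5' overhang.
    """
    length = abs(bottom_cut - top_cut)
    if length == 0:
        return ("blunt", 0)
    kind = "5' overhang" if top_cut < bottom_cut else "3' overhang"
    return (kind, length)


if __name__ == "__main__":
    # Hypothetical cut positions giving a 9-nucleotide 5' overhang.
    print(overhang(10, 19))
```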


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 20, the 5′ overhang is a 9 or 10 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 20, the 5′ overhang is a 9 or 10 nucleotide overhang. In further preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 20, the 5′ overhang is a 9 or 10 nucleotide overhang.


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 22, the 5′ overhang is a 9 or 10 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 22, the 5′ overhang is a 10 nucleotide overhang. In further preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 22, the 5′ overhang is a 10 nucleotide overhang.


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 28, the 5′ overhang is a 9 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 28, the 5′ overhang is a 9 nucleotide overhang. In further preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 28, the 5′ overhang is a 9 nucleotide overhang.


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 40, the 5′ overhang is a 10 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 40, the 5′ overhang is a 10 nucleotide overhang. In further embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 40, the 5′ overhang is a 10 nucleotide overhang.


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 37, the 5′ overhang is a 9 or 10 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 37, the 5′ overhang is a 9 or 10 nucleotide overhang. In further preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 37, the 5′ overhang is a 9 or 10 nucleotide overhang.


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 41, the 5′ overhang is a 9 or 10 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 41, the 5′ overhang is a 9 or 10 nucleotide overhang. In further preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 41, the 5′ overhang is a 9 or 10 nucleotide overhang.


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 12, the 5′ overhang is a 5 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 12, the 5′ overhang is a 5 nucleotide overhang. In further preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 12, the 5′ overhang is a 5 nucleotide overhang.


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 24, the 5′ overhang is a 6 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 24, the 5′ overhang is a 6 nucleotide overhang. In further preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 24, the 5′ overhang is a 6 nucleotide overhang.


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 25, the 5′ overhang is a 6 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 25, the 5′ overhang is a 6 nucleotide overhang. In further preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 25, the 5′ overhang is a 6 nucleotide overhang.


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 32, the 5′ overhang is a 6 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 32, the 5′ overhang is a 6 nucleotide overhang. In further preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 32, the 5′ overhang is a 6 nucleotide overhang.


In some embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with SEQ ID NO: 33, the 5′ overhang is a 6 nucleotide overhang. In preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises at least 90% sequence identity with SEQ ID NO: 33, the 5′ overhang is a 6 nucleotide overhang. In further preferred embodiments, where the programmable CasΦ nuclease or a variant thereof comprises the amino acid sequence of SEQ ID NO: 33, the 5′ overhang is a 6 nucleotide overhang.


In some embodiments, a programmable CasΦ nuclease rapidly cleaves a strand of a double-stranded target nucleic acid. In some embodiments, the programmable CasΦ nuclease cleaves the second strand of the target nucleic acid after it has cleaved the first strand of the target nucleic acid. The cleavage of target nucleic acid strands can be assessed in an in vitro cis-cleavage assay. To perform such an assay, the programmable CasΦ nuclease is complexed to its native crRNA, e.g. CasΦ.2 nuclease with the CasΦ.2 repeat, in a buffer comprising 50 mM potassium acetate, 20 mM Tris-acetate, 10 mM magnesium acetate, and 100 μg/ml BSA, at pH 7.9 at 25° C. The complexing is carried out for 20 minutes at room temperature, e.g. 20-22° C. The RNP is at a concentration of 200 nM. The target plasmid is a 2.2 kb super-coiled plasmid containing a target sequence, either 5′-TATTAAATACTCGTATTGCTGTTCGATTAT-3′ (SEQ ID NO: 116) or 5′-CACAGCTTGTCTGTAAGCGGATGCCATATG-3′ (SEQ ID NO: 117), which is immediately downstream of a 5′-GTTG-3′ or 5′-TTTG-3′ PAM. At time “0”, equal volumes of target plasmid, at 20 nM, and complexed RNP are mixed, so that the concentration of target plasmid is 10 nM and the concentration of complexed RNP is 100 nM. The incubation temperature is 37° C. The reaction is quenched at desired time points, e.g. 1, 3, 6, 15, 30 and 60 minutes, with a reaction quench comprising 1 mg/ml proteinase K, 0.08% SDS and 15 mM EDTA. The sample is then incubated for 30 minutes at 37° C. to deproteinize. The cleavage is quantified by agarose gel analysis.
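
The final concentrations stated for the assay follow from mixing equal volumes of the two stocks. As a brief worked check, the sketch below applies the usual dilution relation (final concentration equals stock concentration times stock volume divided by total volume); the function name `mix` and the unit volumes are illustrative assumptions.

```python
def mix(conc_a_nM: float, vol_a: float, conc_b_nM: float, vol_b: float) -> tuple:
    """Return the final concentrations of A and B after combining two solutions."""
    total = vol_a + vol_b
    return (conc_a_nM * vol_a / total, conc_b_nM * vol_b / total)


# Equal volumes (1 part each) of 20 nM target plasmid and 200 nM complexed RNP.
plasmid_nM, rnp_nM = mix(20.0, 1.0, 200.0, 1.0)

if __name__ == "__main__":
    print(plasmid_nM, rnp_nM)       # final plasmid and RNP concentrations in nM
    print(rnp_nM / plasmid_nM)      # molar excess of RNP over plasmid
```

Under these conditions the RNP is in ten-fold molar excess over the target plasmid.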


In some embodiments, a programmable CasΦ nuclease creates at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% of the maximum amount of nicked product within 1 minute, where the maximum amount of nicked product is the maximum amount detected within a 60 minute period from when the target plasmid is mixed with the programmable CasΦ nuclease. In preferred embodiments, at least 80% of the maximum amount of nicked product is created within 1 minute. In more preferred embodiments, at least 90% of the maximum amount of nicked product is created within 1 minute.


In some embodiments, a programmable CasΦ nuclease creates at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% of the maximum amount of linearized product within 1 minute, where the maximum amount of linearized product is the maximum amount detected within a 60 minute period from when the target plasmid is mixed with the programmable CasΦ nuclease. In preferred embodiments, at least 80% of the maximum amount of linearized product is created within 1 minute. In more preferred embodiments, at least 90% of the maximum amount of linearized product is created within 1 minute.
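
The "fraction of the maximum amount of product within 1 minute" criterion can be computed from a quantified time course. The sketch below shows one way this could be done; the function name `fraction_of_max_at` is hypothetical, and the band intensities are invented placeholder values for illustration only, not data from the disclosure.

```python
def fraction_of_max_at(time_course: dict, minute: int = 1, window: int = 60) -> float:
    """Fraction of the maximum product (within `window` minutes) present at `minute`.

    `time_course` maps a time point in minutes to a product amount in any
    consistent unit (e.g., a gel band intensity).
    """
    in_window = {t: v for t, v in time_course.items() if t <= window}
    peak = max(in_window.values())
    if peak == 0:
        return 0.0
    return in_window[minute] / peak


# Placeholder linearized-product intensities at the example time points.
linearized = {1: 45.0, 3: 48.0, 6: 50.0, 15: 50.0, 30: 49.0, 60: 47.0}

if __name__ == "__main__":
    frac = fraction_of_max_at(linearized)
    print(frac, frac >= 0.80)
```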


In some embodiments, a programmable CasΦ nuclease uses a co-factor. In some embodiments, the co-factor allows the programmable CasΦ nuclease to perform a function. In some embodiments, the function is pre-crRNA processing and/or target nucleic acid cleavage. As discussed in Jiang F. and Doudna J. A. (Annu. Rev. Biophys. 2017. 46:505-29), Cas9 uses divalent metal ions as co-factors. The suitability of a divalent metal ion as a co-factor can easily be assessed, such as by methods based on those described by Sundaresan et al. (Cell Rep. 2017 Dec. 26; 21(13): 3728-3739). In some embodiments, the co-factor is a divalent metal ion. In some embodiments, the divalent metal ion is selected from Mg2+, Mn2+, Zn2+, Ca2+, and Cu2+. In a preferred embodiment, the divalent metal ion is Mg2+. In some embodiments, a programmable CasΦ nuclease forms a complex with a divalent metal ion. In preferred embodiments, a programmable CasΦ nuclease forms a complex with Mg2+.


In some aspects, the disclosure provides a composition comprising a programmable CasΦ nuclease disclosed herein and a cell, preferably wherein the cell is a eukaryotic cell. In some embodiments, a programmable CasΦ nuclease disclosed herein is in a cell, preferably wherein the cell is a eukaryotic cell.


In some aspects, the disclosure provides a composition comprising a nucleic acid encoding a programmable CasΦ nuclease disclosed herein and a cell, preferably wherein the cell is a eukaryotic cell. In some embodiments, a nucleic acid encoding a programmable CasΦ nuclease disclosed herein is in a cell, preferably wherein the cell is a eukaryotic cell.


Guide Nucleic Acids

The methods and compositions of the disclosure may comprise a guide nucleic acid. The guide nucleic acid can bind to a target nucleic acid (e.g., a single strand of a target nucleic acid) or portion thereof. For example, the guide nucleic acid can bind to a target nucleic acid such as a nucleic acid from a bacterium, a virus, a parasite, a protozoan, a fungus, or another agent responsible for a disease, or an amplicon thereof, as described herein. The target nucleic acid can comprise a mutation, such as a single nucleotide polymorphism (SNP). A mutation can confer, for example, resistance to a treatment, such as antibiotic treatment. A mutation can confer a gene malfunction or gene knockout. A mutation can confer a disease, contribution to a disease, or risk for a disease, such as a liver disease or disorder, eye disease or disorder, cystic fibrosis, or muscle disease or disorder. The guide nucleic acid can bind to a target nucleic acid such as a nucleic acid, preferably DNA, from a cancer gene or gene associated with a genetic disorder, or an amplicon thereof, as described herein. The guide nucleic acid comprises a segment of nucleotides that is reverse complementary to the target nucleic acid. Often the guide nucleic acid binds specifically to the target nucleic acid. The target nucleic acid may be a reverse transcribed RNA, a DNA, a DNA amplicon, or a synthetic nucleic acid. The target nucleic acid can be a single-stranded DNA or DNA amplicon of a nucleic acid of interest. A guide nucleic acid may be a non-naturally occurring guide nucleic acid. A non-naturally occurring guide nucleic acid may comprise an engineered sequence having a repeat and a spacer that hybridizes to a target nucleic acid sequence of interest. A non-naturally occurring guide nucleic acid may be recombinantly expressed or chemically synthesized.


A guide nucleic acid (e.g. gRNA) may hybridize to a target sequence of a target nucleic acid. The guide nucleic acid can bind to a programmable nuclease.


In some embodiments, a gRNA comprises a crRNA. In some embodiments, a gRNA of a CasΦ polypeptide or variants thereof does not comprise a tracrRNA. As described by Jiang F. and Doudna J. A. (Annu. Rev. Biophys. 2017. 46:505-29), Cas9 cleavage activity requires a tracrRNA. A tracrRNA is a polynucleotide that hybridizes with a crRNA to allow crRNA maturation such that the crRNA can bind to the Cas nuclease and direct the Cas nuclease to a target sequence. In some embodiments, a programmable CasΦ nuclease disclosed herein does not require a tracrRNA to locate and/or cleave a target nucleic acid. A crRNA may comprise a repeat region. Specifically, the crRNA of the guide nucleic acid may comprise a repeat region and a spacer region. The repeat region refers to the sequence of the crRNA that binds to the programmable nuclease. The spacer region refers to the sequence of the crRNA that hybridizes to a sequence of the target nucleic acid. In some embodiments, the repeat region may comprise mutations or truncations with respect to the repeat sequences in pre-crRNA. The repeat sequence of the crRNA may interact with a programmable nuclease, allowing for the guide nucleic acid and the programmable nuclease to form a complex. This complex may be referred to as a ribonucleoprotein (RNP) complex. The crRNA may comprise a spacer sequence. The spacer sequence may hybridize to a target sequence of the target nucleic acid, where the target sequence is a segment of a target nucleic acid. The spacer sequence may be reverse complementary to the target sequence. In some cases, the spacer sequence may be sufficiently reverse complementary to a target sequence to allow for hybridization, but need not be 100% reverse complementary.
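
The notion of a spacer that is reverse complementary to a target sequence, fully or only in part, can be made concrete with a short calculation. The following is an illustrative sketch only: the function names `spacer_for_target` and `percent_complementarity` are hypothetical, the example sequences are arbitrary, both sequences are written 5′ to 3′, and only Watson-Crick pairs are counted (G·U wobble pairing is ignored as a simplifying assumption).

```python
# Watson-Crick partner, as RNA, for each base of a DNA target strand.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def spacer_for_target(target_dna: str) -> str:
    """RNA spacer (5'->3') fully reverse complementary to a DNA target (5'->3')."""
    return "".join(DNA_TO_RNA_COMPLEMENT[b] for b in reversed(target_dna.upper()))


def percent_complementarity(spacer_rna: str, target_dna: str) -> float:
    """Percentage of spacer positions that pair with the target strand."""
    if len(spacer_rna) != len(target_dna):
        raise ValueError("spacer and target must have the same length")
    perfect = spacer_for_target(target_dna)
    matches = sum(a == b for a, b in zip(spacer_rna.upper(), perfect))
    return 100.0 * matches / len(perfect)


if __name__ == "__main__":
    print(spacer_for_target("ATGC"))
    print(percent_complementarity("GCAA", "ATGC"))  # one mismatch out of four
```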


In some embodiments, a programmable nuclease may cleave a precursor RNA (“pre-crRNA”) to produce (or “process”) a guide RNA (gRNA), also referred to as a “mature guide RNA.” A programmable nuclease that cleaves pre-crRNA to produce a mature guide RNA is said to have pre-crRNA processing activity.


Programmable nucleases disclosed herein may process the repeat sequence of a crRNA, where the repeat sequence is the region of the crRNA that binds to the programmable nuclease. For example, crRNA may be delivered to a mammalian cell, e.g. a HEK293T cell, wherein the crRNA includes a full length repeat region which is 36 nucleotides in length, along with a programmable nuclease. The programmable nuclease then cleaves the repeat region of the crRNA so that the mature crRNA comprises a shorter repeat region (e.g. 24 nucleotides in length). Accordingly, in some embodiments, programmable nucleases disclosed herein are capable of cleaving the repeat region of a crRNA. In preferred embodiments, programmable nucleases disclosed herein are capable of cleaving the repeat region of a crRNA in mammalian cells.


The guide nucleic acid can bind specifically to the target nucleic acid. A guide nucleic acid can comprise a sequence that is, at least in part, reverse complementary to the sequence of a target nucleic acid.


A guide nucleic acid can comprise RNA, DNA, or a combination thereof. The term “gRNA” refers to a guide nucleic acid comprising RNA. A gRNA may include nucleosides that are not ribonucleic. In some embodiments, all nucleosides in a gRNA are ribonucleic. In some embodiments, some of the nucleosides in a gRNA are not ribonucleic. In embodiments where nucleosides in a gRNA are not ribonucleic, non-ribonucleic nucleosides may be naturally-occurring or non-naturally-occurring nucleosides. In some embodiments, inter-nucleoside links are phosphodiester bonds. In some embodiments, the inter-nucleoside link between at least two nucleosides in a guide nucleic acid is not a phosphodiester bond. In some embodiments, the inter-nucleoside link between at least two nucleosides is a non-natural inter-nucleoside linkage. Non-natural inter-nucleoside linkages include phosphorous and non-phosphorous inter-nucleoside linkages. Phosphorous inter-nucleoside linkages include phosphorothioate linkages and thiophosphate linkages. An inter-nucleoside linkage may comprise a “C3 spacer”. C3 spacers are known to the skilled person as comprising a chain of three carbon atoms.


Guide nucleic acids may be modified to improve genome editing efficiency, increase stability, reduce off-target effects, and/or increase the affinity of the guide nucleic acid for a CasΦ polypeptide disclosed herein. Modifications may include non-natural nucleotides and/or non-natural linkages. In addition or alternatively, one or more sugar moieties of the guide nucleic acid may be modified. Such sugar moiety modifications may include 2′-O-methyl (2′OMe), 2′-O-methoxyethyl, and 2′ fluoro. In some embodiments, editing efficiency, or genome editing efficiency, is determined by analyzing the frequency of indel mutations in a nucleic acid or gene knockout. In some embodiments, a flow cytometer or next generation sequencing may be used to analyze cells for indel mutations or gene knockout. In other embodiments, off-target effects may be detected using a flow cytometer, next generation sequencing, or CIRCLE-seq.


In some preferred embodiments, the first 3 nucleosides (or one of the first 3 nucleosides, or a combination of the first 3 nucleosides) from the 5′ end of the repeat region comprise a 2′-O-methyl modification and the linkages between the 3 nucleosides at the 3′ end of the spacer region comprise phosphorothioate linkages.


In some embodiments, the first nucleoside at the 5′ end of the repeat region comprises a 2′-O-methyl modification. In some embodiments, the first two nucleosides at the 5′ end of the repeat region comprise 2′-O-methyl modifications. In some embodiments, the first three nucleosides at the 5′ end of the repeat region comprise 2′-O-methyl modifications. In some embodiments, the last nucleoside at the 3′ end of the spacer region comprises a 2′-O-methyl modification. In some embodiments, the last two nucleosides at the 3′ end of the spacer region comprise 2′-O-methyl modifications. In some embodiments, the last three nucleosides at the 3′ end of the spacer region comprise 2′-O-methyl modifications.


In some embodiments, the first 3 nucleosides (or one of the first 3 nucleosides, or a combination of the first 3 nucleosides) from the 5′ end of the repeat region and the 3 nucleosides at the 3′ end of the spacer region comprise a 2′-O-methyl modification, and the linkages between the 3 nucleosides at the 3′ end of the spacer region comprise phosphorothioate linkages.


In some embodiments, the first 3 nucleosides (or one of the first 3 nucleosides, or a combination of the first 3 nucleosides) from the 5′ end of the repeat region and the 3 nucleosides at the 3′ end of the spacer region comprise a 2′ fluoro modification.


In some embodiments, the first nucleoside at the 5′ end of the repeat region comprises a 2′ fluoro modification. In some embodiments, the first two nucleosides at the 5′ end of the repeat region comprise 2′ fluoro modifications. In some embodiments, the first three nucleosides at the 5′ end of the repeat region comprise 2′ fluoro modifications. In some embodiments, the last nucleoside at the 3′ end of the spacer region comprises a 2′ fluoro modification. In some embodiments, the last two nucleosides at the 3′ end of the spacer region comprise 2′ fluoro modifications. In some embodiments, the last three nucleosides at the 3′ end of the spacer region comprise 2′ fluoro modifications. In preferred embodiments, the last three nucleosides at the 3′ end of the spacer region comprise 2′ fluoro modifications.


In preferred embodiments, the first two nucleosides at the 5′ end of the repeat region comprise 2′-O-methyl modifications, the first two nucleosides at the 5′ end of the repeat are linked by a phosphorothioate linkage, and the last three nucleosides at the 3′ end of the spacer region comprise 2′ fluoro modifications.
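The modification pattern of the preceding paragraph can be written down as a small data structure. The following Python sketch is illustrative only: the `Nucleoside` class, the function names, and the one-letter rendering (`m` for 2′-O-methyl, `f` for 2′ fluoro, `r` for unmodified ribose, `*` for a phosphorothioate linkage) are assumptions made for this example and are not notation defined in the disclosure. It also assumes the repeat region lies 5′ of the spacer region, as implied by the references to the 5′ end of the repeat and the 3′ end of the spacer.

```python
from dataclasses import dataclass


@dataclass
class Nucleoside:
    base: str
    sugar: str = "ribo"        # "ribo", "2OMe" (2'-O-methyl) or "2F" (2' fluoro)
    ps_to_next: bool = False   # phosphorothioate linkage to the following nucleoside


def modify_guide(repeat: str, spacer: str) -> list:
    """Apply the pattern: 2'-O-methyl on the first two repeat nucleosides,
    a phosphorothioate linkage between them, and 2' fluoro on the last
    three spacer nucleosides. Assumes repeat is 5' of spacer."""
    guide = [Nucleoside(b) for b in (repeat + spacer).upper().replace("T", "U")]
    if len(repeat) < 2 or len(spacer) < 3:
        raise ValueError("repeat needs >= 2 and spacer >= 3 nucleosides")
    for nucleoside in guide[:2]:
        nucleoside.sugar = "2OMe"
    guide[0].ps_to_next = True
    for nucleoside in guide[-3:]:
        nucleoside.sugar = "2F"
    return guide


def render(guide: list) -> str:
    """Render with the ad hoc prefixes described in the lead-in."""
    prefix = {"ribo": "r", "2OMe": "m", "2F": "f"}
    return "".join(
        prefix[n.sugar] + n.base + ("*" if n.ps_to_next else "") for n in guide
    )


# Toy sequences, chosen only to keep the output short (hypothetical).
print(render(modify_guide("GGAC", "UGCA")))
```

Keeping the sugar and linkage as separate fields makes it straightforward to express the other embodiments (for example, 2′-O-methyl at both ends) by changing only the index ranges.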


In some embodiments, the linkage between the two nucleosides at the 5′ end of the repeat region comprises a C3 spacer and the linkage between the two nucleosides at the 3′ end of the spacer region comprises a C3 spacer.


In some embodiments, the guide nucleic acid comprises ribonucleic nucleosides and deoxyribonucleic nucleosides. In some embodiments, the guide nucleic acid is a guide RNA wherein the first, eighth and ninth nucleosides from the 5′ end of the spacer region and the four nucleosides at the 3′ end of the spacer region are deoxyribonucleic nucleosides.


In some embodiments, the guide nucleic acid comprises a polyA tail. In some preferred embodiments, the guide nucleic acid comprises a polyA tail at the 3′ end of the spacer region.


In some embodiments, a plurality of modified guides (e.g., a combination of modified guides disclosed herein) are complexed with one or more programmable nucleases (e.g., one or more programmable nucleases disclosed herein). In some examples, one or more of the plurality of modified guides comprise any of the nucleoside modifications described herein. In some examples, one or more of the plurality of the modified guides comprise any length of repeat or spacer region described herein. In some examples, one or more of the plurality of the modified guides comprise a repeat or spacer length described herein, and a nucleoside modification described herein. In some embodiments, one or more of the plurality of modified guides comprise a repeat sequence from about 15 to about 20 nucleotides in length. In some embodiments, one or more of the plurality of modified guides comprise a spacer sequence or region from about 15 to about 20 nucleotides in length.


TABLE 2 provides illustrative crRNA sequences for use with the compositions and methods of the disclosure. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to any one of SEQ ID NO: 48-SEQ ID NO: 86, or a reverse complement thereof. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 49 or a reverse complement thereof. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 51 or a reverse complement thereof. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 52 or a reverse complement thereof. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 54 or a reverse complement thereof. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 57 or a reverse complement thereof.









TABLE 2
Illustrative crRNA sequences

| CasΦ ortholog | crRNA repeat sequence (shown as DNA), 5′-to-3′ | SEQ ID NO. |
|---|---|---|
| CasΦ.01 | GGAGAGATCTCAAACGATTGCTCGATTAGTCGAGAC | 48 |
| CasΦ.02 | GTCGGAACGCTCAACGATTGCCCCTCACGAGGGGAC | 49 |
| CasΦ.04 | ACCAAAACGACTATTGATTGCCCAGTACGCTGGGAC | 50 |
| CasΦ.07 | GGATCCAATCCTTTTTGATTGCCCAATTCGTTGGGAC | 51 |
| CasΦ.10 | GGATCTGAGGATCATTATTGCTCGTTACGACGAGAC | 52 |
| CasΦ.11 | CCTGCGAAACCTTTTGATTGCTCAGTACGCTGAGAC | 53 |
| CasΦ.12 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGAC | 54 |
| CasΦ.13 | GTAGAAGACCTCGCTGATTGCTCGGTGCGCCGAGAC | 55 |
| CasΦ.17 | ATGGCAACAGACTCTCATTGCGCGGTACGCCGCGAC | 56 |
| CasΦ.18 | ACCAAAACGACTATTGATTGCCCAGTACGCTGGGAC | 57 |
| CasΦ.19 | GTCGCTCTCTAACGCTTGCCCAGTACGCTGGGAC | 58 |
| CasΦ.20 | GCTGGAAGACTCAATGATGGCTCCTTACGAGGAGAC | 59 |
| CasΦ.21 | GGTTGAACCCTCAACAGATTGCTCGGTAAGCCGAGAC | 60 |
| CasΦ.22 | GGTTGAACCCTCAACAGATTGCTCGGTAAGCCGAGAC | 61 |
| CasΦ.23 | CTTGAAATCCTGTCAGATTGCTCCCTTCGGGGAGAC | 62 |
| CasΦ.24 | GCTGGAAGACTCAATGATGGCTCCTTACGAGGAGAC | 63 |
| CasΦ.25 | GCTGGAAGACTCAATGATGGCTCCTTACGAGGAGAC | 64 |
| CasΦ.26 | CTAGGAACGCACGCAGATTGCTCGGTACGCCGAGAC | 65 |
| CasΦ.27 | ATTGCAACGCCTAAAGATTGCTCGATACGTCGAGAC | 66 |
| CasΦ.28 | GTTCGGCRAYCCTTTGATTGCTCAGTACGCTGAGAC | 67 |
| CasΦ.29 | GTTGAACCTAGATCAGATGGCTCAGTACGCTGAGAC | 68 |
| CasΦ.30 | CCCTCAACACGTCAGAAATGCCCGGCACGCCGGGAC | 69 |
| CasΦ.31 | GTCGCAAGACTCGAATAATTGCCCCTCTATGGGGAC | 70 |
| CasΦ.32 | GCTGGGGACCGATCCTGATTGCTCGCTGCGGCGAGAC | 71 |
| CasΦ.33 | CTCTCAATGGATAACGATTGCTCTCTACGGAGAGAC | 72 |
| CasΦ.34 | GCTGGAAGACTCAATGATGGCTCCTTACGAGGAGAC | 73 |
| CasΦ.35 | GTTGAACCCTCAACAGATTGCTCGGTAAGCCGAGAC | 74 |
| CasΦ.36 | GTCGCAAGACTCGAATAATTGCCCCTCTATGGGGAC | 75 |
| CasΦ.37 | GTCGGAACGCTCAACGATTGCCCCTCACGAGGGGAC | 76 |
| CasΦ.38 | GTTGAACCTAGATCAGATGGCTCAGTACGCTGAGAC | 77 |
| CasΦ.39 | CTCTCAATGGATAACGATTGCTCTCTACGGAGAGAC | 78 |
| CasΦ.41 | ACTGAAACCACCAACGATTGCGCTCCTCGGAGCGAC | 79 |
| CasΦ.42 | ACCAAAACGACTATTGATTGCCCAGTACGCTGGGAC | 80 |
| CasΦ.43 | GTTGAACCTAGATCAGATGGCTCAGTACGCTGAGAC | 81 |
| CasΦ.44 | GTTGAACCCTCAACAGATTGCTCGGTAAGCCGAGAC | 82 |
| CasΦ.45 | GTTGAACCTAGATCAGATGGCTCAGTACGCTGAGAC | 83 |
| CasΦ.46 | GTCGGAACGCTCAACGATTGCCCCTCACGAGGGGAC | 84 |
| CasΦ.47 | GGTTGAACCCTCAACAGATTGCTCGGTAAGCCGAGAC | 85 |
| CasΦ.48 | GGTTGAACCCTCAACAGATTGCTCGGTAAGCCGAGAC | 86 |









In some embodiments, the programmable nuclease disclosed herein is used in conjunction with a specific crRNA sequence. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to any one of SEQ ID NO: 48-SEQ ID NO: 86, or a reverse complement thereof. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 49 or a reverse complement thereof. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 51 or a reverse complement thereof. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 52 or a reverse complement thereof. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 54 or a reverse complement thereof. In some embodiments, the crRNA sequence comprises at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to SEQ ID NO: 57 or a reverse complement thereof.


In some embodiments, the activity of a programmable CasΦ nuclease can be supported by a crRNA comprising any of the crRNA repeat sequences recited in TABLE 2. In some embodiments, the activity of a programmable CasΦ nuclease can be supported by a crRNA comprising a crRNA repeat sequence comprising at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to any one of SEQ ID NO: 48-SEQ ID NO: 86.
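As an illustration of the percent-identity thresholds recited above, the following Python sketch scores a query against a reference repeat sequence and against its reverse complement. It is a simplification under stated assumptions: it compares only equal-length, ungapped sequences position by position, whereas sequence identity is ordinarily determined with an alignment program; the function names are hypothetical; and ambiguity codes such as the R and Y in SEQ ID NO: 67 are treated as ordinary characters. The three sequences are copied from TABLE 2.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Repeat sequences from TABLE 2 (shown as DNA).
SEQ_54 = "CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGAC"  # CasΦ.12
SEQ_59 = "GCTGGAAGACTCAATGATGGCTCCTTACGAGGAGAC"  # CasΦ.20
SEQ_63 = "GCTGGAAGACTCAATGATGGCTCCTTACGAGGAGAC"  # CasΦ.24


def normalize(seq: str) -> str:
    """Upper-case and write RNA as DNA so U and T compare equal."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    return normalize(seq).translate(_COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Ungapped, position-by-position identity (simplifying assumption)."""
    a, b = normalize(a), normalize(b)
    if not a or len(a) != len(b):
        raise ValueError("sketch only handles non-empty, equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def meets_identity(query: str, reference: str, threshold: float = 70.0) -> bool:
    """True if query reaches the threshold against the reference
    or against the reverse complement of the reference."""
    best = max(
        percent_identity(query, reference),
        percent_identity(query, reverse_complement(reference)),
    )
    return best >= threshold


print(percent_identity(SEQ_59, SEQ_63))
print(meets_identity(SEQ_54, SEQ_59, threshold=70.0))
```

A production check would replace `percent_identity` with a gapped alignment so that repeats of different lengths (for example, SEQ ID NO: 58) can be compared.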


In some embodiments, the crRNA repeat sequence comprises a hairpin. In some embodiments, the hairpin is in the 3′ portion of the crRNA repeat sequence. The hairpin comprises a double-stranded stem portion and a single-stranded loop portion. In preferred embodiments, one strand of the stem portion comprises a CYC sequence and the other strand comprises a GRG sequence, wherein Y and R are complementary. In preferred embodiments, the crRNA repeat comprises a GAC sequence at the 3′ end. In more preferred embodiments, the G of the GAC sequence is in the stem portion of the hairpin. In some embodiments, each strand of the stem portion comprises 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides. In preferred embodiments, each strand of the stem portion comprises 3, 4 or 5 nucleotides. In some embodiments, the loop portion comprises 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides. In preferred embodiments, the loop portion comprises 2, 3, 4, 5 or 6 nucleotides. In most preferred embodiments, the loop portion comprises 4 nucleotides. In some embodiments, the nucleotides are naturally occurring nucleotides. In some embodiments, the nucleotides are synthetic nucleotides.
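The hairpin geometry described above can be checked computationally. The sketch below is a minimal illustration, not a structure-prediction method: it assumes Watson-Crick pairing only (no G-U wobble, no bulges), assumes the 3′ strand of the stem ends at the G of the terminal GAC, and reports the longest perfectly paired stem within the stated stem and loop length ranges. The function name and return format are hypothetical. The example input is the CasΦ.12 repeat from TABLE 2 (SEQ ID NO: 54).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def find_terminal_hairpin(repeat, stem_lengths=range(3, 11), loop_lengths=range(2, 11)):
    """Return the longest perfect stem/loop whose 3' strand ends at the G of a
    terminal GAC, or None if the repeat does not end in GAC or no stem is found."""
    seq = repeat.upper().replace("U", "T")
    if not seq.endswith("GAC"):
        return None
    stem_end = len(seq) - 2  # slice end just past the G of the terminal GAC
    best = None
    for stem in stem_lengths:
        for loop in loop_lengths:
            start = stem_end - 2 * stem - loop
            if start < 0:
                continue
            stem_5p = seq[start:start + stem]
            stem_3p = seq[stem_end - stem:stem_end]
            if stem_5p == reverse_complement(stem_3p):
                if best is None or stem > len(best["stem_5p"]):
                    best = {
                        "stem_5p": stem_5p,
                        "loop": seq[start + stem:stem_end - stem],
                        "stem_3p": stem_3p,
                    }
    return best


CAS_PHI_12_REPEAT = "CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGAC"  # SEQ ID NO: 54
print(find_terminal_hairpin(CAS_PHI_12_REPEAT))
```

For this input the sketch finds a five-nucleotide stem (CTCCT paired with AGGAG) around a four-nucleotide loop (TACG), with a CYC motif (CTC) on one strand and a GRG motif (GAG) on the other.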


In some cases, the guide nucleic acid is not naturally occurring and is made by artificial combination of otherwise separate segments of sequence. Often, the artificial combination is performed by chemical synthesis, by genetic engineering techniques, or by the artificial manipulation of isolated segments of nucleic acids. In some cases, the segment of a guide nucleic acid that comprises a sequence that is reverse complementary to the target nucleic acid is 20 nucleotides in length. A guide nucleic acid can have at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides reverse complementary to a target nucleic acid. In some cases, the guide nucleic acid can be 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length. For example, a guide nucleic acid may be at least 10 bases. In some embodiments, a guide nucleic acid may be from 10 to 50 bases. In some embodiments, a guide nucleic acid may be at least 25 bases. In some cases, the guide nucleic acid has from exactly or about 12 nucleotides (nt) to about 80 nt, from about 12 nt to about 50 nt, from about 12 nt to about 45 nt, from about 12 nt to about 40 nt, from about 12 nt to about 35 nt, from about 12 nt to about 30 nt, from about 12 nt to about 25 nt, from about 12 nt to about 20 nt, from about 12 nt to about 19 nt, from about 19 nt to about 20 nt, from about 19 nt to about 25 nt, from about 19 nt to about 30 nt, from about 19 nt to about 35 nt, from about 19 nt to about 40 nt, from about 19 nt to about 45 nt, from about 19 nt to about 50 nt, from about 19 nt to about 60 nt, from about 20 nt to about 25 nt, from about 20 nt to about 30 nt, from about 20 nt to about 35 nt, from about 20 nt to about 40 nt, from about 20 nt to about 45 nt, from about 20 nt to about 50 nt, or from about 20 nt to about 60 nt reverse complementary to a target nucleic acid.
In some cases, the guide nucleic acid has from about 10 nt to about 60 nt, from about 20 nt to about 50 nt, or from about 30 nt to about 40 nt reverse complementary to a target nucleic acid. It is understood that the sequence of a guide nucleic acid need not be 100% reverse complementary to that of its target nucleic acid to be specifically hybridizable, hybridizable, or bind specifically. The guide nucleic acid can have a sequence comprising at least one uracil in a region from nucleic acid residue 5 to 20 that is reverse complementary to a modification variable region in the target nucleic acid. The guide nucleic acid, in some cases, has a sequence comprising at least one uracil in a region from nucleic acid residue 5 to 9, 10 to 14, or 15 to 20 that is reverse complementary to a modification variable region in the target nucleic acid. The guide nucleic acid can have a sequence comprising at least one uracil in a region from nucleic acid residue 5 to 20 that is reverse complementary to a methylation variable region in the target nucleic acid. The guide nucleic acid, in some cases, has a sequence comprising at least one uracil in a region from nucleic acid residue 5 to 9, 10 to 14, or 15 to 20 that is reverse complementary to a methylation variable region in the target nucleic acid. The guide nucleic acid can hybridize with a target nucleic acid.


In some instances, compositions comprise shorter versions of the guide nucleic acids disclosed herein. For instance, the guide nucleic acid sequence may consist of a portion of a guide nucleic acid disclosed herein. In some instances, shorter versions may provide enhanced activity relative to their longer versions. Examples of longer versions and shorter versions of guide RNA for CasΦ.12 are shown in Tables I, K, M, O, Q, S, U, and W, and Tables AB-AF, respectively, wherein the shorter versions are produced by removing sixteen nucleotides from the 5′ end of the long version and three nucleotides from the 3′ end of the long version. In some instances, the long version is a CasΦ.32 guide nucleic acid described in Tables J, L, N, P, R, T, V, X, and the short version is a guide nucleic acid without the sixteen nucleotides at the 5′ end of the long version and without the three nucleotides at the 3′ end of the long version.
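The trimming rule stated above (remove sixteen nucleotides from the 5′ end and three from the 3′ end of the long version) is simple to express in code. The Python sketch below is illustrative only: the function name is hypothetical, and the long guide is assembled here by joining the CasΦ.12 repeat (SEQ ID NO: 54) to the TRAC spacer R3040 (SEQ ID NO: 118), a pairing chosen for the example rather than taken from the referenced tables. It assumes the repeat is 5′ of the spacer.

```python
CAS_PHI_12_REPEAT = "CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGAC"  # SEQ ID NO: 54, 36 nt
SPACER_R3040 = "TGGATATCTGTGGGACAAGA"                       # SEQ ID NO: 118, 20 nt


def shorten_guide(long_guide: str, trim_5p: int = 16, trim_3p: int = 3) -> str:
    """Drop trim_5p nucleotides from the 5' end and trim_3p from the 3' end."""
    if len(long_guide) <= trim_5p + trim_3p:
        raise ValueError("guide is too short to trim")
    return long_guide[trim_5p:len(long_guide) - trim_3p]


long_guide = CAS_PHI_12_REPEAT + SPACER_R3040  # assumed layout: repeat, then spacer
short_guide = shorten_guide(long_guide)
print(len(long_guide), len(short_guide))
print(short_guide)
```

With a 36-nucleotide repeat and a 20-nucleotide spacer, this trimming leaves a 20-nucleotide repeat portion followed by a 17-nucleotide spacer portion.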


The guide nucleic acid (e.g., a non-naturally occurring guide nucleic acid) can be selected from a group of guide nucleic acids that have been tiled against the nucleic acid sequence of a strain of an infection or genomic locus of interest. The guide nucleic acid can be selected from a group of guide nucleic acids that have been tiled against the nucleic acid sequence of a target nucleic acid, for example, a strain of HPV16 or HPV18. Often, guide nucleic acids that are tiled against the nucleic acid of a strain of an infection or genomic locus of interest can be pooled for use in a method described herein. Often, these guide nucleic acids are pooled for detecting a target nucleic acid in a single assay. The pooling of guide nucleic acids that are tiled against a single target nucleic acid can enhance the detection of the target nucleic acid using the methods described herein. The pooling of guide nucleic acids that are tiled against a single target nucleic acid can ensure broad coverage of the target nucleic acid within a single reaction using the methods described herein. The tiling, for example, is sequential along the target nucleic acid. Sometimes, the tiling is overlapping along the target nucleic acid. In some instances, the tiling comprises gaps between the tiled guide nucleic acids along the target nucleic acid. In some instances, the tiling of the guide nucleic acids is non-sequential. Often, a method for detecting a target nucleic acid comprises contacting a target nucleic acid to a pool of guide nucleic acids and a programmable nuclease or nickase as disclosed herein, wherein a guide nucleic acid sequence of the pool of guide nucleic acids has a sequence selected from a group of tiled guide nucleic acids that correspond to a nucleic acid sequence of a target nucleic acid; and assaying for a signal produced by cleavage of at least some nucleic acids of a reporter of a population of nucleic acids of a reporter.
Pooling of guide nucleic acids can ensure broad spectrum identification, or broad coverage, of a target species within a single reaction. This can be particularly helpful in diseases or indications, like sepsis, that may be caused by multiple organisms.
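The sequential, overlapping, and gapped tiling schemes described above differ only in the step between consecutive windows. The following Python sketch illustrates this under stated assumptions: the function name, window length, and toy target are hypothetical, and the sketch ignores any nuclease-specific sequence requirements at each site, so it shows the tiling geometry rather than a complete guide-design procedure. Each spacer is emitted as the reverse complement of its target window.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tile_spacers(target: str, window: int = 20, step: int = 20):
    """Return (start, target_window, spacer) tuples along the target.

    step == window -> sequential tiling
    step <  window -> overlapping tiling
    step >  window -> tiling with gaps between windows
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    tiles = []
    for start in range(0, len(target) - window + 1, step):
        site = target[start:start + window].upper()
        tiles.append((start, site, reverse_complement(site)))
    return tiles


toy_target = "ACGT" * 12 + "AC"  # hypothetical 50-nt target

for label, step in (("sequential", 20), ("overlapping", 10), ("gapped", 25)):
    starts = [start for start, _, _ in tile_spacers(toy_target, window=20, step=step)]
    print(label, starts)
```

The resulting spacers could then be pooled; a non-sequential tiling would simply select an arbitrary subset or ordering of the returned windows.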


In some embodiments, the spacer sequence is between 10 and 35 nucleotides in length, between 10 and 30 nucleotides in length, between 15 and 30 nucleotides in length, between 10 and 25 nucleotides in length, between 15 and 25 nucleotides in length, between 17 and 30 nucleotides in length, between 17 and 25 nucleotides in length, between 17 and 22 nucleotides in length, or between 17 and 20 nucleotides in length. In preferred embodiments, the spacer sequence is between 17 and 25 nucleotides in length. In more preferred embodiments, the spacer sequence is between 17 and 20 nucleotides in length. In most preferred embodiments, the spacer sequence is 17 nucleotides in length.


In some embodiments, the repeat sequence is between 15 and 40 nucleotides in length, between 15 and 36 nucleotides in length, between 18 and 36 nucleotides in length, between 18 and 30 nucleotides in length, between 18 and 25 nucleotides in length, between 18 and 22 nucleotides in length, or between 18 and 20 nucleotides in length. In preferred embodiments, the repeat sequence is between 20 and 22 nucleotides in length. In more preferred embodiments, the repeat sequence is 20 nucleotides in length.


The spacer region of guide nucleic acids for CasΦ polypeptides disclosed herein comprises a seed region. In some embodiments, the seed regions do not tolerate mismatches in the complementarity of a spacer and a target sequence within about 1 to about 20 nucleotides from the 5′ end of a spacer sequence. The seed region starts from the 5′ end of the spacer sequence and is a region in which mismatches in the complementarity between the spacer sequence and the target sequence are not tolerated when the guide nucleic acid is bound to a CasΦ polypeptide, such that the guide nucleic acid does not hybridize to the target sequence to allow cleavage of the target nucleic acid by the CasΦ polypeptide. In some embodiments, the seed region comprises between 10 and 20 nucleosides, between 12 and 20 nucleosides, between 14 and 20 nucleosides, between 14 and 18 nucleosides, between 10 and 16 nucleosides, between 12 and 16 nucleosides, or between 14 and 16 nucleosides. In preferred embodiments, the seed region comprises 16 nucleotides.
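As a rough illustration of the seed-region concept, the Python sketch below lists mismatch positions between a spacer and a candidate site and flags whether any falls within the first 16 positions from the 5′ end of the spacer. Several simplifications are assumed: the candidate site is written in the same 5′-to-3′ sense as the spacer (so positions compare directly), U and T are treated as equivalent, and the function names and the two mutated sites are hypothetical. The sketch only locates mismatches; whether a given mismatch actually abolishes cleavage is an empirical question.

```python
def mismatch_positions(spacer: str, site: str) -> list:
    """1-based positions, counted from the 5' end of the spacer, where the
    spacer and a same-sense candidate site differ."""
    s = spacer.upper().replace("U", "T")
    t = site.upper().replace("U", "T")
    if len(s) != len(t):
        raise ValueError("spacer and site must be the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(s, t)) if x != y]


def has_seed_mismatch(spacer: str, site: str, seed_len: int = 16) -> bool:
    """True if any mismatch lies within the first seed_len positions."""
    return any(pos <= seed_len for pos in mismatch_positions(spacer, site))


SPACER_R3040 = "TGGATATCTGTGGGACAAGA"   # SEQ ID NO: 118, shown as DNA
site_distal = "TGGATATCTGTGGGACATGA"    # hypothetical site, differs at position 18
site_seed = "TGCATATCTGTGGGACAAGA"      # hypothetical site, differs at position 3

print(mismatch_positions(SPACER_R3040, site_distal), has_seed_mismatch(SPACER_R3040, site_distal))
print(mismatch_positions(SPACER_R3040, site_seed), has_seed_mismatch(SPACER_R3040, site_seed))
```

Changing `seed_len` lets the same check be run for the other seed lengths recited above.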


A programmable nuclease of the present disclosure may be activated to exhibit cleavage activity (e.g., cis-cleavage of a target nucleic acid or trans-cleavage of a collateral nucleic acid) upon binding of a ribonucleoprotein (RNP) complex to a target nucleic acid, in which the spacer of the crRNA of the gRNA hybridizes to the target nucleic acid.









TABLE A
Spacer sequences of gRNAs targeting human TRAC in T cells

| Name | Spacer sequence (5′ --> 3′), shown as DNA | Target | SEQ ID NO. |
|---|---|---|---|
| R3040 | TGGATATCTGTGGGACAAGA | TRAC | 118 |
| R3041 | TCCCACAGATATCCAGAACC | TRAC | 119 |
| R3042 | GAGTCTCTCAGCTGGTACAC | TRAC | 120 |
| R3043 | AGAGTCTCTCAGCTGGTACA | TRAC | 121 |
| R3044 | TCACTGGATTTAGAGTCTCT | TRAC | 122 |
| R3045 | AGAATCAAAATCGGTGAATA | TRAC | 123 |
| R3046 | GAGAATCAAAATCGGTGAAT | TRAC | 124 |
| R3047 | ACCGATTTTGATTCTCAAAC | TRAC | 125 |
| R3048 | TTTGAGAATCAAAATCGGTG | TRAC | 126 |
| R3049 | GTTTGAGAATCAAAATCGGT | TRAC | 127 |
| R3050 | TGATTCTCAAACAAATGTGT | TRAC | 128 |
| R3051 | GATTCTCAAACAAATGTGTC | TRAC | 129 |
| R3052 | ATTCTCAAACAAATGTGTCA | TRAC | 130 |
| R3053 | TGACACATTTGTTTGAGAAT | TRAC | 131 |
| R3054 | TCAAACAAATGTGTCACAAA | TRAC | 132 |
| R3055 | GTGACACATTTGTTTGAGAA | TRAC | 133 |
| R3056 | CTTTGTGACACATTTGTTTG | TRAC | 134 |
| R3057 | TGATGTGTATATCACAGACA | TRAC | 135 |
| R3058 | TCTGTGATATACACATCAGA | TRAC | 136 |
| R3059 | GTCTGTGATATACACATCAG | TRAC | 137 |
| R3060 | TGTCTGTGATATACACATCA | TRAC | 138 |
| R3061 | AAGTCCATAGACCTCATGTC | TRAC | 139 |
| R3062 | CTCTTGAAGTCCATAGACCT | TRAC | 140 |
| R3063 | AAGAGCAACAGTGCTGTGGC | TRAC | 141 |
| R3064 | CTCCAGGCCACAGCACTGTT | TRAC | 142 |
| R3065 | TTGCTCCAGGCCACAGCACT | TRAC | 143 |
| R3066 | GTTGCTCCAGGCCACAGCAC | TRAC | 144 |
| R3067 | CACATGCAAAGTCAGATTTG | TRAC | 145 |
| R3068 | GCACATGCAAAGTCAGATTT | TRAC | 146 |
| R3069 | GCATGTGCAAACGCCTTCAA | TRAC | 147 |
| R3070 | AAGGCGTTTGCACATGCAAA | TRAC | 148 |
| R3071 | CATGTGCAAACGCCTTCAAC | TRAC | 149 |
| R3072 | TTGAAGGCGTTTGCACATGC | TRAC | 150 |
| R3073 | AACAACAGCATTATTCCAGA | TRAC | 151 |
| R3074 | TGGAATAATGCTGTTGTTGA | TRAC | 152 |
| R3075 | TTCCAGAAGACACCTTCTTC | TRAC | 153 |
| R3076 | CAGAAGACACCTTCTTCCCC | TRAC | 154 |
| R3077 | CCTGGGCTGGGGAAGAAGGT | TRAC | 155 |
| R3078 | TTCCCCAGCCCAGGTAAGGG | TRAC | 156 |
| R3079 | CCCAGCCCAGGTAAGGGCAG | TRAC | 157 |
| R3080 | TAAAAGGAAAAACAGACATT | TRAC | 158 |
| R3081 | CTAAAAGGAAAAACAGACAT | TRAC | 159 |
| R3082 | TTCCTTTTAGAAAGTTCCTG | TRAC | 160 |
| R3083 | TCCTTTTAGAAAGTTCCTGT | TRAC | 161 |
| R3084 | CCTTTTAGAAAGTTCCTGTG | TRAC | 162 |
| R3085 | CTTTTAGAAAGTTCCTGTGA | TRAC | 163 |
| R3086 | TAGAAAGTTCCTGTGATGTC | TRAC | 164 |
| R3136 | AGAAAGTTCCTGTGATGTCA | TRAC | 165 |
| R3137 | GAAAGTTCCTGTGATGTCAA | TRAC | 166 |
| R3138 | ACATCACAGGAACTTTCTAA | TRAC | 167 |
| R3139 | CTGTGATGTCAAGCTGGTCG | TRAC | 168 |
| R3140 | TCGACCAGCTTGACATCACA | TRAC | 169 |
| R3141 | CTCGACCAGCTTGACATCAC | TRAC | 170 |
| R3142 | TCTCGACCAGCTTGACATCA | TRAC | 171 |
| R3143 | AAAGCTTTTCTCGACCAGCT | TRAC | 172 |
| R3144 | CAAAGCTTTTCTCGACCAGC | TRAC | 173 |
| R3145 | CCTGTTTCAAAGCTTTTCTC | TRAC | 174 |
| R3146 | GAAACAGGTAAGACAGGGGT | TRAC | 175 |
| R3147 | AAACAGGTAAGACAGGGGTC | TRAC | 176 |

















TABLE B
Spacer sequences of gRNAs targeting human B2M in T cells

| Name | Spacer sequence (5′ --> 3′), shown as DNA | Target | SEQ ID NO. |
|---|---|---|---|
| R3087 | AATATAAGTGGAGGCGTCGC | B2M | 177 |
| R3088 | ATATAAGTGGAGGCGTCGCG | B2M | 178 |
| R3089 | AGGAATGCCCGCCAGCGCGA | B2M | 179 |
| R3090 | CTGAAGCTGACAGCATTCGG | B2M | 180 |
| R3091 | GGGCCGAGATGTCTCGCTCC | B2M | 181 |
| R3092 | GCTGTGCTCGCGCTACTCTC | B2M | 182 |
| R3093 | CTGGCCTGGAGGCTATCCAG | B2M | 183 |
| R3094 | TGGCCTGGAGGCTATCCAGC | B2M | 184 |
| R3095 | ATGTGTCTTTTCCCGATATT | B2M | 185 |
| R3096 | TCCCGATATTCCTCAGGTAC | B2M | 186 |
| R3097 | CCCGATATTCCTCAGGTACT | B2M | 187 |
| R3098 | CCGATATTCCTCAGGTACTC | B2M | 188 |
| R3099 | GAGTACCTGAGGAATATCGG | B2M | 189 |
| R3100 | GGAGTACCTGAGGAATATCG | B2M | 190 |
| R3101 | CTCAGGTACTCCAAAGATTC | B2M | 191 |
| R3102 | AGGTTTACTCACGTCATCCA | B2M | 192 |
| R3103 | ACTCACGTCATCCAGCAGAG | B2M | 193 |
| R3104 | CTCACGTCATCCAGCAGAGA | B2M | 194 |
| R3105 | TCTGCTGGATGACGTGAGTA | B2M | 195 |
| R3106 | CATTCTCTGCTGGATGACGT | B2M | 196 |
| R3107 | CCATTCTCTGCTGGATGACG | B2M | 197 |
| R3108 | ACTTTCCATTCTCTGCTGGA | B2M | 198 |
| R3109 | GACTTTCCATTCTCTGCTGG | B2M | 199 |
| R3110 | AGGAAATTTGACTTTCCATT | B2M | 200 |
| R3111 | CCTGAATTGCTATGTGTCTG | B2M | 201 |
| R3112 | CTGAATTGCTATGTGTCTGG | B2M | 202 |
| R3113 | CTATGTGTCTGGGTTTCATC | B2M | 203 |
| R3114 | AATGTCGGATGGATGAAACC | B2M | 204 |
| R3115 | CATCCATCCGACATTGAAGT | B2M | 205 |
| R3116 | ATCCATCCGACATTGAAGTT | B2M | 206 |
| R3117 | AGTAAGTCAACTTCAATGTC | B2M | 207 |
| R3118 | TTCAGTAAGTCAACTTCAAT | B2M | 208 |
| R3119 | AAGTTGACTTACTGAAGAAT | B2M | 209 |
| R3120 | ACTTACTGAAGAATGGAGAG | B2M | 210 |
| R3121 | TCTCTCCATTCTTCAGTAAG | B2M | 211 |
| R3122 | CTGAAGAATGGAGAGAGAAT | B2M | 212 |
| R3123 | AATTCTCTCTCCATTCTTCA | B2M | 213 |
| R3124 | CAATTCTCTCTCCATTCTTC | B2M | 214 |
| R3125 | TCAATTCTCTCTCCATTCTT | B2M | 215 |
| R3126 | TTCAATTCTCTCTCCATTCT | B2M | 216 |
| R3127 | AAAAAGTGGAGCATTCAGAC | B2M | 217 |
| R3128 | CTGAAAGACAAGTCTGAATG | B2M | 218 |
| R3129 | AGACTTGTCTTTCAGCAAGG | B2M | 219 |
| R3130 | TCTTTCAGCAAGGACTGGTC | B2M | 220 |
| R3131 | CAGCAAGGACTGGTCTTTCT | B2M | 221 |
| R3132 | AGCAAGGACTGGTCTTTCTA | B2M | 222 |
| R3133 | CTATCTCTTGTACTACACTG | B2M | 223 |
| R3134 | TATCTCTTGTACTACACTGA | B2M | 224 |
| R3135 | AGTGTAGTACAAGAGATAGA | B2M | 225 |
| R3148 | TACTACACTGAATTCACCCC | B2M | 226 |
| R3149 | AGTGGGGGTGAATTCAGTGT | B2M | 227 |
| R3150 | CAGTGGGGGTGAATTCAGTG | B2M | 228 |
| R3151 | TCAGTGGGGGTGAATTCAGT | B2M | 229 |
| R3152 | TTCAGTGGGGGTGAATTCAG | B2M | 230 |
| R3153 | ACCCCCACTGAAAAAGATGA | B2M | 231 |
| R3154 | ACACGGCAGGCATACTCATC | B2M | 232 |
| R3155 | GGCTGTGACAAAGTCACATG | B2M | 233 |
| R3156 | GTCACAGCCCAAGATAGTTA | B2M | 234 |
| R3157 | TCACAGCCCAAGATAGTTAA | B2M | 235 |
| R3158 | ACTATCTTGGGCTGTGACAA | B2M | 236 |
| R3159 | CCCCACTTAACTATCTTGGG | B2M | 237 |

















TABLE C
Spacer sequences of gRNAs that target human PD1 in T cells

| Name | Spacer sequence (5′ --> 3′) | Target | SEQ ID NO. |
|---|---|---|---|
| R2921 | CCUUCCGCUCACCUCCGCCU | PD1 | 238 |
| R2922 | CCUUCCGCUCACCUCCGCCU | PD1 | 239 |
| R2923 | CGCUCACCUCCGCCUGAGCA | PD1 | 240 |
| R2924 | UCCACUGCUCAGGCGGAGGU | PD1 | 241 |
| R2925 | UAGCACCGCCCAGACGACUG | PD1 | 242 |
| R2926 | AGGCAUGCAGAUCCCACAGG | PD1 | 243 |
| R2927 | CACAGGCGCCCUGGCCAGUC | PD1 | 244 |
| R2928 | UCUGGGCGGUGCUACAACUG | PD1 | 245 |
| R2929 | GCAUGCCUGGAGCAGCCCCA | PD1 | 246 |
| R2930 | UAGCACCGCCCAGACGACUG | PD1 | 247 |
| R2931 | UGGCCGCCAGCCCAGUUGUA | PD1 | 248 |
| R2932 | CUUCCGCUCACCUCCGCCUG | PD1 | 249 |
| R2933 | CAGGGCCUGUCUGGGGAGUC | PD1 | 250 |
| R2934 | UCCCCAGCCCUGCUCGUGGU | PD1 | 251 |
| R2935 | GGUCACCACGAGCAGGGCUG | PD1 | 252 |
| R2936 | UCCCCUUCGGUCACCACGAG | PD1 | 253 |
| R2937 | GAGAAGCUGCAGGUGAAGGU | PD1 | 254 |
| R2938 | ACCUGCAGCUUCUCCAACAC | PD1 | 255 |
| R2939 | UCCAACACAUCGGAGAGCUU | PD1 | 256 |
| R2940 | GCACGAAGCUCUCCGAUGUG | PD1 | 257 |
| R2941 | AGCACGAAGCUCUCCGAUGU | PD1 | 258 |
| R2942 | GUGCUAAACUGGUACCGCAU | PD1 | 259 |
| R2943 | CUGGGGCUCAUGCGGUACCA | PD1 | 260 |
| R2944 | UCCGUCUGGUUGCUGGGGCU | PD1 | 261 |
| R2945 | CCCGAGGACCGCAGCCAGCC | PD1 | 262 |
| R2946 | UGUGACACGGAAGCGGCAGU | PD1 | 263 |
| R2947 | CGUGUCACACAACUGCCCAA | PD1 | 264 |
| R2948 | GGCAGUUGUGUGACACGGAA | PD1 | 265 |
| R2949 | CACAUGAGCGUGGUCAGGGC | PD1 | 266 |
| R2950 | CGCCGGGCCCUGACCACGCU | PD1 | 267 |
| R2951 | GGGGCCAGGGAGAUGGCCCC | PD1 | 268 |
| R2952 | AUCUGCGCCUUGGGGGCCAG | PD1 | 269 |
| R2953 | GAUCUGCGCCUUGGGGGCCA | PD1 | 270 |
| R2954 | CCAGACAGGCCCUGGAACCC | PD1 | 271 |
| R2955 | CCAGCCCUGCUCGUGGUGAC | PD1 | 272 |
| R2956 | UCUCUGGAAGGGCACAAAGG | PD1 | 273 |
| R2957 | GUGCCCUUCCAGAGAGAAGG | PD1 | 274 |
| R2958 | UGCCCUUCCAGAGAGAAGGG | PD1 | 275 |
| R2959 | UGCCCUUCUCUCUGGAAGGG | PD1 | 276 |
| R2960 | CAGAGAGAAGGGCAGAAGUG | PD1 | 277 |
| R2961 | GAACUGGCCGGCUGGCCUGG | PD1 | 278 |
| R2962 | GGAACUGGCCGGCUGGCCUG | PD1 | 279 |
| R2963 | CAAACCCUGGUGGUUGGUGU | PD1 | 280 |
| R2964 | GUGUCGUGGGCGGCCUGCUG | PD1 | 281 |
| R2965 | CCUCGUGCGGCCCGGGAGCA | PD1 | 282 |
| R2966 | UCCCUGCAGAGAAACACACU | PD1 | 283 |
| R2967 | CUCUGCAGGGACAAUAGGAG | PD1 | 284 |
| R2968 | UCUGCAGGGACAAUAGGAGC | PD1 | 285 |
| R2969 | CUCCUCAAAGAAGGAGGACC | PD1 | 286 |
| R2970 | UCCUCAAAGAAGGAGGACCC | PD1 | 287 |
| R2971 | UCUGUGGACUAUGGGGAGCU | PD1 | 288 |
| R2972 | UCUCGCCACUGGAAAUCCAG | PD1 | 289 |
| R2973 | CCAGUGGCGAGAGAAGACCC | PD1 | 290 |
| R2974 | CAGUGGCGAGAGAAGACCCC | PD1 | 291 |
| R2975 | CGCUAGGAAAGACAAUGGUG | PD1 | 292 |
| R2976 | UCUUUCCUAGCGGAAUGGGC | PD1 | 293 |
| R2977 | CCUAGCGGAAUGGGCACCUC | PD1 | 294 |
| R2978 | CUAGCGGAAUGGGCACCUCA | PD1 | 295 |
| R2979 | GCCCCUCUGACCGGCUUCCU | PD1 | 296 |
| R2980 | CUUGGCCACCAGUGUUCUGC | PD1 | 297 |
| R2981 | GCCACCAGUGUUCUGCAGAC | PD1 | 298 |
| R2982 | UGCAGACCCUCCACCAUGAG | PD1 | 299 |
| R2983 | UCCUGAGGAAAUGCGCUGAC | PD1 | 300 |
| R2984 | CCUCAGGAGAAGCAGGCAGG | PD1 | 301 |
| R2985 | CUCAGGAGAAGCAGGCAGGG | PD1 | 302 |
| R2986 | CAGGCCGUCCAGGGGCUGAG | PD1 | 303 |
| R2987 | AGACAUGAGUCCUGUGGUGG | PD1 | 304 |
| R2988 | AGGUCCUGCCAGCACAGAGC | PD1 | 305 |
| R2989 | AGGGAGCUGGACGCAGGCAG | PD1 | 306 |
| R2990 | AGCCCCGGGCCGCAGGCAGC | PD1 | 307 |
| R2991 | AGGCAGGAGGCUCCGGGGCG | PD1 | 308 |
| R2992 | GGGGCUGGUUGGAGAUGGCC | PD1 | 309 |
| R2993 | GAGAUGGCCUUGGAGCAGCC | PD1 | 310 |
| R2994 | GCUGCUCCAAGGCCAUCUCC | PD1 | 311 |
| R2995 | GAGCAGCCAAGGUGCCCCUG | PD1 | 312 |
| R2996 | GGGAUGCCACUGCCAGGGGC | PD1 | 313 |
| R2997 | CGGGAUGCCACUGCCAGGGG | PD1 | 314 |
| R2998 | GGCCCUGCGUCCAGGGCGUU | PD1 | 315 |
| R2999 | UCUGCUCCCUGCAGGCCUAG | PD1 | 316 |
| R3000 | UCUAGGCCUGCAGGGAGCAG | PD1 | 317 |
| R3001 | CCUGAAACUUCUCUAGGCCU | PD1 | 318 |
| R3002 | UGACCUUCCCUGAAACUUCU | PD1 | 319 |
| R3003 | CAGGGAAGGUCAGAAGAGCU | PD1 | 320 |
| R3004 | AGGGAAGGUCAGAAGAGCUC | PD1 | 321 |
| R3005 | CUGCCCUGCCCACCACAGCC | PD1 | 322 |
| R3006 | CCUGCCCUGCCCACCACAGC | PD1 | 323 |
| R3007 | ACACAUGCCCAGGCAGCACC | PD1 | 324 |
| R3008 | CACAUGCCCAGGCAGCACCU | PD1 | 325 |
| R3009 | CCUGCCCCACAAAGGGCCUG | PD1 | 326 |
| R3010 | GUGGGGCAGGGAAGCUGAGG | PD1 | 327 |
| R3011 | UGGGGCAGGGAAGCUGAGGC | PD1 | 328 |
| R3012 | CUGCCUCAGCUUCCCUGCCC | PD1 | 329 |
| R3013 | CAGGCCCAGCCAGCACUCUG | PD1 | 330 |
| R3014 | AGGCCCAGCCAGCACUCUGG | PD1 | 331 |
| R3015 | CACCCCAGCCCCUCACACCA | PD1 | 332 |
| R3016 | GGACCGUAGGAUGUCCCUCU | PD1 | 333 |

















TABLE D







spacer sequences of gRNAs targeting


human CIITA













Spacer sequence

SEQ




(5′ --> 3′),

ID.



Name
shown as DNA
Target
NO.















R4503
CTACACAATGCGTTGCCTGG
CIITA
334



C2TA_T1.1









R4504
GGGCTCTGACAGGTAGGACC
CIITA
335



C2TA_T1.2









R4505
TGTAGGAATCCCAGCCAGGC
CIITA
336



C2TA_T1.3









R4506
CCTGGCTCCACGCCCTGCTG
CIITA
337



C2TA_T1.8 









R4507
GGGAAGCTGAGGGCACGAGG
CIITA
338



C2TA_T1.9 









R4508
ACAGCGATGCTGACCCCCTG
CIITA
339



C2TA_T2.1









R4509
TTAACAGCGATGCTGACCCC
CIITA
340



C2TA_T2.2 









R4510
TATGACCAGATGGACCTGGC
CIITA
341



C2TA_T2.3 









R4511
GGGCCCCTAGAAGGTGGCTA
CIITA
342



C2TA_T2.4 









R4512
TAGGGGCCCCAACTCCATGG
CIITA
343



C2TA_T2.5 









R4513
AGAAGCTCCAGGTAGCCACC
CIITA
344



C2TA_T2.6 









R4514
TCCAGCCAGGTCCATCTGGT
CIITA
345



C2TA_T2.7 









R4515
TTCTCCAGCCAGGTCCATCT
CIITA
346



C2TA_T2.8 









R5200
AGCAGGCTGTTGTGTGACAT
CIITA
1934






R5201
CATGTCACACAACAGCCTGC
CIITA
1935






R5202
TGTGACATGGAAGGTGATGA
CIITA
1936






R5203
ATCACCTTCCATGTCACACA
CIITA
1937






R5204
GCATAAGCCTCCCTGGTCTC
CIITA
1938






R5205
CAGGACTCCCAGCTGGAGGG
CIITA
1939






R5206
CTCAGGCCCTCCAGCTGGGA
CIITA
1940






R5207
TGCTGGCATCTCCATACTCT
CIITA
1941






R5208
TGCCCAACTTCTGCTGGCAT
CIITA
1942






R5209
CTGCCCAACTTCTGCTGGCA
CIITA
1943






R5210
TCTGCCCAACTTCTGCTGGC
CIITA
1944






R5211
TGACTTTTCTGCCCAACTTC
CIITA
1945






R5212
CTGACTTTTCTGCCCAACTT
CIITA
1946






R5213
TCTGACTTTTCTGCCCAACT
CIITA
1947






R5214
CCAGAGGAGCTTCCGGCAGA
CIITA
1948






R5215
AGGTCTGCCGGAAGCTCCTC
CIITA
1949






R5216
CGGCAGACCTGAAGCACTGG
CIITA
1950






R5217
CAGTGCTTCAGGTCTGCCGG
CIITA
1951






R5218
AACAGCGCAGGCAGTGGCAG
CIITA
1952






R5219
AACCAGGAGCCAGCCTCCGG
CIITA
1953






R5220
TCCAGGCGCATCTGGCCGGA
CIITA
1954






R5221
CTCCAGGCGCATCTGGCCGG
CIITA
1955






R5222
TCTCCAGGCGCATCTGGCCG
CIITA
1956






R5223
CTCCAGTTCCTCGTTGAGCT
CIITA
1957






R5224
TCCAGTTCCTCGTTGAGCTG
CIITA
1958






R5225
AGGCAGCTCAACGAGGAACT
CIITA
1959






R5226
CTCGTTGAGCTGCCTGAATC
CIITA
1960






R5227
AGCTGCCTGAATCTCCCTGA
CIITA
1961






R5228
GTCCCCACCATCTCCACTCT
CIITA
1962






R5229
TCCCCACCATCTCCACTCTG
CIITA
1963






R5230
CCAGAGCCCATGGGGCAGAG
CIITA
1964






R5231
GCCAGAGCCCATGGGGCAGA
CIITA
1965






R5232
CAGCCTCAGAGATTTGCCAG
CIITA
1966






R5233
GGAGGCCGTGGACAGTGAAT
CIITA
1967






R5234
ACTGTCCACGGCCTCCCAAC
CIITA
1968






R5235
GCTCCATCAGCCACTGACCT
CIITA
1969






R5236
AGGCATGCTGGGCAGGTCAG
CIITA
1970






R5237
CTCGGGAGGTCAGGGCAGGT
CIITA
1971






R5238
GCTCGGGAGGTCAGGGCAGG
CIITA
1972






R5239
GAGACCTCTCCAGCTGCCGG
CIITA
1973






R5240
TTGGAGACCTCTCCAGCTGC
CIITA
1974






R5241
GAAGCTTGTTGGAGACCTCT
CIITA
1975






R5242
GGAAGCTTGTTGGAGACCTC
CIITA
1976






R5243
TGGAAGCTTGTTGGAGACCT
CIITA
1977






R5244
TACCGCTCACTGCAGGACAC
CIITA
1978






R5245
CTGCTGCTCCTCTCCAGCCT
CIITA
1979






R5246
CCGCTCCAGGCTCTTGCTGC
CIITA
1980






R5247
TGCCCAGTCCGGGGTGGCCA
CIITA
1981






R5248
GGCCAGCTGCCGTTCTGCCC
CIITA
1982






R5249
GCAGCCAACAGCACCTCAGC
CIITA
1983






R5250
GCTGCCAAGGAGCACCGGCG
CIITA
1984






R5251
CCCAGCACAGCAATCACTCG
CIITA
1985






R5252
GCCCAGCACAGCAATCACTC
CIITA
1986






R5253
CTGTGCTGGGCAAAGCTGGT
CIITA
1987






R5254
CCCTGACCAGCTTTGCCCAG
CIITA
1988






R5255
GGCTGGGGCAGTGAGCCGGG
CIITA
1989






R5256
TGGCCGGCTTCCCCAGTACG
CIITA
1990






R5257
CCCAGTACGACTTTGTCTTC
CIITA
1991






R5258
GTCTTCTCTGTCCCCTGCCA
CIITA
1992






R5259
TCTTCTCTGTCCCCTGCCAT
CIITA
1993






R5260
TCTGTCCCCTGCCATTGCTT
CIITA
1994






R5261
AAGCAATGGCAGGGGACAGA
CIITA
1995






R5262
CTTGAACCGTCCGGGGGATG
CIITA
1996






R5263
AACCGTCCGGGGGATGCCTA
CIITA
1997






R5264
TCCCTGGGCCCACAGCCACT
CIITA
1998






R5265
AAGATGTGGCTGAAAACCTC
CIITA
1999






R5266
TCAGCCACATCTTGAAGAGA
CIITA
2000






R5267
CAGCCACATCTTGAAGAGAC
CIITA
2001






R5268
AGCCACATCTTGAAGAGACC
CIITA
2002






R5269
AAGAGACCTGACCGCGTTCT
CIITA
2003






R5270
TGCTCATCCTAGACGGCTTC
CIITA
2004






R5271
CAGCTCCTCGAAGCCGTCTA
CIITA
2005






R5272
CGCTTCCAGCTCCTCGAAGC
CIITA
2006






R5273
GAGGAGCTGGAAGCGCAAGA
CIITA
2007






R5274
CTGCACAGCACGTGCGGACC
CIITA
2008






R5275
TGGAAAAGGCCGGCCAGCAG
CIITA
2009






R5276
TTCTGGAAAAGGCCGGCCAG
CIITA
2010






R5277
TCCAGAAGAAGCTGCTCCGA
CIITA
2011






R5278
CCAGAAGAAGCTGCTCCGAG
CIITA
2012






R5279
CAGAAGAAGCTGCTCCGAGG
CIITA
2013






R5280
CACCCTCCTCCTCACAGCCC
CIITA
2014






R5281
CTCAGGCTCTGGACCAGGCG
CIITA
2015






R5282
GAGCTGTCCGGCTTCTCCAT
CIITA
2016






R5283
AGCTGTCCGGCTTCTCCATG
CIITA
2017






R5284
TCCATGGAGCAGGCCCAGGC
CIITA
2018






R5285
GAGAGCTCAGGGATGACAGA
CIITA
2019






R5286
AGAGCTCAGGGATGACAGAG
CIITA
2020






R5287
GTGCTCTGTCATCCCTGAGC
CIITA
2021






R5288
TTCTCAGTCACAGCCACAGC
CIITA
2022






R5289
TCAGTCACAGCCACAGCCCT
CIITA
2023






R5290
GTGCCGGGCAGTGTGCCAGC
CIITA
2024






R5291
TGCCGGGCAGTGTGCCAGCT
CIITA
2025






R5292
GCGTCCTCCCCAAGCTCCAG
CIITA
2026






R5293
GGGAGGACGCCAAGCTGCCC
CIITA
2027






R5294
GCCAGCTCTGCCAGGGCCCC
CIITA
2028






R5295
ATGTCTGCGGCCCAGCTCCC
CIITA
2029






R5392
GATGTCTGCGGCCCAGCTCC
CIITA
2030






R5393
CCATCCGCAGACGTGAGGAC
CIITA
2031






R5394
GCCATCGCCCAGGTCCTCAC
CIITA
2032






R5395
GGCCATCGCCCAGGTCCTCA
CIITA
2033






R5396
GACTAAGCCTTTGGCCATCG
CIITA
2034






R5397
GTCCAACACCCACCGCGGGC
CIITA
2035






R5398
CAGGAGGAAGCTGGGGAAGG
CIITA
2036






R5399
CCCAGCTTCCTCCTGCAATG
CIITA
2037






R5400
CTCCTGCAATGCTTCCTGGG
CIITA
2038






R5401
CTGGGGGCCCTGTGGCTGGC
CIITA
2039






R5402
GCCACTCAGAGCCAGCCACA
CIITA
2040






R5403
CGCCACTCAGAGCCAGCCAC
CIITA
2041






R5404
ATTTCGCCACTCAGAGCCAG
CIITA
2042






R5405
TCCTTGATTTCGCCACTCAG
CIITA
2043






R5406
GGGTCAATGCTAGGTACTGC
CIITA
2044






R5407
CTTGGGGTCAATGCTAGGTA
CIITA
2045






R5408
TTCCTTGGGGTCAATGCTAG
CIITA
2046






R5409
ACCCCAAGGAAGAAGAGGCC
CIITA
2047






R5410
TCATAGGGCCTCTTCTTCCT
CIITA
2048






R5411
CTGGCTGGGCTGATCTTCCA
CIITA
2049






R5412
TGGCTGGGCTGATCTTCCAG
CIITA
2050






R5413
CAGCCTCCCGCCCGCTGCCT
CIITA
2051






R5414
CTGTCCACCGAGGCAGCCGC
CIITA
2052






R5415
TGCTTCCTGTCCACCGAGGC
CIITA
2053






R5416
AGGTACCTCGCAAGCACCTT
CIITA
2054






R5417
CGAGGTACCTGAAGCGGCTG
CIITA
2055






R5418
CAGCCTCCTCGGCCTCGTGG
CIITA
2056






R5419
GGCAGCACGTGGTACAGGAG
CIITA
2057






R5420
GCAGCACGTGGTACAGGAGC
CIITA
2058






R5421
TCTGGGCACCCGCCTCACGC
CIITA
2059






R5422
CTGGGCACCCGCCTCACGCC
CIITA
2060






R5423
TGGGCACCCGCCTCACGCCT
CIITA
2061






R5424
CCCAGTACATGTGCATCAGG
CIITA
2062






R5425
GCCCGCCGCCTCCAAGGCCT
CIITA
2063






R5426
GAGGCGGCGGGCCAAGACTT
CIITA
2064






R5427
TCCCTGGACCTCCGCAGCAC
CIITA
2065






R5428
GCCCCTCTGGATTGGGGAGC
CIITA
2066






R5429
CCCCTCTGGATTGGGGAGCC
CIITA
2067






R5430
GGGAGCCTCGTGGGACTCAG
CIITA
2068






R5431
GTCTCCCCATGCTGCTGCAG
CIITA
2069






R5432
TCCTCTGCTGCCTGAAGTAG
CIITA
2070






R5433
AGGCAGCAGAGGAGAAGTTC
CIITA
2071






R5434
AAAGGCTCGATGGTGAACTT
CIITA
2072






R5435
GAAAGGCTCGATGGTGAACT
CIITA
2073






R5436
ACCATCGAGCCTTTCAAAGC
CIITA
2074






R5437
GCTTTGAAAGGCTCGATGGT
CIITA
2075






R5438
AGGGACTTGGCTTTGAAAGG
CIITA
2076






R5439
CAAAGCCAAGTCCCTGAAGG
CIITA
2077






R5440
AAAGCCAAGTCCCTGAAGGA
CIITA
2078






R5441
CACATCCTTCAGGGACTTGG
CIITA
2079






R5442
CCAGGTCTTCCACATCCTTC
CIITA
2080






R5443
CCCAGGTCTTCCACATCCTT
CIITA
2081






R5444
CTCGGAAGACACAGCTGGGG
CIITA
2082






R5445
GGTCCCGAACAGCAGGGAGC
CIITA
2083






R5446
AGGTCCCGAACAGCAGGGAG
CIITA
2084






R5447
TTTAGGTCCCGAACAGCAGG
CIITA
2085






R5448
CTTTAGGTCCCGAACAGCAG
CIITA
2086






R5449
GGGACCTAAAGAAACTGGAG
CIITA
2087






R5450
GGGAAAGCCTGGGGGCCTGA
CIITA
2088






R5451
GGGGAAAGCCTGGGGGCCTG
CIITA
2089






R5452
CCCCAAACTGGTGCGGATCC
CIITA
2090






R5453
CCCAAACTGGTGCGGATCCT
CIITA
2091






R5454
TTCTCACTCAGCGCATCCAG
CIITA
2092






R5455
AGCTGGGGGAAGGTGGCTGA
CIITA
2093






R5456
CCCCAGCTGAAGTCCTTGGA
CIITA
2094






R5457
CAAGGACTTCAGCTGGGGGA
CIITA
2095






R5458
CCAAGGACTTCAGCTGGGGG
CIITA
2096






R5459
AGGGTTTCCAAGGACTTCAG
CIITA
2097






R5460
TAGGCACCCAGGTCAGTGAT
CIITA
2098






R5461
GTAGGCACCCAGGTCAGTGA
CIITA
2099






R5462
GCTCGCTGCATCCCTGCTCA
CIITA
2100






R5463
GCCTGAGCAGGGATGCAGCG
CIITA
2101






R5464
TACAATAACTGCATCTGCGA
CIITA
2102






R5465
GCTCGTGTGCTTCCGGACAT
CIITA
2103






R5466
CGGACATGGTGTCCCTCCGG
CIITA
2104






R5467
ACGGCTGCCGGGGCCCAGCA
CIITA
2105






R5468
GGAGGTGTCCTCATGTGGAG
CIITA
2106






R5469
CTGGACACTGAATGGGATGG
CIITA
2107






R5470
AGTGTCCAGGAACACCTGCA
CIITA
2108






R5471
CAGGTGTTCCTGGACACTGA
CIITA
2109






R5472
TTGCAGGTGTTCCTGGACAC
CIITA
2110






R5473
ACGGATCAGCCTGAGATGAT
CIITA
2111
















TABLE E







spacer sequences of gRNAs targeting mouse PCSK9













Name
Spacer sequence (5′ --> 3′)
Target
SEQ ID NO.





R4238
CCGCUGUUGCCGCCGCUGCU
PCSK9
347





R4239
CCGCCGCUGCUGCUGCUGUU
PCSK9
348





R4240
CUGCUACUGUGCCCCACCGG
PCSK9
349





R4241
AUAAUCUCCAUCCUCGUCCU
PCSK9
350





R4242
UGAAGAGCUGAUGCUCGCCC
PCSK9
351





R4243
GAGCAACGGCGGAAGGUGGC
PCSK9
352





R4244
CUGGCAGCCUCCAGGCCUCC
PCSK9
353





R4245
UGGUGCUGAUGGAGGAGACC
PCSK9
354





R4246
AAUCUGUAGCCUCUGGGUCU
PCSK9
355





R4247
UUCAAUCUGUAGCCUCUGGG
PCSK9
356





R4248
GUUCAAUCUGUAGCCUCUGG
PCSK9
357





R4249
AACAAACUGCCCACCGCCUG
PCSK9
358





R4250
AUGACAUAGCCCCGGCGGGC
PCSK9
359





R4251
UACAUAUCUUUUAUGACCUC
PCSK9
360





R4252
UAUGACCUCUUCCCUGGCUU
PCSK9
361





R4253
AUGACCUCUUCCCUGGCUUC
PCSK9
362





R4254
UGACCUCUUCCCUGGCUUCU
PCSK9
363





R4255
ACCAAGAAGCCAGGGAAGAG
PCSK9
364





R4256
CCUGGCUUCUUGGUGAAGAU
PCSK9
365





R4257
UUGGUGAAGAUGAGCAGUGA
PCSK9
366





R4258
GUGAAGAUGAGCAGUGACCU
PCSK9
367





R4259
CCCCAUGUGGAGUACAUUGA
PCSK9
368





R4260
CUCAAUGUACUCCACAUGGG
PCSK9
369





R4261
AGGAAGACUCCUUUGUCUUC
PCSK9
370





R4262
GUCUUCGCCCAGAGCAUCCC
PCSK9
371





R4263
UCUUCGCCCAGAGCAUCCCA
PCSK9
372





R4264
GCCCAGAGCAUCCCAUGGAA
PCSK9
373





R4265
CAUGGGAUGCUCUGGGCGAA
PCSK9
374





R4266
GCUCCAGGUUCCAUGGGAUG
PCSK9
375





R4267
UCCCAGCAUGGCACCAGACA
PCSK9
376





R4268
CUCUGUCUGGUGCCAUGCUG
PCSK9
377





R4269
GAUACCAGCAUCCAGGGUGC
PCSK9
378





R4270
AGGGCAGGGUCACCAUCACC
PCSK9
379





R4271
AAGUCGGUGAUGGUGACCCU
PCSK9
380





R4272
AACAGCGUGCCGGAGGAGGA
PCSK9
381





R4273
GCCACACCAGCAUCCCGGCC
PCSK9
382





R4274
AGCACACGCAGGCUGUGCAG
PCSK9
383





R4275
ACAGUUGAGCACACGCAGGC
PCSK9
384





R4276
CCUUGACAGUUGAGCACACG
PCSK9
385





R4277
GCUGACUCUUCCGAAUAAAC
PCSK9
386





R4278
AUUCGGAAGAGUCAGCUAAU
PCSK9
387





R4279
UUCGGAAGAGUCAGCUAAUC
PCSK9
388





R4280
GGAAGAGUCAGCUAAUCCAG
PCSK9
389





R4281
UGCUGCCCCUGGCCGGUGGG
PCSK9
390





R4282
AGGAUGCGGCUAUACCCACC
PCSK9
391





R4283
CCAGCUGCUGCAACCAGCAC
PCSK9
392





R4284
CAGCAGCUGGGAACUUCCGG
PCSK9
393





R4285
CGGGACGACGCCUGCCUCUA
PCSK9
394





R4286
GUGGCCCCGACUGUGAUGAC
PCSK9
395





R4287
CCUUGGGGACUUUGGGGACU
PCSK9
396





R4288
GUCCCCAAAGUCCCCAAGGU
PCSK9
397





R4289
GGGACUUUGGGGACUAAUUU
PCSK9
398





R4290
GGGGACUAAUUUUGGACGCU
PCSK9
399





R4291
GGGACUAAUUUUGGACGCUG
PCSK9
400





R4292
UGGACGCUGUGUGGAUCUCU
PCSK9
401





R4293
GGACGCUGUGUGGAUCUCUU
PCSK9
402





R4294
GACGCUGUGUGGAUCUCUUU
PCSK9
403





R4295
CCGGGGGCAAAGAGAUCCAC
PCSK9
404





R4296
GCCCCCGGGAAGGACAUCAU
PCSK9
405





R4297
CCCCCGGGAAGGACAUCAUC
PCSK9
406





R4298
AUGUCACAGAGUGGGACCUC
PCSK9
407





R4299
UGGCUCGGAUGCUGAGCCGG
PCSK9
408





R4300
CCCUGGCCGAGCUGCGGCAG
PCSK9
409





R4301
GUAGAGAAGUGGAUCAGCCU
PCSK9
410





R4302
GGUAGAGAAGUGGAUCAGCC
PCSK9
411





R4303
UCUACCAAAGACGUCAUCAA
PCSK9
412





R4304
AUGACGUCUUUGGUAGAGAA
PCSK9
413





R4305
CCUGAGGACCAGCAGGUGCU
PCSK9
414





R4306
GGGGUCAGCACCUGCUGGUC
PCSK9
415





R4307
GAGUGGGCCCCGAGUGUGCC
PCSK9
416





R4308
UGGGGCACAGCGGGCUGUAG
PCSK9
417





R4309
UCCAGGAGCGGGAGGCGUCG
PCSK9
418





R4310
CAGACCUGCUGGCCUCCUAU
PCSK9
419





R4311
AGGGCCUUGCAGACCUGCUG
PCSK9
420





R4312
GGGGGUGAGGGUGUCUAUGC
PCSK9
421





R4313
GGGGUGAGGGUGUCUAUGCC
PCSK9
422





R4314
GCACGGGGAACCAGGCAGCA
PCSK9
423





R4315
CCCGUGCCAACUGCAGCAUC
PCSK9
424





R4316
UGGAUGCUGCAGUUGGCACG
PCSK9
425





R4317
UGGUGGCAGUGGACAUGGGU
PCSK9
426





R4318
CACUUCCCAAUGGAAGCUGC
PCSK9
427





R4319
CAUUGGGAAGUGGAAGACCU
PCSK9
428





R4320
GGAAGUGGAAGACCUUAGUG
PCSK9
429





R4321
GUGUCCGGAGGCAGCCUGCG
PCSK9
430





R4322
GCCACCAGGCGGCCAGUGUC
PCSK9
431





R4323
CUGCUGCCAUGCCCCAGGGC
PCSK9
432





R4324
CAGCCCUGGGGCAUGGCAGC
PCSK9
433





R4325
CAUUCCAGCCCUGGGGCAUG
PCSK9
434





R4326
GCAUUCCAGCCCUGGGGCAU
PCSK9
435





R4327
UGCAUUCCAGCCCUGGGGCA
PCSK9
436





R4328
AUUUUGCAUUCCAGCCCUGG
PCSK9
437





R4329
CAUCCAGUCAGGGUCCAUCC
PCSK9
438





R4330
UCCACGCUGUAGGCUCCCAG
PCSK9
439





R4331
CCACACACAGGUUGUCCACG
PCSK9
440





R4332
UCCACUGGUCCUGUCUGCUC
PCSK9
441





R4333
CUGAAGGCCGGCUCCGGCAG
PCSK9
442
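
Table E lists its spacers in RNA notation (U), whereas the other spacer tables in this section are marked "shown as DNA" (T). When entries from different tables need to be compared or ordered as oligonucleotides, the two notations can be converted into one another. The following is a minimal sketch only; the helper names `rna_to_dna` and `dna_to_rna` are illustrative assumptions and are not part of the disclosure.

```python
def rna_to_dna(seq: str) -> str:
    """Rewrite an RNA-notation sequence (U) in DNA notation (T)."""
    return seq.upper().replace("U", "T")


def dna_to_rna(seq: str) -> str:
    """Rewrite a DNA-notation sequence (T) in RNA notation (U)."""
    return seq.upper().replace("T", "U")


# R4238 (SEQ ID NO: 347), copied from Table E in RNA notation.
r4238_rna = "CCGCUGUUGCCGCCGCUGCU"
r4238_dna = rna_to_dna(r4238_rna)

print(r4238_dna)              # same spacer, written with T instead of U
print(dna_to_rna(r4238_dna))  # round-trips back to the Table E form
```

The conversion changes only the notation; the 5′ --> 3′ orientation and the length of the spacer are unchanged.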
















TABLE F







spacer sequences of gRNAs targeting Bak1 in CHO cells










Name
Spacer sequence (5′ --> 3′), shown as DNA
SEQ ID NO





R2452_Bak1_CasPhi_1
GAAGCTATGTTTTCCATCTC
443





R2453_Bak1_CasPhi_2
GCAGGGGCAGCCGCCCCCTG
444





R2454_Bak1_CasPhi_3
CTCCTAGAACCCAACAGGTA
445





R2455_Bak1_CasPhi_4
GAAAGACCTCCTCTGTGTCC
446





R2456_Bak1_CasPhi_5
TCCATCTCGGGGTTGGCAGG
447





R2457_Bak1_CasPhi_6
TTCCTGATGGTGGAGATGGA
448





R2849_Bak1_nsd_sg1
CTGACTCCCAGCTCTGACCC
449





R2850_Bak1_nsd_sg2
TGGGGTCAGAGCTGGGAGTC
450





R2851_Bak1_nsd_sg3
GAAAGACCTCCTCTGTGTCC
451





R2852_Bak1_nsd_sg4
CGAAGCTATGTTTTCCATCT
452





R2853_Bak1_nsd_sg5
GAAGCTATGTTTTCCATCTC
453





R2854_Bak1_nsd_sg6
TCCATCTCCACCATCAGGAA
454





R2855_Bak1_nsd_sg7
CCATCTCCACCATCAGGAAC
455





R2856_Bak1_nsd_sg8
CTGATGGTGGAGATGGAAAA
456





R2857_Bak1_nsd_sg9
CATCTCCACCATCAGGAACA
457





R2858_Bak1_nsd_sg10
TTCCTGATGGTGGAGATGGA
458





R2859_Bak1_nsd_sg11
GCAGGGGCAGCCGCCCCCTG
459





R2860_Bak1_nsd_sg12
TCCATCTCGGGGTTGGCAGG
460





R2861_Bak1_nsd_sg13
TAGGAGCAAATTGTCCATCT
461





R2862_Bak1_nsd_sg14
GGTTCTAGGAGCAAATTGTC
462





R2863_Bak1_nsd_sg15
GCTCCTAGAACCCAACAGGT
463





R2864_Bak1_nsd_sg16
CTCCTAGAACCCAACAGGTA
464





R3977_Bak1_exon1_sg1
TCCAGACGCCATCTTTCAGG
465





R3978_Bak1_exon1_sg2
TGGTAAGAGTCCTCCTGCCC
466





R3979_Bak1_exon3_sg1
TTACAGCATCTTGGGTCAGG
467





R3980_Bak1_exon3_sg2
GGTCAGGTGGGCCGGCAGCT
468





R3981_Bak1_exon3_sg3
CTATCATTGGAGATGACATT
469





R3982_Bak1_exon3_sg4
GAGATGACATTAACCGGAGA
470





R3983_Bak1_exon3_sg5
TGGAACTCTGTGTCGTATCT
471





R3984_Bak1_exon3_sg6
CAGAATTTACTGGAGCAGCT
472





R3985_Bak1_exon3_sg7
ACTGGAGCAGCTGCAGCCCA
473





R3986_Bak1_exon3_sg8
CCAGCTGTGGGCTGCAGCTG
474





R3987_Bak1_exon3_sg9
GTAGGCATTCCCAGCTGTGG
475





R3988_Bak1_exon3_sg10
GTGAAGAGTTCGTAGGCATT
476





R3989_Bak1_exon3_sg11
ACCAAGATTGCCTCCAGGTA
477





R3990_Bak1_exon3_sg12
CCTCCAGGTACCCACCACCA
478
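
Some spacers in Table F are listed under two names with distinct SEQ ID NOs; for example, R2452_Bak1_CasPhi_1 (SEQ ID NO: 443) and R2853_Bak1_nsd_sg5 (SEQ ID NO: 453) share the same sequence. The sketch below shows one way such shared spacers might be found when working with the table programmatically. It uses only a subset of the Table F rows, and the function name `group_by_spacer` is a hypothetical helper, not part of the disclosure.

```python
from collections import defaultdict

# A subset of Table F rows: (name, spacer shown as DNA, SEQ ID NO).
rows = [
    ("R2452_Bak1_CasPhi_1", "GAAGCTATGTTTTCCATCTC", 443),
    ("R2455_Bak1_CasPhi_4", "GAAAGACCTCCTCTGTGTCC", 446),
    ("R2849_Bak1_nsd_sg1", "CTGACTCCCAGCTCTGACCC", 449),
    ("R2851_Bak1_nsd_sg3", "GAAAGACCTCCTCTGTGTCC", 451),
    ("R2853_Bak1_nsd_sg5", "GAAGCTATGTTTTCCATCTC", 453),
]


def group_by_spacer(rows):
    """Return {spacer: [(name, seq_id), ...]} for spacers listed more than once."""
    groups = defaultdict(list)
    for name, spacer, seq_id in rows:
        groups[spacer].append((name, seq_id))
    return {spacer: entries for spacer, entries in groups.items() if len(entries) > 1}


shared = group_by_spacer(rows)
for spacer, entries in shared.items():
    print(spacer, entries)
```

Grouping on the sequence rather than on the name keeps the check independent of the naming scheme used for each guide series.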
















TABLE G







spacer sequences of gRNAs targeting Bax in CHO cells












Name
Spacer sequence (5′ --> 3′), shown as DNA
SEQ ID NO







R2458_Bax_CasPhi_1
CTAATGTGGATACTAACTCC
479







R2459_Bax_CasPhi_2
TTCCGTGTGGCAGCTGACAT
480







R2460_Bax_CasPhi_3
CTGATGGCAACTTCAACTGG
481







R2461_Bax_CasPhi_4
TACTTTGCTAGCAAACTGGT
482







R2462_Bax_CasPhi_5
AGCACCAGTTTGCTAGCAAA
483







R2463_Bax_CasPhi_6
AACTGGGGCCGGGTTGTTGC
484







R2865_Bax_nsd_sg1
TTCTCTTTCCTGTAGGATGA
485







R2866_Bax_nsd_sg2
TCTTTCCTGTAGGATGATTG
486







R2867_Bax_nsd_sg3
CCTGTAGGATGATTGCTAAT
487







R2868_Bax_nsd_sg4
CTGTAGGATGATTGCTAATG
488







R2869_Bax_nsd_sg5
CTAATGTGGATACTAACTCC
489







R2870_Bax_nsd_sg6
TTCCGTGTGGCAGCTGACAT
490







R2871_Bax_nsd_sg7
CGTGTGGCAGCTGACATGTT
491







R2872_Bax_nsd_sg8
CCATCAGCAAACATGTCAGC
492







R2873_Bax_nsd_sg9
AAGTTGCCATCAGCAAACAT
493







R2874_Bax_nsd_sg10
GCTGATGGCAACTTCAACTG
494







R2875_Bax_nsd_sg11
CTGATGGCAACTTCAACTGG
495







R2876_Bax_nsd_sg12
AACTGGGGCCGGGTTGTTGC
496







R2877_Bax_nsd_sg13
TTGCCCTTTTCTACTTTGCT
497







R2878_Bax_nsd_sg14
CCCTTTTCTACTTTGCTAGC
498







R2879_Bax_nsd_sg15
CTAGCAAAGTAGAAAAGGGC
499







R2880_Bax_nsd_sg16
GCTAGCAAAGTAGAAAAGGG
500







R2881_Bax_nsd_sg17
TCTACTTTGCTAGCAAACTG
501







R2882_Bax_nsd_sg18
CTACTTTGCTAGCAAACTGG
502







R2883_Bax_nsd_sg19
TACTTTGCTAGCAAACTGGT
503







R2884_Bax_nsd_sg20
GCTAGCAAACTGGTGCTCAA
504







R2885_Bax_nsd_sg21
CTAGCAAACTGGTGCTCAAG
505







R2886_Bax_nsd_sg22
AGCACCAGTTTGCTAGCAAA
506

















TABLE H







spacer sequences of gRNAs targeting Fut8 in CHO cells












Name
Spacer sequence (5′ --> 3′), shown as DNA
SEQ ID NO







R2464_Fut8_CasPhi_1
CCACTTTGTCAGTGCGTCTG
507







R2465_Fut8_CasPhi_2
CTCAATGGGATGGAAGGCTG
508







R2466_Fut8_CasPhi_3
AGGAATACATGGTACACGTT
509







R2467_Fut8_CasPhi_4
AAGAACATTTTCAGCTTCTC
510







R2468_Fut8_CasPhi_5
ATCCACTTTCATTCTGCGTT
511







R2469_Fut8_CasPhi_6
TTTGTTAAAGGAGGCAAAGA
512







R2887_Fut8_nsd_sg1
TCCCCAGAGTCCATGTCAGA
513







R2888_Fut8_nsd_sg2
TCAGTGCGTCTGACATGGAC
514







R2889_Fut8_nsd_sg3
GTCAGTGCGTCTGACATGGA
515







R2890_Fut8_nsd_sg4
CCACTTTGTCAGTGCGTCTG
516







R2891_Fut8_nsd_sg5
TGTTCCCACTTTGTCAGTGC
517







R2892_Fut8_nsd_sg6
CTCAATGGGATGGAAGGCTG
518







R2893_Fut8_nsd_sg7
CATCCCATTGAGGAATACAT
519







R2894_Fut8_nsd_sg8
AGGAATACATGGTACACGTT
520







R2895_Fut8_nsd_sg9
AACGTGTACCATGTATTCCT
521







R2896_Fut8_nsd_sg10
TTCAACGTGTACCATGTATT
522







R2897_Fut8_nsd_sg11
AAGAACATTTTCAGCTTCTC
523







R2898_Fut8_nsd_sg12
GAGAAGCTGAAAATGTTCTT
524







R2899_Fut8_nsd_sg13
TCAGCTTCTCGAACGCAGAA
525







R2900_Fut8_nsd_sg14
CAGCTTCTCGAACGCAGAAT
526







R2901_Fut8_nsd_sg15
TGCGTTCGAGAAGCTGAAAA
527







R2902_Fut8_nsd_sg16
AGCTTCTCGAACGCAGAATG
528







R2903_Fut8_nsd_sg17
ATTCTGCGTTCGAGAAGCTG
529







R2904_Fut8_nsd_sg18
CATTCTGCGTTCGAGAAGCT
530







R2905_Fut8_nsd_sg19
TCGAACGCAGAATGAAAGTG
531







R2906_Fut8_nsd_sg20
ATCCACTTTCATTCTGCGTT
532







R2907_Fut8_nsd_sg21
TATCCACTTTCATTCTGCGT
533







R2908_Fut8_nsd_sg22
TTATCCACTTTCATTCTGCG
534







R2909_Fut8_nsd_sg23
TTTATCCACTTTCATTCTGC
535







R2910_Fut8_nsd_sg24
TTTTATCCACTTTCATTCTG
536







R2911_Fut8_nsd_sg25
AACAAAGAAGGGTCATCAGT
537







R2912_Fut8_nsd_sg26
CCTCCTTTAACAAAGAAGGG
538







R2913_Fut8_nsd_sg27
GCCTCCTTTAACAAAGAAGG
539







R2914_Fut8_nsd_sg28
TTTGTTAAAGGAGGCAAAGA
540







R2915_Fut8_nsd_sg29
GTTAAAGGAGGCAAAGACAA
541







R2916_Fut8_nsd_sg30
TTAAAGGAGGCAAAGACAAA
542







R2917_Fut8_nsd_sg31
TCTTTGCCTCCTTTAACAAA
543







R2918_Fut8_nsd_sg32
GTCTTTGCCTCCTTTAACAA
544







R2919_Fut8_nsd_sg33
GTCTAACTTACTTTGTCTTT
545







R2920_Fut8_nsd_sg34
TTGGTCTAACTTACTTTGTC
546

















TABLE I







CasΦ.12 gRNAs targeting human TRAC in T cells












Name
Spacer sequence (5′ --> 3′), shown as DNA
SEQ ID NO







R3040_
CTTTCAAGACTAATAGAT
547



CasP
TGCTCCTTACGAGGAGAC




hi12
TGGATATCTGTGGGACAA





GA








R3041_
CTTTCAAGACTAATAGAT
548



CasP
TGCTCCTTACGAGGAGAC




hi12
TCCCACAGATATCCAGAA





CC








R3042_
CTTTCAAGACTAATAGAT
549



CasP
TGCTCCTTACGAGGAGAC




hi12
GAGTCTCTCAGCTGGTAC





AC








R3043_
CTTTCAAGACTAATAGAT
550



CasP
TGCTCCTTACGAGGAGAC




hi12
AGAGTCTCTCAGCTGGTA





CA








R3044_
CTTTCAAGACTAATAGAT
551



CasP
TGCTCCTTACGAGGAGAC




hi12
TCACTGGATTTAGAGTCT





CT








R3045_
CTTTCAAGACTAATAGAT
552



CasP
TGCTCCTTACGAGGAGAC




hi12
AGAATCAAAATCGGTGAA





TA








R3046_
CTTTCAAGACTAATAGAT
553



CasP
TGCTCCTTACGAGGAGAC




hi12
GAGAATCAAAATCGGTGA





AT








R3047_
CTTTCAAGACTAATAGAT
554



CasP
TGCTCCTTACGAGGAGAC




hi12
ACCGATTTTGATTCTCAA





AC








R3048_
CTTTCAAGACTAATAGAT
555



CasP
TGCTCCTTACGAGGAGAC




hi12
TTTGAGAATCAAAATCGG





TG








R3049_
CTTTCAAGACTAATAGAT
556



CasP
TGCTCCTTACGAGGAGAC




hi12
GTTTGAGAATCAAAATCG





GT








R3050_
CTTTCAAGACTAATAGAT
557



CasP
TGCTCCTTACGAGGAGAC




hi12
TGATTCTCAAACAAATGT





GT








R3051_
CTTTCAAGACTAATAGAT
558



CasP
TGCTCCTTACGAGGAGAC




hi12
GATTCTCAAACAAATGTG





TC








R3052_
CTTTCAAGACTAATAGAT
559



CasP
TGCTCCTTACGAGGAGAC




hi12
ATTCTCAAACAAATGTGT





CA








R3053_
CTTTCAAGACTAATAGAT
560



CasP
TGCTCCTTACGAGGAGAC




hi12
TGACACATTTGTTTGAGA





AT








R3054_
CTTTCAAGACTAATAGAT
561



CasP
TGCTCCTTACGAGGAGAC




hi12
TCAAACAAATGTGTCACA





AA








R3055_
CTTTCAAGACTAATAGAT
562



CasP
TGCTCCTTACGAGGAGAC




hi12
GTGACACATTTGTTTGAG





AA








R3056_
CTTTCAAGACTAATAGAT
563



CasP
TGCTCCTTACGAGGAGAC




hi12
CTTTGTGACACATTTGTT





TG








R3057_
CTTTCAAGACTAATAGAT
564



CasP
TGCTCCTTACGAGGAGAC




hi12
TGATGTGTATATCACAGA





CA








R3058_
CTTTCAAGACTAATAGAT
565



CasP
TGCTCCTTACGAGGAGAC




hi12
TCTGTGATATACACATCA





GA








R3059_
CTTTCAAGACTAATAGAT
566



CasP
TGCTCCTTACGAGGAGAC




hi12
GTCTGTGATATACACATC





AG








R3060_
CTTTCAAGACTAATAGAT
567



CasP
TGCTCCTTACGAGGAGAC




hi12
TGTCTGTGATATACACAT





CA








R3061_
CTTTCAAGACTAATAGAT
568



CasP
TGCTCCTTACGAGGAGAC




hi12
AAGTCCATAGACCTCATG





TC








R3062_
CTTTCAAGACTAATAGAT
569



CasP
TGCTCCTTACGAGGAGAC




hi12
CTCTTGAAGTCCATAGAC





CT








R3063_
CTTTCAAGACTAATAGAT
570



CasP
TGCTCCTTACGAGGAGAC




hi12
AAGAGCAACAGTGCTGTG





GC








R3064_
CTTTCAAGACTAATAGAT
571



CasP
TGCTCCTTACGAGGAGAC




hi12
CTCCAGGCCACAGCACTG





TT








R3065_
CTTTCAAGACTAATAGAT
572



CasP
TGCTCCTTACGAGGAGAC




hi12
TTGCTCCAGGCCACAGCA





CT








R3066_
CTTTCAAGACTAATAGAT
573



CasP
TGCTCCTTACGAGGAGAC




hi12
GTTGCTCCAGGCCACAGC





AC








R3067_
CTTTCAAGACTAATAGAT
574



CasP
TGCTCCTTACGAGGAGAC




hi12
CACATGCAAAGTCAGATT





TG








R3068_
CTTTCAAGACTAATAGAT
575



CasP
TGCTCCTTACGAGGAGAC




hi12
GCACATGCAAAGTCAGAT





TT








R3069_
CTTTCAAGACTAATAGAT
576



CasP
TGCTCCTTACGAGGAGAC




hi12
GCATGTGCAAACGCCTTC





AA








R3070_
CTTTCAAGACTAATAGAT
577



CasP
TGCTCCTTACGAGGAGAC




hi12
AAGGCGTTTGCACATGCA





AA








R3071_
CTTTCAAGACTAATAGAT
578



CasP
TGCTCCTTACGAGGAGAC




hi12
CATGTGCAAACGCCTTCA





AC








R3072_
CTTTCAAGACTAATAGAT
579



CasP
TGCTCCTTACGAGGAGAC




hi12
TTGAAGGCGTTTGCACAT





GC








R3073_
CTTTCAAGACTAATAGAT
580



CasP
TGCTCCTTACGAGGAGAC




hi12
AACAACAGCATTATTCCA





GA








R3074_
CTTTCAAGACTAATAGAT
581



CasP
TGCTCCTTACGAGGAGAC




hi12
TGGAATAATGCTGTTGTT





GA








R3075_
CTTTCAAGACTAATAGAT
582



CasP
TGCTCCTTACGAGGAGAC




hi12
TTCCAGAAGACACCTTCT





TC








R3076_
CTTTCAAGACTAATAGAT
583



CasP
TGCTCCTTACGAGGAGAC




hi12
CAGAAGACACCTTCTTCC





CC








R3077_
CTTTCAAGACTAATAGAT
584



CasP
TGCTCCTTACGAGGAGAC




hi12
CCTGGGCTGGGGAAGAAG





GT








R3078_
CTTTCAAGACTAATAGAT
585



CasP
TGCTCCTTACGAGGAGAC




hi12
TTCCCCAGCCCAGGTAAG





GG








R3079_
CTTTCAAGACTAATAGAT
586



CasP
TGCTCCTTACGAGGAGAC




hi12
CCCAGCCCAGGTAAGGGC





AG








R3080_
CTTTCAAGACTAATAGAT
587



CasP
TGCTCCTTACGAGGAGAC




hi12
TAAAAGGAAAAACAGACA





TT








R3081_
CTTTCAAGACTAATAGAT
588



CasP
TGCTCCTTACGAGGAGAC




hi12
CTAAAAGGAAAAACAGAC





AT








R3082_
CTTTCAAGACTAATAGAT
589



CasP
TGCTCCTTACGAGGAGAC




hi12
TTCCTTTTAGAAAGTTCC





TG








R3083_
CTTTCAAGACTAATAGAT
590



CasP
TGCTCCTTACGAGGAGAC




hi12
TCCTTTTAGAAAGTTCCT





GT








R3084_
CTTTCAAGACTAATAGAT
591



CasP
TGCTCCTTACGAGGAGAC




hi12
CCTTTTAGAAAGTTCCTG





TG








R3085_
CTTTCAAGACTAATAGAT
592



CasP
TGCTCCTTACGAGGAGAC




hi12
CTTTTAGAAAGTTCCTGT





GA








R3086_
CTTTCAAGACTAATAGAT
593



CasP
TGCTCCTTACGAGGAGAC




hi12
TAGAAAGTTCCTGTGATG





TC








R3136_
CTTTCAAGACTAATAGAT
594



CasP
TGCTCCTTACGAGGAGAC




hi12
AGAAAGTTCCTGTGATGT





CA








R3137_
CTTTCAAGACTAATAGAT
595



CasP
TGCTCCTTACGAGGAGAC




hi12
GAAAGTTCCTGTGATGTC





AA








R3138_
CTTTCAAGACTAATAGAT
596



CasP
TGCTCCTTACGAGGAGAC




hi12
ACATCACAGGAACTTTCT





AA








R3139_
CTTTCAAGACTAATAGAT
597



CasP
TGCTCCTTACGAGGAGAC




hi12
CTGTGATGTCAAGCTGGT





CG








R3140_
CTTTCAAGACTAATAGAT
598



CasP
TGCTCCTTACGAGGAGAC




hi12
TCGACCAGCTTGACATCA





CA








R3141_
CTTTCAAGACTAATAGAT
599



CasP
TGCTCCTTACGAGGAGAC




hi12
CTCGACCAGCTTGACATC





AC








R3142_
CTTTCAAGACTAATAGAT
600



CasP
TGCTCCTTACGAGGAGAC




hi12
TCTCGACCAGCTTGACAT





CA








R3143_
CTTTCAAGACTAATAGAT
601



CasP
TGCTCCTTACGAGGAGAC




hi12
AAAGCTTTTCTCGACCAG





CT








R3144_
CTTTCAAGACTAATAGAT
602



CasP
TGCTCCTTACGAGGAGAC




hi12
CAAAGCTTTTCTCGACCA





GC








R3145_
CTTTCAAGACTAATAGAT
603



CasP
TGCTCCTTACGAGGAGAC




hi12
CCTGTTTCAAAGCTTTTC





TC








R3146_
CTTTCAAGACTAATAGAT
604



CasP
TGCTCCTTACGAGGAGAC




hi12
GAAACAGGTAAGACAGGG





GT








R3147_
CTTTCAAGACTAATAGAT
605



CasP
TGCTCCTTACGAGGAGAC




hi12
AAACAGGTAAGACAGGGG





TC
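
In this listing each entry of the preceding table wraps across several lines, and every entry begins with the same 36 nucleotides followed by a 20-nucleotide portion that differs between entries. The sketch below rejoins the wrapped fragments of three entries and separates the shared prefix from the variable portion. It assumes, as an interpretation and not as a statement from the source, that the shared prefix corresponds to the repeat (the "additional region") and the remainder to the target-complementary spacer; the variable names are illustrative only.

```python
import os

# Wrapped fragments for three entries, copied line by line from the table.
fragments = {
    "R3040_CasPhi12": ["CTTTCAAGACTAATAGAT", "TGCTCCTTACGAGGAGAC", "TGGATATCTGTGGGACAA", "GA"],
    "R3041_CasPhi12": ["CTTTCAAGACTAATAGAT", "TGCTCCTTACGAGGAGAC", "TCCCACAGATATCCAGAA", "CC"],
    "R3042_CasPhi12": ["CTTTCAAGACTAATAGAT", "TGCTCCTTACGAGGAGAC", "GAGTCTCTCAGCTGGTAC", "AC"],
}

# Rejoin each wrapped cell into one contiguous 5' --> 3' sequence.
grnas = {name: "".join(parts) for name, parts in fragments.items()}

# os.path.commonprefix compares strings character by character,
# so it also works for plain sequences (not only file paths).
shared_prefix = os.path.commonprefix(list(grnas.values()))

# Assumption: everything after the shared prefix is the variable (spacer) portion.
variable = {name: seq[len(shared_prefix):] for name, seq in grnas.items()}

print(len(shared_prefix), shared_prefix)
for name, spacer in variable.items():
    print(name, spacer)
```

With only a few entries, a shared prefix can extend by chance into the variable portion if the spacers happen to start with the same base, so the split is more reliable when computed over many rows of the table.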

















TABLE J







CasΦ.32 gRNAs targeting human TRAC in T cells












Name
Spacer sequence (5′ --> 3′), shown as DNA
SEQ ID NO







R3040_
GCTGGGGACCGATCCTGA
606



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTGGATATCTGTGGGACA





AGA








R3041_
GCTGGGGACCGATCCTGA
607



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTCCCACAGATATCCAGA





ACC








R3042_
GCTGGGGACCGATCCTGA
608



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGAGTCTCTCAGCTGGTA





CAC








R3043_
GCTGGGGACCGATCCTGA
609



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CAGAGTCTCTCAGCTGGT





ACA








R3044_
GCTGGGGACCGATCCTGA
610



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTCACTGGATTTAGAGTC





TCT








R3045_
GCTGGGGACCGATCCTGA
611



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CAGAATCAAAATCGGTGA





ATA








R3046_
GCTGGGGACCGATCCTGA
612



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGAGAATCAAAATCGGTG





AAT








R3047_
GCTGGGGACCGATCCTGA
613



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CACCGATTTTGATTCTCA





AAC








R3048_
GCTGGGGACCGATCCTGA
614



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTTTGAGAATCAAAATCG





GTG








R3049_
GCTGGGGACCGATCCTGA
615



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGTTTGAGAATCAAAATC





GGT








R3050_
GCTGGGGACCGATCCTGA
616



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTGATTCTCAAACAAATG





TGT








R3051_
GCTGGGGACCGATCCTGA
617



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGATTCTCAAACAAATGT





GTC








R3052_
GCTGGGGACCGATCCTGA
618



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CATTCTCAAACAAATGTG





TCA








R3053_
GCTGGGGACCGATCCTGA
619



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTGACACATTTGTTTGAG





AAT








R3054_
GCTGGGGACCGATCCTGA
620



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTCAAACAAATGTGTCAC





AAA








R3055_
GCTGGGGACCGATCCTGA
621



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGTGACACATTTGTTTGA





GAA








R3056_
GCTGGGGACCGATCCTGA
622



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCTTTGTGACACATTTGT





TTG








R3057_
GCTGGGGACCGATCCTGA
623



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTGATGTGTATATCACAG





ACA








R3058_
GCTGGGGACCGATCCTGA
624



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTCTGTGATATACACATC





AGA








R3059_
GCTGGGGACCGATCCTGA
625



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGTCTGTGATATACACAT





CAG








R3060_
GCTGGGGACCGATCCTGA
626



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTGTCTGTGATATACACA





TCA








R3061_
GCTGGGGACCGATCCTGA
627



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CAAGTCCATAGACCTCAT





GTC








R3062_
GCTGGGGACCGATCCTGA
628



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCTCTTGAAGTCCATAGA





CCT








R3063_
GCTGGGGACCGATCCTGA
629



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CAAGAGCAACAGTGCTGT





GGC








R3064_
GCTGGGGACCGATCCTGA
630



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCTCCAGGCCACAGCACT





GTT








R3065_
GCTGGGGACCGATCCTGA
631



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTTGCTCCAGGCCACAGC





ACT








R3066_
GCTGGGGACCGATCCTGA
632



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGTTGCTCCAGGCCACAG





CAC








R3067_
GCTGGGGACCGATCCTGA
633



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCACATGCAAAGTCAGAT





TTG








R3068_
GCTGGGGACCGATCCTGA
634



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGCACATGCAAAGTCAGA





TTT








R3069_
GCTGGGGACCGATCCTGA
635



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGCATGTGCAAACGCCTT





CAA








R3070_
GCTGGGGACCGATCCTGA
636



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CAAGGCGTTTGCACATGC





AAA








R3071_
GCTGGGGACCGATCCTGA
637



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCATGTGCAAACGCCTTC





AAC








R3072_
GCTGGGGACCGATCCTGA
638



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTTGAAGGCGTTTGCACA





TGC








R3073_
GCTGGGGACCGATCCTGA
639



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CAACAACAGCATTATTCC





AGA








R3074_
GCTGGGGACCGATCCTGA
640



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTGGAATAATGCTGTTGT





TGA








R3075_
GCTGGGGACCGATCCTGA
641



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTTCCAGAAGACACCTTC





TTC








R3076_
GCTGGGGACCGATCCTGA
642



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCAGAAGACACCTTCTTC





CCC








R3077_
GCTGGGGACCGATCCTGA
643



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCCTGGGCTGGGGAAGAA





GGT








R3078_
GCTGGGGACCGATCCTGA
644



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTTCCCCAGCCCAGGTAA





GGG








R3079_
GCTGGGGACCGATCCTGA
645



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCCCAGCCCAGGTAAGGG





CAG








R3080_
GCTGGGGACCGATCCTGA
646



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTAAAAGGAAAAACAGAC





ATT








R3081_
GCTGGGGACCGATCCTGA
647



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCTAAAAGGAAAAACAGA





CAT








R3082_
GCTGGGGACCGATCCTGA
648



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTTCCTTTTAGAAAGTTC





CTG








R3083_
GCTGGGGACCGATCCTGA
649



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTCCTTTTAGAAAGTTCC





TGT








R3084_
GCTGGGGACCGATCCTGA
650



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCCTTTTAGAAAGTTCCT





GTG








R3085_
GCTGGGGACCGATCCTGA
651



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCTTTTAGAAAGTTCCTG





TGA








R3086_
GCTGGGGACCGATCCTGA
652



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTAGAAAGTTCCTGTGAT





GTC








R3136_
GCTGGGGACCGATCCTGA
653



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CAGAAAGTTCCTGTGATG





TCA








R3137_
GCTGGGGACCGATCCTGA
654



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGAAAGTTCCTGTGATGT





CAA








R3138_
GCTGGGGACCGATCCTGA
655



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CACATCACAGGAACTTTC





TAA








R3139_
GCTGGGGACCGATCCTGA
656



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCTGTGATGTCAAGCTGG





TCG








R3140_
GCTGGGGACCGATCCTGA
657



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTCGACCAGCTTGACATC





ACA








R3141_
GCTGGGGACCGATCCTGA
658



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCTCGACCAGCTTGACAT





CAC








R3142_
GCTGGGGACCGATCCTGA
659



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CTCTCGACCAGCTTGACA





TCA








R3143_
GCTGGGGACCGATCCTGA
660



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CAAAGCTTTTCTCGACCA





GCT








R3144_
GCTGGGGACCGATCCTGA
661



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCAAAGCTTTTCTCGACC





AGC








R3145_
GCTGGGGACCGATCCTGA
662



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CCCTGTTTCAAAGCTTTT





CTC








R3146_
GCTGGGGACCGATCCTGA
663



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CGAAACAGGTAAGACAGG





GGT








R3147_
GCTGGGGACCGATCCTGA
664



Cas
TTGCTCGCTGCGGCGAGA




Phi32
CAAACAGGTAAGACAGGG





GTC

















TABLE K







CasΦ.12 gRNAs targeting human B2M in T cells












Name
Spacer sequence (5′ --> 3′), shown as DNA
SEQ ID NO







R3087_
CTTTCAAGACTAATAGAT
665



CasP
TGCTCCTTACGAGGAGAC




hi12
AATATAAGTGGAGGCGTC





GC








R3088_
CTTTCAAGACTAATAGAT
666



CasP
TGCTCCTTACGAGGAGAC




hi12
ATATAAGTGGAGGCGTCG





CG








R3089_
CTTTCAAGACTAATAGAT
667



CasP
TGCTCCTTACGAGGAGAC




hi12
AGGAATGCCCGCCAGCGC





GA








R3090_
CTTTCAAGACTAATAGAT
668



CasP
TGCTCCTTACGAGGAGAC




hi12
CTGAAGCTGACAGCATTC





GG








R3091_
CTTTCAAGACTAATAGAT
669



CasP
TGCTCCTTACGAGGAGAC




hi12
GGGCCGAGATGTCTCGCT





CC








R3092_
CTTTCAAGACTAATAGAT
670



CasP
TGCTCCTTACGAGGAGAC




hi12
GCTGTGCTCGCGCTACTC





TC








R3093_
CTTTCAAGACTAATAGAT
671



CasP
TGCTCCTTACGAGGAGAC




hi12
CTGGCCTGGAGGCTATCC





AG








R3094_
CTTTCAAGACTAATAGAT
672



CasP
TGCTCCTTACGAGGAGAC




hi12
TGGCCTGGAGGCTATCCA





GC








R3095_
CTTTCAAGACTAATAGAT
673



CasP
TGCTCCTTACGAGGAGAC




hi12
ATGTGTCTTTTCCCGATA





TT








R3096_
CTTTCAAGACTAATAGAT
674



CasP
TGCTCCTTACGAGGAGAC




hi12
TCCCGATATTCCTCAGGT





AC








R3097_
CTTTCAAGACTAATAGAT
675



CasP
TGCTCCTTACGAGGAGAC




hi12
CCCGATATTCCTCAGGTA





CT








R3098_
CTTTCAAGACTAATAGAT
676



CasP
TGCTCCTTACGAGGAGAC




hi12
CCGATATTCCTCAGGTAC





TC








R3099_
CTTTCAAGACTAATAGAT
677



CasP
TGCTCCTTACGAGGAGAC




hi12
GAGTACCTGAGGAATATC





GG








R3100_
CTTTCAAGACTAATAGAT
678



CasP
TGCTCCTTACGAGGAGAC




hi12
GGAGTACCTGAGGAATAT





CG








R3101_
CTTTCAAGACTAATAGAT
679



CasP
TGCTCCTTACGAGGAGAC




hi12
CTCAGGTACTCCAAAGAT





TC








R3102_
CTTTCAAGACTAATAGAT
680



CasP
TGCTCCTTACGAGGAGAC




hi12
AGGTTTACTCACGTCATC





CA








R3103_
CTTTCAAGACTAATAGAT
681



CasP
TGCTCCTTACGAGGAGAC




hi12
ACTCACGTCATCCAGCAG





AG








R3104_
CTTTCAAGACTAATAGAT
682



CasP
TGCTCCTTACGAGGAGAC




hi12
CTCACGTCATCCAGCAGA





GA








R3105_
CTTTCAAGACTAATAGAT
683



CasP
TGCTCCTTACGAGGAGAC




hi12
TCTGCTGGATGACGTGAG





TA








R3106_
CTTTCAAGACTAATAGAT
684



CasP
TGCTCCTTACGAGGAGAC




hi12
CATTCTCTGCTGGATGAC





GT








R3107_
CTTTCAAGACTAATAGAT
685



CasP
TGCTCCTTACGAGGAGAC




hi12
CCATTCTCTGCTGGATGA





CG








R3108_
CTTTCAAGACTAATAGAT
686



CasP
TGCTCCTTACGAGGAGAC




hi12
ACTTTCCATTCTCTGCTG





GA








R3109_
CTTTCAAGACTAATAGAT
687



CasP
TGCTCCTTACGAGGAGAC




hi12
GACTTTCCATTCTCTGCT





GG








R3110_
CTTTCAAGACTAATAGAT
688



CasP
TGCTCCTTACGAGGAGAC




hi12
AGGAAATTTGACTTTCCA





TT








R3111_
CTTTCAAGACTAATAGAT
689



CasP
TGCTCCTTACGAGGAGAC




hi12
CCTGAATTGCTATGTGTC





TG








R3112_
CTTTCAAGACTAATAGAT
690



CasP
TGCTCCTTACGAGGAGAC




hi12
CTGAATTGCTATGTGTCT





GG








R3113_
CTTTCAAGACTAATAGAT
691



CasP
TGCTCCTTACGAGGAGAC




hi12
CTATGTGTCTGGGTTTCA





TC








R3114_
CTTTCAAGACTAATAGAT
692



CasP
TGCTCCTTACGAGGAGAC




hi12
AATGTCGGATGGATGAAA





CC








R3115_
CTTTCAAGACTAATAGAT
693



CasP
TGCTCCTTACGAGGAGAC




hi12
CATCCATCCGACATTGAA





GT








R3116_
CTTTCAAGACTAATAGAT
694



CasP
TGCTCCTTACGAGGAGAC




hi12
ATCCATCCGACATTGAAG





TT








R3117_
CTTTCAAGACTAATAGAT
695



CasP
TGCTCCTTACGAGGAGAC




hi12
AGTAAGTCAACTTCAATG





TC








R3118_
CTTTCAAGACTAATAGAT
696



CasP
TGCTCCTTACGAGGAGAC




hi12
TTCAGTAAGTCAACTTCA





AT








R3119_
CTTTCAAGACTAATAGAT
697



CasP
TGCTCCTTACGAGGAGAC




hi12
AAGTTGACTTACTGAAGA





AT








R3120_
CTTTCAAGACTAATAGAT
698



CasP
TGCTCCTTACGAGGAGAC




hi12
ACTTACTGAAGAATGGAG





AG








R3121_
CTTTCAAGACTAATAGAT
699



CasP
TGCTCCTTACGAGGAGAC




hi12
TCTCTCCATTCTTCAGTA





AG








R3122_
CTTTCAAGACTAATAGAT
700



CasP
TGCTCCTTACGAGGAGAC




hi12
CTGAAGAATGGAGAGAGA





AT








R3123_
CTTTCAAGACTAATAGAT
701



CasP
TGCTCCTTACGAGGAGAC




hi12
AATTCTCTCTCCATTCTT





CA








R3124_
CTTTCAAGACTAATAGAT
702



CasP
TGCTCCTTACGAGGAGAC




hi12
CAATTCTCTCTCCATTCT





TC








R3125_
CTTTCAAGACTAATAGAT
703



CasP
TGCTCCTTACGAGGAGAC




hi12
TCAATTCTCTCTCCATTC





TT








R3126_
CTTTCAAGACTAATAGAT
704



CasP
TGCTCCTTACGAGGAGAC




hi12
TTCAATTCTCTCTCCATT





CT








R3127_
CTTTCAAGACTAATAGAT
705



CasP
TGCTCCTTACGAGGAGAC




hi12
AAAAAGTGGAGCATTCAG





AC








R3128_
CTTTCAAGACTAATAGAT
706



CasP
TGCTCCTTACGAGGAGAC




hi12
CTGAAAGACAAGTCTGAA





TG








R3129_
CTTTCAAGACTAATAGAT
707



CasP
TGCTCCTTACGAGGAGAC




hi12
AGACTTGTCTTTCAGCAA





GG








R3130_
CTTTCAAGACTAATAGAT
708



CasP
TGCTCCTTACGAGGAGAC




hi12
TCTTTCAGCAAGGACTGG





TC








R3131_
CTTTCAAGACTAATAGAT
709



CasP
TGCTCCTTACGAGGAGAC




hi12
CAGCAAGGACTGGTCTTT





CT








R3132_
CTTTCAAGACTAATAGAT
710



CasP
TGCTCCTTACGAGGAGAC




hi12
AGCAAGGACTGGTCTTTC





TA








R3133_
CTTTCAAGACTAATAGAT
711



CasP
TGCTCCTTACGAGGAGAC




hi12
CTATCTCTTGTACTACAC





TG








R3134_
CTTTCAAGACTAATAGAT
712



CasP
TGCTCCTTACGAGGAGAC




hi12
TATCTCTTGTACTACACT





GA








R3135_
CTTTCAAGACTAATAGAT
713



CasP
TGCTCCTTACGAGGAGAC




hi12
AGTGTAGTACAAGAGATA





GA








R3148_
CTTTCAAGACTAATAGAT
714



CasP
TGCTCCTTACGAGGAGAC




hi12
TACTACACTGAATTCACC





CC








R3149_
CTTTCAAGACTAATAGAT
715



CasP
TGCTCCTTACGAGGAGAC




hi12
AGTGGGGGTGAATTCAGT





GT








R3150_
CTTTCAAGACTAATAGAT
716



CasP
TGCTCCTTACGAGGAGAC




hi12
CAGTGGGGGTGAATTCAG





TG








R3151_
CTTTCAAGACTAATAGAT
717



CasP
TGCTCCTTACGAGGAGAC




hi12
TCAGTGGGGGTGAATTCA





GT








R3152_
CTTTCAAGACTAATAGAT
718



CasP
TGCTCCTTACGAGGAGAC




hi12
TTCAGTGGGGGTGAATTC





AG








R3153_
CTTTCAAGACTAATAGAT
719



CasP
TGCTCCTTACGAGGAGAC




hi12
ACCCCCACTGAAAAAGAT





GA








R3154_
CTTTCAAGACTAATAGAT
720



CasP
TGCTCCTTACGAGGAGAC




hi12
ACACGGCAGGCATACTCA





TC








R3155_
CTTTCAAGACTAATAGAT
721



CasP
TGCTCCTTACGAGGAGAC




hi12
GGCTGTGACAAAGTCACA





TG








R3156_
CTTTCAAGACTAATAGAT
722



CasP
TGCTCCTTACGAGGAGAC




hi12
GTCACAGCCCAAGATAGT





TA








R3157_
CTTTCAAGACTAATAGAT
723



CasP
TGCTCCTTACGAGGAGAC




hi12
TCACAGCCCAAGATAGTT





AA








R3158_
CTTTCAAGACTAATAGAT
724



CasP
TGCTCCTTACGAGGAGAC




hi12
ACTATCTTGGGCTGTGAC





AA








R3159_
CTTTCAAGACTAATAGAT
725



CasP
TGCTCCTTACGAGGAGAC




hi12
CCCCACTTAACTATCTTG





GG

















TABLE L







CasΦ.32 gRNAs targeting human B2M in T cells












Name
Spacer sequence (5′→3′), shown as DNA
SEQ ID NO







R3087_
GCTGGGGACCGATCCTGA
726



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAATATAAGTGGAGGCGT





CGC








R3088_
GCTGGGGACCGATCCTGA
727



CasP
TTGCTCGCTGCGGCGAGA




hi32
CATATAAGTGGAGGCGTC





GCG








R3089_
GCTGGGGACCGATCCTGA
728



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAGGAATGCCCGCCAGCG





CGA








R3090_
GCTGGGGACCGATCCTGA
729



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCTGAAGCTGACAGCATT





CGG








R3091_
GCTGGGGACCGATCCTGA
730



CasP
TTGCTCGCTGCGGCGAGA




hi32
CGGGCCGAGATGTCTCGC





TCC








R3092_
GCTGGGGACCGATCCTGA
731



CasP
TTGCTCGCTGCGGCGAGA




hi32
CGCTGTGCTCGCGCTACT





CTC








R3093_
GCTGGGGACCGATCCTGA
732



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCTGGCCTGGAGGCTATC





CAG








R3094_
GCTGGGGACCGATCCTGA
733



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTGGCCTGGAGGCTATCC





AGC








R3095_
GCTGGGGACCGATCCTGA
734



CasP
TTGCTCGCTGCGGCGAGA




hi32
CATGTGTCTTTTCCCGAT





ATT








R3096_
GCTGGGGACCGATCCTGA
735



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTCCCGATATTCCTCAGG





TAC








R3097_
GCTGGGGACCGATCCTGA
736



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCCCGATATTCCTCAGGT





ACT








R3098_
GCTGGGGACCGATCCTGA
737



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCCGATATTCCTCAGGTA





CTC








R3099_
GCTGGGGACCGATCCTGA
738



CasP
TTGCTCGCTGCGGCGAGA




hi32
CGAGTACCTGAGGAATAT





CGG








R3100_
GCTGGGGACCGATCCTGA
739



CasP
TTGCTCGCTGCGGCGAGA




hi32
CGGAGTACCTGAGGAATA





TCG








R3101_
GCTGGGGACCGATCCTGA
740



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCTCAGGTACTCCAAAGA





TTC








R3102_
GCTGGGGACCGATCCTGA
741



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAGGTTTACTCACGTCAT





CCA








R3103_
GCTGGGGACCGATCCTGA
742



CasP
TTGCTCGCTGCGGCGAGA




hi32
CACTCACGTCATCCAGCA





GAG








R3104_
GCTGGGGACCGATCCTGA
743



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCTCACGTCATCCAGCAG





AGA








R3105_
GCTGGGGACCGATCCTGA
744



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTCTGCTGGATGACGTGA





GTA








R3106_
GCTGGGGACCGATCCTGA
745



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCATTCTCTGCTGGATGA





CGT








R3107_
GCTGGGGACCGATCCTGA
746



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCCATTCTCTGCTGGATG





ACG








R3108_
GCTGGGGACCGATCCTGA
747



CasP
TTGCTCGCTGCGGCGAGA




hi32
CACTTTCCATTCTCTGCT





GGA








R3109_
GCTGGGGACCGATCCTGA
748



CasP
TTGCTCGCTGCGGCGAGA




hi32
CGACTTTCCATTCTCTGC





TGG








R3110_
GCTGGGGACCGATCCTGA
749



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAGGAAATTTGACTTTCC





ATT








R3111_
GCTGGGGACCGATCCTGA
750



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCCTGAATTGCTATGTGT





CTG








R3112_
GCTGGGGACCGATCCTGA
751



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCTGAATTGCTATGTGTC





TGG








R3113_
GCTGGGGACCGATCCTGA
752



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCTATGTGTCTGGGTTTC





ATC








R3114_
GCTGGGGACCGATCCTGA
753



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAATGTCGGATGGATGAA





ACC








R3115_
GCTGGGGACCGATCCTGA
754



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCATCCATCCGACATTGA





AGT








R3116_
GCTGGGGACCGATCCTGA
755



CasP
TTGCTCGCTGCGGCGAGA




hi32
CATCCATCCGACATTGAA





GTT








R3117_
GCTGGGGACCGATCCTGA
756



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAGTAAGTCAACTTCAAT





GTC








R3118_
GCTGGGGACCGATCCTGA
757



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTTCAGTAAGTCAACTTC





AAT








R3119_
GCTGGGGACCGATCCTGA
758



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAAGTTGACTTACTGAAG





AAT








R3120_
GCTGGGGACCGATCCTGA
759



CasP
TTGCTCGCTGCGGCGAGA




hi32
CACTTACTGAAGAATGGA





GAG








R3121_
GCTGGGGACCGATCCTGA
760



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTCTCTCCATTCTTCAGT





AAG








R3122_
GCTGGGGACCGATCCTGA
761



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCTGAAGAATGGAGAGAG





AAT








R3123_
GCTGGGGACCGATCCTGA
762



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAATTCTCTCTCCATTCT





TCA








R3124_
GCTGGGGACCGATCCTGA
763



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCAATTCTCTCTCCATTC





TTC








R3125_
GCTGGGGACCGATCCTGA
764



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTCAATTCTCTCTCCATT





CTT








R3126_
GCTGGGGACCGATCCTGA
765



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTTCAATTCTCTCTCCAT





TCT








R3127_
GCTGGGGACCGATCCTGA
766



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAAAAAGTGGAGCATTCA





GAC








R3128_
GCTGGGGACCGATCCTGA
767



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCTGAAAGACAAGTCTGA





ATG








R3129_
GCTGGGGACCGATCCTGA
768



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAGACTTGTCTTTCAGCA





AGG








R3130_
GCTGGGGACCGATCCTGA
769



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTCTTTCAGCAAGGACTG





GTC








R3131_
GCTGGGGACCGATCCTGA
770



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCAGCAAGGACTGGTCTT





TCT








R3132_
GCTGGGGACCGATCCTGA
771



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAGCAAGGACTGGTCTTT





CTA








R3133_
GCTGGGGACCGATCCTGA
772



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCTATCTCTTGTACTACA





CTG








R3134_
GCTGGGGACCGATCCTGA
773



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTATCTCTTGTACTACAC





TGA








R3135_
GCTGGGGACCGATCCTGA
774



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAGTGTAGTACAAGAGAT





AGA








R3148_
GCTGGGGACCGATCCTGA
775



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTACTACACTGAATTCAC





CCC








R3149_
GCTGGGGACCGATCCTGA
776



CasP
TTGCTCGCTGCGGCGAGA




hi32
CAGTGGGGGTGAATTCAG





TGT








R3150_
GCTGGGGACCGATCCTGA
777



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCAGTGGGGGTGAATTCA





GTG








R3151_
GCTGGGGACCGATCCTGA
778



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTCAGTGGGGGTGAATTC





AGT








R3152_
GCTGGGGACCGATCCTGA
779



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTTCAGTGGGGGTGAATT





CAG








R3153_
GCTGGGGACCGATCCTGA
780



CasP
TTGCTCGCTGCGGCGAGA




hi32
CACCCCCACTGAAAAAGA





TGA








R3154_
GCTGGGGACCGATCCTGA
781



CasP
TTGCTCGCTGCGGCGAGA




hi32
CACACGGCAGGCATACTC





ATC








R3155_
GCTGGGGACCGATCCTGA
782



CasP
TTGCTCGCTGCGGCGAGA




hi32
CGGCTGTGACAAAGTCAC





ATG








R3156_
GCTGGGGACCGATCCTGA
783



CasP
TTGCTCGCTGCGGCGAGA




hi32
CGTCACAGCCCAAGATAG





TTA








R3157_
GCTGGGGACCGATCCTGA
784



CasP
TTGCTCGCTGCGGCGAGA




hi32
CTCACAGCCCAAGATAGT





TAA








R3158_
GCTGGGGACCGATCCTGA
785



CasP
TTGCTCGCTGCGGCGAGA




hi32
CACTATCTTGGGCTGTGA





CAA








R3159_
GCTGGGGACCGATCCTGA
786



CasP
TTGCTCGCTGCGGCGAGA




hi32
CCCCCACTTAACTATCTT





GGG
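
The entries of Table L are wrapped across several lines. A minimal Python sketch, assuming the fragments of a row are simply concatenated in the order shown, illustrates how one row can be reassembled and rewritten in RNA letters. The helper names `join_fragments` and `dna_to_rna` are hypothetical and are introduced here only for illustration.

```python
def join_fragments(fragments):
    """Concatenate the line-wrapped pieces of one table row into a single sequence."""
    return "".join(piece.strip() for piece in fragments)


def dna_to_rna(seq):
    """Rewrite a sequence shown as DNA in RNA letters (T becomes U)."""
    return seq.upper().replace("T", "U")


# Row R3095_CasPhi32 of Table L, copied fragment by fragment as wrapped in the table.
r3095_fragments = [
    "GCTGGGGACCGATCCTGA",
    "TTGCTCGCTGCGGCGAGA",
    "CATGTGTCTTTTCCCGAT",
    "ATT",
]

r3095_dna = join_fragments(r3095_fragments)
r3095_rna = dna_to_rna(r3095_dna)

print(len(r3095_dna))
print(r3095_rna)
```

The same two steps apply to any other row of the table; only the fragment list changes.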

















TABLE M







CasΦ.12 gRNAs targeting human PD1 in T cells












Name
Repeat + spacer sequence (5′→3′), shown as RNA
SEQ ID NO







R2921_
CUUUCAAGACUAAUAGAU
787



CasP
UGCUCCUUACGAGGAG




hi12
ACCCUUCCGCUCACCUCC





GCCU








R2922_
CUUUCAAGACUAAUAGAU
788



CasP
UGCUCCUUACGAGGAG




hi12
ACCCUUCCGCUCACCUCC





GCCU








R2923_
CUUUCAAGACUAAUAGAU
789



CasP
UGCUCCUUACGAGGAG




hi12
ACCGCUCACCUCCGCCUG





AGCA








R2924_
CUUUCAAGACUAAUAGAU
790



CasP
UGCUCCUUACGAGGAG




hi12
ACUCCACUGCUCAGGCGG





AGGU








R2925_
CUUUCAAGACUAAUAGAU
791



CasP
UGCUCCUUACGAGGAG




hi12
ACUAGCACCGCCCAGACG





ACUG








R2926_
CUUUCAAGACUAAUAGAU
792



CasP
UGCUCCUUACGAGGAG




hi12
ACAGGCAUGCAGAUCCCA





CAGG








R2927_
CUUUCAAGACUAAUAGAU
793



CasP
UGCUCCUUACGAGGAG




hi12
ACCACAGGCGCCCUGGCC





AGUC








R2928_
CUUUCAAGACUAAUAGAU
794



CasP
UGCUCCUUACGAGGAG




hi12
ACUCUGGGCGGUGCUACA





ACUG








R2929_
CUUUCAAGACUAAUAGAU
795



CasP
UGCUCCUUACGAGGAG




hi12
ACGCAUGCCUGGAGCAGC





CCCA








R2930_
CUUUCAAGACUAAUAGAU
796



CasP
UGCUCCUUACGAGGAG




hi12
ACUAGCACCGCCCAGACG





ACUG








R2931_
CUUUCAAGACUAAUAGAU
797



CasP
UGCUCCUUACGAGGAG




hi12
ACUGGCCGCCAGCCCAGU





UGUA








R2932_
CUUUCAAGACUAAUAGAU
798



CasP
UGCUCCUUACGAGGAG




hi12
ACCUUCCGCUCACCUCCG





CCUG








R2933_
CUUUCAAGACUAAUAGAU
799



CasP
UGCUCCUUACGAGGAG




hi12
ACCAGGGCCUGUCUGGGG





AGUC








R2934_
CUUUCAAGACUAAUAGAU
800



CasP
UGCUCCUUACGAGGAG




hi12
ACUCCCCAGCCCUGCUCG





UGGU








R2935_
CUUUCAAGACUAAUAGAU
801



CasP
UGCUCCUUACGAGGAG




hi12
ACGGUCACCACGAGCAGG





GCUG








R2936_
CUUUCAAGACUAAUAGAU
802



CasP
UGCUCCUUACGAGGAG




hi12
ACUCCCCUUCGGUCACCA





CGAG








R2937_
CUUUCAAGACUAAUAGAU
803



CasP
UGCUCCUUACGAGGAG




hi12
ACGAGAAGCUGCAGGUGA





AGGU








R2938_
CUUUCAAGACUAAUAGAU
804



CasP
UGCUCCUUACGAGGAG




hi12
ACACCUGCAGCUUCUCCA





ACAC








R2939_
CUUUCAAGACUAAUAGAU
805



CasP
UGCUCCUUACGAGGAG




hi12
ACUCCAACACAUCGGAGA





GCUU








R2940_
CUUUCAAGACUAAUAGAU
806



CasP
UGCUCCUUACGAGGAG




hi12
ACGCACGAAGCUCUCCGA





UGUG








R2941_
CUUUCAAGACUAAUAGAU
807



CasP
UGCUCCUUACGAGGAG




hi12
ACAGCACGAAGCUCUCCG





AUGU








R2942_
CUUUCAAGACUAAUAGAU
808



CasP
UGCUCCUUACGAGGAG




hi12
ACGUGCUAAACUGGUACC





GCAU








R2943_
CUUUCAAGACUAAUAGAU
809



CasP
UGCUCCUUACGAGGAG




hi12
ACCUGGGGCUCAUGCGGU





ACCA








R2944_
CUUUCAAGACUAAUAGAU
810



CasP
UGCUCCUUACGAGGAG




hi12
ACUCCGUCUGGUUGCUGG





GGCU








R2945_
CUUUCAAGACUAAUAGAU
811



CasP
UGCUCCUUACGAGGAG




hi12
ACCCCGAGGACCGCAGCC





AGCC








R2946_
CUUUCAAGACUAAUAGAU
812



CasP
UGCUCCUUACGAGGAG




hi12
ACUGUGACACGGAAGCGG





CAGU








R2947_
CUUUCAAGACUAAUAGAU
813



CasP
UGCUCCUUACGAGGAG




hi12
ACCGUGUCACACAACUGC





CCAA








R2948_
CUUUCAAGACUAAUAGAU
814



CasP
UGCUCCUUACGAGGAG




hi12
ACGGCAGUUGUGUGACAC





GGAA








R2949_
CUUUCAAGACUAAUAGAU
815



CasP
UGCUCCUUACGAGGAG




hi12
ACCACAUGAGCGUGGUCA





GGGC








R2950_
CUUUCAAGACUAAUAGAU
816



CasP
UGCUCCUUACGAGGAG




hi12
ACCGCCGGGCCCUGACCA





CGCU








R2951_
CUUUCAAGACUAAUAGAU
817



CasP
UGCUCCUUACGAGGAG




hi12
ACGGGGCCAGGGAGAUGG





CCCC








R2952_
CUUUCAAGACUAAUAGAU
818



CasP
UGCUCCUUACGAGGAG




hi12
ACAUCUGCGCCUUGGGGG





CCAG








R2953_
CUUUCAAGACUAAUAGAU
819



CasP
UGCUCCUUACGAGGAG




hi12
ACGAUCUGCGCCUUGGGG





GCCA








R2954_
CUUUCAAGACUAAUAGAU
820



CasP
UGCUCCUUACGAGGAG




hi12
ACCCAGACAGGCCCUGGA





ACCC








R2955_
CUUUCAAGACUAAUAGAU
821



CasP
UGCUCCUUACGAGGAG




hi12
ACCCAGCCCUGCUCGUGG





UGAC








R2956_
CUUUCAAGACUAAUAGAU
822



CasP
UGCUCCUUACGAGGAG




hi12
ACUCUCUGGAAGGGCACA





AAGG








R2957_
CUUUCAAGACUAAUAGAU
823



CasP
UGCUCCUUACGAGGAG




hi12
ACGUGCCCUUCCAGAGAG





AAGG








R2958_
CUUUCAAGACUAAUAGAU
824



CasP
UGCUCCUUACGAGGAG




hi12
ACUGCCCUUCCAGAGAGA





AGGG








R2959_
CUUUCAAGACUAAUAGAU
825



CasP
UGCUCCUUACGAGGAG




hi12
ACUGCCCUUCUCUCUGGA





AGGG








R2960_
CUUUCAAGACUAAUAGAU
826



CasP
UGCUCCUUACGAGGAG




hi12
ACCAGAGAGAAGGGCAGA





AGUG








R2961_
CUUUCAAGACUAAUAGAU
827



CasP
UGCUCCUUACGAGGAG




hi12
ACGAACUGGCCGGCUGGC





CUGG








R2962_
CUUUCAAGACUAAUAGAU
828



CasP
UGCUCCUUACGAGGAG




hi12
ACGGAACUGGCCGGCUGG





CCUG








R2963_
CUUUCAAGACUAAUAGAU
829



CasP
UGCUCCUUACGAGGAG




hi12
ACCAAACCCUGGUGGUUG





GUGU








R2964_
CUUUCAAGACUAAUAGAU
830



CasP
UGCUCCUUACGAGGAG




hi12
ACGUGUCGUGGGCGGCCU





GCUG








R2965_
CUUUCAAGACUAAUAGAU
831



CasP
UGCUCCUUACGAGGAG




hi12
ACCCUCGUGCGGCCCGGG





AGCA








R2966_
CUUUCAAGACUAAUAGAU
832



CasP
UGCUCCUUACGAGGAG




hi12
ACUCCCUGCAGAGAAACA





CACU








R2967_
CUUUCAAGACUAAUAGAU
833



CasP
UGCUCCUUACGAGGAG




hi12
ACCUCUGCAGGGACAAUA





GGAG








R2968_
CUUUCAAGACUAAUAGAU
834



CasP
UGCUCCUUACGAGGAG




hi12
ACUCUGCAGGGACAAUAG





GAGC








R2969_
CUUUCAAGACUAAUAGAU
835



CasP
UGCUCCUUACGAGGAG




hi12
ACCUCCUCAAAGAAGGAG





GACC








R2970_
CUUUCAAGACUAAUAGAU
836



CasP
UGCUCCUUACGAGGAG




hi12
ACUCCUCAAAGAAGGAGG





ACCC








R2971_
CUUUCAAGACUAAUAGAU
837



CasP
UGCUCCUUACGAGGAG




hi12
ACUCUGUGGACUAUGGGG





AGCU








R2972_
CUUUCAAGACUAAUAGAU
838



CasP
UGCUCCUUACGAGGAG




hi12
ACUCUCGCCACUGGAAAU





CCAG








R2973_
CUUUCAAGACUAAUAGAU
839



CasP
UGCUCCUUACGAGGAG




hi12
ACCCAGUGGCGAGAGAAG





ACCC








R2974_
CUUUCAAGACUAAUAGAU
840



CasP
UGCUCCUUACGAGGAG




hi12
ACCAGUGGCGAGAGAAGA





CCCC








R2975_
CUUUCAAGACUAAUAGAU
841



CasP
UGCUCCUUACGAGGAG




hi12
ACCGCUAGGAAAGACAAU





GGUG








R2976_
CUUUCAAGACUAAUAGAU
842



CasP
UGCUCCUUACGAGGAG




hi12
ACUCUUUCCUAGCGGAAU





GGGC








R2977_
CUUUCAAGACUAAUAGAU
843



CasP
UGCUCCUUACGAGGAG




hi12
ACCCUAGCGGAAUGGGCA





CCUC








R2978_
CUUUCAAGACUAAUAGAU
844



CasP
UGCUCCUUACGAGGAG




hi12
ACCUAGCGGAAUGGGCAC





CUCA








R2979_
CUUUCAAGACUAAUAGAU
845



CasP
UGCUCCUUACGAGGAG




hi12
ACGCCCCUCUGACCGGCU





UCCU








R2980_
CUUUCAAGACUAAUAGAU
846



CasP
UGCUCCUUACGAGGAG




hi12
ACCUUGGCCACCAGUGUU





CUGC








R2981_
CUUUCAAGACUAAUAGAU
847



CasP
UGCUCCUUACGAGGAG




hi12
ACGCCACCAGUGUUCUGC





AGAC








R2982_
CUUUCAAGACUAAUAGAU
848



CasP
UGCUCCUUACGAGGAG




hi12
ACUGCAGACCCUCCACCA





UGAG








R2983_
CUUUCAAGACUAAUAGAU
849



CasP
UGCUCCUUACGAGGAG




hi12
ACUCCUGAGGAAAUGCGC





UGAC








R2984_
CUUUCAAGACUAAUAGAU
850



CasP
UGCUCCUUACGAGGAG




hi12
ACCCUCAGGAGAAGCAGG





CAGG








R2985_
CUUUCAAGACUAAUAGAU
851



CasP
UGCUCCUUACGAGGAG




hi12
ACCUCAGGAGAAGCAGGC





AGGG








R2986_
CUUUCAAGACUAAUAGAU
852



CasP
UGCUCCUUACGAGGAG




hi12
ACCAGGCCGUCCAGGGGC





UGAG








R2987_
CUUUCAAGACUAAUAGAU
853



CasP
UGCUCCUUACGAGGAG




hi12
ACAGACAUGAGUCCUGUG





GUGG








R2988_
CUUUCAAGACUAAUAGAU
854



CasP
UGCUCCUUACGAGGAG




hi12
ACAGGUCCUGCCAGCACA





GAGC








R2989_
CUUUCAAGACUAAUAGAU
855



CasP
UGCUCCUUACGAGGAG




hi12
ACAGGGAGCUGGACGCAG





GCAG








R2990_
CUUUCAAGACUAAUAGAU
856



CasP
UGCUCCUUACGAGGAG




hi12
ACAGCCCCGGGCCGCAGG





CAGC








R2991_
CUUUCAAGACUAAUAGAU
857



CasP
UGCUCCUUACGAGGAG




hi12
ACAGGCAGGAGGCUCCGG





GGCG








R2992_
CUUUCAAGACUAAUAGAU
858



CasP
UGCUCCUUACGAGGAG




hi12
ACGGGGCUGGUUGGAGAU





GGCC








R2993_
CUUUCAAGACUAAUAGAU
859



CasP
UGCUCCUUACGAGGAG




hi12
ACGAGAUGGCCUUGGAGC





AGCC








R2994_
CUUUCAAGACUAAUAGAU
860



CasP
UGCUCCUUACGAGGAG




hi12
ACGCUGCUCCAAGGCCAU





CUCC








R2995_
CUUUCAAGACUAAUAGAU
861



CasP
UGCUCCUUACGAGGAG




hi12
ACGAGCAGCCAAGGUGCC





CCUG








R2996_
CUUUCAAGACUAAUAGAU
862



CasP
UGCUCCUUACGAGGAG




hi12
ACGGGAUGCCACUGCCAG





GGGC








R2997_
CUUUCAAGACUAAUAGAU
863



CasP
UGCUCCUUACGAGGAG




hi12
ACCGGGAUGCCACUGCCA





GGGG








R2998_
CUUUCAAGACUAAUAGAU
864



CasP
UGCUCCUUACGAGGAG




hi12
ACGGCCCUGCGUCCAGGG





CGUU








R2999_
CUUUCAAGACUAAUAGAU
865



CasP
UGCUCCUUACGAGGAG




hi12
ACUCUGCUCCCUGCAGGC





CUAG








R3000_
CUUUCAAGACUAAUAGAU
866



CasP
UGCUCCUUACGAGGAG




hi12
ACUCUAGGCCUGCAGGGA





GCAG








R3001_
CUUUCAAGACUAAUAGAU
867



CasP
UGCUCCUUACGAGGAG




hi12
ACCCUGAAACUUCUCUAG





GCCU








R3002_
CUUUCAAGACUAAUAGAU
868



CasP
UGCUCCUUACGAGGAG




hi12
ACUGACCUUCCCUGAAAC





UUCU








R3003_
CUUUCAAGACUAAUAGAU
869



CasP
UGCUCCUUACGAGGAG




hi12
ACCAGGGAAGGUCAGAAG





AGCU








R3004_
CUUUCAAGACUAAUAGAU
870



CasP
UGCUCCUUACGAGGAG




hi12
ACAGGGAAGGUCAGAAGA





GCUC








R3005_
CUUUCAAGACUAAUAGAU
871



CasP
UGCUCCUUACGAGGAG




hi12
ACCUGCCCUGCCCACCAC





AGCC








R3006_
CUUUCAAGACUAAUAGAU
872



CasP
UGCUCCUUACGAGGAG




hi12
ACCCUGCCCUGCCCACCA





CAGC








R3007_
CUUUCAAGACUAAUAGAU
873



CasP
UGCUCCUUACGAGGAG




hi12
ACACACAUGCCCAGGCAG





CACC








R3008_
CUUUCAAGACUAAUAGAU
874



CasP
UGCUCCUUACGAGGAG




hi12
ACCACAUGCCCAGGCAGC





ACCU








R3009_
CUUUCAAGACUAAUAGAU
875



CasP
UGCUCCUUACGAGGAG




hi12
ACCCUGCCCCACAAAGGG





CCUG








R3010_
CUUUCAAGACUAAUAGAU
876



CasP
UGCUCCUUACGAGGAG




hi12
ACGUGGGGCAGGGAAGCU





GAGG








R3011_
CUUUCAAGACUAAUAGAU
877



CasP
UGCUCCUUACGAGGAG




hi12
ACUGGGGCAGGGAAGCUG





AGGC








R3012_
CUUUCAAGACUAAUAGAU
878



CasP
UGCUCCUUACGAGGAG




hi12
ACCUGCCUCAGCUUCCCU





GCCC








R3013_
CUUUCAAGACUAAUAGAU
879



CasP
UGCUCCUUACGAGGAG




hi12
ACCAGGCCCAGCCAGCAC





UCUG








R3014_
CUUUCAAGACUAAUAGAU
880



CasP
UGCUCCUUACGAGGAG




hi12
ACAGGCCCAGCCAGCACU





CUGG








R3015_
CUUUCAAGACUAAUAGAU
881



CasP
UGCUCCUUACGAGGAG




hi12
ACCACCCCAGCCCCUCAC





ACCA








R3016_
CUUUCAAGACUAAUAGAU
882



CasP
UGCUCCUUACGAGGAG




hi12
ACGGACCGUAGGAUGUCC





CUCU

















TABLE N







CasΦ.32 gRNAs targeting human PD1 in T cells









Name
Repeat + spacer RNA Sequence (5′→3′)
SEQ ID NO





R2921_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
883


CasPhi32
GACCCUUCCGCUCACCUCCGCCU






R2922_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
884


CasPhi32
GACCCUUCCGCUCACCUCCGCCU






R2923_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
885


CasPhi32
GACCGCUCACCUCCGCCUGAGCA






R2924_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
886


CasPhi32
GACUCCACUGCUCAGGCGGAGGU






R2925_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
887


CasPhi32
GACUAGCACCGCCCAGACGACUG






R2926_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
888


CasPhi32
GACAGGCAUGCAGAUCCCACAGG






R2927_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
889


CasPhi32
GACCACAGGCGCCCUGGCCAGUC






R2928_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
890


CasPhi32
GACUCUGGGCGGUGCUACAACUG






R2929_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
891


CasPhi32
GACGCAUGCCUGGAGCAGCCCCA






R2930_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
892


CasPhi32
GACUAGCACCGCCCAGACGACUG






R2931_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
893


CasPhi32
GACUGGCCGCCAGCCCAGUUGUA






R2932_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
894


CasPhi32
GACCUUCCGCUCACCUCCGCCUG






R2933_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
895


CasPhi32
GACCAGGGCCUGUCUGGGGAGUC






R2934_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
896


CasPhi32
GACUCCCCAGCCCUGCUCGUGGU






R2935_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
897


CasPhi32
GACGGUCACCACGAGCAGGGCUG






R2936_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
898


CasPhi32
GACUCCCCUUCGGUCACCACGAG






R2937_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
899


CasPhi32
GACGAGAAGCUGCAGGUGAAGGU






R2938_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
900


CasPhi32
GACACCUGCAGCUUCUCCAACAC






R2939_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
901


CasPhi32
GACUCCAACACAUCGGAGAGCUU






R2940_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
902


CasPhi32
GACGCACGAAGCUCUCCGAUGUG






R2941_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
903


CasPhi32
GACAGCACGAAGCUCUCCGAUGU






R2942_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
904


CasPhi32
GACGUGCUAAACUGGUACCGCAU






R2943_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
905


CasPhi32
GACCUGGGGCUCAUGCGGUACCA






R2944_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
906


CasPhi32
GACUCCGUCUGGUUGCUGGGGCU






R2945_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
907


CasPhi32
GACCCCGAGGACCGCAGCCAGCC






R2946_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
908


CasPhi32
GACUGUGACACGGAAGCGGCAGU






R2947_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
909


CasPhi32
GACCGUGUCACACAACUGCCCAA






R2948_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
910


CasPhi32
GACGGCAGUUGUGUGACACGGAA






R2949_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
911


CasPhi32
GACCACAUGAGCGUGGUCAGGGC






R2950_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
912


CasPhi32
GACCGCCGGGCCCUGACCACGCU






R2951_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
913


CasPhi32
GACGGGGCCAGGGAGAUGGCCCC






R2952_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
914


CasPhi32
GACAUCUGCGCCUUGGGGGCCAG






R2953_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
915


CasPhi32
GACGAUCUGCGCCUUGGGGGCCA






R2954_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
916


CasPhi32
GACCCAGACAGGCCCUGGAACCC






R2955_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
917


CasPhi32
GACCCAGCCCUGCUCGUGGUGAC






R2956_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
918


CasPhi32
GACUCUCUGGAAGGGCACAAAGG






R2957_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
919


CasPhi32
GACGUGCCCUUCCAGAGAGAAGG






R2958_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
920


CasPhi32
GACUGCCCUUCCAGAGAGAAGGG






R2959_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
921


CasPhi32
GACUGCCCUUCUCUCUGGAAGGG






R2960_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
922


CasPhi32
GACCAGAGAGAAGGGCAGAAGUG






R2961_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
923


CasPhi32
GACGAACUGGCCGGCUGGCCUGG






R2962_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
924


CasPhi32
GACGGAACUGGCCGGCUGGCCUG






R2963_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
925


CasPhi32
GACCAAACCCUGGUGGUUGGUGU






R2964_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
926


CasPhi32
GACGUGUCGUGGGCGGCCUGCUG






R2965_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
927


CasPhi32
GACCCUCGUGCGGCCCGGGAGCA






R2966_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
928


CasPhi32
GACUCCCUGCAGAGAAACACACU






R2967_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
929


CasPhi32
GACCUCUGCAGGGACAAUAGGAG






R2968_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
930


CasPhi32
GACUCUGCAGGGACAAUAGGAGC






R2969_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
931


CasPhi32
GACCUCCUCAAAGAAGGAGGACC






R2970_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
932


CasPhi32
GACUCCUCAAAGAAGGAGGACCC






R2971_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
933


CasPhi32
GACUCUGUGGACUAUGGGGAGCU






R2972_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
934


CasPhi32
GACUCUCGCCACUGGAAAUCCAG






R2973_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
935


CasPhi32
GACCCAGUGGCGAGAGAAGACCC






R2974_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
936


CasPhi32
GACCAGUGGCGAGAGAAGACCCC






R2975_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
937


CasPhi32
GACCGCUAGGAAAGACAAUGGUG






R2976_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
938


CasPhi32
GACUCUUUCCUAGCGGAAUGGGC






R2977_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
939


CasPhi32
GACCCUAGCGGAAUGGGCACCUC






R2978_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
940


CasPhi32
GACCUAGCGGAAUGGGCACCUCA






R2979_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
941


CasPhi32
GACGCCCCUCUGACCGGCUUCCU






R2980_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
942


CasPhi32
GACCUUGGCCACCAGUGUUCUGC






R2981_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
943


CasPhi32
GACGCCACCAGUGUUCUGCAGAC






R2982_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
944


CasPhi32
GACUGCAGACCCUCCACCAUGAG






R2983_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
945


CasPhi32
GACUCCUGAGGAAAUGCGCUGAC






R2984_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
946


CasPhi32
GACCCUCAGGAGAAGCAGGCAGG






R2985_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
947


CasPhi32
GACCUCAGGAGAAGCAGGCAGGG






R2986_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
948


CasPhi32
GACCAGGCCGUCCAGGGGCUGAG






R2987_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
949


CasPhi32
GACAGACAUGAGUCCUGUGGUGG






R2988_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
950


CasPhi32
GACAGGUCCUGCCAGCACAGAGC






R2989_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
951


CasPhi32
GACAGGGAGCUGGACGCAGGCAG






R2990_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
952


CasPhi32
GACAGCCCCGGGCCGCAGGCAGC






R2991_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
953


CasPhi32
GACAGGCAGGAGGCUCCGGGGCG






R2992_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
954


CasPhi32
GACGGGGCUGGUUGGAGAUGGCC






R2993_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
955


CasPhi32
GACGAGAUGGCCUUGGAGCAGCC






R2994_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
956


CasPhi32
GACGCUGCUCCAAGGCCAUCUCC






R2995_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
957


CasPhi32
GACGAGCAGCCAAGGUGCCCCUG






R2996_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
958


CasPhi32
GACGGGAUGCCACUGCCAGGGGC






R2997_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
959


CasPhi32
GACCGGGAUGCCACUGCCAGGGG






R2998_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
960


CasPhi32
GACGGCCCUGCGUCCAGGGCGUU






R2999_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
961


CasPhi32
GACUCUGCUCCCUGCAGGCCUAG






R3000_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
962


CasPhi32
GACUCUAGGCCUGCAGGGAGCAG






R3001_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
963


CasPhi32
GACCCUGAAACUUCUCUAGGCCU






R3002_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
964


CasPhi32
GACUGACCUUCCCUGAAACUUCU






R3003_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
965


CasPhi32
GACCAGGGAAGGUCAGAAGAGCU






R3004_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
966


CasPhi32
GACAGGGAAGGUCAGAAGAGCUC






R3005_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
967


CasPhi32
GACCUGCCCUGCCCACCACAGCC






R3006_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
968


CasPhi32
GACCCUGCCCUGCCCACCACAGC






R3007_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
969


CasPhi32
GACACACAUGCCCAGGCAGCACC






R3008_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
970


CasPhi32
GACCACAUGCCCAGGCAGCACCU






R3009_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
971


CasPhi32
GACCCUGCCCCACAAAGGGCCUG






R3010_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
972


CasPhi32
GACGUGGGGCAGGGAAGCUGAGG






R3011_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
973


CasPhi32
GACUGGGGCAGGGAAGCUGAGGC






R3012_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
974


CasPhi32
GACCUGCCUCAGCUUCCCUGCCC






R3013_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
975


CasPhi32
GACCAGGCCCAGCCAGCACUCUG






R3014_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
976


CasPhi32
GACAGGCCCAGCCAGCACUCUGG






R3015_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
977


CasPhi32
GACCACCCCAGCCCCUCACACCA






R3016_
GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA
978


CasPhi32
GACGGACCGUAGGAUGUCCCUCU
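
Each entry of Table N is given as a single repeat + spacer sequence. A minimal Python sketch, assuming that the portion shared by all rows corresponds to the repeat and the remainder to the spacer, shows one way to separate the two. This split is an inference from comparing rows and is not stated by the table itself; the common prefix of a few rows is only an upper bound on the repeat length, since spacers may share leading bases by coincidence. The variable names are hypothetical.

```python
import os.path

# Three rows of Table N (repeat + spacer, RNA, 5'->3'), joined from their wrapped fragments.
guides = {
    "R2921_CasPhi32": "GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA" "GACCCUUCCGCUCACCUCCGCCU",
    "R2923_CasPhi32": "GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA" "GACCGCUCACCUCCGCCUGAGCA",
    "R2924_CasPhi32": "GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGA" "GACUCCACUGCUCAGGCGGAGGU",
}

# os.path.commonprefix compares strings character by character,
# so it also works on plain sequence strings.
shared_prefix = os.path.commonprefix(list(guides.values()))

# Assumption: whatever follows the shared prefix is treated as the spacer.
spacers = {name: seq[len(shared_prefix):] for name, seq in guides.items()}

print(len(shared_prefix), shared_prefix)
for name, spacer in spacers.items():
    print(name, len(spacer), spacer)
```

Including more rows in `guides` can only shorten or preserve the shared prefix, so a larger sample gives a more reliable estimate of where the repeat ends.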
















TABLE O







CasΦ.12 gRNAs targeting human CIITA










Name
Repeat + spacer RNA sequence (5′→3′)
SEQ ID NO





R4503_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 979


C2TA_T1.1
AGACCUACACAAUGCGUUGCCUGG









R4504_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 980


C2TA_T1.2
AGACGGGCUCUGACAGGUAGGACC






R4505_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 981


C2TA_T1.3
AGACUGUAGGAAUCCCAGCCAGGC






R4506_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 982


C2TA_T1.8
AGACCCUGGCUCCACGCCCUGCUG






R4507_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 983


C2TA_T1.9
AGACGGGAAGCUGAGGGCACGAGG






R4508_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 984


C2TA_T2.1
AGACACAGCGAUGCUGACCCCCUG






R4509_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 985


C2TA_T2.2
AGACUUAACAGCGAUGCUGACCCC






R4510_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 986


C2TA_T2.3
AGACUAUGACCAGAUGGACCUGGC






R4511_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 987


C2TA_T2.4
AGACGGGCCCCUAGAAGGUGGCUA






R4512_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 988


C2TA_T2.5
AGACUAGGGGCCCCAACUCCAUGG






R4513_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 989


C2TA_T2.6
AGACAGAAGCUCCAGGUAGCCACC






R4514_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 990


C2TA_T2.7
AGACUCCAGCCAGGUCCAUCUGGU






R4515_CasPhi12_
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
 991


C2TA_T2.8
AGACUUCUCCAGCCAGGUCCAUCU






R5200_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2112



AGACAGCAGGCUGUUGUGUGACAU






R5201_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2113



AGACCAUGUCACACAACAGCCUGC






R5202_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2114



AGACUGUGACAUGGAAGGUGAUGA






R5203_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2115



AGACAUCACCUUCCAUGUCACACA






R5204_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2116



AGACGCAUAAGCCUCCCUGGUCUC






R5205_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2117



AGACCAGGACUCCCAGCUGGAGGG






R5206_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2118



AGACCUCAGGCCCUCCAGCUGGGA






R5207_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2119



AGACUGCUGGCAUCUCCAUACUCU






R5208_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2120



AGACUGCCCAACUUCUGCUGGCAU






R5209_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2121



AGACCUGCCCAACUUCUGCUGGCA






R5210_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2122



AGACUCUGCCCAACUUCUGCUGGC






R5211_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2123



AGACUGACUUUUCUGCCCAACUUC






R5212_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2124



AGACCUGACUUUUCUGCCCAACUU






R5213_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2125



AGACUCUGACUUUUCUGCCCAACU






R5214_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2126



AGACCCAGAGGAGCUUCCGGCAGA






R5215_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2127



AGACAGGUCUGCCGGAAGCUCCUC






R5216_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2128



AGACCGGCAGACCUGAAGCACUGG






R5217_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2129



AGACCAGUGCUUCAGGUCUGCCGG






R5218_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2130



AGACAACAGCGCAGGCAGUGGCAG






R5219_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2131



AGACAACCAGGAGCCAGCCUCCGG






R5220_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2132



AGACUCCAGGCGCAUCUGGCCGGA






R5221_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2133



AGACCUCCAGGCGCAUCUGGCCGG






R5222_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2134



AGACUCUCCAGGCGCAUCUGGCCG






R5223_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2135



AGACCUCCAGUUCCUCGUUGAGCU






R5224_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2136



AGACUCCAGUUCCUCGUUGAGCUG






R5225_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2137



AGACAGGCAGCUCAACGAGGAACU






R5226_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2138



AGACCUCGUUGAGCUGCCUGAAUC






R5227_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2139



AGACAGCUGCCUGAAUCUCCCUGA






R5228_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2140



AGACGUCCCCACCAUCUCCACUCU






R5229_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2141



AGACUCCCCACCAUCUCCACUCUG






R5230_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2142



AGACCCAGAGCCCAUGGGGCAGAG






R5231_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2143



AGACGCCAGAGCCCAUGGGGCAGA






R5232_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2144



AGACCAGCCUCAGAGAUUUGCCAG






R5233_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2145



AGACGGAGGCCGUGGACAGUGAAU






R5234_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2146



AGACACUGUCCACGGCCUCCCAAC






R5235_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2147



AGACGCUCCAUCAGCCACUGACCU






R5236_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2148



AGACAGGCAUGCUGGGCAGGUCAG






R5237_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2149



AGACCUCGGGAGGUCAGGGCAGGU






R5238_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2150



AGACGCUCGGGAGGUCAGGGCAGG






R5239_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2151



AGACGAGACCUCUCCAGCUGCCGG






R5240_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2152



AGACUUGGAGACCUCUCCAGCUGC






R5241_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2153



AGACGAAGCUUGUUGGAGACCUCU






R5242_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2154



AGACGGAAGCUUGUUGGAGACCUC






R5243_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2155



AGACUGGAAGCUUGUUGGAGACCU






R5244_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2156



AGACUACCGCUCACUGCAGGACAC






R5245_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2157



AGACCUGCUGCUCCUCUCCAGCCU






R5246_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2158



AGACCCGCUCCAGGCUCUUGCUGC






R5247_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2159



AGACUGCCCAGUCCGGGGUGGCCA






R5248_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2160



AGACGGCCAGCUGCCGUUCUGCCC






R5249_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2161



AGACGCAGCCAACAGCACCUCAGC






R5250_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2162



AGACGCUGCCAAGGAGCACCGGCG






R5251_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2163



AGACCCCAGCACAGCAAUCACUCG






R5252_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2164



AGACGCCCAGCACAGCAAUCACUC






R5253_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2165



AGACCUGUGCUGGGCAAAGCUGGU






R5254_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2166



AGACCCCUGACCAGCUUUGCCCAG






R5255_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2167



AGACGGCUGGGGCAGUGAGCCGGG






R5256_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2168



AGACUGGCCGGCUUCCCCAGUACG






R5257_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2169



AGACCCCAGUACGACUUUGUCUUC






R5258_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2170



AGACGUCUUCUCUGUCCCCUGCCA






R5259_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2171



AGACUCUUCUCUGUCCCCUGCCAU






R5260_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2172



AGACUCUGUCCCCUGCCAUUGCUU






R5261_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2173



AGACAAGCAAUGGCAGGGGACAGA






R5262_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2174



AGACCUUGAACCGUCCGGGGGAUG






R5263_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2175



AGACAACCGUCCGGGGGAUGCCUA






R5264_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2176



AGACUCCCUGGGCCCACAGCCACU






R5265_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2177



AGACAAGAUGUGGCUGAAAACCUC






R5266_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2178



AGACUCAGCCACAUCUUGAAGAGA






R5267_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2179



AGACCAGCCACAUCUUGAAGAGAC






R5268_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2180



AGACAGCCACAUCUUGAAGAGACC






R5269_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2181



AGACAAGAGACCUGACCGCGUUCU






R5270_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2182



AGACUGCUCAUCCUAGACGGCUUC






R5271_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2183



AGACCAGCUCCUCGAAGCCGUCUA






R5272_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2184



AGACCGCUUCCAGCUCCUCGAAGC






R5273_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2185



AGACGAGGAGCUGGAAGCGCAAGA






R5274_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2186



AGACCUGCACAGCACGUGCGGACC






R5275_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2187



AGACUGGAAAAGGCCGGCCAGCAG






R5276_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2188



AGACUUCUGGAAAAGGCCGGCCAG






R5277_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2189



AGACUCCAGAAGAAGCUGCUCCGA






R5278_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2190



AGACCCAGAAGAAGCUGCUCCGAG






R5279_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2191



AGACCAGAAGAAGCUGCUCCGAGG






R5280_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2192



AGACCACCCUCCUCCUCACAGCCC






R5281_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2193



AGACCUCAGGCUCUGGACCAGGCG






R5282_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2194



AGACGAGCUGUCCGGCUUCUCCAU






R5283_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2195



AGACAGCUGUCCGGCUUCUCCAUG






R5284_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2196



AGACUCCAUGGAGCAGGCCCAGGC






R5285_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2197



AGACGAGAGCUCAGGGAUGACAGA






R5286_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2198



AGACAGAGCUCAGGGAUGACAGAG






R5287_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2199



AGACGUGCUCUGUCAUCCCUGAGC






R5288_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2200



AGACUUCUCAGUCACAGCCACAGC






R5289_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2201



AGACUCAGUCACAGCCACAGCCCU






R5290_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2202



AGACGUGCCGGGCAGUGUGCCAGC






R5291_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2203



AGACUGCCGGGCAGUGUGCCAGCU






R5292_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2204



AGACGCGUCCUCCCCAAGCUCCAG






R5293_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2205



AGACGGGAGGACGCCAAGCUGCCC






R5294_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2206



AGACGCCAGCUCUGCCAGGGCCCC






R5295_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2207



AGACAUGUCUGCGGCCCAGCUCCC






R5392_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2208



AGACGAUGUCUGCGGCCCAGCUCC






R5393_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2209



AGACCCAUCCGCAGACGUGAGGAC






R5394_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2210



AGACGCCAUCGCCCAGGUCCUCAC






R5395_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2211



AGACGGCCAUCGCCCAGGUCCUCA






R5396_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2212



AGACGACUAAGCCUUUGGCCAUCG






R5397_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2213



AGACGUCCAACACCCACCGCGGGC






R5398_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2214



AGACCAGGAGGAAGCUGGGGAAGG






R5399_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2215



AGACCCCAGCUUCCUCCUGCAAUG






R5400_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2216



AGACCUCCUGCAAUGCUUCCUGGG






R5401_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2217



AGACCUGGGGGCCCUGUGGCUGGC






R5402_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2218



AGACGCCACUCAGAGCCAGCCACA






R5403_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2219



AGACCGCCACUCAGAGCCAGCCAC






R5404_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2220



AGACAUUUCGCCACUCAGAGCCAG






R5405_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2221



AGACUCCUUGAUUUCGCCACUCAG






R5406_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2222



AGACGGGUCAAUGCUAGGUACUGC






R5407_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2223



AGACCUUGGGGUCAAUGCUAGGUA






R5408_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2224



AGACUUCCUUGGGGUCAAUGCUAG






R5409_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2225



AGACACCCCAAGGAAGAAGAGGCC






R5410_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2226



AGACUCAUAGGGCCUCUUCUUCCU






R5411_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2227



AGACCUGGCUGGGCUGAUCUUCCA






R5412_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2228



AGACUGGCUGGGCUGAUCUUCCAG






R5413_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2229



AGACCAGCCUCCCGCCCGCUGCCU






R5414_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2230



AGACCUGUCCACCGAGGCAGCCGC






R5415_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2231



AGACUGCUUCCUGUCCACCGAGGC






R5416_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2232



AGACAGGUACCUCGCAAGCACCUU






R5417_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2233



AGACCGAGGUACCUGAAGCGGCUG






R5418_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2234



AGACCAGCCUCCUCGGCCUCGUGG






R5419_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2235



AGACGGCAGCACGUGGUACAGGAG






R5420_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2236



AGACGCAGCACGUGGUACAGGAGC






R5421_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2237



AGACUCUGGGCACCCGCCUCACGC






R5422_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2238



AGACCUGGGCACCCGCCUCACGCC






R5423_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2239



AGACUGGGCACCCGCCUCACGCCU






R5424_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2240



AGACCCCAGUACAUGUGCAUCAGG






R5425_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2241



AGACGCCCGCCGCCUCCAAGGCCU






R5426_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2242



AGACGAGGCGGCGGGCCAAGACUU






R5427_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2243



AGACUCCCUGGACCUCCGCAGCAC






R5428_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2244



AGACGCCCCUCUGGAUUGGGGAGC






R5429_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2245



AGACCCCCUCUGGAUUGGGGAGCC






R5430_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2246



AGACGGGAGCCUCGUGGGACUCAG






R5431_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2247



AGACGUCUCCCCAUGCUGCUGCAG






R5432_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2248



AGACUCCUCUGCUGCCUGAAGUAG






R5433_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2249



AGACAGGCAGCAGAGGAGAAGUUC






R5434_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2250



AGACAAAGGCUCGAUGGUGAACUU






R5435_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2251



AGACGAAAGGCUCGAUGGUGAACU






R5436_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2252



AGACACCAUCGAGCCUUUCAAAGC






R5437_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2253



AGACGCUUUGAAAGGCUCGAUGGU






R5438_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2254



AGACAGGGACUUGGCUUUGAAAGG






R5439_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2255



AGACCAAAGCCAAGUCCCUGAAGG






R5440_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2256



AGACAAAGCCAAGUCCCUGAAGGA






R5441_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2257



AGACCACAUCCUUCAGGGACUUGG






R5442_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2258



AGACCCAGGUCUUCCACAUCCUUC






R5443_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2259



AGACCCCAGGUCUUCCACAUCCUU






R5444_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2260



AGACCUCGGAAGACACAGCUGGGG






R5445_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2261



AGACGGUCCCGAACAGCAGGGAGC






R5446_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2262



AGACAGGUCCCGAACAGCAGGGAG






R5447_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2263



AGACUUUAGGUCCCGAACAGCAGG






R5448_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2264



AGACCUUUAGGUCCCGAACAGCAG






R5449_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2265



AGACGGGACCUAAAGAAACUGGAG






R5450_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2266



AGACGGGAAAGCCUGGGGGCCUGA






R5451_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2267



AGACGGGGAAAGCCUGGGGGCCUG






R5452_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2268



AGACCCCCAAACUGGUGCGGAUCC






R5453_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2269



AGACCCCAAACUGGUGCGGAUCCU






R5454_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2270



AGACUUCUCACUCAGCGCAUCCAG






R5455_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2271



AGACAGCUGGGGGAAGGUGGCUGA






R5456_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2272



AGACCCCCAGCUGAAGUCCUUGGA






R5457_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2273



AGACCAAGGACUUCAGCUGGGGGA






R5458_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2274



AGACCCAAGGACUUCAGCUGGGGG






R5459_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2275



AGACAGGGUUUCCAAGGACUUCAG






R5460_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2276



AGACUAGGCACCCAGGUCAGUGAU






R5461_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2277



AGACGUAGGCACCCAGGUCAGUGA






R5462_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2278



AGACGCUCGCUGCAUCCCUGCUCA






R5463_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2279



AGACGCCUGAGCAGGGAUGCAGCG






R5464_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2280



AGACUACAAUAACUGCAUCUGCGA






R5465_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2281



AGACGCUCGUGUGCUUCCGGACAU






R5466_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2282



AGACCGGACAUGGUGUCCCUCCGG






R5467_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2283



AGACACGGCUGCCGGGGCCCAGCA






R5468_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2284



AGACGGAGGUGUCCUCAUGUGGAG






R5469_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2285



AGACCUGGACACUGAAUGGGAUGG






R5470_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2286



AGACAGUGUCCAGGAACACCUGCA






R5471_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2287



AGACCAGGUGUUCCUGGACACUGA






R5472_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2288



AGACUUGCAGGUGUUCCUGGACAC






R5473_CasPhi12
CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG
2289



AGACACGGAUCAGCCUGAGAUGAU
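Each entry listed above is a single RNA made of a repeat followed by a spacer, but the listing does not mark where one ends and the other begins. As a minimal sketch, and assuming the repeat is simply the prefix shared by every entry, the two parts can be separated in Python by taking the longest common prefix. The helper name `split_repeat_and_spacers` is hypothetical; the three sequences are copied from SEQ ID NOs 2287 to 2289.

```python
from os.path import commonprefix

# Three entries copied from the listing above, keyed by SEQ ID NO.
# Each value joins the two wrapped lines of the sequence column.
GUIDES = {
    2287: "CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG" "AGACCAGGUGUUCCUGGACACUGA",
    2288: "CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG" "AGACUUGCAGGUGUUCCUGGACAC",
    2289: "CUUUCAAGACUAAUAGAUUGCUCCUUACGAGG" "AGACACGGAUCAGCCUGAGAUGAU",
}


def split_repeat_and_spacers(guides):
    """Return (shared_prefix, {key: remainder}) for a dict of gRNA sequences.

    Assumption (not stated in the listing): the repeat is the longest
    prefix common to all entries. With only a few entries the shared
    prefix can run into the spacer by chance, so use many rows in practice.
    """
    repeat = commonprefix(list(guides.values()))
    spacers = {key: seq[len(repeat):] for key, seq in guides.items()}
    return repeat, spacers


repeat, spacers = split_repeat_and_spacers(GUIDES)
print("shared prefix:", repeat)
for seq_id, spacer in spacers.items():
    print(seq_id, spacer, len(spacer))
```

Taking the prefix from the data, rather than hard-coding it, keeps the sketch usable for the tables that follow, which use a different repeat.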
















TABLE P

CasΦ.32 gRNAs targeting human CIITA

| Name | Repeat + spacer RNA sequence (5′→3′) | SEQ ID NO |
| --- | --- | --- |
| R4503_CasPhi32_C2TA_T1.1 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCUACACAAUGCGUUGCCUGG | 992 |
| R4504_CasPhi32_C2TA_T1.2 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGGCUCUGACAGGUAGGACC | 993 |
| R4505_CasPhi32_C2TA_T1.3 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGUAGGAAUCCCAGCCAGGC | 994 |
| R4506_CasPhi32_C2TA_T1.8 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCUGGCUCCACGCCCUGCUG | 995 |
| R4507_CasPhi32_C2TA_T1.9 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGGAAGCUGAGGGCACGAGG | 996 |
| R4508_CasPhi32_C2TA_T2.1 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACACAGCGAUGCUGACCCCCUG | 997 |
| R4509_CasPhi32_C2TA_T2.2 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUUAACAGCGAUGCUGACCCC | 998 |
| R4510_CasPhi32_C2TA_T2.3 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUAUGACCAGAUGGACCUGGC | 999 |
| R4511_CasPhi32_C2TA_T2.4 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGGCCCCUAGAAGGUGGCUA | 1000 |
| R4512_CasPhi32_C2TA_T2.5 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUAGGGGCCCCAACUCCAUGG | 1001 |
| R4513_CasPhi32_C2TA_T2.6 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAGAAGCUCCAGGUAGCCACC | 1002 |
| R4514_CasPhi32_C2TA_T2.7 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUCCAGCCAGGUCCAUCUGGU | 1003 |
| R4515_CasPhi32 C2TA_T2.8 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUUCUCCAGCCAGGUCCAUCU | 1004 |
















TABLE Q

CasΦ.12 gRNAs targeting mouse PCSK9

| Name | Repeat + spacer RNA sequence (5′→3′) | SEQ ID NO |
| --- | --- | --- |
| R4238_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCGCUGUUGCCGCCGCUGCU | 1005 |
| R4239_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCGCCGCUGCUGCUGCUGUU | 1006 |
| R4240_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCUGCUACUGUGCCCCACCGG | 1007 |
| R4241_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAUAAUCUCCAUCCUCGUCCU | 1008 |
| R4242_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUGAAGAGCUGAUGCUCGCCC | 1009 |
| R4243_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGAGCAACGGCGGAAGGUGGC | 1010 |
| R4244_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCUGGCAGCCUCCAGGCCUCC | 1011 |
| R4245_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUGGUGCUGAUGGAGGAGACC | 1012 |
| R4246_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAAUCUGUAGCCUCUGGGUCU | 1013 |
| R4247_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUUCAAUCUGUAGCCUCUGGG | 1014 |
| R4248_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGUUCAAUCUGUAGCCUCUGG | 1015 |
| R4249_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAACAAACUGCCCACCGCCUG | 1016 |
| R4250_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAUGACAUAGCCCCGGCGGGC | 1017 |
| R4251_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUACAUAUCUUUUAUGACCUC | 1018 |
| R4252_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUAUGACCUCUUCCCUGGCUU | 1019 |
| R4253_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAUGACCUCUUCCCUGGCUUC | 1020 |
| R4254_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUGACCUCUUCCCUGGCUUCU | 1021 |
| R4255_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACACCAAGAAGCCAGGGAAGAG | 1022 |
| R4256_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCUGGCUUCUUGGUGAAGAU | 1023 |
| R4257_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUUGGUGAAGAUGAGCAGUGA | 1024 |
| R4258_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGUGAAGAUGAGCAGUGACCU | 1025 |
| R4259_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCCCAUGUGGAGUACAUUGA | 1026 |
| R4260_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCUCAAUGUACUCCACAUGGG | 1027 |
| R4261_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAGACUCCUUUGUCUUC | 1028 |
| R4262_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGUCUUCGCCCAGAGCAUCCC | 1029 |
| R4263_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUCUUCGCCCAGAGCAUCCCA | 1030 |
| R4264_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGCCCAGAGCAUCCCAUGGAA | 1031 |
| R4265_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCAUGGGAUGCUCUGGGCGAA | 1032 |
| R4266_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGCUCCAGGUUCCAUGGGAUG | 1033 |
| R4267_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUCCCAGCAUGGCACCAGACA | 1034 |
| R4268_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCUCUGUCUGGUGCCAUGCUG | 1035 |
| R4269_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGAUACCAGCAUCCAGGGUGC | 1036 |
| R4270_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGGCAGGGUCACCAUCACC | 1037 |
| R4271_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAAGUCGGUGAUGGUGACCCU | 1038 |
| R4272_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAACAGCGUGCCGGAGGAGGA | 1039 |
| R4273_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGCCACACCAGCAUCCCGGCC | 1040 |
| R4274_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGCACACGCAGGCUGUGCAG | 1041 |
| R4275_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACACAGUUGAGCACACGCAGGC | 1042 |
| R4276_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCUUGACAGUUGAGCACACG | 1043 |
| R4277_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGCUGACUCUUCCGAAUAAAC | 1044 |
| R4278_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAUUCGGAAGAGUCAGCUAAU | 1045 |
| R4279_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUUCGGAAGAGUCAGCUAAUC | 1046 |
| R4280_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGAAGAGUCAGCUAAUCCAG | 1047 |
| R4281_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUGCUGCCCCUGGCCGGUGGG | 1048 |
| R4282_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAUGCGGCUAUACCCACC | 1049 |
| R4283_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCAGCUGCUGCAACCAGCAC | 1050 |
| R4284_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCAGCAGCUGGGAACUUCCGG | 1051 |
| R4285_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCGGGACGACGCCUGCCUCUA | 1052 |
| R4286_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGUGGCCCCGACUGUGAUGAC | 1053 |
| R4287_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCUUGGGGACUUUGGGGACU | 1054 |
| R4288_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGUCCCCAAAGUCCCCAAGGU | 1055 |
| R4289_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGGACUUUGGGGACUAAUUU | 1056 |
| R4290_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGGGACUAAUUUUGGACGCU | 1057 |
| R4291_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGGACUAAUUUUGGACGCUG | 1058 |
| R4292_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUGGACGCUGUGUGGAUCUCU | 1059 |
| R4293_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGACGCUGUGUGGAUCUCUU | 1060 |
| R4294_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGACGCUGUGUGGAUCUCUUU | 1061 |
| R4295_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCGGGGGCAAAGAGAUCCAC | 1062 |
| R4296_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGCCCCCGGGAAGGACAUCAU | 1063 |
| R4297_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCCCCGGGAAGGACAUCAUC | 1064 |
| R4298_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAUGUCACAGAGUGGGACCUC | 1065 |
| R4299_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUGGCUCGGAUGCUGAGCCGG | 1066 |
| R4300_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCCUGGCCGAGCUGCGGCAG | 1067 |
| R4301_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGUAGAGAAGUGGAUCAGCCU | 1068 |
| R4302_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGUAGAGAAGUGGAUCAGCC | 1069 |
| R4303_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUCUACCAAAGACGUCAUCAA | 1070 |
| R4304_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAUGACGUCUUUGGUAGAGAA | 1071 |
| R4305_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCUGAGGACCAGCAGGUGCU | 1072 |
| R4306_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGGGUCAGCACCUGCUGGUC | 1073 |
| R4307_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGAGUGGGCCCCGAGUGUGCC | 1074 |
| R4308_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUGGGGCACAGCGGGCUGUAG | 1075 |
| R4309_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUCCAGGAGCGGGAGGCGUCG | 1076 |
| R4310_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCAGACCUGCUGGCCUCCUAU | 1077 |
| R4311_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGGCCUUGCAGACCUGCUG | 1078 |
| R4312_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGGGGUGAGGGUGUCUAUGC | 1079 |
| R4313_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGGGUGAGGGUGUCUAUGCC | 1080 |
| R4314_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGCACGGGGAACCAGGCAGCA | 1081 |
| R4315_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCCGUGCCAACUGCAGCAUC | 1082 |
| R4316_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUGGAUGCUGCAGUUGGCACG | 1083 |
| R4317_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUGGUGGCAGUGGACAUGGGU | 1084 |
| R4318_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCACUUCCCAAUGGAAGCUGC | 1085 |
| R4319_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCAUUGGGAAGUGGAAGACCU | 1086 |
| R4320_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGAAGUGGAAGACCUUAGUG | 1087 |
| R4321_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGUGUCCGGAGGCAGCCUGCG | 1088 |
| R4322_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGCCACCAGGCGGCCAGUGUC | 1089 |
| R4323_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCUGCUGCCAUGCCCCAGGGC | 1090 |
| R4324_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCAGCCCUGGGGCAUGGCAGC | 1091 |
| R4325_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCAUUCCAGCCCUGGGGCAUG | 1092 |
| R4326_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGCAUUCCAGCCCUGGGGCAU | 1093 |
| R4327_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUGCAUUCCAGCCCUGGGGCA | 1094 |
| R4328_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAUUUUGCAUUCCAGCCCUGG | 1095 |
| R4329_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCAUCCAGUCAGGGUCCAUCC | 1096 |
| R4330_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUCCACGCUGUAGGCUCCCAG | 1097 |
| R4331_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCCACACACAGGUUGUCCACG | 1098 |
| R4332_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUCCACUGGUCCUGUCUGCUC | 1099 |
| R4333_CasPhi12 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACCUGAAGGCCGGCUCCGGCAG | 1100 |
















TABLE R

CasΦ.32 gRNAs targeting mouse PCSK9

| Name | Repeat + spacer RNA sequence (5′→3′) | SEQ ID NO |
| --- | --- | --- |
| R4238_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCGCUGUUGCCGCCGCUGCU | 1101 |
| R4239_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCGCCGCUGCUGCUGCUGUU | 1102 |
| R4240_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCUGCUACUGUGCCCCACCGG | 1103 |
| R4241_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAUAAUCUCCAUCCUCGUCCU | 1104 |
| R4242_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGAAGAGCUGAUGCUCGCCC | 1105 |
| R4243_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGAGCAACGGCGGAAGGUGGC | 1106 |
| R4244_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCUGGCAGCCUCCAGGCCUCC | 1107 |
| R4245_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGGUGCUGAUGGAGGAGACC | 1108 |
| R4246_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAAUCUGUAGCCUCUGGGUCU | 1109 |
| R4247_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUUCAAUCUGUAGCCUCUGGG | 1110 |
| R4248_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGUUCAAUCUGUAGCCUCUGG | 1111 |
| R4249_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAACAAACUGCCCACCGCCUG | 1112 |
| R4250_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAUGACAUAGCCCCGGCGGGC | 1113 |
| R4251_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUACAUAUCUUUUAUGACCUC | 1114 |
| R4252_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUAUGACCUCUUCCCUGGCUU | 1115 |
| R4253_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAUGACCUCUUCCCUGGCUUC | 1116 |
| R4254_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGACCUCUUCCCUGGCUUCU | 1117 |
| R4255_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACACCAAGAAGCCAGGGAAGAG | 1118 |
| R4256_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCUGGCUUCUUGGUGAAGAU | 1119 |
| R4257_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUUGGUGAAGAUGAGCAGUGA | 1120 |
| R4258_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGUGAAGAUGAGCAGUGACCU | 1121 |
| R4259_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCCCAUGUGGAGUACAUUGA | 1122 |
| R4260_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCUCAAUGUACUCCACAUGGG | 1123 |
| R4261_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAGGAAGACUCCUUUGUCUUC | 1124 |
| R4262_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGUCUUCGCCCAGAGCAUCCC | 1125 |
| R4263_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUCUUCGCCCAGAGCAUCCCA | 1126 |
| R4264_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGCCCAGAGCAUCCCAUGGAA | 1127 |
| R4265_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCAUGGGAUGCUCUGGGCGAA | 1128 |
| R4266_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGCUCCAGGUUCCAUGGGAUG | 1129 |
| R4267_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUCCCAGCAUGGCACCAGACA | 1130 |
| R4268_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCUCUGUCUGGUGCCAUGCUG | 1131 |
| R4269_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGAUACCAGCAUCCAGGGUGC | 1132 |
| R4270_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAGGGCAGGGUCACCAUCACC | 1133 |
| R4271_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAAGUCGGUGAUGGUGACCCU | 1134 |
| R4272_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAACAGCGUGCCGGAGGAGGA | 1135 |
| R4273_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGCCACACCAGCAUCCCGGCC | 1136 |
| R4274_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAGCACACGCAGGCUGUGCAG | 1137 |
| R4275_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACACAGUUGAGCACACGCAGGC | 1138 |
| R4276_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCUUGACAGUUGAGCACACG | 1139 |
| R4277_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGCUGACUCUUCCGAAUAAAC | 1140 |
| R4278_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAUUCGGAAGAGUCAGCUAAU | 1141 |
| R4279_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUUCGGAAGAGUCAGCUAAUC | 1142 |
| R4280_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGAAGAGUCAGCUAAUCCAG | 1143 |
| R4281_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGCUGCCCCUGGCCGGUGGG | 1144 |
| R4282_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAGGAUGCGGCUAUACCCACC | 1145 |
| R4283_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCAGCUGCUGCAACCAGCAC | 1146 |
| R4284_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCAGCAGCUGGGAACUUCCGG | 1147 |
| R4285_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCGGGACGACGCCUGCCUCUA | 1148 |
| R4286_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGUGGCCCCGACUGUGAUGAC | 1149 |
| R4287_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCUUGGGGACUUUGGGGACU | 1150 |
| R4288_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGUCCCCAAAGUCCCCAAGGU | 1151 |
| R4289_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGGACUUUGGGGACUAAUUU | 1152 |
| R4290_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGGGACUAAUUUUGGACGCU | 1153 |
| R4291_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGGACUAAUUUUGGACGCUG | 1154 |
| R4292_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGGACGCUGUGUGGAUCUCU | 1155 |
| R4293_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGACGCUGUGUGGAUCUCUU | 1156 |
| R4294_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGACGCUGUGUGGAUCUCUUU | 1157 |
| R4295_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCGGGGGCAAAGAGAUCCAC | 1158 |
| R4296_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGCCCCCGGGAAGGACAUCAU | 1159 |
| R4297_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCCCCGGGAAGGACAUCAUC | 1160 |
| R4298_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAUGUCACAGAGUGGGACCUC | 1161 |
| R4299_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGGCUCGGAUGCUGAGCCGG | 1162 |
| R4300_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCCUGGCCGAGCUGCGGCAG | 1163 |
| R4301_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGUAGAGAAGUGGAUCAGCCU | 1164 |
| R4302_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGUAGAGAAGUGGAUCAGCC | 1165 |
| R4303_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUCUACCAAAGACGUCAUCAA | 1166 |
| R4304_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAUGACGUCUUUGGUAGAGAA | 1167 |
| R4305_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCUGAGGACCAGCAGGUGCU | 1168 |
| R4306_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGGGUCAGCACCUGCUGGUC | 1169 |
| R4307_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGAGUGGGCCCCGAGUGUGCC | 1170 |
| R4308_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGGGGCACAGCGGGCUGUAG | 1171 |
| R4309_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUCCAGGAGCGGGAGGCGUCG | 1172 |
| R4310_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCAGACCUGCUGGCCUCCUAU | 1173 |
| R4311_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAGGGCCUUGCAGACCUGCUG | 1174 |
| R4312_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGGGGUGAGGGUGUCUAUGC | 1175 |
| R4313_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGGGUGAGGGUGUCUAUGCC | 1176 |
| R4314_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGCACGGGGAACCAGGCAGCA | 1177 |
| R4315_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCCGUGCCAACUGCAGCAUC | 1178 |
| R4316_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGGAUGCUGCAGUUGGCACG | 1179 |
| R4317_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGGUGGCAGUGGACAUGGGU | 1180 |
| R4318_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCACUUCCCAAUGGAAGCUGC | 1181 |
| R4319_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCAUUGGGAAGUGGAAGACCU | 1182 |
| R4320_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGGAAGUGGAAGACCUUAGUG | 1183 |
| R4321_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGUGUCCGGAGGCAGCCUGCG | 1184 |
| R4322_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGCCACCAGGCGGCCAGUGUC | 1185 |
| R4323_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCUGCUGCCAUGCCCCAGGGC | 1186 |
| R4324_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCAGCCCUGGGGCAUGGCAGC | 1187 |
| R4325_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCAUUCCAGCCCUGGGGCAUG | 1188 |
| R4326_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACGCAUUCCAGCCCUGGGGCAU | 1189 |
| R4327_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUGCAUUCCAGCCCUGGGGCA | 1190 |
| R4328_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACAUUUUGCAUUCCAGCCCUGG | 1191 |
| R4329_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCAUCCAGUCAGGGUCCAUCC | 1192 |
| R4330_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUCCACGCUGUAGGCUCCCAG | 1193 |
| R4331_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCCACACACAGGUUGUCCACG | 1194 |
| R4332_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACUCCACUGGUCCUGUCUGCUC | 1195 |
| R4333_CasPhi32 | GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGACCUGAAGGCCGGCUCCGGCAG | 1196 |
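Tables Q and R list the same guide names (R4238 to R4333) against the same mouse PCSK9 target, once with the CasΦ.12 prefix and once with the CasΦ.32 prefix. The Python sketch below illustrates that relationship for one row by exchanging the leading prefix and comparing the result with the Table R entry. It assumes that the prefix shared by all rows of each table is the repeat for that nuclease; the tables do not mark the repeat/spacer boundary, and the function name `swap_prefix` is hypothetical.

```python
# Prefixes common to the rows of Table Q and Table R, respectively.
# Assumption: these shared prefixes are the CasΦ.12 and CasΦ.32 repeats.
CASPHI12_PREFIX = "CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC"
CASPHI32_PREFIX = "GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAGAC"

# R4238 as listed in Table Q (SEQ ID NO: 1005) and Table R (SEQ ID NO: 1101);
# each literal joins the two wrapped lines of the sequence column.
TABLE_Q_R4238 = "CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAG" "ACCCGCUGUUGCCGCCGCUGCU"
TABLE_R_R4238 = "GCUGGGGACCGAUCCUGAUUGCUCGCUGCGGCGAG" "ACCCGCUGUUGCCGCCGCUGCU"


def swap_prefix(grna, old_prefix, new_prefix):
    """Replace the leading old_prefix of grna with new_prefix."""
    if not grna.startswith(old_prefix):
        raise ValueError("sequence does not start with the expected prefix")
    return new_prefix + grna[len(old_prefix):]


converted = swap_prefix(TABLE_Q_R4238, CASPHI12_PREFIX, CASPHI32_PREFIX)
same_as_table_r = converted == TABLE_R_R4238
print("matches Table R entry:", same_as_table_r)
```

The check is purely textual; it says nothing about how the guides were designed or how they perform.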
















TABLE S

CasΦ.12 gRNAs targeting Bak1 in CHO cells

| Name | Repeat + spacer RNA Sequence (5′→3′), shown as DNA | SEQ ID NO |
| --- | --- | --- |
| R2452 Bak1_CasPhi12_1 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGAAGCTATGTTTTCCATCTC | 1197 |
| R2453 Bak1_CasPhi12_2 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGCAGGGGCAGCCGCCCCCTG | 1198 |
| R2454 Bak1_CasPhi12_3 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCTCCTAGAACCCAACAGGTA | 1199 |
| R2455 Bak1_CasPhi12_4 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGAAAGACCTCCTCTGTGTCC | 1200 |
| R2456 Bak1_CasPhi12_5 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTCCATCTCGGGGTTGGCAGG | 1201 |
| R2457 Bak1_CasPhi12_6 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTTCCTGATGGTGGAGATGGA | 1202 |
| R2849_Bak1_CasPhi12_nsd_sg1 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCTGACTCCCAGCTCTGACCC | 1203 |
| R2850_Bak1_CasPhi12_nsd_sg2 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTGGGGTCAGAGCTGGGAGTC | 1204 |
| R2851_Bak1_CasPhi12_nsd_sg3 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGAAAGACCTCCTCTGTGTCC | 1205 |
| R2852_Bak1_CasPhi12_nsd_sg4 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCGAAGCTATGTTTTCCATCT | 1206 |
| R2853_Bak1_CasPhi12_nsd_sg5 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGAAGCTATGTTTTCCATCTC | 1207 |
| R2854_Bak1_CasPhi12_nsd_sg6 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTCCATCTCCACCATCAGGAA | 1208 |
| R2855_Bak1_CasPhi12_nsd_sg7 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCCATCTCCACCATCAGGAAC | 1209 |
| R2856_Bak1_CasPhi12_nsd_sg8 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCTGATGGTGGAGATGGAAAA | 1210 |
| R2857_Bak1_CasPhi12_nsd_sg9 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCATCTCCACCATCAGGAACA | 1211 |
| R2858_Bak1_CasPhi12_nsd_sg10 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTTCCTGATGGTGGAGATGGA | 1212 |
| R2859_Bak1_CasPhi12_nsd_sg11 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGCAGGGGCAGCCGCCCCCTG | 1213 |
| R2860_Bak1_CasPhi12_nsd_sg12 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTCCATCTCGGGGTTGGCAGG | 1214 |
| R2861_Bak1_CasPhi12_nsd_sg13 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTAGGAGCAAATTGTCCATCT | 1215 |
| R2862_Bak1_CasPhi12_nsd_sg14 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGGTTCTAGGAGCAAATTGTC | 1216 |
| R2863_Bak1_CasPhi12_nsd_sg15 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGCTCCTAGAACCCAACAGGT | 1217 |
| R2864_Bak1_CasPhi12_nsd_sg16 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCTCCTAGAACCCAACAGGTA | 1218 |
| R3977 Bak1_CasPhi12_exon1_sg1 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTCCAGACGCCATCTTTCAGG | 1219 |
| R3978 Bak1_CasPhi12_exon1_sg2 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTGGTAAGAGTCCTCCTGCCC | 1220 |
| R3979 Bak1_CasPhi12_exon3_sg1 | CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTTACAGCATCTTGGGTCAGG | 1221 |






R3980 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1222


CasPhi12_exon3_sg2
GAGACGGTCAGGTGGGCCGGCAGCT






R3981 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1223


CasPhi12_exon3_sg3
GAGACCTATCATTGGAGATGACATT






R3982 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1224


CasPhi12_exon3_sg4
GAGACGAGATGACATTAACCGGAGA






R3983 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1225


CasPhi12_exon3_sg5
GAGACTGGAACTCTGTGTCGTATCT






R3984 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1226


CasPhi12_exon3_sg6
GAGACCAGAATTTACTGGAGCAGCT






R3985 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1227


CasPhi12_exon3_sg7
GAGACACTGGAGCAGCTGCAGCCCA






R3986 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1228


CasPhi12_exon3_sg8
GAGACCCAGCTGTGGGCTGCAGCTG






R3987 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1229


CasPhi12_exon3_sg9
GAGACGTAGGCATTCCCAGCTGTGG






R3988 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1230


CasPhi12_exon3_sg10
GAGACGTGAAGAGTTCGTAGGCATT






R3989 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1231


CasPhi12_exon3_sg11
GAGACACCAAGATTGCCTCCAGGTA






R3990 Bak1_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1232


CasPhi12_exon3_sg12
GAGACCCTCCAGGTACCCACCACCA
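
The sequences in Table S are RNA sequences written in DNA notation, and each entry wraps across two lines. As a minimal sketch only (the helper names join_fragments and dna_to_rna are illustrative and do not appear in the disclosure), the two wrapped lines of an entry may be joined and rewritten in RNA notation by replacing T with U:

```python
def join_fragments(*fragments: str) -> str:
    """Join the wrapped lines of one table entry into a single sequence."""
    return "".join(fragment.strip() for fragment in fragments)


def dna_to_rna(seq: str) -> str:
    """Rewrite a sequence given in DNA notation as RNA (T -> U)."""
    return seq.upper().replace("T", "U")


# SEQ ID NO: 1197 (R2452, Bak1_CasPhi12_1), copied from the two lines in Table S.
guide_dna = join_fragments(
    "CTTTCAAGACTAATAGATTGCTCCTTACGAG",
    "GAGACGAAGCTATGTTTTCCATCTC",
)
guide_rna = dna_to_rna(guide_dna)

print(len(guide_dna), guide_rna)
```

The conversion changes notation only; the sequence order (5′→3′) is unchanged.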
















TABLE T







CasΦ.32 gRNAs targeting Bak1 in CHO cells










Name
Repeat + spacer RNA Sequence (5′→3′), shown as DNA
SEQ ID NO





R2452
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1233


Bak1_CasPhi32_1
CGAGACGAAGCTATGTTTTCCATCTC






R2453
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1234


Bak1_CasPhi32_2
CGAGACGCAGGGGCAGCCGCCCCCTG






R2454
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1235


Bak1_CasPhi32_3
CGAGACCTCCTAGAACCCAACAGGTA






R2455
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1236


Bak1_CasPhi32_4
CGAGACGAAAGACCTCCTCTGTGTCC






R2456
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1237


Bak1_CasPhi32_5
CGAGACTCCATCTCGGGGTTGGCAGG






R2457
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1238


Bak1_CasPhi32_6
CGAGACTTCCTGATGGTGGAGATGGA






R2849_Bak1_CasPhi32_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1239


nsd_sg1
CGAGACCTGACTCCCAGCTCTGACCC






R2850_Bak1_CasPhi32_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1240


nsd_sg2
CGAGACTGGGGTCAGAGCTGGGAGTC






R2851_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1241


CasPhi32_nsd_sg3
CGAGACGAAAGACCTCCTCTGTGTCC






R2852_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1242


CasPhi32_nsd_sg4
CGAGACCGAAGCTATGTTTTCCATCT






R2853_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1243


CasPhi32_nsd_sg5
CGAGACGAAGCTATGTTTTCCATCTC






R2854_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1244


CasPhi32_nsd_sg6
CGAGACTCCATCTCCACCATCAGGAA






R2855_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1245


CasPhi32_nsd_sg7
CGAGACCCATCTCCACCATCAGGAAC






R2856_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1246


CasPhi32_nsd_sg8
CGAGACCTGATGGTGGAGATGGAAAA






R2857_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1247


CasPhi32_nsd_sg9
CGAGACCATCTCCACCATCAGGAACA






R2858_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1248


CasPhi32_nsd_sg10
CGAGACTTCCTGATGGTGGAGATGGA






R2859_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1249


CasPhi32_nsd_sg11
CGAGACGCAGGGGCAGCCGCCCCCTG






R2860_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1250


CasPhi32_nsd_sg12
CGAGACTCCATCTCGGGGTTGGCAGG






R2861_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1251


CasPhi32_nsd_sg13
CGAGACTAGGAGCAAATTGTCCATCT






R2862_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1252


CasPhi32_nsd_sg14
CGAGACGGTTCTAGGAGCAAATTGTC






R2863_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1253


CasPhi32_nsd_sg15
CGAGACGCTCCTAGAACCCAACAGGT






R2864_Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1254


CasPhi32_nsd_sg16
CGAGACCTCCTAGAACCCAACAGGTA






R3977 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1255


CasPhi32_exon1_sg1
CGAGACTCCAGACGCCATCTTTCAGG






R3978 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1256


CasPhi32_exon1_sg2
CGAGACTGGTAAGAGTCCTCCTGCCC






R3979 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1257


CasPhi32_exon3_sg1
CGAGACTTACAGCATCTTGGGTCAGG






R3980 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1258


CasPhi32_exon3_sg2
CGAGACGGTCAGGTGGGCCGGCAGCT






R3981 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1259


CasPhi32_exon3_sg3
CGAGACCTATCATTGGAGATGACATT






R3982 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1260


CasPhi32_exon3_sg4
CGAGACGAGATGACATTAACCGGAGA






R3983 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1261


CasPhi32_exon3_sg5
CGAGACTGGAACTCTGTGTCGTATCT






R3984 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1262


CasPhi32_exon3_sg6
CGAGACCAGAATTTACTGGAGCAGCT






R3985 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1263


CasPhi32_exon3_sg7
CGAGACACTGGAGCAGCTGCAGCCCA






R3986 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1264


CasPhi32_exon3_sg8
CGAGACCCAGCTGTGGGCTGCAGCTG






R3987 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1265


CasPhi32_exon3_sg9
CGAGACGTAGGCATTCCCAGCTGTGG






R3988 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1266


CasPhi32_exon3_sg10
CGAGACGTGAAGAGTTCGTAGGCATT






R3989 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1267


CasPhi32_exon3_sg11
CGAGACACCAAGATTGCCTCCAGGTA






R3990 Bak1_
GCTGGGGACCGATCCTGATTGCTCGCTGCGG
1268


CasPhi32_exon3_sg12
CGAGACCCTCCAGGTACCCACCACCA
















TABLE U







CasΦ.12 gRNAs targeting Bax in CHO cells










Name
Repeat + spacer RNA Sequence (5′→3′), shown as DNA
SEQ ID NO





R2458
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1269


Bax_CasPhi12_1
GAGACCTAATGTGGATACTAACTCC






R2459
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1270


Bax_CasPhi12_2
GAGACTTCCGTGTGGCAGCTGACAT






R2460
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1271


Bax_CasPhi12_3
GAGACCTGATGGCAACTTCAACTGG






R2461
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1272


Bax_CasPhi12_4
GAGACTACTTTGCTAGCAAACTGGT






R2462
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1273


Bax_CasPhi12_5
GAGACAGCACCAGTTTGCTAGCAAA






R2463
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1274


Bax_CasPhi12_6
GAGACAACTGGGGCCGGGTTGTTGC






R2865_Bax_CasPhi12_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1275


nsd_sg1
GAGACTTCTCTTTCCTGTAGGATGA






R2866_Bax_CasPhi12_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1276


nsd_sg2
GAGACTCTTTCCTGTAGGATGATTG






R2867_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1277


CasPhi12_nsd_sg3
GAGACCCTGTAGGATGATTGCTAAT






R2868_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1278


CasPhi12_nsd_sg4
GAGACCTGTAGGATGATTGCTAATG






R2869_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1279


CasPhi12_nsd_sg5
GAGACCTAATGTGGATACTAACTCC






R2870_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1280


CasPhi12_nsd_sg6
GAGACTTCCGTGTGGCAGCTGACAT






R2871_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1281


CasPhi12_nsd_sg7
GAGACCGTGTGGCAGCTGACATGTT






R2872_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1282


CasPhi12_nsd_sg8
GAGACCCATCAGCAAACATGTCAGC






R2873_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1283


CasPhi12_nsd_sg9
GAGACAAGTTGCCATCAGCAAACAT






R2874_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1284


CasPhi12_nsd_sg10
GAGACGCTGATGGCAACTTCAACTG






R2875_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1285


CasPhi12_nsd_sg11
GAGACCTGATGGCAACTTCAACTGG






R2876_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1286


CasPhi12_nsd_sg12
GAGACAACTGGGGCCGGGTTGTTGC






R2877_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1287


CasPhi12_nsd_sg13
GAGACTTGCCCTTTTCTACTTTGCT






R2878_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1288


CasPhi12_nsd_sg14
GAGACCCCTTTTCTACTTTGCTAGC






R2879_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1289


CasPhi12_nsd_sg15
GAGACCTAGCAAAGTAGAAAAGGGC






R2880_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1290


CasPhi12_nsd_sg16
GAGACGCTAGCAAAGTAGAAAAGGG






R2881_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1291


CasPhi12_nsd_sg17
GAGACTCTACTTTGCTAGCAAACTG






R2882_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1292


CasPhi12_nsd_sg18
GAGACCTACTTTGCTAGCAAACTGG






R2883_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1293


CasPhi12_nsd_sg19
GAGACTACTTTGCTAGCAAACTGGT






R2884_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1294


CasPhi12_nsd_sg20
GAGACGCTAGCAAACTGGTGCTCAA






R2885_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1295


CasPhi12_nsd_sg21
GAGACCTAGCAAACTGGTGCTCAAG






R2886_Bax_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1296


CasPhi12_nsd_sg22
GAGACAGCACCAGTTTGCTAGCAAA
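
Some entries in Table U carry different names and SEQ ID NOs but the same sequence (for example, SEQ ID NOs: 1269 and 1279). The following sketch, offered only as an illustration, groups a handful of Table U entries by sequence to surface such pairs. The function name group_identical and the variable FIRST_LINE are hypothetical and are not part of the disclosure.

```python
from collections import defaultdict

# First wrapped line of every Table U entry (identical across the rows below).
FIRST_LINE = "CTTTCAAGACTAATAGATTGCTCCTTACGAG"

# A few Table U entries, keyed by SEQ ID NO; second wrapped line appended.
rows = {
    1269: FIRST_LINE + "GAGACCTAATGTGGATACTAACTCC",
    1270: FIRST_LINE + "GAGACTTCCGTGTGGCAGCTGACAT",
    1277: FIRST_LINE + "GAGACCCTGTAGGATGATTGCTAAT",
    1279: FIRST_LINE + "GAGACCTAATGTGGATACTAACTCC",
    1280: FIRST_LINE + "GAGACTTCCGTGTGGCAGCTGACAT",
}


def group_identical(entries: dict[int, str]) -> list[list[int]]:
    """Return sorted groups of SEQ ID NOs whose sequences are identical."""
    groups = defaultdict(list)
    for seq_id, seq in entries.items():
        groups[seq].append(seq_id)
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)


duplicates = group_identical(rows)
print(duplicates)
```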
















TABLE V







CasΦ.32 gRNAs targeting Bax in CHO cells










Name
Repeat + spacer RNA Sequence (5′→3′), shown as DNA
SEQ ID NO





R2458
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1297


Bax_CasPhi32_1
GCGAGACCTAATGTGGATACTAACTCC






R2459
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1298


Bax_CasPhi32_2
GCGAGACTTCCGTGTGGCAGCTGACAT






R2460
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1299


Bax_CasPhi32_3
GCGAGACCTGATGGCAACTTCAACTGG






R2461
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1300


Bax_CasPhi32_4
GCGAGACTACTTTGCTAGCAAACTGGT






R2462
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1301


Bax_CasPhi32_5
GCGAGACAGCACCAGTTTGCTAGCAAA






R2463
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1302


Bax_CasPhi32_6
GCGAGACAACTGGGGCCGGGTTGTTGC






R2865_Bax_CasPhi32_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1303


nsd_sg1
GCGAGACTTCTCTTTCCTGTAGGATGA






R2866_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1304


CasPhi32_nsd_sg2
GCGAGACTCTTTCCTGTAGGATGATTG






R2867_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1305


CasPhi32_nsd_sg3
GCGAGACCCTGTAGGATGATTGCTAAT






R2868_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1306


CasPhi32_nsd_sg4
GCGAGACCTGTAGGATGATTGCTAATG






R2869_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1307


CasPhi32_nsd_sg5
GCGAGACCTAATGTGGATACTAACTCC






R2870_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1308


CasPhi32_nsd_sg6
GCGAGACTTCCGTGTGGCAGCTGACAT






R2871_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1309


CasPhi32_nsd_sg7
GCGAGACCGTGTGGCAGCTGACATGTT






R2872_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1310


CasPhi32_nsd_sg8
GCGAGACCCATCAGCAAACATGTCAGC






R2873_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1311


CasPhi32_nsd_sg9
GCGAGACAAGTTGCCATCAGCAAACAT






R2874_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1312


CasPhi32_nsd_sg10
GCGAGACGCTGATGGCAACTTCAACTG






R2875_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1313


CasPhi32_nsd_sg11
GCGAGACCTGATGGCAACTTCAACTGG






R2876_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1314


CasPhi32_nsd_sg12
GCGAGACAACTGGGGCCGGGTTGTTGC






R2877_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1315


CasPhi32_nsd_sg13
GCGAGACTTGCCCTTTTCTACTTTGCT






R2878_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1316


CasPhi32_nsd_sg14
GCGAGACCCCTTTTCTACTTTGCTAGC






R2879_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1317


CasPhi32_nsd_sg15
GCGAGACCTAGCAAAGTAGAAAAGGGC






R2880_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1318


CasPhi32_nsd_sg16
GCGAGACGCTAGCAAAGTAGAAAAGGG






R2881_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1319


CasPhi32_nsd_sg17
GCGAGACTCTACTTTGCTAGCAAACTG






R2882_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1320


CasPhi32_nsd_sg18
GCGAGACCTACTTTGCTAGCAAACTGG






R2883_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1321


CasPhi32_nsd_sg19
GCGAGACTACTTTGCTAGCAAACTGGT






R2884_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1322


CasPhi32_nsd_sg20
GCGAGACGCTAGCAAACTGGTGCTCAA






R2885_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1323


CasPhi32_nsd_sg21
GCGAGACCTAGCAAACTGGTGCTCAAG






R2886_Bax_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1324


CasPhi32_nsd_sg22
GCGAGACAGCACCAGTTTGCTAGCAAA
















TABLE W







CasΦ.12 gRNAs targeting Fut8 in CHO cells










Name
Repeat + spacer RNA Sequence (5′→3′), shown as DNA
SEQ ID NO





R2464
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1325


Fut8_CasPhi12_1
GAGACCCACTTTGTCAGTGCGTCTG






R2465
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1326


Fut8_CasPhi12_2
GAGACCTCAATGGGATGGAAGGCTG






R2466
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1327


Fut8_CasPhi12_3
GAGACAGGAATACATGGTACACGTT






R2467
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1328


Fut8_CasPhi12_4
GAGACAAGAACATTTTCAGCTTCTC






R2468
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1329


Fut8_CasPhi12_5
GAGACATCCACTTTCATTCTGCGTT






R2469
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1330


Fut8_CasPhi12_6
GAGACTTTGTTAAAGGAGGCAAAGA






R2887_Fut8_CasPhi12_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1331


nsd_sg1
GAGACTCCCCAGAGTCCATGTCAGA






R2888_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1332


CasPhi12_nsd_sg2
GAGACTCAGTGCGTCTGACATGGAC






R2889_Fut8_CasPhi12_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1333


nsd_sg3
GAGACGTCAGTGCGTCTGACATGGA






R2890_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1334


CasPhi12_nsd_sg4
GAGACCCACTTTGTCAGTGCGTCTG






R2891_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1335


CasPhi12_nsd_sg5
GAGACTGTTCCCACTTTGTCAGTGC






R2892_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1336


CasPhi12_nsd_sg6
GAGACCTCAATGGGATGGAAGGCTG






R2893_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1337


CasPhi12_nsd_sg7
GAGACCATCCCATTGAGGAATACAT






R2894_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1338


CasPhi12_nsd_sg8
GAGACAGGAATACATGGTACACGTT






R2895_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1339


CasPhi12_nsd_sg9
GAGACAACGTGTACCATGTATTCCT






R2896_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1340


CasPhi12_nsd_sg10
GAGACTTCAACGTGTACCATGTATT






R2897_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1341


CasPhi12_nsd_sg11
GAGACAAGAACATTTTCAGCTTCTC






R2898_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1342


CasPhi12_nsd_sg12
GAGACGAGAAGCTGAAAATGTTCTT






R2899_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1343


CasPhi12_nsd_sg13
GAGACTCAGCTTCTCGAACGCAGAA






R2900_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1344


CasPhi12_nsd_sg14
GAGACCAGCTTCTCGAACGCAGAAT






R2901_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1345


CasPhi12_nsd_sg15
GAGACTGCGTTCGAGAAGCTGAAAA






R2902_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1346


CasPhi12_nsd_sg16
GAGACAGCTTCTCGAACGCAGAATG






R2903_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1347


CasPhi12_nsd_sg17
GAGACATTCTGCGTTCGAGAAGCTG






R2904_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1348


CasPhi12_nsd_sg18
GAGACCATTCTGCGTTCGAGAAGCT






R2905_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1349


CasPhi12_nsd_sg19
GAGACTCGAACGCAGAATGAAAGTG






R2906_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1350


CasPhi12_nsd_sg20
GAGACATCCACTTTCATTCTGCGTT






R2907_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1351


CasPhi12_nsd_sg21
GAGACTATCCACTTTCATTCTGCGT






R2908_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1352


CasPhi12_nsd_sg22
GAGACTTATCCACTTTCATTCTGCG






R2909_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1353


CasPhi12_nsd_sg23
GAGACTTTATCCACTTTCATTCTGC






R2910_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1354


CasPhi12_nsd_sg24
GAGACTTTTATCCACTTTCATTCTG






R2911_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1355


CasPhi12_nsd_sg25
GAGACAACAAAGAAGGGTCATCAGT






R2912_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1356


CasPhi12_nsd_sg26
GAGACCCTCCTTTAACAAAGAAGGG






R2913_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1357


CasPhi12_nsd_sg27
GAGACGCCTCCTTTAACAAAGAAGG






R2914_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1358


CasPhi12_nsd_sg28
GAGACTTTGTTAAAGGAGGCAAAGA






R2915_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1359


CasPhi12_nsd_sg29
GAGACGTTAAAGGAGGCAAAGACAA






R2916_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1360


CasPhi12_nsd_sg30
GAGACTTAAAGGAGGCAAAGACAAA






R2917_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1361


CasPhi12_nsd_sg31
GAGACTCTTTGCCTCCTTTAACAAA






R2918_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1362


CasPhi12_nsd_sg32
GAGACGTCTTTGCCTCCTTTAACAA






R2919_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1363


CasPhi12_nsd_sg33
GAGACGTCTAACTTACTTTGTCTTT






R2920_Fut8_
CTTTCAAGACTAATAGATTGCTCCTTACGAG
1364


CasPhi12_nsd_sg34
GAGACTTGGTCTAACTTACTTTGTC

















TABLE X







CasΦ.32 gRNAs targeting Fut8 in CHO cells










Name
Repeat + spacer RNA Sequence (5′→3′), shown as DNA
SEQ ID NO





R2464
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1365


Fut8_CasPhi32_1
GCGAGACCCACTTTGTCAGTGCGTCTG






R2465
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1366


Fut8_CasPhi32_2
GCGAGACCTCAATGGGATGGAAGGCTG






R2466
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1367


Fut8_CasPhi32_3
GCGAGACAGGAATACATGGTACACGTT






R2467
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1368


Fut8_CasPhi32_4
GCGAGACAAGAACATTTTCAGCTTCTC






R2468
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1369


Fut8_CasPhi32_5
GCGAGACATCCACTTTCATTCTGCGTT






R2469
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1370


Fut8_CasPhi32_6
GCGAGACTTTGTTAAAGGAGGCAAAGA






R2887_Fut8_CasPhi32_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1371


nsd_sg1
GCGAGACTCCCCAGAGTCCATGTCAGA






R2888_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1372


CasPhi32_nsd_sg2
GCGAGACTCAGTGCGTCTGACATGGAC






R2889_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1373


CasPhi32_nsd_sg3
GCGAGACGTCAGTGCGTCTGACATGGA






R2890_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1374


CasPhi32_nsd_sg4
GCGAGACCCACTTTGTCAGTGCGTCTG






R2891_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1375


CasPhi32_nsd_sg5
GCGAGACTGTTCCCACTTTGTCAGTGC






R2892_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1376


CasPhi32_nsd_sg6
GCGAGACCTCAATGGGATGGAAGGCTG






R2893_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1377


CasPhi32_nsd_sg7
GCGAGACCATCCCATTGAGGAATACAT






R2894_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1378


CasPhi32_nsd_sg8
GCGAGACAGGAATACATGGTACACGTT






R2895_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1379


CasPhi32_nsd_sg9
GCGAGACAACGTGTACCATGTATTCCT






R2896_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1380


CasPhi32_nsd_sg10
GCGAGACTTCAACGTGTACCATGTATT






R2897_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1381


CasPhi32_nsd_sg11
GCGAGACAAGAACATTTTCAGCTTCTC






R2898_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1382


CasPhi32_nsd_sg12
GCGAGACGAGAAGCTGAAAATGTTCTT






R2899_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1383


CasPhi32_nsd_sg13
GCGAGACTCAGCTTCTCGAACGCAGAA






R2900_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1384


CasPhi32_nsd_sg14
GCGAGACCAGCTTCTCGAACGCAGAAT






R2901_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1385


CasPhi32_nsd_sg15
GCGAGACTGCGTTCGAGAAGCTGAAAA






R2902_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1386


CasPhi32_nsd_sg16
GCGAGACAGCTTCTCGAACGCAGAATG






R2903_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1387


CasPhi32_nsd_sg17
GCGAGACATTCTGCGTTCGAGAAGCTG






R2904_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1388


CasPhi32_nsd_sg18
GCGAGACCATTCTGCGTTCGAGAAGCT






R2905_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1389


CasPhi32_nsd_sg19
GCGAGACTCGAACGCAGAATGAAAGTG






R2906_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1390


CasPhi32_nsd_sg20
GCGAGACATCCACTTTCATTCTGCGTT







R2907_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1391


CasPhi32_nsd_sg21
GCGAGACTATCCACTTTCATTCTGCGT






R2908_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1392


CasPhi32_nsd_sg22
GCGAGACTTATCCACTTTCATTCTGCG






R2909_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1393


CasPhi32_nsd_sg23
GCGAGACTTTATCCACTTTCATTCTGC






R2910_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1394


CasPhi32_nsd_sg24
GCGAGACTTTTATCCACTTTCATTCTG






R2911_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1395


CasPhi32_nsd_sg25
GCGAGACAACAAAGAAGGGTCATCAGT






R2912_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1396


CasPhi32_nsd_sg26
GCGAGACCCTCCTTTAACAAAGAAGGG






R2913_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1397


CasPhi32_nsd_sg27
GCGAGACGCCTCCTTTAACAAAGAAGG






R2914_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1398


CasPhi32_nsd_sg28
GCGAGACTTTGTTAAAGGAGGCAAAGA






R2915_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1399


CasPhi32_nsd_sg29
GCGAGACGTTAAAGGAGGCAAAGACAA






R2916_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1400


CasPhi32_nsd_sg30
GCGAGACTTAAAGGAGGCAAAGACAAA






R2917_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1401


CasPhi32_nsd_sg31
GCGAGACTCTTTGCCTCCTTTAACAAA






R2918_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1402


CasPhi32_nsd_sg32
GCGAGACGTCTTTGCCTCCTTTAACAA






R2919_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1403


CasPhi32_nsd_sg33
GCGAGACGTCTAACTTACTTTGTCTTT






R2920_Fut8_
GCTGGGGACCGATCCTGATTGCTCGCTGCG
1404


CasPhi32_nsd_sg34
GCGAGACTTGGTCTAACTTACTTTGTC
















TABLE Y







CasΦ.12 gRNAs targeting human TRAC in T cells










Name
Repeat + spacer RNA Sequence (5′→3′), shown as DNA
SEQ ID NO





R3040_CasPhi12_S
ATTGCTCCTTACGAGGAGACTGGATATCTGT
1533



GGGACA






R3041_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCCCACAGATA
1534



TCCAGA






R3042_CasPhi12_S
ATTGCTCCTTACGAGGAGACGAGTCTCTCAG
1535



CTGGTA






R3043_CasPhi12_S
ATTGCTCCTTACGAGGAGACAGAGTCTCTCA
1536



GCTGGT






R3044_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCACTGGATTT
1537



AGAGTC






R3045_CasPhi12_S
ATTGCTCCTTACGAGGAGACAGAATCAAAAT
1538



CGGTGA






R3046_CasPhi12_S
ATTGCTCCTTACGAGGAGACGAGAATCAAAA
1539



TCGGTG






R3047_CasPhi12_S
ATTGCTCCTTACGAGGAGACACCGATTTTGA
1540



TTCTCA






R3048_CasPhi12_S
ATTGCTCCTTACGAGGAGACTTTGAGAATCA
1541



AAATCG






R3049_CasPhi12_S
ATTGCTCCTTACGAGGAGACGTTTGAGAATC
1542



AAAATC






R3050_CasPhi12_S
ATTGCTCCTTACGAGGAGACTGATTCTCAAA
1543



CAAATG






R3051_CasPhi12_S
ATTGCTCCTTACGAGGAGACGATTCTCAAAC
1544



AAATGT






R3052_CasPhi12_S
ATTGCTCCTTACGAGGAGACATTCTCAAACA
1545



AATGTG






R3053_CasPhi12_S
ATTGCTCCTTACGAGGAGACTGACACATTTG
1546



TTTGAG






R3054_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCAAACAAATG
1547



TGTCAC






R3055_CasPhi12_S
ATTGCTCCTTACGAGGAGACGTGACACATTT
1548



GTTTGA






R3056_CasPhi12_S
ATTGCTCCTTACGAGGAGACCTTTGTGACAC
1549



ATTTGT






R3057_CasPhi12_S
ATTGCTCCTTACGAGGAGACTGATGTGTATA
1550



TCACAG






R3058_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCTGTGATATA
1551



CACATC






R3059_CasPhi12_S
ATTGCTCCTTACGAGGAGACGTCTGTGATAT
1552



ACACAT






R3060_CasPhi12_S
ATTGCTCCTTACGAGGAGACTGTCTGTGATA
1553



TACACA






R3061_CasPhi12_S
ATTGCTCCTTACGAGGAGACAAGTCCATAGA
1554



CCTCAT






R3062_CasPhi12_S
ATTGCTCCTTACGAGGAGACCTCTTGAAGTC
1555



CATAGA






R3063_CasPhi12_S
ATTGCTCCTTACGAGGAGACAAGAGCAACAG
1556



TGCTGT






R3064_CasPhi12_S
ATTGCTCCTTACGAGGAGACCTCCAGGCCAC
1557



AGCACT






R3065_CasPhi12_S
ATTGCTCCTTACGAGGAGACTTGCTCCAGGC
1558



CACAGC






R3066_CasPhi12_S
ATTGCTCCTTACGAGGAGACGTTGCTCCAGG
1559



CCACAG






R3067_CasPhi12_S
ATTGCTCCTTACGAGGAGACCACATGCAAAG
1560



TCAGAT






R3068_CasPhi12_S
ATTGCTCCTTACGAGGAGACGCACATGCAAA
1561



GTCAGA






R3069_CasPhi12_S
ATTGCTCCTTACGAGGAGACGCATGTGCAAA
1562



CGCCTT






R3070_CasPhi12_S
ATTGCTCCTTACGAGGAGACAAGGCGTTTGC
1563



ACATGC






R3071_CasPhi12_S
ATTGCTCCTTACGAGGAGACCATGTGCAAAC
1564



GCCTTC






R3072_CasPhi12_S
ATTGCTCCTTACGAGGAGACTTGAAGGCGTT
1565



TGCACA






R3073_CasPhi12_S
ATTGCTCCTTACGAGGAGACAACAACAGCAT
1566



TATTCC






R3074_CasPhi12_S
ATTGCTCCTTACGAGGAGACTGGAATAATGC
1567



TGTTGT






R3075_CasPhi12_S
ATTGCTCCTTACGAGGAGACTTCCAGAAGAC
1568



ACCTTC






R3076_CasPhi12_S
ATTGCTCCTTACGAGGAGACCAGAAGACACC
1569



TTCTTC






R3077_CasPhi12_S
ATTGCTCCTTACGAGGAGACCCTGGGCTGGG
1570



GAAGAA






R3078_CasPhi12_S
ATTGCTCCTTACGAGGAGACTTCCCCAGCCC
1571



AGGTAA






R3079_CasPhi12_S
ATTGCTCCTTACGAGGAGACCCCAGCCCAGG
1572



TAAGGG






R3080_CasPhi12_S
ATTGCTCCTTACGAGGAGACTAAAAGGAAAA
1573



ACAGAC






R3081_CasPhi12_S
ATTGCTCCTTACGAGGAGACCTAAAAGGAAA
1574



AACAGA






R3082_CasPhi12_S
ATTGCTCCTTACGAGGAGACTTCCTTTTAGAA
1575



AGTTC






R3083_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCCTTTTAGAA
1576



AGTTCC






R3084_CasPhi12_S
ATTGCTCCTTACGAGGAGACCCTTTTAGAAA
1577



GTTCCT






R3085_CasPhi12_S
ATTGCTCCTTACGAGGAGACCTTTTAGAAAG
1578



TTCCTG






R3086_CasPhi12_S
ATTGCTCCTTACGAGGAGACTAGAAAGTTCC
1579



TGTGAT






R3136_CasPhi12_S
ATTGCTCCTTACGAGGAGACAGAAAGTTCCT
1580



GTGATG






R3137_CasPhi12_S
ATTGCTCCTTACGAGGAGACGAAAGTTCCTG
1581



TGATGT






R3138_CasPhi12_S
ATTGCTCCTTACGAGGAGACACATCACAGGA
1582



ACTTTC






R3139_CasPhi12_S
ATTGCTCCTTACGAGGAGACCTGTGATGTCA
1583



AGCTGG






R3140_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCGACCAGCTT
1584



GACATC






R3141_CasPhi12_S
ATTGCTCCTTACGAGGAGACCTCGACCAGCT
1585



TGACAT






R3142_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCTCGACCAGC
1586



TTGACA






R3143_CasPhi12_S
ATTGCTCCTTACGAGGAGACAAAGCTTTTCT
1587



CGACCA






R3144_CasPhi12_S
ATTGCTCCTTACGAGGAGACCAAAGCTTTTC
1588



TCGACC






R3145_CasPhi12_S
ATTGCTCCTTACGAGGAGACCCTGTTTCAAA
1589



GCTTTT






R3146_CasPhi12_S
ATTGCTCCTTACGAGGAGACGAAACAGGTAA
1590



GACAGG






R3147_CasPhi12_S
ATTGCTCCTTACGAGGAGACAAACAGGTAAG
1591



ACAGGG
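
Each entry in Table Y begins with the same nucleotides, followed by a portion that differs between entries. The following sketch, offered only as an illustration, computes the longest prefix shared by the first four entries of Table Y and treats the remainder of each entry as its variable 3′ portion. Treating the shared prefix as the repeat is an assumption: a prefix computed this way would extend into the spacer if the listed spacers happened to begin with the same base, so the boundary should be confirmed against the repeat sequences defined elsewhere in the disclosure. The function name shared_prefix is hypothetical.

```python
def shared_prefix(seqs: list[str]) -> str:
    """Return the longest prefix common to every sequence in seqs."""
    prefix = seqs[0]
    for seq in seqs[1:]:
        # Shorten the candidate until the current sequence starts with it.
        while not seq.startswith(prefix):
            prefix = prefix[:-1]
    return prefix


# First four Table Y entries (SEQ ID NO -> wrapped lines joined).
entries = {
    1533: "ATTGCTCCTTACGAGGAGACTGGATATCTGT" + "GGGACA",
    1534: "ATTGCTCCTTACGAGGAGACTCCCACAGATA" + "TCCAGA",
    1535: "ATTGCTCCTTACGAGGAGACGAGTCTCTCAG" + "CTGGTA",
    1536: "ATTGCTCCTTACGAGGAGACAGAGTCTCTCA" + "GCTGGT",
}

prefix = shared_prefix(list(entries.values()))
# Variable 3' portion of each entry: whatever follows the shared prefix.
variable = {seq_id: seq[len(prefix):] for seq_id, seq in entries.items()}

print(prefix, variable)
```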
















TABLE Z







CasΦ.12 gRNAs targeting human B2M in T cells










Name
Repeat + spacer RNA Sequence (5′→3′), shown as DNA
SEQ ID NO





R3115_CasPhi12_S
ATTGCTCCTTACGAGGAGACCATCCATCCGA
1592



CATTGA






R3116_CasPhi12_S
ATTGCTCCTTACGAGGAGACATCCATCCGAC
1593



ATTGAA






R3117_CasPhi12_S
ATTGCTCCTTACGAGGAGACAGTAAGTCAAC
1594



TTCAAT






R3118_CasPhi12_S
ATTGCTCCTTACGAGGAGACTTCAGTAAGTC
1595



AACTTC






R3119_CasPhi12_S
ATTGCTCCTTACGAGGAGACAAGTTGACTTA
1596



CTGAAG






R3120_CasPhi12_S
ATTGCTCCTTACGAGGAGACACTTACTGAAG
1597



AATGGA






R3121_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCTCTCCATTCT
1598



TCAGT






R3122_CasPhi12_S
ATTGCTCCTTACGAGGAGACCTGAAGAATGG
1599



AGAGAG






R3123_CasPhi12_S
ATTGCTCCTTACGAGGAGACAATTCTCTCTCC
1600



ATTCT






R3124_CasPhi12_S
ATTGCTCCTTACGAGGAGACCAATTCTCTCTC
1601



CATTC






R3125_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCAATTCTCTCT
1602



CCATT






R3126_CasPhi12_S
ATTGCTCCTTACGAGGAGACTTCAATTCTCTC
1603



TCCAT






R3127_CasPhi12_S
ATTGCTCCTTACGAGGAGACAAAAAGTGGAG
1604



CATTCA






R3128_CasPhi12_S
ATTGCTCCTTACGAGGAGACCTGAAAGACAA
1605



GTCTGA






R3129_CasPhi12_S
ATTGCTCCTTACGAGGAGACAGACTTGTCTTT
1606



CAGCA






R3130_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCTTTCAGCAA
1607



GGACTG






R3131_CasPhi12_S
ATTGCTCCTTACGAGGAGACCAGCAAGGACT
1608



GGTCTT






R3132_CasPhi12_S
ATTGCTCCTTACGAGGAGACAGCAAGGACTG
1609



GTCTTT






R3133_CasPhi12_S
ATTGCTCCTTACGAGGAGACCTATCTCTTGTA
1610



CTACA






R3134_CasPhi12_S
ATTGCTCCTTACGAGGAGACTATCTCTTGTAC
1611



TACAC






R3135_CasPhi12_S
ATTGCTCCTTACGAGGAGACAGTGTAGTACA
1612



AGAGAT






R3148_CasPhi12_S
ATTGCTCCTTACGAGGAGACTACTACACTGA
1613



ATTCAC






R3149_CasPhi12_S
ATTGCTCCTTACGAGGAGACAGTGGGGGTGA
1614



ATTCAG






R3150_CasPhi12_S
ATTGCTCCTTACGAGGAGACCAGTGGGGGTG
1615



AATTCA






R3151_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCAGTGGGGGT
1616



GAATTC






R3152_CasPhi12_S
ATTGCTCCTTACGAGGAGACTTCAGTGGGGG
1617



TGAATT






R3153_CasPhi12_S
ATTGCTCCTTACGAGGAGACACCCCCACTGA
1618



AAAAGA






R3154_CasPhi12_S
ATTGCTCCTTACGAGGAGACACACGGCAGGC
1619



ATACTC






R3155_CasPhi12_S
ATTGCTCCTTACGAGGAGACGGCTGTGACAA
1620



AGTCAC






R3156_CasPhi12_S
ATTGCTCCTTACGAGGAGACGTCACAGCCCA
1621



AGATAG






R3157_CasPhi12_S
ATTGCTCCTTACGAGGAGACTCACAGCCCAA
1622



GATAGT






R3158_CasPhi12_S
ATTGCTCCTTACGAGGAGACACTATCTTGGG
1623



CTGTGA






R3159_CasPhi12_S
ATTGCTCCTTACGAGGAGACCCCCACTTAAC
1624



TATCTT
















TABLE AA







CasΦ.12 gRNAs targeting human PD1 in T cells









Name
Repeat + spacer RNA Sequence (5′→3′)
SEQ ID NO





R2921_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUUCCGC
1625



UCACCUCCG






R2922_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUUCCGC
1626



UCACCUCCG






R2923_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGCUCACC
1627



UCCGCCUGA






R2924_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCACUGC
1628



UCAGGCGGA






R2925_CasPhi12_S
AUUGCUCCUUACGAGGAGACUAGCACCG
1629



CCCAGACGA






R2926_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGCAUGC
1630



AGAUCCCAC






R2927_CasPhi12_S
AUUGCUCCUUACGAGGAGACCACAGGCG
1631



CCCUGGCCA






R2928_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUGGGCG
1632



GUGCUACAA






R2929_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCAUGCCU
1633



GGAGCAGCC






R2930_CasPhi12_S
AUUGCUCCUUACGAGGAGACUAGCACCG
1634



CCCAGACGA






R2931_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGCCGCC
1635



AGCCCAGUU






R2932_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUUCCGCU
1636



CACCUCCGC






R2933_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGGGCCU
1637



GUCUGGGGA






R2934_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCCCAGC
1638



CCUGCUCGU






R2935_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGUCACCA
1639



CGAGCAGGG






R2936_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCCCUUC
1640



GGUCACCAC






R2937_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAGAAGCU
1641



GCAGGUGAA






R2938_CasPhi12_S
AUUGCUCCUUACGAGGAGACACCUGCAG
1642



CUUCUCCAA






R2939_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCAACAC
1643



AUCGGAGAG






R2940_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCACGAAG
1644



CUCUCCGAU






R2941_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGCACGAA
1645



GCUCUCCGA






R2942_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUGCUAAA
1646



CUGGUACCG






R2943_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGGGGCU
1647



CAUGCGGUA






R2944_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCGUCUG
1648



GUUGCUGGG






R2945_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCGAGGA
1649



CCGCAGCCA






R2946_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGUGACAC
1650



GGAAGCGGC






R2947_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGUGUCAC
1651



ACAACUGCC






R2948_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGCAGUUG
1652



UGUGACACG






R2949_CasPhi12_S
AUUGCUCCUUACGAGGAGACCACAUGAG
1653



CGUGGUCAG






R2950_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGCCGGGC
1654



CCUGACCAC






R2951_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGGCCAG
1655



GGAGAUGGC






R2952_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUCUGCGC
1656



CUUGGGGGC






R2953_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAUCUGCG
1657



CCUUGGGGG






R2954_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCAGACAG
1658



GCCCUGGAA






R2955_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCAGCCCU
1659



GCUCGUGGU






R2956_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUCUGGA
1660



AGGGCACAA






R2957_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUGCCCUU
1661



CCAGAGAGA






R2958_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCCCUUC
1662



CAGAGAGAA






R2959_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCCCUUC
1663



UCUCUGGAA






R2960_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGAGAGA
1664



AGGGCAGAA






R2961_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAACUGGC
1665



CGGCUGGCC






R2962_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGAACUGG
1666



CCGGCUGGC






R2963_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAAACCCU
1667



GGUGGUUGG






R2964_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUGUCGUG
1668



GGCGGCCUG






R2965_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUCGUGC
1669



GGCCCGGGA






R2966_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCCUGCA
1670



GAGAAACAC






R2967_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCUGCAG
1671



GGACAAUAG






R2968_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUGCAGG
1672



GACAAUAGG






R2969_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCCUCAA
1673



AGAAGGAGG






R2970_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCUCAAA
1674



GAAGGAGGA






R2971_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUGUGGA
1675



CUAUGGGGA






R2972_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUCGCCA
1676



CUGGAAAUC






R2973_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCAGUGGC
1677



GAGAGAAGA






R2974_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGUGGCG
1678



AGAGAAGAC






R2975_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGCUAGGA
1679



AAGACAAUG






R2976_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUUUCCU
1680



AGCGGAAUG






R2977_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUAGCGG
1681



AAUGGGCAC






R2978_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUAGCGGA
1682



AUGGGCACC






R2979_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCCCUCU
1683



GACCGGCUU






R2980_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUUGGCCA
1684



CCAGUGUUC






R2981_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCACCAG
1685



UGUUCUGCA






R2982_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCAGACC
1686



CUCCACCAU






R2983_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCUGAGG
1687



AAAUGCGCU






R2984_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUCAGGA
1688



GAAGCAGGC






R2985_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCAGGAG
1689



AAGCAGGCA






R2986_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGGCCGU
1690



CCAGGGGCU






R2987_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGACAUGA
1691



GUCCUGUGG






R2988_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGUCCUG
1692



CCAGCACAG






R2989_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGGAGCU
1693



GGACGCAGG






R2990_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGCCCCGG
1694



GCCGCAGGC






R2991_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGCAGGA
1695



GGCUCCGGG






R2992_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGGCUGG
1696



UUGGAGAUG






R2993_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAGAUGGC
1697



CUUGGAGCA






R2994_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCUGCUCC
1698



AAGGCCAUC






R2995_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAGCAGCC
1699



AAGGUGCCC






R2996_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGAUGCC
1700



ACUGCCAGG






R2997_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGGGAUGC
1701



CACUGCCAG






R2998_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGCCCUGC
1702



GUCCAGGGC






R2999_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUGCUCC
1703



CUGCAGGCC






R3000_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUAGGCC
1704



UGCAGGGAG






R3001_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUGAAAC
1705



UUCUCUAGG






R3002_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGACCUUC
1706



CCUGAAACU






R3003_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGGGAAG
1707



GUCAGAAGA






R3004_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGGAAGG
1708



UCAGAAGAG






R3005_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGCCCUG
1709



CCCACCACA






R3006_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUGCCCU
1710



GCCCACCAC






R3007_CasPhi12_S
AUUGCUCCUUACGAGGAGACACACAUGC
1711



CCAGGCAGC






R3008_CasPhi12_S
AUUGCUCCUUACGAGGAGACCACAUGCC
1712



CAGGCAGCA






R3009_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUGCCCC
1713



ACAAAGGGC






R3010_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUGGGGCA
1714



GGGAAGCUG






R3011_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGGGCAG
1715



GGAAGCUGA






R3012_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGCCUCA
1716



GCUUCCCUG






R3013_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGGCCCA
1717



GCCAGCACU






R3014_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGCCCAG
1718



CCAGCACUC






R3015_CasPhi12_S
AUUGCUCCUUACGAGGAGACCACCCCAG
1719



CCCCUCACA






R3016_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGACCGUA
1720



GGAUGUCCC
















TABLE AB







Shortened CasΦ.12 gRNAs targeting human CIITA











Name
Repeat + spacer RNA Sequence (5′→3′)
SEQ ID NO





R4503_CasPhi12_
AUUGCUCCUUACGAGGAGACCUACACAA
1721


C2TA_T1.1_S
UGCGUUGCC






R4504_CasPhi12_
AUUGCUCCUUACGAGGAGACGGGCUCUG
1722


C2TA_T1.2_S
ACAGGUAGG






R4505_CasPhi12_
AUUGCUCCUUACGAGGAGACUGUAGGAA
1723


C2TA_T1.3_S
UCCCAGCCA






R4506_CasPhi12_
AUUGCUCCUUACGAGGAGACCCUGGCUC
1724


C2TA_T1.8_S
CACGCCCUG






R4507_CasPhi12_
AUUGCUCCUUACGAGGAGACGGGAAGCU
1725


C2TA_T1.9_S
GAGGGCACG






R4508_CasPhi12_
AUUGCUCCUUACGAGGAGACACAGCGAU
1726


C2TA_T2.1_S
GCUGACCCC






R4509_CasPhi12_
AUUGCUCCUUACGAGGAGACUUAACAGC
1727


C2TA_T2.2_S
GAUGCUGAC






R4510_CasPhi12_
AUUGCUCCUUACGAGGAGACUAUGACCA
1728


C2TA_T2.3_S
GAUGGACCU






R4511_CasPhi12_
AUUGCUCCUUACGAGGAGACGGGCCCCU
1729


C2TA_T2.4_S
AGAAGGUGG






R4512_CasPhi12_
AUUGCUCCUUACGAGGAGACUAGGGGCC
1730


C2TA_T2.5_S
CCAACUCCA






R4513_CasPhi12_
AUUGCUCCUUACGAGGAGACAGAAGCUC
1731


C2TA_T2.6_S
CAGGUAGCC






R4514_CasPhi12_
AUUGCUCCUUACGAGGAGACUCCAGCCA
1732


C2TA_T2.7_S
GGUCCAUCU






R4515_CasPhi12_
AUUGCUCCUUACGAGGAGACUUCUCCAG
1733


C2TA_T2.8_S
CCAGGUCCA






R5200_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGCAGGCU
2290



GUUGUGUGA






R5201_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAUGUCAC
2291



ACAACAGCC






R5202_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGUGACAU
2292



GGAAGGUGA






R5203_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUCACCUU
2293



CCAUGUCAC






R5204_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCAUAAGC
2294



CUCCCUGGU






R5205_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGGACUC
2295



CCAGCUGGA






R5206_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCAGGCC
2296



CUCCAGCUG






R5207_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCUGGCA
2297



UCUCCAUAC






R5208_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCCCAAC
2298



UUCUGCUGG






R5209_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGCCCAA
2299



CUUCUGCUG






R5210_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUGCCCA
2300



ACUUCUGCU






R5211_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGACUUUU
2301



CUGCCCAAC






R5212_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGACUUU
2302



UCUGCCCAA






R5213_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUGACUU
2303



UUCUGCCCA






R5214_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCAGAGGA
2304



GCUUCCGGC






R5215_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGUCUGC
2305



CGGAAGCUC






R5216_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGGCAGAC
2306



CUGAAGCAC






R5217_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGUGCUU
2307



CAGGUCUGC






R5218_CasPhi12_S
AUUGCUCCUUACGAGGAGACAACAGCGC
2308



AGGCAGUGG






R5219_CasPhi12_S
AUUGCUCCUUACGAGGAGACAACCAGGA
2309



GCCAGCCUC






R5220_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCAGGCG
2310



CAUCUGGCC






R5221_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCCAGGC
2311



GCAUCUGGC






R5222_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUCCAGG
2312



CGCAUCUGG






R5223_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCCAGUU
2313



CCUCGUUGA






R5224_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCAGUUC
2314



CUCGUUGAG






R5225_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGCAGCU
2315



CAACGAGGA






R5226_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCGUUGA
2316



GCUGCCUGA






R5227_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGCUGCCU
2317



GAAUCUCCC






R5228_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUCCCCAC
2318



CAUCUCCAC






R5229_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCCCACC
2319



AUCUCCACU






R5230_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCAGAGCC
2320



CAUGGGGCA






R5231_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCAGAGC
2321



CCAUGGGGC






R5232_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGCCUCA
2322



GAGAUUUGC






R5233_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGAGGCCG
2323



UGGACAGUG






R5234_CasPhi12_S
AUUGCUCCUUACGAGGAGACACUGUCCA
2324



CGGCCUCCC






R5235_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCUCCAUC
2325



AGCCACUGA






R5236_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGCAUGC
2326



UGGGCAGGU






R5237_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCGGGAG
2327



GUCAGGGCA






R5238_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCUCGGGA
2328



GGUCAGGGC






R5239_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAGACCUC
2329



UCCAGCUGC






R5240_CasPhi12_S
AUUGCUCCUUACGAGGAGACUUGGAGAC
2330



CUCUCCAGC






R5241_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAAGCUUG
2331



UUGGAGACC






R5242_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGAAGCUU
2332



GUUGGAGAC






R5243_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGAAGCU
2333



UGUUGGAGA






R5244_CasPhi12_S
AUUGCUCCUUACGAGGAGACUACCGCUC
2334



ACUGCAGGA






R5245_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGCUGCU
2335



CCUCUCCAG






R5246_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCGCUCCA
2336



GGCUCUUGC






R5247_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCCCAGU
2337



CCGGGGUGG






R5248_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGCCAGCU
2338



GCCGUUCUG






R5249_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCAGCCAA
2339



CAGCACCUC






R5250_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCUGCCAA
2340



GGAGCACCG






R5251_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCAGCAC
2341



AGCAAUCAC






R5252_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCCAGCA
2342



CAGCAAUCA






R5253_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGUGCUG
2343



GGCAAAGCU






R5254_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCUGACC
2344



AGCUUUGCC






R5255_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGCUGGGG
2345



CAGUGAGCC






R5256_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGCCGGC
2346



UUCCCCAGU






R5257_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCAGUAC
2347



GACUUUGUC






R5258_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUCUUCUC
2348



UGUCCCCUG






R5259_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUUCUCU
2349



GUCCCCUGC






R5260_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUGUCCC
2350



CUGCCAUUG






R5261_CasPhi12_S
AUUGCUCCUUACGAGGAGACAAGCAAUG
2351



GCAGGGGAC






R5262_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUUGAACC
2352



GUCCGGGGG






R5263_CasPhi12_S
AUUGCUCCUUACGAGGAGACAACCGUCC
2353



GGGGGAUGC






R5264_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCCUGGG
2354



CCCACAGCC






R5265_CasPhi12_S
AUUGCUCCUUACGAGGAGACAAGAUGUG
2355



GCUGAAAAC






R5266_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCAGCCAC
2356



AUCUUGAAG






R5267_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGCCACA
2357



UCUUGAAGA






R5268_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGCCACAU
2358



CUUGAAGAG






R5269_CasPhi12_S
AUUGCUCCUUACGAGGAGACAAGAGACC
2359



UGACCGCGU






R5270_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCUCAUC
2360



CUAGACGGC






R5271_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGCUCCU
2361



CGAAGCCGU






R5272_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGCUUCCA
2362



GCUCCUCGA






R5273_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAGGAGCU
2363



GGAAGCGCA






R5274_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGCACAG
2364



CACGUGCGG






R5275_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGAAAAG
2365



GCCGGCCAG






R5276_CasPhi12_S
AUUGCUCCUUACGAGGAGACUUCUGGAA
2366



AAGGCCGGC






R5277_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCAGAAG
2367



AAGCUGCUC






R5278_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCAGAAGA
2368



AGCUGCUCC






R5279_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGAAGAA
2369



GCUGCUCCG






R5280_CasPhi12_S
AUUGCUCCUUACGAGGAGACCACCCUCC
2370



UCCUCACAG






R5281_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCAGGCU
2371



CUGGACCAG






R5282_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAGCUGUC
2372



CGGCUUCUC






R5283_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGCUGUCC
2373



GGCUUCUCC






R5284_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCAUGGA
2374



GCAGGCCCA






R5285_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAGAGCUC
2375



AGGGAUGAC






R5286_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGAGCUCA
2376



GGGAUGACA






R5287_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUGCUCUG
2377



UCAUCCCUG






R5288_CasPhi12_S
AUUGCUCCUUACGAGGAGACUUCUCAGU
2378



CACAGCCAC






R5289_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCAGUCAC
2379



AGCCACAGC






R5290_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUGCCGGG
2380



CAGUGUGCC






R5291_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCCGGGC
2381



AGUGUGCCA






R5292_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCGUCCUC
2382



CCCAAGCUC






R5293_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGAGGAC
2383



GCCAAGCUG






R5294_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCAGCUC
2384



UGCCAGGGC






R5295_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUGUCUGC
2385



GGCCCAGCU






R5392_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAUGUCUG
2386



CGGCCCAGC






R5393_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCAUCCGC
2387



AGACGUGAG






R5394_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCAUCGC
2388



CCAGGUCCU






R5395_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGCCAUCG
2389



CCCAGGUCC






R5396_CasPhi12_S
AUUGCUCCUUACGAGGAGACGACUAAGC
2390



CUUUGGCCA






R5397_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUCCAACA
2391



CCCACCGCG






R5398_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGGAGGA
2392



AGCUGGGGA






R5399_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCAGCUU
2393



CCUCCUGCA






R5400_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCCUGCA
2394



AUGCUUCCU






R5401_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGGGGGC
2395



CCUGUGGCU






R5402_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCACUCA
2396



GAGCCAGCC






R5403_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGCCACUC
2397



AGAGCCAGC






R5404_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUUUCGCC
2398



ACUCAGAGC






R5405_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCUUGAU
2399



UUCGCCACU






R5406_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGUCAAU
2400



GCUAGGUAC






R5407_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUUGGGGU
2401



CAAUGCUAG






R5408_CasPhi12_S
AUUGCUCCUUACGAGGAGACUUCCUUGG
2402



GGUCAAUGC






R5409_CasPhi12_S
AUUGCUCCUUACGAGGAGACACCCCAAG
2403



GAAGAAGAG






R5410_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCAUAGGG
2404



CCUCUUCUU






R5411_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGGCUGG
2405



GCUGAUCUU






R5412_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGCUGGG
2406



CUGAUCUUC






R5413_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGCCUCC
2407



CGCCCGCUG






R5414_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGUCCAC
2408



CGAGGCAGC






R5415_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCUUCCU
2409



GUCCACCGA






R5416_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGUACCU
2410



CGCAAGCAC






R5417_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGAGGUAC
2411



CUGAAGCGG






R5418_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGCCUCC
2412



UCGGCCUCG






R5419_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGCAGCAC
2413



GUGGUACAG






R5420_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCAGCACG
2414



UGGUACAGG






R5421_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUGGGCA
2415



CCCGCCUCA






R5422_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGGGCAC
2416



CCGCCUCAC






R5423_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGGCACC
2417



CGCCUCACG






R5424_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCAGUAC
2418



AUGUGCAUC






R5425_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCCGCCG
2419



CCUCCAAGG






R5426_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAGGCGGC
2420



GGGCCAAGA






R5427_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCCUGGA
2421



CCUCCGCAG






R5428_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCCCUCU
2422



GGAUUGGGG






R5429_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCCUCUG
2423



GAUUGGGGA






R5430_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGAGCCU
2424



CGUGGGACU






R5431_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUCUCCCC
2425



AUGCUGCUG






R5432_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCUCUGC
2426



UGCCUGAAG






R5433_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGCAGCA
2427



GAGGAGAAG






R5434_CasPhi12_S
AUUGCUCCUUACGAGGAGACAAAGGCUC
2428



GAUGGUGAA






R5435_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAAAGGCU
2429



CGAUGGUGA






R5436_CasPhi12_S
AUUGCUCCUUACGAGGAGACACCAUCGA
2430



GCCUUUCAA






R5437_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCUUUGAA
2431



AGGCUCGAU






R5438_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGGACUU
2432



GGCUUUGAA






R5439_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAAAGCCA
2433



AGUCCCUGA






R5440_CasPhi12_S
AUUGCUCCUUACGAGGAGACAAAGCCAA
2434



GUCCCUGAA






R5441_CasPhi12_S
AUUGCUCCUUACGAGGAGACCACAUCCU
2435



UCAGGGACU






R5442_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCAGGUCU
2436



UCCACAUCC






R5443_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCAGGUC
2437



UUCCACAUC






R5444_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCGGAAG
2438



ACACAGCUG






R5445_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGUCCCGA
2439



ACAGCAGGG






R5446_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGUCCCG
2440



AACAGCAGG






R5447_CasPhi12_S
AUUGCUCCUUACGAGGAGACUUUAGGUC
2441



CCGAACAGC






R5448_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUUUAGGU
2442



CCCGAACAG






R5449_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGACCUA
2443



AAGAAACUG






R5450_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGAAAGC
2444



CUGGGGGCC






R5451_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGGAAAG
2445



CCUGGGGGC






R5452_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCCAAAC
2446



UGGUGCGGA






R5453_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCAAACU
2447



GGUGCGGAU






R5454_CasPhi12_S
AUUGCUCCUUACGAGGAGACUUCUCACU
2448



CAGCGCAUC






R5455_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGCUGGGG
2449



GAAGGUGGC






R5456_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCCAGCU
2450



GAAGUCCUU






R5457_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAAGGACU
2451



UCAGCUGGG






R5458_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCAAGGAC
2452



UUCAGCUGG






R5459_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGGUUUC
2453



CAAGGACUU






R5460_CasPhi12_S
AUUGCUCCUUACGAGGAGACUAGGCACC
2454



CAGGUCAGU






R5461_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUAGGCAC
2455



CCAGGUCAG






R5462_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCUCGCUG
2456



CAUCCCUGC






R5463_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCUGAGC
2457



AGGGAUGCA






R5464_CasPhi12_S
AUUGCUCCUUACGAGGAGACUACAAUAA
2458



CUGCAUCUG






R5465_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCUCGUGU
2459



GCUUCCGGA






R5466_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGGACAUG
2460



GUGUCCCUC






R5467_CasPhi12_S
AUUGCUCCUUACGAGGAGACACGGCUGC
2461



CGGGGCCCA






R5468_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGAGGUGU
2462



CCUCAUGUG






R5469_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGGACAC
2463



UGAAUGGGA






R5470_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGUGUCCA
2464



GGAACACCU






R5471_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGGUGUU
2465



CCUGGACAC






R5472_CasPhi12_S
AUUGCUCCUUACGAGGAGACUUGCAGGU
2466



GUUCCUGGA






R5473_CasPhi12_S
AUUGCUCCUUACGAGGAGACACGGAUCA
2467



GCCUGAGAU
















TABLE AC







CasΦ.12 gRNAs targeting mouse PCSK9










Name
Repeat + spacer RNA Sequence (5′→3′)
SEQ ID NO





R4238_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCGCUGUUGCCG
1734



CCGCU






R4239_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCGCCGCUGCUG
1735



CUGCU






R4240_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGCUACUGUGC
1736



CCCAC






R4241_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUAAUCUCCAUC
1737



CUCGU






R4242_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGAAGAGCUGAU
1738



GCUCG






R4243_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAGCAACGGCGG
1739



AAGGU






R4244_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGGCAGCCUCC
1740



AGGCC






R4245_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGUGCUGAUGG
1741



AGGAG






R4246_CasPhi12_S
AUUGCUCCUUACGAGGAGACAAUCUGUAGCCU
1742



CUGGG






R4247_CasPhi12_S
AUUGCUCCUUACGAGGAGACUUCAAUCUGUAG
1743



CCUCU






R4248_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUUCAAUCUGUA
1744



GCCUC






R4249_CasPhi12_S
AUUGCUCCUUACGAGGAGACAACAAACUGCCC
1745



ACCGC






R4250_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUGACAUAGCCC
1746



CGGCG






R4251_CasPhi12_S
AUUGCUCCUUACGAGGAGACUACAUAUCUUUU
1747



AUGAC






R4252_CasPhi12_S
AUUGCUCCUUACGAGGAGACUAUGACCUCUUC
1748



CCUGG






R4253_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUGACCUCUUCC
1749



CUGGC






R4254_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGACCUCUUCCC
1750



UGGCU






R4255_CasPhi12_S
AUUGCUCCUUACGAGGAGACACCAAGAAGCCA
1751



GGGAA






R4256_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUGGCUUCUUG
1752



GUGAA






R4257_CasPhi12_S
AUUGCUCCUUACGAGGAGACUUGGUGAAGAUG
1753



AGCAG






R4258_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUGAAGAUGAGC
1754



AGUGA






R4259_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCCAUGUGGAG
1755



UACAU






R4260_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCAAUGUACUC
1756



CACAU






R4261_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGAAGACUCCU
1757



UUGUC






R4262_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUCUUCGCCCAG
1758



AGCAU






R4263_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUUCGCCCAGA
1759



GCAUC






R4264_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCCAGAGCAUC
1760



CCAUG






R4265_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAUGGGAUGCUC
1761



UGGGC






R4266_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCUCCAGGUUCC
1762



AUGGG






R4267_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCCAGCAUGGC
1763



ACCAG






R4268_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUCUGUCUGGUG
1764



CCAUG






R4269_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAUACCAGCAUC
1765



CAGGG






R4270_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGGCAGGGUCA
1766



CCAUC






R4271_CasPhi12_S
AUUGCUCCUUACGAGGAGACAAGUCGGUGAUG
1767



GUGAC






R4272_CasPhi12_S
AUUGCUCCUUACGAGGAGACAACAGCGUGCCG
1768



GAGGA






R4273_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCACACCAGCA
1769



UCCCG






R4274_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGCACACGCAGG
1770



CUGUG






R4275_CasPhi12_S
AUUGCUCCUUACGAGGAGACACAGUUGAGCAC
1771



ACGCA






R4276_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUUGACAGUUG
1772



AGCAC






R4277_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCUGACUCUUCC
1773



GAAUA






R4278_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUUCGGAAGAGU
1774



CAGCU






R4279_CasPhi12_S
AUUGCUCCUUACGAGGAGACUUCGGAAGAGUC
1775



AGCUA






R4280_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGAAGAGUCAGC
1776



UAAUC






R4281_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCUGCCCCUGG
1777



CCGGU






R4282_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGAUGCGGCUA
1778



UACCC






R4283_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCAGCUGCUGCA
1779



ACCAG






R4284_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGCAGCUGGGA
1780



ACUUC






R4285_CasPhi12_S
AUUGCUCCUUACGAGGAGACCGGGACGACGCC
1781



UGCCU






R4286_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUGGCCCCGACU
1782



GUGAU






R4287_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUUGGGGACUU
1783



UGGGG






R4288_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUCCCCAAAGUC
1784



CCCAA






R4289_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGACUUUGGGG
1785



ACUAA






R4290_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGGACUAAUUU
1786



UGGAC






R4291_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGACUAAUUUU
1787



GGACG






R4292_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGACGCUGUGU
1788



GGAUC






R4293_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGACGCUGUGUG
1789



GAUCU






R4294_CasPhi12_S
AUUGCUCCUUACGAGGAGACGACGCUGUGUGG
1790



AUCUC






R4295_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCGGGGGCAAAG
1791



AGAUC






R4296_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCCCCGGGAAG
1792



GACAU






R4297_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCCCGGGAAGG
1793



ACAUC






R4298_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUGUCACAGAGU
1794



GGGAC






R4299_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGCUCGGAUGC
1795



UGAGC






R4300_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCUGGCCGAGC
1796



UGCGG






R4301_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUAGAGAAGUGG
1797



AUCAG






R4302_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGUAGAGAAGUG
1798



GAUCA






R4303_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCUACCAAAGAC
1799



GUCAU






R4304_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUGACGUCUUUG
1800



GUAGA






R4305_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCUGAGGACCAG
1801



CAGGU






R4306_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGGUCAGCACC
1802



UGCUG






R4307_CasPhi12_S
AUUGCUCCUUACGAGGAGACGAGUGGGCCCCG
1803



AGUGU






R4308_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGGGCACAGCG
1804



GGCUG






R4309_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCAGGAGCGGG
1805



AGGCG






R4310_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGACCUGCUGG
1806



CCUCC






R4311_CasPhi12_S
AUUGCUCCUUACGAGGAGACAGGGCCUUGCAG
1807



ACCUG






R4312_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGGGUGAGGGU
1808



GUCUA






R4313_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGGGUGAGGGUG
1809



UCUAU






R4314_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCACGGGGAACC
1810



AGGCA






R4315_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCCGUGCCAACU
1811



GCAGC






R4316_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGAUGCUGCAG
1812



UUGGC






R4317_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGGUGGCAGUGG
1813



ACAUG






R4318_CasPhi12_S
AUUGCUCCUUACGAGGAGACCACUUCCCAAUG
1814



GAAGC






R4319_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAUUGGGAAGUG
1815



GAAGA






R4320_CasPhi12_S
AUUGCUCCUUACGAGGAGACGGAAGUGGAAGA
1816



CCUUA






R4321_CasPhi12_S
AUUGCUCCUUACGAGGAGACGUGUCCGGAGGC
1817



AGCCU






R4322_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCCACCAGGCGG
1818



CCAGU






R4323_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGCUGCCAUGC
1819



CCCAG






R4324_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAGCCCUGGGGC
1820



AUGGC






R4325_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAUUCCAGCCCU
1821



GGGGC






R4326_CasPhi12_S
AUUGCUCCUUACGAGGAGACGCAUUCCAGCCC
1822



UGGGG






R4327_CasPhi12_S
AUUGCUCCUUACGAGGAGACUGCAUUCCAGCC
1823



CUGGG






R4328_CasPhi12_S
AUUGCUCCUUACGAGGAGACAUUUUGCAUUCC
1824



AGCCC






R4329_CasPhi12_S
AUUGCUCCUUACGAGGAGACCAUCCAGUCAGG
1825



GUCCA






R4330_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCACGCUGUAG
1826



GCUCC






R4331_CasPhi12_S
AUUGCUCCUUACGAGGAGACCCACACACAGGU
1827



UGUCC






R4332_CasPhi12_S
AUUGCUCCUUACGAGGAGACUCCACUGGUCCU
1828



GUCUG






R4333_CasPhi12_S
AUUGCUCCUUACGAGGAGACCUGAAGGCCGGC
1829



UCCGG
















TABLE AD







CasΦ.12 gRNAs targeting Bak1 in CHO cells










Name
Repeat + spacer RNA Sequence (5′→3′), shown as DNA
SEQ ID NO





R2452
ATTGCTCCTTACGAGGAGACG
1830


Bak1_CasPhi12_1_S
AAGCTATGTTTTCCAT






R2453
ATTGCTCCTTACGAGGAGACG
1831


Bak1_CasPhi12_2_S
CAGGGGCAGCCGCCCC






R2454
ATTGCTCCTTACGAGGAGACC
1832


Bak1_CasPhi12_3_S
TCCTAGAACCCAACAG






R2455
ATTGCTCCTTACGAGGAGACG
1833


Bak1_CasPhi12_4_S
AAAGACCTCCTCTGTG






R2456
ATTGCTCCTTACGAGGAGACT
1834


Bak1_CasPhi12_5_S
CCATCTCGGGGTTGGC






R2457
ATTGCTCCTTACGAGGAGACT
1835


Bak1_CasPhi12_6_S
TCCTGATGGTGGAGAT






R2849_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACC
1836


nsd_sg1_S
TGACTCCCAGCTCTGA






R2850_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACT
1837


nsd_sg2_S
GGGGTCAGAGCTGGGA






R2851_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACG
1838


nsd_sg3_S
AAAGACCTCCTCTGTG






R2852_Bak1_
ATTGCTCCTTACGAGGAGACC
1839


CasPhi12_nsd_sg4_S
GAAGCTATGTTTTCCA






R2853_Bak1_
ATTGCTCCTTACGAGGAGACG
1840


CasPhi12_nsd_sg5_S
AAGCTATGTTTTCCAT






R2854_Bak1_
ATTGCTCCTTACGAGGAGACT
1841


CasPhi12_nsd_sg6_S
CCATCTCCACCATCAG






R2855_Bak1_
ATTGCTCCTTACGAGGAGACC
1842


CasPhi12_nsd_sg7_S
CATCTCCACCATCAGG






R2856_Bak1_
ATTGCTCCTTACGAGGAGACC
1843


CasPhi12_nsd_sg8_S
TGATGGTGGAGATGGA






R2857_Bak1_
ATTGCTCCTTACGAGGAGACC
1844


CasPhi12_nsd_sg9_S
ATCTCCACCATCAGGA






R2858_Bak1_
ATTGCTCCTTACGAGGAGACT
1845


CasPhi12_nsd_sg10_S
TCCTGATGGTGGAGAT






R2859_Bak1_
ATTGCTCCTTACGAGGAGACG
1846


CasPhi12_nsd_sg11_S
CAGGGGCAGCCGCCCC






R2860_Bak1_
ATTGCTCCTTACGAGGAGACT
1847


CasPhi12_nsd_sg12_S
CCATCTCGGGGTTGGC






R2861_Bak1_
ATTGCTCCTTACGAGGAGACT
1848


CasPhi12_nsd_sg13_S
AGGAGCAAATTGTCCA






R2862_Bak1_
ATTGCTCCTTACGAGGAGACG
1849


CasPhi12_nsd_sg14_S
GTTCTAGGAGCAAATT






R2863_Bak1_
ATTGCTCCTTACGAGGAGACG
1850


CasPhi12_nsd_sg15_S
CTCCTAGAACCCAACA






R2864_Bak1_
ATTGCTCCTTACGAGGAGACC
1851


CasPhi12_nsd_sg16_S
TCCTAGAACCCAACAG






R3977_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACT
1852


exon1_sg1_S
CCAGACGCCATCTTTC






R3978_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACT
1853


exon1_sg2_S
GGTAAGAGTCCTCCTG






R3979_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACT
1854


exon3_sg1_S
TACAGCATCTTGGGTC






R3980_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACG
1855


exon3_sg2_S
GTCAGGTGGGCCGGCA






R3981_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACC
1856


exon3_sg3_S
TATCATTGGAGATGAC






R3982_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACG
1857


exon3_sg4_S
AGATGACATTAACCGG






R3983_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACT
1858


exon3_sg5_S
GGAACTCTGTGTCGTA






R3984_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACC
1859


exon3_sg6_S
AGAATTTACTGGAGCA






R3985_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACA
1860


exon3_sg7_S
CTGGAGCAGCTGCAGC






R3986_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACC
1861


exon3_sg8_S
CAGCTGTGGGCTGCAG






R3987_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACG
1862


exon3_sg9_S
TAGGCATTCCCAGCTG






R3988_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACG
1863


exon3_sg10_S
TGAAGAGTTCGTAGGC






R3989_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACA
1864


exon3_sg11_S
CCAAGATTGCCTCCAG






R3990_Bak1_CasPhi12_
ATTGCTCCTTACGAGGAGACC
1865


exon3_sg12_S
CTCCAGGTACCCACCA
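
Because the sequences in this table are RNA sequences written in DNA letters, a reader may want to recover the RNA form and separate the repeat from the spacer. The following is a minimal sketch only, not part of the disclosed methods. It assumes that the 20-nucleotide prefix shared by every entry in the table (ATTGCTCCTTACGAGGAGAC) is the repeat; the names `REPEAT_DNA`, `to_rna`, and `split_repeat_spacer` are hypothetical and introduced here for illustration.

```python
# Assumption: the 20-nt prefix common to all rows of the table is the repeat.
REPEAT_DNA = "ATTGCTCCTTACGAGGAGAC"


def to_rna(dna_notation: str) -> str:
    """Rewrite a sequence shown in DNA letters as RNA letters (T -> U)."""
    seq = dna_notation.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.replace("T", "U")


def split_repeat_spacer(dna_notation: str, repeat: str = REPEAT_DNA):
    """Return (repeat, spacer) in RNA letters for one repeat + spacer entry."""
    seq = dna_notation.upper()
    if not seq.startswith(repeat):
        raise ValueError("entry does not begin with the assumed repeat")
    return to_rna(repeat), to_rna(seq[len(repeat):])


# Entry R2452 (SEQ ID NO: 1830), joined from its two table lines.
entry = "ATTGCTCCTTACGAGGAGACG" + "AAGCTATGTTTTCCAT"
repeat_rna, spacer_rna = split_repeat_spacer(entry)
print(repeat_rna, spacer_rna, len(spacer_rna))
```

Under the stated assumption, the same helper can be applied to any entry in the tables that are shown as DNA, since each entry begins with the same prefix.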
















TABLE AE







CasΦ.12 gRNAs targeting Bax in CHO cells










Name
Repeat + spacer RNA Sequence (5′→3′), shown as DNA
SEQ ID NO





R2458
ATTGCTCCTTACGAGGAGACC
1866


Bax_CasPhi12_1_S
TAATGTGGATACTAAC






R2459
ATTGCTCCTTACGAGGAGACT
1867


Bax_CasPhi12_2_S
TCCGTGTGGCAGCTGA






R2460
ATTGCTCCTTACGAGGAGACC
1868


Bax_CasPhi12_3_S
TGATGGCAACTTCAAC






R2461
ATTGCTCCTTACGAGGAGACT
1869


Bax_CasPhi12_4_S
ACTTTGCTAGCAAACT






R2462
ATTGCTCCTTACGAGGAGACA
1870


Bax_CasPhi12_5_S
GCACCAGTTTGCTAGC






R2463
ATTGCTCCTTACGAGGAGACA
1871


Bax_CasPhi12_6_S
ACTGGGGCCGGGTTGT






R2865_Bax_CasPhi12_
ATTGCTCCTTACGAGGAGACT
1872


nsd_sg1_S
TCTCTTTCCTGTAGGA






R2866_Bax_CasPhi12_
ATTGCTCCTTACGAGGAGACT
1873


nsd_sg2_S
CTTTCCTGTAGGATGA






R2867_Bax_
ATTGCTCCTTACGAGGAGACC
1874


CasPhi12_nsd_sg3_S
CTGTAGGATGATTGCT






R2868_Bax_
ATTGCTCCTTACGAGGAGACC
1875


CasPhi12_nsd_sg4_S
TGTAGGATGATTGCTA






R2869_Bax_
ATTGCTCCTTACGAGGAGACC
1876


CasPhi12_nsd_sg5_S
TAATGTGGATACTAAC






R2870_Bax_
ATTGCTCCTTACGAGGAGACT
1877


CasPhi12_nsd_sg6_S
TCCGTGTGGCAGCTGA






R2871_Bax_
ATTGCTCCTTACGAGGAGACC
1878


CasPhi12_nsd_sg7_S
GTGTGGCAGCTGACAT






R2872_Bax_
ATTGCTCCTTACGAGGAGACC
1879


CasPhi12_nsd_sg8_S
CATCAGCAAACATGTC






R2873_Bax_
ATTGCTCCTTACGAGGAGACA
1880


CasPhi12_nsd_sg9_S
AGTTGCCATCAGCAAA






R2874_Bax_
ATTGCTCCTTACGAGGAGACG
1881


CasPhi12_nsd_sg10_S
CTGATGGCAACTTCAA






R2875_Bax_
ATTGCTCCTTACGAGGAGACC
1882


CasPhi12_nsd_sg11_S
TGATGGCAACTTCAAC






R2876_Bax_
ATTGCTCCTTACGAGGAGACA
1883


CasPhi12_nsd_sg12_S
ACTGGGGCCGGGTTGT






R2877_Bax_
ATTGCTCCTTACGAGGAGACT
1884


CasPhi12_nsd_sg13_S
TGCCCTTTTCTACTTT






R2878_Bax_
ATTGCTCCTTACGAGGAGACC
1885


CasPhi12_nsd_sg14_S
CCTTTTCTACTTTGCT






R2879_Bax_
ATTGCTCCTTACGAGGAGACC
1886


CasPhi12_nsd_sg15_S
TAGCAAAGTAGAAAAG






R2880_Bax_
ATTGCTCCTTACGAGGAGACG
1887


CasPhi12_nsd_sg16_S
CTAGCAAAGTAGAAAA






R2881_Bax_
ATTGCTCCTTACGAGGAGACT
1888


CasPhi12_nsd_sg17_S
CTACTTTGCTAGCAAA






R2882_Bax_
ATTGCTCCTTACGAGGAGACC
1889


CasPhi12_nsd_sg18_S
TACTTTGCTAGCAAAC






R2883_Bax_
ATTGCTCCTTACGAGGAGACT
1890


CasPhi12_nsd_sg19_S
ACTTTGCTAGCAAACT






R2884_Bax_
ATTGCTCCTTACGAGGAGACG
1891


CasPhi12_nsd_sg20_S
CTAGCAAACTGGTGCT






R2885_Bax_
ATTGCTCCTTACGAGGAGACC
1892


CasPhi12_nsd_sg21_S
TAGCAAACTGGTGCTC






R2886_Bax_
ATTGCTCCTTACGAGGAGACA
1893


CasPhi12_nsd_sg22_S
GCACCAGTTTGCTAGC
















TABLE AF







CasΦ.12 gRNAs targeting Fut8 in CHO cells










Name
Repeat + spacer RNA Sequence (5′→3′), shown as DNA
SEQ ID NO





R2464
ATTGCTCCTTACGAGGAGACC
1894


Fut8_CasPhi12_1_S
CACTTTGTCAGTGCGT






R2465
ATTGCTCCTTACGAGGAGACC
1895


Fut8_CasPhi12_2_S
TCAATGGGATGGAAGG






R2466
ATTGCTCCTTACGAGGAGACA
1896


Fut8_CasPhi12_3_S
GGAATACATGGTACAC






R2467
ATTGCTCCTTACGAGGAGACA
1897


Fut8_CasPhi12_4_S
AGAACATTTTCAGCTT






R2468
ATTGCTCCTTACGAGGAGACA
1898


Fut8_CasPhi12_5_S
TCCACTTTCATTCTGC






R2469
ATTGCTCCTTACGAGGAGACT
1899


Fut8_CasPhi12_6_S
TTGTTAAAGGAGGCAA






R2887_Fut8_
ATTGCTCCTTACGAGGAGACT
1900


CasPhi12_nsd_sg1_S
CCCCAGAGTCCATGTC






R2888_Fut8_
ATTGCTCCTTACGAGGAGACT
1901


CasPhi12_nsd_sg2_S
CAGTGCGTCTGACATG






R2889_Fut8_
ATTGCTCCTTACGAGGAGACG
1902


CasPhi12_nsd_sg3_S
TCAGTGCGTCTGACAT






R2890_Fut8_
ATTGCTCCTTACGAGGAGACC
1903


CasPhi12_nsd_sg4_S
CACTTTGTCAGTGCGT






R2891_Fut8_
ATTGCTCCTTACGAGGAGACT
1904


CasPhi12_nsd_sg5_S
GTTCCCACTTTGTCAG






R2892_Fut8_
ATTGCTCCTTACGAGGAGACC
1905


CasPhi12_nsd_sg6_S
TCAATGGGATGGAAGG






R2893_Fut8_
ATTGCTCCTTACGAGGAGACC
1906


CasPhi12_nsd_sg7_S
ATCCCATTGAGGAATA






R2894_Fut8_
ATTGCTCCTTACGAGGAGACA
1907


CasPhi12_nsd_sg8_S
GGAATACATGGTACAC






R2895_Fut8_
ATTGCTCCTTACGAGGAGACA
1908


CasPhi12_nsd_sg9_S
ACGTGTACCATGTATT






R2896_Fut8_
ATTGCTCCTTACGAGGAGACT
1909


CasPhi12_nsd_sg10_S
TCAACGTGTACCATGT






R2897_Fut8_
ATTGCTCCTTACGAGGAGACA
1910


CasPhi12_nsd_sg11_S
AGAACATTTTCAGCTT






R2898_Fut8_
ATTGCTCCTTACGAGGAGACG
1911


CasPhi12_nsd_sg12_S
AGAAGCTGAAAATGTT






R2899_Fut8_
ATTGCTCCTTACGAGGAGACT
1912


CasPhi12_nsd_sg13_S
CAGCTTCTCGAACGCA






R2900_Fut8_
ATTGCTCCTTACGAGGAGACC
1913


CasPhi12_nsd_sg14_S
AGCTTCTCGAACGCAG






R2901_Fut8_
ATTGCTCCTTACGAGGAGACT
1914


CasPhi12_nsd_sg15_S
GCGTTCGAGAAGCTGA






R2902_Fut8_
ATTGCTCCTTACGAGGAGACA
1915


CasPhi12_nsd_sg16_S
GCTTCTCGAACGCAGA






R2903_Fut8_
ATTGCTCCTTACGAGGAGACA
1916


CasPhi12_nsd_sg17_S
TTCTGCGTTCGAGAAG






R2904_Fut8_
ATTGCTCCTTACGAGGAGACC
1917


CasPhi12_nsd_sg18_S
ATTCTGCGTTCGAGAA






R2905_Fut8_
ATTGCTCCTTACGAGGAGACT
1918


CasPhi12_nsd_sg19_S
CGAACGCAGAATGAAA






R2906_Fut8_
ATTGCTCCTTACGAGGAGACA
1919


CasPhi12_nsd_sg20_S
TCCACTTTCATTCTGC






R2907_Fut8_
ATTGCTCCTTACGAGGAGACT
1920


CasPhi12_nsd_sg21_S
ATCCACTTTCATTCTG






R2908_Fut8_
ATTGCTCCTTACGAGGAGACT
1921


CasPhi12_nsd_sg22_S
TATCCACTTTCATTCT






R2909_Fut8_
ATTGCTCCTTACGAGGAGACT
1922


CasPhi12_nsd_sg23_S
TTATCCACTTTCATTC






R2910_Fut8_
ATTGCTCCTTACGAGGAGACT
1923


CasPhi12_nsd_sg24_S
TTTATCCACTTTCATT






R2911_Fut8_
ATTGCTCCTTACGAGGAGACA
1924


CasPhi12_nsd_sg25_S
ACAAAGAAGGGTCATC






R2912_Fut8_
ATTGCTCCTTACGAGGAGACC
1925


CasPhi12_nsd_sg26_S
CTCCTTTAACAAAGAA






R2913_Fut8_
ATTGCTCCTTACGAGGAGACG
1926


CasPhi12_nsd_sg27_S
CCTCCTTTAACAAAGA






R2914_Fut8_
ATTGCTCCTTACGAGGAGACT
1927


CasPhi12_nsd_sg28_S
TTGTTAAAGGAGGCAA






R2915_Fut8_
ATTGCTCCTTACGAGGAGACG
1928


CasPhi12_nsd_sg29_S
TTAAAGGAGGCAAAGA






R2916_Fut8_
ATTGCTCCTTACGAGGAGACT
1929


CasPhi12_nsd_sg30_S
TAAAGGAGGCAAAGAC






R2917_Fut8_
ATTGCTCCTTACGAGGAGACT
1930


CasPhi12_nsd_sg31_S
CTTTGCCTCCTTTAAC






R2918_Fut8_
ATTGCTCCTTACGAGGAGACG
1931


CasPhi12_nsd_sg32_S
TCTTTGCCTCCTTTAA






R2919_Fut8_
ATTGCTCCTTACGAGGAGACG
1932


CasPhi12_nsd_sg33_S
TCTAACTTACTTTGTC






R2920_Fut8_
ATTGCTCCTTACGAGGAGACT
1933


CasPhi12_nsd_sg34_S
TGGTCTAACTTACTTT
















TABLE AG

CasΦ.12 gRNAs targeting Fut8

Name | Repeat length | Spacer length | Repeat sequence (5′→3′) | Spacer sequence (5′→3′) | crRNA sequence (5′→3′)
R3582 | 36 | 30 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUUGAAGAACAUU (SEQ ID NO: 1482) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGAACAUU (SEQ ID NO: 1499)
R3583 | 36 | 29 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUUGAAGAACAU (SEQ ID NO: 1483) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGAACAU (SEQ ID NO: 1500)
R3584 | 36 | 28 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUUGAAGAACA (SEQ ID NO: 1484) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGAACA (SEQ ID NO: 1501)
R3585 | 36 | 27 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUUGAAGAAC (SEQ ID NO: 1485) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGAAC (SEQ ID NO: 1502)
R3586 | 36 | 26 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUUGAAGAA (SEQ ID NO: 1486) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGAA (SEQ ID NO: 1503)
R3587 | 36 | 25 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUUGAAGA (SEQ ID NO: 1487) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGA (SEQ ID NO: 1504)
R3588 | 36 | 24 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUUGAAG (SEQ ID NO: 1488) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAG (SEQ ID NO: 1505)
R3589 | 36 | 23 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUUGAA (SEQ ID NO: 1489) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAA (SEQ ID NO: 1506)
R3590 | 36 | 22 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUUGA (SEQ ID NO: 1490) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGA (SEQ ID NO: 1507)
R3591 | 36 | 21 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUUG (SEQ ID NO: 1491) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUG (SEQ ID NO: 1508)
R3592 | 36 | 20 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1492) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1509)
R3593 | 36 | 19 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACGU (SEQ ID NO: 1493) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGU (SEQ ID NO: 1510)
R3594 | 36 | 18 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACACG (SEQ ID NO: 1494) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACG (SEQ ID NO: 1511)
R3595 | 36 | 17 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACAC (SEQ ID NO: 1495) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACAC (SEQ ID NO: 1512)
R3596 | 36 | 16 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUACA (SEQ ID NO: 1496) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACA (SEQ ID NO: 1513)
R3597 | 36 | 15 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 2469) | AGGAAUACAUGGUAC (SEQ ID NO: 1497) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUAC (SEQ ID NO: 1514)
R3598 | 35 | 20 | UUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1466) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | UUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1515)
R3599 | 34 | 20 | UUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1467) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | UUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1516)
R3600 | 33 | 20 | UCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1468) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | UCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1517)
R3601 | 32 | 20 | CAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1469) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | CAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1518)
R3602 | 31 | 20 | AAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1470) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | AAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1519)
R3603 | 30 | 20 | AGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1471) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | AGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1520)
R3604 | 29 | 20 | GACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1472) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | GACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1521)
R3605 | 28 | 20 | ACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1473) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | ACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1522)
R3606 | 27 | 20 | CUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1474) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | CUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1523)
R3607 | 26 | 20 | UAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1475) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | UAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1524)
R3608 | 25 | 20 | AAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1476) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | AAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1525)
R3609 | 24 | 20 | AUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1477) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | AUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1526)
R3610 | 23 | 20 | UAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1478) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | UAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1527)
R3611 | 22 | 20 | AGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1479) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | AGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1528)
R3612 | 21 | 20 | GAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1480) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | GAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1529)
R3613 | 20 | 20 | AUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1481) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | AUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1530)
















TABLE AH

CasΦ.12 gRNAs targeting B2M and TRAC

Name | Target | Modification | Repeat sequence (5′→3′) | Spacer sequence (5′→3′) | crRNA sequence (5′→3′)
R3150 20-20 | B2M Exon 2 | Unmodified; 2′OMe at last 3′ base (2me); 2′OMe at last two 3′ bases (2me); 2′OMe at last three 3′ bases (3me) | AUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1433) | CAGUGGGGGUGAAUUCAGUG (SEQ ID NO: 1434) | AUUGCUCCUUACGAGGAGACCAGUGGGGGUGAAUUCAGUG (SEQ ID NO: 1435)
R3042 20-20 | TRAC Exon 1 | Unmodified; 2me; 2me; 3me | AUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1433) | GAGUCUCUCAGCUGGUACAC (SEQ ID NO: 1436) | AUUGCUCCUUACGAGGAGACGAGUCUCUCAGCUGGUACAC (SEQ ID NO: 1437)
R3150 20-17 | B2M Exon 2 | Unmodified; 2me; 2me; 3me | AUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1433) | CAGUGGGGGUGAAUUCA (SEQ ID NO: 1438) | AUUGCUCCUUACGAGGAGACCAGUGGGGGUGAAUUCA (SEQ ID NO: 1439)
R3042 20-17 | TRAC Exon 1 | Unmodified; 2me; 2me; 3me | AUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1433) | CAGUGGGGGUGAAUUCA (SEQ ID NO: 1440) | AUUGCUCCUUACGAGGAGACGAGUCUCUCAGCUGGUA (SEQ ID NO: 1441)









In some embodiments, the guide nucleic acid comprises a spacer sequence that is the same as or differs by no more than 5 nucleotides from a spacer sequence from Tables A to H, by no more than 4 nucleotides from a spacer sequence from Tables A to H, by no more than 3 nucleotides from a spacer sequence from Tables A to H, by no more than 2 nucleotides from a spacer sequence from Tables A to H, or by no more than 1 nucleotide from a spacer sequence from Tables A to H. A difference may be an addition, deletion or substitution and, where there are multiple differences, the differences may be additions, deletions and/or substitutions.


In some embodiments, the guide nucleic acid comprises a sequence that is the same as or differs by no more than 5 nucleotides from a sequence from Tables I to AH, by no more than 4 nucleotides from a sequence from Tables I to AH, by no more than 3 nucleotides from a sequence from Tables I to X, by no more than 2 nucleotides from a sequence from Tables I to AH, or by no more than 1 nucleotide from a sequence from Tables I to AH. A difference may be an addition, deletion or substitution and, where there are multiple differences, the differences may be additions, deletions and/or substitutions.
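By way of illustration only, the "differs by no more than N nucleotides" criterion can be read as an edit-distance threshold, since each difference may be an addition, deletion or substitution. The Python sketch below assumes that reading; the function names edit_distance and within_differences are hypothetical and are not part of the disclosure. The example sequences are the spacers of SEQ ID NO: 1434 and SEQ ID NO: 1438 from Table AH.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-nucleotide
    additions, deletions or substitutions that turn sequence a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a nucleotide of a
                            curr[j - 1] + 1,      # add a nucleotide of b
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def within_differences(candidate: str, reference: str, max_diff: int = 5) -> bool:
    """True if candidate differs from reference by at most max_diff nucleotides."""
    return edit_distance(candidate.upper(), reference.upper()) <= max_diff


if __name__ == "__main__":
    spacer_20 = "CAGUGGGGGUGAAUUCAGUG"  # SEQ ID NO: 1434
    spacer_17 = "CAGUGGGGGUGAAUUCA"     # SEQ ID NO: 1438
    print(edit_distance(spacer_20, spacer_17))
    print(within_differences(spacer_17, spacer_20, max_diff=5))
```

Under this assumed reading, the 17-nucleotide spacer is treated as the 20-nucleotide spacer with three deletions.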


In some embodiments, the guide nucleic acid comprises a sequence that is at least 30, at least 31, at least 32, at least 33, at least 34, at least 35, at least 36, at least 37, at least 38, at least 39, at least 40, at least 41, at least 42, at least 43, at least 44, at least 45, at least 46, at least 47, at least 48, at least 49, at least 50, at least 51, at least 52, at least 53, at least 54, at least 55, at least 56 or at least 57 contiguous nucleobases of a sequence from Tables I to X, AG and AH (SEQ ID NO: 547-1404, 1433-1441, 1466-1530 or 2112-2289).


In some embodiments, the guide nucleic acid comprises a sequence that is 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56 or 57 contiguous nucleobases of a sequence from Tables I to X, AG and AH (SEQ ID NO: 547-1404, 1433-1441, 1466-1530 or 2112-2289).


In some embodiments, the guide nucleic acid comprises a sequence that is at least 30, at least 31, at least 32, at least 33, at least 34, at least 35, at least 36 or at least 37 contiguous nucleobases of a sequence from Tables Y to AF (SEQ ID NO: 1533-1933 or 2290-2467).


In some embodiments, the guide nucleic acid comprises a sequence that is 30, 31, 32, 33, 34, 35, 36 or 37 contiguous nucleobases of a sequence from Tables Y to AF (SEQ ID NO: 1533-1933 or 2290-2467).
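As a non-limiting sketch, the "contiguous nucleobases" criterion can be checked by finding the longest run of nucleobases that a candidate guide shares, uninterrupted, with a reference sequence. The helper names longest_shared_run and has_contiguous_run are hypothetical; the reference used in the example is the 37-nucleobase sequence listed above for SEQ ID NO: 1933.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest stretch of contiguous nucleobases that
    appears in both candidate and reference (longest common substring)."""
    best = 0
    prev = [0] * (len(reference) + 1)
    for ca in candidate:
        curr = [0]
        for j, cb in enumerate(reference, start=1):
            # Extend the run ending at the previous positions, or reset to 0.
            run = prev[j - 1] + 1 if ca == cb else 0
            curr.append(run)
            if run > best:
                best = run
        prev = curr
    return best


def has_contiguous_run(candidate: str, reference: str, n: int) -> bool:
    """True if candidate contains at least n contiguous nucleobases of reference."""
    return longest_shared_run(candidate.upper(), reference.upper()) >= n


if __name__ == "__main__":
    reference = "ATTGCTCCTTACGAGGAGACTTGGTCTAACTTACTTT"  # SEQ ID NO: 1933
    candidate = reference[3:35]  # hypothetical 32-nucleobase fragment
    print(has_contiguous_run(candidate, reference, 30))
```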


In some embodiments, the guide nucleic acid comprises a repeat sequence from Table 2 and a spacer sequence from Tables A to H.


In the sequences provided in Tables A-AH, the base T is interchangeable with U when a guide nucleic acid either is or comprises ribonucleic or deoxyribonucleic nucleosides.
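For illustration, the T/U interchangeability and the composition of the crRNA entries in Tables AG and AH, where each crRNA sequence reads as the repeat sequence followed by the spacer sequence, can be sketched in Python as below. The function names to_rna, to_dna and build_crrna are hypothetical, and the sketch assumes simple 5′→3′ concatenation of repeat and spacer as seen in those tables.

```python
def to_rna(seq: str) -> str:
    """Write a sequence with U in place of T."""
    return seq.upper().replace("T", "U")


def to_dna(seq: str) -> str:
    """Write a sequence with T in place of U."""
    return seq.upper().replace("U", "T")


def build_crrna(repeat: str, spacer: str) -> str:
    """Concatenate repeat and spacer (5'->3') and express the result as RNA."""
    return to_rna(repeat) + to_rna(spacer)


if __name__ == "__main__":
    repeat_dna = "ATTGCTCCTTACGAGGAGAC"  # DNA-style spelling of SEQ ID NO: 1433
    spacer_rna = "CAGUGGGGGUGAAUUCAGUG"  # SEQ ID NO: 1434
    # Prints the crRNA listed as SEQ ID NO: 1435 in Table AH.
    print(build_crrna(repeat_dna, spacer_rna))
```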


Coding Sequences and Expression Vectors

In some aspects, the present disclosure provides a nucleic acid encoding a programmable CasΦ nuclease disclosed herein. In some embodiments, the nucleic acid is a vector; preferably, the vector is an expression vector. Suitable expression vectors are easily identifiable for the cell type of interest. For example, an expression vector comprises a suitable promoter for transcription in the cell type of interest. An expression vector can also include other elements to support transcription, such as a Woodchuck Hepatitis Virus (WHP) Posttranscriptional Regulatory Element (WPRE).


In some embodiments, a nucleic acid encoding a programmable CasΦ nuclease (e.g. within an expression vector) comprises elements suitable for expression in a eukaryotic cell. In some embodiments, the nucleic acid comprises a promoter suitable for transcription in a eukaryotic cell, e.g. containing a TATA box and/or a TFIIB recognition element. The nucleic acid (e.g. within an expression vector) will typically include a promoter suitable for transcription in a eukaryotic cell upstream of the sequence encoding the programmable CasΦ nuclease, and may include a transcription terminator downstream of the sequence encoding the programmable CasΦ nuclease. The nucleic acid (e.g. within an expression vector) may also include enhancer(s) upstream and/or downstream of the sequence encoding the programmable CasΦ nuclease. A promoter may be an inducible promoter. The nucleic acid may also comprise a guide RNA. Suitable promoters are well known in the art and include the CMV promoter, EF1a promoter, intron-less EF1a short promoter, SV40 promoter, human or mouse PGK1 promoter, Ubc (ubiquitin C) promoter and mouse or human U6 promoter. Suitable mammalian promoters include the EF1a promoter, intron-less EF1a short promoter, and human U6 promoter.


In some embodiments, the vector is a viral vector. In some embodiments, the vector is a retroviral vector or a lentiviral vector. In preferred embodiments, the vector is an adeno-associated viral (AAV) vector. Several serotypes are available for AAV vectors that can be used in the compositions and methods disclosed herein, including AAV1, AAV2, AAV5, AAV6, AAV8, AAV9 and AAV DJ. In more preferred embodiments, the AAV vector is an AAV DJ vector.


A vector may be integrated into a host cell genome.


In some embodiments, a vector comprises a nucleic acid encoding a programmable CasΦ nuclease. In some embodiments, a vector comprises a nucleic acid encoding a guide nucleic acid. In some embodiments, a vector comprises a donor polynucleotide. In some embodiments, a nucleic acid encoding a programmable CasΦ nuclease, a nucleic acid encoding a guide nucleic acid and a donor polynucleotide are comprised by separate vectors. In some embodiments, a vector comprises a nucleic acid encoding a programmable CasΦ nuclease and a nucleic acid encoding a guide nucleic acid.


It is well known in the field that the large size of Cas9 nucleases makes Cas9 impractical for several applications. For example, packaging vectors into viral particles becomes more difficult as the size of the vector increases. It is therefore difficult to include other components in a viral vector that includes a nucleic acid encoding a Cas9 nuclease. Accordingly, one of the advantages of the programmable CasΦ nucleases disclosed herein arises from the smaller size of the programmable CasΦ nucleases, which allows vectors comprising a nucleic acid encoding a programmable CasΦ nuclease to be easily packaged into viral particles when the vector also includes nucleic acids encoding other components, such as a nucleic acid encoding a guide nucleic acid and/or a donor polynucleotide. In preferred embodiments, a vector encodes a nucleic acid encoding a programmable CasΦ nuclease and a nucleic acid encoding a guide nucleic acid. In preferred embodiments, a vector encodes a nucleic acid encoding a programmable CasΦ nuclease, a nucleic acid encoding a guide nucleic acid and a donor polynucleotide. In some preferred embodiments, a vector comprises up to 1 kb donor polynucleotide, a promoter for expression of a guide nucleic acid, a nucleic acid encoding the guide nucleic acid, a mammalian promoter for expression of a programmable CasΦ nuclease, a nucleic acid encoding the programmable CasΦ nuclease, and a polyA signal. In alternative preferred embodiments, the donor polynucleotide is included in a nucleic acid encoding a tag, such as a fluorescent protein. In further preferred embodiments, the programmable CasΦ nuclease encoded by the vector is fused or linked to two nuclear localization signals.


In some embodiments, the expression vector comprises elements suitable for expression in a prokaryotic cell. In some embodiments, the expression vector comprises a promoter suitable for transcription in a prokaryotic cell, e.g. comprising a Shine-Dalgarno sequence.


In some embodiments, a CasΦ nuclease, a guide nucleic acid, or a nucleic acid encoding any combination thereof, may be inserted into a host cell by manner of electroporation, nucleofection, chemical methods, transfection, transduction, transformation, or microinjection. In some embodiments, a CasΦ nuclease, a guide nucleic acid, or a nucleic acid encoding any combination thereof, may be introduced into a cell by squeezing the cell to deform it, thereby disrupting the cell membrane and allowing the CasΦ nuclease, the guide nucleic acid, or the nucleic acid encoding any combination thereof, to pass into the cell.


In some embodiments, an Amaxa 4D nucleofector may be used to carry out nucleofection. In some embodiments, the chemical method or transfection comprises lipofectamine.


Lipid nanoparticle (LNP) delivery is one of the most clinically advanced non-viral delivery systems for gene therapy. LNPs have many properties that make them ideal candidates for delivery of nucleic acids, including ease of manufacture, low cytotoxicity and immunogenicity, high efficiency of nucleic acid encapsulation and cell transfection, multidosing capabilities and flexibility of design (Kulkarni et al., (2018) Nucleic Acid Therapeutics). In some embodiments, LNP is used to deliver a nucleic acid encoding a programmable CasΦ nuclease described herein. In some embodiments, LNP is used to deliver a nucleic acid encoding a guide nucleic acid. In some embodiments, LNP is used to deliver a nucleic acid encoding a programmable CasΦ nuclease and a guide nucleic acid. In some embodiments, the LNP has an amine group to phosphate (N/P) ratio of between 2 and 10, between 3 and 10, or between 5 and 9. In preferred embodiments, the LNP has an N/P ratio of between 5 and 9. In more preferred embodiments, the LNP has an N/P ratio of 5. In some embodiments, the LNP comprises additional components, e.g., nucleic acids, proteins, peptides, small molecules, sugars, lipids.


In more preferred embodiments, the LNP has an N/P ratio of 4 to 5. In preferred embodiments, the LNP comprises a nucleic acid encoding a programmable CasΦ nuclease, and the LNP has an N/P ratio of 4 to 5.
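As a hedged illustration of the N/P ratio referred to above, the ratio can be estimated as the moles of ionizable amine groups contributed by the lipid divided by the moles of phosphate groups contributed by the nucleic acid. The Python sketch below makes simplifying assumptions that are not taken from the disclosure: one phosphate per nucleotide, a known number of amine groups per lipid molecule, and the hypothetical function names np_ratio and lipid_needed_nmol. The quantities in the example are arbitrary.

```python
def np_ratio(lipid_nmol: float, amines_per_lipid: int,
             nucleic_acid_nmol: float, length_nt: int) -> float:
    """Estimate the amine-to-phosphate (N/P) ratio of a formulation.

    Assumes one phosphate group per nucleotide (a simplification).
    """
    phosphate_nmol = nucleic_acid_nmol * length_nt
    if phosphate_nmol <= 0:
        raise ValueError("nucleic acid amount and length must be positive")
    return (lipid_nmol * amines_per_lipid) / phosphate_nmol


def lipid_needed_nmol(target_np: float, amines_per_lipid: int,
                      nucleic_acid_nmol: float, length_nt: int) -> float:
    """Amount of ionizable lipid (nmol) needed to reach a target N/P ratio."""
    if amines_per_lipid <= 0:
        raise ValueError("amines_per_lipid must be positive")
    return target_np * nucleic_acid_nmol * length_nt / amines_per_lipid


if __name__ == "__main__":
    # Arbitrary example: 1 nmol of a 2000-nt RNA, lipid with one amine group.
    print(np_ratio(10000.0, 1, 1.0, 2000))
    print(lipid_needed_nmol(4.0, 1, 1.0, 2000))
```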


Target Nucleic Acid and Sample

A wide array of samples is compatible with the compositions and methods disclosed herein. The samples, as described herein, may be used in the methods of nicking a target nucleic acid disclosed herein. The samples, as described herein, may be used in the DETECTR assay methods disclosed herein. The samples, as described herein, are compatible with any of the programmable nucleases disclosed herein and use of said programmable nuclease in a method of detecting a target nucleic acid. The samples, as described herein, are compatible with any of the compositions comprising a programmable nuclease and a buffer. Described herein are samples that contain deoxyribonucleic acid (DNA), ribonucleic acid (RNA), or both, which can be modified or detected using a programmable nuclease of the present disclosure. As described herein, programmable nucleases are activated upon binding to a target nucleic acid of interest in a sample upon hybridization of a guide nucleic acid to the target nucleic acid. Subsequently, the activated programmable nucleases exhibit sequence-independent cleavage of a nucleic acid in a reporter. The reporter additionally includes a detectable moiety, which is released upon sequence-independent cleavage of the nucleic acid in the reporter. The detectable moiety emits a detectable signal, which can be measured by various methods (e.g., spectrophotometry, fluorescence measurements, electrochemical measurements).


Various sample types comprising a target nucleic acid of interest are consistent with the present disclosure. These samples can comprise a target nucleic acid sequence for detection. In some embodiments, the detection of the target nucleic acid indicates an ailment, such as a disease, cancer, or genetic disorder, or genetic information, such as for phenotyping, genotyping, or determining ancestry and are compatible with the reagents and support mediums as described herein. Generally, a sample from an individual or an animal or an environmental sample can be obtained to test for presence of a disease, cancer, genetic disorder, or any mutation of interest. A biological sample from the individual may be blood, serum, plasma, saliva, urine, mucosal sample, peritoneal sample, cerebrospinal fluid, gastric secretions, nasal secretions, sputum, pharyngeal exudates, urethral or vaginal secretions, an exudate, an effusion, or tissue. A tissue sample may be dissociated or liquified prior to application to the detection system of the present disclosure. A sample from an environment may be from soil, air, or water. In some instances, the environmental sample is taken as a swab from a surface of interest or taken directly from the surface of interest. In some instances, the raw sample is applied to the detection system. In some instances, the sample is diluted with a buffer or a fluid or concentrated prior to application to the detection system, or is applied neat to the detection system. Sometimes, the sample is contained in no more than 20 μl. The sample, in some cases, is contained in no more than 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, 200, 300, 400, 500 μl, or any value from 1 μl to 500 μl, preferably from 10 μl to 200 μl, or more preferably from 50 μl to 100 μl. Sometimes, the sample is contained in more than 500 μl.


In some embodiments, the target nucleic acid is single-stranded DNA. The methods, reagents, enzymes, and kits disclosed herein may enable the direct detection of a DNA encoding a sequence of interest, in particular a single-stranded DNA encoding a sequence of interest, without transcribing the DNA into RNA, for example, by using an RNA polymerase. The compositions and methods disclosed herein may enable the detection of target nucleic acid that is an amplified nucleic acid of a nucleic acid of interest. In some embodiments, the target nucleic acid is a cDNA, genomic DNA, an amplicon of genomic DNA or a DNA amplicon of an RNA. A nucleic acid can encode a sequence from a genomic locus. In some cases, the target nucleic acid that binds to the guide nucleic acid is from 5 to 100, 5 to 90, 5 to 80, 5 to 70, 5 to 60, 5 to 50, 5 to 40, 5 to 30, 5 to 25, 5 to 20, 5 to 15, or 5 to 10 nucleotides in length. The nucleic acid can be from 10 to 90, from 20 to 80, from 30 to 70, or from 40 to 60 nucleotides in length. A nucleic acid can be 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 45, 50, 60, 70, 80, 90, or 100 nucleotides in length. The target nucleic acid can encode a sequence reverse complementary to a guide nucleic acid sequence.
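To illustrate the statement that a target nucleic acid can encode a sequence reverse complementary to a guide nucleic acid sequence, the following Python sketch computes the DNA reverse complement of an RNA spacer. The function names reverse_complement_dna and is_reverse_complementary are hypothetical; the spacer is SEQ ID NO: 1434 from Table AH, and only standard A/C/G/T/U bases are handled.

```python
# Map each base to its Watson-Crick partner, written as DNA (U pairs with A).
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement_dna(seq: str) -> str:
    """Return the DNA strand that is reverse complementary to seq (RNA or DNA)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_reverse_complementary(guide_segment: str, target_segment: str) -> bool:
    """True if target_segment is the exact reverse complement of guide_segment."""
    target_as_dna = target_segment.upper().replace("U", "T")
    return reverse_complement_dna(guide_segment) == target_as_dna


if __name__ == "__main__":
    spacer = "CAGUGGGGGUGAAUUCAGUG"  # SEQ ID NO: 1434
    print(reverse_complement_dna(spacer))
```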


In some instances, the sample is taken from single-cell eukaryotic organisms; a plant or a plant cell; an algal cell; a fungal cell; an animal cell, tissue, or organ; a cell, tissue, or organ from an invertebrate animal; a cell, tissue, fluid, or organ from a vertebrate animal such as fish, amphibian, reptile, bird, and mammal; a cell, tissue, fluid, or organ from a mammal such as a human, a non-human primate, an ungulate, a feline, a bovine, an ovine, and a caprine. In some instances, the sample is taken from nematodes, protozoans, helminths, or malarial parasites. In some cases, the sample comprises nucleic acids from a cell lysate from a eukaryotic cell, a mammalian cell, a human cell, a prokaryotic cell, or a plant cell. In some cases, the sample comprises nucleic acids expressed from a cell.


The sample described herein may comprise at least one target nucleic acid. The target nucleic acid comprises a segment that is reverse complementary to a segment of a guide nucleic acid. Often, the sample comprises the segment of the target nucleic acid and at least one nucleic acid comprising at least 50% sequence identity to a segment of the target nucleic acid. Sometimes, the at least one nucleic acid comprises a segment comprising at least 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the segment of the target nucleic acid. Often, a sample comprises the segment of the target nucleic acid and at least one nucleic acid comprising a segment comprising less than 100% sequence identity to the target nucleic acid but no less than 50% sequence identity to the segment of the target nucleic acid. Sometimes, a sample comprises the segment of the target nucleic acid and at least one nucleic acid comprising a segment comprising less than 100% sequence identity to the target nucleic acid but no less than 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the segment of the target nucleic acid. For example, the segment of the target nucleic acid comprises a mutation as compared to at least one nucleic acid comprising a segment comprising less than 100% sequence identity to the segment of the target nucleic acid but no less than 50% sequence identity to the segment of the target nucleic acid. Sometimes, the segment of the target nucleic acid comprises a mutation as compared to at least one nucleic acid comprising a segment comprising less than 100% sequence identity to the segment of the target nucleic acid but no less than 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the segment of the target nucleic acid.
Often, the segment of the target nucleic acid comprises a mutation as compared to at least one nucleic acid comprising a segment comprising less than 100% sequence identity to the segment of the target nucleic acid but no less than 50% sequence identity to the segment of the target nucleic acid. The mutation can be a mutation of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides. Often, the mutation is a single nucleotide mutation.
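As a simplified illustration of the sequence-identity language above, percent identity between two segments of equal length can be computed position by position, and a single nucleotide mutation corresponds to exactly one mismatched position. The Python sketch below assumes ungapped, equal-length segments (it does not perform alignment), and the function names and example sequences are hypothetical.

```python
from typing import List


def mismatch_positions(a: str, b: str) -> List[int]:
    """0-based positions at which two equal-length segments differ."""
    if len(a) != len(b):
        raise ValueError("segments must have equal length (no alignment is done)")
    return [i for i, (x, y) in enumerate(zip(a.upper(), b.upper())) if x != y]


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions that are identical between two segments."""
    if not a:
        raise ValueError("segments must be non-empty")
    mismatches = len(mismatch_positions(a, b))
    return 100.0 * (len(a) - mismatches) / len(a)


def is_single_nucleotide_mutation(a: str, b: str) -> bool:
    """True if the two segments differ at exactly one position."""
    return len(mismatch_positions(a, b)) == 1


if __name__ == "__main__":
    segment = "ACGTACGTACGTACGTACGT"  # hypothetical 20-nt target segment
    variant = "ACGTACGTACTTACGTACGT"  # same segment with one substitution
    print(percent_identity(segment, variant))
    print(mismatch_positions(segment, variant))
```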


The single nucleotide mutation can be a single nucleotide polymorphism (SNP), which is a single base pair variation in a DNA sequence present in less than 1% of a population. Sometimes, the target nucleic acid comprises a single nucleotide mutation, wherein the single nucleotide mutation comprises the wild type variant of the SNP. The single nucleotide mutation or SNP can be associated with a phenotype of the sample or a phenotype of the organism from which the sample was taken. The SNP, in some cases, is associated with altered phenotype from wild type phenotype. Often, the segment of the target nucleic acid sequence comprises a deletion as compared to at least one nucleic acid comprising a segment comprising less than 100% sequence identity to the segment of the target nucleic acid but no less than 50% sequence identity to the segment of the target nucleic acid. The mutation can be a deletion of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides. The mutation can be a deletion of about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 55, about 60, about 65, about 70, about 75, about 80, about 85, about 90, about 95, about 100, about 200, about 300, about 400, about 500, about 600, about 700, about 800, about 900, or about 1000 nucleotides. The mutation can be a deletion of from 1 to 5, from 5 to 10, from 10 to 15, from 15 to 20, from 20 to 25, from 25 to 30, from 30 to 35, from 35 to 40, from 40 to 45, from 45 to 50, from 50 to 55, from 55 to 60, from 60 to 65, from 65 to 70, from 70 to 75, from 75 to 80, from 80 to 85, from 85 to 90, from 90 to 95, from 95 to 100, from 100 to 200, from 200 to 300, from 300 to 400, from 400 to 500, from 500 to 600, from 600 to 700, from 700 to 800, from 800 to 900, from 900 to 1000, from 1 to 50, from 1 to 100, from 25 to 50, from 25 to 100, from 50 to 100, from 100 to 500, from 100 to 1000, or from 500 to 1000 nucleotides. 
The segment of the target nucleic acid that the guide nucleic acid of the methods described herein binds to comprises the mutation, such as the SNP or the deletion. The mutation can be a single nucleotide mutation or a SNP. The SNP can be a synonymous substitution or a nonsynonymous substitution. The nonsynonymous substitution can be a missense substitution or a nonsense point mutation. The synonymous substitution can be a silent substitution. The mutation can be a deletion of one or more nucleotides. Often, the single nucleotide mutation, SNP, or deletion is associated with a disease such as cancer or a genetic disorder. The mutation, such as a single nucleotide mutation, a SNP, or a deletion, can be encoded in the sequence of a target nucleic acid from the germline of an organism or can be encoded in a target nucleic acid from a diseased cell, such as a cancer cell.


The sample used for disease testing may comprise at least one target nucleic acid that can bind to a guide nucleic acid of the reagents described herein. The sample used for disease testing may comprise at least one nucleic acid of interest that is amplified to produce a target nucleic acid that can bind to a guide nucleic acid of the reagents described herein. The nucleic acid of interest can comprise DNA, RNA, or a combination thereof.


The target nucleic acid (e.g., a target DNA) may be a portion of a nucleic acid from a virus or a bacterium or other agents responsible for a disease in the sample. The target nucleic acid may be a portion of a nucleic acid from a gene expressed in a cancer or genetic disorder in the sample. In some cases, the sequence is a segment of a target nucleic acid sequence. A segment of a target nucleic acid sequence can be from a genomic locus, a transcribed mRNA, or a reverse transcribed cDNA. A segment of a target nucleic acid sequence can be from 5 to 100, 5 to 90, 5 to 80, 5 to 70, 5 to 60, 5 to 50, 5 to 40, 5 to 30, 5 to 25, 5 to 20, 5 to 15, or 5 to 10 nucleotides in length. A segment of a target nucleic acid sequence can be 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 45, 50, 60, 70, 80, 90, or 100 nucleotides in length. The sequence of the target nucleic acid segment can be reverse complementary to a segment of a guide nucleic acid sequence. The target nucleic acid may comprise a genetic variation (e.g., a single nucleotide polymorphism), with respect to a standard sample, associated with a disease phenotype or disease predisposition. The target nucleic acid may be an amplicon of a portion of an RNA, may be a DNA, or may be a DNA amplicon from any organism in the sample.


In some embodiments, the target nucleic acid sequence comprises a nucleic acid sequence of a virus or a bacterium or other agents responsible for a disease in the sample. In some embodiments, the target nucleic acid comprises DNA that is reverse transcribed from RNA using a reverse transcriptase prior to detection by a programmable nuclease using the compositions, systems, and methods disclosed herein. The target nucleic acid, in some cases, is a portion of a nucleic acid from a sexually transmitted infection or a contagious disease, in the sample. In some cases, the target nucleic acid is a portion of a nucleic acid from a genomic locus, or any DNA amplicon, such as a reverse transcribed mRNA or a cDNA from a gene locus, a transcribed mRNA, or a reverse transcribed cDNA from a gene locus in at least one of: human immunodeficiency virus (HIV), human papillomavirus (HPV), chlamydia, gonorrhea, syphilis, trichomoniasis, sexually transmitted infection, malaria, Dengue fever, Ebola, chikungunya, and leishmaniasis. Pathogens include viruses, fungi, helminths, protozoa, malarial parasites, Plasmodium parasites, Toxoplasma parasites, and Schistosoma parasites. Helminths include roundworms, heartworms, and phytophagous nematodes, flukes, Acanthocephala, and tapeworms. Protozoan infections include infections from Giardia spp., Trichomonas spp., African trypanosomiasis, amoebic dysentery, babesiosis, balantidial dysentery, Chagas disease, coccidiosis, malaria and toxoplasmosis. Examples of pathogens such as parasitic/protozoan pathogens include, but are not limited to: Plasmodium falciparum, P. vivax, Trypanosoma cruzi and Toxoplasma gondii. Fungal pathogens include, but are not limited to Cryptococcus neoformans, Histoplasma capsulatum, Coccidioides immitis, Blastomyces dermatitidis, Chlamydia trachomatis, and Candida albicans.
Pathogenic viruses include but are not limited to coronavirus; immunodeficiency virus (e.g., HIV); influenza virus; dengue; West Nile virus; herpes virus; yellow fever virus; Hepatitis Virus C; Hepatitis Virus A; Hepatitis Virus B; papillomavirus; and the like. Pathogens include, e.g., HIV virus, Mycobacterium tuberculosis, Streptococcus agalactiae, methicillin-resistant Staphylococcus aureus, Legionella pneumophila, Streptococcus pyogenes, Escherichia coli, Neisseria gonorrhoeae, Neisseria meningitidis, Pneumococcus, Cryptococcus neoformans, Histoplasma capsulatum, Hemophilus influenzae B, Treponema pallidum, Lyme disease spirochetes, Pseudomonas aeruginosa, Mycobacterium leprae, Brucella abortus, rabies virus, influenza virus, cytomegalovirus, herpes simplex virus I, herpes simplex virus II, human serum parvo-like virus, respiratory syncytial virus (RSV), M. genitalium, T. vaginalis, varicella-zoster virus, hepatitis B virus, hepatitis C virus, measles virus, adenovirus, human T-cell leukemia viruses, Epstein-Barr virus, murine leukemia virus, mumps virus, vesicular stomatitis virus, Sindbis virus, lymphocytic choriomeningitis virus, wart virus, blue tongue virus, Sendai virus, feline leukemia virus, Reovirus, polio virus, simian virus 40, mouse mammary tumor virus, dengue virus, rubella virus, West Nile virus, Plasmodium falciparum, Plasmodium vivax, Toxoplasma gondii, Trypanosoma rangeli, Trypanosoma cruzi, Trypanosoma rhodesiense, Trypanosoma brucei, Schistosoma mansoni, Schistosoma japonicum, Babesia bovis, Eimeria tenella, Onchocerca volvulus, Leishmania tropica, Mycobacterium tuberculosis, Trichinella spiralis, Theileria parva, Taenia hydatigena, Taenia ovis, Taenia saginata, Echinococcus granulosus, Mesocestoides corti, Mycoplasma arthritidis, M. hyorhinis, M. orale, M. arginini, Acholeplasma laidlawii, M. salivarium and M. pneumoniae. 
In some cases, the target sequence is a portion of a nucleic acid from a genomic locus, a transcribed mRNA, or a reverse transcribed cDNA from a gene locus of a bacterium or other agent responsible for a disease in the sample, the gene locus comprising a mutation that confers resistance to a treatment, such as a single nucleotide mutation that confers resistance to antibiotic treatment. In some cases, the mutation that confers resistance to a treatment is a deletion.


Compositions and methods of the disclosure can be used for cell line engineering (e.g., engineering a cell from a cell line for bioproduction). For example, compositions and methods of the disclosure can be used to express a desired protein from a cell line. In some embodiments, the target nucleic acid sequence comprises a nucleic acid sequence of a cell line. In some embodiments, the target nucleic acid sequence comprises a genomic nucleic acid sequence of a cell line. In some embodiments, the cell line is a Chinese hamster ovary (CHO) cell line, a human embryonic kidney (HEK) cell line, a cell line derived from cancer cells, a cell line derived from lymphocytes, or the like. Non-limiting examples of cell lines include: C8161, CCRF-CEM, MOLT, mIMCD-3, NHDF, HeLa-S3, Huh1, Huh4, Huh7, HUVEC, HASMC, HEKn, HEKa, MiaPaCell, Panc1, PC-3, TF1, CTLL-2, CIR, Rat6, CV1, RPTE, A10, T24, J82, A375, ARH-77, Calu1, SW480, SW620, SKOV3, SK-UT, CaCo2, P388D1, SEM-K2, WEHI-231, HB56, TIB55, Jurkat, J45.01, LRMB, Bc1-1, BC-3, IC21, DLD2, Raw264.7, NRK, NRK-52E, MRCS, MEF, Hep G2, HeLa B, HeLa T4, COS, COS-1, COS-6, COS-M6A, BS-C-1 monkey kidney epithelial, BALB/3T3 mouse embryo fibroblast, 3T3 Swiss, 3T3-L1, 132-d5 human fetal fibroblasts, 10.1 mouse fibroblasts, 293-T, 3T3, 721, 9L, A2780, A2780ADR, A2780cis, A172, A20, A253, A431, A-549, ALC, AsPC-1, B16, B35, BCP-1 cells, BEAS-2B, bEnd.3, BHK-21, BR 293, BxPC3, C3H-10T1/2, C6/36, Cal-27, Capan-1, CHO, CHO-7, CHO-IR, CHO-K1, CHO-K2, CHO-S, CHO-T, CHO Dhfr−/−, COR-L23, COR-L23/CPR, COR-L23/5010, COR-L23/R23, COS-7, COV-434, CML T1, CMT, CT26, D17, DH82, DU145, DuCaP, EL4, EM2, EM3, EMT6/AR1, EMT6/AR10.0, FM3, H1299, H69, HAP1, HB54, HB55, HCA2, HEK-293, HeLa, Hepa1-6, Hep3B, Hepa1 cic7, HL-60, HMEC, HT-29, Jurkat, JY cells, K562 cells, Ku812, KCL22, KG1, KYO1, LNCap, Ma-Mel 1-48, MC-38, MCF-7, MCF-10A, MDA-MB-231, MDA-MB-468, MDA-MB-435, MDCK II, MOR/0.2R, MONO-MAC 6, MTD-1A, MyEnd, NCI-H69/CPR, NCI-H69/LX10, NCI-H69/LX20, NCI-H69/LX4, NIH-3T3, NALM-1, Neuro2A, NK92, NW-145, OPCN/OPCT cell lines, Peer, PNT-1A/PNT 2, RenCa, RIN-5F, RMA/RMAS, Saos-2 cells, Sf-9, SkBr3, T2, T-47D, T84, THP1 cell line, U373, U87, U937, VCaP, Vero cells, WM39, WT-49, X63, YAC-1, and YAR. Non-limiting examples of other cells that can be used with the disclosure include immune cells, such as CART, T-cells, B-cells, NK cells (including iNK cells), granulocytes, basophils, eosinophils, neutrophils, mast cells, monocytes, macrophages, dendritic cells, antigen-presenting cells (APC), or adaptive cells. Non-limiting examples of cells that can be used with this disclosure also include plant cells, such as parenchyma, sclerenchyma, collenchyma, xylem, phloem, and germline (e.g., pollen) cells. Cells may be from lycophytes, ferns, gymnosperms, angiosperms, bryophytes, charophytes, chlorophytes, rhodophytes, or glaucophytes. Cells may be obtained from non-human animals, including, but not limited to, rats, dogs, rabbits, cats, and monkeys. Non-limiting examples of cells that can be used with this disclosure also include stem cells, such as human stem cells, animal stem cells, stem cells that are not derived from human embryonic stem cells, embryonic stem cells, mesenchymal stem cells, pluripotent stem cells, induced pluripotent stem cells (iPS), somatic stem cells, adult stem cells, hematopoietic stem cells, and tissue-specific stem cells. Non-limiting examples of cells that can be used with this disclosure also include neuronal cells from various organs of an animal, e.g., brain, heart, lung, liver, pancreas, and muscle. In preferred embodiments, the cells that can be used with the disclosure are T cells, such as CAR-T (CART) cells.


CHO cells are an epithelial cell line that is particularly useful in biological and medical research. In particular, CHO cells are frequently used for the industrial production of recombinant therapeutics. In some embodiments, a CasΦ polypeptide disclosed herein is expressed in a CHO cell. In some embodiments, a CasΦ polypeptide disclosed herein complexed with a guide nucleic acid is expressed in a CHO cell. In some embodiments, a method disclosed herein comprises modifying or editing a CHO cell. In some embodiments, a modified CHO cell is provided wherein the CHO cell is modified by a CasΦ polypeptide disclosed herein. In some embodiments, a CHO cell is provided wherein the CHO cell comprises a CasΦ polypeptide disclosed herein.


T cells are important therapeutic targets. In some embodiments, a CasΦ polypeptide disclosed herein is expressed in a T cell. In some embodiments, a CasΦ polypeptide disclosed herein complexed with a guide nucleic acid is expressed in a T cell. In some embodiments, a method disclosed herein comprises modifying or editing a T cell. In some embodiments, a method disclosed herein comprises modifying a PDCD1 gene of a T cell. In some embodiments, a method disclosed herein comprises modifying a TRAC gene of a T cell. In some embodiments, a method disclosed herein comprises modifying a B2M gene of a T cell. In some embodiments, a method disclosed herein comprises modifying a PDCD1 gene of a T cell, a TRAC gene of a T cell, a B2M gene of a T cell, or a combination thereof. In some embodiments, a method disclosed herein comprises modifying a PDCD1 gene, a TRAC gene, and a B2M gene of a T cell. In some embodiments, a modified T cell is provided wherein the T cell is modified by a CasΦ polypeptide disclosed herein. In some embodiments, a T cell is provided wherein the T cell comprises a CasΦ polypeptide disclosed herein.


T cells, also known as T lymphocytes, are easily identifiable by the surface expression of the T-cell receptor (TCR). In some embodiments, the T cells include one or more subsets of T cells, such as CD4+ cells, CD8+ cells, and sub-populations thereof. In some embodiments, a T cell is a CD4+ cell. In some embodiments, a T cell is a CD8+ cell. In some embodiments, a population of T cells comprises CD4+ T cells and CD8+ T cells. In some embodiments, T cells comprise TCR-T, Tscm, or iT cells.


Sub-populations of CD4+ and CD8+ T cells include naive T cells, effector T cells, memory T cells, immature T cells, mature T cells, helper T cells, cytotoxic T cells, regulatory T cells, alpha/beta T cells, and gamma/delta T cells. Sub-types of memory T cells include stem cell memory T cells, central memory T cells, effector memory T cells, and terminally differentiated effector memory T cells. Sub-types of helper T cells include T helper 1 cells, T helper 2 cells, T helper 3 cells, T helper 17 cells, T helper 9 cells, T helper 22 cells, and follicular helper T cells. In some embodiments, the cell is a regulatory T cell (Treg).


CART cells are T cells that have been genetically engineered to express unique chimeric antigen receptors (CARs) targeting specific antigens. CART cells are important targets for immunotherapy. In some embodiments, a CasΦ polypeptide disclosed herein is expressed in a CART cell. In some embodiments, a CasΦ polypeptide disclosed herein complexed with a guide nucleic acid is expressed in a CART cell. In some embodiments, a method disclosed herein comprises modifying or editing a CART cell. In some embodiments, a modified CART cell is provided wherein the CART cell is modified by a CasΦ polypeptide disclosed herein. In some embodiments, a CART cell is provided wherein the CART cell comprises a CasΦ polypeptide disclosed herein.


Modified stem cells and methods of modifying stem cells are also provided. In some embodiments, a CasΦ polypeptide disclosed herein is expressed in a stem cell. In some embodiments, a CasΦ polypeptide disclosed herein complexed with a guide nucleic acid is expressed in a stem cell. In some embodiments, a method disclosed herein comprises modifying or editing a stem cell. In some embodiments, a modified stem cell is provided wherein the stem cell is modified by a CasΦ polypeptide disclosed herein. In some embodiments, a stem cell is provided wherein the stem cell comprises a CasΦ polypeptide disclosed herein. In some embodiments, a modified stem cell is obtained or is obtainable by a method disclosed herein.


Induced pluripotent stem cells (iPSCs) are pluripotent stem cells that are generated from somatic cells. They can propagate indefinitely and give rise to any cell type in the body. These features make iPSCs a powerful tool for researching human disease and provide a promising prospect for cell therapies for a range of medical conditions. iPSCs can be generated in a patient-specific manner and used in autologous transplant, thereby overcoming complications of rejection by the host immune system (Moradi et al. (2019), Stem Cell Research & Therapy).


In some embodiments, a CasΦ polypeptide disclosed herein is expressed in an induced pluripotent stem cell. In some embodiments, a CasΦ polypeptide disclosed herein complexed with a guide nucleic acid is expressed in an induced pluripotent stem cell. In some embodiments, a method disclosed herein comprises modifying or editing an induced pluripotent stem cell. In some embodiments, a modified induced pluripotent stem cell is provided wherein an induced pluripotent stem cell is modified by a CasΦ polypeptide disclosed herein. In some embodiments, an induced pluripotent stem cell is provided wherein the induced pluripotent stem cell comprises a CasΦ polypeptide disclosed herein. In some embodiments, a modified induced pluripotent stem cell is obtained or is obtainable by a method disclosed herein.


Hematopoietic stem cells (HSCs) are identifiable by the marker CD34. HSCs are stem cells that differentiate to give rise to blood cells, such as T and B lymphocytes, erythrocytes, monocytes, and macrophages. HSCs are important cells for future stem cell therapies as they have the potential to be used to treat genetic blood cell diseases (Morgan et al. (2017), Cell Stem Cell).


In some embodiments, a CasΦ polypeptide disclosed herein is expressed in a hematopoietic stem cell. In some embodiments, a CasΦ polypeptide disclosed herein complexed with a guide nucleic acid is expressed in a hematopoietic stem cell. In some embodiments, a method disclosed herein comprises modifying or editing a hematopoietic stem cell. In some embodiments, a modified hematopoietic stem cell is provided wherein a hematopoietic stem cell is modified by a CasΦ polypeptide disclosed herein. In some embodiments, a hematopoietic stem cell is provided wherein the hematopoietic stem cell comprises a CasΦ polypeptide disclosed herein. In some embodiments, a modified hematopoietic stem cell is obtained or is obtainable by a method disclosed herein.


Compositions and methods of the disclosure can be used for agricultural engineering. For example, compositions and methods of the disclosure can be used to confer desired traits on a plant. A plant can be engineered for the desired physiological and agronomic characteristic using the present disclosure. In some embodiments, the target nucleic acid sequence comprises a nucleic acid sequence of a plant. In some embodiments, the target nucleic acid sequence comprises a genomic nucleic acid sequence of a plant cell. In some embodiments, the target nucleic acid sequence comprises a nucleic acid sequence of an organelle of a plant cell. In some embodiments, the target nucleic acid sequence comprises a nucleic acid sequence of a chloroplast of a plant cell.


The plant can be a monocotyledonous plant. The plant can be a dicotyledonous plant. Non-limiting examples of orders of dicotyledonous plants include Magnoliales, Illiciales, Laurales, Piperales, Aristolochiales, Nymphaeales, Ranunculales, Papaverales, Sarraceniaceae, Trochodendrales, Hamamelidales, Eucommiales, Leitneriales, Myricales, Fagales, Casuarinales, Caryophyllales, Batales, Polygonales, Plumbaginales, Dilleniales, Theales, Malvales, Urticales, Lecythidales, Violales, Salicales, Capparales, Ericales, Diapensiales, Ebenales, Primulales, Rosales, Fabales, Podostemales, Haloragales, Myrtales, Cornales, Proteales, Santalales, Rafflesiales, Celastrales, Euphorbiales, Rhamnales, Sapindales, Juglandales, Geraniales, Polygalales, Umbellales, Gentianales, Polemoniales, Lamiales, Plantaginales, Scrophulariales, Campanulales, Rubiales, Dipsacales, and Asterales.


Non-limiting examples of orders of monocotyledonous plants include Alismatales, Hydrocharitales, Najadales, Triuridales, Commelinales, Eriocaulales, Restionales, Poales, Juncales, Cyperales, Typhales, Bromeliales, Zingiberales, Arecales, Cyclanthales, Pandanales, Arales, Liliales, and Orchidales. A plant can be a gymnosperm (Gymnospermae), for example a plant of the order Pinales, Ginkgoales, Cycadales, Araucariales, Cupressales, or Gnetales.


Non-limiting examples of plants include plant crops, fruits, vegetables, grains, soy bean, corn, maize, wheat, seeds, tomatoes, rice, cassava, sugarcane, pumpkin, hay, potatoes, cotton, cannabis, tobacco, flowering plants, conifers, gymnosperms, ferns, clubmosses, hornworts, liverworts, mosses, wheat, maize, rice, millet, barley, tomato, apple, pear, strawberry, orange, acacia, carrot, potato, sugar beets, yam, lettuce, spinach, sunflower, rape seed, Arabidopsis, alfalfa, amaranth, apple, apricot, artichoke, ash tree, asparagus, avocado, banana, barley, beans, beet, birch, beech, blackberry, blueberry, broccoli, Brussels sprouts, cabbage, canola, cantaloupe, carrot, cassava, cauliflower, cedar, a cereal, celery, chestnut, cherry, Chinese cabbage, citrus, clementine, clover, coffee, corn, cotton, cowpea, cucumber, cypress, eggplant, elm, endive, eucalyptus, fennel, figs, fir, geranium, grape, grapefruit, groundnuts, ground cherry, gum hemlock, hickory, kale, kiwifruit, kohlrabi, larch, lettuce, leek, lemon, lime, locust, pine, maidenhair, maize, mango, maple, melon, millet, mushroom, mustard, nuts, oak, oats, oil palm, okra, onion, orange, an ornamental plant or flower or tree, papaya, palm, parsley, parsnip, pea, peach, peanut, pear, peat, pepper, persimmon, pigeon pea, pine, pineapple, plantain, plum, pomegranate, potato, pumpkin, radicchio, radish, rapeseed, raspberry, rice, rye, sorghum, safflower, sallow, soybean, spinach, spruce, squash, strawberry, sugar beet, sugarcane, sunflower, sweet potato, sweet corn, tangerine, tea, tobacco, tomato, trees, triticale, turf grasses, turnips, vine, walnut, watercress, watermelon, wheat, yams, yew, and zucchini. A plant can include algae.


In some embodiments, the target nucleic acid sequence comprises a nucleic acid sequence of a virus, a bacterium, or other pathogen responsible for a disease in a plant (e.g., a crop). Methods and compositions of the disclosure can be used to treat or detect a disease in a plant. For example, the methods of the disclosure can be used to target a viral nucleic acid sequence in a plant. A programmable nuclease of the disclosure (e.g., CasΦ) can cleave the viral nucleic acid. In some embodiments, the target nucleic acid sequence comprises a nucleic acid sequence of a virus or a bacterium or other agents (e.g., any pathogen) responsible for a disease in the plant (e.g., a crop). In some embodiments, the target nucleic acid comprises DNA that is reverse transcribed from RNA using a reverse transcriptase prior to detection by a programmable nuclease using the compositions, systems, and methods disclosed herein. The target nucleic acid, in some cases, is a portion of a nucleic acid from a virus or a bacterium or other agents responsible for a disease in the plant (e.g., a crop). In some cases, the target nucleic acid is a portion of a nucleic acid from a genomic locus, or any DNA amplicon, such as a reverse transcribed mRNA or a cDNA from a gene locus, a transcribed mRNA, or a reverse transcribed cDNA from a gene locus in a virus or a bacterium or other agents (e.g., any pathogen) responsible for a disease in the plant (e.g., a crop). A virus infecting the plant can be an RNA virus. A virus infecting the plant can be a DNA virus. Non-limiting examples of viruses that can be targeted with the disclosure include Tobacco mosaic virus (TMV), Tomato spotted wilt virus (TSWV), Cucumber mosaic virus (CMV), Potato virus Y (PVY), Cauliflower mosaic virus (CaMV) (RT virus), Plum pox virus (PPV), Brome mosaic virus (BMV), and Potato virus X (PVX).


The sample used for cancer testing may comprise at least one target nucleic acid that can bind to a guide nucleic acid of the reagents described herein. The target nucleic acid, in some cases, comprises a portion of a gene comprising a mutation associated with cancer, a gene whose overexpression is associated with cancer, a tumor suppressor gene, an oncogene, a checkpoint inhibitor gene, a gene associated with cellular growth, a gene associated with cellular metabolism, or a gene associated with cell cycle. Sometimes, the target nucleic acid encodes a cancer biomarker, such as a prostate cancer biomarker or a non-small cell lung cancer biomarker. In some cases, the assay can be used to detect “hotspots” in target nucleic acids that can be predictive of lung cancer. In some cases, the target nucleic acid comprises a portion of a nucleic acid that is associated with a blood fever. In some cases, the target nucleic acid is a portion of a nucleic acid from a genomic locus, a DNA amplicon, a reverse transcribed mRNA, or a cDNA from a locus of at least one of: ALK, APC, ATM, AXIN2, BAP1, BARD1, BLM, BMPR1A, BRCA1, BRCA2, BRIP1, CASR, CDC73, CDH1, CDK4, CDKN1B, CDKN1C, CDKN2A, CEBPA, CHEK2, CTNNA1, DICER1, DIS3L2, EGFR, EPCAM, FH, FLCN, GATA2, GPC3, GREM1, HOXB13, HRAS, KIT, MAX, MEN1, MET, MITF, MLH1, MSH2, MSH3, MSH6, MUTYH, NBN, NF1, NF2, NTHL1, PALB2, PDGFRA, PHOX2B, PMS2, POLD1, POLE, POT1, PRKAR1A, PTCH1, PTEN, RAD50, RAD51C, RAD51D, RB1, RECQL4, RET, RUNX1, SDHA, SDHAF2, SDHB, SDHC, SDHD, SMAD4, SMARCA4, SMARCB1, SMARCE1, STK11, SUFU, TERC, TERT, TMEM127, TP53, TSC1, TSC2, VHL, WRN, and WT1. Any region of the aforementioned gene loci can be probed for a mutation or deletion using the compositions and methods disclosed herein. For example, in the EGFR gene locus, the compositions and methods for detection disclosed herein can be used to detect a single nucleotide polymorphism (SNP) or a deletion. The SNP or deletion can occur in a non-coding region or a coding region. The SNP or deletion can occur in an exon, such as exon 19. A SNP, deletion, or other mutation may mediate gene knockout.


The sample used for genetic disorder testing may comprise at least one target nucleic acid that can bind to a guide nucleic acid of the reagents described herein. In some embodiments, the genetic disorder is hemophilia, sickle cell anemia, β-thalassemia, Duchenne muscular dystrophy, severe combined immunodeficiency, Huntington's disease, or cystic fibrosis. The target nucleic acid, in some cases, is from a gene with a mutation associated with a genetic disorder, from a gene whose overexpression is associated with a genetic disorder, from a gene associated with abnormal cellular growth resulting in a genetic disorder, or from a gene associated with abnormal cellular metabolism resulting in a genetic disorder. In some cases, the target nucleic acid is a nucleic acid from a genomic locus, a transcribed mRNA, a reverse transcribed mRNA, a DNA amplicon, or a cDNA from a locus of at least one of: CFTR, FMR1, SMN1, ABCB11, ABCC8, ABCD1, ACAD9, ACADM, ACADVL, ACAT1, ACOX1, ACSF3, ADA, ADAMTS2, ADGRG1, AGA, AGL, AGPS, AGXT, AIRE, ALDH3A2, ALDOB, ALG6, ALMS1, ALPL, AMT, AQP2, ARG1, ARSA, ARSB, ASL, ASNS, ASPA, ASS1, ATM, ATP6V1B1, ATP7A, ATP7B, ATRX, BBS1, BBS10, BBS12, BBS2, BCKDHA, BCKDHB, BCS1L, BLM, BSND, CAPN3, CBS, CDH23, CEP290, CERKL, CHM, CHRNE, CIITA, CLN3, CLN5, CLN6, CLN8, CLRN1, CNGB3, COL27A1, COL4A3, COL4A4, COL4A5, COL7A1, CPS1, CPT1A, CPT2, CRB1, CTNS, CTSK, CYBA, CYBB, CYP11B1, CYP11B2, CYP17A1, CYP19A1, CYP27A1, DBT, DCLRE1C, DHCR7, DHDDS, DLD, DMD, DNAH5, DNAI1, DNAI2, DYSF, EDA, EIF2B5, EMD, ERCC6, ERCC8, ESCO2, ETFA, ETFDH, ETHE1, EVC, EVC2, EYS, F9, FAH, FAM161A, FANCA, FANCC, FANCG, FH, FKRP, FKTN, G6PC, GAA, GALC, GALK1, GALT, GAMT, GBA, GBE1, GCDH, GFM1, GJB1, GJB2, GLA, GLB1, GLDC, GLE1, GNE, GNPTAB, GNPTG, GNS, GRHPR, HADHA, HAX1, HBA1, HBA2, HBB, HEXA, HEXB, HGSNAT, HLCS, HMGCL, HOGA1, HPS1, HPS3, HSD17B4, HSD3B2, HYAL1, HYLS1, IDS, IDUA, IKBKAP, IL2RG, IVD, KCNJ11, LAMA2, LAMA3, LAMB3, LAMC2, LCA5, LDLR, LDLRAP1, LHX3, LIFR, LIPA, LOXHD1, LPL, LRPPRC, MAN2B1, MCOLN1, MED17, MESP2, MFSD8, MKS1, MLC1, MMAA, MMAB, MMACHC, MMADHC, MPI, MPL, MPV17, MTHFR, MTM1, MTRR, MTTP, MUT, MYO7A, NAGLU, NAGS, NBN, NDRG1, NDUFAF5, NDUFS6, NEB, NPC1, NPC2, NPHS1, NPHS2, NR2E3, NTRK1, OAT, OPA3, OTC, PAH, PC, PCCA, PCCB, PCDH15, PDHA1, PDHB, PEX1, PEX10, PEX12, PEX2, PEX6, PEX7, PFKM, PHGDH, PKHD1, PMM2, POMGNT1, PPT1, PROP1, PRPS1, PSAP, PTS, PUS1, PYGM, RAB23, RAG2, RAPSN, RARS2, RDH12, RMRP, RPE65, RPGRIP1L, RS1, RTEL1, SACS, SAMHD1, SEPSECS, SGCA, SGCB, SGCG, SGSH, SLC12A3, SLC12A6, SLC17A5, SLC22A5, SLC25A13, SLC25A15, SLC26A2, SLC26A4, SLC35A3, SLC37A4, SLC39A4, SLC4A11, SLC6A8, SLC7A7, SMARCAL1, SMPD1, STAR, SUMF1, TAT, TCIRG1, TECPR2, TFR2, TGM1, TH, TMEM216, TPP1, TRMU, TSFM, TTPA, TYMP, USH1C, USH2A, VPS13A, VPS13B, VPS45, VRK1, VSX2, WNT10A, XPA, XPC, and ZFYVE26.


The sample used for phenotyping testing may comprise at least one target nucleic acid that can bind to a guide nucleic acid of the reagents described herein. The target nucleic acid, in some cases, is a nucleic acid encoding a sequence associated with a phenotypic trait.


The sample used for genotyping testing may comprise at least one target nucleic acid that can bind to a guide nucleic acid of the reagents described herein. The target nucleic acid, in some cases, is a nucleic acid encoding a sequence associated with a genotype of interest.


The sample used for ancestral testing may comprise at least one target nucleic acid that can bind to a guide nucleic acid of the reagents described herein. The target nucleic acid, in some cases, is a nucleic acid encoding a sequence associated with a geographic region of origin or ethnic group.


The sample can be used for identifying a disease status. For example, a sample is any sample described herein, and is obtained from a subject for use in identifying a disease status of a subject. The disease can be a cancer or genetic disorder. Sometimes, a method comprises obtaining a serum sample from a subject; and identifying a disease status of the subject. Often, the disease status is prostate disease status, but the status of any disease can be assessed.


In some instances, the target nucleic acid is a single stranded nucleic acid. Alternatively, or in combination, the target nucleic acid is a double stranded nucleic acid and is prepared into single stranded nucleic acids before or upon contacting the reagents. The target nucleic acid may be a reverse transcribed RNA, DNA, DNA amplicon, synthetic nucleic acids, or nucleic acids found in biological or environmental samples. The target nucleic acids include but are not limited to mRNA, rRNA, tRNA, non-coding RNA, long non-coding RNA, and microRNA (miRNA). In some cases, the target nucleic acid is single-stranded DNA (ssDNA) or mRNA. In some cases, the target nucleic acid is from a virus, a parasite, or a bacterium described herein. In some cases, the target nucleic acid is transcribed from a gene as described herein and then reverse transcribed into a DNA amplicon. In some cases, miRNA is extracted using a mirVANA kit. In some cases, RNA may be treated with shrimp alkaline phosphatase to remove phosphates from the 5′ and 3′ ends of an RNA for analysis. RNA analysis may further comprise the use of a thermocycler, SR Adaptors for Illumina, ligation enzymes, reverse transcriptase, and suitable primers for polymerase chain reaction.


A number of target nucleic acids are consistent with the methods and compositions disclosed herein. Some methods described herein can detect a target nucleic acid present in the sample in various concentrations or amounts as a target nucleic acid population. In some cases, the sample has at least 2 target nucleic acids. In some cases, the sample has at least 3, 5, 10, 20, 30, 40, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, or 10000 target nucleic acids. In some cases, the sample has from 1 to 10,000, from 100 to 8000, from 400 to 6000, from 500 to 5000, from 1000 to 4000, or from 2000 to 3000 target nucleic acids. In some cases, the method detects target nucleic acid present at least at one copy per 10 non-target nucleic acids, 10² non-target nucleic acids, 10³ non-target nucleic acids, 10⁴ non-target nucleic acids, 10⁵ non-target nucleic acids, 10⁶ non-target nucleic acids, 10⁷ non-target nucleic acids, 10⁸ non-target nucleic acids, 10⁹ non-target nucleic acids, or 10¹⁰ non-target nucleic acids. Often, the target nucleic acid can be from 0.05% to 20% of total nucleic acids in the sample. Sometimes, the target nucleic acid is from 0.1% to 10% of the total nucleic acids in the sample. The target nucleic acid, in some cases, is from 0.1% to 5% of the total nucleic acids in the sample. The target nucleic acid can also be from 0.1% to 1% of the total nucleic acids in the sample. The target nucleic acid can be DNA or RNA. The target nucleic acid can be any amount less than 100% of the total nucleic acids in the sample. The target nucleic acid can be 100% of the total nucleic acids in the sample.
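The relationship between a "one copy per N non-target nucleic acids" ratio and the percentage of total nucleic acids can be sketched as follows. This is a simple illustration that assumes every nucleic acid molecule is counted equally regardless of length; the function name is hypothetical.

```python
def target_fraction(target_copies: float, non_target_copies: float) -> float:
    """Fraction of all nucleic acid molecules in the sample that are target."""
    total = target_copies + non_target_copies
    if total <= 0:
        raise ValueError("the sample must contain at least one nucleic acid")
    return target_copies / total


# One target copy per 10**n non-target nucleic acids, for a few exponents.
for exponent in (1, 2, 6, 10):
    fraction = target_fraction(1, 10 ** exponent)
    print(f"1 copy per 10^{exponent} non-target: {fraction:.3e} of total")

# Check whether a hypothetical sample falls in the 0.05% to 20% range.
fraction = target_fraction(target_copies=50, non_target_copies=9950)
print(0.0005 <= fraction <= 0.20)
```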


In some embodiments, the sample comprises a target nucleic acid at a concentration of less than 1 nM, less than 2 nM, less than 3 nM, less than 4 nM, less than 5 nM, less than 6 nM, less than 7 nM, less than 8 nM, less than 9 nM, less than 10 nM, less than 20 nM, less than 30 nM, less than 40 nM, less than 50 nM, less than 60 nM, less than 70 nM, less than 80 nM, less than 90 nM, less than 100 nM, less than 200 nM, less than 300 nM, less than 400 nM, less than 500 nM, less than 600 nM, less than 700 nM, less than 800 nM, less than 900 nM, less than 1 μM, less than 2 μM, less than 3 μM, less than 4 μM, less than 5 μM, less than 6 μM, less than 7 μM, less than 8 μM, less than 9 μM, less than 10 μM, less than 100 μM, or less than 1 mM. In some embodiments, the sample comprises a target nucleic acid sequence at a concentration of from 1 nM to 2 nM, from 2 nM to 3 nM, from 3 nM to 4 nM, from 4 nM to 5 nM, from 5 nM to 6 nM, from 6 nM to 7 nM, from 7 nM to 8 nM, from 8 nM to 9 nM, from 9 nM to 10 nM, from 10 nM to 20 nM, from 20 nM to 30 nM, from 30 nM to 40 nM, from 40 nM to 50 nM, from 50 nM to 60 nM, from 60 nM to 70 nM, from 70 nM to 80 nM, from 80 nM to 90 nM, from 90 nM to 100 nM, from 100 nM to 200 nM, from 200 nM to 300 nM, from 300 nM to 400 nM, from 400 nM to 500 nM, from 500 nM to 600 nM, from 600 nM to 700 nM, from 700 nM to 800 nM, from 800 nM to 900 nM, from 900 nM to 1 μM, from 1 μM to 2 μM, from 2 μM to 3 μM, from 3 μM to 4 μM, from 4 μM to 5 μM, from 5 μM to 6 μM, from 6 μM to 7 μM, from 7 μM to 8 μM, from 8 μM to 9 μM, from 9 μM to 10 μM, from 10 μM to 100 μM, from 100 μM to 1 mM, from 1 nM to 10 nM, from 1 nM to 100 nM, from 1 nM to 1 μM, from 1 nM to 10 μM, from 1 nM to 100 μM, from 1 nM to 1 mM, from 10 nM to 100 nM, from 10 nM to 1 μM, from 10 nM to 10 μM, from 10 nM to 100 μM, from 10 nM to 1 mM, from 100 nM to 1 μM, from 100 nM to 10 μM, from 100 nM to 100 μM, from 100 nM to 1 mM, from 1 μM to 10 μM, from 1 μM to 100 μM, from 1 μM to 1 mM, from 10 μM to 100 μM, from 10 μM to 1 mM, or from 100 μM to 1 mM. In some embodiments, the sample comprises a target nucleic acid at a concentration of from 20 nM to 200 μM, from 50 nM to 100 μM, from 200 nM to 50 μM, from 500 nM to 20 μM, or from 2 μM to 10 μM. In some embodiments, the target nucleic acid is not present in the sample.


In some embodiments, the sample comprises fewer than 10 copies, fewer than 100 copies, fewer than 1000 copies, fewer than 10,000 copies, fewer than 100,000 copies, or fewer than 1,000,000 copies of a target nucleic acid sequence. In some embodiments, the sample comprises from 10 copies to 100 copies, from 100 copies to 1000 copies, from 1000 copies to 10,000 copies, from 10,000 copies to 100,000 copies, from 100,000 copies to 1,000,000 copies, from 10 copies to 1000 copies, from 10 copies to 10,000 copies, from 10 copies to 100,000 copies, from 10 copies to 1,000,000 copies, from 100 copies to 10,000 copies, from 100 copies to 100,000 copies, from 100 copies to 1,000,000 copies, from 1,000 copies to 100,000 copies, or from 1,000 copies to 1,000,000 copies of a target nucleic acid sequence. In some embodiments, the sample comprises from 10 copies to 500,000 copies, from 200 copies to 200,000 copies, from 500 copies to 100,000 copies, from 1000 copies to 50,000 copies, from 2000 copies to 20,000 copies, from 3000 copies to 10,000 copies, or from 4000 copies to 8000 copies. In some embodiments, the target nucleic acid is not present in the sample.
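Molar concentrations and copy numbers are related through the Avogadro constant once a sample volume is fixed. The sketch below shows the conversion in both directions. The function names and the example volume are assumptions made for illustration, since the passage above does not specify a sample volume.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def copies_from_concentration(conc_nM: float, volume_uL: float) -> float:
    """Number of molecules in volume_uL microliters at conc_nM nanomolar."""
    moles = (conc_nM * 1e-9) * (volume_uL * 1e-6)  # mol/L times L
    return moles * AVOGADRO


def concentration_nM_from_copies(copies: float, volume_uL: float) -> float:
    """Nanomolar concentration of a given copy number in volume_uL microliters."""
    if volume_uL <= 0:
        raise ValueError("volume must be positive")
    moles = copies / AVOGADRO
    return moles / (volume_uL * 1e-6) / 1e-9


# Hypothetical example: 10,000 copies in an assumed 20 microliter reaction.
example_nM = concentration_nM_from_copies(10_000, 20.0)
print(f"{example_nM:.3e} nM")
print(f"{copies_from_concentration(1.0, 1.0):.3e} copies in 1 uL at 1 nM")
```

Under these assumptions, the copy-number ranges correspond to concentrations far below 1 nM at typical microliter-scale volumes, so the two sets of ranges describe different regimes of target abundance.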


A number of target nucleic acid populations are consistent with the methods and compositions disclosed herein. Some methods described herein can detect two or more target nucleic acid populations present in the sample in various concentrations or amounts. In some cases, the sample has at least 2 target nucleic acid populations. In some cases, the sample has at least 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, or 50 target nucleic acid populations. In some cases, the sample has from 3 to 50, from 5 to 40, or from 10 to 25 target nucleic acid populations. In some cases, the method detects target nucleic acid populations that are present at least at one copy per 10¹ non-target nucleic acids, 10² non-target nucleic acids, 10³ non-target nucleic acids, 10⁴ non-target nucleic acids, 10⁵ non-target nucleic acids, 10⁶ non-target nucleic acids, 10⁷ non-target nucleic acids, 10⁸ non-target nucleic acids, 10⁹ non-target nucleic acids, or 10¹⁰ non-target nucleic acids. The target nucleic acid populations can be present at different concentrations or amounts in the sample.


In some embodiments, the target nucleic acid as disclosed herein can activate the programmable nuclease to initiate sequence-independent cleavage of a nucleic acid-based reporter (e.g., a reporter comprising a DNA sequence, a reporter comprising an RNA sequence, or a reporter comprising DNA and RNA). For example, a programmable nuclease of the present disclosure is activated by a target DNA to cleave reporters having an RNA (also referred to herein as an “RNA reporter”). Alternatively, a programmable nuclease of the present disclosure is activated by a target RNA to cleave reporters having an RNA. Alternatively, a programmable nuclease of the present disclosure is activated by a target DNA to cleave reporters having a DNA (also referred to herein as a “DNA reporter”). The RNA reporter can comprise a single-stranded RNA labelled with a detection moiety or can be any RNA reporter as disclosed herein. The DNA reporter can comprise a single-stranded DNA labelled with a detection moiety or can be any DNA reporter as disclosed herein.


In some embodiments, the target nucleic acid as described in the methods herein does not initially comprise a PAM sequence. However, any target nucleic acid of interest may be generated using the methods described herein to comprise a PAM sequence, and thus be a PAM target nucleic acid. A PAM target nucleic acid, as used herein, refers to a target nucleic acid that has been amplified to insert a PAM sequence that is recognized by a CRISPR/Cas system.


In some embodiments, the target nucleic acid is in a cell. In some embodiments, the cell is a single-cell eukaryotic organism; a plant cell; an algal cell; a fungal cell; an animal cell; a cell from an invertebrate animal; a cell from a vertebrate animal such as fish, amphibian, reptile, bird, and mammal; or a cell from a mammal such as a human, a non-human primate, an ungulate, a feline, a bovine, an ovine, and a caprine. In preferred embodiments, the cell is a eukaryotic cell. In preferred embodiments, the cell is a mammalian cell, a human cell, or a plant cell.


Any of the above disclosed samples are consistent with the methods, compositions, reagents, enzymes, and kits disclosed herein and can be used as a companion diagnostic with any of the diseases disclosed herein, or can be used in reagent kits, point-of-care diagnostics, or over-the-counter diagnostics.


Methods of Modifying or Editing a Target Nucleic Acid Sequence

The disclosure provides compositions and methods for modifying or editing a target nucleic acid sequence. In some embodiments, the target nucleic acid sequence is associated with (e.g., causes, at least in part) a disease or disorder described herein, including a liver disease or disorder, an eye disease or disorder, cystic fibrosis, or a muscle disease or disorder. In some examples, the target nucleic acid comprises at least a portion of any one of the following genes: DNMT1, HPRT1, RPL32P3, CCR5, FANCF, GRIN2B, EMX1, AAVS1, ALKBH5, CLTA, CDK11, CTNNB1, AXIN1, LRP6, TBK1, BAP1, TLE3, PPM1A, BCL2L2, SUFU, RICTOR, VPS35, TOP1, SIRT1, PTEN, MMD, PAQR8, H2AX, POU5F1, OCT4, SYS1, ARFRP1, TSPAN14, EMC2, EMC3, SEL1L, DERL2, UBE2G2, UBE2J1, HRD1, PCSK9, BAK1 and CFTR. In some embodiments, the target nucleic acid comprises at least a portion of a PCSK9 gene. In some embodiments, the PCSK9 gene comprises a mutation associated with a liver disease or disorder. In some embodiments, the target nucleic acid comprises at least a portion of a BAK1 gene. In some embodiments, the BAK1 gene comprises a mutation associated with an eye disease or disorder. In some embodiments, the target nucleic acid comprises at least a portion of a CFTR gene. In some embodiments, the CFTR gene comprises a mutation associated with cystic fibrosis. In some embodiments, the CFTR gene comprises a delta F508 mutation. Compositions and methods of the disclosure can be used for introducing a site-specific cleavage in a target nucleic acid sequence. The site-specific cleavage can be a double-strand cleavage. The site-specific cleavage can be a single-strand cleavage (e.g. nicking). The modification can result in introducing a mutation (e.g., point mutations, deletions) in a target nucleic acid. The modification can result in removing a disease-causing mutation in a nucleic acid sequence. Methods of the disclosure can be targeted to any locus in a genome of a cell. 
They can generate point mutations, deletions, null mutations, or tissue-specific mutations in a target nucleic acid sequence. A complex comprising a programmable nuclease and guide nucleic acid of the disclosure can be used to generate gene knock-out, gene knock-in, gene editing, gene tagging, or a combination thereof. In some embodiments, the activity of a nuclease, such as a cleavage product, may be analyzed using gel electrophoresis or nucleic acid sequencing.


The methods described herein (e.g., methods of introducing a nick or a double-stranded break into a target nucleic acid) may be used to edit or modify a target nucleic acid. Methods of modifying a target nucleic acid may use the compositions comprising a programmable nuclease and a gRNA as described herein. Modifying a target nucleic acid may comprise one or more of cleaving the target nucleic acid, deleting one or more nucleotides of the target nucleic acid, inserting one or more nucleotides into the target nucleic acid, mutating one or more nucleotides of the target nucleic acid, or modifying (e.g., methylating, demethylating, deaminating, or oxidizing) of one or more nucleotides of the target nucleic acid.


In some embodiments, modifying a target nucleic acid comprises genome editing. Genome editing may comprise modifying a genome, chromosome, plasmid, or other genetic material of a cell or organism. In some embodiments, the genome, chromosome, plasmid, or other genetic material of the cell or organism is modified in vivo. In some embodiments, the genome, chromosome, plasmid, or other genetic material of the cell or organism is modified in a cell. In some embodiments, the genome, chromosome, plasmid, or other genetic material of the cell or organism is modified in vitro. For example, a plasmid may be modified in vitro using a composition described herein and introduced into a cell or organism. In some embodiments, modifying a target nucleic acid may comprise deleting a sequence from a target nucleic acid. For example, a mutated sequence or a sequence associated with a disease may be removed from a target nucleic acid. In some embodiments, modifying a target nucleic acid may comprise replacing a sequence in a target nucleic acid with a second sequence. For example, a mutated sequence or a sequence associated with a disease may be replaced with a second sequence lacking the mutation or that is not associated with the disease. In some embodiments, modifying a target nucleic acid may comprise introducing a sequence into a target nucleic acid. For example, a beneficial sequence or a sequence that may reduce or eliminate a disease may be inserted into the target nucleic acid.


In some embodiments, the present disclosure provides methods and compositions for editing a target nucleic acid sequence comprising a programmable nuclease capable of introducing a double-strand break in a double stranded DNA (dsDNA) target sequence. The programmable nuclease can be coupled to a guide nucleic acid that targets a particular region of interest in the dsDNA. A double-strand break can be repaired and rejoined by non-homologous end joining (NHEJ) or homology directed repair (HDR). Thus, a programmable nuclease capable of introducing a double-strand break as disclosed herein can be useful in a genome editing method, for example, used for therapeutic applications to treat a disease or disorder, or for agricultural applications. Such diseases or disorders that can be treated by the methods and compositions described herein include a liver disease or disorder, an eye disease or disorder, cystic fibrosis, or a muscle disease or disorder. CasΦ programmable nuclease disclosed herein can be used for genome editing purposes to generate double strand breaks in order to excise a region of DNA and subsequently introduce a region of DNA (e.g., donor DNA) into the excised region.


In some embodiments, the present disclosure provides methods and compositions for modifying or editing a target nucleic acid sequence comprising two or more programmable nickases. For example, modifying a target nucleic acid may comprise introducing two or more single-stranded breaks in the target nucleic acid. In some embodiments, a break may be introduced by contacting a target nucleic acid with a programmable nickase and a guide nucleic acid. The guide nucleic acid may bind to the programmable nickase and hybridize to a region of the target nucleic acid, thereby recruiting the programmable nickase to the region of the target nucleic acid. Binding of the programmable nickase to the guide nucleic acid and the region of the target nucleic acid may activate the programmable nickase, and the programmable nickase may introduce a break (e.g., a single stranded break) in the region of the target nucleic acid. In some embodiments, modifying a target nucleic acid may comprise introducing a first break in a first region of the target nucleic acid and a second break in a second region of the target nucleic acid. For example, modifying a target nucleic acid may comprise contacting a target nucleic acid with a first guide nucleic acid that binds to a first programmable nickase and hybridizes to a first region of the target nucleic acid and a second guide nucleic acid that binds to a second programmable nickase and hybridizes to a second region of the target nucleic acid. The first programmable nickase may introduce a first break in a first strand at the first region of the target nucleic acid, and the second programmable nickase may introduce a second break in a second strand at the second region of the target nucleic acid. In some embodiments, a segment of the target nucleic acid between the first break and the second break may be removed, thereby modifying the target nucleic acid.
In some embodiments, a segment of the target nucleic acid between the first break and the second break may be replaced (e.g., with an insert sequence), thereby modifying the target nucleic acid.


The methods of the disclosure can use HDR or NHEJ. Following cleavage of a targeted genomic sequence, one of two alternative DNA repair mechanisms can restore chromosomal integrity. The first is non-homologous end joining (NHEJ), which can generate insertions and/or deletions of a few base pairs of DNA at the cut site. Alternatively, the cell can employ homology-directed repair (HDR), which can correct the lesion via an additional DNA template (e.g., donor) that spans the cut site. In some instances, the methods of the disclosure use microhomology-mediated end-joining (MMEJ).


Methods and compositions of the disclosure can be used to insert a donor polynucleotide into a target nucleic acid sequence. A donor polynucleotide can comprise a segment of nucleic acid to be integrated at a target genomic locus. The donor polynucleotide can comprise one or more polynucleotides of interest. The donor polynucleotide can comprise one or more expression cassettes. The expression cassette can comprise a donor polynucleotide of interest, a polynucleotide encoding a selection marker and/or a reporter gene, and regulatory components that influence expression.


The donor polynucleotide can comprise a genomic nucleic acid. The genomic nucleic acid can be derived from an animal, a mouse, a human, a non-human, a rodent, a rat, a hamster, a rabbit, a pig, a bovine, a deer, a sheep, a goat, a chicken, a cat, a dog, a ferret, a primate (e.g., marmoset, rhesus monkey), a domesticated mammal or an agricultural mammal, an avian, a bacterium, an archaeon, a virus, or any other organism of interest or a combination thereof. The donor polynucleotide may be synthetic.


Donor polynucleotides of any suitable size can be integrated into a genome. In some embodiments, the donor polynucleotide integrated into a genome is less than 3, about 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 11, 11.5, 12, 12.5, 13, 13.5, 14, 14.5, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500 or more than 500 kilobases (kb) in length. In some embodiments, the donor polynucleotide integrated into a genome is at least about 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 11, 11.5, 12, 12.5, 13, 13.5, 14, 14.5, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500 or more than 500 kb in length. In some embodiments, the donor polynucleotide integrated into a genome is up to about 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 11, 11.5, 12, 12.5, 13, 13.5, 14, 14.5, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500 or more than 500 kb in length.


The donor polynucleotide can be flanked by site-specific recombination target sequences (e.g., 5′ and 3′ homology arms) on a targeting vector. The length of a homology arm may be from about 50 to about 1000 bp. The length of a homology arm may be from about 400 to about 1000 bp. A homology arm can be of any length that is sufficient to promote a homologous recombination event with a corresponding target site, including for example, from about 400 bp to about 500 bp, from about 500 bp to about 600 bp, from about 600 bp to about 700 bp, from about 700 bp to about 800 bp, from about 800 bp to about 900 bp, or from about 900 bp to about 1000 bp. In preferred embodiments, the length of a homology arm may be from about 200 to about 300 bp. The sum total of 5′ and 3′ homology arms can be about 0.5 kb, 1 kb, 1.5 kb, 2 kb, 3 kb, 4 kb, 5 kb, 6 kb, 7 kb, 8 kb, 9 kb, about 0.5 kb to about 1 kb, about 1 kb to about 1.5 kb, about 1.5 kb to about 2 kb, about 2 kb to about 3 kb, about 3 kb to about 4 kb, about 4 kb to about 5 kb, about 5 kb to about 6 kb, about 6 kb to about 7 kb, about 8 kb to about 9 kb, or is at least 10 kb.
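The flanking arrangement described above can be illustrated with a minimal sketch. It assumes a reference sequence held as a plain string, a known cut position, and equal-length 5′ and 3′ homology arms; the function name `build_donor`, its parameters, and the default arm length of 250 bp (chosen from within the about 200 to about 300 bp range mentioned above) are hypothetical and are not taken from the disclosure.

```python
def build_donor(reference: str, cut_site: int, insert: str, arm_len: int = 250) -> str:
    """Assemble a donor as 5' homology arm + insert + 3' homology arm.

    The cut is assumed to fall between reference[cut_site - 1] and
    reference[cut_site]. All names here are illustrative assumptions.
    """
    if arm_len <= 0:
        raise ValueError("arm_len must be positive")
    if cut_site - arm_len < 0 or cut_site + arm_len > len(reference):
        raise ValueError("not enough flanking sequence for the requested arm length")
    five_prime_arm = reference[cut_site - arm_len:cut_site]
    three_prime_arm = reference[cut_site:cut_site + arm_len]
    return five_prime_arm + insert + three_prime_arm


# Toy example with 4-nt arms so the result is easy to read.
toy_reference = "A" * 10 + "C" * 10
print(build_donor(toy_reference, cut_site=10, insert="GG", arm_len=4))
```

In practice the arm length would be chosen per locus and repair pathway; the sketch only shows how the arms relate to the cut site.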


In some embodiments, the donor polynucleotide comprises one or more phosphorothioate bonds between nucleobases. In some embodiments, one or more of the first five 5′ nucleobases of the donor polynucleotide are linked by phosphorothioate bonds. In some embodiments, one or more of the five nucleobases at the 3′ end of the donor polynucleotide are linked by phosphorothioate bonds. In some embodiments, one or more of the first three 5′ nucleobases of the donor polynucleotide are linked by phosphorothioate bonds. In some embodiments, one or more of the three nucleobases at the 3′ end of the donor polynucleotide are linked by phosphorothioate bonds. In preferred embodiments, the two nucleobases at 5′ end of the donor polynucleotide are linked by a phosphorothioate bond. In some embodiments, the two nucleobases at the 3′ end of the donor polynucleotide are linked by a phosphorothioate bond. In more preferred embodiments, the two nucleobases at 5′ end of the donor polynucleotide are linked by a phosphorothioate bond and the two nucleobases at the 3′ end of the donor polynucleotide are linked by a phosphorothioate bond.
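As an illustration of the terminal modifications described above, the following sketch annotates a donor sequence with phosphorothioate linkages at its ends. It assumes the asterisk convention used by some oligonucleotide vendors, in which `*` between two nucleotides marks a phosphorothioate bond; that notation, the function name `add_terminal_ps`, and its parameters are assumptions for illustration only.

```python
def add_terminal_ps(seq: str, bonds_5p: int = 1, bonds_3p: int = 1) -> str:
    """Mark phosphorothioate linkages with '*' at the 5' and 3' ends.

    bonds_5p / bonds_3p give the number of modified linkages at each end.
    One linkage at each end corresponds to linking the two terminal
    nucleobases at that end. The '*' notation is an assumed convention.
    """
    n_links = len(seq) - 1
    if bonds_5p < 0 or bonds_3p < 0 or bonds_5p + bonds_3p > n_links:
        raise ValueError("requested more modified linkages than the sequence has")
    out = []
    for i, base in enumerate(seq):
        out.append(base)
        # Linkage i joins base i and base i + 1.
        if i < n_links and (i < bonds_5p or i >= n_links - bonds_3p):
            out.append("*")
    return "".join(out)


print(add_terminal_ps("ACGTACGT"))        # one linkage at each end
print(add_terminal_ps("ACGTACGT", 2, 2))  # two linkages at each end
```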


Examples of site-specific recombinases that can be used include, but are not limited to, Cre, Flp, and Dre recombinases. The site-specific recombinase can be introduced into the cell by any means, including by introducing the recombinase polypeptide into the cell or by introducing a polynucleotide encoding the site-specific recombinase into the host cell. The polynucleotide encoding the site-specific recombinase can be located within the insert polynucleotide or within a separate polynucleotide. The site-specific recombinase can be operably linked to a promoter active in the cell including, for example, an inducible promoter, a promoter that is endogenous to the cell, a promoter that is heterologous to the cell, a cell-specific promoter, a tissue-specific promoter, or a developmental stage-specific promoter. Site-specific recombination target sequences which can flank the insert polynucleotide or any polynucleotide of interest in the insert polynucleotide can include, but are not limited to, loxP, lox511, lox2272, lox66, lox71, loxM2, lox5171, FRT, FRT11, FRT71, attp, att, rox, and a combination thereof.


The target nucleic acid may comprise one or more of a genome, a chromosome, a plasmid, a gene, a promoter, an untranslated region, an open reading frame, an intron, an exon, or an operator. The target nucleic acid may comprise a segment of one or more of a genome, a chromosome, a plasmid, a gene, a promoter, an untranslated region, an open reading frame, an intron, an exon, or an operator. In some embodiments, the target nucleic acid may be part of a cell or an organism. In some embodiments, the target nucleic acid may be a cell-free genetic component.


In some embodiments, gene modifying or gene editing is achieved by fusing a programmable nuclease such as a CasΦ protein to a heterologous sequence. The heterologous sequence can be a suitable fusion partner, e.g., a polypeptide that provides recombinase activity by acting on the target nucleic acid sequence. In some embodiments, the fusion protein comprises a programmable nuclease such as a CasΦ protein fused to a heterologous sequence by a linker.


The heterologous sequence or fusion partner can be a site-specific recombinase. The site-specific recombinase can have recombinase activity. Examples of site-specific recombinases that can be used include, but are not limited to, Cre, Hin, Tre, and FLP recombinases. The heterologous sequence or fusion partner can be a recombinase catalytic domain. The recombinase catalytic domains can be from, for example, a tyrosine recombinase, a serine recombinase, a Gin recombinase, a Hin recombinase, a β recombinase, a Sin recombinase, a Tn3 recombinase, a γδ recombinase, a Cre recombinase, a FLP recombinase, or a phiC31 integrase.


The heterologous sequence or fusion partner can be fused to the C-terminus, N-terminus, or an internal portion (e.g., a portion other than the N- or C-terminus) of the programmable nuclease, for example a dead CasΦ polypeptide.


The heterologous sequence or fusion partner can be fused to the programmable nuclease by a linker. A linker can be a peptide linker or a non-peptide linker. In some embodiments, the linker is an XTEN linker. In some embodiments, the linker comprises one or more repeats of a tri-peptide GGS. In some embodiments, the linker is from 1 to 100 amino acids in length. In some embodiments, the linker is more than 100 amino acids in length. In some embodiments, the linker is from 10 to 27 amino acids in length. A non-peptide linker can be a polyethylene glycol (PEG), polypropylene glycol (PPG), co-poly(ethylene/propylene) glycol, polyoxyethylene (POE), polyurethane, polyphosphazene, polysaccharides, dextran, polyvinyl alcohol, polyvinylpyrrolidones, polyvinyl ethyl ether, polyacryl amide, polyacrylate, polycyanoacrylates, lipid polymers, chitins, hyaluronic acid, heparin, or an alkyl linker.


In some embodiments, the CasΦ protein can comprise an enzymatically inactive and/or “dead” (abbreviated by “d”) programmable nuclease in combination (e.g., fusion) with a polypeptide comprising recombinase activity. Although a programmable CasΦ nuclease normally has nuclease activity, in some embodiments, a programmable CasΦ nuclease does not have nuclease activity.


A programmable nuclease can comprise a modified form of a wild type counterpart. The modified form of the wild type counterpart can comprise an amino acid change (e.g., deletion, insertion, or substitution) that reduces the nucleic acid-cleaving activity of the programmable nuclease. For example, a nuclease domain (e.g., RuvC domain) of a CasΦ polypeptide can be deleted or mutated so that it is no longer functional or comprises reduced nuclease activity. The modified form of the programmable nuclease can have less than 90%, less than 80%, less than 70%, less than 60%, less than 50%, less than 40%, less than 30%, less than 20%, less than 10%, less than 5%, or less than 1% of the nucleic acid-cleaving activity of the wild-type counterpart. The modified form of a programmable nuclease can have no substantial nucleic acid-cleaving activity. When a programmable nuclease is a modified form that has no substantial nucleic acid-cleaving activity, it can be referred to as enzymatically inactive and/or dead. A dead CasΦ polypeptide (e.g., dCasΦ) can bind to a target nucleic acid sequence but may not cleave the target nucleic acid sequence. A dCasΦ polypeptide can associate with a guide nucleic acid to activate or repress transcription of a target nucleic acid sequence.


In some embodiments, a programmable nuclease is a dead CasΦ polypeptide. A dead CasΦ polypeptide can comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO 107. In some embodiments, a programmable nuclease is a dead CasΦ polypeptide comprising at least 85% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO 107. In some embodiments, a programmable nuclease is a dead CasΦ polypeptide comprising at least 90% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO 107. In some embodiments, a programmable nuclease is a dead CasΦ polypeptide comprising at least 95% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO 107. In some embodiments, a programmable nuclease is a dead CasΦ polypeptide comprising at least 98% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO 107.
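The sequence identity thresholds recited above can be checked computationally once two sequences have been aligned. The following is a minimal sketch under stated assumptions: the two input strings are already aligned to equal length with `-` as the gap character, and identity is counted as matching columns divided by all alignment columns. The disclosure does not specify this particular convention in this passage, other conventions exist, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumed convention: identical non-gap columns divided by the number of
    alignment columns (columns that are gaps in both sequences are ignored).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """Return True if the aligned pair has at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy peptide fragments, not sequences from the Sequence Listing.
print(round(percent_identity("MKTAY-", "MKSAYL"), 1))
print(meets_identity_threshold("MKTAYL", "MKTAYL", 85.0))
```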


A dead CasΦ (also referred to herein as “dCasΦ”) polypeptide can form a ribonucleoprotein complex with a guide nucleic acid. The guide nucleic acid can comprise a crRNA sequence comprising at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to any one of SEQ ID NO: 48-SEQ ID NO: 86, or a reverse complement thereof.


Enzymatically inactive can refer to a polypeptide that can bind to a nucleic acid sequence in a polynucleotide in a sequence-specific manner, but may not cleave a target polynucleotide. An enzymatically inactive site-directed polypeptide can comprise an enzymatically inactive domain (e.g. a programmable nuclease domain). Enzymatically inactive can refer to no activity. Enzymatically inactive can refer to substantially no activity. Enzymatically inactive can refer to essentially no activity. Enzymatically inactive can refer to an activity less than 1%, less than 2%, less than 3%, less than 4%, less than 5%, less than 6%, less than 7%, less than 8%, less than 9%, or less than 10% activity compared to a wild-type exemplary activity (e.g., nucleic acid cleaving activity, wild-type CasΦ activity).


In further embodiments, methods of modifying cells are provided. In some embodiments, a method of modifying a cell comprising a target nucleic acid is provided, wherein the method comprises introducing a programmable CasΦ nuclease or variant thereof disclosed herein to the cell, wherein the programmable CasΦ nuclease or variant cleaves or modifies the target nucleic acid.


Modified cells obtained or obtainable by the methods described herein are provided. In some embodiments, a modified cell is obtained or is obtainable by a method of modifying a cell disclosed herein.


In some embodiments, a CasΦ polypeptide disclosed herein is expressed in a cell. In some embodiments, a CasΦ polypeptide disclosed herein complexed with a guide nucleic acid is expressed in a cell. In some embodiments, a method disclosed herein comprises modifying or editing a cell. In some embodiments, a modified cell is provided wherein a cell is modified by a CasΦ polypeptide disclosed herein. In some embodiments, a cell is provided wherein the cell comprises a CasΦ polypeptide disclosed herein.


Methods of Nicking of a Target Nucleic Acid

Disclosed herein are methods of introducing a break into a target nucleic acid. In some embodiments, the break may be a single stranded break (e.g., a nick). The programmable nickases disclosed herein and a gRNA disclosed herein may be used to introduce a single-stranded break into a target nucleic acid, for example a single stranded break in a double-stranded DNA.


A method of introducing a break into a target nucleic acid may comprise contacting the target nucleic acid with a first guide nucleic acid (e.g., a guide nucleic acid comprising a region that binds to a first programmable nickase) and a second guide nucleic acid (e.g., a guide nucleic acid comprising a region that binds to a second programmable nickase). The first guide nucleic acid may comprise an additional region that binds to the target nucleic acid, and the second guide nucleic acid may comprise an additional region that binds to the target nucleic acid. The additional region of the first guide nucleic acid and the additional region of the second guide nucleic acid may bind opposing strands of the target nucleic acid.


In some embodiments, a programmable nickase of the disclosure can cleave a non-target strand of a double-stranded target nucleic acid (e.g., DNA). In some embodiments, the programmable nickase may not cleave the target strand of the double-stranded target nucleic acid (e.g., DNA). The strand of a double-stranded target nucleic acid that is complementary to and hybridizes with the guide nucleic acid can be called the target strand. The strand of the double-stranded target DNA that is complementary to the target strand, and therefore is not complementary to the guide nucleic acid can be called non-target strand.
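The target strand and non-target strand definitions above can be made concrete with a short sketch. It assumes the guide's target-binding region is written as DNA letters in the 5′ to 3′ direction, that the double-stranded target is supplied as its top strand only, and that exact complementarity is required; the function names and the toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def classify_strands(top_strand: str, spacer_as_dna: str):
    """Report which strand of a duplex is the target strand for a guide.

    The target strand is the one complementary to the guide, so it contains
    the reverse complement of the spacer. The non-target strand therefore
    carries the spacer sequence itself. Returns None if neither strand matches.
    If both strands match, the top strand is reported first (a simplification).
    """
    bottom_strand = reverse_complement(top_strand)
    site = reverse_complement(spacer_as_dna)  # sequence the guide hybridizes with
    if site in top_strand:
        return {"target": "top", "non_target": "bottom"}
    if site in bottom_strand:
        return {"target": "bottom", "non_target": "top"}
    return None


# Toy duplex: the spacer sequence appears on the top strand, so the guide
# hybridizes with the bottom strand.
print(classify_strands("TTTACGGATTCAAGG", "ACGGAT"))
```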


The temperature at which a ribonucleoprotein (RNP) complex comprising a programmable nuclease and a guide nucleic acid is formed (i.e. the RNP complexing temperature) can affect the nickase activity of the programmable nuclease. For example, an RNP complex formed at room temperature can have a greater nickase activity than an RNP complex formed at 37° C. In some cases, the RNP complex can be formed at room temperature, for example, from about 20° C. to 22° C. In some cases, the RNP complex can be formed at, for example, about 15° C., about 16° C., about 17° C., about 18° C., about 19° C., about 20° C., about 21° C., about 22° C., about 23° C., about 24° C., or about 25° C.


In some embodiments, a programmable nuclease may exhibit at least about 1.1-fold, at least about 1.2-fold, at least about 1.3-fold, at least about 1.4-fold, at least about 1.5-fold, at least about 1.6-fold, at least about 1.7-fold, at least about 1.8-fold, at least about 1.9-fold, at least about 2-fold, at least about 2.1-fold, at least about 2.2-fold, at least about 2.3-fold, at least about 2.4-fold, at least about 2.5-fold, at least about 2.6-fold, at least about 2.7-fold, at least about 2.8-fold, at least about 2.9-fold, at least about 3-fold, at least about 3.5-fold, at least about 4-fold, at least about 4.5-fold, at least about 5-fold, at least about 5.5-fold, at least about 6-fold, at least about 7-fold, at least about 8-fold, at least about 9-fold, at least about 10-fold, at least about 15-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, or at least about 50-fold greater nicking activity when complexed with a guide RNA at room temperature as compared to when complexed at 37° C.


The crRNA repeat sequence of a guide nucleic acid can affect the nickase activity of a programmable nuclease. For example, a programmable nuclease can comprise enhanced or greater nickase activity when complexed with guide nucleic acids comprising certain crRNA repeat sequences. For example, a programmable nuclease can comprise greater nickase activity when complexed with a guide RNA comprising a crRNA repeat sequence of CasΦ.18 as shown in TABLE 2. In another example, a programmable nuclease can comprise greater nickase activity when complexed with a guide RNA comprising a crRNA repeat sequence of CasΦ.7 as shown in TABLE 2. In some embodiments, a programmable nuclease may exhibit at least about 1.1-fold, at least about 1.2-fold, at least about 1.3-fold, at least about 1.4-fold, at least about 1.5-fold, at least about 1.6-fold, at least about 1.7-fold, at least about 1.8-fold, at least about 1.9-fold, at least about 2-fold, at least about 2.1-fold, at least about 2.2-fold, at least about 2.3-fold, at least about 2.4-fold, at least about 2.5-fold, at least about 2.6-fold, at least about 2.7-fold, at least about 2.8-fold, at least about 2.9-fold, at least about 3-fold, at least about 3.5-fold, at least about 4-fold, at least about 4.5-fold, at least about 5-fold, at least about 5.5-fold, at least about 6-fold, at least about 7-fold, at least about 8-fold, at least about 9-fold, at least about 10-fold, at least about 15-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, or at least about 50-fold greater nicking activity when complexed with a guide RNA comprising a specific crRNA repeat sequence as compared to when in a complex with a guide RNA comprising another crRNA repeat sequence.


The programmable nucleases disclosed herein may exhibit cis-cleavage activity or target cleavage activity. Target cleavage activity may refer to the cleavage of a target nucleic acid by the programmable nuclease. In some cases, the cis-cleavage activity results in double-stranded breaks in the target nucleic acids. In some cases, the cis-cleavage activity results in single-stranded breaks in the target nucleic acids. In some cases, the cis-cleavage activity produces a mixture of double- and single-stranded breaks in the target nucleic acids. In further cases, the rates of cis-cleavage double- and single-strand break formation may be dependent on the sequence of the guide nucleic acid. In some cases, the ratio of cis-cleavage double- and single-strand break formation may be dependent on the sequence of the guide nucleic acid. In some cases, the ratio or rate of cis-cleavage double- and single-strand break formation may be dependent on the repeat sequence of the crRNA of the guide nucleic acid. In some cases, the ratio or rate of cis-cleavage double- and single-strand break formation may be dependent on the temperature at which the ribonucleoprotein complex comprising the programmable nuclease and the guide nucleic acid are complexed.


A programmable nuclease for use in modifying a target nucleic acid may have greater nicking activity as compared to double stranded cleavage activity. In some embodiments, a programmable nuclease may exhibit at least about 1.1-fold, at least about 1.2-fold, at least about 1.3-fold, at least about 1.4-fold, at least about 1.5-fold, at least about 1.6-fold, at least about 1.7-fold, at least about 1.8-fold, at least about 1.9-fold, at least about 2-fold, at least about 2.1-fold, at least about 2.2-fold, at least about 2.3-fold, at least about 2.4-fold, at least about 2.5-fold, at least about 2.6-fold, at least about 2.7-fold, at least about 2.8-fold, at least about 2.9-fold, at least about 3-fold, at least about 3.5-fold, at least about 4-fold, at least about 4.5-fold, at least about 5-fold, at least about 5.5-fold, at least about 6-fold, at least about 7-fold, at least about 8-fold, at least about 9-fold, at least about 10-fold, at least about 15-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, or at least about 50-fold greater nicking activity as compared to double stranded cleavage activity.


In other cases, a programmable nuclease for use in modifying a target nucleic acid may have greater double stranded cleavage activity as compared to nicking activity. In some embodiments, a programmable nuclease may exhibit at least about 1.1-fold, at least about 1.2-fold, at least about 1.3-fold, at least about 1.4-fold, at least about 1.5-fold, at least about 1.6-fold, at least about 1.7-fold, at least about 1.8-fold, at least about 1.9-fold, at least about 2-fold, at least about 2.1-fold, at least about 2.2-fold, at least about 2.3-fold, at least about 2.4-fold, at least about 2.5-fold, at least about 2.6-fold, at least about 2.7-fold, at least about 2.8-fold, at least about 2.9-fold, at least about 3-fold, at least about 3.5-fold, at least about 4-fold, at least about 4.5-fold, at least about 5-fold, at least about 5.5-fold, at least about 6-fold, at least about 7-fold, at least about 8-fold, at least about 9-fold, at least about 10-fold, at least about 15-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, or at least about 50-fold greater double stranded cleavage activity as compared to nicking activity.
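The fold comparisons between nicking activity and double-stranded cleavage activity recited above reduce to a simple ratio once both activities have been quantified in the same units. The sketch below assumes each activity is available as a positive number (for example, a product fraction estimated from gel electrophoresis); the measurement method, function names, and example values are hypothetical and are not data from the disclosure.

```python
def fold_difference(activity_a: float, activity_b: float) -> float:
    """Return how many fold greater activity_a is than activity_b."""
    if activity_a <= 0 or activity_b <= 0:
        raise ValueError("activities must be positive to form a fold ratio")
    return activity_a / activity_b


def dominant_activity(nicking: float, double_strand: float) -> str:
    """Describe which activity dominates and by what fold."""
    if nicking >= double_strand:
        return f"nicking is {fold_difference(nicking, double_strand):.1f}-fold of double-stranded cleavage"
    return f"double-stranded cleavage is {fold_difference(double_strand, nicking):.1f}-fold of nicking"


# Illustrative numbers only.
print(dominant_activity(nicking=6.0, double_strand=2.0))
print(dominant_activity(nicking=1.0, double_strand=4.0))
```

The same ratio applies to the comparisons between complexing temperatures or between crRNA repeat sequences, with the two conditions substituted for the two activities.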


In some embodiments, the nicking activity and double stranded cleavage activity of a programmable nuclease depend on the conditions and species present in the sample containing the programmable nuclease. In some cases, the nicking activity and double stranded cleavage activity of the programmable nuclease are responsive to the sequence of the crRNA present in the guide nucleic acid. In some cases, the ratio of nicking activity and double stranded cleavage activity can be modulated by changing the sequence of the crRNA present. In some cases, the nicking activity and double stranded cleavage activity of the programmable nuclease respond differently to changes in temperature (e.g., RNP complexing temperature), pH, osmolarity, buffer, target nucleic acid concentration, ionic strength, and inhibitor concentration. In some embodiments, the ratio of nicking activity to cleavage activity by a programmable nuclease can be actively controlled by adjusting sample conditions and crRNA sequences.
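By way of illustration only, the fold relationship between nicking activity and double stranded cleavage activity can be expressed as a simple ratio. The following is a minimal sketch; the function names, the use of percent of substrate converted as the activity measure, and the example values are assumptions made for illustration and are not taken from the disclosure. The qualifier "about" is not modeled.

```python
def nicking_to_cleavage_ratio(nicking_activity: float, ds_cleavage_activity: float) -> float:
    """Return nicking activity divided by double stranded cleavage activity.

    Both inputs are assumed to be measured in the same units under the same
    sample conditions (hypothetical example: percent of substrate converted).
    """
    if nicking_activity < 0 or ds_cleavage_activity <= 0:
        raise ValueError("nicking must be >= 0 and double stranded cleavage must be > 0")
    return nicking_activity / ds_cleavage_activity


def exhibits_fold_greater_nicking(nicking_activity: float,
                                  ds_cleavage_activity: float,
                                  fold: float) -> bool:
    """True if nicking activity is at least `fold` times the cleavage activity."""
    return nicking_to_cleavage_ratio(nicking_activity, ds_cleavage_activity) >= fold


# Hypothetical measurements: 60% of substrate nicked, 20% fully cleaved.
ratio = nicking_to_cleavage_ratio(60.0, 20.0)
print(ratio)
print(exhibits_fold_greater_nicking(60.0, 20.0, fold=2.0))
```

Swapping the two arguments gives the converse comparison, in which double stranded cleavage activity is compared to nicking activity.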


Methods of Regulating Gene Expression

In some embodiments, the disclosure provides methods and compositions for regulating gene expression. The methods and compositions can comprise use of an enzymatically inactive and/or “dead” (abbreviated by “d”) programmable nuclease in combination (e.g., fusion) with a polypeptide comprising transcriptional regulation activity. Although a programmable CasΦ nuclease normally has nuclease activity, in some embodiments, a programmable CasΦ nuclease does not have nuclease activity.


A programmable nuclease can comprise a modified form of a wild type counterpart. The modified form of the wild type counterpart can comprise an amino acid change (e.g., deletion, insertion, or substitution) that reduces the nucleic acid-cleaving activity of the programmable nuclease. For example, a nuclease domain (e.g., RuvC domain) of a CasΦ polypeptide can be deleted or mutated so that it is no longer functional or comprises reduced nuclease activity. The modified form of the programmable nuclease can have less than 90%, less than 80%, less than 70%, less than 60%, less than 50%, less than 40%, less than 30%, less than 20%, less than 10%, less than 5%, or less than 1% of the nucleic acid-cleaving activity of the wild-type counterpart. The modified form of a programmable nuclease can have no substantial nucleic acid-cleaving activity. When a programmable nuclease is a modified form that has no substantial nucleic acid-cleaving activity, it can be referred to as enzymatically inactive and/or dead. A dead CasΦ polypeptide (e.g., dCasΦ) can bind to a target nucleic acid sequence but may not cleave the target nucleic acid sequence. A dCasΦ polypeptide can associate with a guide nucleic acid to activate or repress transcription of a target nucleic acid sequence.
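By way of illustration only, the comparison of a modified programmable nuclease to its wild-type counterpart can be expressed as a residual activity percentage. The following is a minimal sketch; the function names, the assumption that both activities are measured in the same assay and units, and the example values are hypothetical and are not taken from the disclosure.

```python
def residual_activity_percent(modified_activity: float, wild_type_activity: float) -> float:
    """Return the modified form's cleaving activity as a percent of wild type."""
    if modified_activity < 0 or wild_type_activity <= 0:
        raise ValueError("modified activity must be >= 0 and wild-type activity must be > 0")
    return 100.0 * modified_activity / wild_type_activity


def is_below_cutoff(modified_activity: float,
                    wild_type_activity: float,
                    cutoff_percent: float) -> bool:
    """True if residual activity is less than the given percent of wild type."""
    return residual_activity_percent(modified_activity, wild_type_activity) < cutoff_percent


# Hypothetical readings in arbitrary units: wild type 200.0, modified form 4.0.
print(residual_activity_percent(4.0, 200.0))
print(is_below_cutoff(4.0, 200.0, cutoff_percent=5.0))
```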


In some embodiments, the disclosure provides a method of selectively modulating transcription of a gene in a cell. The method can comprise introducing into a cell (i) a fusion polypeptide comprising a dCasΦ polypeptide and a polypeptide comprising transcriptional regulation activity, or a nucleic acid comprising a nucleotide sequence encoding the fusion polypeptide, wherein the dCasΦ polypeptide is enzymatically inactive or exhibits reduced nucleic acid cleavage activity; and (ii) a guide nucleic acid, or a nucleic acid comprising a nucleotide sequence encoding the guide nucleic acid.


In some embodiments, a programmable nuclease is a dead CasΦ polypeptide. A dead CasΦ polypeptide can comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some embodiments, a programmable nuclease is a dead CasΦ polypeptide comprising at least 85% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some embodiments, a programmable nuclease is a dead CasΦ polypeptide comprising at least 90% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some embodiments, a programmable nuclease is a dead CasΦ polypeptide comprising at least 95% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO. 107. In some embodiments, a programmable nuclease is a dead CasΦ polypeptide comprising at least 98% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO. 105, and SEQ ID NO. 107.
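By way of illustration only, a percent sequence identity between two sequences can be computed once the sequences have been aligned. The following is a minimal sketch; it assumes the two inputs are already aligned to equal length with "-" marking gaps, and it counts identical non-gap positions over all alignment columns. This convention, the function name, and the toy sequences are assumptions made for illustration; other conventions for the denominator exist, and the toy sequences are not entries of the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes the inputs are pre-aligned (equal length, '-' for gaps).
    A column containing a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Toy 10-residue sequences differing at the final position (hypothetical).
identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")
print(identity)
print(identity >= 85.0)
```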


A dead CasΦ (also referred to herein as “dCasΦ”) polypeptide can form a ribonucleoprotein complex with a guide nucleic acid. The guide nucleic acid can comprise a crRNA sequence comprising at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to any one of SEQ ID NO: 48-SEQ ID NO: 86, or a reverse complement thereof.
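By way of illustration only, the reverse complement of an RNA sequence can be derived by complementing each base and reversing the order. The following is a minimal sketch; the function name and the toy sequence are assumptions made for illustration, only the four standard RNA bases are handled, and the toy sequence is not an entry of the Sequence Listing.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    seq = seq.upper()
    invalid = set(seq) - set("ACGU")
    if invalid:
        raise ValueError(f"unsupported characters: {sorted(invalid)}")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return seq.translate(_RNA_COMPLEMENT)[::-1]


# Toy sequence (hypothetical).
print(reverse_complement_rna("AUGC"))
```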


Enzymatically inactive can refer to a polypeptide that can bind to a nucleic acid sequence in a polynucleotide in a sequence-specific manner, but may not cleave a target polynucleotide. An enzymatically inactive site-directed polypeptide can comprise an enzymatically inactive domain (e.g., a programmable nuclease domain). Enzymatically inactive can refer to no activity. Enzymatically inactive can refer to substantially no activity. Enzymatically inactive can refer to essentially no activity. Enzymatically inactive can refer to an activity less than 1%, less than 2%, less than 3%, less than 4%, less than 5%, less than 6%, less than 7%, less than 8%, less than 9%, or less than 10% activity compared to a wild-type exemplary activity (e.g., nucleic acid cleaving activity, wild-type CasΦ activity).


Transcription regulation can be achieved by fusing a programmable nuclease such as a dead CasΦ protein to a heterologous sequence. The heterologous sequence can be a suitable fusion partner, e.g., a polypeptide that provides an activity that increases, decreases, or otherwise regulates transcription by acting on the target nucleic acid sequence or on a polypeptide (e.g., a histone or other DNA-binding protein) associated with the target nucleic acid sequence. Non-limiting examples of suitable fusion partners include a polypeptide that provides for transcription activation activity, transcription repression activity, nuclease activity, transcription release factor activity, histone modification activity, histone acetyltransferase activity, nucleic acid association activity, DNA methylase activity, direct or indirect DNA demethylase activity, methyltransferase activity, demethylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deaminase activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, or demyristoylation activity.


Illustrative modifications performed by a fusion polypeptide can comprise methylation, demethylation, acetylation, deacetylation, ubiquitination, deubiquitination, deamination, alkylation, depurination, oxidation, pyrimidine dimer formation, transposition, recombination, chain elongation, ligation, glycosylation, phosphorylation, dephosphorylation, adenylation, deadenylation, SUMOylation, deSUMOylation, ribosylation, deribosylation, myristoylation, remodeling, cleavage, oxidoreduction, hydrolation, or isomerization.


The heterologous sequence or fusion partner can be fused to the C-terminus, N-terminus, or an internal portion (e.g., a portion other than the N- or C-terminus) of the programmable nuclease, for example a dead CasΦ polypeptide. Non-limiting examples of fusion partners include transcription activators, transcription repressors, histone lysine methyltransferases (KMT), histone lysine demethylases (KDM), histone lysine acetyltransferases (KAT), histone lysine deacetylases, DNA methylases (adenosine or cytosine modification), deaminases, CTCF, periphery recruitment elements (e.g., Lamin A, Lamin B), and protein docking elements (e.g., FKBP/FRB).


Non-limiting examples of transcription activators include GAL4, VP16, VP64, and p65 subdomain (NFkappaB).


Non-limiting examples of transcription repressors include Krüppel-associated box (KRAB or SKD), the Mad mSIN3 interaction domain (SID), and the ERF repressor domain (ERD).


Non-limiting examples of histone lysine methyltransferases (KMT) include members from KMT1 family (e.g., SUV39H1, SUV39H2, G9A, ESET/SETDB1, Clr4, Su(var)3-9), KMT2 family members (e.g., hSET1A, hSET1B, MLL1 to 5, ASH1, and homologs (Trx, Trr, Ash1)), KMT3 family (SMYD2, NSD1), KMT4 (DOT1L and homologs), KMT5 family (Pr-SET7/8, SUV4-20H1, and homologs), KMT6 (EZH2), and KMT8 (e.g., RIZ1).


Non-limiting examples of histone lysine demethylases (KDM) include members from KDM1 family (LSD1/BHC110, Splsd1/Swm1/Saf110, Su(var)3-3), KDM3 family (JHDM2a/b), KDM4 family (JMJD2A/JHDM3A, JMJD2B, JMJD2C/GASC1, JMJD2D, and homologs (Rph1)), KDM5 family (JARID1A/RBP2, JARID1B/PLU-1, JARID1C/SMCX, JARID1D/SMCY, and homologs (Lid, Jhn2, Jmj2)), and KDM6 family (e.g., UTX, JMJD3).


Non-limiting examples of KAT include members of KAT2 family (hGCN5, PCAF, and homologs (dGCN5/PCAF, Gcn5)), KAT3 family (CBP, p300, and homologs (dCBP/NEJ)), KAT4, KAT5, KAT6, KAT7, KAT8, and KAT13.


In some embodiments, the disclosure provides methods for increasing transcription of a target nucleic acid sequence. The transcription of a target nucleic acid sequence can increase by at least about 1.1-fold, at least about 1.2-fold, at least about 1.3-fold, at least about 1.4-fold, at least about 1.5-fold, at least about 1.6-fold, at least about 1.7-fold, at least about 1.8-fold, at least about 1.9-fold, at least about 2-fold, at least about 2.5-fold, at least about 3-fold, at least about 3.5-fold, at least about 4-fold, at least about 4.5-fold, at least about 5-fold, at least about 6-fold, at least about 7-fold, at least about 8-fold, at least about 9-fold, at least about 10-fold, at least about 12-fold, at least about 15-fold, at least about 20-fold, at least about 50-fold, at least about 70-fold, or at least about 100-fold compared to the level of transcription of the target nucleic acid sequence in the absence of a fusion polypeptide comprising an enzymatically inactive or enzymatically reduced programmable nuclease (e.g., dead CasΦ protein).


In some embodiments, the disclosure provides methods for decreasing transcription of a target nucleic acid sequence. The transcription of a target nucleic acid sequence can decrease by at least about 1.1-fold, at least about 1.2-fold, at least about 1.3-fold, at least about 1.4-fold, at least about 1.5-fold, at least about 1.6-fold, at least about 1.7-fold, at least about 1.8-fold, at least about 1.9-fold, at least about 2-fold, at least about 2.5-fold, at least about 3-fold, at least about 3.5-fold, at least about 4-fold, at least about 4.5-fold, at least about 5-fold, at least about 6-fold, at least about 7-fold, at least about 8-fold, at least about 9-fold, at least about 10-fold, at least about 12-fold, at least about 15-fold, at least about 20-fold, at least about 50-fold, at least about 70-fold, or at least about 100-fold compared to the level of transcription of the target nucleic acid sequence in the absence of a fusion polypeptide comprising an enzymatically inactive or enzymatically reduced programmable nuclease (e.g., dead Cas12j protein).
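By way of illustration only, a fold increase or fold decrease in transcription can be computed by comparing the transcript level measured in the presence of the fusion polypeptide to the level measured in its absence. The following is a minimal sketch; the function names, the assumption that both levels are measured in the same units (for example, normalized transcript abundance), and the example values are hypothetical and are not taken from the disclosure.

```python
def fold_increase(level_with_fusion: float, level_without_fusion: float) -> float:
    """Transcript level with the fusion polypeptide divided by the level without it."""
    if level_with_fusion <= 0 or level_without_fusion <= 0:
        raise ValueError("transcript levels must be positive")
    return level_with_fusion / level_without_fusion


def fold_decrease(level_with_fusion: float, level_without_fusion: float) -> float:
    """Transcript level without the fusion polypeptide divided by the level with it."""
    if level_with_fusion <= 0 or level_without_fusion <= 0:
        raise ValueError("transcript levels must be positive")
    return level_without_fusion / level_with_fusion


# Hypothetical activation experiment: 100.0 units without, 250.0 units with.
print(fold_increase(250.0, 100.0))
# Hypothetical repression experiment: 100.0 units without, 20.0 units with.
print(fold_decrease(20.0, 100.0))
```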


Method of Treating a Disorder

The compositions and methods described herein may be used to treat, prevent, or inhibit an ailment in a subject. The ailments may include diseases, cancers, genetic disorders, neoplasias, and infections. In some cases, the disease or disorder for treatment is a liver disease or disorder, an eye disease or disorder, cystic fibrosis, or a muscle disease or disorder. In some cases, the ailments are associated with one or more genetic sequences, including but not limited to 11-hydroxylase deficiency; 17,20-desmolase deficiency; 17-hydroxylase deficiency; 3-hydroxyisobutyrate aciduria; 3-hydroxysteroid dehydrogenase deficiency; 46,XY gonadal dysgenesis; AAA syndrome; ABCA3 deficiency; ABCC8-associated hyperinsulinism; aceruloplasminemia; achondrogenesis type 2; acral peeling skin syndrome; acrodermatitis enteropathica; adrenocortical micronodular hyperplasia; adrenoleukodystrophies; adrenomyeloneuropathies; Aicardi-Goutieres syndrome; Alagille disease; Alpers syndrome; alpha-mannosidosis; Alstrom syndrome; Alzheimer disease; amelogenesis imperfecta; Amish type microcephaly; amyotrophic lateral sclerosis (ALS); anauxetic dysplasia; androgen insensitivity syndrome; Antley-Bixler syndrome; APECED; Apert syndrome; aplasia of lacrimal and salivary glands; argininemia; arrhythmogenic right ventricular dysplasia; Arts syndrome; ARVD2; arylsulfatase deficiency type metachromatic leukodystrophy; ataxia telangiectasia; autoimmune lymphoproliferative syndrome; autoimmune polyglandular syndrome type 1; autosomal dominant anhidrotic ectodermal dysplasia; autosomal dominant polycystic kidney disease; autosomal recessive microtia; autosomal recessive renal glucosuria; autosomal visceral heterotaxy; Bardet-Biedl syndrome; Bartter syndrome; basal cell nevus syndrome; Batten disease; benign recurrent intrahepatic cholestasis; beta-mannosidosis; Bethlem myopathy; Blackfan-Diamond anemia; blepharophimosis; Byler disease; C syndrome; CADASIL; carbamyl phosphate synthetase deficiency;
cardiofaciocutaneous syndrome; Carney triad; carnitine palmitoyltransferase deficiencies; cartilage-hair hypoplasia; cblC type of combined methylmalonic aciduria; CD18 deficiency; CD3Z-associated primary T-cell immunodeficiency; CD40L deficiency; CDAGS syndrome; CDG1A; CDG1B; CDG1M; CDG2C; CEDNIK syndrome; central core disease; centronuclear myopathy; cerebral capillary malformation; cerebrooculofacioskeletal syndrome type 4; cerebrooculofacioskeletal syndrome; cerebrotendinous xanthomatosis; CHARGE association; cherubism; CHILD syndrome; chronic granulomatous disease; chronic recurrent multifocal osteomyelitis; citrin deficiency; classic hemochromatosis; CNPPB syndrome; cobalamin C disease; Cockayne syndrome; coenzyme Q10 deficiency; Coffin-Lowry syndrome; Cohen syndrome; combined deficiency of coagulation factors V; common variable immune deficiency; complete androgen insensitivity; cone rod dystrophies; conformational diseases; congenital bile acid synthesis defect type 1; congenital bile acid synthesis defect type 2; congenital defect in bile acid synthesis type; congenital erythropoietic porphyria; congenital generalized osteosclerosis; Cornelia de Lange syndrome; Cousin syndrome; Cowden disease; COX deficiency; Crigler-Najjar disease; Crigler-Najjar syndrome type 1; Crisponi syndrome; Currarino syndrome; Curth-Macklin type ichthyosis hystrix; cutis laxa; cystic fibrosis; cystinosis; d-2-hydroxyglutaric aciduria; DDP syndrome; Dejerine-Sottas disease; Denys-Drash syndrome; desmin cardiomyopathy; desmin myopathy; DGUOK-associated mitochondrial DNA depletion; disorders of glutamate metabolism; distal spinal muscular atrophy type 5; DNA repair diseases; dominant optic atrophy; Doyne honeycomb retinal dystrophy; Duchenne muscular dystrophy; dyskeratosis congenita; Ehlers-Danlos syndrome type 4; Ehlers-Danlos syndromes; Elejalde disease; Ellis-van Creveld disease; Emery-Dreifuss muscular dystrophies; encephalomyopathic mtDNA depletion syndrome; enzymatic diseases;
EPCAM-associated congenital tufting enteropathy; epidermolysis bullosa with pyloric atresia; exercise-induced hypoglycemia; facioscapulohumeral muscular dystrophy; Faisalabad histiocytosis; familial atypical mycobacteriosis; familial capillary malformation-arteriovenous; familial esophageal achalasia; familial glomuvenous malformation; familial hemophagocytic lymphohistiocytosis; familial Mediterranean fever; familial megacalyces; familial schwannomatosis; familial spina bifida; familial splenic asplenia/hypoplasia; familial thrombotic thrombocytopenic purpura; Fanconi disease; Feingold syndrome; FENIB; fibrodysplasia ossificans progressiva; FKTN; Francois-Neetens fleck corneal dystrophy; Frasier syndrome; Friedreich ataxia; FTDP-17; fucosidosis; G6PD deficiency; galactosialidosis; Galloway syndrome; Gardner syndrome; Gaucher disease; Gitelman syndrome; GLUT1 deficiency; glycogen storage disease type 1b; glycogen storage disease type 2; glycogen storage disease type 3; glycogen storage disease type 4; glycogen storage disease type 9a; glycogen storage diseases; GM1-gangliosidosis; Greenberg syndrome; Greig cephalopolysyndactyly syndrome; hair genetic diseases; HANAC syndrome; harlequin type ichthyosis congenita; HDR syndrome; hemochromatosis type 3; hemochromatosis type 4; hemophilia A; hereditary angioedema type 3; hereditary angioedemas; hereditary hemorrhagic telangiectasia; hereditary hypofibrinogenemia; hereditary intraosseous vascular malformation; hereditary leiomyomatosis and renal cell cancer; hereditary neuralgic amyotrophy; hereditary sensory and autonomic neuropathy type; Hermansky-Pudlak disease; HHH syndrome; HHT2; hidrotic ectodermal dysplasia type 1; hidrotic ectodermal dysplasias; HNF4A-associated hyperinsulinism; HNPCC; human immunodeficiency with microcephaly; Huntington disease; hyper-IgD syndrome; hyperinsulinism-hyperammonemia syndrome; hypertrophy of the retinal pigment epithelium; hypochondrogenesis; hypohidrotic ectodermal dysplasia; ICF
syndrome; idiopathic congenital intestinal pseudo-obstruction; immunodeficiency with hyper-IgM type 1; immunodeficiency with hyper-IgM type 3; immunodeficiency with hyper-IgM type 4; immunodeficiency with hyper-IgM type 5; inborn errors of thyroid metabolism; infantile visceral myopathy; infantile X-linked spinal muscular atrophy; intrahepatic cholestasis of pregnancy; IPEX syndrome; IRAK4 deficiency; isolated congenital asplenia; Jeune syndrome; Johanson-Blizzard syndrome; Joubert syndrome; JP-HHT syndrome; juvenile hemochromatosis; juvenile hyalin fibromatosis; juvenile nephronophthisis; Kabuki mask syndrome; Kallmann syndromes; Kartagener syndrome; KCNJ11-associated hyperinsulinism; Kearns-Sayre syndrome; Kostmann disease; Kozlowski type of spondylometaphyseal dysplasia; Krabbe disease; LADD syndrome; late infantile-onset neuronal ceroid lipofuscinosis; LCK deficiency; LDHCP syndrome; Legius syndrome; Leigh syndrome; lethal congenital contracture syndrome 2; lethal congenital contracture syndromes; lethal contractural syndrome type 3; lethal neonatal CPT deficiency type 2; lethal osteosclerotic bone dysplasia; LIG4 syndrome; lissencephaly type 1; lissencephaly type 3; Loeys-Dietz syndrome; low phospholipid-associated cholelithiasis; lysinuric protein intolerance; Maffucci syndrome; Majeed syndrome; mannose-binding protein deficiency; Marfan disease; Marshall syndrome; MASA syndrome; MCAD deficiency; McCune-Albright syndrome; MCKD2; Meckel syndrome; Meesmann corneal dystrophy; megacystis-microcolon-intestinal hypoperistalsis; megaloblastic anemia type 1; MEHMO; MELAS; Melnick-Needles syndrome; MEN2s; Menkes disease; metachromatic leukodystrophies; methylmalonic acidurias; methylvalonic aciduria; microcoria-congenital nephrosis syndrome; microvillous atrophy; mitochondrial neurogastrointestinal encephalomyopathy; monilethrix; monosomy X; mosaic trisomy 9 syndrome; Mowat-Wilson syndrome; mucolipidosis type 2; mucolipidosis type IIIa; mucolipidosis type IV;
mucopolysaccharidoses; mucopolysaccharidosis type 3A; mucopolysaccharidosis type 3C; mucopolysaccharidosis type 4B; multiminicore disease; multiple acyl-CoA dehydrogenation deficiency; multiple cutaneous and mucosal venous malformations; multiple endocrine neoplasia type 1; multiple sulfatase deficiency; NAIC; nail-patella syndrome; nemaline myopathies; neonatal diabetes mellitus; neonatal surfactant deficiency; nephronophthisis; Netherton disease; neurofibromatoses; neurofibromatosis type 1; Niemann-Pick disease type A; Niemann-Pick disease type B; Niemann-Pick disease type C; NKX2E; Noonan syndrome; North American Indian childhood cirrhosis; NR0B1 duplication-associated DSD; ocular genetic diseases; oculo-auricular syndrome; OLEDAID; oligomeganephronia; oligomeganephronic renal hypoplasia; Ollier disease; Opitz-Kaveggia syndrome; orofaciodigital syndrome type 1; orofaciodigital syndrome type 2; osseous Paget disease; otopalatodigital syndrome type 2; OXPHOS diseases; palmoplantar hyperkeratosis; panlobar nephroblastomatosis; Parkes-Weber syndrome; Parkinson disease; partial deletion of 21q22.2-q22.3; Pearson syndrome; Pelizaeus-Merzbacher disease; Pendred syndrome; pentalogy of Cantrell; peroxisomal acyl-CoA-oxidase deficiency; Peutz-Jeghers syndrome; Pfeiffer syndrome; Pierson syndrome; pigmented nodular adrenocortical disease; pipecolic acidemia; Pitt-Hopkins syndrome; plasmalogens deficiency; pleuropulmonary blastoma and cystic nephroma; polycystic lipomembranous osteodysplasia; porphyrias; premature ovarian failure; primary erythermalgia; primary hemochromatoses; primary hyperoxaluria; progressive familial intrahepatic cholestasis; propionic acidemia; pyruvate decarboxylase deficiency; RAPADILINO syndrome; renal cystinosis; rhabdoid tumor predisposition syndrome; Rieger syndrome; ring chromosome 4; Roberts syndrome; Robinow-Sorauf syndrome; Rothmund-Thomson syndrome; SCID; Saethre-Chotzen syndrome; Sandhoff disease; SC phocomelia syndrome; SCAS; Schinzel
phocomelia syndrome; short rib-polydactyly syndrome type 1; short rib-polydactyly syndrome type 4; short-rib polydactyly syndrome type 2; short-rib polydactyly syndrome type 3; Shwachman disease; Shwachman-Diamond disease; sickle cell anemia; Silver-Russell syndrome; Simpson-Golabi-Behmel syndrome; Smith-Lemli-Opitz syndrome; SPG7-associated hereditary spastic paraplegia; spherocytosis; split-hand/foot malformation with long bone deficiencies; spondylocostal dysostosis; sporadic visceral myopathy with inclusion bodies; storage diseases; STRA6-associated syndrome; Tay-Sachs disease; thanatophoric dysplasia; thyroid metabolism diseases; Tourette syndrome; transthyretin-associated amyloidosis; trisomy 13; trisomy 22; trisomy 2p syndrome; tuberous sclerosis; tufting enteropathy; urea cycle diseases; Van Den Ende-Gupta syndrome; Van der Woude syndrome; variegated mosaic aneuploidy syndrome; VLCAD deficiency; von Hippel-Lindau disease; Waardenburg syndrome; WAGR syndrome; Walker-Warburg syndrome; Werner syndrome; Wilson disease; Wolcott-Rallison syndrome; Wolfram syndrome; X-linked agammaglobulinemia; X-linked chronic idiopathic intestinal pseudo-obstruction; X-linked cleft palate with ankyloglossia; X-linked dominant chondrodysplasia punctata; X-linked ectodermal dysplasia; X-linked Emery-Dreifuss muscular dystrophy; X-linked lissencephaly; X-linked lymphoproliferative disease; X-linked visceral heterotaxy; xanthinuria type 1; xanthinuria type 2; xeroderma pigmentosum; XPV; and Zellweger disease. In some embodiments, the ailment is Duchenne muscular dystrophy. In some embodiments, the ailment is myotonic dystrophy Type 1 (DM1). In some embodiments, the ailment is blindness or an inherited disease affecting the back of the eye. In some embodiments, the ailment is deafness. In some embodiments, the ailment is progeria. In some embodiments, the ailment is multiple sclerosis. In some embodiments, the ailment is cancer. 
In some embodiments, the ailment is a lysosomal storage disease, e.g., Hunter syndrome or Hurler syndrome. In some embodiments, the ailment is hypercholesterolemia. In some embodiments, the ailment is Stargardt macular dystrophy. In preferred embodiments, the ailment is cystic fibrosis.


In some embodiments, treating, preventing, or inhibiting an ailment in a subject may comprise contacting a target nucleic acid associated with a particular ailment to a programmable nuclease (e.g., a CasΦ programmable nuclease). In some aspects, the methods of treating, preventing, or inhibiting an ailment may involve removing, modifying, replacing, transposing, or affecting the regulation of a genomic sequence of a patient in need thereof. In some embodiments, the methods of treating, preventing, or inhibiting an ailment may involve modulating gene expression. In some embodiments, the methods of treating, preventing, or inhibiting an ailment may comprise targeting a nucleic acid sequence associated with a pathogen, such as a virus or bacteria, to a programmable nuclease of the present disclosure.


The compositions and methods described herein may be used to treat, prevent, diagnose, or identify a cancer in a subject. In some aspects, the methods may target cells or tissues. In some embodiments, the methods may be applied to subjects, such as humans. As used herein, the term “cancer” refers to a physiological condition that may be characterized by abnormal or unregulated cell growth or activity. In some cases, cancer may involve the spread of the cells exhibiting abnormal or unregulated growth or activity between various tissues in a subject. In some aspects, cancer may be a genetic condition. Examples of cancers include, but are not limited to Acute Lymphoblastic Leukemia, Acute Myeloid Leukemia, Adrenocortical Carcinoma, Anal Cancer, Astrocytomas, Bile Duct Cancer, Bladder Cancer, Bone Cancer, Brain Cancer, Breast Cancer, Bronchial Cancer, Burkitt Lymphoma, Carcinoma, Cardiac Tumors, Cervical Cancer, Chordoma, Chronic Lymphocytic Leukemia, Chronic Myelogenous Leukemia, Chronic Myeloproliferative Neoplasms, Colon Cancer, Colorectal Cancer, Craniopharyngioma, Cutaneous T-cell lymphoma, Ductal Carcinoma, Embryonal Tumors, Endometrial Cancer, Ependymoma, Esophageal Cancer, Esthesioneuroblastoma, Ewing Sarcoma, Extracranial Germ Cell Tumors, Extragonadal Germ Cell Tumors, Fallopian Tube Cancer, Fibrous Histiocytoma, Gallbladder Cancer, Gastric Cancer, Gastrointestinal Cancer, Gastrointestinal Carcinoid Cancer, Gastrointestinal Stromal Tumors, Gestational Trophoblastic Disease, Hairy Cell Leukemia, Head and Neck Cancer, Heart Tumors, Hepatocellular Cancer, Histiocytosis, Hodgkin Lymphoma, Hypopharyngeal Cancer, Intraocular Melanoma, Islet Cell Tumors, Kaposi Sarcoma, Kidney cancer, Langerhans Cell Histiocytosis, Laryngeal Cancer, Leukemia, Lip and Oral Cavity Cancer, Liver Cancer, Lung Cancer, Lymphoma, Malignant Fibrous Histiocytoma, Melanoma, Merkel Cell Carcinoma, Mesothelioma, Metastatic Squamous Neck Cancer, Midline Tract Carcinoma, Mouth Cancer, Multiple 
Endocrine Neoplasia Syndromes, Multiple Myeloma, Mycosis Fungoides, Myelodysplastic Syndromes, Myelogenous Leukemia, Myeloid Leukemia, Myeloproliferative Neoplasms, Nasal Cavity and Paranasal Sinus Cancer, Nasopharyngeal Cancer, Neuroblastoma, Non-Hodgkin Lymphoma, Non-Small Cell Lung Cancer, Oral Cancer, Osteosarcoma, Ovarian Cancer, Pancreatic Cancer, Pancreatic Neuroendocrine Tumors, Papillomatosis, Paraganglioma, Paranasal Sinus and Nasal Cavity Cancer, Parathyroid Cancer, Penile Cancer, Pharyngeal Cancer, Pheochromocytoma, Pituitary Tumor, Plasma Cell Neoplasm, Pleuropulmonary Blastoma, Primary Central Nervous System (CNS) Lymphoma, Primary Peritoneal Cancer, Prostate Cancer, Rectal Cancer, Recurrent Cancer, Renal Cell Cancer, Retinoblastoma, Rhabdomyosarcoma, Salivary Gland Cancer, Sezary Syndrome, Skin Cancer, Small Cell Lung Cancer, Small Intestine Cancer, Soft Tissue Sarcoma, Squamous Cell Carcinoma, Squamous Neck Cancer with Occult Primary, Stomach Cancer, T-Cell Lymphoma, Testicular Cancer, Throat Cancer, Thymoma and Thymic Carcinoma, Thyroid Cancer, Tracheobronchial Cancer, Transitional Cell Cancer of the Renal Pelvis and Ureter, Ureter Cancer, Renal Pelvis Cancer, Urethral Cancer, Uterine Cancer, Uterine Sarcoma, Vaginal Cancer, Vascular Tumors, Vulvar Cancer, and Wilms Tumor.


In some cases, a cancer is associated with one or more particular biomarkers. A biomarker is a chemical species or profile that may serve as an indicator of a cellular or organismal state (e.g., the presence or absence of a disease). Non-limiting examples of biomarkers include biomolecules, nucleic acid sequences, proteins, metabolites, nucleic acids, and protein modifications. A biomarker may refer to one species or to a plurality of species, such as a cell surface profile.


The methods of the present disclosure (e.g., methods of modifying a target nucleic acid) may comprise targeting a biomarker or a nucleic acid associated with a biomarker with a programmable nuclease of the disclosure (e.g., a CasΦ). In some cases, the biomarker is a gene associated with a cancer. Non-limiting examples of genes associated with cancers include, ABL, AF4/HRX, AKT-2, ALK, ALK/NPM, AML1, AML1/MTG8, APC, ATM, AXIN2, AXL, BAP1, BARD1, BCL-2, BCL-3, BCL-6, BCR/ABL, BLM, BMPR1A, BRCA1, BRCA2, BRIP1, c-MYC, CASR, CDC73, CDH1, CDK4, CDKN1B, CDKN1C, CDKN2A, CEBPA, CHEK2, CTNNA1, DBL, DEK/CAN, DICER1, DIS3L2, E2A/PBX1, EGFR, ENL/HRX, EPCAM, ERG/TLS, ERBB, ERBB-2, ETS-1, EWS/FLI-1, FH, FLCN, FMS, FOS, FPS, GATA2, GLI, GPGSP, GREM1, HER2/neu, HOX11, HOXB13, HST, IL-3, INT-2, JUN, KIT, KS3, K-SAM, LBC, LCK, LMO1, LMO2, L-MYC, LYL-1, LYT-10, LYT-10/Cα1, MAS, MAX, MDM-2, MEN1, MET, MITF, MLH1, MLL, MOS, MSH1, MSH2, MSH3, MSH6, MTG8/AML1, MUTYH, MYB, MYH11/CBFB, NBN, NEU, NF1, NF2, N-MYC, NTHL1, OST, PALB2, PAX-5, PBX1/E2A, PDGFRA, PHOX2B, PIM-1, PMS2, POLD1, POLE, POT1, PRAD-1, PRKAR1A, PTCH1, PTEN, RAD50, RAD51C, RAD51D, RAF, RAR/PML, RAS-H, RAS-K, RAS-N, RB1, RECQL4, REL/NRG, RET, RHOM1, RHOM2, ROS, RUNX1, SDHA, SDHAF, SDHB, SDHC, SDHD, SET/CAN, SIS, SKI, SMAD4, SMARCA4, SMARCB1, SMARCE1, SRC, STK11, SUFU, TAL1, TAL2, TAN-1, TIAM1, TERC, TERT, TMEM127, TP53, TSC1, TSC2, TRK, VHL, WRN, and WT1. In some cases, a gene biomarker for cancer will carry one or more mutations. In some cases, a gene biomarker for a cancer will be upregulated or downregulated relative to a patient or sample that does not have the cancer.


The compositions and methods described herein may be suitable for autologous or allogeneic treatment, as well as ex vivo cell-based treatments.


The compositions and methods described herein may be used to treat, prevent, diagnose, or identify an infection in a subject. In some embodiments, the subject is an animal (e.g., a mammal, such as a human). In some embodiments, the subject is a plant (e.g., a crop).


In some aspects, the disclosure provides the programmable CasΦ nucleases and compositions described herein for use in a method of treatment. In some embodiments, the disclosure provides the CasΦ programmable nucleases and compositions described herein for use in a method of treating an ailment recited above.


In some aspects, the disclosure provides the programmable CasΦ nucleases and compositions described herein for use as a medicament.


Methods of Detecting a Target Nucleic Acid

The present disclosure provides methods and compositions that enable target nucleic acid detection by programmable nuclease platforms, such as the DNA Endonuclease Targeted CRISPR TransReporter (DETECTR) platform. In some embodiments, the target nucleic acid is a DNA. In some embodiments, the target nucleic acid is an RNA.


A number of reagents are consistent with the compositions and methods disclosed herein. The reagents described herein may be used for nicking target nucleic acids and for detection of target nucleic acids. The reagents disclosed herein can include programmable nucleases, guide nucleic acids, target nucleic acids, and buffers. As described herein, a target nucleic acid comprising DNA or RNA may be modified or detected (e.g., the target nucleic acid hybridizes to the guide nucleic acid) using a programmable nuclease (e.g., a CasΦ as disclosed herein) and other reagents disclosed herein. As described herein, a target nucleic acid comprising DNA may be an amplicon of a nucleic acid of interest, and the amplicon can be detected using a programmable nuclease (e.g., a CasΦ as disclosed herein) and other reagents disclosed herein. Additionally, detection of multiple target nucleic acids is possible using two or more programmable nickases, or a programmable nickase with a non-nickase programmable nuclease, complexed to guide nucleic acids that target the multiple target nucleic acids, wherein the programmable nucleases exhibit different sequence-independent cleavage of the nucleic acid of a reporter (e.g., cleavage of an RNA reporter by a first programmable nuclease and cleavage of a DNA reporter by a second programmable nuclease).


In some embodiments, target nucleic acid from a sample is amplified before assaying for cleavage of reporters. Target DNA can be amplified by PCR or isothermal amplification techniques. DNA amplification methods that are compatible with the DETECTR technology can be used for programmable nucleases disclosed herein. For example, ssDNA can be amplified. Amplification of ssDNA instead of dsDNA can enable PAM-independent detection of nucleic acids by proteins with PAM requirements for dsDNA-activated trans-cleavage.


Certain programmable nucleases (e.g., a CasΦ as disclosed herein) of the disclosure can exhibit indiscriminate trans-cleavage of ssDNA, enabling their use for detection of DNA in samples. In some embodiments, target ssDNA are generated from many nucleic acid templates (RNA, ss/dsDNA) in order to achieve cleavage of the FQ reporter in the DETECTR platform. Certain programmable nucleases can be activated by ssDNA, upon which they can exhibit trans-cleavage of ssDNA and can, thereby, be used to cleave ssDNA FQ reporter molecules in the DETECTR system. These programmable nucleases can target ssDNA present in the sample, or generated and/or amplified from any number of nucleic acid templates (RNA, ssDNA, or dsDNA).


The compositions, kits and methods disclosed herein may be implemented in methods of assaying for a target nucleic acid. In some embodiments, a method of assaying for a target nucleic acid in a sample comprises: contacting the sample to a complex comprising a guide nucleic acid comprising a segment that is reverse complementary to a segment of the target nucleic acid and a programmable nuclease (e.g., a CasΦ as disclosed herein) of the disclosure that exhibits sequence independent cleavage upon forming a complex comprising the segment of the guide nucleic acid binding to the segment of the target nucleic acid, wherein the sample comprises at least one nucleic acid comprising at least 50% sequence identity to the segment of the target nucleic acid; and assaying for cleavage of at least one reporter nucleic acid of a population of reporter nucleic acids, wherein the cleavage indicates a presence of the target nucleic acid in the sample and wherein absence of the cleavage indicates an absence of the target nucleic acid in the sample.
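
The reverse-complement relationship between the guide segment and the target segment can be illustrated with a short Python sketch. The function name and the example sequence below are hypothetical and are not taken from the disclosure; the sketch assumes a DNA target segment written 5′ to 3′ and an RNA guide segment.

```python
# Hypothetical helper: derive an RNA segment that is reverse complementary
# to a DNA target segment. Illustrative only; not the disclosed guide design.
_DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def guide_segment_for(target_dna: str) -> str:
    """Return the RNA reverse complement (5'->3') of a DNA target segment."""
    target = target_dna.upper()
    unknown = set(target) - set(_DNA_TO_RNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    # Walk the target from its 3' end and emit the complementary RNA base.
    return "".join(_DNA_TO_RNA_COMPLEMENT[base] for base in reversed(target))


example_target = "ATGGCT"  # made-up sequence, not from the disclosure
print(guide_segment_for(example_target))
```

A real guide nucleic acid also carries the additional region described elsewhere in the disclosure; this sketch covers only the target-complementary segment.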


The target nucleic acid can be from 0.05% to 20% of total nucleic acids in the sample. Sometimes, the target nucleic acid is from 0.1% to 10% of the total nucleic acids in the sample. The target nucleic acid, in some cases, is from 0.1% to 5% of the total nucleic acids in the sample. Often, a sample comprises the segment of the target nucleic acid and at least one nucleic acid comprising less than 100% sequence identity to the segment of the target nucleic acid but no less than 50% sequence identity to the segment of the target nucleic acid. For example, the segment of the target nucleic acid comprises a mutation as compared to at least one nucleic acid comprising less than 100% sequence identity to the segment of the target nucleic acid but no less than 50% sequence identity to the segment of the target nucleic acid. Often, the segment of the target nucleic acid comprises a single nucleotide mutation as compared to at least one nucleic acid comprising less than 100% sequence identity to the segment of the target nucleic acid but no less than 50% sequence identity to the segment of the target nucleic acid.
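
The sequence-identity window described above (less than 100% but no less than 50%) can be sketched in Python as follows. This is a minimal illustration that assumes an ungapped, position-by-position comparison of two equal-length segments; the disclosure does not specify an alignment method in this passage, and the function name and sequences are hypothetical.

```python
def percent_identity(segment_a: str, segment_b: str) -> float:
    """Percent of matching positions between two equal-length segments."""
    if len(segment_a) != len(segment_b) or not segment_a:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(segment_a.upper(), segment_b.upper()))
    return 100.0 * matches / len(segment_a)


target_segment = "ACGTACGTACGTACGTACGT"     # made-up 20-nt target segment
single_nt_variant = "ACGTACGTACTTACGTACGT"  # differs at exactly one position

identity = percent_identity(target_segment, single_nt_variant)
# A single nucleotide mutation in a 20-nt segment falls inside the window.
in_window = 50.0 <= identity < 100.0
print(f"{identity:.1f}% identity, in window: {in_window}")
```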


The concentrations of the various reagents in the programmable nuclease DETECTR reaction mix can vary depending on the particular scale of the reaction. For example, the final concentration of the programmable nuclease can vary from 1 pM to 1 nM, from 1 pM to 10 pM, from 10 pM to 100 pM, from 100 pM to 1 nM, from 1 nM to 10 nM, from 10 nM to 20 nM, from 20 nM to 30 nM, from 30 nM to 40 nM, from 40 nM to 50 nM, from 50 nM to 60 nM, from 60 nM to 70 nM, from 70 nM to 80 nM, from 80 nM to 90 nM, from 90 nM to 100 nM, from 100 nM to 200 nM, from 200 nM to 300 nM, from 300 nM to 400 nM, from 400 nM to 500 nM, from 500 nM to 600 nM, from 600 nM to 700 nM, from 700 nM to 800 nM, from 800 nM to 900 nM, from 900 nM to 1000 nM. The final concentration of the sgRNA complementary to the target nucleic acid can be from 1 pM to 1 nM, from 1 pM to 10 pM, from 10 pM to 100 pM, from 100 pM to 1 nM, from 1 nM to 10 nM, from 10 nM to 20 nM, from 20 nM to 30 nM, from 30 nM to 40 nM, from 40 nM to 50 nM, from 50 nM to 60 nM, from 60 nM to 70 nM, from 70 nM to 80 nM, from 80 nM to 90 nM, from 90 nM to 100 nM, from 100 nM to 200 nM, from 200 nM to 300 nM, from 300 nM to 400 nM, from 400 nM to 500 nM, from 500 nM to 600 nM, from 600 nM to 700 nM, from 700 nM to 800 nM, from 800 nM to 900 nM, from 900 nM to 1000 nM. The concentration of the ssDNA-FQ reporter can be from 1 pM to 1 nM, from 1 pM to 10 pM, from 10 pM to 100 pM, from 100 pM to 1 nM, from 1 nM to 10 nM, from 10 nM to 20 nM, from 20 nM to 30 nM, from 30 nM to 40 nM, from 40 nM to 50 nM, from 50 nM to 60 nM, from 60 nM to 70 nM, from 70 nM to 80 nM, from 80 nM to 90 nM, from 90 nM to 100 nM, from 100 nM to 200 nM, from 200 nM to 300 nM, from 300 nM to 400 nM, from 400 nM to 500 nM, from 500 nM to 600 nM, from 600 nM to 700 nM, from 700 nM to 800 nM, from 800 nM to 900 nM, from 900 nM to 1000 nM.


An example of a DETECTR reaction comprises, consists, or consists essentially of a final concentration of 100 nM CasΦ polypeptide or variant thereof, 125 nM sgRNA, and 50 nM ssDNA-FQ reporter in a total reaction volume of 20 μL. Reactions are incubated in a fluorescence plate reader (Tecan Infinite Pro 200 M Plex) for 2 hours at 37° C. with fluorescence measurements taken every 30 seconds (e.g., λex: 485 nm; λem: 535 nm). The fluorescence wavelength detected can vary depending on the reporter molecule.
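
The arithmetic behind such a reaction setup can be sketched in Python using the standard dilution relation C1·V1 = C2·V2. The final concentrations and the 20 μL total volume come from the example above; the stock concentrations, the function name, and the choice to count a read at time zero are assumptions made only for illustration.

```python
def stock_volume_ul(final_nM: float, stock_nM: float, total_ul: float) -> float:
    """Volume of stock (uL) giving final_nM in total_ul, from C1*V1 = C2*V2."""
    if stock_nM <= final_nM:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_nM * total_ul / stock_nM


TOTAL_UL = 20.0
# Final concentrations from the example reaction above.
final_nM = {"nuclease": 100.0, "sgRNA": 125.0, "reporter": 50.0}
# Assumed stock concentrations (hypothetical, not from the disclosure).
stock_nM = {"nuclease": 1000.0, "sgRNA": 1250.0, "reporter": 1000.0}

volumes = {
    name: stock_volume_ul(final_nM[name], stock_nM[name], TOTAL_UL)
    for name in final_nM
}
# Volume left for buffer and sample once the three reagents are added.
remaining_ul = TOTAL_UL - sum(volumes.values())

# Reads every 30 s over 2 h, assuming one read is also taken at time zero.
n_reads = (2 * 60 * 60) // 30 + 1

print(volumes, remaining_ul, n_reads)
```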


Described herein are reagents comprising a single stranded reporter nucleic acid comprising a detection moiety, wherein the reporter nucleic acid (e.g., the ssDNA-FQ reporter described above) is capable of being cleaved by the programmable nuclease, upon generation and amplification of ssDNA from a nucleic acid template using the methods disclosed herein, thereby generating a first detectable signal.


The methods disclosed herein, thus, include generation and amplification of ssDNA from a target nucleic acid template (e.g., cDNA, ssDNA, or dsDNA) of interest in a sample, incubation of the ssDNA with an ssDNA activated programmable nuclease leading to indiscriminate, PAM-independent cleavage of reporter nucleic acids (also referred to as ssDNA-FQ reporters) to generate a detectable signal, and quantification of the detectable signal to detect a target nucleic acid sequence of interest.


Reporters

Described herein are reagents comprising a reporter. The reporter can comprise a single stranded nucleic acid and a detection moiety (e.g., a labeled single stranded DNA reporter), wherein the nucleic acid is capable of being cleaved by the activated programmable nuclease (e.g., a CasΦ as disclosed herein), releasing the detection moiety and generating a detectable signal. As used herein, “reporter” is used interchangeably with “reporter nucleic acid” or “reporter molecule”. The programmable nucleases disclosed herein, activated upon hybridization of a guide RNA to a target nucleic acid, can cleave the reporter. Cleaving the “reporter” may be referred to herein as cleaving the “reporter nucleic acid,” the “reporter molecule,” or the “nucleic acid of the reporter.”


A major advantage of the compositions and methods disclosed herein can be the design of excess reporters to total nucleic acids in an unamplified or an amplified sample, not including the nucleic acid of the reporter. Total nucleic acids can include the target nucleic acids and non-target nucleic acids, not including the nucleic acid of the reporter. The non-target nucleic acids can be from the original sample, either lysed or unlysed. The non-target nucleic acids can also be byproducts of amplification. Thus, the non-target nucleic acids can include both non-target nucleic acids from the original sample, lysed or unlysed, and from an amplified sample. In the presence of a large amount of non-target nucleic acids, an activated programmable nuclease (e.g., a CasΦ as disclosed herein) may be inhibited in its ability to bind and cleave the reporter sequences. This is because the activated programmable nuclease collaterally cleaves any nucleic acids. If total nucleic acids are present in large amounts, they may outcompete reporters for the programmable nucleases. The compositions and methods disclosed herein are designed to have an excess of reporter to total nucleic acids, such that the detectable signals from DETECTR reactions are particularly superior. 
In some embodiments, the reporter can be present in at least 1.5 fold, at least 2 fold, at least 3 fold, at least 4 fold, at least 5 fold, at least 6 fold, at least 7 fold, at least 8 fold, at least 9 fold, at least 10 fold, at least 11 fold, at least 12 fold, at least 13 fold, at least 14 fold, at least 15 fold, at least 16 fold, at least 17 fold, at least 18 fold, at least 19 fold, at least 20 fold, at least 30 fold, at least 40 fold, at least 50 fold, at least 60 fold, at least 70 fold, at least 80 fold, at least 90 fold, at least 100 fold, from 1.5 fold to 100 fold, from 2 fold to 10 fold, from 10 fold to 20 fold, from 20 fold to 30 fold, from 30 fold to 40 fold, from 40 fold to 50 fold, from 50 fold to 60 fold, from 60 fold to 70 fold, from 70 fold to 80 fold, from 80 fold to 90 fold, from 90 fold to 100 fold, from 1.5 fold to 10 fold, from 1.5 fold to 20 fold, from 10 fold to 40 fold, from 20 fold to 60 fold, or from 10 fold to 80 fold excess of total nucleic acids.
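
The fold excess of reporter over total nucleic acids can be checked with a one-line ratio, sketched here in Python. The sketch assumes both quantities are expressed in the same molar units; the disclosure does not state the basis (molar or mass) in this passage, and the function name and the 5 nM figure are hypothetical.

```python
def reporter_fold_excess(reporter_nM: float, total_nucleic_acid_nM: float) -> float:
    """Ratio of reporter to total nucleic acids (reporter itself excluded)."""
    if total_nucleic_acid_nM <= 0:
        raise ValueError("total nucleic acid concentration must be positive")
    return reporter_nM / total_nucleic_acid_nM


# 50 nM reporter against an assumed 5 nM of target plus non-target nucleic acids.
fold = reporter_fold_excess(reporter_nM=50.0, total_nucleic_acid_nM=5.0)
meets_minimum = fold >= 1.5  # lowest fold excess recited above
print(f"{fold:.1f}-fold excess, meets 1.5-fold minimum: {meets_minimum}")
```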


Another significant advantage of the compositions and methods disclosed herein can be the design of an excess volume comprising the guide nucleic acid, the programmable nuclease (e.g., a CasΦ as disclosed herein), and the reporter, which contacts a smaller volume comprising the sample with the target nucleic acid of interest. The smaller volume comprising the sample can be unlysed sample, lysed sample, or lysed sample which has undergone any combination of reverse transcription, amplification, and in vitro transcription. The presence of various reagents in a crude, non-lysed sample, a lysed sample, or a lysed and amplified sample, such as buffer, magnesium sulfate, salts, the pH, a reducing agent, primers, dNTPs, NTPs, cellular lysates, non-target nucleic acids, or other components, can inhibit the ability of the programmable nuclease to become activated or to find and cleave the nucleic acid of the reporter. This may be due to nucleic acids that are not the reporter outcompeting the nucleic acid of the reporter for the programmable nuclease. Alternatively, various reagents in the sample may simply inhibit the activity of the programmable nuclease. Thus, the compositions and methods provided herein for contacting an excess volume comprising the guide nucleic acid, the programmable nuclease, and the reporter to a smaller volume comprising the sample with the target nucleic acid of interest provide for superior detection of the target nucleic acid by ensuring that the programmable nuclease is able to find and cleave the nucleic acid of the reporter. In some embodiments, the volume comprising the guide nucleic acid, the programmable nuclease, and the reporter (can be referred to as “a second volume”) is 4-fold greater than a volume comprising the sample (can be referred to as “a first volume”). 
In some embodiments, the volume comprising the guide nucleic acid, the programmable nuclease, and the reporter (can be referred to as “a second volume”) is at least 1.5 fold, at least 2 fold, at least 3 fold, at least 4 fold, at least 5 fold, at least 6 fold, at least 7 fold, at least 8 fold, at least 9 fold, at least 10 fold, at least 11 fold, at least 12 fold, at least 13 fold, at least 14 fold, at least 15 fold, at least 16 fold, at least 17 fold, at least 18 fold, at least 19 fold, at least 20 fold, at least 30 fold, at least 40 fold, at least 50 fold, at least 60 fold, at least 70 fold, at least 80 fold, at least 90 fold, at least 100 fold, from 1.5 fold to 100 fold, from 2 fold to 10 fold, from 10 fold to 20 fold, from 20 fold to 30 fold, from 30 fold to 40 fold, from 40 fold to 50 fold, from 50 fold to 60 fold, from 60 fold to 70 fold, from 70 fold to 80 fold, from 80 fold to 90 fold, from 90 fold to 100 fold, from 1.5 fold to 10 fold, from 1.5 fold to 20 fold, from 10 fold to 40 fold, from 20 fold to 60 fold, or from 10 fold to 80 fold greater than a volume comprising the sample (can be referred to as “a first volume”). 
In some embodiments, the volume comprising the sample is at least 0.5 μL, at least 1 μL, at least 2 μL, at least 3 μL, at least 4 μL, at least 5 μL, at least 6 μL, at least 7 μL, at least 8 μL, at least 9 μL, at least 10 μL, at least 11 μL, at least 12 μL, at least 13 μL, at least 14 μL, at least 15 μL, at least 16 μL, at least 17 μL, at least 18 μL, at least 19 μL, at least 20 μL, at least 25 μL, at least 30 μL, at least 35 μL, at least 40 μL, at least 45 μL, at least 50 μL, at least 55 μL, at least 60 μL, at least 65 μL, at least 70 μL, at least 75 μL, at least 80 μL, at least 85 μL, at least 90 μL, at least 95 μL, at least 100 μL, from 0.5 μL to 5 μL, from 5 μL to 10 μL, from 10 μL to 15 μL, from 15 μL to 20 μL, from 20 μL to 25 μL, from 25 μL to 30 μL, from 30 μL to 35 μL, from 35 μL to 40 μL, from 40 μL to 45 μL, from 45 μL to 50 μL, from 10 μL to 20 μL, from 5 μL to 20 μL, from 1 μL to 40 μL, from 2 μL to 10 μL, or from 1 μL to 10 μL. In some embodiments, the volume comprising the programmable nuclease, the guide nucleic acid, and the reporter is at least 10 μL, at least 11 μL, at least 12 μL, at least 13 μL, at least 14 μL, at least 15 μL, at least 16 μL, at least 17 μL, at least 18 μL, at least 19 μL, at least 20 μL, at least 21 μL, at least 22 μL, at least 23 μL, at least 24 μL, at least 25 μL, at least 26 μL, at least 27 μL, at least 28 μL, at least 29 μL, at least 30 μL, at least 40 μL, at least 50 μL, at least 60 μL, at least 70 μL, at least 80 μL, at least 90 μL, at least 100 μL, at least 150 μL, at least 200 μL, at least 250 μL, at least 300 μL, at least 350 μL, at least 400 μL, at least 450 μL, at least 500 μL, from 10 μL to 15 μL, from 15 μL to 20 μL, from 20 μL to 25 μL, from 25 μL to 30 μL, from 30 μL to 35 μL, from 35 μL to 40 μL, from 40 μL to 45 μL, from 45 μL to 50 μL, from 50 μL to 55 μL, from 55 μL to 60 μL, from 60 μL to 65 μL, from 65 μL to 70 μL, from 70 μL to 75 μL, from 75 μL to 80 μL, from 80 μL to 85 μL, from 85 μL to 90 μL, from 90 μL to 95 μL, from 95 μL to 100 μL, from 100 μL to 150 μL, from 150 μL to 200 μL, from 200 μL to 250 μL, from 250 μL to 300 μL, from 300 μL to 350 μL, from 350 μL to 400 μL, from 400 μL to 450 μL, from 450 μL to 500 μL, from 10 μL to 20 μL, from 10 μL to 30 μL, from 25 μL to 35 μL, from 10 μL to 40 μL, from 20 μL to 50 μL, from 18 μL to 28 μL, or from 17 μL to 22 μL.
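
The relationship between the first volume (sample) and the second volume (guide nucleic acid, programmable nuclease, and reporter) can be sketched in Python as follows. The sketch reads "4-fold greater than" as four times the sample volume, which is an interpretive assumption; the function name and the 5 μL sample volume are hypothetical.

```python
def second_volume_ul(first_volume_ul: float, fold: float) -> float:
    """Reagent volume (uL) that is `fold` times the sample volume."""
    if first_volume_ul <= 0 or fold <= 0:
        raise ValueError("volume and fold must be positive")
    return first_volume_ul * fold


sample_ul = 5.0                                     # assumed first volume
reagent_ul = second_volume_ul(sample_ul, fold=4.0)  # second volume
combined_ul = sample_ul + reagent_ul

print(f"sample {sample_ul} uL + reagents {reagent_ul} uL = {combined_ul} uL")
```

With these assumed numbers, both volumes fall inside ranges recited above (from 2 μL to 10 μL for the sample and from 17 μL to 22 μL for the reagent volume).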


In some cases, the reporter nucleic acid is a single-stranded nucleic acid sequence comprising deoxyribonucleotides. In other cases, the reporter nucleic acid is a single-stranded nucleic acid sequence comprising ribonucleotides. The nucleic acid of a reporter can be a single-stranded nucleic acid sequence comprising at least one deoxyribonucleotide and at least one ribonucleotide. In some cases, the nucleic acid of a reporter is a single-stranded nucleic acid comprising at least one ribonucleotide residue at an internal position that functions as a cleavage site. In some cases, the nucleic acid of a reporter comprises at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 ribonucleotide residues at an internal position. In some cases, the nucleic acid of a reporter comprises from 2 to 10, from 3 to 9, from 4 to 8, or from 5 to 7 ribonucleotide residues at an internal position. Sometimes the ribonucleotide residues are continuous. Alternatively, the ribonucleotide residues are interspersed in between non-ribonucleotide residues. In some cases, the nucleic acid of a reporter has only ribonucleotide residues. In some cases, the nucleic acid of a reporter has only deoxyribonucleotide residues. In some cases, the nucleic acid comprises nucleotides resistant to cleavage by the programmable nuclease described herein. In some cases, the nucleic acid of a reporter comprises synthetic nucleotides. In some cases, the nucleic acid of a reporter comprises at least one ribonucleotide residue and at least one non-ribonucleotide residue. In some cases, the nucleic acid of a reporter is 5-20, 5-15, 5-10, 7-20, 7-15, or 7-10 nucleotides in length. In some cases, the nucleic acid of a reporter is from 3 to 20, from 4 to 10, from 5 to 10, or from 5 to 8 nucleotides in length. In some cases, the nucleic acid of a reporter comprises at least one uracil ribonucleotide. In some cases, the nucleic acid of a reporter comprises at least two uracil ribonucleotides. 
Sometimes the nucleic acid of a reporter has only uracil ribonucleotides. In some cases, the nucleic acid of a reporter comprises at least one adenine ribonucleotide. In some cases, the nucleic acid of a reporter comprises at least two adenine ribonucleotides. In some cases, the nucleic acid of a reporter has only adenine ribonucleotides. In some cases, the nucleic acid of a reporter comprises at least one cytosine ribonucleotide. In some cases, the nucleic acid of a reporter comprises at least two cytosine ribonucleotides. In some cases, the nucleic acid of a reporter comprises at least one guanine ribonucleotide. In some cases, the nucleic acid of a reporter comprises at least two guanine ribonucleotides. A nucleic acid of a reporter can comprise only unmodified ribonucleotides, only unmodified deoxyribonucleotides, or a combination thereof. In some cases, the nucleic acid of a reporter is from 5 to 12 nucleotides in length. In some cases, the reporter nucleic acid is at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 nucleotides in length. In some cases, the reporter nucleic acid is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length.


The single stranded nucleic acid of a reporter comprises a detection moiety capable of generating a first detectable signal. Sometimes the reporter nucleic acid comprises a protein capable of generating a signal. A signal can be a calorimetric, potentiometric, amperometric, optical (e.g., fluorescent, colorimetric, etc.), or piezo-electric signal. In some cases, a detection moiety is on one side of the cleavage site. Optionally, a quenching moiety is on the other side of the cleavage site. Sometimes the quenching moiety is a fluorescence quenching moiety. In some cases, the quenching moiety is 5′ to the cleavage site and the detection moiety is 3′ to the cleavage site. In some cases, the detection moiety is 5′ to the cleavage site and the quenching moiety is 3′ to the cleavage site. Sometimes the quenching moiety is at the 5′ terminus of the nucleic acid of a reporter. Sometimes the detection moiety is at the 3′ terminus of the nucleic acid of a reporter. In some cases, the detection moiety is at the 5′ terminus of the nucleic acid of a reporter. In some cases, the quenching moiety is at the 3′ terminus of the nucleic acid of a reporter. In some cases, the single-stranded nucleic acid of a reporter is at least one population of the single-stranded nucleic acid capable of generating a first detectable signal. In some cases, the single-stranded nucleic acid of a reporter is a population of the single stranded nucleic acid capable of generating a first detectable signal. Optionally, there is more than one population of single-stranded nucleic acid of a reporter. In some cases, there are 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 30, 40, 50, or greater than 50, or any number spanned by the range of this list of different populations of single-stranded nucleic acids of a reporter capable of generating a detectable signal. 
In some cases, there are from 2 to 50, from 3 to 40, from 4 to 30, from 5 to 20, or from 6 to 10 different populations of single-stranded nucleic acids of a reporter capable of generating a detectable signal.









TABLE 3
Examples of Single Stranded Nucleic Acids in a Reporter

5′ Detection Moiety*    Sequence (SEQ ID NO)                3′ Quencher*
/56-FAM/                TTATTATT (SEQ ID NO: 95)            /3IABkFQ/
/56-FAM/                TTATTATT (SEQ ID NO: 95)            /3IABkFQ/
/5IRD700/               TTATTATT (SEQ ID NO: 95)            /3IRQC1N/
/5TYE665/               TTATTATT (SEQ ID NO: 95)            /3IAbRQSp/
/5Alex594N/             TTATTATT (SEQ ID NO: 95)            /3IAbRQSp/
/5ATTO633N/             TTATTATT (SEQ ID NO: 95)            /3IAbRQSp/
/56-FAM/                TTTTTT (SEQ ID NO: 96)              /3IABkFQ/
/56-FAM/                TTTTTTTT (SEQ ID NO: 97)            /3IABkFQ/
/56-FAM/                TTTTTTTTTT (SEQ ID NO: 98)          /3IABkFQ/
/56-FAM/                TTTTTTTTTTTT (SEQ ID NO: 99)        /3IABkFQ/
/56-FAM/                TTTTTTTTTTTTTT (SEQ ID NO: 100)     /3IABkFQ/
/56-FAM/                AAAAAA (SEQ ID NO: 101)             /3IABkFQ/
/56-FAM/                CCCCCC (SEQ ID NO: 102)             /3IABkFQ/
/56-FAM/                GGGGGG (SEQ ID NO: 103)             /3IABkFQ/
/56-FAM/                TTATTATT (SEQ ID NO: 104)           /3IABkFQ/





*This Table refers to the detection moiety and quencher moiety as their tradenames and their source is identified. However, alternatives, generics, or non-tradename moieties with similar function from other sources can also be used.


/56-FAM/: 5′ 6-Fluorescein (Integrated DNA Technologies)


/3IABkFQ/: 3′ Iowa Black FQ (Integrated DNA Technologies)


/5IRD700/: 5′ IRDye 700 (Integrated DNA Technologies)


/5TYE665/: 5′ TYE 665 (Integrated DNA Technologies)


/5Alex594N/: 5′ Alexa Fluor 594 (NHS Ester) (Integrated DNA Technologies)


/5ATTO633N/: 5′ ATTO TM 633 (NHS Ester) (Integrated DNA Technologies)


/3IRQC1N/: 3′ IRDye QC-1 Quencher (Li-Cor)


/3IAbRQSp/: 3′ Iowa Black RQ (Integrated DNA Technologies)
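
The rows of Table 3 can be represented as simple records, sketched here in Python. The class name, field names, and the concatenated string format are hypothetical conveniences modeled on the slash-delimited moiety codes in the table; only three of the table rows are reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reporter:
    """One reporter row: 5' detection moiety, sequence, 3' quencher."""
    five_prime: str   # detection moiety code, e.g. "/56-FAM/"
    sequence: str     # single stranded nucleic acid sequence, 5'->3'
    three_prime: str  # quencher code, e.g. "/3IABkFQ/"

    def as_string(self) -> str:
        # Assumed layout: moiety code, sequence, quencher code, concatenated.
        return f"{self.five_prime}{self.sequence}{self.three_prime}"


reporters = [
    Reporter("/56-FAM/", "TTATTATT", "/3IABkFQ/"),
    Reporter("/5IRD700/", "TTATTATT", "/3IRQC1N/"),
    Reporter("/56-FAM/", "TTTTTT", "/3IABkFQ/"),
]

for reporter in reporters:
    print(reporter.as_string(), len(reporter.sequence), "nt")
```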






A detection moiety can be an infrared fluorophore. A detection moiety can be a fluorophore that emits fluorescence in the range of from 500 nm to 720 nm. In some cases, the detection moiety emits fluorescence at a wavelength of 700 nm or higher. In other cases, the detection moiety emits fluorescence at about 660 nm or about 670 nm. In some cases, the detection moiety emits fluorescence in the range of from 500 to 520, 500 to 540, 500 to 590, 590 to 600, 600 to 610, 610 to 620, 620 to 630, 630 to 640, 640 to 650, 650 to 660, 660 to 670, 670 to 680, 680 to 690, 690 to 700, 700 to 710, 710 to 720, or 720 to 730 nm. In some cases, the detection moiety emits fluorescence in the range from 450 nm to 750 nm, from 500 nm to 650 nm, or from 550 to 650 nm. A detection moiety can be a fluorophore that emits a detectable fluorescence signal in the same range as 6-Fluorescein, IRDye 700, TYE 665, Alexa Fluor 594, or ATTO TM 633 (NHS Ester). A detection moiety can be fluorescein amidite, 6-Fluorescein, IRDye 700, TYE 665, Alexa Fluor 594, or ATTO TM 633 (NHS Ester). A detection moiety can be a fluorophore that emits a fluorescence in the same range as 6-Fluorescein (Integrated DNA Technologies), IRDye 700 (Integrated DNA Technologies), TYE 665 (Integrated DNA Technologies), Alexa Fluor 594 (Integrated DNA Technologies), or ATTO TM 633 (NHS Ester) (Integrated DNA Technologies). A detection moiety can be fluorescein amidite, 6-Fluorescein (Integrated DNA Technologies), IRDye 700 (Integrated DNA Technologies), TYE 665 (Integrated DNA Technologies), Alexa Fluor 594 (Integrated DNA Technologies), or ATTO TM 633 (NHS Ester) (Integrated DNA Technologies). Any of the detection moieties described herein can be from any commercially available source, can be an alternative with a similar function, a generic, or a non-tradename of the detection moieties listed.


A detection moiety can be chosen for use based on the type of sample to be tested. For example, a detection moiety that is an infrared fluorophore is used with a urine sample. As another example, SEQ ID NO: 87 with a fluorophore that emits a fluorescence around 520 nm is used for testing in non-urine samples, and SEQ ID NO: 94 with a fluorophore that emits a fluorescence around 700 nm is used for testing in urine samples.
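
The sample-dependent choice described above can be sketched as a small lookup in Python. The function name is hypothetical, the wavelengths are the approximate values given in the paragraph, and treating every sample other than urine as "non-urine" is a simplifying assumption.

```python
def suggested_emission_nm(sample_type: str) -> int:
    """Approximate reporter emission wavelength for a given sample type."""
    # Urine samples: fluorophore emitting around 700 nm (infrared).
    # All other samples: fluorophore emitting around 520 nm.
    return 700 if sample_type.strip().lower() == "urine" else 520


for sample in ("urine", "saliva"):  # "saliva" is a made-up non-urine example
    print(sample, suggested_emission_nm(sample), "nm")
```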


A quenching moiety can be chosen based on its ability to quench the detection moiety. A quenching moiety can be a non-fluorescent fluorescence quencher. A quenching moiety can quench a detection moiety that emits fluorescence in the range of from 500 nm to 720 nm. In some cases, the quenching moiety quenches a detection moiety that emits fluorescence at a wavelength of 700 nm or higher. In other cases, the quenching moiety quenches a detection moiety that emits fluorescence at about 660 nm or about 670 nm. In some cases, the quenching moiety quenches a detection moiety that emits fluorescence in the range of from 500 to 520, 500 to 540, 500 to 590, 590 to 600, 600 to 610, 610 to 620, 620 to 630, 630 to 640, 640 to 650, 650 to 660, 660 to 670, 670 to 680, 680 to 690, 690 to 700, 700 to 710, 710 to 720, or 720 to 730 nm. In some cases, the quenching moiety quenches a detection moiety that emits fluorescence in the range from 450 nm to 750 nm, from 500 nm to 650 nm, or from 550 to 650 nm. A quenching moiety can quench fluorescein amidite, 6-Fluorescein, IRDye 700, TYE 665, Alexa Fluor 594, or ATTO TM 633 (NHS Ester). A quenching moiety can be Iowa Black RQ, Iowa Black FQ or IRDye QC-1 Quencher. A quenching moiety can quench fluorescein amidite, 6-Fluorescein (Integrated DNA Technologies), IRDye 700 (Integrated DNA Technologies), TYE 665 (Integrated DNA Technologies), Alexa Fluor 594 (Integrated DNA Technologies), or ATTO TM 633 (NHS Ester) (Integrated DNA Technologies). A quenching moiety can be Iowa Black RQ (Integrated DNA Technologies), Iowa Black FQ (Integrated DNA Technologies) or IRDye QC-1 Quencher (Li-Cor). Any of the quenching moieties described herein can be from any commercially available source, can be an alternative with a similar function, a generic, or a non-tradename of the quenching moieties listed.


The generation of the detectable signal from the release of the detection moiety indicates that cleavage by the programmable nucleases has occurred and that the sample contains the target nucleic acid. In some cases, the detection moiety comprises a fluorescent dye. Sometimes the detection moiety comprises a fluorescence resonance energy transfer (FRET) pair. In some cases, the detection moiety comprises an infrared (IR) dye. In some cases, the detection moiety comprises an ultraviolet (UV) dye. Alternatively or in combination, the detection moiety comprises a polypeptide. Sometimes the detection moiety comprises a biotin. Sometimes the detection moiety comprises at least one of avidin or streptavidin. In some instances, the detection moiety comprises a polysaccharide, a polymer, or a nanoparticle. In some instances, the detection moiety comprises a gold nanoparticle or a latex nanoparticle.


A detection moiety can be any moiety capable of generating a calorimetric, potentiometric, amperometric, optical (e.g., fluorescent, colorimetric, etc.), or piezo-electric signal. A nucleic acid of a reporter, sometimes, is a protein-nucleic acid that is capable of generating a calorimetric, potentiometric, amperometric, optical (e.g., fluorescent, colorimetric, etc.), or piezo-electric signal upon cleavage of the nucleic acid. Often a calorimetric signal is heat produced after cleavage of the nucleic acids of a reporter. Sometimes, a calorimetric signal is heat absorbed after cleavage of the nucleic acids of a reporter. A potentiometric signal, for example, is electrical potential produced after cleavage of the nucleic acids of a reporter. An amperometric signal can be movement of electrons produced after the cleavage of nucleic acid of a reporter. Often, the signal is an optical signal, such as a colorimetric signal or a fluorescence signal. An optical signal is, for example, a light output produced after the cleavage of the nucleic acids of a reporter. Sometimes, an optical signal is a change in light absorbance between before and after the cleavage of nucleic acids of a reporter. Often, a piezo-electric signal is a change in mass between before and after the cleavage of the nucleic acid of a reporter.


The detectable signal can be a colorimetric signal or a signal visible by eye. In some instances, the detectable signal can be fluorescent, electrical, chemical, electrochemical, or magnetic. In some cases, the first detection signal can be generated by binding of the detection moiety to the capture molecule in the detection region, where the first detection signal indicates that the sample contained the target nucleic acid. Sometimes the system can be capable of detecting more than one type of target nucleic acid, wherein the system comprises more than one type of guide nucleic acid and more than one type of reporter nucleic acid. In some cases, the detectable signal can be generated directly by the cleavage event. Alternatively or in combination, the detectable signal can be generated indirectly by the signal event. Sometimes the detectable signal is not a fluorescent signal. In some instances, the detectable signal can be a colorimetric or color-based signal. In some cases, the detected target nucleic acid can be identified based on its spatial location on the detection region of the support medium. In some cases, the second detectable signal can be generated in a location spatially distinct from that of the first generated signal.


Often, the protein-nucleic acid is an enzyme-nucleic acid. The enzyme may be sterically hindered when present in the enzyme-nucleic acid, but then functional upon cleavage from the nucleic acid. Often, the enzyme is an enzyme that produces a reaction with a substrate. An enzyme can be invertase. Often, the substrate of invertase is sucrose. A DNS reagent produces a colorimetric change when invertase converts sucrose to glucose. In some cases, it is preferred that the nucleic acid (e.g., DNA) and invertase are conjugated using a heterobifunctional linker via sulfo-SMCC chemistry. Sometimes the protein-nucleic acid is a substrate-nucleic acid. Often the substrate is a substrate that produces a reaction with an enzyme.


A protein-nucleic acid may be attached to a solid support. The solid support, for example, is a surface. A surface can be an electrode. Sometimes the solid support is a bead. Often the bead is a magnetic bead. Upon cleavage, the protein is liberated from the solid support and interacts with other mixtures. For example, the protein is an enzyme, and upon cleavage of the nucleic acid of the enzyme-nucleic acid, the enzyme flows through a chamber into a mixture comprising the substrate. When the enzyme meets the enzyme substrate, a reaction occurs, such as a colorimetric reaction, which is then detected. As another example, the protein is an enzyme substrate, and upon cleavage of the nucleic acid of the enzyme substrate-nucleic acid, the enzyme substrate flows through a chamber into a mixture comprising the enzyme. When the enzyme substrate meets the enzyme, a reaction occurs, such as a colorimetric reaction, which is then detected.


Often, the signal is a colorimetric signal or a signal visible by eye. In some instances, the signal is fluorescent, electrical, chemical, electrochemical, or magnetic. A signal can be a calorimetric, potentiometric, amperometric, optical (e.g., fluorescent, colorimetric, etc.), or piezo-electric signal. In some cases, the detectable signal is a colorimetric signal or a signal visible by eye. In some instances, the detectable signal is fluorescent, electrical, chemical, electrochemical, or magnetic. In some cases, the first detection signal is generated by binding of the detection moiety to the capture molecule in the detection region, where the first detection signal indicates that the sample contained the target nucleic acid. Sometimes the system is capable of detecting more than one type of target nucleic acid, wherein the system comprises more than one type of guide nucleic acid and more than one type of nucleic acid of a reporter. In some cases, the detectable signal is generated directly by the cleavage event. Alternatively or in combination, the detectable signal is generated indirectly by the signal event. Sometimes the detectable signal is not a fluorescent signal. In some instances, the detectable signal is a colorimetric or color-based signal. In some cases, the detected target nucleic acid is identified based on its spatial location on the detection region of the support medium. In some cases, the second detectable signal is generated in a location spatially distinct from that of the first generated signal.


In some cases, the threshold of detection, for a subject method of detecting a single stranded target nucleic acid in a sample, is less than or equal to 10 nM. The term “threshold of detection” is used herein to describe the minimal amount of target nucleic acid that must be present in a sample in order for detection to occur. For example, when a threshold of detection is 10 nM, then a signal can be detected when a target nucleic acid is present in the sample at a concentration of 10 nM or more. In some cases, the threshold of detection is less than or equal to 5 nM, 1 nM, 0.5 nM, 0.1 nM, 0.05 nM, 0.01 nM, 0.005 nM, 0.001 nM, 0.0005 nM, 0.0001 nM, 0.00005 nM, 0.00001 nM, 10 pM, 1 pM, 500 fM, 250 fM, 100 fM, 50 fM, 10 fM, 5 fM, 1 fM, 500 attomolar (aM), 100 aM, 50 aM, 10 aM, or 1 aM. In some cases, the threshold of detection is in a range of from 1 aM to 1 nM, 1 aM to 500 pM, 1 aM to 200 pM, 1 aM to 100 pM, 1 aM to 10 pM, 1 aM to 1 pM, 1 aM to 500 fM, 1 aM to 100 fM, 1 aM to 1 fM, 1 aM to 500 aM, 1 aM to 100 aM, 1 aM to 50 aM, 1 aM to 10 aM, 10 aM to 1 nM, 10 aM to 500 pM, 10 aM to 200 pM, 10 aM to 100 pM, 10 aM to 10 pM, 10 aM to 1 pM, 10 aM to 500 fM, 10 aM to 100 fM, 10 aM to 1 fM, 10 aM to 500 aM, 10 aM to 100 aM, 10 aM to 50 aM, 100 aM to 1 nM, 100 aM to 500 pM, 100 aM to 200 pM, 100 aM to 100 pM, 100 aM to 10 pM, 100 aM to 1 pM, 100 aM to 500 fM, 100 aM to 100 fM, 100 aM to 1 fM, 100 aM to 500 aM, 500 aM to 1 nM, 500 aM to 500 pM, 500 aM to 200 pM, 500 aM to 100 pM, 500 aM to 10 pM, 500 aM to 1 pM, 500 aM to 500 fM, 500 aM to 100 fM, 500 aM to 1 fM, 1 fM to 1 nM, 1 fM to 500 pM, 1 fM to 200 pM, 1 fM to 100 pM, 1 fM to 10 pM, 1 fM to 1 pM, 10 fM to 1 nM, 10 fM to 500 pM, 10 fM to 200 pM, 10 fM to 100 pM, 10 fM to 10 pM, 10 fM to 1 pM, 500 fM to 1 nM, 500 fM to 500 pM, 500 fM to 200 pM, 500 fM to 100 pM, 500 fM to 10 pM, 500 fM to 1 pM, 800 fM to 1 nM, 800 fM to 500 pM, 800 fM to 200 pM, 800 fM to 100 pM, 800 fM to 10 pM, 800 fM to 1 pM, 1 pM to 1 nM, 1 pM to 500 
pM, 1 pM to 200 pM, 1 pM to 100 pM, or 1 pM to 10 pM. In some cases, the threshold of detection in a range of from 800 fM to 100 pM, 1 pM to 10 pM, 10 fM to 500 fM, 10 fM to 50 fM, 50 fM to 100 fM, 100 fM to 250 fM, or 250 fM to 500 fM. In some cases the threshold of detection is in a range of from 2 aM to 100 pM, from 20 aM to 50 pM, from 50 aM to 20 pM, from 200 aM to 5 pM, or from 500 aM to 2 pM. In some cases, the minimum concentration at which a single stranded target nucleic acid is detected in a sample is in a range of from 1 aM to 1 nM, 10 aM to 1 nM, 100 aM to 1 nM, 500 aM to 1 nM, 1 fM to 1 nM, 1 fM to 500 pM, 1 fM to 200 pM, 1 fM to 100 pM, 1 fM to 10 pM, 1 fM to 1 pM, 10 fM to 1 nM, 10 fM to 500 pM, 10 fM to 200 pM, 10 fM to 100 pM, 10 fM to 10 pM, 10 fM to 1 pM, 500 fM to 1 nM, 500 fM to 500 pM, 500 fM to 200 pM, 500 fM to 100 pM, 500 fM to 10 pM, 500 fM to 1 pM, 800 fM to 1 nM, 800 fM to 500 pM, 800 fM to 200 pM, 800 fM to 100 pM, 800 fM to 10 pM, 800 fM to 1 pM, 1 pM to 1 nM, 1 pM to 500 pM, from 1 pM to 200 pM, 1 pM to 100 pM, or 1 pM to 10 pM. In some cases, the minimum concentration at which a single stranded target nucleic acid is detected in a sample is in a range of from 2 aM to 100 pM, from 20 aM to 50 pM, from 50 aM to 20 pM, from 200 aM to 5 pM, or from 500 aM to 2 pM. In some cases, the minimum concentration at which a single stranded target nucleic acid can be detected in a sample is in a range of from 1 aM to 100 pM. In some cases, the minimum concentration at which a single stranded target nucleic acid can be detected in a sample is in a range of from 1 fM to 100 pM. In some cases, the minimum concentration at which a single stranded target nucleic acid can be detected in a sample is in a range of from 10 fM to 100 pM. In some cases, the minimum concentration at which a single stranded target nucleic acid can be detected in a sample is in a range of from 800 fM to 100 pM. 
In some cases, the minimum concentration at which a single stranded target nucleic acid can be detected in a sample is in a range of from 1 pM to 10 pM. In some cases, the devices, systems, fluidic devices, kits, and methods described herein detect a target single-stranded nucleic acid in a sample comprising a plurality of nucleic acids such as a plurality of non-target nucleic acids, where the target single-stranded nucleic acid is present at a concentration as low as 1 aM, 10 aM, 100 aM, 500 aM, 1 fM, 10 fM, 500 fM, 800 fM, 1 pM, 10 pM, or 100 pM.
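The definition of the threshold of detection given above (a signal can be detected when the target is present at the threshold concentration or more) can be illustrated with a short sketch. The function names `to_attomolar` and `is_detectable` are hypothetical and are not part of the disclosure; exact rational arithmetic is used only so that values such as 0.01 nM and 10 pM compare as equal.

```python
from fractions import Fraction

# Molar-unit prefixes expressed as exact multiples of 1 aM.
UNIT_TO_ATTOMOLAR = {
    "aM": 1,
    "fM": 10**3,
    "pM": 10**6,
    "nM": 10**9,
    "uM": 10**12,
}


def to_attomolar(value, unit):
    """Convert a concentration to attomolar as an exact Fraction."""
    # Going through str() keeps decimal inputs such as 0.01 exact.
    return Fraction(str(value)) * UNIT_TO_ATTOMOLAR[unit]


def is_detectable(sample, threshold):
    """Return True when the sample concentration is at or above the threshold.

    Both arguments are (value, unit) tuples, e.g. (10, "nM").
    """
    return to_attomolar(*sample) >= to_attomolar(*threshold)


if __name__ == "__main__":
    # A 10 nM threshold is met by a sample at exactly 10 nM.
    print(is_detectable((10, "nM"), (10, "nM")))
    # A 500 fM sample falls below a 1 pM threshold.
    print(is_detectable((500, "fM"), (1, "pM")))
```

The same helper can be used to check that a measured concentration falls inside any of the recited ranges by comparing it against both endpoints.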


In some embodiments, the target nucleic acid is present in the cleavage reaction at a concentration of about 10 nM, about 20 nM, about 30 nM, about 40 nM, about 50 nM, about 60 nM, about 70 nM, about 80 nM, about 90 nM, about 100 nM, about 200 nM, about 300 nM, about 400 nM, about 500 nM, about 600 nM, about 700 nM, about 800 nM, about 900 nM, about 1 μM, about 10 μM, or about 100 μM. In some embodiments, the target nucleic acid is present in the cleavage reaction at a concentration of from 10 nM to 20 nM, from 20 nM to 30 nM, from 30 nM to 40 nM, from 40 nM to 50 nM, from 50 nM to 60 nM, from 60 nM to 70 nM, from 70 nM to 80 nM, from 80 nM to 90 nM, from 90 nM to 100 nM, from 100 nM to 200 nM, from 200 nM to 300 nM, from 300 nM to 400 nM, from 400 nM to 500 nM, from 500 nM to 600 nM, from 600 nM to 700 nM, from 700 nM to 800 nM, from 800 nM to 900 nM, from 900 nM to 1 μM, from 1 μM to 10 μM, from 10 μM to 100 μM, from 10 nM to 100 nM, from 10 nM to 1 μM, from 10 nM to 10 μM, from 10 nM to 100 μM, from 100 nM to 1 μM, from 100 nM to 10 μM, from 100 nM to 100 μM, or from 1 μM to 100 μM. In some embodiments, the target nucleic acid is present in the cleavage reaction at a concentration of from 20 nM to 50 μM, from 50 nM to 20 μM, or from 200 nM to 5 μM.


In some cases, the methods, compositions, reagents, enzymes, and kits described herein may be used to detect a target single-stranded nucleic acid in a sample where the sample is contacted with the reagents for a predetermined length of time sufficient for the trans-cleavage to occur or cleavage reaction to reach completion. In some cases, the devices, systems, fluidic devices, kits, and methods described herein detect a target single-stranded nucleic acid in a sample where the sample is contacted with the reagents for no greater than 60 minutes. Sometimes the sample is contacted with the reagents for no greater than 120 minutes, 110 minutes, 100 minutes, 90 minutes, 80 minutes, 70 minutes, 60 minutes, 55 minutes, 50 minutes, 45 minutes, 40 minutes, 35 minutes, 30 minutes, 25 minutes, 20 minutes, 15 minutes, 10 minutes, 5 minutes, 4 minutes, 3 minutes, 2 minutes, or 1 minute. Sometimes the sample is contacted with the reagents for at least 120 minutes, 110 minutes, 100 minutes, 90 minutes, 80 minutes, 70 minutes, 60 minutes, 55 minutes, 50 minutes, 45 minutes, 40 minutes, 35 minutes, 30 minutes, 25 minutes, 20 minutes, 15 minutes, 10 minutes, or 5 minutes. In some cases, the sample is contacted with the reagents for from 5 minutes to 120 minutes, from 5 minutes to 100 minutes, from 10 minutes to 90 minutes, from 15 minutes to 45 minutes, or from 20 minutes to 35 minutes. 
In some cases, the devices, systems, fluidic devices, kits, and methods described herein can detect a target nucleic acid in a sample in less than 10 hours, less than 9 hours, less than 8 hours, less than 7 hours, less than 6 hours, less than 5 hours, less than 4 hours, less than 3 hours, less than 2 hours, less than 1 hour, less than 50 minutes, less than 45 minutes, less than 40 minutes, less than 35 minutes, less than 30 minutes, less than 25 minutes, less than 20 minutes, less than 15 minutes, less than 10 minutes, less than 9 minutes, less than 8 minutes, less than 7 minutes, less than 6 minutes, or less than 5 minutes. In some cases, the devices, systems, fluidic devices, kits, and methods described herein can detect a target nucleic acid in a sample in from 5 minutes to 10 hours, from 10 minutes to 8 hours, from 15 minutes to 6 hours, from 20 minutes to 5 hours, from 30 minutes to 2 hours, or from 45 minutes to 1 hour.


When a guide nucleic acid binds to a target nucleic acid, the programmable nuclease's trans-cleavage activity can be initiated, and nucleic acids of a reporter can be cleaved, resulting in the detection of fluorescence. The guide nucleic acid may be a non-naturally occurring guide nucleic acid. A non-naturally occurring guide nucleic acid may comprise an engineered sequence having a repeat and a spacer that hybridizes to a target nucleic acid sequence of interest. A non-naturally occurring guide nucleic acid may be recombinantly expressed or chemically synthesized. Nucleic acid reporters can comprise a detection moiety, wherein the nucleic acid reporter can be cleaved by the activated programmable nuclease, thereby generating a signal. Some methods as described herein can be a method of assaying for a target nucleic acid in a sample, comprising contacting the sample to a complex comprising a guide nucleic acid comprising a segment that is reverse complementary to a segment of the target nucleic acid and a programmable nuclease that exhibits sequence independent cleavage upon forming a complex comprising the segment of the guide nucleic acid binding to the segment of the target nucleic acid; and assaying for a signal indicating cleavage of at least some protein-nucleic acids of a population of protein-nucleic acids, wherein the signal indicates a presence of the target nucleic acid in the sample and wherein absence of the signal indicates an absence of the target nucleic acid in the sample. The cleaving of the nucleic acid of a reporter using the programmable nuclease may cleave with an efficiency of 50% as measured by a change in a signal that is calorimetric, potentiometric, amperometric, optical (e.g., fluorescent, colorimetric, etc.), or piezo-electric, as non-limiting examples. 
Some methods as described herein can be a method of detecting a target nucleic acid in a sample comprising contacting the sample comprising the target nucleic acid with a guide nucleic acid targeting a target nucleic acid segment, a programmable nuclease capable of being activated when complexed with the guide nucleic acid and the target nucleic acid segment, a single stranded nucleic acid of a reporter comprising a detection moiety, wherein the nucleic acid of a reporter is capable of being cleaved by the activated programmable nuclease, thereby generating a first detectable signal, cleaving the single stranded nucleic acid of a reporter using the programmable nuclease that cleaves as measured by a change in color, and measuring the first detectable signal on the support medium. The cleaving of the single stranded nucleic acid of a reporter using the programmable nuclease may cleave with an efficiency of 50% as measured by a change in color. In some cases, the cleavage efficiency is at least 40%, 50%, 60%, 70%, 80%, 90%, or 95% as measured by a change in color. The change in color may be a detectable colorimetric signal or a signal visible by eye. The change in color may be measured as a first detectable signal. The first detectable signal can be detectable within 5 minutes of contacting the sample comprising the target nucleic acid with a guide nucleic acid targeting a target nucleic acid segment, a programmable nuclease capable of being activated when complexed with the guide nucleic acid and the target nucleic acid segment, and a single stranded nucleic acid of a reporter comprising a detection moiety, wherein the nucleic acid of a reporter is capable of being cleaved by the activated nuclease. The first detectable signal can be detectable within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 110, or 120 minutes of contacting the sample. 
In some embodiments, the first detectable signal can be detectable within from 1 to 120, from 5 to 100, from 10 to 90, from 15 to 80, from 20 to 60, or from 30 to 45 minutes of contacting the sample.
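As an illustration of cleavage efficiency "as measured by a change in" a signal, the following sketch normalizes a reading between an uncleaved-reporter baseline and a fully cleaved maximum, then reports the first time point at which a chosen efficiency is reached. This normalization is an assumption made for the example; the disclosure does not specify how efficiency is computed, and the names `cleavage_efficiency` and `first_detectable_minute` are hypothetical.

```python
def cleavage_efficiency(signal, baseline, maximum):
    """Fraction of reporter cleaved, assuming signal scales linearly with cleavage.

    baseline: signal of the uncleaved reporter (assumed control reading).
    maximum:  signal of the fully cleaved reporter (assumed control reading).
    """
    if maximum <= baseline:
        raise ValueError("maximum must be greater than baseline")
    fraction = (signal - baseline) / (maximum - baseline)
    # Clamp to [0, 1] so noise below baseline or above maximum stays in range.
    return max(0.0, min(1.0, fraction))


def first_detectable_minute(time_course, baseline, maximum, min_efficiency=0.5):
    """Return the earliest minute whose reading reaches min_efficiency, else None.

    time_course maps minutes after contacting the sample to a signal reading.
    """
    for minute, signal in sorted(time_course.items()):
        if cleavage_efficiency(signal, baseline, maximum) >= min_efficiency:
            return minute
    return None


if __name__ == "__main__":
    readings = {1: 150, 5: 600, 10: 900}  # illustrative values only
    print(first_detectable_minute(readings, baseline=100, maximum=1100))
```

With the illustrative readings above, the 50% level is first reached at the 5-minute reading; a different `min_efficiency` (for example 0.4 or 0.95) can be passed to match the other recited levels.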


In some cases, the methods, reagents, enzymes, and kits described herein detect a target single-stranded nucleic acid with a programmable nuclease and a single-stranded nucleic acid of a reporter in a sample where the sample is contacted with the reagents for a predetermined length of time sufficient for trans-cleavage of the single stranded nucleic acid of a reporter.


Some methods as described herein can be a method of detecting a target nucleic acid in a sample comprising contacting the sample comprising the target nucleic acid with a guide nucleic acid targeting a target sequence, a programmable nuclease capable of being activated when complexed with the guide nucleic acid and the target sequence, a single stranded reporter nucleic acid comprising a detection moiety, wherein the reporter nucleic acid is capable of being cleaved by the activated nuclease, thereby generating a first detectable signal, cleaving the single stranded reporter nucleic acid using the programmable nuclease that cleaves as measured by a change in color, and measuring the first detectable signal on the support medium. The cleaving of the single stranded reporter nucleic acid using the programmable nuclease may cleave with an efficiency of 50% as measured by a change in color. In some cases, the cleavage efficiency is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% as measured by a change in color. The change in color may be a detectable colorimetric signal or a signal visible by eye. The change in color may be measured as a first detectable signal. The first detectable signal can be detectable within 5 minutes of contacting the sample comprising the target nucleic acid with a guide nucleic acid targeting a target sequence, a programmable nuclease capable of being activated when complexed with the guide nucleic acid and the target sequence, and a single stranded reporter nucleic acid comprising a detection moiety, wherein the reporter nucleic acid is capable of being cleaved by the activated nuclease. The first detectable signal can be detectable within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 110, or 120 minutes of contacting the sample.


Multiplexing Programmable Nucleases and Programmable Nickases

Described herein are compositions comprising a programmable nuclease (e.g., a CasΦ as disclosed herein) capable of being activated when complexed with the guide nucleic acid and the target nucleic acid molecule. Furthermore, these reagents can be used with different types of programmable nuclease, e.g., for multiplexing programmable nucleases. In some embodiments, the programmable nucleases can exist in RNP complexes that target multiple genes simultaneously. In some embodiments, a programmable nickase may be multiplexed with an additional programmable nuclease. For example, a programmable nickase may be multiplexed with an additional programmable nuclease for modification or detection of a target nucleic acid. In some embodiments, a first programmable nickase may be multiplexed with a second programmable nickase. In some embodiments, the programmable nickase may be a CasΦ programmable nickase.


In some embodiments, a CasΦ polypeptide disclosed herein may be multiplexed with multiple guide nucleic acids in the same sample, wherein the guide nucleic acids may comprise different sequences.


In some embodiments, an additional programmable nuclease used in multiplexing is any suitable programmable nuclease. Sometimes, the programmable nuclease is any Cas protein (also referred to as a Cas nuclease herein). In some cases, the programmable nuclease is Cas13. In some embodiments, the Cas13 is Cas13a, Cas13b, Cas13c, Cas13d, or Cas13e. In some cases, the programmable nuclease can be Mad7 or Mad2. In some cases, the programmable nuclease is a Cas12 protein. Sometimes the Cas12 is Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12g, Cas12h, or Cas12i. In some cases, the programmable nuclease is another CasΦ protein. In some cases, the programmable nuclease is Csm1, Cas9, C2c4, C2c8, C2c5, C2c10, C2c9, or CasZ. Sometimes, the Csm1 can be also called smCms1, miCms1, obCms1, or suCms1. Sometimes CasZ can be also called Cas14a, Cas14b, Cas14c, Cas14d, Cas14e, Cas14f, Cas14g, or Cas14h. Sometimes, the programmable nuclease can be a type V CRISPR-Cas system. In some cases, the programmable nuclease can be a type VI CRISPR-Cas system. Sometimes the programmable nuclease can be a type III CRISPR-Cas system.


In some cases, an additional programmable nuclease used in multiplexing can be from, for example, Leptotrichia shahii (Lsh), Listeria seeligeri (Lse), Leptotrichia buccalis (Lbu), Leptotrichia wadei (Lwa), Rhodobacter capsulatus (Rca), Herbinix hemicellulosilytica (Hhe), Paludibacter propionicigenes (Ppr), Lachnospiraceae bacterium (Lba), Eubacterium rectale (Ere), Listeria newyorkensis (Lny), Clostridium aminophilum (Cam), Prevotella sp. (Psm), Capnocytophaga canimorsus (Cca), Lachnospiraceae bacterium (Lba), Bergeyella zoohelcum (Bzo), Prevotella intermedia (Pin), Prevotella buccae (Pbu), Alistipes sp. (Asp), Riemerella anatipestifer (Ran), Prevotella aurantiaca (Pau), Prevotella saccharolytica (Psa), Prevotella intermedia (Pin2), Capnocytophaga canimorsus (Cca), Porphyromonas gulae (Pgu), Prevotella sp. (Psp), Porphyromonas gingivalis (Pig), Prevotella intermedia (Pin3), Enterococcus italicus (Ei), Lactobacillus salivarius (Ls), or Thermus thermophilus (Tt). In some cases, an additional programmable nuclease used in multiplexing can be from, for example, a phage such as a bacteriophage also called a megaphage. The nucleases may come from a particular bacteriophage clade called Biggiephage. Any combination of programmable nucleases can be used in multiplexing. In some embodiments, multiplexing of programmable nucleases takes place in one reaction volume. In other embodiments, multiplexing of programmable nucleases takes place in separate reaction volumes in a single device.


Amplification of a Target Nucleic Acid

Disclosed herein are methods of amplifying a target nucleic acid for detection using any of the methods, reagents, kits or devices described herein. The compositions for amplification of target nucleic acids and methods of use thereof, as described herein, are compatible with the DETECTR assay methods disclosed herein. The compositions for amplification of target nucleic acids and methods of use thereof, as described herein, are compatible with any of the programmable nucleases disclosed herein and use of said programmable nuclease in a method of detecting a target nucleic acid. A target nucleic acid can be an amplified nucleic acid of interest. The nucleic acid of interest may be any nucleic acid disclosed herein or from any sample as disclosed herein. This amplification can be thermal amplification (e.g., using PCR) or isothermal amplification. This nucleic acid amplification of the sample can improve at least one of sensitivity, specificity, or accuracy of the detection of the target nucleic acid. The reagents for nucleic acid amplification can comprise a recombinase, an oligonucleotide primer, a single-stranded DNA binding (SSB) protein, and a polymerase. The nucleic acid amplification can be transcription mediated amplification (TMA). Nucleic acid amplification can be helicase dependent amplification (HDA) or circular helicase dependent amplification (cHDA). In additional cases, nucleic acid amplification is strand displacement amplification (SDA). The nucleic acid amplification can be recombinase polymerase amplification (RPA). The nucleic acid amplification can be at least one of loop mediated amplification (LAMP) or the exponential amplification reaction (EXPAR). 
Nucleic acid amplification is, in some cases, by rolling circle amplification (RCA), ligase chain reaction (LCR), simple method amplifying RNA targets (SMART), single primer isothermal amplification (SPIA), multiple displacement amplification (MDA), nucleic acid sequence based amplification (NASBA), hinge-initiated primer-dependent amplification of nucleic acids (HIP), nicking enzyme amplification reaction (NEAR), or improved multiple displacement amplification (IMDA). The nucleic acid amplification can be performed for no greater than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or 60 minutes. Sometimes, the nucleic acid amplification reaction is performed at a temperature of around 20-45° C. The nucleic acid amplification reaction can be performed at a temperature no greater than 20° C., 25° C., 30° C., 35° C., 37° C., 40° C., or 45° C. The nucleic acid amplification reaction can be performed at a temperature of at least 20° C., 25° C., 30° C., 35° C., 37° C., 40° C., or 45° C.
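The amplification parameters recited above (a named method, a duration of no greater than 60 minutes, and a temperature of around 20-45° C.) can be captured as a simple record and checked programmatically. The sketch below is illustrative only; the class `AmplificationConditions` and the function `check_conditions` are hypothetical names, and the default limits are taken from the broadest values listed in this section rather than from any validated protocol.

```python
from dataclasses import dataclass

# Method abbreviations listed in this section.
LISTED_METHODS = {
    "TMA", "HDA", "cHDA", "SDA", "RPA", "LAMP", "EXPAR", "RCA", "LCR",
    "SMART", "SPIA", "MDA", "NASBA", "HIP", "NEAR", "IMDA",
}


@dataclass(frozen=True)
class AmplificationConditions:
    method: str           # abbreviation, e.g. "RPA"
    duration_min: float   # reaction time in minutes
    temperature_c: float  # reaction temperature in degrees Celsius


def check_conditions(cond, max_duration_min=60, temp_range_c=(20, 45)):
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    if cond.method not in LISTED_METHODS:
        problems.append(f"method {cond.method!r} is not among the listed methods")
    if cond.duration_min > max_duration_min:
        problems.append(f"duration {cond.duration_min} min exceeds {max_duration_min} min")
    low, high = temp_range_c
    if not low <= cond.temperature_c <= high:
        problems.append(f"temperature {cond.temperature_c} C is outside {low}-{high} C")
    return problems


if __name__ == "__main__":
    print(check_conditions(AmplificationConditions("RPA", 20, 37)))
    print(check_conditions(AmplificationConditions("LAMP", 90, 65)))
```

Narrower limits can be supplied through `max_duration_min` and `temp_range_c` when a specific embodiment calls for them.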


The compositions for amplification of target nucleic acids and methods of use thereof, as described herein, are compatible with any of the compositions comprising a programmable nuclease and a buffer, which has been developed to improve the function of the programmable nuclease and use of said compositions in a method of detecting a target nucleic acid. The compositions for amplification of target nucleic acids and methods of use thereof, as described herein, are compatible with any of the methods disclosed herein including methods of assaying for at least one base difference (e.g., assaying for a SNP or a base mutation) in a target nucleic acid sequence, methods of assaying for a target nucleic acid that lacks a PAM by amplifying the target nucleic acid sequence to introduce a PAM, and compositions used in introducing a PAM via amplification into the target nucleic acid sequence. In some cases, amplification of the target nucleic acid may increase the sensitivity of a detection reaction. In some cases, amplification of the target nucleic acid may increase the specificity of a detection reaction. Amplification of the target nucleic acid may increase the concentration of the target nucleic acid in the sample relative to the concentration of nucleic acids that do not correspond to the target nucleic acid. In some embodiments, amplification of the target nucleic acid may be used to modify the sequence of the target nucleic acid. For example, amplification may be used to insert a PAM sequence into a target nucleic acid that lacks a PAM sequence. In some cases, amplification may be used to increase the homogeneity of a target nucleic acid sequence. For example, amplification may be used to remove a nucleic acid variation that is not of interest in the target nucleic acid sequence.


An amplified target nucleic acid may be present in a DETECTR reaction in an amount relative to an amount of a programmable nuclease. In some embodiments, the amplified target nucleic acid is present in at least 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 500-fold, 1000-fold, 10,000-fold, or 100,000-fold molar excess relative to the amount of the programmable nuclease. In some embodiments, the amplified target nucleic acid is present in no more than 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 500-fold, 1000-fold, 10,000-fold, or 100,000-fold molar excess relative to the amount of the programmable nuclease. In some embodiments, the amplified target nucleic acid is present in from 1-fold to 2-fold, from 1-fold to 3-fold, from 1-fold to 4-fold, from 1-fold to 5-fold, from 1-fold to 10-fold, from 1-fold to 25-fold, from 1-fold to 50-fold, from 1-fold to 100-fold, from 1-fold to 500-fold, from 1-fold to 1000-fold, from 1-fold to 10,000-fold, from 1-fold to 100,000-fold, from 5-fold to 10-fold, from 5-fold to 25-fold, from 5-fold to 50-fold, from 5-fold to 100-fold, from 5-fold to 500-fold, from 5-fold to 1000-fold, from 5-fold to 10,000-fold, from 5-fold to 100,000-fold, from 10-fold to 25-fold, from 10-fold to 50-fold, from 10-fold to 100-fold, from 10-fold to 500-fold, from 10-fold to 1000-fold, from 10-fold to 10,000-fold, from 10-fold to 100,000-fold, from 100-fold to 500-fold, from 100-fold to 1000-fold, from 100-fold to 10,000-fold, from 100-fold to 100,000-fold, from 1000-fold to 10,000-fold, from 1000-fold to 100,000-fold, or from 10,000-fold to 100,000-fold molar excess relative to the amount of the programmable nuclease. In some embodiments, the programmable nuclease is present in at least 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 500-fold, 1000-fold, 10,000-fold, or 100,000-fold molar excess relative to the amount of the target nucleic acid. 
In some embodiments, the programmable nuclease is present in no more than 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 500-fold, 1000-fold, 10,000-fold, or 100,000-fold molar excess relative to the amount of the target nucleic acid. In some embodiments, the programmable nuclease is present in from 1-fold to 2-fold, from 1-fold to 3-fold, from 1-fold to 4-fold, from 1-fold to 5-fold, from 1-fold to 10-fold, from 1-fold to 25-fold, from 1-fold to 50-fold, from 1-fold to 100-fold, from 1-fold to 500-fold, from 1-fold to 1000-fold, from 1-fold to 10,000-fold, from 1-fold to 100,000-fold, from 5-fold to 10-fold, from 5-fold to 25-fold, from 5-fold to 50-fold, from 5-fold to 100-fold, from 5-fold to 500-fold, from 5-fold to 1000-fold, from 5-fold to 10,000-fold, from 5-fold to 100,000-fold, from 10-fold to 25-fold, from 10-fold to 50-fold, from 10-fold to 100-fold, from 10-fold to 500-fold, from 10-fold to 1000-fold, from 10-fold to 10,000-fold, from 10-fold to 100,000-fold, from 100-fold to 500-fold, from 100-fold to 1000-fold, from 100-fold to 10,000-fold, from 100-fold to 100,000-fold, from 1000-fold to 10,000-fold, from 1000-fold to 100,000-fold, or from 10,000-fold to 100,000-fold molar excess relative to the amount of the target nucleic acid. In some embodiments, the target nucleic acid is not present in the sample.


An amplified target nucleic acid may be present in a DETECTR reaction in an amount relative to an amount of a guide nucleic acid. In some embodiments, the amplified target nucleic acid is present in at least 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 500-fold, 1000-fold, 10,000-fold, or 100,000-fold molar excess relative to the amount of the guide nucleic acid. In some embodiments, the amplified target nucleic acid is present in no more than 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 500-fold, 1000-fold, 10,000-fold, or 100,000-fold molar excess relative to the amount of the guide nucleic acid. In some embodiments, the amplified target nucleic acid is present in from 1-fold to 2-fold, from 1-fold to 3-fold, from 1-fold to 4-fold, from 1-fold to 5-fold, from 1-fold to 10-fold, from 1-fold to 25-fold, from 1-fold to 50-fold, from 1-fold to 100-fold, from 1-fold to 500-fold, from 1-fold to 1000-fold, from 1-fold to 10,000-fold, from 1-fold to 100,000-fold, from 5-fold to 10-fold, from 5-fold to 25-fold, from 5-fold to 50-fold, from 5-fold to 100-fold, from 5-fold to 500-fold, from 5-fold to 1000-fold, from 5-fold to 10,000-fold, from 5-fold to 100,000-fold, from 10-fold to 25-fold, from 10-fold to 50-fold, from 10-fold to 100-fold, from 10-fold to 500-fold, from 10-fold to 1000-fold, from 10-fold to 10,000-fold, from 10-fold to 100,000-fold, from 100-fold to 500-fold, from 100-fold to 1000-fold, from 100-fold to 10,000-fold, from 100-fold to 100,000-fold, from 1000-fold to 10,000-fold, from 1000-fold to 100,000-fold, or from 10,000-fold to 100,000-fold molar excess relative to the amount of the guide nucleic acid. In some embodiments, the guide nucleic acid is present in at least 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 500-fold, 1000-fold, 10,000-fold, or 100,000-fold molar excess relative to the amount of the target nucleic acid. 
In some embodiments, the guide nucleic acid is present in no more than 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 500-fold, 1000-fold, 10,000-fold, or 100,000-fold molar excess relative to the amount of the target nucleic acid. In some embodiments, the guide nucleic acid is present in from 1-fold to 2-fold, from 1-fold to 3-fold, from 1-fold to 4-fold, from 1-fold to 5-fold, from 1-fold to 10-fold, from 1-fold to 25-fold, from 1-fold to 50-fold, from 1-fold to 100-fold, from 1-fold to 500-fold, from 1-fold to 1000-fold, from 1-fold to 10,000-fold, from 1-fold to 100,000-fold, from 5-fold to 10-fold, from 5-fold to 25-fold, from 5-fold to 50-fold, from 5-fold to 100-fold, from 5-fold to 500-fold, from 5-fold to 1000-fold, from 5-fold to 10,000-fold, from 5-fold to 100,000-fold, from 10-fold to 25-fold, from 10-fold to 50-fold, from 10-fold to 100-fold, from 10-fold to 500-fold, from 10-fold to 1000-fold, from 10-fold to 10,000-fold, from 10-fold to 100,000-fold, from 100-fold to 500-fold, from 100-fold to 1000-fold, from 100-fold to 10,000-fold, from 100-fold to 100,000-fold, from 1000-fold to 10,000-fold, from 1000-fold to 100,000-fold, or from 10,000-fold to 100,000-fold molar excess relative to the amount of the target nucleic acid. In some embodiments, the target nucleic acid is not present in the sample.
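The fold molar excess relationships recited in the preceding paragraphs reduce to a ratio of molar amounts. The following is a minimal sketch, assuming both components are measured in the same units and, if concentrations are used, in the same reaction volume; the names `fold_molar_excess` and `within_fold_range` are hypothetical.

```python
def fold_molar_excess(excess_component, reference_component):
    """Return how many fold the first component exceeds the second, on a molar basis.

    Both arguments must be in the same units (e.g. both in nM or both in pmol).
    """
    if reference_component <= 0:
        raise ValueError("reference amount must be positive")
    return excess_component / reference_component


def within_fold_range(fold, low, high):
    """Return True when fold lies in the inclusive range from low-fold to high-fold."""
    return low <= fold <= high


if __name__ == "__main__":
    # Illustrative values only: 500 nM amplified target and 50 nM programmable nuclease.
    fold = fold_molar_excess(500, 50)
    print(fold, within_fold_range(fold, 5, 25))
```

Swapping the arguments gives the excess of the programmable nuclease or guide nucleic acid relative to the target nucleic acid.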


Kits

Disclosed herein are kits for use in detecting, modifying, editing, or regulating a target nucleic acid sequence as disclosed herein using the methods as discussed above. In some embodiments, the kit comprises the programmable nuclease system, reagents, and the support medium. The reagents and programmable nuclease system can be provided in a reagent chamber or on the support medium. Alternatively, the reagent and programmable nuclease system can be placed into the reagent chamber or the support medium by the individual using the kit. Optionally, the kit further comprises a buffer and a dropper. The reagent chamber can be a test well or container. The opening of the reagent chamber can be large enough to accommodate the support medium. The buffer can be provided in a dropper bottle for ease of dispensing. The dropper can be disposable and transfer a fixed volume. The dropper can be used to place a sample into the reagent chamber or on the support medium.


The kit or system for detection of a target nucleic acid described herein further comprises reagents for nucleic acid amplification of target nucleic acids in the sample. Isothermal nucleic acid amplification allows the use of the kit or system in remote regions or low resource settings without specialized equipment for amplification. Often, the reagents for nucleic acid amplification comprise a recombinase, an oligonucleotide primer, a single-stranded DNA binding (SSB) protein, and a polymerase. Sometimes, nucleic acid amplification of the sample improves at least one of sensitivity, specificity, or accuracy of the assay in detecting the target nucleic acid. In some cases, the nucleic acid amplification is performed in a nucleic acid amplification region on the support medium. Alternatively, or in combination, the nucleic acid amplification is performed in a reagent chamber, and the resulting sample is applied to the support medium. Sometimes, the nucleic acid amplification is isothermal nucleic acid amplification. In some cases, the nucleic acid amplification is transcription mediated amplification (TMA). Nucleic acid amplification is helicase dependent amplification (HDA) or circular helicase dependent amplification (cHDA) in other cases. In additional cases, nucleic acid amplification is strand displacement amplification (SDA). In some cases, nucleic acid amplification is by recombinase polymerase amplification (RPA). In some cases, nucleic acid amplification is by at least one of loop mediated amplification (LAMP) or the exponential amplification reaction (EXPAR). 
Nucleic acid amplification is, in some cases, by rolling circle amplification (RCA), ligase chain reaction (LCR), simple method amplifying RNA targets (SMART), single primer isothermal amplification (SPIA), multiple displacement amplification (MDA), nucleic acid sequence based amplification (NASBA), hinge-initiated primer-dependent amplification of nucleic acids (HIP), nicking enzyme amplification reaction (NEAR), or improved multiple displacement amplification (IMDA). Often, the nucleic acid amplification is performed for no greater than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or 60 minutes, or any value from 1 to 60 minutes. Sometimes, the nucleic acid amplification is performed for from 1 to 60, from 5 to 55, from 10 to 50, from 15 to 45, from 20 to 40, or from 25 to 35 minutes. Sometimes, the nucleic acid amplification reaction is performed at a temperature of around 20-45° C. In some cases, the nucleic acid amplification reaction is performed at a temperature no greater than 20° C., 25° C., 30° C., 35° C., 37° C., 40° C., 45° C., or any value from 20° C. to 45° C. In some cases, the nucleic acid amplification reaction is performed at a temperature of at least 20° C., 25° C., 30° C., 35° C., 37° C., 40° C., or 45° C., or any value from 20° C. to 45° C. In some cases, the nucleic acid amplification reaction is performed at a temperature of from 20° C. to 45° C., from 25° C. to 40° C., from 30° C. to 40° C., or from 35° C. to 40° C.


In some embodiments, a kit for detecting a target nucleic acid comprises a support medium; a guide nucleic acid targeting a target sequence; a programmable nuclease capable of being activated when complexed with the guide nucleic acid and the target sequence; and a reporter nucleic acid comprising a detection moiety, wherein the reporter nucleic acid is capable of being cleaved by the activated nuclease, thereby generating a first detectable signal. Often, the kit further comprises primers for amplifying a target nucleic acid of interest to produce a PAM target nucleic acid.


In some embodiments, a kit for detecting a target nucleic acid comprises a PCR plate; a guide nucleic acid targeting a target sequence; a programmable nuclease capable of being activated when complexed with the guide nucleic acid and the target sequence; and a single stranded reporter nucleic acid comprising a detection moiety, wherein the reporter nucleic acid is capable of being cleaved by the activated nuclease, thereby generating a first detectable signal. The wells of the PCR plate can be pre-aliquoted with the guide nucleic acid targeting a target sequence, a programmable nuclease capable of being activated when complexed with the guide nucleic acid and the target sequence, and at least one population of a single stranded reporter nucleic acid comprising a detection moiety. A user can thus add the biological sample of interest to a well of the pre-aliquoted PCR plate and measure the detectable signal with a fluorescent light reader or a visible light reader.


In some embodiments, a kit for modifying a target nucleic acid comprises a support medium; a guide nucleic acid targeting a target sequence; and a programmable nuclease capable of being activated when complexed with the guide nucleic acid and the target sequence.


In some embodiments, a kit for modifying a target nucleic acid comprises a PCR plate; a guide nucleic acid targeting a target sequence; and a programmable nuclease capable of being activated when complexed with the guide nucleic acid and the target sequence. The wells of the PCR plate can be pre-aliquoted with the guide nucleic acid targeting a target sequence, and a programmable nuclease capable of being activated when complexed with the guide nucleic acid and the target sequence. A user can thus add the biological sample of interest to a well of the pre-aliquoted PCR plate.


In some instances, such kits may include a package, carrier, or container that is compartmentalized to receive one or more containers such as vials, tubes, and the like, each of the container(s) comprising one of the separate elements to be used in a method described herein.


Suitable containers include, for example, test wells, bottles, vials, and test tubes. In one embodiment, the containers are formed from a variety of materials such as glass, plastic, or polymers.


The kits or systems described herein contain packaging materials. Examples of packaging materials include, but are not limited to, pouches, blister packs, bottles, tubes, bags, containers, and any packaging material suitable for the intended mode of use.


A kit typically includes labels listing contents and/or instructions for use, and package inserts with instructions for use. A set of instructions will also typically be included. In one embodiment, a label is on or associated with the container. In some instances, a label is on a container when letters, numbers or other characters forming the label are attached, molded or etched into the container itself; a label is associated with a container when it is present within a receptacle or carrier that also holds the container, e.g., as a package insert. In one embodiment, a label is used to indicate that the contents are to be used for a specific therapeutic application. The label also indicates directions for use of the contents, such as in the methods described herein.


After packaging the formed product and wrapping or boxing to maintain a sterile barrier, the product may be terminally sterilized by heat sterilization, gas sterilization, gamma irradiation, or by electron beam sterilization. Alternatively, the product may be prepared and packaged by aseptic processing.


Unless otherwise defined, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. Any reference to “or” herein is intended to encompass “and/or” unless otherwise stated.


As used herein, the term “comprising” and its grammatical equivalents specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.


Unless specifically stated or obvious from context, as used herein, the term “about” in reference to a number or range of numbers is understood to mean the stated number and numbers +/−10% thereof, or 10% below the lower listed limit and 10% above the higher listed limit for the values listed for a range.
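By way of illustration only, the definition of “about” given above can be expressed as a short calculation. The following Python sketch is an assumption-laden example and not part of the disclosed methods; the function names `about_number` and `about_range` and the `tolerance` parameter are hypothetical names introduced here.

```python
import math


def about_number(value: float, tolerance: float = 0.10) -> tuple[float, float]:
    """Return the (low, high) interval covered by "about value".

    Follows the definition in the text: the stated number and
    numbers +/-10% thereof.
    """
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def about_range(low: float, high: float, tolerance: float = 0.10) -> tuple[float, float]:
    """Return the interval covered by "about low to high".

    Follows the definition in the text: 10% below the lower listed
    limit and 10% above the higher listed limit.
    """
    return (low - abs(low) * tolerance, high + abs(high) * tolerance)


if __name__ == "__main__":
    # "about 37" as a single number
    print(about_number(37.0))
    # "about 20 to 45" as a range
    print(about_range(20.0, 45.0))
```

Under this reading, a single value of about 37 spans roughly 33.3 to 40.7, and a range of about 20 to 45 spans roughly 18 to 49.5.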


As used herein the terms “individual,” “subject,” and “patient” are used interchangeably and include any member of the animal kingdom, including humans.


Methods of the disclosure can be performed in a subject. Compositions of the disclosure can be administered to a subject. A subject can be a human. A subject can be a mammal (e.g., rat, mouse, cow, dog, pig, sheep, horse). A subject can be a vertebrate or an invertebrate. A subject can be a laboratory animal. A subject can be a patient. A subject can be suffering from a disease. A subject can display symptoms of a disease. A subject may not display symptoms of a disease, but still have a disease. A subject can be under medical care of a caregiver (e.g., the subject is hospitalized and is treated by a physician). A subject can be a plant or a crop.


Methods of the disclosure can be performed in a cell. A cell can be in vitro. A cell can be in vivo. A cell can be ex vivo. A cell can be an isolated cell. A cell can be a cell inside of an organism. A cell can be an organism. A cell can be a cell in a cell culture. A cell can be one of a collection of cells. A cell can be a mammalian cell or derived from a mammalian cell. A cell can be a rodent cell or derived from a rodent cell. A cell can be a human cell or derived from a human cell. A cell can be a prokaryotic cell or derived from a prokaryotic cell. A cell can be a bacterial cell or can be derived from a bacterial cell. A cell can be an archaeal cell or derived from an archaeal cell. A cell can be a eukaryotic cell or derived from a eukaryotic cell. A cell can be a pluripotent stem cell. A cell can be a plant cell or derived from a plant cell. A cell can be an animal cell or derived from an animal cell. A cell can be an invertebrate cell or derived from an invertebrate cell. A cell can be a vertebrate cell or derived from a vertebrate cell. A cell can be a microbe cell or derived from a microbe cell. A cell can be a fungi cell or derived from a fungi cell. A cell can be from a specific organ or tissue.


Methods of the disclosure can be performed in a eukaryotic cell or cell line. In some embodiments, the eukaryotic cell is a Chinese hamster ovary (CHO) cell. In some embodiments, the eukaryotic cell is a human embryonic kidney 293 cell (also referred to as HEK or HEK 293). In some embodiments, the eukaryotic cell is a K562 cell.


Non-limiting examples of cell lines that can be used with the disclosure include C8161, CCRF-CEM, MOLT, mIMCD-3, NHDF, HeLa-S3, Huh1, Huh4, Huh7, HUVEC, HASMC, HEKn, HEKa, MiaPaCell, Panc1, PC-3, TF1, CTLL-2, CIR, Rat6, CV1, RPTE, A10, T24, J82, A375, ARH-77, Calu1, SW480, SW620, SKOV3, SK-UT, CaCo2, P388D1, SEM-K2, WEHI-231, HB56, TIB55, Jurkat, J45.01, LRMB, Bcl-1, BC-3, IC21, DLD2, Raw264.7, NRK, NRK-52E, MRCS, MEF, Hep G2, HeLa B, HeLa T4, COS, COS-1, COS-6, COS-M6A, BS-C-1 monkey kidney epithelial, BALB/3T3 mouse embryo fibroblast, 3T3 Swiss, 3T3-L1, 132-d5 human fetal fibroblasts; 10.1 mouse fibroblasts, 293-T, 3T3, 721, 9L, A2780, A2780ADR, A2780cis, A172, A20, A253, A431, A-549, ALC, B16, B35, BCP-1 cells, BEAS-2B, bEnd.3, BHK-21, BR 293, BxPC3, C3H-10T1/2, C6/36, Cal-27, CHO, CHO-7, CHO—IR, CHO-K1, CHO-K2, CHO-T, CHO Dhfr−/−, COR-L23, COR-L23/CPR, COR-L23/5010, COR-L23/R23, COS-7, COV-434, CML T1, CMT, CT26, D17, DH82, DU145, DuCaP, EL4, EM2, EM3, EMT6/AR1, EMT6/AR10.0, FM3, H1299, H69, HB54, HB55, HCA2, HEK-293, HeLa, Hepa1-6, Hepa1 cic7, HL-60, HMEC, HT-29, Jurkat, JY cells, K562 cells, Ku812, KCL22, KG1, KYO1, LNCap, Ma-Mel 1-48, MC-38, MCF-7, MCF-10A, MDA-MB-231, MDA-MB-468, MDA-MB-435, MDCK II, MDCK II, MOR/0.2R, MONO-MAC 6, MTD-1A, MyEnd, NCI-H69/CPR, NCI-H69/LX10, NCI-H69/LX20, NCI-H69/LX4, NIH-3T3, NALM-1, NW-145, OPCN/OPCT cell lines, Peer, PNT-1A/PNT 2, RenCa, RIN-5F, RMA/RMA5, Saos-2 cells, Sf-9, SkBr3, T2, T-47D, T84, THP1 cell line, U373, U87, U937, VCaP, Vero cells, WM39, WT-49, X63, YAC-1, and YAR. Non-limiting examples of other cells that can be used with the disclosure include immune cells, such as CART, T-cells, B-cells, NK cells, granulocytes, basophils, eosinophils, neutrophils, mast cells, monocytes, macrophages, dendritic cells, antigen-presenting cells (APC), or adaptive cells. 
Non-limiting examples of cells that can be used with this disclosure also include plant cells, such as parenchyma, sclerenchyma, collenchyma, xylem, phloem, and germline (e.g., pollen) cells, and cells from lycophytes, ferns, gymnosperms, angiosperms, bryophytes, charophytes, chlorophytes, rhodophytes, or glaucophytes. Non-limiting examples of cells that can be used with this disclosure also include stem cells, such as human stem cells, animal stem cells, stem cells that are not derived from human embryonic stem cells, embryonic stem cells, mesenchymal stem cells, pluripotent stem cells, induced pluripotent stem cells (iPS), somatic stem cells, adult stem cells, hematopoietic stem cells, and tissue-specific stem cells.


Methods described herein may be used to create populations of cells comprising at least one of the cells described herein. In some cases, a population of cells comprises a non-naturally occurring composition described herein.


Compositions of the disclosure include populations of cells, or any progeny thereof, comprising other compositions described herein or that have been modified by the methods described herein.


Methods described herein may include producing a protein from a cell or a population of cells described herein. In some cases, the method comprises producing a protein, an industrial protein, or a protein at large scale using a cell provided for herein that has been modified by any of the methods described herein. In some cases, a rodent cell or CHO cell is modified by a nuclease or Cas enzyme described herein and is later used, expanded, or cultured for protein production. In some cases, a derivative or progeny of a modified CHO cell, as described herein, is used, expanded, or cultured for protein production. A method of protein production may further comprise a donor template, additional guide RNA, a buffer, a protease inhibitor, a nuclease inhibitor, or a detergent.


EXAMPLES

The following examples are included to further describe some aspects of the present disclosure and should not be used to limit the scope of the invention.


Example 1

Human Codon Optimized CasΦ polypeptide


Human codon-optimized nucleotide sequences of illustrative CasΦ polypeptides were prepared. TABLE 4 provides human codon optimized nucleotide sequences of illustrative CasΦ polypeptides that are suitable for use with the methods and compositions of the disclosure.
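As a non-limiting illustration, the correspondence between a codon optimized nucleotide sequence and its amino acid sequence can be checked by translating the nucleotide sequence with the standard genetic code. The Python sketch below is an example under stated assumptions and is not the method used to prepare the sequences of TABLE 4. The names `CODON_TABLE` and `translate` are hypothetical, and the input is only the first 45 nucleotides of the CasΦ.2 entry, obtained by joining the first two wrapped nucleotide fragments of that row.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order
# for each codon position; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon.

    Trailing bases that do not form a complete codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # First 45 nucleotides of the CasΦ.2 codon optimized sequence (TABLE 4).
    casphi2_start = "ATGCCTAAGCCTGCCGTGGAAAGCGAGTTCAGCAAGGTGCTGAAG"
    print(translate(casphi2_start))
```

The translated fragment can be compared with the first amino acid fragment listed for CasΦ.2 (MPKPAVESEFSKVLK). A full-length check would require joining all wrapped fragments of a row before translation.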









TABLE 4
Human codon optimized nucleotide sequences

Name
Endogenous Amino Acid Sequence
Human Codon Optimized Nucleotide Sequence





CasΦ.2
MPKPAVESEFSKVLK
ATGCCTAAGCCTGCCGTGGAAAGCGAGTTCAG



KHFPGERFRSSYMKR
CAAGGTGCTGAAGAAGCACTTCCCCGGCGAGC



GGKILAAQGEEAVVA
GGTTCAGATCCAGCTACATGAAGAGAGGCGGC



YLQGKSEEEPPNFQPP
AAGATCCTGGCCGCTCAAGGCGAAGAAGCCGT



AKCHVVTKSRDFAE
GGTCGCATATCTGCAGGGCAAGAGCGAGGAA



WPIMKASEAIQRYIYA
GAACCTCCTAACTTCCAGCCTCCTGCCAAGTG



LSTTERAACKPGKSSE
CCACGTGGTCACCAAGAGCAGAGATTTCGCCG



SHAAWFAATGVSNH
AGTGGCCCATCATGAAGGCCTCTGAAGCCATC



GYSHVQGLNLIFDHT
CAGCGGTACATCTACGCCCTGAGCACAACAGA



LGRYDGVLKKVQLR
AAGAGCCGCCTGCAAGCCTGGCAAGAGCAGC



NEKARARLESINASR
GAATCTCACGCCGCTTGGTTTGCCGCTACCGG



ADEGLPEIKAEEEEVA
CGTGTCCAATCACGGCTACTCTCATGTGCAGG



TNETGHLLQPPGINPS
GCCTGAACCTGATCTTCGATCACACCCTGGGC



FYVYQTISPQAYRPRD
AGATACGACGGCGTGCTGAAAAAGGTGCAGC



EIVLPPEYAGYVRDPN
TGCGGAACGAGAAGGCCAGAGCCAGACTGGA



APIPLGVVRNRCDIQK
ATCCATCAACGCCAGCAGAGCCGATGAGGGCC



GCPGYIPEWQREAGT
TGCCTGAGATTAAGGCCGAAGAGGAAGAGGT



AISPKTGKAVTVPGLS
GGCCACAAACGAAACCGGCCATCTGCTGCAGC



PKKNKRMRRYWRSE
CACCTGGCATCAACCCTAGCTTCTACGTGTAC



KEKAQDALLVTVRIG
CAGACAATCAGCCCTCAGGCCTACAGACCCAG



TDWVVIDVRGLLRNA
GGACGAGATTGTGCTGCCTCCTGAGTATGCCG



RWRTIAPKDISLNALL
GCTACGTGCGGGATCCCAACGCTCCTATTCCT



DLFTGDPVIDVRRNIV
CTGGGCGTCGTGCGGAACAGATGCGACATCCA



TFTYTLDACGTYARK
GAAAGGCTGCCCCGGCTACATTCCCGAGTGGC



WTLKGKQTKATLDK
AGAGAGAAGCCGGCACCGCCATTTCTCCAAAG



LTATQTVALVAIDLG
ACAGGCAAAGCCGTGACCGTGCCTGGCCTGTC



QTNPISAGISRVTQEN
TCCTAAGAAAAACAAGCGGATGCGGCGGTACT



GALQCEPLDRFTLPD
GGCGGAGCGAGAAAGAAAAAGCCCAGGACGC



DLLKDISAYRIAWDR
CCTGCTGGTCACAGTGCGGATTGGCACAGATT



NEEELRARSVEALPE
GGGTCGTGATCGATGTGCGCGGCCTGCTGAGA



AQQAEVRALDGVSKE
AATGCCAGATGGCGGACAATCGCCCCTAAGGA



TARTQLCADFGLDPK
CATCAGCCTGAACGCACTGCTGGACCTGTTCA



RLPWDKMSSNTTFISE
CCGGCGATCCTGTGATTGACGTGCGGCGGAAC



ALLSNSVSRDQVFFTP
ATCGTGACCTTCACCTACACACTGGACGCCTG



APKKGAKKKAPVEV
CGGCACCTACGCCAGAAAGTGGACACTGAAG



MRKDRTWARAYKPR
GGCAAGCAGACCAAGGCCACTCTGGACAAGC



LSVEAQKLKNEALW
TGACCGCCACACAGACAGTGGCCCTGGTGGCT



ALKRTSPEYLKLSRR
ATTGATCTGGGCCAGACAAACCCTATCAGCGC



KEELCRRSINYVIEKT
CGGCATCAGCAGAGTGACCCAAGAAAATGGC



RRRTQCQIVIPVIEDL
GCCCTGCAGTGCGAGCCCCTGGACAGATTCAC



NVRFFHGSGKRLPGW
ACTGCCCGACGACCTGCTGAAGGACATCTCCG



DNFFTAKKENRWFIQ
CCTATAGAATCGCCTGGGACCGCAATGAAGAG



GLHKAFSDLRTHRSF
GAACTGAGAGCCAGAAGCGTGGAAGCCCTGC



YVFEVRPERTSITCPK
CTGAAGCACAGCAGGCTGAAGTGCGAGCACT



CGHCEVGNRDGEAFQ
GGACGGGGTGTCCAAAGAGACAGCCAGAACT



CLSCGKTCNADLDVA
CAGCTGTGCGCCGACTTTGGACTGGACCCCAA



THNLTQVALTGKTMP
AAGACTGCCCTGGGACAAGATGAGCAGCAAC



KREEPRDAQGTAPAR
ACCACCTTCATCAGCGAGGCCCTGCTGAGCAA



KTKKASKSKAPPAER
TAGCGTGTCCAGAGATCAGGTGTTCTTCACCC



EDQTPAQEPSQTS
CTGCTCCAAAGAAGGGCGCCAAGAAGAAAGC



(SEQ ID NO: 2)
CCCTGTCGAAGTGATGCGGAAGGACCGGACAT




GGGCCAGAGCTTACAAGCCCAGACTGTCCGTG




GAAGCTCAGAAGCTGAAGAACGAAGCCCTGT




GGGCCCTGAAGAGAACAAGCCCCGAGTACCT




GAAGCTGAGCCGGCGGAAAGAAGAACTCTGC




CGGCGGAGCATCAACTACGTGATCGAGAAAA




CCCGGCGGAGAACCCAGTGCCAGATCGTGATT




CCTGTGATCGAGGACCTGAACGTGCGGTTCTT




TCACGGCAGCGGCAAGAGACTGCCCGGCTGG




GATAATTTCTTCACCGCCAAAAAAGAAAACCG




GTGGTTCATCCAGGGCCTGCACAAGGCCTTCA




GCGACCTGAGAACCCACCGGTCCTTTTACGTG




TTCGAAGTGCGGCCCGAGCGGACCAGCATCAC




CTGTCCTAAATGCGGCCACTGCGAAGTGGGCA




ACAGAGATGGCGAGGCCTTCCAGTGTCTGAGC




TGTGGCAAGACCTGCAACGCCGACCTGGATGT




GGCCACTCACAATCTGACACAGGTGGCCCTGA




CCGGCAAGACCATGCCTAAGAGAGAGGAACC




TAGGGACGCCCAGGGTACAGCCCCTGCCAGAA




AGACAAAGAAAGCCAGCAAGAGCAAGGCCCC




TCCTGCCGAGAGAGAAGATCAGACCCCAGCTC




AAGAGCCCAGCCAGACATCT (SEQ ID NO: 1405)





CasΦ.4
MEKEITELTKIRREFP
ATGGAAAAAGAGATCACCGAGCTGACCAAGA



NKKFSSTDMKKAGKL
TCCGCAGAGAGTTCCCCAACAAGAAGTTCAGC



LKAEGPDAVRDFLNS
AGCACCGACATGAAGAAGGCCGGCAAGCTGC



CQEIIGDFKPPVKTNI
TGAAGGCCGAAGGACCTGATGCCGTGCGGGA



VSISRPFEEWPVSMVG
CTTCCTGAACAGCTGCCAAGAGATCATCGGCG



RAIQEYYFSLTKEELE
ACTTCAAGCCTCCAGTCAAGACCAACATCGTG



SVHPGTSSEDHKSFFN
TCCATCAGCAGACCCTTCGAGGAATGGCCCGT



ITGLSNYNYTSVQGL
GTCCATGGTTGGACGGGCCATCCAAGAGTACT



NLIFKNAKAIYDGTLV
ACTTCAGCCTGACCAAAGAGGAACTGGAAAG



KANNKNKKLEKKFN
CGTTCACCCCGGCACCAGCAGCGAGGACCACA



EINHKRSLEGLPIITPD
AGAGCTTTTTCAACATCACCGGCCTGAGCAAC



FEEPFDENGHLNNPPG
TACAACTACACCAGCGTGCAGGGCCTGAACCT



INRNIYGYQGCAAKV
GATCTTCAAGAACGCCAAGGCCATCTACGACG



FVPSKHKMVSLPKEY
GCACCCTGGTCAAGGCCAACAACAAGAACAA



EGYNRDPNLSLAGFR
GAAGCTCGAGAAGAAGTTTAACGAGATCAAC



NRLEIPEGEPGHVPWF
CACAAGCGGAGCCTGGAAGGCCTGCCTATCAT



QRMDIPEGQIGHVNKI
CACCCCTGATTTCGAGGAACCCTTCGACGAGA



QRFNFVHGKNSGKVK
ACGGCCACCTGAACAACCCTCCAGGCATCAAC



FSDKTGRVKRYHHSK
CGGAACATCTACGGCTATCAGGGCTGCGCCGC



YKDATKPYKFLEESK
CAAGGTGTTCGTGCCTTCTAAGCACAAGATGG



KVSALDSILAIITIGDD
TGTCCCTGCCTAAAGAGTACGAGGGCTACAAC



WVVFDIRGLYRNVFY
AGGGACCCCAACCTGTCTCTGGCCGGCTTCAG



RELAQKGLTAVQLLD
AAACAGACTGGAAATCCCTGAGGGCGAGCCT



LFTGDPVIDPKKGVV
GGCCATGTGCCATGGTTCCAGAGAATGGATAT



TFSYKEGVVPVFSQKI
CCCCGAGGGCCAGATCGGACACGTGAACAAG



VPRFKSRDTLEKLTSQ
ATCCAGCGGTTCAACTTCGTGCACGGCAAGAA



GPVALLSVDLGQNEP
CAGCGGCAAAGTGAAGTTCTCCGACAAGACCG



VAARVCSLKNINDKIT
GCAGAGTGAAGAGATACCACCACAGCAAGTA



LDNSCRISFLDDYKK
CAAGGACGCTACCAAGCCTTACAAGTTCCTGG



QIKDYRDSLDELEIKI
AAGAGTCCAAGAAGGTGTCAGCCCTGGACAG



RLEAINSLETNQQVEI
CATCCTGGCCATCATCACAATCGGCGACGACT



RDLDVFSADRAKANT
GGGTCGTGTTCGACATCAGAGGCCTGTACCGG



VDMFDIDPNLISWDS
AACGTGTTCTACAGAGAGCTGGCCCAGAAAGG



MSDARVSTQISDLYL
CCTGACAGCTGTGCAACTGCTGGACCTGTTTA



KNGGDESRVYFEINN
CCGGCGATCCCGTGATCGACCCCAAGAAAGGC



KRIKRSDYNISQLVRP
GTGGTCACCTTCAGCTACAAAGAGGGCGTCGT



KLSDSTRKNLNDSIW
CCCCGTCTTTAGCCAGAAAATCGTGCCCCGGT



KLKRTSEEYLKLSKR
TCAAGAGCCGGGACACCCTGGAAAAGCTGAC



KLELSRAVVNYTIRQS
CTCTCAGGGACCTGTGGCTCTGCTGTCTGTGG



KLLSGINDIVIILEDLD
ACCTGGGACAGAATGAACCTGTGGCCGCCAGA



VKKKFNGRGIRDIGW
GTGTGCAGCCTGAAGAACATCAACGACAAGAT



DNFFSSRKENRWFIPA
CACCCTGGACAACTCTTGCCGGATCAGCTTCC



FHKAFSELSSNRGLCV
TGGACGACTACAAGAAGCAGATCAAGGACTA



IEVNPAWTSATCPDC
CAGAGACAGCCTGGACGAGCTGGAAATCAAG



GFCSKENRDGINFTCR
ATCCGGCTGGAAGCCATCAACTCCCTCGAGAC



KCGVSYHADIDVATL
AAACCAGCAGGTCGAGATCAGAGATCTGGAC



NIARVAVLGKPMSGP
GTGTTCAGCGCCGACCGGGCCAAAGCCAATAC



ADRERLGDTKKPRVA
CGTGGACATGTTTGACATCGACCCTAACCTGA



RSRKTMKRKDISNST
TCAGCTGGGACTCCATGAGCGACGCCAGAGTC



VEAMVTA (SEQ ID
AGCACCCAGATCAGCGACCTGTACCTGAAGAA



NO: 4)
TGGCGGCGACGAGAGCCGGGTGTACTTTGAGA




TTAACAACAAACGGATTAAGCGGAGCGACTAC




AACATCAGCCAGCTCGTGCGGCCCAAGCTGAG




CGATAGCACCAGAAAGAACCTGAACGACAGC




ATCTGGAAGCTGAAGCGGACCAGCGAGGAAT




ACCTGAAGCTGAGCAAGCGGAAGCTGGAACT




GAGCAGAGCCGTCGTGAATTACACCATCCGGC




AGAGCAAACTGCTGAGCGGCATCAATGACATC




GTGATCATTCTCGAGGACCTGGACGTGAAGAA




GAAATTCAACGGCAGAGGCATCCGCGATATCG




GCTGGGACAACTTCTTCAGCTCCCGGAAAGAA




AACCGGTGGTTCATCCCCGCCTTCCACAAGGC




CTTTAGCGAGCTGAGCAGCAACAGGGGCCTGT




GCGTGATCGAAGTGAATCCTGCCTGGACCAGC




GCCACCTGTCCTGATTGTGGCTTCTGCAGCAA




AGAAAACAGAGATGGCATCAACTTCACGTGCC




GGAAGTGCGGCGTGTCCTACCACGCCGATATT




GACGTGGCCACACTGAATATTGCCAGAGTGGC




CGTGCTGGGCAAGCCTATGTCTGGACCTGCCG




ACAGAGAGAGACTGGGCGACACCAAGAAACC




TAGAGTGGCCCGCAGCAGAAAGACCATGAAG




CGGAAGGACATCAGCAACAGCACCGTCGAGG




CCATGGTTACAGCT (SEQ ID NO: 1406)





CasΦ.11
MSNTAVSTREHMSNK
ATGAGCAACACCGCCGTGTCCACCAGAGAACA



TTPPSPLSLLLRAHFP
CATGTCCAACAAGACAACCCCTCCATCTCCTC



GLKFESQDYKIAGKK
TGAGCCTGCTGCTGAGAGCCCACTTTCCTGGC



LRDGGPEAVISYLTG
CTGAAGTTCGAGAGCCAGGACTACAAGATCGC



KGQAKLKDVKPPAK
CGGCAAGAAACTGAGAGATGGCGGACCTGAG



AFVIAQSRPFIEWDLV
GCCGTGATCAGCTACCTGACTGGAAAAGGCCA



RVSRQIQEKIFGIPATK
GGCCAAGCTGAAGGACGTGAAGCCTCCTGCCA



GRPKQDGLSETAFNE
AGGCCTTTGTGATCGCCCAGAGCAGACCCTTC



AVASLEVDGKSKLNE
ATCGAGTGGGACCTCGTCAGAGTGTCCCGGCA



ETRAAFYEVLGLDAP
GATCCAAGAGAAGATCTTTGGCATCCCCGCCA



SLHAQAQNALIKSAIS
CCAAGGGCAGACCTAAGCAAGATGGCCTGAG



IREGVLKKVENRNEK
CGAGACAGCCTTCAACGAAGCCGTGGCCAGCC



NLSKTKRRKEAGEEA
TGGAAGTGGACGGCAAGAGCAAGCTGAACGA



TFVEEKAHDERGYLI
GGAAACCAGAGCCGCCTTCTACGAGGTGCTGG



HPPGVNQTIPGYQAV
GACTTGATGCCCCAAGCCTGCATGCTCAGGCC



VIKSCPSDFIGLPSGCL
CAGAATGCCCTGATCAAGAGCGCCATCAGCAT



AKESAEALTDYLPHD
CAGAGAAGGCGTGCTGAAGAAGGTGGAAAAC



RMTIPKGQPGYVPEW
CGGAACGAGAAGAACCTGAGCAAGACCAAGC



QHPLLNRRKNRRRRD
GGCGGAAAGAGGCTGGCGAAGAGGCCACCTT



WYSASLNKPKATCSK
TGTGGAAGAGAAGGCCCACGACGAGCGGGGC



RSGTPNRKNSRTDQIQ
TATCTGATTCATCCTCCTGGCGTGAACCAGAC



SGRFKGAIPVLMRFQ
AATCCCCGGCTATCAGGCCGTGGTCATCAAGA



DEWVIIDIRGLLRNAR
GCTGCCCCAGCGATTTCATCGGCCTGCCTAGT



YRKLLKEKSTIPDLLS
GGCTGTCTGGCCAAAGAGTCTGCCGAGGCTCT



LFTGDPSIDMRQGVC
GACCGATTACCTGCCTCACGACCGGATGACTA



TFIYKAGQACSAKMV
TCCCCAAGGGACAGCCTGGCTATGTGCCCGAA



KTKNAPEILSELTKSG
TGGCAGCACCCTCTGCTGAACAGAAGAAAGA



PVVLVSIDLGQTNPIA
ACCGGCGCAGAAGAGACTGGTACAGCGCCAG



AKVSRVTQLSDGQLS
CCTGAACAAGCCCAAGGCCACCTGTAGCAAGA



HETLLRELLSNDSSDG
GATCCGGCACACCCAACCGGAAGAACAGCAG



KEIARYRVASDRLRD
AACCGACCAGATCCAGAGCGGCAGATTCAAG



KLANLAVERLSPEHK
GGCGCCATTCCTGTGCTGATGCGGTTCCAGGA



SEILRAKNDTPALCKA
TGAGTGGGTCATCATCGACATCCGGGGCCTGC



RVCAALGLNPEMIAW
TGAGAAACGCCCGGTATCGGAAGCTGCTGAAA



DKMTPYTEFLATAYL
GAGAAGTCCACCATTCCTGACCTGCTGAGCCT



EKGGDRKVATLKPKN
GTTCACCGGCGATCCCAGCATCGATATGAGAC



RPEMLRRDIKFKGTE
AGGGCGTGTGCACCTTCATCTACAAGGCCGGC



GVRIEVSPEAAEAYRE
CAGGCCTGTAGCGCCAAGATGGTCAAGACAA



AQWDLQRTSPEYLRL
AGAACGCCCCTGAGATCCTGTCCGAGCTGACC



STWKQELTKRILNQL
AAGTCTGGACCTGTGGTGCTGGTGTCCATCGA



RHKAAKSSQCEVVV
CCTGGGCCAGACAAATCCTATCGCCGCCAAGG



MAFEDLNIKMMHGN
TGTCCAGAGTGACCCAGCTGTCTGATGGCCAG



GKWADGGWDAFFIK
CTGAGCCACGAGACACTGCTGAGGGAACTGCT



KRENRWFMQAFHKS
GAGCAACGATAGCAGCGACGGCAAAGAGATC



LTELGAHKGVPTIEVT
GCCCGGTACAGAGTGGCCAGCGACAGACTGA



PHRTSITCTKCGHCDK
GAGACAAGCTGGCCAATCTGGCCGTGGAAAG



ANRDGERFACQKCGF
ACTGAGCCCTGAGCACAAGAGCGAGATCCTGA



VAHADLEIATDNIERV
GAGCCAAGAACGACACCCCTGCTCTGTGCAAG



ALTGKPMPKPESERS
GCCAGAGTGTGTGCTGCCCTGGGACTGAACCC



GDAKKSVGARKAAF
TGAAATGATCGCCTGGGACAAGATGACCCCTT



KPEEDAEAAE (SEQ
ACACCGAGTTTCTGGCCACCGCCTACCTGGAA



ID NO: 2468)
AAAGGCGGCGACAGAAAAGTGGCCACACTGA




AGCCCAAGAACAGACCCGAGATGCTGCGGCG




GGACATCAAGTTCAAGGGAACCGAGGGCGTC




AGAATCGAGGTGTCACCTGAAGCCGCCGAGGC




CTATAGAGAAGCCCAGTGGGATCTGCAGAGG




ACAAGCCCCGAGTACCTGAGACTGTCCACCTG




GAAGCAAGAGCTGACAAAGAGAATCCTGAAC




CAGCTGCGGCACAAGGCCGCCAAAAGCAGCC




AGTGTGAAGTGGTGGTCATGGCCTTCGAGGAC




CTGAACATCAAGATGATGCACGGCAACGGCA




AGTGGGCCGATGGTGGATGGGATGCCTTCTTC




ATCAAGAAACGCGAGAACCGGTGGTTCATGCA




GGCCTTCCACAAGAGCCTGACAGAGCTGGGAG




CACACAAGGGCGTGCCAACCATCGAAGTGACC




CCTCACAGAACCAGCATCACCTGTACCAAGTG




CGGCCACTGCGACAAGGCCAACAGAGATGGG




GAGAGATTCGCCTGCCAGAAATGCGGCTTTGT




GGCCCACGCCGATCTGGAAATCGCCACCGACA




ACATCGAGAGAGTGGCCCTGACAGGCAAGCC




CATGCCTAAGCCTGAGAGCGAGAGAAGCGGC




GACGCCAAGAAATCTGTGGGAGCCAGAAAGG




CCGCCTTCAAGCCTGAGGAAGATGCCGAAGCT




GCCGAG (SEQ ID NO: 1407)





CasΦ.12
MIKPTVSQFLTPGFKL
ATGATCAAGCCTACCGTCAGCCAGTTTCTGAC



IRNHSRTAGLKLKNE
CCCTGGCTTCAAGCTGATCCGGAACCACTCTA



GEEACKKFVRENEIPK
GAACAGCCGGCCTGAAGCTGAAGAACGAGGG



DECPNFQGGPAIANII
CGAAGAGGCCTGCAAGAAATTCGTGCGCGAG



AKSREFTEWEIYQSSL
AACGAGATCCCCAAGGACGAGTGCCCCAACTT



AIQEVIFTLPKDKLPEP
TCAAGGCGGACCCGCCATTGCCAACATCATTG



ILKEEWRAQWLSEHG
CCAAGAGCCGCGAGTTCACCGAGTGGGAGATC



LDTVPYKEAAGLNLII
TACCAGTCTAGCCTGGCCATCCAAGAAGTGAT



KNAVNTYKGVQVKV
CTTCACCCTGCCTAAGGACAAGCTGCCCGAGC



DNKNKNNLAKINRKN
CTATCCTGAAAGAGGAATGGCGAGCCCAGTGG



EIAKLNGEQEISFEEIK
CTGTCTGAGCACGGACTGGATACCGTGCCTTA



AFDDKGYLLQKPSPN
CAAAGAAGCCGCCGGACTGAACCTGATCATCA



KSIYCYQSVSPKPFITS
AGAACGCCGTGAACACCTACAAGGGCGTGCA



KYHNVNLPEEYIGYY
AGTGAAGGTGGACAACAAGAACAAAAACAAC



RKSNEPIVSPYQFDRL
CTGGCCAAGATCAACCGGAAGAATGAGATCG



RIPIGEPGYVPKWQYT
CCAAGCTGAACGGCGAGCAAGAGATCAGCTTC



FLSKKENKRRKLSKRI
GAGGAAATCAAGGCCTTCGACGACAAGGGCT



KNVSPILGIICIKKDW
ACCTGCTGCAGAAGCCCTCTCCAAACAAGAGC



CVFDMRGLLRTNHW
ATCTACTGCTACCAGAGCGTGTCCCCTAAGCC



KKYHKPTDSINDLFD
TTTCATCACCAGCAAGTACCACAACGTGAACC



YFTGDPVIDTKANVV
TGCCTGAAGAGTACATCGGCTACTACCGGAAG



RFRYKMENGIVNYKP
TCCAACGAGCCCATCGTGTCCCCATACCAGTT



VREKKGKELLENICD
CGACAGACTGCGGATCCCTATCGGCGAGCCTG



QNGSCKLATVDVGQ
GCTATGTGCCTAAGTGGCAGTACACCTTCCTG



NNPVAIGLFELKKVN
AGCAAGAAAGAGAACAAGCGGCGGAAGCTGA



GELTKTLISRHPTPIDF
GCAAGCGGATCAAGAATGTGTCCCCAATCCTG



CNKITAYRERYDKLE
GGCATCATCTGCATCAAGAAAGATTGGTGCGT



SSIKLDAIKQLTSEQKI
GTTCGACATGCGGGGCCTGCTGAGAACAAACC



EVDNYNNNFTPQNTK
ACTGGAAGAAGTATCACAAGCCCACCGACAG



QIVCSKLNINPNDLPW
CATCAACGACCTGTTCGACTACTTCACCGGCG



DKMISGTHFISEKAQV
ATCCCGTGATCGACACCAAGGCCAATGTCGTG



SNKSEIYFTSTDKGKT
CGGTTCCGGTACAAGATGGAAAACGGCATCGT



KDVMKSDYKWFQDY
GAACTACAAGCCCGTGCGGGAAAAGAAGGGC



KPKLSKEVRDALSDIE
AAAGAGCTGCTGGAAAACATCTGCGACCAGA



WRLRRESLEFNKLSK
ACGGCAGCTGCAAGCTGGCCACAGTGGATGTG



SREQDARQLANWISS
GGCCAGAACAACCCTGTGGCCATCGGCCTGTT



MCDVIGIENLVKKNN
CGAGCTGAAAAAAGTGAACGGGGAGCTGACC



FFGGSGKREPGWDNF
AAGACACTGATCAGCAGACACCCCACACCTAT



YKPKKENRWWINAIH
CGATTTCTGCAACAAGATCACCGCCTACCGCG



KALTELSQNKGKRVI
AGAGATACGACAAGCTGGAAAGCAGCATCAA



LLPAMRTSITCPKCKY
GCTGGACGCCATCAAGCAGCTGACCAGCGAGC



CDSKNRNGEKFNCLK
AGAAAATCGAAGTGGACAACTACAACAACAA



CGIELNADIDVATENL
CTTCACGCCCCAGAACACCAAGCAGATCGTGT



ATVAITAQSMPKPTC
GCAGCAAGCTGAATATCAACCCCAACGATCTG



ERSGDAKKPVRARKA
CCCTGGGACAAGATGATCAGCGGCACCCACTT



KAPEFHDKLAPSYTV
CATCAGCGAGAAGGCCCAGGTGTCCAACAAG



VLREAV (SEQ ID NO:
AGCGAGATCTACTTTACCAGCACCGATAAGGG



12)
CAAGACCAAGGACGTGATGAAGTCCGACTAC




AAGTGGTTCCAGGACTATAAGCCCAAGCTGTC




CAAAGAAGTGCGGGACGCCCTGAGCGATATTG




AGTGGCGGCTGAGAAGAGAGAGCCTGGAATT




CAACAAGCTCAGCAAGAGCAGAGAGCAGGAC




GCCAGACAGCTGGCCAATTGGATCAGCAGCAT




GTGCGACGTGATCGGCATCGAGAACCTGGTCA




AGAAGAACAACTTCTTCGGCGGCAGCGGCAA




GAGAGAACCCGGCTGGGACAACTTCTACAAGC




CGAAGAAAGAAAACCGGTGGTGGATCAACGC




CATCCACAAGGCCCTGACAGAGCTGTCCCAGA




ACAAGGGAAAGAGAGTGATCCTGCTGCCTGCC




ATGCGGACCAGCATCACCTGTCCTAAGTGCAA




GTACTGCGACAGCAAGAACCGCAACGGCGAG




AAGTTCAATTGCCTGAAGTGTGGCATTGAGCT




GAACGCCGACATCGACGTGGCCACCGAAAATC




TGGCTACCGTGGCCATCACAGCCCAGAGCATG




CCTAAGCCAACCTGCGAGAGAAGCGGCGACG




CCAAGAAACCTGTGCGGGCCAGAAAAGCCAA




GGCTCCCGAGTTCCACGATAAGCTGGCCCCTA




GCTACACCGTGGTGCTGAGAGAAGCTGTG




(SEQ ID NO: 1408)





CasΦ.17
MYSLEMADLKSEPSL
ATGTACAGCCTGGAAATGGCCGACCTGAAGTC



LAKLLRDRFPGKYWL
CGAGCCTTCTCTGCTGGCTAAGCTGCTGAGAG



PKYWKLAEKKRLTG
ACAGATTCCCCGGCAAGTACTGGCTGCCTAAG



GEEAACEYMADKQL
TACTGGAAGCTGGCCGAGAAGAAGAGACTGA



DSPPPNFRPPARCVIL
CAGGCGGAGAAGAAGCCGCCTGCGAGTACAT



AKSRPFEDWPVHRVA
GGCTGACAAGCAGCTGGATAGCCCTCCACCTA



SKAQSFVIGLSEQGFA
ACTTCCGGCCTCCAGCCAGATGTGTGATCCTG



ALRAAPPSTADARRD
GCCAAGAGCAGACCCTTCGAGGATTGGCCAGT



WLRSHGASEDDLMA
GCACAGAGTGGCCAGCAAGGCCCAGTCTTTTG



LEAQLLETIMGNAISL
TGATCGGCCTGAGCGAGCAGGGCTTCGCTGCT



HGGVLKKIDNANVK
CTTAGAGCTGCCCCTCCTAGCACAGCCGACGC



AAKRLSGRNEARLNK
CAGAAGAGATTGGCTGAGAAGCCATGGCGCC



GLQELPPEQEGSAYG
AGCGAGGATGATCTGATGGCTCTGGAAGCCCA



ADGLLVNPPGLNLNI
GCTGCTGGAAACCATCATGGGCAACGCCATTT



YCRKSCCPKPVKNTA
CTCTGCACGGCGGCGTGCTGAAGAAGATCGAC



RFVGHYPGYLRDSDSI
AACGCCAACGTGAAGGCCGCCAAGAGACTGT



LISGTMDRLTIIEGMP
CCGGAAGAAACGAGGCCAGACTGAACAAGGG



GHIPAWQREQGLVKP
CCTGCAAGAGCTGCCTCCTGAGCAAGAGGGAT



GGRRRRLSGSESNMR
CTGCCTATGGCGCCGATGGCCTGCTGGTTAAT



QKVDPSTGPRRSTRS
CCTCCTGGCCTGAACCTGAACATCTACTGCAG



GTVNRSNQRTGRNGD
AAAGAGCTGCTGCCCCAAGCCTGTGAAGAACA



PLLVEIRMKEDWVLL
CCGCCAGATTCGTGGGACACTACCCCGGCTAC



DARGLLRNLRWRESK
CTGAGAGACTCCGACAGCATCCTGATCAGCGG



RGLSCDHEDLSLSGLL
CACCATGGACCGGCTGACAATCATCGAGGGAA



ALFSGDPVIDPVRNEV
TGCCCGGACACATCCCCGCCTGGCAACGAGAA



VFLYGEGIIPVRSTKP
CAGGGACTTGTGAAACCTGGCGGCAGAAGGC



VGTRQSKKLLERQAS
GGAGACTGTCTGGCAGCGAGAGCAACATGAG



MGPLTLISCDLGQTNL
ACAGAAGGTGGACCCCAGCACAGGCCCCAGA



IAGRASAISLTHGSLG
AGAAGCACAAGATCCGGCACCGTGAACAGAA



VRSSVRIELDPEIIKSF
GCAACCAGCGGACAGGCAGAAACGGCGATCC



ERLRKDADRLETEILT
TCTGCTGGTGGAAATCCGGATGAAGGAAGATT



AAKETLSDEQRGEVN
GGGTCCTGCTGGACGCCAGAGGCCTGCTGAGA



SHEKDSPQTAKASLC
AATCTGAGATGGCGCGAGTCCAAGAGAGGCCT



RELGLHPPSLPWGQM
GAGCTGCGATCACGAGGATCTGAGCCTGTCTG



GPSTTFIADMLISHGR
GACTGCTGGCCCTGTTTTCTGGCGACCCCGTG



DDDAFLSHGEFPTLE
ATCGATCCTGTGCGGAATGAGGTGGTGTTCCT



KRKKFDKRFCLESRP
GTACGGCGAGGGCATCATTCCAGTGCGGAGCA



LLSSETRKALNESLW
CAAAGCCTGTGGGCACCAGACAGAGCAAGAA



EVKRTSSEYARLSQR
ACTGCTGGAACGGCAGGCCAGCATGGGCCCTC



KKEMARRAVNFVVEI
TGACACTGATCTCTTGTGACCTGGGCCAGACC



SRRKTGLSNVIVNIED
AACCTGATTGCCGGCAGAGCCTCTGCTATCAG



LNVRIFHGGGKQAPG
CCTGACACATGGATCTCTGGGCGTCAGATCCA



WDGFFRPKSENRWFI
GCGTGCGGATTGAGCTGGACCCCGAGATCATC



QAIHKAFSDLAAHHG
AAGAGCTTCGAGCGGCTGAGAAAGGACGCCG



IPVIESDPQRTSMTCPE
ACAGACTGGAAACCGAGATCCTGACCGCCGCC



CGHCDSKNRNGVRFL
AAAGAAACCCTGAGCGACGAACAGAGGGGCG



CKGCGASMDADFDA
AAGTGAACAGCCACGAGAAGGATAGCCCACA



ACRNLERVALTGKPM
GACAGCCAAGGCCAGCCTGTGTAGAGAGCTG



PKPSTSCERLLSATTG
GGACTGCACCCTCCATCTCTGCCTTGGGGACA



KVCSDHSLSHDAIEK
GATGGGCCCTAGCACCACCTTTATCGCCGACA



AS (SEQ ID NO: 17)
TGCTGATCTCCCACGGCAGGGACGATGATGCC




TTTCTGAGCCACGGCGAGTTCCCCACACTGGA




AAAGCGGAAGAAGTTCGATAAGCGGTTCTGCC




TGGAAAGCAGACCCCTGCTGAGCAGCGAGAC




AAGAAAGGCCCTGAACGAGTCCCTGTGGGAA




GTGAAGAGAACCAGCAGCGAGTACGCCCGGC




TGAGCCAGAGAAAGAAAGAGATGGCTAGACG




GGCCGTGAACTTCGTGGTCGAGATCTCCAGAA




GAAAGACCGGCCTGTCCAACGTGATCGTGAAC




ATCGAGGACCTGAACGTGCGGATCTTTCACGG




CGGAGGAAAACAGGCTCCTGGCTGGGATGGCT




TCTTCAGACCCAAGTCCGAGAACCGGTGGTTC




ATCCAGGCCATCCACAAGGCCTTCAGCGATCT




GGCCGCTCACCACGGAATCCCTGTGATCGAGA




GCGACCCTCAGCGGACCAGCATGACCTGTCCT




GAGTGTGGCCACTGCGACAGCAAGAACCGGA




ATGGCGTTCGGTTCCTGTGCAAAGGCTGTGGC




GCCTCCATGGACGCCGATTTTGATGCCGCCTG




CCGGAACCTGGAAAGAGTGGCTCTGACAGGC




AAGCCCATGCCTAAGCCTAGCACCTCCTGTGA




AAGACTGCTGAGCGCCACCACCGGCAAAGTGT




GCTCTGATCACTCCCTGTCTCACGACGCCATCG




AGAAGGCTTCTTAA (SEQ ID NO: 1409)





CasΦ.18
MEKEITELTKIRREFP
ATGGAAAAAGAGATCACCGAGCTGACCAAGA



NKKFSSTDMKKAGKL
TCCGCAGAGAGTTCCCCAACAAGAAGTTCAGC



LKAEGPDAVRDFLNS
AGCACCGACATGAAGAAGGCCGGCAAGCTGC



CQEIIGDFKPPVKTNI
TGAAGGCCGAAGGACCTGATGCCGTGCGGGA



VSISRPFEEWPVSMVG
CTTCCTGAACAGCTGCCAAGAGATCATCGGCG



RAIQEYYFSLTKEELE
ACTTCAAGCCTCCAGTCAAGACCAACATCGTG



SVHPGTSSEDHKSFFN
TCCATCAGCAGACCCTTCGAGGAATGGCCCGT



ITGLSNYNYTSVQGL
GTCCATGGTTGGACGGGCCATCCAAGAGTACT



NLIFKNAKAIYDGTLV
ACTTCAGCCTGACCAAAGAGGAACTGGAAAG



KANNKNKKLEKKFN
CGTTCACCCCGGCACCAGCAGCGAGGACCACA



EINHKRSLEGLPIITPD
AGAGCTTTTTCAACATCACCGGCCTGAGCAAC



FEEPFDENGHLNNPPG
TACAACTACACCAGCGTGCAGGGCCTGAACCT



INRNIYGYQGCAAKV
GATCTTCAAGAACGCCAAGGCCATCTACGACG



FVPSKHKMVSLPKEY
GCACCCTGGTCAAGGCCAACAACAAGAACAA



EGYNRDPNLSLAGFR
GAAGCTCGAGAAGAAGTTTAACGAGATCAAC



NRLEIPEGEPGHVPWF
CACAAGCGGAGCCTGGAAGGCCTGCCTATCAT



QRMDIPEGQIGHVNKI
CACCCCTGATTTCGAGGAACCCTTCGACGAGA



QRFNFVHGKNSGKVK
ACGGCCACCTGAACAACCCTCCAGGCATCAAC



FSDKTGRVKRYHHSK
CGGAACATCTACGGCTATCAGGGCTGCGCCGC



YKDATKPYKFLEESK
CAAGGTGTTCGTGCCTTCTAAGCACAAGATGG



KVSALDSILAIITIGDD
TGTCCCTGCCTAAAGAGTACGAGGGCTACAAC



WVVFDIRGLYRNVFY
AGGGACCCCAACCTGTCTCTGGCCGGCTTCAG



RELAQKGLTAVQLLD
AAACAGACTGGAAATCCCTGAGGGCGAGCCT



LFTGDPVIDPKKGVV
GGCCATGTGCCATGGTTCCAGAGAATGGATAT



TFSYKEGVVPVFSQKI
CCCCGAGGGCCAGATCGGACACGTGAACAAG



VPRFKSRDTLEKLTSQ
ATCCAGCGGTTCAACTTCGTGCACGGCAAGAA



GPVALLSVDLGQNEP
CAGCGGCAAAGTGAAGTTCTCCGACAAGACCG



VAARVCSLKNINDKIT
GCAGAGTGAAGAGATACCACCACAGCAAGTA



LDNSCRISFLDDYKK
CAAGGACGCTACCAAGCCTTACAAGTTCCTGG



QIKDYRDSLDELEIKI
AAGAGTCCAAGAAGGTGTCAGCCCTGGACAG



RLEAINSLETNQQVEI
CATCCTGGCCATCATCACAATCGGCGACGACT



RDLDVFSADRAKANT
GGGTCGTGTTCGACATCAGAGGCCTGTACCGG



VDMFDIDPNLISWDS
AACGTGTTCTACAGAGAGCTGGCCCAGAAAGG



MSDARVSTQISDLYL
CCTGACAGCTGTGCAACTGCTGGACCTGTTTA



KNGGDESRVYFEINN
CCGGCGATCCCGTGATCGACCCCAAGAAAGGC



KRIKRSDYNISQLVRP
GTGGTCACCTTCAGCTACAAAGAGGGCGTCGT



KLSDSTRKNLNDSIW
CCCCGTCTTTAGCCAGAAAATCGTGCCCCGGT



KLKRTSEEYLKLSKR
TCAAGAGCCGGGACACCCTGGAAAAGCTGAC



KLELSRAVVNYTIRQS
CTCTCAGGGACCTGTGGCTCTGCTGTCTGTGG



KLLSGINDIVIILEDLD
ACCTGGGACAGAATGAACCTGTGGCCGCCAGA



VKKKFNGRGIRDIGW
GTGTGCAGCCTGAAGAACATCAACGACAAGAT



DNFFSSRKENRWFIPA
CACCCTGGACAACTCTTGCCGGATCAGCTTCC



FHKTFSELSSNRGLCV
TGGACGACTACAAGAAGCAGATCAAGGACTA



IEVNPAWTSATCPDC
CAGAGACAGCCTGGACGAGCTGGAAATCAAG



GFCSKENRDGINFTCR
ATCCGGCTGGAAGCCATCAACTCCCTCGAGAC



KCGVSYHADIDVATL
AAACCAGCAGGTCGAGATCAGAGATCTGGAC



NIARVAVLGKPMSGP
GTGTTCAGCGCCGACCGGGCCAAAGCCAATAC



ADRERLGDTKKPRVA
CGTGGACATGTTTGACATCGACCCTAACCTGA



RSRKTMKRKDISNST
TCAGCTGGGACTCCATGAGCGACGCCAGAGTC



VEAMVTA (SEQ ID
AGCACCCAGATCAGCGACCTGTACCTGAAGAA



NO: 18)
TGGCGGCGACGAGAGCCGGGTGTACTTTGAGA




TTAACAACAAACGGATTAAGCGGAGCGACTAC




AACATCAGCCAGCTCGTGCGGCCCAAGCTGAG




CGATAGCACCAGAAAGAACCTGAACGACAGC




ATCTGGAAGCTGAAGCGGACCAGCGAGGAAT




ACCTGAAGCTGAGCAAGCGGAAGCTGGAACT




GAGCAGAGCCGTCGTGAATTACACCATCCGGC




AGAGCAAACTGCTGAGCGGCATCAATGACATC




GTGATCATTCTCGAGGACCTGGACGTGAAGAA




GAAATTCAACGGCAGAGGCATCCGCGATATCG




GCTGGGACAACTTCTTCAGCTCCCGGAAAGAA




AACCGGTGGTTCATCCCCGCCTTCCACAAGAC




CTTTAGCGAGCTGAGCAGCAACAGGGGCCTGT




GCGTGATCGAAGTGAATCCTGCCTGGACCAGC




GCCACCTGTCCTGATTGTGGCTTCTGCAGCAA




AGAAAACAGAGATGGCATCAACTTCACGTGCC




GGAAGTGCGGCGTGTCCTACCACGCCGATATT




GACGTGGCCACACTGAATATTGCCAGAGTGGC




CGTGCTGGGCAAGCCTATGTCTGGACCTGCCG




ACAGAGAGAGACTGGGCGACACCAAGAAACC




TAGAGTGGCCCGCAGCAGAAAGACCATGAAG




CGGAAGGACATCAGCAACAGCACCGTCGAGG




CCATGGTTACAGCTTAA (SEQ ID NO: 1410)









Example 2

Illustrative CasΦ Guide RNA Sequences


Guide RNA sequences for complexing with the CasΦ polypeptides of the disclosure were prepared. TABLE 5 provides illustrative guide RNA sequences to target the target nucleic acid sequence TATTAAATACTCGTATTGCTGTTCGATTAT (SEQ ID NO: 1411). A guide nucleic acid of the disclosure can comprise the sequence of any of the guide RNAs provided in TABLE 5 or a portion thereof.









TABLE 5

Illustrative CasΦ guide RNA sequences

Name      RNA Type   Repeat length   Spacer length   RNA sequence (5′->3′), shown as DNA; BOLD = spacer
CasΦ.2    crRNA      36              30              GTCGGAACGCTCAACGATTGCCCCTCACGAGGGGAC (SEQ ID NO: 49)
CasΦ.7    crRNA      36              30              GGATCCAATCCTTTTTGATTGCCCAATTCGTTGGGAC (SEQ ID NO: 51)
CasΦ.10   crRNA      36              30              GGATCTGAGGATCATTATTGCTCGTTACGACGAGAC (SEQ ID NO: 52)
CasΦ.18   crRNA      36              30              ACCAAAACGACTATTGATTGCCCAGTACGCTGGGAC (SEQ ID NO: 57)
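The relationship between the repeat, the spacer, and the "shown as DNA" convention of TABLE 5 can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to prepare the guides. It assumes that the spacer follows the repeat at the 3′ end and that the spacer has the same sequence as the listed target; the function name build_crrna is hypothetical.

```python
# Sequences copied from TABLE 5 and Example 2 (written as DNA in the source).
CAS_PHI_18_REPEAT = "ACCAAAACGACTATTGATTGCCCAGTACGCTGGGAC"
TARGET = "TATTAAATACTCGTATTGCTGTTCGATTAT"


def build_crrna(repeat_dna: str, spacer_dna: str) -> str:
    """Join repeat (5') and spacer (3'), then convert DNA letters to RNA.

    Assumption: repeat-then-spacer order and spacer == listed target sequence.
    """
    return (repeat_dna + spacer_dna).upper().replace("T", "U")


crrna = build_crrna(CAS_PHI_18_REPEAT, TARGET)
print(len(crrna), crrna)
```

Under these assumptions the assembled guide is 66 nucleotides long, which agrees with the 36-nucleotide repeat and 30-nucleotide spacer lengths listed in TABLE 5.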









Example 3

CasΦ Acts as a Programmable Nickase


The present example shows that a CasΦ polypeptide can comprise programmable nickase activity. FIG. 1 shows data from an experiment to analyze the nicking ability of CasΦ ortholog proteins. For this experiment, five different CasΦ polypeptides, designated CasΦ.2, CasΦ.11, CasΦ.17, CasΦ.18, and CasΦ.12 in FIG. 1, were analyzed. Amino acid sequences of the proteins used in the experiment are shown in TABLE 4.


All reactions were carried out using guide RNA comprising a crRNA sequence comprising the CasΦ.18 repeat sequence (ACCAAAACGACTATTGATTGCCCAGTACGCTGGGAC (SEQ ID NO: 57)). Complexing of the CasΦ polypeptide with a guide RNA to form the ribonucleoprotein (RNP) complex was carried out at room temperature for 20 minutes. The RNP complex was incubated with the target DNA at 37° C. for 60 minutes in NEB CutSmart buffer (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA, pH 7.9 at 25° C.). The target nucleic acid used for the reactions was a super-coiled plasmid DNA comprising the target sequence TATTAAATACTCGTATTGCTGTTCGATTAT (SEQ ID NO: 116), which was immediately downstream of a TTTN PAM sequence. The plasmid DNA sequence is provided below with the target sequence in bold:










(SEQ ID NO: 1412)



gtgtagataactacgatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagac






ccacgctcaccggctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagt





ggtcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtagt





tcgccagttaatagtttgcgcaacgttgttgccattgctacaggcatcgtggtgtcacgctcgtcg





tttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttg





tgcaaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgtta





tcactcatggttatggcagcactgcataattctcttactgtcatgccatccgtaagatgcttttct





gtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccgagttgctcttgc





ccggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtgctcatcattggaaaa





cgttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaacccact





cgtgcacccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacagga





aggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctt





tttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatt





tagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtctaagaa





accattattatcatgacattaacctataaaaataggcgtatcacgaggccctttcgtctcgcgcgt





ttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgcca





tggacatgtttaTATTAAATACTCGTATTGCTGTTCGATTATgaccgaattccctgtcgtgccagc





tgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgctcttccgcttcct





cgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcgg





taatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaa





aggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagc





atcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgt





ttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccg





cctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgt





aggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttat





ccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactg





gtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaact





acggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaa





gagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagc





agcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacg





ctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacct





agatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctg





acagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatag





ttgcctgactccccgtc 






As shown in FIG. 1, CasΦ.17 and CasΦ.18 produced only nicked product (i.e. single strand breaks; “nicked”) by 60 minutes. By way of comparison, CasΦ.12 generated almost entirely linearized product, demonstrating double-stranded breaks, while CasΦ.2 and CasΦ.11 generated some linearized product (i.e. double strand breaks) but primarily produced nicked intermediate. These data demonstrate that CasΦ orthologs can comprise programmable nickase activity.
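The statement that the target lies immediately downstream of a TTTN PAM can be checked directly against the plasmid sequence. The following Python sketch is illustrative only: it uses a short excerpt of the plasmid sequence of SEQ ID NO: 1412 spanning the target, and the helper name upstream_pam is hypothetical.

```python
# Excerpt of the plasmid sequence above that spans the target (upper case in the source).
PLASMID_EXCERPT = "tggacatgtttaTATTAAATACTCGTATTGCTGTTCGATTATgaccgaattccctgtcgtgccagc"
TARGET = "TATTAAATACTCGTATTGCTGTTCGATTAT"


def upstream_pam(sequence: str, target: str, pam_len: int = 4):
    """Return the pam_len bases immediately 5' of the target, or None.

    None is returned when the target is absent or sits too close to the
    start of the sequence for a full-length PAM to be read.
    """
    seq = sequence.upper()
    start = seq.find(target.upper())
    if start < pam_len:
        return None
    return seq[start - pam_len:start]


pam = upstream_pam(PLASMID_EXCERPT, TARGET)
is_tttn = pam is not None and pam.startswith("TTT")
print(pam, is_tttn)
```

For this excerpt the four bases preceding the target are TTTA, which is consistent with the TTTN PAM described in this example.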


Example 4

Effect of crRNA Repeat Sequence and RNP Complexing Temperature on CasΦ Nickase Activity


The present example shows that the crRNA repeat sequence and RNP complexing temperature can affect nickase activity of CasΦ. FIG. 2A and FIG. 2B illustrate results of a cis-cleavage experiment showing the percentage of input plasmid DNA that was nicked after 60 minutes of reaction at 37° C. by CasΦ RNP complex assembled at room temperature (FIG. 2A) or at 37° C. (FIG. 2B). FIG. 2C illustrates alignment of CasΦ.2, CasΦ.7, CasΦ.10, and CasΦ.18 repeat sequences showing conserved (highlighted in black) and diverged nucleotides.


For this study, each of three CasΦ polypeptides (CasΦ.11, CasΦ.17 and CasΦ.18 in FIGS. 2A and 2B) was tested for its ability to nick input plasmid DNA when complexed with one of four crRNAs comprising the repeat sequences of CasΦ.2, CasΦ.7, CasΦ.10 and CasΦ.18 (abbreviated j2, j7, j10 and j18, respectively, in FIG. 2A and FIG. 2B). Amino acid sequences of the proteins used in the experiment are shown in TABLE 4. Guide RNA sequences corresponding to j2, j7, j10 and j18 are provided in TABLE 5. The input plasmid was a super-coiled plasmid (sequence shown in EXAMPLE 3) comprising the target sequence TATTAAATACTCGTATTGCTGTTCGATTAT (SEQ ID NO: 108) immediately downstream of a TTTN PAM. The incubation reaction to form the RNP complex was performed either at room temperature or at 37° C. for 60 minutes in NEB CutSmart buffer (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA, pH 7.9 at 25° C.). The RNP complex was incubated with the input plasmid for 60 minutes at 37° C. The reaction was quenched with 1 mg/ml proteinase K, 0.08% SDS, and 15 mM EDTA. The data illustrated in FIG. 2A and FIG. 2B come from a single replicate of the in vitro cis-cleavage experiment.


As shown in FIG. 2A, when the CasΦ polypeptides were assembled into RNP complexes with the guide nucleic acids at room temperature, crRNAs comprising repeat sequences from any of the proteins supported nickase activity by CasΦ.11, CasΦ.17 and CasΦ.18, with the exception of the CasΦ.17/CasΦ.2-repeat pairing. As shown in FIG. 2B, when the CasΦ polypeptides were assembled into RNP complexes with the guide nucleic acids at 37° C., as opposed to at room temperature, the activity of each protein was completely abolished when complexed with crRNAs comprising a repeat sequence from CasΦ.2 or CasΦ.10.


This example showed that the nickase activity of CasΦ can be affected by the crRNA repeat sequence. The data also showed that the nickase activity of CasΦ can be affected by the RNP complexing temperature.



FIG. 2D provides further examples of the nickase activity of CasΦ affected by the RNP complexing temperature. Nickase activity was assessed as described above for CasΦ.2, CasΦ.4, CasΦ.6, CasΦ.9, CasΦ.10, CasΦ.12 and CasΦ.13. Amino acid sequences of the proteins used in the experiment are shown in TABLE 1.


The effect of complexing temperature on the double strand cutting activity of CasΦ polypeptides was also assessed as described above. As shown in FIG. 2D, the double strand cutting activity of CasΦ polypeptides, particularly CasΦ.2, CasΦ.4 and CasΦ.12, is generally not affected by the RNP complexing temperature. However, some systems with less efficient double strand cutting activity, such as CasΦ.10, CasΦ.11 and CasΦ.13 in this example, are sensitive to RNP complexing temperature.


Example 5

CasΦ Nickase Cleaves Non-Target Strand


The present example shows that CasΦ nickase cleaves the non-target DNA strand. Results of the study are shown in FIG. 3. For this study, four different CasΦ polypeptides (CasΦ.12, CasΦ.2, CasΦ.11, and CasΦ.18 as shown in FIG. 1) were analyzed using a cis-cleavage assay. Amino acid sequences of the proteins used in the experiment are shown in TABLE 4. The CasΦ polypeptides were complexed with guide RNA to form RNP complexes. All reactions were carried out using guide RNA comprising a crRNA sequence comprising the CasΦ.18 repeat sequence (ACCAAAACGACTATTGATTGCCCAGTACGCTGGGAC (SEQ ID NO: 57)). Complexing of the CasΦ polypeptides with guide RNA to form the ribonucleoprotein (RNP) complex was carried out at room temperature for 20 minutes. The RNP complex was incubated with the target DNA at 37° C. for 60 minutes in NEB CutSmart buffer (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA, pH 7.9 at 25° C.). The target nucleic acid used for the reactions was a super-coiled plasmid DNA (sequence shown in EXAMPLE 3) comprising the target sequence TATTAAATACTCGTATTGCTGTTCGATTAT (SEQ ID NO: 116), which was immediately downstream of a TTTN PAM sequence. The reaction was quenched with 1 mg/ml proteinase K, 0.08% SDS, and 15 mM EDTA. The resulting cleaved DNA from the reaction was Sanger sequenced using forward and reverse primers. The forward primer provided the sequence of the target strand (TS), while the reverse primer provided the sequence of the non-target strand (NTS). If a strand had been cleaved by the CasΦ polypeptide, the sequencing signal would drop off from the cleavage site in the sequencing data. FIG. 3 illustrates results of the Sanger sequencing.



FIG. 3, panel A, shows a control reaction where no CasΦ polypeptide was added. As a result, the target DNA was uncut and resulted in complete sequencing of both target and non-target strands. FIG. 3, panel B, illustrates the cleavage pattern for CasΦ.12, which comprises double-stranded DNA cleavage activity. The sequencing signal dropped off on both the target and the non-target strands (as shown by arrows), demonstrating cleavage of both strands of the target DNA. FIG. 3, panel C, illustrates the cleavage pattern for CasΦ.2, which predominantly nicks DNA (as illustrated in FIG. 1). The data showed that the sequencing signal dropped off on only the non-target strand (bottom arrow) demonstrating cleavage of the non-target strand. FIG. 3, panel D, illustrates the cleavage pattern for CasΦ.11, which comprises strong nickase activity (as illustrated in FIG. 1). The data showed that the sequencing signal dropped off on only the non-target strand (bottom arrow) demonstrating cleavage of the non-target strand. FIG. 3, panel E, illustrates the cleavage pattern for CasΦ.18, which comprises strong nickase activity (as illustrated in FIG. 1). The data showed that the sequencing signal dropped off on only the non-target strand (bottom arrow) demonstrating cleavage of the non-target strand. Thus, this example shows that CasΦ polypeptides comprising nickase activity cleave the non-target strand of a target DNA.


Example 6

Editing a Target Nucleic Acid


This example describes genetic modification of a target nucleic acid with a programmable CasΦ nuclease (e.g., any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO: 105 or SEQ ID NO: 107) of the present disclosure. The programmable CasΦ nuclease is administered with a guide nucleic acid capable of hybridizing to a segment of a target nucleic acid sequence of interest in a ribonucleoprotein complex or as separate nucleic acids encoding each component. Subjects administered said composition are humans or non-human mammals. Upon binding of the guide nucleic acid to the segment of the target nucleic acid, the programmable CasΦ nuclease nicks or induces a double stranded break in the target. The target undergoes NHEJ or HDR. A donor nucleic acid may be co-administered. The donor nucleic acid may be used to replace or repair a mutated segment of the target nucleic acid. The subject may have a disease. Upon genetic modification of the target nucleic acid, the disease or a symptom of the disease may be alleviated, or the disease may be cured.


Example 7

Editing a Plant or Crop Target Nucleic Acid


This example describes genetic modification of a plant or crop target nucleic acid with a programmable CasΦ nuclease (e.g., any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO: 105 or SEQ ID NO: 107) of the present disclosure. The programmable CasΦ nuclease is administered with a guide nucleic acid capable of hybridizing to a segment of a target nucleic acid sequence of interest in a ribonucleoprotein complex or as separate nucleic acids encoding each component. Subjects administered said composition are plant or crop cells. Upon binding of the guide nucleic acid to the segment of the target nucleic acid, the programmable CasΦ nuclease nicks or induces a double stranded break in the target. The target undergoes NHEJ or HDR. A donor nucleic acid may be co-administered. The donor nucleic acid may be used to replace or repair a mutated segment of the target nucleic acid. The result is an engineered plant or crop cell.


Example 8

Genetic Modification of a Target Nucleic Acid


This example describes genetic modification of a target nucleic acid with a dead programmable CasΦ nuclease (e.g., any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO: 105 or SEQ ID NO: 107 with a mutation rendering it catalytically inactive) of the present disclosure. The programmable CasΦ nuclease is further linked to a transcriptional regulator. The programmable CasΦ nuclease, the transcriptional regulator, and the guide nucleic acid capable of hybridizing to a segment of a target nucleic acid sequence of interest are administered as a ribonucleoprotein complex or as separate nucleic acids encoding each component. Subjects administered said composition are humans or non-human mammals. Upon binding of the guide nucleic acid to the segment of the target nucleic acid, the dead programmable CasΦ nuclease upregulates or downregulates transcription. The subject may have a disease. Upon genetic modification of the target nucleic acid, the disease or a symptom of the disease may be alleviated, or the disease may be cured.


Example 9

Genetic Modification of a Plant or Crop Target Nucleic Acid


This example describes genetic modification of a plant or crop target nucleic acid with a dead programmable CasΦ nuclease (e.g., any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO: 105 or SEQ ID NO: 107 with a mutation rendering it catalytically inactive) of the present disclosure. The programmable CasΦ nuclease is further linked to a transcriptional regulator. The programmable CasΦ nuclease, the transcriptional regulator, and the guide nucleic acid capable of hybridizing to a segment of a target nucleic acid sequence of interest are administered as a ribonucleoprotein complex or as separate nucleic acids encoding each component. Subjects administered said composition are plant or crop cells. Upon binding of the guide nucleic acid to the segment of the target nucleic acid, the dead programmable CasΦ nuclease upregulates or downregulates transcription. The result is an engineered plant or crop cell.


Example 10

Detection of a Target Nucleic Acid


This example describes detection of a target nucleic acid with a programmable CasΦ nuclease (e.g., any one of SEQ ID NO: 1-SEQ ID NO: 47, SEQ ID NO: 105 or SEQ ID NO: 107) of the present disclosure. The programmable CasΦ nuclease, the guide nucleic acid capable of hybridizing to a segment of a target nucleic acid sequence of interest, and a labeled ssDNA reporter are contacted with a sample. In the presence of the target nucleic acid in the sample, the guide nucleic acid binds to its target, thereby activating the programmable CasΦ nuclease to cleave the labeled ssDNA reporter and release a detectable label. The detectable label emits a detectable signal that is, optionally, quantified. In the absence of the target nucleic acid in the sample, the guide nucleic acid does not bind to its target, the labeled ssDNA reporter is not cleaved, and low or no signal is detected.


Example 11

Preference for Nicking or Double Strand Cleavage of Target DNA is a Property of CasΦ Enzymes, Independent of crRNA Repeat or Target Sequences


This example describes how the preference of a CasΦ polypeptide to cleave a single or both strands of a double-strand target DNA is independent of the crRNA repeat or target sequence. For this study, each of twelve CasΦ polypeptides (CasΦ.1, CasΦ.2, CasΦ.3, CasΦ.4, CasΦ.6, CasΦ.9, CasΦ.10, CasΦ.11, CasΦ.12, CasΦ.13, CasΦ.17 and CasΦ.18) was complexed with one of the crRNAs comprising the repeat sequences of CasΦ.1, CasΦ.2, CasΦ.4, CasΦ.7, CasΦ.10, CasΦ.11, CasΦ.12, CasΦ.13, CasΦ.17 and CasΦ.18. Amino acid sequences of the proteins used in the experiment are shown in TABLE 1 and crRNA sequences are provided in TABLE 2. The input plasmid was one of two super-coiled plasmids containing a target sequence (TATTAAATACTCGTATTGCTGTTCGATTAT (SEQ ID NO: 108) or CACAGCTTGTCTGTAAGCGGATGCCATATG (SEQ ID NO: 109)) immediately downstream of a TTTN PAM. The incubation reaction to form the RNP complex was performed at room temperature for 20 minutes in NEB CutSmart buffer (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA, pH 7.9 at 25° C.). The RNP complex was incubated with the input plasmid for 60 minutes at 37° C. The reaction was quenched with 1 mg/ml proteinase K, 0.08% SDS, and 15 mM EDTA.


As shown in FIG. 4A, CasΦ polypeptides have a preference for nicking or linearizing (i.e. cleaving both strands) a double strand plasmid DNA target and this preference is not affected by the crRNA repeat or target DNA sequence.


Raw data used to generate a subset of the heatmap in FIG. 4A are shown in FIG. 4B. These data show that CasΦ.12 is predominantly a linearizer of plasmid DNA, i.e. CasΦ.12 predominantly cleaves both strands of a double strand target DNA. In contrast, CasΦ.18 is predominantly a nickase and predominantly cleaves one strand of a double strand target DNA.


This example showed that the preference of a CasΦ polypeptide to cleave a single or both strands of a double-strand target DNA is independent of the crRNA repeat or target sequence.


Example 12

Structural Conservation Across the CasΦ Repeats


This example describes the conservation of structure across the CasΦ repeats. In particular, FIG. 5A shows the structure of the crRNA repeats for CasΦ.1, CasΦ.2, CasΦ.7, CasΦ.11, CasΦ.12, CasΦ.13, CasΦ.18, and CasΦ.32. crRNA sequences are provided in TABLE 2. There is high sequence and structure conservation in the 3′ half of the CasΦ repeats. The LocARNA alignment tool was used to confirm the consensus structure of CasΦ repeats, which is shown in FIG. 5B. The consensus was determined on the basis of the following crRNA repeats: CasΦ.1, CasΦ.2, CasΦ.4, CasΦ.7, CasΦ.10, CasΦ.11, CasΦ.12, CasΦ.13, CasΦ.17, CasΦ.18, CasΦ.19, CasΦ.21, CasΦ.22, CasΦ.23, CasΦ.24, CasΦ.25, CasΦ.26, CasΦ.27, CasΦ.28, CasΦ.29, CasΦ.30, CasΦ.31, CasΦ.32, CasΦ.33, CasΦ.35, CasΦ.41. The sequence of these repeats is provided in TABLE 5. As shown in FIG. 5B, CasΦ repeats have a highly conserved 3′ hairpin which includes a double stranded stem portion and a single-stranded loop portion. One strand of the stem includes the sequence CYC and the other strand includes the sequence GRG, where Y and R are complementary. The loop portion typically comprises four nucleotides. The 3′ end of CasΦ repeats comprises the sequence GAC and the G of this sequence is in the stem of the hairpin.


This example shows the conserved structure of CasΦ crRNA repeats.


Example 13

CasΦ PAM Preferences on Linear Targets


The present example shows the PAM preferences for CasΦ polypeptides on linear double stranded DNA targets. For this study, five different CasΦ polypeptides (CasΦ.2, CasΦ.4, CasΦ.11, CasΦ.12 and CasΦ.18) were analyzed using a cis-cleavage assay. Amino acid sequences of the proteins used are shown in TABLE 1. The CasΦ polypeptides were complexed with their native crRNAs (i.e. the corresponding CasΦ.2, CasΦ.4, CasΦ.11, CasΦ.12 and CasΦ.18 repeats) to form RNP complexes at room temperature for 20 minutes. The RNP complex was incubated with target DNA at 37° C. for 60 minutes in NEB CutSmart buffer (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA, pH 7.9 at 25° C.). The target DNA was a 1.1 kb PCR-amplified DNA product. Starting with a TTTA PAM, each position was varied one by one to the other 3 nucleotides for a total of 12 variants in addition to the parental TTTA PAM. Linear fragments were used to disfavor cleavage for greater sensitivity of PAM preference determination. FIG. 6A illustrates the absolute levels of double strand cleavage (or nicking for CasΦ.18). FIG. 6B illustrates the data from FIG. 6A after normalization to the parental TTTA PAM as 100%. FIG. 6C provides a summary of the optimal PAM preferences from the data in FIG. 6A and FIG. 6B. CasΦ.2 recognizes a GTTK PAM, where K is G or T. CasΦ.4 recognizes a VTTK PAM, where V is A, C or G and K is G or T. CasΦ.11 recognizes a VTTS PAM, where V is A, C or G and S is C or G. CasΦ.12 recognizes a TTTS PAM, where S is C or G. CasΦ.18 recognizes a VTTN PAM, where V is A, C or G and N is A, C, G or T.


This example shows the optimized PAM preferences for some of the CasΦ polypeptides.
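The degenerate PAM notation used in this example (K, S, V and N) can be made concrete with a small Python sketch that tests whether a given four-nucleotide PAM falls within each reported optimal preference. The sketch is illustrative only; the function names are hypothetical, and the dictionary simply restates the preferences summarized above rather than any quantitative data.

```python
# Degenerate nucleotide codes as defined in this example.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT", "S": "CG", "V": "ACG", "N": "ACGT",
}

# Optimal PAM preferences as summarized for FIG. 6C.
PAM_PREFERENCES = {
    "CasΦ.2": "GTTK",
    "CasΦ.4": "VTTK",
    "CasΦ.11": "VTTS",
    "CasΦ.12": "TTTS",
    "CasΦ.18": "VTTN",
}


def pam_matches(pattern: str, pam: str) -> bool:
    """True if every base of `pam` is allowed by the degenerate `pattern`."""
    pam = pam.upper()
    pattern = pattern.upper()
    return len(pattern) == len(pam) and all(
        base in IUPAC[code] for code, base in zip(pattern, pam)
    )


def nucleases_preferring(pam: str) -> list:
    """Names of the polypeptides whose optimal preference includes `pam`."""
    return [name for name, pattern in PAM_PREFERENCES.items() if pam_matches(pattern, pam)]


print(nucleases_preferring("GTTG"))
print(nucleases_preferring("TTTG"))
```

A PAM outside a listed preference is not necessarily inactive; the patterns capture only the optimal preferences reported for this assay.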


Example 14

CasΦ Polypeptides Rapidly Nick Supercoiled DNA


The present example shows that CasΦ polypeptides rapidly nick supercoiled DNA but vary in their ability to deliver the second strand cleavage. For this study, five different CasΦ polypeptides (CasΦ.2, CasΦ.4, CasΦ.11, CasΦ.12 and CasΦ.18) were analyzed using a cis-cleavage assay. Amino acid sequences of the proteins used are shown in TABLE 1. The CasΦ polypeptides were complexed with their native crRNA to form 200 nM RNP complexes at room temperature in NEB CutSmart buffer (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA, pH 7.9 at 25° C.) for 20 minutes in a volume of 30 μl. The target plasmid was one of two 2.2 kb super-coiled plasmids containing a target sequence (TATTAAATACTCGTATTGCTGTTCGATTAT (SEQ ID NO: 108) or CACAGCTTGTCTGTAAGCGGATGCCATATG (SEQ ID NO: 109); the guide RNAs targeted the underlined sequence) immediately downstream of a GTTG or TTTG PAM. At time “0”, 30 μl of 20 nM target plasmid was mixed with RNP for a total volume of 60 μl. The incubation temperature was 37° C. At 1, 3, 6, 15, 30 and 60 minutes, 9 μl portions of the reaction were withdrawn and stopped with reaction quench (1 mg/ml proteinase K, 0.08% SDS and 15 mM EDTA) and allowed to deproteinize for 30 minutes at 37° C. before agarose gel analysis. The cleavage was quantified as nicked or linear. FIG. 7 shows the rapid nicking of supercoiled target DNA by CasΦ polypeptides. The decrease in nicked products over time is due to the formation of linear product as the CasΦ polypeptide cleaves the second strand of the target DNA. CasΦ.12 rapidly cleaves both strands of supercoiled DNA.


This example shows that CasΦ polypeptides rapidly nick supercoiled DNA.
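The final concentrations in the cleavage reaction follow from the volumes and stock concentrations stated above. The arithmetic can be sketched in Python as follows; the function and variable names are illustrative, and the calculation assumes simple dilution with no volume loss on mixing.

```python
def diluted_concentration(stock_nm: float, stock_ul: float, total_ul: float) -> float:
    """Concentration (nM) after a stock volume is diluted into a total volume."""
    return stock_nm * stock_ul / total_ul


TOTAL_UL = 60                      # 30 ul RNP + 30 ul target plasmid
TIME_POINTS_MIN = [1, 3, 6, 15, 30, 60]
ALIQUOT_UL = 9

rnp_final_nm = diluted_concentration(200, 30, TOTAL_UL)      # RNP formed at 200 nM
plasmid_final_nm = diluted_concentration(20, 30, TOTAL_UL)   # plasmid added at 20 nM
molar_excess = rnp_final_nm / plasmid_final_nm
withdrawn_ul = ALIQUOT_UL * len(TIME_POINTS_MIN)

print(rnp_final_nm, plasmid_final_nm, molar_excess, withdrawn_ul)
```

Under these assumptions the reaction contains 100 nM RNP and 10 nM plasmid, a ten-fold molar excess of RNP over target, and the six 9 μl time-point samples consume 54 μl of the 60 μl reaction.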


Example 15

CasΦ Polypeptides Prefer Full-Length Repeats and Spacers From 16-20 Nucleotides


The present example shows that CasΦ polypeptides prefer full-length repeats and spacers from 16 to 20 nucleotides. For this study, each of five CasΦ polypeptides (CasΦ.2, CasΦ.4, CasΦ.11, CasΦ.12 and CasΦ.18 in FIGS. 8A and 8B) was tested for its ability to cleave input plasmid DNA when complexed with either of the crRNAs comprising the repeat sequences of CasΦ.2 or CasΦ.18 (abbreviated j2 and j18, respectively, in FIG. 8A and FIG. 8B). Amino acid sequences of the proteins used in the experiment are shown in TABLE 1. Guide RNA sequences corresponding to j2 and j18 are provided in TABLE 2. The CasΦ polypeptides were complexed with the crRNA in NEB CutSmart Buffer (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA, pH 7.9 at 25° C.) for 20 minutes at room temperature. The ability of the CasΦ polypeptides to cleave a 2.2 kb plasmid containing a target sequence was assessed (FUT8_1: ACGCGTTTTAGAAGAGCAGCTTGTTAAGGCCAAAGAACAGATTGA (SEQ ID NO: 1413) and DNMT_1: AAAGATTTGTCCTTGGAGAACGGTGCTCATGCTTACAACCGGGA (SEQ ID NO: 1414), the PAM is underlined). Spacers targeting these target sequences were shortened from the 3′ end. The cleavage incubation was at 37° C. and the reaction was quenched after 10 minutes with 1 mg/ml proteinase K, 0.08% SDS and 15 mM EDTA. To assess the effect of shortening the crRNA repeats, the repeats were shortened from the 5′ end.


As shown in FIG. 8A, crRNA repeats with a length of 19 to 37 nucleotides supported cleavage activity of CasΦ polypeptides.


As shown in FIG. 8B, cleavage activity was observed over the range of spacer lengths tested (16 to 35 nucleotides). The optimal spacer length to support the cleavage activity of CasΦ polypeptides in this in vitro system is 16 to 20 nucleotides.


This example shows that CasΦ polypeptides prefer crRNA repeat lengths of 19 to 37 nucleotides and spacer lengths of 16 to 20 nucleotides in vitro.


Example 16

CasΦ.12 Spacer Length Optimization in HEK293T Cells


The present example shows the use of CasΦ.12 as a gene editing tool in HEK293T cells and the effect of changing the length of the spacer. As illustrated in the schematic in FIG. 9A, a stable HEK293T cell line that expresses AcGFP was established. A plasmid expressing the crRNA under the control of the U6 promoter and CasΦ.12 under the control of the EF1a promoter was transfected into the AcGFP-expressing HEK293T cell line. The CasΦ.12 was expressed as FLAGtag-SV40NLS-Cas12j.12-NLS-T2A-PuroR. GFP expression was assessed by flow cytometry at days 5, 7 and 10. The 30 nucleotide spacer sequence is 5′-TTGCCCAGGATGTTGCCATCCTCCTTGAAA-3′ (SEQ ID NO: 1415). To assess the effect of different spacer lengths, the spacer was shortened from its 3′ end. As shown in FIG. 9B, a spacer length of 15 to 30 nucleotides supported CasΦ.12 cleavage activity in HEK293T cells, but with less cleavage detected with the 15 and 16 nucleotide spacers. There is a preference for CasΦ.12 to have a spacer length of 17 to 22 nucleotides, but cleavage activity is still supported with the longer spacers tested.
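The truncation scheme used in Examples 15 and 16 (spacers shortened from the 3′ end, repeats shortened from the 5′ end) can be sketched in Python as follows. The sketch is illustrative: the function names are hypothetical, and enumerating every length from 15 to 30 nucleotides is an assumption, since the exact set of intermediate lengths tested is not listed here.

```python
SPACER_30 = "TTGCCCAGGATGTTGCCATCCTCCTTGAAA"               # SEQ ID NO: 1415
CAS_PHI_18_REPEAT = "ACCAAAACGACTATTGATTGCCCAGTACGCTGGGAC"  # SEQ ID NO: 57


def truncate_from_3prime(seq: str, lengths) -> dict:
    """Keep the 5'-most n nucleotides for each requested length n (n >= 1)."""
    return {n: seq[:n] for n in lengths}


def truncate_from_5prime(seq: str, lengths) -> dict:
    """Keep the 3'-most n nucleotides for each requested length n (n >= 1)."""
    return {n: seq[-n:] for n in lengths}


# Spacer variants of 15 to 30 nucleotides, shortened from the 3' end.
spacer_variants = truncate_from_3prime(SPACER_30, range(15, 31))

# A repeat shortened from the 5' end to 19 nucleotides keeps the 3' region intact.
repeat_variants = truncate_from_5prime(CAS_PHI_18_REPEAT, [19, 36])

print(spacer_variants[17])
print(repeat_variants[19])
```

Shortening the spacer from its 3′ end leaves the repeat-proximal portion of the spacer unchanged, while shortening the repeat from its 5′ end leaves the 3′ portion of the repeat unchanged.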


Example 17

CasΦ Nucleases are a Novel Class of Protein


This example illustrates that the CasΦ nucleases identified herein are a novel class of Cas proteins. SEQ ID NOs: 1 to 47 and SEQ ID NO. 105 were searched in the InterPro database, but were not identified as belonging to a class of protein. As an example, the results for SEQ ID NO: 2 are shown in FIG. 10A. As a positive control, the Cpf1 sequence from Acidaminococcus sp. (strain BV3L6) was also searched and was identified as a CRISPR-associated endonuclease Cas12a family member, as shown in FIG. 10B.


Example 18

DNA Cleavage by CasΦ.19-CasΦ.48


This example illustrates the DNA cleavage activity of CasΦ.19 to CasΦ.45. Amino acid sequences of the proteins used in the experiment are shown in TABLE 1. The CasΦ polypeptides were complexed with their native crRNA (or the crRNA of the CasΦ polypeptide with the closest match based on amino acid sequence identity) to form 100 nM RNP complexes at room temperature in NEB CutSmart buffer (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA, pH 7.9 at 25° C.) for 20 minutes in a volume of 30 μl. crRNA sequences are provided in TABLE 2. The target plasmid was a 2.1 kb plasmid containing the target sequence TATTAAATACTCGTATTGCTGTTCGATTAT (SEQ ID NO: 108). The cleavage incubation was performed at 37° C. and the reaction was quenched after 60 minutes. Cleavage products were then analyzed by gel electrophoresis, as shown in FIG. 13A. This analysis identifies CasΦ.20, CasΦ.22, CasΦ.24, CasΦ.25, CasΦ.28, CasΦ.31, CasΦ.32, CasΦ.37, CasΦ.43 and CasΦ.45 as enzymes that predominantly linearize plasmid DNA, i.e. they predominantly cleave both strands of a double strand target DNA. In contrast, DNA cleavage by CasΦ.21 results in mixed nicked and linear product, indicating that CasΦ.21 functions as a nickase as well as a linearizer of plasmid DNA, with a preference for nickase activity under the conditions of the present study. Mixed nicked and linearized cleavage products were also identified following cleavage by CasΦ.26, CasΦ.29, CasΦ.33, CasΦ.34, CasΦ.38 and CasΦ.44. ‘SC’ represents ‘super-coiled’ un-cut target plasmid.


This example shows robust DNA cleavage by CasΦ polypeptides.


The inventors went on to demonstrate the robust generation of indels following targeting by CasΦ.12, CasΦ.20, CasΦ.21, CasΦ.22, CasΦ.25, CasΦ.28, CasΦ.31, CasΦ.32, CasΦ.33, CasΦ.34, CasΦ.37, CasΦ.43, and CasΦ.45. A stable HEK293T cell line that expresses AcGFP was established. HEK293T-AcGFP cells were transfected with crRNA and CasΦ expression plasmids using lipofectamine on day 0. Target sequences are provided in TABLE 6. Cells were harvested by trypsinization on day 3 for TIDE analysis. The target locus was amplified by PCR and the amplified product was then sequenced using Sanger sequencing. The TIDE analysis provides the frequency of indel mutations (https://tide.nki.nl/#about). As shown in FIG. 13B, targeting CasΦ.12, CasΦ.20, CasΦ.21, CasΦ.22, CasΦ.25, CasΦ.28, CasΦ.31, CasΦ.32, CasΦ.33, CasΦ.34, CasΦ.37, CasΦ.43, and CasΦ.45 to AcGFP led to the robust generation of indel mutations. FIG. 13C provides an alternative representation of the data shown in FIG. 13B for CasΦ.12, CasΦ.28, CasΦ.31, CasΦ.32 and CasΦ.33. These data further demonstrate the genome editing ability of CasΦ.20, CasΦ.21, CasΦ.22, CasΦ.25, CasΦ.28, CasΦ.31, CasΦ.32, CasΦ.33, CasΦ.34, CasΦ.37, CasΦ.43, and CasΦ.45.













TABLE 6

          Target Sequence        PAM eGFP   PAM acGFP   SEQ ID NO
KT_eGFP   TTAAGGCCAAAGAACAGATT   CTTG       CTTG        1416
OT_eGFP   CGTGATGGTCTCGATTGAGT   None       None        1417
T1_eGFP   AAGAAGTCGTGCTGCTTCAT   CTTG       CTTG        1418
T2_eGFP   ATCTGCACCACCGGCAAGCT   GTTC       GTTC        1419
T3_eGFP   TGGCGGATCTTGAAGTTCAC   GTTG       GTTG        1420
T4_eGFP   CCGTAGGTGGCATCGCCCTC   GTTC       CTTC        1421
T5_eGFP   ACGTCGCCGTCCAGCTCGAC   GTTT       None        1422
T6_eGFP   AAGAAGATGGTGCGCTCCTG   CTTG       CTCG        1423









Example 19

PAM Requirement for CasΦ Determined by In Vitro Enrichment


This example illustrates the NTTN PAM requirement for CasΦ.2, CasΦ.4, CasΦ.11 and CasΦ.12. An in vitro enrichment (IVE) analysis was performed. The CasΦ polypeptides were complexed with crRNA to form 500 nM RNP complexes at room temperature in NEB CutSmart buffer (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA, pH 7.9 at 25° C.) for 30 minutes in a volume of 25 μl. crRNA sequences are provided in TABLE 2. The cleavage incubation was performed at 37° C. and the reaction was quenched after 30 minutes. The substrate for the cleavage incubation was a pooled plasmid library which includes different PAM sequences. After quenching, the cleavage reactions were cleaned using Beckman SPRi beads. The samples were sequenced to identify which PAM sequences enabled target cleavage by the CasΦ polypeptides. As shown in FIG. 14A, this analysis revealed an NTTN PAM requirement for CasΦ.2, CasΦ.4, CasΦ.11 and CasΦ.12.


The inventors went on to assess the PAM requirement of CasΦ.20, CasΦ.26, CasΦ.32, CasΦ.38 and CasΦ.45. An IVE analysis was performed using the protocol described above for CasΦ.2, CasΦ.4, CasΦ.11 and CasΦ.12. As shown in FIG. 14B, Sanger sequencing revealed an NTNN PAM requirement for CasΦ.20, an NTTG PAM requirement for CasΦ.26, a GTTN PAM requirement for CasΦ.32 and CasΦ.38, and an NTTN PAM requirement for CasΦ.45.


The inventors also determined a single-base PAM requirement for CasΦ.20, CasΦ.24 and CasΦ.25. Amino acid sequences of the proteins used are shown in TABLE 1. The CasΦ polypeptides were complexed with their native crRNAs to form RNP complexes at room temperature for 20 minutes. crRNA sequences are provided in TABLE 2. The RNP complexes were incubated with target DNA at 37° C. for 60 minutes in NEB CutSmart buffer (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA, pH 7.9 at 25° C.). The RNPs were then used in cleavage reactions with plasmid DNA comprising a target sequence and a PAM. Starting with a TTTg PAM, the PAM was mutated to each of the sequences shown in FIG. 14C to assess the PAM requirement. The products of the cleavage reactions were analyzed by gel electrophoresis, as seen in FIG. 14C. FIG. 14D provides the quantification of the gels shown in FIG. 14C. Together, the data in FIG. 14C and FIG. 14D demonstrate an NTNN PAM for DNA cleavage by CasΦ.20, CasΦ.24 and CasΦ.25.


This example demonstrates PAM sequences that enable CasΦ polypeptides to be targeted to a target sequence.
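The PAM consensus strings reported in this example (NTTN, NTNN, GTTN, NTTG) use the IUPAC convention in which N denotes any base. As a minimal sketch, assuming a PAM is read as a four-letter DNA string, a candidate PAM can be checked against a consensus as follows. The function names `pam_to_regex` and `matches_pam` are hypothetical and are not part of the disclosure.

```python
import re

# IUPAC codes needed for the consensus strings in this example.
# "N" stands for any of the four DNA bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def pam_to_regex(consensus: str) -> re.Pattern:
    """Translate a PAM consensus such as 'NTTN' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def matches_pam(pam: str, consensus: str) -> bool:
    """Return True if a concrete PAM (e.g. 'CTTG') satisfies the consensus."""
    return pam_to_regex(consensus).fullmatch(pam.upper()) is not None


if __name__ == "__main__":
    consensus_list = ("NTTN", "NTNN", "GTTN", "NTTG")
    # PAMs taken from TABLE 6, plus the TTTg starting PAM of FIG. 14C.
    for pam in ("CTTG", "GTTC", "CTCG", "TTTg"):
        hits = [c for c in consensus_list if matches_pam(pam, c)]
        print(pam, "satisfies", hits)
```

Under this reading, a PAM such as CTCG satisfies NTNN but not NTTN, because only the second position is constrained to T in NTNN.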


Example 20

CasΦ-Mediated Genome Editing in HEK293T Cells


This example illustrates the ability of CasΦ polypeptides to mediate genome editing in HEK293T cells, a cell line which is widely used in biological research. In this study, a CasΦ.12 plasmid, including both CasΦ polypeptide sequence and gRNA sequence, sometimes called an all-in-one, was delivered via lipofection. Spacers targeted exon 4 of the Fut8 gene. The spacer sequences are provided in TABLE 7. Cells were transfected on day 0 and harvested for analysis on day 5. As shown in FIG. 15, the target locus was modified following delivery of CasΦ.12 and gRNA 2. Cas9 was delivered to HEK293T cells to provide a positive control and no modification was detected when a non-targeting (NT) gRNA was used. The presence of indels was confirmed by next generation sequence analysis. The sample targeted by CasΦ.12 and gRNA 2 is shown in FIG. 15. The next generation sequence analysis revealed a diverse pattern of indels. The most frequent mutations were deletion mutations of 4 to 18 base pairs. The frequency of mutations was quantified and is illustrated as “% modified”, which is defined as the % of modification in the DNA sequence when aligned to unedited cells. Modifications can be deletions, insertions and substitutions.


This example demonstrates the use of CasΦ.12 as a robust genome editing tool.











TABLE 7

Name     Target                  Spacer sequence (5′->3′) [SEQ ID NO]
Fut8_1   CasPhi target           GAAGAGCAGCTTGTTAAGGC (SEQ ID NO: 1424)
Fut8_2   CasPhi target           GCCTTAACAAGCTGCTCTTC (SEQ ID NO: 1425)
Fut8_3   Cas9 target (control)   ATTGATCAGGGGCCAgctat (SEQ ID NO: 1426)
Fut8_4   Cas9 target (control)   Acgcgtactcttcctatagc (SEQ ID NO: 1427)
Nt       Non target              CGTGATGGTCTCGATTGAGT (SEQ ID NO: 1428)
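The two CasΦ spacers listed in TABLE 7 can be compared by taking a reverse complement. The following is a minimal sketch; the helper name `reverse_complement` is hypothetical. It shows that, as written, Fut8_2 is the reverse complement of Fut8_1, which would be consistent with the two spacers addressing opposite strands of the same 20 bp site. That interpretation is an inference from the sequences alone and is not stated in the example.

```python
# Translation table for DNA complementation (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


FUT8_1 = "GAAGAGCAGCTTGTTAAGGC"  # SEQ ID NO: 1424
FUT8_2 = "GCCTTAACAAGCTGCTCTTC"  # SEQ ID NO: 1425

if __name__ == "__main__":
    print(reverse_complement(FUT8_1) == FUT8_2)
```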









Example 21

CasΦ-Mediated Genome Editing in CHO Cells


This example illustrates the ability of CasΦ polypeptides to mediate genome editing in CHO cells, an epithelial cell line which is frequently used in biological and medical research. To test the function of CasΦ.12 in CHO cells, 40 pmol CasΦ.12 was complexed to its native crRNA (2.5:1 crRNA:CasΦ). To prepare a mastermix of CasΦ.12 RNP, 3 μl crRNA (at 100 nM) was added to 1.6 μl CasΦ.12 (at 75 μM). Spacer sequences are provided in TABLE 8. The RNP complexes were incubated at 37° C. for 30 minutes. CHO cells were resuspended at 1.2×10⁶ cells/ml in SF solution (Lonza). 40 μl of the cell suspension was added to the RNP complexes and 20 μl of the resultant suspension was then transferred to individual tubes for nucleofection. Lonza setting FF-137 was used to nucleofect the CHO cells. Cells were then harvested for analysis on day 5. As shown in FIG. 16A, CasΦ.12 induced the generation of indels in each of the endogenous genes tested (Bak1, Bax and Fut8). The ability of CasΦ.12 to induce indel mutations in each of these genes is further shown in FIG. 16F for Bak1, FIG. 16G for Bax and FIG. 16H for Fut8. Spacer sequences for FIG. 16F, FIG. 16G and FIG. 16H are provided in Tables F, G, and H, respectively. The data shown in FIG. 16F-H were produced with 200,000 CHO cells per transfection, RNP complexed with 250 pmol of CasΦ.12, and full-length unmodified guide RNA in molar excess relative to CasΦ.12, using the same Lonza reagents described for producing data presented in FIGS. 16A-E.
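The mastermix quantities above can be checked with simple arithmetic, since 1 μM corresponds to 1 pmol/μl. The sketch below uses a hypothetical helper, `picomoles`, and evaluates the crRNA amount under two readings of the stock concentration. With the 100 nM value as printed, the crRNA:CasΦ molar ratio falls far below 2.5:1. The stated 2.5:1 ratio is reproduced if the crRNA stock is read as 100 μM, which is an assumption made here for illustration and is not confirmed by the text.

```python
def picomoles(volume_ul: float, conc_um: float) -> float:
    """Amount in pmol from a volume (microlitres) and a concentration (micromolar).

    1 uM equals 1 pmol per microlitre, so pmol = uL * uM.
    """
    return volume_ul * conc_um


# CasPhi.12 protein: 1.6 uL at 75 uM.
cas_pmol = picomoles(1.6, 75.0)

# crRNA: 3 uL, evaluated under two readings of the stock concentration.
crrna_pmol_as_printed = picomoles(3.0, 0.1)   # 100 nM, as printed
crrna_pmol_if_um = picomoles(3.0, 100.0)      # 100 uM, an assumption

ratio_as_printed = crrna_pmol_as_printed / cas_pmol
ratio_if_um = crrna_pmol_if_um / cas_pmol

if __name__ == "__main__":
    print(f"CasPhi.12: {cas_pmol:.1f} pmol")
    print(f"crRNA:CasPhi ratio at 100 nM: {ratio_as_printed:.4f}")
    print(f"crRNA:CasPhi ratio at 100 uM: {ratio_if_um:.2f}")
```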











TABLE 8

Name     Spacer sequence (5′->3′)                Repeat+Spacer sequence (5′->3′), shown as DNA
Bak1_1   GAAGCTATGTTTTCCATCTC (SEQ ID NO: 443)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGAAGCTATGTTTTCCATCTC (SEQ ID NO: 1197)
Bak1_2   GCAGGGGCAGCCGCCCCCTG (SEQ ID NO: 444)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGCAGGGGCAGCCGCCCCCTG (SEQ ID NO: 1198)
Bak1_3   CTCCTAGAACCCAACAGGTA (SEQ ID NO: 445)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCTCCTAGAACCCAACAGGTA (SEQ ID NO: 1199)
Bak1_4   GAAAGACCTCCTCTGTGTCC (SEQ ID NO: 446)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGAAAGACCTCCTCTGTGTCC (SEQ ID NO: 1200)
Bak1_5   TCCATCTCGGGGTTGGCAGG (SEQ ID NO: 447)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTCCATCTCGGGGTTGGCAGG (SEQ ID NO: 1201)
Bak1_6   TTCCTGATGGTGGAGATGGA (SEQ ID NO: 448)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTTCCTGATGGTGGAGATGGA (SEQ ID NO: 1202)
Bax_1    CTAATGTGGATACTAACTCC (SEQ ID NO: 479)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCTAATGTGGATACTAACTCC (SEQ ID NO: 1269)
Bax_2    TTCCGTGTGGCAGCTGACAT (SEQ ID NO: 480)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTTCCGTGTGGCAGCTGACAT (SEQ ID NO: 1270)
Bax_3    CTGATGGCAACTTCAACTGG (SEQ ID NO: 481)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCTGATGGCAACTTCAACTGG (SEQ ID NO: 1271)
Bax_4    TACTTTGCTAGCAAACTGGT (SEQ ID NO: 482)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTACTTTGCTAGCAAACTGGT (SEQ ID NO: 1272)
Bax_5    AGCACCAGTTTGCTAGCAAA (SEQ ID NO: 483)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACAGCACCAGTTTGCTAGCAAA (SEQ ID NO: 1273)
Bax_6    AACTGGGGCCGGGTTGTTGC (SEQ ID NO: 484)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACAACTGGGGCCGGGTTGTTGC (SEQ ID NO: 1274)
Fut8_1   CCACTTTGTCAGTGCGTCTG (SEQ ID NO: 507)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCCACTTTGTCAGTGCGTCTG (SEQ ID NO: 1325)
Fut8_2   CTCAATGGGATGGAAGGCTG (SEQ ID NO: 508)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCTCAATGGGATGGAAGGCTG (SEQ ID NO: 1326)
Fut8_3   AGGAATACATGGTACACGTT (SEQ ID NO: 509)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACAGGAATACATGGTACACGTT (SEQ ID NO: 1327)
Fut8_4   AAGAACATTTTCAGCTTCTC (SEQ ID NO: 510)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACAAGAACATTTTCAGCTTCTC (SEQ ID NO: 1328)
Fut8_5   ATCCACTTTCATTCTGCGTT (SEQ ID NO: 511)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACATCCACTTTCATTCTGCGTT (SEQ ID NO: 1329)
Fut8_6   TTTGTTAAAGGAGGCAAAGA (SEQ ID NO: 512)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTTTGTTAAAGGAGGCAAAGA (SEQ ID NO: 1330)
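Each Repeat+Spacer entry in TABLE 8 is the same 36-nucleotide repeat followed by the 20-nucleotide spacer, written as DNA. A minimal sketch of that assembly, with an optional conversion to the RNA alphabet, is given below. The function name `build_guide` and its parameters are hypothetical and serve only to illustrate the layout of the sequences.

```python
# Repeat shared by every Repeat+Spacer entry in TABLE 8, shown as DNA.
REPEAT_DNA = "CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGAC"


def build_guide(spacer_dna: str, repeat_dna: str = REPEAT_DNA, as_rna: bool = False) -> str:
    """Concatenate repeat and spacer (5'->3'); optionally write T as U."""
    guide = repeat_dna + spacer_dna
    return guide.replace("T", "U") if as_rna else guide


if __name__ == "__main__":
    bak1_1_spacer = "GAAGCTATGTTTTCCATCTC"  # SEQ ID NO: 443
    print(build_guide(bak1_1_spacer))               # DNA form, as tabulated
    print(build_guide(bak1_1_spacer, as_rna=True))  # same guide in RNA letters
```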









The inventors went on to demonstrate the ability of CasΦ.12 to mediate gene editing via the homology directed repair pathway. The inventors tested DNA donor oligos with 25 bp, 50 bp or 90 bp homology arms (HA), as shown in FIG. 16B. The donor oligos were delivered to CHO cells with or without CasΦ.12 and crRNA. As seen in FIG. 16C, indels were not detected in the absence of CasΦ.12. In contrast, indels were detected in the presence of CasΦ.12 and were confirmed by sequencing the endogenous targeted locus (FIG. 16D). The sequencing analysis also showed the successful incorporation of a DNA donor oligo into the endogenous targeted locus (FIG. 16E).


The inventors further demonstrated the ability of CasΦ.12 to mediate gene editing of Bax and Fut8 genes via the homology directed repair pathway. In this additional study, DNA donor oligos with 20 bp, 25 bp, 30 bp or 40 bp 90 bp HA were used, shown in FIG. 16I. These DNA donor oligos were either unmodified or modified with phosphorothioate (PS) bonds between the first 5′, and the last two 3′ bases. As shown in FIG. 16J, CasΦ.12 mediated successful incorporation of a DNA donor oligo into the endogenous targeted locus. Finally, the inventors further optimized CasΦ.12-mediated genome editing of Fut8 using AAV6 delivery of the DNA donor. In this study, CHO cells were transfected with Fut8-targeting RNP (500 pmol) using Lonza nucleofection protocols. AAV6 donors at different MOIs were added to cells immediately after transfection. The frequency of indels and HDR was analyzed by NGS. As shown in FIG. 16K and FIG. 16L, CasΦ.12 induced the generation of indels and HDR.


These data further demonstrate the utility of CasΦ polypeptides as a genome editing tool.


Example 22

CasΦ-Mediated Genome Editing in K562 Cells


This example illustrates the ability of CasΦ polypeptides to mediate genome editing in K562 cells, a myelogenous leukemia cell line which is particularly useful for biological and medical research by virtue of its amenability for nucleofection by electroporation. In this study, K562 cells were nucleofected with Cas9 or CasΦ.12. To nucleofect the cells, 150,000 cells in SF solution (SF Cell Line 96 Amaxa) were added to the amount of plasmid (expressing the gRNA targeting the Fut8 gene and either Cas9 or CasΦ.12) indicated in FIG. 17. Amaxa program 96-FF-120 was used to nucleofect the cells. The cells were harvested two days after nucleofection and the frequency of indel mutations was determined. As shown in FIG. 17, as the amount of CasΦ.12 plasmid increased, the amount of indels detected in the endogenous Fut8 gene also increased.


Example 23

CasΦ-Mediated Genome Editing in Primary Cells


This example illustrates the ability of CasΦ polypeptides to mediate genome editing in primary cells, such as T cells. In this study, CasΦ.12 was delivered to human T cells. CasΦ.12 was complexed to its native crRNA comprising the spacer sequence 5′-GGGCCGAGAUGUCUCGCUCC-3′ (SEQ ID NO: 1429). Complexes were formed in a 3:1 ratio of crRNA:protein. For nucleofection, 50 pmol RNP was mixed with 320,000 cells per well and the Amaxa EH115 program was used. Immediately after nucleofection, 80 μl pre-warmed culture medium was added to each well. The cells were then left in the cuvette plate for 15 minutes before transfer to the culture plate. Genomic DNA was extracted from cells on day 3 and day 5. Flow cytometry analysis was performed on day 5. As shown in FIG. 18A, when CasΦ.12 was delivered with a gRNA targeting the endogenous beta-2 microglobulin (B2M) gene, a distinct population of B2M-negative cells was detected by flow cytometry analysis, demonstrating the CasΦ.12-mediated knockout of the endogenous B2M gene. In the absence of the B2M-targeting gRNA, the population of B2M-negative cells was not observed by flow cytometry. Indels were confirmed by next generation sequencing analysis, as shown in FIG. 18C, and quantified, as shown in FIG. 18B.


The inventors went on to use CasΦ.12 to target the T-cell receptor alpha-constant (TRAC) gene. Knockout of the TRAC gene prevents expression of the T cell receptor. Accordingly, TRAC knockout T cells are beneficial for T cell therapies (e.g. CAR-T cell therapies) because TRAC knockout T cells have a longer half-life in vivo as the T cells have less potential to attack the recipient's normal cells. In this study, CasΦ.12 and gRNA targeting the TRAC gene (CasPhi1 or CasPhi7) were delivered to T cells. As shown in FIG. 18D, the delivery of the CasΦ.12 and the gRNA resulted in a population of TRAC-negative cells, which were detected by flow cytometry. The inventors went on to confirm the presence of indel mutations by sequencing the target locus. As shown in FIG. 18E, the sequence analysis revealed insertion, deletion and substitution mutations at the endogenous targeted locus. The frequency of indel mutations was quantified, as shown in FIG. 18F.


These data demonstrate the utility of CasΦ polypeptides as a robust genome editing tool in primary human cells.


Example 24

Separable DNA Strand Cleavage Reactions of CasΦ Nucleases


This example further illustrates the mechanism of DNA strand cleavage by CasΦ polypeptides. In this study, CasΦ.4, CasΦ.12 and CasΦ.18 were complexed with their native crRNA. RNP complexes were formed by a 20 minute incubation at room temperature. The target plasmid was a 2.1 kb plasmid containing the target sequence TATTAAATACTCGTATTGCTGTTCGATTAT (SEQ ID NO: 108). The cleavage reaction was carried out at 37° C. and had a duration of 30 minutes. The cleavage products were then analyzed by gel electrophoresis. As shown in FIG. 19, CasΦ polypeptides nick supercoiled (sc) DNA by cleaving the non-target DNA strand. Some CasΦ polypeptides, such as CasΦ.4 and CasΦ.12, then go on to cleave the second (target) strand to generate a linear product from a plasmid target. Other CasΦ polypeptides, such as CasΦ.18, function as nickases and do not go on to cleave the second strand. CasΦ cleavage activity is dependent on metal cations, such as Mg2+. Varying the concentration of Mg2+ allows the cleavage of the first strand and then the second strand by CasΦ.4 and CasΦ.12 to be visualized. As the concentration of Mg2+ increases, the amount of linearized product detected increases, indicating that the second strand has been cleaved in the CasΦ.4 and CasΦ.12 reactions.


Example 25

Detection of a Target Nucleic Acid by CasΦ Polypeptides


This example illustrates the use of CasΦ.4 and CasΦ.18 in a nucleic acid detection assay by virtue of trans cleavage activity of ssDNA. In this study, 100 nM RNP was prepared and used in a detection assay. In the detection assay, the target dsDNA was at a concentration of 10 nM and the ssDNA reporter molecule was at a concentration of 100 nM. The target dsDNA included 5 target sequences (which were targeted by a pool of 5 gRNAs) with 7 base pairs flanking the 20 nucleotide target sequences on both 5′ and 3′ sides, as shown in FIG. 20. The detection assay was carried out at 37° C. The buffer conditions provided in TABLE 9 were tested in the detection assay. All buffers were supplemented with 0.1 mg/ml BSA and 1 mM TCEP. As seen in FIG. 20, when a gRNA (complexed to a CasΦ polypeptide) hybridizes to a target nucleic acid, the CasΦ's trans cleavage activity is activated such that a labeled ssDNA reporter is degraded. The degradation of the ssDNA reporter is detected as fluorescence, thus allowing CasΦ polypeptides to be used in assays to achieve fast and high-fidelity detection of target nucleic acid molecules in a sample. As shown in FIG. 20, high pH (e.g. 8-9) and high Mg2+ concentration (e.g. 12-15 mM) provided preferred conditions for the detection assay.














TABLE 9

buffer ID #   pH    1X NaCl (mM)   1X MgCl2 (mM)
1             9     150            15
2             9     150            3
3             7.5   0              3
4             9     0              3
5             9     0              15
6             7.5   150            3
7             7.5   150            15
8             8     37.5           3
9             8.5   150            12
10            7.5   0              15
11            8.5   0              6
12            9     150            3
13            9     0              3
14            9     150            15
15            8     150            6
16            7.5   150            15
17            8     112.5          15
18            9     0              15
19            7.5   150            3
20            8.5   112.5          3
21            8.5   37.5           12
22            7.5   0              3
23            8.5   112.5          6
24            7.5   37.5           6
25            8     0              12
26            7.5   112.5          6
27            8.5   37.5           15
28            9     37.5           6
29            9     112.5          12
30            7.5   37.5           12
31            7.5   0              15
32            7.5   112.5          12









These data demonstrate the utility of CasΦ polypeptides in nucleic acid detection assays.
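The buffer screen of TABLE 9 can be filtered for the conditions described as preferred in this example, namely pH 8-9 together with 12-15 mM Mg2+. The sketch below transcribes TABLE 9 and lists the buffer IDs that fall inside those ranges. The name `in_stated_range` is hypothetical, and the output identifies only which buffers lie within the stated ranges; it makes no claim about the relative performance of individual buffers, which is reported in FIG. 20.

```python
# (buffer ID, pH, 1X NaCl in mM, 1X MgCl2 in mM), transcribed from TABLE 9.
BUFFERS = [
    (1, 9.0, 150.0, 15), (2, 9.0, 150.0, 3), (3, 7.5, 0.0, 3), (4, 9.0, 0.0, 3),
    (5, 9.0, 0.0, 15), (6, 7.5, 150.0, 3), (7, 7.5, 150.0, 15), (8, 8.0, 37.5, 3),
    (9, 8.5, 150.0, 12), (10, 7.5, 0.0, 15), (11, 8.5, 0.0, 6), (12, 9.0, 150.0, 3),
    (13, 9.0, 0.0, 3), (14, 9.0, 150.0, 15), (15, 8.0, 150.0, 6), (16, 7.5, 150.0, 15),
    (17, 8.0, 112.5, 15), (18, 9.0, 0.0, 15), (19, 7.5, 150.0, 3), (20, 8.5, 112.5, 3),
    (21, 8.5, 37.5, 12), (22, 7.5, 0.0, 3), (23, 8.5, 112.5, 6), (24, 7.5, 37.5, 6),
    (25, 8.0, 0.0, 12), (26, 7.5, 112.5, 6), (27, 8.5, 37.5, 15), (28, 9.0, 37.5, 6),
    (29, 9.0, 112.5, 12), (30, 7.5, 37.5, 12), (31, 7.5, 0.0, 15), (32, 7.5, 112.5, 12),
]


def in_stated_range(ph, mg_mm, ph_range=(8.0, 9.0), mg_range=(12, 15)):
    """True if pH and Mg2+ both fall in the ranges named as preferred in the text."""
    return ph_range[0] <= ph <= ph_range[1] and mg_range[0] <= mg_mm <= mg_range[1]


# IDs of the buffers whose pH and MgCl2 values lie inside both ranges.
matching_ids = [bid for bid, ph, _nacl, mg in BUFFERS if in_stated_range(ph, mg)]

if __name__ == "__main__":
    print(matching_ids)
```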


Example 26

High Efficiency of CasΦ Polypeptide-Mediated Genome Editing in Primary Cells


The present example shows that CasΦ.12 mediates high genome editing efficiency that is comparable to the editing efficiency mediated by Cas9. Results of the study are shown in FIG. 21. In this study, CasΦ.12 mRNA (SEQ ID NO: 107) with a gRNA (CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGGCCGAGAUGUCUCGCUCC (SEQ ID NO: 1430); spacer sequence is bold and underlined) or Cas9 mRNA with a gRNA (GGCCGAGATGTCTCGCTCCG (SEQ ID NO: 1431)) was delivered to T cells. gRNAs used in this study targeted the B2M gene. For nucleofection, T cells were resuspended in BTXpress electroporation medium (5×10⁵ cells per well) and mixed with CasΦ.12 or Cas9 mRNA and 500 pmol gRNA. Cells were collected on day 2 for extraction of genomic DNA, and the frequency of indel mutations was determined. As shown in FIG. 21A, when 20 μg of CasΦ.12 mRNA was delivered with gRNA to T cells, high genome editing efficiency was achieved, at a level similar to that achieved using Cas9. Cells were also collected on day 2 for flow cytometry to determine the frequency of B2M knockout. As shown in FIG. 21B and quantified in FIG. 21A, a similar percentage of B2M-negative cells was detected after delivery of CasΦ.12 or Cas9 mRNA. Accordingly, this example demonstrates high efficiency of CasΦ polypeptide-mediated genome editing in primary cells.


Example 27

CasΦ Polypeptide-Mediated Genome Editing in CHO Cells


The present example describes the identification of optimized gRNAs for CasΦ.12-mediated genome editing in CHO cells. In this study, CasΦ.12 polypeptides (SEQ ID NO: 107) were complexed with a gRNA shown in TABLE 10. CHO cells were resuspended in SF solution and Lonza setting FF-137 was used to nucleofect the cells (200,000 cells per well) with 250 pmol RNP. Genomic DNA was extracted and the presence of indels was confirmed by next generation sequence analysis. FIG. 22A shows the frequency of indel mutations induced by CasΦ.12 polypeptides complexed with a 2′fluoro modified gRNA. As shown in FIG. 22B, gRNAs with ˜20% or greater editing efficiency were identified.











TABLE 10

Name                    Spacer sequence (5′→3′)                 RNA sequence (5′→3′), shown as DNA
R2849_Bak1_nsd_sg1      CTGACTCCCAGCTCTGACCC (SEQ ID NO: 449)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCTGACTCCCAGCTCTGACCC (SEQ ID NO: 1203)
R2855_Bak1_nsd_sg7      CCATCTCCACCATCAGGAAC (SEQ ID NO: 455)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCCATCTCCACCATCAGGAAC (SEQ ID NO: 1209)
R3977 Bak1_exon1_sg1    TCCAGACGCCATCTTTCAGG (SEQ ID NO: 465)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTCCAGACGCCATCTTTCAGG (SEQ ID NO: 1219)
R3978 Bak1_exon1_sg2    TGGTAAGAGTCCTCCTGCCC (SEQ ID NO: 466)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTGGTAAGAGTCCTCCTGCCC (SEQ ID NO: 1220)
R3979 Bak1_exon3_sg1    TTACAGCATCTTGGGTCAGG (SEQ ID NO: 467)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTTACAGCATCTTGGGTCAGG (SEQ ID NO: 1221)
R3980 Bak1_exon3_sg2    GGTCAGGTGGGCCGGCAGCT (SEQ ID NO: 468)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGGTCAGGTGGGCCGGCAGCT (SEQ ID NO: 1222)
R3981 Bak1_exon3_sg3    CTATCATTGGAGATGACATT (SEQ ID NO: 469)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCTATCATTGGAGATGACATT (SEQ ID NO: 1223)
R3982 Bak1_exon3_sg4    GAGATGACATTAACCGGAGA (SEQ ID NO: 470)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGAGATGACATTAACCGGAGA (SEQ ID NO: 1224)
R3983 Bak1_exon3_sg5    TGGAACTCTGTGTCGTATCT (SEQ ID NO: 471)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACTGGAACTCTGTGTCGTATCT (SEQ ID NO: 1225)
R3984 Bak1_exon3_sg6    CAGAATTTACTGGAGCAGCT (SEQ ID NO: 472)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCAGAATTTACTGGAGCAGCT (SEQ ID NO: 1226)
R3985 Bak1_exon3_sg7    ACTGGAGCAGCTGCAGCCCA (SEQ ID NO: 473)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACACTGGAGCAGCTGCAGCCCA (SEQ ID NO: 1227)
R3986 Bak1_exon3_sg8    CCAGCTGTGGGCTGCAGCTG (SEQ ID NO: 474)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCCAGCTGTGGGCTGCAGCTG (SEQ ID NO: 1228)
R3987 Bak1_exon3_sg9    GTAGGCATTCCCAGCTGTGG (SEQ ID NO: 475)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGTAGGCATTCCCAGCTGTGG (SEQ ID NO: 1229)
R3988 Bak1_exon3_sg10   GTGAAGAGTTCGTAGGCATT (SEQ ID NO: 476)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACGTGAAGAGTTCGTAGGCATT (SEQ ID NO: 1230)
R3989 Bak1_exon3_sg11   ACCAAGATTGCCTCCAGGTA (SEQ ID NO: 477)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACACCAAGATTGCCTCCAGGTA (SEQ ID NO: 1231)
R3990 Bak1_exon3_sg12   CCTCCAGGTACCCACCACCA (SEQ ID NO: 478)   CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGACCCTCCAGGTACCCACCACCA (SEQ ID NO: 1232)









Example 28

Minimal Off-Target Effects of CasΦ Polypeptides


This example illustrates the off-target profiles of CasΦ.12 and Cas9. A major challenge in the translation of CRISPR/Cas9 technology into the clinic has been overcoming off-target effects. Off-target effects arise where a gRNA tolerates mismatches in complementarity of the gRNA and target sequence, and so the gRNA hybridizes to a sequence that is not the target sequence. Off-target effects are a source of major concern as it is important to avoid the production of unnecessary mutations that could be detrimental. In this study, CIRCLE-seq was performed to detect off-target sites (Tsai et al. 2017 Nature Methods). Sequencing was performed on genomic DNA extracted from CHO cells that had been transfected with CasΦ.12 polypeptide (SEQ ID NO: 107) and a gRNA targeting Fut8, CasΦ.12 polypeptide and a gRNA targeting BAX or Cas9 polypeptide and a gRNA targeting BAX. As shown in FIG. 23A, CasΦ.12 targeting Fut8 induced minimal off-target mutations. FIG. 23D shows the off-target mutations induced by Cas9 editing of Fut8. Similarly, CasΦ.12 targeting BAX induced minimal off-target mutations, as shown in FIG. 23B. Cas9 targeting BAX induced a higher percentage of off-target mutations, as shown in FIG. 23C, compared to CasΦ.12. Cas9 targeting Bak1 also induced a higher percentage of off-target mutations, as shown in FIG. 23E, compared to CasΦ.12, as shown in FIG. 23F.


In a further study, GUIDE-Seq was performed to detect off-target sites (Tsai et al. 2015 Nature Biotechnology). Sequencing was performed on genomic DNA extracted from HEK293 cells following delivery of either CasΦ.12 polypeptide or Cas9 polypeptide and a gRNA targeting human Fut8. As shown in FIG. 23G, no off-target mutations were detected in the CasΦ.12 polypeptide sample. In contrast, several off-target mutations were detected in the Cas9 polypeptide sample, as shown in FIG. 23H. Accordingly, this example demonstrates that CasΦ polypeptides have fewer off-target effects than Cas9.


Example 29

CasΦ Polypeptide-Mediated Genome Editing Via Homology Directed Repair (HDR)


The present example illustrates the ability of CasΦ.12 to mediate HDR. In this study, CasΦ.12 polypeptide (SEQ ID NO: 107) was complexed with a gRNA (CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGAGUCUCUCAGCUGGUACAC (SEQ ID NO: 1432)) targeting the TRAC gene and delivered to T cells. RNP complexes were formed by a 10 minute incubation at room temperature. T cells were resuspended at 5×10⁵ cells/20 μL in electroporation solution (Lonza). T cells were nucleofected using the Amaxa P3 kit and Amaxa 4D Nucleofector with pulse code EH115. Immediately after nucleofection, 80 μl pre-warmed culture medium was added to each well. The cells were then left in the cuvette plate for 10 minutes before transfer to the culture plate. Cells were harvested and genomic DNA was extracted. The frequency of indel mutations and HDR was determined and is shown in FIG. 24A. The frequency of indel mutations and HDR was combined to determine the frequency of modification. Flow cytometry was also performed to determine the frequency of TRAC knockout, as assessed by the loss of CD3 at the cell surface. FIG. 24A shows CasΦ.12-mediated gene editing via the HDR pathway. FIG. 24B shows a schematic of the donor oligonucleotide. Thus, this example demonstrates the use of CasΦ polypeptides as robust genome editing tools.


Example 30

Multiplex Genome Editing with CasΦ Polypeptides


This example illustrates the ability of CasΦ RNP complexes to target multiple genes simultaneously. In this study, gRNAs targeting B2M or TRAC were incubated with CasΦ.12 polypeptides (SEQ ID NO: 107) for 10 minutes at room temperature to form RNP complexes. RNP complexes were formed with a variety of gRNAs with different modifications (unmodified, 2′-O-methyl on the last 3′ nucleotide of the crRNA (1me), 2′-O-methyl on the last two 3′ nucleotides of the crRNA (2me) and 2′-O-methyl on the last three 3′ nucleotides of the crRNA (3me)) and with different repeat and spacer sequences (20-20, which corresponds to 20 nucleotide repeat and 20 nucleotide spacer, and 20-17, which corresponds to 20 nucleotide repeat and 17 nucleotide spacer), as shown in TABLE 11. B2M targeting RNPs, TRAC targeting RNPs or B2M targeting RNPs and TRAC targeting RNPs were added to T cells. T cells were resuspended at 5×10⁵ cells/20 μL in Nucleofection P3 solution and an Amaxa 4D 96-well electroporation system with pulse code EH115 was used to nucleofect the cells. Immediately after nucleofection, 85 μl pre-warmed culture medium was added to each well. The cells were then left in the cuvette plate for 10 minutes before transfer to the culture plate. On Day 3, genomic DNA was extracted. On Day 5, cells were harvested for flow cytometry. Quantification of the percentage of B2M-negative and CD3-negative cells is shown in FIG. 25A for gRNAs with a repeat length of 20 nucleotides and a spacer length of 20 nucleotides, and in FIG. 25B for gRNAs with a repeat length of 20 nucleotides and a spacer length of 17 nucleotides. Corresponding flow cytometry panels can be seen in FIG. 25C for gRNAs of different repeat and spacer lengths and with different modifications.
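The 20-20 and 20-17 formats can be illustrated by trimming a longer repeat and a 20-nucleotide spacer. The sketch below assumes that the shortened repeat keeps the 3′-terminal nucleotides of the 36-nucleotide repeat seen in the gRNAs of Examples 26 and 29, and that the shortened spacer keeps its 5′-terminal nucleotides; both assumptions are consistent with the B2M and TRAC crRNA sequences of TABLE 11 but are not stated as a rule in the text. The function name `make_crrna` is hypothetical.

```python
# 36-nt repeat as it appears at the 5' end of the gRNAs in Examples 26 and 29.
FULL_REPEAT = "CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC"


def make_crrna(spacer: str, repeat: str = FULL_REPEAT,
               repeat_len: int = 20, spacer_len: int = 20) -> str:
    """Join the last `repeat_len` nt of the repeat to the first `spacer_len` nt of the spacer."""
    if not 0 <= repeat_len <= len(repeat):
        raise ValueError("repeat_len exceeds the repeat length")
    if not 0 <= spacer_len <= len(spacer):
        raise ValueError("spacer_len exceeds the spacer length")
    trimmed_repeat = repeat[len(repeat) - repeat_len:]  # keep the 3' end (assumption)
    trimmed_spacer = spacer[:spacer_len]                # keep the 5' end (assumption)
    return trimmed_repeat + trimmed_spacer


if __name__ == "__main__":
    b2m_spacer = "CAGUGGGGGUGAAUUCAGUG"  # SEQ ID NO: 1434
    print(make_crrna(b2m_spacer))                 # 20-20 format
    print(make_crrna(b2m_spacer, spacer_len=17))  # 20-17 format
```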


In a further study, RNP complexes were formed using CasΦ.12 and modified gRNAs (unmodified, 1me, 2me, 3me, 2′-fluoro on the last 3′ nucleotide of the crRNA (1F), 2′-fluoro on the last two 3′ nucleotides of the crRNA (2F) and 2′-fluoro on the last three 3′ nucleotides of the crRNA (3F)) with different lengths of spacer sequences (20-20 and 20-17 as above) that target TRAC. T cells were nucleofected with RNP complexes (125 μmol) using the P3 primary cell nucleofection kit and an Amaxa 4D 96-well electroporation system with pulse code EH115. As shown in FIG. 25D, ˜90% editing efficiency was achieved using CasΦ.12 and modified gRNAs. FIG. 25E shows a flow cytometry plot illustrating ˜90% TRAC knockout in T cells after delivery of CasΦ.12 and modified gRNAs. These data further demonstrate the ability of CasΦ to mediate high efficiency genome editing.














TABLE 11

Name          Target        Modification                Repeat sequence (5′→3′)                 Spacer sequence (5′→3′)                 crRNA sequence (5′→3′)
R3150 20-20   B2M Exon 2    Unmodified, 1me, 2me, 3me   AUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1433)   CAGUGGGGGUGAAUUCAGUG (SEQ ID NO: 1434)   AUUGCUCCUUACGAGGAGACCAGUGGGGGUGAAUUCAGUG (SEQ ID NO: 1435)
R3042 20-20   TRAC Exon 1   Unmodified, 1me, 2me, 3me   AUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1433)   GAGUCUCUCAGCUGGUACAC (SEQ ID NO: 1436)   AUUGCUCCUUACGAGGAGACGAGUCUCUCAGCUGGUACAC (SEQ ID NO: 1437)
R3150 20-17   B2M Exon 2    Unmodified, 1me, 2me, 3me   AUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1433)   CAGUGGGGGUGAAUUCA (SEQ ID NO: 1438)      AUUGCUCCUUACGAGGAGACCAGUGGGGGUGAAUUCA (SEQ ID NO: 1439)
R3042 20-17   TRAC Exon 1   Unmodified, 1me, 2me, 3me   AUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1433)   CAGUGGGGGUGAAUUCA (SEQ ID NO: 1440)      AUUGCUCCUUACGAGGAGACGAGUCUCUCAGCUGGUA (SEQ ID NO: 1441)

1me: 2′OMe at last 3′ base. 2me: 2′OMe at last two 3′ bases. 3me: 2′OMe at last three 3′ bases.









Example 31

CasΦ Polypeptides Have an Extended Seed Region


The present example shows that CasΦ.12 has an extended seed region compared to Cas9 and does not tolerate mismatches in the complementarity of the spacer and target sequences within the first 1-16 nucleotides from the 5′ end of the spacer sequence. In this study, CasΦ.12 (SEQ ID NO: 107) was complexed with a gRNA targeting the TRAC gene and delivered to T cells. Spacer sequences contained a single mismatch at the position indicated in FIG. 26A or a mismatch at each of the two positions indicated in FIG. 26B. Mismatches were generated by substituting a purine for a purine (i.e. A to G and vice versa) and a pyrimidine for a pyrimidine (i.e. U to C and vice versa). RNP complexes were formed by a 10 minute incubation at room temperature. T cells were resuspended at 5×10⁵ cells/20 μL in electroporation solution (Lonza). The Amaxa P3 kit and Amaxa 4D Nucleofector were used to nucleofect the T cells. Immediately after nucleofection, 80 μl pre-warmed culture medium was added to each well. The cells were then left in the cuvette plate for 10 minutes before transfer to the culture plate. Cells were harvested for extraction of genomic DNA to determine the frequency of indel mutations and for flow cytometry to determine the percentage of CD3 knockout cells. As shown in FIG. 26A, no indel mutations or CD3 knockout were detected when there was a single mismatch in the complementarity of the spacer and target sequences at positions 1-16 from the 5′ end of the spacer sequence. Similarly, no indels or CD3 knockout cells were detected when there was a double mismatch in the complementarity of the spacer and target sequences at positions 1-16 from the 5′ end of the spacer sequence, as shown in FIG. 26B. The data shown in FIG. 26A and FIG. 26B demonstrate that CasΦ polypeptides do not tolerate mismatches in complementarity between the spacer sequence and target sequence in the 5′ 16 positions of the spacer. This region in which mismatches are not tolerated is known as the “seed region”.
Thus the seed region of CasΦ.12 is the first 16 bases from the 5′ end of the spacer. In contrast, the seed region of Cas9 is much shorter and is reported to be only 5 nucleotides long (Wu et al., Quant Biol. 2014 June; 2(2): 59-70). Shorter seed regions result in increased likelihood of off-target effects because the likelihood of mismatches between the spacer and target occurring outside the seed region is increased. Accordingly, longer seed regions result in a reduced likelihood of off-target effects. The long seed region of CasΦ.12 is therefore advantageous over the short seed region of Cas9 and contributes to the reduced off-target effects of CasΦ.12. FIG. 26C and FIG. 26D provide schematics of the gRNAs with mismatches.
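The mismatch design described in this example, in which a purine is exchanged for the other purine and a pyrimidine for the other pyrimidine, can be sketched as follows. The TRAC spacer of TABLE 11 is used purely as an illustration, since the exact spacers tested in FIG. 26A and FIG. 26B are not listed here. The names `with_mismatch` and `in_seed` are hypothetical, and positions are counted from 1 at the 5′ end of the spacer.

```python
# Transition substitutions used to create mismatches: A<->G and U<->C.
TRANSITION = {"A": "G", "G": "A", "U": "C", "C": "U"}

# Seed length reported for CasPhi.12 in this example.
SEED_LENGTH = 16


def with_mismatch(spacer: str, position: int) -> str:
    """Return the spacer with a transition at a 1-based position from the 5' end."""
    if not 1 <= position <= len(spacer):
        raise ValueError("position is outside the spacer")
    i = position - 1
    return spacer[:i] + TRANSITION[spacer[i]] + spacer[i + 1:]


def in_seed(position: int, seed_length: int = SEED_LENGTH) -> bool:
    """True if a 1-based spacer position lies within the seed region."""
    return 1 <= position <= seed_length


if __name__ == "__main__":
    trac_spacer = "GAGUCUCUCAGCUGGUACAC"  # SEQ ID NO: 1436, illustrative choice
    for pos in (1, 16, 17, 20):
        print(pos, with_mismatch(trac_spacer, pos), "seed" if in_seed(pos) else "outside seed")
```

A double mismatch, as in FIG. 26B, corresponds to applying `with_mismatch` twice at two different positions.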


Example 32

Use of Modified Guide RNAs with CasΦ Polypeptides


This example illustrates the ability of CasΦ.12 to mediate genome editing in CHO cells with modified gRNAs. In this study, RNP complexes were formed using CasΦ.12 polypeptide (SEQ ID NO: 107) and a modified gRNA shown in TABLE 12. For nucleofection, 200 pmol RNP was mixed with 200,000 cells per well. CHO cells were resuspended in SF solution and Lonza setting FF-137 was used to nucleofect the cells. Genomic DNA was extracted 48 hours after transfection and the frequency of indel mutations was determined. As shown in FIG. 27A, several modified gRNAs with editing efficiency of ˜10% were identified. In a further study, additional modified gRNAs were tested. As shown in FIG. 27B, modified gRNAs with editing efficiency of up to 40-50% were identified.


gRNAs with phosphorothioate (PS) backbone modifications, 2′-fluoro (2′-F) and 2′-O-Methyl (2′OMe) sugar modifications are known to increase metabolic stability and binding affinity to RNA, and replacing RNA nucleotides with DNA generates gRNAs with highly efficient gene-editing activity compared to the natural crRNA (Randar et al, 2015, PNA; McMahon et al. 2017, Molecular Therapy Vol. 26 No 5).














TABLE 12

SEQ ID NO. | Name (FIG. 27A) | Modification | Position | Full modified guide (repeat and spacer) | Name (FIG. 27A, B)
1442 | R2466_Mo1 | 2′-O-Methyl (2′OMe), 3′ phosphorothioate (PS) bonds | 2′OMe at 3 first (5′) and last (3′) bases, 3′ PS bonds between first 3 (5′) and last 2 (3′) bases | mC*mU*mU*UCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACmG*mU*mU* | Synthego_Mod
1443 | R2466_Mo2 | 2′OMe, 3′, 25 nucleotide repeat | 2′OMe at 3 first (5′) and last (3′) bases, 3′ PS bonds between first 3 (5′) and last 2 (3′) bases | mA*mA*mU*AGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACmG*mU*mU |
1444 | R2466_Mo3 | 2′-O-methoxyethyl bases | 2′-O-Methoxyethyl bases at 2 first (5′) and last (3′) bases, 3′ PS bonds between first 2 (5′) and last 2 (3′) bases | /52MOErA*/i2MOErA/UAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACG/i2MOErT/32MOErT |
1445 | R2466_Mo4 | 2′-Fluoro (2′-F) | First (5′) and last (3′) base | /52FC/UUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGU/32FU/ |
1446 | R2466_Mo5 | 2′-F, 25 nucleotide repeat | First (5′) and last (3′) base | /52FA/AUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGU/32FU/ | 1F, 45F (25 nt R)
1447 | R2466_Mo6 | 2′-F, PS, 2′OMe | First (5′) base 2′OMe, PS between first two (5′) bases, last 4 (3′) bases 2′-F | mC*U*UUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACA/i2FC/i2FG/i2FU/32FU/ | 1, 2 OMe-PS, 54, 55, 56 2′F
1448 | R2466_Mo7 | 2′-F, PS, 2′OMe, 25 nucleotide repeat | First (5′) base 2′OMe, PS between first two (5′) bases, last 4 (3′) bases 2′-F | mA*A*UAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACA/i2FC/i2FG/i2FU/32FU | 1, 2 OMe-PS, 54, 55, 56 2′F (25 nt R)
1449 | R2466_Mo8 | 2′-F | Last 4 (3′) bases 2′-F | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACA/i2FC/i2FG/i2FU/32FU | 54, 55, 56 2′F
1450 | R2466_Mo9 | 2′-F, 25 nucleotide repeat | Last 4 (3′) bases 2′-F | AAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACA/i2FC/i2FG/i2FU/32FU | 54, 55, 56 2′F (25 nt R)
1451 | R2466_Mo10 | C3 Spacer, 21 nucleotide spacer | First (5′) and last (3′) base | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUG |
1452 | R2466_Mo11 | C3 Spacer, 21 nucleotide spacer, 25 nucleotide spacer | First (5′) and last (3′) base | AAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUG |
1453 | R2466_Mo12 | DNA bases + 2′OMe, PS | 2′OMe at 3 first (5′) bases, last 4 (3′) bases DNA | mC*mU*mU*UCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGTT | 1, 2, 3 OMe-PS, 54, 55, 56 DNA
1454 | R2466_Mo13 | DNA nucleoside | Last (3′) 4 nucleoside | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGTT |
1455 | R2466_Mo14 | DNA nucleosides | Nucleoside 1 of spacer and last (3′) 4 nucleosides | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGTT | 1, 54, 55, 56 DNA
1456 | R2466_Mo15 | DNA nucleosides | Nucleoside 8 of spacer and last (3′) 4 nucleosides | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGTT |
1457 | R2466_Mo16 | DNA nucleosides | Nucleoside 9 of spacer and last (3′) 4 nucleosides | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGTT |
1458 | R2466_Mo17 | DNA nucleosides | Nucleoside 1 and 8 of spacer and last (3′) 4 nucleosides | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGTT | 1, 8, 54, 55, 56 DNA
1459 | R2466_Mo18 | DNA nucleosides | Nucleoside 1 and 9 of spacer and last (3′) 4 nucleosides | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGTT |
1460 | R2466_Mo19 | DNA nucleosides | Nucleoside 1, 8 and 9 of spacer and last (3′) 4 nucleosides | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGTT | 1, 8, 9, 54, 55, 56 DNA
1461 | R2466_Mo20 | DNA bases, 25 nucleotide repeat | Nucleoside 1, 8 and 9 of spacer and last (3′) 4 nucleosides | AAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGTT |
1462 | R2466_Mo21 | Poly-A-tail, 25 nucleotide repeat | | AAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUAAAAAAA |
1463 | R2466_Mo22 | DNA bases, 2′OMe, PS | 2′OMe and PS at first 3 (5′) bases, DNA bases at 1, 8 and 9 of spacer, PS at last 4 (3′) bases | mC*mU*mU*UCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGTT | 1, 2, 3 OMe, 1, 8, 9, 54, 55, 56 DNA
1464 | R2466_Mo23 | Unmodified, 25 nucleotide repeat | | AAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU |
1465 | R2466 (Unmodified) | Unmodified | Unmodified | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU |










Example 33

Optimization of Guide RNA Repeat and Spacer Length in CHO Cells


This example describes the optimization of repeat and spacer lengths of gRNAs for genome editing in CHO cells. In this study, RNP complexes were formed by incubating CasΦ.12 polypeptides (SEQ ID NO: 107) with a gRNA targeting the Fut8 gene, as shown in TABLE 13. The gRNAs had different repeat lengths (20 to 36 nucleotides) or spacer lengths (15 to 30 nucleotides). For nucleofection, 250 pmol RNP was mixed with 200,000 cells per well. After 2 days, cells were collected and genomic DNA was extracted to determine the frequency of indel mutations. FIG. 28A shows the generation of indels by CasΦ.12 with gRNAs containing repeat sequences of different lengths. FIG. 28B shows the generation of indels by CasΦ.12 with gRNAs containing spacer sequences of different lengths. The optimal gRNA for CasΦ.12-mediated genome editing in CHO cells was found to have a 20-nucleotide repeat length and a 17-nucleotide spacer length.
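The truncation scheme behind the gRNAs of TABLE 13 can be sketched in Python as follows. The sketch assumes, based on the sequences listed in TABLE 13, that shorter repeats are obtained by removing nucleotides from the 5′ end of the 36-nucleotide repeat (SEQ ID NO: 54) and shorter spacers by removing nucleotides from the 3′ end of the 30-nucleotide spacer (SEQ ID NO: 1482); the function name is hypothetical.

```python
FULL_REPEAT = "CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC"  # 36 nt, SEQ ID NO: 54
FULL_SPACER = "AGGAAUACAUGGUACACGUUGAAGAACAUU"        # 30 nt, SEQ ID NO: 1482


def build_crrna(repeat_len: int, spacer_len: int) -> str:
    """Return repeat + spacer for the requested lengths.

    The repeat is shortened from its 5' end and the spacer from its 3' end,
    mirroring the pattern seen in TABLE 13.
    """
    if not 1 <= repeat_len <= len(FULL_REPEAT):
        raise ValueError("repeat_len out of range")
    if not 1 <= spacer_len <= len(FULL_SPACER):
        raise ValueError("spacer_len out of range")
    repeat = FULL_REPEAT[len(FULL_REPEAT) - repeat_len:]
    spacer = FULL_SPACER[:spacer_len]
    return repeat + spacer


# R3613 in TABLE 13: 20-nt repeat with 20-nt spacer.
print(build_crrna(20, 20))
# The combination reported as optimal in this example: 20-nt repeat, 17-nt spacer.
print(build_crrna(20, 17))
```

Keeping the 3′ end of the repeat and the 5′ end of the spacer fixed preserves the repeat-spacer junction in every variant, so only the outer ends of the crRNA change.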














TABLE 13

Name | Repeat length | Spacer length | Repeat sequence (5′→3′) | Spacer sequence (5′→3′) | crRNA sequence (5′→3′)
R3582 | 36 | 30 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUUGAAGAACAUU (SEQ ID NO: 1482) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGAACAUU (SEQ ID NO: 1499)
R3583 | 36 | 29 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUUGAAGAACAU (SEQ ID NO: 1483) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGAACAU (SEQ ID NO: 1500)
R3584 | 36 | 28 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUUGAAGAACA (SEQ ID NO: 1484) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGAACA (SEQ ID NO: 1501)
R3585 | 36 | 27 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUUGAAGAAC (SEQ ID NO: 1485) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGAAC (SEQ ID NO: 1502)
R3586 | 36 | 26 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUUGAAGAA (SEQ ID NO: 1486) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGAA (SEQ ID NO: 1503)
R3587 | 36 | 25 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUUGAAGA (SEQ ID NO: 1487) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAGA (SEQ ID NO: 1504)
R3588 | 36 | 24 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUUGAAG (SEQ ID NO: 1488) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAAG (SEQ ID NO: 1505)
R3589 | 36 | 23 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUUGAA (SEQ ID NO: 1489) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGAA (SEQ ID NO: 1506)
R3590 | 36 | 22 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUUGA (SEQ ID NO: 1490) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUGA (SEQ ID NO: 1507)
R3591 | 36 | 21 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUUG (SEQ ID NO: 1491) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUUG (SEQ ID NO: 1508)
R3592 | 36 | 20 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1492) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1509)
R3593 | 36 | 19 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACGU (SEQ ID NO: 1493) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGU (SEQ ID NO: 1510)
R3594 | 36 | 18 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACACG (SEQ ID NO: 1494) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACG (SEQ ID NO: 1511)
R3595 | 36 | 17 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACAC (SEQ ID NO: 1495) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACAC (SEQ ID NO: 1512)
R3596 | 36 | 16 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUACA (SEQ ID NO: 1496) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACA (SEQ ID NO: 1513)
R3597 | 36 | 15 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 54) | AGGAAUACAUGGUAC (SEQ ID NO: 1497) | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUAC (SEQ ID NO: 1514)
R3598 | 35 | 20 | UUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1466) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | UUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1515)
R3599 | 34 | 20 | UUCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1467) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | UUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1516)
R3600 | 33 | 20 | UCAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1468) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | UCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1517)
R3601 | 32 | 20 | CAAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1469) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | CAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1518)
R3602 | 31 | 20 | AAGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1470) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | AAGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1519)
R3603 | 30 | 20 | AGACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1471) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | AGACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1520)
R3604 | 29 | 20 | GACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1472) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | GACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1521)
R3605 | 28 | 20 | ACUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1473) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | ACUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1522)
R3606 | 27 | 20 | CUAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1474) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | CUAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1523)
R3607 | 26 | 20 | UAAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1475) | AGGAAUACAUGGUACACGUU (SEQ ID NO: 1498) | UAAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1524)
R3608 | 25 | 20 | AAUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1476) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | AAUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1525)
R3609 | 24 | 20 | AUAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1477) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | AUAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1526)
R3610 | 23 | 20 | UAGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1478) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | UAGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1527)
R3611 | 22 | 20 | AGAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1479) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | AGAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1528)
R3612 | 21 | 20 | GAUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1480) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | GAUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1529)
R3613 | 20 | 20 | AUUGCUCCUUACGAGGAGAC (SEQ ID NO: 1481) | AGGAAUACAUGGUACACGUUAGGAAUACAUGGUACACGUU (SEQ ID NO: 2487) | AUUGCUCCUUACGAGGAGACAGGAAUACAUGGUACACGUU (SEQ ID NO: 1530)









Example 34

Identification of Optimal Guide RNAs for CasΦ Polypeptide-Mediated Genome Editing in Primary Cells


The present example shows identification of the best performing gRNAs that target TRAC, B2M and programmed cell death protein 1 (PD1) in T cells. In this study, CasΦ.12 polypeptides (SEQ ID NO: 107) were incubated with different gRNAs (shown in Table 14) at room temperature for 10 minutes to form RNP complexes. T cells were resuspended at 5×10^5 cells/20 μL in electroporation solution (Lonza) and an Amaxa 4D Nucleofector with pulse code EH115 was used to nucleofect the cells. Immediately after nucleofection, 80 μl pre-warmed culture medium was added to each well. The cells were then left in the cuvette plate for 10 minutes before transfer to the culture plate. After 48 hours, DNA was extracted from half of the cells and PCR was performed to detect the frequency of indels. The rest of the cells were cultured until Day 5, and were then collected for flow cytometry to detect the frequency of TRAC or B2M knockout. FIG. 29A and FIG. 29B show exemplary gRNAs for targeting TRAC. FIG. 29B and FIG. 29C show exemplary gRNAs for targeting B2M. FIG. 29E shows exemplary gRNAs for targeting PD1. Additionally, this example demonstrates that guide RNAs targeting a non-coding region can mediate gene knockout. For example, R3007, R2995, R2992 and R3014 target non-coding regions of the PD1 gene. The screening for gRNAs targeting TRAC is shown in FIG. 29F and for gRNAs targeting B2M is shown in FIG. 29H. Flow cytometry plots of exemplary gRNAs targeting TRAC are shown in FIG. 29G and of exemplary gRNAs targeting B2M in FIG. 29I.











TABLE 14

Name | Target | Spacer sequence (5′→3′)
R3041 | TRAC | UCCCACAGAUAUCCAGAACC (SEQ ID NO: 2470)
R3042 | TRAC | GAGUCUCUCAGCUGGUACAC (SEQ ID NO: 1436)
R3043 | TRAC | AGAGUCUCUCAGCUGGUACA (SEQ ID NO: 2471)
R3061 | TRAC | AAGUCCAUAGACCUCAUGUC (SEQ ID NO: 2472)
R3063 | TRAC | AAGAGCAACAGUGCUGUGGC (SEQ ID NO: 2473)
R3066 | TRAC | GUUGCUCCAGGCCACAGCAC (SEQ ID NO: 2474)
R3068 | TRAC | GCACAUGCAAAGUCAGAUUU (SEQ ID NO: 2475)
R3069 | TRAC | GCAUGUGCAAACGCCUUCAA (SEQ ID NO: 2476)
R3081 | TRAC | CUAAAAGGAAAAACAGACAU (SEQ ID NO: 2477)
R3141 | TRAC | CUCGACCAGCUUGACAUCAC (SEQ ID NO: 2478)
R3088 | B2M | AUAUAAGUGGAGGCGUCGCG (SEQ ID NO: 2479)
R3091 | B2M | GGGCCGAGAUGUCUCGCUCC (SEQ ID NO: 1429)
R3094 | B2M | UGGCCUGGAGGCUAUCCAGC (SEQ ID NO: 2480)
R3119 | B2M | AAGUUGACUUACUGAAGAAU (SEQ ID NO: 2481)
R3132 | B2M | AGCAAGGACUGGUCUUUCUA (SEQ ID NO: 2482)
R3149 | B2M | AGUGGGGGUGAAUUCAGUGU (SEQ ID NO: 2483)
R3150 | B2M | CAGUGGGGGUGAAUUCAGUG (SEQ ID NO: 1434)
R3155 | B2M | GGCUGUGACAAAGUCACAUG (SEQ ID NO: 2484)
R3156 | B2M | GUCACAGCCCAAGAUAGUUA (SEQ ID NO: 2485)
R3157 | B2M | UCACAGCCCAAGAUAGUUAA (SEQ ID NO: 2486)
R2946 | PD1 | UGUGACACGGAAGCGGCAGU (SEQ ID NO: 263)
R2992 | PD1 | GGGGCUGGUUGGAGAUGGCC (SEQ ID NO: 309)
R2995 | PD1 | GAGCAGCCAAGGUGCCCCUG (SEQ ID NO: 312)
R3007 | PD1 | ACACAUGCCCAGGCAGCACC (SEQ ID NO: 324)
R3014 | PD1 | AGGCCCAGCCAGCACUCUGG (SEQ ID NO: 331)









Example 35

RNP and mRNA Delivery of CasΦ Polypeptides


This example illustrates that CasΦ.12 can be delivered to primary cells as mRNA or as an RNP complex. In one study, RNP complexes were formed using CasΦ.12 protein (0, 100, 200 or 400 pmol) (SEQ ID NO: 107) and gRNAs (0, 400 or 800 pmol) targeting B2M or TRAC. RNP complexes were added to T cells. T cells were nucleofected using the Amaxa P3 kit and Amaxa 4D 96-well electroporation system with pulse code EH115. Cells were harvested for flow cytometry to determine the percentage of B2M or TRAC knockout cells, and genomic DNA was extracted to detect the frequency of indel mutations. As shown in FIG. 30A, a distinct population of B2M-negative cells was detected in T cells transfected with CasΦ.12 RNP complex targeting B2M. A distinct population of TRAC-negative cells was detected in T cells transfected with CasΦ.12 RNP complex targeting TRAC, as shown in FIG. 30B. Quantification of the percentage of B2M knockout cells is shown in FIG. 30C and quantification of the percentage of TRAC knockout cells is shown in FIG. 30D. A high frequency of indel mutations was also seen after delivery of RNP complexes. As shown in FIG. 30E, ~55% indel mutations were detected when RNP complexes targeting B2M were formed using 400 pmol protein and 800 pmol guide RNA. A similar frequency of indel mutations was detected when RNP complexes targeting TRAC were formed using the same conditions, as illustrated in FIG. 30F.


In a second study, CasΦ.12 mRNA was delivered to T cells with a gRNA targeting the B2M gene. For nucleofection, T cells were resuspended in BTXpress electroporation medium (5×10^5 cells per well) and mixed with CasΦ.12 mRNA and 500 pmol gRNA. Cells were collected on Day 2 for extraction of genomic DNA, and the frequency of indel mutations was determined. As shown in FIG. 30G, delivery of CasΦ.12 mRNA and gRNA resulted in a high frequency of indel mutations. This was at a comparable level to genome editing with delivery of Cas9 mRNA. Further data from this study are shown in FIG. 30I and FIG. 30J. FIG. 30I shows the frequency of indel mutations and functional knockout, as assessed by flow cytometry, of the B2M gene induced by either CasΦ.12 or Cas9 targeting the same site. FIG. 30J shows the distribution of the size of indel mutations induced by CasΦ.12 or Cas9, as determined by NGS analysis. CasΦ.12 predominantly induced larger deletion mutations whereas Cas9 induced mostly small 1 bp indels. These data further confirm the ability of CasΦ.12 to mediate genome editing at the B2M locus.
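An indel size distribution of the kind described for FIG. 30J could be summarized as in the following Python sketch. It is only an illustration of the bookkeeping: the function name, the 1 bp cutoff for a "small" indel, and the two toy input lists are assumptions and are not data from this study.

```python
from collections import Counter


def summarize_indels(indel_sizes, small_cutoff: int = 1):
    """Summarize signed indel sizes (negative = deletion, positive = insertion).

    Returns the fraction of indels with |size| <= small_cutoff, the fraction
    that are deletions larger than the cutoff, and a size histogram.
    Entries equal to 0 (unedited reads) are ignored.
    """
    sizes = [s for s in indel_sizes if s != 0]
    if not sizes:
        return {"small": 0.0, "large_deletion": 0.0, "histogram": Counter()}
    small = sum(1 for s in sizes if abs(s) <= small_cutoff)
    large_deletion = sum(1 for s in sizes if s < -small_cutoff)
    return {
        "small": small / len(sizes),
        "large_deletion": large_deletion / len(sizes),
        "histogram": Counter(sizes),
    }


# Made-up toy inputs for illustration only (NOT values from FIG. 30J).
toy_mostly_large_deletions = [-8, -12, -5, -9, -1, -15, 2, -7]
toy_mostly_one_bp = [1, -1, 1, -1, 1, -3, 1, -1]

for label, data in [("toy A", toy_mostly_large_deletions),
                    ("toy B", toy_mostly_one_bp)]:
    summary = summarize_indels(data)
    print(label, summary["small"], summary["large_deletion"])
```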


Example 36

gRNA Processing by CasΦ Polypeptides in Mammalian Cells


This example illustrates the ability of CasΦ polypeptides to process gRNA in mammalian cells. In this study, HEK293T cells were transfected with crRNA and expression plasmids encoding CasΦ.12 (SEQ ID NO: 107) using Lipofectamine on day 0. The crRNA had the repeat sequence (the region that binds to CasΦ.12) CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGAC (SEQ ID NO: 54). To determine the nature of the crRNAs expressed in the HEK293T cells, the microRNA species in the HEK293T cells were analyzed by next generation sequencing. After 2 days, miRNA was extracted using the mirVANA kit. RNA was treated with recombinant Shrimp Alkaline Phosphatase (rSAP) to remove all the phosphates from the 5′ and 3′ ends of the RNA. PNK phosphorylation was then performed to add phosphate back to the 5′ ends in preparation for adaptor ligation to the RNA. RNA was then mixed with 3′ SR Adaptor for Illumina, followed by 3′ ligation enzyme mix, and incubated for 1 hour at 25° C. in a thermal cycler. The reverse transcription primer was then hybridized to prevent adaptor-dimer formation. The SR RT primer hybridizes to the excess of 3′ SR Adaptor (that remains free after the 3′ ligation reaction) and transforms the single-stranded DNA adaptor into a double-stranded DNA molecule. Double-stranded DNAs are not substrates for ligation mediated by T4 RNA Ligase 1 and therefore do not ligate to the 5′ SR adaptor. The RNA-ligation mixture from the previous step was mixed with SR RT primer for Illumina and placed in a thermocycler for the following program: 5 minutes at 75° C., 15 minutes at 37° C., 15 minutes at 25° C., hold at 4° C. The RNA-ligation mixture was then incubated with 5′ SR adaptor for 1 hour at 25° C. in a thermal cycler. Finally, RNA was reverse transcribed using ProtoScript II Reverse Transcriptase and amplified by PCR. The sample was then analyzed by next generation sequencing.


As shown in FIG. 31, the major crRNA molecule detected by sequence analysis was 24 nucleotides long (ATAGATTGCTCCTTACGAGGAGAC (SEQ ID NO: 1531)), which is 12 nucleotides shorter than the full length repeat sequence (CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGAC (SEQ ID NO: 54)) that was delivered to the HEK293T cells. This demonstrates that CasΦ.12 can process the repeat region of its crRNA in mammalian cells.
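The relationship between the delivered repeat and the detected species can be checked with a few lines of Python. The sketch below uses only the two sequences given above; the function name is hypothetical, and the interpretation that processing removes nucleotides from the 5′ end follows from the detected species being a 3′-terminal fragment of the full repeat.

```python
FULL_REPEAT = "CTTTCAAGACTAATAGATTGCTCCTTACGAGGAGAC"  # SEQ ID NO: 54
PROCESSED = "ATAGATTGCTCCTTACGAGGAGAC"                # SEQ ID NO: 1531


def trimmed_from_5prime(full: str, processed: str) -> int:
    """Return the number of 5' nucleotides removed from `full`.

    Raises ValueError if `processed` is not a 3'-terminal fragment of `full`.
    """
    if not full.endswith(processed):
        raise ValueError("processed sequence is not a 3' suffix of the full repeat")
    return len(full) - len(processed)


print(len(FULL_REPEAT), len(PROCESSED), trimmed_from_5prime(FULL_REPEAT, PROCESSED))
# prints: 36 24 12
```

The removed 5′ portion is FULL_REPEAT[:12], i.e. CTTTCAAGACTA, so the 24-nucleotide species corresponds to the 24-nucleotide repeat listed as SEQ ID NO: 1477 in TABLE 13 (written there as RNA).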


Example 37

CasΦ Polypeptide Cleavage Generates 5′ Overhangs


This example illustrates different CasΦ polypeptide-induced cleavage patterns. In this study, CasΦ polypeptides (CasΦ.12, CasΦ.45, CasΦ.43, CasΦ.39, CasΦ.37, CasΦ.33, CasΦ.32, CasΦ.30, CasΦ.28, CasΦ.25, CasΦ.24, CasΦ.22, CasΦ.20, CasΦ.18) were complexed with a crRNA to form RNPs. The RNPs were then used in cleavage reactions with plasmid DNA comprising a target sequence and a PAM (GTTG). The cleavage reaction was carried out at 37° C. for 15 minutes. The cleavage products were then analyzed by gel electrophoresis. As shown in FIG. 32A, the majority of CasΦ polypeptides generated a linear product from a plasmid target, whilst some CasΦ polypeptides introduced nicks into the plasmid DNA.



FIG. 32B shows a schematic of the cut sites on the target and non-target strand of a double-stranded target nucleic acid. The nature of the cleavage patterns resulting from the location of the cut sites on the target and non-target strands was investigated by sequence analysis, as shown in FIG. 32C and represented in FIG. 32D. These data show that the cleavage pattern following CasΦ polypeptide-mediated cleavage of target nucleic acid is a staggered cut comprising 5′ overhangs. FIG. 32E shows a table of cut sites and overhangs of the different CasΦ polypeptides. The “#bp overlap” corresponds to the length of the 5′ overhang for each CasΦ polypeptide. For comparison, Cpf1 introduces a staggered double-stranded DNA break with a 4- or 5-nucleotide 5′ overhang (Zetsche et al., 2015, Cell).
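The relationship between the two cut sites and the resulting overhang can be sketched in Python as follows. The coordinate convention is an assumption made for illustration: both cut positions are counted in nucleotides from the PAM along the non-target strand, with the PAM located 5′ of the protospacer on that strand; the function name and the example positions are hypothetical and are not taken from FIG. 32E.

```python
def describe_cut(non_target_cut: int, target_cut: int):
    """Classify a double-strand break from its two cut positions.

    A position n means the strand is cut after the n-th protospacer
    position, counted from the PAM along the non-target strand (assumed
    convention, PAM 5' of the protospacer). If the target strand is cut
    further from the PAM than the non-target strand, both ends carry a
    5' overhang whose length is the distance between the two cut sites.
    """
    offset = target_cut - non_target_cut
    if offset == 0:
        return ("blunt", 0)
    return ("5' overhang" if offset > 0 else "3' overhang", abs(offset))


# Illustrative positions chosen to give a 5-nucleotide overhang, the upper
# size cited above for Cpf1.
print(describe_cut(18, 23))   # ("5' overhang", 5)
print(describe_cut(20, 20))   # ('blunt', 0)
```

Under this convention the distance between the two cut sites is the quantity that the "#bp overlap" column of FIG. 32E is described as reporting.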


Example 38

Multiplex Genome Editing with CasΦ Polypeptides


This example illustrates the ability of CasΦ RNP complexes to knock out multiple genes simultaneously. In this study, gRNAs targeting B2M, TRAC and PDCD1 (provided in Table 15) were incubated with CasΦ.12 (SEQ ID NO: 12) for 10 minutes at room temperature to form B2M-, TRAC-, and PDCD1-targeting RNPs, respectively. The B2M-targeting RNPs, TRAC-targeting RNPs, PDCD1-targeting RNPs and combinations thereof were added to T cells. T cells were resuspended at 5×10^5 cells/20 μL in Nucleofection P3 solution and an Amaxa 4D 96-well electroporation system with pulse code EH115 was used to nucleofect the cells. Immediately after nucleofection, 85 μl pre-warmed culture medium was added to each well. The cells were then left in the cuvette plate for 10 minutes before transfer to the culture plate. On Day 3, genomic DNA was extracted and sent for NGS, and the % indel was measured, with a positive % indel being indicative of % knockout. On Day 5, cells were harvested for flow cytometry and the % knockout was measured with fluorescently labeled antibodies to TRAC and B2M (an antibody to PDCD1 was unavailable). The % indel results are presented in Table 16 and the flow cytometry data are presented in Table 17. Corresponding flow cytometry panels are shown in FIG. 33.











TABLE 15

Description | SEQ ID | Sequence
B2M gRNA (R3132) | 1532 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACAGCAAGGACUGGUCUUUCUA
TRAC gRNA (R3042) | 1432 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGAGUCUCUCAGCUGGUACAC
PDCD1 gRNA (R2925) | 791 | CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACUAGCACCGCCCAGACGACUG



















TABLE 16

Description | RNP Guide ID(s) | Amplicon | % INDEL
TRAC single KO | R3042 | TRAC | 77.6%
B2M single KO | R3132 | B2M | 85.5%
PDCD1 single KO | R2925 | PDCD1 | 44.6%
TRAC, B2M double KO | R3132 & R3042 | TRAC | 58.8%
TRAC, B2M double KO | R3132 & R3042 | B2M | 61.2%
TRAC, B2M, PDCD1 triple KO | R3132, R3042, R2925 | TRAC | 59.2%
TRAC, B2M, PDCD1 triple KO | R3132, R3042, R2925 | B2M | 69.4%
TRAC, B2M, PDCD1 triple KO | R3132, R3042, R2925 | PDCD1 | 42.1%




















TABLE 17

gRNA | B2M+, CD3− | B2M+, CD3+ | B2M−, CD3+ | B2M−, CD3−
TRAC | 94 | 5.91 | 0.00418 | 0.1
B2M | 0.051 | 8.65 | 90.7 | 0.59
TRAC + B2M | 4.2 | 4.89 | 4.01 | 86.9
TRAC + B2M + PDCD1 | 4.74 | 14.1 | 4.33 | 76.8
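The quadrant values of TABLE 17 can be combined into per-marker knockout figures, as in the following Python sketch. Two assumptions are made for illustration: the table entries are percentages of the analyzed cells (each row sums to approximately 100), and loss of CD3 staining is read as the surface readout for TRAC knockout. The function and variable names are hypothetical.

```python
# Quadrant values from TABLE 17, in the column order
# (B2M+ CD3-, B2M+ CD3+, B2M- CD3+, B2M- CD3-).
QUADRANTS = {
    "TRAC": (94, 5.91, 0.00418, 0.1),
    "B2M": (0.051, 8.65, 90.7, 0.59),
    "TRAC + B2M": (4.2, 4.89, 4.01, 86.9),
    "TRAC + B2M + PDCD1": (4.74, 14.1, 4.33, 76.8),
}


def marginal_knockout(b2m_pos_cd3_neg, b2m_pos_cd3_pos,
                      b2m_neg_cd3_pos, b2m_neg_cd3_neg):
    """Return (% B2M-negative, % CD3-negative, % double-negative)."""
    b2m_negative = b2m_neg_cd3_pos + b2m_neg_cd3_neg
    cd3_negative = b2m_pos_cd3_neg + b2m_neg_cd3_neg
    return b2m_negative, cd3_negative, b2m_neg_cd3_neg


for name, quadrants in QUADRANTS.items():
    b2m_neg, cd3_neg, double_neg = marginal_knockout(*quadrants)
    print(f"{name}: B2M- {b2m_neg:.1f}%, CD3- {cd3_neg:.1f}%, "
          f"double-negative {double_neg:.1f}%")
```

Read this way, the double-knockout sample is about 91% negative for each marker individually, with 86.9% of cells negative for both.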









Example 39

Genome Editing with CasΦ Polypeptides Mediates Efficient Editing of PCSK9 in Mouse Hepatoma Cells


The present example shows that CasΦ.12 RNP complexes are highly effective at mediating editing of the PCSK9 gene. In this study, 95 CasΦ gRNAs targeting PCSK9 (sequences shown in Tables E and Q) were incubated with CasΦ.12 (SEQ ID NO: 12) to form RNP complexes. Positive control RNP complexes were also formed using Cas9 and a gRNA. Hepa1-6 mouse hepatoma cells (100,000 cells) were resuspended in SF solution (Lonza) and nucleofected with CasΦ RNPs (250 pmol) or the control Cas9 RNPs (60 pmol) using program CM-137 or CM-148 (Amaxa nucleofector). Cells were collected after 48 hours, genomic DNA was extracted and the frequency of indel mutations was determined using NGS. FIG. 34 shows that CasΦ.12 is a highly effective genome editing tool, with an indel frequency of up to 48% induced by CasΦ.12 RNP complexes. In contrast, the maximum indel frequency induced by Cas9 was only about 22%.


Example 40

Adeno-Associated Virus Encoding CasΦ.12 Facilitates Genome Editing


This example shows that a CasΦ.12 plasmid including both a CasΦ polypeptide sequence and a gRNA sequence, sometimes called an all-in-one vector, can be used to facilitate genome editing. In this study, the crRNAs (sequences shown in Tables E and Q) from the initial RNP screen were chosen and truncations of these crRNAs were generated with repeat lengths of 36, 25, 20, or 19 nucleotides in combination with spacer lengths of 20, 17, or 16 nucleotides. Each crRNA was then cloned into an AAV vector consisting of a U6 promoter to drive crRNA expression, an intron-less EF1alpha short (EFS) promoter driving CasΦ expression, a PolyA signal, and a 1 kb genomic stuffer sequence. Hepa1-6 mouse hepatoma cells were nucleofected with 10 μg of each AAV plasmid. After 72 hours, genomic DNA was extracted and the frequency of indel mutations was determined using NGS. FIG. 35A shows a plasmid map of the adeno-associated virus (AAV) encoding the CasΦ polypeptide sequence and gRNA sequence. FIG. 35D shows the frequency of CasΦ.12-induced indel mutations in Hepa1-6 cells transduced with 10 μg of each AAV plasmid. gRNAs containing repeat sequences of 19, 20, 25 or 36 nucleotides and spacer sequences of 16, 17 or 20 nucleotides were used in this study. In the graph legend, repeat and spacer lengths are indicated as the number of nucleotides in the repeat followed by the number of nucleotides in the spacer, e.g., 20-17 has a repeat length of 20 nucleotides and a spacer length of 17 nucleotides. The frequency of indel mutations is comparable to that of Cas9. FIG. 35E and FIG. 35F show the frequency of CasΦ.12-induced indel mutations with different gRNAs containing repeat and spacer sequences of different lengths (indicated as in FIG. 35F with repeat length followed by spacer length). This study demonstrates that the all-in-one vector method of CasΦ.12-mediated genome editing is robust across different gRNA sequences and with gRNAs of different repeat and spacer lengths.
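The "repeat-spacer" legend convention described above can be generated and parsed with a short Python sketch. The helper names are hypothetical, and the sketch enumerates the full grid of the stated repeat and spacer lengths without implying that every combination was tested for every crRNA.

```python
from itertools import product

REPEAT_LENGTHS = (36, 25, 20, 19)   # nucleotides, as listed in this example
SPACER_LENGTHS = (20, 17, 16)       # nucleotides, as listed in this example


def legend_label(repeat_len: int, spacer_len: int) -> str:
    """Format a label as repeat length, hyphen, spacer length (e.g. '20-17')."""
    return f"{repeat_len}-{spacer_len}"


def parse_label(label: str):
    """Return (repeat_len, spacer_len) from a label such as '20-17'."""
    repeat, spacer = label.split("-")
    return int(repeat), int(spacer)


labels = [legend_label(r, s) for r, s in product(REPEAT_LENGTHS, SPACER_LENGTHS)]
print(len(labels), labels)
print(parse_label("20-17"))
```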


AAV vectors are a leading platform for delivery of gene therapy for treatment of human disease (Wang et al., (2019) Nature Reviews Drug Discovery). One of the limitations of viral vector delivery of CRISPR/Cas9 is the size of Cas9. AAVs are roughly 20 nm, allowing for 4.5 kb of genomic material to be packaged within them. This makes packaging Cas9 and a gRNA (~4.2 kb) with any additional elements, such as multiple gRNAs or a donor polynucleotide for HDR, challenging (Lino et al., (2018), Drug Delivery). In contrast, CasΦ is much smaller, allowing all of the components of the CRISPR system to be packaged in one viral vector.
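The packaging argument is simple arithmetic and can be sketched in Python as follows. The 4.5 kb capacity and the ~4.2 kb size of Cas9 plus a gRNA are the figures cited above; the function name and the 1.0 kb donor template in the second call are hypothetical values used only to show an over-capacity case.

```python
AAV_CAPACITY_KB = 4.5       # packaging capacity cited above
CAS9_PLUS_GRNA_KB = 4.2     # approximate size cited above


def fits_in_aav(component_sizes_kb: dict, capacity_kb: float = AAV_CAPACITY_KB):
    """Return (fits, remaining capacity in kb) for a set of payload components."""
    total = sum(component_sizes_kb.values())
    return total <= capacity_kb, round(capacity_kb - total, 3)


# Cas9 and one gRNA alone leave about 0.3 kb of headroom.
print(fits_in_aav({"Cas9 + gRNA": CAS9_PLUS_GRNA_KB}))
# Adding a hypothetical 1.0 kb donor template exceeds the capacity.
print(fits_in_aav({"Cas9 + gRNA": CAS9_PLUS_GRNA_KB,
                   "donor template (hypothetical size)": 1.0}))
```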


Example 41

Optimization of Lipid Nanoparticle Delivery of CasΦ


This example describes the optimization of lipid nanoparticle (LNP) delivery of CasΦ mRNA and gRNA. In this study, the encapsulation efficiency of LNPs was optimized by testing different amine group to phosphate group (N/P) ratios of LNPs containing CasΦ mRNA and gRNA. An LNP kit from Precision Nanosystems (GenVoy-ILM™) was used to generate LNPs with different N/P ratios. LNPs were then added to HEK293T cells. Genomic DNA was extracted and the frequency of indel mutations was determined using NGS. The gRNA used in this study was R2470 with 2′O-methyl on the first three 5′ and last three 3′ nucleotides and phosphorothioate bonds in between the first three 5′ nucleotides and in between the last two 3′ nucleotides. The sequence of R2470 from 5′ to 3′ is 42256-779_601_SL. The mRNA was generated using T7 messenger mRNA IVT kit. As shown in FIG. 36, indel mutations were detected following the use of a range of N/P ratios.
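The N/P ratio referred to above is the molar ratio of ionizable amine groups in the lipid to phosphate groups in the nucleic acid cargo. A minimal Python sketch of the calculation follows. It assumes one phosphate per nucleotide (an approximation) and one ionizable amine per lipid molecule by default; the function name and all input amounts are hypothetical and are not conditions used in this study.

```python
def n_to_p_ratio(ionizable_lipid_pmol: float, rna_species, amines_per_lipid: int = 1) -> float:
    """Return the amine-to-phosphate (N/P) molar ratio of a formulation.

    rna_species is an iterable of (amount in pmol, length in nucleotides);
    each nucleotide is assumed to contribute one phosphate group.
    """
    phosphate_pmol = sum(amount * length for amount, length in rna_species)
    if phosphate_pmol <= 0:
        raise ValueError("no phosphate groups in the cargo")
    return ionizable_lipid_pmol * amines_per_lipid / phosphate_pmol


# Hypothetical cargo: 1 pmol of a 2,500-nt mRNA plus 50 pmol of a 56-nt gRNA.
cargo = [(1, 2500), (50, 56)]          # 2,500 + 2,800 = 5,300 pmol phosphate
print(n_to_p_ratio(31800, cargo))      # 6.0
```

Scanning a range of N/P values then amounts to holding the cargo fixed and varying the amount of ionizable lipid.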


LNPs are one of the most clinically advanced non-viral delivery systems for gene therapy. LNPs have many properties that make them ideal candidates for delivery of nucleic acids, including ease of manufacture, low cytotoxicity and immunogenicity, high efficiency of nucleic acid encapsulation and cell transfection, multidosing capabilities and flexibility of design (Kulkarni et al., (2018) Nucleic Acid Therapeutics).


Example 42

Genome Editing in Hematopoietic Stem Cells with CasΦ Polypeptides


This example demonstrates CasΦ-mediated genome editing of CD34+ hematopoietic stem cells (HSCs). HSCs are stem cells that differentiate to give rise to blood cells, such as T and B lymphocytes, erythrocytes, monocytes and macrophages. HSCs are important cells for future stem cell therapies as they have the potential to be used to treat genetic blood cell diseases (Morgan et al. (2017), Cell Stem Cell).


In this study, human CD34+ cells were grown in XVIVO10 media (+5% FBS, +1X CC110) for three days. On the third day, the cells were nucleofected using the Lonza P3 kit with either RNP containing CasΦ.12 polypeptides complexed with B2M-targeting guide R3132 (42256-779_601_SL), or a mixture of CasΦ.12 mRNA with B2M-targeting guide. Cells were collected after 3 days, genomic DNA was purified and the frequency of indel mutations at the B2M locus was analyzed by NGS. As shown in FIG. 37, CasΦ.12 is an effective tool for genome editing when delivered to cells either as CasΦ.12 RNP complexes or as CasΦ.12 mRNA.


This example illustrates the utility of CasΦ polypeptides as genome editing tools in stem cells, such as HSCs.


Example 43

Genome Editing in Induced Pluripotent Stem Cells with CasΦ Polypeptides


This example demonstrates CasΦ-mediated genome editing of induced pluripotent stem cells (iPSCs). iPSCs are pluripotent stem cells that are generated from somatic cells. They can propagate indefinitely and give rise to any cell type in the body. These features make iPSCs a powerful tool for researching human disease and provide a promising prospect for cell therapies for a range of medical conditions. iPSCs can be generated in a patient-specific manner and used in autologous transplant, thereby overcoming complications of rejection by the host immune system (Moradi et al. (2019), Stem Cell Research & Therapy).


In this study, high quality WTC-11 iPSCs were harvested as single cells using Accutase treatment for 5 minutes. RNP complexes were formed using CasΦ.12 polypeptides and gRNAs targeting either the B2M locus or a CIITA locus (sequences shown in Table 19). RNP complexes were formed at a 2:1 gRNA:CasΦ.12 ratio (1000 pmol gRNA+500 pmol CasΦ.12) by incubating at room temperature for approximately 15 minutes. WTC-11 iPSCs (200,000 cells) were resuspended in 20 μL of P3 nucleofection solution per reaction and 40 μL of cell suspension was added to each RNP tube. Half of the volume of each RNP/cell suspension mixture was added to the Lonza 96 well shuttle and nucleofection was performed using the program CD118. To recover the transfected cells, 80 μL of warm StemFlex media supplemented with 2 μM of Thiazovivin was added to the wells of the shuttle. The entire volume of the shuttle well was transferred to a 96 well plate previously coated with 0.337 mg/mL Matrigel containing 100 μL of 2 μM of Thiazovivin. Cells were allowed to recover for 24 hours in a 37° C. incubator with humidity control. Cells were confluent 48 hours post-transfection, and were single-cell passaged using Accutase. Genomic DNA was extracted using the KingFisher Tissue and DNA kit. NGS library preparation was performed using in-house protocols and the frequency of indel mutations was quantified using CRISPResso. As shown in FIG. 38, effective genome editing at the B2M and CIITA loci was achieved with CasΦ.12 RNP complexes in iPSCs.


This example demonstrates the utility of CasΦ polypeptides as genome editing tools in iPSCs.












TABLE 19
Name Target Sequence SEQ ID NO
R3132 B2M AUUGCUCCUUACGAGGAGACAGCAAGGACUGGUCUUU 2488
R4504_CasPhi12_S CIITA AUUGCUCCUUACGAGGAGACGGGCUCUGACAGGUAGG 1722
R5406_CasPhi12 CIITA CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGGUCAAUGCUAGGUACUGC 2222
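The three guides in Table 19 share a 20-nucleotide stretch (AUUGCUCCUUACGAGGAGAC), and the nucleotides 3′ of that stretch differ by target. The Python sketch below splits each guide at the shared stretch. Treating the shared stretch as the nuclease-binding repeat and the 3′ remainder as the target-complementary spacer is an assumption made here for illustration; the table itself does not annotate the sequences this way. The names SHARED_REPEAT, GUIDES, and split_guide are hypothetical.

```python
# Assumption: this 20-nt stretch, observed in all three Table 19 guides,
# is treated as the repeat region. The source does not label it as such.
SHARED_REPEAT = "AUUGCUCCUUACGAGGAGAC"

# Guide sequences copied from Table 19.
GUIDES = {
    "R3132": "AUUGCUCCUUACGAGGAGACAGCAAGGACUGGUCUUU",
    "R4504_CasPhi12_S": "AUUGCUCCUUACGAGGAGACGGGCUCUGACAGGUAGG",
    "R5406_CasPhi12": "CUUUCAAGACUAAUAGAUUGCUCCUUACGAGGAGACGGGUCAAUGCUAGGUACUGC",
}


def split_guide(guide, repeat=SHARED_REPEAT):
    """Split a guide into (5' leader, repeat, 3' spacer) at the first repeat match."""
    start = guide.find(repeat)
    if start < 0:
        raise ValueError("repeat not found in guide")
    end = start + len(repeat)
    return guide[:start], guide[start:end], guide[end:]


for name, sequence in GUIDES.items():
    leader, repeat, spacer = split_guide(sequence)
    print(f"{name}: leader={len(leader)} nt, repeat={len(repeat)} nt, "
          f"spacer={len(spacer)} nt ({spacer})")
```

Under this reading, R3132 and R4504_CasPhi12_S carry 17-nucleotide 3′ regions, while R5406_CasPhi12 carries a 16-nucleotide 5′ extension and a 20-nucleotide 3′ region.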









Example 44

Genome Editing with CasΦ Polypeptides Mediates Efficient Editing of CIITA Locus


This example demonstrates CasΦ-mediated genome editing of the CIITA locus. In this study, RNP complexes were formed using CasΦ polypeptides and gRNAs targeting CIITA (sequences shown in Tables D and O). K562 cells were nucleofected with RNP complexes (250 pmol) using Lonza nucleofection protocols. Cells were harvested after 48 hours, genomic DNA was isolated and the frequency of indel mutations was evaluated using NGS analysis (MiSeq, Illumina). As shown in FIG. 39, effective genome editing of the CIITA locus was achieved using CasΦ RNP complexes.
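Indel frequency in an amplicon-sequencing experiment is commonly reported as the percentage of aligned reads that carry an insertion or deletion. The Python sketch below shows only that final arithmetic. The function is hypothetical and is not part of any analysis tool's API, and the read counts in the usage line are invented placeholders, not data from this example.

```python
def indel_frequency(modified_reads, aligned_reads):
    """Return the percentage of aligned reads that carry an indel."""
    if aligned_reads <= 0:
        raise ValueError("aligned_reads must be positive")
    if not 0 <= modified_reads <= aligned_reads:
        raise ValueError("modified_reads must be between 0 and aligned_reads")
    return 100.0 * modified_reads / aligned_reads


# Placeholder counts for illustration only (not measured values).
print(f"{indel_frequency(4200, 10000):.1f}% indels")
```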


While preferred embodiments of the present invention have been shown and described herein, it will be apparent to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

Claims
  • 1-276. (canceled)
  • 277. A system comprising components, wherein the components comprise: a) a polypeptide, or a nucleic acid encoding the polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 85% identical to a sequence selected from SEQ ID NOs: 29 and 32; and b) an engineered guide nucleic acid, or a nucleic acid encoding the engineered guide nucleic acid, wherein the engineered guide nucleic acid comprises a first region comprising a nucleotide sequence that is complementary to a target sequence in a target nucleic acid and a second region that binds to the polypeptide, wherein the first region and the second region are heterologous to each other, and wherein the first region is located 3′ of the second region.
  • 278. The system of claim 277, wherein the polypeptide comprises an amino acid sequence that is at least 90% identical to a sequence selected from SEQ ID NO: 29 and SEQ ID NO: 32.
  • 279. The system of claim 277, wherein the polypeptide comprises an amino acid sequence that is at least 95% identical to a sequence selected from SEQ ID NO: 29 and SEQ ID NO: 32.
  • 280. The system of claim 277, wherein the polypeptide comprises an amino acid sequence selected from SEQ ID NO: 29 and SEQ ID NO: 32.
  • 281. The system of claim 277, wherein the polypeptide comprises an amino acid sequence that is at least 85% identical to the sequence of SEQ ID NO: 29, and wherein the second region of the engineered guide nucleic acid comprises an RNA sequence that is at least 85% identical to an RNA equivalent of SEQ ID NO: 68, wherein all thymines are uracils.
  • 282. The system of claim 277, wherein the polypeptide comprises an amino acid sequence that is at least 85% identical to the sequence of SEQ ID NO: 32, wherein the second region of the engineered guide nucleic acid comprises an RNA sequence that is at least 85% identical to an RNA equivalent of SEQ ID NO: 71, wherein all thymines are uracils.
  • 283. (canceled)
  • 284. The system of claim 277, wherein the engineered guide nucleic acid comprises one or more phosphorothioate (PS) backbone modifications, 2′-fluoro (2′-F) sugar modifications, or 2′-O-Methyl (2′OMe) sugar modifications.
  • 285. The system of claim 277, wherein the polypeptide is a nuclease that is capable of cleaving at least one strand of the target nucleic acid upon contact of a complex comprising the polypeptide and the engineered guide nucleic acid to the target nucleic acid.
  • 286. The system of claim 277, wherein the polypeptide comprises a mutation that reduces an enzymatic activity of the polypeptide relative to a polypeptide that is 100% identical to the sequence selected from SEQ ID NO: 29 and SEQ ID NO: 32, and wherein the polypeptide is fused to a fusion partner.
  • 287. The system of claim 277, wherein the components further comprise at least one of: a) a detection reagent; or b) an amplification reagent.
  • 288. The system of claim 287, wherein: a) the detection reagent is selected from: a reporter nucleic acid, a detection moiety, and an additional polypeptide, or is a combination thereof; and b) the amplification reagent is selected from: a primer, a polymerase, a dNTP, and an rNTP, or is a combination thereof.
  • 289. The system of claim 277, wherein the polypeptide comprises an amino acid sequence that is at least 85% identical to the sequence of SEQ ID NO: 32, and wherein the target sequence is adjacent to a protospacer adjacent motif (PAM) comprising a sequence of 5′-GTTN-3′.
  • 290. The system of claim 277, wherein the nucleic acid encoding the polypeptide is a messenger RNA.
  • 291. The system of claim 290, wherein the nucleic acid is an expression vector, and wherein the expression vector comprises or encodes the engineered guide nucleic acid.
  • 292. The system of claim 277, wherein the nucleic acid encoding the polypeptide is an expression vector.
  • 293. The system of claim 277, further comprising a lipid or lipid nanoparticle.
  • 294. The system of claim 277, wherein the engineered guide nucleic acid comprises at least 10 contiguous nucleotides that are complementary to a eukaryotic sequence.
  • 295. The system of claim 292, wherein the expression vector is an adeno-associated viral vector.
  • 296. The system of claim 277, wherein the polypeptide is fused to a heterologous amino acid sequence.
  • 297. The system of claim 277, wherein the polypeptide, or the nucleic acid encoding the polypeptide, and the engineered guide nucleic acid, or the nucleic acid encoding the engineered guide nucleic acid, are in a single composition.
  • 298. A composition comprising: a) a polypeptide, or a nucleic acid encoding the polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 90% identical to a sequence selected from SEQ ID NOS: 29 and 32; and b) an engineered guide nucleic acid or a nucleic acid encoding the engineered guide nucleic acid, wherein the engineered guide nucleic acid comprises a first region comprising a nucleotide sequence that is complementary to a target sequence in a target nucleic acid and a second region that binds to the polypeptide, wherein the first region and the second region are heterologous to each other, and wherein the first region is located 3′ of the second region.
  • 299. The composition of claim 298, wherein the polypeptide comprises an amino acid sequence that is at least 95% identical to a sequence selected from SEQ ID NO: 29 and SEQ ID NO: 32.
  • 300. The composition of claim 298, wherein the polypeptide comprises an amino acid sequence that is at least 85% identical to the sequence of SEQ ID NO: 29, and wherein the second region of the engineered guide nucleic acid comprises an RNA sequence that is at least 85% identical to an RNA equivalent of SEQ ID NO: 68, wherein all thymines are uracils.
  • 301. The composition of claim 298, wherein the polypeptide comprises an amino acid sequence that is at least 85% identical to the sequence of SEQ ID NO: 32, and wherein the second region of the engineered guide nucleic acid comprises an RNA sequence that is at least 85% identical to an RNA equivalent of SEQ ID NO: 71, wherein all thymines are uracils.
  • 302. The composition of claim 298, wherein the polypeptide is fused to at least one nuclear localization signal.
  • 303. The composition of claim 298, wherein the composition further comprises a fusion partner fused to the polypeptide or a nucleic acid encoding the fusion partner fused to the polypeptide.
  • 304. The composition of claim 298, wherein the polypeptide is a nuclease that is capable of cleaving at least one strand of the target nucleic acid upon contact of the composition to the target nucleic acid.
  • 305. The composition of claim 298, wherein the polypeptide comprises a RuvC domain that is capable of cleaving the target nucleic acid.
  • 306. The composition of claim 298, wherein the composition further comprises a donor nucleic acid.
CROSS-REFERENCE

The present application is a continuation of International Patent Application No. PCT/US2021/035781, filed Jun. 3, 2021, which claims priority to and benefit from U.S. Provisional Application No. 63/034,346, filed on Jun. 3, 2020, U.S. Provisional Application No. 63/037,535, filed on Jun. 10, 2020, U.S. Provisional Application No. 63/040,998, filed on Jun. 18, 2020, U.S. Provisional Application No. 63/092,481, filed on Oct. 15, 2020, U.S. Provisional Application No. 63/116,083, filed on Nov. 19, 2020, U.S. Provisional Application No. 63/124,676, filed on Dec. 11, 2020, U.S. Provisional Application No. 63/156,883, filed on Mar. 4, 2021, and U.S. Provisional Application No. 63/178,472, filed on Apr. 22, 2021, the entire contents of each of which are herein incorporated by reference.

Provisional Applications (8)
Number Date Country
63034346 Jun 2020 US
63037535 Jun 2020 US
63040998 Jun 2020 US
63092481 Oct 2020 US
63116083 Nov 2020 US
63124676 Dec 2020 US
63156883 Mar 2021 US
63178472 Apr 2021 US
Continuations (1)
Number Date Country
Parent PCT/US2021/035781 Jun 2021 US
Child 17819137 US