COMPOSITIONS AND METHODS FOR THE MODULATION OF ADAPTIVE IMMUNITY

FIELD OF THE DISCLOSURE

The disclosure is directed to molecular biology and, more specifically, to compositions and methods for modifying expression and activity of RNA molecules involved in an adaptive immune response.

INCORPORATION OF SEQUENCE LISTING

The contents of the text file named “LOCN_003_001 US_SeqList_ST25”, which was created on Jun. 6, 2019 and is 2.93 MB in size, are hereby incorporated by reference in their entirety.

BACKGROUND

There has been a long-felt but unmet need in the art for simultaneously providing a gene therapy and suppressing the adaptive immune response that may arise when the gene therapy is delivered by, for example, a viral vector. The disclosure provides compositions and methods for specifically targeting RNA molecules in a sequence-specific manner that provides a gene therapy in vivo while masking the modified cells from the immune system of a subject, thereby preventing an adaptive immune response to the modified cell.

SUMMARY

The disclosure provides a composition comprising a nucleic acid sequence comprising a guide RNA (gRNA) sequence that specifically binds a target RNA sequence, wherein the target RNA sequence encodes a protein component of an adaptive immune response, wherein the gRNA sequence comprises a spacer sequence comprising a portion of a nucleic acid sequence encoding the protein component, and wherein the protein component is selected from the group consisting of Beta-2-microglobulin (β2M), Human Leukocyte Antigen A (HLA-A), Human Leukocyte Antigen B (HLA-B), Human Leukocyte Antigen C (HLA-C), Cluster of Differentiation 28 (CD28), Cluster of Differentiation 80 (CD80), Cluster of Differentiation 86 (CD86), Inducible T-cell Costimulator (ICOS), ICOS Ligand (ICOSLG), OX40L, Interleukin 12 (IL12), and CC Chemokine Receptor 7 (CCR7).

The disclosure also provides a composition comprising (a) a first sequence comprising a guide RNA (gRNA) that specifically binds a target sequence within an RNA molecule, wherein the target sequence comprises a sequence encoding a component of an adaptive immune response and (b) a sequence encoding a fusion protein, the sequence comprising a sequence encoding a first RNA-binding polypeptide and a sequence encoding a second RNA-binding polypeptide, wherein neither the first RNA-binding polypeptide nor the second RNA-binding polypeptide comprises a significant DNA-nuclease activity, wherein the first RNA-binding polypeptide and the second RNA-binding polypeptide are not identical, and wherein the second RNA-binding polypeptide comprises an RNA-nuclease activity.

The disclosure provides a composition comprising: (a) a first sequence comprising a guide RNA (gRNA) that specifically binds a first target sequence within a first RNA molecule, wherein the first target sequence comprises a sequence encoding a component of an adaptive immune response and (b) a second sequence comprising a second guide RNA (gRNA) that specifically binds a second target sequence within a second RNA molecule and (c) a sequence encoding a fusion protein, the sequence comprising a sequence encoding a first RNA-binding polypeptide and a sequence encoding a second RNA-binding polypeptide, wherein neither the first RNA-binding polypeptide nor the second RNA-binding polypeptide comprises a significant DNA-nuclease activity, wherein the first RNA-binding polypeptide and the second RNA-binding polypeptide are not identical, and wherein the second RNA-binding polypeptide comprises an RNA-nuclease activity.

In some embodiments of the compositions of the disclosure, including those wherein a gRNA sequence comprises a promoter capable of expressing the gRNA in a eukaryotic cell, the eukaryotic cell is an animal cell. In some embodiments, the animal cell is a mammalian cell. In some embodiments, the animal cell is a human cell.

In some embodiments of the compositions of the disclosure, including those wherein a gRNA sequence comprises a promoter capable of expressing the gRNA in a eukaryotic cell, the gRNA sequence comprises a sequence isolated or derived from a promoter capable of driving expression of an RNA polymerase. In some embodiments, the promoter sequence is isolated or derived from a U6 promoter.

In some embodiments of the compositions of the disclosure, including those wherein a gRNA sequence comprises a promoter capable of expressing the gRNA in a eukaryotic cell, the promoter comprises a sequence isolated or derived from a promoter capable of driving expression of a transfer RNA (tRNA). In some embodiments, the promoter sequence is isolated or derived from an alanine tRNA promoter, an arginine tRNA promoter, an asparagine tRNA promoter, an aspartic acid tRNA promoter, a cysteine tRNA promoter, a glutamine tRNA promoter, a glutamic acid tRNA promoter, a glycine tRNA promoter, a histidine tRNA promoter, an isoleucine tRNA promoter, a leucine tRNA promoter, a lysine tRNA promoter, a methionine tRNA promoter, a phenylalanine tRNA promoter, a proline tRNA promoter, a serine tRNA promoter, a threonine tRNA promoter, a tryptophan tRNA promoter, a tyrosine tRNA promoter, or a valine tRNA promoter. In some embodiments, the promoter sequence is isolated or derived from a valine tRNA promoter.

(SEQ ID NO: 88)

MSRSVALAVL ALLSLSGLEA IQRTPKIQVY SRHPADIEVD

LLKNGERIEK VEHSDLSFSK DWSFYLLYYT EFTPTEKDEY

ACRVNHVTLS QPKIVKWDRD M.

(SEQ ID NO: 12)

GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUC

CGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU

or

(SEQ ID NO: 13)

GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAAC

UUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU.

In some embodiments of the compositions of the disclosure, the gRNA does not bind or does not selectively bind to a second sequence within the RNA molecule.

In some embodiments of the compositions of the disclosure, an RNA genome or an RNA transcriptome comprises the RNA molecule.

In some embodiments of the compositions of the disclosure, including those wherein the composition comprises a first sequence comprising a first guide RNA (gRNA) that specifically binds a first target sequence within a first RNA molecule, wherein the first target sequence comprises a sequence encoding a component of an adaptive immune response and a second sequence comprising a second guide RNA (gRNA) that specifically binds a second target sequence within a second RNA molecule, the first RNA binding protein comprises a CRISPR-Cas protein. In some embodiments, the CRISPR-Cas protein is a Type II CRISPR-Cas protein. In some embodiments, the first RNA binding protein comprises a Cas9 polypeptide or an RNA-binding portion thereof. In some embodiments, the CRISPR-Cas protein is a Type V CRISPR-Cas protein. In some embodiments, the first RNA binding protein comprises a Cpf1 polypeptide or an RNA-binding portion thereof. In some embodiments, the CRISPR-Cas protein is a Type VI CRISPR-Cas protein. In some embodiments, the first RNA binding protein comprises a Cas13 polypeptide or an RNA-binding portion thereof. In some embodiments, the CRISPR-Cas protein comprises a native RNA nuclease activity. In some embodiments, the native RNA nuclease activity is reduced or inhibited. In some embodiments, the native RNA nuclease activity is increased or induced. In some embodiments, the CRISPR-Cas protein comprises a native DNA nuclease activity, wherein the native DNA nuclease activity is inhibited. In some embodiments, the CRISPR-Cas protein comprises a mutation. In some embodiments, a nuclease domain of the CRISPR-Cas protein comprises the mutation. In some embodiments, the mutation occurs in a nucleic acid encoding the CRISPR-Cas protein. In some embodiments, the mutation occurs in an amino acid sequence of the CRISPR-Cas protein. In some embodiments, the mutation comprises a substitution, an insertion, a deletion, a frameshift, an inversion, or a transposition. In some embodiments, the mutation comprises a deletion of a nuclease domain, a binding site within the nuclease domain, an active site within the nuclease domain, or at least one essential amino acid residue within the nuclease domain.

In some embodiments of the compositions of the disclosure, including those wherein the composition comprises a first sequence comprising a first guide RNA (gRNA) that specifically binds a first target sequence within a first RNA molecule, wherein the first target sequence comprises a sequence encoding a component of an adaptive immune response and a second sequence comprising a second guide RNA (gRNA) that specifically binds a second target sequence within a second RNA molecule, the second RNA binding protein comprises or consists of a nuclease domain. In some embodiments, the second RNA binding protein comprises or consists of an RNAse. In some embodiments, the second RNA binding protein comprises or consists of an RNAse1. In some embodiments, the RNAse1 protein comprises or consists of SEQ ID NO: 20. In some embodiments, the second RNA binding protein comprises or consists of an RNAse4. In some embodiments, the RNAse4 protein comprises or consists of SEQ ID NO: 21. In some embodiments, the second RNA binding protein comprises or consists of an RNAse6. In some embodiments, the RNAse6 protein comprises or consists of SEQ ID NO: 22. In some embodiments, the second RNA binding protein comprises or consists of an RNAse7. In some embodiments, the RNAse7 protein comprises or consists of SEQ ID NO: 23. In some embodiments, the second RNA binding protein comprises or consists of an RNAse8. In some embodiments, the RNAse8 protein comprises or consists of SEQ ID NO: 24. In some embodiments, the second RNA binding protein comprises or consists of an RNAse2. In some embodiments, the RNAse2 comprises or consists of SEQ ID NO: 25. In some embodiments, the second RNA binding protein comprises or consists of an RNAse6PL. In some embodiments, the RNAse6PL protein comprises or consists of SEQ ID NO: 26. In some embodiments, the second RNA binding protein comprises or consists of an RNAseL. In some embodiments, the RNAseL protein comprises or consists of SEQ ID NO: 27. 
In some embodiments, the second RNA binding protein comprises or consists of an RNAseT2. In some embodiments, the RNAseT2 protein comprises or consists of SEQ ID NO: 28. In some embodiments, the second RNA binding protein comprises or consists of an RNAse11. In some embodiments, the RNAse11 protein comprises or consists of SEQ ID NO: 29. In some embodiments, the second RNA binding protein comprises or consists of an RNAseT2-like. In some embodiments, the RNAseT2-like protein comprises or consists of SEQ ID NO: 30.

(SEQ ID NO: 205)

1
MRIGKSSGWL NESVSLEYEH VSPPTRPRDT RRRPRAAGDG GLAHLHRRLA VGYAEDTPRT

61
EARSPAPRRP LPVAPASAPP APSLVPEPPM PVSLPAVSSP RFSAGSSAAI TDPFPSLPPT

121
PVLYAMAREL EALSDATWQP AVPLPAEPPT DARRGNTVFD EASASSPVIA SACPQAFASP

181
PRAPRSARAR RARTGGDAWP APTFLSRPSS SRIGRDVFGK LVALGYSREQ IRKLKQESLS

241
EIAKYHTTLT GQGFTHADIC RISRRRQSLR VVARNYPELA AALPELTRAH IVDIARQRSG

301
DLALQALLPV ATALTAAPLR LSASQIATVA QYGERPAIQA LYRLRRKLTR APLHLTPQQV

361
VAIASNTGGK RALEAVCVQL PVLRAAPYRL STEQVVAIAS NKGGKQALEA VKAHLLDLLG

421
APYVLDTEQV VAIASHNGGK QALEAVKADL LDLRGAPYAL STEQVVAIAS HNGGKQALEA

481
VKADLLELRG APYALSTEQV VAIASHNGGK QALEAVKAHL LDLRGVPYAL STEQVVAIAS

541
HNGGKQALEA VKAQLLDLRG APYALSTAQV VAIASNGGGK QALEGIGEQL LKLRTAPYGL

601
STEQVVAIAS HDGGKQALEA VGAQLVALRA APYALSTEQV VAIASNKGGK QALEAVKAQL

661
LELRGAPYAL STAQVVAIAS HDGGNQALEA VGTQLVALRA APYALSTEQV VAIASHDGGK

721
QALEAVGAQL VALRAAPYAL NTEQVVAIAS SHGGKQALEA VRALFPDLRA APYALSTAQL

781
VAIASNPGGK QALEAVRALF RELRAAPYAL STEQVVAIAS NHGGKQALEA VRALFRGLRA

841
APYGLSTAQV VAIASSNGGK QALEAVWALL PVLRATPYDL NTAQIVAIAS HDGGKPALEA

901
VWAKLPVLRG APYALSTAQV VAIACISGQQ ALEAIEAHMP TLRQASHSLS PERVAAIACI

961
GGRSAVEAVR QGLPVKAIRR IRREKAPVAG PPPASLGPTP QELVAVLHFF RAHQQPRQAF

1021
VDALAAFQAT RPALLRLLSS VGVTEIEALG GTIPDATERW QRLLGRLGFR PATGAAAPSP

1081
DSLQGFAQSL ERTLGSPGMA GQSACSPHRK RPAETAIAPR SIRRSPNNAG QPSEPWPDQL

1141
AWLQRRKRTA RSHIRADSAA SVPANLHLGT RAQFTPDRLR AEPGPIMQAH TSPASVSFGS

1201
HVAFEPGLPD PGTPTSADLA SFEAEPFGVG PLDFHLDWLL QILET.

In some embodiments, the TALEN polypeptide comprises or consists of:

(SEQ ID NO: 206)

1
mdpirsrtps parellpgpq pdrvqptadr ggappaggpl dglparrtms rtrlpsppap

61
spafsagsfs dllrqfdpsl ldtslldsmp avgtphtaaa paecdevqsg lraaddpppt

121
vrvavtaarp prakpaprrr aaqpsdaspa aqvdlrtlgy sqqqqekikp kvgstvaqhh

181
ealvghgfth ahivalsrhp aalgtvavky qdmiaalpea thedivgvgk qwsgaralea

241
lltvagelrg pplqldtgql vkiakrggvt aveavhasrn altgaplnlt paqvvaiasn

301
nggkqaletv qrllpvlcqa hgltpaqvva iashdggkqa letmqrllpv lcqahglppd

361
qvvaiasnig gkqaletvqr llpvlcqahg ltpdqvvaia shgggkqale tvqrllpvlc

421
qahgltpdqv vaiashdggk qaletvqrll pvlcqahglt pdqvvaiasn gggkqaletv

481
qrllpvlcqa hgltpdqvva iasnggkqal etvqrllpvl cqahgltpdq vvaiashdgg

541
kqaletvqrl lpvlcqthgl tpaqvvaias hdggkqalet vqqllpvlcq ahgltpdqvv

601
aiasniggkq alatvqrllp vlcqahgltp dqvvaiasng ggkqaletvq rllpvlcqah

661
gltpdqvvai asngggkqal etvqrllpvl cqahgltqvq vvaiasnigg kqaletvqrl

721
lpvlcqahgl tpaqvvaias hdggkqalet vqrllpvlcq ahgltpdqvv aiasngggkq

781
aletvqrllp vlcqahgltq eqvvaiasnn ggkqaletvq rllpvlcqah gltpdqvvai

841
asngggkqal etvqrllpvl cqahgltpaq vvaiasnigg kqaletvqrl lpvlcqdhgl

901
tlaqvvaias niggkqalet vqrllpvlcq ahgltqdqvv aiasniggkq aletvqrllp

961
vlcqdhgltp dqvvaiasni ggkqaletvq rllpvlcqdh gltldqvvai asnggkqale

1021
tvqrllpvlc qdhgltpdqv vaiasnsggk qaletvqrll pvlcqdhglt pnqvvaiasn

1081
ggkqalesiv aqlsrpdpal aaltndhlva laclggrpam davkkglpha pelirrvnrr

1141
igertshrva dyaqvvrvle ffqchshpay afdeamtqfg msrnglvqlf rrvgvtelea

1201
rggtlppasq rwdrilqasg mkrakpspts aqtpdqaslh afadslerdl dapspmhegd

1261
qtgassrkrs rsdravtgps aqhsfevrvp eqrdalhlpl swrvkrprtr iggglpdpgt

1321
piaadlaass tvmweqdaap fagaaddfpa fneeelawlm ellpqsgsvg gti.

(SEQ ID NO: 207)

1
MSRPRFNPRG DFPLQRPRAP NPSGMRPPGP FMRPGSMGLP RFYPAGRARG IPHRFAGHES

61
YQNMGPQRMN VQVTQHRTDP RLTKEKLDFH EAQQKKGKPH GSRWDDEPHI SASVAVKQSS

121
VTQVTEQSPK VQSRYTKESA SSILASFGLS NEDLEELSRY PDEQLTPENM PLILRDIRMR

181
KMGRRLPNLP SQSRNKETLG SEAVSSNVID YGHASKYGYT EDPLEVRIYD PEIPTDEVEN

241
EFQSQQNISA SVPNPNVICN SMFPVEDVFR QMDFPGESSN NRSFFSVESG TKMSGLHISG

301
GQSVLEPIKS VNQSINQTVS QTMSQSLIPP SMNQQPFSSE LISSVSQQER IPHEPVINSS

361
NVHVGSRGSK KNYQSQADIP IRSPFGIVKA SWLPKFSHAD AQKMKRLPTP SMMNDYYAAS

421
PRIFPHLCSL CNVECSHLKD WIQHQNTSTH IESCRQLRQQ YPDWNPEILP SRRNEGNRKE

481
NETPRRRSHS PSPRRSRRSS SSHRFRRSRS PMHYMYRPRS RSPRICHRFI SRYRSRSRSR

541
SPYRIRNPFR GSPKCFRSVS PERMSRRSVR SSDRKKALED VVQRSGHGTE FNKQKHLEAA

601
DKGHSPAQKP KTSSGTKPSV KPTSATKSDS NLGGHSIRCK SKNLEDDTLS ECKQVSDKAV

661
SLQRKLRKEQ SLHYGSVLLI TELPEDGCTE EDVRKLFQPF GKVNDVLIVP YRKEAYLEME

721
FKEAITAIMK YIETTPLTIK GKSVKICVPG KKKAQNKEVK KKTLESKKVS ASTLKRDADA

781
SKAVEIVTST SAAKTGQAKA SVAKVNKSTG KSASSVKSVV TVAVKGNKAS IKTAKSGGKK

841
SLEAKKTGNV KNKDSNKPVT IPENSEIKTS IEVKATENCA KEAISDAALE ATENEPLNKE

901
TEEMCVMLVS NLPNKGYSVE EVYDLAKPFG GLKDILILSS HKKAYIEINR KAAESMVKFY

961
TCFPVLMDGN QLSISMAPEN MNIKDEEAIF ITLVKENDPE ANIDTIYDRF VHLDNLPEDG

1021
LQCVLCVGLQ FGKVDHHVFI SNRNKAILQL DSPESAQSMY SFLKQNPQNI GDHMLTCSLS

1081
PKIDLPEVQI EHDPELEKES PGLKNSPIDE SEVQTATDSP SVKPNELEEE STPSIQTETL

1141
VQQEEPCEEE AEKATCDSDF AVETLELETQ GEEVKEEIPL VASASVSIEQ FTENAEECAL

1201
NQQMFNSDLE KKGAEIINPK TALLPSDSVF AEERNLKGIL EESPSEAEDF ISGITQTMVE

1261
AVAEVEKNET VSEILPSTCI VTLVPGIPTG DEKTVDKKNI SEKKGNMDEK EEKEFNTKET

1321
RMDLQIGTEK AEKNEGRMDA EKVEKMAAMK EKPAENTLFK AYPNKGVGQA NKPDETSKTS

1381
ILAVSDVSSS KPSIKAVIVS SPKAKATVSK TENQKSFPKS VPRDQINAEK KLSAKEFGLL

1441
KPTSARSGLA ESSSKFKPTQ SSLTRGGSGR ISALQGKLSK LDYRDITKQS QETEARPSIM

1501
KRDDSNNKTL AEQNTKNPKS TTGRSSKSKE EPLFPFNLDE FVTVDEVIEE VNPSQAKQNP

1561
LKGKRKETLK NVPFSELNLK KKKGKTSTPR GVEGELSFVT LDEIGEEEDA AAHLAQALVT

1621
VDEVIDEEEL NMEEMVKNSN SLFTLDELID QDDCISHSEP KDVTVLSVAE EQDLLKQERL

1681
VTVDEIGEVE ELPLNESADI TFATLNTKGN EGDTVRDSIG FISSQVPEDP STLVTVDEIQ

1741
DDSSDLHLVT LDEVTEEDED SLADFNNLKE ELNFVTVDEV GEEEDGDNDL KVELAQSKND

1801
HPTDKKGNRK KRAVDTKKTK LESLSQVGPV NENVMEEDLK TMIERHLTAK TPTKRVRIGK

1861
TLPSEKAVVT EPAKGEEAFQ MSEVDEESGL KDSEPERKRK KTEDSSSGKS VASDVPEELD

1921
FLVPKAGFFC PICSLFYSGE KAMTNHCKST RHKQNTEKFM AKQRKEKEQN EAEERSSR.

In some embodiments of the compositions of the disclosure, the composition further comprises (a) a sequence comprising a gRNA that specifically binds within an RNA molecule and (b) a sequence encoding a nuclease. In some embodiments, the sequence encoding a nuclease comprises a sequence isolated or derived from a CRISPR/Cas protein. In some embodiments, the CRISPR/Cas protein is isolated or derived from any one of a type I, a type IA, a type IB, a type IC, a type ID, a type IE, a type IF, a type IU, a type III, a type IIIA, a type IIIB, a type IIIC, a type IIID, a type IV, a type IVA, a type IVB, a type II, a type IIA, a type IIB, a type IIC, a type V, or a type VI CRISPR/Cas protein. In some embodiments, the sequence encoding a nuclease comprises a sequence isolated or derived from a TALEN or a nuclease domain thereof. In some embodiments, the sequence encoding a nuclease comprises a sequence isolated or derived from a zinc finger nuclease or a nuclease domain thereof. In some embodiments, the target sequence comprises a sequence encoding a component of an adaptive immune response.

The disclosure provides a vector comprising a composition of the disclosure. In some embodiments, the vector is a viral vector. In some embodiments, the vector comprises a sequence isolated or derived from a lentivirus, an adenovirus, an adeno-associated virus (AAV) vector, or a retrovirus. In some embodiments, the vector is replication incompetent.

The disclosure provides a vector comprising a composition of the disclosure. In some embodiments, the vector is a viral vector. In some embodiments, the vector comprises a sequence isolated or derived from an adeno-associated virus (AAV). In some embodiments, the adeno-associated virus (AAV) is an isolated AAV. In some embodiments, the adeno-associated virus (AAV) is a self-complementary adeno-associated virus (scAAV). In some embodiments, the adeno-associated virus (AAV) is a recombinant adeno-associated virus (rAAV). In some embodiments, the adeno-associated virus (AAV) comprises a sequence isolated or derived from an AAV of serotype AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, or AAV12. In some embodiments, the adeno-associated virus (AAV) comprises a sequence isolated or derived from an AAV of serotype AAV9. In some embodiments, the adeno-associated virus (AAV) comprises a sequence isolated or derived from Anc80.

The disclosure provides a vector comprising a composition of the disclosure. In some embodiments, the vector is a viral vector. In some embodiments, the vector is a retrovirus.

The disclosure provides a vector comprising a composition of the disclosure. In some embodiments, the vector is a viral vector. In some embodiments, the vector is a lentivirus.

The disclosure provides a vector comprising a composition of the disclosure. In some embodiments, the vector is a non-viral vector. In some embodiments, the non-viral vector comprises a nanoparticle, a micelle, a liposome or lipoplex, a polymersome, a polyplex or a dendrimer.

The disclosure provides a composition comprising a vector of the disclosure.

The disclosure provides a cell comprising a vector of the disclosure.

The disclosure provides a cell comprising a composition of the disclosure.

In some embodiments of cells of the disclosure, the cell is a mammalian cell. In some embodiments, the cell is a human cell.

In some embodiments of cells of the disclosure, the cell is an immune cell. In some embodiments, the immune cell is a T lymphocyte (T-cell). In some embodiments, the T-cell is an effector T-cell, a helper T-cell, a memory T-cell, a regulatory T-cell, a natural killer T-cell, a mucosal-associated invariant T-cell, or a gamma delta T-cell.

In some embodiments of cells of the disclosure, the cell is an immune cell. In some embodiments, the immune cell is an antigen-presenting cell. In some embodiments, the antigen-presenting cell is a dendritic cell, a macrophage, or a B cell. In some embodiments, the antigen-presenting cell is a somatic cell.

In some embodiments of cells of the disclosure, the cell is an immune cell. In some embodiments, the cell is a healthy cell. In some embodiments, the cell is not a healthy cell. In some embodiments, the cell is isolated or derived from a subject having a disease or disorder.

The disclosure provides a composition comprising a cell of the disclosure.

The disclosure provides a composition comprising a plurality of cells of the disclosure.

The disclosure provides a method of masking a cell from an adaptive immune response comprising contacting a composition of the disclosure to the cell to produce a modified cell, wherein the composition modifies a level of expression of an RNA molecule of the modified cell and wherein the RNA molecule encodes a component of an adaptive immune response. In some embodiments, the cell is in vivo, in vitro, ex vivo or in situ. In some embodiments, the cell is in vitro or ex vivo. In some embodiments, a plurality of cells comprises the cell. In some embodiments, each cell of the plurality of cells contacts the composition, thereby producing a plurality of modified cells. In some embodiments, the method further comprises administering the modified cell to a subject. In some embodiments, the method further comprises administering the plurality of modified cells to a subject. In some embodiments, the cell is autologous. In some embodiments, the cell is allogeneic. In some embodiments, the plurality of modified cells is autologous. In some embodiments, the plurality of modified cells is allogeneic. In some embodiments, the component of an adaptive immune response comprises or consists of a component of a type I major histocompatibility complex (MHC I), a type II major histocompatibility complex (MHC II), a T-cell receptor (TCR), a costimulatory molecule or a combination thereof. In some embodiments, the MHC I component comprises an α1 chain, an α2 chain, an α3 chain, or a β2M protein. In some embodiments, the component of an adaptive immune response comprises or consists of an MHC I β2M protein. In some embodiments, the MHC II component comprises an α1 chain, an α2 chain, a β1 chain, or a β2 chain. In some embodiments, the TCR component comprises an α-chain and a β-chain.
In some embodiments, the costimulatory molecule comprises a Cluster of Differentiation 28 (CD28), a Cluster of Differentiation 80 (CD80), a Cluster of Differentiation 86 (CD86), an Inducible T-cell COStimulator (ICOS), or an ICOS Ligand (ICOSLG) protein. In some embodiments, a protein component of an adaptive immune response is, without limitation, Beta-2-microglobulin (β2M), Human Leukocyte Antigen A (HLA-A), Human Leukocyte Antigen B (HLA-B), Human Leukocyte Antigen C (HLA-C), Cluster of Differentiation 28 (CD28), Cluster of Differentiation 80 (CD80), Cluster of Differentiation 86 (CD86), Inducible T-cell Costimulator (ICOS), ICOS Ligand (ICOSLG), OX40L, Interleukin 12 (IL12), or CC Chemokine Receptor 7 (CCR7).

The disclosure provides a method of preventing or reducing an adaptive immune response in a subject comprising administering a therapeutically effective amount of a composition of the disclosure to the subject, wherein the composition contacts at least one cell in the subject producing a modified cell, wherein the composition modifies a level of expression of an RNA molecule of the modified cell and wherein the RNA molecule encodes a component of an adaptive immune response.

The disclosure provides a method of treating a disease or disorder in a subject comprising administering a therapeutically effective amount of a composition of the disclosure to the subject, wherein the composition contacts at least one cell in the subject producing a modified cell, wherein the composition modifies a level of expression of an RNA molecule of the modified cell and wherein the composition prevents or reduces an adaptive immune response to the modified cell.

In some embodiments of the methods of the disclosure, the component of an adaptive immune response comprises or consists of a component of a type I major histocompatibility complex (MHC I), a type II major histocompatibility complex (MHC II), a T-cell receptor (TCR), a costimulatory molecule or a combination thereof. In some embodiments, the MHC I component comprises an α1 chain, an α2 chain, an α3 chain, or a β2M protein. In some embodiments, the component of an adaptive immune response comprises or consists of an MHC I β2M protein. In some embodiments, the MHC II component comprises an α1 chain, an α2 chain, a β1 chain, or a β2 chain. In some embodiments, the TCR component comprises an α-chain and a β-chain. In some embodiments, the costimulatory molecule comprises a Cluster of Differentiation 28 (CD28), a Cluster of Differentiation 80 (CD80), a Cluster of Differentiation 86 (CD86), an Inducible T-cell COStimulator (ICOS), or an ICOS Ligand (ICOSLG) protein.

In some embodiments of the methods of treating a disease or disorder of the disclosure, the disease or disorder is a genetic disease or disorder. In some embodiments, the disease or disorder is a single gene genetic disease or disorder. In some embodiments, the disease or disorder results from microsatellite instability. In some embodiments, the microsatellite instability occurs in a DNA sequence comprising at least 1, 2, 3, 4, 5 or 6 repeated motifs. In some embodiments, an RNA molecule comprises a transcript of the DNA sequence, and the composition binds to a target sequence of the RNA molecule comprising at least 1, 2, 3, 4, 5, or 6 repeated motifs.
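By way of non-limiting illustration only, the following Python sketch shows one way a candidate target sequence might be screened for a run of repeated motifs. The function name `max_tandem_repeats`, the example transcript fragment, and the choice of a CAG motif are hypothetical and are introduced solely for this example; the disclosure does not prescribe any particular method for identifying repeated motifs.

```python
import re


def max_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`.

    Hypothetical helper for illustration; only exact, uninterrupted repeats
    are counted.
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    # Match one or more consecutive copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = (m.group(0) for m in pattern.finditer(sequence))
    # Each run's length divided by the motif length gives its copy number.
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical transcript fragment: a GAG prefix followed by six CAG motifs.
fragment = "GAG" + "CAG" * 6
repeat_count = max_tandem_repeats(fragment, "CAG")
meets_threshold = repeat_count >= 6  # "at least 6 repeated motifs"
```

In this sketch, a fragment is treated as satisfying an "at least N repeated motifs" condition when the longest uninterrupted run contains N or more copies of the motif; interrupted or imperfect repeats would require a different, more permissive approach.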

In some embodiments of the methods of the disclosure, the composition is administered systemically. In some embodiments, the composition is administered intravenously. In some embodiments, the composition is administered by an injection or an infusion.

In some embodiments of the methods of the disclosure, the composition is administered locally. In some embodiments, the composition is administered by an intraosseous, intraocular, intracerebral, or intraspinal route. In some embodiments, the composition is administered by an injection or an infusion.

In some embodiments of the methods of the disclosure, a therapeutically effective amount of the composition is a single dose.

In some embodiments of the methods of the disclosure, the composition is non-genome integrating.

BRIEF DESCRIPTION OF THE DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

FIG. 1A is a schematic diagram depicting an exemplary RNA Endonuclease-C. jejuni Cas9 fusion protein.

FIG. 1B is a graph depicting changes in expression levels of Zika NS5 in the presence of both E43 and E67 CjeCas9-endonuclease fusions with sgRNAs containing the various NS5-targeting spacer sequences as indicated in Table 8. Zika NS5 expression is displayed as fold change relative to the endonuclease loaded with an sgRNA containing a control (Lambda) spacer sequence.

FIG. 2A is a fluorescence microscopy image of cells transfected with CjeCas9-endonuclease fusions loaded with an sgRNA containing a Zika NS5-targeting spacer sequence.

FIG. 2B is a graph depicting changes of expression of Zika NS5 in the presence of CjeCas9-endonuclease fusions loaded with the appropriate Zika NS5-targeting sgRNA as compared to CjeCas9-endonuclease fusions loaded with a non-Zika NS5 targeting sgRNA.

FIG. 3 is a list of exemplary endonucleases for use in the compositions of the disclosure.

FIG. 4 is a schematic diagram depicting a construct encoding an exemplary RNA Endonuclease-C. jejuni Cas9 fusion protein and two gRNA molecules for modulating immune response in the context of a gene therapy. The present invention describes a means to address human disease using a CRISPR-based gene therapy or other non-self protein encoded in AAV while simultaneously altering host gene expression to prevent adaptive immune response to the non-self protein. In one embodiment, the AAV particle (left) carries a pair of guide RNAs and a CRISPR-associated (Cas) protein. The guides target a gene associated with adaptive immune response and a gene (or gene product) to promote therapeutic benefit, respectively. Upon delivery to target tissue, the immune response-targeted guide reduces expression of genes associated with antigen presentation (beta-2-microglobulin, B2M) or co-stimulation of T cells (ICOSLG, CD80, CD86, OX40L, IL12, CCR7). Antigen presentation inhibition prevents formation of T helper (Th) cells specific to the therapeutic transgenes such as Cas proteins while co-stimulation inhibition prevents the activation of Th cells that are specific to the transgene.

DETAILED DESCRIPTION

The disclosure provides compositions and methods for the simultaneous treatment of disease by targeting RNA molecules of a modified cell while masking the modified cell from an adaptive immune response. By inhibiting or reducing expression of a component of an adaptive immune response in the modified cell, the modified cell is invisible to a host immune system. For example, compositions of the disclosure may simultaneously target an RNA molecule associated with a genetic disease or disorder and an RNA molecule that encodes the β2M subunit of the MHC I. By selectively targeting an RNA molecule that encodes the β2M subunit of the MHC I, the composition prevents the modified cell from displaying one or more antigen peptides derived from an RNA targeting construct, vector, or combination thereof on the surface of the modified cell. Consequently, a subject's immune system does not identify the modified cell as containing foreign sequences and does not attempt to mount an immune response directed at the modified cell. This method increases the therapeutic efficacy of the treatment of the genetic disease or disorder while avoiding a common side effect of gene therapy.

RNA-Targeting Fusion Protein Compositions

The disclosure provides a composition comprising (a) a sequence comprising a guide RNA (gRNA) that specifically binds a target sequence within an RNA molecule and (b) a sequence encoding a fusion protein, the sequence comprising a sequence encoding a first RNA-binding polypeptide and a sequence encoding a second RNA-binding polypeptide, wherein neither the first RNA-binding polypeptide nor the second RNA-binding polypeptide comprises a significant DNA-nuclease activity, wherein the first RNA-binding polypeptide and the second RNA-binding polypeptide are not identical, and wherein the second RNA-binding polypeptide comprises an RNA-nuclease activity.

In some embodiments of the compositions of the disclosure, the target sequence comprises at least one repeated sequence.

In some embodiments of the compositions of the disclosure, the gRNA sequence comprises a promoter capable of expressing the gRNA in a eukaryotic cell.

In some embodiments of the compositions of the disclosure, the eukaryotic cell is an animal cell. In some embodiments, the animal cell is a mammalian cell. In some embodiments, the animal cell is a human cell.

In some embodiments of the compositions of the disclosure, the promoter is a constitutively active promoter. In some embodiments, the promoter sequence is isolated or derived from a promoter capable of driving expression of an RNA polymerase. In some embodiments, the promoter sequence is isolated or derived from a U6 promoter. In some embodiments, the promoter sequence is isolated or derived from a promoter capable of driving expression of a transfer RNA (tRNA). In some embodiments, the promoter sequence is isolated or derived from an alanine tRNA promoter, an arginine tRNA promoter, an asparagine tRNA promoter, an aspartic acid tRNA promoter, a cysteine tRNA promoter, a glutamine tRNA promoter, a glutamic acid tRNA promoter, a glycine tRNA promoter, a histidine tRNA promoter, an isoleucine tRNA promoter, a leucine tRNA promoter, a lysine tRNA promoter, a methionine tRNA promoter, a phenylalanine tRNA promoter, a proline tRNA promoter, a serine tRNA promoter, a threonine tRNA promoter, a tryptophan tRNA promoter, a tyrosine tRNA promoter, or a valine tRNA promoter. In some embodiments, the promoter sequence is isolated or derived from a valine tRNA promoter.

In some embodiments of the compositions of the disclosure, the sequence comprising the gRNA further comprises a spacer sequence that specifically binds to the target RNA sequence. In some embodiments, the spacer sequence has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 87%, 90%, 95%, 97%, 99% or any percentage in between of complementarity to the target RNA sequence. In some embodiments, the spacer sequence has 100% complementarity to the target RNA sequence. In some embodiments, the spacer sequence comprises or consists of 20 nucleotides. In some embodiments, the spacer sequence comprises or consists of 21 nucleotides. In some embodiments, the spacer sequence comprises or consists of the sequence

(SEQ ID NO: 1)

UGGAGCGAGCAUCCCCCAAA,

(SEQ ID NO: 2)

GUUUGGGGGAUGCUCGCUCCA,

(SEQ ID NO: 3)

CCCUCACUGCUGGGGAGUCC,

(SEQ ID NO: 4)

GGACUCCCCAGCAGUGAGGG,

(SEQ ID NO: 5)

GCAACUGGAUCAAUUUGCUG,

(SEQ ID NO: 6)

GCAGCAAAUUGAUCCAGUUGC,

(SEQ ID NO: 7)

GCAUUCUUAUCUGGUCAGUGC,

(SEQ ID NO: 8)

GCACUGACCAGAUAAGAAUG,

(SEQ ID NO: 9)

GAGCAGCAGCAGCAGCAGCAG,

(SEQ ID NO: 10)

GCAGGCAGGCAGGCAGGCAGG,

(SEQ ID NO: 11)

GCCCCGGCCCCGGCCCCGGC,

or

(SEQ ID NO: 84)

GCTGCTGCTGCTGCTGCTGC,

(SEQ ID NO: 74)

GGGGCCGGGGCCGGGGCCGG,

(SEQ ID NO: 75)

GGGCCGGGGCCGGGGCCGGG,

(SEQ ID NO: 76)

GGCCGGGGCCGGGGCCGGGG,

(SEQ ID NO: 77)

GCCGGGGCCGGGGCCGGGGC,

(SEQ ID NO: 78)

CCGGGGCCGGGGCCGGGGCC,

or

(SEQ ID NO: 79)

CGGGGCCGGGGCCGGGGCCG.

(SEQ ID NO: 14)

GUGAUAAGUGGAAUGCCAUG,

(SEQ ID NO: 15)

CUGGUGAACUUCCGAUAGUG,

or

(SEQ ID NO: 16)

GAGATATAGCCTGGTGGTTC.
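As a non-limiting illustration of the percent complementarity recited above, the following sketch scores a spacer against a target site of equal length. It assumes ungapped alignment and Watson-Crick pairing only (G-U wobble pairs are not counted), and it uses SEQ ID NO: 1 merely as a stand-in target site; none of these choices is stated in the disclosure, and the function names are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def percent_complementarity(spacer, target_site):
    """Percentage of spacer positions that pair with the target site.

    Assumptions: both strings are written 5' to 3', have equal length,
    and are compared without gaps using Watson-Crick pairs only.
    """
    if len(spacer) != len(target_site):
        raise ValueError("spacer and target site must have equal length")
    perfect_spacer = reverse_complement(target_site)
    matches = sum(1 for s, p in zip(spacer.upper(), perfect_spacer) if s == p)
    return 100.0 * matches / len(spacer)


# Stand-in target site (SEQ ID NO: 1) and the last 20 nucleotides of
# SEQ ID NO: 2, which are its reverse complement.
target_site = "UGGAGCGAGCAUCCCCCAAA"
spacer = "GUUUGGGGGAUGCUCGCUCCA"[1:]
print(percent_complementarity(spacer, target_site))
```

Under these assumptions, a single mismatched position in a 20-nucleotide spacer corresponds to 95% complementarity.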

In some embodiments of the compositions of the disclosure, the sequence comprising the gRNA further comprises a scaffold sequence that specifically binds to the first RNA binding protein. In some embodiments, the scaffold sequence comprises a stem-loop structure. In some embodiments, the scaffold sequence comprises or consists of 90 nucleotides. In some embodiments, the scaffold sequence comprises or consists of 93 nucleotides. In some embodiments, the scaffold sequence comprises or consists of the sequence GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCU (SEQ ID NO: 83). In some embodiments, the scaffold sequence comprises or consists of the sequence GGACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUU (SEQ ID NO: 17). In some embodiments, the scaffold sequence comprises or consists of the sequence

(SEQ ID NO: 82)

GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUC

CGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU

or

(SEQ ID NO: 13)

GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAAC

UUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU.
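As a non-limiting illustration of a gRNA comprising both a spacer sequence and a scaffold sequence, the following sketch joins the two into a single RNA string. The 5′-spacer-scaffold-3′ order is an assumption made for illustration, as are the function names and the choice to normalize sequences written with T to U; the disclosure does not prescribe this procedure.

```python
# Scaffold of SEQ ID NO: 82 as recited above (93 nucleotides).
SCAFFOLD_SEQ_ID_82 = (
    "GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUC"
    "CGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU"
)


def to_rna(seq):
    """Upper-case a sequence, write T as U, and reject other characters."""
    rna = seq.upper().replace("T", "U")
    invalid = set(rna) - set("ACGU")
    if invalid:
        raise ValueError("unexpected characters: %s" % sorted(invalid))
    return rna


def assemble_grna(spacer, scaffold):
    """Concatenate spacer and scaffold.

    Assumption: the spacer is placed 5' of the scaffold. This order is
    illustrative only and is not specified by the text above.
    """
    return to_rna(spacer) + to_rna(scaffold)


# Spacer of SEQ ID NO: 1 joined to the scaffold of SEQ ID NO: 82.
grna = assemble_grna("UGGAGCGAGCAUCCCCCAAA", SCAFFOLD_SEQ_ID_82)
print(len(grna), grna)
```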

In some embodiments of the compositions of the disclosure, the gRNA does not bind or does not selectively bind to a second sequence within the RNA molecule.

In some embodiments of the compositions of the disclosure, an RNA genome or an RNA transcriptome comprises the RNA molecule.

In some embodiments of the compositions of the disclosure, the first RNA binding protein comprises a CRISPR-Cas protein. In some embodiments, the CRISPR-Cas protein is a Type II CRISPR-Cas protein. In some embodiments, the first RNA binding protein comprises a Cas9 polypeptide or an RNA-binding portion thereof. In some embodiments, the CRISPR-Cas protein comprises a native RNA nuclease activity. In some embodiments, the native RNA nuclease activity is reduced or inhibited. In some embodiments, the native RNA nuclease activity is increased or induced. In some embodiments, the CRISPR-Cas protein comprises a native DNA nuclease activity and the native DNA nuclease activity is inhibited. In some embodiments, the CRISPR-Cas protein comprises a mutation. In some embodiments, a nuclease domain of the CRISPR-Cas protein comprises the mutation. In some embodiments, the mutation occurs in a nucleic acid encoding the CRISPR-Cas protein. In some embodiments, the mutation occurs in an amino acid sequence of the CRISPR-Cas protein. In some embodiments, the mutation comprises a substitution, an insertion, a deletion, a frameshift, an inversion, or a transposition. In some embodiments, the mutation comprises a deletion of a nuclease domain, a binding site within the nuclease domain, an active site within the nuclease domain, or at least one essential amino acid residue within the nuclease domain.

In some embodiments of the compositions of the disclosure, the first RNA binding protein comprises a CRISPR-Cas protein. In some embodiments, the CRISPR-Cas protein is a Type V CRISPR-Cas protein. In some embodiments, the first RNA binding protein comprises a Cpf1 polypeptide or an RNA-binding portion thereof. In some embodiments, the CRISPR-Cas protein comprises a native RNA nuclease activity. In some embodiments, the native RNA nuclease activity is reduced or inhibited. In some embodiments, the native RNA nuclease activity is increased or induced. In some embodiments, the CRISPR-Cas protein comprises a native DNA nuclease activity and the native DNA nuclease activity is inhibited. In some embodiments, the CRISPR-Cas protein comprises a mutation. In some embodiments, a nuclease domain of the CRISPR-Cas protein comprises the mutation. In some embodiments, the mutation occurs in a nucleic acid encoding the CRISPR-Cas protein. In some embodiments, the mutation occurs in an amino acid sequence of the CRISPR-Cas protein. In some embodiments, the mutation comprises a substitution, an insertion, a deletion, a frameshift, an inversion, or a transposition. In some embodiments, the mutation comprises a deletion of a nuclease domain, a binding site within the nuclease domain, an active site within the nuclease domain, or at least one essential amino acid residue within the nuclease domain.

In some embodiments of the compositions of the disclosure, the first RNA binding protein comprises a CRISPR-Cas protein. In some embodiments, the CRISPR-Cas protein is a Type VI CRISPR-Cas protein. In some embodiments, the first RNA binding protein comprises a Cas13 polypeptide or an RNA-binding portion thereof. In some embodiments, the first RNA binding protein comprises a Cas13d polypeptide or an RNA-binding portion thereof. In some embodiments, the CRISPR-Cas protein comprises a native RNA nuclease activity. In some embodiments, the native RNA nuclease activity is reduced or inhibited. In some embodiments, the native RNA nuclease activity is increased or induced. In some embodiments, the CRISPR-Cas protein comprises a native DNA nuclease activity and the native DNA nuclease activity is inhibited. In some embodiments, the CRISPR-Cas protein comprises a mutation. In some embodiments, a nuclease domain of the CRISPR-Cas protein comprises the mutation. In some embodiments, the mutation occurs in a nucleic acid encoding the CRISPR-Cas protein. In some embodiments, the mutation occurs in an amino acid sequence of the CRISPR-Cas protein. In some embodiments, the mutation comprises a substitution, an insertion, a deletion, a frameshift, an inversion, or a transposition. In some embodiments, the mutation comprises a deletion of a nuclease domain, a binding site within the nuclease domain, an active site within the nuclease domain, or at least one essential amino acid residue within the nuclease domain.

In some embodiments of the compositions of the disclosure, the first RNA binding protein comprises a Pumilio and FBF (PUF) protein. In some embodiments, the first RNA binding protein comprises a Pumilio-based assembly (PUMBY) protein. In some embodiments, a PUF1 protein of the disclosure comprises or consists of the amino acid sequence of

(SEQ ID NO: 208)

MDKSKQMNIN NLSNIPEVID PGITIPIYEE EYENNGESNS QLQQQPQKLG SYRSRAGKFS
60

NTLSNLLPSI SAKLHHSKKN SHGKNGAEFS SSNNSSQSTV ASKTPRASPS RSKMMESSID
120

GVTMDRPGSL TPPQDMEKLV HFPDSSNNFL IPAPRGSSDS FNLPHQISRT RNNTMSSQIT
180

SISSIAPKPR TSSGIWSSNA SANDPMQQHL LQQLQPTTSN NTTNSNTLND YSTKTAYFDN
240

MVSTSGSQMA DNKMNTNNLA IPNSVWSNTR QRSQSNASSI YTDAPLYEQP ARASISSHYT
300

IPTQESPLIA DEIDPQSINW VTMDPTVPSI NQISNLLPTN TISISNVFPL QHQQPQLNNA
360

INLTSTSLAT LCSKYGEVIS ARTLRNLNMA LVEFSSVESA VKALDSLQGK EVSMIGAPSK
420

ISFAKILPMH QQPPQFLLNS QGLPLGLENN NLQPQPLLQE QLFNGAVTFQ QQGNVSIPVF
480

NQQSQQSQHQ NHSSGSAGFS NVLHGYNNNN SMHGNNNNSA NEKEQCPFPL PPPNVNEKED
540

LLREIIELFE ANSDEYQINS LIKKSLNHKG TSDTQNFGPL PEPLSGREFD PPKLRELRKS
600

IDSNAFSDLE IEQLAIAMLD ELPELSSDYL GNTIVQKLFE HSSDIIKDIM LRKTSKYLTS
660

MGVHKNGTWA CQKMITMAHT PRQIMQVTQG VKDYCTPLIN DQFGNYVIQC VLKFGFPWNQ
720

FIFESIIANF WVIVQNRYGA RAVRACLEAH DIVTPEQSIV LSAMIVTYAE YLSTNSNGAL
780

LVTWFLDTSV LPNRHSILAP RLTKRIVELC GHRLASLTIL KVLNYRGDDN ARKIILDSLF
840

GNVNAHDSSP PKELTKLLCE TNYGPTFVHK VLAMPLLEDD LRAHIIKQVR KVLTDSTQIQ
900

PSRRLLEEVG LASPSSTHNK TKQQQQQHHN SSISHMFATP DTSGQHMRGL SVSSVKSGGS
960

KHTTMNTTTT NGSSASTLSP GQPLNANSNS SMGYFSYPGV FPVSGFSGNA SNGYAMNNDD
1020

LSSQFDMLNF NNGTRLSLPQ LSLTNHNNTT MELVNNVGSS QPHTNNNNNN NNTNYNDDNT
1080

VFETLTLHSA N.
1091

In some embodiments, a PUF3 protein of the disclosure comprises or consists of the amino acid sequence of

(SEQ ID NO: 209)

1
MEMNMDMDMD MELASIVSSL SALSHSNNNG GQAAAAGIVN GGAAGSQQIG GFRRSSFTTA

61
NEVDSEILLL HGSSESSPIF KKTALSVGTA PPFSTNSKKF FGNGGNYYQY RSTDTASLSS

121
ASYNNYHTHH TAANLGKNNK VNHLLGQYSA SIAGPVYYNG NDNNNSGGEG FFEKFGKSLI

181
DGTRELESQD RPDAVNTQSQ FISKSVSNAS LDTQNTFEQN VESDKNFNKL NRNTTNSGSL

241
YHSSSNSGSS ASLESENAHY PKRNIWNVAN TPVFRPSNNP AAVGATNVAL PNQQDGPANN

301
NFPPYMNGFP PNQFHQGPHY QNFPNYLIGS PSNFISQMIS VQIPANEDTE DSNGKKKKKA

361
NRPSSVSSPS SPPNNSPFPF AYPNPMMFMP PPPLSAPQQQ QQQQQQQQQE DQQQQQQQEN

421
PYIYYPTPNP IPVKMPKDEK TFKKRNNKNH PANNSNNANK QANPYLENSI PTKNTSKKNA

481
SSKSNESTAN NHKSHSHSHP HSQSLQQQQQ TYHRSPLLEQ LRNSSSDKNS NSNMSLKDIF

541
GHSLEFCKDQ HGSRFIQREL ATSPASEKEV IFNEIRDDAI ELSNDVFGNY VIQKFFEFGS

601
KIQKNTLVDQ FKGNMKQLSL QMYACRVIQK ALEYIDSNQR IELVLELSDS VLQMIKDQNG

661
NHVIQKAIET IPIEKLPFIL SSLTGHIYHL STHSYGCRVI QRLLEFGSSE DQESILNELK

721
DFIPYLIQDQ YGNYVIQYVL QQDQFTNKEM VDIKQEIIET VANNVVEYSK HKFASNVVEK

781
SILYGSKNQK DLIISKILPR DKNHALNLED DSPMILMIKD QFANYVIQKL VNVSEGEGKK

841
LIVIAIRAYL DKLNKSNSLG NRHLASVEKL AALVENAEV.

In some embodiments, a PUF4 protein of the disclosure comprises or consists of the amino acid sequence of

(SEQ ID NO: 210)

1
MSTKGLKEEI DDVPSVDPVV SETVNSALFQ LQLDDPEENA TSNAFANKVS QDSQFANGPP

61
SQMFPHPQMM GGMGFMPYSQ MMQVPHNPCP FFPPPDFNDP TAPLSSSPLN AGGPPMLFKN

121
DSLPFQMLSS GAAVATQGGQ NLNPLINDNS MKVLPIASAD PLWTHSNVPG SASVAIEETT

181
ATLQESLPSK GRESNNKASS FRRQTFHALS PTDLINAANN VTLSKDFQSD MQNFSKAKKP

241
SVGANNTAKT RTQSISFDNT PSSTSFIPPT NSVSEKLSDF KIETSKEDLI NKTAPAKKES

301
PTTYGAAYPY GGPLLQPNPI MPGHPHNISS PIYGIRSPFP NSYEMGAQFQ PFSPILNPTS

361
HSLNANSPIP LTQSPIHLAP VLNPSSNSVA FSDMKNDGGK PTTDNDKAGP NVRMDLINPN

421
LGPSMQPFHI LPPQQNTPPP PWLYSTPPPF NAMVPPHLLA QNHMPLMNSA NNKHHGRNNN

481
SMSSHNDNDN IGNSNYNNKD TGRSNVGKMK NMKNSYHGYY NNNNNNNNNN NNNNNSNATN

541
SNSAEKQRKI EESSRFADAV LDQYIGSIHS LCKDQHGCRF LQKQLDILGS KAADAIFEET

601
KDYTVELMTD SFGNYLIQKL LEEVTTEQRI VLTKISSPHF VEISLNPHGT RALQKLIECI

661
KTDEEAQIVV DSLRPYTVQL SKDLNGNHVI QKCLQRLKPE NFQFIFDAIS DSCIDIATHR

721
HGCCVLQRCL DHGTTEQCDN LCDKLLALVD KLTLDPFGNY VVQYIITKEA EKNKYDYTHK

781
IVHLLKPRAI ELSIHKFGSN VIEKILKTAI VSEPMILEIL NNGGETGIQS LLNDSYGNYV

841
LQTALDISHK QNDYLYKRLS EIVAPLLVGP IRNTPHGKRI IGMLHLDS.

In some embodiments, a PUF5 protein of the disclosure comprises or consists of the amino acid sequence of

(SEQ ID NO: 211)

1
MSDSTGRINS KASDSSSISD HQTADLSIFN GSFDGGAFSS SNIPLFNFMG TGNQRFQYSP

61
HPFAKSSDPC RLAALTPSTP KGPLNLTPAD FGLADFSVGN ESFADFTANN TSFVGNVQSN

121
VRSTRLLPAW AVDNSGNIRD DLTLQDVVSN GSLIDFAMDR TGVKFLERHF PEDHDNEMHF

181
VLFDKLTEQG AVFTSLCRSA AGNFIIQKFV EHATLDEQER LVRKMCDNGL IEMCLDKFAC

241
RVVQMSIQKF DVSIAMKLVE KISSLDFLPL CTDQCAIHVL QKVVKLLPIS AWSFFVKFLC

301
RDDNLMTVCQ DKYGCRLVQQ TIDKLSDNPK LHCFNTRLQL LHGLMTSVAR NCFRLSSNEF

361
ANYVVQYVIK SSGVMEMYRD TIIEKCLLRN ILSMSQDKYA SHVVEGAFLF APPLLLSEMM

421
DEIFDGYVKD QETNRDALDI LLFHQYGNYV VQQMISICIS ALLGKEERKM VASEMRLYAK

481
WFDRIKNRVN RHSGRLERFS SGKKIIESLQ KLNVPMTMTN EPMPYWAMPT PLMDISAHFM

541
NKLNFQKNSV FDE.

In some embodiments, a PUF6 protein of the disclosure comprises or consists of the amino acid sequence of

(SEQ ID NO: 212)

1
MTPNRRSTDS YNMLGASFDF DPDFSLLSNK THKNKNPKPP VKLLPYRHGS NTTSSDLDNY

61
IFNSGSGSSD DETPPPAAPI FISLEEVLLN GLLIDFAIDP SGVKFLEANY PLDSEDQIRK

121
AVFEKLTEST TLFVGLCHSR NGNFIVQKLV ELATPAEQRE LLRQMIDGGL LVMCKDKFAC

181
RVVQLALQKF DHSNVFQLIQ ELSTFDLAAM CTDQISIHVI QRVVKQLPVD MWTFFVHFLS

241
SGDSLMAVCQ DKYGCRLVQQ VIDRLAENTK LPCFKFRIQL LHSLMTCIVR NCYRLSSNEF

301
ANYVIQYVIK SSGIMEMYRD TIIDKCLLRN LLSMSQDKYA SHVIEGAFLF APPALLHEMM

361
EEIFSGYVKD VELNRDALDI LLFHQYGNYV VQQMISICTA ALIGKEERQL PPAILLLYSG

421
WYEKMKQRVL QHASRLERFS SGKKIIDSVM RHGVPTAAAI NAQAAPSLME LTAQFDAMFP

481
SFLAR.

In some embodiments, a PUF7 protein of the disclosure comprises or consists of the amino acid sequence of

(SEQ ID NO: 213)

1
MTPNRRSTDS YNMLGASFDF DPDFSLLSNK THKNKNPKPP VKLLPYRHGS NTTSSDSDSY

61
IFNSGSGSSD AETPAPVAPI FISLEDVLLN GQLIDFAIDP SGVKFLEANY PLDSEDQIRK

121
AVFEKFTEST TLFVGLCHSR NGNFIVQKLV ELATPAEQRE LLRQMIDGGL LAMCKDKFAC

181
RVVQLALQKF DHSNVFQLIQ ELSTFDLAAM CTDQISIHVI QRVVKQLPVD MWTFFVHFLS

241
SGDSLMAVCQ DKYGCRLVQQ VIDRLAENPK LPCFKFRIQL LHSLMTCIVR NCYRLSSNEF

301
ANYVIQYVIK SSGIMEMYRD TIIDKCLLRN LLSMSQDKYA SHVIEGAFLF APPALLHEMM

361
EEIFSGYVKD VESNRDALDI LLFHQYGNYV VQQMISICTA ALIGKEEREL PPAILLLYSG

421
WYEKMKQRVL QHASRLERFS SGKKIIDSVM RHGVPTAAAV NAQAAPSLME LTAQFDAMFP

481
SFLAR.

In some embodiments, a PUF8 protein of the disclosure comprises or consists of the amino acid sequence of

(SEQ ID NO: 214)

1
MSRPISIGNT CTFDPSASPI ESLGRSIGAQ KIVDSVCGSP IRSYGRHIST NPKNERLPDT

61
PEFQFATYMH QGGKVIGQNT LHMFGTPPSC YCAQENIPIS SNVGHVLSTI NNNYMNHQYN

121
GSNMFSNQMT QMLQAQAYND LQMHQAHSQS IRVPVQPSAT GIFSNPYREP TTTDDLLTRY

181
RANPAMMKNL KLSDIRGALL KFAKDQVGSR FIQQELASSK DRFEKDSIFD EVVSNADELV

241
DDIFGNYVVQ KFFEYGEERH WARLVDAIID RVPEYAFQMY ACRVLQKALE KINEPLQIKI

301
LSQIRHVIHR CMKDQNGNHV VQKAIEKVSP QYVQFIVDTL LESSNTIYEM SVDPYGCRVV

361
QRCLEHCSPS QTKPVIGQIH KRFDEIANNQ YGNYVVQHVI EHGSEEDRMV IVTRVSNNLF

421
EFATHKYSSN VIEKCLEQGA VYHKSMIVGA ACHHQEGSVP IVVQMMKDQY ANYVVQKMFD

481
QVTSEQRREL ILTVRPHIPV LRQFPHGKHI LAKLEKYFQK PAVMSYPYQD MQGSH.

In some embodiments, a PUF9 protein of the disclosure comprises or consists of the amino acid sequence of

(SEQ ID NO: 215)

1
MADPNWAYAP PTNYYADHSI AKPIMISGGH PSQDQGHSPK SESFGQSVTT AFNGMVDNLV

61
GSPSSSVQQR NYFTTTPFPI SRSPNDRNDD KIMGNGSYGV PIPIPQDGVP QGTPDFQMTP

121
FLQQGGHLIG GSPNGPVQVS GNWYSGGAGI FSTMQQADPS NGMPGMAAEF VNNENGMPGP

181
NGMHQQAMIS GSPPFPYQNM MNLTTSFGAM GLGPQQIQQR DPQMFQQPIL HEPIQGMAQN

241
GFGQQVFFTQ MQNQQHPQGQ AQQQLQQLAQ QHQQQQNSQQ FFGQGPNGMG NGGVMNDWSQ

301
RSFGMPQQQA QQNGLPPNFS QNPPRRRGPE DPNGQTPKTL QDIKNNVIEF AKDQHGSRFI

361
QQKLERASLR DKAAIFTPVL ENAEELMTDV FGNYVIQKFF EFGNNEQRNQ LVGTIRGNVM

421
KLALQMYGCR VIQKALEYVE EKYQHEILGE MEGQVLKCVK DQNGNHVIQK VIERVEPERL

481
QFIIDAFTKN NSDNVYTLSV HPYGCRVIQR VLEYCNEEQK QPVLDALQIH LKQLVLDQYG

541
NYVIQHVIEH GSPSDKEQIV QDVISDDLLK FAQHKFASNV IEKCLTFGGH AERNLIIDKV

601
CGDPNDPSPP LLQMMKDPFA NYVVQKMLDV ADPQHRKKIT LTIKPHIATL RKYNFGKHIL

661
LKLEKYFAKQ APANSSNSSS NDQIYEHSPF DIPLGADFSN HPF.

In some embodiments of the compositions of the disclosure, the first RNA binding protein does not require multimerization for RNA-binding activity. In some embodiments, the first RNA binding protein is not a monomer of a multimer complex. In some embodiments, a multimer protein complex does not comprise the first RNA binding protein.

In some embodiments of the compositions of the disclosure, the first RNA binding protein selectively binds to a target sequence within the RNA molecule. In some embodiments, the first RNA binding protein does not comprise an affinity for a second sequence within the RNA molecule. In some embodiments, the first RNA binding protein does not comprise a high affinity for or selectively bind a second sequence within the RNA molecule.

In some embodiments of the compositions of the disclosure, an RNA genome or an RNA transcriptome comprises the RNA molecule.

In some embodiments of the compositions of the disclosure, the first RNA binding protein comprises between 2 and 1300 amino acids, inclusive of the endpoints.

In some embodiments of the compositions of the disclosure, the sequence encoding the first RNA binding protein further comprises a nuclear localization signal (NLS). In some embodiments, the sequence encoding a nuclear localization signal (NLS) is positioned 3′ to the sequence encoding the first RNA binding protein. In some embodiments, the first RNA binding protein comprises an NLS at a C-terminus of the protein.

In some embodiments of the compositions of the disclosure, the sequence encoding the first RNA binding protein further comprises a first sequence encoding a first NLS and a second sequence encoding a second NLS. In some embodiments, the sequence encoding the first NLS or the second NLS is positioned 3′ to the sequence encoding the first RNA binding protein. In some embodiments, the first RNA binding protein comprises the first NLS or the second NLS at a C-terminus of the protein.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a nuclease domain. In some embodiments, the second RNA binding protein binds RNA in a manner in which it associates with RNA. In some embodiments, the second RNA binding protein associates with RNA in a manner in which it cleaves RNA.

In some embodiments of the compositions of the disclosure, the sequence encoding the second RNA binding protein comprises or consists of an RNAse. In some embodiments, the second RNA binding protein comprises or consists of an RNAse1 polypeptide. In some embodiments, the RNAse1 polypeptide comprises or consists of: KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGLCKPVNTFVHEPLVDVQNV CFQEKVTCKNGQGNCYKSNSSMHITDCRLTNGSRYPNCAYRTSPKERHIIVACEGSPYV PVHFDASVEDST (SEQ ID NO: 20). In some embodiments, the second RNA binding protein comprises or consists of an RNAse4 polypeptide. In some embodiments, the RNAse4 polypeptide comprises or consists of: QDGMYQRFLRQHVHPEETGGSDRYCDLMMQRRKMTLYHCKRFNTFIHEDIWNIRSIC S TTNIQCKNGKMNCHEGVVKVTDCRDTGS SRAPNCRYRAIASTRRVVIACEGNPQVPVH FDG (SEQ ID NO: 21). In some embodiments, the second RNA binding protein comprises or consists of an RNAse6 polypeptide. In some embodiments, the RNAse6 polypeptide comprises or consists of: WPKRLTKAHWFEIQHIQPSPLQCNRAMSGINNYTQHCKHQNTFLHDSFQNVAAVCDLL SIVCKNRRHNCHQSSKPVNMTDCRLTSGKYPQCRYSAAAQYKFFIVACDPPQKSDPPYK LVPVHLDSIL (SEQ ID NO: 22). In some embodiments, the second RNA binding protein comprises or consists of an RNAse7 polypeptide. In some embodiments, the RNAse7 polypeptide comprises or consists of: APARAGFCPLLLLLLLGLWVAEIPVSAKPKGMTSSQWFKIQHMQPSPQACNSAMKNINK HTKRCKDLNTFLHEPFSSVAATCQTPKIACKNGDKNCHQSHGPVSLTMCKLTSGKYPNC RYKEKRQNKSYVVACKPPQKKDSQQFHLVPVHLDRVL (SEQ ID NO: 23). In some embodiments, the second RNA binding protein comprises or consists of an RNAse8 polypeptide. In some embodiments, the RNAse8 polypeptide comprises or consists of: TSSQWFKTQHVQPSPQACNSAMSIINKYTERCKDLNTFLHEPFSSVAITCQTPNIACKNSC KNCHQSHGPMSLTMGELTSGKYPNCRYKEKHLNTPYIVACDPPQQGDPGYPLVPVHLD KVV (SEQ ID NO: 24). In some embodiments, the second RNA binding protein comprises or consists of an RNAse2 polypeptide. 
In some embodiments, the RNAse2 polypeptide comprises or consists of: KPPQFTWAQWFETQHINMTSQQCTNAMQVINNYQRRCKNQNTFLLTTFANVVNVCGN PNMTCPSNKTRKNCHHSGSQVPLIHCNLTTPSPQNISNCRYAQTPANMFYIVACDNRDQ RRDPPQYPVVPVHLDRII (SEQ ID NO: 25). In some embodiments, the second RNA binding protein comprises or consists of an RNAse6PL polypeptide. In some embodiments, the RNAse6PL polypeptide comprises or consists of: DKRLRDNHEWKKLIMVQHWPETVCEKIQNDCRDPPDYWTIHGLWPDKSEGCNRSWPF NLEEIKKNWMEITDSSLPSPSMGPAPPRWMRSTPRRSTLAEAWNSTGSWTSTGGCALPP AALPSGDLCCRPSLTAGSRGVGVDLTALHQLLHVHYSATGIIPEECSEPTKPFQIILHHDH TEWVQSIGMPIWGTISSSESAIGKNEESQPACAVLSHDS (SEQ ID NO: 26). In some embodiments, the second RNA binding protein comprises or consists of an RNAseL polypeptide. In some embodiments, the RNAseL polypeptide comprises or consists of: AAVEDNHLLIKAVQNEDVDLVQQLLEGGANVNFQEEEGGWTPLHNAVQMSREDIVEL LLRHGADPVLRKKNGATPFILAAIAGSVKdLLKLFLSKGADVNECDFYGFTAFMEAAVY GKVKALKFLYKRGANVNLRRKTKEDQERLRKGGATALMDAAEKGHVEVLKILLDEM GADVNACDNMGRNALIHALLSSDDSDVEAITHLLLDHGADVNVRGERGKTPLILAVEK KHLGLVQRLLEQEHIEINDTDSDGKTALLLAVELKLKKIAELLCKRGASTDCGDLVMTA RRNYDHSLVKVLLSHGAKEDFHPPAEDWKPQSSHWGAALKDLHRIYRPMIGKLKFFID EKYKIADTSEGGIYLGEYEKQEVAVKTFCEGSPRAQREVSCLQSSRENSHLVTFYGSESH RGHLEVCVTLCEQTLEACLDVHRGEDVENEEDEFARNVLSSIFKAVQELHLSCGYTHQD LQPQNILIDSKKAAHLADFDKSIKWAGDPQEVKRDLEDLGRLVLYVVKKGSISFEDLKA QSNEEVVQLSPDEETKDLIHRLFHPGEHVRDCLSDLLGHPFFWTWESRYRTLRNVGNES DIKTRKSESEILRLLQPGPSEHSKSFDKWTTKINECVMKKMNKFYEKRGNFYQNTVGDL LKFIRNLGEHIDEEKHKKMKLKIGDPSLYFQKTFPDLVIYVYTKLQNTEYRKHFPQTHSP NKPQCDGAGGASGLASPGC (SEQ ID NO: 27). In some embodiments, the second RNA binding protein comprises or consists of an RNAseT2 polypeptide. In some embodiments, the RNAseT2 polypeptide comprises or consists of: VQHWPETVCEKIQNDCRDPPDYWTIHGLWPDKSEGCNRSWPFNLEEIKDLLPEMRAYW PDVIHSFPNRSRFWKHEWEKHGTCAAQVDALNSQKKYFGRSLELYRELDLNSVLLKLGI KPSINYYQVADFKDALARVYGVIPKIQCLPPSQDEEVQTIGQIELCLTKQDQQLQNCTEP GEQPSPKQEVWLANGAAESRGLRVCEDGPVFYPPPKKTKH (SEQ ID NO: 28). 
In some embodiments, the second RNA binding protein comprises or consists of an RNAse11 polypeptide. In some embodiments, the RNAse11 polypeptide comprises or consists of: EASESTMKIIKEEFTDEEMQYDMAKSGQEKQTIEILMNPILLVKNTSLSMSKDDMSSTLL TFRSLHYNDPKGNSSGNDKECCNDMTVWRKVSEANGSCKWSNNFIRSSTEVMRRVHR APSCKFVQNPGISCCESLELENTVCQFTTGKQFPRCQYHSVTSLEKILTVLTGHSLMSWL VCGSKL (SEQ ID NO: 29). In some embodiments, the second RNA binding protein comprises or consists of an RNAseT2-like polypeptide. In some embodiments, the RNAseT2-like polypeptide comprises or consists of:

(SEQ ID NO: 30)

XLGGADKRLRDNHEWKKLIMVQHWPETVCEKIQNDCRDPPDYWTIHGLWP

DKSEGCNRSWPFNLEEIKDLLPEMRAYWPDVIHSFPNRSRFWKHEWEKHG

TCAAQVDALNSQKKYFGRSLELYRELDLNSVLLKLGIKPSINYYQTTEED

LNLDVEPTTEDTAEEVTIHVLLHSALFGEIGPRRW.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a mutated RNAse. In some embodiments, the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(K41R)) polypeptide. In some embodiments, the Rnase1(K41R) polypeptide comprises or consists of: KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGRCRPVNTFVHEPLVDVQNV CFQEKVTCKNGQGNCYKSNSSMHITDCRLTNGSRYPNCAYRTSPKERHIIVACEGSPYV PVHFDASVEDST (SEQ ID NO: 116). In some embodiments, the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(K41R, D121E)) polypeptide. In some embodiments, the Rnase1 (Rnase1(K41R, D121E)) polypeptide comprises or consists of: KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGRCRPVNTFVHEPLVDVQNV CFQEKVTCKNGQGNCYKSNSSMHITDCRLTNGSRYPNCAYRTSPKERHIIVACEGSPYV PVHFEASVEDST (SEQ ID NO: 117). In some embodiments, the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(K41R, D121E, H119N)) polypeptide. In some embodiments, the Rnase1 (Rnase1(K41R, D121E, H119N)) polypeptide comprises or consists of: KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGRCRPVNTFVHEPLVDVQNV CFQEKVTCKNGQGNCYKSNSSMHITDCRLTNGSRYPNCAYRTSPKERHIIVACEGSPYV PVNFEASVEDST (SEQ ID NO: 118). In some embodiments, the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(H119N)) polypeptide. In some embodiments, the Rnase1 (Rnase1(H119N)) polypeptide comprises or consists of: KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGRCKPVNTFVHEPLVDVQNV CFQEKVTCKNGQGNCYKSNSSMHITDCRLTNGSRYPNCAYRTSPKERHIIVACEGSPYV PVNFDASVEDST (SEQ ID NO: 119). In some embodiments, the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N)) polypeptide. 
In some embodiments, the Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N)) polypeptide comprises or consists of: KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGDCKPVNTFVHEPLVDVQNV CFQEKVTCKDGQGNCYKSNSSMHITDCRLTADSDYPNCAYRTSPKERHIIVACEGSPYV PVNFDASVEDST (SEQ ID NO: 120). In some embodiments, the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N, K41R, D121E)) polypeptide. In some embodiments, the Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N, K41R, D121E)) polypeptide comprises or consists of: KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGDCRPVNTFVHEPLVDVQNV CFQEKVTCKDGQGNCYKSNSSMHITDCRLTADSDYPNCAYRTSPKERHIIVACEGSPYV PVNFEASVEDST (SEQ ID NO: 121). In some embodiments, the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D)) polypeptide. In some embodiments, the Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D)) polypeptide comprises or consists of:

(SEQ ID NO: 122)

KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGDCKPVNTFVHEP

LVDVQNVCFQEKVTCKDGQGNCYKSNSSMHITDCRLTADSDYPNCAYRTS

PKERHIIVACEGSPYVPVHFDASVEDST.

In some embodiments, the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N, K41R, D121E)) polypeptide that comprises or consists of:

(SEQ ID NO: 225)

KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGDCRPVNTFVHEP

LVDVQNVCFQEKVTCKDGQGNCYKSNSSMHITDCRLTADSDYPNCAYRTS

PKERHIIVACEGSPYVPVNFEASVEDST.
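As a non-limiting illustration of the substitution nomenclature used above (for example, K41R denotes lysine at position 41 replaced by arginine, with positions counted from 1), the following sketch applies a list of substitutions to a polypeptide string and checks that each stated original residue is actually present. The function name is hypothetical, and the sketch illustrates only the notation, not a method of making the polypeptides.

```python
import re

# Sequences as recited above for SEQ ID NO: 122 and SEQ ID NO: 225.
SEQ_ID_122 = (
    "KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGDCKPVNTFVHEP"
    "LVDVQNVCFQEKVTCKDGQGNCYKSNSSMHITDCRLTADSDYPNCAYRTS"
    "PKERHIIVACEGSPYVPVHFDASVEDST"
)
SEQ_ID_225 = (
    "KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGDCRPVNTFVHEP"
    "LVDVQNVCFQEKVTCKDGQGNCYKSNSSMHITDCRLTADSDYPNCAYRTS"
    "PKERHIIVACEGSPYVPVNFEASVEDST"
)


def apply_substitutions(protein, substitutions):
    """Apply substitutions such as "K41R" (1-based positions) to a protein."""
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError("cannot parse substitution %r" % sub)
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(residues):
            raise ValueError("position %d is outside the sequence" % pos)
        if residues[pos - 1] != old:
            raise ValueError(
                "expected %s at position %d, found %s"
                % (old, pos, residues[pos - 1])
            )
        residues[pos - 1] = new
    return "".join(residues)


# Adding H119N, K41R and D121E to the SEQ ID NO: 122 background
# reproduces the sequence recited for SEQ ID NO: 225.
mutant = apply_substitutions(SEQ_ID_122, ["H119N", "K41R", "D121E"])
print(mutant == SEQ_ID_225)
```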

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a NOB1 polypeptide. In some embodiments, the NOB1 polypeptide comprises or consists of:

(SEQ ID NO: 31)

APVEHVVADAGAFLRHAALQDIGKNIYTIREVVTEIRDKATRRRLAVLPY

ELRFKEPLPEYVRLVTEFSKKTGDYPSLSATDIQVLALTYQLEAEFVGVS

HLKQEPQKVKVSSSIQHPETPLHISGFHLPYKPKPPQETEKGHSACEPEN

LEFSSFMFWRNPLPNIDHELQELLIDRGEDVPSEEEEEEENGFEDRKDDS

DDDGGGWITPSNIKQIQQELEQCDVPEDVRVGCLTTDFAMQNVLLQMGLH

VLAVNGMLIREARSYILRCHGCFKTTSDMSRVFCSHCGNKTLKKVSVTV.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of an endonuclease. In some embodiments, the second RNA binding protein comprises or consists of an endonuclease V (ENDOV) polypeptide. In some embodiments, the ENDOV polypeptide comprises or consists of: AFSGLQRVGGVDVSFVKGDSVRACASLVVLSFPELEVVYEESRMVSLTAPYVSGFLAFR EVPFLLELVQQLREKEPGLMPQVLLVDGNGVLHHRGEGVACHLGVLTDLPCVGVAKKL LQVDGLENNALHKEKIRLLQTRGDSFPLLGDSGTVLGMALRSHDRSTRPLYISVGHRMS LEAAVRLTCCCCRFRIPEPVRQADICSREHIRKS (SEQ ID NO: 32). In some embodiments, the second RNA binding protein comprises or consists of an endonuclease G (ENDOG) polypeptide. In some embodiments, the ENDOG polypeptide comprises or consists of: AELPPVPGGPRGPGELAKYGLPGLAQLKSRESYVLCYDPRTRGALWVVEQLRPERLRG DGDRRECDFREDDSVHAYHRATNADYRGSGFDRGHLAAAANHRWSQKAMDDTFYLS NVAPQVPHLNQNAWNNLEKYSRSLTRSYQNVYVCTGPLFLPRTEADGKSYVKYQVIGK NHVAVPTHFEKVLILEAAGGQIELRTYVMPNAPVDEAIPLERFLVPIESIERASGLLEVPNI LARAGSLKAITAGSK (SEQ ID NO: 33). In some embodiments, the second RNA binding protein comprises or consists of an endonuclease D1 (ENDOD1) polypeptide. In some embodiments, the ENDOD1 polypeptide comprises or consists of: RLVGEEEAGFGECDKFFYAGTPPAGLAAD SHVKICQRAEGAERFATLYSTRDRIPVYSA FRAPRPAPGGAEQRWLVEPQIDDPNSNLEEAINEAEAITSVNSLGSKQALNTDYLDSDYQ RGQLYPFSLSSDVQVATFTLTNSAPMTQSFQERWYVNLHSLMDRALTPQCGSGEDLYIL TGTVPSDYRVKDKVAVPEFVWLAACCAVPGGGWAMGFVKHTRDSDIIEDVMVKDLQ KLLPFNPQLFQNNCGETEQDTEKMKKILEVVNQIQDEERMVQSQKSSSPLSSTRSKRSTL LPPEASEGSSSFLGKLMGFIATPFIKLFQLIYYLVVAILKNIVYFLWCVTKQVINGIESCLY RLGSATISYFMAIGEELVSIPWKVLKVVAKVIRALLRILCCLLKAICRVLSIPVRVLVDVA TFPVYTMGAIPIVCKDIALGLGGTVSLLFDTAFGTLGGLFQVVFSVCKRIGYKVTFDNSG EL (SEQ ID NO: 34). In some embodiments, the second RNA binding protein comprises or consists of a human flap endonuclease-1 (hFEN1) polypeptide. 
In some embodiments, the hFEN1 polypeptide comprises or consists of: MGIQGLAKLIADVAPSAIRENDIKSYFGRKVAIDASMSIYQFLIAVRQGGDVLQNEEGET TSHLMGMFYRTIRMMENGIKPVYVFDGKPPQLKSGELAKRSERRAEAEKQLQQAQAAG AEQEVEKFTKRLVKVTKQHNDECKHLLSLMGIPYLDAPSEAEASCAALVKAGKVYAAA TEDMDCLTFGSPVLMRHLTASEAKKLPIQEFHLSRILQELGLNQEQFVDLCILLGSDYCE SIRGIGPKRAVDLIQKHKSIEEIVRRLDPNKYPVPENWLHKEAHQLFLEPEVLDPESVELK WSEPNEEELIKFMCGEKQFSEERIRSGVKRLSKSRQGSTQGRLDDFFKVTGSLSSAKRKE PEPKGSTKKKAKTGAAGKFKRGK (SEQ ID NO: 35). In some embodiments, the second RNA binding protein comprises or consists of a DNA repair endonuclease XPF (ERCC4) polypeptide. In some embodiments, the ERCC4 polypeptide comprises or consists of:

(SEQ ID NO: 124)

MESGQPARRIAMAPLLEYERQLVLELLDTDGLVVCARGLGADRLLYHFLQ

LHCHPACLVLVLNTQPAEEEYFINQLKIEGVEHLPRRVTNEITSNSRYEV

YTQGGVIFATSRILVVDFLTDRIPSDLITGILVYRAHRIIESCQEAFILR

LFRQKNKRGFIKAFTDNAVAFDTGFCHVERVMRNLFVRKLYLWPRFHVAV

NSFLEQHKPEVVEIHVSMTPTMLAIQTAILDILNACLKELKCHNPSLEVE

DLSLENAIGKPFDKTIRHYLDPLWHQLGAKTKSLVQDLKILRTLLQYLSQ

YDCVTFLNLLESLRATEKAFGQNSGWLFLDSSTSMFINARARVYHLPDAK

MSKKEKISEKMEIKEGEGILWG.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of an Endonuclease III-like protein 1 (NTHL) polypeptide. In some embodiments, the NTHL polypeptide comprises or consists of:

(SEQ ID NO: 123)

CSPQESGMTALSARMLTRSRSLGPGAGPRGCREEPGPLRRREAAAEARKS

HSPVKRPRKAQRLRVAYEGSDSEKGEGAEPLKVPVWEPQDWQQQLVNIRA

MRNKKDAPVDHLGTEHCYDSSAPPKVRRYQVLLSLMLSSQTKDQVTAGAM

QRLRARGLTVDSILQTDDATLGKLIYPVGFWRSKVKYIKQTSAILQQHYG

GDIPASVAELVALPGVGPKMAHLAMAVAWGTVSGIAVDTHVHRIANRLRW

TKKATKSPEETRAALEEWLPRELWHEINGLLVGFGQQTCLPVHPRCHACL

NQALCPAAQGL.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a human Schlafen 14 (hSLFN14) polypeptide. In some embodiments, the hSLFN14 polypeptide comprises or consists of:

(SEQ ID NO: 36)

ESTHVEFKRFTTKKVIPRIKEMLPHYVSAFANTQGGYVLIGVDDKSKEVV

GCKWEKVNPDLLKKEIENCIEKLPTFHFCCEKPKVNFTTKILNVYQKDVL

DGYVCVIQVEPFCCVVFAEAPDSWIMKDNSVTRLTAEQWVVMMLDTQSAP

PSLVTDYNSCLISSASSARKSPGYPIKVHKFKEALQ.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a human beta-lactamase-like protein 2 (hLACTB2) polypeptide. In some embodiments, the hLACTB2 polypeptide comprises or consists of:

(SEQ ID NO: 37)

TLQGTNTYLVGTGPRRILIDTGEPAIPEYISCLKQALTEFNTAIQEIVVT

HWHRDHSGGIGDICKSINNDTTYCIKKLPRNPQREEIIGNGEQQYVYLKD

GDVIKTEGATLRVLYTPGHTDDHMALLLEEENAIFSGDCILGEGTTVFED

LYDYMNSLKELLKIKADIIYPGHGPVIHNAEAKIQQYISHRNIREQQILT

LFRENFEKSFTVMELVKIIYKNTPENLHEMAKHNLLLHLKKLEKEGKIFS

NTDPDKKWKAHL.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of an apurinic/apyrimidinic (AP) endodeoxyribonuclease (APEX) polypeptide. In some embodiments, the second RNA binding protein comprises or consists of an apurinic/apyrimidinic (AP) endodeoxyribonuclease (APEX2) polypeptide. In some embodiments, the APEX2 polypeptide comprises or consists of: MLRVVSWNINGIRRPLQGVANQEPSNCAAVAVGRILDELDADIVCLQETKVTRDALTEP LAIVEGYNSYFSFSRNRSGYSGVATFCKDNATPVAAEEGLSGLFATQNGDVGCYGNMD EFTQEELRALDSEGRALLTQHKIRTWEGKEKTLTLINVYCPHADPGRPERLVFKMRFYR LLQIRAEALLAAGSHVIILGDLNTAHRPIDHWDAVNLECFEEDPGRKWMDSLLSNLGCQ SASHVGPFIDSYRCFQPKQEGAFTCWSAVTGARHLNYGSRLDYVLGDRTLVIDTFQASF LLPEVMGSDHCPVGAVLSVSSVPAKQCPPLCTRFLPEFAGTQLKILRFLVPLEQSPVLEQ STLQHNNQTRVQTCQNKAQVRSTRPQPSQVGSSRGQKNLKSYFQPSPSCPQASPDIELPS LPLMSALMTPKTPEEKAVAKVVKGQAKTSEAKDEKELRTSFWKSVLAGPLRTPLCGGH REPCVMRTVKKPGPNLGRRFYMCARPRGPPTDPSSRCNFFLWSRPS (SEQ ID NO: 38). In some embodiments, the APEX2 polypeptide comprises or consists of: MLRVVSWNINGIRRPLQGVANQEPSNCAAVAVGRILDELDADIVCLQETKVTRDALTEP LAIVEGYNSYFSFSRNRSGYSGVATFCKDNATPVAAEEGLSGLFATQNGDVGCYGNMD EFTQEELRALDSEGRALLTQHKIRTWEGKEKTLTLINVYCPHADPGRPERLVFKMRFYR LLQIRAEALLAAGSHVIILGDLNTAHRPIDHWDAVNLECFEEDPGRKWMDSLLSNLGCQ SASHVGPFIDSYRCFQPKQEGAFTCWSAVTGARHLNYGSRLDYVLGDRTLVIDTFQASF LLPEVMGSDHCPVGAVLSVSSVPAKQCPPLCTRFLPEFAGTQLKILRFLVPLEQSP (SEQ ID NO: 39). In some embodiments, the second RNA binding protein comprises or consists of an apurinic or apyrimidinic site lyase (APEX1) polypeptide. In some embodiments, the APEX1 polypeptide comprises or consists of:

(SEQ ID NO: 125)

PKRGKKGAVAEDGDELRTEPEAKKSKTAAKKNDKEAAGEGPALYEDPPDQ

KTSPSGKPATLKICSWNVDGLRAWIKKKGLDWVKEEAPDILCLQETKCSE

NKLPAELQELPGLSHQYWSAPSDKEGYSGVGLLSRQCPLKVSYGIGDEEH

DQEGRVIVAEFDSFVLVTAYVPNAGRGLVRLEYRQRWDEAFRKFLKGLAS

RKPLVLCGDLNVAHEEIDLRNPKGNKKNAGFTPQERQGFGELLQAVPLAD

SFRHLYPNTPYAYTFWTYMMNARSKNVGWRLDYFLLS.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of an angiogenin (ANG) polypeptide. In some embodiments, the ANG polypeptide comprises or consists of:

(SEQ ID NO: 40)

QDNSRYTHFLTQHYDAKPQGRDDRYCESIMRRRGLTSPCKDINTFIHGNK

RSIKAICENKNGNPHRENLRISKSSFQVTTCKLHGGSPWPPCQYRATAGF

RNVVVACENGLPVHLDQSIFRRP.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a heat responsive protein 12 (HRSP12) polypeptide. In some embodiments, the HRSP12 polypeptide comprises or consists of:

(SEQ ID NO: 41)

SSLIRRVISTAKAPGAIGPYSQAVLVDRTIYISGQIGMDPSSGQLVSGGV

AEEAKQALKNMGEILKAAGCDFTNVVKTTVLLADINDFNTVNEIYKQYFK

SNFPARAAYQVAALPKGSRIEIEAVAIQGPLTTASL.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a Zinc Finger CCCH-Type Containing 12A (ZC3H12A) polypeptide. In some embodiments, the ZC3H12A polypeptide comprises or consists of: GGGTPKAPNLEPPLPEEEKEGSDLRPVVIDGSNVAMSHGNKEVF SCRGILLAVNWFLER GHTDITVFVPSWRKEQPRPDVPITDQHILRELEKKKILVFTPSRRVGGKRVVCYDDRFIV KLAYESDGIVVSNDTYRDLQGERQEWKRFIEERLLMYSFVNDKFMPPDDPLGRHGPSLD NFLRKKPLTLE (SEQ ID NO: 42). In some embodiments, the ZC3H12A polypeptide comprises or consists of:

(SEQ ID NO: 43)

SGPCGEKPVLEASPTMSLWEFEDSHSRQGTPRPGQELAAEEASALELQMK

VDFFRKLGYSSTEIHSVLQKLGVQADTNTVLGELVKHGTATERERQTSPD

PCPQLPLVPRGGGTPKAPNLEPPLPEEEKEGSDLRPVVIDGSNVAMSHGN

KEVFSCRGILLAVNWFLERGHTDITVFVPSWRKEQPRPDVPITDQHILRE

LEKKKILVFTPSRRVGGKRVVCYDDRFIVKLAYESDGIVVSNDTYRDLQG

ERQEWKRFIEERLLMYSFVNDKFMPPDDPLGRHGPSLDNFLRKKPLTLEH

RKQPCPYGRKCTYGIKCRFFHPERPSCPQRSVADELRANALLSPPRAPSK

DKNGRRPSPSSQSSSLLTESEQCSLDGKKLGAQASPGSRQEGLTQTYAPS

GRSLAPSGGSGSSFGPTDWLPQTLDSLPYVSQDCLDSGIGSLESQMSELW

GVRGGGPGEPGPPRAPYTGYSPYGSELPATAAFSAFGRAMGAGHFSVPAD

YPPAPPAFPPREYWSEPYPLPPPTSVLQEPPVQSPGAGRSPWGRAGSLAK

EQASVYTKLCGVFPPHLVEAVMGRFPQLLDPQQLAAEILSYKSQHPSE.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a Reactive Intermediate Imine Deaminase A (RIDA) polypeptide. In some embodiments, the RIDA polypeptide comprises or consists of:

(SEQ ID NO: 44)

SSLIRRVISTAKAPGAIGPYSQAVLVDRTIYISGQIGMDPSSGQLVSGGV

AEEAKQALKNMGEILKAAGCDFTNVVKTTVLLADINDFNTVNEIYKQYFK

SNFPARAAYQVAALPKGSRIEIEAVAIQGPLTTASL.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a Phospholipase D Family Member 6 (PDL6) polypeptide. In some embodiments, the PDL6 polypeptide comprises or consists of:

(SEQ ID NO: 126)

EALFFPSQVTCTEALLRAPGAELAELPEGCPCGLPHGESALSRLLRALLA

ARASLDLCLFAFSSPQLGRAVQLLHQRGVRVRVVTDCDYMALNGSQIGLL

RKAGIQVRHDQDPGYMHHKFAIVDKRVLITGSLNWTTQAIQNNRENVLIT

EDDEYVRLFLEEFERIWEQFNPTKYTFFPPKKSHGSCAPPVSRAGGRLLS

WHRTCGTSSESQT.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a mitochondrial ribonuclease P catalytic subunit (KIAA0391) polypeptide. In some embodiments, the KIAA0391 polypeptide comprises or consists of:

(SEQ ID NO: 127)

KARYKTLEPRGYSLLIRGLIHSDRWREALLLLEDIKKVITPSKKNYNDCI

QGALLHQDVNTAWNLYQELLGHDIVPMLETLKAFFDFGKDIKDDNYSNKL

LDILSYLRNNQLYPGESFAHSIKTWFESVPGKQWKGQFTTVRKSGQCSGC

GKTIESIQLSPEEYECLKGKIMRDVIDGGDQYRKTTPQELKRFENFIKSR

PPFDVVIDGLNVAKMFPKVRESQLLLNVVSQLAKRNLRLLVLGRKHMLRR

SSQWSRDEMEEVQKQASCFFADDISEDDPFLLYATLHSGNHCRFITRDLM

RDHKACLPDAKTQRLFFKWQQGHQLAIVNRFPGSKLTFQRILSYDTVVQT

TGDSWHIPYDEDLVERCSCEVPTKWLCLHQKT.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of an argonaute 2 (AGO2) polypeptide. In some embodiments of the compositions of the disclosure, the AGO2 polypeptide comprises or consists of:

(SEQ ID NO: 128)

SVEPMFRHLKNTYAGLQLVVVILPGKTPVYAEVKRVGDTVLGMATQCVQM

KNVQRTTPQTLSNLCLKINVKLGGVNNILLPQGRPPVFQQPVIFLGADVT

HPPAGDGKKPSIAAVVGSMDAHPNRYCATVRVQQHRQEIIQDLAAMVREL

LIQFYKSTRFKPTRIIFYRDGVSEGQFQQVLHHELLAIREACIKLEKDYQ

PGITFIVVQKRHHTRLFCTDKNERVGKSGNIPAGTTVDTKITHPTEFDFY

LCSHAGIQGTSRPSHYHVLWDDNRFSSDELQILTYQLCHTYVRCTRSVSI

PAPAYYAHLVAFRARYHLVDKEHDSAEGSHTSGQSNGRDHQALAKAVQVH

QDTLRTMYFA.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a mitochondrial nuclease EXOG (EXOG) polypeptide. In some embodiments, the EXOG polypeptide comprises or consists of:

(SEQ ID NO: 129)

QGAEGALTGKQPDGSAEKAVLEQFGFPLTGTEARCYTNHALSYDQAKRVP

RWVLEHISKSKIMGDADRKHCKFKPDPMPPTFSAFNEDYVGSGWSRGHMA

PAGNNKFSSKAMAETFYLSNIVPQDFDNNSGYWNRIEMYCRELTERFEDV

WVVSGPLTLPQTRGDGKKIVSYQVIGEDNVAVPSHLYKVILARRSSVSTE

PLALGAFVVPNEAIGFQPQLTEFQVSLQDLEKLSGLVFFPHLDRTSDIRN

ICSVDTCKLLDFQEFTLYLSTRKIEGARSVLRLEKIMENLKNAEIEPDDY

FMSRYEKKLEELKAKEQSGTQIRKPS.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a Zinc Finger CCCH-Type Containing 12D (ZC3H12D) polypeptide. In some embodiments, the ZC3H12D polypeptide comprises or consists of:

(SEQ ID NO: 130)

EHPSKMEFFQKLGYDREDVLRVLGKLGEGALVNDVLQELIRTGSRPGALE

HPAAPRLVPRGSCGVPDSAQRGPGTALEEDFRTLASSLRPIVIDGSNVAM

SHGNKETFSCRGIKLAVDWFRDRGHTYIKVFVPSWRKDPPRADTPIREQH

VLAELERQAVLVYTPSRKVHGKRLVCYDDRYIVKVAYEQDGVIVSNDNYR

DLQSENPEWKWFIEQRLLMFSFVNDRFMPPDDPLGRHGPSLSNFLSRKPK

PPEPSWQHCPYGKKCTYGIKCKFYHPERPHHAQLAVADELRAKTGARPGA

GAEEQRPPRAPGGSAGARAAPREPFAHSLPPARGSPDLAALRGSFSRLAF

SDDLGPLGPPLPVPACSLTPRLGGPDWVSAGGRVPGPLSLPSPESQFSPG

DLPPPPGLQLQPRGEHRPRDLHGDLLSPRRPPDDPWARPPRSDRFPGRSV

WAEPAWGDGATGGLSVYATEDDEGDARARARIALYSVFPRDQVDRVMAAF

PELSDLARLILLVQRCQSAGAPLGKP.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of an endoplasmic reticulum to nucleus signaling 2 (ERN2) polypeptide. In some embodiments, the ERN2 polypeptide comprises or consists of:

(SEQ ID NO: 131)

RQQQPQVVEKQQETPLAPADFAHISQDAQSLHSGASRRSQKRLQSPSKQA

QPLDDPEAEQLTVVGKISFNPKDVLGRGAGGTFVFRGQFEGRAVAVKRLL

RECFGLVRREVQLLQESDRHPNVLRYFCTERGPQFHYIALELCRASLQEY

VENPDLDRGGLEPEVVLQQLMSGLAHLHSLHIVHRDLKPGNILITGPDSQ

GLGRVVLSDFGLCKKLPAGRCSFSLHSGIPGTEGWMAPELLQLLPPDSPT

SAVDIFSAGCVFYYVLSGGSHPFGDSLYRQANILTGAPCLAHLEEEVHDK

VVARDLVGAMLSPLPQPRPSAPQVLAHPFFWSRAKQLQFFQDVSDWLEKE

SEQEPLVRALEAGGCAVVRDNWHEHISMPLQTDLRKFRSYKGTSVRDLLR

AVRNKKHHYRELPVEVRQALGQVPDGFVQYFTNRFPRLLLHTHRAMRSCA

SESLFLPYYPPDSEARRPCPGATGR.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a pelota mRNA surveillance and ribosome rescue factor (PELO) polypeptide. In some embodiments, the PELO polypeptide comprises or consists of:

(SEQ ID NO: 132)

KLVRKNIEKDNAGQVTLVPEEPEDMWHTYNLVQVGDSLRASTIRKVQTES

STGSVGSNRVRTTLTLCVEAIDFDSQACQLRVKGTNIQENEYVKMGAYHT

IELEPNRQFTLAKKQWDSVVLERIEQACDPAWSADVAAVVMQEGLAHICL

VTPSMTLTRAKVEVNIPRKRKGNCSQHDRALERFYEQVVQAIQRHIHFDV

VKCILVASPGFVREQFCDYLFQQAVKTDNKLLLENRSKFLQVHASSGHKY

SLKEALCDPTVASRLSDTKAAGEVKALDDFYKMLQHEPDRAFYGLKQVEK

ANEAMAIDTLLISDELFRHQDVATRSRYVRLVDSVKENAGTVRIFSSLHV

SGEQLSQLTGVAAILRFPVPELSDQEGDSSSEED.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a YBEY metallopeptidase (YBEY) polypeptide. In some embodiments, the YBEY polypeptide comprises or consists of:

(SEQ ID NO: 133)

SLVIRNLQRVIPIRRAPLRSKIEIVRRILGVQKFDLGIICVDNKNIQHIN

RIYRDRNVPTDVLSFPFHEHLKAGEFPQPDFPDDYNLGDIFLGVEYIFHQ

CKENEDYNDVLTVTATHGLCHLLGFTHGTEAEWQQMFQKEKAVLDELGRR

TGTRLQPLTRGLFGGS.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a cleavage and polyadenylation specific factor 4 like (CPSF4L) polypeptide. In some embodiments, the CPSF4L polypeptide comprises or consists of:

(SEQ ID NO: 134)

QEVIAGLERFTFAFEKDVEMQKGTGLLPFQGMDKSASAVCNFFTKGLCEK

GKLCPFRHDRGEKMVVCKHWLRGLCKKGDHCKFLHQYDLTRMPECYFYSK

FGDCSNKECSFLHVKPAFKSQDCPWYDQGFCKDGPLCKYRHVPRIMCLNY

LVGFCPEGPKCQFAQKIREFKLLPGSKI.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of an hCG_2002731 polypeptide. In some embodiments, the hCG_2002731 polypeptide comprises or consists of: KLVRKNIEKDNAGQVTLVPEEPEDMWHTYNLVQVGDSLRASTIRKVQTESSTGSVGSN RVRTTLTLCVEAIDFD SQACQLRVKGTNIQENEYVKMGAYHTIELEPNRQFTLAKKQW DSVVLERIEQACDPAWSADVAAVVMQEGLAHICLVTP SMTLTRAKVEVNIPRKRKGNC SQHDRALEREYEQVVQAIQRHIHFDVVKCILVASPGFVREQFCDYMFQQAVKTDNKLLL ENRSKFLQVHASSGHKYSLKEALCDPTVASRLSDTKAAGEVKALDDFYKMLQHEPDRA FYGLKQVEKANEAMAIDTLLISDELFRHQDVATRSRYVRLVDSVKENAGTVRIFSSLHV SGEQLSQLTGVAAILRFPVPELSDQEGDSSSEED (SEQ ID NO: 135). In some embodiments, the hCG_2002731 polypeptide comprises or consists of:

(SEQ ID NO: 136)

DPAWSADVAAVVMQEGLAHICLVTPSMTLTRAKVEVNIPRKRKGNCSQHD

RALERFYEQVVQAIQRHIHFDVVKCILVASPGFVREQFCDYMFQQAVKTD

NKLLLENRSKFLQVHASSGHKYSLKEALCDPTVASRLSDTKAAGEVKALD

DFYKMLQHEPDRAFYGLKQVEKANEAMAIDTLLISDELFRHQDVATRSRY

VRLVDSVKENAGTVRIFSSLHVSGEQLSQLTGVAAILRFPVPELSDQEGD

SSSEED.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of an Excision Repair Cross-Complementation Group 1 (ERCC1) polypeptide. In some embodiments, the ERCC1 polypeptide comprises or consists of:

(SEQ ID NO: 137)

MDPGKDKEGVPQPSGPPARKKFVIPLDEDEVPPGVRGNPVLKFVRNVPWE

FGDVIPDYVLGQSTCALFLSLRYHNLHPDYIHGRLQSLGKNFALRVLLVQ

VDVKDPQQALKELAKMCILADCTLILAWSPEEAGRYLETYKAYEQKPADL

LMEKLEQDFVSRVTECLTTVKSVNKTDSQTLLTTFGSLEQLIAASREDLA

LCPGLGPQK.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a ras-related C3 botulinum toxin substrate 1 isoform (RAC1) polypeptide. In some embodiments, the RAC1 polypeptide comprises or consists of:

(SEQ ID NO: 138)

KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGRCKPVNTFVHEP

LVDVQNVCFQEKVTCKNGQGNCYKSNSSMHITDCRLTNGSRYPNCAYRTS

PKERHIIVACEGSPYVPVHFDASVEDST.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a Ribonuclease A A1 (RAA1) polypeptide. In some embodiments, the RAA1 polypeptide comprises or consists of:

(SEQ ID NO: 139)

QDNSRYTHFLTQHYDAKPQGRDDRYCESIMRRRGLTSPCKDINTFIHGNK

RSIKAICENKNGNPHRENLRISKSSFQVTTCKLHGGSPWPPCQYRATAGF

RNVVVACENGLPVHLDQSIFRRP.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a Ras Related Protein (RAB1) polypeptide. In some embodiments, the RAB1 polypeptide comprises or consists of:

(SEQ ID NO: 140)

GLGLVQPSYGQDGMYQRFLRQHVHPEETGGSDRYCNLMMQRRKMTLYHCK

RFNTFIHEDIWNIRSICSTTNIQCKNGKMNCHEGVVKVTDCRDTGSSRAP

NCRYRAIASTRRVVIACEGNPQVPVHFDG.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a DNA Replication Helicase/Nuclease 2 (DNA2) polypeptide. In some embodiments, the DNA2 polypeptide comprises or consists of:

(SEQ ID NO: 141)

XSAVDNILLKLAKFKIGFLRLGQIQKVHPAIQQFTEQEICRSKSIKSLAL

LEELYNSQLIVATTCMGINHPIFSRKIFDFCIVDEASQISQPICLGPLFF

SRRFVLVGDHQQLPPLVLNREARALGMSESLFKRLEQNKSAVVQLTVQYR

MNSKIMSLSNKLTYEGKLECGSDKVANAVINLRHFKDVKLELEFYADYSD

NPWLMGVFEPNNPVCFLNTDKVPAPEQVEKGGVSNVTEAKLIVFLTSIFV

KAGCSPSDIGIIAPYRQQLKIINDLLARSIGMVEVNTVDKYQGRDKSIVL

VSFVRSNKDGTVGELLKDWRRLNVAITRAKHKLILLGCVPSLNCYPPLEK

LLNHLNSEKLISFFFCIWSHLIALL.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a FLJ35220 polypeptide. In some embodiments, the FLJ35220 polypeptide comprises or consists of:

(SEQ ID NO: 142)

MALRSHDRSTRPLYISVGHRMSLEAAVRLTCCCCRFRIPEPVRQADICSR

EHIRKSLGLPGPPTPRSPKAQRPVACPKGDSGESSALC.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a FLJ13173 polypeptide. In some embodiments, the FLJ13173 polypeptide comprises or consists of:

(SEQ ID NO: 143)

CYTNHALSYDQAKRVPRWVLEHISKSKIMGDADRKHCKFKPDPNIPPTFS

AFNEDYVGSGWSRGHMAPAGNNKFSSKAMAETFYLSNIVPQDFDNNSGYW

NRIEMYCRELTERFEDVWVVSGPLTLPQTRGDGKKIVSYQVIGEDNVAVP

SHLYKVILARRSSVSTEPLALGAFVVPNEAIGFQPQLTEFQVSLQDLEKL

SGLVFFPHLDRT.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a Teneurin Transmembrane Protein (TENM) polypeptide. In some embodiments, the second RNA binding protein comprises or consists of a Teneurin Transmembrane Protein 1 (TENM1) polypeptide. In some embodiments, the TENM1 polypeptide comprises or consists of: VTVSQMTSVLNGKTRRFADIQLQHGALCFNIRYGTTVEEEKNHVLEIARQRAVAQAWT KEQRRLQEGEEGIRAWTEGEKQQLLSTGRVQGYDGYFVLSVEQYLELSDSANNIHFMR QSEIGRR (SEQ ID NO: 144). In some embodiments, the second RNA binding protein comprises or consists of a Teneurin Transmembrane Protein 2 (TENM2) polypeptide. In some embodiments, the TENM2 polypeptide comprises or consists of:

(SEQ ID NO: 145)

TVSQPTLLVNGKTRRFTNIEFQYSTLLLSIRYGLTPDTLDEEKARVLDQA

RQRALGTAWAKEQQKARDGREGSRLWTEGEKQQLLSTGRVQGYEGYYVLP

VEQYPELADSSSNIQFLRQNEMGKR.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a Ribonuclease Kappa (RNAseK) polypeptide. In some embodiments, the RNAseK polypeptide comprises or consists of:

(SEQ ID NO: 204)

MGWLRPGPRPLCPPARASWAFSHRFPSPLAPRRSPTPFFMASLLCCGPKL

AACGIVLSAWGVIMLIMLGIFFNVHSAVLIEDVPFTEKDFENGPQNIYNL

YEQVSYNCFIAAGLYLLLGGFSFCQVRLNKRKEYMVR.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a transcription activator-like effector nuclease (TALEN) polypeptide or a nuclease domain thereof.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a zinc finger nuclease polypeptide or a nuclease domain thereof. In some embodiments, the second RNA binding protein comprises or consists of a ZNF638 polypeptide or a nuclease domain thereof.

In some embodiments of the compositions of the disclosure, the second RNA binding protein comprises or consists of a PIN domain derived from the human SMG6 protein, also commonly known as telomerase-binding protein EST1A isoform 3, NCBI Reference Sequence: NP_001243756.1. In some embodiments, the PIN from hSMG6 is used herein in the form of a Cas fusion protein and as an internal control.

Guide RNA

The terms guide RNA (gRNA) and single guide RNA (sgRNA) are used interchangeably throughout the disclosure.

Guide RNAs (gRNAs) of the disclosure may comprise a spacer sequence and a scaffolding sequence. In some embodiments, a guide RNA is a single guide RNA (sgRNA) comprising a contiguous spacer sequence and scaffolding sequence. In some embodiments, the spacer sequence and the scaffolding sequence are contiguous. In some embodiments, a scaffold sequence comprises a “direct repeat” (DR) sequence. DR sequences refer to the repetitive sequences in the CRISPR locus (naturally occurring in a bacterial genome or plasmid) that are interspersed with the spacer sequences. It is well known that one would be able to infer the DR sequence of a corresponding Cas protein if the sequence of the associated CRISPR locus is known. In some embodiments, the spacer sequence and the scaffolding sequence are not contiguous. In some embodiments, a sequence encoding a guide RNA of the disclosure comprises or consists of a spacer sequence and a scaffolding sequence that are separated by a linker sequence. In some embodiments, the linker sequence may comprise or consist of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or any number of nucleotides in between. In some embodiments, the linker sequence may comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or any number of nucleotides in between.
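By way of non-limiting illustration only, the arrangement of a spacer sequence, an optional linker sequence, and a scaffolding sequence may be sketched in Python as follows. The function name `assemble_guide`, the parameter `spacer_first`, and the toy sequences are hypothetical and are not taken from the disclosure or its sequence listing. The relative 5′ to 3′ order of the spacer and the scaffold is an assumption exposed as a parameter.

```python
def assemble_guide(spacer: str, scaffold: str, linker: str = "",
                   spacer_first: bool = True) -> str:
    """Concatenate spacer, optional linker, and scaffold into one RNA string.

    Hypothetical helper for illustration only. An empty linker yields a
    contiguous spacer and scaffold. The 5'-to-3' order is an assumption
    controlled by `spacer_first`.
    """
    if spacer_first:
        parts = [spacer, linker, scaffold]
    else:
        parts = [scaffold, linker, spacer]
    # Accept DNA-style input and report the guide as RNA.
    rna = "".join(parts).upper().replace("T", "U")
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"non-RNA characters: {sorted(unexpected)}")
    return rna


# Toy sequences, not taken from the sequence listing.
print(assemble_guide("ACGT", "GGCC", linker="AA"))
```

Passing an empty `linker` models the contiguous embodiments, and a linker of 1 to 50 nucleotides models the non-contiguous embodiments.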

Guide RNAs (gRNAs) of the disclosure may comprise non-naturally occurring nucleotides. In some embodiments, a guide RNA of the disclosure or a sequence encoding the guide RNA comprises or consists of modified or synthetic RNA nucleotides. Exemplary modified RNA nucleotides include, but are not limited to, pseudouridine (Ψ), dihydrouridine (D), inosine (I), 7-methylguanosine (m7G), hypoxanthine, xanthine, xanthosine, 7-methylguanine, 5,6-dihydrouracil, 5-methylcytosine, 5-methylcytidine, 5-hydroxymethylcytosine, isoguanine, and isocytosine.

Guide RNAs (gRNAs) of the disclosure may bind modified RNA within a target sequence. Exemplary epigenetic or post-transcriptional RNA modifications include, but are not limited to, 2′-O-methylation (2′-OMe; occurring on the oxygen of the free 2′-OH of the ribose moiety), N6-methyladenosine (m6A), and 5-methylcytosine (m5C).

In some embodiments of the compositions of the disclosure, a guide RNA of the disclosure comprises at least one sequence encoding a non-coding C/D box small nucleolar RNA (snoRNA) sequence. In some embodiments, the snoRNA sequence comprises at least one sequence that is complementary to the target RNA, wherein the target sequence of the RNA molecule comprises at least one 2′-OMe. In some embodiments, the snoRNA sequence comprises at least one sequence that is complementary to the target RNA, wherein the at least one sequence that is complementary to the target RNA comprises a box C motif (RUGAUGA) and a box D motif (CUGA).
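By way of non-limiting illustration only, the box C motif (RUGAUGA, where R is a purine) and the box D motif (CUGA) recited above may be located in an RNA sequence with a simple pattern scan. The function name `find_cd_boxes` and the example sequence are hypothetical. The sketch reports sequence matches only and makes no statement about whether a matched region functions as a snoRNA.

```python
import re

# IUPAC "R" (purine) expands to A or G. Lookaheads allow overlapping hits.
BOX_C = re.compile(r"(?=[AG]UGAUGA)")
BOX_D = re.compile(r"(?=CUGA)")


def find_cd_boxes(rna: str) -> dict:
    """Return 0-based start positions of box C and box D motif matches."""
    rna = rna.upper().replace("T", "U")
    return {
        "box_C": [m.start() for m in BOX_C.finditer(rna)],
        "box_D": [m.start() for m in BOX_D.finditer(rna)],
    }


# Toy sequence, not taken from the sequence listing.
print(find_cd_boxes("GGAUGAUGACCCUGAGG"))
```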

Spacer sequences of the disclosure bind to the target sequence of an RNA molecule. Spacer sequences of the disclosure may comprise a CRISPR RNA (crRNA). Spacer sequences of the disclosure comprise or consist of a sequence having sufficient complementarity to a target sequence of an RNA molecule to bind selectively to the target sequence. Upon binding to a target sequence of an RNA molecule, the spacer sequence may guide one or more of a scaffolding sequence and a fusion protein to the RNA molecule. In some embodiments, a sequence having sufficient complementarity to a target sequence of an RNA molecule to bind selectively to the target sequence has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or any percentage identity in between to the target sequence. In some embodiments, a sequence having sufficient complementarity to a target sequence of an RNA molecule to bind selectively to the target sequence has 100% identity to the target sequence.
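By way of non-limiting illustration only, a percentage of complementarity between a spacer sequence and a target sequence may be estimated as sketched below. The function name `percent_complementarity` and the example sequences are hypothetical. The sketch assumes sequences of equal length, an ungapped antiparallel alignment, and Watson-Crick pairs only (G-U wobble pairs are counted as mismatches); other scoring conventions are possible.

```python
# Watson-Crick partners for RNA bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(spacer: str, target: str) -> float:
    """Percent of spacer positions that pair with the target.

    Both sequences are written 5' to 3'; the target is read in reverse so
    that the two strands are compared in antiparallel orientation.
    """
    spacer = spacer.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    if not spacer or len(spacer) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(PAIR.get(s) == t for s, t in zip(spacer, reversed(target)))
    return 100.0 * paired / len(spacer)


# Toy sequences: the first spacer is the exact reverse complement of the
# target, and the second carries one mismatch at its 3' end.
print(percent_complementarity("GGCCAU", "AUGGCC"))
print(percent_complementarity("GGCCAA", "AUGGCC"))
```

A result could then be compared against a chosen threshold, such as any of the percentages recited above.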

Scaffolding sequences of the disclosure bind the first RNA-binding polypeptide of the disclosure. Scaffolding sequences of the disclosure may comprise a trans-activating CRISPR RNA (tracrRNA). Scaffolding sequences of the disclosure comprise or consist of a sequence having sufficient complementarity to a target sequence of an RNA molecule to bind selectively to the target sequence. Upon binding to a target sequence of an RNA molecule, the scaffolding sequence may guide a fusion protein to the RNA molecule. In some embodiments, a sequence having sufficient complementarity to a target sequence of an RNA molecule to bind selectively to the target sequence has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or any percentage identity in between to the target sequence. In some embodiments, a sequence having sufficient complementarity to a target sequence of an RNA molecule to bind selectively to the target sequence has 100% identity to the target sequence. Alternatively, or in addition, in some embodiments, scaffolding sequences of the disclosure comprise or consist of a sequence that binds to a first RNA binding protein or a second RNA binding protein of a fusion protein of the disclosure. In some embodiments, scaffolding sequences of the disclosure comprise a secondary structure or a tertiary structure. Exemplary secondary structures include, but are not limited to, a helix, a stem loop, a bulge, a tetraloop and a pseudoknot. Exemplary tertiary structures include, but are not limited to, an A-form of a helix, a B-form of a helix, and a Z-form of a helix. Exemplary tertiary structures include, but are not limited to, a twisted or helicized stem loop. Exemplary tertiary structures include, but are not limited to, a twisted or helicized pseudoknot. In some embodiments, scaffolding sequences of the disclosure comprise at least one secondary structure or at least one tertiary structure. In some embodiments, scaffolding sequences of the disclosure comprise one or more secondary structure(s) or one or more tertiary structure(s).

In some embodiments of the compositions of the disclosure, a guide RNA or a portion thereof selectively binds to a tetraloop motif in an RNA molecule of the disclosure. In some embodiments, a target sequence of an RNA molecule comprises a tetraloop motif. In some embodiments, the tetraloop motif is a “GNRA” motif (where “N” is any nucleotide and “R” is a purine) comprising or consisting of one or more of the sequences of GAAA, GUGA, GCAA or GAGA.
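By way of non-limiting illustration only, occurrences of the four tetraloop sequences listed above (GAAA, GUGA, GCAA and GAGA) may be located in an RNA sequence as sketched below. Each of the four sequences fits the IUPAC pattern G, N (any nucleotide), R (purine), A. The function name `find_tetraloops` and the example sequence are hypothetical. A sequence match alone does not establish that the region folds into a tetraloop; structure prediction is outside the scope of this sketch.

```python
TETRALOOPS = ("GAAA", "GUGA", "GCAA", "GAGA")


def find_tetraloops(rna: str) -> list:
    """Return (0-based start, sequence) for each listed tetraloop sequence."""
    rna = rna.upper().replace("T", "U")
    hits = []
    for i in range(len(rna) - 3):
        window = rna[i:i + 4]
        if window in TETRALOOPS:
            hits.append((i, window))
    return hits


# Toy sequence, not taken from the sequence listing.
print(find_tetraloops("CCGAAAGGGUGACC"))
```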

In some embodiments of the compositions of the disclosure, a guide RNA or a portion thereof that binds to a target sequence of an RNA molecule hybridizes to the target sequence of the RNA molecule. In some embodiments, a guide RNA or a portion thereof that binds to a first RNA binding protein or to a second RNA binding protein covalently binds to the first RNA binding protein or to the second RNA binding protein. In some embodiments, a guide RNA or a portion thereof that binds to a first RNA binding protein or to a second RNA binding protein non-covalently binds to the first RNA binding protein or to the second RNA binding protein.

In some embodiments of the compositions of the disclosure, a guide RNA or a portion thereof comprises or consists of between 10 and 100 nucleotides, inclusive of the endpoints. In some embodiments, a spacer sequence of the disclosure comprises or consists of between 10 and 30 nucleotides, inclusive of the endpoints. In some embodiments, a spacer sequence of the disclosure comprises or consists of 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides. In some embodiments, the spacer sequence of the disclosure comprises or consists of 20 nucleotides. In some embodiments, the spacer sequence of the disclosure comprises or consists of 21 nucleotides. In some embodiments, a scaffold sequence of the disclosure comprises or consists of between 10 and 100 nucleotides, inclusive of the endpoints. In some embodiments, a scaffold sequence of the disclosure comprises or consists of 30, 35, 40, 45, 50, 55, 60, 65, 70, 76, 80, 87, 90, 95, 100 or any number of nucleotides in between. In some embodiments, the scaffold sequence of the disclosure comprises or consists of between 85 and 95 nucleotides, inclusive of the endpoints. In some embodiments, the scaffold sequence of the disclosure comprises or consists of 85 nucleotides. In some embodiments, the scaffold sequence of the disclosure comprises or consists of 90 nucleotides. In some embodiments, the scaffold sequence of the disclosure comprises or consists of 93 nucleotides.
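By way of non-limiting illustration only, candidate spacer and scaffold sequences may be checked against two of the length ranges recited above (a spacer of 10 to 30 nucleotides and a scaffold of 10 to 100 nucleotides, each inclusive of the endpoints). The function name `check_lengths` and the dictionary keys are hypothetical.

```python
def check_lengths(spacer: str, scaffold: str) -> dict:
    """Report whether each component falls within its recited length range."""
    return {
        "spacer_10_to_30": 10 <= len(spacer) <= 30,
        "scaffold_10_to_100": 10 <= len(scaffold) <= 100,
    }


# Toy inputs: a 20-nucleotide spacer and a 90-nucleotide scaffold.
print(check_lengths("A" * 20, "G" * 90))
```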

In some embodiments of the compositions of the disclosure, a guide RNA or a portion thereof does not comprise a nuclear localization sequence (NLS).

In some embodiments of the compositions of the disclosure, a guide RNA or a portion thereof does not comprise a sequence complementary to a protospacer adjacent motif (PAM).

Therapeutic or pharmaceutical compositions of the disclosure do not comprise a PAMmer oligonucleotide. In other embodiments, optionally, non-therapeutic or non-pharmaceutical compositions may comprise a PAMmer oligonucleotide. The term “PAMmer” refers to an oligonucleotide comprising a PAM sequence that is capable of interacting with a guide nucleotide sequence-programmable RNA binding protein. Non-limiting examples of PAMmers are described in O'Connell et al. Nature 516, pages 263-266 (2014), incorporated herein by reference. A PAM sequence refers to a protospacer adjacent motif comprising about 2 to about 10 nucleotides. PAM sequences are specific to the guide nucleotide sequence-programmable RNA binding protein with which they interact and are known in the art. For example, the Streptococcus pyogenes Cas9 PAM has the sequence 5′-NGG-3′, in which “N”, which is any nucleobase, is followed by two guanine (“G”) nucleobases. Cas9 of Francisella novicida recognizes the canonical PAM sequence 5′-NGG-3′, but has been engineered to recognize the PAM 5′-YG-3′ (where “Y” is a pyrimidine), thus adding to the range of possible Cas9 targets. The Cpf1 nuclease of Francisella novicida recognizes the PAM 5′-TTTN-3′ or 5′-YTN-3′.
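By way of non-limiting illustration only, the degenerate PAM notation used above (for example 5′-NGG-3′, 5′-YG-3′ and 5′-TTTN-3′) may be expanded into a pattern and compared with a candidate sequence as sketched below. The function names `pam_regex` and `matches_pam` are hypothetical, and only the IUPAC codes needed for the examples in the text are included.

```python
import re

# Subset of IUPAC nucleotide codes: N is any base, Y is a pyrimidine,
# R is a purine. DNA letters are used, as in the PAM examples in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "Y": "[CT]", "R": "[AG]",
}


def pam_regex(pam: str) -> "re.Pattern":
    """Translate a degenerate PAM such as 'NGG' into a compiled pattern."""
    return re.compile("".join(IUPAC[code] for code in pam.upper()))


def matches_pam(pam: str, candidate: str) -> bool:
    """True if the whole candidate sequence conforms to the PAM."""
    return pam_regex(pam).fullmatch(candidate.upper()) is not None


for pam, candidate in [("NGG", "AGG"), ("YG", "AG"), ("TTTN", "TTTC")]:
    print(pam, candidate, matches_pam(pam, candidate))
```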

In some embodiments of the compositions of the disclosure, a guide RNA or a portion thereof comprises a sequence complementary to a protospacer flanking sequence (PFS). In some embodiments, including those wherein a guide RNA or a portion thereof comprises a sequence complementary to a PFS, the first RNA binding protein may comprise a sequence isolated or derived from a Cas13 protein. In some embodiments, including those wherein a guide RNA or a portion thereof comprises a sequence complementary to a PFS, the first RNA binding protein may comprise a sequence encoding a Cas13 protein or an RNA-binding portion thereof. In some embodiments, the guide RNA or a portion thereof does not comprise a sequence complementary to a PFS.

In some embodiments of the compositions of the disclosure, a guide RNA sequence of the disclosure comprises a promoter to drive expression of the guide RNA. In some embodiments, a vector comprising a guide RNA sequence of the disclosure comprises a promoter to drive expression of the guide RNA. In some embodiments, the promoter is a constitutive promoter. In some embodiments, a promoter is a tissue-specific and/or cell-type specific promoter. In some embodiments, a promoter is an inducible promoter. In some embodiments, a promoter is a hybrid or a recombinant promoter. In some embodiments, a promoter is a promoter capable of driving expression in a mammalian cell. In some embodiments, a promoter is a promoter capable of driving expression in a human cell. In some embodiments, a promoter is a promoter capable of expressing the guide RNA sequence and restricting the expression to the nucleus of the cell. In some embodiments, a promoter is a human RNA polymerase promoter or a promoter sequence isolated or derived from a human RNA polymerase promoter. In some embodiments, a promoter is a U6 promoter or a sequence isolated or derived from a sequence encoding a U6 promoter. In some embodiments, a promoter is a human tRNA promoter or a promoter sequence isolated or derived from a human tRNA promoter. In some embodiments, a promoter is a human valine tRNA promoter or a promoter sequence isolated or derived from a human valine tRNA promoter.

In some embodiments of the compositions of the disclosure, a promoter further comprises a regulatory element. In some embodiments, a vector comprises a promoter that further comprises a regulatory element. In some embodiments, a regulatory element enhances expression of the guide RNA. Exemplary regulatory elements include, but are not limited to, an enhancer element, an intron, an exon, or a combination thereof.

In some embodiments of the compositions of the disclosure, a vector of the disclosure comprises one or more of a guide RNA sequence, a promoter to drive expression of the guide RNA and a regulatory element to enhance expression of the guide RNA. In some embodiments of the compositions of the disclosure, the vector further comprises a nucleic acid sequence encoding a fusion protein of the disclosure.

Fusion Proteins

Fusion proteins of the disclosure comprise a first RNA binding protein and a second RNA binding protein. In some embodiments, along a sequence encoding the fusion protein, the sequence encoding the first RNA binding protein is positioned 5′ of the sequence encoding the second RNA binding protein. In some embodiments, along a sequence encoding the fusion protein, the sequence encoding the first RNA binding protein is positioned 3′ of the sequence encoding the second RNA binding protein.

In some embodiments of the compositions of the disclosure, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein capable of binding an RNA molecule. In some embodiments, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein capable of selectively binding an RNA molecule and not binding a DNA molecule, a mammalian DNA molecule or any DNA molecule. In some embodiments, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein capable of binding an RNA molecule and inducing a break in the RNA molecule. In some embodiments, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein capable of binding an RNA molecule, inducing a break in the RNA molecule, and not binding a DNA molecule, a mammalian DNA molecule or any DNA molecule. In some embodiments, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein capable of binding an RNA molecule, inducing a break in the RNA molecule, and neither binding nor inducing a break in a DNA molecule, a mammalian DNA molecule or any DNA molecule.

In some embodiments of the compositions of the disclosure, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein with no DNA nuclease activity.

In some embodiments of the compositions of the disclosure, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein having DNA nuclease activity, wherein the DNA nuclease activity does not induce a break in a DNA molecule, a mammalian DNA molecule or any DNA molecule when a composition of the disclosure is contacted to an RNA molecule or introduced into a cell or into a subject of the disclosure.

In some embodiments of the compositions of the disclosure, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein having DNA nuclease activity, wherein the DNA nuclease activity is inactivated and wherein the DNA nuclease activity does not induce a break in a DNA molecule, a mammalian DNA molecule or any DNA molecule when a composition of the disclosure is contacted to an RNA molecule or introduced into a cell or into a subject of the disclosure. In some embodiments, the sequence encoding the first RNA binding protein comprises a mutation that inactivates or decreases the DNA nuclease activity to a level at which the DNA nuclease activity does not induce a break in a DNA molecule, a mammalian DNA molecule or any DNA molecule when a composition of the disclosure is contacted to an RNA molecule or introduced into a cell or into a subject of the disclosure. In some embodiments, the sequence encoding the first RNA binding protein comprises a mutation that inactivates or decreases the DNA nuclease activity and the mutation comprises one or more of a substitution, inversion, transposition, insertion, deletion, or any combination thereof to a nucleic acid sequence or amino acid sequence encoding the first RNA binding protein or a nuclease domain thereof.

In some embodiments of the compositions of the disclosure, the sequence encoding the first RNA binding protein of an RNA-guided fusion protein disclosed herein comprises a sequence isolated or derived from a CRISPR Cas protein. In some embodiments, the CRISPR Cas protein comprises a Type II CRISPR Cas protein. In some embodiments, the Type II CRISPR Cas protein comprises a Cas9 protein. Exemplary Cas9 proteins of the disclosure may be isolated or derived from any species, including, but not limited to, a bacterium or an archaeon. Exemplary Cas9 proteins of the disclosure may be isolated or derived from any species, including, but not limited to, Streptococcus pyogenes, Haloferax mediterranei, Mycobacterium tuberculosis, Francisella tularensis subsp. novicida, Pasteurella multocida, Neisseria meningitidis, Campylobacter jejuni, Streptococcus thermophilus, Campylobacter lari CF89-12, Mycoplasma gallisepticum str. F, Nitratifractor salsuginis str. DSM 16511, Parvibaculum lavamentivorans, Roseburia intestinalis, Neisseria cinerea, Gluconacetobacter diazotrophicus, Azospirillum B510, Sphaerochaeta globosa str. Buddy, Flavobacterium columnare, Fluviicola taffensis, Bacteroides coprophilus, Mycoplasma mobile, Lactobacillus farciminis, Streptococcus pasteurianus, Lactobacillus johnsonii, Staphylococcus pseudintermedius, Filifactor alocis, Treponema denticola, Legionella pneumophila str. Paris, Sutterella wadsworthensis, Corynebacterium diphtheriae, Staphylococcus aureus, and Francisella novicida.

Exemplary wild type S. pyogenes Cas9 proteins of the disclosure may comprise or consist of the amino acid sequence:

(SEQ ID NO: 147)

1
MDKKYSIGLD IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE

61
ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPIFG

121
NIVDEVAYHE KYPTIYHLRK KLVDSTDKAD LRLIYLALAH MIKFRGHFLI EGDLNPDNSD

181
VDKLFIQLVQ TYNQLFEENP INASGVDAKA ILSARLSKSR RLENLIAQLP GEKKNGLFGN

241
LIALSLGLTP NFKSNFDLAE DAKLQLSKDT YDDDLDNLLA QIGDQYADLF LAAKNLSDAI

301
LLSDILRVNT EITKAPLSAS MIKRYDEHHQ DLTLLKALVR QQLPEKYKEI FFDQSKNGYA

361
GYIDGGASQE EFYKFIKPIL EKMDGTEELL VKLNREDLLR KQRTFDNGSI PHQIHLGELH

421
AILRRQEDFY PFLKDNREKI EKILTFRIPY YVGPLARGNS RFAWMTRKSE ETITPWNFEE

481
VVDKGASAQS FIERMTNFDK NLPNEKVLPK HSLLYEYFTV YNELTKVKYV TEGMRKPAFL

541
SGEQKKAIVD LLFKTNRKVT VKQLKEDYFK KIECFDSVEI SGVEDRFNAS LGTYHDLLKI

601
IKDKDFLDNE ENEDILEDIV LTLTLFEDRE MIEERLKTYA HLFDDKVMKQ LKRRRYTGWG

661
RLSRKLINGI RDKQSGKTIL DFLKSDGFAN RNFMQLIHDD SLTFKEDIQK AQVSGQGDSL

721
HEHIANLAGS PAIKKGILQT VKVVDELVKV MGRHKPENIV IEMARENQTT QKGQKNSRER

781
MKRIEEGIKE LGSQILKEHP VENTQLQNEK LYLYYLQNGR DMYVDQELDI NRLSDYDVDH

841
IVPQSFLKDD SIDNKVLTRS DKNRGKSDNV PSEEVVKKMK NYWRQLLNAK LITQRKFDNL

901
TKAERGGLSE LDKAGFIKRQ LVETRQITKH VAQILDSRMN TKYDENDKLI REVKVITLKS

961
KLVSDFRKDF QFYKVREINN YHHAHDAYLN AVVGTALIKK YPKLESEFVY GDYKVYDVRK

1021
MIAKSEQEIG KATAKYFFYS NIMNFFKTEI TLANGEIRKR PLIETNGETG EIVWDKGRDF

1081
ATVRKVLSMP QVNIVKKTEV QTGGFSKESI LPKRNSDKLI ARKKDWDPKK YGGFDSPTVA

1141
YSVLVVAKVE KGKSKKLKSV KELLGITIME RSSFEKNPID FLEAKGYKEV KKDLIIKLPK

1201
YSLFELENGR KRMLASAGEL QKGNELALPS KYVNFLYLAS HYEKLKGSPE DNEQKQLFVE

1261
QHKHYLDEII EQISEFSKRV ILADANLDKV LSAYNKHRDK PIREQAENII HLFTLTNLGA

1321
PAAFKYFDTT IDRKRYTSTK EVLDATLIHQ SITGLYETRI DLSQLGGD.
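
The sequence above is laid out in numbered rows of 10-residue blocks. As a minimal sketch of how such a listing could be flattened into one contiguous string for downstream comparison, the following Python example may help. The function name `flatten_listing` is hypothetical, and the example assumes that position-index lines contain only digits and that anything other than a letter in a residue line (spaces, a trailing period) is layout rather than sequence.

```python
import re


def flatten_listing(text: str) -> str:
    """Join a numbered, space-blocked listing into one contiguous string."""
    parts = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and position-index lines such as "61".
        if not line or line.isdigit():
            continue
        # Keep letters only; this drops block spaces and a trailing period.
        parts.append(re.sub(r"[^A-Za-z]", "", line))
    return "".join(parts)


# First two rows of the listing above (residues 1-120).
example = """
1
MDKKYSIGLD IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE

61
ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPIFG
"""

flat = flatten_listing(example)
print(len(flat), flat[:10])
```

Because the flattened string is 0-indexed, the residue at listing position N is `flat[N - 1]`; for example, position 10 is `flat[9]`.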

Nuclease inactivated S. pyogenes Cas9 proteins may comprise a substitution of an Alanine (A) for an Aspartic Acid (D) at position 10 and an Alanine (A) for a Histidine (H) at position 840. Exemplary nuclease inactivated S. pyogenes Cas9 proteins of the disclosure may comprise or consist of the amino acid sequence (comprising the D10A and H840A substitutions at positions 10 and 840, respectively):

(SEQ ID NO: 148)

1
MDKKYSIGLA IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE

61
ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPIFG

121
NIVDEVAYHE KYPTIYHLRK KLVDSTDKAD LRLIYLALAH MIKFRGHFLI EGDLNPDNSD

181
VDKLFIQLVQ TYNQLFEENP INASGVDAKA ILSARLSKSR RLENLIAQLP GEKKNGLFGN

241
LIALSLGLTP NFKSNFDLAE DAKLQLSKDT YDDDLDNLLA QIGDQYADLF LAAKNLSDAI

301
LLSDILRVNT EITKAPLSAS MIKRYDEHHQ DLTLLKALVR QQLPEKYKEI FFDQSKNGYA

361
GYIDGGASQE EFYKFIKPIL EKMDGTEELL VKLNREDLLR KQRTFDNGSI PHQIHLGELH

421
AILRRQEDFY PFLKDNREKI EKILTFRIPY YVGPLARGNS RFAWMTRKSE ETITPWNFEE

481
VVDKGASAQS FIERMTNFDK NLPNEKVLPK HSLLYEYFTV YNELTKVKYV TEGMRKPAFL

541
SGEQKKAIVD LLFKTNRKVT VKQLKEDYFK KIECFDSVEI SGVEDRFNAS LGTYHDLLKI

601
IKDKDFLDNE ENEDILEDIV LTLTLFEDRE MIEERLKTYA HLFDDKVMKQ LKRRRYTGWG

661
RLSRKLINGI RDKQSGKTIL DFLKSDGFAN RNFMQLIHDD SLTFKEDIQK AQVSGQGDSL

721
HEHIANLAGS PAIKKGILQT VKVVDELVKV MGRHKPENIV IEMARENQTT QKGQKNSRER

781
MKRIEEGIKE LGSQILKEHP VENTQLQNEK LYLYYLQNGR DMYVDQELDI NRLSDYDVDA

841
IVPQSFLKDD SIDNKVLTRS DKNRGKSDNV PSEEVVKKMK NYWRQLLNAK LITQRKFDNL

901
TKAERGGLSE LDKAGFIKRQ LVETRQITKH VAQILDSRMN TKYDENDKLI REVKVITLKS

961
KLVSDFRKDF QFYKVREINN YHHAHDAYLN AVVGTALIKK YPKLESEFVY GDYKVYDVRK

1021
MIAKSEQEIG KATAKYFFYS NIMNFFKTEI TLANGEIRKR PLIETNGETG EIVWDKGRDF

1081
ATVRKVLSMP QVNIVKKTEV QTGGFSKESI LPKRNSDKLI ARKKDWDPKK YGGFDSPTVA

1141
YSVLVVAKVE KGKSKKLKSV KELLGITIME RSSFEKNPID FLEAKGYKEV KKDLIIKLPK

1201
YSLFELENGR KRMLASAGEL QKGNELALPS KYVNFLYLAS HYEKLKGSPE DNEQKQLFVE

1261
QHKHYLDEII EQISEFSKRV ILADANLDKV LSAYNKHRDK PIREQAENII HLFTLTNLGA

1321
PAAFKYFDTT IDRKRYTSTK EVLDATLIHQ SITGLYETRI DLSQLGGD.
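
The D10A and H840A substitutions described above can be illustrated with a short Python sketch. It is an illustrative example only: the function name `apply_substitutions` and the `first_pos` parameter are hypothetical, and the input fragments are the 10-residue blocks covering positions 1-10 and 831-840 of the wild type sequence listed earlier. The sketch checks that the expected wild type residue is present at each position before replacing it.

```python
import re


def apply_substitutions(seq: str, subs: list[str], first_pos: int = 1) -> str:
    """Apply substitutions written as e.g. "D10A" to a sequence fragment.

    first_pos is the 1-based position, in the full-length protein, of the
    first residue of seq (an assumption made so that short fragments can
    be used instead of the full sequence).
    """
    residues = list(seq)
    for sub in subs:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild, pos, new = match.group(1), int(match.group(2)), match.group(3)
        idx = pos - first_pos  # convert 1-based position to 0-based index
        if not 0 <= idx < len(residues):
            raise ValueError(f"position {pos} is outside the fragment")
        if residues[idx] != wild:
            raise ValueError(
                f"expected {wild} at position {pos}, found {residues[idx]}"
            )
        residues[idx] = new
    return "".join(residues)


# Positions 1-10 of the wild type sequence, with D10A applied.
n_term = apply_substitutions("MDKKYSIGLD", ["D10A"])

# Positions 831-840 of the wild type sequence, with H840A applied.
block_831_840 = apply_substitutions("NRLSDYDVDH", ["H840A"], first_pos=831)

print(n_term, block_831_840)
```

The two resulting fragments correspond to the first block of row 1 and the last block of row 781 in the nuclease inactivated sequence above.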

Nuclease inactivated S. pyogenes Cas9 proteins may comprise a deletion of a RuvC nuclease domain or a portion thereof, an HNH domain, a DNase active site, a ββα-metal fold or a portion thereof comprising a DNase active site, or any combination thereof.

Other exemplary Cas9 proteins or portions thereof may comprise or consist of the following amino acid sequences.

In some embodiments the Cas9 protein can be S. pyogenes Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 149)

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGA

LLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR

LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKAD

LRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENP

INASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTP

NFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAI

LLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEI

FFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR

KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPY

YVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDK

NLPNEKVLPKHSLLYEYFTVYNELTKVKYVIEGMRKPAFLSGEQKKAIVD

LLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKI

IKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQ

LKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDD

SLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKV

MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHP

VENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDD

SIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNL

TKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI

REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKK

YPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEI

TLANGEIRKRPLIETNGETGEIVVVDKGRDFATVRKVLSMPQVNIVKKTE

VQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKV

EKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLP

KYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSP

EDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRD

KPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIH

QSITGLYETRIDLSQLGGD

In some embodiments the Cas9 protein can be S. aureus Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 150)

MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSK

RGARRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKL

SEEEFSAALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYV

AELQLERLKKDGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDT

YIDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYA

YNADLYNALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIA

KEILVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDITARKEIIENAELLDQ

IAKILTIYQSSEDIQEELTNLNSELTQEEIEQISNLKGYTGTHNLSLKAI

NLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSPVV

KRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSKDAQKMINEMQKRNRQ

TNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYSLEAIPLEDLLNNP

FNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQYLSSSDSKIS

YETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNLVDTR

YATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKGYKH

HAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQEY

KEIFITPHQIKHIKDFKDYKYSHRVDKKPNRELINDTLYSTRKDDKGNTL

IVNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDE

KNPLYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNS

RNKVVKLSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEA

KKLKKISNQAEFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDIT

YREYLENMNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQII

KKG

In some embodiments the Cas9 protein can be S. thermophilus CRISPR1 Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 151)

MSDLVLGLDIGIGSVGVGILNKVTGEIIHKNSRIFPAAQAENNLVRRTNR

QGRRLARRKKHRRVRLNRLFEESGLITDFTKISINLNPYQLRVKGLTDEL

SNEELFIALKNMVKHRGISYLDDASDDGNSSVGDYAQIVKENSKQLETKT

PGQIQLERYQTYGQLRGDFTVEKDGKKHRLINVFPTSAYRSEALRILQTQ

QEFNPQITDEFINRYLEILTGKRKYYHGPGNEKSRTDYGRYRTSGETLDN

IFGILIGKCTFYPDEFRAAKASYTAQEFNLLNDLNNLTVPTETKKLSKEQ

KNQIINYVKNEKAMGPAKLFKYIAKLLSCDVADIKGYRIDKSGKAEIHTF

EAYRKMKTLETLDIEQMDRETLDKLAYVLTLNTEREGIQEALEHEFADGS

FSQKQVDELVQFRKANSSIFGKGWHNFSVKLMMELIPELYETSEEQMTIL

TRLGKQKTTSSSNKTKYIDEKLLTEEIYNPVVAKSVRQAIKIVNAAIKEY

GDFDNIVIEMARETNEDDEKKAIQKIQKANKDEKDAAMLKAANQYNGKAE

LPHSVFHGHKQLATKIRLWHQQGERCLYTGKTISIHDLINNSNQFEVDHI

LPLSITFDDSLANKVLVYATANQEKGQRTPYQALDSMDDAWSFRELKAFV

RESKTLSNKKKEYLLTEEDISKFDVRKKFIERNLVDTRYASRVVLNALQE

HFRAHKIDTKVSVVRGQFTSQLRRHWGIEKTRDTYHHHAVDALIIAASSQ

LNLWKKQKNTLVSYSEDQLLDIETGELISDDEYKESVFKAPYQHFVDTLK

SKEFEDSILFSYQVDSKFNRKISDATIYATRQAKVGKDKADETYVLGKIK

DIYTQDGYDAFMKIYKKDKSKFLMYRHDPQTFEKVIEPILENYPNKQIND

KGKEVPCNPFLKYKEEHGYIRKYSKKGNGPEIKSLKYYDSKLGNHIDITP

KDSNNKVVLQSVSPWRADVYFNKTTGKYEILGLKYADLQFDKGTGTYKIS

QEKYNDIKKKEGVDSDSEFKFTLYKNDLLLVKDTETKEQQLFRFLSRTMP

KQKHYVELKPYDKQKFEGGEALIKVLGNVANSGQCKKGLGKSNISIYKVR

TDVLGNQHIIKNEGDKPKLDF.

In some embodiments the Cas9 protein can be N. meningitidis Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 152)

MAAFKPNPINYILGLDIGIASVGWAMVEIDEDENPICLIDLGVRVFERAE

VPKTGDSLAMARRLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDEN

GLIKSLPNTPWQLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGET

ADKELGALLKGVADNAHALQTGDFRTPAELALNKFEKESGHIRNQRGDYS

HTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDA

VQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDT

ERATLMDEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEM

KAYHAISRALEKEGLKDKKSPLNLSPELQDEIGTAFSLFKTDEDITGRLK

DRIQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGKRYDEACAEIYG

DHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRYGSPAR

IHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNFVGEPKS

KDILKLRLYEQQHGKCLYSGKEINLGRLNEKGYVEIDHALPFSRTWDDSF

NNKVLVLGSENQNKGNQTPYEYFNGKDNSREWQEFKARVETSRFPRSKKQ

RILLQKFDEDGFKERNLNDTRYVNRFLCQFVADRMRLTGKGKKRVFASNG

QITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQKITRFVRYKEM

NAFDGKTIDKETGEVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEA

DTPEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGQGHMETVKSA

KRLDEGVSVLRVPLTQLKLKDLEKMVNREREPKLYEALKARLEAHKDDPA

KAFAEPFYKYDKAGNRTQQVKAVRVEQVQKTGVVVVRNHNGIADNATMVR

VDVFEKGDKYYLVPIYSWQVAKGILPDRAVVQGKDEEDWQLIDDSFNFKF

SLHPNDLVEVITKKARMFGYFASCHRGTGNINIRIHDLDHKIGKNGILEG

IGVKTALSFQKYQIDELGKEIRPCRLKKRPPVR.

In some embodiments the Cas9 protein can be Parvibaculum lavamentivorans Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 153)

MERIFGFDIGTTSIGFSVIDYSSTQSAGNIQRLGVRIFPEARDPDGTPLN

QQRRQKRMMRRQLRRRRIRRKALNETLHEAGFLPAYGSADWPVVMADEPY

ELRRRGLEEGLSAYEFGRAIYHLAQHRHFKGRELEESDTPDPDVDDEKEA

ANERAATLKALKNEQTTLGAWLARRPPSDRKRGIHAHRNVVAEEFERLWE

VQSKFHPALKSEEMRARISDTIFAQRPVFWRKNTLGECRFMPGEPLCPKG

SWLSQQRRMLEKLNNLAIAGGNARPLDAEERDAILSKLQQQASMSWPGVR

SALKALYKQRGEPGAEKSLKFNLELGGESKLLGNALEAKLADMFGPDWPA

HPRKQEIRHAVHERLWAADYGETPDKKRVIILSEKDRKAHREAAANSFVA

DFGITGEQAAQLQALKLPTGWEPYSIPALNLFLAELEKGERFGALVNGPD

WEGWRRTNFPHRNQPTGEILDKLPSPASKEERERISQLRNPTVVRTQNEL

RKVVNNLIGLYGKPDRIRIEVGRDVGKSKREREEIQSGIRRNEKQRKKAT

EDLIKNGIANPSRDDVEKWILWKEGQERCPYTGDQIGFNALFREGRYEVE

HIWPRSRSFDNSPRNKTLCRKDVNIEKGNRMPFEAFGHDEDRWSAIQIRL

QGMVSAKGGTGMSPGKVKRFLAKTMPEDFAARQLNDTRYAAKQILAQLKR

LWPDMGPEAPVKVEAVTGQVTAQLRKLWTLNNILADDGEKTRADHRHHAI

DALTVACTHPGMTNKLSRYWQLRDDPRAEKPALTPPWDTIRADAEKAVSE

IVVSHRVRKKVSGPLHKETTYGDTGTDIKTKSGTYRQFVTRKKIESLSKG

ELDEIRDPRIKEIVAAHVAGRGGDPKKAFPPYPCVSPGGPEIRKVRLTSK

QQLNLMAQTGNGYADLGSNHHIAIYRLPDGKADFEIVSLFDASRRLAQRN

PIVQRTRADGASFVMSLAAGEAIMIPEGSKKGIWIVQGVVVASGQVVLER

DTDADHSTTTRPMPNPILKDDAKKVSIDPIGRVRPSND.

In some embodiments the Cas9 protein can be Corynebacterium diphtheriae Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 154)

MKYHVGIDVGTFSVGLAAIEVDDAGMPIKTLSLVSHIHDSGLDPDEIKSA

VTRLASSGIARRTRRLYRRKRRRLQQLDKFIQRQGWPVIELEDYSDPLYP

WKVRAELAASYIADEKERGEKLSVALRHIARHRGWRNPYAKVSSLYLPDG

PSDAFKAIREEIKRASGQPVPETATVGQMVTLCELGTLKLRGEGGVLSAR

LQQSDYAREIQEICRMQEIGQELYRKIIDVVFAAESPKGSASSRVGKDPL

QPGKNRALKASDAFQRYRIAALIGNLRVRVDGEKRILSVEEKNLVFDHLV

NLTPKKEPEWVTIAEILGIDRGQLIGTATMTDDGERAGARPPTHDTNRSI

VNSRIAPLVDWWKTASALEQHAMVKALSNAEVDDFDSPEGAKVQAFFADL

DDDVHAKLDSLHLPVGRAAYSEDTLVRLTRRMLSDGVDLYTARLQEFGIE

PSWTPPTPRIGEPVGNPAVDRVLKTVSRWLESATKTWGAPERVIIEHVRE

GFVTEKRAREMDGDMRRRAARNAKLFQEMQEKLNVQGKPSRADLWRYQSV

QRQNCQCAYCGSPITFSNSEMDHIVPRAGQGSTNTRENLVAVCHRCNQSK

GNTPFAIWAKNTSIEGVSVKEAVERTRHWVTDTGMRSTDFKKFTKAVVER

FQRATMDEEIDARSMESVAWMANELRSRVAQHFASHGTTVRVYRGSLTAE

ARRASGISGKLKFFDGVGKSRLDRRHHAIDAAVIAFTSDYVAETLAVRSN

LKQSQAHRQEAPQWREFTGKDAEHRAAWRVWCQKMEKLSALLTEDLRDDR

VVVMSNVRLRLGNGSAHKETIGKLSKVKLSSQLSVSDIDKASSEALWCAL

TREPGFDPKEGLPANPERHIRVNGTHVYAGDNIGLFPVSAGSIALRGGYA

ELGSSFHHARVYKITSGKKPAFAMLRVYTIDLLPYRNQDLFSVELKPQTM

SMRQAEKKLRDALATGNAEYLGWLVVDDELVVDTSKIATDQVKAVEAELG

TIRRWRVDGFFSPSKLRLRPLQMSKEGIKKESAPELSKIIDRPGWLPAVN

KLFSDGNVTVVRRDSLGRVRLESTAHLPVTWKVQ.

In some embodiments the Cas9 protein can be Streptococcus pasteurianus Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 155)

MTNGKILGLDIGIASVGVGIIEAKTGKVVHANSRLFSAANAENNAERRGF

RGSRRLNRRKKHRVKRVRDLFEKYGIVTDFRNLNLNPYELRVKGLTEQLK

NEELFAALRTISKRRGISYLDDAEDDSTGSTDYAKSIDENRRLLKNKTPG

QIQLERLEKYGQLRGNFTVYDENGEAHRLINVFSTSDYEKEARKILETQA

DYNKKITAEFIDDYVEILTQKRKYYHGPGNEKSRTDYGRFRTDGTTLENI

FGILIGKCNFYPDEYRASKASYTAQEYNFLNDLNNLKVSTETGKLSTEQK

ESLVEFAKNTATLGPAKLLKEIAKILDCKVDEIKGYREDDKGKPDLHTFE

PYRKLKFNLESINIDDLSREVIDKLADILTLNIIREGIEDAIKRNLPNQF

TEEQISEIIKVRKSQSTAFNKGWHSFSAKLMNELIPELYATSDEQMTILT

RLEKFKVNKKSSKNTKTIDEKEVTDEIYNPVVAKSVRQTIKIINAAVKKY

GDFDKIVIEMPRDKNADDEKKFIDKRNKENKKEKDDALKRAAYLYNSSDK

LPDEVFHGNKQLETKIRLWYQQGERCLYSGKPISIQELVHNSNNFEIDHI

LPLSLSFDDSLANKVLVYAWTNQEKGQKTPYQVIDSMDAAWSFREMKDYV

LKQKGLGKKKRDYLLTTENIDKIEVKKKFIERNLVDTRYASRVVLNSLQS

ALRELGKDTKVSVVRGQFTSQLRRKWKIDKSRETYHHHAVDALIIAASSQ

LKLWEKQDNPMFVDYGKNQVVDKQTGEILSVSDDEYKELVFQPPYQGFVN

TISSKGFEDEILFSYQVDSKYNRKVSDATIYSTRKAKIGKDKKEETYVLG

KIKDIYSQNGFDTFIKKYNKDKTQFLMYQKDSLTWENVIEVILRDYPTTK

KSEDGKNDVKCNPFEEYRRENGLICKYSKKGKGTPIKSLKYYDKKLGNCI

DITPEESRNKVILQSINPWRADVYFNPETLKYELMGLKYSDLSFEKGTGN

YHISQEKYDAIKEKEGIGKKSEFKFTLYRNDLILIKDIASGEQEIYRFLS

RTMPNVNHYVELKPYDKEKFDNVQELVEALGEADKVGRCIKGLNKPNISI

YKVRTDVLGNKYFVKKKGDKPKLDFKNNKK.

In some embodiments the Cas9 protein can be Neisseria cinerea Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 156)

MAAFKPNPMNYILGLDIGIASVGWAIVEIDEEENPIRLIDLGVRVFERAE

VPKTGDSLAAARRLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDEN

GLIKSLPNTPWQLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGET

ADKELGALLKGVADNTHALQTGDFRTPAELALNKFEKESGHIRNQRGDYS

HTFNRKDLQAELNLLFEKQKEFGNPHVSDGLKEGIETLLMTQRPALSGDA

VQKMLGHCTFEPTEPKAAKNTYTAERFVWLTKLNNLRILEQGSERPLTDT

ERATLMDEPYRKSKLTYAQARKLLDLDDTAFFKGLRYGKDNAEASTLMEM

KAYHAISRALEKEGLKDKKSPLNLSPELQDEIGTAFSLFKTDEDITGRLK

DRVQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGNRYDEACTEIYG

DHYGKKNIEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRYGSPAR

IHIETAREVGKSFKDRKEIEKRQEENRKDREKSAAKFREYFPNFVGEPKS

KDILKLRLYEQQHGKCLYSGKEINLGRLNEKGYVEIDHALPFSRTWDDSF

NNKVLALGSENQNKGNQTPYEYFNGKDNSREWQEFKARVETSRFPRSKKQ

RILLQKFDEDGFKERNLNDTRYINRFLCQFVADHMLLTGKGKRRVFASNG

QITNLLRGFWGLRKVRAENDRHHALDAVVVACSTIAMQQKITRFVRYKEM

NAFDGKTIDKETGEVLHQKAHFPQPWEFFAQEVMIRVFGKPDGKPEFEEA

DTPEKLRTLLAEKLSSRPEAVHKYVTPLFISRAPNRKMSGQGHMETVKSA

KRLDEGISVLRVPLTQLKLKDLEKMVNREREPKLYEALKARLEAHKDDPA

KAFAEPFYKYDKAGNRTQQVKAVRVEQVQKTGVWVHNHNGIADNATIVRV

DVFEKGGKYYLVPIYSWQVAKGILPDRAVVQGKDEEDWTVMDDSFEFKFV

LYANDLIKLTAKKNEFLGYFVSLNRATGAIDIRTHDTDSTKGKNGIFQSV

GVKTALSFQKYQIDELGKEIRPCRLKKRPPVR.

In some embodiments the Cas9 protein can be Campylobacter lari Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 157)

MRILGFDIGINSIGWAFVENDELKDCGVRIFTKAENPKNKESLALPRRNA

RSSRRRLKRRKARLIAIKRILAKELKLNYKDYVAADGELPKAYEGSLASV

YELRYKALTQNLETKDLARVILHIAKHRGYMNKNEKKSNDAKKGKILSAL

KNNALKLENYQSVGEYFYKEFFQKYKKNTKNFIKIRNTKDNYNNCVLSSD

LEKELKLILEKQKEFGYNYSEDFINEILKVAFFQRPLKDFSHLVGACTFF

EEEKRACKNSYSAWEFVALTKIINEIKSLEKISGEIVPTQTINEVLNLIL

DKGSITYKKFRSCINLHESISFKSLKYDKENAENAKLIDFRKLVEFKKAL

GVHSLSRQELDQISTHITLIKDNVKLKTVLEKYNLSNEQINNLLEIEFND

YINLSFKALGMILPLMREGKRYDEACEIANLKPKTVDEKKDFLPAFCDSI

FAHELSNPVVNRAISEYRKVLNALLKKYGKVHKIHLELARDVGLSKKARE

KIEKEQKENQAVNAWALKECENIGLKASAKNILKLKLWKEQKEICIYSGN

KISIEHLKDEKALEVDHIYPYSRSFDDSFINKVLVFTKENQEKLNKTPFE

AFGKNIEKWSKIQTLAQNLPYKKKNKILDENFKDKQQEDFISRNLNDTRY

IATLIAKYTKEYLNFLLLSENENANLKSGEKGSKIHVQTISGMLTSVLRH

TWGFDKKDRNNHLHHALDAIIVAYSTNSIIKAFSDFRKNQELLKARFYAK

ELTSDNYKHQVKFFEPFKSFREKILSKIDEIFVSKPPRKRARRALHKDTF

HSENKIIDKCSYNSKEGLQIALSCGRVRKIGTKYVENDTIVRVDIFKKQN

KFYAIPIYAMDFALGILPNKIVITGKDKNNNPKQWQTIDESYEFCFSLYK

NDLILLQKKNMQEPEFAYYNDFSISTSSICVEKHDNKFENLTSNQKLLFS

NAKEGSVKVESLGIQNLKVFEKYIITPLGDKIKADFQPRENISLKTSKKY

GLR.

In some embodiments the Cas9 protein can be T. denticola Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 158)

MKKEIKDYFLGLDVGTGSVGWAVTDTDYKLLKANRKDLWGMRCFETAETA

EVRRLHRGARRRIERRKKRIKLLQELFSQEIAKTDEGFFQRMKESPFYAE

DKTILQENTLFNDKDFADKTYHKAYPTINHLIKAWIENKVKPDPRLLYLA

CHNIIKKRGHFLFEGDFDSENQFDTSIQALFEYLREDMEVDIDADSQKVK

EILKDSSLKNSEKQSRLNKILGLKPSDKQKKAITNLISGNKINFADLYDN

PDLKDAEKNSISFSKDDFDALSDDLASILGDSFELLLKAKAVYNCSVLSK

VIGDEQYLSFAKVKIYEKHKTDLTKLKNVIKKHFPKDYKKVFGYNKNEKN

NNNYSGYVGVCKTKSKKLIINNSVNQEDFYKFLKTILSAKSEIKEVNDIL

TEIETGTFLPKQISKSNAEIPYQLRKMELEKILSNAEKHFSFLKQKDEKG

LSHSEKIIMLLTFKIPYYIGPINDNHKKFFPDRCWVVKKEKSPSGKTTPW

NFFDHIDKEKTAEAFITSRTNFCTYLVGESVLPKSSLLYSEYTVLNEINN

LQIIIDGKNICDIKLKQKIYEDLFKKYKKITQKQISTFIKHEGICNKTDE

VIILGIDKECTSSLKSYIELKNIFGKQVDEISTKNMLEEIIRWATIYDEG

EGKTILKTKIKAEYGKYCSDEQIKKILNLKFSGWGRLSRKFLETVTSEMP

GFSEPVNIITAMRETQNNLMELLSSEFTFTENIKKINSGFEDAEKQFSYD

GLVKPLFLSPSVKKMLWQTLKLVKEISHITQAPPKKIFIEMAKGAELEPA

RTKTRLKILQDLYNNCKNDADAFSSEIKDLSGKIENEDNLRLRSDKLYLY

YTQLGKCMYCGKPIEIGHVFDTSNYDIDHIYPQSKIKDDSISNRVLVCSS

CNKNKEDKYPLKSEIQSKQRGFWNFLQRNNFISLEKLNRLTRATPISDDE

TAKFIARQLVETRQATKVAAKVLEKMFPETKIVYSKAETVSMFRNKFDIV

KCREINDFHHAHDAYLNIVVGNVYNTKFTNNPWNFIKEKRDNPKIADTYN

YYKVFDYDVKRNNITAWEKGKTIITVKDMLKRNTPIYTRQAACKKGELFN

QTIMKKGLGQHPLKKEGPFSNISKYGGYNKVSAAYYTLIEYEEKGNKIRS

LETIPLYLVKDIQKDQDVLKSYLTDLLGKKEFKILVPKIKINSLLKINGF

PCHITGKTNDSFLLRPAVQFCCSNNEVLYFKKIIRFSEIRSQREKIGKTI

SPYEDLSFRSYIKENLWKKTKNDEIGEKEFYDLLQKKNLEIYDMLLTKHK

DTIYKKRPNSATIDILVKGKEKFKSLIIENQFEVILEILKLFSATRNVSD

LQHIGGSKYSGVAKIGNKISSLDNCILIYQSITGIFEKRIDLLKV.

In some embodiments the Cas9 protein can be S. mutans Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 159)

MKKPYSIGLDIGTNSVGWAVVTDDYKVPAKKMKVLGNTDKSHIEKNLLGA

LLFDSGNTAEDRRLKRTARRRYTRRRNRILYLQEIFSEEMGKVDDSFFHR

LEDSFLVIEDKRGERHPIFGNLEEEVKYHENFPTIYHLRQYLADNPEKVD

LRLVYLALAHIIKFRGHFLIEGKFDTRNNDVQRLFQEFLAVYDNTFENSS

LQEQNVQVEEILTDKISKSAKKDRVLKLFPNEKSNGRFAEFLKLIVGNQA

DFKKHFELEEKAPLQFSKDTYEEELEVLLAQIGDNYAELFLSAKKLYDSI

LLSGILTVTDVGTKAPLSASMIQRYNEHQMDLAQLKQFIRQKLSDKYNEV

FSDVSKDGYAGYIDGKTNQEAFYKYLKGLLNKIEGSGYFLDKIEREDFLR

KQRTFDNGSIPHQIHLQEMRAIIRRQAEFYPFLADNQDRIEKLLTFRIPY

YVGPLARGKSDFAWLSRKSADKITPWNFDEIVDKESSAEAFINRMTNYDL

YLPNQKVLPKHSLLYEKFTVYNELTKVKYKTEQGKTAFFDANMKQEIFDG

VFKVYRKVTKDKLMDFLEKEFDEFRIVDLTGLDKENKVFNASYGTYHDLC

KILDKDFLDNSKNEKILEDIVLTLTLFEDREMIRKRLENYSDLLTKEQVK

KLERRHYTGWGRLSAELIHGIRNKESRKTILDYLIDDGNSNRNFMQLIND

DALSFKEEIAKAQVIGETDNLNQVVSDIAGSPAIKKGILQSLKIVDELVK

IMGHQPENIVVEMARENQFTNQGRRNSQQRLKGLTDSIKEFGSQILKEHP

VENSQLQNDRLFLYYLQNGRDMYTGEELDIDYLSQYDIDHIIPQAFIKDN

SIDNRVLTSSKENRGKSDDVPSKDVVRKMKSYWSKLLSAKLITQRKFDNL

TKAERGGLTDDDKAGFIKRQLVETRQITKHVARILDERFNTETDENNKKI

RQVKIVTLKSNLVSNFRKEFELYKVREINDYHHAHDAYLNAVIGKALLGV

YPQLEPEFVYGDYPHFHGHKENKATAKKFFYSNIMNFFKKDDVRTDKNGE

IIWKKDEHISNIKKVLSYPQVNIVKKVEEQTGGFSKESILPKGNSDKLIP

RKTKKFYWDTKKYGGFDSPIVAYSILVIADIEKGKSKKLKTVKALVGVTI

MEKMTFERDPVAFLERKGYRNVQEENIIKLPKYSLFKLENGRKRLLASAR

ELQKGNEIVLPNHLGTLLYHAKNIHKVDEPKHLDYVDKHKDEFKELLDVV

SNFSKKYTLAEGNLEKIKELYAQNNGEDLKELASSFINLLTFTAIGAPAT

FKFFDKNIDRKRYTSTTEILNATLIHQSITGLYETRIDLNKLGGD

In some embodiments the Cas9 protein can be S. thermophilus CRISPR3 Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 160)

MTKPYSIGLDIGTNSVGWAVTTDNYKVPSKKMKVLGNTSKKYIKKNLLGV

LLFDSGITAEGRRLKRTARRRYTRRRNRILYLQEIFSTEMATLDDAFFQR

LDDSFLVPDDKRDSKYPIFGNLVEEKAYHDEFPTIYHLRKYLADSTKKAD

LRLVYLALAHMIKYRGHFLIEGEFNSKNNDIQKNFQDFLDTYNAIFESDL

SLENSKQLEEIVKDKISKLEKKDRILKLFPGEKNSGIFSEFLKLIVGNQA

DFRKCFNLDEKASLHFSKESYDEDLETLLGYIGDDYSDVFLKAKKLYDAI

LLSGFLTVTDNETEAPLSSAMIKRYNEHKEDLALLKEYIRNISLKTYNEV

FKDDTKNGYAGYIDGKTNQEDFYVYLKKLLAEFEGADYFLEKIDREDFLR

KQRTFDNGSIPYQIHLQEMRAILDKQAKFYPFLAKNKERIEKILTFRIPY

YVGPLARGNSDFAWSIRKRNEKITPWNFEDVIDKESSAEAFINRMTSFDL

YLPEEKVLPKHSLLYETFNVYNELTKVRFIAESMRDYQFLDSKQKKDIVR

LYFKDKRKVTDKDIIEYLHAIYGYDGIELKGIEKQFNSSLSTYHDLLNII

NDKEFLDDSSNEAIIEEIIHTLTIFEDREMIKQRLSKFENIFDKSVLKKL

SRRHYTGWGKLSAKLINGIRDEKSGNTILDYLIDDGISNRNFMQLIHDDA

LSFKKKIQKAQIIGDEDKGNIKEVVKSLPGSPAIKKGILQSIKIVDELVK

VMGGRKPESIVVEMARENQYTNQGKSNSQQRLKRLEKSLKELGSKILKEN

IPAKLSKIDNNALQNDRLYLYYLQNGKDMYTGDDLDIDRLSNYDIDHIIP

QAFLKDNSIDNKVLVSSASNRGKSDDVPSLEVVKKRKTFWYQLLKSKLIS

QRKFDNLTKAERGGLSPEDKAGFIQRQLVETRQITKHVARLLDEKFNNKK

DENNRAVRTVKIITLKSTLVSQFRKDFELYKVREINDFHHAHDAYLNAVV

ASALLKKYPKLEPEFVYGDYPKYNSFRERKSATEKVYFYSNIMNIFKKSI

SLADGRVIERPLIEVNEETGESVWNKESDLATVRRVLSYPQVNVVKKVEE

QNHGLDRGKPKGLFNANLSSKPKPNSNENLVGAKEYLDPKKYGGYAGISN

SFTVLVKGTIEKGAKKKITNVLEFQGISILDRINYRKDKLNFLLEKGYKD

IELIIELPKYSLFELSDGSRRMLASILSTNNKRGEIHKGNQIFLSQKFVK

LLYHAKRISNTINENHRKYVENHKKEFEELFYYILEFNENYVGAKKNGKL

LNSAFQSWQNHSIDELCSSFIGPTGSERKGLFELTSRGSAADFEFLGVKI

PRYRDYTPSSLLKDATLIHQSVTGLYETRIDLAKLGEG

In some embodiments the Cas9 protein can be C. jejuni Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 161)

MARILAFDIGISSIGWAFSENDELKDCGVRIFTKVENPKTGESLALPRRL

ARSARKRLARRKARLNHLKHLIANEFKLNYEDYQSFDESLAKAYKGSLIS

PYELRFRALNELLSKQDFARVILHIAKRRGYDDIKNSDDKEKGAILKAIK

QNEEKLANYQSVGEYLYKEYFQKFKENSKEFTNVRNKKESYERCIAQSFL

KDELKLIFKKQREFGFSFSKKFEEEVLSVAFYKRALKDFSHLVGNCSFFT

DEKRAPKNSPLAFWVALTRIINLLNNLKNTEGILYTKDDLNALLNEVLKN

GTLTYKQTKKLLGLSDDYEFKGEKGTYFIEFKKYKEFIKALGEHNLSQDD

LNEIAKDITLIKDEIKLKKALAKYDLNQNQIDSLSKLEFKDHLNISFKAL

KLVTPLMLEGKKYDEACNELNLKVAINEDKKDFLPAFNETYYKDEVTNPV

VLRAIKEYRKVLNALLKKYGKVHKINIELAREVGKNHSQRAKIEKEQNEN

YKAKKDAELECEKLGLKINSKNILKLRLFKEQKEFCAYSGEKIKISDLQD

EKMLEIDHIYPYSRSFDDSYMNKVLVFTKQNQEKLNQTPFEAFGNDSAKW

QKIEVLAKNLPTKKQKRILDKNYKDKEQKNFKDRNLNDTRYIARLVLNYT

KDYLDFLPLSDDENTKLNDTQKGSKVHVEAKSGMLTSALRHTWGFSAKDR

NNHLHHAIDAVIIAYANNSIVKAFSDFKKEQESNSAELYAKKISELDYKN

KRKFFEPFSGFRQKVLDKIDEIFVSKPERKKPSGALHEETFRKEEEFYQS

YGGKEGVLKALELGKIRKVNGKIVKNGDMFRVDIFKHKKTNKFYAVPIYT

MDFALKVLPNKAVARSKKGEIKDWILMDENYEFCFSLYKDSLILIQTKDM

QEPEFVYYNAFTSSTVSLIVSKHDNKFETLSKNQKILFKNANEKEVIAKS

IGIQNLKVFEKYIVSALGEVTKAEFRQREDFKK

In some embodiments the Cas9 protein can be P. multocida Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 162)

MQTTNLSYILGLDLGIASVGWAVVEINENEDPIGLIDVGVRIFERAEVPK

TGESLALSRRLARSTRRLIRRRAHRLLLAKRFLKREGILSTIDLEKGLPN

QAWELRVAGLERRLSAIEWGAVLLHLIKHRGYLSKRKNESQTNNKELGAL

LSGVAQNHQLLQSDDYRTPAELALKKFAKEEGHIRNQRGAYTHTFNRLDL

LAELNLLFAQQHQFGNPHCKEHIQQYMTELLMWQKPALSGEAILKMLGKC

THEKNEFKAAKHTYSAERFVWLTKLNNLRILEDGAERALNEEERQLLINH

PYEKSKLTYAQVRKLLGLSEQAIFKHLRYSKENAESATFMELKAWHAIRK

ALENQGLKDTWQDLAKKPDLLDEIGTAFSLYKTDEDIQQYLTNKVPNSVI

NALLVSLNFDKFIELSLKSLRKILPLMEQGKRYDQACREIYGHHYGEANQ

KTSQLLPAIPAQEIRNPVVLRTLSQARKVINAIIRQYGSPARVHIETGRE

LGKSFKERREIQKQQEDNRTKRESAVQKFKELFSDFSSEPKSKDILKFRL

YEQQHGKCLYSGKEINIHRLNEKGYVEIDHALPFSRTWDDSFNNKVLVLA

SENQNKGNQTPYEWLQGKINSERWKNFVALVLGSQCSAAKKQRLLTQVID

DNKFIDRNLNDTRYIARFLSNYIQENLLLVGKNKKNVFTPNGQITALLRS

RWGLIKARENNNRHHALDAIVVACATPSMQQKITRFIRFKEVHPYKIENR

YEMVDQESGEIISPHFPEPWAYFRQEVNIRVFDNHPDTVLKEMLPDRPQA

NHQFVQPLFVSRAPTRKMSGQGHMETIKSAKRLAEGISVLRIPLTQLKPN

LLENMVNKEREPALYAGLKARLAEFNQDPAKAFATPFYKQGGQQVKAIRV

EQVQKSGVLVRENNGVADNASIVRTDVFIKNNKFFLVPIYTWQVAKGILP

NKAIVAHKNEDEWEEMDEGAKFKFSLFPNDLVELKTKKEYFFGYYIGLDR

ATGNISLKEHDGEISKGKDGVYRVGVKLALSFEKYQVDELGKNRQICRPQ

QRQPVR

In some embodiments the Cas9 protein can be F. novicida Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 163)

MNFKILPIAIDLGVKNTGVFSAFYQKGTSLERLDNKNGKVYELSKDSYTL

LMNNRTARRHQRRGIDRKQLVKRLFKLIWTEQLNLEWDKDTQQAISFLFN

RRGFSFITDGYSPEYLNIVPEQVKAILMDIFDDYNGEDDLDSYLKLATEQ

ESKISEIYNKLMQKILEFKLMKLCTDIKDDKVSTKTLKEITSYEFELLAD

YLANYSESLKTQKFSYTDKQGNLKELSYYHHDKYNIQEFLKRHATINDRI

LDTLLTDDLDIWNFNFEKFDFDKNEEKLQNQEDKDHIQAHLHHFVFAVNK

IKSEMASGGRHRSQYFQEITNVLDENNHQEGYLKNFCENLHNKKYSNLSV

KNLVNLIGNLSNLELKPLRKYFNDKIHAKADHWDEQKFIETYCHWILGEW

RVGVKDQDKKDGAKYSYKDLCNELKQKVTKAGLVDFLLELDPCRTIPPYL

DNNNRKPPKCQSLILNPKFLDNQYPNWQQYLQELKKLQSIQNYLDSFETD

LKVLKSSKDQPYFVEYKSSNQQIASGQRDYKDLDARILQFIFDRVKASDE

LLLNEIYFQAKKLKQKASSELEKLESSKKLDEVIANSQLSQILKSQHTNG

IFEQGTFLHLVCKYYKQRQRARDSRLYIMPEYRYDKKLHKYNNTGRFDDD

NQLLTYCNHKPRQKRYQLLNDLAGVLQVSPNFLKDKIGSDDDLFISKWLV

EHIRGFKKACEDSLKIQKDNRGLLNHKINIARNTKGKCEKEIFNLICKIE

GSEDKKGNYKHGLAYELGVLLFGEPNEASKPEFDRKIKKFNSIYSFAQIQ

QIAFAERKGNANTCAVCSADNAHRMQQIKIIEPVEDNKDKIILSAKAQRL

PAIPTRIVDGAVKKMATILAKNIVDDNWQNIKQVLSAKHQLHIPIIIESN

AFEFEPALADVKGKSLKDRRKKALERISPENIFKDKNNRIKEFAKGISAY

SGANLTDGDFDGAKEELDHIIPRSHKKYGTLNDEANLICVTRGDNKNKGN

RIFCLRDLADNYKLKQFETTDDLEIEKKIADTIWDANKKDFKFGNYRSFI

NLTPQEQKAFRHALFLADENPIKQAVIRAINNRNRTFVNGTQRYFAEVLA

NNIYLRAKKENLNTDKISFDYFGIPTIGNGRGIAEIRQLYEKVDSDIQAY

AKGDKPQASYSHLIDAMLAFCIAADEHRNDGSIGLEIDKNYSLYPLDKNT

GEVFTKDIFSQIKITDNEFSDKKLVRKKAIEGFNTHRQMTRDGIYAENYL

PILIHKELNEVRKGYTWKNSEEIKIFKGKKYDIQQLNNLVYCLKFVDKPI

SIDIQISTLEELRNILTTNNIAATAEYYYINLKTQKLHEYYIENYNTALG

YKKYSKEMEFLRSLAYRSERVKIKSIDDVKQVLDKDSNFIIGKITLPFKK

EWQRLYREWQNTTIKDDYEFLKSFFNVKSITKLHKKVRKDFSLPISTNEG

KFLVKRKTWDNNFIYQILNDSDSRADGTKPFIPAFDISKNEIVEAIIDSF

TSKNIFWLPKNIELQKVDNKNIFAIDTSKWFEVETPSDLRDIGIATIQYK

IDNNSRPKVRVKLDYVIDDDSKINYFMNHSLLKSRYPDKVLEILKQSTII

EFESSGFNKTIKEMLGMKLAGIYNETSNN

In some embodiments the Cas9 protein can be Lactobacillus buchneri Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 164)

MKVNNYHIGLDIGTSSIGWVAIGKDGKPLRVKGKTAIGARLFQEGNPAAD

RRMFRTTRRRLSRRKWRLKLLEEIFDPYITPVDSTFFARLKQSNLSPKDS

RKEFKGSMLFPDLTDMQYHKNYPTIYHLRHALMTQDKKFDIRMVYLAIHH

IVKYRGNFLNSTPVDSFKASKVDFVDQFKKLNELYAAINPEESFKINLAN

SEDIGHQFLDPSIRKFDKKKQIPKIVPVMMNDKVTDRLNGKIASEIIHAI

LGYKAKLDVVLQCTPVDSKPWALKFDDEDIDAKLEKILPEMDENQQSIVA

ILQNLYSQVTLNQIVPNGMSLSESMIEKYNDHHDHLKLYKKLIDQLADPK

KKAVLKKAYSQYVGDDGKVIEQAEFWSSVKKNLDDSELSKQIMDLIDAEK

FMPKQRTSQNGVIPHQLHQRELDEIIEHQSKYYPWLVEINPNKHDLHLAK

YKIEQLVAFRVPYYVGPMITPKDQAESAETVFSWMERKGTETGQITPWNF

DEKVDRKASANRFIKRMTTKDTYLIGEDVLPDESLLYEKFKVLNELNMVR

VNGKLLKVADKQAIFQDLFENYKHVSVKKLQNYIKAKTGLPSDPEISGLS

DPEHFNNSLGTYNDFKKLFGSKVDEPDLQDDFEKIVEWSTVFEDKKILRE

KLNEITWLSDQQKDVLESSRYQGWGRLSKKLLTGIVNDQGERIIDKLWNT

NKNFMQIQSDDDFAKRIHEANADQMQAVDVEDVLADAYTSPQNKKAIRQV

VKVVDDIQKAMGGVAPKYISIEFTRSEDRNPRRTISRQRQLENTLKDTAK

SLAKSINPELLSELDNAAKSKKGLTDRLYLYFTQLGKDIYTGEPINIDEL

NKYDIDHILPQAFIKDNSLDNRVLVLTAVNNGKSDNVPLRMFGAKMGHFW

KQLAEAGLISKRKLKNLQTDPDTISKYAMHGFIRRQLVETSQVIKLVANI

LGDKYRNDDTKIIEITARMNHQMRDEFGFIKNREINDYHHAFDAYLTAFL

GRYLYHRYIKLRPYFVYGDFKKFREDKVTMRNFNFLHDLTDDTQEKIADA

ETGEVIWDRENSIQQLKDVYHYKFMLISHEVYTLRGAMFNQTVYPASDAG

KRKLIPVKADRPVNVYGGYSGSADAYMAIVRIHNKKGDKYRVVGVPMRAL

DRLDAAKNVSDADFDRALKDVLAPQLTKTKKSRKTGEITQVIEDFEIVLG

KVMYRQLMIDGDKKFMLGSSTYQYNAKQLVLSDQSVKTLASKGRLDPLQE

SMDYNNVYlEILDKVNQYFSLYDMNKFRHKLNLGFSKFISFPNHNVLDGN

TKVSSGKREILQEILNGLHANPTFGNLKDVGITTPFGQLQQPNGILLSDE

TKIRYQSPTGLFERTVSLKDL

In some embodiments the Cas9 protein can be Listeria innocua Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 165)

MKKPYTIGLDIGTNSVGWAVLTDQYDLVKRKMKIAGDSEKKQIKKNFWGV

RLFDEGQTAADRRMARTARRRIERRRNRISYLQGIFAEEMSKTDANFFCR

LSDSFYVDNEKRNSRHPFFATIEEEVEYHKNYPTIYHLREELVNSSEKAD

LRLVYLALAHIIKYRGNFLIEGALDTQNTSVDGIYKQFIQTYNQVFASGI

EDGSLKKLEDNKDVAKILVEKVTRKEKLERILKLYPGEKSAGMFAQFISL

IVGSKGNFQKPFDLIEKSDIECAKDSYEEDLESLLALIGDEYAELFVAAK

NAYSAVVLSSIITVAETETNAKLSASMIERFDTHEEDLGELKAFIKLHLP

KHYEEIFSNTEKHGYAGYIDGKTKQADFYKYMKMTLENIEGADYFIAKIE

KENFLRKQRTFDNGAIPHQLHLEELEAILHQQAKYYPFLKENYDKIKSLV

TFRIPYFVGPLANGQSEFAWLTRKADGEIRPWNIEEKVDFGKSAVDFIEK

MTNKDTYLPKENVLPKHSLCYQKYLVYNELTKVRYINDQGKTSYFSGQEK

EQIFNDLFKQKRKVKKKDLELFLRNMSHVESPTIEGLEDSFNSSYSTYHD

LLKVGIKQEILDNPVNTEMLENIVKILTVFEDKRMIKEQLQQFSDVLDGV

VLKKLERRHYTGWGRLSAKLLMGIRDKQSHLTILDYLMNDDGLNRNLMQL

INDSNLSFKSIIEKEQVTTADKDIQSIVADLAGSPAIKKGILQSLKIVDE

LVSVMGYPPQTIVVEMARENQTTGKGKNNSRPRYKSLEKAIKEFGSQILK

EHPTDNQELRNNRLYLYYLQNGKDMYTGQDLDIHNLSNYDIDHIVPQSFI

TDNSIDNLVLTSSAGNREKGDDVPPLEIVRKRKVFWEKLYQGNLMSKRKF

DYLTKAERGGLTEADKARFIHRQLVETRQITKNVANILHQRFNYEKDDHG

NTMKQVRIVTLKSALVSQFRKQFQLYKVRDVNDYHHAHDAYLNGVVANTL

LKVYPQLEPEFVYGDYHQFDWFKANKATAKKQFYTNIMLFFAQKDRIIDE

NGEILWDKKYLDTVKKVMSYRQMNIVKKTEIQKGEFSKATIKPKGNSSKL

IPRKTNWDPMKYGGLDSPNMAYAVVIEYAKGKNKLVFEKKIIRVTIMERK

AFEKDEKAFLEEQGYRQPKVLAKLPKYTLYECEEGRRRMLASANEAQKGN

QQVLPNHLVTLLHHAANCEVSDGKSLDYIESNREMFAELLAHVSEFAKRY

TLAEANLNKINQLFEQNKEGDIKAIAQSFVDLMAFNAMGAPASFKFFETT

IERKRYNNLKELLNSTIIYQSITGLYESRKRLDD

In some embodiments the Cas9 protein can be L. pneumophila Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 166)

MESSQILSPIGIDLGGKFTGVCLSHLEAFAELPNHANTKYSVILIDHNNF

QLSQAQRRATRHRVRNKKRNQFVKRVALQLFQHILSRDLNAKEETALCHY

LNNRGYTYVDTDLDEYIKDETTINLLKELLPSESEHNFIDWFLQKMQSSE

FRKILVSKVEEKKDDKELKNAVKNIKNFITGFEKNSVEGHRHRKVYFENI

KSDITKDNQLDSIKKKIPSVCLSNLLGHLSNLQWKNLHRYLAKNPKQFDE

QTFGNEFLRMLKNFRHLKGSQESLAVRNLIQQLEQSQDYISILEKTPPEI

TIPPYEARTNTGMEKDQSLLLNPEKLNNLYPNWRNLIPGIIDAHPFLEKD

LEHTKLRDRKRIISPSKQDEKRDSYILQRYLDLNKKIDKFKIKKQLSFLG

QGKQLPANLIETQKEMETHFNSSLVSVLIQIASAYNKEREDAAQGIWFDN

AFSLCELSNINPPRKQKILPLLVGAILSEDFINNKDKWAKFKIFWNTHKI

GRTSLKSKCKEIEEARKNSGNAFKIDYEEALNHPEHSNNKALIKIIQTIP

DIIQAIQSHLGHNDSQALIYHNPFSLSQLYTILETKRDGFHKNCVAVTCE

NYWRSQKTEIDPEISYASRLPADSVRPFDGVLARMMQRLAYEIAMAKWEQ

IKHIPDNSSLLIPIYLEQNRFEFEESFKKIKGSSSDKTLEQAIEKQNIQW

EEKFQRIINASMNICPYKGASIGGQGEIDHIYPRSLSKKHFGVIFNSEVN

LIYCSSQGNREKKEEHYLLEHLSPLYLKHQFGTDNVSDIKNFISQNVANI

KKYISFHLLTPEQQKAARHALFLDYDDEAFKTITKFLMSQQKARVNGTQK

FLGKQIMEFLSTLADSKQLQLEFSIKQITAEEVHDHRELLSKQEPKLVKS

RQQSFPSHAIDATLTMSIGLKEFPQFSQELDNSWFINHLMPDEVHLNPVR

SKEKYNKPNISSTPLFKDSLYAERFIPVWVKGETFAIGFSEKDLFEIKPS

NKEKLFTLLKTYSTKNPGESLQELQAKSKAKWLYFPINKTLALEFLHHYF

HKEIVTPDDTTVCHFINSLRYYTKKESITVKILKEPMPVLSVKFESSKKN

VLGSFKHTIALPATKDWERLFNHPNFLALKANPAPNPKEFNEFIRKYFLS

DNNPNSDIPNNGHNIKPQKHKAVRKVFSLPVIPGNAGTMMRIRRKDNKGQ

PLYQLQTIDDTPSMGIQINEDRLVKQEVLMDAYKTRNLSTIDGINNSEGQ

AYATFDNWLTLPVSTFKPEIIKLEMKPHSKTRRYIRITQSLADFIKTIDE

ALMIKPSDSIDDPLNMPNEIVCKNKLFGNELKPRDGKMKIVSTGKIVTYE

FESDSTPQWIQTLYVTQLKKQP

In some embodiments the Cas9 protein can be N. lactamica Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 167)

MAAFKPNPMNYILGLDIGIASVGWAMVEVDEEENPIRLIDLGVRVFERAE

VPKTGDSLAMARRLARSVRRLTRRRAHRLLRARRLLKREGVLQDADFDEN

GLVKSLPNTPWQLRAAALDRKLTCLEWSAVLLHLVKHRGYLSQRKNEGET

ADKELGALLKGVADNAHALQTGDFRTPAELALNKFEKESGHIRNQRGDYS

HTFSRKDLQAELNLLFEKQKEFGNPHVSDGLKEDIETLLMAQRPALSGDA

VQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDT

ERATLMDEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEM

KAYHAISRALEKEGLKDKKSPLNLSTELQDEIGTAFSLFKTDKDITGRLK

DRVQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGKRYDEACAEIYG

DHYCKKNAEEKIYLPPIPADEIRNPVVLRALSQARKVINCVVRRYGSPAR

IHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNFVGEPKS

KDILKLRLYEQQHGKCLYSGKEINLVRLNEKGYVEIDHALPFSRTWDDSF

NNKVLVLGSENQNKGNQTPYEYFNGKDNSREWQEFKARVETSRFPRSKKQ

RILLQKFDEEGFKERNLNDTRYVNRFLCQFVADHILLTGKGKRRVFASNG

QITNLLRGFWGLRKVRIENDRHHALDAVVVACSTVAMQQKITRFVRYKEM

NAFDGKTIDKETGEVLHQKAHFPQPWEFFAQEVMIRVFGKPDGKPEFEEA

DTPEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGQGHMETVKSA

KRLDEGISVLRVPLTQLKLKGLEKMVNREREPKLYDALKAQLETHKDDPA

KAFAEPFYKYDKAGSRTQQVKAVRIEQVQKTGVWVRNHNGIADNATMVRV

DVFEKGGKYYLVPIYSWQVAKGILPDRAVVAFKDEEDWTVMDDSFEFRFV

LYANDLIKLTAKKNEFLGYFVSLNRATGAIDIRTHDTDSTKGKNGIFQSV

GVKTALSFQKNQIDELGKEIRPCRLKKRPPVR

In some embodiments the Cas9 protein can be N. meningitidis Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 168)

MAAFKPNPINYILGLDIGIASVGWAMVEIDEDENPICLIDLGVRVFERAE

VPKTGDSLAMARRLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDEN

GLIKSLPNTPWQLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGET

ADKELGALLKGVADNAHALQTGDFRTPAELALNKFEKESGHIRNQRGDYS

HTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDA

VQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDT

ERATLMDEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEM

KAYHAISRALEKEGLKDKKSPLNLSPELQDEIGTAFSLFKTDEDITGRLK

DRIQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGKRYDEACAEIYG

DHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRYGSPAR

IHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNFVGEPKS

KDILKLRLYEQQHGKCLYSGKEINLGRLNEKGYVEIDHALPFSRTWDDSF

NNKVLVLGSENQNKGNQTPYEYFNGKDNSREWQEFKARVETSRFPRSKKQ

RILLQKFDEDGFKERNLNDTRYVNRFLCQFVADRMRLTGKGKKRVFASNG

QITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQKITRFVRYKEM

NAFDGKTIDKETGEVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEA

DTPEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGQGHMETVKSA

KRLDEGVSVLRVPLTQLKLKDLEKMVNREREPKLYEALKARLEAHKDDPA

KAFAEPFYKYDKAGNRTQQVKAVRVEQVQKTGVWVRNHNGIADNATMVRV

DVFEKGDKYYLVPIYSWQVAKGILPDRAVVQGKDEEDWQLIDDSFNFKFS

LHPNDLVEVITKKARMFGYFASCHRGTGNINIRIHDLDHKIGKNGILEGI

GVKTALSFQKYQIDELGKEIRPCRLKKRPPVR

In some embodiments the Cas9 protein can be B. longum Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 169)

MLSRQLLGASHLARPVSYSYNVQDNDVHCSYGERCFMRGKRYRIGIDVGL

NSVGLAAVEVSDENSPVRLLNAQSVIHDGGVDPQKNKEAITRKNMSGVAR

RTRRMRRRKRERLHKLDMLLGKFGYPVIEPESLDKPFEEWHVRAELATRY

IEDDELRRESISIALRHMARHRGWRNPYRQVDSLISDNPYSKQYGELKEK

AKAYNDDATAAEEESTPAQLVVAMLDAGYAEAPRLRWRTGSKKPDAEGYL

PVRLMQEDNANELKQIFRVQRVPADEWKPLFRSVFYAVSPKGSAEQRVGQ

DPLAPEQARALKASLAFQEYRIANVITNLRIKDASAELRKLTVDEKQSIY

DQLVSPSSEDITWSDLCDFLGFKRSQLKGVGSLTEDGEERISSRPPRLTS

VQRIYESDNKIRKPLVAWWKSASDNEHEAMIRLLSNTVDIDKVREDVAYA

SAIEFIDGLDDDALTKLDSVDLPSGRAAYSVETLQKLTRQMLTTDDDLHE

ARKTLFNVTDSWRPPADPIGEPLGNPSVDRVLKNVNRYLMNCQQRWGNPV

SVNIEHVRSSFSSVAFARKDKREYEKNNEKRSIFRSSLSEQLRADEQMEK

VRESDLRRLEAIQRQNGQCLYCGRTITFRTCEMDHIVPRKGVGSTNTRTN

FAAVCAECNRMKSNTPFAIWARSEDAQTRGVSLAEAKKRVTMFTFNPKSY

APREVKAFKQAVIARLQQTEDDAAIDNRSIESVAWMADELHRRIDWYFNA

KQYVNSASIDDAEAETMKTTVSVFQGRVTASARRAAGIEGKIHFIGQQSK

TRLDRRHHAVDASVIAMMNTAAAQTLMERESLRESQRLIGLMPGERSWKE

YPYEGTSRYESFHLWLDNMDVLLELLNDALDNDRIAVMQSQRYVLGNSIA

HDATIHPLEKVPLGSAMSADLIRRASTPALWCALTRLPDYDEKEGLPEDS

HREIRVHDTRYSADDEMGFFASQAAQIAVQEGSADIGSAIHHARVYRCWK

TNAKGVRKYFYGMIRVFQTDLLRACHDDLFTVPLPPQSISMRYGEPRVVQ

ALQSGNAQYLGSLVVGDEIEMDFSSLDVDGQIGEYLQFFSQFSGGNLAWK

HWVVDGFFNQTQLRIRPRYLAAEGLAKAFSDDVVPDGVQKIVTKQGWLPP

VNTASKTAVRIVRRNAFGEPRLSSAHHMPCSWQWRHE

In some embodiments, the Cas9 protein can be A. muciniphila Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 170)

MSRSLTFSFDIGYASIGWAVIASASHDDADPSVCGCGTVLFPKDDCQAFK

RREYRRLRRNIRSRRVRIERIGRLLVQAQIITPEMKETSGHPAPFYLASE

ALKGHRTLAPIELWHVLRWYAHNRGYDNNASWSNSLSEDGGNGEDTERVK

HAQDLMDKHGTATMAETICRELKLEEGKADAPMEVSTPAYKNLNTAFPRL

IVEKEVRRILELSAPLIPGLTAEIIELIAQHHPLTTEQRGVLLQHGIKLA

RRYRGSLLFGQLIPRFDNRIISRCPVTWAQVYEAELKKGNSEQSARERAE

KLSKVPTANCPEFYEYRMARILCNIRADGEPLSAEIRRELMNQARQEGKL

TKASLEKAISSRLGKETETNVSNYFTLHPDSEEALYLNPAVEVLQRSGIG

QILSPSVYRIAANRLRRGKSVTPNYLLNLLKSRGESGEALEKKIEKESKK

KEADYADTPLKPKYATGRAPYARTVLKKVVEEILDGEDPTRPARGEAHPD

GELKAHDGCLYCLLDTDSSVNQHQKERRLDTMTNNHLVRHRMLILDRLLK

DLIQDFADGQKDRISRVCVEVGKELTTFSAMDSKKIQRELTLRQKSHTDA

VNRLKRKLPGKALSANLIRKCRIAMDMNWTCPFTGATYGDHELENLELEH

IVPHSFRQSNALSSLVLTWPGVNRMKGQRTGYDFVEQEQENPVPDKPNLH

ICSLNNYRELVEKLDDKKGHEDDRRRKKKRKALLMVRGLSHKHQSQNHEA

MKEIGMTEGMMTQSSHLMKLACKSIKTSLPDAHIDMIPGAVTAEVRKAWD

VFGVFKELCPEAADPDSGKILKENLRSLTHLHHALDACVLGLIPYIIPAH

HNGLLRRVLAMRRIPEKLIPQVRPVANQRHYVLNDDGRMMLRDLSASLKE

NIREQLMEQRVIQHVPADMGGALLKETMQRVLSVDGSGEDAMVSLSKKKD

GKKEKNQVKASKLVGVFPEGPSKLKALKAAIEIDGNYGVALDPKPVVIRH

IKVFKRIMALKEQNGGKPVRILKKGMLIHLTSSKDPKHAGVWRIESIQDS

KGGVKLDLQRAHCAVPKNKTHECNWREVDLISLLKKYQMKRYPTSYTGT

PR

In some embodiments, the Cas9 protein can be O. laneus Cas9 and may comprise or consist of the amino acid sequence:

(SEQ ID NO: 171)

METTLGIDLGTNSIGLALVDQEEHQILYSGVRIFPEGINKDTIGLGEKEE

SRNATRRAKRQMRRQYFRKKLRKAKLLELLIAYDMCPLKPEDVRRWKNWD

KQQKSTVRQFPDTPAFREWLKQNPYELRKQAVTEDVTRPELGRILYQMIQ

RRGFLSSRKGKEEGKIFTGKDRMVGIDETRKNLQKQTLGAYLYDIAPKNG

EKYRFRTERVRARYTLRDMYIREFEIIWQRQAGHLGLAHEQATRKKNIFL

EGSATNVRNSKLITHLQAKYGRGHVLIEDTRITVTFQLPLKEVLGGKIEI

EEEQLKFKSNESVLFWQRPLRSQKSLLSKCVFEGRNFYDPVHQKWIIAGP

TPAPLSHPEFEEFRAYQFINNIIYGKNEHLTAIQREAVFELMCTESKDFN

FEKIPKHLKLFEKFNFDDTTKVPACTTISQLRKLFPHPVWEEKREEIWHC

FYFYDDNTLLFEKLQKDYALQTNDLEKIKKIRLSESYGNVSLKAIRRINP

YLKKGYAYSTAVLLGGIRNSFGKRFEYFKEYEPEIEKAVCRILKEKNAEG

EVIRKIKDYLVHNRFGFAKNDRAFQKLYHHSQAITTQAQKERLPETGNLR

NPIVQQGLNELRRTVNKLLATCREKYGPSFKFDHIHVEMGRELRSSKTER

EKQSRQIRENEKKNEAAKVKLAEYGLKAYRDNIQKYLLYKEIEEKGGTVC

CPYTGKTLNISHTLGSDNSVQIEHIIPYSISLDDSLANKTLCDATFNREK

GELTPYDFYQKDPSPEKWGASSWEEIEDRAFRLLPYAKAQRFIRRKPQES

NEFISRQLNDTRYISKKAVEYLSAICSDVKAFPGQLTAELRHLWGLNNIL

QSAPDITFPLPVSAENHREYYVITNEQNEVIRLFPKQGETPRIEKGELLL

TGEVERKVFRCKGMQEFQTDVSDGKYWRRIKLSSSVTWSPLFAPKPISAD

GQIVLKGRIEKGVFVCNQLKQKLKTGLPDGSYWISLPVISQTFKEGESVN

NSKLTSQQVQLFGRVREGIFRCHNYQCPASGADGNFWCTLDTDTAQPAFT

PIKNAPPGVGGGQIILTGDVDDKGIFHADDDLHYELPASLPKGKYYGIFT

VESCDPTLIPIELSAPKTSKGENLIEGNIWVDEHTGEVRFDPKKNREDQR

HHAIDAIVIALSSQSLFQRLSTYNARRENKKRGLDSTEHFPSPWPGFAQD

VRQSVVPLLVSYKQNPKTLCKISKTLYKDGKKIHSCGNAVRGQLHKETVY

GQRTAPGATEKSYHIRKDIRELKTSKHIGKVVDITIRQMLLKHLQENYHI

DITQEFNIPSNAFFKEGVYRIFLPNKHGEPVPIKKIRMKEELGNAERLKD

NINQYVNPRNNHHVMIYQDADGNLKEEIVSFWSVIERQNQGQPIYQLPRE

GRNIVSILQINDTFLIGLKEEEPEVYRNDLSTLSKHLYRVQKLSGMYYTF

RHHLASTLNNEREEFRIQSLEAWKRANPVKVQIDEIGRITFLNGPLC.

Exemplary wild type Francisella tularensis subsp. novicida Cpf1 (FnCpf1) proteins of the disclosure may comprise or consist of the amino acid sequence:

(SEQ ID NO: 172)

1
MSIYQEFVNK YSLSKTLRFE LIPQGKTLEN IKARGLILDD EKRAKDYKKA KQIIDKYHQF

61
FIEEILSSVC ISEDLLQNYS DVYFKLKKSD DDNLQKDFKS AKDTIKKQIS EYIKDSEKFK

121
NLFNQNLIDA KKGQESDLIL WLKQSKDNGI ELFKANSDIT DIDEALEIIK SFKGWTTYFK

181
GFHENRKNVY SSNDIPTSII YRIVDDNLPK FLENKAKYES LKDKAPEAIN YEQIKKDLAE

241
ELTFDIDYKT SEVNQRVFSL DEVFEIANFN NYLNQSGITK FNTIIGGKFV NGENTKRKGI

301
NEYINLYSQQ INDKTLKKYK MSVLFKQILS DTESKSFVID KLEDDSDVVT TMQSFYEQIA

361
AFKTVEEKSI KETLSLLFDD LKAQKLDLSK IYFKNDKSLT DLSQQVFDDY SVIGTAVLEY

421
ITQQIAPKNL DNPSKKEQEL IAKKTEKAKY LSLETIKLAL EEFNKHRDID KQCRFEEILA

481
NFAAIPMIFD EIAQNKDNLA QISIKYQNQG KKDLLQASAE DDVKAIKDLL DQTNNLLHKL

541
KIFHISQSED KANILDKDEH FYLVFEECYF ELANIVPLYN KIRNYITQKP YSDEKFKLNF

601
ENSTLANGWD KNKEPDNTAI LFIKDDKYYL GVMNKKNNKI FDDKAIKENK GEGYKKIVYK

661
LLPGANKMLP KVFFSAKSIK FYNPSEDILR IRNHSTHTKN GSPQKGYEKF EFNIEDCRKF

721
IDFYKQSISK HPEWKDFGFR FSDTQRYNSI DEFYREVENQ GYKLTFENIS ESYIDSVVNQ

781
GKLYLFQIYN KDFSAYSKGR PNLHTLYWKA LFDERNLQDV VYKLNGEAEL FYRKQSIPKK

841
ITHPAKEAIA NKNKDNPKKE SVFEYDLIKD KRFTEDKFFF HCPITINFKS SGANKFNDEI

901
NLLLKEKAND VHILSIDRGE RHLAYYTLVD GKGNIIKQDT FNIIGNDRMK TNYHDKLAAI

961
EKDRDSARKD WKKINNIKEM KEGYLSQVVH EIAKLVIEYN AIVVFEDLNF GFKRGRFKVE

1021
KQVYQKLEKM LIEKLNYLVF KDNEFDKTGG VLRAYQLTAP FETFKKMGKQ TGIIYYVPAG

1081
FTSKICPVTG FVNQLYPKYE SVSKSQEFFS KFDKICYNLD KGYFEFSFDY KNFGDKAAKG

1141
KWTIASFGSR LINFRNSDKN HNWDTREVYP TKELEKLLKD YSIEYGHGEC IKAAICGESD

1201
KKFFAKLTSV LNTILQMRNS KTGTELDYLI SPVADVNGNF FDSRQAPKNM PQDADANGAY

1261
HIGLKGLMLL GRIKNNQEGK KLNLVIKNEE YFEFVQNRNN.
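The position-numbered listings in this section (a position marker on one line, then six blocks of ten residues) can be joined into a single contiguous string before further use. The following is a minimal sketch under stated assumptions: the function name `parse_numbered_sequence` is hypothetical, the input is the first 120 residues of SEQ ID NO: 172 as printed above, and each position marker is assumed to give the 1-based index of the first residue on the following line.

```python
def parse_numbered_sequence(text: str) -> str:
    """Join a position-numbered listing into one contiguous residue string.

    Assumption: a line holding only digits is a 1-based position marker for
    the residue that starts the next sequence line.
    """
    chunks = []
    length = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank separator lines
        if line.isdigit():
            expected = length + 1
            if int(line) != expected:
                raise ValueError(
                    f"position marker {line} does not match expected {expected}"
                )
            continue
        # Drop the block spacing and a terminal period, if present.
        chunk = line.rstrip(".").replace(" ", "")
        if not chunk.isalpha():
            raise ValueError(f"unexpected characters in sequence line: {line!r}")
        chunks.append(chunk.upper())
        length += len(chunk)
    return "".join(chunks)


# First two numbered lines of SEQ ID NO: 172, copied from the listing above.
EXCERPT = """
1
MSIYQEFVNK YSLSKTLRFE LIPQGKTLEN IKARGLILDD EKRAKDYKKA KQIIDKYHQF

61
FIEEILSSVC ISEDLLQNYS DVYFKLKKSD DDNLQKDFKS AKDTIKKQIS EYIKDSEKFK
"""

sequence = parse_numbered_sequence(EXCERPT)
print(len(sequence), sequence[:10])
```

A mismatch between a position marker and the running residue count raises an error, which may help flag a dropped or duplicated line in a transcribed listing.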

Exemplary wild type Lachnospiraceae bacterium sp. ND2006 Cpf1 (LbCpf1) proteins of the disclosure may comprise or consist of the amino acid sequence:

(SEQ ID NO: 173)

1
AASKLEKFTN CYSLSKTLRF KAIPVGKTQE NIDNKRLLVE DEKRAEDYKG VKKLLDRYYL

61
SFINDVLHSI KLKNLNNYIS LFRKKTRTEK ENKELENLEI NLRKEIAKAF KGAAGYKSLF

121
KKDIIETILP EAADDKDEIA LVNSFNGFTT AFTGFFDNRE NMFSEEAKST SIAFRCINEN

181
LTRYISNMDI FEKVDAIFDK HEVQEIKEKI LNSDYDVEDF FEGEFFNFVL TQEGIDVYNA

241
IIGGFVTESG EKIKGLNEYI NLYNAKTKQA LPKFKPLYKQ VLSDRESLSF YGEGYTSDEE

301
VLEVFRNTLN KNSEIFSSIK KLEKLFKNFD EYSSAGIFVK NGPAISTISK DIFGEWNLIR

361
DKWNAEYDDI HLKKKAVVTE KYEDDRRKSF KKIGSFSLEQ LQEYADADLS VVEKLKEIII

421
QKVDEIYKVY GSSEKLFDAD FVLEKSLKKN DAVVAIMKDL LDSVKSFENY IKAFFGEGKE

481
TNRDESFYGD FVLAYDILLK VDHIYDAIRN YVTQKPYSKD KFKLYFQNPQ FMGGWDKDKE

541
TDYRATILRY GSKYYLAIMD KKYAKCLQKI DKDDVNGNYE KINYKLLPGP NKMLPKVFFS

601
KKWMAYYNPS EDIQKIYKNG TFKKGDMFNL NDCHKLIDFF KDSISRYPKW SNAYDFNFSE

661
TEKYKDIAGF YREVEEQGYK VSFESASKKE VDKLVEEGKL YMFQIYNKDF SDKSHGTPNL

721
HTMYFKLLFD ENNHGQIRLS GGAELFMRRA SLKKEELVVH PANSPIANKN PDNPKKTTTL

781
SYDVYKDKRF SEDQYELHIP IAINKCPKNI FKINTEVRVL LKHDDNPYVI GIDRGERNLL

841
YIVVVDGKGN IVEQYSLNEI INNFNGIRIK TDYHSLLDKK EKERFEARQN WTSIENIKEL

901
KAGYISQVVH KICELVEKYD AVIALEDLNS GFKNSRVKVE KQVYQKFEKM LIDKLNYMVD

961
KKSNPCATGG ALKGYQITNK FESFKSMSTQ NGFIFYIPAW LTSKIDPSTG FVNLLKTKYT

1021
SIADSKKFIS SFDRIMYVPE EDLFEFALDY KNFSRTDADY IKKWKLYSYG NRIRIFAAAK

1081
KNNVFAWEEV CLTSAYKELF NKYGINYQQG DIRALLCEQS DKAFYSSFMA LMSLMLQMRN

1141
SITGRTDVDF LISPVKNSDG IFYDSRNYEA QENAILPKNA DANGAYNIAR KVLWAIGQFK

1201
KAEDEKLDKV KIAISNKEWL EYAQTSVK.

Exemplary wild type Acidaminococcus sp. BV3L6 Cpf1 (AsCpf1) proteins of the disclosure may comprise or consist of the amino acid sequence:

(SEQ ID NO: 174)

1
MTQFEGFTNL YQVSKTLRFE LIPQGKTLKH IQEQGFIEED KARNDHYKEL KPIIDRIYKT

61
YADQCLQLVQ LDWENLSAAI DSYRKEKTEE TRNALIEEQA TYRNAIHDYF IGRTDNLTDA

121
INKRHAEIYK GLFKAELFNG KVLKQLGTVT TTEHENALLR SFDKFTTYFS GFYENRKNVF

181
SAEDISTAIP HRIVQDNFPK FKENCHIFTR LITAVPSLRE HFENVKKAIG IFVSTSIEEV

241
FSFPFYNQLL TQTQIDLYNQ LLGGISREAG TEKIKGLNEV LNLAIQKNDE TAHIIASLPH

301
RFIPLFKQIL SDRNTLSFIL EEFKSDEEVI QSFCKYKTLL RNENVLETAE ALFNELNSID

361
LTHIFISHKK LETISSALCD HWDTLRNALY ERRISELTGK ITKSAKEKVQ RSLKHEDINL

421
QEIISAAGKE LSEAFKQKTS EILSHAHAAL DQPLPTTLKK QEEKEILKSQ LDSLLGLYHL

481
LDWFAVDESN EVDPEFSARL TGIKLEMEPS LSFYNKARNY ATKKPYSVEK FKLNFQMPTL

541
ASGWDVNKEK NNGAILFVKN GLYYLGIMPK QKGRYKALSF EPTEKTSEGF DKMYYDYFPD

601
AAKMIPKCST QLKAVTAHFQ THTTPILLSN NFIEPLEITK EIYDLNNPEK EPKKFQTAYA

661
KKTGDQKGYR EALCKWIDFT RDFLSKYTKT TSIDLSSLRP SSQYKDLGEY YAELNPLLYH

721
ISFQRIAEKE IMDAVETGKL YLFQIYNKDF AKGHHGKPNL HTLYWTGLFS PENLAKTSIK

781
LNGQAELFYR PKSRMKRMAH RLGEKMLNKK LKDQKTPIPD TLYQELYDYV NHRLSHDLSD

841
EARALLPNVI TKEVSHEIIK DRRFTSDKFF FHVPITLNYQ AANSPSKFNQ RVNAYLKEHP

901
ETPIIGIDRG ERNLIYITVI DSTGKILEQR SLNTIQQFDY QKKLDNREKE RVAARQAWSV

961
VGTIKDLKQG YLSQVIHEIV DLMIHYQAVV VLENLNFGFK SKRTGIAEKA VYQQFEKMLI

1021
DKLNCLVLKD YPAEKVGGVL NPYQLTDQFT SFAKMGTQSG FLFYVPAPYT SKIDPLTGFV

1081
DPFVWKTIKN HESRKHFLEG FDFLHYDVKT GDFILHFKMN RNLSFQRGLP GFMPAWDIVF

1141
EKNETQFDAK GTPFIAGKRI VPVIENHRFT GRYRDLYPAN ELIALLEEKG IVFRDGSNIL

1201
PKLLENDDSH AIDTMVALIR SVLQMRNSNA ATGEDYINSP VRDLNGVCFD SRFQNPEWPM

1261
DADANGAYHI ALKGQLLLNH LKESKDLKLQ NGISNQDWLA YIQELRN.

In some embodiments of the compositions of the disclosure, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a CRISPR Cas protein. In some embodiments, the CRISPR Cas protein comprises a Type VI CRISPR Cas protein or portion thereof. In some embodiments, the Type VI CRISPR Cas protein comprises a Cas13 protein or portion thereof. Exemplary Cas13 proteins of the disclosure may be isolated or derived from any species, including, but not limited to, a bacterium or an archaeon. Exemplary Cas13 proteins of the disclosure may be isolated or derived from any species, including, but not limited to, Leptotrichia wadei, Listeria seeligeri serovar 1/2b (strain ATCC 35967/DSM 20751/CIP 100100/SLCC 3954), Lachnospiraceae bacterium, Clostridium aminophilum DSM 10710, Carnobacterium gallinarum DSM 4847, Paludibacter propionicigenes WB4, Listeria weihenstephanensis FSL R9-0317, Listeriaceae bacterium FSL M6-0635 (Listeria newyorkensis), Leptotrichia wadei F0279, Rhodobacter capsulatus SB 1003, Rhodobacter capsulatus R121, Rhodobacter capsulatus DE442 and Corynebacterium ulcerans. Exemplary Cas13 proteins of the disclosure may be DNA nuclease inactivated. Exemplary Cas13 proteins of the disclosure include, but are not limited to, Cas13a, Cas13b, Cas13c, Cas13d and orthologs thereof. Exemplary Cas13b proteins of the disclosure include, but are not limited to, subtypes 1 and 2 referred to herein as Csx27 and Csx28, respectively.

Exemplary Cas13a proteins include, but are not limited to:

Cas13a number | Cas13a abbreviation | Organism name | Accession number | Direct Repeat sequence
Cas13a1 | LshCas13a | Leptotrichia shahii | WP_018451595.1 | CCACCCCAATATCGAAGGGGACTAAAAC (SEQ ID NO: 175)
Cas13a2 | LwaCas13a | Leptotrichia wadei | WP_021746774.1 | GATTTAGACTACCCCAAAAACGAAGGGGACTAAAAC (SEQ ID NO: 176)
Cas13a3 | LseCas13a | Listeria seeligeri | WP_012985477.1 | GTAAGAGACTACCTCTATATGAAAGAGGACTAAAAC (SEQ ID NO: 177)
Cas13a4 | LbmCas13a | Lachnospiraceae bacterium MA2020 | WP_044921188.1 | GTATTGAGAAAAGCCAGATATAGTTGGCAATAGAC (SEQ ID NO: 178)
Cas13a5 | LbnCas13a | Lachnospiraceae bacterium NK4A179 | WP_022785443.1 | GTTGATGAGAAGAGCCCAAGATAGAGGGCAATAAC (SEQ ID NO: 179)
Cas13a6 | CamCas13a | [Clostridium] aminophilum DSM 10710 | WP_031473346.1 | GTCTATTGCCCTCTATATCGGGCTGTTCTCCAAAC (SEQ ID NO: 180)
Cas13a7 | CgaCas13a | Carnobacterium gallinarum DSM 4847 | WP_034560163.1 | ATTAAAGACTACCTCTAAATGTAAGAGGACTATAAC (SEQ ID NO: 181)
Cas13a8 | Cga2Cas13a | Carnobacterium gallinarum DSM 4847 | WP_034563842.1 | AATATAAACTACCTCTAAATGTAAGAGGACTATAAC (SEQ ID NO: 182)
Cas13a9 | Pprcas13a | Paludibacter propionicigenes WB4 | WP_013443710.1 | CTTGTGGATTATCCCAAAATTGAAGGGAACTACAAC (SEQ ID NO: 183)
Cas13a10 | LweCas13a | Listeria weihenstephanensis FSL R9-0317 | WP_036059185.1 | GATTTAGAGTACCTCAAAATAGAAGAGGTCTAAAAC (SEQ ID NO: 184)
Cas13a11 | LbfCas13a | Listeriaceae bacterium FSL M6-0635 (Listeria newyorkensis) | WP_036091002.1 | GATTTAGAGTACCTCAAAACAAAAGAGGACTAAAAC (SEQ ID NO: 185)
Cas13a12 | Lwa2cas13a | Leptotrichia wadei F0279 | WP_021746774.1 | GATATAGATAACCCCAAAAACGAAGGGATCTAAAAC (SEQ ID NO: 186)
Cas13a13 | RcsCas13a | Rhodobacter capsulatus SB 1003 | WP_013067728.1 | GCCTCACATCACCGCCAAGACGACGGCGGACTGAAC (SEQ ID NO: 187)
Cas13a14 | RcrCas13a | Rhodobacter capsulatus R121 | WP_023911507.1 | GCCTCACATCACCGCCAAGACGACGGCGGACTGAAC (SEQ ID NO: 188)
Cas13a15 | RcdCas13a | Rhodobacter capsulatus DE442 | WP_023911507.1 | GCCTCACATCACCGCCAAGACGACGGCGGACTGAAC (SEQ ID NO: 189)
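As an illustration of how a direct repeat from the Cas13a listing might be combined with a spacer to form a guide RNA, the sketch below transcribes a listed repeat into the RNA alphabet and concatenates it with a spacer. Several points are assumptions and are not taken from this section: the function names `to_rna` and `assemble_guide` are hypothetical, the spacer is an arbitrary placeholder rather than a real target-derived sequence, and the default ordering (direct repeat followed by spacer) is exposed as a parameter because the arrangement is not specified here.

```python
# Direct repeats copied from the Cas13a listing (DNA alphabet, as printed).
DIRECT_REPEATS = {
    "LshCas13a": "CCACCCCAATATCGAAGGGGACTAAAAC",          # SEQ ID NO: 175
    "LwaCas13a": "GATTTAGACTACCCCAAAAACGAAGGGGACTAAAAC",  # SEQ ID NO: 176
    "LseCas13a": "GTAAGAGACTACCTCTATATGAAAGAGGACTAAAAC",  # SEQ ID NO: 177
}


def to_rna(dna: str) -> str:
    """Rewrite a DNA-alphabet sequence in the RNA alphabet (T -> U)."""
    seq = dna.upper()
    if set(seq) - set("ACGT"):
        raise ValueError(f"non-DNA characters in {dna!r}")
    return seq.replace("T", "U")


def assemble_guide(abbreviation: str, spacer_rna: str,
                   repeat_first: bool = True) -> str:
    """Concatenate a direct repeat and a spacer into one guide string.

    Assumption: repeat_first=True places the direct repeat before the
    spacer; set it to False for the opposite arrangement.
    """
    repeat = to_rna(DIRECT_REPEATS[abbreviation])
    spacer = spacer_rna.upper()
    if not spacer or set(spacer) - set("ACGU"):
        raise ValueError(f"spacer must be a non-empty RNA string: {spacer_rna!r}")
    return repeat + spacer if repeat_first else spacer + repeat


# Hypothetical placeholder spacer (28 nt); not derived from any target RNA.
PLACEHOLDER_SPACER = "ACGU" * 7

guide = assemble_guide("LwaCas13a", PLACEHOLDER_SPACER)
print(guide)
```

In practice the spacer would be chosen against the target RNA of interest; the placeholder only shows the string handling.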

Exemplary wild type Cas13a proteins of the disclosure may comprise or consist of the amino acid sequence:

(SEQ ID NO: 190)

1
MGNLFGHKRW YEVRDKKDFK IKRKVKVKRN YDGNKYILNI NENNNKEKID NNKFIRKYIN

61
YKKNDNILKE FTRKFHAGNI LFKLKGKEGI IRIENNDDFL ETEEVVLYIE AYGKSEKLKA

121
LGITKKKIID EAIRQGITKD DKKIEIKRQE NEEEIEIDIR DEYTNKTLND CSIILRIIEN

181
DELETKKSIY EIFKNINMSL YKIIEKIIEN ETEKVFENRY YEEHLREKLL KDDKIDVILT

241
NFMEIREKIK SNLEILGFVK FYLNVGGDKK KSKNKKMLVE KILNINVDLT VEDIADFVIK

301
ELEFWNITKR IEKVKKVNNE FLEKRRNRTY IKSYVLLDKH EKFKIERENK KDKIVKFFVE

361
NIKNNSIKEK IEKILAEFKI DELIKKLEKE LKKGNCDTEI FGIFKKHYKV NFDSKKFSKK

421
SDEEKELYKI IYRYLKGRIE KILVNEQKVR LKKMEKIEIE KILNESILSE KILKRVKQYT

481
LEHIMYLGKL RHNDIDMTTV NTDDFSRLHA KEELDLELIT FFASTNMELN KIFSRENINN

541
DENIDFFGGD REKNYVLDKK ILNSKIKIIR DLDFIDNKNN ITNNFIRKFT KIGTNERNRI

601
LHAISKERDL QGTQDDYNKV INIIQNLKIS DEEVSKALNL DVVFKDKKNI ITKINDIKIS

661
EENNNDIKYL PSFSKVLPEI LNLYRNNPKN EPFDTIETEK IVLNALIYVN KELYKKLILE

721
DDLEENESKN IFLQELKKTL GNIDEIDENI IENYYKNAQI SASKGNNKAI KKYQKKVIEC

781
YIGYLRKNYE ELFDFSDFKM NIQEIKKQIK DINDNKTYER ITVKTSDKTI VINDDFEYII

841
SIFALLNSNA VINKIRNRFF ATSVWLNTSE YQNIIDILDE IMQLNTLRNE CITENWNLNL

901
EEFIQKMKEI EKDFDDFKIQ TKKEIFNNYY EDIKNNILTE FKDDINGCDV LEKKLEKIVI

961
FDDETKFEID KKSNILQDEQ RKLSNINKKD LKKKVDQYIK DKDQEIKSKI LCRIIFNSDF

1021
LKKYKKEIDN LIEDMESENE NKFQEIYYPK ERKNELYIYK KNLFLNIGNP NFDKIYGLIS

1081
NDIKMADAKF LFNIDGKNIR KNKISEIDAI LKNLNDKLNG YSKEYKEKYI KKLKENDDFF

1141
AKNIQNKNYK SFEKDYNRVS EYKKIRDLVE FNYLNKIESY LIDINWKLAI QMARFERDMH

1201
YIVNGLRELG IIKLSGYNTG ISRAYPKRNG SDGFYTTTAY YKFFDEESYK KFEKICYGFG

1261
IDLSENSEIN KPENESIRNY ISHFYIVRNP FADYSIAEQI DRVSNLLSYS TRYNNSTYAS

1321
VFEVFKKDVN LDYDELKKKF KLIGNNDILE RLMKPKKVSV LELESYNSDY IKNLIIELLT

1381
KIENTNDTL

Exemplary Cas13b proteins include, but are not limited to:

Species | Cas13b Accession | Cas13b Size (aa)
Paludibacter propionicigenes WB4 | WP_013446107.1 | 1155
Prevotella sp. P5-60 | WP_044074780.1 | 1091
Prevotella sp. P4-76 | WP_044072147.1 | 1091
Prevotella sp. P5-125 | WP_044065294.1 | 1091
Prevotella sp. P5-119 | WP_042518169.1 | 1091
Capnocytophaga canimorsus Cc5 | WP_013997271.1 | 1200
Phaeodactylibacter xiamenensis | WP_044218239.1 | 1132
Porphyromonas gingivalis W83 | WP_005873511.1 | 1136
Porphyromonas gingivalis F0570 | WP_021665475.1 | 1136
Porphyromonas gingivalis ATCC 33277 | WP_012458151.1 | 1136
Porphyromonas gingivalis F0185 | ERJ81987.1 | 1136
Porphyromonas gingivalis F0185 | WP_021677657.1 | 1136
Porphyromonas gingivalis SJD2 | WP_023846767.1 | 1136
Porphyromonas gingivalis F0568 | ERJ65637.1 | 1136
Porphyromonas gingivalis W4087 | ERJ87335.1 | 1136
Porphyromonas gingivalis W4087 | WP_021680012.1 | 1136
Porphyromonas gingivalis F0568 | WP_021663197.1 | 1136
Porphyromonas gingivalis | WP_061156637.1 | 1136
Porphyromonas gulae | WP_039445055.1 | 1136
Bacteroides pyogenes F0041 | ERI81700.1 | 1116
Bacteroides pyogenes JCM 10003 | WP_034542281.1 | 1116
Alistipes sp. ZOR0009 | WP_047447901.1 | 954
Flavobacterium branchiophilum FL-15 | WP_014084666.1 | 1151
Prevotella sp. MA2016 | WP_036929175.1 | 1323
Myroides odoratimimus CCUG 10230 | EHO06562.1 | 1160
Myroides odoratimimus CCUG 3837 | EKB06014.1 | 1158
Myroides odoratimimus CCUG 3837 | WP_006265509.1 | 1158
Myroides odoratimimus CCUG 12901 | WP_006261414.1 | 1158
Myroides odoratimimus CCUG 12901 | EHO08761.1 | 1158
Myroides odoratimimus (NZ_CP013690.1) | WP_058700060.1 | 1160
Bergeyella zoohelcum ATCC 43767 | EKB54193.1 | 1225
Capnocytophaga cynodegmi | WP_041989581.1 | 1219
Bergeyella zoohelcum ATCC 43767 | WP_002664492.1 | 1225
Flavobacterium sp. 316 | WP_045968377.1 | 1156
Psychroflexus torquis ATCC 700755 | WP_015024765.1 | 1146
Flavobacterium columnare ATCC 49512 | WP_014165541.1 | 1180
Flavobacterium columnare | WP_060381855.1 | 1214
Flavobacterium columnare | WP_063744070.1 | 1214
Flavobacterium columnare | WP_065213424.1 | 1215
Chryseobacterium sp. YR477 | WP_047431796.1 | 1146
Riemerella anatipestifer ATCC 11845 = DSM 15868 | WP_004919755.1 | 1096
Riemerella anatipestifer RA-CH-2 | WP_015345620.1 | 949
Riemerella anatipestifer | WP_049354263.1 | 949
Riemerella anatipestifer | WP_061710138.1 | 951
Riemerella anatipestifer | WP_064970887.1 | 1096
Prevotella saccharolytica F0055 | EKY00089.1 | 1151
Prevotella saccharolytica JCM 17484 | WP_051522484.1 | 1152
Prevotella buccae ATCC 33574 | EFU31981.1 | 1128
Prevotella buccae ATCC 33574 | WP_004343973.1 | 1128
Prevotella buccae D17 | WP_004343581.1 | 1128
Prevotella sp. MSX73 | WP_007412163.1 | 1128
Prevotella pallens ATCC 700821 | EGQ18444.1 | 1126
Prevotella pallens ATCC 700821 | WP_006044833.1 | 1126
Prevotella intermedia ATCC 25611 = DSM 20706 | WP_036860899.1 | 1127
Prevotella intermedia | WP_061868553.1 | 1121
Prevotella intermedia 17 | AFJ07523.1 | 1135
Prevotella intermedia | WP_050955369.1 | 1133
Prevotella intermedia | BAU18623.1 | 1134
Prevotella intermedia ZT | KJJ86756.1 | 1126
Prevotella aurantiaca JCM 15754 | WP_025000926.1 | 1125
Prevotella pleuritidis F0068 | WP_021584635.1 | 1140
Prevotella pleuritidis JCM 14110 | WP_036931485.1 | 1117
Prevotella falsenii DSM 22864 = JCM 15124 | WP_036884929.1 | 1134
Porphyromonas gulae | WP_039418912.1 | 1176
Porphyromonas sp. COT-052 OH4946 | WP_039428968.1 | 1176
Porphyromonas gulae | WP_039442171.1 | 1175
Porphyromonas gulae | WP_039431778.1 | 1176
Porphyromonas gulae | WP_046201018.1 | 1176
Porphyromonas gulae | WP_039434803.1 | 1176
Porphyromonas gulae | WP_039419792.1 | 1120
Porphyromonas gulae | WP_039426176.1 | 1120
Porphyromonas gulae | WP_039437199.1 | 1120
Porphyromonas gingivalis TDC60 | WP_013816155.1 | 1120
Porphyromonas gingivalis ATCC 33277 | WP_012458414.1 | 1120
Porphyromonas gingivalis A7A1-28 | WP_058019250.1 | 1176
Porphyromonas gingivalis JCVI SC001 | EOA10535.1 | 1176
Porphyromonas gingivalis W50 | WP_005874195.1 | 1176
Porphyromonas gingivalis | WP_052912312.1 | 1176
Porphyromonas gingivalis AJW4 | WP_053444417.1 | 1120
Porphyromonas gingivalis | WP_039417390.1 | 1120
Porphyromonas gingivalis | WP_061156470.1 | 1120
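Because the Cas13b listing pairs each accession with a protein size, it can be handled as simple records, for example to shortlist shorter orthologs. The sketch below is illustrative only: the record type `Cas13bEntry`, the helper names, the pipe-delimited input layout, and the 1000 aa cut-off are all hypothetical choices, and the five rows are a small sample copied from the listing above.

```python
from typing import List, NamedTuple


class Cas13bEntry(NamedTuple):
    species: str
    accession: str
    size_aa: int


# A few rows copied from the Cas13b listing, one entry per line,
# written as: species | accession | size (aa).
SAMPLE_ROWS = """\
Paludibacter propionicigenes WB4 | WP_013446107.1 | 1155
Alistipes sp. ZOR0009 | WP_047447901.1 | 954
Prevotella sp. MA2016 | WP_036929175.1 | 1323
Riemerella anatipestifer RA-CH-2 | WP_015345620.1 | 949
Bergeyella zoohelcum ATCC 43767 | WP_002664492.1 | 1225
"""


def parse_rows(text: str) -> List[Cas13bEntry]:
    """Parse 'species | accession | size' lines into records."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        species, accession, size = (field.strip() for field in line.split("|"))
        entries.append(Cas13bEntry(species, accession, int(size)))
    return entries


def shorter_than(entries: List[Cas13bEntry], max_aa: int) -> List[Cas13bEntry]:
    """Return entries below max_aa residues, shortest first."""
    return sorted((e for e in entries if e.size_aa < max_aa),
                  key=lambda e: e.size_aa)


entries = parse_rows(SAMPLE_ROWS)
for entry in shorter_than(entries, max_aa=1000):  # hypothetical cut-off
    print(entry.accession, entry.size_aa)
```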

Exemplary wild type Bergeyella zoohelcum ATCC 43767 Cas13b (BzCas13b) proteins of the disclosure may comprise or consist of the amino acid sequence:

(SEQ ID NO: 191)

1
menktslgnn iyynpfkpqd ksyfagyfna amentdsvfr elgkrlkgke ytsenffdai

61
fkenislvey eryvkllsdy fpmarlldkk evpikerken fkknfkgiik avrdlrnfyt

121
hkehgeveit deifgvldem lkstvltvkk kkvktdktke ilkksiekql dilcqkkley

181
lrdtarkiee krrnqrerge kelvapfkys dkrddliaai yndafdvyid kkkdslkess

241
kakyntksdp qqeegdlkip iskngvvfll slfltkqeih afkskiagfk atvideatvs

301
eatvshgkns icfmatheif shlaykklkr kvrtaeinyg eaenaeqlsv yaketlmmqm

361
ldelskvpdv vyqnlsedvq ktfiedwney lkenngdvgt meeeqvihpv irkryedkfn

421
yfairfldef aqfptlrfqv hlgnylhdsr pkenlisdrr ikekitvfgr lselehkkal

481
fikntetned rehyweifpn pnydfpkeni svndkdfpia gsildrekqp vagkigikvk

541
llnqqyvsev dkavkahqlk qrkaskpsiq niieeivpin esnpkeaivf ggqptaylsm

601
ndihsilyef fdkwekkkek lekkgekelr keigkelekk ivgkiqaqiq qiidkdtnak

661
ilkpyqdgns taidkeklik dlkqeqnilq klkdeqtvre keyndfiayq dknreinkvr

721
drnhkqylkd nlkrkypeap arkevlyyre kgkvavwlan dikrfmptdf knewkgeqhs

781
llqkslayye qckeelknll pekvfqhlpf klggyfqqky lyqfytcyld krleyisglv

841
qqaenfksen kvfkkvenec fkflkkqnyt hkeldarvqs ilgypifler gfmdekptii

901
kgktfkgnea lfadwfryyk eyqnfqtfyd tenyplvele kkqadrkrkt kiyqqkkndv

961
ftllmakhif ksvfkqdsid qfsledlyqs reerlgnqer arqtgerntn yiwnktvdlk

1021
lcdgkitven vklknvgdfi kyeydqrvqa flkyeeniew qaflikeske eenypyvver

1081
eieqyekvrr eellkevhli eeyilekvkd keilkkgdnq nfkyyilngl lkqlknedve

1141
sykvfnlnte pedvninqlk qeatdleqka fvltyirnkf ahnqlpkkef wdycqekygk

1201
iekektyaey faevfkkeke alik.

In some embodiments of the compositions of the disclosure, the sequence encoding the first RNA binding protein, or RNA-guided target RNA binding protein, comprises a sequence isolated or derived from a CasRX/Cas13d protein. CasRX/Cas13d is an effector of the type VI-D CRISPR-Cas systems. In some embodiments, the CasRX/Cas13d protein is an RNA-guided RNA endonuclease enzyme that can cut or bind RNA. In some embodiments, the CasRX/Cas13d protein can include one or more higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domains. In some embodiments, the CasRX/Cas13d protein can include either a wild-type or mutated HEPN domain. In some embodiments, the CasRX/Cas13d protein includes a mutated HEPN domain that cannot cut RNA but can process guide RNA. In some embodiments, the CasRX/Cas13d protein does not require a protospacer flanking sequence.
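As a rough aid for locating candidate HEPN catalytic sites in a listed sequence, the sketch below scans a protein string for an arginine followed by four to six residues and then a histidine. The R-X(4-6)-H pattern is general background on HEPN domains and is not stated in this section; the function name `find_hepn_like_motifs` and its parameters are hypothetical; and the example input is residues 1261 to 1290 of SEQ ID NO: 190 as printed above, not a CasRx/Cas13d sequence. A match is only a candidate and does not by itself establish a HEPN domain.

```python
import re
from typing import List, Tuple


def find_hepn_like_motifs(protein: str, offset: int = 1,
                          min_gap: int = 4, max_gap: int = 6
                          ) -> List[Tuple[int, str]]:
    """Return (position, matched text) for each R-X(min_gap..max_gap)-H hit.

    `offset` is the 1-based position of the first residue of `protein`
    within the full-length sequence. Overlapping hits are reported, and
    the shortest gap is preferred at each start position.
    """
    pattern = re.compile(r"(?=(R.{%d,%d}?H))" % (min_gap, max_gap))
    return [(m.start() + offset, m.group(1))
            for m in pattern.finditer(protein.upper())]


# Residues 1261-1290 of SEQ ID NO: 190, with block spacing removed.
EXCERPT = "IDLSENSEINKPENESIRNYISHFYIVRNP"

hits = find_hepn_like_motifs(EXCERPT, offset=1261)
print(hits)
```

A scan of this kind could be used to pick candidate residues when designing a mutated HEPN domain, subject to confirmation by alignment or experiment.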