PROGRAMMABLE DNA BASE EDITING BY NME2CAS9-DEAMINASE FUSION PROTEINS

Abstract
The present invention is related to the field of gene editing. In particular, the gene editing is directed toward single nucleotide base editing. For example, such single nucleotide base editing results in a conversion of a C•G base pair to a T•A base pair. The high accuracy and precision of the presently disclosed single nucleotide base gene editor are accomplished by an NmeCas9 nuclease that is fused to a nucleotide deaminase protein. The compact nature of the NmeCas9, coupled with a larger number of compatible protospacer adjacent motifs, provides the Cas9 fusion constructs contemplated herein with a gene editing window that can edit sites that are not targetable by other conventional SpyCas9 base editor platforms.
Description
FIELD OF THE INVENTION

The present invention is related to the field of gene editing. In particular, the gene editing is directed toward single nucleotide base editing. For example, such single nucleotide base editing results in a conversion of a C•G base pair to a T•A base pair. The high accuracy and precision of the presently disclosed single nucleotide base gene editor are accomplished by an NmeCas9 nuclease that is fused to a nucleotide deaminase protein. The compact nature of the NmeCas9, coupled with a larger number of compatible protospacer adjacent motifs, provides the Cas9 fusion constructs contemplated herein with a gene editing window that can edit sites that are not targetable by other conventional SpyCas9 base editor platforms.


BACKGROUND

Many human diseases arise due to the mutation of a single base. The ability to correct such genetic aberrations is paramount in treating these genetic disorders. Clustered regularly interspaced short palindromic repeats (CRISPR) along with CRISPR associated (Cas) proteins comprise an RNA-guided adaptive immune system in archaea and bacteria. These systems provide immunity by targeting and inactivating nucleic acids that originate from foreign genetic elements.


SpyCas9 base editing platforms cannot be used to target all single-base mutations due to their limited editing windows. The editing window is constrained in part by the requirement for an NGG PAM and by the requirement that the edited base(s) be a very precise distance from the PAM. SpyCas9 is also intrinsically associated with high off-targeting effects in genome editing.


What is needed in the art is a highly accurate Cas9 single base editing platform having a programmable target specificity due to recognition of a diverse population of PAM sites.


SUMMARY OF THE INVENTION

The present invention is related to the field of gene editing. In particular, the gene editing is directed toward single nucleotide base editing. For example, such single nucleotide base editing results in a conversion of a C•G base pair to a T•A base pair. The high accuracy and precision of the presently disclosed single nucleotide base gene editor are accomplished by an NmeCas9 nuclease that is fused to a nucleotide deaminase protein. The compact nature of the NmeCas9, coupled with a larger number of compatible protospacer adjacent motifs, provides the Cas9 fusion constructs contemplated herein with a gene editing window that is superior to that of other conventional SpyCas9 base editor platforms.


In one embodiment, the present invention contemplates a mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for an N4CC nucleotide sequence. In one embodiment, said protein is Nme2Cas9. In one embodiment, said protein further comprises a nuclear localization signal protein. In one embodiment, said nucleotide deaminase is a cytidine deaminase. In one embodiment, said nucleotide deaminase is an adenosine deaminase. In one embodiment, the protein further comprises a uracil glycosylase inhibitor. In one embodiment, said nuclear localization signal protein includes, but is not limited to, nucleoplasmin (NLS) and/or SV40 NLS and/or C-myc NLS. In one embodiment, said binding region is a protospacer accessory motif interacting domain. In one embodiment, said protospacer accessory motif interacting domain comprises said mutation. In one embodiment, said mutation is a D16A mutation. In one embodiment, said mutated NmeCas9 protein further comprises CBE4. In one embodiment, said mutated NmeCas9 protein further comprises a linker. In one embodiment, said linker is a 73aa linker. In one embodiment, said linker is a 3×HA-tag.


In one embodiment, the present invention contemplates a construct, wherein said construct is an optimized nNme2Cas9-ABEmax.


In one embodiment, the present invention contemplates a construct, wherein said construct is a nNme2Cas9-CBE4.


In one embodiment, the present invention contemplates a construct, wherein said construct is a YE1-BE3-nNme2Cas9 (D16A)-UGI.


In one embodiment, the present invention contemplates an adeno-associated virus comprising a mutated NmeCas9 protein, said mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for an N4CC nucleotide sequence. In one embodiment, said virus is an adeno-associated virus 8. In one embodiment, said virus is an adeno-associated virus 6. In one embodiment, said protein is Nme2Cas9. In one embodiment, said protein further comprises a nuclear localization signal protein. In one embodiment, said nucleotide deaminase is a cytidine deaminase. In one embodiment, said nucleotide deaminase is an adenosine deaminase. In one embodiment, the protein further comprises a uracil glycosylase inhibitor. In one embodiment, the nuclear localization signal protein includes, but is not limited to, nucleoplasmin (NLS) and/or SV40 NLS and/or C-myc NLS. In one embodiment, said binding region is a protospacer accessory motif interacting domain. In one embodiment, said protospacer accessory motif interacting domain comprises said mutation. In one embodiment, said mutation is a D16A mutation. In one embodiment, said mutated NmeCas9 protein further comprises CBE4. In one embodiment, said mutated NmeCas9 protein further comprises a linker. In one embodiment, said linker is a 73aa linker. In one embodiment, said linker is a 3×HA-tag.


In one embodiment, the present invention contemplates a construct, wherein said construct is an optimized nNme2Cas9-ABEmax.


In one embodiment, the present invention contemplates a construct, wherein said construct is a nNme2Cas9-CBE4.


In one embodiment, the present invention contemplates a construct, wherein said construct is a YE1-BE3-nNme2Cas9 (D16A)-UGI.


In one embodiment, the present invention contemplates a method, comprising: a) providing; i) a nucleotide sequence comprising a gene with a mutated single base, wherein said gene is flanked by an N4CC nucleotide sequence; ii) a mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for said N4CC nucleotide sequence; b) contacting said nucleotide sequence with said mutated NmeCas9 protein under conditions such that said binding region attaches to said N4CC nucleotide sequence; and c) replacing said mutated single base with a wild type base with said mutated NmeCas9 protein. In one embodiment, said protein is Nme2Cas9. In one embodiment, said protein further comprises a nuclear localization signal protein. In one embodiment, said nucleotide deaminase is a cytidine deaminase. In one embodiment, said nucleotide deaminase is an adenosine deaminase. In one embodiment, the protein further comprises a uracil glycosylase inhibitor. In one embodiment, the nuclear localization signal protein includes, but is not limited to, nucleoplasmin (NLS) and/or SV40 NLS and/or C-myc NLS. In one embodiment, said binding region is a protospacer accessory motif interacting domain. In one embodiment, said protospacer accessory motif interacting domain comprises said mutation. In one embodiment, said mutation is a D16A mutation. In one embodiment, said mutated NmeCas9 protein further comprises CBE4. In one embodiment, said mutated NmeCas9 protein further comprises a linker. In one embodiment, said linker is a 73aa linker. In one embodiment, said linker is a 3×HA-tag. In one embodiment, said gene encodes a tyrosinase. In one embodiment, said gene is Fah. In one embodiment, said gene is c-fos.


In one embodiment, the present invention contemplates a method, comprising: a) providing; i) a patient comprising a nucleotide sequence comprising a gene with a mutated single base, wherein said gene is flanked by an N4CC nucleotide sequence, wherein said mutated gene causes a genetically-based medical condition; ii) an adeno-associated virus comprising a mutated NmeCas9 protein, said mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for said N4CC nucleotide sequence; b) treating said patient with said adeno-associated virus under conditions such that said mutated NmeCas9 protein replaces said mutated single base with a wild type single base, such that said genetically-based medical condition does not develop. In one embodiment, said gene encodes a tyrosinase protein. In one embodiment, said genetically-based medical condition is tyrosinemia. In one embodiment, said virus is an adeno-associated virus 8. In one embodiment, said virus is an adeno-associated virus 6. In one embodiment, said protein is Nme2Cas9. In one embodiment, said protein further comprises a nuclear localization signal protein. In one embodiment, said nucleotide deaminase is a cytidine deaminase. In one embodiment, said nucleotide deaminase is an adenosine deaminase. In one embodiment, the protein further comprises a uracil glycosylase inhibitor. In one embodiment, the nuclear localization signal protein includes, but is not limited to, nucleoplasmin (NLS) and/or SV40 NLS and/or C-myc NLS. In one embodiment, said binding region is a protospacer accessory motif interacting domain. In one embodiment, said protospacer accessory motif interacting domain comprises said mutation. In one embodiment, said mutation is a D16A mutation. In one embodiment, said mutated NmeCas9 protein further comprises CBE4. In one embodiment, said mutated NmeCas9 protein further comprises a linker. In one embodiment, said linker is a 73aa linker. In one embodiment, said linker is a 3×HA-tag. In one embodiment, said gene encodes a tyrosinase. In one embodiment, said gene is Fah. In one embodiment, said gene is c-fos.


In one embodiment, the present invention contemplates a method, comprising: a) providing; i) a patient comprising a nucleotide sequence comprising a gene with a mutated single base, wherein said gene is flanked by an N4CC nucleotide sequence, wherein said mutated gene causes a genetically-based medical condition; ii) an optimized nNme2Cas9-ABEmax, comprising a mutated NmeCas9 protein, said mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for said N4CC nucleotide sequence; b) treating said patient with said optimized nNme2Cas9-ABEmax under conditions such that said mutated NmeCas9 protein replaces said mutated single base with a wild type single base, such that said genetically-based medical condition does not develop.


In one embodiment, the present invention contemplates a method, comprising: a) providing; i) a patient comprising a nucleotide sequence comprising a gene with a mutated single base, wherein said gene is flanked by an N4CC nucleotide sequence, wherein said mutated gene causes a genetically-based medical condition; ii) a nNme2Cas9-CBE4, comprising a mutated NmeCas9 protein, said mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for said N4CC nucleotide sequence; b) treating said patient with said nNme2Cas9-CBE4 under conditions such that said mutated NmeCas9 protein replaces said mutated single base with a wild type single base, such that said genetically-based medical condition does not develop.


In one embodiment, the present invention contemplates a method, comprising: a) providing; i) a patient comprising a nucleotide sequence comprising a gene with a mutated single base, wherein said gene is flanked by an N4CC nucleotide sequence, wherein said mutated gene causes a genetically-based medical condition; ii) a YE1-BE3-nNme2Cas9 (D16A)-UGI, comprising a mutated NmeCas9 protein, said mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for said N4CC nucleotide sequence; b) treating said patient with said YE1-BE3-nNme2Cas9 (D16A)-UGI under conditions such that said mutated NmeCas9 protein replaces said mutated single base with a wild type single base, such that said genetically-based medical condition does not develop.


Definitions

To facilitate the understanding of this invention, a number of terms are defined below. Terms defined herein have meanings as commonly understood by a person of ordinary skill in the areas relevant to the present invention. Terms such as “a”, “an” and “the” are not intended to refer to only a singular entity, but include the general class of which a specific example may be used for illustration. The terminology herein is used to describe specific embodiments of the invention, but their usage does not delimit the invention, except as outlined in the claims.


As used herein, the term “edit”, “editing” or “edited” refers to a method of altering a nucleic acid sequence of a polynucleotide (e.g., a wild type naturally occurring nucleic acid sequence or a mutated naturally occurring sequence) by selective deletion of a specific genomic target. Such a specific genomic target includes, but is not limited to, a chromosomal region, a gene, a promoter, an open reading frame or any nucleic acid sequence.


As used herein, the term “single base” refers to one, and only one, nucleotide within a nucleic acid sequence. When used in the context of single base editing, it is meant that the base at a specific position within the nucleic acid sequence is replaced with a different base. This replacement may occur by many mechanisms, including but not limited to, substitution or modification.
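By way of non-limiting illustration only, the replacement of the base at a specific position with a different base can be sketched in Python. The function name `convert_base` and the example sequence are hypothetical and are not part of the disclosed constructs; the sketch models only the sequence-level outcome, not the deaminase chemistry.

```python
VALID_BASES = {"A", "C", "G", "T"}


def convert_base(sequence: str, position: int, new_base: str) -> str:
    """Return a copy of `sequence` with the base at 0-based `position` replaced."""
    if new_base not in VALID_BASES:
        raise ValueError("new_base must be one of A, C, G, T")
    if not 0 <= position < len(sequence):
        raise IndexError("position lies outside the sequence")
    if sequence[position] == new_base:
        raise ValueError("a single base edit must change the base")
    # Exactly one position changes; every other base is copied unchanged.
    return sequence[:position] + new_base + sequence[position + 1:]


# Hypothetical example: a C-to-T conversion at the fourth base.
edited = convert_base("GATCGA", 3, "T")  # "GATTGA"
```

On a double-stranded target, such a C-to-T change on one strand corresponds to the conversion of a C•G base pair to a T•A base pair once the complementary strand is updated.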


As used herein, the term “target” or “target site” refers to a pre-identified nucleic acid sequence of any composition and/or length. Such target sites include, but are not limited to, a chromosomal region, a gene, a promoter, an open reading frame or any nucleic acid sequence. In some embodiments, the present invention interrogates these specific genomic target sequences with complementary sequences of gRNA.


The term “on-target binding sequence” as used herein, refers to a subsequence of a specific genomic target that may be completely complementary to a programmable DNA binding domain and/or a single guide RNA sequence.


The term “off-target binding sequence” as used herein, refers to a subsequence of a specific genomic target that may be partially complementary to a programmable DNA binding domain and/or a single guide RNA sequence.


The term “effective amount” as used herein, refers to a particular amount of a pharmaceutical composition comprising a therapeutic agent that achieves a clinically beneficial result (e.g., a reduction of symptoms). Toxicity and therapeutic efficacy of such compositions can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index, and it can be expressed as the ratio LD50/ED50. Compounds that exhibit large therapeutic indices are preferred. The data obtained from these cell culture assays and additional animal studies can be used in formulating a range of dosage for human use. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage varies within this range depending upon the dosage form employed, sensitivity of the patient, and the route of administration.
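As a minimal sketch of the ratio described above, the therapeutic index may be computed as LD50 divided by ED50. The function name and the dose values below are hypothetical and are chosen only to illustrate the arithmetic; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, expressed as the ratio LD50/ED50."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses in the same units")
    return ld50 / ed50


# Hypothetical doses (e.g., in mg/kg); not measured values.
ti = therapeutic_index(ld50=500.0, ed50=25.0)  # 20.0
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose.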


The term “symptom”, as used herein, refers to any subjective or objective evidence of disease or physical disturbance observed by the patient. For example, subjective evidence is usually based upon patient self-reporting and may include, but is not limited to, pain, headache, visual disturbances, nausea and/or vomiting. Alternatively, objective evidence is usually a result of medical testing including, but not limited to, body temperature, complete blood count, lipid panels, thyroid panels, blood pressure, heart rate, electrocardiogram, tissue and/or body imaging scans.


The term “disease” or “medical condition”, as used herein, refers to any impairment of the normal state of the living animal or plant body or one of its parts that interrupts or modifies the performance of the vital functions. Typically manifested by distinguishing signs and symptoms, it is usually a response to: i) environmental factors (as malnutrition, industrial hazards, or climate); ii) specific infective agents (as worms, bacteria, or viruses); iii) inherent defects of the organism (as genetic anomalies); and/or iv) combinations of these factors.


The terms “reduce,” “inhibit,” “diminish,” “suppress,” “decrease,” “prevent” and grammatical equivalents (including “lower,” “smaller,” etc.) when in reference to the expression of any symptom in an untreated subject relative to a treated subject, mean that the quantity and/or magnitude of the symptoms in the treated subject is lower than in the untreated subject by any amount that is recognized as clinically relevant by any medically trained personnel. In one embodiment, the quantity and/or magnitude of the symptoms in the treated subject is at least 10% lower than, at least 25% lower than, at least 50% lower than, at least 75% lower than, and/or at least 90% lower than the quantity and/or magnitude of the symptoms in the untreated subject.


The term “attached” as used herein, refers to any interaction between a medium (or carrier) and a drug. Attachment may be reversible or irreversible. Such attachment includes, but is not limited to, covalent bonding, ionic bonding, Van der Waals forces or friction, and the like. A drug is attached to a medium (or carrier) if it is impregnated, incorporated, coated, in suspension with, in solution with, mixed with, etc.


The term “drug” or “compound” as used herein, refers to any pharmacologically active substance capable of being administered which achieves a desired effect. Drugs or compounds can be synthetic or naturally occurring, non-peptide, proteins or peptides, oligonucleotides or nucleotides, polysaccharides or sugars.


The term “administered” or “administering”, as used herein, refers to any method of providing a composition to a patient such that the composition has its intended effect on the patient. An exemplary method of administering is by a direct mechanism such as local tissue administration (e.g., extravascular placement), oral ingestion, transdermal patch, topical application, inhalation, suppository, etc.


The term “patient” or “subject”, as used herein, is a human or animal and need not be hospitalized. For example, out-patients and persons in nursing homes are “patients.” A patient may comprise any age of a human or non-human animal and therefore includes both adults and juveniles (i.e., children). It is not intended that the term “patient” connote a need for medical treatment; therefore, a patient may voluntarily or involuntarily be part of experimentation whether clinical or in support of basic science studies.


The term “affinity” as used herein, refers to any attractive force between substances or particles that causes them to enter into and remain in chemical combination. For example, an inhibitor compound that has a high affinity for a receptor will provide greater efficacy in preventing the receptor from interacting with its natural ligands, than an inhibitor with a low affinity.


The term “pharmaceutically” or “pharmacologically acceptable”, as used herein, refers to molecular entities and compositions that do not produce adverse, allergic, or other untoward reactions when administered to an animal or a human.


The term, “pharmaceutically acceptable carrier”, as used herein, includes any and all solvents, or a dispersion medium including, but not limited to, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and vegetable oils, coatings, isotonic and absorption delaying agents, liposome, commercially available cleansers, and the like. Supplementary bioactive ingredients also can be incorporated into such carriers.


The term “viral vector” encompasses any nucleic acid construct derived from a virus genome capable of incorporating heterologous nucleic acid sequences for expression in a host organism. For example, such viral vectors may include, but are not limited to, adeno-associated viral vectors, lentiviral vectors, SV40 viral vectors, retroviral vectors, and adenoviral vectors. Although viral vectors are occasionally created from pathogenic viruses, they may be modified in such a way as to minimize their overall health risk. This usually involves the deletion of a part of the viral genome involved with viral replication. Such a virus can efficiently infect cells but, once the infection has taken place, the virus may require a helper virus to provide the missing proteins for production of new virions. Preferably, viral vectors should have a minimal effect on the physiology of the cells they infect and exhibit genetically stable properties (e.g., do not undergo spontaneous genome rearrangement). Most viral vectors are engineered to infect as wide a range of cell types as possible. Even so, a viral receptor can be modified to target the virus to a specific kind of cell. Viruses modified in this manner are said to be pseudotyped. Viral vectors are often engineered to incorporate certain genes that help identify which cells took up the viral genes. These genes are called marker genes. For example, a common marker gene confers resistance to a certain antibiotic.


As used herein the “ROSA26 gene” or “Rosa26 gene” refers to a human or mouse (respectively) locus that is widely used for achieving generalized expression in the mouse. Targeting to the ROSA26 locus may be achieved by introducing a desired gene into the first intron of the locus, at a unique XbaI site approximately 248 bp upstream of the original gene trap line. A construct may be constructed using an adenovirus splice acceptor followed by a gene of interest and a polyadenylation site inserted at the unique XbaI site. A neomycin resistance cassette may also be included in the targeting vector.


As used herein the “PCSK9 gene” or “Pcsk9 gene” refers to a human or mouse (respectively) locus that encodes a PCSK9 protein. The PCSK9 gene resides on chromosome 1 at the band 1p32.3 and includes 13 exons. This gene may produce at least two isoforms through alternative splicing.


The terms “proprotein convertase subtilisin/kexin type 9” and “PCSK9” refer to a protein encoded by a gene that modulates low density lipoprotein levels. Proprotein convertase subtilisin/kexin type 9, also known as PCSK9, is an enzyme that in humans is encoded by the PCSK9 gene. Seidah et al., “The secretory proprotein convertase neural apoptosis-regulated convertase 1 (NARC-1): liver regeneration and neuronal differentiation” Proc. Natl. Acad. Sci. U.S.A. 100 (3): 928-933 (2003). Similar genes (orthologs) are found across many species. Many enzymes, including PCSK9, are inactive when they are first synthesized, because they have a section of peptide chains that blocks their activity; proprotein convertases remove that section to activate the enzyme. PCSK9 is believed to play a regulatory role in cholesterol homeostasis. For example, PCSK9 can bind to the epidermal growth factor-like repeat A (EGF-A) domain of the low-density lipoprotein receptor (LDL-R) resulting in LDL-R internalization and degradation. Clearly, it would be expected that reduced LDL-R levels result in decreased metabolism of LDL-C, which could lead to hypercholesterolemia.


The term “hypercholesterolemia” as used herein, refers to any medical condition wherein blood cholesterol levels are elevated above the clinically recommended levels. For example, if cholesterol is measured using low density lipoproteins (LDLs), hypercholesterolemia may exist if the measured LDL levels are above, for example, approximately 70 mg/dl. Alternatively, if cholesterol is measured using free plasma cholesterol, hypercholesterolemia may exist if the measured free cholesterol levels are above, for example, approximately 200-220 mg/dl.


As used herein, the term “CRISPRs” or “Clustered Regularly Interspaced Short Palindromic Repeats” refers to an acronym for DNA loci that contain multiple, short, direct repetitions of base sequences. Each repetition contains a series of bases followed by 30 or so base pairs known as “spacer DNA”. The spacers are short segments of DNA from a virus and may serve as a ‘memory’ of past exposures to facilitate an adaptive defense against future invasions.


As used herein, the term “Cas” or “CRISPR-associated (cas)” refers to genes often associated with CRISPR repeat-spacer arrays.


As used herein, the term “Cas9” refers to a nuclease from Type II CRISPR systems, an enzyme specialized for generating double-strand breaks in DNA, with two active cutting sites (the HNH and RuvC domains), one for each strand of the double helix. Jinek combined tracrRNA and spacer RNA into a “single-guide RNA” (sgRNA) molecule that, mixed with Cas9, could find and cleave DNA targets through Watson-Crick pairing between the guide sequence within the sgRNA and the target DNA sequence.


The term “protospacer adjacent motif” (or PAM) as used herein, refers to a DNA sequence that may be required for a Cas9/sgRNA to form an R-loop to interrogate a specific DNA sequence through Watson-Crick pairing of its guide RNA with the genome. The PAM specificity may be a function of the DNA-binding specificity of the Cas9 protein (e.g., a “protospacer adjacent motif recognition domain” at the C-terminus of Cas9).
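By way of a non-limiting sketch, candidate target sites adjacent to a given PAM can be located by scanning a DNA sequence for the motif. The following Python example assumes, for illustration only, that the PAM lies immediately 3' of the protospacer on the scanned strand, that only the given strand is searched, and that the spacer length is supplied by the caller; the function name, the pattern table and the example sequence are hypothetical.

```python
import re

# PAM patterns written 5'->3' on the protospacer strand; N stands for any base.
PAM_PATTERNS = {"N4CC": "[ACGT]{4}CC", "NGG": "[ACGT]GG"}


def find_pam_sites(sequence: str, pam: str, spacer_len: int) -> list[tuple[int, str, str]]:
    """Return (protospacer_start, protospacer, pam_sequence) for every PAM that
    is preceded by a full-length protospacer on the given strand."""
    seq = sequence.upper()
    # A lookahead with a capture group reports overlapping PAM occurrences.
    pattern = re.compile(f"(?=({PAM_PATTERNS[pam]}))")
    sites = []
    for match in pattern.finditer(seq):
        pam_start = match.start()
        start = pam_start - spacer_len
        if start >= 0:  # skip PAMs too close to the 5' end of the sequence
            sites.append((start, seq[start:pam_start], match.group(1)))
    return sites


# Hypothetical 18-nt sequence; a 12-nt spacer is used only to keep the example short.
example = "ACGTACGTACGTAAAACC"
n4cc_sites = find_pam_sites(example, "N4CC", spacer_len=12)
ngg_sites = find_pam_sites(example, "NGG", spacer_len=12)
```

In this example the sequence contains one site with an N4CC motif and none with an NGG motif, which illustrates how a different PAM requirement changes the set of addressable sites.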


As used herein, the term “sgRNA” refers to single guide RNA used in conjunction with CRISPR associated systems (Cas). sgRNAs are a fusion of crRNA and tracrRNA and contain nucleotides of sequence complementary to the desired target site. Jinek et al., “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity” Science 337(6096):816-821 (2012). Watson-Crick pairing of the sgRNA with the target site permits R-loop formation, which in conjunction with a functional PAM permits DNA cleavage or, in the case of nuclease-deficient Cas9, allows binding to the DNA at that locus.


As used herein, the term “fluorescent protein” refers to a protein domain that comprises at least one organic compound moiety that emits fluorescent light in response to the appropriate wavelengths. For example, fluorescent proteins may emit red, blue and/or green light. Such proteins are readily commercially available including, but not limited to: i) mCherry (Clontech Laboratories): excitation: 556/20 nm (wavelength/bandwidth); emission: 630/91 nm; ii) sfGFP (Invitrogen): excitation: 470/28 nm; emission: 512/23 nm; iii) TagBFP (Evrogen): excitation 387/11 nm; emission 464/23 nm.


As used herein, the term “sgRNA” refers to single guide RNA used in conjunction with CRISPR associated systems (Cas). sgRNAs contain nucleotides of sequence complementary to the desired target site. Watson-Crick pairing of the sgRNA with the target site recruits the nuclease-deficient Cas9 to bind the DNA at that locus.


As used herein, the term “orthogonal” refers to targets that are non-overlapping, uncorrelated, or independent. For example, if two orthogonal nuclease-deficient Cas9 genes fused to different effector domains were implemented, the sgRNAs coded for each would not cross-talk or overlap. Not all nuclease-deficient Cas9 genes operate the same, which enables the use of orthogonal nuclease-deficient Cas9 genes fused to different effector domains, provided the appropriate orthogonal sgRNAs are used.


As used herein, the term “phenotypic change” or “phenotype” refers to the composite of an organism's observable characteristics or traits, such as its morphology, development, biochemical or physiological properties, phenology, behavior, and products of behavior. Phenotypes result from the expression of an organism's genes as well as the influence of environmental factors and the interactions between the two.


“Nucleic acid sequence” and “nucleotide sequence” as used herein refer to an oligonucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin which may be single- or double-stranded, and represent the sense or antisense strand.


The term “an isolated nucleic acid”, as used herein, refers to any nucleic acid molecule that has been removed from its natural state (e.g., removed from a cell and is, in a preferred embodiment, free of other genomic nucleic acid).


The terms “amino acid sequence” and “polypeptide sequence” as used herein, are interchangeable and refer to a sequence of amino acids.


As used herein the term “portion” when in reference to a protein (as in “a portion of a given protein”) refers to fragments of that protein. The fragments may range in size from four amino acid residues to the entire amino acid sequence minus one amino acid.


The term “portion” when used in reference to a nucleotide sequence refers to fragments of that nucleotide sequence. The fragments may range in size from 5 nucleotide residues to the entire nucleotide sequence minus one nucleic acid residue.


As used herein, the terms “complementary” or “complementarity” are used in reference to “polynucleotides” and “oligonucleotides” (which are interchangeable terms that refer to a sequence of nucleotides) related by the base-pairing rules. For example, the sequence “C-A-G-T,” is complementary to the sequence “G-T-C-A.” Complementarity can be “partial” or “total.” “Partial” complementarity is where one or more nucleic acid bases is not matched according to the base pairing rules. “Total” or “complete” complementarity between nucleic acids is where each and every nucleic acid base is matched with another base under the base pairing rules. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods which depend upon binding between nucleic acids.
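The distinction between “partial” and “total” complementarity can be illustrated with a short Python sketch. It assumes, as in the example above, that the two sequences are written so that they pair position by position, and that both have the same length; the function name and the additional “none” category (no paired positions at all) are hypothetical conveniences rather than terms defined herein.

```python
# Watson-Crick base-pairing rules for DNA.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementarity(seq_a: str, seq_b: str) -> str:
    """Classify two sequences, compared position by position, as 'total',
    'partial' or 'none' under the base-pairing rules."""
    a = seq_a.replace("-", "").upper()
    b = seq_b.replace("-", "").upper()
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    # Count positions where the base in b is the pairing partner of the base in a.
    matched = sum(PAIRS.get(x) == y for x, y in zip(a, b))
    if matched == len(a):
        return "total"
    return "partial" if matched > 0 else "none"


full = complementarity("C-A-G-T", "G-T-C-A")     # every position pairs
partial = complementarity("C-A-G-T", "G-T-C-C")  # last position is mismatched
```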


The terms “homology” and “homologous” as used herein in reference to nucleotide sequences refer to a degree of complementarity with other nucleotide sequences. There may be partial homology or complete homology (i.e., identity). A nucleotide sequence which is partially complementary, i.e., “substantially homologous,” to a nucleic acid sequence is one that at least partially inhibits a completely complementary sequence from hybridizing to a target nucleic acid sequence. The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency. A substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence to a target sequence under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of a second target sequence which lacks even a partial degree of complementarity (e.g., less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.


The terms “homology” and “homologous” as used herein in reference to amino acid sequences refer to the degree of identity of the primary structure between two amino acid sequences. Such a degree of identity may be directed to a portion of each amino acid sequence, or to the entire length of the amino acid sequence. Two or more amino acid sequences that are “substantially homologous” may have at least 50% identity, preferably at least 75% identity, more preferably at least 85% identity, most preferably at least 95%, or 100% identity.


An oligonucleotide sequence which is a “homolog” is defined herein as an oligonucleotide sequence which exhibits greater than or equal to 50% identity to a sequence, when sequences having a length of 100 bp or larger are compared.
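The identity thresholds above can be sketched as follows. This is an illustrative simplification that assumes the two sequences are already aligned, ungapped, and of equal length; real comparisons would normally use an alignment tool. The function names and default parameters are hypothetical, with the defaults taken from the 50% identity and 100 bp figures stated above.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions carrying the same residue."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * same / len(seq_a)


def is_homolog(seq_a: str, seq_b: str,
               min_identity: float = 50.0, min_length: int = 100) -> bool:
    """Apply the >=50% identity rule to sequences of at least 100 bp."""
    if len(seq_a) < min_length or len(seq_b) < min_length:
        return False
    return percent_identity(seq_a, seq_b) >= min_identity
```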


Low stringency conditions comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4.H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5×Denhardt's reagent {50×Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)} and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 5×SSPE, 0.1% SDS at 42° C. when a probe of about 500 nucleotides in length is employed. Numerous equivalent conditions may also be employed to comprise low stringency conditions; factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol), as well as components of the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions. In addition, conditions which promote hybridization under conditions of high stringency (e.g., increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.) may also be used.


As used herein, the term “hybridization” is used in reference to the pairing of complementary nucleic acids using any process by which a strand of nucleic acid joins with a complementary strand through base pairing to form a hybridization complex. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, the stringency of the conditions involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acids.


As used herein, the term “hybridization complex” refers to a complex formed between two nucleic acid sequences by virtue of the formation of hydrogen bonds between complementary G and C bases and between complementary A and T bases; these hydrogen bonds may be further stabilized by base stacking interactions. The two complementary nucleic acid sequences hydrogen bond in an antiparallel configuration. A hybridization complex may be formed in solution (e.g., C0t or R0t analysis) or between one nucleic acid sequence present in solution and another nucleic acid sequence immobilized to a solid support (e.g., a nylon membrane or a nitrocellulose filter as employed in Southern and Northern blotting, dot blotting or a glass slide as employed in in situ hybridization, including FISH (fluorescent in situ hybridization)).


DNA molecules are said to have “5′ ends” and “3′ ends” because mononucleotides are reacted to make oligonucleotides in a manner such that the 5′ phosphate of one mononucleotide pentose ring is attached to the 3′ oxygen of its neighbor in one direction via a phosphodiester linkage. Therefore, an end of an oligonucleotide is referred to as the “5′ end” if its 5′ phosphate is not linked to the 3′ oxygen of a mononucleotide pentose ring. An end of an oligonucleotide is referred to as the “3′ end” if its 3′ oxygen is not linked to a 5′ phosphate of another mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may be said to have 5′ and 3′ ends. In either a linear or circular DNA molecule, discrete elements are referred to as being “upstream” or 5′ of the “downstream” or 3′ elements. This terminology reflects the fact that transcription proceeds in a 5′ to 3′ fashion along the DNA strand. The promoter and enhancer elements which direct transcription of a linked gene are generally located 5′ or upstream of the coding region. However, enhancer elements can exert their effect even when located 3′ of the promoter element and the coding region. Transcription termination and polyadenylation signals are located 3′ or downstream of the coding region.


The term “transfection” or “transfected” refers to the introduction of foreign DNA into a cell.


As used herein, the terms “nucleic acid molecule encoding”, “DNA sequence encoding,” and “DNA encoding” refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes for the amino acid sequence.


As used herein, the term “gene” means the deoxyribonucleotide sequences comprising the coding region of a structural gene and including sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA. The sequences which are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ non-translated sequences. The sequences which are located 3′ or downstream of the coding region and which are present on the mRNA are referred to as 3′ non-translated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene which are transcribed into heterogeneous nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.


In addition to containing introns, genomic forms of a gene may also include sequences located on both the 5′ and 3′ end of the sequences which are present on the RNA transcript. These sequences are referred to as “flanking” sequences or regions (these flanking sequences are located 5′ or 3′ to the non-translated sequences present on the mRNA transcript). The 5′ flanking region may contain regulatory sequences such as promoters and enhancers which control or influence the transcription of the gene. The 3′ flanking region may contain sequences which direct the termination of transcription, posttranscriptional cleavage and polyadenylation.


The terms “label” and “detectable label” are used herein to refer to any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Such labels include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads®), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., 3H, 125I, 35S, 14C, or 32P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Patents teaching the use of such labels include, but are not limited to, U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241 (all herein incorporated by reference in their entirety). The labels contemplated in the present invention may be detected by many methods. For example, radiolabels may be detected using photographic film or scintillation counters, and fluorescent markers may be detected using a photodetector to detect emitted light. Enzymatic labels are typically detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label.





BRIEF DESCRIPTION OF THE DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.



FIG. 1A-E illustrates exemplary schematic embodiments of an NmeCas9 deaminase fusion protein single base editor and exemplary constructed plasmids of base editors.



FIG. 1A shows an exemplary YE1-BE3-nNme2Cas9 (D16A)-UGI construct.



FIG. 1B shows an exemplary ABE7.10-nNme2Cas9 (D16A) construct.



FIG. 1C shows an exemplary ABE7.10-nNme2Cas9 (D16A) construct comprising two SV40 NLS sequences.



FIG. 1D shows an exemplary nNme2Cas9-CBE4 (also called a BE4-nNme2Cas9 (D16A)-UGI-UGI) construct.



FIG. 1E shows an exemplary optimized nNme2Cas9-ABEmax construct.



FIG. 2A-C presents exemplary data showing that a YE1-BE3-nNme2Cas9 (D16A)-UGI fusion protein, encoded by DNA plasmids delivered to HEK293T cells by electroporation (nucleofection), efficiently converts C to T at endogenous target site 25 (TS25).



FIG. 2A shows exemplary sequences for a TS25 endogenous target site (within the black rectangle). GN23 sgRNA base-pairs with the target DNA strand, leaving the displaced DNA strand for cytidine deaminase to edit (e.g. new green nucleotides).



FIG. 2B shows exemplary sequencing data showing a doublet nucleotide peak (7th position from 5′ end; arrow) demonstrating the successful single base editing of a cytidine to a thymidine (e.g., a CG base pair conversion to a TA base pair).



FIG. 2C shows an exemplary quantitation of the data shown in FIG. 2B plotting the percent conversion of C-to-T single base editing. The percentage of C converted to T is about 40% in the base editor- and sgRNA-treated sample (p-value=6.88×10⁻⁶). The “no sgRNA” control displays the background noise due to Sanger sequencing. EditR (Kluesner et al., 2018) was used to perform the analysis.



FIG. 3A-F presents exemplary specific UGI target sites that were respectively integrated into YE1-BE3-nNme2Cas9/D16A mutant fusion proteins and co-expressed with enhanced green fluorescent protein (EGFP) in a stable K562-derived cell line. Converted bases are highlighted in orange color. Background signals were filtered using negative control samples (YE1-BE3-nNme2Cas9 nucleofected K562 cells without sgRNA constructs). N4CC PAMs are boxed. The percentage of total reads exhibiting mutations in base-editor-targeted sites is shown in the right column.



FIG. 3A shows an exemplary EGFP-Site 1.



FIG. 3B shows an exemplary EGFP-Site 2.



FIG. 3C shows an exemplary EGFP-Site 3.



FIG. 3D shows an exemplary EGFP-Site 4.



FIG. 3E shows an exemplary deep-sequencing analysis indicating where YE1-BE3-nNme2Cas9 converts C residues to T residues at endogenous c-fos promoter region. The percentage of total reads exhibiting mutations in base-editor-targeted sites is shown in the right column. The converted bases are highlighted in orange or yellow color. Background signals were filtered using negative control samples. The highest percentage of editing is 32.50%.



FIG. 3F shows an exemplary deep-sequencing analysis indicating where ABE7.10-nNme2Cas9 or ABEmax (Koblan et al., 2018)-nNme2Cas9 converts A residues to G residues at endogenous c-fos promoter region. The percentage of total reads exhibiting mutations in base-editor-targeted sites is shown in the right column. The converted bases are highlighted in orange color. Background signals were filtered using negative control samples. The percentage of editing is 0.53% by ABE7.10-nNme2Cas9 or 2.33% by ABEmax-nNme2Cas9 (D16A).



FIG. 4 presents an exemplary alignment of the wildtype Fah gene with the tyrosinemia Fah mutant gene showing an A-G single base gene editing target site (position 9). The respective SpyCas9 single PAM site and NmeCas9 double PAM sites are indicated for demonstrating the suboptimal targeting window relative to the SpyCas9 PAM site.



FIG. 5A-E illustrates three exemplary closely related Neisseria meningitidis Cas9 orthologs that have distinct PAMs.



FIG. 5A shows an exemplary schematic showing mutated residues (orange spheres) between Nme2Cas9 (left) and Nme3Cas9 (right) mapped onto the predicted structure of Nme1Cas9, revealing the cluster of mutations in the PID (black).



FIG. 5B shows an exemplary experimental workflow of the in vitro PAM discovery assay with a 10-bp randomized PAM region. Following in vitro digestion, adapters were ligated to cleaved products for library construction and sequencing.



FIG. 5C shows exemplary sequence logos resulting from in vitro PAM discovery that reveal the enrichment of an N4GATT PAM for Nme1Cas9, consistent with its previously established specificity.



FIG. 5D shows exemplary sequence logos indicating that Nme1Cas9 with its PID swapped with that of Nme2Cas9 (left) or Nme3Cas9 (right) requires a C at PAM position 5. The remaining nucleotides were not determined with high confidence due to the modest cleavage efficiency of the PID-swapped protein chimeras (see FIG. 6C).



FIG. 5E shows an exemplary sequence logo showing that full-length Nme2Cas9 recognizes an N4CC PAM, based on efficient substrate cleavage of a target pool with a fixed C at PAM position 5, and with PAM nts 1-4 and 6-8 randomized.



FIG. 6A-D presents a characterization of Neisseria meningitidis Cas9 orthologs with rapidly-evolving PIDs, as related to FIG. 5A-E.



FIG. 6A shows an exemplary unrooted phylogenetic tree of NmeCas9 orthologs that are >80% identical to Nme1Cas9. Three distinct branches emerged, with the majority of mutations clustered in the PID. Groups 1 (blue), 2 (orange), and 3 (green) have PIDs with >98%, ˜52%, and ˜86% identity to Nme1Cas9, respectively. Three representative Cas9 orthologs (one from each group) (Nme1Cas9, Nme2Cas9 and Nme3Cas9) are indicated.



FIG. 6B shows an exemplary schematic showing the CRISPR-cas loci of the strains encoding the three Cas9 orthologs (Nme1Cas9, Nme2Cas9, and Nme3Cas9) from (A). Percent identities of each CRISPR-Cas component with N. meningitidis 8013 (encoding Nme1Cas9) are shown. Blue and red arrows denote pre-crRNA and tracrRNA transcription initiation sites, respectively.



FIG. 6C shows exemplary normalized read counts (% of total reads) from cleaved DNAs from the in vitro assays for intact Nme1Cas9 (grey), for chimeras with Nme1Cas9's PID swapped with those of Nme2Cas9 and Nme3Cas9 (mixed colors), and for full-length Nme2Cas9 (orange). The reduced normalized read counts indicate lower cleavage efficiencies in the chimeras.



FIG. 6D shows exemplary sequence logos from the in vitro PAM discovery assay on an NNNNCNNN PAM pool by Nme1Cas9 with its PID swapped with those of Nme2Cas9 (left) or Nme3Cas9 (right).



FIG. 7A-D presents exemplary data showing that Nme2Cas9 uses a 22-24 nt spacer to edit sites adjacent to an N4CC PAM. All experiments were done in triplicate, and error bars represent the standard error of the mean (s.e.m.).



FIG. 7A shows an exemplary schematic diagram depicting transient transfection and editing of HEK293T TLR2.0 cells, with mCherry+ cells detected by flow cytometry 72 hours after transfection.



FIG. 7B shows exemplary Nme2Cas9 editing of the TLR2.0 reporter. Sites with N4CC PAMs were targeted with varying efficiencies, while no Nme2Cas9 targeting was observed at an N4GATT PAM or in the absence of sgRNA. SpyCas9 (targeting a previously validated site with an NGG PAM) and Nme1Cas9 (targeting N4GATT) were used as positive controls.



FIG. 7C shows an exemplary effect of spacer length on the efficiency of Nme2Cas9 editing. sgRNAs targeting a single TLR2.0 site, with spacer lengths varying from 24 to 20 nts (including the 5′-terminal G required by the U6 promoter), indicate that the highest editing efficiencies are obtained with 22-24 nt spacers.



FIG. 7D shows that exemplary Nme2Cas9 dual nickases can be used in tandem to generate NHEJ- and HDR-based edits in TLR2.0. Nme2Cas9- and sgRNA-expressing plasmids, along with an 800-bp dsDNA donor for homologous repair, were electroporated into HEK293T TLR2.0 cells, and both NHEJ (mCherry+) and HDR (GFP+) outcomes were scored by flow cytometry. HNH nickase, Nme2Cas9D16A; RuvC nickase, Nme2Cas9H588A. Cleavage sites 32 bp and 64 bp apart were targeted using either nickase. The HNH nickase (Nme2Cas9D16A) yielded efficient editing, particularly with the cleavage sites that were separated by 32 bp, whereas the RuvC nickase (Nme2Cas9H588A) was not effective. Wildtype Nme2Cas9 was used as a control.



FIG. 8A-D presents exemplary data showing PAM, spacer, and seed requirements for Nme2Cas9 targeting in mammalian cells, as related to FIG. 7A-D. All experiments were done in triplicate and error bars represent s.e.m.



FIG. 8A shows exemplary Nme2Cas9 targeting at N4CD sites in TLR2.0, with editing estimated based on mCherry+ cells. Four sites for each non-C nucleotide at the tested position (N4CA, N4CT and N4CG) were examined, and an N4CC site was used as a positive control.



FIG. 8B shows exemplary Nme2Cas9 targeting at N4DC sites in TLR2.0 [similar to (A)].



FIG. 8C shows exemplary guide truncations on a TLR2.0 site (distinct from that in FIG. 7C) with an N4CCA PAM, revealing similar length requirements as those observed at the other site.



FIG. 8D shows that exemplary Nme2Cas9 targeting efficiency is differentially sensitive to single-nucleotide mismatches in the seed region of the sgRNA. Data show the effects of walking single-nucleotide sgRNA mismatches along the 23-nt spacer in a TLR2.0 target site.



FIG. 9A-C presents exemplary data showing Nme2Cas9 genome editing at endogenous loci in mammalian cells via multiple delivery methods. All results represent 3 independent biological replicates, and error bars represent s.e.m.



FIG. 9A shows exemplary Nme2Cas9 genome editing of endogenous human sites in HEK293T cells following transient transfection of Nme2Cas9- and sgRNA-expressing plasmids. 40 sites were screened initially (Table 1); the 14 sites shown (selected to include representatives of varying editing efficiencies, as measured by TIDE) were then re-analyzed in triplicate. An Nme1Cas9 target site (with an N4GATT PAM) was used as a negative control.



FIG. 9B shows exemplary data charts: Left panel: Transient transfection of a single plasmid expressing both Nme2Cas9 and sgRNA (targeting the Pcsk9 and Rosa26 loci) enables editing in Hepa1-6 mouse cells, as detected by TIDE. Right panel: Electroporation of sgRNA plasmids into K562 cells stably expressing Nme2Cas9 from a lentivector results in efficient indel formation.



FIG. 9C shows that exemplary Nme2Cas9 can be electroporated as an RNP complex to induce genome editing. 40 picomoles of Cas9 along with 50 picomoles of in vitro-transcribed sgRNAs targeting three different loci were electroporated into HEK293T cells. Indels were measured after 72 h using TIDE.



FIG. 10A-B presents exemplary data showing dose dependence and segmental deletions by Nme2Cas9, as related to FIG. 9A-C.



FIG. 10A shows exemplary data in which increasing the dose of electroporated Nme2Cas9 plasmid (500 ng, vs. 200 ng in FIG. 9A) improves editing efficiency at two sites (TS16 and TS6). Data provided in yellow are re-used from FIG. 9A.



FIG. 10B shows that exemplary Nme2Cas9 can be used to create precise segmental deletions. Two TLR2.0 targets with cleavage sites 32 bp apart were targeted simultaneously with Nme2Cas9. The majority of lesions created were deletions of exactly 32 bp (blue).



FIG. 11A-C presents exemplary data showing that Nme2Cas9 is subject to inhibition by a subset of type II-C anti-CRISPR families in vitro and in cells. All experiments were done in triplicate and error bars represent s.e.m.



FIG. 11A shows an exemplary in vitro cleavage assay of Nme1Cas9 and Nme2Cas9 in the presence of five previously characterized anti-CRISPR proteins (10:1 ratio of Acr:Cas9). Top: Nme1Cas9 efficiently cleaves a fragment containing a protospacer with an N4GATT PAM in the absence of an Acr or in the presence of a negative control Acr (AcrE2). All five previously characterized type II-C Acr families inhibited Nme1Cas9, as expected. Bottom: Nme2Cas9 inhibition mirrors that of Nme1Cas9, except for the lack of inhibition by AcrIIC5Smu.



FIG. 11B shows exemplary genome editing in the presence of the five previously described anti-CRISPR families. Plasmids expressing Nme2Cas9 (200 ng), sgRNA (100 ng) and each respective Acr (200 ng) were co-transfected into HEK293T cells, and genome editing was measured using Tracking of Indels by Decomposition (TIDE) 72 hr post transfection. Consistent with our in vitro analyses, all type II-C anti-CRISPRs except AcrIIC5Smu inhibited genome editing, albeit with different efficiencies.



FIG. 11C shows that exemplary Acr inhibition of Nme2Cas9 is dose-dependent with distinct apparent potencies. Nme2Cas9 is fully inhibited by AcrIIC1Nme and AcrIIC4Hpa at 2:1 and 1:1 mass ratios of cotransfected Acr and Nme2Cas9 plasmids, respectively.



FIG. 12 presents exemplary data showing that a Nme2Cas9 PID swap renders Nme1Cas9 insensitive to AcrIIC5Smu inhibition, as related to FIG. 11A-C. In vitro cleavage by the Nme1Cas9-Nme2Cas9PID chimera in the presence of previously characterized Acr proteins (10 μM Cas9-sgRNA+100 μM Acr) is shown.



FIG. 13A-E presents exemplary data showing orthogonality and relative accuracy of Nme2Cas9 and SpyCas9 at dual target sites, as related to FIG. 12.



FIG. 13A shows that exemplary Nme2Cas9 and SpyCas9 guides are orthogonal. TIDE results show the frequencies of indels created by both nucleases targeting DS2 with either their cognate sgRNAs or with the sgRNAs of the other ortholog.



FIG. 13B shows exemplary Nme2Cas9 and SpyCas9 exhibiting comparable on-target editing efficiencies as assessed by GUIDE-seq. Bars indicate on-target read counts from GUIDE-Seq at the three dual sites targeted by each ortholog. Orange bars represent Nme2Cas9 and black bars represent SpyCas9.



FIG. 13C shows exemplary SpyCas9 on-target vs. off-target read counts for each site. Orange bars represent the on-target reads while black bars represent off-targets.



FIG. 13D shows exemplary Nme2Cas9's on-target vs. off-target reads for each site.



FIG. 13E shows bar graphs of exemplary indel efficiencies (measured by TIDE) at potential off-target sites predicted by CRISPRSeek. On- and off-target site sequences are shown on the left, with the PAM region underlined and sgRNA mismatches and non-consensus PAM nucleotides given in red.



FIG. 14A-E presents exemplary data showing that Nme2Cas9 exhibits little or no detectable off-targeting in mammalian cells.



FIG. 14A shows an exemplary schematic depicting dual sites (DSs) targetable by both SpyCas9 and Nme2Cas9 by virtue of their non-overlapping PAMs. The Nme2Cas9 PAM (orange) and SpyCas9 PAM (blue) are highlighted. A 24 nt Nme2Cas9 guide sequence is indicated in yellow; the corresponding guide sequence for SpyCas9 would be 4 nt shorter at the 5′ end.


FIG. 14B shows that exemplary Nme2Cas9 and SpyCas9 both induce indels at DSs. Six DSs in VEGFA (with GN3GN19NGGNCC sequences) were selected for direct comparisons of editing by the two orthologs. Plasmids expressing each Cas9 (with the same promoter, linkers, tags and NLSs) and its cognate guide were transfected into HEK293T cells. Indel efficiencies were determined by TIDE 72 hrs post transfection. Nme2Cas9 editing was detectable at all six sites and was marginally or significantly more efficient than SpyCas9 at two sites (DS2 and DS6, respectively). SpyCas9 edited four out of the six sites (DS1, DS2, DS4 and DS6), with two sites showing significantly higher editing efficiencies than Nme2Cas9 (DS1 and DS4). DS2, DS4 and DS6 were selected for GUIDE-Seq analysis as Nme2Cas9 was equally efficient, less efficient and more efficient than SpyCas9, respectively, at these sites.



FIG. 14C shows exemplary Nme2Cas9 genome editing that is highly accurate in human cells. Numbers of off-target sites detected by GUIDE-Seq for each nuclease at individual target sites are shown. In addition to dual sites, we analyzed TS6 (because of its high on-target editing efficiency) and Pcsk9 and Rosa26 sites in mouse Hepa1-6 cells (to measure accuracy in another cell type).



FIG. 14D shows exemplary targeted deep sequencing to detect indels in edited cells, which confirms the high Nme2Cas9 accuracy indicated by GUIDE-seq.



FIG. 14E shows an exemplary sequence for the validated off-target site of the Rosa26 guide, showing the PAM region (underlined), the consensus CC PAM dinucleotide (bold), and three mismatches in the PAM-distal portion of the spacer (red).



FIG. 15A-C presents exemplary data showing Nme2Cas9 genome editing in vivo via all-in-one AAV delivery.



FIG. 15A shows an exemplary workflow for delivery of AAV8.sgRNA.Nme2Cas9 to lower cholesterol levels in mice by targeting Pcsk9. Top: schematic of the all-in-one AAV vector expressing Nme2Cas9 and the sgRNA (individual genome elements not to scale). BGH, bovine growth hormone poly(A) site; HA, epitope tag; NLS, nuclear localization sequence; h, human-codon-optimized. Bottom: Timeline for AAV8.sgRNA.Nme2Cas9 tail-vein injections (4×10¹¹ GCs), followed by cholesterol measurements at day 14 and indel, histology and cholesterol analyses at day 28 post-injection.



FIG. 15B shows an exemplary TIDE analysis to measure indels in DNA extracted from livers of mice injected with AAV8.Nme2Cas9+sgRNA targeting Pcsk9 and Rosa26 (control) loci. Indel efficiency at the lone off-target site identified by GUIDE-seq for these two sgRNAs (Rosa26|OT1) was also assessed by TIDE.



FIG. 15C shows exemplary reduced serum cholesterol levels in mice injected with the Pcsk9-targeting guide compared to the Rosa26-targeting controls. P values are calculated by unpaired two-tailed t-test.



FIG. 16A-B presents exemplary data showing PCSK9 knockdown and liver histology following Nme2Cas9 AAV delivery and editing, related to FIG. 15A-C.



FIG. 16A shows exemplary Western blotting using anti-PCSK9 antibody, which reveals strongly reduced levels of PCSK9 in the livers of mice treated with sgPcsk9, compared to mice treated with sgRosa26. 2 ng of recombinant PCSK9 was used as a mobility standard (left-most lane), and a cross-reacting band in the liver samples is indicated by an asterisk. GAPDH was used as loading control (bottom panel).



FIG. 16B shows exemplary H&E staining from livers of mice injected with AAV8.Nme2Cas9+sgRosa26 (left) or AAV8.Nme2Cas9+sgPcsk9 (right) vectors. Scale bars, 25 μm.



FIG. 17A-C presents exemplary data showing Tyr editing ex vivo in mouse zygotes, related to FIG. 16A-B.



FIG. 17A shows two exemplary sites in Tyr, each with an N4CC PAM, that were tested for editing in Hepa1-6 cells. The sgTyr2 guide exhibited higher editing efficiency and was selected for further testing.



FIG. 17B shows seven exemplary mice that survived post-natal development, each of which exhibited coat color phenotypes as well as on-target editing, as assayed by TIDE.



FIG. 17C shows exemplary indel spectra from tail DNA of each mouse from (B), as well as an unedited C57BL/6NJ mouse, as indicated by TIDE analysis. Efficiencies of insertions (positive) and deletions (negative) of various sizes are indicated.



FIG. 18A-C presents exemplary data showing Nme2Cas9 genome editing ex vivo via all-in-one AAV delivery.



FIG. 18A shows an exemplary workflow for single-AAV Nme2Cas9 editing ex vivo to generate albino C57BL/6NJ mice by targeting the Tyr gene. Zygotes are cultured in KSOM containing AAV6.Nme2Cas9:sgTyr for 5-6 hours, rinsed in M2, and cultured for a day before being transferred to the oviduct of pseudo-pregnant recipients.



FIG. 18B shows exemplary albino (left) and chinchilla or variegated (middle) mice generated by treating zygotes with 3×10⁹ GCs of AAV6.Nme2Cas9:sgTyr, and chinchilla or variegated mice (right) generated by treating zygotes with 3×10⁸ GCs of AAV6.Nme2Cas9:sgTyr.



FIG. 18C shows an exemplary summary of Nme2Cas9.sgTyr single-AAV ex vivo Tyr editing experiments at two AAV doses.



FIG. 19A-C shows an exemplary mCherry reporter assay for nSpCas9-ABEmax and optimized ABEmax-nNme2Cas9 (D16A) activities.



FIG. 19A shows exemplary sequence information of the ABE-mCherry reporter. There is a TAG stop codon in the mCherry coding region, so there is no mCherry signal in the reporter-integrated stable cell line. The mCherry signal appears if nSpCas9-ABEmax or optimized ABEmax-nNme2Cas9 (D16A) converts TAG to CAG (which encodes Gln).



FIG. 19B shows exemplary mCherry signals that appear because SpCas9-ABE or ABEmax-nNme2Cas9 (D16A) is active in the specific region of the mCherry reporter. Upper panel is the negative control, middle panel shows the mCherry signals in reporter cells treated with nSpCas9-ABEmax, and bottom panel shows the mCherry signals in reporter cells treated with optimized ABEmax-nNme2Cas9 (D16A).



FIG. 19C shows an exemplary FACS quantitation of base editing events in mCherry reporter cells transfected with the SpCas9-ABE or ABEmax-nNme2Cas9 (D16A). N=6; error bars represent S.D. Results are from biological replicates performed in technical duplicates.



FIG. 20A-C shows an exemplary GFP reporter assay for nSpCas9-CBE4 (Addgene #100802) and CBE4-nNme2Cas9 (D16A)-UGI-UGI (CBE4 was cloned from Addgene #100802) activities.



FIG. 20A shows exemplary sequence information of CBE-GFP reporter. There is a mutation in the fluorophore core region of the GFP reporter line, which converts GYG to GHG. Therefore, there is no GFP signal. The GFP signal will show up if the nSpCas9-CBE4 or CBE4-nNme2Cas9 (D16A)-UGI-UGI can convert CAC to TAC/TAT (Histidine to Tyrosine).



FIG. 20B shows an exemplary GFP signal (green) that appears because nSpCas9-CBE4 or CBE4-nNme2Cas9 (D16A)-UGI-UGI is active in the specific region of the GFP reporter. Upper panel is the negative control. Middle panel shows the GFP signals in the reporter cells treated with nSpCas9-CBE4. Bottom panel shows the GFP signals in the reporter cells treated with CBE4-nNme2Cas9 (D16A)-UGI-UGI.



FIG. 20C shows an exemplary FACS quantitation of base editing events in GFP reporter cells transfected with nSpCas9-CBE4 or CBE4-nNme2Cas9 (D16A)-UGI-UGI. N=6; error bars represent S.D. Results are from biological replicates performed in technical duplicates.



FIG. 21 shows exemplary cytosine editing by CBE4-nNme2Cas9 (D16A)-UGI-UGI. Upper panel shows the KANK3 targeting sequence information (PAM sequences are indicated in red) of Nme2Cas9 and base editing in the negative control samples. Bottom panel shows the quantification of the substitution rate of each type of base in the CBE4-nNme2Cas9 (D16A)-UGI-UGI editing window of the KANK3 target sequences. Sequence tables show nucleotide frequencies at each position. Frequencies of expected C-to-T conversion are highlighted in red.



FIG. 22 shows exemplary cytosine and adenine editing by CBE4-nNme2Cas9 (D16A)-UGI-UGI and optimized ABEmax-nNme2Cas9 (D16A), respectively. Upper panel shows the PLXNB2 targeting sequence information (PAM sequences are indicated in red) of Nme2Cas9 and base editing in the negative control samples. Middle panel shows the quantification of the substitution rate of each type of base in the optimized ABEmax-nNme2Cas9 (D16A) editing windows of the PLXNB2 target sequences. Sequence tables show nucleotide frequencies at each position. Frequencies of expected A-to-G conversion are highlighted in red. Bottom panel shows the quantification of the substitution rate of each type of base in the CBE4-nNme2Cas9 (D16A)-UGI-UGI editing windows of the PLXNB2 target sequences. Sequence tables show nucleotide frequencies at each position. Frequencies of expected C-to-T conversion are highlighted in red.





DETAILED DESCRIPTION OF THE INVENTION

The present invention is related to the field of gene editing. In particular, the gene editing is directed toward single nucleotide base editing. For example, such single nucleotide base editing results in a conversion of a CG base pair to a TA base pair. The high accuracy and precision of the presently disclosed single nucleotide base gene editor is accomplished by an NmeCas9 nuclease that is fused to a nucleotide deaminase protein. The compact nature of the NmeCas9, coupled with a larger number of compatible protospacer adjacent motifs, provides the Cas9 fusion constructs contemplated herein with a gene editing window that can edit sites that are not targetable by conventional SpyCas9 base editor platforms.


A. NmeCas9 Single Base Editing


Cas9 is a programmable nuclease that uses a guide RNA to create a double-stranded break at any desired genomic locus. This programmability has been harnessed for biomedical and therapeutic approaches. However, Cas9-induced breaks often lead to imprecise repair by the cellular machinery, hindering its therapeutic application for single-base corrections as well as uniform and precise gene knock-outs. Moreover, it is extremely challenging to combine Cas9-induced DNA double strand breaks and a repair template for homologous directed repair (HDR) for correcting genetic mutations in post-mitotic cells (e.g. neuronal cells).


Single nucleotide base editing is a genome editing approach where a nuclease-dead or -impaired Cas9 (e.g., dead Cas9 (dCas9) or nickase Cas9 (nCas9)) is fused to another enzyme capable of base-editing nucleotides without causing DNA double strand breaks. To date, two broad classes of Cas9 base editors have been developed: i) a cytidine deaminase-SpyCas9 fusion protein (edits a CG base pair to a TA base pair); and ii) an adenosine deaminase-SpyCas9 fusion protein (edits an AT base pair to a GC base pair). Liu et al., “Nucleobase editors and uses thereof” US 2017/0121693; and Liu et al., “Fusions of cas9 domains and nucleic acid-editing domains” US 2015/0166980 (both herein incorporated by reference).


However, as mentioned above, SpyCas9 base editing platforms cannot be used to target all single-base mutations due to their limited editing windows. The editing window is constrained by the requirement for an NGG PAM. SpyCas9 is also intrinsically associated with high off-target effects in genome editing.


In one embodiment, the present invention contemplates a deaminase fusion protein with a compact and hyper-accurate Nme2Cas9 (Neisseria meningitidis spp.). This Nme2Cas9 has 1,082 amino acids as compared to SpyCas9 that has 1,368 amino acids. This Nme2Cas9 ortholog functions efficiently in mammalian cells, recognizes an N4CC PAM, and is intrinsically hyper-accurate. Edraki et al., Mol Cell. (in preparation).


Although it is not necessary to understand the mechanism of an invention, it is believed that the compactness and hyper-accuracy of an NmeCas9 base editor allow it to target single-base mutations that could not be reached previously by other Cas9 platforms currently known in the art. It is further believed that the NmeCas9 base editors contemplated herein target pathogenic mutations that are not feasible targets for current base editor platforms, and do so with an increased base editing accuracy.


In one embodiment, the present invention contemplates a fusion protein comprising a Nme2Cas9 and a deaminase protein, examples of which include ABE7.10-nNme2Cas9 (D16A); optimized nNme2Cas9-ABEmax; nNme2Cas9-CBE4 (equivalent to BE4-nNme2Cas9 (D16A)-UGI-UGI); and ABEmax-nNme2Cas9 (D16A). See, FIG. 1A, FIG. 1B, FIG. 1C, FIG. 1D and FIG. 1E.



FIG. 1A-E illustrates exemplary schematic embodiments of an NmeCas9 deaminase fusion protein single base editor and exemplary constructed plasmids of base editors. FIG. 1A shows an exemplary YE1-BE3-nNme2Cas9 (D16A)-UGI construct. FIG. 1B shows an exemplary ABE7.10-nNme2Cas9 (D16A) construct. FIG. 1C shows an exemplary ABE7.10-nNme2Cas9 (D16A) construct comprising two SV40 NLS sequences. FIG. 1D shows an exemplary nNme2Cas9-CBE4 (also called a BE4-nNme2Cas9 (D16A)-UGI-UGI) construct. FIG. 1E shows an exemplary optimized nNme2Cas9-ABEmax construct.


In one embodiment, the deaminase protein is Apobec1 (YE1-BE3). It is not intended to limit Apobec1 to one organism. In one embodiment, the Apobec1 is derived from a rat species. Kim et al., “Increasing the genome-targeting scope and precision of base editing with engineered Cas9-cytidine deaminase fusions”. Nature Biotechnology 35 (2017). In one embodiment, the Nme2Cas9 comprises an nNme2Cas9 D16A mutant. In one embodiment, the fusion protein further comprises a uracil glycosylase inhibitor protein (UGI). In one embodiment, the fusion protein comprises a YE1-BE3-nNme2Cas9 (D16A)-UGI construct. In one embodiment, the YE1-BE3-nNme2Cas9 (D16A)-UGI construct has the sequence of:










(SEQ ID NO: 1)




MSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGRHSIWRHTSQNTNK








HVEVNFIEKFTTERYFCPNTRCSITWFLSYSPCGECSRAITEFLSRYPHVTLFIYIARLYHH







ADPENRQGLRDLISSGVTIQIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLE







LYCIILGLPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK
SGSETPGTSESATP







ES
MAAFKPNPINYILGLAIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPKTGDSLAMAR







RLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDENGLIKSLPNTPWQLRAAALDRKLTPL







EWSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVANNAHALQTGDFRTPAELALNKFEK







ESGHIRNQRGDYSHTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDA







VQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDTERATLMDEPYRKS







KLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRALEKEGLKDKKSPLNLSS







ELQDEIGTAFSLFKTDEDITGRLKDRVQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGKR







YDEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRYGSPARIHIET







AREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNFVGEPKSKDILKLRLYEQQHGKCL







YSGKEINLVRLNEKGYVEIDAALPFSRTWDDSFNNKVLVLGSENQNKGNQTPYEYFNGKDNS







REWQEFKARVETSRFPRSKKQRILLQKFDEDGFKECNLNDTRYVNRFLCQFVADHILLTGKG







KRRVFASNGQITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQKITRFVRYKEMNAF







DGKTIDKETGKVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEADTPEKLRTLLAEKLS







SRPEAVHEYVTPLFVSRAPNRKMSGAHKDTLRSAKRFVKHNEKISVKRVWLTEIKLADLENMV







NYKNGREIELYEALKARLEAYGGNAKQAFDPKDNPFYKKGGQLVKAVRVEKTQESGVLLNKK







NAYTIADNGDMVRVDVFCKVDKKGKNQYFIVPIYAWQVAENILPDIDCKGYRIDDSYTFCFSL







HKYDLIAFQKDEKSKVEFAYYINCDSSNGRFYLAWHDKGSKEQQFRISTQNLVLIQKYQVNEL






GKEIRPCRLKKRPPVRSGGSTNLSDIIEKETGKQLVIQESILMLPEEVEEVIGNKPESDIL







VHTAYDESTDENVMLLTSDAPEYKPWALVIQDSNGENKIKML

SGGSPKKKRKV*






YE1-BE3 (underlined); linker (bold); nNme2Cas9 (italics); UGI (bold/underlined); SV40 NLS (plain).





(SEQ ID NO: 2)




MSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGRHSIWRHTSQNTNK








HVEVNFIEKFTTERYFCPNTRCSITWFLSYSPCGECSRAITEFLSRYPHVTLFIYIARLYHH







ADPENRQGLRDLISSGVTIQIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLE







LYCHLGLPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK
SGSETPGTSESATP







ES
AIAAFKPNPINYILGLAIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPKTGDSLAMAR







RLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDENGLIKSLPNTPWQLRAAALDRKLTPL







EWSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVANNAHALQTGDERTPAELALNKFEK







ESGHIRNQRGDYSHTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDA







VQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDTERATLMDEPYRKS







KLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRALEKEGLKDKKSPLNLSS







ELQDEIGTAFSLEKTDEDITGRLKDRVQPEILEALLKHISFDKEVQISLKALRRIVPLMEQGKR







YDEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRYGSPARIHIET







AREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNEVGEPKSKDILKLRLYEQQHGKCL







YSGKEINLVRLNEKGYVEIDAALPFSRTWDDSFNNKVLVLGSENQNKGNQTPYEYFNGKDNS







REWQEFKARVETSRFPRSKKQRILLQKFDEDGEKECNLNDTRYVNRELCQFVADHILLTGKG







KRRVFASNGQITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQKITREVRYKEMNAF







DGKTIDKETGKVLHQKTHFPQPWEFFAQEVMIRVEGKPDGKPEFEEADTPEKLRTLLAEKLS







SRPEAVHEYVTPLEVSRAPNRKMSGAHKDTLRSAKREVKHNEKISVKRVWLTEIKLADLENMV







NYKNGREIELYEALKARLEAYGGNAKQAFDPKDNPFYKKGGQLVKAVRVEKTQESGVLLNKK







NAYTIADNGDMVRVDVECKVDKKGKNQYFIVPIYAWQVAENILPDIDCKGYRIDDSYTECFSL







HKYDLIAFQKDEKSKVEFAYYINCDSSNGRFYLAWHDKGSKEQQFRISTQNLVLIQKYQKYNVEL







GKEIRPCRLKKRPPVR
SGGS

TNLSDHEKETGKQLVIQESILMLPEEVEEVIGNKPESDI









LVHTAYDESTDENVMLLTSDAPEYKPWALVIQDSNGENKIKML

SGGSPKKKRKV*






YE1-BE3 (underlined); linker (bold); nNme2Cas9 (italics); UGI (bold/underlined); SV40 NLS (plain).








In one embodiment, the present invention contemplates a fusion protein comprising an NmeCas9/ABE7.10 deaminase protein. In one embodiment, the deaminase protein is TadA. In one embodiment, the deaminase protein is TadA 7.10. In one embodiment, the ABE7.10-nNme2Cas9 (D16A) construct has the following sequence:










(SEQ ID NO: 3)




MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAH








AEIMALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAG







SLMDVLHHPGMNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTD
custom-character







custom-character
custom-character
custom-character

SEVEFSHEYVVMRHALTLAKRARDEREVPVGAV









LVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCV









MCAGAMIHSRIGRVVFGVRNAKTGAAGSLMDVLHYPGMNHRVEITEGILADECAA









LLCYFFRMPRQVFNAQK

custom-character
GGSSGGSSGSETPGTSESATPESSGGSSGGSMAAF







KPNPINYILGLAIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPKTGDSLAMARRLARSVR







RLTRRRAHRLLRARRLLKREGVLQAADFDENGLIKSLPNTPWQLRAAALDRKLTPLEWSAVLL







HLIKHRGYLSQRKNEGETADKELGALLKGVANNAHALQTGDFRTPAELALNKFEKESGHIRN







QRGDYSHTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDAVQKMLG







HCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDTERATLMDEPYRKSKLTYAQA







RKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRALEKEGLKDKKSPLNLSSELQDEIG







TAFSLFKTDEDITGRLKDRVQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGKRYDEACAEI







YGDHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRYGSPARIHIETAREVGKSF







KDRKEIEKRQEENRKDREKAAAKFREYFPNFVGEPKSKDILKLRLYEQQHGKCLYSGKEINLV







RLNEKGYVEIDHALPFSRTWDDSFNNKVLVLGSENQNKGNQTPYEYFNGKDNSREWQEFKA







RVETSRFPRSKKQRILLQKFDEDGFKECNLNDTRYVNRFLCQFVADHILLTGKGKRRVFASN







GQITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQKITRFVRYKEMNAFDGKTIDK







ETGKVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEADTPEKLRTLLAEKLSSRPEAVH







EYVTPLFVSRAPNRKMSGAHKDTLRSAKRFVKHNEKISVKRVWLTEIKLADLENMVNYKNGR







EIELYEALICARLEAYGGNAKQAFDPKDNPFYKKGGQLVKAVRVEKTQESGVLLNKKNAYTIA







DNGDMVRVDVFCKVDKKGKWQYFIVPIYAWQVAENILPDIDCKGYRIDDSYTFCFSLHKYDLI







AFQKDEKSKVEFAYYINCDSSNGRFYLAWHDKGSKEQQFRISTQNLVLIQKYQVNELGKEIRP







CRLKKRPPVR
EDKRPAATKKAGQAKKKK*






TadA (underlined), TadA 7.10 (underlined/bold), linker (bold), nNme2Cas9 (italics), Nucleoplasmin NLS (plain).






In one embodiment, an ABE7.10-nNme2Cas9 (D16A) construct has the following amino acid sequence:










(SEQ ID NO: 4)




MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAH








AEIMALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAG







SLMDVLHHPGMNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTD
custom-character







custom-character
custom-character
custom-character

SEVEFSHEYWMRHALTLAKRARDEREVPVGAV









LVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLV
M
QNYRLIDATLYVTFEPCV









MCAGAMIIISRIGRVVFGVRNAKTGAAGSLMDVLHYPGMNHRVEITEGILADECAA









LLCYFFRIVIPRQVFNAQKKAQSSTD

custom-character
custom-character
custom-character
custom-character
MA







AFKPNPINYILGLAIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPKTGDSLAMARRLARS







VRRLTRRRAHRLLRARRLLKREGVLQAADFDENGLIKSLPNTPWQLRAAALDRKLTPLEWSAV







LLHLIKHRGYLSQRKNEGETADKELGALLKGVANNAHALQTGDFRTPAELALNKFEKESGHI







RNQRGDYSHITSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDAVQKM







LGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDTERATLMDEPYRKSKLTYA







QARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRALEKEGLKDKKSPLNLSSELQD







EIGTAFSLEKTDEDITGRLKDRVQPEILEALLKHISFDKEVQISLKALRRIVPLMEQGKRYDEAC







AETYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRYGSPARIHIETAREVG







KSFKDRKEIEKRQEENRKDREKAAAKFREYFPNEVGEPKSKDILKLRLYEQQHGKCLYSGKEI







NLVRLNEKGYVEIDHALPFSRTWDDSENNKVLVLGSENQNKGNQTPYEYENGKDNSREWQE







FKARVETSRFPRSKKQRILLQKFDEDGEKECNLNDTRYVNRELCQFVADHILLTGKGKRRVF







ASNGQITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQKITREVRYKEMNAFDGKTI







DKETGKVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEADTPEKLRTLLAEKLSSRPEA







VHEYVTPLEVSRAPNRKMSGAHKDTLRSAKREVKHNEKISVKRVWLTEIKLADLENMVNYKN







GREIELYEALKARLEAYGGNAKQAFDPKDNPFYKKGGQLVKAVRVEKTQESGVLLNKKNAYT







IADNGDMVRVDVECKVDKKGKNQYFIVPIYAWQVAENILPDIDCKGYRIDDSYTECFSLHKYD







LIAFQKDEKSKVEFAYYINCDSSNGRFYLAWHDKGSKEQQFRISTQNLVLIQKYQVNELGKEIR







PCRLKKRPPVR
custom-character KRPAATKKAGQAKKKK*






TadA (underlined), TadA 7.10 (underlined/bold), linker (bold italics), nNme2Cas9 (italics), Nucleoplasmin NLS (plain).






In one embodiment, an ABEmax-nNme2Cas9 (D16A) construct has the following amino acid sequence:










(SEQ ID NO: 5)




custom-character
custom-character
PKKKRKV
SEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNN








RVIGEGWNRPIGRHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIH







SRIGRVVFGARDAKTGAAGSLMDVLHHPGMNHRVEITEGILADECAALLSDFFRMRRQ







EIKAQKKAQSSTD
custom-character
custom-character
custom-character
custom-character

SEVEFSHEYWMRHA









LTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLIMPTAHAEIMALRQGGLV









MQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNAKTGAAGSLMDVLHYP









GMNHRVEITEGILADECAALLCYFFRMPRQVFNAQKKAQSSTD

custom-character
custom-character







custom-character
custom-character
MAAFKPNPINYILGLAIGIASVGWAMVEIDEEENPIRLIDLGVRVF







ERAEVPKTGDSLAMARRLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDENGLIKSLPNT







PWQLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVANNAHALQT







GDFRTPAELALNKFEKESGHIRNQRGDYSHTFSRKDLQAELILLFEKQKEFGNPHVSGGLKE







GIETLLMTQRPALSGDAVQKAILGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERP







LTDTERATLMDEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRA







LEKEGLKDKKSPLNLSSELQDEIGTAFSLFKTDEDITGRLKDRVQPEILEALLKHISFDKFVQIS







LKALRRIVPLMEQGKRYDEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVI







NGVVRRYGSPARIHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNFVGEPKSK







DILKLRLYEQQHGKCLYSGKEINLVRLNEKGYVEIDHALPFSRTWDDSFNNKVLVLGSENQNK







GNQTPYEYFNGKDNSREWQEFKARVETSRFPRSKKQRILLQKFDEDGFKECNLNDTRYVNRF







LCQFVADHILLTGKGKRRVFASNGQITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQ







QKITRFVRYKEMNAFDGKTIDKETGKVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEE







ADTPEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGAHKDTLRSAKRFVKHNEKISVK







RVWLTEIKLADLENMVNYKNGREIELYEALKARLEAYGGNAKQAFDPKDNPFYKKGGQLVKA







VRVEKTQESGVLLNKKNAYTIADNGDMVRVDVFCKVDKKGKNQYFIVPIYAWQVAENILPDI







DCKGYRIDDSYTFCFSLHKYDLIAFQKDEKSKVEFAYYINCDSSNGRFYLAWHDKGSKEQQF







RISTQNLVLIQKYQVNELGKEIRPCRLKKRPPVR
custom-character KRPAATKKAGQAKKKKcustom-characterPKKKRK







V*






TadA (underlined), TadA* 7.10 (underlined/bold), linker (bold italics), nNme2Cas9 (italics), Nucleoplasmin NLS (plain) and SV40 NLS (bold).






In one embodiment, a CBE4-nNme2Cas9 (D16A)-UGI-UGI construct has the following amino acid sequence:










(SEQ ID NO: 6)



PAAKRVKLDcustom-charactercustom-character PAAKRVKLDcustom-charactercustom-characterPKKKRKVcustom-characterSSE







TGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGRHSIWRHTSQNTNKHVEV







NFIEKFTTERYFCPNTRCSITWFLSWSPCGECSRAITEFLSRYPHVTLFIYIARLYHHADPR







NRQGLRDLISSGVTIQIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCII







LGLPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK
custom-character
custom-character







custom-character
custom-character
custom-character
AAFKPNPINYILGLAIGIASVGWAMVEIDEEENPIRLIDLGVRVEER







AEVPKTGDSLAMARRLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDENGLIKSLPNTPW







QLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVANNAHALQTGD







FRTPAELALNKFEKESGHIRNQRGDYSHTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIE







TLLMTQRPALSGDAVQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTD







TERATLMDEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRALEK







EGLKDKKSPLNLSSELQDEIGTAFSLEKTDEDITGRLKDRVQPEILEALLKHISFDKFVQISLKA







LRRIVPLMEQGKRYDEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGV







VRRYGSPARIHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNEVGEPKSKDILK







LRLYEQQHGKCLYSGKEINLVRLNEKGYVEIDHALPFSRTWDDSFNNKVLVLGSENQNKGNQ







TPYEYENGKDNSREWQEFKARVETSRFPRSKKQRILLQKFDEDGFKECNLNDTRYVNRFLCQ







FVADHILLTGKGKRRVFASNGQITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQKI







TRFVRYKEMNAFDGKTIDKETGKVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEADT







PEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGAHKDTLRSAKRFVKHNEKISVKRVW







LTEIKLADLENMVNYKIVGREIELYEALKARLEAYGGNAKQAFDPKDNPFYKKGGQLVKAVRV







EKTQESGVLLNKKNAYTIADNGDMVRVDVFCKVDKKGKNQYFIVPIYAWQVAENILPDIDCK







GYRIDDSYTFCFSLHKYDLIAFQKDEKSKVEFAYYINCDSSNGRFYLAWHDKGSKEQQFRIST







QNLVLIQKYQVNELGKEIRPCRLKKRPPVR
custom-character
custom-character
custom-character
custom-character







custom-character
custom-character
custom-character
custom-character
custom-character

TNLSDIIEKETGKQLVI









QESILMLPEEVEEVIGNKPESDILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDS









NGENKIKML

custom-character
custom-character

TNLSDIIEKETGKQLVIQESILMLPEEVEEVIGNKPESD









ILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDSNGENKIKML

custom-character
custom-character







custom-character PAAKRVKLDcustom-charactercustom-character PAAKRVKLD






rApobec1 (underlined), UGI (underlined/bold), linker (bold italics), nNme2Cas9 (D16A) (italics), Cmyc-NLS (plain) and SV40 NLS (bold).






In one embodiment, an optimized nNme2Cas9-ABEmax construct refers to an optimized version with improved promoter, NLS sequences, and linker sequences. In some embodiments, an optimized nNme2Cas9-ABEmax construct comprises, 5′ to 3′, a C-myc NLS, 12aa linker, 15aa linker, SV40 NLS, TadA, TadA*7.10, 48aa linker, nNme2Cas9, a 73aa linker (3×HA-tag), 15aa linker, and a C-myc NLS. In some embodiments, an optimized nNme2Cas9-ABEmax construct further comprises at least two each alternating C-myc NLS and a 12aa linker at the 3′ end. In some embodiments, an optimized nNme2Cas9-ABEmax construct further comprises at least two each alternating 15aa linker and C-myc NLS at the 5′ end. See, FIG. 1E for example.


In one embodiment, an optimized nNme2Cas9-ABEmax construct has the following amino acid sequence










(SEQ ID NO: 7):



PAAKRVKLDcustom-charactercustom-character PAAKRVKLDcustom-charactercustom-characterPKKKRKV






SEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAHA







EIMALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAGS







LMDVLHHPGMNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTD
custom-character







custom-character
custom-character
custom-character

SEVEFSHEYWMRHALTLAKRARDEREVPVGAVL









VLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVM









CAGAMIHSRIGRVVFGVRNAKTGAAGSLMDVLHYPGMNIIRVEITEGILADECAAL









LCYFFRMPRQVFNAQKKAQSSTD

custom-character
custom-character
custom-character
custom-character
custom-character







custom-character
custom-character
AAFKPNPINYILGLAIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPK







TGDSLAMARRLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDENGLIKSLPNTPWQLRAA







ALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVANNAHALQTGDFRTPA







ELALNKFEKESGHIRNQRGDYSHTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLMT







QRPALSGDAVQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDTERAT







LMDEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRALEKEGLK







DKKSPLNLSSELQDEIGTAFSLFKTDEDITGRLKDRVQPEILEALLKHISFDKFVQISLKALRRI







VPLMEQGKRYDEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRY







GSPARIHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNFVGEPKSKDILKLRLY







EQQHGKCLYSGKEINLVRLNEKGYVEIDHALPFSRTWDDSFNNKVLVLGSENQNKGNQTPYE







YFNGKDNSREWQEFKARVETSRFPRSKKQRILLQKFDEDGFKECNLNDTRYVNRFLCQFVA







DHILLTGKGKRRVFASNGQITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQKITRF







VRYKEMNAFDGKTIDKETGKVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEADTPEK







LRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGAHKDTLRSAKRFVKHNEKISVKRVWLTEI







KLADLENIVIVNYKNGREIELYEALKARLEAYGGNAKQAFDPKDNPFYKKGGQLVKAVRVEKT







QESGVLLNKKNAYTIADNGDMVRVDVFCKVDKKGKNQYFIVPIYAWQVAENILPDIDCKGYRI







DDSYTFCFSLHKYDLIAFQKDEKSKVEFAYYINCDSSNGRFYLAWHDKGSKEQQFRISTQNLV







LIQKYQVNELGKEIRPCRLKKRPPVR
custom-character
custom-character
custom-character
custom-character
custom-character







custom-character
custom-character
custom-character
custom-character PAAKRVKLDcustom-charactercustom-character PAAKRV






KLD





hTadA7.10 (underlined), hTadA*7.10 (underlined/bold), linker (bold italics), nNme2Cas9 (italics), Cmyc-NLS (plain), SV40-NLS (bold).






In some embodiments, a plasmid nSpCas9-ABEmax (Addgene ID:112095) was used for experimental controls and for molecular cloning. In some embodiments, a plasmid nSpCas9-CBE4 (Addgene ID: 100802) was used for experimental controls and for molecular cloning.


Electroporation of HEK293T cells with DNA plasmids comprising a YE1-BE3-Nme2Cas9 nucleotide deaminase fusion protein achieved robust single-base editing of a CG base pair to a TA base pair at an endogenous target site (TS25). See, FIGS. 2A-C.



FIG. 2A-C presents exemplary data showing that a YE1-BE3-nNme2Cas9 (D16A)-UGI fusion protein, delivered to HEK293T cells by electroporation (nucleofection) of DNA plasmids, efficiently converts C to T at endogenous target site 25 (TS25). FIG. 2A shows exemplary sequences for a TS25 endogenous target site (within the black rectangle). GN23 sgRNA base-pairs with the target DNA strand, leaving the displaced DNA strand for cytidine deaminase to edit (e.g., the new nucleotides shown in green). FIG. 2B shows exemplary sequencing data showing a doublet nucleotide peak (7th position from 5′ end; arrow) demonstrating the successful single base editing of a cytidine to a thymidine (e.g., a CG base pair conversion to a TA base pair). FIG. 2C shows an exemplary quantitation of the data shown in FIG. 2B plotting the percent conversion of C-to-T single base editing. The percentage of C converted to T is about 40% in the base editor- and sgRNA-treated sample (p-value=6.88×10⁻⁶). The “no sgRNA” control displays the background noise due to Sanger sequencing. EditR (Kluesner et al., 2018) was used to perform the analysis.


Four other YE1-BE3-nNme2Cas9/D16A mutant fusion proteins were expressed in a stable K562-derived cell line that expresses enhanced green fluorescent protein (EGFP). Each YE1-BE3-nNme2Cas9/D16A mutant fusion protein had a specific UGI target site. See, FIGS. 3A-D.


Deep-sequencing analysis indicates YE1-BE3-nNme2Cas9 converts C residues to T residues at each of the four EGFP target sites. The percentage of editing ranged from 0.24% to 2%. The potential base editing window is from nucleotides 2-8 in the displaced DNA strand, counting the nucleotide at the 5′ (PAM-distal) end as nucleotide #1. See, FIGS. 3A-D.
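The window numbering described above can be illustrated with a minimal Python sketch. The 24-nucleotide protospacer below is a hypothetical sequence chosen only for illustration (it is not one of the EGFP target sites), and the function names are assumptions introduced here. The sketch lists every cytosine in positions 2-8 of the displaced strand and shows the sequence that would result if all of them were converted; in practice only a fraction of reads carry any given conversion.

```python
def editable_cytosines(protospacer, window=(2, 8)):
    """Return 1-based positions of C residues inside the editing window.

    Position 1 is the 5' (PAM-distal) nucleotide of the displaced strand.
    """
    first, last = window
    return [pos for pos, base in enumerate(protospacer.upper(), start=1)
            if first <= pos <= last and base == "C"]


def convert_c_to_t(protospacer, positions):
    """Return the protospacer with the listed cytosines replaced by T."""
    bases = list(protospacer.upper())
    for pos in positions:
        if bases[pos - 1] != "C":
            raise ValueError(f"position {pos} is not a cytosine")
        bases[pos - 1] = "T"
    return "".join(bases)


# Hypothetical 24-nt protospacer (illustration only, not a real target site).
example = "GCACCTGAACGTTAGCCATGACGT"
sites = editable_cytosines(example)        # cytosines at positions 2, 4 and 5
edited = convert_c_to_t(example, sites)    # idealized fully edited outcome
print(sites, edited)
```

Cytosines at positions 10, 16, 17 and 22 of the example are left untouched because they fall outside the assumed window.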



FIG. 3A-F presents exemplary specific UGI target sites that were respectively integrated into YE1-BE3-nNme2Cas9/D16A mutant fusion proteins and co-expressed with enhanced green fluorescent protein (EGFP) in a stable K562-derived cell line. Converted bases are highlighted in orange color. Background signals were filtered using negative control samples (YE1-BE3-nNme2Cas9 nucleofected K562 cells without sgRNA constructs). N4CC PAMs are boxed. The percentage of total reads exhibiting mutations in base-editor-targeted sites is shown in the right column. FIG. 3A shows an exemplary EGFP-Site 1. FIG. 3B shows an exemplary EGFP-Site 2. FIG. 3C shows an exemplary EGFP-Site 3. FIG. 3D shows an exemplary EGFP-Site 4.


Electroporation of HEK293T cells with DNA plasmids comprising a YE1-BE3-nNme2Cas9 fusion protein targeting the c-fos promoter achieved robust single-base editing of a CG base pair to a TA base pair at endogenous target sites in the c-fos promoter (FIG. 3E). FIG. 3E shows an exemplary deep-sequencing analysis indicating where YE1-BE3-nNme2Cas9 converts C residues to T residues at the endogenous c-fos promoter region. The percentage of total reads exhibiting mutations in base-editor-targeted sites is shown in the right column. The converted bases are highlighted in orange or yellow color. Background signals were filtered using negative control samples. The highest percentage of editing is 32.50%. FIG. 3F shows an exemplary deep-sequencing analysis indicating where ABE7.10-nNme2Cas9 or ABEmax (Koblan et al., 2018)-nNme2Cas9 converts A residues to G residues at the endogenous c-fos promoter region. The percentage of total reads exhibiting mutations in base-editor-targeted sites is shown in the right column. The converted bases are highlighted in orange color. Background signals were filtered using negative control samples. The percentage of editing is 0.53% by ABE7.10-nNme2Cas9 and 2.33% by ABEmax-nNme2Cas9 (D16A).


In one embodiment, the present invention contemplates the expression of an ABE7.10-nNme2Cas9 (D16A) fusion protein for base editing. Although it is not necessary to understand the mechanism of an invention, it is believed that Nme2Cas9 base editing may be an effective treatment for tyrosinemia by reversing a G-to-A point mutation in the Fah gene with an ABE7.10-nNme2Cas9 (D16A) fusion protein.


A G-to-A mutation (red) at the last nucleotide of exon 8 in the Fah gene causes exon skipping. FAH deficiency leads to toxin accumulation and severe liver damage. The position of a SpyCas9 PAM (black rectangular box) downstream of the mutation is not optimal for designing the sgRNA, since the A mutation is outside the efficient base editing window of ABE7.10, which spans the 4th to 7th nucleotides counted from the 5′ (PAM-distal) end (underlined) (Gaudelli et al., 2017).


However, there are two Nme2Cas9 PAMs (red rectangular box) in the downstream sequences that potentially allow ABE7.10-nNme2Cas9 (D16A) to correct the mutation and revert the DNA sequence to wildtype. See, FIG. 4.



FIG. 4 presents an exemplary alignment of the wildtype Fah gene with the tyrosinemia Fah mutant gene showing an A-to-G single base gene editing target site (position 9). The single SpyCas9 PAM site and the two NmeCas9 PAM sites are indicated to demonstrate the suboptimal targeting window relative to the SpyCas9 PAM site. This figure serves as a potential example of a site where Nme2Cas9 could overcome limitations of existing base editors. It is further believed that the NmeCas9 base editor described herein can perform precise base editing that cannot be achieved with conventional SpyCas9-derived base editors due to a suboptimal base editing window relative to available PAMs nearby.
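The relationship between PAM position and editing window can be sketched in Python as follows. This is only an illustration under stated assumptions: the sequence is a hypothetical stretch of A/T bases with a single CC dinucleotide (it is not the Fah sequence), the target base is assumed to lie on the displaced strand, the function name is introduced here, and the window applied to the Nme2Cas9 editor (positions 2-8 of a 24-nt protospacer) is borrowed from the cytidine editor data and is an assumption for an adenine editor.

```python
import re


def guides_covering(seq, target, pam_regex, spacer_len, window):
    """List (protospacer_start, position_in_window) pairs for usable guides.

    seq        : displaced-strand sequence, 5' to 3'
    target     : 0-based index of the base to be edited
    pam_regex  : regular expression for the PAM immediately 3' of the protospacer
    spacer_len : protospacer length in nucleotides
    window     : (first, last) editable positions, counted from the PAM-distal end
    """
    seq = seq.upper()
    first, last = window
    hits = []
    for start in range(len(seq) - spacer_len + 1):
        position = target - start + 1          # 1-based position in protospacer
        if not first <= position <= last:
            continue
        if re.match(pam_regex, seq[start + spacer_len:]):
            hits.append((start, position))
    return hits


# Hypothetical sequence: target A at index 5, one CC dinucleotide at 30-31.
seq = "ATATAAATAT" + "ATATATATAT" * 2 + "CC" + "ATATATAT"

# SpyCas9-style search: 20-nt protospacer, NGG PAM, window 4-7.
spy_hits = guides_covering(seq, 5, "[ACGT]GG", 20, (4, 7))
# Nme2Cas9-style search: 24-nt protospacer, N4CC PAM, assumed window 2-8.
nme2_hits = guides_covering(seq, 5, "[ACGT]{4}CC", 24, (2, 8))
print(spy_hits, nme2_hits)
```

In this toy case no NGG PAM places the target inside the window, whereas one N4CC PAM places it at position 4 of the protospacer.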


Furthermore, we contemplate extending base editing to a tyrosinemia mouse model for reversing the G-to-A point mutation by viral delivery methods using ABEmax-nNme2Cas9 (D16A), where the desired editing cannot be achieved with SpyCas9-derived base editors due to a suboptimal base editing window relative to available PAMs nearby (e.g. FIG. 4).


B. NmeCas9 Constructs: Compact & Hyperaccurate


Clustered, regularly interspaced, short, palindromic repeats (CRISPR) along with CRISPR-associated (Cas) proteins constitute bacterial and archaeal adaptive immune pathways against phages and other mobile genetic elements (MGEs) (Barrangou et al., 2007; Brouns et al., 2008; Marraffini and Sontheimer, 2008). In Type II CRISPR systems, CRISPR RNA (crRNA) is bound to a trans-activating crRNA (tracrRNA) and loaded onto a Cas9 effector protein that cleaves MGE nucleic acids complementary to the crRNA (Garneau et al., 2010; Deltcheva et al., 2011; Sapranauskas et al., 2011; Gasiunas et al., 2012; Jinek et al., 2012). The crRNA:tracrRNA hybrid can be fused into a single-guide RNA (sgRNA) (Jinek et al., 2012). The RNA programmability of Cas9 endonucleases has made it a powerful genome editing platform in biotechnology and medicine (Cho et al., 2013; Cong et al., 2013; Hwang et al., 2013; Jiang et al., 2013; Jinek et al., 2013; Mali et al., 2013b).


In addition to sgRNA, Cas9 target recognition is usually associated with a 1-5 nucleotide signature downstream of the complementary DNA sequence, called a protospacer adjacent motif (PAM) (Deveau et al., 2008; Mojica et al., 2009). Cas9 orthologs exhibit considerable diversity in PAM length and sequence. Among Cas9 orthologs that have been characterized, Streptococcus pyogenes Cas9 (SpyCas9) is the most widely used, in part because it recognizes a short NGG PAM (Jinek et al., 2012) (N represents any nucleotide) that affords a high density of targetable sites. Nevertheless, SpyCas9's relatively large size (i.e., 1,368 amino acids) makes this Cas9 difficult to package (along with sgRNA and promoters) into a single recombinant adeno-associated virus (rAAV). This has been shown to be a drawback for therapeutic applications given the promise shown by AAV vectors for in vivo gene delivery (Keeler et al., 2017). Moreover, SpyCas9 and its RNA guides have required extensive characterization and engineering to minimize the tendency to edit near-cognate, off-target sites (Bolukbasi et al., 2015b; Tsai and Joung, 2016; Tycko et al., 2016; Chen et al., 2017; Casini et al., 2018; Yin et al., 2018). To date, subsequent engineering efforts have not overcome these size limitations.


Several Cas9 orthologs of less than 1,100 amino acids in length obtained from diverse species have been validated for mammalian genome editing, including strains of N. meningitidis (NmeCas9, 1,082 aa) (Esvelt et al., 2013; Hou et al., 2013), Staphylococcus aureus (SauCas9, 1,053 aa) (Ran et al., 2015), Campylobacter jejuni (CjeCas9, 984 aa) (Kim et al., 2017), and Geobacillus stearothermophilus (GeoCas9, 1,089 aa) (Harrington et al., 2017b). NmeCas9, CjeCas9, and GeoCas9 are representatives of type II-C Cas9s (Mir et al., 2018), most of which are <1,100 aa. With the exception of GeoCas9, each of these shorter sequence orthologs has been successfully deployed for in vivo editing via all-in-one AAV delivery (in which a single vector expresses both guide and effector) (Ran et al., 2015; Kim et al., 2017; Ibraheim et al., 2018, submitted). Furthermore, NmeCas9 and CjeCas9 have been shown to be naturally resistant to off-target editing (Lee et al., 2016; Kim et al., 2017; Amrani et al., 2018, submitted). However, the PAMs that are recognized by compact Cas9s are usually longer than that of SpyCas9, substantially reducing the number of targetable sites at or near a given locus; for example, i) N4GAYW/N4GYTT/N4GTCT for NmeCas9 (Esvelt et al., 2013; Hou et al., 2013; Lee et al., 2016; Amrani et al., 2018); ii) N2GRRT for SauCas9 (Ran et al., 2015); iii) N4RYAC for CjeCas9 (Kim et al., 2017); and iv) N4CRAA/N4GMAA for GeoCas9s (Harrington et al., 2017b) (Y=C, T; R=A, G; M=A, C; W=A, T). A reduced number of targetable sites is particularly limiting for highly accurate and precise gene editing tasks including, but not limited to: i) editing of small targets (e.g. miRNAs); ii) correction of mutations by base editing, which alters a very narrow window of bases relative to the PAM (Komor et al., 2016; Gaudelli et al., 2017); or iii) precise editing via homology-directed repair (HDR), which is most efficient when the rewritten bases are close to the cleavage site (Gallagher and Haber, 2018). Because of PAM restrictions, many editing sites cannot be targeted using all-in-one AAV vectors for in vivo delivery even with these shorter Cas9 proteins. For example, a SauCas9 mutant (SauCas9KKH) has been developed that has reduced PAM constraints (N3RRT), though this increase in targeting range often comes at the cost of reduced on-target editing efficacy, and off-target edits are still observed (Kleinstiver et al., 2015).
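The PAM shorthand used above (a letter optionally followed by a repeat count, with degenerate codes Y, R, M, W and N) can be expanded mechanically. The following Python sketch is offered only as an illustration; the helper names are assumptions introduced here, and only the degenerate codes defined in the preceding paragraph are supported.

```python
import re

# Degenerate nucleotide codes as defined in the text (N = any nucleotide).
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]", "R": "[AG]", "M": "[AC]", "W": "[AT]", "N": "[ACGT]",
}


def expand_pam(pam):
    """Expand shorthand such as 'N4GAYW' into a regular expression string."""
    parts = []
    for symbol, count in re.findall(r"([A-Z])(\d*)", pam.upper()):
        parts.append(CODES[symbol] * (int(count) if count else 1))
    return "".join(parts)


def pam_matches(pam, site):
    """Return True when the DNA string `site` satisfies the PAM shorthand."""
    return re.fullmatch(expand_pam(pam), site.upper()) is not None


print(expand_pam("N4CC"))
print(pam_matches("N4GAYW", "ACGTGATT"), pam_matches("N2GRRT", "TTGAGT"))
```

A symbol outside the table raises a KeyError, which is a deliberate choice in this sketch so that unsupported codes are not silently ignored.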


Safe and effective CRISPR-based therapeutic gene editing will be greatly enhanced by Cas9 orthologs and variants that are highly active in human cells, resistant to off-targeting, sufficiently compact for all-in-one AAV delivery, and capable of accessing a high density of genomic sites. In one embodiment, the present invention contemplates a compact, hyper-accurate Cas9 (Nme2Cas9) from a distinct strain of N. meningitidis. In one embodiment, the present invention contemplates a method for single-AAV delivery of Nme2Cas9 and its sgRNA to perform efficient genome editing in vivo and/or ex vivo. Although it is not necessary to understand the mechanism of an invention, it is believed that this ortholog functions efficiently in mammalian cells and recognizes an N4CC PAM that affords a target site density identical to that of wild-type SpyCas9 (e.g., every 8 bp on average, when both DNA strands are considered).
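The stated density of one site every 8 bp follows from simple arithmetic, sketched below in Python. The sketch assumes a random sequence with equal, independent base frequencies, which real genomes only approximate, and the function name is an assumption introduced here. A PAM with two fixed bases (NGG or N4CC) occurs with probability 1/16 at any position on one strand, hence 1/8 when both strands are considered.

```python
from fractions import Fraction


def expected_spacing(fixed_bases, strands=2):
    """Average distance (bp) between PAM occurrences in random sequence.

    fixed_bases : number of fully specified positions in the PAM
    strands     : number of DNA strands considered (2 = both)
    Assumes each base occurs independently with probability 1/4.
    """
    per_position = Fraction(1, 4) ** fixed_bases
    return 1 / (per_position * strands)


print(expected_spacing(2))   # NGG or N4CC: two fixed bases
print(expected_spacing(4))   # e.g. N4GTCT: four fixed bases
```

Under the same assumptions, a PAM with four fixed bases such as N4GTCT would be expected roughly once every 128 bp, which illustrates why longer PAMs reduce target site density.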


1. PAM Interacting Domains and Anti-CRISPR Proteins


PAM recognition by Cas9 orthologs occurs predominantly through protein-DNA interactions between the PAM Interacting Domain (PID) and the nucleotides adjacent to the protospacer (Jiang and Doudna, 2017). PAM mutations often enable phage escape from type II CRISPR immunity (Paez-Espino et al., 2015), placing these systems under selective pressure not only to acquire new CRISPR spacers, but also to evolve new PAM specificities via PID mutations. In addition, some phages and MGEs express anti-CRISPR (Acr) proteins that inhibit Cas9 (Pawluk et al., 2016; Hynes et al., 2017; Rauch et al., 2017). PID binding is an effective inhibitory mechanism adopted by some Acrs (Dong et al., 2017; Shin et al., 2017; Yang and Patel, 2017), suggesting that PID variation may also be driven by selective pressure to escape Acr inhibition. Cas9 PIDs can evolve such that closely-related orthologs recognize distinct PAMs, as illustrated recently in two species of Geobacillus. The Cas9 encoded by G. stearothermophilus recognizes a N4CRAA PAM, but when its PID was swapped with that of strain LC300's Cas9, its PAM requirement changed to N4GMAA (Harrington et al., 2017b).


In one embodiment, the present invention contemplates a plurality of N. meningitidis Cas9 orthologs with divergent PIDs that recognize different PAMs. In one embodiment, the present invention contemplates a Cas9 protein with a high sequence identity (>80% along its entire length) to that of NmeCas9 strain 8013 (Nme1Cas9) (Zhang et al., 2013). Nme1Cas9 also has a small size and naturally high accuracy, as discussed above (Lee et al., 2016; Amrani et al., 2018). Alignments revealed three clades of meningococcal Cas9 orthologs, each with >98% identity in the N-terminal ˜820 amino acid (aa) residues, which includes all regions of the protein other than the PID. See, FIG. 5A and FIG. 6A.


All of these Cas9 orthologs are 1,078-1,082 aa in length. The first clade (group 1) includes orthologs in which the >98% aa sequence identity with Nme1Cas9 extends through the PID. In contrast, the other two groups had PIDs that were significantly diverged from that of Nme1Cas9, with group 2 and group 3 orthologs averaging ˜52% and ˜86% PID sequence identity with Nme1Cas9, respectively. One meningococcal strain was selected from each of these two groups for detailed analysis: i) Del1444 from group 2; and ii) 98002 from group 3, which are referred to herein as Nme2Cas9 (1,082 aa) and Nme3Cas9 (1,081 aa), respectively. The CRISPR-cas loci from these two strains have repeat sequences and spacer lengths that are identical to those of strain 8013. See, FIG. 6B. This strongly suggested that their mature crRNAs also have 24 nt guide sequences and 24 nt repeat sequences (Zhang et al., 2013). Similarly, the tracrRNA sequences of Del1444 and 98002 were 100% identical to the 8013 tracrRNA. See, FIG. 6B. These observations imply that the same sgRNA sequence scaffold can guide DNA cleavage by all three Cas9s.


To determine whether these Cas9 orthologs have distinct PAMs, the PID of Nme1Cas9 was replaced with that of either Nme2Cas9 or Nme3Cas9. To identify the corresponding PAM requirements, these protein chimeras were expressed in Escherichia coli, purified, and used for in vitro PAM identification (Karvelis et al., 2015; Ran et al., 2015; Kim et al., 2017). Briefly, a pool of DNA fragments containing a protospacer followed by a 10-nt randomized sequence was cleaved in vitro using recombinant Cas9 and a cognate, in vitro-transcribed sgRNA. See, FIG. 5B. Only those DNAs containing a Cas9 PAM sequence were expected to be cleaved. Cleavage products were then sequenced to identify the PAMs. See, FIGS. 5C-D.
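By way of a non-limiting illustration, the sequencing readout of such a PAM discovery assay can be summarized as per-position base frequencies and information content, which is the quantity conventionally drawn in a sequence logo. The Python sketch below is an assumption about how such a summary could be computed and is not the analysis pipeline used herein; the function names and the toy reads are hypothetical, and no small-sample correction is applied.

```python
import math
from collections import Counter

BASES = "ACGT"


def position_frequencies(pam_reads):
    """Per-position base frequencies for equal-length PAM-region reads."""
    length = len(pam_reads[0])
    if any(len(read) != length for read in pam_reads):
        raise ValueError("all PAM-region reads must have the same length")
    table = []
    for i in range(length):
        counts = Counter(read[i] for read in pam_reads)
        total = sum(counts[b] for b in BASES)
        table.append({b: counts[b] / total for b in BASES})
    return table


def information_bits(freqs):
    """Information content of one position: 2 bits minus the Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return 2.0 - entropy


# Hypothetical toy reads (NOT data from this disclosure): 8-nt PAM regions
# recovered from cleaved products.
toy_reads = ["AGTCCCAT", "TTGACCGA", "CATGCCAC", "GCAACCAG"]
bits = [information_bits(f) for f in position_frequencies(toy_reads)]
for position, value in enumerate(bits, start=1):
    print(f"PAM position {position}: {value:.2f} bits")
```

In this toy input, positions 5 and 6 are invariant and therefore carry the maximum of 2 bits, while a position at which all four bases are equally represented carries 0 bits.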


The expected N4GATT PAM consensus was validated in the recovered full-length Nme1Cas9. See, FIG. 5C. Chimeric PID-swapped derivatives exhibited a strong preference for a C residue in the 5th position in place of the G recognized by Nme1Cas9. See, FIG. 5D.


In one embodiment, ABE7.10-nNme2Cas9 (D16A) is used for single-base editing of A●T base pair to a G●C base pair. In one embodiment, BEmax-nNme2Cas9 (D16A) is used for single-base editing of A●T base pair to a G●C base pair. (See, FIG. 3F).



FIG. 5A-E illustrates three exemplary closely related Neisseria meningitidis Cas9 orthologs that have distinct PAMs. FIG. 5A shows an exemplary schematic showing mutated residues (orange spheres) between Nme2Cas9 (left) and Nme3Cas9 (right) mapped onto the predicted structure of Nme1Cas9, revealing the cluster of mutations in the PID (black). FIG. 5B shows an exemplary experimental workflow of the in vitro PAM discovery assay with a 10-bp randomized PAM region. Following in vitro digestion, adapters were ligated to cleaved products for library construction and sequencing. FIG. 5C shows exemplary sequence logos resulting from in vitro PAM discovery that reveal the enrichment of a N4GATT PAM for Nme1Cas9, consistent with its previously established specificity. FIG. 5D shows exemplary sequence logos indicating that Nme1Cas9 with its PID swapped with that of Nme2Cas9 (left) or Nme3Cas9 (right) requires a C at PAM position 5. The remaining nucleotides were not determined with high confidence due to the modest cleavage efficiency of the PID-swapped protein chimeras (see FIG. 6C). FIG. 5E shows an exemplary sequence logo showing that full-length Nme2Cas9 recognizes an N4CC PAM, based on efficient substrate cleavage of a target pool with a fixed C at PAM position 5, and with PAM nts 1-4 and 6-8 randomized.


Any remaining PAM nucleotides could not be confidently assigned due to the low cleavage efficiencies of the chimeric proteins under the conditions used. See, FIG. 6C. To further resolve the PAMs, in vitro assays were performed on a library with a 7-nt randomized sequence possessing an invariant C at the 5th PAM position (e.g., 5′-NNNNCNNN-3′ on the sgRNA-noncomplementary strand). This strategy yielded a much higher cleavage efficiency and the results indicated that the Nme2Cas9 and Nme3Cas9 PIDs recognize CC(A) and NNNNCAAA PAMs, respectively. See, FIGS. 6C-D. The Nme3Cas9 consensus is similar to that of GeoCas9 (Harrington et al., 2017b).


These tests were repeated using a full-length Nme2Cas9 (rather than a PID-swapped chimera) with the NNNNCNNN DNA pool, and again a CC(A) consensus was recovered. See, FIG. 5E. It was noted that this test had more efficient cleavage. See, FIG. 6C. These data suggest that one or more of the 15 amino acid changes in Nme2Cas9 (relative to Nme1Cas9) outside of the PID support efficient DNA cleavage activity. See, FIG. 6C. Because the unique, 2-3 nt PAM of Nme2Cas9 affords a higher density of potential target sites than the previously described compact Cas9 orthologs, it was selected for further analyses.



FIG. 6A-D presents a characterization of Neisseria meningitidis Cas9 orthologs with rapidly-evolving PIDs, as related to FIG. 5A-E. FIG. 6A shows an exemplary unrooted phylogenetic tree of NmeCas9 orthologs that are >80% identical to Nme1Cas9. Three distinct branches emerged, with the majority of mutations clustered in the PID. Groups 1 (blue), 2 (orange), and 3 (green) have PIDs with >98%, approximately 52%, and approximately 86% identity to Nme1Cas9, respectively. Three representative Cas9 orthologs (one from each group) (Nme1Cas9, Nme2Cas9 and Nme3Cas9) are indicated. FIG. 6B shows an exemplary schematic showing the CRISPR-cas loci of the strains encoding the three Cas9 orthologs (Nme1Cas9, Nme2Cas9, and Nme3Cas9) from (A). Percent identities of each CRISPR-Cas component with N. meningitidis 8013 (encoding Nme1Cas9) are shown. Blue and red arrows denote pre-crRNA and tracrRNA transcription initiation sites, respectively. FIG. 6C shows exemplary normalized read counts (% of total reads) from cleaved DNAs from the in vitro assays for intact Nme1Cas9 (grey), for chimeras with Nme1Cas9's PID swapped with those of Nme2Cas9 and Nme3Cas9 (mixed colors), and for full-length Nme2Cas9 (orange). The reduced normalized read counts indicate lower cleavage efficiencies in the chimeras. FIG. 6D shows exemplary sequence logos from the in vitro PAM discovery assay on an NNNNCNNN PAM pool by Nme1Cas9 with its PID swapped with those of Nme2Cas9 (left) or Nme3Cas9 (right).


2. N4CC PAM-Directed Gene Editing


To test the efficacy of Nme2Cas9 in human genome editing, a full-length (e.g., not PID-swapped) human-codon-optimized Nme2Cas9 construct was cloned into a mammalian expression plasmid with appended nuclear localization signals (NLSs) and linkers validated previously for Nme1Cas9 (Amrani et al., 2018). For initial tests, a modified, fluorescence-based Traffic Light Reporter (TLR2.0) was used (Certo et al., 2011). Briefly, a disrupted GFP is followed by an out-of-frame T2A peptide and mCherry cassette. When DNA double-strand breaks (DSBs) are introduced in the broken-GFP cassette, a subset of non-homologous end joining (NHEJ) repair events leave +1-frameshifted indels, placing mCherry in frame and yielding red fluorescence that can be easily quantified by flow cytometry. See, FIG. 7A. Homology-directed repair (HDR) outcomes can also be scored simultaneously by including a DNA donor that restores the functional GFP sequence, yielding a green fluorescence (Certo et al., 2011). Because some indels do not introduce a +1 frameshift, the fluorescence readout generally provides an underestimate of the true editing efficiency. Nonetheless, the speed, simplicity, and low cost of the assay make it useful as an initial, semi-quantitative measure of genome editing in HEK293T cells carrying a single TLR2.0 locus incorporated via lentivector.
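By way of a non-limiting illustration, the reason the mCherry readout underestimates total editing can be expressed as a simple frame calculation. The Python sketch below assumes that a "+1 frameshift" corresponds to a net indel length congruent to 1 modulo 3 (for example, a 1-bp insertion or a 2-bp deletion); that convention, and the function name tlr_readout, are assumptions made for illustration and are not taken from the reporter's published design.

```python
def tlr_readout(net_indel_bp: int) -> str:
    """Classify an NHEJ outcome by the reading-frame shift it causes.

    net_indel_bp: inserted bases minus deleted bases (e.g., +1, -2, -3).
    Assumption: only a net shift of +1 (mod 3) places mCherry in frame.
    """
    if net_indel_bp == 0:
        return "no net length change"
    # Python's % always returns a non-negative result, so -2 % 3 == 1.
    if net_indel_bp % 3 == 1:
        return "mCherry+ (+1 frame)"
    return "edited but not scored"


for indel in (+1, -2, -1, -3, +2):
    print(f"{indel:+d} bp -> {tlr_readout(indel)}")
```

Under this convention, indels such as a 1-bp deletion or a 3-bp deletion are genuine edits that produce no red fluorescence, which is why the fluorescence signal is treated as a lower bound.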


For initial tests, Nme2Cas9 plasmid was transiently co-transfected with one of fifteen sgRNA plasmids carrying spacers that target TLR2.0 sites with N4CC PAMs. No HDR donor was included, so only NHEJ-based editing (mCherry) was scored. Most sgRNAs were in a G23 format (i.e. a 5′-terminal G to facilitate transcription, followed by a 23 nt guide sequence), as used routinely for Nme1Cas9 (Lee et al., 2016; Pawluk et al., 2016; Amrani et al., 2018; Ibraheim et al., 2018). No sgRNA and an sgRNA targeting an N4GATT PAM were used as negative controls, and SpyCas9+sgRNA and Nme1Cas9+sgRNA co-transfections (targeting NGG and N4GATT protospacers, respectively) were included as positive controls. Editing by SpyCas9 and Nme1Cas9 was readily detectable (˜28% and 10% mCherry, respectively). See, FIG. 7B.
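By way of a non-limiting illustration, candidate Nme2Cas9 target sites can be enumerated by scanning both strands of a sequence for a 23-nt protospacer followed immediately by an N4CC PAM. The Python sketch below is one assumed way to do this; the function names are hypothetical, the 5′-terminal G is simply prepended to the 23-nt guide (whether that G matches the genomic sequence is not addressed), and positions reported for the minus strand are indices on the reverse-complemented sequence.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_n4cc_sites(seq: str, protospacer_len: int = 23):
    """Return candidate sites as dicts with strand, start, guide (G23) and PAM."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]{4}CC))" % protospacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            protospacer, pam = match.group(1), match.group(2)
            sites.append({
                "strand": strand,
                "start": match.start(),       # index on the scanned strand
                "guide": "G" + protospacer,   # assumed G23 format
                "pam": pam,
            })
    return sites


# Hypothetical demo sequence (not from this disclosure).
demo = "ACGT" * 5 + "ACG" + "TTTTCC"
for site in find_n4cc_sites(demo):
    print(site)
```

Only the six PAM positions N4CC are checked here; the sketch does not score predicted efficiency or off-target potential.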


For Nme2Cas9, all 15 targets with N4CC PAMs were functional, though to various extents ranging from 4% to 20% mCherry. These fifteen sites include examples with each of the four possible nucleotides in the 7th PAM position (e.g., after the CC dinucleotide), indicating that the slight preference for an A residue observed in vitro (FIG. 5E) does not reflect a PAM requirement for editing applications in human cells. The N4GATT PAM control yielded an mCherry signal similar to that of the no-sgRNA control. See, FIG. 7B.


To determine whether both C residues in the N4CC PAM are involved in editing, a series of N4DC (D=A, T, G) and N4CD PAM sites were tested in TLR2.0 reporter cells. See, FIGS. 8A and 8B. No detectable editing was found at any of these sites, providing an initial indication that both C residues of the N4CC PAM consensus are required for efficient Nme2Cas9 activity.
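By way of a non-limiting illustration, the PAM classes used in this comparison can be assigned from positions 5 and 6 of the PAM region. The Python sketch below uses a hypothetical function name, classify_pam, and is applied to three 8-nt PAM strings that appear in Table 1 (those listed for TS1, TS4 and TS6).

```python
def classify_pam(pam: str) -> str:
    """Classify a PAM region by its 5th and 6th nucleotides (D = A, T or G)."""
    if len(pam) < 6:
        raise ValueError("at least six PAM nucleotides are required")
    fifth, sixth = pam[4].upper(), pam[5].upper()
    if fifth == "C" and sixth == "C":
        return "N4CC"
    if fifth == "C":
        return "N4CD"
    if sixth == "C":
        return "N4DC"
    return "other"


for pam in ("CCTCCACC", "TAGACGAA", "CAGCCCAA"):
    print(pam, classify_pam(pam))
```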


The length of the spacer in the crRNA differs among Cas9 orthologs and can affect on- vs. off-target activity (Cho et al., 2014; Fu et al., 2014). SpyCas9's optimal spacer length is 20 nts, with truncations down to 17 nts tolerated (Fu et al., 2014). In contrast, Nme1Cas9 usually has 24-nt spacers (Hou et al., 2013; Zhang et al., 2013), and tolerates truncations down to 18-20 nts (Lee et al., 2016; Amrani et al., 2018). To test spacer length requirements for Nme2Cas9, a panel of guide RNA plasmids was created, each targeting a single TLR2.0 site but with varying spacer lengths. See, FIG. 7C and FIG. 8C. Comparable activities were observed with G23, G22 and G21 guides, but activity decreased significantly upon further truncation to G20 and G19 lengths. See, FIG. 7C. These results validate Nme2Cas9 as a genome editing platform, with 22-24 nt guide sequences, at N4CC PAM sites in cultured human cells.
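By way of a non-limiting illustration, a truncation panel of this kind can be generated from a single 23-nt guide sequence. The Python sketch below assumes that truncations remove nucleotides from the 5′ (PAM-distal) end and that the 5′-terminal G is retained in every variant; both points, and the function name truncation_series, are assumptions made for illustration.

```python
def truncation_series(guide23: str, lengths=(23, 22, 21, 20, 19)):
    """Build G23..G19 guides by trimming the 5' end of a 23-nt guide sequence."""
    if len(guide23) < max(lengths):
        raise ValueError("guide is shorter than the longest requested length")
    series = {}
    for n in lengths:
        # Keep the n PAM-proximal (3') nucleotides, then prepend the 5' G.
        series[f"G{n}"] = "G" + guide23[-n:]
    return series


# Hypothetical 23-nt guide sequence (not from this disclosure).
for name, guide in truncation_series("ACGT" * 5 + "ACG").items():
    print(name, len(guide), guide)
```

With this naming, a G23 guide is 24 nt in total and a G21 guide is 22 nt in total, which corresponds to the 22-24 nt range noted above.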



FIG. 7A-D presents exemplary data showing that Nme2Cas9 uses a 22-24 nt spacer to edit sites adjacent to an N4CC PAM. All experiments were done in triplicate, and error bars represent the standard error of the mean (s.e.m.). FIG. 7A shows an exemplary schematic diagram depicting transient transfection and editing of HEK293T TLR2.0 cells, with mCherry+ cells detected by flow cytometry 72 hours after transfection. FIG. 7B shows exemplary Nme2Cas9 editing of the TLR2.0 reporter. Sites with N4CC PAMs were targeted with varying efficiencies, while no Nme2Cas9 targeting was observed at an N4GATT PAM or in the absence of sgRNA. SpyCas9 (targeting a previously validated site with an NGG PAM) and Nme1Cas9 (targeting N4GATT) were used as positive controls. FIG. 7C shows an exemplary effect of spacer length on the efficiency of Nme2Cas9 editing. sgRNAs targeting a single TLR2.0 site, with spacer lengths varying from 24 to 20 nts (including the 5′-terminal G required by the U6 promoter), indicate that the highest editing efficiencies are obtained with 22-24 nt spacers. FIG. 7D shows that an exemplary Nme2Cas9 dual nickase can be used in tandem to generate NHEJ- and HDR-based edits in TLR2.0. Nme2Cas9- and sgRNA-expressing plasmids, along with an 800-bp dsDNA donor for homologous repair, were electroporated into HEK293T TLR2.0 cells, and both NHEJ (mCherry+) and HDR (GFP+) outcomes were scored by flow cytometry. HNH nickase, Nme2Cas9D16A; RuvC nickase, Nme2Cas9H588A. Cleavage sites 32 bp and 64 bp apart were targeted using either nickase. The HNH nickase (Nme2Cas9D16A) yielded efficient editing, particularly with the cleavage sites that were separated by 32 bp, whereas the RuvC nickase (Nme2Cas9H588A) was not effective. Wildtype Nme2Cas9 was used as a control.


3. Precise Editing By HDR And HNH Nickase


Cas9 enzymes use their HNH and RuvC domains to cleave the guide-complementary and non-complementary strand of the target DNA, respectively. SpyCas9 nickases (nCas9s), in which either the HNH or RuvC domain is mutationally inactivated, have been used to induce homology-directed repair (HDR) and to improve genome editing specificity via DSB induction by dual nickases (Mali et al., 2013a; Ran et al., 2013).


To test the efficacy of Nme2Cas9 as a nickase, a Nme2Cas9D16A (HNH nickase) and Nme2Cas9H588A (RuvC nickase) were created, which possess alanine mutations in catalytic residues of the RuvC and HNH domains, respectively (Esvelt et al., 2013; Hou et al., 2013; Zhang et al., 2013). TLR2.0 cells, along with a GFP donor dsDNA, were used to determine whether Nme2Cas9-induced nicks can induce precise edits via HDR. Target sites within TLR2.0 were used to test the functionality of each nickase using guides targeting cleavage sites spaced 32 bp and 64 bp apart. See, FIG. 7D. Wildtype Nme2Cas9 targeting a single site showed efficient editing, with both NHEJ and HDR as outcomes of repair. For nickases, cleavage sites 32 bp and 64 bp apart showed editing using the Nme2Cas9D16A (HNH nickase), but neither target pair worked with Nme2Cas9H588A. These results suggest that Nme2Cas9 HNH nickase can be used for efficient genome editing, as long as the sites are in close proximity.


Studies in previously characterized Cas9s have identified a specific region proximal to the PAM where Cas9 activity is highly sensitive to sequence mismatches. This 8 to 12-nt region is known as the seed sequence and has been observed among all Cas9s characterized to date (Gorski et al., 2017). To determine whether Nme2Cas9 also possesses a seed sequence, a series of transient transfections was performed, each targeting the same locus in TLR2.0, but with a single-nucleotide mismatch at different positions of the guide. See, FIG. 8D. A significant decrease in the number of mCherry-positive cells was observed for mismatches in the first 10-12 nts proximal to the PAM, suggesting that Nme2Cas9 possesses a seed sequence in this region.
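By way of a non-limiting illustration, a mismatch-walking guide panel of this kind can be generated programmatically. The Python sketch below assumes that each mismatch is introduced by replacing a base with its complement; the substitution rule actually used is not specified herein, and the function name mismatch_walk is hypothetical. Positions are reported as distance from the PAM, with 1 denoting the PAM-proximal nucleotide.

```python
# Assumption: each mismatch swaps the base for its complement.
SWAP = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_walk(spacer: str):
    """Return (distance_from_PAM, variant) pairs, one single mismatch each."""
    spacer = spacer.upper()
    n = len(spacer)
    variants = []
    for i in range(n):
        mutated = spacer[:i] + SWAP[spacer[i]] + spacer[i + 1:]
        # The PAM lies 3' of the protospacer, so the last base is position 1.
        variants.append((n - i, mutated))
    return variants


# Hypothetical 23-nt spacer (not from this disclosure).
for distance, variant in mismatch_walk("ACGT" * 5 + "ACG"):
    print(f"mismatch at PAM-proximal position {distance:2d}: {variant}")
```

Comparing editing across such a panel indicates which positions are least tolerant of mismatches, which is how a seed region is delineated.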



FIG. 8A-D presents exemplary data showing PAM, spacer, and seed requirements for Nme2Cas9 targeting in mammalian cells, as related to FIG. 7A-D. All experiments were done in triplicate and error bars represent s.e.m. FIG. 8A shows exemplary Nme2Cas9 targeting at N4CD sites in TLR2.0, with editing estimated based on mCherry+ cells. Four sites for each non-C nucleotide at the tested position (N4CA, N4CT and N4CG) were examined, and an N4CC site was used as a positive control. FIG. 8B shows exemplary Nme2Cas9 targeting at N4DC sites in TLR2.0 [similar to (A)]. FIG. 8C shows exemplary guide truncations on a TLR2.0 site (distinct from that in FIG. 7C) with a N4CCA PAM, revealing similar length requirements as those observed at the other site. FIG. 8D shows that exemplary Nme2Cas9 targeting efficiency is differentially sensitive to single-nucleotide mismatches in the seed region of the sgRNA. Data show the effects of walking single-nucleotide sgRNA mismatches along the 23-nt spacer in a TLR2.0 target site.


4. Delivery Methods to Mammalian Cell Types


Nme2Cas9's ability to function in different mammalian cell lines was tested using various delivery methods. As an initial test, forty (40) different sites were tested (29 with an N4CC PAM and 11 with an N4CD PAM). Several loci were selected (AAVS1, VEGFA, etc.), and target sites with N4CC PAMs were randomly chosen for editing with Nme2Cas9. Editing (%) was determined by transiently transfecting 150 ng of Nme2Cas9 plasmid along with 150 ng of sgRNA plasmid, followed by TIDE analysis 72 hours post-transfection. A subset of sites exhibiting a range of editing efficiencies in this initial screen was selected for repeat analyses in triplicate. See, FIG. 9A; and Table 1.



FIG. 9A-C presents exemplary data showing Nme2Cas9 genome editing at endogenous loci in mammalian cells via multiple delivery methods. All results represent 3 independent biological replicates, and error bars represent s.e.m. FIG. 9A shows exemplary Nme2Cas9 genome editing of endogenous human sites in HEK293T cells following transient transfection of Nme2Cas9- and sgRNA-expressing plasmids. 40 sites were screened initially (Table 1); the 14 sites shown (selected to include representatives of varying editing efficiencies, as measured by TIDE) were then re-analyzed in triplicate. An Nme1Cas9 target site (with an N4GATT PAM) was used as a negative control. FIG. 9B shows exemplary data charts. Left panel: Transient transfection of a single plasmid expressing both Nme2Cas9 and sgRNA (targeting the Pcsk9 and Rosa26 loci) enables editing in Hepa1-6 mouse cells, as detected by TIDE. Right panel: Electroporation of sgRNA plasmids into K562 cells stably expressing Nme2Cas9 from a lentivector results in efficient indel formation. FIG. 9C shows that exemplary Nme2Cas9 can be electroporated as an RNP complex to induce genome editing. 40 picomoles of Cas9 along with 50 picomoles of in vitro-transcribed sgRNAs targeting three different loci were electroporated into HEK293T cells. Indels were measured after 72 h using TIDE.









TABLE 1

Exemplary endogenous human genome editing sites targeted by Nme2Cas9.

SEQ ID NOS.  No.  Site Name   Spacer Seq                PAM        Locus      Editing (%)
8, 9         1    TS1         GGTTCTGGGTACTTTTATCTGTCC  CCTCCACC   AAVS1      ND
12, 13       2    TS4         GTCTGCCTAACAGGAGGTGGGGGT  TAGACGAA   AAVS1      11
16, 17       3    TS5         GAATATCAGGAGACTAGGAAGGAG  GAGGCCTA   AAVS1      15
20, 21       4    TS6         GCCTCCCTGCAGGGCTGCTCCC    CAGCCCAA   LINC01588  20
24, 25       5    TS10        GAGCTAGTCTTCTTCCTCCAACCC  GGGCCCTA   AAVS1      3.5
28, 29       6    TS11        GATCTGTCCCCTCCACCCCACAGT  GGGGCCAC   AAVS1      9
32, 33       7    TS12        GGCCCAAATGAAAGGAGTGAGAGG  TGACCCGA   AAVS1      10
36, 37       8    TS13        GCATCCTCTTGCTTTCTTTGCCTG  GACACCCCA  AAVS1      2
40, 41       9    TS16        GGAGTCGCCAGAGGCCGGTGGTGG  ATTTCCTC   LINC01588  28
44, 45       10   TS17        GCCCAGCGGCCGGATATCAGCTGC  CACGCCCG   LINC01588  ND
48, 49       11   TS18        GGAAGGGAACATATTACTATTGC   TTTCCCTC   CYBB       1
52, 53       12   TS19        GTGGAGTGGCCTGCTATCAGCTAC  CTATCCAA   CYBB       6
56, 57       13   TS20        GAGGAAGGGAACATATTACTATTG  CTTTCCCT   CYBB       11.2
60, 61       14   TS21        GTGAATTCTCATCAGCTAAAATGC  CAAGCCTT   CYBB       1
64, 65       15   TS25        GCTCACTCACCCACACAGACACAC  ACGTCCTC   VEGFA      15.6
68, 69       16   TS26        GGAAGAATTTCATTCTGTTCTCAG  TTTTCCTG   CFTR       2
72, 73       17   TS27        GCTCAGTTTTCCTGGATTATGCCT  GGCACCAT   CFTR       4
76, 77       18   TS31        GCGTTGGAGCGGGGAGAAGGCCAG  GGGTCACT   VEGFA      9
80, 81       19   TS34        GGGCCGCGGAGATAGCTGCAGGGC  GGGGCCCC   LINC01588  ND
84, 85       20   TS35        GCCCACCCGGCGGCGCCTCCCTGC  AGGGCTGC   LINC01588  ND
88, 89       21   TS36        GCGTGGCAGCTGATATCCGGCCGC  TGGGCGTC   LINC01588  ND
92, 93       22   TS37        GCCGCGGCGCGACGTGGAGCCAGC  CCCGCAAA   LINC01588  ND
96, 97       23   TS38        GTGCTCCCCAGCCCAAACCGCCGC  GGCGCGAC   LINC01588  2
100, 101     24   TS41        GTCAGATTGGCTTGCTCGGAATTG  CCAGCCAA   AGA        3
104, 105     25   TS44        GCTGGGTGAATGGAGCGAGCAGCG  TCTTCGAG   VEGFA      3
108, 109     26   TS45        GTCCTGGAGTGACCCCTGGCCTTC  TCCCCGCT   VEGFA      7.4
112, 113     27   TS46        GATCCTGGAGTGACCCCTGGCCTT  CTCCCCGC   VEGFA      6
116, 117     28   TS47        GTGTGTCCCTCTCCCCACCCGTCC  CTGTCCGG   VEGFA      23.1
120, 121     29   TS48        GTTGGAGCGGGGAGAAGGCCAGGG  GTCACTCC   VEGFA      2
124, 125     30   TS49        GCGTTGGAGCGGGGAGAAGGCCAG  GGGTCACT   VEGFA      4
128, 129     31   TS50        GTACCCTCCAATAATTTGGCTGGC  AATTCCGA   AGA        6
132, 133     32   TS51        GATAATTTGGCTGGCAATTCCGAG  CAAGCCAA   AGA        4.5
136, 137     33   TS58 (DS1)  GCAGGGGCCAGGTGTCCTTCTCTG  GGGGCCTC   VEGFA      5
140, 141     34   TS59 (DS2)  GAATGGCAGGCGGAGGTTGTACTG  GGGGCCAG   VEGFA      11.5
144, 145     35   TS60 (DS3)  GAGTGAGAGAGTGAGAGAGAGACA  CGGGCCAG   VEGFA      3
148, 149     36   TS61 (DS4)  GTGAGCAGGCACCTGTGCCAACAT  GGGCCCGC   VEGFA      3.5
152, 153     37   TS62 (DS5)  GCGTGGGGGCTCCGTGCCCCACGC  GGGTCCAT   VEGFA      3.4
156, 157     38   TS63 (DS6)  GCATGGGCAGGGGCTGGGGTGCAC  AGGCCCAG   VEGFA      16
160, 161     39   TS64        GAAAATTGTGATTTCCAGATCCAC  AAGCCCAA   FANCJ      7
164, 165     40   TS65        GAGCAGAAAAAATTGTGATTTCC   AGATCCAC   FANCJ      ND















TABLE 1 (continued)

SEQ ID NOS.  No.  Site Name   TIDE Primer name  FW TIDE primer             RV TIDE primer
10, 11       1    TS1         AAVS1_TIDE1       TGGCTTAGCACCTCTCCAT        AGAACTCAGGACCAACTTATTCTG
14, 15       2    TS4         AAVS1_TIDE1       TGGCTTAGCACCTCTCCAT        AGAACTCAGGACCAACTTATTCTG
18, 19       3    TS5         AAVS1_TIDE1       TGGCTTAGCACCTCTCCAT        AGAACTCAGGACCAACTTATTCTG
22, 23       4    TS6         LINC01588_TIDE    AGAGGAGCCTTCTGACTGCTGCAGA  ATGACAGACACAACCAGAGGGCA
26, 27       5    TS10        AAVS1_TIDE1       TGGCTTAGCACCTCTCCAT        AGAACTCAGGACCAACTTATTCTG
30, 31       6    TS11        AAVS1_TIDE1       TGGCTTAGCACCTCTCCAT        AGAACTCAGGACCAACTTATTCTG
34, 35       7    TS12        AAVS1_TIDE2       TCCGTCTTCCTCCACTCC         TAGGAAGGAGGAGGCCTAAG
38, 39       8    TS13        AAVS1_TIDE2       TCCGTCTTCCTCCACTCC         TAGGAAGGAGGAGGCCTAAG
42, 43       9    TS16        LINC01588_TIDE    AGAGGAGCCTTCTGACTGCTGCAGA  ATGACAGACACAACCAGAGGGCA
46, 47       10   TS17        LINC01588_TIDE    AGAGGAGCCTTCTGACTGCTGCAGA  ATGACAGACACAACCAGAGGGCA
50, 51       11   TS18        NTS55_TIDE        TAGAGAACTGGGTAGTGTG        CCAATATTGCATGGGATGG
54, 55       12   TS19        NTS55_TIDE        TAGAGAACTGGGTAGTGTG        CCAATATTGCATGGGATGG
58, 59       13   TS20        NTS55_TIDE        TAGAGAACTGGGTAGTGTG        CCAATATTGCATGGGATGG
62, 63       14   TS21        NTS55_TIDE        TAGAGAACTGGGTAGTGTG        CCAATATTGCATGGGATGG
66, 67       15   TS25        VEGF_TIDE3        GTACATGAAGCAACTCCAGTCCCA   ATCAAATTCCAGCACCGAGCGC
70, 71       16   TS26        hCFTR_TIDE1       TGGTGATTATGGGAGAACTGGAGC   ACCATTGAGGACGTTTGTCTCAC
74, 75       17   TS27        hCFTR_TIDE1       TGGTGATTATGGGAGAACTGGAGC   ACCATTGAGGACGTTTGTCTCAC
78, 79       18   TS31        VEGF_TIDE3        GTACATGAAGCAACTCCAGTCCCA   ATCAAATTCCAGCACCGAGCGC
82, 83       19   TS34        LINC01588_TIDE    AGAGGAGCCTTCTGACTGCTGCAGA  ATGACAGACACAACCAGAGGGCA
86, 87       20   TS35        LINC01588_TIDE    AGAGGAGCCTTCTGACTGCTGCAGA  ATGACAGACACAACCAGAGGGCA
90, 91       21   TS36        LINC01588_TIDE    AGAGGAGCCTTCTGACTGCTGCAGA  ATGACAGACACAACCAGAGGGCA
94, 95       22   TS37        LINC01588_TIDE    AGAGGAGCCTTCTGACTGCTGCAGA  ATGACAGACACAACCAGAGGGCA
98, 99       23   TS38        LINC01588_TIDE    AGAGGAGCCTTCTGACTGCTGCAGA  ATGACAGACACAACCAGAGGGCA
102, 103     24   TS41        AGA_TIDE1         GGCATAAGGAAATCGAAGGTC      CATGTCCTCAAGTCAAGAACAAG
106, 107     25   TS44        VEGF_TIDE3        GTACATGAAGCAACTCCAGTCCCA   ATCAAATTCCAGCACCGAGCGC
110, 111     26   TS45        VEGF_TIDE3        GTACATGAAGCAACTCCAGTCCCA   ATCAAATTCCAGCACCGAGCGC
114, 115     27   TS46        VEGF_TIDE3        GTACATGAAGCAACTCCAGTCCCA   ATCAAATTCCAGCACCGAGCGC
118, 119     28   TS47        VEGF_TIDE3        GTACATGAAGCAACTCCAGTCCCA   ATCAAATTCCAGCACCGAGCGC
122, 123     29   TS48        VEGF_TIDE3        GTACATGAAGCAACTCCAGTCCCA   ATCAAATTCCAGCACCGAGCGC
126, 127     30   TS49        VEGF_TIDE3        GTACATGAAGCAACTCCAGTCCCA   ATCAAATTCCAGCACCGAGCGC
130, 131     31   TS50        AGA_TIDE1         GGCATAAGGAAATCGAAGGTC      CATGTCCTCAAGTCAAGAACAAG
134, 135     32   TS51        AGA_TIDE1         GGCATAAGGAAATCGAAGGTC      CATGTCCTCAAGTCAAGAACAAG
138, 139     33   TS58 (DS1)  VEGF_TIDE4        ACACGGGCAGCATGGGAATAGTC    GCTAGGGGAGAGTCCCACTGTCCA
142, 143     34   TS59 (DS2)  VEGF_TIDE5        CCTGTGTGGCTTTGCTTTGGTC     GGTAGGGTGTGATGGGAGGCTAAGC
146, 147     35   TS60 (DS3)  VEGF_TIDE5        CCTGTGTGGCTTTGCTTTGGTC     GGTAGGGTGTGATGGGAGGCTAAGC
150, 151     36   TS61 (DS4)  VEGF_TIDE5        CCTGTGTGGCTTTGCTTTGGTC     GGTAGGGTGTGATGGGAGGCTAAGC
154, 155     37   TS62 (DS5)  VEGF_TIDE6        GGAGGAAGAGTAGCTCGCCGAGG    AGACCGAGTGGCAGTGACAGCAAG
158, 159     38   TS63 (DS6)  VEGF_TIDE7        AGGGAGAGGGAAGTGTGGGGAAGG   GTCTTCCTGCTCTGTGCGCACGAC
162, 163     39   TS64        FancJ_TIDE5       GTTGGGGGCTCTAAGTTATGTAT    CTTCATCTGTATCTTCAGGATCA
166, 167     40   TS65        FancJ_TIDE5       GTTGGGGGCTCTAAGTTATGTAT    CTTCATCTGTATCTTCAGGATCA









HEK293T cells were used for transient transfections; at 72 hours post-transfection, the cells were harvested, followed by genomic DNA extraction and selective amplification of the targeted locus. TIDE analysis was used to measure indel efficiency at each locus (Brinkman et al., 2014). Nme2Cas9 editing was detectable at most of these sites, even though efficiencies varied depending on the target sequence. See, Table 1. Interestingly, Nme2Cas9 induced indels at several genomic sites with N4CD PAMs, albeit less consistently and at lower levels. See, Table 1. Fourteen (14) sites with N4CC PAMs were analyzed in triplicate, and consistent editing was observed. See, FIG. 9A. In addition, editing efficiency could be improved significantly by increasing the quantity of the Nme2Cas9 plasmid delivered, and this high efficiency could be extended to precise segmental deletion with two guides. See, FIGS. 10A and 10B.


The ability of Nme2Cas9 to function was tested in mouse Hepa1-6 cells (hepatoma-derived). For Hepa1-6 cells, a single plasmid encoding both Nme2Cas9 and an sgRNA (targeting either Rosa26 or Pcsk9) was transiently transfected and indels were measured after 72 hrs. Editing was readily observed at both sites. See, FIG. 9B, left. Nme2Cas9's functionality was also tested when stably expressed in human leukemia K562 cells. To this end, a lentiviral construct expressing Nme2Cas9 was created, and cells were transduced to stably express Nme2Cas9 under the control of the SFFV promoter. This stable cell line did not show any visible differences with respect to growth and morphology in comparison to untransduced cells, suggesting that Nme2Cas9 is not toxic when stably expressed. These cells were transiently electroporated with plasmids expressing sgRNAs and analyzed by TIDE after 72 hours to measure indel efficiencies. Efficient (>50%) editing was observed at all three sites tested, validating Nme2Cas9's ability to function upon lentiviral delivery in K562 cells. See, FIG. 9B, right.


Ribonucleoprotein (RNP) delivery of Cas9 and its sgRNA is also useful for some genome editing applications, and the greater transience of Cas9's presence can minimize off-target editing (Kim et al., 2014; Zuris et al., 2015). Moreover, some cell types (e.g. certain immune cells) are recalcitrant to DNA transfection-based editing (Schumann et al., 2015). To test whether Nme2Cas9 is functional by RNP delivery, a 6×His-tagged Nme2Cas9 (fused to three NLSs) was cloned into a bacterial expression construct and the recombinant protein was purified. The recombinant protein was then loaded with T7 RNA polymerase-transcribed sgRNAs targeting three previously validated sites. Electroporation of the Nme2Cas9:sgRNA complex induced successful editing at each of the three target sites in HEK293T cells, as detected by TIDE. See, FIG. 9C. Collectively these results indicate that Nme2Cas9 can be delivered effectively via plasmid or lentivirus, or as an RNP complex, in multiple cell types.


5. Anti-CRISPR Regulation


To date, five families of Acrs from diverse bacterial species have been shown to inhibit Nme1Cas9 in vitro and in human cells (Pawluk et al., 2016; Lee et al., 2018, submitted). Considering the high sequence identity between Nme1Cas9 and Nme2Cas9, at least some of these Acr families should inhibit Nme2Cas9. To test this, recombinant Acrs from all five families were expressed and purified, and Nme2Cas9's ability to cleave a target in vitro was tested in the presence of a member of each family (10:1 Acr:Cas9 molar ratio). An inhibitor of the type I-E CRISPR system in E. coli (AcrE2) was used as a negative control, while Nme1Cas9 was used as a positive control (Pawluk et al., 2014; Pawluk et al., 2016). As expected, all 5 families inhibited Nme1Cas9, while AcrE2 failed to do so. See, FIG. 11A, top. AcrIIC1Nme, AcrIIC2Nme, AcrIIC3Nme, and AcrIIC4Hpa completely inhibited Nme2Cas9. Strikingly, however, AcrIIC5Smu, which has been previously reported as the most potent of the Nme1Cas9 inhibitors (Lee et al., 2018), did not inhibit Nme2Cas9 in vitro even at a 10-fold molar excess. This suggests that it likely inhibits Nme1Cas9 by interacting with its PID.



FIG. 10A-B presents exemplary data showing dose dependence and segmental deletions by Nme2Cas9, as related to FIG. 9A-C. FIG. 10A shows exemplary data indicating that increasing the dose of electroporated Nme2Cas9 plasmid (500 ng, vs. 200 ng in FIG. 9A) improves editing efficiency at two sites (TS16 and TS6). Data provided in yellow are re-used from FIG. 9A. FIG. 10B shows exemplary data indicating that Nme2Cas9 can be used to create precise segmental deletions. Two TLR2.0 targets with cleavage sites 32 bp apart were targeted simultaneously with Nme2Cas9. The majority of lesions created were deletions of exactly 32 bp (blue).



FIG. 11A-C presents exemplary data showing that Nme2Cas9 is subject to inhibition by a subset of type II-C anti-CRISPR families in vitro and in cells. All experiments were done in triplicate and error bars represent s.e.m. FIG. 11A shows an exemplary in vitro cleavage assay of Nme1Cas9 and Nme2Cas9 in the presence of five previously characterized anti-CRISPR proteins (10:1 ratio of Acr:Cas9). Top: Nme1Cas9 efficiently cleaves a fragment containing a protospacer with an N4GATT PAM in the absence of an Acr or in the presence of a negative control Acr (AcrE2). All five previously characterized type II-C Acr families inhibited Nme1Cas9, as expected. Bottom: Nme2Cas9 inhibition mirrors that of Nme1Cas9, except for the lack of inhibition by AcrIIC5Smu. FIG. 11B shows exemplary genome editing in the presence of the five previously described anti-CRISPR families. Plasmids expressing Nme2Cas9 (200 ng), sgRNA (100 ng) and each respective Acr (200 ng) were co-transfected into HEK293T cells, and genome editing was measured using Tracking of Indels by Decomposition (TIDE) 72 hr post transfection. Consistent with the in vitro analyses, all type II-C anti-CRISPRs except AcrIIC5Smu inhibited genome editing, albeit with different efficiencies. FIG. 11C shows exemplary data indicating that Acr inhibition of Nme2Cas9 is dose-dependent with distinct apparent potencies. Nme2Cas9 is fully inhibited by AcrIIC1Nme and AcrIIC4Hpa at 2:1 and 1:1 mass ratios of cotransfected Acr and Nme2Cas9 plasmids, respectively.


To further test this, a Nme1Cas9/Nme2Cas9 chimera with the PID of Nme2Cas9 was tested. See, FIG. 5D and FIG. 6D. Due to the reduced activity of this hybrid, a ~30× higher concentration of Cas9 was used to achieve a similar cleavage efficiency while maintaining the 10:1 Acr:Cas9 molar ratio. No inhibition was observed by AcrIIC5Smu on this protein chimera. See, FIG. 12. These data provide further evidence that AcrIIC5Smu likely interacts with the PID of Nme1Cas9. Regardless of the mechanistic basis for the differential inhibition by AcrIIC5Smu, these results indicate that Nme2Cas9 is subject to inhibition by the other four type II-C Acr families.



FIG. 12 presents exemplary data showing that a Nme2Cas9 PID swap renders Nme1Cas9 insensitive to AcrIIC5Smu inhibition, as related to FIG. 11A-C. In vitro cleavage by the Nme1Cas9-Nme2Cas9PID chimera in the presence of previously characterized Acr proteins (10 μM Cas9-sgRNA + 100 μM Acr).


Based on the above in vitro data, it was hypothesized that AcrIIC1Nme, AcrIIC2Nme, AcrIIC3Nme, and AcrIIC4Hpa could be used as off-switches for Nme2Cas9 genome editing. To test this, Nme2Cas9/sgRNA plasmid transfections (150 ng of each plasmid) targeting TS16 were performed in HEK293T cells in the presence or absence of Acr expression plasmids, as it has been reported that most Acrs inhibited Nme1Cas9 at those plasmid ratios (Pawluk et al., 2016). As expected, AcrIIC1Nme, AcrIIC2Nme, AcrIIC3Nme and AcrIIC4Hpa inhibited Nme2Cas9 genome editing, while AcrIIC5Smu had no effect. See, FIG. 11B. Complete inhibition was observed by AcrIIC3Nme and AcrIIC4Hpa, suggesting that they have high potency against Nme2Cas9 as compared to AcrIIC1Nme and AcrIIC2Nme. To further compare the potency of AcrIIC1Nme and AcrIIC4Hpa, we repeated the experiments at various ratios of Acr plasmid to Cas9 plasmid. See, FIG. 11C. The data show that the AcrIIC4Hpa plasmid is especially potent against Nme2Cas9. Together, these data suggest that several Acr proteins can be used as off-switches for Nme2Cas9-based applications.


6. Hyper-Accuracy


Nme1Cas9 demonstrates remarkable editing fidelity in cells and mouse models (Lee et al., 2016; Amrani et al., 2018; Ibraheim et al., 2018). Furthermore, the similarity of Nme2Cas9 to Nme1Cas9 over most of its length suggests that it may likewise be hyper-accurate. However, the higher number of sites sampled in the genome as a result of the dinucleotide PAM could create more opportunities for Nme2Cas9 off-targeting in comparison with Nme1Cas9 and its less frequently encountered 4-nucleotide PAM. To assess the off-target profile of Nme2Cas9, GUIDE-seq (genome-wide, unbiased identification of double-stranded breaks enabled by sequencing) was used to identify potential off-target sites empirically and in an unbiased fashion (Tsai et al., 2014). Even the best off-target prediction algorithms are prone to false negatives necessitating empirical target site profiling methods (Bolukbasi et al., 2015b; Tsai and Joung, 2016; Tycko et al., 2016). GUIDE-seq relies on the incorporation of double-stranded oligodeoxynucleotides (dsODNs) into DNA double-stranded break sites throughout the genome. These insertion sites are then detected by amplification and high-throughput sequencing.


Because SpyCas9 is a well-characterized Cas9 ortholog, it is useful for multiplexed applications with other Cas9s, and as a benchmark for their editing properties (Jiang and Doudna, 2017; Komor et al., 2017). SpyCas9 and Nme2Cas9 were cloned into identical plasmid backbones, with the same UTRs, linkers, NLSs, and promoters, for parallel transient transfections (along with similarly matched sgRNA-expressing plasmids) into HEK293T cells. First, it was confirmed that the RNA guides for SpyCas9 and Nme2Cas9 are orthogonal, i.e., that Nme2Cas9 sgRNAs do not direct editing by SpyCas9, and vice versa. See, FIG. 13A. This was in contrast to earlier reported results with Nme1Cas9 (Esvelt et al., 2013; Fonfara et al., 2014).


Next, SpyCas9 was used as a benchmark for GUIDE-seq. Because SpyCas9 and Nme2Cas9 have non-overlapping PAMs, they can potentially edit any dual site (DS) flanked by a 5′-NGGNCC-3′ sequence, which simultaneously fulfills the PAM requirements of both Cas9s. This permits side-by-side comparisons of off-targeting with RNA guides that facilitate an edit of the exact same on-target site. See, FIG. 14A. Six (6) DSs in VEGFA were targeted, each of which also has a G at the appropriate positions 5′ of the PAM such that both SpyCas9 and Nme2Cas9 guides (driven by the U6 promoter) were 100% complementary to the target site. Seventy-two (72) hours after transfection, a TIDE analysis was performed on these sites targeted by each nuclease. Nme2Cas9 induced indels at all six sites, albeit at low efficiencies at two of them, while SpyCas9 induced indels at four of the six sites. See, FIG. 14B. At two of the four sites (DS1 and DS4) at which SpyCas9 was effective, it induced ~7-fold more indels than Nme2Cas9, while Nme2Cas9 induced a ~3-fold higher frequency of indels than SpyCas9 at DS6. Both Cas9 orthologs edited DS2 with approximately equal efficiency.
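The dual-site criterion described above can be illustrated with a short sketch. This is only an illustrative scan for the 5′-NGGNCC-3′ motif on the given strand; the function name `find_dual_sites`, the `spacer_len` parameter (20 nt, matching the 20-nt guide portion of the gRNAPlusPAM entries in Table 2), and the synthetic test sequence are assumptions for illustration and are not part of the disclosed methods.

```python
import re

# Motif stated in the text: a site flanked by 5'-NGGNCC-3' fulfills both PAMs.
# A lookahead is used so that overlapping motifs are all reported.
DUAL_PAM = re.compile(r"(?=([ACGT]GG[ACGT]CC))")

def find_dual_sites(seq, spacer_len=20):
    """Return (protospacer, flanking_motif) pairs on the given strand.

    spacer_len is a hypothetical parameter: the number of nucleotides taken
    immediately 5' of the NGGNCC motif as the candidate protospacer.
    """
    seq = seq.upper()
    hits = []
    for match in DUAL_PAM.finditer(seq):
        pam_start = match.start()
        # Skip motifs that lack enough upstream sequence for a full spacer.
        if pam_start >= spacer_len:
            hits.append((seq[pam_start - spacer_len:pam_start], match.group(1)))
    return hits

# Synthetic example sequence (not taken from VEGFA).
example = "A" * 20 + "TGGACC" + "TTT"
dual_sites = find_dual_sites(example)
```

A fuller candidate search would presumably also scan the reverse complement and apply the guide-specific requirements mentioned above (for example, the 5′ G positions); those steps are omitted here.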


For GUIDE-seq, DS2, DS4 and DS6 were selected to sample off-target cleavage with Nme2Cas9 guides that direct on-target editing as efficiently, less efficiently, or more efficiently than the corresponding SpyCas9 guides, respectively. In addition to the three dual sites, TS6 was added as it has been observed to be an efficiently edited Nme2Cas9 target site, having an approximate 30-50% indel efficiency depending on the cell type. See, FIGS. 9A and 10A. Similar data are seen with the mouse Pcsk9 and Rosa26 Nme2Cas9 sites. See, FIG. 9B.


Plasmid transfections were performed for each Cas9 along with their cognate sgRNAs and the dsODNs. Subsequently, GUIDE-seq libraries were prepared as described previously (Amrani et al., 2018). A GUIDE-seq analysis revealed efficient on-target editing for both Cas9 orthologs, with relative efficiencies (as reflected by GUIDE-seq read counts) that are similar to those observed by TIDE. See, FIG. 13B and Table 2 (Tsai et al., 2014; Zhu et al., 2017).
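The mismatch-related columns of Table 2 (guideAlignment2OffTarget, mismatch.distance2PAM, n.guide.mismatch, n.PAM.mismatch and PAM.sequence) can be related to the gRNAPlusPAM and offTarget sequence columns by a simple position-by-position comparison. The following sketch is one way to reproduce those values for individual rows; it is not necessarily the software used to generate Table 2, and the function name `compare_to_guide` and its `pam_len` parameter are hypothetical.

```python
def compare_to_guide(grna_plus_pam, off_target, pam_len=3):
    """Compare an off-target site with a guide-plus-PAM pattern.

    Assumes both strings are written 5'->3' with the PAM as the last
    pam_len characters, and that 'N' in the PAM pattern matches any base.
    """
    if len(grna_plus_pam) != len(off_target):
        raise ValueError("sequences must have the same length")
    guide, pam_pattern = grna_plus_pam[:-pam_len], grna_plus_pam[-pam_len:]
    site, pam = off_target[:-pam_len], off_target[-pam_len:]

    # '.' marks a match; a mismatch shows the base found at the off-target.
    alignment = "".join("." if g == s else s for g, s in zip(guide, site))
    # Distance 1 is the guide position immediately adjacent to the PAM.
    distances = [len(guide) - i
                 for i, (g, s) in enumerate(zip(guide, site)) if g != s]
    n_pam_mismatch = sum(1 for p, b in zip(pam_pattern, pam)
                         if p != "N" and p != b)
    return {
        "guideAlignment2OffTarget": alignment,
        "mismatch.distance2PAM": distances,
        "n.guide.mismatch": len(distances),
        "n.PAM.mismatch": n_pam_mismatch,
        "PAM.sequence": pam,
    }

# Two SpyDS2 rows of Table 2 (SEQ ID NO: 168 vs. SEQ ID NOS: 170 and 176).
row_170 = compare_to_guide("GGCAGGCGGAGGTTGTACTGNGG", "GGAAGGCGGAAGTTGTACTGAGG")
row_176 = compare_to_guide("GGCAGGCGGAGGTTGTACTGNGG", "AGGAGGCGGAGGTTGTACTGAGC")
```

For these two rows the sketch gives mismatch distances of 18, 10 and 20, 18, respectively, with the second row carrying one PAM mismatch (AGC against NGG), in agreement with the corresponding Table 2 entries.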



FIG. 13A-E presents exemplary data showing orthogonality and relative accuracy of Nme2Cas9 and SpyCas9 at dual target sites, as related to FIG. 12. FIG. 13A shows that exemplary Nme2Cas9 and SpyCas9 guides are orthogonal. TIDE results show the frequencies of indels created by both nucleases targeting DS2 with either their cognate sgRNAs or with the sgRNAs of the other ortholog. FIG. 13B shows exemplary Nme2Cas9 and SpyCas9 exhibiting comparable on-target editing efficiencies as assessed by GUIDE-seq. Bars indicate on-target read counts from GUIDE-seq at the three dual sites targeted by each ortholog. Orange bars represent Nme2Cas9 and black bars represent SpyCas9. FIG. 13C shows exemplary SpyCas9 on-target vs. off-target read counts for each site. Orange bars represent the on-target reads while black bars represent off-targets. FIG. 13D shows exemplary Nme2Cas9 on-target vs. off-target read counts for each site. FIG. 13E presents bar graphs showing exemplary indel efficiencies (measured by TIDE) at potential off-target sites predicted by CRISPRSeek. On- and off-target site sequences are shown on the left, with the PAM region underlined and sgRNA mismatches and non-consensus PAM nucleotides given in red.









TABLE 2





GUIDE-seq Data







SpyDS2 (gRNA.name SpyDS2)









offTarget
peak_score
predicted_cleavage_score





chr6:-:43748587:43748609
652
100





chr1:+:82004618:82004640
304
  4.1





chr1:-:31140567:31140589
275
 19.6





chr16:+:30357052:30357074
226
  0.6





chr5:-:33453895:33453917
217
  4





chr11:+:116600352:116600374
206
  0.4





chr17:-:46938649:46938671
191
  0.6





chr9:-:130859778:130859800
146
  5.4





chr15:+:59837681:59837703
143
  2.6





chr22:-:19135541:19135563
124
  0.3





chrX:+:49057600:49057622
122
  0.6





chr7:-:72751388:72751410
117
  2.6





chr3:-:51652045:51652067
115
  0.3





chr1:-:9544334:9544356
109
  0.7





chr3:-:47868006:47868028
 99
  2.6





chr9:+:140670069:140670091
 91
  0.4





chr2:-:149516035:149516057
 90
  0.3





chr22:-:18245713:18245735
 89
  0.2





chr3:+:154744438:154744460
 89
  2.6





chr17:-:73320669:73320691
 88
  0.7





chr1:-:38479457:38479479
 85
  2.6





chr7:+:33058792:33058814
 78
  0.3





chr9:+:108299833:108299855
 76
  1





chr1:-:23627429:23627451
 74
  0.5





chr2:-:63393272:63393294
 74
  0.5





chr16:+:71467786:71467808
 70
  0.6





chr1:-:111638773:111638795
 67
  0.3





chr1:-:213393740:213393762
 67
  0.5





chr7:+:38284425:38284447
 67
  0.3





chr7:-:134511606:134511628
 66
  0.7





chr7:+:152293366:152293388
 66
  0.7





chr17:+:60243345:60243367
 63
  0.5





chrX:-:48007735:48007757
 60
  0.6





chr1:+:52768707:52768729
 58
  5.4





chr19:-:38805324:38805346
 58
  0.3





chrX:-:41283776:41283798
 58
  2.6





chr11:-:14539718:14539740
 57
  2.6





chr6:+:32895093:32895115
 57
  0.7





chr7:-:138957343:138957365
 56
 98.6





chr3:-:63900682:63900704
 52
  0.4





chr5:-:79624954:79624976
 52
  9.6





chr7:+:76012229:76012251
 52
  0.7





chrX:+:39889198:39889220
 52
  2.6





chr4:-:99897525:99897547
 51
  5.4





chr1:-:25822709:25822731
 50
  0.7





chr5:+:17293204:17293226
 50
  0.7





chr13:-:66697991:66698013
 49
  0.1





chr5:-:80796103:80796125
 49
  2.6





chr16:+:49239128:49239150
 45
  1.9





chr3:+:69489884:69489906
 43
  0.5





chr8:+:113712655:113712677
 42
  0.3





chr2:-:24502672:24502694
 39
  2.6





chr7:-:65642349:65642371
 39
  2.6





chrX:-:135700076:135700098
 37
  2.6





chr1:-:99795756:99795778
 36
  6.2





chr19:+:1821377:1821399
 36
  0.2





chr4:-:75501534:75501556
 36
  0.3





chr18:+:74828740:74828762
 34
  0.3





chrX:+:133975784:133975806
 34
  6.2





chr14:+:55717904:55717926
 33
 98.6





chr13:+:49522615:49522637
 32
  0.3





chr3:-:77788415:77788437
 32
  0.7





chr11:-:48230825:48230847
 31
  6.2





chr1:-:1280441:1280463
 30
  0.3





chr7:+:44602379:44602401
 30
  5.4





chr12:-:108166294:108166316
 29
  5.4





chr7:-:111929850:111929872
 29
  4





chr12:-:122404237:122404259
 27
  0.2





chr12:-:79123453:79123475
 27
  0.7





chr22:-:46412541:46412563
 27
  6.2





chr5:+:93889070:93889092
 26
  0.3





chr10:-:97776548:97776570
 25
  0.6





chr2:-:56533335:56533357
 24
 98.6





chr3:+:149843401:149843423
 24
  0.1





chr1:-:232769157:232769179
 23
  2.6





chr15:-:75100050:75100072
 21
  2.6





chr18:+:37252965:37252987
 21
  0.6





chr2:-:44506208:44506230
 21
  7.6





chr4:+:182389352:182389374
 21
  0.6





chr11:+:9360929:9360951
 20
 98.6





chr12:+:23638452:23638474
 19
  0.4





chr7:-:66498753:66498775
 19
  1.4





chr13:+:32055862:32055884
 16
  6.2





chr15:-:59331986:59332008
 16
  6.2





chr2:+:126196868:126196890
 16
  0.7





chrX:-:77359566:77359588
 16
  0





chrX:+:24652788:24652810
 16
  6.2





chr17:-:17667857:17667879
 15
  0.4





chr21:+:34751155:34751177
 15
  2.6





chr2:-:48734975:48734997
 14
  5.4





chr1:-:69755048:69755070
 13
  2.6





chr16:+:90013282:90013304
 13
  1.1





chr18:-:630757:630779
 13
  5.4





chr3:-:163905630:163905652
 12
  0.6













SEQ ID NOS:
gRNAPlusPAM
SEQ ID NOS:
offTarget sequence





168
GGCAGGCGGAGGTTGTACTGNGG
169
GGCAGGCGGAGGTTGTACTGGGG





168
GGCAGGCGGAGGTTGTACTGNGG
170
GGAAGGCGGAAGTTGTACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
171
GGCAGGCGGAGGTTGTAGTGGGG





168
GGCAGGCGGAGGTTGTACTGNGG
172
AGGAGGCGGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
173
GGGAGGTGGAGGTTGTACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
174
GGCAGGGGGAAGCTGTACTGTGG





168
GGCAGGCGGAGGTTGTACTGNGG
175
AGGAGGCGGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
176
AGGAGGCGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
177
GGGAGGCGGAGGTTGTAATGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
178
GGCAAGAGGAGGTTGGACTGGGG





168
GGCAGGCGGAGGTTGTACTGNGG
179
AGGAGGCGGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
180
GGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
181
AGGAAGCGGAGGTTGTAATGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
182
AGGAGGCGGAGGTTGTAATGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
183
GGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
184
TCCAGGTGGAGGCTGTACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
185
AGGAGGCAGAGGTTGCACTGGGG





168
GGCAGGCGGAGGTTGTACTGNGG
186
GGGAGGCGGAGGATGTAATGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
187
CACAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
188
AGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
189
GGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
190
AGGAGGCAGAGGTTGAACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
191
GGCAAGGGGAAGTTGTACTGTGG





168
GGCAGGCGGAGGTTGTACTGNGG
192
GGGAGGCAGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
193
GAGAGGCGGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
194
AGGAGGCGGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
195
AGGAGGCAGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
196
GGGAGGCAGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
197
CAGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
198
AGCAGGTAGAGGTTGGACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
199
AGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
200
GGGAGGCAGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
201
AGGAGGCGGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
202
TGGAGGCGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
203
AGGAGGCAGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
204
GGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
205
AGGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
206
AGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
207
GGGAGGCGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
208
GGGAGGTGGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
209
GGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
210
AGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
211
GGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
212
AGGAGGCGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
213
AGAAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
214
AGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
215
AGGAGGCGGAGGCTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
216
GGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
217
GGGAGGCGGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
218
GGGAGGCAGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
219
AGGAGGCAGAGGTTGTAATGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
220
GGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
221
GGAAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
222
AGGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
223
GGGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
224
GGAAGGTGAAGGCTGTACTGCGG





168
GGCAGGCGGAGGTTGTACTGNGG
225
AGAAGGCAGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
226
AGTAGGCAGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
227
GGGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
228
GGGAGGCGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
229
AGGAGGCAGAGGTTGTAATGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
230
AGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
231
GGGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
232
GCCAGGCGGGTGCTGTACTGGGG





168
GGCAGGCGGAGGTTGTACTGNGG
233
AGGAGGCGGAGGTTGTACTGGGC





168
GGCAGGCGGAGGTTGTACTGNGG
234
TGGAGGCGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
235
GGGAGGTGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
236
AGGAGGTGGAGGTTGTAATGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
237
AGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
238
GGGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
239
AGGAGGCAGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
240
GGGAGGCAGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
241
GGGAGGCGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
242
GGTAGGCAAAGGTTGTACCAGGG





168
GGCAGGCGGAGGTTGTACTGNGG
243
GGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
244
AGGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
245
GGGAGGCAGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
246
GGGAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
247
GGGAGGCAGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
248
GGGAGGCGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
249
GGGAGGTGGAGGTTGCACTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
250
CAGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
251
GGGAGGCAGAGGTTGTACTGAGT





168
GGCAGGCGGAGGTTGTACTGNGG
252
GGGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
253
AGAAGGCGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
254
CGTCTGCGAGGGTACTAGTGAGA





168
GGCAGGCGGAGGTTGTACTGNGG
255
GGGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
256
GGGAGACGGAGGTTGTAGTGAGG





168
GGCAGGCGGAGGTTGTACTGNGG
257
AGGAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
258
AGGAGGCGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
259
AGAAGGCAGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
260
GCCAGGCTGAGGATGTACTGTGG





168
GGCAGGCGGAGGTTGTACTGNGG
261
AGGAGGCGGAGGTTGTACTGAGC





168
GGCAGGCGGAGGTTGTACTGNGG
262
GGGAGGCAGAGGTTGTAGTGAGG














guideAlignment2OffTarget
SEQ ID NOS:
offTarget Strand
mismatch.distance2PAM





....................

-






..A.......A.........

+
18, 10





.................G..

-
3





A.G............C....

+
20, 18, 5





..G...T.............

-
18, 14





......G...A.C.......

+
14, 10, 8





A.G............C....

-
20, 18, 5





A.G.................

-
20, 18





..G..............A..

+
18, 3





....A.A........G....

-
16, 14, 5





A.G............C....

+
20, 18, 5





..G..............G..

-
18, 3





A.G.A............A..
263
-
20, 18, 16, 3





A.G..............A..

-
20, 18, 3





..G..............G..

-
18, 3





TC....T.....C.......
264
+
20, 19, 14, 8





A.G....A.......C....
265
-
20, 18, 13, 5





..G.........A....A..

-
18, 8, 3





CA.....A............

+
20, 19, 13





A.G..............G..

-
20, 18, 3





..G..............G..

-
18, 3





A.G....A.......A....
266
+
20, 18, 13, 5





....A.G...A.........

+
16, 14, 10





..G....A.......C....

-
18, 13, 5





.AG............C....

-
19, 18, 5





A.G............C....

+
20, 18, 5





A.G....A.......C....
267
-
20, 18, 13, 5





..G....A.......C....

-
18, 13, 5





CAG..............G..
268
+
20, 19, 18, 3





A.....TA.......G....
269
-
20, 14, 13, 5





A.G..............G..

+
20, 18, 3





..G....A.......C....

+
18, 13, 5





A.G............C....

-
20, 18, 5





T.G.................

+
20, 18





A.G....A.......C....
270
-
20, 18, 13, 5





..G..............G..

-
18, 3





A.G....A............

-
20, 18, 13





A.G..............G..

+
20, 18, 3





..G.................

-
18





..G...T........C....

-
18, 14, 5





..G..............G..

-
18, 3





A.G..............G..

+
20, 18, 3





..G..............G..

+
18, 3





A.G.................

-
20, 18





A.A..............G..

-
20, 18, 3





A.G..............G..

+
20, 18, 3





A.G.........C..C....
271
-
20, 18, 8, 5





..G..............G..

-
18, 3





..G............C....

+
18, 5





..G....A.......C....

+
18, 13, 5





A.G....A.........A..
272
+
20, 18, 13, 3





..G..............G..

-
18, 3





..A..............G..

-
18, 3





A.G................A

-
20, 18, 13





..G....A............

-
18, 13





..A...T.A...C.......
273
+
18, 14, 12, 8





A.A....A.......C....
274
-
20, 18, 13, 5





A.T....A.......C....
275
+
20, 18, 13, 5





..G....A............

+
18, 13





..G.................

+
18





A.G....A.........A..
276
+
20, 18, 13, 3





A.G..............G..

-
20, 18, 3





..G....A............

-
18, 13





.C.......GT.C.......
277
-
19, 11, 10, 8





A.G.................

+
20, 18





T.G.................

-
20, 18





..G...T.............

-
18, 14





A.G...T..........A..
278
-
20, 18, 14, 3





A.G..............G..

-
20, 18, 3





..G....A............

-
18, 13





A.G....A.......C....
279
+
20, 18, 13, 5





..G....A.........G..

-
18, 13, 3





..G.................

-
18





..T....AA.........CA
280
+
18, 13, 12, 2, 1





..G..............G..

-
18, 3





A.G....A............

-
20, 18, 13





..G....A.........G..

+
18, 13, 3





..G..............G..

-
18, 3





..G....A.........G..

+
18, 13, 3





..G.................

+
18





..G...T........C....

+
18, 14, 5





CAG....A............
281
-
20, 19, 18, 13





..G....A............

+
18, 13





..G....A............

-
18, 13





A.A..............G..

+
20, 18, 3





C.TCT...AG...AC..G..
282
-
20, 18, 17, 16, 12, 11, 7, 6, 3





..G....A............

+
18, 13





..G..A...........G..

-
18, 15, 3





A.G....A............

+
20, 18, 13





A.G.................

-
20, 18





A.A....A............

-
20, 18, 13





.C............T....A

+
19, 13, 8





A.G.................

-
20, 18





..G....A.........G..

-
18, 13, 3












n.PAM.mismatch
n.guide.mismatch
PAM.sequence





0
0
GGG





0
2
AGG





0
1
GGG





0
3
AGG





0
2
AGG





0
3
TGG





0
3
AGG





1
2
AGC





0
2
AGG





0
3
GGG





0
3
AGG





0
2
AGG





0
4
AGG





0
3
AGG





0
2
AGG





0
4
AGG





0
4
GGG





0
3
AGG





1
3
AGC





0
3
AGG





0
2
AGG





0
4
AGG





0
3
TGG





0
3
AGG





0
3
AGG





0
3
AGG





0
4
AGG





0
3
AGG





0
4
AGG





0
4
AGG





0
3
AGG





0
3
AGG





0
3
AGG





1
2
AGC





0
4
AGG





0
2
AGG





1
3
AGC





0
3
AGG





1
1
AGC





0
3
AGG





0
2
AGG





0
3
AGG





0
2
AGG





1
2
AGC





0
3
AGG





0
3
AGG





0
4
AGG





0
2
AGG





0
2
AGG





0
3
AGG





0
4
AGG





0
2
AGG





0
2
AGG





1
3
AGC





1
2
AGC





0
4
CGG





0
4
AGG





0
4
AGG





1
2
AGC





1
1
AGC





0
4
AGG





0
3
AGG





1
2
AGC





0
4
GGG





1
2
GGC





1
2
AGC





1
2
AGC





0
4
AGG





0
3
AGG





1
2
AGC





0
4
AGG





0
3
AGG





1
1
AGC





0
5
GGG





0
2
AGG





1
3
AGC





0
3
AGG





0
2
AGG





0
3
AGG





1
1
AGC





0
3
AGG





1
4
AGC





1
2
AGT





1
2
AGC





0
3
AGG





1
9
AGA





1
2
AGC





0
3
AGG





1
3
AGC





1
2
AGC





1
3
AGC





0
3
TGG





1
2
AGC





0
3
AGG





offTarget_Start
offTarget_End
chromosome





43748587
43748609
chr6





82004618
82004640
chr1





31140567
31140589
chr1





30357052
30357074
chr16





33453895
33453917
chr5





116600352
116600374
chr11





46938649
46938671
chr17





130859778
130859800
chr9





59837681
59837703
chr15





19135541
19135563
chr22





49057600
49057622
chrX





72751388
72751410
chr7





51652045
51652067
chr3





9544334
9544356
chr1





47868006
47868028
chr3





140670069
140670091
chr9





149516035
149516057
chr2





18245713
18245735
chr22





154744438
154744460
chr3





73320669
73320691
chr17





38479457
38479479
chr1





33058792
33058814
chr7





108299833
108299855
chr9





23627429
23627451
chr1





63393272
63393294
chr2





71467786
71467808
chr16





111638773
111638795
chr1





213393740
213393762
chr1





38284425
38284447
chr7





134511606
134511628
chr7





152293366
152293388
chr7





60243345
60243367
chr17





48007735
48007757
chrX





52768707
52768729
chr1





38805324
38805346
chr19





41283776
41283798
chrX





14539718
14539740
chr11





32895093
32895115
chr6





138957343
138957365
chr7





63900682
63900704
chr3





79624954
79624976
chr5





76012229
76012251
chr7





39889198
39889220
chrX





99897525
99897547
chr4





25822709
25822731
chr1





17293204
17293226
chr5





66697991
66698013
chr13





80796103
80796125
chr5





49239128
49239150
chr16





69489884
69489906
chr3





113712655
113712677
chr8





24502672
24502694
chr2





65642349
65642371
chr7





135700076
135700098
chrX





99795756
99795778
chr1





1821377
1821399
chr19





75501534
75501556
chr4





74828740
74828762
chr18





133975784
133975806
chrX





55717904
55717926
chr14





49522615
49522637
chr13





77788415
77788437
chr3





48230825
48230847
chr11





1280441
1280463
chr1





44602379
44602401
chr7





108166294
108166316
chr12





111929850
111929872
chr7





122404237
122404259
chr12





79123453
79123475
chr12





46412541
46412563
chr22





93889070
93889092
chr5





97776548
97776570
chr10





56533335
56533357
chr2





149843401
149843423
chr3





232769157
232769179
chr1





75100050
75100072
chr15





37252965
37252987
chr18





44506208
44506230
chr2





182389352
182389374
chr4





9360929
9360951
chr11





23638452
23638474
chr12





66498753
66498775
chr7





32055862
32055884
chr13





59331986
59332008
chr15





126196868
126196890
chr2





77359566
77359588
chrX





24652788
24652810
chrX





17667857
17667879
chr17





34751155
34751177
chr21





48734975
48734997
chr2





69755048
69755070
chr1





90013282
90013304
chr16





630757
630779
chr18





163905630
163905652
chr3





inExon
entrez_id
symbol





TRUE
7422
VEGFA






23266
ADGRL2






















6897
TARS






-







10241
CALCOCO2






114789
SLC25A25






























8468
FKBP6






23132
RAD54L2














22907
DHX30






79813
EHMT1






26122
EPC2






637
BID






4311
MME






2885
GRB2






51118
UTP11






51251
NT5C3A






83856
FSD1L














51057
WDPCP






















26750
RPS6KC1






445347
TARP






800
CALD1






























9372
ZFYVE9






90522
YIF1B














5682
PSMA1














254048
UBN2






6314
ATXN7






























55219
MACO1






















23635
SSBP2






23150
FRMD4B






114788
CSMD3






50618
ITSN2






























57455
REXO1














4155
MBP






159091
FAM122C






















1855
DVL1






















11179
ZNF277






144406
WDR66






















285600
KIAA0825






728558
ENTPD1-AS1






114800
CCDC85A






























647946
MIR924HG






6519
SLC3A1






























55253
TYW1














54778
RNF111






















9468
PCYT1B






10743
RAI1














129285
PPP1R21






















27098
CLUL1










SpyDS4 (gRNA.name SpyDS4)










predicted_cleavage_score
SEQ ID NOS:
gRNAPlusPAM





100
283
GCAGGCACCTGTGCCAACATNGG





  0.1
283
GCAGGCACCTGTGCCAACATNGG





  0
283
GCAGGCACCTGTGCCAACATNGG





  0
283
GCAGGCACCTGTGCCAACATNGG





  0
283
GCAGGCACCTGTGCCAACATNGG





  0
283
GCAGGCACCTGTGCCAACATNGG





  0
283
GCAGGCACCTGTGCCAACATNGG





  0
283
GCAGGCACCTGTGCCAACATNGG





  0
283
GCAGGCACCTGTGCCAACATNGG





  0
283
GCAGGCACCTGTGCCAACATNGG





  0
283
GCAGGCACCTGTGCCAACATNGG













SEQ ID NOS.
offTarget_sequence
SEQ ID NOS.
guideAlignment2OffTarget





284
GCAGGCACCTGTGCCAACATGGG

....................





285
ACAGGCACTGATGCCAACTTTGG
295
A.......TGA.......T.





286
TAATGCCCTGGAGCCTCCCTGGC
296
TA.T..C.TG.A...TC.C.





287
GCAGGGCGCGCCGAGAGCAGCGG
297
.....GCG.GCC.AG.G..G





288
CCAGCCACCCAGCCCCTCCTCCC
298
C...C....CAGC..CT.C.





289
GTAAGCATATGATAGTCCATTTT
299
.T.A...TA..ATAGTC...





290
CCGCGTCCCTGCGCAAACCCAGG
300
C.GC.TC....C..A...CC





291
GTGCACCCCTGCTCCTACCCCCC
301
.TGCA.C....CT..T..CC





292
CCAGGGAGCAATGGCAGCGCGCC
302
C....G.G.AA..G..G.GC





293
GGCGGAAGTTGTACTGAGGTGAG
303
.GC..A.GT...A.TG.GG.





294
GCAGGAACTGGAGTGCACAGGTG
304
.....A..TG.A.TGC...G












offTarget Strand
mismatch.distance2PAM
n.guide.mismatch







 0





+
20, 12, 11, 10, 2
 5





+
20, 19, 17, 14, 12, 11, 9, 5, 4, 2
10





+
15, 14, 13, 11, 10, 9, 7, 6, 4, 1
10





-
20, 16, 11, 10, 9, 8, 5, 4, 2
 9





+
19, 17, 13, 12, 9, 8, 7, 6, 5, 4
10





-
20, 18, 17, 15, 14, 9, 6, 2, 1
 9





+
19, 18, 17, 16, 14, 9, 8, 5, 2, 1
10





+
20, 15, 13, 11, 10, 7, 4, 2, 1
 9





+
19, 18, 15, 13, 12, 8, 6, 5, 3, 2
10





-
15, 12, 11, 9, 7, 6, 5, 1
 8












PAM.sequence
offTarget_Start
offTarget_End





GGG
43748848
43748870





TGG
41551021
41551043





GGC
43748564
43748586





CGG
77359654
77359676





CCC
43741999
43742021





TTT
68132445
68132467





AGG
77359345
77359367





CCC
22774978
22775000





GCC
77359596
77359618





GAG
82004622
82004644





GTG
80003891
80003913













chromosome
inExon
entrez_id
symbol





chr6
NA
7422
VEGFA





chr22
TRUE
2033
EP300





chr6
TRUE
7422
VEGFA





chrX
TRUE
5230
PGK1





chr6

7422
VEGFA





chr15








chrX








chr6








chrX








chr11

23266
ADGRL2





chr12

5074
PAWR










SpyDS6 (gRNA.name SpyDS6)










offTarget
peak_score
predicted_cleavage_score





chr6:+:80816457:80816479
699
  0.2





chr6:-:22774975:22774997
553
  1.4





chr6:-:43742023:43742045
458
100





chr7:-:124498153:124498175
449
  0.2





chr1:-:79194307:79194329
386
  0.2





chr17:+:77835740:77835762
383
  5.2





chr19:+:15313634:15313656
382
  0.7





chr12:+:96650610:96650632
374
  3.7





chr10:-:79681895:79681917
352
  1.5





chr6:+:20250488:20250510
338
  0.2





chr13:-:49117083:49117105
334
  0.1





chr12:-:80003893:80003915
330
  0.1





chr17:-:77543039:77543061
302
  1.6





chr8:-:65972642:65972664
299
  0.1





chr20:-:35488683:35488705
277
  2.4





chr11:+:100275645:100275667
271
  1.6





chr22:+:38338356:38338378
268
  0.4





chr13:+:45356854:45356876
255
  0.2





chr20:-:31061319:31061341
231
  1.9





chr11:-:66051111:66051133
229
  0.7





chr17:-:72637693:72637715
225
  2.4





chr11:+:128772408:128772430
198
  0.5





chr1:-:99257317:99257339
172
  0.1





chr15:-:39243269:39243291
171
  0.3





chr14:-:22258408:22258430
170
  0.2





chr21:-:42506703:42506725
166
  2.1





chr7:-:150036050:150036072
163
  0.2





chr7:-:1140569:1140591
162
  1.5





chr4:+:40239842:40239864
154
  0.4





chr22:-:50743552:50743574
151
  0.9





chr2:-:241904500:241904522
149
  3.1





chr9:-:136776149:136776171
146
  1





chr8:+:22487688:22487710
145
  0.3





chr1:-:110032844:110032866
144
  4.6





chr1:-:182626625:182626647
133
  0.9





chr5:-:134908150:134908172
127
  0.1





chr20:-:61928182:61928204
123
  9.1





chr10:-:88042752:88042774
120
  0.4





chr17:-:6131626:6131648
118
  0.2





chr4:-:1002743:1002765
117
  0.2





chr22:+:19106203:19106225
115
  0.2





chr1:+:44003969:44003991
114
  1.5





chr1:-:114792469:114792491
110
  0.4





chr19:+:38997988:38998010
110
  1.6





chr2:-:46897354:46897376
109
  0.1





chr12:+:121011672:121011694
108
  0.1





chr17:-:75891020:75891042
105
  0.6





chr9:+:139220931:139220953
 98
  5.7





chr14:+:24168625:24168647
 97
  0.1





chr15:-:74949775:74949797
 92
  0.1





chr19:+:44199443:44199465
 86
  0.4





chr12:+:75214528:75214550
 85
  0.3





chr17:-:46058760:46058782
 82
  0.2





chr16:-:90077745:90077767
 80
  1.3





chr20:+:62023611:62023633
 79
  2.1





chr12:+:121013758:121013780
 77
  0.1





chrX:+:106755923:106755945
 75
  1





chr10:+:44417540:44417562
 73
  0.3





chr11:-:118193407:118193429
 73
  1.4





chr16:-:13411476:13411498
 73
  0.2





chr4:-:8206405:8206427
 73
  0.6





chr16:+:1517259:1517281
 71
  0.1





chr1:-:150849202:150849224
 69
  0





chr19:-:2057711:2057733
 69
  1.1





chr9:-:136075308:136075330
 69
  0.1





chr12:+:29935821:29935843
 67
  0.1





chr11:-:70812278:70812300
 66
  0.2





chr13:-:89703965:89703987
 62
  2.3





chr1:+:110166721:110166743
 60
  0.6





chr11:-:114079332:114079354
 58
  0.2





chr10:-:71813737:71813759
 57
  0.4





chr19:-:17414518:17414540
 56
  0.3





chr3:-:184289395:184289417
 56
  0.3





chr14:-:94566714:94566736
 55
  0.2





chr5:+:178665449:178665471
 55
  0.1





chr5:+:149568491:149568513
 54
  0.3





chr11:-:70242709:70242731
 52
  0.1





chr21:-:45132035:45132057
 52
  0.1





chr17:-:827977:827999
 47
  0.2





chr18:-:35056448:35056470
 44
  0





chr6:+:12990616:12990638
 44
  0.2





chr8:-:17955569:17955591
 44
  0.1





chr1:-:148932239:148932261
 43
  0.3





chr19:-:32734619:32734641
 42
  0.2





chr1:+:228330887:228330909
 41
  0.1





chr3:+:140221489:140221511
 41
  3.3





chr5:-:139938346:139938368
 40
  0.2





chr22:+:23744878:23744900
 39
  1.6





chr10:-:16388635:16388657
 38
  0.1





chr17:+:34824953:34824975
 35
  0.1





chr3:-:129656921:129656943
 35
  0.4





chr14:-:93351573:93351595
 34
  0





chr1:+:33169255:33169277
 33
  0.1





chr18:+:29253123:29253145
 33
  0





chr6:+:20984444:20984466
 33
  0





chr10:-:77256682:77256704
 32
  2.5





chr15:+:89196634:89196656
 32
  1.7





chr18:-:73391522:73391544
 32
  0





chr10:-:72512814:72512836
 31
  0





chr8:+:80980935:80980957
 31
  0.1





chr11:+:37704008:37704030
 30
  0.1





chr12:+:52539310:52539332
 29
  1.7





chr14:-:56431001:56431023
 27
  0.3





chr15:-:66949803:66949825
 26
  1.6





chr7:+:100879116:100879138
 26
 98.6





chr11:-:94784149:94784171
 25
  0.2





chr12:+:111548733:111548755
 25
  0.8





chr19:+:2212199:2212221
 25
  0.2





chr13:-:22824517:22824539
 22
  0.1





chr13:-:84196623:84196645
 19
  0.2





chr15:+:96534271:96534293
 19
  0.1





chr21:-:21304105:21304127
 17
  0.2





chr17:-:39705337:39705359
 16
  0.2





chr20:-:56582544:56582566
 15
  0.9





chr20:+:49479068:49479090
 15
  0.1





chr1:+:89258185:89258207
 14
  0.1





chr15:-:51386687:51386709
 13
  0.1





chr19:+:38724286:38724308
 13
  0.3





chr16:-:2286384:2286406
 11
  0.2













SEQ ID NOS.
gRNAPlusPAM
SEQ ID NOS.
offTarget_sequence





305
GGGCAGGGGCTGGGGTGCACNGG
306
CGGCAGGGGCTGAGGGGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
307
GGGTAGGAGCAGGGGTGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
308
GGGCAGGGGCTGGGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
309
GGGCAGGAACTGGAGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
310
GGGCAGGAACTGGAGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
311
CAGCAGGGGCTGGGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
312
GGGAAGGGCCTGGGGTACACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
313
GGGCCGGGGCAGGGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
314
AGACAGGGGCCGGGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
315
GGGCAGGAACTGGAGTGCACCGG





305
GGGCAGGGGCTGGGGTGCACNGG
316
AGGCAGGAACTGGAGTGCACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
317
AGGCAGGAACTGGAGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
318
AGGAAGGGACTGGGGTGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
319
AGGCAGGAACTGGAGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
320
AGGTGGGGGCTGGGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
321
AGGCAGGAACTGGGGTGCACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
322
TGGCAGGGGCAGGGGTGAACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
323
GGGCAGGAACTGGAGTGCACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
324
GGCCAGGGGCTGGGGAGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
325
GGGCAGGGCTGGGGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
326
TGGGTGGGGCTGGGGTGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
327
GGGAGGGGGCTGGGGAGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
328
GGGCAGGAACTGGAGTACACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
329
GAGAAGGAGCTGGGGAGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
330
GGGCAGGAACTGGAGTGCACCAG





305
GGGCAGGGGCTGGGGTGCACNGG
331
GGGCAAGGGCAGGGGTGCACCAG





305
GGGCAGGGGCTGGGGTGCACNGG
332
AAGAAGGGGCAAGGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
333
GGCCAGGAGCAGGGGTGCACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
334
TGGCAGCGGCTGGGGAGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
335
GGGCGTGGGCAGGGGTGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
336
GGGCAGTGGCTGGGGTGCATTGG





305
GGGCAGGGGCTGGGGTGCACNGG
337
GGCCAGGAGCTGGGGTGCTCAGG





305
GGGCAGGGGCTGGGGTGCACNGG
338
CCTCAGGGGCTGGGGTGAACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
339
TGGCAGGGTCTGGGGTGCACAGA





305
GGGCAGGGGCTGGGGTGCACNGG
340
GAGCAGGGTCTGGGGTGCATGGG





305
GGGCAGGGGCTGGGGTGCACNGG
341
GAGCAGGGACTGAGGGGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
342
GAGCAGGGGCTGGGGGGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
343
TGGCAGGGGTAAGGGTGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
345
AGACAGAGGCTGGAGTGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
346
AGGCAGGGGCTGGAGTTCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
347
AGGAAGGGACCAGGGTGCACCAG





305
GGGCAGGGGCTGGGGTGCACNGG
348
GGCCAGGAGCAGGGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
349
GGGCCGGGGCTGGGGTGCCAGGG





305
GGGCAGGGGCTGGGGTGCACNGG
350
GGGCGGGGGCTGGGGAGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
351
AGGCAGGAGCCAGGGTGCAGAGG





305
GGGCAGGGGCTGGGGTGCACNGG
352
GGGCAGAGGCTGGAGTGCCCAGG





305
GGGCAGGGGCTGGGGTGCACNGG
353
GGACAGGGGCAGGGGTGCCCGGG





305
GGGCAGGGGCTGGGGTGCACNGG
354
AGGGAGGGGCTGGGGTGCACGGA





305
GGGCAGGGGCTGGGGTGCACNGG
355
GGGCAGGAACTGGAGTGCATAGG





305
GGGCAGGGGCTGGGGTGCACNGG
356
AGGCAGGAACTGGAGTGCACAAG





305
GGGCAGGGGCTGGGGTGCACNGG
357
GGGCAGAGGCTAGGGTGCAGTGG





305
GGGCAGGGGCTGGGGTGCACNGG
358
AGGTAGGGGTTGGGGGGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
359
GGGCAGAAGCAGGGGTGCTCAGG





305
GGGCAGGGGCTGGGGTGCACNGG
360
GGGGAGGGGTGGGGGTGCACCGG





305
GGGCAGGGGCTGGGGTGCACNGG
361
GAGCAGGGGCTGGGGGGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
362
GGGCAGAGGCTGGAGTGCCCAGG





305
GGGCAGGGGCTGGGGTGCACNGG
363
GGGTGGGGGCTGGGGTGCCCAGG





305
GGGCAGGGGCTGGGGTGCACNGG
364
GGGCAAGGGCAGGGGTGCCCTGG





305
GGGCAGGGGCTGGGGTGCACNGG
365
GAGAGGGAGCTGGGGTGCACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
366
AGGCAGGGACTGAGGTGCATAGG





305
GGGCAGGGGCTGGGGTGCACNGG
367
GGGCCAGGGCTGAGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
368
TGGGAGGGGCTAGAGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
369
CCGCAGGGGCTGGGATGCTGGGG





305
GGGCAGGGGCTGGGGTGCACNGG
371
GAGGAGGGGCTGGGGTGCCCTGG





305
GGGCAGGGGCTGGGGTGCACNGG
372
GGGCAAAGGCCGGGGTGCCCAGG





305
GGGCAGGGGCTGGGGTGCACNGG
373
AGGCGGGGGCTGGGGGGCTCGGG





305
GGGCAGGGGCTGGGGTGCACNGG
374
AGGCAGGGGCCAGGGTCCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
376
GGGTTGGGGTTGGGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
377
AGGCAGGGGCCGGGGTGCGCAGG





305
GGGCAGGGGCTGGGGTGCACNGG
378
GGGCACAGACTGGGGTGCATTGG





305
GGGCAGGGGCTGGGGTGCACNGG
379
GGGCTGGGGCTGAGGTGCGCCGG





305
GGGCAGGGGCTGGGGTGCACNGG
380
AGGCAGGGGCTGGGGGGCAAGGG





305
GGGCAGGGGCTGGGGTGCACNGG
381
GAGCGGGAGCTGGGGGGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
382
GGGCAGGGACTGGGGTGCTTAGG





305
GGGCAGGGGCTGGGGTGCACNGG
383
GGGAAGGGGCTGGAGGGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
384
GGGCAGGGGAAGGGGTGGACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
385
AGACAGGGGCTGGAGTGCAGTGG





305
GGGCAGGGGCTGGGGTGCACNGG
386
GGGCAGAGGCTGGAGTGCAATGG





305
GGGCAGGGGCTGGGGTGCACNGG
387
GGGCTGGGGCTGGGGAGCAGGGG





305
GGGCAGGGGCTGGGGTGCACNGG
388
AGGAAAGGGCTGGAGTGCAGGGG





305
GGGCAGGGGCTGGGGTGCACNGG
389
GGGCAGGAACTGGAGTGCACCAG





305
GGGCAGGGGCTGGGGTGCACNGG
390
AGGCAGGAACTGGAGTGCACAAG





305
GGGCAGGGGCTGGGGTGCACNGG
391
AGGCAGAGCCTGGGGTGCAGGGG





305
GGGCAGGGGCTGGGGTGCACNGG
392
GGGCAGGGCCAGGGGAGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
393
AGCCAGGGGCTGGGGGGAACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
394
GGGCAGGGGATGGGGTGCAGTGG





305
GGGCAGGGGCTGGGGTGCACNGG
395
AGGCAAGGCCTGGGGTGCCCAGG





305
GGGCAGGGGCTGGGGTGCACNGG
396
GGGCTGGGGCTGGGGAGCACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
397
GAGAAGGGGCTGGGAAGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
398
AGGCAGGAACTGGAGTGCACAAG





305
GGGCAGGGGCTGGGGTGCACNGG
399
GGGGAGGGGCTGGGGTGCCAGGG





305
GGGCAGGGGCTGGGGTGCACNGG
400
AGGAAGGGGCTGGGGAAAACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
401
CCCCAGGGGCTGGGGTGCCTGGG





305
GGGCAGGGGCTGGGGTGCACNGG
402
AAGCAGAGGCTGAAGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
403
AGGCAGGAACTAGAGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
404
GGGCAGGGGGTGGGGTCCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
406
GGGGAGGGGCTGGGGAGCACGGA





305
GGGCAGGGGCTGGGGTGCACNGG
407
AGGCAGAGGCTGGAGTGGACCGG





305
GGGCAGGGGCTGGGGTGCACNGG
408
GGGTAGGGGCTGGGGGATACCGG





305
GGGCAGGGGCTGGGGTGCACNGG
409
GGGAAGGGTCTGGAGTCCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
410
GGGCAGGAACTAGAGTGCACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
411
GGGCAGGGACTGGGGTGCTCTGG





305
GGGCAGGGGCTGGGGTGCACNGG
412
GAGTAGGGGCAGGGGTGCTCTGG





305
GGGCAGGGGCTGGGGTGCACNGG
413
AGGAAGGGCCTGGGGTGCACAGA





305
GGGCAGGGGCTGGGGTGCACNGG
414
GGCCAGGGGCTGGGGTGCACGGT





305
GGGCAGGGGCTGGGGTGCACNGG
415
AGGCAGGGGCCAGGGTGCATGGG





305
GGGCAGGGGCTGGGGTGCACNGG
416
GGGCAGAGGATGGGGTGCAGGGG





305
GGGCAGGGGCTGGGGTGCACNGG
417
CGGCAGGGGCTGGAGTGCAGTGG





305
GGGCAGGGGCTGGGGTGCACNGG
418
AGGCAGGATCTGGAGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
419
AGACAGGAGCTGGAGTGCACAAG





305
GGGCAGGGGCTGGGGTGCACNGG
420
TGGCAGGGGCAGGGATGCTCTGG





305
GGGCAGGGGCTGGGGTGCACNGG
421
CCTCAGGGGTTGGGATGCACTGG





305
GGGCAGGGGCTGGGGTGCACNGG
422
GAGCAGGGTCAGGGGTGCAGAGG





305
GGGCAGGGGCTGGGGTGCACNGG
423
CAGGAGTGGCTGGGGTGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
424
GGGCCTGGGCTGAGATGCACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
425
ACGCAGGGGCTAGGGAGCACAAG





305
GGGCAGGGGCTGGGGTGCACNGG
426
GGGCTGGGGCTGGGGAGGACGGG





305
GGGCAGGGGCTGGGGTGCACNGG
427
GGGCAGGGATTGGGGGGCACAGG





305
GGGCAGGGGCTGGGGTGCACNGG
428
AGGGAGGGGCCGGGCTGCACTGG













SEQ ID NOS.
guideAlignment2OffTarget
offTargetStrand
mismatch.distance2PAM






C...........A..G....
+
20, 8, 5






...T...A..A.........
-
17, 13, 10






....................
-







.............AA....A
-
13, 12, 7






.............AA....A
-
13, 12, 7






CA..................
+
20, 19






...A....C.......A...
+
17, 12, 4






....C.....A.........
+
16, 10






A.A.......C.........
-
20, 18, 10






.......AA....A......
+
13, 12, 7





429
A......AA....A......
-
20, 13, 12, 7





430
A......AA....A......
-
20, 13, 12, 7






A..A....A...........
-
20, 17, 12





431
A......AA....A......
-
20, 13, 12, 7






A..TG...............
-
20, 17, 16






A......AA...........
+
20, 13, 12






T.........A......A..
+
20, 10, 3






.......AA....A......
+
13, 12, 7






..C...............A.
-
18, 5






........CTG.........
-
12, 11, 10






T..GT...............
-
20, 17, 16






...AG..........A....
+
17, 16, 5





432
.......AA....A..A...
-
13, 12, 7, 4





433
.A.A...A.......A....
-
19, 17, 13, 5






.......AA....A......
-
13, 12, 7






.....A....A.........
-
15, 10





434
AA.A......AA........
-
20, 19, 17, 10, 9






..C....A..A.........
-
18, 13, 10






T.....C........A....
+
20, 14, 5






....GT....A.........
-
16, 15, 10






......T............T
-
14, 1






..C....A..........T.
-
18, 13, 2





435
CCT..............A..
+
20, 19, 18, 3






T.......T...........
-
20, 12






.A......T..........T
-
19, 12, 1





436
.A......A...A..G....
-
19, 12, 8, 5






.A.............G....
-
19, 5





437
T........TAA........
-
20, 11, 10, 9





438
A.A...A......A......
-
20, 18, 14, 7






A............A..T...
-
20, 7, 4





439
A..A....A.CA........
+
20, 17, 12, 10, 9






..C....A..A.........
+
18, 13, 10






....C.............CA
-
16, 2, 1






....G..........A....
+
16, 5





440
A......A..CA.......G
-
20, 13, 10, 9, 1






......A......A....C.
+
14, 7, 2






..A.......A.......C.
-
18, 10, 2






A..G................
+
20, 17





441
.......AA....A.....T
+
13, 12, 7, 1





442
A......AA....A......
-
20, 13, 12, 7






......A....A.......G
+
14, 9, 1





443
A..T.....T.....G....
+
20, 17, 11, 5





444
......AA..A.......T.
-
14, 13, 10, 2






...G.....TG.........
-
17, 11, 10






.A.............G....
+
19, 5






......A......A....C.
+
14, 7, 2






...TG.............C.
+
17, 16, 2






.....A....A.......C.
+
15, 10, 2





445
.A.AG..A............
-
19, 17, 16, 13





446
A.......A...A......T
-
20, 12, 8, 1






....CA......A.......
-
16, 15, 8





447
T..G.......A.A......
+
20, 17, 9, 7






CC............A...TG
-
20, 19, 6, 2, 1






.A.G..............C.
-
19, 17, 2





448
.....AA...C.......C.
-
15, 14, 10, 2





449
A...G..........G..T.
+
20, 16, 5, 2





450
A.........CA....C...
-
20, 10, 9, 4






...TT....T..........
-
17, 16, 11






A.........C.......G.
+
20, 10, 2





451
.....CA.A..........T
-
15, 14, 12, 1






....T.......A.....G.
-
16, 8, 2






A..............G...A
-
20, 5, 1





452
.A..G..A.......G....
-
19, 16, 13, 5






........A.........TT
-
12, 2, 1






...A.........A.G....
+
17, 7, 5






.........AA......G..
+
11, 10, 3





453
A.A..........A.....G
-
20, 18, 7, 1






......A......A.....A
-
14, 7, 1






....T..........A...G
-
16, 5, 1





454
A..A.A.......A.....G
-
20, 17, 15, 7, 1






.......AA....A......
+
13, 12, 7





455
A......AA....A......
-
20, 13, 12, 7





456
A.....A.C..........G
-
20, 14, 12, 1






........C.A....A....
-
12, 10, 5





457
A.C............G.A..
+
20, 18, 5, 3






.........A.........G
+
11, 1





458
A....A..C.........C.
-
20, 15, 12, 2






....T..........A....
+
16, 5





459
.A.A..........AA....
-
19, 17, 6, 5





460
A......AA....A......
+
20, 13, 12, 7






...G..............CA
-
17, 2, 1





461
A..A...........AAA..
-
20, 17, 5, 4, 3





462
CCC...............CT
+
20, 19, 18, 2, 1





463
AA....A.....AA......
+
20, 19, 14, 8, 7





464
A......AA..A.A......
+
20, 13, 12, 9, 7






.........G......C...
-
11, 4






...G...........A....
+
17, 5





465
A.....A......A...G..
-
20, 14, 7, 3





466
...T...........GAT..
-
17, 5, 4, 3





467
...A....T....A..C...
+
17, 12, 7, 4





468
.......AA..A.A......
+
13, 12, 9, 7






........A.........T.
+
12, 2





469
.A.T......A.......T.
-
19, 17, 10, 2






A..A....C...........
-
20, 17, 12






..C.................
+
18





470
A........CA........T
-
20, 10, 9, 1






......A..A.........G
+
14, 11, 1






C............A.....G
+
20, 7, 1





471
A......AT....A......
-
20, 13, 12, 7





472
A.A....A.....A......
-
20, 18, 13, 7





473
T.........A...A...T.
+
20, 10, 6, 2





474
CCT......T....A.....
-
20, 19, 18, 11, 6





475
.A......T.A........G
-
19, 12, 10, 1





476
CA.G..T.............
-
20, 19, 17, 14





477
....CT......A.A.....
+
16, 15, 8, 6





478
AC.........A...A....
+
20, 19, 9, 5






....T..........A.G..
-
16, 5, 3






........AT.....G....
+
12, 11, 5





479
A..G......C...C.....
-
20, 17, 10, 6












n.PAM.mismatch
n.guide.mismatch
PAM.sequence





0
3
TGG





0
3
TGG





0
0
AGG





0
3
AGG





0
3
AGG





0
2
AGG





0
3
GGG





0
2
AGG





0
3
AGG





0
3
CGG





0
4
GGG





0
4
AGG





0
3
TGG





0
4
AGG





0
3
AGG





0
3
GGG





0
3
TGG





0
3
GGG





0
9
AGG





0
3
AGG





0
3
TGG





0
3
AGG





0
4
GGG





0
4
TGG





1
3
CAG





1
2
CAG





0
5
AGG





0
3
GGG





0
3
TGG





0
3
TGG





0
2
TGG





0
3
AGG





0
4
AGG





1
2
AGA





0
3
GGG





0
4
AGG





0
2
TGG





0
4
TGG





0
4
TGG





0
3
AGG





1
5
CAG





0
3
AGG





0
3
GGG





0
2
AGG





0
5
AGG





0
3
AGG





0
3
GGG





1
2
GGA





0
4
AGG





1
4
AAG





0
3
TGG





0
4
AGG





0
4
AGG





0
3
CGG





0
2
TGG





0
3
AGG





0
3
AGG





0
3
TGG





0
4
GGG





0
4
AGG





0
3
AGG





0
4
AGG





0
5
GGG





0
3
TGG





0
4
AGG





0
4
GGG





0
4
AGG





0
3
AGG





0
3
AGG





0
4
TGG





0
3
CGG





0
3
GGG





0
4
AGG





0
3
AGG





0
3
AGG





0
3
TGG





0
4
TGG





0
3
TGG





0
3
GGG





0
5
GGG





1
3
CAG





1
4
AAG





0
4
GGG





0
3
AGG





0
4
AGG





0
2
TGG





0
4
AGG





0
2
GGG





0
4
AGG





1
4
AAG





0
3
GGG





0
5
AGG





0
5
GGG





0
5
AGG





0
5
AGG





0
2
AGG





1
2
GGA





0
4
CGG





0
4
CGG





0
4
TGG





0
4
GGG





0
2
TGG





0
4
TGG





1
3
AGA





1
1
GGT





0
4
GGG





0
3
GGG





0
3
TGG





0
4
AGG





1
4
AAG





0
4
TGG





0
5
TGG





0
4
AGG





0
4
AGG





0
4
GGG





1
4
AAG





0
3
GGG





0
3
AGG





0
4
TGG





offTarget_Start
offTarget_End
chromosome





80816457
80816479
chr6





22774975
22774997
chr6





43742023
43742045
chr6





124498153
124498175
chr7





79194307
79194329
chr1





77835740
77835762
chr17





15313634
15313656
chr19





96650610
96650632
chr12





79681895
79681917
chr10





20250488
20250510
chr6





49117083
49117105
chr13





80003893
80003915
chr12





77543039
77543061
chr17





65972642
65972664
chr8





35488683
35488705
chr20





100275645
100275667
chr11





38338356
38338378
chr22





45356854
45356876
chr13





31061319
31061341
chr20





66051111
66051133
chr11





72637693
72637715
chr17





128772408
128772430
chr11





99257317
99257339
chr1





39243269
39243291
chr15





22258408
22258430
chr14





42506703
42506725
chr21





150036050
150036072
chr7





1140569
1140591
chr7





40239842
40239864
chr4





50743552
50743574
chr22





241904500
241904522
chr2





136776149
136776171
chr9





22487688
22487710
chr8





110032844
110032866
chr1





182626625
182626647
chr1





134908150
134908172
chr5





61928182
61928204
chr20





88042752
88042774
chr10





6131626
6131648
chr17





1002743
1002765
chr4





19106203
19106225
chr22





44003969
44003991
chr1





114792469
114792491
chr1





38997988
38998010
chr19





46897354
46897376
chr2





121011672
121011694
chr12





75891020
75891042
chr17





139220931
139220953
chr9





24168625
24168647
chr14





74949775
74949797
chr15





44199443
44199465
chr19





75214528
75214550
chr12





46058760
46058782
chr17





90077745
90077767
chr16





62023611
62023633
chr20





121013758
121013780
chr12





106755923
106755945
chrX





44417540
44417562
chr10





118193407
118193429
chr11





13411476
13411498
chr16





8206405
8206427
chr4





1517259
1517281
chr16





150849202
150849224
chr1





2057711
2057733
chr19





136075308
136075330
chr9





29935821
29935843
chr12





70812278
70812300
chr11





89703965
89703987
chr13





110166721
110166743
chr1





114079332
114079354
chr11





71813737
71813759
chr10





17414518
17414540
chr19





184289395
184289417
chr3





94566714
94566736
chr14





178665449
178665471
chr5





149568491
149568513
chr5





70242709
70242731
chr11





45132035
45132057
chr21





827977
827999
chr17





35056448
35056470
chr18





12990616
12990638
chr6





17955569
17955591
chr8





148932239
148932261
chr1





32734619
32734641
chr19





228330887
228330909
chr1





140221489
140221511
chr3





139938346
139938368
chr5





23744878
23744900
chr22





16388635
16388657
chr10





34824953
34824975
chr17





129656921
129656943
chr3





93351573
93351595
chr14





33169255
33169277
chr1





29253123
29253145
chr18





20984444
20984466
chr6





77256682
77256704
chr10





89196634
89196656
chr15





73391522
73391544
chr18





72512814
72512836
chr10





80980935
80980957
chr8





37704008
37704030
chr11





52539310
52539332
chr12





56431001
56431023
chr14





66949803
66949825
chr15





100879116
100879138
chr7





94784149
94784171
chr11





111548733
111548755
chr12





2212199
2212221
chr19





22824517
22824539
chr13





84196623
84196645
chr13





96534271
96534293
chr15





21304105
21304127
chr21





39705337
39705359
chr17





56582544
56582566
chr20





49479068
49479090
chr20





89258185
89258207
chr1





51386687
51386709
chr15





38724286
38724308
chr19





2286384
2286406
chr16





TRUE
594
BCKDHB














7422
VEGFA






25913
POT1






























2004
ELK3






9231
DLG5






















5074
PAWR






















140710
SOGA1













TRUE
85377
MICALL1














140688
NOL4L





TRUE
254263
CNIH2













TRUE
3762
KCNJ5






























5919
RARRES2






84310
C7orf50






399
RHOH






23654
PLXNB2






200772
LOC200772






7410
VAV2






55909
BIN3






127002
ATXN7L2






85397
RGS8






9547
CXCL14






57642
COL20A1






2894
GRID1






















9993
DGCR2






5792
PTPRF














6261
RYR1














9921
RNF10













TRUE
26102
DKEZP434A062














80153
EDC3






















80279
CDK5RAP3






79007
DBNDD1














9921
RNF10














283033
LINC00841






















54436
SH3TC1






1186
CLCN7





TRUE
405
ARNT





















TRUE
83857
TMTC1






22941
SHANK2














271
AMPD2






7704
ZBTB16






55506
H2AFY2














2049
EPHB3






122509
IFI27L1






9509
ADAMTS2






























64359
NXN






56853
CELF4






221692
PHACTR1














645166
LOC645166














2987
GUK1






64084
CLSTN2





TRUE
10307
APBB3






















































9331
B4GALT6






54901
CDKAL1














3669
ISG20














140766
ADAMTS14






7163
TPD52






























24146
CLDN15














23316
CUX2





TRUE
84444
DOT1L














3728
JUP














55653
BCAS4






5586
PKN2






388121
TNFAIP8L3


















Nme2DS2









offTarget
peak_score
predicted_cleavage_score





chr6:-:43748582:43748613
547
100





chrX:+:77359550:77359581
 44
  0















gRNA.name
SEQ ID NOS:
gRNAPlusPAM
SEQ ID NOS:
offTarget_sequence





Nme2DS2
480
GAATGGCAGGCGGAGGTTGTACTGNNNNCCNN
481
GAATGGCAGGCGGAGGTTGTACTGGGGGCCAG





Nme2DS2
480
GAATGGCAGGCGGAGGTTGTACTGNNNNCCNN
482
AAACGGAAGCCGCACGTCTCACTAGTACCCTC













SEQ ID NO:
guideAlignment2OffTarget
offTargetStrand
mismatch.distance2PAM






........................
-






483
A..C..A..C..C.C..CTC...A
+
24, 21, 18, 15, 12, 10, 7, 6, 5, 1












n.PAM.mismatch
n.guide.mismatch
PAM.sequence





0
 0
GGGGCCAG





0
10
GTACCCTC





offTarget_Start
offTarget_End
chromosome





43748582
43748613
chr6





77359550
77359581
chrX





inExon
entrez_id
symbol





TRUE
7422
VEGFA


















Nme2DS4









offTarget
peak_score
gRNA.name





chr6:-:43748843:43748874
66
c_DeCas9_human_TS14










gRNAPlusPAM











SEQ ID NO: 486
GTGAGCAGGCACCTGTGCCAACATNNNNCCNN












offTarget_sequence
guideAlignment2OffTarget





SEQ ID NO: 487
GTGAGCAGGCACCTGTGCCAACATGGGCCCGC
........................












offTargetStrand
predicted_cleavage_score
mismatch.distance2PAM






100













n.PAM.mismatch
n.guide.mismatch
PAM.sequence













0
0
SEQ ID NO: 488
GGGCCCGC












offTarget_Start
offTarget_End
chromosome





43748843
43748874
chr6





inExon
entrez_id
symbol






7422
VEGFA










Nme2DS6









offTarget
peak_score
predicted_cleavage_score





chr6:-:43742018:43742049
483
100





chrX:-:77359465:77359496
 12
  0











gRNA.name
gRNAPlusPAM












d_DeCas9_human_TS16
SEQ ID NO: 489
GCATGGGCAGGGGCTGGGGTGCACNNNNCCNN





d_DeCas9_human_TS16
SEQ ID NO: 489
GCATGGGCAGGGGCTGGGGTGCACNNNNCCNN












offTarget_sequence
guideAlignment2OffTarget
offTargetStrand













SEQ ID NO: 490
GCATGGGCAGGGGCTGGGGTGCACAGGCCCAG
........................





SEQ ID NO: 491
GCAGGAAGCGTCGCCGGGGGGCCCACAAGGGT
SEQ ID NO: 492
...G.AAGC.TC..C....G..C.
21,19,18,17,16,14,13,10,5,2












n.PAM.mismatch
PAM.sequence
PAM.sequence














 0
SEQ ID
AGGCCCAG
SEQ ID
AATCCCTT



NO: 493

NO: 499






10
SEQ ID
ACAAGGGT
SEQ ID
ACTCCCTC



NO: 494

NO: 500












offTarget_Start
offTarget_End
chromosome





43742018
43742049
chr6





77359465
77359496
chrX





inExon
entrez_id
symbol






7422
VEGFA







VEGFA










Rosa26









offTarget
peak_score
predicted_cleavage_score





chr6:-:113076072:113076103
1175
100





chr11:-:73171296:73171327
  24
  1.4











gRNA.name
gRNAPlusPAM












Nme2Rosa
SEQ ID NO: 495
TGAGGACCGCCCTGGGCCTGGGAGNNNNCCNN





Nme2Rosa
SEQ ID NO: 495
TGAGGACCGCCCTGGGCCTGGGAGNNNNCCNN












offTarget_sequence
guideAlignment2OffTarget
offTargetStrand













SEQ ID NO: 496
TGAGGACCGCCCTGGGCCTGGGAGAATCCCTT
........................





SEQ ID NO: 497
GAAGGACCACCCTAGGCCTGGGAGACTCCCT
SEQ ID NO: 498
GA......A....A..........












mismatch.distance2PAM
n.PAM.mismatch
n.guide.mismatch






0
0





24, 23, 16, 11
0
4





PAM.sequence
offTarget_Start
offTarget_End













SEQ ID NO: 499
AATCCCTT
113076072
113076103





SEQ ID NO: 500
ACTCCCTC
73171296
73171327












chromosome
inExon
entrez_id





chr6

14910





chr11

94045










PCSK9









off-Target
peak_score
gRNA.name





chr4:-:106463720:106463751
266
Nme2PCSK9











gRNAPlusPAM
offTarget_sequence













SEQ ID NO: 501
GGCCTGGCTGATGAGGCCGCACATNNNNCCNN
SEQ ID NO: 502
GGCCTGGCTGATGAGGCCGCACATGTGGCCAC












guideAlignment2OffTarget
offTargetStrand
predicted_cleavage_score





........................

100





mismatch.distance2PAM
n.PAM.mismatch
n.guide.mismatch






0
0












PAM.sequence
offTarget_Start
offTarget_End













SEQ ID NO: 503
GTGGCCAC
106463720
106463751












chromosome
inExon
entrez_id





chr4
TRUE
100102









For off-target identification, the analysis revealed that the DS2, DS4, and DS6 SpyCas9 sgRNAs appeared to direct editing at 93, 10, and 118 candidate off-target sites, respectively, which is in the normal range of off-targets when plasmid-based SpyCas9 editing is analyzed by GUIDE-seq (Fu et al., 2014; Tsai et al., 2014). In striking contrast, the DS2, DS4, and DS6 Nme2Cas9 sgRNAs appeared to direct editing at 1, 0, and 1 off-target sites, respectively. See, FIG. 14C and Table 2. When compared to the GUIDE-seq read counts for the SpyCas9 off-targets, those of Nme2Cas9 were very low, further suggesting that Nme2Cas9 is highly specific. FIG. 13C cf. FIG. 13D. Nme2Cas9 GUIDE-seq analyses with the TS6, Pcsk9, and Rosa26 sgRNAs yielded similar results (0, 0, and 1 off-target sites, respectively, with a modest read count for the Rosa26-OT1 off-target site). See, FIG. 13C, FIG. 14D, and Table 2.
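The mismatch.distance2PAM column of Table 2 appears to number mismatched guide positions from the PAM-proximal end, so that the base adjacent to the PAM is position 1. The following is a minimal sketch of that bookkeeping, assuming both sequences are written 5′ to 3′ with the PAM removed; the function name is hypothetical and is not part of GUIDE-seq or any published tool. The example uses the SpyCas9 DS6 guide (SEQ ID NO: 305) and the first listed candidate site (SEQ ID NO: 306).

```python
def mismatch_distances_to_pam(guide: str, protospacer: str) -> list:
    """Return the distance from the PAM of every mismatched position.

    Both sequences are read 5'->3' with the PAM immediately 3' of the
    last base, so the last position has distance 1 and the first
    position has distance len(guide).
    """
    if len(guide) != len(protospacer):
        raise ValueError("guide and protospacer must have the same length")
    n = len(guide)
    return [n - i for i, (g, t) in enumerate(zip(guide, protospacer)) if g != t]


# SEQ ID NO: 305 and SEQ ID NO: 306 from Table 2, with the 3-nt PAM removed.
guide = "GGGCAGGGGCTGGGGTGCAC"
site = "CGGCAGGGGCTGAGGGGCAC"

distances = mismatch_distances_to_pam(guide, site)
print(distances)  # [20, 8, 5], matching the Table 2 entry for this site
```

Larger distances correspond to PAM-distal mismatches, which are generally better tolerated than mismatches in the PAM-proximal seed.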



FIG. 14A-E presents exemplary data showing that Nme2Cas9 exhibits little or no detectable off-targeting in mammalian cells. FIG. 14A shows an exemplary schematic depicting dual sites (DSs) targetable by both SpyCas9 and Nme2Cas9 by virtue of their non-overlapping PAMs. The Nme2Cas9 PAM (orange) and SpyCas9 PAM (blue) are highlighted. A 24 nt Nme2Cas9 guide sequence is indicated in yellow; the corresponding guide sequence for SpyCas9 would be 4 nt shorter at the 5′ end. FIG. 14B shows exemplary data in which Nme2Cas9 and SpyCas9 both induce indels at DSs. Six DSs in VEGFA (with GN3GN19NGGNCC sequences) were selected for direct comparisons of editing by the two orthologs. Plasmids expressing each Cas9 (with the same promoter, linkers, tags and NLSs) and its cognate guide were transfected into HEK293T cells. Indel efficiencies were determined by TIDE 72 hrs post-transfection. Nme2Cas9 editing was detectable at all six sites and was marginally or significantly more efficient than SpyCas9 at two sites (DS2 and DS6, respectively). SpyCas9 edited four out of the six sites (DS1, DS2, DS4 and DS6), with two sites showing significantly higher editing efficiencies than Nme2Cas9 (DS1 and DS4). DS2, DS4 and DS6 were selected for GUIDE-seq analysis as Nme2Cas9 was equally efficient, less efficient and more efficient than SpyCas9, respectively, at these sites. FIG. 14C shows exemplary Nme2Cas9 genome editing that is highly accurate in human cells. Numbers of off-target sites detected by GUIDE-seq for each nuclease at individual target sites are shown. In addition to dual sites, we analyzed TS6 (because of its high on-target editing efficiency) and Pcsk9 and Rosa26 sites in mouse Hepa1-6 cells (to measure accuracy in another cell type). FIG. 14D shows exemplary targeted deep sequencing to detect indels in edited cells, confirming the high Nme2Cas9 accuracy indicated by GUIDE-seq. FIG. 14E shows an exemplary sequence for the validated off-target site of the Rosa26 guide, showing the PAM region (underlined), the consensus CC PAM dinucleotide (bold), and three mismatches in the PAM-distal portion of the spacer (red).
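The dual-site layout described for FIG. 14A-B (GN3GN19NGGNCC) can be illustrated with a short sketch that splits such a site into the two guides and their PAMs. This is only an illustration under stated assumptions: the regular expression and the function name are hypothetical, and the example sequence is the DS6 on-target site as listed in Table 2 (SEQ ID NO: 490).

```python
import re
from typing import Dict, Optional

# G N3 G N19 NGG NCC: a 24-nt Nme2Cas9 protospacer followed by NGGNCC, so that
# positions 25-27 form the SpyCas9 NGG PAM and positions 29-30 supply the CC
# of the Nme2Cas9 N4CC PAM.
DUAL_SITE = re.compile(r"G[ACGT]{3}G[ACGT]{19}[ACGT]GG[ACGT]CC")


def split_dual_site(site: str) -> Optional[Dict[str, str]]:
    """Split a dual site into Nme2Cas9 and SpyCas9 guides and PAMs.

    Returns None if the sequence does not start with the dual-site pattern.
    """
    match = DUAL_SITE.match(site.upper())
    if match is None:
        return None
    s = match.group(0)
    return {
        "nme2_guide": s[:24],   # 24-nt Nme2Cas9 guide
        "nme2_pam": s[24:30],   # N4CC
        "spy_guide": s[4:24],   # same 3' end, 4 nt shorter at the 5' end
        "spy_pam": s[24:27],    # NGG
    }


ds6 = "GCATGGGCAGGGGCTGGGGTGCACAGGCCCAG"
parts = split_dual_site(ds6)
print(parts)
```

For this site the SpyCas9 guide is GGGCAGGGGCTGGGGTGCAC with an AGG PAM, and the Nme2Cas9 guide is the full 24 nt with an AGGCCC PAM.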


To validate the off-target sites detected by GUIDE-seq, targeted deep sequencing was performed to measure indel formation at the top off-target loci following GUIDE-seq-independent editing (i.e. without co-transfection of the dsODN). While SpyCas9 showed considerable editing at most off-target sites tested and, in some instances, edited them more efficiently than the corresponding on-target site, Nme2Cas9 exhibited no detectable indels at the lone DS2 and DS6 candidate off-target sites. See, FIG. 14D. With the Rosa26 sgRNA, Nme2Cas9 induced ˜1% editing at the Rosa26-OT1 site in Hepa1-6 cells, compared to ˜30% on-target editing. See, FIG. 14D. It is noteworthy that this off-target site has a consensus Nme2Cas9 PAM (ACTCCCT) with only 3 mismatches at the PAM-distal end of the guide-complementary region (i.e. outside of the seed). See, FIG. 14E. These data support and reinforce our GUIDE-seq results indicating a high degree of accuracy for Nme2Cas9 genome editing in mammalian cells.


To further corroborate the above GUIDE-seq results, CRISPRseek was used to computationally predict potential off-target sites for two active Nme2Cas9 sgRNAs that targeted TS25 and TS47, both of which are also in VEGFA. See, FIG. 9A; (Zhu et al., 2014). Three (TS25) or four (TS47) of the most closely matched predicted sites were examined, five with N4CC PAMs and two with N4CA PAMs; each had 2-5 mismatches, mostly in their PAM-distal, non-seed regions. See, FIG. 13E. On- vs. off-target editing was compared after Nme2Cas9+sgRNA plasmid transfections into HEK293T cells by targeted amplification of each locus, followed by TIDE analysis. Consistently, no indels could be detected at those off-target sites for either sgRNA by TIDE, while efficient on-target editing was readily detected in DNA from the same populations of cells. Taken together, our data indicate that Nme2Cas9 is a naturally hyper-accurate genome editing platform in mammalian cells.
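The distinction between N4CC and N4CA PAMs drawn above can be sketched as a simple classifier over the 8-nt PAM.sequence strings used in Table 2. This is an illustrative sketch only; the function name is hypothetical, it is not how CRISPRseek scores PAMs, and the N4CA example sequence is invented for demonstration.

```python
def classify_nme2_pam(pam: str) -> str:
    """Classify a PAM read 5'->3' by the dinucleotide at positions 5-6.

    Returns "N4CC" for the consensus Nme2Cas9 PAM, "N4CA" for the variant
    mentioned in the text, and "other" for anything else.
    """
    pam = pam.upper()
    if len(pam) < 6:
        raise ValueError("PAM must be at least 6 nt long")
    dinucleotide = pam[4:6]
    if dinucleotide == "CC":
        return "N4CC"
    if dinucleotide == "CA":
        return "N4CA"
    return "other"


# AATCCCTT and ACTCCCTC are the Rosa26 on- and off-target PAMs from Table 2;
# ACGTCAGT is a made-up N4CA example.
for pam in ("AATCCCTT", "ACTCCCTC", "ACGTCAGT"):
    print(pam, classify_nme2_pam(pam))
```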


7. Adeno-Associated Virus Delivery


The compact size, small PAM, and high fidelity of Nme2Cas9 offer major advantages for in vivo genome editing using Adeno-Associated Virus (AAV) delivery. To test whether effective Nme2Cas9 genome editing can be achieved via single-AAV delivery, Nme2Cas9 was cloned with its sgRNA and their promoters (U1a and U6, respectively) into an AAV vector backbone. See, FIG. 15A. An all-in-one AAV was prepared with an sgRNA-Nme2Cas9 packaged into a hepatotropic AAV8 capsid to target two genes in the mouse liver: i) Rosa26 (a commonly used safe harbor locus for transgene insertion) (Friedrich and Soriano, 1991) as a negative control; and ii) Pcsk9, a major regulator of circulating cholesterol homeostasis (Rashid et al., 2005), as a phenotypic target.


SauCas9- or Nme1Cas9-induced indels in Pcsk9 in the mouse liver result in reduced cholesterol levels, providing a useful and easy-to-score in vivo benchmark for new editing platforms (Ran et al., 2015; Ibraheim et al., 2018). The Nme2Cas9 RNA guides were the same as those used above. See, FIG. 9B, FIG. 13D, and FIG. 14A-E. As Rosa26-OT1 was the only Nme2Cas9 off-target site that has been validated in cultured mammalian cells, the Rosa26 guide also provided us with an opportunity to assess on- vs. off-target editing in vivo. See, FIGS. 14D-E. The tail veins of two groups of mice (n=5) were injected with 4×10¹¹ AAV8.sgRNA.Nme2Cas9 genome copies (GCs) targeting either Pcsk9 or Rosa26. Serum was collected at 0, 14 and 28 days post-injection for cholesterol level measurement. Mice were sacrificed at 28 days post-injection and liver tissues were harvested. See, FIG. 15A. Targeted deep sequencing of each locus revealed ˜38% and ˜46% indel induction at the Pcsk9 and Rosa26 editing sites, respectively, in the liver. See, FIG. 15B. Because hepatocytes constitute only 65-70% of total cellular content in the adult liver, Nme2Cas9 AAV-induced hepatocyte editing efficiencies with sgPcsk9 and sgRosa26 were approximately 54-58% and 66-71%, respectively (Racanelli and Rehermann, 2006).


Only 2.25% liver indels overall (˜3-3.5% in hepatocytes) were detected at the Rosa26-OT1 off-target site, comparable to the 1% editing that we observed at this site in transfected Hepa1-6 cells. FIG. 15B cf. FIG. 14D. At both 14 and 28 days post-injection, Pcsk9 editing was accompanied by a ˜44% reduction in serum cholesterol levels, whereas mice treated with the sgRosa26-expressing AAV maintained normal cholesterol levels throughout the study. See, FIG. 15C. The ˜44% reduction in serum cholesterol in the Nme2Cas9/sgPcsk9 AAV-treated mice compares well with the ˜40% reduction reported with SauCas9 all-in-one AAV when targeting the same gene (Ran et al., 2015).
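The hepatocyte-level estimates quoted above follow from dividing the whole-liver indel percentage by the hepatocyte fraction of the liver. A minimal sketch of that arithmetic is given below, assuming that all AAV8-mediated editing occurs in hepatocytes and that hepatocytes make up 65-70% of liver cells; the function name is hypothetical.

```python
def hepatocyte_editing_range(liver_indel_pct, hepatocyte_fraction=(0.65, 0.70)):
    """Convert a whole-liver indel percentage to a hepatocyte-only range.

    Assumes every edited allele comes from a hepatocyte, so the
    hepatocyte-specific rate is the bulk rate divided by the fraction
    of liver cells that are hepatocytes.
    """
    low_fraction, high_fraction = hepatocyte_fraction
    # A larger hepatocyte fraction gives the lower bound of the estimate.
    return liver_indel_pct / high_fraction, liver_indel_pct / low_fraction


# Whole-liver indel percentages reported in the text.
for label, pct in (("Pcsk9", 38.0), ("Rosa26", 46.0), ("Rosa26-OT1", 2.25)):
    low, high = hepatocyte_editing_range(pct)
    print(f"{label}: {low:.1f}-{high:.1f}% in hepatocytes")
```

The computed ranges agree, after rounding, with the 54-58%, 66-71% and 3-3.5% figures given in the text.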



FIG. 15A-C presents exemplary data showing Nme2Cas9 genome editing in vivo via all-in-one AAV delivery. FIG. 15A shows an exemplary workflow for delivery of AAV8.sgRNA.Nme2Cas9 to lower cholesterol levels in mice by targeting Pcsk9. Top: schematic of the all-in-one AAV vector expressing Nme2Cas9 and the sgRNA (individual genome elements not to scale). BGH, bovine growth hormone poly(A) site; HA, epitope tag; NLS, nuclear localization sequence; h, human-codon-optimized. Bottom: Timeline for AAV8.sgRNA.Nme2Cas9 tail-vein injections (4×10^11 GCs), followed by cholesterol measurements at day 14 and indel, histology and cholesterol analyses at day 28 post-injection. FIG. 15B shows an exemplary TIDE analysis to measure indels in DNA extracted from livers of mice injected with AAV8.Nme2Cas9+sgRNA targeting Pcsk9 and Rosa26 (control) loci. Indel efficiency at the lone off-target site identified by GUIDE-seq for these two sgRNAs (Rosa26-OT1) was also assessed by TIDE. FIG. 15C shows exemplary reduced serum cholesterol levels in mice injected with the Pcsk9-targeting guide compared to the Rosa26-targeting controls. P values are calculated by unpaired two-tailed t-test. FIG. 16A-B presents exemplary data showing PCSK9 knockdown and liver histology following Nme2Cas9 AAV delivery and editing, related to FIG. 15A-C. FIG. 16A shows exemplary Western blotting using anti-PCSK9 antibody, which reveals strongly reduced levels of PCSK9 in the livers of mice treated with sgPcsk9, compared to mice treated with sgRosa26. 2 ng of recombinant PCSK9 was used as a mobility standard (left-most lane), and a cross-reacting band in the liver samples is indicated by an asterisk. GAPDH was used as loading control (bottom panel). FIG. 16B shows exemplary H&E staining from livers of mice injected with AAV8.Nme2Cas9+sgRosa26 (left) or AAV8.Nme2Cas9+sgPcsk9 (right) vectors. Scale bars, 25 μm.


Western blotting was performed using an anti-PCSK9 antibody to estimate PCSK9 protein levels in the livers of mice treated with sgPcsk9 and sgRosa26. Liver PCSK9 was below the detection limit in mice treated with sgPcsk9, whereas sgRosa26-treated mice exhibited normal levels of PCSK9. See, FIG. 16A. Hematoxylin and eosin (H&E) staining and histology revealed no signs of toxicity or tissue damage in either group after Nme2Cas9 expression. See, FIG. 16B. These data validate Nme2Cas9 as a highly effective genome editing system in vivo, including when delivered by single-AAV vectors.


AAV vectors have recently been used for the generation of genome-edited mice, without the need for microinjection or electroporation, simply by soaking the zygotes in culture medium containing AAV vector(s), followed by reimplantation into pseudopregnant females (Yoon et al., 2018). Editing was obtained previously with a dual-AAV system in which SpyCas9 and its sgRNA were delivered in separate vectors (Yoon et al., 2018). To test whether Nme2Cas9 could perform accurate and efficient editing in mouse zygotes with an all-in-one AAV delivery system, we targeted Tyrosinase (Tyr). A bi-allelic inactivation of Tyr disrupts melanin production resulting in an albino phenotype (Yokoyama et al., 1990).


An efficient Tyr sgRNA that cleaves the Tyr locus only seventeen (17) bp from the site of the classic albino mutation was validated in Hepa1-6 cells by transient transfections. See, FIG. 17A. Next, C57BL/6NJ zygotes were incubated for 5-6 hours in culture medium containing 3×10^9 or 3×10^8 GCs of an all-in-one AAV6 vector expressing Nme2Cas9 along with the Tyr sgRNA. After overnight culture in fresh media, those zygotes that advanced to the two-cell stage were transferred to the oviduct of pseudopregnant recipients and allowed to develop to term. See, FIG. 18A. Coat color analysis of pups revealed mice that were albino, chinchilla (indicating a hypomorphic allele of Tyrosinase), or that had variegated coat color composed of albino and chinchilla spots but lacking black pigmentation. See, FIGS. 18B-C. These results suggest a high frequency of biallelic mutations since the presence of a wild-type Tyrosinase allele should render black pigmentation. A total of five pups (10%) were born from the 3×10^9 GCs experiment. All of them carried indels; phenotypically, two were albino, one was chinchilla, and two had variegated pigmentation, indicating mosaicism.


From the 3×10^8 GCs experiment, four (4) pups (14%) were obtained, two of which died at birth, preventing a coat color or genome analysis. Coat color analysis of the remaining two pups revealed one chinchilla and one mosaic pup. These results indicate that single-AAV delivery of Nme2Cas9 and its guide can be used to generate mutations in mouse zygotes without microinjection or electroporation.


To measure on-target indel formation in the Tyr gene, DNA was isolated from the tail of each mouse, the locus was amplified, and a TIDE analysis was performed. All mice had high levels of on-target editing by Nme2Cas9, varying from 84% to 100%. See, FIGS. 17B-C. Most lesions in albino mouse 9-1 were either a 1- or a 4-bp deletion, suggesting either mosaicism or trans-heterozygosity, but albino mouse 9-2 exhibited a uniform 2-bp deletion. See, FIG. 17C. FIG. 17 presents exemplary data showing Tyr editing ex vivo in mouse zygotes, related to FIG. 16A-B. FIG. 17A shows two exemplary sites in Tyr, each with N4CC PAMs, that were tested for editing in Hepa1-6 cells. The sgTyr2 guide exhibited higher editing efficiency and was selected for further testing. FIG. 17B shows seven exemplary mice that survived post-natal development, each of which exhibited coat color phenotypes as well as on-target editing, as assayed by TIDE. FIG. 17C shows exemplary indel spectra from tail DNA of each mouse from FIG. 17B, as well as an unedited C57BL/6NJ mouse, as indicated by TIDE analysis. Efficiencies of insertions (positive) and deletions (negative) of various sizes are indicated.



FIG. 18A-C presents exemplary data showing Nme2Cas9 genome editing ex vivo via all-in-one AAV delivery. FIG. 18A shows an exemplary workflow for single-AAV Nme2Cas9 editing ex vivo to generate albino C57BL/6NJ mice by targeting the Tyr gene. Zygotes are cultured in KSOM containing AAV6.Nme2Cas9:sgTyr for 5-6 hours, rinsed in M2, and cultured for a day before being transferred to the oviduct of pseudo-pregnant recipients. FIG. 18B shows exemplary albino (left) and chinchilla or variegated (middle) mice generated by 3×10^9 GCs, and chinchilla or variegated mice (right) generated by 3×10^8 GCs of zygotes with AAV6.Nme2Cas9:sgTyr. FIG. 18C shows an exemplary summary of Nme2Cas9.sgTyr single-AAV ex vivo Tyr editing experiments at two AAV doses.


The data are inconclusive as to whether mosaicism was absent from mouse 9-2, or whether additional alleles were absent from mouse 9-1, because only tail samples were sequenced and other tissues could have distinct lesions. Analysis of tail DNA from chinchilla mice revealed the presence of in-frame mutations that are potentially the cause of the chinchilla coat color. The limited mutational complexity suggests that editing occurred early during embryonic development in these mice. These results provide a streamlined route toward mammalian mutagenesis through the application of a single AAV vector, in this case delivering both Nme2Cas9 and its sgRNA.
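The distinction drawn above between frameshifting lesions (the 1-, 2- and 4-bp deletions in the albino mice) and in-frame mutations (in the chinchilla mice) can be illustrated with a short sketch. It assumes indel sizes are signed as in the TIDE spectra (negative for deletions, positive for insertions); the function name and the 3-bp deletion example are hypothetical and are not data from this disclosure.

```python
def classify_indel(size_bp):
    """Classify an indel by its effect on the reading frame.

    size_bp is negative for deletions and positive for insertions.
    """
    if size_bp == 0:
        return "unedited"
    # A length change that is a multiple of 3 preserves the reading frame.
    return "in-frame" if abs(size_bp) % 3 == 0 else "frameshift"


# Deletion sizes reported for albino mice 9-1 and 9-2.
for size in (-1, -4, -2):
    print(size, classify_indel(size))

# Hypothetical in-frame lesion, shown only for contrast.
print(-3, classify_indel(-3))
```

Frameshifts are expected to abolish Tyrosinase function, whereas an in-frame lesion may leave a partially active (hypomorphic) protein, which is consistent with the coat colors described.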



FIG. 19A-C shows an exemplary mCherry reporter assay for nSpCas9-ABEmax and optimized nNme2Cas9-ABEmax activities. FIG. 19A shows exemplary sequence information of the ABE-mCherry reporter. There is a TAG stop codon in the mCherry coding region. In the reporter-integrated stable cell line, there is no mCherry signal due to this stop codon. The mCherry signal will be activated if the nSpCas9-ABEmax or optimized nNme2Cas9-ABEmax can convert TAG to CAG, which encodes a glutamine residue. FIG. 19B shows an exemplary mCherry signal activated by SpCas9-ABE or Nme2Cas9-ABE activity. Upper panel: negative control (no editing); middle panel: mCherry activation by nSpCas9-ABEmax; bottom panel: mCherry activation by optimized nNme2Cas9-ABEmax. FIG. 19C shows an exemplary FACS quantitation of base editing events in mCherry reporter cells transfected with the SpCas9-ABE or Nme2Cas9-ABE. N=6; error bars represent S.D. Results are from three biological replicates performed in technical duplicates.
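The logic of the ABE reporter can be sketched in a few lines. Because an adenine base editor converts A to G, the TAG-to-CAG change on the coding strand is modeled here as an A-to-G edit of the adenine paired with the T; that strand assignment is an inference from the chemistry rather than a statement in the text, and the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Only the two codons named in the text are needed here.
CODON_MEANING = {"TAG": "stop", "CAG": "Gln"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def abe_edit(codon, coding_pos):
    """Model an A-to-G edit on the strand opposite the coding strand.

    coding_pos is the 0-based position in the coding-strand codon whose
    base-pairing partner is deaminated. Returns the edited coding codon.
    """
    opposite = list(reverse_complement(codon))
    idx = len(codon) - 1 - coding_pos  # same base pair, opposite-strand index
    if opposite[idx] != "A":
        raise ValueError("no adenine on the opposite strand at this position")
    opposite[idx] = "G"
    return reverse_complement("".join(opposite))


edited = abe_edit("TAG", 0)
print("TAG", CODON_MEANING["TAG"], "->", edited, CODON_MEANING[edited])
```

The edited codon is CAG (glutamine), so translation reads through and the mCherry signal is restored.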



FIG. 20A-C shows an exemplary GFP reporter assay for nSpCas9-CBE4 (Addgene #100802) and nNme2Cas9-CBE4 (same plasmid backbone as Addgene #100802) activities. FIG. 20A shows exemplary sequence information of the CBE-GFP reporter. There is a mutation that converts GYG to GHG in the fluorophore core region of the GFP reporter line. There is no GFP signal due to this mutation. The GFP signal will be activated if the nSpCas9-CBE4 or nNme2Cas9-CBE4 can convert CAC (encoding histidine) to TAC/TAT (encoding tyrosine). FIG. 20B shows an exemplary GFP signal activated by nSpCas9-CBE4 or nNme2Cas9-CBE4 activity. Upper panel: negative control (no editing); middle panel: GFP activation by nSpCas9-CBE4; bottom panel: GFP activation by nNme2Cas9-CBE4. FIG. 20C shows an exemplary FACS quantitation of base editing events in GFP reporter cells transfected with nSpCas9-CBE4 or nNme2Cas9-CBE4. N=6; error bars represent S.D. Results are from biological replicates performed in technical duplicates.
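The CBE reporter logic can be checked by enumerating every C-to-T outcome of the CAC codon. This is a minimal sketch under the assumption that either or both cytosines of the codon may be deaminated; the function name is illustrative, and the codon table lists only the four codons that can arise.

```python
from itertools import product

# Standard genetic code entries for the four possible outcomes.
CODON_TO_AA = {"CAC": "His", "CAT": "His", "TAC": "Tyr", "TAT": "Tyr"}


def c_to_t_outcomes(codon):
    """Return every codon reachable by converting any subset of C's to T."""
    options = [("C", "T") if base == "C" else (base,) for base in codon]
    return ["".join(bases) for bases in product(*options)]


for outcome in c_to_t_outcomes("CAC"):
    status = "GFP restored" if CODON_TO_AA[outcome] == "Tyr" else "GFP off"
    print(outcome, CODON_TO_AA[outcome], status)
```

Only edits that include the first cytosine (TAC or TAT) restore tyrosine, which matches the two activating outcomes named in the legend; editing the third position alone (CAT) still encodes histidine.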



FIG. 21 shows exemplary cytosine editing by nNme2Cas9-CBE4. Upper panel shows the KANK3 targeting sequence information (PAM sequences are indicated in red) of Nme2Cas9 and base editing in the negative control samples. Bottom panel shows the quantification of the substitution efficiency of each type of base in the nNme2Cas9-CBE4 editing window of the KANK3 target sequences. Sequence tables show nucleotide frequencies at each position. Frequencies of expected C-to-T conversion are indicated in red.



FIG. 22 shows exemplary cytosine and adenine editing by nNme2Cas9-CBE4 and nNme2Cas9-ABEmax, respectively. Upper panel shows the PLXNB2 targeting sequence information (PAM sequences are indicated in red) of Nme2Cas9 and base editing in the negative control samples. Middle panel shows the quantification of the substitution rate of each type of base in the nNme2Cas9-ABEmax editing windows of the PLXNB2 target sequence. Sequence tables show nucleotide frequencies at each position. Frequencies of expected A-to-G conversion are highlighted in red. Bottom panel shows the quantification of the substitution efficiency of each type of base in the nNme2Cas9-CBE4 editing windows of the PLXNB2 target sequence. Sequence tables show nucleotide frequencies at each position. Frequencies of expected C-to-T conversion are highlighted in red.










8. Sequences



Alignment of Nme1Cas9 and Nme2Cas9


Non-PID aa differences (teal-underlined); PID aa


differences (yellow-underlined bold); active


site residues (red-bold).


Nme1Cas9 (1-60)


(SEQ ID NO: 652)



MAAFKPNSINYILGLDIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPKTGDSLAM






Nme2Cas9 (1-60)


(SEQ ID NO: 883)



MAAFKPNPINYILGLDIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPKTGDSLAM






Nme1Cas9 (61-120)


(SEQ ID NO: 653)



ARRLARSVRRLTRRRAHRLLRTRRLLKREGVLQAANFDENGLIKSLPNTPWQLRAAALDR






Nme2Cas9 (61-120)


(SEQ ID NO: 654)



ARRLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDENGLIKSLPNTPWQLRAAALDR






Nme1Cas9 (121-180)


(SEQ ID NO: 655)



KLTPLEWSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVAGNAHALQTGDFRTPAEL





Nme2Cas9 (121-180)


(SEQ ID NO: 656)



KLTPLEWSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVANNAHALQTGDFRTPAEL





Nme1Cas9 (181-240)


(SEQ ID NO: 657)



ALNKFEKESGHIRNQRSDYSHTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLM






Nme2Cas9 (181-240)


(SEQ ID NO: 658)



ALNKFEKESGHIRNQRGDYSHTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLM






Nme1Cas9 (241-300)


(SEQ ID NO: 659)



TQRPALSGDAVQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDT






Nme2Cas9 (241-300)


(SEQ ID NO: 660)



TQRPALSGDAVQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDT





Nme1Cas9 (301-360)


(SEQ ID NO: 661)



ERATLMDEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRAL





Nme2Cas9 (301-360)


(SEQ ID NO: 662)



ERATLMDEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRAL






Nme1Cas9 (361-420)


(SEQ ID NO: 663)



EKEGLKDKKSPLNLSPELQDEIGTAFSLFKTDEDITGRLKDRIQPEILEALLKHISFDKF






Nme2Cas9 (361-420)


(SEQ ID NO: 664)



EKEGLKDKKSPLNLSSELQDEIGTAFSLFKTDEDITGRLKDRVQPEILEALLKHISFDKF






Nme1Cas9 (421-480)


(SEQ ID NO: 665)



VQISLKALRRIVPLMEQGKRYDEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRA






Nme2Cas9 (421-480)


(SEQ ID NO: 666)



VQISLKALRRIVPLMEQGKRYDEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRA






Nme1Cas9 (481-540)


(SEQ ID NO: 667)



LSQARKVINGVVRRYGSPARIHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREY






Nme2Cas9 (481-540)


(SEQ ID NO: 668)



LSQARKVINGVVRRYGSPARIHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREY






Nme1Cas9 (541-600)


(SEQ ID NO: 669)



FPNFVGEPKSKDILKLRLYEQQHGKCLYSGKEINLGRLNEKGYVEIDHALPFSRTWDDSF






Nme2Cas9 (541-600)


(SEQ ID NO: 670)



FPNFVGEPKSKDILKLRLYEQQHGKCLYSGKEINLVRLNEKGYVEIDHALPFSRTWDDSF






Nme1Cas9 (601-660)


(SEQ ID NO: 671)



NNKVLVLGSENQNKGNQTPYEYFNGKDNSREWQEFKARVETSRFPRSKKQRILLQKFDED






Nme2Cas9 (601-660)


(SEQ ID NO: 672)



NNKVLVLGSENQNKGNQTPYEYFNGKDNSREWQEFKARVETSRFPRSKKQRILLQKFDED






Nme1Cas9 (661-720)


(SEQ ID NO: 673)



GFKERNLNDTRYVNRFLCQFVADRMRLTGKGKKRVFASNGQITNLLRGFWGLRKVRAEND






Nme2Cas9 (661-720)


(SEQ ID NO: 674)



GFKECNLNDTRYVNRFLCQFVADHILLTGKGKRRVFASNGQITNLLRGFWGLRKVRAEND






Nme1Cas9 (721-780)


(SEQ ID NO: 675)



RHHALDAVVVACSTVAMQQKITRFVRYKEMNAFDGKTIDKETGEVLHQKTHFPQPWEFFA






Nme2Cas9 (721-780)


(SEQ ID NO: 676)



RHHALDAVVVACSTVAMQQKITRFVRYKEMNAFDGKTIDKETGKVLHQKTHFPQPWEFFA






Nme1Cas9 (781-840)


(SEQ ID NO: 677)



QEVMIRVFGKPDGKPEFEEADTLEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSG






Nme2Cas9 (781-840)


(SEQ ID NO: 678)



QEVMIRVFGKPDGKPEFEEADTPEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSG






Nme1Cas9 (841-895)


(SEQ ID NO: 679)




custom-character Tcustom-character SAKcustom-character Ecustom-character SVcustom-character RVcustom-character LTcustom-character K Lcustom-character DLEcustom-character MVNcustom-character








custom-character REcustom-character LYEALKARLEAcustom-character






Nme2Cas9 (841-899)


(SEQ ID NO: 680)




custom-character
custom-character Tcustom-character SAKcustom-character NEcustom-character SVcustom-character RVcustom-character LTcustom-character KLcustom-character DLcustom-character MVNcustom-character REcustom-character







LYEALKARLEAcustom-character





Nme1Cas9 (896-950)


(SEQ ID NO: 681)




custom-character AFcustom-character PFYKcustom-character Gcustom-character Qcustom-character VKAVRVEcustom-character VQcustom-character GVcustom-character








custom-character IADNcustom-character MVRV






Nme2Cas9 (900-954)


(SEQ ID NO: 682)




custom-character AFcustom-character PFYKcustom-character








custom-character Qcustom-character VKAVRVEcustom-character QESGVcustom-character IADNcustom-character MVRV






Nme1Cas9 (951-1005)


(SEQ ID NO: 683)



DVFcustom-character Kcustom-character




custom-character Ycustom-character VPIYcustom-character WQVAcustom-character ILPDcustom-charactercustom-character IDDScustom-character Fcustom-character FSLHcustom-character D






Nme2Cas9 (955-1007)


(SEQ ID NO: 684)



DVFcustom-character KVcustom-character Ycustom-character VPIYcustom-character WQVAcustom-character ILPDcustom-character IDDScustom-character Fcustom-character






FSLHcustom-character D





Nme1Cas9 (1006-1063)


(SEQ ID NO: 685)



Lcustom-character Kcustom-character Fcustom-character Ycustom-character Ccustom-character Gcustom-character HDcustom-character Kcustom-character







custom-character IQKYQcustom-character






Nme2Cas9 (1008-1063)


(SEQ ID NO: 686)



Lcustom-character Kcustom-character Fcustom-character Ycustom-character Ccustom-character Gcustom-character HDcustom-character Kcustom-character







custom-character Icustom-character QKYQcustom-character






Nme1Cas9 (1064-1082)


(SEQ ID NO: 687)




custom-character ELGKEIRPCRLKKRPPVR







Nme2Cas9 (1064-1082)


(SEQ ID NO: 688)




custom-character ELGKEIRPCRLKKRPPVR
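The residue differences highlighted in the alignment above can be recovered programmatically from any pair of aligned segments. The following is a minimal sketch using the first segment (residues 1-60) of Nme1Cas9 and Nme2Cas9 as printed above; the function name is illustrative, and the comparison assumes gap-free segments of equal length.

```python
# Residues 1-60 as printed in the alignment above.
NME1_1_60 = "MAAFKPNSINYILGLDIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPKTGDSLAM"
NME2_1_60 = "MAAFKPNPINYILGLDIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPKTGDSLAM"


def aligned_differences(seq_a, seq_b, start=1):
    """List (position, residue_a, residue_b) where two aligned segments differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("segments must be the same length (no gaps handled)")
    return [
        (start + i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


print(aligned_differences(NME1_1_60, NME2_1_60))
```

For this segment the only difference is at position 8 (S in Nme1Cas9, P in Nme2Cas9). Segments with gaps, such as those in the PID region, would need a proper alignment tool rather than this position-by-position comparison.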







Alignment of Nme1Cas9 and Nme3Cas9


Non-PID aa differences (teal-underlined); PID aa


differences (yellow-underlined bold); active


site residues (red-bold).


Nme1Cas9 1


(SEQ ID NO: 689)



MAAFKPNSINYILGLDIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAE   50






Nme3Cas9 1


(SEQ ID NO: 690)



MAAFKPNPINYILGLDIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAE   50






Nme1Cas9 51


(SEQ ID NO: 691)



VPKTGDSLAMARRLARSVRRLTRRRAHRLLRTRRLLKREGVLQAANFDEN  100






Nme3Cas9 51


(SEQ ID NO: 692)



VPKTGDSLAMARRLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDEN  100






Nme1Cas9 101


(SEQ ID NO: 693)



GLIKSLPNTPWQLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGET  150






Nme3Cas9 101


(SEQ ID NO: 694)



GLIKSLPNTPWQLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGET  150






Nme1Cas9 151


(SEQ ID NO: 695)



ADKELGALLKGVAGNAHALQTGDFRTPAELALNKFEKESGHIRNQRSDYS  200






Nme3Cas9 151


(SEQ ID NO: 696)



ADKELGALLKGVADNAHALQTGDFRTPAELALNKFEKECGHIRNQRGDYS  200






Nme1Cas9 201


(SEQ ID NO: 697)



HTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDA  250






Nme3Cas9 201


(SEQ ID NO: 698)



HTFSRKDLQAELNLLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDA  250






Nme1Cas9 251


(SEQ ID NO: 699)



VQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDT  300






Nme3Cas9 251


(SEQ ID NO: 700)



VQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDT  300






Nme1Cas9 301


(SEQ ID NO: 701)



ERATLMDEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEM  350






Nme3Cas9 301


(SEQ ID NO: 702)



ERATLMDEPYRKSKLTYAQARKLLSLEDTAFFKGLRYGKDNAEASTLMEM  350






Nme1Cas9 351


(SEQ ID NO: 703)



KAYHAISRALEKEGLKDKKSPLNLSPELQDEIGTAFSLFKTDEDITGRLK  400






Nme3Cas9 351


(SEQ ID NO: 704)



KAYHTISRALEKEGLKDKKSPLNLSPELQDEIGTAFSLFKTDEDITGRLK  400






Nme1Cas9 401


(SEQ ID NO: 705)



DRIQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGKRYDEACAEIYG  450






Nme3Cas9 401


(SEQ ID NO: 706)



DRIQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGKRYDEACAEIYG  450






Nme1Cas9 451


(SEQ ID NO: 707)



DHYGKKNTEEKIYLPPIPADLIRNPVVLRALSQARKVINGVVRRYGSPAR  500






Nme3Cas9 451


(SEQ ID NO: 708)



DHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRYGSPAR  500






Nme1Cas9 501


(SEQ ID NO: 709)



IHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNEVGEPKS  550






Nme3Cas9 501


(SEQ ID NO: 710)



IHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNEVGEPKS  550






Nme1Cas9 551


(SEQ ID NO: 711)



KDILKLRLYEQQHGKCLYSGKEINLGRLNEKGYVEIDHALPFSRTWDDSF  600






Nme3Cas9 551


(SEQ ID NO: 712)



KDILKLRLYEQQHGKCLYSGKEINLGRLNEKGYVEIDHALPFSRTWDDSF  600






Nme1Cas9 601


(SEQ ID NO: 713)



NNKVLVLGSENQNKGNQTPYEYFNGKDNSREWQEFKARVETSRFPRSKKQ  650






Nme3Cas9 601


(SEQ ID NO: 714)



NNKVLVLGSENQNKGNQFPYEYENGKDNSREWQEFKARVETSRFPRSKKQ  650






Nme1Cas9 651


(SEQ ID NO: 715)



RILLQKFDEDGEKERNLNDTRYVNRFICQFVADRMRLIGKGKKRVFASNG  700






Nme3Cas9 651


(SEQ ID NO: 716)



RILLQKFDEDGEKERNLNDTRYVNRFLCQFVADRMRLTGKGKKRVFASNG  700






Nme1Cas9 701


(SEQ ID NO: 717)



QITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQKITREVRYKEM  750






Nme3Cas9 701


(SEQ ID NO: 718)



QITNLLRGFWGLRKVRAENDRHHALDAVVVACSINAMQQKITREVRYKEM  750






Nme1Cas9 751


(SEQ ID NO: 719)



NAFDGKTIDKETGEVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEA  800






Nme3Cas9 751


(SEQ ID NO: 720)



NAFDGKTIDKETGEVLHQKTHFPQPWEFFLQEVMIRVFGKPDGKPEFEEA  800






Nme1Cas9 801


(SEQ ID NO: 721)



DTLEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGQGHMETVKSA  850






Nme3Cas9 801


(SEQ ID NO: 722)



DTPEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGQGHMETVKSA  850






Nme1Cas9 851


(SEQ ID NO: 723)



KRLDEGVSVLRVPLTQLKLKDLEKMVNREREPKLYEALKARLEAHKDDPA  900






Nme3Cas9 851


(SEQ ID NO: 724)



KRLDEGVSVLRVPLTQLKLKDLEKMVNREREPKLYEALKARLEAHKDDPA  900






Nme1Cas9 901


(SEQ ID NO: 725)



KAFAEPFYKYDKAGNRTQQVKAVRVEQVQKTGVWVRNHNGIADNATMVRV  950






Nme3Cas9 901


(SEQ ID NO: 726)



KAFAEPFYKYDKAGNRTQQVKAVRVEQVQKTGVWVRNHNGIADNATMVRV  950






Nme1Cas9 951


(SEQ ID NO: 727)



DVFEKGDKYYLVPIYSWQVAKGILPDRAcustom-character GKDEEDWcustom-character LIDcustom-character SFcustom-character FKFcustom-character  1000






Nme3Cas9 951


(SEQ ID NO: 728)



DVFEKGDKYYLVPIYSWQVAKGILPDRAVVcustom-character DEEDWTVIDESERFKEV 1000






Nme1Cas9 1001


(SEQ ID NO: 729)



Lcustom-character NDLcustom-character Vcustom-character KKcustom-character GYFcustom-character Rcustom-character TGcustom-character Icustom-character Rcustom-character HDLcustom-character GKcustom-character Gcustom-character I 1050






Nme3Cas9 1001


(SEQ ID NO: 884)



Lcustom-character NDLcustom-character Vcustom-character KKcustom-character GYFcustom-character Rcustom-character TGcustom-character Icustom-character Rcustom-character HDLcustom-character GKcustom-character Gcustom-character  1049






Nme1Cas9 1051


(SEQ ID NO: 730)



GVKTALSEQKYQIDEcustom-character GKEIRPCRLKKRPPVR                   1082






Nme3Cas9 1050


(SEQ ID NO: 731)



GVKTALSFQKYQIDEcustom-character GKEIRPCRLKKRPPVR                   1081






Plasmid-Expressed Nme2Cas9


SV40 NLS (yellow-BOLD); 3X-HA-Tag (green-underlined/bold);


cMyc-like NLS (teal-plain); Linker (magenta-bold italics)


and Nme2Cas9 (italics).


(SEQ ID NO: 732)




MAAFKPNPINYILGLDIGIASVGWAMVEIDEEENPIRLIDLGVRVF








ERAEVPKTGDSLAMARRLARSVRRLTRRRAHRLLRARRLLKREGV







LQAADFDENGLIKSLPNTPWQLRAAALDRKLTPLEWSAVLLHLIK







HRGYLSQRKNEGETADKELGALLKGVANNAHALQTGDFRTPAELA







LNKFEKESGHIRNQRGDYSHTFSRKDLQAELILLFEKQKEFGNPH







VSGGLKEGIETLLMTQRPALSGDAVQKMLGHCTFEPAEPKAAKNT







YTAERFIWLTKLNNLRILEQGSERPLTDTERATLMDEPYRKSKLT







YAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRALE







KEGLKDKKSPLNLSSELQDEIGTAFSLFKTDEDITGRLKDRVQPE







ILEALLKHISFDKFVQISLKALRRIVPLMEQGKRYDEACAEIYGD







HYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRYGSP







ARIHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPN







FVGEPKSKDILKLRLYEQQHGKCLYSGKEINLVRLNEKGYVEIDH







ALPFSRTWDDSFNNKVLVLGSENQNKGNQTPYEYFNGKDNSREWQ







EFKARVETSRFPRSKKQRILLQKFDEDGFKECNLNDTRYVNRFLC







QFVADHILLTGKGKRRVFASNGQITNLLRGFWGLRKVRAENDRHH







ALDAVVVACSTVAMQQKITRFVRYKEMNAFDGKTIDKETGKVLHQK







THFPQPWEFFAQEVMIRVFGKPDGKPEFEEADTPEKLRTLLAEK







LSSRPEAVHEYVTPLFVSRAPNRKMSGAHKDTLRSAKRFVKHNEK







ISVKRVWLTEIKLADLENMVNYKNGREIELYEALKARLEAYGGNA







KQAFDPKDNPFYKKGGQLVKAVRVEKTQESGVLLNKKNAYTIADN







GDMVRVDVFCKVDKKGKNQYFIVPIYAWQVAENILPDIDCKGYRI







DDSYTFCFSLHKYDLIAFQKDEKSKVEFAYYINCDSSNGRFYLAW







HDKGSKEQQFRISTQNLVLIQKYQVNELG







custom-character
PKKKRKV
custom-character
custom-character







custom-character
custom-character
custom-character AAPAAKKKKLDFESG*






AAV-expressed Nme2Cas9


SV40 NLS (yellow-BOLD); 3X-HA-Tag (green-underlined/bold);


Nucleoplasmin-like NLS (red-underline); c-myc NLS (teal-plain);


Linker (magenta-bold italics) and Nme2Cas9 (italics).


(SEQ ID NO: 733)




MV
PKKKRKV
custom-character
KRPAATKKAGQAKKKK
MAAFKPNPINYILGLDIGI








ASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPKTGDSLAMARRL







ARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDENGLIKSLPNT







PWQLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGETADKE







LGALLKGVANNAHALQTGDFRTPAELALNKFEKESGHIRNQRGDY







SHTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRP







ALSGDAVQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRIL







EQGSERPLTDTERATLMDEPYRKSKLTYAQARKLLGLEDTAFFKG







LRYGKDNAEASTLMEMKAYHAISRALEKEGLKDKKSPLNLSSELQ







DEIGTAFSLFKTDEDITGRLKDRVQPEILEALLKHISFDKFVQIS







LKALRRIVPLMEQGKRYDEACAEIYGDHYGKKNTEEKIYLPPIPA







DEIRNPVVLRALSQARKVINGVVRRYGSPARIHIETAREVGKSFK







DRKEIEKRQEENRKDREKAAAKFREYFPNFVGEPKSKDILKLRLY







EQQHGKCLYSGKEINLVRLNEKGYVEIDHALPFSRTWDDSFNNKV







LVLGSENQNKGNQTPYEYFNGKDNSREWQEFKARVETSRFPRSKK







QRILLQKFDEDGFKECNLNDTRYVNRFLCQFVADHILLTGKGKRR







VFASNGQITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQ







KITRFVRYKEMNAFDGKTIDKETGKVLHQKTHFPQPWEFFAQEVM







IRVFGKPDGKPEFEEADTPEKLRTLLAEKLSSRPEAVHEYVTPLF







VSRAPNRKMSGAHKDTLRSAKRFVKHNEKISVKRVWLTEIKLADL







ENMVNYKNGREIELYEALKARLEAYGGNAKQAFDPKDNPFYKKGG







QLVKAVRVEKTQESGVLLNKKNAYTIADNGDMVRVDVFCKVDKKG







KNQYFIVPIYAWQVAENILPDIDCKGYRIDDSYTFCFSLHKYDLI







AFQKDEKSKVEFAYYINCDSSNGRFYLAWHDKGSKEQQFRISTQN







LVLIQKYQVNELGKEIRPCRLKKRPPVR







custom-character
KRPAATKKAGQAKKKK
custom-character







custom-character
custom-character
custom-character
custom-character







DYAAAPAAKKKKLD*






Recombinant Nme2Cas9


SV40 NLS (yellow-BOLD); Nucleoplasmin-like NLS


(red-underline); Linker (magenta-bold


italics) and Nme2Cas9 (italics).


(SEQ ID NO: 734)




PKKKRKV
custom-character
MAAFKPNPINYILGLDIGIASVGWAMVEIDEEENPI








RLIDLGVRVFERAEVPKTGDSLAMARRLARSVRRLTRRRAHRLLR







ARRLLKREGVLQAADFDENGLIKSLPNTPWQLRAAALDRKLTPLE







WSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVANNAHALQT







GDFRTPAELALNKFEKESGHIRNQRGDYSHTFSRKDLQAELILLF







EKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDAVQKMLGHCTFE







PAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDTERATLM







DEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMK







AYHAISRALEKEGLKDKKSPLNLSSELQDEIGTAFSLFKTDEDIT







GRLKDRVQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGKRY







DEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARK







VINGVVRRYGSPARIHIETAREVGKSFKDRKEIEKRQEENRKDRE







KAAAKFREYFPNFVGEPKSKDILKLRLYEQQHGKCLYSGKEINLV







RLNEKGYVEIDHALPFSRTWDDSFNNKVLVLGSENQNKGNQTPYE







YFNGKDNSREWQEFKARVETSRFPRSKKQRILLQKFDEDGFKECN







LNDTRYVNRFLCQFVADHILLTGKGKRRVFASNGQITNLLRGFWG







LRKVRAENDRHHALDAVVVACSTVAMQQKITRFVRYKEMNAFDGK







TIDKETGKVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEAD







TPEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGAHKDTL







RSAKRFVKHNEKISVKRVWLTEIKLADLENMVNYKNGREIELYEA







LKARLEAYGGNAKQAFDPKDNPFYKKGGQLVKAVRVEKTQESGVL







LNKKNAYTIADNGDMVRVDVFCKVDKKGKNQYFIVPIYAWQVAEN







ILPDIDCKGYRIDDSYTFCFSLHKYDLIAFQKDEKSKVEFAYYIN







CDSSNGRFYLAWHDKGSKEQQFRISTQNLVLIQKYQVNELGKEIR







PCRLKKRPPVR







custom-character
custom-character
PAAKKKKLD
custom-character
KRPAATKKAGQAKKKK*






Recombinant Nme2Cas9 for use in mammalian cell RNP delivery:


SV40 NLS (yellow-BOLD); Nucleoplasmin-like NLS


(red-underline); Linker (magenta-bold


italics) and Nme2Cas9 (italics).


(SEQ ID NO: 735)




PKKKRKV
custom-character
MAAFKPNPINYILGLDIGIASVGWAMVEIDEEENPI








RLIDLGVRVFERAEVPKTGDSLAMARRLARSVRRLTRRRAHRLLR







ARRLLKREGVLQAADFDENGLIKSLPNTPWQLRAAALDRKLTPLE







WSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVANNAHALQT







GDFRTPAELALNKFEKESGHIRNQRGDYSHTFSRKDLQAELILLF







EKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDAVQKMLGHCTFE







PAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDTERATLM







DEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMK







AYHAISRALEKEGLKDKKSPLNLSSELQDEIGTAFSLFKTDEDIT







GRLKDRVQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGKRY







DEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARK







VINGVVRRYGSPARIHIETAREVGKSFKDRKEIEKRQEENRKDRE







KAAAKFREYFPNFVGEPKSKDILKLRLYEQQHGKCLYSGKEINLV







RLNEKGYVEIDHALPFSRTWDDSFNNKVLVLGSENQNKGNQTPY







EYFNGKDNSREWQEFKARVETSRFPRSKKQRILLQKFDEDGFKEC







NLNDTRYVNRFLCQFVADHILLTGKGKRRVFASNGQITNLLRGFW







GLRKVRAENDRHHALDAVVVACSTVAMQQKITRFVRYKEMNAFDG







KTIDKETGKVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEA







DTPEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGAHKDT







LRSAKRFVKHNEKISVKRVWLTEIKLADLENMVNYKNGREIELYE







ALKARLEAYGGNAKQAFDPKDNPFYKKGGQLVKAVRVEKTQESGV







LLNKKNAYTIADNGDMVRVDVFCKVDKKGKNQYFIVPIYAWQV







AENILPDIDCKGYRIDDSYTFCFSLHKYDLIAFQKDEKSKVEFAY







YINCDSSNGRFYLAWHDKGSKEQQFRISTQNLVLIQKYQVNELGK







EIRPCRLKKRPPVR







custom-character
custom-character
PAAKKKKLD
custom-character
KRPAATKKAGQAKKKK*







9. Therapeutic Applications


Although compact Cas9 orthologs have been previously validated for genome editing, including via single-AAV delivery, their longer PAMs have restricted therapeutic development due to target site frequencies that are lower than that of the more widely adopted SpyCas9. In addition, SauCas9 and its KKH variant with relaxed PAM requirements (Kleinstiver et al., 2015) are prone to off-target editing with some sgRNAs (Friedland et al., 2015; Kleinstiver et al., 2015). These limitations are exacerbated with target loci that require editing within a narrow sequence window, or that require precise segmental deletion. We have identified Nme2Cas9 as a compact and highly accurate Cas9 with a less restrictive dinucleotide PAM for genome editing by AAV delivery in vivo. The development of Nme2Cas9 greatly expands the genomic scope of in vivo editing, especially via viral vector delivery. The Nme2Cas9 all-in-one AAV delivery platform established in this study can in principle be used to target as wide a range of sites as SpyCas9 (due to the identical densities of optimal N4CC and NGG PAMs), but without the need to deliver two separate vectors to the same target cells. The availability of a catalytically dead version of Nme2Cas9 (dNme2Cas9) also promises to expand the scope of applications such as CRISPRi, CRISPRa, base editing, and related approaches (Dominguez et al., 2016; Komor et al., 2017). Moreover, Nme2Cas9's hyper-accuracy enables precise editing of target genes, potentially ameliorating safety issues resulting from off-target activities. Perhaps counterintuitively, the higher target site density of Nme2Cas9 (compared to that of Nme1Cas9) does not lead to a relative increase in off-target editing for the former. Similar results have been reported recently with SpyCas9 variants evolved to have shorter PAMs (Hu et al., 2018). 
Type II-C Cas9 orthologs are generally slower nucleases in vitro than SpyCas9 (Ma et al., 2015; Mir et al., 2018); interestingly, enzymological principles indicate that a reduced apparent kcat (within limits) can improve on- vs. off-target specificity for RNA-guided nucleases (Bisaria et al., 2017).
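The statement that optimal N4CC and NGG PAMs occur at identical densities follows from each motif fixing exactly two nucleotides, so that in a sequence with equal base frequencies each is expected about once per 16 positions per strand. The following is a minimal sketch of a PAM counter that scans both strands; the function name and the toy sequence are illustrative, and the sketch ignores the additional requirement for a full-length protospacer upstream of each PAM.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

NGG = "[ACGT]GG"        # SpyCas9 PAM
N4CC = "[ACGT]{4}CC"    # Nme2Cas9 PAM


def count_pams(seq, pam_regex):
    """Count PAM occurrences on both strands, including overlapping ones."""
    # A lookahead matches at every start position without consuming bases.
    pattern = re.compile(f"(?={pam_regex})")
    top = seq.upper()
    bottom = top.translate(COMPLEMENT)[::-1]
    return sum(len(pattern.findall(strand)) for strand in (top, bottom))


toy = "AAAACCGGTTTT"  # hypothetical sequence, chosen only for illustration
print("NGG sites:", count_pams(toy, NGG))
print("N4CC sites:", count_pams(toy, N4CC))
```

On a real genome the two counts would differ somewhat because base composition is not uniform; the sketch only illustrates why the two motifs are comparable in principle.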


The discovery of Nme2Cas9 and Nme3Cas9 hinged on unexplored Cas9s that are highly related (outside of the PID) to an ortholog that was previously validated for human genome editing (Esvelt et al., 2013; Hou et al., 2013; Lee et al., 2016; Amrani et al., 2018). The relatedness of Nme2Cas9 and Nme3Cas9 to Nme1Cas9 brought an added benefit, namely that they use the exact same sgRNA scaffold, circumventing the need to identify and validate functional tracrRNA sequences for each. In the context of natural CRISPR immunity, the accelerated evolution of novel PAM specificities could reflect selective pressure to restore targeting of phages and MGEs that have escaped interference through PAM mutations (Deveau et al., 2008; Paez-Espino et al., 2015). Our observation that AcrIIC5Smu inhibits Nme1Cas9 but not Nme2Cas9 suggests a second, non-mutually-exclusive basis for accelerated PID variation, namely evasion of anti-CRISPR inhibition. We also speculate that accelerated variability may not be restricted to PIDs, perhaps resulting from selective pressures to evade anti-CRISPRs that bind other Cas9 domains. Cas9 inhibitors such as AcrIIC1 that bind more conserved regions of Cas9 likely present fewer routes toward mutational escape and therefore exhibit a broader inhibitory spectrum (Harrington et al., 2017a). Whatever the sources of selective pressure driving Acr and Cas9 co-evolution, the availability of validated inhibitors of Nme2Cas9 (e.g. AcrIIC1-4) provides opportunities for additional levels of control over its activities.


The approach used in this study (i.e. searching for rapidly-evolving domains within Cas9) can be implemented elsewhere, especially with bacterial species that are well-sampled at the level of genome sequence. This approach could also be applied to other CRISPR-Cas effector proteins such as Cas12 and Cas13 that have also been developed for genome or transcriptome engineering and other applications. This strategy could be especially compelling with Cas proteins that are closely related to orthologs with proven efficacy in heterologous contexts (e.g. in eukaryotic cells), as was the case for Nme1Cas9. The application of this approach to meningococcal Cas9 orthologs yielded a new genome editing platform, Nme2Cas9, with a unique combination of characteristics (compact size, dinucleotide PAM, hyper-accuracy, single-AAV deliverability, and Acr susceptibility) that promise to accelerate the development of genome editing tools for both general and therapeutic applications.









TABLE 3

The following presents exemplary sequences for plasmids and oligos as disclosed herein.

Exemplary Plasmids

# | Plasmid Name | Insert description | Backbone | Purpose | Insert Sequence | SEQ ID NO:
1 | pAE70 | Nme3Cas9 PID on Nme1Cas9 | pMCSG7 | Bacterial expression of Nme1Cas9 with Nme3Cas9 PID | See examples herein. |
2 | pAE71 | Nme2Cas9 PID on Nme1Cas9 | pMCSG7 | Bacterial expression of Nme1Cas9 with Nme2Cas9 PID | See examples herein. |
3 | pAE113 | Nme2TLR1 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GTCACCTGCCTCGTGGAATACGG | 736 504
4 | pAE114 | Nme2TLR2 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GCACCTGCCTCGTGGAATACGGT | 737 505
5 | pAE115 | Nme2TLR5 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GTTCAGCGTGTCCGGCTTTGGC | 738 506
6 | pAE116 | Nme2TLR11 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GTGGTGAGCAAGGGCGAGGAGCTG | 739 507
7 | pAE117 | Nme2TLR12 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GGGCGAGGAGCTGTTCACCGGGGT | 740 508
8 | pAE118 | Nme2TLR13 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GTGAACTTGTGGCCGTTTACGTCG | 741 509
9 | pAE119 | Nme2TLR14 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GCGTCCAGCTCGACCAGGATGGGC | 742 510
10 | pAE120 | Nme2TLR15 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GCGGTGAACAGCTCCTCGCCCTTG | 743 511
11 | pAE121 | Nme2TLR16 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GGGCACCACCCCGGTGAACAGCTC | 744 512
12 | pAE122 | Nme2TLR17 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GGCACCACCCCGGTGAACAGCTCC | 745 513
13 | pAE123 | Nme2TLR18 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GGGATGGGCACCACCCCGGTGAAC | 746 514
14 | pAE124 | Nme2TLR19 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GCGTGTCCGGCTTTGGCGAGACAA | 747 515
15 | pAE125 | Nme2TLR20 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GTCCGGCTTTGGCGAGACAAATCA | 748 516
16 | pAE126 | Nme2TLR21 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GATCACCTGCCTCGTGGAATACGG | 749 517
17 | pAE149 | Nme2TLR22 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GACGCTGAACTTGTGGCCGTTTAC | 750 518
18 | pAE150 | Nme2TLR23 | pLKO | Targeting TLR2.0 with Nme2Cas9 | GCCAAAGCCGGACACGCTGAACTT | 751 519
19 | pAE193 | Nme2TLR13 with 23 nt spacer | pLKO | Targeting TLR2.0 with Nme2Cas9 | GGAACTTGTGGCCGTTTACGTCG | 752 520
20 | pAE194 | Nme2TLR13 with 22 nt spacer | pLKO | Targeting TLR2.0 with Nme2Cas9 | GAACTTGTGGCCGTTTACGTCG | 753 521
21 | pAE195 | Nme2TLR13 with 21 nt spacer | pLKO | Targeting TLR2.0 with Nme2Cas9 | GACTTGTGGCCGTTTACGTCG | 754 522
22 | pAE196 | Nme2TLR13 with 20 nt spacer | pLKO | Targeting TLR2.0 with Nme2Cas9 | GCTTGTGGCCGTTTACGTCG | 755 523
23 | pAE197 | Nme2TLR13 with 19 nt spacer | pLKO | Targeting TLR2.0 with Nme2Cas9 | GTTGTGGCCGTTTACGTCG | 756 524
24 | pAE213 | Nme2TLR21 with G22 spacer | pLKO | Targeting TLR2.0 with Nme2Cas9 | GTCACCTGCCTCGTGGAATACGG | 757 525
25 | pAE214 | Nme2TLR21 with G21 spacer | pLKO | Targeting TLR2.0 with Nme2Cas9 | GCACCTGCCTCGTGGAATACGG | 758 526
26 | pAE215 | Nme2TLR21 with G20 spacer | pLKO | Targeting TLR2.0 with Nme2Cas9 | GACCTGCCTCGTGGAATACGG | 759 527
27 | pAE216 | Nme2TLR21 with G19 spacer | pLKO | Targeting TLR2.0 with Nme2Cas9 | GCCTGCCTCGTGGAATACGG | 760 528
28 | pAE90 | Nme2TS1 | pLKO | Targeting AAVS1 with Nme2Cas9 | GGTTCTGGGTACTTTTATCTGTCC | 761 529
29 | pAE93 | Nme2TS4 | pLKO | Targeting AAVS1 with Nme2Cas9 | GTCTGCCTAACAGGAGGTGGGGGT | 762
30 | pAE94 | Nme2TS5 | pLKO | Targeting AAVS1 with Nme2Cas9 | GAATATCAGGAGACTAGGAAGGAG | 763
31 | pAE129 | Nme2TS6 | pLKO | Targeting LINC01588 with Nme2Cas9 | GCCTCCCTGCAGGGCTGCTCCC | 764
32 | pAE130 | Nme2TS10 | pLKO | Targeting AAVS1 with Nme2Cas9 | GAGCTAGTCTTCTTCCTCCAACCC | 765
33 | pAE131 | Nme2TS11 | pLKO | Targeting AAVS1 with Nme2Cas9 | GATCTGTCCCCTCCACCCCACAGT | 766
34 | pAE132 | Nme2TS12 | pLKO | Targeting AAVS1 with Nme2Cas9 | GGCCCAAATGAAAGGAGTGAGAGG | 767
35 | pAE133 | Nme2TS13 | pLKO | Targeting AAVS1 with Nme2Cas9 | GCATCCTCTTGCTTTCTTTGCCTG | 768
36 | pAE136 | Nme2TS16 | pLKO | Targeting LINC01588 with Nme2Cas9 | GGAGTCGCCAGAGGCCGGTGGTGG | 769
37 | pAE137 | Nme2TS17 | pLKO | Targeting LINC01588 with Nme2Cas9 | GCCCAGCGGCCGGATATCAGCTGC | 770
38 | pAE138 | Nme2TS18 | pLKO | Targeting CYBB with Nme2Cas9 | GGAAGGGAACATATTACTATTGC | 771
39 | pAE139 | Nme2TS19 | pLKO | Targeting CYBB with Nme2Cas9 | GTGGAGTGGCCTGCTATCAGCTAC | 772
40 | pAE140 | Nme2TS20 | pLKO | Targeting CYBB with Nme2Cas9 | GAGGAAGGGAACATATTACTATTG | 773
41 | pAE141 | Nme2TS21 | pLKO | Targeting CYBB with Nme2Cas9 | GTGAATTCTCATCAGCTAAAATGC | 774
42 | pAE144 | Nme2TS25 | pLKO | Targeting VEGFA with Nme2Cas9 | GCTCACTCACCCACACAGACACAC | 775
43 | pAE145 | Nme2TS26 | pLKO | Targeting CFTR with Nme2Cas9 | GGAAGAATTTCATTCTGTTCTCAG | 776
44 | pAE146 | Nme2TS27 | pLKO | Targeting CFTR with Nme2Cas9 | GCTCAGTTTTCCTGGATTATGCCT | 777
45 | pAE152 | Nme2TS31 | pLKO | Targeting VEGFA with Nme2Cas9 | GCGTTGGAGCGGGGAGAAGGCCAG | 778
46 | pAE153 | Nme2TS34 | pLKO | Targeting LINC01588 with Nme2Cas9 | GGGCCGCGGAGATAGCTGCAGGGC | 779
47 | pAE154 | Nme2TS35 | pLKO | Targeting LINC01588 with Nme2Cas9 | GCCCACCCGGCGGCGCCTCCCTGC | 780
48 | pAE155 | Nme2TS36 | pLKO | Targeting LINC01588 with Nme2Cas9 | GCGTGGCAGCTGATATCCGGCCGC | 781
49 | pAE156 | Nme2TS37 | pLKO | Targeting LINC01588 with Nme2Cas9 | GCCGCGGCGCGACGTGGAGCCAGC | 782
50 | pAE157 | Nme2TS38 | pLKO | Targeting LINC01588 with Nme2Cas9 | GTGCTCCCCAGCCCAAACCGCCGC | 783
51 | pAE159 | Nme2TS41 | pLKO | Targeting AGA with Nme2Cas9 | GTCAGATTGGCTTGCTCGGAATTG | 784
52 | pAE185 | Nme2TS44 | pLKO | Targeting VEGFA with Nme2Cas9 | GCTGGGTGAATGGAGCGAGCAGCG | 785
53 | pAE186 | Nme2TS45 | pLKO | Targeting VEGFA with Nme2Cas9 | GTCCTGGAGTGACCCCTGGCCTTC | 786
54 | pAE187 | Nme2TS46 | pLKO | Targeting VEGFA with Nme2Cas9 | GATCCTGGAGTGACCCCTGGCCTT | 787
55 | pAE188 | Nme2TS47 | pLKO | Targeting VEGFA with Nme2Cas9 | GTGTGTCCCTCTCCCCACCCGTCC | 788
56 | pAE189 | Nme2TS48 | pLKO | Targeting VEGFA with Nme2Cas9 | GTTGGAGCGGGGAGAAGGCCAGGG | 789
57 | pAE190 | Nme2TS49 | pLKO | Targeting VEGFA with Nme2Cas9 | GCGTTGGAGCGGGGAGAAGGCCAG | 790
58 | pAE191 | Nme2TS50 | pLKO | Targeting AGA with Nme2Cas9 | GTACCCTCCAATAATTTGGCTGGC | 791
59 | pAE192 | Nme2TS51 | pLKO | Targeting AGA with Nme2Cas9 | GATAATTTGGCTGGCAATTCCGAG | 792
60 | pAE232 | TS64_FancJ1 | pLKO | Targeting FANCJ with Nme2Cas9 | GAAAATTGTGATTTCCAGATCCAC | 793
61 | pAE233 | TS65_FancJ2 | pLKO | Targeting FANCJ with Nme2Cas9 | GAGCAGAAAAAATTGTGATTTCC | 794
62 | pAE200 | Nme2TS58 (Nme2DS1) | pLKO | Targeting DS in VEGFA with Nme2Cas9 | GCAGGGGCCAGGTGTCCTTCTCTG | 795
63 | pAE201 | Nme2TS59 (Nme2DS2) | pLKO | Targeting DS in VEGFA with Nme2Cas9 | GAATGGCAGGCGGAGGTTGTACTG | 796
64 | pAE202 | Nme2TS60 (Nme2DS3) | pLKO | Targeting DS in VEGFA with Nme2Cas9 | GAGTGAGAGAGTGAGAGAGAGACA | 797
65 | pAE203 | Nme2TS61 (Nme2DS4) | pLKO | Targeting DS in VEGFA with Nme2Cas9 | GTGAGCAGGCACCTGTGCCAACAT | 798
66 | pAE204 | Nme2TS62 (Nme2DS5) | pLKO | Targeting DS in VEGFA with Nme2Cas9 | GCGTGGGGGCTCCGTGCCCCACGC | 799
67 | pAE205 | Nme2TS63 (Nme2DS6) | pLKO | Targeting DS in VEGFA with Nme2Cas9 | GCATGGGCAGGGGCTGGGGTGCAC | 800
68 | pAE207 | SpyDS1 | pLKO | Targeting DS in VEGFA with SpyCas9 | GGGCCAGGTGTCCTTCTCTG | 801
69 | pAE208 | SpyDS2 | pLKO | Targeting DS in VEGFA with SpyCas9 | GGCAGGCGGAGGTTGTACTG | 802
70 | pAE209 | SpyDS3 | pLKO | Targeting DS in VEGFA with SpyCas9 | GAGAGAGTGAGAGAGAGACA | 803
71 | pAE210 | SpyDS4 | pLKO | Targeting DS in VEGFA with SpyCas9 | GCAGGCACCTGTGCCAACAT | 804
72 | pAE211 | SpyDS5 | pLKO | Targeting DS in VEGFA with SpyCas9 | GGGGGCTCCGTGCCCCACGC | 805
73 | pAE212 | SpyDS6 | pLKO | Targeting DS in VEGFA with SpyCas9 | GGGCAGGGGCTGGGGTGCAC | 806
74 | pAE169 | hDeCas9 Wt in AAV backbone | AAV | Nme2Cas9 all-in-one AAV expression with sgRNA cassette | See examples herein. |
75 | pAE217 | hDeCas9 wt in pMCSG7 backbone | pMCSG7 | Wildtype Nme2Cas9 for bacterial expression | See examples herein. |
76 | pAE107 | 2xNLS Nme2Cas9 with HA | pCdest | Nme2Cas9 CMV-driven expression plasmid | See examples herein. |
77 | pAE127 | hDemonCas9 3XNLS in pMCSG7 | pMCSG7 | Targeting endogenous loci with Nme2Cas9 | See examples herein. |
78 | pAM172 | hNme2Cas9 4X NLS with 3XHA | pCVL | Lentivector containing UCOE, SFFV driven Nme2Cas9 and Puro | See examples herein. |
79 | pAM174 | nickase hNme2Cas9 D16A 4X NLS with 3XHA | pCVL | Lentivector containing UCOE, SFFV driven Nme2Cas9 and Puro | See examples herein. |
80 | pAM175 | nickase hNme2Cas9 H588A 4X NLS with 3XHA | pCVL | Lentivector containing UCOE, SFFV driven Nme2Cas9 and Puro | See examples herein. |
81 | pAM177 | dead hNme2Cas9 4X NLS with 3XHA | pCVL | Lentivector containing UCOE, SFFV driven Nme2Cas9 and Puro | See examples herein. |










Exemplary oligonucleotides

Number | Name | Sequence | Purpose | SEQ ID NO:
1 | AAVS1_TIDE1_FW | TGGCTTAGCACCTCTCCAT | TIDE analysis | 807 575
2 | LINC01588_TIDE_FW | AGAGGAGCCTTCTGACTGCTGCAGA | TIDE analysis | 808 576
3 | AAVS1_TIDE2_FW | TCCGTCTTCCTCCACTCC | TIDE analysis | 809 577
4 | NTS55_TIDE_FW | TAGAGAACTGGGTAGTGTG | TIDE analysis | 810 578
5 | VEGF_TIDE3_FW | GTACATGAAGCAACTCCAGTCCCA | TIDE analysis | 811 579
6 | hCFTR_TIDE1_FW | TGGTGATTATGGGAGAACTGGAGC | TIDE analysis | 812 580
7 | AGA_TIDE1_FW | GGCATAAGGAAATCGAAGGTC | TIDE analysis | 813 581
8 | VEGF_TIDE4_FW | ACACGGGCAGCATGGGAATAGTC | TIDE analysis | 814 582
9 | VEGF_TIDE5_FW | CCTGTGTGGCTTTGCTTTGGTCG | TIDE analysis | 815 583
10 | VEGF_TIDE6_FW | GGAGGAAGAGTAGCTCGCCGAGG | TIDE analysis | 816 584
11 | VEGF_TIDE7_FW | AGGGAGAGGGAAGTGTGGGGAAGG | TIDE analysis | 817 585
12 | AAVS1_TIDE1_RV | AGAACTCAGGACCAACTTATTCTG | TIDE analysis | 818 586
13 | LINC01588_TIDE_RV | ATGACAGACACAACCAGAGGGCA | TIDE analysis | 819 587
14 | AAVS1_TIDE2_RV | TAGGAAGGAGGAGGCCTAAG | TIDE analysis | 820 588
15 | NTS55_TIDE_RV | CCAATATTGCATGGGATGG | TIDE analysis | 821 589
16 | VEGF_TIDE3_RV | ATCAAATTCCAGCACCGAGCGC | TIDE analysis | 822 590
17 | hCFTR_TIDE1_RV | ACCATTGAGGACGTTTGTCTCAC | TIDE analysis | 823 591
18 | AGA_TIDE1_RV | CATGTCCTCAAGTCAAGAACAAG | TIDE analysis | 824 592
19 | VEGF_TIDE4_RV | GCTAGGGGAGAGTCCCACTGTCCA | TIDE analysis | 825 593
20 | VEGF_TIDE5_RV | GTAGGGTGTGATGGGAGGCTAAGC | TIDE analysis | 826 594
21 | VEGF_TIDE6_RV | AGACCGAGTGGCAGTGACAGCAAG | TIDE analysis | 827 595
22 | VEGF_TIDE7_RV | GTCTTCCTGCTCTGTGCGCACGAC | TIDE analysis | 828 596
23 | RandomPAM_FW | TAGCGGCCGCTCATGCGCGGCGCATTACCTTTACNNNNNNNNNNGGATCCTCTAGAGTCG | Protospacer with randomized PAM | 829 597
24 | RandomPAM_RV | ACAGGAAACAGCTATGACCATGAAAGCTTGCATGCCTGCAGGTCGACTCTAGAGGATC | Protospacer with randomized PAM | 830 598
25 | DS2_ON_FW1 | ctacacgacgctcttccgatctCCTGGAGCGTGTACGTTGG | Targeted Deep Seq | 831 599
26 | SpyDS2_OT1_FW1 | ctacacgacgctcttccgatctCCTGTGGTCCCAGCTACTTG | Targeted Deep Seq | 832 600
27 | SpyDS2_OT2_FW1 | ctacacgacgctcttccgatctATCTGCGATGTCCTCGAGG | Targeted Deep Seq | 833 601
28 | SpyDS2_OT3_FW1 | ctacacgacgctcttccgatctTGGTGTGCGCCTCTAACG | Targeted Deep Seq | 834 602
29 | SpyDS2_OT4_FW1 | ctacacgacgctcttccgatctGGAGTCTTGCTTTGTCACTCAGA | Targeted Deep Seq | 835 603
30 | SpyDS2_OT5_FW1 | ctacacgacgctcttccgatctAGCCTAGACCCAGTCCCAT | Targeted Deep Seq | 836 604
31 | SpyDS2_OT6_FW1 | ctacacgacgctcttccgatctGCTGGGCATAGTAGTGGACT | Targeted Deep Seq | 837 605
32 | SpyDS2_OT7_FW1 | ctacacgacgctcttccgatctTGGGGAGGCTGAGACACGA | Targeted Deep Seq | 838 606
33 | SpyDS2_OT8_FW1 | ctacacgacgctcttccgatctCTTGGGAGGCTGAGGCAAG | Targeted Deep Seq | 839 608
34 | DS2_ON_RV1 | agacgtgtgctcttccgatctCAGGAGGATGAGAGCCAGG | Targeted Deep Seq | 840 609
35 | SpyDS2_OT1_RV1 | agacgtgtgctcttccgatctCAGGGTCTCACTCTATCACCCA | Targeted Deep Seq | 841 610
36 | SpyDS2_OT2_RV1 | agacgtgtgctcttccgatctACTGAATGGGTTGAACTTGGC | Targeted Deep Seq | 842 612
37 | SpyDS2_OT3_RV1 | agacgtgtgctcttccgatctGAGACAGAATCTTGCTCTGTCTCC | Targeted Deep Seq | 843 613
38 | SpyDS2_OT4_RV1 | agacgtgtgctcttccgatctTCCCAGCTACTTGGGAGGC | Targeted Deep Seq | 844 612
39 | SpyDS2_OT5_RV1 | agacgtgtgctcttccgatctCCTGCCCAAATAGGGAAGCAG | Targeted Deep Seq | 845 614
40 | SpyDS2_OT6_RV1 | agacgtgtgctcttccgatctTGGCGCCTTAGTCTCTGCTAC | Targeted Deep Seq | 846 615
41 | SpyDS2_OT7_RV1 | agacgtgtgctcttccgatctGCATGAGACACAGTTTCACTCTG | Targeted Deep Seq | 847 616
42 | SpyDS2_OT8_RV1 | agacgtgtgctcttccgatctGAGAGAGTCTCACTGCGTTGC | Targeted Deep Seq | 848 617
43 | DS4_ON_FW3 | ctacacgacgctcttccgatctTCTCTCACCCACTGGGCAC | Targeted Deep Seq | 849 618
44 | DS4_ON_RV3 | agacgtgtgctcttccgatctGCTTCCAGACGAGTGCAGA | Targeted Deep Seq | 850 619
45 | SpyDS4_OT1_FW1 | ctacacgacgctcttccgatctAAGTTTTCAAACCAGAAGAACTACGAC | Targeted Deep Seq | 851 620
46 | SpyDS4_OT2_FW1 | ctacacgacgctcttccgatctCCGGTATAAGTCCTGGAGCG | Targeted Deep Seq | 852 621
47 | SpyDS4_OT3_FW1 | ctacacgacgctcttccgatctGCCAGGGAGCAATGGCAG | Targeted Deep Seq | 853 622
48 | SpyDS4_OT6_FW1 | ctacacgacgctcttccgatctCCTCGAATTCCACGGGGTT | Targeted Deep Seq | 854 623
49 | DS16_ON_FW1 | ctacacgacgctcttccgatctGTTGGTGGGAGGGAAGTGAG | Targeted Deep Seq | 855 624
50 | SpyDS6_OT1_FW1 | ctacacgacgctcttccgatctGATGGCGGTTGTAGCGGC | Targeted Deep Seq | 856 625
51 | SpyDS6_OT2_FW1 | ctacacgacgctcttccgatctCACATAAACCTATGTTTCAGCAGA | Targeted Deep Seq | 857 626
52 | SpyDS6_OT3_FW1 | ctacacgacgctcttccgatctGCTAGTTGGATTGAAGCAGGGT | Targeted Deep Seq | 858 627
53 | SpyDS6_OT4_FW1 | ctacacgacgctcttccgatctTTGAGTGCGGCAGCTTCC | Targeted Deep Seq | 859 628
54 | SpyDS6_OT6_FW1 | ctacacgacgctcttccgatctATAACCCTCCCAGGCAAAGTC | Targeted Deep Seq | 860 629
55 | SpyDS6_OT7_FW1 | ctacacgacgctcttccgatctAGCCTGCACATCTGAGCTC | Targeted Deep Seq | 861 630
56 | SpyDS6_OT8_FW1 | ctacacgacgctcttccgatctGGAGCATTGAAGTGCCTGG | Targeted Deep Seq | 862 631
57 | DeDS6_ON_RV1 | agacgtgtgctcttccgatctCAGCCTGGGACCACTGA | Targeted Deep Seq | 863 632
58 | SpyDS6_OT1_RV1 | agacgtgtgctcttccgatctCATCCTCGACAGTCGCGG | Targeted Deep Seq | 864 633
59 | SpyDS6_OT2_RV1 | agacgtgtgctcttccgatctGACTGATCAAGTAGAATACTCATGGG | Targeted Deep Seq | 865 634
60 | SpyDS6_OT3_RV1 | agacgtgtgctcttccgatctCCCTGCCAGCACTGAAGC | Targeted Deep Seq | 866 635
61 | SpyDS6_OT4_RV1 | agacgtgtgctcttccgatctGGTTCCTATCTTTCTAGACCAGGAGT | Targeted Deep Seq | 867 636
62 | SpyDS6_OT6_RV1 | agacgtgtgctcttccgatctAGTGTGGAGGGCTCAGGG | Targeted Deep Seq | 868 637
63 | SpyDS6_OT7_RV1 | agacgtgtgctcttccgatctGATGGGCAGAGGAAGGCAA | Targeted Deep Seq | 869 638
64 | SpyDS6_OT8_RV1 | agacgtgtgctcttccgatctTCACTCTCATGAGCGTCCCA | Targeted Deep Seq | 870 639
65 | Nme2DS2_OT1_FW1 | ctacacgacgctcttccgatctAAGGTTCCTTGCGGTTCGC | Targeted Deep Seq | 871 640
66 | Nme2DS2_OT1_RV1 | agacgtgtgctcttccgatctCGCTGCCATTGCTCCCT | Targeted Deep Seq | 872 641
67 | Nme2DS6_OT1_FW1 | ctacacgacgctcttccgatctTCTCGCACATTCTTCACGTCC | Targeted Deep Seq | 873 642
68 | Nme2DS6_OT1_RV1 | agacgtgtgctcttccgatctAGGAACCTTCCCGACTTAGGG | Targeted Deep Seq | 874 643
69 | Rosa26_ON_FW1 | ctacacgacgctcttccgatctCCCGCCCATCTTCTAGAAAGAC | Targeted Deep Seq | 875 644
70 | Rosa26_OT1_FW1 | ctacacgacgctcttccgatctTGCCAGGTGAGGGACTGG | Targeted Deep Seq | 876 645
71 | Rosa26_ON_RV1 | agacgtgtgctcttccgatctTCTGGGAGTTCTCTGCTGCC | Targeted Deep Seq | 877 646
72 | Rosa26_OT1_RV1 | agacgtgtgctcttccgatctTGCCCAACCTTAGCAAGGAG | Targeted Deep Seq | 878 647
73 | PCSK9_ON_FW2 | ctacacgacgctcttccgatcttaccttggagcaacggcg | Targeted Deep Seq | 879 648
74 | PCSK9_ON_RV2 | agacgtgtgctcttccgatctcccaggacgaggatggag | Targeted Deep Seq | 880 649
75 | Tyr_500_FW3 | GATAGTCACTCCAGGGGTTG | TIDE analysis | 881 650
76 | Tyr_500_RV3 | GTGGTGAACCAATCAGTCCT | TIDE analysis | 882 651









EXPERIMENTAL
Example I
Discovery of Cas9 Orthologs with Differentially Diverged PIDs

Nme1Cas9 peptide sequence was used as a query in BLAST searches to find all Cas9 orthologs in Neisseria meningitidis species. Orthologs with >80% identity to Nme1Cas9 were selected for the remainder of this study. The PIDs were then aligned with that of Nme1Cas9 (residues 820-1082) using ClustalW2 and those with clusters of mutations in the PID were selected for further analysis. An unrooted phylogenetic tree of NmeCas9 orthologs was constructed using FigTree (http://tree.bio.ed.ac.uk/software/figtree/).
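The identity-based selection described above can be illustrated with a minimal sketch. It assumes the sequences have already been aligned to equal length with "-" marking gaps, and it uses a simple gap-excluding definition of percent identity; BLAST and ClustalW2 compute identity with their own conventions, so the function names and the definition here are illustrative assumptions only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def select_orthologs(reference: str, candidates: dict, threshold: float = 80.0) -> dict:
    """Keep candidates whose identity to the reference exceeds the threshold (>80% in the text)."""
    selected = {}
    for name, seq in candidates.items():
        identity = percent_identity(reference, seq)
        if identity > threshold:
            selected[name] = identity
    return selected
```

With toy aligned peptides, `select_orthologs("ACDEFGHIKL", {"x": "ACDEFGHIKV", "y": "AAAAAGHIKL"})` keeps only `x`, whose identity to the reference is 90%.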


Example II
Cloning, Expression and Purification of Cas9 and Acr Orthologs

Examples of plasmids and oligonucleotides used in this study are listed in Table 3. The PIDs of Nme2Cas9 and Nme3Cas9 were ordered as gBlocks (IDT) to replace the PID of Nme1Cas9 using Gibson Assembly (NEB) in the bacterial expression plasmid pMCSG7 (Zhang et al., 2015), which encodes Nme1Cas9 with a 6×His tag. The construct was transformed into E. coli, expressed and purified as previously described (Pawluk et al., 2016). Briefly, Rosetta (DE3) cells containing the respective Cas9 plasmids were grown at 37° C. to an OD600 of 0.6 and protein expression was induced by 1 mM IPTG for 16 hr at 18° C. Cells were harvested and lysed by sonication in lysis buffer [50 mM Tris-HCl (pH 7.5), 500 mM NaCl, 5 mM imidazole, 1 mM DTT] supplemented with 1 mg/mL Lysozyme and protease inhibitor cocktail (Sigma). The lysate was then run through a Ni2+-NTA agarose column (Qiagen), and the bound protein was eluted with 300 mM imidazole and dialyzed into storage buffer [20 mM HEPES-NaOH (pH 7.5), 250 mM NaCl, 1 mM DTT]. For Acr proteins, 6×His-tagged proteins were expressed in E. coli strain BL21 Rosetta (DE3). Cells were grown at 37° C. to an optical density (OD600) of 0.6 in a shaking incubator. The bacterial cultures were cooled to 18° C., and protein expression was induced by adding 1 mM IPTG for overnight expression. The next day, cells were harvested and resuspended in lysis buffer supplemented with 1 mg/mL Lysozyme and protease inhibitor cocktail (Sigma), and protein was purified using the same protocol as for Cas9. The 6×His tag was removed by incubation of the resin-bound protein with Tobacco Etch Virus (TEV) protease overnight at 4° C. to isolate untagged Acrs.


Example III
In Vitro PAM Discovery Assay

A dsDNA target library with randomized PAM sequences was generated by overlapping PCR, with the forward primer containing the 10-nt randomized PAM region. The library was gel-purified and subjected to in vitro cleavage reaction by purified Cas9 along with T7-transcribed sgRNAs. 300 nM Cas9:sgRNA complex was used to cleave 300 nM of the target fragment in 1×NEBuffer 3.1 (NEB) at 37° C. for 1 hr. The reaction was then treated with proteinase K at 50° C. for 10 minutes and run on a 4% agarose/1×TAE gel. The cleavage product was excised, eluted, and cloned using a previously described protocol (Zhang et al., 2012), with modifications. Briefly, DNA ends were repaired, non-templated 2′-deoxyadenosine tails were added, and Y-shaped adapters were ligated. After PCR, the product was quantitated with KAPA Library Quantification Kit and sequenced using a NextSeq 500 (Illumina) to obtain 75 nt paired-end reads. Sequences were analyzed with custom scripts and R.
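As an illustration of how randomized PAM sequences might be tabulated from such reads, the following is a minimal sketch and not the custom scripts referred to above. It assumes each read is already oriented like the RandomPAM_FW oligonucleotide of Table 3, so that the 10-nt randomized region lies between the constant flanks CCTTTAC and GGATCC; the function names are hypothetical.

```python
import re
from collections import Counter

# Constant flanks taken from the RandomPAM_FW oligonucleotide in Table 3;
# treating them as the anchors for PAM extraction is an assumption of this sketch.
PAM_RE = re.compile(r"CCTTTAC([ACGT]{10})GGATCC")


def extract_pam(read: str):
    """Return the 10-nt randomized region of a read, or None if the flanks are absent."""
    match = PAM_RE.search(read.upper())
    return match.group(1) if match else None


def position_frequencies(pams: list) -> list:
    """Per-position nucleotide frequencies across equal-length PAM sequences."""
    if not pams:
        return []
    freqs = []
    for i in range(len(pams[0])):
        counts = Counter(pam[i] for pam in pams)
        total = sum(counts.values())
        freqs.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return freqs
```

Applied to the PAMs recovered from cleaved fragments, an enrichment of C at the fifth and sixth positions of the frequency table would be the pattern expected for an N4CC PAM.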


Example IV
Transfections and Mammalian Genome Editing

Human codon-optimized Nme2Cas9 was cloned by Gibson Assembly into the pCDest2 plasmid backbone previously used for Nme1Cas9 and SpyCas9 expression (Pawluk et al., 2016; Amrani et al., 2018). Transfection of HEK293T and HEK293T-TLR2.0 cells was performed as previously described (Amrani et al., 2018). For Hepa1-6 transfections, Lipofectamine LTX was used to transfect 500 ng of all-in-one AAV.sgRNA.Nme2Cas9 plasmid in 24-well plates (~10⁵ cells/well), using cells that had been cultured 24 hours before transfection. For K562 cells stably expressing Nme2Cas9 delivered via lentivector (see below), 50,000-150,000 cells were electroporated with 500 ng sgRNA plasmid using 10 μL Neon tips. To measure indels in all cells 72 hr after transfections, cells were harvested and genomic DNA was extracted using the DNeasy Blood and Tissue kit (Qiagen). The targeted locus was amplified by PCR, Sanger-sequenced (Genewiz), and analyzed by TIDE (Brinkman et al., 2014) using the Desktop Genetics web-based interface (http://tide.deskgen.com).


Example V
Lentiviral Transduction of K562 Cells to Stably Express Nme2Cas9

K562 cells stably expressing Nme2Cas9 were generated as previously described for Nme1Cas9 (Amrani et al., 2018). For lentivirus production, the lentiviral vector was co-transfected into HEK293T cells along with the packaging plasmids (Addgene 12260 & 12259) in 6-well plates using TransIT-LT1 transfection reagent (Mirus Bio). After 24 hours, culture media was aspirated from the transfected cells and replaced with 1 mL of fresh DMEM. The next day, the supernatant containing the virus was collected and filtered through a 0.45 μm filter. 10 μL of the undiluted supernatant along with 2.5 μg of Polybrene was used to transduce ~10⁶ K562 cells in 6-well plates. The transduced cells were selected using media supplemented with 2.5 μg/mL puromycin.


Example VI
RNP Delivery for Mammalian Genome Editing

For RNP experiments, the Neon electroporation system was used exactly as described (Amrani et al., 2018). Briefly, 40 picomoles of 3×NLS-Nme2Cas9 along with 50 picomoles of T7-transcribed sgRNA was assembled in buffer R and electroporated using 10 μL Neon tips. After electroporation, cells were plated in pre-warmed 24-well plates containing the appropriate culture media without antibiotics. Electroporation parameters (voltage, width, number of pulses) were 1150 V, 20 ms, 2 pulses for HEK293T cells; 1000 V, 50 ms, 1 pulse for K562 cells.


Example VII
GUIDE-seq

GUIDE-seq experiments were performed as described previously (Tsai et al., 2014), with minor modifications (Bolukbasi et al., 2015a). Briefly, HEK293T cells were transfected with 200 ng of Cas9 plasmid, 200 ng of sgRNA plasmid, and 7.5 pmol of annealed GUIDE-seq oligonucleotides using Polyfect (Qiagen). Alternatively, Hepa1-6 cells were transfected as described above. Genomic DNA was extracted with a DNeasy Blood and Tissue kit (Qiagen) 72 h after transfection according to the manufacturer's protocol. Library preparation and sequencing were performed exactly as described previously (Bolukbasi et al., 2015a). For analysis, all sequences with up to ten mismatches with the target site, as well as a C in the fifth PAM position (N4CN), were considered potential off-target sites. Data were analyzed using the Bioconductor package GUIDEseq version 1.1.17 (Zhu et al., 2017).
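The candidate criterion stated above (up to ten mismatches with the target site and a C in the fifth PAM position) can be expressed as a short filter. This is a minimal sketch with hypothetical function names; it is not part of the GUIDEseq package and assumes the candidate site and its PAM have already been extracted as separate strings of the correct length.

```python
def count_mismatches(target: str, site: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(target) != len(site):
        raise ValueError("target and site must have equal length")
    return sum(1 for a, b in zip(target.upper(), site.upper()) if a != b)


def is_candidate_off_target(target: str, site: str, pam: str, max_mismatches: int = 10) -> bool:
    """True if the PAM has C at its fifth position (N4CN) and mismatches do not exceed the limit."""
    has_fifth_c = len(pam) >= 5 and pam[4].upper() == "C"
    return has_fifth_c and count_mismatches(target, site) <= max_mismatches
```

For example, a site differing from the target at two positions and followed by the PAM AGGACA would be retained, whereas the same site followed by AGGAGA would be discarded.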


Example VIII
Targeted Deep Sequencing and Analysis

We used targeted deep sequencing to confirm the results of GUIDE-seq and to measure indel rates with maximal accuracy. We used two-step PCR amplification to produce DNA fragments for each on- and off-target site. For SpyCas9 editing at DS2 and DS6, we selected the top off-target sites based on GUIDE-seq read counts. For SpyCas9 editing at DS4, fewer candidate off-target sites were identified by GUIDE-seq, and only those with NGG (DS4|OT1, DS4|OT3, DS4|OT6) or NGC (DS4|OT2) PAMs were examined by sequencing. In the first step, we used locus-specific primers bearing universal overhangs with ends complementary to the adapters, together with 2×PCR master mix (NEB), to generate fragments bearing the overhangs. In the second step, the purified PCR products were amplified with a universal forward primer and indexed reverse primers. Full-size products (˜250 bp) were gel-purified and sequenced on an Illumina MiSeq in paired-end mode. MiSeq data analysis was performed as previously described (Pinello et al., 2016; Ibraheim et al., 2018).
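The first-step primers listed in Table 3 consist of a universal overhang followed by a locus-specific sequence. The sketch below shows one way to separate the two parts; the overhang sequences are copied from the Targeted Deep Seq entries of Table 3, while the function name and the forward/reverse labels are illustrative assumptions.

```python
# Universal overhangs as they appear at the 5' end of the Targeted Deep Seq primers in Table 3.
FW_OVERHANG = "ctacacgacgctcttccgatct"
RV_OVERHANG = "agacgtgtgctcttccgatct"


def split_primer(primer: str):
    """Return (orientation, locus-specific sequence) for a first-step PCR primer."""
    lowered = primer.lower()
    for orientation, overhang in (("forward", FW_OVERHANG), ("reverse", RV_OVERHANG)):
        if lowered.startswith(overhang):
            # Everything after the overhang is treated as the locus-specific part.
            return orientation, primer[len(overhang):].upper()
    raise ValueError("primer does not start with a known universal overhang")
```

For instance, `split_primer("ctacacgacgctcttccgatctCCTGGAGCGTGTACGTTGG")` returns `("forward", "CCTGGAGCGTGTACGTTGG")`, the locus-specific portion of DS2_ON_FW1.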


Example IX
Off-Target Analysis Using CRISPRseek

Global off-target predictions for TS25 and TS47 were performed using the Bioconductor package CRISPRseek. Minor changes were made to accommodate characteristics of Nme2Cas9 not shared with SpyCas9. Specifically, we used the following settings: gRNA.size=24, PAM=“NNNNCC”, PAM.size=6, RNA.PAM.pattern=“NNNNCN”, and candidate off-target sites with fewer than 6 mismatches were collected. The top potential off-target sites based on the numbers and positions of mismatches were selected. Genomic DNA from cells targeted by each respective sgRNA was used to amplify each candidate off-target locus and then analyzed by TIDE.
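To make the site geometry implied by these settings concrete, the following minimal sketch scans both strands of a sequence for a 24-nt protospacer followed by a 6-nt PAM, assuming the PAM takes the N4CC form consistent with PAM.size=6. It does not call or reproduce CRISPRseek; the function names are hypothetical, and the flanking sequence in the usage note is made up for illustration.

```python
import re

# 24-nt protospacer followed by a 6-nt PAM of the form N4CC (an assumption of this sketch).
# The lookahead lets overlapping sites be reported.
SITE_RE = re.compile(r"(?=([ACGT]{24})([ACGT]{4}CC))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str) -> list:
    """Return (strand, start, protospacer, PAM) tuples; start is relative to the scanned strand."""
    seq = seq.upper()
    sites = []
    for strand, strand_seq in (("+", seq), ("-", revcomp(seq))):
        for match in SITE_RE.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites
```

Scanning the TS25 protospacer of Table 3 joined to a made-up flank, `find_sites("GCTCACTCACCCACACAGACACAC" + "ACGTCC" + "TT")`, reports a single site on the plus strand at position 0 with PAM ACGTCC.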


Example X
Mouse Strains and Embryo Collection

All animal experiments were conducted under the guidance of the Institutional Animal Care and Use Committee (IACUC) of the University of Massachusetts Medical School. C57BL/6NJ mice (Stock No. 005304) were obtained from The Jackson Laboratory. All animals were maintained in a 12 h light cycle. The middle of the light cycle of the day when a mating plug was observed was considered embryonic day 0.5 (E0.5) of gestation. Zygotes were collected at E0.5 by tearing the ampulla with forceps and incubation in M2 medium containing hyaluronidase to remove cumulus cells.


Example XI
In Vivo AAV8.Nme2Cas9+sgRNA Delivery and Liver Tissue Processing

For the AAV8 vector injections, 8-week-old female C57BL/6NJ mice were injected with 4×10¹¹ genome copies per mouse via tail vein, with the sgRNA targeting a validated site in either Pcsk9 or Rosa26. Mice were sacrificed 28 days after vector administration and liver tissues were collected for analysis. Liver tissues were fixed in 4% formalin overnight, embedded in paraffin, sectioned and stained with hematoxylin and eosin (H&E). Blood was drawn from the facial vein at 0, 14 and 28 days post injection, and serum was isolated using a serum separator (BD, Cat. No. 365967) and stored at −80° C. until assay. Serum cholesterol level was measured using the Infinity™ colorimetric endpoint assay (Thermo-Scientific) following the manufacturer's protocol and as previously described (Ibraheim et al., 2018). For the anti-PCSK9 Western blot, 40 μg of protein from tissue or 2 ng of Recombinant Mouse PCSK9 Protein (R&D Systems, 9258-SE-020) were loaded onto a MiniPROTEAN® TGX™ Precast Gel (Bio-Rad). The separated bands were transferred onto a PVDF membrane and blocked with 5% Blocking-Grade Blocker solution (Bio-Rad) for 2 hours at room temperature. Next, the membrane was incubated with rabbit anti-GAPDH (Abcam ab9485, 1:2,000) or goat anti-PCSK9 (R&D Systems AF3985, 1:400) antibodies overnight. Membranes were washed in TBST and incubated with horseradish peroxidase (HRP)-conjugated goat anti-rabbit (Bio-Rad 1706515, 1:4,000) and donkey anti-goat (R&D Systems HAF109, 1:2,000) secondary antibodies for 2 hours at room temperature. The membranes were washed again in TBST and visualized with Clarity™ western ECL substrate (Bio-Rad) using an M35A XOMAT Processor (Kodak).


Example XII
Ex Vivo AAV6.Nme2Cas9 Delivery in Mouse Zygotes

Zygotes were incubated in 15 μl drops of KSOM (Potassium-Supplemented Simplex Optimized Medium, Millipore, Cat. No. MR-106-D) containing 3×10⁹ or 3×10⁸ GCs of AAV6.Nme2Cas9.sgTyr vector for 5-6 h (4 zygotes in each drop). After incubation, zygotes were rinsed in M2 and transferred to fresh KSOM for overnight culture. The next day, the embryos that advanced to 2-cell stage were transferred into the oviduct of pseudopregnant recipients and allowed to develop to term.


REFERENCES, each of which is herein incorporated by reference in its entirety:

  • Amrani, N., Gao, X. D., Liu, P., Edraki, A., Mir, A., Ibraheim, R., Gupta, A., Sasaki, K. E., Wu, T., Donohoue, P. D., et al. (2018). NmeCas9 is an intrinsically high-fidelity genome editing platform. BioRxiv, https://doi.org/10.1101/172650.
  • Barrangou, R., Fremaux, C., Deveau, H., Richards, M., Boyaval, P., Moineau, S., Romero, D. A., and Horvath, P. (2007). CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709-1712.
  • Bisaria, N., Jarmoskaite, I., and Herschlag, D. (2017). Lessons from Enzyme Kinetics Reveal Specificity Principles for RNA-Guided Nucleases in RNA Interference and CRISPR-Based Genome Editing. Cell Syst. 4, 21-29.
  • Bolukbasi, M. F., Gupta, A., Oikemus, S., Derr, A. G., Garber, M., Brodsky, M. H., Zhu, L. J., and Wolfe, S. A. (2015a). DNA-binding-domain fusions enhance the targeting range and precision of Cas9. Nat. Methods 12, 1150-1156.
  • Bolukbasi, M. F., Gupta, A., and Wolfe, S. A. (2015b). Creating and evaluating accurate CRISPR-Cas9 scalpels for genomic surgery. Nat. Methods 13, 41-50.
  • Brinkman, E. K., Chen, T., Amendola, M., and van Steensel, B. (2014). Easy quantitative assessment of genome editing by sequence trace decomposition. Nucleic Acids Res. 42, e168.
  • Brouns, S. J., Jore, M. M., Lundgren, M., Westra, E. R., Slijkhuis, R. J., Snijders, A. P., Dickman, M. J., Makarova, K. S., Koonin, E. V., and van der Oost, J. (2008). Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321, 960-964.
  • Casini, A., Olivieri, M., Petris, G., Montagna, C., Reginato, G., Maule, G., Lorenzin, F., Prandi, D., Romanel, A., Demichelis, F., et al. (2018). A highly specific SpCas9 variant is identified by in vivo screening in yeast. Nat. Biotechnol. 36, 265-271.
  • Certo, M. T., Ryu, B. Y., Annis, J. E., Garibov, M., Jarjour, J., Rawlings, D. J., and Scharenberg, A. M. (2011). Tracking genome engineering outcome at individual DNA breakpoints. Nat. Methods 8, 671-676.
  • Chen, J. S., Dagdas, Y. S., Kleinstiver, B. P., Welch, M. M., Sousa, A. A., Harrington, L. B., Sternberg, S. H., Joung, J. K., Yildiz, A., and Doudna, J. A. (2017). Enhanced proofreading governs CRISPR-Cas9 targeting accuracy. Nature 550, 407-410.
  • Cho, S. W., Kim, S., Kim, J. M., and Kim, J. S. (2013). Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat. Biotechnol. 31, 230-232.
  • Cho, S. W., Kim, S., Kim, Y., Kweon, J., Kim, H. S., Bae, S., and Kim, J. S. (2014). Analysis of off-target effects of CRISPR/Cas-derived RNA-guided endonucleases and nickases. Genome Res. 24, 132-141.
  • Cong, L., Ran, F. A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P. D., Wu, X., Jiang, W., Marraffini, L. A., et al. (2013). Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819-823.
  • Deltcheva, E., Chylinski, K., Sharma, C. M., Gonzales, K., Chao, Y., Pirzada, Z. A., Eckert, M. R., Vogel, J., and Charpentier, E. (2011). CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471, 602-607.
  • Deveau, H., Barrangou, R., Garneau, J. E., Labonte, J., Fremaux, C., Boyaval, P., Romero, D. A., Horvath, P., and Moineau, S. (2008). Phage response to CRISPR-encoded resistance in Streptococcus thermophilus. J. Bacteriol. 190, 1390-1400.
  • Dominguez, A. A., Lim, W. A., and Qi, L. S. (2016). Beyond editing: repurposing CRISPR-Cas9 for precision genome regulation and interrogation. Nat. Rev. Mol. Cell Biol. 17, 5-15.
  • Dong, D., Guo, M., Wang, S., Zhu, Y., Wang, S., Xiong, Z., Yang, J., Xu, Z., and Huang, Z. (2017). Structural basis of CRISPR-SpyCas9 inhibition by an anti-CRISPR protein. Nature 546, 436-439.
  • Esvelt, K. M., Mali, P., Braff, J. L., Moosburner, M., Yaung, S. J., and Church, G. M. (2013). Orthogonal Cas9 proteins for RNA-guided gene regulation and editing. Nat. Methods 10, 1116-1121.
  • Fonfara, I., Le Rhun, A., Chylinski, K., Makarova, K. S., Lecrivain, A. L., Bzdrenga, J., Koonin, E. V., and Charpentier, E. (2014). Phylogeny of Cas9 determines functional exchangeability of dual-RNA and Cas9 among orthologous type II CRISPR-Cas systems. Nucleic Acids Res. 42, 2577-2590.
  • Friedland, A. E., Baral, R., Singhal, P., Loveluck, K., Shen, S., Sanchez, M., Marco, E., Gotta, G. M., Maeder, M. L., Kennedy, E. M., et al. (2015). Characterization of Staphylococcus aureus Cas9: a smaller Cas9 for all-in-one adeno-associated virus delivery and paired nickase applications. Genome Biol. 16, 257.
  • Friedrich, G., and Soriano, P. (1991). Promoter traps in embryonic stem cells: a genetic screen to identify and mutate developmental genes in mice. Genes Dev. 5, 1513-1523.
  • Fu, Y., Sander, J. D., Reyon, D., Cascio, V. M., and Joung, J. K. (2014). Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nat. Biotechnol. 32, 279-284.
  • Gallagher, D. N., and Haber, J. E. (2018). Repair of a Site-Specific DNA Cleavage: Old-School Lessons for Cas9-Mediated Gene Editing. ACS Chem. Biol. 13, 397-405.
  • Garneau, J. E., Dupuis, M. E., Villion, M., Romero, D. A., Barrangou, R., Boyaval, P., Fremaux, C., Horvath, P., Magadan, A. H., and Moineau, S. (2010). The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 468, 67-71.
  • Gasiunas, G., Barrangou, R., Horvath, P., and Siksnys, V. (2012). Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc. Natl. Acad. Sci. USA 109, E2579-2586.
  • Gaudelli, N. M., Komor, A. C., Rees, H. A., Packer, M. S., Badran, A. H., Bryson, D. I., and Liu, D. R. (2017). Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature 551, 464-471.
  • Ghanta, K., Dokshin, G., Mir, A., Krishnamurthy, P., Gneid, H., Edraki, A., Watts, J., Sontheimer, E., and Mello, C. (2018). 5′ Modifications Improve Potency and Efficacy of DNA Donors for Precision Genome Editing. BioRxiv 354480.
  • Gorski, S. A., Vogel, J., and Doudna, J. A. (2017). RNA-based recognition and targeting: sowing the seeds of specificity. Nat. Rev. Mol. Cell Biol. 18, 215-228.
  • Harrington, L. B., Doxzen, K. W., Ma, E., Liu, J. J., Knott, G. J., Edraki, A., Garcia, B., Amrani, N., Chen, J. S., Cofsky, J. C., et al. (2017a). A Broad-Spectrum Inhibitor of CRISPR-Cas9. Cell 170, 1224-1233.
  • Harrington, L. B., Paez-Espino, D., Staahl, B. T., Chen, J. S., Ma, E., Kyrpides, N. C., and Doudna, J. A. (2017b). A thermostable Cas9 with increased lifetime in human plasma. Nat. Commun. 8, 1424.
  • Hou, Z., Zhang, Y., Propson, N. E., Howden, S. E., Chu, L. F., Sontheimer, E. J., and Thomson, J. A. (2013). Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proc. Natl. Acad. Sci. USA 110, 15644-15649.
  • Hu, J. H., Miller, S. M., Geurts, M. H., Tang, W., Chen, L., Sun, N., Zeina, C. M., Gao, X., Rees, H. A., Lin, Z., et al. (2018). Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 556, 57-63.
  • Hwang, W. Y., Fu, Y., Reyon, D., Maeder, M. L., Tsai, S. Q., Sander, J. D., Peterson, R. T., Yeh, J. R., and Joung, J. K. (2013). Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat. Biotechnol. 31, 227-229.
  • Hynes, A. P., Rousseau, G. M., Lemay, M.-L., Horvath, P., Romero, D. A., Fremaux, C., and Moineau, S. (2017). An anti-CRISPR from a virulent streptococcal phage inhibits Streptococcus pyogenes Cas9. Nat. Microbiol. 2, 1374-1380.
  • Ibraheim, R., Song, C.-Q., Mir, A., Amrani, N., Xue, W., and Sontheimer, E. J. (2018). All-in-One Adeno-associated Virus Delivery and Genome Editing by Neisseria meningitidis Cas9 in vivo. BioRxiv, https://doi.org/10.1101/295055.
  • Jiang, F., and Doudna, J. A. (2017). CRISPR-Cas9 Structures and Mechanisms. Annu. Rev. Biophys. 46, 505-529.
  • Jiang, W., Bikard, D., Cox, D., Zhang, F., and Marraffini, L. A. (2013). RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat. Biotechnol. 31, 233-239.
  • Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J. A., and Charpentier, E. (2012). A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821.
  • Jinek, M., East, A., Cheng, A., Lin, S., Ma, E., and Doudna, J. (2013). RNA-programmed genome editing in human cells. eLife 2, e00471.
  • Karvelis, T., Gasiunas, G., Young, J., Bigelyte, G., Silanskas, A., Cigan, M., and Siksnys, V. (2015). Rapid characterization of CRISPR-Cas9 protospacer adjacent motif sequence elements. Genome Biol. 16, 253.
  • Keeler, A. M., ElMallah, M. K., and Flotte, T. R. (2017). Gene Therapy 2017: Progress and Future Directions. Clin. Transl. Sci. 10, 242-248.
  • Kim, E., Koo, T., Park, S. W., Kim, D., Kim, K.-E., Kim, K., Cho, H.-Y., Song, D. W., Lee, K. J., Jung, M. H., et al. (2017). In vivo genome editing with a small Cas9 ortholog derived from Campylobacter jejuni. Nat. Commun. 8, 14500.
  • Kim, S., Kim, D., Cho, S. W., Kim, J., and Kim, J. S. (2014). Highly efficient RNA-guided genome editing in human cells via delivery of purified Cas9 ribonucleoproteins. Genome Res. 24, 1012-1019.
  • Kim, B., Komor, A., Levy, J., Packer, M., Zhao, K., and Liu, D. (2017). Increasing the genome-targeting scope and precision of base editing with engineered Cas9-cytidine deaminase fusions. Nat. Biotechnol. 35.
  • Kleinstiver, B. P., Prew, M. S., Tsai, S. Q., Nguyen, N. T., Topkar, V. V., Zheng, Z., and Joung, J. K. (2015). Broadening the targeting range of Staphylococcus aureus CRISPR-Cas9 by modifying PAM recognition. Nat. Biotechnol. 33, 1293-1298.
  • Kluesner, M., Nedveck, D., Lahr, W., Garbe, J., Abrahante, J., Webber, B., and Moriarity, B. (2018). EditR: A Method to Quantify Base Editing from Sanger Sequencing. The CRISPR Journal 1, 239-250.
  • Koblan, L., Doman, J., Wilson, C., Levy, J., Tay, T., Newby, G., Maianti, J., Raguram, A., and Liu, D. (2018). Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. Nat. Biotechnol. 36, 843.
  • Komor, A. C., Badran, A. H., and Liu, D. R. (2017). CRISPR-Based Technologies for the Manipulation of Eukaryotic Genomes. Cell 168, 20-36.
  • Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A., and Liu, D. R. (2016). Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420-424.
  • Lee, C. M., Cradick, T. J., and Bao, G. (2016). The Neisseria meningitidis CRISPR-Cas9 system enables specific genome editing in mammalian cells. Mol. Ther. 24, 645-654.
  • Lee, J., Mir, A., Edraki, A., Garcia, B., Amrani, N., Lou, H. E., Gainetdinov, I., Pawluk, A., Ibraheim, R., Gao, X. D., et al. (2018). Potent Cas9 inhibition in bacterial and human cells by new anti-CRISPR protein families. BioRxiv, https://www.biorxiv.org/content/early/2018/06/20/350504.
  • Ma, E., Harrington, L. B., O'Connell, M. R., Zhou, K., and Doudna, J. A. (2015). Single-Stranded DNA Cleavage by Divergent CRISPR-Cas9 Enzymes. Mol. Cell 60, 398-407.
  • Mali, P., Aach, J., Stranges, P. B., Esvelt, K. M., Moosburner, M., Kosuri, S., Yang, L., and Church, G. M. (2013a). CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nat. Biotechnol. 31, 833-838.
  • Mali, P., Yang, L., Esvelt, K. M., Aach, J., Guell, M., DiCarlo, J. E., Norville, J. E., and Church, G. M. (2013b). RNA-guided human genome engineering via Cas9. Science 339, 823-826.
  • Marraffini, L. A., and Sontheimer, E. J. (2008). CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA. Science 322, 1843-1845.
  • Mir, A., Edraki, A., Lee, J., and Sontheimer, E. J. (2018). Type II-C CRISPR-Cas9 biology, mechanism and application. ACS Chem. Biol. 13, 357-365.
  • Mojica, F. J., Diez-Villasenor, C., Garcia-Martinez, J., and Almendros, C. (2009). Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology 155, 733-740.
  • Paez-Espino, D., Sharon, I., Morovic, W., Stahl, B., Thomas, B. C., Barrangou, R., and Banfield, J. F. (2015). CRISPR immunity drives rapid phage genome evolution in Streptococcus thermophilus. mBio 6.
  • Pawluk, A., Amrani, N., Zhang, Y., Garcia, B., Hidalgo-Reyes, Y., Lee, J., Edraki, A., Shah, M., Sontheimer, E. J., Maxwell, K. L., et al. (2016). Naturally occurring off-switches for CRISPR-Cas9. Cell 167, 1829-1838 e1829.
  • Pawluk, A., Bondy-Denomy, J., Cheung, V. H., Maxwell, K. L., and Davidson, A. R. (2014). A new group of phage anti-CRISPR genes inhibits the type I-E CRISPR-Cas system of Pseudomonas aeruginosa. mBio 5, e00896.
  • Pinello, L., Canver, M. C., Hoban, M. D., Orkin, S. H., Kohn, D. B., Bauer, D. E., and Yuan, G. C. (2016). Analyzing CRISPR genome-editing experiments with CRISPResso. Nat. Biotechnol. 34, 695-697.
  • Racanelli, V., and Rehermann, B. (2006). The liver as an immunological organ. Hepatology 43, S54-62.
  • Ran, F. A., Cong, L., Yan, W. X., Scott, D. A., Gootenberg, J. S., Kriz, A. J., Zetsche, B., Shalem, O., Wu, X., Makarova, K. S., et al. (2015). In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, 186-191.
  • Ran, F. A., Hsu, P. D., Lin, C. Y., Gootenberg, J. S., Konermann, S., Trevino, A. E., Scott, D. A., Inoue, A., Matoba, S., Zhang, Y., et al. (2013). Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154, 1380-1389.
  • Rashid, S., Curtis, D. E., Garuti, R., Anderson, N. N., Bashmakov, Y., Ho, Y. K., Hammer, R. E., Moon, Y. A., and Horton, J. D. (2005). Decreased plasma cholesterol and hypersensitivity to statins in mice lacking Pcsk9. Proc. Natl. Acad. Sci. USA 102, 5374-5379.
  • Rauch, B. J., Silvis, M. R., Hultquist, J. F., Waters, C. S., McGregor, M. J., Krogan, N. J., and Bondy-Denomy, J. (2017). Inhibition of CRISPR-Cas9 with Bacteriophage Proteins. Cell 168, 150-158 e110.
  • Sapranauskas, R., Gasiunas, G., Fremaux, C., Barrangou, R., Horvath, P., and Siksnys, V. (2011). The Streptococcus thermophilus CRISPR/Cas system provides immunity in Escherichia coli. Nucleic Acids Res. 39, 9275-9282.
  • Schumann, K., Lin, S., Boyer, E., Simeonov, D. R., Subramaniam, M., Gate, R. E., Haliburton, G. E., Ye, C. J., Bluestone, J. A., Doudna, J. A., et al. (2015). Generation of knock-in primary human T cells using Cas9 ribonucleoproteins. Proc. Natl. Acad. Sci. USA 112, 10437-10442.
  • Shin, J., Jiang, F., Liu, J. J., Bray, N. L., Rauch, B. J., Baik, S. H., Nogales, E., Bondy-Denomy, J., Corn, J. E., and Doudna, J. A. (2017). Disabling Cas9 by an anti-CRISPR DNA mimic. Sci. Adv. 3, e1701620.
  • Tsai, S. Q., and Joung, J. K. (2016). Defining and improving the genome-wide specificities of CRISPR-Cas9 nucleases. Nat. Rev. Genet. 17, 300-312.
  • Tsai, S. Q., Zheng, Z., Nguyen, N. T., Liebers, M., Topkar, V. V., Thapar, V., Wyvekens, N., Khayter, C., Iafrate, A. J., Le, L. P., et al. (2014). GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nat. Biotechnol. 33, 187-197.
  • Tycko, J., Myer, V. E., and Hsu, P. D. (2016). Methods for optimizing CRISPR-Cas9 genome editing specificity. Mol. Cell 63, 355-370.
  • Yang, H., and Patel, D. J. (2017). Inhibition Mechanism of an Anti-CRISPR Suppressor AcrIIA4 Targeting SpyCas9. Mol. Cell 67, 117-127 e115.
  • Yin, H., Song, C. Q., Suresh, S., Kwan, S. Y., Wu, Q., Walsh, S., Ding, J., Bogorad, R. L., Zhu, L. J., Wolfe, S. A., et al. (2018). Partial DNA-guided Cas9 enables genome editing with reduced off-target activity. Nat. Chem. Biol. 14, 311-316.
  • Yokoyama, T., Silversides, D. W., Waymire, K. G., Kwon, B. S., Takeuchi, T., and Overbeek, P. A. (1990). Conserved cysteine to serine mutation in tyrosinase is responsible for the classical albino mutation in laboratory mice. Nucleic Acids Res. 18, 7293-7298.
  • Yoon, Y., Wang, D., Tai, P. W. L., Riley, J., Gao, G., and Rivera-Perez, J. A. (2018). Streamlined ex vivo and in vivo genome editing in mouse embryos using recombinant adeno-associated viruses. Nat. Commun. 9, 412.
  • Zhang, Y., Heidrich, N., Ampattu, B. J., Gunderson, C. W., Seifert, H. S., Schoen, C., Vogel, J., and Sontheimer, E. J. (2013). Processing-independent CRISPR RNAs limit natural transformation in Neisseria meningitidis. Mol. Cell 50, 488-503.
  • Zhang, Y., Rajan, R., Seifert, H. S., Mondragon, A., and Sontheimer, E. J. (2015). DNase H activity of Neisseria meningitidis Cas9. Mol. Cell 60, 242-255.
  • Zhang, Z., Theurkauf, W. E., Weng, Z., and Zamore, P. D. (2012). Strand-specific libraries for high throughput RNA sequencing (RNA-Seq) prepared without poly(A) selection. Silence 3, 9.
  • Zhu, L. J., Holmes, B. R., Aronin, N., and Brodsky, M. H. (2014). CRISPRseek: a bioconductor package to identify target-specific guide RNAs for CRISPR-Cas9 genome-editing systems. PLoS One 9, e108424.
  • Zhu, L. J., Lawrence, M., Gupta, A., Pages, H., Kucukural, A., Garber, M., and Wolfe, S. A. (2017). GUIDEseq: a bioconductor package to analyze GUIDE-Seq datasets for CRISPR-Cas nucleases. BMC Genomics 18, 379.
  • Zuris, J. A., Thompson, D. B., Shu, Y., Guilinger, J. P., Bessen, J. L., Hu, J. H., Maeder, M. L., Joung, J. K., Chen, Z.-Y., and Liu, D. R. (2015). Cationic lipid-mediated delivery of proteins enables efficient protein-based genome editing in vitro and in vivo. Nat. Biotechnol. 33, 73-80.


All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described methods and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in biological control, biochemistry, molecular biology, entomology, plankton, fishery systems, and fresh water ecology, or related fields are intended to be within the scope of the following claims.

Claims
  • 1. A mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for an N4CC nucleotide sequence.
  • 2. The protein of claim 1, wherein said protein is Nme2Cas9.
  • 3. The protein of claim 1, further comprising a nuclear localization signal protein.
  • 4. The protein of claim 1, wherein said nucleotide deaminase is a cytidine deaminase.
  • 5. The protein of claim 1, wherein said nucleotide deaminase is an adenosine deaminase.
  • 6. The protein of claim 1, further comprising a uracil glycosylase inhibitor.
  • 7. The protein of claim 1, wherein said nuclear localization signal protein is selected from a nucleoplasmin and an SV40.
  • 8. The protein of claim 1, wherein said binding region is a protospacer accessory motif interacting domain.
  • 9. The protein of claim 8, wherein said protospacer accessory motif interacting domain comprises said mutation.
  • 10. The protein of claim 9, wherein said mutation is a D16A mutation.
  • 11. An adeno-associated virus comprising a mutated NmeCas9 protein, said mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for an N4CC nucleotide sequence.
  • 12. The virus of claim 11, wherein said virus is an adeno-associated virus 8.
  • 13. The virus of claim 11, wherein said virus is an adeno-associated virus 6.
  • 14. The virus of claim 11, wherein said protein is Nme2Cas9.
  • 15. The virus of claim 11, wherein said protein further comprises a nuclear localization signal protein.
  • 16. The virus of claim 11, wherein said nucleotide deaminase is a cytidine deaminase.
  • 17. The virus of claim 11, wherein said nucleotide deaminase is an adenosine deaminase.
  • 18. The virus of claim 11, wherein said protein further comprises a uracil glycosylase inhibitor.
  • 19. The virus of claim 11, wherein said nuclear localization signal protein is selected from a nucleoplasmin and SV40.
  • 20. The virus of claim 11, wherein said binding region is a protospacer accessory motif interacting domain.
  • 21. The virus of claim 20, wherein said protospacer accessory motif interacting domain comprises said mutation.
  • 22. The virus of claim 21, wherein said mutation is a D16A mutation.
  • 23. A method, comprising: a) providing; i) a nucleotide sequence comprising a gene with a mutated single base, wherein said gene is flanked by an N4CC nucleotide sequence; ii) a mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for said N4CC nucleotide sequence; b) contacting said nucleotide sequence with said mutated NmeCas9 protein under conditions such that said binding region attaches to said N4CC nucleotide sequence; and c) replacing said mutated single base with a wild type base with said mutated NmeCas9 protein.
  • 24. The method of claim 23, wherein said protein is Nme2Cas9.
  • 25. The method of claim 23, wherein said protein further comprises a nuclear localization signal protein.
  • 26. The method of claim 23, wherein said nucleotide deaminase is a cytidine deaminase.
  • 27. The method of claim 23, wherein said nucleotide deaminase is an adenosine deaminase.
  • 28. The method of claim 23, wherein said protein further comprises a uracil glycosylase inhibitor.
  • 29. The method of claim 23, wherein said nuclear localization signal protein is selected from the group consisting of nucleoplasmin and SV40.
  • 30. The method of claim 23, wherein said binding region is a protospacer accessory motif interacting domain.
  • 31. The method of claim 30, wherein said protospacer accessory motif interacting domain comprises said Cas9 protein mutation.
  • 32. The method of claim 31, wherein said Cas9 protein mutation is a D16A mutation.
  • 33. A method, comprising: a) providing; i) a patient comprising a nucleotide sequence comprising a gene with a mutated single base, wherein said gene is flanked by an N4CC nucleotide sequence, wherein said mutated gene causes a genetically-based medical condition; ii) an adeno-associated virus comprising a mutated NmeCas9 protein, said mutated NmeCas9 protein comprising a fused nucleotide deaminase and a binding region for said N4CC nucleotide sequence; b) treating said patient with said adeno-associated virus under conditions such that said mutated NmeCas9 protein replaces said mutated single base with a wild type single base, such that said genetically-based medical condition does not develop.
  • 34. The method of claim 33, wherein said gene encodes a tyrosinase protein.
  • 35. The method of claim 33, wherein said genetically-based medical condition is tyrosinemia.
  • 36. The method of claim 33, wherein said virus is an adeno-associated virus 8.
  • 37. The method of claim 33, wherein said virus is an adeno-associated virus 6.
  • 38. The method of claim 33, wherein said protein is Nme2Cas9.
  • 39. The method of claim 33, wherein said protein further comprises a nuclear localization signal protein.
  • 40. The method of claim 33, wherein said nucleotide deaminase is a cytidine deaminase.
  • 41. The method of claim 33, wherein said nucleotide deaminase is an adenosine deaminase.
  • 42. The method of claim 33, wherein said protein further comprises a uracil glycosylase inhibitor.
  • 43. The method of claim 33, wherein said nuclear localization signal protein is selected from the group consisting of nucleoplasmin and SV40.
  • 44. The method of claim 33, wherein said binding region is a protospacer accessory motif interacting domain.
  • 45. The method of claim 44, wherein said protospacer accessory motif interacting domain comprises said mutation.
  • 46. The method of claim 45, wherein said mutation is a D16A mutation.
CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to the co-pending PCT/US19/56341 application, filed Oct. 15, 2019 and the U.S. Provisional Patent Application No. 62/745,666, filed Oct. 15, 2018, now expired, herein incorporated by reference in its entirety. A Sequence Listing has been submitted in an ASCII text file named “19482.txt” created on Sep. 17, 2021, consisting of 342,134 bytes, the entire content of which is herein incorporated by reference.

PCT Information
Filing Document Filing Date Country Kind
PCT/US19/56341 10/15/2019 WO
Provisional Applications (1)
Number Date Country
62745666 Oct 2018 US