CRISPR-RELATED METHODS AND COMPOSITIONS

FIELD OF THE INVENTION

The invention relates to CRISPR-related methods and components for editing of, or delivery of a payload to, a target nucleic acid sequence.

SEQUENCE LISTING

The present application includes a Sequence Listing filed in electronic format. The Sequence Listing is entitled “4417101US4_ST25.txt” created on Jul. 24, 2020, and is 210,000 bytes in size. The information in the electronic format of the Sequence Listing is part of the present application and is incorporated herein by reference in its entirety.

BACKGROUND OF THE INVENTION

CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats) evolved in bacteria as an adaptive immune system to defend against viral attack. Upon exposure to a virus, short segments of viral DNA are integrated into the CRISPR locus. RNA is transcribed from a portion of the CRISPR locus that includes the viral sequence. That RNA, which contains sequence complementary to the viral genome, mediates targeting of a Cas9 protein to the sequence in the viral genome. The Cas9 protein cleaves and thereby silences the viral target.

Recently, the CRISPR/Cas system has been adapted for genome editing in eukaryotic cells. The introduction of site-specific double strand breaks (DSBs) allows for target sequence alteration through one of two endogenous DNA repair mechanisms, either non-homologous end-joining (NHEJ) or homology-directed repair (HDR). The CRISPR/Cas system has also been used for gene regulation including transcription repression and activation without altering the target sequence. Targeted gene regulation based on the CRISPR/Cas system uses an enzymatically inactive Cas9 (also known as a catalytically dead Cas9).

SUMMARY OF THE INVENTION

Methods and compositions disclosed herein, e.g., a Cas9 molecule complexed with a gRNA molecule, can be used to target a specific location in a target DNA. Depending on the Cas9 molecule/gRNA molecule complex used, specific editing or the delivery of a payload can be effected.

In one aspect, the disclosure features a gRNA molecule comprising a targeting domain which is complementary with a target sequence from a target nucleic acid disclosed herein, e.g., a sequence from: a gene or pathway described herein, e.g., in Section VIIB, e.g., in Table VII-13, VII-14, VII-15, VII-16, VII-17, VII-18, VII-19, VII-20, VII-21, VII-22, VII-23, VII-24, VII-25, IX-1, IX-1A, IX-2, IX-3, XIV-1, or Section VIII.
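As an informal illustration of what complementarity between a targeting domain and a target sequence means at the sequence level, the sketch below checks whether an RNA sequence base-pairs fully with a DNA strand. The function name and the example sequences are hypothetical and are not taken from the disclosure; only Watson-Crick pairing is modeled, and full complementarity is an assumption of the sketch rather than a stated requirement.

```python
# Watson-Crick partner on a DNA strand for each RNA base (illustrative only).
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def is_fully_complementary(targeting_domain_rna: str, dna_strand: str) -> bool:
    """Return True if the RNA (written 5'->3') pairs base-for-base with the
    DNA strand (also written 5'->3').

    Paired strands run antiparallel, so the RNA is compared against the
    DNA strand read in reverse.
    """
    rna = targeting_domain_rna.upper()
    dna = dna_strand.upper()
    if len(rna) != len(dna):
        return False
    return all(RNA_TO_DNA_PAIR.get(r) == d for r, d in zip(rna, reversed(dna)))
```

For example, under these assumptions the hypothetical RNA sequence AAGG pairs with the DNA strand CCTT but not with AAGG.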

In another aspect, the disclosure features a composition, e.g., pharmaceutical composition, comprising a gRNA molecule described herein.

In some embodiments, the composition further comprises a Cas9 molecule, e.g., an eaCas9 or an eiCas9 molecule. In some embodiments, said Cas9 molecule is an eaCas9 molecule. In other embodiments, said Cas9 molecule is an eiCas9 molecule.

In some embodiments, said composition comprises a payload, e.g., a payload described herein, e.g., in Section VI, e.g., in Table VI-1, VI-2, VI-3, VI-4, VI-5, VI-6, or VI-7.

In some embodiments, the payload comprises: an epigenetic modifier, e.g., a molecule that modifies DNA or chromatin; component, e.g., a molecule that modifies a histone, e.g., an epigenetic modifier described herein, e.g., in Section VI; a transcription factor, e.g., a transcription factor described herein, e.g., in Section VI; a transcriptional activator domain; an inhibitor of a transcription factor, e.g., an anti-transcription factor antibody, or other inhibitors; a small molecule; an antibody; an enzyme; an enzyme that interacts with DNA, e.g., a helicase, restriction enzyme, ligase, or polymerase; and/or a nucleic acid, e.g., an enzymatically active nucleic acid, e.g., a ribozyme, or an mRNA, siRNA, or antisense oligonucleotide. In some embodiments, the composition further comprises a Cas9 molecule, e.g., an eiCas9 molecule.

In some embodiments, said payload is coupled, e.g., covalently or noncovalently, to a Cas9 molecule, e.g., an eiCas9 molecule. In some embodiments, said payload is coupled to said Cas9 molecule by a linker. In some embodiments, said linker is or comprises a bond that is cleavable under physiological, e.g., nuclear, conditions. In some embodiments, said linker is, or comprises, a bond described herein, e.g., in Section XI. In some embodiments, said linker is, or comprises, an ester bond. In some embodiments, said payload comprises a fusion partner fused to a Cas9 molecule, e.g., an eaCas9 molecule or an eiCas9 molecule.

In some embodiments, said payload is coupled, e.g., covalently or noncovalently, to the gRNA molecule. In some embodiments, said payload is coupled to said gRNA molecule by a linker. In some embodiments, said linker is or comprises a bond that is cleavable under physiological, e.g., nuclear, conditions. In some embodiments, said linker is, or comprises, a bond described herein, e.g., in Section XI. In some embodiments, said linker is, or comprises, an ester bond.

In some embodiments, the composition comprises an eaCas9 molecule. In some embodiments, the composition comprises an eaCas9 molecule which forms a double stranded break in the target nucleic acid.

In some embodiments, the composition comprises an eaCas9 molecule which forms a single stranded break in the target nucleic acid. In some embodiments, said single stranded break is formed in the complementary strand of the target nucleic acid. In some embodiments, said single stranded break is formed in the strand which is not the complementary strand of the target nucleic acid.

In some embodiments, the composition comprises HNH-like domain cleavage activity but has no, or no significant, N-terminal RuvC-like domain cleavage activity. In some embodiments, the composition comprises N-terminal RuvC-like domain cleavage activity but has no, or no significant, HNH-like domain cleavage activity.

In some embodiments, said double stranded break is within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position. In some embodiments, said single stranded break is within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position.
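The distance criterion above can be sketched in a few lines. This is a minimal illustration that assumes the break and the target position are expressed as integer coordinates on the same reference sequence and that "within N nucleotides" means an absolute coordinate difference of at most N; the function names and the coordinate convention are hypothetical.

```python
from typing import Optional

# Window sizes recited in the text, in nucleotides.
WINDOWS = (10, 20, 30, 40, 50, 100, 150, 200)


def break_within_window(break_pos: int, target_pos: int, window: int) -> bool:
    """True if the break lies within `window` nucleotides of the target position."""
    return abs(break_pos - target_pos) <= window


def smallest_recited_window(break_pos: int, target_pos: int) -> Optional[int]:
    """Return the smallest recited window that contains the break, or None."""
    for window in WINDOWS:
        if break_within_window(break_pos, target_pos, window):
            return window
    return None
```

Under these assumptions, a break at coordinate 1035 and a target position at coordinate 1000 falls within the 40-nucleotide window but not the 30-nucleotide window.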

In some embodiments, the composition further comprises a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV. In some embodiments, the template nucleic acid comprises a nucleotide that corresponds to a nucleotide of the target position.

In some embodiments, said template nucleic acid comprises a nucleotide that corresponds to a nucleotide of the target position from a sequence of: a gene, or a gene from a pathway, described herein, e.g., in Section VIIB, e.g., in Table VII-13, VII-14, VII-15, VII-16, VII-17, VII-18, VII-19, VII-20, VII-21, VII-22, VII-23, VII-24, VII-25, IX-1, IX-1A, IX-2, IX-3, XIV-1, or Section VIII.

In some embodiments, the template nucleic acid is or comprises a fragment of 10 to 500, 10 to 400, 10 to 300, 10 to 200 nucleotides in length from a sequence in: a gene, or a gene from a pathway, described herein, e.g., in Section VIIB, e.g., in Table VII-13, VII-14, VII-15, VII-16, VII-17, VII-18, VII-19, VII-20, VII-21, VII-22, VII-23, VII-24, VII-25, IX-1, IX-1A, IX-2, IX-3, XIV-1, or Section VIII.

In some embodiments, the template nucleic acid is or comprises a fragment of 10 to 500, 10 to 400, 10 to 300, 10 to 200 nucleotides in length, which differs at at least 1 nucleotide, but not more than 5, 10, 20 or 30% of its nucleotides, from a corresponding sequence in: a gene, or a gene from a pathway, described herein, e.g., in Section VIIB, e.g., in Table VII-13, VII-14, VII-15, VII-16, VII-17, VII-18, VII-19, VII-20, VII-21, VII-22, VII-23, VII-24, VII-25, IX-1, IX-1A, IX-2, IX-3, XIV-1, or Section VIII.
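The "differs at at least 1 nucleotide, but not more than 5, 10, 20 or 30% of its nucleotides" criterion can be illustrated as follows. This is a minimal sketch that assumes the template fragment and the corresponding sequence are already aligned and of equal length, so that only substitutions are counted; the function name and the example sequences are hypothetical.

```python
def differs_within_limit(template: str, reference: str, max_percent: int) -> bool:
    """True if `template` differs from `reference` at one or more positions,
    but at no more than `max_percent` percent of its positions.

    Assumes equal-length, pre-aligned sequences (substitutions only).
    """
    if len(template) != len(reference):
        raise ValueError("sequences must be aligned and of equal length")
    mismatches = sum(a != b for a, b in zip(template.upper(), reference.upper()))
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return mismatches >= 1 and mismatches * 100 <= max_percent * len(template)
```

For a 10-nucleotide fragment with a single substitution, the check passes at the 10% limit and fails at the 5% limit; an identical fragment fails because it does not differ at any nucleotide.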

In some embodiments, the composition further comprises a second gRNA molecule, e.g., a second gRNA molecule described herein.

In some embodiments, said gRNA molecule and said second gRNA molecule mediate breaks at different sites in the target nucleic acid, e.g., flanking a target position. In some embodiments, said gRNA molecule and said second gRNA molecule are complementary to the same strand of the target. In some embodiments, said gRNA molecule and said second gRNA molecule are complementary to different strands of the target.

In some embodiments, said Cas9 molecule mediates a double stranded break.

In some embodiments, said gRNA molecule and said second gRNA molecule are configured such that a first and second break made by the Cas9 molecule flank a target position. In some embodiments, said double stranded break is within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position.

In some embodiments, said template nucleic acid comprises a nucleotide that corresponds to a nucleotide of a target position from a sequence of: a gene, or a gene from a pathway, described herein, e.g., in Section VIIB, e.g., in Table VII-13, VII-14, VII-15, VII-16, VII-17, VII-18, VII-19, VII-20, VII-21, VII-22, VII-23, VII-24, VII-25, IX-1, IX-1A, IX-2, IX-3, XIV-1, or Section VIII.

In some embodiments, the template nucleic acid is a fragment of 10 to 500, 10 to 400, 10 to 300, 10 to 200 nucleotides in length from a sequence in: a gene, or a gene from a pathway, described herein, e.g., in Section VIIB, e.g., in Table VII-13, VII-14, VII-15, VII-16, VII-17, VII-18, VII-19, VII-20, VII-21, VII-22, VII-23, VII-24, VII-25, IX-1, IX-1A, IX-2, IX-3, XIV-1, or Section VIII.

In some embodiments, the template nucleic acid is a fragment of 10 to 500, 10 to 400, 10 to 300, 10 to 200 nucleotides in length, which differs at at least 1 nucleotide, but not more than 5, 10, 20 or 30% of its nucleotides, from a corresponding sequence in: a gene, or a gene from a pathway, described herein, e.g., in Section VIIB, e.g., in Table VII-13, VII-14, VII-15, VII-16, VII-17, VII-18, VII-19, VII-20, VII-21, VII-22, VII-23, VII-24, VII-25, IX-1, IX-1A, IX-2, IX-3, XIV-1, or Section VIII.

In some embodiments, said Cas9 molecule mediates a single stranded break.

In some embodiments, said gRNA molecule and said second gRNA molecule are configured such that a first and second break are formed in the same strand of the nucleic acid target, e.g., in the case of transcribed sequence, the template strand or the non-template strand.

In some embodiments, said first and second break flank a target position.

In some embodiments, one or both of said first and second single stranded breaks are, independently, within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position.

In some embodiments, the composition further comprises a template nucleic acid. In some embodiments, the template nucleic acid comprises a nucleotide that corresponds to a nucleotide of the target position. In some embodiments, said template nucleic acid comprises a nucleotide that corresponds to a nucleotide of the target position from a sequence of: a gene, or a gene from a pathway, described herein, e.g., in Section VIIB, e.g., in Table VII-13, VII-14, VII-15, VII-16, VII-17, VII-18, VII-19, VII-20, VII-21, VII-22, VII-23, VII-24, VII-25, IX-1, IX-1A, IX-2, IX-3, XIV-1, or Section VIII.

In some embodiments, said gRNA molecule and said second gRNA molecule are configured such that a first and a second break are formed in different strands of the target. In some embodiments, said first and second break flank a target position. In some embodiments, one or both of said first and second single stranded breaks are, independently, within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position.

In some embodiments, the composition comprises a second Cas9 molecule.

In some embodiments, one or both of said Cas9 molecule and said second Cas9 molecule are eiCas9 molecules. In some embodiments, said eiCas9 molecule is coupled to a payload by a linker and said second eiCas9 molecule is coupled to a second payload by a second linker.

In some embodiments, said payload and said second payload are the same. In some embodiments, said payload and said second payload are different. In some embodiments, said linker and said second linker are the same. In some embodiments, said linker and said second linker are different, e.g., have different release properties, e.g., different release rates.

In some embodiments, said payload and said second payload are each described herein, e.g., in Section VI, e.g., in Table VI-1, VI-2, VI-3, VI-4, VI-5, VI-6, or VI-7. In some embodiments, said payload and said second payload can interact, e.g., they are subunits of a protein.

In some embodiments, one or both of said Cas9 molecule and said second Cas9 molecule are eaCas9 molecules.

In some embodiments, said eaCas9 molecule comprises a first cleavage activity and said second eaCas9 molecule comprises a second cleavage activity. In some embodiments, said cleavage activity and said second cleavage activity are the same, e.g., both are N-terminal RuvC-like domain activity or are both HNH-like domain activity. In some embodiments, said cleavage activity and said second cleavage activity are different, e.g., one is N-terminal RuvC-like domain activity and one is HNH-like domain activity.

In some embodiments, said Cas9 molecule and said second Cas9 molecule are specific for different PAMs, e.g., one is specific for NGG and the other is specific for, e.g., NGGNG, NNAGAAW (W=A or T), or NAAR (R=A or G).
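The PAM patterns above use degenerate nucleotide codes. As a minimal sketch of how such a pattern can be compared with a candidate DNA site, the function below expands N (any nucleotide), W (A or T) and R (A or G). The function name and the example sites are hypothetical, and the sketch says nothing about which Cas9 molecule recognizes which PAM.

```python
# Degenerate codes used by the PAM patterns in the text.
DEGENERATE_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any nucleotide
    "W": "AT",    # A or T
    "R": "AG",    # A or G
}


def matches_pam(site: str, pam_pattern: str) -> bool:
    """True if the DNA `site` matches `pam_pattern` position by position."""
    site = site.upper()
    pattern = pam_pattern.upper()
    if len(site) != len(pattern):
        return False
    return all(base in DEGENERATE_CODES[code] for base, code in zip(site, pattern))
```

For instance, the hypothetical site TGG matches the NGG pattern, while CCAGAAG does not match NNAGAAW because its last position is neither A nor T.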

In some embodiments, said Cas9 molecule and said second Cas9 molecule both mediate double stranded breaks.

In some embodiments, said Cas9 molecule and said second Cas9 molecule are specific for different PAMs, e.g., one is specific for NGG and the other is specific for another PAM, e.g., another PAM described herein. In some embodiments, said gRNA molecule and said second gRNA molecule are configured such that a first and second break flank a target position. In some embodiments, one or both of said first and second double stranded breaks are, independently, within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position.

In some embodiments, one of said Cas9 molecule and said second Cas9 molecule mediates a double stranded break and the other mediates a single stranded break.

In some embodiments, said Cas9 molecule and said second Cas9 molecule are specific for different PAMs, e.g., one is specific for NGG and the other is specific for another PAM, e.g., another PAM described herein. In some embodiments, said gRNA molecule and said second gRNA molecule are configured such that a first and second break flank a target position. In some embodiments, said first and second break flank a target position. In some embodiments, one or both of said first and second breaks are, independently, within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position.

In some embodiments, said Cas9 molecule and said second Cas9 molecule both mediate single stranded breaks.

In some embodiments, one or both of said first and second single stranded breaks are, independently, within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position.

In some embodiments, said gRNA molecule and said second gRNA molecule are configured such that a first and second break are in the same strand.

In some embodiments, said Cas9 molecule and said second Cas9 molecule are specific for different PAMs, e.g., one is specific for NGG and the other is specific for another PAM, e.g., another PAM described herein. In some embodiments, said gRNA molecule and said second gRNA molecule are configured such that a first and second break flank a target position. In some embodiments, one or both of said first and second single stranded breaks are, independently, within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position.

In some embodiments, said first and second break are on different strands.

In some embodiments, said gRNA molecule and said second gRNA molecule are configured such that a first and second break flank a target position. In some embodiments, said first and second break flank a target position.

In some embodiments, one or both of said first and second single stranded breaks are, independently, within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position.

In yet another aspect, the disclosure features a composition, e.g., a pharmaceutical composition, comprising a gRNA molecule and a second gRNA molecule described herein.

In some embodiments, the composition further comprises a nucleic acid, e.g., a DNA or mRNA, that encodes a Cas9 molecule described herein. In some embodiments, the composition further comprises a nucleic acid, e.g., a DNA or RNA, that encodes a second Cas9 molecule described herein. In some embodiments, the composition further comprises a template nucleic acid described herein.

In one aspect, the disclosure features a composition, e.g., a pharmaceutical composition, comprising a nucleic acid sequence, e.g., a DNA, that encodes one or more gRNA molecules described herein.

In some embodiments, said nucleic acid comprises a promoter operably linked to the sequence that encodes a gRNA molecule, e.g., a promoter described herein.

In some embodiments, said nucleic acid comprises a second promoter operably linked to the sequence that encodes a second gRNA molecule, e.g., a promoter described herein. In some embodiments, the promoter and second promoter are different promoters. In some embodiments, the promoter and second promoter are the same.

In some embodiments, the nucleic acid further encodes a Cas9 molecule described herein.

In some embodiments, the nucleic acid further encodes a second Cas9 molecule described herein.

In some embodiments, said nucleic acid comprises a promoter operably linked to the sequence that encodes a Cas9 molecule, e.g., a promoter described herein.

In some embodiments, said nucleic acid comprises a second promoter operably linked to the sequence that encodes a second Cas9 molecule, e.g., a promoter described herein. In some embodiments, the promoter and second promoter are different promoters. In some embodiments, the promoter and second promoter are the same.

In some embodiments, the composition further comprises a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV.

In another aspect, the disclosure features a composition, e.g., a pharmaceutical composition, comprising a nucleic acid sequence that encodes one or more of: a) a Cas9 molecule, b) a second Cas9 molecule, c) a gRNA molecule, and d) a second gRNA molecule.

In some embodiments, each of a), b), c) and d) that is present is encoded on the same duplex molecule.

In some embodiments, a first sequence selected from a), b), c) and d) is encoded on a first duplex molecule and a second sequence selected from a), b), c), and d) is encoded on a second duplex molecule.

In some embodiments, said nucleic acid encodes: a) and c); a), c), and d); or a), b), c), and d).

In some embodiments, the composition further comprises a Cas9 molecule, e.g., comprising one or more of the Cas9 molecules wherein said nucleic acid does not encode a Cas9 molecule.

In some embodiments, the composition further comprises an mRNA encoding Cas9 molecule, e.g., comprising one or more mRNAs encoding one or more of the Cas9 molecules wherein said nucleic acid does not encode a Cas9 molecule.

In some embodiments, the composition further comprises a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV.

In yet another aspect, the disclosure features a nucleic acid described herein.

In one aspect, the disclosure features a composition comprising: a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule); b) an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule); and c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV.

In another aspect, the disclosure features a composition comprising: a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule); b) a nucleic acid, e.g. a DNA or mRNA encoding an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule); and c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV.

In yet another aspect, the disclosure features a composition comprising: a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule); b) an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule); and c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV.

In still another aspect, the disclosure features a composition comprising: a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule); b) a nucleic acid, e.g., a DNA or mRNA, encoding an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule) (wherein the gRNA molecule encoding nucleic acid and the eaCas9 molecule encoding nucleic acid can be on the same or different molecules); and c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV.

In one aspect, the disclosure features a method of altering a cell, e.g., altering the structure, e.g., sequence, of a target nucleic acid of a cell, comprising contacting said cell with:

1) a composition comprising:

- a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule); and
- c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV;

2) a composition comprising:

- a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a nucleic acid, e.g. a DNA or mRNA encoding an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule); and
- c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV;

3) a composition comprising:

- a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule); and
- c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV; or

4) a composition comprising:

- a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a nucleic acid, e.g., a DNA or mRNA, encoding an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule) (wherein the gRNA molecule encoding nucleic acid and the eaCas9 molecule encoding nucleic acid can be on the same or different molecules); and
- c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV.

In some embodiments, a gRNA molecule or nucleic acid encoding a gRNA molecule, and an eaCas9 molecule, or nucleic acid encoding an eaCas9 molecule, are delivered in or by, one dosage form, mode of delivery, or formulation.

In some embodiments, a) a gRNA molecule or nucleic acid encoding a gRNA molecule is delivered in or by, a first dosage form, a first mode of delivery, or a first formulation; and b) an eaCas9 molecule, or nucleic acid encoding an eaCas9 molecule, is delivered in or by a second dosage form, second mode of delivery, or second formulation.

In some embodiments, the cell is an animal or plant cell. In some embodiments, the cell is a mammalian, primate, or human cell. In some embodiments, the cell is a human cell, e.g., a cell described herein, e.g., in Section VIIA. In some embodiments, the cell is: a somatic cell, germ cell, prenatal cell, e.g., zygotic, blastocyst or embryonic, blastocyst cell, a stem cell, a mitotically competent cell, or a meiotically competent cell. In some embodiments, the cell is a human cell, e.g., a cancer cell or other cell characterized by a disease or disorder.

In some embodiments, the target nucleic acid is a chromosomal nucleic acid. In some embodiments, the target nucleic acid is an organellar nucleic acid. In some embodiments, the target nucleic acid is a mitochondrial nucleic acid. In some embodiments, the target nucleic acid is a chloroplast nucleic acid.

In some embodiments, the cell is a cell of a disease causing organism, e.g., a virus, bacterium, fungus, protozoan, or parasite.

In some embodiments, the target nucleic acid is the nucleic acid of a disease causing organism, e.g., a virus, bacterium, fungus, protozoan, or parasite.

In some embodiments, said method comprises: modulating the expression of a gene or inactivating a disease organism.

In some embodiments, said cell is a cell characterized by unwanted proliferation, e.g., a cancer cell. In some embodiments, said cell is a cell characterized by an unwanted genomic component, e.g., a viral genomic component. In some embodiments, the cell is a cell described herein, e.g., in Section IIA. In some embodiments, a control or structural sequence of at least 2, 3, 4, or 5 genes is altered.

In some embodiments, the target nucleic acid is a rearrangement, a kinase, a rearrangement that comprises a kinase, or a tumor suppressor.

In some embodiments, the method comprises cleaving a target nucleic acid within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position. In some embodiments, said composition comprises a template nucleic acid.

In some embodiments, the template nucleic acid comprises a nucleotide that corresponds to a nucleotide of the target position.

In some embodiments, said template nucleic acid comprises a nucleotide that corresponds to a nucleotide of the target position from a sequence of: a gene, or a gene from a pathway, described herein, e.g., in Section VIIB, e.g., in Table VII-13, VII-14, VII-15, VII-16, VII-17, VII-18, VII-19, VII-20, VII-21, VII-22, VII-23, VII-24, VII-25, IX-1, IX-1A, IX-2, IX-3, XIV-1, or Section VIII.

In some embodiments,

a) a control region, e.g., a cis-acting or trans-acting control region, of a gene is cleaved;

b) the sequence of a control region, e.g., a cis-acting or trans-acting control region, of a gene is altered, e.g., by an alteration that modulates, e.g., increases or decreases, expression of a gene under control of the control region, e.g., a control sequence is disrupted or a new control sequence is inserted;

c) the coding sequence of a gene is cleaved;

d) the sequence of a transcribed region, e.g., a coding sequence of a gene is altered, e.g., a mutation is corrected or introduced, an alteration that increases expression of or activity of the gene product is effected, e.g., a mutation is corrected; and/or

e) the sequence of a transcribed region, e.g., the coding sequence of a gene is altered, e.g., a mutation is corrected or introduced, an alteration that decreases expression of or activity of the gene product is effected, e.g., a mutation is inserted, e.g., the sequence of one or more nucleotides is altered so as to insert a stop codon.

In some embodiments, a control region or transcribed region, e.g., a coding sequence, of at least 2, 3, 4, 5, or 6 genes is altered.

In another aspect, the disclosure features a method of treating a subject, e.g., by altering the structure, e.g., altering the sequence, of a target nucleic acid, comprising administering to the subject, an effective amount of:

1) a composition comprising:

- a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule); and
- c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV;

2) a composition comprising:

- a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a nucleic acid, e.g. a DNA or mRNA encoding an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule); and
- c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV;

3) a composition comprising:

- a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule); and
- c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV; and/or

4) a composition comprising:

- a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a nucleic acid, e.g., a DNA or mRNA, encoding an eaCas9 molecule (or combination of eaCas9 molecules, e.g., an eaCas9 molecule and a second eaCas9 molecule) (wherein the gRNA molecule encoding nucleic acid and the eaCas9 molecule encoding nucleic acid can be on the same or different molecules); and
- c) optionally, a template nucleic acid, e.g., a template nucleic acid described herein, e.g., in Section IV.

In some embodiments, a gRNA molecule or nucleic acid encoding a gRNA molecule, and an eaCas9 molecule, or nucleic acid encoding an eaCas9 molecule, are delivered in or by one dosage form, mode of delivery, or formulation.

In some embodiments, a gRNA molecule or nucleic acid encoding a gRNA molecule is delivered in or by a first dosage form, in a first mode of delivery, or first formulation; and an eaCas9 molecule, or nucleic acid encoding an eaCas9 molecule, is delivered in or by a second dosage form, second mode of delivery, or second formulation.

In some embodiments, the subject is an animal or plant. In some embodiments, the subject is a mammalian, primate, or human.

In some embodiments, the target nucleic acid is the nucleic acid of a human cell, e.g., a cell described herein, e.g., in Section VIIA. In some embodiments, the target nucleic acid is the nucleic acid of: a somatic cell, germ cell, prenatal cell, e.g., zygotic, blastocyst or embryonic, blastocyst cell, a stem cell, a mitotically competent cell, or a meiotically competent cell.

In some embodiments, the target nucleic acid is a chromosomal nucleic acid. In some embodiments, the target nucleic acid is an organellar nucleic acid. In some embodiments, the nucleic acid is a mitochondrial nucleic acid. In some embodiments, the nucleic acid is a chloroplast nucleic acid.

In some embodiments, the target nucleic acid is the nucleic acid of a disease causing organism, e.g., a virus, bacterium, fungus, protozoan, or parasite. In some embodiments, said method comprises modulating expression of a gene or inactivating a disease organism.

In some embodiments, the target nucleic acid is the nucleic acid of a cell characterized by unwanted proliferation, e.g., a cancer cell. In some embodiments, said target nucleic acid comprises an unwanted genomic component, e.g., a viral genomic component. In some embodiments, a control or structural sequence of at least 2, 3, 4, or 5 genes is altered. In some embodiments, the target nucleic acid is a rearrangement, a kinase, a rearrangement that comprises a kinase, or a tumor suppressor.

In some embodiments, the method comprises cleaving a target nucleic acid within 10, 20, 30, 40, 50, 100, 150 or 200 nucleotides of a nucleotide of the target position.
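
By way of illustration only, the distance thresholds recited above can be checked with a short sketch such as the following. The helper name, the use of integer coordinates on a single strand, and the inclusive comparison are assumptions made for the example and are not part of the disclosure.

```python
# Hypothetical helper: the name, coordinate convention, and inclusive
# comparison are illustrative assumptions.
WINDOWS_NT = (10, 20, 30, 40, 50, 100, 150, 200)


def smallest_enclosing_window(cleavage_site, target_position):
    """Return the smallest recited window (in nucleotides) that contains the
    cleavage site relative to the target position, or None if the cleavage
    site lies more than 200 nucleotides away."""
    distance = abs(cleavage_site - target_position)
    for window in WINDOWS_NT:
        if distance <= window:
            return window
    return None
```

For example, a cleavage site 12 nucleotides from the target position falls outside the 10-nucleotide window but inside the 20-nucleotide window.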

In some embodiments, said composition comprises a template nucleic acid. In some embodiments, the template nucleic acid comprises a nucleotide that corresponds to a nucleotide of the target position.

In some embodiments,

a) a control region, e.g., a cis-acting or trans-acting control region, of a gene is cleaved;

b) the sequence of a control region, e.g., a cis-acting or trans-acting control region, of a gene is altered, e.g., by an alteration that modulates, e.g., increases or decreases, expression of a gene under control of the control region, e.g., a control sequence is disrupted or a new control sequence is inserted;

c) the coding sequence of a gene is cleaved;

d) the non-coding sequence of a gene or an intergenic region between genes is cleaved; and/or

e) the sequence of a transcribed region, e.g., the coding sequence of a gene is altered, e.g., a mutation is corrected or introduced, an alteration that decreases expression of or activity of the gene product is effected, e.g., a mutation is inserted, e.g., the sequence of one or more nucleotides is altered so as to insert a stop codon.
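
As a non-limiting sketch of the last alteration listed above, the following enumerates the single-nucleotide substitutions that would convert an in-frame codon into a stop codon. It assumes the standard genetic code, a DNA coding sequence that starts in frame, and a hypothetical function name; it does not address how such an edit would be made.

```python
# Stop codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def single_base_stop_edits(coding_seq):
    """List single-nucleotide substitutions that turn an in-frame codon of
    `coding_seq` into a stop codon.

    Returns tuples of (0-based position, original base, new base, new codon).
    Any trailing partial codon is ignored.
    """
    seq = coding_seq.upper()
    edits = []
    for codon_start in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[codon_start:codon_start + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon; nothing to insert
        for offset in range(3):
            for alt in "ACGT":
                if alt == codon[offset]:
                    continue
                mutated = codon[:offset] + alt + codon[offset + 1:]
                if mutated in STOP_CODONS:
                    edits.append((codon_start + offset, codon[offset], alt, mutated))
    return edits
```

For the arbitrary input "ATGCAGTGG", the sketch reports that the CAG codon can become TAG and that the TGG codon can become TAG or TGA, each through one substitution.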

In some embodiments, a control region or transcribed region, e.g., a coding sequence, of at least 2, 3, 4, 5, or 6 genes is altered.

In one aspect, the disclosure features a composition comprising: a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule); b) a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule); and c) a payload coupled, covalently or non-covalently, to a complex of the gRNA molecule and the Cas9 molecule, e.g., coupled to the Cas9 molecule or the gRNA molecule.

In another aspect, the disclosure features a composition comprising: a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule); b) a nucleic acid, e.g., a DNA or mRNA, encoding a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule); and c) a payload which is: coupled, covalently or non-covalently, to the gRNA molecule; or a fusion partner with the Cas9 molecule.

In yet another aspect, the disclosure features a composition comprising: a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule); b) a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule); and c) a payload which is coupled, covalently or non-covalently, to the Cas9 molecule.

In still another aspect, the disclosure features a composition comprising: a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule); b) a nucleic acid, e.g., a DNA or mRNA, encoding a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule) (wherein the gRNA molecule encoding nucleic acid and the eaCas9 molecule encoding nucleic acid can be on the same or different molecules); and c) a payload which is a fusion partner with the Cas9 molecule.

In one aspect, the disclosure features a method of delivering a payload to a cell, e.g., by targeting a payload to target nucleic acid, comprising contacting said cell with:

1) a composition comprising:

- a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule); and
- c) a payload coupled, covalently or non-covalently, to a complex of the gRNA molecule and the Cas9 molecule, e.g., coupled to the Cas9 molecule or the gRNA molecule;

2) a composition comprising:

- a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a nucleic acid, e.g., a DNA or mRNA, encoding a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule); and
- c) a payload which is: coupled, covalently or non-covalently, to the gRNA molecule; or a fusion partner with the Cas9 molecule;

3) a composition comprising:

- a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule); and
- c) a payload which is coupled, covalently or non-covalently, to the Cas9 molecule; and/or

4) a composition comprising:

- a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a nucleic acid, e.g., a DNA or mRNA, encoding a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule) (wherein the gRNA molecule encoding nucleic acid and the eaCas9 molecule encoding nucleic acid can be on the same or different molecules); and
- c) a payload which is a fusion partner with the Cas9 molecule.

In some embodiments, a gRNA molecule or nucleic acid encoding a gRNA molecule is delivered in or by a first dosage form, first mode of delivery, or first formulation; and a Cas9 molecule, or nucleic acid encoding a Cas9 molecule, is delivered in or by a second dosage form, second mode of delivery, or second formulation.

In some embodiments, the cell is an animal or plant cell. In some embodiments, the cell is a mammalian, primate, or human cell. In some embodiments, the cell is a human cell, e.g., a human cell described herein, e.g., in Section VIIA. In some embodiments, the cell is: a somatic cell, germ cell, prenatal cell, e.g., zygotic, blastocyst or embryonic, blastocyst cell, a stem cell, a mitotically competent cell, or a meiotically competent cell. In some embodiments, the cell is a human cell, e.g., a cancer cell, a cell comprising an unwanted genetic element, e.g., all or part of a viral genome.

In some embodiments, the gRNA mediates targeting of a chromosomal nucleic acid. In some embodiments, the gRNA mediates targeting of a selected genomic signature. In some embodiments, the gRNA mediates targeting of an organellar nucleic acid. In some embodiments, the gRNA mediates targeting of a mitochondrial nucleic acid. In some embodiments, the gRNA mediates targeting of a chloroplast nucleic acid.

In some embodiments, the cell is a cell of a disease causing organism, e.g., a virus, bacterium, fungus, protozoan, or parasite.

In some embodiments, the gRNA mediates targeting of the nucleic acid of a disease causing organism, e.g., a virus, bacterium, fungus, protozoan, or parasite.

In some embodiments, the payload comprises a payload described herein, e.g., in Section VI.

In some embodiments, said cell is a cell characterized by unwanted proliferation, e.g., a cancer cell. In some embodiments, said cell is characterized by an unwanted genomic component, e.g., a viral genomic component.

In some embodiments, a control or structural sequence of at least 2, 3, 4, or 5 genes is altered.

In some embodiments, the gRNA targets a selected genomic signature, e.g., a mutation, e.g., a germline or acquired somatic mutation. In some embodiments, the gRNA targets a rearrangement, a kinase, a rearrangement that comprises a kinase, or tumor suppressor. In some embodiments, the gRNA targets a cancer cell, e.g., a cancer cell disclosed herein, e.g., in Section VIIA. In some embodiments, the gRNA targets a cell which has been infected with a virus.

In another aspect, the disclosure features a method of treating a subject, e.g., by targeting a payload to target nucleic acid, comprising administering to the subject, an effective amount of:

1) a composition comprising:

- a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule); and
- c) a payload coupled, covalently or non-covalently, to a complex of the gRNA molecule and the Cas9 molecule, e.g., coupled to the Cas9 molecule;

2) a composition comprising:

- a) a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a nucleic acid, e.g., a DNA or mRNA, encoding a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule); and
- c) a payload which is:
  - coupled, covalently or non-covalently, to the gRNA molecule; or
  - a fusion partner with the Cas9 molecule;

3) a composition comprising:

- a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule); and
- c) a payload which is coupled, covalently or non-covalently, to the Cas9 molecule; and/or

4) a composition comprising:

- a) a nucleic acid, e.g., a DNA, which encodes a gRNA molecule (or combination of gRNA molecules, e.g., a gRNA molecule and a second gRNA molecule);
- b) a nucleic acid, e.g., a DNA or mRNA, encoding a Cas9 molecule, e.g., an eiCas9 molecule (or combination of Cas9 molecules, e.g., an eiCas9 molecule and a second eiCas9 molecule) (wherein the gRNA molecule encoding nucleic acid and the eaCas9 molecule encoding nucleic acid can be on the same or different molecules); and
- c) a payload which is a fusion partner with the Cas9 molecule.

In some embodiments, a gRNA molecule or nucleic acid encoding a gRNA molecule is delivered in or by a first dosage form, first mode of delivery, or first formulation; and a Cas9 molecule, or nucleic acid encoding a Cas9 molecule, is delivered in or by a second dosage form, second mode of delivery, or second formulation.

In some embodiments, the subject is an animal or plant. In some embodiments, the subject is a mammal, primate, or human.

In some embodiments, the gRNA mediates targeting of a human cell, e.g., a human cell described herein, e.g., in Section VIIA. In some embodiments, the gRNA mediates targeting of: a somatic cell, germ cell, prenatal cell, e.g., zygotic, blastocyst or embryonic, blastocyst cell, a stem cell, a mitotically competent cell, or a meiotically competent cell. In some embodiments, the gRNA mediates targeting of a cancer cell or a cell comprising an unwanted genomic element, e.g., all or part of a viral genome. In some embodiments, the gRNA mediates targeting of a chromosomal nucleic acid. In some embodiments, the gRNA mediates targeting of a selected genomic signature. In some embodiments, the gRNA mediates targeting of an organellar nucleic acid. In some embodiments, the gRNA mediates targeting of a mitochondrial nucleic acid. In some embodiments, the gRNA mediates targeting of a chloroplast nucleic acid. In some embodiments, the gRNA mediates targeting of the nucleic acid of a disease causing organism, e.g., a virus, bacterium, fungus, protozoan, or parasite. In some embodiments, the gRNA targets a cell characterized by unwanted proliferation, e.g., a cancer cell, e.g., a cancer cell from Section VIIA, e.g., from Table VII-11. In some embodiments, the gRNA targets a cell characterized by an unwanted genomic component, e.g., a viral genomic component.

In some embodiments, a control element, e.g., a promoter or enhancer, is targeted. In some embodiments, the gRNA targets a rearrangement, a kinase, a rearrangement that comprises a kinase, or a tumor suppressor. In some embodiments, the gRNA targets a selected genomic signature, e.g., a mutation, e.g., a germline or acquired somatic mutation.

In some embodiments, the gRNA targets a cancer cell. In some embodiments, the gRNA targets a cell which has been infected with a virus.

In some embodiments, at least one eaCas9 molecule and a payload are administered. In some embodiments, the payload comprises a payload described herein, e.g., in Section VI.

In one aspect, the disclosure features a reaction mixture comprising a composition described herein and a cell.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

Headings, including numeric and alphabetical headings and subheadings, are for organization and presentation and are not intended to be limiting.

Other features and advantages of the invention will be apparent from the detailed description, drawings, and from the claims.

BRIEF DESCRIPTION OF THE DRAWING

The Figures described below, that together make up the Drawing, are for illustration purposes only, not for limitation.

FIG. 1A-G are representations of several exemplary gRNAs.

FIG. 1A depicts a modular gRNA molecule derived in part (or modeled on a sequence in part) from Streptococcus pyogenes (S. pyogenes) as a duplexed structure (SEQ ID NOS 42 and 43, respectively, in order of appearance);

FIG. 1B depicts a unimolecular (or chimeric) gRNA molecule derived in part from S. pyogenes as a duplexed structure (SEQ ID NO: 44);

FIG. 1C depicts a unimolecular gRNA molecule derived in part from S. pyogenes as a duplexed structure (SEQ ID NO: 45);

FIG. 1D depicts a unimolecular gRNA molecule derived in part from S. pyogenes as a duplexed structure (SEQ ID NO: 46);

FIG. 1E depicts a unimolecular gRNA molecule derived in part from S. pyogenes as a duplexed structure (SEQ ID NO: 47);

FIG. 1F depicts a modular gRNA molecule derived in part from Streptococcus thermophilus (S. thermophilus) as a duplexed structure (SEQ ID NOS 48 and 49, respectively, in order of appearance);

FIG. 1G depicts an alignment of modular gRNA molecules of S. pyogenes and S. thermophilus (SEQ ID NOS 50-53, respectively, in order of appearance).

FIG. 2 depicts an alignment of Cas9 sequences from Chylinski et al., RNA BIOL. 2013; 10(5): 726-737. The N-terminal RuvC-like domain is boxed and indicated with a “Y”. The other two RuvC-like domains are boxed and indicated with a “B”. The HNH-like domain is boxed and indicated by a “G”. Sm: S. mutans (SEQ ID NO: 1); Sp: S. pyogenes (SEQ ID NO: 2); St: S. thermophilus (SEQ ID NO: 3); Li: L. innocua (SEQ ID NO: 4). Motif: this is a motif based on the four sequences: residues conserved in all four sequences are indicated by single letter amino acid abbreviation; “*” indicates any amino acid found in the corresponding position of any of the four sequences; and “-” indicates any amino acid, e.g., any of the 20 naturally occurring amino acids.

FIG. 3A shows an alignment of the N-terminal RuvC-like domain from the Cas9 molecules disclosed in Chylinski et al. (SEQ ID NOS 54-103, respectively, in order of appearance). The last line of FIG. 3A identifies 3 highly conserved residues.

FIG. 3B shows an alignment of the N-terminal RuvC-like domain from the Cas9 molecules disclosed in Chylinski et al. with sequence outliers removed (SEQ ID NOS 104-177, respectively, in order of appearance). The last line of FIG. 3B identifies 4 highly conserved residues.

FIG. 4A shows an alignment of the HNH-like domain from the Cas9 molecules disclosed in Chylinski et al. (SEQ ID NOS 178-252, respectively, in order of appearance). The last line of FIG. 4A identifies conserved residues.

FIG. 4B shows an alignment of the HNH-like domain from the Cas9 molecules disclosed in Chylinski et al. with sequence outliers removed (SEQ ID NOS 253-302, respectively, in order of appearance). The last line of FIG. 4B identifies 3 highly conserved residues.

FIG. 5 depicts an alignment of Cas9 sequences from S. pyogenes and Neisseria meningitidis (N. meningitidis). The N-terminal RuvC-like domain is boxed and indicated with a “Y”. The other two RuvC-like domains are boxed and indicated with a “B”. The HNH-like domain is boxed and indicated with a “G”. Sp: S. pyogenes; Nm: N. meningitidis. Motif: this is a motif based on the two sequences: residues conserved in both sequences are indicated by a single amino acid designation; “*” indicates any amino acid found in the corresponding position of any of the two sequences; “-” indicates any amino acid, e.g., any of the 20 naturally occurring amino acids, and “-” indicates any amino acid, e.g., any of the 20 naturally occurring amino acids, or absent.

FIG. 6 shows a nucleic acid sequence encoding Cas9 of N. meningitidis (SEQ ID NO: 303). Sequence indicated by an “R” is an SV40 NLS; sequence indicated as “G” is an HA tag; sequence indicated by an “O” is a synthetic NLS sequence. The remaining (unmarked) sequence is the open reading frame (ORF).

DEFINITIONS

“Domain”, as used herein, is used to describe segments of a protein or nucleic acid. Unless otherwise indicated, a domain is not required to have any specific functional property.

Calculations of “homology” or “sequence identity” between two sequences (the terms are used interchangeably herein) are performed as follows. The sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). The optimal alignment is determined as the best score using the GAP program in the GCG software package with a Blosum 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frame shift gap penalty of 5. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein, in some embodiments, amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences.
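
The position-by-position comparison described above can be sketched as follows, by way of example only. The sketch takes two sequences that have already been aligned (it does not perform the GAP alignment itself) and counts identical positions. The function name, the use of "-" as the gap character, and the choice of denominator (all aligned columns other than those in which both sequences have a gap) are assumptions; the text above states only that percent identity is a function of the number of identical positions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: the denominator is the number of aligned columns, excluding
    columns in which both sequences carry a gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared
```

Other conventions for the denominator (for example, the length of the shorter sequence) would give different values for the same alignment, so the convention used should be stated alongside any reported figure.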

“Modulator”, as used herein, refers to an entity, e.g., a drug, that can alter the activity (e.g., enzymatic activity, transcriptional activity, or translational activity), amount, distribution, or structure of a subject molecule or genetic sequence. In an embodiment, modulation comprises cleavage, e.g., breaking of a covalent or non-covalent bond, or the forming of a covalent or non-covalent bond, e.g., the attachment of a moiety, to the subject molecule. In an embodiment, a modulator alters the three dimensional, secondary, tertiary, or quaternary structure of a subject molecule. A modulator can increase, decrease, initiate, or eliminate a subject activity.

“Large molecule”, as used herein, refers to a molecule having a molecular weight of at least 2, 3, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 kD. Large molecules include proteins, polypeptides, nucleic acids, biologics, and carbohydrates.

“Polypeptide”, as used herein, refers to a polymer of amino acids having less than 100 amino acid residues. In an embodiment, it has less than 50, 20, or 10 amino acid residues.

“Reference molecule”, e.g., a reference Cas9 molecule or reference gRNA, as used herein, refers to a molecule to which a subject molecule, e.g., a subject Cas9 molecule or subject gRNA molecule, e.g., a modified or candidate Cas9 molecule, is compared. For example, a Cas9 molecule can be characterized as having no more than 10% of the nuclease activity of a reference Cas9 molecule. Examples of reference Cas9 molecules include naturally occurring unmodified Cas9 molecules, e.g., a naturally occurring Cas9 molecule such as a Cas9 molecule of S. pyogenes, or S. thermophilus. In an embodiment, the reference Cas9 molecule is the naturally occurring Cas9 molecule having the closest sequence identity or homology with the Cas9 molecule to which it is being compared. In an embodiment, the reference Cas9 molecule is a sequence, e.g., a naturally occurring or known sequence, which is the parental form on which a change, e.g., a mutation has been made.

“Replacement”, or “replaced”, as used herein with reference to a modification of a molecule does not require a process limitation but merely indicates that the replacement entity is present.

“Small molecule”, as used herein, refers to a compound having a molecular weight less than about 2 kD, e.g., less than about 2 kD, less than about 1.5 kD, less than about 1 kD, or less than about 0.75 kD.

“Subject”, as used herein, may mean either a human or non-human animal. The term includes, but is not limited to, mammals (e.g., humans, other primates, pigs, rodents (e.g., mice and rats or hamsters), rabbits, guinea pigs, cows, horses, cats, dogs, sheep, and goats). In an embodiment, the subject is a human. In other embodiments, the subject is poultry.

“Treat”, “treating” and “treatment”, as used herein, mean the treatment of a disease in a mammal, e.g., in a human, including (a) inhibiting the disease, i.e., arresting or preventing its development; (b) relieving the disease, i.e., causing regression of the disease state; or (c) curing the disease.

“X” as used herein in the context of an amino acid sequence, refers to any amino acid (e.g., any of the twenty natural amino acids) unless otherwise specified.

DETAILED DESCRIPTION

I. gRNA Molecules

A gRNA molecule, as that term is used herein, refers to a nucleic acid that promotes the specific targeting or homing of a gRNA molecule/Cas9 molecule complex to a target nucleic acid. gRNA molecules can be unimolecular (having a single RNA molecule), sometimes referred to herein as “chimeric” gRNAs, or modular (comprising more than one, and typically two, separate RNA molecules). A gRNA molecule comprises a number of domains. The gRNA molecule domains are described in more detail below.

Several exemplary gRNA structures, with domains indicated thereon, are provided in FIG. 1. While not wishing to be bound by theory with regard to the three dimensional form, or intra- or inter-strand interactions of an active form of a gRNA, regions of high complementarity are sometimes shown as duplexes in FIG. 1 and other depictions provided herein.

In an embodiment, a unimolecular, or chimeric, gRNA comprises, preferably from 5′ to 3′:

- a targeting domain (which is complementary to a target nucleic acid);
- a first complementarity domain;
- a linking domain;
- a second complementarity domain (which is complementary to the first complementarity domain);
- a proximal domain; and
- optionally, a tail domain.

In an embodiment, a modular gRNA comprises:

- a first strand comprising, preferably from 5′ to 3′:
  - a targeting domain (which is complementary with a target sequence from a target nucleic acid disclosed herein, e.g., a sequence from: a gene or pathway described herein, e.g., in Section VIIB, e.g., in Table VII-13, VII-14, VII-15, VII-16, VII-17, VII-18, VII-19, VII-20, VII-21, VII-22, VII-23, VII-24, VII-25, IX-1, IX-1A, IX-2, IX-3, XIV-1, or Section VIII); and
  - a first complementarity domain; and
- a second strand, comprising, preferably from 5′ to 3′:
  - optionally, a 5′ extension domain;
  - a second complementarity domain;
  - a proximal domain; and
  - optionally, a tail domain.
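
The two-strand arrangement listed above can likewise be sketched as a data structure. The class and field names are hypothetical, and the sequences in the usage note are arbitrary placeholders rather than sequences from FIG. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModularGRNA:
    # First strand, 5' to 3'.
    targeting: str
    first_complementarity: str
    # Second strand, 5' to 3'.
    second_complementarity: str
    proximal: str
    five_prime_extension: str = ""  # optional; empty string means absent
    tail: str = ""                  # optional; empty string means absent

    def first_strand(self):
        """Targeting domain followed by the first complementarity domain."""
        return self.targeting + self.first_complementarity

    def second_strand(self):
        """Optional 5' extension, second complementarity domain, proximal
        domain, and optional tail, in that order."""
        return (
            self.five_prime_extension
            + self.second_complementarity
            + self.proximal
            + self.tail
        )
```

For example, `ModularGRNA("GAUUACAGAUUACAGAUUAC", "GCGCAU", "AUGCGC", "AAGGCUA", five_prime_extension="GG")` describes two separate RNA molecules, returned by `first_strand()` and `second_strand()`. The sketch records domain order only; it does not model hybridization of the two strands.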

The domains are discussed briefly below:

1) The Targeting Domain:

FIG. 1A-G provides examples of the placement of targeting domains.

The targeting domain comprises a nucleotide sequence that is complementary, e.g., at least 80, 85, 90, or 95% complementary, e.g., fully complementary, to the target sequence on the target nucleic acid. The targeting domain is part of an RNA molecule and will therefore comprise the base uracil (U), while any DNA encoding the gRNA molecule will comprise the base thymine (T). While not wishing to be bound by theory, it is believed that the complementarity of the targeting domain with the target sequence contributes to specificity of the interaction of the gRNA molecule/Cas9 molecule complex with a target nucleic acid. It is understood that in a targeting domain and target sequence pair, the uracil bases in the targeting domain will pair with the adenine bases in the target sequence. In an embodiment, the targeting domain itself comprises, in the 5′ to 3′ direction, an optional secondary domain, and a core domain. In an embodiment, the core domain is fully complementary with the target sequence. In an embodiment, the targeting domain is 5 to 50, e.g., 10 to 40, e.g., 10 to 30, e.g., 15 to 30, e.g., 15 to 25 nucleotides in length. In an embodiment, the targeting domain is 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 nucleotides in length. The strand of the target nucleic acid with which the targeting domain is complementary is referred to herein as the complementary strand. Some or all of the nucleotides of the domain can have a modification, e.g., modification found in Section X herein.
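
As an illustrative sketch of the base-pairing relationships described above, the following converts a DNA sequence encoding a targeting domain to its RNA form (T to U) and computes the percent complementarity between an RNA targeting domain and the complementary strand of a DNA target. The function names are hypothetical. The sketch assumes equal-length sequences written 5′ to 3′, antiparallel pairing, and Watson-Crick pairs only (A with T, U with A, G with C, C with G).

```python
# RNA base (targeting domain) -> DNA base it pairs with on the target strand.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def dna_to_rna(dna):
    """RNA form of a DNA sequence: thymine (T) is replaced by uracil (U)."""
    return dna.upper().replace("T", "U")


def percent_complementarity(targeting_rna, complementary_strand_dna):
    """Percent of positions at which the targeting domain pairs with the
    complementary strand of the target.

    Both arguments are written 5' to 3'; pairing is antiparallel, so the
    DNA strand is read in reverse.
    """
    if len(targeting_rna) != len(complementary_strand_dna):
        raise ValueError("sequences must have the same length")
    if not targeting_rna:
        raise ValueError("sequences must not be empty")
    rna = targeting_rna.upper()
    dna_3_to_5 = complementary_strand_dna.upper()[::-1]
    paired = sum(
        1 for r, d in zip(rna, dna_3_to_5) if RNA_TO_DNA_PAIR.get(r) == d
    )
    return 100.0 * paired / len(rna)
```

With the arbitrary 20-nucleotide targeting domain "GAUUACAGAUUACAGAUUAC", a fully complementary DNA strand is "GTAATCTGTAATCTGTAATC", and a strand with two mismatched positions scores 90%, which meets the "at least 80, 85, or 90%" thresholds but not the 95% threshold.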

In an embodiment, the targeting domain is 16 nucleotides in length.

In an embodiment, the targeting domain is 17 nucleotides in length.

In an embodiment, the targeting domain is 18 nucleotides in length.

In an embodiment, the targeting domain is 19 nucleotides in length.

In an embodiment, the targeting domain is 20 nucleotides in length.

In an embodiment, the targeting domain is 21 nucleotides in length.

In an embodiment, the targeting domain is 22 nucleotides in length.

In an embodiment, the targeting domain is 23 nucleotides in length.

In an embodiment, the targeting domain is 24 nucleotides in length.

In an embodiment, the targeting domain is 25 nucleotides in length.

Targeting domains are discussed in more detail below.

2) The First Complementarity Domain:

FIG. 1A-G provides examples of first complementarity domains.

The first complementarity domain is complementary with the second complementarity domain, and in an embodiment, has sufficient complementarity to the second complementarity domain to form a duplexed region under at least some physiological conditions. In an embodiment, the first complementarity domain is 5 to 30 nucleotides in length. In an embodiment, the first complementarity domain is 5 to 25 nucleotides in length. In an embodiment, the first complementarity domain is 7 to 25 nucleotides in length. In an embodiment, the first complementarity domain is 7 to 22 nucleotides in length. In an embodiment, the first complementarity domain is 7 to 18 nucleotides in length. In an embodiment, the first complementarity domain is 7 to 15 nucleotides in length. In an embodiment, the first complementarity domain is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides in length.

In an embodiment, the first complementarity domain comprises 3 subdomains, which, in the 5′ to 3′ direction are: a 5′ subdomain, a central subdomain, and a 3′ subdomain. In an embodiment, the 5′ subdomain is 4-9, e.g., 4, 5, 6, 7, 8 or 9 nucleotides in length. In an embodiment, the central subdomain is 1, 2, or 3, e.g., 1, nucleotide in length. In an embodiment, the 3′ subdomain is 3 to 25, e.g., 4-22, 4-18, or 4 to 10, or 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25, nucleotides in length.

The first complementarity domain can share homology with, or be derived from, a naturally occurring first complementarity domain. In an embodiment, it has at least 50% homology with a first complementarity domain disclosed herein, e.g., an S. pyogenes, or S. thermophilus, first complementarity domain.

Some or all of the nucleotides of the domain can have a modification, e.g., modification found in Section X herein.

First complementarity domains are discussed in more detail below.

3) The Linking Domain

FIG. 1B-E provides examples of linking domains.

A linking domain serves to link the first complementarity domain with the second complementarity domain of a unimolecular gRNA. The linking domain can link the first and second complementarity domains covalently or non-covalently. In an embodiment, the linkage is covalent. In an embodiment, the linking domain covalently couples the first and second complementarity domains, see, e.g., FIG. 1B-E. In an embodiment, the linking domain is, or comprises, a covalent bond interposed between the first complementarity domain and the second complementarity domain. Typically, the linking domain comprises one or more, e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides.

In modular gRNA molecules the two molecules can be associated by virtue of the hybridization of the complementarity domains, see e.g., FIG. 1A.

A wide variety of linking domains are suitable for use in unimolecular gRNA molecules. Linking domains can consist of a covalent bond, or be as short as one or a few nucleotides, e.g., 1, 2, 3, 4, or 5 nucleotides in length.

In an embodiment, a linking domain is 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or 25 or more nucleotides in length. In an embodiment, a linking domain is 2 to 50, 2 to 40, 2 to 30, 2 to 20, 2 to 10, or 2 to 5 nucleotides in length. In an embodiment, a linking domain shares homology with, or is derived from, a naturally occurring sequence, e.g., the sequence of a tracrRNA that is 5′ to the second complementarity domain. In an embodiment, the linking domain has at least 50% homology with a linking domain disclosed herein.

Some or all of the nucleotides of the domain can have a modification, e.g., modification found in Section X herein.

Linking domains are discussed in more detail below.

4) The 5′ Extension Domain

In an embodiment, a modular gRNA can comprise additional sequence, 5′ to the second complementarity domain, referred to herein as the 5′ extension domain, see, e.g., FIG. 1A. In an embodiment, the 5′ extension domain is 2-10, 2-9, 2-8, 2-7, 2-6, 2-5, or 2-4 nucleotides in length. In an embodiment, the 5′ extension domain is 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more nucleotides in length.

5) The Second Complementarity Domain:

FIG. 1A-F provides examples of second complementarity domains.

The second complementarity domain is complementary with the first complementarity domain, and in an embodiment, has sufficient complementarity to the first complementarity domain to form a duplexed region under at least some physiological conditions. In an embodiment, e.g., as shown in FIG. 1A or FIG. 1B, the second complementarity domain can include sequence that lacks complementarity with the first complementarity domain, e.g., sequence that loops out from the duplexed region.

In an embodiment, the second complementarity domain is 5 to 27 nucleotides in length.

In an embodiment, the second complementarity domain is longer than the first complementarity domain.

In an embodiment, the second complementarity domain is 7 to 27 nucleotides in length. In an embodiment, the second complementarity domain is 7 to 25 nucleotides in length. In an embodiment, the second complementarity domain is 7 to 20 nucleotides in length. In an embodiment, the second complementarity domain is 7 to 17 nucleotides in length. In an embodiment, the second complementarity domain is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 nucleotides in length.

In an embodiment, the second complementarity domain comprises 3 subdomains, which, in the 5′ to 3′ direction are: a 5′ subdomain, a central subdomain, and a 3′ subdomain. In an embodiment, the 5′ subdomain is 3 to 25, e.g., 4 to 22, 4 to 18, or 4 to 10, or 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides in length. In an embodiment, the central subdomain is 1, 2, 3, 4 or 5, e.g., 3, nucleotides in length. In an embodiment, the 3′ subdomain is 4 to 9, e.g., 4, 5, 6, 7, 8 or 9 nucleotides in length.
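By way of non-limiting illustration only, the subdomain length ranges recited above can be checked with a short Python sketch. The function names, the dictionary keys, and the example sequence are hypothetical and are not part of the disclosure; only the numeric ranges are taken from the preceding paragraph.

```python
# Length ranges (in nucleotides) taken from the paragraph above.
# The keys "5p", "central", and "3p" are illustrative labels only.
SUBDOMAIN_RANGES = {
    "5p": (3, 25),
    "central": (1, 5),
    "3p": (4, 9),
}


def split_second_complementarity_domain(seq, len_5p, len_central):
    """Split a 5'->3' sequence into (5' subdomain, central subdomain, 3' subdomain).

    The caller supplies the lengths of the first two subdomains; the
    remainder of the sequence is treated as the 3' subdomain.
    """
    five = seq[:len_5p]
    central = seq[len_5p:len_5p + len_central]
    three = seq[len_5p + len_central:]
    return five, central, three


def subdomain_lengths_ok(five, central, three):
    """Return True if every subdomain length falls within its recited range."""
    parts = {"5p": five, "central": central, "3p": three}
    return all(
        low <= len(parts[name]) <= high
        for name, (low, high) in SUBDOMAIN_RANGES.items()
    )


# Hypothetical 14-nucleotide sequence, split as 4 + 3 + 7 nucleotides.
five, central, three = split_second_complementarity_domain("UAGCAAGUUAAAAU", 4, 3)
lengths_in_range = subdomain_lengths_ok(five, central, three)
```

The sketch checks lengths only; it does not assess complementarity or homology.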

In an embodiment, the 5′ subdomain and the 3′ subdomain of the first complementarity domain are, respectively, complementary, e.g., fully complementary, with the 3′ subdomain and the 5′ subdomain of the second complementarity domain.
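The pairing relationship described above, in which the 5′ subdomain of one domain pairs with the 3′ subdomain of the other, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: only Watson-Crick A-U and G-C pairs are counted (G-U wobble pairs are ignored), "fully complementary" is taken to mean antiparallel pairing along the full length of both strands, and all function names and sequences are hypothetical.

```python
# Watson-Crick pairing only; G-U wobble pairs are deliberately ignored (assumption).
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(RNA_PAIR[base] for base in reversed(rna.upper()))


def fully_complementary(strand_a, strand_b):
    """True if two 5'->3' strands can pair antiparallel along their full length."""
    return strand_a.upper() == reverse_complement(strand_b)


def subdomains_pair(first_5p, first_3p, second_5p, second_3p):
    """Check the crosswise pairing: first 5' with second 3', first 3' with second 5'."""
    return (
        fully_complementary(first_5p, second_3p)
        and fully_complementary(first_3p, second_5p)
    )


# Hypothetical subdomain sequences chosen so that both pairings hold.
pairs = subdomains_pair(
    first_5p="GUUU", first_3p="UAGA", second_5p="UCUA", second_3p="AAAC"
)
```

The central subdomain, which may loop out from the duplexed region, is not modeled in this sketch.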

The second complementarity domain can share homology with or be derived from a naturally occurring second complementarity domain. In an embodiment, it has at least 50% homology with a second complementarity domain disclosed herein, e.g., an S. pyogenes, or S. thermophilus, second complementarity domain.

Some or all of the nucleotides of the domain can have a modification, e.g., a modification found in Section X herein.

6) A Proximal Domain:

FIG. 1A-F provides examples of proximal domains.

In an embodiment, the proximal domain is 5 to 20 nucleotides in length. In an embodiment, the proximal domain can share homology with or be derived from a naturally occurring proximal domain. In an embodiment, it has at least 50% homology with a proximal domain disclosed herein, e.g., an S. pyogenes, or S. thermophilus, proximal domain.

Some or all of the nucleotides of the domain can have a modification, e.g., a modification found in Section X herein.

7) A Tail Domain:

FIG. 1A and FIG. 1C-F provide examples of tail domains.

As can be seen by inspection of the tail domains in FIG. 1A and FIG. 1C-F, a broad spectrum of tail domains are suitable for use in gRNA molecules. In an embodiment, the tail domain is 0 (absent), 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides in length. In an embodiment, the tail domain nucleotides are from or share homology with sequence from the 5′ end of a naturally occurring tail domain, see e.g., FIG. 1D or FIG. 1E. In an embodiment, the tail domain includes sequences that are complementary to each other and which, under at least some physiological conditions, form a duplexed region.

In an embodiment, the tail domain is absent or is 1 to 50 nucleotides in length. In an embodiment, the tail domain can share homology with or be derived from a naturally occurring tail domain. In an embodiment, it has at least 50% homology with a tail domain disclosed herein, e.g., an S. pyogenes, or S. thermophilus, tail domain.

Some or all of the nucleotides of the domain can have a modification, e.g., a modification found in Section X herein.

In an embodiment, the tail domain includes nucleotides at the 3′ end that are related to the method of in vitro or in vivo transcription. When a T7 promoter is used for in vitro transcription of the gRNA, these nucleotides may be any nucleotides present before the 3′ end of the DNA template. When a U6 promoter is used for in vivo transcription, these nucleotides may be the sequence UUUUUU. When alternate pol-III promoters are used, these nucleotides may be various numbers of uracil bases or may include alternate bases.
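As a non-limiting illustration of the U6 and alternate pol-III cases above, the following Python sketch appends a run of uracil bases to the 3′ end of a gRNA sequence. The function name, the parameter names, and the example sequence are hypothetical. The default of six uracils corresponds to the sequence UUUUUU recited above; other counts are supplied by the caller. The T7 case, which depends on the DNA template, and the use of alternate bases are not modeled.

```python
def append_3p_uracils(grna_5p_to_3p, n_uracils=6):
    """Return the gRNA sequence with n_uracils uracil bases added at its 3' end.

    The default of 6 mirrors the UUUUUU sequence described for a U6 promoter.
    A different count may be passed for an alternate pol-III promoter; the
    appropriate count for any given promoter is not specified here.
    """
    if n_uracils < 0:
        raise ValueError("n_uracils must be zero or greater")
    return grna_5p_to_3p.upper() + "U" * n_uracils


# Hypothetical 3' portion of a gRNA, shown only to illustrate the call.
with_u6_tail = append_3p_uracils("GGCACC")
with_shorter_tail = append_3p_uracils("GGCACC", n_uracils=4)
```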

The domains of gRNA molecules are described in more detail below.

The Targeting Domain

The “targeting domain” of the gRNA is complementary to the “target domain” on the target nucleic acid. The strand of the target nucleic acid comprising the nucleotide sequence complementary to the core domain of the gRNA is referred to herein as the “complementary strand” of the target nucleic acid. Guidance on the selection of targeting domains can be found, e.g., in Fu Y et al., NAT BIOTECHNOL 2014 (doi: 10.1038/nbt.2808) and Sternberg S H et al., NATURE 2014 (doi: 10.1038/nature13011).

In an embodiment, the targeting domain is 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 nucleotides in length.

In an embodiment, the targeting domain comprises 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 nucleotides.
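The relationship between a target domain and a targeting domain of the recited lengths can be sketched in Python as follows. This is an illustration only, assuming that the target domain is supplied as DNA written 5′ to 3′ and that the targeting domain is fully complementary to it; neither assumption is required by the embodiments above. The function name and the example sequence are hypothetical.

```python
# DNA base on the target domain -> pairing RNA base on the targeting domain.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

# 16 to 25 nucleotides, per the lengths recited above.
ALLOWED_LENGTHS = range(16, 26)


def targeting_domain_for(target_domain_dna):
    """Return an RNA targeting domain (5'->3') fully complementary to the target domain.

    target_domain_dna is the target domain written 5'->3' as DNA. Because
    the two strands pair antiparallel, the result is the reverse complement
    with U in place of T.
    """
    target = target_domain_dna.upper()
    if len(target) not in ALLOWED_LENGTHS:
        raise ValueError("target domain must be 16 to 25 nucleotides in length")
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(target))


# Hypothetical 20-nucleotide target domain.
example_targeting_domain = targeting_domain_for("GATTACAGATTACAGATTAC")
```

The sketch does not address selection among candidate targeting domains, for which the references cited above provide guidance.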