The instant application contains a Sequence Listing which has been submitted electronically in XML format and is hereby incorporated by reference in its entirety. Said XML copy, created on Apr. 11, 2023, is named 740487 083474-036 SL.xml and is 494,677 bytes in size.
Editing genomes using the RNA-guided DNA targeting principle of CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats-CRISPR associated proteins) has become popular in a wide variety of applications. The main advantage of the CRISPR system lies in the minimal requirement for programmable DNA interference: an endonuclease, such as a Cas9, a Cas12, or another programmable nuclease, which is guided by a customizable RNA structure. Cas9 nuclease is a multi-domain enzyme that uses an HNH nuclease domain to cleave a target nucleic acid strand. The CRISPR/Cas9 protein-RNA complex is directed to and localized on the target by a guide RNA, and then cleaves the target to generate a DNA double strand break (dsDNA break, DSB). After cleavage, DNA repair mechanisms are activated to repair the cleaved strand. Repair mechanisms are generally of two types: non-homologous end joining (NHEJ) or homologous recombination (HR). In general, NHEJ dominates repair and, being error prone, generates random indels (insertions or deletions) causing frame shift mutations, among others. In contrast, HR has a more precise repairing capability and is potentially capable of incorporating the exact substitution or insertion. To enhance HR, several techniques have been tried, for example: fusing Cas9 nuclease with homology-directed repair (HDR) effectors to enforce their localization at DSBs, introducing an overlapping homology arm, or suppressing NHEJ. Most of these techniques rely on the host DNA repair systems.
Recently, a new genetic editing system for site-specific genetic engineering using Programmable Addition via Site-Specific Targeting Elements (PASTE) has been developed (See, e.g., Ioannidi et al., “Drag-and-drop genome insertion without DNA cleavage with CRISPR-directed integrases,” bioRxiv preprint, 2021, doi: https://doi.org/10.1101/2021.11.01.466786; and U.S. patent application Ser. No. 17/451,734, the entire contents of each are hereby incorporated by reference in their entirety). PASTE comprises the addition of an integration site into the target genome followed by the insertion of one or more genes of interest or one or more nucleic acid sequences of interest at the site. PASTE combines gene editing technologies and integrase technologies to achieve unidirectional incorporation of genes in a genome for the treatment of diseases and diagnosis of disease. Despite these developments, the insertion of long sequences into the target genome is still a challenge.
Therefore, there is a need for more effective tools for gene editing and delivery.
The present disclosure provides compositions and systems for programmable gene editing comprising a DNA binding nickase, a reverse transcriptase, an integration enzyme, and a guide RNA pair comprising heterologous gRNAs each separately comprising a scaffold sequence, a primer binding sequence, an integration sequence, a spacer sequence, and optionally a reverse transcription template sequence. In one aspect, provided herein is a composition comprising: a DNA binding nickase or a functional fragment or variant thereof; a reverse transcriptase (RT) or a functional fragment or variant thereof; an integration enzyme or a functional fragment or variant thereof, wherein the integration enzyme is selected from the group consisting of an integrase, a recombinase, and a reverse transcriptase; and a guide RNA (gRNA) pair comprising: a first heterologous gRNA or functional fragments or variants thereof, comprising: a first spacer sequence, a first scaffold sequence, a first reverse transcription template sequence that comprises at least a first portion of an at least first integration recognition sequence, and a first primer binding sequence; and a second heterologous gRNA or functional fragment or variant thereof, comprising: a second spacer sequence, a second scaffold sequence, a second reverse transcription template sequence that comprises at least a second portion of the first integration recognition sequence, and a second primer binding sequence, wherein the first heterologous gRNA and the second heterologous gRNA collectively encode the entirety of the first integration recognition sequence.
In some embodiments, the first primer binding sequence, the second primer binding sequence, or both, are at least about 9 nucleotides in length or about 9-15 nucleotides in length.
In some embodiments, the at least first integration recognition sequence is at least about 38 nucleotides in length or about 38-46 nucleotides in length.
In some embodiments, the first heterologous gRNA does not comprise a reverse transcription template sequence or the first and second heterologous gRNAs do not comprise a reverse transcription template sequence.
In some embodiments, the first reverse transcription template sequence, the second reverse transcription template sequence, or both, are about 1-34 nucleotides in length.
In some embodiments, the first spacer sequence, the second spacer sequence, or both, are at least about 20 nucleotides in length or about 17-21 nucleotides in length.
In some embodiments, the first scaffold sequence, the second scaffold sequence, or both, are at least about 60 nucleotides in length or about 60-120 nucleotides in length.
In some embodiments, the first reverse transcription template sequence encodes a first extended sequence, and the second reverse transcription template sequence encodes a second extended sequence.
In some embodiments, the first and second extended sequences comprise at least about 5 complementary nucleotides with respect to each other, about 5-10 complementary nucleotides with respect to each other, about 11-20 complementary nucleotides with respect to each other, about 21-30 complementary nucleotides with respect to each other, about 31-40 complementary nucleotides with respect to each other, about 41-50 complementary nucleotides with respect to each other, or about 51-60 complementary nucleotides with respect to each other.
In some embodiments, annealing of the complementary nucleotides forms a duplex which results in an insertion of the at least first integration recognition sequence into a target location.
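The complementarity between the two extended sequences can be illustrated computationally. The following is a minimal sketch and not part of the disclosed compositions: it assumes that both extended sequences are written 5′ to 3′ as DNA and that annealing occurs between their 3′ ends. The function names and the 40-nucleotide `site` string are hypothetical placeholders chosen for illustration; `site` is not an actual integration recognition sequence.

```python
# Illustrative sketch only; all names and sequences are hypothetical placeholders.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def three_prime_overlap(ext_a: str, ext_b: str) -> int:
    """Length of the longest 3'-terminal stretch of ext_a that is fully
    complementary (antiparallel) to the 3'-terminal stretch of ext_b."""
    best = 0
    for k in range(1, min(len(ext_a), len(ext_b)) + 1):
        if ext_a[-k:] == reverse_complement(ext_b[-k:]):
            best = k
    return best


# Placeholder 40-nt "site" standing in for a full integration recognition sequence.
site = "ACGTTGCAAGCTTAGGCTAACCGGATTCGATCGTACCTGA"

# Assumption: the first extended sequence carries the first 25 nt of the site,
# and the second carries the reverse complement of the last 25 nt.
first_extended = site[:25]
second_extended = reverse_complement(site[15:])

overlap = three_prime_overlap(first_extended, second_extended)
meets_minimum = overlap >= 5  # "at least about 5" treated as exactly 5 here

# Merging the two portions across the duplex recovers the full placeholder site.
reconstructed = first_extended + reverse_complement(second_extended)[overlap:]
```

In this toy case the two portions share a 10-nucleotide complementary region, and together they encode the entire placeholder site, mirroring the requirement that the pair collectively encode the full integration recognition sequence.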
In some embodiments, the first and second heterologous gRNAs form a double stranded nucleic acid.
In some embodiments, the first spacer sequence and the second spacer sequence are separated by at least about 0-1000 nucleotides in the genome.
In some embodiments, the first and second heterologous gRNAs comprise from 5′-3′ in this order the spacer sequence, the scaffold sequence, the integration sequence, and the primer binding sequence.
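The 5′-3′ element order recited above, together with the length ranges given in the preceding embodiments, can be sketched as a simple data structure. This is an illustrative sketch under stated assumptions, not a design tool from the disclosure: the class name `GuideParts`, the treatment of “about” as exact bounds, and the homopolymer placeholder sequences are all hypothetical. The integration element is left unchecked because each gRNA of the pair may carry only a portion of the integration recognition sequence.

```python
from dataclasses import dataclass
from typing import List

# Length windows in nucleotides, taken from the embodiments in the text.
# Assumption: "about" is treated as an exact bound for simplicity.
LENGTH_WINDOWS = {
    "spacer": (17, 21),
    "scaffold": (60, 120),
    "pbs": (9, 15),
}


@dataclass(frozen=True)
class GuideParts:
    """Hypothetical container for the elements of one heterologous gRNA."""
    spacer: str
    scaffold: str
    integration: str  # may be only a portion of the integration sequence
    pbs: str          # primer binding sequence

    def check_lengths(self) -> List[str]:
        """Return a message for every element outside its length window."""
        problems = []
        for name, (low, high) in LENGTH_WINDOWS.items():
            n = len(getattr(self, name))
            if not low <= n <= high:
                problems.append(f"{name}: {n} nt is outside {low}-{high} nt")
        return problems

    def assemble(self) -> str:
        """Concatenate 5'->3': spacer, scaffold, integration sequence, PBS."""
        return self.spacer + self.scaffold + self.integration + self.pbs


# Homopolymer placeholders only; these are not functional sequences.
example = GuideParts(
    spacer="A" * 20,
    scaffold="C" * 76,
    integration="G" * 24,
    pbs="U" * 13,
)
full_length = len(example.assemble())
issues = example.check_lengths()
```

With these placeholder lengths the assembled molecule is 133 nucleotides long and passes all three length checks; shortening the primer binding sequence below 9 nucleotides would produce a reported problem.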
In some embodiments, the DNA binding nickase is a Cas9-D10A, a Cas9-H840A, a Cas12a nickase, or a Cas12b nickase, or a functional fragment or variant thereof.
In some embodiments, the reverse transcriptase is derived from Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, reverse transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), or Eubacterium rectale maturase RT (MarathonRT).
In some embodiments, the reverse transcriptase comprises a mutation relative to the wild-type sequence. In some embodiments, the reverse transcriptase is an M-MLV reverse transcriptase, an AMV-RT, a MarathonRT, or an RTX, optionally the reverse transcriptase is a modified M-MLV reverse transcriptase relative to the wild-type M-MLV reverse transcriptase, and optionally the M-MLV reverse transcriptase domain comprises one or more of the mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W.
In some embodiments, the first scaffold sequence, the second scaffold sequence, or both, comprises at least 80% sequence identity to any of the nucleic acid sequences set forth in Table A.
In some embodiments, the integration recognition sequence comprises at least 80% sequence identity to any one of the nucleic acid sequences set forth in Table B.
In some embodiments, the first and second heterologous gRNAs comprise the nucleic acid sequence of SEQ ID NO: 1-80, SEQ ID NO: 81-160, SEQ ID NO: 161-362, SEQ ID NO: 363-372, or SEQ ID NO: 373-394.
In some embodiments, the integration enzyme is Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, WO, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), or Minos, or any functional fragments or variants thereof.
In some embodiments, the integration enzyme is Bxb1 or any functional fragments or variants thereof.
In some embodiments, the integration sequence is an attB sequence, an attP sequence, an attL sequence, an attR sequence, a Vox sequence, a FRT sequence, or a functional fragment or variant thereof.
In some embodiments, the integration sequence is an attB sequence, optionally the attB sequence comprises about 38-46 base pairs.
In some embodiments, the integration sequence is an attP sequence, optionally the attP sequence comprises about 48-52 base pairs.
In some embodiments, the DNA binding nickase is a Cas9-D10A, a Cas9-H840A, a Cas12a/b/c/d/e/f/h/i/j, or a functional fragment or variant thereof.
In another aspect, provided herein is a method of site-specifically integrating an exogenous nucleic acid into a cell genome, the method comprising: (a) incorporating an integration sequence at a target location in the cell genome by introducing into a cell: (i) a DNA binding nickase or a functional fragment or variant thereof; (ii) a reverse transcriptase (RT) or a functional fragment or variant thereof; and (iii) a guide RNA (gRNA) pair comprising a first heterologous gRNA or functional fragments or variants thereof, comprising: a first spacer sequence, a first scaffold sequence, a first reverse transcription template sequence that comprises at least a first portion of an at least first integration recognition sequence, and a first primer binding sequence; and a second heterologous gRNA or functional fragments or variants thereof, comprising: a second spacer sequence, a second scaffold sequence, a second reverse transcription template sequence that comprises at least a second portion of the first integration recognition sequence, and a second primer binding sequence, wherein: the first and second heterologous gRNAs interact with the DNA binding nickase and target the target location in the cell genome, the DNA binding nickase nicks a strand of the cell genome, and the reverse transcriptase reverse transcribes (i) the first reverse transcription template sequence into a first extended sequence that encodes the at least first portion of the first integration recognition sequence and (ii) the second reverse transcription template sequence into a second extended sequence that encodes the at least second portion of the first integration recognition sequence, the first and second extended sequences comprise at least about 5 complementary nucleotides with respect to each other, wherein annealing of the complementary nucleotides forms a duplex which results in an insertion of the at least first integration recognition sequence into the target location.
The method further comprises: (b) integrating the nucleic acid into the cell genome by introducing into the cell: (i) a DNA or RNA strand comprising the nucleic acid linked to a sequence that is complementary or associated to the integration sequence; and (ii) an integration enzyme or a functional fragment or variant thereof, wherein the integration enzyme is selected from the group consisting of an integrase, a recombinase, and a reverse transcriptase, wherein the integration enzyme incorporates the nucleic acid into the cell genome at the at least first integration recognition sequence by integration, recombination, or reverse transcription of the sequence that is complementary or associated to the integration sequence, thereby introducing the nucleic acid into the target location of the cell genome of the cell.
In some embodiments, the first and second heterologous gRNAs hybridize to a complementary strand of the cell genome to the genomic strand that is nicked by the DNA binding nickase, optionally the integration enzyme is introduced as a peptide or a nucleic acid encoding the integration enzyme, optionally the DNA binding nickase is introduced as a peptide or a nucleic acid encoding the DNA binding nickase, optionally the DNA or RNA strand comprising the nucleic acid is introduced into the cell as a minicircle, a plasmid, mRNA or a linear DNA, optionally the DNA or RNA strand comprising the nucleic acid is between 1000 bp and 36,000 bp, optionally the DNA or RNA strand comprising the nucleic acid is more than 36,000 bp, optionally the DNA or RNA strand comprising the nucleic acid is less than 1000 bp, and optionally the DNA comprising the nucleic acid is introduced into the cell as a minicircle.
In some embodiments, the minicircle does not comprise a sequence of a bacterial origin.
In some embodiments, the DNA binding nickase is linked to the reverse transcriptase, and the DNA binding nickase linked to the reverse transcriptase domain and the integration enzyme are linked via a linker.
In some embodiments, the linker is cleavable.
In some embodiments, the linker is non-cleavable.
In some embodiments, the linker can be replaced by two associating binding domains of the DNA binding nickase linked to the reverse transcriptase.
In some embodiments, the DNA binding nickase, the reverse transcriptase, the gRNA pair, the DNA or RNA comprising nucleic acid linked to a complementary or associated integration sequence, and the integration enzyme are introduced into a cell in a single reaction.
In some embodiments, the nucleic acid is introduced into the cell as an adeno-associated virus (AAV) or an adenovirus (AdV).
In some embodiments, the DNA binding nickase, the reverse transcriptase, the gRNA pair, the DNA or RNA comprising nucleic acid linked to a complementary or associated integration sequence, and the integration enzyme are introduced using a virus, a RNP, an mRNA, a lipid, or a polymeric nanoparticle.
In some embodiments, the nucleic acid is a reporter gene, and optionally the reporter gene is a fluorescent protein.
In some embodiments, the cell is a dividing cell.
In some embodiments, the cell is a non-dividing cell.
In some embodiments, the target location in the cell genome is the locus of a mutated gene.
In some embodiments, the nucleic acid is a degradation tag for programmable knockdown of proteins in the presence of small molecules.
In some embodiments, the cell is a mammalian cell, a bacterial cell, or a plant cell.
In some embodiments, the nucleic acid is a T-cell receptor (TCR), a chimeric antigen receptor (CAR), an interleukin, a cytokine, or an immune checkpoint gene for integration into a T-cell or natural killer (NK) cell, and optionally the TCR, the CAR, the interleukin, the cytokine, or the immune checkpoint gene is incorporated into the target site of the T-cell or NK cell genome using a minicircle DNA.
In some embodiments, the nucleic acid is a beta hemoglobin (HBB) gene and the cell is a hematopoietic stem cell (HSC), optionally the HBB gene is incorporated into the target site in the HSC genome using a minicircle DNA, and optionally the nucleic acid is a gene responsible for beta thalassemia or sickle cell anemia.
In some embodiments, the nucleic acid is a metabolic gene, optionally the metabolic gene is involved in alpha-1 antitrypsin deficiency or ornithine transcarbamylase (OTC) deficiency, and optionally the metabolic gene is a gene involved in an inherited disease.
In some embodiments, the nucleic acid is a gene involved in an inherited disease or an inherited syndrome, and optionally the inherited disease is cystic fibrosis, familial hypercholesterolemia, adenosine deaminase (ADA) deficiency, X-linked SCID (X-SCID), Wiskott-Aldrich syndrome (WAS), hemochromatosis, Tay-Sachs, fragile X syndrome, Huntington's disease, Marfan syndrome, phenylketonuria, or muscular dystrophy.
In another aspect, provided herein is a nucleic acid molecule encoding the DNA binding nickase, the reverse transcriptase, the integration enzyme, and the gRNA pair. In another aspect, provided herein is a vector comprising the nucleic acid molecule.
In another aspect, provided herein is a cell comprising the composition, the nucleic acid molecule, or the vector.
In some embodiments, the cell is a prokaryotic cell.
In some embodiments, the cell is a eukaryotic cell.
In some embodiments, the eukaryotic cell is a mammalian cell, and optionally the mammalian cell is a human cell.
In another aspect, provided herein is a gRNA pair that specifically binds to a DNA binding nickase, wherein the gRNA pair comprises a first heterologous gRNA or functional fragments or variants thereof, and a second heterologous gRNA or functional fragments or variants thereof, and wherein the first and second heterologous gRNAs separately comprise a scaffold sequence, a primer binding sequence, an integration sequence, a spacer sequence, and optionally a reverse transcription template sequence.
In another aspect, provided herein is a polypeptide comprising a DNA binding nuclease comprising a nickase activity C-terminally linked to a reverse transcriptase linked to an integration enzyme via a linker.
In some embodiments: the linker is cleavable or non-cleavable; the integration enzyme is fused to an estrogen receptor; the DNA binding nuclease comprising a nickase activity is selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b/c/d/e/f/g/h/i/j; the reverse transcriptase is an M-MLV reverse transcriptase, an AMV-RT, a MarathonRT, or an RTX, optionally wherein the reverse transcriptase is a modified M-MLV relative to a wild-type M-MLV reverse transcriptase, optionally wherein the M-MLV reverse transcriptase domain comprises one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W; the integration enzyme is selected from the group consisting of Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, WO, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), Minos, and any mutants thereof.
PASTE editing utilizes a modified PRIME gene editing technique to site-specifically insert an integration site within a target polynucleotide (e.g., genome) and subsequently utilizes the site to integrate a polynucleotide of interest (See, e.g., US20220145293, the entire contents of which are incorporated by reference herein for all purposes). PASTE-REPLACE editing utilizes PASTE but with a paired set of gRNAs that enable the simultaneous deletion of a polynucleotide sequence (e.g., a gene) and replacement of the polynucleotide with an exogenous polynucleotide of interest (e.g., a variant gene). The first step in PASTE and PASTE-REPLACE editing generally comprises the use of a nickase (e.g., a Cas9 nickase) fused to a reverse transcriptase and an extended gRNA (pegRNA). The pegRNA comprises at least three functional polynucleotides: (i) a targeting sequence (targeting the nickase to the target polynucleotide site), (ii) a primer binding site (PBS), and (iii) a reverse transcriptase template sequence containing the integration site. However, providing all three of these functionalities in a single RNA molecule means the pegRNAs are relatively long (typically 150-200 nucleotides), making the pegRNA difficult and expensive to manufacture at a large scale, as would be required for therapeutic or diagnostic uses. Additionally, the long length of the pegRNAs may impact editing efficiency; for example, biochemical measurements show that the complex design of the pegRNA reduces its affinity to Cas9, and likely decreases the efficiency of the process. As such, the current disclosure provides improved PASTE editing systems that allow for efficient editing and enhanced manufacturability.
Providing a gRNA pair was found to be particularly advantageous in technologies like PASTE because it allows the insertion of long (38-46 bp) integration sites (versus PRIME editing which in many instances requires only short reverse transcriptase template sequences encoding a single nucleotide change).
The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of skill in the art to which the claimed subject matter belongs. It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of any subject matter claimed.
The use of the singular forms herein includes the plural unless specifically stated otherwise. As used herein, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Furthermore, use of the term “including” as well as other forms, such as “include,” “includes,” and “included,” is not limiting.
It is understood that wherever aspects are described herein with the language “comprising,” otherwise analogous aspects described in terms of “consisting of” and/or “consisting essentially of” are also provided.
Units, prefixes, and symbols are denoted in their Système International d'Unités (SI) accepted form. Numeric ranges are inclusive of the numbers defining the range.
As described herein, any concentration range, percentage range, ratio range or integer range is to be understood to include the value of any integer within the recited range and, when appropriate, fractions thereof (such as one tenth and one hundredth of an integer), unless otherwise indicated.
The terms “about” or “comprising essentially of” refer to a value or composition that is within an acceptable error range for the particular value or composition as determined by one of ordinary skill in the art, which will depend in part on how the value or composition is measured or determined, i.e., the limitations of the measurement system. When particular values or compositions are provided in the application and claims, unless otherwise stated, the meaning of “about” or “comprising essentially of” should be assumed to be within an acceptable error range for that particular value or composition.
The term “and/or” where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. Thus, the term “and/or” as used in a phrase such as “A and/or B” herein is intended to include “A and B,” “A or B,” “A” (alone), and “B” (alone). Likewise, the term “and/or” as used in a phrase such as “A, B, and/or C” is intended to encompass each of the following aspects: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).
When proteins are contemplated herein, it should be understood that polynucleotides encoding the proteins are also provided, as are vectors comprising the polynucleotides encoding the proteins.
As used herein, the term “Cas9” refers to an RNA-guided nuclease comprising a Cas9 domain, or a functional fragment or variant thereof (e.g., a protein comprising an active or inactive DNA cleavage domain of Cas9, and/or the gRNA binding domain of Cas9).
As used herein, the term “DNA binding nickase” such as a Cas9 or Cas12 nickase refers to a variant of a DNA binding nuclease which is capable of cleaving only one strand of a target double stranded polynucleotide, thereby introducing a single-strand break in the target double strand polynucleotide. Similar terminology is used herein in reference to other Cas nucleases that exhibit nickase activity. For example, a “Cas12e nickase” would be used similarly herein to refer to a Cas12e which is capable of cleaving only one strand of a target double stranded polynucleotide, thereby introducing a single-strand break in the target double strand polynucleotide.
As used herein, the term “derived from,” with reference to a polynucleotide sequence refers to a polynucleotide sequence that has at least 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to a reference naturally occurring nucleic acid sequence from which it is derived. The term “derived from,” with reference to an amino acid sequence refers to an amino acid sequence that has at least 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to a reference naturally occurring amino acid sequence from which it is derived. The term “derived from” as used herein does not denote any specific process or method for obtaining the polynucleotide or amino acid sequence. For example, the polynucleotide or amino acid sequence can be chemically synthesized.
As used herein, the term “DNA” or “DNA polynucleotides” refers to macromolecules that include multiple deoxyribonucleotides that are polymerized via phosphodiester bonds. Deoxyribonucleotides are nucleotides in which the sugar is deoxyribose.
As used herein, the term “functional fragment” in reference to a nucleic acid sequence, an amino acid sequence, or the like refers to a fragment of a reference nucleic acid sequence, an amino acid sequence, or the like that retains at least one particular function. For example, a functional fragment of an aptamer binding protein can refer to a fragment of the protein that retains the ability to bind the cognate aptamer. Not all functions of the reference protein need be retained by a functional fragment of the protein. In some instances, one or more functions are selectively reduced or eliminated.
As used herein, the term “functional variant” in reference to a nucleic acid sequence, an amino acid sequence, or the like refers to a nucleic acid sequence, an amino acid sequence, or the like that comprises at least one nucleic acid or amino acid modification (e.g., a substitution, deletion, addition) compared to the nucleic acid or amino acid sequence of a reference nucleic acid sequence, an amino acid sequence, or the like, that retains at least one particular function. For example, a functional variant of an aptamer binding protein can refer to an aptamer binding protein that comprises an amino acid substitution as compared to a wild-type reference protein and that retains the ability to bind the cognate aptamer. Not all functions of the reference wild-type protein need be retained by the functional variant of the protein. In some instances, one or more functions are selectively reduced or eliminated.
As used herein, the term “fusion protein” and grammatical equivalents thereof refer to a protein that comprises an amino acid sequence derived from at least two separate proteins. The amino acid sequence of the at least two separate proteins can be directly connected through a peptide bond; or can be operably connected through an amino acid linker. Therefore, the term fusion protein encompasses embodiments, wherein the amino acid sequence of e.g., Protein A is directly connected to the amino acid sequence of Protein B through a peptide bond (Protein A-Protein B), and embodiments, wherein the amino acid sequence of e.g., Protein A is operably connected to the amino acid sequence of Protein B through an amino acid linker (Protein A-linker-Protein B).
As used herein, the term “fuse” and grammatical equivalents thereof refer to the operable connection of an amino acid sequence derived from one protein to the amino acid sequence derived from a different protein. The term fuse encompasses both a direct connection of the two amino acid sequences through a peptide bond, and the indirect connection through an amino acid linker.
As used herein, the term “guide RNA” or “gRNA” refers to an RNA polynucleotide that guides the insertion or deletion of one or more polynucleotides of interest (e.g., a gene of interest) into a target polynucleotide (e.g., genome) via a nuclease, nickase, or functional fragment or variant thereof (e.g., a Cas protein, e.g., Cas9).
As used herein, the term “integrase” refers to a protein capable of integrating a polynucleotide of interest (e.g., a gene) into a desired location or target site (e.g., at an integration site) in a target polynucleotide (e.g., the genome of a cell). The integration can occur in a single reaction or multiple reactions.
As used herein, the term “integration sequence” refers to a polynucleotide sequence that encodes an integration site.
As used herein, the term “integration site” refers to a polynucleotide sequence capable of being recognized by an integrase.
As used herein, the term “modification,” with reference to a polynucleotide sequence, refers to a polynucleotide sequence that comprises at least one substitution, alteration, inversion, addition, or deletion of nucleotide compared to a reference polynucleotide sequence. Modifications can include the inclusion of non-naturally occurring nucleotide residues. As used herein, the term “modification,” with reference to an amino acid sequence refers to an amino acid sequence that comprises at least one substitution, alteration, inversion, addition, or deletion of an amino acid residue compared to a reference amino acid sequence. Modifications can include the inclusion of non-naturally occurring amino acid residues. Naturally occurring amino acid derivatives are not considered modified amino acids for purposes of determining percent identity of two amino acid sequences. For example, a naturally occurring modification of a glutamate amino acid residue to a pyroglutamate amino acid residue would not be considered an amino acid modification for purposes of determining percent identity of two amino acid sequences. Further, for example, a naturally occurring modification of a glutamate amino acid residue to a pyroglutamate amino acid residue would not be considered an amino acid “modification” as defined herein.
As used herein, the term “nickase” refers to a protein (e.g., a nuclease) that has the ability to cleave only one strand of a target double stranded polynucleotide, thereby introducing a single-strand break in the target double strand polynucleotide. In some embodiments, for example, an editing polypeptide described herein comprises a Cas9 nuclease with one of the two nuclease domains inactivated, e.g., by amino acid substitution of H840A, wherein the Cas9 has nickase activity but is not able to make a double strand break in a target double stranded polynucleotide.
As used herein, the terms “operably connected” and “operably linked” are used interchangeably and refer to a linkage of polynucleotide sequence elements or polypeptide sequence elements in a functional relationship. For example, a polynucleotide sequence is operably connected when it is placed into a functional relationship with another polynucleotide sequence. In some embodiments, a transcription regulatory polynucleotide sequence e.g., a promoter, enhancer, or other expression control element is operably-linked to a polynucleotide sequence that encodes a protein if it affects the transcription of the polynucleotide sequence that encodes the protein.
As used herein, the term “orthogonal integration sites” refers to integration sites that do not significantly recognize the recognition site or nucleotide sequence of the integrase (e.g., recombinase) recognized by the other.
The determination of “percent identity” between two sequences (e.g., polypeptide or polynucleotides) can be accomplished using a mathematical algorithm. A specific, non-limiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin S & Altschul S F (1990) PNAS 87: 2264-2268, modified as in Karlin S & Altschul SF (1993) PNAS 90: 5873-5877, each of which is herein incorporated by reference in its entirety. Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul SF et al., (1990) J Mol Biol 215: 403, which is herein incorporated by reference in its entirety. BLAST nucleotide searches can be performed with the NBLAST nucleotide program parameters set, e.g., for score=100, wordlength=12 to obtain nucleotide sequences homologous to a nucleic acid molecule described herein. BLAST protein searches can be performed with the XBLAST program parameters set, e.g., to score 50, wordlength=3 to obtain amino acid sequences homologous to a protein molecule described herein. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul SF et al., (1997) Nuc Acids Res 25: 3389-3402, which is herein incorporated by reference in its entirety. Alternatively, PSI BLAST can be used to perform an iterated search which detects distant relationships between molecules (Id.). When utilizing BLAST, Gapped BLAST, and PSI Blast programs, the default parameters of the respective programs (e.g., of XBLAST and NBLAST) can be used (See, e.g., National Center for Biotechnology Information (NCBI) on the worldwide web, ncbi.nlm.nih.gov). Another specific, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, 1988, CABIOS 4:11-17, which is herein incorporated by reference in its entirety. 
Such an algorithm is incorporated in the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. The percent identity between two sequences can be determined using techniques similar to those described above, with or without allowing gaps. In calculating percent identity, typically only exact matches are counted.
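By way of non-limiting illustration, the exact-match counting described above can be sketched in Python for two sequences that have already been aligned. The function name `percent_identity`, the `-` gap character, and the choice of denominator (all aligned columns, or only ungapped columns) are assumptions made for this sketch; it does not reproduce the BLAST or ALIGN programs.

```python
def percent_identity(aligned_a: str, aligned_b: str, count_gaps: bool = True) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Only exact matches are counted. Gap columns ('-') are either kept in
    the denominator (count_gaps=True) or skipped (count_gaps=False).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        is_gap = a == "-" or b == "-"
        if is_gap and not count_gaps:
            continue  # ignore gap columns entirely
        columns += 1
        if not is_gap and a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns
```

The `count_gaps` switch mirrors the statement that percent identity can be determined with or without allowing gaps; which convention applies in a given comparison is left to the practitioner.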
As used herein, the term “pharmaceutical composition” means a composition that is suitable for administration to an animal, e.g., a human subject, and comprises a therapeutic agent and a pharmaceutically acceptable carrier or diluent. A “pharmaceutically acceptable carrier or diluent” means a substance for use in contact with the tissues of human beings and/or non-human animals without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable therapeutic benefit/risk ratio.
The terms “polynucleotide,” “nucleic acid,” and “nucleic acid molecule” are used interchangeably herein and refer to a polymer of DNA or RNA. The nucleic acid molecule can be single-stranded or double-stranded; contain natural, non-natural, or altered nucleotides; and contain a natural, non-natural, or altered internucleotide linkage, such as a phosphoroamidate linkage or a phosphorothioate linkage, instead of the phosphodiester found between the nucleotides of an unmodified nucleic acid molecule. Nucleic acid molecules include, but are not limited to, all nucleic acid molecules which are obtained by any means available in the art, including, without limitation, recombinant means, e.g., the cloning of nucleic acid molecules from a recombinant library or a cell genome, using ordinary cloning technology and polymerase chain reaction, and the like, and by synthetic means. The skilled artisan will appreciate that, except where otherwise noted, nucleic acid sequences set forth in the instant application will recite thymidine (T) in a representative DNA sequence but where the sequence represents RNA (e.g., mRNA), the thymidines (Ts) would be substituted with uracils (Us). Thus, any of the RNA polynucleotides encoded by a DNA identified by a particular sequence identification number may also comprise the corresponding RNA (e.g., mRNA) sequence encoded by the DNA, where each thymidine (T) of the DNA sequence is substituted with uracil (U).
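By way of non-limiting illustration, the T-to-U convention described above can be expressed as a one-line conversion. The function name `dna_to_rna` is hypothetical and is used only for this sketch.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence: each T becomes U (case preserved)."""
    return dna.replace("T", "U").replace("t", "u")
```

For example, the DNA sequence `ATGTTC` would be read as the RNA sequence `AUGUUC`.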
As used herein, the term “polynucleotide of interest” refers to a polynucleotide intended or desired to be integrated into a target polynucleotide using any suitable method (e.g., a method described herein).
As used herein, the term “primer binding site” or “PBS” refers to the portion of a gRNA that binds to the polynucleotide sequence at the 3′ end of the flap that is formed after the DNA binding nickase nicks the target polynucleotide sequence.
The terms “protein” and “polypeptide” are used interchangeably herein and refer to a polymer of at least two amino acids linked by a peptide bond.
As used herein, the term “protospacer” refers to the DNA sequence that has the same (or similar) nucleotide sequence as the spacer sequence of a gRNA. The gRNA anneals to the complement of the protospacer sequence on the opposite strand of the DNA.
As used herein, the term “protospacer adjacent motif” or “PAM” refers to a short DNA sequence, typically 2-6 base pairs, that functions to aid a Cas nickase in recognizing the target DNA.
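By way of non-limiting illustration, the relationship between a protospacer and its adjacent PAM can be sketched as a simple sequence scan. The sketch assumes a 20-nucleotide protospacer with an NGG PAM immediately 3′ of it, a configuration typical of Streptococcus pyogenes Cas9; other Cas proteins use different PAMs and geometries. The function name `find_protospacers` and its parameters are hypothetical, and only the strand supplied is scanned.

```python
import re

def find_protospacers(dna: str, protospacer_len: int = 20, pam_regex: str = "[ACGT]GG"):
    """Yield (start, protospacer, pam) for each PAM found on the given strand.

    Assumes the PAM lies immediately 3' of the protospacer (an assumption
    typical of SpCas9; other Cas proteins differ).
    """
    dna = dna.upper()
    # A lookahead lets overlapping PAM occurrences all be reported.
    for m in re.finditer(f"(?=({pam_regex}))", dna):
        pam_start = m.start()
        start = pam_start - protospacer_len
        if start < 0:
            continue  # not enough sequence upstream of the PAM
        yield start, dna[start:pam_start], m.group(1)
```

To scan the opposite strand, the same function can be applied to the reverse complement of the input.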
As used herein, the term “recognition site” refers to a polynucleotide sequence that pairs with an integration site to mediate integration by an integrase (e.g., a recombinase).
As used herein, the term “RNA” or “RNA polynucleotide” refers to macromolecules that include multiple ribonucleotides that are polymerized via phosphodiester bonds. Ribonucleotides are nucleotides in which the sugar is ribose. RNA may contain modified nucleotides; and contain natural, non-natural, or altered internucleotide linkages, such as a phosphoroamidate linkage or a phosphorothioate linkage, instead of the phosphodiester found between the nucleotides of an unmodified nucleic acid molecule.
As used herein, the term “hairpin loop” in reference to an RNA polynucleotide (e.g., an aptamer) refers to an RNA sequence that under physiological conditions is able to base-pair to form a double helix that ends in an unpaired loop.
As used herein, the term “reverse transcriptase” refers to a protein (e.g., a polymerase) that is capable of RNA-dependent DNA synthesis. All known reverse transcriptases require a primer to synthesize a DNA transcript from an RNA template. An exemplary reverse transcriptase commonly used in the art is derived from the Moloney murine leukemia virus (M-MLV). See, e.g., Gerard, G. R., DNA 5:271-279 (1986) and Kotewicz, M. L., et al., Gene 35:249-258 (1985).
As used herein, the term “reverse transcriptase template sequence” refers to the portion of a gRNA that encodes the polynucleotide desired to be integrated into the target polynucleotide (e.g., genome) that is synthesized by the reverse transcriptase. The reverse transcriptase template sequence is used as a template during DNA synthesis by the reverse transcriptase.
As used herein, the term “scaffold” in reference to a gRNA refers to a polynucleotide in a gRNA that mediates binding to a nuclease (e.g., nickase) or a functional fragment or variant thereof (e.g., Cas9 (e.g., Cas9 nickases)).
As used herein, the term “spacer” in reference to a gRNA refers to a polynucleotide in a gRNA that mediates binding to a polynucleotide comprising a sequence complementary to the protospacer.
As used herein, the term “therapeutic nucleotide modification” refers to a polynucleotide of interest that encodes at least one nucleotide modification (e.g., substitution, deletion, or insertion) relative to the endogenous target polynucleotide (e.g., gene) sequence that is intended to have or does have a therapeutic effect in a subject.
A “therapeutically effective amount” of a therapeutic agent (e.g., a composition or system described herein) refers to any amount of the therapeutic agent that, when used alone or in combination with another therapeutic agent, protects a subject against the onset of a disease or promotes disease regression evidenced by a decrease in severity of disease symptoms, an increase in frequency and duration of disease symptom-free periods, or a prevention of impairment or disability due to the disease affliction. The ability of a therapeutic agent to promote disease regression can be evaluated using a variety of methods known to the skilled practitioner, such as in human subjects during clinical trials, in animal model systems predictive of efficacy in humans, or by assaying the activity of the agent in in vitro assays.
As used herein, the terms “treat,” “treating,” “treatment,” and the like refer to reducing or ameliorating a disease and/or symptom(s) associated therewith or obtaining a desired pharmacologic and/or physiologic effect. It will be appreciated that, although not precluded, treating a disease does not require that the disease, or symptom(s) associated therewith be completely eliminated. In some embodiments, the effect is therapeutic, i.e., without limitation, the effect partially or completely reduces, diminishes, abrogates, abates, alleviates, decreases the intensity of, or cures a disease and/or adverse symptom attributable to the disease. In some embodiments, the effect is preventative, i.e., the effect protects or prevents an occurrence or reoccurrence of a disease. To this end, the presently disclosed methods comprise administering a therapeutically effective amount of a composition as described herein.
PRIME editing generally involves the use of a Cas9 nickase fused to a reverse transcriptase and an extended gRNA (pegRNA). The pegRNA comprises a standard guide sequence (e.g., a spacer and a scaffold to target the Cas9 to the target site), a PBS, and a reverse transcriptase template sequence containing the desired nucleotide edit (see, e.g., Scholefield, J., Harrison, P. T. Prime editing – an update on the field. Gene Ther 28, 396-401 (2021). https://doi.org/10.1038/s41434-021-00263-9).
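By way of non-limiting illustration, the pegRNA components listed above can be assembled as a string. The sketch assumes a 5′-to-3′ order of spacer, scaffold, reverse transcriptase template, and PBS; a nick located 3 nucleotides from the 3′ end of the protospacer; and a 13-nucleotide PBS. These values, the function names, and any sequences passed in are assumptions for illustration and are not taken from the present disclosure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]

def build_pegrna(protospacer: str, scaffold: str, edited_downstream: str,
                 pbs_len: int = 13, nick_offset: int = 3) -> str:
    """Assemble a pegRNA-like sequence: spacer + scaffold + RT template + PBS.

    protospacer       : protospacer on the nicked strand, written 5'->3'
    edited_downstream : desired DNA sequence 3' of the nick, including the edit
    nick_offset       : distance from the 3' end of the protospacer to the nick
    """
    protospacer = protospacer.upper()
    nick = len(protospacer) - nick_offset
    if pbs_len > nick:
        raise ValueError("PBS longer than the sequence 5' of the nick")
    # The PBS pairs with the 3' end of the nicked flap, so it is the reverse
    # complement of the bases immediately 5' of the nick.
    pbs = reverse_complement(protospacer[nick - pbs_len:nick])
    # The RT template is copied into DNA, so it is the reverse complement
    # of the desired new strand.
    rt_template = reverse_complement(edited_downstream)
    dna_form = protospacer + scaffold.upper() + rt_template + pbs
    return dna_form.replace("T", "U")  # written as RNA
```

The spacer is written with the same sequence as the protospacer, consistent with the definition of “protospacer” above; the scaffold is passed in unchanged because its sequence depends on the nickase selected.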
In some embodiments, the compositions and systems described herein are useful in the method of PASTE editing. PASTE editing utilizes a modified PRIME technique to site-specifically insert an integration site within a target polynucleotide and subsequently utilizing the site to integrate a polynucleotide sequence of interest (see, e.g., U.S. Ser. No. 17/451,734, the entire contents of which are incorporated by reference herein for all purposes).
In some embodiments, the compositions, systems, and methods described herein utilize a DNA binding nickase (or a functional fragment or variant thereof). In some embodiments, a functional fragment or functional variant of a DNA binding nickase is used, wherein the fragment or variant maintains nickase activity.
In some embodiments, the DNA binding nickase is a naturally occurring nickase (or functional fragment or variant thereof). In some embodiments, the DNA binding nickase (or a functional fragment or variant thereof) is a nickase that has been modified (e.g., incorporates one or more amino acid modifications compared to a reference sequence) to impart nickase activity. For example, the DNA binding nickase (or a functional fragment or variant thereof) may be a Cas9 nuclease (or functional fragment or variant thereof) with one of the two nuclease domains inactivated, e.g., by amino acid substitution of H840A, wherein the Cas9 has nickase activity but is not able to make a double strand break in a target double stranded polynucleotide.
In some embodiments, the DNA binding nickase comprises a Cas9 nickase, Cas12e (CasX) nickase, Cas12d (CasY) nickase, Cas12a (Cpf1) nickase, Cas12b1 (C2c1) nickase, Cas13a (C2c2) nickase, or Cas12c (C2c3) nickase (or a functional fragment or variant of any of the foregoing).
In some embodiments, the DNA binding nickase is a Cas9 nickase (or a functional fragment or variant thereof). The wild type Cas9 comprises two separate nuclease domains, the RuvC domain (which cleaves the non-protospacer DNA strand) and HNH domain (which cleaves the protospacer DNA strand). In some embodiments, the Cas9 nickase comprises only a single functioning nuclease domain.
In some embodiments, the Cas9 nickase comprises a mutation in the RuvC domain which inactivates the RuvC nuclease activity. Suitable mutations include, but are not limited to, e.g., mutations in aspartate (D) 10, histidine (H) 983, aspartate (D) 986, or glutamate (E) 762 (See, e.g., Nishimasu et al., “Crystal structure of Cas9 in complex with guide RNA and target DNA,” Cell 156(5), 935-949, which is incorporated herein by reference). In some embodiments, the Cas9 nickase (or a functional fragment or variant thereof) comprises at least one of the following amino acid substitutions D10X, H983X, D986X, or E762X, wherein X is any amino acid other than the wild-type amino acid. In some embodiments, the Cas9 nickase (or a functional fragment or variant thereof) comprises at least one of the following amino acid substitutions D10A, H983A, D986A, or E762A, or a combination thereof. A Cas9 nickase (or a functional fragment or variant thereof) comprising a D10A amino acid substitution is also referred to herein as Cas9-D10A. Likewise, a Cas9 nickase (or a functional fragment or variant thereof) comprising an H983A amino acid substitution is also referred to herein as Cas9-H983A. A Cas9 nickase (or a functional fragment or variant thereof) comprising a D986A amino acid substitution is also referred to herein as Cas9-D986A. A Cas9 nickase (or a functional fragment or variant thereof) comprising an E762A amino acid substitution is also referred to herein as Cas9-E762A.
In some embodiments, the Cas9 nickase (or a functional fragment or variant thereof) comprises a mutation in the HNH domain which inactivates the HNH nuclease activity. Suitable mutations include, but are not limited to, a mutation in histidine (H) 840 or asparagine (R) 863 (amino acid numbering relative to SEQ ID NO: 1) (See supra). In some embodiments, the Cas9 nickase (or a functional fragment or variant thereof) comprises at least one of the following amino acid substitutions H840X or R863X, wherein X is any amino acid other than the wild-type amino acid. In some embodiments, the Cas9 nickase (or a functional fragment or variant thereof) comprises at least one of the following amino acid substitutions H840A or R863A, or a combination thereof. A Cas9 nickase (or a functional fragment or variant thereof) comprising an H840A amino acid substitution is also referred to herein as Cas9-H840A. Likewise, a Cas9 nickase (or a functional fragment or variant thereof) comprising an R863A amino acid substitution is also referred to herein as a Cas9-R863A.
In some embodiments, the DNA binding nickase (or a functional fragment or variant thereof) comprises Cas9-D10A, Cas9-H983A, Cas9-D986A, Cas9-E762A, Cas9-H840A, or Cas9-R863A (or a functional fragment or variant of any of the foregoing). In some embodiments, the DNA binding nickase (or a functional fragment or variant thereof) comprises Cas9-D10A, Cas9-H983A, Cas9-D986A, or Cas9-E762A (or a functional fragment or variant of any of the foregoing). In some embodiments, the DNA binding nickase comprises Cas9-H840A or Cas9-R863A (or a functional fragment or variant of any of the foregoing). In some embodiments, the DNA binding nickase (or a functional fragment or variant thereof) comprises Cas9-H840A (or a functional fragment or variant thereof).
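By way of non-limiting illustration, the substitution notation used above (wild-type residue, 1-based position, replacement residue, e.g., D10A) can be applied to an amino acid sequence programmatically. The function name `apply_substitutions` is hypothetical, the sketch does not treat “X” as a wildcard, and the short peptide in the usage note is a toy input rather than a sequence from the Sequence Listing.

```python
import re

def apply_substitutions(protein: str, substitutions: list[str]) -> str:
    """Apply substitutions written like 'D10A' to a protein sequence.

    Each entry is wild-type residue, 1-based position, new residue. The
    wild-type residue is checked against the sequence before replacement.
    """
    residues = list(protein)
    for sub in substitutions:
        m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if m is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild_type, pos, new = m.group(1), int(m.group(2)), m.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)
```

For example, `apply_substitutions("MDKKYSIGLDIG", ["D10A"])` replaces the aspartate at position 10 of the toy peptide with alanine. The wild-type check guards against applying a substitution to a sequence numbered differently from the reference.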
Reverse Transcriptases
In some embodiments, the compositions, systems, and methods described herein utilize a reverse transcriptase (or a functional fragment or variant thereof). In some embodiments, a functional fragment or functional variant of a reverse transcriptase is used, wherein the fragment or variant maintains reverse transcriptase activity.
In some embodiments, the reverse transcriptase is a naturally occurring reverse transcriptase (or functional fragment or variant thereof). In some embodiments, the reverse transcriptase is derived from a naturally occurring reverse transcriptase (or functional fragment or variant thereof). In some embodiments, the reverse transcriptase (or a functional fragment or variant thereof) is a reverse transcriptase that has been modified (e.g., incorporates one or more amino acid modifications compared to a reference sequence). In some embodiments, the modified reverse transcriptase comprises one or more improved properties as compared to the corresponding reference sequence (e.g., thermostability, fidelity, reverse transcriptase activity).
Exemplary reverse transcriptases include, but are not limited to, Moloney murine leukemia virus (M-MLV) reverse transcriptase; human immunodeficiency virus (HIV) reverse transcriptase; and avian sarcoma-leukosis virus (ASLV) reverse transcriptase, which includes but is not limited to Rous sarcoma virus (RSV) reverse transcriptase, avian myeloblastosis virus (AMV) reverse transcriptase, avian erythroblastosis virus (AEV) helper virus MCAV reverse transcriptase, avian myelocytomatosis virus MC29 helper virus MCAV reverse transcriptase, avian reticuloendotheliosis virus (REV-T) helper virus REV-A reverse transcriptase, avian sarcoma virus UR2 helper virus UR2AV reverse transcriptase, avian sarcoma virus Y73 helper virus YAV reverse transcriptase, Rous associated virus (RAV) reverse transcriptase, and myeloblastosis associated virus (MAV) reverse transcriptase.
Any of the aforementioned exemplary reverse transcriptases can be modified, e.g., to comprise at least one amino acid substitution, deletion, or addition.
In some embodiments, the reverse transcriptase is derived from the M-MLV reverse transcriptase. In some embodiments, the M-MLV reverse transcriptase is naturally occurring. In some embodiments, the M-MLV reverse transcriptase is non-naturally occurring.
In some embodiments, the compositions, systems, and methods described herein utilize an integrase (or a functional fragment or variant thereof) and a cognate integration sequence. Integrases, integration sequences, and integration sites are particularly useful in methods of PASTE editing (e.g., as described herein). It is understood by the person of ordinary skill in the art that integration sites and integrases for use in the compositions, systems, and methods described herein will be selected in pairs, wherein the selected integrase will specifically recognize the selected integration site.
The integrase (or functional fragment or variant thereof) can be provided as part of the editing polypeptide (e.g., as described herein, e.g., as a fusion protein) or as a separate polypeptide. In some embodiments, the integrase (or functional fragment or variant thereof) is part of the editing polypeptide (e.g., a fusion protein). In some embodiments, the integrase (or functional fragment or variant thereof) is a polypeptide separate from the editing polypeptide.
Exemplary integrases include recombinases, reverse transcriptases, and retrotransposases. Exemplary integrases include, but are not limited to, Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, WO, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, and retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos. In some embodiments, the integrase is Bxb1.
The integrases (e.g., recombinases) explicitly provided herein are not meant to be exclusive examples of integrases (e.g., recombinases) that can be used in embodiments of the disclosure. The methods and compositions of the disclosure can be expanded by mining databases for new orthogonal integrases (e.g., recombinases) or designing synthetic integrases (e.g., recombinases) with defined DNA specificities (See, e.g., Groth et al., “Phage integrases: biology and applications.” J. Mol. Biol. 2004; 335, 667-678; Gordley et al., “Synthesis of programmable integrases.” Proc. Natl. Acad. Sci. USA. 2009; 106, 5053-5058; the entire contents of each of which is hereby incorporated by reference in their entirety for all purposes).
In some embodiments, the integrase (or functional fragment or variant thereof) is a recombinase that incorporates the polynucleotide of interest into the target polynucleotide (e.g., a genome of a cell) at an integration site by recombination. Exemplary recombinases include serine recombinases and tyrosine recombinases. In some embodiments, the integrase is a serine recombinase. In some embodiments, the integrase is a tyrosine recombinase. Exemplary serine recombinases include, but are not limited to, Hin, Gin, Tn3, β-six, CinH, ParA, γδ, Bxb1, φC31, TP901, TG1, φBT1, R1, R2, R3, R4, R5, φRV1, φFC1, MR11, A118, U153, and gp29. Examples of serine recombinases also include, without limitation, recombinases Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Hinder, ICleared, Sheen, Mundrea, and BxZ2 from Mycobacterial phages. In some embodiments, the integrase is Hin, Gin, Tn3, β-six, CinH, ParA, γδ, Bxb1, φC31, TP901, TG1, φBT1, R1, R2, R3, R4, R5, φRV1, φFC1, MR11, A118, U153, or gp29. Exemplary tyrosine recombinases include, but are not limited to, Cre, FLP, R, Lambda, HK101, HK022, and pSAM2.
In some embodiments, the integrase is a reverse transcriptase that incorporates the polynucleotide of interest into the target polynucleotide (e.g., a genome of a cell) at an integration site by reverse transcription.
In some embodiments, the integrase (or functional fragment or variant thereof) is a retrotransposase that incorporates the polynucleotide of interest into the target polynucleotide (e.g., a genome of a cell) at an integration site by retrotransposition. Exemplary retrotransposases include, but are not limited to, retrotransposases encoded by elements such as R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), Minos, and any functional variants thereof.
In some embodiments, the compositions, systems, and methods described herein utilize a linker (e.g., a peptide linker) (e.g., one or more different linkers). Common linkers (e.g., glycine and glycine/serine linkers) are known in the art. Any suitable linker(s) can be utilized as long as each component can mediate the desired function.
In some embodiments, at least two components of an editing polypeptide (e.g., described herein) are operably connected via a linker. In some embodiments, each component of an editing polypeptide (e.g., described herein) is operably connected to the preceding and/or subsequent component of the editing polypeptide via a linker. In some embodiments, each component of an editing polypeptide (e.g., described herein) is operably connected to the preceding and/or subsequent component of the editing polypeptide via a different linker.
In some embodiments, the linker is from about 2-100, 2-50, 2-25, 2-10, 4-100, 4-50, 4-25, 4-10, 5-100, 5-50, 5-25, 5-10, 10-100, 10-50, or 10-25 amino acids in length. In some embodiments, the linker is about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amino acids in length.
In some embodiments, the compositions, systems, and methods described herein utilize a reverse transcriptase template sequence. The reverse transcriptase template sequence serves as a template for (i.e., encodes) the polynucleotide of interest (e.g., a polynucleotide comprising, e.g., a therapeutic nucleotide modification or a diagnostic nucleotide modification; or, e.g., a polynucleotide comprising an integration sequence encoding an integration site) for incorporation into a target polynucleotide (e.g., a gene or genome of a cell). In some embodiments, the reverse transcriptase template sequence comprises a therapeutic or diagnostic target nucleotide modification (e.g., in some embodiments a single nucleotide substitution, e.g., for use in PRIME editing methods). In some embodiments, the reverse transcriptase template sequence comprises an integration sequence comprising an integration site.
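By way of non-limiting illustration, joining components of an editing polypeptide through linkers of a permitted length can be sketched as follows. The repeat unit `GGGGS` is one commonly used glycine/serine motif and is an assumption here, as are the function names and the default 2-100 amino acid bounds, which simply restate the broadest range recited above.

```python
def gly_ser_linker(repeats: int, unit: str = "GGGGS") -> str:
    """Return a glycine/serine linker made of `repeats` copies of `unit`."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats

def join_with_linkers(components: list[str], linkers: list[str],
                      min_len: int = 2, max_len: int = 100) -> str:
    """Join polypeptide components N- to C-terminus, one linker per junction."""
    if len(linkers) != len(components) - 1:
        raise ValueError("need exactly one linker per junction")
    for linker in linkers:
        if not min_len <= len(linker) <= max_len:
            raise ValueError(
                f"linker length {len(linker)} outside {min_len}-{max_len}"
            )
    fused = components[0]
    for linker, component in zip(linkers, components[1:]):
        fused += linker + component
    return fused
```

Passing a separate linker for each junction reflects the embodiments in which each component is connected to its neighbor via a different linker.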
In some embodiments, the compositions, systems, and methods described herein utilize an integration sequence (e.g., comprising an integration site) and a cognate integrase (e.g., as described herein). Integration sequences, integration sites, and integrases are particularly useful in methods of PASTE editing (e.g., as described herein). In some embodiments, the gRNA comprises an integration sequence encoding an integration site. Inclusion of the integration sequence encoding an integration site in the gRNA allows for the incorporation of the integration site into a desired (site-specific) location in the polynucleotide (e.g., gene or genome) being edited.
It is understood by the person of ordinary skill in the art that integration sites and integrases for use in the compositions, systems, and methods described herein will be selected in pairs, wherein the selected integrase will specifically recognize the selected integration site. Exemplary integration sites include, but are not limited to, lox71 sites, attB sites, attP sites, attL sites, attR sites, Vox sites, FRT sites, or pseudo attP sites.
It is common knowledge to the person of ordinary skill in the art, that integration typically requires (e.g., as with serine integrases) an integration site (encoded by the gRNA) and a recognition site (e.g., linked to a polynucleotide of interest for insertion) both of which are recognized by the integrase. The integration site can be inserted into the target polynucleotide (e.g., of a cell) using a nuclease (e.g., a nickase), a gRNA, and/or an integrase. A single or a plurality of integration sites can be added to a target polynucleotide (e.g., a genome). In some embodiments, one integration site is added to a target polynucleotide (e.g., a genome). In some embodiments, more than one integration site is added to a target polynucleotide (e.g., a genome). The recognition site may be operably linked to a target polynucleotide (e.g., gene of interest) in an exogenous DNA or RNA (e.g., as described herein).
To insert more than one unique polynucleotide (e.g., gene) of interest, each at a specific site, multiple orthogonal integration sites can be added to the specific desired locations or target sites within the polynucleotide (e.g., genome) to mediate site-specific integration of the multiple polynucleotides. A first integration site is “orthogonal” to a second integration site when it does not significantly recognize the recognition site or the integrase (e.g., recombinase) recognized by the second integration site. Thus, for example, one attB site of an integrase (e.g., a recombinase) can be orthogonal to an attB site of a different integrase (e.g., recombinase). In addition, one pair of attB and attP sites of an integrase (e.g., a recombinase) can be orthogonal to another pair of attB and attP sites recognized by the same integrase (e.g., recombinase). A pair of recombinases are considered orthogonal to each other, as defined herein, when there is no significant recognition of each other's attB or attP site sequences. In some embodiments, the same integrase (e.g., recombinase) or two different integrases (e.g., recombinases) recognize the same integration site less than 30%, 28%, 26%, 24%, 22%, 20%, 18%, 16%, 14%, 12%, 10%, 8%, 6%, 4%, 2%, or 1% of the time, or any range that is formed from any two of those values as endpoints.
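By way of non-limiting illustration, the percentage thresholds recited above can be applied to measured cross-recognition rates to decide which integration systems may be treated as orthogonal. The function name `orthogonal_pairs`, the data layout, and the numeric values in the usage note are hypothetical; the 0.30 default corresponds to the 30% value in the list above.

```python
from itertools import combinations

def orthogonal_pairs(names, cross_rate, threshold=0.30):
    """Return pairs whose cross-recognition is below `threshold` in both directions.

    cross_rate[(a, b)] is the fraction of the time system `a` recognizes
    the site of system `b` (a value between 0 and 1).
    """
    pairs = []
    for a, b in combinations(names, 2):
        if cross_rate[(a, b)] < threshold and cross_rate[(b, a)] < threshold:
            pairs.append((a, b))
    return pairs
```

With hypothetical rates such as `{("GT", "GA"): 0.02, ("GA", "GT"): 0.01, ("GT", "GC"): 0.45, ("GC", "GT"): 0.05, ("GA", "GC"): 0.10, ("GC", "GA"): 0.29}`, the GT/GC pair would be excluded because one direction exceeds the threshold, while GT/GA and GA/GC would be retained.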
The central dinucleotide of some integrases is involved in the association of the two paired integration sites. For example, the central dinucleotide of BxbINT is involved in the association of the AttB integration site with the AttP recognition site. Therefore, changing the matched central dinucleotide can modify the integrase activity and provide orthogonality for the insertion of multiple genes. Therefore, expanding the set of AttB/AttP dinucleotides can enable multiplex gene insertion using gRNAs.
In some embodiments, the attB and/or attP site sequences comprise a central dinucleotide sequence. It has been shown that, for example, the central dinucleotide can be changed to GA from GT and that only GA containing attB/attP sites interact and will not cross react with GT containing sequences. In some embodiments, the central dinucleotide is selected from the group consisting of AG, AC, TG, TC, CA, CT, GA, AA, TT, CC, GG, AT, TA, GC, CG and GT. In some embodiments, the central dinucleotide is nonpalindromic. In some embodiments, the central dinucleotide is palindromic. In some embodiments, the integration site and the recognition site of a pair share the same central dinucleotide and can mediate recombination in the presence of the cognate integrase.
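By way of non-limiting illustration, the sixteen possible central dinucleotides can be enumerated and sorted into palindromic and nonpalindromic groups, where “palindromic” is taken to mean that the dinucleotide equals its own reverse complement. The matching rule in `can_recombine` is a simplification of the statement above that an integration site and a recognition site sharing the same central dinucleotide can recombine; all names in the sketch are hypothetical.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

# All 16 two-base combinations, in alphabetical order.
ALL_DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]

def is_palindromic(dinucleotide: str) -> bool:
    """True when the dinucleotide reads the same on both strands."""
    return dinucleotide.upper() == reverse_complement(dinucleotide)

PALINDROMIC = [d for d in ALL_DINUCLEOTIDES if is_palindromic(d)]
NONPALINDROMIC = [d for d in ALL_DINUCLEOTIDES if not is_palindromic(d)]

def can_recombine(attb_dinucleotide: str, attp_dinucleotide: str) -> bool:
    """Simplified rule: sites pair only when central dinucleotides match."""
    return attb_dinucleotide.upper() == attp_dinucleotide.upper()
```

Under this rule, a GA-containing attB site pairs with a GA-containing attP site but not with a GT-containing one, which is the basis for using distinct dinucleotides to direct multiple insertions.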
7.8. gRNAs
In some embodiments, the compositions, systems, and methods described herein comprise or utilize a gRNA. A gRNA typically functions to guide the insertion or deletion of one or more polynucleotides of interest (e.g., a gene of interest) into a target polynucleotide (e.g., genome). In some embodiments, the gRNA molecule is naturally occurring. In some embodiments, a gRNA molecule is non-naturally occurring. In some embodiments, a gRNA molecule is a synthetic gRNA molecule. In some embodiments, the gRNA comprises one or more nucleotide modifications (e.g., to improve stability and/or half-life after being introduced into a cell).
7.9. Paired gRNAs
In some embodiments, the compositions, systems, and methods described herein comprise or utilize one or more sets of paired guides that allow for the simultaneous deletion of an endogenous polynucleotide (e.g., gene) and insertion of a polynucleotide of interest (e.g., modified gene). The target dsDNA comprises two protospacers, each on opposite strands of the target dsDNA. One gRNA (e.g., targeting gRNA) is targeted to one strand, while the other gRNA (e.g., targeting gRNA) of the pair is targeted to the opposite strand. The targeting gRNA:editing polypeptide complex generates a single strand nick at each target site.
7.10. Modification of gRNAs
In some embodiments, the gRNA comprises one or more nucleotide modifications (e.g., to improve stability and/or half-life after being introduced into a cell). In some embodiments, chemical modifications on the ribose rings and phosphate backbone of gRNAs are incorporated. Ribose modifications are typically placed at the 2′OH as it is readily available for manipulation. Simple modifications at the 2′OH include 2′-O-methyl, 2′-fluoro, and 2′-deoxy-2′-fluoro-beta-D-arabinonucleic acid (2′fluoro-ANA). More extensive ribose modifications such as 2′F-4′-Cα-OMe and 2′,4′-di-Cα-OMe combine modification at both the 2′ and 4′ carbons. Exemplary phosphodiester modifications include sulfide-based phosphorothioate (PS) or acetate-based phosphonoacetate alterations. Combinations of the ribose and phosphodiester modifications can also be utilized, such as 2′-O-methyl-3′-phosphorothioate (MS), 2′-O-methyl-3′-thioPACE (MSP), and 2′-O-methyl-3′-phosphonoacetate (MP) RNAs. Locked and unlocked nucleotides such as locked nucleic acid (LNA), bridged nucleic acids (BNA), S-constrained ethyl (cEt), and unlocked nucleic acid (UNA) are examples of sterically hindered nucleotide modifications that can also be utilized.
7.11. Delivery of gRNAs
The gRNAs described herein (e.g., targeting gRNAs, ngRNAs) can be delivered to a cell or a population of cells by any suitable method known in the art, for example, via an RNA polynucleotide; via a vector (e.g., a plasmid or viral vector) comprising an RNA polynucleotide; or via a particle (e.g., a viral particle, lipid particle, nanoparticle (e.g., a lipid nanoparticle)) encapsulating the polynucleotide or vector. Methods of delivering each of the aforementioned are known to the person of ordinary skill in the art. Also provided herein are pharmaceutical compositions comprising a gRNA described herein (e.g., targeting gRNA, ngRNA) polynucleotide; a vector (e.g., a plasmid or viral vector) comprising the polynucleotide; or a particle (e.g., a viral particle, lipid particle, nanoparticle (e.g., a lipid nanoparticle)) encapsulating the polynucleotide; and a pharmaceutically acceptable excipient.
Exemplary viral vectors include, but are not limited to, adenovirus vectors, adeno-associated virus vectors, lentivirus vectors, retrovirus vectors, poxvirus vectors, parapoxvirus vectors, vaccinia virus vectors, fowlpox virus vectors, herpes virus vectors, alphavirus vectors, rhabdovirus vectors, measles virus vectors, Newcastle disease virus vectors, picornavirus vectors, or lymphocytic choriomeningitis virus vectors.
Provided herein are compositions (including pharmaceutical compositions), systems, and kits comprising any one or more (e.g., all) of the components described herein (e.g., an editing polypeptide, one or more gRNAs, polynucleotide inserts). In one aspect, provided herein is a system comprising at least two components of an editing system described herein (e.g., a DNA binding nickase, a reverse transcriptase, an integration enzyme, a gRNA pair). In one aspect, provided herein are compositions comprising at least one component of an editing system described herein (e.g., a DNA binding nickase, a reverse transcriptase, an integration enzyme, a gRNA pair).
Pharmaceutical compositions described herein comprise at least one component of an editing system described herein (e.g., a DNA binding nickase) and a pharmaceutically acceptable excipient (see, e.g., Remington's Pharmaceutical Sciences (1990) Mack Publishing Co., Easton, PA, the entire contents of which are incorporated by reference herein for all purposes).
In one aspect, also provided herein are methods of making pharmaceutical compositions described herein comprising providing at least one component of an editing system described herein (e.g., a DNA binding nickase) and formulating it into a pharmaceutically acceptable composition by the addition of one or more pharmaceutically acceptable excipients. In some embodiments, the pharmaceutical composition comprises a single component described herein (e.g., a DNA binding nickase). In some embodiments, the pharmaceutical composition comprises a plurality of the components described herein (e.g., a DNA binding nickase, a reverse transcriptase, an integration enzyme, a gRNA pair, etc.).
Acceptable excipients (e.g., carriers and stabilizers) are preferably nontoxic to recipients at the dosages and concentrations employed, and include buffers such as phosphate, citrate, or other organic acids; antioxidants including ascorbic acid or methionine; preservatives (such as octadecyldimethylbenzyl ammonium chloride; hexamethonium chloride; benzalkonium chloride, benzethonium chloride; phenol, butyl or benzyl alcohol; alkyl parabens such as methyl or propyl paraben; catechol; resorcinol; cyclohexanol; 3-pentanol; or m-cresol); low molecular weight (less than about 10 residues) polypeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, histidine, arginine, or lysine; monosaccharides, disaccharides, or other carbohydrates including glucose, mannose, or dextrins; chelating agents such as EDTA; sugars such as sucrose, mannitol, trehalose or sorbitol; salt-forming counter-ions such as sodium; metal complexes (e.g., Zn-protein complexes); and/or non-ionic surfactants such as TWEEN™, PLURONICS™ or polyethylene glycol (PEG).
A pharmaceutical composition may be formulated for any route of administration to a subject. The skilled person knows the various possibilities to administer a pharmaceutical composition described herein in order to deliver the editing system or composition to a target cell. Non-limiting embodiments include parenteral administration, such as intramuscular, intradermal, subcutaneous, transcutaneous, or mucosal administration. In one embodiment, the pharmaceutical composition is formulated for intravenous administration. In one embodiment, the pharmaceutical composition is formulated for administration by intramuscular, intradermal, or subcutaneous injection. Injectables can be prepared in conventional forms, either as liquid solutions or suspensions. The injectables can contain one or more excipients. Exemplary excipients include, for example, water, saline, dextrose, glycerol or ethanol. In addition, if desired, the pharmaceutical compositions to be administered can also contain minor amounts of non-toxic auxiliary substances such as wetting or emulsifying agents, pH buffering agents, stabilizers, solubility enhancers, or other such agents, such as for example, sodium acetate, sorbitan monolaurate, triethanolamine oleate or cyclodextrins. In some embodiments, the pharmaceutical composition is formulated in a single dose. In some embodiments, the pharmaceutical composition is formulated as a multi-dose.
Pharmaceutically acceptable excipients (e.g., carriers) used in the parenteral preparations described herein include for example, aqueous vehicles, nonaqueous vehicles, antimicrobial agents, isotonic agents, buffers, antioxidants, local anesthetics, suspending and dispersing agents, emulsifying agents, sequestering or chelating agents or other pharmaceutically acceptable substances. Examples of aqueous vehicles, which can be incorporated in one or more of the formulations described herein, include sodium chloride injection, Ringer's injection, isotonic dextrose injection, sterile water injection, dextrose or lactated Ringer's injection. Nonaqueous parenteral vehicles, which can be incorporated in one or more of the formulations described herein, include fixed oils of vegetable origin, cottonseed oil, corn oil, sesame oil or peanut oil. Antimicrobial agents in bacteriostatic or fungistatic concentrations can be added to the parenteral preparations described herein and packaged in multiple-dose containers, which include phenols or cresols, mercurials, benzyl alcohol, chlorobutanol, methyl and propyl p-hydroxybenzoic acid esters, thimerosal, benzalkonium chloride or benzethonium chloride. Isotonic agents, which can be incorporated in one or more of the formulations described herein, include sodium chloride or dextrose. Buffers, which can be incorporated in one or more of the formulations described herein, include phosphate or citrate. Antioxidants, which can be incorporated in one or more of the formulations described herein, include sodium bisulfate. Local anesthetics, which can be incorporated in one or more of the formulations described herein, include procaine hydrochloride. Suspending and dispersing agents, which can be incorporated in one or more of the formulations described herein, include sodium carboxymethylcellulose, hydroxypropyl methylcellulose or polyvinylpyrrolidone.
Emulsifying agents, which can be incorporated in one or more of the formulations described herein, include Polysorbate 80 (TWEEN® 80). A sequestering or chelating agent of metal ions, which can be incorporated in one or more of the formulations described herein, is EDTA. Pharmaceutical carriers, which can be incorporated in one or more of the formulations described herein, also include ethyl alcohol, polyethylene glycol or propylene glycol for water miscible vehicles; or sodium hydroxide, hydrochloric acid, citric acid or lactic acid for pH adjustment.
The precise dose to be employed in a pharmaceutical composition will also depend on the route of administration and the seriousness of the condition, and should be decided according to the judgment of the practitioner and each subject's circumstances. For example, effective doses may also vary depending upon means of administration, target site, physiological state of the subject (including age, body weight, and health), other medications administered, or whether therapy is prophylactic or therapeutic. Therapeutic dosages are preferably titrated to optimize safety and efficacy.
Also provided herein are kits comprising at least one pharmaceutical composition described herein. In addition, the kit may comprise a liquid vehicle for solubilizing or diluting, and/or technical instructions. The technical instructions of the kit may contain information about administration and dosage and subject groups. In some embodiments, the kit contains a single container comprising a single pharmaceutical composition described herein. In some embodiments, the kit contains at least two separate containers, each comprising a different pharmaceutical composition described herein (e.g., a first container comprising a pharmaceutical composition comprising one component of an editing system described herein, e.g., an editing polypeptide described herein, and a second container comprising a second pharmaceutical composition comprising a second component of an editing system described herein, e.g., a gRNA).
Guide RNA (gRNA) pairs comprising two heterologous atgRNAs for gene editing were assessed.
The gRNA pairs were used to replace the pegRNA and nicking guide generally found in the PASTE system to more efficiently introduce long PASTE sequence edits (38-46 bp). The two heterologous atgRNAs involve three design considerations, which are tested in Example 2 below: (1) the spacing of the two atgRNAs relative to each other, (2) the different combinations of guides, and (3) the amount of overlap between the attB insertion sites of the two guides.
Although complete overlap via complementary sequence of the two atgRNAs results in gene insertion, incomplete overlap (for example, 14 bp to about 46 bp of site overlap) can enhance insertion efficiency. For example, incomplete overlap of the attB integration sequence with respect to the first and second heterologous gRNAs may prevent off-target integration into guide plasmids. Furthermore, no nicking guide is needed when gRNA pairs are used. The nicking guide is replaced by engineered spacer sequences in both atgRNAs. Moreover, the reverse transcriptase (RT) is optional and, according to the examples presented below, removing the RT can yield better performing paired guides.
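The notion of complete versus incomplete overlap can be made concrete with a short sketch. It assumes each atgRNA templates a 3′ flap carrying part of the integration site, one in top-strand sense and the other in bottom-strand sense, and measures how many bases of the two flaps are complementary at their 3′ ends. The 46-nt sequence is a made-up placeholder, not the attB sequence of Table 1, and the function names are hypothetical.

```python
# Hypothetical sketch: length of the complementary overlap between two templated flaps.
# TOY_SITE is a made-up 46-nt placeholder, NOT a real attB sequence.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def flap_overlap(top_flap, bottom_flap):
    """Longest k such that the 3' end of top_flap pairs with the 3' end of bottom_flap.

    Both flaps are written 5'->3'. The bottom flap is converted to top-strand
    sense, so the question becomes a suffix/prefix match.
    """
    bottom_as_top = revcomp(bottom_flap)
    for k in range(min(len(top_flap), len(bottom_as_top)), 0, -1):
        if top_flap[-k:].upper() == bottom_as_top[:k]:
            return k
    return 0


TOY_SITE = "ACTACTTCATCAACTC" + "GGATCGGTACGTAG" + "CTTGACCAGTTCAGCA"  # 46 nt

# Incomplete overlap: each guide templates 30 nt, sharing the central 14 nt.
partial = flap_overlap(TOY_SITE[:30], revcomp(TOY_SITE[16:]))

# Complete overlap: both guides template the full 46-nt site.
complete = flap_overlap(TOY_SITE, revcomp(TOY_SITE))
```

With these toy inputs the first call corresponds to the lower end of the 14 bp to 46 bp range mentioned above and the second to full overlap.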
Table 1 below lists exemplary sequences for some of the PASTE system elements (integration site sequence and scaffold).
Different gRNA pair designs based on the design considerations presented in Example 1 were assessed by analyzing attB attachment site integration efficiency.
Panels of paired guides were designed with specificity for the ACTB, mouse DNMT1, and mouse NOLC1 loci, corresponding to the paired guide sequences shown below in Tables 1, 2, and 3, respectively.
Cell culture. HEK293FT cells (American Type Culture Collection (ATCC)-CRL32156) were cultured in Dulbecco's Modified Eagle Medium with high glucose, sodium pyruvate, and GlutaMAX (Thermo Fisher Scientific), additionally supplemented with 10% (v/v) fetal bovine serum (FBS) and 1× penicillin-streptomycin (Thermo Fisher Scientific).
Transfection. Cells were plated at 5-15K the day prior to transfection in a 96-well plate coated with poly-D-lysine (BD Biocoat). HEK293FT cells were transfected with Lipofectamine 3000 (Thermo Fisher Scientific), according to manufacturer's specifications. For attB insertion, 35.5 ng of each dual guide plasmid and 100 ng SpCas9-RT plasmid were delivered to each well.
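As a worked illustration of the per-well amounts above, the following sketch totals the plasmid mass per well and scales it to a plate-level master mix. The plasmid amounts come from the text; the 10% overage and the names used are assumptions added for illustration.

```python
# Per-well plasmid amounts (ng) for attB insertion, as stated in the text.
ATTB_RECIPE_NG = {
    "dual_guide_plasmid_1": 35.5,
    "dual_guide_plasmid_2": 35.5,
    "SpCas9_RT_plasmid": 100.0,
}


def total_dna_per_well(recipe):
    """Total plasmid mass (ng) delivered to one well."""
    return round(sum(recipe.values()), 2)


def scale_recipe(recipe, n_wells, overage=0.10):
    """Scale a per-well recipe to n_wells.

    `overage` is a hypothetical pipetting allowance, not a value from the source.
    """
    factor = n_wells * (1.0 + overage)
    return {name: round(ng * factor, 2) for name, ng in recipe.items()}


per_well_total = total_dna_per_well(ATTB_RECIPE_NG)  # 2 x 35.5 ng + 100 ng
eight_well_mix = scale_recipe(ATTB_RECIPE_NG, n_wells=8)
```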
Genomic DNA extraction, purification, and quantitation. DNA was harvested from transfected cells by removal of media, resuspension in 50 μL of QuickExtract (Lucigen), and incubation at 65° C. for 15 min, 68° C. for 15 min, and 98° C. for 10 min. Target regions were PCR amplified with NEBNext High-Fidelity 2× PCR Master Mix (NEB) based on the manufacturer's protocol. Barcodes and adapters for Illumina sequencing were added in a subsequent PCR amplification. Amplicons were pooled and prepared for sequencing on a MiSeq (Illumina). Reads were demultiplexed and analyzed with appropriate pipelines.
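The text does not specify the analysis pipeline. As one possible illustration only, the sketch below classifies demultiplexed amplicon reads by exact substring matching and reports the percentage carrying the inserted site. The sequences, function names, and the matching strategy are all assumptions; a production pipeline would typically use alignment and quality filtering instead.

```python
# Hypothetical sketch: percent attB integration from amplicon reads by substring matching.
# All sequences below are made-up placeholders, not sequences from the disclosure.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def contains_either_strand(read, motif):
    """True if `motif` or its reverse complement occurs in `read`."""
    read = read.upper()
    return motif in read or revcomp(motif) in read


def percent_integration(reads, insert_seq, wild_type_junction):
    """Percent of classifiable reads that carry the insert.

    Reads matching neither the insert nor the unedited junction are ignored.
    """
    edited = unedited = 0
    for read in reads:
        if contains_either_strand(read, insert_seq):
            edited += 1
        elif contains_either_strand(read, wild_type_junction):
            unedited += 1
    classified = edited + unedited
    return 100.0 * edited / classified if classified else 0.0


LEFT, RIGHT = "TTGACCTAGA", "CATTGGAACT"  # toy flanks
TOY_INSERT = "ACGTTGCAAGCTTGCA"           # toy stand-in for the inserted site
toy_reads = [
    LEFT + TOY_INSERT + RIGHT,
    revcomp(LEFT + TOY_INSERT + RIGHT),
    LEFT + TOY_INSERT + RIGHT,
    LEFT + RIGHT,
    "NNNNNNNNNN",                         # unclassifiable read, ignored
]
toy_percent = percent_integration(toy_reads, TOY_INSERT, LEFT + RIGHT)
```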
ACTB-specific paired guides functioned at a significant yield, with multiple pairs matching or exceeding the percent attB integration efficiency of single guides (
Cell culture. Hepa 1-6 cells (American Type Culture Collection (ATCC)-CRL32156) were cultured in Dulbecco's Modified Eagle Medium with high glucose, sodium pyruvate, and GlutaMAX (Thermo Fisher Scientific), additionally supplemented with 10% (v/v) fetal bovine serum (FBS) and 1× penicillin-streptomycin (Thermo Fisher Scientific).
Transfection. Cells were plated at 5-15K the day prior to transfection in a 96-well plate coated with poly-D-lysine (BD Biocoat). Hepa 1-6 cells were transfected with Lipofectamine 3000 (Thermo Fisher Scientific), according to manufacturer's specifications. For attB insertion, 35.5 ng of each dual guide plasmid and 100 ng SpCas9-RT plasmid were delivered to each well.
Genomic DNA extraction, purification, and quantitation. DNA was harvested from transfected cells by removal of media, resuspension in 50 μL of QuickExtract (Lucigen), and incubation at 65° C. for 15 min, 68° C. for 15 min, and 98° C. for 10 min. Target regions were PCR amplified with NEBNext High-Fidelity 2× PCR Master Mix (NEB) based on the manufacturer's protocol. Barcodes and adapters for Illumina sequencing were added in a subsequent PCR amplification. Amplicons were pooled and prepared for sequencing on a MiSeq (Illumina). Reads were demultiplexed and analyzed with appropriate pipelines.
DNMT1-specific paired guides can yield higher levels of editing at mouse targets compared with prime editing (
Cell culture. Hepa 1-6 cells (American Type Culture Collection (ATCC)-CRL32156) were cultured in Dulbecco's Modified Eagle Medium with high glucose, sodium pyruvate, and GlutaMAX (Thermo Fisher Scientific), additionally supplemented with 10% (v/v) fetal bovine serum (FBS) and 1× penicillin-streptomycin (Thermo Fisher Scientific).
Transfection. Cells were plated at 5-15K the day prior to transfection in a 96-well plate coated with poly-D-lysine (BD Biocoat). Hepa 1-6 cells were transfected with Lipofectamine 3000 (Thermo Fisher Scientific), according to manufacturer's specifications. For attB insertion, 35.5 ng of each dual guide plasmid and 100 ng SpCas9-RT plasmid were delivered to each well.
Genomic DNA extraction, purification, and quantitation. DNA was harvested from transfected cells by removal of media, resuspension in 50 μL of QuickExtract (Lucigen), and incubation at 65° C. for 15 min, 68° C. for 15 min, and 98° C. for 10 min. Target regions were PCR amplified with NEBNext High-Fidelity 2× PCR Master Mix (NEB) based on the manufacturer's protocol. Barcodes and adapters for Illumina sequencing were added in a subsequent PCR amplification. Amplicons were pooled and prepared for sequencing on a MiSeq (Illumina). Reads were demultiplexed and analyzed with appropriate pipelines.
The attB integration efficiency obtained using paired guides exceeds that of most combinations of a distinct single atgRNA plus nicking guide (
The integration of cargo genes with the PASTE system using paired guides instead of an atgRNA and nicking guide was assessed. Paired guides, encoded in the sequences presented in Tables 4 and 5, were designed to target either the human or mouse NOLC1 locus.
Cell culture. HEK293FT cells (American Type Culture Collection (ATCC)-CRL32156) were cultured in Dulbecco's Modified Eagle Medium with high glucose, sodium pyruvate, and GlutaMAX (Thermo Fisher Scientific), additionally supplemented with 10% (v/v) fetal bovine serum (FBS) and 1× penicillin-streptomycin (Thermo Fisher Scientific).
Transfection. Cells were plated at 5-15K the day prior to transfection in a 96-well plate coated with poly-D-lysine (BD Biocoat). HEK293FT cells were transfected with Lipofectamine 3000 (Thermo Fisher Scientific), according to manufacturer's specifications. For PASTE insertions, 18 ng of each dual guide plasmid, 64 ng cargo plasmid, and 100 ng SpCas9-RT-BXB1 encoding plasmid were delivered to each well.
Genomic DNA extraction and purification. DNA was harvested from transfected cells by removal of media, resuspension in 50 μL of QuickExtract (Lucigen), and incubation at 65° C. for 15 min, 68° C. for 15 min, and 98° C. for 10 min. After thermocycling, lysates were purified via addition of 45 μL of AMPure magnetic beads (Beckman Coulter), mixing, and two 75% ethanol wash steps. After purification, genomic DNA was eluted in 25 μL water.
Genome editing quantification by digital droplet polymerase chain reaction (ddPCR). To quantify PASTE editing efficiency by digital droplet PCR, 24 μL solutions were prepared in a 96-well plate containing: 1) 12 μL 2× ddPCR Supermix for Probes (Bio-Rad); 2) primers for amplification of the integration junction at 250 nM-900 nM; 3) FAM probe for detection of the integration junction amplicon at 250 nM; 4) 1.44 μL RPP30 HEX reference mix (Bio-Rad); 5) 0.12 μL FastDigest restriction enzyme for degradation of primer off-targets (Thermo Fisher); and 6) sample DNA at 1-10 ng/μL. 20 μL of reaction mix was transferred to a DG8 Cartridge (Bio-Rad) and loaded into a QX200 droplet generator (Bio-Rad). 40 μL droplets suspended in ddPCR droplet reader oil were transferred to a new 96-well plate and thermocycled according to manufacturer's specifications. Lastly, the 96-well plate was transferred to a QX200 droplet reader (Bio-Rad) and the generated data were analyzed using QuantaSoft Analysis Pro to quantify DNA editing.
Paired guides used in conjunction with the PASTE system at the mouse NOLC1 locus demonstrated higher integration efficiency of a cargo polypeptide (i.e., eGFP) relative to a single atgRNA guide plus nicking guide (
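The quantification itself is performed by the vendor software; the sketch below only illustrates the standard Poisson correction that underlies droplet-based quantification, under stated assumptions. It assumes the FAM channel reports the integration junction, the HEX channel reports the RPP30 reference, and one reference copy corresponds to one target allele. The droplet volume of 0.85 nL is an assumed nominal value, and the droplet counts are invented for illustration.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean copies per droplet: lambda = -ln(1 - positive/total)."""
    if total <= 0 or positive < 0 or positive >= total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def copies_per_microliter(positive, total, droplet_volume_nl=0.85):
    """Concentration in the reaction; droplet volume is an assumed nominal value."""
    return copies_per_droplet(positive, total) / (droplet_volume_nl * 1e-3)


def percent_integration(fam_positive, hex_positive, total):
    """Junction (FAM) copies as a percentage of reference (HEX) copies.

    The droplet volume cancels in the ratio, so it is not needed here.
    """
    reference = copies_per_droplet(hex_positive, total)
    if reference == 0:
        raise ValueError("no reference-positive droplets")
    return 100.0 * copies_per_droplet(fam_positive, total) / reference


# Invented example counts: 1,000 FAM-positive and 5,000 HEX-positive of 20,000 droplets.
example_percent = percent_integration(1000, 5000, 20000)
```

The Poisson term matters most at high occupancy: a simple ratio of positive droplet counts would give 20% here, whereas the corrected estimate is lower because reference-positive droplets more often contain multiple copies.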
Material and Methods—NOLC Mouse Locus
Cell culture. Hepa 1-6 cells (American Type Culture Collection (ATCC)-CRL32156) were cultured in Dulbecco's Modified Eagle Medium with high glucose, sodium pyruvate, and GlutaMAX (Thermo Fisher Scientific), additionally supplemented with 10% (v/v) fetal bovine serum (FBS) and 1× penicillin-streptomycin (Thermo Fisher Scientific).
Transfection. Cells were plated at 5-15K the day prior to transfection in a 96-well plate coated with poly-D-lysine (BD Biocoat). Hepa 1-6 cells were transfected with Lipofectamine 3000 (Thermo Fisher Scientific), according to manufacturer's specifications. For attB insertion, 35.5 ng of each dual guide plasmid and 100 ng SpCas9-RT plasmid were delivered to each well. For PASTE insertion, 19 ng of each dual guide plasmid, 97 ng of the PASTE plasmid (PASTEv1 or PASTEv3), and 65 ng of the template plasmid were used.
Genomic DNA extraction, purification, and quantitation. DNA was harvested from transfected cells by removal of media, resuspension in 50 μL of QuickExtract (Lucigen), and incubation at 65° C. for 15 min, 68° C. for 15 min, and 98° C. for 10 min. Target regions were PCR amplified with NEBNext High-Fidelity 2× PCR Master Mix (NEB) based on the manufacturer's protocol. Barcodes and adapters for Illumina sequencing were added in a subsequent PCR amplification. Amplicons were pooled and prepared for sequencing on a MiSeq (Illumina). Reads were demultiplexed and analyzed with appropriate pipelines.
Genome editing quantification by digital droplet polymerase chain reaction (ddPCR). To quantify PASTE editing efficiency by digital droplet PCR, 24 μL solutions were prepared in a 96-well plate containing: 1) 12 μL 2× ddPCR Supermix for Probes (Bio-Rad); 2) primers for amplification of the integration junction at 250 nM-900 nM; 3) FAM probe for detection of the integration junction amplicon at 250 nM; 4) 1.44 μL RPP30 HEX reference mix (Bio-Rad); 5) 0.12 μL FastDigest restriction enzyme for degradation of primer off-targets (Thermo Fisher); and 6) sample DNA at 1-10 ng/μL. 20 μL of reaction mix was transferred to a DG8 Cartridge (Bio-Rad) and loaded into a QX200 droplet generator (Bio-Rad). 40 μL droplets suspended in ddPCR droplet reader oil were transferred to a new 96-well plate and thermocycled according to manufacturer's specifications. Lastly, the 96-well plate was transferred to a QX200 droplet reader (Bio-Rad) and the generated data were analyzed using QuantaSoft Analysis Pro to quantify DNA editing.
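The reaction composition above can be checked with a small volume budget. The fixed volumes and final concentrations are taken from the text; the primer and probe stock concentrations (20 μM and 10 μM) are hypothetical values chosen only to make the arithmetic concrete.

```python
# Volume budget for one 24 uL ddPCR reaction (fixed volumes from the text).
REACTION_VOLUME_UL = 24.0
FIXED_COMPONENTS_UL = {
    "ddPCR Supermix for Probes (2x)": 12.0,
    "RPP30 HEX reference mix": 1.44,
    "FastDigest restriction enzyme": 0.12,
}


def remaining_volume(total_ul, fixed_components):
    """Volume left for primers, probe, sample DNA, and water."""
    return round(total_ul - sum(fixed_components.values()), 2)


def stock_volume(final_nm, stock_nm, total_ul):
    """Volume of stock needed to reach final_nm in total_ul (C1 * V1 = C2 * V2)."""
    return round(final_nm / stock_nm * total_ul, 3)


left_over_ul = remaining_volume(REACTION_VOLUME_UL, FIXED_COMPONENTS_UL)

# Hypothetical stocks: 20 uM primer, 10 uM probe (not stated in the source).
primer_ul = stock_volume(final_nm=900, stock_nm=20_000, total_ul=REACTION_VOLUME_UL)
probe_ul = stock_volume(final_nm=250, stock_nm=10_000, total_ul=REACTION_VOLUME_UL)
```

Under these assumed stocks, the volume not taken by two primers and the probe remains available for sample DNA and water.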
Paired guides used in conjunction with the PASTE system at the human NOLC1 locus demonstrated higher integration efficiency of a cargo polypeptide (i.e., eGFP) relative to a single atgRNA guide plus nicking guide (
An AdV vector cocktail to package the complete PASTE-paired guide system (i.e., Cas9-reverse transcriptase-integrase, paired guides, and genetic cargo) in viral vectors was assessed. Upon packaging and delivering the PASTE-paired guide system components across 3 AdV vectors, percent integration of eGFP at the mouse NOLC1 locus in Hepa 1-6 cells was measured by digital droplet PCR.
Material and Methods—Adenoviral delivery of PASTE and Paired Guides
Cell culture. Hepa 1-6 cells were cultured in Dulbecco's Modified Eagle Medium with high glucose, sodium pyruvate, and GlutaMAX (Thermo Fisher Scientific), additionally supplemented with 10% (v/v) fetal bovine serum (FBS) and 1× penicillin-streptomycin (Thermo Fisher Scientific).
Transfection. Cells were plated at 5-15K the day prior to transfection in a 96-well plate coated with poly-D-lysine (BD Biocoat). HEK293FT cells were transfected with Lipofectamine 3000 (Thermo Fisher Scientific), according to manufacturer's specifications. For PASTE insertions, 18 ng of each dual guide plasmid, 64 ng cargo plasmid, and 100 ng SpCas9-RT-BXB1 encoding plasmid were delivered to each well.
Genomic DNA extraction and purification. DNA was harvested from transfected cells by removal of media, resuspension in 50 μL of QuickExtract (Lucigen), and incubation at 65° C. for 15 min, 68° C. for 15 min, and 98° C. for 10 min. After thermocycling, lysates were purified via addition of 45 μL of AMPure magnetic beads (Beckman Coulter), mixing, and two 75% ethanol wash steps. After purification, genomic DNA was eluted in 25 μL water.
Genome editing quantification by digital droplet polymerase chain reaction (ddPCR). To quantify PASTE editing efficiency by digital droplet PCR, 24 μL solutions were prepared in a 96-well plate containing: 1) 12 μL 2× ddPCR Supermix for Probes (Bio-Rad); 2) primers for amplification of the integration junction at 250 nM-900 nM; 3) FAM probe for detection of the integration junction amplicon at 250 nM; 4) 1.44 μL RPP30 HEX reference mix (Bio-Rad); 5) 0.12 μL FastDigest restriction enzyme for degradation of primer off-targets (Thermo Fisher); and 6) sample DNA at 1-10 ng/μL. 20 μL of reaction mix was transferred to a DG8 Cartridge (Bio-Rad) and loaded into a QX200 droplet generator (Bio-Rad). 40 μL droplets suspended in ddPCR droplet reader oil were transferred to a new 96-well plate and thermocycled according to manufacturer's specifications. Lastly, the 96-well plate was transferred to a QX200 droplet reader (Bio-Rad) and the generated data were analyzed using QuantaSoft Analysis Pro to quantify DNA editing.
AdV production and transduction. Adenoviral vectors were cloned using the AdEasy-1 system obtained from Addgene. Briefly, SpCas9-RT-P2A-Blast, Bxb1 and guide RNAs, and an EGFP cargo gene were cloned into separate adenoviral template backbones and recombined to add the full Adenoviral genome with the AdEasy-1 plasmid in BJ5183 E. coli cells. These recombined plasmids were sent to Vector BioLabs for commercial production. Additional adenoviral vectors were produced for in vivo experiments by the University of Massachusetts Medical School Viral Vector Core, as previously described (PMID: 31043560).
eGFP integration into the attB site was observed using SpCas9-RT-P2A-Blast Bxb1 and paired guides at the mouse NOLC locus in a Hepa 1-6 cell line, with either the paired guide labeled “mouse NOLC1 region forward pair with rev 38bp AttB guide 7+2” or the paired guide labeled “mouse NOLC1 region forward pair with rev 38bp AttB guide 5.”
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
All patents and publications cited herein are incorporated by reference herein in their entirety.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 63/363,310, filed Apr. 20, 2022. The entire content of the above-referenced patent application is incorporated by reference in its entirety herein.
This invention was made with government support under EB031957 and AI49694 awarded by the National Institutes of Health. The government has certain rights in the invention.
Number | Date | Country
---|---|---
63363310 | Apr 2022 | US