The instant application contains a Sequence Listing which has been submitted electronically in XML file format and is hereby incorporated by reference in its entirety. Said XML copy, created on Oct. 18, 2022, is named 733339_083474-024_SL.xml and is 66,696,250 bytes in size.
Genome editing systems have developed as a promising technology for the development of therapeutic tools. Systems such as CRISPR/Cas9, TALEN, and zinc finger proteins have been used to alter the genomes of organisms. However, these systems are limited by a number of factors, including size, cargo capacity, and targeting ability.
Retrotransposons are mobile elements that insert themselves into the genome of a host through an RNA intermediate. This is in contrast to the mechanism of most DNA transposons, which directly insert themselves into a host genome. Retrotransposons are categorized as long terminal repeat (LTR) retrotransposons and non-LTR retrotransposons.
Non-LTR retrotransposons are among the most frequently occurring transposable elements in the eukaryotic genome. They can be either randomly inserting or site-specific. Site-specific non-LTR retrotransposons are generally characterized by the presence of specific activities: reverse transcriptase activity, DNA nicking activity, and nucleic acid binding activity. The genetic loci for these activities are found in either a single open reading frame (ORF) or split between two ORFs. The DNA nicking activity of single-ORF systems is found with restriction-like endonuclease (RLE) domains. Multiple non-LTR retrotransposon families, such as the R2, R4, R5, R8, R9, Dong and Cre families, are categorized as RLE-containing non-LTR retrotransposons.
Of the known non-LTR retrotransposons, the most well studied is the R2 element. The R2 element comprises R2 RNA and the R2 protein. The R2 element contains a single open reading frame (ORF), which encodes a reverse transcriptase and an endonuclease and includes DNA binding regions and zinc finger motifs. R2 inserts itself into a host genome through a mechanism known as Target Primed Reverse Transcription (TPRT), which is a stepwise reaction including a first nick of host DNA, reverse transcription of the R2 RNA into the first strand, a second nick of host DNA, and synthesis of a second strand.
The mechanism by which the R2 element inserts into a host genome, being independent of endogenous cellular repair pathways, as well as the capacity to carry an RNA molecule of varying sizes to a host genome, makes the R2 element a potentially powerful genome editing system. However, the R2 element specifically inserts itself into either the 28S or 18S ribosomal RNA locus. Therefore, it lacks the ability to target insertions to a particular locus, which is a critical aspect for viable genome editing systems. Other site-specific retrotransposons are similarly limited to particular loci. There remains an unmet need for a genome editing system that is capable of directed insertion of large nucleic acids into a host genome.
The present disclosure is directed to a genome editing system comprising: i) an R2 element enzyme; and ii) a payload RNA, wherein the payload RNA comprises an insertion template and optionally one or more of a 5′ homology region, a 3′ homology region, and a protein binding element, wherein the insertion template comprises a sequence for a nucleic acid insertion into the genome, and wherein the R2 element enzyme comprises a reverse transcriptase domain, and a nickase domain.
In some embodiments, the R2 element enzyme further comprises a targeting domain. In some embodiments, the targeting domain is a natural targeting domain or an engineered targeting domain. In some embodiments, the nucleic acid insertion into the genome is a DNA or RNA insertion template. In some embodiments, the R2 element enzyme is a modified R2 element enzyme. In some embodiments, the coding sequence of the R2 element enzyme is modified. In some embodiments, the modified R2 element enzyme is modified by an N-terminal or C-terminal truncation of the R2 element enzyme sequence. In some embodiments, the modified R2 element enzyme comprises a linker. In some embodiments, the linker is an XTEN linker.
In some embodiments, the genome editing system targets a genomic locus. In some embodiments, the genome editing system targets a genomic locus other than the 28S rRNA locus. In some embodiments, an N-terminal zinc finger domain of the R2 element enzyme is modified to target a genomic locus other than the 28S rRNA locus. In some embodiments, a non-naturally occurring targeting region is fused to the N-terminus of the R2 element enzyme or inserted into the R2 element enzyme.
In some embodiments, the modified R2 element enzyme is a fusion protein. In some embodiments, the modified R2 element is fused to a Cas9 protein that is fully active, catalytically dead (H840A/D10A for SpCas9), or functioning as a nickase (H840A or D10A for SpCas9). In some embodiments, the modified R2 element is fused to a Cas12 protein that is fully active, catalytically dead, or functioning as a nickase. In some embodiments, the modified R2 element is fused to a TALEN protein, zinc finger protein, argonaute, or meganuclease protein.
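The SpCas9 activity states named above can be illustrated with a minimal Python sketch. It assumes only what the paragraph states: the double substitution H840A/D10A gives a catalytically dead protein, and either substitution alone gives a nickase. The function name `spcas9_activity` and the string labels are hypothetical and are used for illustration only.

```python
# Hypothetical helper: classify an SpCas9 variant from its point mutations,
# following the mapping stated in the text (D10A and/or H840A).
CATALYTIC_MUTATIONS = {"D10A", "H840A"}

def spcas9_activity(mutations):
    """Return 'fully active', 'nickase', or 'catalytically dead'."""
    present = set(mutations) & CATALYTIC_MUTATIONS
    if present == CATALYTIC_MUTATIONS:
        return "catalytically dead"   # both H840A and D10A
    if present:
        return "nickase"              # exactly one of H840A or D10A
    return "fully active"             # neither substitution

if __name__ == "__main__":
    for variant in ([], ["D10A"], ["H840A"], ["D10A", "H840A"]):
        print(variant, "->", spcas9_activity(variant))
```

Mutations other than D10A and H840A are ignored by this sketch; it does not model any other Cas9 ortholog or the Cas12 variants.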
In some embodiments, the genome editing system further comprises a guide RNA. In some embodiments, the 5′ homology region of the payload RNA is engineered to target a genomic locus other than the 28S rRNA locus. In some embodiments, the 5′ homology region, the 3′ homology region, or both the 5′ and 3′ homology region target an exogenously introduced landing sequence.
In some embodiments, the insertion region is introduced into the genome of a specific cell type. In some embodiments, the specific cell type is a post-mitotic cell. In some embodiments, the genome editing system functions in post-mitotic cells. In some embodiments, the genome editing system functions independently from intrinsic nucleic acid repair systems.
In some embodiments, the payload RNA template further comprises a 5′ untranslated region (UTR), a 3′ UTR, or both a 5′ UTR and a 3′ UTR. In some embodiments, the 5′ homology region and the 3′ homology region are located between the 5′ UTR and 3′ UTR. In some embodiments, the 5′ homology region and the 3′ homology region are located outside the 5′ UTR and 3′ UTR. In some embodiments, the payload RNA further comprises a 5′ untranslated region (UTR), a 3′ UTR, or both a 5′ and a 3′ UTR, wherein the UTRs are truncated. In some embodiments, the payload RNA does not comprise a 5′ UTR. In some embodiments, the payload RNA does not comprise a 3′ UTR.
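The alternative payload RNA arrangements described above can be sketched as an ordered list of elements. The following Python example is illustrative only: the function `payload_layout`, its parameters, and the placement of the insertion template at the center of the payload are assumptions made for the sketch and are not taken from the disclosure.

```python
# Hypothetical sketch: 5'-to-3' order of payload RNA elements.
# Assumption: the insertion template sits between the flanking elements.
def payload_layout(five_utr=True, three_utr=True,
                   five_hr=True, three_hr=True,
                   homology_outside_utrs=False):
    """Return the ordered list of elements present in the payload RNA."""
    utr5 = ["5' UTR"] if five_utr else []
    utr3 = ["3' UTR"] if three_utr else []
    hr5 = ["5' homology region"] if five_hr else []
    hr3 = ["3' homology region"] if three_hr else []
    core = ["insertion template"]
    if homology_outside_utrs:
        # Homology regions located outside the UTRs
        return hr5 + utr5 + core + utr3 + hr3
    # Homology regions located between the UTRs
    return utr5 + hr5 + core + hr3 + utr3

if __name__ == "__main__":
    print(payload_layout())
    print(payload_layout(homology_outside_utrs=True))
    print(payload_layout(five_utr=False))  # payload without a 5' UTR
```

Truncated UTRs, nuclear retention elements, and guide RNA extensions are not modeled here; they could be added as further optional list entries.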
In some embodiments, the payload RNA further comprises a nuclear retention element. In some embodiments, the payload RNA further comprises a Cas9 or Cas12 guide RNA, wherein the Cas9 or Cas12 guide RNA comprises an extension with a 5′ homology sequence, a 3′ homology sequence, a 5′ untranslated region (UTR), a 3′ UTR, an insertion template, or any combination thereof. In some embodiments the nucleic acid insertion template is a sequence of greater than 1000 base pairs.
In some embodiments, the R2 element enzyme comprises a nuclear localization signal (NLS).
In some embodiments, the insertion region comprises a template for a reporter gene, a transcription factor gene, a transgene, an enzyme gene, or a therapeutic gene.
The present disclosure is also directed to a method of inserting a large nucleic acid into a genome within a cell using a Cas9 or Cas12 fusion protein, wherein the method comprises supplying a Cas9 or Cas12 fusion protein to a cell, wherein the Cas9 or Cas12 fusion protein is supplied with a payload RNA template, wherein the RNA template is reverse transcribed by the Cas9 or Cas12 fusion protein prior to being inserted into the genome of the cell; and wherein the large nucleic acid is inserted into the genome of the cell.
In some embodiments, the Cas9 fusion protein comprises a Cas9 portion and an R2 element portion. In some embodiments, the Cas9 fusion protein comprises a targeting domain, a reverse transcriptase domain, and a nickase domain. In some embodiments, the Cas12 fusion protein comprises a Cas12 portion and an R2 element portion.
The disclosure is also directed to a method of inserting an exogenous nucleic acid into the genome of a post-mitotic cell, wherein the method comprises subjecting the genome of the post-mitotic cell to a modified Cas9 protein that inserts the exogenous nucleic acid into the genome of the post-mitotic cell. In some embodiments, the modified Cas9 protein is fused to an R2 element enzyme. In some embodiments, the modified Cas9 fusion protein targets an endogenous landing site. In some embodiments, the Cas9 fusion protein targets an exogenously introduced landing site in the genome of the post-mitotic cell.
The disclosure is also directed to a method of editing a genome comprising subjecting the cell to the genome editing systems described above.
The disclosure is also directed to a composition comprising a cell edited by the genome editing systems or methods of editing genomes described above.
The disclosure is also directed to a genome editing system comprising: i) a payload RNA, wherein the payload RNA comprises an insertion template and optionally one or more of a 5′ homology region, a 3′ homology region, and a protein binding element, wherein the insertion template comprises a sequence for a nucleic acid insertion into the genome; ii) a non-LTR site specific retrotransposon element enzyme; wherein the non-LTR site specific retrotransposon element enzyme comprises a reverse transcriptase domain and, optionally, a nuclease or nickase domain, and wherein if the non-LTR-site specific retrotransposon element enzyme does not comprise the optional nuclease or nickase domain, the genome editing system further comprises iii) a nuclease or nickase enzyme. In some embodiments, the nuclease or nickase enzyme is a programmable nuclease or nickase. In some embodiments, the non-LTR site specific retrotransposon element enzyme further comprises a targeting domain. In some embodiments, the targeting domain is a natural targeting domain or an engineered targeting domain.
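The conditional structure of this system (a separate nuclease or nickase enzyme is required only when the retrotransposon element enzyme lacks its own nuclease or nickase domain) can be expressed as a short check. The Python sketch below is illustrative; the class and field names are hypothetical and do not correspond to any element of the disclosure beyond the components named above.

```python
from dataclasses import dataclass

@dataclass
class RetrotransposonEnzyme:
    # Hypothetical flags for the domains named in the text
    has_reverse_transcriptase: bool
    has_nuclease_or_nickase: bool = False

@dataclass
class EditingSystem:
    enzyme: RetrotransposonEnzyme
    has_payload_rna: bool
    has_separate_nuclease_or_nickase: bool = False

def is_complete(system):
    """True if the system has a payload RNA, a reverse transcriptase
    domain, and a nuclease/nickase activity from either source."""
    if not system.has_payload_rna:
        return False
    if not system.enzyme.has_reverse_transcriptase:
        return False
    return (system.enzyme.has_nuclease_or_nickase
            or system.has_separate_nuclease_or_nickase)

if __name__ == "__main__":
    rt_only = RetrotransposonEnzyme(has_reverse_transcriptase=True)
    print(is_complete(EditingSystem(rt_only, has_payload_rna=True)))
    print(is_complete(EditingSystem(rt_only, True, True)))
```

The sketch treats each component as a simple flag; it does not capture whether the separate enzyme is programmable or how it is delivered.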
The disclosure is also directed to a genome editing system where the non-LTR site specific retrotransposon comes from the R1, R2, R4, R5, R6, R7, R8, R9, CRE, NeSL, HERO, or Utopia families, or from the 9 family classifications established for RLE domain containing nLTR retrotransposons (
In some embodiments, the nucleic acid insertion into the genome is a DNA or RNA insertion template.
In some embodiments, the non-LTR site specific retrotransposon element enzyme is a modified non-LTR site specific retrotransposon element enzyme. In some embodiments, the coding sequence of the non-LTR site specific retrotransposon element enzyme is modified. In some embodiments, the modified non-LTR site specific retrotransposon element enzyme is modified by an N-terminal or C-terminal truncation of the non-LTR site specific retrotransposon element enzyme sequence.
In some embodiments, the modified non-LTR site specific retrotransposon element enzyme comprises a linker. In some embodiments, the linker is an XTEN linker.
The genome editing system of the disclosure targets a genomic locus. In some embodiments, the genome editing system targets a genomic locus other than the 28S rRNA locus. In some embodiments, an N-terminal zinc finger domain of the non-LTR site specific retrotransposon element enzyme is modified to target a genomic locus other than the 28S rRNA locus. In some embodiments, a non-naturally occurring targeting region is fused to the N-terminus of the non-LTR site specific retrotransposon element enzyme or inserted into the non-LTR site specific retrotransposon element enzyme.
In some embodiments, the modified non-LTR site specific retrotransposon element enzyme is a fusion protein. In some embodiments, the modified non-LTR site specific retrotransposon element is fused to a Cas9 protein that is fully active, catalytically dead (H840A/D10A for SpCas9), or functioning as a nickase (H840A or D10A for SpCas9). In some embodiments, the modified non-LTR site specific retrotransposon element is co-delivered with a Cas9 protein that is fully active, catalytically dead (H840A/D10A for SpCas9), or functioning as a nickase (H840A or D10A for SpCas9). In some embodiments, the modified non-LTR site specific retrotransposon element is fused to a Cas12, IscB, IsrB, or TnpB protein that is fully active, catalytically dead, or functioning as a nickase. In some embodiments, the modified non-LTR site specific retrotransposon element is delivered in trans with a Cas12, IscB, IsrB, or TnpB protein that is fully active, catalytically dead, or functioning as a nickase. In some embodiments, the modified non-LTR site specific retrotransposon element is fused to a TALEN protein, zinc finger protein, argonaute, or meganuclease protein.
In some embodiments, the genome editing system further comprises a guide RNA. In some embodiments, the genome editing system further comprises multiple guide RNAs.
In some embodiments, the genome editing system of the disclosure comprises a payload wherein the 5′ homology region, the 3′ homology region, or both the 5′ and 3′ homology region of the payload RNA is engineered to target a genomic locus other than the 28S rRNA locus. In some embodiments, the 5′ homology region, the 3′ homology region, or both the 5′ and 3′ homology region target an exogenously introduced landing sequence.
In some embodiments, the insertion region is introduced into the genome of a specific cell type. In some embodiments, the specific cell type is a post-mitotic cell, a non-dividing cell, or a quiescent cell. In some embodiments, the genome editing system functions in post-mitotic cells, non-dividing cells, or quiescent cells. In some embodiments, the genome editing system functions independently from intrinsic nucleic acid repair systems.
In some embodiments, the payload RNA template further comprises a 5′ untranslated region (UTR), a 3′ UTR, or both a 5′ UTR and a 3′ UTR. In some embodiments, the 5′ homology region and the 3′ homology region are located between the 5′ UTR and 3′ UTR. In some embodiments, the 5′ homology region and the 3′ homology region are located outside the 5′ UTR and 3′ UTR. In some embodiments, the payload RNA further comprises a 5′ untranslated region (UTR), a 3′ UTR, or both a 5′ and a 3′ UTR, wherein the UTRs are truncated. In some embodiments, the payload RNA does not comprise a 5′ UTR. In some embodiments, the payload RNA does not comprise a 3′ UTR. In some embodiments, the payload RNA further comprises a nuclear retention element. In some embodiments, the payload RNA further comprises a Cas9 or Cas12 guide RNA, and wherein the Cas9 or Cas12 guide RNA comprises an extension with a 5′ homology sequence, a 3′ homology sequence, a 5′ untranslated region (UTR), a 3′ UTR, an insertion template, or any combination thereof.
In some embodiments, the nucleic acid insertion template is a sequence of greater than 1000 base pairs.
In some embodiments, the genome editing system targets a genome for a deletion. In some embodiments, the deletions are between 1 and 150 bases.
In some embodiments, the non-LTR site specific retrotransposon element enzyme comprises a nuclear localization signal (NLS).
In some embodiments, the insertion region comprises a template for a reporter gene, a transcription factor gene, a transgene, an enzyme gene, or a therapeutic gene.
The disclosure is also directed to a method of inserting a large nucleic acid into a genome within a cell using a Cas9 or Cas12 fusion protein, wherein the method comprises supplying a Cas9 or Cas12 fusion protein to a cell, wherein the Cas9 or Cas12 fusion protein is supplied with a payload RNA template, wherein the RNA template is reverse transcribed by the Cas9 or Cas12 fusion protein prior to being inserted into the genome of the cell; and wherein the large nucleic acid is inserted into the genome of the cell. In some embodiments, the Cas9 fusion protein comprises a Cas9 portion and a non-LTR site specific retrotransposon element portion. In some embodiments, the Cas9 fusion protein comprises a targeting domain, a reverse transcriptase domain, and a nickase domain. In some embodiments, the Cas12 fusion protein comprises a Cas12 portion and a non-LTR site specific retrotransposon element portion.
The disclosure is also directed to a method of inserting an exogenous nucleic acid into the genome of a post-mitotic cell, wherein the method comprises subjecting the genome of the post-mitotic cell to a modified Cas9 protein that inserts the exogenous nucleic acid into the genome of the post-mitotic cell. In some embodiments, the modified Cas9 protein is fused to a non-LTR site specific retrotransposon element enzyme. In some embodiments, the modified Cas9 fusion protein targets an endogenous landing site. In some embodiments, the Cas9 fusion protein targets an exogenously introduced landing site in the genome of the post-mitotic cell.
The disclosure is also directed to a method of editing a genome comprising subjecting the cell to the genome editing system as described herein. The disclosure is also directed to a composition comprising the cell edited by the genome editing methods described herein.
The disclosure is also directed to a method of correcting a genetic mutation related to disease or human pathology, wherein the method comprises making small nucleotide changes or small nucleotide insertions (1-100 bp) in a human genome using the genome editing system of claim 1 or claim 47.
In some embodiments, the genome editing system is delivered via single or multi vector AAV, adenovirus, lentivirus, herpes simplex virus, PEG10 viral like particles, PNMA viral like particles, gag-like viral like particles, nanoblades, gesicles, or Friend murine leukemia virus (FMLV) viral like proteins.
In some embodiments, the components of the genome editing system are delivered as all RNA in lipid nanoparticles or another RNA delivery reagent. In some embodiments, the non-LTR site specific retrotransposon is delivered as mRNA. In some embodiments, the guide RNAs are delivered as synthetic RNA. In some embodiments, the payload is delivered as mRNA.
The disclosure is also directed to a genome editing system that targets and edits the genome at more than one site.
The present disclosure is directed to site specific non-Long Terminal Repeat (LTR) retrotransposons and systems incorporating these non-LTR retrotransposons for inserting large nucleic acids at targeted locations within a genome. The present disclosure is also directed to site-specific non-LTR retrotransposons and related systems for performing small nucleotide changes in a genome. In some embodiments, a small nucleotide change comprises a point mutation. In some embodiments, a small nucleotide change comprises a small nucleotide insertion.
The present disclosure is also directed to modified R2 fusion proteins for inserting large nucleic acids at targeted locations within a genome. The present disclosure is also directed to Cas9 fusion proteins for inserting large nucleic acids at targeted locations within a genome, which includes Cas9-R2 fusion proteins. In some embodiments, the genome is a human genome.
The present disclosure is also directed to the insertion of exogenous R2 landing sites within a genome, such that an R2 protein, modified R2 protein, or R2 fusion protein may target a non-28S locus for insertion of a large genetic element. In some embodiments, the R2 fusion protein is an R2-Cas9 fusion protein. In some embodiments, the R2 fusion protein is a Cas12-R2 fusion protein. In some embodiments, the R2 fusion protein is a TALEN-R2 fusion protein.
Unless stated otherwise, terms and techniques used within this application have the meaning generally known to one of skill in the art.
The term “about” as used herein is understood to modify the specified value. Unless explicitly stated otherwise, the term about is understood to modify the specified values +/−10%. As used herein, the term about applied to a range modifies both endpoints of the range. By way of example, a range of “about 5 to 10” is understood to mean “about 5 to about 10.”
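By way of a worked example, the +/−10% reading of "about" can be computed as follows. This Python sketch is illustrative only; the function names `about` and `about_range` are hypothetical, and the default tolerance of 0.10 reflects the definition above.

```python
# Hypothetical helpers illustrating the +/-10% reading of "about".
def about(value, tolerance=0.10):
    """Return the (low, high) bounds covered by 'about value'."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)

def about_range(low, high, tolerance=0.10):
    """'about low to high' is read as 'about low to about high':
    return the bounds for each endpoint separately."""
    return (about(low, tolerance), about(high, tolerance))

if __name__ == "__main__":
    print(about(100))          # bounds for "about 100"
    print(about_range(5, 10))  # bounds for "about 5 to 10"
```

Under this reading, "about 5 to 10" has a lower endpoint spanning 4.5 to 5.5 and an upper endpoint spanning 9 to 11.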
Unless explicitly stated otherwise, the term “payload” as used herein means at least a nucleic acid that may be integrated into a host genome. Thus, “payload RNA” will be understood to comprise an RNA molecule comprising at least an insertion region, wherein the insertion region can be integrated into a host genome.
As used herein, “cell-specific,” or “cell-type specific,” would be understood by one of skill in the art to mean occurring or being expressed at a higher frequency or existing at an increased level in one cell type in contrast to other cell types.
As used herein, the terms “target site” and “landing site” are used interchangeably unless specified otherwise.
Unless explicitly stated otherwise, the term “nucleic acid” is understood to refer to both ribonucleic acid (RNA) and deoxyribonucleic acid (DNA) molecules. This may include chemically synthesized nucleic acid molecules, single stranded or double stranded nucleic acid molecules, linearized nucleic acid molecules, circularized nucleic acid molecules, chemically modified nucleic acid molecules, and nucleic acids with biochemical modifications.
In addition to canonical single-ORF RLE domain containing non-LTR retrotransposons, such as R2, R4, R5, R8, R9, Dong, and Cre families, retrotransposons for use in or as part of the genome editing system described herein may also be characterized as part of a larger phylogenetic family. The retrotransposons in these larger phylogenetic families contemplated for use in or as a part of the genome editing systems described herein include the 8,248 RLE-domain containing retrotransposons uncovered as part of the computational analysis described in Example 7. These 8,248 retrotransposon-like orthologs are divided into 9 families, termed RLED1-RLED9. In some embodiments, the non-LTR retrotransposon is a member of the RLED1 family. In some embodiments, the non-LTR retrotransposon is a member of the RLED2 family. In some embodiments, the non-LTR retrotransposon is a member of the RLED3 family. In some embodiments, the non-LTR retrotransposon is a member of the RLED4 family. In some embodiments, the non-LTR retrotransposon is a member of the RLED5 family. In some embodiments, the non-LTR retrotransposon is a member of the RLED6 family. In some embodiments, the non-LTR retrotransposon is a member of the RLED7 family. In some embodiments, the non-LTR retrotransposon is a member of the RLED8 family. In some embodiments, the non-LTR retrotransposon is a member of the RLED9 family. In some embodiments, the non-LTR retrotransposon is a member of the R1 family. In some embodiments, the non-LTR retrotransposon is a member of the R2 family. In some embodiments, the non-LTR retrotransposon is a member of the R4 family. In some embodiments, the non-LTR retrotransposon is a member of the R5 family. In some embodiments, the non-LTR retrotransposon is a member of the R6 family. In some embodiments, the non-LTR retrotransposon is a member of the R7 family. In some embodiments, the non-LTR retrotransposon is a member of the R8 family.
In some embodiments, the non-LTR retrotransposon is a member of the R9 family. In some embodiments, the non-LTR retrotransposon is a member of the Cre family. In some embodiments, the non-LTR retrotransposon is a member of the NeSL family. In some embodiments, the non-LTR retrotransposon is a member of the HERO family. In some embodiments, the non-LTR retrotransposon is a member of the Utopia family.
Without limiting the instant disclosure to any one particular theory, R2 retrotransposons are thought to work via a mechanism known as target-primed reverse transcription, or “TPRT.” TPRT is a mechanism by which an endonuclease creates a nick in a first DNA strand at a specific location, creating a “primed” 3′ hydroxyl end for reverse transcription. After the initial DNA nick, an mRNA molecule is reverse transcribed by the reverse transcriptase.
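The priming and first-strand synthesis steps just described can be illustrated with a toy string model. The Python sketch below is a simplification made under stated assumptions: strands are plain 5′-to-3′ strings, the nick position is an arbitrary index, and annealing of the RNA to the target site is ignored. The function names are hypothetical and are not part of the disclosure.

```python
# Toy model of the first TPRT steps: nick a DNA strand, then extend the
# exposed 3' end with cDNA copied from an RNA template.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def nick(strand, position):
    """Split a 5'->3' DNA strand at `position`; the first piece ends in
    the free 3' hydroxyl that primes reverse transcription."""
    return strand[:position], strand[position:]

def first_strand_cdna(rna):
    """Return the cDNA (written 5'->3') complementary to a 5'->3' RNA."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))

def prime_and_extend(target_strand, nick_position, rna_template):
    """Extend the primed 3' end of the nicked strand with first-strand cDNA."""
    primer, _downstream = nick(target_strand, nick_position)
    return primer + first_strand_cdna(rna_template)

if __name__ == "__main__":
    print(first_strand_cdna("AUGC"))
    print(prime_and_extend("GGATCC", 3, "AUGC"))
```

The second nick and second-strand synthesis are not modeled; the sketch only shows that the nicked target strand serves as the primer and that the new DNA is the reverse complement of the RNA template.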
In some embodiments, the R2 element enzyme is modified. In some embodiments, the R2 element enzyme is modified by an N-terminal truncation of the R2 element enzyme sequence, a C-terminal truncation of the R2 element enzyme sequence, or both an N-terminal and a C-terminal truncation of the R2 element enzyme sequence.
In some embodiments, the R2 element enzyme is a fusion protein. In some embodiments, the R2 element enzyme comprises a fusion of an R2 protein with a Cas9 protein. In some embodiments, the R2 element enzyme comprises a fusion of an R2 protein with a Cas12 protein. In some embodiments, the R2 element enzyme comprises a fusion of an R2 protein with a Cas9 protein, wherein the Cas9 portion and the R2 protein portion are connected by a linker. In some embodiments, the R2 element enzyme comprises a fusion of an R2 protein with a Cas12 protein, wherein the Cas12 portion and the R2 protein portion are connected by a linker.
Protein binding elements of the disclosure can come in a multitude of forms. In one embodiment, a protein binding element may be an endogenous nucleic acid sequence. In one embodiment, a protein binding element may be an exogenous or introduced nucleic acid sequence. In one embodiment, the protein binding element may be a synthesized nucleic acid sequence.
In some embodiments, the genome editing system comprises a guide RNA. In some embodiments, the genome editing system comprises multiple guide RNAs. In some embodiments, the genome editing system comprises paired guide RNAs.
The R2 element naturally targets the 28S rRNA locus. The instant disclosure contemplates the insertion of payloads into either the 28S rRNA locus or into other genomic loci. In some embodiments, the insertion site is a targeted genomic insertion site. In some embodiments, the insertion site is targeted by a targeting domain in a fusion protein. In some embodiments, the insertion site has been exogenously introduced to the genome. In some embodiments, the insertion site has been exogenously introduced by a site-directed genome editing system that is not capable of delivering large genetic insertions. In some embodiments, the targeted genomic site is targeted for a point mutation. In some embodiments, the targeted genomic site is targeted for a small nucleotide insertion.
The instant disclosure also contemplates additional non-LTR site-specific retrotransposons for use in or as part of the genome editing system described herein that do not target the 28S rRNA locus. In some embodiments, the genome is targeted for a large genetic insertion. In some embodiments, the insertion site is a targeted genomic insertion site. In some embodiments, the insertion site is targeted by a targeting domain in a fusion protein. In some embodiments, the insertion site has been exogenously introduced to the genome. In some embodiments, the insertion site has been exogenously introduced by a site-directed genome editing system that is not capable of delivering large genetic insertions. In some embodiments, the targeted genomic site is targeted for a point mutation. In some embodiments, the targeted genomic site is targeted for a small nucleotide insertion.
Payloads of the instant disclosure may encode proteins, such as enzymes. In some embodiments, the payload may act as a regulatory element. Thus, if an embodiment of the disclosure states, by way of example, that “the payload comprises a therapeutic protein,” it is generally understood that the payload comprises a template that, upon insertion, will lead to expression of a therapeutic protein encoded by the template. Exemplary vectors for expression are shown in
In some embodiments, the insertion region comprises a template for a reporter gene. In some embodiments, the reporter gene encodes a fluorescent protein. In some embodiments, the reporter gene encodes a green fluorescent protein. In some embodiments, the reporter gene encodes eGFP.
In some embodiments, the insertion region comprises a template for a transcription factor gene.
In some embodiments, the insertion region comprises a template for a transgene.
In some embodiments, the insertion region comprises a template for an enzyme gene, or a therapeutic gene. In some embodiments, the therapeutic protein can be used in conjunction with another therapeutic.
In some embodiments, the payload comprises a protein that is capable of converting one cell type to another.
In some embodiments, the payload comprises a protein that is capable of killing a specific cell type. In some embodiments, the payload comprises a protein that is capable of killing a tumor cell. In some embodiments, the payload comprises an immune modulating protein.
In some embodiments, the payload comprises a 5′UTR. In some embodiments, the payload comprises a 3′UTR. In some embodiments, the payload comprises a 5′UTR and a 3′ UTR. In some embodiments, the payload consists of a 5′UTR. In some embodiments, the payload consists of a 3′UTR. In some embodiments, the payload comprises a 5′UTR and a 5′ homology region. In some embodiments, the payload comprises a 3′UTR and a 3′ homology region. In some embodiments, the payload comprises a 5′UTR, a 5′ homology region, a 3′UTR and a 3′ homology region. In some embodiments, the payload comprises a 5′ homology region, a 3′UTR and a 3′ homology region. In some embodiments, the payload comprises a 5′UTR, a 5′ homology region, and a 3′ homology region. In some embodiments, the payload comprises a 5′ homology region and a 3′ homology region. In some embodiments, the 3′ homology region comprises less than 30 base pairs. In some embodiments the 3′ homology region comprises less than 20 base pairs. In some embodiments, the 3′ homology region comprises less than 10 base pairs. In some embodiments, the 3′ homology region comprises less than 5 base pairs.
The instant disclosure contemplates programmable nucleases or nickases for use in or as a part of the genome editing systems described herein. In some embodiments, the programmable nuclease or nickase is a Cas9 protein. In some embodiments, the programmable nuclease or nickase is a Cas12 protein. In some embodiments, the programmable nuclease or nickase is IscB. In some embodiments, the programmable nuclease or nickase is IsrB. In some embodiments, the programmable nuclease or nickase is TnpB. In some embodiments, the programmable nuclease or nickase is a TALEN nuclease. In some embodiments, the programmable nuclease or nickase is fused to the non-LTR site-specific retrotransposon element. In some embodiments, the programmable nuclease or nickase is non-covalently linked to the non-LTR site-specific retrotransposon element. In some embodiments, the programmable nuclease or nickase acts in cis with the non-LTR site-specific retrotransposon element. In some embodiments, the programmable nuclease or nickase acts in trans with the non-LTR site-specific retrotransposon element.
In some embodiments, the payload results in the insertion of a therapeutic gene into a host genome. In some embodiments, the therapeutic gene is intended to treat a neurological disorder or a neurodegenerative disorder. In some embodiments, the therapeutic gene is intended to treat cancer. In some embodiments, the therapeutic gene is intended to treat an autoimmune disorder.
In some embodiments, the payload results in the insertion of a therapeutic gene for treating a genetically inherited disease. In some embodiments, the genetically inherited disease is Meier-Gorlin syndrome. In some embodiments, the genetically inherited disease is Seckel syndrome 4. In some embodiments, the genetically inherited disease is Joubert syndrome 5. In some embodiments, the genetically inherited disease is Leber congenital amaurosis 10. In some embodiments, the genetically inherited disease is Charcot-Marie-Tooth disease, type 2. In some embodiments, the genetically inherited disease is leukoencephalopathy. In some embodiments, the genetically inherited disease is Usher syndrome, type 2C. In some embodiments, the genetically inherited disease is spinocerebellar ataxia 28. In some embodiments, the genetically inherited disease is glycogen storage disease type III. In some embodiments, the genetically inherited disease is primary hyperoxaluria, type I. In some embodiments, the genetically inherited disease is long QT syndrome 2. In some embodiments, the genetically inherited disease is Sjögren-Larsson syndrome. In some embodiments, the genetically inherited disease is hereditary fructosuria. In some embodiments, the genetically inherited disease is neuroblastoma. In some embodiments, the genetically inherited disease is amyotrophic lateral sclerosis type 9. In some embodiments, the genetically inherited disease is Kallmann syndrome 1. In some embodiments, the genetically inherited disease is limb-girdle muscular dystrophy, type 2L. In some embodiments, the genetically inherited disease is familial adenomatous polyposis 1. In some embodiments, the genetically inherited disease is familial type 3 hyperlipoproteinemia. In some embodiments, the genetically inherited disease is Alzheimer's disease, type 1. In some embodiments, the genetically inherited disease is metachromatic leukodystrophy. In some embodiments, the genetically inherited disease is cancer. 
In some embodiments, the genetically inherited disease is Uveitis. In some embodiments, the genetically inherited disease is SCA1. In some embodiments, the genetically inherited disease is SCA2. In some embodiments, the genetically inherited disease is FUS-Amyotrophic Lateral Sclerosis (ALS). In some embodiments, the genetically inherited disease is MAPT-Frontotemporal Dementia (FTD). In some embodiments, the genetically inherited disease is Myotonic Dystrophy Type 1 (DM1). In some embodiments, the genetically inherited disease is Diabetic Retinopathy (DR/DME). In some embodiments, the genetically inherited disease is Oculopharyngeal Muscular Dystrophy (OPMD). In some embodiments, the genetically inherited disease is SCA8. In some embodiments, the genetically inherited disease is C9ORF72-Amyotrophic Lateral Sclerosis (ALS). In some embodiments, the genetically inherited disease is SOD1-Amyotrophic Lateral Sclerosis (ALS). In some embodiments, the genetically inherited disease is SCA6. In some embodiments, the genetically inherited disease is SCA3 (Machado-Joseph Disease). In some embodiments, the genetically inherited disease is Multiple System Atrophy (MSA). In some embodiments, the genetically inherited disease is Treatment-resistant Hypertension. In some embodiments, the genetically inherited disease is Myotonic Dystrophy Type 2 (DM2). In some embodiments, the genetically inherited disease is Fragile X-associated Tremor Ataxia Syndrome (FXTAS). In some embodiments, the genetically inherited disease is West Syndrome with ARX Mutation. In some embodiments, the genetically inherited disease is Age-related Macular Degeneration (AMD)/Geographic Atrophy (GA). In some embodiments, the genetically inherited disease is C9ORF72-Frontotemporal Dementia (FTD). In some embodiments, the genetically inherited disease is Facioscapulohumeral Muscular Dystrophy (FSHD). In some embodiments, the genetically inherited disease is Fragile X Syndrome (FXS).
In some embodiments, the genetically inherited disease is Huntington's Disease. In some embodiments, the genetically inherited disease is Glaucoma. In some embodiments, the genetically inherited disease is Acromegaly. In some embodiments, the genetically inherited disease is Achromatopsia (total color blindness). In some embodiments, the genetically inherited disease is Ullrich congenital muscular dystrophy. In some embodiments, the genetically inherited disease is Hereditary myopathy with lactic acidosis. In some embodiments, the genetically inherited disease is X-linked spondyloepiphyseal dysplasia tarda. In some embodiments, the genetically inherited disease is Neuropathic pain (Target: CPEB). In some embodiments, the genetically inherited disease is Persistent Inflammation and injury pain (Target: PABP). In some embodiments, the genetically inherited disease is Neuropathic pain (Target: miR-30c-5p). In some embodiments, the genetically inherited disease is Neuropathic pain (Target: miR-195). In some embodiments, the genetically inherited disease is Friedreich's Ataxia. In some embodiments, the genetically inherited disease is Uncontrolled gout. In some embodiments, the genetically inherited disease is Inflammatory pain (Target: Nav1.7 and Nav1.8). In some embodiments, the genetically inherited disease is Choroideremia. In some embodiments, the genetically inherited disease is Focal epilepsy. In some embodiments, the genetically inherited disease is Alpha-1 Antitrypsin deficiency (AATD). In some embodiments, the genetically inherited disease is Androgen Insensitivity Syndrome. In some embodiments, the genetically inherited disease is Opioid-induced hyperalgesia (Target: Raf-1). In some embodiments, the genetically inherited disease is Neurofibromatosis type 1. In some embodiments, the genetically inherited disease is Stargardt's Disease. In some embodiments, the genetically inherited disease is Dravet Syndrome. 
In some embodiments, the genetically inherited disease is Retinitis Pigmentosa. In some embodiments, the genetically inherited disease is Hemophilia A (factor VIII). In some embodiments, the genetically inherited disease is Hemophilia B (factor IX). In some embodiments, the genetically inherited disease is Parkinson's Disease.
In some embodiments, the linker is a polypeptide linker. In some embodiments, the linker is a non-peptide linker. In some embodiments, the linker comprises a polypeptide portion and a non-peptide portion. In some embodiments, the linker comprises an extended recombinant polypeptide (XTEN). In some embodiments, the linker comprises the amino acid sequence (Gly4Ser)n (SEQ ID NO: 33380), where n is an integer. In some embodiments, the linker comprises the amino acid sequence (Gly4Ser)n, wherein n is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 (SEQ ID NO: 33381). In some embodiments, the linker comprises the amino acid sequence (Gly4Ser)n, wherein n is greater than 10 (SEQ ID NO: 33382). In some embodiments, the linker comprises a synthetic portion. In some embodiments, the linker comprises polyethylene glycol (PEG). In some embodiments, the linker is a synthetic linker. In some embodiments, the linker comprises the amino acid sequence (Gly2Ser)n, wherein n is an integer. In some embodiments, the linker comprises the amino acid sequence (Gly2Ser)n, wherein n is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 (SEQ ID NO: 33383). In some embodiments, the linker comprises the amino acid sequence (Gly2Ser)n, wherein n is greater than 10 (SEQ ID NO: 33384). In some embodiments, the linker comprises the amino acid sequence (Ser-Gly-Gly-Ser)n (SEQ ID NO: 33385), where n is an integer. In some embodiments, the linker comprises the amino acid sequence (Ser-Gly-Gly-Ser)n, wherein n is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 (SEQ ID NO: 33386). In some embodiments, the linker comprises the amino acid sequence (Ser-Gly-Gly-Ser)n, wherein n is greater than 10 (SEQ ID NO: 33387). In some embodiments, the linker comprises the amino acid sequence (Glu-Ala-Ala-Ala-Lys)n (SEQ ID NO: 33388), wherein n is an integer. In some embodiments, the linker comprises the amino acid sequence (Glu-Ala-Ala-Ala-Lys)n, wherein n is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 (SEQ ID NO: 33389).
In some embodiments, the linker comprises the amino acid sequence (Glu-Ala-Ala-Ala-Lys)n, wherein n is greater than 10 (SEQ ID NO: 33390). In some embodiments, the linker comprises a proline linker.
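By way of illustration only, the repeat-based polypeptide linkers described above can be written out in one-letter amino acid code for a chosen value of n. The following is a minimal sketch; the dictionary `LINKER_UNITS` and the function `build_linker` are hypothetical names introduced here for illustration and are not part of the disclosure or of the Sequence Listing.

```python
# Illustrative sketch: expand the (unit)n linker notation into a one-letter sequence.
# LINKER_UNITS and build_linker are hypothetical names used only for this example.

LINKER_UNITS = {
    "Gly4Ser": "GGGGS",
    "Gly2Ser": "GGS",
    "Ser-Gly-Gly-Ser": "SGGS",
    "Glu-Ala-Ala-Ala-Lys": "EAAAK",
}


def build_linker(unit_name: str, n: int) -> str:
    """Return the linker formed by repeating the named unit n times."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return LINKER_UNITS[unit_name] * n


if __name__ == "__main__":
    # Print each repeat unit expanded three times.
    for name in LINKER_UNITS:
        print(name, build_linker(name, 3))
```

A linker with n greater than 10 is obtained the same way, for example `build_linker("Gly4Ser", 12)`.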
The present disclosure relates to a method of editing a genome using a genome editing system. The present disclosure also relates to the method of editing a genome using a genome editing system, wherein the genome editing system comprises i) an R2 element enzyme, and ii) a payload RNA; wherein the payload RNA comprises one or more of a 5′ homology region, a 3′ homology region, a protein binding element, and an insertion region; wherein the insertion region comprises a template for a small or large nucleic acid insertion into the genome; and wherein the R2 element enzyme comprises a targeting domain, a reverse transcriptase domain, and a nickase domain.
In some embodiments, the target genome is in a eukaryotic cell. In some embodiments, the targeted genome is in a mammalian cell. In some embodiments, the targeted genome is in a dividing mammalian cell. In some embodiments, the targeted genome is in a non-dividing cell. In some embodiments, the targeted genome is in a quiescent cell.
In some embodiments, the genome editing system targets a genomic position for deletion rather than editing. In some embodiments, the genome editing system targets a genomic site for deletion that is between 1 and 150 nucleotides. In some embodiments, the genome editing system comprises a payload RNA with a 5′ homology region and a 3′ homology region, wherein the 5′ homology region and the 3′ homology region are positioned to delete the genomic target. In some embodiments, the genome editing system is capable of deleting a genomic target and inserting a novel nucleic acid region into the genome concurrently.
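The positioning of homology regions around a deletion target can be pictured with a short string-based sketch. It is a simplification under stated assumptions: the target is treated as a single DNA-sense string, strand orientation and the RNA nature of the payload are ignored, and the function names `design_homology_arms` and `expected_edit` are hypothetical.

```python
# Illustrative sketch: homology arms that flank a deletion window, and the
# sequence expected after a concurrent deletion and insertion.
# All names are hypothetical; strand orientation is ignored for simplicity.

def design_homology_arms(target: str, site: int, deletion_len: int, arm_len: int):
    """Return (five_prime_arm, three_prime_arm) flanking the deleted window.

    site is the 0-based index of the first base to delete.
    """
    if deletion_len < 0 or arm_len < 1:
        raise ValueError("deletion_len must be >= 0 and arm_len must be >= 1")
    left_start = site - arm_len
    right_end = site + deletion_len + arm_len
    if left_start < 0 or right_end > len(target):
        raise ValueError("homology arms extend beyond the target sequence")
    five_prime_arm = target[left_start:site]
    three_prime_arm = target[site + deletion_len:right_end]
    return five_prime_arm, three_prime_arm


def expected_edit(target: str, site: int, deletion_len: int, insert: str) -> str:
    """Return the target with the window removed and the insert in its place."""
    return target[:site] + insert + target[site + deletion_len:]


if __name__ == "__main__":
    genome = "A" * 10 + "CCCCC" + "G" * 10
    print(design_homology_arms(genome, site=10, deletion_len=5, arm_len=4))
    print(expected_edit(genome, site=10, deletion_len=5, insert="TT"))
```

Setting `deletion_len` to 0 corresponds to a pure insertion, while larger values move the two arms apart so that the intervening bases are absent from the edited product.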
The present disclosure relates to compositions, wherein the composition comprises a cell, and wherein the cell comprises a genome that has been edited using a genome editing system.
Exemplary sequences of payload UTRs and target homologies are provided in Table 1.
Exemplary sequences of Cas9 guides are provided in Table 2.
Exemplary sequences of NGS, gel primers, and Sanger primers are provided in Table 3.
Exemplary sequences of ddPCR primers and probes are provided in Table 4.
Having now described the present disclosure in detail, the same will be more clearly understood by reference to the following examples. The following examples are included solely for purposes of illustration and are not considered limiting embodiments. All patents and publications referred to herein are expressly incorporated by reference.
To determine the ability of animal R2 elements to integrate into the human genome, HEK293FT cells were transfected with specific plasmids containing the zebra finch (Taeniopygia guttata) R2 element (R2Tg), a payload, or both the R2Tg plasmid and a payload plasmid. Following isolation of DNA from transfected cells, those cells transfected with an R2Tg plasmid and an eGFP payload (eGFP flanked by UTR regions and 100 bp homology to the human R2 locus) showed a distinct PCR product (
Following successful insertion of an eGFP payload, the features of the R2 system that could increase integration efficiency were examined. In the following experiments, unless otherwise stated, three plasmids are used. The first plasmid contains at least an R2 protein. The second plasmid contains at least a portion of a payload reporter. The third plasmid contains at least R2 landing sites.
The R2 landing site plasmids contain R2 landing sites of variable size. This size is indicated in the format 26/3 (
Following transfection of these three plasmids with varying length of R2 landing sites, integration was measured by luminescence, indicating integration of the luminescent payload (
Next, the tolerability of mutations within the R2 landing sites was tested.
After determining that short landing sites could provide for efficient integration, the effect of insertion homology length (to the landing sites) on integration efficiency was evaluated. To test the effect of homology length on integration efficiency, HEK293FT cells were transfected with three separate plasmids. The first plasmid contained an R2 protein encoding region, the second plasmid encoded a partial (inactive) luciferase reporter region and R2 landing sites, and the third plasmid encoded a luciferase insertion as well as regions of homology of varying number of base pairs homologous to the R2 landing site in the second plasmid. Cells were then treated with aphidicolin, which blocks cell division and thus also stops Homology Directed Repair (HDR). Without being bound to any one theory, by blocking HDR, integration is more likely to occur due to an R2 related mechanism.
When treated with 1 μM, 5 μM, or 25 μM aphidicolin (or DMSO control) (
When flanking regions (UTR and additional homology region) were increased in size to 100 bp (
An overview of the role homology of the payload plays in integration efficiency (as measured by luminescent readout) is seen in
Also, the effect of truncations of the 5′ and 3′UTRs from the payload portion (
Next, we evaluated how solely altering the 3′ homology regions would affect integration efficiency. In this experiment, HEK293FT cells were transfected with 3 plasmids. The first plasmid contained an R2 protein encoding region. The second plasmid contained a partial luciferase reporter with wtR2 landing sites. The third plasmid contained a luciferase insertion with alterations to the 3′ UTR, as named on the x-axis (
Next, we evaluated whether modifications to the R2 protein could increase integration efficiency. First, permissible domains within the R2 protein into or onto which various additional moieties could be fused were identified. As before, three plasmids were introduced into HEK293FT cells. The first plasmid contained an R2 protein which contained different GFP variants at different points along the R2 protein. The second plasmid encoded a partial (inactive) luciferase reporter region and R2 landing sites, and the third plasmid encoded a luciferase insertion as well as regions of homology to the R2 landing site in the second plasmid.
These variant R2 proteins were modified by inserting GFP variants throughout the length of the protein, beginning from the N-terminus. For example, LNK1_1 is located closer to the N-terminus than is LNK1_7. LNK_nt indicates a fusion to the N-terminus, while LNK_ct indicates a fusion to the C-terminus. As seen in
Whether R2 could integrate a payload using a “short target” with truncated R2 landing sites (26/3 bp) was investigated.
Also, the matter of whether the addition of a nuclear localization signal would increase integration efficiency at the Beta-actin locus of HEK293FT cells (
After determining the ability of a nuclear localization signal to boost integration, the primary localization of transfected R2 proteins into HEK293FT cells was evaluated.
Thus, modifying the R2 protein portion can allow for greater integration efficiency. To further study integration efficiency, a fluorescent GFP reporter responsive to R2 activity (
Using this fluorescent readout approach, the efficiency of integration was evaluated using flow cytometry. HEK293FT cells were transfected with specific plasmids. These samples were wild-type R2 (
Next, the matter of whether the truncation of the R2 protein resulted in alteration of integration efficiency was studied. The N-terminal portion of the R2 protein was serially truncated, as indicated by the vertical lines in
Also, the matter of whether C-terminal truncations of the R2 protein may result in viable R2 proteins that sustain integration efficiency (
Whether ablation of the restriction-like endonuclease (RLE) domain would affect integration activity was then studied. HEK293FT cells were transfected with three plasmids. The first plasmid contains a partial luciferase reporter with wtR2 landing sites (26/22 bp). The second plasmid encodes either a wild type R2 protein or an RLE deficient R2 protein. The third plasmid encodes a luciferase payload. Absence of the RLE domain in the R2 protein almost completely abolishes integration relative to a wild-type R2 protein (
Finally, whether certain other domains of the R2 protein could be removed or modified without adverse effect was evaluated.
This Example tested whether the payload itself could be modified to sustain nuclear localization.
Further, whether the UTR elements of the payload were necessary for their integration, or if they may be modified, was studied. In this experiment, HEK293FT cells were transfected with three plasmids. The first plasmid encoded an R2 protein. The second plasmid encoded a partial luciferase reporter and wtR2 landing sites. The third plasmid contained the luciferase payload and any of many UTR modifications (
Also, whether R2 fusion proteins and fusion proteins with linkers were viable for use in genome editing was evaluated. In this experiment, HEK293FT cells were transfected with 3 plasmids. The first plasmid contained an R2 protein (with or without an NLS) fused to a Cas9 protein connected by an XTEN linker (16 amino acids in length) at various points through the N-terminal portion of the R2 protein (see
Lastly, whether these Cas9-R2 fusion proteins were capable of editing human genomes was determined. HEK293FT cells were stably transfected with an eGFP precursor gene with a 20 bp deletion. As such, the reporter is inactive until the 20 base pairs are inserted into the precursor.
The ORFs of 4,464 eukaryotic assemblies (animals and protists) from GenBank for RT, CCHC zinc finger, and RLE domains of known retroelements were also examined. Using a computational pipeline (
We found that families varied in length, with the longer family 3 and 5 ORFs having mean lengths of 1,390 and 1,280 residues, respectively, and family 4 containing shorter ORFs, with a mean length of 966 residues (
The distance to the predicted insertion site, which is indicative of UTR length, also varied substantially, with family 1, 8, and 9 having the least distance between up and downstream annotations and ORF, suggesting shorter UTR lengths (Fig. S1C). While families 1, 3, 5, 6, 7, 8, and 9 associate with previously identified orthologs and subfamilies, families 2 and 4 had no association to known subfamilies (Table 5).
We next examined the preferred integration sites for these families. Family 1 exhibited a preference for integrating into 28S and 18S rRNA gene sites; family 3 exhibited a preference for integrating into 5S and likely spliced leader sequences; families 4, 6, and 9 exhibited a preference for integrating into tandem repeats and microsatellites, including novel repeat sequences; family 5 exhibited a preference for integrating into snRNA gene loci and some tRNA preferences; family 7 exhibited a preference for integrating into tRNA; and family 8 exhibited a preference for integrating into 28S loci (Table 1). Family 2 has an unknown integration site preference. Accordingly, the zinc finger motifs across these different families are divergent (
Clusters showed two reverse transcriptase (RT) architectures, with families 3 and 4 containing broad RT-like domains, and all other families containing more specific non-LTR retrotransposon RT domains (
We next investigated retroelements which had discordant 5′ and 3′ homologies. We found multiple instances of discordant homologies, including in family 1, which has members with 5′ small subunit rRNA preferences and 3′ large subunit rRNA preferences, and family 5 which contains systems with 5′ SL1 splicing leader preferences and 3′ U2 small nuclear RNA (snRNA) target preferences (
To elucidate divergent target preferences, Rfam annotations were made around all members (
We also heterologously reconstituted site-specific retrotransposition in human cells to model integration preferences and retargeting in eukaryotic genomes. We synthesized a panel of 12 retrotransposon ORFs from our computational exploration, selecting a sample that included R2 elements with demonstrated activity in mammalian cells (R2Ol) (A. Kuroki-Kami, et al. 2019 Mob. DNA. 10, 23; Su, et al., 2019. RNA. 25, 1432-1438), experimentally characterized retrotransposon groups without proven activity in mammalian cells (R2Bm), previously computationally described retrotransposons (R2Ci, R2Tg, R2Is, R2Pap, R2Dr, R2Tsp, HeroDr) (Kojima et al., 2016 PLoS One. 11, e0163496), and novel retrotransposons (R10Mbr, R2Toc, R2Mes) (Table 6), which all ranged in sequence similarity between 13%-67% (
To evaluate the native targeting capacity of these candidates for the 28S loci, we developed a plasmid reporter containing 200 bp of the 28S target with upstream expression of the N-terminus of Gaussia luciferase (Gluc) and delivered a payload containing an exon with 28S homology, predicted UTRs for corresponding orthologs, and a C-terminal Gluc fragment. This system enabled readout of insertion efficiency by luciferase production, and we found that only a limited subset (R2Bm, R2Tg, and R2Mes) had native activity from insertion of this heterologous Gluc cargo in HEK293FT cells (
Table 6 (species column, as recovered, in table order): Danio rerio; Myotis brandtii; Phlebotomus papatasi; Bombyx mori; Mesoligia furuncula; Ciona intestinalis; Ixodes scapularis; Trichinella spiralis; Taeniopygia guttata; Talpa occidentalis; Danio rerio; Oryzias latipes.
As R2Tg had the highest insertion activity, we continued to explore the programmability of this R2 system. The characterization of R2Tg enzymatic activities and payload flexibility at the 28S locus and a reprogrammed target in human cells were assessed (
Having determined that the R2TgZF2mut mutant ablated integration, we speculated that supplementing additional DNA binding or nicking activity could rescue R2 integration activity at the 28S target site. We mutated Cas9 from Streptococcus pyogenes to generate either nickase (SpCas9H840A) or dead (SpCas9D10A,H840A) variants and fused these Cas9 variants via an XTEN linker to the N-terminus of R2TgΔ1-184,ZF2mut, which contains both a truncation that retains activity (Δ1-184) and the inactivating ZF2 domain mutation (ZF2mut). We then designed Cas9 guides against the 28S target region and coupled these with the R2Tg variants (
Given that the homology of the RNA template is a strong determinant of the target site, we probed the necessary homology for integration. We tested iterative truncations of either the 5′ or 3′ homology regions (
The malleable constraints of RNA cargo homology, especially at the 3′ end, prompted us to test cargo components. We next tested whether priming could occur internally to cargo, which would allow for successful integration after swapping the UTR and homology regions. Successful insertion from internal homology allows for scarless integration, with significant gene editing applications (
While permutations of cargo components and complete removal of the 3′ UTR were tolerated, deletion of the 5′ UTR region resulted in significantly lower integration rates (
We next programmed the R2Tg system to integrate at different loci by swapping target homologies (
To find optimal payloads for efficient insertion at new loci, we designed a panel of payloads following integration guidelines that were effective at the 28S locus (
As some R2 retrotransposons have been proposed to function as a homodimer upon binding their cognate RNA templates (Yang et al., 1998. Mol. Cell. Biol. 18, 3455-3465), we were motivated to explore whether dual guides on opposing DNA strands might emulate dual nicking and recruitment of R2Tg and stimulate more efficient integration. Comparing single and dual guides, we found that certain paired guides achieved up to 15% integration with minimal indels generated and near perfect integration >99% using payloads with 100 nt of homology (
We next determined whether diverse non-LTR retrotransposons could be repurposed for integration in cells despite failing at the 28S locus. To compensate for potentially ineffective binding or cleavage at the 28S locus in mammalian cells, we fused a panel of 11 additional retrotransposon candidates to SpCas9H840A and tested them for additional guided insertion improvements at the 28S locus. We found that several of the retrotransposons, including many without activity at 28S target, had significant increases in 28S insertion when paired with targeting guides (
We modified corresponding payloads for scarless insertion by rearranging homology regions internal to UTRs, and reprogrammed homology regions and SpCas9 guides to target payloads to the AAVS1 locus (
After developing our SpCas9H840A-R2Toc-based insertion system, which we refer to as Site-specific Target-primed Insertion via Targeted CRISPR Homing of Retroelements (STITCHR), we explored multiple applications for STITCHR-based programmable gene insertion in mammalian cells. To generalize STITCHR reprogramming to other loci beyond AAVS1, we targeted the NOLC1 and SERPINA1 loci with panels of single and dual guides, finding that dual guides achieved integration efficiencies of up to 13% and 10% at NOLC1 and SERPINA1, respectively (
To take advantage of scarless genome insertion with R2Toc, we investigated whether we could place an EGFP tag in-frame to a protein target. We chose NOLC1 due to its distinct nuclear organization and designed our template in the reverse direction to prevent constitutive expression of the EGFP off the template cargo (
Multiple types of genomic edits by STITCHR, including single base edits, small insertions, and a range of large payload insertions are enabled by the flexible nature of the retrotransposon insertion pathway (
To test STITCHR activity in a non-dividing context, we inhibited HDR using the cell cycling inhibitor aphidicolin, which traps cells at the G1/S phase transition and inhibits HDR activity. We found that STITCHR integration of an EGFP cargo at the NOLC1 locus was not inhibited by increasing concentrations of aphidicolin and led to increases in efficiency at intermediate aphidicolin concentrations. In contrast, HDR integration by SpCas9 nuclease at the EMX1 locus was inhibited by up to 94% by aphidicolin (
To extend STITCHR to multiplexed editing without reliance on cell division, we investigated whether STITCHR could mediate multiplexed integration at two different sites in the genome. We simultaneously delivered guide RNAs and cargos targeting the AAVS1 and NOLC1 loci for Gluc and EGFP insertion, respectively, finding that multiplexed insertion was possible with 12% and 6% integration at the AAVS1 and NOLC1 loci, respectively (
We also examined STITCHR in the context of a concurrent insertion/deletion approach. We compared SpCas9H840A-R2Toc, using a single fixed guide RNA (N4, see Table 7) to target the NOLC1 locus, with the non-targeting SpCas9H840A alone. An EGFP insert was used as a payload. When homology arms on the payload template were separated by 0 bp, 50 bp, 100 bp, or 150 bp, we were able to delete the genomic target while concurrently inserting the EGFP payload into the NOLC1 locus (
We further examined the possibility of using STITCHR to create single nucleotide edits and small nucleotide insertions. SpCas9H840A-R2Toc was used with dual guides N4 and N8 (N8 Sequence: GGGAACCACGCGGCGAATGC (SEQ ID NO: 33429)) with a payload of either a GFP insert (
We next tested whether the nuclease activity of the genome editing system had to be provided in cis or if it could be provided in trans. An EGFP payload (with 50 nt homology arms) was used in conjunction with a SpCas9H840A-R2Toc, in which Cas9 is fused to the R2Toc element, targeting the NOLC1 locus with dual N4 and N8 guides, and was compared to the non-targeting SpCas9H840A. In addition, the nuclease activity conferred by SpCas9H840A was also examined with separate (trans) expression of R2Toc (
HEK293FT cells (ATCC) were cultured in Dulbecco's Modified Eagle Medium with 4.5 g/l glucose, sodium pyruvate, GlutaMAX (Thermo Fisher Scientific) and supplemented with 10% (v/v) fetal bovine serum (FBS) and 1× penicillin-streptomycin (Thermo Fisher Scientific). Cells were maintained below confluency at 37° C. and 5% CO2.
Cells were transfected in 96 well poly-D-Lysine plates (Corning) 16-24 h after plating at a confluency of 70% using Lipofectamine 3000 according to the manufacturer's protocol. In brief, 50 ng R2-expressing plasmid, 50 ng cargo plasmid, 50 ng reporter plasmid (optional) and 30 ng of sgRNA-expressing plasmids were transfected. 72 h post transfection, genomic DNA was isolated by removing media and adding 50 μl QuickExtract (Lucigen) per well. After a 5 min incubation at room temperature, the lysate was transferred to a 96 well PCR plate and incubated at 65° C. for 15 min, 68° C. for 15 min, and 98° C. for 10 min and used as input for targeted deep sequencing. Lysates were further purified using AMPure magnetic beads (Beckman Coulter) according to the manufacturer's protocol and eluted in 25 μL water, if used as input for ddPCR or NGS-based assays.
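When the same plasmid combination is transfected into several wells, the per-well amounts above can be scaled into a single mix. The sketch below is a convenience calculation only. It assumes that the 30 ng of sgRNA-expressing plasmids is a per-well total, and the 10% overage default, the dictionary `PER_WELL_NG`, and the function `dna_per_condition` are hypothetical and not taken from the protocol.

```python
# Illustrative sketch: scale the per-well plasmid amounts to a multi-well mix.
# PER_WELL_NG values come from the protocol text; the overage default is an
# assumption made for this example only.

PER_WELL_NG = {
    "R2-expressing plasmid": 50,
    "cargo plasmid": 50,
    "reporter plasmid": 50,   # optional in the protocol
    "sgRNA-expressing plasmids": 30,
}


def dna_per_condition(n_wells: int, include_reporter: bool = True,
                      overage: float = 0.1) -> dict:
    """Return nanograms of each plasmid needed for n_wells, plus overage."""
    if n_wells < 1 or overage < 0:
        raise ValueError("n_wells must be >= 1 and overage must be >= 0")
    scale = n_wells * (1 + overage)
    return {
        name: ng * scale
        for name, ng in PER_WELL_NG.items()
        if include_reporter or name != "reporter plasmid"
    }


if __name__ == "__main__":
    # Amounts for three replicate wells with the default overage.
    for name, ng in dna_per_condition(3).items():
        print(f"{name}: {ng:.1f} ng")
```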
Editing Quantification by Next Generation Sequencing or ddPCR
Insertion efficiencies into plasmid and genomic DNA were quantified using a 3-primer assay. Here, a forward primer was combined with two reverse primers, one of which binds in the uninserted DNA and the other in inserted DNA. The forward and two reverse primers in a 2:1:1 ratio were added at a total combined concentration of 0.5 μM for a first round PCR counting 20 cycles. A second round PCR with 12 cycles added barcoded primers for Illumina NGS. The 28S, AAVS1, and SERPINA1 experiments were quantified by 3 primer NGS for total integration and indel rates. For NOLC1, the 3-primer assay was used for analyzing indels associated with integration events and the WT locus. NOLC1 total integration was assayed by digital droplet PCR (ddPCR) as described below.
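The arithmetic of the 3-primer assay can be sketched as follows. The primer split follows directly from the stated 2:1:1 ratio and 0.5 μM combined concentration. The efficiency formula, by contrast, is an assumption made for illustration (inserted reads as a fraction of inserted plus uninserted reads); the analysis pipeline actually used is not specified here, and the function names are hypothetical.

```python
# Illustrative sketch of the 3-primer assay arithmetic.
# primer_concentrations follows the stated 2:1:1 ratio at 0.5 uM combined.
# insertion_efficiency is an assumed read-fraction formula, not the
# documented analysis pipeline.

def primer_concentrations(total_um: float = 0.5, ratio=(2, 1, 1)):
    """Split a combined primer concentration (uM) according to ratio.

    Returns (forward, reverse_uninserted, reverse_inserted) in uM.
    """
    parts = sum(ratio)
    return tuple(total_um * r / parts for r in ratio)


def insertion_efficiency(inserted_reads: int, uninserted_reads: int) -> float:
    """Return inserted reads as a percentage of all classified reads."""
    total = inserted_reads + uninserted_reads
    if total == 0:
        raise ValueError("no reads to quantify")
    return 100.0 * inserted_reads / total


if __name__ == "__main__":
    print(primer_concentrations())
    print(insertion_efficiency(150, 850))
```

Under these assumptions the forward primer is at 0.25 μM and each reverse primer at 0.125 μM. A read-fraction estimate of this kind does not correct for any amplification bias between the inserted and uninserted amplicons.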
To quantify NOLC1 integration efficiency by digital droplet PCR, 24 μL solutions were prepared in a 96-well plate containing 1) 12 μL 2× ddPCR Supermix for Probes (Bio-Rad), 2) primers for amplification of the integration junction at 250 nM-900 nM, 3) FAM probe for detection of the integration junction amplicon at 250 nM, 4) 1.44 μL RPP30 HEX reference mix (Bio-Rad), 5) 0.12 μL FastDigest restriction enzyme for degradation of primer off-targets (Thermo Fisher), and 6) sample DNA at 1-10 ng/μL. A 20 μL portion of the reaction mix was transferred to a DG8 Cartridge (Bio-Rad) and loaded into a QX200 droplet generator (Bio-Rad). Droplets (40 μL) suspended in ddPCR droplet reader oil were transferred to a new 96-well plate and thermocycled according to the manufacturer's specifications. Lastly, the 96-well plate was transferred to a QX200 droplet reader (Bio-Rad) and the generated data were analyzed using QuantaSoft Analysis Pro to quantify DNA editing.
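For orientation, the Poisson correction that generally underlies droplet-based quantification can be sketched as below. This is not the QuantaSoft implementation. The function names are hypothetical, and the sketch assumes that the FAM junction signal and the RPP30 HEX reference are counted over the same accepted droplets and that the reference and target loci are present at the same copy number per genome, which may not hold for a given cell line.

```python
# Illustrative sketch of Poisson-corrected droplet quantification.
# Not the QuantaSoft algorithm; names and the equal-copy-number assumption
# are introduced for this example only.
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Mean template copies per droplet from the fraction of positive droplets."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("require total > 0 and 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def integration_percent(fam_positive: int, hex_positive: int, total: int) -> float:
    """Integration junction copies as a percentage of reference copies."""
    reference = copies_per_droplet(hex_positive, total)
    if reference == 0:
        raise ValueError("no reference-positive droplets")
    target = copies_per_droplet(fam_positive, total)
    return 100.0 * (target / reference)


if __name__ == "__main__":
    # Example counts chosen arbitrarily for demonstration.
    print(round(integration_percent(100, 1000, 20000), 2))
```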
SpCas9H840A has the potential to improve insertion through recruitment and supplementation of nicking activity (
A panel of payloads was designed to optimize payload design for efficient insertion at retargeted loci. The panel was designed to target the NOLC1 locus to expand upon our initial findings from R2Tg natural insertion at the 28S locus (
Additionally, insertion activity was tested at the endogenous AAVS1 locus using SpCas9H840A-R2Tg fusion proteins with different R2Tg protein truncations. C-terminal truncations were found to be not tolerated, whereas the 1-184 residue N-terminal truncation of R2Tg retained activity while offering a more compact version of the SpCas9H840A-R2Tg fusion (
This application claims the benefit of U.S. Provisional Patent Application Ser. Nos. 63/262,714 and 63/371,246, filed on Oct. 19, 2021, and Aug. 12, 2022, respectively, the entire disclosures of which are incorporated herein by reference.
This invention was made with Government support under Grant No. R21 AI149694 awarded by the National Institutes of Health (NIH) and under Grant No. R01 EB031957. The Government has certain rights in this invention.
| Number | Date | Country |
|---|---|---|
| 63262714 | Oct 2021 | US |
| 63371246 | Aug 2022 | US |