COMPOSITIONS AND METHODS FOR INCREASING HOMOLOGY-DIRECTED REPAIR

Abstract
The present disclosure provides compositions comprising a gene-editing polypeptide, a single-stranded donor DNA, and one or more staple oligonucleotides. The present disclosure provides compositions comprising a DNA nanostructure and a gene-editing polypeptide. The present disclosure provides gene editing methods using the compositions. The present disclosure provides methods of using the compositions to produce a genetically modified cell. The present disclosure provides kits useful for carrying out gene editing.
Description
INTRODUCTION

Molecular self-assembly with scaffolded DNA origami offers a route for folding nucleic acid molecules in user-defined ways to generate DNA nanostructures. A DNA nanostructure comprises a single-stranded DNA that is folded into a distinct shape via oligonucleotides termed “staples.”


Engineered nuclease systems can be used to cleave a target DNA at a specified location. Examples of engineered nuclease systems include TALENs, zinc finger nucleases, meganucleases, and CRISPR-Cas systems. CRISPR-Cas systems include Cas proteins, which are involved in acquisition, targeting and cleavage of foreign DNA or RNA, and one or more guide RNAs, each of which includes a segment that binds a Cas protein and a segment that binds to a target nucleic acid. For example, Class 2 CRISPR-Cas systems comprise a single Cas protein bound to a guide RNA (gRNA), where the Cas protein binds to and cleaves a targeted nucleic acid. Introduction of a break in a nucleic acid (e.g., a genome) can facilitate the introduction of a donor nucleic acid. The programmable nature of these nuclease systems has made them a versatile technology for modification of target nucleic acids.


SUMMARY

The present disclosure provides compositions comprising a gene-editing polypeptide, a single-stranded donor DNA, and one or more staple oligonucleotides. The present disclosure provides compositions comprising a DNA nanostructure and a gene-editing polypeptide. The present disclosure provides gene editing methods using the compositions. The present disclosure provides methods of using the compositions to produce a genetically modified cell. The present disclosure provides kits useful for carrying out gene editing.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1A-1D depict use of DNA nanostructures for integration of large DNA insertions via homology-directed repair (HDR).



FIG. 2A-2C depict DNA origami to generate highly compact repair templates.



FIG. 3A-3P provide amino acid sequences of various CRISPR/Cas effector polypeptides.



FIG. 4 provides an example of a nucleotide sequence of interest in a donor DNA.



FIG. 5 provides an example of a nucleotide sequence of interest in a donor DNA.



FIG. 6 provides examples of homology arms for insertion into chr9:107,422,356/hg38.



FIG. 7 provides nucleotide sequences of examples of homology arms for insertion into chr1:179,826,688/hg38.



FIG. 8A-8D depict design, purification and characterization of a DNA nanostructure encoding mNeon for integration into the human genome.



FIG. 9A-9D depict entry into the nucleus and integration into the genome via HDR of DNA nanostructures compared to unstructured ssDNA.



FIG. 10A-10D depict recruitment of CRISPR Cas9 to the ends of templates using shuttles.



FIG. 11A-11E depict incorporation of DNA nanostructures encoding a human genome into human primary cells.



FIG. 12 depicts an atomic force microscope (AFM) image of a 5 kb ssDNA encoding IL2Ra, folded into a long 18-helix bundle.





DEFINITIONS

The terms “polynucleotide” and “nucleic acid,” used interchangeably herein, refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, this term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.


By “hybridizable” or “complementary” or “substantially complementary” it is meant that a nucleic acid (e.g. RNA, DNA) comprises a sequence of nucleotides that enables it to non-covalently bind, i.e. form Watson-Crick base pairs and/or G/U base pairs, “anneal”, or “hybridize,” to another nucleic acid in a sequence-specific, antiparallel, manner (i.e., a nucleic acid specifically binds to a complementary nucleic acid) under the appropriate in vitro and/or in vivo conditions of temperature and solution ionic strength. Standard Watson-Crick base-pairing includes: adenine (A) pairing with thymine (T), adenine (A) pairing with uracil (U), and guanine (G) pairing with cytosine (C) [DNA, RNA]. In addition, for hybridization between two RNA molecules (e.g., dsRNA), and for hybridization of a DNA molecule with an RNA molecule (e.g., when a DNA target nucleic acid base pairs with a guide RNA, etc.): guanine (G) can also base pair with uracil (U). For example, G/U base-pairing is at least partially responsible for the degeneracy (i.e., redundancy) of the genetic code in the context of tRNA anti-codon base-pairing with codons in mRNA. Thus, in the context of this disclosure, a guanine (G) (e.g., of dsRNA duplex of a guide RNA molecule; of a guide RNA base pairing with a target nucleic acid, etc.) is considered complementary to both a uracil (U) and to a cytosine (C). For example, when a G/U base-pair can be made at a given nucleotide position of a dsRNA duplex of a guide RNA molecule, the position is not considered to be non-complementary, but is instead considered to be complementary.


Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein; and Sambrook, J. and Russell, W., Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (2001). The conditions of temperature and ionic strength determine the “stringency” of the hybridization.


Hybridization requires that the two nucleic acids contain complementary sequences, although mismatches between bases are possible. The conditions appropriate for hybridization between two nucleic acids depend on the length of the nucleic acids and the degree of complementarity, variables well known in the art. The greater the degree of complementarity between two nucleotide sequences, the greater the value of the melting temperature (Tm) for hybrids of nucleic acids having those sequences. For hybridizations between nucleic acids with short stretches of complementarity (e.g. complementarity over 35 or less, 30 or less, 25 or less, 22 or less, 20 or less, or 18 or less nucleotides) the position of mismatches can become important (see Sambrook et al., supra, 11.7-11.8). Typically, the length for a hybridizable nucleic acid is 8 nucleotides or more (e.g., 10 nucleotides or more, 12 nucleotides or more, 15 nucleotides or more, 20 nucleotides or more, 22 nucleotides or more, 25 nucleotides or more, or 30 nucleotides or more). Temperature, wash solution salt concentration, and other conditions may be adjusted as necessary according to factors such as length of the region of complementation and the degree of complementation.


It is understood that the sequence of a polynucleotide need not be 100% complementary to that of its target nucleic acid to be specifically hybridizable or hybridizable. Moreover, a polynucleotide may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a bulge, a loop structure or hairpin structure, etc.). A polynucleotide can comprise 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, 99% or more, 99.5% or more, or 100% sequence complementarity to a target region within the target nucleic acid sequence to which it will hybridize. For example, an antisense nucleic acid in which 18 of 20 nucleotides of the antisense compound are complementary to a target region, and would therefore specifically hybridize, would represent 90 percent complementarity. In this example, the remaining noncomplementary nucleotides may be clustered or interspersed with complementary nucleotides and need not be contiguous to each other or to complementary nucleotides. Percent complementarity between particular stretches of nucleic acid sequences within nucleic acids can be determined using any convenient method. Example methods include BLAST programs (basic local alignment search tools) and PowerBLAST programs (Altschul et al., J. Mol. Biol., 1990, 215, 403-410; Zhang and Madden, Genome Res., 1997, 7, 649-656), the Gap program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research Park, Madison Wis.), e.g., using default settings, which uses the algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489), and the like.
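The 18-of-20 example above can be reproduced with a short calculation. The following Python sketch is illustrative only and is not one of the alignment programs cited above: it assumes two equal-length DNA sequences aligned antiparallel without gaps, it ignores G/U pairing, and the function name and the 20-nucleotide target sequence are hypothetical.

```python
# Watson-Crick partners for DNA; this sketch ignores G/U wobble pairing.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(oligo: str, target_region: str) -> float:
    """Percent of oligo positions that pair with the target region.

    Both sequences are written 5'->3' and are aligned antiparallel
    with no gaps, so they must be the same length.
    """
    if len(oligo) != len(target_region):
        raise ValueError("sketch assumes equal-length, ungapped sequences")
    paired = sum(
        COMPLEMENT.get(o) == t
        for o, t in zip(oligo.upper(), reversed(target_region.upper()))
    )
    return 100.0 * paired / len(oligo)


# Hypothetical 20-nt target region (not a sequence from this disclosure).
target = "GATTACAGATTACAGATTAC"
# Fully complementary oligo: the reverse complement of the target.
perfect = "".join(COMPLEMENT[b] for b in reversed(target))
# Same oligo with its first two positions changed so they no longer pair.
mismatched = "AA" + perfect[2:]

print(percent_complementarity(perfect, target))     # 100.0
print(percent_complementarity(mismatched, target))  # 90.0
```

Real sequences of unequal length, or sequences with bulges or loops, would call for one of the alignment tools listed above rather than this position-by-position count.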


The terms “peptide,” “polypeptide,” and “protein” are used interchangeably herein, and refer to a polymeric form of amino acids of any length, which can include coded and non-coded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having modified peptide backbones.


A polynucleotide or polypeptide has a certain percent “sequence identity” to another polynucleotide or polypeptide, meaning that, when aligned, that percentage of bases or amino acids are the same, and in the same relative position, when comparing the two sequences. Sequence identity can be determined in a number of different ways. To determine sequence identity, sequences can be aligned using various convenient methods and computer programs (e.g., BLAST, T-COFFEE, MUSCLE, MAFFT, etc.), available over the world wide web at sites including ncbi.nlm.nih.gov/BLAST, ebi.ac.uk/Tools/msa/tcoffee/, ebi.ac.uk/Tools/msa/muscle/, mafft.cbrc.jp/alignment/software/. See, e.g., Altschul et al. (1990), J. Mol. Biol. 215:403-10.


A DNA sequence that “encodes” a particular RNA is a DNA nucleotide sequence that is transcribed into RNA. A DNA polynucleotide may encode an RNA (mRNA) that is translated into protein (and therefore the DNA and the mRNA both encode the protein), or a DNA polynucleotide may encode an RNA that is not translated into protein (e.g. tRNA, rRNA, microRNA (miRNA), a “non-coding” RNA (ncRNA), a guide RNA, etc.).


A “protein coding sequence” or a sequence that encodes a particular protein or polypeptide, is a nucleotide sequence that is transcribed into mRNA (in the case of DNA) and is translated (in the case of mRNA) into a polypeptide in vitro or in vivo when placed under the control of appropriate regulatory sequences.


The terms “DNA regulatory sequences,” “control elements,” and “regulatory elements,” used interchangeably herein, refer to transcriptional and translational control sequences, such as promoters, enhancers, polyadenylation signals, terminators, protein degradation signals, and the like, that provide for and/or regulate transcription of a non-coding sequence (e.g., guide RNA) or a coding sequence and/or regulate translation of an encoded polypeptide.


As used herein, a “promoter” or a “promoter sequence” is a DNA regulatory region capable of binding RNA polymerase and initiating transcription of a downstream (3′ direction) coding or non-coding sequence. For purposes of the present disclosure, the promoter sequence is bounded at its 3′ terminus by the transcription initiation site and extends upstream (5′ direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation site, as well as protein binding domains responsible for the binding of RNA polymerase. Eukaryotic promoters will often, but not always, contain “TATA” boxes and “CAT” boxes. Various promoters, including inducible promoters, may be used to drive expression by the various vectors of the present disclosure.


A “target nucleic acid” as used herein is a polynucleotide (e.g., DNA such as genomic DNA) that includes a site (“target site” or “target sequence”) targeted by a modified CRISPR/Cas effector polypeptide of the present disclosure. The target sequence is the sequence to which the guide sequence of a guide nucleic acid (e.g., guide RNA; e.g., a dual guide RNA or a single-molecule guide RNA) will hybridize. For example, the target site (or target sequence) 5′-GAGCAUAUC-3′ within a target nucleic acid is targeted by (or is bound by, or hybridizes with, or is complementary to) the sequence 5′-GAUAUGCUC-3′. Suitable hybridization conditions include physiological conditions normally present in a cell. For a double stranded target nucleic acid, the strand of the target nucleic acid that is complementary to and hybridizes with the guide RNA is referred to as the “complementary strand” or “target strand”; while the strand of the target nucleic acid that is complementary to the “target strand” (and is therefore not complementary to the guide RNA) is referred to as the “non-target strand” or “non-complementary strand.”
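The relationship between the two example sequences above (each is the reverse complement of the other) can be checked with a minimal Python sketch. The function name is hypothetical, and the sketch handles only standard Watson-Crick RNA pairs, not G/U pairs.

```python
# Translation table mapping each RNA base to its Watson-Crick partner.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return seq.upper().translate(RNA_COMPLEMENT)[::-1]


target_site = "GAGCAUAUC"  # 5'-GAGCAUAUC-3', from the example above
print(reverse_complement_rna(target_site))  # GAUAUGCUC
```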


A “small interfering” or “short interfering RNA” or siRNA is an RNA duplex of nucleotides that is targeted to a gene of interest (a “target gene”). An “RNA duplex” refers to the structure formed by the complementary pairing between two regions of an RNA molecule. siRNA is “targeted” to a gene in that the nucleotide sequence of the duplex portion of the siRNA is complementary to a nucleotide sequence of the targeted gene. In some cases, the length of the duplex of siRNAs is less than 30 nucleotides. In some cases, the duplex can be 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11 or 10 nucleotides in length. In some cases, the length of the duplex is 19-25 nucleotides in length. The RNA duplex portion of the siRNA can be part of a hairpin structure. In addition to the duplex portion, the hairpin structure may contain a loop portion positioned between the two sequences that form the duplex. The loop can vary in length. In some cases, the loop is 5, 6, 7, 8, 9, 10, 11, 12 or 13 nucleotides in length. The hairpin structure can also contain 3′ or 5′ overhang portions. In some cases, the overhang is a 3′ or a 5′ overhang 0, 1, 2, 3, 4 or 5 nucleotides in length.


As used herein, the term “microRNA” refers to any type of interfering RNA, including but not limited to, endogenous microRNAs and artificial microRNAs (e.g., synthetic miRNAs). Endogenous microRNAs are small RNAs naturally encoded in the genome which are capable of modulating the productive utilization of mRNA. An artificial microRNA can be any type of RNA sequence, other than endogenous microRNA, which is capable of modulating the activity of an mRNA. A microRNA sequence can be an RNA molecule composed of any one or more of these sequences. MicroRNA (or “miRNA”) sequences have been described in publications such as Lim, et al., 2003, Genes & Development, 17, 991-1008, Lim et al., 2003, Science, 299, 1540, Lee and Ambros, 2001, Science, 294, 862, Lau et al., 2001, Science 294, 858-861, Lagos-Quintana et al., 2002, Current Biology, 12, 735-739, Lagos-Quintana et al., 2001, Science, 294, 853-857, and Lagos-Quintana et al., 2003, RNA, 9, 175-179. Examples of microRNAs include any RNA that is a fragment of a larger RNA or is a miRNA, siRNA, stRNA, sncRNA, tncRNA, snoRNA, smRNA, shRNA, snRNA, or other small non-coding RNA. See, e.g., US Patent Applications 20050272923, 20050266552, 20050142581, and 20050075492. A “microRNA precursor” (or “pre-miRNA”) refers to a nucleic acid having a stem-loop structure with a microRNA sequence incorporated therein. A “mature microRNA” (or “mature miRNA”) includes a microRNA that has been cleaved from a microRNA precursor (a “pre-miRNA”), or that has been synthesized (e.g., synthesized in a laboratory by cell-free synthesis), and has a length of from about 19 nucleotides to about 27 nucleotides, e.g., a mature microRNA can have a length of 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, or 27 nt. A mature microRNA can bind to a target mRNA and inhibit translation of the target mRNA.


General methods in molecular and cellular biochemistry can be found in such standard textbooks as Molecular Cloning: A Laboratory Manual, 3rd Ed. (Sambrook et al., Cold Spring Harbor Laboratory Press 2001); Short Protocols in Molecular Biology, 4th Ed. (Ausubel et al. eds., John Wiley & Sons 1999); Protein Methods (Bollag et al., John Wiley & Sons 1996); Nonviral Vectors for Gene Therapy (Wagner et al. eds., Academic Press 1999); Viral Vectors (Kaplitt & Loewy eds., Academic Press 1995); Immunology Methods Manual (I. Lefkovits ed., Academic Press 1997); and Cell and Tissue Culture: Laboratory Procedures in Biotechnology (Doyle & Griffiths, John Wiley & Sons 1998), the disclosures of which are incorporated herein by reference.


As used herein, the terms “treatment,” “treating,” and the like, refer to obtaining a desired pharmacologic and/or physiologic effect. The effect may be prophylactic in terms of completely or partially preventing a disease or symptom thereof and/or may be therapeutic in terms of a partial or complete cure for a disease and/or adverse effect attributable to the disease. “Treatment,” as used herein, covers any treatment of a disease in a mammal, e.g., in a human, and includes: (a) preventing the disease from occurring in a subject which may be predisposed to the disease but has not yet been diagnosed as having it; (b) inhibiting the disease, i.e., arresting its development; and (c) relieving the disease, i.e., causing regression of the disease.


The terms “individual,” “subject,” “host,” and “patient,” used interchangeably herein, refer to an individual organism, e.g., a mammal, including, but not limited to, murines, simians, humans, non-human primates, ungulates, felines, canines, bovines, ovines, mammalian farm animals, mammalian sport animals, and mammalian pets.


Before the present invention is further described, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.


Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.


Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.


It must be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a gene-editing polypeptide” includes a plurality of such polypeptides and reference to “the staple oligonucleotide” includes reference to one or more staple oligonucleotides and equivalents thereof known to those skilled in the art, and so forth. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation.


It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination. All combinations of the embodiments pertaining to the invention are specifically embraced by the present invention and are disclosed herein just as if each and every combination was individually and explicitly disclosed. In addition, all sub-combinations of the various embodiments and elements thereof are also specifically embraced by the present invention and are disclosed herein just as if each and every such sub-combination was individually and explicitly disclosed herein.


The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.


DETAILED DESCRIPTION

The present disclosure provides compositions comprising a gene-editing polypeptide, a single-stranded donor DNA, and one or more staple oligonucleotides. The present disclosure provides compositions comprising a DNA nanostructure and a gene-editing polypeptide. The present disclosure provides gene editing methods using the compositions. The present disclosure provides methods of using the compositions to produce a genetically modified cell. The present disclosure provides kits useful for carrying out gene editing.


Features of the present disclosure are illustrated schematically in FIG. 1A-1D and in FIG. 2A-2C. As depicted schematically in FIG. 1A, in CRISPR-mediated homology-directed repair (HDR), a CRISPR/Cas effector polypeptide (e.g., a Cas9 polypeptide) and its guide RNA are introduced into a cell, along with a donor DNA encoding new genetic information (where such a donor DNA may also be a repair template). The CRISPR/Cas effector polypeptide is directed by its guide RNA to cleave at a genomic target site of interest, generating a blunt-ended double-strand break (DSB) in the DNA. Repair templates are then used by the cell to resolve the DSB through HDR, incorporating the desired changes. As depicted schematically in FIG. 1B, short donor DNA templates (<100 nucleotides in length) are efficiently integrated via HDR when introduced as single-stranded DNA (ssDNA) molecules with 30-90 nucleotide homology arms. As depicted schematically in FIG. 1C, longer donor DNAs (e.g., >500 nucleotides in length) are introduced as double-stranded DNA (dsDNA) molecules to avoid nuclease-mediated template degradation in the cytosol, and are generally flanked by longer (300-500 bp) homology arms. Long repair templates are poorly integrated via HDR without a combination of changes to delivery method, template sequence, and/or cell physiology. As depicted schematically in FIG. 1D, longer donor DNA molecules can be folded into compact, nuclease-resistant nanostructures through scaffolded DNA origami methods, bringing the 5′ and 3′ homology arms of the donor DNA closer together.



FIG. 2A-2C depict use of DNA origami to generate highly compact repair templates (DNA nanostructures comprising a donor DNA folded using “staple” oligonucleotides). As depicted schematically in FIG. 2A, a 7-kbp linear double-stranded DNA (dsDNA) molecule would measure about 2,400 nm in length, not including its 400 bp homology arms. The same template as a nanostructured molecule would measure 100 nm in length, not including homology arms. As illustrated schematically in FIG. 2B, a 7-kb linear single-stranded DNA (ssDNA) molecule can be folded into structures such as a 24-helix bundle or 6-helix bundle, with the insertion region nanostructured into helices and the homology arms left as ssDNA. Folding of ssDNA into a nanostructure via DNA origami methods is accomplished by addition of strategically designed ssDNA “staple” oligonucleotides. Use of staple oligonucleotides to generate a DNA nanostructure is depicted schematically in FIG. 2C.
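The lengths given above can be approximated with a back-of-the-envelope calculation. The following Python sketch assumes an axial rise of about 0.34 nm per base pair for B-form DNA, and, for the helix bundle, an idealized model in which the entire insert is base-paired with staples and divided evenly among parallel helices (crossovers, unpaired loops, and homology arms are ignored). The function names and the idealized bundle model are assumptions made for illustration and are not taken from the disclosure.

```python
RISE_PER_BP_NM = 0.34  # approximate axial rise per base pair, B-form DNA


def linear_dsdna_length_nm(n_bp: int) -> float:
    """Contour length of a fully extended linear dsDNA molecule."""
    return n_bp * RISE_PER_BP_NM


def helix_bundle_length_nm(n_nt: int, n_helices: int) -> float:
    """Idealized length of a bundle of parallel helices.

    Assumes the scaffold is fully paired with staples and split
    evenly across the helices; this is a simplification.
    """
    return (n_nt / n_helices) * RISE_PER_BP_NM


print(round(linear_dsdna_length_nm(7000)))       # 2380, i.e. about 2,400 nm
print(round(helix_bundle_length_nm(7000, 24)))   # 99, i.e. about 100 nm
print(round(helix_bundle_length_nm(7000, 6)))    # 397
```

Under these assumptions, a 7-kb insert folded as a 24-helix bundle comes out near the 100 nm figure given above, roughly a 24-fold reduction in length relative to the linear dsDNA form.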


Compositions

The present disclosure provides compositions for gene editing (e.g., editing a target nucleic acid by homology-directed repair (HDR)), where the compositions comprise: a) a gene-editing polypeptide; b) a single-stranded donor DNA; and c) one or more “staple” nucleic acids. The single-stranded donor DNA comprises: i) a first homology arm at or near the 5′ end of the donor DNA, wherein the first homology arm comprises a nucleotide sequence that is at least partially complementary to a first nucleotide sequence in the target nucleic acid; and ii) a second homology arm at or near the 3′ end of the donor DNA, wherein the second homology arm comprises a nucleotide sequence that is at least partially complementary to a second nucleotide sequence in the target nucleic acid. The single-stranded donor DNA comprises a nucleotide sequence of interest (an “intervening nucleotide sequence” or a “nucleic acid of interest”) between the first homology arm and the second homology arm. The one or more staple nucleic acids are at least partially complementary to the donor DNA (i.e., at least partially complementary to the nucleotide sequence of interest) such that the one or more staple oligonucleotides hybridize to the donor DNA, such that the donor DNA folds into a nanostructure (a “donor DNA nanostructure”) in which the first homology arm and the second homology arm are brought into proximity to one another. The donor DNA nanostructure is more compact than single-stranded donor DNA that is not contacted with the staple nucleic acids. The closer proximity of the first homology arm and the second homology arm to one another in the donor DNA nanostructure, compared to the single-stranded donor DNA that is not contacted with the staple nucleic acids, provides for increased efficiency of HDR, compared to HDR carried out with single-stranded donor DNA that is not contacted with the staple nucleic acids. 
Single-stranded donor DNA that is not contacted with the staple nucleic acids is also referred to herein as “unfolded single-stranded donor DNA” or “linear single-stranded donor DNA.” The first and second homology arms remain single stranded; the staple oligonucleotides generally do not hybridize to the first homology arm or the second homology arm.


The one or more staple nucleic acids (also referred to herein as “staple oligonucleotides”) comprise nucleotide sequences that are at least partially complementary to the donor DNA such that the one or more staple oligonucleotides hybridize to the donor DNA. The staple oligonucleotides are designed to fold the single-stranded donor DNA into a nanostructure in such a way that the first homology arm and the second homology arm are brought into proximity to one another. The present disclosure provides a composition comprising: a) a gene editing polypeptide; and b) a DNA nanostructure comprising a donor DNA and at least one staple oligonucleotide, wherein the donor DNA comprises a first homology arm and a second homology arm that are in proximity to one another by virtue of the folding of the donor DNA induced by the staple oligonucleotide(s).


The first and the second homology arms are brought into proximity to one another in the nanostructure formed by hybridization of the single-stranded donor DNA with the one or more staple oligonucleotides, relative to their positions in the linear single-stranded donor DNA (donor DNA not hybridized with the one or more staple oligonucleotides). For example, the distance between the first and the second homology arms can be from about 5 nm to about 150 nm; e.g., from about 5 nm to about 10 nm, from about 10 nm to about 20 nm, from about 20 nm to about 25 nm, from about 25 nm to about 50 nm, from about 50 nm to about 75 nm, from about 75 nm to about 100 nm, from about 100 nm to about 110 nm, from about 110 nm to about 120 nm, from about 120 nm to about 130 nm, from about 130 nm to about 140 nm, or from about 140 nm to about 150 nm.


Staple Oligonucleotides

A composition of the present disclosure can comprise from 1 to 250 staple oligonucleotides; e.g., a composition of the present disclosure can comprise from 1 to 4, from 1 to 5, from 1 to 6, from 1 to 7, from 5 to 15, from 4 to 6, from 6 to 8, from 8 to 10, from 10 to 12, from 12 to 14, from 14 to 16, from 16 to 18, from 18 to 20, from 20 to 25, from 25 to 50, from 50 to 75, from 75 to 100, from 100 to 125, from 125 to 150, from 150 to 175, from 175 to 200, from 200 to 225, or from 225 to 250, staple oligonucleotides. In some cases, a composition of the present disclosure comprises 2 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 3 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 4 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 5 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 6 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 7 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 8 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 9 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 10 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 11 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 12 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 13 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 14 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 15 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 16 staple oligonucleotides. 
In some cases, a composition of the present disclosure comprises 17 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 18 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 19 staple oligonucleotides. In some cases, a composition of the present disclosure comprises 20 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 20 to 25 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 25 to 50 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 50 to 75 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 75 to 100 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 100 to 125 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 125 to 150 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 150 to 175 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 175 to 200 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 200 to 225 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 225 to 250 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 5 to 15 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 5 to 10 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 3 to 15 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 3 to 10 staple oligonucleotides. In some cases, a composition of the present disclosure comprises from 5 to 20 staple oligonucleotides. 
In some cases, a composition of the present disclosure comprises from 5 to 25 staple oligonucleotides.


Each staple oligonucleotide can independently have a length of from about 6 nucleotides to about 60 nucleotides. For example, each staple oligonucleotide can independently have a length of from about 6 nucleotides (nt) to about 10 nt, from about 10 nt to about 15 nt, from about 15 nt to about 20 nt, from about 20 nt to about 25 nt, from about 25 nt to about 30 nt, from about 30 nt to about 35 nt, from about 35 nt to about 40 nt, from about 40 nt to about 45 nt, from about 45 nt to about 50 nt, from about 50 nt to about 55 nt, or from about 55 nt to about 60 nt. In some cases, each staple oligonucleotide independently has a length of from about 20 nucleotides to about 50 nucleotides. In some cases, each staple oligonucleotide independently has a length of from about 15 nucleotides to about 30 nucleotides. In some cases, each staple oligonucleotide independently has a length of from about 20 nucleotides to about 40 nucleotides.


The staple oligonucleotides bind to the single-stranded donor DNA at sites internal to the first and second homology arms (i.e., the staple oligonucleotides bind to the nucleotide sequence that is between the first and second homology arms). In other words, the staple oligonucleotides do not bind to the first homology arm or to the second homology arm.
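
The constraint described above can be expressed as a coordinate check. The following is a minimal sketch, assuming that the two homology arms sit exactly at the 5′ and 3′ ends of the donor DNA (the disclosure only requires "at or near") and that each staple binding site is supplied as a half-open (start, end) interval on the donor. The function name and the interval representation are hypothetical and are used only for illustration.

```python
from typing import Iterable, Tuple


def staples_bind_internally(
    donor_length: int,
    first_arm_length: int,
    second_arm_length: int,
    binding_sites: Iterable[Tuple[int, int]],
) -> bool:
    """Return True if every binding site lies between the two homology arms.

    Assumption (illustrative only): the first arm occupies positions
    [0, first_arm_length) and the second arm occupies the last
    second_arm_length positions of the donor.
    """
    internal_start = first_arm_length
    internal_end = donor_length - second_arm_length
    if internal_start > internal_end:
        raise ValueError("homology arms are longer than the donor DNA")
    # Each site is a half-open interval (start, end) in donor coordinates.
    return all(
        internal_start <= start < end <= internal_end
        for start, end in binding_sites
    )
```

A binding site that overlaps either arm by even one nucleotide causes the check to return False.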


A staple oligonucleotide comprises: i) a first nucleotide sequence that hybridizes with a first nucleotide sequence (“first donor nucleotide sequence”) in the donor DNA; and ii) a second nucleotide sequence that hybridizes with a second nucleotide sequence (“second donor nucleotide sequence”) in the donor DNA. The first donor nucleotide sequence and the second donor nucleotide sequence can be separated from one another by from 5 nucleotides to 10,000 nucleotides; the nucleotide sequence between the first donor nucleotide sequence and the second donor nucleotide sequence is referred to as the “intervening donor nucleotide sequence” or “nucleotide sequence of interest.” The intervening donor nucleotide sequence can be from 5 nucleotides (nt) to 10,000 nt in length; e.g., the intervening donor nucleotide sequence can be from 5 nt to about 10 nt, from about 10 nt to about 25 nt, from about 25 nt to about 50 nt, from about 50 nt to about 100 nt, from about 100 nt to about 250 nt, from about 250 nt to about 500 nt, from about 500 nt to about 1,000 nt, from about 1,000 nt to about 1,500 nt, from about 1,500 nt to about 2,000 nt, from about 2,000 nt to about 2,500 nt, from about 2,500 nt to about 5,000 nt, from about 5,000 nt to about 7,500 nt, or from about 7,500 nt to about 10,000 nt. Upon hybridization (binding) of the staple oligonucleotide to the donor DNA, the first donor nucleotide sequence and the second donor nucleotide sequence, which both bind to the same staple oligonucleotide, are brought into proximity to one another. The intervening donor nucleotide sequence can be in the form of a loop, a helix, etc.
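
The two-part structure of a staple oligonucleotide can be illustrated computationally. The following is a minimal sketch, assuming a donor DNA written 5′ to 3′ over the alphabet A, C, G, T, two donor segments of equal length, and a staple formed by joining the reverse complements of the two segments. The order in which the two halves are joined, the equal segment lengths, and the function names are assumptions made for illustration; the disclosure does not prescribe them.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_staple(
    donor: str, first_start: int, second_start: int, segment_length: int
) -> Tuple[str, int]:
    """Return (staple, intervening_length) for two donor segments.

    The staple has one half complementary to the first donor nucleotide
    sequence and one half complementary to the second donor nucleotide
    sequence. The half order used here is an illustrative assumption.
    """
    first_end = first_start + segment_length
    second_end = second_start + segment_length
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    if not (0 <= first_start and first_end <= second_start and second_end <= len(donor)):
        raise ValueError("segments must be ordered, non-overlapping, and inside the donor")
    first_site = donor[first_start:first_end]
    second_site = donor[second_start:second_end]
    staple = reverse_complement(first_site) + reverse_complement(second_site)
    # Length of the intervening donor nucleotide sequence between the two sites.
    intervening_length = second_start - first_end
    return staple, intervening_length
```

For example, with the toy donor `"AAAACCCCCGGGGTTTT"`, calling `design_staple(donor, 0, 9, 4)` pairs the segments `AAAA` and `GGGG`, which are separated by a 5 nt intervening sequence.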


In some cases, the staple oligonucleotides are designed to fold the linear single-stranded donor DNA into a two-dimensional or three-dimensional nanostructure. In some cases, the staple oligonucleotides are designed to fold the linear single-stranded donor DNA into a nanostructure comprising parallel helices. In some cases, the staple oligonucleotides are designed to fold the linear single-stranded donor DNA into a two-dimensional or three-dimensional polygonal network wherein the edges of the network are composed of blocks of at least two parallel helices. In some cases, the staple oligonucleotides are designed to fold the linear single-stranded donor DNA into a 6-helix nanostructure. In some cases, the staple oligonucleotides are designed to fold the linear single-stranded donor DNA into a 24-helix nanostructure. In some cases, the linear single-stranded donor DNA will be folded, using staple oligonucleotides, into a multilayer DNA nanostructure. In some cases, the multilayer DNA nanostructure will be folded into a honeycomb lattice (see, e.g., Ke et al. (2012) J. Am. Chem. Soc. 134:1770). In some cases, the multilayer DNA nanostructure will be folded into a square lattice (see, e.g., Ke et al. (2009) J. Am. Chem. Soc. 131:15903). In some cases, the multilayer DNA nanostructure will incorporate curvature (see, e.g., Han et al. (2011) Science 332:342). In some cases, some parts of the DNA nanostructure will remain single-stranded and will not include staples (will not have staple oligonucleotides hybridized to the DNA), to allow for flexibility.


One or more of the staple oligonucleotides can include a detectable label. Suitable detectable labels include fluorophores. Examples of fluorophores include, but are not limited to: an Alexa Fluor® dye (e.g., Alexa Fluor® 350, Alexa Fluor® 405, Alexa Fluor® 430, Alexa Fluor® 488, Alexa Fluor® 500, Alexa Fluor® 514, Alexa Fluor® 532, Alexa Fluor® 546, Alexa Fluor® 555, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 610, Alexa Fluor® 633, Alexa Fluor® 635, Alexa Fluor® 647, Alexa Fluor® 660, Alexa Fluor® 680, Alexa Fluor® 700, Alexa Fluor® 750, Alexa Fluor® 790, and the like), an ATTO dye (e.g., ATTO 390, ATTO 425, ATTO 465, ATTO 488, ATTO 495, ATTO 514, ATTO 520, ATTO 532, ATTO Rho6G, ATTO 542, ATTO 550, ATTO 565, ATTO Rho3B, ATTO Rho11, ATTO Rho12, ATTO Thio12, ATTO Rho101, ATTO 590, ATTO 594, ATTO Rho13, ATTO 610, ATTO 620, ATTO Rho14, ATTO 633, ATTO 647, ATTO 647N, ATTO 655, ATTO Oxa12, ATTO 665, ATTO 680, ATTO 700, ATTO 725, ATTO 740), a DyLight dye, a cyanine dye (e.g., Cy2, Cy3, Cy3.5, Cy3b, Cy5, Cy5.5, Cy7, Cy7.5), a FluoProbes dye, a Sulfo Cy dye, a Seta dye, an IRIS Dye, a SeTau dye, an SRfluor dye, a Square dye, fluorescein isothiocyanate (FITC), tetramethylrhodamine isothiocyanate (TRITC), tetramethylrhodamine (TMR), Silicon Rhodamine (SiR), Texas Red, Oregon Green, Pacific Blue, Pacific Green, and Pacific Orange.


In some cases, a staple oligonucleotide comprises a transcription factor binding site (e.g., to promote import into the nucleus) and/or a site for recruitment (binding) of DNA repair proteins. Transcription factor binding sites are known in the art. A non-limiting example of a transcription factor binding site is a nucleotide sequence comprising CCAAT. See, e.g., Khamis et al. (2018) Nucleic Acids Research 46:e72; The ENCODE Project Consortium (2012) Nature 489:57.
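
Whether a candidate staple sequence contains a motif such as CCAAT can be checked with a simple string search. The following is a minimal sketch, assuming the staple is written 5′ to 3′ and that only the strand as written is scanned; the function name is hypothetical, and an exact-match search is a simplification of how binding sites are identified in practice.

```python
from typing import List


def find_motif(sequence: str, motif: str = "CCAAT") -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    if not motif:
        raise ValueError("motif must not be empty")
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one position later so overlapping matches are not skipped.
        start = sequence.find(motif, start + 1)
    return positions
```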


Donor DNA

By a “donor nucleic acid” or “donor polynucleotide” or “donor DNA” it is meant a single-stranded DNA to be inserted at a site cleaved by a gene-editing polypeptide (e.g., a CRISPR/Cas effector protein; a TALEN; a ZFN; a meganuclease) (e.g., after dsDNA cleavage, after nicking a target DNA, after dual nicking a target DNA, and the like). The donor polynucleotide can contain sufficient homology to a genomic sequence at the target site, e.g. 70%, 80%, 85%, 90%, 95%, or 100% homology with the nucleotide sequences flanking the target site, e.g. within about 50 bases or less of the target site, e.g. within about 30 bases, within about 15 bases, within about 10 bases, within about 5 bases, or immediately flanking the target site, to support homology-directed repair between it and the genomic sequence to which it bears homology.
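
The percentage figures above can be illustrated with a simple comparison. The following is a minimal sketch, assuming that "homology" is read as ungapped percent identity between a donor sequence and a genomic flanking sequence of equal length; this reading, and the function name, are assumptions made for illustration. Sequences of unequal length or with insertions and deletions would require an alignment step that is not shown here.

```python
def percent_identity(donor_region: str, genomic_region: str) -> float:
    """Return the ungapped percent identity of two equal-length sequences."""
    if len(donor_region) != len(genomic_region):
        raise ValueError("sequences must have the same length")
    if not donor_region:
        raise ValueError("sequences must not be empty")
    # Count positions at which the two sequences carry the same base.
    matches = sum(
        a == b for a, b in zip(donor_region.upper(), genomic_region.upper())
    )
    return 100.0 * matches / len(donor_region)
```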


Approximately 25, 50, 100, or 200 nucleotides, or more than 200 nucleotides, of sequence homology between a donor DNA and a genomic sequence (or any integral value between 10 and 200 nucleotides, or more) can support homology-directed repair. Donor polynucleotides can be of any length, e.g., 50 nucleotides or more, 100 nucleotides or more, 250 nucleotides or more, 500 nucleotides or more, 1000 nucleotides or more, 5000 nucleotides or more, etc. A suitable donor DNA can be from 50 nucleotides to 100 nucleotides, from 100 nucleotides to 500 nucleotides, from 500 nucleotides to 1000 nucleotides, from 1000 nucleotides to 5000 nucleotides, or from 5000 nucleotides to 10,000 nucleotides, or more than 10,000 nucleotides, in length.


As noted above, the donor DNA comprises a first homology arm and a second homology arm. The first homology arm is at or near the 5′ end of the donor DNA; and comprises a nucleotide sequence that is at least partially complementary to a first nucleotide sequence in a target nucleic acid. The second homology arm is at or near the 3′ end of the donor DNA; and comprises a nucleotide sequence that is at least partially complementary to a second nucleotide sequence in the target nucleic acid. The first and second homology arms can each independently have a length of from about 10 nucleotides to 400 nucleotides; e.g., from 10 nucleotides (nt) to 15 nt, from 15 nt to 20 nt, from 20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 35 nt, from 35 nt to 40 nt, from 40 nt to 45 nt, from 45 nt to 50 nt, from 50 nt to 75 nt, from 75 nt to 100 nt, from 100 nt to 125 nt, from 125 nt to 150 nt, from 150 nt to 175 nt, from 175 nt to 200 nt, from 200 nt to 225 nt, from 225 nt to 250 nt, from 250 nt to 275 nt, from 275 nt to 300 nt, from 300 nt to 325 nt, from 325 nt to 350 nt, from 350 nt to 375 nt, or from 375 nt to 400 nt.


The first homology arm and the second homology arm of the donor DNA flank a nucleotide sequence (“a nucleotide sequence of interest” or “an intervening nucleotide sequence”) that is to be introduced into a target nucleic acid. The nucleotide sequence of interest can comprise: i) a nucleotide sequence encoding a polypeptide of interest; ii) a nucleotide sequence encoding an exon of a gene; iii) a promoter sequence; iv) an enhancer sequence; v) a nucleotide sequence encoding a non-coding RNA; or vi) any combination of the foregoing.
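
The arrangement of the two homology arms around the nucleotide sequence of interest can be sketched as a string assembly. The following is a minimal illustration, assuming one strand of the target nucleic acid is given 5′ to 3′, the insertion point is a single index on that strand, and the arms are copied directly from the sequence flanking that index (so that each arm is complementary to the opposite strand of the target). The function and parameter names are hypothetical.

```python
def build_donor(
    target: str,
    cut_index: int,
    insert: str,
    first_arm_length: int,
    second_arm_length: int,
) -> str:
    """Return first homology arm + nucleotide sequence of interest + second arm."""
    if first_arm_length <= 0 or second_arm_length <= 0:
        raise ValueError("arm lengths must be positive")
    if cut_index - first_arm_length < 0:
        raise ValueError("first homology arm extends past the 5' end of the target")
    if cut_index + second_arm_length > len(target):
        raise ValueError("second homology arm extends past the 3' end of the target")
    # Arms are taken from the target sequence immediately flanking the cut.
    first_arm = target[cut_index - first_arm_length:cut_index]
    second_arm = target[cut_index:cut_index + second_arm_length]
    return first_arm + insert + second_arm
```

The two arm lengths are separate parameters because the disclosure allows each arm to have its length chosen independently.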


The donor DNA can provide for gene correction, gene replacement, gene tagging, transgene insertion, nucleotide deletion, gene disruption, gene mutation, etc. For example, the donor DNA can be used to add, e.g., insert or replace, nucleic acid material to a target DNA (e.g. to “knock in” a nucleic acid that encodes a protein, an siRNA, an miRNA, etc.), to add a tag (e.g., 6×His, a fluorescent protein (e.g., a green fluorescent protein; a yellow fluorescent protein, etc.), hemagglutinin (HA), FLAG, etc.), to add a regulatory sequence to a gene (e.g. promoter, polyadenylation signal, internal ribosome entry sequence (IRES), 2A peptide, start codon, stop codon, splice signal, localization signal, enhancer, etc.), to modify a nucleic acid sequence (e.g., introduce a mutation), and the like. For example, the donor DNA can be used to modify DNA in a site-specific, i.e. “targeted”, way; for example gene knock-out, gene knock-in, gene editing, gene tagging, etc., as used in, for example, gene therapy, e.g. to treat a disease; or as an antiviral, antipathogenic, or anticancer therapeutic, the production of genetically modified organisms in agriculture, the large scale production of proteins by cells for therapeutic, diagnostic, or research purposes, the induction of pluripotent stem cells, biological research, the targeting of genes of pathogens for deletion or replacement, etc.


In some cases, the donor DNA comprises a nucleotide sequence encoding a polypeptide of interest. Polypeptides of interest include, e.g., a) functional versions of a polypeptide that comprises one or more amino acid substitutions, insertions, and/or deletions and that exhibits reduced function, e.g., where the reduced function is associated with or causes a pathological condition; b) fluorescent polypeptides; c) hormones; d) receptors for ligands; e) ion channels; f) neurotransmitters; and the like.


In some cases, the donor DNA comprises a nucleotide sequence that encodes a wild-type protein that is lacking in the recipient cell. In some cases, the donor DNA encodes a wild-type factor (e.g., Factor VII, Factor VIII, Factor IX, and the like) involved in coagulation. In some cases, the donor DNA comprises a nucleotide sequence that encodes a therapeutic antibody. In some cases, the donor DNA comprises a nucleotide sequence that encodes an engineered protein or receptor. In some cases, the engineered receptor is a T cell receptor (TCR), a natural killer (NK) receptor (NKR), or a B cell receptor (BCR). In some cases, the engineered TCR or NKR targets a cancer marker (e.g., a polypeptide that is expressed (e.g., over-expressed) on the surface of a cancer cell). In some cases, the donor DNA comprises a nucleotide sequence that encodes a chimeric antigen receptor (CAR). In some cases, the CAR targets a cancer marker. Donor DNAs encoding CAR, TCR, and/or NKR proteins may be folded into DNA origami structures (DNA nanostructures) and delivered into T cells or NK cells in vitro or in vivo.


Non-limiting examples of polypeptides that can be encoded by a donor DNA include, e.g., IL1B (interleukin 1, beta), XDH (xanthine dehydrogenase), TP53 (tumor protein p53), PTGIS (prostaglandin I2 (prostacyclin) synthase), MB (myoglobin), IL4 (interleukin 4), ANGPT1 (angiopoietin 1), ABCG8 (ATP-binding cassette, sub-family G (WHITE), member 8), CTSK (cathepsin K), PTGIR (prostaglandin I2 (prostacyclin) receptor (IP)), KCNJ11 (potassium inwardly-rectifying channel, subfamily J, member 11), INS (insulin), CRP (C-reactive protein, pentraxin-related), PDGFRB (platelet-derived growth factor receptor, beta polypeptide), CCNA2 (cyclin A2), PDGFB (platelet-derived growth factor beta polypeptide (simian sarcoma viral (v-sis) oncogene homolog)), KCNJ5 (potassium inwardly-rectifying channel, subfamily J, member 5), KCNN3 (potassium intermediate/small conductance calcium-activated channel, subfamily N, member 3), CAPN10 (calpain 10), PTGES (prostaglandin E synthase), ADRA2B (adrenergic, alpha-2B-, receptor), ABCG5 (ATP-binding cassette, sub-family G (WHITE), member 5), PRDX2 (peroxiredoxin 2), CAPN5 (calpain 5), PARP14 (poly (ADP-ribose) polymerase family, member 14), MEX3C (mex-3 homolog C (C. 
elegans)), ACE (angiotensin I converting enzyme (peptidyl-dipeptidase A) 1), TNF (tumor necrosis factor (TNF superfamily, member 2)), IL6 (interleukin 6 (interferon, beta 2)), STN (statin), SERPINE1 (serpin peptidase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 1), ALB (albumin), ADIPOQ (adiponectin, C1Q and collagen domain containing), APOB (apolipoprotein B (including Ag(x) antigen)), APOE (apolipoprotein E), LEP (leptin), MTHFR (5,10-methylenetetrahydrofolate reductase (NADPH)), APOA1 (apolipoprotein A-I), EDN1 (endothelin 1), NPPB (natriuretic peptide precursor B), NOS3 (nitric oxide synthase 3 (endothelial cell)), PPARG (peroxisome proliferator-activated receptor gamma), PLAT (plasminogen activator, tissue), PTGS2 (prostaglandin-endoperoxide synthase 2 (prostaglandin G/H synthase and cyclooxygenase)), CETP (cholesteryl ester transfer protein, plasma), AGTR1 (angiotensin II receptor, type 1), HMGCR (3-hydroxy-3-methylglutaryl-Coenzyme A reductase), IGF1 (insulin-like growth factor 1 (somatomedin C)), SELE (selectin E), REN (renin), PPARA (peroxisome proliferator-activated receptor alpha), PON1 (paraoxonase 1), KNG1 (kininogen 1), CCL2 (chemokine (C-C motif) ligand 2), LPL (lipoprotein lipase), vWF (von Willebrand factor), F2 (coagulation factor II (thrombin)), ICAM1 (intercellular adhesion molecule 1), TGFB1 (transforming growth factor, beta 1), NPPA (natriuretic peptide precursor A), IL10 (interleukin 10), EPO (erythropoietin), SOD1 (superoxide dismutase 1, soluble), VCAM1 (vascular cell adhesion molecule 1), IFNG (interferon, gamma), LPA (lipoprotein, Lp(a)), MPO (myeloperoxidase), ESR1 (estrogen receptor 1), MAPK1 (mitogen-activated protein kinase 1), HP (haptoglobin), F3 (coagulation factor III (thromboplastin, tissue factor)), CST3 (cystatin C), COG2 (component of oligomeric Golgi complex 2), MMP9 (matrix metallopeptidase 9 (gelatinase B, 92 kDa gelatinase, 92 kDa type IV collagenase)), SERPINC1 (serpin peptidase inhibitor, 
clade C (antithrombin), member 1), F8 (coagulation factor VIII, procoagulant component), HMOX1 (heme oxygenase (decycling) 1), APOC3 (apolipoprotein C-III), IL8 (interleukin 8), PROK1 (prokineticin 1), CBS (cystathionine-beta-synthase), NOS2 (nitric oxide synthase 2, inducible), TLR4 (toll-like receptor 4), SELP (selectin P (granule membrane protein 140 kDa, antigen CD62)), ABCA1 (ATP-binding cassette, sub-family A (ABC1), member 1), AGT (angiotensinogen (serpin peptidase inhibitor, clade A, member 8)), LDLR (low density lipoprotein receptor), GPT (glutamic-pyruvate transaminase (alanine aminotransferase)), VEGFA (vascular endothelial growth factor A), NR3C2 (nuclear receptor subfamily 3, group C, member 2), IL18 (interleukin 18 (interferon-gamma-inducing factor)), NOS1 (nitric oxide synthase 1 (neuronal)), NR3C1 (nuclear receptor subfamily 3, group C, member 1 (glucocorticoid receptor)), FGB (fibrinogen beta chain), HGF (hepatocyte growth factor (hepapoietin A; scatter factor)), IL1A (interleukin 1, alpha), RETN (resistin), AKT1 (v-akt murine thymoma viral oncogene homolog 1), LIPC (lipase, hepatic), HSPD1 (heat shock 60 kDa protein 1 (chaperonin)), MAPK14 (mitogen-activated protein kinase 14), SPP1 (secreted phosphoprotein 1), ITGB3 (integrin, beta 3 (platelet glycoprotein IIIa, antigen CD61)), CAT (catalase), UTS2 (urotensin 2), THBD (thrombomodulin), F10 (coagulation factor X), CP (ceruloplasmin (ferroxidase)), TNFRSF11B (tumor necrosis factor receptor superfamily, member 11b), EDNRA (endothelin receptor type A), EGFR (epidermal growth factor receptor (erythroblastic leukemia viral (v-erb-b) oncogene homolog, avian)), MMP2 (matrix metallopeptidase 2 (gelatinase A, 72 kDa gelatinase, 72 kDa type IV collagenase)), PLG (plasminogen), NPY (neuropeptide Y), RHOD (ras homolog gene family, member D), MAPK8 (mitogen-activated protein kinase 8), MYC (v-myc myelocytomatosis viral oncogene homolog (avian)), FN1 (fibronectin 1), CMA1 (chymase 1, mast cell), PLAU 
(plasminogen activator, urokinase), GNB3 (guanine nucleotide binding protein (G protein), beta polypeptide 3), ADRB2 (adrenergic, beta-2-, receptor, surface), APOA5 (apolipoprotein A-V), SOD2 (superoxide dismutase 2, mitochondrial), F5 (coagulation factor V (proaccelerin, labile factor)), VDR (vitamin D (1,25-dihydroxyvitamin D3) receptor), ALOX5 (arachidonate 5-lipoxygenase), HLA-DRB1 (major histocompatibility complex, class II, DR beta 1), PARP1 (poly (ADP-ribose) polymerase 1), CD40LG (CD40 ligand), PON2 (paraoxonase 2), AGER (advanced glycosylation end product-specific receptor), IRS1 (insulin receptor substrate 1), PTGS1 (prostaglandin-endoperoxide synthase 1 (prostaglandin G/H synthase and cyclooxygenase)), ECE1 (endothelin converting enzyme 1), F7 (coagulation factor VII (serum prothrombin conversion accelerator)), IL1RN (interleukin 1 receptor antagonist), EPHX2 (epoxide hydrolase 2, cytoplasmic), IGFBP1 (insulin-like growth factor binding protein 1), MAPK10 (mitogen-activated protein kinase 10), FAS (Fas (TNF receptor superfamily, member 6)), ABCB1 (ATP-binding cassette, sub-family B (MDR/TAP), member 1), JUN (jun oncogene), IGFBP3 (insulin-like growth factor binding protein 3), CD14 (CD14 molecule), PDE5A (phosphodiesterase 5A, cGMP-specific), AGTR2 (angiotensin II receptor, type 2), CD40 (CD40 molecule, TNF receptor superfamily member 5), LCAT (lecithin-cholesterol acyltransferase), CCR5 (chemokine (C-C motif) receptor 5), MMP1 (matrix metallopeptidase 1 (interstitial collagenase)), TIMP1 (TIMP metallopeptidase inhibitor 1), ADM (adrenomedullin), DYT10 (dystonia 10), STAT3 (signal transducer and activator of transcription 3 (acute-phase response factor)), MMP3 (matrix metallopeptidase 3 (stromelysin 1, progelatinase)), ELN (elastin), USF1 (upstream transcription factor 1), CFH (complement factor H), HSPA4 (heat shock 70 kDa protein 4), MMP12 (matrix metallopeptidase 12 (macrophage elastase)), MME (membrane metallo-endopeptidase), F2R (coagulation factor II 
(thrombin) receptor), SELL (selectin L), CTSB (cathepsin B), ANXA5 (annexin A5), ADRB1 (adrenergic, beta-1-, receptor), CYBA (cytochrome b-245, alpha polypeptide), FGA (fibrinogen alpha chain), GGT1 (gamma-glutamyltransferase 1), LIPG (lipase, endothelial), HIF1A (hypoxia inducible factor 1, alpha subunit (basic helix-loop-helix transcription factor)), CXCR4 (chemokine (C-X-C motif) receptor 4), PROC (protein C (inactivator of coagulation factors Va and VIIIa)), SCARB1 (scavenger receptor class B, member 1), CD79A (CD79a molecule, immunoglobulin-associated alpha), PLTP (phospholipid transfer protein), ADD1 (adducin 1 (alpha)), FGG (fibrinogen gamma chain), SAA1 (serum amyloid A1), KCNH2 (potassium voltage-gated channel, subfamily H (eag-related), member 2), DPP4 (dipeptidyl-peptidase 4), G6PD (glucose-6-phosphate dehydrogenase), NPR1 (natriuretic peptide receptor A/guanylate cyclase A (atrionatriuretic peptide receptor A)), VTN (vitronectin), KIAA0101 (KIAA0101), FOS (FBJ murine osteosarcoma viral oncogene homolog), TLR2 (toll-like receptor 2), PPIG (peptidylprolyl isomerase G (cyclophilin G)), IL1R1 (interleukin 1 receptor, type I), AR (androgen receptor), CYP1A1 (cytochrome P450, family 1, subfamily A, polypeptide 1), SERPINA1 (serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1), MTR (5-methyltetrahydrofolate-homocysteine methyltransferase), RBP4 (retinol binding protein 4, plasma), APOA4 (apolipoprotein A-IV), CDKN2A (cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4)), FGF2 (fibroblast growth factor 2 (basic)), EDNRB (endothelin receptor type B), ITGA2 (integrin, alpha 2 (CD49B, alpha 2 subunit of VLA-2 receptor)), CABIN1 (calcineurin binding protein 1), SHBG (sex hormone-binding globulin), HMGB1 (high-mobility group box 1), HSP90B2P (heat shock protein 90 kDa beta (Grp94), member 2 (pseudogene)), CYP3A4 (cytochrome P450, family 3, subfamily A, polypeptide 4), GJA1 (gap junction protein, alpha 1, 43 kDa), CAV1 
(caveolin 1, caveolae protein, 22 kDa), ESR2 (estrogen receptor 2 (ER beta)), LTA (lymphotoxin alpha (TNF superfamily, member 1)), GDF15 (growth differentiation factor 15), BDNF (brain-derived neurotrophic factor), CYP2D6 (cytochrome P450, family 2, subfamily D, polypeptide 6), NGF (nerve growth factor (beta polypeptide)), SP1 (Sp1 transcription factor), TGIF1 (TGFB-induced factor homeobox 1), SRC (v-src sarcoma (Schmidt-Ruppin A-2) viral oncogene homolog (avian)), EGF (epidermal growth factor (beta-urogastrone)), PIK3CG (phosphoinositide-3-kinase, catalytic, gamma polypeptide), HLA-A (major histocompatibility complex, class I, A), KCNQ1 (potassium voltage-gated channel, KQT-like subfamily, member 1), CNR1 (cannabinoid receptor 1 (brain)), FBN1 (fibrillin 1), CHKA (choline kinase alpha), BEST1 (bestrophin 1), APP (amyloid beta (A4) precursor protein), CTNNB1 (catenin (cadherin-associated protein), beta 1, 88 kDa), IL2 (interleukin 2), CD36 (CD36 molecule (thrombospondin receptor)), PRKAB1 (protein kinase, AMP-activated, beta 1 non-catalytic subunit), TPO (thyroid peroxidase), ALDH7A1 (aldehyde dehydrogenase 7 family, member A1), CX3CR1 (chemokine (C-X3-C motif) receptor 1), TH (tyrosine hydroxylase), F9 (coagulation factor IX), GH1 (growth hormone 1), TF (transferrin), HFE (hemochromatosis), IL17A (interleukin 17A), PTEN (phosphatase and tensin homolog), GSTM1 (glutathione S-transferase mu 1), DMD (dystrophin), GATA4 (GATA binding protein 4), F13A1 (coagulation factor XIII, A1 polypeptide), TTR (transthyretin), FABP4 (fatty acid binding protein 4, adipocyte), PON3 (paraoxonase 3), APOC1 (apolipoprotein C-I), INSR (insulin receptor), TNFRSF1B (tumor necrosis factor receptor superfamily, member 1B), HTR2A (5-hydroxytryptamine (serotonin) receptor 2A), CSF3 (colony stimulating factor 3 (granulocyte)), CYP2C9 (cytochrome P450, family 2, subfamily C, polypeptide 9), TXN (thioredoxin), CYP11B2 (cytochrome P450, family 11, subfamily B, polypeptide 2), PTH (parathyroid 
hormone), CSF2 (colony stimulating factor 2 (granulocyte-macrophage)), KDR (kinase insert domain receptor (a type III receptor tyrosine kinase)), PLA2G2A (phospholipase A2, group IIA (platelets, synovial fluid)), B2M (beta-2-microglobulin), THBS1 (thrombospondin 1), GCG (glucagon), RHOA (ras homolog gene family, member A), ALDH2 (aldehyde dehydrogenase 2 family (mitochondrial)), TCF7L2 (transcription factor 7-like 2 (T-cell specific, HMG-box)), BDKRB2 (bradykinin receptor B2), NFE2L2 (nuclear factor (erythroid-derived 2)-like 2), NOTCH1 (Notch homolog 1, translocation-associated (Drosophila)), UGT1A1 (UDP glucuronosyltransferase 1 family, polypeptide A1), IFNA1 (interferon, alpha 1), PPARD (peroxisome proliferator-activated receptor delta), SIRT1 (sirtuin (silent mating type information regulation 2 homolog) 1 (S. cerevisiae)), GNRH1 (gonadotropin-releasing hormone 1 (luteinizing-releasing hormone)), PAPPA (pregnancy-associated plasma protein A, pappalysin 1), ARR3 (arrestin 3, retinal (X-arrestin)), NPPC (natriuretic peptide precursor C), AHSP (alpha hemoglobin stabilizing protein), PTK2 (PTK2 protein tyrosine kinase 2), IL13 (interleukin 13), MTOR (mechanistic target of rapamycin (serine/threonine kinase)), ITGB2 (integrin, beta 2 (complement component 3 receptor 3 and 4 subunit)), GSTT1 (glutathione S-transferase theta 1), IL6ST (interleukin 6 signal transducer (gp130, oncostatin M receptor)), CPB2 (carboxypeptidase B2 (plasma)), CYP1A2 (cytochrome P450, family 1, subfamily A, polypeptide 2), HNF4A (hepatocyte nuclear factor 4, alpha), SLC6A4 (solute carrier family 6 (neurotransmitter transporter, serotonin), member 4), PLA2G6 (phospholipase A2, group VI (cytosolic, calcium-independent)), TNFSF11 (tumor necrosis factor (ligand) superfamily, member 11), SLC8A1 (solute carrier family 8 (sodium/calcium exchanger), member 1), F2RL1 (coagulation factor II (thrombin) receptor-like 1), AKR1A1 (aldo-keto reductase family 1, member A1 (aldehyde reductase)), ALDH9A1 
(aldehyde dehydrogenase 9 family, member A1), BGLAP (bone gamma-carboxyglutamate (gla) protein), MTTP (microsomal triglyceride transfer protein), MTRR (5-methyltetrahydrofolate-homocysteine methyltransferase reductase), SULT1A3 (sulfotransferase family, cytosolic, 1A, phenol-preferring, member 3), RAGE (renal tumor antigen), C4B (complement component 4B (Chido blood group)), P2RY12 (purinergic receptor P2Y, G-protein coupled, 12), RNLS (renalase, FAD-dependent amine oxidase), CREB1 (cAMP responsive element binding protein 1), POMC (proopiomelanocortin), RAC1 (ras-related C3 botulinum toxin substrate 1 (rho family, small GTP binding protein Rac1)), LMNA (lamin A/C), CD59 (CD59 molecule, complement regulatory protein), SCN5A (sodium channel, voltage-gated, type V, alpha subunit), CYP1B1 (cytochrome P450, family 1, subfamily B, polypeptide 1), MIF (macrophage migration inhibitory factor (glycosylation-inhibiting factor)), MMP13 (matrix metallopeptidase 13 (collagenase 3)), TIMP2 (TIMP metallopeptidase inhibitor 2), CYP19A1 (cytochrome P450, family 19, subfamily A, polypeptide 1), CYP21A2 (cytochrome P450, family 21, subfamily A, polypeptide 2), PTPN22 (protein tyrosine phosphatase, non-receptor type 22 (lymphoid)), MYH14 (myosin, heavy chain 14, non-muscle), MBL2 (mannose-binding lectin (protein C) 2, soluble (opsonic defect)), SELPLG (selectin P ligand), AOC3 (amine oxidase, copper containing 3 (vascular adhesion protein 1)), CTSL1 (cathepsin L1), PCNA (proliferating cell nuclear antigen), IGF2 (insulin-like growth factor 2 (somatomedin A)), ITGB1 (integrin, beta 1 (fibronectin receptor, beta polypeptide, antigen CD29 includes MDF2, MSK12)), CAST (calpastatin), CXCL12 (chemokine (C-X-C motif) ligand 12 (stromal cell-derived factor 1)), IGHE (immunoglobulin heavy constant epsilon), KCNE1 (potassium voltage-gated channel, Isk-related family, member 1), TFRC (transferrin receptor (p90, CD71)), COL1A1 (collagen, type I, alpha 1), COL1A2 (collagen, type I, alpha 2), IL2RB 
(interleukin 2 receptor, beta), PLA2G10 (phospholipase A2, group X), ANGPT2 (angiopoietin 2), PROCR (protein C receptor, endothelial (EPCR)), NOX4 (NADPH oxidase 4), HAMP (hepcidin antimicrobial peptide), PTPN11 (protein tyrosine phosphatase, non-receptor type 11), SLC2A1 (solute carrier family 2 (facilitated glucose transporter), member 1), IL2RA (interleukin 2 receptor, alpha), CCL5 (chemokine (C-C motif) ligand 5), IRF1 (interferon regulatory factor 1), CFLAR (CASP8 and FADD-like apoptosis regulator), CALCA (calcitonin-related polypeptide alpha), EIF4E (eukaryotic translation initiation factor 4E), GSTP1 (glutathione S-transferase pi 1), JAK2 (Janus kinase 2), CYP3A5 (cytochrome P450, family 3, subfamily A, polypeptide 5), HSPG2 (heparan sulfate proteoglycan 2), CCL3 (chemokine (C-C motif) ligand 3), MYD88 (myeloid differentiation primary response gene (88)), VIP (vasoactive intestinal peptide), SOAT1 (sterol O-acyltransferase 1), ADRBK1 (adrenergic, beta, receptor kinase 1), NR4A2 (nuclear receptor subfamily 4, group A, member 2), MMP8 (matrix metallopeptidase 8 (neutrophil collagenase)), NPR2 (natriuretic peptide receptor B/guanylate cyclase B (atrionatriuretic peptide receptor B)), GCH1 (GTP cyclohydrolase 1), EPRS (glutamyl-prolyl-tRNA synthetase), PPARGC1A (peroxisome proliferator-activated receptor gamma, coactivator 1 alpha), F12 (coagulation factor XII (Hageman factor)), PECAM1 (platelet/endothelial cell adhesion molecule), CCL4 (chemokine (C-C motif) ligand 4), SERPINA3 (serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 3), CASR (calcium-sensing receptor), GJA5 (gap junction protein, alpha 5, 40 kDa), FABP2 (fatty acid binding protein 2, intestinal), TTF2 (transcription termination factor, RNA polymerase II), PROS1 (protein S (alpha)), CTF1 (cardiotrophin 1), SGCB (sarcoglycan, beta (43 kDa dystrophin-associated glycoprotein)), YME1L1 (YME1-like 1 (S. 
cerevisiae)), CAMP (cathelicidin antimicrobial peptide), ZC3H12A (zinc finger CCCH-type containing 12A), AKR1B1 (aldo-keto reductase family 1, member B1 (aldose reductase)), DES (desmin), MMP7 (matrix metallopeptidase 7 (matrilysin, uterine)), AHR (aryl hydrocarbon receptor), CSF1 (colony stimulating factor 1 (macrophage)), HDAC9 (histone deacetylase 9), CTGF (connective tissue growth factor), KCNMA1 (potassium large conductance calcium-activated channel, subfamily M, alpha member 1), UGT1A (UDP glucuronosyltransferase 1 family, polypeptide A complex locus), PRKCA (protein kinase C, alpha), COMT (catechol-O-methyltransferase), S100B (S100 calcium binding protein B), EGR1 (early growth response 1), PRL (prolactin), IL15 (interleukin 15), DRD4 (dopamine receptor D4), CAMK2G (calcium/calmodulin-dependent protein kinase II gamma), SLC22A2 (solute carrier family 22 (organic cation transporter), member 2), CCL11 (chemokine (C-C motif) ligand 11), PGF (placental growth factor), THPO (thrombopoietin), GP6 (glycoprotein VI (platelet)), TACR1 (tachykinin receptor 1), NTS (neurotensin), HNF1A (HNF1 homeobox A), SST (somatostatin), KCND1 (potassium voltage-gated channel, Shal-related subfamily, member 1), LOC646627 (phospholipase inhibitor), TBXAS1 (thromboxane A synthase 1 (platelet)), CYP2J2 (cytochrome P450, family 2, subfamily J, polypeptide 2), TBXA2R (thromboxane A2 receptor), ADH1C (alcohol dehydrogenase 1C (class I), gamma polypeptide), ALOX12 (arachidonate 12-lipoxygenase), AHSG (alpha-2-HS-glycoprotein), BHMT (betaine-homocysteine methyltransferase), GJA4 (gap junction protein, alpha 4, 37 kDa), SLC25A4 (solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 4), ACLY (ATP citrate lyase), ALOX5AP (arachidonate 5-lipoxygenase-activating protein), NUMA1 (nuclear mitotic apparatus protein 1), CYP27B1 (cytochrome P450, family 27, subfamily B, polypeptide 1), CYSLTR2 (cysteinyl leukotriene receptor 2), SOD3 (superoxide dismutase 3, 
extracellular), LTC4S (leukotriene C4 synthase), UCN (urocortin), GHRL (ghrelin/obestatin prepropeptide), APOC2 (apolipoprotein C-II), CLEC4A (C-type lectin domain family 4, member A), KBTBD10 (kelch repeat and BTB (POZ) domain containing 10), TNC (tenascin C), TYMS (thymidylate synthetase), SHC1 (SHC (Src homology 2 domain containing) transforming protein 1), LRP1 (low density lipoprotein receptor-related protein 1), SOCS3 (suppressor of cytokine signaling 3), ADH1B (alcohol dehydrogenase 1B (class I), beta polypeptide), KLK3 (kallikrein-related peptidase 3), HSD11B1 (hydroxysteroid (11-beta) dehydrogenase 1), VKORC1 (vitamin K epoxide reductase complex, subunit 1), SERPINB2 (serpin peptidase inhibitor, clade B (ovalbumin), member 2), TNS1 (tensin 1), RNF19A (ring finger protein 19A), EPOR (erythropoietin receptor), ITGAM (integrin, alpha M (complement component 3 receptor 3 subunit)), PITX2 (paired-like homeodomain 2), MAPK7 (mitogen-activated protein kinase 7), FCGR3A (Fc fragment of IgG, low affinity IIIa, receptor (CD16a)), LEPR (leptin receptor), ENG (endoglin), GPX1 (glutathione peroxidase 1), GOT2 (glutamic-oxaloacetic transaminase 2, mitochondrial (aspartate aminotransferase 2)), HRH1 (histamine receptor H1), NR1I2 (nuclear receptor subfamily 1, group I, member 2), CRH (corticotropin releasing hormone), HTR1A (5-hydroxytryptamine (serotonin) receptor 1A), VDAC1 (voltage-dependent anion channel 1), HPSE (heparanase), SFTPD (surfactant protein D), TAP2 (transporter 2, ATP-binding cassette, sub-family B (MDR/TAP)), RNF123 (ring finger protein 123), PTK2B (PTK2B protein tyrosine kinase 2 beta), NTRK2 (neurotrophic tyrosine kinase, receptor, type 2), IL6R (interleukin 6 receptor), ACHE (acetylcholinesterase (Yt blood group)), GLP1R (glucagon-like peptide 1 receptor), GHR (growth hormone receptor), GSR (glutathione reductase), NQO1 (NAD(P)H dehydrogenase, quinone 1), NR5A1 (nuclear receptor subfamily 5, group A, member 1), GJB2 (gap junction protein, beta 2, 26 
kDa), SLC9A1 (solute carrier family 9 (sodium/hydrogen exchanger), member 1), MAOA (monoamine oxidase A), PCSK9 (proprotein convertase subtilisin/kexin type 9), FCGR2A (Fc fragment of IgG, low affinity IIa, receptor (CD32)), SERPINF1 (serpin peptidase inhibitor, clade F (alpha-2 antiplasmin, pigment epithelium derived factor), member 1), EDN3 (endothelin 3), DHFR (dihydrofolate reductase), GAS6 (growth arrest-specific 6), SMPD1 (sphingomyelin phosphodiesterase 1, acid lysosomal), UCP2 (uncoupling protein 2 (mitochondrial, proton carrier)), TFAP2A (transcription factor AP-2 alpha (activating enhancer binding protein 2 alpha)), C4BPA (complement component 4 binding protein, alpha), SERPINF2 (serpin peptidase inhibitor, clade F (alpha-2 antiplasmin, pigment epithelium derived factor), member 2), TYMP (thymidine phosphorylase), ALPP (alkaline phosphatase, placental (Regan isozyme)), CXCR2 (chemokine (C—X—C motif) receptor 2), SLC39A3 (solute carrier family 39 (zinc transporter), member 3), ABCG2 (ATP-binding cassette, sub-family G (WHITE), member 2), ADA (adenosine deaminase), JAK3 (Janus kinase 3), HSPA1A (heat shock 70 kDa protein 1A), FASN (fatty acid synthase), FGF1 (fibroblast growth factor 1 (acidic)), F11 (coagulation factor XI), ATP7A (ATPase, Cu++ transporting, alpha polypeptide), CR1 (complement component (3b/4b) receptor 1 (Knops blood group)), GFAP (glial fibrillary acidic protein), ROCK1 (Rho-associated, coiled-coil containing protein kinase 1), MECP2 (methyl CpG binding protein 2 (Rett syndrome)), MYLK (myosin light chain kinase), BCHE (butyrylcholinesterase), LIPE (lipase, hormone-sensitive), PRDX5 (peroxiredoxin 5), ADORA1 (adenosine A1 receptor), WRN (Werner syndrome, RecQ helicase-like), CXCR3 (chemokine (C—X—C motif) receptor 3), CD81 (CD81 molecule), SMAD7 (SMAD family member 7), LAMC2 (laminin, gamma 2), MAP3K5 (mitogen-activated protein kinase kinase kinase 5), CHGA (chromogranin A (parathyroid secretory protein 1)), IAPP (islet amyloid 
polypeptide), RHO (rhodopsin), ENPP1 (ectonucleotide pyrophosphatase/phosphodiesterase 1), PTHLH (parathyroid hormone-like hormone), NRG1 (neuregulin 1), VEGFC (vascular endothelial growth factor C), ENPEP (glutamyl aminopeptidase (aminopeptidase A)), CEBPB (CCAAT/enhancer binding protein (C/EBP), beta), NAGLU (N-acetylglucosaminidase, alpha), F2RL3 (coagulation factor II (thrombin) receptor-like 3), CX3CL1 (chemokine (C—X3-C motif) ligand 1), BDKRB1 (bradykinin receptor B1), ADAMTS13 (ADAM metallopeptidase with thrombospondin type 1 motif, 13), ELANE (elastase, neutrophil expressed), ENPP2 (ectonucleotide pyrophosphatase/phosphodiesterase 2), CISH (cytokine inducible SH2-containing protein), GAST (gastrin), MYOC (myocilin, trabecular meshwork inducible glucocorticoid response), ATP1A2 (ATPase, Na+/K+ transporting, alpha 2 polypeptide), NF1 (neurofibromin 1), GJB1 (gap junction protein, beta 1, 32 kDa), MEF2A (myocyte enhancer factor 2A), VCL (vinculin), BMPR2 (bone morphogenetic protein receptor, type II (serine/threonine kinase)), TUBB (tubulin, beta), CDC42 (cell division cycle 42 (GTP binding protein, 25 kDa)), KRT18 (keratin 18), HSF1 (heat shock transcription factor 1), MYB (v-myb myeloblastosis viral oncogene homolog (avian)), PRKAA2 (protein kinase, AMP-activated, alpha 2 catalytic subunit), ROCK2 (Rho-associated, coiled-coil containing protein kinase 2), TFPI (tissue factor pathway inhibitor (lipoprotein-associated coagulation inhibitor)), PRKG1 (protein kinase, cGMP-dependent, type I), BMP2 (bone morphogenetic protein 2), CTNND1 (catenin (cadherin-associated protein), delta 1), CTH (cystathionase (cystathionine gamma-lyase)), CTSS (cathepsin S), VAV2 (vav 2 guanine nucleotide exchange factor), NPY2R (neuropeptide Y receptor Y2), IGFBP2 (insulin-like growth factor binding protein 2, 36 kDa), CD28 (CD28 molecule), GSTA1 (glutathione S-transferase alpha 1), PPIA (peptidylprolyl isomerase A (cyclophilin A)), APOH (apolipoprotein H (beta-2-glycoprotein I)), 
S100A8 (S100 calcium binding protein A8), IL11 (interleukin 11), ALOX15 (arachidonate 15-lipoxygenase), FBLN1 (fibulin 1), NR1H3 (nuclear receptor subfamily 1, group H, member 3), SCD (stearoyl-CoA desaturase (delta-9-desaturase)), GIP (gastric inhibitory polypeptide), CHGB (chromogranin B (secretogranin 1)), PRKCB (protein kinase C, beta), SRD5A1 (steroid-5-alpha-reductase, alpha polypeptide 1 (3-oxo-5 alpha-steroid delta 4-dehydrogenase alpha 1)), HSD11B2 (hydroxysteroid (11-beta) dehydrogenase 2), CALCRL (calcitonin receptor-like), GALNT2 (UDP-N-acetyl-alpha-D-galactosaminepolypeptide N-acetylgalactosaminyltransferase 2 (GalNAc-T2)), ANGPTL4 (angiopoietin-like 4), KCNN4 (potassium intermediate/small conductance calcium-activated channel, subfamily N, member 4), PIK3C2A (phosphoinositide-3-kinase, class 2, alpha polypeptide), HBEGF (heparin-binding EGF-like growth factor), CYP7A1 (cytochrome P450, family 7, subfamily A, polypeptide 1), HLA-DRB5 (major histocompatibility complex, class II, DR beta 5), BNIP3 (BCL2/adenovirus E1B 19 kDa interacting protein 3), GCKR (glucokinase (hexokinase 4) regulator), S100A12 (S100 calcium binding protein A12), PADI4 (peptidyl arginine deaminase, type IV), HSPA14 (heat shock 70 kDa protein 14), CXCR1 (chemokine (C—X—C motif) receptor 1), H19 (H19, imprinted maternally expressed transcript (non-protein coding)), KRTAP19-3 (keratin associated protein 19-3), insulin, RAC2 (ras-related C3 botulinum toxin substrate 2 (rho family, small GTP binding protein Rac2)), RYR1 (ryanodine receptor 1 (skeletal)), CLOCK (clock homolog (mouse)), NGFR (nerve growth factor receptor (TNFR superfamily, member 16)), DBH (dopamine beta-hydroxylase (dopamine beta-monooxygenase)), CHRNA4 (cholinergic receptor, nicotinic, alpha 4), CACNA1C (calcium channel, voltage-dependent, L type, alpha 1C subunit), PRKAG2 (protein kinase, AMP-activated, gamma 2 non-catalytic subunit), CHAT (choline acetyltransferase), PTGDS (prostaglandin D2 synthase 21 kDa (brain)), 
NR1H2 (nuclear receptor subfamily 1, group H, member 2), TEK (TEK tyrosine kinase, endothelial), VEGFB (vascular endothelial growth factor B), MEF2C (myocyte enhancer factor 2C), MAPKAPK2 (mitogen-activated protein kinase-activated protein kinase 2), TNFRSF11A (tumor necrosis factor receptor superfamily, member 11a, NFKB activator), HSPA9 (heat shock 70 kDa protein 9 (mortalin)), CYSLTR1 (cysteinyl leukotriene receptor 1), MAT1A (methionine adenosyltransferase I, alpha), OPRL1 (opiate receptor-like 1), IMPA1 (inositol(myo)-1(or 4)-monophosphatase 1), CLCN2 (chloride channel 2), DLD (dihydrolipoamide dehydrogenase), PSMA6 (proteasome (prosome, macropain) subunit, alpha type, 6), PSMB8 (proteasome (prosome, macropain) subunit, beta type, 8 (large multifunctional peptidase 7)), CHI3L1 (chitinase 3-like 1 (cartilage glycoprotein-39)), ALDH1B1 (aldehyde dehydrogenase 1 family, member B1), PARP2 (poly (ADP-ribose) polymerase 2), STAR (steroidogenic acute regulatory protein), LBP (lipopolysaccharide binding protein), ABCC6 (ATP-binding cassette, sub-family C(CFTR/MRP), member 6), RGS2 (regulator of G-protein signaling 2, 24 kDa), EFNB2 (ephrin-B2), cystic fibrosis transmembrane conductance regulator (CFTR), GJB6 (gap junction protein, beta 6, 30 kDa), APOA2 (apolipoprotein A-II), AMPD1 (adenosine monophosphate deaminase 1), DYSF (dysferlin, limb girdle muscular dystrophy 2B (autosomal recessive)), FDFT1 (farnesyl-diphosphate farnesyltransferase 1), EDN2 (endothelin 2), CCR6 (chemokine (C—C motif) receptor 6), GJB3 (gap junction protein, beta 3, 31 kDa), IL1RL1 (interleukin 1 receptor-like 1), ENTPD1 (ectonucleoside triphosphate diphosphohydrolase 1), BBS4 (Bardet-Biedl syndrome 4), CELSR2 (cadherin, EGF LAG seven-pass G-type receptor 2 (flamingo homolog, Drosophila)), F11R (F11 receptor), RAPGEF3 (Rap guanine nucleotide exchange factor (GEF) 3), HYAL1 (hyaluronoglucosaminidase 1), ZNF259 (zinc finger protein 259), ATOX1 (ATX1 antioxidant protein 1 homolog (yeast)), ATF6 
(activating transcription factor 6), KHK (ketohexokinase (fructokinase)), SAT1 (spermidine/spermine N1-acetyltransferase 1), GGH (gamma-glutamyl hydrolase (conjugase, folylpolygammaglutamyl hydrolase)), TIMP4 (TIMP metallopeptidase inhibitor 4), SLC4A4 (solute carrier family 4, sodium bicarbonate cotransporter, member 4), PDE2A (phosphodiesterase 2A, cGMP-stimulated), PDE3B (phosphodiesterase 3B, cGMP-inhibited), FADS1 (fatty acid desaturase 1), FADS2 (fatty acid desaturase 2), TMSB4X (thymosin beta 4, X-linked), TXNIP (thioredoxin interacting protein), LIMS1 (LIM and senescent cell antigen-like domains 1), RHOB (ras homolog gene family, member B), LY96 (lymphocyte antigen 96), FOXO1 (forkhead box O1), PNPLA2 (patatin-like phospholipase domain containing 2), TRH (thyrotropin-releasing hormone), GJC1 (gap junction protein, gamma 1, 45 kDa), SLC17A5 (solute carrier family 17 (anion/sugar transporter), member 5), FTO (fat mass and obesity associated), GJD2 (gap junction protein, delta 2, 36 kDa), PSRC1 (proline/serine-rich coiled-coil 1), CASP12 (caspase 12 (gene/pseudogene)), GPBAR1 (G protein-coupled bile acid receptor 1), PXK (PX domain containing serine/threonine kinase), IL33 (interleukin 33), TRIB1 (tribbles homolog 1 (Drosophila)), PBX4 (pre-B-cell leukemia homeobox 4), NUPR1 (nuclear protein, transcriptional regulator, 1), SEP15 (15 kDa selenoprotein), CILP2 (cartilage intermediate layer protein 2), TERC (telomerase RNA component), GGT2 (gamma-glutamyltransferase 2), MT-CO1 (mitochondrially encoded cytochrome c oxidase I), UOX (urate oxidase, pseudogene), a CRISPR/Cas effector polypeptide, an enzymatically active CRISPR/Cas effector polypeptide (e.g., one that is capable of cleaving a target nucleic acid), and a CRISPR/Cas effector polypeptide that is not enzymatically active (e.g., one that does not cleave a target nucleic acid, but retains binding to the target nucleic acid). 
In some cases, the donor DNA encodes a wild-type version of any of the foregoing polypeptides; i.e., the donor DNA can encode a “normal” version that does not include a mutation(s) that results in reduced function, lack of function, or pathogenesis.


In some cases, the donor DNA comprises a nucleotide sequence encoding a fluorescent polypeptide. Suitable fluorescent proteins include, but are not limited to, green fluorescent protein (GFP) or variants thereof, blue fluorescent variant of GFP (BFP), cyan fluorescent variant of GFP (CFP), yellow fluorescent variant of GFP (YFP), enhanced GFP (EGFP), enhanced CFP (ECFP), enhanced YFP (EYFP), GFPS65T, Emerald, Topaz (TYFP), Venus, Citrine, mCitrine, GFPuv, destabilized EGFP (dEGFP), destabilized ECFP (dECFP), destabilized EYFP (dEYFP), mCFPm, Cerulean, T-Sapphire, CyPet, YPet, mKO, HcRed, t-HcRed, DsRed, DsRed2, DsRed-monomer, J-Red, dimer2, t-dimer2(12), mRFP1, pocilloporin, Renilla GFP, Monster GFP, paGFP, Kaede protein, kindling protein, and phycobiliproteins and phycobiliprotein conjugates including B-Phycoerythrin, R-Phycoerythrin, and Allophycocyanin. Other examples of fluorescent proteins include mHoneydew, mBanana, mOrange, dTomato, tdTomato, mTangerine, mStrawberry, mCherry, mGrape1, mRaspberry, mGrape2, mPlum (Shaner et al. (2005) Nat. Methods 2:905-909), and the like. Any of a variety of fluorescent and colored proteins from Anthozoan species, as described in, e.g., Matz et al. (1999) Nature Biotechnol. 17:969-973, can be encoded.


In some cases, the donor DNA encodes an RNA, e.g., an siRNA, a microRNA, a short hairpin RNA (shRNA), an anti-sense RNA, a riboswitch, a ribozyme, an aptamer, a ribosomal RNA, a transfer RNA, and the like.


A donor DNA can include, in addition to a nucleotide sequence encoding one or more gene products (e.g., an RNA and/or a polypeptide), one or more transcriptional control elements, e.g., a promoter, an enhancer, and the like. In some cases, the transcriptional control element is inducible. In some cases, the promoter is reversible. In some cases, the transcriptional control element is constitutive. In some cases, the promoter is functional in a eukaryotic cell. In some cases, the promoter is a cell type-specific promoter. In some cases, the promoter is a tissue-specific promoter.


The nucleotide sequence of the donor DNA is typically not identical to the target nucleic acid (e.g., genomic sequence) that it replaces. Rather, the donor DNA may contain one or more single base changes, insertions, deletions, inversions, or rearrangements with respect to the target nucleic acid (e.g., genomic sequence), so long as sufficient homology is present to support homology-directed repair (e.g., for gene correction, e.g., to convert a disease-causing base pair to a non-disease-causing base pair). In some cases, the donor DNA comprises a non-homologous sequence flanked by two regions of homology, such that homology-directed repair between the target DNA region and the two flanking sequences results in insertion of the non-homologous sequence at the target region. Donor DNA may also comprise a vector backbone containing sequences that are not homologous to the DNA region of interest (the target nucleic acid) and that are not intended for insertion into the DNA region of interest (the target nucleic acid). Generally, the homologous region(s) of a donor sequence will have at least 50% sequence identity to a target nucleic acid (e.g., a genomic sequence) with which recombination is desired. In certain cases, 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% sequence identity is present. Any value between 1% and 100% sequence identity can be present, depending upon the length of the donor polynucleotide.
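As an illustration of the sequence-identity figures above, percent identity between a homologous region of a donor and the corresponding target region could be estimated as sketched below. This is a minimal sketch only: it assumes the two sequences are already aligned, ungapped, and of equal length, and the function name `percent_identity` and the example sequences are hypothetical and not part of the disclosure.

```python
def percent_identity(donor_region: str, target_region: str) -> float:
    """Return percent identity for two pre-aligned, equal-length sequences.

    Assumption for illustration: no gaps and no alignment step are handled
    here; a real comparison would typically align the sequences first.
    """
    if not donor_region or len(donor_region) != len(target_region):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(donor_region.upper(), target_region.upper())
    matches = sum(1 for d, t in pairs if d == t)
    return 100.0 * matches / len(donor_region)


# Hypothetical 10-nucleotide regions that differ at a single position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
# Compare against the "at least 50%" figure given in the text.
meets_minimum = identity >= 50.0
```

A comparison of this kind treats the stated percentages as simple thresholds; how identity is measured over longer or gapped regions is left open here.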


The donor DNA may comprise certain nucleotide sequence differences as compared to the target nucleic acid (e.g., genomic sequence), where such differences include, e.g., restriction sites, nucleotide polymorphisms, selectable markers (e.g., drug resistance genes, fluorescent proteins, enzymes, etc.), etc., which may be used to assess for successful insertion of the donor DNA at the cleavage site or in some cases may be used for other purposes (e.g., to signify expression at the targeted genomic locus). In some cases, if located in a coding region, such nucleotide sequence differences will not change the amino acid sequence, or will make silent amino acid changes (i.e., changes which do not affect the structure or function of the protein). Alternatively, these sequence differences may include flanking recombination sequences such as FLPs, loxP sequences, or the like, that can be activated at a later time for removal of the marker sequence. In some cases, the donor DNA will include one or more nucleotide sequences to aid in localization of the donor to the nucleus of the recipient cell or to aid in the integration of the donor DNA into the target nucleic acid. For example, in some cases, the donor DNA may comprise one or more nucleotide sequences encoding one or more nuclear localization signals (e.g., PKKKRKV (SEQ ID NO:1), VSRKRPRP (SEQ ID NO:2), QRKRKQ (SEQ ID NO:3), and the like) (Freitas et al. (2009) Curr Genomics 10:550-7). In some cases, the donor DNA will include nucleotide sequences to recruit DNA repair enzymes to increase insertion efficiency. Human enzymes involved in homology-directed repair include MRN-CtIP, BLM-DNA2, Exo1, ERCC1, Rad51, Rad52, Ligase 1, Polθ, PARP1, Ligase 3, BRCA2, RecQ/BLM-TopoIIIα, RTEL, Polδ, and Polη (Verma and Greenberg (2016) Genes Dev. 30(10):1138-1154). In some cases, the donor DNA is delivered as reconstituted chromatin (Cruz-Becerra and Kadonaga (2020) eLife 9:e55780; DOI: 10.7554/eLife.55780).


In some cases, the ends of the donor DNA are protected (e.g., from exonucleolytic degradation) by any convenient method; such methods are known to those of skill in the art. For example, one or more dideoxynucleotide residues can be added to the 3′ terminus of a linear molecule and/or self-complementary oligonucleotides can be ligated to one or both ends. See, for example, Chang et al. (1987) Proc. Natl. Acad. Sci. USA 84:4959-4963; Nehls et al. (1996) Science 272:886-889. Additional methods for protecting exogenous polynucleotides from degradation include, but are not limited to, addition of terminal amino group(s) and the use of modified internucleotide linkages such as, for example, phosphorothioates, phosphoramidates, and O-methyl ribose or deoxyribose residues. As an alternative to protecting the termini of a linear donor DNA, additional lengths of sequence may be included outside of the regions of homology that can be degraded without impacting recombination.


Generating Nanostructures

DNA origami structures incorporate DNA as a building material to make nanoscale shapes. The DNA origami process can involve the folding of one or more long, “scaffold” DNA strands (e.g., a single-stranded linear donor DNA) into a particular shape using a plurality of rationally designed “staple” DNA strands (“staple oligonucleotides” or “staple nucleic acids”). The sequences of the staple strands can be designed such that they hybridize to particular portions of the scaffold strands and, in doing so, force the scaffold strands into a particular shape. The DNA origami can include a scaffold strand and a plurality of rationally designed staple strands. The scaffold strand can have any sufficiently non-repetitive sequence.


As noted above, staple oligonucleotides are designed to fold the linear single-stranded donor DNA into a two-dimensional or three-dimensional nanostructure. In some cases, the staple oligonucleotides are selected such that the DNA nanostructure is planar. In some cases, the staple oligonucleotides are selected such that the DNA nanostructure is substantially barrel- or tube-shaped. The staple oligonucleotides can be selected such that the barrel shape is closed at both ends or is open at one or both ends. In certain cases, the barrel shape of the DNA nanostructure can be a hexagonal tube. In some cases, the staple oligonucleotides are designed to fold the linear single-stranded donor DNA into a nanostructure comprising parallel helices. In some cases, the staple oligonucleotides are designed to fold the linear single-stranded donor DNA into a two-dimensional or three-dimensional polygonal network wherein the edges of the network are comprised of blocks of at least two parallel helices. In some cases, the staple oligonucleotides are designed to fold the linear single-stranded donor DNA into a 6-helix nanostructure. In some cases, the staple oligonucleotides are designed to fold the linear single-stranded donor DNA into a 24-helix nanostructure.


Methods useful in making DNA nanostructures can be found, for example, in Rothemund (2006) Nature 440:297-302; Castro et al. (2011) Nat. Methods 8:221; Douglas et al. (2009) Nature 459:414-418; Dietz et al. (2009) Science 325:725-730; and U.S. Pat. App. Pub. Nos. 2007/0117109, 2008/0287668, 2010/0069621, 2015/0329584, and 2010/0216978, each of which is incorporated by reference in its entirety. Staple design can be facilitated using, for example, CADnano software, available on the internet at www(dot)cadnano(dot)org. Once the staple oligonucleotides are designed, they are synthesized, mixed with the linear single-stranded donor DNA in a buffer solution, heated (for example to about 90 degrees Celsius), then cooled to room temperature. The buffer solution is selected to allow for hybridization of the linear single-stranded donor DNA with the staple oligonucleotides. In some cases, the buffer comprises magnesium. Generally, a stoichiometric excess of the staple oligonucleotides is used. As one non-limiting example, 10-100 times as many staple oligonucleotide molecules are present as would be needed to fold all the linear single-stranded donor DNA molecules.
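The folding procedure described above (a stoichiometric excess of staples, heating to about 90 degrees Celsius, then cooling to room temperature) can be sketched numerically as follows. This is an illustrative sketch under stated assumptions, not a protocol from the disclosure: the scaffold concentration of 20 nM, the 5-degree cooling step, the value of 25 degrees Celsius for room temperature, and the function names `staple_concentration_nm` and `cooling_ramp` are all hypothetical.

```python
def staple_concentration_nm(scaffold_nm: float, fold_excess: float) -> float:
    """Concentration (nM) of each staple for a given molar excess over the donor DNA."""
    if scaffold_nm <= 0 or fold_excess <= 0:
        raise ValueError("concentration and fold excess must be positive")
    return scaffold_nm * fold_excess


def cooling_ramp(start_c: int = 90, end_c: int = 25, step_c: int = 5) -> list[int]:
    """List the temperatures visited when cooling from start_c to end_c in fixed steps.

    The step size is an assumption; the text specifies only heating to about
    90 degrees Celsius followed by cooling to room temperature.
    """
    if step_c <= 0 or start_c < end_c:
        raise ValueError("need a positive step and start_c >= end_c")
    temps = list(range(start_c, end_c - 1, -step_c))
    if temps[-1] != end_c:
        temps.append(end_c)  # finish exactly at the final temperature
    return temps


# Hypothetical example: 20 nM donor DNA with a 10-fold excess of each staple.
each_staple_nm = staple_concentration_nm(20.0, 10.0)
ramp = cooling_ramp()
```

The 10-fold figure used here is the low end of the 10-100 fold range given as a non-limiting example; any positive excess can be passed to the function.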


The DNA nanostructures can be purified to remove excess staple oligonucleotides and/or to remove unfolded DNA. Purification can be carried out using, e.g., gel electrophoresis. In some cases, a DNA nanostructure present in a composition of the present disclosure is at least 50% pure, at least 60% pure, at least 70% pure, at least 80% pure, at least 90% pure, at least 95% pure, at least 98% pure, or at least 99% pure.


DNA Nanostructures

The present disclosure provides a composition comprising: a) a gene-editing polypeptide or a gene-editing system (e.g., a gene-editing system comprising a CRISPR/Cas effector polypeptide and a guide nucleic acid); and b) a DNA nanostructure formed by hybridizing a linear single-stranded donor DNA with one or more staple oligonucleotides, i.e., a DNA nanostructure comprising a donor DNA and at least one staple oligonucleotide, where the donor DNA comprises a first homology arm and a second homology arm that are in proximity to one another by virtue of the folding of the donor DNA induced by the staple oligonucleotide(s). The first and second homology arms are single-stranded and are located at or near the 5′ and the 3′ ends, respectively, of the DNA nanostructure. As described above, the donor DNA comprises a nucleotide sequence of interest between the first and second homology arms.


The first and the second homology arms are brought into proximity to one another in the nanostructure formed by hybridization of the single-stranded donor DNA with the one or more staple oligonucleotides, relative to their positions in the linear single-stranded donor DNA (donor DNA not hybridized with the one or more staple oligonucleotides). For example, the distance between the first and the second homology arms can be from about 5 nm to about 150 nm; e.g., from about 5 nm to about 10 nm, from about 10 nm to about 20 nm, from about 20 nm to about 25 nm, from about 25 nm to about 50 nm, from about 50 nm to about 75 nm, from about 75 nm to about 100 nm, from about 100 nm to about 110 nm, from about 110 nm to about 120 nm, from about 120 nm to about 130 nm, from about 130 nm to about 140 nm, or from about 140 nm to about 150 nm.


Thus, the DNA nanostructure includes: a) a donor DNA comprising: i) a first homology arm having a length of from about 10 nucleotides to about 400 nucleotides; ii) a nucleotide sequence of interest; and iii) a second homology arm having a length of from about 10 nucleotides to about 400 nucleotides; and b) one or more staple oligonucleotides hybridized to the donor DNA at one or more sites between the first and the second homology arms; where donor DNA that is between the first and the second homology arms (the “nucleotide sequence of interest”) is folded by virtue of hybridization of the one or more staple oligonucleotides such that the folded structure has a length of from about 5 nm to about 500 nm; e.g., from about 5 nm to about 10 nm, from about 10 nm to about 20 nm, from about 20 nm to about 25 nm, from about 25 nm to about 50 nm, from about 50 nm to about 75 nm, from about 75 nm to about 100 nm, from about 100 nm to about 110 nm, from about 110 nm to about 120 nm, from about 120 nm to about 130 nm, from about 130 nm to about 140 nm, from about 140 nm to about 150 nm, from about 150 nm to about 175 nm, from about 175 nm to about 200 nm, from about 200 nm to about 225 nm, from about 225 nm to about 250 nm, from about 250 nm to about 275 nm, from about 275 nm to about 300 nm, from about 300 nm to about 325 nm, from about 325 nm to about 350 nm, from about 350 nm to about 375 nm, from about 375 nm to about 400 nm, from about 400 nm to about 425 nm, from about 425 nm to about 450 nm, from about 450 nm to about 475 nm, or from about 475 nm to about 500 nm. In some cases, the folded structure has a length of from about 5 nm to about 150 nm. In some cases, the folded structure has a length of from about 25 nm to about 150 nm. In some cases, the folded structure has a length of from about 50 nm to about 150 nm.
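The arrangement just described (first homology arm, nucleotide sequence of interest, second homology arm, with staples hybridizing between the arms) can be represented as a simple data structure. The following is a minimal sketch for illustration only; the class name `DonorDesign`, its methods, and the example sequences are hypothetical, and the "about 10" and "about 400" nucleotide limits are treated as hard bounds purely for simplicity.

```python
from dataclasses import dataclass

# Arm-length limits from the text, treated here as exact bounds (an assumption).
MIN_ARM_NT = 10
MAX_ARM_NT = 400


@dataclass(frozen=True)
class DonorDesign:
    first_arm: str             # homology arm at or near the 5' end
    sequence_of_interest: str  # region folded by the staple oligonucleotides
    second_arm: str            # homology arm at or near the 3' end

    def full_sequence(self) -> str:
        """Linear donor sequence written 5' to 3'."""
        return self.first_arm + self.sequence_of_interest + self.second_arm

    def arms_in_range(self) -> bool:
        """True if both homology arms fall within the stated length limits."""
        arms = (self.first_arm, self.second_arm)
        return all(MIN_ARM_NT <= len(arm) <= MAX_ARM_NT for arm in arms)

    def staple_region(self) -> tuple[int, int]:
        """0-based, end-exclusive span between the arms where staples hybridize."""
        start = len(self.first_arm)
        return start, start + len(self.sequence_of_interest)


# Hypothetical design: 40-nt arms flanking a 1000-nt sequence of interest.
design = DonorDesign("A" * 40, "C" * 1000, "G" * 40)
region = design.staple_region()
```

Keeping the arms outside the staple region mirrors the statement that the homology arms remain single-stranded while the intervening sequence is folded.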


In some cases, the DNA nanostructure is a 6-helix bundle. In some cases, the DNA nanostructure is a 24-helix bundle. In some cases, the DNA nanostructure is cuboid. In some cases, the DNA nanostructure is a cylinder, i.e., is cylindrical in shape. In some cases, a DNA nanostructure is a multilayer DNA nanostructure. In some cases, the multilayer DNA nanostructure is a honeycomb lattice. In some cases, the multilayer DNA nanostructure is a square lattice. In some cases, the multilayer DNA nanostructure will incorporate curvature. In some cases, some parts of the DNA nanostructure will remain single stranded and will not include staples (will not have staple oligonucleotides hybridized to the DNA), to allow for flexibility.


In some cases, the DNA nanostructure is described based on the number of helices formed in an X direction, the number of helices formed in a Y direction, and a depth of such helices (indicated by the length in bases of such helices). For example, a cuboid shape comprising 3 helices in the X direction and 3 helices in the Y direction, where each helix has a length of 32 bases, would be described as “3H×3H (×32b).” Helices so formed may be discontinuous. The DNA nanostructure can be any of a variety of sizes and shapes, including cuboids such as 3H×3H (×32b, ×64b, ×128b, ×256b, ×512b, or ×1024b); 4H×4H (×32b, ×64b, ×128b, ×256b, or ×512b); and 12H×12H (×48b). The DNA nanostructure can be a cylinder such as 6H×10H (×32b, ×64b, or ×128b). The DNA nanostructure can be a honeycomb lattice such as 6H×6H (×84b). See, e.g., US 2015/0218204.
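To make the helix notation concrete, the sketch below parses a label of the form “3H×3H (×32b)” and multiplies the three numbers together. This is an illustrative sketch only: the function names `parse_shape` and `helix_base_positions` are hypothetical, the parser accepts only single-depth labels, and the product is a count of base positions along the helices rather than a donor DNA length, since the helices may be discontinuous.

```python
import re

# Matches single-depth labels such as "3H×3H (×32b)"; the multiplication
# sign is the character U+00D7 used in the notation above.
_SHAPE_PATTERN = re.compile(r"(\d+)H×(\d+)H \(×(\d+)b\)")


def parse_shape(label: str) -> tuple[int, int, int]:
    """Return (helices in X, helices in Y, helix length in bases)."""
    match = _SHAPE_PATTERN.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognized shape label: {label!r}")
    x_helices, y_helices, depth_bases = (int(g) for g in match.groups())
    return x_helices, y_helices, depth_bases


def helix_base_positions(label: str) -> int:
    """Number of base positions implied by the label (X helices x Y helices x depth)."""
    x_helices, y_helices, depth_bases = parse_shape(label)
    return x_helices * y_helices * depth_bases


# The cuboid example from the text: 3 x 3 helices, each 32 bases long.
cuboid_positions = helix_base_positions("3H×3H (×32b)")
```

For the cuboid example, the calculation is 3 × 3 × 32 = 288 base positions.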


In some cases, the DNA nanostructure serves as a carrier for an agent (also referred to as a “payload”) such as a small molecule, an RNA, a polypeptide, or a ligand for a receptor. The agent can be a therapeutic agent. The agent can be a labeling agent (e.g., an imaging agent). In some cases, the payload is a small molecule, e.g., a cytotoxic anti-cancer agent, an anti-angiogenic agent, and the like. The payload can be a polypeptide, e.g., an anti-angiogenic polypeptide, an immunotherapeutic agent (e.g., a chimeric antigen receptor; a therapeutic antibody), an antibody (e.g., a single-chain Fv, a nanobody, etc.), and the like. The payload can be a peptide hormone, a growth factor, and the like. The payload can be non-covalently associated with the DNA nanostructure. The payload can be covalently linked to the DNA nanostructure. In some cases, a targeting moiety (e.g., an antibody) is covalently linked to the DNA nanostructure.


In some cases, the payload is a DNA-dependent protein kinase (DNA-PK) inhibitor. In some cases, the DNA-PK inhibitor is selected from NU7441 (8-(4-dibenzothienyl)-2-(4-morpholinyl)-4H-1-benzopyran-4-one); M3814 ((S)-(2-chloro-4-fluoro-5-(7-morpholinoquinazolin-4-yl)phenyl)(6-methoxypyridazin-3-yl)methanol); NU 7026 (2-(4-morpholinyl)-4H-naphtho[1,2-b]pyran-4-one); compound 401 (2-(4-morpholinyl)-4H-pyrimido[2,1-a]isoquinolin-4-one); PI 103 hydrochloride (3-[4-(4-morpholinyl)pyrido[3′,2′:4,5]furo[3,2-d]pyrimidin-2-yl]phenol hydrochloride); DMNB (4,5-dimethoxy-2-nitrobenzaldehyde); ETP 45658 (3-[1-methyl-4-(4-morpholinyl)-1H-pyrazolo[3,4-d]pyrimidin-6-yl]phenol); KU 0060648 (4-ethyl-N-[4-[2-(4-morpholinyl)-4-oxo-4H-1-benzopyran-8-yl]-1-dibenzothienyl]-1-piperazineacetamide); LTURM 34 (8-(4-dibenzothienyl)-2-(4-morpholinyl)-4H-1,3-benzoxazin-4-one); 1-(2-hydroxy-4-morpholin-4-yl-phenyl)-ethanone; PIK-75 HCl (2-methyl-5-nitro-2-[(6-bromoimidazo[1,2-a]pyridin-3-yl)methylene]-1-methylhydrazide-benzenesulfonic acid, monohydrochloride); and CC-115 (1-ethyl-7-(2-methyl-6-(1H-1,2,4-triazol-3-yl)pyridin-3-yl)-3,4-dihydropyrazino[2,3-b]pyrazin-2(1H)-one).


In some cases, the payload is a small molecule inhibitor that modulates DNA repair pathways (e.g., NU7441 (KU-57788) and KU-0060648, which pharmacologically inhibit non-homologous end joining pathways). NU7441 has the following structure:




[Chemical structure of NU7441]


KU-0060648 has the following structure:




[Chemical structure of KU-0060648]


The payload can be an effector polypeptide. Non-limiting examples of suitable effector polypeptides include adrenocorticotropic hormone (ACTH), amylin, angiopoietin-1, angiopoietin-2, angiotensin, atrial natriuretic peptide (ANP), bone-derived growth factor (BDGF), bone morphogenic protein (BMP) (e.g., BMP-2, BMP-4, BMP-7), brain-derived neurotrophic factor (BDNF), calcitonin, cartilage-derived growth factor (CDGF), cholecystokinin, ephrin B2 (EphB2), epidermal growth factor (EGF), erythropoietin, fibroblast growth factor (FGF), FGF-2, FGF-19, FGF-3, FGF-8, follicle stimulating hormone (FSH), gastrin, glial fibrillary acidic protein (GFAP), glucagon, granulocyte colony stimulating factor (G-CSF), granulocyte-macrophage colony stimulating factor (GM-CSF), ghrelin, hematopoietic growth factor, hepatocyte growth factor (HGF), insulin, insulin-like growth factor (IGF1), an interleukin (IL) (e.g., IL-1, IL-4, IL-5, IL-6, IL-8, IL-9, IL-10, IL-12, IL-13, IL-15, IL-18, IL-23), an interferon (IFN) (e.g., IFN-α, IFN-β, IFN-γ), keratinocyte growth factor, leptin, luteinizing hormone, macrophage inflammatory peptide MIP-1a, macrophage inflammatory peptide MIP-1b, macrophage colony stimulating factor (M-CSF), macrophage chemoattractant and activating factor (MCAF), melanocyte-stimulating hormone (MSH), nerve growth factor (NGF), neural epidermal growth factor-like-1 (NELL1), neurotrophin-3 (NT-3), neurotrophin-4 (NT-4), neuronal growth factor, neutrophil activating protein (NAP), a Notch polypeptide (e.g., Notch-1, Notch-2, Notch-3, Notch-4), oxytocin, parathyroid hormone, parathyroid hormone-related peptide (PTHrP), placental growth factor, platelet derived growth factor (PDGF), prolactin, RANTES, relaxin, renin, resistin, skeletal growth factor (SGF), somatostatin, stromal derived growth factor 1 (SDF-1), thyroid stimulating hormone (TSH), thyrotropin-releasing hormone, transforming growth factor (TGF)-β1, TGF-β2, TGF-β3, TGF-β4, TGF-β5, tumor necrosis factor (TNF), 
thrombopoietin, vascular endothelial growth factor (VEGF), vasopressin, vasoactive intestinal peptide, a Wnt polypeptide (e.g., Wnt1, Wnt2, Wnt2b (also called Wnt13), Wnt3, Wnt3a, Wnt4, Wnt5a, Wnt5b, Wnt6, Wnt7a, Wnt7b, Wnt7c, Wnt8, Wnt8a, Wnt8b, Wnt5c, Wnt10a, Wnt10b, Wnt11, Wnt14, Wnt15, or Wnt16), and the like.


In some cases, the payload is a nucleic acid payload (e.g., a DNA and/or RNA). The nucleic acid payload can be any nucleic acid of interest, e.g., the nucleic acid payload can be linear or circular, and can be a plasmid, a viral genome, an RNA (e.g., a coding RNA such as an mRNA or a non-coding RNA such as a guide RNA, a short interfering RNA (siRNA), a short hairpin RNA (shRNA), a microRNA (miRNA), and the like), a DNA, etc. In some cases, the payload is an mRNA encoding an antigen that is used to elicit an immune response (e.g., a vaccine). In some cases, the nucleic acid payload is an RNAi agent (e.g., an shRNA, an siRNA, a miRNA, etc.) or a DNA template encoding an RNAi agent. In some cases, the nucleic acid payload is an siRNA molecule (e.g., one that targets an mRNA, one that targets a miRNA). In some cases, the nucleic acid payload is an LNA molecule (e.g., one that targets a miRNA). In some cases, the nucleic acid payload is a miRNA. In some cases, the nucleic acid payload includes an mRNA that encodes a protein of interest. A nucleic acid payload can include one or more of: a modified backbone, a modified sugar, and a modified base.


Compositions

The present disclosure provides a composition comprising: a) a gene-editing polypeptide; b) a single-stranded donor DNA comprising: i) a first homology arm at or near the 5′ end of the donor DNA, wherein the first homology arm comprises a nucleotide sequence that is at least partially complementary to a first nucleotide sequence in a target nucleic acid; and ii) a second homology arm at or near the 3′ end of the donor DNA, wherein the second homology arm comprises a nucleotide sequence that is at least partially complementary to a second nucleotide sequence in the target nucleic acid; and c) one or more staple oligonucleotides, wherein the one or more staple oligonucleotides are at least partially complementary to the donor DNA such that the one or more staple oligonucleotides hybridize to the donor DNA, such that the donor DNA template folds into a nanostructure in which the first homology arm and the second homology arm are brought into proximity to one another. In some cases, where the gene-editing polypeptide is a CRISPR/Cas effector polypeptide, the composition further comprises a guide RNA. The present disclosure provides a composition comprising: a) a gene-editing polypeptide; and b) a DNA nanostructure formed by hybridizing a linear single-stranded donor DNA with one or more staple oligonucleotides, i.e., a DNA nanostructure comprising a donor DNA and at least one staple oligonucleotide, where the donor DNA comprises a first homology arm and a second homology arm that are in proximity to one another by virtue of the folding of the donor DNA induced by the staple oligonucleotide(s). In some cases, where the gene-editing polypeptide is a CRISPR/Cas effector polypeptide, the composition further comprises a guide RNA.


A composition of the present disclosure may comprise a pharmaceutically acceptable excipient, a variety of which are known in the art and need not be discussed in detail herein. Pharmaceutically acceptable excipients have been amply described in a variety of publications, including, for example, “Remington: The Science and Practice of Pharmacy”, 19th Ed. (1995), or latest edition, Mack Publishing Co; A. Gennaro (2000) “Remington: The Science and Practice of Pharmacy”, 20th edition, Lippincott, Williams, & Wilkins; Pharmaceutical Dosage Forms and Drug Delivery Systems (1999) H. C. Ansel et al., eds 7th ed., Lippincott, Williams, & Wilkins; and Handbook of Pharmaceutical Excipients (2000) A. H. Kibbe et al., eds., 3rd ed. Amer. Pharmaceutical Assoc.


A composition of the present disclosure can include: a) components as described above (e.g.: i) a single-stranded donor DNA comprising homology arms; a gene-editing polypeptide; and one or more staple oligonucleotides; or ii) a DNA nanostructure, as described above; and a gene-editing polypeptide); and b) one or more of: a buffer, a surfactant, an antioxidant, a hydrophilic polymer, a dextrin, a chelating agent, a suspending agent, a solubilizer, a thickening agent, a stabilizer, a bacteriostatic agent, a wetting agent, and a preservative. Suitable buffers include, but are not limited to, N,N-bis(2-hydroxyethyl)-2-aminoethanesulfonic acid (BES), bis(2-hydroxyethyl)amino-tris(hydroxymethyl)methane (BIS-Tris), N-(2-hydroxyethyl)piperazine-N′-3-propanesulfonic acid (EPPS or HEPPS), glycylglycine, N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid (HEPES), 3-(N-morpholino)propane sulfonic acid (MOPS), piperazine-N,N′-bis(2-ethane-sulfonic acid) (PIPES), sodium bicarbonate, 3-(N-tris(hydroxymethyl)-methyl-amino)-2-hydroxy-propanesulfonic acid (TAPSO), N-tris(hydroxymethyl)methyl-2-aminoethanesulfonic acid (TES), N-tris(hydroxymethyl)methyl-glycine (Tricine), tris(hydroxymethyl)-aminomethane (Tris), etc. Suitable salts include, e.g., NaCl, MgCl2, KCl, MgSO4, etc.


In some cases, the composition is sterile. In some cases, the composition is suitable for administration to a human subject, e.g., where the composition is sterile and is free of detectable pyrogens and/or other toxins.


A composition of the present disclosure may include other components, such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharin, talcum, cellulose, glucose, sucrose, magnesium carbonate, and the like. The compositions may contain pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents and the like, for example, sodium acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate, hydrochloride, sulfate salts, solvates (e.g., mixed ionic salts, water, organics), hydrates (e.g., water), and the like. In some cases, a composition of the present disclosure comprises saline.


In some cases, the components of a composition of the present disclosure are present in a particle, or associated with a particle. The terms “particle” and “nanoparticle” can be used interchangeably, as appropriate. For example, in some cases, a composition of the present disclosure comprises a lipid and/or a hydrophilic polymer. For example, in some cases, a composition of the present disclosure comprises a cationic lipid and a hydrophilic polymer, for instance wherein the cationic lipid comprises 1,2-dioleoyl-3-trimethylammonium-propane (DOTAP) or 1,2-ditetradecanoyl-sn-glycero-3-phosphocholine (DMPC) and/or wherein the hydrophilic polymer comprises ethylene glycol or polyethylene glycol (PEG); and/or wherein the composition further comprises cholesterol.


The components of a composition of the present disclosure can be present in a biodegradable core-shell structured nanoparticle with a poly (β-amino ester) (PBAE) core enveloped by a phospholipid bilayer shell. In some cases, particles/nanoparticles based on self-assembling bioadhesive polymers are used; such particles/nanoparticles may be applied to oral delivery. Other embodiments, such as oral absorption and ocular delivery of hydrophobic drugs are also contemplated. A molecular envelope technology, which involves an engineered polymer envelope which is protected and delivered to the site of the disease, can be used. Doses of about 5 mg/kg can be used, with single or multiple doses, depending on various factors, e.g., the target tissue.


Lipidoid compounds (e.g., as described in US patent application 20110293703) are also suitable for inclusion in a composition of the present disclosure. In one aspect, aminoalcohol lipidoid compounds are combined with components of a composition of the present disclosure to form microparticles, nanoparticles, liposomes, or micelles. The aminoalcohol lipidoid compounds may be combined with other aminoalcohol lipidoid compounds, polymers (synthetic or natural), surfactants, cholesterol, carbohydrates, proteins, lipids, etc. to form the particles. These particles may then optionally be combined with a pharmaceutical excipient to form a pharmaceutical composition.


A composition of the present disclosure can include poly(beta-amino alcohol) (PBAA). US Patent Publication No. 20130302401 relates to a class of poly(beta-amino alcohols) (PBAAs) that has been prepared using combinatorial polymerization. Sugar-based particles, for example GalNAc-based particles as described in WO2014118272 (incorporated herein by reference) and Nair, J K et al., 2014, Journal of the American Chemical Society 136 (49), 16958-16961, can be included in a composition of the present disclosure.


In some cases, components of a composition of the present disclosure are included in lipid nanoparticles (LNPs). Examples of suitable cationic lipids that can be included in a composition of the present disclosure include, e.g.: 1,2-dilinoleoyl-3-dimethylammonium-propane (DLinDAP), 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane (DLinDMA), 1,2-dilinoleyloxy-keto-N,N-dimethyl-3-aminopropane (DLinK-DMA), and 1,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLinKC2-DMA). Preparation of LNPs is described in, e.g., Rosin et al. (2011) Molecular Therapy 19:1286-2200. The PEG-lipids 3-o-[2″-(methoxypolyethyleneglycol 2000) succinoyl]-1,2-dimyristoyl-sn-glycol (PEG-S-DMG) and R-3-[(ω-methoxy-poly(ethylene glycol)2000) carbamoyl]-1,2-dimyristyloxypropyl-3-amine (PEG-C-DOMG) can also be used. Components of a composition of the present disclosure can be encapsulated in LNPs containing DLinDAP, DLinDMA, DLinK-DMA, or DLinKC2-DMA (cationic lipid:DSPC:CHOL:PEG-S-DMG or PEG-C-DOMG at 40:10:40:10 molar ratios). In some cases, 0.2% SP-DiOC18 is incorporated.


Spherical Nucleic Acid (SNA™) constructs and other nanoparticles (particularly gold nanoparticles) can be used. See, e.g., Cutler et al., J. Am. Chem. Soc. 2011 133:9254-9257, Hao et al., Small 2011 7:3158-3162, Zhang et al., ACS Nano. 2011 5:6962-6970, Cutler et al., J. Am. Chem. Soc. 2012 134:1376-1391, Young et al., Nano Lett. 2012 12:3867-71, Zheng et al., Proc. Natl. Acad. Sci. USA. 2012 109:11975-80, Mirkin, Nanomedicine 2012 7:635-638, Zhang et al., J. Am. Chem. Soc. 2012 134:16488-1691, Weintraub, Nature 2013 495:S14-S16, Choi et al., Proc. Natl. Acad. Sci. USA. 2013 110(19): 7625-7630, Jensen et al., Sci. Transl. Med. 5, 209ra152 (2013) and Mirkin, et al., Small, 10:186-192.


Self-assembling nanoparticles may be constructed with polyethyleneimine (PEI) that is PEGylated with an Arg-Gly-Asp (RGD) peptide ligand attached at the distal end of the polyethylene glycol (PEG).


In general, a “nanoparticle” refers to any particle having a diameter of less than 1000 nm. In some cases, nanoparticles suitable for use in encapsulating the components of a composition of the present disclosure have a diameter of 500 nm or less, e.g., from 25 nm to 35 nm, from 35 nm to 50 nm, from 50 nm to 75 nm, from 75 nm to 100 nm, from 100 nm to 150 nm, from 150 nm to 200 nm, from 200 nm to 300 nm, from 300 nm to 400 nm, or from 400 nm to 500 nm. In some cases, suitable nanoparticles have a diameter of from 25 nm to 200 nm. In some cases, suitable nanoparticles have a diameter of 100 nm or less. In some cases, suitable nanoparticles have a diameter of from 35 nm to 60 nm.


Nanoparticles may be provided in different forms, e.g., as solid nanoparticles (e.g., metal (such as silver, gold, iron, titanium), non-metal, lipid-based solids, polymers), suspensions of nanoparticles, or combinations thereof. Metal, dielectric, and semiconductor nanoparticles may be prepared, as well as hybrid structures (e.g., core-shell nanoparticles). Nanoparticles made of semiconducting material may also be labeled quantum dots if they are small enough (typically below 10 nm) that quantization of electronic energy levels occurs. Such nanoscale particles are used in biomedical applications as drug carriers or imaging agents and may be adapted for similar purposes in the present disclosure. Semi-solid and soft nanoparticles are also suitable for use. A prototype nanoparticle of semi-solid nature is the liposome.


In some cases, components of a composition of the present disclosure are encapsulated in a liposome. Liposomes are spherical vesicle structures composed of a uni- or multilamellar lipid bilayer surrounding internal aqueous compartments and a relatively impermeable outer lipophilic phospholipid bilayer. Liposomes can be made from several different types of lipids; however, phospholipids are most commonly used to generate liposomes. Although liposome formation is spontaneous when a lipid film is mixed with an aqueous solution, it can also be expedited by applying force in the form of shaking by using a homogenizer, sonicator, or an extrusion apparatus. Several other additives may be added to liposomes in order to modify their structure and properties. For instance, either cholesterol or sphingomyelin may be added to the liposomal mixture in order to help stabilize the liposomal structure and to prevent the leakage of the liposomal inner cargo. A liposome formulation may be mainly comprised of natural phospholipids and lipids such as 1,2-distearoyl-sn-glycero-3-phosphatidyl choline (DSPC), sphingomyelin, egg phosphatidylcholines and monosialoganglioside.


A stable nucleic-acid-lipid particle (SNALP) can be used. The SNALP formulation may contain the lipids 3-N-[(methoxypoly(ethylene glycol) 2000) carbamoyl]-1,2-dimyristyloxy-propylamine (PEG-C-DMA), 1,2-dilinoleyloxy-N,N-dimethyl-3-aminopropane (DLinDMA), 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC) and cholesterol, in a 2:40:10:48 molar percent ratio. The SNALP liposomes may be prepared by formulating DLinDMA and PEG-C-DMA with distearoylphosphatidylcholine (DSPC) and cholesterol. The resulting SNALP liposomes can be about 80-100 nm in size. A SNALP may comprise synthetic cholesterol (Sigma-Aldrich, St Louis, Mo., USA), dipalmitoylphosphatidylcholine (Avanti Polar Lipids, Alabaster, Ala., USA), 3-N-[(ω-methoxy poly(ethylene glycol)2000)carbamoyl]-1,2-dimyristyloxypropylamine, and cationic 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane. A SNALP may comprise synthetic cholesterol (Sigma-Aldrich), 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC; Avanti Polar Lipids Inc.), PEG-cDMA, and 1,2-dilinoleyloxy-3-(N,N-dimethyl)aminopropane (DLinDMA).
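The molar percent ratio recited above can be turned into per-component amounts with simple proportional arithmetic. The following Python sketch is illustrative only: the function name `lipid_amounts` and the 100 µmol total are assumptions introduced for the example, not values taken from the disclosure.

```python
from typing import Dict

# SNALP molar percent ratio recited above (PEG-C-DMA:DLinDMA:DSPC:cholesterol).
SNALP_RATIO = {"PEG-C-DMA": 2, "DLinDMA": 40, "DSPC": 10, "cholesterol": 48}


def lipid_amounts(ratio: Dict[str, float], total_umol: float) -> Dict[str, float]:
    """Split a total lipid amount (in micromoles) according to a molar ratio.

    The ratio parts need not sum to 100; they are normalized by their sum.
    """
    if total_umol <= 0:
        raise ValueError("total_umol must be positive")
    total_parts = sum(ratio.values())
    if total_parts <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    return {name: total_umol * part / total_parts for name, part in ratio.items()}


if __name__ == "__main__":
    # Hypothetical batch of 100 micromoles total lipid.
    for name, umol in lipid_amounts(SNALP_RATIO, 100.0).items():
        print(f"{name}: {umol:.1f} umol")
```

The same helper applies to any of the other molar ratios recited herein (e.g., 40:10:40:10 or 50:10:38.5:1.5), since the parts are normalized by their sum.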


Other cationic lipids, such as amino lipid 2,2-dilinoleyl-4-dimethylaminoethyl[1,3]-dioxolane (DLin-KC2-DMA) can be included in a composition of the present disclosure. A preformed vesicle with the following lipid composition may be contemplated: amino lipid, distearoylphosphatidylcholine (DSPC), cholesterol and (R)-2,3-bis(octadecyloxy) propyl-1-(methoxy poly(ethylene glycol)2000)propylcarbamate (PEG-lipid) in the molar ratio 40/10/40/10, respectively, and a FVII siRNA/total lipid ratio of approximately 0.05 (w/w). To ensure a narrow particle size distribution in the range of 70-90 nm and a low polydispersity index of 0.11±0.04 (n=56), the particles may be extruded up to three times through 80 nm membranes prior to adding the guide RNA. Particles containing the highly potent amino lipid 16 may be used, in which the molar ratio of the four lipid components 16, DSPC, cholesterol and PEG-lipid is 50/10/38.5/1.5, which may be further optimized to enhance in vivo activity.


Lipids may be formulated with components of a composition of the present disclosure to form lipid nanoparticles (LNPs). Suitable lipids include, but are not limited to, DLin-KC2-DMA4, C12-200 and colipids distearoylphosphatidylcholine, cholesterol, and PEG-DMG. The component molar ratio may be about 50/10/38.5/1.5 (DLin-KC2-DMA or C12-200/distearoylphosphatidylcholine/cholesterol/PEG-DMG).


Components of a composition of the present disclosure can be encapsulated in PLGA microspheres such as that further described in US published applications 20130252281 and 20130245107 and 20130244279.


Supercharged proteins can be included in a composition of the present disclosure. Supercharged proteins are a class of engineered or naturally occurring proteins with unusually high positive or negative net theoretical charge. Both supernegatively and superpositively charged proteins exhibit the ability to withstand thermally- or chemically-induced aggregation. Superpositively charged proteins are also able to penetrate mammalian cells. Associating cargo with these proteins, such as plasmid DNA, RNA, or other proteins, can enable the functional delivery of these macromolecules into mammalian cells both in vitro and in vivo.


Cell Penetrating Peptides (CPPs) can be included in a composition of the present disclosure. CPPs typically have an amino acid composition that either contains a high relative abundance of positively charged amino acids such as lysine or arginine or has sequences that contain an alternating pattern of polar/charged amino acids and non-polar, hydrophobic amino acids.


A composition of the present disclosure can be present in a virus-like particle (VLP), e.g., as described in WO 2020/102709. For example, a composition of the present disclosure can include a lentiviral matrix (MA) polypeptide, a lentiviral capsid (CA) polypeptide, and a lentiviral nucleocapsid (NC) polypeptide.


Gene-Editing Systems

Suitable gene-editing systems include: i) a clustered regularly interspaced short palindromic repeats (CRISPR) associated (Cas) effector polypeptide and a guide nucleic acid; ii) a zinc finger nuclease (ZFN); iii) a transcription activator-like effector nuclease (TALEN); and iv) a meganuclease (e.g., an engineered meganuclease).


Zinc Finger Nucleases (ZFNs)

In some cases, the gene-editing system comprises a zinc-finger nuclease (ZFN). ZFNs are engineered double-strand break inducing proteins comprised of a zinc finger DNA binding domain and a double strand break inducing agent domain. Engineered ZFNs consist of two zinc finger arrays (ZFAs), each of which is fused to a single subunit of a non-specific endonuclease, such as the nuclease domain from the FokI enzyme, which becomes active upon dimerization. Typically, a single ZFA consists of 3 to 5 zinc finger domains, each of which is designed to recognize a specific nucleotide triplet (GGC, GAT, etc.). Thus, ZFNs composed of two “3-finger” ZFAs are capable of recognizing 18 base pairs of target sequence; 18 base pairs of recognition sequence is generally unique, even within large genomes such as those of humans and plants. By directing the co-localization and dimerization of two FokI nuclease monomers, ZFNs generate a functional site-specific endonuclease that creates a double-stranded break (DSB) in DNA at the targeted locus.
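The recognition-length arithmetic described above (two arrays, 3 to 5 fingers per array, one nucleotide triplet per finger) can be sketched in Python as follows. The function name `zfn_recognition_length` is a hypothetical helper introduced for illustration only.

```python
def zfn_recognition_length(fingers_per_array: int,
                           bp_per_finger: int = 3,
                           arrays: int = 2) -> int:
    """Total base pairs recognized by a ZFN pair.

    Assumes each zinc finger recognizes one contiguous triplet and that
    both zinc finger arrays contain the same number of fingers.
    """
    if fingers_per_array < 1 or bp_per_finger < 1 or arrays < 1:
        raise ValueError("all arguments must be positive integers")
    return arrays * fingers_per_array * bp_per_finger


if __name__ == "__main__":
    # A single ZFA typically contains 3 to 5 zinc finger domains.
    for n in (3, 4, 5):
        print(f"{n}-finger arrays: {zfn_recognition_length(n)} bp recognized")
```

For two 3-finger arrays this gives the 18 base pairs noted above.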


Useful zinc-finger nucleases include those that are known and those that are engineered to have specificity for one or more desired target sites (TS). Zinc finger domains are amenable for designing polypeptides which specifically bind a selected polynucleotide recognition sequence, for example, within the target site of the host cell genome. ZFNs consist of an engineered DNA-binding zinc finger domain linked to a non-specific endonuclease domain, for example nuclease domain from a Type IIs endonuclease such as HO or FokI. Alternatively, engineered zinc finger DNA binding domains can be fused to other double-strand break inducing agents or derivatives thereof that retain DNA nicking/cleaving activity. For example, this type of fusion can be used to direct the double-strand break inducing agent to a different target site, to alter the location of the nick or cleavage site, to direct the inducing agent to a shorter target site, or to direct the inducing agent to a longer target site. In some examples a zinc finger DNA binding domain is fused to a site-specific recombinase, transposase, or a derivative thereof that retains DNA nicking and/or cleaving activity. Additional functionalities can be fused to the zinc-finger binding domain, including transcriptional activator domains, transcription repressor domains, and methylases. In some embodiments, dimerization of nuclease domain is required for cleavage activity.


Each zinc finger recognizes three consecutive base pairs in the target DNA. For example, a 3-finger domain recognizes a sequence of 9 contiguous nucleotides; because the nuclease must dimerize, two sets of zinc finger triplets are used to bind 18 nucleotides of recognition sequence. In some cases, engineered linker sequences are used between the zinc fingers allowing the fingers to ‘skip’ 1, 2 or more nucleotides between each three consecutive nucleotide unit. Useful designer zinc finger modules include those that recognize various GNN and ANN triplets (Dreier, et al., (2001) J Biol Chem 276:29466-78; Dreier, et al., (2000) J Mol Biol 303:489-502; Liu, et al., (2002) J Biol Chem 277:3850-6), as well as those that recognize various CNN or TNN triplets (Dreier, et al., (2005) J Biol Chem 280:35588-97; Jamieson, et al., (2003) Nature Rev Drug Discov 2:361-8). See also, Durai, et al., (2005) Nucleic Acids Res 33:5978-90; Segal, (2002) Methods 26:76-83; Porteus and Carroll, (2005) Nat Biotechnol 23:967-73; Pabo, et al., (2001) Ann Rev Biochem 70:313-40; Wolfe, et al., (2000) Ann Rev Biophys Biomol Struct 29:183-212; Segal and Barbas, (2001) Curr Opin Biotechnol 12:632-7; Segal, et al., (2003) Biochemistry 42:2137-48; Beerli and Barbas, (2002) Nat Biotechnol 20:135-41; Carroll, et al., (2006) Nature Protocols 1:1329; Ordiz, et al., (2002) Proc Natl Acad Sci USA 99:13290-5; Guan, et al., (2002) Proc Natl Acad Sci USA 99:13296-301; WO2002099084; WO00/42219; WO02/42459; WO2003062455; US20030059767; US Patent Application Publication Number 2003/0108880; U.S. Pat. Nos. 6,140,466, 6,511,808 and 6,453,242. Useful zinc-finger nucleases also include those described in WO03/080809; WO05/014791; WO05/084190; WO08/021207; WO09/042186; WO09/054985; and WO10/065123.


Where the gene-editing system comprises a zinc finger nuclease, optimal target sites may be selected using a number of publicly available online resources. See, e.g., Reyon et al., BMC Genomics 12:83 (2011), which is hereby incorporated by reference in its entirety. For example, Oligomerized Pool Engineering (OPEN) is a highly robust and publicly available protocol for engineering zinc finger arrays with high specificity and in vivo functionality, and has been successfully used to generate ZFNs that function efficiently in plants, zebrafish, and human somatic and pluripotent stem cells. OPEN is a selection-based method in which a pre-constructed randomized pool of candidate ZFAs is screened to identify those with high affinity and specificity for a desired target sequence. ZFNGenome is a GBrowse-based tool for identifying and visualizing potential target sites for OPEN-generated ZFNs. ZFNGenome provides a compendium of potential ZFN target sites in sequenced and annotated genomes of model organisms. ZFNGenome currently includes a total of more than 11.6 million potential ZFN target sites, mapped within the fully sequenced genomes of seven model organisms: Saccharomyces cerevisiae, Chlamydomonas reinhardtii, Arabidopsis thaliana, Drosophila melanogaster, Danio rerio, Caenorhabditis elegans, and Homo sapiens. Additional model organisms, including three plant species, Glycine max (soybean), Oryza sativa (rice), and Zea mays (maize), and three animal species, Tribolium castaneum (red flour beetle), Mus musculus (mouse), and Rattus norvegicus (brown rat), can also be used. ZFNGenome provides information about each potential ZFN target site, including its chromosomal location and position relative to transcription initiation site(s). Users can query ZFNGenome using several different criteria (e.g., gene ID, transcript ID, target site sequence).


For more information on ZFNs, refer to U.S. Pat. No. 8,685,737, which is hereby incorporated by reference in its entirety.


TALENs

In some cases, the gene-editing system comprises a TAL-effector DNA binding domain-nuclease fusion protein (TALEN). A TAL effector comprises a DNA binding domain that interacts with DNA in a sequence-specific manner through one or more tandem repeat domains. The repeated sequence typically comprises 34 amino acids, and the repeats are typically 91-100% homologous with each other. Polymorphism of the repeats is usually located at positions 12 and 13, and there appears to be a one-to-one correspondence between the identity of repeat variable-diresidues at positions 12 and 13 with the identity of the contiguous nucleotides in the TAL-effector's target sequence.


The TAL-effector DNA binding domain can be engineered to bind to a desired target sequence, and fused to a nuclease domain, e.g., from a type II restriction endonuclease, e.g., a nonspecific cleavage domain from a type II restriction endonuclease such as FokI (see e.g., Kim et al. (1996) Proc. Natl. Acad. Sci. USA 93:1156-1160). Other useful endonucleases may include, for example, HhaI, HindIII, NotI, BbvCI, EcoRI, BglI, and AlwI. Thus, in some embodiments, a TALEN® includes a TAL effector domain containing a plurality of TAL effector repeat sequences that, in combination, bind to a specific nucleotide sequence in the target DNA sequence, such that the TALEN® cleaves the target DNA within or adjacent to the specific nucleotide sequence. Suitable TALEN® include those described in WO10/079430 and U.S. Patent Application Publication No. 2011/0145940. See also, Ma et al. (2016) Methods Mol. Biol. 1451:17.


In some cases, the TAL effector domain that binds to a specific nucleotide sequence within a target DNA includes 10 or more DNA binding repeats, and in some cases 15 or more DNA binding repeats. In some cases, each DNA binding repeat includes a repeat variable-diresidue (RVD) that determines recognition of a base pair in the target DNA sequence, wherein each DNA binding repeat is responsible for recognizing one base pair in the target DNA sequence, and wherein the RVD comprises one or more of: HD for recognizing C; NG for recognizing T; NI for recognizing A; NN for recognizing G or A; NS for recognizing A or C or G or T; N* for recognizing C or T, where * represents a gap in the second position of the RVD; HG for recognizing T; H* for recognizing T, where * represents a gap in the second position of the RVD; IG for recognizing T; NK for recognizing G; HA for recognizing C; ND for recognizing C; HI for recognizing C; HN for recognizing G; NA for recognizing G; SN for recognizing G or A; and YG for recognizing T.
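The one-repeat-per-base RVD code recited above can be expressed as a lookup table. The following Python sketch is illustrative only; the names `RVD_TO_BASES` and `rvd_array_matches` are hypothetical, and the check treats recognition as all-or-nothing, which is a simplification of actual binding behavior.

```python
# RVD -> bases recognized, transcribed from the list recited above.
# "*" denotes a gap in the second position of the RVD.
RVD_TO_BASES = {
    "HD": "C", "NG": "T", "NI": "A", "NN": "GA", "NS": "ACGT",
    "N*": "CT", "HG": "T", "H*": "T", "IG": "T", "NK": "G",
    "HA": "C", "ND": "C", "HI": "C", "HN": "G", "NA": "G",
    "SN": "GA", "YG": "T",
}


def rvd_array_matches(rvds, site: str) -> bool:
    """Return True if each repeat's RVD recognizes the corresponding base.

    `rvds` is a sequence of RVD strings (one per DNA binding repeat) and
    `site` is the target sequence read in the same order, one base per repeat.
    An RVD that is not in the table raises KeyError.
    """
    site = site.upper()
    if len(rvds) != len(site):
        return False
    return all(base in RVD_TO_BASES[rvd] for rvd, base in zip(rvds, site))


if __name__ == "__main__":
    repeats = ["HD", "NG", "NI", "NN"]
    print(rvd_array_matches(repeats, "CTAG"))  # NN recognizes G
    print(rvd_array_matches(repeats, "CTAA"))  # NN also recognizes A
    print(rvd_array_matches(repeats, "CTAC"))  # NN does not recognize C
```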


If the sequence-specific, e.g., genome editing, endonuclease to be utilized is a TALEN®, in some cases, optimal target sites may be selected in accordance with the methods described by Sanjana et al., Nature Protocols, 7:171-192 (2012), which is hereby incorporated by reference in its entirety. In brief, TALENs function as dimers, and a pair of TALEN®, referred to as the left and right TALEN®, target sequences on opposite strands of DNA. TALEN® can be engineered as a fusion of the TALE DNA-binding domain and a monomeric FokI catalytic domain. To facilitate FokI dimerization, the left and right TALEN® target sites can be chosen with a spacing of approximately 14-20 bases. Therefore, for a pair of TALEN®, each targeting 20-bp sequences, an optimal target site can have the form 5′-TN19N14-20N19A-3′, where the left TALEN targets 5′-TN19-3′ and the right TALEN targets the antisense strand of 5′-N19A-3′ (N=A, G, T or C).
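The target-site form described above (a 20-base left site beginning with T, a 14-20 base spacer, and a 20-base right site ending with A) lends itself to a simple scan. The following Python sketch is a minimal illustration under stated assumptions: the function names are hypothetical, only one strand is scanned, and no scoring or off-target analysis of the kind described by Sanjana et al. is attempted.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_talen_pair_sites(seq: str, arm: int = 20,
                          spacer_min: int = 14, spacer_max: int = 20):
    """Find windows of the form 5'-T N(arm-1) N(spacer) N(arm-1) A-3'.

    Returns one dict per (start, spacer) combination. `right_talen_target`
    is the reverse complement of the right-hand site, i.e. the sequence on
    the antisense strand read 5' to 3'.
    """
    seq = seq.upper()
    hits = []
    for start, base in enumerate(seq):
        if base != "T":
            continue
        for spacer in range(spacer_min, spacer_max + 1):
            end = start + arm + spacer + arm  # exclusive end of the window
            if end > len(seq):
                break
            window = seq[start:end]
            # Require a terminal A and unambiguous bases throughout.
            if window[-1] != "A" or not set(window) <= set("ACGT"):
                continue
            hits.append({
                "start": start,
                "spacer": spacer,
                "left_talen_target": window[:arm],
                "right_talen_target": reverse_complement(window[-arm:]),
            })
    return hits


if __name__ == "__main__":
    # Toy sequence built to contain exactly one site with a 14-base spacer.
    demo = "T" + "C" * 19 + "G" * 14 + "C" * 19 + "A"
    for hit in find_talen_pair_sites(demo):
        print(hit)
```

Reporting the right-hand site as its reverse complement reflects that the right TALEN binds the antisense strand, so both reported target sequences begin with T.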


For more information on TALENs, refer to U.S. Pat. No. 8,685,737, which is hereby incorporated by reference in its entirety.


CRISPR/Cas Effector Polypeptides

In some cases, the gene-editing polypeptide is a CRISPR/Cas effector polypeptide. In some cases, a suitable CRISPR-Cas effector polypeptide is a class 2 CRISPR/Cas effector polypeptide such as a type II, type V, or type VI CRISPR/Cas effector polypeptide. In some cases, a suitable CRISPR/Cas effector polypeptide is a class 2 CRISPR/Cas effector polypeptide. In some cases, a suitable CRISPR/Cas effector polypeptide is a type II CRISPR/Cas effector polypeptide (e.g., a Cas9 protein). In some cases, a suitable CRISPR/Cas effector polypeptide is a type V CRISPR/Cas effector polypeptide (e.g., a Cpf1 protein, a C2c1 protein, or a C2c3 protein), e.g., a Cas12a, a Cas12b, a Cas12c, a Cas12d, or a Cas12e polypeptide. In some cases, a suitable CRISPR/Cas effector polypeptide is a type VI CRISPR/Cas effector polypeptide (e.g., a C2c2 protein; also referred to as a “Cas13a” protein), e.g., a Cas13a, a Cas13b, a Cas13c, or a Cas13d polypeptide. In some cases, a suitable CRISPR/Cas effector polypeptide is a CasX protein. In some cases, a suitable CRISPR/Cas effector polypeptide is a CasY protein. In some cases, a suitable CRISPR/Cas effector polypeptide is a CasZ protein. In some cases, a suitable CRISPR/Cas effector polypeptide is a Cas14a, a Cas14b, or a Cas14c polypeptide.


In class 2 CRISPR systems, the functions of the effector complex (e.g., the cleavage of target DNA) are carried out by a single endonuclease (e.g., see Zetsche et al., Cell. 2015 Oct. 22; 163(3):759-71; Makarova et al., Nat Rev Microbiol. 2015 November; 13(11):722-36; Shmakov et al., Mol Cell. 2015 Nov. 5; 60(3):385-97; and Shmakov et al. (2017) Nature Reviews Microbiology 15:169). As such, the term “class 2 CRISPR/Cas protein” is used herein to encompass the CRISPR/Cas effector polypeptide (e.g., the target nucleic acid cleaving protein) from class 2 CRISPR systems. Thus, the term “class 2 CRISPR/Cas effector polypeptide” as used herein encompasses type II CRISPR/Cas effector polypeptides (e.g., Cas9); type V-A CRISPR/Cas effector polypeptides (e.g., Cpf1 (also referred to as “Cas12a”)); type V-B CRISPR/Cas effector polypeptides (e.g., C2c1 (also referred to as “Cas12b”)); type V-C CRISPR/Cas effector polypeptides (e.g., C2c3 (also referred to as “Cas12c”)); type V-U1 CRISPR/Cas effector polypeptides (e.g., C2c4); type V-U2 CRISPR/Cas effector polypeptides (e.g., C2c8); type V-U5 CRISPR/Cas effector polypeptides (e.g., C2c5); type V-U4 CRISPR/Cas effector polypeptides (e.g., C2c9); type V-U3 CRISPR/Cas effector polypeptides (e.g., C2c10); type VI-A CRISPR/Cas effector polypeptides (e.g., C2c2 (also known as “Cas13a”)); type VI-B CRISPR/Cas effector polypeptides (e.g., Cas13b (also known as C2c6)); and type VI-C CRISPR/Cas effector polypeptides (e.g., Cas13c (also known as C2c7)). To date, class 2 CRISPR/Cas effector polypeptides encompass type II, type V, and type VI CRISPR/Cas effector polypeptides, but the term is also meant to encompass any class 2 CRISPR/Cas effector polypeptide suitable for binding to a corresponding guide RNA and forming an RNP complex.


In some cases, the CRISPR/Cas effector polypeptide comprises an amino acid sequence having at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99%, or 100%, amino acid sequence identity to the amino acid sequence depicted in any one of FIG. 3A-3P.
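A percent amino acid sequence identity of the kind recited above can be computed from a pairwise alignment. The following Python sketch is a simplified illustration: it assumes the two sequences have already been aligned (with "-" marking gaps) by a separate alignment tool, and it uses the number of alignment columns as the denominator, which is only one of several conventions. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Identical, non-gap columns are counted as matches; the denominator is
    the total number of alignment columns (an assumption; other conventions
    divide by the shorter sequence length or by non-gap columns only).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_percent: float) -> bool:
    """True if the aligned pair has at least `threshold_percent` identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Toy peptides, not taken from any figure of the disclosure.
    query, reference = "MKRTA", "MKQTA"
    print(percent_identity(query, reference))
    print(meets_identity_threshold(query, reference, 90.0))
```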


In some cases, the CRISPR/Cas effector polypeptide is a type II CRISPR/Cas effector polypeptide. In some cases, the CRISPR/Cas effector polypeptide is a Cas9 polypeptide. The Cas9 protein is guided to a target site (e.g., stabilized at a target site) within a target nucleic acid sequence (e.g., a chromosomal sequence or an extrachromosomal sequence, e.g., an episomal sequence, a minicircle sequence, a mitochondrial sequence, a chloroplast sequence, etc.) by virtue of its association with the protein-binding segment of the Cas9 guide RNA. In some cases, a Cas9 polypeptide comprises an amino acid sequence having at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99%, or more than 99%, amino acid sequence identity to the Streptococcus pyogenes Cas9 depicted in FIG. 3A. In some cases, the Cas9 polypeptide used in a composition or method of the present disclosure is a Staphylococcus aureus Cas9 (saCas9) polypeptide. In some cases, the saCas9 polypeptide comprises an amino acid sequence having at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100%, amino acid sequence identity to the saCas9 amino acid sequence depicted in FIG. 3G.


In some cases, a suitable Cas9 polypeptide is a high-fidelity (HF) Cas9 polypeptide. Kleinstiver et al. (2016) Nature 529:490. For example, amino acids N497, R661, Q695, and Q926 of the amino acid sequence depicted in FIG. 3A are substituted, e.g., with alanine. For example, an HF Cas9 polypeptide can comprise an amino acid sequence having at least 90%, at least 95%, at least 98%, at least 99%, or 100%, amino acid sequence identity to the amino acid sequence depicted in FIG. 3A, where amino acids N497, R661, Q695, and Q926 are substituted, e.g., with alanine. In some cases, a suitable Cas9 polypeptide exhibits altered PAM specificity. See, e.g., Kleinstiver et al. (2015) Nature 523:481.
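The alanine substitutions recited above can be written in the conventional wild-type/position/replacement notation (e.g., N497A) and applied to a sequence programmatically. The following Python sketch is illustrative only: the helper `apply_substitutions` is hypothetical, positions are 1-based, and the demonstration uses a toy placeholder sequence rather than the sequence of FIG. 3A.

```python
import re

# Substitutions recited above for a high-fidelity Cas9, in 1-based notation.
HF_SUBSTITUTIONS = ["N497A", "R661A", "Q695A", "Q926A"]

_SUB_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_substitutions(sequence: str, substitutions) -> str:
    """Apply point substitutions such as 'N497A' to an amino acid sequence.

    Raises ValueError if a substitution is malformed, out of range, or if
    the residue found at the position is not the stated wild-type residue.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = _SUB_PATTERN.fullmatch(sub)
        if match is None:
            raise ValueError(f"malformed substitution: {sub!r}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3))
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range: {sub!r}")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    # Toy placeholder: glycine everywhere except the four recited positions.
    toy = list("G" * 1000)
    for pos, aa in ((497, "N"), (661, "R"), (695, "Q"), (926, "Q")):
        toy[pos - 1] = aa
    variant = apply_substitutions("".join(toy), HF_SUBSTITUTIONS)
    print([variant[p - 1] for p in (497, 661, 695, 926)])
```

Checking the wild-type residue before substituting guards against numbering mismatches between a reference sequence and the sequence actually supplied.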


In some cases, the genome-editing endonuclease is a type V CRISPR/Cas endonuclease. In some cases, a type V CRISPR/Cas endonuclease is a Cpf1 protein. In some cases, a Cpf1 protein comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100%, amino acid sequence identity to the Cpf1 amino acid sequence depicted in any one of FIG. 3H-3J.


Guide Nucleic Acids

A nucleic acid that binds to a class 2 CRISPR/Cas effector polypeptide (e.g., a Cas9 protein; a type V or type VI CRISPR/Cas protein; a Cpf1 protein; etc.) and targets the complex to a specific location within a target nucleic acid is referred to herein as a “guide RNA” or “CRISPR/Cas guide nucleic acid” or “CRISPR/Cas guide RNA.” A guide RNA provides target specificity to the complex (the RNP complex) by including a targeting segment, which includes a guide sequence (also referred to herein as a targeting sequence), which is a nucleotide sequence that is complementary to a sequence of a target nucleic acid. The term “guide RNA”, as used herein, refers to an RNA that comprises: i) an “activator” nucleotide sequence that binds to a CRISPR/Cas effector polypeptide (e.g., a class 2 CRISPR/Cas effector polypeptide such as a type II, type V, or type VI CRISPR/Cas endonuclease) and activates the CRISPR/Cas effector polypeptide; and ii) a “targeter” nucleotide sequence that comprises a nucleotide sequence that hybridizes with a target nucleic acid. The “activator” nucleotide sequence and the “targeter” nucleotide sequence can be on separate RNA molecules (e.g., a “dual-guide RNA”); or can be on the same RNA molecule (a “single-guide RNA”). A guide nucleic acid in some cases includes only ribonucleotides. In some cases, a guide nucleic acid includes both ribonucleotides and deoxyribonucleotides.


In some cases, a CRISPR/Cas guide RNA comprises one or more modifications, e.g., a base modification, a backbone modification, a sugar modification, etc., to provide the nucleic acid with a new or enhanced feature (e.g., improved stability, such as improved in vivo stability). Suitable nucleic acid modifications include, but are not limited to: 2′-O-methyl modified nucleotides, 2′-fluoro modified nucleotides, locked nucleic acid (LNA) modified nucleotides, peptide nucleic acid (PNA) modified nucleotides, nucleotides with phosphorothioate linkages, and a 5′ cap (e.g., a 7-methylguanylate cap (m7G)). Suitable modified nucleic acid backbones containing a phosphorus atom therein include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates including 3′-alkylene phosphonates, 5′-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates including 3′-amino phosphoramidate and aminoalkylphosphoramidates, phosphorodiamidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, selenophosphates and boranophosphates having normal 3′-5′ linkages, 2′-5′ linked analogs of these, and those having inverted polarity wherein one or more internucleotide linkages is a 3′ to 3′, 5′ to 5′ or 2′ to 2′ linkage. Suitable oligonucleotides having inverted polarity comprise a single 3′ to 3′ linkage at the 3′-most internucleotide linkage, i.e., a single inverted nucleoside residue, which may be abasic (the nucleobase is missing or has a hydroxyl group in place thereof). Various salts (such as, for example, potassium or sodium), mixed salts and free acid forms are also included. A CRISPR/Cas guide RNA can also include one or more substituted sugar moieties.
Suitable polynucleotides comprise a sugar substituent group selected from: OH; F; O—, S—, or N-alkyl; O—, S—, or N-alkenyl; O—, S— or N-alkynyl; or O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl may be substituted or unsubstituted C1 to C10 alkyl or C2 to C10 alkenyl and alkynyl. Particularly suitable are O((CH2)nO)mCH3, O(CH2)nOCH3, O(CH2)nNH2, O(CH2)nCH3, O(CH2)nONH2, and O(CH2)nON((CH2)nCH3)2, where n and m are from 1 to about 10. Other suitable polynucleotides comprise a sugar substituent group selected from: C1 to C10 lower alkyl, substituted lower alkyl, alkenyl, alkynyl, alkaryl, aralkyl, O-alkaryl or O-aralkyl, SH, SCH3, OCN, Cl, Br, CN, CF3, OCF3, SOCH3, SO2CH3, ONO2, NO2, N3, NH2, heterocycloalkyl, heterocycloalkaryl, aminoalkylamino, polyalkylamino, substituted silyl, an RNA cleaving group, a reporter group, an intercalator, a group for improving the pharmacokinetic properties of an oligonucleotide, or a group for improving the pharmacodynamic properties of an oligonucleotide, and other substituents having similar properties. A suitable modification includes 2′-methoxyethoxy (2′-O—CH2CH2OCH3, also known as 2′-O-(2-methoxyethyl) or 2′-MOE) (Martin et al., Helv. Chim. Acta, 1995, 78, 486-504) i.e., an alkoxyalkoxy group. A further suitable modification includes 2′-dimethylaminooxyethoxy, i.e., a O(CH2)2ON(CH3)2 group, also known as 2′-DMAOE, as described in examples herein below, and 2′-dimethylaminoethoxyethoxy (also known in the art as 2′-O-dimethyl-amino-ethoxy-ethyl or 2′-DMAEOE), i.e., 2′-O—CH2—O—CH2—N(CH3)2.


Examples of various CRISPR/Cas effector proteins and CRISPR/Cas guide RNAs (as well as information regarding requirements related to protospacer adjacent motif (PAM) sequences present in targeted nucleic acids) can be found in the art, for example, see Jinek et al., Science. 2012 Aug. 17; 337(6096):816-21; Chylinski et al., RNA Biol. 2013 May; 10(5):726-37; Ma et al., Biomed Res Int. 2013; 2013:270805; Hou et al., Proc Natl Acad Sci U S A. 2013 Sep. 24; 110(39):15644-9; Jinek et al., Elife. 2013; 2:e00471; Pattanayak et al., Nat Biotechnol. 2013 September; 31(9):839-43; Qi et al., Cell. 2013 Feb. 28; 152(5):1173-83; Wang et al., Cell. 2013 May 9; 153(4):910-8; Auer et al., Genome Res. 2013 Oct. 31; Chen et al., Nucleic Acids Res. 2013 Nov. 1; 41(20):e19; Cheng et al., Cell Res. 2013 October; 23(10):1163-71; Cho et al., Genetics. 2013 November; 195(3):1177-80; DiCarlo et al., Nucleic Acids Res. 2013 April; 41(7):4336-43; Dickinson et al., Nat Methods. 2013 October; 10(10):1028-34; Ebina et al., Sci Rep. 2013; 3:2510; Fujii et al., Nucleic Acids Res. 2013 Nov. 1; 41(20):e187; Hu et al., Cell Res. 2013 November; 23(11):1322-5; Jiang et al., Nucleic Acids Res. 2013 Nov. 1; 41(20):e188; Larson et al., Nat Protoc. 2013 November; 8(11):2180-96; Mali et al., Nat Methods. 2013 October; 10(10):957-63; Nakayama et al., Genesis. 2013 Dec; 51(12):835-43; Ran et al., Nat Protoc. 2013 November; 8(11):2281-308; Ran et al., Cell. 2013 Sep. 12; 154(6):1380-9; Upadhyay et al., G3 (Bethesda). 2013 Dec. 9; 3(12):2233-8; Walsh et al., Proc Natl Acad Sci U S A. 2013 Sep. 24; 110(39):15514-5; Xie et al., Mol Plant. 2013 Oct. 9; Yang et al., Cell. 2013 Sep. 12; 154(6):1370-9; Briner et al., Mol Cell. 2014 Oct. 23; 56(2):333-9; and U.S. patents and patent applications: U.S. Pat. Nos. 
8,906,616; 8,895,308; 8,889,418; 8,889,356; 8,871,445; 8,865,406; 8,795,965; 8,771,945; 8,697,359; 20140068797; 20140170753; 20140179006; 20140179770; 20140186843; 20140186919; 20140186958; 20140189896; 20140227787; 20140234972; 20140242664; 20140242699; 20140242700; 20140242702; 20140248702; 20140256046; 20140273037; 20140273226; 20140273230; 20140273231; 20140273232; 20140273233; 20140273234; 20140273235; 20140287938; 20140295556; 20140295557; 20140298547; 20140304853; 20140309487; 20140310828; 20140310830; 20140315985; 20140335063; 20140335620; 20140342456; 20140342457; 20140342458; 20140349400; 20140349405; 20140356867; 20140356956; 20140356958; 20140356959; 20140357523; 20140357530; 20140364333; and 20140377868; all of which are hereby incorporated by reference in their entirety.


Examples and guidance related to type V or type VI CRISPR/Cas endonucleases and guide RNAs (as well as information regarding requirements related to protospacer adjacent motif (PAM) sequences present in targeted nucleic acids) can be found in the art, for example, see Zetsche et al., Cell. 2015 Oct. 22; 163(3):759-71; Makarova et al., Nat Rev Microbiol. 2015 November; 13(11):722-36; and Shmakov et al., Mol Cell. 2015 Nov. 5; 60(3):385-97.


Methods

The present disclosure provides methods of modifying a target nucleic acid in a cell (e.g., a target cell) by HDR. The methods generally involve introducing into the cell a composition of the present disclosure. The present disclosure provides methods of making a genetically modified cell. The methods comprise introducing into a target cell (where the target cell comprises a target DNA) a composition of the present disclosure, wherein the introducing step results in modification of the target DNA by HDR, thereby producing a genetically modified cell. Target nucleic acids include genomic DNA, mitochondrial DNA, and the like. In some cases, a target nucleic acid comprises a nucleotide sequence encoding a mutated polypeptide, e.g., a polypeptide that has reduced function as a result of a mutation (insertion and/or deletion and/or substitution of one or more nucleotides), where the polypeptide with reduced function causes a pathological condition.


In some cases, the target cell is in vitro. In some cases, the target cell is in vivo. In some cases, the target cell is a prokaryotic cell. In some cases, the target cell is a eukaryotic cell.


Non-limiting examples of cells (target cells) include: a prokaryotic cell, a eukaryotic cell, a bacterial cell, an archaeal cell, a cell of a single-cell eukaryotic organism, a protozoan cell, a cell from a plant (e.g., cells from plant crops, fruits, vegetables, grains, soy bean, corn, maize, wheat, seeds, tomatoes, rice, cassava, sugarcane, pumpkin, hay, potatoes, cotton, cannabis, tobacco, flowering plants, conifers, gymnosperms, angiosperms, ferns, clubmosses, hornworts, liverworts, mosses, dicotyledons, monocotyledons, etc.), an algal cell (e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens C. Agardh, and the like), a seaweed cell (e.g., kelp), a fungal cell (e.g., a yeast cell, a cell from a mushroom), an animal cell, a cell from an invertebrate animal (e.g., fruit fly, cnidarian, echinoderm, nematode, etc.), a cell from a vertebrate animal (e.g., fish, amphibian, reptile, bird, mammal), a cell from a mammal (e.g., an ungulate (e.g., a pig, a cow, a goat, a sheep); a rodent (e.g., a rat, a mouse); a non-human primate; a human; a feline (e.g., a cat); a canine (e.g., a dog); etc.), and the like. In some cases, the cell is a cell that does not originate from a natural organism (e.g., the cell can be a synthetically made cell; also referred to as an artificial cell).


A cell can be an in vitro cell (e.g., established cultured cell line). A cell can be an ex vivo cell (cultured cell from an individual). A cell can be an in vivo cell (e.g., a cell in an individual). A cell can be an isolated cell. A cell can be a cell inside of an organism. A cell can be an organism. A cell can be a cell in a cell culture (e.g., in vitro cell culture). A cell can be one of a collection of cells. A cell can be a prokaryotic cell or derived from a prokaryotic cell. A cell can be a bacterial cell or can be derived from a bacterial cell. A cell can be an archaeal cell or derived from an archaeal cell. A cell can be a eukaryotic cell or derived from a eukaryotic cell. A cell can be a plant cell or derived from a plant cell. A cell can be an animal cell or derived from an animal cell. A cell can be an invertebrate cell or derived from an invertebrate cell. A cell can be a vertebrate cell or derived from a vertebrate cell. A cell can be a mammalian cell or derived from a mammalian cell. A cell can be a rodent cell or derived from a rodent cell. A cell can be a human cell or derived from a human cell. A cell can be a microbe cell or derived from a microbe cell. A cell can be a fungal cell or derived from a fungal cell. A cell can be an insect cell. A cell can be an arthropod cell. A cell can be a protozoan cell. A cell can be a helminth cell.


Suitable cells include a stem cell (e.g., an embryonic stem (ES) cell, an induced pluripotent stem (iPS) cell); a germ cell (e.g., an oocyte, a sperm, an oogonia, a spermatogonia, etc.); and a somatic cell, e.g., a fibroblast, an oligodendrocyte, a glial cell, a hematopoietic cell, a neuron, a muscle cell, a bone cell, a hepatocyte, a pancreatic cell, etc.


Suitable cells include human embryonic stem cells, fetal cardiomyocytes, myofibroblasts, mesenchymal stem cells, cardiomyocytes, adipocytes, totipotent cells, pluripotent cells, blood stem cells, myoblasts, adult stem cells, bone marrow cells, mesenchymal cells, embryonic stem cells, parenchymal cells, epithelial cells, endothelial cells, mesothelial cells, fibroblasts, osteoblasts, chondrocytes, exogenous cells, endogenous cells, stem cells, hematopoietic stem cells, bone-marrow derived progenitor cells, myocardial cells, skeletal cells, fetal cells, undifferentiated cells, multi-potent progenitor cells, unipotent progenitor cells, monocytes, cardiac myoblasts, skeletal myoblasts, macrophages, capillary endothelial cells, xenogeneic cells, allogeneic cells, and post-natal stem cells.


In some cases, the cell is an immune cell, a neuron, an epithelial cell, an endothelial cell, or a stem cell. In some cases, the immune cell is a T cell, a B cell, a monocyte, a natural killer cell, a dendritic cell, or a macrophage. In some cases, the immune cell is a cytotoxic T cell. In some cases, the immune cell is a helper T cell. In some cases, the immune cell is a regulatory T cell (Treg).


In some cases, the cell is a stem cell. Stem cells include adult stem cells. Adult stem cells are also referred to as somatic stem cells.


Adult stem cells are resident in differentiated tissue, but retain the properties of self-renewal and ability to give rise to multiple cell types, usually cell types typical of the tissue in which the stem cells are found. Numerous examples of somatic stem cells are known to those of skill in the art, including muscle stem cells; hematopoietic stem cells; epithelial stem cells; neural stem cells; mesenchymal stem cells; mammary stem cells; intestinal stem cells; mesodermal stem cells; endothelial stem cells; olfactory stem cells; neural crest stem cells; and the like.


Stem cells of interest include mammalian stem cells, where the term “mammalian” refers to any animal classified as a mammal, including humans; non-human primates; domestic and farm animals; and zoo, laboratory, sports, or pet animals, such as dogs, horses, cats, cows, mice, rats, rabbits, etc. In some cases, the stem cell is a human stem cell. In some cases, the stem cell is a rodent (e.g., a mouse; a rat) stem cell. In some cases, the stem cell is a non-human primate stem cell.


Stem cells can express one or more stem cell markers, e.g., SOX9, KRT19, KRT7, LGR5, CA9, FXYD2, CDH6, CLDN18, TSPAN8, BPIFB1, OLFM4, CDH17, and PPARGC1A.


In some cases, the stem cell is a hematopoietic stem cell (HSC). HSCs are mesoderm-derived cells that can be isolated from bone marrow, blood, cord blood, fetal liver and yolk sac. HSCs are characterized as CD34+ and CD3−. HSCs can repopulate the erythroid, neutrophil-macrophage, megakaryocyte and lymphoid hematopoietic cell lineages in vivo. In vitro, HSCs can be induced to undergo at least some self-renewing cell divisions and can be induced to differentiate to the same lineages as are seen in vivo. As such, HSCs can be induced to differentiate into one or more of erythroid cells, megakaryocytes, neutrophils, macrophages, and lymphoid cells.


In other cases, the stem cell is a neural stem cell (NSC). NSCs are capable of differentiating into neurons and glia (including oligodendrocytes and astrocytes). A neural stem cell is a multipotent stem cell which is capable of multiple divisions, and under specific conditions can produce daughter cells which are neural stem cells, or neural progenitor cells that can be neuroblasts or glioblasts, e.g., cells committed to become one or more types of neurons and glial cells, respectively. Methods of obtaining NSCs are known in the art.


In other cases, the stem cell is a mesenchymal stem cell (MSC). MSCs, originally derived from the embryonal mesoderm and isolated from adult bone marrow, can differentiate to form muscle, bone, cartilage, fat, marrow stroma, and tendon. Methods of isolating MSCs are known in the art, and any known method can be used to obtain MSCs. See, e.g., U.S. Pat. No. 5,736,396, which describes isolation of human MSCs.


A cell is in some cases a plant cell. A plant cell can be a cell of a monocotyledon. A cell can be a cell of a dicotyledon.


In some cases, the cell is a plant cell. For example, the cell can be a cell of a major agricultural plant, e.g., Barley, Beans (Dry Edible), Canola, Corn, Cotton (Pima), Cotton (Upland), Flaxseed, Hay (Alfalfa), Hay (Non-Alfalfa), Oats, Peanuts, Rice, Sorghum, Soybeans, Sugarbeets, Sugarcane, Sunflowers (Oil), Sunflowers (Non-Oil), Sweet Potatoes, Tobacco (Burley), Tobacco (Flue-cured), Tomatoes, Wheat (Durum), Wheat (Spring), Wheat (Winter), and the like. As another example, the cell is a cell of a vegetable crops which include but are not limited to, e.g., alfalfa sprouts, aloe leaves, arrow root, arrowhead, artichokes, asparagus, bamboo shoots, banana flowers, bean sprouts, beans, beet tops, beets, bittermelon, bok choy, broccoli, broccoli rabe (rappini), brussels sprouts, cabbage, cabbage sprouts, cactus leaf (nopales), calabaza, cardoon, carrots, cauliflower, celery, chayote, chinese artichoke (crosnes), chinese cabbage, chinese celery, chinese chives, choy sum, chrysanthemum leaves (tung ho), collard greens, corn stalks, corn-sweet, cucumbers, daikon, dandelion greens, dasheen, dau mue (pea tips), donqua (winter melon), eggplant, endive, escarole, fiddle head ferns, field cress, frisee, gai choy (chinese mustard), gailon, galanga (siam, thai ginger), garlic, ginger root, gobo, greens, hanover salad greens, huauzontle, jerusalem artichokes, jicama, kale greens, kohlrabi, lamb's quarters (quilete), lettuce (bibb), lettuce (boston), lettuce (boston red), lettuce (green leaf), lettuce (iceberg), lettuce (lolla rossa), lettuce (oak leaf—green), lettuce (oak leaf—red), lettuce (processed), lettuce (red leaf), lettuce (romaine), lettuce (ruby romaine), lettuce (russian red mustard), linkok, lo bok, long beans, lotus root, mache, maguey (agave) leaves, malanga, mesculin mix, mizuna, moap (smooth luffa), moo, moqua (fuzzy squash), mushrooms, mustard, nagaimo, okra, ong choy, onions green, opo (long squash), ornamental corn, ornamental gourds, parsley, parsnips, peas, 
peppers (bell type), peppers, pumpkins, radicchio, radish sprouts, radishes, rape greens, rhubarb, romaine (baby red), rutabagas, salicornia (sea bean), sinqua (angled/ridged luffa), spinach, squash, straw bales, sugarcane, sweet potatoes, swiss chard, tamarindo, taro, taro leaf, taro shoots, tatsoi, tepeguaje (guaje), tindora, tomatillos, tomatoes, tomatoes (cherry), tomatoes (grape type), tomatoes (plum type), turmeric, turnip tops greens, turnips, water chestnuts, yampi, yams (names), yu choy, yuca (cassava), and the like.


In some cases, the plant cell is a cell of a plant component such as a leaf, a stem, a root, a seed, a flower, pollen, an anther, an ovule, a pedicel, a fruit, a meristem, a cotyledon, a hypocotyl, a pod, an embryo, endosperm, an explant, a callus, or a shoot.


A cell is in some cases an arthropod cell. For example, the cell can be a cell of a sub-order, a family, a sub-family, a group, a sub-group, or a species of, e.g., Chelicerata, Myriapoda, Hexapoda, Arachnida, Insecta, Archaeognatha, Thysanura, Palaeoptera, Ephemeroptera, Odonata, Anisoptera, Zygoptera, Neoptera, Exopterygota, Plecoptera, Embioptera, Orthoptera, Zoraptera, Dermaptera, Dictyoptera, Notoptera, Grylloblattidae, Mantophasmatidae, Phasmatodea, Blattaria, Isoptera, Mantodea, Parapneuroptera, Psocoptera, Thysanoptera, Phthiraptera, Hemiptera, Endopterygota or Holometabola, Hymenoptera, Coleoptera, Strepsiptera, Raphidioptera, Megaloptera, Neuroptera, Mecoptera, Siphonaptera, Diptera, Trichoptera, or Lepidoptera.


A cell is in some cases an insect cell. For example, in some cases, the cell is a cell of a mosquito, a grasshopper, a true bug, a fly, a flea, a bee, a wasp, an ant, a louse, a moth, or a beetle.


Methods of introducing nucleic acids and polypeptides into a cell are known in the art, and any convenient method can be used to introduce a composition of the present disclosure into a target cell. Suitable methods include, e.g., lipofection, electroporation, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran mediated transfection, liposome-mediated transfection, virus-like particle-based delivery, particle gun technology, direct microinjection, nanoparticle-mediated delivery, and the like. Any or all of the components of a composition of the present disclosure can be introduced into a target cell using known methods.


Kits

The present disclosure provides a kit comprising components of a composition of the present disclosure. A kit of the present disclosure can include: a) a gene-editing polypeptide; b) a single-stranded donor DNA comprising: i) a first homology arm at or near the 5′ end of the donor DNA, wherein the first homology arm comprises a nucleotide sequence that is at least partially complementary to a first nucleotide sequence in the target nucleic acid; ii) a second homology arm at or near the 3′ end of the donor DNA, wherein the second homology arm comprises a nucleotide sequence that is at least partially complementary to a second nucleotide sequence in the target nucleic acid; and iii) a nucleotide sequence of interest, where the nucleotide sequence of interest is between the first homology arm and the second homology arm; and c) one or more staple oligonucleotides, wherein the one or more staple oligonucleotides are at least partially complementary to the donor DNA such that the one or more staple oligonucleotides hybridize to the donor DNA, such that the donor DNA template folds into a nanostructure in which the first homology arm and the second homology arm are brought into proximity to one another.


A kit of the present disclosure can further comprise a gene-editing polypeptide. For example, a kit of the present disclosure can further comprise a CRISPR/Cas effector polypeptide.


The components of the kit can be in separate containers. The components can be provided in a liquid solution, e.g., together with a buffer.


Examples of Non-Limiting Aspects of the Disclosure

Aspects, including embodiments, of the present subject matter described above may be beneficial alone or in combination, with one or more other aspects or embodiments. Without limiting the foregoing description, certain non-limiting aspects of the disclosure are provided below. As will be apparent to those of skill in the art upon reading this disclosure, each of the individually numbered aspects may be used or combined with any of the preceding or following individually numbered aspects. This is intended to provide support for all such combinations of aspects and is not limited to combinations of aspects explicitly provided below:


Aspect 1. A composition comprising: a) a gene-editing polypeptide; b) a single-stranded donor DNA comprising: i) a first homology arm at or near the 5′ end of the donor DNA, wherein the first homology arm comprises a nucleotide sequence that is at least partially complementary to a first nucleotide sequence in the target nucleic acid; ii) a second homology arm at or near the 3′ end of the donor DNA, wherein the second homology arm comprises a nucleotide sequence that is at least partially complementary to a second nucleotide sequence in the target nucleic acid; and iii) a nucleotide sequence of interest between the first homology arm and the second homology arm; and c) one or more staple oligonucleotides, wherein the one or more staple oligonucleotides are at least partially complementary to the nucleotide sequence of interest, such that the one or more staple oligonucleotides hybridize to the nucleotide sequence of interest, such that the single-stranded donor DNA folds into a DNA nanostructure in which the first homology arm and the second homology arm are brought into proximity to one another.


Aspect 2. The composition of aspect 1, comprising from 2 to 250 staple oligonucleotides.


Aspect 3. The composition of aspect 1, wherein the single-stranded donor DNA has a length of from about 50 nucleotides to about 10,000 nucleotides.


Aspect 4. The composition of aspect 1, wherein the single-stranded donor DNA has a length of from about 100 nucleotides to about 7,500 nucleotides, or from about 100 nucleotides to about 10,000 nucleotides.


Aspect 5. The composition of aspect 1, wherein the single-stranded donor DNA has a length of from about 1,000 nucleotides to about 7,500 nucleotides, or from about 1,000 nucleotides to about 10,000 nucleotides.


Aspect 6. The composition of any one of aspects 1-5, wherein at least one of the one or more staple oligonucleotides comprises a detectable label.


Aspect 7. The composition of aspect 6, wherein the detectable label comprises a fluorophore.


Aspect 8. The composition of any one of aspects 1-7, wherein each of the one or more staple oligonucleotides independently has a length of from about 10 nucleotides to about 60 nucleotides.


Aspect 9. The composition of any one of aspects 1-8, wherein each of the one or more staple oligonucleotides independently has a length of from about 20 nucleotides to about 50 nucleotides.


Aspect 10. The composition of any one of aspects 1-9, wherein the nanostructure comprises a 6-helix bundle.


Aspect 11. The composition of any one of aspects 1-9, wherein the nanostructure comprises a 24-helix bundle.


Aspect 12. The composition of any one of aspects 1-11, wherein the gene-editing polypeptide is a transcription activator-like effector nuclease.


Aspect 13. The composition of any one of aspects 1-11, wherein the gene-editing polypeptide is a zinc finger nuclease.


Aspect 14. The composition of any one of aspects 1-11, wherein the gene-editing polypeptide is a CRISPR/Cas effector polypeptide.


Aspect 15. The composition of aspect 14, wherein the CRISPR/Cas effector polypeptide is a type II CRISPR/Cas effector polypeptide.


Aspect 16. The composition of aspect 14, wherein the CRISPR/Cas effector polypeptide is a type V CRISPR/Cas effector polypeptide.


Aspect 17. The composition of aspect 14, wherein the CRISPR/Cas effector polypeptide is a type VI CRISPR/Cas effector polypeptide.


Aspect 18. The composition of any one of aspects 14-17, comprising one or more guide nucleic acids, wherein each of the one or more guide nucleic acids comprises a targeter nucleotide sequence that hybridizes to a target nucleic acid.


Aspect 19. The composition of aspect 18, wherein the guide nucleic acid is a guide RNA.


Aspect 20. The composition of aspect 19, wherein the guide RNA comprises one or more of: i) a modified sugar; ii) a modified base; and iii) a modified internucleoside linkage.


Aspect 21. The composition of any one of aspects 1-20, wherein the nucleotide sequence of interest comprises one or more of: i) a nucleotide sequence encoding a protein of interest; ii) a nucleotide sequence encoding an exon of a gene; iii) a promoter sequence; iv) an enhancer sequence; and v) a sequence encoding a non-coding RNA.


Aspect 22. A composition comprising: a) a gene-editing polypeptide; and b) a DNA nanostructure, wherein the DNA nanostructure comprises, in order from 5′ to 3′: i) a first homology arm; ii) a nucleotide sequence of interest that is hybridized to one or more staple oligonucleotides, wherein the nucleotide sequence of interest is folded via the staple oligonucleotides such that the nucleotide sequence of interest has a length of from about 5 nm to about 500 nm; and iii) a second homology arm.


Aspect 23. The composition of aspect 22, wherein the DNA nanostructure comprises a 6-helix bundle, a 24-helix bundle, a tube, a cube, a square lattice, or a honeycomb lattice.


Aspect 24. The composition of aspect 22, wherein the DNA nanostructure incorporates curvature.


Aspect 25. The composition of aspect 22, wherein the DNA nanostructure comprises one or more single-stranded regions that do not comprise a hybridized staple oligonucleotide.


Aspect 26. The composition of any one of aspects 22-25, wherein the gene-editing polypeptide is a CRISPR/Cas effector polypeptide and wherein the composition further comprises a guide nucleic acid.


Aspect 27. The composition of any one of aspects 22-26, wherein the composition is packaged in a virus-like particle.


Aspect 28. The composition of any one of aspects 22-27, wherein the DNA nanostructure comprises a small molecule, a nucleic acid, or a polypeptide encapsulated within the DNA nanostructure.


Aspect 29. A method of modifying a target nucleic acid in a cell by homology-directed repair, the method comprising introducing into the cell a composition according to any one of aspects 1-28.


Aspect 30. The method of aspect 29, wherein the cell is a prokaryotic cell.


Aspect 31. The method of aspect 29, wherein the cell is a eukaryotic cell.


Aspect 32. The method of aspect 31, wherein the cell is in vitro.


Aspect 33. The method of aspect 31, wherein the cell is in vivo.


Aspect 34. The method of any one of aspects 29-33, wherein the eukaryotic cell is a human cell, a non-human mammalian cell, a reptile cell, an amphibian cell, a cell of an invertebrate, a plant cell, an insect cell, an avian cell, a fungal cell, a fish cell, an algal cell, or an arachnid cell.


Aspect 35. The method of aspect 34, wherein the mammalian cell is a T cell or an NK cell.


Aspect 36. The method of aspect 35, wherein the nucleotide sequence of interest comprises a nucleotide sequence encoding a receptor.


Aspect 37. The method of aspect 36, wherein the receptor is a T cell receptor, an NK cell receptor, or a chimeric antigen receptor (CAR).


Aspect 38. A method of making a genetically modified cell, the method comprising introducing into a target cell comprising a target DNA, a composition according to any one of aspects 1-28, wherein said introducing results in modification of the target DNA by homology directed repair, thereby producing a genetically modified cell.


Aspect 39. The method of aspect 38, wherein the cell is a prokaryotic cell.


Aspect 40. The method of aspect 38, wherein the cell is a eukaryotic cell.


Aspect 41. The method of aspect 40, wherein the cell is in vitro.


Aspect 42. The method of aspect 40, wherein the cell is in vivo.


Aspect 43. The method of any one of aspects 38-42, wherein the eukaryotic cell is a mammalian cell, a human cell, a non-human mammalian cell, a reptile cell, an amphibian cell, a cell of an invertebrate, a plant cell, an insect cell, an avian cell, a fish cell, a fungal cell, an algal cell, or an arachnid cell.


Aspect 44. The method of aspect 43, wherein the mammalian cell is a T cell or an NK cell.


Aspect 45. The method of aspect 44, wherein the nucleotide sequence of interest comprises a nucleotide sequence encoding a receptor.


Aspect 46. The method of aspect 45, wherein the receptor is a T cell receptor, an NK cell receptor, or a chimeric antigen receptor (CAR).


Aspect 47. A kit comprising: a) a gene-editing polypeptide; b) a single-stranded donor DNA comprising: i) a first homology arm at or near the 5′ end of the donor DNA, wherein the first homology arm comprises a nucleotide sequence that is at least partially complementary to a first nucleotide sequence in the target nucleic acid; ii) a second homology arm at or near the 3′ end of the donor DNA, wherein the second homology arm comprises a nucleotide sequence that is at least partially complementary to a second nucleotide sequence in the target nucleic acid; and iii) a nucleotide sequence of interest between the first homology arm and the second homology arm; and c) one or more staple oligonucleotides, wherein the one or more staple oligonucleotides are at least partially complementary to the nucleotide sequence of interest, such that the one or more staple oligonucleotides hybridize to the nucleotide sequence of interest, such that the donor DNA template folds into a nanostructure in which the first homology arm and the second homology arm are brought into proximity to one another.


Aspect 48. The kit of aspect 47, wherein (a), (b), and (c) are in separate containers.


EXAMPLES

The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g. amounts, temperature, etc.) but some experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Celsius, and pressure is at or near atmospheric. Standard abbreviations may be used, e.g., bp, base pair(s); kb, kilobase(s); pl, picoliter(s); s or sec, second(s); min, minute(s); h or hr, hour(s); aa, amino acid(s); nt, nucleotide(s); i.m., intramuscular(ly); i.p., intraperitoneal(ly); s.c., subcutaneous(ly); and the like.


Example 1

To identify the three-dimensional structures (“nanostructures”) that display the greatest efficiency for HDR, a library comprising six three-dimensional structures is generated with an EGFP expression plasmid, using six sets of staple oligonucleotides. Staple oligonucleotides are designed to cause different folding patterns and structures based on DNA base pairing (Rothemund (2006) Nature 440:297-302; Castro et al. (2011) Nat. Methods 8:221), including a 6-helix bundle and a 24-helix bundle. Each structure also comprises ssDNA homology arms to promote HDR into the genome. The homology arms are not incorporated into the three-dimensional structure. Staple oligonucleotides can include chemical modifications such as fluorophores, or transcription factor binding motifs (to promote import into the nucleus or to promote recruitment of DNA repair proteins).
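By way of non-limiting illustration, the base-pairing principle behind staple design can be sketched in Python as follows. The sketch splits a donor into its two homology arms and the sequence of interest, then builds one staple whose two halves are reverse complements of two separated segments of the sequence of interest, so that hybridization would bring those segments together while leaving the arms unpaired. The function names, the arm lengths, and the two-segment staple layout are hypothetical and chosen for this example only; they do not represent the design procedure actually used, which would also account for helical geometry and crossover positions.

```python
# Illustrative sketch only. Function names, arm lengths, and the two-segment
# staple layout are assumptions made for this example, not the disclosed
# design procedure.
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def split_donor(donor: str, arm5_len: int, arm3_len: int) -> Tuple[str, str, str]:
    """Split a donor into (5' homology arm, sequence of interest, 3' homology arm)."""
    if arm5_len < 0 or arm3_len < 0 or arm5_len + arm3_len >= len(donor):
        raise ValueError("homology arms must leave a non-empty sequence of interest")
    end = len(donor) - arm3_len
    return donor[:arm5_len], donor[arm5_len:end], donor[end:]


def bridging_staple(payload: str, start_a: int, start_b: int, half_len: int) -> str:
    """Build a staple (5' to 3') complementary to two non-overlapping payload segments.

    The 5' half of the staple pairs with the downstream segment and the 3' half
    pairs with the upstream segment, each in antiparallel orientation.
    """
    if (half_len <= 0 or start_a < 0
            or start_a + half_len > start_b
            or start_b + half_len > len(payload)):
        raise ValueError("segments must be in range, ordered, and non-overlapping")
    seg_a = payload[start_a:start_a + half_len]
    seg_b = payload[start_b:start_b + half_len]
    return reverse_complement(seg_b) + reverse_complement(seg_a)


if __name__ == "__main__":
    # Toy donor: 4-nt arms flanking a 12-nt sequence of interest.
    arm5, payload, arm3 = split_donor("TTTTAACCGGTTACGTCCCC", 4, 4)
    print(arm5, payload, arm3)
    print(bridging_staple(payload, 0, 8, 4))
```

Because the staple is derived only from the sequence of interest, the homology arms remain single-stranded in this toy model, consistent with the statement above that the arms are not incorporated into the three-dimensional structure.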


Two genomic sites are chosen for analysis: chr9:107,422,356/hg38 and chr1:179,826,688/hg38. Nanostructures are generated using staple oligonucleotides. The nanostructures are transfected, along with Cas9 and a guide RNA, into HEK293 cells and analyzed for HDR using polymerase chain reaction (PCR), Sanger sequencing, and next-generation sequencing (NGS). The experiment is repeated using larger payloads (the portion of the donor DNA between the homology arms) encoding a dead Cas9 (dCas9) polypeptide. Results of the two experiments are compared. Nanostructures promote HDR with large DNA insertions.
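As a non-limiting illustration, genomic sites written in the notation used above (chromosome, comma-grouped position, and genome assembly) can be converted to a structured form for downstream analysis. The following Python sketch is an assumption about how one might do so; the `GenomicSite` type and `parse_site` function are hypothetical names introduced for this example.

```python
# Illustrative sketch only: GenomicSite and parse_site are hypothetical names.
import re
from typing import NamedTuple


class GenomicSite(NamedTuple):
    chrom: str      # e.g. "chr9"
    position: int   # coordinate as written in the text, commas removed
    assembly: str   # e.g. "hg38"


_SITE = re.compile(r"^(chr[0-9XYM]+):([\d,]+)/(\w+)$")


def parse_site(text: str) -> GenomicSite:
    """Parse a string such as 'chr9:107,422,356/hg38' into a GenomicSite."""
    match = _SITE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized site notation: {text!r}")
    chrom, pos, assembly = match.groups()
    return GenomicSite(chrom, int(pos.replace(",", "")), assembly)


if __name__ == "__main__":
    for site in ("chr9:107,422,356/hg38", "chr1:179,826,688/hg38"):
        print(parse_site(site))
```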


Examples of donor DNA and homology arms are shown in FIG. 4-FIG. 7. One donor DNA (“BFP-mEGFP construct”; FIG. 4) comprises a nucleotide sequence encoding blue fluorescent protein (BFP) under transcriptional control of an EF-1 alpha promoter+T7 promoter, and also includes a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE), which increases transgene expression. It also has a 5′ homology arm and a 3′ homology arm. This donor DNA is folded into a nanostructure and is introduced, along with a Cas9 protein and a guide RNA, into HEK293 cells. The resulting genetically modified cell line can be used to study HDR, which is detected by replacing a codon in BFP such that green fluorescent protein is produced instead of blue fluorescent protein. Introduction of this construct into specific loci in the genome allows one to study locus- and chromatin-specific effects on HDR.


One donor DNA (dCas9-3×Flag-2×NLS; FIG. 5) comprises a nucleotide sequence encoding an inactive Cas9 (dCas9) polypeptide tagged with 3×Flag and 2×NLS (nuclear localization signal) under transcriptional control of a β-actin promoter, and also includes a beta-globin poly (A) signal. It also has a 5′ homology arm and a 3′ homology arm. This donor DNA is folded into a nanostructure and is introduced, along with a Cas9 protein and a guide RNA, into HEK293 cells. By changing the homology arms, this construct can allow endogenous fusion of proteins with dCas9 and can also allow integration of dCas9 or an active Cas9 into specific loci in the genome. Examples of homology arms are provided in FIG. 6 and FIG. 7.


Example 2: Generation and Characterization of DNA Nanostructures


FIG. 8A-8D depict design, purification and characterization of a DNA nanostructure encoding mNeon for integration into the human genome. FIG. 8A. Schematic of a long unstructured ssDNA HDR template folded into a DNA nanostructure and integrated into the genome via CRISPR/Cas9-mediated HDR. FIG. 8B. Schematic of a 2716-base template to insert mNeon into an intergenic site in the human genome. FIG. 8C. Cadnano design of an 18-helix bundle DNA nanostructure and structure prediction. FIG. 8D. Gel and atomic force microscopy (AFM) characterization of the 18-helix bundle DNA nanostructure.
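By way of non-limiting illustration, the approximate end-to-end length of a helix bundle can be estimated from the template length and the number of helices. The Python sketch below assumes ideal B-form DNA with an axial rise of about 0.34 nm per base pair and assumes that the entire template is distributed evenly across the helices; neither assumption is stated in this disclosure, and any portion of the template left unfolded (for example, homology arms) would shorten the bundle. The function name and constant are hypothetical.

```python
# Illustrative estimate only. Assumes ideal B-form DNA (about 0.34 nm rise per
# base pair) and that the whole template is folded evenly across all helices.
RISE_NM_PER_BP = 0.34


def bundle_length_nm(template_nt: int, n_helices: int,
                     rise_nm_per_bp: float = RISE_NM_PER_BP) -> float:
    """Estimate the length (nm) of an n-helix bundle folded from a template."""
    if template_nt <= 0 or n_helices <= 0:
        raise ValueError("template length and helix count must be positive")
    bp_per_helix = template_nt / n_helices
    return bp_per_helix * rise_nm_per_bp


if __name__ == "__main__":
    # 2716-base template folded into an 18-helix bundle.
    print(f"{bundle_length_nm(2716, 18):.1f} nm")  # prints "51.3 nm"
```

Under these assumptions, a 2716-base template in an 18-helix bundle would be roughly 51 nm long (about 151 base pairs per helix).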



FIG. 9A-9D depict entry into the nucleus and integration into the genome via HDR of DNA nanostructures compared to unstructured ssDNA. The data show that electroporation allows DNA nanostructures to enter the nucleus and integrate into the genome at similar rates as unstructured ssDNA through homology directed editing. FIG. 9A. Flow cytometry data show that simple structures (ssDNA+5 staples) are more efficiently incorporated into the genome compared to unstructured ssDNA and the 18-helix bundle. Electroporation shows similar values across unstructured ssDNA, the simple structure, and the 18-helix bundle. FIG. 9B. Graphed flow cytometry data show that simple structures perform best for both transfection and electroporation. FIG. 9C. PCR using primers flanking the insertion site shows that electroporation leads to similar insertion efficiencies for ssDNA, the simple structure, and the 18-helix bundle. A larger insertion, which could be a concatemer, was observed for ssDNA and the 18-helix bundle; the simple structure shows a single band at the expected size. FIG. 9D. AFM images show that the 18-helix bundle retained its shape after electroporation.



FIG. 10A-10D depict recruitment of CRISPR Cas9 to the ends of templates using shuttles. The data show that recruiting CRISPR Cas9 to the ends of the templates using shuttles leads to higher knockin efficiencies of large templates, including DNA nanostructures, in both HEK293T cells and K562 cells. FIG. 10A. Schematic of Cas9 binding to the ends of the templates for shuttling them into the nucleus. FIG. 10B. Flow cytometry data show that electroporated, shuttled ssDNA, the simple structure, and the 18-helix bundle have similar knockin efficiencies. FIG. 10C. AFM image of CRISPR Cas9 RNP bound to the 18-helix bundle. FIG. 10D. The simple structure shows significantly higher knockin into K562 cells, compared to unstructured ssDNA.



FIG. 11A-11E depict incorporation of DNA nanostructures encoding a human gene into human primary cells. The data show that DNA nanostructures encoding a human gene are incorporated into human primary cells with high efficiency. FIG. 11A. Schematic of knockin strategy of a 3.5 kb HDR template encoding IL2RA with a linked GFP, as well as an mCherry under an EF1a promoter. FIG. 11B. Different DNA nanostructure designs using the template shown in FIG. 11A. FIG. 11C. Live cell count shows that the 18-helix bundle has a slightly higher toxicity compared to other structures and unstructured ssDNA. FIG. 11D. Structured templates show increased knockin efficiency compared to ssDNA alone, except for the 18-helix bundle. FIG. 11E. A mix of 18-helix bundles and unstructured dsDNA showed a striking end-to-end size difference.



FIG. 12 depicts an atomic force microscope (AFM) image of a 5 kb ssDNA encoding IL2Ra, folded into a long 18-helix bundle. The image shows that a longer 18-helix bundle encoding IL2Ra is folded with high precision.


While the present invention has been described with reference to the specific embodiments thereof, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation, material, composition of matter, process, process step or steps, to the objective, spirit and scope of the present invention. All such modifications are intended to be within the scope of the claims appended hereto.

Claims
  • 1. A composition comprising: a) a gene-editing polypeptide; b) a single-stranded donor DNA comprising: i) a first homology arm at or near the 5′ end of the donor DNA, wherein the first homology arm comprises a nucleotide sequence that is at least partially complementary to a first nucleotide sequence in the target nucleic acid; ii) a second homology arm at or near the 3′ end of the donor DNA, wherein the second homology arm comprises a nucleotide sequence that is at least partially complementary to a second nucleotide sequence in the target nucleic acid; and iii) a nucleotide sequence of interest between the first homology arm and the second homology arm; and c) one or more staple oligonucleotides, wherein the one or more staple oligonucleotides are at least partially complementary to the nucleotide sequence of interest, such that the one or more staple oligonucleotides hybridize to the nucleotide sequence of interest, such that the single-stranded donor DNA folds into a DNA nanostructure in which the first homology arm and the second homology arm are brought into proximity to one another.
  • 2. The composition of claim 1, comprising from 2 to 250 staple oligonucleotides, or from 5 to 15 staple oligonucleotides.
  • 3. The composition of claim 1, wherein the single-stranded donor DNA has a length of from about 50 nucleotides to about 10,000 nucleotides.
  • 4. The composition of claim 1, wherein the single-stranded donor DNA has a length of from about 100 nucleotides to about 7,500 nucleotides, or from about 100 nucleotides to about 10,000 nucleotides.
  • 5. The composition of claim 1, wherein the single-stranded donor DNA has a length of from about 1,000 nucleotides to about 7,500 nucleotides, or from about 1,000 nucleotides to about 10,000 nucleotides.
  • 6. The composition of any one of claims 1-5, wherein at least one of the one or more staple oligonucleotides comprises a detectable label.
  • 7. The composition of claim 6, wherein the detectable label comprises a fluorophore.
  • 8. The composition of any one of claims 1-7, wherein each of the one or more staple oligonucleotides independently has a length of from about 10 nucleotides to about 60 nucleotides.
  • 9. The composition of any one of claims 1-8, wherein each of the one or more staple oligonucleotides independently has a length of from about 20 nucleotides to about 50 nucleotides.
  • 10. The composition of any one of claims 1-9, wherein the nanostructure comprises a 6-helix bundle.
  • 11. The composition of any one of claims 1-9, wherein the nanostructure comprises a 24-helix bundle.
  • 12. The composition of any one of claims 1-11, wherein the gene-editing polypeptide is a transcription activator-like effector nuclease.
  • 13. The composition of any one of claims 1-11, wherein the gene-editing polypeptide is a zinc finger nuclease.
  • 14. The composition of any one of claims 1-11, wherein the gene-editing polypeptide is a CRISPR/Cas effector polypeptide.
  • 15. The composition of claim 14, wherein the CRISPR/Cas effector polypeptide is a type II CRISPR/Cas effector polypeptide.
  • 16. The composition of claim 14, wherein the CRISPR/Cas effector polypeptide is a type V CRISPR/Cas effector polypeptide.
  • 17. The composition of claim 14, wherein the CRISPR/Cas effector polypeptide is a type VI CRISPR/Cas effector polypeptide.
  • 18. The composition of any one of claims 14-17, comprising one or more guide nucleic acids, wherein each of the one or more guide nucleic acids comprises a targeter nucleotide sequence that hybridizes to a target nucleic acid.
  • 19. The composition of claim 18, wherein the guide nucleic acid is a guide RNA.
  • 20. The composition of claim 19, wherein the guide RNA comprises one or more of: i) a modified sugar; ii) a modified base; and iii) a modified internucleoside linkage.
  • 21. The composition of any one of claims 1-20, wherein the nucleotide sequence of interest comprises one or more of: i) a nucleotide sequence encoding a protein of interest; ii) a nucleotide sequence encoding an exon of a gene; iii) a promoter sequence; iv) an enhancer sequence; and v) a sequence encoding a non-coding RNA.
  • 22. A composition comprising: a) a gene-editing polypeptide; and b) a DNA nanostructure, wherein the DNA nanostructure comprises, in order from 5′ to 3′: i) a first homology arm; ii) a nucleotide sequence of interest that is hybridized to one or more staple oligonucleotides, wherein the nucleotide sequence of interest is folded via the staple oligonucleotides such that the nucleotide sequence of interest has a length of from about 5 nm to about 500 nm; and iii) a second homology arm.
  • 23. The composition of claim 22, wherein the DNA nanostructure comprises a 6-helix bundle, a 24-helix bundle, a tube, a cube, a square lattice, or a honeycomb lattice.
  • 24. The composition of claim 22, wherein the DNA nanostructure incorporates curvature.
  • 25. The composition of claim 22, wherein the DNA nanostructure comprises one or more single-stranded regions that do not comprise a hybridized staple oligonucleotide.
  • 26. The composition of any one of claims 22-25, wherein the gene-editing polypeptide is a CRISPR/Cas effector polypeptide and wherein the composition further comprises a guide nucleic acid.
  • 27. The composition of any one of claims 22-26, wherein the composition is packaged in a virus-like particle.
  • 28. The composition of any one of claims 22-27, wherein the DNA nanostructure comprises a small molecule, a nucleic acid, or a polypeptide encapsulated within the DNA nanostructure.
  • 29. A method of modifying a target nucleic acid in a cell by homology-directed repair, the method comprising introducing into the cell a composition according to any one of claims 1-28.
  • 30. The method of claim 29, wherein the cell is a prokaryotic cell.
  • 31. The method of claim 29, wherein the cell is a eukaryotic cell.
  • 32. The method of claim 31, wherein the cell is in vitro.
  • 33. The method of claim 31, wherein the cell is in vivo.
  • 34. The method of any one of claims 29-33, wherein the eukaryotic cell is a human cell, a non-human mammalian cell, a reptile cell, an amphibian cell, a cell of an invertebrate, a plant cell, an insect cell, an avian cell, a fungal cell, a fish cell, an algal cell, or an arachnid cell.
  • 35. The method of claim 34, wherein the mammalian cell is a T cell or an NK cell.
  • 36. The method of claim 35, wherein the nucleotide sequence of interest comprises a nucleotide sequence encoding a receptor.
  • 37. The method of claim 36, wherein the receptor is a T cell receptor, an NK cell receptor, or a chimeric antigen receptor (CAR).
  • 38. A method of making a genetically modified cell, the method comprising introducing into a target cell comprising a target DNA, a composition according to any one of claims 1-28, wherein said introducing results in modification of the target DNA by homology directed repair, thereby producing a genetically modified cell.
  • 39. The method of claim 38, wherein the cell is a prokaryotic cell.
  • 40. The method of claim 38, wherein the cell is a eukaryotic cell.
  • 41. The method of claim 40, wherein the cell is in vitro.
  • 42. The method of claim 40, wherein the cell is in vivo.
  • 43. The method of any one of claims 38-42, wherein the eukaryotic cell is a mammalian cell, a human cell, a non-human mammalian cell, a reptile cell, an amphibian cell, a cell of an invertebrate, a plant cell, an insect cell, an avian cell, a fish cell, a fungal cell, an algal cell, or an arachnid cell.
  • 44. The method of claim 43, wherein the mammalian cell is a T cell or an NK cell.
  • 45. The method of claim 44, wherein the nucleotide sequence of interest comprises a nucleotide sequence encoding a receptor.
  • 46. The method of claim 45, wherein the receptor is a T cell receptor, an NK cell receptor, or a chimeric antigen receptor (CAR).
  • 47. A kit comprising: a) a gene-editing polypeptide; b) a single-stranded donor DNA comprising: i) a first homology arm at or near the 5′ end of the donor DNA, wherein the first homology arm comprises a nucleotide sequence that is at least partially complementary to a first nucleotide sequence in the target nucleic acid; ii) a second homology arm at or near the 3′ end of the donor DNA, wherein the second homology arm comprises a nucleotide sequence that is at least partially complementary to a second nucleotide sequence in the target nucleic acid; and iii) a nucleotide sequence of interest between the first homology arm and the second homology arm; and c) one or more staple oligonucleotides, wherein the one or more staple oligonucleotides are at least partially complementary to the nucleotide sequence of interest, such that the one or more staple oligonucleotides hybridize to the nucleotide sequence of interest, such that the donor DNA template folds into a nanostructure in which the first homology arm and the second homology arm are brought into proximity to one another.
  • 48. The kit of claim 47, wherein (a), (b), and (c) are in separate containers.
CROSS-REFERENCE

This application claims the benefit of U.S. Provisional Patent Application No. 63/049,316, filed Jul. 8, 2020, which application is incorporated herein by reference in its entirety.

PCT Information
Filing Document: PCT/US21/40652
Filing Date: 7/7/2021
Country: WO

Provisional Applications (1)
Number: 63049316
Date: Jul 2020
Country: US