This invention was made with government support under Grant No. 1DP5OD024583 awarded by the National Institutes of Health. The government has certain rights in the invention.
The invention relates generally to methods of DNA editing capable of providing efficient and continuous nucleotide diversification in human cells.
The advancement of methods for studying the genetic dynamics of eukaryotic cells, such as directed evolution, lineage tracing, and molecular recording, depends upon development of additional tools for targeted, continuous mutagenesis. Existing tools tend to rely upon non-physiological environments, tend to saturate mutagenized sites rapidly, and/or have only been adapted in bacterial or yeast systems. While approaches for relatively long editing regions have been identified and demonstrated in bacterial and yeast cells, a need exists for an editor system that is efficient in inducing continuous nucleotide diversification in cells of multicellular eukaryotic organisms, especially in mammalian cells.
The current disclosure relates, at least in part, to the discovery of compositions and methods capable of performing targeted mutagenesis in higher eukaryotic cells, particularly in mammalian cells in culture, across large spans of targeted nucleic acid sequence, at mutation rates that are robust as compared to background rates of polymerase-mediated mutation. In certain aspects, the compositions and methods of the instant disclosure provide for enhanced, targeted mutagenesis of mammalian cells capable of enabling directed evolution of targeted sequences in living cells. Accordingly, application of the instant compositions and methods to drug and/or peptide evolution and screening in mammalian cell lines is expressly contemplated, as are other applications as set forth herein and as known in the art.
In one aspect, the instant disclosure provides a fusion protein that includes: (i) a bacteriophage RNA polymerase and (ii) a nucleic acid-editing deaminase.
In one embodiment, the bacteriophage RNA polymerase is a T7 RNA polymerase or a T7-like RNA polymerase. Optionally, the T7-like RNA polymerase is a N4 RNA polymerase.
In another embodiment, the nucleic acid-editing deaminase is a cytidine deaminase, an adenine deaminase and/or a guanine deaminase. Optionally, the cytidine deaminase is an activation-induced cytidine deaminase. Optionally, the activation-induced cytidine deaminase is rat APOBEC1 or AID. Optionally, the AID cytidine deaminase is a hyperactive mutant of AID. Optionally, the hyperactive mutant of AID is AID*Δ.
In an additional embodiment, the fusion protein further includes a nuclear localization signal (NLS). Optionally, the NLS is attached at the C-terminus of the fusion protein.
In certain embodiments, the fusion protein further includes a uracil glycosylase inhibitor (UGI). Optionally, the UGI is attached at a location C-terminal to the nucleic acid-editing deaminase and the bacteriophage RNA polymerase.
Another aspect of the instant disclosure provides a nucleic acid that includes: (i) a nucleic acid sequence encoding for a bacteriophage RNA polymerase and (ii) a nucleic acid sequence encoding for a nucleic acid-editing deaminase.
In one embodiment, the nucleic acid further includes a nucleic acid sequence encoding for a nuclear localization signal (NLS). Optionally, the nucleic acid sequence encoding for the NLS is attached at the 3′-terminus of the nucleic acid.
In another embodiment, the nucleic acid further includes a nucleic acid sequence encoding for a uracil glycosylase inhibitor (UGI). Optionally, the nucleic acid sequence encoding for the UGI is attached at a location 3′ of the nucleic acid sequence encoding for the nucleic acid-editing deaminase and the nucleic acid sequence encoding for the bacteriophage RNA polymerase.
In an additional embodiment, the nucleic acid further includes a mammalian expression vector promoter. Optionally, the mammalian expression vector promoter is located 5′ of the nucleic acid sequence encoding for a bacteriophage RNA polymerase and the nucleic acid sequence encoding for the nucleic acid-editing deaminase. Optionally, the mammalian expression vector promoter is a CMV promoter, an SV40 promoter, an elongation factor (EF)-1 promoter or a tetracycline-inducible mammalian promoter (e.g., Tet-On, Tet-Off, etc.).
In another embodiment, the nucleic acid further includes an origin of replication. Optionally, the nucleic acid is a plasmid.
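By way of non-limiting illustration, the optional 5′-to-3′ arrangement recited in the preceding embodiments can be sketched in Python as a simple ordering check. The element labels and the function name `check_layout` are hypothetical and are used here only for illustration. The relative order of the polymerase and deaminase coding sequences is deliberately left unconstrained, because the embodiments above do not fix it.

```python
def check_layout(elements):
    """Return a list of layout problems for a 5'->3' ordered list of element labels.

    Labels are hypothetical: 'promoter', 'rnap', 'deaminase', 'ugi', 'nls'.
    An empty list means the optional arrangement described above is satisfied.
    """
    problems = []
    pos = {name: i for i, name in enumerate(elements)}

    # Both coding sequences must be present in the nucleic acid.
    for required in ("rnap", "deaminase"):
        if required not in pos:
            problems.append(f"missing {required}")
    if problems:
        return problems

    coding = (pos["rnap"], pos["deaminase"])
    # Optional promoter: located 5' of both coding sequences.
    if "promoter" in pos and pos["promoter"] > min(coding):
        problems.append("promoter is not 5' of both coding sequences")
    # Optional UGI: located 3' of both coding sequences.
    if "ugi" in pos and pos["ugi"] < max(coding):
        problems.append("ugi is not 3' of both coding sequences")
    # Optional NLS: attached at the 3'-terminus.
    if "nls" in pos and pos["nls"] != len(elements) - 1:
        problems.append("nls is not at the 3' terminus")
    return problems
```

For example, `check_layout(["promoter", "rnap", "deaminase", "ugi", "nls"])` returns an empty list, as does the same layout with the polymerase and deaminase labels swapped.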
An additional aspect of the disclosure provides a mammalian cell that includes a first nucleic acid of the disclosure (e.g., encoding for a fusion protein that includes a bacteriophage RNA polymerase and a nucleic acid-editing deaminase).
In one embodiment, the mammalian cell further harbors a second nucleic acid that includes a bacteriophage promoter corresponding to the bacteriophage RNA polymerase of the first nucleic acid. Optionally, the bacteriophage promoter is a T7 promoter or is a T7-like promoter. Optionally, the T7-like promoter is a N4 promoter.
In certain embodiments, the bacteriophage promoter of the second nucleic acid is operably linked to a target nucleic acid sequence. Optionally, the target nucleic acid sequence is a mammalian target nucleic acid sequence. Optionally, the mammalian target nucleic acid sequence is ABL1, FLT3, MCL1, PRKCQ, WEE1, ABL2, FNTA, MDM2, PRKCSH, XIAP, AKT1, GSK3A, MEK1, PRKCZ, AKT2, GSK3B, MET, PRKDC, AKT3, HDAC1, MTOR, PSENEN, AIX, HDAC2, NFKB1, PSMB5, AR, HDAC3, NTRK1, PTK2, ATM, HDAC6, P4HB, PTPN11, AURKA, HDAC8, p53, PTPN6, AURKB, HER2, PAK1, RAC1, AURKC, HSP90AA1, PARP1, RET, BCL2, HSP90AB1, PDGFRA, ROCK1, BCL-ABL1, HSP90AB4P, PDGFRB, ROCK2, BMX, HSP90B1, PDK1, RPS6KA1, BRAF, HSP90B3P, PIK3CA, RPS6KA2, BTK, IGF1R, PIK3CB, RPS6KA3, CASP3, IKBKE, PIK3CD, RPS6KA4, CCR5, ITK, PIK3CG, RPS6KA5, CDK1, JAK2, PLK1, RPS6KA6, CDK2, KDR, PLK2, RPS6KB2, CDK4, KIT, PLK3, RXRA, CDK6, KRAS, PPM1D, RXRB, CDK7, MAP2K1, PRKAA1, SGK3, CTNNB1, MAP2K2, PRKCA, SMO, DHFR, MAPK11, PRKCB, SRC, EGFR, MAPK12, PRKCD, SYK, ERBB2, MAPK13, PRKCE, TBK1, FGFR1, MAPK14, PRKCG, TEC, FGFR3, MAPK7, PRKCH, TNF, FLT1, MAPK8, PRKCI and/or TOP1.
In some embodiments, the second nucleic acid is harbored on a plasmid within the mammalian cell.
In an embodiment, the second nucleic acid is integrated into the genome of the mammalian cell. Optionally, the second nucleic acid is integrated into the genome of the mammalian cell at the Rosa 26 locus. Optionally, the first nucleic acid and the second nucleic acid are integrated into the genome of the mammalian cell at the Rosa 26 locus.
In embodiments, the mammalian cell is a mouse cell. Optionally, the mammalian cell is a mouse oocyte cell.
In certain embodiments, the mammalian cell further harbors a cell type-specific Cre-recombinase or Cre-ER capable of inducing conditional expression of the first nucleic acid and/or the second nucleic acid where Cre-recombinase is present.
In one embodiment, the mammalian cell is a cell of a mammalian cell line. Optionally, the mammalian cell line is HEK293T, VERO, BHK, HeLa, CV1, MDCK, 3T3, a myeloma cell line, PC12, WI38 or Chinese hamster ovary (CHO).
Another aspect of the instant disclosure provides a method for performing mutagenesis upon a target nucleic acid of a mammalian cell, the method involving: (a) providing a mammalian cell; (b) contacting the mammalian cell with: (i) a first nucleic acid of the instant disclosure; and (ii) a second nucleic acid that includes a bacteriophage promoter operably linked to a target nucleic acid; where contacting of the mammalian cell with the first nucleic acid and the second nucleic acid is performed in any order, including concurrently; and (c) culturing the mammalian cell for a duration of time sufficient for mutation of the target nucleic acid to be detected.
In one embodiment, the first nucleic acid is harbored on a plasmid.
In another embodiment, contacting step (b) includes transfecting the first nucleic acid into the mammalian cell. Optionally, the transfecting involves a lentivirus.
In other embodiments, contacting step (b) includes genomic integration of the first nucleic acid.
In certain embodiments, the second nucleic acid is harbored on a plasmid.
In an additional embodiment, contacting step (b) involves transfecting the second nucleic acid into the mammalian cell.
In other embodiments, contacting step (b) involves genomic integration of the second nucleic acid.
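By way of non-limiting illustration, detection of mutation of the target nucleic acid in step (c) could be approached by comparing a sequenced read of the target against the unmutated reference sequence. The following Python sketch is an assumption-laden example rather than a prescribed protocol: it assumes the read has already been aligned to the reference with no insertions or deletions, and the function names are hypothetical. C>T and G>A are tallied separately because cytidine deamination yields uracil, which is read as T on the deaminated strand and as G>A on the complementary strand.

```python
from collections import Counter


def substitution_counts(reference, read):
    """Count substitutions by type (e.g. 'C>T') between two pre-aligned sequences.

    Assumes equal length and no indels; positions with an ambiguous 'N' are skipped.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be pre-aligned to equal length")
    counts = Counter()
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        if ref_base != read_base and "N" not in (ref_base, read_base):
            counts[f"{ref_base}>{read_base}"] += 1
    return counts


def deamination_like_fraction(counts):
    """Fraction of all substitutions that are C>T or G>A (0.0 if there are none)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["C>T"] + counts["G>A"]) / total
```

In practice, counts from many reads would be pooled and compared with an untreated control to separate editor-induced substitutions from sequencing error and background mutation.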
A further aspect of the instant disclosure provides a kit that includes a nucleic acid of the instant disclosure and instructions for its use.
In one embodiment, the kit further includes a transfection agent. Optionally, the transfection agent is a lentivirus.
As used herein, the term “bacteriophage RNA polymerase” refers to any bacteriophage-derived RNA polymerase (RNAP) that possesses DNA processivity, which is expressly contemplated to include all variant, mutant and/or derivative forms of bacteriophage RNAP, provided that DNA processivity is maintained. Specific examples of RNAP are set forth below, and include, without limitation, T7 RNAP and T7-like RNA polymerases, such as T3 RNAP, SP6 RNAP and/or N4 RNAP.
The term “nucleic acid-editing deaminase,” as used herein, refers to any deaminase that is capable of performing somatic hypermutation. Deaminases effect the deamination or removal of an amine group of a nucleic acid. Expressly contemplated examples of nucleic acid-editing deaminases include, but are not limited to, adenine deaminase, cytidine deaminase (including activation-induced cytidine deaminase), and guanine deaminase. Specific examples of nucleic acid-editing deaminases are provided in additional detail elsewhere herein.
The term “fusion protein” as used herein refers to an engineered polypeptide that combines sequence elements excerpted from two or more other proteins, optionally from two or more naturally-occurring proteins.
The terms “transfect,” “transfects,” “transfecting” and “transfection” as used herein refer to the delivery of nucleic acids (usually DNA or RNA) to the cytoplasm or nucleus of cells, e.g., through the use of lentiviral delivery vectors/plasmids, cationic lipid vehicle(s) and/or by means of electroporation, or other art-recognized means of transfection.
The term “plasmid” as used herein refers to a construction comprised of genetic material designed to direct transformation of a targeted cell. The plasmid consists of a plasmid backbone. A “plasmid backbone” as used herein contains multiple genetic elements positionally and sequentially oriented with other necessary genetic elements such that the nucleic acid in a nucleic acid cassette can be transcribed and, when necessary, translated in the transfected cells. The term plasmid as used herein can refer to nucleic acid, e.g., DNA derived from a plasmid vector, cosmid, phagemid or bacteriophage, into which one or more fragments of nucleic acid may be inserted or cloned which encode for particular genes.
A “viral vector” as used herein is one that is physically incorporated in a viral particle by the inclusion of a portion of a viral genome within the vector, e.g., a packaging signal, and is not merely DNA or a located gene taken from a portion of a viral nucleic acid. Thus, while a portion of a viral genome can be present in a plasmid of the present disclosure, that portion does not cause incorporation of the plasmid into a viral particle and thus is unable to produce an infective viral particle.
As used herein, the term “vector” refers to any genetic element, such as a plasmid, phage, transposon, cosmid, chromosome, virus, virion, etc., which is capable of replication when associated with the proper control elements and which can transfer gene sequences between cells. Thus, the term includes cloning and expression vehicles, as well as viral vectors.
As used herein, the term “integrating vector” refers to a vector whose integration or insertion into a nucleic acid (e.g., a chromosome) is accomplished via an integrase. Examples of “integrating vectors” include, but are not limited to, retroviral vectors, transposons, and adeno-associated virus vectors.
As used herein, the term “integrated” refers to a vector that is stably inserted into the genome (i.e., into a chromosome) of a host cell.
As used herein, the term “genome” refers to the genetic material (e.g., chromosomes) of an organism.
The term “target nucleic acid” refers to any nucleotide sequence (e.g., RNA or DNA), the manipulation of which may be deemed desirable for any reason (e.g., for directed evolution, to treat disease, confer improved qualities, expression of a protein of interest in a host cell, expression of a ribozyme, etc.), by one of ordinary skill in the art. Such nucleic acid sequences include, but are not limited to, coding sequences of genes (e.g., enzyme-encoding genes, transcription factor-encoding genes, cytokine-encoding genes, reporter genes, selection marker genes, oncogenes, drug resistance genes, growth factors, etc.), and non-coding regulatory sequences which do not encode an mRNA or protein product (e.g., promoter sequence, polyadenylation sequence, termination sequence, enhancer sequence, etc.).
As used herein, the term “exogenous gene” refers to a gene that is not naturally present in a host organism or cell, or is artificially introduced into a host organism or cell.
The term “gene” refers to a nucleic acid (e.g., DNA or RNA) sequence that comprises coding sequences necessary for the production of a polypeptide or precursor (e.g., proinsulin). The polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, etc.) of the full-length or fragment are retained. The term also encompasses the coding region of a structural gene and includes sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb or more on either end such that the gene corresponds to the length of the full-length mRNA. The sequences that are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ untranslated sequences. The sequences that are located 3′ or downstream of the coding region and which are present on the mRNA are referred to as 3′ untranslated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene which are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.
As used herein, the term “gene expression” refers to the process of converting genetic information encoded in a gene into RNA (e.g., mRNA, rRNA, tRNA, or snRNA) through “transcription” of the gene (i.e., via the enzymatic action of an RNA polymerase), and for protein encoding genes, into protein through “translation” of mRNA. Gene expression can be regulated at many stages in the process. “Up-regulation” or “activation” refers to regulation that increases the production of gene expression products (i.e., RNA or protein), while “down-regulation” or “repression” refers to regulation that decreases production. Molecules (e.g., transcription factors) that are involved in up-regulation or down-regulation are often called “activators” and “repressors,” respectively.
Where “amino acid sequence” is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms, such as “polypeptide” or “protein” are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule.
As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” “DNA encoding,” “RNA sequence encoding,” and “RNA encoding” refer to the order or sequence of deoxyribonucleotides or ribonucleotides along a strand of deoxyribonucleic acid or ribonucleic acid. The order of these deoxyribonucleotides or ribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA or RNA sequence thus codes for the amino acid sequence.
As used herein, the term “variant,” when used in reference to a protein, refers to proteins encoded by partially homologous nucleic acids so that the amino acid sequence of the proteins varies. As used herein, the term “variant” encompasses proteins encoded by homologous genes having both conservative and nonconservative amino acid substitutions that do not result in a change in protein function, as well as proteins encoded by homologous genes having amino acid substitutions that cause decreased (e.g., null mutations) protein function or increased protein function.
The terms “in operable combination,” “in operable order,” and “operably linked” as used herein refer to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule is produced. The term also refers to the linkage of amino acid sequences in such a manner so that a functional protein is produced.
As used herein, the term “regulatory element” refers to a genetic element which controls some aspect of the expression of nucleic acid sequences. For example, a promoter is a regulatory element that facilitates the initiation of transcription of an operably linked coding region. Other regulatory elements are splicing signals, polyadenylation signals, termination signals, RNA export elements, internal ribosome entry sites, etc.
Transcriptional control signals in eukaryotes comprise “promoter” and “enhancer” elements. Promoters and enhancers consist of short arrays of DNA sequences that interact specifically with cellular proteins involved in transcription (Maniatis et al., Science 236:1237 [1987]). Promoter and enhancer elements have been isolated from a variety of eukaryotic sources including genes in yeast, insect and mammalian cells, and viruses (analogous control elements, i.e., promoters, are also found in prokaryotes). The selection of a particular promoter and enhancer depends on what cell type is to be used to express the protein of interest. Some eukaryotic promoters and enhancers have a broad host range while others are functional in a limited subset of cell types (for review see, Voss et al., Trends Biochem. Sci., 11:287 [1986]; and Maniatis et al., supra). For example, the SV40 early gene enhancer is very active in a wide variety of cell types from many mammalian species and has been widely used for the expression of proteins in mammalian cells (Dijkema et al, EMBO J. 4:761 [1985]). Two other examples of promoter/enhancer elements active in a broad range of mammalian cell types are those from the human elongation factor 1α gene (Uetsuki et al., J. Biol. Chem., 264:5791 [1989]; Kim et al., Gene 91:217 [1990]; and Mizushima and Nagata, Nuc. Acids. Res., 18:5322 [1990]) and the long terminal repeats of the Rous sarcoma virus (Gorman et al., Proc. Natl. Acad. Sci. USA 79:6777 [1982]) and the human cytomegalovirus (Boshart et al., Cell 41:521 [1985]).
As used herein, the term “promoter/enhancer” denotes a segment of DNA which contains sequences capable of providing both promoter and enhancer functions (i.e., the functions provided by a promoter element and an enhancer element, see above for a discussion of these functions). For example, the long terminal repeats of retroviruses contain both promoter and enhancer functions. The enhancer/promoter may be “endogenous” or “exogenous” or “heterologous.” An “endogenous” enhancer/promoter is one which is naturally linked with a given gene in the genome. An “exogenous” or “heterologous” enhancer/promoter is one which is placed in juxtaposition to a gene by means of genetic manipulation (i.e., molecular biological techniques such as cloning and recombination) such that transcription of that gene is directed by the linked enhancer/promoter.
The term “promoter,” “promoter element,” or “promoter sequence” as used herein, refers to a DNA sequence which when ligated to a nucleotide sequence of interest is capable of controlling the transcription of the nucleotide sequence of interest into mRNA. A promoter is typically, though not necessarily, located 5′ (i.e., upstream) of a nucleotide sequence of interest whose transcription into mRNA it controls, and provides a site for specific binding by RNA polymerase and other transcription factors for initiation of transcription.
Promoters may be constitutive or regulatable. The term “constitutive” when made in reference to a promoter means that the promoter is capable of directing transcription of an operably linked nucleic acid sequence in the absence of a stimulus (e.g., heat shock, chemicals, etc.). In contrast, a “regulatable” promoter is one which is capable of directing a level of transcription of an operably linked nucleic acid sequence in the presence of a stimulus (e.g., heat shock, chemicals, etc.) which is different from the level of transcription of the operably linked nucleic acid sequence in the absence of the stimulus.
Eukaryotic expression vectors may also contain “viral replicons” or “viral origins of replication.” Viral replicons are viral DNA sequences that allow for the extrachromosomal replication of a vector in a host cell expressing the appropriate replication factors. Vectors that contain either the SV40 or polyoma virus origin of replication replicate to high “copy number” (up to 10⁴ copies/cell) in cells that express the appropriate viral T antigen. Vectors that contain the replicons from bovine papillomavirus or Epstein-Barr virus replicate extrachromosomally at “low copy number” (~100 copies/cell). However, it is not intended that expression vectors be limited to any particular viral origin of replication.
As used herein, the term “retrovirus” refers to a retroviral particle which is capable of entering a cell (i.e., the particle contains a membrane-associated protein such as an envelope protein or a viral G glycoprotein which can bind to the host cell surface and facilitate entry of the viral particle into the cytoplasm of the host cell) and integrating the retroviral genome (as a double-stranded provirus) into the genome of the host cell. The term “retrovirus” encompasses Oncovirinae (e.g., Moloney murine leukemia virus (MoMOLV), Moloney murine sarcoma virus (MoMSV), and Mouse mammary tumor virus (MMTV)), Spumavirinae, and Lentivirinae (e.g., Human immunodeficiency virus, Simian immunodeficiency virus, Equine infectious anemia virus, and Caprine arthritis-encephalitis virus; See, e.g., U.S. Pat. Nos. 5,994,136 and 6,013,516, both of which are incorporated herein by reference).
As used herein, the term “retroviral vector” refers to a retrovirus that has been modified to express a gene of interest. Retroviral vectors can be used to transfer genes efficiently into host cells by exploiting the viral infectious process. Foreign or heterologous genes cloned (i.e., inserted using molecular biological techniques) into the retroviral genome can be delivered efficiently to host cells which are susceptible to infection by the retrovirus.
The term “Rhabdoviridae” refers to a family of enveloped RNA viruses that infect animals, including humans, and plants. The Rhabdoviridae family encompasses the genus Vesiculovirus which includes vesicular stomatitis virus (VSV), Cocal virus, Piry virus, Chandipura virus, and Spring viremia of carp virus (sequences encoding the Spring viremia of carp virus are available under GenBank accession number U18101). The G proteins of viruses in the Vesiculovirus genus are virally-encoded integral membrane proteins that form externally projecting homotrimeric spike glycoprotein complexes that are required for receptor binding and membrane fusion. The G proteins of viruses in the Vesiculovirus genus have a covalently bound palmitic acid (C16) moiety. The amino acid sequences of the G proteins from the Vesiculoviruses are fairly well conserved. For example, the Piry virus G protein shares about 38% identity and about 55% similarity with the VSV G proteins (several strains of VSV are known, e.g., Indiana, New Jersey, Orsay, San Juan, etc., and their G proteins are highly homologous). The Chandipura virus G protein and the VSV G proteins share about 37% identity and 52% similarity. Given the high degree of conservation (amino acid sequence) and the related functional characteristics (e.g., binding of the virus to the host cell and fusion of membranes, including syncytia formation) of the G proteins of the Vesiculoviruses, the G proteins from non-VSV Vesiculoviruses may be used in place of the VSV G protein for the pseudotyping of viral particles. The G proteins of the Lyssa viruses (another genus within the Rhabdoviridae family) also share a fair degree of conservation with the VSV G proteins and function in a similar manner (e.g., mediate fusion of membranes) and therefore may be used in place of the VSV G protein for the pseudotyping of viral particles.
The Lyssa viruses include the Mokola virus and the Rabies viruses (several strains of Rabies virus are known and their G proteins have been cloned and sequenced). The Mokola virus G protein shares stretches of homology (particularly over the extracellular and transmembrane domains) with the VSV G proteins, showing about 31% identity and 48% similarity with the VSV G proteins. Preferred G proteins share at least 25% identity, preferably at least 30% identity and most preferably at least 35% identity with the VSV G proteins. The VSV G protein from the New Jersey strain (the sequence of this G protein is provided in GenBank accession numbers M27165 and M21557) is employed as the reference VSV G protein.
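By way of non-limiting illustration, the identity thresholds recited above can be applied to a pairwise alignment as in the following Python sketch. Several assumptions are made that are not part of this disclosure: the two sequences are supplied already aligned, with “-” marking gaps; percent identity is computed over aligned columns in which neither sequence has a gap (one of several conventions in use); and the function names and tier labels are hypothetical. Percent similarity is not computed, since it would additionally require a choice of substitution matrix.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def preference_tier(identity_pct):
    """Map a percent identity with the reference G protein onto the recited tiers."""
    if identity_pct >= 35:
        return "most preferred"
    if identity_pct >= 30:
        return "more preferred"
    if identity_pct >= 25:
        return "preferred"
    return "below threshold"
```

Under these assumptions, a candidate G protein with about 31% identity to the reference, such as the figure given above for the Mokola virus G protein, would fall in the middle tier.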
As used herein, the term “lentivirus vector” refers to retroviral vectors derived from the Lentiviridae family (e.g., human immunodeficiency virus, simian immunodeficiency virus, equine infectious anemia virus, and caprine arthritis-encephalitis virus) that are capable of integrating into non-dividing cells (See, e.g., U.S. Pat. Nos. 5,994,136 and 6,013,516, both of which are incorporated herein by reference).
As used herein, the term “adeno-associated virus (AAV) vector” refers to a vector derived from an adeno-associated virus serotype, including without limitation, AAV-1, AAV-2, AAV-3, AAV-4, AAV-5, AAVX7, etc. AAV vectors can have one or more of the AAV wild-type genes deleted in whole or part, preferably the rep and/or cap genes, but retain functional flanking ITR sequences.
As used herein, the term “in vitro” refers to an artificial environment and to processes or reactions that occur within an artificial environment. In vitro environments can consist of, but are not limited to, test tubes and cell cultures. The term “in vivo” refers to the natural environment (e.g., an animal or a cell) and to processes or reactions that occur within a natural environment.
As used herein, the term “clonally derived” refers to a cell line that is derived from a single cell.
As used herein, the term “non-clonally derived” refers to a cell line that is derived from more than one cell.
As used herein, the term “passage” refers to the process of diluting a culture of cells that has grown to a particular density or confluency (e.g., 70% or 80% confluent), and then allowing the diluted cells to regrow to the particular density or confluency desired (e.g., by replating the cells or establishing a new roller bottle culture with the cells).
As used herein, the term “stable,” when used in reference to genome, refers to the stable maintenance of the information content of the genome from one generation to the next, or, in the particular case of a cell line, from one passage to the next. Accordingly, a genome is considered to be stable if no gross changes occur in the genome (e.g., a gene is deleted or a chromosomal translocation occurs). The term “stable” does not exclude subtle changes that may occur to the genome such as point mutations.
As used herein, the term “cell culture” refers to any in vitro culture of cells. Included within this term are continuous cell lines (e.g., with an immortal phenotype), primary cell cultures, finite cell lines (e.g., non-transformed cells), and any other cell population maintained in vitro, including oocytes and embryos.
As used herein, the term “host cell” refers to any eukaryotic cell (e.g., mammalian cells, avian cells, amphibian cells, plant cells, fish cells, and insect cells), whether located in vitro or in vivo.
Unless specifically stated or obvious from context, as used herein, the term “about” is understood as within a range of normal tolerance in the art, for example within 2 standard deviations of the mean. “About” can be understood as within 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value.
In certain embodiments, the term “approximately” or “about” refers to a range of values that fall within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less in either direction (greater than or less than) of the stated reference value unless otherwise stated or otherwise evident from the context (except where such number would exceed 100% of a possible value).
Unless otherwise clear from context, all numerical values provided herein are modified by the term “about.”
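By way of non-limiting illustration, the percentage-based reading of “about” can be expressed as a simple tolerance check. In the following Python sketch the function name `is_about` and the default tolerance of 10% are illustrative assumptions; any of the percentages recited above could be supplied instead.

```python
def is_about(value, stated, tolerance_pct=10.0):
    """Return True if value lies within tolerance_pct percent of the stated value.

    The tolerance is applied in either direction (greater than or less than).
    """
    if tolerance_pct < 0:
        raise ValueError("tolerance_pct must be non-negative")
    allowed = abs(stated) * tolerance_pct / 100.0
    return abs(value - stated) <= allowed
```

For instance, with a stated value of 100 and a 5% tolerance, 104 would be treated as “about” 100 while 106 would not.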
By “control” or “reference” is meant a standard of comparison. Methods to select and test control samples are within the ability of those in the art. Determination of statistical significance is within the ability of those skilled in the art, e.g., the number of standard deviations from the mean that constitute a positive result.
As used herein, the term “each,” when used in reference to a collection of items, is intended to identify an individual item in the collection but does not necessarily refer to every item in the collection. Exceptions can occur if explicit disclosure or context clearly dictates otherwise.
As used herein, the term “subject” includes humans and mammals (e.g., mice, rats, pigs, cats, dogs, and horses). In many embodiments, subjects are mammals, particularly primates, especially humans. In some embodiments, subjects are livestock such as cattle, sheep, goats, cows, swine, and the like; poultry such as chickens, ducks, geese, turkeys, and the like; and domesticated animals particularly pets such as dogs and cats. In some embodiments (e.g., particularly in research contexts) subject mammals will be, for example, rodents (e.g., mice, rats, hamsters), rabbits, primates, or swine such as inbred pigs and the like.
Unless specifically stated or obvious from context, as used herein, the term “or” is understood to be inclusive. Unless specifically stated or obvious from context, as used herein, the terms “a”, “an”, and “the” are understood to be singular or plural.
Ranges can be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another aspect includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent “about,” it is understood that the particular value forms another aspect. It is further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint. It is also understood that there are a number of values disclosed herein, and that each value is also herein disclosed as “about” that particular value in addition to the value itself. It is also understood that throughout the application, data are provided in a number of different formats and that these data represent endpoints and starting points and ranges for any combination of the data points. For example, if a particular data point “10” and a particular data point “15” are disclosed, it is understood that greater than, greater than or equal to, less than, less than or equal to, and equal to 10 and 15 are considered disclosed as well as between 10 and 15. It is also understood that each unit between two particular units is also disclosed. For example, if 10 and 15 are disclosed, then 11, 12, 13, and 14 are also disclosed.
Ranges provided herein are understood to be shorthand for all of the values within the range. For example, a range of 1 to 50 is understood to include any number, combination of numbers, or sub-range from the group consisting 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 as well as all intervening decimal values between the aforementioned integers such as, for example, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, and 1.9. With respect to sub-ranges, “nested sub-ranges” that extend from either end point of the range are specifically contemplated. For example, a nested sub-range of an exemplary range of 1 to 50 may comprise 1 to 10, 1 to 20, 1 to 30, and 1 to 40 in one direction, or 50 to 40, 50 to 30, 50 to 20, and 50 to 10 in the other direction.
The transitional term “comprising,” which is synonymous with “including,” “containing,” or “characterized by,” is inclusive or open-ended and does not exclude additional, unrecited elements or method steps. By contrast, the transitional phrase “consisting of” excludes any element, step, or ingredient not specified in the claim. The transitional phrase “consisting essentially of” limits the scope of a claim to the specified materials or steps “and those that do not materially affect the basic and novel characteristic(s)” of the claimed invention.
The embodiments set forth below and recited in the claims can be understood in view of the above definitions.
Other features and advantages of the disclosure will be apparent from the following description of the preferred embodiments thereof, and from the claims. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, suitable methods and materials are described below. All published foreign patents and patent applications cited herein are incorporated herein by reference. All other published references, documents, manuscripts and scientific literature cited herein are incorporated herein by reference. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
The following detailed description, given by way of example, but not intended to limit the disclosure solely to the specific embodiments described, may best be understood in conjunction with the accompanying drawings, in which:
The current disclosure relates, at least in part, to the identification of a system capable of performing targeted mutagenesis in higher eukaryotic cells, particularly in mammalian cells in culture, across large regions (e.g., 2 kb or more) of targeted nucleic acid sequence, at significantly elevated on-target rates of mutation, as compared to either off-target mutation rates or to background rates of polymerase-mediated mutation. In some aspects, a region of nucleic acid sequence that is to be targeted for mutagenesis is placed under control of (operably linked to) a bacteriophage promoter (e.g., a T7 promoter), and this promoter-target nucleic acid construct is introduced to a mammalian cell (optionally via transfection). Meanwhile, a nucleic acid construct that encodes for an RNA polymerase (that recognizes the bacteriophage promoter associated with the target nucleic acid sequence) and an operably linked nucleic acid-editing deaminase is constructed and also introduced to the mammalian cell harboring the phage promoter-target nucleic acid construct. The targeted mammalian cell is then cultured for an amount of time sufficient to allow the RNA polymerase to process across the targeted nucleic acid region of interest, and to thereby introduce deaminase-mediated mutations into the targeted nucleic acid sequence during such phage RNA polymerase processing across the targeted nucleic acid.
In certain aspects, the compositions and methods of the instant disclosure therefore provide for enhanced, targeted mutagenesis of mammalian cells, to an extent that is capable of enabling directed evolution of targeted sequences in living cells. As such, application of the instant compositions and methods to drug and/or peptide evolution and screening in mammalian cell lines is expressly contemplated, as are other applications as set forth herein and as are known in the art.
Bacteriophage RNAPs have been previously identified as capable of reading through DNA sequences under the control of a specific promoter without auxiliary transcription factors (8). In particular, the T7 RNAP/T7 promoter system has been previously described as capable of serving as an orthogonal gene expression system in mammalian cells (9, 10). Somatic hypermutation machinery, especially the family of cytidine deaminases, has also been leveraged to induce DNA base switching by catalyzing the deamination of cytosine (C) and subsequent conversion to uracil (U), which is read as thymine (T) by polymerases (11). The instant disclosure has examined whether combining the DNA processivity of bacteriophage DNA-dependent RNA polymerases (RNAPs) with the somatic hypermutation capability of cytidine deaminases could enable continuous, targeted mutagenesis in eukaryotic cells. As demonstrated herein, such a system for pseudo-random integrated mutation of eukaryotic cells (PRIME) is indeed effective and robust.
Various expressly contemplated components of certain compositions and methods of the instant disclosure are considered in additional detail below.
Certain aspects of the instant disclosure relate to compositions and methods that include bacteriophage promoters, as well as corresponding bacteriophage polymerases, to achieve targeted mutagenesis in mammalian cells across long stretches of sequence. Exemplary bacteriophage promoters of the instant disclosure include, but are not limited to, the following.
T7 Bacteriophage Promoter
The T7 bacteriophage promoter has the sequence 5′-TAATACGACTCACTATAG-3′ (SEQ ID NO: 1). The T7 RNA polymerase initiates transcription at the 3′-terminal guanine (G) of the T7 promoter sequence. The T7 polymerase then transcribes using the opposite strand as a template, processing from 5′->3′. The first base in a T7 polymerase transcript is therefore a guanine (G). The T7 promoter family includes both constitutive promoters and negatively regulated promoters, which can be turned off by a repressor protein. The most common bacterial strain to use with a T7 promoter system is BL21 (DE3) which is an E. coli B strain that contains a λ lysogen with an inducible T7 RNAP gene on the chromosome. However, it is possible to engineer many other E. coli strains to conditionally express T7 RNAP.
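By way of illustration only, the relationship between the T7 promoter sequence (SEQ ID NO: 1) and the first transcribed base can be sketched in a few lines of Python. The function names and the example construct below are hypothetical and are not part of the disclosure. The sketch assumes an exact promoter match on the strand supplied and simply reads the transcript off the non-template strand, replacing T with U.

```python
T7_PROMOTER = "TAATACGACTCACTATAG"  # SEQ ID NO: 1, written 5'->3'


def find_t7_start_sites(dna):
    """Return 0-based indices of the 3'-terminal G of every exact T7 promoter match.

    Only the strand given is scanned; a fuller tool would also scan the
    reverse complement (an omission made here for brevity).
    """
    dna = dna.upper()
    sites = []
    pos = dna.find(T7_PROMOTER)
    while pos != -1:
        sites.append(pos + len(T7_PROMOTER) - 1)
        pos = dna.find(T7_PROMOTER, pos + 1)
    return sites


def predicted_transcript(dna, start):
    """Transcript read 5'->3' from the start site, shown as RNA (T replaced by U)."""
    return dna[start:].upper().replace("T", "U")


# Hypothetical construct: 4 nt of filler, the promoter, then a short target region.
construct = "ccgg" + T7_PROMOTER + "GAGACCATGGTG"
starts = find_t7_start_sites(construct)
rna = predicted_transcript(construct, starts[0])
print(starts, rna)
```

Consistent with the description above, the first base of the predicted transcript is the guanine at the 3′ end of the promoter.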
T7-Like Bacteriophage Promoters
T7-like bacteriophage promoters most notably include the T3 promoter and the N4 promoter. The T3 promoter has the sequence 5′-AATTAACCCTCACTAAAG-3′ (SEQ ID NO: 2). The bacteriophage T3 and T7 RNA polymerases are closely related, yet are highly specific for their own promoter sequences. T7 promoter variants that contain substitutions of T3-specific base-pairs at one or more positions within the T7 promoter consensus sequence have been previously synthesized and cloned. Template competition assays between variant and consensus promoters have demonstrated that the primary determinants of promoter specificity are located in the region from −10 to −12, and that the base-pair at −11 is of particular importance. Changing this base-pair from G:C, which is normally present in T7 promoters, to C:G, which is found at this position in T3 promoters, was identified to prevent utilization by the T7 RNA polymerase and simultaneously enabled transcription from the variant T7 promoter by the T3 enzyme. Substitution of T7 base-pairs with T3 base-pairs at other positions where the two consensus sequences diverge was also observed to affect the overall efficiency with which the variant promoter was utilized by the T7 RNA polymerase, but these changes were not sufficient to permit recognition by the T3 RNA polymerase. Switching the −11 base-pair in the T3 promoter consensus to the T7 base-pair prevented utilization by the T3 RNA polymerase, but did not allow the T3 variant promoter to be utilized by the T7 RNA polymerase. This probably reflects a greater specificity of the T7 RNA polymerase for base-pairs at other positions where the promoter sequences differ, most notably at −15.
Without wishing to be bound by theory, the magnitude of the effects of base substitutions in the T7 promoter on promoter strength (−11C much greater than −10C greater than −12A) were found to correlate with the affinity of the T7 polymerase for the promoter variants, which suggested that the discrimination of the phage RNA polymerases for their promoters was mediated primarily at the level of DNA binding, rather than at the level of initiation (Klement et al. J Mol Biol. 215: 21-9).
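The positions at which the T7 and T3 promoter sequences recited above diverge can be enumerated with a short Python sketch. The coordinate convention used here (the 3′-terminal G is +1, the base before it is −1, and there is no position 0) is an assumption chosen to agree with the transcription start described above; the helper names are hypothetical.

```python
T7 = "TAATACGACTCACTATAG"  # SEQ ID NO: 1
T3 = "AATTAACCCTCACTAAAG"  # SEQ ID NO: 2


def promoter_position(index, length):
    """Map a 0-based string index to a promoter coordinate.

    Assumed convention: the last base (3' G) is +1 and upstream bases
    are numbered -1, -2, ... with no position 0.
    """
    offset = index - (length - 1)
    return 1 if offset == 0 else offset


def divergent_positions(a, b):
    """Return {coordinate: (base_in_a, base_in_b)} wherever the promoters differ."""
    if len(a) != len(b):
        raise ValueError("promoter sequences must have equal length")
    return {
        promoter_position(i, len(a)): (x, y)
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    }


diffs = divergent_positions(T7, T3)
for pos in sorted(diffs):
    t7_base, t3_base = diffs[pos]
    print(f"{pos:+d}: T7={t7_base} T3={t3_base}")
```

Under this numbering, the two sequences differ at −17, −15, −12, −11, −10, and −2, with G (T7) versus C (T3) at −11, in agreement with the substitutions discussed above.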
N4 Bacteriophage Promoters
N4 bacteriophage promoters comprise conserved sequences and a 3-base loop-5-base pair (bp) stem DNA hairpin structure on single-stranded templates. As an example, N4 bacteriophage RNAP has been identified to bind a 20-nucleotide (nt) N4 P2 promoter deoxyoligonucleotide with high affinity (Kd=2 nM) to form a salt-resistant complex. It has also been shown that N4 bacteriophage RNAP interacts specifically with the central base of the hairpin loop (−11G) and a base at the stem (−8G) and that the guanine 6-keto and 7-imino groups at both positions are essential for binding and complex salt resistance. The major determinant (−11G), which has been described as presented to N4 bacteriophage RNAP in the context of a hairpin loop, appears to interact with N4 bacteriophage RNAP Trp-129. This interaction has been described as reliant upon template single-strandedness at positions −2 and −1. Contacts with the promoter have been described as disrupted when the RNA product becomes 11-12 nt long (see Wigneshweraraj et al. Biomolecules. 5: 647-667, the entire contents of which are incorporated by reference herein).
In certain aspects, compositions and methods that rely upon bacteriophage RNA polymerases to achieve targeted mutagenesis in mammalian cells across long stretches of sequence are provided. Bacteriophage-encoded RNA polymerase (RNAP) was first discovered in T7 phage-infected Escherichia coli cells. It was known that phage infection of host bacterial cells led to redirection of host gene expression towards generation of progeny phage particles; however, a previously uncharacterized “switching event” that provoked expression of late bacteriophage genes was first attributed to a phage-encoded RNAP. This phage RNAP was identified as recognizing promoters in the phage genome and expressing phage genes using a single-polypeptide polymerase of ~100 kDa molecular weight, which is ~4 times smaller than bacterial RNAPs. This was a substantial simplification from the previously known RNAPs from bacteria (5 subunits) and eukaryotes (more than 12 subunits). In spite of its relative simplicity, the single-unit T7 RNAP has been described as able to recognize promoter DNA and unwind double-stranded (ds) DNA to form an open complex. After abortive initiation, it proceeds to processive RNA elongation. The simplicity of T7 phage RNAP renders it an attractive model system for the study of transcription mechanisms and a tool for protein expression in bacterial cells (Basu et al. Nucleic. 30; 237-250). In certain aspects of the instant disclosure, use of the T7 RNAP in concert with nucleic acid-editing deaminases is expressly contemplated for effecting mutagenesis across long stretches of target sequence in eukaryotic cells, particularly mammalian cells. It is also contemplated herein that other polymerases can be used in concert with nucleic acid-editing deaminases, to similar effect. Such other polymerases include, for example and without limitation, T7-like RNA polymerases, such as T3 RNAP, SP6 RNAP and/or N4 RNAP, as described in additional detail below.
T7 RNA Polymerase (T7 RNAP)
T7 RNA Polymerase is an RNA polymerase originally identified in T7 bacteriophage. The T7 RNAP catalyzes formation of RNA from DNA in the 5′→3′ direction. T7 polymerase has been described as extremely promoter-specific and transcribes only DNA downstream of a T7 promoter (5′-TAATACGACTCACTATAG-3′ (SEQ ID NO: 1), with transcription beginning at the 3′ G of the T7 promoter). T7 polymerase has also been described to require a double-stranded DNA template and Mg2+ ion as a cofactor for the synthesis of RNA. It has been described as possessing a very low error rate, and has a molecular weight of 99 kDa (Sousa et al. Progress in Nucleic Acid Research and Molecular Biology. 73: 1-41).
T7-Like RNA Polymerases
T7 RNA Polymerase is a member of a family of single-subunit RNAPs that comprises but is not limited to phage RNAPs including T3 RNA Polymerase, SP6 RNA Polymerase, K11 RNA Polymerase, and N4 RNA Polymerase. These non-T7 RNA polymerases are categorized as T7-like RNA Polymerases.
T3 RNA Polymerase is a member of the DNA-dependent RNA polymerase family and was originally isolated from Bacteriophage T3. It is highly specific to the T3 promoter and transcribes from DNA templates having the T3 promoter. Commercially produced T3 RNA Pol enzyme is expressed from E. coli and is active at 37° C. It has been used in the art for RNA synthesis applications such as for generating in vitro translation templates, hybridization probes, RNA assay substrates, and others.
SP6 RNA Polymerase is a DNA-dependent RNA polymerase isolated from phage-infected Salmonella typhimurium. The enzyme has an extremely high specificity for SP6 promoter sequences (1, 2) and has been described as synthesizing large quantities of RNA from a DNA fragment inserted downstream from a promoter. Strong promoter sequences have been used to construct various cloning vectors, and inserts into the multiple cloning site of these vectors can be transcribed to generate discrete RNAs.
K11 RNA polymerase is an RNA polymerase isolated from gene 1 of the Klebsiella phage K11. It is part of the T7 RNAP family.
N4 RNA Polymerase: Transcription of bacteriophage N4 middle genes is carried out by a phage-coded, heterodimeric RNA polymerase (N4 RNAPII), which belongs to the family of T7-like RNA polymerases. In contrast to phage T7-RNAP, N4 RNAPII displays no activity on double-stranded templates and low activity on single-stranded templates. In vivo, at least one additional N4-coded protein (p17) is required for N4 middle transcription.
Certain aspects of the instant disclosure relate to compositions and methods that relate to combining the somatic hypermutation capability of a deaminase with the DNA processivity of an orthologous bacteriophage RNA polymerase. Deamination or the removal of an amine group in nucleic acid is carried out by enzymes called deaminases that include, but are not limited to, adenine deaminase, cytidine deaminase (including activation-induced cytidine deaminase), and guanine deaminase.
Adenine deaminases include E. coli TadA, human ADAR2, mouse ADA, and human ADAT2 (see Gaudelli et al. Nature. 551: 464-471). Exemplary sequences of adenine deaminases include the following.
coli str. K-12 substr. MG1655]
Escherichia coli str. K-12 substr. MG1655,
Homo sapiens adenosine deaminase RNA specific
Homo sapiens adenosine deaminase RNA specific
Mus musculus adenosine deaminase (Ada),
Homo sapiens adenosine deaminase tRNA
Homo sapiens adenosine deaminase tRNA
Mus musculus adenosine deaminase
Cytidine deaminase is an enzyme that in humans is encoded by the CDA gene, which has the following mRNA sequence:
Homo sapiens cytidine deaminase (CDA), mRNA
The human CDA-encoded protein is:
Homo sapiens cytidine deaminase (CDA), protein
The cytidine deaminase gene encodes for an enzyme involved in pyrimidine salvaging. The encoded protein forms a homotetramer that catalyzes the irreversible hydrolytic deamination of cytidine and deoxycytidine to uridine and deoxyuridine, respectively. It is one of several deaminases responsible for maintaining the cellular pyrimidine pool. Mutations in this gene have been described as associated with decreased sensitivity to the cytosine nucleoside analogue cytosine arabinoside, used in the treatment of certain childhood leukemias. Apobec-1 is an RNA-specific cytidine deaminase that possesses homology to other members of the cytidine/deoxycytidine deaminase family, particularly within the domain HVE-PCXXC proposed to coordinate zinc binding and catalysis. APOBEC1 (rat) is an apolipoprotein B mRNA editing enzyme. The APOBEC1 protein is responsible for the posttranscriptional editing of a CAA codon for Gln to a UAA stop codon in the APOB mRNA. APOBEC1 has also been described as involved in CGA (Arg) to UGA (Stop) editing in the NF1 mRNA. APOBEC1 has been described to be expressed exclusively in the small intestine. The rat apobec-1 gene spans 16 kb and includes one untranslated (exon A) and five translated exons (exons 1-5).
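The CAA-to-UAA and CGA-to-UGA edits described above are both instances of a single C-to-U change that converts a sense codon into a stop codon. A minimal Python sketch of that logic follows. The toy mRNA is hypothetical (it is not APOB or NF1 sequence), the function name is illustrative, and the sketch ignores the sequence context that governs which cytidines APOBEC1 actually edits.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code


def c_to_u_stop_sites(mrna, frame=0):
    """List (codon_index, original_codon, edited_codon) for every codon in the
    given reading frame where one C->U edit would create a stop codon."""
    mrna = mrna.upper().replace("T", "U")
    hits = []
    for i in range(frame, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        for j, base in enumerate(codon):
            if base == "C":
                edited = codon[:j] + "U" + codon[j + 1:]
                if edited in STOP_CODONS:
                    hits.append(((i - frame) // 3, codon, edited))
    return hits


# Hypothetical toy message: AUG CAA GGA CGA UUU
toy_mrna = "AUGCAAGGACGAUUU"
sites = c_to_u_stop_sites(toy_mrna)
print(sites)
```

In this toy message, the second codon (CAA) and the fourth codon (CGA) are the only ones that a single C-to-U edit would convert to a stop codon.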
The wild-type mRNA sequence of rat APOBEC1 is the following:
Rattus norvegicus apolipoprotein B mRNA editing
The corresponding wild-type rat APOBEC1 protein sequence is the following:
Rattus norvegicus apolipoprotein B mRNA editing
Activation-induced cytidine deaminase, also known as AICDA and AID, is a 24 kDa enzyme which in humans is encoded by the AICDA gene. It creates mutations in DNA by deamination of a cytosine base, which turns it into uracil (which is recognized as a thymine). In other words, it changes a C:G base pair into a U:G mismatch. The cell's DNA replication machinery recognizes the U as a T, and hence C:G is converted to a T:A base pair. During germinal center development of B lymphocytes, AID also generates other types of mutations, such as C:G to A:T.
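The C:G to T:A outcome described above can be traced on both strands with a small Python sketch. The six-base duplex and the function names are hypothetical, and the sketch models only the end result after replication (C read as T), not the U:G intermediate or any repair pathway.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def deaminate(top_strand, positions):
    """Return the top strand after C->U deamination at the given 0-based
    positions, with each U already resolved to T by replication."""
    bases = list(top_strand.upper())
    for p in positions:
        if bases[p] != "C":
            raise ValueError(f"position {p} is {bases[p]}, not C")
        bases[p] = "T"
    return "".join(bases)


top = "GACCTG"                 # hypothetical target, top strand
edited_top = deaminate(top, [2])
bottom_before = reverse_complement(top)
bottom_after = reverse_complement(edited_top)
print(top, "->", edited_top)
print(bottom_before, "->", bottom_after)
```

A C-to-T change on the deaminated strand appears as a G-to-A change on the complementary strand, which is why both signatures are typically counted when scoring cytidine deaminase activity.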
Homo sapiens activation induced cytidine
Homo sapiens activation induced cytidine
Within the above plasmid, AID*Δ includes the following peptide sequence (SEQ ID NO: 18):
The above plasmid also includes the AID*Δ DNA sequence (SEQ ID NO: 30):
Guanine deaminase—also known as cypin, guanase, guanine aminase, GAH, and guanine aminohydrolase—is an aminohydrolase enzyme which converts guanine to xanthine. Cypin is a major cytosolic protein that interacts with PSD-95.
Homo sapiens guanine deaminase (GDA), transcript
Homo sapiens guanine deaminase (GDA), transcript
Other sequences relevant to the instant disclosure include the following:
Rattus norvegicus APOBEC1 DNA sequence (SEQ ID NO: 37):
Rattus norvegicus APOBEC1-T7 Polymerase-NLS plasmid DNA
Rattus norvegicus APOBEC1-T7 RNA Polymerase-NLS polypeptide
Rattus norvegicus APOBEC1-T7 RNA Polymerase-UGI-NLS plasmid
Rattus norvegicus APOBEC1-T7 RNA Polymerase-UGI-NLS polypeptide
In certain aspects, the compositions of the instant disclosure include a uracil glycosylase inhibitor. Uracil glycosylase inhibitor has been shown to facilitate C:G→T:A mutations. Uracil glycosylase inhibitor or uracil-DNA glycosylase inhibitor (UGI) is a small protein from Bacillus subtilis bacteriophage PBS1 which inhibits E. coli and other species' uracil DNA glycosylase (UDG). UGI can disassociate UDG:DNA complexes. This protein binds specifically and reversibly to the host uracil-DNA glycosylase, preventing removal of uracil residues from PBS2 DNA by the host uracil-excision repair system. An exemplary UGI sequence is:
Bacillus subtilis Uracil glycosylase inhibitor
In some aspects, the compositions of the present disclosure include a pEditor containing the T7 RNAP-cytidine deaminase fusion gene with a nuclear localization signal. A nuclear localization signal or sequence (NLS) is an amino acid sequence that ‘tags’ a protein for import into the cell nucleus by nuclear transport. Typically, this signal consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface. Different nuclear localized proteins may share the same NLS. An NLS has the opposite function of a nuclear export signal (NES), which targets proteins out of the nucleus (Kalderon et al. Cell. 39: 499-509).
Classical NLSs can be classified as either monopartite or bipartite. The major structural difference between the two is that the two basic amino acid clusters in bipartite NLSs are separated by a relatively short spacer sequence (hence bipartite: two parts), while monopartite NLSs are not. The first NLS to be discovered was the sequence PKKKRKV (SEQ ID NO: 22) in the SV40 Large T-antigen (a monopartite NLS; Kalderon et al. Cell. 39: 499-509). The NLS of nucleoplasmin, KR[PAATKKAGQA]KKKK (SEQ ID NO: 23), is the prototype of the ubiquitous bipartite signal: two clusters of basic amino acids, separated by a spacer of about 10 amino acids (Dingwall et al. J. Cell Biol. 107: 841-9). Both signals are recognized by importin α. Importin α contains a bipartite NLS itself, which is specifically recognized by importin β. The latter can be considered the actual import mediator.
Chelsky et al. proposed the consensus sequence K-K/R-X-K/R (SEQ ID NO: 24) for monopartite NLSs (Dingwall et al.). A Chelsky sequence may, therefore, be part of the downstream basic cluster of a bipartite NLS. Makkerh et al. carried out comparative mutagenesis on the nuclear localization signals of SV40 T-Antigen (monopartite), C-myc (monopartite), and nucleoplasmin (bipartite), and showed amino acid features common to all three. The role of neutral and acidic amino acids was shown for the first time in contributing to the efficiency of the NLS (Makkerh et al. Curr. Biol. 6: 1025-7).
Rotello et al. compared the nuclear localization efficiencies of eGFP fused NLSs of SV40 Large T-Antigen, nucleoplasmin (AVKRPAATKKAGQAKKKKLD; SEQ ID NO: 25), EGL-13 (MSRRRKANPTKLSENAKKLAKEVEN; SEQ ID NO: 26), c-Myc (PAAKRVKLD; SEQ ID NO: 27) and TUS-protein (KLKIKRPVK; SEQ ID NO: 28) through rapid intracellular protein delivery. They found significantly higher nuclear localization efficiency of c-Myc NLS compared to that of SV40 NLS (Ray et al. Bioconjug. Chem. 26: 1004-7).
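For illustration, the K-K/R-X-K/R consensus (SEQ ID NO: 24) can be written as a regular expression and scanned against the NLS sequences recited above. This is a minimal sketch: treating X as any single residue is an assumption, the function name is hypothetical, and the absence of a match says nothing about whether a sequence functions as an NLS.

```python
import re

# K, then K or R, then any residue (assumed meaning of X), then K or R.
CHELSKY = re.compile(r"K[KR].[KR]")

NLS_EXAMPLES = {
    "SV40 large T-antigen": "PKKKRKV",            # SEQ ID NO: 22
    "nucleoplasmin": "AVKRPAATKKAGQAKKKKLD",      # SEQ ID NO: 25
    "EGL-13": "MSRRRKANPTKLSENAKKLAKEVEN",        # SEQ ID NO: 26
    "c-Myc": "PAAKRVKLD",                         # SEQ ID NO: 27
    "TUS-protein": "KLKIKRPVK",                   # SEQ ID NO: 28
}


def chelsky_matches(peptide):
    """Return (0-based start, matched residues) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in CHELSKY.finditer(peptide.upper())]


for name, peptide in NLS_EXAMPLES.items():
    print(name, chelsky_matches(peptide))
```

In the nucleoplasmin sequence the match falls in the downstream KKKK cluster, which is consistent with the observation above that a Chelsky sequence may form part of the downstream basic cluster of a bipartite NLS.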
An expression vector, otherwise known as an expression construct, is commonly a plasmid or virus designed for gene expression in cells. The vector is used to introduce a specific gene into a target cell, and can commandeer the cell's mechanism for protein synthesis to produce the protein encoded by the gene. Expression vectors are the basic tools in biotechnology for the production of proteins. The vector is engineered to contain regulatory sequences that act as enhancer and promoter regions and lead to efficient transcription of the gene carried on the expression vector. The promoters for cytomegalovirus (CMV) and SV40 are commonly used in mammalian expression vectors to drive gene expression. Non-viral promoters, such as the elongation factor (EF)-1 promoter, are also known.
CMV Promoter is commonly included in vectors used in genetic engineering work conducted in mammalian cells, as it is a strong promoter that drives constitutive expression of genes under its control. This promoter has been used to express a plethora of eukaryotic gene products and is used for specialty protein production, gene therapy, and DNA-based vaccination, among other applications.
The CMV promoter has the following sequence (SEQ ID NO: 29):
SV40 Promoter (Simian Virus 40 promoter) contains the SV40 enhancer promoter region and origin of replication (part no. GA-ori-00009.1) for high-level expression and replication in cell lines expressing the large T antigen (e.g., COS-7 and 293T cells). It does not replicate episomally in the absence of the SV40 large T antigen. The SV40 promoter is weak in B cells, but exhibits high activity in T24 and HCV29 human bladder urothelium carcinoma cell lines.
Human elongation factor-1 alpha (EF-1 alpha) or EF-1 is a constitutive non-viral promoter of human origin that can be used to drive ectopic gene expression in various in vitro and in vivo contexts. EF-1 alpha is often useful in conditions where other promoters (such as CMV) have diminished activity or have been silenced (as in embryonic stem cells).
Directed evolution (DE) is a method used in protein engineering that mimics the process of natural selection to steer proteins or nucleic acids toward a user-defined goal. In general, DE involves subjecting a gene to iterative rounds of mutagenesis, selection (expressing those variants and isolating members with the desired function), and amplification (generating a template for the next round). Advantageously, it can be performed both in vivo and in vitro. Directed evolution is used both for protein engineering, as an alternative to rationally designing modified proteins, and for studies of fundamental evolutionary principles in a controlled laboratory environment.
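The mutagenesis, selection, and amplification cycle described above can be sketched as a simple loop. The following Python toy model is illustrative only: the fitness function (number of matches to an arbitrary target string), the uniform random substitution model, and all names and parameter values are assumptions made for the sketch, and none of them models deaminase chemistry or any experiment of the disclosure.

```python
import random

ALPHABET = "ACGT"


def mutate(seq, rate, rng):
    """Mutagenesis: redraw each base at random with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else b for b in seq)


def fitness(seq, target):
    """Toy selection criterion: number of positions matching the target."""
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(parent, target, rounds=20, library_size=50, rate=0.05, seed=0):
    """Run iterative rounds of mutagenesis, selection, and amplification."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    history = [fitness(parent, target)]
    for _ in range(rounds):
        library = [mutate(parent, rate, rng) for _ in range(library_size)]
        library.append(parent)  # retain the parent so fitness cannot decrease
        parent = max(library, key=lambda s: fitness(s, target))  # selection
        history.append(fitness(parent, target))  # winner templates the next round
    return parent, history


start = "A" * 30
goal = "ACGT" * 7 + "AC"
best, history = directed_evolution(start, goal)
print(history[0], "->", history[-1])
```

Because the parent is carried into each library, the recorded fitness is non-decreasing from round to round; dropping that line gives a loop in which a round can lose ground.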
Mammalian cells have been employed in DE to engineer recombinant proteins, particularly those that require posttranslational modifications, such as antibodies, hormones and cytokines. Bacteria and yeast are less suitable to evolve these types of proteins because they have insufficient disulfide-bridge formation mechanisms, lack glycosylation, and frequently form protein aggregates. The ability to evolve mammalian proteins within mammalian cells is a relatively recent development, with the methods of the instant disclosure constituting an advance in mammalian mutagenesis approaches available for performing DE. Enhanced performance of DE in mammalian cells is expected to decrease the development time required for generating robust, high-producing mammalian cell lines for commercial applications involving engineering of novel enzymes, proteins (e.g., pharmaceutical applications), and immune support therapies (e.g., bacteriophage with antibody genes). As compared to bacteria and yeast, mammalian cells exhibit low productivity due to their slow growth rates and tendency to undergo programmed cell death (apoptosis). DE in mammalian cells has previously relied upon non-physiological environments, with such DE methods rapidly saturating mutagenized sites, or such DE approaches have only been adapted optimally in bacterial and yeast systems. Use of DE in mammalian cells prior to the instant disclosure has also been hampered because mammalian cells are time-consuming to work with, exhibit a low efficiency of stable gene integration, have a tendency toward multiple gene insertions, and display highly variable expression levels. Certain aspects of the instant disclosure relate to compositions and methods that involve pseudo-random integrated mutation of eukaryotic cells (PRIME), which enables DE in mammalian cells while overcoming some of the above-stated challenges to DE previously described in the art (Pourmir et al. Comput Struct Biotechnol J. 2: e201209012).
The methods and compositions of the instant disclosure can be applied to achieve targeted mutagenesis of mammalian cells across long stretches of sequence, optionally in and around effectively any region of the genome, including targeted genes and/or other genetic elements. In certain embodiments, the methods and compositions of the instant disclosure can be applied to oncogenes and/or cancer-related genes. Exemplary oncogenes and/or cancer-related genes include, but are not limited to, those recited in Table 1.
In certain aspects, the instant disclosure describes methods and compositions designed to achieve targeted mutagenesis of mammalian cells across long stretches of sequence. Mammalian cell culture is used widely in academic, medical and industrial settings. It has provided a means to study the physiology and biochemistry of the cell, and developments in the fields of cell and molecular biology have required the use of reproducible model systems, which cultured cell lines are especially capable of providing. For medical use, cell culture provides test systems to assess the efficacy and toxicology of potential new drugs. Large-scale mammalian cell culture has allowed production of biologically active proteins, initially production of vaccines and then recombinant proteins and monoclonal antibodies; meanwhile, recent innovative uses of cell culture include tissue engineering, as a means of generating tissue substitutes.
Mammalian cells can be isolated from tissues for ex vivo culture in several ways. Cells can be easily purified from blood. However, only the white cells are capable of growth in culture. Cells can be isolated from solid tissues by digesting the extracellular matrix using enzymes such as collagenase, trypsin, or pronase, before agitating the tissue to release the cells into suspension. Alternatively, pieces of tissue can be placed in growth media, and the cells that grow out are available for culture. This method is known as explant culture. Cells that are cultured directly from a subject are known as primary cells. With the exception of some derived from tumors, most primary cell cultures have limited lifespan (Voight et al. Journal of Molecular and Cellular Cardiology. 86: 187-98). An established or immortalized cell line has acquired the ability to proliferate indefinitely either through random mutation or deliberate modification, such as artificial expression of the telomerase gene. Numerous cell lines are well established as representative of particular cell types. Examples of commonly used mammalian cell lines include HEK293T cells, VERO, BHK, HeLa, CV1 (including Cos), MDCK, 293, 3T3, myeloma cell lines (e.g., NSO, NS 1), PC12, WI-38 cells, and Chinese hamster ovary (CHO) cells, among many other examples (Langdon et al. Molecular Biomethods Handbook. 861-873).
Mammalian cell transfection is a technique commonly used to express exogenous DNA or RNA in a host cell line. There are many different methods available for transfecting mammalian cells, depending upon the cell line characteristics, desired effect, and downstream applications. These methods can be broadly divided into two categories: those used to generate transient transfection, and those used to generate stable transfectants. Transient transfection methods include, but are not limited to, liposome-mediated transfection, non-liposomal transfection agents (lipids and polymers), dendrimer-based transfection, and electroporation. Stable transfection methods include, but are not limited to, microinjection and virus-mediated gene delivery.
Certain aspects of the instant disclosure describe methods and compositions designed to achieve targeted mutagenesis in mammalian cells across long stretches of sequence, via use of virus-mediated gene delivery (bacteriophages). Viral vectors, such as bacteriophages, retrovirus, adenovirus (types 2 and 5), adeno-associated virus, herpes virus, pox virus, human foamy virus (HFV), and lentivirus have been used for gene transfection. All viral vector genomes have been modified by deleting some areas of their genomes so that their replication becomes altered, rendering such viruses safer than native forms. However, viral delivery systems have some problems, including: the marked immunogenicity of viruses, which can induce an inflammatory response, potentially leading to degeneration of transduced tissue; toxin production, including mortality; insertional mutagenesis; and limited transgene capacity. During the past few years, some viral vectors with specific receptors have been designed that are capable of transferring transgenes to other specific cells that are not their natural target cells (retargeting) (Nayerossadat et al. Adv Biomed Res. 1: 27).
The instant disclosure also provides kits containing compositions of the instant disclosure, e.g., for use in methods of the present disclosure. Kits of the instant disclosure may include one or more containers comprising a composition (e.g., a nucleic acid encoding for a nucleic acid-editing deaminase and a bacteriophage RNA polymerase (e.g., T7 RNAP), optionally also encoding for a UGI and/or a NLS) of this disclosure. In some embodiments, the kits further include instructions for use in accordance with the methods of this disclosure. In some embodiments, these instructions comprise a description of administration/transfection of the composition(s) to mammalian cells, optionally further including instructions for performance of directed evolution of a targeted gene in mammalian cell(s).
Instructions supplied in the kits of the instant disclosure are typically written instructions on a label or package insert (e.g., a paper sheet included in the kit), but machine-readable instructions (e.g., instructions carried on a magnetic or optical storage disk) are also acceptable. Instructions may be provided for practicing any of the methods described herein.
The kits of this disclosure are in suitable packaging. Suitable packaging includes, but is not limited to, vials, bottles, jars, flexible packaging (e.g., sealed Mylar or plastic bags), and the like. The container may further comprise a mammalian cell transfection agent.
Kits may optionally provide additional components such as buffers and interpretive information. Normally, the kit comprises a container and a label or package insert(s) on or associated with the container.
The practice of the present disclosure employs, unless otherwise indicated, conventional techniques of chemistry, molecular biology, microbiology, recombinant DNA, genetics, immunology, cell biology, cell culture and transgenic biology, which are within the skill of the art. See, e.g., Maniatis et al., 1982, Molecular Cloning (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); Sambrook et al., 1989, Molecular Cloning, 2nd Ed. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); Sambrook and Russell, 2001, Molecular Cloning, 3rd Ed. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); Ausubel et al., 1992, Current Protocols in Molecular Biology (John Wiley & Sons, including periodic updates); Glover, 1985, DNA Cloning (IRL Press, Oxford); Anand, 1992; Guthrie and Fink, 1991; Harlow and Lane, 1988, Antibodies, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); Jakoby and Pastan, 1979; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.), Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); Roitt, Essential Immunology, 6th Edition, Blackwell Scientific Publications, Oxford, 1988; Hogan et al., Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986); Westerfield, M., The zebrafish book. A guide for the laboratory use of zebrafish (Danio rerio), (4th Ed., Univ. of Oregon Press, Eugene, 2000).
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
Reference will now be made in detail to exemplary embodiments of the disclosure. While the disclosure will be described in conjunction with the exemplary embodiments, it will be understood that it is not intended to limit the disclosure to those embodiments. To the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the disclosure as defined by the appended claims. Standard techniques well known in the art or the techniques specifically described below were utilized.
Design and Construction of pTarget and pEditor Plasmids
The plasmids and primers used in this disclosure are listed in Table 2.
pcDNA3.1(+)-IRES-GFP was a gift from Kathleen L. Collins (Addgene plasmid #51406). pCMV-BE3 was a gift from David Liu (Addgene plasmid #73021). pGH335_MS2-AID*Δ-Hygro was a gift from Michael Bassik (Addgene plasmid #85406). Lenti_CMV_T_IR, Lenti_PAX2 and Lenti_VSVg were gifts from Jamie Marshall. T7 RNAP was ordered as a gBlock from Integrated DNA Technologies (IDT). The Cas9(D10A) in the pCMV-BE3 construct was replaced with T7 RNAP by Gibson assembly to generate pAPOBEC-T7 and pAPOBEC-T7-UGI, in which the original T7 promoter was also deleted to avoid self-editing. Rat APOBEC1 in pAPOBEC-T7 and pAPOBEC-T7-UGI was replaced with AID*Δ amplified from pGH335_MS2-AID*Δ-Hygro to generate pAID-T7 and pAID-T7-UGI. For pTarget, a T7 promoter-GFP fragment was amplified from pcDNA3.1(+)-IRES-GFP and was sub-cloned into a pUC19 backbone. This fragment was also sub-cloned into the Lenti_CMV-T-IR to generate the Lenti_CMV_T7_GFP-T-IR. A pTarget plasmid without the T7 promoter was also cloned as a negative control. A BFP fragment was generated from the GFP sequence via site-directed mutagenesis. pAID-T7G645A-UGI, pAID-T7P266L-UGI, pAID-T7P266LG645A-UGI and pAID-T7G645AQ744R-UGI were cloned via site-directed mutagenesis using wild type pAID-T7-UGI as a template. All plasmid sequences were verified using Sanger sequencing. All cloning primers were ordered from IDT. Plasmids were extracted using Qiaprep® Spin Miniprep Kit and Plasmid Plus Midi Kit (Qiagen®).
HEK293T cells were obtained from ATCC and were grown in high-glucose (4.5 g/L) DMEM supplemented with GlutaMAX™, 1 mM sodium pyruvate, 10% FBS, 100 units/mL of penicillin and 100 μg/mL of streptomycin in a humidified chamber with 5% CO2 at 37° C. Cells were maintained at ~80% confluence in 24-well plates on the day of transfection. 250 ng of pTarget and 250 ng of pEditor plasmids were mixed together with 1 μl of TransIT-X2 reagent (Mirus) and the mixture was incubated in 50 μl of Opti-MEM® (Thermo Fisher Scientific™) for 30 min. The mixture was then added drop-wise to each well. For time-point experiments using target-integrated single cell clones, cells were cultured in 12-well plates and were transfected with 1000 ng of pTarget plasmids. Cells were subsequently harvested at the time points indicated above.
3 million HEK293T cells were cultured in 10 mL of culture media in a 10-cm dish. Cells were transfected with 12 μg of Lenti_CMV_T7_GFP-T-IR, 9 μg of Lenti_PAX2 and 3 μg of Lenti_VSVg. 24 hr after transfection, culture media was replaced with 6 mL of high-glucose (4.5 g/L) DMEM supplemented with GlutaMAX™, 1 mM sodium pyruvate, 30% FBS, 100 units/mL of penicillin and 100 μg/mL of streptomycin. Supernatant containing viral particles was collected and filtered through 0.22 μm filters 24 hr later. To generate single cell clones, HEK293T cells in a 6-well plate with 2.5 mL of culture media received 500 μl of virus together with polybrene at a final concentration of 8 μg/mL. Two days after transduction, successfully integrated cells were selected by puromycin at a concentration of 1.5 μg/mL. Seven days after transduction, integrated cells were subjected to FACS sorting in single cell format into 96-well plates using a MoFlo® Astrios™ EQ Cell Sorter (Beckman Coulter™) and single cells were allowed to expand to form colonies.
HEK293T cells transfected with pTarget and pEditor plasmids were seeded in a 24-well glass-bottom plate. Cells were imaged using an inverted Nikon® CSU-W1 Yokogawa® spinning disk confocal microscope with 488 nm (GFP) and 405 nm (BFP) lasers, an air objective (Plan Apo λ, numerical aperture (NA)=0.75, 20×, Nikon), and an Andor® Zyla sCMOS® camera. NIS-Elements AR software (v4.30.01, Nikon®) was used for image capture. Images were processed using ImageJ (National Institutes of Health). CellProfiler (version 3.1.5, Broad Institute) (21) was used for segmentation and for counting BFP and GFP positive cells. GFP positive cells were further thresholded by Otsu's method using integrated intensity with the R package autothresholdr (22).
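The Otsu thresholding step applied to per-cell integrated intensity can be illustrated with a minimal Python sketch. This is not the autothresholdr implementation referenced above; the function name `otsu_threshold`, the default bin count, and the input format (a flat list of per-cell integrated intensities) are assumptions made only for illustration.

```python
def otsu_threshold(values, bins=64):
    """Return the cut-off that maximizes between-class variance (Otsu's method).

    `values` is a hypothetical list of per-cell integrated intensities.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # all cells identical, nothing to separate
    width = (hi - lo) / bins

    # Build a histogram of the intensities.
    hist = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        hist[idx] += 1

    total = len(values)
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]                  # cells at or below bin i
        sum0 += centers[i] * hist[i]
        w1 = total - w0                # cells above bin i
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_var, best_bin = between, i

    # Use the upper edge of the best bin as the threshold.
    return lo + (best_bin + 1) * width
```

Cells whose integrated intensity exceeds the returned value would be counted as positive in this sketch; the binning and tie-breaking choices here may differ from those of the R package.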
To sequence the targeted region (~2000 bp) on pTarget, plasmids were extracted from ~1 million cells using Qiaprep Spin Miniprep Kit. PCR was performed using those plasmids as templates (primer sequences are shown in Table 2 above). Ampure® XP beads (Beckman Coulter™) were added to samples at a 0.8:1 ratio to size-select for the PCR-amplified fragments. The concentration of each sample was measured by Qubit™ (Thermo Fisher Scientific™). 1 ng of DNA at a volume of 2.5 μl from each sample was used as input for the subsequent library preparation. Sequencing libraries were prepared following the Nextera® XT Kit protocol (Illumina®) except that half the amount of each reagent was used. To sequence the targeted loci, genomic DNA was extracted from ~1 million cells using the Quick-DNA™ Kit (Zymo Research™). 4 μl of extracted genomic DNA were used to set up in vitro transcription reactions at a volume of 10 μl using HiScribe™ T7 High Yield RNA Synthesis Kit (New England BioLabs, Inc.®). The newly synthesized RNA was purified using RNA Clean & Concentrator Kit (Zymo Research™). Reverse transcription was performed using SuperScript® IV First-Strand Synthesis System (Thermo Fisher Scientific™). cDNA was purified using AMPure® XP beads at a ratio of 1:1 and was used as the template for subsequent PCR reactions. The concentration of each sample was measured by Qubit® and the same Nextera® XT Kit protocol was followed to prepare sequencing libraries. Samples were sequenced on a MiSeq® (Illumina®) with paired-end reads.
On average, 1 million reads were produced for each sample. Illumina® sequencing adapters were trimmed during sample demultiplexing using bcl2fastq2 (version 2.19.1). Bases in each read with an Illumina® quality score lower than 25 were filtered. Alignment to the respective reference sequence was performed using Bowtie 2 (v2.2.4.1) (23). Alignment files were generated in bam format and were visualized in Geneious (v11.1.5). The mutation enrichment was calculated at each base with custom Matlab™ scripts. The first and last 15 bases of each aligned read and bases with a read count less than 100 were excluded from the analysis. Transitions, transversions, and indels observed at each position were calculated, and the C->T and G->A mutation profiles were plotted, respectively, for each sample. The per-base mutation rate was obtained by dividing the number of reads with mutations by the number of total reads at each base. The average mutation rate for each possible combination of base switching for each sample was calculated by averaging the per-base mutation rate across the targeted region. The pT7 sample was used to estimate the background error rates introduced through sample preparation and Illumina® sequencing. The final average mutation rate for each base switching combination was calculated by subtracting the background error rate. Negative values were set to 0. All bar graphs and dot plots were generated in RStudio® using ggplot2.
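The rate calculation described above can be sketched in Python as follows. This is an illustrative reconstruction, not the custom Matlab™ scripts themselves: the function names, the list-based input format (one mutated-read count and one total-read count per reference position), and the choice to average over every position passing the depth filter are assumptions.

```python
MIN_DEPTH = 100  # positions with a read count below this are excluded


def per_base_rates(mutant_reads, total_reads, min_depth=MIN_DEPTH):
    """Per-base mutation rate: reads with the mutation / total reads at that base."""
    return [m / t for m, t in zip(mutant_reads, total_reads) if t >= min_depth]


def average_rate(mutant_reads, total_reads, min_depth=MIN_DEPTH):
    """Average the per-base rates across the targeted region."""
    rates = per_base_rates(mutant_reads, total_reads, min_depth)
    return sum(rates) / len(rates) if rates else 0.0


def corrected_rate(sample_avg, background_avg):
    """Subtract the background error rate; negative values are set to 0."""
    return max(sample_avg - background_avg, 0.0)


# Hypothetical counts for one base-switching combination (e.g., C->T).
sample = average_rate([2, 0, 5, 1], [1000, 1000, 1000, 50])   # last base dropped (depth 50)
background = average_rate([1, 0, 1], [1000, 1000, 1000])      # stands in for the pT7 sample
final = corrected_rate(sample, background)
```

Multiplying `final` by 1000 would express the result per 1000 base pairs, the unit used for the rates reported in the examples.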
Pairwise comparisons were analyzed using a two-sided t test.
It was initially examined whether combining T7 RNAP with a cytidine deaminase could create a means of continuously diversifying DNA nucleotides downstream of a T7 promoter (
To test whether fusing a cytidine deaminase to T7 RNAP maintained T7 RNAP activity, pTarget and various pEditor plasmids were transfected into HEK 293T cells and EGFP fluorescence under each condition was measured. Consistent with previous reports (9, 10), T7 RNAP alone (pT7) was able to drive EGFP expression, while deaminase alone (pAPOBEC) could not (
The ability of the T7 RNAP-deaminase fusion protein to induce mutations was then tested within a targeted region. HEK293T cells transfected with both pTarget and pEditor were collected 3 days after transfection. pTarget plasmids were then extracted, and a downstream 2000-bp window was amplified by PCR for high-throughput sequencing (
The overall average C->T and G->A mutation rates for each of the pEditor variants were then calculated. The most efficient variant, which was observed to be pAID-T7-UGI, showed an average C->T mutation rate of 1.30 per 1000 base pairs (kbp⁻¹) and an average G->A mutation rate of 2.92 kbp⁻¹ (
PRIME was then utilized to mutate targeted gene loci within the human genome. An EGFP gene under the control of a T7 promoter was integrated into the HEK293T genome via lentiviral transduction. A CMV promoter was also included upstream of the T7 promoter, to allow for subsequent single cell sorting by EGFP fluorescence. A single cell clone of the EGFP construct-integrated cells was then selected and expanded (
To examine potential off-target effects of the PRIME system in the genome, a search for regions in the genome that possess the conserved T7 promoter sequence (TAATACGACTCACTATAG; SEQ ID NO: 1) was performed. Although an exact match for the T7 promoter sequence in the human genome was not identified, three regions possessing a single-base mismatch, located at distinct locations in chromosomes 6, 7 and 8, respectively, were identified. Among them, the regions in chromosome 6 and 7 (designated “Chr6” and “Chr7”, respectively) shared the same sequence (TAATACAACTCACTATAG; SEQ ID NO: 1) (
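A search of this kind, for sites within one mismatch of the T7 promoter sequence, can be sketched as a sliding-window Hamming-distance scan. The Python below is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be an uppercase A/C/G/T string held in memory, and scanning both strands is a choice made here, not a detail taken from the disclosure. A genome-scale search would normally rely on an indexed aligner.

```python
T7_PROMOTER = "TAATACGACTCACTATAG"  # conserved T7 promoter sequence given above


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def reverse_complement(seq):
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq))


def find_near_matches(genome, motif=T7_PROMOTER, max_mismatches=1):
    """Return (position, strand, mismatches, site) for each near-match window."""
    hits = []
    k = len(motif)
    rc = reverse_complement(motif)
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        for strand, pattern in (("+", motif), ("-", rc)):
            d = hamming(window, pattern)
            if d <= max_mismatches:
                hits.append((i, strand, d, window))
    return hits


# Toy example: the single-mismatch site reported for Chr6/Chr7, with flanking bases.
hits = find_near_matches("GGG" + "TAATACAACTCACTATAG" + "CCC")
```

In this toy input the scan reports one hit on the plus strand at offset 3 with a single mismatch.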
T7 RNAP is widely used in biotechnology and has previously been shown to be highly engineerable. It was examined if the editing rate of PRIME could be tuned by modifying the elongation rate of T7 RNAP or its processivity over the DNA template, as, without wishing to be bound by theory, such changes would be expected to modulate the probability of cytidine deaminase-DNA template interaction. To this end, three mutations (P266L, G645A, Q744R) relative to the wild type T7 RNAP were constructed and tested, with these particular mutations identified based upon previous studies (
To demonstrate that PRIME can perform functional mutagenesis in mammalian systems, PRIME was used to shift the fluorescence spectra of blue fluorescent protein (BFP). A single H66Y amino acid substitution (in this case, CAC->TAC or TAT) has been previously identified to cause a shift in the fluorescence excitation and emission spectra of BFP to that of GFP (16) (
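The codon arithmetic behind the H66Y substitution can be checked with a short Python sketch that enumerates every codon reachable from CAC by C->T conversion and translates it. The function name and the partial codon table (only the four codons needed here, taken from the standard genetic code) are introduced for illustration.

```python
from itertools import combinations

# Partial standard genetic code: histidine (H) and tyrosine (Y) codons only.
CODON_TABLE = {"CAC": "H", "CAT": "H", "TAC": "Y", "TAT": "Y"}


def c_to_t_variants(codon):
    """Return all codons reachable by converting one or more C bases to T."""
    c_positions = [i for i, base in enumerate(codon) if base == "C"]
    variants = set()
    for n in range(1, len(c_positions) + 1):
        for combo in combinations(c_positions, n):
            bases = list(codon)
            for i in combo:
                bases[i] = "T"
            variants.add("".join(bases))
    return sorted(variants)


variants = c_to_t_variants("CAC")
tyrosine_codons = [c for c in variants if CODON_TABLE[c] == "Y"]
```

Of the three reachable codons, CAT is a silent change that still encodes histidine, while TAC and TAT both encode tyrosine, so a C->T edit at the first codon position is sufficient for H66Y whether or not the third position is also edited.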
In summary, the above examples have demonstrated that cytidine deaminase fused to T7 RNAP can be used to generate localized nucleotide diversity within the human genome at an average C->T and G->A mutation rate ranging from ~0.4-4 kbp⁻¹ within a week. Higher editing efficiency may be achieved via additional engineering of the T7 RNAP. The wide editing window of PRIME (>2000 bp) makes it possible to target a long stretch of a selected genomic region over multiple cellular generations. In comparing PRIME with other reported directed evolution methods (
TRACE (T7 polymeRAse-driven Continuous Editing), as described herein and also referred to herein as “PRIME”, is a method that enables continuous, targeted mutagenesis in human cells using a cytidine deaminase fused to T7 RNA polymerase. TRACE can be applied to enable cell lineage recordings both in vitro and in vivo. A reconstruction of lineage trees by grouping and ranking DNA mutations from sequencing reads is shown in
A TRACE transgenic mouse is generated by decomposing the TRACE system into two components: the TRACE editor consisting of the T7 RNA-polymerase deaminase fusion protein, and the T7 recording template consisting of a T7 promoter and a transcribed editing template. Both the TRACE editor as well as the T7 promoter-recording template are integrated into a mouse at the Rosa 26 locus. Oocytes containing a T7 promoter-recording template are then fertilized with sperm harboring a constitutively active TRACE editor to initiate sequence diversification in the whole embryo. In addition, to enable cell type-specific lineage tracing, existing mouse lines expressing cell type-specific Cre-recombinase or Cre-ER (a tamoxifen inducible version of Cre) are leveraged to drive the conditional expression of a stably integrated TRACE editor in cells where Cre-recombinase is present. Thus, by crossing the TRACE mouse line with a Cre-driver line, cell-type specific lineage recording is achieved, and additional temporal resolution is provided by tamoxifen induction.
All patents and publications mentioned in the specification are indicative of the levels of skill of those skilled in the art to which the disclosure pertains. All references cited in this disclosure are incorporated by reference to the same extent as if each reference had been incorporated by reference in its entirety individually.
One skilled in the art would readily appreciate that the present disclosure is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those inherent therein. The methods and compositions described herein as presently representative of preferred embodiments are exemplary and are not intended as limitations on the scope of the disclosure. Changes therein and other uses that will occur to those skilled in the art are encompassed within the spirit of the disclosure, as defined by the scope of the claims.
In addition, where features or aspects of the disclosure are described in terms of Markush groups or other grouping of alternatives, those skilled in the art will recognize that the disclosure is also thereby described in terms of any individual member or subgroup of members of the Markush group or other group.
The use of the terms “a” and “an” and “the” and similar referents in the context of describing the disclosure (especially in the context of the following claims) is to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein.
All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the disclosure and does not pose a limitation on the scope of the disclosure unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the disclosure.
Embodiments of this disclosure are described herein, including the best mode known to the inventors for carrying out the disclosed invention. Variations of those embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description.
The disclosure illustratively described herein suitably can be practiced in the absence of any element or elements, limitation or limitations that are not specifically disclosed herein. Thus, for example, in each instance herein any of the terms “comprising”, “consisting essentially of”, and “consisting of” may be replaced with either of the other two terms. The terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Thus, it should be understood that although the present disclosure provides preferred embodiments, optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this disclosure as defined by the description and the appended claims.
It will be readily apparent to one skilled in the art that varying substitutions and modifications can be made to the invention disclosed herein without departing from the scope and spirit of the invention. Thus, such additional embodiments are within the scope of the present disclosure and the following claims. The present disclosure teaches one skilled in the art to test various combinations and/or substitutions of chemical modifications described herein toward generating conjugates possessing improved contrast, diagnostic and/or imaging activity. Therefore, the specific embodiments described herein are not limiting and one skilled in the art can readily appreciate that specific combinations of the modifications described herein can be tested without undue experimentation toward identifying conjugates possessing improved contrast, diagnostic and/or imaging activity.
The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the disclosure to be practiced otherwise than as specifically described herein. Accordingly, this disclosure includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the disclosure unless otherwise indicated herein or otherwise clearly contradicted by context. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the disclosure described herein. Such equivalents are intended to be encompassed by the following claims.
This application claims the benefit of U.S. Provisional Application No. 62/830,084 filed Apr. 5, 2019, entitled “A Pseudo-Random DNA Editor for Efficient and Continuous Nucleotide Diversification in Human Cells,” the entire contents of which are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2020/026679 | 4/3/2020 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62830084 | Apr 2019 | US |