Gene editing technologies rely on engineered nucleases to introduce targeted modifications in the genomes of living cells. In particular, the clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 RNA-guided nuclease (RGN) system has revolutionized this field, providing a simple and efficient means of inducing DNA double-strand breaks (DSBs) at targeted genomic loci. In Streptococcus pyogenes, the CRISPR RNAs (crRNAs) and the trans-activating crRNA (tracrRNA) form a complex that guides the Cas9 nuclease to the target DNA. The only constraint for target sequences is that they must immediately precede a suitable protospacer adjacent motif (PAM) of the form NGG or NGA. This bacterial CRISPR system has been further simplified to utilize a single-guide RNA (sgRNA) molecule, which is a chimeric RNA that replaces both the crRNA and tracrRNA elements.
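The PAM constraint described above can be illustrated with a minimal Python sketch that scans a DNA string for candidate target sequences immediately followed by an NGG or NGA motif. The function name, the 20-nucleotide target length, and the restriction to the forward strand are assumptions made for illustration only and are not taken from the disclosure.

```python
import re

def find_protospacers(seq, pam_regex="[ACGT]G[GA]", length=20):
    """Return (start, protospacer, pam) tuples for forward-strand sites.

    A site is reported when a `length`-nt target is immediately followed
    by a PAM matching `pam_regex` (NGG or NGA by default).
    The 20-nt length is an illustrative assumption.
    """
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping PAM matches are all reported.
    for m in re.finditer(f"(?=({pam_regex}))", seq):
        pam_start = m.start()
        if pam_start >= length:  # need a full-length target upstream of the PAM
            start = pam_start - length
            hits.append((start, seq[start:pam_start], m.group(1)))
    return hits

# Hypothetical input: a 20-nt stretch followed by a TGG PAM.
print(find_protospacers("A" * 20 + "TGG"))
```

A complete design tool would also scan the reverse complement and score off-target matches; both are omitted here for brevity.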
The CRISPR system has been adapted for use in mammalian cells, where gene knock out can be accomplished by introducing DSBs at the target locus that, when repaired by error-prone DNA repair pathways such as non-homologous end joining (NHEJ), cause inactivating mutations. Despite the high rates of allele modification that can be achieved with RGNs, the laborious and costly screening needed for identification and isolation of isogenic cell lines remains challenging in genetic engineering.
Alternatively, strain development can be streamlined by co-delivering engineered nucleases with donor vectors containing expression cassettes that confer antibiotic resistance for rapid clonal screening. These donor vectors often share a common architecture that consists of two DNA sequences homologous to the regions of DNA upstream and downstream of the intended DSB, flanking the DNA that will be incorporated into the genome following repair of the DSB. Donor vectors stimulate DNA repair through homologous recombination (HR), a pathway that can be hijacked for targeted integration of DNA sequences into genomes. This method has been used successfully for multiple applications, including gene knock-out, delivery of therapeutic genes, and tagging of endogenous proteins. Gene editing via donor vectors is precise; however, it is inefficient and relies on construction of lengthy homology arms using complex cloning strategies, costly synthesis of DNA fragments, or both.
Furthermore, an important drawback for genome engineering applications, which often require integration of constructs in excess of 5 kb, is that the efficiency of HR decreases as the size of the DNA insert between the homology arms increases. More importantly, since homology between the donor vector and the target site is critical, each donor vector is necessarily associated with a specific sgRNA. Consequently, the time frame necessary for design, testing and validation of new strains generated using HR is excessively long. Platforms for rapid and low-cost multiplexed genomic integration are needed.
Additionally, genome-scale gain-of-function screening is a powerful tool to systematically identify genes that regulate biological processes. The activation of endogenous genes with artificial transcription factors (ATFs) is an enticing technology, not only for developing gene therapies or disease models, but also for interrogating gene function through genome-wide screenings. ATFs consist of a programmable DNA binding domain that can be customized to target a transcriptional activation domain to the appropriate locus for upregulation of gene expression. While zinc finger proteins and transcription activator-like effectors (TALEs) have been used for gene activation, the RNA-guided nuclease (RGN) platform is arguably the most popular since the DNA binding specificity can be engineered rapidly and at low cost. RGN-based gene activation, also known as CRISPR activation or CRISPRa, requires a single-guide RNA (sgRNA) and catalytically dead Cas9 (dCas9) coupled with a transcriptional activator. First-generation transcriptional activators, which typically used VP64 or VP16 activation domains, required multiple ATFs acting in synergy near the transcriptional start site (TSS) of the gene of interest for optimal gene activation. This important limitation is lessened when using second-generation transcriptional activators, including VP160, SAM, VPR, SunTag, VP64-dCas9-BFP-VP64, Scaffold, and P300, which are capable of activating expression of some target genes when used individually.
A key application of second-generation transcriptional activators has been the interrogation of gene function by introducing genetic perturbations at genome scale using libraries of sgRNAs. However, the success of gain-of-function screenings fundamentally relies on the effective activation of target genes by the ATFs in order to overcome the applied selection pressure. Unfortunately, it is becoming evident that even second-generation CRISPRa technologies are often limited by their need for multiple sgRNAs to achieve adequate activation of many genes and by the lack of established parameters to best position ATFs within endogenous promoters for effective upregulation of gene expression. These constraints in gain-of-function screenings by ATFs may lead to results that are skewed in favor of select subgroups of sgRNAs for which activation is readily achieved with a single sgRNA.
To address shortcomings in loss-of-function genome-scale screenings, hits from CRISPR knock out screenings can be refined by simultaneously considering hits from short hairpin RNA (shRNA) screenings. Unfortunately, there are no such alternatives to CRISPRa that function by a different mechanism and that, by having different advantages and limitations, can be used in parallel with CRISPRa screenings to comprehensively identify targets. While ideal outcomes from screenings require robust activation of target gene expression, current CRISPRa technologies often exhibit relatively weak, variable, or unpredictable activation across targets.
To address these limitations, a novel universal vector integration platform system for gene activation is described herein, which bypasses native promoters to achieve unprecedented levels of endogenous gene activation. Since genomic context at the promoter greatly impacts output expression when using ATFs, it is possible to circumvent this problem through insertion of a synthetic promoter near the transcriptional start site (TSS) of target genes. This system not only overrides negative regulatory elements, but is also highly customizable, given the existing assortment of well-characterized synthetic promoters capable of both constitutive and inducible gene expression.
This platform enables rapid, robust and inducible activation of both individual and multiplexed gene transcripts. This gene activation system is multiplexable and easily tuned for precise control of expression levels. Importantly, since promoter vector integration requires just one variable sgRNA to target each gene of interest, this procedure can be adapted for gain-of-function screenings. Collectively, these results demonstrate a novel system for gene modulation with wide adaptability in cell line engineering and genome-scale functional screenings.
The present disclosure relates to a system for targeted genome engineering and methods for altering the expression of genes and interrogating the function of genes.
One aspect of the present invention provides a system for targeted genome engineering, the system comprising one or more vectors comprising: (i) nucleic acids for integration in genomic DNA with no significant homology to the target sequence in genomic DNA; (ii) a single guide RNA (sgRNA) that binds one or more vectors; (iii) a sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors can be integrated; and (iv) a nuclease that causes a double-stranded nucleic acid break of the targeted nucleic acid molecules.
In some embodiments of the invention disclosed herein, the nucleic acids for integration in genomic DNA with no significant homology to the target sequence in genomic DNA; the single guide RNA (sgRNA) that binds one or more vectors; the sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors can be integrated; and the nuclease that causes a double-stranded nucleic acid break of the targeted nucleic acid molecules are located on the same or different vectors of the system.
In some embodiments of the invention disclosed herein, the sgRNA that binds one or more vectors and the sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors can be integrated are the same sgRNA. In other embodiments of the above aspect of the invention, the sgRNA that binds one or more vectors and the sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors can be integrated are different sgRNAs.
In some embodiments of the invention disclosed herein, the sgRNA that binds one or more vectors is a universal sgRNA.
In some embodiments of the invention disclosed herein, the nuclease is expressed from an expression cassette.
In some embodiments of the invention disclosed herein, the one or more vectors further comprise a polynucleotide encoding for a marker protein. In other embodiments of the invention disclosed herein, a sgRNA target site is cloned upstream of the marker protein. In other embodiments of the invention disclosed herein, the marker protein is an antibiotic resistance protein or a fluorescent protein.
In some embodiments of the invention disclosed herein, the polynucleotide encoding for a marker protein is expressed on a vector separate from the one or more vectors comprising the nucleic acids for integration in genomic DNA with no significant homology to the target sequence in genomic DNA; the single guide RNA (sgRNA) that binds one or more vectors; the sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors can be integrated; and the nuclease that causes a double-stranded nucleic acid break of the targeted nucleic acid molecules.
In some embodiments of the invention disclosed herein, the sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors can be integrated is complementary to a portion of the nucleic acid sequence of a target DNA.
In some embodiments of the invention disclosed herein, the nucleic acids with no significant homology to the target nucleic acid molecule are about 0.1 kilobase to about 50 kilobases in size.
In some embodiments of the invention disclosed herein, the nuclease is a zinc finger nuclease (ZFN), an RNA-guided nuclease (RGN), or a transcription activator-like effector nuclease (TALEN). In other embodiments of the invention disclosed herein, the RGN is CRISPR-associated protein 9 (Cas9).
In some embodiments of the invention disclosed herein, the one or more vectors are plasmids or viral vectors. In other embodiments of the invention disclosed herein, the viral vector is a lentivirus vector, an adenovirus vector, or an adeno-associated virus (AAV) vector.
In some embodiments of the invention disclosed herein, the system for targeted genome engineering further comprises one or more additional sgRNA molecules that cause a double-stranded nucleic acid break of one or more additional target nucleic acid molecules.
In some embodiments of the invention disclosed herein, the system does not require the entire vector that can be integrated to have any homology with the target site.
Another aspect of the present invention provides a method of altering the expression of at least one gene product, the method comprising: (i) introducing into a cell a system for targeted genome engineering as disclosed herein; and (ii) selecting for successfully transfected cells by applying selective pressure; wherein the expression of at least one gene product is reduced or eliminated relative to a cell that has not been transfected with the system for targeted genome engineering.
In some embodiments of the invention disclosed herein, the method occurs in vivo or in vitro. In other embodiments of the invention disclosed herein, the cell is a eukaryotic cell.
Another aspect of the present invention provides a system for targeted genome engineering, the system comprising one or more vectors comprising: (i) at least one nucleic acid with no significant homology to the target genomic DNA site and that contains a promoter for controlling gene expression; (ii) a primary sgRNA that binds the target nucleic acid molecule at or near the transcription start site of a gene in the target nucleic acid molecule; (iii) a universal secondary sgRNA that binds one or more vectors; and (iv) a nuclease that causes a double-stranded nucleic acid break of the targeted nucleic acid molecules.
In some embodiments of the invention disclosed herein, the at least one nucleic acid with no significant homology to the target genomic DNA site and that contains a promoter for controlling gene expression comprises: (1) a nucleic acid promoter followed by a universal secondary sgRNA; (2) two opposing, constitutive promoters separated by a universal secondary sgRNA; or (3) two inducible promoters in opposite orientations separated by a universal secondary sgRNA.
In some embodiments of the invention disclosed herein, the at least one nucleic acid with no significant homology to the target genomic DNA site and that contains a promoter for controlling gene expression; the primary sgRNA that binds the target nucleic acid molecule at or near the transcription start site of a gene in the target nucleic acid molecule; the universal secondary sgRNA that binds one or more vectors; and the nuclease that causes a double-stranded nucleic acid break of the targeted nucleic acid molecules are located on the same or different vectors of the system.
In some embodiments of the invention disclosed herein, each inducible promoter of the two inducible promoters in opposite orientations separated by a universal secondary sgRNA contains multiple TetO repeats and a transferase gene operatively linked to a reverse tetracycline transactivator (rtTA) via a T2A peptide.
In some embodiments of the invention disclosed herein, the one or more vectors further comprise a polynucleotide encoding for a marker protein. In other embodiments of the invention disclosed herein, the marker protein is an antibiotic resistance protein or a fluorescent protein.
In some embodiments of the invention disclosed herein, the nucleic acid promoter is heterologous to the promoter of the target nucleic acid molecule.
In some embodiments of the invention disclosed herein, the nuclease is a zinc finger nuclease (ZFN), an RNA-guided nuclease (RGN), or a transcription activator-like effector nuclease (TALEN). In other embodiments of the invention disclosed herein, the RGN is CRISPR-associated protein 9 (Cas9).
In some embodiments of the invention disclosed herein, the one or more vectors are plasmids or viral vectors. In other embodiments of the invention disclosed herein, the viral vector is a lentivirus vector, an adenovirus vector, or an adeno-associated virus (AAV) vector.
Another aspect of the present invention provides a method of altering the expression of at least one gene product, the method comprising: (i) introducing into a cell a system for targeted genome engineering as disclosed herein; and (ii) selecting for successfully transfected cells by applying selective pressure, wherein the expression of at least one gene product is activated relative to a cell that is not transfected with the system of targeted genome engineering.
In some embodiments of the invention disclosed herein, the method occurs in vivo or in vitro. In other embodiments of the invention disclosed herein, the cell is a eukaryotic cell.
Another aspect of the present invention provides a method of identifying the genetic basis of one or more medical symptoms exhibited by a subject, the method comprising: (i) obtaining a biological sample from the subject and isolating a population of cells having a first phenotype from the biological sample; (ii) transfecting a library of sgRNA into the cells; (iii) introducing into the cells a system of targeted genome engineering as disclosed herein; (iv) selecting for successfully transfected cells by applying selective pressure; (v) selecting the cells that survive under the selective pressure; and (vi) determining the genomic loci of the DNA molecule that interacts with the first phenotype and identifying the genetic basis of the one or more medical symptoms exhibited by the subject.
In some embodiments of the invention disclosed herein, selective pressure is applied by contacting the cells with an antibiotic and selecting the cells that survive. In some embodiments of the method disclosed herein, the antibiotic is puromycin or hygromycin.
Additional features and advantages are described herein, and will be apparent from the following Detailed Description, Drawings and the claims.
The features, objects and advantages other than those set forth above will become more readily apparent when consideration is given to the detailed description below. Such detailed description makes reference to the following drawings, wherein:
While the present invention is susceptible to various modifications and alternative forms, exemplary embodiments thereof are shown by way of example in the drawings and are herein described in detail. It should be understood, however, that the description of exemplary embodiments is not intended to limit the invention to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the invention as defined by the embodiments above and the claims below. Reference should therefore be made to the embodiments above and claims below for interpreting the scope of the invention.
The system and methods now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the invention are shown. Indeed, the invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements.
Likewise, many modifications and other embodiments of the system and methods described herein will come to mind to one of skill in the art to which the invention pertains having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the invention is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of skill in the art to which the invention pertains. Although any methods and materials similar to or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described herein.
Articles “a” and “an” are used herein to refer to one or to more than one (i.e. at least one) of the grammatical object of the article. By way of example, “an element” means at least one element and can include more than one element.
As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well as the singular forms, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof.
The term “about” in association with a numerical value means that the numerical value can vary plus or minus by 5% or less of the numerical value.
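As a worked illustration of this definition, "about 50 kilobases" covers 47.5 to 52.5 kilobases, since 5% of 50 is 2.5. A minimal Python sketch of the same check follows; the function name and the default tolerance parameter are hypothetical and serve only to restate the plus-or-minus 5% rule.

```python
def within_about(value, nominal, tolerance=0.05):
    """Return True if `value` lies within +/- `tolerance` * |nominal| of `nominal`.

    The default tolerance of 0.05 mirrors the "5% or less" wording above.
    """
    return abs(value - nominal) <= tolerance * abs(nominal)

# Hypothetical checks against a nominal size of 50 kb.
print(within_about(52, 50))  # 52 differs from 50 by 2, which is under 2.5
print(within_about(53, 50))  # 53 differs from 50 by 3, which is over 2.5
```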
Overview
The present disclosure provides a multiplexable and universal nuclease-assisted vector integration system for rapid generation of gene knockouts using selection that does not require customized targeting vectors, thereby minimizing the cost and time needed for gene editing. Importantly, this system is capable of remodeling native genomes (e.g., mammalian) through integration of large DNA (e.g., about 50 kb), enabling rapid generation and screening of multigene knockouts from a single transfection. These results indicate that nuclease-assisted vector integration is a robust tool for genome-scale gene editing that will facilitate diverse applications in synthetic biology and gene therapy.
Also described herein are vectors and methods for rapid and efficient integration of heterologous DNA at target sites in genomes with high efficiency. These methods can be adapted to precisely manipulate and activate native gene expression. Furthermore, these techniques can be used for creating cell lines to model human diseases, for activating gene expression to correct genetic diseases or even for performing genetic screenings.
In one aspect, the present disclosure provides a system for targeted genome engineering, the system comprising one or more vectors comprising: (i) nucleic acids for integration in genomic DNA with no significant homology to the target sequence in genomic DNA; (ii) a single guide RNA (sgRNA) that binds one or more vectors; (iii) a sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors will be integrated; and (iv) a nuclease that causes a double-stranded nucleic acid break of the targeted nucleic acid molecules.
As used herein, the term “targeted genome engineering” refers to a type of genetic engineering in which DNA is inserted, deleted, modified, or replaced in the genome of a living organism or cell. Targeted genome engineering can involve integrating nucleic acids into genomic DNA at a target site of interest in order to manipulate (e.g., increase, decrease, knockout, activate) the expression of one or more genes.
As used herein, the term “knockout” refers to a genetic technique in which one of an organism's genes is made inoperative. Knocking out two genes simultaneously in an organism is known as a double knockout. Similarly, triple knockout (TKO) and quadruple knockouts (QKO) are used to describe three or four knocked out genes, respectively. Heterozygous knockouts refer to when only one of the two gene copies (alleles) is knocked out, and homozygous knockouts refer to when both gene copies are knocked out.
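The knockout terminology above can be summarized in a short Python sketch that labels a diploid genotype. The gene names, the function name, and the choice to count a heterozygous knockout toward the number of knocked-out genes are assumptions made for illustration.

```python
# Labels for the number of disrupted alleles at a diploid locus.
ZYGOSITY = {0: "wild type", 1: "heterozygous knockout", 2: "homozygous knockout"}
# Labels for the number of genes carrying at least one disrupted allele.
MULTIPLICITY = {1: "knockout", 2: "double knockout",
                3: "triple knockout (TKO)", 4: "quadruple knockout (QKO)"}

def classify(genotype):
    """Classify a diploid genotype.

    `genotype` maps a gene name to its number of disrupted alleles (0, 1 or 2).
    Returns (overall_label, per_gene_labels).
    """
    per_gene = {}
    for gene, n in genotype.items():
        if n not in ZYGOSITY:
            raise ValueError(f"{gene}: expected 0, 1 or 2 disrupted alleles, got {n}")
        per_gene[gene] = ZYGOSITY[n]
    knocked = sum(1 for n in genotype.values() if n > 0)
    if knocked == 0:
        label = "no knockout"
    else:
        label = MULTIPLICITY.get(knocked, f"{knocked}-gene knockout")
    return label, per_gene

# Hypothetical genotype with placeholder gene names.
print(classify({"GENE_A": 2, "GENE_B": 1, "GENE_C": 0}))
```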
As used herein the term “activate” refers to activation of native gene expression, which can include, but is not limited to, increasing the levels of gene products or initiating gene expression of a previously inactive gene. Robust and controllable systems for activation of native gene expression have been pursued for multiple applications in gene therapy, regenerative medicine and synthetic biology. These systems, rather than introducing heterologous genes that are expressed from constitutive or tunable promoters, use proteins that regulate transcription of genes in their natural chromosomal context. There are several advantages to activating native gene expression compared with overexpressing exogenous genes including ease of cloning, simple delivery, tunability and potential for simultaneous regulation of multiple gene splicing isoforms.
As used herein, “single guide RNA” (the terms “single guide RNA” and “sgRNA” may be used interchangeably herein) refers to a single RNA species capable of directing RNA-guided nuclease (RGN) mediated cleavage of target DNA. In some embodiments, a single guide RNA may contain the sequences necessary for RGN nuclease activity and a target sequence complementary to a target DNA of interest.
As used herein, the terms “universal sgRNA,” “secondary sgRNA,” or “universal secondary sgRNA” are used interchangeably to refer to sgRNA that binds to and directs RGN-mediated cleavage of one or more vectors.
As used herein, the term “primary sgRNA” is used to refer to the sgRNA that binds to and directs RGN-mediated cleavage of genomic DNA. The primary sgRNA can be customized to integrate nucleic acids (e.g., vectors) at any target site in the genome.
As used herein, the term “no significant homology to the target sequence in genomic DNA” means that the nucleic acids to be inserted into the genomic DNA have less than about 20%, 15%, 10%, 5%, or 1% homology to the genomic DNA. As used herein, the term “homology” refers to the similarity between two nucleic acid sequences. Homology among DNA, RNA, or proteins is typically inferred from their nucleotide or amino acid sequence similarity. Significant similarity is strong evidence that two sequences are related by evolutionary changes from a common ancestral sequence. Alignments of multiple sequences are used to indicate which regions of each sequence are homologous. The term “percent homology” is used herein to mean “sequence similarity.” The percentage of identical nucleotides or amino acid residues (percent identity), or the percentage of residues conserved with similar physicochemical properties (percent similarity), e.g., leucine and isoleucine, is used to quantify the homology.
As described herein, sequence identity is related to sequence homology. Homology comparisons may be conducted by eye or using sequence comparison programs. These commercially available computer programs may calculate percent (%) homology between two or more sequences and may also calculate the sequence identity shared by two or more amino acid or nucleic acid sequences. Sequence homologies may be generated by any of a number of computer programs known in the art, for example BLAST or FASTA.
Percentage (%) sequence homology may be calculated over contiguous sequences, i.e., one sequence is aligned with the other sequence and each amino acid or nucleotide in one sequence is directly compared with the corresponding amino acid or nucleotide in the other sequence, one residue at a time. This is called an “ungapped” alignment. Ungapped alignments are performed only over a relatively short number of residues. Although this is a very simple and consistent method, it fails to take into consideration that, for example, in an otherwise identical pair of sequences, one insertion or deletion may cause the following amino acid residues to be put out of alignment, thus potentially resulting in a large reduction in percent homology when a global alignment is performed. Therefore, most sequence comparison methods are designed to produce optimal alignments that take into consideration possible insertions and deletions without unduly penalizing the overall homology or identity score. This is achieved by inserting “gaps” in the sequence alignment to try to maximize local homology or identity.
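The sensitivity of an ungapped comparison to a single insertion can be seen in a minimal Python sketch. The function name and the example sequences are hypothetical; the calculation simply counts position-by-position matches over the shorter of the two sequences and is not a substitute for a gapped alignment program such as BLAST or FASTA.

```python
def ungapped_percent_identity(a, b):
    """Percent of identical positions when `a` and `b` are compared
    residue by residue from their first position, with no gaps allowed.
    The comparison runs over the length of the shorter sequence.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / n

reference = "AACCGGTTAACC"       # hypothetical 12-nt sequence
shifted = "G" + reference        # same sequence with one inserted base at the 5' end

print(ungapped_percent_identity(reference, reference))  # identical sequences
print(ungapped_percent_identity(reference, shifted))    # one insertion shifts every later position
```

In this example the single inserted base shifts every downstream position, so the ungapped score drops from 100% to 50% even though the sequences differ by only one nucleotide. A gapped alignment would place one gap opposite the inserted base and recover the remaining matches.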
In some embodiments, the nucleic acids for integration in genomic DNA with no significant homology to the target sequence in genomic DNA; the single guide RNA (sgRNA) that binds one or more vectors; the sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors can be integrated; and the nuclease that causes a double-stranded nucleic acid break of the targeted nucleic acid molecules are located on the same or different vectors of the system. In other embodiments, the sgRNA that binds one or more vectors and the sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors can be integrated are the same sgRNA. In yet other embodiments, the sgRNA that binds one or more vectors and the sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors can be integrated are different sgRNAs. In yet other embodiments, the sgRNA that binds one or more vectors is a universal sgRNA.
In some embodiments, multiple vectors can be integrated into one genomic site. Each of the vectors contains the target nucleic acid sequence for a single sgRNA, so that one sgRNA can direct the RGN to cut and linearize all of the vectors at that shared sequence. The linearized vectors can then be integrated into a target DNA of interest that has been cut by an RGN directed by a sgRNA complementary to a nucleic acid sequence located in the target DNA of interest.
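The idea that one sgRNA target sequence shared by every vector allows all of them to be linearized can be sketched in Python as follows. This is a simplified string model under stated assumptions: the 20-nt target sequence and vector sequences are hypothetical, only the forward strand is searched, the PAM is not checked, and the cut is placed 3 nt upstream of the PAM-proximal end of the target, which reflects the commonly reported behavior of S. pyogenes Cas9 rather than anything stated in this disclosure.

```python
def linearize(circular_seq, target, cut_offset=3):
    """Linearize a circular vector at the cut site inside `target`.

    The cut is placed `cut_offset` nt upstream of the 3' (PAM-proximal)
    end of the target. Returns the linear sequence starting at the cut,
    or None if the target is absent from the forward strand.
    """
    seq = circular_seq.upper()
    site = target.upper()
    # Append a wrap-around prefix so a site spanning the origin is found.
    idx = (seq + seq[:len(site) - 1]).find(site)
    if idx < 0:
        return None
    cut = (idx + len(site) - cut_offset) % len(seq)
    return seq[cut:] + seq[:cut]

# Hypothetical universal target shared by two otherwise unrelated vectors.
UNIVERSAL = "GATTACAGATTACAGATTAC"
vector_a = "TTTT" + UNIVERSAL + "AGG" + "CCCC"
vector_b = "ACACAC" + UNIVERSAL + "TGG" + "GTGTGT"

for name, vec in (("vector_a", vector_a), ("vector_b", vector_b)):
    print(name, linearize(vec, UNIVERSAL))
```

Because both vectors carry the same target sequence, a single call pattern linearizes each of them without any vector-specific guide design.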
In other embodiments, the nuclease is expressed from an expression cassette. The term “expression cassette” as used herein refers to a distinct component of vector DNA consisting of a gene and regulatory sequence to be expressed by a transfected cell, whereby the expression cassette directs the cell to make RNA and protein. Different expression cassettes can be transfected into different organisms including bacteria, yeast, plants, and mammalian cells as long as the correct regulatory sequences are used.
In other embodiments, the one or more vectors further comprise a polynucleotide encoding for a marker protein. In yet other embodiments, a sgRNA target site is cloned upstream of the marker protein. In yet other embodiments, the marker protein is an antibiotic resistance protein or a fluorescent protein. In some embodiments, the polynucleotide encoding for a marker protein is expressed on a separate vector.
As used herein, the terms “marker protein” or “selectable marker” are used interchangeably herein to refer to proteins encoded by a gene that when introduced into a cell (prokaryotic or eukaryotic) confers a trait suitable for artificial selection. Marker proteins or selectable markers are used in laboratory, molecular biology, and genetic engineering applications to indicate the success of a transfection or other procedure meant to introduce foreign DNA into a cell. Selectable markers include, but are not limited to, resistance to antibiotics, herbicides or other compounds, which would be lethal to cells, organelles or tissues not expressing the resistance gene or allele. Selection of transformants is accomplished by growing the cells or tissues under selective pressure, i.e., on media containing the antibiotic, herbicide or other compound. If the selectable marker is a “lethal” selectable marker, cells which express the selectable marker will live, while cells lacking the selectable marker will die. If the selectable marker is “non-lethal,” transformants (i.e., cells expressing the selectable marker) will be identifiable by some means from non-transformants, but both transformants and non-transformants will live in the presence of the selection pressure.
Antibiotic resistance genes for use as selectable markers include, but are not limited to, genes encoding proteins that confer resistance to puromycin, hygromycin, blasticidin, and neomycin. Genes encoding resistance to antibiotics such as ampicillin, chloramphenicol, tetracycline or kanamycin are examples of selectable markers for E. coli.
Examples of marker proteins include, but are not limited to, antibiotic resistance proteins. In particular, beta-lactamase confers ampicillin resistance to a bacterial host, and the neo gene from Tn5 confers resistance to kanamycin in bacteria and geneticin in eukaryotic cells. Other examples of marker proteins include, but are not limited to, fluorescent proteins, such as green fluorescent protein (GFP), red fluorescent protein (RFP), bilirubin-inducible fluorescent protein UnaG, dsRed, eqFP611, Dronpa, TagRFPs, KFP, EosFP, Dendra, and IrisFP.
In other embodiments, the sgRNA that binds a double-stranded nucleic sequence in genomic DNA where the vectors will be integrated is complementary to a portion of the nucleic acid sequence of a target DNA.
In other embodiments, the nucleic acids with no significant homology to the target nucleic acid molecule are about 0.001 kilobases to 100 kilobases in size, such as about 0.001, 0.002, 0.003, 0.005, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or about 100 kilobases in size. In other embodiments, the nucleic acids with no significant homology to the target nucleic acid molecule are about 0.1 kilobase to about 50 kilobases in size.
As used herein, the term “nuclease” refers to an enzyme capable of cleaving the phosphodiester bonds between monomers of nucleic acids. Nucleases variously effect single and double stranded breaks in their target molecules. In living organisms, they are essential machinery for many aspects of DNA repair. Nucleases are used in genetic engineering. There are two primary classifications based on the locus of activity. Exonucleases digest nucleic acids from the ends. Endonucleases act on regions in the middle of target molecules. They are further subcategorized as deoxyribonucleases and ribonucleases. The former acts on DNA, the latter on RNA. Examples of nucleases include, but are not limited to artificial restriction enzymes and artificial transcription factors (ATFs).
There are multiple approaches to controlling native gene expression; however, recent advances in genetic engineering have made it possible to rapidly design and assemble artificial transcription factors (ATFs) that are both efficient and highly specific. One key feature of ATFs is that they typically have a modular structure, with two distinct and independent domains: (1) a DNA-binding domain, and (2) a transcriptional activation domain. Through customization of the DNA-binding and transcriptional activation domains, it is possible to select a genomic target and activate gene expression exclusively at that locus.
First generation transcriptional activation domains are relatively weak and require binding of multiple ATFs in close proximity, within the promoter, in order to function synergistically and efficiently initiate transcription. However, second-generation transcriptional activation domains can facilitate high levels of gene activation, even when using a single ATF.
Artificial transcription factors are classified according to the nature of the DNA-binding domain in three main groups: Zinc Finger Proteins (ZFP), Transcriptional Activator-Like Effectors (TALEs), and RNA-guided nucleases (RGNs). Each of these ATFs is effective at activating native gene expression.
As used herein, the terms “genomic DNA” or “genomic target DNA” or “target DNA” refer to chromosomal DNA. Most organisms have the same genomic DNA in every cell, but only certain genes are active in each cell to allow for cell function and differentiation within the body. The genome of an organism (encoded by the genomic DNA) is the (biological) information of heredity which is passed from one generation of organism to the next.
As used herein, “RNA-guided nuclease” or “RGN” means a nuclease capable of DNA or RNA cleavage directed by RNA base pairing. Examples of RGNs include, but are not limited to, CRISPR-associated protein 9 (Cas9), Zinc Finger nuclease (ZFN), and TALENs.
CRISPR-Cas9-sgRNA
The Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-associated (CRISPR/Cas) system includes a recently identified type of site-specific nuclease (SSN). CRISPR/Cas molecules are components of a prokaryotic adaptive immune system that is functionally analogous to eukaryotic RNA interference, using RNA base pairing to direct DNA or RNA cleavage. Directing DNA DSBs requires two components: the Cas9 protein, which functions as an endonuclease, and CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA) sequences that aid in directing the Cas9/RNA complex to the target DNA sequence (Makarova et al., Nat Rev Microbiol, 9(6):467-477, 2011). The modification of a single targeting RNA can be sufficient to alter the nucleotide target of a Cas protein. In some cases, crRNA and tracrRNA can be engineered as a single cr/tracrRNA hybrid to direct Cas9 cleavage activity (Jinek et al., Science, 337(6096):816-821, 2012). The CRISPR/Cas system can be used in bacteria, yeast, humans, and zebrafish, as described elsewhere (see, e.g., Jiang et al., Nat Biotechnol, 31(3):233-239, 2013; Dicarlo et al., Nucleic Acids Res, doi:10.1093/nar/gkt135, 2013; Cong et al., Science, 339(6121):819-823, 2013; Mali et al., Science, 339(6121):823-826, 2013; Cho et al., Nat Biotechnol, 31(3):230-232, 2013; and Hwang et al., Nat Biotechnol, 31(3):227-229, 2013).
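By way of illustration only, the requirement that a Cas9 target sequence immediately precede a PAM can be sketched in a few lines of Python. The function name `find_ngg_targets`, the 20-nucleotide protospacer length, and the restriction to the NGG form of the PAM on the given strand are assumptions made for this example and are not limitations of the system described herein.

```python
import re


def find_ngg_targets(seq, protospacer_len=20):
    """List candidate target sites on the given strand.

    Returns (start, protospacer, pam) tuples for every position where a
    stretch of `protospacer_len` bases is immediately followed by NGG.
    Hypothetical helper: only the strand as written is scanned.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    example = "ACGT" * 5 + "AGG" + "CC"
    for start, protospacer, pam in find_ngg_targets(example):
        print(start, protospacer, pam)
```

A practical design tool would also scan the reverse complement strand and score candidate sites for off-target similarity; those steps are omitted here.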
TALENs
Transcription Activator-Like Effector Nucleases (TALENs) are artificial restriction enzymes generated by fusing the TAL effector DNA binding domain to a DNA cleavage domain. These reagents enable efficient, programmable, and specific DNA cleavage and represent powerful tools for genome editing in situ. Transcription activator-like effectors (TALEs) can be quickly engineered to bind practically any DNA sequence. The term TALEN, as used herein, is broad and includes a monomeric TALEN that can cleave double stranded DNA without assistance from another TALEN. The term TALEN is also used to refer to one or both members of a pair of TALENs that are engineered to work together to cleave DNA at the same site. TALENs that work together may be referred to as a left-TALEN and a right-TALEN, which references the handedness of DNA. See U.S. Ser. No. 12/965,590; U.S. Ser. No. 13/426,991 (U.S. Pat. No. 8,450,471); U.S. Ser. No. 13/427,040 (U.S. Pat. No. 8,440,431); U.S. Ser. No. 13/427,137 (U.S. Pat. No. 8,440,432); and U.S. Ser. No. 13/738,381, all of which are incorporated by reference herein in their entirety.
TAL effectors are proteins secreted by Xanthomonas bacteria. The DNA binding domain contains a highly conserved 33-34 amino acid sequence with the exception of the 12th and 13th amino acids. These two locations are highly variable (Repeat Variable Diresidue (RVD)) and show a strong correlation with specific nucleotide recognition. This simple relationship between amino acid sequence and DNA recognition has allowed for the engineering of specific DNA binding domains by selecting a combination of repeat segments containing the appropriate RVDs.
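The relationship between RVDs and recognized nucleotides can be illustrated with a short Python sketch. The mapping used below (NI for A, HD for C, NN for G, NG for T) is the commonly cited RVD code and is stated here as an assumption, since no specific code is recited in this disclosure; the name `rvds_for_target` is hypothetical.

```python
# Commonly cited RVD-to-nucleotide code (an assumption of this example).
# NN is often reported to recognize A as well as G.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target):
    """Return the ordered list of RVDs, one repeat per base of the target."""
    return [RVD_FOR_BASE[base] for base in target.upper()]


if __name__ == "__main__":
    # One repeat segment is selected per nucleotide of the intended site.
    print(rvds_for_target("ACGT"))
```

In this sketch a character other than A, C, G, or T raises a KeyError, which is a deliberate simplification.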
The non-specific DNA cleavage domain from the end of the FokI endonuclease can be used to construct hybrid nucleases that are active in a yeast assay. These reagents are also active in plant cells and in animal cells. Initial TALEN studies used the wild-type FokI cleavage domain, but some subsequent TALEN studies also used FokI cleavage domain variants with mutations designed to improve cleavage specificity and cleavage activity. The FokI domain functions as a dimer, requiring two constructs with unique DNA binding domains for sites in the target genome with proper orientation and spacing. Both the number of amino acid residues between the TALEN DNA binding domain and the FokI cleavage domain and the number of bases between the two individual TALEN binding sites are parameters for achieving high levels of activity. The number of amino acid residues between the TALEN DNA binding domain and the FokI cleavage domain may be modified by introduction of a spacer (distinct from the spacer sequence) between the plurality of TAL effector repeat sequences and the FokI endonuclease domain. The spacer sequence may be 12 to 30 nucleotides.
The relationship between amino acid sequence and DNA recognition of the TALEN binding domain allows for designable proteins. In this case artificial gene synthesis is problematic because of improper annealing of the repetitive sequence found in the TALE binding domain. One solution to this is to use a publicly available software program (DNAWorks) to calculate oligonucleotides suitable for assembly in a two-step PCR; oligonucleotide assembly followed by whole gene amplification. A number of modular assembly schemes for generating engineered TALE constructs have also been reported. Both methods offer a systematic approach to engineering DNA binding domains that is conceptually similar to the modular assembly method for generating zinc finger DNA recognition domains.
Once the TALEN genes have been assembled they are inserted into plasmids; the plasmids are then used to transfect the target cell where the gene products are expressed and enter the nucleus to access the genome. TALENs can be used to edit genomes by inducing double-strand breaks (DSB), which cells respond to with repair mechanisms. In this manner, they can be used to correct mutations in the genome which, for example, cause disease.
Zinc Finger Nucleases (ZFNs)
Zinc finger nucleases (ZFNs) are enzymes having a DNA cleavage domain and a DNA binding zinc finger domain. ZFNs may be made by fusing the nonspecific DNA cleavage domain of an endonuclease with site-specific DNA binding zinc finger domains. Such nucleases are powerful tools for gene editing and can be assembled to induce double strand breaks (DSBs) site-specifically in genomic DNA. ZFNs allow specific gene disruption because, during DNA repair, the targeted genes can be disrupted via mutagenic non-homologous end joining (NHEJ) or modified via homologous recombination (HR) if a closely related DNA template is supplied.
In some embodiments, the nuclease is a Zinc finger nuclease (ZFN), an RNA-guided nuclease (RGN), or a transcription activator-like effector nuclease (TALEN). In yet other embodiments, the RGN is CRISPR-associated protein 9 (Cas9).
In some embodiments, the one or more vectors are plasmids or viral vectors. In other embodiments, the viral vector is a lentivirus vector, an adenovirus vector, or an adeno-associated virus (AAV) vector.
In some embodiments, the system further comprises one or more additional sgRNA molecules that cause a double-stranded nucleic acid break of one or more additional target nucleic acid molecules. In this aspect, the genome can be cut at several different sites (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 sites) at or near the same time, and vector DNA is inserted into one or more of those sites.
In other embodiments, the system does not require the entire vector that can be integrated to have any homology with the target site.
Yet another aspect of the present invention provides a system for targeted genome engineering, the system comprising one or more vectors comprising: (i) at least one nucleic acid with no significant homology to the target genomic DNA site and that contains a promoter for controlling gene expression; (ii) a primary sgRNA that binds the target nucleic acid molecule at or near the transcription start site of a gene in the target nucleic acid molecule; (iii) a universal secondary sgRNA that binds one or more vectors; and (iv) a nuclease that causes a double-stranded nucleic acid break of the targeted nucleic acid molecules.
In some embodiments, the at least one nucleic acid with no significant homology to the target genomic DNA site and that contains a promoter for controlling gene expression comprises: (i) a nucleic acid promoter followed by a universal secondary sgRNA; (ii) two opposing constitutive promoters separated by a universal secondary sgRNA; or (iii) two inducible promoters in opposite orientations separated by a universal secondary sgRNA.
In some embodiments, the at least one nucleic acid with no significant homology to the target genomic DNA site and that contains a promoter for controlling gene expression; the primary sgRNA that binds the target nucleic acid molecule at or near the transcription start site of a gene in the target nucleic acid molecule; the universal secondary sgRNA that binds one or more vectors; and the nuclease that causes a double-stranded nucleic acid break of the targeted nucleic acid molecules are located on the same or different vectors of the system.
The term “constitutive promoter” as used herein refers to an unregulated promoter that allows for continual transcription of its associated gene. These promoters direct expression in virtually all tissues and are independent of environmental and developmental factors. As their expression is normally not conditioned by endogenous factors, constitutive promoters are usually active across species and even across kingdoms. Examples of constitutive promoters include, but are not limited to, CMV, EF1A, and SV40 promoters.
In some embodiments, the two opposing constitutive promoters have similar activity or are identical to one another. In other embodiments, the two opposing constitutive promoters are non-identical to one another.
The term “inducible promoter” as used herein refers to a regulated promoter that allows for controlled transcription of its associated gene. The performance of inducible promoters is not conditioned to endogenous factors but to environmental conditions and external stimuli that can be artificially controlled. Inducible promoters can be modulated by factors such as light, oxygen levels, heat, cold and wounding, as well as chemicals, steroids, and alcohol. Since some of these factors are difficult to control outside an experimental setting, promoters that respond to chemical compounds, not found naturally in the organism of interest, are useful for genetic engineering. Examples of inducible promoters include, but are not limited to, the tetracycline ON (Tet-On) system, the negative inducible pLac promoter, the negative inducible promoter pBad, heat shock-inducible Hsp70 or Hsp90-derived promoters, and heat shock-inducible Cre and Cas9.
The terms “opposing” and “opposite,” as used herein in connection with the terms “opposing constitutive promoters” or “inducible promoters in opposite orientations,” mean that the promoters are arranged to direct expression in both directions on the vector, which ensures that a promoter is always correctly positioned regardless of the orientation in which the vector nucleic acids integrate into the target nucleic acids.
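The effect of arranging promoters in opposite orientations can be illustrated with a toy Python model. The representation of a cassette as a list of (name, strand) pairs, the element names, and the assumption that the gene to be driven lies downstream on the “+” strand are all hypothetical simplifications introduced for this example.

```python
def flip(cassette):
    """Model integration in the reverse orientation.

    The order of the elements is reversed and each strand is inverted.
    """
    return [(name, "-" if strand == "+" else "+")
            for name, strand in reversed(cassette)]


def drives_downstream(cassette):
    """True if any promoter points toward the gene (assumed on '+')."""
    return any(name.startswith("promoter") and strand == "+"
               for name, strand in cassette)


# A single promoter versus two opposing promoters flanking an sgRNA site.
single = [("promoter", "+")]
opposing = [("promoter_left", "-"), ("sgRNA_site", "+"), ("promoter_right", "+")]

if __name__ == "__main__":
    for label, cassette in (("single", single), ("opposing", opposing)):
        print(label, drives_downstream(cassette), drives_downstream(flip(cassette)))
```

In this model the single-promoter cassette drives the downstream gene in only one of the two integration orientations, whereas the opposing arrangement does so in both.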
In yet other embodiments, each inducible promoter of the two inducible promoters in opposite orientations separated by a universal secondary sgRNA contains multiple TetO repeats and a transferase gene operatively linked to a reverse tetracycline transactivator (rtTA) via a T2A peptide. In some embodiments, the number of TetO repeats of the inducible promoters can be 2, 3, 4, 5, 6, 7, 8, 9, or 10.
In some embodiments, the one or more vectors further comprise a polynucleotide encoding a marker protein. In other embodiments, the marker protein is an antibiotic resistance protein or a fluorescent protein.
In some embodiments, the nucleic acid promoter is heterologous to the promoter of the target nucleic acid molecule.
In some embodiments, the nuclease is a Zinc finger nuclease (ZFN), an RNA-guided nuclease (RGN), or a transcription activator-like effector nuclease (TALEN). In other embodiments, the RGN is CRISPR-associated protein 9 (Cas9).
In some embodiments, the one or more vectors are plasmids or viral vectors. In other embodiments, the viral vector is a lentivirus vector, an adenovirus vector, or an adeno-associated virus (AAV) vector.
Another aspect of the present disclosure provides a method of altering the expression of at least one gene product, the method comprising: (i) introducing into a cell a system of targeted genome engineering as described herein; and (ii) selecting for successfully transfected cells by applying selective pressure; wherein the expression of at least one gene product is reduced or eliminated relative to a cell that has not been transfected with the system of targeted genome engineering.
As used herein, the term “altering expression of at least one gene product” refers to increasing, decreasing, knocking out, or activating the expression of a gene product of a cell using the targeted genome engineering systems described herein, relative to an unaltered cell.
As used herein, the term “gene product” refers to the biochemical material, either RNA or protein, resulting from expression of a gene.
In some embodiments, the method occurs in vivo or in vitro. In other embodiments, the cell is a eukaryotic cell.
The terms “cell,” “cell line,” and “cell culture” include progeny thereof. It is also understood that all progeny may not be precisely identical, such as in DNA content, due to deliberate or inadvertent mutation. Variant progeny that have the same function or biological property of interest, as screened for in the original cell, are included.
Yet another aspect of the present invention provides a method of altering the expression of at least one gene product, the method comprising: (i) introducing into a cell a system for targeted engineering as described herein; and (ii) selecting for successfully transfected cells by applying selective pressure, wherein the expression of at least one gene product is activated relative to a cell that is not transfected with the system for targeted engineering. In some embodiments, the method occurs in vivo or in vitro. In other embodiments, the cell is a eukaryotic cell.
Yet another aspect of the present invention provides a method of identifying the genetic basis of one or more medical symptoms exhibited by a subject, the method comprising: (i) obtaining a biological sample from the subject and isolating a population of cells having a first phenotype from the biological sample; (ii) transfecting a library of sgRNA into the cells; (iii) introducing into the cells a system for targeting genome engineering; (iv) selecting for successfully transfected cells by applying the selective pressure; (v) selecting the cells that survive under the selective pressure; and (vi) determining the genomic loci of the DNA molecule that interacts with the first phenotype and identifying the genetic basis of the one or more medical symptoms exhibited by the subject.
As used herein, the term “selective pressure” refers to the influence exerted by some factor (such as an antibiotic, heat, light, pressure, or a marker protein) on natural selection to promote one group of organisms or cells over another. In the case of antibiotic resistance, applying antibiotics causes a selective pressure by killing susceptible cells, allowing antibiotic-resistant cells to survive and multiply.
In some embodiments, selective pressure is applied by contacting the cells with an antibiotic and selecting the cells that survive. In other embodiments, the antibiotic is puromycin.
In another embodiment, the polynucleotide can encode for a fluorescent protein for easier monitoring of genome integration and expression, and to label or track particular cells.
As used herein, the term “phenotype” refers to any observable characteristic or functional effect that can be measured in an assay such as changes in cell growth, proliferation, morphology, enzyme function, signal transduction, expression patterns, downstream expression patterns, reporter gene activation, hormone release, growth factor release, neurotransmitter release, ligand binding, apoptosis, and product formation. Such assays include, but are not limited to, transformation assays, e.g., changes in proliferation, anchorage dependence, growth factor dependence, foci formation, growth in soft agar, tumor proliferation in nude mice, and tumor vascularization in nude mice; apoptosis assays, e.g., DNA laddering and cell death, expression of genes involved in apoptosis; signal transduction assays, e.g., changes in intracellular calcium, cAMP, cGMP, IP3, changes in hormone and neurotransmitter release; receptor assays, e.g., estrogen receptor and cell growth; growth factor assays, e.g., EPO, hypoxia and erythrocyte colony forming units assays; enzyme product assays, e.g., FAD-2 induced oil desaturation; transcription assays, e.g., reporter gene assays; and protein production assays, e.g., VEGF ELISAs. A candidate gene is “associated with” a selected phenotype if modulation of gene expression of the candidate gene causes a change in the selected phenotype.
The terms “polynucleotide”, “nucleotide”, “nucleotide sequence”, “nucleic acid” and “oligonucleotide” are used interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three dimensional structure, and may perform any function, known or unknown. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short-hairpin RNA (shRNA), single guide RNA (sgRNA), micro-RNA (miRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise one or more modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.
The terms “complementary” or “substantially complementary” as used herein refer to the hybridization or Watson-Crick base pairing between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid to be sequenced or amplified or between a sgRNA and a target nucleic acid molecule. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 60%, 70%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% of the nucleotides of the other strand. Alternatively, substantial complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization occurs when there is at least about 65%, 70%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% complementarity over a stretch of at least 14 to 25 nucleotides.
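As a simplified illustration of the percentages recited above, the following Python sketch computes the fraction of positions that form Watson-Crick pairs between two strands. It assumes strands of equal length, both written 5' to 3', aligned antiparallel without insertions or deletions; the optimal alignment with gaps contemplated in the definition is not implemented, and the name `percent_complementarity` is hypothetical.

```python
# Watson-Crick pairs, including A-U pairing for RNA strands.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_complementarity(strand_a, strand_b):
    """Percentage of paired positions in an ungapped antiparallel alignment.

    Both strands are given 5'->3'; strand_b is reversed so that its 3' end
    is compared with the 5' end of strand_a.
    """
    a = strand_a.upper()
    b = strand_b.upper()[::-1]
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((x, y) in PAIRS for x, y in zip(a, b))
    return 100.0 * paired / len(a)


if __name__ == "__main__":
    print(percent_complementarity("ACGT", "ACGT"))  # self-complementary
    print(percent_complementarity("AAAA", "TTTA"))  # one mismatch in four
```

Under this simplified measure, two strands would be compared against a chosen threshold, for example 80%, to decide whether they are treated as substantially complementary.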
As used herein, “expression” refers to the process by which a polynucleotide is transcribed from a DNA template (such as into an sgRNA or mRNA) and/or the process by which a transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be collectively referred to as “gene product.” If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell. The term “capable of expression” means the vector has all the components necessary to express the sgRNA or the heterologous gene product, as described below and known to one of ordinary skill in the art. The polynucleotide of the first vector can encode a protein to tag the cells it is integrated into, to knock out a gene located within the DNA target of interest, to introduce a mutant version of the gene located within the target DNA of interest, to express inhibitory RNAs, or any polynucleotide of interest.
As used herein, the term “subject” refers to any animal classified as a mammal, including humans, mice, rats, domestic and farm animals, non-human primates, and zoo, sport or pet animals, such as dogs, horses, cats, and cows.
As used herein, the terms “library” or “library of sgRNA” refers to a plurality of sgRNAs that are capable of targeting a plurality of genomic loci in a population of cells.
Several aspects of the disclosure relate to vector systems comprising one or more vectors, or vectors as such. Vectors can be designed for expression of RGNs and polynucleotides (e.g. nucleic acid transcripts, proteins, or enzymes) in prokaryotic or eukaryotic cells. For example, RGN or polynucleotides can be expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus expression vectors), yeast cells, or mammalian cells. Suitable host cells are discussed further in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.
A “vector” is a replicon, such as a plasmid, phage, or cosmid, to which another nucleic acid segment may be attached so as to bring about the replication of the attached segment. A vector is capable of transferring polynucleotides (e.g. gene sequences) to target cells (e.g., bacterial plasmid vectors, particulate carriers and liposomes).
Typically, the terms “vector construct,” “expression vector,” “gene expression vector,” “gene delivery vector,” “gene transfer vector,” “transfer vector,” and “expression cassette” all refer to an assembly which is capable of directing the expression of a sequence or gene of interest. Thus, the terms include cloning and expression vehicles.
As used herein, a “promoter” may refer to any nucleic acid sequence that regulates the initiation of transcription for a particular polypeptide-encoding nucleic acid under its control. A promoter minimally includes the genetic elements necessary for the initiation of transcription (e.g., RNA polymerase III-mediated transcription), and may further include one or more genetic regulatory elements that serve to specify the prerequisite conditions for transcriptional initiation.
The term “regulatory element” as used herein includes promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g. transcription termination signals, such as polyadenylation signals and poly-U sequences). Such regulatory elements are described, for example, in Goeddel, (1990). Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). A tissue-specific promoter may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, specific organs (e.g. liver, pancreas), or particular cell types (e.g. lymphocytes). Regulatory elements may also direct expression in a temporal-dependent manner, such as in a cell-cycle dependent or developmental stage-dependent manner, which may or may not also be tissue or cell-type specific. In some embodiments, a vector comprises one or more pol III promoter, one or more pol II promoters, one or more pol I promoters, or combinations thereof. Examples of pol III promoters include, but are not limited to, U6 and H1 promoters.
Examples of pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer), the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerate kinase (PGK) promoter, and the EF1α promoter. Also encompassed by the term “regulatory element” are enhancer elements, such as WPRE; CMV enhancers; the R-U5′ segment in LTR of HTLV-I (Mol. Cell. Biol., Vol. 8(1), p. 466-472, 1988); SV40 enhancer; and the intron sequence between exons 2 and 3 of rabbit β-globin (Proc. Natl. Acad. Sci. USA., Vol. 78(3), p. 1527-31, 1981). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression desired, etc. A vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., clustered regularly interspaced short palindromic repeats (CRISPR) transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.).
A promoter may be encoded by the endogenous genome of a host cell, or it may be introduced as part of a recombinantly engineered polynucleotide. A promoter sequence may be taken from one host species and used to drive expression of a gene in a host cell of a different species. A promoter sequence may also be artificially designed for a particular mode of expression in a particular species, through random mutation or rational design. In recombinant engineering applications, specific promoters are used to express a recombinant gene under a desired set of physiological or temporal conditions or to modulate the amount of expression of a recombinant nucleic acid.
Methods for transforming a host cell with an expression vector may differ depending upon the species of the desired host cell. For example, yeast cells may be transformed by lithium acetate treatment (which may further include carrier DNA and PEG treatment) or electroporation. These methods are included for illustrative purposes and are in no way intended to be limiting or comprehensive. Routine experimentation through means well known in the art may be used to determine whether a particular expression vector or transformation method is suited for a given host cell. Furthermore, reagents and vectors suitable for many different host microorganisms are commercially available and/or well known in the art.
Many suitable expression vectors and features thereof are known in the art; for example, various vectors and techniques are illustrated in Current Protocols in Molecular Biology. Expression vectors may contain, without limitation, a centromeric (CEN) sequence, an autonomous replication sequence (ARS), a promoter, an origin of replication, and a marker gene (e.g., auxotrophic, antibiotic, or other selectable markers). Examples of expression vectors may include plasmids, yeast artificial chromosomes, 2 μm plasmids, yeast integrative plasmids, yeast replicative plasmids, shuttle vectors, and episomal plasmids.
Vectors may be introduced and propagated in a prokaryote. In some embodiments, a prokaryote is used to amplify copies of a vector to be introduced into a eukaryotic cell or as an intermediate vector in the production of a vector to be introduced into a eukaryotic cell (e.g. amplifying a plasmid as part of a viral vector packaging system). In some embodiments, a prokaryote is used to amplify copies of a vector and express one or more nucleic acids, such as to provide a source of one or more proteins for delivery to a host cell or host organism. Expression of proteins in prokaryotes is most often carried out in Escherichia coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, such as to the amino terminus of the recombinant protein. Such fusion vectors may serve one or more purposes, such as: (i) to increase expression of recombinant protein; (ii) to increase the solubility of the recombinant protein; and (iii) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Enzymes that cleave at such sites, and their cognate recognition sequences, include Factor Xa, thrombin, and enterokinase. Example fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith and Johnson, 1988. Gene 67: 31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) that fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.
Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amrann et al., (1988) Gene 69:301-315) and pET 11d (Studier et al., GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990) 60-89).
In some embodiments, a vector is a yeast expression vector. Examples of vectors for expression in yeast Saccharomyces cerivisae include pYepSec1 (Baldari, et al., 1987. EMBO J. 6: 229-234), pMFa (Kuijan and Herskowitz, 1982. Cell 30: 933-943), pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (Invitrogen Corporation, San Diego, Calif.), and picZ (InVitrogen Corp, San Diego, Calif.).
In some embodiments, a vector drives protein expression in insect cells using baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., SF9 cells) include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3: 2156-2165) and the pVL series (Lucklow and Summers, 1989. Virology 170: 31-39).
In some embodiments, a vector is capable of driving expression of one or more sequences in mammalian cells using a mammalian expression vector. Examples of mammalian expression vectors include, but are not limited to, pCDM8 (Seed, 1987. Nature 329: 840) and pMT2PC (Kaufman, et al., 1987. EMBO J. 6: 187-195). When used in mammalian cells, the expression vector's control functions are typically provided by one or more regulatory elements. For example, commonly used promoters are derived from polyoma, adenovirus 2, cytomegalovirus, simian virus 40, and others disclosed herein and known in the art. For other suitable expression systems for both prokaryotic and eukaryotic cells see, e.g., Chapters 16 and 17 of Sambrook, et al., MOLECULAR CLONING: A LABORATORY MANUAL. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989.
In some embodiments, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert, et al., 1987. Genes Dev. 1: 268-277), lymphoid-specific promoters (Calame and Eaton, 1988. Adv. Immunol. 43: 235-275), in particular promoters of T cell receptors (Winoto and Baltimore, 1989. EMBO J. 8: 729-733) and immunoglobulins (Banerji, et al., 1983. Cell 33: 729-740; Queen and Baltimore, 1983. Cell 33: 741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle, 1989. Proc. Natl. Acad. Sci. USA 86: 5473-5477), pancreas-specific promoters (Edlund, et al., 1985. Science 230: 912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, e.g., the murine hox promoters (Kessel and Gruss, 1990. Science 249: 374-379) and the α-fetoprotein promoter (Camper and Tilghman, 1989. Genes Dev. 3: 537-546).
The practice of the present invention employs, unless otherwise indicated, conventional techniques of immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics and recombinant DNA, which are within the skill of the art. See Sambrook, Fritsch and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition (1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (F. M. Ausubel, et al. eds., (1987)); the series METHODS IN ENZYMOLOGY (Academic Press, Inc.): PCR 2: A PRACTICAL APPROACH (M. J. MacPherson, B. D. Hames and G. R. Taylor eds. (1995)), Harlow and Lane, eds. (1988) ANTIBODIES, A LABORATORY MANUAL, and ANIMAL CELL CULTURE (R. I. Freshney, ed. (1987)).
Conventional and standard techniques may be used for recombinant DNA molecule, protein, and antibody production, as well as for tissue culture and cell transformation. Enzymatic reactions and purification techniques are typically performed according to the manufacturer's specifications or as commonly accomplished in the art using conventional procedures known in the art, or as described herein. Unless specific definitions are provided, the nomenclature utilized in connection with, and the laboratory procedures and techniques of analytical chemistry, synthetic organic chemistry, and medicinal and pharmaceutical chemistry described herein are those well-known and commonly used in the art. Standard techniques may be used for chemical syntheses, chemical analyses, pharmaceutical preparation, formulation, and delivery, and treatment of patients.
Further, the terminology used herein is for the purpose of exemplifying particular embodiments only and is not intended to limit the scope of the invention as disclosed herein. Any method and material similar or equivalent to those described herein can be used in the practice of the invention as disclosed herein and only exemplary methods, devices, and materials are described herein.
The invention now will be exemplified for the benefit of the artisan by the following non-limiting examples that depict some of the embodiments by and in which the invention can be practiced.
The traditional approach to integrate heterologous DNA at target genomic loci using homologous recombination of donor vectors is shown in the schematic of
Cell Culture and Transfection
HEK293T and HCT116 cells were obtained from the American Type Culture Collection (ATCC) and were maintained in DMEM supplemented with 10% fetal bovine serum and 1% penicillin/streptomycin at 37° C. with 5% CO2. HEK293T and HCT116 cells were transfected with Lipofectamine 2000 (Invitrogen) according to manufacturer's instructions. Transfection efficiency in 293T cells was routinely higher than 80% whereas transfection efficiency in HCT116 cells was ~55% as determined by FACS following delivery of a control GFP expression plasmid. The antibiotics used for selection of clonal populations of HCT116 cells were Puromycin 0.5 μg/ml, Hygromycin 100 μg/ml, Blasticidin 10 μg/ml and Neomycin 1 mg/ml.
Plasmids and Oligonucleotides
The plasmids encoding spCas9 and sgRNA were obtained from Addgene (Plasmids #41815 and #47108). The backbone for the transfer vectors was synthesized by IDT Technologies as gene blocks and cloned into a pCDNA3.1 backbone. Oligonucleotides for construction of sgRNAs were obtained from IDT Technologies, hybridized, phosphorylated and cloned in the sgRNA and transfer vectors using BbsI sites as previously described in Perez-Pinera et al., Nat Methods 10, 973-976, 2013. The target sequences of the gRNAs are provided in Table 2.
PCR
Seventy-two hours after transfection genomic DNA was isolated using DNeasy Blood & Tissue Kit (Qiagen). PCRs were performed using KAPA2G Robust PCR kits. A typical 25 μL reaction used 20-100 ng of genomic DNA, Buffer A (5 μL), Enhancer (5 μL), dNTPs (0.5 μL), 10 μM forward primer (1.25 μL), 10 μM reverse primer (1.25 μL), KAPA2G Robust DNA Polymerase (0.5 U) and water (up to 25 μL). The DNA sequences of the primers for each target are provided in Table 4. The PCR products were visualized in 2% agarose gels and images were captured using a ChemiDoc-It2 (UVP).
Surveyor Assay
Seventy-two hours after transfection genomic DNA was isolated using DNeasy Blood & Tissue Kit (Qiagen). The region surrounding the RGN target site was amplified by PCR with the AccuPrime PCR kit (Invitrogen) and 50-200 ng of genomic DNA as template with primers provided in Table 3. The PCR products were melted and reannealed using the temperature program: 95° C. for 180 s, 85° C. for 20 s, 75° C. for 20 s, 65° C. for 20 s, 55° C. for 20 s, 45° C. for 20 s, 35° C. for 20 s and 25° C. for 20 s with a 0.1° C./s decrease rate in between steps. Eighteen microliters of the reannealed duplex was combined with 1 μl of the Surveyor nuclease and 1 μl of enhancer solution (Integrated DNA Technologies), incubated at 42° C. for 60 min and then separated on a 10% TBE polyacrylamide gel. The gels were stained with ethidium bromide and visualized using a ChemiDoc-It2 (UVP). Quantification was performed using methods previously described in Guschin et al., Methods Mol Biol 649, 247-256, 2010.
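The quantification step can be illustrated numerically. The following is a minimal sketch, assuming the widely used Surveyor densitometry estimate in which the cleaved fraction f_cut = (b + c)/(a + b + c) is converted to an indel percentage as 100 × (1 − √(1 − f_cut)), where a is the intensity of the uncleaved band and b and c are the intensities of the two cleavage products. The function name and the example band intensities are hypothetical and are not taken from the experiments described here.

```python
import math


def surveyor_indel_percent(uncut, cut1, cut2):
    """Estimate % indels from band intensities on a Surveyor gel.

    uncut      -- intensity of the full-length (uncleaved) band
    cut1, cut2 -- intensities of the two cleavage products
    """
    total = uncut + cut1 + cut2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    # Fraction of the PCR product that was cleaved by the nuclease.
    f_cut = (cut1 + cut2) / total
    # Heteroduplexes form by random reannealing, hence the square root.
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))


# Hypothetical densitometry values, for illustration only.
print(round(surveyor_indel_percent(75.0, 15.0, 10.0), 2))
```

The square-root correction reflects the fact that a mutant strand is only detected when it reanneals with a strand of different sequence, so the cleaved fraction underestimates nothing but is not itself the allele frequency.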
Western Blot
Cells were lysed with loading buffer, boiled for 5 min, loaded in NuPAGE® Novex 4-12% Bis-Tris Gel polyacrylamide gels and transferred to nitrocellulose membranes. Non-specific antibody binding was blocked with 50 mM Tris/150 mM NaCl/0.1% Tween-20 (TBS-T) with 5% nonfat milk for 30 min. The membranes were incubated with primary antibodies anti-GAPDH (Cell Signaling Technology) or anti-CTTN (Cell Signaling Technology) in 5% BSA or 5% nonfat milk in TBS-T diluted 1:1,000 for 60 min and the membranes were washed with TBS-T for 30 min. Membranes labeled with primary antibodies were incubated with anti-rabbit HRP-conjugated antibody (Sigma-Aldrich) diluted 1:10,000 for 30 min, and washed with TBS-T for 30 minutes. Membranes were visualized using the Clarity™ ECL Western Blotting Substrate (Bio-Rad) and images were captured using a ChemiDoc-It2 (UVP).
Quantification of Integration Efficiency
HCT116 cells were transfected with individual RGNs targeting either CTTN exon 8 or HLA-DRA, as well as Cas9, one universal RGN, and either one or two transfer vectors with expression cassettes conferring resistance to puromycin or puromycin and hygromycin. A total of 450,000 cells were transfected using 100 ng of each plasmid. The transfection efficiency was ~55% as determined by FACS following delivery of a control GFP expression plasmid. Three days post transfection, 90% of cells from each well were harvested and replated into 10 cm dishes for selection with the appropriate antibiotics. Cells with monoallelic modifications were selected with puromycin whereas cells with biallelic modifications were selected with puromycin and hygromycin. Media and antibiotics were replenished every three days. Visible colonies appeared after approximately one week. The number of clones for each transfection was counted and integration efficiency was determined as the ratio of the number of clonal cells derived from each transfection relative to the number of alleles modified by each specific sgRNA, as measured in experimental control samples using the surveyor assay.
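The ratio described above can be sketched as a short calculation. This is one possible reading of the normalization, assuming that colonies are first expressed per transfected cell and then divided by the fraction of alleles modified by the sgRNA in the Surveyor control. The function name and the example colony count and indel fraction are hypothetical; only the figure of 450,000 transfected cells comes from the text.

```python
def integration_efficiency(colonies, cells_transfected, indel_fraction):
    """Colonies per transfected cell, normalized to sgRNA activity.

    colonies          -- resistant colonies counted after selection
    cells_transfected -- number of cells in the transfection
    indel_fraction    -- fraction of alleles modified by the sgRNA alone,
                         as estimated by the Surveyor assay (0 < f <= 1)
    """
    if cells_transfected <= 0:
        raise ValueError("cells_transfected must be positive")
    if not 0 < indel_fraction <= 1:
        raise ValueError("indel_fraction must be in (0, 1]")
    # Overall efficiency: colonies obtained per cell transfected.
    overall = colonies / cells_transfected
    # Adjust for how often the sgRNA actually cut its target.
    return overall / indel_fraction


# Hypothetical example: 90 colonies, 450,000 cells, 20% indels.
print(integration_efficiency(90, 450_000, 0.20))
```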
Results
The first version of a genomic DNA integration system relied upon a sgRNA capable of introducing DSBs at genetic loci of interest and a vector where the sgRNA target site was cloned upstream of a GFP transgene. Single guide RNAs were validated using the Surveyor Assay three days after transfection. No gene modification was detected in control samples, however, co-transfection of Cas9 and sgRNA effectively introduced insertions and deletions in all the target sites analyzed in these studies (
Multiplex integration was evaluated by comprehensively characterizing genomic incorporation of two transfer vectors intended for two distinct loci: one that expresses GFP and contains a GAPDH RGN target sequence, and another that expresses RFP but contains an ACTB RGN target sequence (
These findings show that DSBs in genomes can avidly capture linear DNA present in the nucleus regardless of homology whereas circular vectors are not efficiently integrated at DSBs. Since transfer vectors linearized with TALENs are also effectively integrated at DSBs generated with RGNs (
Unlike HR-based genomic integration systems, large size vectors can be fully integrated in genomic DNA very efficiently (
While multiplexed integration of a single vector at multiple loci has broad applications for synthetic biology, integration of multiple vectors at a single locus is particularly interesting for cell line engineering purposes, such as rapid gene knock out. By simply cotransfecting Cas9, a sgRNA targeting the CTTN locus and a universal sgRNA targeting two separate transfer vectors that encode puromycin or hygromycin resistance expression cassettes, one vector was successfully integrated into each allele of the CTTN gene (
Overall, the timeframe from sgRNA design to HCT116 clonal cell verification and expansion was 2-3 weeks with minimal resources and screening effort required. Cell lines were generated with monoallelic or biallelic modifications at 4 loci tested, including CTTN exon 8 and HLA-DRA (
The percent of total alleles modified by NAVI in diploid cells is 62.5% following selection with a single antibiotic, with 90% of clones containing at least a monoallelic modification. Under dual antibiotic selection, 75.4% of the clones contained biallelic modification and 98.2% of clones had at least one allele modified (Table 5).
Following selection in 10-cm plates with the appropriate antibiotic, total colonies were counted and divided by total cells transfected to obtain the overall editing efficiency of NAVI. This value was then adjusted to account for overall sgRNA editing efficiency, as measured by surveyor nuclease assay. This quantification was performed at 2 different loci using either a single or two antibiotics for selection.
Data collected from integration-specific PCR was used to determine allelic modification rates among clonal cell populations isolated by selection. The total number of clones from each genotype (+/+, +/−, and −/−) was determined for each of four genomic targets analyzed. The frequency of allelic modification (total number of alleles modified divided by total number of alleles) was calculated for clones selected using one or two antibiotics.
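The frequency of allelic modification defined above can be written out for a diploid cell line, where each +/+ clone contributes two modified alleles and each +/− clone contributes one. The following is a minimal sketch; the function names are illustrative and the genotype counts in the example are hypothetical, not values from Table 5.

```python
def allelic_modification_frequency(biallelic, monoallelic, unmodified):
    """Modified alleles divided by total alleles in diploid clones.

    biallelic   -- number of +/+ clones (both alleles modified)
    monoallelic -- number of +/- clones (one allele modified)
    unmodified  -- number of -/- clones (no allele modified)
    """
    clones = biallelic + monoallelic + unmodified
    if clones == 0:
        raise ValueError("at least one clone is required")
    modified_alleles = 2 * biallelic + monoallelic
    return modified_alleles / (2 * clones)


def fraction_with_any_modification(biallelic, monoallelic, unmodified):
    """Fraction of clones carrying at least one modified allele."""
    clones = biallelic + monoallelic + unmodified
    if clones == 0:
        raise ValueError("at least one clone is required")
    return (biallelic + monoallelic) / clones


# Hypothetical counts for 40 clones, for illustration only.
print(allelic_modification_frequency(14, 22, 4))
print(fraction_with_any_modification(14, 22, 4))
```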
A limitation for multiplexing applications using NAVI is the potential for off-target integration. Since NAVI relies on linearized DNA integrating at DSBs, naturally occurring DSBs or DSBs derived from off-target binding of the sgRNAs become sites for potential unintended integration as demonstrated in
In HCT116 cells, up to 4 antibiotics have been successfully used for rapid isolation of cell lines with dual gene knock-outs, however, only 10% of the clones contained the desired mutations simultaneously (
Mutations can often be found at the junction of genomic DNA with the integrated transfer vector, suggesting that the integration mechanism involves an error-prone DNA repair pathway. Genomic DNA from pooled populations of 293 cells transfected with RGNs targeting GAPDH or ACTB and the corresponding transfer vectors was isolated and the regions flanking plasmid integration in genomic DNA were amplified by PCR. The PCR products corresponding to integration events in plus or minus orientation were cloned and sequenced. The sequencing results identified a wide range of mutations at the junction of genomic DNA with the vector, suggesting that a mutagenic DNA repair pathway mediates integration of the vector into the target site (
While mutagenesis generated via NHEJ remains a highly efficient and effective strategy for select applications, the insertion of large or complex sequences and the ability to easily select for modified cells often necessitates the use of homology directed repair (HDR) based strategies. The time-consuming construction of donor vectors for HDR gene editing is often technically challenging, costly, and leads to poor modification rates. By using customized single-stranded oligonucleotides (ssODN) the efficiency of gene editing increases, but the scale of possible genetic changes is greatly diminished. Additionally, as both donor vectors and ssODN require two discontiguous regions of homology, neither is well suited to multiplexing. Nuclease-Assisted Vector Integration (NAVI) is a unique strategy to bypass HDR and the need for customized donor vectors required for traditional genome editing technologies.
Multiplexed genome editing via nuclease-assisted vector integration presents a unique opportunity for genome-scale engineering in mammalian cells. The results demonstrate that NAVI is capable of rapidly remodeling mammalian genomes by targeted insertion of large expression cassettes in one single step. NAVI eliminates the need for homologous sequence within donor vectors. While NAVI sacrifices single base pair resolution, it is capable of achieving predictable and robust patterns of integration into native genomes. Virtually any vector may be integrated at a target site in the genome without cloning, setting it apart from all prior integration systems. Importantly, facile integration of large constructs up to 50 kbp, including an entire phage genome, was demonstrated, and no upper size limit was identified. Finally, through multiplexed NAVI, a novel system for targeted gene disruption was demonstrated, in which screening time is greatly reduced via positive selection. In summary, this novel approach to gene editing extends the capacity of structural and functional mammalian genome engineering for applications in synthetic biology and creates new opportunities for developing more efficient gene therapies.
This Example describes a protocol for activation of ASCL1 expression using RGNs consisting of S. pyogenes Cas9 and single guide RNAs (
The following protocol for designing, assembling and testing RGN transcription factors assumes that a dCas9-transcriptional activator has already been obtained. To aid the identification of a suitable activation system, Table 6 summarizes the different dCas9-transcriptional activators compatible with the gene activation systems described herein.
Construction of sgRNA Expression Plasmids
1. An appropriate sgRNA vector should be chosen prior to guide design. Examples of sgRNA vectors for cloning and expression of custom sgRNAs include, but are not limited to, those described in Table 7.
Dual expression of Cas9 and sgRNA from a single plasmid is an alternative to a two plasmid system. This protocol uses pSPgRNA (Addgene #47108), which includes two BbsI/BpiI sites interspaced between a human U6 promoter and the sgRNA loop for cloning of oligonucleotides (
2. Oligonucleotides for sgRNA construction. Target selection: The identification of optimal target sites for activation of gene expression remains, essentially, an empirical process. It has been shown that the region comprising −400 to −50 bp at the 5′ end of the transcriptional start site (TSS) is optimal. Since the TSS is clearly annotated in most genome browsers, the sequence of the gene of interest is imported into DNA analysis software and used to identify potential target sites. Benchling, a freely available web-based DNA analysis platform that incorporates a “Genome Engineering” tool to identify all possible sgRNAs within any sequence specified by the user, can be used. Benchling provides on-target and off-target scores associated with each target site. Off-target changes in gene expression are uncommon when using multiple sgRNAs to activate gene expression, since all target sites must be found simultaneously near the TSS of the off-target gene. However, since second-generation systems for gene activation require one single sgRNA, it is important to identify high quality sgRNAs with favorable off-target scores. For each sgRNA, Benchling provides a detailed list of potential off-target sites that can be used for biased detection of off-target gene activation.
The target sequences chosen to activate ASCL1 gene expression are: 5′-GCTGGGTGTCCCATTGAAA-3′ (SEQ ID NO: 56); 5′-CAGCCGCTCGCTGCAGCAG-3′ (SEQ ID NO: 57); 5′-TGGAGAGTTTGCAAGGAGC-3′ (SEQ ID NO: 58); 5′-GTTTATTCAGCCGGGAGTC-3′ (SEQ ID NO: 59). For each target sequence, a sense oligonucleotide is generated in the format: 5′-CACC G NNNNNNNNNNNNNNNNNNNN-3′ (SEQ ID NO: 60), where N20 represents the 20 bases of the genomic DNA at the 5′ end of the PAM. The number of nucleotides in the sgRNA complementary with the target site can range between 17 and 20 bp. In fact, it has been demonstrated that sgRNAs with 17 or 18 complementary nucleotides efficiently guide S. pyogenes Cas9 to the target site where it introduces double strand breaks with improved specificity. The first four bases are complementary to the sgRNA vector overhangs, while the fifth base is G in order to initiate transcription of RNA from the upstream U6 promoter. A second oligonucleotide, representing the antisense target sequence, is generated in the format: 5′-AAAC Y20 C-3′ (SEQ ID NO: 61). Here, AAAC are vector complementing overhangs, Y20 represents the reverse complement of the target sequence, and the last C complements the leading G of the sense oligonucleotide (
The sequences of the oligonucleotides for assembly of sgRNAs that can target the ASCL1 promoter are:
3. Nuclease-free Molecular biology grade (MBG) water.
4. Tris Buffered Saline (TBS), 50 mM Tris pH 7.4 and 150 mM NaCl.
5. Restriction endonuclease BbsI/BpiI. There are multiple commercial sources for BbsI/BpiI. Some formulations of BbsI/BpiI require storage at −80° C., and the repeated freeze-thaw cycles that occur with frequent use result in decreased enzymatic activity and undesired background during cloning. Formulations of BbsI/BpiI that can be stored at −20° C. are therefore preferable.
6. T4 Polynucleotide Kinase (PNK).
7. T4 DNA ligase and T4 DNA Ligase Buffer with ATP. T4 DNA ligase buffer typically contains 10 mM dithiothreitol, which is not stable through repeated freeze-thaw cycles. Single use aliquots of T4 buffer can be prepared.
8. Transformation-competent E. coli. Any chemically competent cells or electro-competent cells can be used, such as HIT Competent Cells-DH5α. These chemically competent cells can be transformed very efficiently without heat-shock by mixing 1.5 μL of the ligation reaction with 30 μL of competent cells followed by incubation at 4° C. for 1-10 min and plating. When using this short protocol, plating on plates prewarmed to 37° C. helps ensure transformation efficiency. If the transformation efficiency is too low, addition of 100 μL of SOC broth and incubation at 37° C. with shaking for 10 min should yield hundreds to thousands of colonies.
9. LB-Agar plates containing 100 μg/mL carbenicillin for bacterial culture.
10. KAPA2G Robust PCR Kit (KAPA Biosystems) and 10 mM dNTP mix.
11. Sequencing and colony PCR primer, M13 Forward: 5′-TGTAAAACGACGGCCAGT-3′ (SEQ ID NO:70).
12. Ethidium bromide, 10 mg/mL.
13. Electrophoresis Buffer (TAE) 40 mM Tris pH 7.2, 20 mM Acetate, and 1 mM EDTA.
14. Agarose.
15. LB broth containing 100 μg/mL carbenicillin.
16. Qiagen Spin Miniprep Kit.
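The sense and antisense oligonucleotide formats given in item 2 above (5′-CACC G N-3′ and 5′-AAAC Y C-3′, where Y is the reverse complement of the target N) can be generated programmatically. The following is a minimal sketch, assuming targets of 17 to 20 nucleotides written 5′ to 3′ without the PAM; the function names are illustrative and the output is derived from the stated formats rather than copied from the oligonucleotide table.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_sgrna_oligos(target):
    """Build the sense/antisense cloning oligos for one target site.

    target -- 17-20 nt protospacer, 5'->3', without the PAM.
    Returns (sense, antisense), both written 5'->3'.
    """
    target = target.upper()
    if not 17 <= len(target) <= 20:
        raise ValueError("target must be 17-20 nucleotides long")
    if set(target) - set("ACGT"):
        raise ValueError("target may only contain A, C, G and T")
    # CACC matches the vector overhang; the extra G starts U6 transcription.
    sense = "CACC" + "G" + target
    # AAAC matches the other overhang; the final C pairs with the extra G.
    antisense = "AAAC" + reverse_complement(target) + "C"
    return sense, antisense


# Target sequence of SEQ ID NO: 56 as listed in the text.
sense, antisense = design_sgrna_oligos("GCTGGGTGTCCCATTGAAA")
print(sense)
print(antisense)
```

After annealing, the two oligonucleotides pair over the G-plus-target region and leave the four-base CACC and AAAC overhangs single-stranded for ligation into the BbsI/BpiI-digested vector.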
Activation of Target Gene Expression
1. Mammalian cell line, such as HEK293T.
2. Phosphate-buffered saline (PBS), 8 mM Na2HPO4, 2 mM KH2PO4 pH 7.4, 137 mM NaCl and 2.7 mM KCl.
3. 0.25% Trypsin-EDTA.
4. Complete mammalian cell culture medium appropriate for the chosen cell line, such as DMEM supplemented with 10% Fetal Bovine Serum (FBS) and 1% penicillin/streptomycin.
5. Lipofectamine 2000 (Thermo Fisher Scientific) or other suitable transfection reagent(s).
6. Opti-MEM (Thermo Fisher Scientific) reduced serum media.
7. Twenty-four well tissue culture-treated plates.
8. Transfection plasmids: pSPgRNA(s) with target sequence. pcDNA-dCas9-VP64 (Addgene#47107) or other suitable dCas9 transcriptional activator expression vector. pMAX-GFP (Amaxa) or other suitable reporter plasmid for measuring transfection efficiency.
Analysis of mRNA Expression
1. 0.25% Trypsin-EDTA.
2. PBS.
3. QIAshredder (Qiagen).
4. RNeasy Plus RNA isolation kit (Qiagen).
5. qScript cDNA SuperMix (Quanta Biosciences).
6. RNase/DNase-free water.
7. PerfeCTa® SYBR® Green FastMix (Quanta Biosciences).
8. Oligonucleotides for qPCR. Using high quality primers helps ensure reproducible qPCR results. Repeated freeze-thaw cycles can alter primer binding to the template. Upon receipt, the primers are resuspended in MBG water and single-use aliquots are prepared and stored at −80° C. Multiple oligonucleotides are often designed and tested to find a suitable primer combination that is specific and amplifies the target transcript with 90-110% efficiency. Many design tools, such as Primer3Plus, are freely available as stand-alone or web-based applications. qPCR is performed using fast cycling two-step protocols with amplicons between 100 and 150 bp long. One consideration for primer design is to use primers that bind different exons separated, if possible, by several kilobases. This will ensure that any residual genomic DNA that might be present in the RNA sample will not be amplified during the PCR reaction.
9. CFX96 Real-Time PCR Detection System (Bio-Rad).
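The 90-110% efficiency criterion in item 8 is usually checked with a standard curve: Ct is regressed on log10 of relative template input, and efficiency is computed from the slope as 10^(−1/slope) − 1. The following is a minimal sketch of that standard calculation, assuming a tenfold dilution series; the function name and the Ct values in the example are hypothetical.

```python
import math


def standard_curve_efficiency(relative_inputs, ct_values):
    """Return (slope, efficiency %) from a qPCR dilution series.

    relative_inputs -- relative template amounts (e.g. 1, 0.1, 0.01)
    ct_values       -- measured Ct for each dilution, same order
    """
    if len(relative_inputs) != len(ct_values) or len(ct_values) < 2:
        raise ValueError("need at least two matched points")
    xs = [math.log10(q) for q in relative_inputs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    # Ordinary least-squares slope of Ct versus log10(input).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    # A slope near -3.32 corresponds to doubling every cycle (100%).
    efficiency = (10 ** (-1.0 / slope) - 1.0) * 100.0
    return slope, efficiency


# Hypothetical tenfold dilution series, for illustration only.
slope, eff = standard_curve_efficiency(
    [1, 0.1, 0.01, 0.001], [18.0, 21.32, 24.64, 27.96]
)
print(round(slope, 2), round(eff, 1))
```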
Design and construction of sgRNA Expression Plasmids
The procedure utilized for generating sgRNA vectors accomplishes plasmid digestion, oligonucleotide phosphorylation and ligation in a single reaction without DNA purification steps. This is a low cost and highly efficient procedure that can be completed in less than two hours from annealing to transformation.
1. Design and synthesize/order oligonucleotides to target the regions of the promoter proximal to the TSS of the target transcript. Stocks of each oligonucleotide prepared at 100 μM in nuclease-free molecular biology grade water, can be stored frozen for extended periods.
2. Combine 1 μL of each sense and antisense oligonucleotide with 98 μL of TBS in a PCR tube. Anneal the oligonucleotide mix by incubation at 95° C. for 5 min, followed by 25° C. for 3 min.
3. Mix 1 μL of annealed and diluted oligonucleotides with 170 ng sgRNA vector, 2 μL 10×T4 ligase buffer, 1 μL of T4 ligase, 1 μL BbsI/BpiI, 1 μL T4 polynucleotide kinase (PNK), and MBG water to a final reaction volume of 20 μL. The sgRNA vector backbone is simultaneously digested and ligated with the annealed, phosphorylated oligonucleotides in a single reaction with the following thermocycling program: (a) 37° C., 5 min; (b) 16° C., 10 min; repeat (a) and (b) for a total of three cycles.
4. Transform ligated plasmid by mixing 1.5 μL of the reaction product with 30 μL of competent E. coli, spread onto prewarmed LB agar containing 100 μg/mL carbenicillin, and incubate overnight at 37° C.
5. Correct ligation is ensured by analyzing four transformants per plate using colony PCR with KAPA2G Robust PCR Kits. 25 μL reactions containing MBG water (11.9 μL), 5×KAPA2G Buffer (5.0 μL), 5× Enhancer (5.0 μL), 10 mM dNTP mix (0.50 μL), 10 μM M13 Forward primer (1.25 μL), 10 μM Reverse primer (antisense cloning oligonucleotide) (1.25 μL), and 5 U/μL KAPA2G Robust (0.10 μL) are used for screening. With a pipette tip, scrape one colony from the plate, transfer to the PCR reaction and, immediately, to a second PCR tube containing LB broth. The PCR reactions are performed in a thermocycler according to manufacturer's instructions and the PCR products analyzed in 2% agarose gels containing 0.1-0.2 μg/mL ethidium bromide. The expected size of the correct PCR product is ~330 bp.
6. One colony, verified by PCR, is grown overnight in 5 mL of LB broth with 100 μg/mL carbenicillin.
7. The plasmid DNA from the bacterial culture is purified using a plasmid purification kit such as the Qiagen Spin Miniprep Kit and the construct is verified by DNA sequencing with M13 Forward primer.
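The per-reaction volumes listed for the colony PCR in step 5 sum to 25 μL and can be scaled into a master mix when several colonies are screened. The following is a minimal sketch; the 10% pipetting overage and the function name are assumptions added for illustration and are not part of the protocol.

```python
# Per-reaction volumes (microliters) as listed in step 5.
COLONY_PCR_UL = {
    "MBG water": 11.9,
    "5x KAPA2G Buffer": 5.0,
    "5x Enhancer": 5.0,
    "10 mM dNTP mix": 0.5,
    "10 uM M13 Forward primer": 1.25,
    # The reverse primer is the antisense cloning oligo, so it is
    # specific to each sgRNA construct being screened.
    "10 uM Reverse primer": 1.25,
    "5 U/uL KAPA2G Robust": 0.1,
}


def master_mix(n_reactions, overage=0.10):
    """Scale the per-reaction recipe to n reactions plus overage.

    overage -- extra fraction to cover pipetting loss (assumed value).
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in COLONY_PCR_UL.items()}


# Four transformants per plate, as in step 5.
for component, volume in master_mix(4).items():
    print(f"{component}: {volume} uL")
```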
Activation of Target Gene Expression in Mammalian Cells
1. A typical experimental setup includes reactions containing plasmid mixtures such as the following: (a) GFP (1 μg); (b) sgRNA 1 and dCas9 (0.5 μg each); (c) sgRNA 2 and dCas9 (0.5 μg each); (d) sgRNA 3 and dCas9 (0.5 μg each); (e) sgRNA 4 and dCas9 (0.5 μg each); and (f) sgRNA 1+sgRNA 2+sgRNA 3+sgRNA 4 (0.125 μg of each) and dCas9 (0.5 μg).
Plasmid DNA purified using Qiagen Spin Miniprep Kit is suitable for transfection of a variety of cell lines, however, the resulting plasmid prep contains significant levels of endotoxins from E. coli that can result in decreased viability in some cell types. DNA precipitation with ethanol is usually sufficient to obtain transfection grade DNA suitable for use in most cell types. A control transfection reaction containing a GFP or similar expression plasmid should be used to ensure adequate transfection efficiency is achieved under identical experimental conditions and to serve as a negative control for qPCR.
2. For optimal transfection efficiency, low passage 293T cells in logarithmic growth are trypsinized, harvested, and resuspended at 10⁶ cells/mL in DMEM.
3. As per manufacturer's instructions, the DNA is mixed with 50 μL of Opti-MEM in a microfuge tube and, in a separate tube, 2 μL of Lipofectamine 2000 are mixed with 50 μL of Opti-MEM. After 5 min, the contents of both tubes are combined and incubated for an additional 20 min. The 100 μL DNA-lipofectamine reagent mixture is pipetted into one well of a 24-well treated tissue culture dish and promptly mixed with 400 μL of freshly harvested and properly diluted cells. Transfections are typically performed in antibiotic free medium. Decreased transfection efficiency or viability by using antibiotics in 293T cells has not been observed.
4. Incubate the cells for 48-72 h before analyzing gene expression.
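Each plasmid mixture in step 1 totals 1 μg of DNA, with pooled sgRNAs sharing the 0.5 μg sgRNA allotment equally. The following is a minimal sketch of that bookkeeping, together with the number of cells delivered per well when the stated density is read as 10^6 cells/mL and 400 μL is added. The function names and dictionary keys are illustrative.

```python
def sgrna_dcas9_mix(sgrna_names, total_sgrna_ug=0.5, dcas9_ug=0.5):
    """Split the sgRNA allotment equally and add the dCas9 plasmid.

    Returns a dict mapping plasmid name to micrograms of DNA.
    """
    if not sgrna_names:
        raise ValueError("at least one sgRNA is required")
    per_sgrna = total_sgrna_ug / len(sgrna_names)
    mix = {name: per_sgrna for name in sgrna_names}
    mix["dCas9 activator"] = dcas9_ug
    return mix


def cells_per_well(volume_ul=400, density_per_ml=1e6):
    """Cells added to one well of the 24-well plate."""
    return volume_ul * density_per_ml / 1000


single = sgrna_dcas9_mix(["sgRNA 1"])
pooled = sgrna_dcas9_mix(["sgRNA 1", "sgRNA 2", "sgRNA 3", "sgRNA 4"])
print(single, sum(single.values()))
print(pooled, sum(pooled.values()))
print(cells_per_well())
```

Keeping the total DNA constant across conditions means that differences in activation are less likely to reflect differences in transfection load.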
Analysis of Gene Expression by qPCR
1. The cells are trypsinized and washed with PBS once. Gene expression is analyzed in three independent experiments that are performed on three different days using biological duplicates in each experiment. Since RNA is unstable and degrades rapidly over time, it can be advantageous to harvest the cells and freeze cell pellets until all three experiments have been completed. At that point RNA extraction is performed from all samples simultaneously to minimize variability due to sample handling.
2. Total RNA is isolated using the RNeasy Plus RNA isolation kit (Qiagen) or another standard enzymatic removal method of genomic DNA after RNA isolation. The cells are lysed by adding an appropriate volume of RLT Plus with 10 μL/mL of β-mercaptoethanol and homogenized with QIAshredder columns. All other steps are performed according to manufacturer's instructions. It is recommended to prepare 70% ethanol and RPE buffer fresh before use.
3. cDNA synthesis is performed using the qScript cDNA SuperMix (Quanta Biosciences) by incubation of 1 μg of RNA with 4 μL of qScript cDNA SuperMix and RNase/DNase-free water up to 20 μL. The thermocycling parameters are: (a) 5 min at 25° C. (b) 30 min at 42° C. (c) 5 min at 85° C. For the cDNA synthesis reaction to occur identically in all samples, it is important to use equal amounts of RNA from all samples. cDNA can be prepared from 1 μg of RNA.
4. Real-time PCR is performed using PerfeCTa® SYBR® Green FastMix (Quanta Biosciences) with the CFX96 Real-Time PCR Detection System (Bio-Rad). The primers are designed using Primer3Plus, purchased from IDT and validated by agarose gel electrophoresis and melting curve analysis. For each sample, quantification of a housekeeping gene (such as GAPDH) must be performed in addition to analysis of the target gene. The qPCR reactions contain 10 μL PerfeCTa® SYBR® Green FastMix (2×), 2 μL forward primer (5 μM), 2 μL reverse primer (5 μM), cDNA and RNase/DNase-free water up to 20 μL. The optimal cycling parameters for each gene must be determined experimentally to ensure efficient amplification over an appropriate dynamic range. Standard curves are generated using tenfold dilutions with cDNA obtained from the sample presumed to have the highest transcript concentration. The use of plasmid DNA or other synthetic templates can lead to errors in determining the linear range of the PCR.
5. Calculate fold-increase mRNA expression of the gene of interest normalized to GAPDH expression using the ddCt method.
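The ddCt calculation in step 5 can be sketched as follows. This minimal example assumes amplification efficiencies close to 100% for both primer pairs, so that each cycle corresponds to a twofold change; the function name and the Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression of the target gene by the ddCt method.

    'ref' is the housekeeping gene (e.g. GAPDH); 'control' is the
    sample transfected with the GFP plasmid only.
    """
    # Normalize the target to the housekeeping gene in each sample.
    dct_treated = ct_target_treated - ct_ref_treated
    dct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    ddct = dct_treated - dct_control
    # Assumes ~100% efficiency, i.e. a doubling per cycle.
    return 2.0 ** (-ddct)


# Hypothetical Ct values: target drops from 31 to 24 cycles,
# housekeeping gene unchanged at 18 cycles.
print(fold_change_ddct(24.0, 18.0, 31.0, 18.0))  # prints 128.0
```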
A nuclease-assisted vector integration (NAVI) approach was selected for insertion of promoters at target sites. NAVI can be rapidly adapted to integrate heterologous DNA at virtually any locus via two simultaneous DSBs: first in the genome, guided by a primary sgRNA, and second within the targeting vector (TV), guided by a universal secondary sgRNA. The TV is then integrated into the genomic locus through Non-Homologous End Joining (NHEJ). This platform is universal since vector integration at any target site can be simply accomplished by customizing the primary sgRNA.
To develop a universal system of NAVI-based gene activation (NAVIa), two vectors for constitutive expression and one vector for inducible expression were designed.
Cell Culture and Transfection
293T and HCT116 cells were obtained from the American Type Culture Collection (ATCC) and were maintained in DMEM supplemented with 10% fetal bovine serum and 1% penicillin/streptomycin at 37° C. with 5% CO2. 293T and HCT116 cells were transfected with Lipofectamine 2000 (Invitrogen) according to the manufacturer's instructions. Transfection efficiencies were routinely higher than 80% for 293T cells and higher than 50% for HCT116 cells as determined by fluorescence microscopy following delivery of a control GFP expression plasmid. Induction of gene expression, unless otherwise noted, was carried out with 200 ng/mL doxycycline in DMEM prepared with 10% tetracycline-free FBS for 4 days.
Plasmids and Oligonucleotides
The plasmids encoding SpCas9 (Plasmid #41815), sgRNA (#47108), SpdCas9-VPR (#63798) and sgRNA library (#1000000078) were obtained from Addgene. The backbone for the targeting vectors was synthesized by IDT Technologies as gene blocks and cloned into a pCDNA3.1 plasmid. Guide sequences were obtained from IDT Technologies, hybridized, phosphorylated and cloned in the sgRNA vector using BbsI sites (see also Example 3). The target sequences are provided in Table 8.
PCR
Seventy-two hours after transfection, genomic DNA was isolated using DNeasy Blood & Tissue Kit (Qiagen). PCRs were performed using KAPA2G Robust PCR kits (KAPA Biosystems). A typical 25 μL reaction used 20-100 ng of genomic DNA, Buffer A (5 μL), Enhancer (5 μL), dNTPs (0.5 μL), 10 μM forward primer (1.25 μL), 10 μM reverse primer (1.25 μL), KAPA2G Robust DNA Polymerase (0.5 U) and water (up to 25 μL). The DNA sequence of the primers for each target and the cycling parameters for each reaction are provided in Table 9. The PCR products were visualized in 2% agarose gels and images were captured using a ChemiDoc-It2 (UVP).
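When several targets are amplified in parallel, the fixed-volume components of the 25 μL reaction above can be scaled into a master mix. The sketch below uses only the per-reaction volumes stated in the text; the 10% pipetting overage and the function names are assumptions for illustration, and the polymerase is left out because it is specified in units rather than volume.

```python
# Per-reaction volumes in microliters, as stated in the protocol text.
PER_REACTION_UL = {
    "Buffer A": 5.0,
    "Enhancer": 5.0,
    "dNTPs": 0.5,
    "forward primer (10 uM)": 1.25,
    "reverse primer (10 uM)": 1.25,
}
REACTION_VOLUME_UL = 25.0


def master_mix(n_reactions, overage=0.1):
    """Scale fixed-volume components for n reactions plus an assumed overage."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


def remaining_volume_per_reaction():
    """Volume left in each reaction for genomic DNA, polymerase and water."""
    return REACTION_VOLUME_UL - sum(PER_REACTION_UL.values())


mix = master_mix(8)
for name, volume in mix.items():
    print(f"{name}: {volume} uL")
print(f"left per reaction: {remaining_volume_per_reaction()} uL")
```

If primers differ between targets, they would be removed from the shared mix and added per tube; the remaining-volume calculation then changes accordingly.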
qPCR
Cells were harvested and flash-frozen in liquid nitrogen prior to RNA extraction using the RNeasy Plus RNA isolation kit (Qiagen) according to the manufacturer's instructions. cDNA synthesis was carried out using the qScript cDNA Synthesis Kit (Quanta Biosciences) from 1 μg of RNA and reactions were performed as directed by the supplier. For RT-qPCR, SsoFast EvaGreen Supermix (Bio-Rad) was added to cDNA and primers targeting the gene of interest and GAPDH (Table 10). Following 30 s at 95° C., qPCR (5 s at 95° C., 20 s at 55° C., 40 total cycles) preceded melt-curve analysis of the product by the CFX Connect Real-Time System (Bio-Rad). Ct values were used to calculate changes in expression level, relative to GAPDH and control samples, by the 2^(−ΔΔCt) method.
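The 2^(−ΔΔCt) calculation referred to above can be sketched as follows. This is a minimal illustration that assumes approximately 100% amplification efficiency for both the target and GAPDH; the Ct values and function names are hypothetical.

```python
def delta_delta_ct(ct_target_sample, ct_ref_sample,
                   ct_target_control, ct_ref_control):
    """ddCt = (target - reference) in sample minus (target - reference) in control."""
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    return d_ct_sample - d_ct_control


def fold_change(dd_ct):
    """Relative expression, assuming the product doubles every cycle."""
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 8 cycles earlier in the
# treated sample, while the reference gene (e.g. GAPDH) is unchanged.
dd_ct = delta_delta_ct(ct_target_sample=22.0, ct_ref_sample=18.0,
                       ct_target_control=30.0, ct_ref_control=18.0)
fc = fold_change(dd_ct)
print(f"ddCt = {dd_ct}, fold change = {fc}")
```

Because each cycle corresponds to a doubling, a ΔΔCt of −8 maps to a 256-fold increase in this example.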
Results
The two constitutive vectors contain either one CMV promoter followed by a target site for a universal secondary sgRNA (constitutive single promoter targeting vector, cspTV) or two opposing constitutive promoters separated by the secondary sgRNA target site (constitutive dual promoter targeting vector, cdpTV), each containing a cassette for expression of the puromycin N-acetyl-transferase gene. The targeting vector for inducible expression (inducible dual promoter targeting vector, idpTV) includes two identical promoters in opposite orientations, each consisting of seven TetO repeats and a minimal CMV promoter (mCMV). The idpTV also carries a puromycin N-acetyl-transferase gene linked with a reverse tetracycline transactivator (rtTA) via a T2A peptide. As in the cdpTV, the opposing promoters of the idpTV flank a universal secondary sgRNA target sequence. A DSB introduced in either idpTV or cdpTV by Cas9 generates a linear fragment of DNA with diametric promoters oriented towards the free ends of the vector (
In order to evaluate this gene activation architecture in the context of the human genome, three target genes were selected whose reported levels of activation utilizing CRISPRa are either high (ASCL1, ˜10^3-fold), medium (NEUROD1, ˜10^2-fold), or low (POU5F1, ˜10-fold). The primary sgRNAs targeting the genome were co-transfected into 293T cells with three plasmids containing (1) an expression cassette for active Cas9, (2) customized cspTV, cdpTV or idpTV, and (3) a universal secondary sgRNA. Following transfection, cells with integration of the TV were selected using puromycin and, in cells transfected with the idpTV, gene expression was induced with doxycycline. In parallel, one sgRNA or a mixture of 4 sgRNAs (previously validated for use with CRISPRa) was co-transfected into 293T cells with dCas9-VPR for comparison of NAVIa with CRISPRa. Gene expression using an individual sgRNA directing dCas9-VPR to target promoters was increased ˜10-fold for all targets tested, but the increase was not statistically significant. Utilization of 4 sgRNAs simultaneously activated gene expression more effectively than 1 sgRNA (ASCL1: ˜1800-fold, NEUROD1: ˜2900-fold, POU5F1: ˜90-fold). The levels of gene activation using the cspTV (ASCL1: ˜730-fold, NEUROD1: ˜600-fold, POU5F1: ˜200-fold) or cdpTV (ASCL1: ˜8500-fold, NEUROD1: ˜3000-fold, POU5F1: ˜1000-fold) were superior to CRISPRa using 1 sgRNA but lower or not statistically different from activation obtained using 4 sgRNAs for two of the three targets. However, the idpTV (ASCL1: ˜7200-fold, NEUROD1: ˜76000-fold, POU5F1: ˜5370-fold) surpassed activation obtained using dCas9-VPR using 4 sgRNAs (
To further explore the trends we observed in 293T cells, NeuroD1 was targeted using the cdpTV in other cell lines. NAVIa effectively activated expression of NeuroD1 in the human colorectal carcinoma cell line HCT116, the primary human fibroblast cell line MRC-5, and the mouse neuroblastoma cell line Neuro2A (
When using CRISPRa it is difficult to predict optimal sgRNA target sites for efficient gene activation. While it is generally accepted that proximity to the TSS of the target site is important, other parameters such as presence of enhancers or local chromatin structure are also critical and, perhaps, more difficult to predict. We investigated a potential correlation between gene activation using NAVIa and distance between integration site and TSS by measuring gene expression induced with sgRNAs that target DNA sequences between positions −1010 and +1995, relative to the TSS of 3 different genes (
These results demonstrate a novel platform to activate native gene expression based on integration of heterologous promoters that overcomes some of the limitations intrinsic to CRISPRa. Promoter integration is accomplished by NAVI, which utilizes NHEJ and therefore overcomes some of the intrinsic limitations of DNA integration platforms that rely on Homologous Recombination (HR). For example, NHEJ is more effective than HR in non-dividing cells and has been exploited to integrate therapeutic transgenes in post-mitotic cells. In addition, we demonstrate that since this integration mechanism requires only one element that is variable, it can be adapted for genome-scale screenings.
Although NAVI is subject to some shortcomings associated with its specific gene editing mechanism, such as the error-prone nature of NHEJ, only minor indels at target sites were observed (
One concern about the NAVIa system is that it is prone to Cas9 off-target nuclease activity. Such activity may lead to off-target vector integration and the inadvertent upregulation of additional genes. This problem could be lessened by using truncated sgRNAs or enhanced versions of Cas9 that have increased specificity. While CRISPRa is also susceptible to off-target activation, one fundamental difference between both systems is that, for sustained gene activation, CRISPRa necessitates the stable expression, or repeated introduction, of heterologous system components, which may have obvious negative implications on their own. In addition, it has been demonstrated that gene activation from viral vectors is less efficient than activation with episomal plasmids, presumably due to lower copy number. In contrast, NAVIa only necessitates transient nuclease activity to integrate a single synthetic element and is easily amenable to repeated customization to reduce or completely eliminate off-target effects.
Since maximal gene activation may not be desirable in all experimental settings, CRISPRa has been adapted for tunable gene expression through combinatorial delivery of multiple sgRNAs. However, such efforts to modulate gene expression have proven unpredictable, with results that are difficult to reproduce. Alternatively, NAVIa enables facile customization of TV, including selection from a wide variety of gene regulatory mechanisms provided by existing artificial promoters. The idpTV used in these experiments introduces a doxycycline-inducible promoter and a precise temporal control of gene expression that could be tuned by the concentration of doxycycline in the growth medium. Induction of gene expression for 96 h with concentrations of doxycycline ranging from 2 ng/mL to 2 μg/mL led to a dose-dependent increase in gene expression ranging between ˜337-fold and ˜26015-fold (
Tetracycline-inducible systems have been designed for high responsiveness to doxycycline, yet background expression in the absence of inducer, while low, continues to be a problem that hinders applications requiring precise control over gene activation. While inducibility is a significant advantage of NAVIa over CRISPRa, tetracycline-inducible promoters are typically used to modulate expression cassettes within a vector, and not in a genomic context where the surrounding transcriptional regulatory elements may contribute to undesired expression at steady state. Analysis of NEUROD1 activation within samples not induced with doxycycline revealed significant background expression (˜432-fold over basal expression,
One significant advantage of NAVIa over existing CRISPRa methods is the rapid and facile generation and screening of stable cell lines with tunable or programmable properties and a highly predictable pattern of integration. Inducible CRISPRa methods have been developed by integrating a tetracycline-inducible Cas9-based transcriptional activator at random genomic loci. Induction of target gene expression with these systems requires persistent expression of the sgRNA while expression of the ATF, and ultimately target gene activation, is controlled by treatment with doxycycline. Although these systems are tunable, they exhibit significant background expression in the absence of doxycycline. In contrast, NAVIa replaces native promoters via targeted integration of a tetracycline-inducible promoter to achieve a rapid response to the inducer while avoiding unpredictable lentiviral integration patterns. Further refinements of the minimal promoter, the positioning of TetO sites, and other attributes of the integrated vector will remove not only background expression but also basal expression, allowing generation of functional knock out or overexpression of a gene in a single cell line by simply varying the concentration of inducer.
Another potential limitation of NAVIa in these experiments was the integration of two promoters in different orientations. While this approach ensures that one promoter is always positioned in the correct orientation for overexpression of the target gene, it is possible that the other promoter can modify expression in the opposite orientation. While this shortcoming also occurs with bidirectional gene activation induced by CRISPRa, it can be overcome in NAVIa by simply using a single promoter. This alternative strategy requires screening a few clones to identify those with the promoter in the correct orientation, but effectively prevents potential aberrant activation at the opposite end of the vector. Future iterations to enhance efficiency of this technique will require precise control over orientation by manipulating the DNA repair process.
One important feature of CRISPRa architectures is multiplexability. Different genes can be activated simultaneously by delivering sgRNAs targeting different promoters. Two benefits of NAVI over other integration platforms, such as those utilizing HR, are the universal adaptability of the system to target different genomic loci, by simply providing additional primary sgRNAs, and facile clone isolation upon selection. Since activation of different genes using NAVIa can be accomplished using a set of vectors in which the only variable element is the primary sgRNA, this flexible architecture is also compatible with multiplexing. To demonstrate these capabilities, sgRNAs were first identified for targeting additional genes with NAVIa including IL1B, IL1R2, LIN28A and ZFP42 (
CRISPRa gain-of-function genetic screenings rely on robust activation of native genes for efficient genome-scale interrogation. However, the required use of single sgRNAs, which are often insufficient for upregulating gene expression, may introduce important biases since only genes that are permissive for activation will be interrogated effectively. Previously, it was found that since shRNA and CRISPR-Cas9 knock down gene expression by different mechanisms, their application in parallel for genome-scale loss-of-function screenings generates results that are complementary. Unlike loss-of-function screenings, there are no alternative methods complementary to CRISPRa to perform gain-of-function screenings. However, since NAVIa requires only one sgRNA per target and achieves robust activation across targets, it is compatible with genome-scale activation screenings.
Transfection and Transduction of sgRNA Library
The human SAM library of sgRNAs, with 3× coverage of coding gene promoters, was prepared following the guidelines provided by Konermann et al., Nature, 517:583-588 (2015) and packaged into 2nd-generation lentivirus within 293T cells. The resultant library was transduced into MCF7 cells.
Following a brief recovery period over a single passage, 10^7 MCF7 cells were transfected with the NAVIa system plasmids (Cas9, TV, and secondary sgRNA) and selected with 1 μg/mL puromycin. Cells were split into two groups, which were either treated with 4-hydroxytamoxifen or not treated. The treated cells received 5 μM 4-hydroxytamoxifen for 14 days, replaced every two days. The untreated cells were handled identically, receiving fresh media without 4-hydroxytamoxifen. After 14 days the cells were washed and recovered for isolation of genomic DNA.
NGS
The sgRNA expression cassettes from library genomic DNA samples and controls were amplified in two rounds using KAPA HiFi HotStart polymerase (KAPA Biosystems). The first round reactions amplified the entire human U6 sgRNA expression cassette (552 bp) and were separated in 2% agarose gels, excised using the QIAquick Gel Extraction Kit (Qiagen), and used as template with the NGS primers (
The final pool was quantitated using Qubit (Life Technologies, Grand Island, N.Y.), the average size was determined on an Agilent bioanalyzer HS DNA chip (Agilent Technologies, Wilmington, Del.), and the pool was diluted to 5 nM final concentration. The 5 nM dilution was further quantitated by qPCR on a BioRad CFX Connect Real-Time System (Bio-Rad Laboratories, Inc. CA).
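Diluting the pool to 5 nM requires converting the mass concentration and the average fragment size into molarity. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the input concentration, fragment size, final volume and function names are hypothetical and serve only to illustrate the arithmetic.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair, a common approximation for dsDNA


def library_molarity_nM(conc_ng_per_ul, avg_fragment_bp):
    """Convert a mass concentration (ng/uL) into molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * avg_fragment_bp)


def dilution_volumes(stock_nM, target_nM, final_volume_ul):
    """Return (stock volume, diluent volume) in uL using C1*V1 = C2*V2."""
    if target_nM > stock_nM:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = target_nM * final_volume_ul / stock_nM
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical pool: 3.3 ng/uL with a 500 bp average fragment size.
stock_nM = library_molarity_nM(3.3, 500)
stock_ul, diluent_ul = dilution_volumes(stock_nM, target_nM=5.0, final_volume_ul=20.0)
print(f"stock = {stock_nM:.2f} nM; mix {stock_ul:.2f} uL stock + {diluent_ul:.2f} uL diluent")
```

The subsequent qPCR quantitation mentioned in the text serves as an independent check on the value obtained this way.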
The final denatured library pool was spiked with 10% indexed PhiX control library and loaded at a concentration of 9 pM onto one lane of a 2-lane Rapid flowcell for cluster formation on the cBOT, and then sequenced on an Illumina HiSeq 2500 with version 2 SBS sequencing reagents for a total read length of 100 nt from one end of the molecules. The PhiX control library provides a balanced genome for calculation of matrix, phasing and prephasing, which are essential for accurate basecalling.
The run generated .bcl files, which were converted into demultiplexed compressed fastq files using bcl2fastq 2.17.1.14 (Illumina, CA). A secondary pipeline decompressed the fastq files, generated plots with quality scores using FASTX-Toolkit, and generated a report with the number of reads per barcoded sample library. Final fastq file data sets were first parsed using Cutadapt, to isolate sgRNA targeting sequences from leading and trailing sequence, and then analyzed using MAGeCK.
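The trimming and counting steps above can be illustrated with a small plain-Python stand-in. It does not call Cutadapt or MAGeCK and is not the pipeline used here; it only shows the idea of locating a constant 5′ flank, taking the targeting sequence that follows, and tallying reads per sgRNA. The flank sequence, the 20-nt spacer length, the demo reads and the total-count scaling are all assumptions made for illustration.

```python
import io
from collections import Counter

SPACER_LEN = 20         # assumed length of the sgRNA targeting sequence
FLANK_5P = "GAAACACCG"  # hypothetical placeholder for the constant upstream sequence


def read_fastq_sequences(handle):
    """Yield the sequence line (second line) of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip().upper()


def extract_spacer(read, flank=FLANK_5P, spacer_len=SPACER_LEN):
    """Return the spacer following the first flank match, or None if absent."""
    start = read.find(flank)
    if start == -1:
        return None
    begin = start + len(flank)
    spacer = read[begin:begin + spacer_len]
    return spacer if len(spacer) == spacer_len else None


def count_spacers(handle):
    """Tally reads per extracted spacer, skipping reads without the flank."""
    counts = Counter()
    for read in read_fastq_sequences(handle):
        spacer = extract_spacer(read)
        if spacer is not None:
            counts[spacer] += 1
    return counts


def normalize_to_total(counts, scale=1_000_000):
    """Simple total-count scaling (an assumption, not necessarily MAGeCK's method)."""
    total = sum(counts.values())
    return {s: c * scale / total for s, c in counts.items()} if total else {}


def _record(name, seq):
    """Build one FASTQ record with a dummy quality string."""
    return f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n"


# Tiny in-memory demo with made-up reads.
demo = io.StringIO(
    _record("r1", "TT" + FLANK_5P + "ACGTACGTACGTACGTACGT" + "GTTTTAGA")
    + _record("r2", "TT" + FLANK_5P + "ACGTACGTACGTACGTACGT" + "GTTTTAGA")
    + _record("r3", FLANK_5P + "TTTTGGGGCCCCAAAATTTT" + "GTTTTAGA")
    + _record("r4", "ACGTACGT")  # no flank, so this read is skipped
)
counts = count_spacers(demo)
normalized = normalize_to_total(counts)
print(counts.most_common())
```

In practice the extracted sequences would be matched against the library reference so that only known sgRNAs are counted.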
Following trimming, counting, and normalization of read counts, it was determined that the number of sgRNAs transduced into MCF7 cells was 4,292 (Table 11). Of the unique reads detected, ˜85% were found within the CRISPRa samples and ˜93% within the NAVIa samples. In total, 77% of the unique reads overlapped between the CRISPRa and NAVIa libraries. In all, these reads covered 3,817 genes with one or more sgRNAs, with 100% overlap between the CRISPRa and NAVIa libraries, thus enabling a direct comparison between both methods.
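One plausible way to obtain percentages of this kind is to treat the sgRNAs detected in each library as a set and express each set, and their intersection, as a fraction of the union. The sketch below assumes that the union is the denominator, which is consistent within rounding with the reported figures (85% + 93% − 77% ≈ 100%) but is not stated in the text; the sgRNA identifiers are made up.

```python
def overlap_summary(set_a, set_b):
    """Fractions of the union found in each set and in both sets."""
    union = set_a | set_b
    shared = set_a & set_b
    return {
        "fraction_in_a": len(set_a) / len(union),
        "fraction_in_b": len(set_b) / len(union),
        "fraction_shared": len(shared) / len(union),
    }


# Hypothetical sgRNA identifiers detected in each library.
crispra_guides = {"sg1", "sg2", "sg3", "sg4"}
navia_guides = {"sg2", "sg3", "sg4", "sg5"}

summary = overlap_summary(crispra_guides, navia_guides)
print(summary)
```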
The normalized read counts from the CRISPRa and NAVIa experiments were separately scored by gene association and assigned p-values according to the MAGeCK-RRA algorithm.
NGS Hit Validation
The top two hits from each of the CRISPRa (CHSY1, GDF9) and NAVIa (MFSD2B, HMGCL) screens, as well as the hit identified by both approaches (IPO9), were chosen for further tamoxifen resistance study. For each target, the primary sgRNA identified in the screen was co-transfected into MCF7 cells with Cas9, the cdpTV, and the universal secondary sgRNA followed by selection with 1 μg/mL puromycin. Ten thousand cells of each selected pool, and 10,000 wild type MCF7 cells, were seeded into 4-hydroxytamoxifen (5 μM) and tamoxifen-free media. The cells were cultured for 10 days, and were trypsinized every other day to refresh media and treat experimental cells with 4-hydroxytamoxifen in suspension. On day 10 cells were again trypsinized and counted. The cell culture and counting were done in duplicate by two independent researchers (n=4).
Statistics
Statistical analysis was performed by two-way ANOVA with alpha equal to 0.05 or with t tests in Prism 7.
A genome-scale gain-of-function experimental framework for NAVIa was tested in which lentiviruses were first generated from a library of plasmids targeting the promoters of native transcription factors (library), which were transduced into 293T cells at MOI 0.2 (
Finally, side-by-side genome-scale screenings were performed with NAVIa and CRISPRa to evaluate their ability to identify transcription factors associated with rapid growth in 293T cells. While each method generated positive selection results, the enrichment observed with NAVIa was significantly more robust than that observed with CRISPRa. In addition, the hits identified by each method show significant exclusivity, which highlights the differences between these approaches and suggests that NAVIa and CRISPRa could provide valuable complementary results. By combining results from each method, it is possible to identify a strong list of candidate genes with potential roles in the phenotype under investigation.
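The library transduction described above was carried out at an MOI of 0.2. Under the usual simplifying assumption that the number of integrations per cell follows a Poisson distribution with mean equal to the MOI, a low MOI keeps most transduced cells at a single sgRNA. The sketch below works through that arithmetic; the Poisson model and the function names are assumptions and are not stated in the text.

```python
import math


def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k integrations."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def transduction_summary(moi):
    """Fractions of untransduced, singly and multiply transduced cells."""
    p0 = poisson_pmf(0, moi)
    p1 = poisson_pmf(1, moi)
    transduced = 1.0 - p0
    return {
        "untransduced": p0,
        "single": p1,
        "multiple": transduced - p1,
        "single_among_transduced": p1 / transduced,
    }


moi_summary = transduction_summary(0.2)
for key, value in moi_summary.items():
    print(f"{key}: {value:.3f}")
```

Under this model roughly 18% of cells are transduced at MOI 0.2, and about nine in ten of those carry a single integration.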
To demonstrate the applicability of NAVIa genetic screenings, in comparison with CRISPRa, transcription factors that confer a proliferative advantage in 293T cells were identified. After 14 days of growth, next generation sequencing of the sgRNA expression cassette was performed for each of the gain-of-function screenings. Examination of FDR q-values from the top scores from each method reveals a different distribution for the top 350 hits, with a shift in significance for all hits skewed toward NAVIa (
To verify the results from the tamoxifen resistance screen, the top two gene hits from each screen were validated, as well as IPO9. Target-specific primary sgRNAs in combination with cdpTV, Cas9 and the secondary sgRNA were delivered to MCF7 cells, which, after selection with puromycin, were treated with tamoxifen. Each of the cell lines generated displayed increased resistance to tamoxifen compared with wild type, although not all the measurements were significant due to large variability across samples (
In summary, the robust levels of activation, multiplexing capabilities, and adaptability for genome-scale gain-of-function screenings make NAVIa an attractive new platform for a variety of synthetic biology applications including metabolic engineering, drug screening, and signal transduction pathway analysis.
This application claims priority to U.S. Provisional Patent Application No. 62/487,001, filed Apr. 19, 2017, the disclosure of which is hereby incorporated by cross-reference in its entirety.
Number | Date | Country
---|---|---
62487001 | Apr 2017 | US