Genetically engineered mouse models (GEMMs) have been the paradigm for analyzing gene function in vivo in a temporal- and tissue-specific manner. However, as GEMM generation is an expensive, laborious process, many alternative transgenic approaches, such as electroporation (EP)-mediated and viral gene deliveries, have been increasingly adopted as more rapid and efficient methods of creating somatic mosaics. Both methods entail injecting specific tissues with virus or foreign DNAs to transduce the surrounding cells and create somatic mosaics. EP can yield genome-inserted DNA using transposons or, less efficiently, with CRISPR/Cas9 and subsequent insertion of a donor template. Despite their speed, these methods have major pitfalls that dissuade more widespread adoption. Viral vectors have limited payloads, incite immune responses, and require special expertise, while both transposons and viral methods suffer from their unpredictable genomic integration patterns, possible insertional mutagenesis, and epigenetic transgene silencing. Both suffer from transgene copy number variability and overexpression artifacts such as cytotoxicity and transcriptional squelching; hence, clonal genotypic/phenotypic variability is a significant confounding factor.
With the identification of hundreds of recurrent, putative cancer driver mutations, many of which are gain-of-function (GOF) oncogenes, it is imperative to create a tractable in vivo platform that can model these potential oncogenes, possibly in conjunction with tumor suppressor mutations. For each GOF oncogene, there are often tens of different recurrent missense mutations that can function in distinct ways. Many well-known tumor suppressor mutants are loss-of-function (LOF) phenotypes, for which one can utilize large-scale KO-mice consortia to breed multiple-KO-mice (e.g., Pten/p53/Nf1-KO). Even then, creating such mice is significantly time-consuming, expensive, and prone to some methodological confounds. Alternatively, CRISPR/Cas9 systems can simultaneously induce multiple KOs in vivo in mice, but can have significant unintended off-target genome alterations.
In one aspect, provided herein is a flexible in vivo platform that can simultaneously model combinations of GOF and LOF mutations not only cheaply but also in a GEMM-like fashion. We demonstrate that successful dual recombinase-mediated cassette exchange (dRMCE, or MADR) can be catalyzed in situ in somatic cells in well-characterized reporter mice with definitive genetic labeling of recombined cells. Moreover, we demonstrate the utility of this system in generating mosaicism with a mixture of GOF and LOF mutations, including patient-specific driver mutations. Ultimately, our MADR tumor models demonstrate that this method has the potential to become a higher-throughput, first-pass experiment to test and study various putative tumor driver mutations, and to provide a rapid pipeline for preclinical drug discovery in a patient-specific manner.
Described herein are systems, nucleic acids, and vectors useful for establishing a transgenic cell for use in cell therapy. These vectors circumvent problems associated with current methods used in creating cells with a transgene stably integrated in a genomic location. Current problems include lack of control of ploidy, lack of control of integration site, and restrictions on transgenic insert size. The systems described herein solve these problems and allow for safer, more reproducible methods of cell therapy. These systems and the methods for using them are applicable to the establishment of cells and cell lines useful for delivering a gene product such as a neurotrophic factor and/or a growth factor to a subject with a neurodegenerative disease, such as Parkinson's disease, Amyotrophic Lateral Sclerosis (ALS), or Alzheimer's disease.
In one aspect, described herein is a mammalian cell comprising a genomic integrated transgene, wherein the genomic integrated transgene comprises a neurotrophic factor, and is integrated at a genomic site comprising the AAVS1 locus, H11 locus, or HPRT1 locus. In certain embodiments, the cell is a human cell. In certain embodiments, the human cell is an induced pluripotent stem cell. In certain embodiments, the neurotrophic factor comprises glial cell line-derived neurotrophic factor (GDNF), neurturin, growth/differentiation factor (GDF) 5, mesencephalic astrocyte-derived neurotrophic factor (MANF), cerebral dopaminergic neurotrophic factor (CDNF), or combinations thereof. In certain embodiments, the neurotrophic factor is GDNF. In certain embodiments, the neurotrophic factor is under the control of an inducible promoter. In certain embodiments, the inducible promoter is a tetracycline or doxycycline inducible promoter. In certain embodiments, the neurotrophic factor and/or the inducible promoter are flanked by one or more of a recombinase recognition site, a tandem repeat of a transposable element, or an insulator sequence. In certain embodiments, a single copy of the transgene is integrated into the genome of the cell. In various embodiments, the neurotrophic factor and/or the inducible promoter are flanked by paired recombinase recognition sites. In various embodiments, the paired recombinase recognition sites comprise a variant recombinase recognition site and a wild-type recombinase recognition site. In various embodiments, the variant recombinase recognition site exhibits reduced cleavage by a recombinase compared to the wild-type recombinase recognition site. In various embodiments, the paired recombinase recognition sites comprise LoxP sites or FRT sites.
In another aspect, described herein is a system, comprising: (a) a promoter-less donor vector, comprising a polyadenylation signal or transcription stop element upstream from a transgene or nucleic acid encoding an RNA, the transgene or nucleic acid encoding an RNA, and paired recombinase recognition sites; and (b) one expression vector, comprising two genes encoding recombinases specific to the paired recombinase recognition sites, or two expression vectors, the first expression vector comprising one gene encoding a first recombinase that is specific to one of the paired recombinase recognition sites, and the second expression vector comprising one gene encoding a second recombinase that is specific to the other of the paired recombinase recognition sites. In certain embodiments, the promoter-less donor vector is selected from the group consisting of plasmid, viral vector, and bacterial artificial chromosome (BAC). In certain embodiments, the promoter-less donor vector comprises at least four polyadenylation signals upstream from the transgene or nucleic acid encoding the RNA. In certain embodiments, the promoter-less donor vector further comprises a post-transcriptional regulatory element. In certain embodiments, the promoter-less donor vector further comprises a polyadenylation signal downstream from the transgene or nucleic acid encoding an RNA. In certain embodiments, the promoter-less donor vector comprises: a PGK polyadenylation signal (pA); a trimerized SV40pA; the transgene or nucleic acid encoding an RNA; loxP and flippase recognition target (FRT); a rabbit beta-globin pA; and a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE). In certain embodiments, the paired recombinase recognition sites are loxP and flippase recognition target (FRT), and the recombinases are cre and flp. In certain embodiments, the paired recombinase recognition sites are VloxP and flippase recognition target (FRT), and the recombinases are VCre and flp.
In certain embodiments, the paired recombinase recognition sites are SloxP and flippase recognition target (FRT), and the recombinases are SCre and flp. In certain embodiments, the recombinase is PhiC31 recombinase and the recombinase recognition sites are attB and attP. In certain embodiments, the recombinase is Nigri, Panto, or Vika and the recombinase recognition sites are nox, pox, and vox, respectively. In certain embodiments, one or both of the paired recombinase recognition sites comprise a mutation. In certain embodiments, the RNA is siRNA, snRNA, sgRNA, lncRNA or miRNA. In certain embodiments, the transgene or the nucleic acid encoding an RNA comprises disease-associated mutations. In certain embodiments, the transgene or the nucleic acid encoding an RNA comprises a gain-of-function (GOF) gene mutation, loss-of-function (LOF) gene mutation, or both. In certain embodiments, the transgene comprises a factor that prevents apoptosis or promotes survival of a neuronal cell, increases the proliferation of a neuronal cell, or promotes differentiation of a neuronal cell. In certain embodiments, the factor is a growth factor. In certain embodiments, the growth factor comprises glial cell line-derived neurotrophic factor (GDNF), neurturin, growth/differentiation factor (GDF) 5, mesencephalic astrocyte-derived neurotrophic factor (MANF), cerebral dopaminergic neurotrophic factor (CDNF), or combinations thereof. In certain embodiments, the growth factor comprises glial cell line-derived neurotrophic factor (GDNF). In certain embodiments, the donor vector comprises an open reading frame (ORF) that begins with a splice acceptor. In certain embodiments, the donor vector comprises a fluorescent reporter. In certain embodiments, provided herein is a mammalian cell comprising the system. In certain embodiments, the cell is a human cell. In certain embodiments, the cell is a pluripotent cell.
In certain embodiments, the pluripotent cell is an induced pluripotent cell. In certain embodiments, the cell is for use in a method of delivering a gene product (e.g., growth factor, neurotrophic factor) to a subject having a neurodegenerative disorder, the method comprising administering the mammalian cell to the individual. In certain embodiments, the neurodegenerative disorder comprises Parkinson's Disease, Amyotrophic Lateral Sclerosis (ALS), or Alzheimer's Disease. In certain embodiments, the neurodegenerative disorder comprises Parkinson's Disease. In certain embodiments, the neurodegenerative disorder comprises Amyotrophic Lateral Sclerosis (ALS). In certain embodiments, the cell is for use in a method of increasing GDNF protein level in the brain of an individual, the method comprising administering the mammalian cell to the individual.
In another aspect, provided herein is a promoter-less donor vector, comprising: a polyadenylation signal or transcription stop element upstream from a transgene or nucleic acid encoding an RNA; the transgene or nucleic acid encoding an RNA; and paired recombinase recognition sites. In certain embodiments, the promoter-less donor vector is selected from the group consisting of plasmid, viral vector, and bacterial artificial chromosome (BAC). In certain embodiments, the promoter-less donor vector comprises at least four polyadenylation signals upstream from the transgene or nucleic acid encoding the RNA. In certain embodiments, the transgene or RNA is selected from the group consisting of an oncogene, loss-of-function (LOF) mutation of a tumor suppressor gene, gain-of-function (GOF) mutation of a proto-oncogene, pseudogene, siRNA, snRNA, sgRNA, lncRNA, miRNA, epigenetic modification, non-coding genetic or epigenetic abnormality associated with human disease, and combinations thereof. In certain embodiments, the promoter-less donor vector further comprises a post-transcriptional regulatory element. In certain embodiments, the promoter-less donor vector further comprises a polyadenylation signal downstream from the transgene or nucleic acid encoding an RNA. In certain embodiments, one or both of the paired recombinase recognition sites comprise a mutation. In certain embodiments, the promoter-less donor vector comprises: PGK polyadenylation signal (pA); trimerized SV40pA; a transgene or RNA; loxP and flippase recognition target (FRT); a rabbit beta-globin pA; and a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE). In certain embodiments, the transgene comprises a factor that prevents apoptosis or promotes survival of a neuronal cell, increases the proliferation of a neuronal cell, or promotes differentiation of a neuronal cell. In certain embodiments, the factor is a growth factor.
In certain embodiments, the growth factor comprises glial cell line-derived neurotrophic factor (GDNF), neurturin, growth/differentiation factor (GDF) 5, mesencephalic astrocyte-derived neurotrophic factor (MANF), cerebral dopaminergic neurotrophic factor (CDNF), or combinations thereof. In certain embodiments, the growth factor comprises glial cell line-derived neurotrophic factor (GDNF). In certain embodiments, provided herein is a mammalian cell comprising the promoter-less donor vector. In certain embodiments, the mammalian cell is a human cell. In certain embodiments, the mammalian cell is a pluripotent cell. In certain embodiments, the pluripotent cell is an induced pluripotent cell. In certain embodiments, the cell is for use in a method of delivering a gene product (e.g., growth factor, neurotrophic factor) to a subject having a neurodegenerative disorder, the method comprising administering the mammalian cell to the individual. In certain embodiments, the neurodegenerative disorder comprises Parkinson's Disease, Amyotrophic Lateral Sclerosis (ALS), or Alzheimer's Disease. In certain embodiments, the neurodegenerative disorder comprises Parkinson's Disease. In certain embodiments, the neurodegenerative disorder comprises Amyotrophic Lateral Sclerosis (ALS). In certain embodiments, the cell is for use in a method of increasing GDNF protein level in the brain of an individual, the method comprising administering the mammalian cell to the individual.
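The structural constraints recited for the promoter-less donor vector (no eukaryotic promoter, polyadenylation signals upstream of the transgene, paired recombinase recognition sites) can be illustrated with a minimal Python sketch. This is an illustrative model only and not part of the disclosed system: the `Element` class, the `kind` labels, the helper functions, and the linear ordering of elements in `DONOR` are all hypothetical, and counting a trimerized SV40pA as three signals is an interpretive assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One annotated feature of a vector (hypothetical model, not a real file format)."""
    name: str
    kind: str        # assumed labels: "site", "polyA", "transgene", "post_reg", "promoter"
    copies: int = 1  # e.g. 3 for a trimerized signal (assumption)


def is_promoterless(elements):
    """True if no element is annotated as a promoter."""
    return all(e.kind != "promoter" for e in elements)


def upstream_polya_count(elements):
    """Count polyadenylation signals located before the transgene."""
    count = 0
    for e in elements:
        if e.kind == "transgene":
            return count
        if e.kind == "polyA":
            count += e.copies
    raise ValueError("no transgene element found")


def is_flanked_by_sites(elements):
    """True if the first and last elements are recombinase recognition sites."""
    return (len(elements) >= 3
            and elements[0].kind == "site"
            and elements[-1].kind == "site")


# Hypothetical linear layout built from the elements named in the text;
# the order shown here is assumed for illustration.
DONOR = [
    Element("loxP", "site"),
    Element("PGK pA", "polyA"),
    Element("SV40 pA", "polyA", copies=3),
    Element("transgene", "transgene"),
    Element("WPRE", "post_reg"),
    Element("rabbit beta-globin pA", "polyA"),
    Element("FRT", "site"),
]

if __name__ == "__main__":
    print(is_promoterless(DONOR))
    print(upstream_polya_count(DONOR))
    print(is_flanked_by_sites(DONOR))
```

Under these assumptions, one PGK pA plus a trimerized SV40pA gives four upstream signals, which is one way to satisfy the "at least four polyadenylation signals" embodiment.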
In another aspect, provided herein is a method of genetic manipulation of a mammalian cell, comprising: transfecting or transducing the mammalian cell with the system described herein. In certain embodiments, the mammalian cell is a human cell, the system targets the AAVS1 locus, H11 locus, or HPRT1 locus, and the method is an in vitro or ex vivo method. In certain embodiments, the mammalian cell is a mouse cell, and the system targets the ROSA26 locus, Hipp11 locus, Tigre locus, ColA1 locus, or Hprt locus. In certain embodiments, the method further comprises administering to the cell or contacting the cell with one or more recombinase enzymes. In certain embodiments, the one or more recombinase enzymes comprise a Cre recombinase, a flippase recombinase, a Cre and a flippase recombinase, a Nigri recombinase, a Panto recombinase, or a Vika recombinase.
Exemplary embodiments are illustrated in referenced figures. It is intended that the embodiments and figures disclosed herein are to be considered illustrative rather than restrictive.
One skilled in the art will recognize many methods and materials similar or equivalent to those described herein, which could be used in the practice of the present invention. Indeed, the present invention is in no way limited to the methods and materials described. For purposes of the present invention, the following terms are defined below.
As used herein the term “about” when used in connection with a referenced numeric indication means the referenced numeric indication plus or minus up to 5% of that referenced numeric indication, unless otherwise specifically provided for herein. For example, the language “about 50%” covers the range of 45% to 55%. In various embodiments, the term “about” when used in connection with a referenced numeric indication can mean the referenced numeric indication plus or minus up to 4%, 3%, 2%, 1%, 0.5%, or 0.25% of that referenced numeric indication, if specifically provided for in the claims.
In some embodiments, “control elements” refers collectively to promoter regions, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites (“IRES”), enhancers, and the like, which collectively provide for the replication, transcription and translation of a coding sequence in a recipient cell. Not all of these control elements need always be present, so long as the selected coding sequence is capable of being replicated, transcribed and translated in an appropriate host cell.
As used herein “paired” with respect to recombinase recognition sites refers to two recombinase recognition sites, one 5′ to a recited genetic element (e.g., gene of interest, promoter or other regulatory element) and one 3′ to the stated genetic element. Paired recombinase recognition sites may be identical (e.g., LoxP-LoxP), comprise a wild-type and a variant site (e.g., LoxP-Lox71 or the reverse), or sites of two different origins whether wild-type or variant (e.g., FRT-LoxP or FRT-Lox66). Wild-type LoxP comprises the sequence ATAACTTCGTATAATGTATGCTATACGAAGTTAT (SEQ ID NO:17). Wild-type FRT comprises the sequence GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC (SEQ ID NO:18). A variant of these sequences is any sequence that varies by one or more nucleotides and can be cleaved by its recombinase (e.g., Cre for Lox sites and Flippase for FRT sites). In certain embodiments, such variants may be cleaved by their recombinase at a lower efficiency.
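As a minimal illustration of the "paired" arrangement defined above, the following Python sketch locates a 5′ site and a 3′ site in a DNA string and returns the element they flank. The two site constants are the wild-type sequences recited above (SEQ ID NO:17 and SEQ ID NO:18); the function name `flanked_region` and the toy donor sequence are hypothetical. The sketch assumes exact, top-strand matching only, so variant sites such as Lox71 or Lox66 and reverse-orientation sites would not be detected.

```python
# Wild-type site sequences as recited in the text.
LOXP_WT = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # SEQ ID NO:17
FRT_WT = "GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC"   # SEQ ID NO:18


def flanked_region(sequence, five_prime_site, three_prime_site):
    """Return the sequence between the first 5' site and the next 3' site.

    Returns None if either site is missing. Matching is exact and
    top-strand only (a simplifying assumption of this sketch).
    """
    sequence = sequence.upper()
    site_start = sequence.find(five_prime_site)
    if site_start == -1:
        return None
    cargo_start = site_start + len(five_prime_site)
    cargo_end = sequence.find(three_prime_site, cargo_start)
    if cargo_end == -1:
        return None
    return sequence[cargo_start:cargo_end]


if __name__ == "__main__":
    # Toy construct: a short placeholder element flanked by LoxP (5') and FRT (3').
    toy_donor = "GG" + LOXP_WT + "ATGAAATAG" + FRT_WT + "CC"
    print(flanked_region(toy_donor, LOXP_WT, FRT_WT))
```

The same function could be called with two copies of one site (e.g., LoxP-LoxP) to model an identical pair.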
In some embodiments, “promoter region” is used herein in its ordinary sense to refer to a nucleotide region including a DNA regulatory sequence, wherein the regulatory sequence is derived from a gene which is capable of binding RNA polymerase and initiating transcription of a downstream (3′-direction) coding sequence.
In some embodiments, “operably linked” refers to an arrangement of elements wherein the components so described are configured so as to perform their usual function. Thus, control elements operably linked to a coding sequence are capable of effecting the expression of the coding sequence. The control elements need not be contiguous with the coding sequence, so long as they function to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter sequence and the coding sequence and the promoter sequence can still be considered “operably linked” to the coding sequence.
In some embodiments, “promoter-less” as used herein with reference to a donor vector refers to a vector that does not have a eukaryotic promoter.
Described herein are exogenous nucleic acids and vectors for use in rendering a cell transgenic. In certain embodiments, the cell is a mammalian cell. In certain embodiments, the mammalian cell is a human cell. In certain embodiments, the mammalian cell is a human cell with pluripotent capability such as a fetal cell, an embryonic stem cell, a precursor cell or an induced pluripotent cell. In certain embodiments, these transgenic cells are useful to deploy as a therapy for neurodegenerative disease.
In some embodiments, “exogenous” with respect to a nucleic acid indicates that the nucleic acid is part of a recombinant nucleic acid construct, or is not in its natural environment. For example, an exogenous nucleic acid can be a sequence from one species introduced into another species, i.e., a heterologous nucleic acid. Typically, such an exogenous nucleic acid is introduced into the other species via a recombinant nucleic acid construct. An exogenous nucleic acid also can be a sequence that is native to an organism and that has been reintroduced into cells of that organism. An exogenous nucleic acid that includes a native sequence can often be distinguished from the naturally occurring sequence by the presence of non-natural sequences linked to the exogenous nucleic acid, e.g., non-native regulatory sequences flanking a native sequence in a recombinant nucleic acid construct. In addition, stably transformed exogenous nucleic acids typically are integrated at positions other than the position where the native sequence is found. In certain embodiments, the exogenous nucleic acids are targeted to a “safe” landing site. A “safe” site is a genomic region that is devoid of genes and their associated regulatory sequences, and possesses a low likelihood of disrupting normal cellular function or initiating oncogenic transformation of a cell. In certain embodiments, the known safe site is the AAVS1 locus. Exogenous elements may be added to a nucleic acid construct, for example using genetic recombination. Genetic recombination is the breaking and rejoining of DNA strands to form new molecules of DNA encoding a novel set of genetic information.
As used herein, the terms “homologous,” “homology,” or “percent homology,” when used to describe a nucleic acid sequence relative to a reference sequence, can be determined using the formula described by Karlin and Altschul (Proc. Natl. Acad. Sci. USA 87: 2264-2268, 1990, modified as in Proc. Natl. Acad. Sci. USA 90:5873-5877, 1993). Such a formula is incorporated into the basic local alignment search tool (BLAST) programs of Altschul et al. (J. Mol. Biol. 215: 403-410, 1990). Percent homology of sequences can be determined using the most recent version of BLAST, as of the filing date of this application.
Also described herein are polypeptides encoded by the nucleic acids of the disclosure. The terms “polypeptide” and “protein” are used interchangeably to refer to a polymer of amino acid residues, and are not limited to a minimum length. Polypeptides, including antibodies and antibody chains and other peptides, e.g., linkers and binding peptides, may include amino acid residues including natural and/or non-natural amino acid residues. The terms also include post-expression modifications of the polypeptide, for example, glycosylation, sialylation, acetylation, phosphorylation, and the like. In some aspects, the polypeptides may contain modifications with respect to a native or natural sequence, as long as the protein maintains the desired activity. These modifications may be deliberate, as through site-directed mutagenesis, or may be accidental, such as through mutations of hosts which produce the proteins or errors due to PCR amplification.
Percent (%) sequence identity with respect to a reference polypeptide sequence is the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the reference polypeptide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are known for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN or Megalign (DNASTAR) software. Appropriate parameters for aligning sequences are able to be determined, including algorithms needed to achieve maximal alignment over the full length of the sequences being compared. For purposes herein, however, % amino acid sequence identity values are generated using the sequence comparison computer program ALIGN-2. The ALIGN-2 sequence comparison computer program was authored by Genentech, Inc., and the source code has been filed with user documentation in the U.S. Copyright Office, Washington D.C., 20559, where it is registered under U.S. Copyright Registration No. TXU510087. The ALIGN-2 program is publicly available from Genentech, Inc., South San Francisco, Calif., or may be compiled from the source code. The ALIGN-2 program should be compiled for use on a UNIX operating system, including digital UNIX V4.0D. All sequence comparison parameters are set by the ALIGN-2 program and do not vary.
In situations where ALIGN-2 is employed for amino acid sequence comparisons, the % amino acid sequence identity of a given amino acid sequence A to, with, or against a given amino acid sequence B (which can alternatively be phrased as a given amino acid sequence A that has or comprises a certain % amino acid sequence identity to, with, or against a given amino acid sequence B) is calculated as follows: 100 times the fraction X/Y, where X is the number of amino acid residues scored as identical matches by the sequence alignment program ALIGN-2 in that program's alignment of A and B, and where Y is the total number of amino acid residues in B. It will be appreciated that where the length of amino acid sequence A is not equal to the length of amino acid sequence B, the % amino acid sequence identity of A to B will not equal the % amino acid sequence identity of B to A. Unless specifically stated otherwise, all % amino acid sequence identity values used herein are obtained as described in the immediately preceding paragraph using the ALIGN-2 computer program.
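The calculation described above (100 times X/Y) can be illustrated with a short Python sketch. It does not run or reproduce ALIGN-2: it assumes a pairwise alignment has already been produced by some alignment program and is supplied as two equal-length strings with "-" marking gaps. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of A to B: 100 * X / Y.

    X = aligned positions where A and B carry the same residue.
    Y = total number of residues in B (gap characters excluded).
    Inputs are two rows of a precomputed alignment, with '-' as the gap symbol.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    y = sum(1 for b in aligned_b if b != "-")
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


if __name__ == "__main__":
    # Toy alignment: A has 8 residues, B has 6 residues, 6 positions match.
    row_a = "MKTAYIAK"
    row_b = "MKTA--AK"
    print(percent_identity(row_a, row_b))  # identity of A to B (Y = 6)
    print(percent_identity(row_b, row_a))  # identity of B to A (Y = 8)
```

Because the denominator is the length of the second sequence, the two calls give different values for sequences of unequal length, which is the asymmetry noted above.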
As used herein, the terms “individual,” “subject,” and “patient” are interchangeable, and include individuals diagnosed with, or suspected of being afflicted with, a neurodegenerative disease, or selected as having one or more risk factors for a neurodegenerative disease. In certain embodiments, the individual is a mammal. In certain embodiments, the individual is a human person.
GEMM-based approaches still entail cumbersome mouse engineering and significant cross-breeding. Conversely, electroporation and viral transgenesis have enabled quick somatic transgenic investigations of development and disease but lack the precision of GEMMs. Transposons are becoming popular for producing stable somatic transgenics in developmental studies and in vivo tumor modeling. However, these methods suffer from random genomic insertions, position effect variation including transgene shutdown, and copy number variability. MADR overcomes the intrinsic disadvantages associated with these methods, and is a robust strategy for creating somatic mosaics with predefined insertion sites and copy numbers while requiring a negligible amount of colony maintenance. We demonstrated the versatility of MADR to generate combined modes (GOF/LOF) of mutations for multiple tumor drivers expeditiously and flexibly.
In one aspect, the methods herein utilize MADR to create mosaics and tumors in a host of tissues. Additionally, non-integrating viral vectors could be employed to deliver MADR constituents to avoid insertional mutagenesis. Provided in Table 1 is a comparison of in vivo genetic manipulation approaches. In some embodiments of a MADR method, the time for engineering is about 2 weeks per plasmid. In some embodiments of a MADR method, the copy number is 1-2 depending on zygosity of recipient. In some embodiments of a MADR method, breeding is performed with one line per target strain. In some embodiments of a MADR method, expression is generally stable depending on locus silencing. In some embodiments of a MADR method, payload is governed by plasmid limits. In some embodiments of a MADR method, focality depends on electrode orientation. In some embodiments of a MADR method, efficiency can be titered to approach 100% insertion. In some embodiments of a MADR method, transgenes can potentially hop in and out before Flp/Cre dilution. In some embodiments, a MADR method is compatible/complementary with other methods, e.g., orthogonal to CRISPR/Cas variants, HITI, Slendr, and/or Base writers.
The MADR method entails utilization of two different recombinases. One can restrict the cell type specificity of MADR targeting by carefully choosing the combinations of promoters driving the expression of recombinases. In some embodiments, in vivo MADR is performed with bacterial artificial chromosomes. A donor plasmid harboring large chunks of genomic fragments driving the expression of fluorescent reporter or recombinases, such as VCre, can be created with loxP and FRT sites added on each end, enabling further higher-complexity lineage tracing studies. In some embodiments, described herein is a self-excising FlpO-2A-Cre, which shifts the reaction equilibrium toward the complete integration. In some cases, this maximizes MADR efficiency.
Next generation sequencing has exponentially increased the catalogue of recurrent somatic mutations seen in tumors. Further, it is now increasingly appreciated that histologically similar tumors can often have disparate genetic underpinnings with different phenotypes (e.g., K27M vs. G34R). We show proof of principle for using MADR as a platform for rapid ‘personalized’ modeling of diverse glioma types by combining GOF and LOF mutations. To our knowledge, our MADR-based model is the only one successful at recapitulating the spatiotemporal regulation of tumor growth by K27M vs. G34R mutations. Further, by unambiguously comparing K27M and G34R mutant cells side-by-side in vivo in individual animals (a unique advantage of MADR), we have observed the increased ability of K27M to accelerate tumor growth compared to G34R. Thus, while our K27M and G34R models are both 100% penetrant, these distinct mutations at closely situated residues exert distinct and powerful influences over tumor growth dynamics and tumor sites of origin. We noted a similarly remarkable pattern in our novel side-by-side comparisons of YAP1-MAMLD1 and C11orf95-RELA ependymoma models, whereby synchronized MADR transgenesis in the same cell populations led to disparate survival times. This suggests that the clinical age of onset for tumor subtypes may not be reflective only of cell origin or time of mutation, but is also highly dependent on driver-mutation-dictated growth dynamics. There is a “reverse chronology” in terms of enhancers that are activated after PRC2 complex inactivation. Using our novel models combined with single-cell approaches, our observation that K27M tumor cells exhibit a protracted pre-tumor stage culminating in a primitive ES-like transcriptional and epigenetic state is consistent with the possibility that K27M mutation exhibits this same reverse chronology reactivation of developmental enhancers.
In summary, our findings establish MADR as a robust genetic methodology, one which promises to democratize the generation of high-resolution GOF and LOF mosaics, allowing a small lab to model a wide spectrum of genetic subtypes in vivo. Additionally, this genetic framework is adaptable to the thousands of mouse lines already engineered with dual recombinase recognition sites, and can easily be adapted to any cell, organoid or organism that can be engineered with a MADR recipient site. Given MADR's ability to be combined with the existing arsenal of genetic approaches, its single-cell resolution, and its compatibility with sequencing technologies, these tools allow for efficient, higher throughput investigation of gene function in development and disease.
Accordingly, embodiments of the present invention are based, at least in part, on these findings.
Described herein is a system of nucleic acids and/or vectors for rendering a cell transgenic with a transgene of interest. The transgene can be flanked by two different recombinase recognition sites, such as LoxP and FRT, allowing for introduction of the transgene of interest into a specific site of the genome of a cell. In certain embodiments, the transgene of interest comprises a neurotrophic factor. In certain embodiments, the neurotrophic factor comprises glial cell line-derived neurotrophic factor (GDNF), neurturin, growth/differentiation factor (GDF) 5, mesencephalic astrocyte-derived neurotrophic factor (MANF), cerebral dopaminergic neurotrophic factor (CDNF), or combinations thereof. In certain embodiments, the neurotrophic factor comprises GDNF. In certain embodiments, two or more neurotrophic factors may be included on the same or different nucleic acids/vectors for targeting to the genome of a cell.
In certain embodiments, the transgene of interest is under the control of an inducible promoter. An inducible promoter allows transcription, and thus production, of a polypeptide encoded by the transgene of interest to be controlled by administration of an inducing agent. The inducible promoter is one that is not activated or only minimally activated in the absence of an inducing agent. This allows for the production of a neurotrophic factor to be tuned or adjusted in an individual that has been administered a vector that comprises the transgene or cells comprising a vector that comprises the transgene. This allows for enhanced safety and increased therapeutic potential, as levels of neurotrophic factor that are too high have unwanted side effects, and levels that are too low may not be therapeutically effective. In certain embodiments, the inducible promoter is a tetracycline-regulated promoter. In certain embodiments, the transgene of interest that is under the control of an inducible promoter comprises GDNF, neurturin, GDF 5, MANF, CDNF, or combinations thereof. In certain embodiments, the transgene of interest that is under the control of an inducible promoter is GDNF.
In certain embodiments, the systems, nucleic acids and/or vectors further comprise an expression cassette that constitutively expresses a synthetic transcription factor that is activated by a small-molecule compound. In certain embodiments, the synthetic inducible transcription factor is the reverse tetracycline-controlled transactivator (rtTA). The rtTA transactivator is inducible by a tetracycline class antibiotic such as doxycycline. In certain embodiments, the synthetic transcription factor is supplied on a second nucleic acid/vector or the same nucleic acid/vector as that of the neurotrophic factor under control of an inducible element.
In certain embodiments, the neurotrophic factor that can be supplied by the systems, vectors, and nucleic acids described herein comprises GDNF. A GDNF gene supplies, upon transcription and translation, a GDNF polypeptide to an individual that has been administered either the naked vector or a cell(s) comprising the vector. The GDNF gene is a nucleic acid sequence that encodes a GDNF polypeptide, and includes, for example, an open reading frame (ORF) lacking at least one or all introns from an endogenous GDNF gene. In certain embodiments, the GDNF gene is at least about 85%, 90%, 95%, 97%, 98%, 99%, or 100% homologous to the DNA sequence set forth in SEQ ID NO: 1. In certain embodiments, the GDNF gene encodes a polypeptide at least about 85%, 90%, 95%, 97%, 98%, 99%, or 100% identical to the amino acid sequence set forth in SEQ ID NO: 2.
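As an illustration of how a percent-identity threshold such as those recited above might be checked, the following minimal sketch computes identity over two pre-aligned sequences of equal length. The function names and the gap-handling convention are assumptions made for illustration; the sketch does not reproduce SEQ ID NO: 1 or SEQ ID NO: 2, and a practical comparison would first align the sequences with an established alignment tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Assumed convention: columns where both sequences carry a gap ('-')
    are ignored, and a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True when the two aligned sequences are at least `threshold` % identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned ten-residue toy sequences differing at one position would be 90% identical, satisfying an 85% threshold but not a 95% threshold.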
In certain embodiments, the transgene can be flanked by insulator sequences. An insulator sequence is a genetic element that prevents propagation of heterochromatin, and can be used to “insulate” a transgene and its regulatory sequences from epigenetic silencing. In certain embodiments, the insulator sequence can be the gypsy insulator of Drosophila, a Fab family insulator, or the chicken β-globin insulator (cHS4).
The systems, nucleic acids and/or vectors described herein are useful in a method for delivering a gene product to a subject having a neurodegenerative disease or condition. In certain embodiments, the nucleic acids and/or vectors are integrated at a known safe site in the genome in a cell to be administered to an individual with a neurodegenerative disease. The neurodegenerative disease can be Alzheimer's disease, Parkinson's disease, or amyotrophic lateral sclerosis (ALS). Additionally, these nucleic acids and/or vectors are useful in a method to increase GDNF, neurturin, GDF 5, MANF or CDNF protein levels in the brain of an individual, the midbrain of an individual, or the substantia nigra of an individual. In certain embodiments, the nucleic acids/vectors are used in a method to increase GDNF protein levels in the brain of an individual, the midbrain of an individual, or the substantia nigra of an individual.
Methods of delivering a gene product to a subject having a neurodegenerative disease or condition are also described herein. In certain embodiments, the neurodegenerative disorder comprises Parkinson's Disease, Amyotrophic Lateral Sclerosis (ALS), or Alzheimer's Disease. In certain embodiments, the method comprises administering a cell comprising the nucleic acids/vectors described herein to an individual in need thereof. In certain embodiments, the method comprises administering a cell comprising the nucleic acids/vectors comprising an inducible GDNF described herein to an individual in need thereof.
Described herein is a method for delivering a gene product to a subject having a neurodegenerative disease or condition, or an individual afflicted with a neurodegenerative disease or condition, including administering a quantity of cells to the individual afflicted with the neurodegenerative disease or condition, wherein the cells comprise a genomic integrated vector comprising a GDNF gene operably coupled to an inducible promoter, and wherein the GDNF gene and the inducible promoter are flanked by non-viral tandem repeats or recombinase recognition sites.
Also described herein is a method of increasing GDNF levels in the brain of an individual afflicted with a neurodegenerative disease or condition, including a) administering a quantity of cells to the individual afflicted with the neurodegenerative disease or condition, wherein the cells comprise a genomic integrated vector comprising a GDNF gene operably coupled to an inducible promoter, and wherein the GDNF gene and the inducible promoter are flanked by non-viral tandem repeats; and b) administering an inducing agent to the individual. In certain embodiments, the inducing agent is doxycycline.
Also described herein is a method of increasing GDNF levels in the brain of an individual afflicted with a neurodegenerative disease or condition, including administering an inducing agent to the individual; wherein the individual has previously been administered a quantity of cells, wherein the cells comprise a genomic integrated vector comprising a GDNF gene operably coupled to an inducible promoter activated by the inducing agent. In certain embodiments, the inducing agent is doxycycline.
Various embodiments of the present invention provide for a system, comprising: a promoter-less donor vector, comprising a polyadenylation signal or transcription stop element upstream from a transgene or nucleic acid encoding an RNA, the transgene or nucleic acid encoding an RNA, and paired recombinase recognition sites; and one expression vector, comprising two genes encoding recombinases specific to the paired recombinase recognition sites. In certain embodiments, the promoter-less donor vector is selected from the group consisting of plasmid, viral vector, and bacterial artificial chromosome (BAC).
Other embodiments of the present invention provide for a system, comprising: a promoter-less donor vector, comprising a polyadenylation signal or transcription stop element upstream from a transgene or nucleic acid encoding an RNA, the transgene or nucleic acid encoding an RNA, and paired recombinase recognition sites; and two expression vectors, the first expression vector comprising one gene encoding a first recombinase that is specific to one of the paired recombinase recognition sites, and the second expression vector comprising one gene encoding a second recombinase that is specific to the other of the paired recombinase recognition sites. In certain embodiments, the promoter-less donor vector is selected from the group consisting of plasmid, viral vector, and bacterial artificial chromosome (BAC).
In various embodiments, the promoter-less donor vector comprises at least four polyadenylation signals upstream from the transgene or nucleic acid encoding the RNA. In various embodiments, the promoter-less donor vector comprises 2, 3, 4, 5, or 6 polyadenylation signals upstream from the transgene or nucleic acid encoding the RNA.
In various embodiments, the promoter-less donor vector further comprises a post-transcriptional regulatory element. In various embodiments, the promoter-less donor vector further comprises a polyadenylation signal downstream from the transgene or nucleic acid encoding an RNA.
In various embodiments, the promoter-less donor vector further comprises an open reading frame (ORF) that begins with a splice acceptor.
In various embodiments, the promoter-less donor vector further comprises a fluorescent reporter.
In various embodiments, the viral vector is an adeno-associated viral (AAV) vector. In various embodiments, the AAV vector is AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, or AAV9. In various embodiments the viral AAV vector is a hybrid AAV vector; for example, wherein the capsid is derived from another serotype displaying the cell tropism of choice.
In particular embodiments, the promoter-less donor vector comprises: PGK polyadenylation signal (pA); trimerized SV40pA; the transgene or nucleic acid encoding an RNA; loxP and flippase recognition target (FRT); a rabbit beta-globin pA; and a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE).
As non-limiting examples, the paired recombinase recognition sites can be loxP and flippase recognition target (FRT), and the recombinases would be cre and flp; the paired recombinase recognition sites can be VloxP and flippase recognition target (FRT), and the recombinases would be VCre and flp; the paired recombinase recognition sites can be SloxP and flippase recognition target (FRT), and the recombinases would be SCre and flp. As a further non-limiting example, the recombinase can be PhiC31 recombinase, and PhiC31 recognition sites can be attB and attP. PhiC31 recognizes the attB and attP sites and creates attR and attL sites. Thus, in the presence of PhiC31, a plasmid bearing attB will be inserted at a target site bearing attP. Also, as further non-limiting examples, the recombinases can be Nigri, Panto, or Vika and their cognate sites are nox, pox, and vox, respectively.
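The recombinase and recognition-site pairings listed above can be summarized as a simple lookup. The following is a minimal sketch only; the dictionary layout and the function name are hypothetical conveniences, and the table contains only the pairings named in the preceding paragraph.

```python
# Recombinase -> cognate recognition site(s), as listed in the text above.
RECOGNITION_SITES = {
    "Cre": ("loxP",),
    "Flp": ("FRT",),
    "VCre": ("VloxP",),
    "SCre": ("SloxP",),
    "PhiC31": ("attB", "attP"),
    "Nigri": ("nox",),
    "Panto": ("pox",),
    "Vika": ("vox",),
}


def recombinases_for(sites):
    """Return the recombinases needed for the given sites, in site order.

    Each recombinase is listed once even if several of its sites are given.
    """
    needed = []
    for site in sites:
        matches = [r for r, s in RECOGNITION_SITES.items() if site in s]
        if not matches:
            raise KeyError(f"no recombinase listed for site {site!r}")
        if matches[0] not in needed:
            needed.append(matches[0])
    return needed
```

Under this sketch, a donor flanked by loxP and FRT calls for Cre and Flp, whereas a donor flanked by VloxP and FRT calls for VCre and Flp.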
In various aspects the paired recombinase recognition sites are chosen to increase the efficiency of integration of transgene or inducible transgene into the genome of a host cell. In certain embodiments, a variant LoxP site is paired with a wild-type or variant FRT site. In certain embodiments, a variant FRT site is paired with a wild-type or variant LoxP site. In certain embodiments, a variant Lox selected from Lox71, Lox66, lox511, lox5171, lox2272 is paired with a wild-type or variant FRT site. In certain embodiments, a Lox71 site is paired with an FRT site or variant FRT site. In certain embodiments, a Lox66 site is paired with an FRT site or variant FRT site. In certain embodiments a variant FRT selected from FRT1, FRT2, FRT3, FRT4, FRT5, FRT12, FRT13, FRT14, FRT545 is paired with a wild-type FRT. In certain embodiments a variant FRT selected from FRT1, FRT2, FRT3, FRT4, FRT5, FRT12, FRT13, FRT14, FRT545 is paired with a wild-type LoxP. In certain embodiments, the choice of paired recombination sites increases the efficiency of transgenic insertion into a cellular genome by 25%, 50%, 75%, or 100% or more.
In various embodiments, one or both of the paired recombinase recognition sites comprise a mutation. In various embodiments, the mutation for loxP is selected from lox71, lox75, lox44, loxJT15, loxJT12, loxJT510, lox66, lox76, lox43, loxJTZ2, loxJTZ17, loxKR3, loxBait, lox5171, lox2272, lox2722, m2, and combinations thereof. In various embodiments, the mutation for FRT is selected from FRT+10, FRT+11, FRT−10, FRT−11, F3, F5, F13, F14, F15, F5T2, F545, f2161, f2151, f2262, f61, and combinations thereof.
The mutation can allow for better transgenesis, and thus, new transgenic mice do not need to be generated. Furthermore, combinatorial experiments can be applied in a shorter window of time which allows for results to be obtained immediately when more than two different donor plasmids are used. This is also valuable in models wherein the organisms develop faster than mice.
In various embodiments, the RNA in the system(s) is siRNA, snRNA, sgRNA, lncRNA or miRNA. In various embodiments, the transgene or the nucleic acid encoding an RNA comprises disease associated mutations. In various embodiments, the transgene or the RNA comprise a gain-of-function (GOF) gene mutation, loss-of-function (LOF) gene mutation, or both. In various embodiments, the transgene or RNA is selected from the group consisting of an oncogene, loss-of-function (LOF) mutation of a tumor suppressor gene, gain-of-function (GOF) mutation of a proto-oncogene, pseudogene, siRNA, snRNA, sgRNA, lncRNA, miRNA, epigenetic modification, non-coding genetic or epigenetic abnormality associated with human disease, and combinations thereof.
Various embodiments of the present invention provide for a promoter-less donor vector, comprising: a polyadenylation signal or transcription stop element upstream from a transgene or nucleic acid encoding an RNA; the transgene or nucleic acid encoding an RNA; and paired recombinase recognition sites. In certain embodiments, the promoter-less donor vector is selected from the group consisting of plasmid, viral vector, and bacterial artificial chromosome (BAC).
In various embodiments, the promoter-less donor vector comprises at least four polyadenylation signals upstream from the transgene or nucleic acid encoding the RNA. In various embodiments, the promoter-less donor vector comprises 2, 3, 4, 5, or 6 polyadenylation signals upstream from the transgene or nucleic acid encoding the RNA.
In various embodiments, the promoter-less donor vector further comprises a post-transcriptional regulatory element. In various embodiments, the promoter-less donor vector further comprises a polyadenylation signal downstream from the transgene or nucleic acid encoding an RNA.
In various embodiments, the promoter-less donor vector further comprises an open reading frame (ORF) that begins with a splice acceptor.
In various embodiments, the promoter-less donor vector further comprises a fluorescent reporter.
In various embodiments, the viral vector is an adeno-associated viral (AAV) vector. In various embodiments, the AAV vector is AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, or AAV9. In various embodiments the viral AAV vector is a hybrid AAV vector; for example, wherein the capsid is derived from another serotype displaying the cell tropism of choice.
In particular embodiments, the promoter-less donor vector comprises: PGK polyadenylation signal (pA); trimerized SV40pA; the transgene or nucleic acid encoding an RNA; loxP and flippase recognition target (FRT); a rabbit beta-globin pA; and a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE).
As non-limiting examples, the paired recombinase recognition sites can be loxP and flippase recognition target (FRT); the paired recombinase recognition sites can be VloxP and flippase recognition target (FRT); the paired recombinase recognition sites can be SloxP and flippase recognition target (FRT). As a further non-limiting example, the recombinase can be PhiC31 recombinase. PhiC31 recognizes the attB and attP sites and creates attR and attL sites. Also, as further non-limiting examples, the recombinases can be Nigri, Panto, or Vika.
In various aspects the paired recombinase recognition sites are chosen to increase the efficiency of integration of transgene or inducible transgene into the genome of a host cell. In certain embodiments, a variant LoxP site is paired with a wild-type or variant FRT site. In certain embodiments, a variant FRT site is paired with a wild-type or variant LoxP site. In certain embodiments, a variant Lox selected from Lox71, Lox66, lox511, lox5171, lox2272 is paired with a wild-type or variant FRT site. In certain embodiments, a Lox71 site is paired with an FRT site or variant FRT site. In certain embodiments, a Lox66 site is paired with an FRT site or variant FRT site. In certain embodiments a variant FRT selected from FRT1, FRT2, FRT3, FRT4, FRT5, FRT12, FRT13, FRT14, FRT545 is paired with a wild-type FRT. In certain embodiments a variant FRT selected from FRT1, FRT2, FRT3, FRT4, FRT5, FRT12, FRT13, FRT14, FRT545 is paired with a wild-type LoxP. In certain embodiments, the choice of paired recombination sites increases the efficiency of transgenic insertion into a cellular genome by 25%, 50%, 75%, or 100% or more.
In various embodiments, one or both of the paired recombinase recognition sites comprise a mutation. In various embodiments, the mutation for loxP is selected from lox71, lox75, lox44, loxJT15, loxJT12, loxJT510, lox66, lox76, lox43, loxJTZ2, loxJTZ17, loxKR3, loxBait, lox5171, lox2272, lox2722, m2, and combinations thereof. In various embodiments, the mutation for FRT is selected from FRT+10, FRT+11, FRT−10, FRT−11, F3, F5, F13, F14, F15, F5T2, F545, f2161, f2151, f2262, f61, and combinations thereof. The mutation can allow for better transgenesis, and thus, new transgenic mice do not need to be generated. Furthermore, combinatorial experiments can be applied in a shorter window of time, which allows for results to be obtained immediately when more than two different donor plasmids are used. This is also valuable in models wherein the organisms develop faster than mice.
In various embodiments, the RNA in the system(s) is siRNA, snRNA, sgRNA, lncRNA or miRNA. In various embodiments, the transgene or the nucleic acid encoding an RNA comprises disease associated mutations. In various embodiments, the transgene or the RNA comprise a gain-of-function (GOF) gene mutation, loss-of-function (LOF) gene mutation, or both. In various embodiments, the transgene or RNA is selected from the group consisting of an oncogene, loss-of-function (LOF) mutation of a tumor suppressor gene, gain-of-function (GOF) mutation of a proto-oncogene, pseudogene, siRNA, snRNA, sgRNA, lncRNA, miRNA, epigenetic modification, non-coding genetic or epigenetic abnormality associated with human disease, and combinations thereof.
In particular embodiments, the promoter-less donor vector comprises: PGK polyadenylation signal (pA); trimerized SV40pA; a transgene or nucleic acid encoding an RNA; loxP and flippase recognition target (FRT); a rabbit beta-globin pA; and a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE).
Various embodiments provide for a method of genetic manipulation of a mammalian cell, comprising: transfecting or transducing the mammalian cell with a system of the present invention.
In various embodiments, the mammalian cell is a human cell and the system of the present invention targets AAVS1 locus, H11, HPRT1, or ROSA26, and the method is an in vitro or ex vivo method.
In various embodiments, the mammalian cell is a mouse cell and the system of the present invention targets ROSA26, Hipp11, Tigre, ColA1, or Hprt. In these embodiments, the method is in vitro, in vivo, or ex vivo.
Various embodiments of the present invention provide for a non-human animal model, comprising: a non-human animal comprising a system of the present invention, wherein the transgene or RNA is selected from the group consisting of an oncogene, loss-of-function (LOF) mutation of a tumor suppressor gene, gain-of-function (GOF) mutation of a proto-oncogene, pseudogene, siRNA, shRNA, sgRNA, lncRNA, miRNA, epigenetic modification, non-coding genetic or epigenetic abnormality associated with human disease, and combinations thereof.
Various embodiments of the present invention provide for a non-human animal model, comprising: a non-human animal wherein a system of the present invention has been administered to the non-human animal, and wherein the transgene or RNA is selected from the group consisting of an oncogene, loss-of-function (LOF) mutation of a tumor suppressor gene, gain-of-function (GOF) mutation of a proto-oncogene, pseudogene, siRNA, shRNA, sgRNA, lncRNA, miRNA, epigenetic modification, non-coding genetic or epigenetic abnormality associated with human disease, and combinations thereof.
In various embodiments, the non-human animal model is a personalized non-human animal model of a human subject's cancer and the transgene or RNA is based on the human subject's cancer. In various embodiments, the non-human animal model is a personalized non-human animal model of a human subject's disease or condition and the transgene or RNA is based on the human subject's disease or condition. “Based on,” as used in reference to a human subject's disease, condition, or cancer, refers to having the transgene or RNA model the genetic profile of the human subject's disease, condition, or cancer. As a non-limiting example, a transgene based on a human subject's cancer can be a gene carrying a gain-of-function mutation that is believed to be a cause of the human subject's cancer.
In various embodiments, the non-human animal model comprises a gain of function mutation (GOF), a loss of function mutation (LOF), or both.
Various embodiments provide for a method of generating the non-human animal model of the present invention, comprising: transfecting or transducing the non-human animal model with a system of the present invention, wherein the transgene or RNA is selected from the group consisting of an oncogene, loss-of-function (LOF) mutation of a tumor suppressor gene, gain-of-function (GOF) mutation of a proto-oncogene, pseudogene, siRNA, shRNA, sgRNA, lncRNA, miRNA, epigenetic modification, non-coding genetic or epigenetic abnormality associated with human disease, and combinations thereof.
The system of the present invention is as described above and herein.
Various embodiments of the present invention provide for a method of assessing the effects of a drug candidate, comprising: providing the non-human animal model of the present invention; administering the drug candidate to the non-human animal model; and assessing the effects of the drug candidate on the non-human animal model.
In various embodiments, the method further comprises identifying the drug candidate as beneficial when the drug candidate provides beneficial results. In various embodiments, the method further comprises identifying the drug candidate as non-beneficial when the drug candidate does not provide beneficial results.
Various embodiments of the present invention provide for a mammalian cell comprising a system of the present invention as described herein. Other embodiments provide for a mammalian cell comprising a promoter-less donor vector of the present invention as described herein.
In various embodiments, the mammalian cell is a human cell. In various embodiments, the mammalian cell is a pluripotent cell. In various embodiments, the pluripotent cell is an induced pluripotent cell.
Various embodiments of the present invention provide for a mammalian cell comprising a genomic integrated transgene, wherein the genomic integrated transgene comprises a neurotrophic factor, and is integrated at a genomic site comprising an AAVS1 locus, H11 locus, or HPRT1 locus.
In various embodiments, the mammalian cell is a human cell. In various embodiments, the human cell is an induced pluripotent stem cell.
In various embodiments, the neurotrophic factor comprises glial cell line-derived neurotrophic factor (GDNF), neurturin, growth/differentiation factor (GDF) 5, mesencephalic astrocyte-derived neurotrophic factor (MANF), cerebral dopaminergic neurotrophic factor (CDNF), or combinations thereof. In various embodiments, the neurotrophic factor is GDNF.
In various embodiments, the neurotrophic factor is under the control of an inducible promoter. In various embodiments, the inducible promoter is a tetracycline inducible promoter. In various embodiments, the neurotrophic factor and/or the inducible promoter are flanked by one or more of a recombinase recognition site, a tandem repeat of a transposable element, or an insulator sequence.
Various embodiments of the present invention provide for a method of delivering a gene product to an individual with a neurodegenerative disease or disorder comprising administering a mammalian cell of the present invention as described herein.
In various embodiments, the neurodegenerative disease or disorder comprises Parkinson's Disease, Amyotrophic Lateral Sclerosis (ALS), or Alzheimer's Disease.
In various embodiments, the neurodegenerative disease or disorder comprises Parkinson's Disease.
In various embodiments, the neurodegenerative disease or disorder comprises Amyotrophic Lateral Sclerosis (ALS).
Various embodiments of the present invention provide for a method of increasing a GDNF protein level in the brain of an individual comprising administering a mammalian cell of the present invention to the individual.
The following examples are provided to better illustrate the claimed invention and are not to be interpreted as limiting the scope of the invention. To the extent that specific materials are mentioned, it is merely for purposes of illustration and is not intended to limit the invention. One skilled in the art may develop equivalent means or reactants without the exercise of inventive capacity and without departing from the scope of the invention.
All mice were used in accordance with the Cedars-Sinai Institutional Animal Care and Use Committee. Embryonic day (E) 0.5 was established as the day of vaginal plug. Wild-type CD1 mice were provided by Charles River Laboratories. Gt(ROSA)26Sortm4(ACTB-tdTomato,-EGFP)Luo/J and Gt(ROSA)26Sortm1.1(CAG-EGFP)Fsh/Mmjax mice (JAX Mice) were bred with wild-type CD1 mice (Charles River) or C57BL/6J mice to generate heterozygous mice. Male and female embryos between E12.5 and E15.5 were used for the in utero electroporations, and pups between postnatal day (P) 0 and P21 for the postnatal experiments. Pregnant dams were kept in single cages and pups were kept with their mothers until P21, in the institutional animal facility under standard 12:12 h light/dark cycles.
The pDonor plasmids were derived from PGKneotpAlox2, using In-Fusion cloning (Clontech) or NEBuilder HiFi DNA Assembly Master Mix (NEB) in combination with standard restriction digestion techniques (Breunig et al., 2015; Soriano, 1999). Briefly, an FRT site was created by annealing two oligos and infusing the insert into PGKneotpAlox2. Downstream generation of donor plasmids was done by removing the existing ORF and adding a new cassette using In-Fusion or ligation, as was done for the smFP-HA ORF (Addgene 59759). PB-CAG-plasmids were previously described and created using a combination of In-Fusion, NEB assembly, and ligation strategies (Breunig et al., 2015; Breunig et al., 2012). Primer sequences used for In-Fusion or assembly reactions are available upon request. PCR was done using a standard protocol with KAPA HiFi PCR reagents. The original CMV Flp-2A-Cre and CMV Flp-IRES-Cre recombinase expression constructs were previously validated in the context of in vitro dRMCE (Anderson et al., 2012).
The AAVS1-targeting MADR vector was derived from the AAVS1-targeting vector AAVS1_Puro_PGK1_3×FLAG_Twin_Strep (Addgene 68375). TagBFP2-V5-nls-P2A-puroR-Cag-LoxP-TdTomato-FRT was inserted into this AAVS1 vector, and a human cell line was transfected with it and selected in puromycin. MADR-SM_FP-myc (bright) and MADR-TagBFP2-3flag WPRE were transfected into the resulting stable cell line with Cag-Flpo-2A-Cre to induce the MADR reaction.
KAPA HiFi PCR reagents were used to PCR genomic DNA collected from mouse MADR lines. Amplicons were run on an E-Gel apparatus to assess size.
Gt(ROSA)26Sortm4(ACTB-tdTomato,-EGFP)Luo/J and Gt(ROSA)26Sortm1.1(CAG-EGFP)Fsh/Mmjax mice (JAX Mice) were bred with wild-type CD1 (Charles River) or C57BL/6J (JAX) mice to generate heterozygous mice. Postnatal lateral ventricle EPs were performed as previously described (Breunig et al., 2015). P1-3 pups were placed on ice for ~5 min. All DNA mixtures contained 0.5-1 μg/μl of Flp-Cre expression vector, donor plasmid, hypBase, or CAG-reporter plasmids diluted in Tris-EDTA buffer, unless noted otherwise. Fast green dye was added (10% v/v) to the mixture, which was injected into the lateral ventricle. Platinum Tweezertrodes delivered 5 pulses of 120 V (50 ms; separated by 950 ms) from the ECM 830 System (Harvard Apparatus). SignaGel was applied to increase conductance. Mice were warmed under a heat lamp and returned to their cages.
In utero electroporation experiments were performed according to standard methods (McKenna et al., 2011). TagBFP2-HRasG12V and Flp-Cre plasmids were EPed into E14.5 RCE mouse embryos. After electroporation, the embryos were allowed to survive to P15, at which time TagBFP2-HRasG12V (MADR-mediated insertion), EGFP (non-MADR Cre-mediated recombination), and Sox2 expression were analyzed by immunostaining.
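The timing of the pulse train described above can be worked through with a short calculation. This is a minimal sketch, assuming that “separated by 950 ms” refers to the gap between the end of one pulse and the start of the next; the class and method names are hypothetical and do not correspond to any instrument software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PulseTrain:
    n_pulses: int
    voltage_v: float
    pulse_ms: float
    interval_ms: float  # assumed: gap between end of one pulse and start of the next

    def onsets_ms(self):
        """Start time of each pulse, measured from the first pulse."""
        period = self.pulse_ms + self.interval_ms
        return [i * period for i in range(self.n_pulses)]

    def total_ms(self):
        """Time from the start of the first pulse to the end of the last."""
        return self.n_pulses * self.pulse_ms + (self.n_pulses - 1) * self.interval_ms


# Parameters taken from the text: 5 pulses of 120 V, 50 ms each, 950 ms apart.
train = PulseTrain(n_pulses=5, voltage_v=120.0, pulse_ms=50.0, interval_ms=950.0)
```

Under that assumption, the pulses begin once per second and the whole train lasts 5 × 50 ms + 4 × 950 ms = 4050 ms.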
In our experimentation, we have successfully employed in vivo electroporation, in vitro electroporation (i.e. nucleofection), and lipofection to effect MADR.
In vivo electroporation is believed to work by allowing plasmid DNA to permeate the plasma membrane and enter the nuclear space of cells undergoing mitosis. Thus, it is believed to be largely specific for the proliferating populations. However, postmitotic cells may also be targeted by mixing nuclear pore dilators with the DNA.
As we have shown in our description of MADR, this approach facilitates stable expression of single-copy transgenes for studying development and disease. The number of MADR transduced cells is largely dictated by the concentration of the MADR donor, the concentration of FlpO and Cre recombinases, and the proliferation rate of the targeted populations. Specifically, as we have shown, the number of MADR cells versus Cre recombined cells can be titrated in a defined population by varying the ratio of donor plasmid to recombinase plasmid.
However, as can be seen in our postnatal electroporations, we note that under the standard conditions that we have chosen (100 ng/μl of recombinase: 1000 ng/μl of donor plasmid), a pattern emerges whereby MADR transduction inversely correlates with the initial mitotic activity of the cells. Specifically, striatal glia are readily Cre recombined but are more rarely MADR transduced. Conversely, the radial glial populations, which are relatively more quiescent as bona fide neural stem cells, make up a major population of MADR cells. Notably, ependymal cells, which have been recently reported to be the result of terminal asymmetric or symmetric divisions, tend to be readily targeted by MADR, presumably because they do not dilute the plasmids after the initial cell division targeted by electroporation. The cell cycle of the CNS lengthens over development, and postnatal cells are relatively more quiescent than their embryonic counterparts, so smaller initial populations are typically transduced by postnatal electroporation. Thus, if large numbers of parenchymal glia or embryonically-generated neurons are desired, in utero electroporation may be performed targeting the local region (i.e.,
We have not observed significant differences in MADR efficiency based on donor plasmid size within the standard range of plasmid DNA (4 Kb up to 18 Kb). Empirical testing using time-lapse imaging of MADR donors in proxy cells in vitro at 3 days post lipofection is in agreement with in vivo observations (data not shown). Plasmid mixes were based on identical molar ratios of individual donor variants. However, altering signaling pathways involved in cell fate, survival, proliferation, etc. will likely lead to changes in overall MADR cell numbers compared with using only genetic reporters.
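Because donor plasmids can differ considerably in size, mixing them at identical molar ratios means that the mass of each plasmid scales with its length. The following minimal sketch illustrates that arithmetic. The plasmid names, sizes, and total mass in the usage note are hypothetical, and the figure of approximately 650 g/mol per base pair is a commonly used approximation that is not taken from this description.

```python
# Assumption: average mass of one double-stranded DNA base pair, in g/mol.
# ~650 g/mol per bp is a widely used approximation, not a value from the text.
AVG_BP_MASS_G_PER_MOL = 650.0


def ng_to_pmol(mass_ng: float, size_bp: int) -> float:
    """Convert a dsDNA mass in nanograms to picomoles for a plasmid of size_bp."""
    return mass_ng * 1000.0 / (size_bp * AVG_BP_MASS_G_PER_MOL)


def equimolar_masses(sizes_bp: dict, total_ng: float) -> dict:
    """Split total_ng so that every plasmid is present at the same molar amount.

    For equal moles, the mass of each plasmid is proportional to its length.
    """
    total_bp = sum(sizes_bp.values())
    return {name: total_ng * bp / total_bp for name, bp in sizes_bp.items()}
```

For instance, splitting 1100 ng between a hypothetical 4 Kb donor and a hypothetical 18 Kb donor would call for 200 ng and 900 ng, respectively, which correspond to the same number of picomoles of each plasmid.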
We typically employ the strong CAG promoter due to its presence in the mouse lines that we utilize. However, there are several ways of attenuating the strength of this promoter:
After anesthesia, mouse brains were isolated and fixed in 4% paraformaldehyde on a rotator/shaker overnight at 4° C. Brains were embedded in 4% low-melting point agarose (Thermo Fisher) and sectioned at 70 μm on a vibratome (Leica).
Immunohistochemistry (IHC) was performed using standard methodology as previously described (Breunig et al., 2015). Agarose sections were stored in Phosphate Buffered Saline (PBS) with 0.05% sodium azide until use. Details on the primary antibodies can be found in Table 3. All primary antibodies were used in PBS-0.03% Triton with 5% normal donkey serum. All secondary antibodies (Jackson ImmunoResearch) were used at 1:1000. Care was taken when including fast green dye for ventricle targeting in shorter duration experiments. Though the dye rapidly diluted in longer survival experiments, it confounded early (0-2 day) single-copy reporter detection and was omitted in these cases because of fluorescence in the far red wavelengths.
For pre-bleached immunohistochemistry, 70 μm tissue sections were dehydrated with increasing concentrations of methanol (20%, 40%, 60%, 80%, 100%) for 15 minutes each in water at RT, and then treated overnight with 5% H2O2 in 100% methanol at 4° C. Tissue was then rehydrated using methanol (100%, 80%, 60%, 40%, 20%), 15 minutes each in water, and then washed with PBS before proceeding with normal immunostaining.
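The graded methanol series above can be laid out as an ordered list of steps, which also makes the timed portion of the procedure easy to total. This is a minimal sketch with hypothetical function names; the untimed steps (the overnight bleach and the PBS wash) are recorded with no duration because the text gives none in minutes.

```python
def methanol_series(step_min=15, grades=(20, 40, 60, 80, 100)):
    """Ordered (description, minutes) steps for the pre-bleaching procedure.

    Steps without a stated duration in minutes carry None.
    """
    steps = [(f"{g}% methanol", step_min) for g in grades]            # dehydration
    steps.append(("5% H2O2 in 100% methanol, 4 C, overnight", None))  # bleaching
    steps += [(f"{g}% methanol", step_min) for g in reversed(grades)]  # rehydration
    steps.append(("PBS wash", None))
    return steps


def timed_minutes(steps):
    """Total of the steps that have an explicit duration in minutes."""
    return sum(minutes for _, minutes in steps if minutes is not None)
```

With five grades at 15 minutes each in both directions, the timed steps add up to 150 minutes, excluding the overnight bleaching and the final wash.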
Three heterozygous P0 mTmG pup brains were dissociated to establish the mouse neural stem cell line used in the study. The cell line was maintained as previously described (Breunig et al., 2015). Cells were grown in media containing Neurobasal®-A Medium (Life Technologies 10888-022) supplemented with B-27 without vitamin A (Life Technologies 12587-010), GlutaMAX (Life Technologies 35050), Antibiotic-Antimycotic (Life Technologies 15240), human epidermal growth factor (hEGF) (Sigma E9644), heparin (Sigma H3393), and basic fibroblast growth factor (bFGF) (Millipore GF003). Mouse NSC nucleofection was performed using the Nucleofector 2b device and Mouse Neural Stem Cell Kit according to manufacturer's recommendations (Lonza AG). The nucleofection mixtures contained plasmids with equal concentrations of 10 ng/μl.
N2A proxy cells expressing PIP-Venus/mCherry-hGEM1/110 were plated in a 96-well format and imaged with a 20× objective lens under phase, red and green fluorescence using an Incucyte S3 System (Essen Bioscience, Ann Arbor, Mich.). Images were collected every 30 min using Incucyte S3 Software.
In high-throughput drug testing experiments, 10,000 cells from the cell lines generated from tumor dissociation and from non-tumor control cells were plated in 96-well plates. 24 hours after seeding, the appropriate concentration of each drug (1 μM for Vacquinol-1 (Sigma-Aldrich, SML1187) and 0.5 μM for AKT 1/2 kinase inhibitor (Sigma-Aldrich, A6730)) was added to the media, and cells were imaged for 92 hours in phase contrast using the Incucyte S3 System. Images were collected every 2 hours using Incucyte S3 Software. Cell proliferation image analysis was done with Incucyte S3 software, and normalized results were presented and analyzed with GraphPad Prism 7.
All fixed images were collected on a Nikon A1R inverted laser confocal microscope. The live image of mNSCs was obtained on an EVOS digital fluorescence inverted microscope. For whole brain images, the automated stitching function of Nikon Elements was used. ND2 files were then imported into ImageJ to create Z-projection images, which were subsequently edited in Adobe Photoshop CS6. In several rotated images (e.g.
For each condition, two pups were EPed with pCAG-TagBFP2-nls, pDonor-smFP-HA, and Flp-2A-Cre. The brains were taken two days post-EP, and two non-adjacent sections from each brain were stained with HA-Tag antibody and EGFP. For each section, ~25 BFP+ cells were randomly selected, and the HA+ and EGFP+ cells among them were counted. The proportions were averaged over four sections for each group.
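The per-group quantification described above can be sketched as follows. This is an illustrative calculation only: the counts are hypothetical, and the dictionary keys (`bfp`, `ha`, `egfp`) are names chosen for this example.

```python
from statistics import mean


def marker_fractions(sections):
    """Average the HA+ and EGFP+ fractions of BFP+ cells across sections.

    Each section is a dict holding the number of sampled BFP+ cells and how
    many of those cells were also HA+ or EGFP+.
    """
    ha = [s["ha"] / s["bfp"] for s in sections]
    egfp = [s["egfp"] / s["bfp"] for s in sections]
    return mean(ha), mean(egfp)


# Hypothetical counts: two pups, two non-adjacent sections each.
group = [
    {"bfp": 25, "ha": 10, "egfp": 5},
    {"bfp": 25, "ha": 12, "egfp": 5},
    {"bfp": 25, "ha": 8, "egfp": 5},
    {"bfp": 25, "ha": 10, "egfp": 5},
]
ha_fraction, egfp_fraction = marker_fractions(group)
print(f"HA+: {ha_fraction:.2f}, EGFP+: {egfp_fraction:.2f}")
```

Averaging per-section fractions, rather than pooling all counts, gives each section equal weight even when the number of sampled BFP+ cells varies slightly.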
Cells were collected as previously described (Breunig et al., 2015). Cells were briefly rinsed in PBS, removed by enzymatic dissociation using Accutase (Millipore), pelleted at 250 g for 3 min, and resuspended in the media. FACS was done on a Beckman Coulter MoFlo at the Cedars-Sinai Flow Cytometry Core.
The cell pellets were resuspended in Laemmli buffer and boiled for 5 min at 95° C. Protein concentrations were measured on a ThermoScientific NanoDrop 2000. After SDS-PAGE separation and transfer onto nitrocellulose membranes, proteins were detected using the antibodies listed in Table 3, diluted in 5% milk in 0.1% PBS-Tween. All secondary antibodies (Li-cor IRDye®) were used at 1:15000. Proteins were visualized by infrared detection using the Li-Cor Odyssey® CLX Imaging System.
mTmG mNSCs were nucleofected (Lonza VPG-1004) with 6 μg of either piggyBac or MADR TagBFP plasmid and 6 mg of FlpO 2A Cre in a T75 flask. After 4 days, cells were sorted by FACS, and 100,000-200,000 BFP+ cells were seeded onto Milo scWestern chips (ProteinSimple C300). Each chip was stained for guinea pig mKate (Kerafast EMU108) at 1:20 in Cy3 and rabbit histone H3 (Cell Signaling 4499) at 1:20 in 647. Imaging was performed using the Innoscan 710 microarray scanner.
Doxycycline (Clontech 631311) was added to culture media at the final concentration of 100 ng/ml. Puromycin (Clontech 631305) was used at 1 μg/ml.
We have previously used FlEx-based transgene expression, specifically Cre-mediated inversion and activation of an EGFP cassette (FlEx-EGFP). To test our multi-miR-E targeting Nf1, Pten, and Trp53, we made a CAG-driven FlEx-based construct harboring the multiple miR-Es (FlEx-multi-miR-E). A postnatal mNSC line was established by dissociating CD1 pup brains and was transfected with EGFP or FlEx-multi-miR-E and a Cre-recombinase vector. Fluorescent cells were sorted and subjected to mRNA extraction and a SYBR-based Fluidigm BioMark dynamic array using qPCR probes for Nf1, Pten, and Trp53.
For whole mount imaging, the iDisco tissue clearing method was used (Renier et al., 2014). Fixed samples were gradually dehydrated in 20%, 40%, 60%, 80%, 100%, 100% methanol/H2O, 1 hour each at RT, and then bleached overnight in 5% H2O2 in 100% methanol at 4° C., followed by a gradual rehydration (80%, 60%, 40%, 20% methanol/H2O, then PBS with 0.2% Triton X-100, 1 hour each at RT). Samples were then incubated in PBS with 0.2% Triton X-100, 20% DMSO, and 0.3M glycine for 2 days at 37° C. to permeabilize tissue, and then incubated in PBS with 0.2% Triton X-100, 10% DMSO, and 6% normal donkey serum for 2 days at 37° C. to block the tissue for staining. Samples were then incubated with primary antibodies in PBS with 0.2% Triton and 10 μg/ml heparin (PTwH), at 37° C. for 5 days, followed by 5 washes of PTwH, 1 hour each at RT, plus 1 overnight wash at RT. Samples were then incubated in secondary antibodies in PTwH, at 37° C. for 5 days, followed by 5 washes of PTwH, 1 hour each at RT, plus 1 overnight wash at RT.
Following staining, samples were again dehydrated gradually in 20%, 40%, 60%, 80%, 100%, 100% methanol/H2O, 1 hour each at RT, and then stored overnight in 100% methanol at 4° C. Samples were then incubated in a solution of 66% dichloromethane (DCM, Sigma 270997) in methanol for 3 hours at RT, followed by 2 washes with 100% DCM, 15 minutes each at RT, and then placed directly into dibenzyl ether (DBE, Sigma 108014) for clearing and imaging. Cleared samples were stored in DBE in glass containers at RT in the dark. Samples were imaged in DBE using a light sheet microscope (Ultramicroscope II, LaVision Biotec) equipped with an sCMOS camera (Andor NEO 5.5) and a 2×/0.5 objective lens with a 6 mm WD dipping cap.
Light sheet datasets were imported into Imaris 9.1 (Bitplane) for 3D visualization. To digitally remove artifacts and fluorescent debris, the surface tool was used to create surface renderings of unwanted fluorescence, and the ‘mask all’ function in the surface menu was used to create fluorescence channels with debris removed. To create a digital surface of the whole sample, the volume-rendering tool was set to ‘normal shading’ and the color was set to gray. Movies of 3-D datasets were generated using the ‘animation’ tool.
Samples were generated for expansion microscopy following the Pro-ExM protocol (Tillberg et al. 2015). Briefly, 100 μm sections were stained for EGFP and HA-tag. Before expansion, samples were imaged in water using a confocal microscope (Nikon A1R) for pre-expansion imaging.
Samples were anchored in 0.1 mg/ml Acryloyl-X, SE (6-((acryloyl)amino)hexanoic acid, succinimidyl ester; Thermo Fisher) in PBS with 10% DMSO, overnight at RT. After washing with PBS (3×10 minutes), samples were incubated for 30 minutes at 4° C. in monomer solution (PBS, 2 M NaCl, 8.625% (w/w) sodium acrylate, 2.5% (w/w) acrylamide, 0.15% (w/w) N,N-methylenebisacrylamide), immediately after addition of 0.2% (w/w) tetramethylethylenediamine (TEMED), 0.2% (w/w) ammonium persulfate (APS), and 0.1% (w/w) 4-hydroxy-2,2,6,6-tetramethylpiperidin-1-oxyl (4-hydroxy-TEMPO). Slices were then incubated for 2 hours at 37° C. for gelation. After incubation, samples were incubated overnight in a 6-well plate at RT with no shaking in a digestion solution containing Proteinase K (New England Biolabs) diluted to 8 units/ml in digestion buffer (50 mM Tris pH 8, 1 mM EDTA, 0.5% Triton X-100, 1 M NaCl). Following digestion, samples were washed with excess H2O 4 times, 1 hour per wash at RT, and then stabilized in 2% low melting agarose in H2O before imaging. Images were acquired using a confocal microscope (Nikon A1R) with a 40× long WD objective (Nikon CFI Apo 40xw NIR).
After bleaching, immunohistochemistry was performed to stain for EGFP in the 405 channel. After incubation in secondary antibody, sections were incubated in 50 μM Draq5 (Cell Signaling 4084S) in PBS for 2 minutes at RT, followed by washes of PBS (3×5 minutes). Sections were then incubated in 2% w/w Eosin Y (Sigma E4009) in 80% ethanol for 2 minutes at RT, followed by washes with PBS (3×5 minutes). Finally, sections were incubated in another Draq5 solution (50 μM in PBS) for 3 minutes, before washing with PBS, mounting, and imaging.
For each condition, pups were EPed with pDonor-smFP-Myc and Flpo-2A-Cre. The brains were taken two days post-EP, and two non-adjacent sections from each brain were stained with Myc-Tag antibody and EGFP. For each section, cells were quantified for insertion (Myc expressed) and Cre excision (only EGFP expressed) using Syglass VR with an Oculus Rift system. Quantifications were indicated as percentages of total cells counted per section. The proportions were averaged over two sections from different animals for each group. Fast green was omitted from these assays as the dye was found to fluoresce in the same wavelengths as Alexa647. Though the dye rapidly diluted in longer survival experiments, it confounded early (0-2 day) single-copy reporter detection.
Reverse scaffold and forward primers (IDT DNA) were combined in a PCR reaction and subsequent purification to make concentrated sgRNAs (Ran et al., 2013). 100 ng of each fragment was combined with plasmid DNA for EP. We used previously-validated target sites for tumor modeling (Xue et al., 2014, Heckl et al., 2014) (Table 3).
A pure population of tumor cells was obtained by FACS and genomic DNA was isolated (Qiagen DNeasy). Using primers flanking the gRNA target site, we PCR-amplified the regions expected to contain InDel mutations for Nf1, Trp53, and Pten. The PCR-amplified fragments were TOPO-cloned using the Thermo Fisher Zero Blunt TOPO kit and transformed into One Shot MAX Efficiency DH5-T1R cells.
For premature stop codon base conversions, EGFP+ cells were obtained by FACS, and genomic DNA was isolated (Qiagen DNeasy). Using primers flanking the sgRNA target site, we PCR-amplified the regions expected to contain base conversions for Nf1, Trp53, and Pten. The amplicons were normalized to 20 ng/μl and sent for sequencing to the AMPLICON-EZ service (Genewiz).
Fastq files for each gene-primer pair were aligned to a custom genome file containing that gene locus using STARlong and bwa-mem with default parameters, both of which gave similar results. The BAM files were uploaded to IGV for visualization.
Akalumine-HCl stock was resuspended in dH2O at 10 mM and stored at −80° C. Aliquots were diluted in dH2O to a final concentration of 5 mM and IP injected at 10 μl per g of mouse body weight prior to imaging. Mice were anesthetized with isoflurane according to IACUC protocol and imaged using an IVIS Lumina XRMS at 1.5 FOV with a 60 s exposure time.
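The substrate dilution and injection volume follow from C1·V1 = C2·V2. The sketch below works through that arithmetic under the assumption that the stated 10 μl/g refers to body weight; the function name and the 20 g example mouse are hypothetical.

```python
def akalumine_dose(body_weight_g, stock_mM=10.0, working_mM=5.0, ul_per_g=10.0):
    """Return injection volume and the stock/water volumes needed to prepare it.

    Uses C1 * V1 = C2 * V2 to dilute the stock to the working concentration.
    """
    inject_ul = body_weight_g * ul_per_g
    stock_ul = inject_ul * working_mM / stock_mM
    return {
        "inject_ul": inject_ul,
        "stock_ul": stock_ul,
        "water_ul": inject_ul - stock_ul,
    }


# Hypothetical 20 g mouse.
dose = akalumine_dose(20.0)
print(dose)
```

With a 10 mM stock and a 5 mM working solution, the injected volume is simply half stock and half water.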
Mice were euthanized in a CO2 chamber and brains were collected in PBS. Immediately, EGFP+ tissue was micro-dissected under a Revolve Hybrid Microscope (Echo Labs, San Diego, Calif.). If the size of the tumor allowed, the remaining brain with residual tumor tissue was fixed in 4% PFA for tissue analysis. Microdissected tissue was mechanically dissociated into <1 mm pieces and further digested with Collagenase IV (Worthington Biochemical, Lakewood, N.J.) and DNAse I (Worthington Biochemical, Lakewood, N.J.). The resultant single cell suspension was filtered through a 40 μm cell strainer (Stellar Scientific, Baltimore, Md.) and erythrocytes were lysed with ACK lysis buffer (Thermo Fisher Scientific, Waltham, Mass.). Single cell suspensions were split into 3 parts. First, for scRNAseq or sc-ATACseq experiments, GFP+ cells from single cell samples were FACS sorted (into 1.5 ml tubes for 10× Chromium). A second fraction was used for in vitro cell line establishment. Specifically, cells were resuspended in Neurobasal media (Thermo Fisher Scientific, Waltham, Mass.) supplemented with penicillin-streptomycin-amphotericin (Thermo Fisher Scientific, Waltham, Mass.), B-27 supplement without Vitamin A (Thermo Fisher Scientific, Waltham, Mass.), Glutamax (Thermo Fisher Scientific, Waltham, Mass.), EGF (Shenandoah Biotechnology, Warwick, Pa.), FGF (Shenandoah Biotechnology, Warwick, Pa.), PDGF-AA (Shenandoah Biotechnology, Warwick, Pa.) and heparin (StemCell Technologies, Cambridge, Mass.), and cultured in a CELLstart CTS (Thermo Fisher Scientific, Waltham, Mass.) treated T25 Flask. Finally, the last third of the single cell suspensions was fixed in 80% methanol-PBS and stored at −80° C.
Single-cell RNA-seq libraries were prepared per the Single Cell 3′ v2 Reagent Kits User Guide (10× Genomics, Pleasanton, Calif.). Cellular suspensions were loaded on a Chromium Controller instrument (10× Genomics) to generate single-cell Gel Bead-In-EMulsions (GEMs). GEM-reverse transcription (RT) was performed in a Veriti 96-well thermal cycler (Thermo Fisher Scientific, Waltham, Mass.). After RT, GEMs were harvested and the cDNAs were amplified and cleaned up with the SPRIselect Reagent Kit (Beckman Coulter, Pasadena, Calif.). Indexed sequencing libraries were constructed using the Chromium Single-Cell 3′ Library Kit for enzymatic fragmentation, end-repair, A-tailing, adapter ligation, ligation cleanup, sample index PCR, and PCR cleanup. The barcoded sequencing libraries were quantified by quantitative PCR using the KAPA Library Quantification Kit (KAPA Biosystems, Wilmington, Mass.). Sequencing libraries were loaded on a NovaSeq 6000 (Illumina, San Diego, Calif.) with a custom sequencing setting (26 bp for Read 1 and 91 bp for Read 2).
The demultiplexed raw reads were aligned to the transcriptome using STAR (version 2.5.1) (Dobin et al., 2013) with default parameters, using a custom UCSC mouse reference with mm10 annotation, containing all protein coding and long non-coding RNA genes. Expression counts for each gene in all samples were collapsed and normalized to unique molecular identifier (UMI) counts using Cell Ranger software version 2.0.0 (10× Genomics). The result is a large digital expression matrix with cell barcodes as rows and gene identities as columns.
To obtain 2-D projections of the population's dynamics, principal component analysis (PCA) was first run on the normalized gene-barcode matrix of the top 5,000 most variable genes to reduce the number of dimensions using Seurat package version 2.1-3 (Butler et al., 2018) in R v3.4.2-4.
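The idea of restricting the matrix to its most variable genes before PCA can be illustrated on a toy matrix. This sketch is not the Seurat implementation: it ranks genes by plain sample variance (Seurat uses a dispersion-based selection) and finds only the leading principal axis by power iteration. The function names and the data are hypothetical.

```python
import math
from statistics import variance


def top_variable_genes(matrix, genes, n_top):
    """Keep the n_top columns (genes) with the largest variance across cells."""
    variances = [variance(col) for col in zip(*matrix)]
    ranked = sorted(range(len(genes)), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:n_top])
    return [genes[j] for j in keep], [[row[j] for j in keep] for row in matrix]


def first_pc(matrix, n_iter=200):
    """Leading principal axis via power iteration on the covariance matrix."""
    n, p = len(matrix), len(matrix[0])
    means = [sum(col) / n for col in zip(*matrix)]
    centered = [[x - m for x, m in zip(row, means)] for row in matrix]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-zero start vector
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy data: rows are cells, columns are genes; g3 is constant and is dropped.
genes = ["g1", "g2", "g3"]
cells = [[1, 2, 5], [2, 4, 5], [3, 6, 5], [4, 8, 5]]
kept_genes, reduced = top_variable_genes(cells, genes, n_top=2)
pc1 = first_pc(reduced)
print(kept_genes, [round(x, 4) for x in pc1])
```

In practice a full PCA (for example by singular value decomposition) would return many components at once; the power iteration here only serves to keep the example dependency-free.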
GFP+ FACS sorted cells were processed following the manufacturer's instructions for sc-ATACseq (10× Genomics, Pleasanton, Calif.). Specifically, sorted cells were filtered through a 40 μm cell strainer, pelleted and resuspended in one volume of lysis buffer (Tris-HCl 10 mM, NaCl 10 mM, MgCl2 3 mM, Tween-20 0.1% (Bio-Rad, 1610781), Nonidet P40 substitute 0.1% (Sigma-Aldrich, 74385), digitonin 0.01% (Sigma-Aldrich, 300410) and BSA 1% in nuclease-free water), and cells were incubated on ice until optimal cell lysis. Then, lysis was stopped by adding 10 volumes of wash buffer (Tris-HCl 10 mM, NaCl 10 mM, MgCl2 3 mM, BSA 1%, Tween-20 0.1% in nuclease-free water). Isolated nuclei were pelleted and resuspended in 1× nuclei buffer (10× Genomics, Pleasanton, Calif.). Finally, nuclei concentration was calculated with a hemocytometer, and samples proceeded immediately to the sc-ATACseq library construction protocol.
The scATAC sequencing library was prepared on the 10× Genomics Chromium platform following the manufacturer's protocol (10× Genomics 1000110). The isolated nuclei suspension was diluted and then incubated with transposition mix for a targeted nuclei recovery of 10,000 cells. GEMs were then captured on the Chromium Chip E (10× Genomics 1000082). Following GEM incubation, clean up was performed using Dynabeads MyOne Silane beads (10× Genomics 2000048) and SPRIselect reagent (Beckman Coulter B23318). Finally, the library was amplified for a total of 10 SI PCR cycles.
Three publicly available processed datasets (GSE70630, GSE89567, and GSE102130) were obtained from their respective GEO websites. GSE70630 and GSE89567 were back-converted to TPM values. GSE102130 was divided into K27M (GSE102130_K27M) and GBM (GSE102130_GBM) datasets (6 and 3 patients, respectively). To identify the non-malignant microglia and mOGs in the datasets, we used PCA-tSNE and Louvain clustering as implemented in Scanpy (Wolf et al., 2018). The clusters containing the markers of microglia (CSF1R, LAPTM5, CD74, TYROBP) or mOGs (MBP, MOG, PLP1), as double-checked by t-test and Wilcoxon, were removed. For each dataset, the number of malignant tumor cells matched closely with those determined by the original authors (GSE70630: 4044 vs 4050, GSE89567: 5157 vs 5097, GSE102130: 2270 vs 2259). GSE102130_GBM did not contain any microglia or mOGs. For processing in Seurat, GSE102130_K27M was divided into 6 samples. All datasets, including the MADR mouse datasets, were normalized to have the library size of 10e5. For the comparative analysis across the tumor types, we used the relative expression as defined by (Filbin et al., 2018) to make the heatmap in
The three 10×UMI count matrices (mK27M1, mK27M2, mK27M3) were normalized to have the library size of 10e5 for each cell. Then, we clustered in the same way as the public dataset to distinguish microglia and mOGs in Scanpy (Wolf et al., 2018). Cells that had more than 10% mitochondrial reads, less than 1000 unique reads, or more than 5000 unique reads were filtered out in Seurat (2.3.3) (Butler et al., 2018). After filtering, there were 2761, 562, and 3469 cells in mK27M1, mK27M2, and mK27M3, respectively.
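The per-cell normalization and quality filter described above can be sketched in a few lines. This is an illustration rather than the Scanpy or Seurat code used here: the target library size is passed explicitly because "10e5" can be read as either 10^5 or 10^6, and the field names `mito_frac` and `n_unique` are hypothetical.

```python
def normalize_library(counts, target):
    """Scale one cell's counts so that they sum to the target library size."""
    total = sum(counts)
    return [c * target / total for c in counts]


def passes_qc(cell, max_mito_frac=0.10, min_unique=1000, max_unique=5000):
    """Keep cells with at most 10% mitochondrial reads and 1000-5000 unique reads."""
    return (cell["mito_frac"] <= max_mito_frac
            and min_unique <= cell["n_unique"] <= max_unique)


# Hypothetical cells: one passes, the others fail one criterion each.
cells = [
    {"id": "c1", "mito_frac": 0.04, "n_unique": 2500},
    {"id": "c2", "mito_frac": 0.15, "n_unique": 2500},  # too many mito reads
    {"id": "c3", "mito_frac": 0.04, "n_unique": 800},   # too few unique reads
    {"id": "c4", "mito_frac": 0.04, "n_unique": 6000},  # too many unique reads
]
kept = [c["id"] for c in cells if passes_qc(c)]
scaled = normalize_library([1, 3], target=100)
print(kept, scaled)
```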
P1-4 genes were obtained from (Filbin et al., 2018) and used as the highly variable genes argument (genes.use) to identify the common substructures in each human and mouse dataset. The cells were clustered using CCA-UMAP (RunMultiCCA and DimPlot with ‘umap’), and the cluster-specific marker genes were identified using the Seurat function “FindAllMarkers” with the default arguments. To merge the mouse and human CCA-UMAPs, the mouse gene names were converted to their orthologous human counterparts using Ensembl BioMart (www.ensembl.org/biomart). For module scoring, the functions CellCycleScoring and AddModuleScore were used. The four gene lists (OC, AC, OPC, and Cycle) correspond to P1-4 genes. The DoHeatmap function with at most the top 50 genes for each cluster was used to make the heatmaps.
SCENIC (1.0.0-02) was run with all default settings as described in (Aibar et al., 2017). We used the two default databases for each species (500 bp-upstream and tss-centered-10 kb). The raw matrices with the library size of 10e5 for each cell and the metadata dataframe from Seurat processing were used as inputs for SCENIC. For the heatmap and tSNE plotting, we used the binary regulon output. The package component AUCell was used to select a threshold for each regulon and then score each regulon for their enrichment in each cell (Aibar et al., 2017). The scores were then binarized (on vs off), and the outputs clustered according to this binary activity matrix (Aibar et al., 2017).
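The binarization step can be pictured as a per-regulon threshold applied to each cell's enrichment score. The sketch below is a simplification: AUCell derives each threshold from the distribution of scores, whereas here the thresholds are supplied by hand, and the regulon names, scores, and cutoffs are all hypothetical.

```python
def binarize_regulons(auc, thresholds):
    """Turn per-cell regulon scores into an on (1) / off (0) activity matrix.

    auc: {cell: {regulon: score}}; thresholds: {regulon: cutoff}.
    A regulon is called "on" when its score exceeds the cutoff (assumption).
    """
    return {
        cell: {reg: int(score > thresholds[reg]) for reg, score in scores.items()}
        for cell, scores in auc.items()
    }


# Hypothetical scores for two cells and two regulons.
auc_scores = {
    "cell_1": {"Olig2": 0.12, "Sox9": 0.02},
    "cell_2": {"Olig2": 0.03, "Sox9": 0.09},
}
cutoffs = {"Olig2": 0.05, "Sox9": 0.05}
binary = binarize_regulons(auc_scores, cutoffs)
print(binary)
```

Clustering on this binary matrix groups cells by which regulons are active rather than by raw expression levels, which is one reason the approach is less sensitive to batch effects.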
CellRanger was used to identify and annotate open chromatin regions and perform aggregation of samples and initial clustering of cells and motif analysis. CellRanger outputs were used as inputs for cisTopic and SnapATAC and samples were processed according to recommended settings (Bravo Gonzalez-Blas et al., 2019, Fang et al., 2019) for annotating clusters, Topics, ontology, gene accessibility, and motifs. The Harmony package (Korsunsky et al., 2018) was used according to default settings in conjunction with SnapATAC to align E18 datasets.
We completed the H3K27me3 ChIP reactions using 30 μg of mouse pediatric brain tumor chromatin and 4 μg of antibody (Active Motif, cat #39155). The ChIP reactions also contained a Drosophila chromatin spike-in for the normalization of the sequencing data. We diluted a small fraction of the ChIP DNA and performed qPCR using positive control primer pairs that worked well in similar assays. For H3K27me3, the primer pair targeted to the promoter region of the active gene ACTB serves as a good negative control.
Nikon Elements and ImageJ software were used to analyze images. All results are shown as mean±SEM, except when indicated otherwise. For statistical analyses, the following convention was used: *: p<0.05, **: p<0.01, ***: p<0.001. “Student's t-test” refers to the unpaired test.
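As a brief illustration of the reporting convention, the following sketch computes mean±SEM and maps a p-value to the asterisk notation. The "ns" label for p≥0.05 and the example values are assumptions made for this example.

```python
import math
from statistics import mean, stdev


def mean_sem(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def significance_stars(p):
    """Map a p-value to the asterisk convention; 'ns' for p >= 0.05 is an assumption."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


# Hypothetical measurements from four sections.
m, sem = mean_sem([2, 4, 4, 6])
print(f"{m:.2f} ± {sem:.2f}", significance_stars(0.005))
```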
ChIP-seq reads were aligned to the mouse reference genome mm10 using bwa. BigWig tracks were generated for each sample.
H3K27me3 clustering was performed using ngs.plot (version 2.61) (Shen et al., 2014) for each sample with the mm10 mouse genome build. The lists of genes associated with the 7 clusters were imported into Seurat, and the expression for each cluster of genes was calculated using Seurat AddModuleScore.
The cells expressing EDITOR were subject to PCR amplification (list primers). Fastq files for each gene-primer pair were aligned to a custom genome file containing that gene locus using STARlong and bwa-mem with default parameters, both of which gave similar results. The BAM files were uploaded to IGV for visualization.
mTmG is a mouse line that constitutively expresses membrane tdTomato and switches to EGFP expression upon Cre-mediated recombination. To effect MADR in mTmG, we created a promoter-less donor plasmid encoding TagBFP2 flanked by loxP and FRT sites (
To validate the single-copy insertion, we created a donor plasmid carrying puromycin N-acetyl-transferase (PAC) and enriched the cells that correctly express the transgene via antibiotic selection. (
Assays for gene function are often performed using transduced or transfected cell lines in vitro, but the constitutive expression of some transgenes can hinder stable cell line generation if the mutations decrease fitness. To avoid this, inducible genetic systems, such as TRE, may be employed to make the cell line first and then start expressing the gene(s) of interest. To showcase the utility of single-allele mTmGHet mNSCs, we established a pipeline for inducible cell line production by nucleofecting these cells with a MADR-compatible vector containing rtTA-V10 and TRE-Bi element (
This in vitro pipeline is beneficial to interrogating the consequences of GOF mutations in various primary cell lines derived from any animal carrying loxP and Frt by providing more homogeneous, inducible stable cell lines. As proof-of-principle for this, and to determine whether the 3′ cistron of the TRE-Bi element was sporadically expressed because of distal promoter/enhancer regions, we generated a cell line that inducibly expresses the Notch ligand, Dll1, with a bi-cistronic TRE-Bi-Dll1/EGFP donor vector (
From the mTmGHet mNSCs, we also generated distinct cell lines with 4 different “spaghetti monster” reporter proteins (SM-FPs) in a single nucleofection (Viswanathan et al., 2015). We used this pipeline, which we name MADR with multiply-antigenic XFPs (MADR MAX) (
To check that MADR works in human cells, we engineered a MADR-compatible recipient site (
To effect MADR in vivo, we electroporated (EPed) donor plasmids containing fluorescent protein reporters (TagBFP2 or membrane-tagged SM_FP-myc) and Flp-Cre (0.5 μg/μl each) into the neural stem/progenitor cells lining the ventricular/subventricular zone (VZ/SVZ) of postnatal day 2 (P2) mTmGHet pups (
By two weeks, differentiated striatal glia and olfactory bulb neurons appeared (
To test the effect of plasmid concentrations on the in vivo recombination efficiencies, we varied the concentrations of Flp-Cre plasmid and SM_FPY-myc for high-sensitivity detection of recombined cells (
To rule out the possibility that transgene expression was due to the expression from randomly integrated or non-recombined episomes, we performed a series of control EPs (
Although MADR is compatible with many existing mice, mTmG presented us with the drawback of being unable to use the red color channel (e.g.
One potential limitation of MADR is its utilization of two commonly used recombinases, Flp and Cre. Thus, we tested overlaying conditional VCre-mediated activation of another transgene. To do this, we created a plasmid expressing VCre downstream of TagBFP2-P2A (
Given the stable genomic insertion and transgene expression that MADR provides, we sought to exploit MADR for generating single-copy in vivo tumor models. LOF tumor suppressor gene mutations such as Nf1, Pten, and Trp53 are some of the most prevalent driver genes in glioma patients. Mouse glioma models show that knocking out these tumor suppressors leads to high-grade gliomas. For example, dual Trp53/Nf1-KOs promote the pre-malignancy hyperproliferation of oligodendrocyte progenitors (OPCs). We wanted to test whether miR-E shRNAs against tumor suppressors are sufficient for tumorigenesis as this approach can be made reversible.
First, we created a donor construct harboring TagBFP2 followed by 3 validated miR-Es targeted at Nf1, Pten, and Trp53 (
To further test this, we switched to CRISPR/Cas9-based knockout of these suppressors. CRISPR/Cas9 has been demonstrated to be highly efficacious for mutating genes in vivo using EP. Using episomal plasmids, we observed that sgRNAs against all of Nf1, Trp53, and Pten resulted in the formation of white matter-associated, high-grade, Olig2+ tumors in agreement with GEMM, MADM, and in utero EP-based CRISPR models (
Confocal imaging demonstrated that the tumor was largely devoid of tdTomato-labeled populations, whereas the vasculature stayed red (
To complement these Cas9-based LOF methods, we added the CRISPR/Cas base editor (FNLS) to MADR (
We made a HrasG12V-based MADR donor compatible with RCE reporter mouse and performed in utero EP (IU-EP) in E14 RCE-heterozygous embryos (
We previously studied a PB-tumor model based on HrasG12V, which results in 100% penetrant glioma when EPed in postnatal WT pups. When the MADR TagBFP2-HrasG12V transgene was delivered postnatally to mTmGHet, HrasG12V+ cells similarly overproliferated when compared with EGFP+ populations (
Interestingly, in homozygous mTmG mice, blue-only cells (HrasG12V×2) occupied a bigger patch of tumor cross-section than cells expressing both blue and green (HrasG12V×1) (
Many tumor drivers are fusion proteins, but it can be difficult to make a conditional GEMM mimicking chromosomal rearrangement. For example, the fusion protein drivers YAP1-MAMLD1 and C11orf95-RELA are recurrently seen in supratentorial ependymomas, and we made MADR vectors to express them (
Almost all human tumors present with a distinct set of somatic and germline mutations, either passenger or directly contributing to cancer. With the ability to pick and choose mutations and to compare these sets of mutations, MADR can serve as a personalized tumor model platform tailored for studying nuanced idiosyncrasies with important implications to drug resistance and survival that are unique to each tumor subtype. As a proof-of-principle, we chose to model pediatric GBM where H3F3A K27M or G34R mutations are observed in more than 50% of patients, but co-occur with a variety of other mutations. For example, H3F3A mutations are often coincident with recurrent dominant-active Pdgfra (D842V) and dominant-negative Trp53 (R270H). To demonstrate MADR's utility in this context, we made donor plasmids for modeling simultaneous H3f3a, Pdgfra, and Trp53 mutations, with variants differing only by missense mutations for G34R or K27M, to study the differential effects of these driver genes (
First, we checked for appropriate expression of H3f3a, Pdgfra, and Trp53 by immunohistochemistry in vivo and in vitro and noted coincident expression of all proteins (
Patient tumors bearing either K27M or G34R/V mutations exhibit different transcriptomes as well as clinical features. Human K27M gliomas cluster along the midline, whereas G34R occur in the cerebral hemispheres. K27M tumors manifest in younger patients than G34R/V. Seemingly in agreement with their earlier clinical presentation, some K27M+ mice exhibited midline gliomas by P100, at which time G34R+ displayed diffuse glial hyperplasias and very rare, small tumors (
Pathological features included high cell density, microvascular proliferation, and necrosis at late stages (
To compare the cell autonomous properties of these cells we exploited unique properties of MADR whereby each allele can receive only one transgene insertion, and co-delivered K27M and G34R plasmids at a 1:1 ratio (
Several studies have shown that K27M mutations lead to hypomethylation at the H3K27 residue, and we confirmed the hypomethylation of K27M mutant cells by H3K27me3 antibody (
Immunohistological analysis demonstrated that tumor cells upregulated Bmi1 (
This heterogeneity of glial markers was ostensibly similar to recent findings in human K27M tumors, which demonstrated a significant degree of intratumor heterogeneity by single-cell RNA sequencing (scRNA-seq). Given the availability of these analogous human K27M data, we took the opportunity to credential the MADR model cells against their human counterparts and gain deeper insight into the heterogeneity through the use of scRNA-seq.
We subjected EGFP+ sorted tumor cells from 3 independent K27M tumors to droplet-based scRNA-seq (
The “Cycle” cluster consisted of cells expressing markers of proliferation, including Top2a, mKi67, and Ccnb1 (
To conduct cross-species analysis of K27M gliomas, we repeated the Seurat clustering with all the cells from mouse and human K27M tumors (
We also performed clustering with the more common practice of employing highly-variable-genes for CCA, clustering, and UMAP analysis. This approach led to some almost identical clusters (e.g. cycling populations) but division of other populations into sub-clusters (e.g. OPC), which varied by the parameters chosen (
We also used the differentially expressed genes identified across human K27M, GBM, IDH astrocytoma, IDH oligodendroglioma to plot a heatmap comparing our 3 mouse K27M tumors. Our MADR K27M tumors were more similar to the human counterparts than to other glioma subtypes (
We have shown a global matching between the MADR-based K27M mouse and the human K27M glioma transcriptomes, especially in that they show similar developmental hierarchies and overrepresentations of cycling cells. To our knowledge, our K27M scRNA-seq dataset is one of the first created to validate a mouse tumor model. Therefore, we subjected the datasets to further analysis to gain novel insights. The K27M mutation leads to widespread epigenetic perturbation, which led us to focus on whether similar transcription factor (TF) networks underlie human and mouse tumors.
SCENIC is a method that applies random-forest regression to scRNA-seq datasets to identify regulons (a regulon is a curated, known co-expression module based on a TF and its positively correlated target genes). This type of regulon-based analysis is robust because of its holistic nature, and minimizes the batch and patient-specific effects, which can confound scRNA-seq (
In tSNE plots derived from processing the mouse and human K27M cells in parallel in SCENIC, the cells clustered by cell type, indicating that these cell clusters have differential TF networks (
To examine the underlying epigenetic state through the examination of differentially accessible genome regions (DARs), we performed single-nucleus ATAC-seq of K27M mouse tumors and compared them to normal P50 and E18 mouse brains (
Finally, alignment of DARs from these scATAC samples and analogous bulk datasets further supported the tSNE findings that glial lineage-associated transcription factors like Olig2, Sox9, and Sox10 exhibit reduced relative accessibility compared with P50 glial lineages, and that Sox9 and Sox10 are mutually exclusive (
We designed two AAV vectors. One expresses FlpO-2A-Cre, while the other carries a non-expressed (inverted) TagBFP reporter gene. When the TagBFP vector is transduced into cells by itself, TagBFP does not appear to be expressed. However, in the presence of the FlpO-2A-Cre virus, cells with the MADR recipient locus appear to lose expression of the tdTomato and EGFP transgenes and begin to express TagBFP (
This is significant because it would obviate the need for proliferation to facilitate MADR and thus make it easier to target postmitotic cells and other tissues with single-copy transgenesis. Many types of disease models, or safer gene therapy dosing, could thus be achieved.
We modified AAVS1-pAct-GFPnls to AAVS-pACT-loxP-TagBFP-V5-nls WPRE FRT to obtain MADR-ready iPSCs. The function of this MADR cassette was validated in HEK293T cells (
We modified the loxP and FRT sites in both the recipient genome and the MADR pDonors. The function of MADR was validated in HEK293T cells (
We used tissue-specific promoters on the recombinase expression vector. The function of the tissue-specific recombinase vector was validated in vivo in the mouse brain (
Various embodiments of the invention are described above in the Detailed Description. While these descriptions directly describe the above embodiments, it is understood that those skilled in the art may conceive modifications and/or variations to the specific embodiments shown and described herein. Any such modifications or variations that fall within the purview of this description are intended to be included therein as well. Unless specifically noted, it is the intention of the inventors that the words and phrases in the specification and claims be given the ordinary and accustomed meanings to those of ordinary skill in the applicable art(s).
The foregoing description of various embodiments of the invention known to the applicant at this time of filing the application has been presented and is intended for the purposes of illustration and description. The present description is not intended to be exhaustive nor limit the invention to the precise form disclosed and many modifications and variations are possible in the light of the above teachings. The embodiments described serve to explain the principles of the invention and its practical application and to enable others skilled in the art to utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated. Therefore, it is intended that the invention not be limited to the particular embodiments disclosed for carrying out the invention.
While particular embodiments of the present invention have been shown and described, it will be obvious to those skilled in the art that, based upon the teachings herein, changes and modifications may be made without departing from this invention and its broader aspects and, therefore, the appended claims are to encompass within their scope all such changes and modifications as are within the true spirit and scope of this invention.
All publications herein are incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference. The following description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
As used herein the term “comprising” or “comprises” is used in reference to compositions, methods, and respective component(s) thereof, that are useful to an embodiment, yet open to the inclusion of unspecified elements, whether useful or not. It will be understood by those within the art that, in general, terms used herein are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes but is not limited to,” etc.). Although the open-ended term “comprising,” as a synonym of terms such as including, containing, or having, is used herein to describe and claim the invention, the present invention, or embodiments thereof, may alternatively be described using alternative terms such as “consisting of” or “consisting essentially of.”
This application includes a claim of priority under 35 U.S.C. § 119(e) to U.S. provisional patent application No. 62/862,576, filed Jun. 17, 2019, the entirety of which is hereby incorporated by reference.
This invention was made with Government support under CA202900 and CA236687 awarded by the National Institutes of Health. The Government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US20/37946 | 6/16/2020 | WO | |

Number | Date | Country |
---|---|---|
62862576 | Jun 2019 | US |