The instant application contains a Sequence Listing which has been filed electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created Oct. 6, 2021, is named 079445-1273450-006220US_SL.txt and is 1.47 MB (1,547,157) bytes in size.
Provided herein are compositions and methods for identifying and using stem cell differentiation regulation factors. For example, in some embodiments, provided herein are compositions and methods for identifying stem cell differentiation regulation factors using marker gene expression libraries. Also provided herein are compositions and methods for generating differentiated and induced cell lines and uses of such cell lines.
Stem cells are cells that are capable of differentiating into many cell types. Embryonic stem cells are derived from embryos and are potentially capable of differentiation into all of the differentiated cell types of a mature body. Certain types of stem cells are “pluripotent,” which refers to their capability of differentiating into many cell types. One type of pluripotent stem cell is the human embryonic stem cell (hESC), which is derived from a human embryonic source. Human embryonic stem cells are capable of indefinite proliferation in culture, and therefore, are an invaluable resource for supplying cells and tissues to repair failing or defective human tissues in vivo.
Similarly, induced pluripotent stem (iPS) cells, which may be derived from non-embryonic sources, can proliferate without limit and differentiate into each of the three embryonic germ layers. It is understood that iPS cells behave in culture essentially the same as ESCs. Human iPS cells and ES cells express one or more pluripotent cell-specific markers, such as Oct-4, SSEA-3, SSEA-4, Tra 1-60, Tra 1-81, and Nanog (Yu et al. Science, Vol. 318, No. 5858, pp. 1917-1920 (2007); herein incorporated by reference in its entirety). Also, recent findings of Chan et al. indicate that expression of Tra 1-60, DNMT3B, and REX1 can be used to positively identify fully reprogrammed human iPS cells, whereas alkaline phosphatase, SSEA-4, GDF3, hTERT, and NANOG are insufficient as markers of fully reprogrammed human iPS cells (Chan et al., Nat. Biotech. 27:1033-1037 (2009); herein incorporated by reference in its entirety).
The cell fate decision making of stem cells is governed by multistep dynamic processes, in which transcriptional networks play a critical role (Chambers and Tomlinson, 2009 Development 136, 2311-2322; Filipczyk et al., 2015 Nat. Cell Biol. 17, 1235-1246; Kim et al., 2008 Cell 132, 1049-1061; MacArthur et al., 2009 Nat. Rev. Mol. Cell Biol. 10, 672-681). Different transcription factors act in coordination to activate or suppress sets of genes specific to different lineages, serving as major regulators that maintain cell identities or drive cell fate transitions (Iwafuchi-Doi and Zaret, 2014 Genes Dev. 28, 2679-2692; Zaret and Carroll, 2011 Genes Dev. 25, 2227-2241). The successes of somatic cell reprogramming and directed lineage differentiation using transcription factors highlight their central role in cell fate determination (Davis et al., 1987 Cell 51, 987-1000; Takahashi and Yamanaka, 2006 Cell 126, 663-676; Vierbuchen et al., 2010 Nature 463, 1035-1041; Xu et al., 2015 Cell Stem Cell 16, 119-134). Although individual transcription factors and combinations of transcription factors have been identified for cell differentiation over the past few decades, there is a dearth of systematic, unbiased studies of how specific genetic programs determine cell fate maintenance and transitions. Because of this, the available tools to control stem cell differentiation are limited and the full promise of stem cells as therapeutic, drug screening, and research tools has gone unmet.
A systematic screening approach to profile and characterize all transcription factors is needed to offer new insights into their contributions to cell fate decisions, which would greatly enhance the ability to manipulate cell fate for both basic research and therapeutic purposes.
The compositions, systems, kits, and methods of the present disclosure overcome limitations of existing technologies to identify transcription factors and nucleic acids that drive differentiation of pluripotent cells. The transcription factors identified using the described methods find use in research, screening, and therapeutic applications.
In some embodiments, provided herein are systems and methods for identifying factors involved in (e.g., that regulate or control) the differentiation of stem cells by employing a CRISPR activation (CRISPRa)-mediated gain-of-function screening platform. In some such embodiments, a reporter stem cell line is generated that comprises components of a CRISPR activation system. In some embodiments, the cell line is exposed to an sgRNA library targeting all putative transcription factors or other candidate factors that may be involved in a cellular differentiation process.
In some embodiments, the CRISPR activation system comprises a dCas9 construct under the transcriptional control of a first promoter. In some embodiments, the dCas9 is fused to a peptide epitope. In some embodiments, the activation system further comprises a VP64 transactivation domain under the transcriptional control of a second promoter. In some embodiments, the VP64 transactivation domain is fused to a peptide that specifically binds to the peptide epitope. In some embodiments, the activation system further comprises a selection marker under the transcriptional control of a third promoter. In some embodiments, each of the first, second, and third promoters are different than each other.
For example, in some embodiments, provided herein is a method of identifying pluripotent cell differentiation markers, comprising: a) generating a pluripotent cell line that expresses i) nuclease dead Cas9 fused to a plurality of peptide epitopes; ii) a single chain variable chain antibody fragment specific for the peptide epitope fused to a VP64 transactivator domain; and iii) a transactivator polypeptide; b) contacting the cell line with a plurality of single guide RNAs (sgRNAs) specific for activation of pluripotent cell differentiation factors to generate a gene activation library; c) sorting the library to identify pluripotent cells that retain pluripotency or differentiate; and d) identifying cell differentiation factors that induce or prevent differentiation of the pluripotent cells. In some embodiments, the differentiation factors are transcription factors or non-coding RNAs (e.g., lincRNAs). In some embodiments, the cells are further contacted with a plurality of non-targeting sgRNAs (e.g., to serve as a negative control). In some embodiments, the cells further overexpress endogenous POU domain, class 3, transcription factor 2 (Brn2). In some embodiments, each cell differentiation factor is targeted with a plurality (e.g., at least 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100) of distinct sgRNAs. In some embodiments, the cells that retain pluripotency are identified by screening for expression of SSEA1 after culture in media lacking inhibitors of GSK3 and ERK pathways. In some embodiments, cells that differentiate are identified by expression of a differentiation marker. For example, in some embodiments, cells that differentiate into neuronal cells express Tuj1. In some embodiments, the identifying comprises sequencing of sgRNAs after selection for cells that retain pluripotency or differentiate. In some embodiments, the sequencing further comprises comparing the level of the sgRNAs to the level of non-targeting sgRNAs.
In some embodiments, cell differentiation factors that retain pluripotency are one or more of the regulation factors shown in
In some embodiments, the pluripotent cells are induced pluripotent stem cells, adult stem cells, or embryonic stem cells. In some embodiments, the method further comprises the step of activating pairs or groups of pluripotent cell differentiation factors.
In some embodiments, the method comprises or further comprises the step of performing a CRISPR gene repression screen. For example, in some embodiments, the CRISPR repression screen comprises: a) contacting a pluripotent cell that expresses dCas9 fused to a transcription repressor domain with a plurality of sgRNAs specific for repression of a plurality of cell differentiation factors; b) sorting the library to identify cells that retain pluripotency or differentiate; and c) identifying cell differentiation factors that induce or prevent differentiation of said pluripotent cells. In some embodiments, the CRISPR repression screen and the CRISPR activation screen are performed in the same or different pluripotent cells. In some embodiments, the CRISPR repression screen and the CRISPR activation screen are performed simultaneously using vectors comprising a first sgRNA specific for activation of a first cell differentiation factor and a second sgRNA specific for repression of a second cell differentiation factor.
Further embodiments provide a library of pluripotent cells generated by the methods described herein.
Additional embodiments provide a kit or system, comprising: a) a pluripotent cell line that expresses i) nuclease dead Cas9 fused to a plurality of peptide epitopes; ii) a single chain variable chain antibody fragment specific for the peptide epitope fused to a VP64 transactivator domain; and iii) a transactivator polypeptide; and b) a plurality of single guide RNAs (sgRNAs) specific for activation of pluripotent cell differentiation factors. In some embodiments, the kit or system further comprises reagents for analysis of one or more properties (e.g., pluripotency or differentiation) of the cell lines. In some embodiments, the kit or system further comprises reagents for sequencing the cells to identify the presence of said sgRNAs. In some embodiments, the system comprises or further comprises a CRISPR repression system as described herein. In some embodiments, the system comprises one or more sgRNAs (e.g., 10 or more, 100 or more, 1000 or more, or 5000 or more) described in Table 13 (e.g., SEQ ID NOs:586-8317).
Yet other embodiments provide a method of determining the differentiation status of pluripotent or somatic cells, comprising: a) assaying the cells for the expression of one or more transcription factors or lincRNAs selected from those in
Still further embodiments provide a method of differentiating pluripotent or somatic (e.g., fibroblast) cells into neuronal cells, comprising: inducing expression of one or more cell regulation factors shown in
Certain embodiments provide differentiated cells generated by the methods described herein.
Embodiments of the present disclosure provide a plurality of neuronal cells that express one or more cell differentiation regulation factors shown in
Further embodiments provide a method of inducing pluripotency or maintaining pluripotency of a cell line (e.g., a somatic or pluripotent cell line), comprising: inducing expression of one or more cell regulation factors shown in
Still other embodiments provide a plurality of pluripotent cells generated or maintained by the methods described herein.
In other embodiments, the present disclosure provides a plurality of pluripotent cells or iPSCs that express one or more cell regulation factors shown in
Some embodiments provide a method of transplanting cells, comprising: transplanting differentiated cells generated by the methods described herein into a subject in need thereof (e.g., a subject diagnosed with a disease or condition).
Further embodiments are described herein.
As used herein, the term “stem cell” (“SC”) refers to cells that can self-renew and differentiate into multiple lineages. A stem cell is a developmentally pluripotent or multipotent cell. A stem cell can divide to produce two daughter stem cells, or one daughter stem cell and one progenitor (“transit”) cell, which then proliferates into the tissue's mature, fully formed cells. Stem cells may be derived, for example, from embryonic sources (“embryonic stem cells”) or derived from adult sources. For example, U.S. Pat. No. 5,843,780 to Thomson describes the production of stem cell lines from human embryos. PCT publications WO 00/52145 and WO 01/00650 (herein incorporated by reference in their entireties) describe the use of cells from adult humans in a nuclear transfer procedure to produce stem cell lines.
Examples of adult stem cells include, but are not limited to, hematopoietic stem cells, neural stem cells, mesenchymal stem cells, and bone marrow stromal cells. These stem cells have demonstrated the ability to differentiate into a variety of cell types including adipocytes, chondrocytes, osteocytes, myocytes, bone marrow stromal cells, and thymic stroma (mesenchymal stem cells); hepatocytes, vascular cells, and muscle cells (hematopoietic stem cells); myocytes, hepatocytes, and glial cells (bone marrow stromal cells) and, indeed, cells from all three germ layers (adult neural stem cells).
As used herein, the term “totipotent cell” refers to a cell that is able to form a complete embryo (e.g., a blastocyst).
As used herein, the term “pluripotent cell” or “pluripotent stem cell” refers to a cell that has complete differentiation versatility, e.g., the capacity to grow into any of the mammalian body's approximately 260 cell types. A pluripotent cell can be self-renewing, and can remain dormant or quiescent within a tissue. Unlike a totipotent cell (e.g., a fertilized, diploid egg cell), a pluripotent cell, even a pluripotent embryonic stem cell, cannot usually form a new blastocyst.
As used herein, the term “induced pluripotent stem cells” (“iPSCs”) refers to a stem cell induced from a somatic cell, e.g., a differentiated somatic cell, and that has a higher potency than said somatic cell. iPS cells are capable of self-renewal and differentiation into mature cells.
As used herein, the term “multipotent cell” refers to a cell that has the capacity to grow into a subset of the mammalian body's approximately 260 cell types. Unlike a pluripotent cell, a multipotent cell does not have the capacity to form all of the cell types.
As used herein, the term “progenitor cell” refers to a cell that is committed to differentiate into a specific type of cell or to form a specific type of tissue.
As used herein, the term “embryonic stem cell” (“ES cell” or “ESC”) refers to a pluripotent cell that is derived from the inner cell mass of a blastocyst (e.g., a 4- to 5-day-old human embryo), and has the ability to yield many or all of the cell types present in a mature animal.
As used herein, the term “feeder cells” refers to cells used as a growth support in some tissue culture systems. Feeder cells may be, for example, embryonic striatum cells or stromal cells.
As used herein, the term “chemically defined media” refers to culture media of known or essentially-known chemical composition, both quantitatively and qualitatively. Chemically defined media is free of all animal products, including serum or serum-derived components (e.g., albumin).
The RNA-guided microbial endonuclease CRISPR (clustered regularly interspaced short palindromic repeat)/Cas9 (CRISPR associated protein 9) system was recently repurposed as a tool for sequence-specific gene editing and transcriptional regulation (Cho et al., 2013 Nat. Biotechnol. 31, 230-232; Cong et al., 2013 Science 339, 819-823; Fu et al., 2014 Nat. Biotechnol. 32, 279-284; Jinek et al., 2012 Science 337, 816-821; Mali et al., 2013b Science 339, 823-826; Qi et al., 2013 Cell 152, 1173-1183; Ran et al., 2015 Nature 520, 186-191; Yu et al., 2015 Cell Stem Cell 16, 142-147). Nuclease-dead Cas9 (dCas9) fused with transcription activator domains allows activation of endogenous genes, leading to CRISPR activation (CRISPRa) methods (Chavez et al., 2015 Nat. Method. 12, 326-328; Cheng et al., 2013 Cell Res. 23, 1163-1171; Gilbert et al., 2013 Cell 154, 442-451; Hilton et al., 2015 Nat. Biotechnol. 33, 510-517; Konermann et al., 2015 Nature 517, 583-588; Maeder et al., 2013 Nat. Method. 10, 977-979; Mali et al., 2013a Nat. Biotechnol. 31, 833-838; Perez-Pinera et al., 2013 Nat. Method. 10, 973-976; Tanenbaum et al., 2014 Cell 159, 635-646; Zalatan et al., 2015 Cell 160, 339-350). Previous work demonstrated that CRISPR activation of endogenous genes allowed, in principle, somatic cell reprogramming and directed cell differentiation (Black et al., 2016 Cell Stem Cell 19, 406-414; Chakraborty et al., 2014 Stem Cell Reports 3, 940-947; Chavez et al., 2015 Nat. Method. 12, 326-328; Wei et al., 2016 Sci. Rep. 6, 19648). However, since these studies relied on using a mixture of multiple sgRNAs for activating a single gene and inducing differentiation, applying these methods for large-scale activation screening has been a major challenge.
Unlike cell growth phenotypes that entail a dropout live-or-dead process, cell fate determination is a dynamic, stochastic process that often generates a heterogeneous cell population with diverse phenotypes (e.g., non-dropout) (Hanna et al., 2009 Nature 462, 595-601; Johnston and Desplan, 2010 Annu. Rev. Cell Dev. Biol. 26, 689-719). This makes it difficult to simply perform dropout screens that distinguish lineage specification processes from spontaneous differentiation events. Furthermore, because developmental programs are highly dependent on the expression level of endogenous genes (Niwa et al., 2000 Nat. Genet. 24, 372-376; Papapetrou et al., 2009 Proc. Natl. Acad. Sci. USA 106, 12759-12764), gain-of-function screens that allow very efficient gene activation (comparable to cDNA overexpression) while covering a broad range of expression offer more promise for identifying candidate genes driving cell lineages. To date, two reports used CRISPRa for cell growth-based dropout screens (Gilbert et al., 2014 Cell 159, 647-661; Konermann et al., 2015 Nature 517, 583-588). However, the application of CRISPRa screens for the systematic inference of cell fate determination has not yet been established.
Experiments described herein overcame these challenges by developing a CRISPR activation (CRISPRa)-mediated gain-of-function screening approach to identify transcription factors (TFs) important for stem cell fate determination. An enhanced CRISPRa system was developed in mouse embryonic stem (ES) cells that efficiently activates endogenous genes and drives cell lineage differentiation. A single sgRNA was sufficient to induce neuron or muscle differentiation. Based on this system, a large-scale sgRNA library (>50,000 sgRNAs) was used to target all putative endogenous TF genes (˜800) and a small set of noncoding RNA genes (50). Targeting a single gene using multiple sgRNAs (>60 sgRNAs per gene) allowed activating each gene to a broad range of expression levels. A CRISPRa dropout screen was used to identify genes that promote stem cell self-renewal, as well as a non-dropout screen for inducing neural differentiation. The top gene hits were validated using individual sgRNAs, and it was observed that all hits could maintain self-renewal. For neural differentiation, it was confirmed that 19 out of the top 20 gene hits could induce efficient neural differentiation. For both screens, the lists of gene hits include known TFs as well as TFs and noncoding RNAs that have not previously been related to self-renewal maintenance or neural differentiation. Different identified TFs preferentially induced different types of neurons. Deep sequencing and functional analysis of a few gene hits (Mlxip for self-renewal and Jun for neural differentiation) confirmed their functions for driving desired cellular processes.
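By way of non-limiting illustration, the library dimensions recited above can be checked with simple arithmetic. The sketch below assumes round figures of 800 TF genes, 50 noncoding RNA genes, and 60 sgRNAs per gene. The function name and these exact values are illustrative assumptions, since the figures above are approximate (about 800 genes, more than 60 sgRNAs per gene) and any non-targeting control sgRNAs are not counted.

```python
def targeting_library_size(tf_genes, ncrna_genes, guides_per_gene):
    """Number of targeting sgRNAs when every gene receives the same number of guides."""
    return (tf_genes + ncrna_genes) * guides_per_gene


# Assumed round figures taken from the approximate values in the text.
size = targeting_library_size(tf_genes=800, ncrna_genes=50, guides_per_gene=60)
print(f"Estimated targeting sgRNAs: {size:,}")
```

Under these assumptions the targeting portion of the library comprises 51,000 sgRNAs, which is consistent with the stated library size of more than 50,000 sgRNAs.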
Thus, the compositions and methods provided herein allow for the identification of the relevant factors necessary, sufficient, and/or useful for controlling differentiation of stem cells into any desired fate. The transcription factors identified herein and identifiable using the compositions and methods described herein provide targets and reagents for differentiation of cells and provide the cells made therefrom that find use as research tools, drug screening targets, and therapeutics (e.g., via cell transplantation into a host).
The CRISPRa gain-of-function screens and stem cell libraries described herein find use in research, therapeutic, and screening applications to determine differentiation factors for a variety of stem cells. The differentiation factors identified further find use in stem cell differentiation for research, screening, and clinical applications.
As described herein, embodiments of the present disclosure provide compositions and methods for identifying stem cell differentiation regulation factors. In some embodiments, the methods utilize a modified pluripotent or multipotent cell line (e.g., a stem cell line). The present disclosure is not limited to particular cell lines. Examples include, but are not limited to, iPSCs, embryonic stem cells, adult stem cells, and the like.
In some embodiments, the CRISPR activation system comprises a dCas9 construct under the transcriptional control of a first promoter. In some embodiments, the dCas9 is fused to a peptide epitope. In some embodiments, the activation system further comprises a VP64 transactivation domain under the transcriptional control of a second promoter. In some embodiments, the VP64 transactivation domain is fused to a peptide that specifically binds to the peptide epitope. In some embodiments, the activation system further comprises a selection marker under the transcriptional control of a third promoter. In some embodiments, each of the first, second, and third promoters are different than each other.
In some embodiments, cell lines for determination of differentiation regulation factors are pluripotent cells modified with a dead Cas9/transactivator activation system. For example, in some embodiments, cells comprise a nuclease dead Cas9 (dCas9). In some embodiments, the dCas9 is fused to a signal activation component (e.g., a plurality of peptide epitopes as described in Tanenbaum et al., (2014). Cell 159, 635-646; herein incorporated by reference in its entirety). In some embodiments, the cell lines further comprise a single chain variable chain antibody fragment specific for the peptide epitope fused to a transactivator domain (e.g., VP64; See e.g., Beerli et al., Proc Natl Acad Sci USA. 1998 Dec 8; 95(25): 14628-14633; herein incorporated by reference in its entirety) and a transactivator polypeptide. In some embodiments, the activation components are provided on a vector (e.g., retroviral vector, adenoviral vector, adeno-associated vector, lentiviral vector, etc.). In some embodiments, cells further overexpress endogenous Brn2 (e.g., via an sgRNA that targets activation of Brn2).
In some embodiments, the cell lines are next contacted with a plurality of sgRNAs (e.g., targeting cell differentiation regulation factors). In some embodiments, sgRNAs target transcription factors or non-coding RNAs (e.g., lincRNAs). In some embodiments, more than one (e.g., at least 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100) sgRNAs specific for each differentiation factor are utilized. In some embodiments, sgRNAs are provided on vectors (e.g., retroviral vector, adenoviral vector, adeno-associated vector, lentiviral vector, etc.). In some embodiments, cells are further contacted with a plurality of non-targeting sgRNAs (e.g., to serve as negative controls). In some embodiments, a double CRISPR screen is performed using dual-sgRNA-constructs comprising two (or more) sgRNAs to screen for interactions between multiple cell differentiation factors in combination.
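By way of non-limiting illustration, the design space of a dual-sgRNA screen can be enumerated as every pairing of guides drawn from two distinct factors. The following is a minimal sketch only: the factor labels (`TF_A`, `TF_B`, `TF_C`), the guide identifiers, and the function name are hypothetical placeholders, and exhaustive pairing is an assumption rather than a description of how any particular library was constructed.

```python
from itertools import combinations


def dual_sgrna_constructs(guides_by_factor):
    """Return (factor_a, guide_a, factor_b, guide_b) for every pair of distinct factors."""
    constructs = []
    # combinations() yields each unordered factor pair exactly once.
    for factor_a, factor_b in combinations(sorted(guides_by_factor), 2):
        for guide_a in guides_by_factor[factor_a]:
            for guide_b in guides_by_factor[factor_b]:
                constructs.append((factor_a, guide_a, factor_b, guide_b))
    return constructs


# Hypothetical factors and guide identifiers, for illustration only.
guides = {
    "TF_A": ["TF_A_sg1", "TF_A_sg2"],
    "TF_B": ["TF_B_sg1"],
    "TF_C": ["TF_C_sg1", "TF_C_sg2"],
}
pairs = dual_sgrna_constructs(guides)
print(len(pairs), "dual-sgRNA constructs")
```

Because the number of constructs grows with the product of the guide counts for each factor pair, a combinatorial screen of this kind would typically be restricted to a short list of candidate factors and a few guides per factor.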
In some embodiments, the method further comprises contacting the cell differentiation factors with a fibroblast or other cell line and identifying cell differentiation factors that promote transdifferentiation of the fibroblast cell line. In some embodiments, the fibroblast cell line is contacted with combinations of two or more cell differentiation factors. In some embodiments, the cell differentiation factors that promote differentiation are combinations of Ngn1+Brn2, Ezh2+Brn2, Mecom+Ezh2, Ngn1+Ezh2, or Ngn1+Foxo1.
In some embodiments, the method comprises or further comprises the step of performing a CRISPR gene repression screen. For example, in some embodiments, the CRISPR repression screen comprises: a) contacting a pluripotent cell that expresses dCas9 fused to a transcription repressor domain (e.g., KRAB) with a plurality of sgRNAs specific for repression of a plurality of cell differentiation factors; b) sorting the library to identify cells that retain pluripotency or differentiate; and c) identifying cell differentiation factors that induce or prevent differentiation of said pluripotent cells. In some embodiments, the CRISPR repression screen and the CRISPR activation screen are performed in the same or different pluripotent cells. In some embodiments, the CRISPR repression screen and the CRISPR activation screen are performed simultaneously using vectors comprising a first sgRNA specific for activation of a first cell differentiation factor and a second sgRNA specific for repression of a second cell differentiation factor.
The resulting gene activation library from CRISPR activation and/or repressor cells is then further analyzed as described below. For example, in some embodiments, following delivery of sgRNAs, cells are cultured and cells that retain pluripotency or differentiate are identified. In some embodiments, cells are sorted based on the presence or absence of differentiation or pluripotency markers.
In some embodiments, in order to identify regulation factors for pluripotency, cells are cultured under conditions that do not inhibit differentiation (e.g., in media lacking inhibitors of GSK3 and ERK pathways). In some embodiments, pluripotent cells are sorted by identifying and selecting (e.g., using flow cytometry) cells that express SSEA1 after culture.
In some embodiments, cells that differentiate are identified by sorting for cells that express differentiation markers specific to the final cell type. For example, in some embodiments, cells that differentiate into neuronal cells are identified by sorting for cells that express Tuj1.
In some embodiments, cell differentiation factors are activated and analyzed in pairs or groups (e.g., as described in Example 2 below) in order to identify combined effects of different factors.
In some embodiments, after selection, cell differentiation regulation factors are identified by identifying sgRNAs that persist in the sorted cells. In some embodiments, sequencing (e.g., deep sequencing) is used to identify sgRNAs. In some embodiments, sequencing methods further comprise comparing the level of said sgRNAs to the level of non-targeting sgRNAs.
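By way of non-limiting illustration, the comparison of targeting sgRNA levels to non-targeting sgRNA levels can be sketched as a per-sgRNA enrichment score followed by standardization against the non-targeting controls. The log2 ratio of read fractions, the pseudocount, the z-score, the function names, and all counts shown are assumptions chosen for illustration and are not a statement of the statistical method actually employed.

```python
import math
import statistics


def log2_enrichment(sorted_counts, unsorted_counts, pseudocount=1.0):
    """Per-sgRNA log2 ratio of read fractions in the sorted vs. unsorted population."""
    total_sorted = sum(sorted_counts.values())
    total_unsorted = sum(unsorted_counts.values())
    scores = {}
    for sgrna, n_unsorted in unsorted_counts.items():
        n_sorted = sorted_counts.get(sgrna, 0)
        # The pseudocount avoids taking the log of zero for guides absent from a sample.
        frac_sorted = (n_sorted + pseudocount) / total_sorted
        frac_unsorted = (n_unsorted + pseudocount) / total_unsorted
        scores[sgrna] = math.log2(frac_sorted / frac_unsorted)
    return scores


def z_scores_vs_controls(scores, control_ids):
    """Standardize targeting sgRNA scores against the non-targeting distribution."""
    control_ids = set(control_ids)
    control_scores = [scores[c] for c in control_ids]
    mu = statistics.mean(control_scores)
    sigma = statistics.stdev(control_scores)  # requires at least two controls
    return {s: (v - mu) / sigma for s, v in scores.items() if s not in control_ids}


# Hypothetical read counts, for illustration only.
unsorted = {"geneA_sg1": 100, "geneA_sg2": 100, "NT_1": 100, "NT_2": 100, "NT_3": 100}
sorted_pop = {"geneA_sg1": 800, "geneA_sg2": 50, "NT_1": 90, "NT_2": 110, "NT_3": 100}

enrichment = log2_enrichment(sorted_pop, unsorted)
z = z_scores_vs_controls(enrichment, ["NT_1", "NT_2", "NT_3"])
for sgrna, value in sorted(z.items(), key=lambda kv: kv[1], reverse=True):
    print(sgrna, round(value, 2))
```

In this sketch, an sgRNA whose score lies well above the non-targeting distribution would be a candidate hit, while scores for the multiple sgRNAs targeting the same factor could be combined (e.g., by a median) to rank factors.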
In deep sequencing, a high number of replicates of each sequencing read (e.g., at least 10, 20, 30, 40, 50, or 100) are used to improve accuracy. The present disclosure is not limited to a particular sequencing technique. Exemplary sequencing techniques are described below. A variety of nucleic acid sequencing methods are contemplated for use in the methods of the present disclosure including, for example, chain terminator (Sanger) sequencing, dye terminator sequencing, and high-throughput sequencing methods. Many of these sequencing methods are well known in the art. See, e.g., Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463-5467 (1977); Maxam et al., Proc. Natl. Acad. Sci. USA 74:560-564 (1977); Drmanac, et al., Nat. Biotechnol. 16:54-58 (1998); Kato, Int. J. Clin. Exp. Med. 2:193-202 (2009); Ronaghi et al., Anal. Biochem. 242:84-89 (1996); Margulies et al., Nature 437:376-380 (2005); Ruparel et al., Proc. Natl. Acad. Sci. USA 102:5932-5937 (2005), and Harris et al., Science 320:106-109 (2008); Levene et al., Science 299:682-686 (2003); Korlach et al., Proc. Natl. Acad. Sci. USA 105:1176-1181 (2008); Branton et al., Nat. Biotechnol. 26(10):1146-53 (2008); Eid et al., Science 323:133-138 (2009); each of which is herein incorporated by reference in its entirety.
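By way of non-limiting illustration, sequencing reads can be tallied per sgRNA and sgRNAs with too few reads can be flagged before further analysis. The sketch below assumes that each read has already been trimmed to the sgRNA sequence and that matching is exact. The library sequences, identifiers, function names, and the use of 10 reads as the minimum are hypothetical choices for illustration, and practical pipelines would typically also handle adapter trimming and sequencing errors.

```python
from collections import Counter


def count_sgrna_reads(reads, library):
    """Count reads whose sequence exactly matches an sgRNA sequence in the library."""
    seq_to_id = {seq: sgrna_id for sgrna_id, seq in library.items()}
    # Start every library member at zero so unobserved sgRNAs remain visible.
    counts = Counter({sgrna_id: 0 for sgrna_id in library})
    for read in reads:
        sgrna_id = seq_to_id.get(read.strip().upper())
        if sgrna_id is not None:
            counts[sgrna_id] += 1
    return counts


def low_coverage(counts, min_reads=10):
    """Return identifiers of sgRNAs supported by fewer than min_reads reads."""
    return sorted(s for s, n in counts.items() if n < min_reads)


# Hypothetical 20-nt sgRNA sequences and reads, for illustration only.
library = {
    "sg_A": "ACGTACGTACGTACGTACGT",
    "sg_B": "TTGCAAGGCTAGCTAGGATC",
    "sg_C": "GGATCCTTAAGGCCATGCAA",
}
reads = ["ACGTACGTACGTACGTACGT"] * 12 + ["ttgcaaggctagctaggatc"] * 3 + ["NNNNNNNNNNNNNNNNNNNN"]

counts = count_sgrna_reads(reads, library)
print(dict(counts))
print("Low coverage:", low_coverage(counts, min_reads=10))
```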
Next-generation sequencing (NGS) methods share the common feature of massively parallel, high-throughput strategies, with the goal of lower costs in comparison to older sequencing methods (see, e.g., Voelkerding et al., Clinical Chem., 55: 641-658, 2009; MacLean et al., Nature Rev. Microbiol., 7: 287-296; each herein incorporated by reference in their entirety). NGS methods can be broadly divided into those that typically use template amplification and those that do not. Amplification-requiring methods include pyrosequencing commercialized by Roche as the 454 technology platforms (e.g., GS 20 and GS FLX), the Solexa platform commercialized by Illumina, and the Supported Oligonucleotide Ligation and Detection (SOLiD) platform commercialized by Applied Biosystems. Non-amplification approaches, also known as single-molecule sequencing, are exemplified by the HeliScope platform commercialized by Helicos BioSciences, and emerging platforms commercialized by VisiGen, Oxford Nanopore Technologies Ltd., Life Technologies/Ion Torrent, and Pacific Biosciences.
Other emerging single molecule sequencing methods include real-time sequencing by synthesis using a VisiGen platform (Voelkerding et al., Clinical Chem., 55: 641-58, 2009; U.S. Pat. No. 7,329,492; U.S. patent application Ser. No. 11/671,956; U.S. patent application Ser. No. 11/781,166; each herein incorporated by reference in their entirety) in which immobilized, primed DNA template is subjected to strand extension using a fluorescently-modified polymerase and fluorescent acceptor molecules, resulting in detectable fluorescence resonance energy transfer (FRET) upon nucleotide addition.
Exemplary cell regulation factors indicative of cells that retain pluripotency or differentiate are described in the Figures and Tables herein. For example, in some embodiments, cell transcription factors that retain pluripotency are one or more of the regulation factors shown in
The cell differentiation factors identified using the described methods find use in a variety of applications. Exemplary uses are described herein.
In some embodiments, the present disclosure provides cell lines, kits, and systems for use in the described methods. For example, in some embodiments, provided herein are libraries of modified pluripotent cells as described above. For example, in some embodiments, the cells comprise a dCas9 construct under the transcriptional control of a first promoter. In some embodiments, the dCas9 is fused to a peptide epitope. In some embodiments, the cells comprise a VP64 transactivation domain under the transcriptional control of a second promoter. In some embodiments, the VP64 transactivation domain is fused to a peptide that specifically binds to the peptide epitope. In some embodiments, the cells comprise a selection marker under the transcriptional control of a third promoter. In some embodiments, each of the first, second, and third promoters are different than each other.
In some embodiments, cells express i) nuclease dead Cas9 fused to a plurality of peptide epitopes; ii) a single chain variable chain antibody fragment specific for said peptide epitope fused to a VP64 transactivator domain; and iii) a transactivator polypeptide.
In some embodiments, the cell lines described herein find use in screening (e.g., drug screening) and research applications as described below.
In some embodiments, provided herein are kits and systems comprising the cell lines described herein. In some embodiments, kits and systems further comprise a plurality of sgRNAs specific for activation of pluripotent cell differentiation factors. In some embodiments, the kit or system comprises one or more sgRNAs (e.g., 10 or more, 100 or more, 1000 or more, or 5000 or more) described in Table 13 (e.g., SEQ ID NOs:586-8317).
In some embodiments, kits and systems further comprise reagents for analysis of one or more properties of the cell lines (e.g., pluripotency or differentiation), reagents for sequencing the cells to identify the presence of sgRNAs, reagents for further downstream analysis (e.g., molecular analysis, toxicity screening, drug screening, or cellular activity assays), or computer software and computer systems for analyzing data.
In some embodiments, the present disclosure provides compositions and methods for differentiating cells into multipotent or specific cell types. The present disclosure is not limited to particular target cell types. Examples include, but are not limited to, epithelial cells (e.g., exocrine secretory epithelial cells, hormone secreting cells (e.g., islet cells), keratinizing epithelial cells (e.g., skin cells)), central nervous system cells (e.g., neuronal cells), blood cells, and organ cells.
In some embodiments, differentiation is induced by increasing expression of cellular regulation factors identified using the methods described herein. In some embodiments, expression is induced by exogenously introduced differentiation genes. In one embodiment, the exogenously introduced gene may be expressed from a chromosomal locus different from the endogenous chromosomal locus of the gene. Such chromosomal locus may be a locus with open chromatin structure, and contain gene(s) dispensable for a somatic cell. In other words, the desirable chromosomal locus contains gene(s) whose disruption will not cause cells to die. Exemplary chromosomal loci include, for example, the mouse ROSA 26 locus and type II collagen (Col2a1) locus (See Zambrowicz et al., 1997). The exogenously introduced pluripotency gene may be expressed from an inducible promoter such that its expression can be regulated as desired.
In some embodiments, the exogenously introduced gene is transiently transfected into cells, either individually or as part of a cDNA expression library. The cDNA library is prepared by conventional techniques. Briefly, mRNA is isolated from an organism of interest. An RNA-directed DNA polymerase is employed for first strand synthesis using the mRNA as template. Second strand synthesis is carried out using a DNA-directed DNA polymerase which results in the cDNA product. Following conventional processing to facilitate cloning of the cDNA, the cDNA is inserted into an expression vector such that the cDNA is operably linked to at least one regulatory sequence. The choice of expression vectors for use in connection with the cDNA library is not limited to a particular vector. Any expression vector suitable for use in mammalian cells is appropriate. In one embodiment, the promoter which drives expression from the cDNA expression construct is an inducible promoter. The term regulatory sequence includes promoters, enhancers and other expression control elements. Exemplary regulatory sequences are described in Goeddel: Gene Expression Technology: Methods in Enzymology, Academic Press, San Diego, Calif. (1990). For instance, any of a wide variety of expression control sequences that control the expression of a DNA sequence when operatively linked to it may be used in these vectors to express cDNAs. It should be understood that the design of the expression vector may depend on such factors as the choice of the host cell to be transformed and/or the type of protein desired to be expressed. Moreover, the vector's copy number, the ability to control that copy number and the expression of any other protein encoded by the vector, such as antibiotic markers, should also be considered.
In some embodiments, the CRISPR activation and/or repression system is expressed from an inducible promoter. The term “inducible promoter”, as used herein, refers to a promoter that, in the absence of an inducer (such as a chemical and/or biological agent), does not direct expression, or directs low levels of expression of an operably linked gene (including cDNA), and, in response to an inducer, its ability to direct expression is enhanced. Exemplary inducible promoters include, for example, promoters that respond to heavy metals (CRC Boca Raton, Fla. (1991), 167-220; Brinster et al. Nature (1982), 296, 39-42), to thermal shocks, to hormones (Lee et al. P.N.A.S. USA (1988), 85, 1204-1208; (1981), 294, 228-232; Klock et al. Nature (1987), 329, 734-736; Israel and Kaufman, Nucleic Acids Res. (1989), 17, 2589-2604), and promoters that respond to chemical agents, such as glucose, lactose, galactose or antibiotics.
A tetracycline-inducible promoter is an example of an inducible promoter that responds to an antibiotic. See Gossen et al., 2003. The tetracycline-inducible promoter comprises a minimal promoter linked operably to one or more tetracycline operator(s). The presence of tetracycline or one of its analogues leads to the binding of a transcription activator to the tetracycline operator sequences, which activates the minimal promoter and hence the transcription of the associated cDNA and the expression of the CRISPR activation and/or repression system. A tetracycline analogue includes any compound that displays structural homologies with tetracycline and is capable of activating a tetracycline-inducible promoter. Exemplary tetracycline analogues include, for example, doxycycline, chlorotetracycline and anhydrotetracycline.
In some embodiments, expression of cell differentiation factors is induced via activating sgRNAs as described herein (e.g., Example 1). One or more sgRNAs are introduced into a pluripotent cell that expresses a CRISPR activation system (e.g., those described herein or other suitable system).
In some embodiments, differentiation is induced via small molecules that activate expression or activity of cell differentiation genes or downstream signaling partners.
In some embodiments, cells are cultured under conditions that promote differentiation. In some embodiments, cultures are adherent cultures, e.g., the cells are attached to a substrate. The substrate is typically a surface in a culture vessel or another physical support, e.g. a culture dish, a flask, a bead or other carrier. In some embodiments, the substrate is coated to improve adhesion of the cells and suitable coatings include laminin, poly-lysine, poly-ornithine and gelatin. In some embodiments, the cells are grown in a monolayer culture or in suspension or as balls or clusters of cells. At higher densities, cells may begin to pile up on each other, but the cultures are essentially monolayers or begin as monolayers, attached to the substrate.
Cells differentiated using the methods described herein find use in a variety of research, screening, and clinical applications. In some embodiments, cells are used to prepare antibodies and cDNA libraries that are specific for the differentiated phenotype. General techniques used in raising, purifying and modifying antibodies, and their use in immunoassays and immunoisolation methods are described in Handbook of Experimental Immunology (Weir & Blackwell, eds.), Current Protocols in Immunology (Coligan et al., eds.); and Methods of Immunological Analysis (Masseyeff et al., eds., Weinheim: VCH Verlags GmbH). General techniques involved in preparation of mRNA and cDNA libraries are described in RNA Methodologies: A Laboratory Guide for Isolation and Characterization (R. E. Farrell, Academic Press, 1998); cDNA Library Protocols (Cowell & Austin, eds., Humana Press); and Functional Genomics (Hunt & Livesey, eds., 2000). Relatively homogeneous cell populations are particularly suited for use in drug screening and therapeutic applications.
In some embodiments, the cells generated by methods provided herein or the above-described cell lines are used to screen for agents (e.g., small molecule drugs, peptides, polynucleotides, and the like) or environmental conditions (such as culture conditions or manipulation) that affect the cells. Particular screening applications relate to the testing of pharmaceutical compounds in drug research. Assessment of the activity of candidate pharmaceutical compounds generally involves combining the cells with the candidate compound, determining any change in the morphology, marker phenotype, or metabolic activity of the cells that is attributable to the compound (compared with untreated cells or cells treated with an inert compound), and then correlating the effect of the compound with the observed change. Any suitable assays for detecting changes associated with test agents may find use in such embodiments. The screening may be done, for example, either because the compound is designed to have a pharmacological effect on specific cell types, because a compound designed to have effects elsewhere may have unintended side effects, or because the compound is part of a library screen for a desired effect. Two or more drugs can be tested in combination (by combining with the cells either simultaneously or sequentially), to detect possible drug-drug interaction effects. In some applications, compounds are screened for cytotoxicity.
In some embodiments, methods and systems are provided for assessing the safety and efficacy of drugs that act upon the differentiated cells, or drugs that might be used for another purpose but may have unintended effects upon the cells. In some embodiments, cells described herein find use in high throughput screening (HTS) applications. In some embodiments, an HTS screening platform is provided (e.g., cells and plates) that allows for the rapid testing of a large number (e.g., 1×10³, 1×10⁴, 1×10⁵, 1×10⁶, or more) of agents (e.g., small molecule compounds, peptides, etc.).
In some embodiments cells generated using methods and reagents described herein are utilized for therapeutic delivery to a subject (e.g., a subject with a disease or other condition). Cells may be placed directly in contact with subject tissue or may be otherwise sealed or encapsulated (e.g., to avoid direct contact). In embodiments in which cells are encapsulated, exchange of factors, nutrients, gases, etc. between the encapsulated cells and the subject tissue is allowed. In some embodiments, cells are implanted/transplanted on a matrix or other delivery platform.
If appropriate, cells are co-administered with one or more pharmaceutical agents or bioactives that facilitate the survival and function of the transplanted cells.
Support materials suitable for use for purposes of the present disclosure include tissue templates, conduits, barriers, and reservoirs useful for tissue repair. In particular, synthetic and natural materials in the form of foams, sponges, gels, hydrogels, textiles, and nonwoven structures, which have been used in vitro and in vivo to reconstruct or regenerate biological tissue, as well as to deliver chemotactic agents for inducing tissue growth, are suitable for use in practicing the methods of the present disclosure. See, for example, the materials disclosed in U.S. Pat. Nos. 5,770,417, 6,022,743, 5,567,612, 5,759,830, 6,626,950, 6,534,084, 6,306,424, 6,365,149, 6,599,323, 6,656,488, U.S. Published Application 2004/0062753 A1, U.S. Pat. Nos. 4,557,264 and 6,333,029.
Cells generated with methods and reagents herein may be implanted as dispersed cells or formed into implantable clusters. In some embodiments, cells are provided in biocompatible degradable polymeric supports; porous, permeable, or semi-permeable non-degradable devices; or encapsulated (e.g., to protect implanted cells from host immune response, etc.). Cells may be implanted into an appropriate site in a recipient. Suitable implantation sites depend on the cell type and may include, for example, the brain, spinal cord, skin, liver, natural pancreas, renal subcapsular space, omentum, peritoneum, subserosal space, intestine, stomach, or a subcutaneous pocket.
In some embodiments, cells or cell clusters are encapsulated for transplantation into a subject. Encapsulation techniques are generally classified as microencapsulation, involving small spherical vehicles, and macroencapsulation, involving larger flat-sheet and hollow-fiber membranes (Uludag, H. et al. Technology of mammalian cell encapsulation. Adv Drug Deliv Rev. 2000; 42: 29-64, herein incorporated by reference in its entirety).
Methods of preparing microcapsules include those disclosed by Lu M Z, et al. Biotechnol Bioeng. 2000, 70: 479-83; Chang T M and Prakash S, Mol Biotechnol. 2001, 17: 249-60; and Lu M Z, et al., J. Microencapsul. 2000, 17: 245-51; herein incorporated by reference in their entireties. For example, microcapsules may be prepared by complexing modified collagen with a ter-polymer shell of 2-hydroxyethyl methacrylate (HEMA), methacrylic acid (MAA) and methyl methacrylate (MMA), resulting in a capsule thickness of 2-5 μm. Such microcapsules can be further encapsulated with additional 2-5 μm ter-polymer shells in order to impart a negatively charged smooth surface and to minimize plasma protein absorption (Chia, S. M. et al. Multi-layered microcapsules for cell encapsulation Biomaterials. 2002 23: 849-56; herein incorporated by reference in its entirety). In some embodiments, microcapsules are based on alginate, a marine polysaccharide (Sambanis. Diabetes Technol. Ther. 2003, 5: 665-8; herein incorporated by reference in its entirety) or its derivatives. For example, microcapsules can be prepared by the polyelectrolyte complexation between the polyanions sodium alginate and sodium cellulose sulphate with the polycation poly(methylene-co-guanidine) hydrochloride in the presence of calcium chloride.
In some embodiments, cells generated using methods and reagents described herein are microencapsulated for transplantation into a subject (e.g., to prevent immune destruction of the cells). Microencapsulation of cells provides local protection of implanted/transplanted cells from immune attack (e.g., along with or without the use of systemic immune suppressive drugs). In some embodiments, cells and/or cell clusters are microencapsulated in a polymeric, hydrogel, or other suitable material, including but not limited to: poly(orthoesters), poly(anhydrides), poly(phosphoesters), poly(phosphazenes), polysaccharides, polyesters, poly(lactic acid), poly(L-lysine), poly(glycolic acid), poly(lactic-co-glycolic acid), poly(lactic acid-co-lysine), poly(lactic acid-graft-lysine), polyanhydrides, poly(fatty acid dimer), poly(fumaric acid), poly(sebacic acid), poly(carboxyphenoxy propane), poly(carboxyphenoxy hexane), poly(anhydride-co-imides), poly(amides), poly(ortho esters), poly(iminocarbonates), poly(urethanes), poly(organophasphazenes), poly(phosphates), poly(ethylene vinyl acetate), poly(caprolactone), poly(carbonates), poly(amino acids), poly(acrylates), polyacetals, poly(cyanoacrylates), poly(styrenes), poly(vinyl chloride), poly(vinyl fluoride), poly(vinyl imidazole), chlorosulfonated polyolefins, polyethylene oxide, polystyrene, polysaccharides, alginate, hydroxypropyl cellulose (HPC), N-isopropylacrylamide (NIPA), polyethylene glycol, polyvinyl alcohol (PVA), polyethylenimine, chitosan (CS), chitin, dextran sulfate, heparin, chondroitin sulfate, gelatin, etc., and their derivatives, co-polymers, and mixtures thereof. In some embodiments, cells are microencapsulated in an encapsulant comprising or consisting of alginate. Cells may be embedded in a material or within a particle (e.g., nanoparticle, microparticle, etc.) or other structure (e.g., matrix, nanotube, vesicle, globule, etc.). 
In some embodiments, microencapsulating structures are modified with immune-modulating or immunosuppressive compounds to reduce or prevent immune response to encapsulated cells. For example, in some embodiments, cells are encapsulated within an encapsulant material (e.g., alginate hydrogel) that has been modified by attachment of an immune-modulating agent (e.g., the immune modulating chemokine, CXCL12 (also known as SDF-1)). In some embodiments, such an immune modulating agent is a T-cell chemorepellent and/or a pro-survival factor.
In some embodiments, cells generated using methods and reagents described herein are macroencapsulated for transplantation into a subject. Macroencapsulation of cells, for example, within a permeable or semi-permeable chamber, provides local protection of implanted/transplanted cells from immune attack (e.g., along with or without the use of systemic immune suppressive drugs), prevents spread of cells to other tissues or areas of the body, and/or allows for efficient removal of cells. Suitable devices for macroencapsulation include those described in, for example, U.S. Pat. No. 5,914,262; Uludag, et al., Advanced Drug Delivery Reviews, 2000, pp. 29-64, vol. 42, herein incorporated by reference in their entireties.
Other encapsulation (micro or macro) devices and methods may find use in embodiments described herein. For example, methods and devices described in U.S. Pub No. 20130209421, U.S. Pat. No. 8,785,185, each of which is herein incorporated by reference in its entirety, are within the scope of embodiments described herein.
As described above and in the examples below, a number of new transcription factors and other regulatory factors involved in regulating the differentiation processes have been discovered using the screening methods described herein. These factors find use in generating stem cells or differentiated cells having desired properties for use in research, drug screening, and therapeutic applications.
In some embodiments, individual or combinations of these factors are used to induce differentiation in a stem cell to obtain differentiated cells or multipotent cells of a particular lineage (e.g., neural stem cells). In some embodiments, such factors are introduced exogenously to stem cells in vitro or in vivo (e.g., via expression vector, etc.). In some embodiments, endogenous factors are up or down regulated by providing activators or inhibitors of endogenous expression.
In some embodiments, individual or combinations of these factors are used to induce differentiation in a somatic cell (e.g., fibroblast, neuronal cell, etc).
In some embodiments, individual or combinations of these factors are used to maintain or induce pluripotency in a cell line. In some embodiments, such factors are introduced exogenously to stem cells or somatic cells in vitro or in vivo (e.g., via expression vector, etc.). In some embodiments, endogenous factors are up or down regulated by providing activators or inhibitors of endogenous expression.
In some embodiments, one or more of the markers described in Tables 3 and 4 are targeted. In some embodiments, provided herein are one or more sgRNAs (e.g., 10 or more, 100 or more, 1000 or more, or 5000 or more) described in Table 13 (e.g., SEQ ID NOs:586-8317) for use in targeting the described markers.
In some embodiments, provided herein are cells generated by such methods and the use of such cells, for example, in drug screening, diagnostic, and therapeutic indications.
Where transcription factors are introduced as peptides, in some embodiments they are complexed with cell membrane permeable peptides (e.g., Tat protein, penetratin, etc.) to facilitate entry into target cells.
sgRNA Library Construction
The oligo library was PCR amplified, gel purified and ligated to the linearized backbone vector (pSLQ1373) digested with BstXI and BlpI using In-Fusion cloning (Clontech).
Cell Culture
E14 mouse ES cells and CamES cells were maintained on gelatin coated tissue culture plates with basal medium (50% Neurobasal, 50% Dulbecco modified Eagle medium (DMEM) Ham's nutrient mixture F12, 0.5% NEAA, 0.5% Sodium Pyruvate, 0.5% GlutaMax, 0.5% N2, 1% B27, 0.1 mM β-mercaptoethanol and 0.05 g/L bovine albumin fraction V; all from Thermo Fisher Scientific) supplemented with LIF (Millipore) and 2i (Stemgent). Human embryonic kidney (HEK293T) cells (ATCC) were cultured in 10% fetal bovine serum (Thermo Fisher Scientific) in DMEM (Thermo Fisher Scientific).
Lentiviral Production
HEK293T cells were seeded at ~30% confluence one day before transfection. Lentiviruses were produced by cotransfecting pHR plasmids with vectors encoding packaging proteins (pMD2.G and pCMV-dR8.91) using TransIT-LT1 transfection reagent (Mirus). Viral supernatants were collected 3 days after transfection and filtered through a 0.45 μm strainer. Supernatant was used for transduction immediately or kept at −80° C. for long-term storage.
High-Throughput Pooled Screening
Screens were performed in two independent replicates for both self-renewal and neural differentiation. For both screens, 10⁸ CamES cells were transduced with the pooled lentiviral library with an MOI of 0.3, treated with puromycin, and cultured in specified medium. After a period of time indicated for each screen, cells were harvested and FACS/MACS sorted. Deep sequencing was performed to profile the sgRNA counts in each sample, and the counts were computationally analyzed to infer top sgRNA and gene hits.
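A low MOI is generally chosen so that most transduced cells carry a single sgRNA. The following is a minimal sketch of that reasoning, assuming that the number of integrations per cell follows a Poisson distribution with mean equal to the MOI. This is a common modeling assumption and is not stated in the present disclosure; the function name is hypothetical, and the cell number is taken as 10⁸.

```python
import math

def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k integrations (Poisson assumption)."""
    return math.exp(-moi) * moi ** k / math.factorial(k)

moi = 0.3        # MOI used in both screens
n_cells = 1e8    # cells transduced per screen (assumed reading of the cell number)

p_none = poisson_pmf(0, moi)              # cells with no sgRNA
p_single = poisson_pmf(1, moi)            # cells with exactly one sgRNA
p_multi = 1.0 - p_none - p_single         # cells with two or more sgRNAs

n_transduced = n_cells * (1.0 - p_none)
single_among_transduced = p_single / (1.0 - p_none)

print(f"untransduced fraction:        {p_none:.3f}")
print(f"single-integration fraction:  {p_single:.3f}")
print(f"multi-integration fraction:   {p_multi:.3f}")
print(f"expected transduced cells:    {n_transduced:.2e}")
print(f"single sgRNA among survivors: {single_among_transduced:.3f}")
```

Under this assumption, roughly a quarter of the cells are transduced and survive puromycin selection, and most of those survivors carry exactly one sgRNA.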
Plasmid Design and Construction
To clone sgRNA vectors, the optimized sgRNA expression vector (pSLQ1373) was linearized and gel purified (Chen et al., 2013 Cell 155, 1479-1491). New sgRNA sequences were PCR amplified from pSLQ1373 using different forward primers and a common reverse primer, gel purified and ligated to the linearized pSLQ1373 vector using In-Fusion cloning (Clontech). Primers used to construct individual sgRNAs are shown in Table 1. To change the promoter of scFv-sfGFP-VP64, the EF1α and PGK promoters were PCR amplified, gel purified, and ligated to linearized pSLQ1504 using In-Fusion cloning (Clontech).
sgRNA Library Design
Putative transcription factor (TF) genes were selected according to the TRANSFAC database, and the TSS (transcription start site) for each gene was determined using the Gencode and refFlat databases. All possible transcripts were selected if multiple TSSs existed for a gene. All sgRNAs targeted the region from −3 kb to 0 relative to the TSS. Using the CRISPR-ERA algorithm (Liu et al., 2015 Bioinformatics 31, 3676-3678), the targeting sequences of sgRNAs adjacent to an NGG PAM (protospacer adjacent motif) were computed, starting with a G (for more efficient U6 promoter activity) with a length of 20 bp. The sgRNAs containing homopolymers spanning greater than 3 nucleotides (nt) were discarded. To avoid off-target effects, the alignment of sgRNA sequences to the mouse genome was computed using the short read aligner Bowtie, and those with fewer than 2 mismatches with another genomic region were excluded. Furthermore, sgRNA sequences that contained certain restriction sites (BstXI and XhoI) were also removed. sgRNAs with a GC content between 30% and 70% were used. An average of about 60 sgRNAs were selected for each target gene. Sequences for non-targeting negative control sgRNAs were generated using a randomized mouse gene TSS region and selected using the same rules as described above.
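The sequence-level selection rules above can be sketched as follows. This is an illustrative sketch only and is not the CRISPR-ERA implementation: the function names are hypothetical, only the forward strand is scanned, and the Bowtie off-target step is omitted. The recognition sequences used for BstXI (CCANNNNNNTGG) and XhoI (CTCGAG) are the standard ones for those enzymes.

```python
import re

BSTXI_SITE = re.compile(r"CCA[ACGT]{6}TGG")   # BstXI recognition sequence
XHOI_SITE = "CTCGAG"                          # XhoI recognition sequence
HOMOPOLYMER = re.compile(r"A{4,}|C{4,}|G{4,}|T{4,}")  # runs longer than 3 nt

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)

def find_candidates(region: str) -> list[str]:
    """20-nt spacers starting with G that sit directly 5' of an NGG PAM
    (forward strand only; overlapping candidates are reported)."""
    pattern = re.compile(r"(?=(G[ACGT]{19})[ACGT]GG)")
    return [m.group(1) for m in pattern.finditer(region.upper())]

def passes_filters(spacer: str) -> bool:
    """Apply the length, leading-G, homopolymer, GC-content and
    restriction-site rules described in the text."""
    s = spacer.upper()
    if len(s) != 20 or not s.startswith("G"):
        return False
    if HOMOPOLYMER.search(s):
        return False
    if not 0.30 <= gc_fraction(s) <= 0.70:
        return False
    if BSTXI_SITE.search(s) or XHOI_SITE in s:
        return False
    return True

# Toy promoter fragment (hypothetical sequence, not from the disclosure).
region = "TT" + "GACCTGAAGTCTGACTGACT" + "AGG" + "TT"
selected = [sg for sg in find_candidates(region) if passes_filters(sg)]
print(selected)
```

In practice the same scan would also be run on the reverse complement of the region, and the surviving spacers would then be passed to the genome-wide alignment step.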
sgRNA Library Construction
The oligonucleotide pool was synthesized by Custom Array. The oligo library was PCR amplified, gel purified and ligated to the linearized pSLQ1373 digested with BstXI and BlpI using In-Fusion cloning.
Construction of the CamES Cell Line
Mouse ES cells were co-transduced with multiple lentiviral constructs that expressed dCas9-SunTag from a TRE3G promoter, scFv-sfGFP-VP64 from the EF1α or PGK promoter, and rtTA from the EF1α promoter. After adding Doxycycline, polyclonal cells were sorted by flow cytometry using a BD FACS Aria2 for GFP+ and mCherry+ cells. After verification of gene activation using sgBrn2, monoclonal cells were further sorted, and one efficient cell line was selected as CamES cells.
Construction of the Tuj-1-hCD8 CamES Cell Line
Construction of CRISPR/Cas9 vector for Tuj1 knockin. The pX330-derived pSLQ1654 encoding the nuclease Cas9 and an optimized sgRNA sequence was first linearized by a BbsI digest and gel purified. Two primers sgTuj-1 F and sgTuj-1 R were phosphorylated, annealed, and ligated to the linearized vector pSLQ1654 to generate pSLQ1654-sgTuj1. sgTuj-1 F: caccgcccaagtgaagttgctcgcagc (SEQ ID NO:378). sgTuj-1 R: aaacgctgcgagcaacttcacttgggc (SEQ ID NO:379).
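As a consistency check on the two oligonucleotides above, the following sketch confirms that they anneal over their full length apart from the 4-nt 5′ extensions (cacc and aaac). Such extensions are consistent with the usual overhang design for BbsI-digested pX330-type backbones; that interpretation is an assumption, as the text does not describe the overhangs. The helper name is hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")

def revcomp(seq: str) -> str:
    """Reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

fwd = "caccgcccaagtgaagttgctcgcagc"   # sgTuj-1 F (SEQ ID NO:378)
rev = "aaacgctgcgagcaacttcacttgggc"   # sgTuj-1 R (SEQ ID NO:379)

fwd_overhang, fwd_core = fwd[:4], fwd[4:]
rev_overhang, rev_core = rev[:4], rev[4:]

# The cores should be exact reverse complements if the oligos anneal cleanly.
anneals = revcomp(fwd_core) == rev_core

print("forward overhang:", fwd_overhang)
print("reverse overhang:", rev_overhang)
print("duplex length (nt):", len(fwd_core))
print("cores are reverse complements:", anneals)
```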
Construction of DNA template. The Tuj1-IRES-hCD8 vector (pSLQ1760) was assembled with three fragments (5′ homologous arm of Tuj1, IRES-hCD8 and 3′ homologous arm of Tuj1) and a modified pUC19 backbone vector by using Gibson Assembly Master Mix (New England Biolabs). Both 5′ and 3′ homology arms were PCR amplified from the genomic DNA extracted from mouse ES cells with Herculase II Fusion DNA polymerase (Agilent). The IRES-hCD8 was PCR amplified from pSLQ1729 (gift from Wendell Lim). The backbone vector was linearized by digestion with PmeI and ZraI. All DNA fragments and the backbone vector were gel purified followed by a Gibson assembly reaction. Primers: 5′ homologous arm F: aaagtgccacctgacactcagtccLagatgtcgtgcgg (SEQ ID NO:380). 5′ homologous arm R: tcacttgggcccctgggct (SEQ ID NO:381). IRES-human CD8 F: caggggcccaagtgaactagtaaaattcgcccctctccctc (SEQ ID NO:382). IRES-human CD8 R: cagctgcgagcaactttaacctgcaaaaagggagcagtuaaagg (SEQ ID NO:383). 3′ homologous arm F: agttgctcgcagctggggt (SEQ ID NO:384). 3′ homologous arm R: agctggagaccgttttttctgactgactggatacagggcat (SEQ ID NO:385).
Electroporation and clonal Tuj1-hCD8 CamES cells: 2.5 μg pSLQ1654-sgTuj1 and 12.5 μg Tuj1-IRES-hCD8 template DNA in 100 μL Nucleofector solution (Amaxa) were electroporated into 1×10⁶ CamES cells using program A-030. Both plasmids were maxiprepped using the Endofree Maxiprep Kit (Qiagen). After 3 days of culture, sorted single cells were seeded in a 96-well plate with one cell per well. All clonal cell lines were analyzed using PCR and sequencing (Yu et al., 2015 Cell 16, 142-147).
Quantitative RT-PCR
Cells were harvested using Accutase (STEMCELL), and total RNA was isolated using the RNeasy Plus Mini Kit (QIAGEN), according to manufacturer's instructions. Reverse transcription was performed using iScript cDNA Synthesis kit (Bio-Rad). Quantitative PCR reactions were prepared with iTaq Universal SYBR Green Supermix (Bio-Rad). Reactions were run on a LightCycler thermal cycler (Bio-Rad). Primers used are summarized in Table 2.
High-Throughput Pooled Self-Renewal Screening
Screens were performed in two independent replicates. For both screens, 10⁸ CamES cells were transduced with the pooled lentiviral library with an MOI of 0.3 on day −3. On day −2, CamES cells were treated with puromycin (Invitrogen, 1 μg/mL) in basal medium supplemented with LIF and 2i. After 48 hours of puromycin selection, cells were harvested as the day 0 sample. Another 10⁸ CamES cells with the same treatment were passaged 10 times in basal medium supplemented with LIF and Doxycycline (Invitrogen, 100 ng/mL), without 2i. Cells were passaged every 3 days. After 30 days, cells were harvested, stained with mouse anti-SSEA1 (BD, 1:50), and FACS sorted using BD FACS Aria2 as SSEA1+ sample (
High-Throughput Pooled Neural Differentiation Screening
The neural differentiation screens were performed as two independent replicates. For both screens, 10⁸ CamES cells were seeded at a density of 40,000 cells/cm² at day −1. Cells were transduced with the pooled lentiviral sgRNA library with an MOI of 0.3 at day 0 in basal medium supplemented with LIF and 2i. At day 1, puromycin was added at 1 μg/mL in ES2N medium (Millipore) with Doxycycline for another 24 hours. ES2N medium with Doxycycline was replaced every day starting on day 2. On day 12, cells were harvested and sorted for hCD8+ and hCD8− cells using EasySep human CD8 isolation kit (STEMCELL Technologies) (
Flow Cytometry Analysis
Cells were harvested, washed, and adjusted to a concentration of 10⁶ cells/mL in ice-cold PBS with 2% FBS. Cells were stained and incubated with diluted primary antibodies at 4° C. for 30 mins in Eppendorf tubes. After staining, cells were washed three times by centrifugation at 400 g for 5 mins and resuspended in 500 μL to 1 mL of ice-cold PBS. Cells were kept in the dark on ice and analyzed using a BD Accuri C6 Cytometer.
Immunocytochemistry
Experiments were performed on cells seeded on plates (IBIDI) that had been coated with gelatin (0.1%) overnight at 37° C. Cells were washed twice with PBS, fixed in 4% Paraformaldehyde (Wako) for 15 mins at room temperature, and permeabilized and blocked with 0.1% Triton X-100, 5% donkey serum in PBS (blocking buffer) for 1 h at room temperature. After three washes with PBS, cells were incubated with primary antibodies. The following primary antibodies with indicated dilution in blocking buffer were used: Rabbit anti-Oct4 (Santa Cruz, 1:200), Rabbit anti-Nanog (Abcam, 1:500), Mouse anti-Tuj1 (Covance, 1:1000), Rabbit anti-Map2 (Cell Signaling Technology, 1:200), Rabbit anti-NeuN (Abcam, 1:1000), Rabbit anti-vGluT1 (Synaptic Systems, 1:200), Rabbit anti-GFAP (Dako, 1:500), and Rabbit anti-Olig-2 (Millipore, 1:500). Cells were incubated with primary antibodies at 4° C. overnight, then washed three times with PBS. After staining with corresponding secondary antibodies in blocking buffer for 1 hour at room temperature, cells were washed three times with PBS and stained with DAPI (Vector Labs) for 5 mins. Washed cells were examined using a Nikon Spinning Disk Confocal microscope with TIRF.
Electrophysiology
External bath solution for whole cell patch clamp recordings contained (in mM): 140 NaCl, 5 KCl, 2 CaCl2, 2 MgCl2, 20 HEPES, and 10 glucose, pH 7.4. Action potentials were recorded under current clamp while sodium and potassium currents were recorded under voltage clamp. The internal pipette solution contained (in mM): 123 K-gluconate, 10 KCl, 1 MgCl2, HEPES, 1 EGTA, 0.1 CaCl2, 1 MgATP, 0.3 Na4GTP and 4 glucose, pH 7.2. For current clamp experiments, currents were injected to keep membrane potentials around −65 mV, and action potentials were elicited by stepwise current injections.
Western Blot
Samples were collected with NP40 buffer with protease inhibitor and phosphatase inhibitor, boiled in 1×SDS loading buffer, separated by SDS-PAGE gels, and transferred onto a nitrocellulose (NC) membrane, which was blocked with 5% non-fat dry milk and incubated with primary antibodies at 4° C. overnight. Rabbit anti-Jun antibody (Cell Signaling Technology, 1:1000), rabbit anti-β-actin antibody (Cell Signaling Technology, 1:5000), and rabbit anti-phospho-Jun antibody (Cell Signaling Technology, 1:1000) were used as primary antibodies. HRP-conjugated donkey anti-rabbit IgG (Jackson ImmunoResearch, 1:5000) was used as the secondary antibody. Signals were detected using SuperSignal West Femto Maximum Sensitivity Substrate (Thermo Scientific). β-actin was used as a loading control.
Differentiation of Mouse ES Cells Through Embryoid Body Formation
The sgKlf2- and sgMlxip-transduced CamES cells were trypsinized, plated on ultralow attachment plates, and cultured in Knockout DMEM supplemented with 10% FBS, without Doxycycline. After 6 days, aggregated cells were collected and seeded onto gelatin-coated plates. Four days later, cells were fixed and stained with markers for three germ layers.
RNA-Seq
CamES cells were transduced with individual sgRNAs, expanded, and differentiated after 2 days of puromycin selection in 6 well plates. Total RNA was purified using RNeasy Plus Mini Kit (Qiagen). Libraries were prepared using TruSeq Stranded mRNA LT Sample Prep kit (Illumina) according to the manufacturer's instructions. Samples were combined and purified using Ampure XP Agencourt beads (Beckman Coulter) and sequenced on a Hi-Seq 4000 (Illumina), to generate paired-end 150 bp reads. Each sample was sequenced to an average depth of 40 million reads.
Reads were mapped with kallisto (Bray et al., 2016 Nature biotechnology 34, 525-527) to the provided GRCm38 reference downloaded from bio math at Berkeley. Normalized gene expression and differentially expressed genes were estimated using sleuth (Pimentel et al., 2016 bioRxiv) and DESeq2 (Love et al., 2014 Genome Biol 15, 550) for the self-renewal and neural data, respectively. Gene ontology analysis was performed using the Bioconductor package gage (Luo et al., 2009 BMC Bioinformatics 10, 161). AP-1 targets were defined as genes that have an AP-1 consensus binding motif (Biddie et al., 2011 Mol Cell 43, 145-155; Rauscher et al., 1988 Genes & Development 2, 1687-1699; Shaulian and Karin, 2002 Nat Cell Biol 4, E131-E136; Zhou et al., 2005 DNA Research 12, 139-150) within 500 bases upstream of the TSS.
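The promoter-motif criterion for AP-1 targets can be illustrated with a minimal Python sketch. It assumes the commonly cited AP-1 (TRE) consensus TGA(C/G)TCA, which the passage cites references for but does not spell out, and it assumes that the sequence upstream of each TSS has already been extracted in the orientation of the gene. The function and variable names are hypothetical.

```python
import re

# Assumed AP-1 (TRE) consensus: TGA(C/G)TCA. The reverse complement of
# TGACTCA is TGAGTCA, so this pattern covers both strands in one scan.
AP1_MOTIF = re.compile(r"TGA[CG]TCA")

def ap1_targets(upstream_by_gene, window=500):
    """Return the set of genes whose upstream window contains the motif.

    upstream_by_gene maps a gene name to the sequence upstream of its TSS,
    written 5'->3' so that the last base is immediately 5' of the TSS.
    """
    hits = set()
    for gene, seq in upstream_by_gene.items():
        proximal = seq.upper()[-window:]  # keep only the bases nearest the TSS
        if AP1_MOTIF.search(proximal):
            hits.add(gene)
    return hits
```

A gene whose only motif lies further than `window` bases from the TSS is not reported, which mirrors the 500-base cutoff described above.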
Bioinformatic Analysis of sgRNA and Gene Hits
Data processing was conducted with custom scripts. Reads were mapped allowing for a mismatch for the first and last base pair of the spacer, which uniquely identified sgRNA.
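The custom scripts are not reproduced here. The following Python sketch shows one way the mismatch tolerance could be implemented, assuming that each read has already been trimmed to the spacer and that the interior bases (everything except the first and last base) are unique across the library. All names are hypothetical.

```python
def build_core_index(spacers):
    """Index library spacers by their interior bases (first and last removed)."""
    index = {}
    for spacer in spacers:
        core = spacer[1:-1].upper()
        if core in index:
            # Two library members differ only at an end base: ambiguous.
            raise ValueError(f"non-unique spacer core: {core}")
        index[core] = spacer
    return index

def assign_read(read_spacer, index):
    """Return the matching library spacer, or None if the interior differs."""
    return index.get(read_spacer[1:-1].upper())
```

Because only the interior is compared, a read that differs from a library spacer at its first base, its last base, or both is still assigned, while any interior mismatch leaves the read unassigned.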
Each sample was normalized by its total read count. This gave a frequency for each sgRNA: Fsg = (read count of the sgRNA)/(total read count of the sample).
For the self-renewal screen, in each condition (CamES cells and SSEA+ cells), the frequency for each sgRNA was averaged across replicates. sgRNA with fewer than 20 counts at time 0 were discarded. The sgRNA enrichment (Esg) was calculated as the log2 fold change from the average time 0 frequency to the average SSEA+ frequency.
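The self-renewal enrichment calculation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the scripts actually used: the count filter is applied to the mean time 0 count across replicates, every replicate is assumed to report the same set of sgRNAs, and an sgRNA absent from the selected population is given an enrichment of negative infinity. Function names are hypothetical.

```python
import math

def frequencies(counts):
    """Normalize raw read counts of one sample by its total read count."""
    total = sum(counts.values())
    return {sg: c / total for sg, c in counts.items()}

def mean_over_replicates(dicts):
    """Average a per-sgRNA quantity across replicates with identical keys."""
    return {sg: sum(d[sg] for d in dicts) / len(dicts) for sg in dicts[0]}

def dropout_enrichment(t0_reps, selected_reps, min_t0_count=20):
    """log2 fold change from mean time 0 frequency to mean selected frequency."""
    f0 = mean_over_replicates([frequencies(r) for r in t0_reps])
    f1 = mean_over_replicates([frequencies(r) for r in selected_reps])
    mean_t0_count = mean_over_replicates(t0_reps)  # assumption: filter on the mean
    esg = {}
    for sg in f0:
        if mean_t0_count[sg] < min_t0_count:
            continue  # too few reads at time 0 to estimate a fold change
        esg[sg] = math.log2(f1[sg] / f0[sg]) if f1[sg] > 0 else float("-inf")
    return esg
```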
For the neuronal differentiation screen, the paired Tuj1-hCD8+ and Tuj1-hCD8− samples were used to compute the enrichment scores. Specifically, frequencies were computed as above, and sgRNA with less than 1 count in the Tuj1-hCD8− library were discarded. Enrichment for each sgRNA in each replicate was calculated as the log2 fold change from the Tuj1-hCD8− sample to the Tuj1-hCD8+ sample. Enrichment was averaged across replicates and used as Esg in subsequent analysis.
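A minimal Python sketch of the paired calculation follows. It assumes that the sorted positive and negative fractions of each replicate are supplied in matching order, that an sgRNA must pass the count filter in every replicate to be scored, and that an sgRNA absent from the positive fraction is given negative infinity. These choices and all names are illustrative assumptions.

```python
import math

def frequencies(counts):
    """Normalize raw read counts of one sample by its total read count."""
    total = sum(counts.values())
    return {sg: c / total for sg, c in counts.items()}

def paired_enrichment(pos_reps, neg_reps, min_neg_count=1):
    """Average, over replicates, the log2 ratio of positive to negative frequency.

    pos_reps[i] and neg_reps[i] are the two sorted fractions of replicate i.
    """
    per_rep = []
    for pos, neg in zip(pos_reps, neg_reps):
        f_pos, f_neg = frequencies(pos), frequencies(neg)
        scores = {}
        for sg, n_neg in neg.items():
            if n_neg < min_neg_count:
                continue  # not detected in the negative fraction
            p = f_pos.get(sg, 0.0)
            scores[sg] = math.log2(p / f_neg[sg]) if p > 0 else float("-inf")
        per_rep.append(scores)
    shared = set.intersection(*(set(s) for s in per_rep))
    return {sg: sum(s[sg] for s in per_rep) / len(per_rep) for sg in shared}
```

Computing the ratio within each replicate before averaging keeps each positive fraction matched to the negative fraction sorted from the same culture.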
For each gene, an enrichment score (ESgene) was calculated from the sgRNA enrichment above, as follows. An unnormalized enrichment score (Egene.top3) was calculated by averaging Esg for the 3 sgRNA with the highest Esg. Egene.top3 was normalized by the distribution of nontargeting sgRNA as follows (Gilbert et al., 2014 Cell 159, 647-661).
Suppose a gene had N targeting sgRNA. Then, 10000 bootstrap samples of size N were drawn from the nontargeting sgRNA. For each sample of size N, Esample.top3 was computed as above. This gave an empirical estimate of the distribution of Egene.top3 if all the sgRNA targeting that gene had been negative control sgRNA. For the final, normalized gene enrichment score (ESgene), the unnormalized enrichment score was divided by the 0.9 quantile of this empirical distribution: ESgene = Egene.top3/Q0.9(Esample.top3), where Q0.9 denotes the 0.9 quantile of the bootstrap distribution.
After ranking genes by ES, the most enriched sgRNA for each gene was selected for subsequent validation.
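The gene-level scoring can be sketched in Python as shown below. Several details are assumptions made for illustration: bootstrap samples are drawn with replacement, the 0.9 quantile is taken by a simple rank-based rule, genes with fewer than three sgRNA are scored on those available, and the seed is fixed only for reproducibility. The function names are hypothetical.

```python
import random

def top3_mean(values):
    """Mean of the three largest values (or of all values if fewer than three)."""
    top = sorted(values, reverse=True)[:3]
    return sum(top) / len(top)

def gene_score(gene_esg, nontargeting_esg, n_boot=10000, q=0.9, seed=0):
    """Top-3 enrichment of a gene divided by a quantile of a nontargeting null.

    gene_esg: enrichment values of the N sgRNA targeting the gene.
    nontargeting_esg: enrichment values of the nontargeting control sgRNA.
    """
    rng = random.Random(seed)
    n = len(gene_esg)
    # Null distribution: top-3 mean of N control sgRNA drawn with replacement.
    null = sorted(
        top3_mean(rng.choices(nontargeting_esg, k=n)) for _ in range(n_boot)
    )
    q_value = null[int(q * (n_boot - 1))]  # rank-based quantile (assumption)
    return top3_mean(gene_esg) / q_value
```

The ratio is only meaningful when the null quantile is positive; a caller would need to handle a zero or negative quantile separately.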
Generation of CRISPRa Mouse Embryonic Stem Cells for Single sgRNA-Mediated Gene Activation and Cell Fate Control
Single sgRNA-mediated efficient endogenous gene activation is useful for large-scale pooled screens of sophisticated cell differentiation phenotypes (
The dCas9-SunTag system contains two components, a SunTag polypeptide domain fused to dCas9 and a VP64 transactivator domain fused to a single chain fragment variable (scFv). It was investigated whether their expression ratio was a key factor determining the activation efficiency. To facilitate fine-tuning their ratio, each component was cloned onto a lentiviral vector (
Twenty-eight clonal cell lines with the PGK promoter were sorted, and one cell line (#5) showing the best Brn2 activation was obtained, which was named CamES (CRISPR-activating mouse ES) cells (
It was next tested if this promoted neural differentiation (Chanda et al., 2014 Stem Cell Reports 3, 282-296). Using a single sgAscl1, robust differentiation of CamES cells into a neuronal phenotype was observed at day 8, which stained positively for the neuronal markers Tuj1 (class III beta-tubulin) and Map2 (Microtubule-associated protein 2) (
The CamES cells activating endogenous Ascl1 were compared with overexpression of exogenous Ascl1 cDNA for neural differentiation. A similar neuronal phenotype was observed using the two approaches (
CamES Cells Allow an eCRISPRa-Mediated Dropout Screen to Identify Transcription Factors that Maintain Self-Renewal
CamES cells were used as an unbiased screening platform to identify key factors among the set of all putative transcription factors that direct cell fate determination. Initial studies focused on factors contributing to the maintenance of ES cell self-renewal. An sgRNA library targeting all putative TFs (~800) and a small set of lincRNAs (long intergenic noncoding RNAs) (~50) was generated. Multiple sgRNAs (60 sgRNAs per gene on average) were designed to target each gene to cover a broad range of gene activation. An additional 9,296 non-targeting negative control sgRNAs were included. Altogether, a library with a total of 55,336 sgRNAs was generated (
The sgRNA library was introduced into CamES cells as a gain-of-function screen to study stem cell self-renewal. Self-renewal of mouse ES cells in serum-free conditions requires simultaneous inhibition of the GSK3 and ERK pathways, which is typically achieved by using two small molecule inhibitors (2i) (Ying et al., 2008 Nature 453, 519-523). It was determined whether activating transcription factors could functionally rescue the loss of 2i to support self-renewal over a long period of time. To do this, the lentiviral sgRNA library was transduced into CamES cells, and the transduced cells were cultured in −2i medium and passaged every three days (
To identify genes whose gain-of-function maintains self-renewal of ES cells, deep sequencing was used to read out the sgRNA representation (
Gene-level enrichment scores were obtained by considering the enrichment of the top three sgRNAs targeting each gene and normalizing by the empirical distribution of the non-targeting sgRNA. A good correlation was obtained between both sgRNA enrichment and gene-level scores across independent library transductions (
Validation of Top Enriched sgRNAs Promoting Long-Term Maintenance of Self-Renewal in ES Cells
Using the non-targeting sgRNA normalized gene scoring method, all detected sgRNAs and their targeting genes were ranked (
The most enriched sgRNAs of the top 18 genes were selected for validation (
Quantitative PCR results confirmed activation of target genes by each sgRNA (
Deep Sequencing and Functional Validation Confirmed the Function of Positive Hits for Self-Renewal Maintenance
sgMlxip was chosen to explore its role in promoting self-renewal. The MLXIP protein forms a heterodimer with MLX (Max-like protein X) and modulates transcriptional regulation in response to cellular glucose levels (Stoltzman et al., 2008 Proc. Natl. Acad. Sci. USA 105, 6912-6917), and its function related to ES cell self-renewal is unknown.
The developmental potential of CamES +sgMlxip cells cultured in −2i conditions for generating the three germ layers was evaluated using CamES +sgKlf2 as a comparison. After removal of Dox to switch off eCRISPRa activity, spontaneous differentiation of both samples in serum-based medium via embryoid body formation generated cells representative of ectoderm (Tuj1+), mesoderm (SMA+), and endoderm (Sox17+) lineages (
RNA-seq analysis was performed on CamES +sgMlxip and CamES +sgKlf2 cells cultured in −2i conditions, and compared to CamES cells cultured with or without 2i. Both samples exhibited high mRNA expression for most pluripotency genes and low expression for most lineage specific genes, with a pattern similar to ES cells cultured in 2i medium and distinct from cells without 2i (
The 2i cocktail contains two small molecules that maintain pluripotency by inhibiting GSK3 (CHIR99021) and MEK1/2 (PD0325901) (Ying et al., 2008 Nature 453, 519-523). Via activation of the Wnt pathway and inhibition of the MAPK pathway, the 2i molecules inhibit differentiation while promoting proliferation of ES cells. The RNA-seq gene expression profiles for the Wnt and MAPK pathways were compared among the samples. For the Wnt pathway genes, CamES +sgMlxip cells correlated well with CamES cells in +2i medium (R2=0.81), but poorly with CamES cells in −2i medium (R2=0.35) (
Similar results were observed for the MAPK pathway: there was a good correlation between CamES +sgMlxip and CamES +2i samples (R2=0.91), compared to a poor correlation between CamES +sgMlxip and CamES −2i samples (R2=0.59). Gene expression related to the MAPK pathway showed a similar pattern at the transcript level in both CamES +sgMlxip and CamES +2i cells. For example, inhibition of Jun, a major transcription factor of the MAPK pathway, was observed in both CamES +sgMlxip and CamES +2i cells, as well as inhibition of other MAPK related genes (EGF, FAS, FGF, PDGF and TGFb) (
The PI3K pathway, which is important in the regulation of ES cell pluripotency and proliferation (Yu and Cui, 2016 Development 143, 3050-3060), was also investigated. The CamES +sgMlxip cells also showed a similar expression pattern as CamES +2i cells (
In summary, both functional tests and gene expression indicate that true positive hits identified using the CRISPRa screening method maintain self-renewal of stem cells.
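The pathway-level R2 values reported above compare expression profiles between pairs of samples. The text does not state how R2 was computed. A minimal Python sketch, assuming R2 is the squared Pearson correlation of per-gene expression values (for example, log-transformed normalized expression) across the genes of one pathway, could look like this. The function name is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length expression profiles.

    x and y hold the expression of the same genes, in the same order, in two
    samples. Both profiles must vary (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)
```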
Engineered CamES Cells Allow an eCRISPRa-Mediated Non-Dropout Screen to Identify Key Factors Promoting Neural Differentiation
An eCRISPRa gain-of-function screen was performed to identify TFs that promote the dynamic, complex neural differentiation process. Transcription factor-mediated lineage specification is heterogeneous and stochastic: unlike in the dropout screen, a desired differentiated cell type may only represent a small subset of the total population; and spontaneous differentiation may generate the desired cell type even when a non-functional factor is present.
To address these challenges, a clonal reporter CamES (Tuj1-hCD8 CamES) cell line carrying a biallelic human CD8 (hCD8) gene cassette appended downstream to endogenous Tuj1 via an IRES (internal ribosome entry site) was established (
The parameters of cell density and differentiation time for screening, which affected neural differentiation efficiency, were determined. 40,000 cells/cm2 was chosen as the seeding density, as Tuj1-hCD8 CamES cells transduced with the sgRNA library at this density maximized the seeding cell number and still showed detectable expression of the neural markers Tuj1 and Map2 (
Deep sequencing was used to identify sgRNAs for transcription factors that enhance neural differentiation. The overall distributions of sgRNA from samples collected from the plasmid library and from sorted Tuj1-hCD8+ and Tuj1-hCD8− cells were compared (
Stem cell differentiation is affected by stochastic factors. In these experiments, activation of Ascl1, a powerful neural inducer, led to only 47.6% of cells being Tuj1-hCD8+ (
Validation of Top Enriched sgRNAs Promoting Neural Differentiation
Among the ranked gene hits, the top 20 most effective sgRNAs were chosen for validation (
Twenty individual sgRNAs for the top gene hits, as well as 6 non-targeting negative control sgRNAs, were tested. Quantitative PCR results showed activation (10 to 10,000 fold) of 19 genes out of 20 tested by their cognate sgRNA (
Another neuronal marker, NCAM, was used to test differentiation of CamES cells. Similarly, all 20 sgRNAs generated NCAM+ cells (20-60%) after 12 days of differentiation in basal medium, and all negative control sgRNAs showed far fewer NCAM+ cells (below 10%) (
Activation of different endogenous genes induced different neural subtypes (
Functional Test and Transcriptome Profiling Confirmed sgJun-Induced Neural Differentiation
The role of Jun for promoting neural differentiation was examined. Jun has not previously been tied to early neural development. It was observed that sgJun could induce functional neurons that were able to generate action potentials upon current injection (
Jun regulates downstream target genes through its phosphorylation and the AP-1 complex formation with c-Fos (Rauscher et al., 1988 Genes Dev. 2, 1687-1699). It was confirmed that endogenous Jun induced by sgJun also was phosphorylated (
Previous work reported that overexpression of β-catenin in mouse ES cells induces neurogenesis (Otero et al., 2004 Development 131, 3545-3557). The excessive expression of Wnt genes in the cells indicates that the Wnt pathway plays an important role in sgJun-induced neurogenesis (
Paired-Analysis is Useful in the Non-Dropout Cell Differentiation Screen
In dropout screens, cells that are negative for the phenotype of interest are almost completely removed from the selected population. Therefore, one can calculate enrichment of the selected population relative to initial pool of sgRNAs to infer functional genes (
Quantitative Genetic Interaction Mapping Using CRISPRi
The vectors used in this study were constructed by using standard molecular cloning techniques, including PCR, restriction enzyme digestion and ligation. Custom oligonucleotides were from Integrated DNA Technologies. E. coli strain DH5α was used for the transformation, and transformants were selected with 100 μg/ml of carbenicillin or 50 μg/ml of kanamycin. DNA was extracted and purified using Plasmid Mini or Midi Kits (Macherey-Nagel). Sequences of the vector constructs were verified with Quintarabio's DNA sequencing service.
Construct Design
The dCas9-KRAB plasmid and sgRNA expressing plasmid are previously described vectors (Du, D. & Qi, L. S. Cold Spring Harbor Protocols 2016, (2016)). The SpeI and SalI sites were mutated in the sgRNA expression plasmid. The single sgRNA expression plasmids were cloned as described previously with minor modifications. Briefly, the plasmids were cloned by PCR from an existing sgRNA template using a unique 5′ primer containing the desired protospacer (N is the protospacer) and a common primer with SpeI and SalI sites. The PCR products and the lentiviral mouse U6 (mU6)-based sgRNA expression vector were digested with BstXI and XhoI, and the two pieces of DNA were ligated together. Unique SpeI and SalI sites were introduced into the single vector to enable the insertion of the mU6-sgRNA expression cassettes.
To construct a lentiviral vector for mU6-driven expression of combinatorial gRNAs, mU6-sgRNA expression cassettes were prepared from digestion of the storage vector with XbaI and XhoI enzymes, and inserted into the target single sgRNA expression vector backbone, using ligation via the compatible sticky ends generated by digestion of the target single sgRNA expression vector with SpeI and SalI enzymes.
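The compatibility of the sticky ends can be checked with a short Python sketch. The recognition sequences and cut positions below are the standard ones for these four enzymes and are not taken from the passage; each enzyme cuts after the first base of its palindromic six-base site, leaving a four-base 5′ overhang. The helper names are hypothetical.

```python
# enzyme -> (recognition site 5'->3', cut offset on the top strand)
ENZYMES = {
    "XbaI": ("TCTAGA", 1),
    "SpeI": ("ACTAGT", 1),
    "XhoI": ("CTCGAG", 1),
    "SalI": ("GTCGAC", 1),
}

def overhang(enzyme):
    """Single-stranded 5' overhang left by a symmetric cut in a palindromic site."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]

def compatible(a, b):
    """True if ends produced by enzymes a and b can anneal to each other."""
    return overhang(a) == overhang(b)

def junction(left, right):
    """Top-strand sequence formed by ligating a left-cut end to a right-cut end."""
    if not compatible(left, right):
        raise ValueError(f"{left} and {right} ends are not compatible")
    lsite, lcut = ENZYMES[left]
    rsite, rcut = ENZYMES[right]
    return lsite[:lcut] + overhang(left) + rsite[len(rsite) - rcut:]
```

Under these definitions XbaI pairs with SpeI and XhoI pairs with SalI. The hybrid junctions match neither parental site, so a ligated junction is not cut again by any of the four enzymes.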
The Single Library Cloning
A library of 336 sgRNAs targeting a set of 112 genes encoding epigenetic regulators (3 sgRNAs/gene) was constructed using top prediction hits from the CRISPR-ERA algorithm (Liu, H. et al. Bioinformatics 31, 3676-3678 (2015)). The library also included 30 non-targeting negative control sgRNAs. sgRNAs containing XbaI, XhoI, SpeI, or SalI restriction sites, which were used for double sgRNA library construction, were excluded. Individual oligos encoding sgRNAs were synthesized in a 384-well format, and the single sgRNA expression vectors were constructed individually by ligating the oligos into a common sgRNA lentiviral vector with SpeI and SalI sites. After sequencing validation, the 336 sgRNA constructs were manually mixed in equal amounts for the single sgRNA screens and double sgRNA library construction. The sgRNA sequences and corresponding genes are listed in Table 5.
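The restriction-site exclusion can be sketched as a simple filter in Python. The recognition sequences are the standard ones for these enzymes, not values given in the passage, and the function name is hypothetical. All four sites are palindromic, so scanning one strand is sufficient.

```python
# Standard recognition sequences of the enzymes used for library assembly.
SITES = {"XbaI": "TCTAGA", "XhoI": "CTCGAG", "SpeI": "ACTAGT", "SalI": "GTCGAC"}

def cloning_safe(spacers):
    """Keep only spacers that contain none of the four recognition sites."""
    kept = []
    for spacer in spacers:
        seq = spacer.upper()
        if not any(site in seq for site in SITES.values()):
            kept.append(spacer)
    return kept
```

A stricter filter would also scan the spacer together with a few bases of flanking vector sequence, since a site can be created at the junction; that extension is left out of this sketch.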
Combinatorial sgRNA Library Pool
To generate the pooled storage vector library, the 336 single sgRNA expression vectors were mixed equally. Pooled lentiviral vector libraries harboring combinatorial gRNA(s) were constructed with the same strategy as for the generation of combinatorial sgRNA constructs described above, except that the assembly was performed with pooled inserts and vectors, instead of individual ones. Briefly, the pooled mU6-sgRNA inserts were generated by a single-pot digestion of the pooled storage vector library with XbaI and XhoI. The destination lentiviral vectors were digested with SpeI and SalI. The digested inserts and vectors were ligated via their compatible ends (i.e., XbaI+SpeI & XhoI+SalI) to create the pooled double sgRNA library (336×336=112,896 total combinations) in the lentiviral vector. The lentiviral sgRNA library pools were prepared in DHS ultra-competent cells (Agilent Technologies) and purified by Plasmid Midi Kit (Macherey-Nagel). The sequences used for deep sequencing are listed in Table 6.
Cell Culture
HEK293T and HEK293 cells were cultured in DMEM supplemented with 10% fetal bovine serum, 100 units/ml streptomycin and 100 mg/ml penicillin at 37° C., with 5% CO2. To generate the inducible CRISPRi HEK293 (TetOn-dCas9-KRAB) cell line, the cells were lentivirally transduced with constructs that express dCas9-KRAB from the TRE3G promoter and rtTA. Pure polyclonal populations of the CRISPRi cell line were treated with doxycycline, and sorted by flow cytometry using a BD FACS Aria2 for mCherry expression. These cells were then grown in the absence of doxycycline until mCherry fluorescence reduced to uninduced levels.
Lentivirus Production and Transduction
Lentiviruses were produced and packaged in HEK293T cells as described previously with minor modification (Du et al., 2016, supra). Briefly, HEK293T cells were transfected with standard packaging vectors using Mirus TransIT-LT1 transfection reagent (Mirus MIR 2300) according to the manufacturer's instructions. Viral supernatant was harvested 48-72 h following transfection and either filtered through a 0.45 μm syringe filter or snap-frozen.
Growth Competition Assay
Cells were grown at a minimum library coverage of 1,000 for the screens. The target cells were infected in the presence of 8 μg/ml polybrene (Sigma) at a multiplicity of infection of about 0.3 to ensure single copy integration in most cells, which corresponds to an infection efficiency of 30-40%. For single library screens, cells were grown in flasks and harvested at 0, 12 and 20 days after puromycin selection; for double library screens, cells were grown in flasks and harvested at 0, 8 and 16 days after puromycin selection. Cells were maintained at a coverage of at least 1,000 cells per sgRNA for each screen.
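The scale implied by these coverage and infection figures can be estimated with a short Python sketch. It is illustrative only: the Poisson model of viral integration is an idealization introduced here, not something the passage states, and it gives an infected fraction of about 26% at a multiplicity of infection of 0.3, somewhat below the 30-40% range reported above. The function names are hypothetical.

```python
import math

def cells_to_infect(library_size, coverage, infected_fraction):
    """Cells to expose to virus so that infected cells reach the target coverage."""
    return math.ceil(library_size * coverage / infected_fraction)

def poisson_infected_fraction(moi):
    """Fraction of cells receiving at least one integration (Poisson assumption)."""
    return 1.0 - math.exp(-moi)

def single_copy_fraction(moi):
    """Among infected cells, the fraction with exactly one integration."""
    return moi * math.exp(-moi) / (1.0 - math.exp(-moi))
```

For example, `cells_to_infect(112896, 1000, 0.3)` estimates the number of cells needed for the double library at a 30% infection efficiency, and `single_copy_fraction(0.3)` indicates how large a share of infected cells would carry a single construct under the Poisson assumption.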
After the cell samples were collected, the genomic DNA was isolated using the QIAamp DNA Blood Maxi Kit (Qiagen) according to the manufacturer's protocol, the cassette encoding the sgRNA was amplified by PCR, and relative sgRNA abundance was determined by next generation sequencing on an Illumina MiSeq for single screens or an Illumina HiSeq-2500 for double screens using custom primers with previously described protocols at high coverage (Bassik, M. C. et al. Cell 152, 909-922 (2013); Roguev, A. et al. Nat. Methods 10, 432-437 (2013)). Two biological replicates of each screen were performed.
For the cell growth validation experiments, the viruses with single sgRNAs or double sgRNAs were transduced into HEK293 (TetOn-dCas9-KRAB) cells, followed by selection with 2 μg/ml puromycin to remove the uninfected cells. Three days after the cells were treated with or without Dox (0.5 μg/ml), the cell viability was measured by XTT assay (Biotium) according to the manufacturer's experimental protocol. 2,000 to 10,000 cells were plated into 96-well tissue culture plates for the growth assay. For each well, 30 μl of XTT solution was added to the 100 μl cell culture at the time points indicated. Cells were incubated for 6 hours at 37° C. with 5% CO2. The absorbance signal of the samples was measured with a spectrophotometer at a wavelength of 450-500 nm, and background absorbance was measured at a wavelength between 630-690 nm. The normalized absorbance values were obtained by subtracting background absorbance from signal absorbance.
Validation of Gene Hits
Cells were harvested and total RNA was isolated using the RNeasy Kit (Qiagen), according to the manufacturer's instructions. RNA was converted to cDNA using the iScript™ cDNA Synthesis Kit according to the manufacturer's instructions (Bio-rad). Quantitative PCR reactions were prepared with a 2× master mix according to the manufacturer's instructions (Bio-rad). Reactions were run on a CFX96 Touch™ Real-Time PCR Detection System (Bio-rad). Primer sequences for qPCR are listed in Table 3.
To develop a CRISPRi combinatorial screening approach, a single library consisting of 336 sgRNAs was constructed using a computational algorithm (Liu, H. et al. Bioinformatics 31, 3676-3678 (2015)), which sequence-specifically targeted 112 genes (3 sgRNA/gene) involved in chromatin regulation (for the gene list and their sgRNAs, see Table 5). The library also included 30 negative control sgRNAs without target sites in the human genome. Pooled cloning of 336 sgRNAs onto itself generated a mixed double sgRNA library containing 112,896 (336×336) combinations. Both libraries were prepared as lentivirus pools ready for large-scale mammalian cell transduction at a low multiplicity of infection (MOI=0.3).
The repressive dCas9-KRAB protein was conditionally expressed under the control of the Doxycycline (Dox)-inducible promoter TetON-3G in the human embryonic kidney 293 (HEK293) cells. Transducing both libraries into clonal HEK293-dCas9-KRAB cells generated two pooled cell populations (
It was first investigated if sgRNA distribution remained consistent between biological replicates before and after library screening. Sequencing single and double libraries with or without Dox at different time points exhibited consistently high coefficient of determination (R2) (
It was next determined if inducible expression of dCas9-KRAB allowed one to identify single and double gene perturbations that influenced cell growth (
The above inducible experiments were performed at end time points. sgRNA distribution was further compared for both single and double libraries with and without Dox induction at intermediate time points (day 12 for the single library and day 8 for the double library). Consistent phenotypes at these time points compared to end time points were observed. For example, a similar list of genes whose repression slowed down cell growth was observed, including MED14, MED15, WDR82, PAF1, and RIF1. The absence of WDR5 at day 12 indicates that WDR5 has a moderate role in growth compared to other gene hits. For the double library, a similar bifurcation pattern was observed, with the difference that the bifurcation degree (measured by the angle between the two populations) is smaller at earlier time points.
The consistent gene hits and dropout pattern for both libraries between different time points propelled a comparison of datasets across a broad time course. It was investigated if the trend of dropout effects could provide another layer of identification of true positive hits (
The time-course sgRNA enrichment was compared in the absence of Dox for both single and double libraries. No significant changes of sgRNA distribution were observed over time for both libraries without Dox. For the single library, comparing the day 0 sample with day 12 or day 20 samples (+/− Dox) showed only dropout of gene hits with Dox (
Two negative interactions were validated, demonstrating their ability to suppress cell proliferation and to cause repression of target endogenous genes. Two pairs were chosen for testing: MRGBP/MED6 and BRD7/LEO1. MRGBP is a component of the NuA4 histone acetyltransferase complex involved in gene activation by acetylation of histones; BRD7 is a member of the bromodomain-containing protein family; and LEO1 is a component of the PAF1 complex (PAF1C) involved in transcription by RNA Pol II. The results confirmed the validity of the double repression and synthetic lethality-based growth effects. As shown in
Based on the curated set of protein complexes and pathways, a GI map depicting the genetic cross-talk between different functional modules involved in chromatin was created (
The nuclease Cas9 for gene editing-mediated knockout allows complete loss of function, yet knockout can be heterogeneous among alleles due to the existence of in-frame indels. In contrast, CRISPRi-based dCas9 transcription knockdown leads to partial, homogeneous loss of function (Mandegar, M. A. et al. Cell Stem Cell 18, 541-553 (2016)). Applying the two methods to higher-order genetic screening requires consideration of these important differences. For example, as epistatic genetic screens require simultaneous perturbation of multiple genes (usually 2 genes, 4 alleles), the heterogeneity of gene knockout in pooled CRISPR screens may result in a significant portion of cells without proper epistatic perturbation. Among the cells that are properly perturbed, complete knockout of function offers a highly sensitive way to discover novel gene combinations whose perturbation leads to measurable phenotypes (e.g., growth). Yet, combinatorial multiple gene knockouts may easily cause lethal effects by themselves, precluding testing of other phenotypes (e.g., differentiation or host-pathogen interaction).
On the contrary, partial knockdown by CRISPRi, while being less sensitive than CRISPR knockout, likely avoids major dominating lethal effects. The homogeneous transcriptional repression could generate cell populations with consistent multi-gene perturbation. Furthermore, sgRNAs binding at various loci along the promoter lead to varying levels of CRISPRi repression, which is contemplated to provide dosage-dependent combinatorial screening distinct from binary perturbation from CRISPR. The demonstration of the inducible and titratable features of CRISPRi combinatorial screening showed the method allows assaying genetic interactions temporally and potentially in a dose-dependent manner.
Compared to RNAi-based methods, the major approach for genetic interaction mapping, CRISPRi presents a few advantages as well. CRISPRi knockdown is specific (Gilbert, L. A. et al., Cell 159, 647-661 (2014)), with fewer concerns about multiple sgRNAs in the same cells causing unexpected off-target perturbation. As CRISPR activation (CRISPRa) is based on a setup similar to CRISPRi, the same approach can be expanded into gain-of-function screening of pairs of genes by changing a repressive effector into an activating effector. Furthermore, combining CRISPRi and CRISPRa in the same cells is contemplated to allow simultaneously activating one gene while repressing another gene. These dramatically expand the modes of epistatic screens that can be performed.
Development of high-throughput epistasis-mapping technologies has made it possible to interrogate complex biological phenomena. Mapping the PPI networks and GI networks has become a major methodology to study epistasis. The PPI networks report on gene products that interact physically; GIs, in contrast, illustrate functional relationships between genes including, but not limited to, physical interactions of their gene products. They often reveal how groups of proteins and complexes work together to carry out biological functions and can describe the cross-talk between pathways and processes. Therefore, the method for mapping GI networks in mammalian cells provides a useful, natural complement to PPI mapping methods and other existing GI mapping methods. Integrating the two types of information is extremely powerful in understanding complex biology in broader contexts of basic and translational research.
Besides gene activation, gene repression also can facilitate cell fate conversion. For example, knockdown of many epigenetic modulators increases the efficiency of reprogramming or transdifferentiation processes. This example describes a repression screen platform to identify genes that act as barriers to cell fate conversion.
To perform gene repression screens, a clonal mouse ES cell line carrying Staphylococcus aureus dCas9 (SadCas9)-KRAB is generated. Mouse ES cells are co-transfected with Cas9, sgRNA targeting the mouse Rosa26 locus, and a vector containing SadCas9-KRAB with a Zeocin-resistance gene. Zeocin-resistant cells are sorted into a 96-well plate. After a week of culture, the genomic DNA is purified and the correct integration of SadCas9-KRAB into the Rosa26 locus is confirmed. This clonal cell line is used as a platform to identify gene barriers of differentiation processes.
To perform single gene repression screens, a genome-wide gene repression SadCas9 sgRNA library is generated. The library includes sgRNAs targeting the −50 bp to +300 bp region relative to all putative genes in the mouse genome. All the available sgRNAs are searched by BLAST against the mouse genome and excluded if there is predicted off-target binding. Other design criteria and the construction method are similar to the design of the activation sgRNA library described in Example 1. This repression library is transduced into the SadCas9 repression mouse ES cells, and neural differentiation is performed as in the single screen. On day 12, cells are harvested and sorted for hCD8+ and hCD8−. The sgRNAs are sequenced and analyzed by paired analysis for enriched genes in the hCD8+ and hCD8− populations, and a list of top hits for neural differentiation barrier genes is identified.
Over the past years, the literature has shown that the activation of combinatorial transcription factors can control a cell fate. For example, the transcription factors Oct4, Klf4, Sox2, and c-Myc are used to reprogram somatic cells to induced pluripotent stem (iPS) cells. Moreover, activation of combinatorial transcription factors also induces the generation of many cell types, such as cardiomyocytes, neurons, and hepatocytes, directly from somatic cells. These works indicate that single TFs are not sufficient to achieve a cell fate conversion process in most cases. Thus, a platform that allows combinatorial screens is urgently needed to facilitate cell fate determination studies.
To perform a second-round combinatorial activation screen, an sgRNA library that achieves double gene activation is generated. In this library, two different sgRNA cassettes are constructed into one vector. The first cassette contains sgRNAs targeting top hit genes from the single activation screen, which are driven by a human U6 promoter. Meanwhile, each vector contains the second cassette, which is a sgRNA with a different stemloop sequence driven by a mouse U6 promoter. The sgRNAs of the second cassettes also target top hit genes from the first round activation single screen. This construct expresses sgRNAs targeting two different genes, as well as avoids recombination of repeated sgRNA sequences. Two different sgRNAs bind to dCas9 and achieve the activation of two different top hit genes simultaneously in the dCas9-activation system. This allows the combinatorial double activation screen.
In some embodiments, this double activation library is transduced into CamES cells, and neural differentiation is performed as in the single screen. On day 12, cells are harvested and sorted for hCD8+ and hCD8−. The sgRNAs are sequenced, and paired analysis identifies enriched genes in the hCD8+ and hCD8− populations. The screen identifies optimal TF combinations that drive neural differentiation of mouse ES cells.
Additionally, the combination of gain-of-function and loss-of-function techniques accelerates cell fate conversion, and helps to fully reveal cellular reprogramming mechanisms. However, a platform to perform gain-of-function and loss-of-function screens simultaneously is not available at present.
To perform a simultaneous activation/repression screen, a clonal ES cell line carrying gene activation/repression cassettes is generated. Two vectors, each containing one of the cassettes, are constructed. One vector contains the activation cassette, which is a dead Streptococcus pyogenes Cas9 (SpCas9)-activation system, with an eGFP gene cassette. The other vector comprises SadCas9-KRAB, followed by a zeocin-resistance gene cassette. The two vectors, together with Cas9 and sgRNA targeting the mouse Rosa26 locus, are co-transfected into mouse ES cells. To select mouse ES cells carrying these two systems, transfected ES cells are selected with zeocin. After seven days, the remaining zeocin-resistant cells are analyzed with flow cytometry and single GFP+ cells are sorted into 96-well plates. One week later, the genome of clonal cells is analyzed to confirm the correct integration of both activation and repression cassettes. This clonal cell line allows the activation and repression of different genes simultaneously.
An sgRNA library that achieves gene turning-on and -off simultaneously is constructed. In this library, two different sgRNA cassettes are constructed into one vector. The first cassette contains sgRNAs of SpCas9 targeting top hit genes from the single activation screen, which are driven by a human U6 promoter. Meanwhile, each vector contains the second cassette, which is a sgRNA of SaCas9 driven by a mouse U6 promoter. The sgRNAs of SaCas9 in the second cassettes target top hit genes from the first round repression screen. This construct expresses sgRNAs of SpCas9 and SaCas9, and thus allows simultaneous gene activation and repression.
This activation/repression library is applied to clonal turning-on/off mouse ES cells, and neural differentiation is performed as in the single screen. On day 12, cells are harvested and sorted for hCD8+ and hCD8−. The sgRNAs are sequenced, and paired analysis identifies enriched genes in the hCD8+ and hCD8− populations. A series of gene combinations having both TF determinants and neural differentiation barriers is identified. The simultaneous turning-on of TF determinants and turning-off of neural differentiation barriers generates neural cells from mouse ES cells with very high efficiency.
To clone sgRNA vectors, the optimized sgRNA expression vector (pSLQ1373) was linearized and gel purified (Chen et al., 2013). New sgRNA sequences were PCR amplified from pSLQ1373 using different forward primers and a common reverse primer, gel purified and ligated to the linearized pSLQ1373 vector using In-Fusion cloning (Clontech). Primers used to construct individual sgRNAs are shown in Table 8. To change the promoter of scFv-sfGFP-VP64, the EF1α and PGK promoters were PCR amplified, gel purified, and ligated to linearized pSLQ504 using In-Fusion cloning (Clontech).
Two-guide expression vectors were assembled by a two-step cloning procedure. First, new sgRNA sequences (Integrated DNA Technologies) were PCR amplified from pSLQ5004 and ligated into the BstXI- and XhoI-digested pSLQ5004 parental vector, which contained a modified human U6 promoter (hU6). The same single sgRNA expression constructs were cloned into pSLQ1373 as previously described, which contained a modified mouse U6 promoter (mU6) and an optimized sgRNA stem-loop sequence. Second, the two-guide expression cassettes were assembled from single cassettes that were PCR amplified from the pSLQ5004-based single sgRNA constructs using two sgRNA forward and reverse primers, and were inserted into the NsiI-digested pSLQ1373 single sgRNA constructs. Primers used to construct individual sgRNAs are shown in Table 11.
sgRNA Library Design
Putative transcription factor (TF) genes were selected according to the TRANSFAC database, and the TSS (transcription start site) for each gene was determined using the Gencode and refFlat databases. If multiple TSSs existed for a gene, all possible transcripts were selected. All sgRNAs targeting −3 kb to 0 relative to the TSS were kept. Using the CRISPR-era algorithm (Liu et al., 2015), the targeting sequences of sgRNAs adjacent to an NGG PAM (protospacer adjacent motif) were computed; each targeting sequence started with a G (for more efficient U6 promoter activity) and had a length of 20 bp. The sgRNAs containing homopolymers spanning more than 3 nucleotides (nt) were discarded. To avoid off-target effects, the sgRNA sequences were aligned to the mouse genome using the short read aligner Bowtie, and those with fewer than 2 mismatches to another genomic region were excluded. Furthermore, sgRNA sequences that contained certain restriction sites (BstXI and BlpI) were also removed. sgRNAs with a GC content between 30% and 70% were kept. An average of about 60 sgRNAs was selected for each target gene. Sequences for non-targeting negative control sgRNAs were generated using a randomized mouse gene TSS region and selected using the same rules as described above.
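The sequence-level rules above can be illustrated with a short Python sketch. It is not the CRISPR-era implementation; the function names are hypothetical, and the recognition patterns used for BstXI (CCANNNNNNTGG) and BlpI (GCTNAGC) are the commonly cited ones and are stated here as assumptions. The sketch checks only the 20-nt targeting sequence and its PAM, and it does not cover the Bowtie off-target step or restriction sites created at vector junctions.

```python
import re

# Assumed recognition patterns for the two enzymes named in the text.
# Both patterns are palindromic, so scanning one strand is sufficient.
RESTRICTION_SITES = [
    re.compile(r"CCA[ACGT]{6}TGG"),  # BstXI (assumption)
    re.compile(r"GCT[ACGT]AGC"),     # BlpI (assumption)
]
HOMOPOLYMER = re.compile(r"(.)\1{3,}")  # a run of 4 or more identical bases


def gc_fraction(seq):
    """Fraction of G or C bases in an upper-case sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def passes_filters(spacer, pam):
    """Return True if a targeting sequence satisfies the stated design rules."""
    spacer, pam = spacer.upper(), pam.upper()
    if len(spacer) != 20 or not spacer.startswith("G"):
        return False  # 20 bp, starting with G
    if re.fullmatch(r"[ACGT]GG", pam) is None:
        return False  # must sit next to an NGG PAM
    if HOMOPOLYMER.search(spacer):
        return False  # homopolymer longer than 3 nt
    if not 0.30 <= gc_fraction(spacer) <= 0.70:
        return False  # GC content outside 30% to 70%
    return not any(site.search(spacer) for site in RESTRICTION_SITES)
```

For example, `passes_filters("GACTGATCGATCGTACGTAC", "AGG")` accepts the sequence, while a sequence containing `AAAA` or a BlpI site is rejected.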
sgRNA Library Construction
The oligonucleotide pool was synthesized by Custom Array. The oligo library was PCR amplified, gel purified and ligated to the linearized backbone vector (pSLQ1373) digested with BstXI and BlpI using In-Fusion cloning. Libraries and parental vector will be made available on addgene.org.
E14 mouse ES cells and CamES cells were maintained on gelatin coated tissue culture plates with basal medium (50% Neurobasal, 50% Dulbecco modified Eagle medium (DMEM)/Ham's nutrient mixture F12, 0.5% NEAA, 0.5% Sodium Pyruvate, 0.5% GlutaMax, 0.5% N2, 1% B27, 0.1 mM β-mercaptoethanol and 0.05 g/L bovine albumin fraction V; all from Thermo Fisher Scientific) supplemented with LIF (Millipore) and 2i (Stemgent). Human embryonic kidney (HEK293T) cells (ATCC) were cultured in 10% fetal bovine serum (Thermo Fisher Scientific) in DMEM (Thermo Fisher Scientific).
Mouse ES cells were co-transduced with multiple lentiviral constructs that expressed dCas9-SunTag from a TRE3G promoter, scFV-sfGFP-VP64 from the EF1a or PGK promoter, and reverse tetracycline-controlled transactivator (rtTA) from the EF1a promoter. After adding Doxycycline, polyclonal cells were sorted by flow cytometry using a BD FACS Aria2 for GFP+ and mCherry+ cells. After verification of gene activation using a sgBrn2, monoclonal cells were further sorted, and one efficient cell line was chosen as CamES cells.
Construction of CRISPR/Cas9 vector for Tuj1 knockin. The pX330-derived pSLQ1654 encoding the nuclease Cas9 and an optimized sgRNA sequence was first linearized by a BbsI digest and gel purified. Two primers, sgTuj-1 F and sgTuj-1 R, were phosphorylated, annealed, and ligated to the linearized vector pSLQ1654 to generate pSLQ1654-sgTuj1. sgTuj-1 F: caccgcccaagtgaagttgctcgcagc. sgTuj-1 R: aaacgctgcgagcaacttcacttgggc.
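The two primers above follow a simple pattern: the forward oligo is a 4-nt overhang (cacc) followed by the insert, and the reverse oligo is a 4-nt overhang (aaac) followed by the reverse complement of the same insert. A minimal Python sketch of that pattern is given below; the function names are hypothetical, and the overhangs are taken from the primer sequences themselves rather than from any stated cloning protocol.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lower-case DNA sequence."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def annealed_oligos(insert, fwd_overhang="cacc", rev_overhang="aaac"):
    """Build the forward and reverse oligos for an annealed insert."""
    insert = insert.lower()
    forward = fwd_overhang + insert
    reverse = rev_overhang + reverse_complement(insert)
    return forward, reverse


# The insert is the part of sgTuj-1 F that follows the cacc overhang.
forward, reverse = annealed_oligos("gcccaagtgaagttgctcgcagc")
```

Under these assumptions, the reverse oligo is fully determined by the forward oligo, which gives a quick consistency check on a primer pair.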
Construction of DNA template. The Tuj1-IRES-hCD8 vector (pSLQ1760) was assembled with three fragments (5′ homologous arm of Tuj1, IRES-hCD8 and 3′ homologous arm of Tuj1) and a modified pUC19 backbone vector by using Gibson Assembly Master Mix (New England Biolabs). Both 5′ and 3′ homology arms were PCR amplified from the genomic DNA extracted from mouse ES cells with Herculase II Fusion DNA polymerase (Agilent). The IRES-hCD8 was PCR amplified from pSLQ1729. The backbone vector was linearized by digestion with PmeI and ZraI. All DNA fragments and the backbone vector were gel purified followed by a Gibson assembly reaction. Primers: 5′ homologous arm F: aaagtgccacctgacactcagtcctagatgtcgtgegg (SEQ ID NO:380). 5′ homologous arm R: tcacttgggcccctgggct (SEQ ID NO:381). IRES-human CD8 F: caggggcccaagtgaactagtaaaattcgcccctctccctc (SEQ ID NO:382). IRES-human CD8 R: cagctgcgagcaactttaacctgcaaaaagggagcagtaaagg (SEQ ID NO:383). 3′ homologous arm F: agttgctcgcagctggggt (SEQ ID NO:384). 3′ homologous arm R: agctggagaccgttttttctgactgactggalacagggcat (SEQ ID NO:385).
Electroporation and clonal Tuj1-hCD8 CamES cells: 2.5 μg pSLQ1654-sgTuj1 and 12.5 μg Tuj1-IRES-hCD8 template DNA in 100 μL Nucleofector solution (Amaxa) were electroporated into 1×10^6 CamES cells using program A-030. Both plasmids were maxiprepped using the Endofree Maxiprep Kit (Qiagen). After 3 days of culture, sorted single cells were seeded in a 96-well plate with one cell per well. All clonal cell lines were analyzed using PCR and sequencing (Yu et al., 2015).
HEK293T cells were seeded at ~30% confluence one day before transfection. Lentiviruses were produced by cotransfecting pHR plasmids with vectors encoding packaging proteins (pMD2.G and pCMV-dR8.91) using TransIT-LT1 transfection reagent (Mirus). Viral supernatants were collected 3 days after transfection and filtered through a 0.45 μm strainer. The supernatant was used for transduction immediately or kept at −80° C. for long-term storage.
Cells were harvested using Accutase (STEMCELL), and total RNA was isolated using the RNeasy Plus Mini Kit (QIAGEN), according to manufacturer's instructions. Reverse transcription was performed using iScript cDNA Synthesis kit (Bio-Rad). Quantitative PCR reactions were prepared with iTaq Universal SYBR Green Supermix (Bio-Rad). Reactions were run on a LightCycler thermal cycler (Bio-Rad). Primers used are summarized in Table 9.
The neural differentiation screens were performed as two independent replicates. For both screens, 10^8 CamES cells were seeded at a density of 40,000 cells/cm^2 at day −2. Cells were transduced with the pooled lentiviral sgRNA library at an MOI of 0.3 at day −1 in basal medium supplemented with LIF and 2i. At day 0, puromycin was added at 1 μg/mL in ES2N medium (Millipore) with Doxycycline for another 24 hours. The medium was replaced with fresh ES2N medium containing Doxycycline every day starting on day 2. On day 12, cells were harvested and sorted into hCD8+ and hCD8− cells using the EasySep human CD8 isolation kit (STEMCELL Technologies). Populations of cells expressing this library of sgRNAs were harvested either at the outset of the experiment (the day 0 time point, after 24 hours of puromycin selection) or as sorted hCD8+ or hCD8− cells. Genomic DNA was harvested from all samples; the sgRNA-encoding regions were then amplified by PCR using HiSeq forward and reverse primers and sequenced at high coverage on an Illumina HiSeq-4000 using a HiSeq custom primer with previously described protocols (Bassik et al., 2013; Kampmann et al., 2014). Primers used are summarized in Table 12.
For the individual sgRNA validation experiments, a similar protocol was used, except that CamES cells were cultured in basal medium, seeded at 5,500 cells/cm^2 after puromycin selection, and transduced at a high MOI. Top 100 hits are summarized in Table 10.
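The choice of a low MOI can be illustrated with a short calculation. The Python sketch below assumes that the number of viral integrations per cell follows a Poisson distribution with mean equal to the MOI; this model is a common simplification and is not stated in the protocol. The function names are hypothetical. The example call assumes a seeding number of 10^8 cells and the library size of 55,336 sgRNAs given elsewhere in this description.

```python
import math


def infection_fractions(moi):
    """Poisson model (assumption): integrations per cell ~ Poisson(moi).

    Returns (fraction of cells infected,
             fraction of infected cells carrying exactly one integration).
    """
    p_zero = math.exp(-moi)
    p_one = moi * math.exp(-moi)
    infected = 1.0 - p_zero
    return infected, p_one / infected


def cells_per_sgrna(cells_seeded, moi, library_size):
    """Expected number of infected cells per library member."""
    infected, _ = infection_fractions(moi)
    return cells_seeded * infected / library_size


infected, single = infection_fractions(0.3)
coverage = cells_per_sgrna(1e8, 0.3, 55336)
```

Under this model, an MOI of 0.3 infects roughly a quarter of the cells, and most infected cells carry a single sgRNA, which is the property a pooled screen relies on.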
Combinatorial sgRNA Library Construction
A library of 44 sgRNAs including a set of 19 genes was designed by using the top predicted hits from the single screens and six nontargeting negative-control sgRNAs. Any sgRNAs containing NsiI restriction sites, which were used for combinatorial sgRNA library construction, were excluded. Individual oligonucleotides encoding sgRNAs were synthesized in a 96-well format (Integrated DNA Technologies) and cloned into pSLQ1373 individually as previously described. At the same time, the same sgRNA sequences were synthesized (Integrated DNA Technologies) using a different forward sequence. These sgRNAs were cloned into pSLQ5004 individually as previously described. After sequencing validation, all pSLQ1373-sgRNA constructs were manually mixed, and all pSLQ5004-sgRNA constructs were separately mixed, in equal amounts for combinatorial sgRNA library construction. To generate the pooled combinatorial sgRNA library, the sgRNA sequences were PCR amplified from the pooled pSLQ5004-sgRNA constructs using two sgRNA forward and reverse primers, gel purified, and ligated into the NsiI-digested pooled pSLQ1373-sgRNA constructs using In-Fusion cloning (Clontech). The combinatorial sgRNA-library pools were prepared in Stellar competent cells (TaKaRa) and purified with a Plasmid Maxi Kit (Qiagen). The representation of each of the double-sgRNA constructs was then quantified by NGS with the oligonucleotides listed in Table 11.
The double neural differentiation screens were performed as two independent replicates. For both screens, 6 million CamES cells were seeded at a density of 40,000 cells/cm^2 at day −1. Cells were transduced with the pooled lentiviral double sgRNA library at an MOI of 0.3 at day 0 in basal medium supplemented with LIF and 2i. At day 1, puromycin was added at 1 μg/mL in basal medium with Doxycycline for another 24 hours. The medium was replaced with fresh basal medium containing Doxycycline every day starting on day 2. On day 12, cells were harvested and sorted into CD8+ and CD8− cells using an Aria II cell sorter (BD Biosciences). Genomic DNA was harvested from all samples; the double sgRNA-encoding regions were then amplified by PCR using MiSeq forward and reverse primers and sequenced on an Illumina MiSeq using the HiSeq custom primer for the first sgRNA and the MiSeq custom primer for the second sgRNA. Primers used are summarized in Table 12.
For the individual double sgRNA validation experiments, a similar protocol was used, except that CamES cells were transduced with a high MOI.
Primary cultures of cortex neurons were prepared from postnatal day 1 wild-type black rats. Rats were decapitated, and their brains were removed in pre-cooled physiological saline. The cortex was dissected. Tissues were lightly minced and placed into a Papain Dissociation solution (Worthington Biochemical Corporation) containing 20 units/ml papain and 0.005% DNase in Earle's Balanced Salt Solution (Thermo Fisher Scientific). The solution was equilibrated in 95% O2, 5% CO2 before the tissue was incubated at 37° C. for 1 hour. After incubation, the tissue and solution mixture was triturated. Undissociated tissue was allowed to settle, and the cloudy suspension was removed and centrifuged at 300×g for 5 minutes. The supernatant was then discarded and the cell pellet was resuspended in a DNase/albumin-inhibitor solution. A discontinuous density gradient was prepared by gently layering the cell suspension on top of an albumin-inhibitor solution in a centrifuge tube. The mixture was centrifuged at 145×g for 5 minutes. The supernatant was discarded and the neurons were resuspended in Neurobasal (Invitrogen) medium containing 2% B27 supplement, 2 mM glutamine and 0.5% penicillin/streptomycin. A total of 250,000 cells were plated per well of 24-well plates that had been pre-treated with 12.5 μg/ml poly-D-lysine (Sigma). The plates were incubated at 37° C. in a 5% CO2/95% air incubator, and half of the medium was changed every three days.
Rat Primary Cortical Astrocytes (Thermo Fisher Scientific) were cultured and plated according to manufacturer's instructions. The astrocytes were fed every three days with fresh medium.
One week after culturing primary neurons and astrocytes, the induced neurons were gently removed from the dishes by trypsin dissociation and were replated onto primary neurons or astrocytes. Electrophysiological recordings were performed between day 14 and day 21 after replating.
Preparation Before Induction
Induction of Induced Neurons
About 14 days after lentivirus infection (extensive neurite outgrowth should be observed at this stage), the induced cells were advanced to further maturation by re-plating and co-culturing them directly with primary neurons/astrocytes.
The CD8-APC antibody was purchased from BD Biosciences, and the Anti-PSA-NCAM-APC antibody was from Miltenyi Biotec. Cells were harvested, washed, and adjusted to a concentration of 10^6 cells/mL in ice-cold PBS with 2% FBS. Cells were stained and incubated with diluted primary antibodies at 4° C. for 30 mins in Eppendorf tubes. After staining, cells were washed three times by centrifugation at 400×g for 5 mins and resuspended in 500 μL to 1 mL of ice-cold PBS. Cells were kept in the dark on ice and analyzed using a BD Accuri C6 Cytometer. Cell sorting was performed using an Aria II cell sorter (BD Biosciences).
Experiments were performed on cells seeded on plates (IBIDI) that had been coated with gelatin (0.1%) overnight at 37° C. Cells were washed twice with PBS, fixed in 4% Paraformaldehyde (Wako) for 15 mins at room temperature, and permeabilized and blocked with 0.1% Triton X-100, 5% donkey serum in PBS (blocking buffer) for 1 h at room temperature. After three washes with PBS, cells were incubated with primary antibodies. The following primary antibodies with indicated dilution in blocking buffer were used: Rabbit anti-Oct4 (Santa Cruz, 1:200), Mouse anti-Tuj1 (Covance, 1:1000), Rabbit anti-Map2 (Cell Signaling Technology, 1:200), Rabbit anti-NeuN (Abcam, 1:1000), Rabbit anti-vGluT1 (Synaptic Systems, 1:200), Rabbit anti-GFAP (Dako, 1:500), Rabbit anti-Olig-2 (Millipore, 1:500), Rabbit anti-Tbr1 (Abcam, 1:100), Rabbit anti-Synapsin I (Abcam, 1:200), Rabbit anti-GABA (Sigma, 1:250). Cells were incubated with primary antibodies overnight at 4° C., then washed three times with PBS. After staining with corresponding secondary antibodies in blocking buffer for 1 hour at room temperature, cells were washed three times with PBS and stained with DAPI (Vector Labs) for 5 mins. Washed cells were examined using a Nikon Spinning Disk Confocal microscope with TIRF.
The following method was used to calculate the efficiency of neuronal induction. The total number of Map2+ cells with a neuronal morphology, defined as cells having a circular, three-dimensional appearance that extend a thin process at least three times longer than their cell body, was quantified 14 days after infection. The Map2+ and DAPI+ cells were counted from at least 20 randomly selected images at 20× magnification for each condition. The Map2+ cell number was divided by the number of DAPI+ cells to obtain the percentage of neuron-like cells.
Lentivirus infections (with an additional sfGFP-expression virus) and transgene induction were performed as described for the production of fibroblast-induced neurons, using basal medium. Patch-clamp electrophysiological recordings were performed on sfGFP-positive fibroblast-induced neurons. GFP-positive neurons were located using a Lambda DG-4 illumination system and a Q Imaging Fast 1394 Rolera-Mgi Plus camera controlled by Micro-Manager (Version 1.4) mounted on an Olympus BX51WI fluorescence microscope. Whole-cell responses were recorded using a MultiClamp 700B (Molecular Devices) amplifier and headstage and low-pass filtered at 10 kHz before digitization using a DigiData 1440 data acquisition system (Molecular Devices). Data were stored on a PC running pClamp software (Version 10.4, Molecular Devices). Patch-pipettes were fabricated from 1.5 mm OD borosilicate capillary glass (Warner Instruments) using a micropipette puller (Sutter Instrument, Model P-87) to give tip resistances of 2-4 MΩ. The series resistance for all recordings was under 10 MΩ (mean: 5.62 MΩ, SEM: 0.38, n=12). Capacitance transients and series resistance errors were compensated for (70%) using the amplifier circuitry. The sodium and potassium currents were recorded in the voltage-clamp configuration at a holding potential of −80 mV. Spontaneous postsynaptic currents were recorded in the voltage-clamp configuration at a holding potential of −60 mV or −70 mV. Spontaneous action potentials were recorded in neurons held at −60 mV to −80 mV. Action potentials were also evoked by applying depolarizing current.
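The efficiency calculation above can be written as a short Python sketch. The text does not say whether counts are pooled across images or averaged per image; the sketch assumes pooling, and the function name and input layout are hypothetical.

```python
def induction_efficiency(image_counts):
    """Percentage of Map2+ cells among DAPI+ cells, pooled over images.

    image_counts: iterable of (map2_positive, dapi_positive) per image.
    Pooling across images is an assumption, not stated in the method.
    """
    counts = list(image_counts)
    map2_total = sum(map2 for map2, _ in counts)
    dapi_total = sum(dapi for _, dapi in counts)
    if dapi_total == 0:
        raise ValueError("no DAPI+ cells counted")
    return 100.0 * map2_total / dapi_total


# Two hypothetical images: 5 of 10 and 15 of 30 nuclei are Map2+.
efficiency = induction_efficiency([(5, 10), (15, 30)])
```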
All experiments were performed at ambient room temperature (25° C.). The external solution contained (in mM): NaCl (130), HEPES-Na (10), KCl (5), CaCl2 (2), Glucose (10). For voltage-gated sodium currents, tetraethylammonium (TEA, 5 mM) was added to the external solution and the internal solution contained (in mM): CsF (120), HEPES (10), EGTA (11), CaCl2 (1), MgCl2 (1), TEA-Cl (10), KOH (11). For voltage-gated potassium currents, tetrodotoxin (TTX, 500 nM) was added to the external solution and the internal solution contained (in mM): KF (120), HEPES (10), EGTA (11), CaCl2 (1), MgCl2 (1), KCl (10), KOH (11). For current clamp recordings of action potentials, 2 mM MgATP was added to the internal solution. All recording solutions had pH values of 7.3-7.4 with osmolality of 290-300 mOsm/kg. Drugs were applied via local perfusion approximately 200 μm from the recorded cells at a flow rate of 0.2 ml/min, and solutions were continually withdrawn from the recording chamber by vacuum aspiration. Drugs were applied until responses reached a steady-state level. Electrophysiological data were analyzed offline using Clampfit 10.4 and plotted using Graphpad Prism software.
Bioinformatic Analysis of sgRNA and Gene Hits
Data processing was conducted with custom scripts. Reads were mapped allowing for a mismatch at the first and last base pair of the spacer, which uniquely identified each sgRNA. Each sample was normalized by its total read count. This gave a frequency for each sgRNA, namely its read count divided by the total read count of the sample.
The paired Tuj1-hCD8+ and Tuj1-hCD8− samples were used to compute the enrichment scores. Specifically, frequencies were computed as above, and sgRNAs with less than 1 count in the Tuj1-hCD8− library were discarded. Enrichment was computed for each sgRNA in each replicate as the log2 fold-change from the Tuj1-hCD8− sample to the Tuj1-hCD8+ sample. Enrichment was averaged across replicates and used as Esg in subsequent analysis. For each gene, an enrichment score (ESgene) was computed from the sgRNA enrichment above, as follows. An unnormalized enrichment score (Egene.top3) was computed by averaging Esg for the 3 sgRNAs with the highest Esg. Egene.top3 was normalized by the distribution of nontargeting sgRNAs as follows (Gilbert et al., 2014, supra).
Suppose a gene had N targeting sgRNAs. 10000 bootstrap samples of size N were drawn from the nontargeting sgRNAs. For each sample of size N, Esample.top3 was computed as above. This gave an empirical estimate of the distribution of Egene.top3 if all the sgRNAs targeting that gene had been negative control sgRNAs. For the final, normalized gene enrichment score (ESgene), the unnormalized enrichment score was divided by the 0.9 quantile of this empirical distribution; that is, ESgene = Egene.top3 / Q0.9(Esample.top3), where Q0.9 denotes the 0.9 quantile.
After ranking genes by ES, the most enriched sgRNA of each gene was selected for subsequent validation.
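The scoring procedure described above can be sketched in Python using only the standard library. This is an illustration, not the custom scripts used for the screen. All function names are hypothetical, the quantile uses a simple nearest-rank rule, sgRNAs with zero reads in the hCD8+ sample are skipped to avoid taking the logarithm of zero (an assumption), and averaging across replicates is left to the caller.

```python
import math
import random


def frequencies(counts):
    """Normalize raw read counts (dict: sgRNA -> count) by the sample total."""
    total = sum(counts.values())
    return {sg: c / total for sg, c in counts.items()}


def sgrna_enrichment(pos_counts, neg_counts, min_neg_count=1):
    """log2 fold-change from the hCD8- to the hCD8+ sample, one replicate."""
    f_pos, f_neg = frequencies(pos_counts), frequencies(neg_counts)
    return {
        sg: math.log2(f_pos[sg] / f_neg[sg])
        for sg in neg_counts
        # Drop sgRNAs below the count threshold in the hCD8- library.
        if neg_counts[sg] >= min_neg_count and pos_counts.get(sg, 0) > 0
    }


def top3_mean(values):
    """Mean of the three highest values (fewer if fewer are given)."""
    top = sorted(values, reverse=True)[:3]
    return sum(top) / len(top)


def quantile(sorted_values, q):
    """Nearest-rank quantile of an already sorted list (assumed convention)."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


def gene_score(gene_esg, nontargeting_esg, n_boot=10000, q=0.9, seed=0):
    """Top-3 mean divided by the q quantile of a bootstrap null distribution."""
    rng = random.Random(seed)
    n = len(gene_esg)
    # Each bootstrap sample draws n nontargeting sgRNAs with replacement.
    null = sorted(
        top3_mean(rng.choices(nontargeting_esg, k=n)) for _ in range(n_boot)
    )
    return top3_mean(gene_esg) / quantile(null, q)
```

Because the null distribution is built from samples of the same size N as the gene, genes with many sgRNAs are not favored simply for having more chances at a high top-3 mean. The ratio is only meaningful when the 0.9 quantile of the null distribution is positive.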
The count matrix was calculated by exact match at both ends, and all other reads were discarded. The correlation of counts between replicates of the same condition was high (0.942-0.992), indicating high reproducibility of the double screen. Effect sizes for each gene pair were calculated using MAGeCK MLE (Li et al Genome Biology 2015, 16:281).
Suppose, as the null hypothesis, that the guide pair for genes i and j has an effect size equal to the maximum of the individual effect sizes. This will be the case if one gene is the primary driver of neuronal differentiation. Note that the coefficients estimated by MAGeCK (βij for genes i and j, in that order) arise from a generalized linear regression and should, if the model posited by MAGeCK is correct, be normally distributed.
Consider the null hypothesis H0: the effect of guides targeting two genes is less than the maximal effect of guides targeting either gene. The order of the guides is taken into account. A consistent but smaller effect is predicted with the order of the guides reversed. Let signm(x, y) be the function that returns the sign of whichever input has the larger absolute value. Under the null hypothesis βij=signm(βi0, β0j) max(|βi0|, |β0j|).
To this end, note that the standard deviation of βij is bounded above by
√{square root over (8β
Therefore the difference βij−signm(βi0, β0j) max(|βi0|, |β0j|) has standard error bounded above by
One can construct a test statistic to test H0 as
The test statistic constructed does not have an exactly normal distribution due to the high correlation between estimates (since all gene-gene pairs are tested) and therefore an empirical Bayes approach is used to determine significant genes while appropriately controlling the false discovery rate (Efron Large-scale inference: empirical Bayes methods for estimation, testing, and prediction, volume 1. Cambridge University Press. 2012; Efron et al R package 2011).
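The null-hypothesis quantity defined above, signm(βi0, β0j) max(|βi0|, |β0j|), can be made concrete with a small Python sketch. The function names are hypothetical, and the tie rule (the first argument wins when the absolute values are equal) is an assumption. The sketch covers only the difference between the observed pair effect and its null prediction; it does not compute a standard error or the test statistic itself.

```python
def signm(x, y):
    """Sign of whichever input has the larger absolute value.

    Ties are resolved in favor of x (assumption).
    """
    dominant = x if abs(x) >= abs(y) else y
    return (dominant > 0) - (dominant < 0)


def null_pair_effect(beta_i0, beta_0j):
    """Pair effect predicted if the stronger single gene alone drives it."""
    return signm(beta_i0, beta_0j) * max(abs(beta_i0), abs(beta_0j))


def excess_effect(beta_ij, beta_i0, beta_0j):
    """Observed pair effect minus the null prediction."""
    return beta_ij - null_pair_effect(beta_i0, beta_0j)
```

For example, with single-gene effects of 1.0 and 2.5 and a pair effect of 4.0, `excess_effect(4.0, 1.0, 2.5)` returns 1.5, the part of the pair effect not explained by the stronger single gene.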
Since large variation gene effect size was observed (
This example describes the identification of novel TFs driving direct neuronal reprogramming from fibroblasts. Using primary fibroblasts as a screening platform is technically challenging. Firstly, as primary cells have limited expansion capacities, it is difficult to modify them to generate a homogenous population, which achieves consistent CRISPR activation activities. Secondly, the neuronal transdifferentiation of fibroblasts is inefficient and not well suited for the enrichment of the desired cell population for the subsequent sgRNA sequencing.
Thus, mouse ES cells were chosen as a screening platform for the generation of candidate TFs driving neuronal fate. The ectopic expression of individual key TFs that are critical for neuronal transdifferentiation can also drive neuronal differentiation of mouse ES cells, which supports the use of mouse ES cell differentiation as a discovery tool for neuronal-inducing TFs. In addition, as a model of developmental biology, ES cells have been successfully used to elucidate the roles of many master transdifferentiation TFs of other lineages. Finally, mouse ES cells are technically easy to equip with CRISPR activation tools and are suitable for single sgRNA screens.
A polypeptide-based SunTag CRISPRa system in mouse ES cells (Tanenbaum et al., 2014, supra) was modified (
An sgRNA library targeting all putative TFs (~800), with an average of 60 sgRNAs per gene, was constructed. This sgRNA library also contained 9,296 non-targeting negative control sgRNAs, leading to a total of 55,336 sgRNAs (
Cells expressing varied neuronal lineage markers resulting from the activation of different endogenous genes were detected (
It was next tested whether these neuronal factors induce transdifferentiation. As reported, Ascl1 alone can induce neuronal transition from mouse fibroblasts. cDNAs of individual genes were transduced into mouse embryonic fibroblasts, and the cells were cultured under transdifferentiation conditions and stained for the neuronal marker Map2. Among the 19 genes tested, only Ngn1 induced neuronal marker expression (
To generate a deep view of how sgRNA design and gene activation level affect neuronal differentiation, other high-ranking sgRNAs of the 19 hit genes were investigated. Quantitative PCR results showed that effective endogenous gene activation (10 to 10,000 fold) was achieved by most of their cognate sgRNAs (
To investigate the determinants of CRISPRa activation in more depth, the targeting locations of top-ranked sgRNAs of the 19 hit genes were investigated. The observed signal followed a mixture distribution (
The strategy of using ES cell differentiation as a tool to discover lineage reprogramming factors was justified by the fact that Ngn1, a hit of the primary screen, is able to convert fibroblasts to neurons. However, most hits failed to achieve transdifferentiation, which highlighted the difference between the two processes. Compared to ES cell differentiation, a direct lineage reprogramming process involves profound transcriptional, epigenetic, and metabolic changes of target cells. These complex mechanisms tend to be initiated by synergistic genetic interactions, instead of a single factor. In most cases, direct lineage reprogramming can only be mediated by the ectopic expression of a gene cocktail. Thus, it was hypothesized that novel gene interactions greatly facilitate direct neuronal reprogramming.
Current gain-of-function techniques, such as cDNA overexpression, are difficult to apply in a pairwise manner, even for a moderate number of genes. In addition, optimal gene expression levels are important for cell fate determinations. Overexpression libraries have limitations owing to dosage and functional issues, and thus may fail to cover genes' optimal expression level. To address these problems, a strategy to determine the gene interactions between the primary hits based on double sgRNA screen was developed. A library of dual-sgRNA-constructs targeting the top neuronal inducers was generated (
With the same strategy as in single CRISPRa screening, double CRISPRa screening was performed (
Hierarchical clustering of sgRNAs based on the correlation of their interactions shows that a fraction of sgRNAs tended to form a high number of interactions (
On the other hand, for genes whose higher activation leads to similar neuronal differentiation, such as Brn2 and Jun, the targeting sgRNAs achieving the highest activation tend to form stronger interactions than their counterparts (
Gene Combinations Identified in Double CRISPRa Screen Convert Fibroblasts into Neurons
Based on false discovery rate, a list of gene pairs that showed strong synergistic effects was identified. Strong synergies included Ngn1+Ezh2, Ngn1+Foxo1, Tcf15+Zeb1, Tcf15+Foxo1, and Zeb1+Ezh2. To confirm these interactions, constructs expressing corresponding single and double sgRNAs were generated, and their effects on neuronal differentiation of CamES cells were tested. All of the identified sgRNA pairs showed additive effects in neuronal differentiation of mouse ES cells (
The synergistic links to Ngn1, the top hit in the single guide screen, that were identified have not been previously reported to drive neuronal transdifferentiation from fibroblasts. The ability of the above identified synergistic gene pairs to drive neuronal transdifferentiation from fibroblasts was investigated.
One gene pair, Ngn1+Ezh2, induced over 50% Map2+ cells, which is almost 50-fold more than Ngn1 alone (
Here, two new powerful neuronal inducing cocktails were identified: Ngn1+Ezh2 and Ngn1+Foxo1. It was tested whether the induced cells possess neuronal functions. The expression of other mature neuron markers, including Synapsin and NeuN, in Ngn1+Ezh2 and Ngn1+Foxo1 induced cells was confirmed (
It was next assessed whether the neurons induced using Ngn1+Ezh2 and Ngn1+Foxo1 were capable of synaptically integrating into pre-existing neural networks. After 7 days of co-infection with cDNAs and a superfolder GFP (sfGFP) reporter, the induced neuron cells were re-plated onto rat neonatal cortical neurons that had been cultured for 7 days in vitro. One week after re-plating, patch-clamp recordings from sfGFP-positive induced neuron cells were performed (
For all the induced neuron cells analyzed (5/5), action potentials that fired spontaneously were observed (
Table 13 shows exemplary sgRNAs for genes targeted in Examples 1-4.
All publications and patents mentioned in the above specification are herein incorporated by reference as if expressly set forth herein. Various modifications and variations of the described method and system of the disclosure will be apparent to those skilled in the art without departing from the scope and spirit of the disclosure. Although the disclosure has been described in connection with specific preferred embodiments, it should be understood that the disclosure as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the disclosure that are obvious to those skilled in relevant fields are intended to be within the scope of the following claims.
This application is a continuation of U.S. patent application Ser. No. 15/863,005, filed Jan. 5, 2018, which claims the benefit of U.S. Provisional Application No. 62/443,401, filed Jan. 6, 2017, both of which are incorporated herein by reference in their entireties.
The invention was made with Government support under contract OD017887 awarded by the National Institutes of Health. The Government has certain rights in the invention.
Number | Date | Country
---|---|---
62443401 | Jan 2017 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 15863005 | Jan 2018 | US
Child | 17496275 | | US