The sequence listing associated with this application is provided in text format in lieu of a paper copy and is hereby incorporated by reference into the specification. The name of the text file containing the sequence listing is: 37731_Seq_Final_2011-09-15.txt. The text file is 221 KB; was created on Sep. 15, 2011; and is being submitted via EFS-Web with the filing of the specification.
This invention relates to methods, reagents, and kits for selectively isolating nuclei from a cell type of interest suitable for use in analysis of gene expression and chromatin profiling of individual cell types within a tissue.
Growth and development of multicellular organisms requires the production of many specialized cell types that make up the tissues and organs of the adult body. The generation of a differentiated cell from an undifferentiated progenitor involves epigenetic reprogramming of the stem cell genome to establish the appropriate lineage-specific transcription program. Initial establishment and subsequent maintenance of this transcriptional program is effected through chromatin-based gene silencing and activation mechanisms involving the dynamic interplay of transcription factors, post-translational modification of histones, the deposition of histone variants, DNA methylation, and nucleosome remodeling (Brien, G. L., and A. P. Bracken, “Transcriptomics: Unravelling the Biology of Transcription Factors and Chromatin Remodelers During Development and Differentiation,” Semin. Cell Dev. Biol. 20:835-841, 2009; Muller, C., and A. Leutz, “Chromatin Remodeling in Development and Differentiation,” Curr. Opin. Genet. Dev. 11:167-174, 2001; Ng, R. K., and J. B. Gurdon, “Epigenetic Inheritance of Cell Differentiation Status,” Cell Cycle 7:1173-1177, 2008). Defining precisely how cellular differentiation is imposed and maintained is a central goal of developmental biology, and is also critical to understanding how the process can go awry, leading to disease states such as cancer. Despite the importance of this problem, knowledge of the mechanics of differentiation processes in vivo is still quite limited, in large part due to the technical difficulty associated with isolating pure cell types from a tissue for transcriptional and epigenomic profiling.
Current methods for the study of pure individual cell types include the use of cultured cell lines (Mito, Y., et al., “Genome-Scale Profiling of Histone H3.3 Replacement Patterns,” Nat. Genet. 37:1090-1097, 2005; Rao, R. R. and S. L. Stice, “Gene Expression Profiling of Embryonic Stem Cells Leads to Greater Understanding of Pluripotency and Early Developmental Events,” Biol. Reprod. 71:1772-1778, 2004; Rivolta, M. N. and M. C. Holley, “Cell Lines in Inner Ear Research,” J. Neurobiol. 53:306-318, 2002), ex vivo differentiation from progenitor cells (Bhattacharya, B., et al., “A Review of Gene Expression Profiling of Human Embryonic Stem Cell Lines and Their Differentiated Progeny,” Curr. Stem Cell Res. Ther. 4:98-106, 2009; Trion, S., et al., “Directed Differentiation of Pluripotent Stem Cells: From Developmental Biology to Therapeutic Applications,” Cold Spring Harb. Symp. Quant. Biol. 73:101-110, 2008), laser capture microdissection (LCM) of sectioned tissues (Brunskill, E. W., et al., “Atlas of Gene Expression in the Developing Kidney at Microanatomic Resolution,” Dev. Cell 15:781-791, 2008; Jiao, Y., et al., “A Transcriptome Atlas of Rice Cell Types Uncovers Cellular, Functional and Developmental Hierarchies,” Nat. Genet. 41:258-263, 2009; Nakazono, M., et al., “Laser-Capture Microdissection, a Tool for the Global Analysis of Gene Expression in Specific Plant Cell Types: Identification of Genes Expressed Differentially in Epidermal Cells or Vascular Tissues of Maize,” Plant Cell 15:583-596, 2003), and fluorescence-activated cell sorting (FACS) of fluorescently labeled cell lines or protoplasts (Birnbaum, K., et al., “A Gene Expression Map of the Arabidopsis Root,” Science 302:1956-1960, 2003; de la Cruz, A. F., and B. A. Edgar, “Flow Cytometric Analysis of Drosophila Cells,” Methods Mol. Biol. 420:373-389, 2008; Gifford, M. L., et al., “Cell-Specific Nitrogen Responses Mediate Developmental Plasticity,” Proc. Natl. Acad. Sci. USA 105:803-808, 2008; Zhang, Y., et al., “Identification of Genes Expressed in C. elegans Touch Receptor Neurons,” Nature 418:331-335, 2002). Of these techniques, LCM and FACS are the only ones applicable to in vivo studies, but both are limited in that they involve extensive tissue manipulation, require complex and highly expensive equipment, and offer relatively low throughput. Several new methods, such as cell type-specific chemical modification of RNA (Miller, M. R., et al., “TU-Tagging: Cell Type-Specific RNA Isolation From Intact Complex Tissues,” Nat. Methods 6:439-441, 2009) and affinity tagging of ribosomal proteins or poly(A)-binding proteins (Heiman, M., et al., “A Translational Profiling Approach for the Molecular Characterization of CNS Cell Types,” Cell 135:738-748, 2008; Mustroph, A., et al., “Profiling Translatomes of Discrete Cell Populations Resolves Altered Cellular Priorities During Hypoxia in Arabidopsis,” Proc. Natl. Acad. Sci. USA, 2009; Roy, P. J., et al., “Chromosomal Clustering of Muscle-Expressed Genes in Caenorhabditis elegans,” Nature 418:975-979, 2002) have also been successfully employed to measure the gene expression profiles of individual cell types, but these approaches cannot be used to study chromatin features.
Therefore, a need exists for a simple and broadly applicable method for studying gene expression and chromatin regulation in individual cell types to make the study of cell differentiation and function more accessible.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features of the claimed subject matter nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
In one aspect, the present invention provides a vector for selectively labeling nuclei in a cell type of interest comprising a nucleic acid sequence encoding a fusion polypeptide comprising (a) a nuclear envelope targeting region and (b) an affinity reagent binding region. In some embodiments, the affinity reagent binding region comprises a biotin ligase accepting site. In some embodiments, the affinity binding region comprises an epitope recognized by an antibody.
In another aspect, the present invention provides a cell comprising a vector for selectively labeling the cell type of interest, the vector comprising a nucleic acid sequence encoding a fusion polypeptide comprising (a) a nuclear envelope targeting region; and (b) an affinity reagent binding region, wherein the fusion polypeptide is incorporated into the nuclei of the cell. In some embodiments, the cell is in a tissue, culture, or part of a transgenic organism.
In another aspect, the invention provides a kit for selectively labeling nuclei in a cell type of interest, the kit comprising: (a) a vector comprising a first expression cassette comprising a nucleic acid sequence encoding a fusion polypeptide comprising: (i) a nuclear envelope targeting region; and (ii) an affinity reagent binding region; and (b) a capture molecule capable of specifically binding to the affinity binding region, or a modification thereof. In some embodiments, the affinity binding region comprises a biotin ligase accepting site. In some embodiments, the kit further comprises a second expression cassette for expressing a biotin ligase polypeptide. In some embodiments, the capture reagent is bound to a magnetic particle.
In another aspect, the invention provides a method for generating in vivo biotinylated nuclei in a cell type of interest. The method according to this aspect comprises recombinantly expressing in the cell a fusion polypeptide comprising (i) a nuclear envelope targeting region and (ii) an affinity reagent binding region, wherein one of the fusion polypeptide or a molecule that modifies the fusion polypeptide is under the control of a promoter specific to the cell type of interest.
In another aspect, the invention provides a method for selectively isolating nuclei from a cell type of interest present in a plurality of cells wherein at least a portion of the cells recombinantly express a fusion polypeptide comprising (i) a nuclear envelope targeting region and (ii) an affinity reagent binding region, wherein at least one of the fusion polypeptide or a molecule that modifies the fusion protein is under the control of a promoter specific to the cell type of interest. The method comprises: (a) lysing the plurality of cells under conditions suitable to generate a cell lysate comprising a plurality of intact nuclei; (b) contacting the cell lysate with a capture molecule that specifically binds to the affinity reagent binding region, or a modified form thereof, under conditions suitable to bind the nuclei comprising the fusion polypeptide; and (c) isolating the nuclei bound to the capture molecule.
In another aspect, the present invention provides a method of generating in vivo biotinylated nuclei in a cell type of interest. The method comprises recombinantly co-expressing in the cell: (a) a fusion polypeptide comprising (i) a nuclear envelope targeting region; and (ii) a biotin ligase accepting site; and (b) a biotin ligase; wherein the co-expression of the recombinant fusion polypeptide and the biotin ligase produces biotinylated nuclei in the cell of interest. In some embodiments, the nucleic acid sequences encoding the fusion polypeptide and biotin ligase are present on the same vector, and wherein the co-expressing comprises introducing one or more copies of the vector encoding the fusion polypeptide and biotin ligase into the cell type of interest, or a progenitor of the cell type of interest. In other embodiments, the nucleic acid sequences encoding the fusion polypeptide and biotin ligase are present on separate vectors, and wherein the co-expressing comprises introducing one or more copies of the vector encoding the fusion polypeptide and introducing one or more copies of the vector encoding biotin ligase into the cell type of interest, or a progenitor of the cell type of interest. In some embodiments, the cell type of interest is in a mixture of multiple cell types. In some embodiments, the method further comprises isolating biotinylated nuclei from the cells using a capture molecule that specifically binds to biotin.
In another aspect, the present invention provides a method of selectively isolating nuclei from a cell type of interest wherein at least a portion of the cells co-express (i) a recombinant fusion polypeptide comprising a nuclear envelope targeting region and a biotin ligase accepting site, and (ii) a biotin ligase, wherein expression of at least one of the recombinant fusion polypeptide or the biotin ligase is under the control of a promoter that is specific for the cell type of interest, and wherein the co-expression of the recombinant fusion polypeptide and the biotin ligase selectively produces biotinylated nuclei in the cell type of interest. The method comprises: (a) lysing the plurality of cells from the mixture under conditions suitable to generate a cell lysate comprising a plurality of intact nuclei; (b) contacting the cell lysate with a capture molecule that specifically binds to biotin under conditions suitable to bind the biotinylated nuclei; and (c) isolating the biotinylated nuclei bound to the capture molecule. In some embodiments, the cell type of interest is in a mixture of multiple cell types, such as a cell culture or tissue. In some embodiments, the capture molecule is bound to a magnetic particle. In some embodiments, the capture molecule is selected from the group consisting of: streptavidin or a fragment thereof, avidin or a fragment thereof, and an anti-biotin antibody or a fragment thereof. In some embodiments, the method further comprises extracting nucleic acids from the isolated biotinylated nuclei. In some embodiments, the method further comprises performing gene expression analysis on the isolated nucleic acids. In some embodiments, the method further comprises performing analysis of the chromatin structure of the nucleic acids.
Finally, in another aspect, the present invention provides a method of visually tagging nuclei in a cell type of interest comprising introducing a vector comprising a nucleic acid sequence encoding a fusion polypeptide comprising (a) a nuclear envelope targeting region and (b) a fluorescent protein, into the cell type of interest. In some embodiments, the cell type of interest is eukaryotic. In some embodiments, the nuclear envelope targeting region selectively targets the outer nuclear membrane. In some embodiments, the nuclear envelope targeting region selectively targets the inner nuclear membrane. In some embodiments, the vector is a viral vector. In some embodiments, the cell type of interest is a neuron, such as a post-mitotic neuron.
The compositions, kits and methods of the present invention are useful, for example, for isolating the nuclei of a cell type of interest from a mixture of a plurality of cell types. The resulting purified nuclei can be used to perform transcriptional profiling and epigenomic profiling. Therefore, the compositions, kits and methods of the present invention provide a time and cost-effective approach for generating gene expression and epigenomic data for a cell type of interest to make the study of cell differentiation and function more accessible.
The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
SEQ ID NO:1 Arabidopsis RAN GTPASE ACTIVATING PROTEIN 1 (RanGAP1) WPP domain DNA
SEQ ID NO:2 Arabidopsis RAN GTPASE ACTIVATING PROTEIN 1 (RanGAP1) WPP domain amino acid
SEQ ID NO:3 enhanced Green Fluorescent Protein (eGFP) DNA
SEQ ID NO:4 enhanced Green Fluorescent Protein (eGFP) amino acid
SEQ ID NO:5 biotin ligase recognition peptide DNA
SEQ ID NO:6 biotin ligase recognition peptide amino acid
SEQ ID NO:7 shortened biotin ligase recognition peptide DNA
SEQ ID NO:8 shortened biotin ligase recognition peptide amino acid
SEQ ID NO:9 DNA encoding the full length nuclear tagging fusion (NTF) protein, as used in EXAMPLE 1
SEQ ID NO:10 full length amino acid of the nuclear tagging fusion (NTF) protein, as used in EXAMPLE 1
SEQ ID NO:11 E. coli biotin holoenzyme synthetase (BirA) DNA
SEQ ID NO:12 E. coli biotin holoenzyme synthetase (BirA) amino acid
SEQ ID NO:13 A. thaliana ACTIN DEPOLYMERIZING FACTOR 8 (ADF8) promoter
SEQ ID NO:14 A. thaliana GLABRA 2 (GL2) promoter
SEQ ID NO:15 A. thaliana ACTIN 2 (ACT2) promoter
SEQ ID NO:16 mCherry fluorescent protein DNA
SEQ ID NO:17 mCherry fluorescent protein amino acid
SEQ ID NO:18 C. elegans H3.3 (his-72) promoter sequence (Chromosome III, 12368042 to 12369042, −strand)
SEQ ID NO:19 C. elegans H3.3 (his-72) 3′UTR sequence (Chromosome III, 12366572 to 12367571, −strand)
SEQ ID NO:20 C. elegans pie-1 promoter sequence (Chromosome III, 12424364 to 12426776, +strand)
SEQ ID NO:21 C. elegans pie-1 3′ UTR sequence (Chromosome III, 12428972 to 12429871, +strand)
SEQ ID NO:22 C. elegans NPP-9 domain DNA with introns
SEQ ID NO:23 C. elegans NPP-9 domain amino acid
SEQ ID NO:24 DNA encoding the full length nuclear tagging fusion (NTF) protein, as used in EXAMPLE 4
SEQ ID NO:25 full length amino acid of the nuclear tagging fusion (NTF) protein, as used in EXAMPLE 4
SEQ ID NO:26 3X FLAG affinity tag domain nucleic acid
SEQ ID NO:27 3X FLAG affinity tag domain amino acid
SEQ ID NO:28 D. melanogaster RanGAP domain DNA with introns
SEQ ID NO:29 D. melanogaster RanGAP domain amino acid
SEQ ID NO:30 DNA encoding the full length nuclear tagging fusion (NTF) protein, as used in EXAMPLE 5
SEQ ID NO:31 full length amino acid of the nuclear tagging fusion (NTF) protein, as used in EXAMPLE 5
SEQ ID NO:32 D. melanogaster twist promoter
SEQ ID NOS:33-86 primer sequences
SEQ ID NO:87 biotin ligase recognition peptide DNA
SEQ ID NO:88 biotin ligase recognition peptide amino acid
SEQ ID NO:89 amino acid sequence of linker used in the nuclear tagging fusion proteins based on Sun-1 and Nesprin-3, as described in EXAMPLE 6
SEQ ID NO:90 DNA encoding the mouse Nesprin-3 protein, as used in EXAMPLE 6
SEQ ID NO:91 full length amino acid of the mouse Nesprin-3 protein, as used in EXAMPLE 6
SEQ ID NO:92 DNA encoding the mouse Sun-1 protein, as used in EXAMPLE 6
SEQ ID NO:93 full length amino acid of the mouse Sun-1 protein, as used in EXAMPLE 6
SEQ ID NO:94 DNA encoding the D. melanogaster klarsicht protein (klar), as used in EXAMPLE 7
SEQ ID NO:95 full length amino acid of the D. melanogaster klarsicht (klar) protein, as used in EXAMPLE 7
SEQ ID NO:96 DNA encoding the C. elegans Unc-84 protein, as used in EXAMPLE 7
SEQ ID NO:97 full length amino acid of the C. elegans Unc-84 protein, as used in EXAMPLE 7
SEQ ID NO:98 DNA encoding the C. elegans Unc-83 protein, as used in EXAMPLE 7
SEQ ID NO:99 full length amino acid of the C. elegans Unc-83 protein, as used in EXAMPLE 7
Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by one of ordinary skill in the art to which this invention belongs. Practitioners are particularly directed to Sambrook, J., and Russell, D. W., eds., Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2001), which is incorporated herein by reference, for definitions and terms of the art.
The following definitions are presented to provide clarity with respect to the terms as they are used in the specification and claims to describe the present invention.
As used herein, the term “gene” refers to a nucleic acid (e.g., DNA or RNA) sequence that comprises coding sequences necessary for the production of an RNA and/or a polypeptide, or its precursor as well as noncoding sequences (untranslated regions) surrounding the 5′ and 3′ ends of the coding sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A functional polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence as long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, antigenic presentation) of the polypeptide are retained. The sequences which are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ untranslated sequences (“5′UTR”). The sequences which are located 3′ or downstream of the coding region and which are present on the mRNA are referred to as 3′ untranslated sequences, or (“3′UTR”).
As used herein, the terms “polypeptide” or “protein” are used interchangeably to refer to polymers of amino acids of any length. A polypeptide or amino acid sequence “derived from” a designated protein refers to the origin of the polypeptide.
As used herein, the term “promoter” refers to a region, or combination of regions, of DNA within a gene that facilitates the transcription of the gene. These regions typically provide binding sites for transcription factors, which participate in the assembly of the transcriptional complex.
As used herein, the term “operatively linked” refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. For example, a promoter sequence is operatively linked to a coding sequence if the promoter sequence promotes transcription of the coding sequence.
As used herein, the term “antibody” encompasses antibodies and antibody fragments thereof, derived from any antibody-producing vertebrate (e.g., mouse, rat, rabbit, camelid, and primate, including human), that specifically bind to a polypeptide target of interest, or portions thereof.
As used herein, the term “vector” is a nucleic acid molecule, preferably self-replicating, which transfers and/or replicates an inserted nucleic acid molecule into and/or between host cells. Exemplary vectors include plasmid vectors and viral vectors. An example of viral vector is a Lentiviral vector.
As used herein, the terms “percent identity” and “percent identical” refer to the percentage of nucleotides in a nucleic acid sequence or amino acid residues in a polypeptide sequence that are identical with the nucleic acid sequence or amino acid sequence of a specified molecule, after aligning the sequences to achieve the maximum percent identity. For example, Vector NTI Advance™ 9.0 may be used for sequence alignment.
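By way of illustration only, the percent identity calculation described above can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned (for example, by alignment software) and are supplied as equal-length, gap-padded strings. The function name `percent_identity`, the gap character `-`, the choice of the ungapped reference length as the denominator, and the toy sequences are assumptions made for this example and are not taken from the specification.

```python
def percent_identity(aligned_query: str, aligned_reference: str, gap: str = "-") -> float:
    """Return the percent identity of a query to a reference sequence.

    Both arguments are rows of an existing alignment and must have the
    same length. The denominator is the ungapped length of the reference,
    which is an assumption of this sketch.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")

    # Count reference residues (alignment columns where the reference is not a gap).
    reference_length = sum(1 for r in aligned_reference if r != gap)
    if reference_length == 0:
        raise ValueError("reference contains no residues")

    # Count columns where the query residue matches the reference residue.
    matches = sum(
        1
        for q, r in zip(aligned_query.upper(), aligned_reference.upper())
        if r != gap and q == r
    )
    return 100.0 * matches / reference_length


# Toy, hypothetical sequences: three of four reference positions match.
print(percent_identity("ACGA", "ACGT"))
```

Other conventions for the denominator (for example, alignment length or the shorter sequence length) give different values for gapped alignments, so the convention in use should be stated alongside any reported figure.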
As used herein, the term “variant,” in reference to a nucleic acid or polypeptide of any length, refers to a related nucleic acid or polypeptide that has between 90% and 99% identity with the nucleic acid or polypeptide of reference over the length of the reference nucleotide or amino acid sequence, such as 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the reference nucleotide or amino acid sequence. Furthermore, the related nucleic acid or polypeptide possesses the equivalent functional qualities of the reference nucleic acid or protein. For example, a polypeptide that is a variant of biotin ligase recognition peptide can have between 90% and 99% identity with the sequence of the reference biotin ligase recognition peptide, wherein the variant polypeptide is capable of recognition and biotinylation by biotin ligase. In another example, a polypeptide that is a variant of a nuclear envelope targeting region polypeptide can have between 90% and 99% identity with the sequence of the reference nuclear envelope targeting region polypeptide, wherein the variant polypeptide is capable of being translocated and attached to the nuclear envelope of the cell. In yet another example, a variant nucleic acid promoter sequence can have between 90% and 99% identity with the reference promoter sequence, wherein the variant promoter sequence is capable of initiating transcription with the same or similar transcription factors as the reference promoter sequence.
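As a non-limiting illustration, the two-part test in the foregoing definition (an identity between 90% and 99% over the length of the reference, together with retention of the reference function) can be expressed as a short Python check. The class name `Candidate`, the function name `is_variant`, the treatment of both bounds as inclusive, and the example values are assumptions of this sketch. Whether a candidate retains function is determined experimentally and is simply supplied here as a flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate sequence evaluated against a reference."""
    name: str
    percent_identity: float  # identity to the reference, over the reference length
    retains_function: bool   # established experimentally, outside this sketch


def is_variant(candidate: Candidate, lower: float = 90.0, upper: float = 99.0) -> bool:
    """Apply both criteria: identity within [lower, upper] and retained function."""
    in_range = lower <= candidate.percent_identity <= upper
    return in_range and candidate.retains_function


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("candidate_A", 95.0, True),   # in range and functional
    Candidate("candidate_B", 95.0, False),  # in range but not functional
    Candidate("candidate_C", 85.0, True),   # functional but below 90%
]
for c in candidates:
    print(c.name, is_variant(c))
```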
The present invention provides a cost- and time-effective method to isolate the nuclei of a cell type of interest to enable genomic analyses of the cell-type.
In one embodiment, the invention utilizes a vector comprising a nucleic acid sequence encoding a fusion polypeptide comprising (a) a nuclear envelope targeting region and (b) an affinity reagent binding region.
As used herein, the term “affinity reagent binding region” refers to an amino acid sequence that is capable of directly binding to, or being bound by, a capture affinity reagent (e.g., an antibody that selectively binds to an epitope in the affinity reagent binding region), and also encompasses an amino acid sequence that is modified, such as by a post-translational modification (e.g., biotinylated in vivo), wherein the modified (e.g., biotinylated) version of the amino acid sequence is capable of binding to an affinity reagent (e.g., avidin or streptavidin).
As used herein, the terms “affinity reagent”, “capture reagent”, and “capture molecule” are used interchangeably to refer to reagents that bind to affinity reagent binding regions with sufficient specificity and avidity to facilitate the isolation of any molecule or cell structure, namely nuclei, with an affinity reagent binding region incorporated therein.
In some embodiments, the affinity region comprises an epitope tag and the affinity binding reagent is an antibody that selectively binds to the epitope tag. In some embodiments, the affinity binding region comprises a “biotin ligase accepting site,” also referred to as a “biotin ligase recognition peptide (BLRP),” that is biotinylated in vivo with a biotin ligase and the affinity binding reagent is a capture molecule capable of specifically binding to biotin. In some embodiments, the in vivo biotinylated nuclei of the cell type of interest are subsequently purified utilizing a biotin capture molecule.
Various embodiments of this invention, also referred to herein as “INTACT” (isolation of nuclei tagged in specific cell types), allow for the production and isolation of cell-type specific nuclei that are tagged (i.e., labeled) with a nuclear tagging fusion (“NTF”) polypeptide comprising an affinity binding region (e.g., comprising an epitope tag or biotin ligase accepting site for biotinylation) and a nuclear envelope targeting domain.
In an exemplary embodiment, isolation of tagged nuclei was accomplished by the co-expression of Escherichia coli biotin ligase BirA and a nuclear tagging fusion (NTF) protein in two Arabidopsis thaliana root epidermis cell types, as described in EXAMPLE 1. In the exemplary embodiment described in EXAMPLE 1, the NTF protein comprised the following three regions: (1) the WPP domain of Arabidopsis RAN GTPASE ACTIVATING PROTEIN 1 (RanGAP1), which is necessary and sufficient for envelope association (Rose, A., and I. Meier, “A Domain Unique to Plant RanGAP Is Responsible for Its Targeting to the Plant Nuclear Rim,” Proc. Natl. Acad. Sci. USA 98:15377-15382, 2001), (2) the green fluorescent protein (GFP) for visualization, and (3) the affinity binding region comprising the biotin ligase recognition peptide (BLRP), which acts as a substrate for the E. coli biotin ligase BirA (Beckett, D., et al., “A Minimal Peptide Substrate in Biotin Holoenzyme Synthetase-Catalyzed Biotinylation,” Protein Sci. 8:921-929, 1999). Cell type-specific expression of the NTF protein was driven in A. thaliana root epidermis hair cells using the ACTIN DEPOLYMERIZING FACTOR 8 (ADF8) promoter (Ruzicka, D. R., et al., “The Ancient Subclasses of Arabidopsis Actin Depolymerizing Factor Genes Exhibit Novel and Differential Expression,” Plant J. 52:460-472, 2007), and in non-hair cells using the GLABRA2 (GL2) promoter (Masucci, J. D., et al., “The Homeobox Gene GLABRA2 is Required for Position-Dependent Cell Differentiation in the Root Epidermis of Arabidopsis thaliana,” Development 122:1253-1260, 1996). As described in EXAMPLES 1-3, the method provided a high yield and purity of nuclei from each cell type of interest, facilitating robust analyses of the genome-wide gene expression and chromatin structures for each cell type.
To demonstrate the applicability of the INTACT method to all eukaryotic organisms, NTF protein and BirA ligase were co-expressed specifically in germline cells of Caenorhabditis elegans resulting in the successful production and isolation of tagged germline cell nuclei, as described in EXAMPLE 4. Similarly, NTF protein and BirA ligase were successfully co-expressed specifically in somitic cells of Drosophila melanogaster embryos, as described in EXAMPLE 5. Furthermore, NTF proteins incorporating SUN or KASH domains, in connection with either GFP or tdTomato/epitope tag, were expressed in mice (as described in EXAMPLE 6) and D. melanogaster (EXAMPLE 7).
In accordance with the foregoing, in one embodiment, the present invention provides a vector 10 for selectively labeling nuclei in a cell type of interest comprising a nucleic acid sequence encoding a nuclear tagging fusion (NTF) polypeptide 30 comprising (a) a nuclear envelope targeting region 32; and (b) an affinity reagent binding region 34. In the embodiment of the vector shown in
For example, in one embodiment as described in EXAMPLES 1-3, the vector 10 was designed for use in plant cells and comprised a nucleic acid sequence encoding the WPP domain of the Arabidopsis RAN GTPASE ACTIVATING PROTEIN 1 (RanGAP1), set forth herein as SEQ ID NO: 1. As described in EXAMPLE 1, the expressed NTF protein that included the amino acid sequence of the RanGAP1 WPP domain, set forth herein as SEQ ID NO:2, successfully caused the translocation and incorporation of the fusion protein to the nuclear envelope of A. thaliana epidermal root cells. In some embodiments, the vector 10 comprises a nucleic acid sequence encoding a nuclear envelope targeting region with an amino acid sequence of SEQ ID NO:2, or a variant thereof. Because the RanGAP1 WPP domain is relatively conserved in plants, use of this domain is predicted to be useful in employing this system in many other, if not all, types of plants.
For embodiments using cell types from non-plant organisms, the C-terminus of the endogenous RanGAP protein may be used, or any number of nuclear pore proteins may be used. For example, in one embodiment for the nematode Caenorhabditis elegans, NPP-9, a C. elegans homolog of mammalian Nup358/RanBP2, was used to target the NTF protein 30 to the nuclear envelope, as described in EXAMPLE 4. The amino acid sequence for the NPP-9 domain is set forth herein as SEQ ID NO:23, and is encoded by the nucleic acid set forth herein as SEQ ID NO:22. Additionally, in an embodiment for Drosophila melanogaster, the D. melanogaster RanGAP protein was used to target the NTF protein 30 to the nuclear envelope, as described in EXAMPLE 5. The amino acid sequence for the D. melanogaster RanGAP domain is set forth herein as SEQ ID NO:29, and is encoded by the nucleic acid set forth herein as SEQ ID NO:28. Accordingly, some embodiments comprise a nucleic acid encoding a polypeptide that is a variant with at least 90% identity to SEQ ID NO:23 or SEQ ID NO:29.
The nuclear envelope is a double lipid bilayer composed of an inner nuclear membrane (INM) and outer nuclear membrane (ONM) separated by a space referred to as the lumen (L) (see
In other embodiments, the nuclear envelope targeting region causes the NTF to be embedded in the INM. Embodiments that incorporate nuclear envelope targeting regions specific for the INM are useful to accommodate culture or extraction techniques that may compromise the ONM of the nuclear envelope. For example, some detergents may disrupt the ONM, as described in EXAMPLE 6. In this regard, NTF proteins incorporating members of the SUN domain family of proteins were shown to tag the INM of nuclei in mice and Drosophila, respectively, as described in EXAMPLES 6 and 7. Accordingly, in some embodiments, the nuclear envelope targeting region comprises a SUN domain. Illustrative SUN domains include the sequence from amino acid residue 771 to amino acid residue 911 of SEQ ID NO:93, and the sequence from amino acid residue 971 to amino acid residue 1108 of SEQ ID NO:97. In some embodiments, the SUN domain has a polypeptide sequence with at least 90% identity to the sequence from amino acid residue 771 to amino acid residue 911 of SEQ ID NO:93, or the sequence from amino acid residue 971 to amino acid residue 1108 of SEQ ID NO:97, or any naturally occurring homolog thereof. An additional representative SUN domain is incorporated in the sequence from amino acid residue 425 to amino acid residue 460 of the D. melanogaster klaroid protein, the amino acid sequence of which has the Genbank accession number NM_136396.3, hereby incorporated herein by reference (as accessed on Sep. 15, 2011).
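For illustration only, the residue ranges recited above are 1-indexed and inclusive of both endpoints, so that residues 771 to 911 span 141 residues and residues 971 to 1108 span 138 residues. A minimal Python sketch of extracting such a range from a full-length sequence follows. The function name `residue_range` and the placeholder sequence are hypothetical; the actual sequences are those of the sequence listing and are not reproduced here.

```python
def residue_range(sequence: str, start: int, end: int) -> str:
    """Return residues start..end of a sequence, 1-indexed and inclusive."""
    if start < 1 or end < start or end > len(sequence):
        raise ValueError(
            f"range {start}-{end} is invalid for a sequence of length {len(sequence)}"
        )
    # Python slices are 0-indexed and end-exclusive, hence start - 1 and end.
    return sequence[start - 1:end]


# Placeholder sequence of 1200 residues (NOT a sequence from the listing).
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 60

print(len(residue_range(placeholder, 771, 911)))   # length of a 771-911 range
print(len(residue_range(placeholder, 971, 1108)))  # length of a 971-1108 range
```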
As illustrated in
A person of ordinary skill in the art will recognize that the affinity reagent binding region 34 can also be an epitope located within a detectable polypeptide, such as a fluorescent protein or other visualization tag. Thus, in some embodiments, the affinity (capture) reagent, such as an antibody, can be used against an epitope contained in a fluorescent protein or other visualization tag that is included in the fusion protein, as described herein. In another embodiment, the capture reagent that specifically binds to the affinity reagent binding region 34 may be labeled with a molecule capable of emitting detectable light or energy. In some embodiments, the immunological capture agent may also be bound to a bead. Numerous types of antibody-bound beads are commercially available.
To ensure access of the affinity capture reagent to the affinity reagent binding region 34 (e.g., epitope tag or biotin ligase accepting site) of the NTF protein 30 of a tagged nucleus, it is preferred that the vector 10 encodes an NTF protein 30 such that the relative positions of the translated nuclear targeting region 32 and affinity reagent binding region 34 will result in the positioning of the affinity binding region 34 in the extra-nuclear space of the cell upon the incorporation of the NTF protein 30 to the nuclear envelope 46. For example,
In other embodiments, the NTF protein comprises a nuclear envelope targeting region, such as a SUN domain, that causes the translocation of the NTF in the INM. In such embodiments, it is preferred that the affinity reagent binding region is positioned such that it resides in the lumen between the INM and ONM upon embedding of the NTF in the INM. For example, as described in EXAMPLE 6 and illustrated in
In some embodiments, the encoded affinity reagent binding region 34 comprises a biotin acceptor site 35 for a biotin ligase. The encoded biotin acceptor site 35 is capable of becoming biotinylated in vivo in the presence of a biotin holoenzyme synthetase. In vivo biotinylation is a highly specific post-translational modification mediated by endogenous biotin ligases (Cronan, J. E., et al., J. Biol. Chem. 265:10327-33, 1990). In one embodiment of the vector 10, the encoded biotin ligase acceptor site 35 is a target for the E. coli biotin carboxyl carrier protein (BCCP), a subunit of acetyl-CoA carboxylase (Samols, et al., J. Biol. Chem. 263:6461-4, 1988). Escherichia coli biotin holoenzyme synthetase (BirA) is encoded by the nucleic acid sequence set forth herein as SEQ ID NO:11 (Barker and Campbell, J. Mol. Biol. 146:451-67, 1981), and has the polypeptide sequence set forth herein as SEQ ID NO:12. The BirA enzyme is an exemplary enzyme that catalyzes biotin activation by covalently joining biotin with ATP to form biotin-5′-adenylate, with subsequent transfer to the epsilon amino group of a specific BCCP lysine residue (Barker and Campbell, J. Mol. Biol. 146:469-92, 1981b). Because in vivo biotinylation is highly specific for the BCCP lysine, it can be achieved without modification of critical lysine residues belonging to antibody recognition sequences and thus without functional loss of the recognition domains. Accordingly, in one embodiment, as described in EXAMPLES 1-3, the vector 10 comprises the nucleotide sequence set forth herein as SEQ ID NO:5, which encodes a biotin ligase accepting site 35, set forth herein as SEQ ID NO:6. In other embodiments, the vector 10 comprises any nucleic acid sequence encoding a biotin ligase accepting site 35 with an amino acid sequence of SEQ ID NO:6, or variant thereof.
In other embodiments, as described in EXAMPLES 4 and 5, the vector 10 comprises the nucleotide sequence set forth herein as SEQ ID NO:87, which encodes a biotin ligase accepting site 35, with an amino acid sequence set forth herein as SEQ ID NO:88. In other embodiments, the vector 10 comprises a nucleic acid sequence encoding a biotin ligase accepting site 35 comprising an amino acid sequence of SEQ ID NO:88, or variant thereof.
It is noted that while BirA typically recognizes a large protein domain, Schatz and colleagues have identified short peptides (Schatz, P. J., Biotechnology 11:1138-43, 1993; Beckett, et al., Protein Sci. 8:921-9, 1999) that efficiently mimic BCCP biotin acceptor function. Accordingly, in some embodiments, the vector 10 comprises a nucleic acid sequence that encodes a shortened biotin ligase accepting site comprising the amino acid sequence GLNDIFEAQKIEWHE, set forth herein as SEQ ID NO:8. An example of a nucleic acid sequence encoding the shortened biotin ligase accepting site of SEQ ID NO:8 is set forth herein as SEQ ID NO:7. In some embodiments, the vector 10 comprises a nucleic acid sequence that encodes a shortened biotin ligase accepting site that is a variant of SEQ ID NO:8.
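As a brief illustration, the shortened acceptor peptide recited above can be inspected for lysine residues, since biotin is transferred to the epsilon amino group of a lysine. The following minimal sketch uses only the sequence given in the text; the function name is hypothetical, and the sketch merely locates lysine residues without modeling ligase recognition.

```python
from typing import List

ACCEPTOR_PEPTIDE = "GLNDIFEAQKIEWHE"  # amino acid sequence recited in the text as SEQ ID NO:8


def lysine_positions(peptide: str) -> List[int]:
    """Return the 1-based positions of lysine (K) residues in a peptide."""
    return [i for i, residue in enumerate(peptide, start=1) if residue == "K"]


print(len(ACCEPTOR_PEPTIDE))               # 15 residues
print(lysine_positions(ACCEPTOR_PEPTIDE))  # [10]
```

The peptide contains a single lysine, at position 10, which is therefore the only candidate acceptor residue under the lysine-specific mechanism described above.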
As described herein, the embodiments of the vectors, kits and methods incorporating an in vivo biotinylated fusion protein and biotin capture reagent allow for high yields of purified nuclei and high purity of nucleic acid from cells of the cell type of interest. This is likely due to the fact that the interaction between biotin and streptavidin is orders of magnitude stronger than typically observed for antigen/antibody interactions. Therefore, such embodiments allow for the isolation of a high percentage of the tagged nuclei and the selective purification of nucleic acids from the tagged nuclei.
In some embodiments, such as is illustrated in
In some embodiments, the target organism is a plant. In one illustrative embodiment, as described in EXAMPLE 1, the vector 10 includes the promoter 12 for ACTIN DEPOLYMERIZING FACTOR 8 (ADF8) (Ruzicka et al., 2007), presented herein as SEQ ID NO:13, resulting in expression of the NTF protein 30 exclusively in hair cells of the A. thaliana root epidermis. Accordingly, in some embodiments wherein the cell type of interest is derived from a plant root epidermis hair cell, the vector comprises a promoter nucleic acid sequence that is a variant of SEQ ID NO:13 (ADF8) and has at least 90% identity thereto. In another embodiment, also described in EXAMPLE 1, the vector 10 includes the promoter 12 for GLABRA 2 (GL2) (Masucci et al., 1996), presented herein as SEQ ID NO:14, resulting in expression of the NTF protein 30 exclusively in the non-hair cells of the A. thaliana root epidermis. Accordingly, in some embodiments wherein the cell type of interest is derived from a plant root epidermis non-hair cell, the vector comprises a promoter nucleic acid sequence that is a variant of SEQ ID NO:14 (GL2) and has at least 90% identity thereto.
As will be apparent to persons of ordinary skill in the art, promoter sequences 12 for cell-type specific transcription of the NTF encoding nucleic acid can be selected from the organism of choice. For example, in embodiments in which the cell type of interest is a D. melanogaster cell type, known promoters specific for the D. melanogaster cell type may be used. For example, in the embodiment described in EXAMPLE 5 and illustrated in
In some embodiments, the cell-type specific promoter 12 incorporates 3′ UTR sequence to further facilitate the cell-type specific transcription of the vector sequence. For example, in the embodiment described in EXAMPLE 4, the promoter sequence 12 comprises the sequence of the C. elegans pie-1 promoter, set forth herein as SEQ ID NO:20, and an additional 3′ UTR sequence 12a, the C. elegans pie-1 3′ UTR, set forth herein as SEQ ID NO:21, which were used for germline-specific expression of the transgenic constructs. As illustrated in
As described above, some embodiments of the vector 10 encode an affinity reagent binding region comprising a biotin ligase accepting site 35. In further embodiments, the vector 10 encoding the NTF protein 30 in a first expression cassette also comprises a nucleic acid sequence 24 encoding a biotin ligase 38 in a second expression cassette 11, wherein the encoded biotin ligase 38 is capable of ligating biotin 36 to the biotin ligase accepting site 35 in the encoded NTF protein 30. As described above, biotin ligase 38 catalyzes biotin activation by covalently joining biotin 36 with ATP to form biotin-5′-adenylate, with subsequent transfer to the epsilon amino group of a specific lysine residue within a specific amino acid sequence recognized by the ligase 38. In some embodiments, biotin ligase 38 is E. coli biotin ligase BirA, the polypeptide sequence of which is set forth herein as SEQ ID NO:12, and which is encoded by the nucleic acid sequence 24 set forth herein as SEQ ID NO:11. Accordingly, in some embodiments, the biotin ligase 38 is encoded by a nucleotide sequence 24 comprising SEQ ID NO:11, or any variant thereof. In some embodiments, the expression cassette 11 encodes a variant biotin ligase 38 with an amino acid sequence with at least 90% identity to SEQ ID NO:12.
In preferred embodiments, the expression of at least one of the NTF polypeptide 30 or biotin ligase polypeptide 38 is controlled by a cell type-specific promoter 12. As described above, any known cell type-specific promoter sequence 12 can be incorporated into the vector(s) 10 to facilitate the transcription, and to enable the subsequent translation, of the sequence to which it is operatively linked in the cell type of interest. In some embodiments, only one of the sequences encoding the NTF protein sequence or the biotin ligase is operatively linked to a cell type-specific promoter 12, whereas the other is operatively linked to a constitutive promoter 13. In other embodiments, the sequences encoding both the fusion protein sequence (i.e., first expression cassette) and the biotin ligase (i.e., second expression cassette 11) are operatively linked to the same or different promoters 12 that is/are specific for the same cell type. In some embodiments, a single cell type-specific promoter sequence 12 drives the expression of 1) an NTF protein comprising a nuclear envelope targeting region and an affinity reagent binding region comprising a biotin ligase accepting site, and 2) a biotin ligase (i.e., in a unitary expression cassette). Consequently, in each of the embodiments described, only the cell type of interest will co-express both the NTF protein sequence and the biotin ligase to result in nuclei biotinylated in vivo.
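The co-expression logic described above can be sketched as a set intersection: nuclei are biotinylated only in cell types where both the NTF protein and the biotin ligase are expressed. The following is a simplified toy model offered for illustration only. The cell type names and function names are hypothetical, a constitutive promoter is assumed to be active in every cell type, and a cell type-specific promoter is assumed to be active in exactly one.

```python
from typing import Set

# Hypothetical cell types present in a tissue (illustrative names only).
ALL_CELL_TYPES: Set[str] = {"hair", "non-hair", "cortex"}


def active_in(promoter: str, cell_types: Set[str]) -> Set[str]:
    """Return the cell types in which a promoter drives expression (toy model)."""
    if promoter == "constitutive":
        return set(cell_types)
    # A cell type-specific promoter is named after the single cell type it is active in.
    return {promoter} & cell_types


def biotinylated_cell_types(ntf_promoter: str, ligase_promoter: str) -> Set[str]:
    """Nuclei are biotinylated only where the NTF and the ligase are co-expressed."""
    ntf_cells = active_in(ntf_promoter, ALL_CELL_TYPES)
    ligase_cells = active_in(ligase_promoter, ALL_CELL_TYPES)
    return ntf_cells & ligase_cells


# One specific promoter suffices to restrict biotinylation to the cell type of interest.
print(biotinylated_cell_types("hair", "constitutive"))
print(biotinylated_cell_types("constitutive", "hair"))
print(biotinylated_cell_types("hair", "hair"))
# With two constitutive promoters, every cell type would carry biotinylated nuclei.
print(sorted(biotinylated_cell_types("constitutive", "constitutive")))
```

In this model, the intersection is limited to the cell type of interest whenever at least one of the two promoters is cell type-specific, which is consistent with the preference stated above.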
In other embodiments, the sequence 24 encoding biotin ligase 38 is operatively linked to a distinct promoter sequence 13 (i.e. in a second expression cassette 11). In some embodiments, the sequence 24 encoding the biotin ligase 38 in the second expression cassette 11 is operatively linked to a cell type specific promoter sequence 12, which can be the same as, or different from, the promoter sequence 12 driving expression of the NTF protein 30. In other embodiments, expression of the biotin ligase 38 (i.e., second expression cassette) is under the control of a constitutive promoter 13 that is not cell-type specific and the expression of the NTF polypeptide 30 is under the control of a cell type-specific promoter 12. For example, in the embodiments illustrated in
In some embodiments, the encoded NTF polypeptide 30 further comprises a visualization tag region 44, which is useful in permitting the visual confirmation of the NTF protein 30 being incorporated into the nuclear envelope 46 of the cells of interest that contain the vector 10. In some embodiments, the affinity reagent binding region 34 comprises the visualization tag, which thus serves a dual purpose of allowing for visualization and binding to an affinity binding reagent. As illustrated in
In some embodiments, the encoded NTF polypeptide 30 further comprises one or more spacer regions 40 that separate the nuclear envelope targeting region 32 and the affinity reagent binding region 34 (e.g., epitope tag or biotin ligase accepting site 35) of the fusion protein. For example, in the embodiment of the vector 10 illustrated in
In some embodiments, the vector 10 encodes the NTF protein 30 and the biotin ligase 38. In this regard, the vector 10 may encode the NTF protein 30 and the biotin ligase 38 in the same (i.e., unitary) expression cassette driven by a cell type-specific promoter 12. Alternatively, the vector 10 may encode the NTF protein 30 and the biotin ligase 38 in separate (i.e., first and second) expression cassettes, wherein at least one of the first and second expression cassettes 11 is driven by a cell type-specific promoter 12. In other embodiments, the vector 10 encodes the NTF protein 30, and a separate (i.e., second) vector encodes the biotin ligase 38 in a second expression cassette 11. As above, at least one of the first and second expression cassettes 11 is driven by a cell type-specific promoter 12.
One of ordinary skill in the art will recognize that in accordance with some embodiments of the invention, the vector(s) provided by the present invention for producing labeled nuclei (e.g., epitope tagged or in vivo biotinylated nuclei) may optionally include additional sequences known by those of skill in the art that facilitate the functionality of the vector in the cell type of interest. For example, vectors can include additional known sequences such as an origin of replication, selectable markers and sequences to facilitate transcription and translation of the fusion protein and biotin ligase. Such sequences also include polyadenylation tails, UTR sequences and Kozak sequences. See Sambrook, J., and Russell, D. W., eds., Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2001). In some embodiments, the vector is a plasmid. In other embodiments, the vector is a viral vector. In a further embodiment, the viral vector is a Lentivirus vector.
It is demonstrated that the use of the present invention is widely applicable to eukaryotic organisms, including plants and animals, as described in EXAMPLES 1-5. Accordingly, in some embodiments, the vector(s) are useful for producing labeled (e.g., epitope tagged or biotinylated) nuclei in a cell type of interest that is derived from a eukaryotic organism. As used herein, the term “derived from” is used to indicate the originating organism that gave rise to the cell type of interest. In preferred embodiments, the cell type of interest is a specific type of differentiated cell that has developed within the originating organism at some temporal point in the organism's development, and is distinct from other cell types within the same organism. At the time of expression of the fusion protein (or co-expression with biotin ligase) encoded by the vector(s) of the present invention, the cell-type of interest may be incorporated into an intact tissue of the living originating organism, or maintained in an appropriate cell culture environment. Accordingly, in some embodiments, the vector comprising the NTF polypeptide is useful for producing labeled (e.g., epitope tagged or in vivo biotinylated) nuclei in a cell type of interest that is derived from a multicellular eukaryotic organism.
In some embodiments, the vector 10 encoding the NTF polypeptide 30 is useful for producing labeled (e.g., epitope tagged or in vivo biotinylated) nuclei in a cell type of interest that is derived from a plant, such as A. thaliana. In some embodiments, the vector 10 encoding the NTF polypeptide 30 is useful for producing labeled (e.g., epitope tagged or in vivo biotinylated) nuclei in a cell type of interest that is derived from an animal. In other embodiments, the vector 10 encoding the NTF polypeptide 30 is useful for producing labeled (e.g., epitope tagged or in vivo biotinylated) nuclei in a cell type of interest that is derived from an arthropod, such as D. melanogaster. In some embodiments, the vector 10 encoding the NTF polypeptide 30 is useful for producing labeled (e.g., epitope tagged or in vivo biotinylated) nuclei in a cell type of interest that is derived from a nematode, such as C. elegans. In some embodiments, the vector 10 encoding the NTF polypeptide 30 is useful for producing labeled (e.g., in vivo biotinylated) nuclei in a cell type of interest that is derived from a mammal, such as rodents, dogs, cats, horses, or primates including humans. In a further embodiment, the vector 10 encoding the NTF polypeptide 30 is useful for producing labeled nuclei in a cell type of interest that is derived from a mouse.
In another aspect, the present invention provides a cell comprising a vector 10 for selectively labeling the cell type of interest, the vector 10 comprising a nucleic acid sequence encoding a nuclear tagging fusion (NTF) polypeptide 30 comprising (a) a nuclear envelope targeting region 32, and (b) an affinity reagent binding region 34, wherein the expressed NTF polypeptide 30 is incorporated into the nucleus 50 of the cell. Exemplary embodiments of the vector 10 have been described above. In some embodiments, the affinity reagent binding region 34 is an epitope tag, as described above. In some embodiments, the affinity reagent binding region 34 of the NTF polypeptide 30 comprises a biotin acceptor site 35 and the cell further comprises a vector 10 encoding a biotin ligase 38, such as BirA.
In one embodiment, the cell is part of a transgenic organism. In another embodiment, the cell is in a tissue. The tissue can be in a living organism or be maintained under appropriate culture conditions to permit the further development of the cell within the tissue. As used herein, the term “tissue” is used to describe an intermediate level of cellular organization between individual cells and a whole organism. The tissue is comprised of multiple cells, often of varying types, that may cooperate or function in concert to perform a united task. Accordingly, a cell contemplated in this embodiment is likely to be surrounded by different cell types with distinct developmental histories. In yet another embodiment, the invention provides a cell comprising the vector or vectors described above, wherein the cell is in culture. As used herein, the term “culture” is intended to mean any environment outside the organism of origin wherein conditions are maintained to facilitate the continuation of cell functions.
In another aspect, the invention provides a kit for selectively labeling nuclei 50 in a cell type of interest, the kit comprising: (a) a vector 10 comprising a first expression cassette comprising a nucleic acid sequence encoding a nuclear tagging fusion (NTF) polypeptide 30 comprising: (i) a nuclear envelope targeting region 32; and (ii) an affinity reagent binding region 34; and (b) a capture molecule (i.e., affinity reagent) capable of specifically binding to the affinity reagent binding region, or a modification thereof. In some embodiments, the affinity reagent binding region 34 comprises an epitope tag. In some embodiments, the affinity reagent binding region 34 comprises a biotin ligase accepting site 35. In some embodiments, the kit further comprises a second expression cassette 11 for expressing a biotin ligase polypeptide 38. In some embodiments the first and second expression cassettes are on the same vector 10. In other embodiments the first and second expression cassettes are on different vectors. In some embodiments, the capture reagent is bound to a magnetic particle. Various elements of the kit are described above in the context of the vector.
In some embodiments, the sequence encoding the NTF polypeptide is operatively linked to a cell type-specific promoter 12 for a cell type of interest. In embodiments comprising a second expression cassette 11 encoding a biotin ligase 38, the sequence encoding at least one of the NTF polypeptide or the biotin ligase polypeptide is operatively linked to a cell type-specific promoter 12. In some embodiments, the first, second, or both expression cassettes are adapted to receive a promoter 12 to be operatively linked to the sequence encoding the fusion protein and/or the sequence encoding the biotin ligase. For example, the expression cassette can include an insertion site flanked by one or more restriction enzyme recognition sequences for insertion of a promoter sequence, such as a particular cell-type specific promoter, using standard cloning techniques known by those of skill in the art.
In some embodiments, the first and second expression cassettes are provided on the same vector 10. In other embodiments, the first and second expression cassettes are provided in separate vectors.
The components of the kit may be adapted to function in cells of any eukaryotic organism of interest. Organisms of interest can include fungi, plants, and animals. Animals of interest include arthropods, nematodes, and mammals. One of ordinary skill in the art will recognize that functionality for any organism of interest requires selection of the appropriate nucleic acid sequence 14 encoding a nuclear envelope targeting domain 32, as described herein. In some embodiments, the first expression cassette encoding the NTF polypeptide is adapted to receive a nucleic acid sequence 14 encoding a nuclear envelope targeting domain 32 useful for translocation of the NTF polypeptide 30 to the nuclear envelope 46 of the organism of interest. As above, the expression cassette for the NTF polypeptide 30 can include an insertion site flanked by one or more restriction enzyme recognition sequences for insertion of a sequence encoding a nuclear envelope targeting region using standard cloning techniques known by those of skill in the art.
In further embodiments, the kits of the invention further comprise an affinity reagent, i.e., capture molecule, that specifically binds to an epitope, such as one of any known epitope tags, in the affinity reagent binding region. Affinity reagents include antibodies or fragments thereof.
In some embodiments, the kit further comprises an affinity reagent, i.e., capture molecule, that specifically binds to biotin to facilitate the isolation of the in vivo biotinylated nuclei of a cell type of interest. The capture molecule can be any molecule known to specifically bind to biotin. Suitable examples include streptavidin, avidin, or antibodies specific for biotin, or functional fragments of any of the aforesaid molecules.
In some embodiments, the kit comprises a capture molecule that is bound to a magnetic particle to facilitate the isolation of the in vivo biotinylated nuclei of a cell type of interest.
In some embodiments of the kit, the affinity reagent binding region comprises at least one fluorescent protein domain. Such fluorescent protein domains are known in the art, and include GFP, tdTomato, and mCherry.
In some embodiments of the kit, the nuclear envelope targeting region comprises a SUN domain, a KASH domain, a WPP domain, an NPP-9 domain, a Nup358/RanBP2 domain, or a RanGAP domain, as described in EXAMPLES 1-7.
In some embodiments, the kit further comprises a device to facilitate isolation of the tagged (i.e., epitope tagged or in vivo biotinylated) nuclei of a cell type of interest. In one embodiment illustrated in
Each kit is preferably provided in suitable packaging and may also contain reagents useful for selectively isolating tagged (e.g., epitope tagged or in vivo biotinylated) nuclei in a cell type of interest, such as, for example, transfection reagents, selective media, control inserts, sequencing primers and PCR amplification primers, dNTPs, high fidelity polymerase and buffer, reagents for cell lysis, rinse buffers, reagents for DNA extraction, detection reagents, instructions, and the like. In some embodiments, the kit includes cells transformed with one or more vector(s) of the kit. Cells can include eukaryotic cells of various origins, for example, plant, arthropod, nematode, or mammalian cells.
In another aspect, the present invention provides a method of generating in vivo biotinylated nuclei in a cell type of interest comprising recombinantly co-expressing in the cell (a) a nuclear tagging fusion (NTF) polypeptide 30 comprising (i) a nuclear envelope targeting region 32; and (ii) a biotin ligase accepting site 35; and (b) a biotin ligase 38; wherein the co-expression of the recombinant NTF polypeptide 30 and the biotin ligase 38 produces biotinylated nuclei in the cell type of interest. The methods of this embodiment of the invention may be carried out using the vectors and kits described herein.
In accordance with the foregoing, the co-expression of the recombinant NTF polypeptide 30 and the biotin ligase 38 produces biotinylated nuclei in the cell. Without intending to be bound by theory, the nucleic acid sequences encoding both the NTF polypeptide and the biotin ligase are transcribed into mRNA by virtue of the assembly of transcription factors and transcription complex proteins, including RNA Polymerase as facilitated by the operatively linked promoters. The mRNA is used as a translation template by the cells' endogenous ribosomes. The NTF polypeptide 30, by virtue of the nuclear envelope targeting region 32, is transported and incorporated into the nuclear envelope 46. See, for example, the embodiment illustrated in
As described herein, the vector 10 encoding the NTF polypeptide 30 can optionally encode additional domains, such as one or more spacer regions 40, 42 and/or one or more visualization tags 44.
Conventional cloning techniques may be used to insert a sequence encoding a known nuclear envelope targeting region in frame with a sequence encoding the affinity reagent binding region, such as a biotin ligase accepting site, within an expression vector, to obtain a sequence encoding a fusion protein. See, e.g., Sambrook, J., and Russell, D. W., eds., Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2001). Examples of sequences encoding nuclear envelope targeting regions, affinity reagent binding regions comprising epitope tags, and biotin ligase accepting sites are provided herein. Similarly, conventional cloning techniques may be used to insert sequences encoding a biotin ligase into an expression vector, as described herein.
The nucleotide sequences encoding the NTF polypeptide 30 and the biotin ligase 38 are each operatively linked to promoter sequence(s) within the vector(s) that facilitate the binding of transcription factors and assembly of the transcription complex to generate mRNA transcripts of the sequences and subsequently generate the corresponding polypeptide gene products. In a preferred embodiment, expression of at least one of the NTF polypeptide and the biotin ligase is under the control of a promoter 12 specific to the cell type of interest. In a further embodiment, expression of both the NTF polypeptide and the biotin ligase is under the control of a promoter 12 specific to the cell type of interest, which may be the same or a different promoter 12. Use of a promoter 12 specific to a cell type of interest in this manner ensures that the co-expression of the NTF protein and biotin ligase will be exclusive to the cell type of interest, and will not occur in neighboring cells with distinct developmental histories.
In one embodiment, the nucleotide sequences encoding the NTF polypeptide 30 comprising a nuclear envelope targeting region 32 and an affinity reagent binding region 34 are introduced into the cell, or a progenitor of the cell type of interest. In one embodiment, the nucleotide sequences encoding the NTF polypeptide 30 and the biotin ligase 38 are present in the same expression vector 10, wherein the co-expressing comprises introducing one or more copies of the vector encoding the NTF polypeptide 30 and biotin ligase 38 into the cell, or a progenitor of the cell type of interest. In another embodiment, the nucleotide sequences encoding the NTF polypeptide 30 and the biotin ligase 38 are present on separate expression vectors, wherein the co-expressing comprises introducing one or more copies of the vector 10 encoding the NTF polypeptide 30 and one or more copies of the vector encoding biotin ligase 38 in a second expression cassette 11 into the cell, or a progenitor of the cell type of interest.
The term “introduce” is used herein to describe any act of causing the vector 10 to be present in a cell at any time in the course of the cell's development. In some embodiments of the method, at least one copy of the expression vector or vectors is introduced into an existing cell of the type of interest by direct use of conventional transformation techniques (e.g., DNA transfection). In another embodiment, at least one copy of the expression vector or vectors is introduced into a cell type of interest by transforming the vector or vectors into a progenitor cell of the cell type of interest. Consequently, by virtue of DNA replication and cell division, the progenitor cells give rise to a plurality of cells of the cell type of interest, each cell of which contains at least one copy of the vector or vectors (i.e., genetically modified cells). The progeny cells of the progenitor cells may comprise a multitude of cell-types with distinct developmental histories in addition to the cell type of interest. In one embodiment, the progenitor cell is a stem cell. In another embodiment, the progenitor cell is an embryo.
Alternatively, in some embodiments, the vector or vectors introduced into a progenitor cell are not duplicated in their entirety during the course of cell replication. Instead, the elements of the vector or vectors, including the sequences encoding the NTF polypeptide 30 and biotin ligase 38 and their operatively linked promoters, are transferred into the genome of the cell. In accordance with the foregoing, in a further embodiment, at least one copy of the expression vector or vectors is introduced into a cell type of interest by transforming the vector or vectors into a progenitor cell of the cell type of interest, and by virtue of DNA replication and cell division, the progenitor cell gives rise to a plurality of cells of the cell type of interest, each cell of which contains at least one copy of the sequences encoding the NTF polypeptide 30 and biotin ligase 38 and their operatively linked promoters.
In another embodiment, the NTF protein 30 further comprises a visualization tag 44. As described, this is useful to perform visual confirmation of the proper expression and nuclear envelope localization of the fusion protein. Visual confirmation may be performed using standard microscopy techniques. The visualization tag 44 may be one of many conventional and well-known polypeptide sequences known to emit light or other detectable energy, as described above.
In another embodiment, the cell type of interest is present in a mixture of multiple cell types. The different cell types in the mixture are understood to have distinct developmental histories, although they may be the progeny of a common progenitor cell. As a consequence of the distinct developmental histories, the different cell types exhibit different phenotypes and possess unique repertoires of gene transcription factors. In one embodiment, all of the cell types are the progeny of a common progenitor cell that received the vector or vectors encoding the NTF polypeptide 30 and biotin ligase 38. In another embodiment, the mixture of cell types may be in a cell culture, as distinct from the organism of origin. In another embodiment, the mixture is a tissue, which is an organized conglomeration of cells of different types that cooperate to perform a function in the organism of origin.
In some embodiments, the cell type of interest is of plant, nematode, arthropod, or mammalian origin. For example, in the embodiments described in EXAMPLE 1, the cell types of interest were hair cells and non-hair cells in the root epidermis of the plant A. thaliana. In the embodiment described in EXAMPLE 4, the cell type of interest was germline cells in C. elegans. In the embodiment described in EXAMPLE 5, the cell type of interest was somitic cells in D. melanogaster embryos.
In another embodiment, as described herein, the method further comprises isolating labeled (e.g., biotinylated) nuclei from the cells using a capture molecule that specifically binds to the affinity reagent binding region, or a modified (e.g., biotinylated) form thereof.
In another aspect, the invention provides a method for selectively isolating nuclei from a cell type of interest present in a plurality of cells. The method according to this aspect comprises (a) recombinantly expressing in a plurality of cells of a cell type of interest a nuclear tagging fusion (NTF) polypeptide 30 comprising (i) a nuclear envelope targeting region 32 and (ii) an affinity reagent binding region 34, wherein the NTF polypeptide is under the control of a promoter 12 specific to the cell type of interest; (b) lysing the plurality of cells of step (a) under conditions suitable to generate a cell lysate comprising a plurality of intact nuclei; (c) contacting the cell lysate with a capture molecule that specifically binds to the affinity reagent binding region, or a modification thereof, under conditions suitable to bind the nuclei comprising the fusion polypeptide; and (d) isolating the nuclei bound to the capture molecule. The methods of this embodiment of the invention may be carried out using the vectors and kits described herein.
In another aspect, the invention provides a method for selectively isolating nuclei from a cell type of interest present in a plurality of cells, wherein at least a portion of the cells recombinantly express a fusion polypeptide, the fusion polypeptide comprising (i) a nuclear envelope targeting region 32 and (ii) an affinity reagent binding region 34, wherein the NTF polypeptide is under the control of a promoter 12 specific to the cell type of interest. The method according to this aspect comprises (a) lysing the plurality of cells under conditions suitable to generate a cell lysate comprising a plurality of intact nuclei; (b) contacting the cell lysate with a capture molecule that specifically binds to the affinity reagent binding region, or a modification thereof, under conditions suitable to bind the nuclei comprising the fusion polypeptide; and (c) isolating the nuclei bound to the capture molecule. The methods of this embodiment of the invention may be carried out using the vectors and kits described herein.
In some embodiments of the method, the nuclei of the cell type of interest are isolated from a mixture of multiple cell types obtained from at least one of a plant, a nematode, an arthropod, or a mammal. In some embodiments, the nuclei are isolated from a mixture or plurality of cells obtained from a mammal, such as a mouse.
In some embodiments, the nuclear envelope targeting region comprises a SUN domain, a KASH domain, a WPP domain, an NPP-9 domain, a Nup358/RanBP2 domain, or a RanGAP domain, as described in EXAMPLES 1-7.
In some embodiments, the method further comprises permeabilizing the cells and subjecting the nucleic acids therein to biochemical manipulation before the cell lysis step, as illustrated in
In some embodiments, the method comprises introducing a viral vector encoding the nuclear tagging fusion protein into the host organism to induce recombinant expression of the fusion protein. Any viral vector suitable to induce expression within a host cell is contemplated. One example is a Lentivirus vector. The host organism can be any eukaryotic organism for which nuclear envelope targeting regions are known (and incorporated into the NTF protein). Illustrative eukaryotic organisms include plants, nematodes, arthropods, and mammals. More specifically, illustrative model organisms include A. thaliana, C. elegans, D. melanogaster, and Mus musculus.
In some embodiments, the viral vector is introduced into a progenitor cell of the cell type of interest. In other embodiments, the viral vector is introduced into a post-mitotic cell. As used herein, the term “post-mitotic” refers to the cell cycle state in which the cell will no longer undergo further division. In some embodiments, the post-mitotic cell is a neuron. An exemplary description of this embodiment is provided in EXAMPLE 6.
In another aspect, the present invention provides a method of selectively isolating nuclei from a cell type of interest wherein at least a portion of the cells co-express (i) a recombinant nuclear tagging fusion (NTF) polypeptide 30 comprising a nuclear envelope targeting region 32 and a biotin ligase accepting site 35, and (ii) a biotin ligase 38, wherein expression of at least one of the recombinant NTF polypeptide 30 or the biotin ligase 38 is under the control of a promoter 12 that is specific for the cell type of interest, and wherein the co-expression of the recombinant NTF polypeptide 30 and the biotin ligase 38 produces biotinylated nuclei only in the cell type of interest. The method comprises: (a) lysing the plurality of cells from the mixture under conditions suitable to generate a cell lysate comprising a plurality of intact nuclei from the plurality of cells; (b) contacting the cell lysate with a capture molecule that specifically binds to biotin under conditions suitable to bind the biotinylated nuclei; and (c) isolating the biotinylated nuclei bound to the capture molecule.
Cells may be lysed by any conventional method sufficient to interrupt the continuity of the cell's outer plasma membrane (illustrated in
It is preferred that nuclei are rinsed to rid the solution of cellular debris from the lysate. The nuclei can be rinsed in NPB, pelleted by centrifugation, and resuspended in NPB multiple times. In preferred embodiments, the nuclei are finally resuspended in a low volume to enhance the interaction between the nuclei and capture molecules, such as streptavidin-containing molecules that specifically bind to biotin. For example, as described in EXAMPLE 1, nuclei from an initial 3 grams of plant root tissue were finally resuspended in 1 mL of NPB after introduction.
In accordance with an embodiment of the method provided by the present invention, a capture molecule is contacted with the cell lysate under conditions suitable for binding to the biotinylated nuclei. As described herein, the capture molecule can be any molecule that specifically binds to biotin that is attached to the fusion protein. In some embodiments, the capture molecule is streptavidin, or a fragment thereof. In some embodiments, the capture molecule is avidin, or a fragment thereof. In some embodiments, the capture molecule is an anti-biotin antibody, or a fragment thereof.
In some embodiments, the capture molecule is immobilized on a solid substrate, such as a tissue culture plate or filter and the cell lysate is passed over the immobilized capture molecule. Interactions between the biotinylated nuclei and immobilized capture molecules effectively immobilize the biotinylated nuclei and allow the non-biotinylated nuclei to be rinsed away. After isolation, the nuclei may be collected, for example, by interrupting the interaction between the biotinylated nuclei and the capture molecule, and collecting the supernatant.
In other embodiments, the capture molecule is not immobilized on a solid substrate. In a preferred embodiment, the capture molecule is bound to a magnetic particle. For instance, as described in EXAMPLE 1, streptavidin-coated Dynabeads® (Invitrogen M-280) were contacted with the cell lysate at ~1.5×10^7 beads/mL of resuspended nuclei. The mixture was agitated by rotation at 4° C. for 30 minutes to maximize the binding of the streptavidin-coated beads to the biotinylated nuclei. In one embodiment, the biotinylated nuclei are subsequently isolated from the mixture by passing the mixture through a magnetic field at least one time. It is preferred that the mixture be diluted by about ten-fold to lower the concentration. In the embodiment described in EXAMPLE 1, the suspension was passed through a pipette placed in the groove of a MiniMACS™ separator magnet (Miltenyi Biotec, catalog #130-042-102). The suspension was allowed to pass through the pipette at approximately 0.75 mL per minute. The magnetic field captured the bead-bound biotinylated nuclei while allowing the non-biotinylated nuclei and other debris to pass. In some embodiments, the process is repeated by resuspending the isolated nuclei. The pipette is removed from the magnetic field and the nuclei are resuspended by repeatedly drawing NPB or other suitable buffer in and out of the pipette. The process can thus be repeated as described.
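As an illustrative sketch only, the arithmetic implied by the figures given for EXAMPLE 1 (a 1 mL suspension diluted to 10 mL and passed at approximately 0.75 mL per minute) can be written as follows. The function names are hypothetical and are not part of the described method.

```python
def diluted_concentration(stock_per_ml, stock_volume_ml, final_volume_ml):
    """Concentration (per mL) after diluting a stock to a larger final volume."""
    return stock_per_ml * stock_volume_ml / final_volume_ml


def transit_time_min(volume_ml, flow_ml_per_min):
    """Time (minutes) for a given volume to pass the magnet at a given flow rate."""
    return volume_ml / flow_ml_per_min


# Values taken from the description: ~1.5e7 beads in 1 mL, diluted to 10 mL,
# passed through the pipette at approximately 0.75 mL per minute.
beads_per_ml = diluted_concentration(1.5e7, 1.0, 10.0)
minutes = transit_time_min(10.0, 0.75)
print(f"{beads_per_ml:.2e} beads/mL after dilution; about {minutes:.1f} min per pass")
```

Under these figures, a single pass of the diluted suspension takes roughly 13 minutes.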
In some embodiments, the method provided by the present invention further comprises extracting the nucleic acids from the isolated (e.g., tagged or biotinylated) nuclei. Conventional techniques and reagents, including many commercially available kits, are available for extracting DNA and RNA. Isolated nucleic acids are useful for subsequent genomic analyses of the cell type of interest, including analyses of gene expression and chromatin regulation. Illustrative analyses are described in EXAMPLES 2 and 3.
In another aspect, the invention provides a method of visually tagging nuclei in a cell type of interest. The method comprises introducing a vector comprising a nucleic acid sequence encoding a fusion polypeptide into a cell type of interest. The polypeptide comprises (a) a nuclear envelope targeting region and (b) a fluorescent protein. The methods of this aspect of the invention may be carried out using the vectors and kits described herein.
In some embodiments, the vector is a plasmid. In some embodiments, the vector is a viral vector. In a further embodiment, the viral vector is a Lentivirus.
In some embodiments, the cell type of interest is eukaryotic. Eukaryotic cells include fungal, plant, and animal cells. Animal cells include the non-limiting categories: poriferan, cnidarian, platyhelminth, nematode, annelid, mollusk, arthropod, echinoderm and vertebrate cells. In some embodiments, the vertebrate cells are mammalian, such as mouse cells. Additional specific examples of animal groups are described above.
The cell type of interest may be in culture or in vivo within a host organism.
In some embodiments of this aspect, the nuclear envelope targeting region selectively targets the outer nuclear membrane (ONM). For example, in embodiments described in EXAMPLES 6 and 7, a KASH domain was incorporated into an NTF protein. Upon expression, the NTF protein localized to the ONM providing a tag on nuclei that enabled their visualization and isolation.
In some embodiments of this aspect, the nuclear envelope targeting region selectively targets the inner nuclear membrane (INM). For example, in embodiments described in EXAMPLES 6 and 7, a SUN domain was incorporated into an NTF protein. Upon expression, the NTF protein localized to the INM, providing a tag on nuclei that enabled their visualization and isolation. Specific localization of the NTF proteins incorporating the SUN (and KASH) domain to the INM (or ONM) is described in more detail in EXAMPLE 6.
In accordance with this aspect, the cell type of interest can be any cell type for which a functional promoter and nuclear envelope targeting region are known. Thus, the cell type of interest can be from any lineage. For example, in some embodiments, the cell type of interest is a nerve cell.
In some embodiments, the method further comprises isolating the tagged nuclei. For example, the expressed fusion protein can comprise an affinity reagent binding region as described above. The tagged nuclei can be isolated or purified utilizing any of the methods, kits or reagents described above. In some embodiments, the tagged nuclei are isolated under conditions that preserve both the INM and ONM. This can be accomplished, for example, through the use of very mild detergents, or with reagents that omit or lack detergents, as described in EXAMPLE 6. In some embodiments, the nuclei are isolated under conditions that preserve only the INM, as described in EXAMPLE 6.
In some embodiments, the cells are permeabilized prior to isolation of the tagged nuclei. The permeabilization is useful to introduce reagents into the cells that can biochemically manipulate the genomic DNA or chromatin before the cells are lysed and the nuclei are isolated, as described in EXAMPLE 6.
The compounds, kits, and methods of the present invention as described herein are useful for isolating the nuclei from a cell type of interest. In some embodiments, the cells of the cell type of interest exist in a mixture of multiple cell types that exhibit distinct phenotypes and developmental histories. Consequently, the present invention provides a cost-effective and robust alternative to present methods for analyzing genome expression and chromatin regulation in a specific cell type. The method reduces the need for expensive and highly technical equipment, avoids undue manipulation of the biological sample, and results in a highly pure sample of genomic material from a cell type of interest, making the study of cell differentiation and function more accessible.
The following examples merely illustrate the best mode now contemplated for practicing the invention, but should not be construed to limit the invention. All literature citations are expressly incorporated by reference.
This Example describes the development of a method and reagents for isolation of nuclei tagged in specific cell types (INTACT) in the model system Arabidopsis thaliana.
Rationale:
As a proof-of-concept, in this Example, the INTACT system was employed to study the two cell types of the Arabidopsis root epidermis: hair (H) cells and non-hair (NH) cells. These two cell types originate from a common progenitor and make up the entire epidermal layer of the root, arising in alternating vertical cell files along the axis of this organ. The hair cells form long tubular outgrowths that are involved in water and nutrient uptake, anchorage, and interaction with soil microbes, while the non-hair cells do not produce such outgrowths (see Grierson, C., and J. Schiefelbein, “Root hairs,” in The Arabidopsis Book, C. R. Somerville and E. M. Meyerowitz, eds. (Rockville, Md.: American Society of Plant Biologists), 2002). The formation of these cell types has been extensively studied at the genetic and cell biological levels (Ishida, T., et al., “A Genetic Regulatory Network in the Development of Trichomes and Root Hairs,” Annu. Rev. Plant Biol. 59:365-386, 2008), and many genes that are expressed preferentially in each cell type have been identified (Birnbaum et al., Science 2003; Brady, S. M., et al., “A High-Resolution Root Spatiotemporal Map Reveals Dominant Expression Patterns,” Science 318:801-806, 2007; Won, S.-K., et al., “cis-Element- and Transcriptome-Based Screening of Root Hair-Specific Genes and Their Functional Characterization in Arabidopsis,” Plant Physiology 150:1459-1473, 2009), providing a point of comparison for the gene expression studies using the INTACT method, as described in EXAMPLE 2.
Methods:
Constructs and Transgenic Plants for INTACT
The vector used for INTACT, illustrated schematically in
Each of these constructs encoding the NTF protein were co-transformed into Arabidopsis ecotype Col-0 along with a vector comprising a second expression cassette 11 encoding the E. coli biotin ligase 38 (BirA) (the polypeptide of which is set forth herein as SEQ ID NO:12, and is encoded by the nucleic acid sequence set forth herein as SEQ ID NO:11). Expression of the BirA gene was driven from the constitutive ACT2 (At3g18780) promoter (the nucleic acid sequence of which is set forth herein as SEQ ID NO: 15), as described in An, Y. Q., et al., “Strong, Constitutive Expression of the Arabidopsis Act2/Act8 Actin Subclass in Vegetative Tissues,” Plant J. 10:107-121, 1996, incorporated herein by reference. See also Zilberman, D., et al., “Histone H2A.Z and DNA Methylation Are Mutually Antagonistic Chromatin Marks,” Nature 456:125-129, 2008 (see
First-generation double transgenic plants were selfed to produce plants that were homozygous for both the NTF and BirA transgenes. Multiple individual NTF/BirA double transgenic lines showing the expected expression patterns were combined and used in all subsequent experiments.
Plant Growth and Harvesting of Root Tissue
Plants were grown under fluorescent light for 16 hours per day at 22° C. on agar-solidified ½ strength MS media (Murashige, T., and F. Skoog, “A Revised Medium for Rapid Growth and Bioassays With Tobacco Tissue Culture,” Physiol. Plant. 15:473-497, 1962). Plates were kept in a nearly vertical orientation such that the roots grew along the surface of the media. When the plants reached 7 days of age, a 1.25 cm section of the roots, from within the fully differentiated root hair zone but below the position of the first lateral roots, was harvested with a razor blade. This region of root tissue was used in all experiments.
Purification of Biotinylated Nuclei
For each purification, 3 g of root tissue was frozen in liquid nitrogen, ground to a fine powder, and resuspended in 10 mL of nuclei purification buffer (NPB; 20 mM MOPS, 40 mM NaCl, 90 mM KCl, 2 mM EDTA, 0.5 mM EGTA, 0.5 mM spermidine, 0.2 mM spermine, pH=7) containing Roche Complete® protease inhibitors. Nuclear suspensions were then filtered through 70 μm nylon mesh and pelleted at 1000×g for 5 minutes at 4° C. Nuclei were washed with 1 mL of NPB, pelleted again, and finally resuspended in 1 mL of NPB. Twenty-five microliters of Invitrogen M-280 streptavidin-coated Dynabeads® (~1.5×10^7 beads) were added to the nuclear suspensions and this mixture was rotated at 4° C. for 30 minutes to allow binding of beads to the biotinylated nuclei.
The 1 mL suspension of beads and nuclei was diluted to 10 mL volume with NPB containing 0.1% Triton X-100 (NPBt) and drawn into a plastic 10 mL serological pipette. A MiniMACS™ separator magnet (Miltenyi Biotec, catalog #130-042-102) was then used to capture the Dynabeads®-bound nuclei using a flow-based setup, as shown in
Typically, 3 g of tissue yielded 1-3×10^5 nuclei. This amount was used for each RNA isolation or chromatin immunoprecipitation experiment, as described below. Purity and yield of nuclei after purification were determined by staining of total nuclei with DAPI prior to purification and subsequent counting of the number of bead-bound nuclei and unbound nuclei in the purified preparation, considering bead-bound nuclei to be the target nuclei and non-bead-bound nuclei as contaminating nuclei from other cell types.
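The purity calculation described above can be sketched as follows. This is a minimal illustration, not the inventors' procedure: the counts are hypothetical, and expressing yield as the fraction of total input nuclei recovered as bead-bound nuclei is an assumption made here for the example.

```python
def purity_and_yield(bead_bound, unbound, total_input):
    """Return (purity, recovery) as fractions.

    bead_bound  -- bead-bound (target) nuclei counted in the purified preparation
    unbound     -- nuclei without beads (contaminants) in the purified preparation
    total_input -- DAPI-stained nuclei counted before purification
    """
    purified_total = bead_bound + unbound
    if purified_total == 0 or total_input == 0:
        raise ValueError("counts must be non-zero")
    purity = bead_bound / purified_total
    # Assumption: yield is expressed relative to the total input nuclei.
    recovery = bead_bound / total_input
    return purity, recovery


# Hypothetical counts, for illustration only.
purity, recovery = purity_and_yield(bead_bound=190, unbound=10, total_input=2000)
print(f"purity = {purity:.1%}, yield = {recovery:.1%}")
```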
Analysis of Nuclear Tagging Fusion Protein Retention on the Nuclear Surface
Total nuclei were isolated from GL2p:NTF/ACT2p:BirA transgenic roots and were washed twice with nuclei purification buffer (NPB) to test for dissociation of NTF from the non-hair cell nuclei by streptavidin western blotting. Nuclei were initially extracted in 1 mL of NPB and pelleted by centrifugation. Total protein from 10% of this supernatant fraction and 10% of the pelleted nuclei was loaded on a 12% polyacrylamide gel (input). Nuclei were then resuspended in 1 mL NPB with mixing for 5 min, pelleted again, and the wash was repeated. Total protein from 10% of the nuclei from each wash (washed nuclei), and 100% of total protein from each wash supernatant (wash supernatant; prepared by trichloroacetic acid precipitation of protein from the entire supernatant) were loaded on the same gel. Streptavidin western blotting was performed as described below.
For imaging analysis, total nuclei were extracted from the roots of each indicated line and were mixed for 30 minutes with streptavidin-coated Dynabeads®. The same number of total nuclei were used in each case. Nuclei-bead mixtures were then mounted on glass slides and viewed at 20× magnification under a light microscope.
Immunoprecipitation and Western Blotting
Whole cell extracts were prepared from transgenic roots by grinding in liquid N2 and resuspension in 2 volumes of RIPA buffer (50 mM Tris, 150 mM NaCl, 1% NP-40, 0.5% sodium deoxycholate, 0.1% sodium dodecyl sulfate, pH=7.5) containing Roche Complete® protease inhibitors. This extract was cleared by centrifugation to give the input fraction. An aliquot of input was treated with an anti-GFP polyclonal antibody (Santa Cruz Biotechnology, catalog #GFP-FL), followed by incubation with protein A agarose (Millipore, catalog #16-157) to immunoprecipitate the NTF protein. Bead-bound proteins were washed twice for 5 minutes with RIPA buffer and eluted with 2X SDS loading buffer (100 mM Tris, 10% sodium dodecyl sulfate, 30% glycerol, 1% β-mercaptoethanol, 0.2% bromophenol blue, pH=7.5). Input and immunoprecipitated fractions were electrophoresed on a 12% SDS polyacrylamide gel and transferred to a nitrocellulose membrane. The membrane was blocked in PBSt (11.9 mM sodium phosphate, 137 mM NaCl, 2.7 mM KCl, 0.1% Triton X-100, pH=7.4) with 10% milk for 30 minutes, washed twice for 5 minutes with PBSt, and incubated with a 1:2000 dilution of streptavidin-HRP (GE, catalog #RPN1231) in PBSt with 1% BSA for 30 minutes. The membrane was then washed three times for 5 minutes with PBSt and biotinylated proteins were detected using ECL detection reagents (Pierce, catalog #34075).
Fluorescence-Activated Cell Sorting (FACS) of Non-Hair Cell Protoplasts
As a control for comparison, Arabidopsis non-hair cells were isolated from root extracts using fluorescence-activated cell sorting (FACS) according to a methodology previously described by Birnbaum, K., et al., “Cell Type-Specific Expression Profiling in Plants Via Cell Sorting of Protoplasts From Fluorescent Reporter Lines,” Nat. Methods 2:615-619, 2005, incorporated herein by reference in its entirety.
Results:
The present inventors developed a novel system for tagging nuclei using an outer nuclear envelope-tagging fusion (NTF) protein. In the present embodiment, the NTF served as a substrate for biotinylation. As shown in
Fluorescence microscopic examination of the ADF8p:NTF/ACT2p:BirA and GL2p:NTF/ACT2p:BirA transgenic lines showed that both promoters were expressed exclusively in the expected cell type and that the NTF did indeed accumulate on the nuclear envelope (
Furthermore, as shown in
As a further confirmation that the NTF was biotinylated, streptavidin western blotting was performed on whole cell extracts, on anti-GFP immunoprecipitates (IP) from the roots of each transgenic line, and on extracts from a line expressing only ACT2p:BirA. As shown in
To isolate labeled nuclei from hair and non-hair cells, total nuclei from the fully differentiated root hair zone of young seedlings in each transgenic line were extracted and the nuclei were incubated with streptavidin-coated magnetic beads. A simple liquid flow-based system was employed to capture the bead-bound nuclei on a magnet as the solution of bound and unbound nuclei flowed past. This apparatus was constructed from common laboratory supplies and a Miltenyi MiniMACS™ magnet, as diagrammed in
As a control for comparison to the INTACT method, GFP-positive non-hair cell protoplasts were also sorted using fluorescence-activated cell sorting (FACS) according to the methodology previously described by Birnbaum, et al., 2005.
As described, it was also demonstrated that nuclei purified from the ADF8p:NTF/ACT2p:BirA hair cell and GL2p:NTF/ACT2p:BirA non-hair cell lines could be specifically bound by streptavidin-coated magnetic beads, which resisted dissociation even after multiple washes with nuclei purification buffer, as shown in
Discussion:
In order to circumvent the limitations of current methods and to make the study of cell differentiation and function more accessible, a simple and generally applicable method was developed for studying gene expression and chromatin in individual cell types. To avoid the need for dissociating or mechanically separating cells, a strategy was developed to transgenically tag nuclei in specific cell types and then isolate them from the total pool of nuclei derived from a tissue by affinity isolation targeting the tag.
It has been shown that the nuclear and total cellular mRNA pools are generally comparable, making nuclei a reasonable source of mRNA for gene expression measurements (see Barthelson, R. A., et al. “Comparison of the Contributions of the Nuclear and Cytoplasmic Compartments to Global Gene Expression in Human Cells,” BMC Genomics 8:340, 2007; Jacob, Y., et al., “The Nuclear Pore Protein AtTPR Is Required for RNA Homeostasis, Flowering Time, and Auxin Signaling,” Plant Physiol. 144:1383-1390, 2007). Thus, affinity purified nuclei can be used for the measurement of the gene expression and chromatin profiles of individual cell types. The present strategy to achieve this was to express an expression cassette encoding a nuclear tagging fusion (NTF) protein comprising a nuclear envelope targeting sequence, green fluorescent protein (GFP), and the biotin ligase recognition peptide (BLRP), in the presence of E. coli biotin ligase (BirA) in individual cell types (i.e., under the control of a cell-type specific promoter) in order to generate biotinylated nuclei specifically in those cells. These nuclei could then be purified from the total nuclear pool by virtue of the interaction between biotin and streptavidin. This strategy is referred to herein as INTACT, for isolation of nuclei tagged in specific cell types.
The data provided herein demonstrate that the novel INTACT method is easy to perform, does not require sophisticated instrumentation or specialized skills, and can produce large quantities of the desired nuclei at very high purity, in contrast to FACS and LCM-based methods for cell isolation. For example, INTACT provided recovery of >10^5 nuclei at nearly 100% purity, whereas <10% of hair cell-specific protoplasts with only 50% purity were recovered using FACS based on GFP fluorescence (see
Conclusion:
These results demonstrate that the INTACT method results in high yield and high purity of cell-specific nuclei for each cell type tested. The average purity (+/−SD) of the nuclei obtained was found to be 92.8+/−1.6% for hair cell nuclei and 95+/−2.2% for non-hair cell nuclei, which was considerably greater than the purity observed with the use of FACS to isolate GFP-positive protoplasts.
This Example describes gene expression profiling of the INTACT-purified nuclei from hair cells and non-hair cells of the A. thaliana root epidermis, generated as described in EXAMPLE 1.
Rationale:
As described above in EXAMPLE 1, the formation of hair and non-hair cells of the A. thaliana root epidermis has been extensively studied at the genetic and cell biological levels (Ishida et al., 2008), and many genes that are expressed preferentially in each cell type have been identified (Birnbaum et al., 2003; Brady et al., 2007; Won et al., 2009), providing a point of comparison for gene expression studies using the INTACT method.
Methods:
Generation and Purification of Biotinylated Nuclei
Biotinylated nuclei from hair and non-hair cells of A. thaliana were generated and purified as described in EXAMPLE 1.
Gene Expression Profiling Using Nuclear RNA
Total RNA was isolated from purified nuclei (obtained as described in EXAMPLE 1), using the Qiagen RNeasy® Micro kit. RNA was first treated with RNase-free DNase I and then cDNA was prepared and amplified using the Sigma Whole Transcriptome Amplification Kit (Sigma, catalog #WTA2). This synthesis/amplification method begins with a cDNA synthesis using primers with a random 3′ end and defined 5′ end, followed by PCR using primers that match the 5′ end of the primers used for cDNA synthesis. The amplified cDNA was labeled in a random priming reaction using Cy dye-containing random 9mers as directed in the Roche NimbleGen® protocol supplied with the arrays. Sheared genomic DNA was labeled with the complementary Cy dye and was then co-hybridized along with labeled cDNA to a custom-designed Arabidopsis 1.9 million feature tiling array obtained from Roche NimbleGen®, which was described previously (Bernatavichute, Y. V., et al. “Genome-Wide Association of Histone H3 Lysine Nine Methylation With CHG DNA Methylation in Arabidopsis thaliana,” PLoS ONE 3:e3156, 2008). This array covers the entire sequenced portion of the Arabidopsis genome with an isothermal probe design. All array hybridizations and scanning were performed by the Genomics Shared Resource lab at the Fred Hutchinson Cancer Research Center.
Two biological replicates of the experiment were performed for each cell type and the raw log2 ratio data from each of these were processed by conversion to standard deviates on a probe-by-probe basis. An expression score was then calculated for each gene by averaging the log2 ratios of the first 100 exonic probes, starting at the 3′ end of the gene and moving toward the 5′ end. In order to define the set of genes enriched in each cell type, the data sets from each cell type were compared using the program CyberT® (described in Baldi, P., and A. D. Long, “A Bayesian Framework for the Analysis of Microarray Expression Data: Regularized t-Test and Statistical Inferences of Gene Changes,” Bioinformatics 17:509-519, 2001). Within CyberT®, a Bayesian analysis was performed using a window size of 101 and a confidence level of 10. Genes were classified as enriched in a given cell type if they showed a fold difference between cell types of >1.3 and a Bayes p value of <0.02.
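The scoring and classification steps just described can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the function names are hypothetical, the probe values are made up, the conversion to standard deviates is not reproduced, and the fold difference and Bayes p value are taken as inputs (the CyberT® regularized t-test itself is not reimplemented here).

```python
from statistics import mean


def expression_score(exonic_log2_ratios, max_probes=100):
    """Average the log2 ratios of up to the first `max_probes` exonic probes.

    `exonic_log2_ratios` is assumed to be ordered starting at the 3' end of
    the gene and moving toward the 5' end.
    """
    probes = exonic_log2_ratios[:max_probes]
    if not probes:
        raise ValueError("gene has no exonic probes")
    return mean(probes)


def is_enriched(fold_difference, bayes_p, min_fold=1.3, max_p=0.02):
    """Apply the stated thresholds: fold difference > 1.3 and Bayes p < 0.02."""
    return fold_difference > min_fold and bayes_p < max_p


# Hypothetical probe values for one gene, 3' end first.
score = expression_score([0.8, 1.1, 0.9, 1.2])
print(f"expression score = {score:.2f}")
print(is_enriched(fold_difference=1.5, bayes_p=0.01))
```

Genes with fewer than 100 exonic probes are scored here on whatever probes they have; how such genes were handled in the original analysis is not stated.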
Gene Ontology (GO) analysis was performed on each set of cell type-enriched genes using the GeneCodis 2.0 program (Carmona-Saez, P., et al., “GENECODIS: A Web-Based Tool for Finding Significant Concurrent Annotations in Gene Lists,” Genome Biol 8:R3, 2007; Nogales-Cadenas, R., et al., “GeneCodis: Interpreting Gene Lists Through Enrichment Analysis and Integration of Diverse Biological Information,” Nucleic Acids Res. 37:W317-322, 2009) with a hypergeometric test and false discovery rate calculation to correct the p values for multiple testing. The full set of genes present on the array was used as the background set in these analyses. Chi squared tests were also performed on the observed versus expected percentage of genes in selected GO categories.
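For illustration, a hypergeometric enrichment test with multiple-testing correction can be sketched as below. The GeneCodis 2.0 implementation is not reproduced; the Benjamini-Hochberg procedure is assumed here as the false discovery rate calculation, and the function names and example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, drawn, hits):
    """P(X >= hits) when `drawn` genes are sampled without replacement from
    `background` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, drawn)
    total = comb(background, drawn)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return FDR-adjusted p values in the original order (assumed BH method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: 1000 background genes, 50 annotated, 100 in the list, 12 hits.
p = hypergeom_enrichment_p(1000, 50, 100, 12)
print(f"enrichment p = {p:.3g}")
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```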
Comparison of Whole Genome Expression Profiles from Total and Nuclear RNA Pools
Whole-genome expression profiling was performed using total and nuclear RNA pools from the differentiated root hair zone (same root segment used for INTACT purifications) of 7 day old non-transgenic plants. RNA isolated from whole root segments and nuclei was converted to cDNA, amplified, labeled, and hybridized to tiling arrays as described above. The whole genome expression profiles for each RNA source were compared on a scatterplot. A linear trend line was fit to the data to obtain an R value.
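The R value of a linear fit between two expression profiles is the Pearson correlation coefficient, which can be computed as in the following sketch. The function name and the example values are hypothetical; the actual per-gene scores are not reproduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical expression scores for the same genes from two RNA sources.
total_rna = [0.2, 1.1, -0.4, 2.3, 0.9]
nuclear_rna = [0.3, 0.9, -0.2, 2.0, 1.2]
print(f"R = {pearson_r(total_rna, nuclear_rna):.3f}")
```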
qRT-PCR Analysis
Wild type (WT) Col-0 and gl2-8 mutant seedlings (T-DNA insertion line SALK_130213) (Alonso, J. M., et al., “Genome-Wide Insertional Mutagenesis of Arabidopsis thaliana,” Science 301:653-657, 2003) were grown on plates of agar-solidified ½ strength MS as described above, and RNA was prepared from the root hair zone of 7-day-old seedlings using the Qiagen RNeasy® Plant Mini kit. Each RNA sample was treated with RNase-free DNase I and cDNA was prepared using the Superscript® III kit (Invitrogen, catalog #18080-051) with oligo dT primers according to the manufacturer's instructions. Real-time PCR was performed on an Applied Biosystems 7900HT instrument using SYBR green detection chemistry. Relative quantities of each transcript were calculated using the 2^(-ΔΔCt) method (Livak, K. J., and T. D. Schmittgen, “Analysis of Relative Gene Expression Data Using Real-Time Quantitative PCR and the 2(-Delta Delta C(T)) Method,” Methods 25:402-408, 2001) with At1g13320 serving as the endogenous control transcript in each case (Czechowski, T., et al., “Genome-Wide Identification and Testing of Superior Reference Genes for Transcript Normalization in Arabidopsis,” Plant Physiol. 139:5-17, 2005). Primer sequences are given below in TABLE 1.
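The relative quantification of Livak and Schmittgen can be sketched as follows. The Ct values are hypothetical and the function name is illustrative; the calculation assumes approximately 100% amplification efficiency for both the target and the endogenous control transcript.

```python
def relative_quantity(ct_target_sample, ct_control_sample,
                      ct_target_calibrator, ct_control_calibrator):
    """Fold change of a target transcript in a sample relative to a calibrator.

    Each Ct is first normalized to the endogenous control transcript
    (delta Ct), then the calibrator delta Ct is subtracted (delta delta Ct).
    """
    delta_ct_sample = ct_target_sample - ct_control_sample
    delta_ct_calibrator = ct_target_calibrator - ct_control_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: mutant (sample) versus wild type (calibrator).
fold = relative_quantity(
    ct_target_sample=22.0, ct_control_sample=20.0,
    ct_target_calibrator=24.0, ct_control_calibrator=20.0,
)
print(f"relative quantity in mutant vs. WT = {fold:.1f}")
```

With these made-up values the target transcript crosses threshold two cycles earlier in the sample after normalization, corresponding to a four-fold higher relative quantity.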
Results:
Overall, 21 out of the 27 tested genes (78%) were confirmed to have higher expression in gl2-8 roots, as illustrated in TABLE 1. Genes that showed increased expression in the mutant are likely to be true hair cell-specific genes, but those that do not show an increase are not necessarily false positives. Some hair cell-specific transcripts might not have a higher relative abundance in the gl2 mutant because the hair-like cells induced in the mutant may express only a subset of the entire hair cell transcriptome, and some genes that are hair cell-specific (as compared to non-hair cells) may be expressed in other root cell types. The latter scenario could prevent detection of increased abundance in the mutant due to signals arising from other root cell types.
After successfully isolating nuclei from fully differentiated hair and non-hair cells, gene expression profiles of each cell type were measured using nuclear RNA. cDNA was prepared and amplified from the total nuclear RNA of each cell type. The cDNA was Cy dye-labeled and hybridized to Roche NimbleGen® whole-genome tiling microarrays along with fragmented genomic DNA labeled with the complementary Cy dye. Expression scores for the 26,992 annotated genes represented on the array were calculated using data from each of two biological replicates per cell type, and these datasets were then compared. A gene was defined as preferentially expressed in a given cell type if it showed a fold difference between cell types of >1.3 with a Bayes p value of <0.02 (Baldi and Long, 2001). Using these criteria, 946 genes were identified that were enriched in hair cells and 118 genes were identified that were enriched in non-hair cells.
To determine whether the hair and non-hair cell-enriched genes identified by INTACT correspond to genes identified using other methods, the identified cell type-enriched gene lists were compared to those obtained in previous expression studies. Nineteen of 24 confirmed hair cell-specific genes identified by Won et al., 2009, were present in the hair cell-enriched gene list generated by the INTACT method, and none were found in the non-hair gene set. Therefore, most of the previously confirmed hair cell-enriched genes were found using INTACT, and these genes were found throughout the range of expression levels in the present dataset, indicating that INTACT can identify cell type-specific genes regardless of expression level, as shown in TABLE 2.
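The gene list comparison described above amounts to a set intersection. The following sketch uses placeholder identifiers, not the actual genes, and is arranged only to reproduce the reported count of 19 of 24.

```python
def overlap_summary(reference_genes, candidate_genes):
    """Count how many reference genes also appear in a candidate gene list."""
    reference = set(reference_genes)
    shared = reference & set(candidate_genes)
    return len(shared), len(reference), len(shared) / len(reference)


# Placeholder identifiers standing in for the 24 previously confirmed genes.
reference = [f"ref_gene_{i:02d}" for i in range(24)]
# Hypothetical enriched list containing 19 of them plus unrelated genes.
candidates = reference[:19] + ["other_gene_a", "other_gene_b"]

shared, total, fraction = overlap_summary(reference, candidates)
print(f"{shared} of {total} ({fraction:.0%})")  # 19 of 24 (79%)
```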
The INTACT cell type-enriched gene lists were compared to genes identified from earlier studies that performed expression profiling using FACS-purified protoplasts of hair and non-hair cells (Birnbaum et al., 2003; Brady et al., 2007). Only about 20% of the genes previously defined as specific to each cell type were present in the corresponding INTACT gene lists. In addition, only 11 of the 24 confirmed hair cell-specific genes were found in the FACS-based hair cell-enriched gene list. The discrepancies between INTACT and FACS-based expression profiles of each cell type could be attributable to technical differences between the studies, such as cDNA amplification methods, microarray platforms used, and methods for defining cell-type specific expression. However, a major source of variation may also arise from differences in the purity of target cells or nuclei achieved with each of the methods, as described in EXAMPLE 1. While the INTACT method is shown here to give nearly 100% purity of the desired nuclei, in contrast, a published FACS protocol (Birnbaum et al., 2005) was unable to achieve a purity of greater than 50% for hair or non-hair cell protoplasts from the present transgenic lines (see
Another possible explanation for the discrepancies between INTACT and FACS-based expression profiles is that differences in the total and nuclear RNA pools could be prevalent in the tissue used for these experiments. In order to address this issue, whole-genome expression profiling was performed for nuclear and total RNA from the same tissue used for INTACT purification of hair and non-hair cell nuclei.
As an independent measure of the accuracy of the present expression profiles, 27 genes were selected from the hair cell-enriched set and analyzed for expression levels in wild-type and gl2-8 mutant roots. Given that all epidermal cells are converted to hair cells in a gl2 mutant (Di Cristina, M., et al., “The Arabidopsis Athb-10 (GLABRA2) is an HD-Zip Protein Required for Regulation of Root Hair Development,” Plant J. 10:393-402, 1996; Masucci et al., 1996), it is reasoned that true hair cell-specific genes should show higher relative expression levels in gl2-8 roots as compared to wild-type roots. In total, 21 of the 27 genes (78%) tested were found to have a higher relative expression level in gl2-8 roots, and 10 of these 21 were found only in the INTACT hair cell dataset and not in the FACS-based dataset (Brady et al., 2007) (see TABLE 1 above). Expression levels for a representative subset of the tested genes are shown in
While not wishing to be bound by theory, it is hypothesized that the inability to detect increases in expression for 6/27 hair-cell enriched genes has a biological basis, given the high purity of the present cell-type specific population of nuclei obtained by the INTACT method. It is unknown how closely the hair-like cells induced in the mutant resemble normal hair cells in terms of their global gene expression profile. It is possible that these hair-like cells express only a part of the hair cell transcriptome, certainly enough to cause polarized growth and secondary cell wall thickening, but perhaps not all of it. Therefore, genes that are at significantly higher levels in gl2-8 are very likely to be hair cell-specific, but those that do not increase are not necessarily false positives. Furthermore, because the present expression profile comparisons were only between hair and non-hair cells, genes are categorized as hair-cell specific only relative to non-hair cells, but some of these genes might also be expressed in other root cell types. In the case of such genes, an expression increase in the mutant could be obscured by signals from other root cell types.
To test for biological functions known to be associated with the hair and non-hair cell types, each cell type-enriched gene set was analyzed for overrepresentation of Gene Ontology (GO) terms (Ashburner, M., et al., “Gene Ontology: Tool for the Unification of Biology,” The Gene Ontology Consortium. Nat. Genet. 25:25-29, 2000).
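The specification does not state which statistical test was used for the GO term analysis. One common choice, shown here purely as an assumption, is a one-sided hypergeometric test for each GO term. The population size (26,992 annotated genes) and the hair cell-enriched set size (946) come from the text above; the number of genes carrying the term and the number of hits are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: total genes considered
    annotated:  genes in the population carrying the GO term
    sample:     genes in the cell type-enriched set
    hits:       genes in the set carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # math.comb returns 0 when k > n, so impossible terms drop out.
    tail = sum(comb(annotated, k) * comb(population - annotated, sample - k)
               for k in range(hits, upper + 1))
    return tail / total


# Hypothetical GO term: 500 annotated genes, 40 of them in the enriched set.
p = hypergeom_enrichment_p(population=26992, annotated=500, sample=946, hits=40)
print(f"p = {p:.3g}")
```

In practice the resulting p values would be corrected for the number of GO terms tested.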
Discussion:
Gene expression profiling using INTACT-purified hair and non-hair cell nuclei revealed a large number of genes that are preferentially expressed in each of these cell types. Among the genes classified herein as hair cell-enriched, most of the reporter-confirmed hair cell-specific genes were identified. Additionally, increased expression was observed for many of the putative hair cell genes in the gl2-8 mutant roots as compared to wild-type roots. Analysis of overrepresentation of GO terms within the present gene sets revealed genes that were previously characterized as being involved in the specification of each of these cell types. In the case of hair cells, the GO terms analysis revealed an overabundance of genes involved in structural and physiological processes known to be important for the function of this cell type, such as translation, energy generation, cell expansion, vacuole function, and cytoskeletal dynamics. Furthermore, because nuclear and total RNA pools have a very similar composition, and INTACT provides nuclei at nearly 100% purity, the expression profiles generated from INTACT-purified nuclei should accurately represent the transcriptome of the cell type from which they were purified.
Conclusion:
These results demonstrate that the INTACT method results in high yield and purity of cell-specific nuclei populations that are suitable for gene expression analysis across the entire genome. Using the INTACT method, hundreds of genes were identified that are preferentially expressed in hair cells and non-hair cells of A. thaliana root epidermis, including most (19 of 24) of the previously confirmed hair cell-specific genes.
This Example describes chromatin profiling of the INTACT-purified nuclei from hair cells and non-hair cells of the A. thaliana root epidermis.
Rationale:
As described above in EXAMPLE 1, the formation of hair and non-hair cells of the A. thaliana root epidermis has been extensively studied at the genetic and cell biological levels (Ishida et al., 2008), and many genes that are expressed preferentially in each cell type have been identified (Birnbaum et al., 2003; Brady et al., 2007; Won et al., 2009). Additionally, as described in EXAMPLE 2, the use of the INTACT method enabled the identification of 946 genes that were enriched in hair cells and 118 genes enriched in non-hair cells using whole-genome tiling microarrays. These data provide an opportunity to examine the relationship of preferentially expressed genes with chromatin structure.
Methods:
Chromatin Profiling by Chromatin Immunoprecipitation
For chromatin immunoprecipitation (ChIP) experiments, excised root tissue was treated with 1% formaldehyde in NPB for 15 minutes prior to extraction and purification of biotinylated nuclei as described above. The ChIP protocol used herein is based on that of Gendrel et al. (Gendrel, A. V., et al., “Profiling Histone Modification Patterns in Plants Using Genomic Tiling Microarrays,” Nat. Methods 2:213-218, 2005), but was modified for smaller amounts of starting material. Purified nuclei were lysed in 120 μL of nuclei lysis buffer (50 mM Tris, 10 mM EDTA, 1% sodium dodecyl sulfate, pH=8) and sonicated using a Diagenode Bioruptor® to yield chromatin fragments with an average size of ˜500 bp. Sonicated chromatin was cleared by centrifugation and diluted to 1.3 mL final volume with ChIP dilution buffer (16.7 mM Tris, 1.2 mM EDTA, 1.1% Triton X-100, 167 mM NaCl, pH=8). Diluted chromatin was pre-treated with 20 μL (bed volume) of protein A agarose beads (Millipore, catalog #16-157) for 30 minutes at 4° C. and then cleared by centrifugation. This chromatin was then divided into 2-3 aliquots of equal volume and 1-3 μg of antibody was added to each aliquot. The following antibodies were used in the experiments: H3, Abcam ab1791; H3K4me3, Abcam ab8580; H3K27me3, Millipore 07-449. Antibodies were incubated with chromatin at 4° C. overnight on a rocking platform, then 20 μL (bed volume) of protein A agarose beads were added with rocking at 4° C. for an additional 2 hours. Beads were washed once for 5 minutes at 4° C. in 0.5 mL of each of the following buffers: low salt wash buffer (20 mM Tris, 150 mM NaCl, 0.1% sodium dodecyl sulfate, 1% Triton X-100, 2 mM EDTA, pH=8), high salt wash buffer (20 mM Tris, 500 mM NaCl, 1% sodium deoxycholate, 1% NP-40, 1 mM EDTA, pH=8), LiCl wash buffer (10 mM Tris, 250 mM LiCl, 0.1% sodium dodecyl sulfate, 1% Triton X-100, 2 mM EDTA, pH=8), and TE (10 mM Tris, 1 mM EDTA, pH=7.5).
Chromatin was eluted from the beads in 200 μL of elution buffer (100 mM NaHCO3, 1% sodium dodecyl sulfate) with vortexing for 5 minutes, then NaCl was added to 0.5 M and eluted chromatin was heated to 100° C. for 15 minutes to reverse crosslinks. DNA was isolated by treating the chromatin with RNase A and Proteinase K, followed by purification using the Qiagen MinElute® kit. Amplification of ChIP DNA was performed with the Sigma Single Cell Whole Genome Amplification kit (Sigma, catalog # WGA4) as directed, and the amplified material was labeled with Cy3 or Cy5 dye as described above. For each experiment, the H3K4me3 or H3K27me3 ChIP DNA was co-hybridized to the tiling array (same array as used for expression analysis) along with H3 ChIP DNA from the same starting chromatin to normalize for nucleosome occupancy.
Two biological replicates of each ChIP were performed and the log2 ratios from each replicate array were converted to standard deviates, averaged, and smoothed using triangular smoothing as described previously (Ooi, S. L., et al., “A Native Chromatin Purification System for Epigenomic Profiling in Caenorhabditis elegans,” Nucleic Acids Res, 38(4):e26, 2010). These data were used for all analyses. Cluster analysis was performed with Cluster 3 (Eisen, M. B., et al., “Cluster Analysis and Display of Genome-Wide Expression Patterns,” Proc. Natl. Acad. Sci. USA 95:14863-14868, 1998) and results were viewed using Java Treeview 1.1.0 (Saldanha, A. J., “Java Treeview—Extensible Visualization of Microarray Data,” Bioinformatics 20:3246-3248, 2004). End analysis was performed as previously described (Henikoff, S., et al., “Genome-Wide Profiling of Salt Fractions Maps Physical Properties of Chromatin,” Genome Research 19:460-469, 2009), and the analysis of each gene was stopped at the point where another genomic feature (gene or transposable element) was encountered. All microarray data are available from GEO (Accession Number GSE19654).
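The array processing steps described above (conversion of log2 ratios to standard deviates, averaging of replicates, and triangular smoothing) can be sketched as follows. This is an illustrative sketch, not the cited implementation: the window half-width, the use of the population standard deviation, and the handling of array edges are assumptions, and the log2 ratios shown are hypothetical.

```python
from statistics import mean, pstdev


def to_standard_deviates(log2_ratios):
    """Convert one array's log2 ratios to standard deviates (z-scores)."""
    mu = mean(log2_ratios)
    sigma = pstdev(log2_ratios)  # assumption: population standard deviation
    return [(v - mu) / sigma for v in log2_ratios]


def average_replicates(rep_a, rep_b):
    """Average two replicate profiles probe by probe."""
    return [(a + b) / 2.0 for a, b in zip(rep_a, rep_b)]


def triangular_smooth(values, half_width=2):
    """Smooth with a triangular window; weights fall off linearly from the center.

    Near the ends of the profile the window is truncated and renormalized
    (an assumption made for this sketch).
    """
    offsets = range(-half_width, half_width + 1)
    weights = [half_width + 1 - abs(o) for o in offsets]
    smoothed = []
    for i in range(len(values)):
        num = den = 0.0
        for offset, w in zip(offsets, weights):
            j = i + offset
            if 0 <= j < len(values):
                num += w * values[j]
                den += w
        smoothed.append(num / den)
    return smoothed


# Hypothetical log2(ChIP/H3) ratios for consecutive probes on two replicates.
rep1 = [0.1, 0.4, 1.2, 1.5, 0.3, -0.2]
rep2 = [0.0, 0.5, 1.0, 1.7, 0.2, -0.1]
profile = triangular_smooth(
    average_replicates(to_standard_deviates(rep1), to_standard_deviates(rep2)))
print([round(v, 2) for v in profile])
```

Converting each replicate to standard deviates before averaging puts arrays with different dynamic ranges on a common scale.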
Results:
In order to gain insight into the chromatin changes that accompany the differentiation of hair and non-hair cells from a common progenitor, two different histone modifications were profiled in each cell type: the transcription-associated mark trimethylation of H3 lysine 4 (H3K4me3) (Santos-Rosa, H., et al., “Active Genes Are Tri-Methylated at K4 of Histone H3,” Nature 419:407-411, 2002) and the Polycomb silencing-associated mark trimethylation of H3 lysine 27 (H3K27me3) (Nekrasov, M., et al., “Pcl-PRC2 Is Needed to Generate High Levels of H3-K27 Trimethylation at Polycomb Target Genes,” EMBO J. 26:4078-4088, 2007).
Chromatin immunoprecipitation (ChIP) was performed by shearing crosslinked chromatin from purified hair and non-hair cell nuclei to an average size of 500 bp, followed by immunoprecipitation with an antibody against either H3K4me3 or H3K27me3. To normalize for nucleosome occupancy, a sample of each input chromatin was also immunoprecipitated with an antibody against the C-terminus of H3, which should precipitate all nucleosomes irrespective of their post-translational modifications. Each amplified and labeled H3K4me3 or H3K27me3 ChIP DNA was co-hybridized to tiling arrays along with amplified and labeled H3 ChIP DNA from the same input chromatin. Two biological replicates of each ChIP were performed for each of the two cell types.
To visualize the relationship between gene expression and each of the modifications, heat maps were generated by aligning the profiles for each modification at the 5′ and 3′ ends of each annotated gene on the array, and then ranking genes by decreasing expression level in the corresponding cell type.
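The construction of such heat maps can be sketched in two steps: binning probe signals by distance from the 5′ end of each gene (taking strand into account), and ordering the resulting rows by decreasing expression. The function names, bin sizes, coordinates, and values below are hypothetical and serve only to illustrate the alignment and ranking described above.

```python
def align_to_five_prime(probes, five_prime_pos, strand,
                        flank=200, bin_size=100, n_bins=4):
    """Average probe signals in fixed bins relative to a gene's 5' end.

    probes: list of (genomic_position, signal) with integer positions.
    Bins cover offsets from -flank up to -flank + n_bins * bin_size,
    measured in the direction of transcription. Empty bins yield None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for pos, signal in probes:
        # Flip coordinates for minus-strand genes so offsets run 5' to 3'.
        offset = pos - five_prime_pos if strand == "+" else five_prime_pos - pos
        idx = (offset + flank) // bin_size
        if 0 <= idx < n_bins:
            sums[idx] += signal
            counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def rank_rows(aligned, expression):
    """Order heat map rows by decreasing expression level."""
    order = sorted(aligned, key=lambda gene: expression[gene], reverse=True)
    return [(gene, aligned[gene]) for gene in order]


# Hypothetical genes, probe signals, and expression scores.
aligned = {
    "gene_A": align_to_five_prime([(900, 1.0), (1050, 3.0), (1150, 5.0)], 1000, "+"),
    "gene_B": align_to_five_prime([(1100, 1.0), (950, 3.0)], 1000, "-"),
}
expression = {"gene_A": 2.0, "gene_B": 7.5}
for gene, row in rank_rows(aligned, expression):
    print(gene, row)
```

The same binning applies to 3′ end alignment by supplying the 3′ coordinate as the anchor position.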
Regarding the H3K27me3 histone modification,
In order to determine whether differences in the H3K4me3 and H3K27me3 profiles between cell types might correspond to genes that were preferentially expressed in each cell type, each non-hair (NH) cell profile was subtracted from the corresponding hair (H) cell profile. Heat maps were generated from the subtracted profiles for each modification by aligning them at the 5′ ends of genes and ranking each list of cell type-enriched genes based on the fold difference in expression level between the cell types, from largest to smallest. H3K4me3 is enriched at active genes and depleted from inactive genes, and the heat maps show high H3K4me3 levels in coding regions of active relative to inactive genes. Conversely, H3K27me3 is enriched at inactive genes and depleted from active genes, and the heat maps show low levels of H3K27me3 in the coding regions of active relative to inactive genes. Cell type-enriched genes with the largest fold differences between cell types often showed both higher H3K4me3 and lower H3K27me3 levels in the cell type where they were preferentially expressed (
Discussion:
The preferential expression of a gene in one cell type often correlates with major differences between the cell types in the trimethylation of histone H3 at lysines 4 and 27, demonstrating that chromatin differences exist between hair and non-hair cells, which can be readily monitored in nuclei purified using this method. The INTACT method is simple, fast, and should be widely applicable.
Profiling of two histone modifications, H3K4me3 and H3K27me3, in hair and non-hair cell nuclei, showed that it is possible to produce robust and highly reproducible ChIP data from the number of nuclei obtained using INTACT. Both of these histone modifications showed distributions similar to those recently described in Arabidopsis (Oh, S., et al., “Genic and Global Functions for Paf1C in Chromatin Modification and Gene Expression in Arabidopsis,” PLoS Genet. 4:e1000077, 2008; Zhang, X., et al., “Genome-Wide Analysis of Mono-, di- and Trimethylation of Histone H3 Lysine 4 in Arabidopsis thaliana,” Genome Biol. 10:R62, 2009; Zhang, X., et al., “Whole-Genome Analysis of Histone H3 Lysine 27 Trimethylation in Arabidopsis,” PLoS Biol 5:e129, 2007). In addition, it is demonstrated that in each cell type the level of H3K4me3 within a gene decreases with decreasing expression level, whereas the level of H3K27me3 increases with decreasing expression level (
Previous profiling of H3K4me3 and H3K27me3 in Arabidopsis suggested that many plant genes have overlapping regions of H3K4me3 and H3K27me3, as observed in mammalian cells, but because whole plant tissues were used in these experiments it was not clear whether these overlaps were in individual cells or were an artifact of the amalgamation of signals from multiple cell types (Oh et al., 2008; Zhang et al., 2009; Zhang et al., 2007). By profiling chromatin landscapes at cell type-resolution we are able to show that these modifications do indeed coexist in the same cell type, as has been observed in mammalian cells (Bernstein, B. E., et al., “A Bivalent Chromatin Structure Marks Key Developmental Genes in Embryonic Stem Cells,” Cell 125:315-326, 2006; Roh et al., 2006).
A comparison of each histone modification profile by subtraction of the non-hair cell profile from that of the hair cell showed that the largest expression differences between cell types often corresponded to an increase in H3K4me3 and a decrease in H3K27me3 in the cell type showing preferential expression of a given gene. This suggests that a balance between the activities of Trithorax group protein-mediated H3K4 trimethylation and Polycomb group protein-mediated trimethylation of H3K27 is involved in establishing cell type-specific expression. However, many differentially expressed genes showed little difference in histone modification levels between cell types, indicating that there are mechanisms for generating cell type-specific expression that are unrelated to the H3K4me3/H3K27me3 balance.
Conclusion:
These results demonstrate that the INTACT method results in high yield and purity of cell-specific nuclei populations that are suitable for robust and highly reproducible chromatin analysis.
This Example describes the application of the INTACT method to produce and isolate in vivo biotinylated nuclei in germline cells of Caenorhabditis elegans.
Rationale:
As proof of applicability of the INTACT method to non-plant eukaryote organisms, transgenes for a nuclear tagging fusion protein and biotin ligase were co-expressed in germline cells of C. elegans and the resulting nuclei were isolated.
Methods:
Constructs and Transgenic Nematodes for INTACT
Vectors encoding a nuclear tagging fusion (NTF) protein and a biotin ligase were constructed as illustrated schematically in
Ultimately, the full length amino acid sequence of the NTF polypeptide is set forth herein as SEQ ID NO:25, encoded in the vector 10 by the nucleic acid sequence set forth as SEQ ID NO:24. The sequence encoding the fusion protein was operatively linked to the pie-1 promoter 12, which is specific for C. elegans germline cells. Specifically, the pie-1 promoter 12 (SEQ ID NO:20) was disposed in the vector at the 5′ end of the NTF encoding sequence, and the pie-1 3′UTR 12a (SEQ ID NO:21) was disposed in the vector at the 3′ end of the NTF encoding sequence.
In the embodiment illustrated in this example, a separate expression vector was used, comprising a second expression cassette 11 with the gene 24 encoding E. coli biotin ligase (BirA), previously described herein in EXAMPLE 1 (amino acid sequence set forth herein as SEQ ID NO:12, encoded by the nucleic acid sequence set forth herein as SEQ ID NO:11). The nucleotide sequence 24 encoding the BirA ligase was followed on its 3′ end with an optional sequence 22 encoding a visualization tag, specifically GFP. This provides a simple mechanism to confirm expression of the BirA ligase. As illustrated in
Each of the constructs illustrated in
Purification of Biotinylated Nuclei/Immunoprecipitation and Western Blotting
For purification of nuclei, whole worms were frozen in liquid nitrogen, ground into a fine powder, and cells were lysed as previously described in EXAMPLE 1 in reference to plant cells.
Whole cell extracts were prepared, and nuclei were isolated as described in EXAMPLE 1. Fusion protein was immunoprecipitated and electrophoresed as previously described in EXAMPLE 1 in reference to plant cells, except that mCherry fluorescence, instead of the GFP fluorescence described in EXAMPLE 1, was used to detect the presence of tagged nuclei isolated using the miniMACS™ separator magnet.
Results:
The INTACT method was demonstrated herein, in EXAMPLES 1-3, to be effective in causing the biotinylation of nuclei for two plant cell types, facilitating their purification and robust genomic analyses. As described in this example, the INTACT method resulted in transgenically expressed NTF and biotinylated protein being localized in the nuclear envelope of C. elegans cell types of interest. In the embodiment presented in this example, the fusion protein comprised an NPP-9 domain that served as a nuclear envelope targeting region, an mCherry domain that served as a visualization tag region, and a biotin ligase accepting site that served as the affinity reagent binding region. The nucleic acid encoding the NTF protein was expressed under the control of the pie-1 promoter and pie-1 3′UTR sequences, which are specific for gene expression in C. elegans germline cells.
To determine whether transgenically expressed fusion proteins were biotinylated in vivo when co-expressed with biotin ligase (BirA), NTF protein was immunoprecipitated from C. elegans that did or did not also transgenically co-express biotin ligase BirA. The immunoprecipitated NTF protein was blotted and probed using streptavidin-HRP. Referring to
To determine whether the intact nuclei recovered from whole nuclei extractions retained fusion protein on their surface, cells from C. elegans transgenic for the NTF protein and BirA were lysed and intact nuclei were isolated, as described in EXAMPLE 1 in regard to plant cells. Recovered nuclei were stained with DAPI, and visualized for the presence of DNA and mCherry fluorescence.
To determine whether the intact nuclei isolated from transgenic C. elegans lysates were biotinylated (via the NTF polypeptide tag), immunoprecipitates were assessed from C. elegans expressing the fusion protein alone, or co-expressing the fusion protein and BirA. The precipitates were assessed before or after streptavidin “pull-down”, which was accomplished by incubation with streptavidin-coated Dynabeads® and the application of a magnetic field, as described in EXAMPLE 1.
Discussion:
It is demonstrated herein that the NTF protein comprising a nuclear envelope targeting region and biotin accepting site can be selectively expressed in a cell type of interest (germline cells) in live C. elegans. By virtue of the nuclear envelope targeting region, the NTF protein can be incorporated into the nuclear envelope and is retained therein even after cell lysis and isolation of the nuclei. Furthermore, it is demonstrated herein that the nuclei tagged with the NTF are biotinylated when the cells also co-express BirA. Thus, the in vivo biotinylated nuclei can be easily isolated from a cell lysate with high yields and purity for relatively low cost and without highly technical equipment.
Conclusion:
These results demonstrate that the INTACT method, incorporating promoter and nuclear envelope targeting regions for the nematode, C. elegans, results in a high yield and purity of nuclei from the cell type of interest. This confirms that the INTACT method is applicable to animal systems as well as plants.
This Example describes the application of the INTACT method to produce and isolate in vivo biotinylated nuclei in Drosophila melanogaster.
Rationale:
As additional proof of applicability of the INTACT method to non-plant eukaryote organisms, transgenes for a nuclear tagging fusion protein NTF and biotin ligase were co-expressed in D. melanogaster, and the resulting biotinylated nuclei were detected in the specific cell type of interest.
Methods:
Constructs and Transgenic Flies for INTACT
Vectors encoding a nuclear tagging fusion protein and a biotin ligase were constructed, and are illustrated schematically in
The sequence encoding the fusion protein was operatively linked to the twist promoter 12, which is specific for somitic cells in D. melanogaster embryos. Specifically, the twist promoter was disposed in the vector at the 5′ end of the NTF encoding sequence. The sequence of the twist promoter is set forth herein as SEQ ID NO:32.
As described in EXAMPLES 1 and 4, the embodiment illustrated in this example incorporated a separate expression vector comprising a second expression cassette 11 containing the gene 24 encoding the E. coli biotin ligase (BirA) (amino acid sequence set forth herein as SEQ ID NO:12, encoded by the nucleic acid sequence set forth herein as SEQ ID NO:11). As illustrated in
The nucleic acid sequence encoding the fusion protein, shown in
In an alternative approach, the vectors shown in
Nuclei were isolated from embryos transgenic for NTF protein and BirA. Briefly, Drosophila whole embryos were dechorionated with bleach. Nuclear extracts were made by disrupting the cells' plasma membranes in nuclear buffer using a Dounce homogenizer. Nuclei were washed, and collected by centrifugation prior to incubation with beads, as described in EXAMPLE 1. Isolated nuclei samples were treated with DAPI stain, incubated with anti-FLAG antibody, and incubated with streptavidin conjugated to a fluorescent tag. As is commonly known, these treatments can be applied simultaneously or in series, commonly in the order listed herein, under standard conditions known for fluorescent antibodies. Staining/fluorescence was visualized using standard fluorescence microscopy techniques targeting each of the treatments applied. For example, visualization of fluorescence can be performed using any of several different fluorescence microscopes, including Nikon E800, Zeiss LSM Confocal, and Deltavision.
Results:
As demonstrated in this example, the INTACT method successfully resulted in the co-expression of transgenic NTF protein in somitic cells of the D. melanogaster embryos under the control of the Drosophila twist promoter.
In order to verify the localization and retention of the NTF protein in the nuclear envelopes of the D. melanogaster somitic cells, the nucleus isolates were visualized for the presence of DNA, the FLAG epitope, biotinylation, and mCherry fluorescence.
Conclusion:
These results demonstrate that the INTACT method, incorporating promoter and nuclear envelope targeting regions for D. melanogaster, results in the cell-specific tagging of nuclei. This provides further confirmation that the INTACT method is applicable to a variety of animal systems, as well as plants.
This Example describes the application of a nuclear immunopurification method to rapidly and efficiently purify in vitro- and in vivo-labeled nuclei in mice.
In this example, the nuclear tagging protein is an integral membrane protein fused to a fluorescent protein module that allows tagged nuclei to be visualized at any point during the isolation procedure. The tagging protein is easily diversified by the addition of standard affinity reagent binding region tags, thus allowing the user to label multiple genetically distinct types of nuclei in one experiment. The data described herein establish the applicability of the INTACT procedure to enable the isolation of cell-type specific nuclei in mammals. Thus, the INTACT method simplifies the generation of cell-type specific genomic, biochemical, and cell biological data across eukaryotic cells of all lineages.
Rationale:
Many biological problems involve the study of functionally relevant cell types or cellular states. Though there are numerous schemes for defining cellular states, it is widely accepted that cell types are determined by the expression of cell-type specific combinations of proteins, RNAs, and epigenetic modifications of genomes (Arendt, D., “The Evolution of Cell Types in Animals: Emerging Principles From Molecular Studies,” Nat. Rev. Genet. 9:868-882, 2008; Christodoulou, F., et al., “Ancient Animal MicroRNAs and the Evolution of Tissue Identity,” Nature 463:1084-1088, 2010; Hemberger, M., et al., “Epigenetic Dynamics of Stem Cells and Cell Lineage Commitment: Digging Waddington's Canal,” Nat. Rev. Mol. Cell. Biol. 10:526-37, 2009; Zernicka-Goetz, M., et al., “Making a Firm Decision: Multifaceted Regulation of Cell Fate in the Early Mouse Embryo,” Nature Rev. Genet. 10:467-477, 2009). All of these factors can now be studied with high-throughput genomic and proteomic approaches that leverage the power of fully sequenced genomes.
Despite the ever-expanding array of techniques that can be used to analyze genomes, transcriptomes and proteomes, many of these methods are biochemical approaches that require millions of cells to obtain a robust signal. As a result, these genome-scale assays are most easily applied to either homogeneous populations of easily grown tissue culture cells or highly heterogeneous mixtures of cells obtained from whole tissues. A major challenge for the field is the development of techniques for the isolation of specific cell types from heterogeneous tissues or mixtures. The development of in situ measurement technologies solves this problem for some types of measurements (Levsky, J. M., et al., “Single-Cell Gene Expression Profiling,” Science 297:836-840, 2002). Other solutions include FACS sorting of heterogeneous populations of cells or the purification of proteins and their binding partners in a cell type specific manner through the use of various tagging approaches (Shiio, Y. and R. Aebersold, “Quantitative Proteome Analysis Using Isotope-Coded Affinity Tags and Mass Spectrometry,” Nat. Protoc. 1:139-145, 2006; Morin X., et al., “A Protein Trap Strategy to Detect GFP-Tagged Proteins Expressed From Their Endogenous Loci in Drosophila,” Proc. Natl. Acad. Sci. 98:15050-15055, 2001; Clyne P. J., et al., “Green Fluorescent Protein Tagging Drosophila Proteins at Their Native Genomic Loci With Small P Elements,” Genetics 165:1433-1441, 2003; Quiñones-Coello A. T., et al., “Exploring Strategies for Protein Trapping in Drosophila,” Genetics 175:1089-1104, 2007; Buszczak M., et al., “The Carnegie Protein Trap Library: a Versatile Tool for Drosophila Developmental Studies,” Genetics 175:1505-1531, 2007; Huh W., et al., “Global Analysis of Protein Localization in Budding Yeast,” Nature 425:686-691, 2003).
A method for the isolation of intact nuclei from a specific cell type is desirable for many reasons. First, the chromatin of isolated nuclei maintains much of its structure even when the outer cellular membrane is destroyed and the details of this structure can be probed by a variety of enzymatic manipulations. Examples include the classical nuclease mapping methods that have been used for many years as a means to position transcriptional enhancers, promoters and other important genomic structures (Enver, T., et al., “Simian Virus 40-Mediated Cis Induction of the Xenopus Beta-Globin DNase I Hypersensitive Site,” Nature 318:680-3, 1985; Richard-Foy, H. and G. L. Hager, “Sequence-Specific Positioning of Nucleosomes Over the Steroid-Inducible MMTV Promoter,” EMBO J. 6:2321-2328, 1987; Weintraub, H., and M. Groudine, “Chromosomal Subunits in Active Genes Have an Altered Conformation,” Science 193:848-856, 1976; Wu C., “The 5′ Ends of Drosophila Heat Shock Genes in Chromatin Are Hypersensitive to DNase I,” Nature 286:854-860, 1980). These techniques can be successfully expanded to whole genome resolution as a result of fully sequenced genomes and high-throughput analytical technologies, such as DNA microarrays and single molecule sequencing (Barski, A., et al., “High-Resolution Profiling of Histone Methylations in the Human Genome,” Cell 129:823-837, 2007; Bernstein B. E., et al., “Genomic Maps and Comparative Analysis of Histone Modifications in Human and Mouse,” Cell 120:169-181, 2005; Boyle A. P., et al., “High-Resolution Mapping and Characterization of Open Chromatin Across the Genome,” Cell 132:311-322, 2008; Core, L. J., et al., “Nascent RNA Sequencing Reveals Widespread Pausing and Divergent Initiation at Human Promoters,” Science 322:1845-1848, 2008; Crawford, G. E., et al., “DNase-chip: A High-Resolution Method to Identify DNase I Hypersensitive Sites Using Tiled Microarrays,” Nat. Methods 3:503-509, 2006; Heintzman N. D., et al., “Distinct and Predictive Chromatin Signatures of Transcriptional Promoters and Enhancers in the Human Genome,” Nat. Genet. 39:311-318, 2007; Henikoff, S., et al., “Genome-Wide Profiling of Salt Fractions Maps Physical Properties of Chromatin,” Genome Res. 19:460-469, 2009; Ren B., et al., “Genome-Wide Location and Function of DNA Binding Proteins,” Science 290:2306-9, 2000; Sabo, P. J., et al., “Genome-Scale Mapping of DNase I Sensitivity In Vivo Using Tiling DNA Microarrays,” Nat. Methods 3:511-518, 2006). Second, some structurally complex tissues or cell types are difficult to isolate with current technology. For example, structurally complex neurons are difficult to isolate without damaging the outer membrane. This makes FACS sorting of whole cells a challenge. The ability to isolate neuron-specific nuclei would simplify efforts to study specific neuronal sub-types.
As described in EXAMPLES 1-5, the INTACT method was successfully applied to isolate cell-type specific nuclei in plants (A. thaliana; EXAMPLES 1-3), nematodes (C. elegans; EXAMPLE 4), and insects (D. melanogaster; EXAMPLE 5). This example further demonstrates that the INTACT method can be applied to mammalian cells to isolate cell-type specific nuclei. Specifically, this example describes methods and constructs that take advantage of the relative stability of isolated nuclei and permit isolation of the organelle from a specific homogeneous cell type. In the described nucleus immunopurification method, purified populations of genetically tagged nuclei are isolated on magnetic beads. To perform the immunopurification, a genetically encoded tag was developed that is positioned on the outside of the nucleus. The tag is a fusion protein where either GFP or tdTomato is fused to the nuclear integral membrane proteins Sun-1 or Nesprin-3 (Crisp M., et al., “Coupling of the Nucleus and Cytoplasm: Role of the LINC Complex,” J. Cell Biol. 172:41-53, 2006; Wilhelmsen, K., et al., “Nesprin-3, A Novel Outer Nuclear Membrane Protein, Associates With the Cytoskeletal Linker Protein Plectin,” J. Cell Biol. 171:799-810, 2005; Haque, F., et al., “SUN1 Interacts With Nuclear Lamin A and Cytoplasmic Nesprins to Provide a Physical Connection Between the Nuclear Lamina and the Cytoskeleton,” Mol. Cell. Biol. 26:3738-3751, 2006). Thus, nuclei can be tracked through the entire procedure, the integrity of the chromatin is preserved throughout the isolation process, multiple distinct classes of nuclei can be isolated in one experiment, and the method is simple to execute. As described supra, the only requirements for the broad applicability of the technique are 1) a cell type that will accept a transgene and for which a nuclear envelope targeting sequence is known, and 2) a promoter that can drive the expression of the nuclear tagging protein in the cell population of interest.
Methods:
Antibodies
GFP (Invitrogen A11122), MYC (Abcam ab9106), FLAG (Sigma F7425), HSV (Sigma H6030), VSV-G (Sigma V4888), HA (Abcam ab71113), AU1 (Abcam ab3401), and V5 (Sigma V8137).
DNA Constructs
A polynucleic acid encoding a synthetic polypeptide linker with the sequence LAAASGGGGSGGGGSLAAASEFSAAALSGGGGSGGGGSAAAL (SEQ ID NO:89), was inserted into the Nesprin-3 reading frame between amino acids 907 and 908 of the unmodified amino acid sequence. The unmodified amino acid sequence of Nesprin-3 has the Genbank Accession No. NP_001036164.1, incorporated herein by reference, and is set forth herein as SEQ ID NO:91. The unmodified Nesprin-3 polypeptide is encoded by a nucleic acid that has the Genbank Accession No. NM_001042699.1, incorporated herein by reference, and is set forth herein as SEQ ID NO:90. The same cassette was placed between amino acid 913 and the stop codon of the Sun-1 polypeptide sequence. The unmodified Sun-1 amino acid sequence has the Genbank Accession No. NP_077771.1, incorporated herein by reference, and is set forth herein as SEQ ID NO:93. The unmodified Sun-1 polypeptide is encoded by a nucleic acid that has the Genbank Accession No. NM_024451.1, incorporated herein by reference, and is set forth herein as SEQ ID NO:92. A polynucleic acid encoding two copies of the super-folder GFP variant was then cloned into the centrally located EcoRI site of the linker (corresponding to the amino acids EF in the linker, underlined in the recitation above). The super-folder GFP variant is described in Pedelacq, J-D., et al., “Engineering and Characterization of a Superfolder Green Fluorescent Protein,” Nat. Biotech. 24:79-88, 2005, which is expressly incorporated herein by reference in its entirety. The Sun-tdTomato constructs used the same linker strategy except that the incoming fluorescent protein carried a restriction site at its 3′ end that allowed the addition of various C-terminal epitope tags. Epitope tags were multimerized as follows: 3XMYC, 4×HA, 3X FLAG, 3XVSVg, 2XV5, 3X HSV, and 4×AU1, the nucleic acid and amino acid sequences of which are standard and well-known in the art.
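As a quick consistency check on the linker design, the following Python sketch locates the EF dipeptide (the residues encoded by the EcoRI recognition sequence GAATTC) within SEQ ID NO:89 and counts the linker residues on each side of it. The helper name `linker_flanks` is illustrative only and is not part of the described constructs.

```python
# Linker sequence as recited above (SEQ ID NO:89).
LINKER = "LAAASGGGGSGGGGSLAAASEFSAAALSGGGGSGGGGSAAAL"


def linker_flanks(linker, site="EF"):
    """Return (residues N-terminal, residues C-terminal) of the insertion site."""
    if linker.count(site) != 1:
        raise ValueError("expected exactly one insertion site in the linker")
    start = linker.index(site)
    return start, len(linker) - (start + len(site))


left, right = linker_flanks(LINKER)
print(left, right)
```

Under this reading, the EF site sits at the center of the 42-residue linker with 20 linker residues on each side, which is consistent with the observation reported in the Results that fluorescence is lost when fewer than 10 linker residues flank the fluorescent protein.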
Lentivirus Production
Lentivirus was produced in transfected 293/T17 cells using a third generation production scheme (Hanawa, H., et al., “Efficient Gene Transfer Into Rhesus Repopulating Hematopoietic Stem Cells Using a Simian Immunodeficiency Virus-Based Lentiviral Vector System,” Blood 103:4062-4069, 2004). After media harvest, the supernatant was concentrated first on a Vivacell 100 (Sartorius) concentrator and then by ultracentrifugation for 3 hours at 100,000×g. Viruses were not titered. As indicated in the text, Synapsin, Murine Stem Cell Virus (MSCV), and Cytomegalovirus (CMV) promoters were used to drive expression.
Assay Systems
Cos, HeLa, 293, and N2a cells were transfected by the Fugene method (Roche). Transfected, detergent-permeabilized cells were processed for immunohistochemistry using standard techniques. Rat primary hippocampal cultures were electroporated using the Amaxa-Nucleofector system (Lonza) at P0. Primary cultures were virus infected at P3-P4 using 1:100-1:200 dilutions of concentrated lentivirus. 500 nl of concentrated lentivirus was infused into the striatum of isoflurane-anesthetized 8-week-old C57BL/6 male mice using an Angle Two Stereotaxic system (myNeuroLab) at −1.89 ML, 0.50 AP, −4.00 DV (Bregma=0). Brains were processed using standard cryo-histological methods.
Magnetic Bead Preparation
The following conditions are per immunopurification reaction. 150 μl (4.5 mg) of Protein G Dynabeads (Invitrogen 100.03D) were concentrated on a magnetic stand and resuspended in 600 μl of PBS/0.1% Tween20 containing 10-15 μg of purified antibody. 500 μl (5 mg) of Sheep Anti-Rabbit Dynabeads (Invitrogen 112.03D) were washed 3X in PBS/0.5% BSA and resuspended in 600 μl of PBS/0.5% BSA containing 10-30 μg of purified antibody. 250 μl (1.5 mg) of Biotin Binder Dynabeads (Invitrogen 110.47) were washed 3X in PBS/0.5% BSA and resuspended in 600 μl of PBS/0.5% BSA containing 5-15 μg of purified biotinylated antibody. The antibody was adsorbed to the bead for a minimum of 15 minutes at room temperature or indefinitely at 4° C. with constant agitation. After the completion of the binding reaction, the beads were washed 2-3X in the binding buffer minus antibody and resuspended in 500 μl of the immunopurification buffer.
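The bead quantities above can be restated as implied stock concentrations and antibody loading ranges. The following Python sketch does only that arithmetic on the volumes, masses, and antibody amounts stated above; the dictionary layout and function names are assumptions made for illustration, and the derived values are not taken from manufacturer specifications.

```python
# Per-reaction conditions from the text:
# (bead volume in microliters, bead mass in mg, antibody range in micrograms)
BEADS = {
    "Protein G": (150, 4.5, (10, 15)),
    "Sheep Anti-Rabbit": (500, 5.0, (10, 30)),
    "Biotin Binder": (250, 1.5, (5, 15)),
}


def stock_mg_per_ml(volume_ul, mass_mg):
    """Bead stock concentration implied by the stated volume and mass."""
    return mass_mg * 1000 / volume_ul


def antibody_ug_per_mg(mass_mg, antibody_range_ug):
    """Antibody loading range per mg of beads."""
    low, high = antibody_range_ug
    return low / mass_mg, high / mass_mg


for name, (vol, mass, ab) in BEADS.items():
    low, high = antibody_ug_per_mg(mass, ab)
    print(f"{name}: {stock_mg_per_ml(vol, mass):.0f} mg/ml, "
          f"{low:.1f}-{high:.1f} ug antibody per mg beads")
```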
Immunopurification of Nuclei
10⁶-10⁷ cells were swelled in 1 ml 10 mM β-Glycerophosphate pH 7, 2 mM MgCl2, 1% Tween40 for 5 minutes on ice (Philpot, J. S., and J. E. Stanier, “The Choice of the Suspension Medium for Rat-Liver-Cell Nuclei,” Biochem. J. 63:214-223, 1956). After the addition of an equal volume of dH2O, the incubation was continued for 5 minutes on ice (Cocco, L., et al., “Inositides in the Nucleus: Presence and Characterization of the Isozymes of Phospholipase β Family in NIH 3T3 Cells,” Biochim. Biophys. Acta. 1438:295-299, 1999). The suspension was then Dounce homogenized and equilibrated with an equal volume of 120 mM β-Glycerophosphate pH 7, 2 mM MgCl2, 10-80% Glycerol. Nuclei were pelleted through a two-step sucrose cushion at 1000×g for 10 minutes at 4° C. The lower cushion was 500 mM Sucrose, 2 mM MgCl2, 25 mM KCl, 65 mM β-Glycerophosphate pH 7, 5-40% Glycerol. The upper cushion was 340 mM Sucrose, 2 mM MgCl2, 25 mM KCl, 65 mM β-Glycerophosphate pH 7, 5-40% Glycerol. 5% Glycerol is standard, but higher levels can be used. All solutions contain β-mercaptoethanol at 1 mM, sodium butyrate at 5 mM, and PMSF at 1 mM.
Whole tissue was disrupted using a Potter Elvehjem homogenizer in 250 mM Sucrose, 2 mM MgCl2, 25 mM KCl, 65 mM β-Glycerophosphate pH 7. The sample was filtered through a 40 μm mesh, and brought to 0.5% NP40 and homogenized with another 4-6 tractions when nuclei containing only the INM (Inner Nuclear Membrane) were desired. To isolate nuclei containing both the ONM (Outer Nuclear Membrane) and INM, the sample was first filtered as above and then Dounce (tight pestle B) homogenized until nuclei were liberated. The lysate was then layered over a two-step sucrose cushion as previously described.
Pelleted nuclei were gently resuspended in immunopurification buffer: 340 mM Sucrose, 2 mM MgCl2, 25 mM KCl, 65 mM β-Glycerophosphate pH 7, 5% Glycerol (lacking β-mercaptoethanol). Nuclei were then added to an equal volume of magnetic beads in the same buffer. The beads were in 5-10 fold excess over total nuclei. The binding reaction was run at 4° C. for 20 minutes with constant agitation. It was essential that the immunopurification mixture fill the reaction vessel because the presence of any air in the tube during the incubation may have caused the nuclei to clump, thus reducing the efficacy of the immunopurification. Immunoadsorbed nuclei were washed using a magnetic stand 5 times as follows: 1×5 ml immunopurification buffer, 4×1 ml immunopurification buffer. Adsorbed nuclei were then treated with micrococcal nuclease in 15 mM HEPES pH 7.5, 1 mM KCl, 2 mM MgCl2, 1 mM CaCl2, 340 mM Sucrose.
An alternate procedure involved first the permeabilization of 10⁶-10⁷ cells in 35 mM Hepes pH 7, 5 mM K2HPO4, 80 mM KCl, 5 mM MgCl2, 0.5 mM CaCl2, 50 μg/ml lysolecithin for 1 minute at room temperature, followed by enzymatic treatment (DNaseI or Micrococcal Nuclease) in 35 mM Hepes pH 7, 5 mM K2HPO4, 80 mM KCl, 5 mM MgCl2, 2 mM CaCl2 (Pfeifer, G. P., and A. D. Riggs, “Chromatin Differences Between Active and Inactive X Chromosomes Revealed by Genomic Footprinting of Permeabilized Cells Using DNase I and Ligation-Mediated PCR,” Genes Dev. 5:1102-1113, 1991). After appropriate washes, the aforementioned nuclear isolation protocol was used to harvest nuclei for the immunopurification.
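The successive equal-volume additions in the swelling and equilibration steps change the buffer composition at each stage. The following Python sketch works through that dilution arithmetic for the 10% glycerol case, assuming simple additive volumes and ignoring the cells and the minor additives; the function and key names are illustrative and do not appear in the protocol.

```python
def combine(vol_a, sol_a, vol_b, sol_b):
    """Mix two solutions; return (total volume, component concentrations)."""
    total = vol_a + vol_b
    components = set(sol_a) | set(sol_b)
    mixed = {
        name: (vol_a * sol_a.get(name, 0.0) + vol_b * sol_b.get(name, 0.0)) / total
        for name in components
    }
    return total, mixed


# Compositions taken from the text (mM, or percent where noted).
swelling = {"beta_GP_mM": 10.0, "MgCl2_mM": 2.0, "Tween40_pct": 1.0}
water = {}
equilibration = {"beta_GP_mM": 120.0, "MgCl2_mM": 2.0, "glycerol_pct": 10.0}

# 1 ml swelling buffer plus an equal volume of water.
vol, sol = combine(1.0, swelling, 1.0, water)
# Then an equal volume of the equilibration buffer.
vol, sol = combine(vol, sol, vol, equilibration)

for name in sorted(sol):
    print(name, sol[name])
```

On these assumptions the lysate loaded onto the cushion is about 62.5 mM β-Glycerophosphate and 5% glycerol in 4 ml, which is close to the 65 mM β-Glycerophosphate and the standard 5% glycerol of the sucrose cushions.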
Nucleosome Extraction
10⁶ bead-bound or unbound nuclei were digested with 12.5 units of Micrococcal nuclease (Worthington) at 37° C. for 15 minutes in 15 mM Hepes pH 7, 1 mM KCl, 5 mM MgCl2, 2 mM CaCl2, 340 mM Sucrose. The reaction was terminated by the addition of 5 mM EGTA and nucleosomes were extracted on ice by a 50-400 mM NaCl series in 15 mM Hepes pH 7, 1 mM KCl, 5 mM MgCl2, 2 mM EGTA, 340 mM Sucrose (Henikoff, S., et al., “Genome-Wide Profiling of Salt Fractions Maps Physical Properties of Chromatin,” Genome Res. 19:460-469, 2008; Sanders, M. M., “Fractionation of Nucleosomes by Salt Elution From Micrococcal Nuclease-Digested Nuclei,” J. Cell Biol. 79:97-109, 1978). Each extraction reaction was for 20 minutes.
Bead-Nuclei Imaging
Following each nuclear-immunopurification experiment one third of the input, bound and combined supernatant/wash material was loaded into an 8-well Lab-Tek chamber slide. After the nuclei and bead-nuclei complexes settled to an even monolayer, photomicrographs were taken at low magnification (4-10×) with a standard epifluorescence equipped microscope.
Results:
The Tagging Strategy
A great deal is known about the structural network that anchors the nucleus into the cytoskeleton of a eukaryotic cell. The outer nuclear membrane (ONM) is traversed by a family of single pass integral membrane proteins that contain a conserved KASH (Klarsicht, ANC-1, Syne Homology) domain that functions as a nuclear envelop targeting domain (see
The KASH domain interacts with the SUN domain within the lumen (L) of the nuclear double lipid bilayer (
The precise locations of the fusion protein junctions were determined by trial and error in the case of Nesprin-3. Ultimately, it was determined that a position between the transmembrane and C-terminal-most spectrin domain was the best location for the insertion of a tag. Moreover, GFP fluorescence is undetectable in fusions where the insertion is bounded by fewer than 10 linker amino acids on either side of the fluorescent protein. Null mutations in the C. elegans SUN homolog UNC-84 are fully rescued by C-terminal UNC-84-GFP fusions. Therefore, Sun-1 was fused to GFP and tdTomato in the exact same manner (Malone, C. J., et al., “UNC-84 Localizes to the Nuclear Envelope and Is Required for Nuclear Migration and Anchoring During C. elegans Development,” Development 126:3171-3181, 1999). Neither Sun-1 nor Nesprin-3 was truncated because there is evidence in the literature that such manipulations lead to dominant negative activity (Crisp M., et al., “Coupling of the Nucleus and Cytoplasm: Role of the LINC Complex,” J. Cell Biol. 172:41-53, 2006). However, in a subsequent experiment, described below in EXAMPLE 7, a KASH domain family protein from D. melanogaster was truncated by the first 164 amino acids and retained tagging function with apparently healthy cells.
Cellular Localization of Nuclear Tags
In preliminary expression tests using various transformed tissue culture cell lines (COS, HeLa, 293, and N2a), it was clearly evident that all of the tested nuclear tagging fusion proteins (Nesprin-3-2XGFP, Sun-2XGFP and Sun-tdTomato-3XMYC) localized properly to the periphery of the nuclear envelope. Specifically, DsRed1 was co-expressed with the Nesprin-3-2XGFP and Sun-2XGFP tags, and GFP was co-expressed with Sun-tdTomato. The CMV promoter was used to drive expression in all cells. Image acquisition was at 24 hours post-transfection using an IX81 Olympus Disk Spinning Unit Confocal microscope. The resulting cells were observed for fluorescence of the Nesprin or Sun tags alone, and for a merger of fluorescence of the tags and the co-expressed reporter images. Analysis revealed clear and distinct labeling of the nuclear envelope for each nuclear tag construct in each cell line tested (not shown).
Furthermore, cell division is not required for proper localization of the nuclear tags. In this regard, post-mitotic neurons in rat primary hippocampal cultures received Sun-1 nuclear tagging proteins incorporating either 2xGFP or tdTomato visualization tags. Alternatively, cells were made to express an alternative polypeptide incorporating LacZ coupled with a nuclear localization sequence and a GFP visualization tag. The expression of the tagging proteins was driven by the CMV promoter. The primary neurons were transformed via electroporation. The resulting fluorescence patterns are illustrated in
Next, it was determined whether the fusion tags localized to the correct nuclear membrane through selective permeabilization of tagged cells with the detergents Triton X-100 and Digitonin. In this regard, moderate levels of Triton X-100 will permeabilize all cellular membranes, whereas only the outer nuclear bilayer (ONM) is disrupted by low levels of Digitonin. See
Purification of Nuclei
An important consideration in the development of a nuclear-immunopurification procedure is that clumped nuclei cannot be used in the assay. In general, the inclusion of glycerol in many of the isolation buffers is advantageous to prevent aggregation (Philpot, J. S., and J. E. Stanier, “The Choice of the Suspension Medium for Rat-Liver-Cell Nuclei,” Biochem. J. 63:214-223, 1956). Thus, a β-Glycerophosphate based buffer system was used.
A second consideration is that clearly the differential localization of the Sun and Nesprin-based nuclear tagging proteins requires an isolation procedure that selectively preserves the architecture of the nuclear membranes. For the analysis of the present example, it was possible to isolate nuclei containing the INM only, or retaining both the ONM and the INM, from in vivo tissue sources. For example,
A third issue is based on the finding that nuclei are very difficult to immunoprecipitate from crude cellular lysates. Thus, a procedure was developed that involves two steps: 1) the bulk purification of the organelle by density based sedimentation through high concentration sucrose, and 2) the selective immunopurification of tagged nuclei. Molecular manipulations of the nuclei can be performed before or after the immunopurification.
Nuclear-Immunopurification
The effectiveness of bead-conjugated anti-GFP (or anti-epitope tag) antibody to appropriately isolate and/or purify the tagged nuclei was assessed. A 1:1 mixture of Sun-2XGFP and Sun-tdTomato-3XMYC tagged COS cell nuclei was prepared from transfected cells. Nuclear-immunopurification using either anti-GFP-Dynabeads or anti-MYC-Dynabeads effectively separated the differentially tagged nuclei. In both experiments the beads (~10⁷) were in 10-fold excess to nuclei (~10⁶). Bound beads were washed 5 times, as described in the Methods section, and the total wash material was combined with that obtained from the supernatant of the immunopurification reaction. Magnetic beads pre-loaded with an anti-GFP antibody were observed to effectively demix a mixture of Sun-GFP and Sun-tdTomato-3XMYC tagged COS cell nuclei (not shown). The converse experiment produced concordant results: magnetic beads pre-loaded with an anti-epitope tag (MYC) antibody were observed to effectively demix a mixture of Sun-GFP and Sun-tdTomato-3XMYC tagged COS cell nuclei (not shown).
Furthermore, the variations of the Sun-tdTomato tag were generated to independently incorporate the epitope tags, HA, AU1, FLAG, HSV, V5, and VSV-G. Bead-bound antibodies against each epitope tag were assessed for the ability to appropriately isolate and/or purify the tagged nuclei from a 1:1 mixture of Sun-2XGFP and Sun-tdTomato-3X[epitope tag] tagged COS cell nuclei, as described above. Beads carrying antibody against each of the epitope tags effectively immunopurified the corresponding tagged nucleus with little if any enrichment for a control GFP-tagged nucleus included in the binding reaction (not shown). No differences in the stability or localization of the various Sun-tdTomato epitope tagged proteins were detected.
Nuclei to bead titrations were performed. A 1:1 mixture of Sun-2XGFP and Sun-tdTomato-3XMYC tagged COS cell nuclei was subjected to nuclear-immunopurification using either anti-GFP-Dynabeads or anti-MYC-Dynabeads. The combined immunopurification supernatant and washes for the corresponding nuclear-immunopurification experiment were observed. After the 1:1 mixture was generated, it was diluted to 1:5, 1:10, and 1:20. The 1:1, 1:5, 1:10 and 1:20 columns represent binding reactions containing 1×10⁷, 0.2×10⁷, 0.1×10⁷, and 0.05×10⁷ nuclei per 1×10⁷ beads. At higher dilutions, some cross-reactivity was observed between anti-GFP polyclonal antibodies and tdTomato in the “bound” sample (after the wash was removed). This cross-reactivity becomes problematic when dealing with non-saturating levels of nuclei. As indicated, the anti-GFP polyclonal used in this study inefficiently detects tdTomato. Thus, in practice, single tag experiments can be performed with either the red or green fluorescent tags; however, double label experiments are best performed using the Sun-tdTomato-epitope tag variants.
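The titration series can be restated as nuclei per reaction and the corresponding bead excess. The following Python sketch is a minimal illustration of that arithmetic using the bead and nuclei numbers given above; the names `nuclei_per_reaction` and `bead_excess` are hypothetical.

```python
BEADS_PER_REACTION = 1e7      # beads per binding reaction, from the text
UNDILUTED_NUCLEI = 1e7        # nuclei in the undiluted (1:1 column) reaction
DILUTION_FOLDS = {"1:1": 1, "1:5": 5, "1:10": 10, "1:20": 20}


def nuclei_per_reaction(fold):
    """Nuclei present after diluting the undiluted mixture by `fold`."""
    return UNDILUTED_NUCLEI / fold


def bead_excess(fold):
    """Beads available per nucleus at a given dilution."""
    return BEADS_PER_REACTION / nuclei_per_reaction(fold)


for label, fold in DILUTION_FOLDS.items():
    print(label, nuclei_per_reaction(fold), bead_excess(fold))
```

Read this way, the 1:10 column corresponds to the 10-fold bead excess used in the separation experiments described above, and the 1:20 column to the least saturated condition in the series.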
It is apparent that the immunopurification protocol can be performed with a variety of magnetic beads. However, the preferred system is a Protein G coupled Dynabead that is adsorbed to the antibody of interest prior to the actual capture. A second option is to use preadsorbed Sheep Anti-Rabbit Dynabeads. A third option involves the biotinylation of a primary antibody coupled with the use of Streptavidin Dynabeads. The first option is preferred for this example simply because the adsorption protocol is rapid and there is no observed need to block the beads before the immunopurification.
Manipulating Nuclei
After nuclear-immunopurification, a downstream manipulation can be performed on bead-bound nuclei. See
The chromatin of bead bound nuclei was successfully digested with micrococcal nuclease (
Finally, it was observed that Dynabeads are weak ion exchangers and bind DNA at low (˜50 mM) salt concentrations. Thus, digested nucleosomes were inefficiently released from the bead-nucleus complex at low (<100 mM) levels of salt (compare
Conclusion:
In conclusion, a generalized scheme for the isolation of genetically tagged nuclei is demonstrated. The only requirement for the nuclear-immunopurification system is that the target cell be genetically taggable. One advantage of this strategy is that the nucleus is effectively coated with magnetic beads. This inhibits clumping and lysis by avoiding the use of centrifugation steps. Thus, one advantageous application of this technology is that multi-step procedures that include the magnetic bead coating provided by nuclear-immunopurification maintain the nuclear structure during lengthy manipulations.
The data presented in this example demonstrate that the nuclear tagging approach of the INTACT method can be successfully adapted and applied in vivo to mice. Thus, the INTACT method can be applied to eukaryotic cells of any lineage, including mammalian cells. Use of cell-type specific promoters permits the production of cell-type specific genomic data. A nuclear tag can be introduced through the traditional transgenic approach, or, as shown in this example for mice, through a faster route such as a viral vector. This approach can be easily coupled with numerous analytical techniques, such as chromatin immunoprecipitation (ChIP) (Barski, A, et al., “High-Resolution Profiling of Histone Methylations in the Human Genome,” Cell 129:823-837, 2007; Ren B., et al., “Genome-Wide Location and Function of DNA Binding Proteins,” Science 290:2306-9, 2000, and as illustrated above in Example 3), DNaseI hypersensitivity (Boyle A. P., et al., “High-Resolution Mapping and Characterization of Open Chromatin Across the Genome,” Cell 132:311-322, 2008; Crawford, G. E., et al., “DNase-chip: A High-Resolution Method to Identify DNase I Hypersensitive Sites Using Tiled Microarrays,” Nat. Methods 3:503-509, 2006; Sabo, P. J., et al., “Genome-Scale Mapping of DNase I Sensitivity In Vivo Using Tiling DNA Microarrays,” Nat. Methods 3:511-518, 2006), and/or nuclear run-on (Core, L. J., et al., “Nascent RNA Sequencing Reveals Widespread Pausing and Divergent Initiation at Human Promoters,” Science 322:1845-1848, 2008). In combination with the aforementioned procedures, nuclear-immunopurification can facilitate the study of cell-type specific transcriptional enhancers, promoters and other genomic elements, enabling a deeper understanding of the mechanisms that control cell type-specific processes.
This example describes the design of nuclear tagging fusion proteins incorporating additional KASH and SUN domains, their in vivo expression in D. melanogaster resulting in the nuclear tagging of specific cell types, and the use of capture reagents to specifically isolate tagged nuclei.
Rationale:
The nucleus of a eukaryotic cell is enclosed by a double lipid bilayer composed of both an inner nuclear membrane (INM) and an outer nuclear membrane (ONM). As described above, the KASH domain family of proteins are embedded in the ONM, while the SUN domain family of proteins are embedded in the INM. As described in Example 6, nuclear tagging fusion proteins were constructed that incorporated either a KASH domain or a SUN domain. The nuclear tagging fusion proteins were successfully used to tag nuclei in mice, and permitted the purification of tagged nuclei for subsequent genomic analysis. As further proof that the KASH and SUN domain family members can serve as nuclear envelope targeting regions in the INTACT method, additional nuclear tagging fusion proteins using additional KASH and SUN domain family members were expressed in D. melanogaster, resulting in the in vivo tagging of nuclei. Additionally, using the GAL4/UAS D. melanogaster expression system, cell-type specific expression of the nuclear tagging fusion proteins was demonstrated. Finally, the ability to purify tagged nuclei from a mixture containing nuclei tagged with a distinct affinity reagent binding region was demonstrated.
Methods and Results:
Antibodies
GFP (Invitrogen A11122) and FLAG (Sigma F7425).
DNA Constructs
DNA constructs encoding the nuclear tagging fusion proteins were constructed according to the general design described in Example 6, above. Briefly, a polynucleic acid encoding a synthetic polypeptide linker with the sequence LAAASGGGGSGGGGSLAAASEFSAAALSGGGGSGGGGSAAAL (SEQ ID NO:89), was inserted into the reading frame of the D. melanogaster endogenous gene for klarsicht (“klar”; containing a KASH family member domain). The amino acid sequence for the unmodified klar protein is set forth herein as SEQ ID NO:95, and is encoded by the nucleic acid sequence set forth herein as SEQ ID NO:94. The KASH domain comprises amino acids 512 to 567 of the polypeptide sequence (SEQ ID NO:95). The nucleic acid encoding the linker was inserted in the klar-encoding reading frame such that the linker would appear between amino acids 495 and 496, which is N-terminal to the KASH domain. See the general scheme illustrated in
Similarly, the linker was inserted into the reading frames of C. elegans endogenous genes for Unc-84 (containing a SUN family member domain), and Unc-83 (containing a KASH family member domain). The amino acid sequence for the unmodified Unc-84 protein is set forth herein as SEQ ID NO:97, and is encoded by the nucleic acid sequence set forth herein as SEQ ID NO:96. The SUN domain comprises amino acids 971 to 1108 of the polypeptide sequence (SEQ ID NO:97). The nucleic acid encoding the linker was inserted in the Unc-84-encoding reading frame such that the linker would appear C-terminal to the SUN domain. See the general scheme illustrated in FIG. 18A-1/A-2 (in context of the mouse Sun-1 gene). The amino acid sequence for the unmodified Unc-83 protein is set forth herein as SEQ ID NO:99, and is encoded by the nucleic acid sequence set forth herein as SEQ ID NO:98. The nucleic acid encoding the linker was inserted in the Unc-83-encoding reading frame such that the linker would appear N-terminal to the KASH domain. See the general scheme illustrated in
As illustrated generally in
The nucleic acid constructs encoding the nuclear tagging fusion proteins incorporate the GAL4/UAS expression system. Briefly, the promoter regions of the reading frames incorporate an upstream activation sequence (UAS) that can be bound by Gal4. Gal4 is a yeast-derived transcription factor protein that can initiate gene transcription upon binding to the UAS in the promoter region. Many transgenic D. melanogaster lines are available that express Gal4 in various specific cell lineages, some of which are used as described below.
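The logic of the GAL4/UAS system can be summarized as follows: a single UAS-linked tagging construct is expressed only in cells in which the chosen driver line expresses Gal4. The following Python sketch is a toy model of that rule; the driver names and cell-type labels are hypothetical placeholders, not lines used in this example.

```python
def tagged_cell_types(driver_line, gal4_expression, uas_transgene_present=True):
    """Cell types predicted to express the UAS-linked nuclear tag.

    The tag is expressed only where Gal4 is present (set by the driver line)
    and only if the fly also carries the UAS-linked tagging construct.
    """
    if not uas_transgene_present:
        return []
    return sorted(gal4_expression.get(driver_line, set()))


# Hypothetical driver lines and the cell types in which each expresses Gal4.
GAL4_EXPRESSION = {
    "driver_A": {"cell type 1"},
    "driver_B": {"cell type 2", "cell type 3"},
}

print(tagged_cell_types("driver_A", GAL4_EXPRESSION))
print(tagged_cell_types("driver_B", GAL4_EXPRESSION))
```

The point of the design is that the same tagging construct can be reused across drivers; only the Gal4 expression pattern of the driver line changes which nuclei are tagged.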
Nuclear Tagging Fusion Protein Expression in D. melanogaster Larvae
The described constructs encoding the klar-, Unc-84-, and Unc-83-based nuclear tagging fusion proteins were expressed in cultured cell lines, as generally described above in Example 6. Furthermore, klar-, Unc-84-, and Unc-83-based nuclear tagging fusion proteins were expressed in the ventral nerve cord (VNC) of 3rd instar D. melanogaster larvae, driven by the GAL4/UAS system. Localization of the fusion protein tags was assessed using fluorescence based microscopy to visualize the GFP and tdTomato tags. The larvae were also DAPI-stained to establish the location of the nucleic acid within the observed cells. The images were overlaid, with congruent fluorescent signals indicating the localization of the tagging fusion proteins to the nuclear membranes.
After a preliminary confirmation that the nuclear tagging fusion proteins were expressed and localized to the nucleus in tissue culture cells, the klar-, Unc-84-, and Unc-83-based nuclear tagging fusion proteins were shown to also tag nuclei in vivo.
Expression of the Unc-84-2XGFP in Cell Lineages of the D. melanogaster Brain
Expression of the Unc-84-2XGFP nuclear tagging fusion protein was induced in female D. melanogaster flies using the GAL4/UAS system in fly lineages with specific Gal4 expression in fruitless neurons, Kenyon cells of the mushroom body, antennal lobe subgroup cells, and octopaminergic neurons. The cell type-specific expression of the nuclear tagging fusion proteins was assessed using fluorescence microscopy. Images of the frontal and ventral views of each brain were collected.
Use of the GAL4/UAS expression system in D. melanogaster afforded the opportunity to assess cell-type specific expression of the nuclear tagging fusion proteins in various distinct cell-types of interest (neuronal lineages) while using the same expression construct. The fluorescence signals in
Immunocapture of Tagged D. melanogaster Nuclei
A mixture of nuclei tagged with either GFP or tdTomato was prepared from either transfected DmBg3-C2 cells or 3rd instar larval neurons. In both experiments, the GFP and tdTomato tagged nuclei were prepared separately, mixed together, and then subjected to immunocapture by magnetic beads that were pre-adsorbed to either an anti-GFP or anti-Flag antibody, as generally described in Example 6.
The initial mixture (i.e., “input”) contained both red and green fluorescently labeled-tagged nuclei in a 1:1 mixture. In the first experiment, beads coupled to anti-GFP antibody effectively separated the mixture into a bead bound population of green nuclei and an unbound population of red nuclei (not shown). The converse experiment, where the beads were loaded with an anti-Flag antibody yielded a bead bound population of red nuclei and an unbound population of green nuclei (not shown). It is noted that the anti-Flag bead capture typically worked less efficiently than the GFP capture, as indicated by a higher rate of red nuclei appearing in the wash population.
Conclusion:
These results demonstrate that the INTACT method can incorporate additional members of the SUN and KASH domain families to serve as nuclear envelope targeting regions. It is noteworthy that, in the Drosophila system described herein, SUN and KASH domains derived from C. elegans functioned to localize the tagging fusion proteins to the nuclei, illustrating the power of these domains to function as nuclear envelope targeting regions across animal phyla. Furthermore, this example provides additional evidence that nuclei tagged according to the INTACT method can be purified from a mixture of nuclei, to facilitate subsequent analysis of the chromatin contained therein.
While the preferred embodiment of the invention has been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.
This application is a continuation-in-part of PCT Patent Application No. PCT/US2011/040375, filed on Jun. 14, 2011, which claims the benefit of U.S. Provisional Application No. 61/354,663, filed on Jun. 14, 2010, the entire disclosures of which are hereby incorporated by reference herein.
This invention was made in part with Government support under NIH #1F32GM083449-01 awarded by the National Institutes of Health. The Government has certain rights in the invention.
Number | Date | Country | |
---|---|---|---|
61354663 | Jun 2010 | US |
 | Number | Date | Country |
---|---|---|---|
Parent | PCT/US11/40375 | Jun 2011 | US |
Child | 13234109 | | US |