Hematopoiesis is the process by which hematopoietic stem cells (HSCs) give rise to all hematopoietic lineages during the lifetime of an individual. To sustain life-long hematopoiesis, HSCs must self-renew to maintain or expand the HSC pool [1], and they must differentiate to form committed hematopoietic progenitor cells (HPCs) that progressively lose self-renewal potential and become increasingly restricted in their lineage potential. A combination of extrinsic and intrinsic signals is thought to converge to regulate HSC differentiation versus self-renewal decisions, but the molecular mechanisms that regulate these processes are poorly understood [2].
A multitude of cytokines have been cloned that affect HSCs and HPCs; however, to date none of these, alone or in combination, can induce the symmetrical, self-renewing HSC division in vitro that is needed for HSC expansion. Recently, several novel regulators of HSC fate decisions have been identified. For instance, overexpression of HoxB4 (GeneID: 15412) results in expansion of murine and human HSCs with an increased competitive repopulation potential [3,4,5]; novel extrinsic regulators implicated in self-renewal of HSCs include Notch [6], Wnt [7,8], and the morphogens sonic hedgehog (Shh) (GeneID: 6469) [9] and bone morphogenetic protein (BMP)-4 (GeneID: 652) [9]. While the discovery of these novel regulators lends credence to the hypothesis that extrinsic and intrinsic signals can influence HSC fate, a more global gene and/or protein expression analysis of human HSCs should provide additional insight into pathways that support HSC self-renewal.
The invention is directed to an in vivo functional genomics screen for extracting functional information from a global gene expression profiling dataset in a high-throughput manner. A global gene expression profiling dataset can be obtained from, for example, differential expression analysis, such as that obtained from gene microarray analysis or quantitative RT-PCR analysis. The invention also provides use of the discovered genes and/or function of the genes. For example, the genes that were discovered to play a role in hematopoiesis can be manipulated to increase or decrease their expression in stem cells (for example, to expand stem cells or differentiate them) and/or in the treatment of disorders (e.g., blood disorders) or diseases (e.g., cancer), for example, via gene therapy.
One embodiment of the invention provides an in vivo method of assigning function to a gene (the function of a gene generally refers to the function of the protein that the gene codes, e.g., a function/role in hematopoiesis) comprising: a) providing a gene expression profiling dataset; b) identifying (e.g., by comparison to sequences present in a database) at least one ortholog in an animal model for at least one gene of an unknown function from the dataset; c) altering expression (either increasing or decreasing (including preventing) expression of a gene or its protein product) of the ortholog in the animal model; d) detecting one or more changes in the animal model due to alteration of the expression; and e) correlating the one or more changes in the animal model with the function of the gene. Another embodiment further comprises compiling a functional profile (e.g., correlating a function with one or more of the genes provided in the gene expression profile) comprising repeating steps b)-e) until two or more genes from the dataset having unknown function are associated with a function.
In one embodiment, the gene expression profiling dataset is obtained from differential gene expression analysis, including, but not limited to, gene microarray analysis or quantitative PCR analysis.
In one embodiment, the animal model is a mouse, rat, zebrafish or xenopus animal model. In another embodiment, the animal model is an embryonic cell model.
In one embodiment, the expression of the ortholog in the animal is decreased, such as by the use of antisense oligonucleotides. In one embodiment, the antisense oligonucleotides are morpholino antisense oligonucleotides.
In one embodiment, the one or more changes detected are phenotypic (e.g., any detectable characteristic of an organism, whether structural, biochemical, physiological (including, for example, a decrease or increase in blood production or a decrease or increase in the amount of a transcript or protein produced, such as a transcription factor or a protein that is a cell marker) or behavioral). In another embodiment, the phenotypic change is an alteration in blood cell production or transcription factor expression.
One embodiment provides increased ortholog expression (e.g., overexpression) in the animal model, as compared to an animal of the same species without increased ortholog gene expression.
Definitions
As used herein, the terms below are defined by the following meanings:
As used herein, the term “ortholog” refers to genes in different species which evolved from a common ancestral gene. Due to their separation following a speciation event, orthologs may diverge, but usually have similarity at the sequence and structure levels. Orthologous genes are inherited through vertical descent from a common ancestor. These genes may arise from a common ancestral gene after speciation has occurred, or they may be present as polymorphic alleles in a population before speciation occurs. Not all orthologs perform the same, or even similar, functions as their counterparts.
As used herein, the phrase “animal model” refers to a non-human animal or embryo (e.g., mouse models, zebrafish models, dog models etc.) with a disease or biological system or activity that is similar to a human condition or system/activity (e.g., the development of blood or the vasculature). The use of animal models allows researchers to investigate disease states or biological processes.
“Genes” are the units of heredity in living organisms. They are encoded in the organism's genetic material (usually DNA or RNA), and control the physical development and behavior of the organism. Genes encode the information necessary to construct the chemicals (proteins etc.) needed for the organism to function. The term “genes” generally refers to the region of DNA (or RNA, in the case of some viruses) that determines the structure of a protein (the coding sequence), together with the region of DNA that controls when and where the protein will be produced (the regulatory sequence).
Differential gene expression analysis/techniques include, but are not limited to, differential screening (e.g., with use of, for example, a phage library), subtractive screening (an RT-PCR based method, such as suppression subtractive hybridization (SSH)), differential display and DNA microarray (e.g., Affymetrix (commercial) or spotted arrays). The study of differential gene expression provides biological information regarding gene expression. For example, the correlation of changes in gene expression with specific changes in physiology can provide information to assign a function to the gene whose expression changed. Differential gene expression techniques are well known in the art.
Antisense oligonucleotides interact with complementary strands of nucleic acids, modifying expression of genes. Antisense oligonucleotide technology is known in the art. Morpholino oligonucleotides are molecules used in antisense technology to block access of other molecules to specific sequences within nucleic acid molecules. They can block access of other molecules to small (~25 base) regions of ribonucleic acid (RNA). Morpholinos are sometimes referred to as PMO, an acronym for phosphorodiamidate morpholino oligo.
Morpholinos are synthetic molecules which are the product of a redesign of natural nucleic acid structure. Usually 25 bases in length, they bind to complementary sequences of RNA by standard nucleic acid base-pairing. Structurally, the difference between Morpholinos and DNA is that while Morpholinos have standard nucleic acid bases, those bases are bound to morpholine rings instead of deoxyribose rings and linked through phosphorodiamidate groups instead of phosphates. Replacement of anionic phosphates with the uncharged phosphorodiamidate groups eliminates ionization in the usual physiological pH range, so Morpholinos in organisms or cells are uncharged molecules. Morpholinos are not chimeric oligos; the entire backbone of a Morpholino is made from these modified subunits. Morpholinos are most commonly used as single-stranded oligos, though heteroduplexes of a Morpholino strand and a complementary DNA strand may be used in combination with cationic cytosolic delivery reagents.
Morpholinos do not degrade their target RNA molecules, unlike many antisense structural types (e.g., phosphorothioates, siRNA). Instead, Morpholinos act by “steric blocking”, binding to a target sequence within an RNA and simply getting in the way of molecules which might otherwise interact with the RNA.
A gene expression profiling dataset is a preselected set of genes of interest (e.g., genes for which determination of function is desirable); such a dataset can be obtained from, for example, differential gene expression analysis/techniques (e.g., a dataset can be formed of genes that are differentially expressed in one source or combined sources, a dataset can also be comprised of any gene or genes for which the determination of function is desirable).
As used herein, the phrase “expression profiling” refers to differential gene expression analysis/techniques, such as microarray technology. Microarray technology allows for the comparison of gene expression between, for example, normal and diseased (e.g., cancerous) cells or cells which express different cell markers. There are several names for this technology—DNA microarrays, DNA arrays, DNA chips, gene chips, others.
Microarrays exploit the preferential binding of complementary nucleic acid sequences. A microarray is typically a glass slide, onto which DNA molecules are attached at fixed locations (spots or features). There may be tens of thousands of spots on an array, each containing a huge number of identical DNA molecules (or fragments of identical molecules), of lengths from twenty to hundreds of nucleotides. The spots on a microarray are either printed on the microarrays by a robot, or synthesized by photolithography (similar to computer chip production) or by ink-jet printing. Microarrays are commercially available; however, many labs produce their own.
In this disclosure, “comprises,” “comprising,” “containing” and “having” and the like can have the meaning ascribed to them in U.S. Patent law and can mean “includes,” “including,” and the like; “consisting essentially of” or “consists essentially” likewise has the meaning ascribed in U.S. Patent law and the term is open-ended, allowing for the presence of more than that which is recited so long as basic or novel characteristics of that which is recited are not deleteriously changed by the presence of more than that which is recited.
Other definitions may appear throughout this disclosure in the appropriate context.
Functional Analysis of Human Hematopoietic Stem Cell Gene Expression Using Zebrafish
The current understanding of the expressed gene profile of HSCs comes primarily from murine HSCs that can be purified to near homogeneity [10,11,12,13,14]. The difficulty in purifying human HSCs to similar degrees of homogeneity makes study of the transcriptome of human HSCs more difficult. Human HSCs and HPCs are CD34 positive, while cells that engraft in severe combined immune deficiency (SCID) mice are enriched in the CD34+Lineage(Lin)− CD38− fraction [15]. As fewer than 1/500 CD34+Lin−CD38− cells can repopulate SCID mice [15], the expressed gene profile of CD34+Lin−CD38− cells is likely only partially enriched for HSC-specific genes [12,16]. Previously it was demonstrated that the Rhodamine (Rho) 123− and c-kit+ subpopulation of CD34+Lin−CD38− cells (Rholo) are highly enriched for primitive HPCs with myeloid-lymphoid-initiating cell (ML-IC) capacity relative to CD34+CD38−CD33−Rhohi (Rhohi ) cells [17;
Comparison of the transcriptome of Rholo and Rhohi cells from umbilical cord blood (UCB) and bone marrow (BM) identified conserved genes and gene pathways that define the human HSC. Because of the inherent limitations of using gene expression data to infer biological gene function, the hematopoietic role of these genes in a high-throughput in vivo functional genomics screen in the zebrafish was assessed. Using this strategy a series of genes that represent novel regulators of human HSC fate decisions was identified. Further, this work represents the first example of a functional genetic screening strategy that is a large step toward obtaining biologically relevant functional data from global gene profiling studies.
To identify candidate regulators of HSC fate decisions, the transcriptome of human umbilical cord blood and bone marrow CD34+CD33−CD38−Rholoc-kit+ cells, enriched for hematopoietic stem/progenitor cells, was compared with that of CD34+CD33−CD38−Rhohi cells, enriched in committed progenitors. 277 differentially expressed transcripts conserved in these ontogenically distinct cell sources were identified. A morpholino antisense oligonucleotide (MO)-based functional screen in zebrafish was performed to determine the hematopoietic function of 61 genes that had no previously known function in HSC biology and for which a likely zebrafish ortholog could be identified.
MO knockdown of 14/61 (23%) of the differentially expressed transcripts resulted in hematopoietic defects in developing zebrafish embryos, as demonstrated by altered levels of circulating blood cells at 30 and 48 hours post fertilization and subsequently confirmed by quantitative RT-PCR for erythroid-specific hbae1 and/or myeloid-specific lcp1 transcripts. Recapitulating the knockdown phenotype using a second MO of independent sequence, absence of the phenotype using a mismatched MO sequence and rescue of the phenotype by cDNA-based overexpression of the targeted transcript for zebrafish spry4 confirmed the specificity of MO targeting in this system. Further characterization of the spry4-deficient zebrafish embryos demonstrated that hematopoietic defects were not due to more widespread defects in mesodermal development, and therefore represented primary defects in HSC specification, proliferation and/or differentiation. Overall, this high-throughput screen for the functional validation of differentially expressed genes using a zebrafish model of hematopoiesis represents a major step toward obtaining meaningful information from global gene profiling of HSCs.
The following example is put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the invention, and is not intended to limit the scope of what the inventors regard as their invention.
Materials and Methods
Isolation of Rholo and Rhohi Cell Populations from UCB and BM.
Human UCB from full-term delivered infants and BM from healthy donors were obtained after informed consent in accordance with guidelines approved by the University of Minnesota Committee on the Use of Human Subjects in Research. Each biologically distinct replicate was comprised of one to four donors for UCB and individual donors for BM samples. CD34+CD38−CD33−Rholoc-kit+ and CD34+CD38−CD33−Rhohi fractions were selected by sequential Ficoll-Hypaque separation, MACS column depletion and fluorescence activated cell sorting as previously described [17]. Post-sort analysis demonstrated that sorted populations contained fewer than 1-2% contaminating cells from the opposing population (
Determination of ML-IC Frequencies.
ML-IC frequencies for UCB samples (n=3) were determined as previously described [17]. An ML-IC was defined as a single cell that gave rise to at least one LTC-IC and one NK-IC. Results are presented as ML-IC frequency±standard deviation of the mean.
Processing of RNA Samples and Oligonucleotide Microarray Analysis.
Total cellular RNA was isolated from UCB (n=5) and BM (n=4) Rholo and Rhohi cells using the PicoPure RNA Isolation Kit (Arcturus, Mountain View, Calif., United States) per the manufacturer's instructions. Seven to ten thousand Rholo and Rhohi cells were sorted directly into 100 μL Extraction Buffer (XB) provided with the PicoPure RNA Isolation Kit (Arcturus) prior to RNA isolation. Labeled complementary RNA (cRNA) was generated by one round of IVT-based, linear amplification using the RiboAmp OA RNA Amplification Kit (Arcturus) followed by labeling with the Enzo Bioarray™ HighYield™ RNA Transcript Labeling Kit (Enzo Life Sciences, Farmingdale, N.Y., United States) according to the manufacturer's instructions. Samples were hybridized to Affymetrix™ HG-U133 A & B chips (Affymetrix™ Inc., Santa Clara, Calif., United States), washed and scanned at the University of Minnesota Affymetrix™ Microarray Core Facility as described in the Affymetrix™ GeneChip® Expression Analysis Technical Manual.
Oligonucleotide Microarray Data Analysis.
Affymetrix® HG-U133 GeneChips™ were processed using Genedata Refiner software (GeneData, San Francisco, Calif., United States) to assess overall quality. Feature intensities for each chip were condensed into a single intensity value per gene using the Affymetrix Statistical Algorithm (MAS 5.0) with Tau=0.015, Alpha1=0.04, Alpha2=0.06 and a scaling factor of 500. Expression data was analyzed using GeneData's Expressionist and Microsoft Excel (Microsoft, Redmond, Wash., United States). Differential gene expression for comparison of Rholo versus Rhohi cells was defined by using a paired Student's t-test with a threshold of p<0.05, and a paired fold change was used to rank gene lists. Differentially expressed genes were classified according to their respective gene pathways and gene ontologies when available by using the web-based Affymetrix™ NetAffx analysis tool (http://www.affymetrix.com) and the National Institute of Allergy and Infectious Diseases (NIAID) Database for Annotation, Visualization and Integrated Discovery (DAVID) analysis tool (http://alps1.niaid.nih.gov/david).
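The paired test and ranking described above can be illustrated with a minimal sketch. It is not the Expressionist or Excel workflow itself. It assumes raw signal values, assumes that "paired fold change" means the mean of the per-replicate Rholo/Rhohi ratios (the text does not define it), and replaces the p<0.05 computation with a comparison against tabulated two-sided critical t values. All intensities shown are hypothetical.

```python
import math
from statistics import mean, stdev

# Two-sided critical t values at alpha = 0.05, keyed by degrees of freedom.
# df = 4 corresponds to n = 5 paired UCB samples, df = 3 to n = 4 BM samples.
T_CRIT_05 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447}


def paired_t_statistic(rho_lo, rho_hi):
    """Return (t, degrees of freedom) for paired Rho-lo / Rho-hi signals."""
    diffs = [lo - hi for lo, hi in zip(rho_lo, rho_hi)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def paired_fold_change(rho_lo, rho_hi):
    """Assumed definition: mean of the per-replicate Rho-lo / Rho-hi ratios."""
    return mean(lo / hi for lo, hi in zip(rho_lo, rho_hi))


def is_differential(rho_lo, rho_hi):
    """True when |t| exceeds the two-sided 0.05 critical value (p < 0.05)."""
    t, df = paired_t_statistic(rho_lo, rho_hi)
    return abs(t) > T_CRIT_05[df]


# Hypothetical probe-set intensities for five paired UCB replicates.
enriched_lo = [820, 760, 900, 840, 780]
enriched_hi = [400, 410, 430, 390, 420]
flat_lo = [500, 400, 450, 520, 480]
flat_hi = [480, 430, 460, 500, 490]

print(is_differential(enriched_lo, enriched_hi), paired_fold_change(enriched_lo, enriched_hi))
print(is_differential(flat_lo, flat_hi), paired_fold_change(flat_lo, flat_hi))
```

With real data a statistics library would normally supply the exact p-value; the critical-value table is used here only to keep the sketch dependency-free.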
Microarray Q-RT-PCR Confirmation.
Labeled cRNA was reverse transcribed to generate cDNA using SuperScript™ II Reverse Transcriptase (Invitrogen, Carlsbad, Calif., United States) according to the manufacturer's instructions. Quantitative, real-time PCR (Q-RT-PCR) was performed using the ABI Prism® 7000 Sequence Detection System (Applied Biosystems, Foster City, Calif., United States). Briefly, 3 ng of cDNA was amplified by 40 cycles of a two-step PCR reaction (95° C. for 15 sec. denaturation and 60° C. for 1 min. annealing/elongation) containing 100 nM gene-specific primers (
High-throughput Loss-of-function Genetic Screen in Zebrafish.
The likely zebrafish orthologs of differentially expressed human genes were identified using Ensembl's gene homology prediction program (http://www.ensembl.org, build Zv3) in combination with comparison of the human protein sequence to The Institute for Genomic Research zebrafish EST database (http://www.tigr.org, release 13-15). The criterion for a likely ortholog was ≧40% amino acid identity over the entire length of the protein or ≧50% if the fish protein sequence was only partial.
Morpholino oligos are short chains of Morpholino subunits comprised of a nucleic acid base, a morpholine ring and a non-ionic phosphorodiamidate intersubunit linkage. Morpholinos are believed to act via a steric block mechanism (RNase H-independent). Morpholino antisense oligonucleotide (MO) (Gene Tools, Philomath, Oreg., United States) sequences were designed complementary to the region of translational initiation of the zebrafish orthologs in order to inhibit protein translation (
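The identity thresholds stated above amount to a simple decision rule, sketched below. The function name and its inputs are hypothetical; how percent identity was computed from the Ensembl and TIGR comparisons is not specified in the text and is not modeled here.

```python
def is_likely_ortholog(percent_identity, fish_sequence_is_partial):
    """Apply the stated criterion: at least 40% amino acid identity over the
    full-length protein, or at least 50% when only a partial zebrafish
    protein sequence is available."""
    threshold = 50.0 if fish_sequence_is_partial else 40.0
    return percent_identity >= threshold


# Hypothetical alignment results.
print(is_likely_ortholog(45.0, fish_sequence_is_partial=False))
print(is_likely_ortholog(45.0, fish_sequence_is_partial=True))
```

The stricter cutoff for partial sequences reflects that a short aligned region gives less evidence of orthology than a full-length match.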
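As an illustration of what "complementary to the region of translational initiation" means at the sequence level, the sketch below derives a 25-base antisense sequence from a sense-strand cDNA. The window starting exactly at the start codon, the helper name and the example sequence are all assumptions made for illustration; the actual MO sequences were designed with Gene Tools and may use different target windows and additional design criteria.

```python
# Watson-Crick complement table for DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense_cdna, start_codon_index, upstream=0, length=25):
    """Return the reverse complement (written 5'->3') of a window of the
    sense strand that begins `upstream` bases before the start codon."""
    begin = start_codon_index - upstream
    if begin < 0 or begin + length > len(sense_cdna):
        raise ValueError("target window falls outside the supplied sequence")
    target = sense_cdna[begin:begin + length]
    return target.translate(COMPLEMENT)[::-1]


# Hypothetical transcript: a short 5' UTR followed by the coding start.
sense = "GGCACC" + "ATGGCTAGCAAAGGAGAAGAACTTTTCACT"
start = sense.find("ATG")
oligo = antisense_oligo(sense, start)
print(oligo)
```

Because the oligo is the reverse complement of the target, its 3' end pairs with the start codon, so an oligo covering the AUG ends in CAT when written 5' to 3'.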
Whole Zebrafish Q-RT-PCR Confirmation.
Total RNA was isolated from five MO-injected zebrafish embryos at 48 hpf with hematopoietic defects, determined based on gata1:DsRed+ hematopoietic cell production, or five uninjected clutchmate controls using the RNeasy Mini Kit (QIAGEN, Valencia, Calif., United States) according to the manufacturer's instructions. Total RNA was incubated with DNaseI (Invitrogen) to digest contaminating genomic DNA, and reverse transcribed to generate cDNA using SuperScript™ II Reverse Transcriptase (Invitrogen) according to the manufacturer's protocol. Quantitative, real-time PCR was performed using the ABI Prism® 7000 Sequence Detection System (Applied Biosystems). Briefly, 1/20 of the total cDNA from five zebrafish embryos was amplified by 40 cycles of a two-step PCR reaction (95° C. for 15 sec. denaturation and 60° C. for 1 min. annealing/elongation) containing 100 nM gene-specific primers (
Whole-mount in situ Hybridization.
Scl, myod, flk1, gata1, and cmyb riboprobes were generated and whole-mount in situ hybridization of zebrafish embryos was conducted as previously described [41].
Overexpression of Human and Zebrafish Sprouty Gene Family Members.
The XhoI and KpnI fragment of the SPRY1 open reading frame (ORF) (Open Biosystems, Huntsville, Ala., United States) was cloned into pENTR1A (Invitrogen) and subsequently transferred into a modified pFRM2.1 zebrafish expression vector using the Gateway cloning system™ (Invitrogen) to create the pFRM2.1_SPRY1 vector. pFRM2.1_SPRY1 was co-injected with pFRM2.1_eGFP at a 5:1 ratio into the yolk/cell interface of one-cell gata1:DsRed Tg zebrafish embryos as described for MO injections. Defects in hematopoietic development of eGFP+ embryos were analyzed by comparison to embryos from the same clutch injected with pFRM2.1_eGFP alone using fluorescence microscopy to visualize DsRed+ blood cells.
Accession Numbers
The National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/) Entrez Gene accession numbers for the following genes are: ABCB1 (5243), ARMCX2 (9823), BMP4 (652), C12orf2 (11228), CCR7 (1236), Ccr7 (12775), CDKN1A (1026), chd (30161), cmyb (30519), CRYGD (1421), EVI1 (2122), EZH2 (2146), FLJ14917 (84947), flk1 (also known as kdr) (58106), FOXM1 (2305), gata1 (30481), GATA2 (2624), gata2 (30480), hbae1 (30597), HDHD2 (84064), HELLS (3070), HLF (3131), HMGA2 (8091), Hoxb4 (15412), HOXB4 (3214), HSPC039 (51124), IRAK3 (11213), Irak3 (73914), KIAA1102 (22998), KLF5 (688), lcp1 (30583), LEF1 (51176), LMO2 (4005), lmo2 (30332), MAFB (9935), MGC15875 (85007), MRPS6 (64968), myod (30513), NOTCH2 (4853), PIM1 (5292), PRKCH (5583), Prkch (18755), RBPMS (11030), scl (also known as tal1) (30766), SHH (6469), SLC40A1 (30061), SLCO3A1 (28232), SNX5 (27131), SPARC (6678), Sparc (20692), spry4 (114437), SPRY1 (10252), SSBP2 (23635), SUZ12 (23512), ZFHX1B (9839), ZNF165 (7718), and ZNF331 (55422). The microarray data have been deposited in the GEO database (http://www.ncbi.nlm.nih.gov/geo/), and have been assigned the accession number GSE2666. The genes and microarray data are incorporated herein by reference.
Results and Discussion
Myeloid-Lymphoid initiating Cells (ML-ICs) are highly enriched in Rholo compared to Rhohi cells. In the past, the study of human HSCs has been limited since the CD34+Lin−CD38− fraction of hematopoietic cells, commonly used as an HSC enriched population, contains fewer than 0.2% SCID-repopulating cells [15], suggesting considerable heterogeneity. ML-ICs, single hematopoietic cells that can generate several daughter cells that are capable of re-initiating long-term myeloid and long-term lymphoid cultures, were highly enriched by selecting the Rholo fraction of CD34+Lin−CD38− cells. While the Rholo population still only contains 15-25% ML-ICs and therefore remains heterogeneous, the enrichment factor is about 5- to 10-fold greater than CD34+Lin−CD38− cells [17]. The ML-IC frequency was ≧10-fold higher in UCB Rholo compared to Rhohi cells (
Genes differentially expressed between Rholo and Rhohi cells from both UCB and BM. Comparing genes differentially expressed between Rholo and Rhohi cells from ontogenically distinct sources identified conserved genes and gene pathways that govern self-renewal and differentiation of human HSCs. The experimental design used is illustrated in
2,707 and 4,667 probe sets differentially expressed between Rholo and Rhohi cells from UCB and BM were identified (as presented in U.S. Ser. No. 60/690,089, which is herein incorporated by reference for the description of the differentially expressed probe sets, including confirmation Q-RT-PCR and the 277 unique transcripts). The fidelity of the microarray results was confirmed using quantitative RT-PCR (Q-RT-PCR). Further analysis was focused on 277 unique transcripts, represented by 304 probe sets that were differentially expressed between Rholo and Rhohi cells from both UCB and BM with a fold change >1.5 in either UCB or BM.
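The selection rule for the conserved set (differentially expressed in both UCB and BM, with a fold change above 1.5 in at least one source) can be sketched as a filter. The record layout and probe identifiers below are hypothetical, and the requirement that the direction of change agree between sources is an assumption inferred from the word "conserved" rather than something the text states.

```python
from dataclasses import dataclass


@dataclass
class ProbeSet:
    probe_id: str      # hypothetical identifier
    p_ucb: float       # paired t-test p-value in UCB
    p_bm: float        # paired t-test p-value in BM
    ratio_ucb: float   # Rho-lo / Rho-hi ratio in UCB
    ratio_bm: float    # Rho-lo / Rho-hi ratio in BM


def fold_magnitude(ratio):
    """Express a ratio as a fold change >= 1 regardless of direction."""
    return ratio if ratio >= 1 else 1 / ratio


def is_conserved(ps, alpha=0.05, min_fold=1.5):
    significant_in_both = ps.p_ucb < alpha and ps.p_bm < alpha
    # Assumption: a conserved change has the same direction in both sources.
    same_direction = (ps.ratio_ucb > 1) == (ps.ratio_bm > 1)
    large_in_either = max(fold_magnitude(ps.ratio_ucb),
                          fold_magnitude(ps.ratio_bm)) > min_fold
    return significant_in_both and same_direction and large_in_either


# Hypothetical probe sets covering each outcome of the rule.
probes = [
    ProbeSet("probe_A", 0.01, 0.03, 2.1, 1.3),  # kept: Rho-lo enriched
    ProbeSet("probe_B", 0.01, 0.20, 2.1, 1.8),  # dropped: not significant in BM
    ProbeSet("probe_C", 0.01, 0.02, 1.2, 1.4),  # dropped: fold change too small
    ProbeSet("probe_D", 0.01, 0.02, 0.5, 0.9),  # kept: Rho-hi enriched
    ProbeSet("probe_E", 0.01, 0.02, 2.0, 0.6),  # dropped: directions disagree
]
conserved = [ps.probe_id for ps in probes if is_conserved(ps)]
print(conserved)
```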
Among the conserved genes enriched in Rholo cells, many have been implicated in early hematopoiesis, including CDKN1A (GeneID: 1026), a cell cycle regulator for maintenance of murine HSCs [18], and ABCB1 (GeneID: 5243), the ABC-transporter family member responsible for the Rholo phenotype [17]. Several transcription factors (TFs) known to play a role in early hematopoiesis or leukemogenesis were also identified, including HLF (GeneID: 3131), involved in leukemogenic chromosomal translocations [19] and EVI1 (GeneID: 2122), a TF associated with myeloid leukemias [20]. Other TFs without a known role in hematopoiesis were also more highly expressed in Rholo cells, including HMGA2 (GeneID: 8091), a high mobility group gene; and the zinc finger TFs ZNF165 (GeneID: 7718), ZNF331 (GeneID: 55422) and KLF5 (GeneID: 688). All Rholo-enriched genes are listed in U.S. Ser. No. 60/690,089, which is herein incorporated by reference for the list of Rholo-enriched genes. As demonstrated in previous HSC gene profiling studies [12,13,14], >40% of genes enriched in Rholo cells lack a functional annotation, are hypothetical proteins or are expressed sequence tags (ESTs), and thus represent currently uncharacterized regulators of HSC fate decisions (
Some genes with well-established roles in HSC self-renewal and early differentiation are not present in the Rholo enriched gene list. However, most of these were differentially expressed in both datasets, but differences did not reach statistical significance. For instance, LMO2 (GeneID: 4005) [21] and GATA2 (GeneID: 2624) [22], known to be involved in HSC development and self-renewal, were expressed significantly higher in BM Rholo than Rhohi cells. Although similar trends were seen in UCB Rholo cells, these differences were not statistically significant. Conversely, HOXB4 (GeneID: 3214) [4] expression was significantly higher in UCB Rholo than Rhohi cells, but this difference was not statistically significant in BM. Although the stringent criteria for differential expression likely contribute to the omission of some genes that might be differentially expressed, another explanation might be that expression of these genes is maintained when cells differentiate from a Rholo to a Rhohi stage. The latter is consistent with most known HSC-associated genes expressed at levels that are much higher than the normalized average microarray expression levels in Rholo and Rhohi cells from both UCB and BM.
Conserved genes enriched in the Rhohi cells included LEF1, an effector of Wnt signaling expressed in pre-B and T cells [23], and NOTCH2 (GeneID: 4853), involved in hematopoietic differentiation cell fate decisions [24]. Several TFs known to play a role in hematopoietic cell differentiation were more highly expressed in Rhohi compared to Rholo cells, including HELLS (GeneID: 3070) and MAFB (GeneID: 9935) [25,26]. Additional TFs with no known role in hematopoietic development were also enriched in Rhohi cells, such as the zinc finger homeobox gene, ZFHX1B (GeneID: 9839), and the polycomb genes, EZH2 (GeneID: 2146) and SUZ12 (GeneID: 23512), the latter of which plays a role in germ cell development [27]. Globin (Hb) gene family members were also more highly expressed in UCB and BM Rhohi than Rholo cells. Consistent with the ontogenic expression patterns of fetal versus adult Hb genes, Hbγ genes were more highly expressed in perinatal UCB Rhohi cells, while Hbβ genes were more highly expressed in adult BM Rhohi cells. Additional genes enriched in Rhohi cells are listed in U.S. Ser. No. 60/690,089, which is incorporated by reference for the list of genes enriched in Rhohi cells.
Because of functional redundancy amongst gene families, the data for common differentially expressed gene family members were examined. The Id family of transcriptional repressors [28] was enriched in the Rholo fraction, but was represented by different family members in UCB (ID4) and BM (ID1, ID2 and ID3). Similarly, various H1 and H2 histone genes were enriched in the Rholo fraction in both datasets, but were represented by distinct family members.
It was also evaluated whether common differentially expressed genes were concentrated on specific chromosomes. It was found that genes were not only concentrated on certain chromosomes, but at specific g-band addresses. Of the genes enriched in Rholo cells, 9% reside at 6p21, a region involved in recurrent chromosomal translocations in myeloid [29] and lymphoid [30] leukemias, and home to the PIM1 oncogene (GeneID: 5292) [31]. Six members of the H2B and one member of the H1 histone family as well as CDKN1A, more highly expressed in Rholo than Rhohi cells, reside at 6p21. The remaining Rholo-enriched genes at 6p21 consist of six class II major histocompatibility complex (MHC) family members and a putative testis specific zinc finger TF, ZNF165. H1 and H2 histone gene family members [11,13], class II MHC antigens [12,13] and CDKN1A [13] were also found amongst the genes identified in studies characterizing the transcriptome of murine HSCs. The differential expression of such a large number of genes located at this chromosomal address suggests that, like CDKN1A, other genes located at 6p21 with as yet unknown hematopoietic function can play a role in HSC proliferation or differentiation.
The genes expressed more highly in Rholo versus Rhohi cells were compared with published gene expression data. Comparison with the study by Ivanova et al. [12], which compared human CD34+Lin−CD38− with CD34+Lin−CD38+ cells, yielded only seven genes in common: ARMCX2 (GeneID: 9823), CRYGD (GeneID: 1421), HLF, KIAA1102 (GeneID: 22998), RBPMS (GeneID: 11030), SLCO3A1 (GeneID: 28232) and SSBP2 (GeneID: 23635). The lack of overlap is not that surprising, as Rholo and Rhohi cells are subpopulations of the CD34+Lin−CD38− population used by Ivanova et al. Comparison of genes expressed more highly in Rholo versus Rhohi cells with genes expressed more highly in murine side population (SP)/KLS/CD34− compared to total BM cells published by Ramalho-Santos et al. [13] identified 16 likely orthologs and 38 common gene family members (this comparison is presented in U.S. Ser. No. 60/690,089, which is herein incorporated by reference), suggesting that HSC specific genes are conserved across species.
In vivo functional genomics screen in zebrafish. Because gene-profiling per se does not prove functional importance, an in vivo functional genomics screen in zebrafish was developed (
From the 277 unique transcripts that were differentially expressed between Rholo and Rhohi cells of both UCB and BM, genes with known function in hematopoiesis, MHC genes, histones, and genes that are known to play a role in glucose and protein metabolism and RNA and DNA synthesis were eliminated, resulting in a final list of 158 genes. Of these, a putative zebrafish ortholog for 86 was identified, and MOs were designed against 61 (
Additionally, the observed reduction of both erythroid and myeloid gene expression following knockdown of candidate genes is consistent with their presumed roles in HSC fate decisions prior to specification of the common myeloid progenitor. The validity of the Q-RT-PCR analysis was corroborated by analysis of hbae1 and lcp1 transcript levels in gata1 MO targeted embryos, in which there was a virtually complete loss of hbae1 expression and an almost 2-fold increase in myeloid-specific lcp1 transcripts (
The greater than 20% frequency of blood defects seen in the screen compares very favorably with the 0.5-1% frequency of hematopoietic phenotypes seen by ethylnitrosourea (ENU) mutagenesis screens that mutate genes in a near random fashion [38] and the 4% of hematopoietic phenotypes seen in a morpholino-based functional screen of the zebrafish secretome [39] (S.C. Ekker, unpublished data). The high incidence of blood defects also demonstrates that the candidate genes identified by comparing the transcriptome of Rholo and Rhohi cells represent genes with important roles in HSC biology. A candidate ortholog in zebrafish for 72/158 of the differentially expressed human genes was not identifiable. This may be because currently only one quarter of the zebrafish genome is high-quality finished sequence. Hence, some genes with important roles in hematopoiesis may have been untested. However, from a sample of 10 genes that lacked a zebrafish match, only one has a likely ortholog in the Fugu or Medaka sequencing projects, and thus incomplete genome coverage provides only a partial explanation. Alternative possibilities include a reduced level of primary sequence conservation between functional orthologs that may have been missed using the comparative genomics criteria presented herein, or that a number of genes are not conserved between fish and man, and thus might be less important for the conserved processes of hematopoietic self-renewal and differentiation. Therefore, the relatively high incidence of blood defects in conserved genes may in part reflect “evolutionary filtering” in the screen. Consistent with this hypothesis, all zebrafish genes whose mutation resulted in a visible embryonic phenotype identified using a retroviral insertion strategy have a likely human ortholog [40].
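The frequencies compared above can be checked with simple arithmetic. The sketch below uses counts stated in this specification (158 filtered genes, 86 with a putative zebrafish ortholog, 61 targeted with MOs, and 14 genes that induced a hematopoietic defect). It assumes that the ">20%" figure is the 14 hits divided by the 61 MO-targeted genes; that denominator and the function name `screen_summary` are assumptions made for illustration.

```python
def screen_summary(filtered=158, with_ortholog=86, targeted=61, hits=14):
    """Summarize the screening funnel using counts stated in the specification."""
    hit_rate = hits / targeted                 # assumed denominator: MO-targeted genes
    enu_low, enu_high = 0.005, 0.01            # 0.5-1% reported for ENU screens
    secretome_rate = 0.04                      # 4% reported for the secretome MO screen
    return {
        "without_ortholog": filtered - with_ortholog,
        "hit_rate": hit_rate,
        # Fold enrichment relative to the upper and lower ENU bounds.
        "fold_vs_enu": (hit_rate / enu_high, hit_rate / enu_low),
        "fold_vs_secretome": hit_rate / secretome_rate,
    }


summary = screen_summary()
print(f"genes without a zebrafish ortholog: {summary['without_ortholog']}")
print(f"hit rate: {summary['hit_rate']:.1%}")
print("fold over ENU screens: {:.1f} to {:.1f}".format(*summary["fold_vs_enu"]))
print(f"fold over secretome screen: {summary['fold_vs_secretome']:.1f}")
```

Under that assumption the hit rate is about 23%, which agrees with the "greater than 20%" figure, and the 72 genes without an identifiable ortholog follow directly from 158 minus 86.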
Although the frequency of blood defects is high, the screen is not as well suited for the identification of knockdown phenotypes that result in increased HSC proliferation and/or differentiation. The inventors and others have successfully shown that dramatically increased hematopoietic development can be modeled in zebrafish, as is the case for knockdown of the BMP-antagonist chordin (GeneID: 30161) and the corresponding dino mutant [41,42]; however, more modest increases in hematopoietic cell production or skewing of lineage differentiation may be undetectable. Although these caveats may lead to underestimation of the true frequency of genes with a role in early hematopoietic development and differentiation, the screening procedure used has proven effective for extracting functional information from a global gene expression profiling dataset in a high-throughput manner.
Of note, viable knockout mice exist for 4/14 genes identified in the functional screen (Sparc (GeneID: 20692), Irak3 (GeneID: 73914), Ccr7 (GeneID: 12775) and Prkch (GeneID: 18755)) [43,44,45,46]. One could argue that if a viable knockout mouse exists, the gene of interest may not be important in hematopoiesis. However, lack of an overt hematopoietic phenotype does not preclude a role of a gene in HSC self-renewal and differentiation, as this may only be detectable under conditions where the hematopoietic system is stressed or in transplantation experiments. For example, HoxB4−/− mice develop normally, and present with only subtle differences in spleen and BM cellularity [47]. However, the proliferative response of HSC in vitro and in vivo is decreased, consistent with the observation that over-expression of HoxB4 supports expansion of competitive repopulating units and SCID-repopulating cells [4,5].
Characterization of Sprouty family members in zebrafish hematopoiesis. To further verify the hematopoietic role of genes identified by gene array, a more extensive evaluation of the zebrafish targeted with a MO against spry4 (GeneID: 114437) (spry4 morphant or spry4MO) was performed. Although human SPRY1 (GeneID: 10252) was differentially expressed, a MO against zebrafish spry4 was used, as it is expressed in the region of the lateral plate mesoderm, the first site of zebrafish hematopoiesis [48], and it was the full-length zebrafish Sprouty gene with the greatest protein homology to human SPRY1. Recently the partial sequence of a potential zebrafish spry1 ortholog was predicted by Ensembl's gene prediction software based on genomic sequence information. However, the single exon that was predicted does not contain an ATG start codon, or a conserved splice donor or acceptor site, and the putative zebrafish spry1 sequence only partially covers the human SPRY1 gene. Moreover, the genomic location of Ensembl's putative zebrafish spry1 is currently not known, and therefore it is not possible to use syntenic relationships to determine the most likely zebrafish ortholog for human SPRY1. At present, there is not sufficient sequence data available to design gain- or loss-of-function experiments for the putative zebrafish spry1, thus precluding an analysis of hematopoietic function in the zebrafish model. Therefore, spry4 is currently the best full-length, MO-targetable candidate ortholog for human SPRY1, and based on the results, at the very least zebrafish spry4 and SPRY1 share a conserved function in hematopoiesis.
To confirm the specificity of MO targeting in the spry4MO, a second spry4 MO of independent sequence and a 4-base mismatched spry4 MO were injected into zebrafish embryos. Injection of the independent spry4 MOs induced a hematopoietic phenotype in >65% of injected embryos, while the 4-base mismatched MO did not induce any phenotypic changes (
To rule out the possibility that the hematopoietic defect observed in the spry4MO was secondary to a vascular defect, spry4 MO was injected into fli1:eGFP Tg zebrafish. While the resulting embryos exhibited minor defects in cardinal vein remodeling and morphogenesis of inter-segmental vessels in the posterior tail, there were no major defects in vascular development (
Thisse et al. have shown that overexpression of zebrafish spry4 mRNA leads to an expansion of the posterior intermediate cell mass (ICM) [48]. Human SPRY1 was overexpressed in gata1:DsRed Tg zebrafish embryos, and a similar dose-dependent expansion of DsRed+ blood cells in the posterior ICM (
Characterization of hematopoietic gene expression in the spry4MO by whole-mount in situ hybridization revealed a reduction in scl expression at 4 somites (8/15), and virtually no scl (12/20) or gata1 (18/25) expression at 20 somites (
In vertebrates, Sprouty family members act as antagonists for fibroblast growth factor (FGF), vascular endothelial growth factor and epidermal growth factor signaling, and they may be involved in feedback regulation, as Sprouty gene expression is induced by activation of these signaling pathways [50]. Sprouty genes antagonize receptor tyrosine kinase (RTK) signaling at the level of the Ras/Raf/MAPK (Mitogen-Activated Protein Kinase) pathway; however, they also can serve as positive regulators of these pathways in some cell types [50]. Therefore, while not wishing to be limited to a particular mechanism, it is believed that SPRY1 affects HSC by modulating FGF-mediated, perhaps in combination with other RTK-mediated, signaling. In fact, three of the 14 genes that induce a hematopoietic defect in the zebrafish screen, SPRY1, MAFB and SPARC, are all involved in FGF signaling [50,51,52], thus suggesting a role for FGF in hematopoiesis.
The sequential genetic screen in zebrafish (optionally followed by confirmation in mammalian models (such as mammalian HSC models)) will establish a hematopoietic function for genes identified by gene array analysis in a high-throughput and efficient manner.
Genes, or the proteins they code for, can be used to expand (e.g., propagate) stem cells, by using the gene itself, or by using small molecules or siRNA. Small interfering RNAs (siRNAs), sometimes known as short interfering RNAs or silencing RNAs, are a class of 20-25 nucleotide-long RNA molecules that play a variety of roles in biology. siRNAs usually have a well-defined structure: a short (about 21-nt) double strand of RNA (dsRNA) with 2-nt 3′ overhangs on either end. Most notably, siRNAs act in the RNA interference (RNAi) pathway, where the siRNA interferes with the expression of a specific gene; siRNAs also play roles in RNAi-related pathways, e.g., as an antiviral mechanism or in shaping the chromatin structure of a genome.
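The siRNA structure described above, a short duplex of roughly 21-nt strands with 2-nt 3′ overhangs, can be illustrated in code. The following is a sketch under stated assumptions rather than a design method from this specification: the 19-nt core sequence is an arbitrary placeholder and not a real target, the "UU" overhang is one common choice assumed here, and `build_sirna_duplex` is a hypothetical helper name.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def build_sirna_duplex(core_sense, overhang="UU"):
    """Build both strands (5'->3') of a duplex with 2-nt 3' overhangs.

    core_sense: the base-paired region of the sense strand (DNA or RNA letters).
    overhang:   2-nt 3' overhang appended to each strand (assumed "UU").
    """
    core = core_sense.upper().replace("T", "U")
    if not core or set(core) - set("ACGU"):
        raise ValueError("core_sense must contain only A, C, G, U/T")
    if len(overhang) != 2:
        raise ValueError("overhang must be exactly 2 nt")
    sense = core + overhang
    antisense = reverse_complement(core) + overhang
    return sense, antisense


# Placeholder 19-nt core, chosen arbitrarily for illustration.
sense, antisense = build_sirna_duplex("GGAUCCAUGACUGACUAGC")
print("sense     5'-" + sense + "-3'")
print("antisense 5'-" + antisense + "-3'")
```

With a 19-nt core, each strand is 21 nt long: 19 base-paired positions plus a 2-nt overhang at the 3′ end. The sketch does not address target-site selection, specificity, or chemical modification.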
All patents and publications referenced or mentioned herein are indicative of the levels of skill of those skilled in the art to which the invention pertains, and each such referenced patent or publication is hereby incorporated by reference to the same extent as if it had been incorporated by reference in its entirety individually or set forth herein in its entirety. Applicants reserve the right to physically incorporate into this specification any and all materials and information from any such cited patents or publications.
The specific methods and compositions described herein are representative of preferred embodiments and are exemplary and not intended as limitations on the scope of the invention. Other objects, aspects, and embodiments will occur to those skilled in the art upon consideration of this specification, and are encompassed within the spirit of the invention as defined by the scope of the statements. It will be readily apparent to one skilled in the art that varying substitutions and modifications may be made to the invention disclosed herein without departing from the scope and spirit of the invention. The invention illustratively described herein suitably may be practiced in the absence of any element or elements, or limitation or limitations, which is not specifically disclosed herein as essential. The methods and processes illustratively described herein suitably may be practiced in differing orders of steps, and that they are not necessarily restricted to the orders of steps indicated herein or in the statements. As used herein and in the appended statements, the singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise. Under no circumstances may the patent be interpreted to be limited to the specific examples or embodiments or methods specifically disclosed herein. Under no circumstances may the patent be interpreted to be limited by any statement made by any Examiner or any other official or employee of the Patent and Trademark Office unless such statement is specifically and without qualification or reservation expressly adopted in a responsive writing by Applicants.
The terms and expressions that have been employed are used as terms of description and not of limitation, and there is no intent in the use of such terms and expressions to exclude any equivalent of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention as presented in the statements. Thus, it will be understood that although the present invention has been specifically disclosed by preferred embodiments and optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention as defined by the appended statements.
This application claims priority to U.S. Provisional Application Ser. No. 60/690,089, filed Jun. 13, 2005, the contents of which are incorporated herein by reference in their entirety.
This research was supported by the National Cancer Institute Program Project Grant (CA065493) and National Institute of General Medical Sciences R01 (GM63904). The government may have certain rights to this invention.
Number | Date | Country
---|---|---
60690089 | Jun 2005 | US