This invention includes the establishment of key statistical models and data processing steps that enable the evaluation of expression data obtained from cultured neurons derived from induced pluripotent stem cells. It compares test data to a reference set of data from, for example, previously characterized neurons, neuronal progenitor cells, or pluripotent stem cells with known biological characteristics.
In one aspect, a computer implemented method of identifying a determined dopaminergic precursor cell within an in vitro population of neuronal progenitor cells is provided. The method includes receiving a test dataset including gene expression profile information for an in vitro population of neuronal progenitor cells; querying a gene expression reference database to compare the test dataset with the gene expression reference database, the gene expression reference database including gene expression profile information for a desirable determined dopaminergic precursor cell; and outputting a computed label classification including an indication of whether the in vitro population of neuronal progenitor cells includes a determined dopaminergic precursor cell.
Provided herein are computer implemented methods of classifying an in vitro population of neuronal progenitor cells, the methods comprising receiving a test dataset comprising gene expression levels and expression levels of one or more metagenes for a cell or a plurality of cells comprised in an in vitro population of neuronal progenitor cells, wherein the one or more metagenes are determined based on correlated gene expression levels of reference cells in a reference database, wherein the reference cells are neuronal cells at one or more different stages of differentiation; applying the expression levels of the one or more metagenes as input to a process configured to determine a probability of the cell or the plurality of cells having metagene expression levels of a determined dopaminergic precursor cell; determining a deviation score for the cell or the plurality of cells, wherein the deviation score indicates the degree to which the gene expression levels in the test dataset deviate from gene expression levels in one or more reference cells in the reference database, wherein the one or more reference cells are at a stage of differentiation indicating a determined dopaminergic precursor cell; and outputting, based on the probability and the deviation score, a computed label classification comprising an indication of whether said cell or said plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell.
In some embodiments, the process comprises a supervised classification model trained using (i) expression levels of the one or more metagenes of the reference cells in the reference database; and (ii) class labels indicating each of the one or more different stages of differentiation for reference cells in the reference database, to determine a probability of a cell or a plurality of cells having metagene expression levels of a determined dopaminergic precursor cell.
Also provided herein are computer implemented methods of training a process to determine a probability of a cell or a plurality of cells having metagene expression levels of a determined dopaminergic precursor cell, the methods comprising training a supervised classification model using (i) expression levels of one or more metagenes, wherein the one or more metagenes are determined based on correlated gene expression levels of reference cells in a reference database, wherein the reference cells are neuronal cells at one or more different stages of differentiation; and (ii) class labels indicating each of the one or more different stages of differentiation for reference cells in the reference database, to determine a probability of a cell or a plurality of cells having metagene expression levels of a determined dopaminergic precursor cell.
Also provided herein are computer implemented methods of classifying an in vitro population of neuronal progenitor cells, the methods comprising receiving a test dataset comprising gene expression levels and expression levels of one or more metagenes for a cell or a plurality of cells comprised in an in vitro population of neuronal progenitor cells, wherein the one or more metagenes are determined based on correlated gene expression levels of reference cells in a reference database, wherein the reference cells are neuronal cells at one or more different stages of differentiation; applying the expression levels of the one or more metagenes as input to a process, the process comprising a supervised classification model trained using (i) expression levels of the one or more metagenes of reference cells in the reference database; and (ii) class labels indicating each of the one or more different stages of differentiation of reference cells in the reference database, to determine a probability of a cell or a plurality of cells having metagene expression levels of a determined dopaminergic precursor cell; determining a deviation score for the cell or the plurality of cells, wherein the deviation score indicates the degree to which the gene expression levels in the test dataset deviate from gene expression levels in one or more reference cells in the reference database, wherein the one or more reference cells are at a stage of differentiation indicating a determined dopaminergic precursor cell; and outputting, based on the probability and the deviation score, a computed label classification comprising an indication of whether said cell or plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell.
In some of any of the preceding embodiments, the method comprises, based on the computed label classification, identifying the in vitro population of neuronal progenitor cells as a population comprising determined dopaminergic precursor cells.
In some of any of the preceding embodiments, the supervised classification model is a logistic regression model.
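As an illustration only, a logistic regression model of this kind could be trained on metagene expression levels of reference samples as sketched below. The function names, the toy data, the learning rate, and the use of plain batch gradient descent are assumptions made for this sketch and are not taken from the specification.

```python
import numpy as np

def train_logistic_regression(X, y, lr=0.1, n_iter=2000):
    """Fit weights and a bias by batch gradient descent on the log-loss."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        w -= lr * (X.T @ (p - y)) / len(y)        # gradient step for the weights
        b -= lr * float(np.mean(p - y))           # gradient step for the bias
    return w, b

def predict_probability(X, w, b):
    """Probability that each sample has the metagene profile of the positive class."""
    X = np.asarray(X, dtype=float)
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

# Hypothetical toy data: rows are reference samples, columns are metagene levels.
# Label 1 = determined dopaminergic precursor cell, 0 = not.
X_ref = np.array([[2.0, 0.1], [1.8, 0.3], [2.2, 0.2],
                  [0.2, 1.9], [0.4, 2.1], [0.1, 1.7]])
y_ref = np.array([1, 1, 1, 0, 0, 0])

w, b = train_logistic_regression(X_ref, y_ref)
probs = predict_probability(X_ref, w, b)
```

The returned probabilities are what a downstream threshold would be applied to; in practice a maintained library implementation with regularization would likely be preferred over this hand-written fit.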
In some of any of the preceding embodiments, the reference cells are an in vitro population of neuronal progenitor cells. In some of any of the preceding embodiments, said in vitro population of neuronal progenitor cells is formed by culturing one or more induced pluripotent stem cells (iPSC) in vitro for a period of time under conditions capable of differentiating the one or more iPSCs to a neuronal progenitor cell, optionally wherein the neuronal progenitor cell is one or more of a floor plate midbrain progenitor cells, determined dopaminergic precursor cells, or dopamine (DA) neurons. In some embodiments, said iPSC is a human iPSC. In some embodiments, said human is a healthy subject. In some embodiments, said human is a subject with Parkinson's disease.
In some of any of the preceding embodiments, the culturing is for a period of time that is between at or about 2 and at or about 25 days. In some of any of the preceding embodiments, said iPSC is cultured for, for about, or for at least 2 days. In some of any of the preceding embodiments, said iPSC is cultured for, for about, or for at least 5 days. In some of any of the preceding embodiments, said iPSC is cultured for, for about, or for at least 10 days. In some of any of the preceding embodiments, said iPSC is cultured for, for about, or for at least 13 days. In some of any of the preceding embodiments, said iPSC is cultured for, for about, or for at least 15 days. In some of any of the preceding embodiments, said iPSC is cultured for, for about, or for at least 18 days. In some of any of the preceding embodiments, said iPSC is cultured for, for about, or for at least 25 days.
In some of any of the preceding embodiments, the reference database comprises gene expression levels determined from one or more reference cell populations, wherein each of the one or more reference cell populations are formed by culturing one or more iPSC in vitro for a different period of time each under conditions capable of differentiating the one or more iPSCs to a neuronal progenitor cell, optionally wherein the neuronal progenitor cell is one or more of a floor plate midbrain progenitor cells, determined dopaminergic precursor cells, or dopamine (DA) neuron. In some embodiments, the different period of time is between 2 and 30 days. In some embodiments, the different period of time is between 11 and 25 days.
In some of any of the preceding embodiments, the one or more stages of differentiation of reference cells in the reference database are formed by culturing one or more iPSC in vitro for one or more different periods of time under conditions capable of differentiating the one or more iPSCs to a neuronal progenitor cell, optionally wherein the neuronal progenitor cell is one or more of a floor plate midbrain progenitor cells, determined dopaminergic precursor cells, or dopamine (DA) neuron, wherein the different period of time is between about 11 days and about 25 days, optionally a period of time of at or about 13 days; a period of time of at or about 18 days; or a period of time of at or about 25 days. In some of any of the preceding embodiments, at least one of the one or more reference cell populations in the reference database comprises gene expression levels determined by culturing the iPSC for at or about 13, 18, or 25 days.
In some of any of the preceding embodiments, the conditions capable of differentiating the one or more iPSCs to a neuronal progenitor cell comprise culturing the iPSCs by (a) a first incubation comprising exposing the cells to (i) an inhibitor of TGF-β/activin-Nodal signaling; (ii) at least one activator of Sonic Hedgehog (SHH) signaling; (iii) an inhibitor of bone morphogenetic protein (BMP) signaling; and (iv) an inhibitor of glycogen synthase kinase 3β (GSK3β) signaling, optionally under conditions to differentiate the cells to floor plate midbrain progenitor cells, optionally wherein the first incubation is initiated on day 0 of the culturing; and (b) a second incubation of cells after the first incubation, wherein the second incubation comprises culturing the cells under conditions to neurally differentiate the cells, optionally wherein the second incubation is initiated at or about day 11 after the first incubation, and further optionally wherein the second incubation is for between at or about 11 and at or about 25 days. In some embodiments, the conditions to neurally differentiate the cells comprise exposing the cells to (i) brain-derived neurotrophic factor (BDNF); (ii) ascorbic acid; (iii) glial cell-derived neurotrophic factor (GDNF); (iv) dibutyryl cyclic AMP (dbcAMP); (v) transforming growth factor beta-3 (TGFβ3) (collectively, “BAGCT”); and (vi) an inhibitor of Notch signaling.
In some of any of the preceding embodiments, at least one of the one or more reference cell populations in the reference database comprises gene expression levels determined by culturing the iPSC for at or about 13 days. In some of any of the preceding embodiments, at least one of the one or more reference cell populations comprises gene expression levels determined by culturing the iPSC for at or about 18 days. In some of any of the preceding embodiments, at least one of the one or more reference cell populations comprises gene expression levels determined by culturing the iPSC for at or about 25 days.
In some of any of the preceding embodiments, the one or more metagenes and the expression levels of the one or more metagenes are determined by using a dimensionality reduction technique on one or more reference cells of the one or more reference databases. In some embodiments, the dimensionality reduction technique is used on a reference cell population comprising gene expression levels determined at or about 13 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells. In some of any of the preceding embodiments, the dimensionality reduction technique is used on a reference cell population comprising gene expression levels determined at or about 18 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells. In some of any of the preceding embodiments, the dimensionality reduction technique is used on a reference cell population comprising gene expression levels determined at or about 25 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells. In some of any of the preceding embodiments, the dimensionality reduction technique is used on each of a reference cell population comprising gene expression levels determined at or about 13 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells; a reference cell population comprising gene expression levels determined at or about 18 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells; and a reference cell population comprising gene expression levels determined at or about 25 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells.
In some of any of the preceding embodiments, the supervised classification model is trained using the expression levels of the one or more metagenes determined from the one or more reference cells. In some of any of the preceding embodiments, the supervised classification model is trained using the expression levels of the one or more metagenes determined from one or more reference cells comprising gene expression levels between 11 and 25 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells, optionally one or more of 13, 18, and 25 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells. In some of any of the preceding embodiments, the supervised classification model is trained using the expression levels of the one or more metagenes determined from the one or more reference cells comprising gene expression levels determined at or about 13 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells. In some of any of the preceding embodiments, the supervised classification model is trained using the expression levels of the one or more metagenes determined from the one or more reference cells comprising gene expression levels determined at or about 18 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells. In some of any of the preceding embodiments, the supervised classification model is trained using the expression levels of the one or more metagenes determined from the one or more reference cells comprising gene expression levels determined at or about 25 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells. 
In some of any of the preceding embodiments, the supervised classification model is trained using the expression levels of the one or more metagenes determined from each of a reference cell population comprising gene expression levels determined at or about 13 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells; a reference cell population comprising gene expression levels determined at or about 18 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells; and a reference cell population comprising gene expression levels determined at or about 25 days of culturing iPSC in vitro under conditions to differentiate neuronal progenitor cells.
In some of any of the preceding embodiments, the class label indicating each of the one or more different stages of differentiation of the reference cells is either a determined dopaminergic precursor cell or not a determined dopaminergic precursor cell.
In some of any of the preceding embodiments, the class label indicating each of the one or more different stages of differentiation of the reference cells is determined using an in vivo method. In some embodiments, the in vivo method comprises transplanting the in vitro population of neuronal progenitor cells comprising a reference cell population into a brain region of an animal model of Parkinson's disease; assessing the occurrence of an outcome associated with a therapeutic effect of the transplantation on the animal model, optionally wherein the outcome is selected from innervation or engrafting with host cells, reduction of a brain lesion in the animal model, or reversal of a brain lesion in the animal model; and designating the class label as a determined dopaminergic precursor cell if the transplantation results in the occurrence of the outcome associated with a therapeutic effect; or designating the class label as not a determined dopaminergic precursor cell if the transplantation does not result in the occurrence of the outcome associated with a therapeutic effect. In some embodiments, the brain region is the substantia nigra. In some of any of the preceding embodiments, the in vivo method comprises a behavioral assay.
In some of any of the preceding embodiments, the class label indicating each of the one or more different stages of differentiation of the reference cells is determined using an in vitro method. In some embodiments, the in vitro method comprises assessing dopamine production levels of a reference cell population; and the class label is designated as a determined dopaminergic precursor cell if the dopamine production levels are increased relative to a pluripotent stem cell. In some of any of the preceding embodiments, assessment of dopamine production is by high performance liquid chromatography.
In some of any of the preceding embodiments, the in vitro method comprises assessing levels of Tyrosine Hydroxylase expression for a reference cell population; and the class label is designated as not a determined dopaminergic precursor cell if the reference cell population expresses high Tyrosine Hydroxylase. In some embodiments, the levels of Tyrosine Hydroxylase expression are assessed using flow cytometry.
In some of any of the preceding embodiments, the reference database further comprises the class labels of the one or more reference cells.
In some of any of the preceding embodiments, the expression levels of the one or more metagenes in the test dataset are determined based on (i) the one or more metagenes determined from the one or more reference cells in the reference database and (ii) the gene expression levels in the test dataset. In some embodiments, the expression levels of the one or more metagenes in the test dataset are determined using regression analysis based on (i) the one or more metagenes determined from the one or more reference cells in the reference database and (ii) the gene expression levels in the test dataset. In some of any of the preceding embodiments, the expression levels of the one or more metagenes in the test dataset are determined by merging the gene expression levels in the test dataset with the reference database to create an updated reference database and applying the dimensionality reduction technique on the updated reference database.
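One way to read the regression-based option is as a least-squares fit of a test sample onto metagenes already derived from the reference cells. The sketch below assumes the metagenes are the columns of a genes-by-metagenes matrix `W` and uses ordinary least squares; the matrix values, the function name, and the choice of unconstrained least squares (rather than, say, a non-negative fit) are assumptions for illustration.

```python
import numpy as np

def project_onto_metagenes(W, v):
    """Least-squares estimate of metagene levels h such that W @ h approximates v.

    W: genes x metagenes matrix derived from the reference cells (hypothetical here).
    v: gene expression vector of one test sample, same gene order as W.
    """
    h, *_ = np.linalg.lstsq(W, v, rcond=None)
    return h

# Hypothetical metagene matrix with 5 genes and 2 metagenes.
W = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [0.0, 1.0],
              [0.0, 3.0],
              [1.0, 1.0]])

# A test sample constructed from known metagene levels, so the fit can be checked.
v_test = W @ np.array([0.5, 2.0])
h_test = project_onto_metagenes(W, v_test)

# Expression levels implied by the fitted metagene levels.
v_expected = W @ h_test
```

The vector `h_test` would then serve as input to the classifier, and `v_expected` gives the gene-level expression implied by the metagene fit.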
In some of any of the preceding embodiments, the dimensionality reduction technique is conventional non-negative matrix factorization, discriminant non-negative matrix factorization, graph regularized non-negative matrix factorization, bootstrapping sparse non-negative matrix factorization, or regularized non-negative matrix factorization. In some of any of the preceding embodiments, the dimensionality reduction technique is conventional non-negative matrix factorization.
In some of any of the preceding embodiments, the number of the one or more metagenes is chosen based on the performance of the supervised classification model in determining a probability of a cell or a plurality of cells having metagene expression levels of a determined dopaminergic precursor cell. In some of any of the preceding embodiments, the number of the one or more metagenes is chosen based on evaluating one or more metrics determined from performing the dimensionality reduction technique using multiple candidate numbers of metagenes. In some embodiments, the one or more metrics comprise cophenetic distance, dispersion, residuals, residual sum of squares (RSS), silhouette, and/or sparseness values.
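The following is a minimal sketch of conventional non-negative matrix factorization and of comparing candidate numbers of metagenes by residual sum of squares (RSS). It assumes the widely used multiplicative update rules for the squared-error objective, a genes-by-samples layout, and a small synthetic matrix; none of these specifics come from the specification, and the other listed metrics (cophenetic distance, dispersion, silhouette, sparseness) are not computed here.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factor V (genes x samples) into W (genes x k) and H (k x samples), all non-negative."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update metagene expression levels
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update metagene definitions
    return W, H

def rss(V, W, H):
    """Residual sum of squares of the reconstruction W @ H."""
    return float(np.sum((V - W @ H) ** 2))

# Synthetic expression matrix built from two underlying metagenes (hypothetical data).
W_true = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0], [1.0, 1.0]])
H_true = np.array([[1.0, 2.0, 0.0, 0.5], [0.0, 1.0, 3.0, 0.5]])
V = W_true @ H_true

# Evaluate the RSS for several candidate numbers of metagenes.
rss_by_k = {}
for k in (1, 2, 3):
    W, H = nmf(V, k)
    rss_by_k[k] = rss(V, W, H)
```

A sharp drop in RSS followed by a plateau as `k` increases is one common heuristic for choosing the number of metagenes; classifier performance, as described above, is another.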
In some of any of the preceding embodiments, the computed label classification indicates that said cell or plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell if the probability of the cell or the plurality of cells having metagene expression levels of the determined dopaminergic precursor cell is greater than a threshold probability value. In some embodiments, the threshold probability value is set such that a determined dopaminergic precursor cell is identified with greater than or greater than about 75%, 80%, 85%, 90%, or 95% sensitivity; and/or the threshold probability value is set such that a determined dopaminergic precursor cell is identified with greater than or greater than about 75%, 80%, 85%, 90%, or 95% specificity. In some embodiments, the threshold probability value is set such that a determined dopaminergic precursor cell is identified with greater than or greater than about 98% sensitivity and 100% specificity. In some of any of the preceding embodiments, the threshold probability value is determined by using the area under a receiver operating characteristic (ROC) curve based on the supervised classification model. In some of any of the preceding embodiments, the threshold probability value is between or between about 0.4 and 0.8 inclusive. In some of any of the preceding embodiments, the threshold probability value is or is about 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, or 0.8.
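To make the relationship between the threshold probability value and sensitivity and specificity concrete, the sketch below sweeps the candidate thresholds listed above and returns the first one meeting both targets. The probabilities, labels, and function names are hypothetical, and this simple sweep stands in for, rather than reproduces, a full ROC analysis.

```python
def sensitivity_specificity(probs, labels, threshold):
    """Sensitivity and specificity when a sample is called positive if prob > threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p > threshold)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p <= threshold)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p <= threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p > threshold)
    return tp / (tp + fn), tn / (tn + fp)

def pick_threshold(probs, labels, candidates, min_sens, min_spec):
    """Return the first candidate threshold meeting both targets, or None if none does."""
    for t in candidates:
        sens, spec = sensitivity_specificity(probs, labels, t)
        if sens >= min_sens and spec >= min_spec:
            return t
    return None

# Hypothetical classifier outputs for labeled reference samples (1 = determined precursor).
probs = [0.95, 0.85, 0.70, 0.62, 0.35, 0.30, 0.20, 0.55]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
candidates = [0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8]

threshold = pick_threshold(probs, labels, candidates, min_sens=0.9, min_spec=0.9)
```

With this toy data, thresholds below 0.55 misclassify the negative sample scored 0.55, so the sweep settles on 0.55.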
In some of any of the preceding embodiments, the deviation score for the cell or the plurality of cells is determined using a single-gene deviation score for each of one or more genes in the test dataset. In some embodiments, the single-gene deviation scores are determined using differences between the gene expression levels of the test dataset and the gene expression levels in one or more reference cells in the reference database. In some embodiments, the differences are absolute differences. In some of any of the preceding embodiments, the single-gene deviation scores are determined using standard deviations of gene expression levels in one or more of the one or more reference cells. In some of any of the preceding embodiments, the single-gene deviation scores are z-scores determined using the differences between the gene expression levels of the test dataset and the gene expression levels in the one or more reference cells in the reference database; and the standard deviations of gene expression levels in one or more of the one or more reference cells of the reference database.
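A single-gene deviation score of the z-score type can be sketched as follows. The sketch assumes the reference value for each gene is the mean across reference samples, that the sample standard deviation is used, and that absolute differences are taken; the gene values shown are invented for illustration.

```python
from statistics import mean, stdev

def single_gene_z_scores(test_expr, reference_expr):
    """Absolute z-score of each test gene against the reference distribution.

    test_expr: {gene: expression level in the test sample}
    reference_expr: {gene: [expression levels across reference samples]}
    """
    scores = {}
    for gene, level in test_expr.items():
        ref = reference_expr[gene]
        # Absolute difference from the reference mean, in units of reference SD.
        scores[gene] = abs(level - mean(ref)) / stdev(ref)
    return scores

# Hypothetical expression values for two of the marker genes named in this document.
reference_expr = {"LMX1A": [8.0, 10.0, 12.0], "FOXA2": [5.0, 5.5, 6.0]}
test_expr = {"LMX1A": 14.0, "FOXA2": 5.5}

z = single_gene_z_scores(test_expr, reference_expr)
```

Here LMX1A lies two reference standard deviations from the reference mean, while FOXA2 matches the reference mean exactly.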
In some of any of the preceding embodiments, the gene expression levels in one or more reference cells in the reference database are determined based on average gene expression levels in one or more reference cells of the reference database. In some of any of the preceding embodiments, the gene expression levels in the one or more reference cells in the reference database are determined based on the expression levels of the one or more metagenes in the test dataset. In some embodiments, the gene expression levels in the one or more reference cells in the reference database are determined using regression analysis based on (i) the expression levels of the one or more metagenes in the test dataset and (ii) the gene expression levels in the test dataset.
In some of any of the preceding embodiments, the deviation score is a summary statistic based on all single-gene deviation scores. In some of any of the preceding embodiments, the deviation score is a summary statistic based on single-gene deviation scores for one or more marker genes. In some of any of the preceding embodiments, the summary statistic is a sum. In some of any of the preceding embodiments, the summary statistic is a weighted sum. In some embodiments, the single-gene deviation scores of the one or more marker genes have higher weight.
In some of any of the preceding embodiments, the summary statistic is a percentile value. In some embodiments, the percentile value is between or between about the 50% percentile and the 100% percentile; and/or the percentile value is or is about the 50%, 60%, 70%, 80%, 90%, or 95% percentile.
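The summary statistics described above (sum, weighted sum, percentile) can be illustrated with a short sketch. The scores, the weights, and the nearest-rank percentile definition are assumptions chosen for this example; other percentile conventions would give slightly different values.

```python
import math

def weighted_sum(scores, weights, default_weight=1.0):
    """Sum of single-gene deviation scores; genes listed in weights count more (or less)."""
    return sum(weights.get(gene, default_weight) * s for gene, s in scores.items())

def percentile(values, q):
    """Nearest-rank percentile of values, for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical single-gene deviation scores; marker genes get a higher weight.
scores = {"LMX1A": 2.0, "FOXA2": 0.5, "NES": 1.0, "TH": 6.0}
marker_weights = {"LMX1A": 2.0, "FOXA2": 2.0}

plain_sum = sum(scores.values())
marker_weighted = weighted_sum(scores, marker_weights)
median_score = percentile(scores.values(), 50)
p95_score = percentile(scores.values(), 95)
```

A high percentile such as the 95th emphasizes the worst-deviating genes, whereas a sum or weighted sum reflects overall deviation across the profile.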
In some of any of the preceding embodiments, the marker genes comprise radial glial cell markers, early neuronal development genes, pluripotency specific markers, intermediate to late neuronal markers, neurofilament light polypeptide chain markers, neurofilament medium polypeptide chain markers, nestin filament markers, early patterning markers, neural progenitor cell markers, early migration markers, stage-specific transcription factors, genes required for normal development of neurons, genes controlling dopaminergic neuron development, genes regulating identity and fate of neuronal progenitor cells, dopaminergic neuron markers, astrocyte markers, forebrain markers, hindbrain markers, subthalamic nucleus markers, radial glial markers, cell cycle markers, or any combination of any of the foregoing. In some of any of the preceding embodiments, the marker genes are or comprise WNT1, VIM, TOP2A, TH, SOX2A, SLIT2, RFX4, POU5F1, PITX2, PAX6, OTX2, NR4A2, NHLH2, NEUROD4, NEUROD1, NES, NEFM, NEFL, NASP, MAP2, LMX1A, LIN28A, HOXA2, HMGB2, HES1, FOXG1, FOXA2, FABP7, DDC, DCX, BARHL2, BARHL1, ASPM, ALDH1A1, or any combination of any of the foregoing.
In some of any of the preceding embodiments, the computed label classification indicates that said cell or plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell if the deviation score indicates that at least or at least about 50%, 60%, 70%, 80%, 90%, or 95% of gene expression levels in the test dataset are no more than five standard deviations away from gene expression levels of the one or more reference cells in the reference database. In some of any of the preceding embodiments, the computed label classification indicates that said cell or plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell if the deviation score indicates that at least or at least about 95% of gene expression levels in the test dataset are no more than 10, 9, 8, 7, 6, or 5 standard deviations away from the gene expression levels of the one or more reference cells in the reference database. In some of any of the preceding embodiments, the computed label classification indicates that said cell or plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell if the deviation score indicates that at least or at least about 50%, 60%, 70%, 80%, 90%, or 95% of marker gene expression levels in the test dataset are no more than five standard deviations away from the gene expression levels of the one or more reference cells in the reference database.
In some of any of the preceding embodiments, the computed label classification indicates that said cell or plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell if the deviation score indicates that at least or at least about 95% of marker gene expression levels in the test dataset are no more than 10, 9, 8, 7, 6, or 5 standard deviations away from the gene expression levels of the one or more reference cells in the reference database.
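The deviation criterion above amounts to checking what fraction of single-gene z-scores fall within a given number of standard deviations. A minimal sketch follows, with hypothetical z-scores and function names; the defaults of five standard deviations and 95% are two of the values recited above.

```python
def fraction_within(z_scores, max_sd=5.0):
    """Fraction of genes whose z-score magnitude is at most max_sd."""
    values = list(z_scores)
    return sum(1 for z in values if abs(z) <= max_sd) / len(values)

def passes_deviation_check(z_scores, max_sd=5.0, min_fraction=0.95):
    """True if at least min_fraction of genes lie within max_sd standard deviations."""
    return fraction_within(z_scores, max_sd) >= min_fraction

# Hypothetical z-scores for ten genes; one gene deviates by more than five SD.
z_scores = [0.2, 1.1, 0.7, 4.9, 5.0, 2.3, 0.4, 3.3, 1.8, 7.5]

frac = fraction_within(z_scores)
strict = passes_deviation_check(z_scores, min_fraction=0.95)
lenient = passes_deviation_check(z_scores, min_fraction=0.90)
```

The same functions apply whether the z-scores cover all genes in the test dataset or only the marker genes.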
In some of any of the preceding embodiments, the computed label classification indicates that said cell or plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell if the probability of the cell or the plurality of cells having metagene expression levels of the determined dopaminergic precursor cell is greater than the threshold probability value; and the deviation score indicates that at least or at least about 50%, 60%, 70%, 80%, 90%, or 95% of gene expression levels in the test dataset are no more than five standard deviations away from the gene expression levels of the one or more reference cells in the reference database. In some of any of the preceding embodiments, the computed label classification indicates that said cell or plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell if the probability of the cell or the plurality of cells having metagene expression levels of the determined dopaminergic precursor cell is greater than the threshold probability value; and the deviation score indicates that at least or at least about 50%, 60%, 70%, 80%, 90%, or 95% of marker gene expression levels in the test dataset are no more than five standard deviations away from the gene expression levels of the one or more reference cells in the reference database.
In some of any of the preceding embodiments, the computed label classification indicates that said cell or plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell if the probability of the cell or the plurality of cells having metagene expression levels of the determined dopaminergic precursor cell is greater than the threshold probability value; the deviation score indicates that at least or at least about 50%, 60%, 70%, 80%, 90%, or 95% of gene expression levels in the test dataset are no more than five standard deviations away from the gene expression levels of the one or more reference cells in the reference database; and the deviation score indicates that at least or at least about 50%, 60%, 70%, 80%, 90%, or 95% of marker gene expression levels in the test dataset are no more than five standard deviations away from the gene expression levels of the one or more reference cells in the reference database.
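Combining the probability criterion with the two deviation criteria can be sketched as a single decision function. The function name, label strings, default threshold of 0.5, and input values are assumptions for illustration; the specification recites several alternative thresholds and fractions.

```python
def classify(probability, z_all, z_marker,
             threshold=0.5, max_sd=5.0, min_fraction=0.95):
    """Return a label based on the classifier probability and two deviation checks."""
    def fraction_within(z_scores):
        return sum(1 for z in z_scores if abs(z) <= max_sd) / len(z_scores)

    is_precursor = (
        probability > threshold                        # metagene-based probability
        and fraction_within(z_all) >= min_fraction     # all genes in the test dataset
        and fraction_within(z_marker) >= min_fraction  # marker genes only
    )
    if is_precursor:
        return "determined dopaminergic precursor cell"
    return "not a determined dopaminergic precursor cell"

# Hypothetical inputs: 49 of 50 genes within five SD, and all marker genes within five SD.
z_all = [1.0] * 49 + [6.0]
z_marker = [0.5, 1.2, 2.0, 0.8]

label_ok = classify(0.82, z_all, z_marker)
label_low_prob = classify(0.30, z_all, z_marker)
label_marker_off = classify(0.82, z_all, [1.0, 7.0])
```

Requiring all three conditions means a sample with a high classifier probability is still rejected if its marker genes deviate strongly from the reference.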
In some of any of the preceding embodiments, the computed label classification indicates that said cell or plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell if the differences in expression of the marker genes between the test dataset and reference cells of the reference database are statistically insignificant based on a multiple-comparison corrected significance level. In some embodiments, the multiple-comparison corrected significance level is a Bonferroni corrected significance level or a false discovery rate corrected significance level. In some of any of the preceding embodiments, the multiple-comparison corrected significance level is 0.01, 0.05, or 0.1.
In some of any of the preceding embodiments, said gene expression levels are obtained from microarray analysis of cellular RNA, RNA sequencing, or both. In some of any of the preceding embodiments, said gene expression levels are obtained from RNA sequencing. In some of any of the preceding embodiments, the RNA sequencing is performed on bulk RNA from the plurality of cells or a plurality of reference cells. In some of any of the preceding embodiments, the RNA sequencing is performed on RNA from a single cell or a single reference cell. In some of any of the preceding embodiments, the gene expression levels of reference cells in the reference database comprise expression levels determined by RNA sequencing that is performed on bulk RNA from a plurality of reference cells and on RNA from a single reference cell.
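For illustration, the two named corrections can be applied to per-marker-gene p-values as sketched below. The p-values are invented, how they are obtained (which statistical test) is left open, and the false discovery rate control shown is the Benjamini-Hochberg step-up procedure, which is an assumption since the specification does not name a particular procedure.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that remain significant after Bonferroni correction."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg_significant(p_values, alpha=0.05):
    """Flag p-values that are significant under Benjamini-Hochberg FDR control."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            max_rank = rank            # largest rank meeting the step-up criterion
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            significant[i] = True
    return significant

# Hypothetical p-values for expression differences of five marker genes.
p_values = [0.001, 0.012, 0.025, 0.2, 0.5]

bonf = bonferroni_significant(p_values)
bh = benjamini_hochberg_significant(p_values)

# Under the criterion above, the sample qualifies only if no marker gene differs significantly.
no_significant_difference = not any(bh)
```

Bonferroni is the more conservative of the two, so fewer marker genes are flagged as differing than under false discovery rate control.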
In some of any of the preceding embodiments, receiving said test dataset comprises receiving input from an array analysis system. In some of any of the preceding embodiments, receiving the test dataset comprises receiving input via a computer network. In some of any of the preceding embodiments, said one or more reference databases forms part of a storage medium.
In some of any of the preceding embodiments, the method comprises repeating the receiving, applying, determining, and outputting steps if the computed label classification indicates that said cell or plurality of cells is not a determined dopaminergic neuronal cell, optionally wherein the steps are repeated on the same or a different in vitro population of neuronal progenitor cells. In some embodiments, the receiving, applying, determining, and outputting steps are repeated at, or at about, one, two, three, four, five, six, seven, eight, nine, or 10 days after the previous iteration of the method.
In some of any of the preceding embodiments, the method comprises repeating the receiving, applying, determining, and outputting steps if the computed label classification indicates that said cell or plurality of cells is not a determined dopaminergic neuronal cell, wherein the steps are repeated using a different in vitro population of neuronal progenitor cells formed by culturing another iPSC clone under conditions capable of differentiating the one or more iPSCs to a neuronal progenitor cell, optionally wherein the neuronal progenitor cell is one or more of a floor plate midbrain progenitor cell, a determined dopaminergic precursor cell, or a dopamine (DA) neuron. In some embodiments, said different in vitro population of neuronal progenitor cells is formed from the same human subject as the previous iteration of the method.
In some of any of the preceding embodiments, the receiving, applying, determining, and outputting steps are repeated on in vitro populations of neuronal progenitor cells formed by culture of iPSCs for different periods of time and/or under different conditions capable of differentiating the one or more iPSCs to a neuronal progenitor cell, until an indication that said cell or said plurality of cells is a determined dopaminergic neuronal cell is output.
Also provided herein are populations of determined dopaminergic precursor cells identified by the method of some of any of the preceding embodiments.
Also provided herein are methods of treatment, the methods comprising administering to a subject having Parkinson's disease the population of determined dopaminergic precursor cells of some of any of the preceding embodiments. In some embodiments, the administering is by implanting the population of determined dopaminergic precursor cells into one or more brain regions of the subject. In some embodiments, the one or more brain regions comprise the substantia nigra.
In some of any of the preceding embodiments, the population of determined dopaminergic precursor cells is autologous to the subject. In some of any of the preceding embodiments, the population of determined dopaminergic precursor cells is allogeneic to the subject.
Also provided herein are methods of treating a subject having Parkinson's disease, the methods comprising implanting a population of determined dopaminergic precursor cells into a brain region of a subject having Parkinson's disease, wherein the population of determined dopaminergic precursor cells has been identified using the computer implemented method of some of any of the preceding embodiments.
In some embodiments, the population of determined dopaminergic precursor cells is autologous to the subject. In some of any of the preceding embodiments, the population of determined dopaminergic precursor cells is allogeneic to the subject. In some of any of the preceding embodiments, about or at least 1×10⁶ cells are injected into the substantia nigra. In some of any of the preceding embodiments, the cells are injected into both the left and right hemispheres.
Provided herein is a method of classifying whether an in vitro population of neuronal progenitor cells contains a particular differentiated neuronal cell type. In some embodiments, the provided methods classify whether an in vitro population of differentiated neuronal cells contains determined dopaminergic precursor cells. In some embodiments, the methods provided herein identify whether an in vitro population of neuronal cells contains determined dopaminergic precursor cells. In some embodiments, determined dopaminergic precursor cells are cells that differentiate into dopaminergic neurons and cannot differentiate into non-dopaminergic cells. A cell population that is classified according to the provided method can be used to identify cells of interest, for example, for therapeutic application. Thus, also provided are populations of determined dopaminergic precursor cells identified by the provided methods, and pharmaceutical compositions containing the same. In some embodiments, the determined dopaminergic precursor cells have therapeutic application in the treatment of neurodegenerative diseases, such as Parkinson's disease.
The provided methods include receiving a test dataset that includes (1) gene expression levels and (2) expression levels of one or more metagenes for a cell or a plurality of cells contained in an in vitro population of neuronal progenitor cells, in which the one or more metagenes are determined based on correlated gene expression levels of reference cells in a reference database. In some embodiments, the in vitro population of neuronal progenitor cells is a population of cells that has been subjected to a process to differentiate pluripotent stem cells, such as induced pluripotent stem cells (iPSCs), into neuronal cells, such as dopaminergic neurons or a determined precursor of dopaminergic neurons. In some embodiments, the methods include applying the expression levels of the one or more metagenes as input to a process configured to determine a probability of the cell or the plurality of cells in the in vitro population of neuronal progenitor cells having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the methods also include determining a deviation score for the cell or the plurality of cells in the in vitro population of neuronal progenitor cells, in which the deviation score indicates the degree to which the gene expression levels in the test dataset deviate from gene expression levels in one or more reference cells in the reference database, wherein the one or more reference cells are at a stage of differentiation indicating a determined dopaminergic precursor cell. In some embodiments, the deviation score is determined using the gene expression levels in the test dataset and the gene expression levels in a reference database.
In some embodiments, the methods include outputting, based on the probability and the deviation score, a computed label classification that provides an indication of whether said cell or said plurality of cells from the in vitro population of neuronal progenitor cells is a determined dopaminergic precursor cell, thereby classifying whether the in vitro population of neuronal progenitor cells is a population that is or contains determined dopaminergic precursor cells. In some embodiments, the methods thus can identify based on the classification whether the in vitro population of neuronal progenitor cells is a population that contains determined dopaminergic precursor cells.
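By way of non-limiting illustration, one possible way of deriving metagenes from correlated gene expression levels of reference cells may be sketched in Python as follows. The manner in which metagenes are determined is not limited to this sketch. It assumes a simple greedy grouping of genes by Pearson correlation and takes the metagene level as the mean of its member genes; the gene names, the correlation cut-off of 0.8, and the function names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def build_metagenes(ref_expr, min_corr=0.8):
    """Greedily group genes whose reference expression correlates with a seed gene.

    ref_expr maps a gene name to its expression levels across reference cells.
    """
    unassigned = list(ref_expr)
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        group = [seed]
        for gene in unassigned[:]:  # iterate over a copy while removing members
            if pearson(ref_expr[seed], ref_expr[gene]) >= min_corr:
                group.append(gene)
                unassigned.remove(gene)
        groups.append(group)
    return groups


def metagene_levels(test_expr, groups):
    """Metagene level for a test cell: mean expression of each group's genes."""
    return [mean(test_expr[g] for g in group) for group in groups]


if __name__ == "__main__":
    # Hypothetical reference data: three genes measured in four reference cells.
    ref = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [4, 3, 2, 1]}
    groups = build_metagenes(ref)
    print(groups, metagene_levels({"G1": 2.0, "G2": 4.0, "G3": 1.0}, groups))
```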
In some embodiments, certain differentiated neuronal cell populations differentiated from pluripotent stem cells, including determined dopaminergic precursor cells, may be cells in a stage of differentiation where the cells are not identifiable by one or a small number of features or characteristics. The methods provided herein allow for the determination of cell identity when a single or small number of features or characteristics, such as gene expression markers or functional properties, are unavailable (e.g., unknown) or cannot be practically used to determine cellular identity. For example, as shown in
Induced pluripotent stem cells (iPSCs) are considered useful as a cell therapy for at least their ability to be differentiated into specialized cell types. For example, iPSCs, like pluripotent stem cells, can be differentiated into specific cell types that can be used to replace diseased or damaged tissue. In some cases, iPSCs that have been differentiated into a particular neuronal cell type or precursor may be used to treat neurodegenerative diseases, for example by differentiating iPSCs and implanting the differentiated neuronal cells into the brain of a subject having a neurodegenerative disease. The inability to determine the identity of the differentiated cells throughout the differentiation process can lead to uncertainty about the success of the process. For example, the differentiation process may need to be run to completion in order to determine if the differentiation process was successful. Thus, without the ability to determine whether differentiating cells are progressing through the transient stages as needed, the differentiation process becomes time consuming and inefficient, and can hinder treatment of the subject, for example when a differentiation process fails. Furthermore, in some cases, the therapeutic treatment can include administering (e.g., injecting) to the subject differentiated cells that have not entered a final differentiation stage.
In some embodiments, cells at an intermediate stage of differentiation cannot be, or cannot easily be, identified by definitive biomarkers. The methods provided herein allow for the identification of cells at stages of differentiation where no definitive features or characteristics are available or can be practically used to determine cell identity. In some embodiments, the methods provided herein improve the differentiation process, for example, by allowing a determination of cell identity throughout the stages of differentiation, which can be used to determine whether cells undergoing a differentiation process are differentiating appropriately and/or according to defined standards. If it is determined that the cells are not differentiating appropriately, in some embodiments, the process can be terminated and optionally reinitiated with different iPSC clones from the patient.
In some embodiments, the methods provided herein may be used in combination with a process that includes generating neuronal cells useful for the treatment of a neurodegenerative disease, such as Parkinson's disease, by differentiation from iPSCs. In some embodiments, the methods provided herein can be used to identify neuronal cells generated by a differentiation process, for example a process described in Section II, that are useful for the treatment of Parkinson's disease.
The methods provided herein can be used to determine if an in vitro population of cells comprises predetermined dopaminergic precursor cells. In some embodiments, the methods provided herein comprise determining metagenes and expression levels thereof of test cells comprised in the in vitro population. In some embodiments, the methods provided herein comprise determining the probability of the test cells having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the probability is determined using a machine learning model. In some embodiments, the methods provided herein comprise determining a deviation score indicating the degree to which the gene expression levels of the test cells deviate from expected gene expression levels. In some embodiments, the expected gene expression levels are based on gene expression levels of reference cells that are known to be determined dopaminergic precursor cells. In some embodiments, the methods provided herein comprise outputting a computed label classification based on one or both of (i) the probability of the test cells having metagene expression levels of a determined dopaminergic precursor cell and (ii) the deviation score. In some embodiments, the deviation score is based on a subset of marker genes. In some embodiments, determining the probability of the test cells having metagene expression levels of a determined dopaminergic precursor cell allows for the identification of cells with the desired phenotype, said phenotypes lacking individual marker genes. In some embodiments, determining the deviation score allows for the identification of cells that may contain abnormalities, for instance in the expression of certain marker genes. Thus, the methods provided herein provide a multifaceted approach for determining suitable cells for treatment.
In the subsections below, exemplary features of provided methods of classifying whether an in vitro population of neuronal progenitor cells contains a particular differentiated neuronal cell type, and methods for identifying a particular differentiated neuronal cell type, are described. Related compositions and methods of production and uses thereof also are described.
Provided herein are, inter alia, methods that use gene expression as a phenotype to identify dopaminergic precursors in an in vitro cell population of neuronal progenitor cells. The methods provided herein provide, inter alia, information on whether a cell preparation (e.g., a population of neuronal progenitor cells) includes cells that are determined to differentiate into a specific functional cell type (e.g., a determined dopaminergic precursor cell) or whether the cell preparation includes cells from earlier stages (e.g., pluripotent stem cells, specified cells), other differentiating neuron types, or other differentiated cell types.
Thus, in one aspect, a computer implemented method of identifying a determined dopaminergic precursor cell within an in vitro population of neuronal progenitor cells is provided. The method includes, receiving a test dataset including data including gene expression profile information for an in vitro population of neuronal progenitor cells; querying a gene expression reference database to compare the test dataset with the gene expression reference database, the gene expression reference database including gene expression profile information for a desirable determined dopaminergic precursor cell; and outputting a computed label classification including an indication of whether the in vitro population of neuronal progenitor cells includes a determined dopaminergic precursor cell.
The methods provided herein may define a determined state of a cell and predict whether a cell preparation will differentiate into a specific cell type. The reference database provided herein may include gene expression profile information of two cell types. In embodiments, the cells identified with the methods provided herein are determined to differentiate into a specific functional cell type. Whether a cell is determined to differentiate into a specific functional cell type (e.g., a determined dopaminergic precursor cell) may further be demonstrated in vitro or in vivo by allowing the cells to fully differentiate. In embodiments, the cells identified with the methods provided herein are pluripotent stem cells, specified cells, differentiating neuron types other than dopaminergic precursors or other differentiated cell types.
In embodiments, the computer implemented method further includes a machine learning model trained to determine whether the in vitro population of neuronal progenitor cells includes the determined dopaminergic precursor cell, the machine learning model outputting the computed label classification. In embodiments, the in vitro population of neuronal progenitor cells is formed by allowing an induced pluripotent stem cell (iPSC) to differentiate in vitro. In embodiments, the iPSC is a human iPSC. In embodiments, the iPSC is cultured for at least 15 days under conditions for differentiation into a neuronal progenitor cell. In embodiments, the iPSC is cultured for about 18 days under conditions for differentiation into a neuronal progenitor cell. The in vitro cell population of neuronal progenitor cells provided herein may be formed by methods commonly known and used in the art to differentiate dopaminergic neurons from iPSCs. Exemplary methods of differentiation processes are described in Section II. Different timepoints of the process for differentiating dopaminergic neurons from iPSCs may result in cells that are at different stages of differentiation. Therefore, the term “d18” or “day 18” as provided herein refers to the 18th day of the process of differentiating an iPSC to form a dopaminergic neuron. Likewise, the term “d0” or “day 0” refers to the day on which the process of differentiating an iPSC to form a dopaminergic neuron is initiated. The provided methods can be used to classify, and thus identify, a differentiated population of neuronal cells that, based on classification labels in accord with the provided methods, is determined to contain a particular neuronal progenitor cell, such as a determined dopaminergic precursor cell.
In some embodiments, the computer implemented method includes a machine learning model trained to determine the probability of a cell or plurality of cells comprised in the in vitro population of neuronal progenitor cells as having metagene expression levels of a determined dopaminergic precursor cell. In embodiments, the machine learning model outputs the probability (also referred to herein as a Neuroscore) of the cell or plurality of cells having metagene expression levels of a determined dopaminergic precursor cell. In embodiments, the computer implemented method further includes determining a deviation score (also referred to herein as a Novelty score) for the cell or plurality of cells, wherein the deviation score is indicative of the degree to which gene expression levels of the cell or plurality of cells deviate from expected gene expression levels. In some embodiments, the expected gene expression levels are based on gene expression levels of reference cells, e.g., reference cells that are known to be determined dopaminergic precursor cells. In some embodiments, the computer implemented method includes outputting, based on the probability and the deviation score, the computed label classification.
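By way of non-limiting illustration, the mapping from metagene expression levels to a probability may be sketched in Python as follows. The type of machine learning model is not limited to this sketch, which assumes a logistic model purely for illustration. The weights and bias would ordinarily be learned from labeled reference cells; the values used here, and the function name, are hypothetical.

```python
import math


def neuroscore(metagene_levels, weights, bias=0.0):
    """Probability from a logistic model over metagene levels (illustrative only)."""
    if len(metagene_levels) != len(weights):
        raise ValueError("one weight is required per metagene")
    z = bias + sum(w * x for w, x in zip(weights, metagene_levels))
    # Numerically stable sigmoid: avoids overflow for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Hypothetical weights: the first metagene supports the label, the second opposes it.
    print(neuroscore([3.0, 1.0], weights=[1.0, -1.0]))
```

The resulting probability can then be compared with a threshold probability value and combined with the deviation score to produce the computed label classification.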
The methods, algorithms, and systems described herein are designed to produce a new way of defining a determined dopaminergic precursor cell or dopaminergic cell. This new way is called a computed definition and the previous types of definitions are referred to as biological definitions (functional, structural, genesis). The computed definition is related to a biological definition, but as discussed herein, the computed definition provides a more robust and accurate way of comparing two different cells and determining whether they are the same type of cell or different cell types. In some embodiments, the computed definition provides a more robust and accurate way of identifying a cell of unknown identity.
The computed definition refers to the use of computational analysis of information to arrive at the definition. Disclosed are databases of information about one or more cells. For example, some of the databases are reference databases. A reference database can comprise cell datasets that are produced from cell data for at least two known cell lines, tissues, or primary cells. By known cell line, tissue, or primary cell is meant a cell line, tissue, or primary cell for which some characteristic, such as a phenotype (e.g., a dopaminergic cell or a determined dopaminergic precursor cell), has been identified by conventional biological assays, e.g., derivation method, source material, biochemical assays (e.g., enzyme activity, such as alkaline phosphatase activity), or markers such as specific, identified proteins that are thought to be able to identify a specific cell type. In some embodiments, the cells for which some characteristics are known are referred to as reference cells. A computed phenotype can be defined by global profiling methods, such as gene expression (or another molecular profiling method), which is then utilized in the methods disclosed herein. Biological phenotypes, such as whether a cell is a stem cell or a differentiated cell, which have been determined using subsets of profiling data, such as a subset of markers or gene expression, can be used and incorporated into the methods in the form of labeled associated biological classes.
A. Reference Cells
The methods provided herein, in some aspects, include the use of reference cells and/or reference databases to identify (e.g., determine) the presence of determined dopaminergic precursor cells within an in vitro population of neuronal progenitor cells. The types of reference cells contemplated for use according to the methods provided herein include cells with known identity (e.g., labeled cell) and known characteristics, e.g., have characterized gene expression profiles. In some embodiments, the reference databases comprise reference cell labels and the corresponding reference cell characteristics from a plurality of reference cells. In some embodiments, the reference database can be used, e.g., according to the methods provided herein, to determine whether a cell of unknown identity (e.g., unlabeled) having certain characteristics, e.g., gene expression patterns, has a certain cellular identity.
In some embodiments, the reference cell is a pluripotent stem cell. In some embodiments, the pluripotent stem cell is an induced pluripotent stem cell (iPSC). In some embodiments, the iPSC is generated from fibroblasts collected from a healthy human subject. In some embodiments, the iPSC is generated from fibroblasts collected from a human subject having Parkinson's disease. In some embodiments, the iPSC is generated from fibroblasts collected from a human subject predisposed to developing Parkinson's disease. Exemplary methods for iPSC generation are described in Section II.
In some embodiments, the reference cell is a cell differentiated under conditions to become a neuronal progenitor cell, such as a floor plate midbrain progenitor cell, a determined dopaminergic precursor cell, or a dopaminergic neuron. In some embodiments, the reference cell is a cell differentiated according to any of the methods described in Section II. In some embodiments, the reference cell is a determined dopaminergic precursor cell. In some embodiments, the reference cell is a dopaminergic neuron. In some embodiments, the differentiated cell, the determined dopaminergic cell, and/or the dopaminergic cell is derived from an iPSC, for example an iPSC as described above, that has been cultured under conditions to promote differentiation into a dopaminergic cell.
In some embodiments, the reference cell is a cell that is described, e.g., labeled or characterized, in a publicly available database.
In some embodiments, the reference cell is of known identity. Thus, in some instances, the identity of the cell can be used as a label for the reference cell. In some embodiments, the reference cell label is indicative of a cellular phenotype. In some embodiments, the reference cell label is indicative of cellular characteristics, e.g., gene expression levels. In some embodiments, the reference cell label indicates if the reference cell is a pluripotent stem cell. In some embodiments, the reference cell label indicates if the reference cell is a determined dopaminergic precursor cell. In some embodiments, the reference cell label indicates if the reference cell is a dopaminergic neuron.
In some embodiments, the reference cell label indicates the differentiation stage of the reference cell. In some embodiments, the reference cell label indicates the period of time that the reference cell has been cultured under differentiation conditions. In some embodiments, the reference cell label indicates the period of time that the reference cell has been cultured under differentiation conditions to become a dopaminergic neuron, e.g., any of the periods of time described in Section II.
In some embodiments, the reference cell label is based on publicly available annotations for the reference cell. In some embodiments, the reference cell label is based on the assessment of dopamine production levels of the reference cell. In some embodiments, dopamine production levels are assessed using high performance liquid chromatography (HPLC). In some embodiments, the reference cell label is based on the assessment of tyrosine hydroxylase (TH) expression in the reference cell. In some embodiments, TH expression is assessed using cell staining methods. In some embodiments, TH expression is assessed using flow cytometry. In some embodiments, the reference cell label is based on the assessment of FOXA2 expression in the reference cell. In some embodiments, FOXA2 expression is assessed using cell staining methods.
In some embodiments, a reference cell is characterized as a dopaminergic neuron if it expresses a marker of a midbrain dopaminergic neuron, such as expression of FOXA2 or tyrosine hydroxylase (TH). In some embodiments, a reference cell expresses TH (TH+). In some embodiments, the reference cell expresses FOXA2 (FOXA2+). In some embodiments, the reference cell expresses TH and FOXA2 (TH+FOXA2+).
In some embodiments, the reference cell is determined to become, or is capable of becoming, a dopaminergic neuron, i.e., is a determined dopaminergic precursor cell, as ascertained based on one or more characteristics that indicate the reference cell is capable of having functional activity of a dopaminergic neuron but may not yet express a marker of a dopaminergic neuron or may not express it at a high level. For example, a reference cell may exhibit lower levels of TH than a dopaminergic neuron, yet still exhibit one or more characteristics of a determined dopaminergic precursor cell indicating the differentiated cell is capable of having functional activity of a dopaminergic neuron. In some embodiments, the one or more characteristics of the reference cell include activity to survive, engraft, and/or innervate other cells when administered in vivo, e.g., to an animal model. In some embodiments, the reference cells are capable of innervating host tissue upon transplantation into an animal or human subject.
In some embodiments, the reference cell is a cell with therapeutic effect to treat a neurodegenerative disease. In some embodiments, the reference cell when implanted ameliorates or reverses symptoms of a neurodegenerative disease. In some embodiments, the neurodegenerative disease is Parkinson's disease. In some embodiments, the reference cells, when implanted in the substantia nigra of a subject, e.g., a patient, in need thereof, improve Parkinsonian symptoms.
In some embodiments, the reference cell is screened for its therapeutic effect to treat a neurodegenerative disease, such as determined in an animal model of a neurodegenerative disease. In some embodiments, the neurodegenerative disease is Parkinson's disease. In some embodiments, the reference cells are screened using an animal model of Parkinson's disease. Any known and available animal model of Parkinson's disease can be used for screening. In some embodiments, the animal model is a lesion model wherein animals received unilateral stereotaxic injection of 6-hydroxydopamine (6-OHDA) into the substantia nigra. In some embodiments, the animal model is a lesion model wherein animals received unilateral stereotaxic injection of 6-OHDA into the medial forebrain bundle. In some embodiments, the reference cells are implanted into the substantia nigra of the animal model. In some embodiments, a behavioral assay is performed to screen for therapeutic effects of the implantation on the animal model. In some embodiments, the behavioral assay comprises monitoring amphetamine-induced circling behavior. In some embodiments, the reference cell is determined to reduce, decrease or reverse a Parkinsonian model brain lesion in this model. In some embodiments, the reference cell may be a cell that does not reduce, decrease or reverse a Parkinsonian model brain lesion in this model. The reference database may include data from various reference cell populations that exhibit varied or different therapeutic effects to treat a neurodegenerative disease, such as in an animal model.
As described above, in some embodiments, any of a number of reference cell characteristics of a particular reference cell or cells can be determined, including any one or more characteristics, traits, features or attributes of a reference cell. In some embodiments, the reference cell characteristics can be used as data to characterize or describe a particular reference cell population. For instance, reference cell characteristics may include mRNA expression levels, microRNA expression levels, protein expression levels, post-translational protein modification levels, non-coding RNA expression profiles, DNA methylation levels, histone modification levels, transcription factor-DNA site binding profiles, DNA sequence profiles, or any other type of cell characteristic, or a combination of any of the foregoing. Any of the one or more of the reference cell characteristics can be used as data to input into or populate a reference cell database.
In some embodiments, reference cell characteristics include protein expression levels. In some embodiments, reference cell characteristics include post-translational protein modification levels. In some embodiments, reference cell characteristics include non-coding RNA expression profiles. In some embodiments, reference cell characteristics include epigenetic profiles. In some embodiments, reference cell characteristics include transcriptional profiles. In some embodiments, reference cell characteristics include gene expression levels. In some embodiments, the reference cell database can include information about any one or more of the above reference cell characteristics.
In some embodiments, the gene expression levels are obtained using microarray analysis. In some embodiments, the gene expression levels are obtained using RNA sequencing. In some embodiments, the gene expression levels are obtained using both microarray analysis and RNA sequencing. In some embodiments, the RNA sequencing is performed on bulk RNA from a plurality of cells. In some embodiments, the RNA sequencing is performed on single cells. In some embodiments, the RNA sequencing is performed on bulk RNA from a plurality of cells and on single cells.
In some aspects, a plurality of reference cells with known identities, e.g., labels, and known characteristics, e.g., gene expression levels, are used to populate a reference database. In some embodiments, the plurality of reference cells used to populate the reference database have different labels from one another. In some embodiments, a portion of the reference cells used to populate the reference database have the same label. In some embodiments, a portion of the reference cells used to populate the reference database have labels different from the other reference cells of the reference database. Thus, in some embodiments, the reference database may include a plurality of reference cells, some having the same label as other cells of the reference database and some having labels different from other cells in the reference database.
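By way of non-limiting illustration, a reference database populated with labeled reference cells may be sketched in Python as follows. The sketch assumes an in-memory structure in which each reference cell carries a label and a mapping of gene names to expression levels. The class names, field names, label strings, and example values are hypothetical and do not describe any particular database implementation.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ReferenceCell:
    cell_id: str
    label: str          # known identity, e.g. "iPSC" (hypothetical label string)
    expression: dict    # gene name -> expression level


@dataclass
class ReferenceDatabase:
    cells: list = field(default_factory=list)

    def add(self, cell):
        """Populate the database with one labeled reference cell."""
        self.cells.append(cell)

    def with_label(self, label):
        """Return all reference cells that share the given label."""
        return [c for c in self.cells if c.label == label]

    def label_counts(self):
        """Count how many reference cells carry each label."""
        return Counter(c.label for c in self.cells)


if __name__ == "__main__":
    db = ReferenceDatabase()
    db.add(ReferenceCell("ref-1", "iPSC", {"TH": 0.1, "FOXA2": 0.2}))
    db.add(ReferenceCell("ref-2", "determined DA precursor", {"TH": 2.5, "FOXA2": 6.0}))
    db.add(ReferenceCell("ref-3", "determined DA precursor", {"TH": 2.9, "FOXA2": 5.4}))
    print(db.label_counts())
```

In this sketch some reference cells share a label and others differ, so the cells carrying a label of interest can be retrieved as a group for comparison with a test dataset.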
In some embodiments, the reference cell characteristics for particular reference cells are included in a reference database. In some embodiments, the reference database contains reference cell labels. In some embodiments, the reference database contains protein expression levels of reference cells. In some embodiments, the reference database contains epigenetic profiles of reference cells. In some embodiments, the reference database contains transcriptional profiles of reference cells. In some embodiments, the reference database contains gene expression levels of reference cells. In some embodiments, the reference database contains gene expression data from publicly available databases. In some embodiments, the reference database contains microarray data. In some embodiments, the reference database contains RNA sequencing data. In some embodiments, the reference database contains microarray data and RNA sequencing data.
In some embodiments, the reference database contains bulk RNA sequencing data. In some embodiments, the bulk RNA sequencing data is obtained from a plurality of reference cells. In some embodiments, bulk RNA sequencing data is obtained from pooled RNA from the plurality of reference cells.
Any known and available methods for obtaining bulk RNA sequencing data can be used (for example, see Chao et al., 2019, BMC Genomics 20: 571, incorporated by reference herein in its entirety). For instance, total RNA from a sample, e.g., a plurality of reference cells from an in vitro population of cells, can be isolated using TRIZOL, treated with DNase I, and purified. Concentration and quality of isolated RNA can be measured and checked prior to library preparation for total RNA or mRNA. For library preparation, total RNA or mRNA is fragmented and converted to cDNA using reverse transcription. After construction, amplification, and optional barcoding of double-stranded cDNA, libraries can be processed for next generation sequencing using any known and available library preparation techniques, sequencing platforms, and genomic-alignment tools.
In some embodiments, the reference database includes single-cell RNA sequencing data. In some embodiments, the use of single-cell RNA sequencing data affords certain advantages. In some embodiments, the use of single-cell RNA sequencing data allows for characterization of subpopulations of cells, for instance of determined dopaminergic precursor cells within a larger in vitro population of cells. In some embodiments, the use of single-cell RNA sequencing data reduces the number of reference cells required for use in the methods provided herein. In some embodiments, the use of single-cell RNA sequencing data improves characterization of biological variability across reference cells. In some embodiments, the use of single-cell RNA sequencing data allows for easier validation and interpretation of gene expression levels.
Any known and available methods for single-cell RNA sequencing can be used (for example, see Zheng et al., 2017 (Nature Communications 8: 14049), and Haque et al., 2017 (Genome Medicine 9: 75), incorporated by reference herein in their entirety). For single-cell RNA sequencing, single cells from a sample, for instance an in vitro population of cells, can be isolated using flow cytometric cell-sorting, microfluidic platforms, or droplet-based methods. Isolated cells are lysed to allow capture of RNA molecules. Poly[T]-primers can be used for the analysis of polyadenylated mRNA molecules specifically, and primed mRNA molecules are converted to cDNA using reverse transcription. In some instances, unique molecular identifiers can be used to mark single mRNA molecules based on cellular origin. The cDNA pool is then amplified, optionally barcoded, and sequenced, for instance using next-generation sequencing (NGS) and with library preparation techniques, sequencing platforms, and genomic-alignment tools similar to those used for bulk RNA samples. In some instances, unbiased cell-type classification within a mixed population of distinct cell types can be achieved with as few as 10,000 to 50,000 reads per cell, and single-cell libraries from various common protocols can be close to saturation when sequenced to a depth of 1,000,000 reads.
In some embodiments, the reference databases comprise bulk RNA sequencing data and single-cell RNA sequencing data. In some embodiments, the bulk RNA sequencing data and the single-cell RNA sequencing data are obtained from the same sample, e.g., in vitro population of cells. In some embodiments, the single-cell RNA sequencing data can be used to approximate the bulk RNA sequencing data obtained from the same sample, e.g., in vitro population of cells. In some embodiments, approximated bulk RNA sequencing data is obtained by averaging single-cell RNA sequencing data from reference cells comprised in the same sample, e.g., in vitro population of cells. In some embodiments, the reference database comprises approximated bulk RNA sequencing data.
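By way of non-limiting illustration, the averaging of single-cell RNA sequencing data to approximate bulk RNA sequencing data can be sketched as follows. The function name approximate_bulk, the gene identifiers, and the expression values are hypothetical and are used only for illustration; treating a gene that is absent from a cell as zero expression is an assumption of this sketch.

```python
from statistics import fmean


def approximate_bulk(single_cell_profiles):
    """Average per-gene expression across single cells from one sample.

    single_cell_profiles: list of dicts mapping gene name -> expression
    level, one dict per cell. A gene missing from a cell is counted as 0
    (an assumption of this sketch).
    """
    genes = sorted({g for cell in single_cell_profiles for g in cell})
    return {
        g: fmean(cell.get(g, 0.0) for cell in single_cell_profiles)
        for g in genes
    }


# Hypothetical toy data: three cells from the same in vitro sample.
cells = [
    {"GENE_A": 4.0, "GENE_B": 2.0},
    {"GENE_A": 6.0, "GENE_B": 0.0, "GENE_C": 3.0},
    {"GENE_A": 5.0, "GENE_B": 1.0},
]
pseudobulk = approximate_bulk(cells)
```

In this sketch, each gene in the approximated bulk profile is the arithmetic mean of that gene across the cells of the sample; other aggregation choices (e.g., summing counts before normalization) may be equally suitable.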
In embodiments, the gene expression reference database includes transcriptional profiles of one or more dopaminergic neurons. In embodiments, the method includes classifying cells within the in vitro population of neuronal progenitor cells based at least in part on a computationally derived protein-protein network. In embodiments, the gene expression profile information includes a transcriptional profile. In embodiments, the gene expression profile information includes a transcriptional profile from a single cell. In embodiments, the gene expression reference database comprises known class labels.
The reference database is made up of cell datasets, and each cell dataset is made up of characteristic data. Characteristic data are output from, for example, mRNA expression analysis, microRNA expression analysis, protein expression analysis, post-translational protein modification analysis, non-coding RNA expression analysis, DNA methylation pattern analysis, histone modification analysis, transcription factor-DNA site binding analysis, DNA sequence analysis or any other type of cell characteristic.
B. Test Cells
In some aspects, the methods provided herein allow for determining whether a cell or plurality of cells of unknown identity are determined dopaminergic precursor cells. In some embodiments, the cell or plurality of cells of unknown identity are test cells. In some embodiments, the test cells are an in vitro population of cells. In some embodiments, the test cells are contained in an in vitro population of neural progenitor cells. In some embodiments, the test cells include cells differentiated under conditions to become dopaminergic neurons. In some embodiments, the test cells include cells differentiated according to any of the methods described in Section II. In some embodiments, the test cells include cells differentiated under conditions to become dopaminergic neurons for any of the periods of time described in Section II. In some embodiments, the cells being differentiated are pluripotent stem cells. In some embodiments, the pluripotent stem cells are induced pluripotent stem cells (iPSCs). In some embodiments, the iPSCs are generated from fibroblasts collected from healthy human subjects. In some embodiments, the iPSCs are generated from fibroblasts collected from human subjects with Parkinson's disease. Exemplary methods for iPSC generation are described in Section II.
In some embodiments, the determination of the identity of the test cells, e.g., whether the test cells are determined dopaminergic precursor cells or not, indicates whether the in vitro population of cells contains a population of determined dopaminergic precursor cells or not.
In some embodiments, a test dataset is determined from the test cells. In some embodiments, the test dataset is used to determine whether the test cell is a determined dopaminergic precursor cell. In some embodiments, the test dataset is used to determine whether the test cells contain determined dopaminergic precursor cells.
A “test dataset” is a dataset that is produced from a cell (e.g., a neuronal progenitor cell) for which a computed definition is desired. It is produced from characteristic data for an unknown cell line, tissue, or primary cell. Unknown in this context means that a computed definition is desired. Typically the test dataset will be comprised of a global profile as discussed herein as it relates to the global profile of the reference database. The test dataset can be merged with the reference database forming an updated reference database. In certain embodiments this can be as simple as adding the data to an existing spreadsheet. Therefore, the test dataset including gene expression profile information for an in vitro population of neuronal progenitor cells may be included (merged) in the reference database after determining that the in vitro population of neuronal progenitor cells includes a determined dopaminergic precursor cell.
In some embodiments, the test dataset includes characteristics of test cells. For example, in some cases, the test dataset includes the same types of characteristics as those determined for reference cells. In some embodiments, the test dataset may include cell characteristics such as mRNA expression levels, microRNA expression levels, protein expression levels, post-translational protein modification levels, non-coding RNA expression profiles, DNA methylation levels, histone modification levels, transcription factor-DNA site binding profiles, DNA sequence profiles, or any other type of cell characteristic.
In some embodiments, the test dataset includes protein expression levels. In some embodiments, the test dataset includes post-translational protein modification levels. In some embodiments, the test dataset includes non-coding RNA expression profiles. In some embodiments, the test dataset includes epigenetic profiles. In some embodiments, the test dataset includes transcriptional profiles. In some embodiments, the test dataset includes gene expression levels.
In some embodiments, the gene expression levels are obtained using microarray analysis. In some embodiments, the gene expression levels are obtained using RNA sequencing. In some embodiments, the gene expression levels are obtained using both microarray analysis and RNA sequencing. In some embodiments, the RNA sequencing is performed on bulk RNA from a plurality of cells. In some embodiments, the RNA sequencing is performed on single cells. In some embodiments, the RNA sequencing is performed on bulk RNA from a plurality of cells and on single cells. Exemplary methods of extracting, preparing and analyzing bulk RNA and single-cell RNA are described in Section I.A above.
In some embodiments, the test cell characteristics are included in a test dataset. In some embodiments, the test dataset includes protein expression levels of test cells. In some embodiments, the test dataset includes epigenetic profiles of test cells. In some embodiments, the test dataset includes transcriptional profiles of test cells. In some embodiments, the test dataset includes gene expression levels of test cells. In some embodiments, the test dataset includes microarray data. In some embodiments, the test dataset includes RNA sequencing data. In some embodiments, the test dataset includes microarray data and RNA sequencing data. In some embodiments, the test dataset includes bulk RNA sequencing data. In some embodiments, the test dataset includes single-cell RNA sequencing data. In some embodiments, the test dataset includes bulk RNA sequencing data and single-cell RNA sequencing data. In some embodiments, the test dataset includes expression levels of one or more metagenes. Determination of metagenes and expression levels thereof is discussed in Section I.C.
C. Metagenes
In some aspects, the methods provided herein make use of metagenes and expression levels of metagenes for determining the identity of test cells. A metagene refers to a pattern of gene expression. For example, a metagene may be a group of genes with correlated gene expression. In some embodiments, a metagene combines information from multiple individual genes, and the expression level of the metagene is calculated based on the expression levels of the individual genes. Multiple metagenes and expression levels thereof can be determined based on individual gene expression levels. In some embodiments, metagene expression levels are based on combined individual gene expression levels, and the determination of said metagenes comprises determining the degree to which an individual gene's expression level contributes to the expression level of a metagene. For instance, metagene expression levels can be a weighted combination of individual gene expression levels, and the determination of said metagenes comprises determining for each metagene the weights of individual genes. In some embodiments, metagenes and expression levels thereof reflect correlated expression levels across individual genes. In some embodiments, metagenes and expression levels thereof reflect individual genes coexpressed by cells of the same phenotype (e.g., determined dopaminergic precursor cells). Exemplary coexpressed genes of determined dopaminergic precursor cells are discussed in Section III.
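By way of non-limiting illustration, the calculation of a metagene expression level as a weighted combination of individual gene expression levels can be sketched as follows. The function name metagene_expression, the metagene and gene identifiers, and the weights are hypothetical; in practice the weights would be determined from a reference database as described herein.

```python
def metagene_expression(gene_levels, weights):
    """Return one expression level per metagene as a weighted sum.

    gene_levels: dict mapping gene name -> expression level for one cell
                 (or one bulk sample).
    weights:     dict mapping metagene name -> dict of gene name -> weight.
    Genes absent from gene_levels contribute 0 (an assumption of this sketch).
    """
    return {
        metagene: sum(w * gene_levels.get(gene, 0.0)
                      for gene, w in gene_weights.items())
        for metagene, gene_weights in weights.items()
    }


# Hypothetical weights: metagene_1 combines two correlated genes equally.
weights = {
    "metagene_1": {"GENE_A": 0.5, "GENE_B": 0.5},
    "metagene_2": {"GENE_C": 1.0},
}
cell = {"GENE_A": 4.0, "GENE_B": 2.0, "GENE_C": 3.0}
levels = metagene_expression(cell, weights)
```

Here three gene-level features are reduced to two metagene-level features, which illustrates the feature reduction that metagenes can provide.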
In some aspects, the methods provided herein use the expression levels of metagenes to determine if a cell contained in a population of cells is a determined dopaminergic precursor cell. In some embodiments, the expression levels of metagenes are used to determine whether a population of cells contained determined dopaminergic precursor cells. In some aspects, the use of metagenes reduces the number of features used in determining if a cell is a determined dopaminergic precursor cell or if a population of cells contains determined dopaminergic precursor cells. In some aspects, reducing the number of features makes such determination more computationally tractable. In some aspects, reducing the number of features improves the accuracy of such determination. For instance, the performance of a machine learning model trained using metagene expression levels may be higher than one trained on gene expression levels, particularly since metagenes combine and/or retain information from individual genes.
1. Metagene Determination
In some embodiments, metagenes are determined based on the gene expression levels of reference cells. In some embodiments, the gene expression levels of reference cells are contained in a reference database. Exemplary reference cells and reference databases are described in Section I.A. In some embodiments, a reference database containing microarray data is used to determine metagenes. In some embodiments, a reference database containing RNA sequencing data is used to determine metagenes. In some embodiments, a reference database containing microarray data and a reference database containing RNA sequencing data are used to determine metagenes. In some embodiments, a reference database containing bulk RNA sequencing data is used to determine metagenes. In some embodiments, a reference database containing single-cell RNA sequencing data is used to determine metagenes. In some embodiments, a reference database containing bulk RNA sequencing data and a reference database containing single-cell RNA sequencing data are used to determine metagenes.
In some embodiments, metagenes are computationally determined. In some embodiments, metagenes are determined using a dimensionality reduction technique. A dimensionality reduction technique transforms data from a higher-dimensional space (e.g., individual genes) into a lower-dimensional space (e.g., metagenes) such that the lower-dimensional representation of the data still retains meaningful or informative properties of the original data. In some embodiments, metagenes are determined by applying a dimensionality reduction technique on a database.
In some embodiments, the dimensionality reduction technique is a linear technique. In some embodiments, the dimensionality reduction technique is factor analysis. In some embodiments, the dimensionality reduction technique is network component analysis. In some embodiments, the dimensionality reduction technique is linear discriminant analysis. In some embodiments, the dimensionality reduction technique is independent component analysis (ICA). In some embodiments, the dimensionality reduction technique is principal component analysis (PCA). In some embodiments, the dimensionality reduction technique is sparse PCA. In some embodiments, the dimensionality reduction technique is robust PCA.
In some embodiments, the dimensionality reduction technique is non-negative matrix factorization (NMF). Using NMF, a matrix can be factorized into two matrices such that all three matrices have no negative elements. This non-negativity can make the resulting matrices easier to inspect, for instance when the original matrix itself contains only non-negative values. In some embodiments, the dimensionality reduction technique is conventional NMF. In some embodiments, the dimensionality reduction technique is discriminant NMF. In some embodiments, the dimensionality reduction technique is regularized NMF. In some embodiments, the dimensionality reduction technique is graph regularized NMF. In some embodiments, the dimensionality reduction technique is bootstrapping sparse NMF.
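By way of non-limiting illustration, a conventional NMF can be sketched using the well-known multiplicative update rules for the squared (Frobenius) reconstruction error. The following is a minimal sketch and not a definitive implementation: the matrix orientation (genes in rows, cells in columns), the function names, the toy matrix, the number of iterations, and the small constant eps added to each denominator are all assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2
               for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh)) ** 0.5


def nmf(v, k, n_iter=500, seed=0, eps=1e-9):
    """Factorize non-negative V (genes x cells) as W (genes x k) H (k x cells).

    Columns of W hold per-gene weights for each of k metagenes; rows of H
    hold metagene expression levels per cell (orientation assumed here).
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


# Hypothetical non-negative expression matrix: 4 genes x 3 cells.
v = [[1.0, 0.0, 2.0],
     [2.0, 0.0, 4.0],
     [0.0, 3.0, 1.0],
     [0.0, 6.0, 2.0]]
w, h = nmf(v, k=2)
```

Because every update multiplies a non-negative entry by a non-negative ratio, W and H remain non-negative throughout, which is the property that makes the factors straightforward to inspect as gene weights and metagene expression levels.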
In some embodiments, the dimensionality reduction technique is a non-linear technique. In some embodiments, the dimensionality reduction technique is kernel PCA. In some embodiments, the dimensionality reduction technique is generalized discriminant analysis (GDA). In some embodiments, the dimensionality reduction technique is an autoencoder. In some embodiments, the dimensionality reduction technique is t-distributed stochastic neighbor embedding (t-SNE). In some embodiments, the dimensionality reduction technique is a manifold learning technique. In some embodiments, the dimensionality reduction technique is Isomap. In some embodiments, the dimensionality reduction technique is locally linear embedding (LLE). In some embodiments, the dimensionality reduction technique is Hessian LLE. In some embodiments, the dimensionality reduction technique is Laplacian eigenmaps. In some embodiments, the dimensionality reduction technique is graph-based kernel PCA. In some embodiments, the dimensionality reduction technique is uniform manifold approximation and projection (UMAP).
In some embodiments, the dimensionality reduction technique is a clustering technique that can be used as a dimensionality reduction technique. In some embodiments, the dimensionality reduction technique is a connectivity-based clustering method. In some embodiments, the dimensionality reduction technique is hierarchical clustering. In some embodiments, the dimensionality reduction technique is a centroid-based clustering method. In some embodiments, the dimensionality reduction technique is k-means clustering. In some embodiments, the dimensionality reduction technique is a distribution-based clustering method. In some embodiments, the dimensionality reduction technique is Gaussian mixture modeling. In some embodiments, the dimensionality reduction technique is a density-based clustering method. In some embodiments, the dimensionality reduction technique is DBSCAN. In some embodiments, the dimensionality reduction technique is OPTICS. In some embodiments, the dimensionality reduction technique is a grid-based clustering method. In some embodiments, the dimensionality reduction technique is STING. In some embodiments, the dimensionality reduction technique is CLIQUE.
2. Metagene Expression Levels
In some embodiments, expression levels of the determined metagenes are calculated. In some embodiments, metagene expression levels are determined using the same reference database used to determine metagenes. In some embodiments, metagene expression levels are determined using a reference database not used to determine metagenes. In some embodiments, metagene expression levels are determined using test datasets (e.g., any test dataset described in Section I.B.). Determination of metagene expression levels is possible if expression levels of the same or similar sets of genes are included in the reference databases used to determine metagenes and the reference databases and/or test dataset used to determine metagene expression levels.
In some embodiments, metagene expression levels are determined using reference databases containing microarray data. In some embodiments, metagene expression levels are determined using a reference database containing RNA sequencing data. In some embodiments, metagene expression levels are determined using a reference database containing microarray data and reference databases comprising RNA sequencing data. In some embodiments, metagene expression levels are determined using a reference database containing bulk RNA sequencing data. In some embodiments, metagene expression levels are determined using a reference database containing single-cell RNA sequencing data. In some embodiments, metagene expression levels are determined using a reference database containing bulk RNA sequencing data and a reference database containing single-cell RNA sequencing data.
In some embodiments, metagenes are determined using a reference database containing bulk RNA sequencing data, and metagene expression levels are determined using a reference database containing bulk RNA sequencing data. In some embodiments, metagenes are determined using a reference database containing bulk RNA sequencing data, and metagene expression levels are determined using a reference database containing single-cell RNA sequencing data. In some embodiments, metagenes are determined using a reference database containing single-cell RNA sequencing data, and metagene expression levels are determined using a reference database containing bulk RNA sequencing data. In some embodiments, metagenes are determined using a reference database containing single-cell RNA sequencing data, and metagene expression levels are determined using a reference database containing single-cell RNA sequencing data. In some embodiments, metagenes are determined using a reference database containing bulk RNA sequencing data and a reference database containing single-cell RNA sequencing data, and metagene expression levels are determined using a reference database containing bulk RNA sequencing data. In some embodiments, metagenes are determined using a reference database containing bulk RNA sequencing data and a reference database containing single-cell RNA sequencing data, and metagene expression levels are determined using a reference database containing single-cell RNA sequencing data.
In some embodiments, metagene expression levels are determined using a test dataset containing microarray data. In some embodiments, metagene expression levels are determined using a test dataset containing RNA sequencing data. In some embodiments, metagene expression levels are determined using a test dataset containing microarray data and RNA sequencing data. In some embodiments, metagene expression levels are determined using a test dataset containing bulk RNA sequencing data. In some embodiments, metagene expression levels are determined using a test dataset containing single-cell RNA sequencing data. In some embodiments, metagene expression levels are determined using a test dataset containing bulk RNA sequencing data and single-cell RNA sequencing data.
In some embodiments, metagenes are determined using a reference database containing bulk RNA sequencing data, and metagene expression levels are determined using a test dataset containing bulk RNA sequencing data. In some embodiments, metagenes are determined using a reference database containing bulk RNA sequencing data, and metagene expression levels are determined using a test dataset containing single-cell RNA sequencing data. In some embodiments, metagenes are determined using a reference database containing single-cell RNA sequencing data, and metagene expression levels are determined using a test dataset containing bulk RNA sequencing data. In some embodiments, metagenes are determined using a reference database containing single-cell RNA sequencing data, and metagene expression levels are determined using a test dataset containing single-cell RNA sequencing data. In some embodiments, metagenes are determined using a reference database containing bulk RNA sequencing data and reference databases containing single-cell RNA sequencing data, and metagene expression levels are determined using a test dataset containing bulk RNA sequencing data. In some embodiments, metagenes are determined using a reference database containing bulk RNA sequencing data and reference databases containing single-cell RNA sequencing data, and metagene expression levels are determined using a test dataset containing single-cell RNA sequencing data.
In some embodiments, metagenes are determined by applying a dimensionality reduction technique on one or more reference databases. In some embodiments, one or more outputs of the dimensionality reduction technique are used to determine metagene expression levels.
In some embodiments, one or more outputs of the dimensionality reduction technique and a reference database are used to determine metagene expression levels based on the reference database. In some embodiments, one or more outputs of the dimensionality reduction technique and a test dataset are used to determine metagene expression levels based on the test dataset.
In some embodiments, the one or more outputs of the dimensionality reduction technique includes information on how multiple individual genes are combined to form a metagene. In some embodiments, the one or more outputs of the dimensionality reduction technique includes information on the degree to which an individual gene's expression level contributes to the expression level of a metagene. In some embodiments, the one or more outputs of the dimensionality reduction technique includes the weights of individual genes, for instance when metagene expression levels are a weighted combination of individual gene expression levels.
In some embodiments, metagene expression levels are determined using regression analysis. In some embodiments, the regression analysis is linear regression. In some embodiments, regression analysis is performed using one or more outputs of the dimensionality reduction technique and the reference database. In some embodiments, regression analysis is used to approximate gene expression levels of the reference database using the one or more outputs of the dimensionality reduction technique (e.g., the weights of individual genes in contributing to a metagene). In some embodiments, regression analysis is used to approximate gene expression levels of the reference database as a weighted combination of the weights of individual genes in contributing to a metagene. In some embodiments, the weights estimated by regression analysis can be used as metagene expression levels for the reference database.
In some embodiments, regression analysis is performed using one or more outputs of the dimensionality reduction technique and the test dataset. In some embodiments, regression analysis is used to approximate gene expression levels of the test dataset using the one or more outputs of the dimensionality reduction technique (e.g., the weights of individual genes in contributing to a metagene). In some embodiments, regression analysis is used to approximate gene expression levels of the test dataset as a weighted combination of the weights of individual genes in contributing to a metagene. In some embodiments, the weights estimated by regression analysis can be used as metagene expression levels for the test dataset.
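By way of non-limiting illustration, the use of linear regression to estimate metagene expression levels for a test dataset from previously determined gene weights can be sketched as an ordinary least-squares fit. The function names, the gene weights, and the test profile below are hypothetical, and solving the normal equations by Gaussian elimination is only one of several suitable numerical approaches.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting (a is a small square matrix)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def metagene_levels_by_regression(gene_weights, profile):
    """Least-squares estimate of metagene levels h such that
    profile is approximately gene_weights @ h.

    gene_weights: genes x k matrix of per-gene weights for k metagenes.
    profile:      gene expression levels of one test cell or sample,
                  in the same gene order as gene_weights.
    """
    k = len(gene_weights[0])
    # Normal equations: (W^T W) h = W^T v
    wtw = [[sum(row[i] * row[j] for row in gene_weights) for j in range(k)]
           for i in range(k)]
    wtv = [sum(row[i] * x for row, x in zip(gene_weights, profile))
           for i in range(k)]
    return solve(wtw, wtv)


# Hypothetical gene weights (4 genes, 2 metagenes) and one test profile.
gene_weights = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
test_profile = [2.0, 4.0, 1.0, 2.0]
levels = metagene_levels_by_regression(gene_weights, test_profile)
```

In this toy case the test profile is reproduced exactly by two units of the first metagene and one unit of the second. Ordinary least squares can return negative coefficients for other inputs; where non-negative metagene expression levels are desired, a non-negative least-squares fit could be substituted.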
D. Probability Assessment (e.g., Neuroscore)
In some aspects, the methods provided herein include the use of a machine learning model. In some embodiments, the machine learning model is trained to determine the prospect of a cell or a plurality of cells having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the machine learning model is trained to determine the probability of a cell or a plurality of cells having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the machine learning model is trained to classify a cell or a plurality of cells as having metagene expression levels of a determined dopaminergic precursor cell or not.
In some embodiments, the machine learning model is trained on expression levels of one or more metagenes. In some embodiments, the machine learning model is trained on metagene expression levels determined based on reference databases (e.g., as determined using any of the reference databases described in Section I.A. and any of the methods described in Section I.C.).
In some embodiments, the machine learning model is a supervised classification model. In some embodiments, the machine learning model is trained using reference cell labels comprised in the reference databases. In some embodiments, the reference cell labels indicate if the corresponding reference cells are determined dopaminergic precursor cells. In some embodiments, the reference cell labels indicate the period of time that corresponding reference cells have differentiated under conditions to become dopaminergic neurons, e.g., any of the periods of time described in Section II. In some embodiments, the reference cell labels indicate if the period of time is at least or at least about 18 days. In some embodiments, the reference cell labels indicate if the period of time is between or between about 18 and 25 days.
In some embodiments, the supervised classification model is a logistic regression model. In some embodiments, the supervised classification model is a linear discriminant analysis (LDA) model. In some embodiments, the supervised classification model is a Naïve Bayes classifier. In some embodiments, the supervised classification model is a perceptron. In some embodiments, the supervised classification model is a support vector machine (SVM). In some embodiments, the supervised classification model is a quadratic classifier. In some embodiments, the supervised classification model is a decision tree. In some embodiments, the supervised classification model is a random forest. In some embodiments, the supervised classification model is a neural network. In some embodiments, the supervised classification model is an ensemble model comprising any of the foregoing models.
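By way of non-limiting illustration, a logistic regression model trained on metagene expression levels of labeled reference cells can be sketched as follows. The training data, learning rate, number of iterations, and function names are hypothetical, and plain batch gradient descent is used here only for simplicity; a label of 1 denotes a reference cell labeled as a determined dopaminergic precursor cell.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.1, n_iter=2000):
    """Fit weights w and intercept b by batch gradient descent on the
    mean logistic (cross-entropy) loss."""
    k, n = len(features[0]), len(features)
    w, b = [0.0] * k, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            grad_b += err
            for i in range(k):
                grad_w[i] += err * x[i]
        b -= lr * grad_b / n
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
    return w, b


def predict_probability(w, b, x):
    """Probability that metagene levels x belong to the positive class."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


# Hypothetical reference cells: two metagene expression levels per cell.
features = [[0.2, 1.8], [0.4, 1.5], [0.3, 2.0],
            [1.9, 0.3], [1.6, 0.5], [2.1, 0.2]]
labels = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(features, labels)

# Score two hypothetical test cells.
p_precursor_like = predict_probability(w, b, [2.0, 0.3])
p_other = predict_probability(w, b, [0.3, 1.9])
```

The returned probability for a test cell is one possible form of the model output; any of the other supervised classification models listed above could be substituted for the logistic regression used in this sketch.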
In embodiments, the machine learning model is a best fitting classification model identified by an algorithm as most stable to random perturbations. In embodiments, the best fitting classification model can cluster individual datasets such that each dataset within a cluster is indistinguishable from each other dataset within said cluster. In embodiments, the method includes identifying computationally derived class labels based only on biological characteristics. In embodiments, the method includes identifying differences in at least one dataset for at least one label between at least two samples in at least two clusters. In embodiments, the method includes filtering within a cluster for samples having a similar label profile. In embodiments, the method includes defining differentially regulated protein-protein networks. In embodiments, the method includes using the protein-protein networks to define a class membership, manipulate class membership, or define biological function of said neuronal progenitor cells. In embodiments, the best fitting classification model can cluster individual datasets such that each dataset within a cluster is different from each other individual dataset.
At some point after a reference database is received, the methods can include performing unsupervised classification. This means that a new sorting of the data is performed, with no preconceptions about the results of the sorting. The sorting is typically performed multiple times, for example at least 5, 10, 20, 50, 100, 200, 300, or 500 times. The sorting results are analyzed for a result that is stable, meaning that repeated sorting provides the same result or a similar result (e.g., at least 80%, 85%, 90%, 95%, 97%, 99%, or 100% of the previous result). The re-sorting of the data can be performed completely de novo or it can start with certain assumptions.
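By way of non-limiting illustration, the stability of repeated sortings can be assessed by comparing each sorting with the previous one. The sketch below uses the fraction of sample pairs that are grouped consistently by two sortings (the Rand index) as the similarity measure; this choice of measure, the function names, the 80% cutoff, and the cluster assignments are assumptions made for illustration, and other similarity measures could be used.

```python
from itertools import combinations


def pairwise_agreement(labels_a, labels_b):
    """Fraction of sample pairs that both sortings treat the same way:
    together in both, or apart in both (Rand index)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def is_stable(runs, min_agreement=0.8):
    """True if every sorting agrees with the previous sorting at or
    above min_agreement."""
    return all(
        pairwise_agreement(prev, cur) >= min_agreement
        for prev, cur in zip(runs, runs[1:])
    )


# Hypothetical cluster assignments of five samples from three sortings.
runs = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0],  # same grouping as the first run, cluster ids swapped
    [0, 0, 1, 1, 0],  # one sample assigned to a different cluster
]
first_two_agreement = pairwise_agreement(runs[0], runs[1])
```

Comparing pairs of samples rather than raw cluster identifiers makes the measure insensitive to arbitrary renumbering of clusters between sortings.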
In some embodiments, metagene expression levels for test cells are determined based on a test dataset (e.g., any of the test datasets described in Section I.B. and using any of the methods described in Section I.C.), and the metagene expression levels are applied as input to the trained machine learning model. In some embodiments, the machine learning model outputs a binary prediction of the test cells having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the machine learning model outputs the prospect of the test cells having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the machine learning model outputs the probability of the test cells having metagene expression levels of a determined dopaminergic precursor cell. The output (e.g., binary prediction, prospect, probability) is also referred to as a “Neuroscore” herein.
In some embodiments, the Neuroscore output for test cells, e.g. probability of the test cells having metagene expression levels of a determined dopaminergic precursor cell, is compared to a predetermined threshold. In some embodiments, the methods provided herein output a computed label classification, and the computed label classification indicates that the test cells comprise a determined dopaminergic precursor cell if the predetermined threshold is exceeded.
A variety of methods and criteria can be used to set a predetermined threshold for the Neuroscore. For instance, the predetermined threshold can be set in order to optimize specificity and/or sensitivity in predicting if test cells have metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the predetermined threshold is set such that test cells having metagene expression levels of a determined dopaminergic precursor cell are identified with greater than or greater than about 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sensitivity. In some embodiments, the predetermined threshold is set such that test cells having metagene expression levels of a determined dopaminergic precursor cell are identified with greater than or greater than about 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% specificity. In some embodiments, the predetermined threshold is set such that test cells having metagene expression levels of a determined dopaminergic precursor cell are identified with greater than or greater than about 98% sensitivity and 100% specificity.
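One way such a threshold could be chosen is sketched below, assuming labeled reference Neuroscores are available and that candidate thresholds are scanned from lowest to highest. The function names, the candidate grid, and the example scores are illustrative assumptions, not values from this disclosure.

```python
def sensitivity_specificity(scores, is_precursor, threshold):
    """Sensitivity and specificity when 'score > threshold' is the positive call.

    Assumes at least one positive and one negative reference sample.
    """
    tp = fn = tn = fp = 0
    for score, positive in zip(scores, is_precursor):
        predicted = score > threshold
        if positive and predicted:
            tp += 1
        elif positive:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

def lowest_acceptable_threshold(scores, is_precursor, candidates,
                                min_sensitivity, min_specificity):
    """Smallest candidate meeting both targets, or None if no candidate does."""
    for threshold in sorted(candidates):
        sens, spec = sensitivity_specificity(scores, is_precursor, threshold)
        if sens >= min_sensitivity and spec >= min_specificity:
            return threshold
    return None

# Hypothetical reference Neuroscores with known labels.
reference_scores = [0.95, 0.80, 0.62, 0.45, 0.20, 0.10]
reference_labels = [True, True, True, False, False, False]
chosen = lowest_acceptable_threshold(
    reference_scores, reference_labels, [0.4, 0.5, 0.6], 0.98, 1.0)
```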
In some embodiments, the predetermined threshold is set based on Neuroscores calculated based on reference databases. In some embodiments, the reference databases comprise gene expression levels of reference cells differentiated according to any of the methods described in Section II. In some embodiments, the predetermined threshold is set such that reference cells differentiated for at least or at least about 18 days have Neuroscores exceeding the predetermined threshold. In some embodiments, the predetermined threshold is set such that reference cells differentiated for between or between about 18 and 25 days have Neuroscores exceeding the predetermined threshold. In some embodiments, the predetermined threshold is set such that reference cells known to have a therapeutic effect, e.g., reduce or reverse symptoms of Parkinson's disease, have Neuroscores exceeding the predetermined threshold.
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.4 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.45 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.5 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.55 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.6 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.65 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. 
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.7 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.75 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.8 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.85 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.9 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.95 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell.
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore is greater than or greater than about a threshold probability value. In some embodiments, the threshold probability value is between or between about 0.4 and 1, inclusive. In some embodiments, the threshold probability value is between or between about 0.4 and 0.9, inclusive. In some embodiments, the threshold probability value is between or between about 0.4 and 0.8, inclusive. In some embodiments, the threshold probability value is between or between about 0.4 and 0.7, inclusive. In some embodiments, the threshold probability value is between or between about 0.4 and 0.6, inclusive. In some embodiments, the threshold probability value is between or between about 0.5 and 0.8, inclusive. In some embodiments, the threshold probability value is between or between about 0.5 and 0.7, inclusive. In some embodiments, the threshold probability value is between or between about 0.5 and 0.6, inclusive.
In some embodiments, the threshold probability value is or is about 0.4. In some embodiments, the threshold probability value is or is about 0.45. In some embodiments, the threshold probability value is or is about 0.5. In some embodiments, the threshold probability value is or is about 0.55. In some embodiments, the threshold probability value is or is about 0.6. In some embodiments, the threshold probability value is or is about 0.65. In some embodiments, the threshold probability value is or is about 0.7. In some embodiments, the threshold probability value is or is about 0.75. In some embodiments, the threshold probability value is or is about 0.8. In some embodiments, the threshold probability value is or is about 0.85. In some embodiments, the threshold probability value is or is about 0.9. In some embodiments, the threshold probability value is or is about 0.95.
E. Deviation Score (e.g. Novelty Score)
In some aspects, the methods provided herein comprise calculating a deviation score. The deviation score, also referred to herein as a Novelty Score, indicates the degree to which gene expression levels comprised in a test dataset (e.g., any described in Section I.B.) differ from expected gene expression levels. Expected gene expression values can be determined using a variety of methods. In some embodiments, expected gene expression levels are based on gene expression levels comprised in a reference database, for instance any exemplified in Section I.A. In some embodiments, expected gene expression levels are based on average gene expression levels in a reference database.
In some embodiments, expected gene expression levels are based on the expression levels of one or more metagenes determined for a test dataset, for instance determined using any of the exemplary methods described in Section I.C. herein. In some embodiments, expected gene expression levels are calculated based on gene expression levels in the test dataset and metagenes and expression levels thereof determined for the test dataset. Any method that can be used to calculate an expected value (e.g., expected gene expression level) based on the relationship between one or more predictors (e.g., metagene expression levels for the test dataset) and a dependent value (e.g., gene expression levels in the test dataset) can be used. In some embodiments, regression analysis is used to calculate expected gene expression levels for the test dataset.
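As a minimal sketch of the regression embodiment, the example below fits an ordinary least-squares line relating one metagene's expression level (the predictor) to one gene's expression level (the dependent value) and uses it to compute an expected level and a residual. The single-predictor form, the function names, and the toy numbers are assumptions made for illustration; the text permits any suitable regression method.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def expected_level(intercept, slope, metagene_level):
    """Expected gene expression level given a metagene expression level."""
    return intercept + slope * metagene_level

# Toy data: metagene levels and the matching gene levels used for fitting.
metagene_levels = [1.0, 2.0, 3.0, 4.0]
gene_levels = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(metagene_levels, gene_levels)

# Residual for a hypothetical test observation (observed minus expected).
observed = 12.5
residual = observed - expected_level(intercept, slope, 5.0)
```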
In some embodiments, the deviation score is based on all genes whose expression levels are contained in the test dataset. In some embodiments, the deviation score is based on a subset of genes whose expression levels are contained in the test dataset.
In some embodiments, the deviation score is based on a set of preselected marker genes. In some embodiments, the marker genes are chosen based on their diagnostic capability, for instance if their expression levels can be used to distinguish between cell types (e.g., determined dopaminergic precursor cells and other cell types). In some embodiments, the marker genes comprise radial glial cell markers, early neuronal development genes, pluripotency specific markers, intermediate to late neuronal markers, neurofilament light polypeptide chain markers, neurofilament medium polypeptide chain markers, nestin filament markers, early patterning markers, neural progenitor cell markers, early migration markers, stage-specific transcription factors, genes required for normal development of neurons, genes controlling dopaminergic neuron development, genes regulating identity and fate of neuronal progenitor cells, dopaminergic neuron markers, astrocyte markers, forebrain markers, hindbrain markers, subthalamic nucleus markers, radial glial markers, cell cycle markers, or any combination of any of the foregoing. In some embodiments, the marker genes include genes not expected to be expressed by determined dopaminergic precursor cells. In some embodiments, the marker genes include one or more of any of the genes described in Table E1.
In some embodiments, preliminary deviation scores are calculated, and the maximum preliminary deviation score is output as the deviation score. In some embodiments, a first deviation score is calculated based on all genes whose expression levels are contained in the test dataset, and a second deviation score is calculated based on a subset of genes. In some embodiments, a first deviation score is calculated based on all genes whose expression levels are contained in the test dataset, and a second deviation score is calculated based on a set of preselected marker genes. In some embodiments, the deviation score is the maximum value of the preliminary deviation scores.
In some embodiments, the deviation of single genes is calculated as residuals (i.e., differences) between gene expression levels comprised in a test dataset and gene expression levels of one or more reference cells. In some embodiments, the one or more reference cells are at a stage of differentiation indicating a determined dopaminergic precursor cell. In some embodiments, the residuals are normalized. In some embodiments, the residuals are normalized by dividing by the variance of gene expression levels in a reference database, e.g., any of those described in Section I.A. In some embodiments, the residuals are normalized by dividing by the standard deviation of gene expression levels in the reference database.
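The standard-deviation normalization can be sketched as follows, assuming the expected level for each gene is the mean of its reference values and that genes with no spread in the reference are skipped. Both of these are illustrative assumptions, and the gene names are placeholders.

```python
from statistics import mean, stdev

def single_gene_deviations(test_levels, reference_levels):
    """Signed residuals (test minus reference mean) divided by the reference SD.

    test_levels: dict mapping gene -> expression level in the test dataset.
    reference_levels: dict mapping gene -> list of levels in reference cells.
    Genes whose reference values have zero spread are skipped (an assumption).
    """
    deviations = {}
    for gene, observed in test_levels.items():
        reference = reference_levels[gene]
        spread = stdev(reference)
        if spread == 0:
            continue
        deviations[gene] = (observed - mean(reference)) / spread
    return deviations

# Placeholder genes and toy values.
reference = {"GENE_A": [10, 12, 14], "GENE_B": [5, 5, 5], "GENE_C": [1.0, 2.0, 3.0]}
test = {"GENE_A": 16, "GENE_B": 6, "GENE_C": 1.0}
deviations = single_gene_deviations(test, reference)
```

Dividing by the variance instead, as in the other embodiment, would only require replacing `stdev` with `statistics.variance`.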
In some embodiments, the deviation score is a summary statistic of the one or more single-gene deviation scores. Any known summary statistic can be used. In some embodiments, the deviation score is the average single-gene deviation score. In some embodiments, the deviation score is a sum of the single-gene deviation scores. In some embodiments, the deviation score is a weighted sum of the single-gene deviation scores. In some embodiments, single-gene deviation scores of particular genes (e.g., marker genes, for instance those described in Table E1 herein) are weighted more than single-gene deviation scores for other genes. In some embodiments, the deviation score is the single-gene deviation score corresponding to a percentile of one or more single-gene deviation scores. In some embodiments, the percentile is between or between about the 50th percentile and the 100th percentile. In some embodiments, the percentile is between or between about the 60th percentile and the 100th percentile. In some embodiments, the percentile is between or between about the 70th percentile and the 100th percentile. In some embodiments, the percentile is between or between about the 80th percentile and the 100th percentile. In some embodiments, the percentile is between or between about the 90th percentile and the 100th percentile. In some embodiments, the percentile is or is about the 95th percentile.
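The percentile-based summary, combined with the embodiment in which the maximum of preliminary deviation scores is output, might look like the sketch below. It assumes the nearest-rank definition of a percentile and the use of absolute single-gene deviations; neither is specified in the text, and the gene names are placeholders.

```python
import math

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def summary_score(deviations, genes=None, pct=95):
    """Percentile of absolute single-gene deviations, optionally restricted to `genes`."""
    selected = deviations if genes is None else {g: deviations[g] for g in genes}
    return percentile_nearest_rank([abs(d) for d in selected.values()], pct)

def novelty_score(deviations, marker_genes, pct=95):
    """Maximum of two preliminary scores: all genes, and marker genes only."""
    return max(summary_score(deviations, None, pct),
               summary_score(deviations, marker_genes, pct))

# Toy single-gene deviations for placeholder genes g1..g20 (signs alternate).
toy = {f"g{k}": (-1) ** k * float(k) for k in range(1, 21)}
all_genes_score = summary_score(toy)
marker_score = summary_score(toy, ["g5", "g20"])
final_score = novelty_score(toy, ["g5", "g20"])
```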
In some embodiments, the Novelty Score output for test cells is compared to a predetermined threshold. In some embodiments, the methods provided herein output a computed label classification, and the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the predetermined threshold is not exceeded.
A variety of methods and criteria can be used to set a predetermined threshold for the Novelty Score. In some embodiments, the predetermined threshold is set based on Novelty Scores calculated based on a reference database. In some embodiments, the reference database includes gene expression levels of reference cells differentiated according to any of the methods described in Section II.
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 50% of gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 60% of gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 70% of gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 80% of gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 90% of gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. 
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels.
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of gene expression levels in the test dataset are no more than 10 standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of gene expression levels in the test dataset are no more than 9 standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of gene expression levels in the test dataset are no more than 8 standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of gene expression levels in the test dataset are no more than 7 standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of gene expression levels in the test dataset are no more than 6 standard deviations away from expected gene expression levels. 
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels.
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 50% of marker gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 60% of marker gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 70% of marker gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 80% of marker gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 90% of marker gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. 
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of marker gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels.
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of marker gene expression levels in the test dataset are no more than 10 standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of marker gene expression levels in the test dataset are no more than 9 standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of marker gene expression levels in the test dataset are no more than 8 standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of marker gene expression levels in the test dataset are no more than 7 standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of marker gene expression levels in the test dataset are no more than 6 standard deviations away from expected gene expression levels. 
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of marker gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels.
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score is less than or less than about 10. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score is less than or less than about 9. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score is less than or less than about 8. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score is less than or less than about 7. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score is less than or less than about 6. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score is less than or less than about 5.
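The criterion that a minimum fraction of gene expression levels lies within a given number of standard deviations of the expected levels can be sketched as below. It assumes the single-gene deviations are already expressed in standard-deviation units; the function names and toy values are hypothetical.

```python
def fraction_within(deviations_sd, max_sd=5.0):
    """Fraction of genes whose absolute deviation is no more than max_sd."""
    within = sum(1 for d in deviations_sd if abs(d) <= max_sd)
    return within / len(deviations_sd)

def passes_novelty_check(deviations_sd, max_sd=5.0, min_fraction=0.95):
    """True if at least min_fraction of genes are within max_sd of expectation."""
    return fraction_within(deviations_sd, max_sd) >= min_fraction

# Toy example: 19 of 20 genes are close to expectation, one is 7 SD away.
mostly_expected = [0.5] * 19 + [7.0]
# Toy example: two of 20 genes deviate strongly.
two_outliers = [0.5] * 18 + [7.0, -8.0]
```

Under a nearest-rank percentile, requiring at least 95% of genes within five standard deviations is the same test as requiring the 95th percentile of absolute deviations to be no more than 5, which is one way to relate the fraction-based and score-based criteria above.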
F. Exemplary Method
In some embodiments, the methods provided herein are used to determine if test cells, e.g. a population of neuronal progenitor cells produced by a differentiation process from iPSCs, are or contain determined dopaminergic precursor cells. In some embodiments, the ability to determine if a test cell population contains determined dopaminergic precursor cells according to any of the methods provided herein can validate release of the cells for use in subsequent applications. In some embodiments, subsequent applications can include therapeutic applications of the determined dopaminergic precursor cells, such as for use in treating a neurodegenerative disease. In some embodiments, the therapeutic applications include the implantation of the test cells for the treatment of a neurodegenerative disease. In some embodiments, the neurodegenerative disease is Parkinson's disease. In some embodiments, the test cells are implanted in the substantia nigra for treating the neurodegenerative disease, e.g. Parkinson's disease.
An exemplary process in accord with the provided methods is shown in
In some embodiments, the trained machine learning model is used as part of the methods provided herein (circle 7) for classifying test cells. In some embodiments, Novelty Scores are calculated based on the reference databases. In some embodiments, the Novelty Scores based on the reference databases are used to identify Neuroscore and Novelty Score thresholds (circle 8).
In some embodiments, test cells are used to produce a test dataset including gene expression levels of the test cells. In some embodiments, the gene expression levels of the test cells are obtained using RNA sequencing. In some embodiments, the gene expression levels are subjected to sequencing alignment (circle 1). In some embodiments, the sequencing alignment is performed using a Salmon pseudoaligner. In some embodiments, the test dataset is supplied to the trained model (circle 2). In some embodiments, a Neuroscore (circle 10) and a Novelty Score (circle 11) are output for the test dataset. In some embodiments, the Neuroscore and the Novelty Score are compared to the previously determined Neuroscore and Novelty Score thresholds. In some embodiments, the test cells are transplanted and/or screened, for instance if both thresholds are met. In some embodiments, the test cells are discarded, for instance if neither threshold is met.
In some embodiments, reference cells and reference databases are produced, for instance according to any of the methods described in Sections I.A and II. In some embodiments, the reference cells are produced using iPSCs generated from subjects with Parkinson's disease. In some embodiments, the reference databases include gene expression levels of reference cells allowed to differentiate from iPSCs for various times in culture, such as for, for about, or for at least 13, 18, and 25 days under conditions to differentiate iPSCs into neuronal cells. In some embodiments, the reference database includes bulk RNA sequencing data. In some embodiments, the reference database includes single-cell RNA sequencing data. In some embodiments, the reference database includes reference cell labels indicating if reference cells exhibit features of determined dopaminergic precursor cells, for example, as determined by functional assays, such as using animal models of a neurodegenerative disease. In some embodiments, the reference database includes reference cell labels of a cell population differentiated into neuronal cells from iPSCs for, for about, or for at least 18 days. The methods of differentiation can include any as described in Section II.
In some embodiments, the reference database including single-cell RNA sequencing data is used to determine metagenes, for instance using any of the methods described in Section I.C.1. In some embodiments, and based on the determined metagenes, metagene expression levels are determined using a reference database including bulk RNA sequencing data, for instance using any of the methods described in Section I.C.2.
In some embodiments, the metagene expression levels are used to train a machine learning model, for instance any described in Section I.D. In some embodiments, the machine learning model is a supervised classification model. In some embodiments, the machine learning model is a logistic regression model. In some embodiments, the machine learning model is trained using reference cell labels comprised in the reference databases.
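A minimal sketch of the logistic regression embodiment follows. It assumes two metagene features per reference sample, binary reference cell labels, and plain batch gradient descent as the fitting procedure; the training procedure, hyperparameters, function names, and toy data are illustrative assumptions rather than details from this disclosure.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_regression(features, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias

def neuroscore(weights, bias, metagene_levels):
    """Predicted probability for one sample's metagene expression levels."""
    return sigmoid(sum(w * v for w, v in zip(weights, metagene_levels)) + bias)

# Toy reference data: rows are samples, columns are two hypothetical metagenes.
reference_features = [[0.1, 2.0], [0.3, 1.8], [0.2, 2.2],
                      [1.9, 0.4], [2.1, 0.2], [2.0, 0.3]]
reference_labels = [0, 0, 0, 1, 1, 1]  # 1 = labeled as determined precursor
weights, bias = train_logistic_regression(reference_features, reference_labels)
```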
In some embodiments, test cells and test datasets are produced, for instance using any of the methods described in Sections I.B. and II. In some embodiments, the test cells are produced using iPSCs generated from a patient with Parkinson's disease. In some embodiments, the test dataset is used to determine metagene expression levels for the test cells, for instance using any of the methods described in Section I.C.2. In some embodiments, the test cells are contained in an in vitro population of cells. In some embodiments, the test cells are contained in an in vitro population of neuronal progenitor cells.
In some embodiments, the metagene expression levels determined from the test dataset are supplied as input to the machine learning model. In some embodiments, the machine learning model outputs a Neuroscore (e.g., any exemplified in Section I.D.). In some embodiments, a Novelty Score is determined using the test dataset, for instance according to any of the methods described in Section I.E. In some embodiments, a Neuroscore and a Novelty Score are determined for the test cells.
In some embodiments, the test cells' Neuroscore is compared to a predetermined threshold (e.g., any described in Section I.D.). In some embodiments, the test cells' Novelty Score is compared to a predetermined threshold (e.g., any described in Section I.E.). In some embodiments, both the Neuroscore and the Novelty Score of the test cells are compared to predetermined thresholds.
In some embodiments, the methods provided herein include outputting a computed label classification comprising an indication of whether the test cells include a determined dopaminergic precursor cell. In some embodiments, the computed label classification is based on the Neuroscore and comparison thereof to its corresponding predetermined threshold. In some embodiments, the computed label classification is based on the Novelty Score and comparison thereof to its corresponding predetermined threshold. In some embodiments, the computed label classification is based on both the Neuroscore and comparison thereof to its corresponding predetermined threshold and on the Novelty Score and comparison thereof to its corresponding predetermined threshold.
In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Neuroscore indicates a probability greater than or greater than about 0.5 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if the test cells' Novelty Score indicates that at least or at least about 95% of gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels. In some embodiments, the computed label classification indicates that the test cells are or contain a determined dopaminergic precursor cell if (i) the test cells' Neuroscore indicates a probability greater than or greater than about 0.5 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell and (ii) the test cells' Novelty Score indicates that at least or at least about 95% of gene expression levels in the test dataset are no more than five standard deviations away from expected gene expression levels.
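The combined criterion in (i) and (ii) reduces to a two-condition check, sketched here with the 0.5 and 95% values from this embodiment as defaults. The function and parameter names are hypothetical.

```python
def computed_label(neuroscore, fraction_within_5sd,
                   neuroscore_threshold=0.5, min_fraction=0.95):
    """True if the test cells are labeled as containing determined precursors.

    neuroscore: model probability for the test cells.
    fraction_within_5sd: fraction of gene expression levels no more than five
    standard deviations from their expected levels.
    """
    neuroscore_ok = neuroscore > neuroscore_threshold
    novelty_ok = fraction_within_5sd >= min_fraction
    return neuroscore_ok and novelty_ok
```

For example, `computed_label(0.8, 0.97)` yields a positive label, while a high Neuroscore paired with only 90% of genes within range does not.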
In some embodiments, the test cells' computed label classification indicates that the test cells are or contain determined dopaminergic precursor cells. In some embodiments, the in vitro population of cells comprising the test cells identified as determined dopaminergic precursor cells is selected for use. In some embodiments, the in vitro population of cells containing the test cells identified as determined dopaminergic precursor cells is selected for transplant, for instance according to any of the methods described in Section V.
In some embodiments, the test cells' computed label classification indicates that the test cells do not contain determined dopaminergic precursor cells. In some embodiments, the test cells' Novelty Score indicates that less than or less than about 95% of gene expression levels in the test dataset were no more than five standard deviations away from expected gene expression levels. In some embodiments, the in vitro population of cells comprising the test cells not identified as determined dopaminergic precursor cells is no longer allowed to differentiate. In some embodiments, the in vitro population of cells containing the test cells not identified as determined dopaminergic precursor cells is discarded. In some embodiments, the methods provided herein are repeated by producing an additional set of test cells and another test dataset. In some embodiments, the additional set of test cells is produced from the same subject with Parkinson's disease. In some embodiments, the additional set of test cells is produced from the same population of iPSCs with which the first set of test cells was produced. In some embodiments, a computed label classification is output for the additional set of test cells.
In some embodiments, the test cells' computed label classification indicates that the test cells do not contain determined dopaminergic precursor cells. In some embodiments, the test cells' Neuroscore indicates a probability less than or less than about 0.5 of the test cells' having metagene expression levels of a determined dopaminergic precursor cell. In some embodiments, the test cells' Novelty Score indicates that greater than or greater than about 95% of gene expression levels in the test dataset were no more than five standard deviations away from expected gene expression levels. In some embodiments, the in vitro population of cells containing the test cells not identified as determined dopaminergic precursor cells is allowed to continue differentiating. In some embodiments, an additional set of test cells and test dataset from the same in vitro population of cells is collected. In some embodiments, a computed label classification is output for the additional set of test cells.
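By way of non-limiting illustration, the handling of a population following classification, as described in the preceding paragraphs, can be sketched as a small decision function. The sketch assumes that the Novelty Score has already been summarized as the fraction of genes within five standard deviations of expected levels; the enumeration and function names are hypothetical, and other embodiments may combine the scores differently.

```python
from enum import Enum


class NextStep(Enum):
    SELECT_FOR_USE = "select population for use, e.g. for transplant"
    DISCARD_AND_REPEAT = "discard population and produce new test cells"
    CONTINUE_DIFFERENTIATING = "continue differentiating and re-test"


def next_step(neuroscore: float, fraction_within_5_sd: float) -> NextStep:
    """Map the two scores onto the handling options described in the text."""
    # Novelty criterion not met: the population is no longer allowed to
    # differentiate and is discarded; a new set of test cells is produced.
    if fraction_within_5_sd < 0.95:
        return NextStep.DISCARD_AND_REPEAT
    # Both criteria met: the population is selected for use.
    if neuroscore > 0.5:
        return NextStep.SELECT_FOR_USE
    # Novelty criterion met but Neuroscore too low: the population is
    # allowed to continue differentiating and is tested again later.
    return NextStep.CONTINUE_DIFFERENTIATING
```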
In some embodiments, the additional set of test cells is collected and tested according to the methods provided herein between or between about one and 30 days after testing of the first set of test cells. In some embodiments, the additional set of test cells is collected and tested according to the methods provided herein between or between about one and 25 days after testing of the first set of test cells. In some embodiments, the additional set of test cells is collected and tested according to the methods provided herein between or between about one and 20 days after testing of the first set of test cells. In some embodiments, the additional set of test cells is collected and tested according to the methods provided herein between or between about one and 15 days after testing of the first set of test cells. In some embodiments, the additional set of test cells is collected and tested according to the methods provided herein between or between about one and 10 days after testing of the first set of test cells. In some embodiments, the additional set of test cells is collected and tested according to the methods provided herein between or between about one and 5 days after testing of the first set of test cells. In some embodiments, the additional set of test cells is collected and tested according to the methods provided herein between or between about one and 3 days after testing of the first set of test cells.
In some embodiments, the methods provided herein are repeated until a computed label classification is provided indicating that test cells produced from the subject are or contain determined dopaminergic precursor cells.
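By way of non-limiting illustration, repeated testing at a fixed interval can be sketched as a simple loop. The callable `is_positive_on_day` is a hypothetical stand-in for collecting an additional set of test cells on a given day and computing their label classification; the `last_allowed_day` stopping bound is an assumption added so that the loop terminates and is not taken from the text.

```python
from typing import Callable, Optional


def first_positive_day(
    is_positive_on_day: Callable[[int], bool],
    first_test_day: int,
    retest_interval_days: int,
    last_allowed_day: int,
) -> Optional[int]:
    """Re-test at a fixed interval until a positive classification is output.

    Returns the day of the first positive classification, or None if no
    positive classification is obtained by last_allowed_day.
    """
    # The text describes re-testing between about one and 30 days after
    # the preceding test.
    if not 1 <= retest_interval_days <= 30:
        raise ValueError("retest interval must be between 1 and 30 days")
    day = first_test_day
    while day <= last_allowed_day:
        if is_positive_on_day(day):
            return day
        day += retest_interval_days
    return None
```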
In embodiments, the computed label classification is an unsupervised classification of the updated reference database including clustering RNA, DNA and/or protein profiles. In embodiments, the gene expression profile information is obtained from microarray analysis of cellular RNA. In embodiments, the gene expression profile information is obtained from microarray analysis of cellular RNA derived from a single cell. In embodiments, the computed label classification is an unsupervised machine classification including a bootstrapping sparse non-negative matrix factorization.
In embodiments, the gene expression reference database forms part of a storage medium. In embodiments, receiving the test dataset includes receiving input from an array analysis system. In embodiments, receiving the test dataset includes receiving input via a computer network. In embodiments, the data in the reference database is associated with one or more labeled associated biological classes of the cells.
In some aspects, the methods provided herein include the use of reference cells and/or test cells that are the product of a method to differentiate a cell. In some embodiments, the reference cells and/or test cells described in Sections I.A. and I.B. are the product of a method to differentiate a pluripotent stem cell. Various sources of pluripotent stem cells can be used, including embryonic stem (ES) cells and induced pluripotent stem cells (iPSCs). In some embodiments, the cell is an iPSC. In some embodiments, the pluripotent stem cell is an iPSC. In some embodiments, the pluripotent stem cell is an iPSC, artificially derived from a non-pluripotent cell. iPSCs may be generated by a process known as reprogramming, wherein non-pluripotent cells are effectively “dedifferentiated” to an embryonic stem cell-like state by engineering them to express genes such as OCT4, SOX2, and KLF4. Takahashi and Yamanaka Cell (2006) 126: 663-76.
In some embodiments, the cell is a pluripotent stem cell. In some embodiments, the cell is a pluripotent stem cell that was artificially derived from a non-pluripotent cell of a subject. In some embodiments, the non-pluripotent cell is a fibroblast. In some embodiments, the subject is a human. In some embodiments, the subject is a human with Parkinson's Disease. In some embodiments, the pluripotent stem cell is an iPSC.
A standard art-accepted test, such as the ability to form a teratoma in 8-12 week old SCID mice, can be used to establish the pluripotency of a cell population. However, identification of various pluripotent stem cell characteristics can also be used to identify pluripotent cells. In some aspects, pluripotent stem cells can be distinguished from other cells by particular characteristics, including by expression or non-expression of certain combinations of molecular markers. More specifically, human pluripotent stem cells may express at least some, and optionally all, of the markers from the following non-limiting list: SSEA-3, SSEA-4, TRA-1-60, TRA-1-81, TRA-2-49/6E, ALP, Sox2, E-cadherin, UTF-1, Oct4, Lin28, Rex1, and Nanog. In some aspects, a pluripotent stem cell characteristic is a cell morphology associated with pluripotent stem cells.
Methods for generating iPSCs are known. For example, mouse iPSCs were reported in 2006 (Takahashi and Yamanaka), and human iPSCs were reported in late 2007 (Takahashi et al. and Yu et al.). Mouse iPSCs demonstrate important characteristics of pluripotent stem cells, including the expression of stem cell markers, the formation of tumors containing cells from all three germ layers, and the ability to contribute to many different tissues when injected into mouse embryos at a very early stage in development. Human iPSCs also express stem cell markers and are capable of generating cells characteristic of all three germ layers.
In some embodiments, the reference cells and/or the test cells are neuronal cells that have been differentiated from a pluripotent stem cell. In some embodiments, the cells are differentiated using methods that differentiate cells, e.g., iPSCs, into any neural cell type using any available or known method for inducing the differentiation of cells. As is understood, the particular differentiation protocol and timing of the culture may result in different states of differentiated neuronal cells. In some embodiments, the differentiation is carried out by culture of pluripotent stem cells, e.g. iPSCs, under conditions to produce neuronal progenitor cells that are or include cells that are committed to being a neuronal cell. In some embodiments, the iPSCs are differentiated under conditions to result in floor plate midbrain progenitor cells, determined dopaminergic precursor cells, and/or dopamine (DA) neurons. In some embodiments, iPSCs are cultured under conditions for differentiation into determined dopaminergic precursor cells. In some embodiments, the iPSCs are cultured under conditions to differentiate into dopaminergic neurons. Any available and known method for inducing differentiation of the cells, e.g., pluripotent stem cells, into floor plate midbrain progenitor cells, determined dopaminergic precursor cells, and/or dopamine (DA) neurons can be used. Exemplary methods of differentiating neural cells can be found, e.g., in WO2013104752, WO2010096496, WO2013067362, WO2014176606, WO2016196661, WO2015143342, and US20160348070, the contents of which are hereby incorporated by reference in their entirety. In some embodiments, iPSCs are allowed to differentiate in culture as part of differentiation into neuronal cells. In some embodiments, the cells are cultured or incubated in the presence of one or more factors able to induce or promote the differentiation of iPSCs into neuronal cells.
In some embodiments, the iPSCs are cultured in the presence of one or more of (i) an inhibitor of TGF-β/activin-Nodal signaling; (ii) at least one activator of Sonic Hedgehog (SHH) signaling; (iii) an inhibitor of bone morphogenetic protein (BMP) signaling; and (iv) an inhibitor of glycogen synthase kinase 3β (GSK3β) signaling. In some embodiments, the iPSCs are cultured in the presence of (i) an inhibitor of TGF-β/activin-Nodal signaling; (ii) at least one activator of Sonic Hedgehog (SHH) signaling; (iii) an inhibitor of bone morphogenetic protein (BMP) signaling; and (iv) an inhibitor of glycogen synthase kinase 3β (GSK3β) signaling. In some embodiments, the inhibitor of TGF-β/activin-Nodal signaling is SB431542 (e.g. between about 1 μM and about 20 μM, such as 10 μM). In some embodiments, the at least one activator of SHH signaling is SHH (e.g. between about 10 ng/mL and about 500 ng/mL, such as 100 ng/mL) or purmorphamine (e.g. between about 0.1 μM and about 10 μM, such as 2 μM). In some embodiments, the at least one activator of SHH signaling includes SHH protein (e.g. between about 10 ng/mL and about 500 ng/mL, such as 100 ng/mL) and purmorphamine (e.g. between about 0.1 μM and about 10 μM, such as 2 μM). In some embodiments, the inhibitor of BMP signaling is LDN193189 (e.g. between about 0.01 μM and about 5 μM, such as 0.1 μM). In some embodiments, the inhibitor of GSK3β signaling is CHIR99021 (e.g. between about 0.1 μM and about 10 μM, such as 2 μM).
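By way of non-limiting illustration, the exemplary concentration ranges above can be captured in a small lookup table, for instance to check a planned culture medium against the stated ranges. The structure and function names are hypothetical, and because the text gives the bounds as approximate ("about"), the hard comparisons in this sketch are a simplification.

```python
from typing import NamedTuple


class Range(NamedTuple):
    low: float      # approximate lower bound stated in the text
    high: float     # approximate upper bound stated in the text
    example: float  # "such as" value stated in the text
    unit: str       # "uM" is used here as an ASCII stand-in for micromolar


FIRST_CULTURE_FACTORS = {
    "SB431542": Range(1.0, 20.0, 10.0, "uM"),       # TGF-beta/activin-Nodal inhibitor
    "SHH": Range(10.0, 500.0, 100.0, "ng/mL"),      # SHH signaling activator
    "purmorphamine": Range(0.1, 10.0, 2.0, "uM"),   # SHH signaling activator
    "LDN193189": Range(0.01, 5.0, 0.1, "uM"),       # BMP signaling inhibitor
    "CHIR99021": Range(0.1, 10.0, 2.0, "uM"),       # GSK3-beta signaling inhibitor
}


def in_stated_range(factor: str, concentration: float) -> bool:
    """True if the concentration lies within the range stated for the factor."""
    stated = FIRST_CULTURE_FACTORS[factor]
    return stated.low <= concentration <= stated.high
```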
In some embodiments, the iPSCs are exposed to the one or more factors or agents at the initiation of the culturing or incubation (day 0). In some embodiments, the presence of the one or more of the factors or agents, each independently, may be maintained in the culture for the duration of the culture or for a portion of the culture. In some embodiments, the one or more factors or agents are, each independently, present in the culture for a time period to allow differentiation of the iPSCs into midbrain floor plate precursors, or until such cells exhibit characteristics of midbrain floor plate precursors as determined by a classification label according to the provided methods. In some embodiments, the one or more factors or agents are, each independently, present in the culture for up to day 5, up to day 6, up to day 7, up to day 8, up to day 9, up to day 10, up to day 11, up to day 12 or up to day 13 of the culture. For example, in an exemplary protocol, the culturing under conditions for differentiating iPSCs into neuronal cells includes initiating a first incubation on about day 0, wherein the first incubation includes culturing the pluripotent stem cells and exposing the cells to (i) an inhibitor of TGF-β/activin-Nodal signaling from day 0 through day 10, each day inclusive; (ii) at least one activator of Sonic Hedgehog (SHH) signaling from day 1 through day 6, each day inclusive; (iii) an inhibitor of bone morphogenetic protein (BMP) signaling from day 0 through day 10, each day inclusive; and (iv) an inhibitor of glycogen synthase kinase 3β (GSK3β) signaling from day 0 through day 12, each day inclusive.
In some embodiments, a second culture or incubation can be carried out on cells differentiated in the first culture, in which the second culture or incubation is carried out in the presence of one or more additional agents or factors under conditions to further neurally differentiate the cells. In some embodiments, the second culture or incubation may be initiated at or about the time that the cells in the first culture have differentiated into midbrain floor plate precursors, or until such cells exhibit characteristics of midbrain floor plate precursors as determined by a classification label according to the provided methods. In some embodiments, the one or more additional agents or factors can include any one or more of the one or more factors present in the first culture. In some embodiments, the one or more additional agents or factors can include one or more of (i) brain-derived neurotrophic factor (BDNF); (ii) ascorbic acid; (iii) glial cell-derived neurotrophic factor (GDNF); (iv) cyclic AMP (cAMP), e.g. dibutyryl cyclic AMP (dbcAMP); (v) transforming growth factor beta-3 (TGFβ3) (collectively, “BAGCT”); and (vi) an inhibitor of Notch. In some embodiments, the additional agents or factors include (i) brain-derived neurotrophic factor (BDNF); (ii) ascorbic acid; (iii) glial cell-derived neurotrophic factor (GDNF); (iv) dibutyryl cyclic AMP (dbcAMP); (v) transforming growth factor beta-3 (TGFβ3) (collectively, “BAGCT”); and (vi) an inhibitor of Notch. In some embodiments, the cells are exposed to a concentration of BDNF between about 1 ng/mL and 100 ng/mL (e.g. 20 ng/mL). In some embodiments, the cells are exposed to ascorbic acid at a concentration of between about 0.05 mM and 5 mM, e.g. 0.2 mM. In some embodiments, the cells are exposed to GDNF at a concentration of between 1 ng/mL and 100 ng/mL, e.g. 20 ng/mL. In some embodiments, the cells are exposed to cAMP, e.g. dibutyryl cyclic AMP (dbcAMP), at a concentration between about 0.05 mM and 5 mM, e.g. about 0.5 mM. In some embodiments, the cells are exposed to transforming growth factor beta 3 (TGFβ3) at a concentration of between about 0.1 ng/mL and 10 ng/mL, e.g. 1 ng/mL.
In some embodiments, the second culture or incubation can be carried out for a period of time to differentiate the cells into determined dopaminergic precursor cells, or until such cells exhibit characteristics of dopaminergic neurons as determined by a classification label according to the provided methods. In some embodiments the second culture or incubation can be carried out for a period of time to differentiate the cells into dopaminergic neurons, or until such cells exhibit characteristics of dopaminergic neurons as determined by a classification label according to the provided methods. In some embodiments, the second culture or incubation is carried out up until about day 30 after the initiation of the first culture or incubations. In some embodiments, the second culture or incubation is carried out up until about day 11 to day 25 after initiation of the first culture or incubations, such as from day 11, day 12, day 13, day 14, day 15, day 16, day 17, day 18, day 19, day 20, day 21, day 22, day 23, day 24 or day 25. In some embodiments, the second culture or incubation is carried out to at or about day 18 after initiation of the first culture. In some embodiments, the second culture is carried out to at or about day 25 after initiation of the first culture.
In some embodiments, cells of the culture are exposed to the one or more additional factors or agents for the duration of the culture or for a period of time. In some embodiments, the presence of the one or more of additional factors or agents, each independently, may be maintained in the culture for the duration of the culture or for a portion of the culture. In some embodiments, the one or more additional factors or agents are, each independently, present in the culture for a time period to differentiate the cells into determined dopaminergic precursor cells, or until such cells exhibit characteristics of dopaminergic neurons as determined by a classification label according to the provided methods. In some embodiments, the one or more additional factors or agents are, each independently, present in the culture for a time period to differentiate the cells into dopaminergic neurons, or until such cells exhibit characteristics of dopaminergic neurons as determined by a classification label in accord with the provided methods. In some embodiments, the second culture or incubation is carried out up until about day 30 after the initiation of the first culture or incubations. In some embodiments, the one or more additional agent or factor are, each independently, present in the culture from the initiation of the second culture until about day 11 to day 25 after initiation of the first culture or incubation, such as up until day 11, day 12, day 13, day 14, day 15, day 16, day 17, day 18, day 19, day 20, day 21, day 22, day 23, day 24 or day 25. In some embodiments, the one or more additional agent or factor are, each independently, present in the culture from the initiation of the second culture to at or about day 18 after initiation of the first culture. 
In some embodiments, the one or more additional agents or factors are, each independently, present in the culture from the initiation of the second culture until at or about day 25 after initiation of the first culture. For example, in an exemplary protocol, the culturing under conditions for differentiating iPSCs into neuronal cells further includes a second incubation in which cells from the first incubation are further cultured by exposing the cells to (i) brain-derived neurotrophic factor (BDNF); (ii) ascorbic acid; (iii) glial cell-derived neurotrophic factor (GDNF); (iv) dibutyryl cyclic AMP (dbcAMP); (v) transforming growth factor beta-3 (TGFβ3) (collectively, “BAGCT”); and (vi) an inhibitor of Notch, beginning on day 11. In some embodiments, the cells are exposed to BAGCT until harvest of the neurally differentiated cells, such as until day 18 or until day 25. In some embodiments, the second incubation may further include culture by exposing the cells to an inhibitor of GSK3β signaling from day 11 through day 12, each day inclusive.
In some embodiments, the incubation may include culture by exposing the cells to an inhibitor of Rho-associated protein kinase (ROCK) signaling at one or more times during the culturing, such as on about day 0, day 7, day 16 and/or day 20 from the initiation of the first culture. In some embodiments, the ROCK inhibitor is Y-27632 (e.g. between about 1 μM and about 20 μM, such as about 10 μM).
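By way of non-limiting illustration, the exemplary first and second incubations described above can be summarized as a day-by-day exposure schedule. The sketch makes several assumptions that are not requirements of the protocol: harvest on day 25 (day 18 is the other example given), exposure to the Notch inhibitor until harvest in the same manner as BAGCT, and ROCK inhibitor exposure on all four of the listed days. All names are hypothetical.

```python
from typing import List

HARVEST_DAY = 25  # assumption; the text gives day 18 or day 25 as examples

# (factor, first day, last day), both days inclusive
EXPOSURE_WINDOWS = [
    ("TGF-beta/activin-Nodal inhibitor", 0, 10),
    ("SHH signaling activator", 1, 6),
    ("BMP inhibitor", 0, 10),
    ("GSK3-beta inhibitor", 0, 12),          # spans both incubations
    ("BAGCT", 11, HARVEST_DAY),              # second incubation begins on day 11
    ("Notch inhibitor", 11, HARVEST_DAY),    # end day is an assumption
]

# Assumption: all four listed days are used ("and/or" in the text).
ROCK_INHIBITOR_DAYS = {0, 7, 16, 20}


def factors_on_day(day: int) -> List[str]:
    """List the factors to which the cells are exposed on a given culture day."""
    present = [name for name, first, last in EXPOSURE_WINDOWS if first <= day <= last]
    if day in ROCK_INHIBITOR_DAYS:
        present.append("ROCK inhibitor")
    return present
```

For example, `factors_on_day(12)` lists the GSK3β inhibitor together with BAGCT and the Notch inhibitor, reflecting the overlap between the end of GSK3β inhibition and the start of the second incubation.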
In some embodiments, the culturing of the iPSCs under conditions for differentiation into neuronal cells can be for a time period from the initiation of the culturing until harvest of differentiated cells that is between 10 days and 30 days. It is understood that the particular timing may be chosen based on the desired differentiation state of the cells, for example as determined empirically by a functional or other phenotypic assay or as determined based on classification label of the differentiated cells as determined in accord with the provided methods. In some embodiments, a reference cell is differentiated by culture for a certain or defined period of time. In some embodiments a reference cell is differentiated by culture for a total period of time in which the cell is determined to exhibit a desired functional or phenotypic attribute or feature, e.g. as described in Section I.A. In some embodiments, a test cell is differentiated by culture for a total period of time. In some embodiments, a test cell is differentiated by culture for a total period of time at which it is determined the test cell exhibits a desired classification label in accord with the provided methods. In some embodiments, the provided methods can be used to assess if a test cell has been cultured under conditions for its differentiation into a desired neuronal cell, e.g. determined dopaminergic precursor cell, by its classification label as determined in accord with any of the provided methods.
In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 10 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 11 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 12 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 13 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 14 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 15 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 16 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 17 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 18 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 19 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for at least 20 days.
In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 10 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 11 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 12 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 13 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 14 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 15 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 16 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 17 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 18 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 19 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 20 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 21 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 22 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 23 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 24 days. In embodiments, the iPSC is cultured for differentiation into a neuronal cell for about 25 days.
In some embodiments, reference cells, for example as described in Section I.A., undergo methods of differentiation as described herein. In some embodiments, test cells, for example as described in Section I.B., undergo methods of differentiation as described herein. In some embodiments, both reference cells and test cells undergo the same methods of differentiation as provided herein.
In some embodiments, the determined dopaminergic precursor cells identified by the methods provided herein have certain increased and/or decreased gene expression levels relative to a pluripotent stem cell. In some embodiments, an in vitro population of neuronal progenitor cells having certain increased and/or decreased gene expression levels relative to a pluripotent stem cell is indicative of the in vitro population comprising desirable determined dopaminergic precursor cells.
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes increased gene expression levels relative to a pluripotent stem cell for a first gene set, wherein the first gene set includes at least one increased gene within one or more first gene ontologies of Table 1.
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes increased gene expression levels relative to a pluripotent stem cell for a first gene set, wherein the first gene set includes at least one increased gene within one or more first gene ontologies selected from the group consisting of gene ontologies of Table 1.
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes increased gene expression levels relative to a pluripotent stem cell for a first gene set, wherein the first gene set includes at least one increased gene within one or more first gene ontologies of GO:0007399, GO:0120025, GO:0042995, GO:0032502, GO:0044767, GO:0048856, GO:0048731, GO:0022008, GO:0048699, GO:0007275, GO:0030030, GO:0032501, GO:0044707, GO:0050874, GO:0048468, GO:0120036, GO:0120038, GO:0044463, GO:0097458, GO:0045202, GO:0030182, GO:0030154, GO:0048869, GO:0051960, GO:0007156, GO:0005929, GO:0072372, GO:0035082, GO:0035083, GO:0035084, GO:0060284, GO:0050767, GO:0001578, GO:0016339, GO:0043005, GO:0044456, GO:0098742, GO:0045664, GO:0006928, GO:0099699, GO:0048666, GO:0003341, GO:0036142, GO:0005509, GO:0097060, GO:0031514, GO:0009434, GO:0031512, GO:0007155, GO:0098602, GO:0010975, GO:0098794, GO:0022610, GO:0030424, GO:0099240, GO:0032989, GO:0120035, GO:0000902, GO:0007148, GO:0045790, GO:0045791, GO:0048812, GO:0036477, GO:0031344, GO:0120039, GO:0061564, GO:0048858, GO:0099055, GO:0009653, GO:0098609, GO:0016337, GO:0031175, GO:0005930, GO:0035085, GO:0035086, GO:0010720, GO:0007416, GO:0097014, GO:0032990, GO:0098936, GO:0043025, GO:0050768, GO:0051962, GO:0050808, GO:0007409, GO:0007410, GO:2000026, GO:0045597, GO:0044441, GO:0044442, GO:0007417, GO:0048667, GO:0010721, GO:0044459, GO:0060322, GO:0045211, GO:0045666, GO:0032838, GO:0099056, GO:0051961, GO:0044297, GO:0007018, GO:0050769, GO:0040011, GO:0050793, GO:0051094, GO:0005874, GO:0000904, GO:0010976, GO:0045595, GO:0050770, GO:0099536, GO:0098889, GO:0051239, GO:0007420, GO:0099537, GO:0031346, GO:0007268, GO:0098916, GO:0097485, GO:0044782, GO:0031226, GO:0060285, GO:0071974, GO:0010769, GO:0001539, GO:0050804, GO:0099177, GO:0005887, GO:0098984, GO:0045665, GO:0050919, GO:0007411, GO:0008040, GO:0030425, GO:0061387, GO:0097447, GO:0050803, GO:0042734, 
GO:0042391, GO:0001764, GO:0032279, GO:0010770, GO:0021953, GO:0099572, GO:0098590, GO:0044447, GO:0098978, GO:0014069, GO:0097481, GO:0097483, GO:0033267, GO:0010977, GO:0007017, GO:0150034, GO:0034702, GO:0034703, GO:0050807, GO:0060271, GO:0042384, GO:0051240, GO:0050772, GO:0120031, GO:0007626, GO:0008092, GO:0005886, GO:0005904, GO:0007610, GO:0044708, GO:0098793, GO:0022604, GO:0007267, GO:0071944, GO:0099060, GO:0022836, GO:0030031, GO:0042220, GO:0019226, GO:0030516, GO:0035637, GO:0045596, GO:0021954, GO:0022832, GO:0005244, GO:1902495, GO:0050771, GO:0048513, GO:0022839, GO:0098948, GO:0001508, GO:0099568, GO:0008484, GO:0051966, GO:0003358, GO:0033602, GO:0005261, GO:0015281, GO:0015338, GO:0022603, GO:1990351, GO:0097729, GO:0015631, GO:0051270, GO:0005216, GO:0016043, GO:0044235, GO:0071842, GO:0031345, GO:0005856, GO:0022838, GO:0099061, GO:0098982, GO:0051674, GO:0048870, GO:0060294, GO:0072359, GO:0099634, GO:0015630, GO:0036126, GO:1990939, GO:0072347, GO:0015267, GO:0015249, GO:0015268, GO:0022803, GO:0022814, GO:0008045, GO:0098797, GO:0060160, GO:0099146, GO:0010771, GO:0000226, GO:0045503, GO:0005578, GO:0030334, GO:0044304, GO:0010463, GO:0010646, GO:0008574, GO:0043279 or any combination thereof.
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes increased gene expression levels relative to a pluripotent stem cell for a first gene set, wherein the first gene set includes at least one increased gene within one or more first gene ontologies selected from the group consisting of GO:0007399, GO:0120025, GO:0042995, GO:0032502, GO:0044767, GO:0048856, GO:0048731, GO:0022008, GO:0048699, GO:0007275, GO:0030030, GO:0032501, GO:0044707, GO:0050874, GO:0048468, GO:0120036, GO:0120038, GO:0044463, GO:0097458, GO:0045202, GO:0030182, GO:0030154, GO:0048869, GO:0051960, GO:0007156, GO:0005929, GO:0072372, GO:0035082, GO:0035083, GO:0035084, GO:0060284, GO:0050767, GO:0001578, GO:0016339, GO:0043005, GO:0044456, GO:0098742, GO:0045664, GO:0006928, GO:0099699, GO:0048666, GO:0003341, GO:0036142, GO:0005509, GO:0097060, GO:0031514, GO:0009434, GO:0031512, GO:0007155, GO:0098602, GO:0010975, GO:0098794, GO:0022610, GO:0030424, GO:0099240, GO:0032989, GO:0120035, GO:0000902, GO:0007148, GO:0045790, GO:0045791, GO:0048812, GO:0036477, GO:0031344, GO:0120039, GO:0061564, GO:0048858, GO:0099055, GO:0009653, GO:0098609, GO:0016337, GO:0031175, GO:0005930, GO:0035085, GO:0035086, GO:0010720, GO:0007416, GO:0097014, GO:0032990, GO:0098936, GO:0043025, GO:0050768, GO:0051962, GO:0050808, GO:0007409, GO:0007410, GO:2000026, GO:0045597, GO:0044441, GO:0044442, GO:0007417, GO:0048667, GO:0010721, GO:0044459, GO:0060322, GO:0045211, GO:0045666, GO:0032838, GO:0099056, GO:0051961, GO:0044297, GO:0007018, GO:0050769, GO:0040011, GO:0050793, GO:0051094, GO:0005874, GO:0000904, GO:0010976, GO:0045595, GO:0050770, GO:0099536, GO:0098889, GO:0051239, GO:0007420, GO:0099537, GO:0031346, GO:0007268, GO:0098916, GO:0097485, GO:0044782, GO:0031226, GO:0060285, GO:0071974, GO:0010769, GO:0001539, GO:0050804, GO:0099177, GO:0005887, GO:0098984, GO:0045665, GO:0050919, GO:0007411, GO:0008040, GO:0030425, GO:0061387, GO:0097447, 
GO:0050803, GO:0042734, GO:0042391, GO:0001764, GO:0032279, GO:0010770, GO:0021953, GO:0099572, GO:0098590, GO:0044447, GO:0098978, GO:0014069, GO:0097481, GO:0097483, GO:0033267, GO:0010977, GO:0007017, GO:0150034, GO:0034702, GO:0034703, GO:0050807, GO:0060271, GO:0042384, GO:0051240, GO:0050772, GO:0120031, GO:0007626, GO:0008092, GO:0005886, GO:0005904, GO:0007610, GO:0044708, GO:0098793, GO:0022604, GO:0007267, GO:0071944, GO:0099060, GO:0022836, GO:0030031, GO:0042220, GO:0019226, GO:0030516, GO:0035637, GO:0045596, GO:0021954, GO:0022832, GO:0005244, GO:1902495, GO:0050771, GO:0048513, GO:0022839, GO:0098948, GO:0001508, GO:0099568, GO:0008484, GO:0051966, GO:0003358, GO:0033602, GO:0005261, GO:0015281, GO:0015338, GO:0022603, GO:1990351, GO:0097729, GO:0015631, GO:0051270, GO:0005216, GO:0016043, GO:0044235, GO:0071842, GO:0031345, GO:0005856, GO:0022838, GO:0099061, GO:0098982, GO:0051674, GO:0048870, GO:0060294, GO:0072359, GO:0099634, GO:0015630, GO:0036126, GO:1990939, GO:0072347, GO:0015267, GO:0015249, GO:0015268, GO:0022803, GO:0022814, GO:0008045, GO:0098797, GO:0060160, GO:0099146, GO:0010771, GO:0000226, GO:0045503, GO:0005578, GO:0030334, GO:0044304, GO:0010463, GO:0010646, GO:0008574, GO:0043279 and any combination thereof.
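By way of non-limiting illustration, a first gene set of the kind described above can be assembled by intersecting genes whose expression is increased relative to a pluripotent stem cell with the genes annotated to the first gene ontologies. The sketch assumes that "increased" means a positive log2 fold change and that a gene-to-GO-term annotation mapping is available; both the cutoff and all names are hypothetical and are not specified in the text.

```python
from typing import Dict, Set


def first_gene_set(
    log2_fold_change: Dict[str, float],
    gene_to_go_terms: Dict[str, Set[str]],
    first_gene_ontologies: Set[str],
) -> Set[str]:
    """Genes increased versus pluripotent stem cells within the given ontologies.

    log2_fold_change: gene -> log2(test / pluripotent) expression ratio.
    gene_to_go_terms: gene -> GO identifiers annotated to that gene.
    first_gene_ontologies: GO identifiers of interest (e.g. those listed above).
    """
    return {
        gene
        for gene, change in log2_fold_change.items()
        # Assumption: any positive fold change counts as "increased".
        if change > 0 and gene_to_go_terms.get(gene, set()) & first_gene_ontologies
    }
```

The size of the returned set can then be compared with the ranges recited for the number of increased genes in the first gene set.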
In embodiments, the first gene set includes about 1-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 2-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 3-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 4-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 5-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 6-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 7-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 8-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 9-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 10-500 increased genes within one or more of the first gene ontologies.
In embodiments, the first gene set includes about 15-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 20-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 25-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 30-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 35-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 40-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 45-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 50-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 55-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 60-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 65-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 70-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 75-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 80-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 85-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 90-500 increased genes within one or more of the first gene ontologies. 
In embodiments, the first gene set includes about 95-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 100-500 increased genes within one or more of the first gene ontologies.
In embodiments, the first gene set includes about 105-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 115-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 120-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 125-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 130-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 135-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 140-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 145-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 150-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 155-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 160-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 165-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 170-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 175-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 180-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 185-500 increased genes within one or more of the first gene ontologies. 
In embodiments, the first gene set includes about 190-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 195-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 200-500 increased genes within one or more of the first gene ontologies.
In embodiments, the first gene set includes about 205-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 215-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 220-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 225-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 230-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 235-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 240-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 245-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 250-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 255-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 260-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 265-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 270-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 275-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 280-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 285-500 increased genes within one or more of the first gene ontologies. 
In embodiments, the first gene set includes about 290-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 295-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 300-500 increased genes within one or more of the first gene ontologies.
In embodiments, the first gene set includes about 305-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 315-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 320-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 325-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 330-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 335-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 340-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 345-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 350-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 355-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 360-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 365-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 370-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 375-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 380-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 385-500 increased genes within one or more of the first gene ontologies. 
In embodiments, the first gene set includes about 390-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 395-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 400-500 increased genes within one or more of the first gene ontologies.
In embodiments, the first gene set includes about 405-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 415-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 420-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 425-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 430-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 435-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 440-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 445-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 450-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 455-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 460-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 465-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 470-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 475-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 480-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 485-500 increased genes within one or more of the first gene ontologies. 
In embodiments, the first gene set includes about 490-500 increased genes within one or more of the first gene ontologies. In embodiments, the first gene set includes about 495-500 increased genes within one or more of the first gene ontologies.
In embodiments, the first gene set includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499 or 500 increased genes within one or more of the first gene ontologies.
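The gene-count embodiments above can be illustrated with a minimal sketch. It assumes that "increased" means a fold change over the pluripotent stem cell reference at or above a cutoff; the cutoff of 2.0, the pseudocount, the function names, the placeholder gene names and the term-to-gene mapping are all hypothetical and are not taken from this disclosure or from real Gene Ontology annotations.

```python
from typing import Dict, Iterable, Set


def increased_genes(
    test_expr: Dict[str, float],
    reference_expr: Dict[str, float],
    min_fold_change: float = 2.0,  # assumed cutoff, not specified in the text
    pseudocount: float = 1.0,      # assumed, avoids division by zero
) -> Set[str]:
    """Return genes whose test/reference expression ratio meets the cutoff."""
    increased: Set[str] = set()
    for gene, test_value in test_expr.items():
        if gene not in reference_expr:
            continue  # no reference value, so no comparison is possible
        ratio = (test_value + pseudocount) / (reference_expr[gene] + pseudocount)
        if ratio >= min_fold_change:
            increased.add(gene)
    return increased


def first_gene_set(
    increased: Set[str],
    ontology_members: Dict[str, Set[str]],
    selected_terms: Iterable[str],
) -> Set[str]:
    """Keep only increased genes annotated to at least one selected term."""
    annotated: Set[str] = set()
    for term in selected_terms:
        annotated |= ontology_members.get(term, set())
    return increased & annotated


def count_in_range(gene_set: Set[str], low: int, high: int = 500) -> bool:
    """Check a range such as 'about 1-500' or 'about 5-500' increased genes."""
    return low <= len(gene_set) <= high


# Hypothetical expression values (arbitrary units) for placeholder genes.
test_expr = {"GENE_A": 30.0, "GENE_B": 5.0, "GENE_C": 12.0, "GENE_D": 40.0}
psc_expr = {"GENE_A": 2.0, "GENE_B": 5.0, "GENE_C": 1.0, "GENE_D": 3.0}

# Hypothetical term-to-gene mapping; the identifiers appear in the list above,
# but the gene assignments are placeholders.
ontology_members = {
    "GO:0007399": {"GENE_A", "GENE_B"},
    "GO:0030182": {"GENE_C"},
    "GO:0005509": {"GENE_D"},
}

up = increased_genes(test_expr, psc_expr)
genes = first_gene_set(up, ontology_members, ["GO:0007399", "GO:0030182"])
print(sorted(genes), count_in_range(genes, low=1), count_in_range(genes, low=5))
```

In this sketch the lower bound of each recited range is passed as `low`, so the same check covers each of the ranges recited above by changing one argument.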
The gene expression profile information for the desirable determined dopaminergic precursor cell may include increased gene expression levels relative to a pluripotent stem cell for a first gene set, wherein the first gene set includes at least one increased gene within one or more first gene ontologies of Table 1. “One or more” as described herein in the context of first gene ontologies refers to at least one, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, etc. of first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 10-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 20-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 30-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 40-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 50-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 60-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 70-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 80-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 90-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 100-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 110-300 of the first gene ontologies. 
In embodiments, the first gene set includes about 1-500 increased genes within 120-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 130-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 140-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 150-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 160-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 170-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 180-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 190-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 200-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 210-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 220-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 230-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 240-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 250-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 260-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 270-300 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 280-300 of the first gene ontologies. 
In embodiments, the first gene set includes about 1-500 increased genes within 290-300 of the first gene ontologies.
In embodiments, the first gene set includes about 1-500 increased genes within 1-290 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-280 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-270 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-260 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-250 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-240 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-230 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-220 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-210 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-200 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-190 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-180 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-170 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-160 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-150 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-140 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-130 of the first gene ontologies. 
In embodiments, the first gene set includes about 1-500 increased genes within 1-120 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-110 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-100 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-90 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-80 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-70 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-60 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-50 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-40 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-30 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-20 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-10 of the first gene ontologies. In embodiments, the first gene set includes about 1-500 increased genes within 1-5 of the first gene ontologies.
In embodiments, the first gene set includes at least one increased gene within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, or 208 first gene ontologies of Table 1.
In embodiments, the first gene set includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499 or 500 increased genes within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207 or 208 first gene ontologies of Table 1.
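The preceding embodiments count how many first gene ontologies contain at least one increased gene. A minimal sketch of that count follows; the function names, the placeholder gene names and the term-to-gene mapping are hypothetical, and the upper bound of 208 simply mirrors the number of Table 1 ontologies recited above.

```python
from typing import Dict, List, Set


def ontologies_with_increased_gene(
    increased: Set[str],
    ontology_members: Dict[str, Set[str]],
) -> List[str]:
    """Return, in sorted order, the terms annotating at least one increased gene."""
    return sorted(
        term for term, members in ontology_members.items() if members & increased
    )


def ontology_count_in_range(hits: List[str], low: int = 1, high: int = 208) -> bool:
    """Check whether the number of terms hit falls within [low, high]."""
    return low <= len(hits) <= high


# Hypothetical set of increased genes and placeholder annotations.
increased = {"GENE_A", "GENE_C"}
ontology_members = {
    "GO:0007399": {"GENE_A", "GENE_B"},
    "GO:0030182": {"GENE_C"},
    "GO:0005509": {"GENE_D"},
}

hits = ontologies_with_increased_gene(increased, ontology_members)
print(hits, ontology_count_in_range(hits))
```

Counting terms rather than genes matters here because one increased gene may be annotated to several ontologies, so the two counts need not agree.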
In embodiments, the first gene ontologies are any one of the gene ontologies listed in Table 1. In embodiments, the first gene ontologies are any one of GO:0007399, GO:0120025, GO:0042995, GO:0032502, GO:0044767, GO:0048856, GO:0048731, GO:0022008, GO:0048699, GO:0007275, GO:0030030, GO:0032501, GO:0044707, GO:0050874, GO:0048468, GO:0120036, GO:0120038, GO:0044463, GO:0097458, GO:0045202, GO:0030182, GO:0030154, GO:0048869, GO:0051960, GO:0007156, GO:0005929, GO:0072372, GO:0035082, GO:0035083, GO:0035084, GO:0060284, GO:0050767, GO:0001578, GO:0016339, GO:0043005, GO:0044456, GO:0098742, GO:0045664, GO:0006928, GO:0099699, GO:0048666, GO:0003341, GO:0036142, GO:0005509, GO:0097060, GO:0031514, GO:0009434, GO:0031512, GO:0007155, GO:0098602, GO:0010975, GO:0098794, GO:0022610, GO:0030424, GO:0099240, GO:0032989, GO:0120035, GO:0000902, GO:0007148, GO:0045790, GO:0045791, GO:0048812, GO:0036477, GO:0031344, GO:0120039, GO:0061564, GO:0048858, GO:0099055, GO:0009653, GO:0098609, GO:0016337, GO:0031175, GO:0005930, GO:0035085, GO:0035086, GO:0010720, GO:0007416, GO:0097014, GO:0032990, GO:0098936, GO:0043025, GO:0050768, GO:0051962, GO:0050808, GO:0007409, GO:0007410, GO:2000026, GO:0045597, GO:0044441, GO:0044442, GO:0007417, GO:0048667, GO:0010721, GO:0044459, GO:0060322, GO:0045211, GO:0045666, GO:0032838, GO:0099056, GO:0051961, GO:0044297, GO:0007018, GO:0050769, GO:0040011, GO:0050793, GO:0051094, GO:0005874, GO:0000904, GO:0010976, GO:0045595, GO:0050770, GO:0099536, GO:0098889, GO:0051239, GO:0007420, GO:0099537, GO:0031346, GO:0007268, GO:0098916, GO:0097485, GO:0044782, GO:0031226, GO:0060285, GO:0071974, GO:0010769, GO:0001539, GO:0050804, GO:0099177, GO:0005887, GO:0098984, GO:0045665, GO:0050919, GO:0007411, GO:0008040, GO:0030425, GO:0061387, GO:0097447, GO:0050803, GO:0042734, GO:0042391, GO:0001764, GO:0032279, GO:0010770, GO:0021953, GO:0099572, GO:0098590, GO:0044447, GO:0098978, GO:0014069, GO:0097481, GO:0097483, GO:0033267, GO:0010977, 
GO:0007017, GO:0150034, GO:0034702, GO:0034703, GO:0050807, GO:0060271, GO:0042384, GO:0051240, GO:0050772, GO:0120031, GO:0007626, GO:0008092, GO:0005886, GO:0005904, GO:0007610, GO:0044708, GO:0098793, GO:0022604, GO:0007267, GO:0071944, GO:0099060, GO:0022836, GO:0030031, GO:0042220, GO:0019226, GO:0030516, GO:0035637, GO:0045596, GO:0021954, GO:0022832, GO:0005244, GO:1902495, GO:0050771, GO:0048513, GO:0022839, GO:0098948, GO:0001508, GO:0099568, GO:0008484, GO:0051966, GO:0003358, GO:0033602, GO:0005261, GO:0015281, GO:0015338, GO:0022603, GO:1990351, GO:0097729, GO:0015631, GO:0051270, GO:0005216, GO:0016043, GO:0044235, GO:0071842, GO:0031345, GO:0005856, GO:0022838, GO:0099061, GO:0098982, GO:0051674, GO:0048870, GO:0060294, GO:0072359, GO:0099634, GO:0015630, GO:0036126, GO:1990939, GO:0072347, GO:0015267, GO:0015249, GO:0015268, GO:0022803, GO:0022814, GO:0008045, GO:0098797, GO:0060160, GO:0099146, GO:0010771, GO:0000226, GO:0045503, GO:0005578, GO:0030334, GO:0044304, GO:0010463, GO:0010646, GO:0008574, GO:0043279 or any combination thereof.
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes increased gene expression levels relative to a pluripotent stem cell for a first gene set, wherein the first gene set includes at least one increased gene within one or more first gene ontologies selected from the group consisting of: GO:0005509, GO:0016339, GO:0007416 and GO:0048731. In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes increased gene expression levels relative to a pluripotent stem cell for a first gene set, wherein the first gene set includes at least one increased gene within one or more first gene ontologies of: GO:0005509, GO:0016339, GO:0007416 or GO:0048731. In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes increased gene expression levels relative to a pluripotent stem cell for a first gene set, wherein the first gene set includes at least one increased gene within one or more first gene ontologies selected from the group consisting of: GO:0048699, GO:0050767, GO:0060160, GO:0097458, GO:0010975, GO:0022008 and any combination thereof. In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes increased gene expression levels relative to a pluripotent stem cell for a first gene set, wherein the first gene set includes at least one increased gene within one or more first gene ontologies of: GO:0048699, GO:0050767, GO:0060160, GO:0097458, GO:0010975, GO:0022008 or any combination thereof.
In embodiments, the first gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene of Table 2, Table 3, Table 4, Table 5, Table 6 or Table 7 or any combination thereof.
In embodiments, the first gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene of Table 2. In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is GPM6A, DRD2, BMP7, EFNB3, SEMA3C, FSCN2, LGI1, SRCIN1, WNT4, SLIT2, NRG1, TTBK1, RNF165, CDH2, ELAVL4, ONECUT2, KREMEN1, SCRT1, KIAA1024, DSCAM, MAP2, FAT4, PAK3, NGF, SEMA6D, STMN2, ZFHX3, LRP2, APOA1, CAMK2B, MDGA1, ISLR2, SNAP25, NEUROD4, PHOX2B, DCX, MAGI2, PIK3R1, NCAM1, NTRK3, PITX3, MYT1L, AVIL, CDK5R2, INSM1, SOX21, IL6ST, KIF5C, SYNJ1, KALRN, GFRA1, TCTN1, CELSR1, IRX5, PMP22, RUNX1, DPYSL4, NRCAM, ZNF521, MDGA2, PROX1, ZNF536, MAP1A, NEGR1, PLXNA4, EPB41L3, GAP43, EPHA7, DLL3, VSTM2L, ID4, NRN1, SPOCK1, DUSP10, COL3A1, CX3CL1, SLIT3, MAPK8IP2, FAIM2, TCF12, BMP6, NRBP2, NCAM2, HIPK2, CDH11, ADGRL3, ZNF804A, ULK2, CCKAR, SARM1, PLXNA3, ENC1, ASCL1, UNCX, MEIS1, ARX, SRRM4, TRIM67, ALCAM, NTN1, ZNF365, GFI1, ADCYAP1, CNR1, ANKRD1, ALK, STMN4, MAPT, RUFY3, PLXNA2, PLXNC1, MAP1B, DPYSL5, PTPRO, FZD1 or DLX5.
In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is selected from the group consisting of GPM6A, DRD2, BMP7, EFNB3, SEMA3C, FSCN2, LGI1, SRCIN1, WNT4, SLIT2, NRG1, TTBK1, RNF165, CDH2, ELAVL4, ONECUT2, KREMEN1, SCRT1, KIAA1024, DSCAM, MAP2, FAT4, PAK3, NGF, SEMA6D, STMN2, ZFHX3, LRP2, APOA1, CAMK2B, MDGA1, ISLR2, SNAP25, NEUROD4, PHOX2B, DCX, MAGI2, PIK3R1, NCAM1, NTRK3, PITX3, MYT1L, AVIL, CDK5R2, INSM1, SOX21, IL6ST, KIF5C, SYNJ1, KALRN, GFRA1, TCTN1, CELSR1, IRX5, PMP22, RUNX1, DPYSL4, NRCAM, ZNF521, MDGA2, PROX1, ZNF536, MAP1A, NEGR1, PLXNA4, EPB41L3, GAP43, EPHA7, DLL3, VSTM2L, ID4, NRN1, SPOCK1, DUSP10, COL3A1, CX3CL1, SLIT3, MAPK8IP2, FAIM2, TCF12, BMP6, NRBP2, NCAM2, HIPK2, CDH11, ADGRL3, ZNF804A, ULK2, CCKAR, SARM1, PLXNA3, ENC1, ASCL1, UNCX, MEIS1, ARX, SRRM4, TRIM67, ALCAM, NTN1, ZNF365, GFI1, ADCYAP1, CNR1, ANKRD1, ALK, STMN4, MAPT, RUFY3, PLXNA2, PLXNC1, MAP1B, DPYSL5, PTPRO, FZD1 and DLX5.
In embodiments, the first gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene of Table 3. In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is DRD2, BMP7, EFNB3, SEMA3C, SRCIN1, SLIT2, NRG1, TTBK1, CDH2, KREMEN1, SCRT1, KIAA1024, DSCAM, MAP2, PAK3, NGF, SEMA6D, STMN2, ZFHX3, LRP2, CAMK2B, ISLR2, SNAP25, PHOX2B, MAGI2, NTRK3, PITX3, AVIL, IL6ST, SYNJ1, KALRN, PMP22, NRCAM, PROX1, ZNF536, NEGR1, PLXNA4, EPHA7, DLL3, ID4, SPOCK1, DUSP10, COL3A1, CX3CL1, TCF12, BMP6, ZNF804A, ULK2, SARM1, PLXNA3, ENC1, ASCL1, MEIS1, TRIM67, NTN1, ZNF365, GFI1, ADCYAP1, CNR1, ANKRD1, ALK, MAPT, RUFY3, PLXNA2, PLXNC1, MAP1B, PTPRO or FZD1.
In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is selected from the group consisting of DRD2, BMP7, EFNB3, SEMA3C, SRCIN1, SLIT2, NRG1, TTBK1, CDH2, KREMEN1, SCRT1, KIAA1024, DSCAM, MAP2, PAK3, NGF, SEMA6D, STMN2, ZFHX3, LRP2, CAMK2B, ISLR2, SNAP25, PHOX2B, MAGI2, NTRK3, PITX3, AVIL, IL6ST, SYNJ1, KALRN, PMP22, NRCAM, PROX1, ZNF536, NEGR1, PLXNA4, EPHA7, DLL3, ID4, SPOCK1, DUSP10, COL3A1, CX3CL1, TCF12, BMP6, ZNF804A, ULK2, SARM1, PLXNA3, ENC1, ASCL1, MEIS1, TRIM67, NTN1, ZNF365, GFI1, ADCYAP1, CNR1, ANKRD1, ALK, MAPT, RUFY3, PLXNA2, PLXNC1, MAP1B, PTPRO and FZD1.
In embodiments, the first gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene of Table 4. In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is DRD2, RGS4, or PALM.
In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is selected from the group consisting of DRD2, RGS4, and PALM.
In embodiments, the first gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene of Table 5. In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is GPM6A, KIFAP3, DRD2, EFNB3, FSCN2, SLC8A1, SCGN, SRCIN1, PACRG, TRIM9, NRG1, TTBK1, HTR2A, SLC18A1, CERKL, CDH2, PALMD, KREMEN1, TANC2, MAPK10, SCN3A, LRRC4, DSCAM, TGFB3, MAP2, ELFN1, PAK3, NGF, CPEB2, DDN, STMN2, LRP2, CAMK2B, SVOP, SRR, SNAP25, PPFIA2, KCNA2, SYT5, BAIAP3, CADM2, CHRM2, DCX, MAGI2, KLHL1, NTRK3, PITX3, P2RX3, ADGRA1, AVIL, CADM3, CDK5R2, IL6ST, KIF5C, SYNJ1, TSPOAP1, DRP2, TMPRSS3, SYBU, HMP19, SNAP91, SCN11A, PALM, SLC1A4, NRCAM, CACNG4, CNIH2, DGKI, CLSTN2, MAP1A, GLRA2, CUBN, SCN7A, EPB41L3, BSN, GAP43, EPHA7, VSTM2L, SPOCK1, CX3CL1, MAPK8IP2, CAMK2N1, PDE1C, NCAM2, SLC17A6, SLC18A3, KCNC1, ADGRL3, ZNF804A, SARM1, GRIK4, ENC1, ASCL1, DMTN, KNCN, TMEM163, CLDN5, KCND3, PCDHB13, GABRR2, ALCAM, SV2B, KCTD16, ADCYAP1, APBA1, CNR1, STMN4, CADPS, MAPT, RUFY3, TP63, NRSN1, MAP1B, PCSK2, DPYSL5, GRM3, SLC6A1, ABAT, CACNA1C, CACNG2, PTPRO, CHRNA5, or CDH10.
In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is selected from the group consisting of GPM6A, KIFAP3, DRD2, EFNB3, FSCN2, SLC8A1, SCGN, SRCIN1, PACRG, TRIM9, NRG1, TTBK1, HTR2A, SLC18A1, CERKL, CDH2, PALMD, KREMEN1, TANC2, MAPK10, SCN3A, LRRC4, DSCAM, TGFB3, MAP2, ELFN1, PAK3, NGF, CPEB2, DDN, STMN2, LRP2, CAMK2B, SVOP, SRR, SNAP25, PPFIA2, KCNA2, SYT5, BAIAP3, CADM2, CHRM2, DCX, MAGI2, KLHL1, NTRK3, PITX3, P2RX3, ADGRA1, AVIL, CADM3, CDK5R2, IL6ST, KIF5C, SYNJ1, TSPOAP1, DRP2, TMPRSS3, SYBU, HMP19, SNAP91, SCN11A, PALM, SLC1A4, NRCAM, CACNG4, CNIH2, DGKI, CLSTN2, MAP1A, GLRA2, CUBN, SCN7A, EPB41L3, BSN, GAP43, EPHA7, VSTM2L, SPOCK1, CX3CL1, MAPK8IP2, CAMK2N1, PDE1C, NCAM2, SLC17A6, SLC18A3, KCNC1, ADGRL3, ZNF804A, SARM1, GRIK4, ENC1, ASCL1, DMTN, KNCN, TMEM163, CLDN5, KCND3, PCDHB13, GABRR2, ALCAM, SV2B, KCTD16, ADCYAP1, APBA1, CNR1, STMN4, CADPS, MAPT, RUFY3, TP63, NRSN1, MAP1B, PCSK2, DPYSL5, GRM3, SLC6A1, ABAT, CACNA1C, CACNG2, PTPRO, CHRNA5, and CDH10.
In embodiments, the first gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene of Table 6. In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is EFNB3, SEMA3C, SRCIN1, SLIT2, CDH2, KREMEN1, KIAA1024, DSCAM, MAP2, PAK3, NGF, SEMA6D, STMN2, CAMK2B, ISLR2, SNAP25, MAGI2, NTRK3, AVIL, KALRN, PMP22, NRCAM, NEGR1, PLXNA4, EPHA7, SPOCK1, CX3CL1, ZNF804A, ULK2, SARM1, PLXNA3, ENC1, TRIM67, NTN1, ZNF365, GFI1, ADCYAP1, CNR1, ANKRD1, MAPT, RUFY3, PLXNA2, PLXNC1, MAP1B, PTPRO or FZD1.
In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is selected from the group consisting of EFNB3, SEMA3C, SRCIN1, SLIT2, CDH2, KREMEN1, KIAA1024, DSCAM, MAP2, PAK3, NGF, SEMA6D, STMN2, CAMK2B, ISLR2, SNAP25, MAGI2, NTRK3, AVIL, KALRN, PMP22, NRCAM, NEGR1, PLXNA4, EPHA7, SPOCK1, CX3CL1, ZNF804A, ULK2, SARM1, PLXNA3, ENC1, TRIM67, NTN1, ZNF365, GFI1, ADCYAP1, CNR1, ANKRD1, MAPT, RUFY3, PLXNA2, PLXNC1, MAP1B, PTPRO and FZD1.
In embodiments, the first gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene of Table 7. In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is GPM6A, DRD2, BMP7, EFNB3, SEMA3C, FSCN2, LGI1, SRCIN1, WNT4, SLIT2, NAV3, NRG1, TTBK1, RNF165, PRDM16, CDH2, ELAVL4, ONECUT2, KREMEN1, SCRT1, KIAA1024, DSCAM, MAP2, PRDM8, FAT4, PAK3, NGF, SEMA6D, STMN2, ZFHX3, LRP2, APOA1, CAMK2B, MDGA1, ISLR2, SNAP25, NEUROD4, PHOX2B, DCX, MAGI2, PIK3R1, NCAM1, NTRK3, PITX3, MYT1L, AVIL, CDK5R2, INSM1, SOX21, IL6ST, KIF5C, SYNJ1, KALRN, GFRA1, TCTN1, CELSR1, IRX5, PMP22, SOX6, RUNX1, DPYSL4, NRCAM, ZNF521, MDGA2, PROX1, FGF5, ZNF536, MAP1A, DCHS1, NEGR1, PLXNA4, EPB41L3, GAP43, EPHA7, DLL3, VSTM2L, ID4, NRN1, SPOCK1, DUSP10, COL3A1, CX3CL1, SLIT3, MAPK8IP2, FAIM2, TCF12, BMP6, NRBP2, NCAM2, HIPK2, CDH11, ADGRL3, ZNF804A, ULK2, CCKAR, SARM1, PLXNA3, ENC1, ASCL1, UNCX, MEIS1, ARX, SRRM4, TRIM67, ALCAM, NTN1, ZNF365, GFI1, ADCYAP1, CNR1, ANKRD1, ALK, STMN4, MAPT, RUFY3, PLXNA2, PLXNC1, MAP1B, DPYSL5, PTPRO, FZD1 or DLX5.
In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) increased gene is selected from the group consisting of GPM6A, DRD2, BMP7, EFNB3, SEMA3C, FSCN2, LGI1, SRCIN1, WNT4, SLIT2, NAV3, NRG1, TTBK1, RNF165, PRDM16, CDH2, ELAVL4, ONECUT2, KREMEN1, SCRT1, KIAA1024, DSCAM, MAP2, PRDM8, FAT4, PAK3, NGF, SEMA6D, STMN2, ZFHX3, LRP2, APOA1, CAMK2B, MDGA1, ISLR2, SNAP25, NEUROD4, PHOX2B, DCX, MAGI2, PIK3R1, NCAM1, NTRK3, PITX3, MYT1L, AVIL, CDK5R2, INSM1, SOX21, IL6ST, KIF5C, SYNJ1, KALRN, GFRA1, TCTN1, CELSR1, IRX5, PMP22, SOX6, RUNX1, DPYSL4, NRCAM, ZNF521, MDGA2, PROX1, FGF5, ZNF536, MAP1A, DCHS1, NEGR1, PLXNA4, EPB41L3, GAP43, EPHA7, DLL3, VSTM2L, ID4, NRN1, SPOCK1, DUSP10, COL3A1, CX3CL1, SLIT3, MAPK8IP2, FAIM2, TCF12, BMP6, NRBP2, NCAM2, HIPK2, CDH11, ADGRL3, ZNF804A, ULK2, CCKAR, SARM1, PLXNA3, ENC1, ASCL1, UNCX, MEIS1, ARX, SRRM4, TRIM67, ALCAM, NTN1, ZNF365, GFI1, ADCYAP1, CNR1, ANKRD1, ALK, STMN4, MAPT, RUFY3, PLXNA2, PLXNC1, MAP1B, DPYSL5, PTPRO, FZD1 and DLX5.
In embodiments, the at least one increased gene is selected from the group consisting of: CAPN14, FAT3, FAT4, PCDHGC4, SLC8A1, SLIT2, CEMIP2, CDHR3, CDH2, DRD2, EPHB2, MAGI2, PCDHB11, PCDHB13, PCDHB14, PCDHB16, PCDHB2, ADGRG6, ELF5, EPHA7, FOXP1, GDF7, HOXA1, MINAR1, MSX1, NRBP2, NRIP1, PITX3, POU6F2, PTPRO, SLC35D1, TCF12, ZFHX3 and ZNF703. In embodiments, the at least one increased gene is CAPN14, FAT3, FAT4, PCDHGC4, SLC8A1, SLIT2, CEMIP2, CDHR3, CDH2, DRD2, EPHB2, MAGI2, PCDHB11, PCDHB13, PCDHB14, PCDHB16, PCDHB2, ADGRG6, ELF5, EPHA7, FOXP1, GDF7, HOXA1, MINAR1, MSX1, NRBP2, NRIP1, PITX3, POU6F2, PTPRO, SLC35D1, TCF12, ZFHX3 or ZNF703.
In embodiments, the increased expression levels are at least 4 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 4 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 5 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 5 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 6 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 6 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 7 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 7 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 8 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 8 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 9 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 9 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 10 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 10 times higher relative to a pluripotent stem cell.
In embodiments, the increased expression levels are at least 11 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 11 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 12 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 12 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 13 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 13 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 14 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 14 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 15 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 15 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 16 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 16 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 17 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 17 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 18 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 18 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 19 times higher relative to a pluripotent stem cell. 
In embodiments, the increased expression levels are about 19 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are at least 20 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 20 times higher relative to a pluripotent stem cell.
In embodiments, the increased expression levels are about 4-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 6-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 6-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 8-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 8-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 10-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 10-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 20-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 20-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 30-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 30-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 40-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 40-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 50-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 50-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 60-100 times higher relative to a pluripotent stem cell. 
In embodiments, the increased expression levels are 60-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 70-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 70-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 80-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 80-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 90-100 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 90-100 times higher relative to a pluripotent stem cell.
In embodiments, the increased expression levels are about 4-90 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-90 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 4-80 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-80 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 4-70 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-70 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 4-60 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-60 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 4-50 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-50 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 4-40 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-40 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 4-30 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-30 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 4-20 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-20 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 4-10 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-10 times higher relative to a pluripotent stem cell. 
In embodiments, the increased expression levels are about 4-8 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-8 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are about 4-6 times higher relative to a pluripotent stem cell. In embodiments, the increased expression levels are 4-6 times higher relative to a pluripotent stem cell.
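By way of non-limiting illustration, the fold-increase criteria recited above can be sketched in Python as follows. The sketch assumes that expression levels are supplied on a linear (not log-transformed) scale and are keyed by gene symbol. The function names, the parameters min_fold and max_fold, and the numeric expression values are hypothetical and are introduced here only for illustration; they are not part of the described method.

```python
from typing import Iterable, List, Mapping, Optional


def fold_change(test_level: float, reference_level: float) -> float:
    """Ratio of test expression to reference expression (linear scale)."""
    if reference_level <= 0:
        raise ValueError("reference expression level must be positive")
    return test_level / reference_level


def increased_genes(
    test_profile: Mapping[str, float],
    psc_profile: Mapping[str, float],
    gene_set: Iterable[str],
    min_fold: float = 4.0,             # e.g. "at least 4 times higher"
    max_fold: Optional[float] = None,  # e.g. 100.0 for a "4-100 times" range
) -> List[str]:
    """Return genes of gene_set whose fold increase lies in [min_fold, max_fold]."""
    hits = []
    for gene in gene_set:
        # Skip genes that were not measured in both profiles.
        if gene not in test_profile or gene not in psc_profile:
            continue
        # Skip genes with no usable pluripotent stem cell reference level.
        if psc_profile[gene] <= 0:
            continue
        fc = fold_change(test_profile[gene], psc_profile[gene])
        if fc >= min_fold and (max_fold is None or fc <= max_fold):
            hits.append(gene)
    return sorted(hits)


# Hypothetical expression values, for illustration only.
test_profile = {"DRD2": 40.0, "PITX3": 9.0, "MAP2": 500.0}
psc_profile = {"DRD2": 5.0, "PITX3": 3.0, "MAP2": 2.0}
first_gene_set = ["DRD2", "PITX3", "MAP2"]

print(increased_genes(test_profile, psc_profile, first_gene_set))
print(increased_genes(test_profile, psc_profile, first_gene_set, max_fold=100.0))
```

Under this sketch, a first gene set satisfies an "at least one increased gene" criterion when the returned list is non-empty. Leaving max_fold unset corresponds to the open-ended "at least N times higher" embodiments, and setting it corresponds to the bounded ranges such as 4-100 times.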
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes decreased gene expression levels relative to a pluripotent stem cell for a second gene set, wherein the second gene set includes at least one decreased gene within one or more second gene ontologies of Table 8.
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes decreased gene expression levels relative to a pluripotent stem cell for a second gene set, wherein the second gene set includes at least one decreased gene within one or more second gene ontologies selected from the group consisting of gene ontologies of Table 8.
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes decreased gene expression levels relative to a pluripotent stem cell for a second gene set, wherein the second gene set includes at least one decreased gene within one or more second gene ontologies of GO:0044459, GO:0071944, GO:0005886, GO:0005904, GO:0031226, GO:0005887, GO:0042127, GO:0005576, GO:0044421, GO:0070887, GO:0034097, GO:0050896, GO:0051869, GO:0071345, GO:0048856, GO:0010033, GO:0044425, GO:0007166, GO:0032501, GO:0044707, GO:0050874, GO:0023052, GO:0023046, GO:0044700, GO:0031982, GO:0031988, GO:0032502, GO:0044767, GO:0007154, GO:0071310, GO:0005615, GO:0042221, GO:0031224, GO:0051049, GO:0019221, GO:0048583, GO:0008284, GO:0007275, GO:0023051, GO:0010646, GO:0048584, GO:0051239, GO:0032879, GO:0006954, GO:0007165, GO:0023033, GO:0043230, GO:0098771, GO:0055065, GO:0016021, GO:1903561, GO:0009966, GO:0035466, GO:0050801, GO:0010647, GO:0006811, GO:0065008, GO:0051240, GO:0098590, GO:0055082, GO:0055080, GO:0023056, GO:0006875, GO:0070062, GO:0051716, GO:0048878, GO:0043269, GO:0065009, GO:0051050, GO:0050865, GO:0098857, GO:0006873, GO:0048518, GO:0043119, GO:0030003, GO:0048731, GO:0042592, GO:0045121, GO:0006952, GO:0002217, GO:0042829, GO:0048522, GO:0051242, GO:0046903, GO:0005102, GO:0030154, GO:0019725, GO:0001775, GO:0009967, GO:0035468, GO:0002376, GO:0072503, GO:0045321, GO:0050863, GO:0050878, GO:0048869, GO:0002703, GO:0050670, GO:0022407, GO:0032944, GO:0016020, GO:1902533, GO:0010740, GO:0043270, GO:0045785, GO:0072507, GO:0009888, GO:0022409, GO:0042493, GO:0017035, GO:0002682, GO:0006874, GO:0032101, GO:0070663, GO:0007204, GO:1902531, GO:0010627, GO:1903039, GO:1903037, GO:0002694, GO:0031012, GO:0009605, GO:0044281, GO:2000021, GO:0055074, GO:0035296, GO:0097746, GO:0042312, GO:0044093, GO:0002685, GO:0098589, GO:0051480, GO:0003013, GO:0008015, GO:0070261, GO:1901700, GO:0007187, GO:0030155, GO:0003006, 
GO:0034220, GO:0050870, GO:0009611, GO:0002245, GO:0008217, GO:1903524, GO:0042129, GO:0033993, GO:0050880, GO:0007188, GO:0051704, GO:0051706, GO:0035150, GO:0030198, GO:0032103, GO:0043062, GO:0050867, GO:0040017, GO:0002687, GO:0022857, GO:0005386, GO:0015563, GO:0015646, GO:0022891, GO:0022892, GO:0048608, GO:0015267, GO:0015249, GO:0015268, GO:0002274, GO:0001890, GO:0048513, GO:0022803, GO:0022814, GO:0002684, GO:0050776, GO:0002819, GO:0045937, GO:0010562, GO:0002366, GO:0061458, GO:0051094, GO:0034762, GO:2000147, GO:0030141, GO:0002263, GO:0006955, GO:0015075, GO:0099503, GO:0000003, GO:0019952, GO:0050876, GO:0098772, GO:0002252, GO:0009653, GO:0050900, GO:1901701, GO:0042802, GO:0043085, GO:0048554, GO:0030335, GO:0005215, GO:0005478, GO:0022414, GO:0044702, GO:0051241, GO:0002696, GO:0046873, GO:0042060, GO:0003018, GO:0032940, GO:0031410, GO:0016023, GO:0002822, GO:0046394, GO:0051272, GO:0097708, GO:0009986, GO:0009928, GO:0009929, GO:0016053, GO:0051928, GO:0042327, GO:0031225, GO:0010469, GO:0009987, GO:0008151, GO:0044763, GO:0050875, GO:0006950, GO:0043207, GO:0002886, GO:0051249, GO:0098655, GO:0005575, GO:0008372, GO:0002697, GO:0019935, GO:0007267, GO:0032496, GO:0070160, GO:0005216, GO:0034765, GO:0006820, GO:0006822, GO:0005911, GO:0019933, GO:0004252, GO:0048545, GO:0051924, GO:0006812, GO:0006819, GO:0015674, GO:0019932, GO:0051707, GO:0009613, GO:0042828, GO:0001934, GO:0022838, GO:1902105, GO:0006636, GO:0071624, GO:0055085, GO:0010959, GO:0005923, GO:0030001, GO:0002237, GO:0009607, GO:0002699, GO:0005261, GO:0015281, GO:0015338, GO:1903522, GO:0043408, GO:0008324, GO:0015711, GO:0071622, GO:0070665, GO:0002683, GO:0010543, GO:0050730, GO:0007189, GO:0010579, GO:0010580, GO:0016338, GO:0050671, GO:0015318, GO:0050777, GO:0050793, GO:0030054, GO:0022610, GO:0032946, GO:0043300, GO:0042102, GO:0001817, GO:0002275, GO:0032844, GO:0060429, GO:0001653, GO:0031347, GO:0048646, GO:0042981, GO:0051345, GO:0002690, GO:0043302, GO:0098660, 
GO:0009719, GO:0048018, GO:0071884, GO:0009116, GO:0043168, GO:0002444, GO:0043296, GO:0065007, GO:0098662, GO:0043299, GO:0030193, GO:0042119, GO:0050921, GO:0002688, GO:0043410, GO:0022836, GO:0090022, GO:0002888, GO:0002821, GO:1900046, GO:0042509, GO:0042510, GO:0042513, GO:0042516, GO:0042519, GO:0042522, GO:0042525, GO:0042528, GO:0035295, GO:0043235, GO:0022839, GO:0090023, GO:0043065, GO:0046718, GO:0019063, GO:0043067, GO:0043070, GO:0030545, GO:0001816, GO:0003382, GO:0044409, GO:0051806, GO:0030260, GO:0051828, GO:0036230, GO:0010941, GO:0009725, GO:0002476, GO:0002526, GO:0051384, GO:0050790, GO:0048552, GO:0051247, GO:0008285, GO:0097755, GO:0045909, GO:0031960, GO:0070374, GO:0002824, GO:0030728, GO:0007155, GO:0098602, GO:0035556, GO:0007242, GO:0007243, GO:0023013, GO:0023034, GO:0010942, GO:0070372, GO:0051046, GO:0043068, GO:0043071, GO:1902107, GO:0002283, GO:0005509, GO:0050818, GO:0051336, GO:0009119, GO:0003073, GO:0036018, GO:0046635, GO:2000026, GO:0006082, GO:0001819, GO:0004175, GO:0016809, GO:0050764, GO:0043436, GO:0005201, GO:0097028, GO:0008528, GO:0045055, GO:0016477, GO:0030168, GO:0035239, GO:0070820, GO:0031349, GO:0001932, GO:0098797, GO:0045137, GO:0043312, GO:0002446, GO:0052547, GO:0048585, GO:0009070, GO:0009113, GO:0034764, GO:0022600, GO:0016323, GO:0045597, GO:0042803, GO:0016324, GO:0045177, GO:0008406, GO:0006887, GO:0016194, GO:0016195, GO:0008236, GO:0072358, GO:0001944, GO:0002521, GO:1902624, GO:0044283, GO:0048519, GO:0043118, GO:0045684, GO:0006690, GO:0010522, GO:0022890, GO:0015082, GO:0019752, GO:0071396, GO:0001525, GO:0050731, GO:0036017, GO:0042609, GO:0050817, GO:0070252, GO:0060670, GO:0019369, GO:0019229, GO:0009164, GO:0017171, GO:0045907, GO:0008289, GO:1902622, GO:0050920, GO:0051047, GO:0046649, GO:0032270, GO:0009991, GO:0033628, GO:0004715, GO:0045776, GO:0042454, GO:0005515, GO:0001948, GO:0045308, GO:0002706, GO:1903530, GO:1901657, GO:0030322, GO:0042270, GO:0045088, GO:0046717, GO:0016661, 
GO:0008584, GO:0002428, GO:1901568, GO:0042325, GO:0044433, GO:0044057, GO:0031638, GO:0006953, GO:0050729, GO:0046546, GO:0042531, GO:0042511, GO:0042515, GO:0042517, GO:0042520, GO:0042523, GO:0042526, GO:0042529, GO:0046850, GO:0005178, GO:0048514, GO:0045682, GO:0003674, GO:0005554, GO:0046634, GO:0061041, GO:0008016, GO:0043407, GO:0046456, GO:0007596, GO:0045606, GO:0014070, GO:0048870, GO:0051674, GO:0002704, GO:0007584, GO:0070228, GO:0002675, GO:0052548, GO:0001664, GO:0090330, GO:0045117, GO:0034340, GO:0044853, GO:0032587, GO:0007586, GO:0097529, GO:0045595, GO:0040012, GO:0050866, GO:0010035, GO:0034767, GO:0098801, GO:0015079, GO:0015388, GO:0022817, GO:0044706, GO:1901605, GO:0009636, GO:0007599, GO:0002705, GO:2000145, GO:0034103, GO:0032642, GO:0098805, GO:0051209, GO:1901137, GO:0090066, GO:0098641, GO:0032409, GO:0007589, GO:0046128, GO:0061134, GO:0015893, GO:0001726, GO:0001893, GO:0030334, GO:0042398 or any combination thereof.
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes decreased gene expression levels relative to a pluripotent stem cell for a second gene set, wherein the second gene set includes at least one decreased gene within one or more second gene ontologies selected from the group consisting of GO:0044459, GO:0071944, GO:0005886, GO:0005904, GO:0031226, GO:0005887, GO:0042127, GO:0005576, GO:0044421, GO:0070887, GO:0034097, GO:0050896, GO:0051869, GO:0071345, GO:0048856, GO:0010033, GO:0044425, GO:0007166, GO:0032501, GO:0044707, GO:0050874, GO:0023052, GO:0023046, GO:0044700, GO:0031982, GO:0031988, GO:0032502, GO:0044767, GO:0007154, GO:0071310, GO:0005615, GO:0042221, GO:0031224, GO:0051049, GO:0019221, GO:0048583, GO:0008284, GO:0007275, GO:0023051, GO:0010646, GO:0048584, GO:0051239, GO:0032879, GO:0006954, GO:0007165, GO:0023033, GO:0043230, GO:0098771, GO:0055065, GO:0016021, GO:1903561, GO:0009966, GO:0035466, GO:0050801, GO:0010647, GO:0006811, GO:0065008, GO:0051240, GO:0098590, GO:0055082, GO:0055080, GO:0023056, GO:0006875, GO:0070062, GO:0051716, GO:0048878, GO:0043269, GO:0065009, GO:0051050, GO:0050865, GO:0098857, GO:0006873, GO:0048518, GO:0043119, GO:0030003, GO:0048731, GO:0042592, GO:0045121, GO:0006952, GO:0002217, GO:0042829, GO:0048522, GO:0051242, GO:0046903, GO:0005102, GO:0030154, GO:0019725, GO:0001775, GO:0009967, GO:0035468, GO:0002376, GO:0072503, GO:0045321, GO:0050863, GO:0050878, GO:0048869, GO:0002703, GO:0050670, GO:0022407, GO:0032944, GO:0016020, GO:1902533, GO:0010740, GO:0043270, GO:0045785, GO:0072507, GO:0009888, GO:0022409, GO:0042493, GO:0017035, GO:0002682, GO:0006874, GO:0032101, GO:0070663, GO:0007204, GO:1902531, GO:0010627, GO:1903039, GO:1903037, GO:0002694, GO:0031012, GO:0009605, GO:0044281, GO:2000021, GO:0055074, GO:0035296, GO:0097746, GO:0042312, GO:0044093, GO:0002685, GO:0098589, GO:0051480, GO:0003013, GO:0008015, GO:0070261, GO:1901700, 
GO:0007187, GO:0030155, GO:0003006, GO:0034220, GO:0050870, GO:0009611, GO:0002245, GO:0008217, GO:1903524, GO:0042129, GO:0033993, GO:0050880, GO:0007188, GO:0051704, GO:0051706, GO:0035150, GO:0030198, GO:0032103, GO:0043062, GO:0050867, GO:0040017, GO:0002687, GO:0022857, GO:0005386, GO:0015563, GO:0015646, GO:0022891, GO:0022892, GO:0048608, GO:0015267, GO:0015249, GO:0015268, GO:0002274, GO:0001890, GO:0048513, GO:0022803, GO:0022814, GO:0002684, GO:0050776, GO:0002819, GO:0045937, GO:0010562, GO:0002366, GO:0061458, GO:0051094, GO:0034762, GO:2000147, GO:0030141, GO:0002263, GO:0006955, GO:0015075, GO:0099503, GO:0000003, GO:0019952, GO:0050876, GO:0098772, GO:0002252, GO:0009653, GO:0050900, GO:1901701, GO:0042802, GO:0043085, GO:0048554, GO:0030335, GO:0005215, GO:0005478, GO:0022414, GO:0044702, GO:0051241, GO:0002696, GO:0046873, GO:0042060, GO:0003018, GO:0032940, GO:0031410, GO:0016023, GO:0002822, GO:0046394, GO:0051272, GO:0097708, GO:0009986, GO:0009928, GO:0009929, GO:0016053, GO:0051928, GO:0042327, GO:0031225, GO:0010469, GO:0009987, GO:0008151, GO:0044763, GO:0050875, GO:0006950, GO:0043207, GO:0002886, GO:0051249, GO:0098655, GO:0005575, GO:0008372, GO:0002697, GO:0019935, GO:0007267, GO:0032496, GO:0070160, GO:0005216, GO:0034765, GO:0006820, GO:0006822, GO:0005911, GO:0019933, GO:0004252, GO:0048545, GO:0051924, GO:0006812, GO:0006819, GO:0015674, GO:0019932, GO:0051707, GO:0009613, GO:0042828, GO:0001934, GO:0022838, GO:1902105, GO:0006636, GO:0071624, GO:0055085, GO:0010959, GO:0005923, GO:0030001, GO:0002237, GO:0009607, GO:0002699, GO:0005261, GO:0015281, GO:0015338, GO:1903522, GO:0043408, GO:0008324, GO:0015711, GO:0071622, GO:0070665, GO:0002683, GO:0010543, GO:0050730, GO:0007189, GO:0010579, GO:0010580, GO:0016338, GO:0050671, GO:0015318, GO:0050777, GO:0050793, GO:0030054, GO:0022610, GO:0032946, GO:0043300, GO:0042102, GO:0001817, GO:0002275, GO:0032844, GO:0060429, GO:0001653, GO:0031347, GO:0048646, GO:0042981, GO:0051345, 
GO:0002690, GO:0043302, GO:0098660, GO:0009719, GO:0048018, GO:0071884, GO:0009116, GO:0043168, GO:0002444, GO:0043296, GO:0065007, GO:0098662, GO:0043299, GO:0030193, GO:0042119, GO:0050921, GO:0002688, GO:0043410, GO:0022836, GO:0090022, GO:0002888, GO:0002821, GO:1900046, GO:0042509, GO:0042510, GO:0042513, GO:0042516, GO:0042519, GO:0042522, GO:0042525, GO:0042528, GO:0035295, GO:0043235, GO:0022839, GO:0090023, GO:0043065, GO:0046718, GO:0019063, GO:0043067, GO:0043070, GO:0030545, GO:0001816, GO:0003382, GO:0044409, GO:0051806, GO:0030260, GO:0051828, GO:0036230, GO:0010941, GO:0009725, GO:0002476, GO:0002526, GO:0051384, GO:0050790, GO:0048552, GO:0051247, GO:0008285, GO:0097755, GO:0045909, GO:0031960, GO:0070374, GO:0002824, GO:0030728, GO:0007155, GO:0098602, GO:0035556, GO:0007242, GO:0007243, GO:0023013, GO:0023034, GO:0010942, GO:0070372, GO:0051046, GO:0043068, GO:0043071, GO:1902107, GO:0002283, GO:0005509, GO:0050818, GO:0051336, GO:0009119, GO:0003073, GO:0036018, GO:0046635, GO:2000026, GO:0006082, GO:0001819, GO:0004175, GO:0016809, GO:0050764, GO:0043436, GO:0005201, GO:0097028, GO:0008528, GO:0045055, GO:0016477, GO:0030168, GO:0035239, GO:0070820, GO:0031349, GO:0001932, GO:0098797, GO:0045137, GO:0043312, GO:0002446, GO:0052547, GO:0048585, GO:0009070, GO:0009113, GO:0034764, GO:0022600, GO:0016323, GO:0045597, GO:0042803, GO:0016324, GO:0045177, GO:0008406, GO:0006887, GO:0016194, GO:0016195, GO:0008236, GO:0072358, GO:0001944, GO:0002521, GO:1902624, GO:0044283, GO:0048519, GO:0043118, GO:0045684, GO:0006690, GO:0010522, GO:0022890, GO:0015082, GO:0019752, GO:0071396, GO:0001525, GO:0050731, GO:0036017, GO:0042609, GO:0050817, GO:0070252, GO:0060670, GO:0019369, GO:0019229, GO:0009164, GO:0017171, GO:0045907, GO:0008289, GO:1902622, GO:0050920, GO:0051047, GO:0046649, GO:0032270, GO:0009991, GO:0033628, GO:0004715, GO:0045776, GO:0042454, GO:0005515, GO:0001948, GO:0045308, GO:0002706, GO:1903530, GO:1901657, GO:0030322, GO:0042270, 
GO:0045088, GO:0046717, GO:0016661, GO:0008584, GO:0002428, GO:1901568, GO:0042325, GO:0044433, GO:0044057, GO:0031638, GO:0006953, GO:0050729, GO:0046546, GO:0042531, GO:0042511, GO:0042515, GO:0042517, GO:0042520, GO:0042523, GO:0042526, GO:0042529, GO:0046850, GO:0005178, GO:0048514, GO:0045682, GO:0003674, GO:0005554, GO:0046634, GO:0061041, GO:0008016, GO:0043407, GO:0046456, GO:0007596, GO:0045606, GO:0014070, GO:0048870, GO:0051674, GO:0002704, GO:0007584, GO:0070228, GO:0002675, GO:0052548, GO:0001664, GO:0090330, GO:0045117, GO:0034340, GO:0044853, GO:0032587, GO:0007586, GO:0097529, GO:0045595, GO:0040012, GO:0050866, GO:0010035, GO:0034767, GO:0098801, GO:0015079, GO:0015388, GO:0022817, GO:0044706, GO:1901605, GO:0009636, GO:0007599, GO:0002705, GO:2000145, GO:0034103, GO:0032642, GO:0098805, GO:0051209, GO:1901137, GO:0090066, GO:0098641, GO:0032409, GO:0007589, GO:0046128, GO:0061134, GO:0015893, GO:0001726, GO:0001893, GO:0030334, GO:0042398 and any combination thereof.
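By way of non-limiting illustration, identifying decreased genes that fall within one or more of the second gene ontologies can be sketched in Python as follows. The sketch assumes a gene-to-ontology annotation mapping is available and that expression levels are on a linear scale. The function name, the max_ratio cutoff, the placeholder gene names (GENE_A, GENE_B, GENE_C) and their annotations are hypothetical; the passage above does not specify a magnitude of decrease, so the cutoff is exposed as a parameter.

```python
from typing import Iterable, List, Mapping, Set


def decreased_genes_in_ontologies(
    test_profile: Mapping[str, float],
    psc_profile: Mapping[str, float],
    gene_to_go: Mapping[str, Set[str]],
    ontology_ids: Iterable[str],
    max_ratio: float = 1.0,  # assumed cutoff: test/reference ratio below this counts as decreased
) -> List[str]:
    """Return genes annotated to any selected ontology whose expression is decreased."""
    selected = set(ontology_ids)
    hits = []
    for gene, terms in gene_to_go.items():
        # Keep only genes annotated to at least one selected ontology.
        if not selected & set(terms):
            continue
        # Skip genes that were not measured in both profiles.
        if gene not in test_profile or gene not in psc_profile:
            continue
        reference = psc_profile[gene]
        if reference <= 0:
            continue
        if test_profile[gene] / reference < max_ratio:
            hits.append(gene)
    return sorted(hits)


# Placeholder genes and annotations, for illustration only.
gene_to_go = {
    "GENE_A": {"GO:0005886"},
    "GENE_B": {"GO:0006954"},
    "GENE_C": {"GO:9999999"},  # not among the selected ontologies
}
test_profile = {"GENE_A": 1.0, "GENE_B": 10.0, "GENE_C": 0.5}
psc_profile = {"GENE_A": 8.0, "GENE_B": 5.0, "GENE_C": 4.0}

second_gene_set = decreased_genes_in_ontologies(
    test_profile, psc_profile, gene_to_go, ["GO:0005886", "GO:0006954"]
)
print(second_gene_set, len(second_gene_set))
```

The length of the returned list gives the number of decreased genes within the selected ontologies, which can then be compared against a desired count range.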
In embodiments, the second gene set includes about 1-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 2-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 3-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 4-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 5-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 6-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 7-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 8-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 9-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 10-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes about 15-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 20-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 25-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 30-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 35-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 40-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 45-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 50-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 55-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 60-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 65-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 70-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 75-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 80-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 85-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 90-1000 decreased genes within one or more of the second gene ontologies. 
In embodiments, the second gene set includes about 95-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 100-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes about 105-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 115-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 120-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 125-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 130-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 135-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 140-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 145-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 150-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 155-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 160-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 165-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 170-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 175-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 180-1000 decreased genes within one or more of the second gene ontologies. 
In embodiments, the second gene set includes about 185-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 190-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 195-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 200-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes about 205-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 215-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 220-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 225-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 230-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 235-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 240-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 245-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 250-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 255-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 260-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 265-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 270-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 275-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 280-1000 decreased genes within one or more of the second gene ontologies. 
In embodiments, the second gene set includes about 285-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 290-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 295-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 300-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes about 305-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 315-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 320-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 325-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 330-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 335-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 340-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 345-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 350-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 355-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 360-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 365-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 370-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 375-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 380-1000 decreased genes within one or more of the second gene ontologies. 
In embodiments, the second gene set includes about 385-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 390-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 395-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 400-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes about 405-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 415-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 420-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 425-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 430-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 435-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 440-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 445-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 450-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 455-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 460-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 465-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 470-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 475-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 480-1000 decreased genes within one or more of the second gene ontologies. 
In embodiments, the second gene set includes about 485-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 490-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 495-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes about 500-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 505-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 510-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 515-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 520-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 525-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 530-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 535-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 540-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 545-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 550-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 555-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 565-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 570-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 575-1000 decreased genes within one or more of the second gene ontologies. 
In embodiments, the second gene set includes about 580-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 585-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 590-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 595-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 600-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes about 605-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 615-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 620-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 625-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 630-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 635-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 640-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 645-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 650-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 655-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 660-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 665-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 670-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 675-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 680-1000 decreased genes within one or more of the second gene ontologies. 
In embodiments, the second gene set includes about 685-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 690-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 695-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 700-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes about 705-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 715-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 720-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 725-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 730-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 735-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 740-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 745-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 750-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 755-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 760-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 765-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 770-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 775-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 780-1000 decreased genes within one or more of the second gene ontologies. 
In embodiments, the second gene set includes about 785-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 790-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 795-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 800-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes about 805-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 815-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 820-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 825-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 830-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 835-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 840-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 845-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 850-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 855-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 860-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 865-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 870-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 875-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 880-1000 decreased genes within one or more of the second gene ontologies. 
In embodiments, the second gene set includes about 885-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 890-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 895-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 900-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes about 905-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 915-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 920-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 925-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 930-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 935-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 940-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 945-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 950-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 955-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 960-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 965-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 970-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 975-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 980-1000 decreased genes within one or more of the second gene ontologies. 
In embodiments, the second gene set includes about 985-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 990-1000 decreased genes within one or more of the second gene ontologies. In embodiments, the second gene set includes about 995-1000 decreased genes within one or more of the second gene ontologies.
In embodiments, the second gene set includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 
414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 
815, 816, 817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, or 1000 decreased genes within one or more of the second gene ontologies.
The gene expression profile information for the desirable determined dopaminergic precursor cell may include decreased gene expression levels relative to a pluripotent stem cell for a second gene set, wherein the second gene set includes at least one decreased gene within one or more second gene ontologies of Table 8. “One or more,” as used herein in the context of the second gene ontologies, refers to at least one, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, etc. of the second gene ontologies.
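By way of non-limiting illustration, the following minimal Python sketch shows one way the counts referred to in these embodiments could be derived: the number of genes that are decreased relative to a pluripotent stem cell, and the number of second gene ontologies in which those genes fall. It is a sketch under stated assumptions, not the method of this disclosure. The function names, the gene names (`GENE_A` and so on), the expression values, the gene-to-ontology memberships, and the `min_log2_drop` threshold are all hypothetical, and expression is assumed to be on a log2 scale. Only the three GO identifiers are taken from the list above.

```python
from typing import Dict, Set, Tuple


def decreased_genes_by_ontology(
    test_expr: Dict[str, float],
    psc_expr: Dict[str, float],
    ontology_genes: Dict[str, Set[str]],
    min_log2_drop: float = 1.0,  # hypothetical threshold, not from the source
) -> Dict[str, Set[str]]:
    """Map each GO term to its member genes that are decreased in the test
    cells relative to the pluripotent stem cell (PSC) reference."""
    # A gene counts as "decreased" when its log2 level in the test cells is
    # at least min_log2_drop below its level in the PSC reference.
    decreased = {
        gene
        for gene, level in test_expr.items()
        if gene in psc_expr and psc_expr[gene] - level >= min_log2_drop
    }
    hits: Dict[str, Set[str]] = {}
    for term, genes in ontology_genes.items():
        found = genes & decreased
        if found:  # keep only ontologies containing at least one decreased gene
            hits[term] = found
    return hits


def second_gene_set_counts(hits: Dict[str, Set[str]]) -> Tuple[int, int]:
    """Return (number of distinct decreased genes, number of ontologies hit)."""
    genes = set().union(*hits.values())
    return len(genes), len(hits)


# Toy log2 expression values (invented for illustration only).
test_expr = {"GENE_A": 2.0, "GENE_B": 5.0, "GENE_C": 1.0, "GENE_D": 7.5}
psc_expr = {"GENE_A": 6.0, "GENE_B": 5.2, "GENE_C": 4.5, "GENE_D": 7.0}

# Hypothetical gene memberships for three GO terms from the list above.
ontology_genes = {
    "GO:0007155": {"GENE_A", "GENE_B"},
    "GO:0001525": {"GENE_A", "GENE_C"},
    "GO:0016477": {"GENE_D"},
}

hits = decreased_genes_by_ontology(test_expr, psc_expr, ontology_genes)
n_genes, n_ontologies = second_gene_set_counts(hits)
print(n_genes, n_ontologies)
```

In this toy input, `GENE_A` and `GENE_C` are decreased and fall within two of the three ontologies, so the resulting pair of counts could then be compared against ranges such as those recited in the embodiments.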
In embodiments, the second gene set includes about 1-500 decreased genes within 1-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 50-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 100-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 150-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 200-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 250-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 300-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 350-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 400-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 450-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 500-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 550-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 600-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 650-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 700-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 750-1000 of the second gene ontologies. 
In embodiments, the second gene set includes about 1-500 decreased genes within 800-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 850-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 900-1000 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 950-1000 of the second gene ontologies.
In embodiments, the second gene set includes about 1-500 decreased genes within 1-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 10-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 20-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 30-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 40-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 50-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 60-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 70-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 80-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 90-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 100-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 110-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 120-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 130-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 140-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 150-300 of the second gene ontologies. 
In embodiments, the second gene set includes about 1-500 decreased genes within 160-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 170-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 180-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 190-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 200-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 210-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 220-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 230-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 240-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 250-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 260-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 270-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 280-300 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 290-300 of the second gene ontologies.
In embodiments, the second gene set includes about 1-500 decreased genes within 1-290 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-280 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-270 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-260 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-250 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-240 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-230 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-220 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-210 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-200 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-190 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-180 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-170 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-160 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-150 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-140 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-130 of the second gene ontologies. 
In embodiments, the second gene set includes about 1-500 decreased genes within 1-120 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-110 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-100 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-90 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-80 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-70 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-60 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-50 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-40 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-30 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-20 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-10 of the second gene ontologies. In embodiments, the second gene set includes about 1-500 decreased genes within 1-5 of the second gene ontologies.
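The ranges recited in the preceding embodiments can be checked programmatically. The following is a minimal sketch under stated assumptions: the function name and default bounds are hypothetical, the bounds are treated as inclusive, and the tolerance implied by the word "about" is not modeled.

```python
def within_ranges(n_decreased_genes, n_ontologies,
                  gene_range=(1, 500), ontology_range=(1, 300)):
    """Return True when both counts fall inside the given inclusive ranges.
    The default bounds mirror one recited embodiment and are illustrative only."""
    gene_lo, gene_hi = gene_range
    onto_lo, onto_hi = ontology_range
    genes_ok = gene_lo <= n_decreased_genes <= gene_hi
    ontologies_ok = onto_lo <= n_ontologies <= onto_hi
    return genes_ok and ontologies_ok


# Hypothetical counts for a test dataset.
print(within_ranges(42, 17))                             # default 1-500 genes, 1-300 ontologies
print(within_ranges(42, 17, ontology_range=(50, 1000)))  # narrower ontology range
```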
In embodiments, the second gene set includes at least one decreased gene within any integer number from 1 to 463, inclusive, of the second gene ontologies of Table 8 (i.e., within 1, 2, 3, 4, 5, and so on, up to 462 or 463 second gene ontologies).
In embodiments, the second gene set includes any integer number from 1 to 1000, inclusive, of decreased genes (i.e., 1, 2, 3, 4, 5, and so on, up to 999 or 1000 decreased genes) within any integer number from 1 to 463, inclusive, of the second gene ontologies of Table 8 (i.e., within 1, 2, 3, 4, 5, and so on, up to 462 or 463 second gene ontologies).
In embodiments, the second gene ontologies are any one of the gene ontologies listed in Table 8. In embodiments, the second gene ontologies are any one of GO:0044459, GO:0071944, GO:0005886, GO:0005904, GO:0031226, GO:0005887, GO:0042127, GO:0005576, GO:0044421, GO:0070887, GO:0034097, GO:0050896, GO:0051869, GO:0071345, GO:0048856, GO:0010033, GO:0044425, GO:0007166, GO:0032501, GO:0044707, GO:0050874, GO:0023052, GO:0023046, GO:0044700, GO:0031982, GO:0031988, GO:0032502, GO:0044767, GO:0007154, GO:0071310, GO:0005615, GO:0042221, GO:0031224, GO:0051049, GO:0019221, GO:0048583, GO:0008284, GO:0007275, GO:0023051, GO:0010646, GO:0048584, GO:0051239, GO:0032879, GO:0006954, GO:0007165, GO:0023033, GO:0043230, GO:0098771, GO:0055065, GO:0016021, GO:1903561, GO:0009966, GO:0035466, GO:0050801, GO:0010647, GO:0006811, GO:0065008, GO:0051240, GO:0098590, GO:0055082, GO:0055080, GO:0023056, GO:0006875, GO:0070062, GO:0051716, GO:0048878, GO:0043269, GO:0065009, GO:0051050, GO:0050865, GO:0098857, GO:0006873, GO:0048518, GO:0043119, GO:0030003, GO:0048731, GO:0042592, GO:0045121, GO:0006952, GO:0002217, GO:0042829, GO:0048522, GO:0051242, GO:0046903, GO:0005102, GO:0030154, GO:0019725, GO:0001775, GO:0009967, GO:0035468, GO:0002376, GO:0072503, GO:0045321, GO:0050863, GO:0050878, GO:0048869, GO:0002703, GO:0050670, GO:0022407, GO:0032944, GO:0016020, GO:1902533, GO:0010740, GO:0043270, GO:0045785, GO:0072507, GO:0009888, GO:0022409, GO:0042493, GO:0017035, GO:0002682, GO:0006874, GO:0032101, GO:0070663, GO:0007204, GO:1902531, GO:0010627, GO:1903039, GO:1903037, GO:0002694, GO:0031012, GO:0009605, GO:0044281, GO:2000021, GO:0055074, GO:0035296, GO:0097746, GO:0042312, GO:0044093, GO:0002685, GO:0098589, GO:0051480, GO:0003013, GO:0008015, GO:0070261, GO:1901700, GO:0007187, GO:0030155, GO:0003006, GO:0034220, GO:0050870, GO:0009611, GO:0002245, GO:0008217, GO:1903524, GO:0042129, GO:0033993, GO:0050880, GO:0007188, GO:0051704, GO:0051706, GO:0035150, GO:0030198, 
GO:0032103, GO:0043062, GO:0050867, GO:0040017, GO:0002687, GO:0022857, GO:0005386, GO:0015563, GO:0015646, GO:0022891, GO:0022892, GO:0048608, GO:0015267, GO:0015249, GO:0015268, GO:0002274, GO:0001890, GO:0048513, GO:0022803, GO:0022814, GO:0002684, GO:0050776, GO:0002819, GO:0045937, GO:0010562, GO:0002366, GO:0061458, GO:0051094, GO:0034762, GO:2000147, GO:0030141, GO:0002263, GO:0006955, GO:0015075, GO:0099503, GO:0000003, GO:0019952, GO:0050876, GO:0098772, GO:0002252, GO:0009653, GO:0050900, GO:1901701, GO:0042802, GO:0043085, GO:0048554, GO:0030335, GO:0005215, GO:0005478, GO:0022414, GO:0044702, GO:0051241, GO:0002696, GO:0046873, GO:0042060, GO:0003018, GO:0032940, GO:0031410, GO:0016023, GO:0002822, GO:0046394, GO:0051272, GO:0097708, GO:0009986, GO:0009928, GO:0009929, GO:0016053, GO:0051928, GO:0042327, GO:0031225, GO:0010469, GO:0009987, GO:0008151, GO:0044763, GO:0050875, GO:0006950, GO:0043207, GO:0002886, GO:0051249, GO:0098655, GO:0005575, GO:0008372, GO:0002697, GO:0019935, GO:0007267, GO:0032496, GO:0070160, GO:0005216, GO:0034765, GO:0006820, GO:0006822, GO:0005911, GO:0019933, GO:0004252, GO:0048545, GO:0051924, GO:0006812, GO:0006819, GO:0015674, GO:0019932, GO:0051707, GO:0009613, GO:0042828, GO:0001934, GO:0022838, GO:1902105, GO:0006636, GO:0071624, GO:0055085, GO:0010959, GO:0005923, GO:0030001, GO:0002237, GO:0009607, GO:0002699, GO:0005261, GO:0015281, GO:0015338, GO:1903522, GO:0043408, GO:0008324, GO:0015711, GO:0071622, GO:0070665, GO:0002683, GO:0010543, GO:0050730, GO:0007189, GO:0010579, GO:0010580, GO:0016338, GO:0050671, GO:0015318, GO:0050777, GO:0050793, GO:0030054, GO:0022610, GO:0032946, GO:0043300, GO:0042102, GO:0001817, GO:0002275, GO:0032844, GO:0060429, GO:0001653, GO:0031347, GO:0048646, GO:0042981, GO:0051345, GO:0002690, GO:0043302, GO:0098660, GO:0009719, GO:0048018, GO:0071884, GO:0009116, GO:0043168, GO:0002444, GO:0043296, GO:0065007, GO:0098662, GO:0043299, GO:0030193, GO:0042119, GO:0050921, GO:0002688, 
GO:0043410, GO:0022836, GO:0090022, GO:0002888, GO:0002821, GO:1900046, GO:0042509, GO:0042510, GO:0042513, GO:0042516, GO:0042519, GO:0042522, GO:0042525, GO:0042528, GO:0035295, GO:0043235, GO:0022839, GO:0090023, GO:0043065, GO:0046718, GO:0019063, GO:0043067, GO:0043070, GO:0030545, GO:0001816, GO:0003382, GO:0044409, GO:0051806, GO:0030260, GO:0051828, GO:0036230, GO:0010941, GO:0009725, GO:0002476, GO:0002526, GO:0051384, GO:0050790, GO:0048552, GO:0051247, GO:0008285, GO:0097755, GO:0045909, GO:0031960, GO:0070374, GO:0002824, GO:0030728, GO:0007155, GO:0098602, GO:0035556, GO:0007242, GO:0007243, GO:0023013, GO:0023034, GO:0010942, GO:0070372, GO:0051046, GO:0043068, GO:0043071, GO:1902107, GO:0002283, GO:0005509, GO:0050818, GO:0051336, GO:0009119, GO:0003073, GO:0036018, GO:0046635, GO:2000026, GO:0006082, GO:0001819, GO:0004175, GO:0016809, GO:0050764, GO:0043436, GO:0005201, GO:0097028, GO:0008528, GO:0045055, GO:0016477, GO:0030168, GO:0035239, GO:0070820, GO:0031349, GO:0001932, GO:0098797, GO:0045137, GO:0043312, GO:0002446, GO:0052547, GO:0048585, GO:0009070, GO:0009113, GO:0034764, GO:0022600, GO:0016323, GO:0045597, GO:0042803, GO:0016324, GO:0045177, GO:0008406, GO:0006887, GO:0016194, GO:0016195, GO:0008236, GO:0072358, GO:0001944, GO:0002521, GO:1902624, GO:0044283, GO:0048519, GO:0043118, GO:0045684, GO:0006690, GO:0010522, GO:0022890, GO:0015082, GO:0019752, GO:0071396, GO:0001525, GO:0050731, GO:0036017, GO:0042609, GO:0050817, GO:0070252, GO:0060670, GO:0019369, GO:0019229, GO:0009164, GO:0017171, GO:0045907, GO:0008289, GO:1902622, GO:0050920, GO:0051047, GO:0046649, GO:0032270, GO:0009991, GO:0033628, GO:0004715, GO:0045776, GO:0042454, GO:0005515, GO:0001948, GO:0045308, GO:0002706, GO:1903530, GO:1901657, GO:0030322, GO:0042270, GO:0045088, GO:0046717, GO:0016661, GO:0008584, GO:0002428, GO:1901568, GO:0042325, GO:0044433, GO:0044057, GO:0031638, GO:0006953, GO:0050729, GO:0046546, GO:0042531, GO:0042511, GO:0042515, GO:0042517, 
GO:0042520, GO:0042523, GO:0042526, GO:0042529, GO:0046850, GO:0005178, GO:0048514, GO:0045682, GO:0003674, GO:0005554, GO:0046634, GO:0061041, GO:0008016, GO:0043407, GO:0046456, GO:0007596, GO:0045606, GO:0014070, GO:0048870, GO:0051674, GO:0002704, GO:0007584, GO:0070228, GO:0002675, GO:0052548, GO:0001664, GO:0090330, GO:0045117, GO:0034340, GO:0044853, GO:0032587, GO:0007586, GO:0097529, GO:0045595, GO:0040012, GO:0050866, GO:0010035, GO:0034767, GO:0098801, GO:0015079, GO:0015388, GO:0022817, GO:0044706, GO:1901605, GO:0009636, GO:0007599, GO:0002705, GO:2000145, GO:0034103, GO:0032642, GO:0098805, GO:0051209, GO:1901137, GO:0090066, GO:0098641, GO:0032409, GO:0007589, GO:0046128, GO:0061134, GO:0015893, GO:0001726, GO:0001893, GO:0030334, GO:0042398 or any combination thereof.
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes decreased gene expression levels relative to a pluripotent stem cell for a second gene set, wherein the second gene set includes at least one decreased gene within one or more second gene ontologies selected from the group consisting of: GO:0070887, GO:0044459, and GO:0044281. In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes decreased gene expression levels relative to a pluripotent stem cell for a second gene set, wherein the second gene set includes at least one decreased gene within one or more second gene ontologies of: GO:0070887, GO:0044459, or GO:0044281. In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes decreased gene expression levels relative to a pluripotent stem cell for a second gene set, wherein the second gene set includes at least one decreased gene within one or more second gene ontologies selected from the group consisting of: GO:0042127, GO:0006954, GO:0032502, and any combination thereof. In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell includes decreased gene expression levels relative to a pluripotent stem cell for a second gene set, wherein the second gene set includes at least one decreased gene within one or more second gene ontologies of: GO:0042127, GO:0006954, GO:0032502, or any combination thereof.
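Because gene ontology identifiers take the form "GO:" followed by seven digits, identifiers supplied to a query can be normalized and validated before use. The following is a minimal sketch; the function name and the decision to reject malformed identifiers instead of guessing at a repair are assumptions made for illustration.

```python
import re

# Accepts "GO" with or without the colon, followed by digits only.
_GO_PATTERN = re.compile(r"^GO:?(\d+)$")


def normalize_go_id(raw):
    """Return the identifier in canonical 'GO:NNNNNNN' form, or raise ValueError
    when the input does not carry exactly seven digits."""
    match = _GO_PATTERN.match(raw.strip().upper())
    if match is None:
        raise ValueError(f"not a gene ontology identifier: {raw!r}")
    digits = match.group(1)
    if len(digits) != 7:
        raise ValueError(f"expected seven digits, got {len(digits)}: {raw!r}")
    return f"GO:{digits}"


print(normalize_go_id("GO0070887"))   # colon is added
print(normalize_go_id("GO:0044459"))  # already canonical
```

Rejecting an identifier with the wrong digit count, and not padding it, avoids silently mapping a mistyped entry onto a different ontology.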
In embodiments, the second gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) decreased gene of Table 9, Table 10, Table 11, or any combination thereof.
In embodiments, the second gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) decreased gene of Table 9. In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) decreased gene is DYSF, RASAL3, AKR1C3, CGREF1, SULT2B1, CAV2, IL12A, HMGA1, HHLA2, HMX2, CARD11, TSPO, IRF6, CEBPB, BCL11B, CASR, INPP5D, FGF21, NODAL, TNFRSF1B, HPSE, GRPR, TNMD, SPINT2, IER5, CAV1, JAML, SOX10, SFN, NPY5R, MYB, HMOX1, CDH5, HEY2, CLDN7, CXCR2, FGF2, APELA, FLT3LG, CD22, CDCA7L, NPM1, STYK1, SKOR2, LRRC32, HRG, CDH3, IL4R, TERT, ANG, RAB25, NRK, ADM, MARVELD3, DPP4, CD4, LTF, FGF4, ERBB3, IFITM1, P3H2, BAX, WNT11, CEBPA, AVPR1A, PTPRZ1, EIF5A, EPO, NPR1, NQO2, FGF16, EPHA1, CCL26, NR1D1, SYK, PTGES, TCIRG1, HCLS1, RAC2, NME2, TESC, HCK, FZD5, ETS1, APLN, TRIM71, ADA, MYC, GCNT2, SFRP1, FGFR4, EMX1, KDR, RARG, CD74, DRD3, PDPN, TRNP1, HPN, PLAU, TNFSF12, GAS6, SRPX, FGF19, PROK2, TSLP, SHMT2, PIM2, GHRHR, EBI3, ADORA1, NOS3, LIF, PINX1, TNFRSF8, FA2H, LECT1, CHRM1, NME1, SOX15, S100A11, NCCRP1, CD40, SERPINB3, RARRES3, LIN28A, TCL1A, ICOSLG, HYAL1, AIF1, LEP, EEF1E1, PRKCH, VIPR1, IL34, SH2B3, SPINT1, ESRP2, PYCARD, CLEC4G, MATK, EAF2, TACR1, EGFL7, CCNI2, GAL, FERMT1, SFRP5, PPP1R16B, MLXIPL, OVOL1, CD9, TNFSF9, KDF1, MST1R, IL23A, FLT1, FLT3, HLA-G, ADAMTS8, GUCY2C, MMP9, ALOX15B, VDR, SIX4, LGALS3, LAMC2, CCNE1, NPPC, CLC, APOE, MAP3K5, CCND1, XCL1, PTPN6, GLI1, TCL1B, PIM1, ARG2, LYN, NRARP, ELL3, TDGF1, FOSL1, CDCA7, NANOG, CCKBR, BNC1, PNP, TRIB1, HPGD, PRTN3, KIAA1462, HTR1A, BTK, FZD7, IFNLR1, JAK3, CD55, TFAP4, SLA, FBXO2, RBPMS2, OSMR, IL12RB2, EPCAM, IL6, IDO1, CHP2, PTAFR, CXCL1, SFRP2, PF4, CCDC88B, PRKCQ, CXCL5, TGFA, GJA1, FZD9, RPA3, TACSTD2, TNFRSF11A, CNN1, or PTGER2.
In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) decreased gene is selected from the group consisting of DYSF, RASAL3, AKR1C3, CGREF1, SULT2B1, CAV2, IL12A, HMGA1, HHLA2, HMX2, CARD11, TSPO, IRF6, CEBPB, BCL11B, CASR, INPP5D, FGF21, NODAL, TNFRSF1B, HPSE, GRPR, TNMD, SPINT2, IER5, CAV1, JAML, SOX10, SFN, NPY5R, MYB, HMOX1, CDH5, HEY2, CLDN7, CXCR2, FGF2, APELA, FLT3LG, CD22, CDCA7L, NPM1, STYK1, SKOR2, LRRC32, HRG, CDH3, IL4R, TERT, ANG, RAB25, NRK, ADM, MARVELD3, DPP4, CD4, LTF, FGF4, ERBB3, IFITM1, P3H2, BAX, WNT11, CEBPA, AVPR1A, PTPRZ1, EIF5A, EPO, NPR1, NQO2, FGF16, EPHA1, CCL26, NR1D1, SYK, PTGES, TCIRG1, HCLS1, RAC2, NME2, TESC, HCK, FZD5, ETS1, APLN, TRIM71, ADA, MYC, GCNT2, SFRP1, FGFR4, EMX1, KDR, RARG, CD74, DRD3, PDPN, TRNP1, HPN, PLAU, TNFSF12, GAS6, SRPX, FGF19, PROK2, TSLP, SHMT2, PIM2, GHRHR, EBI3, ADORA1, NOS3, LIF, PINX1, TNFRSF8, FA2H, LECT1, CHRM1, NME1, SOX15, S100A11, NCCRP1, CD40, SERPINB3, RARRES3, LIN28A, TCL1A, ICOSLG, HYAL1, AIF1, LEP, EEF1E1, PRKCH, VIPR1, IL34, SH2B3, SPINT1, ESRP2, PYCARD, CLEC4G, MATK, EAF2, TACR1, EGFL7, CCNI2, GAL, FERMT1, SFRP5, PPP1R16B, MLXIPL, OVOL1, CD9, TNFSF9, KDF1, MST1R, IL23A, FLT1, FLT3, HLA-G, ADAMTS8, GUCY2C, MMP9, ALOX15B, VDR, SIX4, LGALS3, LAMC2, CCNE1, NPPC, CLC, APOE, MAP3K5, CCND1, XCL1, PTPN6, GLI1, TCL1B, PIM1, ARG2, LYN, NRARP, ELL3, TDGF1, FOSL1, CDCA7, NANOG, CCKBR, BNC1, PNP, TRIB1, HPGD, PRTN3, KIAA1462, HTR1A, BTK, FZD7, IFNLR1, JAK3, CD55, TFAP4, SLA, FBXO2, RBPMS2, OSMR, IL12RB2, EPCAM, IL6, IDO1, CHP2, PTAFR, CXCL1, SFRP2, PF4, CCDC88B, PRKCQ, CXCL5, TGFA, GJA1, FZD9, RPA3, TACSTD2, TNFRSF11A, CNN1, and PTGER2.
In embodiments, the second gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) decreased gene of Table 10. In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) decreased gene is C3, AFAP1L2, PTGDR, CMKLR1, CEBPB, NFKBID, TNFRSF1B, SMPDL3B, F2RL1, HMOX1, CXCR2, FPR2, IL17RE, CHST4, IL4R, NFKBIZ, RELB, ADM, ALOX5, SPP1, SIGIRR, EPO, CCL26, SYK, PTGES, TFR2, AHCY, TCIRG1, CHI3L1, UGT1A1, NLRP10, HCK, RARRES2, KLKB1, CXCL2, F12, ALOX15, PROK2, ELF3, ADORA1, CXCL6, CD40, HYAL1, AIF1, ADGRE2, IL34, AHSG, THEMIS2, MMP25, PLSCR1, NMI, PYCARD, TACR1, LBP, GAL, F11R, LY75, IL23A, NRROS, XCL1, ASS1, LYN, BTK, TNFAIP6, IL6, IDO1, PTAFR, CXCL1, PF4, PRKCQ, IL17C, CXCL5, GJA1, CXCL3, PLA2G4C, ICAM1, ORM2, SDC1, PTGER2, or TLR3.
In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) decreased gene is selected from the group consisting of C3, AFAP1L2, PTGDR, CMKLR1, CEBPB, NFKBID, TNFRSF1B, SMPDL3B, F2RL1, HMOX1, CXCR2, FPR2, IL17RE, CHST4, IL4R, NFKBIZ, RELB, ADM, ALOX5, SPP1, SIGIRR, EPO, CCL26, SYK, PTGES, TFR2, AHCY, TCIRG1, CHI3L1, UGT1A1, NLRP10, HCK, RARRES2, KLKB1, CXCL2, F12, ALOX15, PROK2, ELF3, ADORA1, CXCL6, CD40, HYAL1, AIF1, ADGRE2, IL34, AHSG, THEMIS2, MMP25, PLSCR1, NMI, PYCARD, TACR1, LBP, GAL, F11R, LY75, IL23A, NRROS, XCL1, ASS1, LYN, BTK, TNFAIP6, IL6, IDO1, PTAFR, CXCL1, PF4, PRKCQ, IL17C, CXCL5, GJA1, CXCL3, PLA2G4C, ICAM1, ORM2, SDC1, PTGER2, and TLR3.
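Whether a test dataset includes at least one decreased gene from one of the listed tables can be checked by set intersection. The following is a minimal sketch only: the function name is hypothetical, the gene list is a short excerpt of the symbols recited above and not the full table, and the sample of decreased genes is invented for illustration.

```python
# Short excerpt of gene symbols recited above; not the complete table.
TABLE_10_EXCERPT = ["C3", "AFAP1L2", "PTGDR", "CMKLR1", "CEBPB", "IL6", "CXCL1", "TLR3"]


def table_hits(decreased_genes, table_genes):
    """Return the sorted gene symbols that are both decreased and listed in the table.
    Symbols are compared case-insensitively by upper-casing both sides."""
    table = {gene.upper() for gene in table_genes}
    return sorted({gene.upper() for gene in decreased_genes} & table)


# Hypothetical decreased genes observed in a test dataset.
sample_decreased = ["il6", "CXCL1", "GAPDH"]

hits = table_hits(sample_decreased, TABLE_10_EXCERPT)
has_at_least_one = len(hits) >= 1
print(hits, has_at_least_one)
```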
In embodiments, the second gene set includes at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) decreased gene of Table 11. In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) decreased gene is C3, MOG, FOXI3, ACTN3, P2RX2, TWIST2, DYSF, MYBPC2, VSIG1, AKR1C3, CAV2, COL23A1, PTGDR, SLC2A4, RNF43, SHROOM1, BCAN, FGR, LFNG, KRTDAP, GCM2, SEMA4A, SYNGR3, COL13A1, SAMHD1, PDCD1, HMGA1, DSC3, GCNT4, FGF22, SNTG2, HMX2, CARD11, TSPO, IRF6, KLF15, ALAS2, KLK7, KCP, B3GNT2, CMKLR1, ACSBG1, CLDN3, MTHFD1, CEBPB, BCL11B, GDF3, CASR, SLC29A1, POU2F3, TBX6, DAZAP1, TIMP4, PVALB, INPP5D, MAL2, NDP, ATXN3, MPZL2, NODAL, TNFRSF1B, BARX1, AFP, HPSE, SOCS1, DDX25, LAMB3, TNMD, BOLL, SPINT2, LPAR3, CAV1, IRF4, SOX10, SFN, NPY5R, MYB, F2RL1, MYBBP1A, HMOX1, TNFAIP2, CCDC85B, RASGRP4, CXCL14, CDH5, CA2, HEY2, ASB2, GNPNAT1, PADI2, RITZ, PCOLCE, CXCR2, FPR2, FGF2, HELLS, HACD1, APELA, LCTL, EVPL, GAB3, FLT3LG, RASAL1, ARC, ACTL8, NPM1, HSPE1, CDH1, SKOR2, ZNF488, RAP1GAP2, CR2, HRG, FABP5, CDH3, PSMB8, FOXD3, SP8, TERT, ANG, SPRR2F, RAMP3, UPK1B, JADE2, TJP2, ETV1, RYR2, RAB25, HSPA2, NRK, RELB, CTSC, INHBB, ANXA3, EPOR, ZFP57, BIK, ADM, DAZL, TM4SF1, PRKCD, CD4, ARTN, POU5F1, LTF, YBX2, SPRY4, EDA, FGF4, FOXA3, NR1I2, SPIB, STAR, FAM65B, ERBB3, ATIC, ARHGAP22, HAPLN3, FRAT2, MPZ, ZMYND15, ARHGAP4, NPAS1, DOCK2, RSPO4, ACAN, TCF15, COL14A1, MTHFD1L, BAX, WNT11, CEBPA, AVPR1A, PTPRZ1, SPP1, ADRA2C, HOOK1, CRYBA4, ANGPT4, SS18L2, BCL11A, CHMP4C, P2RY1, ZIC5, THOC6, NFE2, KRT17, EPO, RPS6KA1, UPK1A, FAM150B, LHCGR, FGF16, DPPA4, KRT7, EPHA1, CNFN, CLRN1, NR1D1, EPAS1, SYK, CHRNA9, PKP1, CLEC4D, PPARGC1B, GRID2, SEMA3G, RAPGEF3, SPTB, GJA5, RCN3, SP7, TCIRG1, CHI3L1, UGT1A1, HCLS1, SSH3, METTLE, RORC, KRTAP13-4, RAC2, KLK13, NME2, TESC, RRS1, HCK, FZD5, NPY1R, CATSPER4, PRRX2, ETS1, ALPL, APLN, ACP5, TRIM71, ADA, RARRES2, PRDX1, S1PR5, MYC, GCNT2, SFRP1, FGFR4, SHISA3, NPTX1, RP11-240B13.2, FOXI2, EMX1, KDR, VWDE, DNMT3B, ALDH1A3, ALDOC, RARG, CD74, TDRD5, FOXG1, DRD3, CDHR1, MFSD2A, PDPN, 
INSC, RTN4RL2, RAD54L, GABRA5, HESX1, WDR74, TRNP1, HPN, EIF4EBP1, DNAH11, FKBP4, DPPA5, ALOX15, SOHLH2, PHC1, LCP1, STC1, ATOH1, EPHA6, HES3, TNFSF12, GAS6, PKP3, FGF19, PROK2, PAQR5, CBR1, ELF3, M1AP, ITM2A, LAMC3, TEC, LHX6, PHOSPHO1, GHRHR, GJA4, PHLDA3, RGS14, VWA1, SEMG1, VENTX, OSCAR, LRRK1, NKX1-2, ECSCR, ADORAL, ITGAM, NOS3, SLC44A4, PFN1, MOV10L1, ALPK3, LIF, KLK8, TLL2, VILl, TULP1, PHGDH, FA2H, PCDH1, HSPD1, MGST1, ENPP1, LECT1, CHRM1, NME1, SOX15, PLA2G3, MMP17, VWA2, PCSK9, CPNE9, PPP1R13L, KRT15, ADCYAP1R1, PCK2, DOC2A, ARHGEF15, KRT18, ETV4, SRY, CTSV, LIN28A, AQP5, UNC5B, BBC3, GAS1, TCL1A, SLC34A2, NRN1L, NPTX2, HYAL1, AIF1, LEP, PRKCH, KCNQ1, TNNT2, IL34, SH2B3, AHSG, SPINT1, RASIP1, MMP25, P2RX5, GRB7, APRT, VAV1, TNNT1, ESRP2, SLC45A3, MATK, ESRP1, ITGB1BP2, CARMIL2, CLN8, CHAC1, EGFL7, TESMIN, SFRP5, SLC7A5, BATF2, PPP1R16B, TBX22, ADM2, FOXH1, MLXIPL, FOXS1, F11R, CDX4, OVOL1, VSX2, CD9, MME, GJC3, KDF1, FLT1, FLT3, CCDC63, HLA-G, HTR6, CLDN4, TRPC6, UNC13A, ACTN2, NRROS, GJB3, FAM150A, SLC2A14, JPH1, MMP9, ALOX15B, SH3GL3, VDR, SIX4, LGALS3, PRSS8, COL6A3, ZSCAN10, MAG, TRPM2, COL6A2, RAB38, LAMC2, CRABP1, HRH2, NPPC, CLC, MYLPF, KRTAP5-11, S100A4, ZIC2, APOE, LYAR, 0C90, CCND1, KLK4, RXFP1, MB21D1, PTGIS, INHBE, PTPN6, PLCG2, FBL, GLI1, ASST, PACSIN1, TMC1, PIM1, HPRT1, AK4, ARG2, LYN, NRARP, ELL3, TEX19, TDGF1, MESP2, MYOZ1, MT1G, GATA5, FOSL1, FUT9, TAF4B, NANOG, MEI1, CCKBR, ALOX12B, ST14, GNG8, BNC1, KCNJ10, PIWIL3, SYNE4, CCNBlIP1, DLX4, ASNS, TAF7L, SLC6A11, RORB, PAK1IP1, NOTO, HPGD, FOXL2, KRT19, LGR6, WIPF3, MFGE8, PRTN3, CD19, LTBR, FSTL4, FAM101B, MMP19, BTK, KLK5, UST, FZD7, CCM2L, ANOS1, HES2, JAK3, MKX, SLA, SORL1, PLPPR4, FRAS1, DUSP6, TRPV2, ITGB4, RP1-302G2.5, RBPMS2, YBX3, EPCAM, KLF1, IL6, SH2D2A, KREMEN2, THY1, CXCL1, PRDM14, CRYGD, SALL4, GRHL3, UTF1, DPPA3, OLFML3, AHSP, SYPL2, SFRP2, NOS1, TFAP2C, RNF112, LCK, PRKCQ, FHL2, UGT8, TDRD1, MREG, SOCS3, GH2, TGFA, TEAD4, GJA1, FZD9, FAM101A, COL4A1, HCN1, TACSTD2, UNC45B, 
SOCS2, ICAM1, PODXL, ZFP42, CST6, GAL3ST1, TNFRSF11A, ENG, TNNI3, CD79B, SDC1, TCF21, SPATA16, COL9A3, TLR3, DIAPH2, PREX2, ADAMTS4, TRIM54, or RAC3.
In embodiments, the at least one (e.g., 1, 2, 3, 4, 5, 6 etc.) decreased gene is selected from the group consisting of C3, MOG, FOXI3, ACTN3, P2RX2, TWIST2, DYSF, MYBPC2, VSIG1, AKR1C3, CAV2, COL23A1, PTGDR, SLC2A4, RNF43, SHROOM1, BCAN, FGR, LFNG, KRTDAP, GCM2, SEMA4A, SYNGR3, COL13A1, SAMHD1, PDCD1, HMGA1, DSC3, GCNT4, FGF22, SNTG2, HMX2, CARD11, TSPO, IRF6, KLF15, ALAS2, KLK7, KCP, B3GNT2, CMKLR1, ACSBG1, CLDN3, MTHFD1, CEBPB, BCL11B, GDF3, CASR, SLC29A1, POU2F3, TBX6, DAZAP1, TIMP4, PVALB, INPP5D, MAL2, NDP, ATXN3, MPZL2, NODAL, TNFRSF1B, BARX1, AFP, HPSE, SOCS1, DDX25, LAMB3, TNMD, BOLL, SPINT2, LPAR3, CAV1, IRF4, SOX10, SFN, NPY5R, MYB, F2RL1, MYBBP1A, HMOX1, TNFAIP2, CCDC85B, RASGRP4, CXCL14, CDH5, CA2, HEY2, ASB2, GNPNAT1, PADI2, RITZ, PCOLCE, CXCR2, FPR2, FGF2, HELLS, HACD1, APELA, LCTL, EVPL, GAB3, FLT3LG, RASAL1, ARC, ACTL8, NPM1, HSPE1, CDH1, SKOR2, ZNF488, RAP1GAP2, CR2, HRG, FABP5, CDH3, PSMB8, FOXD3, SP8, TERT, ANG, SPRR2F, RAMP3, UPK1B, JADE2, TJP2, ETV1, RYR2, RAB25, HSPA2, NRK, RELB, CTSC, INHBB, ANXA3, EPOR, ZFP57, BIK, ADM, DAZL, TM4SF1, PRKCD, CD4, ARTN, POU5F1, LTF, YBX2, SPRY4, EDA, FGF4, FOXA3, NR1I2, SPIB, STAR, FAM65B, ERBB3, ATIC, ARHGAP22, HAPLN3, FRAT2, MPZ, ZMYND15, ARHGAP4, NPAS1, DOCK2, RSPO4, ACAN, TCF15, COL14A1, MTHFD1L, BAX, WNT11, CEBPA, AVPR1A, PTPRZ1, SPP1, ADRA2C, HOOK1, CRYBA4, ANGPT4, SS18L2, BCL11A, CHMP4C, P2RY1, ZIC5, THOC6, NFE2, KRT17, EPO, RPS6KA1, UPK1A, FAM150B, LHCGR, FGF16, DPPA4, KRT7, EPHA1, CNFN, CLRN1, NR1D1, EPAS1, SYK, CHRNA9, PKP1, CLEC4D, PPARGC1B, GRID2, SEMA3G, RAPGEF3, SPTB, GJA5, RCN3, SP7, TCIRG1, CHI3L1, UGT1A1, HCLS1, SSH3, METTLE, RORC, KRTAP13-4, RAC2, KLK13, NME2, TESC, RRS1, HCK, FZD5, NPY1R, CATSPER4, PRRX2, ETS1, ALPL, APLN, ACP5, TRIM71, ADA, RARRES2, PRDX1, S1PR5, MYC, GCNT2, SFRP1, FGFR4, SHISA3, NPTX1, RP11-240B13.2, FOXI2, EMX1, KDR, VWDE, DNMT3B, ALDH1A3, ALDOC, RARG, CD74, TDRD5, FOXG1, DRD3, CDHR1, MFSD2A, PDPN, INSC, RTN4RL2, RAD54L, GABRA5, HESX1, WDR74, TRNP1, HPN, EIF4EBP1, DNAH11, 
FKBP4, DPPA5, ALOX15, SOHLH2, PHC1, LCP1, STC1, ATOH1, EPHA6, HES3, TNFSF12, GAS6, PKP3, FGF19, PROK2, PAQR5, CBR1, ELF3, M1AP, ITM2A, LAMC3, TEC, LHX6, PHOSPHO1, GHRHR, GJA4, PHLDA3, RGS14, VWA1, SEMG1, VENTX, OSCAR, LRRK1, NKX1-2, ECSCR, ADORAL, ITGAM, NOS3, SLC44A4, PFN1, MOV10L1, ALPK3, LIF, KLK8, TLL2, VIL1, TULP1, PHGDH, FA2H, PCDH1, HSPD1, MGST1, ENPP1, LECT1, CHRM1, NME1, SOX15, PLA2G3, MMP17, VWA2, PCSK9, CPNE9, PPP1R13L, KRT15, ADCYAP1R1, PCK2, DOC2A, ARHGEF15, KRT18, ETV4, SRY, CTSV, LIN28A, AQP5, UNC5B, BBC3, GAS1, TCL1A, SLC34A2, NRN1L, NPTX2, HYAL1, AIF1, LEP, PRKCH, KCNQ1, TNNT2, IL34, SH2B3, AHSG, SPINT1, RASIP1, MMP25, P2RX5, GRB7, APRT, VAV1, TNNT1, ESRP2, SLC45A3, MATK, ESRP1, ITGB1BP2, CARMIL2, CLN8, CHAC1, EGFL7, TESMIN, SFRP5, SLC7A5, BATF2, PPP1R16B, TBX22, ADM2, FOXH1, MLXIPL, FOXS1, F11R, CDX4, OVOL1, VSX2, CD9, MME, GJC3, KDF1, FLT1, FLT3, CCDC63, HLA-G, HTR6, CLDN4, TRPC6, UNC13A, ACTN2, NRROS, GJB3, FAM150A, SLC2A14, JPH1, MMP9, ALOX15B, SH3GL3, VDR, SIX4, LGALS3, PRSS8, COL6A3, ZSCAN10, MAG, TRPM2, COL6A2, RAB38, LAMC2, CRABP1, HRH2, NPPC, CLC, MYLPF, KRTAP5-11, S100A4, ZIC2, APOE, LYAR, OC90, CCND1, KLK4, RXFP1, MB21D1, PTGIS, INHBE, PTPN6, PLCG2, FBL, GLI1, ASS1, PACSIN1, TMC1, PIM1, HPRT1, AK4, ARG2, LYN, NRARP, ELL3, TEX19, TDGF1, MESP2, MYOZ1, MT1G, GATA5, FOSL1, FUT9, TAF4B, NANOG, MEI1, CCKBR, ALOX12B, ST14, GNG8, BNC1, KCNJ10, PIWIL3, SYNE4, CCNB1IP1, DLX4, ASNS, TAF7L, SLC6A11, RORB, PAK1IP1, NOTO, HPGD, FOXL2, KRT19, LGR6, WIPF3, MFGE8, PRTN3, CD19, LTBR, FSTL4, FAM101B, MMP19, BTK, KLK5, UST, FZD7, CCM2L, ANOS1, HES2, JAK3, MKX, SLA, SORL1, PLPPR4, FRAS1, DUSP6, TRPV2, ITGB4, RP1-302G2.5, RBPMS2, YBX3, EPCAM, KLF1, IL6, SH2D2A, KREMEN2, THY1, CXCL1, PRDM14, CRYGD, SALL4, GRHL3, UTF1, DPPA3, OLFML3, AHSP, SYPL2, SFRP2, NOS1, TFAP2C, RNF112, LCK, PRKCQ, FHL2, UGT8, TDRD1, MREG, SOCS3, GH2, TGFA, TEAD4, GJA1, FZD9, FAM101A, COL4A1, HCN1, TACSTD2, UNC45B, SOCS2, ICAM1, PODXL, ZFP42, CST6, GAL3ST1, TNFRSF11A, ENG, TNNI3, CD79B, 
SDC1, TCF21, SPATA16, COL9A3, TLR3, DIAPH2, PREX2, ADAMTS4, TRIM54, and RAC3.
In embodiments, the at least one decreased gene is selected from the group consisting of: ADCY8, AKR1C3, ALDH3A1, APRT, ASNS, BAX, BBC3, CCND1, CDH5, CH25H, CMKLR1, COL16A1, CXCL1, CXCL2, EDNRB, EEF1E1, RIPOR2, FGF10, FGF22, FZD7, GJA1, GNG8, GNPNAT1, HPGD, ICAM1, ITPR2, KLF1, KLF15, LEP, LPL, LRRC32, MAP3K5, MX1, MYC, NME1, NME2, NQO2, NR1D1, P2RY1, PCOLCE2, PDE4A, PDIA5, PFKP, PHGDH, PLK5, PPP1R14A, PRODH, PSMB8, PSMB9, PYCR1, RAPGEF3, RYR2, SCARB1, SHMT2, SIPA1, SPHK1, TRIM22, VDR, ADA, ADGRG3, ADGRL4, ANK1, ART3, CA11, CABP1, CDH15, CDHR1, COL13A1, EPHA6, CALHM6, GRID2IP, HS3ST3B1, ICAM5, JCAD, LGR6, LRRC38, NOXO1, PDPN, PLPPR5, PODXL, RAMP3, RGS7BP, RIMS4, RTBDN, RTN4RL2, S100A10, SEMA4A, SGCG, SH2D5, SHISA9, SHROOM1, SLC22A3, SLC24A2, SLC29A2, SLC6A11, SLC7A10, SLC7A5, SLCO2A1, STAC2, STYK1, TMC1, UNC13A, WWC1, ABCG2, ACSBG1, ACSS1, ACY1, AHCY, ALOX12B, AMD1, ARG2, ASS1, BCAT1, CHST2, CLN8, ENTPD2, FABP5, FADS3, FUT4, FUT9, GAL3ST3, GMDS, HACD1, HAS3, HPD, KYAT1, LDHD, MPP1, OGDHL, PDE4A, PGM1, PIPDX, PLAAT3, PLA2G4C, PLCB3, PNP, PSAT1, PTGES, REXO2, SCARB1, SLC27A6, SPHK1, STAB2, UAP1L1 and UCK2.
In embodiments, the at least one decreased gene is ADCY8, AKR1C3, ALDH3A1, APRT, ASNS, BAX, BBC3, CCND1, CDH5, CH25H, CMKLR1, COL16A1, CXCL1, CXCL2, EDNRB, EEF1E1, RIPOR2, FGF10, FGF22, FZD7, GJA1, GNG8, GNPNAT1, HPGD, ICAM1, ITPR2, KLF1, KLF15, LEP, LPL, LRRC32, MAP3K5, MX1, MYC, NME1, NME2, NQO2, NR1D1, P2RY1, PCOLCE2, PDE4A, PDIA5, PFKP, PHGDH, PLK5, PPP1R14A, PRODH, PSMB8, PSMB9, PYCR1, RAPGEF3, RYR2, SCARB1, SHMT2, SIPA1, SPHK1, TRIM22, VDR, ADA, ADGRG3, ADGRL4, ANK1, ART3, CA11, CABP1, CDH15, CDHR1, COL13A1, EPHA6, CALHM6, GRID2IP, HS3ST3B1, ICAM5, JCAD, LGR6, LRRC38, NOXO1, PDPN, PLPPR5, PODXL, RAMP3, RGS7BP, RIMS4, RTBDN, RTN4RL2, S100A10, SEMA4A, SGCG, SH2D5, SHISA9, SHROOM1, SLC22A3, SLC24A2, SLC29A2, SLC6A11, SLC7A10, SLC7A5, SLCO2A1, STAC2, STYK1, TMC1, UNC13A, WWC1, ABCG2, ACSBG1, ACSS1, ACY1, AHCY, ALOX12B, AMD1, ARG2, ASS1, BCAT1, CHST2, CLN8, ENTPD2, FABP5, FADS3, FUT4, FUT9, GAL3ST3, GMDS, HACD1, HAS3, HPD, KYAT1, LDHD, MPP1, OGDHL, PDE4A, PGM1, PIPDX, PLAAT3, PLA2G4C, PLCB3, PNP, PSAT1, PTGES, REXO2, SCARB1, SLC27A6, SPHK1, STAB2, UAP1L1 or UCK2.
In embodiments, the decreased expression levels are at least 4 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 4 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 5 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 5 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 6 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 6 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 7 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 7 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 8 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 8 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 9 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 9 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 10 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 10 times lower relative to a pluripotent stem cell.
In embodiments, the decreased expression levels are at least 11 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 11 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 12 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 12 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 13 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 13 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 14 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 14 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 15 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 15 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 16 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 16 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 17 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 17 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 18 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 18 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 19 times lower relative to a pluripotent stem cell. 
In embodiments, the decreased expression levels are about 19 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are at least 20 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 20 times lower relative to a pluripotent stem cell.
In embodiments, the decreased expression levels are about 4-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 6-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 6-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 8-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 8-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 10-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 10-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 20-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 20-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 30-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 30-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 40-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 40-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 50-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 50-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 60-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 60-100 times lower relative to a pluripotent stem cell. 
In embodiments, the decreased expression levels are about 70-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 70-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 80-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 80-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 90-100 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 90-100 times lower relative to a pluripotent stem cell.
In embodiments, the decreased expression levels are about 4-90 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-90 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 4-80 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-80 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 4-70 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-70 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 4-60 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-60 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 4-50 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-50 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 4-40 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-40 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 4-30 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-30 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 4-20 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-20 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 4-10 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-10 times lower relative to a pluripotent stem cell. 
In embodiments, the decreased expression levels are about 4-8 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-8 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are about 4-6 times lower relative to a pluripotent stem cell. In embodiments, the decreased expression levels are 4-6 times lower relative to a pluripotent stem cell.
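By way of non-limiting illustration, the fold-decrease criteria recited above can be sketched in code. The following Python sketch is an assumption-laden example and not the implementation of the provided methods: the function names (`fold_decrease`, `decreased_genes`), the dictionary-based expression profiles, and all numeric expression values are hypothetical and chosen only for illustration. It treats "N times lower relative to a pluripotent stem cell" as the ratio of the reference (pluripotent stem cell) expression level to the test expression level, and reports which genes of a gene set fall within a chosen fold range.

```python
from typing import Dict, List, Optional


def fold_decrease(reference_level: float, test_level: float) -> float:
    """Return how many times lower the test level is than the reference level."""
    if reference_level <= 0 or test_level <= 0:
        # Assumption: levels are strictly positive (e.g., after adding a pseudocount).
        raise ValueError("expression levels must be positive")
    return reference_level / test_level


def decreased_genes(
    test: Dict[str, float],
    reference: Dict[str, float],
    gene_set: List[str],
    min_fold: float = 4.0,
    max_fold: Optional[float] = None,
) -> List[str]:
    """List genes of gene_set whose fold decrease lies in [min_fold, max_fold]."""
    hits = []
    for gene in gene_set:
        if gene not in test or gene not in reference:
            continue  # skip genes not measured in both profiles
        fold = fold_decrease(reference[gene], test[gene])
        if fold >= min_fold and (max_fold is None or fold <= max_fold):
            hits.append(gene)
    return hits


# Hypothetical expression values (arbitrary units), for illustration only.
psc_reference = {"POU5F1": 800.0, "NANOG": 300.0, "LIN28A": 500.0}
test_profile = {"POU5F1": 40.0, "NANOG": 100.0, "LIN28A": 5.0}
gene_set = ["POU5F1", "NANOG", "LIN28A"]

# "4-100 times lower": POU5F1 (20x) and LIN28A (100x) qualify; NANOG (3x) does not.
hits = decreased_genes(test_profile, psc_reference, gene_set, min_fold=4.0, max_fold=100.0)
print(hits)
# "at least one decreased gene" criterion
print(len(hits) >= 1)
```

In this sketch an "at least N times lower" embodiment corresponds to leaving `max_fold` unset, and a ranged embodiment (e.g., 4-100 times lower) corresponds to setting both bounds. How "about" is interpreted numerically is not specified here and would need to be chosen separately.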
In embodiments, the gene expression profile information for the desirable determined dopaminergic precursor cell comprises an undesirable gene expression profile comprising one or more undesirable genes. In embodiments, the one or more undesirable genes is a cancer marker gene. In embodiments, the one or more undesirable genes is a tyrosine hydroxylase gene. An “undesirable gene” is a gene characteristic of a non-dopaminergic cell or a non-dopaminergic neuron. A “non-dopaminergic cell” or a “non-dopaminergic neuron” is a cell that lacks biological features of a dopaminergic neuron (e.g., does not express dopamine). Examples of non-dopaminergic cells and non-dopaminergic neurons include, without limitation, GABAergic cells, serotonergic neurons, non-A9 dopaminergic neurons, ependymal cells, astrocytes, microglial cells, and oligodendrocytes. In embodiments, the non-dopaminergic neuron does not express detectable amounts of dopamine. In embodiments, the non-dopaminergic neuron expresses tyrosine hydroxylase.
Also provided herein are populations of cells identified as comprising a neuronal progenitor cell population, where the identification is based on the classification methods provided herein. For example, provided herein are populations of cells identified as comprising determined dopaminergic precursor cells (identified, e.g., by the methods provided herein). In some embodiments, a dose of such identified cells is provided as a composition or formulation, such as a pharmaceutical composition or formulation. In some embodiments, the dose of cells comprises differentiated cells, for instance cells differentiated according to any of the methods described in Section I.A.2. herein. In some embodiments, the dose of cells is identified as comprising determined dopaminergic precursor cells according to any of the methods described in Section I.F. herein.
Such compositions can be used in accord with the provided methods, such as in the prevention or treatment of diseases, conditions, and disorders, such as neurodegenerative disorders.
The term “pharmaceutical formulation” refers to a preparation which is in such form as to permit the biological activity of an active ingredient contained therein to be effective, and which contains no additional components which are unacceptably toxic to a subject to which the formulation would be administered.
A “pharmaceutically acceptable carrier” refers to an ingredient in a pharmaceutical formulation, other than an active ingredient, which is nontoxic to a subject. A pharmaceutically acceptable carrier includes, but is not limited to, a buffer, excipient, stabilizer, or preservative.
In some aspects, the choice of carrier is determined in part by the particular cell or agent and/or by the method of administration. Accordingly, there are a variety of suitable formulations. For example, the pharmaceutical composition can contain preservatives. Suitable preservatives may include, for example, methylparaben, propylparaben, sodium benzoate, and benzalkonium chloride. In some aspects, a mixture of two or more preservatives is used. The preservative or mixtures thereof are typically present in an amount of about 0.0001% to about 2% by weight of the total composition. Carriers are described, e.g., by Remington's Pharmaceutical Sciences 16th edition, Osol, A. Ed. (1980). Pharmaceutically acceptable carriers are generally nontoxic to recipients at the dosages and concentrations employed, and include, but are not limited to: buffers such as phosphate, citrate, and other organic acids; antioxidants including ascorbic acid and methionine; preservatives (such as octadecyldimethylbenzyl ammonium chloride; hexamethonium chloride; benzalkonium chloride; benzethonium chloride; phenol, butyl or benzyl alcohol; alkyl parabens such as methyl or propyl paraben; catechol; resorcinol; cyclohexanol; 3-pentanol; and m-cresol); low molecular weight (less than about 10 residues) polypeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, histidine, arginine, or lysine; monosaccharides, disaccharides, and other carbohydrates including glucose, mannose, or dextrins; chelating agents such as EDTA; sugars such as sucrose, mannitol, trehalose or sorbitol; salt-forming counter-ions such as sodium; metal complexes (e.g. Zn-protein complexes); and/or non-ionic surfactants such as polyethylene glycol (PEG).
Buffering agents in some aspects are included in the compositions. Suitable buffering agents include, for example, citric acid, sodium citrate, phosphoric acid, potassium phosphate, and various other acids and salts. In some aspects, a mixture of two or more buffering agents is used. The buffering agent or mixtures thereof are typically present in an amount of about 0.001% to about 4% by weight of the total composition. Methods for preparing administrable pharmaceutical compositions are known. Exemplary methods are described in more detail in, for example, Remington: The Science and Practice of Pharmacy, Lippincott Williams & Wilkins; 21st ed. (May 1, 2005).
The formulation or composition may also contain more than one active ingredient useful for the particular indication, disease, or condition being prevented or treated with the cells or agents, where the respective activities do not adversely affect one another. Such active ingredients are suitably present in combination in amounts that are effective for the purpose intended. Thus, in some embodiments, the pharmaceutical composition further includes other pharmaceutically active agents or drugs, such as carbidopa-levodopa (e.g., Levodopa), dopamine agonists (e.g., pramipexole, ropinirole, rotigotine, and apomorphine), MAO B inhibitors (e.g., selegiline, rasagiline, and safinamide), catechol O-methyltransferase (COMT) inhibitors (e.g., entacapone and tolcapone), anticholinergics (e.g., benztropine and trihexylphenidyl), amantadine, etc. In some embodiments, the agents or cells are administered in the form of a salt, e.g., a pharmaceutically acceptable salt. Suitable pharmaceutically acceptable acid addition salts include those derived from mineral acids, such as hydrochloric, hydrobromic, phosphoric, metaphosphoric, nitric, and sulphuric acids, and organic acids, such as tartaric, acetic, citric, malic, lactic, fumaric, benzoic, glycolic, gluconic, succinic, and arylsulphonic acids, for example, p-toluenesulphonic acid.
The formulation or composition may also be administered in combination with another form of treatment useful for the particular indication, disease, or condition being prevented or treated with the cells or agents, where the respective activities do not adversely affect one another. Thus, in some embodiments, the pharmaceutical composition is administered in combination with deep brain stimulation (DBS).
The pharmaceutical composition in some embodiments contains agents or cells in amounts effective to treat or prevent the disease or condition, such as a therapeutically effective or prophylactically effective amount. Therapeutic or prophylactic efficacy in some embodiments is monitored by periodic assessment of treated subjects. For repeated administrations over several days or longer, depending on the condition, the treatment is repeated until a desired suppression of disease symptoms occurs. However, other dosage regimens may be useful and can be determined. The desired dosage can be delivered by a single bolus administration of the composition, by multiple bolus administrations of the composition, or by continuous infusion administration of the composition.
The agents or cells can be administered by any suitable means, for example, by stereotactic injection (e.g., using a catheter). In some embodiments, a given dose is administered by a single bolus administration of the cells or agent. In some embodiments, it is administered by multiple bolus administrations of the cells or agent, for example, over a period of months or years. In some embodiments, the agents or cells can be administered by stereotactic injection into the brain, such as in the substantia nigra.
For the prevention or treatment of disease, the appropriate dosage may depend on the type of disease to be treated, the type of agent or agents, the type of cells or recombinant receptors, the severity and course of the disease, whether the agent or cells are administered for preventive or therapeutic purposes, previous therapy, the subject's clinical history and response to the agent or the cells, and the discretion of the attending physician. The compositions are in some embodiments suitably administered to the subject at one time or over a series of treatments.
The cells or agents may be administered using standard administration techniques, formulations, and/or devices. Provided are formulations and devices, such as syringes and vials, for storage and administration of the compositions. With respect to cells, administration can be autologous. For example, non-pluripotent cells (e.g., fibroblasts) can be obtained from a subject, and administered to the same subject following reprogramming and differentiation. When administering a therapeutic composition (e.g., a pharmaceutical composition containing a genetically reprogrammed and/or differentiated cell or an agent that treats or ameliorates symptoms of a disease or disorder, such as a neurodegenerative disorder), it will generally be formulated in a unit dosage injectable form (solution, suspension, emulsion). Formulations include those for stereotactic administration, such as into the brain (e.g. the substantia nigra).
Compositions in some embodiments are provided as sterile liquid preparations, e.g., isotonic aqueous solutions, suspensions, emulsions, dispersions, or viscous compositions, which may in some aspects be buffered to a selected pH. Liquid preparations are normally easier to prepare than gels, other viscous compositions, and solid compositions. Additionally, liquid compositions are somewhat more convenient to administer, especially by injection. Viscous compositions, on the other hand, can be formulated within the appropriate viscosity range to provide longer contact periods with specific tissues. Liquid or viscous compositions can comprise carriers, which can be a solvent or dispersing medium containing, for example, water, saline, phosphate buffered saline, polyol (for example, glycerol, propylene glycol, liquid polyethylene glycol) and suitable mixtures thereof.
Sterile injectable solutions can be prepared by incorporating the agent or cells in a solvent, such as in admixture with a suitable carrier, diluent, or excipient such as sterile water, physiological saline, glucose, dextrose, or the like.
The formulations to be used for in vivo administration are generally sterile. Sterility may be readily accomplished, e.g., by filtration through sterile filtration membranes
Also provided herein are methods of treatment involving administration of a neuronal progenitor cell population identified based on the classification methods provided herein to a subject having a neurodegenerative disease in need of treatment thereof. In some embodiments, a population of neuronal progenitor cells that are determined dopaminergic precursor cells is identified (e.g., by the methods provided herein), and the method further includes administering the determined dopaminergic precursor cells to a subject in need thereof. Also provided herein are uses of any of the provided compositions or populations of neuronal progenitor cells, e.g. determined dopaminergic precursor cells, in such methods and treatments, and in the preparation of a medicament in order to carry out such therapeutic methods. In some embodiments, the methods thereby treat the neurodegenerative disease in the subject. Also provided herein are uses of any of the compositions, such as pharmaceutical compositions provided herein, for the treatment of a neurodegenerative disease. In embodiments, the subject suffers from a neurodegenerative disease. In embodiments, the subject suffers from Parkinson's Disease. In some embodiments, the determined dopaminergic precursor cells are differentiated from PSCs (e.g. iPSCs) autologous to the subject to be treated, i.e. the PSCs are derived from the same subject to whom the differentiated cells are administered.
In some embodiments, non-pluripotent cells (e.g., fibroblasts) derived from patients having Parkinson's disease (PD) are reprogrammed to become iPSCs, such as in accord with differentiation processes as described in Section II. In some embodiments, fibroblasts may be reprogrammed to iPSCs by transforming fibroblasts with genes (OCT4, SOX2, NANOG, LIN28, and KLF4) cloned into a plasmid (for example, see Yu, et al., Science DOI: 10.1126/science.1172482). In some embodiments, non-pluripotent fibroblasts derived from patients having PD are reprogrammed to become iPSCs before differentiation into determined DA neuron progenitor cells and/or DA neurons, such as by use of the non-integrating Sendai virus to reprogram the cells (e.g., use of CTS™ CytoTune™-iPS 2.1 Sendai Reprogramming Kit). In some embodiments, the resulting differentiated cells are then administered to the patient from whom they are derived in an autologous stem cell transplant. In some embodiments, the PSCs (e.g., iPSCs) are allogeneic to the subject to be treated, i.e. the PSCs are derived from a different individual than the subject to whom the differentiated cells will be administered. In some embodiments, non-pluripotent cells (e.g., fibroblasts) derived from another individual (e.g. an individual not having a neurodegenerative disorder, such as Parkinson's disease) are reprogrammed to become iPSCs before differentiation into determined DA neuron progenitor cells and/or DA neurons. In some embodiments, reprogramming is accomplished, at least in part, by use of the non-integrating Sendai virus to reprogram the cells (e.g., use of CTS™ CytoTune™-iPS 2.1 Sendai Reprogramming Kit). In some embodiments, the resulting differentiated cells are then administered to an individual who is not the same individual from whom the differentiated cells are derived (e.g. allogeneic cell therapy or allogeneic cell transplantation).
In some embodiments, the subject has a neurodegenerative disease. In some embodiments, the neurodegenerative disease comprises the loss of dopamine neurons in the brain. In some embodiments, the subject has lost dopamine neurons in the substantia nigra (SN). In some embodiments, the subject has lost dopamine neurons in the substantia nigra pars compacta (SNc). In some embodiments, the subject exhibits rigidity, bradykinesia, postural reflex impairment, resting tremor, or a combination thereof. In some embodiments, the subject exhibits an abnormal [18F]-L-DOPA PET scan. In some embodiments, the subject exhibits [18F]-DG-PET evidence for a Parkinson's Disease Related Pattern (PDRP).
In some embodiments, the neurodegenerative disease is Parkinsonism. In some embodiments, the neurodegenerative disease is Parkinson's disease. In some embodiments, the neurodegenerative disease is idiopathic Parkinson's disease. In some embodiments, the neurodegenerative disease is a familial form of Parkinson's disease. In some embodiments, the subject has mild Parkinson's disease. In some embodiments, the subject has a Movement Disorder Society-Unified Parkinson's Disease Rating Scale (MDS-UPDRS) motor score of less than or equal to 32. In some embodiments, the subject has Parkinson's Disease. In some embodiments, the subject has moderate or advanced Parkinson's disease. In some embodiments, the subject has an MDS-UPDRS motor score of between 33 and 60.
In some embodiments, the therapeutic composition comprising cells identified as comprising determined dopaminergic precursor cells is administered to treat a neurodegenerative disease, e.g., PD. In some embodiments, the dose of cells is a dose of a composition of cells, e.g., as described in Section III herein.
In some embodiments, the size or timing of the doses is determined as a function of the particular disease or condition in the subject. In some cases, the size or timing of the doses for a particular disease in view of the provided description may be empirically determined.
In some embodiments, the dose of cells is administered to the substantia nigra of the subject. In some embodiments, the dose of cells is administered to one hemisphere of the subject's substantia nigra. In some embodiments, the dose of cells is administered to both hemispheres of the subject's substantia nigra.
In some embodiments, the dose of cells comprises between at or about 250,000 cells per hemisphere and at or about 20 million cells per hemisphere, between at or about 500,000 cells per hemisphere and at or about 20 million cells per hemisphere, between at or about 1 million cells per hemisphere and at or about 20 million cells per hemisphere, between at or about 5 million cells per hemisphere and at or about 20 million cells per hemisphere, between at or about 10 million cells per hemisphere and at or about 20 million cells per hemisphere, between at or about 15 million cells per hemisphere and at or about 20 million cells per hemisphere, between at or about 250,000 cells per hemisphere and at or about 15 million cells per hemisphere, between at or about 500,000 cells per hemisphere and at or about 15 million cells per hemisphere, between at or about 1 million cells per hemisphere and at or about 15 million cells per hemisphere, between at or about 5 million cells per hemisphere and at or about 15 million cells per hemisphere, between at or about 10 million cells per hemisphere and at or about 15 million cells per hemisphere, between at or about 250,000 cells per hemisphere and at or about 10 million cells per hemisphere, between at or about 500,000 cells per hemisphere and at or about 10 million cells per hemisphere, between at or about 1 million cells per hemisphere and at or about 10 million cells per hemisphere, between at or about 5 million cells per hemisphere and at or about 10 million cells per hemisphere, between at or about 250,000 cells per hemisphere and at or about 5 million cells per hemisphere, between at or about 500,000 cells per hemisphere and at or about 5 million cells per hemisphere, between at or about 1 million cells per hemisphere and at or about 5 million cells per hemisphere, between at or about 250,000 cells per hemisphere and at or about 1 million cells per hemisphere, between at or about 500,000 cells per hemisphere and at or about 1 million cells per hemisphere, or between at or about 250,000 cells per hemisphere and at or about 500,000 cells per hemisphere.
In some embodiments, the dose of cells is between at or about 1 million cells per hemisphere and at or about 30 million cells per hemisphere. In some embodiments, the dose of cells is between at or about 5 million cells per hemisphere and at or about 20 million cells per hemisphere. In some embodiments, the dose of cells is between at or about 10 million cells per hemisphere and at or about 15 million cells per hemisphere.
In some embodiments, the number of cells administered to the subject is between about 0.25×10⁶ total cells and about 20×10⁶ total cells, between about 0.25×10⁶ total cells and about 15×10⁶ total cells, between about 0.25×10⁶ total cells and about 10×10⁶ total cells, between about 0.25×10⁶ total cells and about 5×10⁶ total cells, between about 0.25×10⁶ total cells and about 1×10⁶ total cells, between about 0.25×10⁶ total cells and about 0.75×10⁶ total cells, between about 0.25×10⁶ total cells and about 0.5×10⁶ total cells, between about 0.5×10⁶ total cells and about 20×10⁶ total cells, between about 0.5×10⁶ total cells and about 15×10⁶ total cells, between about 0.5×10⁶ total cells and about 10×10⁶ total cells, between about 0.5×10⁶ total cells and about 5×10⁶ total cells, between about 0.5×10⁶ total cells and about 1×10⁶ total cells, between about 0.5×10⁶ total cells and about 0.75×10⁶ total cells, between about 0.75×10⁶ total cells and about 20×10⁶ total cells, between about 0.75×10⁶ total cells and about 15×10⁶ total cells, between about 0.75×10⁶ total cells and about 10×10⁶ total cells, between about 0.75×10⁶ total cells and about 5×10⁶ total cells, between about 0.75×10⁶ total cells and about 1×10⁶ total cells, between about 1×10⁶ total cells and about 20×10⁶ total cells, between about 1×10⁶ total cells and about 15×10⁶ total cells, between about 1×10⁶ total cells and about 10×10⁶ total cells, between about 1×10⁶ total cells and about 5×10⁶ total cells, between about 5×10⁶ total cells and about 20×10⁶ total cells, between about 5×10⁶ total cells and about 15×10⁶ total cells, between about 5×10⁶ total cells and about 10×10⁶ total cells, between about 10×10⁶ total cells and about 20×10⁶ total cells, between about 10×10⁶ total cells and about 15×10⁶ total cells, or between about 15×10⁶ total cells and about 20×10⁶ total cells.
In certain embodiments, the cells, or individual populations of sub-types of cells, are administered to the subject at a range of about 5 million cells per hemisphere to about 20 million cells per hemisphere or any value in between these ranges. Dosages may vary depending on attributes particular to the disease or disorder and/or patient and/or other treatments.
In some embodiments, the patient is administered multiple doses, and each of the doses or the total dose can be within any of the foregoing values. In some embodiments, the dose of cells comprises the administration of from or from about 5 million cells per hemisphere to about 20 million cells per hemisphere, each inclusive.
In some embodiments, the dose of cells, e.g. differentiated cells, is administered to the subject as a single dose or is administered only one time within a period of two weeks, one month, three months, six months, 1 year or more.
In the context of stem cell transplant, administration of a given “dose” encompasses administration of the given amount or number of cells as a single composition and/or single uninterrupted administration, e.g., as a single injection or continuous infusion, and also encompasses administration of the given amount or number of cells as a split dose or as a plurality of compositions, provided in multiple individual compositions or infusions, over a specified period of time, such as a day. Thus, in some contexts, the dose is a single or continuous administration of the specified number of cells, given or initiated at a single point in time. In some contexts, however, the dose is administered in multiple injections or infusions in a single period, such as by multiple infusions over a single day period.
Thus, in some aspects, the cells of the dose are administered in a single pharmaceutical composition. In some embodiments, the cells of the dose are administered in a plurality of compositions, collectively containing the cells of the dose.
In some embodiments, cells of the dose may be administered by administration of a plurality of compositions or solutions, such as a first and a second, optionally more, each containing some cells of the dose. In some aspects, the plurality of compositions, each containing a different population and/or sub-types of cells, are administered separately or independently, optionally within a certain period of time.
In some embodiments, the administration of the composition or dose, e.g., administration of the plurality of cell compositions, involves administration of the cell compositions separately. In some aspects, the separate administrations are carried out simultaneously, or sequentially, in any order.
In some embodiments, the subject receives multiple doses, e.g., two or more doses or multiple consecutive doses, of the cells. In some embodiments, two doses are administered to a subject. In some embodiments, multiple consecutive doses are administered following the first dose, such that an additional dose or doses are administered following administration of the consecutive dose. In some aspects, the number of cells administered to the subject in the additional dose is the same as or similar to the first dose and/or consecutive dose. In some embodiments, the additional dose or doses are larger than prior doses.
In some aspects, the size of the first and/or consecutive dose is determined based on one or more criteria such as response of the subject to prior treatment, e.g. disease stage and/or likelihood or incidence of the subject developing adverse outcomes, e.g., dyskinesia.
In some embodiments, the dose of cells is generally large enough to be effective in improving symptoms of the disease.
In some embodiments, the cells are administered at a desired dosage, which in some aspects includes a desired dose or number of cells or cell type(s) and/or a desired ratio of cell types. In some embodiments, the dosage of cells is based on a desired total number (or number per kg of body weight) of cells in the individual populations or of individual cell types (e.g., TH+ or TH−). In some embodiments, the dosage is based on a combination of such features, such as a desired number of total cells, desired ratio, and desired total number of cells in the individual populations.
Thus, in some embodiments, the dosage is based on a desired fixed dose of total cells and a desired ratio, and/or based on a desired fixed dose of one or more, e.g., each, of the individual sub-types or sub-populations.
In particular embodiments, the numbers and/or concentrations of cells refer to the number of TH-negative cells. In other embodiments, the numbers and/or concentrations of cells refer to the number or concentration of all cells administered.
In some aspects, the size of the dose is determined based on one or more criteria such as response of the subject to prior treatment, e.g. disease type and/or stage, and/or likelihood or incidence of the subject developing toxic outcomes, e.g., dyskinesia.
While various embodiments and aspects of the present invention are shown and described herein, it will be obvious to those skilled in the art that such embodiments and aspects are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention.
The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described. All documents, or portions of documents, cited in the application including, without limitation, patents, patent applications, articles, books, manuals, and treatises are hereby expressly incorporated by reference in their entirety for any purpose.
The abbreviations used herein have their conventional meaning within the chemical and biological arts. The chemical structures and formulae set forth herein are constructed according to the standard rules of chemical valency known in the chemical arts.
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art. See, e.g., Singleton et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY 2nd ed., J. Wiley & Sons (New York, N.Y. 1994); Sambrook et al., MOLECULAR CLONING, A LABORATORY MANUAL, Cold Spring Harbor Press (Cold Spring Harbor, N.Y. 1989). Any methods, devices and materials similar or equivalent to those described herein can be used in the practice of this invention. The following definitions are provided to facilitate understanding of certain terms used frequently herein and are not meant to limit the scope of the present disclosure.
As used herein, the term “about” means a range of values including the specified value, which a person of ordinary skill in the art would consider reasonably similar to the specified value. In embodiments, the term “about” means within a standard deviation using measurements generally acceptable in the art. In embodiments, about means a range extending to +/−10% of the specified value. In embodiments, about means the specified value.
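By way of illustration only, the +/−10% embodiment of “about” can be sketched in Python as follows. The function name `is_about` and the parameter `rel_tol` are hypothetical and chosen for this example; the other embodiments of “about” (e.g., within a standard deviation) are not covered by this sketch.

```python
def is_about(value: float, specified: float, rel_tol: float = 0.10) -> bool:
    """Return True if `value` lies within +/- rel_tol of `specified`.

    Illustrative assumption: models only the embodiment in which "about"
    means a range extending to +/-10% of the specified value.
    """
    return abs(value - specified) <= rel_tol * abs(specified)


# Example: is a measured dose "about" 5 million cells per hemisphere?
print(is_about(5.4e6, 5e6))  # within 10% of the specified value
print(is_about(6.0e6, 5e6))  # 20% above the specified value
```

Setting `rel_tol` to 0 corresponds to the embodiment in which “about” means the specified value itself.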
“Nucleic acid” refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single-, double- or multiple-stranded form, or complements thereof. The term “polynucleotide” refers to a linear sequence of nucleotides. The term “nucleotide” typically refers to a single unit of a polynucleotide, i.e., a monomer. Nucleotides can be ribonucleotides, deoxyribonucleotides, or modified versions thereof. Examples of polynucleotides contemplated herein include single and double stranded DNA, single and double stranded RNA (including siRNA), and hybrid molecules having mixtures of single and double stranded DNA and RNA. Nucleic acids can be linear or branched. For example, nucleic acids can be a linear chain of nucleotides or the nucleic acids can be branched, e.g., such that the nucleic acids comprise one or more arms or branches of nucleotides. Optionally, the branched nucleic acids are repetitively branched to form higher ordered structures such as dendrimers and the like.
The terms also encompass nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphodiester derivatives including, e.g., phosphoramidate, phosphorodiamidate, phosphorothioate (also known as phosphothioate), phosphorodithioate, phosphonocarboxylic acids, phosphonocarboxylates, phosphonoacetic acid, phosphonoformic acid, methyl phosphonate, boron phosphonate, or O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones; non-ionic backbones, modified sugars, and non-ribose backbones (e.g. phosphorodiamidate morpholino oligos or locked nucleic acids (LNA)), including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghvi & Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
The words “complementary” or “complementarity” refer to the ability of a nucleic acid in a polynucleotide to form a base pair with another nucleic acid in a second polynucleotide. For example, the sequence A-G-T is complementary to the sequence T-C-A. Complementarity may be partial, in which only some of the nucleic acids match according to base pairing, or complete, where all the nucleic acids match according to base pairing.
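As a non-limiting sketch, partial and complete complementarity as described above can be expressed as the fraction of positions at which two DNA sequences form a base pair. The names `DNA_PAIRS` and `complementarity_fraction` are hypothetical; the sketch compares the sequences position by position, as in the A-G-T / T-C-A example, and does not model strand orientation or RNA bases.

```python
# Watson-Crick pairing for DNA bases (assumption: DNA only, no analogs).
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementarity_fraction(seq_a: str, seq_b: str) -> float:
    """Fraction of positions at which seq_a and seq_b form a base pair.

    1.0 corresponds to complete complementarity; values between 0 and 1
    correspond to partial complementarity.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if DNA_PAIRS.get(a) == b
    )
    return paired / len(seq_a)


print(complementarity_fraction("AGT", "TCA"))  # every position pairs
print(complementarity_fraction("AGT", "TCG"))  # last position does not pair
```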
The term “complement,” as used herein, refers to a nucleotide (e.g., RNA or DNA) or a sequence of nucleotides capable of base pairing with a complementary nucleotide or sequence of nucleotides. As described herein and commonly known in the art, the complementary (matching) nucleotide of adenosine is thymidine and the complementary (matching) nucleotide of guanosine is cytosine. Thus, a complement may include a sequence of nucleotides that base pair with corresponding complementary nucleotides of a second nucleic acid sequence. The nucleotides of a complement may partially or completely match the nucleotides of the second nucleic acid sequence. Where the nucleotides of the complement completely match each nucleotide of the second nucleic acid sequence, the complement forms base pairs with each nucleotide of the second nucleic acid sequence. Where the nucleotides of the complement partially match the nucleotides of the second nucleic acid sequence, only some of the nucleotides of the complement form base pairs with nucleotides of the second nucleic acid sequence. Examples of complementary sequences include coding and non-coding sequences, wherein the non-coding sequence contains complementary nucleotides to the coding sequence and thus forms the complement of the coding sequence. A further example of complementary sequences are sense and antisense sequences, wherein the sense sequence contains complementary nucleotides to the antisense sequence and thus forms the complement of the antisense sequence.
As described herein the complementarity of sequences may be partial, in which only some of the nucleic acids match according to base pairing, or complete, where all the nucleic acids match according to base pairing. Thus, two sequences that are complementary to each other, may have a specified percentage of nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region).
“Percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
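The calculation described above can be sketched as follows, assuming the two sequences have already been optimally aligned and padded with a gap character so that they have equal length. The function name `percent_identity` and the gap symbol `-` are assumptions made for this illustration; the alignment step itself is not shown.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over a comparison window.

    Counts positions where both aligned sequences carry the identical
    residue, divides by the total number of positions in the window
    (gap positions included), and multiplies by 100.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal length")
    matched = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matched / len(aligned_a)


# Window of 6 positions: 4 identical residues, 1 mismatch, 1 gap.
print(round(percent_identity("ACGT-A", "ACCTGA"), 2))
```

In this sketch gap positions count toward the window length but never as matches, which follows the definition of dividing by the total number of positions in the window of comparison.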
The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using a BLAST or BLAST 2.0 sequence comparison algorithm with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to be “substantially identical.” This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length.
The phrase “stringent hybridization conditions” refers to conditions under which a probe will hybridize to its target subsequence, typically in a complex mixture of nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Techniques in Biochemistry and Molecular Biology—Hybridization with Nucleic Probes, “Overview of principles of hybridization and the strategy of nucleic acid assays” (1993). Generally, stringent conditions are selected to be about 5-10° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal is at least two times background, preferably 10 times background hybridization. Exemplary stringent hybridization conditions can be as follows: 50% formamide, 5×SSC, and 1% SDS, incubating at 42° C., or, 5×SSC, 1% SDS, incubating at 65° C., with wash in 0.2×SSC, and 0.1% SDS at 65° C.
Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This occurs, for example, when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize under moderately stringent hybridization conditions. Exemplary “moderately stringent hybridization conditions” include a hybridization in a buffer of 40% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 1×SSC at 45° C. A positive hybridization is at least twice background. Those of ordinary skill will readily recognize that alternative hybridization and wash conditions can be utilized to provide conditions of similar stringency. Additional guidelines for determining hybridization parameters are provided in numerous references, e.g., Current Protocols in Molecular Biology, ed. Ausubel, et al., supra.
The term “probe” or “primer”, as used herein, is defined to be one or more nucleic acid fragments whose specific hybridization to a sample can be detected. A probe or primer can be of any length depending on the particular technique it will be used for. For example, PCR primers are generally between 10 and 40 nucleotides in length, while nucleic acid probes for, e.g., a Southern blot, can be more than a hundred nucleotides in length. The probe may be unlabeled or labeled as described below so that its binding to the target or sample can be detected. The probe can be produced from a source of nucleic acids from one or more particular (preselected) portions of a chromosome, e.g., one or more clones, an isolated whole chromosome or chromosome fragment, or a collection of polymerase chain reaction (PCR) amplification products. The length and complexity of the nucleic acid fixed onto the target element is not critical to the invention. One of skill can adjust these factors to provide optimum hybridization and signal production for a given hybridization procedure, and to provide the required resolution among different genes or genomic locations.
The term “gene” means the segment of DNA involved in producing a protein; it includes regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons). The leader, the trailer as well as the introns include regulatory elements that are necessary during the transcription and the translation of a gene. Further, a “protein gene product” is a protein expressed from a particular gene.
The word “expression” or “expressed” as used herein in reference to a gene means the transcriptional and/or translational product of that gene. The level of expression of a DNA molecule in a cell may be determined on the basis of either the amount of corresponding mRNA that is present within the cell or the amount of protein encoded by that DNA produced by the cell (Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 18.1-18.88).
Expression of a transfected gene can occur transiently or stably in a cell. During “transient expression” the transfected gene is not transferred to the daughter cell during cell division. Since its expression is restricted to the transfected cell, expression of the gene is lost over time. In contrast, stable expression of a transfected gene can occur when the gene is co-transfected with another gene that confers a selection advantage to the transfected cell. Such a selection advantage may be a resistance towards a certain toxin that is presented to the cell.
The terms “gene ontology” or “gene ontologies” as provided herein are used according to their common meaning in the biological and bioinformatics arts, wherein a gene ontology is a representation of genes, gene expressions and gene properties and their relationships to each other. A gene ontology may include a cellular component (the parts of a cell or its extracellular environment), a molecular function (the elemental activities of a gene product at the molecular level, such as binding or catalysis) and a biological process (operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units such as cells, tissues, organs, and organisms). Each GO term within an ontology has a term name, which may be a word or string of words; a unique alphanumeric identifier; a definition with cited sources; and a namespace indicating the domain to which it belongs.
The term “isolated”, when applied to a nucleic acid or protein, denotes that the nucleic acid or protein is essentially free of other cellular components with which it is associated in the natural state. It can be, for example, in a homogeneous state and may be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein that is the predominant species present in a preparation is substantially purified.
The term “isolated” may also refer to a cell or sample cells. An isolated cell or sample cells are a single cell type that is substantially free of many of the components which normally accompany the cells when they are in their native state or when they are initially removed from their native state. In certain embodiments, an isolated cell sample retains those components from its natural state that are required to maintain the cell in a desired state. In some embodiments, an isolated (e.g. purified, separated) cell or isolated cells, are cells that are substantially the only cell type in a sample. A purified cell sample may contain at least 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of one type of cell. An isolated cell sample may be obtained through the use of a cell marker or a combination of cell markers, either of which is unique to one cell type in an unpurified cell sample.
The term “purified” denotes that a nucleic acid or protein gives rise to essentially one band in an electrophoretic gel. In some embodiments, the nucleic acid or protein is at least 50% pure, optionally at least 65% pure, optionally at least 75% pure, optionally at least 85% pure, optionally at least 95% pure, and optionally at least 99% pure.
A “cell” as used herein, refers to a cell carrying out metabolic or other function sufficient to preserve or replicate its genomic DNA. A cell can be identified by well-known methods in the art including, for example, presence of an intact membrane, staining by a particular dye, ability to produce progeny or, in the case of a gamete, ability to combine with a second gamete to produce a viable offspring. Cells may include prokaryotic and eukaryotic cells. Prokaryotic cells include but are not limited to bacteria. Eukaryotic cells include but are not limited to yeast cells and cells derived from plants and animals, for example mammalian, insect (e.g., Spodoptera) and human cells.
A “stem cell” is a cell characterized by the ability of self-renewal through mitotic cell division and the potential to differentiate into a tissue or an organ. Among mammalian stem cells, embryonic and somatic stem cells can be distinguished. Embryonic stem cells reside in the blastocyst and give rise to embryonic tissues, whereas somatic stem cells reside in adult tissues for the purpose of tissue regeneration and repair.
The term “pluripotent” or “pluripotency” refers to cells with the ability to give rise to progeny that can undergo differentiation, under appropriate conditions, into cell types that collectively exhibit characteristics associated with cell lineages from the three germ layers (endoderm, mesoderm, and ectoderm). Pluripotent stem cells can contribute to tissues of a prenatal, postnatal or adult organism. A standard art-accepted test, such as the ability to form a teratoma in 8-12 week old SCID mice, can be used to establish the pluripotency of a cell population. However, identification of various pluripotent stem cell characteristics can also be used to identify pluripotent cells.
“Pluripotent stem cell characteristics” refer to characteristics of a cell that distinguish pluripotent stem cells from other cells. Expression or non-expression of certain combinations of molecular markers are examples of characteristics of pluripotent stem cells. More specifically, human pluripotent stem cells may express at least some, and optionally all, of the markers from the following non-limiting list: SSEA-3, SSEA-4, TRA-1-60, TRA-1-81, TRA-2-49/6E, ALP, Sox2, E-cadherin, UTF-1, Oct4, Lin28, Rex1, and Nanog. Cell morphologies associated with pluripotent stem cells are also pluripotent stem cell characteristics.
The terms “induced pluripotent stem cell,” “iPS” and “iPSC” refer to a pluripotent stem cell artificially derived (e.g., through man-made manipulation) from a non-pluripotent cell. A “non-pluripotent cell” can be a cell of lesser potency to self-renew and differentiate than a pluripotent stem cell. Cells of lesser potency can be, but are not limited to adult stem cells, tissue specific progenitor cells, primary or secondary cells.
“Self renewal” refers to the ability of a cell to divide and generate at least one daughter cell with the self-renewing characteristics of the parent cell. The second daughter cell may commit to a particular differentiation pathway. For example, a self-renewing hematopoietic stem cell can divide and form one daughter stem cell and another daughter cell committed to differentiation in the myeloid or lymphoid pathway. A committed progenitor cell has typically lost the self-renewal capacity, and upon cell division produces two daughter cells that display a more differentiated (i.e., restricted) phenotype. Non-self-renewing cells refer to cells that undergo cell division to produce daughter cells, neither of which has the differentiation potential of the parent cell type; instead, they generate differentiated daughter cells.
An adult stem cell is an undifferentiated cell found in an individual after embryonic development. Adult stem cells multiply by cell division to replenish dying cells and regenerate damaged tissue. An adult stem cell has the ability to divide and create another cell like itself or to create a more differentiated cell. Even though adult stem cells are associated with the expression of pluripotency markers such as Rex1, Nanog, Oct4 or Sox2, they do not have the ability of pluripotent stem cells to differentiate into the cell types of all three germ layers. Adult stem cells have a limited ability to self renew and generate progeny of distinct cell types. Adult stem cells can include a hematopoietic stem cell, a cord blood stem cell, a mesenchymal stem cell, an epithelial stem cell, a skin stem cell or a neural stem cell. A tissue specific progenitor refers to a cell devoid of self-renewal potential that is committed to differentiate into a specific organ or tissue. A primary cell includes any cell of an adult or fetal organism apart from egg cells, sperm cells and stem cells. Examples of useful primary cells include, but are not limited to, skin cells, bone cells, blood cells, cells of internal organs and cells of connective tissue. A secondary cell is derived from a primary cell and has been immortalized for long-lived in vitro cell culture.
The term “reprogramming” refers to the process of dedifferentiating a non-pluripotent cell into a cell exhibiting pluripotent stem cell characteristics.
A “cell culture” is an in vitro population of cells residing outside of an organism. The cell culture can be established from primary cells isolated from a cell bank or animal, or secondary cells that are derived from one of these sources and immortalized for long-term in vitro cultures.
The terms “culture,” “culturing,” “grow,” “growing,” “maintain,” “maintaining,” “expand,” “expanding,” etc., when referring to cell culture itself or the process of culturing, can be used interchangeably to mean that a cell is maintained outside the body (e.g., ex vivo) under conditions suitable for survival. Cultured cells are allowed to survive, and culturing can result in cell growth, differentiation, or division. For example, in embodiments, the term “expand” refers to the differentiation of an iPSC in vitro. Cells are typically cultured/expanded in media, which can be changed during the course of the culture. The terms “medium,” “media” and “culture solution” refer to the cell culture milieu. Media is typically an isotonic solution, and can be liquid, gelatinous, or semisolid, e.g., to provide a matrix for cell adhesion or support. Media, as used herein, can include the components for nutritional, chemical, and structural support necessary for culturing a cell. The term “media” refers to a solution that includes various components including without limitation inorganic salts, amino acids, vitamins, growth factors, and other protein components. As used herein, “conditions to allow growth” in culture and the like refers to conditions of temperature (typically at about 37° C. for mammalian cells), humidity, CO2 (typically around 5%), in appropriate media (including salts, buffer, serum), such that the cells are able to undergo cell division or at least maintain viability for at least 24 hours, preferably longer (e.g., for days, weeks or months). The term “derived from,” when referring to cells or a biological sample, indicates that the cell or sample was obtained from the stated source at some point in time. 
For example, a cell derived from an individual can represent a primary cell obtained directly from the individual (i.e., unmodified), or can be modified, e.g., by introduction of a recombinant vector, by culturing under particular conditions, or immortalization. In some cases, a cell derived from a given source will undergo cell division and/or differentiation such that the original cell no longer exists, but the continuing cells will be understood to derive from the same source.
Where appropriate, the expansion of iPSCs may be subjected to a process of selection. A process of selection may include a selection marker introduced into an induced pluripotent stem cell upon transfection. A selection marker may be a gene encoding for a polypeptide with enzymatic activity. The enzymatic activity includes, but is not limited to, the activity of an acetyltransferase and a phosphotransferase. In some embodiments, the enzymatic activity of the selection marker is the activity of a phosphotransferase. The enzymatic activity of a selection marker may confer to a transfected induced pluripotent stem cell the ability to expand in the presence of a toxin. Such a toxin typically inhibits cell expansion and/or causes cell death. Examples of such toxins include, but are not limited to, hygromycin, neomycin, puromycin and gentamycin. In embodiments, the toxin is hygromycin. Through the enzymatic activity of a selection marker a toxin may be converted to a non-toxin, which no longer inhibits expansion or causes cell death of a transfected induced pluripotent stem cell. Upon exposure to a toxin a cell lacking a selection marker may be eliminated and thereby precluded from expansion.
Identification of the induced pluripotent stem cell may include, but is not limited to, the evaluation of the aforementioned pluripotent stem cell characteristics. Such pluripotent stem cell characteristics include, without further limitation, the expression or non-expression of certain combinations of molecular markers. Further, cell morphologies associated with pluripotent stem cells are also pluripotent stem cell characteristics. The term “hiPSC-derived neuronal cell” refers to a neuronal progenitor cell (NPC) or a mature neuron that has been derived (e.g., differentiated) from a hiPSC in vitro. The hiPSCs can be differentiated by any appropriate method known in the art.
The development of an embryo can be described as self-assembly. The mother and fetus have closely associated blood vessels so that the fetus can be nourished during development, but the embryo develops by itself, through a series of cell-cell interactions that direct the fate of cells that then influence the fate of other cells. As the embryo develops, cells narrow their possible fates, until only one fate remains. During embryogenesis a pluripotent cell matures through specific stages that cumulatively commit it to a specific fate: first specification, then determination, and finally differentiation.
The term “specification” or “specified” as provided herein refers to the fate of a cell or tissue narrowed to a limited number of specific cell types. A specified cell can still change its specific fate until it reaches the determined state, in which it has only one choice of cell type it can differentiate into.
The term “determination” or “determined” as provided herein refers to a cell or tissue capable of differentiating autonomously even when placed into another region of the embryo or a cluster of differently specified cells in a petri dish.
The term “differentiation” or “differentiate” as provided herein refers to a cell or cells that have acquired a cell type-specific function.
A “specified state” as provided herein refers to cells that can be influenced by their environment but have limited fate options. For example, a bit of ectoderm can be transplanted to another part of the embryo and will interpret the surrounding signals in ectodermal terms and can form many types of neurons, glia, or skin.
A “determined state” as provided herein refers to a cell having a narrow range of fates. For example, determined ventral mesencephalic dopamine neuron precursors cannot make other types of neurons. They are not yet neurons themselves and may or may not express the definitive markers of specific cell types.
A “neuronal progenitor cell” is a cell that has a tendency to differentiate into a neuronal cell and does not have the pluripotent potential of a stem cell. A neuronal progenitor is a cell that is committed to the neuronal lineage and is characterized by expressing one or more marker genes that are specific for the neuronal lineage. Examples of neuronal lineage marker genes are N-CAM, the intermediate-filament protein nestin, SOX2, vimentin, A2B5, and the transcription factor PAX-6 for early stage neural markers (i.e. neural progenitors); NF-M, MAP-2AB, synaptosin, glutamic acid decarboxylase, βIII-tubulin and tyrosine hydroxylase for later stage neural markers (i.e. differentiated neural cells). The terms “neural” and “neuronal” are used according to their common meaning in the art and can be used interchangeably throughout.
In embodiments, the neuronal progenitor cell includes an increased expression level of one or more genes within one or more gene ontologies of Table 1. In embodiments, the neuronal progenitor cell includes a decreased expression level of one or more genes within one or more gene ontologies of Table 8. Where the neuronal progenitor cell includes an increased expression level or a decreased expression level of one or more of the genes within one or more gene ontologies of Table 1 or Table 8, respectively, the neuronal progenitor cell may be a determined dopaminergic precursor cell or a dopaminergic cell.
An “undesirable neuronal progenitor cell” is a cell that is unable to differentiate into a dopaminergic neuron. An undesirable neuronal progenitor cell is not a determined dopaminergic precursor cell or a dopaminergic cell. An undesirable neuronal progenitor cell may be a cell capable of differentiating into neuron types other than dopaminergic cells.
A “specified cell” or “specified tissue” as used herein refers to a cell capable of differentiating autonomously (i.e., by itself) when placed in an environment that is neutral with respect to the developmental pathway, such as in a petri dish or test tube. At the stage of specification, cell commitment may still be capable of being altered. If a specified cell is transplanted to a population of differently specified cells, the fate of the transplant will be altered by its interactions with its new neighbors.
The term “determined dopaminergic precursor cell” as provided herein refers to a cell that differentiates into a dopaminergic neuron and cannot differentiate into a non-dopaminergic cell. The term “determined cell” as provided herein refers to a cell capable of differentiating autonomously when placed into a region of an embryo that is unrelated to said cell. For example, an unrelated region for a determined dopaminergic precursor cell is any organ or tissue other than the brain. The term “determined cell” as provided herein further includes a cell capable of differentiating autonomously when placed into a cluster of differently specified cells in a petri dish. If a cell or tissue type is able to differentiate according to its specified fate even under these circumstances, the commitment is considered irreversible. Thus, a “determined dopaminergic precursor cell” is a cell capable of differentiating into a dopaminergic neuron independently of its environment. A determined dopaminergic precursor cell may express Foxa2 or Nurr1. A determined dopaminergic precursor cell may not express serotonin.
A “dopaminergic cell” or a “differentiated dopaminergic cell” as used herein refers to a cell capable of synthesizing the neurotransmitter dopamine. In embodiments, the dopaminergic cell is an A9 dopaminergic cell. The term “A9 dopaminergic cell” refers to the most densely packed group of dopaminergic cells in the human brain, which are located in the pars compacta of the substantia nigra in the midbrain of healthy, adult humans.
The term “sample” includes sections of tissues such as biopsy and autopsy samples, and frozen sections taken for histological purposes. Such samples include blood and blood fractions or products (e.g., bone marrow, serum, plasma, platelets, red blood cells, and the like), sputum, tissue, cultured cells (e.g., primary cultures, explants, and transformed cells), stool, urine, other biological fluids (e.g., prostatic fluid, gastric fluid, intestinal fluid, renal fluid, lung fluid, cerebrospinal fluid, and the like), etc. A sample is typically obtained from a “subject” such as a eukaryotic organism, most preferably a mammal such as a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or a bird; reptile; or fish. In some embodiments, the sample is obtained from a human.
A “control” sample or value refers to a sample that serves as a reference, usually a known reference, for comparison to a test sample. For example, a test sample can be taken from a test condition, e.g., in the presence of a test compound, and compared to samples from known conditions, e.g., in the absence of the test compound (negative control), or in the presence of a known compound (positive control). A control can also represent an average value gathered from a number of tests or results. One of skill in the art will recognize that controls can be designed for assessment of any number of parameters. For example, a control can be devised to compare therapeutic benefit based on pharmacological data (e.g., half-life) or therapeutic measures (e.g., comparison of side effects). One of skill in the art will understand which controls are valuable in a given situation and be able to analyze data based on comparisons to control values. Controls are also valuable for determining the significance of data. For example, if values for a given parameter are widely variant in controls, variation in test samples will not be considered as significant.
As used herein, the term “neurodegenerative disorder” refers to a disease or condition in which the function of a subject's nervous system becomes impaired. Examples of neurodegenerative diseases that may be treated with a compound, pharmaceutical composition, or method described herein include Alexander's disease, Alpers' disease, Alzheimer's disease, Amyotrophic lateral sclerosis, Ataxia telangiectasia, Batten disease (also known as Spielmeyer-Vogt-Sjogren-Batten disease), Bovine spongiform encephalopathy (BSE), Canavan disease, chronic fatigue syndrome, Cockayne syndrome, Corticobasal degeneration, Creutzfeldt-Jakob disease, frontotemporal dementia, Gerstmann-Sträussler-Scheinker syndrome, Huntington's disease, HIV-associated dementia, Kennedy's disease, Krabbe's disease, kuru, Lewy body dementia, Machado-Joseph disease (Spinocerebellar ataxia type 3), Multiple sclerosis, Multiple System Atrophy, myalgic encephalomyelitis, Narcolepsy, Neuroborreliosis, Parkinson's disease, Pelizaeus-Merzbacher Disease, Pick's disease, Primary lateral sclerosis, Prion diseases, Refsum's disease, Sandhoff's disease, Schilder's disease, Subacute combined degeneration of spinal cord secondary to Pernicious Anaemia, Schizophrenia, Spinocerebellar ataxia (multiple types with varying characteristics), Spinal muscular atrophy, Steele-Richardson-Olszewski disease, progressive supranuclear palsy, or Tabes dorsalis.
A “global profile” as referred to herein is a profile of a characteristic, such as, but not limited to, expression of mRNA, microRNA, DNA methylation, DNA sequence, transcription factor binding, proteins, proteome-wide phospho-proteins, in which there is not a preselection of what genes, DNA sites or what proteins or what subset of the characteristic should be profiled with a specific technique (e.g. microarrays).
A “protein-protein network” as referred to herein is a list of pairwise interacting proteins. These interactions have been derived from previous studies where e.g. the binding of a protein “A” to protein “B” has been shown with biochemical, functional or other biological assays. This interaction can represent a physical covalent or non-covalent binding event of protein “A” with protein “B” or the transient binding of protein “A” to protein “B” in a short lived biochemical reaction such as when protein “A” phosphorylates protein “B”.
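One possible way to hold such a list of pairwise interactions in memory is sketched below in Python. The protein names “A” and “B” are the placeholders used in the definition above; “C”, the interaction labels, and the function name `build_adjacency` are assumptions added only for illustration.

```python
from collections import defaultdict

# Each entry is one pairwise interaction: (protein, partner, kind of interaction).
# "A" and "B" follow the text; "C" and the labels are illustrative assumptions.
interactions = [
    ("A", "B", "physical binding"),
    ("A", "C", "transient binding (phosphorylation)"),
]


def build_adjacency(pairs):
    """Return a mapping protein -> set of interaction partners (undirected)."""
    adjacency = defaultdict(set)
    for first, second, _kind in pairs:
        adjacency[first].add(second)
        adjacency[second].add(first)
    return dict(adjacency)


network = build_adjacency(interactions)
print(sorted(network["A"]))
```

The sketch treats every interaction as undirected. A directed representation may be preferable where the direction matters, for example when protein “A” phosphorylates protein “B”.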
A “Stem Cell Matrix” as referred to herein is a collection or database of global profiling data, such as global molecular analysis profiles, which may be gene expression profiles, microRNA expression profiles, non-coding RNA profiles, DNA methylation profiles, transcription factor binding profiles, proteomic profiles, global proteome-wide phospho-protein profiles, DNA sequence profiles, or a combination of elements of the mentioned global profiles.
A “transcriptional profile” as referred to herein is the complete or partial set of data obtained from a cell or a population of cells that can be determined from a single time point or over a period of time, consisting of the RNA types that are transcribed from the genome. These RNA types include, but are not limited to, mRNA, microRNA (miRNA), PIWI-interacting RNAs (piRNAs), endogenous small interfering RNAs (e-siRNAs), TINY RNAs (tiRNA), long non-coding RNAs or a combination of the mentioned RNA-types.
A “computer network” as referred to herein is one or more computers in operable communication with each other. “Computer implemented” refers to one or more steps or actions being performed by a computer, computer system, or computer network. A “computer program product” as referred to herein is a product which can be implemented and used on a computer, such as software.
An “unsupervised classification” as referred to herein is a computational, algorithm-based classification system, which builds models based on a set of inputs where not all labels for all samples are available or known or understood. As disclosed herein, an approach that others have defined as semi-supervised machine learning, which combines both labeled and unlabeled examples to generate an appropriate function or classifier, can be used as an unsupervised classification system.
An “unsupervised cluster method” as referred to herein is an unsupervised machine learning approach to cluster transcriptional profiles of the cell preparations into stable groups. For example, consensus clustering (Monti, S., P. Tamayo, J. Mesirov and T. Golub (2003). “Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data.” Machine Learning 52 (1-2): 91-118) outputs a sample-wise distance matrix where the distance between every sample and every other sample in the dataset is represented by a value between 1 (indistinguishably similar in the context of the dataset) and 0 (no similarity detectable in the context of the dataset). A cluster is defined in the consensus clustering framework as a set of samples with high similarity according to the sample-wise distance matrix, based on a cutoff set by the consensus clustering algorithm individually for each model. Any other algorithm which outputs a fitting clustering model with a distance measure among all samples can be used instead of the consensus clustering algorithm.
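The sample-wise matrix described above can be illustrated with a minimal, pure-Python sketch of the resampling idea. It is not the implementation of Monti et al.: a tiny k-means with farthest-first initialisation is swapped in as the base clusterer, the number of clusters is fixed rather than selected by the algorithm, and the function names, parameters, and toy two-dimensional “profiles” are assumptions made for illustration. Entries follow the convention stated in the text, where 1 means two samples were always grouped together and 0 means they never were.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_labels(points, k, n_iter=10):
    """Tiny k-means with farthest-first initialisation; returns one label per point."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def consensus_matrix(profiles, k=2, n_resamples=100, fraction=0.8, seed=0):
    """Fraction of resamples in which each pair of samples was clustered together."""
    rng = random.Random(seed)
    n = len(profiles)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, int(round(fraction * n)))
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)  # subsample without replacement
        labels = kmeans_labels([profiles[i] for i in idx], k)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]


# Toy, made-up profiles: two well-separated groups of three samples each.
profiles = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
            (9.8, 10.1), (10.2, 9.9), (10.0, 10.3)]
matrix = consensus_matrix(profiles)
```

For these toy data, samples within the same group are expected to score near 1 and samples from different groups near 0. Real expression profiles would have many more dimensions and less clear-cut separation.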
A “similar label profile” as referred to herein may be a common regulatory biochemical or metabolic activity. A similar label profile could be labels from the reference data set (e.g. induced pluripotent stem cells), labels which were derived computationally (e.g. some or all samples belonging to one or more specified clusters) or a combination thereof (e.g. some or all induced pluripotent stem cells which also belong to one or more computationally derived clusters). This could be the identification of a set of marker genes, proteins or pathways different among computationally derived clusters, which can be identified in the future with other biochemical techniques and thus allow identification of computationally identified cluster members with a biochemical assay.
A “labeled associated biological class” as referred to herein is a class based upon a biological definition of a cell, such as by markers or expression, with the main characteristic being that the class is determined by a subset of the total possible profile information.
A “cell characteristic analysis system” as referred to herein is a system, which can assay a characteristic of a cell, such as gene expression, microRNA expression, or methylation patterning.
“Obtaining” as used in the context of data or values, such as characteristic data or values, refers to acquiring the data or values. They can be acquired, for example, by collection, such as through a machine, such as a microarray analysis machine. They can also be acquired by downloading or otherwise retrieving data that has already been collected and, for example, stored in a way in which it can be retrieved at a later time.
“Outputting” as referred to herein means an analytical result after processing data by an algorithm. An “updated reference database” as referred to herein is a reference database which has had a dataset merged into it. A “cell dataset” refers to any collection of characteristic data. “Characteristic data” refers to any data of a cell, such as gene expression, microRNA expression, or for example, methylation patterning.
Specific and preferred values disclosed for components, ingredients, additives, cell types, markers, and like aspects, and ranges thereof, are for illustration only; they do not exclude other defined values or other values within defined ranges. The compositions, apparatus, and methods of the disclosure include those having any value or any combination of the values, specific values, more specific values, and preferred values described herein.
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
Among the provided embodiments are:
(a) a first incubation comprising exposing the cells to (i) an inhibitor of TGF-β/activin-Nodal signaling; (ii) at least one activator of Sonic Hedgehog (SHH) signaling; (iii) an inhibitor of bone morphogenetic protein (BMP) signaling; and (iv) an inhibitor of glycogen synthase kinase 3β (GSK3β) signaling, optionally under conditions to differentiate the cells to floor plate midbrain progenitor cells, optionally wherein the first incubation is initiated on day 0 of the culturing; and
(b) a second incubation of cells after the first incubation, wherein the second incubation comprises culturing the cells under conditions to neurally differentiate the cells, optionally wherein the second incubation is initiated at or about day 11 after the first incubation, and further optionally wherein the second incubation is for between at or about 11 and at or about 25 days.
Among the provided embodiments are:
1. A computer implemented method of identifying a determined dopaminergic precursor cell within an in vitro population of neuronal progenitor cells, the method comprising:
receiving a test dataset comprising data including gene expression profile information for an in vitro population of neuronal progenitor cells;
querying a gene expression reference database to compare said test dataset with said gene expression reference database, said gene expression reference database comprising gene expression profile information for a desirable determined dopaminergic precursor cell; and
outputting a computed label classification comprising an indication of whether said in vitro population of neuronal progenitor cells comprises a determined dopaminergic precursor cell.
2. The computer implemented method of embodiment 1, wherein said gene expression profile information for said desirable determined dopaminergic precursor cell comprises increased gene expression levels relative to a pluripotent stem cell for a first gene set, wherein said first gene set comprises at least one increased gene within one or more first gene ontologies selected from the group consisting of: GO0005509, GO0016339, GO0007416 and GO0048731.
3. The computer implemented method of embodiment 2, wherein said at least one increased gene is selected from the group consisting of: CAPN14, FAT3, FAT4, PCDHGC4, SLC8A1, SLIT2, CEMIP2, CDHR3, CDH2, DRD2, EPHB2, MAGI2, PCDHB11, PCDHB13, PCDHB14, PCDHB16, PCDHB2, ADGRG6, ELF5, EPHA7, FOXP1, GDF7, HOXA1, MINAR1, MSX1, NRBP2, NRIP1, PITX3, POU6F2, PTPRO, SLC35D1, TCF12, ZFHX3 and ZNF703.
4. The computer implemented method of one of embodiments 1 to 3, wherein said gene expression profile information for said desirable determined dopaminergic precursor cell comprises decreased gene expression levels relative to a pluripotent stem cell for a second gene set, wherein said second gene set comprises at least one decreased gene within one or more second gene ontologies selected from the group consisting of: GO0070887, GO0044459 and GO0044281.
5. The computer implemented method of embodiment 4, wherein said at least one decreased gene is selected from the group consisting of: ADCY8, AKR1C3, ALDH3A1, APRT, ASNS, BAX, BBC3, CCND1, CDH5, CH25H, CMKLR1, COL16A1, CXCL1, CXCL2, EDNRB, EEF1E1, RIPOR2, FGF10, FGF22, FZD7, GJA1, GNG8, GNPNAT1, HPGD, ICAM1, ITPR2, KLF1, KLF15, LEP, LPL, LRRC32, MAP3K5, MX1, MYC, NME1, NME2, NQO2, NR1D1, P2RY1, PCOLCE2, PDE4A, PDIA5, PFKP, PHGDH, PLK5, PPP1R14A, PRODH, PSMB8, PSMB9, PYCR1, RAPGEF3, RYR2, SCARB1, SHMT2, SIPA1, SPHK1, TRIM22, VDR, ADA, ADGRG3, ADGRL4, ANK1, ART3, CA11, CABP1, CDH15, CDHR1, COL13A1, EPHA6, CALHM6, GRID2IP, HS3ST3B1, ICAM5, JCAD, LGR6, LRRC38, NOXO1, PDPN, PLPPR5, PODXL, RAMP3, RGS7BP, RIMS4, RTBDN, RTN4RL2, S100A10, SEMA4A, SGCG, SH2D5, SHISA9, SHROOM1, SLC22A3, SLC24A2, SLC29A2, SLC6A11, SLC7A10, SLC7A5, SLCO2A1, STAC2, STYK1, TMC1, UNC13A, WWC1, ABCG2, ACSBG1, ACSS1, ACYL, AHCY, ALOX12B, AMD1, ARG2, ASST, BCAT1, CHST2, CLN8, ENTPD2, FABP5, FADS3, FUT4, FUT9, GAL3ST3, GMDS, HACD1, HAS3, HPD, KYAT1, LDHD, MPP1, OGDHL, PDE4A, PGM1, PIPDX, PLAAT3, PLA2G4C, PLCB3, PNP, PSAT1, PTGES, REXO2, SCARB1, SLC27A6, SPHK1, STAB2, UAP1L1 and UCK2.
6. The computer implemented method of one of embodiments 1 to 5, further comprising a machine learning model trained to determine whether said in vitro population of neuronal progenitor cells includes said determined dopaminergic precursor cell, said machine learning model outputting said computed label classification.
7. The computer implemented method of one of embodiments 1 to 6, wherein said in vitro population of neuronal progenitor cells are formed by allowing an induced pluripotent stem cell (iPSC) to expand in vitro.
8. The computer implemented method of one of embodiments 1 to 7, wherein said iPSC is a human iPSC.
9. The computer implemented method of one of embodiments 1 to 8, wherein said iPSC is allowed to expand for at least 15 days.
10. The computer implemented method of one of embodiments 1 to 9, wherein said iPSC is allowed to expand for about 18 days.
11. The computer implemented method of one of embodiments 1 to 10, wherein said gene expression profile information for said desirable determined dopaminergic precursor cell comprises an undesirable gene expression profile comprising one or more undesirable genes.
12. The computer implemented method of embodiment 11, wherein said one or more undesirable genes is a cancer marker gene.
13. The computer implemented method of embodiment 11, wherein said one or more undesirable genes is a tyrosine hydroxylase gene.
14. The computer implemented method of embodiment 6, wherein said machine learning model is a best fitting classification model identified by an algorithm as most stable to random perturbations.
15. The computer implemented method of embodiment 14, wherein said best fitting classification model can cluster individual datasets such that each dataset within a cluster is indistinguishable from each other dataset within said cluster.
16. The computer implemented method of one of embodiments 1-15, comprising identifying computationally derived class labels based only on biological characteristics.
17. The computer implemented method of one of embodiments 1-16, comprising identifying differences in at least one dataset for at least one label between at least two samples in at least two clusters.
18. The computer implemented method of one of embodiments 1-17, comprising filtering within a cluster for samples having a similar label profile.
19. The computer implemented method of one of embodiments 1-18, comprising defining differentially regulated protein-protein networks.
20. The computer implemented method of embodiment 19, comprising using said protein-protein networks to define a class membership, manipulate class membership, or define biological function of said neuronal progenitor cells.
21. The computer implemented method of embodiment 14, wherein said best fitting classification model can cluster individual datasets such that each dataset within a cluster is different from each other individual dataset.
22. The computer implemented method of one of embodiments 1-21, wherein said computed label classification is an unsupervised classification of said updated reference database comprising clustering RNA, DNA and/or protein profiles.
23. The computer implemented method of one of embodiments 1-22, wherein said gene expression profile information is obtained from microarray analysis of cellular RNA.
24. The computer implemented method of one of embodiments 1-23, wherein said computed label classification is an unsupervised machine classification comprising a bootstrapping sparse non-negative matrix factorization.
25. The computer implemented method of one of embodiments 1-24, wherein said gene expression reference database comprises transcriptional profiles of one or more dopaminergic neurons.
26. The computer implemented method of one of embodiments 1-25, further comprising classifying cells with said in vitro population of neuronal progenitor cells based at least in part on a computationally derived protein-protein network.
27. The method of one of embodiments 1-26, wherein said gene expression profile information comprises a transcriptional profile.
28. The computer implemented method of one of embodiments 1-27, wherein said gene expression reference database comprises known class labels.
29. The computer implemented method of one of embodiments 1-28, wherein said gene expression reference database forms part of a storage medium.
30. The computer implemented method of one of embodiments 1-29, wherein receiving said test dataset comprises receiving input from an array analysis system.
31. The computer implemented method of one of embodiments 1-29, wherein receiving the test dataset comprises receiving input via a computer network.
32. The computer implemented method of one of embodiments 1-29, wherein said data in said reference database is associated with one or more labeled associated biological classes of the cells.
The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.
Methods of Identifying Dopaminergic Neurons and Progenitor Cells
The differentiation of induced pluripotent stem cells (iPSC) or embryonic stem cells (ESC) into neurons (Studer, 2012) is a developmental process which adheres to the principles of developmental biology.
A method was developed for evaluating the whole cell phenotype of a cell type, for instance that of dopaminergic neurons, based on gene expression data collected during differentiation. An exemplary workflow for this method is shown in
Parameter #1: a NeuroScore, the output of a logistic regression model that measured the probability of a “test” developing neuronal cell preparation (e.g., an in vitro population of neuronal progenitor cells) being a phenotypic match to a reference developmentally-determined dopaminergic neuron (determined dopaminergic precursor cell). See
Parameter #2: a Novelty Score that indicated the phenotypic deviation of a “test” dopaminergic neuron preparation (in vitro population of neuronal progenitor cells) when compared to a known reference set of developmentally-determined dopaminergic neurons. The Novelty Score captured technical as well as biological variation in the data. Larger Novelty Score values indicated gene expression patterns not usually observed in the standard reference set. According to the NeuroTest algorithm, high quality determined day 18 dopaminergic lines (determined dopaminergic precursor cells) had a NeuroScore ≥500 and a Novelty Score ≤0.48. These thresholds allowed a sample to be labelled a “pass”, meaning that it had a high likelihood of continuing to mature into a therapeutically viable dopaminergic neuron as cellular development continued to day 25 and beyond.
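The two-parameter decision rule above can be illustrated with a minimal Python sketch. Only the two cut-off values are taken from the text; the function name `neurotest_label` and the "pass"/"fail" return strings are hypothetical and are not part of the NeuroTest pipeline described herein.

```python
# Thresholds as stated in the text for day 18 dopaminergic lines.
NEUROSCORE_MIN = 500.0
NOVELTY_MAX = 0.48

def neurotest_label(neuroscore, novelty_score):
    """Return 'pass' only when both stated thresholds are met (hypothetical helper)."""
    passed = neuroscore >= NEUROSCORE_MIN and novelty_score <= NOVELTY_MAX
    return "pass" if passed else "fail"
```

Both conditions must hold at once, so a sample with a high NeuroScore but an elevated Novelty Score is still labelled a "fail" in this sketch.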
This style of two-parameter descriptor for evaluating the whole cell phenotype of a cell type is reminiscent of a different and distinct cell test called PluriTest. The new test procedure provided herein is focused on identifying a specific transitory developmental state of a cell type (e.g., a determined dopaminergic precursor cell), and then imputing a likelihood for its developmental end point. This was not the case for PluriTest, which was solely focused on identifying the stable cell state known as pluripotency (Muller et al., 2011).
Underlying NeuroTest were two custom data analysis methods: [1] a reference-neuron data model, based on generated gene expression data and publicly available neuron gene expression data and [2] a computing method to compare RNAseq gene expression data coming from new neuronal test samples to the reference gene expression data summarized in the model. The exemplary workflow depicted in
A. The Design and Construction of the NeuroTest Reference Set Data Model
To generate the reference datasets used in developing the NeuroTest model, dopaminergic neuron cellular samples were generated by differentiation of iPSCs in vitro and sampling of cell lines as they differentiated from d0 to d60, or beyond. Sample by sample, mRNA was extracted in bulk to enable the determination of the cell's gene expression pattern (Hrdlickova et al., 2017). The integration and analysis of these gene expression patterns was responsible for the creation of the developmentally-determined neuron data model used in NeuroTest.
To measure these gene expression patterns, total RNA was extracted from DA neurons using the AllPrep DNA/RNA Mini Kit (QIAGEN) following the manufacturer's protocol. RNA quality was assessed based on RNA integrity number (RIN) using an Agilent Bioanalyzer. Any samples with RIN less than 7.5 were re-isolated. Paired end sequencing libraries were prepared using the Illumina PolyA+ TruSeq mRNA Library Prep kit V2 and sequenced using an Illumina HiSeq2500. Samples were sequenced to an average of 30 million paired end reads (Hrdlickova et al., 2017). The reads were converted into a table of gene expression data by aligning the reads to the transcriptome (Salmon version 0.7.2, (Patro et al., 2017)) and counting how many reads aligned to each gene. The summed counts directly reflected the concentration of a specific mRNA transcript in the cell at the time of the RNA extraction. Read counts were normalized to TPM (Transcripts Per Million) values before analysis by non-negative matrix factorization (NMF) (Brunet et al., 2004).
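For illustration, the TPM normalization mentioned above can be sketched in a few lines of Python. This is a generic statement of the TPM formula and not the Salmon implementation (Salmon reports TPM directly); the function name `counts_to_tpm` and the use of transcript lengths in bases are assumptions made for the example.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert read counts to TPM (hypothetical helper).

    counts:     reads assigned to each transcript
    lengths_bp: (effective) transcript lengths in bases, same order
    """
    # Step 1: length-normalize to reads per kilobase.
    rates = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    # Step 2: rescale so that the values of one sample sum to one million.
    return [r / total * 1e6 for r in rates]
```

Because every sample is scaled to the same total, TPM values are comparable across samples with different sequencing depths, which is convenient for a factorization applied to many samples at once.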
After sequencing, RNAseq datasets as well as microarray datasets were included in the NeuroTest model, spanning a variety of neuron-focused gene expression datasets. Together, these reflected the discriminatory needs of the model and provided a perspective on intra- and inter-patient cell line variation, as well as the sample-to-sample biological and technical variation present in DA neuron preparations. The datasets included:
6 RNAseq datasets from DA neurons used for a successful rat neuron transplantation study (60 rats in study), wherein transplantation led to reversal of the effect of a Parkinsonian model brain lesion. These were “gold standard” datasets which can be thought of as a dopaminergic neuron substitute for iPSC lines which have been “proven” pluripotent by passing the teratoma assay (Daley et al., 2009). For this transplantation study, iPSCs were generated from six patients with Parkinson's disease (PD). First, punch biopsies were used to harvest skin fibroblasts from each patient. Tissue from the biopsies was minced with a scalpel and subjected to collagenase or trypsin treatment before being placed in culture. The fibroblasts were then reprogrammed to integration-free iPSCs using Sendai virus and frozen at passage 10.
After reprogramming, iPSCs were placed in an in vitro dopaminergic neuron differentiation protocol prior to being transplanted in a PD rat model. In this model, rats received unilateral stereotaxic injection of 6-hydroxydopamine (6-OHDA) into the substantia nigra or the medial forebrain bundle. This lesioning led to asymmetric dopamine discharge after amphetamine treatment (i.e., dopamine was discharged only from the unlesioned hemisphere) that caused lesioned rats to circle in one direction when moving. In this study, after baseline circling behavior was measured in lesioned rats, neural precursors at day 18 of the dopaminergic neuron differentiation protocol were transplanted into the lesioned hemisphere. Rats were then periodically tested for amphetamine-induced circling. Six to eight weeks after transplant, the net number of amphetamine-induced rotations was reduced to zero. This result showed that transplantation of developmentally determined dopaminergic precursor cells (i.e., neural precursors at day 18 of the dopaminergic neuron differentiation protocol) led to the reversal or amelioration of PD symptoms.
70 Microarray datasets from dopaminergic neuron preparations. These were quality controlled and annotated with an indication of final dopamine production levels. Microarray datasets included dopaminergic neuron preparations from day 25 of a dopaminergic neuron differentiation protocol, and iPSCs subjected to this protocol were generated from 12 PD patients.
47 RNAseq datasets from dopaminergic neuron preparations, annotated with quality control data for Tyrosine Hydroxylase staining followed by flow cytometry. Cell lines were sampled at day 0, day 13, day 18 and day 25 of a dopaminergic neuron differentiation protocol. These datasets were collected using iPSCs generated from the same PD patients as above as well as from healthy control subjects.
56 RNAseq datasets from dopaminergic neuron preparations originating from 7 individuals, each with biological replicate clones and sampled at day 0, day 13, day 18 and day 25 of a dopaminergic neuron differentiation protocol. These datasets were collected using iPSCs generated from the same PD patients as above as well as from a healthy control subject.
8 RNAseq spiked mixtures (0.1%, 1% spike) of dopaminergic neurons with iPSC. These datasets were collected using iPSCs generated from the same PD patients as above as well as from healthy control subjects.
Some of these datasets contained samples with known and characterized imperfections, such as chromosome abnormalities. These imperfections can be labelled, and their inclusion enhances the discriminatory power of the NeuroTest model.
B. The NeuroTest Data Model and Non-Negative Matrix Factorization (NMF)
For training the NeuroTest data model, non-negative matrix factorization (NMF) was first applied to the reference datasets (RNAseq and microarray datasets) described in Section A above. In contrast to distance-based clustering algorithms, such as hierarchical clustering, NMF uses matrix factorization to detect relations between items (Brunet et al., 2004). The dataset was represented as a large matrix, called the V matrix, which contained N mRNAs and M cell lines. Over many iterations, NMF computed two component matrices, the W matrix (an N×k matrix) and the H matrix (a k×M matrix), which when multiplied together approximated the complete matrix for the dataset. Initial values in the W and H matrices were chosen randomly, and each iteration attempted to minimize the distance between WH and V. Clustering of cell lines was read out from the H matrix, in which each entry was indexed to a cluster number and a cell line, and contained a value indicating how well the cell line fit in that cluster (Brunet et al., 2004).
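The factorization described above can be illustrated with a minimal pure-Python sketch of conventional NMF. It uses the standard multiplicative update rules for the squared (Frobenius) distance between WH and V; the text does not state which update rule or distance was used, so that choice, the helper names, and the toy matrix are assumptions. It is not the R implementation referenced herein.

```python
import random

def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Distance between V and its approximation WH."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx)) ** 0.5

def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Factor V (N x M) into W (N x k) and H (k x M), all entries non-negative."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Random non-negative starting values, as described in the text.
    W = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return W, H

def cluster_assignments(H):
    """Assign each cell line (column of H) to the metagene with the largest loading."""
    return [max(range(len(H)), key=lambda a: H[a][j]) for j in range(len(H[0]))]

# Toy V matrix: 4 genes (rows) by 3 cell lines (columns).
V = [[1, 0, 2], [0, 1, 1], [2, 0, 4], [0, 3, 3]]
W, H = nmf(V, k=2)
```

Reading cluster membership from the largest entry in each column of H follows the read-out described above. Because the starting values are random, different seeds can give different factorizations, which is one reason stability across repeated runs is commonly examined.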
The criterion that conventional NMF (V ≈ W×H) optimizes is the quality of approximation of all samples in the V matrix with a given number of metagenes. The number of metagenes is equivalent to k; the W matrix reflects how each gene in the V matrix contributes to a metagene; and the H matrix reflects cell lines' expression levels of these metagenes. Sometimes, approximation of all samples in the V matrix can lead to inappropriate “placement” of metagenes/meta-samples, for example: (1) between determined and less constrained stages, or (2) closer to an easy to approximate, large, low heterogeneity subgroup such as day 0. Therefore, discriminant NMF (Zafeiriou et al., 2006) was selected, which used the class labels in the training of the NMF model for detecting developmentally-determined cell types. Class labels indicated whether or not a cell line was at day 18 or later of the dopaminergic neuron differentiation protocol. To increase tolerance towards platform specific technical artifacts, the model was pre-trained on an initial collection of Illumina Beadarray data and lifted via a virtual array approach to the RNA-seq platform. Model lifting was accomplished by using DNA probe sequence matching and summing code, quantile normalization, and transfer filtering. The “novelty” detection used conventional NMF since all samples were considered to stem from the same class of determined dopaminergic neurons (determined dopaminergic precursor cells). In this example, a relatively low dimensionality of k=3 (i.e., number of metagenes) was used.
After NMF was performed, the NeuroTest data model was then trained based on the outputs of NMF. Specifically, a logistic regression model was trained using metagene expression levels (the H matrix) and the class labels indicating whether or not a cell line was at day 18 or later of the dopaminergic neuron differentiation protocol. The number and selection of metagenes used for training (rows of the H matrix) was chosen based on a systematic search procedure optimizing for high accuracy in predicting class labels. Metagenes highly expressed in the target class (i.e., dopaminergic differentiation day 18 or later) were used for training. Parameters were selected by 5-fold cross-validation (Hastie et al., 2009) and evaluated on an unused portion of the training dataset which had been set aside for this purpose. Defined mixtures were used to identify the sensitivity of the approach, and to define cut-off boundaries.
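As an illustration of the training step above, the following Python sketch fits a logistic regression to metagene expression levels by plain gradient descent. The toy loadings, the learning rate, and the function names are assumptions; the metagene selection search, the 5-fold cross-validation, and the held-out evaluation described in the text are omitted for brevity.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(weights, bias, x):
    """Probability that a sample with metagene loadings x belongs to the target class."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_logistic(X, y, lr=0.5, n_iter=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Toy data: one row per cell line (a column of H), two metagenes.
# Label 1 = day 18 or later of the differentiation protocol, 0 = earlier.
X = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.7], [0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
y = [0, 0, 0, 1, 1, 1]
weights, bias = train_logistic(X, y)
```

In this toy setting the first metagene is the one highly expressed in the target class, so the fitted weight on it is positive and the predicted probability rises with its loading.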
C. Method to Compare the Input Test Data with the NeuroTest Data Model
After training of the NeuroTest model, test samples containing RNAseq data from separate developing neuronal preparations were prepared for input. Specifically, a TPM (Transcripts Per Million) based “virtual array” was constructed for each test sample from its RNAseq data. A “virtual array” probe set was generated by locating the exact match probe sequences from the HT12v4 Illumina array in the Gencode v25 transcriptome sequences. This “virtual array” probe set was pruned for probes with either no match in the Gencode v25 transcriptome, or that had large model errors. The error in the “virtual array” model was assessed by performing a t-test between the expression in pluripotent samples of the GSE53094 dataset (processed as described above) and the pluripotent samples in the original training dataset. Thus, probes with no hits in Gencode v25 or with a fold change >0.5 and a p-value <0.05 according to the t-test were removed, leaving 10,079 probes. A sample “virtual array” was created by summing the Salmon TPM for transcripts with matches to each of these 10,079 probe sequences. The data was then transformed into a standard R-lumiBatch object (Du et al., 2008), quantile normalized, and tested with the previously prepared NeuroTest predictive model.
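Two of the preparation steps above, summing transcript TPM per probe and quantile normalization, can be sketched in Python as follows. The data structures (a probe-to-transcript mapping held in a dictionary, one list of values per sample) and the function names are assumptions for illustration; the pipeline described herein used an R lumiBatch object, and ties are broken by position here rather than averaged.

```python
def virtual_array(transcript_tpm, probe_to_transcripts):
    """Sum TPM over all transcripts matched by each probe (hypothetical helper).

    transcript_tpm:       {transcript_id: TPM}
    probe_to_transcripts: {probe_id: [transcript_id, ...]}
    """
    return {
        probe: sum(transcript_tpm.get(t, 0.0) for t in transcripts)
        for probe, transcripts in probe_to_transcripts.items()
    }

def quantile_normalize(samples):
    """Give every sample the same value distribution.

    samples: list of equal-length lists, one list of probe values per sample.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the r-th smallest value across samples.
    rank_means = [sum(s[r] for s in sorted_samples) / len(samples) for r in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # probe indices by ascending value
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized
```

After normalization each sample contains the same set of values, assigned according to the within-sample rank of each probe, which removes sample-wide intensity differences between platforms or runs.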
Specifically, the test sample's gene expression data was first converted to that of the metagenes used in training the NeuroTest model. To do so, and using the W matrix generated by applying NMF to the reference databases, regression analysis was performed to solve for the weighted combination of W-matrix basis vectors that best reconstructed the test sample's gene expression data. These weights corresponded to metagene expression levels of the test sample. The logistic regression model was then tested with the metagene expression levels of the test sample, while the gene expression data of the test sample was compared to that of the reference datasets. This yielded the NeuroScore and Novelty Score, respectively, which together reflected how similar the “test sample” precursor dopaminergic neuron was to those in the original reference data model.
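The conversion of a test sample to metagene expression levels can be illustrated with the Python sketch below, which solves for non-negative weights h such that W·h approximates the sample's expression vector v. The text specifies only "regression analysis"; the multiplicative update used here, the function name, and the toy W matrix are assumptions chosen to keep the example dependency-free.

```python
def project_onto_metagenes(W, v, n_iter=500, eps=1e-12):
    """Estimate non-negative metagene weights h with W fixed, so that W h ~ v.

    W: N x k list of lists (gene-by-metagene basis from the reference NMF)
    v: length-N expression vector of one test sample
    """
    n, k = len(W), len(W[0])
    # Precompute W^T v and W^T W, which do not change between iterations.
    Wtv = [sum(W[i][a] * v[i] for i in range(n)) for a in range(k)]
    WtW = [[sum(W[i][a] * W[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    h = [1.0] * k
    for _ in range(n_iter):
        # h <- h * (W^T v) / (W^T W h), element-wise
        h = [
            h[a] * Wtv[a] / (sum(WtW[a][b] * h[b] for b in range(k)) + eps)
            for a in range(k)
        ]
    return h

# Toy example: 3 genes, 2 metagenes; v was built as 2*metagene1 + 3*metagene2.
W_ref = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v_test = [2.0, 3.0, 5.0]
h_test = project_onto_metagenes(W_ref, v_test)
```

The resulting weights play the role of one column of the H matrix for the new sample and could then be passed to a trained classifier.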
After determining the test sample's NeuroScore and Novelty Score, these values were compared to predetermined thresholds for each parameter. The NeuroScore and Novelty Score thresholds were previously set to separate high quality dopaminergic neuronal lines from those with quantifiable deviations from the dopaminergic neuron developmentally-determined phenotype (e.g. “Low quality, low dopamine producing” cell lines) with 98% sensitivity and 100% specificity. Specifically, NeuroScore and Novelty Score thresholds were set based upon empirical testing using age-specific gene expression patterns from various timepoints throughout cellular differentiation (Day 0 to Day 13, Day 18, and Day 25). Previously, high NeuroScores had been obtained using Day 18 and Day 25 gene expression patterns, while low scores had been obtained for Day 0 gene expression patterns. High Novelty Scores had been obtained for gene expression patterns not usually observed for determined dopaminergic precursor cells. To find appropriate thresholds that could classify determined dopaminergic precursor cells with the highest degree of accuracy, both NeuroScore and Novelty Score thresholds had been iteratively adjusted until the area under the receiver operator characteristic (ROC) curve was maximized. Based on this analysis, test samples were classified as determined dopaminergic precursor cells if they displayed a NeuroScore ≥500 and a Novelty Score ≤0.48.
Unusually high Novelty Scores for preparations of precursor dopaminergic neurons indicated that these test samples should be: (a) excluded from any downstream therapeutic applications and (b) evaluated for epigenetic or genetic abnormalities or unwanted differentiation. Cell lines that had NeuroScores just below the cutoff threshold would need further investigation to confirm the integrity of the precursor dopaminergic neuron developmentally-determined state. Cell lines not passing either threshold may need to be excluded from any downstream therapeutic applications and potentially examined to rule out genetic abnormalities. Failed dopaminergic neuron differentiations can be examined to evaluate the reasons for failing NeuroTest.
D. Computing Framework
The computing framework used to implement parts [1] and [2] of NeuroTest was written in the R statistical computing language (R Development Core Team, 2010). R may be used as well as other modern programming languages with tools for statistical analysis. Nucleic acid sequence alignment used the Salmon pseudo aligner (Patro et al., 2017). NeuroTest was deployed as a data analysis pipeline for Illumina short read sequencing data and used on a Linux based local server or a Linux based virtual machine running either locally, or in a remote “cloud” computing environment. The pipeline included sequence quality evaluation and verification steps, sequence alignment to the transcriptome, counting and summarization of all gene expression levels, statistical (quantile) normalization of gene expression counts, statistical comparison to the data in the model and preparation and plotting of graphical output.
E. The NeuroTest Model Validation Dataset
Additional RNAseq datasets were used to validate the NeuroTest model trained in Section B above. Before validation, these datasets were prepared for input as described in Section C above. As shown in
Prior to validation, the NeuroTest model was initially trained on discriminating genes from the microarray data and supplemented with RNAseq-based gene expression data. Then, RNAseq data was used as validation data since the model training was done with Illumina beadarray data using 5-fold cross-validation. The validation RNAseq data was generated or downloaded from public data repositories. The samples in the upper left quadrant of
F. The NeuroTest Challenge Dataset and Testing the Data Model
For further validation and to demonstrate that the model can distinguish between cell types expected to pass or fail NeuroTest, a test dataset was constructed with a set of predicted outcomes. The challenge dataset consisted of 86 publicly available RNAseq datasets, created from a variety of brain cell types (mainly astrocytes and various neurons). The RNAseq data were downloaded from The Gene Expression Omnibus (GEO-NCBI) https://www.ncbi.nlm.nih.gov/geo/.
Archival GEO GSE dataset numbers:
GSE116124 (di Domenico et al., 2019)
GSE117664 (Astrocytes, unpublished, but data released)
GSE99652 (Weissbein et al., 2017)
GSE120306 (unpublished, but data released for iPSC derived astrocytes)
GSE98289 (Hall et al., 2017)
GSE84684 (Kouroupi et al., 2017).
Challenging the NeuroTest model trained in Section B above with these new datasets revealed that the model could determine which samples matched to the phenotype of a dopaminergic neuron and which did not.
G. R-Code Underlying the NeuroTest Core Functions
Example R-code which executes the statistical routine exemplified above for comparing the test sample to the reference data model is shown below. On the server, it functioned as a part of a larger data analysis pipeline. This routine could be envisaged and re-written in numerous different ways.
The use of single-cell RNAseq (scRNAseq) data was evaluated for use in the method for determining the whole cell phenotype of a cell type described in Example 1 herein. As above, NMF was used to derive metagenes (W matrix) and expression levels thereof (H matrix) from scRNAseq datasets. After performing NMF, metagenes derived from scRNAseq data were compared to those derived from corresponding bulk RNA data. Next, a logistic regression model was trained on metagene expression levels derived from scRNAseq data in order to predict the presence of determined dopaminergic neurons, and its performance on bulk RNAseq test samples was assessed.
To do so, neural precursor cells were generated as described above from the same PD patients and healthy control subjects. Single-cell RNA (scRNA) was isolated from these precursor cells at day 13, day 18, and day 25 of an in vitro dopaminergic neuron differentiation protocol using the isolation protocol illustrated in
A. Comparing Metagenes
Metagenes and expression levels thereof between different types of data (scRNAseq, bulk RNAseq) from the same samples were compared. Aggregated scRNAseq data (i.e., bulk from single cell data) was also generated in order to approximate bulk RNAseq data, with aggregation achieved by taking the mean gene expression level across single cells within the same sample. Conventional NMF was performed on each dataset in order to determine each dataset's metagene composition and the expression levels of each metagene.
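The aggregation step above, taking the mean expression across single cells of the same sample, can be sketched in Python as follows. The input layout (a list of sample/expression pairs), the function name, and the treatment of genes absent from a cell as zero are assumptions for illustration.

```python
from collections import defaultdict

def pseudobulk_mean(cells):
    """Average single-cell expression within each sample (hypothetical helper).

    cells: list of (sample_id, {gene: expression}) tuples, one per cell.
    A gene missing from a cell's dictionary is treated as zero for that cell.
    """
    sums = defaultdict(lambda: defaultdict(float))
    n_cells = defaultdict(int)
    for sample_id, expression in cells:
        n_cells[sample_id] += 1
        for gene, value in expression.items():
            sums[sample_id][gene] += value
    return {
        sample_id: {gene: total / n_cells[sample_id] for gene, total in genes.items()}
        for sample_id, genes in sums.items()
    }
```

For example, two cells from sample "A" with TH values of 2.0 and 4.0 yield an aggregated TH value of 3.0 for that sample.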
B. Comparing Model Performance and Output
To evaluate an scRNAseq-trained model used to predict the presence of a determined dopaminergic precursor cell, an NMF and model training procedure similar to that described in Example 1, Section B, herein was employed. Specifically, conventional NMF was first performed on scRNAseq data from precursor cells at day 25 of differentiation, thus producing a W matrix reflecting the contribution of each gene to a metagene. Next, scRNAseq gene expression data from each of several timepoints during differentiation was converted to metagene expression data. As above, this conversion was performed by using the W matrix and regression analysis to solve for each sample's metagene expression levels. Finally, a logistic regression model was trained using the metagene expression data and class labels indicating whether or not the cells were determined dopaminergic precursor cells.
To test for model performance, the scRNAseq-trained model was tested on 111 out-of-sample bulk RNAseq data points. Of these datapoints, 75 were from samples of determined dopaminergic precursor cells. As shown by the receiver operator characteristic (ROC) curve in
Together, these results indicate that scRNAseq data could be incorporated into the method for determining the whole cell phenotype of a cell type described in Example 1 herein.
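For readers wishing to reproduce this kind of out-of-sample evaluation, the area under the ROC curve can be computed from labels and model scores as sketched below in Python. The pairwise formulation shown is one standard way to compute the area; the function name is hypothetical and no values from the study are reproduced here.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for determined dopaminergic precursor samples, 0 otherwise
    scores: model output for each sample (higher = more likely positive)
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Fraction of positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 1.0 means every positive sample scored above every negative sample, and 0.5 corresponds to chance-level ranking.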
Single-cell RNAseq data was incorporated into the method described in Example 1 herein. The evaluation of test samples' expression of various marker genes was also incorporated.
A. Datasets for Model Training and Gene Deviation Estimation
Single-cell and bulk RNAseq datasets were generated as described in Examples 1 and 2 herein. Specifically, scRNA and bulk RNA were isolated from samples of precursor cells at day 13, day 18, and day 25 of an in vitro dopaminergic neuron differentiation protocol. After RNA sequencing, all scRNAseq data was pre-processed using a Seurat single-cell processing pipeline. This preprocessing was used to match single cells to their respective cell lines, remove data representing more than one cell (doublets), and filter out samples based on mitochondrial and ribosomal RNA content. Only genes with data available in all scRNAseq and bulk RNAseq datasets were included in subsequent processing.
B. Non-Negative Matrix Factorization (NMF) for Metagene Derivation
As in Example 1, metagenes were derived using NMF. Specifically, conventional NMF was performed for each scRNAseq dataset (day 13, day 18, day 25), in this manner deriving separate metagenes (W matrices) for each developmental timepoint. These metagene models described expected patterns of whole culture gene expression throughout differentiation. Initial W and H matrices were provided for each performance of NMF. For the initial W matrix, uniform manifold approximation and projection (UMAP) was performed on the scRNAseq dataset after preprocessing with principal component analysis (PCA). The cluster centroids output by UMAP, for which there were 5-6 clusters per scRNAseq dataset, were used as the initial W matrix. An initial H matrix was approximated from each scRNAseq dataset and its corresponding initial W matrix using non-negative least squares approximation.
C. Model Training
After NMF, the metagene expression levels (loadings) of the bulk RNAseq datasets were determined for all metagenes (i.e., those derived from each of the three scRNAseq datasets). First, the W matrices produced in Section B above were location- and scale-normalized. Next, a penalized regression model was used per sample in order to estimate each sample's bulk RNAseq data using each of the normalized W matrices (timepoint-specific metagenes). In this manner, samples' expression levels of metagenes derived throughout development were approximated, thus providing a time-resolved profile for each sample. Using these profiles, a logistic regression model was trained using the metagene expression levels for the bulk RNAseq datasets and class labels indicating whether or not the samples in the bulk RNAseq datasets were at day 18 or later of the dopaminergic neuron differentiation protocol. Thus, a model for predicting the presence of a determined (e.g., day 18 or later) dopaminergic precursor cell was generated, the output of the model providing an indication akin to the NeuroScore described in Example 1 herein. As the model was trained on bulk RNAseq data, key aspects related to cell population structure and important biological processes, such as cell cycle status, were captured in the model.
D. Deviation Score Calculation
Deviation scores similar to the Novelty Scores described in Example 1 herein were also calculated per bulk RNA sample. These deviation scores provided summary statistics of irregular patterns of gene expression. To do so, single-gene expression level deviation was calculated per sample. Calculated deviations were specific to the timepoint at which each sample was collected (day 13, day 18, or day 25). First, and for optimal calculation of deviation given the count-based nature of bulk RNAseq data, a Limma-Voom counts-per-million (CPM) approach was used to convert bulk RNAseq data from units of TPM to CPM. Next, a linear model was used per sample in order to calculate estimated gene expression data based on the sample's metagene expression levels (estimated in Section B above). The residuals per gene (difference between the estimated gene expression data and the actual bulk RNAseq data in CPM) were then calculated.
To normalize residuals across genes, a set of genes with stable expression levels was first used to estimate typical deviation across samples. The median absolute deviation of stably expressed genes with log2CPM values between four and 9.5 was used as an estimate of typical gene deviation across samples, and based on this analysis, a value of 0.5 was used as a baseline for residual standard deviation. Thus, residuals were normalized by dividing by either the standard deviation of gene expression across samples or 0.5 if such standard deviation was less than 0.5.
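The residual normalization described above can be illustrated with a short Python sketch. The floor of 0.5 is taken from the text; the function name, the dictionary-based inputs, and the use of the sample standard deviation (n−1 denominator) are assumptions.

```python
import statistics

SD_FLOOR = 0.5  # baseline residual standard deviation stated in the text

def normalize_residuals(residuals, expression_across_samples):
    """Divide each gene's residual by max(sd of that gene across samples, 0.5).

    residuals:                 {gene: residual for one sample}
    expression_across_samples: {gene: [expression values across samples]}
    """
    normalized = {}
    for gene, residual in residuals.items():
        sd = statistics.stdev(expression_across_samples[gene])
        normalized[gene] = residual / max(sd, SD_FLOOR)
    return normalized
```

The floor prevents genes with almost no variation across samples from producing very large normalized residuals out of small absolute differences.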
After normalization, two quantile values per sample were determined. First, the 95% quantile of the absolute normalized residuals was calculated. Second, the 95% quantile of absolute normalized residuals corresponding to approximately 30 predefined marker genes was determined. These marker genes are shown in Table E1 below and were chosen based on their dynamic behavior through and impact on dopaminergic neuron differentiation. An exemplary sample's normalized residuals for these marker genes are shown in
E. Thresholds for Model Output and Deviation Scores
To establish predetermined thresholds for evaluating test samples, model predictions (NeuroScores) and deviation scores (Novelty Scores) across samples were examined. As in Example 1, Section C herein, samples' bulk RNAseq data was converted using a linear model to expression levels of metagenes used to train the model produced in Example 3, Section C, and these converted metagene expression levels were provided to the trained model. Deviation scores were also calculated per sample as described in Section D above.
Such analysis indicated that samples from day 18-25 of differentiation were likely to have model output greater than 0 (i.e., probability greater than 0.5 of the sample comprising a determined dopaminergic precursor cell), and it was determined that samples having a Novelty Score of less than 5 had acceptable gene deviation.
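The quantile summaries and thresholds of this Example can be sketched together in Python as follows. The cut-offs (model output greater than 0, Novelty Score less than 5) and the 95% quantile level are taken from the text; the function names, the linear-interpolation quantile definition, and the choice of which of the two quantile values is passed in as the Novelty Score are assumptions, since the text does not specify them.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    if not values:
        raise ValueError("no values supplied")
    xs = sorted(values)
    pos = (len(xs) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def deviation_scores(normalized_residuals, marker_genes):
    """Return the 95% quantile of |residual| over all genes and over marker genes.

    normalized_residuals: {gene: normalized residual for one sample}
    marker_genes:         set of predefined marker gene symbols
    """
    all_abs = [abs(r) for r in normalized_residuals.values()]
    marker_abs = [abs(r) for g, r in normalized_residuals.items() if g in marker_genes]
    return quantile(all_abs, 0.95), quantile(marker_abs, 0.95)

def passes_thresholds(model_output, novelty_score, output_min=0.0, novelty_max=5.0):
    """Apply the two thresholds stated in the text (model output is on the logit scale)."""
    return model_output > output_min and novelty_score < novelty_max
```

A model output of 0 on the logit scale corresponds to a probability of 0.5, which matches the equivalence stated above.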
F. Model Validation
The present invention is not intended to be limited in scope to the particular disclosed embodiments, which are provided, for example, to illustrate various aspects of the invention. Various modifications to the compositions and methods described will become apparent from the description and teachings herein. Such variations may be practiced without departing from the true scope and spirit of the disclosure and are intended to fall within the scope of the present disclosure.
This application claims priority to U.S. provisional application 62/878,701, filed Jul. 25, 2019, entitled “METHOD OF IDENTIFYING DOPAMINERGIC NEURONS AND PROGENITOR CELLS,” the contents of which are incorporated by reference in their entirety for all purposes.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2020/043627 | 7/24/2020 | WO |
Number | Date | Country | |
---|---|---|---|
62878701 | Jul 2019 | US |