The present Application relates to methods, compositions, and kits for assessing endometrial transformation, including the implantation window.
Despite recent advances in assisted reproductive technologies, implantation rates remain relatively low. Implantation failures are thought to be associated with inadequate endometrium receptivity and/or with defects in the embryo-endometrium dialogue. The endometrium is receptive to blastocyst implantation during a spatially and temporally restricted window, called “the implantation window” or the “window of implantation.” In humans, this period begins 6-10 days after the LH surge and lasts approximately 48 hours. Several parameters have been suggested for assessing endometrium receptivity, including endometrial thickness which is a traditional criterion, endometrial morphological aspect and endometrial and subendometrial blood flow. However, their positive predictive value is still limited.
More recently, transcriptomic approaches have been utilized to identify biomarkers of the human implantation window. Using microarray technology in human biopsy samples, several authors have observed modifications in the gene expression profile associated with the transition of the human endometrium from a pre-receptive (early-secretory phase) to a receptive (mid-secretory phase) state (Carson et al., 2002; Riesewijk et al., 2003; Mirkin et al., 2005; Talbi et al., 2006). However, very few genes were common to all of these studies (Haouzi et al., 2009). Such variability in the results may have several explanations: differences in the day of the endometrial biopsies, different patient profiles, inadequate numbers of endometrial samples studied, and the overall complexity of the endometrium.
The endometrium is unlike any other tissue as it consists of multiple cell types which vary dramatically in state through a monthly cycle as they enter and exit the cell cycle, remodel, and undergo various forms of differentiation with relatively rapid rates. The notable variance in menstrual cycle lengths within and between individuals10 adds an additional variable to the system. Studies to date including transcriptomic characterizations have been insufficient to understand and characterize hallmark endometrial events, such as the implantation window.
Given these deficiencies in the art, and in view of the broad relevance and importance of human fertility and regenerative and reproductive biology, there has long been a need in the art for a systematic characterization and molecular understanding of endometrial transformation across the natural menstrual cycle that goes beyond the traditional histological characterization scheme well established in the art. Such an understanding, including the identification of useful biomarkers associated with hallmark endometrial events (e.g., the implantation window), would make a significant contribution to the art and to the field of medical intervention into human reproductive technologies, e.g., in vitro fertilization and contraception technologies.
The present disclosure is based, in part, on the finding that, after systemic transcriptomic characterization of the human endometrium across six (6) cell types, including (1) previously uncharacterized ciliated epithelium, (2) unciliated epithelium, (3) stromal cells (e.g., stromal fibroblasts), (4) endothelium cells, (5) macrophages, and (6) lymphocytes, and across the different phases of the menstrual cycle (e.g., menstruation, follicular phase, ovulation, and luteal phase), certain genes (e.g., biomarkers) are indicative of, and/or provide a gene expression signature for, one or more hallmark endometrial events, e.g., a specific phase of endometrial transformation, such as the implantation window. Accordingly, aspects of the present Application relate to methods and compositions for transcriptomic characterization of human endometrium over the different cell types making up the endometrium as the cells undergo change throughout the complete transformation cycle of the endometrium during a menstrual cycle to identify cell-type-specific gene signatures that may be used to evaluate endometrial samples for the appearance or presence of one or more menstrual cycle events, e.g., the implantation window.
In various embodiments, the present invention relates to using the cell-type-specific gene expression signatures (e.g., biomarker panels) to evaluate, assess, or otherwise probe one or more endometrial samples from a subject to detect the appearance or presence of one or more menstrual cycle events, e.g., the implantation window. In some embodiments, the endometrial samples can be evaluated in bulk, that is, as a complete tissue sample, since the gene expression signatures are characteristic of a unique endometrial cell type. In other embodiments, the endometrial sample can be processed to separate out one or more specific cell types, e.g., the unciliated epithelial cells, using a means for cell separation (e.g., FACS cell-sorting). The separated endometrial cell subtypes can be separately evaluated using the appropriate gene expression signature for that cell type to detect the appearance or presence of one or more menstrual cycle events, e.g., the implantation window, in that tissue or cell sample.
In other aspects, the present Application relates to the identified cell-type-specific gene panel signatures, i.e., sets of biomarkers, which correspond or otherwise mark the appearance, presence, or disappearance of a specific phase of endometrial transformation, such as, for example, the window of implantation. In still other aspects, the present Application describes practical and/or clinical application of the identified gene panel signatures to detect the appearance, presence, or disappearance of a particular transformation state of the endometrium of a subject, i.e., the detection of the window of implantation.
Further, aspects of the present disclosure relate to methods and compositions for detecting the phase of endometrial transformation in a subject by detecting and measuring differentially expressed genes (e.g., biomarkers). In some embodiments, differentially expressed genes (e.g., biomarkers) are detected in a sample from a subject (e.g., a patient). In some aspects, the present disclosure relates to methods to detect the opening of the window of implantation in a subject. In other aspects, the present disclosure relates to methods to detect the opening of decidualization. Some aspects of the present disclosure relate to methods of detecting the early-proliferative, late-proliferative, early-secretory, mid-secretory, and/or late-secretory phase of the menstrual cycle of a subject.
Additional aspects and embodiments of the present invention described herein are as follows.
In one aspect, the Application provides a method of diagnosing a menstrual cycle event in a subject, comprising detecting in a biological sample a gene signature for one or more endometrial cell types. The menstrual cycle event can include the follicular phase, ovulation, or the luteal phase, or a window of implantation (WOI).
In various embodiments, one or more endometrial cell types can be selected from the group consisting of stroma cells, endothelium cells, immune cells, unciliated epithelium cells, and ciliated epithelium cells.
In some embodiments, the one or more endometrial cell types is unciliated cells and the gene signature comprises one or more biomarkers selected from the group consisting of: PLAU, MMP7, THBS1, CADM1, NPAS3, ATP1A1, ANK3, ALPL, TRAK1, SCGB1D2, MT1F, MT1X, MT1E, MT1G, CXCL14, MAOA, DPP4, NUPR1, GPX3, and PAEP. In certain embodiments, CADM1, NPAS3, ATP1A1, and TRAK1 are downregulated and NUPR1 is upregulated relative to WOI.
In other embodiments, the one or more endometrial cell types is stromal cells and the gene signature comprises one or more biomarkers selected from the group consisting of: STC1, NFATC2, BMP2, PMAIP1, MMP11, SFRP1, WNT5A, ZFYVE21, CILP, SLF2, MATN2, S100A4, DKK1, CRYAB, FOXO1, IL15, FGF7, and LMCD1. In certain embodiments, NFATC2, BMP2, PMAIP1, ZFYVE21, CILP, SLF2, MATN2, and FGF7 are downregulated and CRYAB is upregulated relative to WOI.
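As a purely illustrative sketch, the directional embodiments above could be checked by counting how many panel genes change in the recited direction. The function name, the use of log2 fold changes against a comparison state, and the concordance fraction are assumptions made for illustration; they are not part of the disclosed methods. The direction table below is taken from the unciliated-cell embodiment, and an analogous table could be built for the stromal panel.

```python
# Hypothetical direction table for the unciliated-cell embodiment:
# -1 means downregulated, +1 means upregulated (as recited in the text).
EXPECTED_DIRECTION = {
    "CADM1": -1,
    "NPAS3": -1,
    "ATP1A1": -1,
    "TRAK1": -1,
    "NUPR1": +1,
}


def direction_concordance(log2_fold_change, expected=EXPECTED_DIRECTION):
    """Return the fraction of measured panel genes whose change has the expected sign.

    log2_fold_change maps gene symbol -> log2 fold change of the sample versus
    a comparison state (the choice of comparison state is an assumption here).
    """
    measured = [gene for gene in expected if gene in log2_fold_change]
    if not measured:
        raise ValueError("none of the panel genes were measured")
    # A gene is concordant when its observed change and expected direction share a sign.
    hits = sum(1 for gene in measured if log2_fold_change[gene] * expected[gene] > 0)
    return hits / len(measured)


example = {"CADM1": -1.2, "NPAS3": -0.5, "ATP1A1": 0.3, "TRAK1": -2.0, "NUPR1": 1.5}
score = direction_concordance(example)  # four of the five genes match
```

Any cutoff applied to such a score would have to be established empirically; none is specified here.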
In certain embodiments, the methods may include the step of separating the one or more endometrial cells prior to the detection step. For example, prior to detection of biomarkers in a sample, the stroma cells, endothelium cells, immune cells, unciliated epithelium cells, and ciliated epithelium cells can be separated from one another.
In various embodiments, the cells can be separated by fluorescence activated cell sorting (FACS).
In other embodiments, the methods may include the additional step of transferring a fertilized embryo to the uterus of the subject determined to be within the window of implantation.
In still another aspect, the Application provides a method for determining a gene expression profile in each of a plurality of endometrial cells, wherein said endometrial cells are:
(a) in an endometrial sample obtained from a subject, and
(b) unciliated epithelial cells. The unciliated epithelial cells can be separated from ciliated epithelial cells. The gene expression profile of an unciliated epithelial cell can be identified using one or more gene expression markers characteristic of unciliated epithelial cells. The gene expression profile can comprise at least twenty genes selected from the group consisting of the genes shown in
In certain embodiments, the gene expression markers characteristic of unciliated epithelial cells can comprise PLAU, MMP7, THBS1, CADM1, NPAS3, ATP1A1, ANK3, ALPL, TRAK1, SCGB1D2, MT1F, MT1X, MT1E, MT1G, CXCL14, MAOA, DPP4, NUPR1, GPX3, and PAEP.
In still another aspect, the Application provides a method for detecting that a subject is within a window of implantation (WOI), the method comprising: (a) determining a level of expression of at least twenty genes in a sample of endometrial cells obtained from a subject, wherein the twenty genes are selected from the group consisting of the genes shown in
In yet another aspect, the Application provides a method for identifying a subject as being within a window of implantation (WOI), the method comprising: (a) determining a level of expression of at least one gene in an isolated cell population, wherein the at least one gene is selected from the group consisting of PAEP, GPX3, CXCL14, NUPR1, DPP4, MAOA, MT1G, MT1E, MT1X, and MT1F, wherein the isolated cell population has been isolated from a sample of endometrial cells obtained from a subject, wherein the cell population comprises cells having elevated expression of genes associated with epithelial cells and depressed expression of genes associated with cilial function; and (b) comparing the determined level of expression of the at least one gene with a control level; and (c) identifying the subject as being within the WOI, wherein the subject is identified as being within the WOI if the level of the expression of at least one gene is at least two-fold higher than a control level.
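The comparison in steps (b) and (c) above can be sketched as follows. This is a minimal illustration, assuming expression levels are on a linear (non-log) scale and that a positive control level is available for each gene; the function and variable names are hypothetical and are not taken from the Application.

```python
# Gene list as recited in the text for this aspect.
WOI_GENES = ("PAEP", "GPX3", "CXCL14", "NUPR1", "DPP4",
             "MAOA", "MT1G", "MT1E", "MT1X", "MT1F")


def is_within_woi(sample_levels, control_levels, genes=WOI_GENES, min_fold=2.0):
    """Return True if at least one listed gene is at least min_fold above its control.

    sample_levels and control_levels map gene symbol -> linear-scale expression.
    Genes missing from either mapping, or with a non-positive control level,
    are skipped (an assumption made to keep the ratio well defined).
    """
    for gene in genes:
        if gene not in sample_levels or gene not in control_levels:
            continue
        control = control_levels[gene]
        if control <= 0:
            continue
        if sample_levels[gene] / control >= min_fold:
            return True
    return False


# PAEP is 2.5-fold above control, so the two-fold criterion is met.
call = is_within_woi({"PAEP": 10.0, "GPX3": 3.0}, {"PAEP": 4.0, "GPX3": 2.0})
```

The `min_fold` default mirrors the two-fold criterion recited above; other embodiments that compare against a predetermined level would use a different rule.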
In some embodiments, a method of increasing the likelihood of becoming pregnant comprises (a) performing a gene expression assay (e.g., to assay the RNA and/or protein level for one or more genes of interest), for example in tissue (e.g., endometrial tissue or blood) or in one or more cell types of interest, to determine whether a subject (e.g., a woman) is within a window of implantation (WOI); and (b) transferring a fertilized embryo to the uterus of the subject determined to be within the window of implantation.
In some embodiments, a method of treating infertility in a subject in need thereof comprises administering an effective amount of an agent that upregulates any one or more of genes associated with a WOI, for example, but not limited to, any one or more of the genes selected from the group consisting of PAEP, GPX3, CXCL14, NUPR1, DPP4, MAOA, MT1G, MT1E, MT1X, and MT1F in one or more of the tissues in the subject in an effective amount to treat the infertility.
In still other aspects, the Application provides a method for detecting a window of implantation (WOI) in a subject, the method comprising: (a) isolating a cell population within a sample of endometrial cells obtained from a subject, wherein the cell population comprises cells having elevated expression of genes associated with epithelial cells and depressed expression of genes associated with cilial function; (b) determining a level of expression of at least one gene in the cell population, wherein the at least one gene is selected from the group consisting of PAEP, GPX3, and CXCL14; and (c) determining whether the subject has entered the WOI, wherein the subject is identified as within the WOI if the level of the expression of at least one gene is higher than a predetermined level. In some embodiments, step (b) comprises determining the level of expression of at least two genes from the group consisting of PAEP, GPX3, and CXCL14. In other embodiments, step (b) comprises determining the level of expression of each of the genes from the group consisting of PAEP, GPX3, and CXCL14.
The method in some embodiments may involve determining the level of expression of at least one gene selected from the group consisting of NUPR1, DPP4, MAOA, MT1G, MT1E, MT1X, and MT1F. In other embodiments, the method may involve determining the level of expression of at least two genes selected from the group consisting of NUPR1, DPP4, MAOA, MT1G, MT1E, MT1X, and MT1F. In still other embodiments, the method may involve determining the level of expression of at least three genes selected from the group consisting of NUPR1, DPP4, MAOA, MT1G, MT1E, MT1X, and MT1F. In yet other embodiments, the method may involve determining the level of expression of each gene selected from the group consisting of NUPR1, DPP4, MAOA, MT1G, MT1E, MT1X, and MT1F.
In any of the methods herein, the step of determining the level of expression of a gene comprises determining the amount of a nucleic acid. The level of nucleic acid can be determined using a real-time reverse transcriptase PCR (RT-PCR) assay and/or a nucleic acid microarray. In other embodiments, the nucleic acid can be determined using a hybridization assay and at least one labeled binding agent (e.g., a labeled oligonucleotide binding agent).
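Where real-time RT-PCR is used, one common way to turn threshold cycle (Ct) values into a relative expression level is the comparative Ct (2^-ΔΔCt) calculation. The sketch below assumes that approach, roughly 100% amplification efficiency, and a reference (housekeeping) gene measured in both the sample and the control. The Application does not specify this calculation, and the function and parameter names are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample versus a control (comparative Ct).

    Each argument is a threshold cycle (Ct) value. The target gene is
    normalized to a reference gene within the sample and within the control,
    and the two normalized values are then compared.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    # A lower Ct means more template, hence the negative exponent.
    return 2.0 ** (-delta_delta_ct)


# Target reaches threshold 3 cycles earlier (after normalization) in the sample.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)  # 8.0
```

A fold change obtained this way could then be compared with whatever control or predetermined level a given embodiment recites.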
In any of the methods herein, the step of determining the level of expression of a gene can involve determining an amount of a protein encoded by that gene, such as by using an immunohistochemical assay, an immunoblotting assay, and/or a flow cytometry assay.
In various embodiments, the sample can be selected from the group consisting of a sample of endometrium tissue, endometrial stromal cells, and/or endometrial fluid.
The subject of any of the methods herein may be a human, for example, a woman trying to become pregnant, e.g., an in vitro fertilization candidate/patient.
In yet another aspect, the present Application provides a method of increasing the likelihood of becoming pregnant comprising using the method that includes evaluating the expression level(s) of one or more of the genes described herein (for example in Tables 1-17 or elsewhere in this Application) in a subject to determine whether the subject is approaching, entering, in, or exiting a window of implantation, and implanting a fertilized embryo (e.g., from an in vitro fertilization procedure) if the window of implantation is open. In some embodiments, the gene expression levels are detected in a biological sample obtained from the subject, for example a tissue sample, for example a blood, endometrial tissue, endometrial cells, or endometrial fluid sample. In some embodiments, one or more cell types (e.g., ciliated epithelial cells, unciliated epithelial cells, stromal fibroblasts, and/or other cell types described in this Application, for example, but not limited to, cell types 1-6 described above) are isolated from the biological sample, or the sample is enriched for one or more cell types (e.g., ciliated epithelial cells, unciliated epithelial cells, stromal fibroblasts, and/or other cell types described in this Application, for example, but not limited to, cell types 1-6 described above).
In still another aspect, the Application provides a method of treating infertility in a subject in need thereof, comprising administering an effective amount of an agent that upregulates any one or more of the genes selected from the group consisting of PAEP, GPX3, CXCL14, NUPR1, DPP4, MAOA, MT1G, MT1E, MT1X, and MT1F in one or more of the tissues in the subject in an effective amount to treat the infertility. The agent can include a nucleic acid encoding for any one or more of the genes selected from the group consisting of PAEP, GPX3, CXCL14, NUPR1, DPP4, MAOA, MT1G, MT1E, MT1X, and MT1F in an expression system. The administering of the agent can result in the opening of the window of implantation in the subject.
Other aspects of the invention are described in or are obvious from the following disclosure, and are within the ambit of the invention.
The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present disclosure, which can be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.
There has long been a need in the art for a systematic characterization and molecular understanding of endometrial transformation across the natural menstrual cycle that goes beyond the traditional histological characterization scheme well established in the art. Such an understanding, including the identification of useful biomarkers associated with hallmark endometrial events (e.g., the implantation window), would make a significant contribution to the art and to the field of medical intervention into human reproductive technologies, e.g., in vitro fertilization and contraception technologies.
In a human menstrual cycle, the endometrium undergoes remodeling, shedding, and regeneration, which are processes driven by substantial gene expression changes in the underlying cellular hierarchy. Despite its importance in human fertility and regenerative biology, mechanistic understanding of this unique type of tissue homeostasis has remained rudimentary. Described in the present Application are the transcriptomic transformations of human endometrium at single cell resolution. Further described are dissections of multidimensional cellular heterogeneity of the tissue across the entire natural menstrual cycle. The methods described herein permitted the recognition of six discrete endometrial cell types that were analyzed, including previously uncharacterized ciliated epithelium. Further analysis of gene expression patterns within these newly defined cell types demonstrated characteristic signatures for each cell type and phase during four major phases of endometrial transformation. This resulted in the surprising discovery that the human window of implantation opens up with an abrupt and discontinuous transcriptomic activation in the epithelium, accompanied by widespread decidualized features in the stroma. Also unexpected was the finding of signatures in luminal and glandular epithelium during epithelial gland reconstruction, suggesting a mechanism for adult gland formation. Described herein are precise and accurate methods for determination of endometrial status, e.g., the implantation window, useful in the treatment and/or management of patients, including but not limited to patients in need of assisted reproduction.
The present disclosure is based, in part, on the finding that certain genes (e.g., biomarkers) are indicative of one or more specific phases of endometrial transformation that occur in the human menstrual cycle. Aspects of the present disclosure relate to methods and compositions for detecting the phase of endometrial transformation in a subject by detecting and measuring differentially expressed genes. In some embodiments, differentially expressed genes are detected in a sample from a subject (e.g., a patient). In some aspects, the present disclosure relates to methods to detect the opening of the window of implantation and/or decidualization in a subject. Some aspects of the present disclosure relate to methods of detecting the early-proliferative, late-proliferative, early-secretory, mid-secretory, and/or late-secretory phase of the menstrual cycle of a subject. The present disclosure is based, in part, on the finding that after systemic transcriptomic characterization of the human endometrium over the entire menstrual cycle, gene expression signatures could be identified that uniquely correspond to one of six identified endometrial cell subtypes (ciliated epithelium, unciliated epithelium, stromal cells, endothelium cells, macrophages, and lymphocytes) and which may be used to identify or detect one or more hallmark endometrial events, e.g., a specific phase of endometrial transformation, such as, the implantation window, in an endometrial sample. In various embodiments, the present invention relates to using the cell-type-specific gene expression signatures (e.g., biomarker panels) to evaluate, assess, or otherwise probe one or more endometrial samples from a subject to detect the appearance or presence of one or more menstrual cycle events, e.g., implantation window. 
In some embodiments, the endometrial samples can be evaluated in bulk, that is, as a complete tissue sample, since the gene expression signatures are characteristic of a unique endometrial cell type. In other embodiments, the endometrial sample can be processed to separate out one or more specific cell types, e.g., the unciliated epithelial cells, using a means for cell separation (e.g., FACS cell-sorting). The separated endometrial cell subtypes can be separately evaluated using the appropriate gene expression signature for that cell type to detect the appearance or presence of one or more menstrual cycle events, e.g., the implantation window, in that tissue or cell sample.
In other aspects, the present Application relates to the identified cell-type-specific gene panel signatures, i.e., sets of biomarkers, which correspond or otherwise mark the appearance, presence, or disappearance of a specific phase of endometrial transformation, such as, for example, the window of implantation. In still other aspects, the present Application describes practical and/or clinical application of the identified gene panel signatures to detect the appearance, presence, or disappearance of a particular transformation state of the endometrium of a subject, i.e., the detection of the window of implantation.
Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by one of ordinary skill in the art to which this invention belongs. The following references provide one of skill in the art to which this invention pertains with a general definition of many of the terms used in this invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2d ed. 1994); The Cambridge Dictionary of Science and Technology (Walker ed., 1988); Hale & Marham, The Harper Collins Dictionary of Biology (1991); and Lackie et al., The Dictionary of Cell & Molecular Biology (3d ed. 1999); and Cellular and Molecular Immunology, Eds. Abbas, Lichtman and Pober, 2nd Edition, W.B. Saunders Company. For the purposes of the present invention, the following terms are further defined.
As used herein and in the claims, the singular forms “a,” “an,” and “the” include the singular and the plural reference unless the context clearly indicates otherwise. Thus, for example, a reference to “an agent” includes a single agent and a plurality of such agents.
As used herein, a “biomarker,” or “biological marker,” generally refers to a measurable indicator of some biological state or condition. The term is also occasionally used to refer to a substance whose detection indicates the presence of a living organism. Biomarkers are often measured and evaluated to examine normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention. Combined groups of biomarkers with a uniquely characteristic pattern associated with a condition, disease, or otherwise biological state (e.g., a stage of the menstrual cycle or the window of implantation) may be referred to as a “biomarker signature” or equivalently as a “gene signature” or “gene expression signature” or “gene expression profile.” A gene signature or gene expression signature is a single or combined group of genes in a cell with a uniquely characteristic pattern of gene expression that occurs as a result of a biological process (e.g., a stage of the menstrual cycle) or pathogenic medical condition (e.g., endometriosis). Activating pathways in a regular physiological process (e.g., the transformation pathway along the menstrual cycle) or a physiological response to a stimulus results in a cascade of signal transduction and interactions that elicit altered levels of gene expression, which is classified as the gene signature of that physiological process or response.
The clinical applications of gene signatures break down into prognostic, diagnostic, and predictive signatures. The phenotypes that may theoretically be defined by a gene expression signature range from those that predict the survival or prognosis of an individual with a disease, to those that are used to differentiate between different subtypes of a disease, to those that predict activation of a particular pathway (e.g., predict the timing of WOI). Ideally, gene signatures can be used to select a group of patients for whom a particular treatment will be effective (e.g., timing of WOI for in vitro fertilization candidates).
Prognostic refers to predicting the likely outcome or course of a disease. Classifying a biological phenotype or medical condition based on a specific gene signature, or on multiple gene signatures, can serve as a prognostic biomarker for the associated phenotype or condition. This concept, termed a prognostic gene signature, serves to offer insight into the overall outcome of the condition regardless of therapeutic intervention. Several studies have been conducted with a focus on identifying prognostic gene signatures in the hope of improving the diagnostic methods and therapeutic courses adopted in clinical settings. It is important to note that prognostic gene signatures are not a target of therapy; they offer additional information to consider when discussing details such as duration, dosage, or drug sensitivity in therapeutic intervention. The criteria a gene signature preferably meets to be deemed a prognostic marker include demonstration of its association with the outcomes of the condition; reproducibility and validation of its association in an independent group of patients; and, lastly, demonstration that its prognostic value is independent of other standard factors in a multivariate analysis.
A diagnostic gene signature serves as a biomarker that distinguishes phenotypically similar medical conditions that have a threshold of severity consisting of mild, moderate or severe phenotypes. Establishing verified methods of diagnosing clinically indolent and significant cases allows practitioners to provide more accurate care and therapeutic options that range from no therapy, preventative care to symptomatic relief. These diagnostic signatures also allow for a more accurate representation of test samples used in research.
A predictive gene signature predicts the effect of treatment in patients or study participants that exhibit a particular disease phenotype. A predictive gene signature, unlike a prognostic gene signature, can be a target for therapy. The information predictive signatures provide is more rigorous than that of prognostic signatures, as it is based on treatment groups with therapeutic intervention and on the likely benefit from treatment, completely independent of prognosis. Predictive gene signatures address the paramount need for ways to personalize and tailor therapeutic intervention in diseases. These signatures have implications in facilitating personalized medicine through identification of more novel therapeutic targets and identification of the most qualified subjects for optimal benefit of specific treatments.
This Application may reference the “status” or “state” of a biomarker in a sample. In various embodiments, reference to the “abnormal status or state” of a biomarker means the biomarker's status in a particular sample differs from the status generally found in average samples (e.g., healthy samples or average diseased samples). Examples include mutated, elevated, decreased, present, absent, etc. Reference to a biomarker with an “elevated status” means that one or more of the above characteristics (e.g., expression or mRNA level) is higher than normal levels. Generally this means an increase in the characteristic (e.g., expression or mRNA level) as compared to an index value. Conversely reference to a biomarker's “low status” means that one or more of the above characteristics (e.g., gene expression or mRNA level) is lower than normal levels. Generally this means a decrease in the characteristic (e.g., expression) as compared to an index value. In this context, a “negative status” of a biomarker generally means the characteristic is absent or undetectable.
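The status terms defined above can be illustrated with a small classifier. This is a sketch only: the function name, the detection limit, the tolerance band around the index value, and the "unchanged" label are assumptions added for illustration and are not defined in this Application.

```python
def biomarker_status(level, index_value, detection_limit=0.0, tolerance=0.0):
    """Classify a measured biomarker level relative to an index value.

    Returns "negative" when the level is at or below the detection limit,
    "elevated" when it exceeds the index value by more than the tolerance,
    "low" when it falls below the index value by more than the tolerance,
    and "unchanged" otherwise (a label introduced here for completeness).
    """
    if level <= detection_limit:
        return "negative"
    if level > index_value + tolerance:
        return "elevated"
    if level < index_value - tolerance:
        return "low"
    return "unchanged"


status = biomarker_status(8.0, index_value=5.0)  # "elevated"
```

In practice the index value and any tolerance would come from reference samples appropriate to the comparison being made.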
It is noted that in this disclosure and particularly in the claims and/or paragraphs, terms such as “comprises”, “comprised”, “comprising” and the like can have the meaning attributed to them in U.S. patent law; e.g., they can mean “includes”, “included”, “including”, and the like; and that terms such as “consisting essentially of” and “consists essentially of” have the meaning ascribed to them in U.S. patent law, e.g., they allow for elements not explicitly recited, but exclude elements that are found in the prior art or that affect a basic or novel characteristic of the invention.
As used herein, “decidualization” is a process that results in significant changes to cells of the endometrium in preparation for, and during, pregnancy. This includes morphological and functional changes to endometrial stromal cells (ESCs), the presence of decidual white blood cells (leukocytes), and vascular changes to maternal arteries. The sum of these changes results in the endometrium changing into a structure called the decidua.
As used herein, the “epithelium” is one of the four basic types of animal tissue, along with connective tissue, muscle tissue and nervous tissue. Epithelial tissues line the outer surfaces of organs and blood vessels throughout the body, as well as the inner surfaces of cavities in many internal organs, e.g., the uterus.
As used herein, “endometrium” is the mucous membrane lining the uterus, which thickens during the menstrual cycle in preparation for possible implantation of an embryo.
An “isolated cell” refers to a cell which has been separated from other components and/or cells which naturally accompany the isolated cell in a tissue or mammal.
The term “obtaining” as in “obtaining the spore associated protein” is intended to include purchasing, synthesizing or otherwise acquiring the spore associated protein (or indicated substance or material).
As used herein, a “sample” refers to a composition that comprises biological materials such as (but not limited to) endometrial tissue, endometrial cells, or endometrial fluid from a subject.
The term “subject” refers to a subject in need of the analysis described herein. In some embodiments, the subject is a patient (e.g., a female patient). In some embodiments, the subject is a human (e.g., a woman). In some embodiments, the human is trying to become pregnant. The subject in need of the analysis described herein may be a patient who suffers from infertility.
As used herein, “transcriptome” refers to the collection of all gene transcripts in a given cell and comprises both coding RNA (mRNAs) and non-coding RNAs (e.g., siRNA, miRNA, hnRNA, tRNA, etc.). As used herein, an “mRNA transcriptome” refers to the population of all mRNA molecules present (in the appropriate relative abundances) in a given cell. An mRNA transcriptome comprises the transcripts that encode the proteins necessary to generate and maintain the phenotype of the cell. As used herein, an mRNA transcriptome may or may not further comprise mRNA molecules that encode proteins for general cell existence, e.g., housekeeping genes and the like.
As used herein, the term “window of implantation” (“WOI”) or, equivalently, “implantation window” refers to the period when the uterus is receptive to implantation of the free-lying blastocyst. This period of receptivity is short and results from the programmed sequence of the action of estrogen and progesterone on the endometrium.
Ranges: throughout this disclosure, various aspects of the invention can be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 2.7, 3, 4, 5, 5.3, and 6. This applies regardless of the breadth of the range.
In various aspects, the present Application relates to transcriptomic assessment of various types of cells making up the endometrium throughout the menstrual cycle. The menstrual cycle is the regular natural change that occurs in the female reproductive system (specifically the uterus and ovaries) that makes pregnancy possible. The cycle is required for the production of oocytes, and for the preparation of the uterus for pregnancy.
The menstrual cycle is complex and is controlled by many different glands and the hormones that these glands produce. The hypothalamus causes the nearby pituitary gland to produce certain chemicals, which prompt the ovaries to produce the sex hormones estrogen and progesterone. The menstrual cycle is a biofeedback system, which means each structure and gland is affected by the activity of the others.
The menstrual cycle is divided into four recognized main phases: menstruation, the follicular phase, ovulation, and the luteal phase. Menstruation is the elimination of the thickened lining of the uterus (endometrium) from the body through the vagina. Menstrual fluid contains blood, cells from the lining of the uterus (endometrial cells) and mucus. The average length of a period is between three days and one week. The follicular phase starts on the first day of menstruation and ends with ovulation. Prompted by the hypothalamus, the pituitary gland releases follicle stimulating hormone (FSH). This hormone stimulates the ovary to produce around five to 20 follicles (tiny nodules or cysts), which bead on the surface. Each follicle houses an immature egg. Usually, only one follicle will mature into an egg, while the others die. This can occur around day 10 of a 28-day cycle. The growth of the follicles stimulates the lining of the uterus to thicken in preparation for possible pregnancy. Ovulation is the release of a mature egg from the surface of the ovary. This generally occurs mid-cycle, around two weeks or so before menstruation starts. During the follicular phase, the developing follicle causes a rise in the level of estrogen. The hypothalamus in the brain recognizes these rising levels and releases a chemical called gonadotrophin-releasing hormone (GnRH). This hormone prompts the pituitary gland to produce raised levels of luteinizing hormone (LH) and FSH. Within two days, ovulation is triggered by the high levels of LH. The egg is funneled into the fallopian tube and towards the uterus by waves of small, hair-like projections. The life span of the typical egg is only around 24 hours. The luteal phase occurs when the egg bursts from its follicle and the ruptured follicle stays on the surface of the ovary. For the next two weeks or so, the follicle transforms into a structure known as the corpus luteum. 
This structure starts releasing progesterone, along with small amounts of estrogen. This combination of hormones maintains the thickened lining of the uterus, waiting for a fertilized egg to implant during the window of implantation. If a fertilized egg implants in the lining of the uterus, it produces the hormones that are necessary to maintain the corpus luteum. This includes human chorionic gonadotrophin (HCG), the hormone that is detected in a urine test for pregnancy. The corpus luteum keeps producing the raised levels of progesterone that are needed to maintain the thickened lining of the uterus. If pregnancy does not occur, the corpus luteum dies, usually around day 22 in a 28-day cycle. The drop in progesterone levels causes the lining of the uterus to fall away. This is known as menstruation. The cycle then repeats.
This cyclic transformation of the endometrium is executed through dynamic changes in states and interactions of multiple cell types, including luminal and glandular epithelial cells, stromal cells, vascular endothelial cells, and infiltrating immune cells. Although different categorization schemes exist, the transformation has been primarily divided into two major stages by the event of ovulation: the proliferative (pre-ovulatory) and secretory (post-ovulatory) stage.3 During the secretory stage, the endometrium enters a narrow window of receptive state that is both structurally and biochemically ideal for an embryo to implant.4,5 This, the mid-secretory stage, is known as the window of implantation (WOI). To prepare for this state, the tissue undergoes considerable reconstruction in the proliferative stage, during which one of the most essential elements is the formation of epithelial glands,6 lined by glandular epithelium.
Despite its importance in human fertility and regenerative biology, mechanistic understanding of endometrium-related tissue homeostasis has remained rudimentary. Described in the present Application are the transcriptomic transformations of human endometrium at single cell resolution. Further described are dissections of multidimensional cellular heterogeneity of the tissue across the entire natural menstrual cycle. The methods described herein permitted the recognition of six discrete endometrial cell types that were analyzed, including previously uncharacterized ciliated epithelium. Further analysis of gene expression patterns within these newly defined cell types demonstrated characteristic signatures for each cell type and phase during four major phases of endometrial transformation. This resulted in the surprising discovery that the human window of implantation opens up with an abrupt and discontinuous transcriptomic activation in the epithelium, accompanied by widespread decidualized features in the stroma. Also unexpected was the finding of signatures in luminal and glandular epithelium during epithelial gland reconstruction, suggesting a mechanism for adult gland formation. Described herein are precise and accurate methods for determination of endometrial status, e.g., the implantation window, useful in the treatment and/or management of patients, including but not limited to patients in need of assisted reproduction.
As used herein, a “menstrual cycle event” refers to any distinct biological state, phase, or condition that occurs during the course of the menstrual cycle which can be detected by a gene signature or biomarker signature associated with one or more endometrial cell subtypes (e.g., stroma cells, endothelium cells, immune cells, unciliated epithelium cells, and ciliated epithelium cells). An example of a menstrual cycle event is ovulation. Another example of a menstrual cycle event is a window of implantation.
In various aspects, the present Application relates to methods of evaluating the human menstrual cycle with respect to the transcriptome of cells making up the endometrium in order to identify single biomarkers or combinations of biomarkers (e.g., biomarker panels or biomarker signatures) that characterize, identify, or otherwise are associated with one or more hallmark states of the menstrual cycle, e.g., the window of implantation.
The transcriptome can be assessed on the bulk endometrium tissue at one or more time points during the menstrual cycle. In this way, the cells composing the endometrium (e.g., the epithelium, stroma (stratum compactum and stratum spongiosum), glandular epithelium, and the lymphatic and/or blood vessel component therein) can be analyzed in bulk. In another approach, the different cells making up the varied types of endometrial sub-components can be separated first, and the transcriptome can be determined for each isolated cell type.
The transcriptome is the complete set of transcripts in a cell, and their quantity, for a specific developmental stage or physiological condition. Understanding the transcriptome is essential for interpreting the functional elements of the genome and revealing the molecular constituents of cells and tissues, and also for understanding development and disease. The key aims of transcriptomics are: to catalogue all species of transcript, including mRNAs, non-coding RNAs and small RNAs; to determine the transcriptional structure of genes, in terms of their start sites, 5′ and 3′ ends, splicing patterns and other post-transcriptional modifications; and to quantify the changing expression levels of each transcript during development and under different conditions.
Various technologies are well-known in the art for deducing and quantifying the transcriptome, including hybridization- or sequence-based approaches. Hybridization-based approaches typically involve incubating fluorescently labelled cDNA with custom-made microarrays or commercial high-density oligo microarrays. Specialized microarrays have also been designed; for example, arrays with probes spanning exon junctions can be used to detect and quantify distinct spliced isoforms. Genomic tiling microarrays that represent the genome at high density have been constructed and allow the mapping of transcribed regions to a very high resolution, from several base pairs to ~100 bp. Hybridization-based approaches are high throughput and relatively inexpensive, except for high-resolution tiling arrays that interrogate large genomes. However, these methods have several limitations, which include: reliance upon existing knowledge about genome sequence; high background levels owing to cross-hybridization; and a limited dynamic range of detection owing to both background and saturation of signals. Moreover, comparing expression levels across different experiments is often difficult and can require complicated normalization methods.
In contrast to microarray methods, sequence-based approaches directly determine the cDNA sequence. Initially, Sanger sequencing of cDNA or EST libraries was used, but this approach is relatively low throughput, expensive and generally not quantitative. Tag-based methods were developed to overcome these limitations, including serial analysis of gene expression (SAGE), cap analysis of gene expression (CAGE), and massively parallel signature sequencing (MPSS). These tag-based sequencing approaches are high throughput and can provide precise, ‘digital’ gene expression levels. However, most are based on Sanger sequencing technology, and a significant portion of the short tags cannot be uniquely mapped to the reference genome. Moreover, only a portion of the transcript is analysed and isoforms are generally indistinguishable from each other. These disadvantages limit the use of traditional sequencing technology in annotating the structure of transcriptomes.
Recently, the development of novel high-throughput DNA sequencing methods has provided a new method for both mapping and quantifying transcriptomes. This method, termed RNA-Seq (RNA sequencing), has advantages over existing approaches for determining transcriptomes.
RNA-Seq uses deep-sequencing technologies. In general, a population of RNA (total or fractionated, such as poly(A)+) is converted to a library of cDNA fragments with adaptors attached to one or both ends. Each molecule, with or without amplification, is then sequenced in a high-throughput manner to obtain short sequences from one end (single-end sequencing) or both ends (paired-end sequencing). The reads are typically 30-400 bp, depending on the DNA-sequencing technology used. In principle, any high-throughput sequencing technology can be used for RNA-Seq; for example, the Illumina IG, Applied Biosystems SOLiD and Roche 454 Life Science systems have already been applied for this purpose. The Helicos Biosciences tSMS system is also appropriate and has the added advantage of avoiding amplification of target cDNA. Following sequencing, the resulting reads are either aligned to a reference genome or reference transcripts, or assembled de novo without the genomic sequence to produce a genome-scale transcription map that consists of both the transcriptional structure and/or level of expression for each gene.
Further reference can be made regarding transcriptome analysis and RNA-Seq technologies known in the art: (1) Wang et al., Nat Rev Genet. 2009 January; 10(1): 57-63; (2) Lee et al., Circ Res. 2011 Dec. 9; 109(12):1332-41; (3) Nagalakshmi et al., Curr Protoc Mol Biol. 2010 January; Chapter 4: Unit 4.11.1-13; and (4) Mutz et al., Curr Opin Biotechnol. 2013 February; 24(1):22-30, each of which is incorporated herein by reference.
Transcriptome analysis by next-generation sequencing (RNA-seq) allows investigation of a transcriptome at unsurpassed resolution. One major benefit is that RNA-seq is independent of a priori knowledge on the sequence under investigation, thereby also allowing analysis of poorly characterized Plasmodium species.
The transcriptome can be profiled by high throughput techniques including SAGE, microarray, and sequencing of clones from cDNA libraries. For more than a decade, oligonucleotide microarrays have been the method of choice providing high throughput and affordable costs. However, microarray technology suffers from well-known limitations including insufficient sensitivity for quantifying lower abundant transcripts, narrow dynamic range and biases arising from non-specific hybridizations. Additionally, microarrays are limited to only measuring known/annotated transcripts and often suffer from inaccurate annotations. Sequencing-based methods such as SAGE rely upon cloning and sequencing cDNA fragments. This approach allows quantification of mRNA abundance by counting the number of times cDNA fragments from a corresponding transcript are represented in a given sample, assuming that cDNA fragments sequenced contain sufficient information to identify a transcript. Sequencing-based approaches have a number of significant technical advantages over hybridization-based microarray methods. The output from sequence-based protocols is digital, rather than analog, obviating the need for complex algorithms for data normalization and summarization while allowing for more precise quantification and greater ease of comparison between results obtained from different samples. Consequently the dynamic range is essentially infinite, if one accumulates enough sequence tags. Sequence-based approaches do not require prior knowledge of the transcriptome and are therefore useful for discovery and annotation of novel transcripts as well as for analysis of poorly annotated genomes. However, until recently the application of sequencing technology in transcriptome profiling has been limited by high cost, by the need to amplify DNA through bacterial cloning, and by the traditional Sanger approach of sequencing by chain termination.
The next-generation sequencing (NGS) technology eliminates some of these barriers, enabling massive parallel sequencing at a high but reasonable cost for small studies. The technology essentially reduces the transcriptome to a series of randomly fragmented segments of a few hundred nucleotides in length. These molecules are amplified by a process that retains spatial clustering of the PCR products, and individual clusters are sequenced in parallel by one of several technologies. Current NGS platforms include the Roche 454 Genome Sequencer, Illumina's Genome Analyzer, and Applied Biosystems' SOLiD. These platforms can analyze tens to hundreds of millions of DNA fragments simultaneously, generate giga-bases of sequence information from a single run, and have revolutionized SAGE and cDNA sequencing technology. For example, the 3′ tag Digital Gene Expression (DGE) uses oligo-dT priming for first strand cDNA synthesis, generates libraries that are enriched in the 3′ untranslated regions of polyadenylated mRNAs, and produces base cDNA tags.
In various aspects, the present Application relates to menstrual cycle biomarkers, i.e., biomarkers which are associated with the various transformational phases of the menstrual cycle, e.g., menstruation and ovulation. One or more such biomarkers may be present in a specific population of cells (e.g., human endometrial stromal cells (hESCs)) and the level of each biomarker may deviate from the level of the same biomarker in a different population of cells and/or in a different subject (e.g., patient). For example, a biomarker that is indicative of decidualization or the opening of the window of implantation (WOI) may have an elevated level or a reduced level in a sample from a subject relative to the level of the same marker in a control sample.
Exemplary biomarkers indicative of the various phases of endometrial transformation in epithelial cells are shown in Table 1. Exemplary biomarkers indicative of the various phases of endometrial transformation in stromal cells (e.g., stromal fibroblast) are shown in Table 2. In some embodiments, a biomarker is differentially expressed in a sample that has been decidualized compared to a sample that is non-decidualized. In some embodiments, a biomarker is differentially expressed in a sample that has an open WOI compared to a sample that does not have an open WOI.
In various embodiments, the transcriptome of a cell (e.g., limited to an isolated cell or a single cell type, such as unciliated epithelial cells), or of a batch of one or more types of isolated cells or cell types (e.g., unciliated epithelial cells together with stromal cells), can be assessed by transcriptomic analysis using a method known in the art. As part of the transcriptomic analysis, the gene expression levels may be measured or determined for at least one gene. In other embodiments, the gene expression levels can be measured for between 1 and 10 genes, or between 5 and 20 genes, or between 10 and 40 genes, or between 20 and 80 genes, or between 40 and 160 genes, or between 80 and 320 genes, or between 160 and 640 genes, or more. In still other embodiments, the gene expression levels can be measured for at least 1 gene, at least 10 genes, at least 20 genes, at least 30 genes, at least 40 genes, at least 50 genes, at least 60 genes, at least 70 genes, at least 80 genes, at least 90 genes, at least 100 genes, at least 125 genes, at least 150 genes, at least 175 genes, at least 200 genes, at least 300, 400, 500, 600, 700, 800, 900, or 1000 genes or more, for example of the genes listed in any of Tables 1-17 or other genes described in this Application as indicative of WOI status.
In various embodiments, the following tables provide examples of temporally-changing genes identified as a result of transcriptome analysis of endometrial tissues in bulk and/or isolated endometrial cells (e.g., unciliated epithelial cells or stromal cells) measured along the menstrual cycle.
The biomarkers described herein may have a level in a sample obtained from a subject (i.e., patient) that has an open window of implantation (WOI) that deviates (e.g., is increased or decreased) when compared to the level of the same biomarker in a sample obtained from a subject that does not have an open WOI. The biomarkers described herein may have a level in decidualized cells that deviates (i.e., is increased or reduced) from the level of the same marker in non-decidualized cells by at least 20% (e.g., 30%, 50%, 80%, 100%, 2-fold, 5-fold, 10-fold, 20-fold, 50-fold, 100-fold or more). Such a biomarker or set of biomarkers may be used in both diagnostic/prognostic applications and non-clinical applications (e.g., for research purposes).
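The deviation criterion described above can be illustrated with a minimal sketch. The function name `deviation`, its parameters, and the numeric levels are hypothetical and are not taken from this Application; the sketch only assumes that a biomarker level and a control level are available as non-negative numbers in the same units.

```python
import math


def deviation(sample_level, control_level, min_relative_change=0.20):
    """Compare a biomarker level in a sample with its level in a control.

    Returns (deviates, direction, relative_change, fold_change), where
    `deviates` is True when the level differs from the control by at least
    `min_relative_change` (0.20 corresponds to "at least 20%").
    """
    if control_level <= 0 or sample_level < 0:
        raise ValueError("levels must be non-negative and the control level positive")

    # Relative change expressed as a fraction of the control level.
    relative_change = abs(sample_level - control_level) / control_level

    if sample_level > control_level:
        direction = "increased"
        fold_change = sample_level / control_level
    elif sample_level < control_level:
        direction = "reduced"
        # A level of zero in the sample gives an unbounded fold reduction.
        fold_change = control_level / sample_level if sample_level > 0 else math.inf
    else:
        direction = "unchanged"
        fold_change = 1.0

    return relative_change >= min_relative_change, direction, relative_change, fold_change


# Hypothetical levels: a marker at half the control level is "reduced" 2-fold.
print(deviation(5.0, 10.0))
```

Reporting both the relative change and the fold change is a design choice of this sketch; it lets the same function express a 20% deviation and a 2-fold or greater deviation.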
In some embodiments, epithelial biomarkers are one or more of PLAU, MMP7, THBS1, CADM1, NPAS3, ATP1A1, ANK3, ALPL, TRAK1, SCGB1D2, MT1F, MT1X, MT1E, MT1G, CXCL14, MAOA, DPP4, NUPR1, GPX3, PAEP (see
In other embodiments, the unciliated epithelial biomarkers include the following subset or panel of biomarkers that are associated with a window of implantation.
In still other embodiments, the stromal biomarkers include the following subset or panel of biomarkers that are associated with a window of implantation.
In reference to Tables 9 and 10, whether the expression of a biomarker (e.g., CADM1) at any point in time during the menstrual cycle (e.g., the point of WOI) is considered “up” (+) or “down” (−) regulated depends on the level of expression of that biomarker at the particular point in time of interest (e.g., point of WOI) relative to its level at the point in the menstrual cycle of peak expression of that biomarker. The peak expression level is determined computationally by a known computation method. Thus, biomarkers such as CADM1 and NPAS3 showed peak expression during the proliferative phase of the menstrual cycle; accordingly, the expression at the WOI was ascribed a value of “down-regulated.” In contrast, NUPR1 was ascribed an expression value of “up-regulated” since its expression peaked in the WOI.
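As a hedged illustration of this peak-relative convention, the following sketch assigns “+” or “-” to a biomarker at a phase of interest. The function name `regulation_at`, the `near_peak` parameter, the phase labels, and all expression values are assumptions made for illustration; the Application does not specify the computational method used to locate the peak.

```python
def regulation_at(profile, phase, near_peak=1.0):
    """Label expression at `phase` as "+" (up) or "-" (down) relative to peak.

    `profile` maps phase name -> expression level for one biomarker.
    `near_peak` is the fraction of the peak level that still counts as "up";
    the default of 1.0 means only the peak phase itself is labelled "+".
    Returns (label, peak_phase).
    """
    peak_phase = max(profile, key=profile.get)
    peak_level = profile[peak_phase]
    if peak_level <= 0:
        # A gene that is never expressed is not up-regulated anywhere.
        return "-", peak_phase
    label = "+" if profile[phase] >= near_peak * peak_level else "-"
    return label, peak_phase


# Hypothetical phase-averaged expression values, for illustration only.
cadm1 = {"proliferative": 9.0, "early-sec": 3.0, "WOI": 0.5, "late-sec": 0.4}
nupr1 = {"proliferative": 0.3, "early-sec": 1.0, "WOI": 8.0, "late-sec": 2.5}

print(regulation_at(cadm1, "WOI"))  # peak in the proliferative phase
print(regulation_at(nupr1, "WOI"))  # peak in the WOI
```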
The biomarkers of Tables 9 and 10 may be further classified into three broad categories:
1. A negative biomarker: expression above a threshold indicates a classification of “out of WOI” (e.g., CADM1, ATP1A1, ALPL, FGF7, or LMCD1). In general, these markers are not expressed in WOI, but are expressed in other major phases of the menstrual cycle. Therefore, considerable expression of these genes would indicate “out of WOI.”
2. A type 1 positive biomarker: expression above a threshold indicates a classification of “likely within early-sec or WOI” (e.g., MT1F, MT1X, MT1E, MT1G). These biomarkers show considerable expression in early-sec or WOI relative to their expression levels in other phases of the menstrual cycle.
3. A type 2 positive biomarker: expression above a threshold indicates a classification of “likely within late-sec or WOI” (e.g., CXCL14, PAEP, FGF7, LMCD1). These biomarkers show considerable expression in late-sec or WOI relative to their expression levels in other phases of the menstrual cycle.
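The three categories above can be combined into a simple rule-based call, sketched below under stated assumptions. The function `call_phase`, the per-gene thresholds, and the rule of intersecting the type 1 and type 2 calls are illustrative assumptions and are not described in this Application. FGF7 and LMCD1 are omitted from the example sets because they are listed above under more than one category.

```python
NEGATIVE = {"CADM1", "ATP1A1", "ALPL"}
TYPE1_POSITIVE = {"MT1F", "MT1X", "MT1E", "MT1G"}  # early-sec or WOI
TYPE2_POSITIVE = {"CXCL14", "PAEP"}                # late-sec or WOI


def call_phase(expression, thresholds):
    """Return the set of phases consistent with the observed marker expression.

    `expression` and `thresholds` map gene symbol -> numeric value.
    A gene without a threshold is never counted as expressed.
    """
    def expressed(gene):
        return expression.get(gene, 0.0) > thresholds.get(gene, float("inf"))

    # Any negative marker above its threshold indicates "out of WOI".
    if any(expressed(g) for g in NEGATIVE):
        return {"out of WOI"}

    type1 = any(expressed(g) for g in TYPE1_POSITIVE)
    type2 = any(expressed(g) for g in TYPE2_POSITIVE)
    if not (type1 or type2):
        return {"undetermined"}

    # Assumption: narrow the call by intersecting the two positive categories.
    candidates = {"early-sec", "WOI", "late-sec"}
    if type1:
        candidates &= {"early-sec", "WOI"}
    if type2:
        candidates &= {"late-sec", "WOI"}
    return candidates


# Hypothetical thresholds and expression values, for illustration only.
thresholds = {g: 1.0 for g in NEGATIVE | TYPE1_POSITIVE | TYPE2_POSITIVE}
print(call_phase({"MT1F": 3.0, "PAEP": 4.0}, thresholds))
```

In this sketch, a sample expressing both a type 1 and a type 2 positive marker, and no negative marker, is narrowed to the WOI because that is the only phase common to both positive categories.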
There are many potential ways to build the gene classifiers described herein, as well as other gene classifiers, for predicting one or more phases or events (e.g., WOI) during the menstrual cycle, including determining the thresholds.
In one possible approach, a machine learning based method can be used to build a classifier (e.g., a support vector machine or a random forest). The expression profile of the biomarkers would be used to train the classifier on training sample sets, deriving thresholds for the markers (which would most likely differ from marker to marker). The classifier would then be tested on held-out test sample sets. Via cross-validation, the most informative genes and their corresponding thresholds could be determined.
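The following is a minimal, dependency-free sketch of deriving per-marker thresholds and ranking markers by cross-validation. It deliberately swaps in a one-gene threshold classifier (a decision stump) with leave-one-out cross-validation in place of the support vector machine or random forest mentioned above. All function names, the gene label `GENE_X`, and the expression values are hypothetical.

```python
def fit_stump(values, labels):
    """Fit a threshold and direction for one gene.

    Predicts "in WOI" when sign * (value - threshold) > 0.
    Returns (threshold, sign, training_accuracy).
    """
    ordered = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    cuts = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])] or [ordered[0]]
    best = None
    for cut in cuts:
        for sign in (1, -1):
            correct = sum((sign * (v - cut) > 0) == y for v, y in zip(values, labels))
            accuracy = correct / len(values)
            if best is None or accuracy > best[2]:
                best = (cut, sign, accuracy)
    return best


def loo_accuracy(values, labels):
    """Leave-one-out cross-validated accuracy of the stump for one gene."""
    hits = 0
    for i in range(len(values)):
        cut, sign, _ = fit_stump(values[:i] + values[i + 1:], labels[:i] + labels[i + 1:])
        hits += (sign * (values[i] - cut) > 0) == labels[i]
    return hits / len(values)


def rank_markers(samples, labels):
    """Rank genes by cross-validated accuracy; thresholds are refit on all samples."""
    results = []
    for gene in sorted(samples[0]):
        values = [s[gene] for s in samples]
        cut, sign, _ = fit_stump(values, labels)
        results.append((loo_accuracy(values, labels), gene, cut, "up" if sign > 0 else "down"))
    return sorted(results, key=lambda r: (-r[0], r[1]))


# Hypothetical training set: three WOI samples followed by three non-WOI samples.
samples = [
    {"PAEP": 8.0, "CADM1": 1.0, "GENE_X": 3.0},
    {"PAEP": 9.0, "CADM1": 0.5, "GENE_X": 5.0},
    {"PAEP": 7.0, "CADM1": 1.5, "GENE_X": 4.0},
    {"PAEP": 1.0, "CADM1": 6.0, "GENE_X": 4.0},
    {"PAEP": 2.0, "CADM1": 7.0, "GENE_X": 3.0},
    {"PAEP": 1.5, "CADM1": 8.0, "GENE_X": 5.0},
]
labels = [True, True, True, False, False, False]

ranked = rank_markers(samples, labels)
for accuracy, gene, threshold, direction in ranked:
    print(gene, threshold, direction, accuracy)
```

With a real data set, a multi-gene classifier and a held-out test set would normally replace the per-gene stump; the sketch only shows how thresholds and informativeness can both fall out of cross-validation.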
In another approach, a gene set enrichment (GSEA) based method could be used to build a classifier. Given the fact that the genes selected in
In certain embodiments, the detection methods may rely on the predictive value of only a single biomarker, such as a biomarker that has a relatively exclusive expression in a certain phase, e.g., in WOI (e.g., IL15). In other embodiments, the detection methods may rely on the predictive value of biomarkers which show up-regulation in WOI relative to late-sec phase (e.g., IL15, CXCL14, MAOA, or DPP4).
In certain other embodiments, the detection methods may rely on a combination of epithelial biomarkers from
The biomarkers identified in
The biomarkers identified in
Any of the biomarkers described herein, either taken alone or in combination (e.g., at least two biomarkers, at least three biomarkers, or more biomarkers), can be used in the assay methods also described herein for analyzing a sample from a subject to determine the one or more specific phases of endometrial transformation that occur in the human menstrual cycle. Results obtained from such assay methods can be used in either clinical applications or non-clinical applications, including, but not limited to, those described herein.
The methods for identifying biomarkers and subsequently detecting biomarkers may involve bulk tissues, e.g., bulk endometrial tissues. This is because the inventors have discovered that the biomarkers discussed herein from one subtissue, e.g., those presented in
This means that the various endometrial sub-tissues or cell types were found to have unique gene signatures which may be evaluated without first having to separate an endometrial tissue into its component cells.
However, the methods of biomarker detection also contemplate first processing a sample to separate cell types, thereby conducting the biomarker analysis on only a single type of cell, e.g., unciliated epithelial cells or stromal cells.
Thus, in various embodiments, the methods disclosed herein may involve the step of processing a sample (e.g., an endometrial sample) by separating out one or more cell types, e.g., separating out unciliated epithelium cells, ciliated epithelium cells, stratum compactum cells (stromal), stratum spongiosum cells (stromal), glandular epithelium cells, luminal epithelium cells, and lymphatic or blood vessel cells from an endometrium sample. Once the cells of the endometrium are separated and collected or pooled, the cells of each individual tissue subtype can be evaluated for biomarker expression based on detection of any of the biomarkers of Tables 1-17.
Methods of cell separation are well-known in the art.
Isolation of one or multiple cell types from a heterogeneous population is an integral part of modern biological research and routine clinical diagnosis and treatment. Purification of specific cells is essential for basic cell biology research, cellular enumeration in certain pathologies and cell based regenerative therapies. The main principle of separating any cell type from a population is to utilize one or more properties that are unique to that cell type. The most widely used cell isolation and separation techniques can be broadly classified as based on adherence, morphology (density/size) and antibody binding. The high precision single cell isolation methods are usually based on one or more of these properties while newer techniques incorporating microfluidics make use of some additional cellular characteristics. The recent improvements in cell isolation procedures vis-à-vis purity, yield and viability of cells have resulted in significant advances in the areas of stem cell biology, oncology and regenerative medicine among others.
A cell isolation procedure can be either a positive selection or a negative selection: the former aims at isolating the target cell type from the entire population, usually with specific antibodies, while the latter strategy involves the depletion of all other cell types of the population, resulting in only the target cells remaining. Both types of isolation methods have their own advantages and disadvantages. Due to the use of specific antibodies targeting a particular cell type, positive selection yields a higher purity of the desired population. On the other hand, it is more complex to design an antibody cocktail to deplete all the non-target cells, making negative selection less efficient vis-à-vis purity. Furthermore, a cell population isolated through positive selection can be sequentially purified through several cycles of the procedure, a benefit that negative selection cannot provide. However, positively selected cells carry antibodies and other labelling agents that may interfere with downstream culture and assays; if that is a concern, it is preferable to use a negative selection method.
To isolate a particular cell type from a heterogeneous population, the unique properties of that cell type can be exploited. Cell isolation techniques are broadly classified into four categories based on the following cellular characteristics:
(1) Surface charge and adhesion—This feature determines the extent of attachment of cells to plastic and other polymer surfaces and can be used to separate adherent cells from suspension/free-floating cells.
(2) Cell size and density—The physical properties of size and density are commonly used for the bulk recovery of cells; either by sedimentation, filtration or density gradient centrifugation.
(3) Cell morphology and physiology—Different cell types can be distinguished on the basis of shape, histological staining, media selective growth, redox potential and other visual and behavioural properties which can then be harnessed to isolate those cells.
(4) Surface markers—Specific binding of surface antigens to either antibodies or aptamers can selectively capture cells of the specific surface phenotype. The captured cells are subsequently detected with the help of measurable probes—usually fluorochromes and magnetic particles—with which the antibodies/aptamers are labelled.
In addition, two or more of the above principles can be combined to further increase the specificity of isolated cells; usually such compound techniques consist of a label-free method (one of the first three in the list) along with a label-incorporating method.
Using these well-known methods and the known properties and characteristics distinguishing the endometrial cell types from one another, the person of ordinary skill in the art can isolate or separate one or more cell types from a bulk endometrial tissue sample without undue experimentation.
In some embodiments, data is obtained for each of a plurality of cells in an endometrial sample. The data is then evaluated and a cell type is assigned to each cell based on one or more characteristic markers (e.g., one or more markers characteristic of a cell type of interest). In some embodiments, the gene expression data is used to determine the cell type, e.g., an unciliated epithelial cell or a stromal cell. For example, one or more of the following non-limiting genes can be used to identify a cell as an unciliated epithelial cell: PLAU, MMP7, THBS1, CADM1, NPAS3, ATP1A1, ANK3, ALPL, TRAK1, SCGB1D2, MT1F, MT1X, MT1E, MT1G, CXCL14, MAOA, DPP4, NUPR1, GPX3, and PAEP. Similarly, one or more of the following non-limiting genes can be used to identify a cell as a stromal cell: STC1, NFATC2, BMP2, PMAIP1, MMP11, SFRP1, WNT5A, ZFYVE21, CILP, SLF2, MATN2, S100A4, DKK1, CRYAB, FOXO1, IL15, FGF7, and LMCD1.
Alternatively, in some embodiments, gene expression data for a plurality of cells in an endometrial sample can be obtained (e.g., bulk gene expression data) and evaluated to determine patterns of gene expression associated with different cell types within the sample without having to first separate the sample into distinct cell populations, i.e., a bulk assessment.
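Assigning a cell type from per-cell gene expression data can be sketched as follows. This is a simplified illustration, not the method used in this Application: it scores each cell by the fraction of a cell type's marker genes that are detected and assigns the best-scoring type. The function name `assign_cell_type`, the `min_count` parameter, the marker subsets chosen from the lists above, and the example counts are all assumptions.

```python
# Small illustrative subsets of the marker genes listed above.
MARKERS = {
    "unciliated epithelial": {"PLAU", "MMP7", "PAEP", "GPX3", "SCGB1D2"},
    "stromal": {"STC1", "BMP2", "DKK1", "FOXO1", "IL15"},
}


def assign_cell_type(cell_expression, markers=MARKERS, min_count=0.0):
    """Assign a cell type from one cell's expression profile.

    `cell_expression` maps gene symbol -> expression value for a single cell.
    A marker counts as detected when its value exceeds `min_count`.
    Returns the best-scoring cell type, or "unassigned" when no marker is
    detected or when two types tie for the best score.
    """
    scores = {
        cell_type: sum(1 for g in genes if cell_expression.get(g, 0.0) > min_count) / len(genes)
        for cell_type, genes in markers.items()
    }
    best_type = max(scores, key=scores.get)
    best_score = scores[best_type]
    if best_score == 0 or list(scores.values()).count(best_score) > 1:
        return "unassigned"
    return best_type


# Hypothetical single-cell counts, for illustration only.
print(assign_cell_type({"PAEP": 12, "MMP7": 3, "DKK1": 0}))
print(assign_cell_type({"DKK1": 4, "FOXO1": 2, "IL15": 1, "PAEP": 1}))
```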
Bulk assessment may involve first using cell-type defining genes in
For gene set enrichment analysis (GSEA), one approach would be a scoring scheme in which a value a (a > 0) is added to the total score s if expression (above a threshold) of a positive marker is observed, and a is subtracted from s if expression of a negative marker is observed. As in the original GSEA, a marker may be assigned a weight based on its importance and the category to which it belongs.
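The scoring scheme just described can be sketched in a few lines. The function name `woi_score`, the marker lists, thresholds, weights, and expression values below are hypothetical; only the add-a or subtract-a rule and the optional per-marker weight come from the text.

```python
def woi_score(expression, positive, negative, thresholds, a=1.0, weights=None):
    """Compute the total score s for one sample.

    For each positive marker expressed above its threshold, a * weight is
    added to s; for each negative marker expressed above its threshold,
    a * weight is subtracted. Markers without an explicit weight use 1.0.
    """
    if a <= 0:
        raise ValueError("a must be greater than 0")
    weights = weights or {}
    s = 0.0
    for gene in positive:
        if expression.get(gene, 0.0) > thresholds[gene]:
            s += a * weights.get(gene, 1.0)
    for gene in negative:
        if expression.get(gene, 0.0) > thresholds[gene]:
            s -= a * weights.get(gene, 1.0)
    return s


# Hypothetical marker sets, thresholds, and expression values.
positive = ["PAEP", "CXCL14", "MT1G"]
negative = ["CADM1", "ALPL"]
thresholds = {gene: 1.0 for gene in positive + negative}
sample = {"PAEP": 5.0, "CXCL14": 2.0, "MT1G": 0.5, "CADM1": 0.2, "ALPL": 3.0}

print(woi_score(sample, positive, negative, thresholds))
print(woi_score(sample, positive, negative, thresholds, weights={"PAEP": 2.0}))
```

In the first call, two positive markers and one negative marker exceed their thresholds, so s is a + a - a. How a score is mapped to a WOI call (for example, by a cutoff on s) is left open here, as it is in the text.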
Any sample that may contain a biomarker (e.g., a biological sample such as endometrial tissue, endometrial cells, or endometrial fluid) can be analyzed by the assay methods described herein. A sample may also include a tissue or biological fluid (e.g., blood) which is obtained non-invasively. The methods described herein may include providing a sample obtained from a subject. In some examples, the sample may be from an in vitro assay, for example, an in vitro cell culture (e.g., an in vitro culture of human endometrial unciliated epithelial and/or human endometrial stromal cells (hESCs)). As used herein, a “sample” refers to a composition that comprises biological materials such as (but not limited to) endometrial tissue, endometrial cells, or endometrial fluid from a subject. A sample includes both an initial unprocessed sample taken from a subject as well as subsequently processed, e.g., partially purified or preserved forms. Exemplary samples include endometrial tissue, endometrial stromal cells, placental tissue, blood, plasma, or mucus. Exemplary endometrial tissue includes, but is not limited to, decidua basalis, decidua capsularis, or decidua parietalis. In some embodiments, the sample is a body fluid sample such as an endometrial fluid sample. In some embodiments, multiple (e.g., at least 2, 3, 4, 5, or more) samples may be collected from a subject, over time or at particular time intervals, for example to assess the disease progression or evaluate the efficacy of a treatment.
A sample can be obtained from a subject using any means known in the art. In some embodiments, the sample is obtained from the subject by removing the sample (e.g., an endometrial tissue sample) from the subject. In some embodiments, the sample is obtained from the subject by a surgical procedure (e.g., dilation and curettage (D&C)). In some embodiments, the sample is obtained from the subject by a biopsy (e.g., an endometrial biopsy). In some embodiments, the sample is obtained from the subject by aspirating, brushing, scraping, or a combination thereof. In some embodiments, the sample is obtained from a human. In some embodiments, the sample is obtained non-invasively.
Any of the samples described herein can be subject to analysis using the assay methods described herein, which involve measuring the level of one or more biomarkers as described herein. Levels (e.g., the amount) of a biomarker disclosed herein, or changes in levels of the biomarker, can be assessed using conventional assays or those described herein.
As used herein, the terms “determining” or “measuring,” or alternatively “detecting,” may include assessing the presence, absence, quantity and/or amount (which can be an effective amount) of a substance within a sample, including the derivation of qualitative or quantitative concentration levels of such substances, or otherwise evaluating the values and/or categorization of such substances in a sample from a subject.
In some embodiments, the level of a biomarker is assessed or measured by directly detecting the protein in a sample (e.g., an endometrial tissue sample, endometrial cell sample, or endometrial fluid sample). Alternatively or in addition, the level of a protein can be assessed or measured indirectly in a sample, for example, by detecting the level of activity of the protein (e.g., enzymatic assay).
The level of a protein (e.g., a biomarker protein) may be measured using an immunoassay. Examples of immunoassays include any known assay (without limitation), and may include any of the following: immunoblotting assay (e.g., Western blot), immunohistochemical analysis, flow cytometry assay, immunofluorescence assay (IF), enzyme linked immunosorbent assays (ELISAs) (e.g., sandwich ELISAs), radioimmunoassays, electrochemiluminescence-based detection assays, magnetic immunoassays, lateral flow assays, and related techniques. Additional suitable immunoassays for detecting a biomarker protein provided herein will be apparent to those of skill in the art.
Such immunoassays may involve the use of an agent (e.g., an antibody) specific to the target biomarker. An agent such as an antibody that “specifically binds” to a target biomarker is a term well understood in the art, and methods to determine such specific binding are also well known in the art. An antibody is said to exhibit “specific binding” if it reacts or associates more frequently, more rapidly, with greater duration and/or with greater affinity with a particular target biomarker than it does with alternative biomarkers. It is also understood by reading this definition that, for example, an antibody that specifically binds to a first target peptide may or may not specifically or preferentially bind to a second target peptide. As such, “specific binding” or “preferential binding” does not necessarily require (although it can include) exclusive binding. Generally, but not necessarily, reference to binding means preferential binding. In some examples, an antibody that “specifically binds” to a target peptide or an epitope thereof may not bind to other peptides or other epitopes in the same antigen. In some embodiments, a sample may be contacted, simultaneously or sequentially, with more than one binding agent that binds different protein biomarkers (e.g., multiplexed analysis).
As used herein, the term “antibody” refers to a protein that includes at least one immunoglobulin variable domain or immunoglobulin variable domain sequence. For example, an antibody can include a heavy (H) chain variable region (abbreviated herein as VH), and a light (L) chain variable region (abbreviated herein as VL). In another example, an antibody includes two heavy (H) chain variable regions and two light (L) chain variable regions. The term “antibody” encompasses antigen-binding fragments of antibodies (e.g., single chain antibodies, Fab and sFab fragments, F(ab′)2, Fd fragments, Fv fragments, scFv, and domain antibodies (dAb) fragments (de Wildt et al., Eur J Immunol. 1996; 26(3):629-39.)) as well as complete antibodies. An antibody can have the structural features of IgA, IgG, IgE, IgD, IgM (as well as subtypes thereof). Antibodies may be from any source including, but not limited to, primate (human and non-human primate) and primatized (such as humanized) antibodies.
In some embodiments, the antibodies as described herein can be conjugated to a detectable label and the binding of the detection reagent to the peptide of interest can be determined based on the intensity of the signal released from the detectable label. Alternatively, a secondary antibody specific to the detection reagent can be used. One or more antibodies may be coupled to a detectable label. Any suitable label known in the art can be used in the assay methods described herein. In some embodiments, a detectable label comprises a fluorophore. As used herein, the term “fluorophore” (also referred to as “fluorescent label” or “fluorescent dye”) refers to moieties that absorb light energy at a defined excitation wavelength and emit light energy at a different wavelength. In some embodiments, a detection moiety is or comprises an enzyme. In some embodiments, an enzyme is one (e.g., β-galactosidase) that produces a colored product from a colorless substrate.
In some examples, an assay method described herein is applied to measure the level of a cellular biomarker in a sample. Such cells may be collected according to routine practice and the level of cellular biomarkers can be measured via a conventional method.
In other examples, an assay method described herein is applied to measure the level of a circulating biomarker in a sample, which can be any biological sample including, but not limited to, a fluid sample (e.g., a blood sample or plasma sample), a tissue sample, or a cell sample. Any of the assays known in the art including, e.g., immunoassays can be used for measuring the level of such biomarkers.
It will be apparent to those of skill in the art that this disclosure is not limited to immunoassays. Detection assays that are not based on an antibody, such as mass spectrometry, are also useful for the detection and/or quantification of biomarkers as provided herein. Assays that rely on a chromogenic substrate can also be useful for the detection and/or quantification of biomarkers as provided herein.
Alternatively, the level of nucleic acids encoding a biomarker in a sample can be measured via a conventional method. In some embodiments, measuring the expression level of nucleic acid encoding the biomarker comprises measuring mRNA. In some embodiments, the expression level of mRNA encoding a biomarker can be measured using real-time reverse transcriptase (RT) Q-PCR or a nucleic acid microarray. Methods to detect biomarker nucleic acid sequences include, but are not limited to, polymerase chain reaction (PCR), reverse transcriptase-PCR (RT-PCR), in situ PCR, quantitative PCR (Q-PCR), real-time quantitative PCR (RT Q-PCR), in situ hybridization, Southern blot, Northern blot, sequence analysis, microarray analysis, detection of a reporter gene, or other DNA/RNA hybridization platforms.
Any binding agent that specifically binds to a desired biomarker may be used in the methods and kits described herein to measure the level of a biomarker in a sample. In some embodiments, the binding agent is an antibody or an aptamer that specifically binds to a desired protein biomarker. In other embodiments, the binding agent may be one or more oligonucleotides complementary to a coding nucleic acid or a portion thereof. In some embodiments, a sample may be contacted, simultaneously or sequentially, with more than one binding agent that binds different biomarkers (e.g., multiplexed analysis).
To measure the level of a target biomarker, a sample can be in contact with a binding agent under suitable conditions. In general, the term “contact” refers to an exposure of the binding agent to the sample or cells collected therefrom for a suitable period sufficient for the formation of complexes between the binding agent and the target biomarker in the sample, if any. In some embodiments, the contacting is performed by capillary action in which a sample is moved across a surface of the support membrane.
In some embodiments, the assays may be performed on low-throughput platforms, including single assay format. For example, a low throughput platform may be used to measure the presence and amount of a protein in a sample (e.g., endometrium tissue, endometrial stromal cells, and/or endometrial fluid) for diagnostic methods, monitoring of disease and/or treatment progression, and/or predicting whether a disease or disorder may benefit from a particular treatment.
In some embodiments, it may be necessary to immobilize a binding agent to the support member. Methods for immobilizing a binding agent will depend on factors such as the nature of the binding agent and the material of the support member and may require particular buffers. Such methods will be evident to one of ordinary skill in the art. For example, the biomarker set in a sample as described herein may be measured using any of the kits and/or detecting devices which are also described herein.
The type of detection assay used for the detection and/or quantification of a biomarker such as those provided herein may depend on the particular situation in which the assay is to be used (e.g., clinical or research applications), on the kind and number of biomarkers to be detected, and/or on the kind and number of patient samples to be run in parallel, to name a few parameters.
In various embodiments, the number of biomarkers that are measured falls between 1 and 10 genes, or between 5 and 20 genes, or between 10 and 40 genes, or between 20 and 80 genes, or between 40 and 160 genes, or between 80 and 320 genes, or between 160 and 640 genes, or more. In still other embodiments, the gene expression levels can be measured for at least 1 gene, at least 10 genes, at least 20 genes, at least 30 genes, at least 40 genes, at least 50 genes, at least 60 genes, at least 70 genes, at least 80 genes, at least 90 genes, at least 100 genes, at least 125 genes, at least 150 genes, at least 175 genes, at least 200 genes, at least 300, 400, 500, 600, 700, 800, 900, or 1000 genes or more.
The assay methods described herein may be used for both clinical and non-clinical purposes. Some examples are provided herein.
Diagnostic and/or Prognostic Applications
The levels of one or more of the biomarkers in a sample obtained from a subject may be measured by the assay methods described herein and used for various clinical purposes. These clinical purposes may include, but are not limited to: identifying a subject having infertility; detecting or diagnosing the opening and/or closing of the window of implantation (WOI) in a subject trying to become pregnant; transferring an embryo to a subject that has been diagnosed as being within the window of implantation; and treating a subject with infertility (e.g., by causing the overexpression or silencing of one or more of the genes disclosed herein using gene therapy), based on the level of one or more biomarkers described herein.
When needed, the level of a biomarker in a sample as determined by an assay method described herein may be normalized with an internal control in the same sample or with a standard sample (having a predetermined amount of the biomarker) to obtain a normalized value. Either the raw value or the normalized value of the biomarker can then be compared with that in a reference sample or a control sample. A deviated (e.g., increased or reduced) value of the biomarker in a sample obtained from a subject relative to the value of the same biomarker in the reference or control sample is indicative of whether the WOI is open or closed. Such a sample indicates that the subject from which the sample was obtained may be within the WOI.
In some embodiments, the level of the biomarker in a sample obtained from a subject can be compared to a predetermined threshold value for that biomarker, and a deviated (e.g., elevated or reduced) value of the biomarker may indicate that the window of implantation is open or closed for that subject.
The control sample or reference sample may be a sample obtained from a healthy individual. Alternatively, the control sample or reference sample contains a known amount of the biomarker to be assessed. In some embodiments, the control sample or reference sample is a sample obtained from a control subject.
The control level can be a predetermined level or threshold. Such a predetermined level can represent the level of the protein in a population of subjects that are within the window of implantation (WOI). It can also represent the level of the protein in a population of subjects that are not within the WOI.
The predetermined level can take a variety of forms. For example, it can be a single cut-off value, such as a median or mean. In some embodiments, such a predetermined level can be established based upon comparative groups, such as where one defined group is known to be within the window of implantation, and another group is known to not be in the window of implantation. Alternatively, the predetermined level can be a range including, for example, a range representing the levels of the protein in a control population.
The control level as described herein can be determined by any technology known in the field. In some examples, the control level can be obtained by performing a conventional method (e.g., the same assay for obtaining the level of the protein in a test sample as described herein) on a control sample as also described herein. In other examples, levels of the protein can be obtained from members of a control population and the results can be analyzed by any method known in the field (e.g., a computational program) to obtain the control level (a predetermined level) that represents the level of the protein in the control population.
By comparing the level of a biomarker in a sample obtained from a candidate subject to the reference value as described herein, it can be determined whether the candidate subject is within the WOI. For example, if the level of biomarker(s) in a sample from the candidate subject deviates (e.g., is increased or decreased) from the reference value (by e.g., 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 150%, 200%, 300%, 400%, 500% or more from a reference value), the candidate subject might be identified as being within the WOI.
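The comparison described above can be illustrated with a short Python sketch. The function names percent_deviation and deviates_from_reference, the default cut-off of 20%, and the numeric values are assumptions chosen for the example; the text lists many possible percentages and does not prescribe any single one.

```python
def percent_deviation(level, reference):
    """Signed deviation of a measured level from a reference value, in percent.

    Positive results mean the level is increased relative to the reference;
    negative results mean it is decreased.
    """
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return (level - reference) / reference * 100.0


def deviates_from_reference(level, reference, min_percent=20.0):
    """Return True if the level differs from the reference by at least
    `min_percent` percent in either direction (assumed cut-off)."""
    return abs(percent_deviation(level, reference)) >= min_percent


# Hypothetical measurements in arbitrary units.
reference_value = 100.0
candidate_level = 150.0
candidate_deviates = deviates_from_reference(candidate_level, reference_value)
```

Whether a deviation in a given direction indicates that the WOI is open or closed depends on the biomarker and on how the reference value was established, so the sketch reports only that a deviation exists.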
As used herein, “an absolute value of the ratio” refers to the ratio of the determined level of the biomarker in the sample to the control level of the biomarker. Control levels are described in detail herein. In some embodiments, the absolute value of the ratio is at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 150, at least 200, at least 300, at least 400, at least 500, or at least 1000. In some embodiments, the absolute value of the ratio is between 2-1000. In some embodiments, the absolute value of the ratio is between 5-1000, between 10-1000, between 15-1000, between 20-1000, between 30-1000, between 40-1000, between 50-1000, between 60-1000, between 70-1000, between 80-1000, between 90-1000, between 100-1000, between 200-1000, between 300-1000, between 400-1000, or between 500-1000. In some embodiments, the absolute value of the ratio is between 2-500, between 2-400, between 2-300, between 2-200, between 2-100, between 2-90, between 2-80, between 2-70, between 2-60, between 2-50, between 2-40, between 2-30, between 2-20, between 2-15, between 2-10, or between 2-5.
As used herein, “an elevated level,” “an increased level,” or “a level above a reference value” means that the level of the biomarker is higher than a reference value, such as a predetermined threshold of a level of the biomarker in a control sample. An elevated or increased level of a biomarker includes a level of the biomarker that is, for example, 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 150%, 200%, 300%, 400%, 500% or more above a reference value. In some embodiments, the level of the biomarker in the test sample is at least 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 25, 50, 100, 150, 200, 300, 400, 500, 1000, 10000-fold or more higher than the level of the biomarker in a reference sample.
As used herein, “a reduced level,” “a decreased level,” or “a level below a reference value” means that the level of the biomarker is lower than a reference value, such as a predetermined threshold of a level of the biomarker in a control sample. A reduced or decreased level of a biomarker includes a level of the biomarker that is, for example, 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 150%, 200%, 300%, 400%, 500% or more below a reference value. In some embodiments, the level of the biomarker in the test sample is at least 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 25, 50, 100, 150, 200, 300, 400, 500, 1000, 10000-fold or more lower than the level of the biomarker in a reference sample.
In some embodiments, the candidate subject is a human patient trying to become pregnant. If the subject is identified as not responsive to the treatment, a higher dose and/or frequency of dosage of the therapeutic agent (e.g., a gene therapy agent) are administered to the subject identified. In some embodiments, the dosage or frequency of dosage of the therapeutic agent is maintained, lowered, or ceased in a subject identified as responsive to the treatment or not in need of further treatment. Alternatively, an alternative treatment can be administered to a subject who is found to not be responsive to a first or subsequent treatment. In some embodiments, an alternative treatment can be administered to a subject who is found to have a negative reaction to a first or subsequent treatment.
Also within the scope of the present disclosure are methods of evaluating a subject for transfer of one or more fertilized eggs or embryos. To practice this method, the level of one or more biomarkers in a sample collected from a subject trying to become pregnant is measured to determine the phase of menstrual cycle. If the biomarker level or levels indicate that the subject is within the WOI, one or more fertilized eggs or embryos may be transferred to the subject. If the biomarker level or levels indicate that the subject is not within the WOI, or is near or at the end of the WOI, one or more fertilized eggs or embryos may be transferred to the subject during the following menstrual cycle. A fertilized egg or embryo can be transferred to a subject using any means known in the art including, but not limited to, in vitro fertilization (IVF), ultra-sound guided IVF, and surgical embryo transfer (SET).
In some embodiments, the level of expression of a particular gene or biomarker is obtained as the absolute number of copies of mRNA in a particular tissue sample or cell (e.g., endometrium tissue or cell sample). In other embodiments, the level of expression of a particular gene or biomarker is obtained by normalizing the amount of an expression product of a particular gene of interest against the amount of expression of a normalizing gene (e.g., one or more housekeeping genes) product. Normalization may be done to generate an index value or simply to help in reducing background noise when determining the expression level of the gene of interest. In one embodiment, for example, in determining the level of expression of a relevant gene in accordance with the present invention, the amount of an expression product of the gene (e.g., mRNA, cDNA, protein) is measured within one or more cells, particularly tumor cells, and normalized against the amount of the expression product(s) of a normalizing gene, or a set of normalizing genes, within the same one or more cells, to obtain the level of expression of the relevant marker gene. For example, when a single gene is used as a normalizing gene, a housekeeping gene whose expression is determined to be independent of endometrial cycling or transformation can be used. A set of such housekeeping genes can also be used in gene expression analysis to provide a combined normalizing gene set. Housekeeping genes are well known in the art, with examples including, but not limited to, GUSB (glucuronidase, beta), HMBS (hydroxymethylbilane synthase), SDHA (succinate dehydrogenase complex, subunit A, flavoprotein), UBC (ubiquitin C) and YWHAZ (tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta polypeptide). When a combined normalizing gene set is used in the normalization, the amount of gene expression of such normalizing genes can be averaged, combined together by straight additions or by a defined algorithm.
Genes other than housekeeping genes may also be used as normalizing genes.
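Normalization against a combined normalizing gene set can be sketched in Python as shown below. This is an illustrative sketch under stated assumptions: the function name normalize_expression, the method labels "average" and "sum", and all expression values are hypothetical, and the "defined algorithm" option mentioned in the text is not modeled.

```python
from statistics import mean


def normalize_expression(target_level, normalizing_levels, method="average"):
    """Normalize a target gene's expression against normalizing genes.

    `normalizing_levels` maps normalizing gene names to measured levels.
    The levels are combined either by averaging ("average") or by straight
    addition ("sum"), and the target level is divided by the combined value.
    """
    values = list(normalizing_levels.values())
    if not values:
        raise ValueError("at least one normalizing gene is required")
    if method == "average":
        combined = mean(values)
    elif method == "sum":
        combined = sum(values)
    else:
        raise ValueError(f"unknown method: {method!r}")
    if combined <= 0:
        raise ValueError("combined normalizing level must be positive")
    return target_level / combined


# Hypothetical measurements (arbitrary units) for two housekeeping genes
# named in the text.
housekeeping = {"HMBS": 10.0, "UBC": 20.0}
normalized_by_average = normalize_expression(30.0, housekeeping)
normalized_by_sum = normalize_expression(30.0, housekeeping, method="sum")
```

The same function accepts a single normalizing gene, in which case the two combination methods give identical results.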
Those skilled in the art will appreciate how to obtain and use an index value in the methods of the invention. For example, the index value may represent the gene expression levels found in a normal sample obtained from the patient of interest (e.g., a healthy woman during one or more points in the menstrual cycle), in which case an expression level in the sample significantly higher than this index value would indicate, e.g., a poor prognosis or increased likelihood of abnormal menstrual cycle.
Alternatively, the index value may represent the average expression level for a set of individuals from a diverse population or a subset of the population. For example, one may determine the average expression level of a gene or gene panel in a random sampling of patients at a specific point in the menstrual cycle, e.g., ovulation or the window of implantation. This average expression level may be termed the “threshold index value.”
Alternatively, the index value may represent the average expression level of a particular gene marker in a plurality of training patients (e.g., patients within the window of implantation) with similar outcomes whose clinical and follow-up data are available and sufficient to define and categorize the patients by outcome, e.g., recurrence or prognosis. See, e.g., Examples, infra. For example, a “good prognosis index value” can be generated from a plurality of training patients characterized as having “good outcome”, e.g., those who are fertile. A “poor prognosis index value” can be generated from a plurality of training patients defined as having “poor outcome”, e.g., those who are infertile. Thus, a good prognosis index value of a particular gene may represent the average level of expression of the particular gene in patients having a “good outcome,” whereas a poor prognosis index value of a particular gene represents the average level of expression of the particular gene in patients having a “poor outcome.”
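As a hedged illustration of how such index values might be computed and used, the Python sketch below averages the expression of one gene within each training group and then reports which index value a test level lies closer to. The nearest-index comparison is an assumption of this example and is not prescribed by the text; the function names and the training values are likewise hypothetical.

```python
from statistics import mean


def index_value(training_levels):
    """Average expression level of one gene across a training group."""
    if not training_levels:
        raise ValueError("training group must not be empty")
    return mean(training_levels)


def closer_index(test_level, good_index, poor_index):
    """Return "good" or "poor" according to which index value the test
    level is nearer to (ties resolve to "good"; assumed rule)."""
    if abs(test_level - good_index) <= abs(test_level - poor_index):
        return "good"
    return "poor"


# Hypothetical normalized expression levels for one gene.
good_outcome_levels = [3.5, 4.0, 4.5]
poor_outcome_levels = [0.5, 1.0, 1.5]

good_prognosis_index = index_value(good_outcome_levels)
poor_prognosis_index = index_value(poor_outcome_levels)
test_assignment = closer_index(3.2, good_prognosis_index, poor_prognosis_index)
```

For a gene panel rather than a single gene, the same comparison could be repeated per gene and the results combined, although how to combine them is left open here.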
Further, levels of any of the biomarkers described herein may be applied for non-clinical uses including, for example, for research purposes. In some embodiments, the methods described herein may be used to study cell behavior and/or cell mechanisms. For example, one or more of the biomarkers described herein may be used to evaluate decidualization, which can be used for various purposes, including studies on decidualization and development of new agents that specifically target decidualization defects.
In some embodiments, the levels of biomarker sets, as described herein, may be relied on in the development of new therapeutics for infertility. For example, the levels of a biomarker may be measured in samples obtained from a subject who has been administered a new therapy (e.g., a clinical trial). In some embodiments, the level of the biomarker set may indicate the efficacy of the new therapeutic prior to, during, or after the administration of the new therapy.
Disclosed herein are methods to recognize a specific cell population within a sample of endometrial cells, and then use the transcriptomic analysis of that specific cell population to detect the opening of the window of implantation. Data disclosed herein demonstrate that the disclosed methods may be used in modified form to both detect and predict other events of interest in the menstrual cycle. Using the same combination of underlying analytical principles (allowing unbiased definition of endometrial cell populations, and then tracking their transcriptomic trajectories using mutual information analyses to enrich the data set for time-associated gene expression) overcomes the problem of detecting the signal against the noise. In this case, the signal comprises short-term changes in the expression status of some of the cell types of interest, including transcriptomic shifts from day to day in individual patients. The noise, on the other hand, is generated by patient-to-patient variability in the length of menstrual cycles, and by variation in the length and onset timing of reproductively significant functional changes in the endometrium, where the variation between subjects (several days) equals or exceeds the time scale at which it is useful to detect events. Application of the disclosed methods to a reference population has solved this problem by providing a reference data set against which individual patients can be evaluated, while the same methods provide the means to obtain and evaluate an individual patient's endometrial status without requiring independent knowledge of the length or phase of the patient's menstrual cycle or, more critically, the length and timing of medically useful events within that cycle.
By way of example, the disclosed methods can detect the opening of the WOI, and can also be used to detect the closing of the WOI. In some embodiments, the disclosed methods are used to predict the opening or closing of the WOI. Both prediction and detection of the opening and closing of the window are useful in the management of patients in need of embryo implantation. In some aspects, the disclosed methods are used to predict or detect the event of ovulation. Such prediction of ovulation is useful in the management of patient fertility and reproduction. In some aspects, the disclosed methods are used to detect the transcriptomic state of unciliated epithelium. These cells were previously unrecognized in the art, and have no distinctive morphological characteristics, but predictably precede ovulation. In some embodiments, the disclosed methods may be used for the detection of transcriptomic differentiation of glandular and luminal epithelial cell types. This also provides an improved method of prediction of ovulation compared to previously established schema.
In some aspects, shifts in the population frequency of endometrial cell populations can also be correlated to events of physiological and medical utility. In some embodiments, using a combination of such data (the recognition of time-associated clusters of gene expression within cell sub-populations, differentiation of gene-expression patterns between cell sub-populations, and actual changes in the frequency of sub-populations within the endometrial population as a whole) provides enhanced diagnosis of endometrial status by using a large number of orthogonal analyses both to improve precision and to decrease the impact of idiosyncratic expression of small numbers of genes as part of patient-to-patient variation. In some embodiments, enhanced diagnosis of endometrial status is achieved by maximizing the information obtainable from smaller samples, thereby minimizing the invasiveness and increasing safety and acceptability of the sampling procedure required to support the method.
In various aspects of the present Application, the results of any analyses can be communicated to physicians, genetic counselors and/or patients (or other interested parties such as researchers) in a transmittable form that can be communicated or transmitted to any of the above parties. Such a form can vary and can be tangible or intangible. The results can be embodied in descriptive statements, diagrams, photographs, charts, images or any other visual forms. For example, graphs showing expression or activity level or sequence variation information for various biomarkers of Tables 1-6 can be used in explaining the results. Diagrams showing such information for additional target gene(s) are also useful in indicating some testing results. The statements and visual forms can be recorded on a tangible medium such as papers, computer readable media such as floppy disks, compact disks, etc., or on an intangible medium, e.g., an electronic medium in the form of email or website on internet or intranet. In addition, results can also be recorded in a sound form and transmitted through any suitable medium, e.g., analog or digital cable lines, fiber optic cables, etc., via telephone, facsimile, wireless mobile phone, internet phone and the like.
Thus, the information and data on a test result (e.g., the window of implantation) can be produced anywhere in the world (e.g., a testing facility) and transmitted to a different location (e.g., a hospital, patient testing laboratory, or a home). As an illustrative example, when an expression level, activity level, or sequencing (or genotyping) assay is conducted outside the United States, the information and data on a test result may be generated, cast in a transmittable form as described above, and then imported into the United States. Accordingly, the present invention also encompasses a method for producing a transmittable form of information on at least one of (a) expression level or (b) activity level for at least one patient sample. The method comprises the steps of (1) determining at least one of (a) or (b) above according to methods of the present invention; and (2) embodying the result of the determining step in a transmittable form. The transmittable form is the product of such a method.
Techniques for analyzing such expression, activity, and/or sequence data (indeed any data obtained according to the invention) will often be implemented using hardware, software or a combination thereof in one or more computer systems or other processing systems capable of effectuating such analysis.
The computer-based analysis function can be implemented in any suitable language and/or browser. For example, it may be implemented with C language and preferably using object-oriented high-level programming languages such as Visual Basic, SmallTalk, C++, and the like. The application can be written to suit environments such as the Microsoft Windows® environment including Windows® 98, Windows® 2000, Windows® NT, and the like, as well as Google®-based systems, e.g., Google Docs®. In addition, the application can also be written for the Apple® computers and MacOS® graphical user interface, SUN®, UNIX or LINUX environments, as well as smart phone computer platforms, e.g., iPhone®-based, Windows®-based, and Android®-based smart phones. In addition, the functional steps can also be implemented using a universal or platform-independent programming language. Examples of such multi-platform programming languages include, but are not limited to, hypertext markup language (HTML), JAVA®, JavaScript®, Flash programming language, common gateway interface/structured query language (CGI/SQL), practical extraction report language (PERL), AppleScript® and other system script languages, programming language/structured query language (PL/SQL), and any internet browser, e.g., Google® Chrome, Microsoft® Internet Explorer, and MacOS Safari. When active content web pages are used, they may include Java® applets or ActiveX® controls or other active content technologies.
The analysis function can also be embodied in computer program products and used in the systems described above or other computer- or internet-based systems. Accordingly, another aspect of the present invention relates to a computer program product comprising a computer-usable medium having computer-readable program codes or instructions embodied thereon for enabling a processor to carry out gene status analysis. These computer program instructions may be loaded onto a computer or other programmable apparatus to produce a machine, such that the instructions which execute on the computer or other programmable apparatus create means for implementing the functions or steps described above. These computer program instructions may also be stored in a computer-readable memory or medium that can direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory or medium produce an article of manufacture including instruction means which implement the analysis. The computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions or steps described above.
Thus, one aspect of the present invention provides a system for determining the state of menstruation, e.g., detecting the occurrence of the window of implantation (WOI). Generally speaking, the system comprises (1) computer means for receiving, storing, and/or retrieving a patient's gene status data (e.g., expression level or activity level of measured biomarkers) and optionally clinical parameter data (e.g., traditional histological menstrual cycle data); (2) computer means for querying this patient data; (3) computer means for determining the state of menstruation, e.g., the WOI, based on this patient data; and (4) computer means for outputting/displaying this conclusion. In some embodiments, this means for outputting the conclusion may comprise a computer means for informing a health care professional of the conclusion.
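The four computer means enumerated above can be sketched, purely for illustration, as four methods of a single class. The class name, method names, the marker key `MARKER_A`, and the threshold rule are all hypothetical placeholders; in particular, the placeholder classifier is not a validated decision rule and is not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class PatientRecord:
    patient_id: str
    expression: Dict[str, float]  # biomarker name -> measured level
    clinical: Dict[str, str] = field(default_factory=dict)  # optional clinical data


class CycleStateSystem:
    """Hypothetical arrangement of the four computer means."""

    def __init__(self, classifier: Callable[[PatientRecord], str]):
        self._store: Dict[str, PatientRecord] = {}
        self._classifier = classifier

    def receive(self, record: PatientRecord) -> None:
        # (1) receive and store patient data
        self._store[record.patient_id] = record

    def query(self, patient_id: str) -> Optional[PatientRecord]:
        # (2) query stored patient data
        return self._store.get(patient_id)

    def determine(self, patient_id: str) -> str:
        # (3) determine the cycle state from the stored data
        record = self.query(patient_id)
        if record is None:
            raise KeyError(f"no data for patient {patient_id}")
        return self._classifier(record)

    def output(self, patient_id: str) -> str:
        # (4) produce the conclusion for display or reporting
        return f"Patient {patient_id}: {self.determine(patient_id)}"


def placeholder_classifier(record: PatientRecord) -> str:
    # Assumption: an arbitrary threshold on a made-up marker, for illustration only
    return "WOI" if record.expression.get("MARKER_A", 0.0) > 10.0 else "not WOI"


system = CycleStateSystem(placeholder_classifier)
system.receive(PatientRecord("P1", {"MARKER_A": 12.0}))
report = system.output("P1")
```

Passing the classifier in as a function keeps the storage, query, and output steps independent of whichever determination method is actually used.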
One example of such a system includes a computer system that may include at least one input module for entering patient data into the computer system. The computer system may include at least one output module for indicating the state of the patient's menstrual cycle and/or indicating suggested treatments determined by the computer system. The computer system may include at least one memory module in communication with the at least one input module and the at least one output module.
The at least one memory module may include, e.g., a removable storage drive, which can be in various forms, including but not limited to, a magnetic tape drive, a floppy disk drive, a VCD drive, a DVD drive, an optical disk drive, etc. The removable storage drive may be compatible with a removable storage unit such that it can read from and/or write to the removable storage unit. The removable storage unit may include a computer usable storage medium having stored therein computer-readable program codes or instructions and/or computer readable data. For example, the removable storage unit may store patient data. Examples of removable storage units are well known in the art, including, but not limited to, floppy disks, magnetic tapes, optical disks, and the like. The at least one memory module may also include a hard disk drive, which can be used to store computer readable program codes or instructions, and/or computer readable data.
In addition, the at least one memory module may further include an interface and a removable storage unit that is compatible with the interface such that software, computer readable codes or instructions can be transferred from the removable storage unit into the computer system. Examples of the interface and the removable storage unit pairs include, e.g., removable memory chips and sockets associated therewith, program cartridges and cartridge interfaces, and the like.
The computer system may include at least one processor module. It should be understood that the at least one processor module may consist of any number of devices. The at least one processor module may include a data processing device, such as a microprocessor or microcontroller or a central processing unit. The at least one processor module may include another logic device such as a DMA (Direct Memory Access) processor, an integrated communication processor device, a custom VLSI (Very Large Scale Integration) device or an ASIC (Application Specific Integrated Circuit) device. In addition, the at least one processor module may include any other type of analog or digital circuitry that is designed to perform the processing functions described herein. The at least one memory module [606] may be configured for storing patient data entered via the at least one input module [630] and processed via the at least one processor module [602]. Patient data relevant to the present invention may include expression level, activity level, copy number and/or sequence information for PTEN and/or a CCG. Patient data relevant to the present invention may also include clinical parameters relevant to the patient's disease. Any other patient data a physician might find useful in making treatment decisions/recommendations may also be entered into the system, including but not limited to age, gender, and race/ethnicity and lifestyle data such as diet information. Other possible types of patient data include symptoms currently or previously experienced, patient's history of illnesses, medications, and medical procedures.
The at least one memory module may include a computer-implemented method stored therein. The at least one processor module may be used to execute software or computer-readable instruction codes of the computer-implemented method. The computer-implemented method may be configured to, based upon the patient data, indicate whether the patient has an increased likelihood of recurrence, progression or response to any particular treatment, generate a list of possible treatments, etc.
In certain embodiments, the computer-implemented method may be configured to identify a patient being tested for menstrual cycle state. For example, the computer-implemented method may be configured to inform a physician (e.g., an in vitro fertilization specialist) that a particular patient's menstrual cycle is at a window of implantation. Alternatively or additionally, the computer-implemented method may be configured to actually suggest a particular course of treatment based on the answers to/results for various queries.
The practice of the present invention may also employ conventional biology methods, software and systems. Computer software products of the invention typically include computer readable media having computer-executable instructions for performing the logic steps of the method of the invention. Suitable computer readable media include floppy disk, CD-ROM/DVD/DVD-ROM, hard-disk drive, flash memory, ROM/RAM, magnetic tapes and others. Basic computational biology methods are described in, for example, Setubal et al., INTRODUCTION TO COMPUTATIONAL BIOLOGY METHODS (PWS Publishing Company, Boston, 1997); Salzberg et al. (Ed.), COMPUTATIONAL METHODS IN MOLECULAR BIOLOGY, (Elsevier, Amsterdam, 1998); Rashidi & Buehler, BIOINFORMATICS BASICS: APPLICATION IN BIOLOGICAL SCIENCE AND MEDICINE (CRC Press, London, 2000); and Ouellette & Baxevanis, BIOINFORMATICS: A PRACTICAL GUIDE FOR ANALYSIS OF GENE AND PROTEINS (Wiley & Sons, Inc., 2nd ed., 2001); see also, U.S. Pat. No. 6,420,108, all of which are incorporated herein by reference.
The present invention may also make use of various computer program products and software for a variety of purposes, such as probe design, management of data, analysis, and instrument operation. See U.S. Pat. Nos. 5,593,839; 5,795,716; 5,733,729; 5,974,164; 6,066,454; 6,090,555; 6,185,561; 6,188,783; 6,223,127; 6,229,911 and 6,308,170, which are incorporated herein by reference. Additionally, the present invention may have embodiments that include methods for providing genetic information over networks such as the Internet as shown in U.S. Ser. No. 10/197,621 (U.S. Pub. No. 20030097222); Ser. No. 10/063,559 (U.S. Pub. No. 20020183936), Ser. No. 10/065,856 (U.S. Pub. No. 20030100995); Ser. No. 10/065,868 (U.S. Pub. No. 20030120432); Ser. No. 10/423,403 (U.S. Pub. No. 20040049354), which are incorporated herein by reference.
The assay methods described herein may be used for both clinical and non-clinical purposes. Some examples are provided herein.
The present disclosure also provides kits and devices for use in measuring the level of a biomarker set as described herein. Such a kit or device can comprise one or more binding agents that specifically bind to a gene product of target biomarkers, such as the biomarkers listed in any of Tables 1-17. For example, such a kit or detecting device may comprise at least one binding agent that is specific to one or more protein biomarkers selected from Tables 1-17. In some instances, the kit or detecting device comprises binding agents specific to two or more members of the protein biomarker set described herein.
Levels of specific expression products of genes (e.g., NUPR1, CADM1, NPAS3, ATP1A1, and/or TRAK1; CRYAB, NFATC2, BMP2, PMAIP1, ZFYVE21, CILP, SLF2, MATN2, and/or FGF7) can be assessed by any appropriate method. In some embodiments, the levels of specific expression products are analyzed using one or more assays comprising any solid support (e.g., one or more chips). For example, a solid support (e.g., a chip) may be used to analyze at least one (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more) biological sample(s) of or from a subject.
Sections of the solid support (e.g., the chip) may be modified with one binding partner or more than one binding partner. The solid support may be linked in any manner to the binding partner(s). As a non-limiting example, the binding partner(s) may be physisorbed or otherwise bound (e.g., bound directly) onto the surface of the solid support or covalently linked through appropriate coupling chemistry in any manner including, but not limited to: linkage through an epoxide on the surface, creation of an amido link (i.e., through NHS EDC chemistry) using an amine or carboxylic acid group present on the surface, linkage between a thiol and a thiol reactive group (i.e., a maleimide group), formation of a Schiff base between aldehyde and amines, reaction to an anhydride present on the surface, and/or through a photo-activatable linker.
The binding partner may be any binding partner useful for the instant compositions or methods. For example, the binding partner may be a protein (with naturally occurring amino acids or artificial amino acids), one or more nucleic acids made of naturally occurring bases or artificial bases (including, for example, DNA or RNA), sugars, carbohydrates, one or more small molecules (including, but not limited to, one or more of: a vitamin, hormone, cofactor, heme group, chelate, fatty acid, or other known small molecule), and/or a phage.
The binding partners may be applied to the surface of the substrate by deposition of a droplet at a pre-defined location in any manner and using any device including, but not limited to: a pipette, a liquid dispenser, plotter, nano-spotter, nano-plotter, arrayer, spraying mechanism or other suitable fluid handling device.
In some embodiments, antibodies or antigen-binding fragments are provided that are suited for use in the instant methods and compositions. Immunoassays utilizing such antibodies or antigen-binding fragments useful for the instant compositions and methods may be competitive or non-competitive immunoassays in either a direct or an indirect format. Non-limiting examples of such immunoassays are enzyme-linked immunosorbent assays (ELISA), radioimmunoassays (RIA), sandwich assays (immunometric assays), flow cytometry-based assays, western blot assays, immunoprecipitation assays, immunohistochemistry assays, immuno-microscopy assays, lateral flow immuno-chromatographic assays, and proteomics arrays. For example, the binding partners may be antibodies (or antigen-binding fragments thereof) with specificity towards a protein of interest including one or more of unciliated epithelial biomarkers NUPR1, CADM1, NPAS3, ATP1A1, and/or TRAK1; or one or more of stromal biomarkers CRYAB, NFATC2, BMP2, PMAIP1, ZFYVE21, CILP, SLF2, MATN2, and/or FGF7.
In some embodiments, oligonucleotide binding partners are used to assess the levels of specific expression products of genes. The oligonucleotide binding partners may be of any type known or used. As a set of non-limiting examples, in certain embodiments the oligonucleotide probes may be RNA oligonucleotides, DNA oligonucleotides, a mixture of RNA oligonucleotides and DNA oligonucleotides, and/or oligonucleotides that may be mixtures of RNA and DNA. The oligonucleotide binding partners may be naturally occurring or synthetic. The oligonucleotide binding partners may be of any length. As a set of non-limiting examples, the length of the oligonucleotide binding partners may range from about 5 to about 50 nucleotides, from about 10 to about 40 nucleotides, or from about 15 to about 40 nucleotides. The array may comprise any number of oligonucleotide binding partners specific for each target gene. For example, the array may comprise fewer than 10 (e.g., 9, 8, 7, 6, 5, 4, 3, 2, or 1) oligonucleotide probes specific for each target gene. As another example, the array may comprise more than 10, more than 50, more than 100, or more than 1000 oligonucleotide binding partners specific for each target gene.
The array may further comprise control binding partners such as, for example, mismatch control oligonucleotide binding partners or control antibodies or antigen binding fragments thereof. Where mismatch control oligonucleotide binding partners are present, the quantifying step may comprise calculating the difference in hybridization signal intensity between each of the oligonucleotide binding partners and its corresponding mismatch control binding partner. Where control antibodies or antigen binding fragments thereof are present, the quantifying step may comprise calculating the difference in signal intensity between antibodies or antigen binding fragments for the genes under examination (e.g., NUPR1, CADM1, NPAS3, ATP1A1, and/or TRAK1; CRYAB, NFATC2, BMP2, PMAIP1, ZFYVE21, CILP, SLF2, MATN2, and/or FGF7) and a control or “housekeeping” antibody or antigen binding fragment thereof. The quantifying may further comprise calculating the average difference in hybridization signal intensity between each of the oligonucleotide probes and its corresponding mismatch control probe for each gene.
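The average-difference calculation described above can be illustrated with a short sketch. The function name, the data layout (a list of probe/mismatch intensity pairs per gene), and the intensity values are assumptions made for this example only and are not measured data.

```python
from statistics import mean


def quantify(probe_signals):
    """Average (probe - mismatch control) intensity difference for each gene.

    probe_signals maps a gene symbol to a list of
    (probe intensity, mismatch control intensity) pairs.
    """
    result = {}
    for gene, pairs in probe_signals.items():
        if not pairs:
            raise ValueError(f"no probe pairs for {gene}")
        # Difference for each probe and its own mismatch control, then the mean
        result[gene] = mean(probe - control for probe, control in pairs)
    return result


# Made-up intensities, for illustration only
signals = {
    "NUPR1": [(120.0, 20.0), (100.0, 30.0)],
    "CRYAB": [(50.0, 45.0)],
}
averages = quantify(signals)
# averages == {"NUPR1": 85.0, "CRYAB": 5.0}
```

Subtracting each probe's own mismatch control before averaging removes probe-specific background rather than a single global offset.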
The array (e.g., chip) may contain any number of analysis regions. As a set of non-limiting examples, the array may contain one or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 25, 30, 35, 40, or more) analysis regions. Each analysis region may comprise any number of binding partners immobilized to a substrate portion therein. As a non-limiting set of examples, each analysis region may comprise between one and 1,000 binding partners, one and 500 binding partners, one and 250 binding partners, one and 100 binding partners, two and 1,000 binding partners, two and 500 binding partners, two and 250 binding partners, two and 100 binding partners, three and 1,000 binding partners, three and 500 binding partners, three and 250 binding partners, or three and 100 binding partners immobilized to a substrate portion therein.
Binding partners including, but not limited to, antibodies or antigen-binding fragments that bind to the specific antigens of interest can be immobilized, e.g., by binding to a solid support (e.g., a chip, carrier, membrane, columns, proteomics array, etc.). In one set of embodiments, a material used to form the solid support has an optical transmission of greater than 90% between 400 and 800 nm wavelengths of light (e.g., light in the visible range). Optical transmission may be measured through a material having a thickness of, for example, about 2 mm (or in other embodiments, about 1 mm or about 0.1 mm). In some instances, the optical transmission is greater than or equal to 80%, greater than or equal to 85%, greater than or equal to 88%, greater than or equal to 92%, greater than or equal to 94%, or greater than or equal to 96% between 400 and 800 nm wavelengths of light. In some embodiments, the material used to form the solid support has an optical transmission of less than or equal to 99.9%, less than or equal to 96%, less than or equal to 94%, less than or equal to 92%, less than or equal to 90%, less than or equal to 85%, less than or equal to 80%, less than or equal to 50%, less than or equal to 30%, or less than or equal to 10% between 400 and 800 nm wavelengths of light. Combinations of the above-referenced ranges are also possible.
The array may be fabricated on a surface of virtually any shape (e.g., the array may be planar) or even a multiplicity of surfaces. Non-limiting examples of solid support materials useful for the compositions and methods described herein may include glass, plastics, elastomeric materials, membranes, or other suitable materials for performing immunoassays. The solid support may be formed from one material, or it may be formed from two or more materials.
Specific solid support materials may include, but are not limited to: any type of glass (e.g., fused silica, borosilicate glass, Pyrex®, or Duran®). In one embodiment, the solid support is a glass chip. The solid support may also comprise a non-glass substrate (e.g., a plastic substrate) coated with a glass (e.g., silicon dioxide) film produced by a process such as sputtering, oxidation of silicon, or reaction of silane reagents. The glass surface may be further modified with functionalized silane reagents including, for example: amine-terminated silanes (aminopropyltriethoxy silane) and epoxide-terminated silanes (glycidoxypropyltrimethoxysilane).
Additional specific solid support materials may include, but are not limited to: thermoplastic polymers, which may comprise one or more of: polystyrene, polycarbonate, polymethylmethacrylate, cyclic olefin copolymers, polyethylene, polypropylene, polyvinyl chloride, polyvinylidene difluoride, any fluoropolymers (e.g., polytetrafluoroethylene, also known as Teflon®), polylactic acid, poly(methyl methacrylate) (also known as PMMA or acrylic; e.g., Lucite®, Perspex®, and Plexiglas®), and acrylonitrile butadiene styrene.
Additional specific solid support materials may include, but are not limited to: one or more elastomeric materials including polysiloxanes (silicones such as polydimethylsiloxane) and rubbers (polyisoprene, polybutadiene, chloroprene, styrene-butadiene, nitrile rubber, polyether block amides, ethylene-vinyl acetate, epichlorohydrin rubber, isobutene-isoprene, nitrile, neoprene, ethylene-propylene, and hypalon).
Additional specific solid support materials may include, but are not limited to: one or more membrane substrates such as dextran, amyloses, nylon, Polyvinylidene fluoride (PVDF), fiberglass, and natural or modified celluloses (e.g., cellulose, nitrocellulose, CNBr-activated cellulose, and cellulose modified with polyacrylamides, agaroses, and/or magnetite). The nature of the support can be either fixed or suspended in a solution (e.g., beads).
In some embodiments, the material and dimensions (e.g., thickness) of a solid support (e.g., a chip) are selected such that the solid support is substantially impermeable to water vapor. In some embodiments, a cover may also be present. In some embodiments, the cover is substantially impermeable to water vapor. For instance, a solid support (e.g., a chip) may include a cover comprising a material known to provide a high vapor barrier, such as metal foil, certain polymers, certain ceramics and combinations thereof. Examples of materials having low water vapor permeability are provided below. In other cases, the material is chosen based at least in part on the shape and/or configuration of the chip. For instance, certain materials can be used to form planar devices whereas other materials are more suitable for forming devices that are curved or irregularly shaped.
A material used to form all or portions of a section or component of any composition described herein may have, for example, a water vapor permeability of less than about 5.0 g·mm/m²·d, less than about 4.0 g·mm/m²·d, less than about 3.0 g·mm/m²·d, less than about 2.0 g·mm/m²·d, less than about 1.0 g·mm/m²·d, less than about 0.5 g·mm/m²·d, less than about 0.3 g·mm/m²·d, less than about 0.1 g·mm/m²·d, or less than about 0.05 g·mm/m²·d. In some cases, the water vapor permeability may be, for example, between about 0.01 g·mm/m²·d and about 2.0 g·mm/m²·d, between about 0.01 g·mm/m²·d and about 1.0 g·mm/m²·d, between about 0.01 g·mm/m²·d and about 0.4 g·mm/m²·d, between about 0.01 g·mm/m²·d and about 0.04 g·mm/m²·d, or between about 0.01 g·mm/m²·d and about 0.1 g·mm/m²·d. The water vapor permeability may be measured at, for example, 40° C. at 90% relative humidity (RH). Combinations of materials with any of the aforementioned water vapor permeabilities may be used in the instant compositions or methods.
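To relate the permeability values above to a part of a given thickness, the following sketch converts a thickness-normalized permeability (g·mm/m²·d) into a water vapor transmission rate (g/m²·d) by dividing by the thickness. This assumes a single homogeneous layer; the function name and the example values are illustrative assumptions, not measurements from the disclosure.

```python
def water_vapor_transmission_rate(permeability_g_mm_per_m2_day, thickness_mm):
    """Transmission rate in g/m2/day for one homogeneous layer.

    Assumes the rate scales inversely with thickness, which is how the
    thickness-normalized unit g*mm/(m2*day) is defined.
    """
    if thickness_mm <= 0:
        raise ValueError("thickness must be positive")
    return permeability_g_mm_per_m2_day / thickness_mm


# Illustrative values: a 0.5 g*mm/(m2*day) material formed into a 2 mm thick part
rate = water_vapor_transmission_rate(0.5, 2.0)
# rate == 0.25 (g/m2/day)
```

Under this assumption, halving the thickness of the same material doubles the transmission rate, which is why material and thickness are considered together.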
In some embodiments, the material and dimensions of a solid support (e.g., a chip) and/or cover may vary. For example, the chip may be configured to provide one or more regions (e.g., liquid containment regions). In certain embodiments, the chip may be configured to provide two or more regions (e.g., liquid containment regions). In certain embodiments, two or more of the regions are fluidically separated from other regions. In one embodiment, all of the regions are fluidically separated from other regions. In some embodiments, all of the regions are fluidically connected. The chip may comprise any number of liquid containment regions. As a non-limiting example, the chip may comprise one, two, three, four, five, six, seven, eight, nine, or ten liquid containment regions, each of which may be fluidically separated from one another. In other embodiments, the chip may comprise one, two, three, four, five, six, seven, eight, nine, or ten liquid containment regions that are fluidically connected to one another.
A solid support (e.g., a chip) described herein may have any suitable volume for carrying out an analysis such as a chemical and/or biological reaction or other process. The entire volume of the solid support may include, for example, any reagent storage areas, analysis regions, liquid containment regions, waste areas, as well as one or more identifiers. In some embodiments, small amounts of reagents and samples are used and the entire volume of a liquid containment region is, for example, less than or equal to 10 mL, less than or equal to 5 mL, less than or equal to 1 mL, less than or equal to 500 μL, less than or equal to 250 μL, less than or equal to 100 μL, less than or equal to 50 μL, less than or equal to 25 μL, less than or equal to 10 μL, less than or equal to 5 μL, or less than or equal to 1 μL. In some embodiments, small amounts of reagents and samples are used and the entire volume of a liquid containment region is, for example, at least 10 mL, at least 5 mL, at least 1 mL, at least 500 μL, at least 250 μL, at least 100 μL, at least 50 μL, at least 25 μL, at least 10 μL, at least 5 μL, or at least 1 μL. Combinations of the above-referenced values are also possible.
The length and/or width of the solid support (e.g., chip) may be, for example, less than or equal to 300 mm, less than or equal to 200 mm, less than or equal to 150 mm, less than or equal to 100 mm, less than or equal to 95 mm, less than or equal to 90 mm, less than or equal to 85 mm, less than or equal to 80 mm, less than or equal to 75 mm, less than or equal to 70 mm, less than or equal to 65 mm, less than or equal to 60 mm, less than or equal to 55 mm, less than or equal to 50 mm, less than or equal to 45 mm, less than or equal to 40 mm, less than or equal to 35 mm, less than or equal to 30 mm, less than or equal to 25 mm, or less than or equal to 20 mm. In some embodiments, the length and/or width of the chip may be, for example, at least 300 mm, at least 200 mm, at least 150 mm, at least 100 mm, at least 95 mm, at least 90 mm, at least 85 mm, at least 80 mm, at least 75 mm, at least 70 mm, at least 65 mm, at least 60 mm, at least 55 mm, at least 50 mm, at least 45 mm, at least 40 mm, at least 35 mm, at least 30 mm, at least 25 mm, or at least 20 mm. Combinations of the above-referenced values are also possible. In some embodiments, the thickness of the solid support (e.g., chip) may be, for example, less than or equal to 5 mm, less than or equal to 3 mm, less than or equal to 2 mm, less than or equal to 1 mm, less than or equal to 0.9 mm, less than or equal to 0.8 mm, less than or equal to 0.7 mm, less than or equal to 0.5 mm, less than or equal to 0.4 mm, less than or equal to 0.3 mm, less than or equal to 0.2 mm, or less than or equal to 0.1 mm. In some embodiments, the thickness of the solid support (e.g., chip) may be, for example, at least 5 mm, at least 3 mm, at least 2 mm, at least 1 mm, at least 0.9 mm, at least 0.8 mm, at least 0.7 mm, at least 0.5 mm, at least 0.4 mm, at least 0.3 mm, at least 0.2 mm, or at least 0.1 mm. Combinations of the above-referenced values are also possible. 
One or more solid supports (e.g., chips) may be analyzed at the same time by any suitable device. An adapter may be used with the one or more solid supports (e.g., chips) in order to insert and securely hold them in the analyzer.
In some embodiments, the solid support (e.g., chip) includes one or more identifiers. Any method or type of identification may be used. For example, an identifier may be, but is not limited to, any type of label such as a bar code or an RFID tag. The identifier may include the name, patient number, social security number, or any other method of identification for a subject. The identifier may also be a randomized identifier of any type useful in a clinical setting.
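As one hedged illustration of a randomized identifier, the sketch below generates a random 32-character code and keeps the link to the subject in a separate lookup table, so that the label on the chip itself carries no personal information. The class and function names are hypothetical, and the use of a version 4 UUID is an assumption made for this example rather than a requirement of the disclosure.

```python
import uuid


def new_chip_identifier():
    """Return a random 32-character hexadecimal identifier (version 4 UUID)."""
    return uuid.uuid4().hex


class IdentifierRegistry:
    """Hypothetical lookup table kept apart from the chip label."""

    def __init__(self):
        self._by_identifier = {}

    def assign(self, subject_reference):
        # Create a fresh random identifier and remember which subject it maps to
        identifier = new_chip_identifier()
        self._by_identifier[identifier] = subject_reference
        return identifier

    def resolve(self, identifier):
        # Look up the subject for an identifier read from a bar code or RFID tag
        return self._by_identifier[identifier]


registry = IdentifierRegistry()
chip_id = registry.assign("patient-0001")  # illustrative subject reference
```

The returned `chip_id` could be encoded in a bar code or written to an RFID tag, and `registry.resolve(chip_id)` would recover the subject reference where access to the table is permitted.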
It should be understood that the solid supports (e.g., chips) and their respective components described herein are exemplary and that other configurations and/or types of solid supports (e.g., chips) and components can be used with the systems and methods described herein.
The binding of one or more binding partners (e.g., to detect the binding of a protein or other substance of interest including, but not limited to, antigen-bound antibody complexes) may be quantified by any method known in the art. The quantification may, for example, be performed by detection or interrogation of an active molecule bound to an antibody. In a multiplexed format, where more than one assay is being performed on a continuous area, the signals associated with each assay must be differentiable from the other assays. Any suitable strategy known in the art may be used including, but not limited to: (1) using labels with substantially non-overlapping spectral and/or electrochemical properties; and/or (2) using a signal amplification chemistry that remains attached or deposited in close proximity to the tracer itself.
In some embodiments, labeled binding partners (e.g., antibodies or antigen binding fragments) may be used as tracers to detect binding (e.g., using antigen bound antibody complexes). Examples of the types of labels which may be useful for the instant methods and compositions include enzymes, radioisotopes, colloidal metals, fluorescent compounds, magnetic labels, chemiluminescent compounds, electrochemiluminescent groups, metal nanoparticles, and bioluminescent compounds. Radiolabeled binding partners (e.g., antibodies) may be prepared using any known method and may involve coupling a radioactive isotope such as ¹⁵³Eu, ³H, ³²P, ³⁵S, ⁵⁹Fe, or ¹²⁵I, which can then be detected by gamma counter, scintillation counter or by autoradiography. Binding partners (e.g., antibodies or antigen binding fragments) may alternatively be labeled with enzymes such as yeast alcohol dehydrogenase, horseradish peroxidase, alkaline phosphatase, and the like, then developed and detected spectrophotometrically or visually. The label may be used to react a chromogen into a detectable chromophore (including, for example, if the chromogen is a precipitating dye).
Suitable fluorescent labels may include, but are not limited to: fluorescein, fluorescein isothiocyanate, fluorescamine, rhodamine, Alexa Fluor® dyes (such as Alexa Fluor® 350, Alexa Fluor® 405, Alexa Fluor® 430, Alexa Fluor® 488, Alexa Fluor® 514, Alexa Fluor® 532, Alexa Fluor® 546, Alexa Fluor® 555, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 610, Alexa Fluor® 633, Alexa Fluor® 635, Alexa Fluor® 647, Alexa Fluor® 660, Alexa Fluor® 680, Alexa Fluor® 700, Alexa Fluor® 750, or Alexa Fluor® 790), cyanine dyes including, but not limited to: Cy2, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, and Cy7.5, and the like. The labels may also be time-resolved fluorescent (TRF) atoms (e.g., Eu or Sr with appropriate ligands to enhance TRF yield). More than one fluorophore capable of producing a fluorescence resonance energy transfer (FRET) may also be used. Suitable chemiluminescent labels may include, but are not limited to: acridinium esters, luminol, imidazole, oxalate ester, luciferin, and any other similar labels.
Suitable electrochemiluminescent groups for use may include, as a non-limiting example: Ruthenium and similar groups. A metal nanoparticle may also be used as a label. The metal nanoparticle may be used to catalyze a metal enhancement reaction (such as gold colloid for silver enhancement).
Any of the labels described herein or known in the field may be linked to the tracer using covalent or non-covalent means. The label may be presented on or inside an object like a bead (including, for example, a plain bead, hollow bead, or bead with a ferromagnetic core), and the bead is then attached to the binding partner (e.g., an antibody or antigen-binding fragment thereof). The label may also be a nanoparticle including, but not limited to, an up-converting phosphorescent system, nanodot, quantum dot, nanorod, and/or nanowire. The label linked to the antibody may also be a nucleic acid, which might then be amplified (e.g., using PCR) before quantification by one or more of optical, electrical or electrochemical means.
In some embodiments, the binding partner is immobilized on the solid support prior to formation of binding complexes. In other embodiments, immobilization of the binding partner (e.g., the antibody or antigen-binding fragment) is performed after formation of binding complexes.
In one embodiment, immunoassay methods disclosed herein comprise immobilizing binding partners (e.g., antibodies or antigen-binding fragments) to a solid support (e.g., a chip); applying a sample (e.g., an endometrial fluid sample) to the solid support under conditions that permit binding of the expression product of a biomarker (e.g., a protein) to one or more binding partners (e.g., one or more antibodies or antigen-binding fragments), if present in the sample; removing the excess sample from the solid support; detecting the bound complex (using, e.g., detectably labeled antibodies or antigen-binding fragments) under conditions that permit binding (e.g., of an expression product to the antigen-bound immobilized antibodies or antigen-binding fragments); washing the solid support and assaying for the label.
Reagents can be stored in or on a chip for various amounts of time. For example, a reagent may be stored for longer than 1 hour, longer than 6 hours, longer than 12 hours, longer than 1 day, longer than 1 week, longer than 1 month, longer than 3 months, longer than 6 months, longer than 1 year, or longer than 2 years. Optionally, the chip may be treated in a suitable manner in order to prolong storage. For instance, chips having stored reagents contained therein may be vacuum sealed, stored in a dark environment, and/or stored at low temperatures (e.g., below 4° C. or 0° C.). The length of storage depends on one or more factors such as the particular reagents used, the form of the stored reagents (e.g., wet or dry), the dimensions and materials used to form the substrate and cover layer(s), the method of adhering the substrate and cover layer(s), and how the chip is treated or stored as a whole. Storing of a reagent (e.g., a liquid or dry reagent) on a solid support material may involve covering and/or sealing the chip prior to use or during packaging.
Any solid state assay device described herein may be included in a kit. The kit may include any packaging useful for such devices. The kit may include instructions for use in any format or language. The kit may also direct the user to obtain further instructions from one or more locations (physical or electronic). The included instructions can comprise a description of how to use the components contained in the kit for measuring the level of a biomarker set (e.g., protein biomarker or nucleic acid biomarker) in a biological sample collected from a subject, such as a human patient. The instructions relating to the use of the kit generally include information as to the amount of each component and suitable conditions for performing the assay methods described herein.
The components in the kits may be in unit doses, bulk packages (e.g., multi-dose packages), or sub-unit doses. The kit can also comprise one or more buffers as described herein, including but not limited to a coating buffer, a blocking buffer, a wash buffer, and/or a stopping buffer.
The kits of the present disclosure are in suitable packaging. Suitable packaging includes, but is not limited to, vials, bottles, jars, flexible packaging (e.g., sealed Mylar or plastic bags), and the like. Also contemplated are packages for use in combination with a specific device, such as a PCR machine, a nucleic acid array, or a flow cytometry system.
Kits may optionally provide additional components such as interpretive information, such as a control and/or standard or reference sample. Normally, the kit comprises a container and a label or package insert(s) on or associated with the container. In some embodiments, the present disclosure provides articles of manufacture comprising contents of the kits described above.
All procedures involving human endometrium were conducted in accordance with the Institutional Review Board (IRB) guidelines for Stanford University under the IRB code IRB-35448 and IVI/University of Valencia under the IRB code 1603-IGX-016-CS, including informed consent for tissue collection from all subjects. Collection of endometrial biopsies was approved by the IRB code 1603-IGX-016-CS. There were no medical reasons to obtain the endometrial biopsies. Healthy ovum donors were recruited in the context of the research project approved by the IRB. Informed written consent was obtained from each woman before an endometrial biopsy was performed in their natural menstrual cycle (no hormone stimulation). De-identified human endometrium was obtained from women aged 18-34, with regular menstrual cycle (3-4 days every 28-30 days), BMI ranging 19-29 kg/m2 (inclusive), and negative serological tests for HIV, HBV, HCV, RPR and normal karyotype. Women with the following conditions were excluded from tissue collection: with recent contraception (IUD in past 3 months; hormonal contraceptives in past 2 months), uterine pathology (endometriosis, leiomyoma, or adenomyosis; bacterial, fungal, or viral infection), and polycystic ovary syndrome.
A two-stage dissociation protocol was used to dissociate endometrium tissue and separate it into stromal fibroblast-enriched and epithelium-enriched single cell suspensions. Prior to the dissociation, the tissue was rinsed with DMEM (Sigma) on a petri dish to remove blood and mucus. Excess DMEM was removed after the rinsing. The tissue was then minced into pieces as small as possible, and dissociated in collagenase A1 (Sigma) overnight at 4° C. in a 50 mL Falcon tube at horizontal position. This primary enzymatic step dissociates stromal fibroblasts into single cells while leaving epithelial glands and lumen mostly undigested. The resulting tissue suspension was then briefly homogenized and left un-agitated for 10 min in a 50 mL Falcon tube at vertical position, during which epithelial glands and lumen sedimented as a pellet and stromal fibroblasts stayed suspended in the supernatant. The supernatant was therefore collected as the stromal fibroblast-enriched suspension. The pellet was washed twice in 50 mL DMEM to further remove residual stromal fibroblasts. The washed pellet was then dissociated in 400 μL TrypLE Select (Life Technologies) for 20 min at 37° C., during which homogenization was performed via intermittent pipetting. DNaseI (100 μL) was then added to the solution to digest extracellular genomic DNA. The digestion was quenched with 1.5 mL DMEM after 5 min incubation. The resulting cell suspension was then pipetted, filtered through a 50 μm cell strainer, and centrifuged at 1000 rpm for 5 min. The pellet was re-suspended as the epithelium-enriched suspension.
Single Cell Capture, Imaging, and cDNA Generation
For the cell suspensions of both portions, live cells were enriched via the MACS dead cell removal kit (Miltenyi Biotec) following the manufacturer's protocol. The resulting cell suspension was diluted in DMEM to a final concentration of 300-400 cells/μL before being loaded onto a medium C1 chip for mRNA Seq (Fluidigm). Live/dead cell stain (Life Technologies) was added directly into the cell suspension. Single cell capture, mRNA reverse-transcription, and cDNA amplification were performed on the Fluidigm C1 system using default scripts for mRNA Seq. All capture site images were recorded using an in-house built microscopic system at 20× magnification through phase, GFP, and Cy3 channels. 1 μL pre-diluted ERCC (Ambion) was added into the lysis mix, resulting in a final dilution factor of 1:80,000 in the mix.
Single-cell cDNA concentration and size distribution were analyzed on a capillary electrophoresis-based automated fragment analyzer (Advanced Analytical). Fragmented and barcoded cDNA libraries were prepared only for cells imaged as singlet or empty at the capture site and with >0.06 ng/μL cDNA generated. Library preparation was performed using the Nextera XT DNA Sample Preparation kit (Illumina) on a Mosquito HTS liquid handler (TTP Labtech) following Fluidigm's single cell library preparation protocol with a 4× scale-down of all reagents. Dual-indexed single-cell libraries were pooled and sequenced in paired-end reads on Nextseq (Illumina) to a depth of 1-2×10^6 reads per cell. Bcl2fastq v2.17.1.14 was used to separate out the data for each single cell by using unique barcode combinations from the Nextera XT preparation and to generate *.fastq files.
Raw reads in the *.fastq files were trimmed to 75 bp using fastqx, aligned to the Ensembl human reference genome GRCh38.87 (dna.primary_assembly) using STAR (Dobin et al., 2013) with default parameters, and duplicate-removed using picard MarkDuplicates with default parameters. Aligned reads were converted to counts using HTSeq (Anders et al., 2015) and the Ensembl GTF for GRCh38.87 under the setting -m intersection-strict -s no. Downstream data analysis was performed in R and Java. For each cell, counts were normalized to log-transformed reads per million (log2(rpm+1)) by the equation
$$\log_2(\mathrm{rpm}_{ij}+1)=\log_2\!\left(\frac{\mathrm{count}_{ij}}{\sum_{j'}\mathrm{count}_{ij'}}\times 10^{6}+1\right)$$
where i is for cell i and j for gene j.
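The log2(rpm+1) normalization can be illustrated with a minimal Python sketch. The function name, the example gene labels, and the choice to sum over every gene supplied for the cell are assumptions made for illustration only; the original analysis was performed in R and Java.

```python
import math

def log2_rpm(counts):
    """Convert raw per-gene counts for one cell to log2(rpm + 1).

    `counts` maps gene name -> raw read count for a single cell.
    The library size is assumed to be the sum over all genes supplied.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cell has no counted reads")
    return {gene: math.log2(c / total * 1e6 + 1.0) for gene, c in counts.items()}

# Hypothetical counts for one cell (gene labels are placeholders).
cell = {"GENE_A": 30, "GENE_B": 10, "GENE_C": 0}
normalized = log2_rpm(cell)
```

A gene with zero counts maps to 0 after the transformation, which is the purpose of adding 1 before taking the logarithm.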
For quality filtering, fraction of reads mapped to ERCC (fERCC) was used as the quality metric and empirical cumulative distribution of fERCC in empty capture sites recorded on the C1 chip was calculated and used as the null model (ecdfnull). Single cells retained for downstream analysis were those with (ecdfnull(fERCC))<0.05. 2149 cells were retained for downstream analysis.
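The quality filter can be sketched as follows, assuming the empirical cumulative distribution is the fraction of empty capture sites whose fERCC is less than or equal to a given value. The function names and all numeric values are hypothetical and serve only to show the mechanics of the rule ecdfnull(fERCC)<0.05.

```python
from bisect import bisect_right

def make_ecdf(null_values):
    """Return an empirical CDF built from the null observations."""
    sorted_null = sorted(null_values)
    n = len(sorted_null)

    def ecdf(x):
        # Fraction of null observations less than or equal to x.
        return bisect_right(sorted_null, x) / n

    return ecdf

def retain_cells(cell_fercc, empty_site_fercc, alpha=0.05):
    """Keep cells whose fERCC falls in the lower tail of the empty-site null."""
    ecdf_null = make_ecdf(empty_site_fercc)
    return [cell for cell, f in cell_fercc.items() if ecdf_null(f) < alpha]

# Hypothetical fERCC values: empty sites are dominated by spike-in reads.
empty_sites = [0.80 + 0.005 * k for k in range(40)]
cells = {"cell_1": 0.05, "cell_2": 0.30, "cell_3": 0.931}
kept = retain_cells(cells, empty_sites)
```

In this toy input, cell_3 has an ERCC fraction typical of an empty capture site and is discarded, while the two cells with low ERCC fractions are retained.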
To obtain differentially expressed genes for a cell type or state, for each gene, 1) Wilcoxon's rank sum test (Mann and Whitney, 1947) was performed and 2) fold change (FC, dummy variable=1E-02) was calculated between cells within a cell type/state and the cells from other cell types/states. P-values obtained from the Wilcoxon's rank sum test were adjusted for multiple comparisons by Benjamini-Hochberg's procedure (Benjamini and Hochberg, 1995) to obtain p.adj. To evaluate the “sensitivity” and “specificity” of a gene in identifying a cell type/state, the percent of cells within the cell type/state of interest that express the gene (pctin) and the percent of cells from other cell types/states that express the gene (pctout) were also calculated, as well as the ratio between pctin and pctout.
Functional enrichment analysis was performed using Gene Ontology Enrichment Analysis (geneontology.org) and each enriched ontology hierarchy (FDR<0.05) was reported with two terms in the hierarchy: 1) the term with the highest significance value and 2) the term with the highest specificity.
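The per-gene summary statistics and the multiple-comparison adjustment can be sketched in Python as below. Several details are assumptions not stated in the text: the fold change is taken as a ratio of mean expression with the dummy variable added to both means, a cell is counted as expressing a gene when its value is greater than zero, and the rank sum test itself is omitted (its p-values are taken as given).

```python
def marker_stats(expr_in, expr_out, pseudocount=1e-2):
    """Fold change, pctin, pctout, and their ratio for one gene."""
    mean_in = sum(expr_in) / len(expr_in)
    mean_out = sum(expr_out) / len(expr_out)
    fold_change = (mean_in + pseudocount) / (mean_out + pseudocount)
    pct_in = 100.0 * sum(1 for v in expr_in if v > 0) / len(expr_in)
    pct_out = 100.0 * sum(1 for v in expr_out if v > 0) / len(expr_out)
    ratio = pct_in / pct_out if pct_out > 0 else float("inf")
    return fold_change, pct_in, pct_out, ratio

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda k: pvalues[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical expression values and p-values for illustration.
fc, pct_in, pct_out, ratio = marker_stats([2.0, 4.0, 0.0, 6.0], [0.0, 0.0, 1.0, 0.0])
p_adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```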
The “time-associatedness” of a gene was calculated as the mutual information (MI) between the expression of a gene and time (or pseudotime) using the Java implementation of ARACNe-AP (Lachmann et al., 2016). For each gene, MIi=MI((e1i, e2i, . . . , eni), (t1, t2, . . . , tn)), where i is for gene i, eni is for expression of gene i in cell n, and tn is the time (or pseudotime) annotation of cell n. The statistical significance of MIi was evaluated using a null model in which the time (or pseudotime) annotation was permuted 1000 times with respect to cells, based on which an empirical cumulative distribution function (ecdfnull,i) of the MI between the expression of gene i and the permuted time (or pseudotime) was constructed using the R function ecdf. The p-value for MIi was calculated as (1-ecdfnull,i(MIi)). The p-values were then adjusted for multiple comparisons by Benjamini-Hochberg's procedure (Benjamini and Hochberg, 1995) to obtain the FDR for each gene.
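The permutation scheme can be sketched in Python as follows. This is not ARACNe-AP: a simple equal-width binning estimator of mutual information is swapped in for its adaptive partitioning, and the bin count, random seed, function names, and example data are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

def discretize(values, n_bins):
    """Assign each value to one of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(x_labels, y_labels):
    """Plug-in mutual information (in bits) between two label sequences."""
    n = len(x_labels)
    joint = Counter(zip(x_labels, y_labels))
    px, py = Counter(x_labels), Counter(y_labels)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def permutation_pvalue(expression, time, n_perm=1000, n_bins=4, seed=0):
    """Observed MI and p = 1 - ecdf_null(MI), with time labels permuted."""
    rng = random.Random(seed)
    e, t = discretize(expression, n_bins), discretize(time, n_bins)
    observed = mutual_information(e, t)
    shuffled = list(t)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(mutual_information(e, shuffled))
    ecdf_at_observed = sum(1 for v in null if v <= observed) / n_perm
    return observed, 1.0 - ecdf_at_observed

# Hypothetical gene whose expression rises steadily with time.
time = list(range(40))
expression = [0.5 * t for t in time]
mi_obs, p_value = permutation_pvalue(expression, time)
```

With 1000 permutations the smallest resolvable p-value is on the order of 1/1000, so a returned value of zero should be read as "below the resolution of the null model" rather than as exactly zero.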
Over-dispersion of genes was calculated as
where $\mathrm{CV}_i^2$ is the squared coefficient of variation of gene i across cells of interest and $\mathrm{CV}_e^2$ is the expected squared coefficient of variation given the mean, fitted using non-ERCC counts. All pairwise distances between cells were calculated as (1-Pearson's correlation). Dimensional reduction was performed using the R implementation of tSNE (Rtsne).
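The cell-to-cell distance, (1-Pearson's correlation), can be sketched in Python as below. The function name and the example expression vectors are hypothetical; in practice each vector would hold the normalized expression of the selected genes in one cell.

```python
import math

def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two expression vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return 1.0 - cov / math.sqrt(var_a * var_b)

# Hypothetical expression vectors for three cells.
cell_a = [1.0, 2.0, 3.0, 4.0]
cell_b = [2.0, 4.0, 6.0, 8.0]   # same profile as cell_a, scaled
cell_c = [4.0, 3.0, 2.0, 1.0]   # reversed profile

d_ab = pearson_distance(cell_a, cell_b)
d_ac = pearson_distance(cell_a, cell_c)
```

The distance ranges from 0 for perfectly correlated profiles to 2 for perfectly anti-correlated ones, and it ignores differences in overall scale between cells.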
Smoothing of “Time-Associated” Genes and Assignment into Characteristic Phases
To estimate the pseudotime at which a gene reached maximum expression (pseudotimemax), smoothing of gene expression was performed with respect to pseudotime using the R function smooth.spline( ) (spar=1), the pseudotime(s) at which a smoothed curve reached a local maximum were estimated using the R function peaks( ), and the inflection point was estimated using a custom R script. Characteristic signatures for phase 1-4 were identified by assigning each pseudotime-associated gene that was identified (
A dynamic transcriptional factor (
A two-step approach was taken in identifying cycling cells and defining endometrium-specific cell cycle signatures. A published gene set encompassing 43 G1/S and 55 G2/M genes (Tirosh et al., 2016), representing the intersection of four previous gene sets (Kowalczyk et al., 2015; Macosko et al., 2015; Whitfield, 2002), was used, and a G1/S and a G2/M score were calculated for all single cells in unciliated epithelial cells and stromal fibroblasts, respectively, following the scoring scheme in (Tirosh et al., 2016). Briefly, cells with at least 2× the average expression of either G1/S or G2/M genes relative to the average of all cells in the respective cell type were assigned as putative cycling cells. Wilcoxon's rank sum test (Mann and Whitney, 1947) was performed between the putative cycling cells and the rest of the cells in the cell type to enrich for cell-cycle associated transcriptome signatures that were specific to endometrium (
For each identified phase and subphase, the expression of a known ligand or receptor was evaluated as the percent of unciliated epithelial cells or stromal fibroblasts expressing the gene to obtain p(epi, j) and p(str, j), where j is for phase j. A ligand or receptor is only considered expressed by a cell type in a phase if p is greater than 25%. The interaction between a ligand-receptor pair is established when a ligand is expressed in one cell type and its known receptor is expressed in the other. The ligand-receptor pairing information was based on the database provided by (Ramilowski et al., 2015). Ligand-receptor pairs were sorted, from top to bottom, left to right, by the level of interaction, quantified as the total number of interactions normalized by the total number of possible interactions between the two cell types within a phase. This information can be used to identify one or more ligand-receptor pairs that can be used to determine the menstrual status of a subject, for example to determine whether the subject is within the WOI.
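The ligand-receptor rule can be sketched in Python for a single phase as follows. The function names, the percentages, and the choice of denominator (two possible directions per known pair) are assumptions for illustration; the pair list is a toy stand-in for the published ligand-receptor database.

```python
def interacting_pairs(pairs, pct_epi, pct_str, threshold=25.0):
    """List ligand-receptor interactions between the two cell types.

    pct_epi / pct_str map gene -> percent of unciliated epithelial cells /
    stromal fibroblasts expressing the gene in the phase of interest.
    """
    found = []
    for ligand, receptor in pairs:
        # Ligand from epithelium signaling to a receptor on stroma.
        if pct_epi.get(ligand, 0.0) > threshold and pct_str.get(receptor, 0.0) > threshold:
            found.append((ligand, receptor, "epi->str"))
        # Ligand from stroma signaling to a receptor on epithelium.
        if pct_str.get(ligand, 0.0) > threshold and pct_epi.get(receptor, 0.0) > threshold:
            found.append((ligand, receptor, "str->epi"))
    return found

def interaction_level(found, pairs):
    """Observed interactions divided by possible ones (assumed 2 per pair)."""
    return len(found) / (2 * len(pairs))

# Toy pair list and hypothetical expression percentages for one phase.
pairs = [("LIF", "IL6ST"), ("COL1A1", "ITGB1")]
pct_epi = {"LIF": 40.0, "IL6ST": 10.0, "COL1A1": 5.0, "ITGB1": 60.0}
pct_str = {"LIF": 5.0, "IL6ST": 55.0, "COL1A1": 80.0, "ITGB1": 70.0}

found = interacting_pairs(pairs, pct_epi, pct_str)
level = interaction_level(found, pairs)
```

Recording the direction of each interaction keeps epithelium-to-stroma and stroma-to-epithelium signaling distinguishable when pairs are later ranked across phases.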
Endometrial tissues were fixed for 24-48 h in 4% paraformaldehyde (PFA) at room temperature, trimmed, embedded in paraffin, and sectioned at 3 μm thickness onto APES-coated slides.
Tissue sections were baked at 60° C. for 1 h, deparaffinized with Histoclear and rehydrated with an ethanol series. Antigen retrieval was performed by boiling tissue sections in 10 mM sodium citrate buffer (pH 6.0) for 20 min, followed by immediate cool down in cold water for 10 min. Tissue permeabilization was done with 0.25% Triton X-100 in PBS for 5 min, followed by two 5 min washes in 0.05% Triton X-100 in PBS. Non-specific binding was blocked with 5% BSA-0.05% Triton X-100-4% goat serum in PBS for 1 h at room temperature. Tissue sections were then incubated with primary antibodies overnight at 4° C. and secondary antibodies for 1 h at room temperature. Primary antibodies used and dilution ratios are: Vimentin (2 μg/mL, ab8978, Abcam), Prolactin (1:10, PA5-26006, Thermo Fisher Scientific), CD3 (1:100, ab5690, Abcam), CD56 (1:50, ab133345, Abcam). Secondary antibodies used and dilution ratios are: Goat anti-mouse IgG (H+L) Superclonal™ Alexa Fluor 488 (1:200, A27034, Thermo Fisher Scientific) and Goat anti-rabbit IgG (H+L) Superclonal™ Alexa Fluor 555 (1:200, A27039, Thermo Fisher Scientific). All sections were counterstained with 4′,6-diamidino-2-phenylindole (DAPI) (Thermo Fisher Scientific) and mounted with Aquatex® (Merck-Millipore). Images were captured with a confocal microscope (FV1000, Olympus) at 20× and 60× magnification with oil immersion and analyzed using Imaris (Bitplane).
Combined RNA and antibody in situ hybridizations were performed according to the manufacturer's technical note “RNAscope Multiplex Fluorescent v2 Assay combined with Immunofluorescence” for FFPE samples (Advanced Cell Diagnostics). 15 min and 30 min incubations were used for target retrieval and Protease Plus treatment, respectively. RNA probes (Advanced Cell Diagnostics) with the following channel assignment (C), fluorophore, and dilution in TSA buffer were used: CDHR3 (C1, cyanine 3, 1:1500), C11orf88 (C2, cyanine 5, 1:750); C20orf85 (C1, cyanine 3, 1:1500), FAM183A (C2, cyanine 5, 1:1500). Tissue sections were blocked with SuperBlock (PBS) blocking buffer (Fisher Scientific) for 30 min at room temperature, incubated in anti-human FOXJ1 (1:500, eBioscience) overnight at 4° C. and goat anti-mouse IgG secondary antibody (1:500, Life Technologies) for 2 h at room temperature. All sections were mounted with Prolong Diamond Antifade Mountant (Thermo Fisher Scientific). Imaging was carried out on an Axio-plan epifluorescence microscope equipped with an Axiocam 506 mono camera (Zeiss) using a 20×/0.8 Plan-Apochromat objective (Zeiss). For each sample, 8-10 fields of view were captured with 10-15 z-stacks.
Z-stacks were projected (maximum intensity projection, MIP) using ImageJ. The resulting MIP images were analyzed using CellProfiler 3.0.0 as follows: 1) Correct background by subtracting the lower quartile of the intensity measured from the whole image. 2) Detect cell nuclei using the DAPI channel and cell boundaries using Voronoi distance (25 pixels) from the nuclei. 3) Enhance RNA signals using a tophat filter (5 pixels) and detect signals by intensity threshold (0.004 and 0.002 for Cy3 and Cy5, respectively). 4) Measure antibody intensity for each detected cell. All images were analyzed in the same way, with no image excluded.
To characterize endometrial transformation across the natural human menstrual cycle, endometrial biopsies from 19 healthy and fertile females were collected, 4-27 days after the onset of each donor's latest menstrual bleeding (
Dimensional reduction via t-distributed stochastic neighbor embedding (tSNE) (Maaten and Hinton, 2008) on the top over-dispersed genes (Method) revealed clear segregation of cells into distinct groups (
Using RNA and antibody co-staining (Method), previously unannotated discriminatory markers and epithelial lineage identity were validated, and the spatial distribution of ciliated epithelium was visualized in situ. Four genes were selected for RNA staining: they were identified as highly discriminatory for the cell type (
Samples were taken throughout the menstrual cycle and annotated by the day of menstrual cycle (the number of days after the onset of last menstrual bleeding). While the time variable serves as an informative proxy for assigning endometrial states, it is susceptible to bias due to variances in menstrual cycle lengths between and within women (Guo et al., 2006), and limited in resolution due to variance of cells within an individual. To study transcriptomes of endometrial transformation in an unbiased manner, within-cell type dimension reduction (tSNE) was performed using whole transcriptome data from unciliated epithelium and stromal fibroblast, respectively. The results revealed four major phases for both cell types, which are referred to as phases 1-4 (
Endometrial transformation over the menstrual cycle is at least in part a continuous process. A model that not only retains phase-wise characteristics but also allows delineation of continuous features between and within phases will enable higher-precision characterization. To build such a model, a mutual information (MI) (Tkačik and Walczak, 2011) based approach was used, such that the information provided by the time annotation was exploited, its limitation noted in the previous section was minimized, and potential continuity between and within phases was accounted for. Briefly, genes that were changing across the menstrual cycle were enriched for based on the MI between gene expression and time annotation, regardless of the underlying model of dynamics (Method). In total, 3,198 and 1,156 “time-associated” genes were obtained for unciliated epithelium and stromal fibroblast, respectively (FDR<0.05) (
Interestingly, notable discontinuity in the trajectory of unciliated epithelia between phase 4 and the preceding phases was observed (
Unlike their epithelial counterparts, transcriptomic dynamics in stromal fibroblasts demonstrated more stage-wise characteristics, where genes were up-regulated in a modular form, revealing boundaries between phases (
While the WOI opened with an abrupt transcriptomic transition in unciliated epithelial cells, it closed with more continuous transition dynamics (
The parallel transition in stromal fibroblasts was also characterized with three similar groups of genes (
Cell type identity and cell state are primarily driven by small groups of transcriptional regulators. Therefore, WOI-associated transcription factors (TFs) were sought to understand what drives the opening and closure of the WOI. All TFs that are dynamic across the menstrual cycle (Method) and found for both unciliated epithelia and stromal fibroblasts were first characterized; these TFs can be primarily assigned to two main categories (
Next, WOI-associated TFs were defined as those with a peak expression detected after the opening of WOI (
In summary, the analysis enabled the identification of key drivers for the opening and closure of the human WOI as well as transitions between other major cycle phases (
Since its formalization in 1950 (Noyes et al., 1950), a histological definition of endometrial phases, i.e., the proliferative, early-, mid-, and late-secretory phases, has been used as the gold standard in determining endometrial state. It also usually serves as the ground truth in bulk-based profiling studies in categorizing endometrial phases. Given that there were clear differences between the phase definition as used herein and the canonical definition, the relationship between the two was investigated.
Cell mitosis is one of the most distinct features of the pre-ovulatory (proliferative) endometrium, hence the naming of proliferative phase. Thus, to identify the boundary between proliferative and secretory phases, cell cycle activities across the menstrual cycle were explored. Specifically, endometrial cell cycle associated genes were defined (
To further validate this assignment, characteristic signatures for phase 1-4 were defined and major hierarchies of biological processes that were enriched by the signatures were identified. While phase 1 was characterized with processes such as tissue regeneration, e.g., Wnt signaling pathways (unciliated epithelium: epi), tissue morphogenesis (epi), wound healing (stromal fibroblasts: str), and angiogenesis (str) and phase 2 by cell proliferation (epi), phase 3 was dominated by negative regulation of growth (epi) and response to ions (epi) and phase 4 by secretion (epi) and implantation (epi). The transition from a positive to a negative regulation in growth from phase 2 to 3 further confirmed a pre-ovulatory to post-ovulatory transition (Talbi et al., 2006).
Lastly, previous bulk tissue analyses were used to help differentiate the pre-ovulatory and post-ovulatory phases. It was reasoned that although bulk data is confounded by the varying proportion of the major cell types, i.e., stromal fibroblasts and unciliated epithelial cells, bulk and single cell data taken together should have high level of consensus on genes that 1) are in synchrony between the two cell types or 2) have negligible expression in one cell type but significant phase-specific dynamics in another. Therefore, genes were identified with these characteristics using the single cell data (
In phase 1, sub-phases were observed in both unciliated epithelial cells and stromal fibroblasts that are primarily characterized with genes that are gradually decreasing or increasing towards later part of the phases (
Examples of genes that have expression peaks in different phases (phase 1, 2, 3, or 4) in ciliated epithelia and stromal fibroblasts are provided in Tables 16 and 17, respectively. Accordingly, one or more of these genes can be evaluated (e.g., using RNA and/or protein expression levels) in one or more of these cell types to determine whether a subject is in menstrual phase 1, 2, 3, or 4, for example to determine whether the subject is approaching, entering, in, or exiting a WOI. For example, the expression level of one or more genes (e.g., 1-10, 10-25, 25-50, 50-100, 100-250, 250-500, 500-1,000 or more or all of the genes) characteristic of one or more phases (for example, one or more genes for each phase) can be assayed and compared to a reference level (e.g., for each gene) associated with one of the phases (e.g., for phase 1, phase 2, phase 3, phase 4, or 2, 3, or all thereof) to determine whether a subject has a gene expression level that is indicative of being in phase 1, phase 2, phase 3, or phase 4, or, for example, of approaching, entering, being in, or exiting a WOI.
Lastly, interactions between unciliated epithelial cells and stromal fibroblasts were explored by identifying ligand-receptor pairs that were expressed by the two cell types across the major phases/subphases of the cycle (Method). One major feature was noted within the identified ligand-receptor pairs: they are dominated by a diverse repertoire of extracellular matrix (ECM) proteins paired with integrin receptors, suggesting that ECM-integrin interaction is a major route of communication between the two cell types. Key interactions were identified at the WOI such as between LIF and IL6ST, with LIF being a key gene implicated in endometrial receptivity (Evans et al., 2009, 2016; White et al., 2007).
WNT5A
DCP1A
GREM2
NPDC1
CDK11B
ABRACL
SLCO4A1
SFRP4
SLC25A24
MXD1
UBE2G1
FBXO21
PSMG3
ODC1
NREP
CCT2
ADNP
NAE1
SOCS3
GOLPH3
AGPAT5
PTMAP5
FRK
OLA1
EGFR
FZD6
EDF1
PLA2G16
GBP5
CXCL3
KIAA1324
AP3S1
PARP14
HACD3
LINC01502
IFI6
IP6K2
SCNN1G
PSMC1
IRF2BPL
ALDH18A1
ANKRD55
AKAP1
CORO1C
C16orf72
ALDH16A1
PLA2G4A
GNG11
EDNRB
MMP11
MREG
ZNF252P
IFITM3
TRIM22
STEAP4
SLC22A5
PLXDC2
PSMD11
PAK1IP1
TPM4
FAM155A
ASRGL1
MFSD4A
ANTXR1
LTV1
CNOT6L
CRIP1
RNF8
ELP3
DUSP6
PITHD1
ITGAV
ANAPC4
PSAP
WWC2
GGTA1P
FXYD3
CCNC
ADCYAP1R1
ID3
PSAT1
ALDH6A1
AOX1
IGFBP3
SMIM15
ATP5C1
MYH10
MTPN
GGCT
LYPLAL1
LY6E
SREK1
RPARP-AS1
CRIP2
TWSG1
SH3YL1
HAL
SHH
INO80D
TRIM59
DST
FAM96B
GABRP
FXYD2
BMP2
PLCB1
UQCRH
MGST2
VTCN1
PRELID3B
CITED2
FLNA
SH3RF1
DNAJC19
LSM5
ARPC1B
SEC61G
SLC44A1
COL12A1
UCHL3
UBA3
RANBP1
KRTCAP2
CAMK2D
ATP2C2
PTGS2
TBL1XR1
SAR1A
EMC10
ALPL
TALDO1
LINC01207
LINC01588
ITIH5
EPB41L2
AC013461.1
UNC5B
SPATA13
BACE2
MMP7
ACTR3
APOBEC3C
HSPE1
TMEM131
CTAGE5
ACADSB
LCN2
FDPS
VAMP7
MYO10
NRXN3
SIAH2
NABP1
QSOX1
UTP11
PPP1R9A
PHGDH
MSX2
C19orf53
MAOA
CSF1
RDX
RPP30
MSH3
BHLHE40
AMD1
SLC1A1
GJA1
WDR48
AEN
MGLL
POLG2
MRFAP1
C2CD4A
ENC1
ZHX2
SMARCA1
RCC2
PTGS1
NPR3
IDO1
RAI14
SLC9A3R1
CASP2
PRKDC
PIP5K1B
MRPL55
MGST1
LIF
IRF6
CYP51A1
SNRPD1
COBL
DGUOK
CCL20
TUBA1A
ARHGAP26
CTD-3014M21.1
CD2AP
ANXA3
OVOL1
ARSB
CYP1B1
PGM2L1
NME2
ETV5
CEBPB
ATPIF1
RASGEF1B
WNT7A
EIF2S1
OSTC
TP53
MTA2
TFAP2C
CLEC4E
LAPTM4B
GPR22
PKP4
TBC1D5
LPAR3
ACSL5
KRT23
HMGA1
TCF7L2
UTRN
GLG1
RNF122
APOL2
SLC15A4
ELK3
SEMA3A
PAFAH1B2
CHD4
SLBP
CSRP2
TMEM45B
USP10
PRMT1
OAT
LYRM2
GPBP1L1
RASSF4
FAM134B
BCAT1
MINOS1
NTPCR
PSIP1
PPL
CNDP2
GDF15
COL18A1
MAPK1IP1L
INTS6
MCAM
TMEM184B
SEC14L1
SIK1
PROM1
ING3
PLAGL2
MALT1
PDZD2
MRPL3
DEPTOR
C3
RBM22
TMED10
NIPSNAP3A
CAP1
GNG5
COMP
NRP2
MORF4L1P1
ZMYND8
RIOK1
SLC26A2
ZNF652
PPP2R5A
PIM1
OCLN
CBWD5
ANP32B
ZDHHC9
WDR1
RAB11A
MFAP2
KIF21A
GXYLT2
HTATSF1
RNF150
CTNNA2
HN1L
CYR61
RC3H1
AEBP2
CTSB
RAB27A
THEM4
FAM65B
ZDHHC13
GCLC
DDAH1
ATP1B1
HPGD
MAGED1
EIF4E3
LUZP1
GLIPR1
PGR
HMGN2
HNRNPR
DYNLT1
CTSA
RBP1
SIX4
CNKSR3
PARP1
SLC39A8
NDUFA1
PHYHIPL
IL18
PHLDA1
CH17-373J23.1
GPI
AP1S2
CCDC186
CXCL14
PLAU
ARMC8
FAM96A
CNPY2
SPATS2
MBP
SLC7A2
SERPINB9
TCERG1
CCDC14
SEPT7
TXNDC16
TMEM141
TSPAN1
AMOTL2
SERPINA1
LRP6
FBL
CITED4
ACTN1
ATP6V1A
NCEH1
GPR89A
POLR2D
FRAS1
NDUFB1
FREM2
RIMKLB
CD74
PAPD5
DCUN1D1
C21orf33
THAP4
NAAA
PIGR
THBS1
LSM12
METTL7A
PRRG4
SREBF2
CKB
TMEM92
TNF
MED24
EBP
SERINC5
SUFU
MFSD6
TC2N
ARHGAP29
USP16
R3HDM2
C8orf33
COX16
ECHS1
MRPS2
B4GALT5
ZNF644
CLMN
NUDT19
FAM174B
POC1B
SEPHS2
EMP1
SLC39A10
HNRNPF
ARID1A
PREP
FTH1P10
SLC15A1
TOP2B
MDM4
HELB
NDUFS5
TMEM261
RNF183
GRAMD1C
RNF152
RBMXL1
POGLUT1
ATP5G1
MTHFD2L
ZCCHC6
ANXA4
ADAMTS9
STEAP1
BZW2
LINC00998
AK3
HPRT1
VPS41
ILF3
PALLD
INIP
ZNF589
LRIG1
GSN
IRX3
ASPH
TMEM33
ZRANB2
HADH
CAPNS1
FAM120B
ERN1
XRCC5
ZNF286A
SNHG6
KIAA1143
ETFRF1
NEBL
C2CD4B
CFI
ATXN1
TRAF3IP2
PARK7
ATP6V0E2
ECI2
CXCR4
MARCKSL1
TMEM120B
THYN1
POMP
RCN1
ITGA1
SCCPDH
TSPAN15
TLE3
TIMM17A
MMAB
PAX2
B3GNT2
DPP4
HLA-H
SPRY1
RBMX
KRT8
ATP5J2
RAB4A
G0S2
DUSP10
DNAJC10
EIF4E
APRT
NDUFB6
SLC4A7
TRAM1
ATIC
S1PR2
PHF14
PCDH7
GMNN
CKMT2
HIST1H2AC
MIR4435-2HG
MTPAP
EIF4B
SELENOW
COA3
PYY
TMC5
IL32
NMD3
CEP57
ARL4A
ZBTB11
FAM177A1
LAMB3
BHLHE41
NPM1P27
SRSF2
CCDC170
ANAPC16
ALDH3B2
C12orf75
IL23A
SRPK1
BTF3L4
SH3RF2
FKBP9
CD36
SLPI
RASSF3
CLUHP3
SLC25A6
METAP1
PTS
IDH1
C4BPA
SMAD3
VIM
LRRC75A-AS1
NDUFA2
SLC25A1
MPZL2
SNX29
SNHG16
CPM
GAS7
CTTN
SSBP1
GMPR
MAP3K5
RRAS2
LIPA
PSMA6
CENPX
CSRP1
COL1A2
PAX8
FBXO32
SENP5
HMOX2
FAM84A
PPP4R2
ERLEC1
LEPROT
CD47
MTFMT
PRDX6
EEF1E1
BCAP29
RHOBTB3
DEFB1
IGF2R
AGO3
MARCH6
SYNGR2
PER2
TPD52L1
MITF
B3GNT7
TUSC3
GTF2A2
FUT8
TP53I3
CREB3L1
TNFSF15
MSN
ST3GAL5
UBE3A
GDI2
ATP5F1
BNIP3L
AQP3
HMGB1P5
ALDH3A2
IMPDH2
FH
ITGA6
HGD
GRN
RBBP8
KIZ
EIF1AX
CCDC146
CD99
CD81
DHRS3
FHL2
IGFBP4
STON2
SRRM2
MRPS34
MAP2K6
UBBP4
MB21D2
ANO1
PTGFRN
NDUFA8
NAALADL2
CPT1A
MUC16
DEFB4A
CDC42EP3
BRD3
COX4I1
PLLP
GPT2
SPP1
MED4
LINC01480
CBX5
PKP2
PNPO
SQLE
LINC01320
HDAC9
TNKS2
PDCD4
TRIM2
ATP1A1
SNX9
AGR2
RGS10
EMID1
MSI2
EDN3
HK2
CYB5A
SRD5A3
EXT2
ADAM28
TXN2
NUCKS1
CTC-444N24.11
LDLR
VEGFA
CTSS
TAF9
TRIM16
KRT19
GRHL2
SLCO3A1
DUSP5
DLC1
YLPM1
PLEKHA5
NBEAL1
BAG5
AREG
ADGRF1
E2F3
MEST
C6orf48
AKR1C3
TMEM256
TMEM144
CP
SVIL
ARL3
C7orf73
YBX1
RFLNB
PPM1H
DCPS
SEMA3B
TFDP2
CHD7
HNRNPM
RANBP17
RXFP1
SCGB2A2
ADGRA3
EXOSC5
BEX3
TNS1
SLF2
TAP2
NUPR1
ANKLE2
EIF3E
HSPD1
CTBP1
AIFM1
FKBP5
CRYAB
FOSL1
SLC47A1
TCTN2
WDR77
KYAT3
HMGCR
RASD1
CYTOR
NSG1
MECOM
BTG2
TFCP2L1
FAM129B
PAPLN
CA12
APOOL
BOD1L1
OGFOD1
PHB2
CALD1
PAX8-AS1
JARID2
CTSH
H2AFZ
HERPUD1
KCNK13
FOXO3
TXNIP
CXCL1
TCEAL1
RAD51C
POLR2G
L3MBTL4
DCXR
FAM3C
PABPC4
PORCN
SNRPN
TLE4
NFIA
UBE2D2
ZNF292
MACC1
PSMD12
CEP290
TSTD1
PYURF
CYP26A1
TRAK2
SPECC1
AGO2
TFAM
PTPRJ
FAM213A
LINC00844
TNFAIP2
RBPJ
ZBTB38
EXOSC8
LAMB2
IL20RA
ANXA2P2
VCAN
HNRNPAB
GAN
FAM111A
MEX3D
LLGL2
ARHGAP18
HNMT
NFKB1
DMKN
CHCHD2
LRRC41
NPTN
PLEKHF2
MYO9A
LAMC2
PPP1R2
ACTL6A
SULT1E1
ORAI2
ADAMTS8
GPR160
ANKRD33B
MUM1
AHSA1
POLR2J3
DLGAP1
IFNGR1
GPX3
ARL14
PCMTD2
EEF1D
POLD2
LIPG
BTAF1
PAEP
SHISA2
NPAS3
STX18
UBE2Q2P2
TRAK1
DUOX1
STC1
MYO6
COLGALT1
PBX1
PSMA7
ACPP
ATP6V1G1
TUFT1
RARRES2
PAN3
SLC25A5
LARS
NAA60
CCNA1
NNMT
SMURF2
DAAM1
SLC12A2
RIN2
NOSTRIN
PHB
FBLN1
CD83
AC093673.5
HNRNPA0
IGFBP2
DLG5
TFPI2
HABP2
ATP6V1B2
TMEM41B
EDN1
COA4
TAP1
TMED4
CYP3A5
TARBP1
BMPR1B
NONO
ITM2C
RNASET2
LINC00116
CLDN10
ITGB6
BST2
PAICS
EIF3M
PRKX
SLC39A14
SYNE2
PTBP2
PAM
APEX1
RHOB
JTB
HLA-DOB
HKDC1
HSPB8
SFXN2
NME1
ID2
MCC
EMC4
ABCC3
RAB11FIP1
COL27A1
RBBP7
SRGAP3
RABGAP1L
WIPI1
SCIN
FAM98B
ERI1
REC8
DUT
SUDS3
MSMO1
C8orf4
SPIN1
DDHD1
CNP
THSD4
NFIB
SH3BGRL3
SLC40A1
DEK
MRPL1
CCT4
SERPINA3
OPRK1
CAPN6
NAPSB
KHDRBS1
CWC15
DDX1
TCF20
ACSL4
LRRC1
PIK3R1
TRIM33
CXADR
ATP1B3
ZNF611
DNAJC15
DHRS7
IGFBP7
CMTM7
KIAA1456
NBPF10
C22orf29
MUC1
PART1
SERPING1
TNFRSF12A
ATP5G3
PAPD4
PLEKHG1
ARF5
SMS
GEM
SPOCD1
ZNF121
PRDM2
CNTLN
FARSB
ENPP3
CYP24A1
TXNRD1
CDCA7L
PPA2
AGR3
C2orf88
TUBB2A
CXCL2
BCL9
CLNS1A
REEP5
SERPINA5
SLC15A2
WHRN
CLU
OCIAD2
NEIL2
SMG1
PPP3CA
TMEM101
DUOXA1
FGL2
ADAM9
EIF3G
MARCKS
CHD3
AFMID
SCGB2A1
ZBTB20
TARDBP
ADAMTS6
ANP32E
TCEA3
PDXDC1
PLIN2
LITAF
RIF1
TM2D3
SNRPB
SAMHD1
CARMIL1
RAMP2
TNFSF10
ZNF608
DDX6
CDC123
HNRNPK
NAMPT
ARL4C
HES1
SF3B1
ARHGAP17
DFFA
GRHPR
ANK3
HSD17B2
ABLIM1
UBE2E3
USP7
PGD
LONRF1
AK4
SORD
DNAJB1
PSMB4
GABPB1-AS1
GOLIM4
COL9A2
TPI1P1
PAPSS1
BICD1
SF3A3
OXR1
VCL
SEMA3C
ENAH
SLC16A1
HSPA1A
MLLT3
HNRNPA1P48
LIMCH1
ZCRB1
ABCG1
HSPH1
MYL6B
GAS5
PLEKHA2
DANCR
USMG5
TLE1
AXL
MRPL44
EEF1A1P13
SELENOH
RAB14
PIKFYVE
DENND2C
LUM
S100A16
DEGS2
EIF3D
ALCAM
SLC7A1
ATP6V1C2
MAP1B
MTF2
ARID2
UQCC2
ERC1
MARK1
MT1F
CCND2
GPRC5A
RIDA
SNRPF
HEY2
HDDC2
MT1X
COL3A1
SUPV3L1
FAM13B
YWHAQ
XYLT2
SMIM22
UPK1B
MMP2
ATRX
KRR1
RAN
PGRMC1
LONRF2
CDK7
SERPINE1
PIP5K1A
SLC35F2
SLIRP
ESR1
FAM110C
SCGB1D4
FSTL1
TPBG
MID1
PRPS2
LDHB
MPHOSPH10
SCGB1D2
COL1A1
BID
KMO
MDK
ARID1B
LAMTOR4
TESMIN
AKAP12
PITPNB
TLK1
TPM3
SNRPB2
PKHD1L1
MMP26
TCF4
ITCH
TLR2
AKIRIN1
FMC1
ATP6V0B
ST14
TIMP1
STX12
PSMD4
PLPP2
TNKS1BP1
SF3B6
XDH
SYNCRIP
CSF3
DDOST
MAP4
PRR15
VDAC1
AFDN
COL4A1
AP000462.1
LINC00665
STIP1
PKM
HMGN5
NHSL1
NAP1L1
ZNF827
WBSCR22
PDLIM1
SERBP1
TM9SF3
HEY1
SPARC
TNFRSF21
SBNO1
STMN1
DLG1
ARPC4
LPIN1
LGALS1
DNTTIP2
MRPS17
ALDH1B1
CCT8
TM7SF3
SYBU
IFITM1
HS3ST1
PPT1
TPR
RCN2
ADH5
KCMF1
TMEM98
ANKRD28
RBM3
NCL
MYBBP1A
SOX17
SPHK1
TIMP3
TNFRSF10B
PPIL4
AHCY
TOP1
STRBP
TIAM1
DCN
NELFCD
LINC01138
DNPH1
TCEAL4
RSRP1
SDCBP2
THBS2
MPRIP-AS1
SLFN5
HACD2
BARD1
HMGN3
SMIM5
CTSC
MED17
SLC39A6
CCT3
TMEM14A
OFD1
MT1E
YTHDC2
CTGF
PTEN
BROX
TAF8
ARL6IP5
TMEM154
ID1
NFATC1
TULP4
MIA3
DCBLD2
NDUFA13
MT1G
C11orf96
DENND4A
GAS2
MRPS25
CHCHD5
CTB-178M22.2
MT2A
RGS2
CMTM6
BLOC1S6
ATP5A1
NAV2
ATP5I
MT1M
SAMD4A
SDCBP
NHP2
DKC1
PLEKHA3
DLX5
LMO7
PDS5B
FAM133B
RAPGEF2
NAP1L4
UGT2B7
NEK1
MT1H
TIMP2
IFT57
PELI1
ATP5L
GATA2
CS
UTP15
PTN
CREB5
DCAF16
TOMM22
GREB1
B3GNT5
SLC18A2
PMEPA1
TSPAN6
CCNG2
SNHG14
ANKRD11
PLA2G4F
LIG4
HDAC2
CADM1
EYA2
PFDN5
OXCT1
NOV
SLC30A2
NOTCH3
L3MBTL3
UBE2N
UBAP2L
RCAN1
APOPT1
ADGRL2
C1S
ASAP1
SMAD9
HSBP1
TIMM8B
ADIPOR2
GAST
NRP1
PPP2R3A
SPDEF
CEP95
STXBP6
NDUFC2
FAM84B
S100A6
CD44
GUSB
TSPAN14
SCD
HSD11B2
TCN1
HSPA1B
ARAP2
MMADHC
MAGI1
RREB1
SLAIN1
RASEF
IFITM2
NINJ1
PDGFC
PDIA3
ELF2
APOL4
GCNT3
HSP90AA2P
SOX9
ADAT2
GSTK1
JUN
HOMER2
CRISP3
PRSS23
N4BP2
SLC25A26
CCND1
BASP1
SORBS2
RIMKLBP1
NFATC2
NRCAM
STK17A
NASP
NEO1
RHOU
ELK4
ALDH7A1
HCP5
OTUD6B-AS1
IFI27
SNX5
TOB1
PCDH17
KLF9
SEMA3E
ETNK1
HIST1H4C
DBI
COX17
PPFIBP2
MEIS1
TARS
SUB1
RSRC1
PFKL
IKZF2
DYNLT3
CBX1
SIPA1L1
DYNC1I1
DNMT3A
GDA
NME4
CDYL2
MYO1B
CAB39
GLA
CUL5
SH3BGRL
CREG1
RBL2
CRISPLD2
KPNA4
FRMD4B
DNMT1
PLOD1
CDC42SE2
SLC34A2
COL6A3
RRP15
LCLAT1
SNRPD3
PABPN1
OST4
VNN1
MAP4K4
CRISPLD1
USP22
CCP110
SP100
HADHA
SLC3A1
TINAGL1
TPGS2
CSNK2A2
ST13
SYNJ2BP
GAPDHP65
DDX52
AGPS
ZNF516
PRPF40A
LGR5
FDFT1
BCL2A1
STARD3NL
PIP4K2A
HNRNPD
MTURN
COX7A2
TNFAIP6
F3
TSPYL1
HSPB1
KAT6B
CUTA
TSPAN8
CXCL8
ITGA6
CDV3
POSTN
POLG2
ADAMTS9
C11orf96
OTUD4
XBP1
CNTN1
ABCA1
HLA-B
PMAIP1
PPP2CA
KDM6B
ZNF704
PTGDS
LGALS3
PER2
RUNX1
CELF2
FREM1
SLC26A7
LAMB1
GEM
RAP2B
PLAU
IGFBP7
WEE1
AHCY
STC1
H2AFZ
CXADR
IL33
ARIH1
MGST1
TNFRSF12A
PTGS2
AP1G1
PAG1
AKAP12
ACTA2
MAP3K8
PFKFB4
IRF2BP2
HIST1H4C
CHD1
SCARA5
UGCG
ZC3H12A
TOP1
TRIB2
ELMSAN1
ATP6V0E1
ERRFI1
KPNA4
TAX1BP1
MRC2
KLF4
GPX1
INHBA
MCL1
EPCAM
PPP2R2C
BCL6
SERPING1
CDH2
ETV5
PDIA4
MTUS2
SERPINE1
NNMT
ANXA1
CCDC85B
GTPBP4
STMN1
GPRC5A
PSMA7
CYTOR
PSMD11
ZSWIM6
RBP7
THBS1
SRI
TGFBI
SQSTM1
PODXL
OLFM1
EMP1
PSME1
MAP2K3
CFL1
SDC4
PGR
BHLHE40
PFN1
HMGA1
PDE4B
TMEM2
RUNX1T1
KPNA2
ABCC9
B4GALT1
RTN4
RNF152
BRD8
OSER1
PPP1R14A
NFATC2
ERN1
EIF5
PEBP1
DNAJB1
CAP1
F13A1
FGFR1
PHLDA1
IGDCC4
LDLR
C3
BZW1
ETS2
PELI1
SKA2
MIR22HG
IGFBP4
SYNJ2
LRMP
MSANTD3
BEX3
ARC
IL15
MAFF
COQ10B
ELK3
N4BP2L2
TNFAIP3
TMEM45A
MIR4435-2HG
FBXO33
PSMD7
ZCCHC11
HSPA1A
APOD
FOSL1
ATP1B1
TNFRSF9
CACNA1D
NFKBIZ
SNX10
MMP7
IER3
AMOTL2
GDF7
ANXA2
TGM2
PDGFC
PPP1R15B
LIMS1
ECM1
CAST
ALDH1A3
PIM3
NFKB1
LAPTM4B
ZFYVE21
GFPT2
CFD
ABL2
ALYREF
ATP13A3
TRAM1
ANXA2P2
MGP
FJX1
ANKRD28
MEST
PIP5K1B
TUBA1C
HAND2
ELL2
LIF
ITGB1
HOXA10
GPX3
HSPB1
TES
ETS1
RAB22A
ZBTB8A
TRIB1
PRPS2
CD44
NR3C1
RAN
PKD1L2
SFMBT2
BCAT1
SDK2
SEC24A
SDC2
FAM213A
LMCD1
MYL9
CAV1
MYADM
SERTAD1
PDS5B
FGF7
TXNIP
SGK1
FHL2
CSNK1A1
PPIB
NR4A1
MAOB
TWIST1
DUSP14
HSPH1
DIO2
RDH10
TUBB
CXCL1
ANK2
EGR3
P4HA2
ARID5B
TMEM37
NRIP1
B3GNT2
CPM
TMEM144
PAEP
PLA2G2A
KLF5
KMT2C
MEX3D
ANO1
CYP4B1
FOXO1
LRRFIP1
PARD6B
AFF4
GLG1
ATF3
APCDD1
CD83
TLE3
LTBP2
HOXA11
CORO1C
C1orf21
NINJ1
RAB7A
IFI6
SEC22B
THBS2
HSPB6
TNC
REL
PMEPA1
SLF2
ADAMTS5
LMOD1
CXCL2
HK2
PIM2
TRPS1
NCOA7
EFEMP1
BAZ1A
SDCBP
SKIL
ANKRD20A11P
PLIN2
C1R
SPSB1
CLEC2B
TSKU
DAAM1
LDHA
IGF2
RASSF3
TXNRD1
ZBTB2
TNRC6B
TIMP3
PILRA
BMP2
CDC14A
AHSA1
RASSF2
MTHFD2
RBP1
RIPK2
QKI
TFAP2C
GXYLT2
STOM
SDHD
KRT19
FOXP1
TMED4
CDK6
YBX3
SLC2A8
GADD45A
CD59
TPBG
ZNF532
MEDAG
C1S
AMFR
TP53BP2
ZFAND2A
HSD11B2
MIF
PAPLN
GFRA2
ARID4B
MIR29A
FAM46A
TLN1
SPTSSA
DUSP5
ATP2B1
CYR61
F3
TWISTNB
DSTN
NOCT
LTBP1
ALCAM
GARNL3
NME2
SLC8A1
SLC39A14
SNX9
ID3
SPEF2
DKK1
LCP1
KLHL21
GSPT1
HSPE1
PPM1H
DAXX
MCC
CTNNAL1
PLK2
FKBP9
ARHGAP20
RAB31
ENPEP
MAP1LC3B
STX3
PPP1R15A
SPECC1
S100A4
TGFBR2
CEBPB
BACH1
USP22
PDGFRA
DPYSL2
PSMA4
ARL4C
ADNP
CPE
FAM198B
CLIC4
NUPR1
LMNA
EIF3A
COL27A1
RBM6
HLA-C
MMP2
ADM
ATP6V1G1
PAMR1
FABP5
STAT3
PIK3R1
PIM1
PTRF
PCSK5
MATN2
FKBP1A
FBLN5
WDR43
HSP90AA2P
ISLR
RORB
LITAF
AKAP13
ADAM12
ILF3
BGN
HELLPAR
S100A11
ADCY1
CKS2
LAMC1
MMP11
ITGB8
PDIA6
GPX4
ZBTB43
EAF1
MMP16
TMEM196
FBLN2
UBL5
MAP1B
MXD1
TNFRSF19
MME
HLA-A
AASS
TNFAIP2
NFE2L2
KLF10
LETM1
CXCL14
PDCD5
GCLC
MINOS1
GLIPR1
TMEM132B
INSR
SLIRP
CADM1
SPRY2
PGRMC1
REV3L
CACNB2
H19
FNDC3B
CDKN1A
MFAP2
NTRK3
TCEAL4
COLEC11
CRY1
EIF4E
PRSS23
JAZF1
CRYAB
GABRA2
DNAJB6
TNIP1
WNT5A
FN1
TAGLN
APLP2
ADAMTS16
TFPI2
GUCY1A2
CILP
ENPP1
MAF
CD34
KIF1B
CRABP2
NR2F2-AS1
ALDOA
MASP1
EZR
IFNGR2
ANO4
SEMA5A
TPM2
ST3GAL5
CREB5
NAMPTP1
PAM
PARM1
SERPINF1
PRLR
CD55
NAMPT
GJA1
SLC12A2
SELENOP
FBXO32
SCD
UBE2D3
MFAP4
PLCD1
UQCR10
DDX21
CSF1
FNDC1
INTS6
IRS2
HAND2-AS1
ZBTB38
ISOC1
ALDH1A1
PLCL1
PALMD
MYL12A
SLC2A1
LINC01588
SFRP1
PLEKHH2
AC005062.2
RBX1
HSPB8
PSMD6
ETV1
PTN
DHRS3
GLUL
B4GALT5
PTP4A1
SFRP4
EBF1
POLR2L
APOC1
MAPK6
RAP1B
NREP
ELN
PDLIM1
In unciliated epithelial cells, further segregation of cells was noticed (
To identify the nature of the segregation, differential expression analysis was performed and genes were found that consistently differentiated the subpopulations across multiple phases (
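The idea of keeping only genes that distinguish the subpopulations in more than one phase can be illustrated with a minimal sketch. It assumes that a differential expression step has already produced one gene set per phase; the function name `consistent_genes`, the `min_phases` threshold, and the placeholder gene labels are hypothetical and are not taken from the present Application.

```python
from collections import Counter


def consistent_genes(de_by_phase, min_phases=2):
    """Return genes that are differentially expressed in at least
    `min_phases` phases (threshold is an illustrative assumption)."""
    # Count, for each gene, the number of phases in which it was called.
    counts = Counter(
        gene for genes in de_by_phase.values() for gene in set(genes)
    )
    return sorted(gene for gene, n in counts.items() if n >= min_phases)


# Hypothetical per-phase results with placeholder gene labels.
toy_de = {
    "phase_1": {"GENE_A", "GENE_B"},
    "phase_2": {"GENE_A", "GENE_C"},
    "phase_3": {"GENE_A", "GENE_B", "GENE_D"},
}
shared_in_two = consistent_genes(toy_de, min_phases=2)
shared_in_all = consistent_genes(toy_de, min_phases=3)
```

Raising `min_phases` makes the selection stricter; the appropriate threshold would depend on how many phases contain enough cells of each subpopulation.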
Genes that were previously reported to be critical for endometrial remodeling and embryo implantation were noticed within the differentially expressed genes (
Compared to the consistent distinction between the ciliated and unciliated epithelium, the deviation between luminal and glandular epithelium at the transcriptome level was subtler and more dynamic: it became noticeable in late phase 1 and was most pronounced in phase 2 (
Functional enrichment analysis of genes overexpressed in the luminal epithelium in the proliferative phase revealed extensive enrichment in morphogenesis and tubulogenesis, which lead to the development of anatomic structures, as well as in morphogenesis at the cell level, which leads to differentiation (
In addition, a third cell group was identified in the first three biopsies on the pseudotime trajectory (ordered by the median of pseudotime of all cells from a woman) (
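The ordering of biopsies described above (by the median pseudotime of all cells from a woman) can be sketched as follows. This is a minimal illustration that assumes each cell already carries a donor label and a pseudotime value; the function name and the toy values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def order_donors_by_median_pseudotime(cells):
    """Order donors by the median pseudotime of their cells.

    `cells` is an iterable of (donor_id, pseudotime) pairs, one per cell.
    """
    per_donor = defaultdict(list)
    for donor_id, pseudotime in cells:
        per_donor[donor_id].append(pseudotime)
    # One summary value per donor: the median over all of her cells.
    medians = {donor: median(values) for donor, values in per_donor.items()}
    return sorted(medians, key=medians.get)


# Hypothetical toy data: (donor, pseudotime) for seven cells.
toy_cells = [
    ("donor_B", 0.42), ("donor_A", 0.10), ("donor_B", 0.50),
    ("donor_A", 0.14), ("donor_C", 0.90), ("donor_A", 0.12),
    ("donor_C", 0.80),
]
donor_order = order_donors_by_median_pseudotime(toy_cells)
```

The median is less sensitive than the mean to a small number of cells with outlying pseudotime values, which is one plausible reason to use it for ordering donors.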
Adult human endometrial gland formation in menstrual cycles has been proposed to originate from clonogenic epithelial progenitors, mesenchymal progenitors, or both, in the unshed layer of the uterus (basalis) (Nguyen et al., 2017; W. C. et al., 1997). The present data indicate that endometrial re-epithelization occurs through MET from mesenchymal progenitors, a process that has been demonstrated in transgenic mouse models (Cousins et al., 2014; Huang et al., 2012; Patterson et al., 2013) but had yet to be observed in humans. The present data also show that, following re-epithelization, endometrial gland reconstruction in adult human endometrium is driven by tubulogenesis in the luminal epithelium, which involves the formation of either linear or branched tube structures from a simple epithelial sheet (Hogan and Kolodziej, 2002; Iruela-Arispe and Beitel, 2013), a mechanism that also contributes to gland formation during the development of the human fetal uterus (for review, see Cunha et al., 2017; Robboy et al., 2017). This process is also characterized by proliferation activities that are locally concentrated at the glandular epithelium.
Using the phase definition of unciliated epithelial cells and stromal fibroblasts, other endometrial cell types from the same woman were assigned to their respective phases and quantified for their abundance across the cycle (
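The phase assignment and abundance quantification described above can be sketched under simple assumptions: each woman has already been assigned a single phase from her unciliated epithelial cells and stromal fibroblasts, every other cell from that woman inherits the phase, and abundance is expressed as the fraction of cells of each type within a phase. The function name, the phase labels, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def cell_type_fractions_by_phase(cells, donor_phase):
    """Compute the fraction of each cell type within each phase.

    `cells` is an iterable of (donor_id, cell_type) pairs, one per cell.
    `donor_phase` maps each donor to her previously assigned phase.
    """
    counts = defaultdict(Counter)
    for donor_id, cell_type in cells:
        # Each cell inherits the phase assigned to its donor.
        counts[donor_phase[donor_id]][cell_type] += 1
    fractions = {}
    for phase, type_counts in counts.items():
        total = sum(type_counts.values())
        fractions[phase] = {t: n / total for t, n in type_counts.items()}
    return fractions


# Hypothetical toy data: two donors assigned to phases 1 and 3.
toy_phase = {"donor_A": 1, "donor_B": 3}
toy_cells = [
    ("donor_A", "stromal fibroblast"), ("donor_A", "lymphocyte"),
    ("donor_A", "stromal fibroblast"), ("donor_A", "stromal fibroblast"),
    ("donor_B", "lymphocyte"), ("donor_B", "stromal fibroblast"),
]
fractions = cell_type_fractions_by_phase(toy_cells, toy_phase)
```

Pooling cells across donors within a phase, as done here, weights donors by cell count; computing fractions per donor and then averaging would be an alternative design if donors contribute very different numbers of cells.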
Infiltrating lymphocytes were reported to play essential roles in decidualization during pregnancy, where they were primarily involved in decidual angiogenesis and regulating trophoblastic invasion30 (Hanna et al., 2006). Their functions in decidualization during the natural human menstrual cycle, however, remain to be defined. The dramatic increase in lymphocyte abundance in the early secretory phase in the data strongly suggests their involvement in decidualization (
Compared to their counterparts in non-decidualized endometrium (i.e., secretory (phase 3) and proliferative phases), lymphocytes in decidualized endometrium (phase 4) in the natural menstrual cycle have increased expression of markers that are characteristic of uterine NK cells during pregnancy (CD69, ITGA1, NCAM1/CD56) (
Next, genes were identified that are dynamically changing in the immune cells across the menstrual cycle, and those that are associated with NK functionality were characterized (
Intriguingly, decidualized stromal fibroblasts upregulated immune-related genes that reciprocated those upregulated in phase 4 immune cells. With the diversification of NKR observed in immune cells in the decidualized endometrium, an overall elevation in MHC class I genes in decidualized stromal fibroblasts was observed (
Using immunofluorescence, the spatial proximity between the identified immune subsets and stromal fibroblasts before (
The human menstrual cycle is not shared with many other species. Similar cycles have only been consistently observed in humans, apes, and Old World monkeys,1, 2 and not in any of the model organisms that undergo sexual reproduction, such as mouse, zebrafish, or fly. This cyclic transformation is executed through dynamic changes in the states and interactions of multiple cell types, including luminal and glandular epithelial cells, stromal cells, vascular endothelial cells, and infiltrating immune cells. Although different categorization schemes exist, the transformation has been primarily divided into two major stages by the event of ovulation: the proliferative (pre-ovulatory) and secretory (post-ovulatory) stages.3 During the secretory stage, the endometrium enters a narrow window of receptive state that is both structurally and biochemically ideal for an embryo to implant.4, 5 This, the mid-secretory stage, is known as the window of implantation (WOI). To prepare for this state, the tissue undergoes considerable reconstruction in the proliferative stage, during which one of the most essential elements is the formation of epithelial glands,6 lined by glandular epithelium.
Given the broad relevance in human fertility and regenerative biology, a systematic characterization of endometrial transformation across the natural menstrual cycle has been long pursued. Histological characterizations established the morphological definition of menstrual, proliferative, early-, mid-, and late-secretory stages.3 Bulk level transcriptomic profiling advanced the characterization to a molecular and quantitative level,7, 8 and demonstrated the feasibility of translating the definition into clinical diagnosis of WOI.9 However, it has been a challenge to derive unbiased or mechanism-linked characterization from bulk-based readouts due to the uniquely heterogeneous and dynamic nature of endometrium.
The complexity of endometrium is unlike any other tissue: it consists of multiple cell types which vary dramatically in state through a monthly cycle as they enter and exit the cell cycle, remodel, and undergo various forms of differentiation with relatively rapid rates. The notable variance in menstrual cycle lengths within and between individuals10 adds an additional variable to the system. Thus, improved transcriptomic characterization of endometrial transformation, at the current stage of understanding, required that cell types and states be defined with minimum bias. High precision characterization and mechanistic understanding of hallmark events, such as WOI, required study of both the static and dynamic aspects of the tissue. Single cell RNAseq provided an ideal platform for these purposes. A systematic transcriptomic delineation of human endometrium across the natural menstrual cycle at single cell resolution was performed, and the results are disclosed herein.
In the present work, both static and dynamic characteristics of the human endometrium across the menstrual cycle were studied at single cell resolution. At the transcriptomic level, an unbiased approach was used to identify six major endometrial cell types, including a ciliated epithelial cell type, and four major phases of endometrial transformation. For the unciliated epithelial cells and stromal fibroblasts, high-resolution trajectories were used to track their remodeling through the menstrual cycle with minimum bias. Based on these fundamental units and structures, the receptive state of the tissue was identified and characterized with high precision, and the dynamic cellular and molecular transformations that lead to the receptive state were studied.
The use of single cell RNAseq to characterize human endometrium is at an early stage. Using endometrial biopsies, a previous study was limited to the most abundant stromal fibroblasts (late-secretory phase, Krjutskov et al., 2016). Coincident with the present work, the feasibility of generating data from other endometrial cell types was also demonstrated by a group using full-thickness uterus (secretory phase, Wu et al., 2018), but cell types were only analyzed at a single time point in a single patient who underwent hysterectomy due to leiomyoma, a gynecological pathology known to cause menstrual abnormalities. Another coincident study modeled decidualization using in vitro cultures of human endometrial stromal fibroblasts and compared the result to the transition of stromal fibroblasts from mid- to late-secretory phase biopsies (Lucas et al., 2018). In the present study, biopsies were sampled from 19 healthy women across the entire menstrual cycle. Each of the reported biological phenotypes was supported by multiple biological replicates (i.e., women,
An important result of the present work is the molecular characterization of the ciliated epithelium as a transcriptomically distinct endometrial cell type; these cells are consistently present but dynamically changing in abundance across the menstrual cycle (
In general, ciliary motility facilitates material transport (e.g., of fluid or particles). The notable increase of ciliated epithelia in the second proliferative phase (
The opening of the WOI was identified, along with a method for diagnosing the unique transcriptomic dynamics accompanying both the entrance and the closure of the WOI. It was previously postulated that a continuous dynamic would better describe the entrance of the WOI, since human embryo implantation does not seem to be controlled by a single hormonal factor as in mice33, 34 (Hoversland et al., 1982; Paria et al., 1993), while discontinuous characteristics were also speculated based on morphological observation of plasma membrane transformation35 (Murphy, 2004). The present data suggest that the WOI opens with an abrupt and discontinuous transcriptomic transition in unciliated epithelium, accompanied by a more continuous transition in stromal fibroblasts. The abruptness of the transition also suggests that it should be possible to diagnose the opening of the WOI with high precision in clinical practices of in vitro fertilization and embryo transfer.
It is intriguing that the mid- and late-secretory phases fall into the same major phase at the transcriptomic level, especially since the physiological differences between the mid-secretory phase (high progesterone level, embryo implantation) and the late-secretory phase (progesterone withdrawal, preparation for tissue desquamation) seem to be as large as those between the early- and mid-secretory phases, if not larger. In fact, the characteristic transition at the closure of the WOI is largely driven by the same group of genes that contributed to the abrupt opening of the WOI, except that while their upregulation at the opening is rapid and uniform across all cells, their downregulation at the closure is less uniform and occurs over a longer period of time. From a dynamic perspective, the difference suggests that the transition between the mid- and late-secretory phases, although it may be similar in magnitude to that between the early- and mid-secretory phases, is slower in rate, perhaps reflecting the rate of progesterone withdrawal. From a molecular perspective, the less uniform downregulation of genes suggests that the closure of the WOI may be mediated through paracrine factors and cell-cell communications.
The abrupt opening of the WOI also allowed elucidation of the relationship between the WOI and decidualization. As noted earlier, decidualization is the transformation of stromal fibroblasts that is necessary for pregnancy in both human and mouse, and supports the development of an implanted embryo. However, in contrast to the mouse, where decidualization is triggered by implanting embryo(s)36 (Cha et al., 2012) and thus occurs exclusively during pregnancy, in humans decidualization occurs spontaneously during natural menstrual cycles, independent of the presence of an embryo21 (Evans et al., 2016). Thus, the relative timing between the WOI and the initiation of decidualization in humans has been unclear. While histological observation suggests that decidualization starts around the mid-secretory phase, the present data indicate that decidualization is initiated before the opening of the WOI, and that at the opening of the WOI decidualized features are widespread in stromal fibroblasts at the transcriptomic level. This lag of morphological signals relative to transcriptomic signals could result from a delay in phenotypic manifestation after transcription, due either to the inherent delay between transcription and translation or to post-transcriptional modifications.
The transcriptomic signature in luminal and glandular epithelium during epithelial gland formation was identified. The original definition of luminal and glandular epithelia was established based on the distinct morphology and physical location of the two. Their distinction at the transcriptome level had not previously been established. Markers were found that differentiate the two across multiple phases of the menstrual cycle. Moreover, signatures were discovered that are differentially up-regulated in glandular and luminal epithelium during the formation of epithelial glands. Epithelial glands create the proper biochemical milieu for embryo implantation and the subsequent development of pregnancy. In humans, however, the mechanism for their reconstruction during proliferative phases is unclear. Previous studies using clonogenic assays reported that the cyclic regeneration of both glandular and luminal epithelium was executed by progenitors with stemness characteristics in the unshed layer of the uterus (basalis) (Huang et al., 2012; Nguyen et al., 2017; W. C. et al., 1997). The present analysis suggests a mechanism that involves MET for re-epithelization, followed by tubulogenesis in the luminal epithelium as well as proliferation activities that are locally concentrated at the glandular epithelium for reformation of epithelial glands. The data, however, cannot rule out the possibility that the cells that re-epithelialize the endometrium are the progeny of previously reported candidates with stemness characteristics.
Lastly, evidence was provided for the direct interplay between stroma and lymphocytes during decidualization in the menstrual cycle. Analysis suggested that, during decidualization in cycling endometrium, stromal fibroblasts are directly responsible for the activation of lymphocytes through IL2-elicited pathways. The diversification of activating and inhibitory NKR in immune cells and the overall up-regulation of MHC class I molecules in stromal fibroblasts are particularly interesting. During pregnancy, cytotoxic NK cells are tolerant towards the semi-allogeneic fetus37 (Schmitt et al., 2007). This paradoxical phenomenon was hypothesized to be mediated by 1) the upregulation of the non-classical MHC class I molecule (HLA-G)38 (Apps et al., 2007), the ligand of the NK inhibitory receptor, and 2) the downregulation of classical MHC class I molecules (HLA-A, HLA-B)39, 40 (Moffett-King, 2002; Sivori et al., 2000) that engage with NK activating receptors. The results demonstrate that a similar suppression of NK cells with high cytotoxic potential occurs during the natural menstrual cycle, but is exerted by decidualized stromal fibroblasts.
In summary, the human endometrium was systematically characterized across the menstrual cycle from both a static and a dynamic perspective. The high resolution of the data and the analytical framework allowed previously unresolved questions that are centered on the tissue's receptivity to embryo implantation to be answered. These findings and the molecular signatures that were discovered provide conceptual foundations and practical molecular anchors for reproductive and clinical applications.
The following references are cited within the present Application. Each is incorporated herein by reference in its entirety.
This application claims priority to U.S. Provisional Patent Application No. 62/686,621, filed Jun. 18, 2018, the contents of which are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2019/037814 | 6/18/2019 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62686621 | Jun 2018 | US |