Genetic disorders and congenital abnormalities (also called birth defects) occur in about 3 to 5% of all live births (Robinson and Linden (1993)). In the United States, birth defects are the leading cause of infant mortality (Anderson et al. (1997)). Genetic disorders and congenital anomalies are associated with enormous medical-care costs and create a heavy psychological and emotional burden on those afflicted and/or their families (Czeizel et al. (1984); Centers for Disease Control (1989); Kaplan (1991); and Cuniff et al. (1995)).
Prenatal diagnosis has become an essential facet of clinical management of pregnancy itself, as well as a critical step toward the detection, prevention, and possible treatment of genetic disorders. Current prenatal diagnostic methods are typically limited to methods that rely on anatomical abnormalities, chromosomal anomalies, and single gene mutations as markers. Often, the biological mechanisms underlying fetal diseases are poorly understood, and therapeutic interventions are lacking.
The present invention encompasses the understanding that examination of fetal gene expression may provide an increased understanding of fetal development that may lead to novel therapies and diagnostic methods for fetal diseases and conditions. The present invention encompasses the discovery that genomic profiles (such as, for example, gene expression profiles) provide useful information that enables the design and implementation of therapies for fetal conditions and diseases. Such therapies would include therapies that can be applied prenatally, for example in utero, and/or shortly after birth. By employing genomic approaches, the inventors have created a new dimension to fetal diagnosis and treatment.
The present invention further encompasses the discovery of novel biomarkers for Down Syndrome (also known as Trisomy 21) that may afford diagnostic methods that are less invasive, require less biological material, and/or may be performed at earlier gestational ages than do current diagnostic methods.
In one aspect, provided are methods useful for identifying therapeutic agents for a fetal disease or condition, comprising the steps of obtaining a reference genomic profile; obtaining a test genomic profile from a sample of amniotic fluid and/or maternal blood, wherein the sample is obtained from a subject suffering from or carrying a fetus suffering from a fetal disease or condition; determining differences between the test genomic profile and the reference genomic profile; inputting the test genomic profile into a first computing machine; accessing a storage repository on a second computing machine, wherein the storage repository contains a set of stored genomic profiles of one or more cell line(s) that have each been contacted with a different agent, wherein each stored genomic profile is mapped to data representing a corresponding agent; generating, by a correlator executing on the first or the second computing machine, a correlation between each stored genomic profile and the test genomic profile; and selecting at least one agent whose corresponding genomic profile has a negative correlation score with the test genomic profile, the selected agent being likely to reduce the differences between the test genomic profile and the reference genomic profile. In some embodiments, the genomic profiles comprise information selected from the group consisting of mRNA levels (e.g., those obtained from gene expression profiling experiments), protein expression levels, DNA methylation patterns, metabolite profiles, and combinations thereof.
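By way of non-limiting illustration, the correlating and selecting steps described above might be sketched as follows. The sketch assumes that each genomic profile is represented as a list of per-gene values in the same gene order (for example, differences relative to the reference genomic profile), and it uses a Pearson correlation as the correlation score; the specification does not prescribe a particular correlation measure. The function names `pearson` and `select_agents` and the agent labels are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def select_agents(test_profile, stored_profiles, threshold=0.0):
    """Return (agent, score) pairs whose score is below `threshold`.

    `stored_profiles` maps an agent label to the stored profile of a cell
    line contacted with that agent (an assumed in-memory stand-in for the
    storage repository). Results are sorted most negative first.
    """
    scores = {
        agent: pearson(test_profile, profile)
        for agent, profile in stored_profiles.items()
    }
    selected = [(agent, s) for agent, s in scores.items() if s < threshold]
    return sorted(selected, key=lambda pair: pair[1])


# Hypothetical per-gene values; not data from the specification.
test_profile = [1.0, 2.0, -1.0, 0.5]
stored_profiles = {
    "agent_A": [-1.0, -2.0, 1.0, -0.5],  # mirrors the test profile
    "agent_B": [1.0, 2.0, -1.0, 0.5],    # matches the test profile
}
candidates = select_agents(test_profile, stored_profiles)
```

In this sketch, only `agent_A` is returned, because its stored profile is negatively correlated with the test profile; `agent_B` is positively correlated and is therefore excluded.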
In some embodiments, the fetal disease or condition is selected from the group consisting of twin-to-twin-transfusion syndrome (TTTS), gastroschisis, Down Syndrome, fetal structural anomalies, fetal congenital heart anomaly, fetal kidney anomalies, neural tube defects, and congenital diaphragmatic hernia. In some embodiments, the fetal disease or condition is Down Syndrome. In some embodiments, inventive methods further comprise testing activity of at least one identified agent for therapeutic effects in a fetal disease or condition. In some such embodiments, the at least one identified agent is tested for medical applications in utero.
In another aspect, the invention provides a method comprising a step of administering to a patient suffering from a fetal disease or condition an effective dose of an agent identified by methods of the present invention, such that symptoms of the fetal disease or condition are ameliorated. In some embodiments, the fetal disease or condition is selected from the group consisting of twin-to-twin-transfusion syndrome (TTTS), gastroschisis, Down Syndrome, fetal structural anomalies, fetal congenital heart anomaly, fetal kidney anomalies, neural tube defects, and congenital diaphragmatic hernia. In some embodiments in which the fetal disease or condition is Down Syndrome, the agent is selected from the group consisting of anti-oxidants (e.g., celastrol), ion channel modulators, G-protein signaling modulators, and combinations thereof. In some embodiments, the agent is a calcium channel blocker (e.g., verapamil, felodipine, nifedipine, combinations thereof, etc.). In some embodiments, the agent is selected from the group consisting of copper sulfate, 15-delta prostaglandin J2, blebbistatin, prochlorperazine, 17-dimethylamino-geldanamycin, butein, nordihydroguaiaretic acid, acetylsalicylic acid, 51825898, sirolimus, docosahexaenoic acid ethyl ester, diclofenac, mercaptopurine, indometacin, 5279552, 17-allylamino-geldanamycin, rottlerin, paclitaxel, pyrvinium, flufenamic acid, oligomycin, 5114445, resveratrol, Y-27632, carbamazepine, nitrendipine, fluphenazine, 5152487, prazosin, 5140203, cytochalasin B, vorinostat, MG-132, HNMPA-(AM)3, decitabine, U0125, nocodazole, 5224221, 3-hydroxy-DL-kynurenine, 5162773, oxaprozin, colforsin, exemestane, felodipine, HC toxin, 5213008, dimethyloxalylglycine, 5109870, calmidazolium, 5255229, derivatives thereof, and combinations thereof. In some embodiments, the effective dose of the agent is administered in utero and/or perinatally.
In yet another aspect, the invention provides methods for evaluating efficacy of a treatment for a fetal disease or condition. Such inventive methods comprise steps of (a) hybridizing RNA from an amniotic fluid and/or maternal blood sample from a subject suffering from or carrying a fetus suffering from a fetal disease or condition to at least one polynucleotide probe for at least one predetermined gene such that expression levels of at least one predetermined gene are obtained, wherein the sample is obtained from a subject to which the agent in step (b) has not been administered; (b) administering an agent to a subject suffering from the fetal disease or condition; (c) hybridizing RNA from an amniotic fluid and/or maternal blood sample from a subject suffering from or carrying a fetus suffering from a fetal disease or condition to at least one genetic probe for the same predetermined gene(s) from step (a) such that expression levels of the predetermined gene(s) are obtained, wherein the sample is obtained from a subject to which the agent has been administered; (d) comparing the gene expression levels of the predetermined genes obtained from steps (a) and (c); and (e) determining, based on the comparison, efficacy of the agent as a treatment for the fetal disease or condition. In some embodiments, the fetal disease or condition is selected from the group consisting of twin-to-twin-transfusion syndrome (TTTS), gastroschisis, Down Syndrome, fetal structural anomalies, fetal congenital heart anomaly, fetal kidney anomalies, neural tube defects, and congenital diaphragmatic hernia.
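By way of non-limiting illustration, the comparison of step (d) might be sketched as follows. The sketch assumes that the expression levels obtained in steps (a) and (c) are available as mappings from gene identifiers to positive numeric values. The optional reference levels and the "moved toward reference" summary are assumptions introduced for illustration only; the method above does not specify how efficacy is determined from the comparison. All names are hypothetical.

```python
from math import log2


def log2_fold_changes(pre, post):
    """Per-gene log2(post / pre) for genes measured in both samples.

    `pre` and `post` map gene identifiers to positive expression levels
    obtained before and after administration of the agent.
    """
    common = pre.keys() & post.keys()
    return {gene: log2(post[gene] / pre[gene]) for gene in sorted(common)}


def fraction_moved_toward_reference(pre, post, reference):
    """Fraction of shared genes whose level is closer to the reference
    level after administration than before (an illustrative summary,
    not a criterion taken from the specification)."""
    genes = pre.keys() & post.keys() & reference.keys()
    if not genes:
        raise ValueError("no genes are shared by all three profiles")
    closer = sum(
        1
        for gene in genes
        if abs(post[gene] - reference[gene]) < abs(pre[gene] - reference[gene])
    )
    return closer / len(genes)


# Hypothetical expression levels for two placeholder genes.
pre = {"GENE_A": 8.0, "GENE_B": 2.0}
post = {"GENE_A": 4.0, "GENE_B": 2.0}
reference = {"GENE_A": 4.0, "GENE_B": 2.0}

changes = log2_fold_changes(pre, post)
moved = fraction_moved_toward_reference(pre, post, reference)
```

In this sketch, `GENE_A` is halved after administration (a log2 fold change of -1) and reaches the reference level, while `GENE_B` is unchanged, so one of the two genes is counted as having moved toward the reference.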
In yet another aspect, the invention provides methods for diagnosing Down Syndrome. In some embodiments, such methods comprise steps of: providing a test sample comprising fetal RNA, wherein the fetal RNA is obtained from amniotic fluid and/or maternal blood obtained from a woman pregnant with a fetus with a known gender and gestational age, and wherein the test sample comprises a plurality of nucleic acid segments labeled with a detectable agent; providing a gene-expression array comprising a plurality of genetic probes, wherein each genetic probe is immobilized to a discrete spot on a substrate surface to form the array; providing a database comprising levels of mRNA expression established for trisomy 21 male and female fetuses at different gestational ages; contacting the array with the test sample under conditions to allow the nucleic acid segments in the sample to specifically hybridize to the genetic probes on the array; determining the binding of individual nucleic acid segments of the test sample to individual genetic probes immobilized on the array to obtain a binding pattern; establishing a gene expression pattern for the fetus; comparing the gene expression pattern of the fetus to the levels of mRNA expression in the database; and providing, based on the comparison, a diagnosis with respect to Down Syndrome.
In some embodiments, such methods comprise steps of: providing an amniotic fluid and/or maternal blood sample from a pregnant woman; hybridizing RNA from the sample to at least ten genetic probes for at least ten genes that are differentially expressed in trisomy 21 fetuses such that expression levels of the at least ten genes are obtained; and determining, based on the expression levels of the at least ten genes, a diagnosis with respect to Down Syndrome.
In yet another aspect, the invention provides gene expression microarrays for use in prenatal diagnostic applications for one or more particular fetal diseases or conditions. In many embodiments, inventive gene expression microarrays comprise a solid substrate, the substrate having a surface, and a plurality of genetic probes, wherein each genetic probe is immobilized to a discrete spot on the surface of the substrate to form an array and wherein at least a subset of the genetic probes comprise sequences from a predetermined set of genes that are differentially expressed in a fetal disease or condition for which prenatal diagnosis is desired. In some embodiments, the subset of the genes represented on the microarray comprises at least ten of the genes listed in Tables 2 and 4. In some embodiments, the fetal disease or condition is selected from the group consisting of twin-to-twin-transfusion syndrome (TTTS), gastroschisis, Down Syndrome, fetal structural anomalies, fetal congenital heart anomaly, fetal kidney anomalies, neural tube defects, and congenital diaphragmatic hernia.
In another aspect, the invention provides kits comprising inventive gene expression microarrays as described above, a database comprising baseline levels of mRNA expression established for karyotypically and developmentally normal male and normal female fetuses at different gestational ages, and instructions for using the array and database. In some embodiments, kits further comprise materials to extract fetal RNA from a sample of amniotic fluid obtained from a pregnant woman. In some embodiments, kits further comprise materials to extract RNA from a sample of blood obtained from a pregnant woman and instructions on how to distinguish fetal RNA from maternal RNA in the sample.
Throughout the specification, several terms are employed that are defined in the following paragraphs.
As used herein, the terms “about” and “approximately,” in reference to a number, are used to include numbers that fall within a range of 20%, 10%, 5%, or 1% in either direction (greater than or less than) of the number unless otherwise stated or otherwise evident from the context (except where such number would exceed 100% of a possible value).
As used herein, the term “administer” means giving something (such as, for example a therapy, treatment, compound, and/or dose thereof) to an individual. Routes of administration include, but are not limited to, topical (including epicutaneous, enema, eye drops, ear drops, intranasal, vaginal, etc.), enteral (including oral, feeding tube, rectal, etc.), parenteral (including intravenous, intraarterial, intramuscular, intracardiac, subcutaneous, intraosseous, intradermal, intrathecal, intraperitoneal, transdermal, transmucosal, inhalational), epidural, and intravitreal. Administration to a subject may result in the therapy, treatment, compound, dose thereof, etc. being applied in utero. In some embodiments, the individual to which something is administered is a pregnant woman. In some embodiments, the individual to which something is administered is a fetus. In some embodiments, administering to a fetus comprises administering to the pregnant woman carrying the fetus.
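By way of illustration only, the range encompassed by “about” for a given number and a given percentage might be computed as sketched below. The sketch assumes one reading of the exception for values that would exceed 100% of a possible value, namely that the upper bound is capped at the maximum possible value; the function name `about_range` and its parameters are hypothetical.

```python
def about_range(value, percent, max_value=None):
    """Return (lower, upper) bounds within `percent` of `value`.

    `percent` would typically be 20, 10, 5, or 1. If `max_value` is
    given, the upper bound is capped there (an assumed reading of the
    "would exceed 100% of a possible value" exception).
    """
    if percent < 0:
        raise ValueError("percent must be non-negative")
    delta = abs(value) * percent / 100
    lower = value - delta
    upper = value + delta
    if max_value is not None:
        upper = min(upper, max_value)
    return lower, upper


# "About 50" at 10% spans 45 to 55.
bounds = about_range(50, 10)
# "About 90%" at 20% would reach 108%, so the upper bound is capped at 100.
capped = about_range(90, 20, max_value=100)
```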
As used herein, the term “biomarker” refers to its meaning as understood in the art. The term can refer to an indicator that provides information about, among other things, a process, condition, developmental stage, or outcome of interest, e.g., a fetus's diagnosis with respect to Down Syndrome. In general, the value of such an indicator is correlated with a process, condition, developmental stage, or outcome of interest. The term “biomarker” can also refer to a molecule that is the subject of an assay or measurement the result of which provides information about a process, condition, developmental stage, or outcome of interest. For example, an elevated expression level of a particular gene can be an indicator that a subject has a particular condition. The expression level of the gene, an elevated expression level of the gene, and the gene expression product itself, can all be referred to as “biomarkers”.
As used herein, the term “client” or “client agent” when used in reference to a computing environment, may be used interchangeably with any one of the following terms: “client machine(s),” “client(s),” “client computer(s),” “client device(s),” “client computing device(s),” “client node(s),” “endpoint(s),” “endpoint node(s),” or a “second machine.” A client machine can, in some embodiments, be a computing device. A client machine can in some embodiments execute, operate or otherwise provide an application that can be any one of the following: software; a program; executable instructions; a web browser; a web-based client; a client-server application; a thin-client computing client; an ActiveX control; a Java applet; software related to voice over internet protocol (VoIP) communications like a soft IP telephone; an application for streaming video and/or audio; an application for facilitating real-time-data communications; an HTTP client; an FTP client; an Oscar client; a Telnet client; or any other type and/or form of executable instructions capable of executing on a client machine. In some embodiments, the client machine is a virtual machine 102C such as those manufactured by XenSolutions, Citrix Systems, IBM, VMware, or any other virtual machine able to implement the methods and systems described herein.
As used herein, the term “complementary” refers to nucleic acid sequences that base-pair according to the standard Watson-Crick complementary rules, or that are capable of hybridizing to a particular nucleic acid segment under relatively stringent conditions. Nucleic acid polymers are optionally complementary across only portions of their entire sequences.
As used herein, the term “differentially expressed” in reference to genes refers to the state of having a different expression pattern or level depending on the type of cell, tissue, and/or sample, from which the gene expression products are derived. “Differentially expressed” genes may be upregulated or downregulated in the cell, tissue, and/or samples as compared to controls. For example, a gene that is upregulated in samples obtained from a subject suffering from Down Syndrome as compared to a subject who is not can be said to be “differentially expressed.” As another example, a gene that is downregulated in samples from a subject that has undergone a developmental transition as compared to a subject who has not can also be said to be “differentially expressed.”
As used herein, the term “Down Syndrome” (also known as “Down's Syndrome” and “trisomy 21”) has its meaning as known in the art and refers to a disorder that results from extra genetic material from all or part of human chromosome 21.
As used herein, the term “effective dose” is used interchangeably with the term “effective amount” and refers to any dose or amount of a compound, composition, therapeutic agent, etc. that is sufficient to fulfill its intended purpose(s), i.e., a desired biological or medicinal response in a tissue or subject. For example, in certain embodiments of the present invention, the purpose(s) may be: to slow down or stop the progression, aggravation, or deterioration of the symptoms of a fetal disease or condition, to bring about amelioration of the symptoms of the fetal disease or condition, and/or to cure the fetal disease or condition.
As used herein, the terms “fluorophore”, “fluorescent moiety”, “fluorescent label”, “fluorescent dye” and “fluorescent labeling moiety” are used herein interchangeably. They refer to a molecule which, in solution and upon excitation with light of appropriate wavelength, emits light back. Numerous fluorescent dyes of a wide variety of structures and characteristics are suitable for use in the practice of this invention. Similarly, methods and materials are known for fluorescently labeling nucleic acids (see, for example, Haugland (1994)). In choosing a fluorophore, it is preferred that the fluorescent molecule absorbs light and emits fluorescence with high efficiency (i.e., high molar absorption coefficient and fluorescence quantum yield, respectively) and is photostable (i.e., it does not undergo significant degradation upon light excitation within the time necessary to perform the analysis).
As used herein, the term “gene” refers to a discrete nucleic acid sequence responsible for a discrete cellular product and/or performing one or more intracellular or extracellular functions. In some embodiments, the term “gene” refers to a nucleic acid that includes a portion encoding a protein and optionally encompasses regulatory sequences, such as promoters, enhancers, terminators, and the like, which are involved in the regulation of expression of the protein encoded by the gene of interest. Such gene and regulatory sequences may be derived from the same natural source, or may be heterologous to one another. In some embodiments, a gene does not encode proteins but rather provides templates for transcription of functional RNA molecules such as tRNAs, rRNAs, etc. Alternatively or additionally, in some embodiments, a gene may define a genomic location for a particular event/function, such as the binding of proteins and/or nucleic acids.
As used herein, the term “gene expression” refers to the conversion of the information, contained in a gene, into a gene product. A gene product can be the direct transcriptional product of a gene (e.g., mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA or any other type of RNA), or the product of subsequent downstream processing events (e.g., splicing, RNA processing, translation). In some embodiments, a gene product is a protein produced by translation of an mRNA. In some embodiments, gene products are RNAs that are modified by processes such as capping, polyadenylation, methylation, and editing, proteins post-translationally modified, and proteins modified by, for example, methylation, acetylation, phosphorylation, ubiquitination, ADP-ribosylation, myristoylation, and glycosylation.
As used herein, the term “gene expression array” refers to an array comprising a plurality of genetic probes immobilized on a substrate surface that can be used for quantitation of mRNA expression levels. In the context of the present invention, the term “array-based gene expression analysis” is used to refer to methods of gene expression analysis that use gene-expression arrays. The term “genetic probe”, as used herein, refers to a nucleic acid molecule of known sequence, which has its origin in a defined region of the genome and can be a short DNA sequence (or oligonucleotide), a PCR product, or mRNA isolate. Genetic probes are gene-specific DNA sequences to which nucleic acids from a test sample of amniotic fluid RNA are hybridized. Genetic probes specifically bind (or specifically hybridize) to nucleic acid of complementary or substantially complementary sequence through one or more types of chemical bonds, usually through hydrogen bond formation.
As used herein, the phrases “genomic profile” and “genomic signature” are used interchangeably to refer to a genome-wide profile in a given cell, cell type, tissue, individual, sample, condition, disease state, etc. A genomic profile or genomic signature is the genome-wide equivalent of a “molecular signature” and may refer to, among other things, a gene expression profile, a protein expression profile, a DNA methylation pattern, metabolite profiles, etc.
As used herein, the term “gestational age” refers to the age of an embryo or fetus as calculated from the first day of the mother's last menstrual period. In humans, the gestational age may count the period of time from about two weeks before fertilization takes place.
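By way of illustration only, gestational age as defined above might be computed from the first day of the last menstrual period as sketched below. The function name `gestational_age` and the example dates are hypothetical, and expressing the result in completed weeks and additional days is a common convention rather than a requirement of the definition.

```python
from datetime import date


def gestational_age(last_menstrual_period, on_date):
    """Return gestational age as (completed weeks, additional days),
    counted from the first day of the last menstrual period."""
    elapsed_days = (on_date - last_menstrual_period).days
    if elapsed_days < 0:
        raise ValueError("on_date precedes the last menstrual period")
    return divmod(elapsed_days, 7)


# Hypothetical dates: 98 days elapsed corresponds to 14 weeks, 0 days.
weeks, days = gestational_age(date(2024, 1, 1), date(2024, 4, 8))
```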
As used herein, the term “isolated” when applied to RNA means a molecule of RNA or a portion thereof, which (1) by virtue of its origin or manipulation, is separated from at least some of the components with which it was previously associated; or (2) was produced or synthesized by the hand of man.
As used herein, the terms “labeled”, “labeled with a detectable agent” and “labeled with a detectable moiety” are used interchangeably. They are used to specify that a nucleic acid molecule or individual nucleic acid segments from a sample can be visualized, for example, following binding (i.e., hybridization) to genetic probes. In hybridization methods, samples of nucleic acid segments may be detectably labeled before the hybridization reaction or a detectable label may be selected that binds to the hybridization product. Preferably, the detectable agent or moiety is selected such that it generates a signal which can be measured and whose intensity is related to the amount of hybridized nucleic acids. In array-based methods, the detectable agent or moiety is also preferably selected such that it generates a localized signal, thereby allowing spatial resolution of the signal from each spot on the array. Methods for labeling nucleic acid molecules are well known in the art (see below for a more detailed description of such methods). Labeled nucleic acid fragments can be prepared by incorporation of or conjugation to a label, that is directly or indirectly detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical, or chemical means. Suitable detectable agents include, but are not limited to: various ligands, radionuclides, fluorescent dyes, chemiluminescent agents, microparticles, enzymes, colorimetric labels, magnetic labels, and haptens. Detectable moieties can also be biological molecules such as molecular beacons and aptamer beacons.
As used herein, the term “messenger RNA” or “mRNA” refers to a form of RNA that serves as a template for protein biosynthesis. In many embodiments, the amount of a particular mRNA (i.e., having a particular sequence, and originating from a particular gene) reflects the extent to which the gene encoding the mRNA has been “expressed.”
As used herein, the terms “microarray,” “array” and “biochip” are used interchangeably and refer to an arrangement, on a substrate surface, of multiple nucleic acid molecules of known sequences. Each nucleic acid molecule is immobilized to a “discrete spot” (i.e., a defined location or assigned position) on the substrate surface. The term “microarray” more specifically refers to an array that is miniaturized so as to require microscopic examination for visual evaluation. Arrays used in the methods of the invention are preferably microarrays.
As used herein, the terms “nucleic acid” and “nucleic acid molecule” are used herein interchangeably. They refer to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise stated, encompass known analogs of natural nucleotides that can function in a similar manner as naturally occurring nucleotides. The terms encompass nucleic acid-like structures with synthetic backbones, as well as amplification products.
As used herein, the term “oligonucleotide” refers to usually short strings of DNA or RNA to be used as hybridizing probes or nucleic acid molecule array elements. These short stretches of sequence are often chemically synthesized. The size of the oligonucleotide depends on the function or use of the oligonucleotides. When used in microarrays for hybridization, oligonucleotides can comprise natural nucleic acid molecules or synthesized nucleic acid molecules and comprise between 5 and 150 nucleotides, preferably between about 15 and about 100 nucleotides, more preferably between 15 and 30 nucleotides and most preferably, between 18 and 25 nucleotides complementary to mRNA.
As used herein, the term “prenatal disease or condition” refers to any disease or condition that can affect fetuses. The term “prenatal disease or condition” encompasses diseases or conditions that have symptoms that manifest during fetal development and/or result in detectable changes at prenatal stages. Thus, for example, Down Syndrome, which affects adults, is considered a “prenatal disease or condition” because the syndrome results in detectable changes at prenatal stages.
As used herein, the term “RNA transcript” refers to the product resulting from transcription of a DNA sequence. When the RNA transcript is the original, unmodified product of an RNA polymerase-catalyzed transcription, it is referred to as the primary transcript. An RNA transcript that has been processed (e.g., spliced, etc.) will differ in sequence from the primary transcript; a fully processed transcript is referred to as a “mature” RNA. The term “transcription” refers to the process of copying a DNA sequence of a gene into an RNA product, generally conducted by a DNA-directed RNA polymerase using the DNA as a template. A processed RNA transcript that is translated into protein is often called a messenger RNA (mRNA).
As used herein, the term “statistically significant number” refers to a number of samples (analyzed or to be analyzed) that is large enough to provide reliable data.
As used herein, the terms “subject” and “individual” are used herein interchangeably. They refer to a human or another animal (e.g., mouse, rat, rabbit, dog, cat, cattle, swine, sheep, horse, or primate) that can be afflicted with or is susceptible to a disease, disorder, condition, or complication (e.g., Down Syndrome) but may or may not have the disease or disorder. In many embodiments, the subject is a human being. In many embodiments, the subject is a fetus. In some embodiments, the subject is a newborn.
As used herein, the term “suffering from” is used to describe subjects that have been diagnosed as having a particular disease or condition, whether or not the subject is experiencing symptoms typical of that disease or condition.
As used herein, the term “susceptible” means having an increased risk for and/or a propensity for something, e.g., a condition such as twin-to-twin transfusion syndrome (TTTS), gastroschisis, congenital diaphragmatic hernia, and/or Down Syndrome. The term takes into account that an individual “susceptible” for a condition may never be diagnosed with the condition.
The terms “therapeutic agent” and “drug” are used herein interchangeably. They refer to a bioactive substance, molecule, compound, agent, factor or composition effective in the treatment of a disease or clinical condition.
As used herein, the term “treatment” characterizes a method or process that is aimed at (1) delaying or preventing the onset of a disease or condition; (2) slowing down or stopping the progression, aggravation, or deterioration of one or more symptoms of the disease or condition; (3) bringing about ameliorations of the symptoms of the disease or condition; (4) reducing the severity or incidence of the disease or condition; or (5) curing the disease or condition. A treatment may be administered prior to the onset of the disease, for a prophylactic or preventive action. Alternatively or additionally, the treatment may be administered after initiation of the disease or condition, for a therapeutic action.
As used herein, the term “trisomy 21” describes the karyotypic condition in humans and in human cells of having an extra copy of chromosome 21. “Trisomy 21” is often used interchangeably with “Down Syndrome.” As used herein, the term “trisomy 21” fetus refers to a fetus that has undergone karyotyping and has been diagnosed as having an extra copy of chromosome 21.
As mentioned above, the present invention provides technologies for developing and/or evaluating therapies for fetal diseases and conditions and for diagnosis of fetal diseases and conditions such as Down Syndrome.
Fetal RNA can be obtained from biological samples such as amniotic fluid or maternal blood from pregnant women.
Notwithstanding the well-known instability of RNA, fetal RNA survives in amniotic fluid in amounts and in a condition appropriate for analysis. In some embodiments, inventive methods involve providing or obtaining a sample of amniotic fluid obtained from a pregnant woman. Amniotic fluid is generally collected by amniocentesis, in which a long needle is inserted through the mother's lower abdomen into the amniotic cavity inside the uterus to withdraw a certain volume of amniotic fluid.
For prenatal diagnosis, most amniocenteses are performed between the 14th and 20th weeks of pregnancy and the volume of amniotic fluid withdrawn is about 10 to 30 mL. Traditionally, the most common indications for amniocentesis include: advanced maternal age (typically set, in the US, at 35 years or more at the estimated time of delivery), previous child with a birth defect or genetic disorder, parental chromosomal rearrangement, family history of late-onset disorders with genetic components, recurrent miscarriages, positive maternal serum screening test (Multiple Marker Screening) documenting increased risk of fetal neural tube defects and/or fetal chromosomal abnormality, and abnormal fetal ultrasound examination (for example, revealing signs known to be associated with fetal aneuploidy). However, the amount and type of information that may be obtained from an amniotic fluid sample according to the present invention may support a change in standard operating procedure, such that amniocentesis is considered or performed in any pregnancy.
Amniocentesis is also performed for therapeutic purposes. In such cases, large amounts of amniotic fluid (>1 L) are removed (amnioreduction) to correct polyhydramnios (i.e., an excess of amniotic fluid surrounding the fetus). Polyhydramnios can represent a danger because of an increased risk of premature rupture of the membranes, and may also be a sign of birth defect or other medical problems such as gestational diabetes or fetal hydrops. Polyhydramnios is also observed in multiple gestations. Twin-to-twin transfusion syndrome (TTTS) is defined sonographically as the combined presence of an excess of amniotic fluid in one sac and an insufficiency of amniotic fluid in the other sac. In TTTS, the goal of the amnioreduction is to attempt to decrease the likelihood of miscarriage or preterm labor by reducing the amniotic fluid volume in the sac of the recipient twin.
In the context of the present invention, samples of amniotic fluid may be obtained after standard or therapeutic amniocentesis. In conventional amniocentesis procedures, fetal cells present in the amniotic fluid are isolated by centrifugation and grown in culture for chromosome analysis, biochemical analysis, and/or molecular biological analysis. Centrifugation also produces a supernatant sample (herein termed “remaining amniotic material”), which is usually stored at −20° C. as a back-up in case of assay failure. Aliquots of this supernatant may also be used for additional assays such as determination of alpha-fetoprotein and acetyl cholinesterase levels. After a certain period of time, the frozen supernatant sample is typically discarded. In amnioreductions, the entire sample of amniotic fluid withdrawn is discarded. The standard protocol followed by the Cytogenetics Laboratory at Tufts Medical Center (Boston, Mass.) provides the Applicants with fresh and frozen samples of amniotic fluid (from therapeutic amniocenteses) and fresh samples of remaining amniotic material (from diagnostic amniocenteses).
Maternal blood can be obtained more readily than amniotic fluid and contains fetal and placental mRNAs. The term “maternal blood” is used to refer to blood from the pregnant woman (as opposed to blood from the fetus). In some embodiments, inventive methods involve providing or obtaining a sample of maternal blood obtained from a pregnant woman. Blood samples may, for example, be whole blood samples, that is, samples that are not separated into components such as plasma or serum. Fetal transcripts found in whole blood differ from those found in plasma. Blood samples can be drawn using standard techniques well known in the art such as venipuncture.
In the practice of methods of the invention, fetal RNA may be isolated from a sample of amniotic fluid obtained from a pregnant woman. Isolation may be carried out by any suitable method of RNA isolation or extraction.
In certain embodiments, fetal RNA is obtained by treating a sample of amniotic fluid such that fetal RNA present in the amniotic fluid sample is extracted. In certain embodiments, fetal RNA is extracted after removal of substantially all or some of the cell populations present in the sample of amniotic fluid. The cell populations may be removed from the sample by any suitable method, for example, by centrifugation. More than one centrifugation step may be performed to ensure that substantially all cell populations have been removed. In some embodiments, the cell populations are removed within two hours of obtaining the sample. In some embodiments, the cell populations are removed immediately after obtaining the sample of amniotic fluid.
When substantially all cell populations are removed from the sample, amniotic fluid fetal RNA consists essentially of cell-free fetal RNA. When extracted from a sample of remaining amniotic material obtained by centrifugation, fetal RNA comprises cell-free fetal RNA as well as fetal RNA from the cells still present in the remaining material.
Fetal RNA may also be obtained by isolating cells from the sample of amniotic fluid, optionally cultivating these isolated cells, and extracting RNA from the cells. In such cases, amniotic fluid fetal RNA consists essentially of fetal RNA from the cultured cells.
In some embodiments of the invention, fetal RNA is obtained from whole maternal blood, which contains a mixture of maternal mRNAs as well as fetal and placental mRNAs. In such embodiments, fetal and/or placental RNA is not isolated from maternal mRNAs. Rather, certain sets of transcripts are known to be expressed only from fetuses and/or by the placenta. (See, e.g., Maron et al. (2007), the entire contents of which are herein incorporated by reference.) Such fetal biomarkers allow analyses of fetal mRNAs within a sample also containing maternal mRNAs. In some such embodiments, computational methods for decomposing signals from different sources are used to help to distinguish fetal RNA from maternal RNA.
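One simple way to picture a computational decomposition of signals from two sources is a two-component mixing model. The sketch below is only an illustration under stated assumptions and is not the method of the invention: it assumes that reference expression values for a purely fetal/placental source and a purely maternal source are available for the same transcripts, and that the observed signal is a linear mixture of the two. The function name and all numeric values are hypothetical.

```python
def estimate_fetal_fraction(observed, fetal_ref, maternal_ref):
    """Least-squares estimate of f in the assumed linear model
    observed[i] ~= f * fetal_ref[i] + (1 - f) * maternal_ref[i].
    The result is clipped to the interval [0, 1]."""
    if not (len(observed) == len(fetal_ref) == len(maternal_ref)):
        raise ValueError("all profiles must cover the same transcripts")
    numerator = 0.0
    denominator = 0.0
    for obs, fet, mat in zip(observed, fetal_ref, maternal_ref):
        diff = fet - mat                 # how strongly this transcript separates the sources
        numerator += (obs - mat) * diff
        denominator += diff * diff
    if denominator == 0.0:
        raise ValueError("reference profiles are identical; sources cannot be separated")
    return min(1.0, max(0.0, numerator / denominator))


# Hypothetical three-transcript example mixed at a fetal fraction of 0.25.
fetal = [10.0, 0.0, 5.0]
maternal = [0.0, 8.0, 5.0]
mixed = [0.25 * f + 0.75 * m for f, m in zip(fetal, maternal)]
fraction = estimate_fetal_fraction(mixed, fetal, maternal)
```

Transcripts expressed at the same level in both sources (the third transcript here) contribute nothing to the estimate, which is consistent with relying on transcripts expressed only by the fetus and/or placenta.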
In some embodiments, before isolation or extraction of fetal RNA, the sample of amniotic fluid material or whole maternal blood is stored for a certain period of time under suitable storage conditions. In some embodiments, suitable storage conditions comprise temperatures ranging from about 10° C. to about −220° C., inclusive. In some embodiments, samples are stored at about 4° C., at about −10° C., at about −20° C., at about −70° C., or at about −80° C. In some embodiments, samples are stored for less than about 28 days. In some embodiments, samples are stored for more than about twenty-four hours. In some embodiments, before freezing, an RNase inhibitor, which prevents degradation of fetal RNA by RNases (i.e., ribonucleases), is added to the sample. In some embodiments, the RNase inhibitor is added within two hours of obtaining the sample. In some embodiments, the RNase inhibitor is added within one hour of obtaining the sample. In some embodiments, the RNase inhibitor is added within thirty minutes of obtaining the sample. In some embodiments, the RNase inhibitor is added within ten minutes of obtaining the sample. In some embodiments, the RNase inhibitor is added within five minutes of obtaining the sample. In some embodiments, the RNase inhibitor is added within two minutes of obtaining the sample. In some embodiments, the RNase inhibitor is added immediately after obtaining the sample. In some embodiments, before RNA extraction, the frozen sample is thawed at 37° C. and mixed by vortexing.
In some embodiments, the sample is frozen (e.g., flash-frozen in liquid nitrogen and dry ice), stored, and thawed; then an RNase inhibitor is added after thawing. In some such embodiments, the RNase inhibitor is added within two hours of thawing. In some embodiments, the RNase inhibitor is added within one hour of thawing. In some embodiments, the RNase inhibitor is added within thirty minutes of thawing. In some embodiments, the RNase inhibitor is added within ten minutes of thawing. In some embodiments, the RNase inhibitor is added within five minutes of thawing. In some embodiments, the RNase inhibitor is added within two minutes of thawing.
The most commonly used RNase inhibitor is a natural protein derived from human placenta that specifically (and reversibly) binds RNases (Blackburn et al. (1977)). RNase inhibitors are commercially available, for example, from Ambion (Austin, Tex.; as SUPERase•In™), Promega, Inc. (Madison, Wis.; as rRNasin® Ribonuclease Inhibitor) and Applied Biosystems (Framingham, Mass.). In general, precautions for preventing RNase contamination in RNA samples, which are well known in the art and include the use of gloves, of certified RNase-free reagents and ware, of specifically treated water and of low temperatures, as well as routine decontamination and the like, are used in the practice of the methods of the invention.
For amniotic fluid samples, isolating fetal RNA may include treating the sample such that fetal RNA present in the sample is extracted and made available for analysis. Any suitable isolation method that results in extracted amniotic fluid fetal RNA may be used in the practice of the invention. For maternal whole blood samples, fetal RNA may be extracted together with maternal RNA, but some fetal mRNAs will be distinguishable from maternal RNA (Maron et al. (2007)).
In order to obtain the most accurate assessment of the fetus, it may be desirable to minimize artifacts from manipulation processes. Therefore, the number of extraction and modification steps is in some embodiments kept as low as possible.
Methods of RNA extraction are well known in the art (see, for example, Sambrook et al., (1989)). Most methods of RNA isolation from bodily fluids or tissues are based on the disruption of the tissue in the presence of protein denaturants to quickly and effectively inactivate RNases. Generally, RNA isolation reagents comprise, among other components, guanidinium thiocyanate and/or beta-mercaptoethanol, which are known to act as RNase inhibitors (Chirgwin et al. (1979)). Isolated total RNA is then further purified from the protein contaminants and concentrated by selective ethanol precipitations, phenol/chloroform extractions followed by isopropanol precipitation (see, for example, Chomczynski and Sacchi (1987)) or cesium chloride, lithium chloride or cesium trifluoroacetate gradient centrifugations (see, for example, Glisin et al. (1974); and Stern and Newton (1986)).
In certain methods of the invention, for example those wherein fetal RNA is subjected to a gene-expression analysis, it may be desirable to isolate mRNA from total RNA in order to allow the detection of even low level messages (Alberts et al. (1994)).
Purification of mRNA from total RNA typically relies on the poly(A) tail present on most mature eukaryotic mRNA species. Several variations of isolation methods have been developed based on the same principle. In a first approach, a solution of total RNA is passed through a column containing oligo(dT) or d(U) attached to a solid cellulose matrix in the presence of high concentrations of salts to allow the annealing of the poly(A) tail to the oligo(dT) or d(U). The column is then washed with a lower salt buffer to remove and release the poly(A) mRNAs. In a second approach, a biotinylated oligo(dT) primer is added to the solution of total RNA and used to hybridize to the 3′ poly(A) region of the mRNAs. The hybridization products are captured and washed at high stringency using streptavidin coupled to paramagnetic particles and a magnetic separation stand. The mRNA is eluted from the solid phase by the simple addition of ribonuclease-free deionized water. Other approaches do not require the prior isolation of total RNA. For example, uniform, superparamagnetic, polystyrene beads with oligo(dT) sequences covalently bound to the surface may be used to isolate mRNA directly by specific base pairing between the poly(A) residues of mRNA and the oligo(dT) sequences on the beads. Furthermore, the oligo(dT) sequence on the beads may also be used as a primer for the reverse transcriptase to subsequently synthesize the first strand of cDNA. Alternatively, new methods or improvements of existing methods for total RNA or mRNA isolation, preparation and purification may be devised by one skilled in the art and used in the practice of the methods of the invention.
Numerous different and versatile kits can be used to extract RNA (i.e., total RNA or mRNA) from bodily fluids and are commercially available from, for example, Ambion, Inc. (Austin, Tex.), Amersham Biosciences (Piscataway, N.J.), BD Biosciences Clontech (Palo Alto, Calif.), BioRad Laboratories (Hercules, Calif.), Dynal Biotech Inc. (Lake Success, N.Y.), Epicentre Technologies (Madison, Wis.), Gentra Systems, Inc. (Minneapolis, Minn.), GIBCO BRL (Gaithersburg, Md.), Invitrogen Life Technologies (Carlsbad, Calif.), MicroProbe Corp. (Bothell, Wash.), Organon Teknika (Durham, N.C.), Promega, Inc. (Madison, Wis.), and Qiagen Inc. (Valencia, Calif.). For example, the RNAprotect Amniotic fluid Kit (Qiagen) may be used to extract fetal RNA from amniotic fluid. Similarly, the QIAamp DNA Blood Mini Kit (Qiagen), QIAamp DNA Blood Maxi Kit (Qiagen), and PaxGene blood RNA kit (PreAnalytiX) can be used to extract RNA from maternal whole blood. User Guides that describe in great detail the protocol to be followed are usually included in all these kits. Sensitivity, processing time and cost may be different from one kit to another. One of ordinary skill in the art can easily select the kit(s) most appropriate for a particular situation.
In certain embodiments, RNA (for example, fetal RNA from amniotic fluid or fetal and maternal RNA from whole maternal blood) is amplified before being analyzed. In some embodiments, before analysis, the RNA is converted, by reverse-transcriptase, into complementary DNA (cDNA), which, optionally, may, in turn, be converted into complementary RNA (cRNA) by transcription.
Amplification methods are well known in the art (see, for example, Kimmel and Berger (1987); Sambrook et al. (1989); Ausubel (Ed.) (2002); and U.S. Pat. Nos. 4,683,195; 4,683,202 and 4,800,159). Standard nucleic acid amplification methods include: polymerase chain reaction (or PCR, see, for example, Innis (Ed.) (1990); and Innis (Ed.) (1995)); and ligase chain reaction (or LCR, see, for example, Landegren et al. (1988); and Barringer et al. (1990)).
Methods for transcribing RNA into cDNA are also well known in the art. Reverse transcription reactions may be carried out using non-specific primers, such as an anchored oligo-dT primer, or random sequence primers, or using a target-specific primer complementary to the RNA for each genetic probe being monitored, or using thermostable DNA polymerases (such as avian myeloblastosis virus reverse transcriptase or Moloney murine leukemia virus reverse transcriptase). Other methods include transcription-based amplification system (TAS) (see, for example, Kwoh et al. (1989)), isothermal transcription-based systems such as Self-Sustained Sequence Replication (3SR) (see, for example, Guatelli et al. (1990)), and Q-beta replicase amplification (see, for example, Smith et al. (1997); and Burg et al. (1996)).
The cDNA products resulting from these reverse transcriptase methods may serve as templates for multiple rounds of transcription by the appropriate RNA polymerase (for example, by nucleic acid sequence based amplification or NASBA, see, for example, Kievits et al. (1991); and Greijer et al. (2001)). Transcription of the cDNA template rapidly amplifies the signal from the original target mRNA.
These methods as well as others (either known or newly devised by one skilled in the art) may be used in the practice of the invention.
Amplification can also be used to quantify the amount of extracted RNA (see, for example, U.S. Pat. No. 6,294,338). Alternatively or additionally, amplification using appropriate oligonucleotide primers can be used to label cell-free RNA prior to analysis (see below). Suitable oligonucleotide amplification primers can easily be selected and designed by one skilled in the art.
In certain embodiments, RNA (for example, after amplification, or after conversion to cDNA or to cRNA) is labeled with a detectable agent or moiety before being analyzed. The role of a detectable agent is to facilitate detection of fetal RNA or to allow visualization of hybridized nucleic acid fragments (e.g., nucleic acid fragments bound to genetic probes). In some embodiments, the detectable agent is selected such that it generates a signal which can be measured and whose intensity is related to the amount of labeled nucleic acids present in the sample being analyzed. In array-based analysis methods, the detectable agent is also in some embodiments selected such that it generates a localized signal, thereby allowing spatial resolution of the signal from each spot on the array.
The association between the nucleic acid molecule and detectable agent can be covalent or non-covalent. Labeled nucleic acid fragments can be prepared by incorporation of or conjugation to a detectable moiety. Labels can be attached directly to the nucleic acid fragment or indirectly through a linker. Linkers or spacer arms of various lengths are known in the art and are commercially available, and can be selected to reduce steric hindrance, or to confer other useful or desired properties to the resulting labeled molecules (see, for example, Mansfield et al. (1995)).
Methods for labeling nucleic acid molecules are well-known in the art. For a review of labeling protocols, label detection techniques and recent developments in the field, see, for example, Kricka (2002); van Gijlswijk et al. (2001); and Joos et al. (1994). Standard nucleic acid labeling methods include: incorporation of radioactive agents, direct attachment of fluorescent dyes (see, for example, Smith et al. (1985)) or of enzymes (see, for example, Connolly and Rider (1985)); chemical modifications of nucleic acid fragments making them detectable immunochemically or by other affinity reactions (see, for example, Broker et al., (1978); Bayer et al. (1980); Langer et al. (1981); Richardson et al. (1983); Brigati et al. (1983); Tchen et al. (1984); Landegent et al. (1984); and Hopman et al. (1987)); and enzyme-mediated labeling methods, such as random priming, nick translation, PCR and tailing with terminal transferase (for a review on enzymatic labeling, see, for example, Temsamani and Agrawal (1996)). More recently developed nucleic acid labeling systems include, but are not limited to: ULS (Universal Linkage System; see, for example, Wiegant et al. (1999)), photoreactive azido derivatives (see, for example, Neves et al. (2000)), and alkylating agents (see, for example, Sebestyen et al. (1998)).
Any of a wide variety of detectable agents can be used in the practice of the present invention. Suitable detectable agents include, but are not limited to: various ligands, radionuclides (such as, for example, 32P, 35S, 3H, 14C, 125I, 131I and the like); fluorescent dyes (for specific exemplary fluorescent dyes, see below); chemiluminescent agents (such as, for example, acridinium esters, stabilized dioxetanes and the like); microparticles (such as, for example, quantum dots, nanocrystals, phosphors and the like); enzymes (such as, for example, those used in an ELISA, i.e., horseradish peroxidase, beta-galactosidase, luciferase, alkaline phosphatase); colorimetric labels (such as, for example, dyes, colloidal gold and the like); magnetic labels (such as, for example, Dynabeads™); and biotin, digoxigenin or other haptens and proteins for which antisera or monoclonal antibodies are available.
In certain embodiments, fetal amniotic fluid RNA (after amplification, or conversion to cDNA or to cRNA) is fluorescently labeled. Numerous known fluorescent labeling moieties of a wide variety of chemical structures and physical characteristics are suitable for use in the practice of this invention. Suitable fluorescent dyes include, but are not limited to: Cy-3™, Cy-5™, Texas red, FITC, phycoerythrin, rhodamine, fluorescein, fluorescein isothiocyanine, carbocyanine, merocyanine, styryl dye, oxonol dye, BODIPY dye (i.e., boron dipyrromethene difluoride fluorophore, see, for example, C. S. Chen et al., J. Org. Chem. 2000, 65: 2900-2906; Chen et al. (2000); U.S. Pat. Nos. 4,774,339; 5,187,288; 5,227,487; 5,248,782; 5,614,386; 5,994,063; and 6,060,324), and equivalents, analogues, derivatives or combinations of these molecules. Similarly, methods and materials are known for linking or incorporating fluorescent dyes to biomolecules such as nucleic acids (see, for example, Haugland (1994)). Fluorescent labeling dyes as well as labeling kits are commercially available from, for example, Amersham Biosciences, Inc. (Piscataway, N.J.), Molecular Probes, Inc. (Eugene, Oreg.), and New England Biolabs, Inc. (Beverly, Mass.).
Favorable properties of fluorescent labeling agents to be used in the practice of the invention include high molar absorption coefficient, high fluorescence quantum yield, and photostability. Some labeling fluorophores exhibit absorption and emission wavelengths in the visible (i.e., between 400 and 750 nm) rather than in the ultraviolet range of the spectrum (i.e., lower than 400 nm).
In some embodiments, RNA (for example, after amplification or conversion to cDNA or cRNA) is made detectable through one of the many variations of the biotin-avidin system, which are well known in the art. Biotin RNA labeling kits are commercially available, for example, from Roche Applied Science (Indianapolis, Ind.), Perkin Elmer (Boston, Mass.), and NuGEN (San Carlos, Calif.).
Detectable moieties can also be biological molecules such as molecular beacons and aptamer beacons. Molecular beacons are nucleic acid molecules carrying a fluorophore and a non-fluorescent quencher on their 5′ and 3′ ends. In the absence of a complementary nucleic acid strand, the molecular beacon adopts a stem-loop (or hairpin) conformation, in which the fluorophore and quencher are in close proximity to each other, causing the fluorescence of the fluorophore to be efficiently quenched by FRET (i.e., fluorescence resonance energy transfer). Binding of a complementary sequence to the molecular beacon results in the opening of the stem-loop structure, which increases the physical distance between the fluorophore and quencher thus reducing the FRET efficiency and allowing emission of a fluorescence signal. The use of molecular beacons as detectable moieties is well-known in the art (see, for example, Sokol et al., (1998); and U.S. Pat. Nos. 6,277,581 and 6,235,504). Aptamer beacons are similar to molecular beacons except that they can adopt two or more conformations (see, for example, Kaboev et al. (2000); Yamamoto et al. (2000); Hamaguchi et al. (2001); Poddar and Le (2001)).
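The dependence of quenching on fluorophore-quencher distance described above can be illustrated with the standard Förster relation, in which transfer efficiency falls off with the sixth power of distance. The sketch below is illustrative only; the Förster radius and the distances used are hypothetical values, not properties of any particular molecular beacon.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Forster relation: E = 1 / (1 + (r / R0)**6), where r is the
    fluorophore-quencher distance and R0 is the distance at which
    transfer efficiency is 50%."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Hypothetical values: closed hairpin (short distance) vs. opened beacon.
R0 = 5.0                                  # assumed Forster radius, in nm
closed = fret_efficiency(2.0, R0)         # fluorophore close to quencher
opened = fret_efficiency(10.0, R0)        # stem-loop opened by target binding
```

With these assumed numbers, efficiency is close to 1 in the closed conformation and drops to a few percent once the stem-loop opens, which is why hybridization produces a measurable fluorescence signal.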
A “tail” of normal or modified nucleotides may also be added to nucleic acid fragments for detectability purposes. A second hybridization with nucleic acid complementary to the tail and containing a detectable label (such as, for example, a fluorophore, an enzyme or bases that have been radioactively labeled) allows visualization of the nucleic acid fragments bound to the array (see, for example, system commercially available from Enzo Biochem Inc., New York, N.Y.).
The selection of a particular nucleic acid labeling technique may depend on the situation and may be governed by several factors, such as the ease and cost of the labeling method, the quality of sample labeling desired, the effects of the detectable moiety on the hybridization reaction (e.g., on the rate and/or efficiency of the hybridization process), the nature of the detection system to be used, the nature and intensity of the signal generated by the detectable label, and the like.
II. Analysis of RNA from Amniotic Fluid or Whole Maternal Blood
According to the present invention, RNA such as fetal RNA from amniotic fluid or RNA from whole maternal blood can be analyzed to obtain information regarding fetal gene expression. In certain embodiments, analyzing the RNA comprises determining the quantity, concentration or sequence composition of RNA.
RNA may be analyzed by any of a variety of methods. Methods of analysis of RNA are well-known in the art (see, for example, Sambrook et al. (1989); and Ausubel (Ed.) (2002)).
For example, the quantity and concentration of RNA extracted from amniotic fluid or whole maternal blood samples may be evaluated by UV spectroscopy, wherein the absorbance of a diluted RNA sample is measured at 260 and 280 nm (Wilfinger et al. (1997)). Quantitative measurements may also be carried out using certain fluorescent dyes, such as, for example, RiboGreen® (commercially available from Molecular Probes, Eugene, Oreg.), which exhibit a large fluorescence enhancement when bound to nucleic acids. RNA labeled with these fluorescent dyes can be detected using standard fluorometers, fluorescence microplate readers or filter fluorometers. Another method for analyzing quantity and quality of RNA samples is through use of a BioAnalyzer (commercially available from Agilent Technologies, Foster City, Calif.), which separates charged biological molecules (such as nucleic acids) using microfluidic technologies and then uses a laser to excite intercalating fluorescent dyes.
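The UV measurement described above can be turned into a concentration and a purity estimate with a short calculation. The sketch below assumes the widely used convention that one absorbance unit at 260 nm corresponds to about 40 µg/mL of single-stranded RNA in a 1 cm path length cuvette; the function names and the example readings are hypothetical.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # conventional factor for single-stranded RNA, 1 cm path


def rna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """Concentration of the undiluted sample from the A260 of the diluted sample."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("a260 must be >= 0 and dilution_factor must be > 0")
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio, commonly used as a rough indicator of protein contamination."""
    if a280 <= 0:
        raise ValueError("a280 must be > 0")
    return a260 / a280


# Hypothetical readings for a sample diluted 1:10 before measurement.
concentration = rna_concentration_ug_per_ml(a260=0.5, dilution_factor=10)
ratio = purity_ratio(a260=0.5, a280=0.25)
```

A ratio near 2.0 is generally taken to indicate RNA that is largely free of protein, although the exact value depends on the pH and ionic strength of the diluent.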
RNA may also be analyzed through sequencing. For example, RNase T1, which cleaves single-stranded RNA specifically at the 3′-side of guanosine residues in a two-step mechanism, may be used to digest denatured RNA. Partial digestion of 3′ or 5′ labeled RNA with this enzyme thus generates a ladder of G residues. The cleavage can be monitored by radioactive (Ikehara et al. (1986)) and photometric (Grunert et al. (1993)) detection systems, by zymogram assay (Bravo et al. (1994)), agar diffusion test (Quaas et al. (1989)), lanthan assay (Anfinsen et al. (1954)) or methylene blue test (Greiner-Stoeffele et al. (1996)) or by fluorescence correlation spectroscopy (Korn et al. (2000)).
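The "ladder of G residues" can be pictured by listing the lengths of the labeled fragments that a partial RNase T1 digest of a 5′-labeled RNA would produce. The sketch below is a simplified illustration that assumes ideal cleavage after every guanosine and ignores secondary structure; the sequence is hypothetical.

```python
def t1_ladder_lengths(rna_sequence):
    """Lengths of the 5'-labeled fragments expected from partial RNase T1
    digestion, assuming cleavage on the 3' side of every G. A G at the
    3' terminus has no downstream bond to cleave and is therefore skipped."""
    sequence = rna_sequence.upper()
    invalid = set(sequence) - set("ACGU")
    if invalid:
        raise ValueError(f"unexpected characters in RNA sequence: {sorted(invalid)}")
    # Position i (1-based) of each internal G equals the length of the
    # labeled fragment that ends at that G.
    return [i for i, base in enumerate(sequence[:-1], start=1) if base == "G"]


# Hypothetical 8-nucleotide RNA labeled at its 5' end.
ladder = t1_ladder_lengths("AUGGCAGU")
```

When the digest is resolved by size, each band in the ladder marks the position of one G in the sequence, read from the labeled end.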
Other methods for analyzing RNA include northern blots, wherein the components of the RNA sample being analyzed are resolved by size prior to detection thereby allowing identification of more than one species simultaneously, and slot/dot blots, wherein unresolved mixtures are used.
In certain embodiments, analyzing the RNA comprises submitting the extracted RNA to a gene-expression analysis. In some embodiments, this includes the simultaneous analysis of multiple genes.
For example, analysis of RNA may include detecting the presence of and/or quantitating an RNA transcribed from a gene known or suspected to be involved in a fetal anomaly such as congenital diaphragmatic hernia or condition such as Down Syndrome. In some embodiments, such genes are differentially expressed in the fetal disease or condition.
In analyses carried out to detect the presence or absence of RNA transcribed from a specific gene, the detection may be performed by any of a variety of physical, immunological and biochemical methods. Such methods are well-known in the art, and include, for example, protection from enzymatic degradation such as S1 analysis and RNase protection assays, in which hybridization to a labeled nucleic acid probe is followed by enzymatic degradation of single-stranded regions of the probe and analysis of the amount and length of probe protected from degradation.
In some embodiments of the invention, real time RT-PCR methods are employed that allow quantification of RNA transcripts and viewing of the increase in amount of nucleic acid as it is amplified. The TaqMan assay, a quenched fluorescent dye system, may also be used to quantitate targeted mRNA levels (see, for example Livak et al. (1995)).
In some embodiments of the invention involving methods that allow quantification of RNA transcripts (such as real time RT-PCR), the expression levels of housekeeping genes are used as normalization controls. Examples of housekeeping genes include GAPDH, 18S rRNA, beta-actin, cyclophilin, tubulin, etc.
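Normalization to a housekeeping gene is often carried out with the comparative threshold cycle (2^-ΔΔCt) calculation. The text does not prescribe a particular calculation, so the sketch below is offered only as one common possibility; it assumes that the target and housekeeping amplicons amplify with approximately 100% efficiency, and the Ct values shown are hypothetical.

```python
def relative_expression(ct_target_test, ct_housekeeping_test,
                        ct_target_control, ct_housekeeping_control):
    """Fold change of a target transcript in a test sample relative to a
    control sample, each normalized to a housekeeping gene (e.g., GAPDH),
    using the comparative Ct calculation 2 ** -(delta-delta Ct)."""
    delta_ct_test = ct_target_test - ct_housekeeping_test
    delta_ct_control = ct_target_control - ct_housekeeping_control
    delta_delta_ct = delta_ct_test - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the test sample while the housekeeping gene is unchanged.
fold_change = relative_expression(
    ct_target_test=24.0, ct_housekeeping_test=18.0,
    ct_target_control=26.0, ct_housekeeping_control=18.0,
)
```

Because each cycle corresponds to roughly a doubling of product under the stated assumption, a difference of two cycles corresponds to an approximately four-fold difference in starting transcript abundance.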
Other methods are based on the analysis of cDNA derived from mRNA, which is less sensitive to degradation than RNA and therefore easier to handle. These methods include, but are not limited to, sequencing cDNA inserts of an expressed sequence tag (EST) clone library (see, for example, Adams et al. (1991)) and serial analysis of gene expression (or SAGE), which allows quantitative and simultaneous analysis of a large number of transcripts (see, for example, U.S. Pat. No. 5,866,330; V. E. Velculescu et al. (1995); and Zhang et al. (1997)). These two methods survey the whole spectrum of mRNA in a sample rather than focusing on a predetermined set.
Other methods of analysis of cDNA derived from mRNA include reverse transcriptase-mediated PCR (RT-PCR) gene expression assays. These methods are directed at specific target gene products and allow the qualitative (non-quantitative) detection of transcripts of very low abundance (see, for example, Su et al. (1997)). A variation of these methods, called competitive RT-PCR, in which a known amount of exogenous template is added as internal control, has been developed to allow quantitative measurements (see, for example, Becker-André and Hahlbrock (1989); Wang et al. (1989); and Gilliland et al. (1990)).
mRNA analysis may also be performed by differential display reverse transcriptase PCR (DDRT-PCR; see, for example, Liang and Pardee (1992)) or RNA arbitrarily primed PCR (RAP-PCR; see, for example, Welsh et al. (1992); and McClelland et al. (1993)). In these methods, RT-PCR fingerprint profiles of transcripts are generated by random priming and differentially expressed genes appear as changes in the fingerprint profiles between two samples. Identification of a differentially expressed gene requires further manipulation (i.e., the appropriate band of the gel must be excised, subcloned, sequenced and matched to a gene in a sequence database).
In certain embodiments, the methods of the invention include submitting fetal amniotic fluid RNA or RNA from whole maternal blood to an array-based gene expression analysis.
Traditional molecular biology methods, such as most of those described above, typically assess one gene per experiment, which significantly limits the overall throughput and prevents gaining a broad picture of gene function. Technologies based on DNA array or microarray (also called gene expression microarray), which were developed more recently, offer the advantage of allowing the monitoring of thousands of genes simultaneously through identification of sequence (gene/gene mutation) and determination of gene expression level (abundance) of genes (see, for example, Marshall and Hodgson (1998); Ramsay (1998); Ekins and Chu (1999); and Lockhart and Winzeler (2000)).
In a gene expression experiment, labeled cDNA or cRNA targets derived from the mRNA of an experimental sample are hybridized to nucleic acid probes immobilized to a solid support. By monitoring the amount of label associated with each DNA location, it is possible to infer the abundance of each mRNA species represented.
There are two standard types of DNA microarray technology in terms of the nature of the arrayed DNA sequence. In the first format, probe cDNA sequences (typically 500 to 5,000 bases long) are immobilized to a solid surface and exposed to a plurality of targets either separately or in a mixture. In the second format, oligonucleotides (typically 20-80-mer oligos) or peptide nucleic acid (PNA) probes are synthesized either in situ (i.e., directly on-chip) or by conventional synthesis followed by on-chip attachment, and then exposed to labeled samples of nucleic acids.
The analyzing step in the methods of the invention can be performed using any of a variety of methods, means and variations thereof for carrying out array-based gene expression analysis. Array-based gene expression methods are known in the art and have been described in numerous scientific publications as well as in patents (see, for example, Schena et al. (1995); Schena et al. (1996); and Chen et al. (1998); U.S. Pat. Nos. 5,143,854; 5,445,934; 5,807,522; 5,837,832; 6,040,138; 6,045,996; 6,284,460; and 6,607,885).
In the practice of the present invention, these methods as well as other methods known in the art for carrying out array-based gene expression analysis may be used as described or modified such that they allow fetal mRNA levels of gene expression to be evaluated.
In many embodiments, a test genomic profile comprising information about at least a subset of genes in a given genome for a test sample is used. In some embodiments, the test genomic profile contains information about at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400, 420, 440, 460, 480, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2200, 2400, 2600, 2800, 3000, 3200, 3400, 3600, 3800, 4000, 4200, 4400, 4600, 4800, 5000, 5200, 5400, 5600, 5800, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10000, 11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000, 20000, or more genes.
In some embodiments, the test genomic profile is a test gene expression profile, i.e., a set of RNA levels for a plurality of genes. RNA levels can be, for example, obtained by analyzing RNA using an array-based gene expression method.
In some embodiments, RNA to be analyzed by an array-based gene expression method (e.g. “test sample RNA”) is isolated from a sample of amniotic fluid as described above. In some embodiments, test sample RNA is isolated from a sample of maternal blood. In many embodiments, the subject from whom test sample RNA is obtained (i.e., the “test subject”) is a pregnant woman carrying a fetus having or suspected of having a fetal disease or condition, such as Down Syndrome. (Down Syndrome and other fetal diseases or conditions are described herein). In some embodiments, the subject from whom test sample RNA is obtained is a fetus having or suspected of having a fetal disease or condition.
A test sample of RNA to be used in the methods of the invention may include a plurality of nucleic acid fragments labeled with a detectable agent.
The extracted RNA may be amplified, reverse-transcribed, labeled, fragmented, purified, concentrated and/or otherwise modified prior to the gene-expression analysis. Techniques for the manipulation of nucleic acids are well-known in the art (see, for example, Sambrook et al. (1989); Innis (Ed.) (1990); Tijssen (1993); Innis (Ed.) (1995); and Ausubel (Ed.) (2002)).
In certain embodiments, in order to improve the resolution of the array-based gene expression analysis, the nucleic acid fragments of the test sample are less than about 500 bases long, in some embodiments less than about 200 bases long. The use of small fragments significantly increases the reliability of the detection of small differences or the detection of unique sequences.
Methods of RNA fragmentation are known in the art and include: treatment with ribonucleases (e.g., RNase T1, RNase V1 and RNase A), sonication (see, for example, Deininger (1983)), mechanical shearing, and the like (see, for example, Sambrook et al. (1989); Tijssen (1993); Ordahl et al. (1976); Oefner et al. (1996); Thorstenson et al. (1998)). Random enzymatic digestion of the RNA leads to fragments containing as few as 25 to 30 bases.
Fragment size of the nucleic acid segments in the test sample may be evaluated by any of a variety of techniques, such as, for example, electrophoresis (see, for example, Siles and Collier (1997)) or matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (see, for example, Chiu et al. (2000)).
In the practice of the methods of the invention, the test sample of fetal amniotic fluid RNA is labeled before analysis. Suitable methods of nucleic acid labeling with detectable agents have been described in detail above.
Prior to hybridization, the labeled nucleic acid fragments of the test sample may be purified and concentrated before being resuspended in the hybridization buffer. Columns such as Microcon 30 columns may be used to purify and concentrate samples in a single step. Alternatively or additionally, nucleic acids may be purified using a membrane column (such as a Qiagen column) or Sephadex G50 and precipitated in the presence of ethanol.
In many embodiments, a test genomic profile is compared against a reference genomic profile comprising information about at least a subset of genes in a given genome for a reference sample. In some embodiments, a reference genomic profile contains information about at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400, 420, 440, 460, 480, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2200, 2400, 2600, 2800, 3000, 3200, 3400, 3600, 3800, 4000, 4200, 4400, 4600, 4800, 5000, 5200, 5400, 5600, 5800, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10000, 11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000, 20000, or more genes.
The reference genomic profile can comprise, for example, a value or set of values related to the amount and/or pattern of gene expression in a reference sample. In some embodiments, the reference genomic profile is a reference gene expression profile, i.e., a set of RNA levels for a plurality of genes. RNA levels can be, for example, obtained by analyzing RNA using an array-based gene expression method as described herein.
In some embodiments, the value or set of values for the reference is obtained from experiments performed on reference samples and/or using a reference subject. For example, reference values can be obtained from experiments on comparable samples, such as amniotic fluid and/or maternal whole blood from pregnant women carrying fetuses that are not suffering from a condition or disease from which the test subject suffers.
In some embodiments, the reference sample is obtained from a fetus of the same gestational and/or developmental age as the test sample, or from a pregnant woman carrying such a fetus. In some embodiments, the reference sample is obtained from a fetus of the same gender, or from a pregnant woman carrying such a fetus. In some embodiments, the reference sample is obtained from a fetus that shares one or more attributes in common with the fetus from which the test sample is obtained, or from a pregnant woman carrying such a fetus.
In some embodiments, the reference sample is obtained from the same subject who provided the test sample, but at a different point in time and/or at a different stage than when the test sample was obtained. For example, the reference sample can be obtained from the same individual who is (and/or whose fetus is) at a different stage of development, at a different stage of the disease or condition, and/or at a different stage with respect to treatment (e.g., before treatment, at the commencement of treatment, during the treatment regimen, after treatment, etc.). Alternatively or additionally, in embodiments wherein the subject is a pregnant woman, the reference sample can be obtained from the same woman when she was pregnant with another fetus that was not suffering from the particular disease or condition.
In some embodiments, a reference value or set of reference values may be determined, for example, by calculations, using algorithms, and/or from previously acquired and/or archived data.
In some embodiments, a reference expression profile is compiled from gene expression data obtained from more than one reference sample. For example, gene expression values for one gene (or for a particular subset of genes) may be obtained from data obtained from one reference sample or a set of reference samples, while gene expression values for another gene or for another particular subset of genes (which may or may not overlap the other particular subset of genes) may be obtained from another reference sample or set of reference samples. Alternatively or additionally, a gene expression value for one or more particular gene(s) in the reference expression profile may be averaged from a set of values obtained from more than one reference sample.
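By way of non-limiting illustration, the compilation of a reference expression profile from more than one reference sample may be sketched as follows. The sketch assumes that each reference sample is represented as a mapping from gene identifier to expression value and that a simple arithmetic mean is used; the function name, gene identifiers, and values are hypothetical and are not taken from the Examples.

```python
from statistics import mean


def compile_reference_profile(reference_samples):
    """Average each gene's expression over the reference samples that report it.

    Genes reported by only one sample keep that sample's value, so different
    subsets of genes may come from different reference samples.
    """
    values_by_gene = {}
    for sample in reference_samples:
        for gene, level in sample.items():
            values_by_gene.setdefault(gene, []).append(level)
    return {gene: mean(levels) for gene, levels in values_by_gene.items()}


# Hypothetical reference samples with partially overlapping gene subsets.
reference_samples = [
    {"GENE_A": 10.0, "GENE_B": 4.0},
    {"GENE_A": 14.0, "GENE_C": 7.0},
]
reference_profile = compile_reference_profile(reference_samples)
print(reference_profile)
```

In this sketch, a gene measured in several reference samples receives an averaged value, while a gene measured in a single reference sample is carried over unchanged. Other summary statistics (for example, a median) could be substituted.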
Any of a variety of arrays may be used in the practice of the present invention. Investigators can either rely on commercially available arrays or generate their own. Methods of making and using arrays are well known in the art (see, for example, Kern and Hampton (1997); Schummer et al. (1997); Solinas-Toldo et al. (1997); Johnston (1998); Bowtell (1999); Watson and Akil (1999); Freeman et al. (2000); Lockhart and Winzeler (2000); Cuzin (2001); Zarrinkar et al. (2001); Gabig and Wegrzyn (2001); and Cheung et al. (2001); see also, for example, U.S. Pat. Nos. 5,143,854; 5,434,049; 5,556,752; 5,632,957; 5,700,637; 5,744,305; 5,770,456; 5,800,992; 5,807,522; 5,830,645; 5,856,174; 5,959,098; 5,965,452; 6,013,440; 6,022,963; 6,045,996; 6,048,695; 6,054,270; 6,258,606; 6,261,776; 6,277,489; 6,277,628; 6,365,349; 6,387,626; 6,458,584; 6,503,711; 6,516,276; 6,521,465; 6,558,907; 6,562,565; 6,576,424; 6,587,579; 6,589,726; 6,594,432; 6,599,693; 6,600,031; and 6,613,893).
Arrays comprise a plurality of genetic probes immobilized to discrete spots (i.e., defined locations or assigned positions) on a substrate surface. Gene arrays used in accordance with some embodiments of the invention contain probes representing a comprehensive set of genes across the genome. In some such embodiments, the genes represented by the probes do not represent any particular subset of genes, and/or may be a random assortment of genes. In some embodiments of the invention, the gene arrays comprise a particular subset or subsets of genes. The subsets of genes may represent particular classes of genes of interest. For example, an array comprising probes for developmental genes may be used in order to focus analyses on developmental genes. In such embodiments using arrays having particular subsets, more than one class of genes of interest may be represented on the same array.
Substrate surfaces suitable for use in the present invention can be made of any of a variety of rigid, semi-rigid or flexible materials that allow direct or indirect attachment (i.e., immobilization) of genetic probes to the substrate surface. Suitable materials include, but are not limited to: cellulose (see, for example, U.S. Pat. No. 5,068,269), cellulose acetate (see, for example, U.S. Pat. No. 6,048,457), nitrocellulose, glass (see, for example, U.S. Pat. No. 5,843,767), quartz or other crystalline substrates such as gallium arsenide, silicones (see, for example, U.S. Pat. No. 6,096,817), various plastics and plastic copolymers (see, for example, U.S. Pat. Nos. 4,355,153; 4,652,613; and 6,024,872), various membranes and gels (see, for example, U.S. Pat. No. 5,795,557), and paramagnetic or supramagnetic microparticles (see, for example, U.S. Pat. No. 5,939,261). When fluorescence is to be detected, arrays comprising cyclo-olefin polymers may in some embodiments be used (see, for example, U.S. Pat. No. 6,063,338).
The presence of reactive functional chemical groups (such as, for example, hydroxyl, carboxyl, amino groups and the like) on the material can be exploited to directly or indirectly attach genetic probes to the substrate surface. Methods for immobilizing genetic probes to substrate surfaces to form an array are well-known in the art.
More than one copy of each genetic probe may be spotted on the array (for example, in duplicate or in triplicate). This arrangement may, for example, allow assessment of the reproducibility of the results obtained. Related genetic probes may also be grouped in probe elements on an array. For example, a probe element may include a plurality of related genetic probes of different lengths but comprising substantially the same sequence. Alternatively, a probe element may include a plurality of related genetic probes that are fragments of different lengths resulting from digestion of more than one copy of a cloned piece of DNA. A probe element may also include a plurality of related genetic probes that are identical fragments except for the presence of a single base pair mismatch. An array may contain a plurality of probe elements. Probe elements on an array may be arranged on the substrate surface at different densities.
Array-immobilized genetic probes may be nucleic acids that contain sequences from genes (e.g., from a genomic library), including, for example, sequences that collectively cover a substantially complete genome or a subset of a genome (for example, the array may contain only human genes that are expressed throughout development). Genetic probes may be long cDNA sequences (500 to 5,000 bases long) or shorter sequences (for example, 20-80-mer oligonucleotides). The sequences of the genetic probes are those for which gene expression level information is desired. Additionally or alternatively, the array may comprise nucleic acid sequences of unknown significance or location. Genetic probes may be used as positive or negative controls (for example, the nucleic acid sequences may be derived from karyotypically normal genomes or from genomes containing one or more chromosomal abnormalities; alternatively or additionally, the array may contain perfect match sequences as well as single base pair mismatch sequences to adjust for non-specific hybridization).
Techniques for the preparation and manipulation of genetic probes are well-known in the art (see, for example, J. Sambrook et al. (1989); Innis (Ed.) (1990); Tijssen (1993); Innis (Ed.) (1995); and Ausubel (Ed.) (2002)).
Long cDNA sequences may be obtained and manipulated by cloning into various vehicles. They may be screened and re-cloned or amplified from any source of genomic DNA. Genetic probes may be derived from genomic clones including mammalian and human artificial chromosomes (MACs and HACs, respectively, which can contain inserts from ˜5 to 400 kilobases (kb)), satellite artificial chromosomes or satellite DNA-based artificial chromosomes (SATACs), yeast artificial chromosomes (YACs; 0.2-1 Mb in size), bacterial artificial chromosomes (BACs; up to 300 kb); P1 artificial chromosomes (PACs; ˜70-100 kb) and the like.
Genetic probes may also be obtained and manipulated by cloning into other cloning vehicles such as, for example, recombinant viruses, cosmids, or plasmids (see, for example, U.S. Pat. Nos. 5,266,489; 5,288,641 and 5,501,979).
In some embodiments, genetic probes are synthesized in vitro by chemical techniques well-known in the art and then immobilized on arrays. Such methods are especially suitable for obtaining genetic probes comprising short sequences such as oligonucleotides and have been described in scientific articles as well as in patents (see, for example, Narang et al. (1979); Brown et al. (1979); Belousov et al. (1997); Guschin et al. (1997); Blommers et al. (1994); and Frenkel et al. (1995); see also for example, U.S. Pat. No. 4,458,066).
For example, oligonucleotides may be prepared using an automated, solid-phase procedure based on the phosphoramidite approach. In such a method, each nucleotide is individually added to the 5′-end of the growing oligonucleotide chain, which is attached at the 3′-end to a solid support. The added nucleotides are in the form of trivalent 3′-phosphoramidites that are protected from polymerization by a dimethoxytrityl (or DMT) group at the 5′-position. After base-induced phosphoramidite coupling, mild oxidation gives a pentavalent phosphotriester intermediate, and DMT removal provides a new site for oligonucleotide elongation. The oligonucleotides are then cleaved off the solid support, and the phosphodiester and exocyclic amino groups are deprotected with ammonium hydroxide. These syntheses may be performed on commercial oligo synthesizers such as the Perkin Elmer/Applied Biosystems Division DNA synthesizer.
Methods of attachment (or immobilization) of oligonucleotides on substrate supports have been described (see, for example, Maskos and Southern (1992); Matson et al. (1995); Lipshutz et al. (1999); Rogers et al. (1999); Podyminogin et al. (2001); Belosludtsev et al. (2001)).
Oligonucleotide-based arrays have also been prepared by synthesis in situ using a combination of photolithography and oligonucleotide chemistry (see, for example, Pease et al. (1994); Lockhart et al. (1996); Singh-Gasson et al. (1999); Pirrung et al. (2001); McGall et al. (2001); Barone et al. (2001); Butler et al. (2001); Nuwaysir et al. (2002)). The chemistry for light-directed oligonucleotide synthesis using photolabile protected 2′-deoxynucleoside phosphoramidites has been developed by Affymetrix Inc. (Santa Clara, Calif.) and is well known in the art (see, for example, U.S. Pat. Nos. 5,424,186 and 6,582,908).
An alternative to custom arraying of genetic probes is to rely on commercially available arrays and micro-arrays. Such arrays have been developed, for example, by Affymetrix Inc. (Santa Clara, Calif.), Illumina, Inc. (San Diego, Calif.), Spectral Genomics, Inc. (Houston, Tex.), and Vysis Corporation (Downers Grove, Ill.).
In some embodiments of the invention, provided are gene expression arrays for use in prenatal diagnostic applications. Such arrays are generally custom-made such that the genes represented on the array include at least a subset of genes that are known to be or suspected of being differentially expressed in a particular fetal disease or condition for which prenatal diagnosis is desirable. For example, a gene expression array for use in diagnosing Down Syndrome would include genetic probes for genes that are differentially expressed in trisomy 21 fetuses as discussed in Examples 2 and 3 of the present application.
In the methods of the invention, the gene expression array may be contacted with the test sample under conditions wherein the nucleic acid fragments in the sample specifically hybridize to the genetic probes immobilized on the array.
The hybridization reaction and washing step(s), if any, may be carried out under any of a variety of experimental conditions. Numerous hybridization and wash protocols have been described and are well-known in the art (see, for example, Sambrook et al. (1989); Tijssen (1993); and Anderson (Ed.) (1999)). Methods of the invention may be carried out by following known hybridization protocols, by using modified or optimized versions of known hybridization protocols or newly developed hybridization protocols as long as these protocols allow specific hybridization to take place.
The term “specific hybridization” refers to a process in which a nucleic acid molecule preferentially binds, duplexes, or hybridizes to a particular nucleic acid sequence under stringent conditions. In the context of the present invention, this term more specifically refers to a process in which a nucleic acid fragment from a test sample preferentially binds (i.e., hybridizes) to a particular genetic probe immobilized on the array and to a lesser extent, or not at all, to other immobilized genetic probes of the array. Stringent hybridization conditions are sequence dependent. The specificity of hybridization increases with the stringency of the hybridization conditions; reducing the stringency of the hybridization conditions results in a higher degree of mismatch being tolerated.
The hybridization and/or wash conditions may be adjusted by varying different factors such as the hybridization reaction time, the time of the washing step(s), the temperature of the hybridization reaction and/or of the washing process, the components of the hybridization and/or wash buffers, the concentrations of these components as well as the pH and ionic strength of the hybridization and/or wash buffers.
In certain embodiments, the hybridization and/or wash steps are carried out under very stringent conditions. In other embodiments, the hybridization and/or wash steps are carried out under moderate to stringent conditions. In still other embodiments, more than one washing step is performed. For example, in order to reduce background signal, a medium to low stringency wash is followed by a wash carried out under very stringent conditions.
As is well known in the art, the hybridization process may be enhanced by modifying other reaction conditions. For example, the efficiency of hybridization (i.e., time to equilibrium) may be enhanced by using reaction conditions that include temperature fluctuations (i.e., differences in temperature that are higher than a couple of degrees). An oven or other devices capable of generating variations in temperatures may be used in the practice of the methods of the invention to obtain temperature fluctuation conditions during the hybridization process.
It is also known in the art that hybridization efficiency is significantly improved if the reaction takes place in an environment where the humidity is not saturated. Controlling the humidity during the hybridization process provides another means to increase the hybridization sensitivity. Array-based instruments usually include housings allowing control of the humidity during all the different stages of the experiment (i.e., pre-hybridization, hybridization, wash and detection steps).
Additionally or alternatively, a hybridization environment that includes osmotic fluctuation may be used to increase hybridization efficiency. Such an environment, in which the hyper-/hypo-tonicity of the hybridization reaction mixture varies, may be obtained by creating a solute gradient in the hybridization chamber, for example, by placing a hybridization buffer containing a low salt concentration on one side of the chamber and a hybridization buffer containing a higher salt concentration on the other side of the chamber.
In the practice of the methods of the invention, the array may be contacted with the test sample under conditions wherein the nucleic acid segments in the sample specifically hybridize to the genetic probes on the array. As mentioned above, the selection of appropriate hybridization conditions will allow specific hybridization to take place. In certain cases, the specificity of hybridization may further be enhanced by inhibiting repetitive sequences.
In certain embodiments, repetitive sequences present in the nucleic acid fragments are removed or their hybridization capacity is disabled. By excluding repetitive sequences from the hybridization reaction or by suppressing their hybridization capacity, one prevents the signal from hybridized nucleic acids from being dominated by the signal originating from these repetitive-type sequences (which are statistically more likely to undergo hybridization). Failure to remove repetitive sequences from the hybridization or to suppress their hybridization capacity results in non-specific hybridization, making it difficult to distinguish the signal from the background noise.
Removing repetitive sequences from a mixture or disabling their hybridization capacity can be accomplished using any of a variety of methods well-known to those skilled in the art. These methods include, but are not limited to, removing repetitive sequences by hybridization to specific nucleic acid sequences immobilized to a solid support (see, for example, Brison et al. (1982)); suppressing the production of repetitive sequences by PCR amplification using adequate PCR primers; or inhibiting the hybridization capacity of highly repeated sequences by self-reassociation (see, for example, Britten et al. (1974)).
In some embodiments, the hybridization capacity of highly repeated sequences is competitively inhibited by including, in the hybridization mixture, unlabeled blocking nucleic acids. The unlabeled blocking nucleic acids, which are mixed with the test sample before the contacting step, act as a competitor and prevent the labeled repetitive sequences from binding to the highly repetitive sequences of the genetic probes, thus decreasing hybridization background. In certain embodiments, for example when cDNA derived from fetal mRNA is analyzed, the unlabeled blocking nucleic acids are Human Cot-1 DNA. Human Cot-1 DNA is commercially available, for example, from Gibco/BRL Life Technologies (Gaithersburg, Md.).
In some embodiments, inventive methods include determining the binding of individual nucleic acid fragments of the test sample to individual genetic probes immobilized on the array in order to obtain a binding pattern. In array-based gene expression, determination of the binding pattern is carried out by analyzing the labeled array which results from hybridization of labeled nucleic acid segments to immobilized genetic probes.
In certain embodiments, determination of the binding includes: measuring the intensity of the signals produced by the detectable agent at each discrete spot on the array.
Analysis of the labeled array may be carried out using any of a variety of means and methods, whose selection will depend on the nature of the detectable agent and the detection system of the array-based instrument used.
In certain embodiments, the detectable agent comprises a fluorescent dye and the binding is detected by fluorescence. In other embodiments, the sample of RNA is biotin-labeled and after hybridization to immobilized genetic probes, the hybridization products are stained with a streptavidin-phycoerythrin conjugate and visualized by fluorescence. Analysis of a fluorescently labeled array usually comprises: detection of fluorescence over the whole array, image acquisition, quantitation of fluorescence intensity from the imaged array, and data analysis.
Methods for the detection of fluorescent labels and the creation of fluorescence images are well known in the art and include the use of “array reading” or “scanning” systems, such as charge-coupled devices (i.e., CCDs). Any known device or method, or variation thereof can be used or adapted to practice the methods of the invention (see, for example, Hiraoka et al. (1987); Aikens et al. (1989); Divane et al. (1994); Jalal et al. (1998); and Cheung et al. (1999); see also, for example, U.S. Pat. Nos. 5,539,517; 5,790,727; 5,846,708; 5,880,473; 5,922,617; 5,943,129; 6,049,380; 6,054,279; 6,055,325; 6,066,459; 6,140,044; 6,143,495; 6,191,425; 6,252,664; 6,261,776 and 6,294,331).
Commercially available microarray scanners are typically laser-based scanning systems that can acquire one or more fluorescent images (such as, for example, the instruments commercially available from PerkinElmer Life and Analytical Sciences, Inc. (Boston, Mass.), Virtek Vision, Inc. (Ontario, Canada) and Axon Instruments, Inc. (Union City, Calif.)). Arrays can be scanned using different laser intensities in order to ensure the detection of weak fluorescence signals and the linearity of the signal response at each spot on the array. Fluorochrome-specific optical filters may be used during the acquisition of the fluorescent images. Filter sets are commercially available, for example, from Chroma Technology Corp. (Rockingham, Vt.).
In some embodiments, a computer-assisted imaging system capable of generating and acquiring fluorescence images from arrays such as those described above, is used in the practice of the methods of the invention. One or more fluorescent images of the labeled array after hybridization may be acquired and stored.
In some embodiments, a computer-assisted image analysis system is used to analyze the acquired fluorescent images. Such systems allow for an accurate quantitation of the intensity differences and for an easier interpretation of the results. Software for fluorescence quantitation and fluorescence ratio determination at discrete spots on an array is usually included with the scanner hardware. Software and/or hardware are commercially available and may be obtained from, for example, BioDiscovery (El Segundo, Calif.); Imaging Research (Ontario, Canada); Affymetrix, Inc. (Santa Clara, Calif.); Applied Spectral Imaging Inc. (Carlsbad, Calif.); Chroma Technology Corp. (Brattleboro, Vt.); Leica Microsystems (Bannockburn, Ill.); and Vysis Inc. (Downers Grove, Ill.). Other software is publicly available (e.g., MicroArray Image Analysis, and Combined Expression Data and Sequence Analysis (http://rana.lbl.gov); Chiang et al. (2001); a system written in R and available through the Bioconductor project (http://www.bioconductor.org); a Java-based TM4 software system available from the Institute for Genomic Research (http://www.tigr.org/software); and a Web-based system developed at Lund University (http://base.thep.lu.se)).
Accurate determination of fluorescence intensities requires normalization and determination of the fluorescence ratio baseline (Brazma and Vilo (2000)). Data reproducibility may be assessed by using arrays on which genetic probes are spotted in duplicate or triplicate. Baseline thresholds may also be determined using global normalization approaches (Kerr et al. (2000)). Other arrays include a set of maintenance genes which shows consistent levels of expression over a wide variety of tissues and allows the normalization and scaling of array experiments.
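By way of non-limiting illustration, normalization and scaling against a set of maintenance genes may be sketched as follows. The sketch assumes that each array is represented as a mapping from gene identifier to background-corrected intensity and that the array is linearly scaled so that the mean maintenance-gene intensity equals an arbitrary target value; the function name, the target value, and the gene identifiers are hypothetical, and other normalization schemes may equally be used.

```python
from statistics import mean


def normalize_to_maintenance_genes(intensities, maintenance_genes, target=100.0):
    """Scale one array so its mean maintenance-gene intensity equals `target`.

    `target` is an arbitrary illustrative constant, not a value from the text.
    """
    present = [intensities[g] for g in maintenance_genes if g in intensities]
    if not present:
        raise ValueError("no maintenance genes found on this array")
    baseline = mean(present)
    if baseline <= 0:
        raise ValueError("maintenance-gene baseline must be positive")
    scale = target / baseline
    return {gene: value * scale for gene, value in intensities.items()}


# Hypothetical array: two maintenance genes and one gene of interest.
array_intensities = {"MAINT_1": 40.0, "MAINT_2": 60.0, "GENE_X": 200.0}
normalized = normalize_to_maintenance_genes(
    array_intensities, ["MAINT_1", "MAINT_2"]
)
print(normalized)
```

Applying the same scaling to every array in an experiment places the arrays on a common scale, so that expression levels may be compared between test and reference samples.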
In the practice of the methods of the invention, any of a large variety of bioinformatics and statistical methods may be used to analyze data obtained by array-based gene expression analysis. Such methods are well known in the art (for a review of essential elements of data acquisition, data processing, data analysis, data mining and of the quality, relevance and validation of information extracted by different bioinformatics and statistical methods, see, for example, Watson et al. (1998); Duggan et al. (1999); Bassett et al. (1999); Hess et al. (2001); Marcotte and Date (2001); Weinstein et al. (2002); Dewey (2002); Butte (2002); Tamames et al. (2002); Xiang et al. (2003)).
In gene expression array experiments, quantitative readouts of expression levels are typically provided. Typically, after normalization of data, genes having at least a 1.5-fold difference (i.e., a ratio of about 1.5) in expression levels between test and control samples may be considered “differentially expressed.” In some embodiments of the invention, genes considered to be differentially expressed show at least two-fold, at least five-fold, at least ten-fold, at least 15-fold, at least 20-fold, or at least 25-fold different expression levels compared to controls. (It is to be understood that the fold different expression levels can be determined in either direction, i.e., the expression levels for the test sample may be at least 1.5-fold higher or 1.5-fold lower than expression levels for the control sample.) In some embodiments, differential expression is assessed with respect to statistical significance for individual genes or groups of genes; in such embodiments, the fold-change may be lower, e.g., 1.4-fold, 1.3-fold, 1.2-fold, or 1.1-fold or even lower.
It will be appreciated that the fold-difference cutoff for a gene to be considered differentially expressed varies depending on several factors, which may include, for example, the type of samples used, the quantity and quality of the RNA sample, the power of the statistical analyses, the type of genes of interest, etc. In some embodiments, a lower cutoff ratio (i.e., fold difference) is used, e.g., ratios of about 1.4, or about 1.3. In some embodiments, a higher cutoff ratio than about 1.5 is used, e.g., about 2.0, about 2.5, about 3.0, about 3.5, about 4.0, about 4.5, about 5.0, etc.
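By way of non-limiting illustration, the application of a fold-difference cutoff in either direction may be sketched as follows. The sketch assumes normalized, strictly positive expression values keyed by gene identifier; the function name, gene identifiers, and values are hypothetical, and the default cutoff of 1.5 is simply the typical ratio mentioned above.

```python
def differentially_expressed(test, control, cutoff=1.5):
    """Return genes whose test/control ratio differs by at least `cutoff`-fold.

    The fold difference is evaluated in either direction: a ratio of 0.25
    is reported as a 4-fold decrease.
    """
    hits = {}
    for gene in test.keys() & control.keys():
        t, c = test[gene], control[gene]
        if t <= 0 or c <= 0:
            continue  # ratios are undefined for non-positive values
        ratio = t / c
        fold = ratio if ratio >= 1 else 1 / ratio
        if fold >= cutoff:
            hits[gene] = ("up" if ratio >= 1 else "down", fold)
    return hits


# Hypothetical normalized expression values.
test_levels = {"GENE_A": 300.0, "GENE_B": 100.0, "GENE_C": 50.0}
control_levels = {"GENE_A": 100.0, "GENE_B": 90.0, "GENE_C": 200.0}
hits = differentially_expressed(test_levels, control_levels)
print(hits)
```

Passing a different `cutoff` (for example, 1.3 or 2.0) corresponds to the lower and higher cutoff ratios contemplated above. A statistical significance test could be combined with, or substituted for, the fold-difference criterion.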
In some embodiments of the invention, gene expression data is analyzed at an individual level such that individual genes that are differentially expressed are identified. In some embodiments of the invention, gene expression data is analyzed by gene sets. Sets of genes may be grouped together based on location such as on a chromosomal band. For example, a set of genes on Chr21q22, a chromosomal region involved in Down Syndrome, may be analyzed together. Individual gene analyses and/or gene set analyses may identify functional groups and/or gene pathways involved in a particular fetal disease or condition. For further description of analytical methods used in the practice of the invention, see Examples 2 and 3 of the present application.
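By way of non-limiting illustration, an analysis by gene sets grouped on chromosomal location may be sketched as follows. The sketch assumes that each gene set is summarized by the mean log2 ratio of test to control expression for its member genes; this summary statistic, the function name, the gene identifiers, and the assignment of genes to bands are hypothetical and are not the analytical method of the Examples.

```python
from math import log2
from statistics import mean


def mean_log_ratio_by_set(test, control, gene_sets):
    """Summarize each gene set by the mean log2(test/control) of its members."""
    summary = {}
    for set_name, genes in gene_sets.items():
        log_ratios = [
            log2(test[g] / control[g])
            for g in genes
            if g in test and g in control and test[g] > 0 and control[g] > 0
        ]
        if log_ratios:  # skip sets with no measured members
            summary[set_name] = mean(log_ratios)
    return summary


# Hypothetical assignment of genes to chromosomal bands.
gene_sets = {"Chr21q22": ["GENE_A", "GENE_B"], "other_band": ["GENE_C"]}
test_levels = {"GENE_A": 200.0, "GENE_B": 400.0, "GENE_C": 100.0}
control_levels = {"GENE_A": 100.0, "GENE_B": 100.0, "GENE_C": 100.0}
set_summary = mean_log_ratio_by_set(test_levels, control_levels, gene_sets)
print(set_summary)
```

A positive value for a set indicates that its member genes are, on average, more highly expressed in the test sample than in the control sample.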
In some embodiments of the invention, gene expression data is fed into software programs to generate protein networks that may be involved in a particular fetal condition or disease. Protein network analyses may facilitate the design and/or development of novel fetal treatment approaches (see Examples 10 and 11 of the present application).
IV. Methods of Identifying Therapeutic Agents, Regimens, and/or Compounds
In one aspect, the invention provides methods for identifying therapeutic agents and/or regimens for a fetal disease or condition. In some embodiments, such methods comprise steps of: obtaining a reference genomic profile; obtaining a test genomic profile from a sample of amniotic fluid and/or maternal blood, wherein the sample is obtained from a subject suffering from or carrying a fetus suffering from a fetal disease or condition; determining differences between the test genomic profile and the reference genomic profile; inputting the test genomic profile into a first computing machine; accessing a storage repository on a second computing machine, wherein the storage repository contains a set of stored genomic profiles of one or more cell line(s) that have each been contacted with a different agent, wherein each genomic profile is mapped to data representing a corresponding agent; generating, by a correlator executing on the first or second computing machine, a correlation between each stored genomic profile and the test genomic profile; and selecting at least one agent whose corresponding genomic profile has a negative correlation score with the test genomic profile, the selected agent being likely to reduce the differences between the test genomic profile and the reference genomic profile.
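By way of non-limiting illustration, the correlating and selecting steps may be sketched as follows. The sketch assumes that the test genomic profile is represented as a signature of per-gene log ratios relative to the reference genomic profile, that each stored genomic profile is represented in the same way for an agent-treated cell line, and that a Pearson correlation coefficient serves as the correlation score; the scoring method, function names, agent names, and values are hypothetical, and the correlator may use any other suitable measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # no variation, so no meaningful correlation
    return cov / (sd_x * sd_y)


def select_agents(test_signature, stored_profiles):
    """Score each stored profile and return agents with negative scores."""
    scores = {}
    for agent, profile in stored_profiles.items():
        shared = sorted(test_signature.keys() & profile.keys())
        if len(shared) < 2:
            continue  # too few shared genes to correlate
        scores[agent] = pearson(
            [test_signature[g] for g in shared],
            [profile[g] for g in shared],
        )
    selected = sorted((a for a, s in scores.items() if s < 0), key=scores.get)
    return selected, scores


# Hypothetical signature (test versus reference) and stored agent profiles.
test_signature = {"G1": 1.0, "G2": -1.0, "G3": 2.0}
stored_profiles = {
    "agent_x": {"G1": -1.0, "G2": 1.0, "G3": -2.0},  # opposes the signature
    "agent_y": {"G1": 0.5, "G2": -0.5, "G3": 1.0},   # mimics the signature
}
selected, scores = select_agents(test_signature, stored_profiles)
print(selected, scores)
```

In this sketch, an agent whose stored profile moves genes in the direction opposite to the test signature receives a negative score and is selected, consistent with the expectation that such an agent is likely to reduce the differences between the test genomic profile and the reference genomic profile.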
Characteristics of reference genomic profiles and test genomic profiles are described herein. (See section III: Array-Based Gene Expression Analysis of Fetal RNA). In some embodiments, obtaining a reference genomic profile and/or a test genomic profile comprises creating the genomic profile using methods as described herein. In some embodiments, a reference genomic profile and/or test genomic profile is obtained from another source (e.g., a clinical laboratory, a research laboratory, a commercial service). Test genomic profiles are obtained from a sample of amniotic fluid using methods as described herein.
Genomic profiles generally comprise information about at least a subset of genes (and/or gene products) in a given genome. In some embodiments, genomic profiles comprise information about at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400, 420, 440, 460, 480, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2200, 2400, 2600, 2800, 3000, 3200, 3400, 3600, 3800, 4000, 4200, 4400, 4600, 4800, 5000, 5200, 5400, 5600, 5800, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10000, 11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000, 20000, or more genes.
In some embodiments, the genomic profiles comprise information selected from the group consisting of mRNA levels (e.g., obtained from gene expression profiling experiments), protein expression levels, DNA methylation patterns, metabolite profiles, and combinations thereof.
Differences between the test genomic profile and the reference genomic profile can be determined using any of a variety of methods known in the art, such as, but not limited to, analytical and/or bioinformatics methods as discussed herein (see, for example, “Binding Detection and Data Analysis” in section III: “Array-Based Gene Expression Analysis of Fetal RNA”). In some embodiments, differences are determined using algorithms, functions, and/or scripts executing on one or more computing machines as described herein. In some embodiments, differences are determined visually and/or manually, e.g., by an individual. Not every point of difference between the test genomic profile and the reference genomic profile needs to be determined, though such a determination is contemplated and included in some embodiments of the invention. As would be understood by one of ordinary skill in the art, the determined differences generally provide an overall picture of differences across the genome and may guide an understanding of what genetic pathways may be disrupted in the test sample as compared to the reference sample. In some embodiments, a single difference is determined. In some embodiments, a plurality of differences is determined. In some embodiments, at least 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, or more differences are determined.
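As one illustration of how differences might be determined by a script executing on a computing machine, the following minimal Python sketch compares a test profile with a reference profile gene by gene. The function name `profile_differences`, the dictionary representation of a profile, the use of a log2 fold change, the cutoff of 1.0, and the gene names and values are all assumptions made for illustration; none of them is prescribed by the methods described herein.

```python
import math

def profile_differences(test, reference, min_abs_log2_fc=1.0):
    """Return {gene: log2 fold change} for genes whose levels differ.

    `test` and `reference` map gene names to positive expression values.
    Only genes present in both profiles are compared (an assumption).
    """
    differences = {}
    for gene in test.keys() & reference.keys():
        log2_fc = math.log2(test[gene] / reference[gene])
        if abs(log2_fc) >= min_abs_log2_fc:
            differences[gene] = log2_fc
    return differences

# Hypothetical gene names and values, for illustration only.
reference = {"GENE_A": 100.0, "GENE_B": 50.0, "GENE_C": 80.0}
test = {"GENE_A": 400.0, "GENE_B": 55.0, "GENE_C": 20.0}
diffs = profile_differences(test, reference)
print(sorted(diffs))  # genes that pass the cutoff
```

In this sketch a positive value indicates a higher level in the test profile and a negative value a lower level; other measures of difference could be substituted without changing the overall approach.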
In accordance with provided methods, test genomic profiles are inputted into a first computing machine. By “inputting” it is meant that the test genomic profile, data representation(s) thereof, or data representation(s) of a subset of information contained in the test genomic profile, is entered into the first computing machine. By “data representations,” it is meant that the information in the test genomic profile may be summarized, abstracted, and/or represented in a different way (e.g., using numbers, symbols, code(s), binary numbers, etc.) before being inputted into the computing machine. In some embodiments, only a subset of information contained in the test genomic profile is inputted into the first computing machine. In some such embodiments, the subset of information comprises information deemed relevant (e.g., by a researcher or clinician, as determined by an algorithm, etc.). In some embodiments, inputting involves use of one or more inputting devices as described below. (See “computing machines.”)
In some embodiments, one or more names of genes whose expression or other state is altered in the test genomic profile are inputted into the first computing machine, and the correlation step involves generating a correlation factor between the genes and the stored genomic profiles.
Although not required, in some embodiments, the reference genomic profile is also inputted into a computing machine, which may or may not be the same as the first computing machine into which the test genomic profile is inputted.
The fetal disease or condition may be any fetal disease or condition for which a therapeutic agent is desired. In some embodiments, the fetal disease or condition is selected from the group consisting of twin-to-twin-transfusion syndrome (TTTS), gastroschisis, Down Syndrome, fetal structural anomalies, fetal congenital heart anomaly, fetal kidney anomalies, neural tube defects, and congenital diaphragmatic hernia. In some such embodiments, the fetal disease or condition is Down Syndrome. In some embodiments, methods for identifying therapeutic agents further comprise testing the selected agent for medical applications in utero. It may be desirable, for example, to apply a therapy prenatally to the fetus and/or perinatally. In instances where prenatal therapy is desired, it may be advantageous to test the efficacy and safety of the therapeutic agent in utero, which would allow medical intervention before birth.
The storage repository (in some embodiments, known as a “reference database”) on the second computing machine contains a set of stored genomic profiles. In some embodiments, the stored genomic profiles are of one or more cell line(s) that have each been contacted with a different agent. Thus, in such embodiments, each stored genomic profile corresponds to a particular agent. In such embodiments, each stored genomic profile is mapped to data representing a corresponding agent in the storage repository, such that it is possible to determine which agent, when contacted to the one or more cell line(s), corresponds to a given stored genomic profile.
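The mapping between stored genomic profiles and agents can be pictured with a small data structure. The following Python sketch is a hypothetical in-memory stand-in for a storage repository; the class names `StoredProfile` and `ReferenceDatabase`, the field names, the profile identifiers, and the example values are assumptions for illustration and do not describe any particular database.

```python
from dataclasses import dataclass

@dataclass
class StoredProfile:
    agent: str       # data representing the agent contacted with the cells
    cell_line: str   # cell line in which the profile was measured
    values: dict     # gene name -> numeric value (e.g., fold change)

class ReferenceDatabase:
    """In-memory stand-in for the storage repository (illustrative only)."""

    def __init__(self):
        self._profiles = {}

    def add(self, profile_id, profile):
        self._profiles[profile_id] = profile

    def agent_for(self, profile_id):
        """Look up which agent corresponds to a given stored profile."""
        return self._profiles[profile_id].agent

    def items(self):
        return self._profiles.items()

# Hypothetical entries, for illustration only.
db = ReferenceDatabase()
db.add("p1", StoredProfile("agent_X", "MCF7", {"GENE_A": -1.8, "GENE_C": 1.5}))
db.add("p2", StoredProfile("agent_Y", "PC3", {"GENE_A": 2.1, "GENE_C": -1.2}))
print(db.agent_for("p1"))
```

Keeping the agent alongside each profile is what allows a correlation score computed for a stored profile to be traced back to the agent that produced it.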
The one or more cell lines can be any of a variety of cell lines known in the art, such as those used in biomedical and/or clinical research. These include, without limitation, cancer cell lines such as, for example, MCF7 (human breast cancer epithelial cells), PC3 (human prostate cancer epithelial cells), HL60 (human leukemia cells), SKMEL5 (human melanoma cells), etc. Generally, any cell line that is generated from a biological sample and/or organism may be suitable for use in the invention, so long as the cells in the cell line contain a genome that is comparable to that of the test and/or reference sample. Generally, the cell line(s) are obtained from a species that is the same as that from which the test and/or reference sample is obtained. For example, if the test genomic profile is to be obtained from a human biological sample, the cell line(s) used would likely be a human cell line.
The different agents can comprise any number of compounds, small molecules, drug candidates, nucleic acid agents, etc. The different agents may comprise all or a subset of the compounds, small molecules, etc. in a library or collection (e.g., historical collections of compounds and/or libraries from diversity-oriented syntheses). In some embodiments, the different agents comprise bioactive small molecules. In some embodiments, the different agents comprise agents in one or more classes of small molecules, e.g., histone deacetylase (HDAC) inhibitors, estrogens, phenothiazines, etc.
For example, a storage repository amenable for use in accordance with methods of the invention may comprise stored genomic profiles from a reference database comprising mRNA levels known as the “Connectivity Map,” which is publicly available. (See Lamb et al. (2006), the entire contents of which are herein incorporated by reference in their entirety.) The Connectivity Map comprises a reference collection of gene-expression profiles from cultured human cells treated with bioactive small molecules along with pattern-matching software that allows connections between small molecules, genes, diseases, and drugs to be found. Other storage repositories can also be used in accordance with the invention. Such storage repositories may include genomic profiles comprising information such as DNA methylation patterns, protein expression profiles, metabolite profiles, or combinations thereof. Information from more than one such database may be combined for use in inventive methods.
Storage repositories may be located on one or more computing machines as described below. Typically, storage repositories are located on one or more main memory components, although they can alternatively or additionally be located on a subsidiary memory component (such as, but not limited to, an external disk or drive in communication with the second computing machine).
The storage repository may be accessed in any of a variety of ways. In some embodiments, the storage repository is accessed via a bus (e.g., a system bus) within the second computing machine. In some embodiments, the storage repository is accessed via a network as described below (see “Computing machines”).
Correlation scores generally give an indication of the degree to which two variables are associated. In some embodiments, the correlation score ranges from −1 to +1 and may be known as a “correlation coefficient.” In some such embodiments, a positive correlation score denotes a positive correlation, a negative correlation score denotes a negative correlation (also known as an “inverse correlation”), a correlation score of zero denotes no correlation, and the magnitude of the correlation score is an indication of the strength of the correlation. For example, in some such embodiments wherein the correlation score ranges from −1 to +1, the greater the magnitude of the correlation score, the greater the strength of the correlation (whether it is positive or negative). Thus, in such embodiments, the closer a negative correlation score is to −1, the stronger the negative correlation is, whereas the closer a negative correlation score is to 0, the weaker the negative correlation is. Similarly, in such embodiments, the closer a positive correlation score is to +1, the stronger the positive correlation is, whereas the closer a positive correlation score is to 0, the weaker the positive correlation is.
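To make the sign and magnitude conventions concrete, the following Python sketch computes a Pearson correlation coefficient, which is one common score ranging from −1 to +1. The choice of the Pearson coefficient, the function name `pearson`, and the example values are assumptions for illustration; other correlation measures may be used.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-gene values, listed in the same gene order in each profile.
test_profile = [2.0, -1.0, 0.5, -2.0]
same_pattern = [4.0, -2.0, 1.0, -4.0]      # same direction of change, scaled
opposite_pattern = [-2.0, 1.0, -0.5, 2.0]  # opposite direction of change

r_same = pearson(test_profile, same_pattern)
r_opposite = pearson(test_profile, opposite_pattern)
print(r_same > 0, r_opposite < 0)
```

A stored profile that changes in the same direction as the test profile yields a score near +1, and one that changes in the opposite direction yields a score near −1.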
The correlation score can be generated using a correlator, which can execute on the first or second computing machine or both. The correlator may, in some embodiments, be a function, script, algorithm, computer program, software, etc. that employs a computational method to determine the correlation score between the test genomic profile and each stored genomic profile. For the purposes of computing the correlation score, in some embodiments, each datum of information in the test genomic profile and/or stored genomic profile is represented by a number. (For example, the number may correspond to gene expression values, fold-gene expression as compared to a control or reference, extent of methylation, extent of deacetylation, fold-protein expression as compared to a control or reference, etc.) Computational methods to compare genomic profiles (e.g., to determine a correlation score) are known in the art and include, without limitation, nonparametric rank-based pattern-matching strategies such as those based on the Kolmogorov-Smirnov statistic. (See, e.g., Hollander and Wolfe, Nonparametric Statistical Methods, Wiley, New York, ed. 2, 1999, pp. 178-185, the contents of which are herein incorporated by reference in their entirety.)
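A rank-based correlator of this general kind can be sketched as follows. The Python code below is a simplified illustration loosely modeled on Kolmogorov-Smirnov-style scoring of a query gene set against a ranked list; it is not a published or definitive implementation, and it omits any normalization across stored profiles, so its raw scores fall between −2 and +2 rather than between −1 and +1. The function names, the gene names, and the example values are assumptions.

```python
def ks_statistic(tags, ranked_genes):
    """Signed Kolmogorov-Smirnov-style statistic for `tags` in a ranked list.

    `ranked_genes` is ordered from most up-regulated to most down-regulated.
    A positive value means the tags cluster near the top of the list;
    a negative value means they cluster near the bottom.
    """
    n = len(ranked_genes)
    position = {gene: i + 1 for i, gene in enumerate(ranked_genes)}
    ranks = sorted(position[g] for g in tags if g in position)
    t = len(ranks)
    if t == 0:
        return 0.0
    a = max((j + 1) / t - v / n for j, v in enumerate(ranks))
    b = max(v / n - j / t for j, v in enumerate(ranks))
    return a if a > b else -b

def rank_based_score(up_tags, down_tags, stored_profile):
    """Raw, unnormalized score of a query signature against a stored profile.

    `stored_profile` maps gene name -> change in level after treatment.
    """
    ranked = sorted(stored_profile, key=stored_profile.get, reverse=True)
    ks_up = ks_statistic(up_tags, ranked)
    ks_down = ks_statistic(down_tags, ranked)
    if ks_up * ks_down > 0:
        return 0.0  # both tag sets point the same way: no clear relationship
    return ks_up - ks_down

# Hypothetical stored profile: G1 is most up-regulated, G10 most down-regulated.
stored = {f"G{i}": 10 - i for i in range(1, 11)}
mimic_score = rank_based_score({"G1", "G2"}, {"G9", "G10"}, stored)
reverse_score = rank_based_score({"G9", "G10"}, {"G1", "G2"}, stored)
print(mimic_score > 0, reverse_score < 0)
```

Here the query consists of genes that are up-regulated and down-regulated in the test profile. A negative score suggests that the agent shifts those genes in the opposite direction, and a positive score suggests that it shifts them in the same direction.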
Selecting at least one agent in many embodiments comprises selecting an agent whose corresponding stored genomic profile has a strong (or “high”) negative correlation score with the test genomic profile. Such an agent may be deemed likely to reduce the differences between the test genomic profile and the reference genomic profile and may be a desirable candidate therapeutic drug for the fetal disease or condition. In some embodiments, the strong negative correlation score is a correlation coefficient less than −0.4, −0.5, −0.6, −0.7, −0.8, −0.9, or less.
In some embodiments, an agent is selected whose corresponding stored genomic profile has a strong (or “high”) positive correlation score with the test genomic profile. Such an agent may be deemed likely to mimic the test genomic profile, and may, for example, be useful in creating a model (e.g., an animal model and/or model in a cell line) of the disease or condition. In some embodiments, the strong positive correlation score is a correlation coefficient greater than +0.4, +0.5, +0.6, +0.7, +0.8, +0.9, or more.
Selecting may be accomplished using, for example, a function, script, algorithm, computer program, software, etc. executing on a computing machine, such as the first or second computing machine described herein. In some embodiments, selecting is accomplished without using a computing machine. In some embodiments, selecting is accomplished manually, e.g., an individual may scan a list of correlation scores and determine which agent(s) are to be selected.
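Where selecting is performed by a script, it might look like the following minimal Python sketch, which separates agents with strongly negative scores from agents with strongly positive scores. The function name `select_agents`, the cutoffs of −0.4 and +0.4 (taken from the example ranges above), and the agent names and scores are assumptions for illustration.

```python
def select_agents(scores, negative_cutoff=-0.4, positive_cutoff=0.4):
    """Split agents into candidate reversers and candidate mimics.

    `scores` maps agent name -> correlation score with the test profile.
    """
    reversers = sorted(
        (agent for agent, s in scores.items() if s < negative_cutoff),
        key=scores.get,
    )  # strongest negative correlation first
    mimics = sorted(
        (agent for agent, s in scores.items() if s > positive_cutoff),
        key=scores.get,
        reverse=True,
    )  # strongest positive correlation first
    return reversers, mimics

# Hypothetical agents and scores, for illustration only.
scores = {"agent_W": -0.85, "agent_X": -0.45, "agent_Y": 0.10, "agent_Z": 0.72}
reversers, mimics = select_agents(scores)
print(reversers, mimics)
```

Agents in the first list would be candidates for further testing as therapeutics, while agents in the second list might be considered for building a model of the disease or condition.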
In some embodiments, methods further comprise a step of testing activity of the selected agent (e.g., a candidate compound) in a model for the fetal disease or condition. Suitable models for the fetal disease or condition include, but are not limited to, animal models, in vitro cell culture assays, etc.
The first and second computing machines may be the same or different machine and may each comprise any type of computing device, such as a computer, network device or appliance capable of communicating on any type and form of network and performing the operations described herein.
As shown in
The central processing unit is any logic circuitry that responds to and processes instructions fetched from the main memory unit. In many embodiments, the central processing unit comprises a microprocessor unit, such as those manufactured by Intel Corporation (Mountain View, Calif.), those manufactured by Motorola Corporation (Schaumburg, Ill.), those manufactured by Transmeta Corporation (Santa Clara, Calif.), the RS/6000 processor, those manufactured by International Business Machines (White Plains, N.Y.), and/or those manufactured by Advanced Micro Devices (Sunnyvale, Calif.). The computing device may be based on any of these processors, or any other processor capable of operating as described herein.
Main memory may comprise one or more memory chips capable of storing data and allowing any storage location to be directly accessed by the microprocessor, such as Static random access memory (SRAM), Burst SRAM or SynchBurst SRAM (BSRAM), Dynamic random access memory (DRAM), Fast Page Mode DRAM (FPM DRAM), Enhanced DRAM (EDRAM), Extended Data Output RAM (EDO RAM), Extended Data Output DRAM (EDO DRAM), Burst Extended Data Output DRAM (BEDO DRAM), synchronous DRAM (SDRAM), JEDEC SRAM, PC100 SDRAM, Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), SyncLink DRAM (SLDRAM), Direct Rambus DRAM (DRDRAM), and/or Ferroelectric RAM (FRAM). The main memory may be based on any of the above described memory chips, or any other available memory chips capable of operating as described herein. In the embodiment shown in
In some embodiments, the first computing machine is the same as the second computing machine.
In some embodiments, the first computing machine is different than the second computing machine. In some such embodiments, the first computing machine and second computing machine are connected via a network (e.g., a local-area network (LAN) (such as a company Intranet), a metropolitan area network (MAN), and/or a wide area network (WAN) (such as the Internet or the World Wide Web)). In some embodiments, the first computing machine and second computing machine are connected via more than one network. In some embodiments, the network comprises a private network. In some embodiments, the network comprises a public network.
Any type and/or form of network may be used to connect the first and second computing machines in embodiments wherein they are different. Networks compatible for use in accordance with the invention include, but are not limited to, any of the following: a point to point network, a broadcast network, a wide area network, a local area network, a telecommunications network, a data communication network, a computer network, an ATM (Asynchronous Transfer Mode) network, a SONET (Synchronous Optical Network) network, an SDH (Synchronous Digital Hierarchy) network, a wireless network and a wireline network. In some embodiments, the network comprises a wireless link, such as an infrared channel or satellite band. The topology of the network may comprise a bus, star, and/or ring network topology. The network may be of any network topology known to those ordinarily skilled in the art capable of supporting the operations described herein. In some embodiments, the network comprises a mobile telephone network utilizing any protocol or protocols used to communicate among mobile devices, including AMPS, TDMA, CDMA, GSM, GPRS or UMTS. In some embodiments, different types of data may be transmitted via different protocols. In other embodiments, the same types of data may be transmitted via different protocols.
In some embodiments, the first and/or second computing machine may comprise multiple, logically-grouped machines, which may or may not be remote from each other. In some such embodiments, a logical group of remote machines may be referred to as a server farm. In some embodiments, the remote machines are geographically dispersed. In some embodiments, a server farm may be a single entity. In some embodiments, the server farm comprises a plurality of server farms. In some embodiments, remote machines within each server farm are heterogeneous (e.g., one or more of the remote machines can operate according to one type of operating system platform (e.g., WINDOWS NT, manufactured by Microsoft Corp. of Redmond, Wash.), while one or more of the other remote machines can operate according to another type of operating system platform (e.g., Unix or Linux)).
In some embodiments, a remote machine within a server farm is not physically proximate to another remote machine in the same server farm. Thus, a group of remote machines logically grouped as a server farm may be interconnected using, for example, a wide-area network (WAN) connection or a metropolitan-area network (MAN) connection. For example, a server farm may include remote machines physically located in different continents or different regions of a continent, country, state, city, campus, or room. In some embodiments, data transmission speeds between remote machines in the server farm are increased by using a local-area network (LAN) and/or other direct connection to connect the remote machines.
In some embodiments, a remote machine is a file server, application server, web server, proxy server, appliance, network appliance, gateway, application gateway, gateway server, virtualization server, deployment server, SSL VPN server, firewall, or combination thereof. In some embodiments, a remote machine provides a remote authentication dial-in user service (referred to as a RADIUS server). In some embodiments, a remote machine is a blade server. In some embodiments, a remote machine executes a virtual machine providing, to a user or client computer, access to a computing environment.
In some embodiments, a client communicates with a remote machine. In some such embodiments, the client communicates directly with one of the remote machines in a server farm. In some embodiments, the remote machine receives a request from the client, forwards the request to a second remote machine, and responds to the client with the response it receives from the second remote machine.
Any of a wide variety of input/output (I/O) devices 130a-130n may be present in the computing device 100. Input devices include, without limitation, keyboards, mice, trackpads, trackballs, microphones, and drawing tablets. Output devices include, without limitation, video displays, speakers, inkjet printers, laser printers, and dye-sublimation printers. The I/O devices may be controlled by an I/O controller 123 as shown in
Referring again to
The computing device 100 may include a network interface 118 to interface to the network 104 through a variety of connections including, but not limited to, standard telephone lines, LAN or WAN links (e.g., 802.11, T1, T3, 56 kb, X.25, SNA, DECNET), broadband connections (e.g., ISDN, Frame Relay, ATM, Gigabit Ethernet, Ethernet-over-SONET), wireless connections, or some combination of any or all of the above. Connections can be established using a variety of communication protocols (e.g., TCP/IP, IPX, SPX, NetBIOS, Ethernet, ARCNET, SONET, SDH, Fiber Distributed Data Interface (FDDI), RS232, IEEE 802.11, IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, CDMA, GSM, WiMax and direct asynchronous connections). In some embodiments, the computing device 100 communicates with other computing devices 100′ via any type and/or form of gateway or tunneling protocol such as Secure Socket Layer (SSL) or Transport Layer Security (TLS), and/or the Citrix Gateway Protocol manufactured by Citrix Systems, Inc. of Ft. Lauderdale, Fla. The network interface 118 may comprise a built-in network adapter, network interface card, PCMCIA network card, card bus network adapter, wireless network adapter, USB network adapter, modem, or any other device suitable for interfacing the computing device 100 to any type of network capable of communication and performing the operations described herein.
In some embodiments, the computing device 100 may comprise or be connected to multiple display devices 124a-124n, which each may be of the same or different type and/or form. As such, any of the I/O devices 130a-130n and/or the I/O controller 123 may comprise any type and/or form of suitable hardware, software, or combination of hardware and software to support, enable or provide for the connection and use of multiple display devices 124a-124n by the computing device 100. For example, the computing device 100 may include any type and/or form of video adapter, video card, driver, and/or library to interface, communicate, connect or otherwise use the display devices 124a-124n. In some embodiments, a video adapter may comprise multiple connectors to interface to multiple display devices 124a-124n. In some embodiments, the computing device 100 may include multiple video adapters, with each video adapter connected to one or more of the display devices 124a-124n. In some embodiments, any portion of the operating system of the computing device 100 may be configured for using multiple displays 124a-124n. In some embodiments, one or more of the display devices 124a-124n may be provided by one or more other computing devices, such as computing devices 100a and 100b connected to the computing device 100, for example, via a network. These embodiments may include any type of software designed and constructed to use another computer's display device as a second display device 124a for the computing device 100. One ordinarily skilled in the art will recognize and appreciate the various ways and embodiments that a computing device 100 may be configured to have multiple display devices 124a-124n.
In some embodiments, an I/O device 130 may be a bridge between the system bus 150 and an external communication bus, such as a USB bus, an Apple Desktop Bus, an RS-232 serial connection, a SCSI bus, a FireWire bus, a FireWire 800 bus, an Ethernet bus, an AppleTalk bus, a Gigabit Ethernet bus, an Asynchronous Transfer Mode bus, a HIPPI bus, a Super HIPPI bus, a SerialPlus bus, a SCI/LAMP bus, a FibreChannel bus, or a Serial Attached small computer system interface bus.
A computing device 100 of the sort depicted in
In another aspect, the invention provides a method of treating a fetal disease or condition comprising administering to a patient suffering from a fetal disease or condition an effective dose of a compound or therapeutic agent identified by methods of the present invention, such that symptoms of the fetal disease or condition are ameliorated.
The fetal disease or condition may be selected from the group consisting of twin-to-twin-transfusion syndrome (TTTS), gastroschisis, Down Syndrome, fetal structural anomalies, fetal congenital heart anomaly, fetal kidney anomalies, neural tube defects, and congenital diaphragmatic hernia. In some embodiments in which the fetal disease or condition is Down Syndrome, the compound is selected from the group consisting of anti-oxidants, ion channel modulators, G-protein signaling modulators, and combinations thereof. It is proposed, without wishing to be bound by any particular theory, that anti-oxidants (e.g., celastrol) and ion channel modulators such as calcium channel blockers (e.g., verapamil, felodipine, nifedipine, combinations thereof, etc.) may be beneficial in the treatment of Down Syndrome, as suggested by gene expression data presented in Examples 2-4 of the present application.
In some embodiments, the compound is selected from the group consisting of copper sulfate, 15-delta prostaglandin J2, blebbistatin, prochlorperazine, 17-dimethylamino-geldanamycin, butein, nordihydroguaiaretic acid, acetylsalicylic acid, 51825898, sirolimus, docosahexaenoic acid ethyl ester, diclofenac, mercaptopurine, indometacin, 5279552, 17-allylamino-geldanamycin, rottlerin, paclitaxel, pyrvinium, flufenamic acid, oligomycin, 5114445, resveratrol, Y-27632, carbamazepine, nitrendipine, fluphenazine, 5152487, prazosin, 5140203, cytochalasin B, vorinostat, MG-132, HNMPA-(AM)3, decitabine, U0125, nocodazole, 5224221, 3-hydroxy-DL-kynurenine, 5162773, oxaprozin, colforsin, exemestane, felodipine, HC toxin, 5213008, dimethyloxalylglycine, 5109870, calmidazolium, 5255229, derivatives thereof, and combinations thereof. (See Example 5.)
In some embodiments, the effective dose of the compound is administered in utero and/or perinatally. In some embodiments, the individual to whom the compound is administered is a pregnant woman. In some embodiments, the individual to whom the compound is administered is a fetus. In some embodiments, administering to a fetus comprises administering to the pregnant woman carrying the fetus.
In some aspects, the invention provides methods for evaluating the efficacy of a treatment for a fetal disease or condition. It may be desirable to evaluate the efficacy and/or necessity of a currently used treatment, for example, to distinguish between subgroups of patients with particular diseases or conditions that may respond differently to a particular treatment. In some embodiments, the treatment is a novel treatment being developed for use in routine prenatal care.
In some aspects, the invention provides methods for identifying therapeutic agents for a fetal disease or condition. Therapies are sorely lacking for many diseases and conditions affecting fetuses (such as Down Syndrome). Even for fetal diseases and conditions for which there are available therapies, existing interventions are often only available after birth or at very late stages in fetal development, which may be too late to be beneficial.
Methods for evaluating efficacy of a treatment generally comprise steps of: (a) hybridizing RNA from an amniotic fluid and/or maternal blood sample from a subject suffering from or carrying a fetus suffering from a fetal disease or condition to at least one polynucleotide probe for at least one predetermined gene such that expression levels of at least one predetermined gene are obtained, wherein the sample is obtained from a subject to which the agent in step (b) has not been administered; (b) administering an agent to a subject suffering from the fetal disease or condition; (c) hybridizing RNA from an amniotic fluid and/or maternal blood sample from a subject suffering from or carrying a fetus suffering from a fetal disease or condition to at least one genetic probe for the same predetermined gene(s) from step (a) such that expression levels of the predetermined gene(s) are obtained, wherein the sample is obtained from a subject to which the agent has been administered; (d) comparing the gene expression levels of the predetermined genes obtained from steps (a) and (c); and (e) determining, based on the comparison, efficacy of the agent as a treatment for the fetal disease or condition.
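The comparison and determination in steps (d) and (e) could be carried out computationally in many ways. The following Python sketch shows one hypothetical criterion: the fraction of predetermined genes whose expression level after administration lies closer to a baseline level than it did before administration. The use of a baseline (for example, levels observed in unaffected fetuses), the function name, and all gene names and values are assumptions of this sketch and are not part of the steps recited above.

```python
def fraction_moved_toward_baseline(before, after, baseline):
    """Fraction of predetermined genes whose level moved closer to baseline.

    All three arguments map gene name -> expression level. `before` and
    `after` correspond to steps (a) and (c); `baseline` is an assumption.
    """
    genes = before.keys() & after.keys() & baseline.keys()
    if not genes:
        raise ValueError("no genes shared by all three profiles")
    improved = sum(
        1 for g in genes
        if abs(after[g] - baseline[g]) < abs(before[g] - baseline[g])
    )
    return improved / len(genes)

# Hypothetical values, for illustration only.
before = {"GENE_A": 400.0, "GENE_C": 20.0, "GENE_D": 90.0}
after = {"GENE_A": 150.0, "GENE_C": 60.0, "GENE_D": 95.0}
baseline = {"GENE_A": 100.0, "GENE_C": 80.0, "GENE_D": 80.0}
fraction = fraction_moved_toward_baseline(before, after, baseline)
print(round(fraction, 2))
```

A higher fraction would be one indication, under these assumptions, that the agent shifts expression of the predetermined genes in a favorable direction; any threshold for declaring efficacy would need to be established separately.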
Treatments can be evaluated and/or therapeutic agents can be identified for any of a variety of fetal diseases or conditions using inventive methods described above. These include fetal anomalies such as gastroschisis, diaphragmatic hernia, fetal congenital heart anomaly, fetal kidney anomalies, etc.; chromosomal abnormalities such as Down Syndrome, etc.; and fetal functional abnormalities such as twin to twin transfusion syndrome (TTTS), neural tube defects, etc.
In some aspects, the invention provides methods for diagnosing Down Syndrome. Some inventive diagnostic methods involve performing gene-expression analyses as described herein.
In some embodiments, diagnostic methods comprise providing an amniotic fluid and/or maternal blood sample from a pregnant woman; hybridizing RNA from the sample to at least ten genetic probes for at least ten genes that are differentially expressed in trisomy 21 fetuses such that expression levels of the at least ten genes are obtained; and determining, based on the expression levels of the at least ten genes, a diagnosis with respect to Down Syndrome.
The fetal RNA is obtained from a biological sample (such as, for example, amniotic fluid or maternal whole blood) from a woman pregnant with a fetus with a known gender and gestational age. Gene expression array experiments are then performed on the fetal RNA, and the resulting gene expression pattern for the fetus is compared against established gene expression profiles of sex-matched and gestationally age-matched fetuses that are karyotypically and developmentally normal. (Gene expression profiles for normal fetuses would be obtained from a database of mRNA expression levels established for male and female fetuses at different gestational ages.) The comparison of the test sample's gene expression profile with that from the database of data is then used as a basis for determining a diagnosis of Down Syndrome.
In some embodiments, the expression of genes of the sample is compared against expression profiles of sex-matched and gestationally age-matched trisomy 21 fetuses. (Gene expression for trisomy 21 fetuses may be, for example, obtained from a database of mRNA expression levels established for male and female trisomy 21 fetuses at different gestational ages.) In such embodiments, similarities between the gene expression of the sample and that of the reference trisomy 21 data are positive indicators for a diagnosis of Down Syndrome.
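The comparison against sex-matched and gestational-age-matched reference data described above can be sketched in Python as a lookup followed by a per-gene standardized difference. The layout of the reference database, the use of z-scores, the cutoff of 2.0, the function name, and all values shown are assumptions for illustration only; the sketch does not itself produce a diagnosis.

```python
def matched_z_scores(sample, sex, gestational_week, reference_db):
    """Z-score each gene against the matched reference group.

    `reference_db` maps (sex, gestational_week) -> {gene: (mean, std_dev)}.
    Genes missing from the reference, or with zero spread, are skipped.
    """
    try:
        reference = reference_db[(sex, gestational_week)]
    except KeyError:
        raise LookupError("no reference data for this sex and gestational age") from None
    return {
        gene: (level - reference[gene][0]) / reference[gene][1]
        for gene, level in sample.items()
        if gene in reference and reference[gene][1] > 0
    }

# Hypothetical reference data and sample, for illustration only.
reference_db = {
    ("female", 17): {"GENE_A": (100.0, 10.0), "GENE_B": (50.0, 5.0)},
}
sample = {"GENE_A": 135.0, "GENE_B": 52.0}
z = matched_z_scores(sample, "female", 17, reference_db)
outliers = {gene for gene, score in z.items() if abs(score) >= 2.0}
print(sorted(outliers))
```

The same lookup could be pointed at a reference database of trisomy 21 fetuses, in which case small standardized differences, rather than large ones, would indicate similarity to the reference group.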
Some inventive diagnostic methods involve detecting expression of a particular gene or subset of genes that are known to be differentially expressed in trisomy 21 fetuses. In such methods, a biological sample (such as, for example, amniotic fluid or maternal whole blood) from a pregnant woman is provided. Expression of at least one gene that is differentially expressed in trisomy 21 fetuses is then detected in the biological sample, and a determination is made based on the detected expression with respect to a diagnosis of Down Syndrome.
In some embodiments of the invention, a custom microarray is used for performing gene expression profiling experiments. Such custom microarrays contain genetic probes for at least a subset of genes that are differentially expressed in trisomy 21 fetuses.
The diagnosis in inventive diagnostic methods can be, for example, that the fetus has or does not have Down Syndrome, that the fetus is at risk for developing Down Syndrome, that the fetus is in a particular stage of developing Down Syndrome, that the fetus is likely to develop particular disorders related to Down Syndrome, that the fetus may or may not be responsive to particular therapeutic interventions, etc.
Inventive kits are provided that may be used in prenatal diagnostic applications. Such kits comprise gene expression microarrays that are designed to contain genetic probes for at least a subset of differentially expressed genes associated with a particular fetal disease or condition (as described herein in the “Gene expression microarrays” section). Such kits also comprise a database, or information about how to access a database, comprising baseline levels of mRNA expression established for karyotypically and developmentally normal male and normal female fetuses at different gestational ages. Instructions for using the gene expression arrays in conjunction with the database for diagnostic purposes are also included in inventive kits. In some embodiments, inventive kits include materials for extracting RNA from samples. In some such embodiments, materials are provided that allow extraction of RNA from amniotic fluid samples. In some embodiments, materials are provided that allow extraction of RNA from maternal whole blood samples as well as instructions on how to distinguish fetal RNA transcripts from maternal RNA transcripts.
It will be understood by those of ordinary skill in the art that inventive methods, microarrays, and reagents can be used in the development and evaluation of treatments for and/or diagnosis of a variety of fetal disorders. These include, but are not limited to, fetal anomalies such as gastroschisis, diaphragmatic hernia, fetal congenital heart anomaly, fetal kidney anomalies, etc.; chromosomal abnormalities such as Down Syndrome, etc.; and fetal functional abnormalities such as twin-to-twin transfusion syndrome (TTTS), neural tube defects, etc. For illustrative purposes, a subset of these diseases and conditions is described in further detail below.
Down Syndrome (also known as Trisomy 21) is a disorder caused by the presence of an extra copy of genetic material on chromosome 21 in humans. Trisomy 21 is the most common liveborn fetal autosomal aneuploidy. Down Syndrome patients have shortened life expectancy and reduced fertility. Most Down Syndrome patients exhibit mild to moderate mental retardation. The biological mechanisms underlying Down Syndrome are poorly understood, and fetal therapeutic interventions are lacking. Understanding gene expression profiles of trisomy 21 fetuses may shed light on genetic mechanisms underlying Down Syndrome and may lead to therapies and/or to novel biomarkers for prenatal diagnosis.
Gastroschisis and congenital diaphragmatic hernia (CDH) are relatively common malformations that are easily detected on sonographic examination, and can be associated with significant postnatal morbidity. For both conditions, amniocentesis may be offered as part of the initial diagnostic work-up.
Gastroschisis is a common birth defect characterized by a fissure in the abdominal wall, usually accompanied by protrusion of the viscera. This condition is currently detected with maternal serum screening and confirmed by ultrasound examination. Though it is no longer a lethal disease, short- and long-term morbidity of this condition can be significant, and the degree of intestinal damage is highly variable. In gastroschisis, the typical appearance of exteriorized intestine is a thickened, foreshortened mesentery and stiff, inflamed bowel loops covered with a “peel” of thick pseudomembranes. The degree of peel is variable, and a subset of patients hardly exhibits any bowel wall inflammation at all.
It remains unclear what factors influence clinical severity and/or outcome. It has been suggested that the duration of exposure of the intestinal loops to amniotic fluid may correlate with the degree of damage, though this has not been confirmed. Postnatal recovery of intestinal function is faster, and hospitalization is shorter, in full-term infants than in pre- or near-term ones, which contradicts the notion that amniotic fluid is noxious in this condition. Other suggested factors include the diameter of the abdominal wall defect and the degree of mesenteric constriction, though studies have yielded inconclusive results.
While the majority of neonates with gastroschisis will have a return of normal bowel function within weeks, a substantial minority of patients will experience prolonged intestinal dysfunction lasting months and requiring prolonged parenteral nutrition. These infants are at significant risk of developing central venous line-associated sepsis, TPN-related cholestasis, and liver disease. It is not yet possible to antenatally stratify patients with gastroschisis and to predict which fetuses will follow a protracted course after birth.
Congenital diaphragmatic hernia (CDH) is a condition that can easily be diagnosed prenatally. Despite significant advances in postnatal management, CDH is still associated with high morbidity rates, and survival for the most severe forms continues to be poor. The biology of this condition is still largely unknown, but its genetic basis is becoming increasingly recognized.
The following examples describe some of the preferred modes of making and practicing the present invention. However, it should be understood that these examples are for illustrative purposes only and are not meant to limit the scope of the invention. Furthermore, unless the description in an Example is presented in the past tense, the text, like the rest of the specification, is not intended to suggest that experiments were actually performed or data were actually obtained.
This Example demonstrates the successful extraction and amplification of cell-free fetal mRNA from both fresh and frozen residual amniotic fluid samples. Amniotic fluid samples were initially collected for routine diagnostic purposes; the supernatant is usually discarded following karyotype analysis, while in therapeutic amniocentesis the entire sample is discarded. In a cytogenetics laboratory, samples were spun at 350×g for 10 minutes to remove cells for culture. Samples were centrifuged again at 13,000×g either upon receipt in the case of fresh samples, or immediately after thawing in the case of frozen samples. This ensured that the extracted RNA was truly extracellular.
RNA was extracted using the Qiagen Viral RNA mini kit. Sample starting volumes were typically 420 μL. Synthetic poly-A RNA (15-25 μg) was added to the sample during extraction as a carrier. RNA was concentrated into a final volume of 60 μL.
Initially mRNA was extracted from frozen samples, and was present at a concentration between 500 and 1000 pg/mL. To test whether RNA was degraded by the freeze/thaw process and/or the time lapse between drawing and freezing the sample, frozen samples were thawed and two 420 μL aliquots were drawn; one for immediate processing and one that was kept at 4° C. for three hours before being subjected to RNA extraction. In all cases, there was a significant loss of amplifiable RNA over the three-hour period. Nevertheless, if the amniotic fluid was frozen immediately after acquisition, more RNA was recovered from the frozen sample as compared to the fresh sample.
From these preliminary experiments, it appears that the extracellular RNA present in amniotic fluid at the time of sample acquisition degrades over time. However, there also appears to be an increase in extracellular RNA from lysis and degradation of amniocytes, either over time or from the freezing and thawing of a sample. To obtain the most accurate assessment of extracellular RNA, it is suggested that samples be cleared of all cells as soon as possible after being drawn. It is suggested that samples then be processed immediately or subjected to the addition of RNAse inhibitor and frozen at −80° C.
In this Example, gene expression differences were analyzed between samples from second trimester fetuses with Down Syndrome (DS, also known as trisomy 21) and euploid gestational age-matched controls. Differentially expressed genes were identified by two different methods: one analyzing individual gene differences and another analyzing sets of genes. Such analyses may yield information useful in diagnosis and treatment of fetal diseases, and/or the understanding of fetal development.
Ten mL of residual amniotic fluid (AF) was obtained from women undergoing fetal genetic testing between 16 and 21 weeks of gestation.
RNA was extracted from 8 samples from trisomy 21 fetuses and from 12 euploid samples using the RNeasy® Maxi Kit (QIAGEN). cDNA was synthesized from extracted RNA, amplified, biotin labeled, and hybridized to Human Genome U133 Plus 2.0 Microarrays (Affymetrix). Arrays were scanned with a GeneArray Scanner and analyzed using GeneChip Microarray Suite 5.0 (Affymetrix). The Bioconductor tool set in the R statistical computing and graphics software environment (http://www.r-project.org/) and the Gene Set Enrichment Analysis (GSEA) tool (Subramanian et al. (2005), the contents of which are herein incorporated by reference in their entirety) from the Broad Institute were used to determine whether concordant gene expression differences are present in trisomy 21 fetuses compared to normal controls. Onto-Express (Khatri et al. (2002), the contents of which are herein incorporated by reference in their entirety) was used to determine significantly over-represented Gene Ontology (GO) annotations among the gene lists.
Twenty-three genes were significantly differentially expressed in trisomy 21 fetuses compared to controls. Two of the 23 genes are on chromosome 21. Over-represented functional categories among the list of genes include apoptosis, integrin-mediated signaling and multicellular organismal development. Only one chromosomal band was significantly differentially expressed as a group: the critical region of chromosome 21 (q22). These genes include those that encode potassium and chloride ion binding and transport proteins, as well as those related to heart contraction and nervous system development.
These results show that microarray profiling is useful for the evaluation of abnormal gene expression in the early development of fetuses with trisomy 21. A set of genes, including those related to heart and nervous system development and function, were differentially expressed in fetuses with DS compared to controls. Gene expression profiling of trisomy 21 fetuses early in gestation may contribute to a better understanding of developmental abnormalities associated with this condition.
In this Example, expanded gene expression analyses of trisomy 21 fetuses were performed to identify more genes and further characterize dysregulated genes in Down Syndrome. Similar to the analyses in Example 2, expression differences were analyzed by examining individual genes as well as sets of genes.
Residual amniotic fluid (AF) supernatant samples were obtained from women in their second trimester of pregnancy who were undergoing fetal genetic testing for routine clinical indications. All samples were anonymous, although the karyotype results and gestational ages were known. Samples were stored at −80° C. until RNA extraction. The initial study set consisted of AF samples with the following confirmed metaphase karyotypes: 47,XX,+21 (n=4); 47,XY,+21 (n=5); 46, XX (n=6); and 46, XY (n=6). Gestational ages of fetuses ranged from about 15 weeks to about 22 weeks.
RNA Extraction and cDNA Synthesis and Amplification
RNA was extracted from amniotic fluid samples from trisomy 21 fetuses and from age- and sex-matched karyotypically normal controls. RNA was extracted using a commercially available kit (RNeasy® Maxi Kit, QIAGEN) with some modification. Samples were thawed and homogenized in TRIzol LS reagent (Invitrogen) to permit complete dissociation of nucleoprotein complexes. After homogenization, samples were combined with chloroform to allow separation into organic and aqueous phases. The aqueous phase of each sample was then passed through RNeasy® columns and processed according to the remaining steps of the manufacturer's protocol for the RNeasy® Maxi Kit. (For 10 mL of AF, 30 mL of TRIzol LS reagent and 8 mL of chloroform were used.)
RNA was precipitated using 3M NaOAc and 100% ethanol, and 80% ethanol was added after 4 h incubation at −20° C.
cDNA was synthesized from extracted RNA and then amplified and purified using the WT-Ovation™ Pico RNA Amplification System (NuGEN) and the DNA Clean & Concentrator-25 (Zymo Research) according to the manufacturer's instructions. The WT-Ovation™ Pico RNA Amplification System is specifically designed for the uniform amplification of low starting quantities of RNA (500 pg to 50 ng). In this system, amplification is initiated both at the 3′ end and randomly throughout the whole transcriptome to enable amplification of intact mRNA as well as non-poly(A) transcripts and compromised RNA samples. Double-stranded cDNA was synthesized through a two-step process, using both random and poly(T) primers and reverse transcriptase in the first strand step, and DNA polymerase to generate double stranded cDNA in the second step.
Unamplified cDNA was purified using Agencourt RNAClean® magnetic beads and then amplified using the SPIA™ Amplification Protocol (NuGEN) according to manufacturer's instructions. SPIA™ amplification uses DNA/RNA chimeric primers, DNA polymerase and RNase H in a homogeneous isothermal assay. RNase H is used to degrade RNA in the DNA/RNA heteroduplex at the 5′ end of the first cDNA strand, which results in the exposure of a DNA sequence that is available for binding of a chimeric primer. DNA polymerase then initiates replication at the 3′ end of the primer, displacing the existing forward strand. The RNA portion at the 5′ end of the newly synthesized strand is again removed by RNase H, exposing part of the unique priming site for initiation of the next round of cDNA synthesis. The process is repeated, leading to up to 15,000 fold amplification of cDNA that is complementary to the original RNA.
Amplified cDNA was purified using Zymo-Spin II Columns (Zymo Research). The quality and quantity of amplified cDNA were measured on the Agilent 2100 Bioanalyzer with the 2100 Expert software (Agilent) and the RNA 6000 Nano kit (Agilent).
cDNA was then biotin labeled and fragmented using the FL-Ovation cDNA Biotin Module V2 (NuGEN) according to the manufacturer's instructions. At least 5 μg of biotin labeled and fragmented cDNA suitable for hybridization to the Affymetrix GeneChip® Human Genome U133 Plus 2.0 Array was obtained. (Affymetrix GeneChip® Human Genome U133 Plus 2.0 Arrays allow analysis of over 47,000 transcripts and variants derived from over 38,500 human genes.) Arrays were washed, stained with streptavidin-phycoerythrin, scanned with the GeneArray Scanner, and analyzed using the GeneChip Microarray Suite 5.0 (Affymetrix, Santa Clara, Calif.). Array quality was assessed in R (version 2.7.2) using the simpleaffy package in BioConductor (version 1.7; www.bioconductor.org). Three arrays with scaling factors above 22 and fewer than 15% present calls were discarded.
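The array quality filter described above can be illustrated with a short Python sketch. This is only a minimal illustration, not the simpleaffy workflow actually used: the `ArrayQC` record, the `passes_qc` helper, and all QC values are hypothetical, and the sketch assumes (following the wording above) that an array is discarded only when both conditions hold.

```python
from dataclasses import dataclass


@dataclass
class ArrayQC:
    """Hypothetical per-array QC record (names are illustrative only)."""
    name: str
    scaling_factor: float
    percent_present: float  # present calls, in percent


def passes_qc(qc, max_scaling=22.0, min_present=15.0):
    """Keep an array unless its scaling factor is above the limit
    and its present-call rate is also below the limit."""
    failed = qc.scaling_factor > max_scaling and qc.percent_present < min_present
    return not failed


# Made-up QC values, for illustration only
arrays = [
    ArrayQC("array_A", 4.1, 38.0),
    ArrayQC("array_B", 35.0, 9.5),
    ArrayQC("array_C", 12.7, 22.3),
]
kept = [a.name for a in arrays if passes_qc(a)]
print(kept)
```

If the two criteria were instead meant as independent grounds for exclusion, the `and` in `passes_qc` would become `or`.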
Seven samples from DS fetuses remained: 5 males and 2 females. Five gender-matched controls were matched within 4 days of gestational age of the corresponding DS samples; the other two were collected 10 and 12 days earlier than the respective DS samples. A total of 7 DS and 7 matched controls were further analyzed. (See Table 1.)
Normalization was performed using the threestep function from the affyPLM package in BioConductor, using ideal mismatch for background/signal adjustment, quantile normalization, and the Tukey biweight summary method (Gentleman et al. (2005), the contents of which are herein incorporated by reference in their entirety).
This summary method includes a logarithmic transformation, improving normality of the data. Identification of individual differentially-expressed genes was performed via two-sided, paired t-tests using the multtest package in BioConductor, with the Benjamini-Hochberg adjustment for multiple testing (Benjamini and Hochberg (1995), the contents of which are herein incorporated by reference in their entirety).
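The two statistical steps named above, a paired t statistic per probe set and the Benjamini-Hochberg adjustment, can be sketched in plain Python. This is an illustrative sketch only and not the multtest implementation; the function names are hypothetical, the inputs are assumed to be log-scale expression values for matched pairs, and converting a t statistic to a p-value (a t distribution with n−1 degrees of freedom) is left to a statistics library.

```python
import math


def paired_t_statistic(case, control):
    """t statistic for matched pairs: mean difference / its standard error."""
    diffs = [a - b for a, b in zip(case, control)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up values, for illustration only
t_stat = paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(t_stat, adj)
```

A probe set would then be called differentially expressed when its adjusted p-value falls below the chosen cutoff.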
Gene Set Enrichment Analysis (GSEA) (Subramanian et al. (2005), the contents of which are herein incorporated by reference in their entirety) was performed using GSEA software v. 2.0 and MSigDB version 2.4. This analysis identifies consistent differential expression of sets of genes defined in the MSigDB database. We examined both the functional, curated gene sets (MSigDB collection c2) and gene sets defined by chromosomal bands (MSigDB collection c1), but only the chromosomal band analysis yielded sets that were significant with a false discovery rate (FDR) below 0.05. The full results of the chromosomal band analysis appear in Table 3.
To identify the most differentially expressed genes from these statistically significant gene sets, we chose the “leading edge subset,” a group of the most-upregulated genes in the gene set (Subramanian et al. (2005)). Specifically, the leading edge subset of a gene set contains the genes that contribute the most to the set's enrichment score (ES), a statistic reflecting the degree to which a gene set is over-represented at the top or bottom of a list of genes ranked by their differential expression.
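The relationship between the enrichment score and the leading edge subset can be made concrete with a small Python sketch of the weighted running-sum statistic described by Subramanian et al. (2005). It is a simplified illustration, not the GSEA software: the function name is hypothetical, the gene identifiers and scores are made up, the weighting exponent is assumed to be 1, and permutation-based significance testing is omitted. The sketch also assumes that the set contains at least one gene with a nonzero score and that the ranked list contains at least one gene outside the set.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """ranked: list of (gene, score) pairs sorted by score, descending.
    Returns (ES, leading_edge) for the given gene set."""
    members = set(gene_set)
    hit_total = sum(abs(s) ** weight for g, s in ranked if g in members)
    n_miss = sum(1 for g, _ in ranked if g not in members)
    running, best, best_idx = 0.0, 0.0, -1
    for idx, (gene, score) in enumerate(ranked):
        if gene in members:
            running += abs(score) ** weight / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss                      # step down on a miss
        if abs(running) > abs(best):
            best, best_idx = running, idx
    if best >= 0:
        # Positive ES: set members at or before the peak
        leading = [g for g, _ in ranked[:best_idx + 1] if g in members]
    else:
        # Negative ES: set members after the trough
        leading = [g for g, _ in ranked[best_idx:] if g in members]
    return best, leading


# Hypothetical ranked list and gene set, for illustration only
ranked = [("g1", 2.0), ("g2", 1.5), ("g3", 1.0),
          ("g4", 0.2), ("g5", -0.5), ("g6", -1.0)]
es, leading_edge = enrichment_score(ranked, {"g1", "g2", "g5"})
print(es, leading_edge)
```

In this toy case the running sum peaks after the second gene, so the two set members seen up to that point form the leading edge, while the set member further down the list does not.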
Hierarchical clustering was performed in R, using complete-linkage hierarchical clustering (the hclust function in the stats package), and heatmaps created via the heatmap.2 function in the gplots package, using the “scale=‘row’” option to z-score normalize the rows.
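For readers unfamiliar with these steps, the following Python sketch shows row z-scoring followed by complete-linkage agglomerative clustering of samples. It is an assumption-laden illustration rather than a port of hclust or heatmap.2: the function names are hypothetical, Euclidean distance is assumed, the expression values are made up, and merging simply stops when a requested number of clusters remains instead of building a full dendrogram.

```python
import math
from statistics import mean, stdev


def zscore_rows(matrix):
    """Center each row (gene) to mean 0 and scale to sample SD 1."""
    out = []
    for row in matrix:
        mu, sd = mean(row), stdev(row)  # assumes the row is not constant
        out.append([(x - mu) / sd for x in row])
    return out


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(points, k):
    """Merge clusters until k remain; the distance between two clusters
    is the largest pairwise distance between their members."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Made-up log-expression values: rows are genes, columns are samples 0..3
genes = [[8.0, 8.2, 6.0, 6.1],
         [5.0, 5.1, 7.0, 7.2],
         [9.0, 9.1, 7.5, 7.4]]
samples = [list(col) for col in zip(*zscore_rows(genes))]
groups = complete_linkage(samples, 2)
print(groups)
```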
Using paired t-tests, two sets of genes were identified as being significantly and consistently differentially expressed between trisomy 21 samples and their euploid controls. One set, the “Individual gene set,” comprised 414 probes (see Table 2) whose individual expression levels were significantly different via paired t-tests (adjusted p-value <0.5) in samples matched for sex and gestational age.
Only five probe sets among the Individual gene set were located on chromosome 21, corresponding to the genes CLIC6, ITGB2, RUNX1, and two open reading frames (ORFs) of unknown function (C21orf67, C21orf86). Four of these five were up-regulated in DS; the exception was RUNX1, which was down-regulated in the DS samples. In the full Individual gene set, 224 (54%) of the genes were up-regulated and 190 (46%) were down-regulated. There was widespread differential expression between trisomic and euploid fetuses, and clustering based on these genes alone, excluding chromosome 21 genes, is sufficient to separate the euploid and trisomic samples (
The second set was identified by Gene Set Enrichment Analysis (Subramanian et al. (2005)) (GSEA). A single chromosomal band, chromosome 21, band 22, was identified as having genes that were significantly up-regulated as a group (false discovery rate [FDR] q-value=0.006) in DS fetuses. For functional analysis (see Example 4), the 82-gene “Leading Edge” subset of the genes GSEA identified from band chr21q22 was selected (see Methods and Table 4). Only three genes (CLIC6, RUNX1, and C21orf87) are common to both the Individual and Leading Edge gene sets.
To quantify the extent of differential expression of the known trisomic genes, changes in expression levels of all chromosome 21 probes were examined on microarrays. For each probe set, the fold-change between its average expression level in the DS samples and its average expression in the controls was computed. A histogram of these changes is shown in
In this Example, genes identified in Example 3 as being differentially expressed in Down Syndrome fetuses were subject to functional analyses in order to examine possible mechanisms underlying the disease.
Functional analysis of gene lists was performed in DAVID (Dennis et al. (2003), the contents of which are herein incorporated in their entirety), using the Panther functional annotation classes (Thomas et al. (2003), the contents of which are herein incorporated in their entirety) in addition to the default pathway selections, which include Gene Ontology (GO) terms (Ashburner et al. (2000), the contents of which are herein incorporated in their entirety), pathways defined from the KEGG (Kanehisa et al. (2000), the contents of which are herein incorporated in their entirety) and BioCarta (www.biocarta.com) databases, InterPro protein families (Apweiler et al. (2000), the contents of which are herein incorporated in their entirety), and Protein Information Resource keywords (Barker et al. (2000), the contents of which are herein incorporated in their entirety). DAVID's EASE score rather than the more stringent Benjamini-Hochberg FDR cutoff was used for DAVID results because the FDR adjustment assumes independence of the functional pathways (which, in practice, overlap heavily by design), and adjusting for multiple testing in such cases is controversial (Gentleman (2004)). Therefore, to reduce the possibility of false-positive associations, only those functional processes represented in the DAVID output for both the Individual and Leading Edge gene sets were focused upon.
Pathway analysis of the Individual and Leading Edge gene sets identified in Example 3 was performed in the Database for Annotation, Visualization, and Integrated Discovery (DAVID). All functional annotations were examined with a modified Fisher exact p-value (the “EASE” score) below 0.1 (see Materials and Methods). The full DAVID results for the two gene sets appear in Tables 5 and 6. Several consistent patterns in differential expression were observed in both the Individual and the Leading Edge gene sets (Table 7). Because of the limited overlap between these two sets and the size of the functional groups considered, the two gene sets can be seen as providing largely independent confirmation of the importance of these functional processes in DS. Therefore, in this Example, only functional processes implicated by both gene sets were focused upon. Using this criterion, the following functions appear to be disrupted in DS (Table 7): oxidative stress, ion transport, G-protein signaling, immune and stress response, circulatory system functions, cell structure, sensory perception, and several developmental processes.
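As a point of reference, the EASE score is generally described as a one-sided Fisher exact (hypergeometric) p-value computed after removing one gene from the count of list genes that carry the annotation, which makes it more conservative than the unmodified test. The Python sketch below illustrates that definition only; the function names are hypothetical, the counts are made up, and it is not DAVID's own code.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K are annotated."""
    k = max(k, 0)
    total = comb(N, n)
    hits = sum(comb(K, x) * comb(N - K, n - x)
               for x in range(k, min(n, K) + 1))
    return hits / total


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Fisher exact upper tail evaluated at (list_hits - 1)."""
    return hypergeom_upper_tail(list_hits - 1, list_total, pop_hits, pop_total)


# Made-up counts: 2 of 3 list genes annotated; 4 of 10 background genes annotated
fisher_p = hypergeom_upper_tail(2, 3, 4, 10)
ease_p = ease_score(2, 3, 4, 10)
print(fisher_p, ease_p)
```

Because the tail is evaluated one gene lower, the EASE score is never smaller than the corresponding Fisher exact p-value, and the penalty is largest for annotations supported by very few genes.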
Although there is very little overlap between the gene lists from the two sets (the 414-probe Individual gene set and the 82-gene Leading Edge subset), the gene lists tend to implicate the same processes. Without wishing to be bound by any particular theory, the inventors suggest that several of the implicated functional groups of genes may be amenable to a single explanation. For example, reactive oxygen species (especially hydrogen peroxide) are known to disrupt ion transport mechanisms, leading to problems with signal transduction through cell membranes, and in turn to cellular dysfunction (possibly including structural membrane problems) and pathological symptoms. (See, e.g., Kourie et al. (1998), the contents of which are herein incorporated by reference in their entirety.) Kourie et al. point out that oxidative stress can act either directly on ion transport genes and pathways or indirectly by targeting membrane phospholipids.
Based on studies in adult patients, oxidative stress is known to play a role in Down Syndrome as well as in Alzheimer's disease (Zana et al. (2007)). Data described in this Example consistently support a role for oxidative stress in Down Syndrome fetuses. For example, significant expression differences were observed in a few genes involved in phospholipid biology, many genes involved in ion transport, a few genes involved in heart muscle physiology, and some DNA damage repair genes. Without wishing to be bound by any particular theory, the inventors propose that data presented in this Example suggest that there is an oxidative stress response present before birth.
Data presented in this Example also support a role for G-proteins in Down Syndrome. Though a role for G-protein dependent pathways has been suggested by the literature (see, e.g., Best et al. (2007) and Lumbreras et al. (2006)), the data presented here suggest a wider role for G-protein signaling than had been appreciated before.
Immune response genes that were differentially expressed include three interferon receptors on chromosome 21 and other genes involved in broader processes. Genes involved in developmental processes and sensory perception also appear to be misregulated in trisomy 21 samples.
In this Example, compounds that may be useful in treatments for Down Syndrome were identified using genomic approaches. The lists of differentially-expressed genes in trisomy 21 fetuses obtained in Example 3 were used in conjunction with the Connectivity Map (Lamb et al. (2006) and Lamb et al. (2007), the contents of each of which are herein incorporated by reference in their entirety), a publicly available reference collection of gene expression profiles of human cells treated with bioactive small molecules.
To identify compounds with molecular signatures that might mimic or mitigate the effects of DS, we used Connectivity Map build 1.0, which contains a database of 564 expression profiles representing the effects of 164 compounds on 4 cancer cell lines, using the Affymetrix U133A microarrays (Lamb et al. (2006)). Since the U133 Plus 2.0 arrays used in the present study contain a superset of the probe sets on the U133A arrays, the Connectivity Map analysis was run using only those probe sets that were common to both arrays.
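Restricting the query signature to probe sets shared by both array designs amounts to a simple set intersection, sketched below in Python. The function name and the probe set identifiers are hypothetical placeholders, and the sketch assumes the signature is supplied as separate lists of up- and down-regulated probe sets.

```python
def restrict_signature(up_probes, down_probes, reference_probes):
    """Keep only probe sets that also exist on the reference array,
    preserving the original order of each list."""
    ref = set(reference_probes)
    up = [p for p in up_probes if p in ref]
    down = [p for p in down_probes if p in ref]
    return up, down


# Placeholder identifiers, for illustration only
reference_array = ["probe_001", "probe_002", "probe_003", "probe_004"]
up_in_ds = ["probe_002", "probe_900", "probe_004"]
down_in_ds = ["probe_901", "probe_001"]

up, down = restrict_signature(up_in_ds, down_in_ds, reference_array)
print(up, down)
```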
To further confirm the importance of functional processes identified in Example 4, the Connectivity Map was used to identify compounds whose molecular signatures either mimic or counteract that of DS. Four compounds with average connectivity scores above 0.7 (indicating a high correlation with the DS molecular signature), and 9 compounds with average connectivity scores below −0.7 (indicating a high negative correlation) were found.
The full results of the Connectivity Map analysis appear in Table 8.
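The thresholding step described above can be sketched as follows. This is only an illustration of the ±0.7 cutoff on average connectivity scores: the function name is hypothetical, and the compound labels and scores are invented placeholders rather than values from Table 8.

```python
def split_by_connectivity(avg_scores, threshold=0.7):
    """Split compounds into those whose signature resembles the query
    (score above +threshold) and those that oppose it (below -threshold)."""
    mimics = sorted(c for c, s in avg_scores.items() if s > threshold)
    reverses = sorted(c for c, s in avg_scores.items() if s < -threshold)
    return mimics, reverses


# Invented placeholder scores, for illustration only
avg_scores = {
    "compound_A": 0.82,
    "compound_B": -0.91,
    "compound_C": 0.10,
    "compound_D": -0.75,
}
mimics, reverses = split_by_connectivity(avg_scores)
print(mimics, reverses)
```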
Compounds identified as potentially capable of reversing an observed DS molecular phenotype (and which thus might be candidates for further hypothesis testing in vitro) include without limitation NSC-5255229, celastrol, calmidazolium, NSC-5109870, dimethyloxalylglycine, NSC-5213008, verapamil, HC toxin, and felodipine. Celastrol is an antioxidant and anti-inflammatory agent that has been suggested for use in treating Alzheimer disease, which prematurely affects many DS patients (Allison et al. (2001), the contents of which are herein incorporated by reference in their entirety). Calmidazolium is a calmodulin inhibitor, which decreases sensitivity to calcium ion signaling, and has been considered for use in treating osteoporosis (Seales et al. (2006), the contents of which are herein incorporated by reference in their entirety). Verapamil and felodipine are both calcium channel blockers, while dimethyloxalylglycine is a hydroxylase inhibitor thought to increase resistance to oxidative stress (Cummins et al. (2008) and Zaman et al. (1999), the contents of each of which are herein incorporated by reference in their entirety).
The four compounds that (according to this analysis) most mimic the DS phenotype also relate to potassium and calcium signaling or oxidation. These results also implicate oxidative stress and ion transport, providing a third level of confirmation of the importance of these functional classes.
Without wishing to be bound by any particular theory, it is contemplated that compounds identified as having molecular signatures that counteract that of DS may be useful in therapeutic interventions for Down Syndrome. In some embodiments, such interventions are commenced before birth of the affected fetus. In some embodiments, derivatives of compounds identified as having molecular signatures that counteract that of DS are used. In some embodiments, combinations of two or more compounds (and/or derivatives thereof) whose molecular signatures counteract that of DS are used in therapeutic intervention(s).
Results described in Examples 2-5 demonstrate that transcriptional profiling of RNA in uncultured amniotic fluid provides a unique molecular window into developmental disorders in the living human fetus. In addition to identifying genes relevant to the DS phenotype, functional profiling was undertaken to identify significantly disrupted biological pathways.
Without wishing to be bound by any particular theory, it is contemplated that among the functional pathway groups identified by both the individual and gene set analyses, several are amenable to a single explanation. Reactive oxygen species, especially hydrogen peroxide, are known to disrupt ion transport mechanisms, leading to problems with signal transduction through cell membranes, cell dysfunction, structural failure of membrane integrity, and ultimately to pathological symptoms, particularly in neural and cardiac tissues (Kourie et al. (1998)). Consistent evidence of several of these steps was observed, including dysregulation of oxidative stress response genes, phospholipids, ion transport molecules, heart muscle genes, structural proteins, and DNA damage repair genes, in both the Individual and the Leading Edge gene sets (Table 7 and
It has previously been suggested that oxidative stress plays an important role in DS (Zaman et al. (1999)). Since individuals with DS demonstrate pathology consistent with Alzheimer's disease at an early age (Bush et al. (2004)), links to the role of oxidative stress in Alzheimer's have been explored (Zana et al. (2007)). Lockstone et al. (2007) found that oxidative stress response genes were over-represented in adult but not fetal DS tissue, and suggested that this response might reflect adult-onset DS pathologies such as Alzheimer disease. More recently, two groups have found oxidative stress response markers in fetal DS tissues, although neither study emphasized this particular result or considered the potential relationship between oxidative stress and other functional pathways (Rozovski et al. (2007) and Mao et al. (2005)). Esposito et al. (2008) identified oxidative stress and apoptosis genes in neural progenitor cell lines generated from the frontal cortex of second trimester DS fetuses. They suggested that up-regulation of the chromosome 21 gene S100B causes an increase in reactive oxygen species and stress-response kinases, leading to an increase in programmed cell death. Using a biochemical approach, other investigators demonstrated increased levels of isoprostanes, a marker of oxidative stress, in second trimester amniotic fluid samples from DS fetuses (Perrone et al. (2007)).
The present application discloses the first functional analysis of the DS fetus that implicates not only oxidative stress, but potential intermediate consequences, such as defects in ion transport and G-protein signaling.
In mouse models, at least one G-protein coupled potassium channel protein (GIRK2) has been implicated in DS pathology (Best et al. (2007) and Harashima et al. (2006)). Another study using adult mouse models has suggested a role for two other G-protein dependent pathways in DS and Alzheimer disease (Lumbreras et al. (2006)). The inventors' results described in Examples 3-5, however, suggest a wider and more fundamental role for G-protein signaling, involving a large number of proteins and appearing as early as the second trimester.
These results also contribute to an ongoing debate regarding the extent of transcriptional changes due to trisomy 21. Despite the many prior studies of gene expression in DS, the precise mechanism by which the additional set of chromosome 21 genes disrupts normal development and results in the phenotype of DS remains unknown. Consistent with several previous studies (Mao et al. (2005), Amano et al. (2004), and Dauphinot et al. (2005)), it was observed that trisomic genes generally showed increased expression in DS, with average up-regulation centered near 1.5-fold (
In the present disclosure, hierarchical clustering of samples based on expression levels of the 409 Individual genes not located on chromosome 21 completely separates the DS samples from the controls (as seen in
While previous reports identified significant differential expression of trisomic genes, the present analysis surprisingly did not. It is noted that, since most of the amniotic fluid from which RNA was obtained for these studies is cell-free, care should be taken when comparing these results to previously published transcriptomic profiles derived from fetal cells or tissue (Altug-Teber et al. (2007), Chung et al. (2005), FitzPatrick et al. (2002), Rozovski et al. (2007), Mao et al. (2005), Gross et al. (2002), and Li et al. (2006)). Without wishing to be bound by any particular theory, it is proposed that this discrepancy may also be due to some of the other data being derived from mouse models of DS, which are more genetically homogeneous than the human population samples used in the present studies. It is further proposed, without wishing to be bound by any particular theory, that most likely the discrepancy is due to the use of a strict statistical cutoff for differential expression in the present studies, including adjustment for multiple testing of over 54,000 probe sets. Relatively few chromosome 21 genes showed expression consistent enough across the diverse sample population for the evidence of their moderate up-regulation to exceed this strict significance cutoff. Fortunately, GSEA was developed precisely to detect such consistent but modest expression changes. With no a priori bias, the GSEA tool identified the DS critical region as the only strongly (q<0.05) up-regulated chromosomal band in the DS samples.
Addition of the Connectivity Map analysis not only confirmed the pathways implicated by DAVID, but also suggested possible testable hypotheses to develop novel treatments for DS, starting with an in vitro approach to explore the effects of compounds suggested by the Connectivity Map analysis and/or of other compounds with similar effects on oxidation or ion transport. Work described here serves as proof of concept that gene expression profiles from living second trimester human fetuses with developmental disorders can lead to a better understanding of the early etiology of disease as well as the secondary consequences of congenital anomalies, and may suggest future innovative approaches to treatment.
In the present Example, gene expression profiles of fetuses with structural abnormalities such as gastroschisis and congenital diaphragmatic hernia (CDH) are obtained and analyzed. Such profiling is predicted to enable development of in utero therapies for these conditions via an increased understanding of the genetic mechanisms underlying these conditions.
Pregnant women are recruited into the study at the time of sonographic confirmation of the defect. Amniotic fluid supernatant samples that would otherwise be discarded are obtained from the cytogenetics laboratory involved. Approaches to isolation, amplification, and hybridization of fetal cell-free mRNA from amniotic fluid, as well as computational approaches, are the same as described in Examples 2 and 3.
Gene expression profiles may be used, for example, to identify possible therapeutic regimens and/or agents. For example, gene expression profiles of fetuses with structural abnormalities may be used in conjunction with the Connectivity Map (see Example 5) to identify a list of candidate compounds that may have therapeutic uses for these conditions.
In this Example, gene expression analyses are performed before and after treatment to explore the effect of current fetal treatments. Twin-to-twin transfusion syndrome (TTTS) is used in this particular example. It is understood that insights gained from this Example may also be useful for developing and/or assessing treatments for other fetal anomalies, conditions, and diseases (such as Down Syndrome).
Treatment of TTTS has changed dramatically with the introduction of endoscopic laser ablation of communicating placental vessels. While occlusion of these anastomoses has been shown to halt the transfusion syndrome in almost all cases (as documented by return of diuresis and amniotic fluid in the donor twin within hours or days), survival rates are less than expected, and intrauterine demise of both twins is still observed in 20-25% of cases. In another 30%, only one twin will ultimately survive. Of note, these results appear independent of the individual surgeon and medical center, and have not significantly improved since the introduction of the technique 15 years ago.
It is still unclear why only 15-20% of all identical twins develop the syndrome (although almost all monochorionic twins have placental anastomoses), or why some exhibit a rapid deterioration, while others improve spontaneously. With the availability of an effective treatment, the ability to differentiate the more aggressive subgroup of this disease would be remarkably useful in improving survival, without placing pregnant women and fetuses at undue risk if the disease is predicted to follow a more benign course. Current diagnostic methods are not sensitive enough to distinguish this group, despite very detailed sonographic and Doppler descriptions of the different clinical stages of TTTS.
In the present Example, samples of amniotic fluid are obtained before and after intervention to examine specific changes in gene expression that result from treatment. Amniotic fluid is routinely collected at the time of the fetoscopic procedure. To analyze the effects of the intervention, the study includes an amniocentesis at a later date, at a time when the risk of preterm labor associated with the procedure has subsided. Methods of isolating, amplifying, and hybridizing fetal cell-free mRNA from amniotic fluid, as well as computational methods, are similar to those described in Example 3. The list of genes up-regulated in amniotic fluid at the time of the procedure is compared to the list obtained from the same fetus at a later date following laser ablation.
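By way of non-limiting illustration, the comparison of up-regulated gene lists obtained from the same fetus before and after intervention can be sketched as a set comparison. The function name, category labels, and gene identifiers below are hypothetical placeholders and are not results of the present studies.

```python
def compare_gene_lists(up_before, up_after):
    """Partition genes by whether they are up-regulated before and/or after treatment."""
    before, after = set(up_before), set(up_after)
    return {
        # Up-regulated at the time of the procedure but not afterwards.
        "resolved": sorted(before - after),
        # Up-regulated at both time points.
        "persistent": sorted(before & after),
        # Up-regulated only after the intervention.
        "new": sorted(after - before),
    }


# Hypothetical placeholder identifiers for one fetus.
pre_treatment = ["GENE_A", "GENE_B", "GENE_C"]
post_treatment = ["GENE_B", "GENE_D"]
result = compare_gene_lists(pre_treatment, post_treatment)
print(result)
```

Genes in the "resolved" group would be candidates for expression changes that reverse following treatment, whereas "persistent" genes remain up-regulated at both time points.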
In this Example, baseline gene expression data are obtained from normal fetuses at various ages during the second trimester, a time during which medical intervention could occur in order to prevent the development of symptoms in the fetus. Such baseline gene expression data would be useful for comparison purposes when analyzing gene expression patterns in fetuses with chromosomal, structural, and/or growth abnormalities, such as Down Syndrome fetuses.
In this Example, RNA samples are obtained from maternal whole blood samples rather than amniotic fluid. Although amniotic fluid is a source of pure fetal mRNA, it can only be obtained through an invasive procedure. Maternal blood can be obtained through less invasive procedures, though it presents challenges because it contains a mixture of both maternal and fetal nucleic acids. Nevertheless, the inventors had previously shown that they can identify fetal gene expression in maternal whole blood samples (Maron et al. 2007, the entire contents of which are herein incorporated by reference).
Preliminary studies had provided baseline gene expression data on 10 fetuses between 36 and 39 weeks of gestation (i.e., at term). This time period was selected because of the ease of coordinating antepartum and postpartum maternal and infant blood samples. This study showed that it is possible to isolate mRNA from whole blood and perform a comparative genomic analysis to identify genes that were differentially up-regulated in the maternal antepartum samples, representing candidate fetal transcripts.
In this Example, women who are undergoing elective termination of pregnancy are enrolled. An antenatal sample is then obtained during pregnancy, as well as a fetal cord blood sample (technically feasible after 18 weeks of gestation) and a post-termination sample.
Methods to Isolate Fetal mRNA from Whole Maternal Blood
Whole blood samples are obtained from women in the first, second and third trimesters of pregnancy, as well as before and after term delivery. All blood samples are obtained in PaxGene specimen tubes (PreAnalytiX) and stored at room temperature for 6 to 36 hours prior to RNA extraction.
RNA is extracted using the PaxGene blood RNA kit (PreAnalytiX) according to the manufacturer's instructions. Following extraction, a portion of the eluted RNA sample is analyzed on the Bioanalyzer 2100 (Agilent) to assess the quantity and quality of each sample. Samples with distinct peaks representing 18S ribosomal RNA and a minimum quantity of 1 μg of total RNA are selected for further processing. Extracted total RNA is then amplified and converted to cDNA using the commercially available One Step Amplification Kit (Affymetrix) according to Van Gelder et al. (1990). Following amplification, the cDNA is assessed with the Bioanalyzer 2100 for quantity and quality. When possible, 15 μg of amplified labeled cDNA is then fragmented and hybridized to the GeneChip® Human Genome U133 Plus 2.0 Array, which allows analysis of over 47,000 transcripts and variants derived from over 38,500 human genes.
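By way of non-limiting illustration, the sample selection criteria described above (a distinct 18S ribosomal RNA peak and at least 1 μg of total RNA) can be sketched as a simple filter. The record structure, field names, and sample values are hypothetical; in practice, peak detection would be based on the Bioanalyzer output, which is not modeled here.

```python
from dataclasses import dataclass

MIN_TOTAL_RNA_UG = 1.0  # minimum total RNA, in micrograms, per the criteria above


@dataclass
class RnaSample:
    sample_id: str        # hypothetical identifier
    total_rna_ug: float   # total RNA quantity in micrograms
    has_18s_peak: bool    # whether a distinct 18S rRNA peak was observed


def passes_qc(sample: RnaSample) -> bool:
    """Return True if the sample meets both selection criteria."""
    return sample.has_18s_peak and sample.total_rna_ug >= MIN_TOTAL_RNA_UG


# Hypothetical samples for illustration only.
samples = [
    RnaSample("S1", 2.4, True),
    RnaSample("S2", 0.6, True),   # too little RNA
    RnaSample("S3", 3.1, False),  # no distinct 18S peak
]
selected = [s.sample_id for s in samples if passes_qc(s)]
print(selected)
```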
Gene expression data analysis methods known to be relevant to time series analysis are used to identify genes that are changing significantly and consistently during normal development. Such methods include analysis of variance (ANOVA), Fourier-transform methods (Wichert et al. 2004; Aach and Church 2001) and spline-fitting methods (Bar-Joseph, et al. 2003).
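By way of non-limiting illustration, a one-way analysis of variance for a single gene across gestational age groups can be sketched as follows. Only the F statistic is computed; obtaining a p-value would additionally require the F distribution, which is omitted here. The function name, grouping, and expression values are hypothetical.

```python
def one_way_anova_f(groups):
    """Compute the one-way ANOVA F statistic for a list of groups of values.

    Each group holds expression values for one gene at one gestational age.
    Assumes at least two groups and non-zero within-group variation.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical expression values for one gene at three gestational ages.
changing_gene = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
stable_gene = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
print(one_way_anova_f(changing_gene), one_way_anova_f(stable_gene))
```

A larger F statistic indicates that expression differs more between gestational ages than within them, which is the pattern sought when identifying genes that change consistently during normal development.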
In this Example, gene expression data from abnormal fetuses is compared to that of gestationally age-matched normal fetuses. Once a developmental gene expression profile is established for normal fetuses in the second trimester and at term, cases of gestational-age matched pregnancies that are complicated by fetal chromosomal or anatomic abnormalities are sought for analyses. Studies in this Example focus on trisomy 21 because of the interest in providing noninvasive prenatal diagnosis (as opposed to screening) for this condition. Additional analyses on diseased fetuses such as those with trisomy 21 (using baseline gene expression profiles from Example 5) may provide an opportunity to explore new hypotheses, such as whether fetuses with severe intrauterine growth restriction at term manifest neurodevelopmental abnormalities in utero.
Preliminary studies showed that it is possible to successfully isolate mRNA from whole blood, hybridize to gene expression arrays, and create gene lists of up-regulated genes that are involved in fetal development at term. Methods for processing maternal blood samples are the same as in Example 5. Expression data obtained from “diseased” fetuses are compared to that of “healthy” fetuses at the same gestational age to determine differences in expression.
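By way of non-limiting illustration, one simple way to express a difference in expression between gestational age-matched groups is a log2 fold change of mean signal for each gene. This is an assumed summary measure chosen for illustration and is not stated to be the measure used in the present studies; the function name and intensity values are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(diseased, healthy):
    """Return log2 of the ratio of mean expression (diseased over healthy).

    Inputs are positive, non-log-transformed signal intensities for one gene.
    """
    return math.log2(mean(diseased) / mean(healthy))


# Hypothetical intensities for one gene in age-matched samples.
diseased_values = [140.0, 150.0, 160.0]
healthy_values = [95.0, 100.0, 105.0]
lfc = log2_fold_change(diseased_values, healthy_values)
print(round(lfc, 3))
```

A positive value indicates higher mean expression in the affected group; a 1.5-fold increase corresponds to a log2 fold change of approximately 0.585.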
In this Example, custom microarrays for prenatal diagnostic applications are developed using gene expression data from Examples 7 and 8. Custom microarrays could be developed for a variety of disorders affecting fetuses depending on the gene expression data that becomes available from Examples 7 and 8.
Using trisomy 21 as an example, affected fetuses are shown to have differences in expression of genes related to cardiac function depending on the extent of their underlying cardiac malformation(s). Custom gene expression microarrays are designed specifically to include genes identified as being differentially regulated for a particular condition.
Such custom microarrays could help identify those infants likely to manifest cardiac failure in the perinatal period, which will influence location of delivery. Similarly, a “neurodevelopmental” custom array, used with blood samples from pregnant women with complicated pregnancies, could identify those fetuses with the highest likelihood of abnormal neurologic development. Such determinations may influence decisions regarding route of delivery or how long to allow the pregnant woman to labor.
Several companies already offer creation of custom microarrays for reasonable costs via an online system. In particular, Agilent Technologies allows uploading of probe sequences, a selection of various formats, and printing using well-validated technology. Ultimately sequences most likely to distinguish between normal and abnormal fetuses are selected and custom arrays with such sequences are ordered. For gene expression profiling, methods described in Example 7 are used, except that custom arrays are used instead of Affymetrix arrays.
Analyses discussed in this Example are directed to developing novel fetal treatment approaches that could potentially allow intervention earlier than existing therapies allow. Current fetal therapies are generally offered when signs and symptoms of disease have already developed, or the clinical significance of the condition is well-known. Although in some cases fetal treatment is necessary to prevent fetal demise (for example, laser ablation of shared vessels in twin-to-twin transfusion syndrome, TTTS), in many cases the treatment is too late. The development of treatments that can be offered earlier may be facilitated by understanding what biological pathways are involved in normal fetal development and by the ability to identify fetuses that are developing abnormally whether or not they show symptoms. Such treatments might reduce or ameliorate symptoms before birth. Although the following discussion focuses on TTTS, it is to be understood that novel treatment approaches could similarly be designed for Down Syndrome.
The metabolic and hormonal aspects of TTTS have been extensively studied. It is now believed that the incidence of hydrops in the recipient twin is higher than one would expect based solely on a theory of fluid overload. It has been speculated, without wishing to be bound by any particular theory, that the (appropriate) up-regulation of the renin-angiotensin system in the chronically hypovolemic donor causes inappropriate vasoconstriction and fluid retention in the recipient, as long as both fetuses remain connected by vascular anastomoses. It has been suggested that activation of renin in the donor kidney may be secondary to chronic hypoxemia because of elevated placental vascular resistance. A large percentage of donor fetuses have a significantly smaller placental share, which may cause intrauterine growth restriction (IUGR) and increased vascular resistance. Other putative factors and pathways have been proposed as well, lending credence to the theory that fetal well-being and/or stress affect placental and maternal metabolism. Analysis of amniotic fluid and maternal serum from subjects with varying degrees of severity of TTTS, and their differential analysis before and after surgical treatment, could potentially offer new insight into the pathophysiology of the disease, methods of early detection of severe or rapidly evolving forms, and opportunities to offer non-operative treatment.
The inventors had previously demonstrated that microarray gene expression profiling of amniotic fluid provides important information about fetal well-being, development, and potential disease status (Larrabee et al. (2005), the contents of which are herein incorporated by reference in their entirety). Two fetuses with TTTS and two fetuses with hydrops fetalis were compared to pooled information from six normal fetuses. At the time of publication of Larrabee et al. (2005), pathway analysis had not been performed. Subsequently, data from the preliminary study's gene set were analyzed using Ingenuity pathway analysis software. Results are presented in Table 9. Although statistical significance was limited due to the small number of samples involved, the results suggest a strong involvement of carbohydrate metabolism pathways in the pathophysiology of TTTS and hydrops fetalis.
Results of pathway analyses are reviewed for consistent trends. Should specific biologic pathways be identified in specific diseases, consultants with knowledge of the disease and experts in pharmacology are identified to develop specific suggestions of drugs that would be safe to administer to the fetus and that would warrant controlled study.
All literature and similar materials cited in this application, including patents, patent applications, articles, books, treatises, dissertations and web pages, regardless of the format of such literature and similar materials, are expressly incorporated by reference in their entirety. In the event that one or more of the incorporated literature and similar materials differs from or contradicts this application, including defined terms, term usage, described techniques, or the like, this application controls.
The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described in any way.
Other embodiments of the invention will be apparent to those skilled in the art from a consideration of the specification or practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with the true scope of the invention being indicated by the following claims.
The present application claims benefit of and priority to U.S. provisional applications Ser. Nos. 61/057,874 (filed on Jun. 1, 2008) and 61/180,904 (filed on May 25, 2009), the contents of which are herein incorporated by reference in their entirety.
This invention was made with U.S. government support under the Eunice Kennedy Shriver National Institute of Child Health and Human Development Award (R01 grant nos. HD042053-06 and R01 HD058880-01 to Diana Bianchi and Donna Slonim respectively). The government of the United States of America has certain rights in the invention.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US09/45876 | 6/1/2009 | WO | 00 | 11/22/2010 |

Number | Date | Country |
---|---|---|
61057874 | Jun 2008 | US |
61180904 | May 2009 | US |