The present invention relates to markers of acute asthma exacerbation and methods of using the same for the prediction, diagnosis and prognosis of acute asthma exacerbation.
Asthma is a chronic inflammatory disease of the airways that is characterized by recurrent episodes of reversible airway obstruction and airway hyperresponsiveness (AHR). Typical clinical manifestations of acute asthma exacerbation (also known as asthma attack) include shortness of breath, wheezing, coughing and chest tightness that can become life threatening or fatal. Despite the considerable progress that has been made in elucidating the pathophysiology of asthma, the prevalence, morbidity, and mortality of the disease have increased during the past two decades. In 1995, in the United States alone, nearly 1.8 million emergency room visits, 466,000 hospitalizations and 5,429 deaths were directly attributed to acute asthma exacerbation.
It is generally accepted that allergic asthma is initiated by an inappropriate inflammatory reaction to airborne allergens. The lungs of asthmatics demonstrate an intense infiltration of lymphocytes, mast cells and eosinophils. A large body of evidence has demonstrated this immune response to be driven by CD4+ T-cells expressing a TH2 cytokine profile. Four major pathophysiological responses seen in human asthma include upregulation of serum IgE (atopy), eosinophilia, excessive mucus secretion, and AHR.
Current therapy for asthma includes use of bronchodilators, corticosteroids and leukotriene inhibitors. The treatments share the same therapeutic goal of bronchodilation, reducing inflammation and facilitating expectoration. Many of such treatments, however, include undesired side effects and lose effectiveness after being used for a period of time. Additionally, only limited agents for therapeutic intervention are available for decreasing the airway remodeling process that occurs in asthmatics. Therefore, there remains a need for an increased molecular understanding of asthma, and a need for the identification of novel therapeutic strategies to combat these complex diseases.
In one aspect, the invention provides a method for determining the molecular signature of asthma exacerbation attack of a subject, comprising the steps of determining the level of at least one biomarker in said subject prior to an exacerbation attack; determining the level of the at least one biomarker in said subject during an asthma exacerbation attack; and ascertaining the difference between the level of the biomarker prior to the attack and the level during the attack. The difference in the level of a particular biomarker or plurality of biomarkers (i.e., a change in expression of one or more biomarkers) indicates the molecular signature, which in turn indicates the type of asthma exacerbation attack. In some embodiments, the levels of biomarkers are determined from a sample obtained from the subject. In one embodiment, the sample is a blood sample comprising peripheral blood mononuclear cells (PBMCs). In some embodiments, the type of asthma exacerbation attack is one of innate immunity (subgroup X), as indicated e.g. by a change in expression of one or more biomarkers listed in Tables 4, 6 and 9. In some embodiments, the type of asthma exacerbation attack is one of cognate immunity (subgroup Y), as indicated e.g. by a change in expression of one or more biomarkers listed in Tables 5 and 10. In some embodiments, the type of asthma exacerbation attack is coextensive with an airway infection, as indicated e.g. by a change in expression of one or more biomarkers listed in Tables 11 and 12. In yet other embodiments, the type of asthma exacerbation attack does not involve an airway infection, as indicated by a change in expression of biomarkers selected from a group comprising interferon induced with helicase C domain 1 (IFIH1; e.g. SEQ ID NO:60), leukotriene A4 hydrolase (LTA4H; e.g. SEQ ID NO:61) and open reading frame number 25 of human chromosome 6 (C6ORF25; SEQ ID NO:62). In some embodiments, the biomarkers are nucleic acids. 
In other embodiments, the biomarkers are polypeptides.
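The comparison of biomarker levels before and during an attack can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the function names, the 0.25 change cutoff and the two three-gene panels (built from genes named elsewhere in this description) are hypothetical stand-ins, and the actual marker sets are those of the referenced Tables, which are not reproduced here.

```python
from typing import Dict, Optional, Set

# Hypothetical panels for illustration only; the real marker lists are the
# Tables cited in the text (e.g. Tables 4, 6 and 9 for subgroup X).
PANELS: Dict[str, Set[str]] = {
    "innate (subgroup X)": {"IRF7", "OAS1", "MX1"},
    "cognate (subgroup Y)": {"CD72", "CD19", "CD79B"},
}


def molecular_signature(before: Dict[str, float],
                        during: Dict[str, float]) -> Dict[str, float]:
    """Per-biomarker change: level during the attack minus level before it."""
    shared = before.keys() & during.keys()
    return {marker: during[marker] - before[marker] for marker in sorted(shared)}


def exacerbation_type(signature: Dict[str, float],
                      panels: Dict[str, Set[str]] = PANELS,
                      min_change: float = 0.25) -> Optional[str]:
    """Return the panel with the largest fraction of changed markers.

    min_change is an assumed cutoff; returns None if no panel marker changed.
    """
    best_name, best_fraction = None, 0.0
    for name in sorted(panels):
        changed = [m for m in panels[name]
                   if abs(signature.get(m, 0.0)) >= min_change]
        fraction = len(changed) / len(panels[name])
        if fraction > best_fraction:
            best_name, best_fraction = name, fraction
    return best_name
```

In this sketch a marker that was not measured at both time points is simply ignored; a real assay would need an explicit policy for missing values.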
In another aspect, the invention provides a method for selecting a treatment for asthma exacerbation in a patient, comprising the steps of determining the type of asthma exacerbation based on the molecular signature in the patient (supra), then selecting a treatment corresponding to the type of asthma exacerbation. In some embodiments, the therapies are tailored to stopping T and/or B cell cognate immunity, innate immunity and/or airway infection. In some embodiments, the blood of the patient is monitored to ascertain a change in the levels of one or more biomarkers to assess the effectiveness of treatment and to revise therapy as indicated.
In one aspect, the invention provides a method for identifying individuals at risk for asthma, by identifying an individual who does not yet exhibit symptoms of asthma, measuring the level of at least one product in a sample obtained from the individual, comparing that level to a reference level of the product, and optionally providing the result of the comparison to a user. The product is the product of at least one gene that is differentially expressed in individuals having an acute exacerbation of asthma versus those not having an acute exacerbation of asthma. A difference between the reference level and the level of the product indicates that the individual is at risk for asthma.
In another aspect, the invention provides a method for identifying individuals at risk for asthma exacerbation, by identifying an individual who is a known asthmatic, measuring the level of at least one product in a sample obtained from the individual, comparing that level to a reference level of the product, and optionally providing the result of the comparison to a user. The product is the product of at least one gene that is differentially expressed in individuals having an acute exacerbation of asthma versus those not having an acute exacerbation of asthma. A difference between the reference level and the level of the product indicates that the individual is at risk for acute exacerbation of asthma.
In some embodiments, the invention provides a method of identifying individuals at risk for asthma or asthma exacerbation, comprising: (a) identifying an individual who does not exhibit symptoms of asthma; (b) measuring the level of at least one product in a sample obtained from the individual, wherein the product is produced from a gene which is differentially expressed during asthma exacerbation; and (c) comparing said level of step (b) to a reference level of said product, wherein a difference between said level of step (b) and the reference level indicates that the individual is at risk for asthma or asthma exacerbation. In one embodiment, the individual has exhibited one or more symptoms of asthma previously. In another embodiment, the individual has not exhibited one or more symptoms of asthma previously.
In another aspect, the invention provides an array for use in assessing the risk for asthma or asthma exacerbation in a patient, comprising a plurality of discrete regions or addresses, each of which comprises a target molecule disposed thereon, wherein a subset of the plurality of discrete regions has disposed thereon target molecules that can specifically detect a marker of asthma exacerbation. In some embodiments, the subset of the plurality of discrete regions that can specifically detect a marker of asthma exacerbation is at least 5%, 15%, 20%, 25%, 30%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 90%, 95%, or 99% of the total discrete regions on the array. In some embodiments, the target molecules are single stranded polynucleotides that hybridize to polynucleotides obtained from a sample. In other embodiments, the target molecules are peptide recognition moieties, such as for example antibodies or antibody fragments, aptamers, cognate ligands or receptors, and the like. In some embodiments, the sample is obtained from an individual. In some embodiments, the sample from the individual is a blood sample which contains peripheral blood mononuclear cells.
In some embodiments of the aforementioned aspects, the genes or markers that are differentially expressed during asthma exacerbation are depicted in Tables 2-6 and 8-12. In other embodiments, the genes or markers are involved in interleukin-15 (IL-15) signaling, B-cell receptor (BCR) signaling, toll-like receptor (TLR) signaling, interferon (IFN) signaling and/or interferon regulatory factor (IRF) pathways.
In any one or more of the foregoing aspects, the reference levels of gene products that are differentially expressed during asthma exacerbation are levels of those gene products that are expressed in individuals free of asthma symptoms or in an asthma quiet period. In some embodiments, the reference level is an average of levels obtained from symptom free or asthma quiet individuals. In other embodiments, the reference level is obtained during an asthma quiet period from the same individual who is being tested.
In another aspect, the invention provides a combination of polynucleotides comprising at least 2 or more, or at least 10 substantially purified and isolated polynucleotides, wherein each polynucleotide comprises at least 22 contiguous nucleotides of a gene selected from the group comprising the genes set forth in Table 2, Table 3, Table 4, Table 5, Table 6, Table 8, Table 9, Table 10, Table 11, Table 12 and SEQ ID NOs: 1-77, or the complements and fragments thereof. In one embodiment, the combination of polynucleotides is attached to a substrate to form an array.
In another aspect, the invention provides a kit comprising a detection reagent which binds to the gene product of one or more genes that are differentially expressed in a sample obtained from an individual having an asthma exacerbation versus a sample obtained from an individual having an asthma quiet period. In some embodiments, the one or more genes are selected from the group consisting of the genes set forth in Tables 1-6 and 8-12, and SEQ ID NOs:1-59. In some embodiments, the sample obtained from an individual having an asthma exacerbation is a blood sample. In some embodiments, the gene product comprises a polypeptide and the detection reagent comprises an antibody or an aptamer. In other embodiments, the gene product comprises a polynucleotide and the detection reagent comprises an oligonucleotide or a polynucleotide.
In any one or more of the foregoing aspects, the samples obtained from individuals can be any cell, tissue or fluid. In some embodiments, the sample is a blood sample, which contains peripheral blood mononuclear cells (PBMCs). In other embodiments, the sample is serum.
In another aspect, the invention provides a method of discovering a compound that is effective for treating asthma exacerbation, comprising: providing a candidate compound; determining whether said compound inhibits IL-15 activity, wherein inhibition of IL-15 activity indicates that said compound is effective for treating acute exacerbation of asthma.
In one embodiment, the methods for determining the molecular signature of asthma exacerbation comprise combining a sample from a patient with one or more agents capable of reacting with one or more markers in the sample, and detecting a reaction.
The present invention provides a new class of markers that are differentially expressed in acute exacerbation of asthma, particularly in peripheral blood mononuclear cells and/or serum. Specifically, the markers of the present invention upregulate or downregulate their expression in individuals having an asthma attack. The present invention provides methods for assessing the state-of-health as it relates to asthma in an individual by comparing the expression level of one or more markers with a reference expression level of the one or more markers. The present invention also provides methods for asthma diagnosis, prognosis, or assessment in which the expression level of one or more markers of the present invention is compared to a reference level of the one or more markers.
A study was conducted to investigate the transcriptomics and proteomics of asthma exacerbation. The study was intended to identify potential new targets and/or markers for asthma, particularly asthma exacerbation. The approach involved seeking to identify differences between the asthma quiet and asthma exacerbation phenotypes at the molecular level.
The inventors have discovered that particular sets of genes are differentially expressed in individuals during asthma exacerbation as compared to during an asthma quiet time. A subset of those genes that are differentially expressed during exacerbation versus quiet, and which have a false discovery rate of less than 0.05 are listed in Table 2. The study individuals were clustered into three subgroups, based upon their exacerbation molecular profile: subgroup X, subgroup Y and subgroup Z.
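The description does not state how the false discovery rate was computed. As a hedged illustration, the sketch below assumes the widely used Benjamini-Hochberg procedure and keeps genes whose adjusted p-value falls below 0.05; the gene names and p-values in the usage note are invented placeholders, not study data.

```python
from typing import Dict, List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(p_by_gene: Dict[str, float], fdr: float = 0.05) -> List[str]:
    """Genes whose adjusted p-value is below the assumed FDR cutoff."""
    genes = list(p_by_gene)
    q_values = benjamini_hochberg([p_by_gene[g] for g in genes])
    return [g for g, q in zip(genes, q_values) if q < fdr]
```

For example, `significant_genes({"gene_a": 0.001, "gene_b": 0.02, "gene_c": 0.04, "gene_d": 0.5})` keeps `gene_a` and `gene_b`: `gene_c` has a raw p-value under 0.05, but its adjusted value (0.04 × 4 / 3) is above the cutoff.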
For subgroup X individuals, Table 4 depicts 1081 genes having an exacerbation versus quiet expression differential with a false discovery rate of less than 0.05. The subgroup X differentially expressed genes include many well-defined interferon (IFN)-inducible genes and transcription factors, such as IFNα, IFNβ, ISGF3G, IRF7, IRF1, SP100, OAS1, OAS2, MX1, MX2, ISG15, IFITM1, NM1, IR27, IR6, IR30, GBP1, GBP2, SP110, IRF4, IFITM2, and IFI16. The subgroup X differentially expressed genes also include genes linked to IFN, such as for example FGL2, LGALS, IL23A, ARTS-1, STAT1, STAT2, IRF1, IRF4, IRF7, ISGF3G, and the like. The subgroup X differentially expressed genes also include those genes driven by interferon regulatory factors (IRFs), such as OAS2, STAT2, IL15, TAP1, CTSS, IFIT3, OAS1, EIF2AK2, PSMB10, CYBB, CASP7, BCL2, STAT1, PSMB9, CASP8, CDKN1A, CASP1, HLA-G, VIL2, GATA3, GBP1, CXCR4, MS4A1, DNASE2, CCL5, TAP2, TEGT, PLSCR1, ISG15, and TNFSF10. The subgroup X differentially expressed genes also include those genes regulated by interleukin-15 (IL-15), which are listed for example in Table 6.
The differential expression of one or more subgroup X differentially expressed genes in a sample of an individual comprises a molecular profile of an asthma exacerbation that indicates that the asthma exacerbation involves innate immunity. The innate immune system is generally known in the art to be involved in the recruitment of immune cells to sites of infection and inflammation through the production of cytokines, the activation of the complement cascade, the identification and removal of foreign substances by leukocytes, and the activation of the adaptive (cognate) immune system via antigen presentation. For subgroup Y individuals, Table 5 depicts 574 genes having an exacerbation versus quiet expression differential with a false discovery rate of less than 0.05. The B-cell receptor (BCR) pathway was identified as a canonical pathway specific to subgroup Y, which includes genes CD72, CD19, CD79B, Syk, BLNK, Rac/Cdc42, MEKKs, and IKK. For subgroup Z individuals, the Toll-like receptor—Toll-IL-1 receptor domain-containing adaptor inducing interferon-β (TLR-TRIF)-induced intracellular signaling pathway was identified as a canonical pathway specific to subgroup Z genes.
The differential expression of one or more subgroup Y differentially expressed genes in a sample of an individual comprises a molecular profile of an asthma exacerbation that indicates that the asthma exacerbation involves cognate immunity. Cognate (adaptive) immunity is generally known in the art to involve the generation and/or elicitation of a specific B-cell (antibody) and T-cell (T-cell receptor) response to antigens and is triggered when a pathogen or other foreign agent evades the innate immune system and generates a threshold level of antigen. Activation of the cognate system integrates with the innate system through antigen presenting cells.
“Asthma exacerbation,” “acute exacerbation of asthma,” “exacerbation attack” and “asthma attack” are phrases that are used interchangeably. “Asthma quiet,” “asthma quiet period,” “quiet asthma period,” and “quiet visits” are phrases that are used interchangeably and generally refer to asthma symptomless periods. In some cases, the air passages of individuals with asthma are inflamed during a quiet period.
The terms “molecular signature,” “expression profile” and “gene expression profile” refer to two or more genes or gene products which represent a particular state of health of an individual. Alternatively, the molecular signature represents a collection of expression values for a plurality (e.g., at least two, but frequently about 10, about 100, about 1000, or more) of members of a library of genes or gene products. In some embodiments, the molecular signature represents the expression pattern for all of the nucleotide sequences in a library or array of nucleotide sequences or genes. Alternatively, the molecular signature represents the expression pattern for one or more subsets of a library of genes or gene products. In some embodiments, the molecular signature indicates the asthma status of an individual, such as e.g. a quiet period or an exacerbation. In some embodiments, the molecular signature is a molecular signature of asthma exacerbation for an individual with asthma, which indicates the type of exacerbation. Types of exacerbation include e.g. exacerbation involving innate immunity, exacerbation involving cognate or adaptive immunity, exacerbation associated with an infection and exacerbation not associated with any infection.
Various aspects of the invention are described in further detail in the following subsections. The use of subsections is not meant to limit the invention. Each subsection may apply to any aspect of the invention.
As discussed earlier, expression level of markers of the present invention can be used as an indicator and/or predictor of asthma exacerbation. Detection and measurement of the relative amount of an asthma-associated gene, marker or gene product (polynucleotide or polypeptide) of the invention (generally referred to as “marker” or “biomarker”) can be by any method known in the art.
Methodologies for peptide detection include protein extraction from a cell or tissue sample, followed by binding of an antibody specific for the target protein to the protein sample, and detection of the antibody. Antibodies are generally detected by the use of a labeled secondary antibody. The label can be a radioisotope, a fluorescent compound, an enzyme, an enzyme co-factor, or ligand. Such methods are well understood in the art.
Detection of specific polynucleotide molecules may be performed by gel electrophoresis, column chromatography, direct sequencing, quantitative PCR, RT-PCR, or nested PCR, among many other techniques well known to those skilled in the art.
Detection of the presence or number of copies of all or part of a marker as defined by the invention may be performed using any method known in the art. It is convenient to assess the presence and/or quantity of a DNA or cDNA by Southern analysis, in which total DNA from a cell or tissue sample is extracted, is hybridized with a labeled probe (i.e., a complementary DNA molecule), and the probe is detected. The label group can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. Other useful methods of DNA detection and/or quantification include direct sequencing, gel electrophoresis, column chromatography, and quantitative PCR, as would be understood by one skilled in the art.
Methodologies for detection of a transcribed polynucleotide can include RNA extraction from a cell or tissue sample, followed by hybridization of a labeled probe (i.e., a complementary polynucleotide molecule) specific for the target RNA to the extracted RNA and detection of the probe (e.g., Northern blotting).
The markers disclosed in the present invention can be employed in the prediction, diagnosis and/or prognosis of asthma exacerbation comprising the steps of (a) detecting an expression level of an asthma exacerbation marker in a patient; (b) comparing that expression level to a reference expression level of the same asthma exacerbation marker; and (c) diagnosing a patient as having asthma or an asthma exacerbation event, based upon the comparison made. This can be achieved by comparing the expression profile of one or more asthma exacerbation markers in a subject of interest to at least one reference expression profile of the asthma exacerbation markers. The reference expression profile(s) can include an average expression profile or a set of individual expression profiles each of which represents the gene expression of the asthma exacerbation markers in a particular asthma patient during a quiet period or in a disease-free individual.
In many embodiments, one or more asthma exacerbation markers, which are selected from any one or more of Tables 2-6 and 8-12 and SEQ ID NOs:1-77, can be used for asthma diagnosis or disease monitoring. In one embodiment, each asthma exacerbation marker has a p-value of less than 0.01, 0.005, 0.001, 0.0005, 0.0001, or less. In another embodiment, the asthma exacerbation marker comprises a gene having an absolute log2 difference between asthma exacerbation and asthma quiet of at least 0.25 (|log2 difference| ≥ 0.25).
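The two selection criteria above can be written as simple predicates. The following is a minimal sketch, assuming expression levels are positive numbers on a linear scale; the function names and default cutoffs are illustrative choices, and the criteria are kept separate because the text presents them as separate embodiments.

```python
import math


def log2_difference(exacerbation_level: float, quiet_level: float) -> float:
    """log2(exacerbation) - log2(quiet); both levels must be positive."""
    return math.log2(exacerbation_level) - math.log2(quiet_level)


def meets_p_value_cutoff(p_value: float, cutoff: float = 0.01) -> bool:
    """True when the marker's p-value is strictly below the chosen cutoff."""
    return p_value < cutoff


def meets_log2_cutoff(exacerbation_level: float, quiet_level: float,
                      min_abs_log2: float = 0.25) -> bool:
    """True when the absolute log2 difference is at least min_abs_log2."""
    return abs(log2_difference(exacerbation_level, quiet_level)) >= min_abs_log2
```

On a linear scale, an absolute log2 difference of 0.25 corresponds to roughly a 1.19-fold increase or a 0.84-fold decrease, so the cutoff admits both upregulated and downregulated markers.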
The asthma exacerbation markers of the present invention can be used alone, or in combination with other clinical tests, for asthma diagnosis, prognosis or monitoring. Conventional methods for detecting or diagnosing asthma include, but are not limited to, blood tests, chest X-ray, biopsies, skin tests, mucus tests, urine/excreta sample testing, physical exam, or any and all related clinical examinations known to the skilled artisan. Any of these methods, as well as any other conventional or non-conventional method, can be used, in addition to the methods of the present invention, to improve the accuracy of asthma diagnosis, prognosis or monitoring.
The expression profile of a patient of interest (which by definition comprises the level of at least one marker in a sample obtained from an individual) can be compared to one or more reference expression profiles. The reference expression profiles (which by definition comprise a reference level of the marker) can be determined concurrently with the expression profile of the patient of interest. The reference expression profiles can also be predetermined or prerecorded in electronic or other types of storage media.
The reference expression profiles can include average expression profiles, or individual profiles representing gene expression patterns in particular patients. In one embodiment, the reference expression profiles used for a prediction or diagnosis of asthma exacerbation include an average expression profile of the marker(s) in tissue samples, such as peripheral blood samples, of healthy volunteers or individuals during an asthma quiet period. In one embodiment, the reference expression profiles include an average expression profile of the marker(s) in tissue samples, such as peripheral blood samples, of reference asthma patients who have known or determinable disease status. Any averaging method may be used, such as arithmetic means, harmonic means, average of absolute values, average of log-transformed values, or weighted average. In one example, the reference asthma patients have the same disease assessment. In another example, the reference patients are healthy volunteers used in a diagnostic method. In another example, the reference asthma patients can be divided into at least two classes, each class of patients having a different respective disease assessment. The average expression profile in each class of patients constitutes a separate reference expression profile, and the expression profile of the patient of interest is compared to each of these reference expression profiles.
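As a rough sketch of how class-specific reference profiles might be built and used, the code below averages per-marker levels within a class and assigns a patient profile to the nearest reference. Several points are assumptions made for illustration: profiles are dictionaries keyed by marker name, only three of the averaging methods named above are shown, and Euclidean distance is one possible comparison measure that the text does not prescribe.

```python
import math
from statistics import harmonic_mean, mean
from typing import Dict, List

Profile = Dict[str, float]


def average_profile(profiles: List[Profile], method: str = "arithmetic") -> Profile:
    """Average each marker across the profiles of one reference class."""
    averagers = {
        "arithmetic": mean,
        "harmonic": harmonic_mean,
        # Average of log-transformed values (result stays on the log2 scale).
        "log2": lambda values: mean([math.log2(v) for v in values]),
    }
    average = averagers[method]
    return {m: average([p[m] for p in profiles]) for m in profiles[0]}


def closest_reference(patient: Profile, references: Dict[str, Profile]) -> str:
    """Name of the reference profile nearest to the patient (Euclidean)."""
    def distance(ref: Profile) -> float:
        return math.sqrt(sum((patient[m] - ref[m]) ** 2 for m in ref))
    return min(references, key=lambda name: distance(references[name]))
```

Each class of reference patients would get its own `average_profile`, and the patient of interest is then compared against every class reference in turn.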
Other types of reference expression profiles can also be used in the present invention. In yet another embodiment, the present invention uses a numerical threshold as a control level. The numerical threshold may comprise a ratio, including, but not limited to, the ratio of the expression level of a marker in an asthma patient in relation to the expression level of the same marker in a healthy or asthma quiet individual; or the ratio between the expression levels of the marker in an asthma patient both before and after an exacerbation event. The numerical threshold may also be a ratio of marker expression levels between patients with differing disease assessments.
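A ratio-based threshold of this kind might be applied as follows. This is a sketch under assumptions: the function names and the `upregulated` flag are hypothetical, and any threshold value would have to be established empirically for each marker.

```python
def expression_ratio(level: float, reference_level: float) -> float:
    """Ratio of a marker's level to its reference (e.g. quiet-period) level."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return level / reference_level


def crosses_threshold(level: float, reference_level: float,
                      threshold: float, upregulated: bool = True) -> bool:
    """Compare the ratio to a threshold in the marker's expected direction."""
    ratio = expression_ratio(level, reference_level)
    return ratio >= threshold if upregulated else ratio <= threshold
```

The direction flag reflects that markers may be either upregulated or downregulated during exacerbation, so a single "greater than" comparison would miss half of them.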
The expression profile of the patient of interest and the reference expression profile(s) can be constructed in any form. In one embodiment, the expression profiles comprise the expression level of each marker used in outcome prediction. The expression levels can be absolute, normalized, or relative levels. Suitable normalization procedures include, but are not limited to, those used in nucleic acid array gene expression analyses or those described in Hill, et al., (Hill (2001) Genome Biol. 2:research0055.1-0055.13). In one example, the expression levels are normalized such that the mean is zero and the standard deviation is one. In another example, the expression levels are normalized based on internal or external controls, as appreciated by those skilled in the art. In still another example, the expression levels are normalized against one or more control transcripts with known abundances in blood samples. In many cases, the expression profile of the patient of interest and the reference expression profile(s) are constructed using the same or comparable methodologies.
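Two of the normalization options mentioned above can be illustrated briefly. The sketch assumes a list of levels for the zero-mean, unit-variance case and a dictionary of marker levels for the control-transcript case; the use of the population standard deviation (rather than the sample standard deviation) is an arbitrary choice made here, and the function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def standardize(levels: List[float]) -> List[float]:
    """Rescale levels so that their mean is zero and their std. dev. is one."""
    center = mean(levels)
    spread = pstdev(levels)  # population standard deviation (assumed choice)
    if spread == 0:
        raise ValueError("cannot standardize constant levels")
    return [(x - center) / spread for x in levels]


def normalize_to_control(levels: Dict[str, float],
                         control_level: float) -> Dict[str, float]:
    """Express each marker relative to a control transcript's level."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return {marker: level / control_level for marker, level in levels.items()}
```

Whichever procedure is chosen, the same one should be applied to both the patient profile and the reference profile(s), consistent with the requirement that they be constructed using the same or comparable methodologies.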
In another embodiment, each expression profile being compared comprises one or more ratios between the expression levels of different markers. An expression profile can also include other measures that are capable of representing gene expression patterns or protein levels.
The peripheral blood samples used in the present invention can be either whole blood samples, samples comprising enriched PBMCs, or serum. In one example, the peripheral blood samples used for preparing the reference expression profile(s) comprise enriched or purified PBMCs, and the peripheral blood sample used for preparing the expression profile of the patient of interest is a whole blood sample. In another example, all of the peripheral blood samples employed in outcome prediction comprise enriched or purified PBMCs. In many cases, the peripheral blood samples are prepared from the patient of interest and reference patients using the same or comparable procedures.
Other types of blood samples can also be employed in the present invention, such as serum, which contains protein biomarkers; and the gene or protein expression profiles in these blood samples are statistically significantly correlated with patient outcome.
Construction of the expression profiles typically involves detection of the expression level of each marker used in the prediction, diagnosis, prognosis or monitoring of asthma exacerbation. Numerous methods are available for this purpose. For instance, the expression level of a gene can be determined by measuring the level of the RNA transcript(s) of the gene(s). Suitable methods include, but are not limited to, quantitative RT-PCR, Northern blot, in situ hybridization, slot-blotting, nuclease protection assay, and nucleic acid array (including bead array). The expression level of a gene can also be determined by measuring the level of the polypeptide(s) encoded by the gene. Suitable methods include, but are not limited to, immunoassays (such as ELISA, RIA, FACS, or Western blot), 2-dimensional gel electrophoresis, mass spectrometry, or protein arrays.
In one aspect, the expression level of a marker is determined by measuring the RNA transcript level of the gene in a tissue sample, such as a peripheral blood sample. RNA can be isolated from the peripheral blood or tissue sample using a variety of methods. Exemplary methods include guanidine isothiocyanate/acidic phenol method, the TRIZOL® Reagent (Invitrogen), or the Micro-FastTrack™ 2.0 or FastTrack™ 2.0 mRNA Isolation Kits (Invitrogen). The isolated RNA can be either total RNA or mRNA. The isolated RNA can be amplified to cDNA or cRNA before subsequent detection or quantitation. The amplification can be either specific or non-specific. Suitable amplification methods include, but are not limited to, reverse transcriptase PCR (RT-PCR), isothermal amplification, ligase chain reaction, and Q-beta replicase.
In one embodiment, the amplification protocol employs reverse transcriptase. The isolated mRNA can be reverse transcribed into cDNA using a reverse transcriptase, and a primer consisting of oligo (dT) and a sequence encoding the phage T7 promoter. The cDNA thus produced is single-stranded. The second strand of the cDNA is synthesized using a DNA polymerase, combined with an RNase to break up the DNA/RNA hybrid. After synthesis of the double-stranded cDNA, T7 RNA polymerase is added, and cRNA is then transcribed from the second strand of the double-stranded cDNA. The amplified cDNA or cRNA can be detected or quantitated by hybridization to labeled probes. The cDNA or cRNA can also be labeled during the amplification process and then detected or quantitated.
In another embodiment, quantitative RT-PCR (such as TaqMan, ABI) is used for detecting or comparing the RNA transcript level of a marker of interest. Quantitative RT-PCR involves reverse transcription (RT) of RNA to cDNA followed by relative quantitative PCR (RT-PCR).
In PCR, the number of molecules of the amplified target DNA increases by a factor approaching two with every cycle of the reaction until some reagent becomes limiting. Thereafter, the rate of amplification becomes increasingly diminished until there is not an increase in the amplified target between cycles. If a graph is plotted on which the cycle number is on the X axis and the log of the concentration of the amplified target DNA is on the Y axis, a curved line of characteristic shape can be formed by connecting the plotted points. Beginning with the first cycle, the slope of the line is positive and constant. This is said to be the linear portion of the curve. After some reagent becomes limiting, the slope of the line begins to decrease and eventually becomes zero. At this point the concentration of the amplified target DNA becomes asymptotic to some fixed value. This is said to be the plateau portion of the curve.
The concentration of the target DNA in the linear portion of the PCR is proportional to the starting concentration of the target before the PCR is begun. By determining the concentration of the PCR products of the target DNA in PCR reactions that have completed the same number of cycles and are in their linear ranges, it is possible to determine the relative concentrations of the specific target sequence in the original DNA mixture. If the DNA mixtures are cDNAs synthesized from RNAs isolated from different tissues or cells, the relative abundances of the specific mRNA from which the target sequence was derived may be determined for the respective tissues or cells. This direct proportionality between the concentration of the PCR products and the relative mRNA abundances is true in the linear range portion of the PCR reaction.
The final concentration of the target DNA in the plateau portion of the curve is determined by the availability of reagents in the reaction mix and is independent of the original concentration of target DNA. Therefore, in one embodiment, the sampling and quantifying of the amplified PCR products are carried out when the PCR reactions are in the linear portion of their curves. In addition, relative concentrations of the amplifiable cDNAs can be normalized to some independent standard, which may be based on either internally existing RNA species or externally introduced RNA species. The abundance of a particular mRNA species may also be determined relative to the average abundance of all mRNA species in the sample.
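The amplification behavior described in the preceding paragraphs can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not a kinetic model of any real reaction: it treats reagent depletion as a per-cycle efficiency that falls linearly as the copy number approaches a fixed `capacity`, and all names and numbers are invented for the example.

```python
from typing import List


def simulate_pcr(start_copies: float, capacity: float, cycles: int) -> List[float]:
    """Return the copy number after each cycle, starting with cycle 0."""
    copies = [float(start_copies)]
    for _ in range(cycles):
        current = copies[-1]
        # Per-cycle efficiency falls from about 1 toward 0 as reagents run out.
        efficiency = max(0.0, 1.0 - current / capacity)
        copies.append(current * (1.0 + efficiency))
    return copies


# Two hypothetical samples whose starting amounts differ 10-fold.
low = simulate_pcr(100, 1e12, 60)
high = simulate_pcr(1000, 1e12, 60)
early_ratio = high[10] / low[10]  # stays close to the 10-fold input ratio
late_ratio = high[60] / low[60]   # approaches 1 once both reactions plateau
```

In this model the per-cycle factor starts near two, the early ratio between samples preserves the ratio of starting amounts, and the late ratio collapses toward one, which is why sampling is done before the plateau.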
In one embodiment, the PCR amplification utilizes internal PCR standards that are approximately as abundant as the target. This strategy is effective if the products of the PCR amplifications are sampled during their linear phases. If the products are sampled when the reactions are approaching the plateau phase, then the less abundant product may become relatively over-represented. Comparisons of relative abundances made for many different RNA samples, such as is the case when examining RNA samples for differential expression, may become distorted in such a way as to make differences in relative abundances of RNAs appear less than they actually are. This can be improved if the internal standard is much more abundant than the target. If the internal standard is more abundant than the target, then direct linear comparisons may be made between RNA samples.
A problem inherent in clinical samples is that they are of variable quantity or quality. This problem can be overcome if the RT-PCR is performed as a relative quantitative RT-PCR with an internal standard in which the internal standard is an amplifiable cDNA fragment that is larger than the target cDNA fragment and in which the abundance of the mRNA encoding the internal standard is roughly 5-100 fold higher than the mRNA encoding the target. This assay measures relative abundance, not absolute abundance of the respective mRNA species.
In another embodiment, the relative quantitative RT-PCR uses an external standard protocol. Under this protocol, the PCR products are sampled in the linear portion of their amplification curves. The number of PCR cycles that are optimal for sampling can be empirically determined for each target cDNA fragment. In addition, the reverse transcriptase products of each RNA population isolated from the various samples can be normalized for equal concentrations of amplifiable cDNAs. While empirical determination of the linear range of the amplification curve and normalization of cDNA preparations are tedious and time-consuming processes, the resulting RT-PCR assays may, in certain cases, be superior to those derived from a relative quantitative RT-PCR with an internal standard.
In yet another embodiment, nucleic acid arrays (including bead arrays) are used for detecting or comparing the expression profiles of a marker of interest. The nucleic acid arrays can be commercial oligonucleotide or cDNA arrays. They can also be custom arrays comprising concentrated probes for the markers of the present invention. In many examples, at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, or more of the total probes on a custom array of the present invention are probes for asthma exacerbation markers. These probes can hybridize under stringent or nucleic acid array hybridization conditions to the RNA transcripts, or the complements thereof, of the corresponding markers.
“Nucleic acid array hybridization conditions” refer to the temperature and ionic conditions that are normally used in nucleic acid array hybridization. These conditions include 16-hour hybridization at 45° C., followed by at least three 10-minute washes at room temperature. The hybridization buffer comprises 100 mM MES, 1 M Na+, 20 mM EDTA, and 0.01% Tween 20. The pH of the hybridization buffer preferably is between 6.5 and 6.7. The wash buffer is 6×SSPET, which contains 0.9 M NaCl, 60 mM NaH2PO4, 6 mM EDTA, and 0.005% Triton X-100. Under more stringent nucleic acid array hybridization conditions, the wash buffer can contain 100 mM MES, 0.1 M Na+, and 0.01% Tween 20.
As used herein, “stringent conditions” are at least as stringent as, for example, conditions G-L shown in Table 7. “Highly stringent conditions” are at least as stringent as conditions A-F shown in Table 7. Hybridization is carried out under the hybridization conditions (Hybridization Temperature and Buffer) for about four hours, followed by two 20-minute washes under the corresponding wash conditions (Wash Temp. and Buffer).
In one example, a nucleic acid array of the present invention includes at least 2, 5, 10, or more different probes. Each of these probes is capable of hybridizing under stringent or nucleic acid array hybridization conditions to a different respective marker of the present invention. Multiple probes for the same marker can be used on the same nucleic acid array. The probe density on the array can be in any range.
The probes for a marker of the present invention can be nucleic acid probes, such as DNA, RNA, PNA (peptide nucleic acid), or a modified form thereof. The nucleotide residues in each probe can be either naturally occurring residues (such as deoxyadenylate, deoxycytidylate, deoxyguanylate, deoxythymidylate, adenylate, cytidylate, guanylate, and uridylate), or synthetically produced analogs that are capable of forming desired base-pair relationships. Examples of these analogs include, but are not limited to, aza and deaza pyrimidine analogs, aza and deaza purine analogs, and other heterocyclic base analogs, wherein one or more of the carbon and nitrogen atoms of the purine and pyrimidine rings are substituted by heteroatoms, such as oxygen, sulfur, selenium, and phosphorus. Similarly, the polynucleotide backbones of the probes can be either naturally occurring (such as through 5′ to 3′ linkage), or modified. For instance, the nucleotide units can be connected via non-typical linkage, such as 5′ to 2′ linkage, so long as the linkage does not interfere with hybridization. As another example, peptide nucleic acids, in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages, can be used.
The probes for the markers can be stably attached to discrete regions on a nucleic acid array. “Stably attached” means that a probe maintains its position relative to the attached discrete region during hybridization and signal detection. The position of each discrete region on the nucleic acid array can be either known or determinable. Any method known in the art can be used to make the nucleic acid arrays of the present invention. Hybridization probes or amplification primers for the markers of the present invention can be prepared by using any method known in the art.
In another embodiment, nuclease protection assays are used to quantitate RNA transcript levels in peripheral blood samples. There are many different versions of nuclease protection assays. The common characteristic of these nuclease protection assays is that they involve hybridization of an antisense nucleic acid with the RNA to be quantified. The resulting hybrid double-stranded molecule is then digested with a nuclease that digests single-stranded nucleic acids more efficiently than double-stranded molecules. The amount of antisense nucleic acid that survives digestion is a measure of the amount of the target RNA species to be quantified. Examples of suitable nuclease protection assays include the RNase protection assay provided by Ambion, Inc. (Austin, Tex.).
In one embodiment, the probes/primers for a marker significantly diverge from the sequences of other markers. This can be achieved by checking potential probe/primer sequences against a human genome sequence database, such as the Entrez database at the U.S. National Center for Biotechnology Information (“NCBI”). One algorithm suitable for this purpose is the BLAST algorithm. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. The initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence to increase the cumulative alignment score. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. These parameters can be adjusted for different purposes, as appreciated by those skilled in the art.
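The seeding and extension steps described above can be sketched in simplified form. This is not the BLAST implementation: it seeds only on exact words of length W (the neighborhood threshold T is ignored), extends without gaps, and treats X as a drop-off threshold that halts extension once the running score falls more than X below its best value, which is an assumption because the passage does not define X. All function names and default parameter values are hypothetical.

```python
def find_seeds(query, subject, w):
    """Yield (query_pos, subject_pos) for every exact shared word of length w."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            yield i, j

def extend_one_way(query, subject, i, j, step, m, n, x):
    """Extend without gaps in one direction; return (best gain, residues added)."""
    score = best = 0
    length = best_len = 0
    while 0 <= i < len(query) and 0 <= j < len(subject):
        score += m if query[i] == subject[j] else n
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > x:
            break  # score dropped more than x below the best seen
        i += step
        j += step
    return best, best_len

def best_hsp(query, subject, w=4, m=1, n=-3, x=5):
    """Return (score, query_start, subject_start, length) of the top ungapped hit."""
    top = None
    for i, j in find_seeds(query, subject, w):
        right, r_len = extend_one_way(query, subject, i + w, j + w, 1, m, n, x)
        left, l_len = extend_one_way(query, subject, i - 1, j - 1, -1, m, n, x)
        hit = (w * m + left + right, i - l_len, j - l_len, w + l_len + r_len)
        if top is None or hit[0] > top[0]:
            top = hit
    return top

hit = best_hsp("ACGTACGTTAGC", "TTACGTACGTTAGCAA")
```

A candidate probe or primer whose best hit against an unintended marker sequence scores highly under such a search would be a poor choice under the divergence criterion described above.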
In another embodiment, the probes for the markers can be polypeptide in nature, such as antibody probes. The expression levels of the markers of the present invention are thus determined by measuring the levels of polypeptides encoded by the markers. Methods suitable for this purpose include, but are not limited to, immunoassays such as ELISA, RIA, FACS, dot blot, Western Blot, immunohistochemistry, and antibody-based radio-imaging. In addition, high-throughput protein sequencing, 2-dimensional SDS-polyacrylamide gel electrophoresis, mass spectrometry, or protein arrays can be used.
In one embodiment, ELISAs are used for detecting the levels of the target proteins. In an exemplifying ELISA, antibodies capable of binding to the target proteins are immobilized onto selected surfaces exhibiting protein affinity, such as wells in a polystyrene or polyvinylchloride microtiter plate. Samples to be tested are then added to the wells. After binding and washing to remove non-specifically bound immunocomplexes, the bound antigen(s) can be detected. Detection can be achieved by the addition of a second antibody which is specific for the target proteins and is linked to a detectable label. Detection can also be achieved by the addition of a second antibody, followed by the addition of a third antibody that has binding affinity for the second antibody, with the third antibody being linked to a detectable label. Before being added to the microtiter plate, cells in the samples can be lysed or extracted to separate the target proteins from potentially interfering substances.
In another exemplifying ELISA, the samples suspected of containing the target proteins are immobilized onto the well surface and then contacted with the antibodies. After binding and washing to remove non-specifically bound immunocomplexes, the bound antigen is detected. Where the initial antibodies are linked to a detectable label, the immunocomplexes can be detected directly. The immunocomplexes can also be detected using a second antibody that has binding affinity for the first antibody, with the second antibody being linked to a detectable label.
Another exemplary ELISA involves the use of antibody competition in the detection. In this ELISA, the target proteins are immobilized on the well surface. The labeled antibodies are added to the well, allowed to bind to the target proteins, and detected by means of their labels. The amount of the target proteins in an unknown sample is then determined by mixing the sample with the labeled antibodies before or during incubation with coated wells. The presence of the target proteins in the unknown sample acts to reduce the amount of antibody available for binding to the well and thus reduces the ultimate signal.
Different ELISA formats can have certain features in common, such as coating, incubating or binding, washing to remove non-specifically bound species, and detecting the bound immunocomplexes. For instance, in coating a plate with either antigen or antibody, the wells of the plate can be incubated with a solution of the antigen or antibody, either overnight or for a specified period of hours. The wells of the plate are then washed to remove incompletely adsorbed material. Any remaining available surfaces of the wells are then “coated” with a nonspecific protein that is antigenically neutral with regard to the test samples. Examples of these nonspecific proteins include bovine serum albumin (BSA), casein and solutions of milk powder. The coating allows for blocking of nonspecific adsorption sites on the immobilizing surface and thus reduces the background caused by nonspecific binding of antisera onto the surface.
In ELISAs, a secondary or tertiary detection means can be used. After binding of a protein or antibody to the well, coating with a non-reactive material to reduce background, and washing to remove unbound material, the immobilizing surface is contacted with the control or clinical or biological sample to be tested under conditions effective to allow immunocomplex (antigen/antibody) formation. These conditions may include, for example, diluting the antigens and antibodies with solutions such as BSA, bovine gamma globulin (BGG) and phosphate buffered saline (PBS)/Tween and incubating the antibodies and antigens at room temperature for about 1 to 4 hours or at 4° C. overnight. Detection of the immunocomplex is facilitated by using a labeled secondary binding ligand or antibody, or a secondary binding ligand or antibody in conjunction with a labeled tertiary antibody or third binding ligand.
Following all incubation steps in an ELISA, the contacted surface can be washed so as to remove non-complexed material. For instance, the surface may be washed with a solution such as PBS/Tween, or borate buffer. Following the formation of specific immunocomplexes between the test sample and the originally bound material, and subsequent washing, the occurrence or the amount of immunocomplexes can be determined.
To provide a detecting means, the second or third antibody can have an associated label to allow detection. In one embodiment, the label is an enzyme that generates color development upon incubating with an appropriate chromogenic substrate. Thus, for example, one may contact and incubate the first or second immunocomplex with a urease, glucose oxidase, alkaline phosphatase or hydrogen peroxidase-conjugated antibody for a period of time and under conditions that favor the development of further immunocomplex formation (e.g., incubation for 2 hours at room temperature in a PBS-containing solution such as PBS-Tween).
After incubation with the labeled antibody, and subsequent washing to remove unbound material, the amount of label can be quantified, e.g., by incubation with a chromogenic substrate such as urea and bromocresol purple or 2,2′-azino-di-(3-ethyl)-benzthiazoline-6-sulfonic acid (ABTS) and H2O2, in the case of peroxidase as the enzyme label. Quantitation can be achieved by measuring the degree of color generation, e.g., using a spectrophotometer.
Another method suitable for detecting polypeptide levels is RIA (radioimmunoassay). An exemplary RIA is based on the competition between radiolabeled polypeptides and unlabeled polypeptides for binding to a limited quantity of antibodies. Suitable radiolabels include, but are not limited to, ¹²⁵I. In one embodiment, a fixed concentration of ¹²⁵I-labeled polypeptide is incubated with a dilution series of an antibody specific to the polypeptide. When the unlabeled polypeptide is added to the system, the amount of the ¹²⁵I-polypeptide that binds to the antibody is decreased. A standard curve can therefore be constructed to represent the amount of antibody-bound ¹²⁵I-polypeptide as a function of the concentration of the unlabeled polypeptide. From this standard curve, the concentration of the polypeptide in unknown samples can be determined. Protocols for conducting RIA are well known in the art.
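Reading an unknown concentration from such a standard curve can be sketched as follows. The sketch assumes that the bound signal falls strictly as the unlabeled concentration rises, and it interpolates linearly in log10(concentration) between adjacent standards. That interpolation scheme, the function name and the calibrator values are all hypothetical.

```python
import math

def concentration_from_curve(standards, bound_signal):
    """Interpolate a concentration from (concentration, bound_signal) standards.

    Bound signal is assumed to fall strictly as concentration rises.
    Interpolation is linear in log10(concentration) between adjacent standards.
    """
    pts = sorted(standards)
    for (c_lo, s_hi), (c_hi, s_lo) in zip(pts, pts[1:]):
        if s_lo <= bound_signal <= s_hi:
            frac = (s_hi - bound_signal) / (s_hi - s_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("signal is outside the range of the standard curve")

# Hypothetical calibrators: (unlabeled concentration, antibody-bound counts)
standards = [(1.0, 9000.0), (10.0, 6000.0), (100.0, 3000.0), (1000.0, 1000.0)]
unknown = concentration_from_curve(standards, 4500.0)
```

Signals outside the calibrated range raise an error rather than being extrapolated, since the curve says nothing about such samples.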
Suitable antibodies for the present invention include, but are not limited to, polyclonal antibodies, monoclonal antibodies, chimeric antibodies, humanized antibodies, single chain antibodies, Fab fragments, or fragments produced by a Fab expression library. Neutralizing antibodies (e.g., those which inhibit dimer formation) can also be used. Methods for preparing these antibodies are well known in the art. In one embodiment, the antibodies of the present invention can bind to the corresponding marker gene products or other desired antigens with binding affinities of at least 10⁴ M⁻¹, 10⁵ M⁻¹, 10⁶ M⁻¹, 10⁷ M⁻¹, or more.
The antibodies of the present invention can be labeled with one or more detectable moieties to allow for detection of antibody-antigen complexes. The detectable moieties can include compositions detectable by spectroscopic, enzymatic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical or chemical means. The detectable moieties include, but are not limited to, radioisotopes, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers such as fluorescent markers and dyes, magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like.
The antibodies of the present invention can be used as probes to construct protein arrays for the detection of expression profiles of the markers. Methods for making protein arrays or biochips are well known in the art. In many embodiments, a substantial portion of probes on a protein array of the present invention are antibodies specific for the marker products. For instance, at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, or more probes on the protein array can be antibodies specific for the marker gene products.
In yet another aspect, the expression levels of the markers are determined by measuring the biological functions or activities of these genes. Where a biological function or activity of a gene is known, suitable in vitro or in vivo assays can be developed to evaluate the function or activity. These assays can be subsequently used to assess the level of expression of the marker.
After the expression level of each marker is determined, numerous approaches can be employed to compare expression profiles. Comparison of the expression profile of a patient of interest to the reference expression profile(s) can be conducted manually or electronically. In one example, comparison is carried out by comparing each component in one expression profile to the corresponding component in a reference expression profile. The component can be the expression level of a marker, a ratio between the expression levels of two markers, or another measure capable of representing gene expression patterns. The expression level of a gene can have an absolute or a normalized or relative value. The difference between two corresponding components can be assessed by fold changes, absolute differences, or other suitable means.
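A component-by-component comparison of this kind can be sketched as follows, reporting both a log2 fold change and an absolute difference for each marker present in both profiles. The function name, marker names and expression values are hypothetical, and expression levels are assumed to be positive.

```python
import math

def compare_profiles(patient, reference):
    """Return {marker: (log2 fold change, absolute difference)} for shared markers."""
    result = {}
    for marker in sorted(patient.keys() & reference.keys()):
        p, r = patient[marker], reference[marker]
        result[marker] = (math.log2(p / r), p - r)
    return result

# Hypothetical normalized expression levels
patient = {"marker_A": 400.0, "marker_B": 50.0}
reference = {"marker_A": 100.0, "marker_B": 100.0}
differences = compare_profiles(patient, reference)
```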
Comparison of the expression profile of a patient of interest to the reference expression profile(s) can also be conducted using pattern recognition or comparison programs, such as the k-nearest-neighbors algorithm as described in Armstrong, et al., (Armstrong (2002) Nature Genetics 30:41-47), or the weighted voting algorithm as described below. In addition, the serial analysis of gene expression (SAGE) technology, the GEMTOOLS gene expression analysis program (Incyte Pharmaceuticals), the GeneCalling and Quantitative Expression Analysis technology (Curagen), and other suitable methods, programs or systems can be used to compare expression profiles.
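As one illustration of a pattern recognition approach, a generic k-nearest-neighbors classifier can be sketched as follows. This is not the procedure of the cited reference: the Euclidean distance metric, the simple majority vote, the class labels and the training values are all assumptions made for illustration.

```python
import math
from collections import Counter

def knn_classify(sample, training, k=3):
    """Assign the majority class among the k nearest training profiles.

    `training` is a list of (expression_vector, class_label) pairs.
    """
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-marker reference profiles with known class membership
training = [
    ((1.0, 1.0), "exacerbation"),
    ((1.2, 0.9), "exacerbation"),
    ((0.9, 1.1), "exacerbation"),
    ((5.0, 5.0), "quiescent"),
    ((5.2, 4.8), "quiescent"),
    ((4.9, 5.1), "quiescent"),
]
predicted_class = knn_classify((1.1, 1.0), training)
```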
Multiple markers can be used in the comparison of expression profiles. For instance, 2, 4, 6, 8, 10, 12, 14, or more markers can be used. In addition, the marker(s) used in the comparison can be selected to have relatively small p-values (e.g., two-sided p-values). In many examples, the p-values indicate the statistical significance of the difference between gene expression levels in different classes of patients. In many other examples, the p-values suggest the statistical significance of the correlation between gene expression patterns and clinical outcome. In one embodiment, the markers used in the comparison have p-values of no greater than 0.05, 0.01, 0.001, 0.0005, 0.0001, or less. Markers with p-values of greater than 0.05 can also be used. These genes may be identified, for instance, by using a relatively small number of blood samples.
Similarity or difference between the expression profile of a patient of interest and a reference expression profile is indicative of the class membership of the patient of interest. Similarity or difference can be determined by any suitable means. The comparison can be qualitative, quantitative, or both.
In one example, a component in a reference profile is a mean value, and the corresponding component in the expression profile of the patient of interest falls within the standard deviation of the mean value. In such a case, the expression profile of the patient of interest may be considered similar to the reference profile with respect to that particular component. Other criteria, such as a multiple or fraction of the standard deviation or a certain degree of percentage increase or decrease, can be used to measure similarity.
In another example, at least 50% (e.g., at least 60%, 70%, 80%, 90%, or more) of the components in the expression profile of the patient of interest are considered similar to the corresponding components in a reference profile. Under these circumstances, the expression profile of the patient of interest may be considered similar to the reference profile. Different components in the expression profile may have different weights for the comparison. In some cases, lower percentage thresholds (e.g., less than 50% of the total components) are used to determine similarity.
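The two similarity criteria described above can be sketched together: a component counts as similar when it falls within a chosen multiple of the reference standard deviation, and the profile counts as similar when a minimum fraction of components do. Equal weighting of components, the function names and the example values are assumptions.

```python
def component_is_similar(value, mean, sd, sd_multiple=1.0):
    """True if value lies within sd_multiple standard deviations of the mean."""
    return abs(value - mean) <= sd_multiple * sd

def profile_is_similar(patient, reference, min_fraction=0.5):
    """`reference` maps marker -> (mean, sd); `patient` maps marker -> value."""
    shared = patient.keys() & reference.keys()
    if not shared:
        raise ValueError("no markers in common")
    hits = sum(component_is_similar(patient[m], *reference[m]) for m in shared)
    return hits / len(shared) >= min_fraction

# Hypothetical reference profile (mean, standard deviation) and patient values
reference = {"m1": (10.0, 2.0), "m2": (5.0, 1.0), "m3": (8.0, 1.0), "m4": (20.0, 3.0)}
patient = {"m1": 11.0, "m2": 5.5, "m3": 12.0, "m4": 21.0}
similar = profile_is_similar(patient, reference)
```

Here three of the four components fall within one standard deviation, so the profile is similar at a 50% threshold but not at an 80% threshold.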
The marker(s) and the similarity criteria can be selected such that the accuracy of the diagnostic determination or the outcome prediction (the ratio of correct calls over the total of correct and incorrect calls) is relatively high. For instance, the accuracy of the determination or prediction can be at least 50%, 60%, 70%, 80%, 90%, or more.
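The accuracy ratio defined above can be computed as in the following short sketch. The function name and the class labels are hypothetical.

```python
def accuracy(predicted, actual):
    """Correct calls divided by the total of correct and incorrect calls."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical class calls: "E" = exacerbation, "Q" = quiescent
calls = ["E", "E", "Q", "Q", "E"]
truth = ["E", "Q", "Q", "Q", "E"]
score = accuracy(calls, truth)
```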
The invention also provides methods (also referred to herein as “screening assays”) for identifying agents capable of modulating marker expression (“modulators”), i.e., candidate or test compounds or agents comprising therapeutic moieties (e.g., peptides, peptidomimetics, peptoids, polynucleotides, small molecules or other drugs) which (a) bind to a marker gene product or (b) have a modulatory (e.g., upregulation or downregulation; stimulatory or inhibitory; potentiation/induction or suppression) effect on the activity of a marker gene product or, more specifically, (c) have a modulatory effect on the interactions of the marker gene product with one or more of its natural substrates, or (d) have a modulatory effect on the expression of the marker. Such assays typically comprise a reaction between the marker gene product and one or more assay components. The other components may be either the test compound itself, or a combination of test compound and a binding partner of the marker gene product.
The test compounds of the present invention are generally either small molecules or biomolecules. Small molecules include, but are not limited to, inorganic molecules and small organic molecules. Biomolecules include, but are not limited to, naturally-occurring and synthetic compounds that have a bioactivity in mammals, such as polypeptides, polysaccharides, and polynucleotides. In one embodiment, the test compound is a small molecule. In another embodiment, the test compound is a biomolecule. One skilled in the art will appreciate that the nature of the test compound may vary depending on the nature of the protein encoded by the marker of the present invention.
The test compounds of the present invention may be obtained from any available source, including systematic libraries of natural and/or synthetic compounds. Test compounds may also be obtained by any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone, which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckerman et al. (Zuckerman (1994) J. Med. Chem. 37:2678-85)); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the “one-bead, one-compound” library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).
The invention provides methods of screening test compounds for inhibitors of the marker gene products of the present invention. The method of screening comprises obtaining samples from subjects diagnosed with or suspected of having asthma, contacting each separate aliquot of the samples with one or more of a plurality of test compounds, and comparing expression of one or more marker gene products in each of the aliquots to determine whether any of the test compounds provides a substantially decreased level of expression or activity of a marker gene product relative to samples with other test compounds or relative to an untreated sample or control sample. In addition, methods of screening may be devised by combining a test compound with a protein and thereby determining the effect of the test compound on the protein.
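The comparison of treated aliquots with an untreated control can be sketched as follows. The passage does not quantify a "substantially decreased" level, so the 50% cutoff used here (`max_ratio`) is purely an assumption, as are the function name, compound names and expression values.

```python
def find_inhibitor_candidates(treated, control_level, max_ratio=0.5):
    """Return compounds whose treated/control expression ratio is at or below max_ratio.

    Results are ordered from strongest to weakest reduction.
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    ratios = {name: level / control_level for name, level in treated.items()}
    return sorted((n for n, r in ratios.items() if r <= max_ratio), key=ratios.get)

# Hypothetical marker expression measured in each treated aliquot
treated = {"cmpd_1": 95.0, "cmpd_2": 40.0, "cmpd_3": 10.0}
candidates = find_inhibitor_candidates(treated, control_level=100.0)
```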
In addition, the invention is further directed to a method of screening for test compounds capable of modulating the binding of a marker gene product and a binding partner, by combining the test compound, the marker gene product, and the binding partner together and determining whether binding of the binding partner and the marker gene product occurs. The test compound may be either a small molecule or a biomolecule.
Modulators of marker gene product expression, activity or binding ability are useful as therapeutic compositions of the invention. Such modulators (e.g., antagonists or agonists) may be formulated as pharmaceutical compositions, as described herein below. Such modulators may also be used in the methods of the invention, for example, to diagnose, treat, or prognose asthma.
The invention provides methods of conducting high-throughput screening for test compounds capable of inhibiting activity or expression of a marker gene product of the present invention. In one embodiment, the method of high-throughput screening involves combining test compounds and the marker gene product and detecting the effect of the test compound on the marker gene product.
A variety of high-throughput functional assays well-known in the art may be used in combination to screen and/or study the reactivity of different types of activating test compounds. Since the coupling system is often difficult to predict, a number of assays may need to be configured to detect a wide range of coupling mechanisms. A variety of fluorescence-based techniques is well-known in the art and is capable of high-throughput and ultra high throughput screening for activity, including but not limited to BRET™ (bioluminescence resonance energy transfer) or FRET™ (fluorescence resonance energy transfer) (both by Packard Instrument Co., Meriden, Conn.). The ability to screen a large volume and a variety of test compounds with great sensitivity permits analysis of the therapeutic targets of the invention to further provide potential inhibitors of asthma. The BIACORE™ system (a plasmon resonance system) may also be manipulated to detect binding of test compounds with individual components of the therapeutic target, that is, binding to either the encoded protein or to the ligand.
Therefore, the invention provides for high-throughput screening of test compounds for the ability to inhibit activity of a protein encoded by the marker gene products listed in Tables 2, 3, 4, 5, 6, 8, 9, 10, 11, 12 and/or SEQ ID NOs:1-77, by combining the test compounds and the protein in high-throughput assays such as BIACORE™, or in fluorescence-based assays such as FRET or BRET™. In addition, high-throughput assays may be utilized to identify specific factors which bind to the encoded proteins, or alternatively, to identify test compounds which prevent binding of the receptor to the binding partner. In the case of orphan receptors, the binding partner may be the natural ligand for the receptor. Moreover, the high-throughput screening assays may be modified to determine whether test compounds can bind to either the encoded protein or to the binding partner (e.g., substrate or ligand) which binds to the protein.
In one embodiment, the high-throughput screening assay detects the ability of a plurality of test compounds to bind to a marker gene product selected from the group consisting of the markers listed in Tables 2, 3, 4, 5, 6, 8, 9, 10, 11, 12 and/or SEQ ID NOs:1-77. In another specific embodiment, the high-throughput screening assay detects the ability of a plurality of test compounds to inhibit the binding of a binding partner (such as a ligand) to a marker gene product selected from the group consisting of the markers listed in Tables 2, 3, 4, 5, 6, 8, 9, 10, 11, 12 and/or SEQ ID NOs:1-77. In yet another specific embodiment, the high-throughput screening assay detects the ability of a plurality of test compounds to modulate signaling through a marker gene product selected from the group consisting of the markers listed in Tables 2, 3, 4, 5, 6, 8, 9, 10, 11, 12 and/or SEQ ID NOs:1-77.
Polynucleotide probes that correspond to the genes/markers of the present invention can be used to make nucleic acid arrays. A typical nucleic acid array includes at least one substrate support. The substrate support includes a plurality of discrete regions or addresses. The location of each discrete region is either known or determinable. The discrete regions can be organized in various forms or patterns. For instance, the discrete regions can be arranged as an array of regularly spaced areas on the surface of the substrate. Other patterns, such as linear, concentric or spiral patterns, can be used. In one embodiment, a nucleic acid array of the present invention is a bead array which includes a plurality of beads stably associated with the polynucleotide probes of the present invention.
Polynucleotide probes can be stably attached to their respective discrete regions through covalent and/or non-covalent interactions. “Stably attached” or “stably associated” means that during nucleic acid array hybridization the polynucleotide probe maintains its position relative to the discrete region to which the probe is attached. Any suitable method can be used to attach polynucleotide probes to a nucleic acid array substrate. In one embodiment, the attachment is achieved by first depositing the polynucleotide probes to their respective discrete regions and then exposing the surface to a solution of a cross-linking agent, such as glutaraldehyde, borohydride, or other bifunctional agents. In another embodiment, the polynucleotide probes are covalently bound to the substrate via an alkylamino-linker group or by coating the glass slides with polyethylenimine followed by activation with cyanuric chloride for coupling the polynucleotides. In yet another embodiment, the polynucleotide probes are covalently attached to a nucleic acid array through polymer linkers. The polymer linkers may improve the accessibility of the probes to their purported targets.
In addition, the polynucleotide probes can be stably attached to a nucleic acid array substrate through non-covalent interactions. In one embodiment, the polynucleotide probes are attached to the substrate through electrostatic interactions between positively charged surface groups and the negatively charged probes. In another embodiment, the substrate is a glass slide having a coating of a polycationic polymer on its surface, such as a cationic polypeptide. The probes are bound to these polycationic polymers. In yet another embodiment, the methods described in U.S. Pat. No. 6,440,723, which is incorporated herein by reference, are used to attach the probes to the nucleic acid array substrate(s).
Various materials can be used to make the substrate support. Suitable materials include, but are not limited to, glasses, silica, ceramics, nylons, quartz wafers, gels, metals, and papers. The substrates can be flexible or rigid. In one embodiment, they are in the form of a tape that is wound up on a reel or cassette. Two or more substrate supports can be used in the same nucleic acid array.
The surfaces of the substrate support can be smooth and substantially planar. The surfaces of the substrate can also have a variety of configurations, such as raised or depressed regions, trenches, v-grooves, mesa structures, and other irregularities. The surfaces of the substrate can be coated with one or more modification layers. Suitable modification layers include inorganic and organic layers, such as metals, metal oxides, polymers, or small organic molecules. In one embodiment, the surface(s) of the substrate is chemically treated to include groups such as hydroxyl, carboxyl, amine, aldehyde, or sulfhydryl groups.
The discrete regions on the substrate can be of any size, shape and density. For instance, they can be squares, ellipsoids, rectangles, triangles, circles, other regular or irregular geometric shapes, or any portion or combination thereof. In one embodiment, each of the discrete regions has a surface area of less than 10⁻¹ cm², such as less than 10⁻², 10⁻³, 10⁻⁴, 10⁻⁵, 10⁻⁶, or 10⁻⁷ cm². In another embodiment, the spacing between each discrete region and its closest neighbor, measured from center-to-center, is in the range of from about 10 to about 400 μm. The density of the discrete regions may range, for example, between 50 and 50,000 regions/cm².
Any method known in the art can be used to make the nucleic acid arrays of the present invention. For instance, the probes can be synthesized in a step-by-step manner on the substrate, or can be attached to the substrate in pre-synthesized forms. Algorithms for reducing the number of synthesis cycles can be used. In one embodiment, a nucleic acid array of the present invention is synthesized in a combinatorial fashion by delivering monomers to the discrete regions through mechanically constrained flowpaths. In another embodiment, a nucleic acid array of the present invention is synthesized by spotting monomer reagents onto a substrate support using an ink jet printer. In yet another embodiment, polynucleotide probes are immobilized on a nucleic acid array of the present invention by using photolithography techniques.
The nucleic acid arrays of the present invention can also be bead arrays which comprise a plurality of beads. Polynucleotide probes can be stably attached to each bead using any of the above-described methods.
In one embodiment, a substantial portion of all polynucleotide probes on a nucleic acid array of the present invention can hybridize under stringent or nucleic acid array hybridization conditions (Table 7) to genes that are differentially expressed in samples from individuals having asthma exacerbation versus an asthma quiet period. In some embodiments, at least 5%, 10%, 15%, 20%, 25%, 35%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or more of all polynucleotide probes on the nucleic acid array can hybridize to asthma exacerbation differentially expressed genes. The probes for these genes can be concentrated on one substrate support. They can also be attached to two or more substrate supports, such as in the bead arrays.
Any number of polynucleotide probes can be included in a nucleic acid array of the present invention. For instance, the nucleic acid array can include at least 2, 5, 10, 20, 30, 40, 50, 100, 200, 300, 400, 500, 1,000 or more different probes, and each probe can hybridize under stringent or nucleic acid array hybridization conditions to a different respective gene selected from asthma exacerbation genes. In one embodiment, a nucleic acid array of the present invention includes a first set of probes which are capable of hybridizing under stringent or nucleic acid array hybridization conditions to different respective asthma exacerbation genes. In yet another embodiment, a nucleic acid array of the present invention includes at least 2, 5, 10, 20, 30, 40, 50, 100, 200, 300, 400, 500, 1,000, 2,000, 3,000, 4,000, 5,000, or more different probes, and each probe can hybridize under stringent or nucleic acid array hybridization conditions to a different respective target sequence selected from any one or more of Tables 2-6 and 8-12, and SEQ ID NOs:1-77, or the complement thereof.
Multiple probes can be included in the nucleic acid arrays of the present invention for detecting the same target sequence. For instance, at least 2, 5, 10, 15, 20, 25, 30 or more different probes can be used for detecting the same target sequence selected from any one or more of Tables 2-6 and 8-12, and SEQ ID NOs:1-77. In one embodiment, a nucleic acid array of the present invention includes at least 30, 40, 50, or 60 different probes for each target sequence of interest. In another embodiment, a nucleic acid array of the present invention includes 25-39 probes for each target sequence of interest.
Each probe can be attached to a different respective discrete region on a nucleic acid array. Alternatively, two or more different probes can be attached to the same discrete region. The concentration of one probe with respect to the other probe or probes in the same region may vary according to the objectives and requirements of the particular experiment. In one embodiment, different probes in the same region are present in approximately equimolar ratio.
In some embodiments, probes for different tiling or target sequences are attached to different discrete regions on a nucleic acid array. In some applications, probes for different tiling or target sequences are attached to the same discrete region.
The length of each probe on a nucleic acid array of the present invention can be selected to achieve the desirable hybridization effects. For instance, each probe can include or consist of 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or more consecutive nucleotides. In one embodiment, each probe consists of 25 consecutive nucleotides.
The nucleic acid arrays of the present invention can also include control probes which can hybridize under stringent or nucleic acid array hybridization conditions to respective control sequences, or the complements thereof.
In addition, the present invention features kits useful for the diagnosis or selection of treatment of asthma. Each kit includes or consists essentially of at least one probe for an asthma exacerbation marker. Reagents or buffers that facilitate the use of the kit can also be included. Any type of probe can be used in the present invention, such as hybridization probes, amplification primers, antibodies, or any and all other probes commonly used and known to the skilled artisan. In one embodiment, the asthma exacerbation markers are selected from Table 2, Table 3, Table 4, Table 5, Table 6, Table 8, Table 9, Table 10, Table 11, Table 12 and/or SEQ ID NOs:1-77.
In one embodiment, a kit of the present invention includes or consists essentially of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more polynucleotide probes or primers. Each probe/primer can hybridize under stringent conditions or nucleic acid array hybridization conditions to a different respective asthma exacerbation marker. As used herein, a polynucleotide can hybridize to a gene if the polynucleotide can hybridize to an RNA transcript, or complement thereof, of the gene. In another embodiment, a kit of the present invention includes one or more antibodies, each of which is capable of binding to a polypeptide encoded by a different respective asthma prognostic or disease gene/marker.
In one example, a kit of the present invention includes or consists essentially of probes (e.g., hybridization or PCR amplification probes or antibodies) for at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, or more genes selected from Tables 2, 3, 4, 5, 6, 8, 9, 10, 11, 12 and/or SEQ ID NOs:1-77. In another embodiment, the kit can contain nucleic acid probes and antibodies to 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, or more genes selected from Tables 2, 3, 4, 5, 6, 8, 9, 10, 11, 12 and/or SEQ ID NOs:1-77.
The probes employed in the present invention can be either labeled or unlabeled. Labeled probes can be detectable by spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical, chemical, or other suitable means. Exemplary labeling moieties for a probe include radioisotopes, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers such as fluorescent markers and dyes, magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like.
The kits of the present invention can also have containers containing buffer(s) or reporter means. In addition, the kits can include reagents for conducting positive or negative controls. In one embodiment, the probes employed in the present invention are stably attached to one or more substrate supports. Nucleic acid hybridization or immunoassays can be directly carried out on the substrate support(s). Suitable substrate supports for this purpose include, but are not limited to, glasses, silica, ceramics, nylons, quartz wafers, gels, metals, papers, beads, tubes, fibers, films, membranes, column matrices, or microtiter plate wells. The kits of the present invention may also contain one or more controls, each representing a reference expression level of a marker detectable by one or more probes contained in the kits.
It should be understood that the above-described embodiments and the following examples are given by way of illustration, not limitation. Various changes and modifications within the scope of the present invention will become apparent to those skilled in the art from the present description.
Adult subjects age 18 years or older with confirmed diagnosis of mild, moderate or severe persistent asthma were enrolled in a prospective 12-month non-interventional study of gene expression associated with asthma. Enrollment was stratified by severity of asthma as defined by NIH 1997 guidelines (NIH Publication No. 07-4051, originally printed July 1997).
A prospective, multi-center, non-interventional study, which included subjects having asthma, was conducted in five countries (Australia, Iceland, Ireland, U.K., and USA). Three types of study visits were conducted: (a) exacerbation visits, defined as taking place during exacerbation attacks and within 14 days of attack onset; (b) follow-up visits, defined as taking place within 14 days after cessation of exacerbation attack; and (c) quiet visits, defined as taking place during stable disease at approximately 3 month intervals.
Blood samples were collected for gene expression analyses from each subject at each visit. Samples were collected into Vacutainer™ cell preparation tubes (CPT, Becton Dickinson). Samples were shipped overnight and cell differential counts taken using a Pentra 5™ (Horiba ABX). Peripheral blood mononuclear cells (“PBMCs”) were purified according to manufacturer's instructions. Isolated PBMC pellets were stored at −80° C. pending RNA purification. RLT lysis buffer (with 0.1% β-mercaptoethanol; Qiagen) was added to the frozen pellets. RNA was isolated from the lysate using RNeasy™ Mini Kit (Qiagen Catalog #74104) and DNase treated (Qiagen RNase-free DNase Kit Catalog #79254). The DNase treated RNA preparation was further purified using a Phase Lock Gel™ column (Brinkman). RNA quality was assessed as acceptable by Agilent Bioanalyzer™ gel (Model 2100), and quantified using SpectraMax™ (Molecular Devices).
Exacerbation Visit Samples: From the total of 357 enrolled subjects, at least one evaluable exacerbation visit sample was collected from each of 118 (59 severe, 51 moderate, and 8 mild) subjects. A total of 166 exacerbation visit samples were collected from these 118 subjects. Of these, 25% were collected on the day of exacerbation attack onset, 16% one day post onset, 18% two days post onset, 37% between 3 and 9 days post onset, and the remaining 4% between 10 and 14 days post onset. 161 of the exacerbation samples were collected while the subjects were experiencing one or more of the following symptoms: wheezing, chest tightness, and/or shortness of breath (with a concomitant symptom of cough reported for 48% of samples). For the other 5 exacerbation samples, for which neither wheezing, chest tightness nor shortness of breath was reported, cough attributed by the physician to an exacerbation attack was noted. Symptoms of upper respiratory infections associated with exacerbation attack were reported for 23% of the 166 exacerbation visit samples.
Follow-up Visit Samples: A total of 125 evaluable follow-up samples from 102 subjects were collected from the 118 exacerbation visit subjects.
Quiet Visit Samples: A total of 393 evaluable quiet visit samples were collected from the 118 subjects used in the comparison of quiet and exacerbation visits. A total of 345 evaluable quiet visit samples were collected from the 102 subjects used in analyses relating to follow-up visits.
Gene expression levels in samples were determined using the U133A Affymetrix GENECHIP Array®. Quality control acceptance criteria are shown in Table 1. Samples that did not pass these quality control criteria were re-run, and samples that failed twice were excluded from analyses. A sample was considered evaluable if (a) GENECHIP quality control acceptance criteria were met, and (b) at least one exacerbation visit sample and at least one quiet visit sample were available from the same subject. Labeled targets for oligonucleotide arrays were prepared using 2 μg of total RNA according to the Affymetrix protocol. Biotinylated cRNA was hybridized to the HG-U133A Affymetrix GENECHIP Array®. Raw intensity values were processed using Affymetrix MAS 5.0 software, which calculated signal expression levels and present/absent calls for each probe set.
Of the 22,283 probe sets present on the U133A array, a subset of 9,696 probe sets, which met the following two criteria, were analyzed: (a) those probe sets detected as being present in at least 10% of the samples; and (b) those probe sets having a signal of at least 50 in at least 10% of the samples.
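The two-part probe set filter described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study. The function name `passes_filter`, the data layout, and the toy values are hypothetical, and the call `"P"` is assumed to encode a MAS 5.0 present call.

```python
def passes_filter(calls, signals, min_fraction=0.10, min_signal=50.0):
    """Return True if a probe set meets both criteria (a) and (b).

    calls   -- detection calls, one per sample ("P" assumed to mean present)
    signals -- signal values, one per sample, in the same order as calls
    """
    n = len(calls)
    frac_present = sum(1 for c in calls if c == "P") / n
    frac_signal = sum(1 for s in signals if s >= min_signal) / n
    return frac_present >= min_fraction and frac_signal >= min_fraction


# Toy data for ten samples (hypothetical values, for illustration only).
probe_sets = {
    "probe_A": (["P"] + ["A"] * 9, [120.0] + [10.0] * 9),  # meets (a) and (b)
    "probe_B": (["A"] * 10, [80.0] * 10),                   # fails (a)
    "probe_C": (["P"] * 10, [30.0] * 10),                   # fails (b)
}

kept = [name for name, (calls, signals) in probe_sets.items()
        if passes_filter(calls, signals)]
print(kept)
```

In this sketch the two criteria are evaluated independently, so the samples that carry a present call need not be the same samples that reach the signal threshold; the text above does not state whether the study required them to coincide.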
The clinical and gene expression databases were merged using SAS version 9.1. Correct association of GENECHIP data with sample donor was verified by determining that gender specific expression patterns correctly reflected the donor's gender, and by a consistent expression pattern of HLA marker genes in samples collected at different times from the same donor.
Analysis of covariance (ANCOVA) methods were used to adjust for covariates when testing for differences in expression levels between visit types. The ANCOVA models used log2-transformed MAS 5.0 signal as the dependent variable and included terms for visit type, asthma severity defined by NIH guidelines, sex, age (18-39, 40-59, or 60-83), race, sample processing lab, maximum corticosteroid exposure (a 4-level variable reflecting corticosteroid exposure at time of visit, with systemic>inhaled>intranasal>no corticosteroid exposure), an indicator for use of leukotriene antagonist at time of visit, β-actin/GAPDH ratio (an indicator of RNA quality), and monocyte/lymphocyte ratio. The visit type factor was limited to quiet visits and exacerbation visits in some analyses, while in others it included follow-up visits. Some analyses included an additional 3-level factor for exacerbation subgroup. In some analyses, pairwise contrasts were run between specific levels of factors with more than two levels. In such cases, the contrasts were performed using two-sided t-tests, with the denominator of the t-statistics derived from the ANCOVA error term. Separate ANCOVAs were run for each probe set. To adjust for the multiplicity of testing, false discovery rates were calculated across all probe sets, separately for each term in the ANCOVA model or pairwise contrast. All ANCOVAs and the false discovery rate (“FDR”; Benjamini and Hochberg, J. of the Royal Statistical Society, Series B, 57:289-300, 1995) calculations used to adjust for multiplicity of testing were run using SAS version 9.1.
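The Benjamini and Hochberg adjustment referred to above can be illustrated with a short sketch. The study ran these calculations in SAS; the Python below is only an illustrative equivalent of the step-up adjustment, applied to the per-probe-set p-values of a single model term. The function name and the five toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR), in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# One p-value per probe set for a single ANCOVA term (hypothetical values).
p_per_probe_set = [0.001, 0.008, 0.039, 0.041, 0.60]
fdr = benjamini_hochberg(p_per_probe_set)

# Probe sets meeting an FDR<0.05 criterion.
significant = [i for i, q in enumerate(fdr) if q < 0.05]
print(fdr, significant)
```

Because the adjustment depends on the number of tests, it would be repeated separately for each model term or pairwise contrast, as the text describes.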
Complete linkage hierarchical analysis (SPOTFIRE version 8.1) was used to identify exacerbation visit samples with similar exacerbation associated patterns of gene expression. The difference between the log2 expression level during each individual exacerbation visit and the mean log2 expression level of quiet visits for the same subject was calculated for each of the 166 exacerbation visits for each probe set with an association with exacerbation P<0.05. These log ratios were ordered by spectral bi-clustering analysis to organize the expression profiles of each exacerbation visit according to their level of similarity to each other. Based on the heterogeneity observed, iterative K-means clustering was used to assess the evidence for distinct visit subgroups (see, e.g., Hartigan and Wong, Applied Statistics 1979, 28:100-108). Clustering of visits was executed for K=2, 3, 4, and 8 clusters. For each clustering, the strength of the clustering was assessed by two complementary methods. First, the silhouette statistic (SW) was calculated for each cluster (subgroup) and the overall clustering (see, e.g., Rousseeuw P, J Comput Appl Math 1987, 20:53-65). Second, typical levels of Gaussian experimental noise were injected into the expression data, 100 realizations of this noisy data were each clustered, and the weighted sum of the fraction of realizations where the same groups of visits were co-clustered was calculated to generate a robustness index (R), which is closely related to the measures described in McShane, 2002 (McShane et al., Bioinformatics 2002, 18 (11):1462-1469). For K=2 clusters, there was a clear and robust separation into two clusters (SW=0.19, R=0.998). For K=3 clusters, robust groupings were also found (SW=0.08, R=0.88). Beyond K=3 clusters, the SW and R measures declined further, indicating little support for more than 3 subgroups.
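Two of the quantities described above, the per-visit log ratio and the silhouette statistic, can be sketched in a few lines. This is a simplified illustration under stated assumptions and not the SPOTFIRE or K-means procedure used in the study: Euclidean distance is assumed, the cluster labels are supplied by hand rather than produced by a K-means run, and all function names and values are hypothetical.

```python
import math


def log_ratios(exac_log2, quiet_log2_visits):
    """Exacerbation log2 level minus the subject's mean quiet log2 level,
    computed separately for each probe set."""
    n_quiet = len(quiet_log2_visits)
    quiet_mean = [sum(visit[j] for visit in quiet_log2_visits) / n_quiet
                  for j in range(len(exac_log2))]
    return [e - q for e, q in zip(exac_log2, quiet_mean)]


def mean_silhouette(points, labels):
    """Average silhouette width over all points (Euclidean distance)."""
    widths = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, grouped by cluster.
        dist_by_cluster = {}
        for k, q in enumerate(points):
            if k != i:
                dist_by_cluster.setdefault(labels[k], []).append(math.dist(p, q))
        own = dist_by_cluster.pop(labels[i], None)
        if not own or not dist_by_cluster:
            widths.append(0.0)  # singleton cluster, or only one cluster present
            continue
        a = sum(own) / len(own)                                     # cohesion
        b = min(sum(d) / len(d) for d in dist_by_cluster.values())  # separation
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# One exacerbation visit and two quiet visits, two probe sets (toy values).
ratios = log_ratios([8.0, 5.0], [[7.0, 5.5], [7.0, 4.5]])

# Four visits described by two log ratios each, with two candidate groupings.
visits = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = mean_silhouette(visits, [0, 0, 1, 1])
poor = mean_silhouette(visits, [0, 1, 0, 1])
print(ratios, good, poor)
```

A silhouette value near 1 indicates compact, well separated clusters, while values near or below 0 indicate that the grouping is not supported by the data, which is how the SW values reported above for K=2 and K=3 can be read.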
Based on these results, K-means clustering using 3 clusters was used to segregate exacerbation visit samples into three subgroups designated as X, Y and Z. Analyses using data for all 9,696 probe sets were then conducted to compare subgroup exacerbation visit expression levels to quiet visit expression levels. Probe sets showing a within-subgroup exacerbation association of FDR<0.05 and average fold change with exacerbation >1.2 were defined as meeting the criteria for association with exacerbation.
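The selection criteria stated above can be expressed as a simple predicate. The sketch below is illustrative only. It assumes that the fold change is derived from the mean log2 difference between exacerbation and quiet visits and that a change in either direction counts (an absolute fold change); the function name and the example values are hypothetical.

```python
def meets_criteria(fdr, mean_log2_diff, fdr_cutoff=0.05, fc_cutoff=1.2):
    """True if a probe set passes both the FDR and the fold-change criterion.

    mean_log2_diff -- mean log2(exacerbation) minus mean log2(quiet)
    """
    # Convert the log2 difference to a fold change, ignoring direction.
    abs_fold_change = 2.0 ** abs(mean_log2_diff)
    return fdr < fdr_cutoff and abs_fold_change > fc_cutoff


# (FDR, mean log2 difference) pairs for four hypothetical probe sets.
candidates = [(0.001, 0.5), (0.001, 0.1), (0.2, 1.0), (0.01, -0.4)]
selected = [meets_criteria(fdr, diff) for fdr, diff in candidates]
print(selected)
```

On the log2 scale a fold change of 1.2 corresponds to a difference of about 0.26, so the second example fails on effect size even though its FDR is small.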
In order to track the biological pathways and functional networks implicated in exacerbation attacks, genes associated with exacerbation attack were analyzed using Ingenuity Pathway Analysis (IPA 3.1 release, Ingenuity® Systems, www.Ingenuity.com; see e.g., Calvano et al., 2005 Nature 437:1032-1037) to reveal relationships between them.
Conversion of 2 μg of total RNA from the above preparations to cDNA was performed using the Applied Biosystems High Capacity cDNA Archive Kit (Applied Biosystems Catalog #4322171) according to the manufacturer's instructions. Pre-validated, QC tested gene specific primer-probe pairs, optimized for use on any ABI PRISM™ sequence detection system, were purchased from Applied Biosystems (ABI). Real-time quantitative gene expression assay kits were obtained from Applied Biosystems. The genes assayed were IFNα1 (assay Hs00256882_s1), IFNβ1 (assay Hs00277188_s1), IFNγ (assay Hs00174143_m1), IL18 (assay Hs00155517_m1), IL13 (assay Hs00174379_m1), and the endogenous normalizer control, ZNF592 (assay Hs00206029_m1). All study samples were normalized to ZNF592 levels to determine relative concentration values.
Using the Taqman Assay-On-Demand (“AOD”) product insert volume recommendations, a master mix was prepared using Taqman™ Universal PCR Master Mix (Catalog #4304437) and aliquoted into a 96 well plate (ABI Catalog #N801-0560 and caps #N801-0935) for a final volume of 50 μl/well. Duplicate wells for serially diluted standards and cDNA samples (50 ng/well) were assayed on an ABI PRISM 7700 Sequence detector (Sequence Detector Software v1.7) using universal thermal cycling conditions of 50° C. for 2 minutes, 95° C. for 10 minutes and 40 cycles of 95° C. for 15 seconds, 60° C. for 1 minute.
Relative quantification of RNA transcript levels was performed following the guidelines described in ABI PRISM 7700 Sequence Detection System User Bulletin #2 using the relative standard curve method. Specifically, standard curves are calculated for target standards and endogenous control, input values determined for target and endogenous control using standard curves' slope and y-intercept, and target input values are normalized to endogenous control. Fold change is calculated using the 50 ng standard as a calibrator and relative concentration of sample is obtained by multiplying fold change by calibrator, then averaged. To utilize the standard curve method for RNA quantification, a tissue empirically determined to express the target gene was identified using Applied Biosystems Taqman AOD™. Standard curve tissue sources were: cervix tumor from Ambion for IFNα, activated human monocyte for IFNβ, activated human PBMC for IFNγ and IL18, and thymus from Ambion for IL13. Cycle threshold (Ct) values of >35 were considered below the limits of detection. For standard curve development, the goal was to achieve a Ct value between 18 and 25 for 100 ng of cDNA. This allowed for appropriate standard curve dynamic range. Standard curves consisted of two-fold serial dilutions of total cDNA from 100 ng/well to 1.5 ng/well. Standard curves were performed on each plate for every assay and were used for sample quantification and assay performance monitoring. Inter-plate % CV for standard curve points was <4.5% for IFNα and IL-13 and <3% for interferon-b1, IFNβ and IL-18.
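The core of the relative standard curve method described above, namely fitting Ct against log10 input, reading sample inputs off the curve, and normalizing the target to the endogenous control, can be sketched as follows. This is a minimal illustration and not the procedure of the User Bulletin itself: the standard Ct values are synthetic (generated assuming ideal two-fold amplification), the sample Ct values and function names are hypothetical, and the calibrator and duplicate-averaging steps are not reproduced.

```python
import math

LOD_CT = 35.0  # Ct values above this are treated as below the limit of detection


def fit_standard_curve(input_ng, ct_values):
    """Least-squares fit of Ct = slope * log10(input) + intercept."""
    xs = [math.log10(q) for q in input_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def input_from_ct(ct, slope, intercept):
    """Invert the standard curve; returns None above the detection limit."""
    if ct > LOD_CT:
        return None
    return 10.0 ** ((ct - intercept) / slope)


# Two-fold serial dilutions starting at 100 ng/well.
std_ng = [100.0, 50.0, 25.0, 12.5, 6.25, 3.125, 1.5625]
# Synthetic standard Ct values: one extra cycle per two-fold dilution.
target_std_ct = [22.0 + math.log2(100.0 / q) for q in std_ng]
control_std_ct = [20.0 + math.log2(100.0 / q) for q in std_ng]

t_slope, t_int = fit_standard_curve(std_ng, target_std_ct)
c_slope, c_int = fit_standard_curve(std_ng, control_std_ct)

# Hypothetical sample: target Ct 25.0, endogenous control Ct 21.0.
target_input = input_from_ct(25.0, t_slope, t_int)    # about 12.5 ng
control_input = input_from_ct(21.0, c_slope, c_int)   # about 50 ng
normalized = target_input / control_input              # about 0.25
print(round(normalized, 3))
```

With ideal amplification the fitted slope is close to -3.32 cycles per ten-fold change in input; slopes that depart noticeably from this value on a given plate would be one reason to monitor the standard curves per plate, as described above.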
AOD for single exon gene targets (IFNα1 and IFNβ1) can produce inaccurate transcript expression values if the RNA preparations used for cDNA conversion contain genomic DNA. The following strategy was developed and employed to determine which cDNA samples contained genomic DNA.
Genomic sequence analysis was performed in the area of the human KIAA0644 gene product (accession #NM_014817) to determine predicted mRNA sequences using the Ensembl Gene Browser (see Fernandez Suarez and Schuster, “Using the Ensembl Genome Server to Browse Genomic Sequence Data,” UNIT 1.15 in Current Protocols in Bioinformatics, Supplement 16, January 2007; and also http://www.ensembl.org/index.html). A Taqman primer/probe pair was designed (ABI Primer Express) from a predicted nontranslated sequence located approximately 1.5 Kb 3′ of the KIAA0644 single exon gene product open reading frame on chromosome 7. In a Taqman assay, this primer/probe pair was shown to produce a strong signal, Ct value of 24.14, using a human genomic DNA preparation (Clontech catalog #6550-1) and no signal (Ct value of 40) using a commercially available purified RNA preparation from Ambion (human kidney, catalog #7976). Taqman analysis of all AOS cDNA preparations was performed using this primer/probe pair. Samples producing a Ct value of 35 or greater were determined to not be contaminated with genomic DNA, while samples producing a Ct value of less than 35 were considered to be contaminated with genomic DNA. For single exon gene target results, samples containing genomic DNA were not included in the statistical analysis (12% of AOS samples).
Statistical analyses were conducted to compare gene expression for 5 preselected genes (IFNα1, IFNβ1, IFNγ, IL13, and IL18) for exacerbated and quiet asthma periods. For each subgroup of patients (defined by K-means analysis described above), a mixed model analysis of variance was fit to the expression data to compare expression between exacerbation and quiet periods at the five percent significance level. The model included a fixed effect for visit type (quiet or exacerbated) and a random effect for patient to account for multiple visits per patient.
Gene expression data from the quiet periods were then combined for the two subgroups to estimate the between- and within-subject variability in expression for asthma patients during quiet periods. These variance components were also estimated for a set of 28 healthy volunteers.
By ANCOVA analysis of the 118 subjects comparing quiet (393 samples) and exacerbation (166 samples), 78 probe sets had changes in expression that were associated with exacerbation, based on a criterion of FDR<0.05. The significance of the association with exacerbation ranged from 5.16E-5 to 4.6E-2 (Table 2). A listing and annotation of these 78 probe sets and the significance of association is given in Table 2.
The comparison between exacerbation and quiet visits that identified the 78 sequences was based on 118 subjects, and for 16 of these subjects no evaluable follow-up visit samples were available. The ANCOVA comparing quiet and exacerbation visit gene expression was rerun on visits only from the 102 subjects that also had follow-up visit data. The significant differences between quiet and exacerbation in the comparison based on 118 subjects also trend towards significance in the comparison based on 102 subjects, although, as predictable given the loss of statistical power, there are fewer probe sets with FDR<0.05 in the analysis based on 102 donors (Table 2).
The ANCOVA comparing mean expression levels in quiet and follow-up visits indicated that gene expression levels associated with exacerbation had returned to quiet visit levels at the follow-up visit (Table 2).
To examine in more detail expression patterns associated with exacerbation, spectral bi-clustering analysis was performed using the difference between the log2 expression levels during each individual exacerbation visit and the mean log2 expression level of quiet visits for each of the 166 exacerbation visits. This analysis revealed significant heterogeneity between the expression profiles of exacerbation visits, and suggested that sub-grouping of visits might increase our power to detect transcripts that were differentially expressed only within specific subgroups. It was determined that K-means clustering using 3 clusters defined three relatively distinct and robust exacerbation associated gene expression patterns (see Methods section). K-means clustering was therefore used to assign each exacerbation sample to one of three subgroups (or clusters) designated as X (30 visits), Y (64 visits) and Z (72 visits). ANCOVA was performed on all 9,696 probe sets to compare mean expression levels in each exacerbation subgroup with mean quiet visit expression levels, thereby identifying sub-group specific expression profiles that might have been masked by heterogeneity when data from all exacerbation visits were lumped together.
Since the subgroups (or clusters) were defined based on minimizing variability among members of the same subgroup, the p-values and FDR values observed for specific exacerbation subgroup versus quiet visit comparisons for any given probe set cannot be interpreted as numerically equivalent to p-values and FDRs obtained in the earlier analysis of quiet versus exacerbation expression levels. Rather, the “within subgroup” FDR values are used primarily to rank the probe sets in terms of significance of differences between exacerbation and quiet expression levels. Therefore, FDR values generated from within subgroup analyses were designated as “comparative FDRs.” In practical terms, a probe set with a comparative within subgroup X FDR of <1E-15 is much more likely to be associated with exacerbation than the probe sets with comparative FDRs of >0.05, but the overall probability of association with exacerbation cannot be stated to be FDR<1E-15.
Within subgroup X, 1,081 probe sets had differences between exacerbation and quiet visit expression levels, as defined by comparative FDR<0.05 and absolute average fold change >1.2, and 48% of these 1,081 probe sets had comparative FDR<1E-3. Table 4 lists these probe sets along with their gene annotations and the strength of association with exacerbation as determined by comparative FDR. These findings indicate a very robust exacerbation associated gene expression profile within subgroup X. Analyses were then performed to determine the differences between quiet and follow-up visits within subgroup X. Of the 30 subgroup X exacerbation visits, evaluable follow-up samples were available for 22, resulting in 8 (26.6%) fewer samples in the analysis comparing quiet to follow-up visits within subgroup X. Even with this smaller sample size, ANCOVA comparing expression in quiet visits and the 22 exacerbation visits for which there was a corresponding follow-up visit showed that 793 (74%) of the 1,081 exacerbation associated probe sets retained a comparative FDR<0.05, indicating that a robust exacerbation associated expression profile was detectable even with a 26.6% decrease in sample size. In stark contrast, the ANCOVA comparing quiet visits and 22 follow-up visits of subgroup X exacerbation visits for all 9,696 probe sets identified only 36 differences with FDR<0.05, indicating that, unlike exacerbation samples, follow-up samples are very similar to quiet visit samples. Of the 793 probe sets significantly associated with exacerbation in the 22 visit analysis, only 2 had a significant difference between quiet and follow-up visits.
Many of the 1,081 exacerbation-associated subgroup X probe sets did not show even a slight trend towards association with exacerbation in subgroup Y, with 26% having a subgroup Y association FDR >0.5 (50%). These data indicate significant qualitative gene expression differences in subgroup X and Y exacerbations. Overlap between subgroups was also observed, however, with 21% of the subgroup X probe sets showing an association with exacerbation (comparative FDR<0.05) within subgroup Y.
ANCOVA comparing exacerbation and quiet visit expression levels within subgroup Y identified 574 probe sets associated with exacerbation. For subgroup Y, there were 64 exacerbation visits in the analyses comparing quiet and exacerbation, and 51 in the analyses that included follow-up visits. As was seen in both the conglomerate and subgroup X analyses, for most probe sets subgroup Y expression levels had returned to quiet visit levels by follow-up. The list of the subgroup Y probe sets together with the metrics for association with exacerbation is given in Table 5. Of the probe sets associated with exacerbation in subgroup Y, 24% overlapped with subgroup X probe sets. In addition, subgroup Y probe sets include 39% that did not show even a slight trend with exacerbation (comparative FDR >0.5) in subgroup X. These data confirm the striking difference between subgroup X and Y exacerbations.
Subgroup Z contained the largest number of exacerbation visits (72) and the analyses that included follow-up visits contained 52 samples. The total number of exacerbation-associated probe sets in subgroup Z was 211, and the lowest relative FDR observed was 0.0004. No probe sets were identified in subgroup Z that did not also show a significant association with exacerbation in subgroup X and/or Y, indicating that subgroup Z does not represent a third qualitatively distinct exacerbation associated profile. Rather, the data show that subgroup Z contains the visits that differ the least from quiet visits, and suggest that visits with weak to absent exacerbation associated profiles were assigned by the K-means algorithm to this group.
Chi-square tests or ANOVAs were performed to determine if clinical or technical parameters could be identified that had a significant association with subgroup assignment. A significant association between body mass index (BMI) and subgroup assignment was identified. Mean BMI was statistically significantly lower (p=0.006) in subgroup X than subgroup Y, and was statistically suggestively lower (p=0.0501) in subgroup Z than subgroup Y. Mean BMI was 28.4, 32.4, and 30.2 in subgroups X, Y and Z, respectively. Additionally, subgroup Y samples were somewhat less likely (p=0.042) to be from fasting subjects at time of visit, with 30%, 22% and 29% fasting samples in subgroup X, Y and Z respectively. Subgroup Y samples tended to be from older patients (mean age 46.0 years) than those in subgroup X (mean age 39.1) or Z (mean age 43.5), and this difference was significant in the comparison of subgroups X and Y (p=0.03).
No evidence was found for association between subgroup assignments and any of the following parameters: sex, race, country, disease severity, atopy status, respiratory infection, systemic, inhaled, intra-nasal corticosteroid or leukotriene inhibitor use, histamine H2 antagonist or PPI use, medical history of acid reflux, time between onset of exacerbation and sample collection, or sample processing lab. Also no evidence was found for association with FEV1 or IgE, but many values were missing in this analysis.
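For a categorical parameter such as fasting status, a chi-square test of independence against subgroup assignment can be sketched as follows. This is an illustrative sketch only, not the procedure actually run: the function name is hypothetical and the counts are made-up placeholders rather than study data. For a 2 x 3 table the test has 2 degrees of freedom, in which case the chi-square upper-tail probability is exactly exp(-x/2), so no statistics library is needed.

```python
import math


def chi_square_2x3(table):
    """Chi-square test of independence for a 2-row by 3-column count table.

    Returns (statistic, p_value). With (2-1)*(3-1) = 2 degrees of freedom,
    the chi-square upper-tail probability is exactly exp(-statistic / 2).
    """
    if len(table) != 2 or any(len(row) != 3 for row in table):
        raise ValueError("expected a 2x3 table of counts")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column factors.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic, math.exp(-statistic / 2.0)


# HYPOTHETICAL counts (NOT study data): rows = fasting / non-fasting,
# columns = subgroups X, Y, Z.
counts = [[10, 20, 30],
          [30, 20, 10]]
stat, p = chi_square_2x3(counts)
print(f"chi-square = {stat:.2f}, p = {p:.5f}")
```

Continuous parameters such as BMI or age would instead be compared with an ANOVA, which is not shown here.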
The mean number of days between quiet and exacerbation visits was significantly smaller for subgroup X (48.4 days) than for subgroup Y (62.7 days) or Z (79.6 days). (The p-value of 0.03 is for the overall test comparing the means for the 3 subgroups; the p-value for X vs. Z was 0.014; the p-value for X vs. Y was >0.05.) While not wishing to be bound by theory, given the study design, this difference possibly indicates that subgroup X samples were more likely to be collected from subjects who sought medical attention due to symptoms of attack, whereas the other subgroups were more likely to include some samples collected during a scheduled visit from patients whose exacerbation symptoms had not prompted them to come in for an exacerbation visit. Other explanations for this observed difference are also possible, including that the different types of exacerbation visits may simply have different frequencies of occurrence. The associations between subgroup assignment and clinical parameters therefore suggest that exacerbations with the most acute attack symptoms (as defined by prompting the subject to seek more immediate medical attention) tended to be in subgroup X. Those with exacerbations (and asthma) associated with high BMI tended to be in subgroup Y, perhaps providing a molecular signature for the previously observed link between higher BMI and symptoms of asthma. Those in subgroup Z tended to display a milder form of exacerbation profile, as apparently reflected in the significantly longer intervals between seeking medical attention and the much less robust molecular signature.
To determine the biological pathways and functional networks implicated in exacerbation by gene expression patterns, genes were analyzed using Ingenuity Pathway Analysis (IPA). Various canonical pathways are specific to subgroups X and Y, respectively. For example, subgroup X canonical signaling pathways include natural killer cell, antigen presentation, leukocyte extravasation, JAK/Stat, interferon, GM-CSF, T cell receptor, toll-like receptor and IL-10 signaling. Subgroup Y canonical signaling pathways include, for example, IL-4, B cell receptor, death receptor, SAPK/JNK, IL-2, PTEN, circadian rhythm, IGF-1, actin cytoskeleton, PI3K/Akt and insulin receptor signaling. Many IFN-inducible genes were noted in subgroup X. These include the interferon regulatory factors (IRFs) that are known to drive the transcription of various IFN-inducible genes. IRF1, IRF7 and IRF9 are upregulated in exacerbation, while IRF4 is downregulated in exacerbation.
Networks were built (using the Connect Tool) around these IRFs using the subgroup X associated genes and expression and/or transcription as the connectivity from IRFs to the subgroup X associated genes. Subgroup X associated genes for IRF1, IRF7 and IRF9 are also upregulated in exacerbation suggesting that these IRFs drive the expression of these exacerbation genes. Likewise, exacerbation genes in the IRF4 hub are down and so is IRF4 in exacerbation. A network centered around IL15 was also highly significant, suggesting that IL15 could be regulating the expression of several exacerbation genes. All subgroup X associated genes were connected to IL15 based on information available in IPA for IL15 regulation of gene expression using the Connect tool (Table 6).
Subgroup Y showed a robust signature for the canonical pathway for B cell signaling. While subgroup Z did not have a robust signature, pathway analysis identified the TLR pathway as well represented among the genes that passed the significance filter in this subgroup.
Since interferon response elements were so strongly identified with exacerbations, and IFNγ is so strongly identified with a Th1 type response and IFNα and β more consistent with the Th2 response classically associated with asthma, TAQMAN analysis of a subset of samples from subgroups X and Y was performed to assess the association of each of these genes with subgroups X and Y. As shown in Table 3, the results indicate that elevated levels of IFNα and β were associated with subgroup X exacerbations. IFNγ did not differ significantly between quiet and exacerbation samples.
Comparison of quiet and exacerbation visit gene expression profiles identified significant exacerbation associated changes in gene expression levels. Expression levels had returned to quiet visit levels two weeks after the attack. Clustering algorithms identified three relatively distinct exacerbation phenotypes defined by PBMC gene expression profiles, and analysis showed that gene expression patterns identified by ANCOVA performed within subgroups had also returned to quiet visit levels two weeks following an exacerbation, confirming that the within subgroup analysis did, indeed, identify genes significantly associated with exacerbation.
Pathway analysis for the three subgroups identified distinct pathways active within subgroups X, Y and Z. Many IFN-inducible genes, such as OAS1, OAS3, MX1, IFITM3, IFIT3, IFI27, IFI35 and IFIT1, among others, are observed in subgroup X. These include interferon regulatory factors (IRFs), a family of transcription factors involved in the regulation of the interferon response. Of the nine known IRFs, IRF1, IRF7 and IRF9 are up-regulated in exacerbation, while IRF4 is down-regulated in exacerbation. The majority of the subgroup X genes that are regulated by IRF1 and IRF7 are also up-regulated in exacerbation. The majority of the subgroup X genes regulated by IRF4, such as CXCR4, MS4A, VIL2 and GATA3, are also down-regulated in exacerbation. The IFN response in subgroup X is likely regulated by these IRFs and may be either a Type I IFN (IFNα/IFNβ and others, such as IFN-ω, -ε, and -κ) or a Type II IFN (IFNγ) response. TAQMAN data indicate that the subgroup X IFN pathway is driven by IFNα and IFNβ. Data analysis indicates that the IFN pathway activation observed in the instant exacerbation samples is not attributable to respiratory infections, and that samples in this subgroup tend to have come from patients with normal BMI.
Another likely player in subgroup X exacerbations is IL15. IL15 is a TH1 cytokine that activates T-cells in a T-cell receptor (TCR) independent manner. TCR a, TCR z and CD3D, which is associated with TCR, are down-regulated in exacerbation, along with CD8B, a co-receptor for MHC class I, as well as downstream signaling proteins such as ITK, PLCg1, TEC, SOS2, PIK3R1 and CALM1. IL15 is up-regulated in exacerbation and so is IL2RG, the shared signaling component of IL15R. Subgroup X type exacerbations therefore likely involve IL15 activation of T-cells in a TCR-independent manner. IRF1 induces IL15, and IFNs may activate CD8 T-cells via IL15.
TLRs trigger IFN responses. TLR signal transduction occurs either in a MYD-88 dependent manner through the recruitment of IRAK1/4, TRAF6, TAB1/2 and TAK1, or in a MYD-88 independent manner that involves TRAM, TRIF, TBK-1, IKK-e and other signaling molecules. TLR3 and TLR4 are the only Toll receptors that utilize the MYD-88 independent signaling pathway. TLR1, TLR2 and TLR4 are all expressed at significantly higher levels in exacerbation, as are MYD88, MD-2, CD14 and a downstream kinase, EIF2AK1.
MDA5/IFIH, which is a cytosolic receptor for intracellular viral RNAs and synthetic dsRNAs, and which mediates TLR-independent induction of type I IFN genes, is also upregulated in subgroup X suggesting that both TLR-dependent and independent pathways are activated in subgroup X.
Additional pathways regulated in subgroup X include, for example, the NK-cell signaling pathway and the antigen presentation pathway.
The NK-cell signaling pathway is common to subgroup Y as well. Subgroup Y genes involved in NK activation such as FCER1 and FCGR3 are expressed at higher levels in exacerbation, as well as the downstream signaling molecules LCK, SYK, LAT, RAC and RRAS, but not PIK3C1 and PIK3RA1. On the NK-inhibition side, receptors LILRB1, LAIR1 and AIRM1, as well as some downstream signaling molecules, are up-regulated in exacerbation, suggesting compensatory mechanisms in place for NK signaling. Some parallels and some differences in both arms of NK signaling can be noted for subgroup X. Actin-cytoskeletal structural genes such as ARPC5, PFN, CYFIP1 and ARPC1B, but not VIL2, are upregulated in subgroup Y. Some of these trend in the opposite direction for subgroup X.
The expression levels of TLRs, IRFs and IL15 do not significantly change in subgroup Y compared to the quiet visits. A few genes common to the TLR, IFN and IL15 pathways in subgroup X, such as MDA5, IFI35, ICAM2, CCR2 and IL2RG, are also seen in subgroup Y, and almost all trend in the same direction.
Additionally, different genes with similar functions showed subgroup specificity. For example, phospholipid scramblase 1 (PSCR1) is elevated in subgroup X (FDR=7.13E-13) but not in subgroup Y (FDR=0.509), whereas phospholipid scramblase 3 (PSCR3) is elevated in subgroup Y (FDR=0.003) but not in subgroup X (FDR=0.99).
The following tables, which are referenced in the foregoing description, are herein incorporated in their entirety.
Plasma samples from asthmatic donors during either previously scheduled or random exacerbation visits, and from healthy volunteer donors, were analyzed by ELISA for the presence of various cytokines, sST2 protein, which is the soluble form of ST2, an IL-1 receptor family member and cognate receptor for IL-33 (see Sanada et al., J. Clin. Invest., 117:1538-1548, 2007, which is incorporated herein by reference), and chitinase 3-like 1 protein (YKL-40, CHI3L1) (see Table 8). CHI3L1 showed a significant difference in expression in the sera of asthmatics versus healthy volunteers, indicating its usefulness as an asthma-associated biomarker.
Serum sST2 concentrations were found to be significantly higher in (a) asthmatics versus healthy donors (p<0.05); (b) asthmatics during exacerbation versus asthmatics during scheduled visits (p<0.05); and (c) asthmatics during exacerbation versus healthy volunteers (p<0.0005). Specifically, the concentration of sST2 in sera was observed to be elevated upon exacerbation (90 pg/mL) relative to normal controls (55 pg/mL) (p value<0.0001). It was further observed that, upon asthma exacerbation, males have higher sST2 concentration in the sera (126 pg/mL) relative to females (78 pg/mL) (p value<0.01).
The question of whether sST2 is induced in response to G-protein coupled receptor (GPCR) activation was examined in a human mast cell line (HMC-1; see Versluis et al., Int. Immunopharmacol., 8:866-873, 2008). We observed strong induction of sST2 mRNA and protein expression upon cell activation with the asthma-associated anaphylatoxin C5a and the adenosine analog NECA, which activate GPCR signaling via the C5a and adenosine receptors, respectively. Thus, sST2 is a useful asthma and exacerbation biomarker for the clinic.
We have shown that there are significant differences in PBMC gene expression profiles of asthma exacerbation subjects and asthma quiet or healthy subjects. In this example, we have shown that the expression level of many asthma associated genes can vary over time (e.g. between visits separated by time) within a subject, and can range from close to healthy to very different from healthy, and that differences between subjects are not necessarily greater than differences within subjects. The result of such an analysis will enable the selection of more optimal asthma and asthma exacerbation biomarker candidates that have higher incidences of deviation from healthy and quiet, respectively, on a per visit basis, as well as lower intra-subject deviations. (See copending U.S. Patent Application No. 60/879,994, which is herein incorporated by reference.) Non-limiting examples of such more optimal biomarkers for exacerbation include BLVRA (biliverdin reductase A), CSE1L (chromosome segregation 1-like), CTSC (cathepsin C), FCN1 (ficolin 1), GRN (granulin), LAMP2 (lysosomal-associated membrane protein 2), PECAM1 (platelet/endothelial cell adhesion molecule-1), S100A9 (S100 calcium binding protein A9) and SP110 (SP110 nuclear body protein). Exacerbation biomarkers having low intra-subject variability and high deviation from quiet or healthy are also shown in Table 9 and Table 10 for cluster X and cluster Y subgroups, respectively. These markers can be used to predict an exacerbation event in asthma sufferers.
To demonstrate this intra-subject variability, a first analysis was run on GeneChips from the first visit for each subject and a second analysis was run on GeneChips from the second visit for each subject (subsequent analyses looked at later visits). Using all subjects and analyzing data from all visits, 438 probesets, which were significantly associated with asthma, were selected. For each probeset, the log2 fold change was calculated for each asthma sample (including exacerbation asthma samples) over average healthy (all subjects, all visits). A quantitative scale was devised, which indicates the “distance” between an individual asthma (asthma exacerbation) profile and the mean healthy profile. Then the range of distance of asthma or asthma exacerbation from healthy was analyzed on a subject-by-subject basis.
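The per-sample log2 fold change over average healthy, and the per-subject range of distance from healthy, can be illustrated with a short sketch. The source does not define the quantitative distance scale, so the mean absolute log2 fold change is used here purely as an assumed stand-in. The function names, the assumption that expression values are linear-scale intensities, and the toy values are likewise hypothetical.

```python
import math


def mean_profile(samples):
    """Average expression per probeset across a list of samples (dicts)."""
    probes = samples[0].keys()
    return {p: sum(s[p] for s in samples) / len(samples) for p in probes}


def log2_fold_changes(sample, healthy_mean):
    """log2(sample / mean healthy) per probeset; assumes linear-scale values."""
    return {p: math.log2(sample[p] / healthy_mean[p]) for p in healthy_mean}


def distance_from_healthy(sample, healthy_mean):
    """ASSUMED distance metric: mean absolute log2 fold change."""
    lfc = log2_fold_changes(sample, healthy_mean)
    return sum(abs(v) for v in lfc.values()) / len(lfc)


def distance_range_by_subject(visits, healthy_mean):
    """visits: iterable of (subject_id, sample). Returns {subject: (min, max)}."""
    ranges = {}
    for subject, sample in visits:
        d = distance_from_healthy(sample, healthy_mean)
        low, high = ranges.get(subject, (d, d))
        ranges[subject] = (min(low, d), max(high, d))
    return ranges


# Toy, made-up intensities for two hypothetical probesets (NOT study data).
healthy = [{"a": 100.0, "b": 50.0}, {"a": 100.0, "b": 150.0}]
healthy_mean = mean_profile(healthy)
visits = [
    ("S1", {"a": 200.0, "b": 100.0}),  # visit 1: close to healthy
    ("S1", {"a": 400.0, "b": 25.0}),   # visit 2: far from healthy
]
ranges = distance_range_by_subject(visits, healthy_mean)
print(ranges)
```

A wide (min, max) range for a single subject corresponds to the intra-subject variability described above, where the same subject is close to healthy at one visit and far from healthy at another.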
The first and second visit analyses gave the same results, including the same cluster structure, same asthma genes, and almost the same fold change in expression level. However, it was noted that the subjects move between a subcluster that is very different from healthy and a subcluster that is close to healthy, showing that some asthma-associated and exacerbation-associated genes vary within a subject over time.
The 438 probesets used for the asthma profile (supra) were examined for their association with other inflammatory diseases. Approximately 155 of those markers were significantly associated with asthma and not with multiple sclerosis (MS) or inflammatory bowel disease (IBD). 164 were associated with asthma and MS, with an additional 112 at least trending to significance in MS. 16 markers were associated with asthma and Crohn's disease, 10 of which were not also associated with MS. Nine (9) markers were associated with asthma and ulcerative colitis (UC).
The majority of genes common to MS and asthma changed in the same direction relative to normal or healthy in both diseases, with the following exceptions: IL21R (interleukin 21 receptor) was up in MS, down in asthma, and down more in severe asthma; CUTL1 (Cut-like 1, CCAAT displacement protein) was up in MS, down in asthma, down more in severe asthma; DGKD (Diacylglycerol kinase, delta 130 kDa) was up in MS, down in asthma, down more in severe asthma; and KIAA0528 (hypothetical protein LOC9847) was up in MS, down in asthma, and down more in severe asthma.
Of the 166 exacerbation samples, 39 occurred during a respiratory system infection and 127 occurred without symptoms of infection. To identify probe sets that showed association with exacerbation only in the presence of infection, an ANCOVA was performed comparing the 39 samples collected during infection with the quiet visits from the same patients. 54 probesets were identified with FDR<0.05 (Table 11). Of note, 16 of the 54 probe sets showed an association with exacerbation in the presence of infection, but did not show a significant association in the analysis comparing the mixed group of 166 exacerbations (with and without infection) and quiet samples (Table 12). Consistent with this finding, none of these 16 was significantly associated with exacerbations in the absence of infection. These data indicate that there were some probe sets whose association with exacerbation was detectable only in the presence of a concomitant infection.
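Filtering probe sets at FDR<0.05 presupposes a multiple-testing adjustment of the per-probe-set p-values. This passage does not state which FDR procedure was used; the sketch below assumes the Benjamini-Hochberg step-up adjustment purely for illustration, and the function name, probe set identifiers and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    ASSUMPTION: BH is used only as a common example of an FDR procedure;
    the source does not specify its method.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for five hypothetical probe sets (NOT study data).
probe_ids = ["ps1", "ps2", "ps3", "ps4", "ps5"]
raw_p = [0.001, 0.008, 0.02, 0.2, 0.6]
adjusted = benjamini_hochberg(raw_p)
selected = [pid for pid, q in zip(probe_ids, adjusted) if q < 0.05]
print(selected)
```

Running the same filter separately on the infection, non-infection and mixed comparisons, and then taking set differences of the selected identifiers, would yield lists analogous to the infection-only probe sets described above.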
At least three probe sets were observed to be associated with exacerbation in the absence of infection (i.e. not associated with exacerbation in the presence of infection). Those probe sets include: (a) interferon induced with helicase C domain 1 (IFIH1; e.g. SEQ ID NO:60), (b) leukotriene A4 hydrolase (LTA4H; e.g. SEQ ID NO:61) and (c) open reading frame number 25 of human chromosome 6 (C6ORF25; SEQ ID NO:62). These probe sets can serve as biomarkers of exacerbation triggered by inert non-infectious agents.
1 Estimated Difference for Exacerbation Expression − Quiet Expression.
1 The hybrid length is that anticipated for the hybridized region(s) of the hybridizing polynucleotides. When hybridizing a polynucleotide to a target polynucleotide of unknown sequence, the hybrid length is assumed to be that of the hybridizing polynucleotide. When polynucleotides of known sequence are hybridized, the hybrid length can be determined by aligning the sequences of the polynucleotides and identifying the region or regions of optimal sequence complementarity.
H: SSPE (1x SSPE is 0.15M NaCl, 10 mM NaH2PO4, and 1.25 mM EDTA, pH 7.4) can be substituted for SSC (1x SSC is 0.15M NaCl and 15 mM sodium citrate) in the hybridization and wash buffers.
This application claims priority to U.S. Provisional Application Nos. 61/059,153, filed on Jun. 5, 2008; 61/084,787, filed on Jul. 30, 2008; and 61/111,917, filed on Nov. 6, 2008, each of which is incorporated herein by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
61059153 | Jun 2008 | US | |
61084787 | Jul 2008 | US | |
61111917 | Nov 2008 | US |