The present invention generally relates to a unique set of genetic polymorphisms associated with lupus, and compositions and methods for assessing risk of developing lupus as well as for diagnosing and treating lupus.
Lupus is an autoimmune disease involving antibodies that attack connective tissue. The disease is estimated to affect nearly 1 million Americans, primarily women between the ages of 20 and 40. The principal form of lupus is systemic lupus erythematosus (SLE), a chronic autoimmune disease with strong genetic as well as environmental components (see, e.g., Hochberg M C, Dubois' Lupus Erythematosus. 5th ed., Wallace D J, Hahn B H, eds. Baltimore: Williams and Wilkins (1997); Wakeland E K, et al., Immunity 2001; 15(3):397-408; Nath S K, et al., Curr. Opin. Immunol. 2004; 16(6):794-800). Autoantibodies play an important role in the pathogenesis of SLE, and the diverse clinical manifestations of the disease are due to the deposition of antibody-containing immune complexes in blood vessels leading to inflammation in the kidney, brain and skin, together with direct pathogenic effects of autoantibodies contributing to hemolytic anemia and thrombocytopenia. SLE is generally characterized as an autoimmune connective-tissue disorder with a wide range of clinical features, which predominantly affects women, especially from certain ethnic groups. D'Cruz et al., Lancet (2007), 369:587-596. SLE is associated with the production of antinuclear antibodies, circulating immune complexes, and activation of the complement system. SLE has an incidence of about 1 in 700 women between the ages of 20 and 60. SLE can affect any organ system and can cause severe tissue damage. Numerous autoantibodies of differing specificity are present in SLE. SLE patients often produce autoantibodies having anti-DNA, anti-Ro, and anti-platelet specificity that are capable of initiating clinical features of the disease, such as glomerulonephritis, arthritis, serositis, complete heart block in newborns, and hematologic abnormalities. These autoantibodies are also possibly related to central nervous system disturbances. Arbuckle et al. describe the development of autoantibodies before the clinical onset of SLE (Arbuckle et al. N. Engl. J. Med. 349(16): 1526-1533 (2003)). Definitive diagnosis of lupus, including SLE, is not easy, resulting in clinicians resorting to a multi-factorial signs and symptoms-based classification approach. Gill et al., American Family Physician (2003), 68(11): 2179-2186.
Untreated lupus can be fatal as it progresses from attack of skin and joints to internal organs, including lung, heart, and kidneys (with renal disease being the primary concern), thus making early and accurate diagnosis of and/or assessment of risk of developing lupus particularly critical. Lupus mainly appears as a series of flare-ups, with intervening periods of little or no disease manifestation. Kidney damage, measured by the amount of protein in the urine (proteinuria), is one of the most acute areas of damage associated with pathogenicity in SLE, and accounts for at least 50% of the mortality and morbidity of the disease.
One of the most difficult challenges in clinical management of complex autoimmune diseases such as lupus is the accurate and early identification of the disease in a patient. Over the years, many linkage and candidate gene studies have been performed to identify genetic factors contributing to SLE susceptibility. Haplotypes carrying the HLA Class II alleles DRB1*0301 and DRB1*1501 are clearly associated with disease as well as the presence of antibodies to nuclear autoantigens. See, e.g., Goldberg M A, et al., Arthritis Rheum 1976; 19(2):129-32; Graham R R, et al., Am J Hum Genet. 2002; 71(3):543-53; and Graham R R, et al., Eur J Hum Genet. 2007; 15(8):823-30. More recently, variants of Interferon Regulatory Factor 5 (IRF5) and Signal Transducer and Activator of Transcription 4 (STAT4) were discovered to be significant risk factors for SLE. See, e.g., Sigurdsson S, et al., Am J Hum Genet. 2005; 76(3):528-37; Graham R R, et al., Nat Genet. 2006; 38(5):550-55; Graham R R, et al., Proc Natl Acad Sci USA 2007; 104(16):6758-63; and Remmers E F, et al., N Engl J Med 2007; 357(10):977-86. The identification of IRF5 and STAT4 as SLE risk genes provides support for the concept that the Type-I interferon pathway is central to disease pathogenesis. See, e.g., Ronnblom L, et al., J Exp Med 2001; 194(12):F59-63; Baechler E C, et al., Curr Opin Immunol 2004; 16(6):801-07; Banchereau J, et al., Immunity 2006; 25(3):383-92; Miyagi T, et al., J Exp Med 2007; Epublication; September 10.
To this end, it would be highly advantageous to have molecular-based diagnostic methods that can be used to objectively identify the presence of and/or classify the disease in a patient. Polymorphisms are genetic variations that are present in an organism's genome, and include single nucleotide polymorphisms (SNPs). See, e.g., Carlson et al., Nature 2004; 429:446-452; Bell, Nature 2004; 429:453-463; Evans & Relling, Nature 2004; 429:464-468. SNPs have been strongly correlated with risk and/or presence of serious diseases such as diabetes (Sladek et al., Nature 2007; 445: 881-885; Zeggini et al., Science 2007; April 26; Scott et al., Science 2007; April 26; and Saxena et al., Science 2007; April 26); Crohn disease (e.g., Hampe et al., Nat. Genet. 2007; February; 39(2):207-11); rheumatoid arthritis (e.g., US Pat. Pub. No. 2007/0031848); and other inflammatory autoimmune diseases (e.g., U.S. Pat. No. 6,900,016; U.S. Pat. No. 7,205,106).
Until recently, it has not been possible to comprehensively examine the genome for variants that modify risk of complex diseases such as lupus. However, the generation of an extensive catalog of common human variation (see, e.g., Nature 2005; 437(7063):1299-320), coupled with technological advances that permit cost-effective and accurate genotyping of hundreds of thousands of variants, has fueled a revolution in human genetics. For the first time, it is possible to conduct well-powered genome-wide association scans to more fully test the hypothesis that common variants influence risk. In the past two years, this technology has been highly validated. See, e.g., Dewan A, et al., Science 2006; 314(5801):989-92; Nature 2007; 447(7145):661-78; Matarin M, et al., Lancet Neurology 2007; 6(5):414-20; Moffatt M F, et al., Nature 2007; 448(7152):470-73; Plenge R M, et al., N Engl J Med 2007; Saxena R, et al., Science 2007; 316(5829):1331-36; Scott L J, et al., Science 2007; 316(5829):1341-45; Scuteri A, et al., PLoS Genet. 2007; 3(7):e115. The identified risk loci are providing new insights into the molecular pathways dysregulated in human disease.
However, there continues to be a significant lack of credible information on SNP associations with complex diseases such as lupus; thus, a continuing need exists to identify polymorphisms associated with such diseases. Such associations would greatly benefit the identification of the presence of lupus in patients or the determination of susceptibility to develop the disease. In addition, statistically and biologically significant and reproducible information regarding association of a SNP with a complex disease such as lupus could be utilized as an integral component in efforts to identify specific subsets of patients who would be expected to significantly benefit from treatment with a particular therapeutic agent, for example where the therapeutic agent is or has been shown in clinical studies to be of therapeutic benefit in such a specific lupus patient subpopulation.
The invention described herein meets the above-described needs and provides other benefits.
All references cited herein, including patent applications and publications, are incorporated by reference in their entirety.
The invention provides accurate, simple, and rapid methods and compositions for identifying lupus, and for assessing risk of developing lupus, based at least in part on the identification of one or more genetic variations, e.g., SNPs, that are correlated with high statistical and biological significance with the presence, subtypes, and/or patient subpopulations of lupus. More specifically, the invention relates to the identification of a unique set of SNPs, unique combinations of such SNPs, and linkage disequilibrium regions that are associated with lupus and its subtypes, and patient subpopulations suffering from same.
In particular, the unique set and/or combinations of SNPs can be used as a genetic profile or signature indicative of a subject at risk of developing lupus, or indicative of the disease or symptom or condition thereof. The polymorphisms disclosed herein are useful as biomarkers for assessing risk of developing lupus, as well as for targets for the design of diagnostic reagents. In some aspects, the SNP is not associated with a gene. In other aspects, the SNP is associated with a gene, and can be located either in an intergenic or intragenic region, and more particularly, can be located in a coding or noncoding region. A SNP of the present invention may be associated with an unknown gene, or may be associated with a known gene, e.g., ITGAM or BLK.
The SNPs identified herein provide targets for development of therapeutic agents for use in the diagnosis and treatment of genetically identified lupus patients, including diagnosis and targeted treatment of lupus patient subpopulations exhibiting a distinct genetic signature comprising one or more of the SNPs of the present invention. For example, in one aspect, the genes containing the genetic variations identified herein, and the nucleic acid (e.g., DNA or RNA) associated with these genes, and proteins encoded by these genes, can be used as targets for the development of therapeutic agents (e.g., small molecule compounds, antibodies, antisense/RNAi agents, etc.) or used directly as therapeutic agents (e.g., therapeutic proteins, etc.) for the treatment of lupus.
Accordingly, in one aspect, the invention provides a set of one or more SNPs that form a unique genetic signature for assessing the risk of developing lupus. In one aspect, the unique genetic signature comprises about 1-10, 10-20, 20-30, 30-40, or 40-50 SNPs selected from any of the SNPs set forth in
In one aspect, the unique genetic signature comprises one or more SNPs, 2 or more SNPs, 3 or more SNPs, 4 or more SNPs, 5 or more SNPs, 6 or more SNPs, 7 or more SNPs, 8 or more SNPs, 9 or more SNPs, 10 or more SNPs, 11 or more SNPs, 12 or more SNPs, 13 or more SNPs, 14 or more SNPs, 15 or more SNPs, 16 or more SNPs, 17 or more SNPs, 18 or more SNPs, 19 or more SNPs, or 20 or more SNPs selected from any of the SNPs set forth in
In another aspect, the invention provides for methods of assessing whether a subject is at risk of developing lupus by detecting in a biological sample obtained from said subject, the presence of a genetic signature indicative of risk of developing lupus, wherein said genetic signature comprises a set of one or more SNPs selected from any of the SNPs set forth in
In another aspect, the invention provides for methods of diagnosing lupus in a subject by detecting in a biological sample obtained from said subject, the presence of a genetic signature indicative of lupus, wherein said genetic signature comprises a set of one or more SNPs selected from any of the SNPs set forth in
In another aspect, the invention provides for an isolated polynucleotide or fragment thereof that is at least about 10 nucleotides in length, wherein the polynucleotide or fragment thereof comprises: a) a genetic variation at a nucleotide position corresponding to the position of a single nucleotide polymorphism (SNP) selected from any of the SNPs set forth in
In one aspect, the invention provides for an isolated PRO-associated polynucleotide or fragment thereof that is at least about 10 nucleotides in length, wherein the PRO-associated polynucleotide or fragment thereof comprises: a) a genetic variation at a nucleotide position corresponding to the position of a single nucleotide polymorphism (SNP) selected from any of the SNPs set forth in
In another aspect, the invention provides for an oligonucleotide that is (a) an allele-specific oligonucleotide that hybridizes to a region of a polynucleotide comprising a genetic variation at a nucleotide position corresponding to the position of a single nucleotide polymorphism (SNP) set forth in
In another aspect, the invention provides for a kit comprising any one of the oligonucleotides above and, optionally, at least one enzyme. In one aspect, the at least one enzyme is a polymerase. In another aspect, the at least one enzyme is a ligase.
In another aspect, the invention provides for a microarray comprising any of the oligonucleotides above.
In another aspect, the invention provides for a method of detecting the absence or presence of a variation in a polynucleotide at a nucleotide position corresponding to the position of a single nucleotide polymorphism (SNP) as set forth in
In another aspect, the invention provides for a method of amplifying a nucleic acid comprising a variation in a polynucleotide at a nucleotide position corresponding to the position of a single nucleotide polymorphism (SNP) selected from any of the SNPs as set forth in
In another aspect, the invention provides for a method of determining the genotype of a biological sample from a mammal, the method comprising detecting, in nucleic acid material derived from the biological sample, the absence or presence of a variation in a polynucleotide at a nucleotide position corresponding to the position of a single nucleotide polymorphism (SNP) selected from any of the SNPs as set forth in
In another aspect, the biological sample is known to or suspected of comprising a polynucleotide of the present invention, wherein the polynucleotide comprises a variation at a nucleotide position corresponding to the position of a single nucleotide polymorphism (SNP) selected from any of the SNPs as set forth in
In another aspect, the invention provides for a method of sub-classifying lupus in a mammal, the method comprising detecting the presence of one or more of the SNPs set forth in
In another aspect, the detecting comprises carrying out a process selected from a primer extension assay; an allele-specific primer extension assay; an allele-specific nucleotide incorporation assay; an allele-specific oligonucleotide hybridization assay; a 5′ nuclease assay; an assay employing molecular beacons; and an oligonucleotide ligation assay.
In another aspect, the invention provides for a method for predicting whether a subject with lupus will respond to a lupus therapeutic agent, the method comprising determining whether the subject comprises a variation in a polynucleotide at a nucleotide position corresponding to the position of a single nucleotide polymorphism (SNP) selected from any of the SNPs as set forth in
In another aspect, the invention provides a method of diagnosing or prognosing lupus in a subject, the method comprising detecting the presence of a variation in a polynucleotide derived from a biological sample obtained from the subject, wherein: (a) the biological sample is known to comprise, or is suspected of comprising, a polynucleotide comprising the variation; (b) the variation comprises, or is located at a nucleotide position corresponding to, a SNP selected from any of the SNPs set forth in
In another aspect, the invention provides a method of diagnosing or prognosing lupus in a subject, the method comprising detecting the presence of a variation in a PRO or PRO-associated polynucleotide derived from a biological sample obtained from the subject, wherein: (a) the biological sample is known to comprise, or is suspected of comprising, a PRO or PRO-associated polynucleotide comprising the variation; (b) the variation comprises, or is located at a nucleotide position corresponding to, a SNP set forth in
In another aspect, the invention provides a method of aiding in the diagnosis or prognosis of lupus in a subject, the method comprising detecting the presence of a variation in a polynucleotide derived from a biological sample obtained from the subject, wherein: (a) the biological sample is known to comprise, or suspected of comprising, a polynucleotide comprising the variation; (b) the variation comprises, or is located at a nucleotide position corresponding to, a SNP selected from any of the SNPs set forth in
In another aspect, the invention provides a method of aiding in the diagnosis or prognosis of lupus in a subject, the method comprising detecting the presence of a variation in a PRO or PRO-associated polynucleotide derived from a biological sample obtained from the subject, wherein: (a) the biological sample is known to comprise, or suspected of comprising, a PRO or PRO-associated polynucleotide comprising the variation; (b) the variation comprises, or is located at a nucleotide position corresponding to, a SNP selected from any of the SNPs set forth in
In another aspect, the polynucleotide comprises a sequence within a linkage disequilibrium region (e.g., as set forth in
In another aspect, the PRO-associated polynucleotide encodes a PRO that is encoded by a sequence within a linkage disequilibrium region (e.g., as set forth in
In another aspect, the invention provides for a method of identifying a therapeutic agent effective to treat lupus in a patient subpopulation, the method comprising correlating efficacy of the agent with the presence in the patient of one or more of the SNPs selected from any of the SNPs set forth in
In another aspect, the invention provides for a method of identifying a therapeutic agent effective to treat lupus in a patient subpopulation, the method comprising correlating efficacy of the agent with the presence of a combination of the SNPs selected from any of the SNPs set forth in
In another aspect, the invention provides for a method of treating a lupus condition in a subject in whom a genetic variation is known to be present at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) selected from any of the SNPs set forth in
In another aspect, the invention provides for a method of treating a subject having a lupus condition, the method comprising administering to the subject a therapeutic agent effective to treat the condition in a subject who has a genetic variation at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) selected from any of the SNPs set forth in
In another aspect, the invention provides for a method of treating a subject having a lupus condition, the method comprising administering to the subject a therapeutic agent shown to be effective to treat said condition in at least one clinical study wherein the agent was administered to at least five human subjects who each had a genetic variation at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) selected from any of the SNPs set forth in
In another aspect, the invention provides for a method of treating a lupus subject of a specific lupus patient subpopulation, wherein the subpopulation is characterized at least in part by association with genetic variation at a nucleotide position corresponding to a SNP selected from any of the SNPs set forth in
In another aspect, the invention provides for a method comprising manufacturing a lupus therapeutic agent, and packaging the agent with instruction to administer the agent to a subject who has or is believed to have lupus and who has a genetic variation at a position corresponding to a single nucleotide polymorphism (SNP) selected from any of the SNPs set forth in
In another aspect, the invention provides for a method of specifying a therapeutic agent for use in a lupus patient subpopulation, the method comprising providing instruction to administer the therapeutic agent to a patient subpopulation characterized by a genetic variation at a position corresponding to a single nucleotide polymorphism (SNP) selected from any of the SNPs set forth in
In another aspect, the invention provides for a method for marketing a therapeutic agent for use in a lupus patient subpopulation, the method comprising informing a target audience about the use of the therapeutic agent for treating the patient subpopulation as characterized by the presence, in patients of such subpopulation, of a genetic variation at a position corresponding to a single nucleotide polymorphism (SNP) selected from any of the SNPs set forth in
In another aspect, the invention provides for a method for modulating signaling through the B cell receptor in a subject in whom a genetic variation is known to be present at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) selected from any of the SNPs set forth in
In another aspect, the invention provides for a method for modulating the differentiation of Th17 cells in a subject in whom a genetic variation is known to be present at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) selected from any of the SNPs set forth in
In another aspect, the invention provides for a set of SNPs comprising a genetic signature indicative of the risk of developing lupus, wherein said set of SNPs comprises one or more SNPs selected from any of the SNPs set forth in
In another aspect, the invention provides for a set of SNPs comprising a genetic signature indicative of lupus, wherein said set of SNPs comprises one or more SNPs selected from any of the SNPs set forth in
The following key applies to the headings in
The invention provides accurate, simple, and rapid methods and compositions for identifying lupus, and for assessing risk of developing lupus, based at least in part on the identification of one or more genetic variations, e.g., SNPs, that are correlated with high statistical and biological significance with the presence, subtypes, and/or patient subpopulations of lupus. More specifically, the invention relates to the identification of a unique set of SNPs, unique combinations of such SNPs, and linkage disequilibrium regions that are associated with lupus and its subtypes, and patient subpopulations suffering from same.
In particular, the unique set and/or combinations of SNPs can be used as a genetic profile or signature indicative of a subject at risk of developing lupus, or indicative of the disease or symptom or condition thereof. The polymorphisms disclosed herein are useful as biomarkers for assessing risk of developing lupus, as well as for targets for the design of diagnostic reagents. In some embodiments, the SNP is not associated with a gene. In other embodiments, the SNP is associated with a gene, and can be located either in an intergenic or intragenic region, and more particularly, can be located in a coding or noncoding region. A SNP of the present invention may be associated with an unknown gene, or may be associated with a known gene, e.g., ITGAM or BLK.
The SNPs identified herein provide targets for development of therapeutic agents for use in the diagnosis and treatment of genetically identified lupus patients, including diagnosis and targeted treatment of lupus patient subpopulations exhibiting a distinct genetic signature comprising one or more of the SNPs of the present invention. For example, in one embodiment, the genes containing the genetic variations identified herein, and the nucleic acid (e.g., DNA or RNA) associated with these genes, and proteins encoded by these genes, can be used as targets for the development of therapeutic agents (e.g., small molecule compounds, antibodies, antisense/RNAi agents, etc.) or used directly as therapeutic agents (e.g., therapeutic proteins, etc.) for the treatment of lupus.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature, such as, “Molecular Cloning: A Laboratory Manual”, second edition (Sambrook et al., 1989); “Oligonucleotide Synthesis” (M. J. Gait, ed., 1984); “Animal Cell Culture” (R. I. Freshney, ed., 1987); “Methods in Enzymology” (Academic Press, Inc.); “Current Protocols in Molecular Biology” (F. M. Ausubel et al., eds., 1987, and periodic updates); “PCR: The Polymerase Chain Reaction”, (Mullis et al., eds., 1994).
Primers, oligonucleotides and polynucleotides employed in the present invention can be generated using standard techniques known in the art.
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994), and March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992), provide one skilled in the art with a general guide to many of the terms used in the present application.
For purposes of interpreting this specification, the following definitions will apply and whenever appropriate, terms used in the singular will also include the plural and vice versa. In the event that any definition set forth below conflicts with any document incorporated herein by reference, the definition set forth below shall control.
“Lupus” or “lupus condition,” as used herein, is an autoimmune disease or disorder that in general involves antibodies that attack connective tissue. The principal form of lupus is a systemic one, systemic lupus erythematosus (SLE), including cutaneous SLE and subacute cutaneous SLE, as well as other types of lupus (including nephritis, extrarenal, cerebritis, pediatric, non-renal, discoid, and alopecia). See, generally, D'Cruz et al., supra.
The term “polynucleotide” or “nucleic acid,” as used interchangeably herein, refers to polymers of nucleotides of any length, and includes DNA and RNA. The nucleotides can be deoxyribonucleotides, ribonucleotides, modified nucleotides or bases, and/or their analogs, or any substrate that can be incorporated into a polymer by DNA or RNA polymerase. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and their analogs. If present, modification to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component. Other types of modifications include, for example, “caps”, substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoamidates, carbamates, etc.) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), those containing pendant moieties, such as, for example, proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators (e.g., acridine, psoralen, etc.), those containing chelators (e.g., metals, radioactive metals, boron, oxidative metals, etc.), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids, etc.), as well as unmodified forms of the polynucleotide(s). Further, any of the hydroxyl groups ordinarily present in the sugars may be replaced, for example, by phosphonate groups, phosphate groups, protected by standard protecting groups, or activated to prepare additional linkages to additional nucleotides, or may be conjugated to solid supports. The 5′ and 3′ terminal OH can be phosphorylated or substituted with amines or organic capping group moieties of from 1 to 20 carbon atoms. Other hydroxyls may also be derivatized to standard protecting groups. Polynucleotides can also contain analogous forms of ribose or deoxyribose sugars that are generally known in the art, including, for example, 2′-O-methyl-, 2′-O-allyl-, 2′-fluoro- or 2′-azido-ribose, carbocyclic sugar analogs, α-anomeric sugars, epimeric sugars such as arabinose, xyloses or lyxoses, pyranose sugars, furanose sugars, sedoheptuloses, acyclic analogs and abasic nucleoside analogs such as methyl riboside. One or more phosphodiester linkages may be replaced by alternative linking groups. These alternative linking groups include, but are not limited to, embodiments wherein phosphate is replaced by P(O)S (“thioate”), P(S)S (“dithioate”), P(O)NR2 (“amidate”), P(O)R, P(O)OR′, CO or CH2 (“formacetal”), in which each R or R′ is independently H or substituted or unsubstituted alkyl (1-20 C) optionally containing an ether (—O—) linkage, aryl, alkenyl, cycloalkyl, cycloalkenyl or aralkyl. Not all linkages in a polynucleotide need be identical. The preceding description applies to all polynucleotides referred to herein, including RNA and DNA.
“Oligonucleotide,” as used herein, refers to short, single stranded polynucleotides that are at least about seven nucleotides in length and less than about 250 nucleotides in length. Oligonucleotides may be synthetic. The terms “oligonucleotide” and “polynucleotide” are not mutually exclusive. The description above for polynucleotides is equally and fully applicable to oligonucleotides.
The term “primer” refers to a single stranded polynucleotide that is capable of hybridizing to a nucleic acid and allowing the polymerization of a complementary nucleic acid, generally by providing a free 3′-OH group.
The term “PRO” refers to a polypeptide encoded by any gene encoded by a nucleic acid sequence located within a linkage disequilibrium region (LD region), where the LD region is determined in accordance with information set forth in
The term “genetic variation” or “nucleotide variation” refers to a change in a nucleotide sequence (e.g., an insertion, deletion, inversion, or substitution of one or more nucleotides, such as a single nucleotide polymorphism (SNP)) relative to a reference sequence (e.g., a commonly-found and/or wild-type sequence, and/or the sequence of a major allele). The term also encompasses the corresponding change in the complement of the nucleotide sequence, unless otherwise indicated. In one embodiment, a genetic variation is a somatic polymorphism. In one embodiment, a genetic variation is a germline polymorphism.
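By way of non-limiting illustration only, the comparison of a sequence against a reference sequence can be sketched in Python as follows. The sketch covers only single-nucleotide substitutions between two equal-length sequences (insertions, deletions, and inversions are outside its scope), and the function name `substitutions` and the example sequences are hypothetical, chosen solely for this illustration.

```python
def substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List single-nucleotide substitutions of `sample` relative to `reference`.

    Returns (position, reference_base, sample_base) tuples, with positions
    counted from 1. Both sequences are assumed to be the same length and
    already aligned; this is a simplification made for the sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    ref, smp = reference.upper(), sample.upper()
    return [(i + 1, r, s) for i, (r, s) in enumerate(zip(ref, smp)) if r != s]


# Hypothetical 10-nucleotide sequences differing at one position.
print(substitutions("GATTCAGGCT", "GATTTAGGCT"))
```

In this hypothetical example, the only difference reported is a C-to-T substitution at the fifth position.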
A “single nucleotide polymorphism”, or “SNP”, refers to a single base position in an RNA or DNA molecule (e.g., a polynucleotide), at which different alleles, or alternative nucleotides, exist in a population. The SNP position (interchangeably referred to herein as SNP, SNP site, SNP locus) is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in fewer than 1/100 or 1/1000 members of the population). An individual may be homozygous or heterozygous for an allele at each SNP position.
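For illustration only, the distinction between a homozygous and a heterozygous individual at a SNP position can be sketched in Python as follows. The function name `zygosity` and the encoding of a genotype as a two-character string (one nucleotide per chromosome copy) are assumptions made for this sketch and are not part of the disclosure.

```python
def zygosity(genotype: str) -> str:
    """Classify a diploid genotype at a single SNP position.

    `genotype` is assumed to be a two-character string such as "AG",
    one nucleotide per chromosome copy (a hypothetical encoding).
    """
    alleles = genotype.upper()
    if len(alleles) != 2 or not set(alleles) <= set("ACGT"):
        raise ValueError(f"expected two nucleotides, got {genotype!r}")
    # Two identical alleles: homozygous; two different alleles: heterozygous.
    return "homozygous" if alleles[0] == alleles[1] else "heterozygous"


print(zygosity("AA"))
print(zygosity("AG"))
```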
The term “amino acid variation” refers to a change in an amino acid sequence (e.g., an insertion, substitution, or deletion of one or more amino acids, such as an internal deletion or an N- or C-terminal truncation) relative to a reference sequence.
The term “variation” refers to either a nucleotide variation or an amino acid variation.
The terms “a genetic variation at a nucleotide position corresponding to a SNP,” “a nucleotide variation at a nucleotide position corresponding to a SNP,” and grammatical variants thereof refer to a nucleotide variation in a polynucleotide sequence at the relative corresponding nucleotide position occupied by the SNP in the genome. The terms also encompass the corresponding variation in the complement of the nucleotide sequence, unless otherwise indicated. In some embodiments, the nucleotide variation is in a PRO-associated polynucleotide sequence at the relative corresponding nucleotide position occupied by the SNP in the genome.
The term “linkage disequilibrium region SNP,” or “LD region SNP” refers to a SNP present in a specific region of DNA, such region delineated by appropriate nucleic acid/genomic markers, e.g., coordinates or SNPs. In one embodiment, a LD region is delineated by a first coordinate (e.g., coordinate A) and a second coordinate (e.g., coordinate B), both coordinates referring to the same chromosome. In one embodiment, a LD region is delineated by a first SNP (e.g., SNP A) and a second SNP (e.g., SNP B). Thus, in one embodiment, a LD region SNP refers to a SNP located in a nucleic acid region (e.g., genomic region) ranging from a first coordinate to a second coordinate, or a first SNP to a second SNP. Examples of such LD regions and LD region SNPs are shown in
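The coordinate-based delineation of an LD region can be sketched as a simple interval test. In the Python sketch below, the class name `LDRegion`, the chromosome labels, and the choice of inclusive boundaries are assumptions for illustration; the definition above does not state whether the boundary coordinates themselves are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LDRegion:
    """Hypothetical LD region delineated by two coordinates on one chromosome."""
    chromosome: str
    start: int  # first coordinate (coordinate A)
    end: int    # second coordinate (coordinate B)

    def contains(self, chromosome: str, position: int) -> bool:
        # Accept the two coordinates in either order; bounds are treated
        # as inclusive, which is an assumption of this sketch.
        lo, hi = sorted((self.start, self.end))
        return chromosome == self.chromosome and lo <= position <= hi


if __name__ == "__main__":
    region = LDRegion("chr1", 1000, 2000)
    print(region.contains("chr1", 1500))
    print(region.contains("chr2", 1500))
```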
The term “array” or “microarray” refers to an ordered arrangement of hybridizable array elements, preferably polynucleotide probes (e.g., oligonucleotides), on a substrate. The substrate can be a solid substrate, such as a glass slide, or a semi-solid substrate, such as nitrocellulose membrane.
The term “amplification” refers to the process of producing one or more copies of a reference nucleic acid sequence or its complement. Amplification may be linear or exponential (e.g., PCR). A “copy” does not necessarily mean perfect sequence complementarity or identity relative to the template sequence. For example, copies can include nucleotide analogs such as deoxyinosine, intentional sequence alterations (such as sequence alterations introduced through a primer comprising a sequence that is hybridizable, but not fully complementary, to the template), and/or sequence errors that occur during amplification.
The term “allele-specific oligonucleotide” refers to an oligonucleotide that hybridizes to a region of a target nucleic acid that comprises a nucleotide variation (generally a substitution). “Allele-specific hybridization” means that, when an allele-specific oligonucleotide is hybridized to its target nucleic acid, a nucleotide in the allele-specific oligonucleotide specifically base pairs with the nucleotide variation. An allele-specific oligonucleotide capable of allele-specific hybridization with respect to a particular nucleotide variation is said to be “specific for” that variation.
The term “allele-specific primer” refers to an allele-specific oligonucleotide that is a primer.
The term “primer extension assay” refers to an assay in which nucleotides are added to a nucleic acid, resulting in a longer nucleic acid, or “extension product,” that is detected directly or indirectly. The nucleotides can be added to extend the 5′ or 3′ end of the nucleic acid.
The term “allele-specific nucleotide incorporation assay” refers to a primer extension assay in which a primer is (a) hybridized to target nucleic acid at a region that is 3′ or 5′ of a nucleotide variation and (b) extended by a polymerase, thereby incorporating into the extension product a nucleotide that is complementary to the nucleotide variation.
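The base-pairing logic behind an allele-specific nucleotide incorporation assay can be sketched as follows. This minimal Python example assumes standard Watson-Crick pairing on a DNA template; the function names are hypothetical and chosen for illustration only.

```python
# Watson-Crick complement for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def incorporated_nucleotide(template_base: str) -> str:
    """Nucleotide a polymerase would add opposite the variant template base."""
    return COMPLEMENT[template_base.upper()]


def call_allele(incorporated: str) -> str:
    """Infer the template allele from the nucleotide that was incorporated."""
    return COMPLEMENT[incorporated.upper()]


if __name__ == "__main__":
    added = incorporated_nucleotide("G")
    print(added, call_allele(added))
```

Because complementation is its own inverse, detecting which nucleotide was incorporated identifies the allele present at the variation.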
The term “allele-specific primer extension assay” refers to a primer extension assay in which an allele-specific primer is hybridized to a target nucleic acid and extended.
The term “allele-specific oligonucleotide hybridization assay” refers to an assay in which (a) an allele-specific oligonucleotide is hybridized to a target nucleic acid and (b) hybridization is detected directly or indirectly.
The term “5′ nuclease assay” refers to an assay in which hybridization of an allele-specific oligonucleotide to a target nucleic acid allows for nucleolytic cleavage of the hybridized probe, resulting in a detectable signal.
The term “assay employing molecular beacons” refers to an assay in which hybridization of an allele-specific oligonucleotide to a target nucleic acid results in a level of detectable signal that is higher than the level of detectable signal emitted by the free oligonucleotide.
The term “oligonucleotide ligation assay” refers to an assay in which an allele-specific oligonucleotide and a second oligonucleotide are hybridized adjacent to one another on a target nucleic acid and ligated together (either directly or indirectly through intervening nucleotides), and the ligation product is detected directly or indirectly.
The term “target sequence,” “target nucleic acid,” or “target nucleic acid sequence” refers generally to a polynucleotide sequence of interest in which a nucleotide variation is suspected or known to reside, including copies of such target nucleic acid generated by amplification.
As used herein, a subject “at risk” of developing lupus may or may not have detectable disease or symptoms of disease, and may or may not have displayed detectable disease or symptoms of disease prior to the treatment methods described herein. “At risk” denotes that a subject has one or more risk factors, which are measurable parameters that correlate with development of lupus, as described herein and known in the art. A subject having one or more of these risk factors has a higher probability of developing lupus than a subject without one or more of these risk factor(s). For example, in some embodiments, a subject “at risk” of developing lupus has a genetic signature comprising one or more of the SNPs set forth in
The term “detection” includes any means of detecting, including direct and indirect detection.
The term “diagnosis” is used herein to refer to the identification or classification of a molecular or pathological state, disease or condition. For example, “diagnosis” may refer to identification of a particular type of lupus condition, e.g., SLE. “Diagnosis” may also refer to the classification of a particular sub-type of lupus, e.g., by tissue/organ involvement (e.g., lupus nephritis) or by molecular features (e.g., a patient subpopulation characterized by genetic variation(s) in a particular gene or nucleic acid region).
The term “aiding diagnosis” is used herein to refer to methods that assist in making a clinical determination regarding the presence, degree or other nature, of a particular type of symptom or condition of lupus. For example, a method of aiding diagnosis of lupus can comprise measuring the amount or detecting the presence or absence of one or more SNPs in a biological sample from an individual. In another example, a method of aiding diagnosis of lupus can comprise measuring the amount or detecting the presence of one or more SNPs in a biological sample from an individual.
The term “prognosis” is used herein to refer to the prediction of the likelihood of autoimmune disorder-attributable disease symptoms, including, for example, recurrence, flaring, and drug resistance, of an autoimmune disease such as lupus. The term “prediction” is used herein to refer to the likelihood that a patient will respond either favorably or unfavorably to a drug or set of drugs. In one embodiment, the prediction relates to the extent of those responses. In one embodiment, the prediction relates to whether and/or the probability that a patient will survive or improve following treatment, for example treatment with a particular therapeutic agent, and for a certain period of time without disease recurrence. The predictive methods of the invention can be used clinically to make treatment decisions by choosing the most appropriate treatment modalities for any particular patient. The predictive methods of the present invention are valuable tools in predicting if a patient is likely to respond favorably to a treatment regimen, such as a given therapeutic regimen, including for example, administration of a given therapeutic agent or combination, surgical intervention, steroid treatment, etc., or whether long-term survival of the patient, following a therapeutic regimen is likely. Diagnosis of SLE may be according to current American College of Rheumatology (ACR) criteria. Active disease may be defined by one British Isles Lupus Activity Group's (BILAG) “A” criteria or two BILAG “B” criteria. Some signs, symptoms, or other indicators used to diagnose SLE adapted from: Tan et al. 
“The Revised Criteria for the Classification of SLE” Arth Rheum 25 (1982) may be malar rash such as rash over the cheeks, discoid rash, or red raised patches, photosensitivity such as reaction to sunlight, resulting in the development of or increase in skin rash, oral ulcers such as ulcers in the nose or mouth, usually painless, arthritis, such as non-erosive arthritis involving two or more peripheral joints (arthritis in which the bones around the joints do not become destroyed), serositis, pleuritis or pericarditis, renal disorder such as excessive protein in the urine (greater than 0.5 g/day or 3+ on test sticks) and/or cellular casts (abnormal elements derived from the urine and/or white cells and/or kidney tubule cells), neurologic signs, symptoms, or other indicators, seizures (convulsions), and/or psychosis in the absence of drugs or metabolic disturbances that are known to cause such effects, and hematologic signs, symptoms, or other indicators such as hemolytic anemia or leukopenia (white blood cell count below 4,000 cells per cubic millimeter) or lymphopenia (less than 1,500 lymphocytes per cubic millimeter) or thrombocytopenia (less than 100,000 platelets per cubic millimeter). The leukopenia and lymphopenia generally must be detected on two or more occasions. The thrombocytopenia generally must be detected in the absence of drugs known to induce it. The invention is not limited to these signs, symptoms, or other indicators of lupus.
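The numeric thresholds recited above lend themselves to a short worked example. The Python sketch below is illustrative only and is not a diagnostic tool: the function names are hypothetical, the BILAG rule is read as "at least one A grade or at least two B grades" (an assumption about the wording above), and the hematologic check evaluates a single measurement, whereas leukopenia and lymphopenia generally must be detected on two or more occasions.

```python
def is_active_disease(bilag_grades):
    """Active disease: at least one BILAG 'A' or at least two BILAG 'B' grades.

    Assumes one grade letter per organ system (hypothetical input format).
    """
    grades = [g.upper() for g in bilag_grades]
    return grades.count("A") >= 1 or grades.count("B") >= 2


def hematologic_flags(wbc_per_mm3, lymphocytes_per_mm3, platelets_per_mm3):
    """Flag single-measurement counts that fall below the recited thresholds."""
    return {
        "leukopenia": wbc_per_mm3 < 4000,
        "lymphopenia": lymphocytes_per_mm3 < 1500,
        "thrombocytopenia": platelets_per_mm3 < 100000,
    }


if __name__ == "__main__":
    print(is_active_disease(["B", "C", "B"]))
    print(hematologic_flags(3500, 1600, 250000))
```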
As used herein, “treatment” refers to clinical intervention in an attempt to alter the natural course of the individual or cell being treated, and can be performed before or during the course of clinical pathology. Desirable effects of treatment include preventing the occurrence or recurrence of a disease or a condition or symptom thereof, alleviating a condition or symptom of the disease, diminishing any direct or indirect pathological consequences of the disease, decreasing the rate of disease progression, ameliorating or palliating the disease state, and achieving remission or improved prognosis. In some embodiments, methods and compositions of the invention are useful in attempts to delay development of a disease or disorder.
An “effective amount” refers to an amount effective, at dosages and for periods of time necessary, to achieve the desired therapeutic or prophylactic result. A “therapeutically effective amount” of a therapeutic agent may vary according to factors such as the disease state, age, sex, and weight of the individual, and the ability of the therapeutic agent to elicit a desired response in the individual. A therapeutically effective amount is also one in which any toxic or detrimental effects of the therapeutic agent are outweighed by the therapeutically beneficial effects. A “prophylactically effective amount” refers to an amount effective, at dosages and for periods of time necessary, to achieve the desired prophylactic result. Typically but not necessarily, since a prophylactic dose is used in subjects prior to or at an earlier stage of disease, the prophylactically effective amount will be less than the therapeutically effective amount.
An “individual,” “subject” or “patient” is a vertebrate. In certain embodiments, the vertebrate is a mammal. Mammals include, but are not limited to, primates (including human and non-human primates) and rodents (e.g., mice and rats). In certain embodiments, a mammal is a human.
A “patient subpopulation,” and grammatical variations thereof, as used herein, refers to a patient subset characterized as having one or more distinctive measurable and/or identifiable characteristics that distinguishes the patient subset from others in the broader disease category to which it belongs. Such characteristics include disease subcategories (e.g., SLE, lupus nephritis), gender, lifestyle, health history, organs/tissues involved, treatment history, etc. In one embodiment, a patient subpopulation is characterized by genetic signatures, including genetic variations in particular nucleotide positions and/or regions (such as SNPs).
The term “sample”, as used herein, refers to a composition that is obtained or derived from a subject of interest that contains a cellular and/or other molecular entity that is to be characterized and/or identified, for example based on physical, biochemical, chemical and/or physiological characteristics. For example, the phrase “disease sample” and variations thereof refers to any sample obtained from a subject of interest that would be expected or is known to contain the cellular and/or molecular entity that is to be characterized.
By “tissue or cell sample” is meant a collection of similar cells obtained from a tissue of a subject or patient. The source of the tissue or cell sample may be solid tissue as from a fresh, frozen and/or preserved organ or tissue sample or biopsy or aspirate; blood or any blood constituents; bodily fluids such as cerebrospinal fluid, amniotic fluid, peritoneal fluid, or interstitial fluid; or cells from any time in gestation or development of the subject. The tissue sample may also be primary or cultured cells or cell lines. Optionally, the tissue or cell sample is obtained from a disease tissue/organ. The tissue sample may contain compounds which are not naturally intermixed with the tissue in nature such as preservatives, anticoagulants, buffers, fixatives, nutrients, antibiotics, or the like. A “reference sample”, “reference cell”, “reference tissue”, “control sample”, “control cell”, or “control tissue”, as used herein, refers to a sample, cell or tissue obtained from a source known, or believed, not to be afflicted with the disease or condition that a method or composition of the invention is being used to identify. In one embodiment, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is obtained from a healthy part of the body of the same subject or patient in whom a disease or condition is being identified using a composition or method of the invention. In one embodiment, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is obtained from a healthy part of the body of an individual who is not the subject or patient in whom a disease or condition is being identified using a composition or method of the invention.
For the purposes herein, by a “section” of a tissue sample is meant a single part or piece of a tissue sample, e.g., a thin slice of tissue or cells cut from a tissue sample. It is understood that multiple sections of tissue samples may be taken and subjected to analysis according to the present invention, provided that it is understood that the present invention comprises a method whereby the same section of tissue sample is analyzed at both morphological and molecular levels, or is analyzed with respect to both protein and nucleic acid.
By “correlate” or “correlating” is meant comparing, in any way, the performance and/or results of a first analysis or protocol with the performance and/or results of a second analysis or protocol. For example, one may use the results of a first analysis or protocol in carrying out a second protocol and/or one may use the results of a first analysis or protocol to determine whether a second analysis or protocol should be performed. With respect to the embodiment of gene expression analysis or protocol, one may use the results of the gene expression analysis or protocol to determine whether a specific therapeutic regimen should be performed.
The word “label” when used herein refers to a compound or composition which is conjugated or fused directly or indirectly to a reagent such as a nucleic acid probe or an antibody and facilitates detection of the reagent to which it is conjugated or fused. The label may itself be detectable (e.g., radioisotope labels or fluorescent labels) or, in the case of an enzymatic label, may catalyze chemical alteration of a substrate compound or composition which is detectable.
A “medicament” is an active drug to treat a disease, disorder, and/or condition. In one embodiment, the disease, disorder, and/or condition is lupus or its symptoms or side effects.
The term “increased resistance” to a particular therapeutic agent or treatment option, when used in accordance with the invention, means decreased response to a standard dose of the drug or to a standard treatment protocol.
The term “decreased sensitivity” to a particular therapeutic agent or treatment option, when used in accordance with the invention, means decreased response to a standard dose of the agent or to a standard treatment protocol, where decreased response can be compensated for (at least partially) by increasing the dose of agent, or the intensity of treatment.
“Patient response” can be assessed using any endpoint indicating a benefit to the patient, including, without limitation, (1) inhibition, to some extent, of disease progression, including slowing down and complete arrest; (2) reduction in the number of disease episodes and/or symptoms; (3) reduction in lesional size; (4) inhibition (i.e., reduction, slowing down or complete stopping) of disease cell infiltration into adjacent peripheral organs and/or tissues; (5) inhibition (i.e., reduction, slowing down or complete stopping) of disease spread; (6) decrease of autoimmune response, which may, but does not have to, result in the regression or ablation of the disease lesion; (7) relief, to some extent, of one or more symptoms associated with the disorder; (8) increase in the length of disease-free presentation following treatment; and/or (9) decreased mortality at a given point of time following treatment.
The term “antagonist” is used in the broadest sense, and includes any molecule that partially or fully inhibits or neutralizes a biological activity of a polypeptide, such as PRO, or that partially or fully inhibits the transcription or translation of a nucleic acid encoding the polypeptide. Exemplary antagonist molecules include, but are not limited to, antagonist antibodies, polypeptide fragments, oligopeptides, organic molecules (including small molecules), and anti-sense nucleic acids.
The term “agonist” is used in the broadest sense, and includes any molecule that partially or fully mimics a biological activity of a polypeptide, such as PRO, or that increases the transcription or translation of a nucleic acid encoding the polypeptide. Exemplary agonist molecules include, but are not limited to, agonist antibodies, polypeptide fragments, oligopeptides, organic molecules (including small molecules), PRO-associated polynucleotides, PRO polypeptides, and PRO-Fc fusions.
A “therapeutic agent that targets a PRO or a PRO-associated polynucleotide” means any agent that affects the expression and/or activity of PRO or a PRO-associated polynucleotide including, but not limited to, any of the PRO agonists or antagonists described herein, including such therapeutic agents that are already known in the art as well as those that are later developed.
A “lupus therapeutic agent”, a “therapeutic agent effective to treat lupus”, and grammatical variations thereof, as used herein, refer to an agent that when provided in an effective amount is known, clinically shown, or expected by clinicians to provide a therapeutic benefit in a subject who has lupus. In one embodiment, the phrase includes any agent that is marketed by a manufacturer, or otherwise used by licensed clinicians, as a clinically-accepted agent that when provided in an effective amount would be expected to provide a therapeutic effect in a subject who has lupus. In one embodiment, a lupus therapeutic agent comprises a non-steroidal anti-inflammatory drug (NSAID), which includes acetylsalicylic acid (e.g., aspirin), ibuprofen (Motrin), naproxen (Naprosyn), indomethacin (Indocin), nabumetone (Relafen), tolmetin (Tolectin), and any other embodiments that comprise a therapeutically equivalent active ingredient(s) and formulation thereof. In one embodiment, a lupus therapeutic agent comprises acetaminophen (e.g., Tylenol), corticosteroids, or anti-malarials (e.g., chloroquine, hydroxychloroquine). In one embodiment, a lupus therapeutic agent comprises an immunomodulating drug (e.g., azathioprine, cyclophosphamide, methotrexate, cyclosporine). In one embodiment, a lupus therapeutic agent is an anti-B cell agent (e.g., anti-CD20 (e.g., rituximab), anti-CD22), an anti-cytokine agent (e.g., anti-tumor necrosis factor α, anti-interleukin-1-receptor (e.g., anakinra), anti-interleukin 10, anti-interleukin 6 receptor, anti-interferon alpha, anti-B-lymphocyte stimulator), an inhibitor of costimulation (e.g., anti-CD154, CTLA4-Ig (e.g., abatacept)), or a modulator of B-cell anergy (e.g., LJP 394 (e.g., abetimus)). In one embodiment, a lupus therapeutic agent comprises hormonal treatment (e.g., DHEA) or anti-hormonal therapy (e.g., the anti-prolactin agent bromocriptine). 
In one embodiment, a lupus therapeutic agent is an agent that provides immunoadsorption, an anti-complement factor agent (e.g., anti-C5a), T cell vaccination, cell transfection with T-cell receptor zeta chain, or a peptide therapy (e.g., edratide targeting anti-DNA idiotypes).
A therapeutic agent that has “marketing approval”, or that has been “approved as a therapeutic agent”, or grammatical variations of these phrases, as used herein, refers to an agent (e.g., in the form of a drug formulation, medicament) that is approved, licensed, registered or authorized by a relevant governmental entity (e.g., federal, state or local regulatory agency, department, bureau) to be sold by and/or through and/or on behalf of a commercial entity (e.g., a for-profit entity) for the treatment of a particular disorder (e.g., lupus) or a patient subpopulation (e.g., patients with lupus nephritis, patients of a particular ethnicity, gender, lifestyle, disease risk profile, etc.). A relevant governmental entity includes, for example, the Food and Drug Administration (FDA), European Medicines Evaluation Agency (EMEA), and equivalents thereof.
“Antibodies” (Abs) and “immunoglobulins” (Igs) refer to glycoproteins having similar structural characteristics. While antibodies exhibit binding specificity to a specific antigen, immunoglobulins include both antibodies and other antibody-like molecules which generally lack antigen specificity. Polypeptides of the latter kind are, for example, produced at low levels by the lymph system and at increased levels by myelomas.
The terms “antibody” and “immunoglobulin” are used interchangeably in the broadest sense and include monoclonal antibodies (e.g., full length or intact monoclonal antibodies), polyclonal antibodies, monovalent antibodies, multivalent antibodies, multispecific antibodies (e.g., bispecific antibodies so long as they exhibit the desired biological activity) and may also include certain antibody fragments (as described in greater detail herein). An antibody can be chimeric, human, humanized and/or affinity matured.
The term “anti-PRO antibody” or “an antibody that binds to PRO” refers to an antibody that is capable of binding PRO with sufficient affinity such that the antibody is useful as a diagnostic and/or therapeutic agent in targeting PRO. Preferably, the extent of binding of an anti-PRO antibody to an unrelated, non-PRO protein is less than about 10% of the binding of the antibody to PRO as measured, e.g., by a radioimmunoassay (RIA). In certain embodiments, an antibody that binds to PRO has a dissociation constant (Kd) of ≦1 μM, ≦100 nM, ≦10 nM, ≦1 nM, or ≦0.1 nM. In certain embodiments, an anti-PRO antibody binds to an epitope of PRO that is conserved among PRO from different species.
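The affinity and selectivity figures recited above can be expressed as a small worked example. In this Python sketch the function names are hypothetical, dissociation constants are assumed to be supplied in nanomolar units, and the "less than about 10%" selectivity criterion is simplified to a strict 10% cutoff for illustration.

```python
# Kd thresholds recited in the definition, in nM (1 uM = 1000 nM).
KD_THRESHOLDS_NM = [1000.0, 100.0, 10.0, 1.0, 0.1]


def tightest_threshold_met(kd_nm: float):
    """Return the smallest recited Kd threshold (nM) that kd_nm satisfies, or None."""
    met = [t for t in KD_THRESHOLDS_NM if kd_nm <= t]
    return min(met) if met else None


def is_selective(binding_to_pro: float, binding_to_unrelated: float) -> bool:
    """True if binding to an unrelated protein is under 10% of binding to PRO.

    Both values are assumed to come from the same assay (e.g., an RIA signal).
    """
    return binding_to_unrelated < 0.10 * binding_to_pro


if __name__ == "__main__":
    print(tightest_threshold_met(5.0))
    print(is_selective(100.0, 5.0))
```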
The terms “full length antibody,” “intact antibody” and “whole antibody” are used herein interchangeably to refer to an antibody in its substantially intact form, not antibody fragments as defined below. The terms particularly refer to an antibody with heavy chains that contain the Fc region.
“Antibody fragments” comprise a portion of an intact antibody, preferably comprising the antigen binding region thereof. Examples of antibody fragments include Fab, Fab′, F(ab′)2, and Fv fragments; diabodies; linear antibodies; single-chain antibody molecules; and multispecific antibodies formed from antibody fragments.
Papain digestion of antibodies produces two identical antigen-binding fragments, called “Fab” fragments, each with a single antigen-binding site, and a residual “Fc” fragment, whose name reflects its ability to crystallize readily. Pepsin treatment yields an F(ab′)2 fragment that has two antigen-combining sites and is still capable of cross-linking antigen.
“Fv” is a minimum antibody fragment which contains a complete antigen-binding site. In one embodiment, a two-chain Fv species consists of a dimer of one heavy- and one light-chain variable domain in tight, non-covalent association. Collectively, the six CDRs of an Fv confer antigen-binding specificity to the antibody. However, even a single variable domain (or half of an Fv comprising only three CDRs specific for an antigen) has the ability to recognize and bind antigen, although at a lower affinity than the entire binding site.
The Fab fragment contains the heavy- and light-chain variable domains and also contains the constant domain of the light chain and the first constant domain (CH1) of the heavy chain. Fab′ fragments differ from Fab fragments by the addition of a few residues at the carboxy terminus of the heavy chain CH1 domain including one or more cysteines from the antibody hinge region. Fab′-SH is the designation herein for Fab′ in which the cysteine residue(s) of the constant domains bear a free thiol group. F(ab′)2 antibody fragments originally were produced as pairs of Fab′ fragments which have hinge cysteines between them. Other chemical couplings of antibody fragments are also known.
“Single-chain Fv” or “scFv” antibody fragments comprise the VH and VL domains of an antibody, wherein these domains are present in a single polypeptide chain. Generally, the scFv polypeptide further comprises a polypeptide linker between the VH and VL domains which enables the scFv to form the desired structure for antigen binding. For a review of scFv see Plückthun, in The Pharmacology of Monoclonal Antibodies, vol. 113, Rosenberg and Moore eds., Springer-Verlag, New York, pp. 269-315 (1994).
The term “diabodies” refers to small antibody fragments with two antigen-binding sites, which fragments comprise a heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) in the same polypeptide chain (VH-VL). By using a linker that is too short to allow pairing between the two domains on the same chain, the domains are forced to pair with the complementary domains of another chain and create two antigen-binding sites. Diabodies may be bivalent or bispecific. Diabodies are described more fully in, for example, EP 404,097; WO93/11161; Hudson et al. (2003) Nat. Med. 9:129-134; and Holliger et al., Proc. Natl. Acad. Sci. USA 90: 6444-6448 (1993). Triabodies and tetrabodies are also described in Hudson et al. (2003) Nat. Med. 9:129-134.
The term “monoclonal antibody” as used herein refers to an antibody obtained from a population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population are identical except for possible mutations, e.g., naturally occurring mutations, that may be present in minor amounts. Thus, the modifier “monoclonal” indicates the character of the antibody as not being a mixture of discrete antibodies. In certain embodiments, such a monoclonal antibody typically includes an antibody comprising a polypeptide sequence that binds a target, wherein the target-binding polypeptide sequence was obtained by a process that includes the selection of a single target binding polypeptide sequence from a plurality of polypeptide sequences. For example, the selection process can be the selection of a unique clone from a plurality of clones, such as a pool of hybridoma clones, phage clones, or recombinant DNA clones. It should be understood that a selected target binding sequence can be further altered, for example, to improve affinity for the target, to humanize the target binding sequence, to improve its production in cell culture, to reduce its immunogenicity in vivo, to create a multispecific antibody, etc., and that an antibody comprising the altered target binding sequence is also a monoclonal antibody of this invention. In contrast to polyclonal antibody preparations which typically include different antibodies directed against different determinants (epitopes), each monoclonal antibody of a monoclonal antibody preparation is directed against a single determinant on an antigen. In addition to their specificity, monoclonal antibody preparations are advantageous in that they are typically uncontaminated by other immunoglobulins.
The modifier “monoclonal” indicates the character of the antibody as being obtained from a substantially homogeneous population of antibodies, and is not to be construed as requiring production of the antibody by any particular method. For example, the monoclonal antibodies to be used in accordance with the present invention may be made by a variety of techniques, including, for example, the hybridoma method (e.g., Kohler et al., Nature, 256: 495 (1975); Harlow et al., Antibodies: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, 2nd ed. 1988); Hammerling et al., in: Monoclonal Antibodies and T-Cell Hybridomas 563-681 (Elsevier, N.Y., 1981)), recombinant DNA methods (see, e.g., U.S. Pat. No. 4,816,567), phage display technologies (see, e.g., Clackson et al., Nature, 352: 624-628 (1991); Marks et al., J. Mol. Biol. 222: 581-597 (1991); Sidhu et al., J. Mol. Biol. 338(2): 299-310 (2004); Lee et al., J. Mol. Biol. 340(5): 1073-1093 (2004); Fellouse, Proc. Natl. Acad. Sci. USA 101(34): 12467-12472 (2004); and Lee et al., J. Immunol. Methods 284(1-2): 119-132 (2004)), and technologies for producing human or human-like antibodies in animals that have parts or all of the human immunoglobulin loci or genes encoding human immunoglobulin sequences (see, e.g., WO98/24893; WO96/34096; WO96/33735; WO91/10741; Jakobovits et al., Proc. Natl. Acad. Sci. USA 90: 2551 (1993); Jakobovits et al., Nature 362: 255-258 (1993); Bruggemann et al., Year in Immunol. 7:33 (1993); U.S. Pat. Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016; Marks et al., Bio/Technology 10: 779-783 (1992); Lonberg et al., Nature 368: 856-859 (1994); Morrison, Nature 368: 812-813 (1994); Fishwild et al., Nature Biotechnol. 14: 845-851 (1996); Neuberger, Nature Biotechnol. 14: 826 (1996); and Lonberg and Huszar, Intern. Rev. Immunol. 13: 65-93 (1995)).
The monoclonal antibodies herein specifically include “chimeric” antibodies in which a portion of the heavy and/or light chain is identical with or homologous to corresponding sequences in antibodies derived from a particular species or belonging to a particular antibody class or subclass, while the remainder of the chain(s) is identical with or homologous to corresponding sequences in antibodies derived from another species or belonging to another antibody class or subclass, as well as fragments of such antibodies, so long as they exhibit the desired biological activity (U.S. Pat. No. 4,816,567; and Morrison et al., Proc. Natl. Acad. Sci. USA 81:6851-6855 (1984)).
“Humanized” forms of non-human (e.g., murine) antibodies are chimeric antibodies that contain minimal sequence derived from non-human immunoglobulin. In one embodiment, a humanized antibody is a human immunoglobulin (recipient antibody) in which residues from a hypervariable region of the recipient are replaced by residues from a hypervariable region of a non-human species (donor antibody) such as mouse, rat, rabbit, or nonhuman primate having the desired specificity, affinity, and/or capacity. In some instances, framework region (FR) residues of the human immunoglobulin are replaced by corresponding non-human residues. Furthermore, humanized antibodies may comprise residues that are not found in the recipient antibody or in the donor antibody. These modifications may be made to further refine antibody performance. In general, a humanized antibody will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the hypervariable loops correspond to those of a non-human immunoglobulin, and all or substantially all of the FRs are those of a human immunoglobulin sequence. The humanized antibody optionally will also comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin. For further details, see Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature 332:323-329 (1988); and Presta, Curr. Op. Struct. Biol. 2:593-596 (1992). See also the following review articles and references cited therein: Vaswani and Hamilton, Ann. Allergy, Asthma & Immunol. 1:105-115 (1998); Harris, Biochem. Soc. Transactions 23:1035-1038 (1995); Hurle and Gross, Curr. Op. Biotech. 5:428-433 (1994).
A “human antibody” is one which comprises an amino acid sequence corresponding to that of an antibody produced by a human and/or has been made using any of the techniques for making human antibodies as disclosed herein. Such techniques include screening human-derived combinatorial libraries, such as phage display libraries (see, e.g., Marks et al., J. Mol. Biol., 222: 581-597 (1991) and Hoogenboom et al., Nucl. Acids Res., 19: 4133-4137 (1991)); using human myeloma and mouse-human heteromyeloma cell lines for the production of human monoclonal antibodies (see, e.g., Kozbor, J. Immunol., 133: 3001 (1984); Brodeur et al., Monoclonal Antibody Production Techniques and Applications, pp. 55-93 (Marcel Dekker, Inc., New York, 1987); and Boerner et al., J. Immunol., 147: 86 (1991)); and generating monoclonal antibodies in transgenic animals (e.g., mice) that are capable of producing a full repertoire of human antibodies in the absence of endogenous immunoglobulin production (see, e.g., Jakobovits et al., Proc. Natl. Acad. Sci. USA, 90: 2551 (1993); Jakobovits et al., Nature, 362: 255 (1993); Bruggermann et al., Year in Immunol., 7: 33 (1993)). This definition of a human antibody specifically excludes a humanized antibody comprising antigen-binding residues from a non-human animal.
An “affinity matured” antibody is one with one or more alterations in one or more CDRs thereof which result in an improvement in the affinity of the antibody for antigen, compared to a parent antibody which does not possess those alteration(s). In one embodiment, an affinity matured antibody has nanomolar or even picomolar affinities for the target antigen. Affinity matured antibodies are produced by procedures known in the art. Marks et al. Bio/Technology 10:779-783 (1992) describes affinity maturation by VH and VL domain shuffling. Random mutagenesis of HVR and/or framework residues is described by: Barbas et al. Proc Nat. Acad. Sci. USA 91:3809-3813 (1994); Schier et al. Gene 169:147-155 (1995); Yelton et al. J. Immunol. 155:1994-2004 (1995); Jackson et al., J. Immunol. 154(7):3310-9 (1995); and Hawkins et al, J. Mol. Biol. 226:889-896 (1992).
A “blocking antibody” or an “antagonist antibody” is one which inhibits or reduces a biological activity of the antigen it binds. Certain blocking antibodies or antagonist antibodies partially or completely inhibit the biological activity of the antigen.
A “small molecule” or “small organic molecule” is defined herein as an organic molecule having a molecular weight below about 500 Daltons.
A “PRO-binding oligopeptide” or an “oligopeptide that binds PRO” is an oligopeptide that is capable of binding PRO with sufficient affinity such that the oligopeptide is useful as a diagnostic and/or therapeutic agent in targeting PRO. In certain embodiments, the extent of binding of a PRO-binding oligopeptide to an unrelated, non-PRO protein is less than about 10% of the binding of the PRO-binding oligopeptide to PRO as measured, e.g., by a surface plasmon resonance assay. In certain embodiments, a PRO-binding oligopeptide has a dissociation constant (Kd) of ≦1 μM, ≦100 nM, ≦10 nM, ≦1 nM, or ≦0.1 nM.
A “PRO-binding organic molecule” or “an organic molecule that binds PRO” is an organic molecule other than an oligopeptide or antibody as defined herein that is capable of binding PRO with sufficient affinity such that the organic molecule is useful as a diagnostic and/or therapeutic agent in targeting PRO. In certain embodiments, the extent of binding of a PRO-binding organic molecule to an unrelated, non-PRO protein is less than about 10% of the binding of the PRO-binding organic molecule to PRO as measured, e.g., by a surface plasmon resonance assay. In certain embodiments, a PRO-binding organic molecule has a dissociation constant (Kd) of ≦1 μM, ≦100 nM, ≦10 nM, ≦1 nM, or ≦0.1 nM.
The dissociation constant (Kd) of any molecule that binds a target polypeptide may conveniently be measured using a surface plasmon resonance assay. Such assays may employ a BIAcore™-2000 or a BIAcore™-3000 (BIAcore, Inc., Piscataway, N.J.) at 25° C. with immobilized target polypeptide CM5 chips at ˜10 response units (RU). Briefly, carboxymethylated dextran biosensor chips (CM5, BIAcore Inc.) are activated with N-ethyl-N′-(3-dimethylaminopropyl)-carbodiimide hydrochloride (EDC) and N-hydroxysuccinimide (NHS) according to the supplier's instructions. Target polypeptide is diluted with 10 mM sodium acetate, pH 4.8, to 5 μg/ml (˜0.2 μM) before injection at a flow rate of 5 μl/minute to achieve approximately 10 response units (RU) of coupled protein. Following the injection of target polypeptide, 1 M ethanolamine is injected to block unreacted groups. For kinetics measurements, two-fold serial dilutions of the binding molecule (0.78 nM to 500 nM) are injected in PBS with 0.05% Tween 20 (PBST) at 25° C. at a flow rate of approximately 25 μl/min. Association rates (kon) and dissociation rates (koff) are calculated using a simple one-to-one Langmuir binding model (BIAcore Evaluation Software version 3.2) by simultaneously fitting the association and dissociation sensorgrams. The equilibrium dissociation constant (Kd) is calculated as the ratio koff/kon. See, e.g., Chen, Y., et al., (1999) J. Mol. Biol. 293:865-881. If the on-rate of an antibody exceeds 10⁶ M⁻¹s⁻¹ by the surface plasmon resonance assay above, then the on-rate can be determined by using a fluorescent quenching technique that measures the increase or decrease in fluorescence emission intensity (excitation=295 nm; emission=340 nm, 16 nm band-pass) at 25° C. of a 20 nM antibody (Fab form) in PBS, pH 7.2, in the presence of increasing concentrations of antigen as measured in a spectrometer, such as a stop-flow equipped spectrophotometer (Aviv Instruments) or an 8000-series SLM-Aminco spectrophotometer (ThermoSpectronic) with a stirred cuvette.
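By way of illustration only, the calculation of Kd as the ratio koff/kon, and its comparison against the Kd ranges recited above for PRO-binding molecules, may be sketched as follows. The function names and the example rate constants are hypothetical and are not measured values; the sketch does not reproduce the sensorgram fitting performed by the instrument software.

```python
# Illustrative sketch: Kd from fitted rate constants (hypothetical values).
# Thresholds mirror the Kd ranges recited in the text, expressed in mol/L.
KD_THRESHOLDS_M = {
    "1 uM": 1e-6,
    "100 nM": 1e-7,
    "10 nM": 1e-8,
    "1 nM": 1e-9,
    "0.1 nM": 1e-10,
}


def dissociation_constant(k_on: float, k_off: float) -> float:
    """Return Kd in mol/L given k_on (1/(M*s)) and k_off (1/s)."""
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


def thresholds_met(kd_molar: float) -> list[str]:
    """List the recited Kd ceilings that the given Kd satisfies."""
    return [label for label, limit in KD_THRESHOLDS_M.items() if kd_molar <= limit]


# Hypothetical rate constants, as might be obtained from a one-to-one fit.
example_kd = dissociation_constant(k_on=1.0e5, k_off=5.0e-4)
print(f"Kd = {example_kd * 1e9:.2f} nM; satisfies: {thresholds_met(example_kd)}")
```

A binding molecule with these assumed rate constants would have a Kd of roughly 5 nM, which falls within the three broadest ranges but not the two narrowest.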
A “liposome” is a small vesicle composed of various types of lipids, phospholipids and/or surfactant which is useful for delivery of an agent, e.g., a drug, to a mammal. The components of the liposome are commonly arranged in a bilayer formation, similar to the lipid arrangement of biological membranes.
The word “label” when used herein refers to a detectable compound or composition. The label may be detectable by itself (e.g., radioisotope labels or fluorescent labels) or, in the case of an enzymatic label, may catalyze chemical alteration of a substrate compound or composition which results in a detectable product. Radionuclides that can serve as detectable labels include, for example, I-131, I-123, I-125, Y-90, Re-188, Re-186, At-211, Cu-67, Bi-212, and Pd-109.
An “isolated” biological molecule, such as a nucleic acid, polypeptide, or antibody, is one which has been identified and separated and/or recovered from at least one component of its natural environment.
Reference to “about” a value or parameter herein includes (and describes) embodiments that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X.”
It is understood that aspects and embodiments of the invention described herein include “consisting” and/or “consisting essentially of” aspects and embodiments.
Nucleotide variations associated with lupus are provided herein. These variations provide biomarkers for lupus, and/or predispose or contribute to development, persistence and/or progression of lupus. Accordingly, the invention disclosed herein is useful in a variety of settings, e.g., in methods and compositions related to lupus diagnosis and therapy.
Nucleic acid, according to any of the above methods, may be genomic DNA; RNA transcribed from genomic DNA; or cDNA generated from RNA. Nucleic acid may be derived from a vertebrate, e.g., a mammal. A nucleic acid is said to be “derived from” a particular source if it is obtained directly from that source or if it is a copy of a nucleic acid found in that source.
Nucleic acid includes copies of the nucleic acid, e.g., copies that result from amplification. Amplification may be desirable in certain instances, e.g., in order to obtain a desired amount of material for detecting variations. For example, a PRO-associated polynucleotide or portion thereof may be amplified from nucleic acid material. The amplicons may then be subjected to a variation detection method, such as those described below, to determine whether a variation is present in the amplicon.
Variations may be detected by certain methods known to those skilled in the art. Such methods include, but are not limited to, DNA sequencing; primer extension assays, including allele-specific nucleotide incorporation assays and allele-specific primer extension assays (e.g., allele-specific PCR, allele-specific ligation chain reaction (LCR), and gap-LCR); allele-specific oligonucleotide hybridization assays (e.g., oligonucleotide ligation assays); cleavage protection assays in which protection from cleavage agents is used to detect mismatched bases in nucleic acid duplexes; analysis of MutS protein binding; electrophoretic analysis comparing the mobility of variant and wild type nucleic acid molecules; denaturing-gradient gel electrophoresis (DGGE, as in, e.g., Myers et al. (1985) Nature 313:495); analysis of RNase cleavage at mismatched base pairs; analysis of chemical or enzymatic cleavage of heteroduplex DNA; mass spectrometry (e.g., MALDI-TOF); genetic bit analysis (GBA); 5′ nuclease assays (e.g., TaqMan®); and assays employing molecular beacons. Certain of these methods are discussed in further detail below.
Detection of variations in target nucleic acids may be accomplished by molecular cloning and sequencing of the target nucleic acids using techniques well known in the art. Alternatively, amplification techniques such as the polymerase chain reaction (PCR) can be used to amplify target nucleic acid sequences directly from a genomic DNA preparation from tumor tissue. The nucleic acid sequence of the amplified sequences can then be determined and variations identified therefrom. Amplification techniques are well known in the art, e.g., polymerase chain reaction is described in Saiki et al., Science 239:487, 1988; U.S. Pat. Nos. 4,683,203 and 4,683,195.
The ligase chain reaction, which is known in the art, can also be used to amplify target nucleic acid sequences. See, e.g., Wu et al., Genomics 4:560-569 (1989). In addition, a technique known as allele-specific PCR can also be used to detect variations (e.g., substitutions). See, e.g., Ruano and Kidd (1989) Nucleic Acids Research 17:8392; McClay et al. (2002) Analytical Biochem. 301:200-206. In certain embodiments of this technique, an allele-specific primer is used wherein the 3′ terminal nucleotide of the primer is complementary to (i.e., capable of specifically base-pairing with) a particular variation in the target nucleic acid. If the particular variation is not present, an amplification product is not observed. Amplification Refractory Mutation System (ARMS) can also be used to detect variations (e.g., substitutions). ARMS is described, e.g., in European Patent Application Publication No. 0332435, and in Newton et al., Nucleic Acids Research, 17:7, 1989.
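By way of illustration only, the design rule for an allele-specific primer described above (a 3′ terminal nucleotide complementary to the variation) may be sketched as follows. The target sequence, the variant position, and the function names are hypothetical, and the match test is a simplification: it models only the 3′ terminal base pair and not the polymerase behavior or reaction conditions that determine whether a product is actually observed.

```python
# Illustrative sketch of allele-specific primer design (hypothetical sequence).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def allele_specific_primer(target: str, pos: int, allele: str, length: int = 20) -> str:
    """Build a primer that anneals to `target` (5'->3') so that its 3' terminal
    base pairs with `allele` at 0-based position `pos` of the target."""
    segment = allele + target[pos + 1 : pos + length]
    if len(segment) < length:
        raise ValueError("target too short for the requested primer length")
    return reverse_complement(segment)


def three_prime_matches(primer: str, target: str, pos: int) -> bool:
    """True if the primer's 3' terminal base can pair with the target base at pos."""
    return primer[-1] == target[pos].translate(COMPLEMENT)


# Hypothetical wild-type target and a substitution (T to C) at index 3.
wild_type = "GATTACAGGCTTAACCGGTTAGC"
variant = wild_type[:3] + "C" + wild_type[4:]
primer = allele_specific_primer(variant, pos=3, allele="C", length=10)

print(primer)
print("variant template 3' match:", three_prime_matches(primer, variant, 3))
print("wild-type template 3' match:", three_prime_matches(primer, wild_type, 3))
```

In this simplified model, the 3′ terminal base pairs only with the variant template, which corresponds to the case in which an amplification product is expected only when the particular variation is present.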
Other methods useful for detecting variations (e.g., substitutions) include, but are not limited to, (1) allele-specific nucleotide incorporation assays, such as single base extension assays (see, e.g., Chen et al. (2000) Genome Res. 10:549-557; Fan et al. (2000) Genome Res. 10:853-860; Pastinen et al. (1997) Genome Res. 7:606-614; and Ye et al. (2001) Hum. Mut. 17:305-316); (2) allele-specific primer extension assays (see, e.g., Ye et al. (2001) Hum. Mut. 17:305-316; and Shen et al. Genetic Engineering News, vol. 23, Mar. 15, 2003), including allele-specific PCR; (3) 5′ nuclease assays (see, e.g., De La Vega et al. (2002) BioTechniques 32:S48-S54 (describing the TaqMan® assay); Ranade et al. (2001) Genome Res. 11:1262-1268; and Shi (2001) Clin. Chem. 47:164-172); (4) assays employing molecular beacons (see, e.g., Tyagi et al. (1998) Nature Biotech. 16:49-53; and Mhlanga et al. (2001) Methods 25:463-71); and (5) oligonucleotide ligation assays (see, e.g., Grossman et al. (1994) Nuc. Acids Res. 22:4527-4534; patent application Publication No. US 2003/0119004 A1; PCT International Publication No. WO 01/92579 A2; and U.S. Pat. No. 6,027,889).
Variations may also be detected by mismatch detection methods. Mismatches are hybridized nucleic acid duplexes which are not 100% complementary. The lack of total complementarity may be due to deletions, insertions, inversions, or substitutions. One example of a mismatch detection method is the Mismatch Repair Detection (MRD) assay described, e.g., in Faham et al., Proc. Natl. Acad. Sci. USA 102:14717-14722 (2005) and Faham et al., Hum. Mol. Genet. 10:1657-1664 (2001). Another example of a mismatch cleavage technique is the RNase protection method, which is described in detail in Winter et al., Proc. Natl. Acad. Sci. USA, 82:7575, 1985, and Myers et al., Science 230:1242, 1985. For example, a method of the invention may involve the use of a labeled riboprobe which is complementary to the human wild-type target nucleic acid. The riboprobe and target nucleic acid derived from the tissue sample are annealed (hybridized) together and subsequently digested with the enzyme RNase A which is able to detect some mismatches in a duplex RNA structure. If a mismatch is detected by RNase A, it cleaves at the site of the mismatch. Thus, when the annealed RNA preparation is separated on an electrophoretic gel matrix, if a mismatch has been detected and cleaved by RNase A, an RNA product will be seen which is smaller than the full-length duplex RNA for the riboprobe and the mRNA or DNA. The riboprobe need not be the full length of the target nucleic acid, but can be a portion of the target nucleic acid, provided it encompasses the position suspected of having a variation.
In a similar manner, DNA probes can be used to detect mismatches, for example through enzymatic or chemical cleavage. See, e.g., Cotton et al., Proc. Natl. Acad. Sci. USA, 85:4397, 1988; and Shenk et al., Proc. Natl. Acad. Sci. USA, 72:989, 1975. Alternatively, mismatches can be detected by shifts in the electrophoretic mobility of mismatched duplexes relative to matched duplexes. See, e.g., Cariello, Human Genetics, 42:726, 1988. With either riboprobes or DNA probes, the target nucleic acid suspected of comprising a variation may be amplified before hybridization. Changes in target nucleic acid can also be detected using Southern hybridization, especially if the changes are gross rearrangements, such as deletions and insertions.
Restriction fragment length polymorphism (RFLP) probes for the target nucleic acid or surrounding marker genes can be used to detect variations, e.g., insertions or deletions. Insertions and deletions can also be detected by cloning, sequencing and amplification of a target nucleic acid. Single stranded conformation polymorphism (SSCP) analysis can also be used to detect base change variants of an allele. See, e.g. Orita et al., Proc. Natl. Acad. Sci. USA 86:2766-2770, 1989, and Genomics, 5:874-879, 1989.
The invention provides for compositions of isolated polynucleotides that comprise a polynucleotide or fragment thereof comprising a SNP. In one embodiment, the polynucleotide is a PRO-associated polynucleotide.
In particular, the invention provides for compositions that comprise unique sets and/or combinations of SNPs that can be used as a genetic profile or signature indicative of a subject at risk of developing lupus, or indicative of the disease or symptom or condition thereof. The polymorphisms disclosed herein are useful as biomarkers for assessing risk of developing lupus, as well as targets for the design of diagnostic reagents. In some embodiments, the SNP is not associated with a gene. In other embodiments, the SNP is associated with a gene, and can be located either in an intergenic or intragenic region, and more particularly, can be located in a coding or noncoding region. A SNP of the present invention may be associated with an unknown gene, or may be associated with a known gene, e.g., ITGAM or BLK.
The SNPs identified herein provide targets for development of therapeutic agents for use in the diagnosis and treatment of genetically identified lupus patients, including diagnosis and targeted treatment of lupus patient subpopulations exhibiting a distinct genetic signature comprising one or more of the SNPs of the present invention. For example, in one embodiment, the genes containing the genetic variations identified herein, and the nucleic acid (e.g., DNA or RNA) associated with these genes, and proteins encoded by these genes, can be used as targets for the development of therapeutic agents (e.g., small molecule compounds, antibodies, antisense/RNAi agents, etc.) or used directly as therapeutic agents (e.g., therapeutic proteins, etc.) for the treatment of lupus.
Accordingly, in one aspect, the invention provides a set of one or more SNPs that form a unique genetic signature for assessing the risk of developing lupus. In one aspect, the unique genetic signature comprises about 1-10, 10-20, 20-30, 30-40, or 40-50 SNPs selected from any of the SNPs set forth in
In one aspect, the unique genetic signature comprises 1 or more SNPs, 2 or more SNPs, 3 or more SNPs, 4 or more SNPs, 5 or more SNPs, 6 or more SNPs, 7 or more SNPs, 8 or more SNPs, 9 or more SNPs, 10 or more SNPs, 11 or more SNPs, 12 or more SNPs, 13 or more SNPs, 14 or more SNPs, 15 or more SNPs, 16 or more SNPs, 17 or more SNPs, 18 or more SNPs, 19 or more SNPs, or 20 or more SNPs selected from any of the SNPs set forth in
In another embodiment, the invention provides for an isolated polynucleotide (e.g., DNA or RNA) or fragment thereof that is at least about 10 nucleotides in length, wherein the polynucleotide or fragment thereof comprises: a) a genetic variation at a nucleotide position corresponding to the position of a single nucleotide polymorphism (SNP) selected from any of those SNPs set forth in
In one embodiment of the invention, genetic variation in the region upstream of the transcription initiation site of B Lymphoid tyrosine Kinase (BLK) and C8orf13 (chromosome 8p23.1) is associated with disease risk in both the U.S. and Swedish case/control series (rs13277113, OR=1.39, meta P=1×10⁻¹⁰), and also with altered mRNA levels in B cell lines. In another embodiment, variants in the Integrin Alpha M (ITGAM) and Integrin Alpha X (ITGAX) region (chromosome 16p11.2) are associated with SLE in the combined sample (rs11574637, OR=1.33, meta P=3×10⁻¹¹). In a comprehensive genome-wide association scan in SLE, the present inventors have identified and then confirmed through replication two new genetic loci: a) a promoter region allele that correlates with reduced expression of BLK and increased expression of C8orf13 and b) SNPs (or variants) within the ITGAM/ITGAX region that are in strong linkage disequilibrium with two common nonsynonymous alleles of ITGAM.
In one embodiment, the polynucleotide or fragment thereof is at least about 10 nucleotides in length, alternatively at least about 15 nucleotides in length, alternatively at least about 20 nucleotides in length, alternatively at least about 30 nucleotides in length, alternatively at least about 40 nucleotides in length, alternatively at least about 50 nucleotides in length, alternatively at least about 60 nucleotides in length, alternatively at least about 70 nucleotides in length, alternatively at least about 80 nucleotides in length, alternatively at least about 90 nucleotides in length, alternatively at least about 100 nucleotides in length, alternatively at least about 110 nucleotides in length, alternatively at least about 120 nucleotides in length, alternatively at least about 130 nucleotides in length, alternatively at least about 140 nucleotides in length, alternatively at least about 150 nucleotides in length, alternatively at least about 160 nucleotides in length, alternatively at least about 170 nucleotides in length, alternatively at least about 180 nucleotides in length, alternatively at least about 190 nucleotides in length, alternatively at least about 200 nucleotides in length, alternatively at least about 250 nucleotides in length, alternatively at least about 300 nucleotides in length, alternatively at least about 350 nucleotides in length, alternatively at least about 400 nucleotides in length, alternatively at least about 450 nucleotides in length, alternatively at least about 500 nucleotides in length, alternatively at least about 600 nucleotides in length, alternatively at least about 700 nucleotides in length, alternatively at least about 800 nucleotides in length, alternatively at least about 900 nucleotides in length, alternatively at least about 1000 nucleotides in length, and alternatively about the length of the full-length coding sequence. 
In any of these embodiments, the fragment or full-length polynucleotide may also include part or all of a naturally-occurring flanking region of a SNP. In this context the term “about” means the referenced nucleotide sequence length plus or minus 10% of that referenced length.
In another embodiment, the sequence of the polynucleotide comprises a genetic variation within a linkage disequilibrium region e.g., as set forth in any of
In one embodiment, an isolated polynucleotide provided herein is detectably labeled, e.g., with a radioisotope, a fluorescent agent, or a chromogenic agent. In another embodiment, an isolated polynucleotide is a primer. In another embodiment, an isolated polynucleotide is an oligonucleotide, e.g., an allele-specific oligonucleotide. In another embodiment, an oligonucleotide may be, for example, from 7-60 nucleotides in length, 9-45 nucleotides in length, 15-30 nucleotides in length, or 18-25 nucleotides in length. In another embodiment, an oligonucleotide may be, e.g., PNA, morpholino-phosphoramidates, LNA, or 2′-alkoxyalkoxy. Oligonucleotides as provided herein are useful, e.g., as hybridization probes for the detection of genetic variations.
In one embodiment, the invention provides a composition comprising a plurality of polynucleotides capable of specifically hybridizing to at least 1, 2, 3, 4, or 5 PRO-associated polynucleotides, each PRO-associated polynucleotide comprising a genetic variation at a nucleotide position corresponding to the position of a SNP set forth in any of
An allele-specific oligonucleotide can be used in conjunction with a control oligonucleotide that is identical to the allele-specific oligonucleotide, except that the nucleotide that specifically base pairs with the genetic variation is replaced with a nucleotide that specifically base pairs with the corresponding nucleotide present in the wild type PRO-associated polynucleotide. Such oligonucleotides may be used in competitive binding assays under hybridization conditions that allow the oligonucleotides to distinguish between a PRO-associated polynucleotide comprising a genetic variation and a PRO-associated polynucleotide comprising the corresponding wild type nucleotide.
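By way of illustration only, the relationship between an allele-specific oligonucleotide and its control oligonucleotide may be sketched as follows. The wild-type sequence, variant position, flank length, and function names are hypothetical, and the mismatch count is only a crude stand-in for the preferential binding that would be observed under suitable hybridization conditions.

```python
# Illustrative sketch: allele-specific and control oligonucleotides that are
# identical except at the base pairing with the variant position.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def oligo_pair(wild_type: str, pos: int, variant_base: str, flank: int = 8):
    """Return (allele_specific, control) oligonucleotides complementary to the
    variant and wild-type target, centered on 0-based position `pos`."""
    start, end = pos - flank, pos + flank + 1
    if start < 0 or end > len(wild_type):
        raise ValueError("not enough flanking sequence around the variant")
    wt_window = wild_type[start:end]
    var_window = wt_window[:flank] + variant_base + wt_window[flank + 1 :]
    return reverse_complement(var_window), reverse_complement(wt_window)


def mismatches(oligo: str, target_window: str) -> int:
    """Count positions at which the oligo fails to pair with the target window."""
    paired = reverse_complement(oligo)
    return sum(a != b for a, b in zip(paired, target_window))


# Hypothetical wild-type sequence with a T-to-G substitution at index 12.
wild_type = "ACGTTGCAAGGCTCATTGACCGTA"
pos, flank = 12, 5
aso, control = oligo_pair(wild_type, pos, "G", flank)

wt_window = wild_type[pos - flank : pos + flank + 1]
var_window = wt_window[:flank] + "G" + wt_window[flank + 1 :]
print("allele-specific:", aso, "control:", control)
print("ASO vs variant:", mismatches(aso, var_window))
print("ASO vs wild type:", mismatches(aso, wt_window))
```

Under these assumptions, each oligonucleotide pairs perfectly with its intended target and carries a single mismatch against the other, which is the difference that the hybridization conditions are chosen to resolve.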
Using routine methods based on, e.g., the length and base composition of the oligonucleotides, one skilled in the art can arrive at suitable hybridization conditions under which (a) an allele-specific oligonucleotide will preferentially bind to a PRO-associated polynucleotide comprising a genetic variation relative to a wild type PRO-associated polynucleotide, and (b) the control oligonucleotide will preferentially bind to a wild type PRO-associated polynucleotide relative to a PRO-associated polynucleotide comprising a genetic variation. Exemplary conditions include conditions of high stringency, e.g., hybridization conditions of 5× standard saline phosphate EDTA (SSPE) and 0.5% NaDodSO4 (SDS) at 55° C., followed by washing with 2×SSPE and 0.1% SDS at 55° C. or room temperature. In another embodiment, a binding agent is provided that preferentially binds to a PRO comprising an amino acid variation, relative to a wild-type PRO. In one embodiment, the amino acid variation is any resulting from a genetic variation in a nucleotide position corresponding to a SNP set forth in any of
The invention also provides a variety of compositions suitable for use in performing methods of the invention. In one embodiment, the invention comprises at least one nucleic acid molecule useful for detecting one or more genetic variations as disclosed in
In another embodiment, the invention provides arrays that can be used in the methods of the present invention. In one embodiment, an array of the invention comprises individual or collections of nucleic acid molecules useful for detecting one or more genetic variations. For instance, an array of the invention may comprise a series of discretely placed individual allele-specific oligonucleotides or sets of allele-specific oligonucleotides. Several techniques are well-known in the art for attaching nucleic acids to a solid substrate such as a glass slide. One method is to incorporate modified bases or analogs that contain a reactive moiety that is capable of attachment to a solid substrate, such as an amine group, a derivative of an amine group, or another group with a positive charge, into nucleic acid molecules that are synthesized. The synthesized product is then contacted with a solid substrate, such as a glass slide coated with an aldehyde or other reactive group. The aldehyde or other reactive group will form a covalent link with the reactive moiety on the amplified product, which will become covalently attached to the glass slide. Other methods, such as those using aminopropyl silane surface chemistry, are also known in the art.
A biological sample, according to any of the above methods, may be obtained using certain methods known to those skilled in the art. Biological samples may be obtained from vertebrate animals, and in particular, mammals. Tissue biopsy is often used to obtain a representative piece of tumor tissue. Alternatively, tumor cells can be obtained indirectly in the form of tissues or fluids that are known or thought to contain the tumor cells of interest. For instance, samples of lung cancer lesions may be obtained by resection, bronchoscopy, fine needle aspiration, bronchial brushings, or from sputum, pleural fluid or blood. Variations in target nucleic acids (or encoded polypeptides) may be detected from a tumor sample or from other body samples such as urine, sputum or serum. (Cancer cells are sloughed off from tumors and appear in such body samples.) By screening such body samples, a simple early diagnosis can be achieved for diseases such as cancer. In addition, the progress of therapy can be monitored more easily by testing such body samples for variations in target nucleic acids (or encoded polypeptides). Additionally, methods for enriching a tissue preparation for tumor cells are known in the art. For example, the tissue may be isolated from paraffin or cryostat sections. Cancer cells may also be separated from normal cells by flow cytometry or laser capture microdissection.
Subsequent to the determination that a subject, or the tissue or cell sample comprises a genetic variation disclosed herein, it is contemplated that an effective amount of an appropriate lupus therapeutic agent may be administered to the subject to treat the lupus condition in the subject. Diagnosis in mammals of the various pathological conditions described herein can be made by the skilled practitioner. Diagnostic techniques are available in the art which allow, e.g., for the diagnosis or detection of lupus in a mammal.
A lupus therapeutic agent can be administered in accordance with known methods, such as intravenous administration as a bolus or by continuous infusion over a period of time, by intramuscular, intraperitoneal, intracerebrospinal, subcutaneous, intra-articular, intrasynovial, intrathecal, oral, topical, or inhalation routes. Optionally, administration may be performed through mini-pump infusion using various commercially available devices.
Effective dosages and schedules for administering lupus therapeutic agents may be determined empirically, and making such determinations is within the skill in the art. Single or multiple dosages may be employed. For example, an effective dosage or amount of interferon inhibitor used alone may range from about 1 mg/kg to about 100 mg/kg of body weight or more per day. Interspecies scaling of dosages can be performed in a manner known in the art, e.g., as disclosed in Mordenti et al., Pharmaceut. Res., 8:1351 (1991).
When in vivo administration of a lupus therapeutic agent is employed, normal dosage amounts may vary from about 10 ng/kg to up to 100 mg/kg of mammal body weight or more per day, preferably about 1 μg/kg/day to 10 mg/kg/day, depending upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature; see, for example, U.S. Pat. Nos. 4,657,760; 5,206,344; or 5,225,212. It is anticipated that different formulations will be effective for different treatment compounds and different disorders, and that administration targeting one organ or tissue, for example, may necessitate delivery in a manner different from that to another organ or tissue.
It is contemplated that yet additional therapies may be employed in the methods. The one or more other therapies may include but are not limited to, administration of steroids and other standard of care regimens for the disorder in question. It is contemplated that such other therapies may be employed as an agent separate from, e.g., a targeted lupus therapeutic agent.
The invention also provides for methods of detecting the presence of lupus by detecting a variation in a PRO or PRO-associated polynucleotide derived from a biological sample. In one embodiment, the biological sample is obtained from a mammal suspected of having lupus.
The invention also provides for methods of determining the genotype of a biological sample by detecting whether a genetic variation is present in a PRO-associated polynucleotide derived from the biological sample. In one embodiment, the genetic variation is at a nucleotide position corresponding to the position of a SNP set forth in any of
The invention also provides for methods of identifying cells in a biological sample from a mammal that are known to comprise, or suspected of comprising, a PRO or PRO-associated polynucleotide comprising a variation, by detecting the variation in a PRO or PRO-associated polynucleotide derived from the cells of the biological sample. In one embodiment, the variation is a genetic variation. In one embodiment, the genetic variation is at a nucleotide position corresponding to the position of a SNP set forth in any of
The invention also provides for methods of diagnosing lupus in a mammal by detecting the presence of a variation in a PRO or PRO-associated polynucleotide derived from a biological sample obtained from the mammal, wherein the biological sample is known to comprise, or suspected of comprising, a PRO or PRO-associated polynucleotide comprising the variation. The invention also provides for methods for aiding in the diagnosis of lupus in a mammal by detecting the presence of a variation in a PRO or PRO-associated polynucleotide derived from a biological sample obtained from the mammal, wherein the biological sample is known to comprise, or suspected of comprising, a PRO or PRO-associated polynucleotide comprising the variation. In one embodiment, the variation is a genetic variation. In one embodiment, the genetic variation is at a nucleotide position corresponding to the position of a SNP set forth in any of
Various algorithms known in the art and described herein can be used for assessing risk of developing lupus and response to therapy. Variants associated with a phenotype can interact in an additive, allelic dose dependent manner. In some embodiments of the invention, an algorithm based on a stratification scheme can be used to assess risk of developing lupus, disease severity, and response to therapy. Lupus cases can be stratified into groups based on the number of risk alleles carried. In one embodiment, the risk allele is defined as the allele enriched in lupus cases relative to controls from the loci. For example, in one embodiment, where a total of 19 alleles from 18 loci are listed, the maximum possible number of risk alleles is equal to 38, because a subject can carry up to two copies of each allele. As described herein, the lupus cases can be stratified by the number of risk alleles, and tertiles of the resulting distribution can be determined. The tertiles of lupus cases can then be examined for differences in disease severity, risk and response to therapy. In another embodiment, a method is provided for predicting whether a subject with lupus will respond to a therapeutic agent that targets a PRO or PRO-associated polynucleotide by determining whether the subject comprises a variation in a PRO or PRO-associated polynucleotide, wherein the presence of a variation in a PRO or PRO-associated polynucleotide indicates that the subject will respond to the therapeutic agent. In one embodiment, the variation is a genetic variation. In one embodiment, the genetic variation is at a nucleotide position corresponding to the position of a SNP set forth in any of
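By way of illustration only, the stratification of lupus cases by number of risk alleles carried, and the division of the resulting distribution into tertiles, may be sketched as follows. The function names, the per-subject copy numbers, and the list of case counts are hypothetical; the tertile cut points are computed with the default method of the Python standard library, which is one of several reasonable conventions and is an assumption of this sketch.

```python
# Illustrative sketch: risk-allele counts and tertile assignment
# (all genotype data below are hypothetical).
import statistics


def max_risk_alleles(n_alleles: int) -> int:
    """Each listed risk allele can be carried in 0, 1, or 2 copies."""
    return 2 * n_alleles


def risk_allele_count(genotype: dict[str, int]) -> int:
    """Sum the copies (0, 1, or 2) of each risk allele carried by a subject."""
    for allele, copies in genotype.items():
        if copies not in (0, 1, 2):
            raise ValueError(f"invalid copy number for {allele}: {copies}")
    return sum(genotype.values())


def tertile_labels(counts: list[int]) -> list[str]:
    """Assign each case to the low, middle, or high tertile of the distribution."""
    lower_cut, upper_cut = statistics.quantiles(counts, n=3)
    labels = []
    for count in counts:
        if count <= lower_cut:
            labels.append("low")
        elif count <= upper_cut:
            labels.append("middle")
        else:
            labels.append("high")
    return labels


# 19 alleles from 18 loci give a maximum of 38 risk alleles.
print(max_risk_alleles(19))

# Hypothetical subject carrying two copies at one SNP and one copy at another.
print(risk_allele_count({"rs13277113": 2, "rs11574637": 1}))

# Hypothetical risk-allele counts for nine lupus cases.
case_counts = [10, 12, 15, 17, 18, 20, 22, 25, 27]
print(tertile_labels(case_counts))
```

The resulting tertile labels could then be compared against clinical measures of severity or treatment response, which are outside the scope of this sketch.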
The invention also encompasses methods of detecting the absence or presence in a subject, or sample obtained therefrom, of a genetic variation at a nucleotide position corresponding to the position of a SNP as set forth in any of
The invention also provides for methods of detecting the absence or presence of a genetic variation in a nucleic acid associated with a PRO by (a) contacting the nucleic acid with an allele-specific oligonucleotide that is specific for the genetic variation under conditions suitable for hybridization of the allele-specific oligonucleotide to the nucleic acid; and (b) detecting the absence or presence of allele-specific hybridization. In one embodiment, the genetic variation is at a nucleotide position corresponding to the position of a SNP set forth in any of
The invention also provides for methods for assessing predisposition of a subject to develop lupus by detecting presence or absence in the subject of a variation in a PRO or PRO-associated polynucleotide, wherein the presence of a variation in a PRO or PRO-associated polynucleotide indicates that the subject is predisposed to develop lupus. In one embodiment, the variation is a genetic variation. In one embodiment, the genetic variation is at a nucleotide position corresponding to the position of a SNP set forth in any of
The invention also provides for methods of sub-classifying lupus in a mammal, the method comprising detecting the presence of a variation in a PRO-associated polynucleotide at a nucleotide position corresponding to the position of a single nucleotide polymorphism (SNP) as set forth in any of
In one embodiment of the detection methods of the invention, the detecting comprises carrying out a process selected from a primer extension assay; an allele-specific primer extension assay; an allele-specific nucleotide incorporation assay; an allele-specific oligonucleotide hybridization assay; a 5′ nuclease assay; an assay employing molecular beacons; and an oligonucleotide ligation assay.
The invention also provides methods of identifying a therapeutic agent effective to treat lupus in a patient subpopulation, the method comprising correlating efficacy of the agent with the presence of a genetic variation at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) in the patient subpopulation, wherein the SNP is one of those listed in
Methods of the invention provide information useful for determining appropriate clinical intervention steps, if and as appropriate. Therefore, in one embodiment of a method of the invention, the method further comprises a clinical intervention step based on results of the assessment of the presence or absence of a variation in a PRO or PRO-associated polynucleotide as disclosed herein. For example, appropriate intervention may involve prophylactic and treatment steps, or adjustment(s) of any then-current prophylactic or treatment steps based on genetic information obtained by a method of the invention.
As would be evident to one skilled in the art, in any method of the invention, while detection of presence of a variation would positively indicate a characteristic of a disease (e.g., presence or subtype of a disease), non-detection of a variation would also be informative by providing the reciprocal characterization of the disease.
The invention also provides for methods of amplifying a nucleic acid comprising a PRO-associated polynucleotide or fragment thereof, wherein the PRO-associated polynucleotide or fragment thereof comprises a genetic variation. In one embodiment, the method comprises (a) contacting the nucleic acid with a primer that hybridizes to a sequence 5′ or 3′ of the genetic variation, and (b) extending the primer to generate an amplification product comprising the genetic variation. In one embodiment, the method further comprises contacting the amplification product with a second primer that hybridizes to a sequence 5′ or 3′ of the genetic variation, and extending the second primer to generate a second amplification product. In one such embodiment, the method further comprises amplifying the amplification product and second amplification product, e.g., by polymerase chain reaction.
In some embodiments, the genetic variation is at a nucleotide position corresponding to the position of a SNP of the present invention. In one such embodiment, the genetic variation comprises a SNP set forth in any of
Still further methods of the invention include methods of treating lupus in a mammal, comprising steps of obtaining tissue or a cell sample from the mammal, examining the tissue or cells for presence or absence of a variation as disclosed herein, and upon determining presence or absence of the variation in said tissue or cell sample, administering an effective amount of an appropriate therapeutic agent to said mammal. Optionally, the methods comprise administering an effective amount of a targeted lupus therapeutic agent, and, optionally, a second therapeutic agent (e.g., steroids, etc.) to said mammal.
In one embodiment, a method of treating lupus is provided, the method comprising administering to the subject an effective amount of an antagonist or agonist of PRO. In one embodiment, the subject exhibits variation in a PRO or PRO-associated polynucleotide. In one embodiment, the variation is a genetic variation. In one embodiment, the genetic variation is at a nucleotide position corresponding to the position of a SNP set forth in any of
The invention also provides for methods of treating a lupus condition in a subject in whom a genetic variation is known to be present at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) listed in
The invention also provides for methods of treating a subject having a lupus condition, the method comprising administering to the subject a therapeutic agent known to be effective to treat the condition in a subject who has a genetic variation at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) listed in
The invention also provides for methods of treating a subject having a lupus condition, the method comprising administering to the subject a therapeutic agent previously shown to be effective to treat said condition in at least one clinical study wherein the agent was administered to at least five human subjects who each had a genetic variation at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) listed in
The invention also provides for methods of treating a lupus subject who is of a specific lupus patient subpopulation comprising administering to the subject an effective amount of a therapeutic agent that is approved as a therapeutic agent for said subpopulation, wherein the subpopulation is characterized at least in part by association with genetic variation at a nucleotide position corresponding to a SNP listed in
The invention also provides for methods of specifying a therapeutic agent for use in a lupus patient subpopulation, the method comprising providing instruction to administer the therapeutic agent to a patient subpopulation characterized by a genetic variation at a position corresponding to a single nucleotide polymorphism (SNP) listed in
The invention also provides for methods for marketing a therapeutic agent for use in a lupus patient subpopulation, the method comprising informing a target audience about the use of the therapeutic agent for treating the patient subpopulation as characterized by the presence, in patients of such subpopulation, of a genetic variation at a position corresponding to a single nucleotide polymorphism (SNP) listed in
The invention also provides for methods for modulating signaling through the B cell receptor in a subject in whom a genetic variation is known to be present at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) listed in
The invention also provides for methods for modulating the differentiation of Th17 cells in a subject in whom a genetic variation is known to be present at a nucleotide position corresponding to a single nucleotide polymorphism (SNP) listed in
In one embodiment of the invention, kits are provided. In one embodiment, a kit comprises any of the polynucleotides described herein, optionally with an enzyme. In one embodiment, the enzyme is at least one enzyme selected from a nuclease, a ligase, and a polymerase.
In one embodiment, the invention provides a kit comprising a composition of the invention, and instructions for using the composition to detect lupus by determining whether a subject's genome comprises a genetic variation as disclosed herein. In one embodiment, the composition of the invention comprises a plurality of polynucleotides capable of specifically hybridizing to at least 1, 2, 3, 4, or 5 PRO-associated polynucleotides, each PRO-associated polynucleotide comprising a genetic variation at a nucleotide position corresponding to the position of a SNP set forth in any of
For use in the applications described or suggested above, kits or articles of manufacture are also provided by the invention. Such kits may comprise a carrier means being compartmentalized to receive in close confinement one or more container means such as vials, tubes, and the like, each of the container means comprising one of the separate elements to be used in the method. For example, one of the container means may comprise a probe that is or can be detectably labeled. Such probe may be a polynucleotide specific for a PRO-associated polynucleotide. Where the kit utilizes nucleic acid hybridization to detect the target nucleic acid, the kit may also have containers containing nucleotide(s) for amplification of the target nucleic acid sequence and/or a container comprising a reporter means, such as a biotin-binding protein, such as avidin or streptavidin, bound to a reporter molecule, such as an enzymatic, fluorescent, or radioisotope label.
The kit of the invention will typically comprise the container described above and one or more other containers comprising materials desirable from a commercial and user standpoint, including buffers, diluents, filters, needles, syringes, and package inserts with instructions for use. A label may be present on the container to indicate that the composition is used for a specific therapy or non-therapeutic application, and may also indicate directions for either in vivo or in vitro use, such as those described above.
The kits of the invention have a number of embodiments. A typical embodiment is a kit comprising a container, a label on said container, and a composition contained within said container; wherein the composition includes a detecting agent for a PRO or PRO-associated polynucleotide, the label on said container indicates that the composition can be used to evaluate the presence of the PRO or PRO-associated polynucleotide in at least one type of mammalian cell, and instructions for using the detecting agent for evaluating the presence of the PRO or PRO-associated polynucleotide in at least one type of mammalian cell. The kit can further comprise a set of instructions and materials for preparing a tissue sample and applying antibody and probe to the same section of a tissue sample. For example, a kit may comprise a container, a label on said container, and a composition contained within said container; wherein the composition includes a polynucleotide that hybridizes to a complement of a PRO-associated polynucleotide under stringent conditions, the label on said container indicates that the composition can be used to evaluate the presence of a PRO-associated polynucleotide in at least one type of mammalian cell, and instructions for using the polynucleotide for evaluating the presence of PRO-associated RNA or DNA in at least one type of mammalian cell.
Other optional components in the kit include one or more buffers (e.g., block buffer, wash buffer, substrate buffer, etc.), other reagents such as substrate (e.g., chromogen) which is chemically altered by an enzymatic label, epitope retrieval solution, control samples (positive and/or negative controls), control slide(s), etc.
The invention herein also encompasses a method for marketing a lupus therapeutic agent or a pharmaceutically acceptable composition thereof comprising promoting to, instructing, and/or specifying to a target audience, the use of the agent or pharmaceutical composition thereof for treating a patient or patient population with lupus from which a sample has been obtained showing the presence of a genetic variation as disclosed herein.
Marketing is generally paid communication through a non-personal medium in which the sponsor is identified and the message is controlled. Marketing for purposes herein includes publicity, public relations, product placement, sponsorship, underwriting, and sales promotion. This term also includes sponsored informational public notices appearing in any of the print communications media designed to appeal to a mass audience to persuade, inform, promote, motivate, or otherwise modify behavior toward a favorable pattern of purchasing, supporting, or approving the invention herein.
The marketing of the diagnostic method herein may be accomplished by any means. Examples of marketing media used to deliver these messages include television, radio, movies, magazines, newspapers, the internet, and billboards, including commercials, which are messages appearing in the broadcast media.
The type of marketing used will depend on many factors, for example, on the nature of the target audience to be reached, e.g., hospitals, insurance companies, clinics, doctors, nurses, and patients, as well as cost considerations and the relevant jurisdictional laws and regulations governing marketing of medicaments and diagnostics. The marketing may be individualized or customized based on user characterizations defined by service interaction and/or other data such as user demographics and geographical location.
The following are examples of the methods and compositions of the invention. It is understood that various other embodiments may be practiced, given the general description provided above.
The bibliographic information for the references cited (and denoted by number) in Examples 1-3 is provided at the end of Example 3. The bibliographic information for the references cited (and denoted by number) in Examples 4-6 is provided at the end of Example 6.
This Example describes materials and methods undertaken to perform a genome-wide scan for SLE in a large sample comprising 1311 SLE cases and 3340 controls. Over 500,000 variants, which captured common variation across an estimated 85% of the human genome,24 were genotyped and tested for an association to SLE.
SLE case samples were genotyped from the following collections: a) 338 subjects from the Autoimmune Biomarkers Collaborative Network (ABCoN), an NIH/NIAMS funded repository,25 b) 141 subjects from the Multiple Autoimmune Disease Genetics Consortium (MADGC),26 c) 613 subjects from the University of California San Francisco (UCSF) Lupus Genetics Project10,27 and d) 335 subjects from the University of Pittsburgh Medical Center (UPMC)28 plus 8 samples collected at The Feinstein Institute for Medical Research. All SLE cases were self-described Caucasians. The diagnosis of SLE (fulfillment of four or more of the American College of Rheumatology (ACR) defined criteria29) was confirmed in all cases by medical record review (94%) or through written documentation of criteria by treating rheumatologists (6%). Clinical data were reviewed and tabulated at each institution.
A total of 3583 control samples were examined in the association analyses. As part of this project, 1861 control samples were selected and then genotyped from the New York Cancer Project (NYCP) collection,30 based on self-described ethnicity, gender and age. In addition, genotype data from 1722 self-described Caucasian control samples were obtained from the publicly available iControlDB database <www.illumina.com/pages.ilmn?ID=231>.
For replication, DNA samples from an independent collection of 793 Swedish SLE patients (all of whom fulfilled four or more of the classification criteria for SLE as defined by the ACR) and 857 healthy Swedish control individuals, were genotyped. The patients were from rheumatology clinics at the Lund, Uppsala, Karolinska (Solna) and Umeå University Hospitals.7 The Institutional Review Boards of all collaborating institutions approved these studies, and all participants gave informed consent.
Control samples from the NYCP (N=1861) were genotyped on the Illumina HumanHap550 Genotyping BeadChip31 at The Feinstein Institute. 1465 samples (464 cases, 1001 controls) were genotyped on the HumanHap550v1 chip and 1875 samples (1015 cases, 860 controls) were genotyped on the HumanHap550v3 chip. Genotype data from 1452 of these control samples were submitted to iControlDB and made publicly available prior to publication. An additional, independent set of 1722 Caucasian samples genotyped using the HumanHap550 BeadChip was obtained from Studies 66 and 67 of the iControlDB <www.illumina.com/pages.ilmn?ID=231>. Case samples were genotyped at The Feinstein Institute in serial phases; Series 1 consisted of the 479 cases from ABCoN and MADGC, Series 2 included the 613 cases from UCSF, and Series 3 was comprised of 387 cases from UPMC and The Feinstein Institute. The 545,080 single nucleotide polymorphisms (SNPs) present on both HumanHap550 versions were advanced into the analysis. Case and control samples with average call rates <80% across the chip were re-genotyped.
In the Swedish replication collection, the SNPs rs11574637 and rs13277113 were genotyped using homogeneous single base primer extension assays with fluorescence polarization detection at the SNP Technology Platform in Uppsala <www.genotyping.se> and reagents from Perkin-Elmer.32 The genotype call rate in the samples was 96% and the reproducibility was 100% according to duplicate assays of 4.6% of the genotypes. A three generation CEPH pedigree with 20 members was genotyped in parallel with the study samples, and no deviation from Mendelian inheritance was observed for either of the SNPs.
Samples with an average call rate of ≦95% (N=42) or where the reported sex of the individual was discordant with observed sex (N=21) were excluded from the analysis. The identity by state (IBS) across the genome was estimated for each sample, and the samples examined for cryptic relatedness. One sample from each pair estimated to be duplicates or 1st-3rd degree relatives was removed (Pi_hat>0.10 and Z1≧0.15, N=161). Three of these pairs were comprised of both a case and a control; the control was removed. SNPs with a frequency in cases of <1% (N=21,644) or a HWE P≦1×10^-6 in controls (N=2819) were removed from the analysis. SNPs with missingness >5% (N=6074) were removed. SNPs were tested for the probability of a significant difference in missingness between cases and controls; SNPs with P≦1×10^-5 (N=7646) were removed. SNPs were also tested for batch effects: for example, between ABCoN samples and all other cases; SNPs with P<1×10^-9 (N=13) were removed.
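The marker-level filters above (case allele frequency, Hardy-Weinberg equilibrium in controls, and missingness) can be illustrated with a short sketch. The section does not state which HWE test was applied, so the 1-degree-of-freedom chi-square approximation below is an assumption made for illustration; the function names are likewise hypothetical. The thresholds are the ones quoted in the text.

```python
import math


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P value for deviation from Hardy-Weinberg proportions.

    Uses the 1-df chi-square approximation (an assumption; an exact
    test could be substituted). Returns 1.0 for monomorphic markers.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_snp_qc(case_maf: float, control_hwe_p: float, missing_rate: float) -> bool:
    """Apply the three marker filters quoted in the text."""
    return (
        case_maf >= 0.01            # remove SNPs with case frequency < 1%
        and control_hwe_p > 1e-6    # remove SNPs with HWE P <= 1e-6 in controls
        and missing_rate <= 0.05    # remove SNPs with missingness > 5%
    )
```

A marker whose control genotype counts are 25/50/25 sits exactly at Hardy-Weinberg proportions and passes, whereas one with no heterozygotes among 1000 controls (500/0/500) fails the HWE filter.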
Population outliers were detected using EIGENSTRAT33. Samples more than 6 standard deviations from the mean along any of the top 10 principal components were excluded from the analysis (N=141). Data from the 3340 remaining control samples were randomly assigned to each SLE case series proportionately, resulting in a ~2.5 control:case ratio (Table 1).
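The outlier rule stated above (more than 6 standard deviations from the mean along any of the top principal components) can be sketched in a few lines. This is a single-pass illustration that assumes the principal-component scores have already been computed; it is not a reimplementation of EIGENSTRAT, whose own outlier removal may differ in detail (for example, by iterating). The function name and input layout are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def find_outliers(pc_scores: Sequence[Sequence[float]], n_sd: float = 6.0) -> List[int]:
    """Return indices of samples beyond n_sd standard deviations on any PC.

    pc_scores holds one row per sample and one column per principal
    component (e.g. the top 10 components).
    """
    n_components = len(pc_scores[0])
    outliers = set()
    for k in range(n_components):
        column = [row[k] for row in pc_scores]
        mu, sd = mean(column), pstdev(column)
        if sd == 0.0:
            continue  # a constant component cannot flag anyone
        for i, value in enumerate(column):
            if abs(value - mu) > n_sd * sd:
                outliers.add(i)
    return sorted(outliers)
```

Flagged indices would be dropped before the association tests are run.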
Series 1 consisted of 411 cases and 1047 controls, Series 2 was comprised of 595 cases and 1516 controls, and Series 3 was comprised of 305 cases and 777 controls. Overall, 93% of cases were female, and 62% of controls were female. No significant differences in allele frequencies were noted between males and females.
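As an illustration of the proportional assignment described above, the sketch below shuffles control identifiers and splits them across the case series in proportion to the number of cases. The rounding rule (largest remainder) and the fixed random seed are assumptions made for the example; the text says only that controls were randomly assigned proportionately.

```python
import random
from typing import List, Sequence


def allocate_controls(control_ids: Sequence, case_counts: Sequence[int], seed: int = 0) -> List[list]:
    """Randomly split controls across series in proportion to case counts."""
    rng = random.Random(seed)
    shuffled = list(control_ids)
    rng.shuffle(shuffled)

    total_cases = sum(case_counts)
    quotas = [len(shuffled) * c / total_cases for c in case_counts]
    sizes = [int(q) for q in quotas]  # round down first
    leftover = len(shuffled) - sum(sizes)
    # Hand the remaining controls to the series with the largest fractional parts.
    by_fraction = sorted(range(len(quotas)), key=lambda i: quotas[i] - sizes[i], reverse=True)
    for i in by_fraction[:leftover]:
        sizes[i] += 1

    groups, start = [], 0
    for size in sizes:
        groups.append(shuffled[start:start + size])
        start += size
    return groups
```

Called with 3340 control identifiers and case counts of 411, 595 and 305, this rule yields groups of 1047, 1516 and 777 controls, matching the series sizes reported above, each at a ratio of roughly 2.5 controls per case.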
SNPs with >2% missing data in at least one series and where the missing data was unequally distributed between cases and controls (differential missingness, P<1×10^-3) were removed (N=3323). SNPs in the pseudo-autosomal region of chromosome X (N=13) showed no significant association and were excluded from further analysis. The sample and marker filtering were conducted using analytical modules within the software program PLINK34. For each series, a total of 502,033 SNPs were advanced into downstream analyses.
The association of all SNPs to SLE susceptibility was calculated using 2×2 contingency tables. A genomic control inflation factor (λgc) was then calculated for each sample series.35 The genomic control inflation factor is a metric based on the median chi square that reflects whether the bulk of the distribution conforms to the null hypothesis (λgc=1.0). A λgc value >1 indicates an elevation of the average chi square association statistic due to systematic technical artifacts or the presence of population stratification. After removing low quality data to minimize technical artifacts, evidence of inflation was noted for each series: 1.14, 1.18, and 1.11, respectively, for Series 1, 2 and 3. To correct for the presence of population stratification, principal components for each series were calculated using a subset of SNPs in EIGENSTRAT. SNPs with case MAF <2% (N=5011), control HWE P≦1×10^-4 (N=1792), or missing data >1% (N=50414) were removed, as were SNPs in regions of abnormal LD patterns due to structural variation on chromosomes 6 (from 24-36 Mb), 8 (8-12 Mb), 11 (42-58 Mb), and 17 (40-43 Mb). The remaining 440,202 SNPs were used to calculate principal components. In each series, the first 4 principal components were used to adjust the association statistic for all 502,033 SNPs. After adjustment for population stratification, the λgc for each series approached 1.0 (see Table 1). The corrected association statistic for each series was combined by the weighted merging of the Z-score incorporating λgc (
To test for heterogeneity between the three case-control studies for the most associated variants, the Breslow-Day test implemented in PLINK was run for the SNP with the best association in each of these regions: HLA DRB, STAT4, IRF5, BLK, and ITGAM/ITGAX. No significant heterogeneity was detected (each P>0.2).
Associations between individual SNPs and subphenotypes were calculated for the combined dataset (
Gene expression measurements of Epstein-Barr Virus transformed B cell lines from 210 unrelated, healthy HapMap individuals from a publicly available dataset (GENEVAR project, www.sanger.ac.uk/humgen/genevar/) were examined for a correlation to variants significantly associated with SLE.36 Specifically, the median fluorescence intensity of 4 measurements from probes for BLK (GI_33469981-S), C8orf13 (GI_32698772-S), ITGAM (GI_6006013-S), ITGAX (GI_34452172-S), ACTB (beta-actin, GI_5016088-S), and GAPDH (GI_7669491-S) from 60 U.S. residents with northern and western European ancestry (CEU), 60 Yoruba (YRI), 45 Han Chinese individuals from Beijing (CHB) and 45 Japanese individuals from Tokyo (JPT) were examined. The expression data for BLK, C8orf13, GAPDH and ACTB were stratified by rs13277113 genotype (obtained from the HapMap (www.hapmap.org)), and the significance of the differential expression was measured by a 2-tailed t-test assuming an equal variance. Similarly, the expression data for ITGAM, ITGAX, GAPDH and ACTB were stratified by genotype at rs11574637 and tested for significance using a t-test. Expression data normalized on a log scale across the HapMap populations as described by the GENEVAR project yielded similar results to the median fluorescence intensity.
The association of BLK and C8orf13 expression to cis-genetic variation in an independent set of 400 EBV-transformed B cells was obtained by the examination and data-mining of a recently published study (www.sph.umich.edu/csg/liang/asthma/).37 Specifically, the association of a proxy for rs13277113 (rs4840568) to the expression levels of BLK (probe 206255_at) and C8orf13 (probe 226614_s_at) was measured as described by Dixon et al.37
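The equal-variance t-test used to compare expression between genotype groups can be illustrated as follows. The sketch computes only the pooled-variance t statistic and its degrees of freedom for two groups; converting the statistic into a two-tailed P value requires the Student t distribution, which is left to a statistics library. The function name and the toy values in the usage note are illustrative and are not data from the study.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t_statistic(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Two-sample t statistic assuming equal variance in both groups.

    Returns (t, degrees_of_freedom). Each group needs at least two values.
    """
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    standard_error = math.sqrt(pooled_var * (1.0 / nx + 1.0 / ny))
    return (mean(x) - mean(y)) / standard_error, df
```

For example, `pooled_t_statistic([1, 2, 3], [2, 3, 4])` returns a t statistic of about -1.22 with 4 degrees of freedom. In practice `x` and `y` would hold the expression intensities of a probe for two genotype groups at the SNP of interest.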
A total of 502,033 polymorphic SNPs on the Illumina chips passed quality control filters and were tested for association to SLE in a staged fashion using 3 case-control series (Table 1). A combined association statistic was calculated by addition of the Z-scores converted from the EIGENSTRAT-corrected chi square test statistic, weighted for series size and adjusted for the residual λgc of each series (see Methods).
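A rough sketch of how per-series statistics might be combined is given below. The exact weighting formula is not reproduced in this section, so the choices here are assumptions: each chi-square statistic is divided by the series' residual λgc, converted to a signed Z-score using the direction of effect, and the Z-scores are combined with weights equal to the square root of the series size. The value 0.454936 is the median of a 1-df chi-square distribution, on which λgc is conventionally based.

```python
import math
from statistics import median
from typing import Sequence

CHI2_1DF_MEDIAN = 0.454936  # median of a chi-square distribution with 1 df


def lambda_gc(chi2_values: Sequence[float]) -> float:
    """Genomic control inflation factor: observed median over expected median."""
    return median(chi2_values) / CHI2_1DF_MEDIAN


def signed_z(chi2: float, direction: int, lam: float = 1.0) -> float:
    """Convert a 1-df chi-square into a Z-score after dividing by lambda.

    direction is +1 or -1 according to which allele is enriched in cases.
    """
    return direction * math.sqrt(chi2 / lam)


def combine_z(z_scores: Sequence[float], sample_sizes: Sequence[int]) -> float:
    """Weighted Z-score combination (weights = sqrt of series size; assumed)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    return numerator / math.sqrt(sum(w * w for w in weights))


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

With this weighting, two equally sized series that each give Z = 2 combine to Z = 2√2 ≈ 2.83, showing how consistent evidence across series strengthens the combined signal.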
A comparison of the observed meta-analysis P values relative to the P values for a null distribution is shown in
After removing variants in HLA, IRF5 and STAT4 from the chi expected vs. observed analysis, the deviation of P values from the null distribution was not eliminated (
BLK/C8orf13
Several variants on the short arm of chromosome 8 (8p23.1) were associated with SLE (
rs13277113 maps to the interval between two genes transcribed in opposite directions: BLK, a src family tyrosine kinase that signals downstream of the B cell receptor, and C8orf13, a ubiquitously expressed gene of unknown function (
Common genetic variation has been shown to correlate with levels of cis gene expression.8, 36, 37, 40 To determine whether the associated promoter SNPs might influence mRNA expression of BLK and/or C8orf13, a gene expression dataset generated from Epstein-Barr virus transformed B lymphocyte cell lines from the 210 unrelated HapMap samples was queried.36 Strikingly, the risk “A” allele of rs13277113 was associated with lower levels of mRNA expression of BLK (
Multiple conserved transcription factor binding sites, including motifs for IRF1, PPARG and an interferon-stimulated response element, are located in the 5′ region of BLK and C8orf13. However, neither rs13277113 nor correlated variants (r2>0.5) altered known transcription factor binding sites or other known functional nucleic acid motifs. We conclude that rs13277113, or variation strongly associated with rs13277113, alters the level of mRNA expression of BLK and C8orf13.
Variants within a cluster of integrin alpha chain genes on chromosome 16 were also significantly associated with SLE (
rs11574637 is part of a large block of correlated SNPs that covers ~150 kb encoding several genes including ITGAM and the 5′ portion of ITGAX (
Associations with SLE Clinical Features
Finally, the associations between the two top SNPs, rs13277113 (BLK) and rs11574637 (ITGAM), and the presence of individual ACR criteria, using the combined case series 1-3 (
The current effort describes the results of a comprehensive genome-wide association study performed in SLE. By studying a large number of SLE cases (1311) and an even larger group of controls (3340), the major alleles contributing risk to SLE were detected. The strong signals observed in the HLA region, IRF5 and STAT4 served as positive controls for the experiment, and confirm that these loci are among the most important genetic factors in this disease.
The src family tyrosine kinase BLK is an interesting new candidate gene for SLE. Expression of BLK is highly restricted to the B lymphocyte lineage.41 Blk expression in the mouse is first observed in cycling late pro-B cells, continues throughout B cell development, and is subsequently downregulated in plasma B cells.42 A knockout mouse for Blk has no gross phenotype43, and functional studies in human B cells have not been performed. Without being bound by theory, BLK is one of the tyrosine kinases that transduces signals downstream of the B cell receptor, and it perhaps has a redundant role in the mouse, given the lack of a phenotype in the knockouts. There is precedent for major species differences in the role of B cell receptor associated kinases. For example, Bruton's tyrosine kinase (BTK) deficiency in humans leads to X-linked agammaglobulinemia, and a complete lack of B cells.44 However, deficiency of Btk in the mouse is associated with a much milder phenotype, with production of mature B cells that are functionally impaired.45
Signaling through the B cell receptor is important for establishing the B cell repertoire through induction of anergy, deletion and receptor editing during B cell development.46, 47 As shown here, the risk allele at BLK is associated with reduced expression of BLK mRNA in transformed B cell lines. Without being bound by theory, the altered protein levels of BLK might influence tolerance mechanisms in B cells, predisposing individuals to systemic autoimmunity. A similar mechanism has recently been shown for Ly108, one of the major genetic loci in the NZM2410 mouse model for lupus.48 Accordingly, in one embodiment of the invention, one of skill in the art can use the information provided herein to assess the effect of the risk haplotype on expression of the ubiquitously expressed gene C8orf13.
A second locus identified in this scan is ITGAM/ITGAX. While ITGAX is not excluded from consideration based on the strong LD in the region that extends into the 5′ portion of ITGAX, the data suggest that ITGAM may be the relevant gene in the region. ITGAM (also known as CD11b, Mac-1, and the complement receptor type 3) is a well characterized integrin alpha chain molecule that is expressed by a variety of myeloid cell types, including dendritic cells, macrophages, monocytes, and neutrophils.49-51 ITGAM forms a heterodimer with ITGB2 (CD18), and mediates adhesion between cell types in the immune system, and the adhesion of myeloid cells to endothelium.52 Mice deficient for ITGAM show enhanced disease progression and inflammation in several models of autoimmunity,53-55 including lupus, and recent data suggest that ITGAM may function normally to suppress Th17 differentiation,56 a pathway that has been linked with induction of autoimmunity. Of interest, the expression of CD11b has been reported to be elevated on the neutrophils of active SLE patients.57 The risk allele for ITGAM with its two highly correlated nonsynonymous alleles may predispose to altered function and/or regulation of expression of the protein, thereby contributing to systemic autoimmunity.
In summary, the current data identify two new susceptibility loci for SLE: BLK/C8orf13 on chromosome 8 and ITGAM/ITGAX on chromosome 16. The most likely candidate genes within these two loci are BLK and ITGAM. The identification of these genes provides important new insights into the genetic basis of SLE and also suggests potential new targets for therapy.
In this Example, the initial data set consisted of the cases and the controls from the genome wide association study described above in Examples 1 and 2, with genotypes from Illumina HumanHap550v1 chips and Illumina HumanHap550v3 chips. The data set from Illumina HumanHap550v1 chips consisted of 555352 SNPs in each of 464 cases and 1962 controls. The data set from Illumina HumanHap550v3 chips consisted of 561466 SNPs in each of 971 cases and 1621 controls. For each data set, quality-control filters were applied similarly to the manner described above in Examples 1 and 2. The resulting data set from HumanHap550v1 chips consisted of 534523 SNPs in each of 422 cases and 1881 controls. The resulting data set from HumanHap550v3 chips consisted of 549273 SNPs in each of 929 cases and 1558 controls.
The above data set from Illumina HumanHap550v1 chips was merged with the above data set from Illumina HumanHap550v3 chips. The resulting data set consisted of 564307 SNPs in each of 1351 cases and 3439 controls. This data set was merged with genotypes from the CGEMS breast and prostate cancer studies: 553820 SNPs in each of 4527 samples, used as controls. The resulting data set consisted of 570099 SNPs in each of 1351 cases and 7966 controls. Quality-control filters were applied similarly to the manner described above in Examples 1 and 2. The resulting data set consisted of 446856 SNPs in each of 1351 cases and 7966 controls.
The above data set was used to impute genotype probabilities for each polymorphic CEU SNP in the Phase II HapMap, via the program IMPUTE (www.stats.ox.ac.uk/~marchini/software/gwas/impute.html). The recommended effective population size (-Ne 11418) was used.
Association between SLE status and each imputed SNP was calculated with the program SNPTEST (www.stats.ox.ac.uk/~marchini/software/gwas/snptest.html). Population outliers were excluded; they were determined with the program EIGENSTRAT, in a manner similar to that described above in Examples 1 and 2. Both additive and general frequentist models were tested.
The selection and genotyping of the SLE case samples was described previously (1). Briefly, DNA samples from a) 338 subjects from the Autoimmune Biomarkers Collaborative Network (ABCoN), an NIH/NIAMS-funded repository (2), b) 141 subjects from the Multiple Autoimmune Disease Genetics Consortium (MADGC) (3), (ABCoN+MADGC=case series 1), c) 613 subjects from the University of California San Francisco (UCSF) Lupus Genetics Project (4, 5) (case series 2) and d) 335 subjects from the University of Pittsburgh Medical Center (UPMC) (6) plus 8 samples collected at The Feinstein Institute for Medical Research (case series 3) were genotyped using the Illumina 550K array. All SLE cases were North Americans of European descent, as determined by self-report. The diagnosis of SLE (fulfillment of four or more of the American College of Rheumatology (ACR) defined criteria (7)) was confirmed in all cases by medical record review (94%) or through written documentation of criteria by treating rheumatologists (6%). Clinical data for these case series were presented elsewhere (4, 3, 2, 6, 5).
A total of 8147 control samples genotyped using the Illumina 550K array were examined in the association analyses. Three sources were used for controls (all North Americans of European descent): 1861 samples from the New York Health Project (NYHP) collection (8); 1722 samples from the publicly available iControlDB database (www.illumina.com/pages.ilmn?ID=231); and 4564 samples from the publicly available Cancer Genetics Markers of Susceptibility (CGEMS) project (http://cgems.cancer.gov/). Genotyping of the NYHP samples was described previously (1).
Sample and SNP filtering was conducted using analytical modules within the software programs PLINK (9) and EIGENSTRAT (10), as described below.
a) SLE Cases, NYCP Samples, and iControlDB Samples
The Illumina 550K SNP array, version 1 (HH550v1) was used to genotype 464 cases and 1962 controls, and the Illumina 550K SNP array, version 3 (HH550v3) was used to genotype 971 cases and 1621 controls as described (1). Samples where the reported sex did not match the observed sex (HH550v1: 10, HH550v3: 11) and samples with >5% missing genotypes (HH550v1: 25, HH550v3: 21) were excluded from the analysis. Cryptic relatedness between the SLE cases and controls was determined by the estimation of the identity-by-state (IBS) across the genome for all possible pair-wise sample combinations. One sample from each pair estimated to be duplicates or 1st-3rd degree relatives was excluded (Pi_hat≧0.10 and Z1≧0.15; HH550v1: 88, HH550v3: 73). SNPs with HWE P≦1×10−6 in controls (HH550v1: 3176, HH550v3: 2240) and SNPs with >5% missing data (HH550v1: 12605, HH550v3: 7137) were removed. The SNPs were tested for a significant difference in the frequency of missing data between cases and controls, and SNPs with P≦1×10−5 in the differential missingness test implemented in PLINK were removed (HH550v1: 5027, HH550v3: 2804). The SNPs were also tested for a significant allele frequency difference between genders; all SNPs had P>1×10−9 in controls. The data were examined for the presence of batch effects (for example, between ABCoN samples and all other cases), and SNPs with an allele frequency difference with a P<1×10−9 were excluded (HH550v1: 18, HH550v3: 10). Variants with heterozygous haploid genotypes were set to missing (HH550v1: 2305, HH550v3: 875). In addition, variants with a minor allele frequency <0.0001 were removed (HH550v1: 97, HH550v3: 57).
For the 2277 prostate cancer samples and, separately, 2287 breast cancer samples, heterozygous haploid genotypes were set to missing (prostate: 2717, breast: 0). Samples where the reported gender did not match the observed gender (prostate: 0, breast: 2) and samples with >5% missing data (prostate: 15, breast: 1) were excluded. Samples were tested for cryptic relatedness, as described above, and one sample was removed from each pair estimated to be duplicates or 1st-3rd degree relatives (Pi_hat≧0.10 and Z1≧0.15; prostate: 12, breast: 7). SNPs with a MAF<0.0001 (prostate: 3254, breast: 2166) were removed.
Additional data quality filters were applied to the merged dataset consisting of all SLE cases and controls. SNPs with >5% missing data (N=65,421) and samples with >5% missing data (N=0) were removed. A test for duplicate samples was conducted using 957 independent SNPs with MAF ≧0.45, and no duplicate samples were found. SNPs with HWE P≦1×10−6 in controls (N=2174) and SNPs with >2% missing data (N=5522) were removed. The SNPs were tested for a significant difference in the proportion of missing data between cases and controls, and SNPs with an excess missing-data differential were removed (P≦1×10−5, N=16080). SNPs were tested for a significant difference between genders and all SNPs had P≧1×10−9 in controls. SNPs were also examined for the presence of batch effects, in particular between CGEMS breast cancer samples and all other controls and between CGEMS prostate cancer samples and all other controls, and SNPs with P<1×10−9 were removed (N=73). After application of the above quality filters, 480,831 SNPs remained.
The cases and controls were tested for the presence of population outliers using EIGENSTRAT. SNPs with MAF <2% in cases (N=16068), HWE P≦1×10−4 in controls (N=977), or >1% missing data (N=17029); SNPs in regions of abnormal LD patterns due to structural variation on chromosomes 6 (from 24-36 Mb), 8 (8-12 Mb), 11 (42-58 Mb), and 17 (40-43 Mb); and SNPs in the pseudoautosomal region of chromosome X (N=12) were excluded for the purpose of determining the principal components (EIGENSTRAT) of variation to detect population outliers. Samples with greater than 6 standard deviations from the mean along any of the top 10 principal components were removed (N=148).
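The outlier rule described above can be sketched as follows. This is a minimal single-pass illustration, assuming the principal-component scores have already been computed (for example by EIGENSTRAT). The function name `flag_outliers` and the list-of-lists data layout are hypothetical and are not part of EIGENSTRAT, which can also repeat the removal over several rounds.

```python
from statistics import mean, pstdev

def flag_outliers(pc_scores, n_sd=6.0):
    """Return indices of samples lying more than n_sd standard deviations
    from the mean along any principal component.

    pc_scores: list of per-sample lists, one score per principal component
    (hypothetical layout chosen for this sketch).
    """
    n_components = len(pc_scores[0])
    outliers = set()
    for k in range(n_components):
        column = [row[k] for row in pc_scores]
        mu = mean(column)
        sd = pstdev(column)
        if sd == 0:
            continue  # no variation along this component
        for i, value in enumerate(column):
            if abs(value - mu) > n_sd * sd:
                outliers.add(i)
    return sorted(outliers)
```

In the analysis described here, the scores would cover the top 10 principal components, and the flagged samples would be dropped before association testing.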
The final data set had 1310 cases, 7859 controls, and 480,831 SNPs, and the genomic control inflation factor (λgc) (11) was 1.06 after the application of the above data quality filters.
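For illustration, the genomic control inflation factor is commonly computed as the median of the observed 1-degree-of-freedom chi-square statistics divided by the median of the chi-square distribution with one degree of freedom (about 0.4549). The sketch below assumes that definition; the function name `genomic_inflation` is hypothetical.

```python
from statistics import median

# Median of the chi-square distribution with one degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(chi2_stats):
    """Genomic control inflation factor: observed median / expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN
```

A value near 1.0 indicates little inflation of the test statistics; the value of 1.06 reported above corresponds to an observed median about 6% larger than expected.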
The extensive linkage disequilibrium present in the human genome allows the inference of untyped variants in certain situations with a high degree of confidence. IMPUTE, a program for imputing unobserved genotypes in genome-wide case-control studies based on a set of known haplotypes (HapMap Phase II haplotypes, www.hapmap.org), was used in the analysis (www.stats.ox.ac.uk/˜marchini/software/gwas/impute.html).
Imputing the GNE Cases and NYCP, iDB, and CGEMS Controls
After quality control filters, there were 1310 GNE cases, 3344 NYCP and iDB controls, 4515 CGEMS controls, and 446,856 SNPs. The program IMPUTE (v0.3.1) was run with the included CEU haplotype, legend, and map files aligned to NCBI Build 35. The effective population size was set to the recommended value of 11418. No strand file was used; strand alignment checking in IMPUTE was turned on. Cases, NYCP and iDB controls, and CGEMS controls were imputed separately, and each chromosome was imputed separately in its entirety. 2,562,708 SNPs were imputed.
SNPTEST (v1.1.3) was used to do association tests on both the actual and imputed genotypes. For SNPs that were already genotyped, the actual genotypes were used. The association test was the Cochran-Armitage test for an additive genetic effect, with the “-proper” option to completely take into account the uncertainty of the genotypes. Only SNPs with an information score above 0.50 (i.e., frequentist_add_proper_info>0.50) were kept (2,481,907 SNPs [97%]).
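The additive test named above can be illustrated on hard genotype counts. The sketch below is a textbook Cochran-Armitage trend test with allele-dose scores 0, 1 and 2; it is not SNPTEST's implementation and, unlike the “-proper” option, it does not account for genotype uncertainty. The function name `armitage_trend` is hypothetical.

```python
import math

def armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test on hard genotype counts.

    cases, controls: counts of samples carrying 0, 1, 2 copies of the allele.
    Returns (chi-square statistic with 1 df, P value).
    """
    R = sum(cases)
    S = sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    sum_tr = sum(t * r for t, r in zip(scores, cases))
    sum_tn = sum(t * n for t, n in zip(scores, totals))
    sum_ttn = sum(t * t * n for t, n in zip(scores, totals))
    denominator = R * S * (N * sum_ttn - sum_tn ** 2)
    if denominator == 0:
        return 0.0, 1.0  # monomorphic SNP or empty group: nothing to test
    chi2 = N * (N * sum_tr - R * sum_tn) ** 2 / denominator
    # Upper tail of chi-square with 1 df equals erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

For example, `armitage_trend((10, 20, 30), (30, 20, 10))` gives a chi-square statistic of 20.0, while identical genotype proportions in cases and controls give a statistic of 0 and a P value of 1.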
Results. A non-redundant list of loci associated with SLE (P<1×10−5) in the analysis of 1310 cases and 7859 controls is presented in Table 1. The rank-ordered list was generated by displaying the single variant with the lowest P value in a +/−100 kb interval from the analysis of 2.3 million SNPs as described above.
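One way to build such a non-redundant list is a greedy pass over the variants in order of increasing P value, keeping a variant only if no already-kept variant on the same chromosome lies within 100 kb. The text does not specify the exact procedure, so the sketch below is an assumption, and the function name `best_per_interval` and the tuple layout are hypothetical.

```python
def best_per_interval(variants, window_bp=100_000):
    """Greedy selection of the lowest-P variant in each +/- window_bp interval.

    variants: iterable of (snp_id, chromosome, position_bp, p_value).
    Returns the retained variants, ordered by ascending P value.
    """
    kept = []
    for variant in sorted(variants, key=lambda v: v[3]):
        _, chrom, pos, _ = variant
        near_kept = any(
            chrom == k[1] and abs(pos - k[2]) <= window_bp for k in kept
        )
        if not near_kept:
            kept.append(variant)
    return kept
```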
A total of 16 alleles met one of the criteria described below for confirmed SLE risk loci (Table 2).
1) SLE Risk Loci with at Least 2 Independent Reports of P≦1×10−5.
The literature was examined for loci with 2 independent reports in non-overlapping SLE cohorts with a P≦1×10−5. The literature search represents publications prior to April 2008. The identical variant (or proxy with r2>0.3) showing association to SLE with the same direction of effect was required. A total of 7 alleles fulfilled the requirements, including HLA-DRB1*0301 (HLA-DR3, (18, 19)), HLA-DRB1*1501 (HLA-DR2, (18, 19)), Protein Tyrosine Phosphatase Non-receptor type 22 (PTPN22, (20, 21)), Interferon Regulatory Factor 5 (IRF5, (22, 23)), Signal Transducer and Activator of Transcription 4 (STAT4, (5, 21)), B Lymphoid tyrosine Kinase (BLK, (21, 1)) and Integrin Alpha M (ITGAM, (1, 24)). The identical allele or best proxy (r2>0.85) in the 1310 SLE case and 7859 control genome-wide association scan described here was advanced into the analysis (Table 2).
2) SLE Risk Loci with a Single Report of P≦1×10−5.
A literature search for SLE risk loci with a reported P≦1×10−5 in a single publication as of April 2008 was performed and a total of 18 loci were identified.
In 13 of the loci, the identical variant or near-perfect proxy (r2>0.9) was genotyped in the 1310 SLE case and 7859 control genome scan described above (Table 4). A meta-analysis using the methodology described below was performed for the 13 loci, and 8 of the loci achieved a P≦5×10−8. The loci (labeled by a single gene within the locus) achieving genome-wide significance include: Pituitary Tumor-Transforming Protein 1 (PTTG1), APG5 autophagy 5-like (ATG5), CTD-binding SR-like protein rA9 (KIAA1542), Ubiquitin-conjugating Enzyme E2L3 (UBE2L3), PX domain containing serine/threonine kinase (PXK), Fc fragment of IgG, low affinity IIa, Receptor (FCGR2A), Tumor Necrosis Factor (ligand) Superfamily 4 (TNFSF4), and B-cell scaffold protein with Ankyrin repeats 1 (BANK1). The variant reaching genome-wide significance in the meta-analysis was advanced into the analysis (Table 5, Table 2). In the remaining 5 loci, the reported variant or near-perfect proxy (r2>0.9) was not genotyped in the 1310 SLE case and 7859 control SLE genome-wide association scan (Table 2). However, a variant in interleukin-1 receptor-associated kinase 1 (IRAK1) had an observed P≦1×10−4 and was advanced into the analysis (Table 1).
The corrected association statistic for each series was combined by the summing of the Z-scores weighted for cohort size.
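A weighted Z-score combination of this kind can be sketched as below. The exact weights are not stated in the text; this sketch assumes the common choice of weighting each series by the square root of its sample size, and the function name `combine_z` is hypothetical.

```python
import math

def combine_z(z_scores, sample_sizes):
    """Combine per-series Z-scores, weighting each by sqrt(sample size).

    The sqrt(n) weighting is an assumption of this sketch.
    Returns (combined Z, two-sided P value).
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    z = sum(w * z_i for w, z_i in zip(weights, z_scores)) / math.sqrt(
        sum(w * w for w in weights)
    )
    # Two-sided normal tail probability for the combined Z-score.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

With two equally sized series that each give Z=2.0, the combined Z is 2.0×√2, or about 2.83, which shows how consistent evidence across series strengthens the combined statistic.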
9.5 × 10−25
1.4 × 10−19
2.5 × 10−14
2.3 × 10−11
4.4 × 10−16
1.0 × 10−10
2.5 × 10−11
6.9 × 10−22
3.0 × 10−11
1.0 × 10−10
5.3 × 10−14
2.7 × 10−12
3.0 × 10−10
3.7 × 10−10
†Best SNP: the SNP with the lowest P at that locus.
SLE risk loci were identified using two primary methods—a) analysis of 1310 SLE cases and 7859 controls, and b) a meta-analysis with previously reported SLE risk loci.
A non-redundant list of the variants with strong association to SLE risk (P<1×10−6) is provided in Table 6.
Variants associated with a phenotype are known to interact in an additive, allelic dose-dependent manner (38, 39). In one exemplary embodiment, the following algorithm can be used to assess risk of lupus, disease severity, and response to therapy. Lupus cases can be stratified into groups based on the number of risk alleles carried. In this exemplary embodiment, the risk allele is defined as the allele at each locus that is enriched in lupus cases relative to controls. For example, in Table 6 there are a total of 19 alleles from 18 loci, making the maximum possible number of risk alleles equal to 38. The lupus cases can be stratified by the number of risk alleles, and tertiles of the resulting distribution can be determined. The tertiles of lupus cases can then be examined for differences in disease severity, risk and response to therapy.
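The stratification described above can be sketched as follows, assuming each individual's genotypes are stored as two-character strings keyed by variant ID. The function names, the variant IDs in the usage example and the rank-based tertile assignment are all hypothetical choices for this illustration.

```python
def risk_allele_count(genotype, risk_alleles):
    """Count risk alleles carried by one individual.

    genotype: dict mapping variant ID to a two-character genotype, e.g. "AG".
    risk_alleles: dict mapping variant ID to its risk allele, e.g. "A".
    """
    return sum(genotype[v].count(allele) for v, allele in risk_alleles.items())

def assign_tertiles(counts):
    """Assign each individual to tertile 1, 2 or 3 by rank of risk-allele count.

    Ties are broken by input order, so individuals with equal counts can
    fall on either side of a tertile boundary.
    """
    order = sorted(range(len(counts)), key=lambda i: counts[i])
    tertiles = [0] * len(counts)
    for rank, i in enumerate(order):
        tertiles[i] = 1 + (3 * rank) // len(counts)
    return tertiles
```

For example, `risk_allele_count({"rs1": "AG", "rs2": "TT"}, {"rs1": "A", "rs2": "T"})` returns 3. With 19 variants, an individual homozygous for every risk allele would reach the maximum count of 38.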
9.5 × 10−25
1.4 × 10−19
2.5 × 10−14
2.3 × 10−11
Genomic DNA from 192 SLE patients and 96 healthy controls was whole-genome amplified prior to resequencing. All exons and selected noncoding regions (2.5 kb of the promoter region upstream of exon 1) of B-lymphoid Kinase (BLK), Integrin Alpha M (ITGAM), and Integrin Alpha X (ITGAX) were resequenced.
Initial allele calling was performed by software provided by “Polymorphic”. All the coding polymorphisms as well as common noncoding alleles were manually verified to confirm the allele calls and to create the genotyping files used for the association and haplotype analysis.
Variants of ITGAM/ITGAX are provided in Tables 7 and 9 and Tables 8 and 10. The variants of Tables 7 and 9 are not present in the database dbSNP build 129. The variants of Tables 8 and 10 were discovered by sequencing of ITGAM/ITGAX and BLK.
A genome-wide association study for SLE was performed. 1079 SLE cases and 1411 controls were genotyped with the Illumina HumanHap550 Genotyping BeadChip (555,352 SNPs). The SLE cases were from three distinct cohorts. Control samples were chosen based on available HLA typing, ethnicity, gender, and age. Most controls (all but 277) were chosen such that the frequency of HLA DR2 and DR3 haplotypes would match that found in SLE.
There have been three versions of the Illumina HumanHap550. The number of SNPs shared between version 1 and version 3 is 545,080; only these SNPs were analyzed. Version 1 was used for all cohort 1 and cohort 2 samples and 1001 control samples. Version 3 was used for all cohort 3 samples and 410 control samples.
Chips with average call rates <80% were redone. After all redos were complete, samples with <90% call rates were removed.
Samples were initially divided into two groups for analysis. The first group (Group 1) consisted of all cohort 1 and cohort 2 samples (466 cases) and 724 control samples. The second group (Group 2) consisted of all cohort 3 samples (613 cases) and the remaining 687 control samples.
Samples were checked for agreement between genotype-determined gender and clinical records; a discrepancy was found in 10 samples (3 cases, 7 controls), which were removed from further analysis.
Samples were then tested for intercontinental admixture using the program STRUCTURE (the online link can be accessed by typing “pritch.bsd.uchicago.edu/structure” with “.html” as the suffix)(essentially as described in Pritchard et al., Genetics (2000), 155:945-959; Falush et al., Genetics (2003), 164:1567-1587; Falush et al., Molecular Ecology Notes (2007), doi:10.1111/j.1471-8286.2007.01758.x). The HumanHap550 includes a “DNA Test Panel” of 276 SNPs which are ideal for determining percent-ancestry to the CEU, YRI, and CHB+JPT populations of the HapMap project. (CHB and JPT could not be discriminated using these SNPs.) 274 of the 276 SNPs in the DNA Test Panel were genotyped in all HapMap populations; STRUCTURE was run with genotypes for these 274 SNPs in the set consisting of the remaining Group-1 samples (463 cases, 717 controls) plus one sample from each pedigree in the HapMap project (i.e., 20 CEPH samples from Utah (CEU), 30 Yoruba samples (YRI), 45 Han Chinese samples (CHB), and 44 Japanese samples (JPT)). The HapMap samples were included as positive controls and to aid the clustering algorithm. STRUCTURE was run independently three times with the same parameters: using the admixture ancestry model and the correlated allele-frequency model with no prior population information, assuming three populations, with 30,000 burn-in steps followed by 100,000 Markov-Chain Monte Carlo steps. The three runs had very similar coefficients of ancestry for each sample, and each HapMap sample had >93.0% ancestry to its geographic origin; each CEU sample had >97.0% CEU ancestry. Samples which had <90.0% CEU ancestry in any of the three runs (28 cases, 24 controls) were removed from further analysis.
For the remaining samples (435 cases, 693 controls), SNPs with call rates <95% (23,275 SNPs (4%)) were removed from further analysis. Then, SNPs with Hardy-Weinberg probability ≦0.001 in controls (15,622 SNPs (3%)) were removed from further analysis.
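A Hardy-Weinberg filter of the kind described above can be illustrated with a chi-square goodness-of-fit test on the genotype counts in controls. The text does not state which Hardy-Weinberg test was used, so the chi-square approximation below is an assumption, and the function names are hypothetical.

```python
import math

def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts and returns (chi-square, P value), 1 df.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic SNP: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

def passes_hwe(control_counts, threshold=0.001):
    """True if the SNP would be kept, i.e. HWE P value in controls > threshold."""
    return hwe_chi2_test(*control_counts)[1] > threshold
```

Genotype counts of (25, 50, 25) match Hardy-Weinberg proportions exactly and pass the filter, whereas (50, 0, 50), with no heterozygotes at all, fails it.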
Samples were not explicitly checked for agreement between genotype-determined gender and clinical records.
SNPs with call rates <95% (34,998 SNPs (6%)) were removed from further analysis.
Samples were then tested for intercontinental admixture using STRUCTURE, as described above. Samples which had <90.0% CEU ancestry in any of the three runs (21 cases, 24 controls) were removed from further analysis.
For the remaining samples (592 cases, 663 controls), SNPs with Hardy-Weinberg probability ≦0.001 in controls (22,202 SNPs (4%)) were removed from further analysis.
Groups 1 and 2 were combined for the final analysis.
The remaining samples (435 cases, 693 controls) in Group 1 were combined with the remaining samples (592 cases, 663 controls) in Group 2 to yield the Final Group (1027 cases, 1356 controls). Only the SNPs remaining in both Group 1 and Group 2 (496,458 SNPs) were analyzed further.
All samples without gender discrepancies (1076 cases, 1404 controls) were checked to see if they could be duplicates or related. Initially all pairs of samples were compared across 800 SNPs spread across the genome. Duplicate and related candidates were then checked across 540,000+ SNPs. Three groups of outliers were detected. The first group (20 pairs) had >95% identity between each pair and was deemed duplicates. The second group (17 pairs) had 67-77% identity between each pair and was deemed related. The third group (5 pairs) had 58-63% identity between each pair and was deemed related. (Average identity between samples was 51-55%.) Overall, 39 samples (29 cases, 10 controls) were removed from the Final Group.
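The pairwise comparison can be sketched as below. The text does not define "identity" precisely; this sketch assumes it is the fraction of SNPs at which two samples have the same genotype call, and it simply reuses the percentage bands reported above. The function names and the 0/1/2 genotype coding are hypothetical.

```python
def pairwise_identity(geno_a, geno_b):
    """Fraction of SNPs with identical genotype calls in two samples.

    Genotypes are coded 0, 1, 2 (copies of one allele) or None if missing;
    SNPs missing in either sample are skipped.
    """
    compared = same = 0
    for a, b in zip(geno_a, geno_b):
        if a is None or b is None:
            continue
        compared += 1
        same += (a == b)
    return same / compared if compared else float("nan")

def classify_pair(identity):
    """Label a pair using the identity bands reported in the text."""
    if identity > 0.95:
        return "duplicate"
    if 0.67 <= identity <= 0.77 or 0.58 <= identity <= 0.63:
        return "related"
    return "unclassified"
```

A two-stage screen like the one described would apply `pairwise_identity` first to a small genome-wide SNP panel and then recheck candidate pairs across the full SNP set.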
SNPs in mitochondrial DNA (19 SNPs) were removed from further analysis.
The resulting Main Group (998 cases, 1346 controls, 496,439 SNPs) was used in the analysis below.
The same analysis was also performed on specific subsets of the Main Group:
Subset 1: Females only (907 cases, 967 controls) and Subset 2: Cases with lupus nephritis, and all controls (286 cases, 1346 controls)
All SNPs in the Main Group were analyzed using EIGENSTRAT, a program that corrects for population stratification and is essentially described in Price et al., Nature Genetics (2006), 38:904-909 (the online link can be accessed by typing “genepath.med.harvard.edu/˜reich/EIGENSTRAT” with “.htm” as the suffix). The top 10 principal components were used to remove outliers for 5 rounds and then to correct for stratification. The EIGENSTRAT chi-square statistic was then calculated, and the one-tailed probability of the chi-squared distribution was calculated with Microsoft Excel's CHIDIST function with one degree of freedom.
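The CHIDIST step can be reproduced outside a spreadsheet. For one degree of freedom, the upper-tail probability of a chi-square statistic x equals erfc(√(x/2)), which the sketch below computes using only the Python standard library; the function name `chidist_1df` is hypothetical.

```python
import math

def chidist_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    For 1 df this equals erfc(sqrt(chi2 / 2)), the right-tailed probability
    that Excel's CHIDIST(chi2, 1) returns.
    """
    return math.erfc(math.sqrt(chi2 / 2.0))
```

As a check, a statistic of about 3.84 corresponds to a probability of 0.05.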
To determine the top candidate regions, we first reduced the number of candidate SNPs by using a P-value threshold: for Subset 1 (females) and Subset 2 (nephritis), SNPs with a P>2.0×10−5 were removed from further analysis, and for the Main Group (998 cases, 1346 controls), SNPs with a P>7.0×10−5 were removed from further analysis. In the females subset, 19 SNPs remained. In the nephritis subset, 35 SNPs remained. In the Main Group, 47 SNPs remained. Then, the linkage-disequilibrium (LD) region containing each SNP was determined by examining LD plots utilizing the HelixTree program (the online link can be accessed by typing “www.goldenhelix.com/pharmhelixtreefeatures” with “.html” as the suffix) (Golden Helix, Mont., USA). The EM algorithm was used to calculate D′ and r2 using only the genotypes of the cases and controls. Regions were delineated by eye, using D′≧˜0.9 as bounds.
Once each region was delineated, the genes in each region were looked at with an art-established genome browser (e.g., the UCSC Genome Browser, essentially described in Kuhn et al., Nucleic Acids Res. (2007), 35(database issue):D668-73; the online link can be accessed by typing “genome.ucsc” with “.edu” as the suffix, March 2006 assembly). Immune-specific gene expression, as determined in the IRIS study (Abbas et al., Genes and Immunity (2005), 6:319-331, including its online supplementary material) was examined. Top candidate regions were identified, for example, by the presence of immune-specific genes in a region. In the nephritis subset, 11 regions containing 20 candidate SNPs were chosen as likely to contain at least one risk allele for SLE (
This patent application claims priority to U.S. Patent Application Nos. 60/939,156, filed May 21, 2007, and 61/013,283, filed Dec. 12, 2007. The contents of these patent applications and all the references contained therein are hereby incorporated by reference in their entirety.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US08/64430 | 5/21/2008 | WO | 00 | 8/23/2010 |
Number | Date | Country | |
---|---|---|---|
60939156 | May 2007 | US | |
61013283 | Dec 2007 | US |