This invention relates to genetic polymorphisms useful for assessing the response to lipid-lowering drug therapy and adverse drug reactions to those medicaments. In addition it relates to genetic polymorphisms useful for assessing cardiovascular risks in humans, including, but not limited to, atherosclerosis, ischemia/reperfusion, hypertension, restenosis, arterial inflammation, myocardial infarction, and stroke. Specifically, the present invention identifies and describes gene variations which are individually present in humans with cardiovascular disease states, relative to humans with normal, or non-cardiovascular disease states, and/or in response to medications relevant to cardiovascular disease. Further, the present invention provides methods for the identification and therapeutic use of compounds as treatments of cardiovascular disease or as prophylactic therapy for cardiovascular diseases. Moreover, the present invention provides methods for the diagnostic monitoring of patients undergoing clinical evaluation for the treatment of cardiovascular disease, and for monitoring the efficacy of compounds in clinical trials. Still further, the present invention provides methods to use gene variations to predict personal medication schemes that avoid adverse drug reactions and allow an adjustment of the drug dose to achieve maximum benefit for the patient. Additionally, the present invention describes methods for the diagnostic evaluation and prognosis of various cardiovascular diseases, and for the identification of subjects exhibiting a predisposition to such conditions.
Cardiovascular disease is a major health risk throughout the industrialized world.
Cardiovascular diseases include, but are not limited to, the following disorders of the heart and the vascular system: congestive heart failure, myocardial infarction, atherosclerosis, ischemic diseases of the heart, coronary heart disease, all kinds of atrial and ventricular arrhythmias, hypertensive vascular diseases and peripheral vascular diseases.
Heart failure is defined as a pathophysiologic state in which an abnormality of cardiac function is responsible for the failure of the heart to pump blood at a rate commensurate with the requirement of the metabolizing tissue. It includes all forms of pumping failure such as high-output and low-output, acute and chronic, right-sided or left-sided, systolic or diastolic, independent of the underlying cause.
Myocardial infarction (MI) is generally caused by an abrupt decrease in coronary blood flow that follows a thrombotic occlusion of a coronary artery previously narrowed by arteriosclerosis. MI prophylaxis (primary and secondary prevention) is included as well as the acute treatment of MI and the prevention of complications.
Ischemic diseases are conditions in which the coronary flow is restricted, resulting in a perfusion which is inadequate to meet the myocardial requirement for oxygen. This group of diseases includes stable angina, unstable angina and asymptomatic ischemia.
Arrhythmias include all forms of atrial and ventricular tachyarrhythmias (atrial tachycardia, atrial flutter, atrial fibrillation, atrio-ventricular reentrant tachycardia, preexcitation syndrome, ventricular tachycardia, ventricular flutter, ventricular fibrillation) as well as bradycardic forms of arrhythmias.
Hypertensive vascular diseases include primary as well as all kinds of secondary arterial hypertension (renal, endocrine, neurogenic, others).
Peripheral vascular diseases are defined as vascular diseases in which arterial and/or venous flow is reduced, resulting in an imbalance between blood supply and tissue oxygen demand. They include chronic peripheral arterial occlusive disease (PAOD), acute arterial thrombosis and embolism, inflammatory vascular disorders, Raynaud's phenomenon and venous disorders.
Atherosclerosis, the most prevalent of vascular diseases, is the principal cause of heart attack, stroke, and gangrene of the extremities, and thereby the principal cause of death. Atherosclerosis is a complex disease involving many cell types and molecular factors (for a detailed review, see Ross, 1993, Nature 362: 801-809 and Lusis, A. J., Nature 407, 233-241 (2000)). The process, in normal circumstances a protective response to insults to the endothelium and smooth muscle cells (SMCs) of the wall of the artery, consists of the formation of fibrofatty and fibrous lesions or plaques, preceded and accompanied by inflammation. The advanced lesions of atherosclerosis may occlude the artery concerned, and result from an excessive inflammatory-fibroproliferative response to numerous different forms of insult. For example, shear stresses are thought to be responsible for the frequent occurrence of atherosclerotic plaques in regions of the circulatory system where turbulent blood flow occurs, such as branch points and irregular structures.
The first observable event in the formation of an atherosclerotic plaque occurs when blood-borne monocytes adhere to the vascular endothelial layer and transmigrate through to the sub-endothelial space. Adjacent endothelial cells at the same time produce oxidized low density lipoprotein (LDL). These oxidized LDLs are then taken up in large amounts by the monocytes through scavenger receptors expressed on their surfaces. In contrast to the regulated pathway by which native LDL (nLDL) is taken up by nLDL specific receptors, the scavenger pathway of uptake is not regulated by the monocytes.
These lipid-filled monocytes are called foam cells, and are the major constituent of the fatty streak. Interactions between foam cells and the endothelial and SMCs which surround them lead to a state of chronic local inflammation which can eventually lead to smooth muscle cell proliferation and migration, and the formation of a fibrous plaque. Such plaques occlude the blood vessel concerned and thus restrict the flow of blood, resulting in ischemia.
Ischemia is a condition characterized by a lack of oxygen supply in tissues of organs due to inadequate perfusion. Such inadequate perfusion can have a number of natural causes, including atherosclerotic or restenotic lesions, anemia, or stroke, to name a few. Many medical interventions, such as the interruption of the flow of blood during bypass surgery, also lead to ischemia. In addition to sometimes being caused by diseased cardiovascular tissue, ischemia may sometimes affect cardiovascular tissue, such as in ischemic heart disease. Ischemia may, however, occur in any organ that is suffering a lack of oxygen supply.
The most common cause of ischemia in the heart is atherosclerotic disease of epicardial coronary arteries. By reducing the lumen of these vessels, atherosclerosis causes an absolute decrease in myocardial perfusion in the basal state or limits appropriate increases in perfusion when the demand for flow is augmented. Coronary blood flow can also be limited by arterial thrombi, spasm, and, rarely, coronary emboli, as well as by ostial narrowing due to luetic aortitis. Congenital abnormalities, such as anomalous origin of the left anterior descending coronary artery from the pulmonary artery, may cause myocardial ischemia and infarction in infancy, but this cause is very rare in adults. Myocardial ischemia can also occur if myocardial oxygen demands are abnormally increased, as in severe ventricular hypertrophy due to hypertension or aortic stenosis. The latter can present with angina that is indistinguishable from that caused by coronary atherosclerosis. A reduction in the oxygen-carrying capacity of the blood, as in extremely severe anemia or in the presence of carboxy-hemoglobin, is a rare cause of myocardial ischemia. Not infrequently, two or more causes of ischemia will coexist, such as an increase in oxygen demand due to left ventricular hypertrophy and a reduction in oxygen supply secondary to coronary atherosclerosis.
The foregoing studies are aimed at defining the role of particular gene variations presumed to be involved in the disruption of normal cellular function leading to cardiovascular disease. However, such approaches cannot identify the full panoply of gene variations that are involved in the disease process.
At present, the only available treatments for cardiovascular disorders are pharmaceutical-based medications that are not targeted to an individual's actual defect; examples include angiotensin converting enzyme (ACE) inhibitors and diuretics for hypertension, insulin supplementation for non-insulin dependent diabetes mellitus (NIDDM), cholesterol reduction strategies for dyslipidaemia, anticoagulants, β blockers for cardiovascular disorders and weight reduction strategies for obesity. If targeted treatment strategies were available, it might be possible to predict the response to a particular regimen of therapy, which could markedly increase the effectiveness of such treatment. Although targeted therapy requires accurate diagnostic tests for disease susceptibility, once these tests are developed the opportunity to utilize targeted therapy will become widespread. Such diagnostic tests could initially serve to identify individuals at most risk of hypertension and could allow them to make changes in lifestyle or diet that would serve as preventative measures. The benefits of coupling the diagnostic tests with a system of targeted therapy could include a reduction in the dosage of administered drugs and thus in the unpleasant side effects suffered by an individual. In more severe cases a diagnostic test may suggest that earlier surgical intervention would be useful in preventing a further deterioration in condition.
It is an object of the invention to provide genetic diagnosis of predisposition or susceptibility for cardiovascular diseases. Another related object is to provide treatment to reduce or prevent or delay the onset of disease in those predisposed or susceptible to this disease. A further object is to provide means for carrying out this diagnosis.
Accordingly, a first aspect of the invention provides a method of diagnosis of disease in an individual, said method comprising determining one, various or all genotypes in said individual of the genes listed in the Examples.
In another aspect, the invention provides a method of identifying an individual predisposed or susceptible to a disease, said method comprising determining one, various or all genotypes in said individual of the genes listed in the Examples.
The invention is of advantage in that it enables diagnosis of a disease or of certain disease states via genetic analysis which can yield useable results before onset of disease symptoms, or before onset of severe symptoms. The invention is further of advantage in that it enables diagnosis of predisposition or susceptibility to a disease or of certain disease states via genetic analysis.
The invention may also be of use in confirming or corroborating the results of other diagnostic methods. The diagnosis of the invention may thus suitably be used either as an isolated technique or in combination with other methods and apparatus for diagnosis, in which latter case the invention provides a further test on which a diagnosis may be assessed.
The present invention stems from using allelic association as a method for genotyping individuals, allowing the investigation of the molecular genetic basis for cardiovascular diseases. In a specific embodiment the invention tests for the polymorphisms in the sequences of the genes listed in the Examples. The invention demonstrates a link between these polymorphisms and predispositions to cardiovascular diseases by showing that allele frequencies differ significantly when individuals with “bad” serum lipid levels are compared to individuals with “good” serum lipid levels. The meaning of “good” and “bad” serum lipid levels is defined in Table 1a.
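The comparison of allele frequencies between two groups can be illustrated with a minimal sketch. It assumes a biallelic site and uses a Pearson chi-square test on a 2x2 allele count table; the source does not state which statistical test was used, so the choice of test, the function names and the genotype data below are all illustrative assumptions.

```python
import math


def allele_counts(genotypes, allele_a, allele_b):
    """Count both alleles in a list of diploid genotypes such as 'AG'."""
    a = sum(g.count(allele_a) for g in genotypes)
    b = sum(g.count(allele_b) for g in genotypes)
    return a, b


def allelic_chi_square(group1_counts, group2_counts):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    on a 2x2 table of allele counts. Returns (chi2, p_value)."""
    a, b = group1_counts
    c, d = group2_counts
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("each group and each allele must be observed")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotypes, invented for illustration only.
bad_lipids = ["AA", "AG", "AA", "AG", "AA", "GG"]
good_lipids = ["GG", "AG", "GG", "GG", "AG", "GG"]

chi2, p = allelic_chi_square(
    allele_counts(bad_lipids, "A", "G"),
    allele_counts(good_lipids, "A", "G"),
)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

With realistic sample sizes, a small p-value would indicate that the allele frequencies differ between the two groups. A real study would also need to account for multiple testing and small cell counts, which this sketch does not attempt.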
Certain disease states would benefit, that is to say the suffering of the patient may be reduced or prevented or delayed, by administration of treatment or therapy in advance of disease appearance; this can be more reliably carried out if predisposition or susceptibility to disease can be diagnosed in advance.
Pharmacogenomics and Adverse Drug Reactions
Adverse drug reactions (ADRs) remain a major clinical problem. A recent meta-analysis suggested that in the USA in 1994, ADRs were responsible for 100,000 deaths, making them between the fourth and sixth commonest cause of death (Lazarou 1998, J. Am. Med. Assoc. 279:1200). Although these figures have been heavily criticized, they emphasize the importance of ADRs. Indeed, there is good evidence that ADRs account for 5% of all hospital admissions and increase the length of stay in hospital by two days at an increased cost of approximately $2500 per patient. ADRs are also one of the commonest causes of drug withdrawal, which has enormous financial implications for the pharmaceutical industry. ADRs, perhaps fortunately, only affect a minority of those taking a particular drug. Although factors that determine susceptibility are unclear in most cases, there is increasing interest in the role of genetic factors. Indeed, the role of inheritable variations in predisposing patients to ADRs has been appreciated since the late 1950s and early 1960s through the discovery of deficiencies in enzymes such as pseudocholinesterase (butyrylcholinesterase) and glucose-6-phosphate dehydrogenase (G6PD). More recently, with the first draft of the human genome just completed, there has been renewed interest in this area with the introduction of terms such as pharmacogenomics and toxicogenomics. Essentially, the aim of pharmacogenomics and pharmacogenetics is to produce personalized medicines, whereby the choice of drug class and dosage is tailored to an individual genotype. Thus, the term pharmacogenetics embraces both efficacy and toxicity.
The 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase inhibitors (“statins”) specifically inhibit the enzyme HMG-CoA reductase which catalyzes the rate limiting step in cholesterol biosynthesis. These drugs are effective in reducing the primary and secondary risk of coronary artery disease and coronary events, such as heart attack, in middle-aged and older men and women, in both diabetic and non-diabetic patients, and are often prescribed for patients with hyperlipidemia. Statins used in secondary prevention of coronary artery or heart disease significantly reduce the risk of stroke, total mortality and morbidity and attacks of myocardial ischemia; the use of statins is also associated with improvements in endothelial and fibrinolytic functions and decreased platelet thrombus formation.
The tolerability of these drugs during long term administration is an important issue. Adverse reactions involving skeletal muscle are not uncommon, and sometimes serious adverse reactions involving skeletal muscle such as myopathy and rhabdomyolysis may occur, requiring discontinuation of the drug. In addition an increase in serum creatine kinase (CK) may be a sign of a statin related adverse event. The extent of such adverse events can be read from the extent of the CK level increase (as compared to the upper limit of normal [ULN]).
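Expressing a CK measurement relative to the ULN can be sketched as follows. This is a minimal illustration only: the ULN depends on the laboratory and assay, so it is passed in as a parameter, and the grading cut-offs used in the example are hypothetical placeholders rather than values taken from this document.

```python
def ck_fold_over_uln(ck_value, uln_value):
    """Return the CK level as a multiple of the upper limit of normal.
    Both arguments must be in the same unit (for example U/L)."""
    if uln_value <= 0:
        raise ValueError("the upper limit of normal must be positive")
    return ck_value / uln_value


def grade_ck_elevation(fold, cutoffs):
    """Return how many of the supplied fold-ULN cut-offs are met or
    exceeded (0 means the value is below every cut-off)."""
    return sum(1 for cutoff in sorted(cutoffs) if fold >= cutoff)


# Hypothetical measurement and hypothetical cut-offs, for illustration.
fold = ck_fold_over_uln(ck_value=600.0, uln_value=200.0)
grade = grade_ck_elevation(fold, cutoffs=(1.0, 2.0, 5.0))
print(f"CK is {fold:.1f} x ULN, grade {grade}")
```

Keeping the cut-offs as an argument means that the same function can be reused with whatever definition of an elevated CK level a given study adopts.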
Occasionally arthralgia, alone or in association with myalgia, has been reported. Also an elevation of liver transaminases has been associated with statin administration.
It was shown that the drug response to statin therapy is a class effect, i.e. all known and presumably also all so far undiscovered statins share the same beneficial and harmful effects (Ucar, M. et al., Drug Safety 2000, 22:441). It follows that the discovery of diagnostic tools to predict the drug response to a single statin will also help to guide therapy with other statins.
The present invention provides diagnostic tests to predict the patient's individual response to statin therapy. Such responses include, but are not limited to, the extent of adverse drug reactions, the level of lipid lowering or the drug's influence on disease states. Those diagnostic tests may predict the response to statin therapy either alone or in combination with another diagnostic test or another drug regimen.
The present invention is based at least in part on the discovery that a specific allele of a polymorphic region of a so called “candidate gene” (as defined below) is associated with CVD or drug response.
For the present invention the following candidate genes were analyzed:
Lipid Metabolism
Numerous studies have shown a connection between serum lipid levels and cardiovascular diseases. Candidate genes falling into this group include, but are not limited to, genes of the cholesterol pathway, apolipoproteins and their modifying factors.
Coagulation
Ischemic diseases of the heart and in particular myocardial infarction may be caused by a thrombotic occlusion. Genes falling into this group include all genes of the coagulation cascade and their regulatory elements.
Inflammation
Complications of atherosclerosis are the most common causes of death in Western societies. In broad outline atherosclerosis can be considered to be a form of chronic inflammation resulting from the interaction between modified lipoproteins, monocyte-derived macrophages, T cells, and the normal cellular elements of the arterial wall. This inflammatory process can ultimately lead to the development of complex lesions, or plaques, that protrude into the arterial lumen. Finally plaque rupture and thrombosis result in the acute clinical complications of myocardial infarction and stroke (Glass et al., Cell 2001, 104:503-516).
It follows that all genes related to inflammatory processes, including but not limited to cytokines, cytokine receptors and cell adhesion molecules, are candidate genes for CVD.
Glucose and Energy Metabolism
As glucose and energy metabolism is interdependent with the metabolism of lipids (see above), the former pathways also contain candidate genes. Energy metabolism in general also relates to obesity, which is an independent risk factor for CVD (Melanson et al., Cardiol Rev 2001 9:202-207). In addition high blood glucose levels are associated with many microvascular and macrovascular complications and may therefore affect an individual's disposition to CVD (Duckworth, Curr Atheroscler Rep 2001, 3:383-391).
Hypertension
As hypertension is an independent risk factor for CVD, genes that are involved in the regulation of systolic and diastolic blood pressure also affect an individual's risk for CVD (Safar, Curr Opin Cardiol 2000, 15:258-263). Interestingly hypertension and diabetes (see above) appear to be interdependent, since hypertension is approximately twice as frequent in patients with diabetes compared with patients without the disease. Conversely, recent data suggest that hypertensive persons are more predisposed to the development of diabetes than are normotensive persons (Sowers et al., Hypertension 2001, 37:1053-1059).
Genes Related to Drug Response
These genes encode components of the pathways involved in the absorption, distribution, metabolism, excretion and toxicity (ADMET) of drugs. Prominent members of this group are the cytochrome P450 proteins, which catalyze many reactions involved in drug metabolism.
Unclassified Genes
As stated above, the mechanisms that lead to cardiovascular diseases or define the patient's individual response to drugs are not completely elucidated. Hence candidate genes that could not be assigned to the above listed categories were also analysed. The present invention is based at least in part on the discovery of polymorphisms that lie in genomic regions of unknown physiological function.
Results
After conducting an association study, we surprisingly found polymorphic sites in a number of candidate genes which show a strong correlation with the following phenotypes of the patients analysed:
“Healthy” as used herein refers to individuals that neither suffer from existing CVD, nor exhibit an increased risk for CVD through their serum lipid level profile.
“CVD prone” as used herein refers to individuals with existing CVD and/or a serum lipid profile that confers a high risk of developing CVD (see Table 1a for definitions of healthy and CVD prone serum lipid levels).
“High responder” as used herein refers to patients who benefit from relatively small amounts of a given drug.
“Low responder” as used herein refers to patients who need relatively high doses in order to obtain benefit from the medication.
“Tolerant patient” as used herein refers to individuals who can tolerate high doses of a medicament without exhibiting adverse drug reactions.
“ADR patient” as used herein refers to individuals who suffer from ADR or show clinical symptoms (like creatine kinase elevation in blood) even after receiving only minor doses of a medicament (see Table 1b for a detailed definition of drug response phenotypes).
Polymorphic sites in candidate genes that were found to be significantly associated with any of the above mentioned phenotypes will be referred to as “phenotype associated SNPs” (PA SNPs). The respective genomic loci that harbour PA SNPs will be referred to as “phenotype associated genes” (PA genes), irrespective of the actual function of this gene locus.
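The naming convention above can be pictured as a simple filtering step over association results. The following is a minimal sketch under stated assumptions: the record layout, the SNP identifiers, the p-values and the significance threshold are all hypothetical and serve only to show how PA SNPs and PA genes relate to each other; only the gene symbols appear in this document.

```python
from collections import namedtuple

# Hypothetical record layout for one tested SNP/phenotype pair.
Association = namedtuple("Association", "gene snp_id phenotype p_value")


def select_pa_snps(results, alpha):
    """Keep the associations whose p-value is below the threshold alpha."""
    return [r for r in results if r.p_value < alpha]


def pa_genes(pa_snps):
    """Return the sorted set of loci that harbour at least one PA SNP."""
    return sorted({r.gene for r in pa_snps})


# Invented values for illustration; these are not study results.
results = [
    Association("ABCA1", "snp_0001", "CVD", 0.003),
    Association("ABCA1", "snp_0002", "ADR", 0.400),
    Association("CETP", "snp_0003", "EFF", 0.020),
    Association("APOE", "snp_0004", "CVD", 0.700),
]

pa_snps = select_pa_snps(results, alpha=0.05)
print([r.snp_id for r in pa_snps])
print(pa_genes(pa_snps))
```

A locus counts as a PA gene as soon as one of its SNPs passes the threshold, regardless of the function of the locus, which mirrors the definition given in the text.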
In particular we surprisingly found PA SNPs associated with CVD, drug efficacy (EFF) or adverse drug reactions (ADR) in the following genes:
ABCA1: ATP-Binding Cassette, Sub-Family A (ABC1), Member 1
The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intracellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the ABC1 subfamily. Members of the ABC1 subfamily comprise the only major ABC subfamily found exclusively in multicellular eukaryotes. With cholesterol as its substrate, this protein functions as a cholesterol efflux pump in the cellular lipid removal pathway. Mutations in this gene have been associated with Tangier disease and familial high-density lipoprotein deficiency.
ABCB1: ATP-Binding Cassette, Sub-Family B (MDR/TAP), Member 1
The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MDR/TAP subfamily. Members of the MDR/TAP subfamily are involved in multidrug resistance. The protein encoded by this gene is an ATP-dependent drug efflux pump for xenobiotic compounds with broad substrate specificity. It is responsible for decreased drug accumulation in multidrug-resistant cells and often mediates the development of resistance to anticancer drugs. This protein also functions as a transporter in the blood-brain barrier.
ACACB: Acetyl-Coenzyme A Carboxylase Beta
Acetyl-CoA carboxylase (ACC) is a complex multifunctional enzyme system. ACC is a biotin-containing enzyme which catalyzes the carboxylation of acetyl-CoA to malonyl-CoA, the rate-limiting step in fatty acid synthesis. ACC-beta is thought to control fatty acid oxidation by means of the ability of malonyl-CoA to inhibit carnitine-palmitoyl-CoA transferase I, the rate-limiting step in fatty acid uptake and oxidation by mitochondria. ACC-beta may be involved in the regulation of fatty acid oxidation, rather than fatty acid biosynthesis. There is evidence for the presence of two ACC-beta isoforms.
ADRB3: Adrenergic, Beta-3-, Receptor
The ADRB3 gene product, beta-3-adrenergic receptor, is located mainly in adipose tissue and is involved in the regulation of lipolysis and thermogenesis. Beta adrenergic receptors are involved in the epinephrine and norepinephrine-induced activation of adenylate cyclase through the action of G proteins.
AKAP1: A Kinase (PRKA) Anchor Protein 1
The A-kinase anchor proteins (AKAPs) are a group of structurally diverse proteins, which have the common function of binding to the regulatory subunit of protein kinase A (PKA) and confining the holoenzyme to discrete locations within the cell. This gene encodes a member of the AKAP family. Alternative splicing of this gene results in 2 transcript variants encoding 2 isoforms with different sizes. Both of the isoforms bind to types I and II regulatory subunits of PKA and anchor them to mitochondria. As compared to the longer isoform, the shorter isoform lacks a K-homologous motif, which is an RNA-binding domain typically associated with proteins involved in RNA catalysis, mRNA processing, or translation. The longer isoform is speculated to be involved in the cAMP-dependent signal transduction pathway and in directing RNA to a specific cellular compartment. The function of the shorter isoform has not been determined.
AKAP10: A Kinase (PRKA) Anchor Protein 10
The A-kinase anchor proteins (AKAPs) are a group of structurally diverse proteins, which have the common function of binding to the regulatory subunit of protein kinase A (PKA) and confining the holoenzyme to discrete locations within the cell. This gene encodes a member of the AKAP family. The encoded protein interacts with both the type I and type II regulatory subunits of PKA; therefore, it is a dual-specific AKAP. This protein is highly enriched in mitochondria. It contains RGS (regulator of G protein signalling) domains, in addition to a PKA-RII subunit-binding domain. The mitochondrial localization and the presence of RGS domains may have important implications for the function of this protein in PKA and G protein signal transduction.
AKAP13: A Kinase (PRKA) Anchor Protein 13
The A-kinase anchor proteins (AKAPs) are a group of structurally diverse proteins, which have the common function of binding to the regulatory subunit of protein kinase A (PKA) and confining the holoenzyme to discrete locations within the cell. This gene encodes a member of the AKAP family. Alternative splicing of this gene results in at least 3 transcript variants encoding different isoforms containing a dbl oncogene homology (DH) domain and a pleckstrin homology (PH) domain. The DH domain is associated with guanine nucleotide exchange activation for the Rho/Rac family of small GTP binding proteins, resulting in the conversion of the inactive GTPase to the active form capable of transducing signals. The PH domain has multiple functions. Therefore, these isoforms function as scaffolding proteins to coordinate a Rho signaling pathway and, in addition, function as protein kinase A-anchoring proteins.
AMPD1: Adenosine Monophosphate Deaminase 1 (Isoform M)
Adenosine monophosphate deaminase 1 catalyzes the deamination of AMP to IMP in skeletal muscle and plays an important role in the purine nucleotide cycle. Two other genes have been identified, AMPD2 and AMPD3, for the liver- and erythrocyte-specific isoforms, respectively. Deficiency of the muscle-specific enzyme is apparently a common cause of exercise-induced myopathy and probably the most common cause of metabolic myopathy in the human.
APOE: Apolipoprotein E
Chylomicron remnants and very low density lipoprotein (VLDL) remnants are rapidly removed from the circulation by receptor-mediated endocytosis in the liver. Apolipoprotein E, a main apoprotein of the chylomicron, binds to a specific receptor on liver cells and peripheral cells. ApoE is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. The APOE gene is mapped to chromosome 19 in a cluster with APOC1 and APOC2. Defects in apolipoprotein E result in familial dysbetalipoproteinemia, or type III hyperlipoproteinemia (HLP III), in which increased plasma cholesterol and triglycerides are the consequence of impaired clearance of chylomicron and VLDL remnants.
APOM: Apolipoprotein M
The protein encoded by this gene is an apolipoprotein and member of the lipocalin protein family. It is found associated with high density lipoproteins and to a lesser extent with low density lipoproteins and triglyceride-rich lipoproteins. The encoded protein is secreted through the plasma membrane but remains membrane-bound, where it is involved in lipid transport. Two transcript variants encoding two different isoforms have been found for this gene, but only one of them has been fully characterized.
ARHGAP1: Rho GTPase Activating Protein 1
GTPase-activating protein for rho, rac and Cdc42Hs; has an SH3 binding domain
ATP1A2: ATPase, Na+/K+ Transporting, Alpha 2 (+) Polypeptide
ATP2A1: ATPase, Ca++ Transporting, Cardiac Muscle, Fast Twitch 1
This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol to the sarcoplasmic reticulum lumen, and is involved in muscular excitation and contraction. Mutations in this gene cause some autosomal recessive forms of Brody disease, characterized by increasing impairment of muscular relaxation during exercise. Alternative splicing results in two transcript variants encoding different isoforms.
BAT3: HLA-B Associated Transcript 3
A cluster of genes, BAT1-BAT5, has been localized in the vicinity of the genes for TNF alpha and TNF beta. These genes are all within the human major histocompatibility complex class III region. The protein encoded by this gene is a nuclear protein. It has been implicated in the control of apoptosis and regulating heat shock protein. There are three alternatively spliced transcript variants described for this gene.
BAT4: HLA-B Associated Transcript 4
A cluster of genes, BAT1-BAT5, has been localized in the vicinity of the genes for TNF alpha and TNF beta. These genes are all within the human major histocompatibility complex class III region. The protein encoded by this gene is thought to be involved in some aspects of immunity.
BAT5: HLA-B Associated Transcript 5
A cluster of genes, BAT1-BAT5, has been localized in the vicinity of the genes for TNF alpha and TNF beta. These genes are all within the human major histocompatibility complex class III region. The protein encoded by this gene is thought to be involved in some aspects of immunity.
BRD3: Bromodomain Containing 3
This gene was identified based on its homology to the gene encoding the RING3 protein, a serine/threonine kinase. The gene localizes to 9q34, a region which contains several major histocompatibility complex (MHC) genes. The function of the encoded protein is not known.
CDC42BPB: CDC42 Binding Protein Kinase Beta (DMPK-Like)
The protein encoded by this gene is a member of the Ser/Thr protein kinase family. This protein contains a Cdc42/Rac-binding p21 binding domain resembling that of PAK kinase. The kinase domain of this protein is most closely related to that of myotonic dystrophy kinase-related ROK. Studies of the similar gene in rat suggested that this kinase may act as a downstream effector of Cdc42 in cytoskeletal reorganization.
CDC42EP2: CDC42 Effector Protein (Rho GTPase Binding) 2
CDC42, a small Rho GTPase, regulates the formation of F-actin-containing structures through its interaction with the downstream effector proteins. The protein encoded by this gene is a member of the Borg family of CDC42 effector proteins. Borg family proteins contain a CRIB (Cdc42/Rac interactive-binding) domain. They bind to, and negatively regulate the function of, CDC42. Coexpression of this protein with dominant negative mutant CDC42 protein in fibroblast was found to induce pseudopodia formation, which suggested a role of this protein in actin filament assembly and cell shape control.
CDC42EP3: CDC42 Effector Protein (Rho GTPase Binding) 3
CDC42, a small Rho GTPase, regulates the formation of F-actin-containing structures through its interaction with the downstream effector proteins. The protein encoded by this gene is a member of the Borg family of CDC42 effector proteins. Borg family proteins contain a CRIB (Cdc42/Rac interactive-binding) domain. They bind to, and negatively regulate the function of, CDC42. This protein can interact with CDC42, as well as with the ras homolog gene family, member Q (ARHQ/TC10). Expression of this protein in fibroblasts has been shown to induce pseudopodia formation.
CDC42EP4: CDC42 Effector Protein (Rho GTPase Binding) 4
The product of this gene is a member of the CDC42-binding protein family. Members of this family interact with Rho family GTPases and regulate the organization of the actin cytoskeleton. This protein has been shown to bind both CDC42 and TC10 GTPases in a GTP-dependent manner. When overexpressed in fibroblasts, this protein was able to induce pseudopodia formation, which suggested a role in inducing actin filament assembly and cell shape control.
CENPC1: Centromere Protein C 1
Centromere protein C 1 is a centromere autoantigen and a component of the inner kinetochore plate. The protein is required for maintaining proper kinetochore size and a timely transition to anaphase. A putative pseudogene exists on chromosome 12.
CETP: Cholesteryl Ester Transfer Protein, Plasma
Cholesteryl ester transfer protein (CETP) transfers cholesteryl esters between lipoproteins. CETP may affect susceptibility to atherosclerosis.
CPB2: Carboxypeptidase B2 (Plasma, Carboxypeptidase U)
Carboxypeptidases are enzymes that hydrolyze C-terminal peptide bonds. The carboxypeptidase family includes metallo-, serine, and cysteine carboxypeptidases. According to their substrate specificity, these enzymes are referred to as carboxypeptidase A (cleaving aliphatic residues) or carboxypeptidase B (cleaving basic amino acid residues). The protein encoded by this gene is activated by trypsin and acts on carboxypeptidase B substrates. After thrombin activation, the mature protein downregulates fibrinolysis. Polymorphisms have been described for this gene and its promoter region. Available sequence data analyses indicate splice variants that encode different isoforms.
CROT: Carnitine O-Octanoyltransferase
CSF2: Colony Stimulating Factor 2 (Granulocyte-Macrophage) IL3: Interleukin 3 (Colony-Stimulating Factor, Multiple)
The protein encoded by this gene is a cytokine that controls the production, differentiation, and function of granulocytes and macrophages. The active form of the protein is found extracellularly as a homodimer. This gene has been localized to a cluster of related genes at chromosome region 5q31, which is known to be associated with interstitial deletions in the 5q-syndrome and acute myelogenous leukemia. Other genes in the cluster include those encoding interleukins 4, 5, and 13.
DFNA5: Deafness, Autosomal Dominant 5
Hearing impairment is a heterogeneous condition with over 40 loci described. The protein encoded by this gene is expressed in fetal cochlea; however, its function is not known. Nonsyndromic hearing impairment is associated with a mutation in this gene.
F2: Coagulation Factor II (Thrombin)
Coagulation factor II is proteolytically cleaved to form thrombin in the first step of the coagulation cascade, which ultimately results in the stemming of blood loss. F2 also plays a role in maintaining vascular integrity during development and postnatal life. Mutations in F2 lead to various forms of thrombosis and dysprothrombinemia.
FKBP1A: FK506 Binding Protein 1A, 12 kDa
The protein encoded by this gene is a member of the immunophilin protein family, which plays a role in immunoregulation and basic cellular processes involving protein folding and trafficking. This encoded protein is a cis-trans prolyl isomerase that binds the immunosuppressants FK506 and rapamycin. It interacts with several intracellular signal transduction proteins including type I TGF-beta receptor. It also interacts with multiple intracellular calcium release channels including the tetrameric skeletal muscle ryanodine receptor. In mouse, deletion of the homologous gene causes a congenital heart disorder known as noncompaction of left ventricular myocardium. There is evidence of multiple alternatively spliced transcript variants for this gene, but the full length nature of some variants has not been determined.
FYN: FYN Oncogene Related to SRC, FGR, YES
This gene is a member of the protein-tyrosine kinase oncogene family. It encodes a membrane-associated tyrosine kinase that has been implicated in the control of cell growth. The protein associates with the p85 subunit of phosphatidylinositol 3-kinase and interacts with the fyn-binding protein. Alternatively spliced transcript variants encoding distinct isoforms exist.
GHR: Growth Hormone Receptor
Biologically active growth hormone (MIM 139250) binds its transmembrane receptor (GHR), which dimerizes to activate an intracellular signal transduction pathway leading to synthesis and secretion of insulin-like growth factor I (IGF1; MIM 147440). In plasma, IGF1 binds to the soluble IGF1 receptor (IGF1R; MIM 147370). At target cells, this complex activates signal-transduction pathways that result in the mitogenic and anabolic responses that lead to growth. [supplied by OMIM]
HSPA9B: Heat Shock 70 kDa Protein 9B (Mortalin-2)
The product encoded by this gene belongs to the heat shock protein 70 family which contains both heat-inducible and constitutively expressed members. The latter are called heat-shock cognate proteins. This gene encodes a heat-shock cognate protein. This protein plays a role in the control of cell proliferation. It may also act as a chaperone.
IQGAP1: IQ Motif Containing GTPase Activating Protein 1
IQGAP2: IQ Motif Containing GTPase Activating Protein 2
LAG3: Lymphocyte-Activation Gene 3
Lymphocyte-activation protein 3 belongs to the Ig superfamily and contains 4 extracellular Ig-like domains. The LAG3 gene contains 8 exons. The sequence data, exon/intron organization, and chromosomal localization all indicate a close relationship of LAG3 to CD4.
LCAT: Lecithin-Cholesterol Acyltransferase
This gene encodes the extracellular cholesterol esterifying enzyme, lecithin-cholesterol acyltransferase. The esterification of cholesterol is required for cholesterol transport. Mutations in this gene have been found to cause fish-eye disease as well as LCAT deficiency.
LCP2: Lymphocyte Cytosolic Protein 2 (SH2 Domain Containing Leukocyte Protein of 76 kDa)
SLP-76 was originally identified as a substrate of the ZAP-70 protein tyrosine kinase following T cell receptor (TCR) ligation in the leukemic T cell line Jurkat. The SLP-76 locus has been localized to human chromosome 5q33 and the gene structure has been partially characterized in mice. The human and murine cDNAs both encode 533 amino acid proteins that are 72% identical and comprised of three modular domains. The NH2-terminus contains an acidic region that includes a PEST domain and several tyrosine residues which are phosphorylated following TCR ligation. SLP-76 also contains a central proline-rich domain and a COOH-terminal SH2 domain. A number of additional proteins have been identified that associate with SLP-76 both constitutively and inducibly following receptor ligation, supporting the notion that SLP-76 functions as an adaptor or scaffold protein. Studies using SLP-76 deficient T cell lines or mice have provided strong evidence that SLP-76 plays a positive role in promoting T cell development and ac
LIF: Leukemia Inhibitory Factor (Cholinergic Differentiation Factor)
Leukemia inhibitory factor is a cytokine that induces macrophage differentiation. Neurotransmitters and neuropeptides, well known for their role in the communication between neurons, are also capable of activating monocytes and macrophages and inducing chemotaxis in immune cells. LIF signals through different receptors and transcription factors. LIF in conjunction with BMP2 acts in synergy on primary fetal neural progenitor cells to induce astrocytes.
LIMK1: LIM Domain Kinase 1
There are approximately 40 known eukaryotic LIM proteins, so named for the LIM domains they contain. LIM domains are highly conserved cysteine-rich structures containing 2 zinc fingers. Although zinc fingers usually function by binding to DNA or RNA, the LIM motif probably mediates protein-protein interactions. LIM kinase-1 and LIM kinase-2 belong to a small subfamily with a unique combination of 2 N-terminal LIM motifs and a C-terminal protein kinase domain. LIMK1 is likely to be a component of an intracellular signaling pathway and may be involved in brain development. LIMK1 hemizygosity is implicated in the impaired visuospatial constructive cognition of Williams syndrome. Two splice variants have been identified.
LIPA: Lipase A, Lysosomal Acid, Cholesterol Esterase (Wolman Disease)
LIPA encodes lipase A, the lysosomal acid lipase (also known as cholesteryl ester hydrolase). This enzyme functions in the lysosome to catalyze the hydrolysis of cholesteryl esters and triglycerides. Mutations in LIPA can result in Wolman disease and cholesteryl ester storage disease.
LPA: Lipoprotein, Lp(a)
LPL: Lipoprotein Lipase
LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism.
LTA: Lymphotoxin Alpha (TNF Superfamily, Member 1)
Lymphotoxin alpha, a member of the tumor necrosis factor family, is a cytokine produced by lymphocytes. LTA is highly inducible, secreted, and exists as a homotrimeric molecule. LTA forms heterotrimers with lymphotoxin-beta, which anchors lymphotoxin-alpha to the cell surface. LTA mediates a large variety of inflammatory, immunostimulatory, and antiviral responses. LTA is also involved in the formation of secondary lymphoid organs during development and plays a role in apoptosis.
MTND4L: NADH Dehydrogenase 4L
NDUFA6: NADH Dehydrogenase (Ubiquinone) 1 Alpha Subcomplex, 6, 14 kDa
NDUFB10: NADH Dehydrogenase (Ubiquinone) 1 Beta Subcomplex, 10, 22 kDa
Subunit of NADH-ubiquinone oxidoreductase (complex I); transports electrons from NADH to ubiquinone
NDUFB5: NADH Dehydrogenase (Ubiquinone) 1 Beta Subcomplex, 5, 16 kDa
The protein encoded by this gene is a subunit of the multisubunit NADH:ubiquinone oxidoreductase (complex I). Mammalian complex I is composed of 45 different subunits. It is located at the mitochondrial inner membrane. This protein has NADH dehydrogenase activity and oxidoreductase activity. It transfers electrons from NADH to the respiratory chain. The immediate electron acceptor for the enzyme is believed to be ubiquinone.
NDUFC2: NADH Dehydrogenase (Ubiquinone) 1, Subcomplex Unknown, 2, 14.5 kDa
Subunit of NADH-ubiquinone oxidoreductase (complex I); transports electrons from NADH to ubiquinone
NF1: Neurofibromin 1 (Neurofibromatosis, Von Recklinghausen Disease, Watson Disease)
Mutations linked to neurofibromatosis type 1 led to the identification of NF1. NF1 encodes the protein neurofibromin, which appears to be a negative regulator of the ras signal transduction pathway. In addition to type 1 neurofibromatosis, mutations in NF1 can also lead to juvenile myelomonocytic leukemia. Alternatively spliced NF1 mRNA transcripts have been isolated, although their functions, if any, remain unclear.
GRAF: GTPase Regulator Associated with Focal Adhesion Kinase Pp125(FAK)
SPC25: AD024-Protein
TOSO: Regulator of Fas-Induced Apoptosis
ZNF202: Zinc Finger Protein 202
PAK2: P21 (CDKN1A)-Activated Kinase 2
The p21 activated kinases (PAK) are critical effectors that link Rho GTPases to cytoskeleton reorganization and nuclear signaling. The PAK proteins are a family of serine/threonine kinases that serve as targets for the small GTP binding proteins, CDC42 and RAC1, and have been implicated in a wide range of biological activities. The protein encoded by this gene is activated by proteolytic cleavage during caspase-mediated apoptosis, and may play a role in regulating the apoptotic events in the dying cell.
PDCD6IP: Programmed Cell Death 6 Interacting Protein
This gene encodes a protein thought to participate in programmed cell death. Studies using mouse cells have shown that overexpression of this protein can block apoptosis. In addition, the product of this gene binds to the product of the PDCD6 gene, a protein required for apoptosis, in a calcium-dependent manner. This gene product also binds to endophilins, proteins that regulate membrane shape during endocytosis. Overexpression of this gene product and endophilins results in cytoplasmic vacuolization which may be partly responsible for the protection against cell death.
PDE4D: Phosphodiesterase 4D, cAMP-Specific (Phosphodiesterase E3 Dunce Homolog, Drosophila)
cAMP-specific phosphodiesterase 4D; has similarity to Drosophila dnc, which is the affected protein in the learning and memory mutant dunce
PDGFRA: Platelet-Derived Growth Factor Receptor, Alpha Polypeptide
This gene encodes a cell surface tyrosine kinase receptor for members of the platelet-derived growth factor family. These growth factors are mitogens for cells of mesenchymal origin. The identity of the growth factor bound to a receptor monomer determines whether the functional receptor is a homodimer or a heterodimer, composed of both platelet-derived growth factor receptor alpha and beta polypeptides. Studies in knockout mice, where homozygosity is lethal, indicate that the alpha form of the platelet-derived growth factor receptor is particularly important for kidney development since mice heterozygous for the receptor exhibit defective kidney phenotypes.
PFKM: Phosphofructokinase, Muscle
PLA2G4C: Phospholipase A2, Group IVC (Cytosolic, Calcium-Independent)
PLP1: Proteolipid Protein 1 (Pelizaeus-Merzbacher Disease, Spastic Paraplegia 2, Uncomplicated)
PPP1R12C: Protein Phosphatase 1, Regulatory (Inhibitor) Subunit 12C
Low similarity to MYPT2
PRKAR2B: Protein Kinase, cAMP-Dependent, Regulatory, Type II, Beta
PRKCB1: Protein Kinase C, Beta 1
PTK2B: PTK2B Protein Tyrosine Kinase 2 Beta
This gene encodes a cytoplasmic protein tyrosine kinase which is involved in calcium-induced regulation of ion channels and activation of the MAP kinase signaling pathway. The encoded protein may represent an important signaling intermediate between neuropeptide-activated receptors or neurotransmitters that increase calcium flux and the downstream signals that regulate neuronal activity. The encoded protein undergoes rapid tyrosine phosphorylation and activation in response to increases in the intracellular calcium concentration, nicotinic acetylcholine receptor activation, membrane depolarization, or protein kinase C activation. This protein has been shown to bind CRK-associated substrate, nephrocystin, GTPase regulator associated with FAK, and the SH2 domain of GRB2. The encoded protein is a member of the FAK subfamily of protein tyrosine kinases but lacks significant sequence similarity to kinases from other subfamilies. Four transcript variants encoding two different isoforms have been found for this gene.
PYGM: Phosphorylase, Glycogen; Muscle (McArdle Syndrome, Glycogen Storage Disease Type V)
RABGGTA: Rab Geranylgeranyltransferase, Alpha Subunit
RYR1: Ryanodine Receptor 1 (Skeletal)
RYR3: Ryanodine Receptor 3
SCARB1: Scavenger Receptor Class B, Member 1
SCO2: SCO Cytochrome Oxidase Deficient Homolog 2 (Yeast)
Mammalian cytochrome c oxidase (COX) catalyzes the transfer of reducing equivalents from cytochrome c to molecular oxygen and pumps protons across the inner mitochondrial membrane. In yeast, 2 related COX assembly genes, SCO1 and SCO2 (synthesis of cytochrome c oxidase), enable subunits 1 and 2 to be incorporated into the holoprotein. This gene is the human homolog of the yeast SCO2 gene.
SELE: Selectin E (Endothelial Adhesion Molecule 1)
The endothelial leukocyte adhesion molecule-1 is expressed by cytokine-stimulated endothelial cells. It is thought to be responsible for the accumulation of blood leukocytes at sites of inflammation by mediating the adhesion of cells to the vascular lining. It exhibits structural features such as the presence of lectin- and EGF-like domains followed by short consensus repeat (SCR) domains that contain 6 conserved cysteine residues. These proteins are part of the selectin family of cell adhesion molecules. This gene is present in single copy in the human genome and contains 14 exons spanning about 13 kb of DNA. Adhesion molecules participate in the interaction between leukocytes and the endothelium and appear to be involved in the pathogenesis of atherosclerosis.
SEPP1: Selenoprotein P, Plasma, 1
Selenoprotein P is an extracellular glycoprotein and is the only selenoprotein known to contain multiple selenocysteine residues. Two isoforms of this protein are Sep51 and Sep61. Sep51 lacks part of the C-terminal sequence. Selenoprotein P binds heparin and associates with endothelial cells. It is implicated as an oxidant defense in the extracellular space and in the transport of selenium.
SERPINA1: Serine (or Cysteine) Proteinase Inhibitor, Clade A (Alpha-1 Antiproteinase, Antitrypsin), Member 1
Alpha-1-antitrypsin is a protease inhibitor, deficiency of which is associated with emphysema and liver disease. The protein is encoded by a gene (PI) located on the distal long arm of chromosome 14. [supplied by OMIM]
SERPINA5: Serine (or Cysteine) Proteinase Inhibitor, Clade A (Alpha-1 Antiproteinase, Antitrypsin), Member 5
SERPINB2: Serine (or Cysteine) Proteinase Inhibitor, Clade B (Ovalbumin), Member 2
SLC6A8: Solute Carrier Family 6 (Neurotransmitter Transporter, Creatine), Member 8
Sodium and chloride-dependent creatine transporter; member of neurotransmitter transporter family
SSA1: Sjogren Syndrome Antigen A1 (52 kDa, Ribonucleoprotein Autoantigen SS-A/Ro)
The protein encoded by this gene is a member of the tripartite motif (TRIM) family. The TRIM motif includes three zinc-binding domains, a RING, a B-box type 1 and a B-box type 2, and a coiled-coil region. This protein is part of the RoSSA ribonucleoprotein which includes a single polypeptide and one of four small RNA molecules. The RoSSA particle localizes to both the cytoplasm and the nucleus. RoSSA interacts with autoantigens in patients with Sjogren syndrome and systemic lupus erythematosus. The function of the RoSSA particle has not been determined. Two alternatively spliced transcript variants for this gene have been described; however, the full length nature of one variant has not been determined.
STCH: Stress 70 Protein Chaperone, Microsome-Associated, 60 kDa
SULT1A2: Sulfotransferase Family, Cytosolic, 1A, Phenol-Preferring, Member 2
Sulfotransferase enzymes catalyze the sulfate conjugation of many hormones, neurotransmitters, drugs, and xenobiotic compounds. These cytosolic enzymes are different in their tissue distributions and substrate specificities. The gene structure (number and length of exons) is similar among family members. This gene encodes one of two phenol sulfotransferases with thermostable enzyme activity. Two alternatively spliced variants that encode the same protein have been described.
SYK: Spleen Tyrosine Kinase
TAP1: Transporter 1, ATP-Binding Cassette, Sub-Family B (MDR/TAP)
The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MDR/TAP subfamily. Members of the MDR/TAP subfamily are involved in multidrug resistance. The protein encoded by this gene is involved in the pumping of degraded cytosolic peptides across the endoplasmic reticulum into the membrane-bound compartment where class I molecules assemble. Mutations in this gene may be associated with ankylosing spondylitis, insulin-dependent diabetes mellitus, and celiac disease.
TAP2: Transporter 2, ATP-Binding Cassette, Sub-Family B (MDR/TAP)
The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MDR/TAP subfamily. Members of the MDR/TAP subfamily are involved in multidrug resistance. This gene is located 7 kb telomeric to gene family member ABCB2. The protein encoded by this gene is involved in antigen presentation. This protein forms a heterodimer with ABCB2 in order to transport peptides from the cytoplasm to the endoplasmic reticulum. Mutations in this gene may be associated with ankylosing spondylitis, insulin-dependent diabetes mellitus, and celiac disease. Alternative splicing of this gene produces two products which differ in peptide selectivity and level of restoration of surface expression of MHC class I molecules.
THBD: Thrombomodulin
TRIM28: Tripartite Motif-Containing 28
TRIP10: Thyroid Hormone Receptor Interactor 10
Similar to the non-kinase domains of FER and Fes/Fps tyrosine kinases; binds to activated Cdc42 and may regulate actin cytoskeleton; contains an SH3 domain
UGT2B15: UDP Glycosyltransferase 2 Family, Polypeptide B15
VEGF: Vascular Endothelial Growth Factor
Many polypeptide mitogens, such as basic fibroblast growth factor (MIM 134920) and platelet-derived growth factors (MIM 173430, MIM 190040), are active on a wide range of different cell types. In contrast, vascular endothelial growth factor is a mitogen primarily for vascular endothelial cells. It is, however, structurally related to platelet-derived growth factor.
WASL: Wiskott-Aldrich Syndrome-Like
The Wiskott-Aldrich syndrome (WAS) family of proteins shares a similar domain structure, and its members are involved in transduction of signals from receptors on the cell surface to the actin cytoskeleton. The presence of a number of different motifs suggests that they are regulated by a number of different stimuli, and interact with multiple proteins. Recent studies have demonstrated that these proteins, directly or indirectly, associate with the small GTPase, Cdc42, known to regulate formation of actin filaments, and the cytoskeletal organizing complex, Arp2/3. The WASL gene product is a homolog of WAS protein; however, unlike the latter, it is ubiquitously expressed and shows highest expression in neural tissues. It has been shown to bind Cdc42 directly, and induce formation of long actin microspikes.
CACNA2D2: Calcium Channel, Voltage-Dependent, Alpha 2/Delta Subunit 2
TFAP2B: Transcription Factor AP-2 Beta (Activating Enhancer Binding Protein 2 Beta)
TRIT1: tRNA Isopentenyltransferase 1
This enzyme modifies both cytoplasmic and mitochondrial tRNAs at A(37) to give isopentenyl A(37).
UGT2A1: UDP Glycosyltransferase 2 Family, Polypeptide A1
Because PA SNPs are linked to other SNPs in neighboring genes on a chromosome (linkage disequilibrium), those SNPs could also be used as marker SNPs. A recent publication showed that SNPs can be linked over 100 kb, and in some cases over more than 150 kb (Reich D. E. et al. Nature 411, 199-204, 2001). Hence SNPs lying in regions neighboring PA SNPs could be linked to the latter and thereby serve as diagnostic markers. These association analyses could be performed as described for the gene polymorphisms in the methods.
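The degree to which two SNPs are linked is commonly expressed by the linkage disequilibrium measures D, D′ and r². The following Python sketch computes these measures from phased two-locus haplotype counts. The function name, the input layout and the allele labels A/a and B/b are illustrative assumptions and are not taken from the methods described herein.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r_squared) for two biallelic SNPs.

    The arguments are counts of the four phased haplotypes, where A/a are
    the alleles at the first SNP and B/b the alleles at the second SNP.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")

    p_AB = n_AB / total            # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total    # frequency of allele A
    p_B = (n_AB + n_aB) / total    # frequency of allele B

    # D: deviation of the observed haplotype frequency from independence
    D = p_AB - p_A * p_B

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")

    # D': D scaled by the largest magnitude it can take for these allele frequencies
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = D / d_max

    # r^2: squared correlation between the alleles at the two SNPs
    r_squared = D * D / denom
    return D, d_prime, r_squared
```

A marker SNP in complete linkage with a PA SNP, e.g. `linkage_disequilibrium(50, 0, 0, 50)`, yields r² = 1, whereas independent SNPs yield values near 0.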
Definitions
For convenience, the meanings of certain terms and phrases employed in the specification, examples, and appended claims are provided below. Moreover, the definitions themselves are intended to explain further background of the invention.
The term “allele”, which is used interchangeably herein with “allelic variant” refers to alternative forms of a gene or portions thereof. Alleles occupy the same locus or position on homologous chromosomes. When a subject has two identical alleles of a gene, the subject is said to be homozygous for the gene or allele. When a subject has two different alleles of a gene, the subject is said to be heterozygous for the gene. Alleles of a specific gene can differ from each other in a single nucleotide, or several nucleotides, and can include substitutions, deletions, and insertions of nucleotides. An allele of a gene can also be a form of a gene containing a mutation.
The term “allelic variant of a polymorphic region of a gene” refers to a region of a gene having one of several nucleotide sequences found in that region of the gene in other individuals.
“Homology” or “identity” or “similarity” refers to sequence similarity between two peptides or between two nucleic acid molecules. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. An “unrelated” or “non-homologous” sequence shares less than 40% identity, though preferably less than 25% identity, with one of the sequences of the present invention.
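The position-by-position comparison described above can be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned to equal length, with "-" marking gaps. The function name and the treatment of gap columns are illustrative assumptions, since the specification does not fix how gaps are counted.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of aligned positions occupied by the same base or amino acid.

    Both sequences must be pre-aligned and of equal length. Columns in which
    both sequences carry a gap are ignored; a gap opposite a residue is
    counted as a mismatch (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

Under the definition above, a pair of sequences for which this value falls below 40 (preferably below 25) would be regarded as unrelated or non-homologous.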
The term “a homologue of a nucleic acid” refers to a nucleic acid having a nucleotide sequence having a certain degree of homology with the nucleotide sequence of the nucleic acid or complement thereof. A homologue of a double stranded nucleic acid having SEQ ID NO. X is intended to include nucleic acids having a nucleotide sequence which has a certain degree of homology with SEQ ID NO. X or with the complement thereof. Preferred homologues of nucleic acids are capable of hybridizing to the nucleic acid or complement thereof.
The term “interact” as used herein is meant to include detectable interactions between molecules, such as can be detected using, for example, a hybridization assay.
The term interact is also meant to include “binding” interactions between molecules. Interactions may be, for example, protein-protein, protein-nucleic acid, protein-small molecule or small molecule-nucleic acid in nature.
The term “intronic sequence” or “intronic nucleotide sequence” refers to the nucleotide sequence of an intron or portion thereof.
The term “isolated” as used herein with respect to nucleic acids, such as DNA or RNA, refers to molecules separated from other DNAs or RNAs, respectively, that are present in the natural source of the macromolecule. The term isolated as used herein also refers to a nucleic acid or peptide that is substantially free of cellular material, viral material, or culture medium when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized.
Moreover, an “isolated nucleic acid” is meant to include nucleic acid fragments which are not naturally occurring as fragments and would not be found in the natural state. The term “isolated” is also used herein to refer to polypeptides which are isolated from other cellular proteins and is meant to encompass both purified and recombinant polypeptides.
The term “lipid” shall refer to a fat or fat-like substance that is insoluble in polar solvents such as water. The term “lipid” is intended to include true fats (e.g. esters of fatty acids and glycerol); lipids (phospholipids, cerebrosides, waxes); sterols (cholesterol, ergosterol) and lipoproteins (e.g. HDL, LDL and VLDL).
The term “locus” refers to a specific position in a chromosome. For example, a locus of a gene refers to the chromosomal position of the gene.
The term “modulation” as used herein refers to both up-regulation (i.e., activation or stimulation), for example by agonizing, and down-regulation (i.e., inhibition or suppression), for example by antagonizing, of a bioactivity (e.g., expression of a gene).
The term “molecular structure” of a gene or a portion thereof refers to the structure as defined by the nucleotide content (including deletions, substitutions, additions of one or more nucleotides), the nucleotide sequence, the state of methylation, and/or any other modification of the gene or portion thereof.
The term “mutated gene” refers to an allelic form of a gene, which is capable of altering the phenotype of a subject having the mutated gene relative to a subject which does not have the mutated gene. If a subject must be homozygous for this mutation to have an altered phenotype, the mutation is said to be recessive. If one copy of the mutated gene is sufficient to alter the phenotype of the subject, the mutation is said to be dominant. If a subject has one copy of the mutated gene and has a phenotype that is intermediate between that of a homozygous and that of a heterozygous (for that gene) subject, the mutation is said to be co-dominant.
As used herein, the term “nucleic acid” refers to polynucleotides such as deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA). The term should also be understood to include, as equivalents, derivatives, variants and analogs of either RNA or DNA made from nucleotide analogs, including peptide nucleic acids (PNA), morpholino oligonucleotides (J. Summerton and D. Weller, Antisense and Nucleic Acid Drug Development 7:187 (1997)) and, as applicable to the embodiment being described, single (sense or antisense) and double-stranded polynucleotides. Deoxyribonucleotides include deoxyadenosine, deoxycytidine, deoxyguanosine, and deoxythymidine. For purposes of clarity, when referring herein to a nucleotide of a nucleic acid, which can be DNA or RNA, the terms “adenosine”, “cytidine”, “guanosine”, and “thymidine” are used. It is understood that if the nucleic acid is RNA, a nucleotide having a uracil base is uridine.
The term “nucleotide sequence complementary to the nucleotide sequence set forth in SEQ ID NO. x” refers to the nucleotide sequence of the complementary strand of a nucleic acid strand having SEQ ID NO. x. The term “complementary strand” is used herein interchangeably with the term “complement”. The complement of a nucleic acid strand can be the complement of a coding strand or the complement of a non-coding strand. When referring to double stranded nucleic acids, the complement of a nucleic acid having SEQ ID NO. x refers to the complementary strand of the strand having SEQ ID NO. x or to any nucleic acid having the nucleotide sequence of the complementary strand of SEQ ID NO. x. When referring to a single stranded nucleic acid having the nucleotide sequence SEQ ID NO. x, the complement of this nucleic acid is a nucleic acid having a nucleotide sequence which is complementary to that of SEQ ID NO. x. The nucleotide sequences and complementary sequences thereof are always given in the 5′ to 3′ direction. The terms “complement” and “reverse complement” are used interchangeably herein.
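Because both a sequence and its complement are given in the 5′ to 3′ direction, the complement is obtained by pairing each base and then reversing the order. The following Python sketch illustrates this for DNA sequences written with the four unambiguous bases. The function name is an illustrative assumption.

```python
# Watson-Crick pairing for DNA, preserving upper and lower case
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    # Pair each base, then reverse so that the result also reads 5' to 3'
    return seq.translate(_DNA_COMPLEMENT)[::-1]
```

For example, the complement of 5′-ATGC-3′ in this sense is 5′-GCAT-3′, and applying the function twice returns the original sequence.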
The term “operably linked” is intended to mean that the promoter is associated with the nucleic acid in such a manner as to facilitate transcription of the nucleic acid.
The term “polymorphism” refers to the coexistence of more than one form of a gene or portion thereof. A portion of a gene of which there are at least two different forms, i.e., two different nucleotide sequences, is referred to as a “polymorphic region of a gene”. A polymorphic region can be a single nucleotide, the identity of which differs in different alleles. A polymorphic region can also be several nucleotides long.
A “polymorphic gene” refers to a gene having at least one polymorphic region.
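As an illustration of the definition of a polymorphic region, the following Python sketch lists the positions at which a set of aligned allele sequences differ. The function name and the use of 0-based positions are assumptions made for this sketch, and the input sequences are assumed to be pre-aligned to equal length.

```python
def polymorphic_positions(aligned_alleles):
    """Return the 0-based positions at which the aligned alleles differ."""
    if not aligned_alleles:
        return []

    length = len(aligned_alleles[0])
    if any(len(seq) != length for seq in aligned_alleles):
        raise ValueError("allele sequences must be aligned to equal length")

    positions = []
    # zip(*...) walks the alignment column by column
    for index, column in enumerate(zip(*aligned_alleles)):
        if len({base.upper() for base in column}) > 1:
            positions.append(index)
    return positions
```

A single differing position corresponds to a polymorphic region that is a single nucleotide; a run of adjacent positions corresponds to a polymorphic region several nucleotides long.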
To describe a “polymorphic site” in a nucleotide sequence, an “ambiguity code” is often used that stands for the possible variations of nucleotides at that site. The list of ambiguity codes is summarized in the following table:
So, for example, an “R” in a nucleotide sequence means that either an “a” or a “g” could be at that position.
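The use of ambiguity codes can be illustrated by the following Python sketch, which expands a sequence containing such codes into every unambiguous sequence it stands for. The mapping shown is the standard IUPAC nucleotide code, supplied here as an assumption because the table itself is not reproduced in this passage; it agrees with the “R” example above. The names `IUPAC_CODES` and `expand_ambiguous` are illustrative.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (assumed to match the table referred to above)
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_ambiguous(seq):
    """Return all unambiguous sequences represented by an ambiguity-coded sequence."""
    # One tuple of candidate bases per position, combined by Cartesian product
    choices = [IUPAC_CODES[base] for base in seq.upper()]
    return ["".join(bases) for bases in product(*choices)]
```

A sequence with a single “R”, such as `expand_ambiguous("ACRT")`, therefore represents the two alleles ACAT and ACGT.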
The terms “protein”, “polypeptide” and “peptide” are used interchangeably herein when referring to a gene product.
A “regulatory element”, also termed herein “regulatory sequence”, is intended to include elements which are capable of modulating transcription from a basic promoter and include elements such as enhancers and silencers. The term “enhancer”, also referred to herein as “enhancer element”, is intended to include regulatory elements capable of increasing, stimulating, or enhancing transcription from a basic promoter. The term “silencer”, also referred to herein as “silencer element”, is intended to include regulatory elements capable of decreasing, inhibiting, or repressing transcription from a basic promoter. Regulatory elements are typically present in 5′ flanking regions of genes. However, regulatory elements have also been shown to be present in other regions of a gene, in particular in introns. Thus, it is possible that genes have regulatory elements located in introns, exons, coding regions, and 3′ flanking sequences. Such regulatory elements are also intended to be encompassed by the present invention and can be identified by any of the assays that can be used to identify regulatory elements in 5′ flanking regions of genes.
The term “regulatory element” further encompasses “tissue specific” regulatory elements, i.e., regulatory elements which effect expression of the selected DNA sequence preferentially in specific cells (e.g., cells of a specific tissue). gene expression occurs preferentially in a specific cell if expression in this cell type is significantly higher than expression in other cell types. The term “regulatory element” also encompasses non-tissue specific regulatory elements, i.e., regulatory elements which are active in most cell types. Furthermore, a regulatory element can be a constitutive regulatory element, i.e., a regulatory element which constitutively regulates transcription, as opposed to a regulatory element which is inducible, i.e., a regulatory element which is active primarily in response to a stimulus. A stimulus can be, e.g., a molecule, such as a hormone, cytokine, heavy metal, phorbol ester, cyclic AMP (cAMP), or retinoic acid.
Regulatory elements are typically bound by proteins, e.g., transcription factors. The term “transcription factor” is intended to include proteins or modified forms thereof, which interact preferentially with specific nucleic acid sequences, i.e., regulatory elements, and which in appropriate conditions stimulate or repress transcription. Some transcription factors are active when they are in the form of a monomer. Alternatively, other transcription factors are active in the form of a dimer consisting of two identical proteins or different proteins (heterodimer). Modified forms of transcription factors are intended to refer to transcription factors having a post-translational modification, such as the attachment of a phosphate group. The activity of a transcription factor is frequently modulated by a post-translational modification. For example, certain transcription factors are active only if they are phosphorylated on specific residues. Alternatively, transcription factors can be active in the absence of phosphorylated residues and become inactivated by phosphorylation. A list of known transcription factors and their DNA binding site can be found, e.g., in public databases, e.g., TFMATRIX Transcription Factor Binding Site Profile database.
As used herein, the term “specifically hybridizes” or “specifically detects” refers to the ability of a nucleic acid molecule of the invention to hybridize to at least approximately 6, 12, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130 or 140 consecutive nucleotides of either strand of a gene.
The term “wild-type allele” refers to an allele of a gene which, when present in two copies in a subject results in a wild-type phenotype. There can be several different wild-type alleles of a specific gene, since certain nucleotide changes in a gene may not affect the phenotype of a subject having two copies of the gene with the nucleotide changes.
“Adverse drug reaction” (ADR) as used herein refers to an appreciably harmful or unpleasant reaction, resulting from an intervention related to the use of a medicinal product, which predicts hazard from future administration and warrants prevention or specific treatment, or alteration of the dosage regimen, or withdrawal of the product. In its most severe form an ADR might lead to the death of an individual.
The term “Drug Response” is intended to mean any response that a patient exhibits upon drug administration. Specifically, drug response includes beneficial, i.e. desired, drug effects, ADR, or no detectable reaction at all. More specifically, the term drug response can also have a qualitative meaning, i.e. it embraces low or high beneficial effects and mild or severe ADR, respectively. The term “Statin Response” as used herein refers to drug response after statin administration. An individual's drug response also includes how well the drug is metabolized: “bad metabolizers” accumulate the drug in the body and may therefore show side effects of the drug due to cumulative overdosing.
“Candidate gene” as used herein includes genes that can be assigned to either normal cardiovascular function or to metabolic pathways that are related to onset and/or progression of cardiovascular diseases.
With regard to drug response the term “candidate gene” includes genes that can be assigned to distinct phenotypes regarding the patient's response to drug administration. Those phenotypes may include patients who benefit from relatively small amounts of a given drug (high responders) or patients who need relatively high doses in order to obtain the same benefit (low responders). In addition those phenotypes may include patients who can tolerate high doses of a medicament without exhibiting ADR, or patients who suffer from ADR even after receiving only low doses of a medicament.
As neither the development of cardiovascular diseases nor the patient's response to drug administration is completely understood, the term “candidate gene” may also comprise genes with presently unknown function.
“PA SNP” (phenotype associated SNP) refers to a polymorphic site which shows a significant association with a patient's phenotype (healthy, diseased, low or high responder, drug tolerant, ADR prone, etc.).
“PA gene” (phenotype associated gene) refers to a genomic locus harbouring a PA SNP, irrespective of the actual function of this gene locus.
PA gene polypeptide refers to a polypeptide encoded at least in part by a PA gene.
The term “Secondary SNP” is intended to mean a SNP that is in the neighborhood of at least one other (“primary”) SNP. Due to linkage disequilibrium both primary and secondary SNP(s) might show a similar association with a phenotype.
The term “Haplotype” as used herein refers to a group of two or more SNPs that are functionally and/or spatially linked. That is, haplotypes define groups of SNPs that lie inside genes belonging to identical (or related) metabolic pathways and/or lie on the same chromosome. Haplotypes are expected to give better predictive/diagnostic information than a single SNP.
The term “statin” is intended to embrace all inhibitors of the enzyme 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase. Statins specifically inhibit the enzyme HMG-CoA reductase which catalyzes the rate limiting step in cholesterol biosynthesis. Known statins are Atorvastatin, Cerivastatin, Fluvastatin, Lovastatin, Pravastatin and Simvastatin.
Methods for Assessing Cardiovascular Status
The present invention provides diagnostic methods for assessing cardiovascular status in a human individual. Cardiovascular status as used herein refers to the physiological status of an individual's cardiovascular system as reflected in one or more markers or indicators. Status markers include without limitation clinical measurements such as, e.g., blood pressure, electrocardiographic profile, and differentiated blood flow analysis as well as measurements of LDL- and HDL-Cholesterol levels, other lipids and other well established clinical parameters that are standard in the art. Status markers according to the invention include diagnoses of one or more cardiovascular syndromes, such as, e.g., hypertension, acute myocardial infarction, silent myocardial infarction, stroke, and atherosclerosis. It will be understood that a diagnosis of a cardiovascular syndrome made by a medical practitioner encompasses clinical measurements and medical judgement. Status markers according to the invention are assessed using conventional methods well known in the art. Also included in the evaluation of cardiovascular status are quantitative or qualitative changes in status markers with time, such as would be used, e.g., in the determination of an individual's response to a particular therapeutic regimen.
The methods are carried out by the steps of:
(i) determining the sequence of one or more polymorphic positions within one, several or all of the genes listed in Examples or other genes mentioned in this file in the individual to establish a polymorphic pattern for the individual; and
(ii) comparing the polymorphic pattern established in (i) with the polymorphic patterns of humans exhibiting different markers of cardiovascular status. The polymorphic pattern of the individual is, preferably, highly similar and, most preferably, identical to the polymorphic pattern of individuals who exhibit particular status markers, cardiovascular syndromes, and/or particular patterns of response to therapeutic interventions. Polymorphic patterns may also include polymorphic positions in other genes which are shown, in combination with one or more polymorphic positions in the genes listed in the Examples, to correlate with the presence of particular status markers. In one embodiment, the method involves comparing an individual's polymorphic pattern with polymorphic patterns of individuals who have been shown to respond positively or negatively to a particular therapeutic regimen. Therapeutic regimen as used herein refers to treatments aimed at the elimination or amelioration of symptoms and events associated with cardiovascular disease. Such treatments include without limitation one or more of alteration in diet, lifestyle, and exercise regimen; invasive and noninvasive surgical techniques such as atherectomy, angioplasty, and coronary bypass surgery; and pharmaceutical interventions, such as administration of ACE inhibitors, angiotensin II receptor antagonists, diuretics, alpha-adrenoreceptor antagonists, cardiac glycosides, phosphodiesterase inhibitors, beta-adrenoreceptor antagonists, calcium channel blockers, HMG-CoA reductase inhibitors, imidazoline receptor blockers, endothelin receptor blockers, organic nitrites, and modulators of protein function of genes listed in the Examples. Interventions with pharmaceutical agents not yet known whose activity correlates with particular polymorphic patterns associated with cardiovascular disease are also encompassed.
It is contemplated, for example, that patients who are candidates for a particular therapeutic regimen will be screened for polymorphic patterns that correlate with responsivity to that particular regimen.
In a preferred embodiment, the method involves comparing an individual's polymorphic pattern with polymorphic patterns of individuals who exhibit or have exhibited one or more markers of cardiovascular disease, such as, e.g., elevated LDL-Cholesterol levels, high blood pressure, abnormal electrocardiographic profile, myocardial infarction, stroke, or atherosclerosis.
In another embodiment, the method involves comparing an individual's polymorphic pattern with polymorphic patterns of individuals who exhibit or have exhibited one or more drug related phenotypes, such as, e.g., low or high drug response, or adverse drug reactions.
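The comparison of an individual's polymorphic pattern with reference patterns can be sketched as follows. This is a minimal illustration only: the representation of a pattern as a mapping from polymorphic position to genotype, the similarity measure (fraction of shared positions with identical genotype), and all names are assumptions made for the example and are not prescribed by the methods described herein.

```python
def pattern_similarity(individual, reference):
    """Fraction of shared polymorphic positions at which the genotypes agree.

    Both arguments map a position name to a genotype string (hypothetical
    representation chosen for this sketch).
    """
    shared = set(individual) & set(reference)
    if not shared:
        return 0.0
    agree = sum(1 for pos in shared if individual[pos] == reference[pos])
    return agree / len(shared)


def best_matching_group(individual, reference_groups):
    """Return (group name, similarity) for the most similar reference pattern."""
    scored = {name: pattern_similarity(individual, ref)
              for name, ref in reference_groups.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]
```

A similarity of 1.0 corresponds to an identical pattern, and values close to 1.0 to a highly similar pattern in the sense of step (ii).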
In practicing the methods of the invention, an individual's polymorphic pattern can be established by obtaining DNA from the individual and determining the sequence at predetermined polymorphic positions in the genes such as those described in this file.
The DNA may be obtained from any cell source. Non-limiting examples of cell sources available in clinical practice include blood cells, buccal cells, cervicovaginal cells, epithelial cells from urine, fetal cells, or any cells present in tissue obtained by biopsy. Cells may also be obtained from body fluids, including without limitation blood, saliva, sweat, urine, cerebrospinal fluid, feces, and tissue exudates at the site of infection or inflammation. DNA is extracted from the cell source or body fluid using any of the numerous methods that are standard in the art. It will be understood that the particular method used to extract DNA will depend on the nature of the source.
Diagnostic and Prognostic Assays
The present invention provides methods for determining the molecular structure of at least one polymorphic region of a gene, specific allelic variants of said polymorphic region being associated with cardiovascular disease. In one embodiment, determining the molecular structure of a polymorphic region of a gene comprises determining the identity of the allelic variant. A polymorphic region of a gene, of which specific alleles are associated with cardiovascular disease can be located in an exon, an intron, at an intron/exon border, or in the promoter of the gene.
The invention provides methods for determining whether a subject has, or is at risk of developing, a cardiovascular disease. Such disorders can be associated with an aberrant gene activity, e.g., abnormal binding to a form of a lipid, or an aberrant gene protein level. An aberrant gene protein level can result from an aberrant transcription or post-transcriptional regulation. Thus, allelic differences in specific regions of a gene can result in differences of gene protein due to differences in regulation of expression. In particular, some of the identified polymorphisms in the human gene may be associated with differences in the level of transcription, RNA maturation, splicing, or translation of the gene or transcription product.
In preferred embodiments, the methods of the invention can be characterized as comprising detecting, in a sample of cells from the subject, the presence or absence of a specific allelic variant of one or more polymorphic regions of a gene. The allelic differences can be: (i) a difference in the identity of at least one nucleotide or (ii) a difference in the number of nucleotides, which difference can be a single nucleotide or several nucleotides.
A preferred detection method is allele specific hybridization using probes overlapping the polymorphic site and having about 5, 10, 20, 25, or 30 nucleotides around the polymorphic region. Examples of probes for detecting specific allelic variants of the polymorphic region located in intron X are probes comprising a nucleotide sequence set forth in any of SEQ ID NO. X. In a preferred embodiment of the invention, several probes capable of hybridizing specifically to allelic variants are attached to a solid phase support, e.g., a “chip”. Oligonucleotides can be bound to a solid support by a variety of processes, including lithography. For example a chip can hold up to 250,000 oligonucleotides (GeneChip, Affymetrix). Mutation detection analysis using these chips comprising oligonucleotides, also termed “DNA probe arrays” is described e.g., in Cronin et al. (1996) Human Mutation 7:244 and in Kozal et al. (1996) Nature Medicine 2:753. In one embodiment, a chip comprises all the allelic variants of at least one polymorphic region of a gene. The solid phase support is then contacted with a test nucleic acid and hybridization to the specific probes is detected. Accordingly, the identity of numerous allelic variants of one or more genes can be identified in a simple hybridization experiment. For example, the identity of the allelic variant of the nucleotide polymorphism of nucleotide A or G at position 33 of Seq ID 1 (baySNP179) and that of other possible polymorphic regions can be determined in a single hybridization experiment.
In other detection methods, it is necessary to first amplify at least a portion of a gene prior to identifying the allelic variant. Amplification can be performed, e.g., by PCR and/or LCR, according to methods known in the art. In one embodiment, genomic DNA of a cell is exposed to two PCR primers and amplified for a number of cycles sufficient to produce the required amount of amplified DNA. In preferred embodiments, the primers are located between 40 and 350 base pairs apart. Preferred primers for amplifying gene fragments of genes of this file are listed in Table 2 in the Examples.
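Whether a primer pair falls within the preferred spacing can be checked with a short sketch. It assumes that “apart” means the number of template bases lying between the two primer binding sites, that both primers are given 5′ to 3′, and that the reverse primer binds the strand complementary to the given template; the function names are illustrative.

```python
# Translation table for complementing uppercase DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_spacing(template, forward, reverse):
    """Number of template bases between the forward and reverse primer sites.

    Returns None if either primer site is not found in the expected order.
    """
    template = template.upper()
    fwd_start = template.find(forward.upper())
    if fwd_start < 0:
        return None
    fwd_end = fwd_start + len(forward)
    # The reverse primer site appears on the given strand as its reverse complement.
    rev_start = template.find(reverse_complement(reverse), fwd_end)
    if rev_start < 0:
        return None
    return rev_start - fwd_end


def spacing_in_preferred_range(spacing, low=40, high=350):
    """True if the spacing lies within the preferred 40 to 350 base pair range."""
    return spacing is not None and low <= spacing <= high
```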
Alternative amplification methods include: self sustained sequence replication (Guatelli, J. C. et al., 1990, Proc. Natl. Acad. Sci. U.S.A. 87:1874-1878), transcriptional amplification system (Kwoh, D. Y. et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:1173-1177), Q-Beta Replicase (Lizardi, P. M. et al., 1988, Bio/Technology 6:1197), or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers.
In one embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence at least a portion of a gene and detect allelic variants, e.g., mutations, by comparing the sample sequence with the corresponding wild-type (control) sequence. Exemplary sequencing reactions include those based on techniques developed by Maxam and Gilbert (Proc. Natl. Acad Sci USA (1977) 74:560) or Sanger (Sanger et al (1977) Proc. Nat. Acad. Sci 74:5463). It is also contemplated that any of a variety of automated sequencing procedures may be utilized when performing the subject assays (Biotechniques (1995) 19:448), including sequencing by mass spectrometry (see, for example, U.S. Pat. No. 5,547,835 and international patent application Publication Number WO 94/16101, entitled “DNA Sequencing by Mass Spectrometry” by H. Koster; U.S. Pat. No. 5,547,835 and international patent application Publication Number WO 94/21822 entitled “DNA Sequencing by Mass Spectrometry Via Exonuclease Degradation” by H. Koster; U.S. Pat. No. 5,605,798 and International Patent Application No. PCT/US96/03651 entitled “DNA Diagnostics Based on Mass Spectrometry” by H. Koster; Cohen et al. (1996) Adv Chromatogr 36:127-162; and Griffin et al. (1993) Appl Biochem Biotechnol 38:147-159). It will be evident to one skilled in the art that, for certain embodiments, the occurrence of only one, two or three of the nucleic acid bases need be determined in the sequencing reaction. For instance, A-track or the like, e.g., where only one nucleotide is detected, can be carried out.
Yet other sequencing methods are disclosed, e.g., in U.S. Pat. No. 5,580,732 entitled “Method of DNA sequencing employing a mixed DNA-polymer chain probe” and U.S. Pat. No. 5,571,676 entitled “Method for mismatch-directed in vitro DNA sequencing”.
In some cases, the presence of a specific allele of a gene in DNA from a subject can be shown by restriction enzyme analysis. For example, a specific nucleotide polymorphism can result in a nucleotide sequence comprising a restriction site which is absent from the nucleotide sequence of another allelic variant.
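The principle of restriction enzyme analysis can be sketched as follows. The example sequences are hypothetical, the recognition site is supplied by the user, and only one strand is searched, which is sufficient for palindromic recognition sites such as GAATTC (EcoRI).

```python
def cut_sites(sequence, recognition_site):
    """Return the 0-based start positions of every occurrence of the site."""
    sequence = sequence.upper()
    site = recognition_site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def distinguishes_alleles(allele_a, allele_b, recognition_site):
    """True if the two allelic sequences differ in the number of recognition sites."""
    return (len(cut_sites(allele_a, recognition_site))
            != len(cut_sites(allele_b, recognition_site)))
```

If one allelic variant contains the recognition site and the other does not, digestion yields fragments of different length for the two alleles.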
In other embodiments, alterations in electrophoretic mobility are used to identify the type of gene allelic variant. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766, see also Cotton (1993) Mutat Res 285:125-144; and Hayashi (1992) Genet Anal Tech Appl 9:73-79). Single-stranded DNA fragments of sample and control nucleic acids are denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence, and the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using PNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In another preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).
In yet another embodiment, the identity of an allelic variant of a polymorphic region is obtained by analyzing the movement of a nucleic acid comprising the polymorphic region in polyacrylamide gels containing a gradient of denaturant, using denaturing gradient gel electrophoresis (DGGE) (Myers et al (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing agent gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:1275).
Examples of techniques for detecting differences of at least one nucleotide between two nucleic acids include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension. For example, oligonucleotide probes may be prepared in which the known polymorphic nucleotide is placed centrally (allele-specific probes) and then hybridized to target DNA under conditions which permit hybridization only if a perfect match is found (Saiki et al. (1986) Nature 324:163; Saiki et al (1989) Proc. Natl. Acad. Sci USA 86:6230; and Wallace et al. (1979) Nucl. Acids Res. 6:3543). Such allele specific oligonucleotide hybridization techniques may be used for the simultaneous detection of several nucleotide changes in different polymorphic regions of a gene. For example, oligonucleotides having nucleotide sequences of specific allelic variants are attached to a hybridizing membrane and this membrane is then hybridized with labeled sample nucleic acid. Analysis of the hybridization signal will then reveal the identity of the nucleotides of the sample nucleic acid.
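The construction of allele-specific probes with a centrally placed polymorphic nucleotide can be sketched as follows. The flanking sequences, the flank length, and the function name are assumptions for illustration; actual probe design would also take hybridization conditions into account.

```python
def allele_specific_probes(upstream, alleles, downstream, flank=10):
    """Build one probe per allele with the polymorphic base at the center.

    upstream/downstream are the invariant sequences on either side of the
    polymorphic site; flank is the number of bases kept on each side.
    """
    if flank < 1 or flank > min(len(upstream), len(downstream)):
        raise ValueError("flank must be between 1 and the shorter flanking length")
    left = upstream[-flank:]
    right = downstream[:flank]
    return {allele: left + allele + right for allele in alleles}
```

For an A/G polymorphism, the call returns two probes that differ only at the central position, so that under stringent conditions each probe hybridizes only to its perfectly matching allele.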
Alternatively, allele specific amplification technology which depends on selective PCR amplification may be used. Oligonucleotides used as primers for specific amplification may carry the allelic variant of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent, or reduce polymerase extension (Prossner (1993) Tibtech 11:238; Newton et al. (1989) Nucl. Acids Res. 17:2503). This technique is also termed “PROBE” for Probe Oligo Base Extension. In addition it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al (1992) Mol. Cell Probes 6:1).
In another embodiment, identification of the allelic variant is carried out using an oligonucleotide ligation assay (OLA), as described, e.g., in U.S. Pat. No. 4,998,617 and in Landegren, U. et al., Science 241:1077-1080 (1988). The OLA protocol uses two oligonucleotides which are designed to be capable of hybridizing to abutting sequences of a single strand of a target. One of the oligonucleotides is linked to a separation marker, e.g., biotinylated, and the other is detectably labeled. If the precise complementary sequence is found in a target molecule, the oligonucleotides will hybridize such that their termini abut, and create a ligation substrate. Ligation then permits the labeled oligonucleotide to be recovered using avidin, or another biotin ligand. Nickerson, D. A. et al. have described a nucleic acid detection assay that combines attributes of PCR and OLA (Nickerson, D. A. et al., Proc. Natl. Acad. Sci. (U.S.A.) 87:8923-8927 (1990)). In this method, PCR is used to achieve the exponential amplification of target DNA, which is then detected using OLA.
Several techniques based on this OLA method have been developed and can be used to detect specific allelic variants of a polymorphic region of a gene. For example, U.S. Pat. No. 5,593,826 discloses an OLA using an oligonucleotide having a 3′-amino group and a 5′-phosphorylated oligonucleotide to form a conjugate having a phosphoramidate linkage. In another variation of OLA described in Tobe et al. ((1996) Nucleic Acids Res 24: 3728), OLA combined with PCR permits typing of two alleles in a single microtiter well. By marking each of the allele-specific primers with a unique hapten, i.e. digoxigenin and fluorescein, each OLA reaction can be detected by using hapten specific antibodies that are labeled with different enzyme reporters, alkaline phosphatase or horseradish peroxidase. This system permits the detection of the two alleles using a high throughput format that leads to the production of two different colors.
The invention further provides methods for detecting single nucleotide polymorphisms in a gene. Because single nucleotide polymorphisms constitute sites of variation flanked by regions of invariant sequence, their analysis requires no more than the determination of the identity of the single nucleotide present at the site of variation and it is unnecessary to determine a complete gene sequence for each patient. Several methods have been developed to facilitate the analysis of such single nucleotide polymorphisms.
In one embodiment, the single base polymorphism can be detected by using a specialized exonuclease-resistant nucleotide, as disclosed, e.g., in Mundy, C. R. (U.S. Pat. No. 4,656,127). According to the method, a primer complementary to the allelic sequence immediately 3′ to the polymorphic site is permitted to hybridize to a target molecule obtained from a particular animal or human. If the polymorphic site on the target molecule contains a nucleotide that is complementary to the particular exonuclease-resistant nucleotide derivative present, then that derivative will be incorporated onto the end of the hybridized primer. Such incorporation renders the primer resistant to exonuclease, and thereby permits its detection. Since the identity of the exonuclease-resistant derivative of the sample is known, a finding that the primer has become resistant to exonucleases reveals that the nucleotide present in the polymorphic site of the target molecule was complementary to that of the nucleotide derivative used in the reaction. This method has the advantage that it does not require the determination of large amounts of extraneous sequence data.
In another embodiment of the invention, a solution-based method is used for determining the identity of the nucleotide of a polymorphic site (Cohen, D. et al., French Patent 2,650,840; PCT Appln. No. WO91/02087). As in the Mundy method of U.S. Pat. No. 4,656,127, a primer is employed that is complementary to allelic sequences immediately 3′ to a polymorphic site. The method determines the identity of the nucleotide of that site using labeled dideoxynucleotide derivatives, which, if complementary to the nucleotide of the polymorphic site, will become incorporated onto the terminus of the primer.
An alternative method, known as Genetic Bit Analysis or GBA™, is described by Goelet, P. et al. (PCT Appln. No. 92/15712). The method of Goelet, P. et al. uses mixtures of labeled terminators and a primer that is complementary to the sequence 3′ to a polymorphic site. The labeled terminator that is incorporated is thus determined by, and complementary to, the nucleotide present in the polymorphic site of the target molecule being evaluated. In contrast to the method of Cohen et al. (French Patent 2,650,840; PCT Appln. No. WO91/02087) the method of Goelet, P. et al. is preferably a heterogeneous phase assay, in which the primer or the target molecule is immobilized to a solid phase.
Recently, several primer-guided nucleotide incorporation procedures for assaying polymorphic sites in DNA have been described (Komher, J. S. et al., Nucl. Acids. Res. 17:7779-7784 (1989); Sokolov, B. P., Nucl. Acids Res. 18:3671 (1990); Syvanen, A.-C., et al., Genomics 8:684-692 (1990), Kuppuswamy, M. N. et al., Proc. Natl. Acad. Sci. (U.S.A.) 88:1143-1147 (1991); Prezant, T. R. et al., Hum. Mutat. 1:159-164 (1992); Ugozzoli, L. et al., GATA 9:107-112 (1992); Nyren, P. et al., Anal. Biochem. 208:171-175 (1993)). These methods differ from GBA™ in that they all rely on the incorporation of labeled deoxynucleotides to discriminate between bases at a polymorphic site. In such a format, since the signal is proportional to the number of deoxynucleotides incorporated, polymorphisms that occur in runs of the same nucleotide can result in signals that are proportional to the length of the run (Syvanen, A.-C., et al., Amer. J. Hum. Genet. 52:46-59 (1993)).
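The logic of such a single-base primer extension can be sketched as follows. The sketch assumes that the target strand is written 5′ to 3′, that `site` is the 0-based index of the polymorphic nucleotide on that strand, and that the primer length is freely chosen; the function names are illustrative.

```python
# Watson-Crick complements of the four DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def design_extension_primer(target, site, length=8):
    """Primer (5'->3') complementary to the target bases immediately 3' of the site.

    The 3' end of the returned primer pairs with the base directly after the
    polymorphic site, so the next base added pairs with the site itself.
    """
    region = target.upper()[site + 1: site + 1 + length]
    return "".join(COMPLEMENT[base] for base in reversed(region))


def incorporated_terminator(target, site):
    """Dideoxynucleotide expected to be incorporated onto the primer terminus."""
    return COMPLEMENT[target.upper()[site]]
```

Detecting which labeled terminator has been added to the primer therefore reveals the nucleotide present at the polymorphic site of the target.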
For determining the identity of the allelic variant of a polymorphic region located in the coding region of a gene, yet other methods than those described above can be used. For example, identification of an allelic variant which encodes a mutated gene protein can be performed by using an antibody specifically recognizing the mutant protein in, e.g., immunohistochemistry or immunoprecipitation. Antibodies to wild-type gene protein are described, e.g., in Acton et al. (1999) Science 271:518 (anti-mouse gene antibody cross-reactive with human gene). Other antibodies to wild-type gene or mutated forms of gene proteins can be prepared according to methods known in the art. Alternatively, one can also measure an activity of a gene protein, such as binding to a lipid or lipoprotein. Binding assays are known in the art and involve, e.g., obtaining cells from a subject, and performing binding experiments with a labeled lipid, to determine whether binding to the mutated form of the receptor differs from binding to the wild-type of the receptor.
If a polymorphic region is located in an exon, either in a coding or non-coding region of the gene, the identity of the allelic variant can be determined by determining the molecular structure of the mRNA, pre-mRNA, or cDNA. The molecular structure can be determined using any of the above described methods for determining the molecular structure of the genomic DNA, e.g., sequencing and SSCP.
The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits, such as those described above, comprising at least one probe or primer nucleic acid described herein, which may be conveniently used, e.g., to determine whether a subject has or is at risk of developing a disease associated with a specific gene allelic variant.
Sample nucleic acid for use in the above-described diagnostic and prognostic methods can be obtained from any cell type or tissue of a subject. For example, a subject's bodily fluid (e.g. blood) can be obtained by known techniques (e.g. venipuncture), or nucleic acid can be obtained from human tissues such as heart (biopsies, transplanted organs). Alternatively, nucleic acid tests can be performed on dry samples (e.g. hair or skin). Fetal nucleic acid samples for prenatal diagnostics can be obtained from maternal blood as described in International Patent Application No. WO91/07660 to Bianchi. Alternatively, amniocytes or chorionic villi may be obtained for performing prenatal testing.
Diagnostic procedures may also be performed in situ directly upon tissue sections (fixed and/or frozen) of patient tissue obtained from biopsies or resections, such that no nucleic acid purification is necessary. Nucleic acid reagents may be used as probes and/or primers for such in situ procedures (see, for example, Nuovo, G. J., 1992, PCR in situ hybridization: protocols and applications, Raven Press, New York).
In addition to methods which focus primarily on the detection of one nucleic acid sequence, profiles may also be assessed in such detection schemes. Fingerprint profiles may be generated, for example, by utilizing a differential display procedure, Northern analysis and/or RT-PCR.
In practicing the present invention, the distribution of polymorphic patterns in a large number of individuals exhibiting particular markers of cardiovascular status or drug response is determined by any of the methods described above, and compared with the distribution of polymorphic patterns in patients that have been matched for age, ethnic origin, and/or any other statistically or medically relevant parameters, who exhibit quantitatively or qualitatively different status markers. Correlations are achieved using any method known in the art, including nominal logistic regression, chi square tests or standard least squares regression analysis. In this manner, it is possible to establish statistically significant correlations between particular polymorphic patterns and particular cardiovascular statuses (given in p values). It is further possible to establish statistically significant correlations between particular polymorphic patterns and changes in cardiovascular status or drug response such as would result, e.g., from particular treatment regimens. In this manner, it is possible to correlate polymorphic patterns with responsivity to particular treatments.
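One of the correlation methods named above, the chi square test, can be sketched for the simplest case of a 2x2 table (e.g., carriers and non-carriers of an allele among cases and controls). The sketch uses the Pearson statistic without continuity correction and assumes that all expected counts are nonzero; the counts used with it would come from the genotyped cohorts and are not given here.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p value (1 degree of freedom).

    table is ((a, b), (c, d)), e.g. rows = cases/controls and
    columns = allele carriers/non-carriers.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # For one degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A small p value indicates that the distribution of the allele differs between the two groups more than would be expected by chance.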
In another embodiment of the present invention two or more polymorphic regions are combined to define so called ‘haplotypes’. Haplotypes are groups of two or more SNPs that are functionally and/or spatially linked. It is possible to combine SNPs that are disclosed in the present invention either with each other or with additional polymorphic regions to form a haplotype. Haplotypes are expected to give better predictive/diagnostic information than a single SNP.
In a preferred embodiment of the present invention a panel of SNPs/haplotypes is defined that predicts the risk for CVD or drug response. This predictive panel is then used for genotyping of patients on a platform that can genotype multiple SNPs at the same time (Multiplexing). Preferred platforms are e.g. gene chips (Affymetrix) or the Luminex LabMAP reader. The subsequent identification and evaluation of a patient's haplotype can then help to guide specific and individualized therapy.
For example, the present invention can identify patients exhibiting genetic polymorphisms or haplotypes which indicate an increased risk for adverse drug reactions (ADR). In that case the drug dose should be lowered so that the risk for ADR is diminished. Likewise, if the patient's response to drug administration is particularly high (or the patient is a poor metabolizer of the drug), the drug dose should be lowered to avoid the risk of ADR.
In turn if the patient's response to drug administration is low (or the patient is a particularly high metabolizer of the drug), and there is no evident risk of ADR, the drug dose should be raised to an efficacious level.
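The dose adjustment logic of the two preceding paragraphs can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name, the category labels ("low", "normal", "high", "poor") and the choice to give the dose-lowering conditions precedence when criteria conflict are hypothetical and are not specified in this application.

```python
def suggest_dose_adjustment(adr_risk, response="normal", metabolizer="normal"):
    """Return 'lower', 'raise' or 'keep' for a predicted patient class.

    adr_risk:    True if the genotype or haplotype indicates increased ADR risk.
    response:    'low', 'normal' or 'high' predicted drug response.
    metabolizer: 'poor', 'normal' or 'high' predicted drug metabolism.
    """
    # Increased ADR risk, high response or poor metabolism: lower the dose.
    if adr_risk or response == "high" or metabolizer == "poor":
        return "lower"
    # Low response or high metabolism without evident ADR risk: raise the dose.
    if response == "low" or metabolizer == "high":
        return "raise"
    return "keep"

print(suggest_dose_adjustment(adr_risk=False, response="low"))
```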
It is self-evident that the ability to predict a patient's individual drug response should affect the formulation of a drug, i.e. drug formulations should be tailored so that they suit the different patient classes (low/high responders, poor/good metabolizers, ADR-prone patients). Those different drug formulations may encompass different doses of the drug, i.e. the medicinal product contains low or high amounts of the active substance. In another embodiment of the invention the drug formulation may contain additional substances that facilitate the beneficial effects and/or diminish the risk for ADR (Folkers et al. 1991, U.S. Pat. No. 5,316,765).
Isolated Polymorphic Nucleic Acids, Probes, and Vectors
The present invention provides isolated nucleic acids comprising the polymorphic positions described herein for human genes; vectors comprising the nucleic acids; and transformed host cells comprising the vectors. The invention also provides probes which are useful for detecting these polymorphisms.
In practicing the present invention, many conventional techniques in molecular biology, microbiology, and recombinant DNA, are used. Such techniques are well known and are explained fully in, for example, Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; DNA Cloning: A Practical Approach, Volumes I and II, 1985 (D. N. Glover ed.); Oligonucleotide Synthesis, 1984, (M. L. Gait ed.); Nucleic Acid Hybridization, 1985, (Hames and Higgins); Ausubel et al., Current Protocols in Molecular Biology, 1997, (John Wiley and Sons); and Methods in Enzymology Vol. 154 and Vol. 155 (Wu and Grossman, and Wu, eds., respectively).
Insertion of nucleic acids (typically DNAs) comprising the sequences of the present invention in a functional context, such as a full-length cDNA, into a vector is easily accomplished when the termini of both the DNAs and the vector comprise compatible restriction sites. If this cannot be done, it may be necessary to modify the termini of the DNAs and/or vector by digesting back single-stranded DNA overhangs generated by restriction endonuclease cleavage to produce blunt ends, or to achieve the same result by filling in the single-stranded termini with an appropriate DNA polymerase.
Alternatively, any site desired may be produced, e.g., by ligating nucleotide sequences (linkers) onto the termini. Such linkers may comprise specific oligonucleotide sequences that define desired restriction sites. Restriction sites can also be generated by the use of the polymerase chain reaction (PCR). See, e.g., Saiki et al., 1988, Science 239:48. The cleaved vector and the DNA fragments may also be modified if required by homopolymeric tailing.
The nucleic acids may be isolated directly from cells or may be chemically synthesized using known methods. Alternatively, the polymerase chain reaction (PCR) method can be used to produce the nucleic acids of the invention, using either chemically synthesized strands or genomic material as templates. Primers used for PCR can be synthesized using the sequence information provided herein and can further be designed to introduce appropriate new restriction sites, if desirable, to facilitate incorporation into a given vector for recombinant expression.
The nucleic acids of the present invention may be flanked by native gene sequences, or may be associated with heterologous sequences, including promoters, enhancers, response elements, signal sequences, polyadenylation sequences, introns, 5′- and 3′-noncoding regions, and the like. The nucleic acids may also be modified by many means known in the art. Non-limiting examples of such modifications include methylation, “caps”, substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoroamidates, carbamates, morpholines etc.) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.). Nucleic acids may contain one or more additional covalently linked moieties, such as, for example, proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), intercalators (e.g., acridine, psoralen, etc.), chelators (e.g., metals, radioactive metals, iron, oxidative metals, etc.), and alkylators. PNAs are also included. The nucleic acid may be derivatized by formation of a methyl or ethyl phosphotriester or an alkyl phosphoramidate linkage. Furthermore, the nucleic acid sequences of the present invention may also be modified with a label capable of providing a detectable signal, either directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin, and the like.
The invention also provides nucleic acid vectors comprising the gene sequences or derivatives or fragments thereof of genes described in the Examples. A large number of vectors, including plasmid and fungal vectors, have been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple cloning or protein expression. Non-limiting examples of suitable vectors include pUC plasmids, pET plasmids (Novagen, Inc., Madison, Wis.), or pRSET or pREP (Invitrogen, San Diego, Calif.), together with many appropriate host cells, using methods disclosed or cited herein or otherwise known to those skilled in the relevant art. The particular choice of vector/host is not critical to the practice of the invention.
Suitable host cells may be transformed/transfected/infected as appropriate by any suitable method including electroporation, CaCl2 mediated DNA uptake, fungal or viral infection, microinjection, microprojectile, or other established methods. Appropriate host cells include bacteria, archaebacteria, fungi, especially yeast, and plant and animal cells, especially mammalian cells. A large number of transcription initiation and termination regulatory regions have been isolated and shown to be effective in the transcription and translation of heterologous proteins in the various hosts. Examples of these regions, methods of isolation, manner of manipulation, etc. are known in the art. Under appropriate expression conditions, host cells can be used as a source of recombinantly produced peptides and polypeptides encoded by genes of the Examples. Nucleic acids encoding peptides or polypeptides from gene sequences of the Examples may also be introduced into cells by recombination events. For example, such a sequence can be introduced into a cell and thereby effect homologous recombination at the site of an endogenous gene or a sequence with substantial identity to the gene. Other recombination-based methods such as non-homologous recombinations or deletion of endogenous genes by homologous recombination may also be used.
In case of proteins that form heterodimers or other multimers, both or all subunits have to be expressed in one system or cell.
The nucleic acids of the present invention find use as probes for the detection of genetic polymorphisms and as templates for the recombinant production of normal or variant peptides or polypeptides encoded by genes listed in the Examples.
Probes in accordance with the present invention comprise without limitation isolated nucleic acids of about 10-100 bp, preferably 15-75 bp and most preferably 17-25 bp in length, which hybridize at high stringency to one or more of the polymorphic sequences disclosed herein or to a sequence immediately adjacent to a polymorphic position. Furthermore, in some embodiments a full-length gene sequence may be used as a probe. In one series of embodiments, the probes span the polymorphic positions in genes disclosed herein. In another series of embodiments, the probes correspond to sequences immediately adjacent to the polymorphic positions.
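The two series of embodiments described above (probes spanning a polymorphic position and probes immediately adjacent to it) can be illustrated with a short sketch. The sequence, the position of the polymorphism and the function names are hypothetical; the length limits of 17 to 25 bases correspond to the most preferred probe length stated above. Stringency, melting temperature and secondary structure are not considered here.

```python
def probe_spanning(sequence, snp_index, length=21):
    """Return a probe of `length` bases that contains position snp_index
    (0-based), centred on it as closely as the sequence ends allow."""
    if not 17 <= length <= 25:
        raise ValueError("length outside the most preferred 17-25 range")
    if not 0 <= snp_index < len(sequence) or length > len(sequence):
        raise ValueError("position or length does not fit the sequence")
    start = snp_index - length // 2
    start = max(0, min(start, len(sequence) - length))
    return sequence[start:start + length]

def probe_adjacent(sequence, snp_index, length=21):
    """Return a probe ending at the base immediately 5' of snp_index."""
    if not 17 <= length <= 25:
        raise ValueError("length outside the most preferred 17-25 range")
    if not length <= snp_index < len(sequence):
        raise ValueError("not enough 5' flanking sequence")
    return sequence[snp_index - length:snp_index]

# Hypothetical flanking sequences around a hypothetical C allele.
flank5 = "ACGTTGCAAGCTTGACCGTA"
flank3 = "GGATCCTTAGCAGTCAAGT"
sequence = flank5 + "C" + flank3
snp_index = len(flank5)

spanning = probe_spanning(sequence, snp_index)
adjacent = probe_adjacent(sequence, snp_index, length=20)
print(spanning)
print(adjacent)
```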
Polymorphic Polypeptides and Polymorphism-Specific Antibodies
The present invention encompasses isolated peptides and polypeptides encoded by genes listed in the Examples comprising polymorphic positions disclosed herein. In one preferred embodiment, the peptides and polypeptides are useful screening targets to identify cardiovascular drugs. In another preferred embodiment, the peptides and polypeptides are capable of eliciting antibodies in a suitable host animal that react specifically with a polypeptide comprising the polymorphic position and distinguish it from other polypeptides having a different sequence at that position.
Polypeptides according to the invention are at least five residues in length, preferably at least fifteen residues. Methods for obtaining these polypeptides are described below. Many conventional techniques in protein biochemistry and immunology are used. Such techniques are well known and are explained in Immunochemical Methods in Cell and Molecular Biology, 1987 (Mayer and Walker, eds; Academic Press, London); Scopes, 1987, Protein Purification: Principles and Practice, Second Edition (Springer-Verlag, N.Y.) and Handbook of Experimental Immunology, 1986, Volumes I-IV (Weir and Blackwell eds.).
Nucleic acids comprising protein-coding sequences can be used to direct the recombinant expression of polypeptides encoded by genes disclosed herein in intact cells or in cell-free translation systems. The known genetic code, tailored if desired for more efficient expression in a given host organism, can be used to synthesize oligonucleotides encoding the desired amino acid sequences. The polypeptides may be isolated from human cells, or from heterologous organisms or cells (including, but not limited to, bacteria, fungi, insect, plant, and mammalian cells) into which an appropriate protein-coding sequence has been introduced and expressed. Furthermore, the polypeptides may be part of recombinant fusion proteins.
Peptides and polypeptides may be chemically synthesized by commercially available automated procedures, including, without limitation, exclusive solid phase synthesis, partial solid phase methods, fragment condensation or classical solution synthesis. The polypeptides are preferably prepared by solid phase peptide synthesis as described by Merrifield, 1963, J. Am. Chem. Soc. 85:2149.
Methods for polypeptide purification are well-known in the art, including, without limitation, preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel filtration, ion exchange and partition chromatography, and countercurrent distribution. For some purposes, it is preferable to produce the polypeptide in a recombinant system in which the protein contains an additional sequence tag that facilitates purification, such as, but not limited to, a polyhistidine sequence. The polypeptide can then be purified from a crude lysate of the host cell by chromatography on an appropriate solid-phase matrix. Alternatively, antibodies produced against peptides encoded by genes disclosed herein, can be used as purification reagents. Other purification methods are possible.
The present invention also encompasses derivatives and homologues of the polypeptides. For some purposes, nucleic acid sequences encoding the peptides may be altered by substitutions, additions, or deletions that provide for functionally equivalent molecules, i.e., function-conservative variants. For example, one or more amino acid residues within the sequence can be substituted by another amino acid of similar properties, such as, for example, positively charged amino acids (arginine, lysine, and histidine); negatively charged amino acids (aspartate and glutamate); polar neutral amino acids; and non-polar amino acids.
The isolated polypeptides may be modified by, for example, phosphorylation, sulfation, acylation, or other protein modifications. They may also be modified with a label capable of providing a detectable signal, either directly or indirectly, including, but not limited to, radioisotopes and fluorescent compounds.
The present invention also encompasses antibodies that specifically recognize the polymorphic positions of the invention and distinguish a peptide or polypeptide containing a particular polymorphism from one that contains a different sequence at that position. Such polymorphic position-specific antibodies according to the present invention include polyclonal and monoclonal antibodies. The antibodies may be elicited in an animal host by immunization with peptides encoded by genes disclosed herein or may be formed by in vitro immunization of immune cells. The immunogenic components used to elicit the antibodies may be isolated from human cells or produced in recombinant systems. The antibodies may also be produced in recombinant systems programmed with appropriate antibody-encoding DNA. Alternatively, the antibodies may be constructed by biochemical reconstitution of purified heavy and light chains. The antibodies include hybrid antibodies (i.e., containing two sets of heavy chain/light chain combinations, each of which recognizes a different antigen), chimeric antibodies (i.e., in which either the heavy chains, light chains, or both, are fusion proteins), and univalent antibodies (i.e., comprised of a heavy chain/light chain complex bound to the constant region of a second heavy chain). Also included are Fab fragments, including Fab′ and F(ab′)2 fragments of antibodies. Methods for the production of all of the above types of antibodies and derivatives are well-known in the art and are discussed in more detail below. For example, techniques for producing and processing polyclonal antisera are disclosed in Mayer and Walker, 1987, Immunochemical Methods in Cell and Molecular Biology, (Academic Press, London). The general methodology for making monoclonal antibodies by hybridomas is well known.
Immortal antibody-producing cell lines can be created by cell fusion, and also by other techniques such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g., Schreier et al., 1980, Hybridoma Techniques; U.S. Pat. Nos. 4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,466,917; 4,472,500; 4,491,632; and 4,493,890. Panels of monoclonal antibodies produced against peptides encoded by genes disclosed herein can be screened for various properties; i.e. for isotype, epitope affinity, etc.
The antibodies of this invention can be purified by standard methods, including but not limited to preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel filtration, ion exchange and partition chromatography, and countercurrent distribution. Purification methods for antibodies are disclosed, e.g., in The Art of Antibody Purification, 1989, Amicon Division, W. R. Grace & Co. General protein purification methods are described in Protein Purification: Principles and Practice, R. K. Scopes, Ed., 1987, Springer-Verlag, New York, N.Y.
Methods for determining the immunogenic capability of the disclosed sequences and the characteristics of the resulting sequence-specific antibodies and immune cells are well-known in the art. For example, antibodies elicited in response to a peptide comprising a particular polymorphic sequence can be tested for their ability to specifically recognize that polymorphic sequence, i.e., to bind differentially to a peptide or polypeptide comprising the polymorphic sequence and thus distinguish it from a similar peptide or polypeptide containing a different sequence at the same position.
Kits
As set forth herein, the invention provides diagnostic methods, e.g., for determining the identity of the allelic variants of polymorphic regions present in the gene loci of genes disclosed herein, wherein specific allelic variants of the polymorphic region are associated with cardiovascular diseases. In a preferred embodiment, the diagnostic kit can be used to determine whether a subject is at risk of developing a cardiovascular disease. This information could then be used, e.g., to optimize treatment of such individuals.
In preferred embodiments, the kit comprises a probe or primer which is capable of hybridizing to a gene and thereby identifying whether the gene contains an allelic variant of a polymorphic region which is associated with a risk for cardiovascular disease. The kit preferably further comprises instructions for use in diagnosing a subject as having, or having a predisposition towards developing, a cardiovascular disease. The probe or primers of the kit can be any of the probes or primers described herein.
Preferred kits for amplifying a region of a gene comprising a polymorphic region of interest comprise one, two or more primers.
Antibody-Based Diagnostic Methods and Kits:
The invention also provides antibody-based methods for detecting polymorphic patterns in a biological sample. The methods comprise the steps of: (i) contacting a sample with one or more antibody preparations, wherein each of the antibody preparations is specific for a particular polymorphic form of the proteins encoded by genes disclosed herein, under conditions in which a stable antigen-antibody complex can form between the antibody and antigenic components in the sample; and (ii) detecting any antigen-antibody complex formed in step (i) using any suitable means known in the art, wherein the detection of a complex indicates the presence of the particular polymorphic form in the sample.
Typically, immunoassays use either a labelled antibody or a labelled antigenic component (e.g., that competes with the antigen in the sample for binding to the antibody). Suitable labels include without limitation enzyme-based, fluorescent, chemiluminescent, radioactive, or dye molecules. Assays that amplify the signals from the probe are also known, such as, for example, those that utilize biotin and avidin, and enzyme-labelled immunoassays, such as ELISA assays.
The present invention also provides kits suitable for antibody-based diagnostic applications. Diagnostic kits typically include one or more of the following components:
(i) Polymorphism-specific antibodies. The antibodies may be pre-labelled; alternatively, the antibody may be unlabelled and the ingredients for labelling may be included in the kit in separate containers, or a secondary, labelled antibody is provided; and
(ii) Reaction components: The kit may also contain other suitably packaged reagents and materials needed for the particular immunoassay protocol, including solid-phase matrices, if applicable, and standards.
The kits referred to above may include instructions for conducting the test. Furthermore, in preferred embodiments, the diagnostic kits are adaptable to high-throughput and/or automated operation.
Drug Targets and Screening Methods
According to the present invention, nucleotide sequences derived from genes disclosed herein and peptide sequences encoded by genes disclosed herein, particularly those that contain one or more polymorphic sequences, comprise useful targets to identify cardiovascular drugs, i.e., compounds that are effective in treating one or more clinical symptoms of cardiovascular disease. Furthermore, especially when a protein is a multimeric protein built of two or more subunits, a combination of different polymorphic subunits is very useful.
Drug targets include without limitation (i) isolated nucleic acids derived from the genes disclosed herein, and (ii) isolated peptides and polypeptides encoded by genes disclosed herein, each of which comprises one or more polymorphic positions.
In Vitro Screening Methods:
In one series of embodiments, an isolated nucleic acid comprising one or more polymorphic positions is tested in vitro for its ability to bind test compounds in a sequence-specific manner. The methods comprise:
(i) providing a first nucleic acid containing a particular sequence at a polymorphic position and a second nucleic acid whose sequence is identical to that of the first nucleic acid except for a different sequence at the same polymorphic position;
(ii) contacting the nucleic acids with a multiplicity of test compounds under conditions appropriate for binding; and
(iii) identifying those compounds that bind selectively to either the first or second nucleic acid sequence.
Selective binding as used herein refers to any measurable difference in any parameter of binding, such as, e.g., binding affinity, binding capacity, etc.
In another series of embodiments, an isolated peptide or polypeptide comprising one or more polymorphic positions is tested in vitro for its ability to bind test compounds in a sequence-specific manner. The screening methods involve:
(i) providing a first peptide or polypeptide containing a particular sequence at a polymorphic position and a second peptide or polypeptide whose sequence is identical to the first peptide or polypeptide except for a different sequence at the same polymorphic position;
(ii) contacting the polypeptides with a multiplicity of test compounds under conditions appropriate for binding; and
(iii) identifying those compounds that bind selectively to one of the peptides or polypeptides.
In preferred embodiments, high-throughput screening protocols are used to survey a large number of test compounds for their ability to bind the genes or peptides disclosed above in a sequence-specific manner.
Test compounds are screened from large libraries of synthetic or natural compounds. Numerous means are currently used for random and directed synthesis of saccharide, peptide, and nucleic acid based compounds. Synthetic compound libraries are commercially available from Maybridge Chemical Co. (Trevillet, Cornwall, UK), Comgenex (Princeton, N.J.), Brandon Associates (Merrimack, N.H.), and Microsource (New Milford, Conn.). A rare chemical library is available from Aldrich (Milwaukee, Wis.). Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available from e.g. Pan Laboratories (Bothell, Wash.) or MycoSearch (N.C.), or are readily producible. Additionally, natural and synthetically produced libraries and compounds are readily modified through conventional chemical, physical, and biochemical means.
In Vivo Screening Methods:
Intact cells or whole animals expressing polymorphic variants of genes disclosed herein can be used in screening methods to identify candidate cardiovascular drugs.
In one series of embodiments, a permanent cell line is established from an individual exhibiting a particular polymorphic pattern. Alternatively, cells (including without limitation mammalian, insect, yeast, or bacterial cells) are programmed to express a gene comprising one or more polymorphic sequences by introduction of appropriate DNA. Identification of candidate compounds can be achieved using any suitable assay, including without limitation (i) assays that measure selective binding of test compounds to particular polymorphic variants of proteins encoded by genes disclosed herein; (ii) assays that measure the ability of a test compound to modify (i.e., inhibit or enhance) a measurable activity or function of proteins encoded by genes disclosed herein; and (iii) assays that measure the ability of a compound to modify (i.e., inhibit or enhance) the transcriptional activity of sequences derived from the promoter (i.e., regulatory) regions of genes disclosed herein.
In another series of embodiments, transgenic animals are created in which (i) one or more human genes disclosed herein, having different sequences at particular polymorphic positions are stably inserted into the genome of the transgenic animal; and/or (ii) the endogenous genes disclosed herein are inactivated and replaced with human genes disclosed herein, having different sequences at particular polymorphic positions. See, e.g., Coffman, Semin. Nephrol. 17:404, 1997; Esther et al., Lab. Invest. 74:953, 1996; Murakami et al., Blood Press. Suppl. 2:36, 1996. Such animals can be treated with candidate compounds and monitored for one or more clinical markers of cardiovascular status.
The following are intended as non-limiting examples of the invention.
Material and Methods
Genotyping of patient DNA with the Pyrosequencing™ Method as described in the patent application WO 9813523:
First, a PCR is set up to amplify the flanking regions around a SNP. For this purpose, 2 ng of genomic DNA (patient sample) are mixed with a primer set (20-40 pmol) producing a 75 to 320 bp PCR fragment and with 0.3 to 1 U of Qiagen's Hot Star Taq Polymerase™ in a total volume of 20 μL. One primer is biotinylated, depending on the direction of the sequencing primer. To force the biotinylated primer to be incorporated, it is used at 0.8-fold.
For primer design, programs like Oligo 6™ (Molecular Biology Insights) or Primer Select™ (DNAStar) are used. PCR setup is performed by a BioRobot 3000™ from Qiagen. PCR takes place in T1 or Tgradient Thermocyclers™ from Biometra.
The whole PCR reaction is transferred into a PSQ plate™ (Pyrosequencing) and prepared using the Sample Prep Tool™ and SNP Reagent Kit™ from Pyrosequencing according to their instructions.
Preparation of Template for Pyrosequencing™:
Sample Preparation Using PSQ 96 Sample Prep Tool:
1. Mount the PSQ 96 Sample Prep Tool Cover onto the PSQ 96 Sample Prep Tool as follows: Place the cover on the desk, retract the 4 attachment rods by separating the handle from the magnetic rod holder, fit the magnetic rods into the holes of the cover plate, push the handle downward until a click is heard. The PSQ 96 Sample Prep Tool is now ready for use.
2. To transfer beads from one plate to another, place the covered tool into the PSQ 96 Plate containing the samples and lower the magnetic rods by separating the handle from the magnetic rod holder. Move the tool up and down a few times then wait for 30-60 seconds. Transfer the beads into a new PSQ 96 plate containing the solution of choice.
3. Release the beads by lifting the magnetic rod holder, bringing it together with the handle. Move the tool up and down a few times to make sure that the beads are released.
All steps are performed at room temperature unless otherwise stated.
Immobilization of PCR Product:
Biotinylated PCR products are immobilized on streptavidin-coated Dynabeads™ M-280 Streptavidin. Parallel immobilization of several samples is performed in the PSQ 96 Plate.
Mix PCR product, 20 μl of a well optimized PCR, with 25 μL 2× BW-buffer II. Add 60-150 μg Dynabeads. It is also possible to add a mix of Dynabeads and 2× BW-buffer II to the PCR product yielding a final BW-buffer II concentration of approximately 1×.
1. Incubate at 65° C. for 15 min, agitating constantly to keep the beads dispersed. For optimal immobilization of fragments longer than 300 bp use 30 min incubation time.
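The statement in the immobilization step that mixing the PCR product with 2× BW-buffer II yields a final concentration of approximately 1× can be checked with a short calculation. The sketch below uses the volumes given above (20 μl PCR product, 25 μl 2× buffer) and assumes, for simplicity, that the volume contributed by the Dynabeads is negligible; the function name is hypothetical.

```python
def final_fold_concentration(stock_fold, stock_volume_ul, sample_volume_ul):
    """Fold concentration of a buffer after mixing stock with sample."""
    total_volume_ul = stock_volume_ul + sample_volume_ul
    return stock_fold * stock_volume_ul / total_volume_ul

# 25 ul of 2x BW-buffer II mixed with 20 ul of PCR product (bead volume ignored).
final_fold = final_fold_concentration(2.0, 25.0, 20.0)
print(f"final BW-buffer II concentration: {final_fold:.2f}x")
```

With these volumes the result is 2 × 25 / 45, i.e. slightly above 1×, which is consistent with the approximate value stated in the protocol.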
Strand Separation:
4. For strand separation, use the PSQ 96 Sample Prep Tool to transfer the beads with the immobilized sample to a PSQ 96 Plate containing 50 μl 0.50 M NaOH per well. Release the beads.
5. After approximately 1 min, transfer the beads with the immobilized strand to a PSQ 96 Plate containing 99 μl 1× Annealing buffer per well and mix thoroughly.
6. Transfer the beads to a PSQ 96 Plate containing 45 μl of a mix of 1× Annealing buffer and 3-15 pmoles sequencing primer per well.
7. Heat at 80° C. for 2 minutes in the PSQ 96 Sample Prep Thermoplate and move to room temperature.
8. After reaching room temperature, continue with the sequencing reaction.
Sequencing Reaction:
1. Choose the method to be used (“SNP Method”) and enter relevant information in the PSQ 96 Instrument Control software.
2. Place the cartridge and PSQ 96 Plate in the PSQ 96 Instrument.
3. Start the run.
Genotyping Using the ABI 7700/7900 Instrument (TaqMan)
SNP genotyping using the TaqMan assay (Applied Biosystems/Perkin Elmer) was performed according to the manufacturer's instructions. The TaqMan assay is discussed by Lee et al., Nucleic Acids Research 1993, 21: 3761-3766.
Genotyping with a Service Contractor:
Qiagen Genomics, formerly Rapigene, is a service contractor for genotyping SNPs in patient samples. Their method is based on a primer extension method where two complementary primers are designed for each genotype that are labeled with different tags. Depending on the genotype only one primer will be elongated together with a certain tag. This tag can be detected with mass spectrometry and is a measure for the respective genotype. The method is described in the following patent: “Detection and identification of nucleic acid molecules—using tags which may be detected by non-fluorescent spectrometry or potentiometry” (WO 9727325).
To exemplify the present invention and its utility, (the imaginary) baySNP 28 will be used in the following:
The nucleotide polymorphism found for baySNP 28 (e.g. C to T exchange) and the gene in which it presumably resides can be read from table 3. baySNP 28 was genotyped in various patient cohorts using primers as described in table 2. As a result the following number of patients carrying different genotypes were found (information combined from tables 3 and 5a):
When comparing the number of female patients exhibiting a high response to statin therapy (HELD_FEM_HIRESP) with the control cohort (HELD_FEM_LORESP) it appears that the number of low responders carrying the CT genotype is increased. This points to a lower statin response among female individuals with the CT genotype. Applying statistical tests on those findings the following p-values were obtained (data taken from table 5b):
As at least one of the GTYPE p values is below 0.05, the association of genotype and statin response phenotype is regarded as statistically significant, i.e. the analysis of a patient's genotype can predict the response to statin therapy. In more detail, one can calculate the relative risk to exhibit a certain statin response phenotype when carrying a certain genotype (data taken from table 6a):
In case of baySNP 28 the risk to exhibit a high responder phenotype is 3.38 times higher when carrying the TT genotype. This indicates that the TT genotype of baySNP 28 is an independent risk factor for high statin response in females. On the other hand, carriers of a CT or CC genotype have a reduced risk of being a high responder.
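A relative risk of this kind can be sketched as follows. The formula used is the conventional ratio of the proportion of high responders among carriers of a genotype to the proportion among non-carriers; whether table 6a was computed in exactly this way is an assumption. The counts are hypothetical and are not the data of tables 5a or 6a.

```python
def relative_risk(carrier_cases, carrier_controls,
                  noncarrier_cases, noncarrier_controls):
    """Risk of being a case among carriers divided by the risk among
    non-carriers (conventional cohort-style relative risk)."""
    risk_carriers = carrier_cases / (carrier_cases + carrier_controls)
    risk_noncarriers = noncarrier_cases / (noncarrier_cases + noncarrier_controls)
    return risk_carriers / risk_noncarriers

# Hypothetical counts: TT carriers versus CT/CC carriers,
# high responders (cases) versus low responders (controls).
rr_tt = relative_risk(carrier_cases=30, carrier_controls=10,
                      noncarrier_cases=30, noncarrier_controls=50)
print(f"relative risk for TT: {rr_tt:.2f}")
```

A value above 1 indicates that carriers of the genotype are more likely to show the phenotype than non-carriers, and a value below 1 indicates a reduced risk.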
In addition, statistical associations can be calculated on the basis of alleles. This calculation would identify risk alleles instead of risk genotypes.
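Allele counts follow directly from genotype counts, since each patient contributes two alleles: a homozygote contributes two copies of one allele and a heterozygote one copy of each. The following minimal sketch shows this conversion; the genotype counts and the function name are hypothetical and do not reproduce the data of tables 3 and 5a.

```python
def allele_counts(genotype_counts):
    """Convert {genotype: number of patients} into {allele: number of alleles}.

    Genotypes are two-letter strings such as 'CC', 'CT' or 'TT'.
    """
    counts = {}
    for genotype, patients in genotype_counts.items():
        if len(genotype) != 2:
            raise ValueError(f"not a diploid genotype: {genotype!r}")
        for allele in genotype:
            counts[allele] = counts.get(allele, 0) + patients
    return counts

# Hypothetical genotype counts in one cohort (illustration only).
cohort = {"CC": 10, "CT": 20, "TT": 30}
print(allele_counts(cohort))
```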
In case of baySNP 28 the following allele counts were obtained (data combined from tables 3 and 5a):
When comparing the number of female patients with high statin response (HELD_FEM_HIRESP) with the control cohort (HELD_FEM_LORESP) it appears that the number of high responders carrying the T allele is increased, whereas the number of high responders carrying the C allele is diminished. This points to a higher statin response among female individuals with the T allele. Applying statistical tests on those findings the following p-values were obtained (data taken from table 5b):
As at least one of the ALLELE p values is below 0.05, the association of allele and statin response phenotype is regarded as statistically significant (in this example significant p values were obtained from two statistical tests), i.e. the analysis of a patient's alleles from baySNP 28 can also predict the extent of statin response. In more detail, one can calculate the relative risk of exhibiting a certain statin response phenotype when carrying a certain allele (data taken from table 6b):
In the case of baySNP 28 the risk of exhibiting a high responder phenotype is 2.39 times higher when carrying the T allele. This indicates that the T allele of baySNP 28 is an independent risk factor for a high statin response in females. In other words, those patients should receive lower doses of statins in order to avoid ADR. However, due to their ‘high responder’ phenotype they will still benefit from the drug. In turn, carriers of the C allele should receive higher drug doses in order to experience a beneficial therapeutic effect.
Another example is (the imaginary) baySNP 29, which is taken to exemplify polymorphisms relevant for adverse drug reactions. baySNP 29 was found significant when comparing male patients with severe ADR to the respective controls (as defined in table 1b).
The relative risk ratios for the genotypes AA, AG and GG were as follows (data taken from table 6a):
In this case male patients carrying the AA genotype have a 3.15 times higher risk of suffering from ADR. In other words, those patients should either receive lower doses of statins or switch to an alternative therapy in order to avoid ADR. On the other hand, male patients with AG or GG genotypes appear to be more resistant to ADR and hence tolerate statin therapy better.
As can be seen from the following tables, some of the associations that are disclosed in the present invention are indicative of more than one phenotype. Some baySNPs can, for example, be linked to ADR, but also to the risk of suffering from CVD (table 6).
*When assembling the cohorts for advanced and severe ADR we focused on the CK serum levels as those provide a more independent measure of statin related ADR.
An informed consent was signed by the patients and control people. Blood was taken by a physician according to medical standard procedures.
Samples were collected anonymously and labeled with a patient number.
DNA was extracted using kits from Qiagen.
Homo sapiens ras GTPase-activating-like
Homo sapiens ras GTPase-activating-like
Homo sapiens TNFa and gene for tumor
Homo sapiens PAC clone RP1-102K2
Homo sapiens c-lbc mRNA for guanine
Homo sapiens 12 BAC RP11-13J12 (Roswell Park Cancer
Homo sapiens NADH:ubiquinone oxidoreductase PDSW
Homo sapiens mRNA for KIAA0621 protein, partial cds.
Homo sapiens P-glycoprotein (MDR1) gene, exon 10 and partial cds.
H. sapiens creatine transporter gene
H. sapiens creatine transporter gene
H. sapiens creatine transporter gene
H. sapiens gene for lecithin-cholesterol acyltransferase (LCAT)
Homo sapiens SCO cytochrome oxidase deficient
Homo sapiens SCO cytochrome oxidase deficient
Homo sapiens NADH:ubiquinone oxidoreductase PDSW
Homo sapiens NADH-ubiquinone oxidoreductase
Homo sapiens NADH-ubiquinone oxidoreductase
Homo sapiens, Rab geranylgeranyltransferase, alpha
H. sapiens mRNA for kinase A anchor protein
H. sapiens mRNA for kinase A anchor protein
Homo sapiens partial ZNF202 gene for zinc finger protein homolog, exon 4
Homo sapiens partial ZNF202 gene for zinc finger protein homolog, exon 4
Homo sapiens mRNA for Cdc42-interacting protein 4 (CIP4)
Homo sapiens NADH-ubiquinone oxidoreductase
Homo sapiens cytosolic phospholipase A2-gamma mRNA, complete cds.
Homo sapiens cytosolic phospholipase A2-gamma mRNA, complete cds.
Homo sapiens cytosolic phospholipase A2-gamma mRNA, complete cds.
Homo sapiens ras
Homo sapiens c-syn protooncogene mRNA, complete cds.
Homo sapiens c-syn protooncogene mRNA, complete cds.
H. sapiens mRNA for TIF1beta zinc finger protein
Homo sapiens carnitine O-octanoyltransferase (CROT), mRNA.
Homo sapiens carnitine O-octanoyltransferase (CROT), mRNA.
Homo sapiens carnitine O-octanoyltransferase (CROT), mRNA.
Homo sapiens UDP
Homo sapiens c-lbc mRNA for guanine
Homo sapiens c-lbc mRNA for guanine
Homo sapiens c-lbc mRNA for guanine
Homo sapiens c-lbc mRNA for guanine
Homo sapiens Borg4 mRNA, complete cds.
Homo sapiens Borg4 mRNA, complete cds.
Homo sapiens Borg4 mRNA, complete cds.
Homo sapiens protein kinase A anchoring protein mRNA, complete cds.
Homo sapiens protein kinase A anchoring protein mRNA, complete cds.
Homo sapiens CDC42-binding
Homo sapiens CDC42-binding protein
Homo sapiens CDC42-binding protein
Homo sapiens CDC42-binding protein
Homo sapiens CDC42-binding protein
Homo sapiens CDC42-binding protein
Homo sapiens CDC42-binding protein
Homo sapiens CDC42-binding protein
Homo sapiens PAC 126N20 derived from
Homo sapiens CRIB-containing
Homo sapiens CRIB-containing BORG1
Homo sapiens CRIB-containing BORG1
Homo sapiens mRNA for ryanodine receptor 3, complete CDS
Homo sapiens mRNA for ryanodine receptor 3, complete CDS
Homo sapiens mRNA; cDNA DKFZp434A0530
Homo sapiens mRNA for N-WASP, complete cds.
Homo sapiens mRNA for N-WASP, complete cds.
Homo sapiens mRNA for N-WASP, complete cds.
Homo sapiens mRNA for N-WASP, complete cds.
Homo sapiens cyclic AMP phosphodiesterase mRNA, complete cds.
Homo sapiens protein kinase C
Homo sapiens GAP-related protein (NF1) mRNA, complete cds.
Homo sapiens GAP-related protein (NF1) mRNA, complete cds.
Homo sapiens GAP-related protein (NF1) mRNA, complete cds.
Homo sapiens serine (or cysteine) proteinase inhibitor, clade
Homo sapiens serine (or cysteine) proteinase inhibitor, clade A
Homo sapiens acetyl-Coenzyme A carboxylase beta (ACACB), mRNA.
Homo sapiens p21-activated protein kinase (Pak2) mRNA, complete cds.
Homo sapiens p21-activated protein kinase (Pak2) mRNA, complete cds.
Homo sapiens p21-activated protein kinase (Pak2) mRNA, complete cds.
Homo sapiens p21-activated protein kinase (Pak2) mRNA, complete cds.
Homo sapiens p21-activated protein kinase (Pak2) mRNA, complete cds.
H. sapiens syk mRNA for protein-tyrosine kinase
Homo sapiens NADH-ubiquinone oxidoreductase
H. sapiens lipoprotein lipase (LPL) gene, exons
Homo sapiens zinc finger protein 202 (ZNF202)
Table 5a and 5b Cohort Sizes and p-Values of PA SNPs
The baySNP number refers to an internal numbering of the PA SNPs. Cpval denotes the classical Pearson chi-squared test, Xpval denotes the exact version of Pearson's chi-squared test, LRpval denotes the likelihood-ratio chi-squared test. Cpvalue, Xpvalue, and LRpvalue are calculated as described in (SAS/STAT User's Guide of the SAS OnlineDoc, Version 8), (L. D. Fisher and G. van Belle, Biostatistics, Wiley Interscience 1993), and (A. Agresti, Statistical Science 7, 131 (92)). The GTYPE and Allele p values were obtained through the respective chi square tests when comparing COHORTs A and B. For GTYPE p value the number of patients in cohort A carrying genotypes 11, 12 or 22 (FQ11 A, FQ 12 A, FQ 22 A; genotypes as defined in table 3) were compared with the respective patients in cohort B (FQ11 B, FQ 12 B, FQ 22 B; genotypes as defined in table 3) resulting in the respective chi square test with a 3×2 matrix. For Allele p values we compared the allele count of alleles 1 and 2 (A1 and A2) in cohorts A and B, respectively (chi square test with a 2×2 matrix). SIZE A and B: Number of patients in cohorts A and B, respectively. See table 4 for definition of COHORTs A and B.
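The classical Pearson chi-squared comparison of cohorts A and B described above can be sketched as follows. This is a minimal illustration in plain Python, covering only the classical test (Cpval), not the exact or likelihood-ratio versions; the function names and the example counts are hypothetical, and every row and column total is assumed to be nonzero.

```python
import math


def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of counts.

    `table` is a list of rows, e.g. three genotype rows by two cohort columns.
    Assumption: no row or column total is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi2_p_value(stat, dof):
    """Upper-tail p-value using the closed forms for 1 and 2 degrees of freedom."""
    if dof == 1:  # 2x2 allele table
        return math.erfc(math.sqrt(stat / 2.0))
    if dof == 2:  # 3x2 genotype table
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles 2x2 and 3x2 tables")


# Hypothetical genotype counts (rows: genotypes 11, 12, 22; columns: cohort A, B).
genotype_table = [[30, 10], [15, 20], [5, 20]]
stat, dof = pearson_chi2(genotype_table)
significant = chi2_p_value(stat, dof) < 0.05
```

The same two functions apply to the 2×2 allele table by passing two rows (alleles 1 and 2) instead of three.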
we calculate
Here, the case and control populations represent any case-control-group pair, or bad (case)-good (control)-group pair, respectively (due to their increased response to statins, ‘high responders’ are treated as a case cohort, whereas ‘low responders’ are treated as the respective control cohort). A value RR1>1, RR2>1, and RR3>1 indicates an increased risk for individuals carrying genotype 1, genotype 2, and genotype 3, respectively. For example, RR1=3 indicates a 3-fold risk of an individual carrying genotype 1 as compared to individuals carrying genotype 2 or 3 (a detailed description of relative risk calculation and statistics can be found in (Biostatistics, L. D. Fisher and G. van Belle, Wiley Interscience 1993)). The baySNP number refers to an internal numbering of the PA SNPs and can be found in the sequence listing. null: not defined.
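A relative risk of this kind can be sketched as follows. The sketch assumes the conventional definition (the proportion of cases among carriers of one genotype divided by the proportion of cases among carriers of the other genotypes), which is an assumption on our part and may differ in detail from the calculation used for table 6a; the function name and the counts in the usage lines are hypothetical.

```python
def relative_risk(case_counts, control_counts, index):
    """Relative risk of being a case for carriers of one genotype versus all others.

    case_counts and control_counts list the number of patients per genotype,
    in the same order. Returns None when the ratio is not defined (an empty
    group, or no cases in the comparison group).
    """
    carriers_case = case_counts[index]
    carriers_total = carriers_case + control_counts[index]
    others_case = sum(case_counts) - carriers_case
    others_total = sum(case_counts) + sum(control_counts) - carriers_total
    if carriers_total == 0 or others_total == 0 or others_case == 0:
        return None  # corresponds to a "null" (not defined) entry
    return (carriers_case / carriers_total) / (others_case / others_total)


# Hypothetical counts for genotypes 1, 2 and 3 (not data from table 6a).
cases = [20, 10, 10]
controls = [10, 20, 30]
rr1 = relative_risk(cases, controls, 0)
```

Under the same assumption, passing two-element lists of allele counts gives the allele-level values RR1 and RR2.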
In cases where a relative risk is not given in the table (three times zero or null) the informative genotype can be drawn from the right part of the table where the frequencies of genotypes are given in the cases and control cohorts. For example BaySNP 3360 gave the following results:
It can be concluded that the GT and TT genotypes are present only in the control cohort, which suggests that these genotypes are protective against ADR. An analogous procedure can be used to determine protective alleles if no relative risk is given (table 6b).
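Reading the informative genotype from the frequency columns when no relative risk is given can be sketched as follows. The function name and the counts are hypothetical and serve only to illustrate the lookup; they are not the BaySNP 3360 values.

```python
def genotypes_only_in_controls(case_counts, control_counts):
    """Return the genotypes observed in the control cohort but never in the cases."""
    return [
        genotype
        for genotype, n_control in control_counts.items()
        if n_control > 0 and case_counts.get(genotype, 0) == 0
    ]


# Hypothetical genotype frequencies, for illustration only.
case_cohort = {"GG": 12, "GT": 0, "TT": 0}
control_cohort = {"GG": 30, "GT": 6, "TT": 1}
candidates = genotypes_only_in_controls(case_cohort, control_cohort)
```

With small cohorts such a pattern can arise by chance, so the result is best read together with the corresponding p values.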
we calculate
Here, the case and control populations represent any case-control-group pair, or bad (case)-good (control)-group pair, respectively (due to their increased response to statins, ‘high responders’ are treated as a case cohort, whereas ‘low responders’ are treated as the respective control cohort). A value RR1>1 or RR2>1 indicates an increased risk for individuals carrying allele 1 or allele 2, respectively. For example, RR1=3 indicates a 3-fold risk of an individual carrying allele 1 as compared to individuals not carrying allele 1 (a detailed description of relative risk calculation and statistics can be found in (Biostatistics, L. D. Fisher and G. van Belle, Wiley Interscience 1993)). The baySNP number refers to an internal numbering of the PA SNPs and can be found in the sequence listing. null: not defined.
Number | Date | Country | Kind |
---|---|---|---|
04016816.3 | Jul 2004 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP05/07600 | 7/13/2005 | WO | | 1/12/2007 |