Organ and bone marrow transplantation are routinely used for the treatment of patients with end-stage disease such as leukemia, liver failure due to hepatitis C, and kidney failure. While the frequency of organ and tissue transplants has increased dramatically over the past decades, histoincompatibility between the transplant recipient and the donor remains a significant barrier to the success of the transplant.
Histocompatibility, also known as immunocompatibility, refers to the compatibility between two individuals or the actual organs or tissues to be transplanted (also known as “grafts”). Consequences of histoincompatibility include graft rejection, also known as host versus graft disease (HVGD) in organ transplant, and graft versus host disease (GVHD), typically associated with bone marrow transplants. In GVHD, immune cells derived from donor hematopoietic stem cells identify host tissue as foreign and mount an immune response against it. In HVGD, host immune cells identify the graft organ as foreign and mount an immune response against it. Both GVHD and HVGD are debilitating conditions and can require patients to be placed on severe immunosuppressive regimens, with attendant complications. Immunocompatibility largely depends on the genetic similarities between donor and recipient and is generally determined by blood typing and by Major Histocompatibility Complex (MHC) typing, which in humans is also referred to as Human Leukocyte Antigen (HLA) typing. The MHC of humans is a cluster of genes occupying a region located on the sixth chromosome. The strongest antigens of the MHC are separated into two classes: class I and class II. Class I and II MHC molecules are found in nearly every cell in the body and are the major determinants used by the body's immune system for recognition and differentiation of self from non-self. MHC molecules present antigenic peptides to the T cells of the immune system. Different MHC molecules differ in the efficiency with which they bind antigenic peptide sequences, and some are better than others at presenting antigens to the immune system. The class I MHC molecules are encoded by three loci (HLA A, HLA B, and HLA C), and class II MHC molecules are encoded by three loci (HLA DR, HLA DP, and HLA DQ). While the number of alleles at each locus varies widely, a person can only inherit two alleles for each HLA locus.
The large number of possible combinations at each locus makes the genes of the MHC the most polymorphic loci known.
Every person's HLA pattern can be “fingerprinted” through tissue typing. Tissue typing, or HLA matching, is used to measure the pattern of HLA antigens present for a potential transplant donor and recipient and to determine the level of compatibility between them. The more similar the HLA antigen patterns are from the two tissue samples, the less likely it is that the graft will be rejected.
HLA typing has revolutionized the treatment of many end-stage diseases by increasing the success rate of transplantation of bone marrow cells or organs, but graft rejection still occurs with significant frequency even in sibling transplants in which donor and host are perfectly matched for all blood type and HLA antigens. This may be due, at least in part, to the fact that many other histocompatibility antigens have not yet been identified.
Despite the advances in tissue typing and the creation of numerous tissue and organ registries used to screen potential donors and recipients prior to transplantation, the prevalence of life-threatening complications such as graft failure and rejection remains a significant barrier to the overall success of transplantation.
The compatibility of bodily tissues with the immune system is a central and unpredictable feature of the etiology of numerous medical conditions, including the rejection of allografts, the development of GVHD and HVGD, spontaneous abortions, and the treatment of many hematologic disorders. Improvements to the methods currently available for screening recipients in need of a transplant against potential donors are necessary to reduce the likelihood of graft rejection, GVHD, and HVGD.
Histoincompatibility is generally believed to be due to genetic differences or polymorphisms between individuals. Because the DNA of any two individuals is known to differ at millions of single-nucleotide polymorphisms (SNPs) scattered throughout the human genome, it is often assumed that histoincompatibility results from a large number of small differences between the antigen repertoires of the two individuals. However, we have discovered places in the human genome in which entire segments, ranging from hundreds of base pairs to multiple kilobases, are present in some individuals and missing in others. Many of these individual “deletion polymorphisms” or “deletion variants” remove protein-coding sequences from the human genome, and thus result in large changes to an individual's antigen repertoire relative to the changes associated with individual SNPs.
When a deletion variant appears in all copies of the gene in an individual, the result is generally a lack of expression of the gene product in that individual. If an individual does not have the deletion in all copies of the gene, the gene is present and the gene product is generally expressed. As a result, the immune cells of an individual with a deletion variant in all copies of the gene will not have been exposed to this gene or its product, and will tend to recognize the gene product as foreign when it is presented on tissue from another individual. In the context of transplant, this will result in an immune response when the donor and host are not matched, also known as a “deletion mismatch” for the specific deletion polymorphism. For example, in the context of organ transplant, a person who has a deletion variant in gene X that results in a lack of expression of gene X, and who receives a kidney from a donor who does not have a deletion variant in gene X and is therefore positive for gene X, could mount an immune response against the antigen encoded by gene X and the cells that express it. In the context of bone marrow transplant, immune cells from a donor having a deletion variant in gene X, if transplanted into an individual who is positive for gene X, could mount an immune response against the product of gene X and the cells that express it. In the context of fetal loss, a mother who lacks gene X could miscarry a fetus that is positive for gene X due to an immune response by the mother against the product of gene X.
Several of these common deletion variants are present in genes that are specifically expressed in organs relevant to transplantation and are likely to be determinants used by the body's immune system for recognition and differentiation of self from non-self. If the presence of a deletion resulting in the absence of the antigen is not matched between two subjects for whom immunocompatibility is desired (e.g., a graft donor and a graft recipient), the result is an immune response mounted by the subject having the deletion (i.e., lacking the gene product) against the protein product present in the subject lacking the deletion (i.e., having the expressed protein product). Therefore, these common deletion variants that affect the expression of antigens can be used to screen individuals for immunocompatibility, and used to manage, measure, prevent, and provoke histoincompatibility.
Accordingly, in one aspect, the invention features a method for predicting immunocompatibility of the immune system of a first subject with a cell, tissue, or organ from a second subject that includes the following steps. A biological sample from a first subject and a biological sample from a second subject are obtained and the presence or absence of at least one deletion variant in the DNA sequence of a gene in the first and second biological samples is determined, where the deletion variant substantially prevents expression of an antigen encoded by the gene and where the deletion variant is in a gene selected from the group consisting of UGT2B28, TRY6, LCE3C, PRB1, OR51A2, ORF4F5, GNB1L, MGAM, and MCEE. The deletion variant can be a common deletion variant and can be anywhere in the gene, including the coding region or a regulatory element of the gene. In one embodiment, the deletion variant is at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 bp, or 2 kb, 3 kb, 4 kb, 5 kb, 7 kb, 8 kb, 9 kb, 10 kb, 100 kb, 200 kb, 300 kb, 400 kb, 500 kb, 600 kb, 700 kb, 745 kb, 800 kb, 900 kb, or 1000 kb in length. In one embodiment, the deletion variant is between 100 bp and 745 kb in length. In another embodiment, at least two, three, four, five, six, seven, eight, nine, ten or more deletion variants are identified. The presence or absence of the deletion variant can be determined, for example, by polymerase chain reaction, DNA sequencing, sequencing of the whole genome or a subset thereof, Southern blotting, restriction fragment length polymorphism analysis, microelectrophoresis, sequencing by hybridization, single molecule sequencing, or microarray analysis.
The presence or absence of the deletion variant can also be determined indirectly by testing polymorphisms (e.g., SNPs) that are in linkage disequilibrium with deletion polymorphisms or by genotyping polymorphisms (e.g., SNPs) that are inside a deleted region to infer the presence of a deletion that removes the site of the SNP. Preferably, the deletion is in a gene that is normally expressed in the biological sample.
The presence or absence of the at least one deletion variant in the DNA sequence of the gene is then compared between the first and second subjects. The immune system of the first subject is immunocompatible with a cell, tissue, or organ from the second subject if the comparison results in one of the following: (i) the first subject has at least one intact copy of the gene, where the antigen encoded by the gene is expressed or (ii) the second subject has a deletion variant in all copies of the gene, where the deletion variant substantially prevents expression of the antigen encoded by the gene. Based on this comparison, three possible scenarios would predict immunocompatibility between the immune system of the first subject and the cell, tissue, or organ from the second subject: 1) both the first and second subjects have a deletion variant in all copies of the gene, which substantially prevents expression of the antigen, encoded by the gene, in both subjects; 2) both subjects have at least one intact copy of the gene and the antigen encoded by the gene is expressed; and 3) the second subject has a deletion variant in all copies of the gene that substantially prevents expression of the antigen encoded by the gene and the first subject has at least one intact copy of the gene that does not have a deletion variant, in which case, the antigen is expressed.
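The single-gene comparison rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the claimed method: the function names and the choice to represent each subject's genotype as a count of intact gene copies are hypothetical, and expression is assumed to follow whenever at least one intact copy is present.

```python
def expresses_antigen(intact_copies: int) -> bool:
    """Assumption: the antigen is expressed if at least one intact copy remains."""
    if intact_copies < 0:
        raise ValueError("intact_copies must be non-negative")
    return intact_copies >= 1


def predicts_immunocompatibility(first_intact_copies: int,
                                 second_intact_copies: int) -> bool:
    """Predict compatibility of the first subject's immune system with a cell,
    tissue, or organ from the second subject, for one gene.

    Rule from the text: compatible if (i) the first subject has at least one
    intact copy, or (ii) the second subject has the deletion in all copies.
    """
    first_expresses = expresses_antigen(first_intact_copies)
    second_expresses = expresses_antigen(second_intact_copies)
    return first_expresses or not second_expresses


# Hypothetical genotypes (intact copies in first subject, in second subject).
for first, second in [(0, 0), (2, 1), (1, 0), (0, 2)]:
    print(first, second, predicts_immunocompatibility(first, second))
```

Only the last pair, where the first subject lacks the gene entirely and the second subject carries it, is flagged as a predicted mismatch; the other three pairs correspond to the three compatible scenarios listed above.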
In one embodiment, the method further includes determining the presence or absence of at least one additional deletion variant in the DNA sequence of a gene in the first and second biological sample where the deletion variant substantially prevents expression of an antigen encoded by the gene and where the at least one additional deletion variant is in a gene selected from the group consisting of UGT2B17, GSTT1, GSTM1, and CYP2A6. The presence or absence of the at least one additional deletion variant in the DNA sequence of the gene is then compared between the first and second subjects. The immune system of the first subject is immunocompatible with a cell, tissue, or organ from the second subject if the comparison results in one of the following: (i) the first subject has at least one intact copy of the gene, where the antigen encoded by the gene is expressed or (ii) the second subject has a deletion variant in all copies of the gene, where the deletion variant substantially prevents expression of the antigen encoded by the gene. Based on this comparison for the additional deletion variant, any of the three possible scenarios described above would predict immunocompatibility between the immune system of the first subject and the cell, tissue, or organ from the second subject. In one desirable combination, the at least one deletion variant is in the UGT2B28 gene and the at least one additional deletion variant is in the UGT2B17 gene. In another desirable combination, the at least one deletion variant is in the UGT2B28 gene and the at least one additional deletion variant is in the GSTT1 or GSTM1 gene, or both.
In a related aspect, the invention features a method for predicting immunocompatibility of the immune system of a first subject with a cell, tissue, or organ from a second subject that includes the following steps. A biological sample from a first subject and a biological sample from a second subject are obtained and the presence or absence of at least one deletion variant antigen in the first and second biological samples is determined, for example, using immunological methods (e.g., ELISA or western blotting based methods). The at least one deletion variant antigen can be a common deletion variant antigen and is preferably one of the following: UGT2B28, TRY6, LCE3C, PRB1, OR51A2, ORF4F5, GNB1L, MGAM, and MCEE. The deletion variant antigen is not an antigen encoded by an MHC, HLA, or Rh factor gene. In one embodiment, at least two, three, four, five, six, seven, eight, nine, ten or more deletion variant antigens are compared.
The presence or absence of the deletion variant antigen is then compared between the first and second subjects. The immune system of the first subject is immunocompatible with a cell, tissue, or organ from the second subject if the comparison results in one of the following: (i) the first subject expresses the at least one deletion variant antigen or (ii) the second subject does not express the at least one deletion variant antigen. Based on this comparison, three possible scenarios would predict immunocompatibility between the immune system of the first subject and the cell, tissue, or organ from the second subject: 1) both the first and second subjects express the deletion variant antigen; 2) both the first and second subjects do not express the deletion variant antigen; and 3) the first subject expresses the deletion variant antigen and the second subject does not express the deletion variant antigen.
In one embodiment, the method further includes determining the presence or absence of at least one additional deletion variant antigen selected from the group consisting of UGT2B17, GSTT1, GSTM1, and CYP2A6. The presence or absence of the at least one additional deletion variant antigen is then compared between the first and second subjects. The immune system of the first subject is immunocompatible with a cell, tissue, or organ from the second subject if the comparison results in one of the following: (i) the first subject expresses the at least one additional deletion variant antigen or (ii) the second subject does not express the at least one additional deletion variant antigen. Based on this comparison for the additional deletion variant antigen, any of the three possible scenarios described above would predict immunocompatibility between the immune system of the first subject and the cell, tissue, or organ from the second subject. In one desirable combination, the at least one deletion variant antigen is UGT2B28 and the at least one additional deletion variant antigen is UGT2B17. In another desirable combination, the at least one deletion variant antigen is UGT2B28 and the at least one additional deletion variant antigen is GSTT1 or GSTM1, or both.
In a related aspect, the invention also features a method for predicting the immunocompatibility of the immune system of a first subject with a cell, tissue, or organ from a second subject that includes the following steps. A biological sample is obtained from each of the first and second subjects. The presence or absence of one or more deletion variants in the DNA sequence of at least one gene in the biological samples is determined, where the one or more deletion variants substantially prevent the expression of an antigen encoded by the at least one gene. The deletion variant is not in an MHC, Rh factor, or HLA gene. The deletion variant can be a common deletion variant and can be anywhere in the gene, including the coding region or a regulatory element of the gene. In one embodiment, the deletion variant is at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 bp, or 2 kb, 3 kb, 4 kb, 5 kb, 7 kb, 8 kb, 9 kb, 10 kb, 100 kb, 200 kb, 300 kb, 400 kb, 500 kb, 600 kb, 700 kb, 745 kb, 800 kb, 900 kb, or 1000 kb in length. In one embodiment, the deletion variant is between 100 bp and 745 kb in length. The presence or absence of the deletion variant can be determined, for example, by polymerase chain reaction, DNA sequencing, Southern blotting, restriction fragment length polymorphism analysis, or microarray analysis. The presence or absence of the deletion variant can also be determined indirectly by testing polymorphisms (e.g., SNPs) that are in linkage disequilibrium with deletion polymorphisms or by genotyping polymorphisms (e.g., SNPs) that are inside a deleted region to infer the presence of a deletion that removes the site of the SNP. Preferably, the deletion is in a gene that is normally expressed in the biological sample. Preferably, the deletion variant is in one of the following genes: UGT2B17, UGT2B28, TRY6, LCE3C, GSTM1, GSTT1, CYP2A6, PRB1, OR51A2, ORF4F5, GNB1L, MGAM, and MCEE.
The presence or absence of the deletion variants is then used to determine the deletion variant pattern for the first and second subjects. The deletion variant pattern is compared between the first and second subjects and the immune system of the first subject is immunocompatible with a cell, tissue, or organ from the second subject if the subjects have a substantially identical deletion variant pattern (e.g., at least 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% identical) and the subjects are not immunocompatible if they do not have a substantially identical deletion variant pattern (e.g., less than 50%, 40%, 30%, 20%, 10%, 5%, or less than 1% identical). The immune system of the first subject is also immunocompatible with a cell, tissue, or organ from the second subject if the comparison results in one of the following: (i) the first subject has at least one intact copy of at least one gene, where the antigen encoded by the gene is expressed or (ii) the second subject has a deletion variant in all copies of the at least one gene, where the deletion variant substantially prevents expression of the antigen encoded by the gene. Based on this comparison, three possible scenarios would predict immunocompatibility between the immune system of the first subject and the cell, tissue, or organ from the second subject: 1) both the first and second subjects have a deletion variant in all copies of the same gene, which substantially prevents expression of the antigen, encoded by the gene, in both subjects; 2) both subjects have at least one intact copy of the same gene and the antigen encoded by the gene is expressed; and 3) the second subject has a deletion variant in all copies of the same gene that substantially prevents expression of the antigen encoded by the gene and the first subject has at least one intact copy of the same gene that does not have a deletion variant, in which case, the antigen is expressed.
Optionally, the method can further include determining the presence or absence of the antigen encoded by the at least one gene that is not an MHC gene, where the presence or absence of the antigen is used to determine the deletion variant antigen pattern for the first and second subjects. The deletion variant antigen pattern is compared between the first and second subjects and the immune system of the first subject is immunocompatible with a cell, tissue, or organ from the second subject if the subjects have a substantially identical deletion variant antigen pattern (e.g., at least 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% identical) and the subjects are not immunocompatible if they do not have a substantially identical deletion variant antigen pattern (e.g., less than 50%, 40%, 30%, 20%, 10%, 5%, or less than 1% identical). The immune system of the first subject is immunocompatible with a cell, tissue, or organ from the second subject if the comparison results in one of the following: (i) the first subject expresses the deletion variant antigen or (ii) the second subject does not express the deletion variant antigen. Based on this comparison, three possible scenarios would predict immunocompatibility between the immune system of the first subject and the cell, tissue, or organ from the second subject: (1) both the first subject and the second subject express the deletion variant antigen, (2) both the first subject and the second subject do not express the deletion variant antigen, or (3) the first subject expresses the deletion variant antigen and the second subject does not express the antigen.
In another aspect, the invention features a method for predicting immunocompatibility of the immune system of a first subject with a cell, tissue, or organ from a second subject that includes the following steps. A biological sample from a first subject and a biological sample from a second subject are obtained and the DNA sequence of the whole genome, or a subset thereof, is determined. The sequences of the whole genome, or subset thereof, from the first sample and the second sample are then compared and the presence or absence of at least one deletion mismatch locus is determined. A deletion mismatch locus includes at least one deletion variant in the DNA sequence of a gene, where the deletion variant substantially prevents expression of an antigen encoded by the gene. In one embodiment, the deletion variant is in the DNA sequence of any one or more of the following genes: UGT2B17, UGT2B28, TRY6, LCE3C, GSTM1, GSTT1, CYP2A6, PRB1, OR51A2, ORF4F5, GNB1L, MGAM, and MCEE. The deletion variant can be a common deletion variant and can be anywhere in the gene, including the coding region or a regulatory element of the gene. In one embodiment, the deletion variant is at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 bp, or 2 kb, 3 kb, 4 kb, 5 kb, 7 kb, 8 kb, 9 kb, 10 kb, 100 kb, 200 kb, 300 kb, 400 kb, 500 kb, 600 kb, 700 kb, 745 kb, 800 kb, 900 kb, or 1000 kb in length. In one embodiment, the deletion variant is between 100 bp and 745 kb in length. In another embodiment, at least two, three, four, five, six, seven, eight, nine, ten or more deletion mismatch loci are identified. The whole genome, or a subset thereof, can be sequenced using any technique known in the art including, but not limited to, microelectrophoresis, genomic hybridization, single molecule sequencing, or microarray analysis. Preferably, the deletion mismatch is a deletion variant in a gene that is normally expressed in the biological sample.
Alternatively or additionally, the sequence of the genome or subset thereof of the first subject can be compared to a reference genome DNA sequence, where the reference genome sequence can be the DNA sequence from a third subject or from a composite of multiple subjects.
The immune system of the first subject is immunocompatible with a cell, tissue, or organ from the second subject if the comparison results in one of the following: (i) the first subject has at least one intact copy of the gene, where the antigen encoded by the gene is expressed or (ii) the second subject has a deletion variant in all copies of the gene, where the deletion variant substantially prevents expression of the antigen encoded by the gene. Based on this comparison, three possible scenarios would predict immunocompatibility between the immune system of the first subject and the cell, tissue, or organ from the second subject: 1) both the first and second subjects have a deletion variant in all copies of the gene, which substantially prevents expression of the antigen, encoded by the gene, in both subjects; 2) both subjects have at least one intact copy of the gene and the antigen encoded by the gene is expressed; and 3) the second subject has a deletion variant in all copies of the gene that substantially prevents expression of the antigen encoded by the gene and the first subject has at least one intact copy of the gene that does not have a deletion variant, in which case, the antigen is expressed.
Each of the above methods can be used alone or in combination to determine immunocompatibility between an organ, tissue, or cell donor and a recipient or between a woman and a potential father, an embryo, or fetus (collectively referred to as “maternal/fetal compatibility”). For organ transplants and maternal/fetal compatibility, the first subject is the organ or tissue recipient or the woman and the second subject is the organ or tissue donor, the prospective father, or the embryo or fetus. In each of these scenarios, the immune system of the recipient would not be newly exposed to the antigen upon transplantation. For bone marrow or peripheral blood transplantation, the first and second subjects are reversed, that is, the first subject is the bone marrow or peripheral blood donor and the second subject is the bone marrow or peripheral blood recipient.
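The role assignment described above, including the reversal for bone marrow or peripheral blood transplantation, can be made explicit in a short Python sketch. The `Setting` enum, the `Roles` tuple, and the `assign_subjects` function are hypothetical names introduced only for illustration; the mapping itself follows the preceding paragraph.

```python
from enum import Enum
from typing import NamedTuple


class Setting(Enum):
    ORGAN_OR_TISSUE = "organ or tissue transplant"
    MATERNAL_FETAL = "maternal/fetal compatibility"
    BONE_MARROW_OR_PERIPHERAL_BLOOD = "bone marrow or peripheral blood transplant"


class Roles(NamedTuple):
    first_subject: str   # subject whose immune system is being evaluated
    second_subject: str  # source of the cell, tissue, or organ being evaluated


def assign_subjects(setting: Setting, recipient: str, donor: str) -> Roles:
    """Map recipient and donor onto the 'first' and 'second' subjects.

    For maternal/fetal compatibility, 'recipient' is the woman and 'donor' is
    the prospective father, embryo, or fetus.
    """
    if setting is Setting.BONE_MARROW_OR_PERIPHERAL_BLOOD:
        # The transplanted immune cells come from the donor, so roles reverse.
        return Roles(first_subject=donor, second_subject=recipient)
    return Roles(first_subject=recipient, second_subject=donor)


print(assign_subjects(Setting.ORGAN_OR_TISSUE, "kidney recipient", "kidney donor"))
print(assign_subjects(Setting.BONE_MARROW_OR_PERIPHERAL_BLOOD,
                      "marrow recipient", "marrow donor"))
```

Keeping the reversal in one place means the same single-gene comparison can be applied unchanged to organ, maternal/fetal, and bone marrow settings.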
Each of the above methods can further include determining the blood type or the MHC type for the first or second subject. In various embodiments of the above aspects, the first or second biological sample is an organ, or part thereof, a tissue, or a bodily fluid, such as blood, serum, plasma, bone marrow, cerebrospinal fluid, amniotic fluid, urine, saliva, or semen.
In one example of the above aspects, the second subject is in need of a bone marrow or peripheral blood transplant and the first subject is a potential bone marrow or peripheral blood donor and the method is used to determine if the two subjects are a donor/recipient match. In this example, the deletion variant can be identified, for example, in a UGT2B17, UGT2B28, GSTM1, GSTT1, MGAM, or CYP2A6 gene or in the antigen encoded by any of the genes. In another example, the first subject is an organ or tissue recipient and the second subject is a potential organ or tissue donor and the method is used to determine if the two subjects are a donor/recipient match. For example, the methods can be used to identify a donor/recipient match for a subject in need of a liver transplant where the deletion variant is preferably identified in one or more of the following genes: UGT2B17, UGT2B28, GSTM1, GSTT1, and CYP2A6, or in the antigens encoded by any of the genes. In another example, the methods can be used to identify a donor/recipient match for a subject in need of a kidney transplant where the deletion variant is identified in a UGT2B28, GSTT1, or GSTM1 gene or in the antigens encoded by the genes.
In yet another example of the above aspects, the method is used to predict the immunocompatibility of prospective parents (e.g., where the first subject is a woman and the second subject is a prospective father or a potential sperm donor) or between a woman and an embryo (e.g., an embryo that is conceived by in vitro fertilization) or a pregnant woman and her fetus. Desirably, if the method is used to determine immunocompatibility between a woman and an embryo or fetus, the deletion variant antigen or deletion variant encoding the antigen is normally expressed by the fetal or embryonic cells. For example, the methods can be used to determine compatibility between a woman and an embryo or fetus where the deletion variant is preferably identified in one or more of the following genes: UGT2B28, UGT2B17, or LCE3C, which are expressed in the placenta, or in the antigens encoded by any of the genes.
If, using any of the above methods described herein, the first and second subjects are not immunocompatible, the deletion variant antigen can be administered to the first subject to tolerize the subject to the deletion variant antigen. The deletion variant antigen can be administered by gene therapy or protein therapy.
The methods of the above aspects can also be used to determine histoincompatibility. For example, if the second subject is in need of a bone marrow or peripheral blood transplant and the first subject is a bone marrow or peripheral blood donor, the method can be used to identify the subjects as a donor/recipient match if the first subject is not immunocompatible with the second subject. Such a method can be used, for example, to treat a subject that has a hematologic disorder (e.g., myelodysplastic syndrome, aplastic anemia, sickle cell anemia, metabolic disease, or a blood cell cancer such as Hodgkin's lymphoma, non-Hodgkin's lymphoma, leukemia, and multiple myeloma) and the desired outcome is for the donor's immune cells to attack the diseased cells in the host. For example, if the second subject has a blood cell cancer, the deletion variant is preferably detected in an antigen or in a gene that encodes an antigen that is specifically expressed on the cancer cells in the patient suffering from the blood cell cancer.
The invention also features a kit for deletion variant typing that includes at least one nucleic acid molecule that is complementary to a DNA sequence of at least a portion of a gene selected from the following: UGT2B28, TRY6, LCE3C, PRB1, OR51A2, ORF4F5, GNB1L, MGAM, and MCEE. The kit also includes instructions for the use of the nucleic acid molecule for deletion variant typing. The kit can further include at least one additional nucleic acid molecule that is complementary to the DNA of any one or more of the following genes: UGT2B17, GSTT1, GSTM1, and CYP2A6. The nucleic acid molecule can be a primer used for a polymerase chain reaction or a probe that hybridizes to the gene at high stringency.
The invention also features a kit for deletion variant antigen typing that includes at least one binding agent (e.g., an antibody or fragment thereof) that specifically binds at least one antigen encoded by a gene selected from the following: UGT2B28, TRY6, LCE3C, PRB1, OR51A2, ORF4F5, GNB1L, MGAM, and MCEE. The kit can also include at least one binding agent (e.g., an antibody or fragment thereof) that specifically binds at least one antigen encoded by a gene selected from the following: UGT2B17, GSTT1, GSTM1, and CYP2A6. The kit also includes instructions for the use of the binding agent (e.g., antibody or fragment thereof) for deletion variant antigen typing.
By “antigen” is meant a polypeptide chain of two or more amino acids regardless of any post-translational modification (e.g., glycosylation or phosphorylation) that stimulates a cellular or humoral immune response.
By “biological sample” is meant a tissue biopsy, cell, bodily fluid (e.g., blood, serum, plasma, semen, urine, saliva, amniotic fluid, or cerebrospinal fluid), organ, or part thereof, or other specimen obtained from a patient or a test subject. Desirably, the biological sample includes nucleic acid molecules or polypeptides or both.
By “cell, tissue, or organ” is meant any cell, tissue or organ from the body or bodily fluid of a subject. Non-limiting examples of organs include kidney, liver, skin, pancreas, heart, lung, muscle, small bowel, hand, cornea, or any part thereof. Non-limiting examples of tissues include skin, bone, heart valve, blood, bone marrow, semen, an embryo, and a fetus. Non-limiting examples of cells include red blood cells, white blood cells, stem cells, sperm, egg, embryonic cells, and fetal cells.
By “deletion variant” or “deletion polymorphism” is meant a segment of the genome that is present in some individuals of a species and absent in other individuals of that species. Deletion variants can vary in size from 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 bp, or 2 kb, 3 kb, 4 kb, 5 kb, 7 kb, 8 kb, 9 kb, 10 kb, 100 kb, 200 kb, 300 kb, 400 kb, 500 kb, 600 kb, 700 kb, 745 kb, 800 kb, 900 kb, or 1000 kb in length. In one embodiment, the deletion variant is between 100 bp and 745 kb in length. By “common deletion variant” is meant a deletion variant that is seen with a frequency of at least 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or at least 10% in a given population. Most common deletions appear to result from ancestral mutations that have been inherited by descent; their frequency is strongly related to ancestry, and they are in linkage disequilibrium with nearby SNP variants. Desirably, the deletion variant or common deletion variant is a deletion in all copies of the gene that prevents expression of a gene, or prevents expression of an antigen encoded by a gene. Deletion variants can be found in the exons, introns, or the coding region of the gene or in the sequences that control expression of the gene. Examples of protein-encoding genes identified as having common deletion polymorphisms include UGT2B17, UGT2B28, TRY6, LCE3C, GSTM1, GSTT1, CYP2A6, PRB1, OR51A2, ORF4F5, GNB1L, MGAM, and MCEE.
By “a deletion variant in all copies of the gene” or “homozygous deletion” is meant the deletion of all of an individual's potential copies of a DNA locus, which may result from inheritance of a substantially identical deletion variant from both parents; or from the inheritance of different but overlapping deletions from one's parents; or from the combined effect of an inherited deletion and a subsequent, de novo mutation that removes the remaining intact copy of a DNA locus. For an autosomal DNA locus, or for an X-chromosome DNA locus in females, a deletion variant in all copies of the gene means a deletion of the DNA locus on both chromosomes. For a sex-chromosome locus in males, a deletion variant in all copies of the gene means a deletion of the only copy of that locus. For example, in the CYP2A6 gene, there is more than one deletion allele of the same locus present in the population that leads to the complete deletion of the DNA locus.
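As a minimal sketch of the definition above, the following Python functions decide whether a locus is deleted in all copies, given how many copies an individual is expected to carry. The function names, arguments, and chromosome labels are illustrative assumptions and are not part of the described methods; pseudoautosomal regions and other exceptions are ignored.

```python
def expected_copies(chromosome: str, sex: str) -> int:
    """Expected copies of a locus (simplified; ignores pseudoautosomal regions)."""
    if chromosome in ("X", "Y"):
        if sex == "male":
            return 1  # a male carries a single copy of each sex-chromosome locus
        if chromosome == "Y":
            return 0  # a female carries no Y-chromosome loci
        return 2      # X-chromosome locus in a female
    return 2          # autosomal locus


def is_deleted_in_all_copies(chromosome: str, sex: str, deleted_copies: int) -> bool:
    """True if every expected copy of the locus is deleted."""
    total = expected_copies(chromosome, sex)
    if total == 0:
        raise ValueError("locus is not expected in this individual")
    if not 0 <= deleted_copies <= total:
        raise ValueError("deleted_copies must be between 0 and the expected copy number")
    return deleted_copies == total
```

Under these assumptions, one deleted copy of an X-chromosome locus counts as a deletion in all copies for a male but not for a female.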
By “deletion variant antigen” is meant an antigen that is encoded by a gene with a “deletion variant” which, when present, prevents expression of the antigen. Preferably, a deletion variant antigen is not an HLA, MHC antigen, or Rh factor. For example, the antigens encoded by UGT2B17, UGT2B28, TRY6, LCE3C, GSTM1, GSTT1, CYP2A6, PRB1, OR51A2, ORF4F5, GNB1L, MGAM, or MCEE are considered deletion variant antigens because, when the deletion variant is present, expression of the antigen is prevented.
By “deletion variant pattern” is meant a compilation of the determination of the presence or absence of deletion variants present in one or more genes in a biological sample. Deletion variant patterns can be determined at the nucleic acid sequence level or at the antigen expression level using any standard method for nucleic acid sequence determination or antigen expression detection known in the art or described herein. The deletion variant pattern can be determined for one gene, two genes, three or more genes, a genomic locus, a chromosome, or an entire genome for a subject sample. The deletion variant pattern can also be determined for one or more deletion variant antigens. A deletion variant pattern identified for one gene, two genes, three or more genes, a genomic locus, a chromosome, an entire genome, or an antigen for one subject sample can be compared to a deletion variant pattern for the same one gene, two genes, three or more genes, genomic locus, specified genomic loci, chromosome, entire genome, or antigen identified for a second subject sample. The two patterns are said to be substantially identical if they are more than 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical over the one gene, two genes, three or more genes, genomic loci, chromosome, entire genome, or antigen compared. Two subjects with a substantially identical deletion variant pattern are said to be immunocompatible. Deletion variant patterns can be compared over an entire region or only for genes or genomic loci that are relevant to the organ or tissue for which immunocompatibility is desired.
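The pattern comparison described above can be sketched as follows. This is an illustrative example only: it assumes a pattern is stored as a mapping from gene name to a flag that is True when the gene is deleted in all copies, and the function names, the default threshold, and the sample data are hypothetical.

```python
from typing import Dict, Iterable, Optional


def pattern_identity(pattern_a: Dict[str, bool],
                     pattern_b: Dict[str, bool],
                     loci: Optional[Iterable[str]] = None) -> float:
    """Percent of compared loci with the same deletion status in both patterns."""
    # By default compare every locus typed in both samples; a caller can
    # restrict the comparison to loci relevant to a given organ or tissue.
    compared = sorted(set(pattern_a) & set(pattern_b)) if loci is None else list(loci)
    if not compared:
        raise ValueError("no loci to compare")
    same = sum(pattern_a[locus] == pattern_b[locus] for locus in compared)
    return 100.0 * same / len(compared)


def substantially_identical(pattern_a: Dict[str, bool],
                            pattern_b: Dict[str, bool],
                            threshold_percent: float = 90.0,
                            loci: Optional[Iterable[str]] = None) -> bool:
    """True if the two patterns are more than threshold_percent identical."""
    return pattern_identity(pattern_a, pattern_b, loci) > threshold_percent


# Hypothetical typing results (True = deleted in all copies).
donor = {"UGT2B17": True, "GSTM1": False, "GSTT1": False, "LCE3C": True}
recipient = {"UGT2B17": True, "GSTM1": False, "GSTT1": True, "LCE3C": True}
print(pattern_identity(donor, recipient))
```

Here the two hypothetical patterns agree at three of four genes. Passing `loci=["UGT2B17", "GSTM1"]` restricts the comparison to a chosen subset, mirroring the option of comparing only the genes relevant to the organ or tissue of interest.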
By “deletion variant typing” is meant the process of determining the presence or absence of a deletion variant, preferably a common deletion variant, in a nucleic acid encoding an antigen. Deletion variant typing may or may not be used in combination with HLA typing.
By “deletion variant antigen typing” is meant the process of determining the presence or absence of a deletion variant antigen encoded by a gene having a deletion variant, preferably a common deletion variant. Deletion variant antigen typing may or may not be used in combination with HLA typing.
By “deletion mismatch locus” is meant the absence of a genetic locus from the genome, or subset thereof, of one sample that is not absent (i.e. not homozygous deleted) in the genome, or subset thereof of another sample. Generally, the absence of the genetic locus is due to the presence of a deletion variant in all copies of that locus (i.e., a homozygous deletion).
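A deletion mismatch locus is directional: the locus is absent from one sample and present in the other. A minimal sketch, assuming each sample is summarized as the set of loci that are homozygously deleted in it (the function name and sample data are hypothetical):

```python
from typing import Iterable, List, Optional, Set


def deletion_mismatch_loci(deleted_in_a: Set[str],
                           deleted_in_b: Set[str],
                           loci: Optional[Iterable[str]] = None) -> List[str]:
    """Loci homozygously deleted in sample A but not in sample B."""
    mismatches = deleted_in_a - deleted_in_b
    if loci is not None:
        # Restrict to a subset of the genome, e.g. genes expressed in the graft.
        mismatches &= set(loci)
    return sorted(mismatches)


# Hypothetical homozygous deletions found in two samples.
sample_a = {"UGT2B17", "GSTM1"}
sample_b = {"GSTM1", "LCE3C"}
print(deletion_mismatch_loci(sample_a, sample_b))
print(deletion_mismatch_loci(sample_b, sample_a))
```

Because the result depends on argument order, a caller would typically run the comparison in the direction relevant to the immune response of interest, for example treating the sample whose immune system will encounter the other sample's antigens as sample A.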
By “donor” is meant a mammal, preferably a human, from whom an organ or a tissue is removed. The mammal may be alive or dead at the time the organ or tissue is removed. By “potential donor” is meant an individual who is identified as having an organ or tissue suitable for transplant. Generally, a potential donor will be free of disease affecting the organ or tissue to be transplanted. For example, a potential liver donor will generally have a healthy liver and be free of liver cancer, cirrhosis, sepsis, or infection with hepatitis A, B, or C virus or human immunodeficiency virus. A potential bone marrow or peripheral blood donor will generally be free of viral infection, blood cancer, or any type of hematologic disorder. A “preferred donor” is a donor that is matched to a recipient either by standard methods known in the art, such as blood typing, HLA typing, or by the methods described herein, or a combination thereof. Donors can be obtained from a registry of potential donors such as the National Cord Blood Program, United Network for Organ Sharing, National Marrow Donor Program, and any other public or private international, national, state, or local organ procurement organizations or organ donor registries. Information pertaining to potential donors can be entered into a database including name, age, sex, race, blood type, HLA type, and deletion variant typing, deletion variant antigen typing, or deletion variant pattern.
By “donor/recipient match” is meant a donor and a recipient that are identified as having (donor) and needing (recipient) the same organ, tissue, blood, or bone marrow and are immunocompatible. Donor/recipient matches need not be a perfect match but may have sufficiently matched criteria (e.g., blood type, HLA type, antigen type), which can be determined by the skilled artisan or the transplant physician. Preferably, a donor/recipient match will have the same blood type and will be identical for at least 1 deletion variant antigen, preferably 2 or more, 3 or more, 4 or more, 5 or more, and most preferably all of the deletion variant antigens for the biological sample being tested. A donor/recipient match will also preferably have an identical pattern for at least one HLA allele, preferably 2 or more, 3 or more, 4 or more, 5 or more, or all 6 commonly tested HLA alleles (e.g., 2 each for HLA-A, HLA-B, and HLA-DR). Donor/recipient matches can be further screened using additional medical criteria such as size of organ and urgency of need of organ, as well as geographic criteria and other health considerations.
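The matching criteria above (blood type, HLA alleles, and deletion variant antigens) can be tallied as in the following sketch. The `Subject` record, the function names, and the example allele labels are hypothetical, and the code only summarizes the counts; whether a given summary is a sufficient match remains a judgment for the skilled artisan or transplant physician.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Subject:
    blood_type: str
    hla: Dict[str, Tuple[str, str]]      # locus -> the two inherited alleles
    antigen_expressed: Dict[str, bool]   # gene -> deletion variant antigen expressed?


def hla_alleles_matched(a: Subject, b: Subject,
                        loci=("HLA-A", "HLA-B", "HLA-DR")) -> int:
    """Count shared alleles across the commonly tested loci (maximum 6)."""
    # Counter intersection treats each locus as a multiset of two alleles.
    return sum(sum((Counter(a.hla[locus]) & Counter(b.hla[locus])).values())
               for locus in loci)


def match_summary(donor: Subject, recipient: Subject) -> dict:
    shared = sorted(set(donor.antigen_expressed) & set(recipient.antigen_expressed))
    identical = sum(donor.antigen_expressed[g] == recipient.antigen_expressed[g]
                    for g in shared)
    return {
        "same_blood_type": donor.blood_type == recipient.blood_type,
        "hla_alleles_matched": hla_alleles_matched(donor, recipient),
        "deletion_antigens_identical": identical,
        "deletion_antigens_compared": len(shared),
    }


# Hypothetical typing results for illustration only.
donor = Subject("A",
                {"HLA-A": ("A*02", "A*24"), "HLA-B": ("B*07", "B*44"),
                 "HLA-DR": ("DRB1*15", "DRB1*04")},
                {"UGT2B17": True, "GSTM1": False, "GSTT1": True})
recipient = Subject("A",
                    {"HLA-A": ("A*02", "A*01"), "HLA-B": ("B*07", "B*44"),
                     "HLA-DR": ("DRB1*15", "DRB1*15")},
                    {"UGT2B17": False, "GSTM1": False, "GSTT1": True})
print(match_summary(donor, recipient))
```

In this hypothetical pair, the subjects share a blood type, four of the six commonly tested HLA alleles, and two of three deletion variant antigens.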
By “expression” is meant the production by cells of a gene or polypeptide detectable by standard methods known in the art. For example, polypeptide expression is often detected by immunological methods, DNA expression is often detected by Southern blotting or polymerase chain reaction (PCR), and RNA expression is often detected by northern blotting, PCR, or RNAse protection assays.
By “gene” is meant a nucleic acid (e.g., DNA) sequence that comprises coding sequences necessary for the production of a polypeptide, precursor, or RNA (e.g., mRNA, rRNA, tRNA), as well as regulatory sequences that promote or restrict the expression of that gene. The term encompasses the coding region and the sequences located adjacent to the coding region on both the 5′ and 3′ ends. Sequences which regulate the expression of a gene's coding sequence are typically located close (e.g., within a distance of about 10 kb) to the coding sequence and are frequently called “promoter elements.” Sequences located 5′ of the coding region and present on the mRNA are referred to as 5′ non-translated sequences. Sequences located 3′ or downstream of the coding region and present on the mRNA are referred to as 3′ non-translated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A genomic form contains the coding region (“exons”) interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene that are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Exons are the segments of the DNA that encode the polypeptide. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide. As used herein, the term “nucleic acid” means a polynucleotide such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) and encompasses both single-stranded and double-stranded nucleic acid. Total genomic DNA is a particularly useful nucleic acid with which to practice a method of the invention. When detecting a polymorphism in a coding region, mRNA or cDNA are also useful.
By “genome” is meant the complete genetic content of an organism. The genome includes both the genes and the non-coding sequences. By “a subset of the whole genome” is meant a substantial portion of the genome. For example, chromosomal DNA is a preferred subset of the whole genome. In another example, the DNA sequences encoding proteins are a preferred subset of the whole genome. In another example, the DNA sequences encoding proteins that are known to be expressed in a particular organ or tissue type of interest are a preferred subset of the whole genome. In another example, the DNA sequences encoding protein sequences that are known to be presented by the MHC or to elicit antibody responses are a preferred subset of the genome.
By “hematologic disorder” is meant any abnormal condition of any type of blood cell including erythrocytes (red blood cells), platelets, leukocytes, monocytes, granulocytes, and lymphocytes. Examples of diseases of the blood include cancers such as Hodgkin's lymphoma, non-Hodgkin's lymphoma, leukemia, multiple myeloma, and myelodysplastic syndrome. Also included are diseases of the immune system, aplastic anemia (when bone marrow stops producing new blood cells), inherited diseases of the bone marrow such as sickle cell anemia, and some metabolic diseases.
By “hybridize” is meant pair to form a double-stranded molecule between complementary polynucleotide sequences, or portions thereof, under various conditions of stringency. (See, e.g., Wahl and Berger (1987) Methods Enzymol. 152:399; Kimmel, Methods Enzymol. 152:507, 1987.) For example, stringent salt concentration will ordinarily be less than about 750 mM NaCl and 75 mM trisodium citrate, preferably less than about 500 mM NaCl and 50 mM trisodium citrate, and most preferably less than about 250 mM NaCl and 25 mM trisodium citrate. Low stringency hybridization can be obtained in the absence of organic solvent, e.g., formamide, while high stringency hybridization can be obtained in the presence of at least about 35% formamide, and most preferably at least about 50% formamide. Stringent temperature conditions will ordinarily include temperatures of at least about 30° C., more preferably of at least about 37° C., and most preferably of at least about 42° C. Varying additional parameters, such as hybridization time, the concentration of detergent, e.g., sodium dodecyl sulfate (SDS), and the inclusion or exclusion of carrier DNA, are well known to those skilled in the art. Various levels of stringency are accomplished by combining these various conditions as needed. In a preferred embodiment, hybridization will occur at 30° C. in 750 mM NaCl, 75 mM trisodium citrate, and 1% SDS. In a more preferred embodiment, hybridization will occur at 37° C. in 500 mM NaCl, 50 mM trisodium citrate, 1% SDS, 35% formamide, and 100 μg/ml denatured salmon sperm DNA (ssDNA). In a most preferred embodiment, hybridization will occur at 42° C. in 250 mM NaCl, 25 mM trisodium citrate, 1% SDS, 50% formamide, and 200 μg/ml ssDNA. Useful variations on these conditions will be readily apparent to those skilled in the art.
For most applications, washing steps that follow hybridization will also vary in stringency. Wash stringency conditions can be defined by salt concentration and by temperature. As above, wash stringency can be increased by decreasing salt concentration or by increasing temperature. For example, stringent salt concentration for the wash steps will preferably be less than about 30 mM NaCl and 3 mM trisodium citrate, and most preferably less than about 15 mM NaCl and 1.5 mM trisodium citrate. Stringent temperature conditions for the wash steps will ordinarily include a temperature of at least about 25° C., more preferably of at least about 42° C., and most preferably of at least about 68° C. In a preferred embodiment, wash steps will occur at 25° C. in 30 mM NaCl, 3 mM trisodium citrate, and 0.1% SDS. In a more preferred embodiment, wash steps will occur at 42° C. in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. In a most preferred embodiment, wash steps will occur at 68° C. in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. Additional variations on these conditions will be readily apparent to those skilled in the art. Hybridization techniques are well known to those skilled in the art and are described, for example, in Benton and Davis (Science 196:180, 1977); Grunstein and Hogness (Proc. Natl. Acad. Sci., USA 72:3961, 1975); Ausubel et al. (Current Protocols in Molecular Biology, Wiley Interscience, New York, 2001); Berger and Kimmel (Guide to Molecular Cloning Techniques, 1987, Academic Press, New York); and Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York.
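The salt concentrations given above all keep NaCl and trisodium citrate in a 10:1 ratio, which is the ratio of standard saline-sodium citrate (SSC) buffer, where 1x SSC is 150 mM NaCl and 15 mM trisodium citrate. The source does not use SSC notation; the following sketch simply converts the stated preferred conditions into SSC multiples for convenience. The constant and function names are illustrative.

```python
import math

SSC_1X_NACL_MM = 150.0     # NaCl concentration in 1x SSC
SSC_1X_CITRATE_MM = 15.0   # trisodium citrate concentration in 1x SSC


def ssc_fold(nacl_mm: float, citrate_mm: float) -> float:
    """Express a NaCl/citrate pair as a multiple of 1x SSC."""
    fold_nacl = nacl_mm / SSC_1X_NACL_MM
    fold_citrate = citrate_mm / SSC_1X_CITRATE_MM
    if not math.isclose(fold_nacl, fold_citrate, rel_tol=1e-9):
        raise ValueError("NaCl and citrate are not in the 10:1 SSC ratio")
    return fold_nacl


# (label, temperature in degrees C, NaCl in mM, trisodium citrate in mM),
# taken from the preferred embodiments stated in the text.
CONDITIONS = [
    ("hybridization, preferred", 30, 750, 75),
    ("hybridization, more preferred", 37, 500, 50),
    ("hybridization, most preferred", 42, 250, 25),
    ("wash, preferred", 25, 30, 3),
    ("wash, more preferred", 42, 15, 1.5),
    ("wash, most preferred", 68, 15, 1.5),
]

for label, temp_c, nacl, citrate in CONDITIONS:
    print(f"{label}: {temp_c} C, {ssc_fold(nacl, citrate):.2f}x SSC")
```

For instance, 750 mM NaCl with 75 mM trisodium citrate corresponds to 5x SSC, and 15 mM NaCl with 1.5 mM trisodium citrate corresponds to 0.1x SSC.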
By “immunocompatibility,” “immunological compatibility,” or “histocompatibility” is meant a condition in which the cells or tissue of one subject do not elicit an immune response by the immune system of another subject. Generally, immunocompatibility is measured by determining the presence of antigens in the cells or tissue of one subject that are absent in the cells or tissue of another subject and would cause the second subject to mount an immune response against the antigen(s). Examples of such antigens known in the art include the glycosyltransferase enzyme that modifies the carbohydrate content of the red blood cell antigens and determines the blood type of an individual (e.g., Type A, B, AB, or O), HLA antigens, and the Rh antigen. Immunocompatibility can be absolute or relative to another individual based on the number of antigens tested and found in the subjects tested. For example, if a subject shares the same blood type, Rh factor, and 3 out of 6 HLA antigens with a first individual, and shares the same blood type, Rh factor, and 5 out of 6 HLA antigens with a second individual, the subject is said to be more immunocompatible with the second individual than with the first individual.
By “major histocompatibility complex” or “MHC” is meant a complex of genes encoding cell surface molecules that are required for antigen presentation to T cells. The MHC is a large genomic region or gene family found in most vertebrates containing many genes with important immune system roles. In humans, the MHC is also referred to as the Human Leukocyte Antigen (HLA) and spans almost 4 megabases of chromosome 6. The strongest antigens of the MHC are separated into two classes—class I and class II. Class I and II MHC molecules are found in nearly every cell in the body and are the major determinants used by the body's immune system for recognition and differentiation of self from non-self. The class I MHC molecules are encoded by three loci—HLA A, HLA B, and HLA C—and class II MHC molecules are encoded by three loci—HLA DR, HLA DP, and HLA DQ.
By “polymorphism” is meant the occurrence of different forms, stages, or types in individual organisms or in organisms of the same species, independent of sexual variations, for example, the DNA sequence variations that occur when a single nucleotide (A, T, C, or G) in the genome sequence is altered. One example of a polymorphism is a single nucleotide polymorphism (SNP).
By “predicting the immunocompatibility” is meant determining or identifying the genetic similarities between two individuals or between an individual and a cell, tissue, or organ to be transplanted into that individual.
By “recipient” is meant a mammal, preferably a human, in need of an organ or a tissue transplant. Recipients can also be entered into a registry or a waiting list of subjects in need of an organ or tissue transplant. Information pertaining to recipients that can be entered into a database includes name, age, sex, race, blood type, HLA tissue type, geographic location, and urgency of the needed organ or tissue donation.
By “subject” is meant a mammal, including, but not limited to, a human or non-human mammal, such as a bovine, equine, canine, ovine, or feline.
By “substantially prevents expression” is meant to cause a reduction in the expression of a gene or antigen by at least 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% when compared to the expression of the gene or antigen in a sample that does not have a deletion variant in the gene or a deletion variant antigen. The term “substantially prevents expression” also includes a loss or reduction in the expression of a gene or antigen spatially or temporally during development when compared to the expression of the gene or antigen in a sample that does not have a deletion variant in the gene or a deletion variant antigen.
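The quantitative part of this definition can be expressed as a simple percent-reduction calculation. The sketch below is illustrative: the function names and the default 50% threshold (the lowest value listed above) are assumptions, and the expression values may be in any consistent unit.

```python
def percent_reduction(expr_sample: float, expr_reference: float) -> float:
    """Percent reduction in expression relative to a sample without the deletion variant."""
    if expr_reference <= 0:
        raise ValueError("reference expression must be positive")
    return 100.0 * (expr_reference - expr_sample) / expr_reference


def substantially_prevents_expression(expr_sample: float,
                                      expr_reference: float,
                                      threshold_percent: float = 50.0) -> bool:
    """True if expression is reduced by at least threshold_percent."""
    return percent_reduction(expr_sample, expr_reference) >= threshold_percent
```

For example, a sample measuring 20 units against a reference of 100 units shows an 80% reduction, which meets the 50% threshold but not a 90% threshold.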
By “tolerize” is meant providing an antigen or nucleic acid sequence encoding an antigen to an individual to reduce or prevent antigen-specific immune responses.
By “transplantation” is meant the transfer of cells, tissues, blood, bone marrow, or organs from one area of the body to another area of the body or from one organism to another. Allogeneic transplantation refers to transplantation between genetically different members of the same species. Nearly all organ and bone marrow transplants are allografts. These may be between brothers and sisters, parents and children, or between donors and recipients who are not related to each other. Autologous transplantation refers to transplantation of an organism's own cell or tissues; autologous transplantation may be used to repair or replace damaged tissue; autologous bone marrow transplantation permits the usage of more severe and toxic cancer therapies by replacing bone marrow damaged by the treatment with marrow that was removed and stored prior to treatment. By xenogenic transplantation is meant transplantation between members of different species; for example, the transplantation of animal organs into humans. Transplantation can refer to the transfer of a healthy organ or tissue such as liver, kidney, heart, pancreas, skin, lungs, and cornea. Transplantation can also refer to the transfer or replacement of blood or bone marrow, for example, as in bone marrow transplant (BMT), umbilical cord blood transplant, or peripheral blood stem cell transplant (PBSCT), where diseased blood cells or stem cells can be restored or replaced.
We have discovered a number of common deletion variants in genes that encode antigens expressed in tissues relevant to immunocompatibility. The conservation of these common deletion variants among multiple individuals, the presence of the antigens encoded by these polymorphic genes in relevant tissues, and the ability of the antigens to elicit an immune response make them ideal candidates for screening methods that determine immunocompatibility between two subjects in any situation where compatibility or histocompatibility is desired.
Other features and advantages of the invention will be apparent from the following Detailed Description, the drawings, and the claims.
The genetic sequences of different people are remarkably similar. When the chromosomes of two humans are compared, their DNA sequences can be identical for hundreds of bases. But at about one in every 1,200 bases, on average, the sequences will differ. Differences in individual bases are by far the most common type of genetic variation. These genetic differences are known as single nucleotide polymorphisms, or SNPs. The International HapMap Project is focused on identifying the basis for a large fraction of the genetic diversity in the human species by identifying most of the approximately 10 million SNPs estimated to occur commonly in the human genome.
For geneticists, SNPs act as markers to locate genes in DNA sequences. However, testing all of the 10 million common SNPs in a person's chromosomes would be extremely expensive. The development of the HapMap is a global collaboration designed to enable geneticists to take advantage of how SNPs and other genetic variants are organized on chromosomes. Genetic variants that are near each other tend to be inherited together. For example, all of the people who have an adenine rather than a guanine at a particular location in a chromosome can have identical genetic variants at other SNPs in the chromosomal region surrounding the adenine. These regions of linked variants are known as haplotypes.
In many parts of the human chromosomes, just a handful of haplotypes are found. For example, in a given population, 55% of people may have one version of a haplotype, 30% may have another, 8% may have a third, and the rest may have a variety of less common haplotypes. The International HapMap Project is identifying these common haplotypes in four populations from different parts of the world.
One type of human genetic variation consists of deletion variants—segments of the human genome that are present in some individuals and absent in others. The locations of common deletion variants in the human genome are largely unknown, as is the best way to determine the association of such variants with disease. To address these questions, we developed an approach for using the HapMap to discover, localize, and analyze common deletion variants. We found hundreds of deletion variants, 1 kb-745 kb in size, including common deletion variants that were observed as homozygous deletions in a number of expressed genes that are specifically expressed in organs relevant to transplantation, such as liver, prostate, kidney, intestine, and skin. These common deletion variants prevent the expression of the protein, or antigen, encoded by these genes.
The present invention features methods for identifying immunocompatible subjects by determining the presence or absence of deletion variants, preferably a deletion variant in all copies of the gene, that substantially prevents expression of either the gene or the antigen encoded by a gene. The lack of expression of the gene or of an antigen encoded by the gene, respectively, is used to identify subjects or subject samples that are immunocompatible. Screening subjects for immunocompatibility is used, for example, to identify donor/recipient matches for transplantation, to identify maternal/fetal compatibility issues in prospective parents, and to identify bone marrow donors that are not immunocompatible with a recipient and can be used to provoke an immune response to the tumor cells in a recipient having a blood cancer. Therefore, the present invention provides methods for immunocompatibility typing which can be used alone or together with previously known typing techniques to manage, measure, prevent, and provoke histoincompatibility.
Identification of Common Deletion Variants
We used data from the International HapMap Project, including about 1.3 million SNP assays in 270 individuals of European, Yoruban, and Chinese and Japanese ancestry, to identify clusters of regionally aberrant genotype patterns (see Examples below). We validated the presence of polymorphic deletions by fluorescence in situ hybridization (FISH), fluorescence allelic-intensity measurements, and PCR. Altogether, more than 80 common deletions were validated by one or more of these approaches.
The deletion alleles were linked to the same SNP alleles in each population, suggesting that each deletion derived from an ancestral mutation that occurred before humans migrated from Africa to Europe and Asia. The observed levels of linkage disequilibrium indicate that these common deletion variants are highly conserved among individuals and that SNPs can be used to discover, analyze, and serve as markers for these variants.
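Linkage disequilibrium between a deletion allele and a nearby SNP allele is commonly summarized by the statistic r², computed from haplotype frequencies. The following is a minimal sketch of that standard calculation, not a description of the analysis actually performed; it assumes phased haplotypes are available, and the function name and example data are hypothetical.

```python
from typing import List, Tuple


def r_squared(haplotypes: List[Tuple[bool, str]], snp_allele: str) -> float:
    """r^2 between a deletion allele and one SNP allele.

    Each haplotype is (carries_deletion, snp_allele_on_that_chromosome).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_del = sum(1 for d, _ in haplotypes if d) / n
    p_snp = sum(1 for _, s in haplotypes if s == snp_allele) / n
    p_both = sum(1 for d, s in haplotypes if d and s == snp_allele) / n
    denom = p_del * (1 - p_del) * p_snp * (1 - p_snp)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    d_coeff = p_both - p_del * p_snp   # disequilibrium coefficient D
    return d_coeff * d_coeff / denom


# Hypothetical sample: the deletion always travels with SNP allele "A".
tagged = [(True, "A")] * 4 + [(False, "G")] * 6
print(r_squared(tagged, "A"))
```

A value near 1 means the SNP allele is an effective marker for the deletion, while a value near 0 means the SNP carries little information about deletion status.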
Thirteen protein-coding genes were disrupted or entirely removed by common deletions (Table 1). These common deletion variants were found in multiple genes with roles in drug response, olfaction, and sex steroid metabolism. To learn more about these common gene deletion variants, we developed quantitative PCR assays for distinguishing individuals with 0, 1, and 2 gene copies (
N.D. = no data
These common deletion variants were detected in several tissues including liver, prostate, kidney, heart, and skin, all of which are important to immunocompatibility and transplantation. The expression products from these genes, also known as antigens, are absent in an individual having the common deletion variant. As a result, the immune system of that individual would not be exposed to the antigen, and, if exposed, would recognize the antigen as foreign and respond by mounting an immune response to the antigen. The conservation of these common deletion variants, particularly among people of a shared ancestry, and the ability of the encoded antigens to elicit an immune response indicate that these common deletion variants and the encoded antigens are an effective tool for screening individuals for immunocompatibility, particularly with respect to the organs or tissues in which the antigen is expressed.
Methods for the use of common deletion variants, for example, those identified in Table 1, to screen for and manage immunocompatibility are described in detail below. It should be understood by the skilled artisan that any deletion variant, particularly a common deletion variant, that affects the expression of any antigen can be used in the methods described herein to screen for and manage immunocompatibility or to provoke histoincompatibility, if desired. In addition, it should be understood that any methods for sequencing all or part of a subject's genome, or determining the deletion variant pattern or deletion variant antigen pattern for a subject can be used to identify additional deletion variants in expressed antigens and to screen for and manage immunocompatibility.
Methods for Deletion Variant Typing
Individual subjects can be typed for the presence or absence of common deletion variants in a biological sample using methods for detection of the deletion in the gene or methods for detection of the antigen encoded by the gene. The biological sample used to detect the gene or protein can be any biological material from the subject (e.g., the graft recipient, potential donor, mother, fetus, father, or prospective parent) that contains the antigen or nucleic acids encoding the antigen. For detection of deletion variant antigens, the biological sample is preferably a sample in which the antigen is normally expressed. Desirably the biological material is a bodily fluid, such as blood, serum, plasma, amniotic fluid, cerebrospinal fluid, saliva, urine, and semen, or a cell or tissue in which the antigen or nucleic acid encoding the antigen is expressed. In the case of an organ transplant, the biological sample is desirably a biopsy of the organ to be transplanted and the antigen or nucleic acid encoding the antigen is expressed in the organ. In the case of a bone marrow or peripheral blood transplant, the biological sample is preferably blood, serum, or plasma in which the antigen or nucleic acid encoding the antigen is expressed.
Methods of detecting deletion variants in a nucleic acid are well known to those skilled in the art. In one example, polymerase chain reaction (PCR) can be used to detect a deletion variant in a nucleic acid. Oligonucleotide PCR primers that flank a known deletion polymorphism can be used to amplify genomic DNA spanning the deletion breakpoints in individuals carrying the deletion allele; alternatively, oligonucleotide primers inside the deleted sequence can be used to amplify genomic DNA selectively in individuals carrying the other (non-deletion) allele. The amplified genomic DNA can then be sequenced, analyzed by fluorescence quantitation, resolved on a gel, or otherwise analyzed, and the presence or absence of a deletion variant can be determined. These PCR-based methods can be combined to identify individuals carrying 0, 1, or 2 copies of the deletion allele. Furthermore, quantitative PCR can be used to compare the abundance of a polymorphically deleted locus to the abundance of a control locus, and thereby infer the copy number and, in turn, the deletion status of an individual. Methods for PCR amplifying and sequencing a nucleic acid molecule are well known to those skilled in the art (see, for example, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, Plainview, N.Y. (1989); Ausubel et al., Current Protocols in Molecular Biology (Supplement 47), John Wiley & Sons, New York (1999); Dieffenbach and Dveksler, PCR Primer: A Laboratory Manual, Cold Spring Harbor Press (1995)). The following are examples of PCR primers and quantitative fluorescent probes which we have used to successfully genotype deletion polymorphisms in DNA samples from individuals:
Note:
VIC is the fluorescent label commonly known as “VIC” (available, for example, from Applied Biosystems) and MGBNFQ is a non-fluorescent quencher molecule (available, for example, from Applied Biosystems).
Note:
6FAM is the fluorescent label commonly known as “6FAM” (available, for example, from IDT) and BHQ-1 is a non-fluorescent quencher molecule (available, for example, from IDT).
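The PCR-based genotyping logic described above can be sketched in two parts: combining a flanking-primer assay with an internal-primer assay to count deletion alleles, and inferring copy number from quantitative PCR by comparison with a control locus. The code is illustrative only. The function names are hypothetical, the quantitative part uses the widely used comparative Ct calculation as an assumed stand-in for whatever analysis was actually applied, and it assumes roughly 100% amplification efficiency and a calibrator sample known to carry two copies.

```python
from typing import Optional


def deletion_alleles_from_endpoint_pcr(flanking_product: bool,
                                       internal_product: bool) -> int:
    """Number of deletion alleles (0, 1, or 2) from two presence/absence assays.

    flanking_product: amplicon spanning the deletion breakpoints was detected,
                      indicating at least one deletion allele.
    internal_product: amplicon from inside the deleted segment was detected,
                      indicating at least one non-deletion allele.
    """
    if flanking_product and internal_product:
        return 1
    if flanking_product:
        return 2
    if internal_product:
        return 0
    raise ValueError("no product from either assay; the reaction may have failed")


def gene_copies_from_qpcr(ct_target: Optional[float],
                          ct_control: float,
                          ct_target_calibrator: float,
                          ct_control_calibrator: float,
                          calibrator_copies: int = 2) -> int:
    """Estimate gene copy number (0, 1, 2, ...) by the comparative Ct approach."""
    if ct_target is None:
        return 0  # target locus never amplified: treated as no copies present
    delta_sample = ct_target - ct_control
    delta_calibrator = ct_target_calibrator - ct_control_calibrator
    relative_abundance = 2.0 ** -(delta_sample - delta_calibrator)
    return round(calibrator_copies * relative_abundance)
```

As a usage illustration with hypothetical Ct values, a sample whose target locus amplifies one cycle later than in a two-copy calibrator (with identical control-locus Ct values) is estimated to carry one copy, consistent with a heterozygous deletion.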
Sequence analysis, which is any manual or automated process by which the order of nucleotides in a nucleic acid is determined, also can be useful for determining the presence or absence of a common deletion variant. It is understood that the term sequence analysis encompasses chemical (Maxam-Gilbert) and dideoxy enzymatic (Sanger) sequencing as well as variations thereof. Thus, the term sequence analysis includes capillary array DNA sequencing, which relies on capillary electrophoresis and laser-induced fluorescence detection and can be performed using, for example, the MegaBACE 1000 or ABI 3700. Also encompassed by the term sequence analysis are thermal cycle sequencing (Sears et al., Biotechniques 13:626-633 (1992)); solid-phase sequencing (Zimmerman et al., Methods Mol. Cell Biol. 3:39-42 (1992)); and sequencing with mass spectrometry such as matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) (Fu et al., Nature Biotech. 16:381-384 (1998)). Sequence analysis can be used to determine the sequence of a particular genetic locus known to have a common deletion variant, an entire gene known to contain a common deletion variant, a chromosome, or the entire genome of a subject. The term sequence analysis also includes, for example, sequencing by hybridization (SBH), which relies on an array of all possible short oligonucleotides to identify a segment of sequences present in an unknown DNA (Chee et al., Science 274:610-614 (1996); Drmanac et al., Science 260:1649-1652 (1993); Drmanac et al., Nature Biotech. 16:54-58 (1998); Margulies et al., Nature 437:376-380 (2005); and Bentley, Curr. Opin. Genet. Dev. 16:545-552 (2006)). The whole genome approach to typing individual subjects for the presence or absence of common deletion variants is described in detail below.
Other methods for detecting the presence or absence of a deletion variant include electrophoretic analysis and restriction fragment length polymorphism (RFLP) analysis. Electrophoretic analysis, as used herein in reference to one or more nucleic acid molecules such as amplified fragments, means a process whereby charged molecules are moved through a stationary medium under the influence of an electric field. Electrophoretic migration separates nucleic acid molecules primarily on the basis of their charge, which is in proportion to their size. The term electrophoretic analysis includes analysis using both slab gel electrophoresis, such as agarose or polyacrylamide gel electrophoresis, and capillary electrophoresis. Capillary electrophoretic analysis is generally performed inside a small-diameter (50-100-μm) quartz capillary in the presence of high (kilovolt-level) separating voltages with separation times of a few minutes. Using capillary electrophoretic analysis, nucleic acids are conveniently detected by UV absorption or fluorescent labeling, and single-base resolution can be obtained on fragments up to several hundred base pairs. Such methods of electrophoretic analysis, and variants thereof, are well known in the art, as described, for example, in Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc. New York (1999).
Restriction fragment length polymorphism (RFLP) analysis also can be useful for determining the presence or absence of a deletion variant (Jarcho et al., in Current Protocols in Human Genetics, Dracopoli et al., eds., pages 2.7.1-2.7.5, John Wiley & Sons, New York (1994); Innis et al., (Ed.), PCR Protocols, San Diego: Academic Press, Inc. (1990)). As used herein, restriction fragment length polymorphism analysis means any method for distinguishing genetic polymorphisms using a restriction enzyme, which is an endonuclease that catalyzes the degradation of nucleic acid and recognizes a specific base sequence, generally a palindrome or inverted repeat. One skilled in the art understands that the use of RFLP analysis depends upon an enzyme that can differentiate two alleles at a polymorphic site. For example, if the restriction enzyme recognizes a specific base sequence that is present in the nucleic acid sequence containing the deletion variant, then a subject having the deletion variant would not have cleavage at that restriction enzyme site and would therefore produce a different enzymatic cleavage pattern than a subject lacking the deletion variant and having the restriction enzyme site.
Other methods for detecting the presence or absence of a deletion variant at a polymorphic site include allele-specific oligonucleotide (ASO) hybridization. Allele-specific oligonucleotide hybridization is based on the use of a labeled oligonucleotide probe having a sequence perfectly complementary, for example, to a known or predicted deletion variant site. A heteroduplex mobility assay (HMA) is another well-known assay that can be used to detect a common deletion variant according to a method of the invention. HMA is useful for detecting the presence of a polymorphic sequence since a DNA duplex carrying a mismatch has reduced mobility in a polyacrylamide gel compared to the mobility of a perfectly base-paired duplex (Delwart et al., Science 262:1257-1261 (1993); White et al., Genomics 12:301-306 (1992)).
The technique of single strand conformational polymorphism (SSCP) can also be used to detect the presence or absence of a deletion variant (see Hayashi, PCR Methods Applic. 1:34-38 (1991)). This technique can be used to detect deletions based on differences in the secondary structure of single-strand DNA that produce an altered electrophoretic mobility upon non-denaturing gel electrophoresis. Polymorphic fragments are detected by comparison of the electrophoretic pattern of the test fragment to corresponding standard fragments containing known alleles.
SNP genotyping can also be used to detect the presence or absence of a deletion variant. We have observed that common deletion polymorphisms are generally in linkage disequilibrium with nearby SNPs, which suggests that specific SNP genotyping assays could be used to indirectly detect a deletion polymorphism. In this technique, a SNP that is known to be in linkage disequilibrium with a deletion polymorphism, such that individuals carrying the deletion almost always carry a particular variant of the SNP, is used as a marker for the presence of the deletion. Individuals can be typed for the SNP as a way of indirectly typing for the deletion. Techniques for deriving SNP genotypes include hybridization to allele-specific complementary sequences on microarrays or beads, as well as allele-specific primer extension.
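The strength of the association between a deletion and a nearby SNP allele is commonly summarized by the linkage disequilibrium statistic r². The following is a minimal sketch, assuming phased haplotype counts are available for the deletion and the SNP; the function name and the example counts are illustrative only and are not data from this disclosure.

```python
def ld_r_squared(n_del_a, n_del_b, n_nodel_a, n_nodel_b):
    """r^2 between a deletion allele and SNP allele 'a' from phased haplotype counts.

    n_del_a:   haplotypes carrying the deletion and SNP allele a
    n_del_b:   haplotypes carrying the deletion and SNP allele b
    n_nodel_a: haplotypes without the deletion carrying SNP allele a
    n_nodel_b: haplotypes without the deletion carrying SNP allele b
    """
    total = n_del_a + n_del_b + n_nodel_a + n_nodel_b
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_del = (n_del_a + n_del_b) / total      # frequency of the deletion
    p_a = (n_del_a + n_nodel_a) / total      # frequency of SNP allele a
    p_del_a = n_del_a / total                # frequency of the deletion/a haplotype
    denom = p_del * (1 - p_del) * p_a * (1 - p_a)
    if denom == 0:
        return 0.0                           # one of the two sites is monomorphic
    d = p_del_a - p_del * p_a                # disequilibrium coefficient D
    return d * d / denom


# Hypothetical counts: allele a is found only on deletion haplotypes.
perfect_tag = ld_r_squared(30, 0, 0, 70)
# Hypothetical counts: the deletion and the SNP are independent.
no_tag = ld_r_squared(25, 25, 25, 25)
```

A SNP with r² near 1 would serve as a close proxy for the deletion; a SNP with r² near 0 would carry essentially no information about it.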
We have further observed that genotyping of a SNP that is inside a deleted region can also be used to infer the presence of a deletion that removes the site of the SNP. In particular, the presence of the deletion causes particular SNP genotyping results, including null genotypes, apparent Mendelian inconsistencies, and reductions in intensity measurements. The SNP genotypes used for this purpose can be derived by the same techniques described above.
Denaturing gradient gel electrophoresis (DGGE) also can be used to detect a deletion variant. In DGGE, double-stranded DNA is electrophoresed in a gel containing an increasing concentration of denaturant; double-stranded fragments made up of mismatched alleles have segments that melt more rapidly, causing such fragments to migrate differently as compared to perfectly complementary sequences (Sheffield et al., “Identifying DNA Polymorphisms by Denaturing Gradient Gel Electrophoresis” in Innis et al., supra, 1990).
In addition to using DGGE as described above, other methods to detect heteroduplexes include temperature gradient gel electrophoresis (TGGE), constant denaturant gel electrophoresis (CDGE), and base excision sequence scanning (BESS) (Gupta, The Scientist 13:25-28 (1999)). Other methods include oligonucleotide ligation assay (OLA) in which a PCR-amplified target is hybridized to two oligonucleotides, one tagged, for example, with biotin, and the other with a reporter molecule and then ligated with DNA ligase. If the tag and reporter oligonucleotides are ligated, the tagged molecule can be used to isolate the ligated oligonucleotide and the reporter molecule can be detected.
Other well-known approaches for determining the presence or absence of a deletion variant include automated sequencing and RNAase mismatch techniques (Winter et al., Proc. Natl. Acad. Sci. 82:7575-7579 (1985)). In view of the above, one skilled in the art realizes that the methods of the invention for determining the presence or absence of a deletion variant in an individual can be practiced using any one of the well known assays described above, or another art-recognized assay for genotyping. Furthermore, one skilled in the art understands that individual alleles can be detected by any combination of molecular methods (see, in general, Birren et al. (Eds.) Genome Analysis: A Laboratory Manual Volume 1 (Analyzing DNA) New York, Cold Spring Harbor Laboratory Press (1997)).
Additional methods for determining the presence of deletion variants include fluorescence in situ hybridization (FISH) and fluorescence allelic-intensity measurements, examples of which are described in the Examples below. FISH is used to visualize the presence or absence of a DNA sequence on chromosomes, via hybridization of a fluorescent probe to the chromosome in situ.
In addition to the above methods for detecting the presence of a known human deletion polymorphism, additional methods, known to those versed in the art, can be used to scan the genome of one individual for deletions of DNA sequences which are present in other individuals. One such method is microarray hybridization, in which DNA from a subject is probed with a microarray of nucleic acids containing human genomic sequences, and the user identifies microarray probes which are not bound by that individual's genomic DNA. Another such method is whole-genome sequencing, in which the DNA from an individual is systematically sequenced. In this application, the practitioner could look for nucleic acid sequences which appear to be absent from that individual's sequence but which are known to be present in other individuals. Another such method is subtractive hybridization, in which two DNA samples are compared by molecular techniques which allow DNA sequences that are present in the first sample to be selectively removed from the second sample, leaving only those DNA sequences that are present in the second sample and not in the first sample. Such an approach could be used to identify genomic loci that were deleted in the individual from whom the first sample was obtained but present in the second individual from which the second sample was obtained.
Methods for detecting the presence or absence of a deletion variant antigen are also well known in the art and include, for example, immunoassays to detect the presence of an antigen in the biological sample of the subject. Polyclonal or monoclonal antibodies specific for each antigen can be used in any standard immunoassay format (e.g., ELISA, sandwich ELISA, Western blot, or RIA; see, e.g., Ausubel et al., supra) to determine the presence of the antigen. Standard methods for enzyme immunoassays can also be used to detect antigens that are present on enzymes, such as GSTM1, GSTT1, UGT2B17, UGT2B28, and CYP2A6. ELISA assays are the preferred method for measuring levels of any one or more of the following antigens: UGT2B17, UGT2B28, TRY6, LCE3C, GSTM1, GSTT1, CYP2A6, PRB1, OR51A2, ORF4F5, GNB1L, MGAM, and MCEE. Particularly preferred, for its ease and simplicity of detection and its quantitative nature, is the sandwich or double antibody ELISA, of which a number of variations exist, all of which are contemplated by the present invention. For example, in a typical sandwich ELISA, unlabeled antibody that recognizes the antigen is immobilized on a solid phase, e.g., a microtiter plate, and the sample to be tested is added. After a certain period of incubation to allow formation of an antibody-antigen complex, a second antibody, labeled with a reporter molecule capable of inducing a detectable signal, is added and incubation is continued to allow sufficient time for binding with the antigen at a different site, resulting in the formation of a complex of antibody-antigen-labeled antibody. The presence of the antigen is determined by observation of a signal, which may be quantitated by comparison with control samples containing known amounts of antigen.
Immunohistochemical techniques can also be utilized for detection of any of the antigens in a tissue biopsy sample. For example, a tissue sample can be obtained from a subject, sectioned, and stained for the presence of the antigen using an antibody that specifically binds the antigen and any standard detection system (e.g., one that includes a secondary antibody conjugated to an enzyme, such as horseradish peroxidase). General guidance regarding such techniques can be found in, e.g., Bancroft et al., Theory and Practice of Histological Techniques, Churchill Livingstone, 1982, and Ausubel et al., supra.
The methods described herein can be used to detect one or more deletion variants, preferably common deletion variants, in a single gene or in more than one gene. For example, an individual can be typed for the presence of one, two, three, four, five, six or more common deletion variants in nucleic acids encoding one, two, three, four, five, six or more different antigens (e.g., UGT2B17, UGT2B28, TRY6, LCE3C, GSTM1, GSTT1, CYP2A6, PRB1, OR51A2, ORF4F5, GNB1L, MGAM, and MCEE). The methods described herein can also be used to detect one or more deletion variant antigens. For example, an individual can be typed for the presence or absence of one, two, three, four, five, six or more deletion variant antigens. While it is preferred that two subjects are a perfect match for each and every deletion variant or deletion variant antigen tested, individuals can be ranked for immunocompatibility depending on the number of matches and the relative importance of the antigen. For example, an individual in need of a liver transplant would seek a donor having a common deletion variant type match at the UGT2B17, UGT2B28, and GSTM1 loci, all of which are expressed in the liver, but may not be matched for common deletion variants at the OR51A2 locus, which is expressed in the olfactory epithelium. Two subjects can also be typed for deletion variant patterns or deletion variant antigen patterns, in which one or more genes, genomic loci, chromosomes, or the entire genome is assayed using the methods described herein to determine the presence or absence of deletion variants throughout the one or more genes, genomic loci, chromosomes, or entire genome assayed. The information is then compiled into a deletion variant pattern for each subject, and the patterns can be compared either for overall substantial identity or for substantial identity within a defined set of genes or antigens, e.g., those expressed in an organ or tissue being transplanted.
For example, a subject in need of a liver transplant may show deletion variants in 3 genes expressed in the kidney and 1 gene expressed in the liver and a potential donor has a deletion variant in 1 of the same genes expressed in the kidney and the same 1 gene expressed in the liver. The potential liver donor is identified as immunocompatible because of the 100% identity of the deletion variant pattern in the relevant tissue (i.e., the liver).
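The tissue-restricted comparison in the preceding example can be sketched as follows. This is an illustration only: the gene labels are hypothetical placeholders, the expression map is assumed to be supplied by the user, and a value of True denotes a deletion variant in all copies of the gene.

```python
def pattern_identity(recipient, donor, expression, tissue):
    """Fraction of genes expressed in `tissue` with the same deletion status in both subjects.

    recipient, donor: dict mapping gene -> True if the gene is deleted in all copies
    expression:       dict mapping gene -> set of tissues in which the gene is expressed
    Returns None when no assayed gene is expressed in the tissue.
    """
    relevant = [gene for gene, tissues in expression.items() if tissue in tissues]
    if not relevant:
        return None
    same = sum(1 for gene in relevant
               if recipient.get(gene, False) == donor.get(gene, False))
    return same / len(relevant)


# Hypothetical placeholders mirroring the example in the text.
expression = {
    "KIDNEY_GENE_1": {"kidney"},
    "KIDNEY_GENE_2": {"kidney"},
    "KIDNEY_GENE_3": {"kidney"},
    "LIVER_GENE_1": {"liver"},
}
recipient = {"KIDNEY_GENE_1": True, "KIDNEY_GENE_2": True,
             "KIDNEY_GENE_3": True, "LIVER_GENE_1": True}
donor = {"KIDNEY_GENE_1": True, "LIVER_GENE_1": True}

liver_identity = pattern_identity(recipient, donor, expression, "liver")
kidney_identity = pattern_identity(recipient, donor, expression, "kidney")
```

With these inputs the liver pattern is fully identical even though the kidney patterns differ at two of three genes, which is why the donor in the example is treated as compatible for a liver graft.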
Methods for Whole Genome Sequence Analysis to Determine Immunocompatibility
As described above, sequence analysis, including any manual or automated process, can be used for determining the presence or absence of a common deletion variant. Such sequence analysis can also be used to analyze the genome, or a subset thereof, of an individual subject and to compare that subject's genome sequence, or subset thereof, to the genome sequence or the same subset thereof, in a second individual or a cell, tissue, or organ from the second individual. This type of whole genome, or subset thereof, sequence analysis can be used to search for or identify a deletion variant that is present in one individual and absent in a second individual. The deletion variant can vary in size from 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 bp, or 2 kb, 3 kb, 4 kb, 5 kb, 7 kb, 8 kb, 9 kb, 10 kb, 100 kb, 200 kb, 300 kb, 400 kb, 500 kb, 600 kb, 700 kb, 745 kb, 800 kb, 900 kb, or 1000 kb in length. A locus at which a deletion variant is present in one individual and absent in a second individual is called a deletion mismatch locus.
The identification of a deletion mismatch locus between two individuals is predictive of histoincompatibility if:
(i) there is a homozygous deletion variant at a locus in the genome of a first subject (e.g., a candidate bone marrow donor) but not at that locus in the genome of the second subject (e.g., a candidate bone marrow recipient), (ii) there is a homozygous deletion at a locus in the genome of a first subject (e.g., a candidate organ recipient) but not at that locus in the genome of the second subject (e.g., a candidate organ donor), or (iii) there is a homozygous deletion at a locus in the genome of a first subject (e.g., a candidate mother) but not at that locus in the genome of the second subject (e.g., the candidate father, embryo, fetus, or miscarriage).
Alternatively or additionally, the sequence of the genome or subset thereof of the first subject can be compared to a reference genome DNA sequence, where the reference genome sequence can be the DNA sequence from a third subject or from a composite of multiple subjects. The identification of a deletion mismatch locus between the first subject and the reference genome DNA sequence is then carried out as described above and used to predict histoincompatibility as described above.
The whole-genome analysis can be performed using any sequencing technique known in the art or described herein. In one example, a whole genome sequencing approach can be used where millions of genome-wide sequence reads are obtained from the patient's DNA. Technologies available for massively parallel sequencing include sequencing by synthesis in arrays such as on fiber optic slides and single-molecule sequencing via nanopores (Margulies et al., Nature 437:376-380 (2005) and Bentley, Curr. Opin. Genet. Dev. 16:545-552 (2006)). Homozygous deletions are identified as loci which are not covered by any sequence reads, despite overall sequencing having been performed at a sufficient depth to have covered all genomic loci present in that individual.
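The read-coverage criterion can be illustrated with a short sketch that reports the stretches of a region not covered by any aligned read. It assumes that reads have already been aligned and are given as half-open (start, end) coordinates; the function name and the 100 bp minimum length (chosen here to echo the smallest deletion size mentioned above) are illustrative assumptions, and the sketch does not itself establish that sequencing depth was sufficient.

```python
def uncovered_intervals(region_start, region_end, reads, min_len=100):
    """Return half-open intervals within [region_start, region_end) covered by no read.

    reads:   iterable of (start, end) half-open alignment coordinates
    min_len: shortest uncovered stretch to report (an illustrative default)
    """
    gaps = []
    cursor = region_start                    # leftmost position not yet known to be covered
    for start, end in sorted(reads):
        if end <= cursor:
            continue                         # read lies entirely in territory already covered
        if start >= region_end:
            break                            # remaining reads start beyond the region
        if start > cursor and start - cursor >= min_len:
            gaps.append((cursor, start))     # uncovered stretch before this read
        cursor = max(cursor, end)
    if region_end - cursor >= min_len:
        gaps.append((cursor, region_end))    # uncovered tail of the region
    return gaps


# Hypothetical alignments over a 3 kb region.
example_reads = [(0, 500), (400, 900), (2000, 2600), (2550, 3000)]
candidate_deletions = uncovered_intervals(0, 3000, example_reads)
```

An uncovered interval is only a candidate homozygous deletion; regions that are difficult to align or that were sampled at low depth can also lack reads, which is why the text conditions the inference on sufficient overall depth.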
Another technique useful for the whole genome sequence analysis is genomic hybridization. For this method, patient DNA is labeled with a suitable marker (typically a fluorescent molecule), with or without amplification, and hybridized to an array consisting of DNA probes. These probes can consist of oligonucleotides, plasmids, fosmids, or other genomic clones. Deletions are identified from probes for which the patient's DNA fails to yield the appreciable hybridization signal that is normally observed in DNA from other individuals, or fails to yield a hybridization signal beyond that which would be expected from cross-hybridization to other genomic sequences.
Additional techniques for whole genome sequence analysis are described in Bentley, supra, (herein incorporated by reference in its entirety) and include microelectrophoresis and single molecule sequencing.
Immunocompatibility between two subjects can be determined by the identification of deletion mismatch loci, where two subjects would be considered not immunocompatible if there is at least one, two, three, four, five, six, seven, eight, nine, ten or more homozygous deletion mismatch loci identified between the two subjects; or when a scoring system, which combines information across multiple deletion mismatch loci, is determined to have an appropriately high mismatch score. Preferably, the one or more deletion mismatch loci would remove the protein-coding sequences and prevent expression of the encoded antigen in the individual homozygous for the deletion. Alternatively or additionally, a scoring system can be used to determine the relevance of each deletion mismatch locus identified between the two subjects. The scoring system would score each of the homozygous deletion mismatch loci for its potential contribution to antigenicity, and produce a composite score which combines information across all deletion mismatch loci, and potentially combines this with additional information relevant to histocompatibility, such as the subjects' sex and the subjects' HLA types. For example, a scoring system could assign points for deletions which remove protein-coding sequences for which the encoded proteins are generally expressed in tissues relevant to the immune response considered in the clinical application. For example, for kidney transplant, deletion variants in genes encoding proteins which are expressed in the kidney are assigned points. Additional points are awarded if those deletions affect protein-coding sequences which (i) encode peptide sequences known or predicted to be presented by that individual's HLA alleles or (ii) contain sequences which are particularly accessible to antibodies, such as sequences encoding extracellular domains of proteins.
In the scoring system described above, donor-recipient pairs with a high “deletion mismatch score” are interpreted to be more likely to have histoincompatibilities; such a diagnosis might recommend the use of a different donor, or the application of a tolerization regimen, or the further investigation of any particular deletion mismatches identified by this analysis. This further investigation could include testing the relevant donor or patient for pre-existing antibodies or pre-existing T-cell responses to the antigen encoded by the genomic region(s) identified as the deletion loci.
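One way such a point-based composite score could be organized is sketched below. The point values (one base point plus one point for each additional criterion) and all names are assumptions made for illustration; the text does not prescribe particular weights, and the gene labels are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MismatchLocus:
    """Annotations for one homozygous deletion mismatch locus (hypothetical schema)."""
    gene: str
    removes_coding_sequence: bool
    expressed_in_relevant_tissue: bool
    peptide_presented_by_hla: bool = False
    extracellular_domain: bool = False


def locus_score(locus):
    """Points for one locus; the weights are illustrative, not values from the text."""
    if not (locus.removes_coding_sequence and locus.expressed_in_relevant_tissue):
        return 0
    score = 1                                # coding sequence removed, relevant tissue
    if locus.peptide_presented_by_hla:
        score += 1                           # additional point, criterion (i)
    if locus.extracellular_domain:
        score += 1                           # additional point, criterion (ii)
    return score


def deletion_mismatch_score(loci):
    """Composite score: the sum of the per-locus points."""
    return sum(locus_score(locus) for locus in loci)


example_loci = [
    MismatchLocus("GENE_A", True, True, peptide_presented_by_hla=True),
    MismatchLocus("GENE_B", True, False, True, True),   # not expressed in relevant tissue
    MismatchLocus("GENE_C", True, True),
]
example_score = deletion_mismatch_score(example_loci)
```

A simple sum keeps each locus's contribution easy to audit; additional information such as sex or HLA type could be incorporated as further terms, and any threshold for a "high" score would need to be established separately.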
Statistical analysis or metrics for prioritization or comparison of genomic information are known in the art and can be applied to the methods herein to prioritize and compare the deletion mismatch loci between two subjects and to generate a composite mismatch score reflecting mismatches (including deletion mismatches) at multiple loci. Examples of such analytical methods include naïve Bayesian scoring, decision trees, and boosting; these and similar approaches are routinely applied to genome-scale data sets to derive focused predictions (Jansen et al., Science 302: 449-453, 2003; Calvo et al., Nat Genet 38: 576-582, 2006).
Uses of the Deletion Variants to Measure and Manage Immunocompatibility
We have discovered a number of common deletion variants, particularly among people of a shared ancestry, in genes that encode antigens expressed in tissues relevant to immunocompatibility. The conservation of these common deletion variants among multiple individuals, the presence of the antigens encoded by these polymorphic genes in relevant tissues, and the ability of the antigens to elicit an immune response make them ideal candidates for screening methods that determine immunocompatibility in any situation where immunocompatibility or lack of immunocompatibility is desired.
For example, the methods described herein can be used to detect common deletion variants to determine immunocompatibility between a subject in need of a transplant (a recipient) and a potential donor. These methods can also be used to screen for maternal/fetal incompatibility in cases of spontaneous abortion or among prospective parents having difficulty conceiving. The methods for identifying common deletion variants can also be used to identify a bone marrow donor for a recipient having a blood cancer where the recipient and the donor are not immunocompatible. In this case, the donor's immune system would attack the cancer cells that remain in the recipient's blood system, thereby enabling the transplanted bone marrow not only to replace the host's bone marrow but also to aid in the treatment of the cancer by killing off any remaining cancer cells present in the recipient. All of these uses are described in detail below.
Organ, Bone Marrow, and Blood Transplantation
Despite the increased success of organ and bone marrow transplantation in recent decades, the overall success is limited by the likelihood of graft rejection and the potentially fatal effects of GVHD or HVGD. In GVHD, most commonly seen in bone marrow transplants, the immune cells in the donor's graft recognize the antigens in the recipient as foreign and mount an immune attack against the host cells. In HVGD, most commonly seen in organ transplants, the recipient's immune system recognizes the antigens expressed in the donor organ graft as foreign and mounts an immune attack against the graft. Although in some cases the immune response can be treated using immunosuppressive drugs, the problems that arise from these drugs present additional health-related complications.
Blood typing and tissue typing for HLA antigens are the most common screens used today for determining immunocompatibility between a recipient and a potential donor prior to transplantation. However, these methods, when used alone, are not always effective or sufficient due to the inadequacies of HLA typing methods and the presence of additional antigens that can elicit an immune response.
The deletion variants identified using the methods described herein are useful for screening individuals for immunocompatibility prior to transplantation. In general, a biological sample is obtained from the recipient in need of a transplant and from the potential donor. The biological sample can be any bodily fluid (e.g., blood, serum, plasma, amniotic fluid, cerebrospinal fluid, saliva, urine, or semen), tissue, or cell, and the sample is tested for the presence or absence of a deletion variant either at the nucleic acid level or the antigen level using the methods described above. For organ transplants, a blood sample or a biopsy sample from the organ to be transplanted or both are preferred. For bone marrow transplants, a blood, serum, or plasma sample is preferred, although the particular involvement of liver, intestine, and skin in typical GVHD suggests that antigens in liver, intestine, and skin are also relevant to histocompatibility.
Deletion variant, preferably common deletion variant, typing information can include a nucleic acid “type” or antigen “type” for a particular antigen identified by the methods described herein as having a common deletion variant or any combination of the antigens described in Table 1. Common deletion variant typing can also include whole genome sequences for an individual where common deletion variants can be identified and matched with potential donors based on genome sequencing and analysis as described herein. Deletion variant typing information can also include deletion variant pattern or deletion variant antigen pattern information for a subject.
An organ recipient and organ donor are said to match when the organ donor does not have any antigens that are deleted in the recipient. For histocompatibility between an organ or tissue donor and an organ or tissue recipient, one of three scenarios can occur: 1) both the recipient and the donor have a deletion variant in all copies of the gene, which prevents expression of the antigen in both the recipient and the donor; 2) both the recipient and the donor do not have a deletion variant and both express the antigen; and 3) the recipient does not have the deletion variant and expresses the antigen and the donor has a deletion variant in all copies of the gene that prevents expression of the antigen. In all of these scenarios, the immune system of the recipient would not be newly exposed to the antigen upon transplantation. For histocompatibility between a bone marrow or peripheral blood donor and a bone marrow or peripheral blood recipient, one of three scenarios can occur: 1) both the recipient and the donor have a deletion variant in all copies of the gene which prevents expression of the antigen in both the recipient and the donor; 2) both the recipient and the donor do not have a deletion variant and both express the antigen; and 3) the donor does not have the deletion variant and expresses the antigen and the recipient has the deletion variant in all copies of the gene that prevents expression of the antigen. In all of these scenarios, the immune system of the bone marrow donor would not be newly exposed to the antigen expressed by the recipient upon transplantation.
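The two sets of scenarios differ only in which subject's immune system would be newly exposed to the antigen. A minimal sketch of that rule for a single gene follows; the function name and the "organ" and "marrow" labels are illustrative assumptions, and each boolean means that the subject carries the deletion variant in all copies of the gene.

```python
def is_deletion_mismatch(recipient_deleted, donor_deleted, transplant):
    """Return True when the responding immune system would newly encounter the antigen.

    recipient_deleted, donor_deleted: True if the gene is deleted in all copies
    transplant: "organ"  -> the recipient's immune system responds to the graft
                "marrow" -> the donor-derived immune system responds to the host
    """
    if transplant == "organ":
        # Mismatch only when the recipient lacks the antigen and the donor expresses it.
        return recipient_deleted and not donor_deleted
    if transplant == "marrow":
        # Mismatch only when the donor lacks the antigen and the recipient expresses it.
        return donor_deleted and not recipient_deleted
    raise ValueError("transplant must be 'organ' or 'marrow'")


# Enumerate all four genotype combinations for both settings.
combinations = [(r, d) for r in (True, False) for d in (True, False)]
organ_mismatches = [(r, d) for r, d in combinations if is_deletion_mismatch(r, d, "organ")]
marrow_mismatches = [(r, d) for r, d in combinations if is_deletion_mismatch(r, d, "marrow")]
```

In each setting exactly one of the four combinations is a mismatch, and the remaining three correspond to the three matching scenarios listed in the text.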
The methods described herein can be used to detect a deletion variant in a single gene or in more than one gene. For example, an individual can be typed for the presence of one, two, three, four, five, six or more common deletion variants in expressed antigens. Furthermore, an individual can be screened for deletion variants throughout her genome using whole genome sequencing techniques such as those described above (e.g., genomic hybridization to microarrays, microelectrophoresis, and single molecule sequencing). While it is preferred that two subjects are a perfect match for each and every common deletion variant tested, individuals can be ranked for immunocompatibility depending on the number of matches and the relative importance of the antigen expressed by the gene having the common deletion variant. Priority scoring systems, statistical analysis, and metrics can be used by the skilled artisan to rank the subjects for immunocompatibility. For example, an individual in need of a liver transplant would generally seek a donor having a common deletion variant type match at any, and preferably all, of the UGT2B17, UGT2B28, and GSTM1 loci, all of which are expressed in the liver, but may not be matched for common deletion variants at the OR51A2 locus, which is expressed in the olfactory epithelium. An individual in need of a kidney transplant would generally seek a donor having a common deletion variant type match at any, and preferably all, of the UGT2B28, GSTT1, and GSTM1 loci, all of which are expressed in the kidney. An individual in need of a bone marrow transplant would generally seek a donor having a common deletion variant type match at any, and preferably all, of the UGT2B17, UGT2B28, GSTM1, and CYP2A6 loci.
Combinations of the above with any additional deletion variants either described herein or known in the art, or identified by whole genome sequencing analysis as described herein, can be used to further type the candidate transplant donor and recipients.
A transplant recipient can be screened or “typed” for deletion variants, preferably common deletion variants, in any one or more of the nucleic acids or antigens listed herein at any time after diagnosis of a disease or a propensity to develop a disease that would require an organ, tissue, blood, or bone marrow transplant. A transplant donor can be screened or “typed” for deletion variants, preferably common deletion variants, in any one or more of the antigens listed herein at any time after the decision to donate or serve as a potential donor is made or after the donor's organ, tissue, blood, or bone marrow becomes available. Information regarding the common deletion variant typing of the recipient and donor can be used to identify a histocompatibility match with an already identified individual (e.g., a sibling or a relative) or can be entered into a registry or waiting list for subjects in need of an organ or bone marrow transplant and potential donors, along with additional pertinent information such as name, age, sex, race, blood type, HLA tissue type, geographic location, and urgency of the needed organ or tissue donation.
Procedures for matching transplant donors and recipients using transplant registries are known to the skilled artisan. Generally, when organs are donated, the procuring organization accesses the national transplant computer system, UNet℠, through the Internet, or contacts the UNOS Organ Center directly. In either situation, information about the donor is entered into UNet℠ and a donor/recipient match is run for each donated organ. The resulting match list of potential recipients is ranked according to objective medical criteria (i.e., blood type, tissue type, common deletion variant or antigen type, size of the organ, medical urgency of the patient, as well as time already spent on the waiting list and distance between donor and recipient). Each organ has its own specific criteria.
Using the match list of potential recipients, the local organ procurement coordinator or an organ placement specialist contacts the transplant center of the highest ranked patient, based on policy criteria, and offers the organ. If the organ is turned down, the next potential recipient's transplant center on the match list is contacted. Calls are made to multiple recipients' transplant centers in succession to expedite the organ placement process until the organ is placed. Once the organ is accepted for a patient, transportation arrangements are made and the transplant surgery is scheduled.
Antigen or nucleic acid typing using the deletion variants identified herein can also be used to determine the need for additional immunosuppressive medications such as purine analogs, corticosteroids, FK506, cyclosporine, rapamycin, mycophenolate mofetil, antithymocyte globulin, and anti-CD3 and anti-IL-2 receptor monoclonal antibodies during and after transplantation. For example, if the donor and recipient were not perfectly matched for the antigens tested, the clinician may decide to use more immunosuppressive medications than would be used if the donor and recipient had been a perfect match.
In addition, using the deletion variants described herein, immune rejection can also be monitored by assaying for the presence of antibodies directed against the common deletion variant antigen. Standard immunoassays using the antigen as a substrate to detect binding to antibodies present in the serum or blood sample from a subject are known in the art. Examples of kits in the art used to detect antibodies to a given antigen in serum include kits to detect Helicobacter pylori, Rubella, and cytomegalovirus.
In this example, a recipient, after transplantation, can be screened regularly for the presence of antibodies, or fragments thereof, that specifically bind any of the deletion variant antigens that are or are not matched for the donor and recipient samples. The increased presence of such antibodies as compared to a sample taken prior to transplantation is indicative of an immune response against the antigen and may suggest imminent graft rejection. In this case, the clinician can use the information to make decisions regarding the use of additional immunosuppressive medications or removal of the graft. The development of therapies for depleting such antibodies from a patient, or for masking or otherwise interfering with their ability to bind to antigen, is also contemplated in this invention.
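The comparison against a pre-transplant sample can be sketched as a fold-change check. This is an illustration under stated assumptions: the antibody levels are hypothetical values in arbitrary units, the function name is invented for this sketch, and the two-fold threshold is a placeholder rather than a clinically validated cutoff.

```python
def flag_rising_antibodies(baseline, followups, fold_threshold=2.0):
    """Flag antigens whose antibody level rose relative to the pre-transplant sample.

    baseline:       dict mapping antigen -> pre-transplant antibody level
    followups:      list of (day, dict mapping antigen -> level) after transplantation
    fold_threshold: minimum post/pre ratio to flag (an illustrative placeholder)
    Returns a list of (day, antigen, fold_change) tuples.
    """
    flags = []
    for day, levels in followups:
        for antigen, level in sorted(levels.items()):
            base = baseline.get(antigen)
            if base is None or base <= 0:
                continue                     # no usable baseline for this antigen
            fold = level / base
            if fold >= fold_threshold:
                flags.append((day, antigen, fold))
    return flags


# Hypothetical measurements in arbitrary units.
baseline = {"UGT2B17": 0.5, "GSTM1": 1.0}
followups = [
    (30, {"UGT2B17": 0.5, "GSTM1": 1.0}),
    (60, {"UGT2B17": 2.0, "GSTM1": 1.5}),
]
flags = flag_rising_antibodies(baseline, followups)
```

Antigens without a usable baseline are skipped rather than flagged, since the text defines an increase relative to a sample taken prior to transplantation.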
Graft Versus Tumor Effect
An immune attack by donor-derived immune cells against cancerous host cells is frequently a desired feature of a bone marrow transplant. This “graft-versus-tumor” or “graft-versus-leukemia” effect has been an occasionally successful but highly unpredictable feature of bone marrow transplant. Bone marrow derived from individuals who carry deletions of genes encoding antigens that are generally expressed selectively in leukemic cells might be able to mount a graft-versus-leukemia response without causing a dangerous graft-versus-host risk to other tissues.
In this subset of bone marrow or peripheral blood transplantation, an immune response is actually desired in order to mount an attack against tumor cells present in the blood or bone marrow of the recipient. When a subject has a hematologic disorder, such as blood cell cancer, a bone marrow or peripheral blood transplant is used to introduce new marrow into the recipient's system in order to produce healthy red blood cells, white blood cells, and platelets. Bone marrow transplants are often used, for example, after high doses of chemotherapy or radiation, which kill the cancer cells but also kill the patient's bone marrow.
In this example, common deletion variant typing of the nucleic acids of the invention or the antigens encoded by the invention is done to identify a bone marrow or blood donor that is not compatible with the recipient. Any one or more of the antigens or common deletion variants can be screened but it is most desirable to screen for antigens that are expressed by the cancer cells or progenitor cells. Alternatively or additionally, a whole genome sequence analysis can be performed to identify common deletion variants at deletion mismatch loci. A donor is identified as incompatible with the recipient if the donor has a deletion variant in all copies of the gene that prevents expression of the antigen and the recipient does not have the deletion variant and expresses the antigen. Once a histoincompatible donor is identified for the recipient, the transplant is performed and desirably results in an immune attack mounted by the donor's transplanted immune cells against the remaining cancer or disease cells in the host recipient. This desired outcome of transplantation is termed graft versus tumor and not only provides healthy blood cells to the patient but also aids in the treatment of the cancer by killing the remaining cancer cells.
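The donor incompatibility rule stated above can be illustrated with a minimal Python sketch. It assumes each individual has already been genotyped as carrying 0, 1, or 2 copies of each gene; the function name and the placeholder gene labels (GENE_A and so on) are hypothetical and are used only for illustration.

```python
def deletion_mismatch_loci(donor_copies, recipient_copies):
    """Return loci at which the donor lacks the gene entirely
    (0 copies, so the antigen is not expressed) while the recipient
    carries at least one copy (so the antigen is expressed).

    Both arguments map a gene label to a copy number of 0, 1, or 2.
    A gene missing from recipient_copies is treated as 0 copies,
    which is an assumption of this sketch.
    """
    return sorted(
        gene
        for gene, donor_n in donor_copies.items()
        if donor_n == 0 and recipient_copies.get(gene, 0) > 0
    )


# Hypothetical copy-number genotypes for three placeholder genes.
donor = {"GENE_A": 0, "GENE_B": 1, "GENE_C": 0}
recipient = {"GENE_A": 2, "GENE_B": 2, "GENE_C": 0}
mismatches = deletion_mismatch_loci(donor, recipient)
```

Under this rule, a donor with a non-empty result would be a candidate where a graft-versus-tumor response is desired, particularly when the mismatched antigen is one expressed by the cancer cells.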
Maternal/Fetal Immunocompatibility
The methods of the present invention are also useful for screening individuals for immunocompatibility to diagnose and understand maternal/fetal incompatibility issues that may contribute to spontaneous abortion or miscarriage. In some cases, fertility issues arise not because of fertility problems but because of immunocompatibility issues between the mother and the prospective father or sperm donor. One common example of such a case occurs when a mother is Rh negative and her partner is Rh positive. Rh factor is a protein present in the red blood cells of most people, capable of inducing intense antigenic reactions. If the mother has an Rh antibody titer after sensitization during a previous pregnancy or due to a previous incompatible transfusion, and the fetus is Rh positive, then the mother's immune system can mount an attack against the fetal cells expressing the Rh factor. Such an attack can result in spontaneous abortion or many lifelong complications for the baby before and after birth. Pregnant women or women interested in conceiving are often tested for the presence of antibodies for Rh as are fetuses in women who are Rh negative.
Despite this understanding of Rh compatibility issues, many spontaneous abortions and fertility problems still occur as a result of incompatibility of antigens that have not yet been identified. Using the methods of the present invention, a woman and a man wanting to conceive can be tested for any one or more of the common deletion variants of the invention. Such typing can occur at the DNA level (either whole genome sequencing or to identify the presence or absence of known deletion variants) or using antigen typing for common deletion variant antigens other than MHC, Rh factor, or blood type. Antigen typing can occur as a preliminary screen or after fertility problems or one or more spontaneous abortions have occurred. Similarly, a woman intending to use a sperm donor can be screened and the sperm can be screened for deletion variants or expression of the deletion variant antigens encoded by the polymorphic genes. In the case of known incompatibility, the fetus can also be tested. Similarly, a woman undergoing in vitro fertilization could have several embryos tested for histocompatibility with her, to ensure that a histocompatible embryo is implanted and thereby maximize the probability of a successful pregnancy. Information gained from antigen or common deletion variant typing can be used to understand fertility issues, to identify problems with potential partners, or to monitor an at risk fetus when incompatibility is known.
For histocompatibility between a woman and a prospective father or sperm donor or an embryo or fetus, one of three scenarios can occur: 1) both the mother and the father, sperm, embryo or fetus have a deletion variant, preferably a deletion variant in all copies of the gene, which prevents expression of the antigen in both the recipient and the donor; 2) both the mother and the father, sperm, embryo or fetus do not have a deletion variant and both express the antigen; and 3) the mother does not have the deletion variant and expresses the antigen and the father, sperm, embryo or fetus has a deletion variant in all copies of the gene that prevents expression of the antigen. In all of these scenarios, the immune system of the mother would not be newly exposed to the antigen upon transplantation.
In one example, a pregnant woman presents at her OB/GYN office for a prenatal visit. Routine blood work determines that she has one or more common deletion variants resulting in non-expression of the encoded antigen. Examples of particular deletion variants that are useful in this method include UGT2B28, UGT2B17, and LCE3C, all of which are expressed in the placenta. Her partner does not have the common deletion variant and expresses the antigen. The pregnant woman is then further tested to determine if she has a serum antibody titer to the antigen. Fetal DNA or antigen typing using amniotic fluid can also be performed. If the fetus is determined to lack the common deletion variant or express the antigen, further monitoring of the fetus by the clinician or by sonography or amniocentesis can be performed. However, if the fetus is determined to have the common deletion variant, the fetus is judged to be at low risk for immune attacks by the maternal immune system and can be followed by non-invasive procedures such as sonography.
Combination Screening Methods
Although the methods described herein are effective for determining immunocompatibility between individuals, they can also be combined with additional known screens and tissue typing methods for the identification of compatible or incompatible individuals. Such methods include blood type matching, Rh factor typing, and HLA typing, all of which are known in the art. When immunocompatibility is desired, individuals matched for antigens can also be screened (either prior to or after antigen screening) for matching blood types and matching HLA types. When immunoincompatibility is desired, individuals that are identified as having different antigen types can also be screened (either prior to or after antigen screening) for the presence or absence of distinct blood types and HLA types.
For blood typing, an individual with Type A blood is compatible with an individual with Types A or O. An individual with Type B blood is compatible with an individual with Types B or O. An individual with Type O blood is only compatible with an individual with Type O. An individual with Type AB blood is compatible with an individual having any blood type. Blood types can also be measured for compatibility of Rh factor.
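The ABO rules listed above can be written as a small lookup table. The Python sketch below is illustrative only; it assumes the first individual in each sentence is the recipient and the second is the donor, and it does not model Rh factor. The names used are hypothetical.

```python
# Donor ABO types acceptable for each recipient ABO type,
# transcribed from the rules in the preceding paragraph.
ACCEPTABLE_DONORS = {
    "A": {"A", "O"},
    "B": {"B", "O"},
    "O": {"O"},
    "AB": {"A", "B", "AB", "O"},
}


def abo_compatible(recipient_type, donor_type):
    """Return True if the donor's ABO type is acceptable for the recipient."""
    return donor_type in ACCEPTABLE_DONORS[recipient_type]
```

For instance, `abo_compatible("A", "O")` evaluates to True, while `abo_compatible("O", "A")` evaluates to False.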
For HLA typing, the screen can include any number of the proteins encoded by the HLA region and generally includes from one to six of the proteins. The polymorphic proteins encoded by the HLA region have been designated HLA-A, -B, -C, -DR, -DQ, and -DP. HLA-A, -B, and -C consist of a single polymorphic chain. HLA-DR, -DQ, and -DP proteins contain two polymorphic chains, designated alpha and beta. These D-region proteins are encoded by loci designated DRA, DRB1, DRB3, DRB4, DQA1, DQB1, DPA1, and DPB1. (See Schwartz, Ann. Rev. Immunol. 3:27-261, 1985.) The products encoded by the polymorphic HLA loci are most commonly typed by serological or nucleic acid based typing methods. See, for example, U.S. Pat. No. 6,194,147 for a description of methods for HLA typing.
Of the many HLA antigens, the National Marrow Donor Program (NMDP) sets minimum matching levels that must be met before a donor or cord blood unit from the NMDP Registry can be used for a transplant. These minimum requirements are based on research studies of transplant outcomes. The HLA antigens that are looked at for these minimum requirements are called HLA-A, -B and -DRB1. One set of these three antigens is inherited from the mother and another set is inherited from the father. This makes a total of six antigens to match. For cord blood units, the NMDP requires a match of at least four of these six HLA antigens. For adult marrow or peripheral (circulating) blood cell donors, the NMDP requires a match of at least five of these six HLA antigens.
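The six-antigen count described above can be sketched in Python as follows. This is a simplified illustration that assumes each individual's typing is given as two antigen labels per locus and that a match means identical labels; the antigen labels, dictionary layout, and function names are hypothetical, and real registry matching involves more detail than is shown here.

```python
from collections import Counter

LOCI = ("A", "B", "DRB1")

# Minimum number of matched antigens (out of six) as stated in the text.
MINIMUM_MATCHES = {"cord_blood": 4, "adult_donor": 5}


def count_hla_matches(donor, recipient):
    """Count matched antigens over HLA-A, -B and -DRB1 (maximum of six).

    Each argument maps a locus name to a pair of antigen labels.
    Counter intersection handles homozygous typings correctly, e.g.
    ("A2", "A2") against ("A2", "A3") counts as one match, not two.
    """
    total = 0
    for locus in LOCI:
        shared = Counter(donor[locus]) & Counter(recipient[locus])
        total += sum(shared.values())
    return total


def meets_minimum(donor, recipient, source):
    """Check the match count against the minimum for the graft source."""
    return count_hla_matches(donor, recipient) >= MINIMUM_MATCHES[source]
```

With a count of four matched antigens, for example, the pair would satisfy the cord blood minimum but not the minimum for an adult marrow or peripheral blood cell donor.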
Potential donors and recipients can also be tested by crossmatching, in which the recipient's blood and the potential donor's blood are placed together in a test tube and examined to see if there is cell death. If all the cells survive without death of the donor's cells, there is a negative crossmatch, which is indicative of immunocompatibility of the individuals. If the cells of the donor begin to die, a positive crossmatch results, which is indicative of immunoincompatibility.
Tolerization
For any of the immunocompatibility testing methods where compatibility is desired, if two subjects are found to be immunoincompatible due to the presence in one subject of a deletion variant in all copies of the gene that is not present in another subject (e.g., an organ donor and a recipient or a potential mother and father trying to conceive), but a transplant or fertilization must still take place between the two subjects, methods for tolerizing the subject having the common deletion variant and therefore lacking the expressed antigen can be used to reduce the risk of organ rejection or spontaneous abortion.
Tolerization regimens are intended to prepare the immune system of an individual (e.g., organ recipient or prospective mother) to accept an antigen that is not expressed in that individual due to a polymorphic deletion. For example, an individual awaiting an organ transplant could be treated to facilitate acceptance of antigens that are not expressed in that individual. In another example, if prospective parents are not compatible because the prospective mother does not express one or more of the antigens encoded by the common deletion variants and the prospective father does, the prospective mother can be treated to tolerize her to the presence of the antigen that may be expressed on the fetus.
Tolerization can be achieved through any gene therapy or protein therapy regimens known in the art for delivery of an antigen or a nucleic acid encoding an antigen to the individual in need of tolerization. The purified protein or nucleic acid encoding the antigen can be delivered directly to a target organ or systemically.
For protein therapy, purified forms of the antigen used for tolerization can be purchased from a commercial source or can be produced by recombinant methods known in the art (see, for example, Sambrook et al., Molecular Cloning: A Laboratory Manual, Vols. 1-3, Cold Spring Harbor Laboratory Press, 3 ed., 2001, or F. Ausubel et al., Current Protocols in Molecular Biology (Green Publishing and Wiley-Interscience: New York, 1987) and periodic updates).
The desired antigen can also be delivered via a nucleic acid encoding the antigen. The nucleic acid can be any nucleic acid (DNA or RNA) including genomic DNA, cDNA, and mRNA encoding the antigen. Methods for nucleic acid therapy are known in the art and can be found, for example, in Sambrook et al., supra, Ausubel et al., supra, and Watson et al., Recombinant DNA, Chapter 12, 2d edition, Scientific American Books, 1992.
In gene therapy applications, genes are introduced into cells in order to achieve in vivo synthesis of a therapeutically effective genetic product. “Gene therapy” includes both conventional gene therapy where a lasting effect is achieved by a single treatment, and the administration of gene therapeutic agents, which involves the one time or repeated administration of a therapeutically effective DNA or mRNA. Standard gene therapy methods typically allow for transient protein expression at the target site ranging from several hours to several weeks. Re-application of the nucleic acid can be utilized as needed to provide additional periods of tolerization.
An additional method for tolerizing immune cells from one individual to a known antigen is to “immunodeplete” those cells which bind to a particular antigen, or which bind to peptide fragments presented on cell surfaces by the MHC. Methods for immunodepletion are known in the art and are reviewed, for example, in Blazar and Murphy, Philos Trans R Soc Lond B Biol Sci. 360:1747-67 (2005).
The locations of common deletions in the human genome are largely unknown, as is the best way to determine the association of such variants with disease. To address these questions, we developed an approach for using the HapMap to discover, localize, and analyze common deletion variants. We found hundreds of deletion variants, 1 kb-745 kb in size, including more than 100 common deletions that were observed as homozygous deletions. Ten of these common deletion variants remove the coding regions of expressed genes thought to contribute to drug response, olfaction, and sex steroid hormone metabolism; the gene deletion variants also explained variation in gene expression at these loci. Most common deletions appear to result from ancestral mutations that have been inherited by descent; they are in linkage disequilibrium with nearby single-nucleotide polymorphisms (SNPs), such that their association to disease could be discovered in whole-genome association studies.
SNPs have long been appreciated as common, potentially phenotype-causing genetic variants and as markers for other, undiscovered variants via linkage disequilibrium. Genome-wide SNP discovery efforts, and the construction of a map of human SNP variation (HapMap consortium), allow for the use of whole-genome SNP genotyping to discover common ancestral mutations that affect disease risk.
Recently, it has been recognized that structural variation—including duplications, deletions, and inversions—is common and extensive. (See, for example, Sebat et al., Science 305:525-8 (2004); Iafrate et al., Nat. Genet. 36:949-51 (2004); Tuzun, et al., Nat. Genet. 37:727-32 (2005); and Sharp et al., Am. J. Hum. Genet. 77:78-88 (2005)). Of all forms of structural rearrangement of a locus, the form with the most obvious potential functional relevance is that which removes the DNA sequence altogether. However, little is known about the location of common deletion polymorphisms on the scale of specific exons and regulatory elements; even less is known about which deletion variants may be sufficiently common to appear as homozygous deletions in many individuals.
To identify, catalog, and enable study of deletion variants across the human genome, we set out to develop and validate a method for discovering deletions from SNP genotypes. We hypothesized that a segregating deletion would leave “footprints” in SNP genotypes, including null genotypes, apparent deviations from Mendelian inheritance, and apparent deviations from Hardy-Weinberg equilibrium (
To determine whether a subset of “failed” SNP genotyping assays in the HapMap data might reflect structural variation, we asked whether such failures are physically clustered in a manner that is specific to individuals. Consistent with this hypothesis, the rate of Mendelian-inconsistent genotypes was elevated near other Mendelian-inconsistent genotypes in the same individual (regardless of whether the same genotyping platform was used for both assays), but was unrelated to Mendelian inconsistencies in other individuals (
We used data from the International HapMap Project to identify clusters of aberrant genotype patterns across the genome. We used the unfiltered genotypes from release 16 of the HapMap, which we downloaded from http://hapmap.org. These consisted of separate genotype files for four population samples: 90 CEPH individuals (30 trios) of European ancestry; 90 individuals (30 trios) of Yoruban ancestry sampled in Ibadan, Nigeria; 45 unrelated individuals of Han Chinese ancestry sampled in Beijing; and 45 unrelated individuals of Japanese ancestry sampled in Tokyo. The population samples are described in detail in Altshuler et al., Nature 437:1299-1320 (2005). We combined the data from the Chinese and Japanese population samples and thereafter treated the data set as three population samples of 90 individuals each.
A complication is that this data had been generated at ten different genotyping centers, using seven different genotyping technologies, each of which showed distinct rates of each type of “failed assay.” We noted that the background rates of Mendel failure, null genotypes, and Hardy-Weinberg disequilibrium differed greatly from technology to technology, and even for the same technology when used by different centers. Furthermore, there were many sample-by-batch interactions, in which particular samples were associated with elevated rates of null genotypes or Mendel failures in particular experimental batches. To distinguish physically clustered patterns of aberrant genotypes from sporadically appearing patterns, we developed a set of statistical thresholds, tailored to each genotype pattern, genotyping center, and genotyping technology, for identifying significantly clustered patterns.
Because we sought to identify multi-assay patterns in the data from independent genotype assays, we did not combine data from multiple assays that potentially used the same sequence features for amplification, labeling, or restriction digest. Thus, we excluded all the Perlegen assays, because the use of 10-kb amplicons on that platform potentially caused long-range patterns of aberrant genotypes wherever an undiscovered SNP altered either primer-binding site. We also excluded data from any experiments whose batch structure corresponded to physical regions of the genome, because this design potentially allowed batch-specific experimental artifacts to appear as regional patterns in the data.
We looked for clustering of aberrant genotype patterns in each of the populations separately as described below.
Null Genotypes
For each genotype assay and population sample, we defined the “null genotype pattern” of that assay as the binary vector (length 90) of null genotype calls across the 90 individuals in that population sample. For each such pattern that was observed on any genotyping platform, we considered each pattern together with its close neighbors (R2>0.8) in pattern space. (This fuzzy clustering was necessary because genotype assays do not consistently obtain 100% complete calls, even in euploid samples.) We determined the background frequency of that set of patterns on the combination of genotyping technology, genotyping center, and (wherever possible) on the specific experimental batch in question. Using that background frequency, we defined a statistical threshold for clustering by finding numbers x and y such that the binomial probability of observing 2 occurrences in x physically consecutive assays, or 3 occurrences in y physically consecutive assays, was sufficiently small that, after testing (num_patterns×num_assays) hypotheses, we would expect fewer than two chance discoveries per platform. We identified all genomic segments (runs of two or three examples of the pattern) where the clustering of this pattern exceeded the statistical threshold, and clustered any segments that overlapped.
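The threshold search described above can be sketched in Python. The sketch assumes that "observing k occurrences in x consecutive assays" is modeled as a binomial upper tail with the pattern's background frequency as the per-assay probability, and that the multiple-testing correction is the simple product of that probability with the number of hypotheses. The function names and the default search limit are hypothetical, and the source does not confirm these modeling details.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def largest_window(k, background_freq, num_hypotheses,
                   max_expected=2.0, limit=1000):
    """Largest window size x (in consecutive assays) for which seeing
    at least k occurrences of a pattern is still rare enough that the
    expected number of chance discoveries stays below max_expected.

    Returns None if even the smallest window (x == k) is not
    significant under this criterion.
    """
    best = None
    for x in range(k, limit + 1):
        expected_chance_hits = binomial_tail(x, k, background_freq) * num_hypotheses
        if expected_chance_hits >= max_expected:
            break  # the tail probability only grows as the window widens
        best = x
    return best
```

Calling `largest_window(2, f, n)` and `largest_window(3, f, n)` for a given background frequency `f` and hypothesis count `n` would give the counterparts of the numbers x and y in the text, under the stated assumptions.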
Mendel Failures
For each genotype assay and population sample (CEPH and Yoruba samples only), we defined the “Mendel failure pattern” of that assay as the binary vector (length 60) of Mendel failure calls across the 60 parent-offspring pairs in that population sample. For each such pattern that was observed, we considered each pattern together with its close neighbors (R2>0.8) in pattern space. This fuzzy clustering was desirable because the same deletion segregating in a population can give rise to non-identical patterns of Mendel failure at different SNPs: the non-deletion SNP haplotypes that are segregating in a trio (whose conflicts result in the Mendel conflicts) may not disagree at all SNPs.
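A Mendel failure pattern of the kind described above can be built with a short Python sketch. It assumes biallelic SNP genotypes encoded as two-character strings (with None for a missing call) and treats a parent-offspring pair as inconsistent when the two called genotypes share no allele, which is what a hemizygous deletion miscalled as homozygous can produce. The encoding and function names are hypothetical.

```python
def is_mendel_failure(parent, child):
    """Return True if a parent-offspring genotype pair is inconsistent.

    A child must inherit one allele from each parent, so a pair that
    shares no allele (e.g. parent 'AA', child 'GG') is a Mendel failure.
    Pairs with a missing call (None) are not counted as failures.
    """
    if parent is None or child is None:
        return False
    return not (set(parent) & set(child))


def mendel_failure_pattern(pairs):
    """Binary vector (1 = failure) over a list of (parent, child) pairs."""
    return [1 if is_mendel_failure(p, c) else 0 for p, c in pairs]


# Three hypothetical parent-offspring pairs at one SNP.
pattern = mendel_failure_pattern([("AA", "GG"), ("AG", "GG"), (None, "AA")])
```

In this example only the first pair is flagged: a parent carrying A over a deletion and a child carrying G over the same deletion would be called AA and GG respectively.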
Assessment of Clustering of “Failure Profiles”
For both Mendel failure profiles and null genotype profiles, we observed that highly similar (R2>0.8) profiles tended to be physically clustered in the genome. More specifically, we observed that the probability of observing a “match” to any particular profile was a decreasing function of physical distance from that profile, even when we considered only pairs of SNP assays that were typed using different technology platforms (
The Phase I HapMap data was produced by ten different genotyping centers, with each chromosome arm primarily genotyped by one particular center (HapMap Consortium, Altshuler et al. Nature 437:1299-1320 (2005)). Approximately 120 thousand SNP assays were performed by centers outside of their primary regions, or on genome-wide platforms such as Affymetrix 100K SNP arrays, allowing cross-platform analyses like those in
We therefore analyzed the data from each genotyping center separately. For each genotyping center, we first ordered all of the SNP assays from that center by genomic position. For each pattern (clustered set of highly similar profiles) that was observed multiple times, we determined that pattern's background frequency at that center, and wherever possible on the specific experimental batch in question. (Batch information was obtained from the International HapMap Consortium.) We then analyzed the physical distribution of all observations of that pattern relative to all of the SNP assays from that center (ordered by genomic position). A list of “candidate clusters” was determined by considering every consecutive pair and consecutive trio of observations of that pattern, together with any other, intervening SNP assays from that center. To assess the tightness of each such candidate cluster, a “clustering p-value” was calculated to assess the probability of observing a cluster at least as tight (in consecutive-assay space) as that cluster, given (i) the background frequency of the pattern, (ii) the number of SNP assays spanned by the cluster, and (iii) the total number of SNP assays performed by that center. The distribution of these p-values is shown in
We clustered all overlapping genomic segments that were identified by this analysis, into 702 genomic loci.
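Merging overlapping segments into loci, as described above, is a standard interval-merging step. The Python sketch below assumes that segments lie on a single chromosome and are given as (start, end) coordinate pairs, and that segments sharing an endpoint count as overlapping; these conventions and the function name are assumptions for illustration.

```python
def merge_overlapping_segments(segments):
    """Merge overlapping (start, end) segments into non-overlapping loci.

    Segments are sorted by start coordinate; each segment either
    extends the most recent locus or opens a new one.
    """
    loci = []
    for start, end in sorted(segments):
        if loci and start <= loci[-1][1]:
            loci[-1][1] = max(loci[-1][1], end)
        else:
            loci.append([start, end])
    return [tuple(locus) for locus in loci]
```

In practice the function would be applied separately to each chromosome, since coordinates on different chromosomes are not comparable.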
We were concerned that multiplexed batches of SNP assays that were performed together could also give rise to potential patterns in the data, which (if distributed non-randomly in genomic space with respect to that center's other SNP assays) could give rise to potential batch artifacts. We therefore excluded those clusters that consisted entirely of SNP assays from the same experimental batch. This resulted in a set of 541 predictions.
Hardy-Weinberg Disequilibrium
We observed that a deletion tended to reduce the ratio of observed heterozygosity to expected heterozygosity (hetobs/hetexp) by a uniform amount (
Wherever the resulting genomic segments overlapped with clusters of Mendel failure or null genotypes as discovered above, we clustered those segments together. (Because heterozygosity can show regional correlations due to haplotype structure, selection, and potentially duplicated sequence, we did not promote loci based on (hetobs/hetexp) alone unless confirmed by one of the other lines of evidence; however, the (hetobs/hetexp) deviations were useful for extending clusters discovered by Mendel failures, because the Mendel failures themselves may not be observed at every marker in the deleted region (
More specifically, we defined as the “failure profile” of an assay its pattern of Mendel failure across the 60 pairs of relatives in a population, its pattern of null genotypes across the 90 individuals in a population, and its deviation from the expected level of heterozygosity in that population. We looked for regions of the genome in which highly similar “failure profiles” appeared at nearby markers (
Using these methods we identified 541 candidate polymorphic deletions 1-200 kb in size (as shown in Appendix A). 120 of these loci generated null genotypes in multiple individuals, suggesting the existence of common, homozygous deletions. More than 90% of the discovered deletion variants were novel. Half of these loci were 1-7 kb in size and were therefore not detectable by earlier approaches; 98% were 1-30 kb in size and would have had little chance of detection by commonly used hybridization-based approaches.
It was critical to validate the presence of segregating deletions at the predicted sites, given their origin in data that fails typical quality control standards and the statistical nature of the inference. We used four methods: fluorescent in situ hybridization (FISH), two-color fluorescence allele-intensity measurements, PCR amplification, and comparison to previous work. These methods are described below in the Materials and Methods section.
First, we performed fluorescent in situ hybridization (FISH) on four candidate deletions that completely contained available FISH probes. The FISH assays confirmed the existence of segregating deletions at each site, and confirmed their Mendelian inheritance wherever suitable cell lines were available (
Second, we examined two-color fluorescence data from the assays that had been used to genotype SNPs on chromosomes 4q, 7q, and 18p at the Broad Institute. Specifically, this method associates a quantitative fluorescence signal with each allele at each typed SNP in each individual. At most SNPs, individuals' fluorescence-intensity measurements cluster into two or three discrete groups corresponding to homozygous and heterozygous genotypes. At SNPs under 15 candidate deletion loci, fluorescence intensity data instead clustered into as many as six groups (
Third, we selected 60 loci for which the pattern of genotypes suggested the existence of multiple individuals with homozygous deletions, and confirmed the existence of homozygous deletions at 51 of these loci by PCR assays that failed in the suspected homozygous-null individuals but succeeded in all other individuals tested (
Fourth, quantitative PCR was performed in all 269 HapMap DNA samples for 11 candidate deletions that overlapped the coding exons of genes (described below) and were discovered in many individuals: at 10/11 loci, three discrete clusters were observed, identifying individuals with 0, 1, and 2 gene copies (
We also tested an additional 56 loci that were not among our core predictions, but met a more-relaxed set of statistical thresholds; the confirmation rate among these other candidate variants was considerably lower, suggesting that relaxation of the statistical thresholds would be unwarranted.
Finally, we compared the locations of the candidate deletions to results from an earlier study, in which the approximate genomic locations of 102 candidate deletions in a single individual were discovered by the existence of fosmid end pair sequence reads from that individual that map more than 48 kb apart on the reference human genome sequence. (Tuzun et al., supra). Twenty-eight of our candidate deletions resided within these fosmids; in each case, the location of the aberrant genotypes further refined the localization of the deletion variant.
In sum, 90 predicted deletion variants (including 68 of 120 predicted common homozygous deletions) were validated by one or more of these approaches. Based on the experimental results, we estimate that 15% of the still-untested candidate deletion loci may be false positives.
We found thirteen genes for which exons were deleted at an appreciable frequency (Table 1). Of these genes, eight were observed as homozygous deletions. These common gene deletion polymorphisms included two genes involved in the metabolism of sex steroid hormones (UGT2B28 and UGT2B17). Common deletions also removed two genes encoding olfactory receptors (OR51A2 and OR4F5) and three genes (CYP2A6, GSTT1, and GSTM1) with roles in detoxification and drug metabolism. (For information on previously identified deletions in some of these genes see Seidgard et al., Proc. Natl. Acad. Sci. 85:7293-7297 (1988), Nunoya et al., Pharmacogenetics 8:239-249 (1998); and Pemble et al. Biochem. J. 300 Pt1:271-276 (1994).)
To assess the frequencies and inheritance of these gene deletions in different populations, we developed quantitative PCR assays for accurately genotyping individuals as carrying 0, 1, or 2 gene copies, and used these assays to successfully genotype eight of the ten gene deletion variants in all the HapMap individuals (Table 1). The resulting genotypes showed Mendelian inheritance, Hardy-Weinberg equilibrium, and expected transmission rates, suggesting that each behaves as a stable, heritable genetic variant. The gene deletion variants were observed in individuals of European, Yoruba, and Chinese and Japanese ancestry, though the frequency of each deletion varied from population to population (Table 1).
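A check of Hardy-Weinberg equilibrium on copy-number genotypes can be sketched in Python as follows. This is a generic one-degree-of-freedom chi-square test applied to counts of individuals with 0, 1, and 2 gene copies, not necessarily the test used in the study; it is appropriate for unrelated individuals only, and the function name is hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n0, n1, n2):
    """Chi-square test of Hardy-Weinberg equilibrium.

    n0, n1, n2 are the numbers of individuals carrying 0, 1 and 2
    copies of the gene. Returns (chi2, p_value) with one degree of
    freedom, for which the upper tail is erfc(sqrt(chi2 / 2)).
    """
    n = n0 + n1 + n2
    p = (2 * n2 + n1) / (2 * n)  # frequency of the non-deleted allele
    q = 1 - p                    # frequency of the deletion allele
    observed = (n0, n1, n2)
    expected = (n * q * q, 2 * n * p * q, n * p * p)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2, erfc(sqrt(chi2 / 2))
```

A large p-value is consistent with the deletion behaving as an ordinary biallelic variant, while a marked deficit of one-copy individuals would produce a small p-value.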
Assessing functional relevance requires testing for association to phenotype. A simple phenotype is the level of expression for each transcript. Based on global profiles of gene expression in a subset of the samples, we found that three commonly deleted genes (Table 1) are expressed at appreciable levels in the lymphoblastoid cell lines used to measure individual variation in gene expression. (Monks et al., supra and Morley et al., Nature 430:743-747 (2004)). We compared published expression measurements from these cell lines to deletion genotypes that we obtained experimentally. Variation in gene dosage explained respectively 88%, 26%, and 75% of the observed variation in expression of the three genes (
For medical genetics, a key question is whether one must discover each deletion variant in every patient, using dedicated technology, or can rely on linkage disequilibrium by using nearby SNPs as proxies for common deletions. The answer to this question depends on the linkage disequilibrium properties of common deletion variants: if common deletion of a locus is due to recurrent mutation there, then deletions must be discovered independently in every patient; if common deletion of a locus results from an ancestral mutation that has been inherited by descent, then it will often segregate on an ancestral haplotype and be in linkage disequilibrium with nearby SNPs.
In addition, to the extent that deletions result from unique ancestral mutational events, they will often be in linkage disequilibrium with nearby SNPs, and ancestral SNP haplotypes can serve as proxy in disease studies as well as immunocompatibility assays.
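Whether a nearby SNP can stand in for a deletion can be gauged with a correlation measure. The Python sketch below computes the squared Pearson correlation between gene copy number (0, 1, or 2) and SNP allele dosage (0, 1, or 2) across individuals. This genotype-level statistic is used here as a simple stand-in for haplotype-based r2 and is an assumption of the sketch rather than the method of the study; the function name is hypothetical.

```python
def genotype_r2(copy_numbers, snp_dosages):
    """Squared Pearson correlation between two per-individual vectors.

    copy_numbers: gene copies (0, 1 or 2) for each individual.
    snp_dosages: count of a chosen SNP allele (0, 1 or 2), same order.
    Returns 0.0 if either vector has no variation.
    """
    n = len(copy_numbers)
    mean_x = sum(copy_numbers) / n
    mean_y = sum(snp_dosages) / n
    sxx = sum((x - mean_x) ** 2 for x in copy_numbers)
    syy = sum((y - mean_y) ** 2 for y in snp_dosages)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(copy_numbers, snp_dosages))
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)
```

A value of 1 means the SNP genotype predicts the deletion genotype exactly in the sample, whichever SNP allele travels with the deletion.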
We observed strong LD between SNPs from HapMap and validated deletions. For example, nine of the ten gene deletions (for which we had designed accurate quantitative PCR genotyping assays) showed significant LD with nearby SNPs, and six of the ten had a perfect SNP proxy (r²=1) in one or more populations (see, for example
Our results indicate that the human genome has hundreds of common, multi-kilobase deletion variants, including some that remove genes, and that SNPs can be used to discover, analyze, and serve as markers for these variants. While we have used this approach on the HapMap, the same approach can be used to search for deletion variants in any set of SNP genotypes, such as data from imminent whole-genome association studies. Discarded, “failed” assays from earlier medical genetics studies could also be re-examined to search for the spatially patterned signature of a segregating deletion. Such an approach could be used together with intensity data from genotyping assays (
We describe an initial catalog of common deletion variants, but it is just a first draft toward a complete catalog. We have detected only those deletions large enough to affect multiple, independent HapMap SNP assays; most deletions smaller than 5 kb would not be detected at the current HapMap marker density. Phase 2 of the HapMap, with an assay every 1 kb, will considerably increase this resolution. The low density of HapMap assays in very-recently-duplicated regions of the genome has also impeded our discovery of deletions there; thus, our findings are limited to deletions of relatively unique sequences. Other types of structural variants, such as multi-copy duplications, may be more susceptible to recurrent structural mutation and therefore show less linkage disequilibrium. The application of diverse methods for finding structural variants (Sebat et al., supra; Iafrate et al., supra; Tuzun et al., supra; Sharp et al., supra; and Fredman et al., Nat. Genet. 36:861-866 (2004)), together with the development of follow-on genotyping assays, will allow more-complete catalogs of structural variants and their linkage disequilibrium properties.
Most importantly, an integrated view of structural variation and SNP variation is critical to medical genetics. To the extent that common deletion variants are in linkage disequilibrium, their association to disease can be discovered by the kinds of strategies proposed for SNP association studies (HapMap consortium, Altshuler et al. Nature 437:1299-1320 (2005)). In the future, medical genetics will benefit from a full catalog of common variants, since all types of alleles must be considered in an unbiased search for the causes of disease.
Materials and Methods
Fluorescence in situ Hybridization (FISH)
Fosmid clones with end sequences mapped to locations within predicted deletion intervals were obtained from the BAC/PAC resource, and DNA was isolated from each fosmid with the Maxi DNA plasmid kit (Qiagen). Fosmid DNAs were then labeled by nick translation with Spectrum Green-11-dUTP (G248P89259F2 and G248P87989C3 on chromosome 4) or Spectrum Orange-11-dUTP [Vysis, Inc.] (G248P87609A7 on chromosome 8 and G248P81036F4 on chromosome 18). We co-hybridized the test probes with appropriate positive control probes: Spectrum Orange-11-dUTP-labeled BAC clone RP11-363G1 (BAC/PAC; chromosome 4p15.1), and biotin-16-dUTP-labeled chromosome 8 and 18 paint probes (Roche). FISH experiments were performed using standard hybridization conditions on metaphase chromosome preparations derived from lymphoblastoid cell lines obtained from the Coriell Institute for Medical Research. Cy5-labeled streptavidin was used for detection of the biotin labeled chromosome 8 and 18 paint probes. Images were captured on an Olympus AX70 fluorescent microscope equipped with a CCD camera (Photometrics KAF 1400) with appropriate fluorescent filters and analyzed with Applied Imaging's Genus software.
The chromosome 4 fosmids used for FISH validation (G248P89259F2 and G248P87989C3) are mapped to segmental duplication-containing regions (Sebat et al., supra). Sequences with >94% nucleotide similarity are located <1 Mb (on chromosome 4) from each fosmid (http://genome.ucsc.edu). We considered the possibility that these probes could hybridize to a segmental duplication and yield a positive FISH signal, even if the target sequence were deleted. To investigate this, we repeated these experiments six times under various hybridization conditions, including once with an extended hybridization of 48 hours. In four out of these six experiments for a given probe and in a minimum of 25 metaphase spreads examined per individual, we consistently observed zero fluorescent probe signals (e.g., for fosmid probe G248P89259F2: NA19098), one signal (NA19100, NA19200, NA19202), or two fluorescent probe signals (NA19099, NA19201) per individual. Furthermore, in these experiments we included parent-offspring trios and FISH results were consistent with Mendelian inheritance of deletions. In two experiments (including the 48 hour hybridization protocol), those individuals believed to be homozygous for the deletion, heterozygous for the deletion, and homozygous for the non-deletion allele were observed in a minimum of 25 metaphase spreads per individual to have two faint signals (e.g., for fosmid probe G248P89259F2: NA19098), one faint and one strong signal (NA19100, NA19200, NA19202), and two strong signals (NA19099, NA19201), respectively. FIG. 6C shows such a signal intensity difference in an individual heterozygous for the chromosome 4 deletion containing fosmid G248P87989C3.
PCR Validation of Homozygous Deletion Variants
To validate predicted homozygous deletions by PCR, we selected 60 candidate deletion loci for which the pattern of genotypes predicted the existence of at least two individuals with homozygous deletions in at least one population. The criterion for validation was confirmation of a precise predicted pattern of amplification success and amplification failure across at least 12 samples that included at least two predicted examples of each result. Any deviation from that pattern was classified as a confirmation failure. The predictions (about which individuals harbored homozygous deletions) were derived from the SNP genotypes: the individuals in whom multiple null genotypes had given rise to the predicted deletion variant (Appendix A) were predicted to be homozygous null; all other individuals were predicted to have genetic material at that locus. Importantly, we chose PCR amplification sites that were distinct from any of the sequences used in the SNP genotyping assays, so that this would be an independent confirmation of a predicted result. Table 4 includes a list of PCR primers that were used in PCR assays for each deletion variant.
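The validation criterion above can be written as a short check. This is only a sketch under stated assumptions: predictions and PCR outcomes are supplied as dictionaries keyed by sample name, and the function name and parameter names are hypothetical.

```python
def pcr_validates(predicted_deleted, amplified, min_samples=12, min_each=2):
    """Apply the PCR validation criterion to one candidate deletion locus.

    predicted_deleted: dict mapping sample -> True if the SNP genotypes
        predict a homozygous deletion in that sample.
    amplified: dict mapping sample -> True if a PCR product was observed.
    Returns True only if at least `min_samples` samples were tested, at least
    `min_each` samples were predicted for each outcome, and every sample
    matched its prediction (deleted -> no product, otherwise -> product).
    """
    samples = set(predicted_deleted) & set(amplified)
    if len(samples) < min_samples:
        return False
    n_deleted = sum(1 for s in samples if predicted_deleted[s])
    n_present = len(samples) - n_deleted
    if n_deleted < min_each or n_present < min_each:
        return False
    # Any single deviation from the predicted pattern is a failure.
    return all(amplified[s] != predicted_deleted[s] for s in samples)
```

Requiring an exact match across all samples mirrors the rule that any deviation from the predicted pattern counts as a confirmation failure.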
Illumina (Two-Color, Allele-Specific Fluorescence) Validation of Deletion Variants
Seventeen candidate deletion variants covered at least three SNPs that had been assayed on the Illumina platform at the Broad Institute. The Illumina platform generates a quantitative allele-specific intensity measurement for each allele in each individual in a population. The normalized allele-specific intensity measurements are comparable across individuals and generally fall into two or three discrete clusters, corresponding to individuals homozygous for allele 1, individuals homozygous for allele 2, and individuals heterozygous for alleles 1 and 2. For SNPs covered by predicted deletion variants, we observed additional genotype classes corresponding to individuals hemizygous for allele 1, individuals hemizygous for allele 2, and individuals homozygous for the deletion allele. We considered a deletion variant validated if (i) we observed one or more of these additional, well-separated genotype clusters, and (ii) all of the individuals predicted (from multi-marker genotype patterns) to be hemizygous or homozygous deleted in fact fell into the appropriate additional cluster.
Quantitative PCR
Individuals' deletion genotypes cannot be unambiguously inferred from SNP genotype data (see, for example,
Small (60-90 nt) amplicons from the test and control loci were simultaneously amplified in the same tube, in 96-well plates (one plate per population, including five replicate samples and one blank sample) on a Bio-Rad iCycler. The threshold cycle (Ct) was calculated for each fluorophore separately, and the difference between the threshold cycles for the two fluorophores (delta_Ct) was used as a measurement of relative copy number that could be compared from sample to sample on the same plate. For each assay, the delta_Ct measurements clustered into three discrete groups (with one group typically showing no amplification of the test locus at all). For some assays, these groups were initially incompletely separated; in these cases, averaging of the delta_Ct measurements across 3-5 replicates resulted in discrete, well-separated clusters of average measurements. For each assay, we treated these three clusters as “+/+,” “+/−,” and “−/−” genotypes. In each case, the resulting genotype calls for replicate samples agreed completely, and the resulting genotypes showed Mendelian inheritance and Hardy-Weinberg equilibrium.
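A minimal sketch of a Hardy-Weinberg equilibrium check on the three genotype clusters follows. It assumes a Pearson chi-square goodness-of-fit test with one degree of freedom, which is a common choice but is not stated in the text; the function name is hypothetical.

```python
import math


def hwe_chi_square(n_two, n_one, n_zero):
    """Chi-square test of Hardy-Weinberg proportions for a deletion variant.

    n_two, n_one, n_zero: counts of "+/+", "+/-", and "-/-" individuals.
    Returns (chi2, p_value) using a chi-square distribution with 1 degree
    of freedom.
    """
    n = n_two + n_one + n_zero
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_two + n_one) / (2 * n)  # frequency of the non-deleted allele
    q = 1 - p                          # frequency of the deletion allele
    observed = (n_two, n_one, n_zero)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x/2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A large p-value is consistent with equilibrium; a marked deficit of heterozygotes, for example, would produce a large chi-square statistic.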
The non-MHC factors which determine histocompatibility are generally unknown. As a consequence, allogeneic transplantations carry risk due to unforeseen incompatibilities between donor and host. The human genome has recently been shown to exhibit large-scale deletion polymorphism, including many large common deletion variants that appear as homozygous deletions in a significant fraction of the population. In the following example of the methods of the invention, we investigated whether deletion mismatches for common deletion variants (homozygous deletion in donor but not in host) were associated with graft-versus-host disease (GVHD) following allogeneic hematopoietic stem cell transplantation (aHSCT).
Using the methods described below, we evaluated 500 aHSCT cases involving HLA-identical sibling donor-recipient pairs. We typed donors and patients for the presence of six gene deletions, and assessed whether aGVHD and cGVHD occurrence were associated with mismatch for these gene deletions. We found that mismatch for two common deletion variants, UGT2B28 and UGT2B17, was associated with chronic GVHD, and, for UGT2B17, was also associated with acute GVHD. These results demonstrate that large deletion variants may contribute to histoincompatibilities among individuals, and validate the usefulness of the invention described in this application. GVHD risk might be reduced by prospectively typing donors and patients for deletion variants in UGT2B17 and UGT2B28 genes.
Patients
The main study population consisted of 500 aHSCT recipients and their HLA-identical sibling donors. The inclusion criterion was the use of full myeloablative aHSCT. All recipients and donors gave written informed consent according to protocols approved by the institutional review boards of Helsinki University Central Hospital and the Dana Farber Cancer Institute (protocol 01-206).
The aGVHD replication study population consisted of 336 aHSCT recipients and their HLA-identical sibling donors, collected as described previously (Nichols et al., Blood. Dec. 15, 1996;88(12):4429-34).
Genotyping of Deletion Variants
We developed a quantitative PCR assay for typing each deletion variant in each donor and patient. In this assay, the locus of interest and a control, two-copy locus (PMP22) are simultaneously amplified in a 20 μl reaction containing TaqMan Master Mix (Applied Biosystems) together with a forward primer, a reverse primer, and a dual-labeled probe for each locus. The probe for the test locus (gene deletion polymorphism) is labeled with FAM and a BHQ-1 quencher (IDT); the probe for the control locus is labeled with VIC and an MGB quencher (Applied Biosystems). The simultaneous amplification of the test and control loci is monitored by real-time PCR and a threshold cycle (Ct) is determined separately for each locus by separation of the FAM and VIC spectra.
A sample was determined to be homozygous deleted for the test locus if the control locus showed robust amplification (Ct<32) while the test locus failed to amplify after 40 cycles. The quantity δCt = Ct_control − Ct_gene showed a discrete, bimodal distribution across the remaining, non-homozygous deleted samples; samples from the higher δCt cluster were determined to have two copies of the gene, and samples from the lower δCt cluster were determined to have one copy. As a quality-control check, we verified that both of the following were true: (i) membership in the three genotype classes (corresponding to 0, 1, 2 copies) showed Hardy-Weinberg equilibrium and (ii) sibling genotypes were correlated across the cohort: regression of patient genotypes against the genotypes of their sibling donors yielded a regression coefficient that was not significantly different from 0.5.
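The calling rule can be sketched as follows. Two points are assumptions made for illustration and are not specified in the text: the two δCt clusters are separated at the midpoint of the largest gap between sorted δCt values (which presumes both one-copy and two-copy samples are present on the plate), and samples whose control locus does not reach Ct<32 are left uncalled. The function name is hypothetical.

```python
def call_copy_number(samples, max_control_ct=32.0):
    """Call 0, 1, or 2 gene copies from real-time PCR threshold cycles.

    samples: dict mapping sample name -> (ct_control, ct_gene), where ct_gene
        is None if the test locus did not amplify within 40 cycles.
    Returns a dict mapping sample name -> 0, 1, 2, or None (no call).
    """
    calls = {}
    delta = {}
    for name, (ct_control, ct_gene) in samples.items():
        if ct_control is None or ct_control >= max_control_ct:
            calls[name] = None            # control failed: no call (assumption)
        elif ct_gene is None:
            calls[name] = 0               # homozygous deleted
        else:
            delta[name] = ct_control - ct_gene
    if len(delta) < 2:
        for name in delta:
            calls[name] = None            # cannot separate clusters
        return calls
    ordered = sorted(delta.values())
    # Assumed clustering rule: split at the largest gap between neighbors.
    _, i = max((ordered[k + 1] - ordered[k], k) for k in range(len(ordered) - 1))
    cutoff = (ordered[i] + ordered[i + 1]) / 2
    for name, value in delta.items():
        calls[name] = 2 if value > cutoff else 1  # higher delta Ct -> 2 copies
    return calls
```

In practice the clusters would be inspected visually or fitted with a mixture model; the largest-gap split is only the simplest rule that reproduces the two-cluster assignment when the clusters are well separated.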
Determination of Mismatches
Transplants were determined to involve a donor-recipient “deletion mismatch” for a deletion variant if the donor was homozygous deleted for that gene, and the recipient had a positive number (1 or 2) of gene copies. Transplants were considered to involve a “sex mismatch” if they involved a female donor and a male recipient.
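These two definitions translate directly into code. The sketch below assumes copy numbers are given as integers 0, 1, or 2 and sexes as the strings "F" and "M"; the function names and encodings are illustrative.

```python
def deletion_mismatch(donor_copies, recipient_copies):
    """True if the donor is homozygous deleted and the recipient is not."""
    return donor_copies == 0 and recipient_copies in (1, 2)


def sex_mismatch(donor_sex, recipient_sex):
    """True for a female donor paired with a male recipient."""
    return donor_sex == "F" and recipient_sex == "M"
```

Note that the definition is directional: a recipient who is homozygous deleted and receives cells from a donor carrying the gene is not counted as a deletion mismatch here.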
Statistical Analysis
Acute and chronic GVHD were diagnosed and graded according to standard criteria. Acute GVHD cases were those with grades 2-4 aGVHD; controls were those with grades 0-1 aGVHD. Chronic GVHD cases were those with “limited” or “extensive” cGVHD; cGVHD controls were those with “no” cGVHD.
The relationship between deletion variant mismatches and GVHD status was first assessed by association analysis of mismatch at each individual locus with aGVHD and cGVHD, using the 360 donor-recipient pairs who had no known mismatch risk factors (no sex mismatch). We performed a one-sided chi-square test.
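A one-sided chi-square test on a 2×2 table of mismatch status against GVHD status might look like the sketch below. It assumes a Pearson statistic without continuity correction and obtains the one-sided p-value by halving the two-sided value when the observed direction is an excess of cases among mismatched pairs; neither detail is given in the text, and the function name is hypothetical.

```python
import math


def one_sided_chi_square(a, b, c, d):
    """One-sided test that mismatch is associated with increased GVHD risk.

    a: mismatched pairs with GVHD      b: mismatched pairs without GVHD
    c: matched pairs with GVHD         d: matched pairs without GVHD
    Returns (chi2, p_one_sided) for a chi-square with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain observations")
    diff = a * d - b * c                   # > 0 when mismatch is enriched in cases
    chi2 = n * diff * diff / (row1 * row2 * col1 * col2)
    two_sided = math.erfc(math.sqrt(chi2 / 2))
    p_one_sided = two_sided / 2 if diff > 0 else 1 - two_sided / 2
    return chi2, p_one_sided
```

With small cell counts, an exact test would usually be preferred over the chi-square approximation.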
The Michigan aGVHD cohort was used for replication analysis of the single locus showing positive association with aGVHD in the initial analysis. Mismatch at that locus was tested for association with aGVHD using a one-sided chi-square test.
Two loci (UGT2B17 and UGT2B28) were found to show positive associations in the initial analysis and were then assessed in the full cohort of 836 donor-recipient pairs using a regression model for GVHD risk whose terms included age, transplantation year, sex mismatch, UGT2B17 mismatch, UGT2B28 mismatch, and the interaction terms sex+UGT2B17 mismatch and sex+UGT2B28 mismatch.
These results demonstrate that deletion variants may contribute to histoincompatibilities among individuals. GVHD risk might be reduced by prospectively typing donors and patients for UGT2B17 and UGT2B28 gene deletions.
As described for Example 2, the non-MHC factors which determine histocompatibility are generally unknown. As a consequence, allogeneic organ transplantations carry risk due to unforeseen incompatibilities between donor and host. This study was designed to investigate whether mismatch for common deletion variants (homozygous deletion in host but not in donor) is associated with host-versus-graft disease (HVGD) following kidney transplantation.
Patients
The first study population consists of 500 renal allograft recipients and their HLA-identical sibling donors. The second study population consists of 700 renal allograft recipients and their unrelated donors. All recipients and sibling donors provided written informed consent according to protocols approved by the institutional review boards of Massachusetts General Hospital, Helsinki University Central Hospital, and Hospital do Rim e Hipertensão, São Paulo.
Genotyping of Deletion Variants
Samples from each of the patient populations have been collected and will be used for the genotyping of deletion variants as described in Example 2. Methods for analysis of the samples after genotyping of deletion variants is performed are described below.
Determination of Deletion Variant Mismatches
Transplants are determined to involve a donor-recipient “mismatch” for a deletion variant if the recipient has a deletion variant in all copies of the gene (i.e., homozygous deletion) and the donor has a positive number (1 or 2) of gene copies.
Statistical Analysis
Renal allograft rejection is diagnosed and graded according to standard criteria. The primary diagnostic categories used in this study are “rejection,” “no rejection,” and “days to rejection.”
For the first (sibling-donor) study population, the relationship between gene deletion mismatches and rejection status is assessed by association analysis of mismatch at each individual locus with risk of rejection. A one-sided chi-square test is used to assess whether mismatch is associated with increased risk of rejection.
Any two loci found to show positive associations in the initial analysis are then assessed using a regression model for rejection risk whose terms include age, donor sex, recipient sex, transplantation year, cold ischemia time, and mismatch for each of the deletion variants.
For the second (unrelated-donor) study population, we assess the contribution of gene deletion mismatches using a regression model for rejection risk whose terms include age, donor sex, recipient sex, transplantation year, cold ischemia time, mismatch for each of the gene deletion polymorphisms, and the numbers of HLA-AB and HLA-DR mismatches.
This table lists 541 predicted deletion variants identified from patterns of SNP assay failures in the Phase 1 HapMap, as described in this study. The three leftmost columns show the location of the predicted deletion variant (the genomic coordinates spanned by all SNPs that supported the prediction). The five rightmost columns describe the evidence supporting each prediction: the locations of SNP assays, the population and type of supporting evidence, and the individuals in whose genotypes that evidence was observed.
Key to populations:
All physical coordinates are shown on the hg16 build of the human genome.
The description of the specific embodiments of the invention is presented for the purposes of illustration. It is not intended to be exhaustive or to limit the scope of the invention to the specific forms described herein. Although the invention has been described with reference to several embodiments, it will be understood by one of ordinary skill in the art that various modifications can be made without departing from the spirit and the scope of the invention, as set forth in the claims. All patents, patent applications, and publications referenced herein are hereby incorporated by reference.
Other embodiments are in the claims.
This application claims the benefit of U.S. Provisional Application No. 60/741,638 filed on Dec. 2, 2005, herein incorporated by reference.
The United States Government has a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of grant number 1U54HG02750 awarded by the National Human Genome Research Institute of the National Institutes of Health.
Number | Date | Country
---|---|---
60741638 | Dec 2005 | US