The present invention is directed to, inter alia, methods and kits for prenatal genetic testing and particularly for identifying and/or analyzing fetal haplotype with a high degree of confidence.
Noninvasive prenatal genetic testing (NIPT) of whole chromosomal aneuploidies has already altered the landscape of prenatal diagnostics in the United States and increasingly worldwide. Aside from the noninvasiveness, advantages of NIPT include rapid turnaround, relatively low cost, and hassle-free care for pregnant couples. Arguably, these benefits are largely made possible because it is not necessary to construct parental haplotypes in order to accurately diagnose chromosomal copy number. For noninvasive prenatal diagnosis (NIPD) of monogenic disease, on the other hand, this is not the case. In order for NIPD to take hold in the clinical setting, it will be necessary to develop universal methodologies that apply to the diagnosis of any mutation, maternal or paternal, regardless of inheritance. Although some universal techniques for NIPD have already been described, each one requires time-consuming and sophisticated parental haplotype construction in advance of test interpretation (Fan et al. 2012 Nature 487:320-324; Kitzman et al. 2012. Sci Transl Med 4: 137ra176; and Lo et al. Sci Transl Med 2: 61ra91).
The classic haplotype construction methodology is simpler to implement because it involves the collection of DNA samples from several family members for linkage analysis. Nevertheless, this process is often complicated or sometimes made impossible by low compliance, couple privacy concerns, or the unavailability of living first-degree relatives. To address these issues, researchers have also developed various molecular and statistical techniques for family-independent haplotyping (Browning and Browning, 2011. Nat Rev Genet 12:703-714). Unfortunately, the described molecular techniques are too expensive, too time-consuming, and/or too labor-intensive for use in a clinical setting. Moreover, statistical approaches, which rely on high throughput analysis of population data, are not appropriate for clinical application.
Medical centers around the world offer invasive prenatal diagnostic services for local population-specific founder mutations on a routine basis. Depending on the carrier frequency within the population, founder mutation tests often comprise a significant component of the overall molecular testing in such healthcare laboratories. Some examples of common founder mutations for which prenatal testing would be relevant include those implicated in long QT syndrome within the Finnish population (Marjamaa et al. 2009 Ann Med 41:234-240); the delF508 mutation in CFTR causing cystic fibrosis in the Caucasian European population (Moral et al. 1994, Nat Genet 7:169-175); a mutation in the SERPINA1 gene causing alpha1-antitrypsin deficiency in Scandinavian Caucasians (Cox et al. 1985, Nature 316:79-81); a mutation in Colombians causing early onset Alzheimer's disease (Lalli et al., 2013, Alzheimers Dement, S277-S283); and scores of founder mutations in the Tunisian (Romdhane et al., 2012 Orphanet J Rare Dis 7:52) and Ashkenazi Jewish (AJ) populations (Zlotogora, J. 2014, Mendelian disorders among Jews).
There is an unmet need for a rapid, cost-effective, and routine test that can be implemented for highly accurate fetal haplotype identification, such as for NIPD of monogenic disorders, without reliance on blood sample collection from relatives of the pregnant couple.
The present invention provides, in some embodiments, methods and kits for identifying and/or analyzing fetal haplotype with a high degree of confidence.
According to another embodiment, the present invention provides a method for non-invasively predicting an increased risk of a disease-associated parental haplotype inherited by a fetus of a pregnant female, the method comprising:
(i) obtaining at least a replicate of a fetal nucleic acid sequence sequenced at a depth of at least 100× coverage for a single nucleotide polymorphism (SNP) in said haplotype, said fetal nucleic acid sequence being derived from a single DNA sample obtained from the pregnant female from week 5 of gestation and onward; and
(ii) analyzing said replicate of fetal nucleic acid sequence, wherein a high identity of said fetal haplotype to a consensus family haplotype indicates that said fetus is a carrier of said disease-associated parental haplotype;
thereby predicting an increased risk of a disease-associated parental haplotype inherited by said fetus.
According to another embodiment, the present invention provides a method for non-invasively predicting an increased risk of a monogenic disease or disorder in a fetus of a pregnant female, the method comprising:
(i) obtaining at least a replicate of a fetal nucleic acid sequence sequenced at a depth of at least 100× coverage for a SNP associated with said monogenic disease or disorder, said fetal nucleic acid sequence being derived from a single DNA sample obtained from the pregnant female from week 5 of gestation and onward; and
(ii) analyzing said replicate of fetal nucleic acid sequence, wherein a high identity of said fetal haplotype to a consensus family haplotype indicates that said fetus is a carrier of a parental haplotype;
thereby predicting an increased risk of a monogenic disease or disorder in said fetus.
According to some embodiments, said sample is a plasma sample. According to some embodiments, said DNA is plasma DNA. According to some embodiments, said plasma DNA is cell-free fetal DNA (cffDNA).
In another embodiment, said replicate of a fetal nucleic acid sequence is sequenced at a depth of at least 1,500× mean coverage. In another embodiment, said replicate of a fetal nucleic acid sequence is sequenced at a depth of at least 2,000× mean coverage. According to another embodiment, said fetal nucleic acid sequence is sequenced at a depth of at least 2,500× mean coverage. According to another embodiment, said fetal nucleic acid sequence is sequenced at a depth of at least 3,000× mean coverage.
According to another embodiment, the replicate of a fetal nucleic acid sequence is sequenced at a depth of at least 1000× coverage per single nucleotide polymorphism investigated. According to another embodiment, the replicate of a fetal nucleic acid sequence is sequenced at a depth of at least 150× coverage per single nucleotide polymorphism investigated. According to another embodiment, the replicate of a fetal nucleic acid sequence is sequenced at a depth of at least 250× coverage per single nucleotide polymorphism investigated. According to another embodiment, the replicate of a fetal nucleic acid sequence is sequenced at a depth of at least 500× coverage per single nucleotide polymorphism investigated.
According to another embodiment, the consensus family haplotype is based on the fetus's father, mother, a first-degree parental family member or a combination thereof.
According to another embodiment, the methods of the invention are for use in non-invasively predicting an increased risk of a disease-associated paternal haplotype inherited by a fetus of a pregnant female, wherein the consensus family haplotype is a consensus paternal haplotype derived from the father, a first-degree paternal family member or a combination thereof.
According to another embodiment, said analyzing said replicate of fetal nucleic acid sequence comprises determining one or more paternal haplotype-informative single-nucleotide polymorphisms (SNPs) in at least one replicate of fetal nucleic acid, wherein said paternal haplotype-informative SNPs are not present in the maternal genotype, thereby determining unique paternal SNPs identified in the fetus.
According to another embodiment, the consensus family haplotype comprises at least 500 disease-informative SNPs. According to another embodiment, the monogenic disease or disorder is caused by, or strongly associated with, a founder mutation. According to another embodiment, the consensus family haplotype comprises at least 500 mutation-flanking SNPs.
According to another embodiment, the methods of the invention comprise obtaining the replicate of a fetal nucleic acid sequence during weeks 5 to 8 of gestation.
According to another embodiment, the fetal nucleic acid sequence comprises less than 4% of the DNA sample obtained from the pregnant female. According to another embodiment, the fetal nucleic acid sequence comprises less than 1.5% of the DNA sample obtained from the pregnant female.
According to another embodiment, the fetal nucleic acid sequence is present at a concentration of equal to or less than 4 pg/µl. According to another embodiment, the fetal nucleic acid sequence is present at a concentration of equal to or less than 1.5 pg/µl.
According to another embodiment, the monogenic disease or disorder presents with autosomal recessive inheritance. According to some embodiments, the monogenic disease or disorder is selected from the group consisting of Gaucher disease, cystic fibrosis, beta-thalassemia, sickle cell anemia, Alpha 1-antitrypsin deficiency, Bardet Biedl syndrome, Bloom syndrome, Canavan disease, Familial Dysautonomia, Fanconi anemia C, Hermansky-Pudlak syndrome, Joubert syndrome 2, Microcephaly with complex motor and sensory axonal neuropathy, Maple Syrup Urine Disease (MSUD), Mucolipidosis IV, Nemaline myopathy, Niemann-Pick Disease A, Usher syndrome I, Usher syndrome III, Walker Warburg syndrome and Zellweger syndrome.
According to another embodiment, the monogenic disease or disorder is cystic fibrosis.
According to another embodiment, the present invention provides a kit for identifying or analyzing fetal haplotype with a high degree of confidence.
Further embodiments and the full scope of applicability of the present invention will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
The present invention provides, in some embodiments, methods and kits for identifying and/or analyzing fetal haplotype with a high degree of confidence.
By virtue of identifying fetal haplotype, the invention may be applicable for many methods, including but not limited to, noninvasive prenatal diagnosis (NIPD), such as, of a monogenic disease, or alternatively, for human leukocyte antigen (HLA) typing, such as, for screening potential cord blood donors.
The present invention is based, in part, on the understanding that the common denominator among all population-specific mutations is that they each appear with their own mutation-flanking molecular fingerprint or haplotype. In particular embodiments of the invention, this fingerprint is used as a tool for fetal haplotype identification such as for NIPD. Thus, by means of highly targeted next generation sequencing (NGS), it is exemplified herein that fine-mapping of a founder mutation fingerprint is a potentially valuable asset for NIPD of an autosomal recessive disease. According to advantageous embodiments, the methods described herein alleviate the hassle of constructing family-specific haplotypes (e.g., for founder mutation NIPD). Moreover, the use of mutation-specific fingerprints eliminates the need for sophisticated molecular haplotyping methods, thereby effecting major savings with regard to test duration, reagent cost, and labor expenditure.
Parental haplotype construction is a primary drawback to NIPD of monogenic disease. Family-specific haplotype assembly is typically necessary for diagnosis of minuscule amounts of circulating cell-free fetal DNA. Nevertheless, this endeavor still hampers practical application of NIPD in the clinic because current haplotyping techniques are still too time-consuming and laborious to be carried out within the limited time constraints of prenatal testing.
To address this pitfall, the inventors have devised a universal strategy for rapid fetal haplotype identification, thereby being useful for NIPD of a prevalent mutation. Accordingly, some embodiments of the invention are applicable in the context of NIPD, including but not limited to, of a monogenic disease, and particularly of diseases associated with autosomal recessive disease-causing mutations.
As exemplified herein below using a non-limiting founder mutation, a consensus Gaucher disease-associated mutation-flanking haplotype was fine-mapped by means of targeted next generation sequencing, so as to successfully diagnose seven unrelated fetuses. One skilled in the art will appreciate that the methods described herein are shown as a non-limiting demonstration for accurate fetal haplotype identification. Accordingly, the methods and kits of the invention may be used for NIPD of any worldwide autosomal recessive founder mutation.
In additional embodiments, the disclosed invention is applicable for human leukocyte antigen (HLA) typing of a fetus, including but not limited to, for screening potential cord blood donors.
Thus, the present invention provides rapid, economical, and readily adaptable methods and kits for highly accurate fetal haplotype identification.
According to some embodiments, there is provided a method for predicting an increased risk of maternal and/or paternal haplotypes inherited by a fetus of a pregnant female.
According to some embodiments, said method comprises obtaining or providing a sample obtained from a pregnant female, referred to herein as “maternal sample”. In one embodiment, the maternal sample includes any processed or unprocessed, solid, semi-solid, or liquid biological sample, e.g., blood, urine, saliva, mucosal samples (such as samples from uterus or vagina, etc.). For example, the maternal sample may be a sample of whole blood, partially lysed whole blood, plasma, or partially processed whole blood.
According to some embodiments, said maternal blood sample is plasma DNA, e.g., cell-free fetal DNA (cffDNA) or free-floating DNA from maternal whole blood. In some embodiments, the fetal nucleic acid sequence is derived from a single DNA sample from the mother. In some embodiments, DNA samples from the mother are not pooled. In some embodiments, the replicates are derived from two different DNA samples. In some embodiments, the replicates are not technical replicates.
The sample of maternal blood can be obtained by standard techniques, such as using a needle and syringe. In another embodiment, the maternal blood sample is a maternal peripheral blood sample. Alternatively, the maternal blood sample can be a fractionated portion of peripheral blood, such as a maternal plasma sample. In another embodiment, once the blood sample is obtained, total DNA can be extracted from the sample using standard techniques known to one skilled in the art. A non-limiting example for DNA extraction is the FlexiGene DNA kit (QIAGEN). In another embodiment, maternal plasma may be further separated from peripheral blood by centrifugation, such as exemplified herein, at 1,900×g for 10 minutes at 4° C. The plasma supernatant may be re-centrifuged at 16,000×g for 10 minutes at 4° C. In another embodiment, a fraction of the resulting supernatant is used for cell-free DNA extraction, to thereby obtain maternal plasma DNA extracts. Standard techniques for cell-free DNA extraction are known to a skilled artisan, a non-limiting example of which is the QIAamp Circulating Nucleic Acid kit (QIAGEN). In some embodiments, the total DNA is subsequently fragmented, such as to sizes of approximately 300 bp-800 bp. For example, the total DNA can be fragmented by sonication.
In some embodiments, the methods described herein include a step of determining the amount of fetal nucleic acid within the obtained DNA sample (e.g., concentration, relative amount, absolute amount, copy number, and the like).
In some cases, the amount of fetal nucleic acid in a sample is referred to as “fetal fraction”. In some embodiments, “fetal fraction” refers to the fraction of fetal nucleic acid in circulating cell-free nucleic acid in the maternal sample. A determinant of the resolution of the fetal genetic map or fetal genomic sequence at a given level, or depth, of DNA sequencing is the fractional concentration of fetal DNA in the maternal biological sample. Typically, the higher the fractional fetal DNA concentration, the higher is the resolution of the fetal genetic map or fetal genomic sequence that can be elucidated at a given level of DNA sequencing. As the fractional concentration of fetal DNA in maternal plasma is higher than that in maternal serum, maternal plasma is typically considered a more preferred maternal biological sample type than maternal serum.
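By way of non-limiting illustration only, the fetal fraction may be approximated from read counts at SNPs where the mother is homozygous and the plasma DNA shows a paternally inherited allele. The following Python sketch is not taken from the disclosure: the function name `estimate_fetal_fraction`, the input layout, and the assumption that the fetus is heterozygous at every listed SNP (so that the paternal allele represents half of the fetal DNA) are hypothetical.

```python
from typing import Iterable, Tuple


def estimate_fetal_fraction(counts: Iterable[Tuple[int, int]]) -> float:
    """Estimate fetal fraction from (paternal_specific_reads, total_reads) pairs.

    Assumption (not from the source text): each pair comes from a SNP where
    the mother is homozygous and the fetus is heterozygous, so the
    paternal-specific allele accounts for half of the fetal DNA.
    """
    paternal_reads = 0
    total_reads = 0
    for paternal, total in counts:
        paternal_reads += paternal
        total_reads += total
    if total_reads <= 0:
        raise ValueError("total read count must be positive")
    # Fetal fraction is twice the paternal-specific allele fraction.
    return 2.0 * paternal_reads / total_reads
```

For example, 20 paternal-specific reads out of 1,000 total reads would correspond to an estimated fetal fraction of 4% under these assumptions.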
A size fractionation step can also be performed on the nucleic acid molecules in the maternal sample. As fetal DNA is known to be shorter than maternal DNA in maternal plasma, the fraction of smaller molecular size can be harvested and then used for the methods of the invention. Such a fraction would contain a higher fractional concentration of fetal DNA than in the original biological sample.
Thus, the sequencing of a fraction enriched in fetal DNA can allow one to construct the fetal genetic map or deduce the fetal genomic sequence with a higher resolution at a particular level of analysis (e.g. depth of sequencing), than if a non-enriched sample has been used.
Typically, applying said size fractionation step may render the technology more cost-effective. As non-limiting examples of methods for size fractionation, one could use (i) gel electrophoresis followed by the extraction of nucleic acid molecules from specific gel fractions; (ii) nucleic acid binding matrix with differential affinity for nucleic acid molecules of different sizes; or (iii) filtration systems with differential retention for nucleic acid molecules of different sizes.
In another embodiment, the maternal plasma DNA extracts are pre-amplified, in replicate (e.g., in duplicate or more), using standard techniques, a non-limiting example of which is the SurePlex Amplification System (BlueGnome). In particular embodiments, said pre-amplification step is performed ahead of downstream processing, i.e., before the analysis step. As exemplified herein, undertaking the methods of the invention using at least a replicate of amplified fetal nucleic acid sequences substantially augmented statistical confidence in each individual fetal SNP genotype call.
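As a non-limiting sketch of one way replicate data might be combined, the Python function below retains a fetal SNP call only when it is present and identical in every replicate. The concordance rule, the function name `concordant_calls`, and the SNP identifiers are illustrative assumptions and are not asserted to be the procedure used in the examples herein.

```python
from typing import Dict, List


def concordant_calls(replicates: List[Dict[str, str]]) -> Dict[str, str]:
    """Return SNP calls that are present and identical in all replicates.

    Each replicate maps a SNP identifier (hypothetical names) to the allele
    called for the fetus in that replicate.
    """
    if not replicates:
        return {}
    first = replicates[0]
    # Only SNPs called in every replicate can be concordant.
    shared = set(first)
    for replicate in replicates[1:]:
        shared &= set(replicate)
    return {
        snp: first[snp]
        for snp in sorted(shared)
        if all(replicate[snp] == first[snp] for replicate in replicates)
    }
```

A stricter or looser rule (for example, a majority vote across three or more replicates) could be substituted without changing the overall idea.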
In some embodiments of the method disclosed herein, the DNA is amplified (e.g., in replicate or more) after plasma DNA is extracted. As used herein, the term “amplified” is intended to mean that additional copies of the DNA are made to thereby increase the number of copies of the DNA, which is typically accomplished using the polymerase chain reaction (PCR). Additional methods of amplification are known to one skilled in the art.
In another embodiment, said replicate of a fetal nucleic acid sequence is sequenced by next generation sequencing (NGS). In another embodiment, said replicate of a fetal nucleic acid sequence is sequenced at a depth of at least 50×, 60×, 70×, 80×, 90×, 100×, 150×, 200×, 300×, 350×, 400×, 450×, or 500× coverage per single nucleotide polymorphism (SNP) investigated. In another embodiment, said replicate of a fetal nucleic acid sequence is sequenced at a depth of at least 1,000× average coverage, of at least 1,500× average coverage, of at least 2,000× average coverage, of at least 2,500× average coverage or of at least 3,000× average coverage, as well as individual numbers within that range. Each possibility represents a separate embodiment of the invention. In some embodiments, the coverage is not an average coverage, but a coverage of each investigated base pair or SNP of the haplotype.
It is common in the art to refer to NGS as having an average coverage for the whole genome. In the methods of the invention the depth of coverage is given for the informative area around a disease-associated allele. This informative area is the haplotype. In some embodiments, the required depth of coverage is for the haplotype. In some embodiments, the required depth of coverage is for each SNP of the haplotype. In some embodiments, the required depth of coverage is for each base of the haplotype. A skilled artisan will appreciate that 100× average coverage is a much lower coverage than 100× coverage for each base-pair of a disease-associated haplotype.
As used herein, the term “depth” refers to the number of times a nucleotide is read during the sequencing process. The term “coverage” refers to the average number of reads representing a given nucleotide in the reconstructed sequence. Accordingly, deep sequencing indicates that the total number of reads is many times larger than the length of the sequence under study.
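The distinction between average coverage and per-SNP coverage can be illustrated with a short, non-limiting Python sketch. The function name `coverage_summary`, the SNP labels, and the default 100× threshold parameter are assumptions chosen for illustration.

```python
from statistics import mean
from typing import Dict, List, Union


def coverage_summary(
    depth_per_snp: Dict[str, int], min_depth: int = 100
) -> Dict[str, Union[float, int, List[str]]]:
    """Summarize read depth over the SNPs of a haplotype.

    Reports the mean depth, the minimum depth, and the SNPs whose depth
    falls below min_depth. A haplotype can pass a mean-coverage criterion
    while individual SNPs still fall below the per-SNP threshold.
    """
    if not depth_per_snp:
        raise ValueError("no SNP depths supplied")
    depths = list(depth_per_snp.values())
    below = sorted(snp for snp, d in depth_per_snp.items() if d < min_depth)
    return {"mean": mean(depths), "min": min(depths), "below_threshold": below}
```

With depths of 300, 40 and 110 reads at three SNPs, the mean is 150×, yet one SNP is covered at only 40× and would fail a 100× per-SNP requirement.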
According to another embodiment, said analyzing said fetal nucleic acid sequence comprises comparing said fetal haplotype to a consensus haplotype. According to another embodiment, said consensus haplotype is a population-based haplotype based on subjects unrelated to said fetus. In some embodiments, a consensus founder haplotype for a specific disease or condition is obtained from a publicly available haplotype database, such as but not limited to, HapMap or deCode.
The term “consensus haplotype” as used herein refers to a DNA sequence surrounding a specific genomic locus of interest, such as but not limited to, a founder mutation locus, an HLA locus or a genetic susceptibility locus. In some embodiments, the consensus haplotype may span upstream (+) or downstream (−) of the locus. In another embodiment, the consensus haplotype is both upstream and downstream of the locus of interest.
The required length of consensus haplotype for obtaining high accuracy predictions depends on a number of variables such as but not limited to, SNP frequency and recombination susceptibility of the target genomic region. According to some embodiments, the length of said consensus haplotype is of at least +/−250 kb from the locus of interest. According to some embodiments, the length of said consensus haplotype is of at least +/−500 kb from the locus of interest. According to some embodiments, the length of said consensus haplotype is of at least +/−1 Mb from the locus of interest. According to some embodiments, the length of said consensus haplotype is of at least +/−3 Mb from the locus of interest. According to some embodiments, the length of said consensus haplotype is of at least +/−5 Mb from the locus of interest. In some embodiments, the consensus haplotype comprises at least 500, 750, 1000, 1100, 1200, 1250, 1300, 1400, 1500, 1600, 1700, 1750, 1800, 1900, or 2000 mutation-flanking SNPs. Each possibility represents a separate embodiment of the invention. In some embodiments, the consensus haplotype comprises at least 1500 mutation-flanking SNPs. In some embodiments, the consensus haplotype comprises about 1700 mutation-flanking SNPs.
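As a non-limiting illustration of selecting mutation-flanking SNPs within a given distance of the locus of interest, the following Python sketch filters SNPs by position. The function name `snps_in_window`, the coordinate values, and the assumption that all positions lie on the same chromosome are hypothetical.

```python
from typing import Dict, List


def snps_in_window(
    snp_positions: Dict[str, int], locus_position: int, flank_bp: int = 250_000
) -> List[str]:
    """Return SNPs lying within +/- flank_bp of the locus of interest.

    Assumes all positions are base-pair coordinates on the same chromosome.
    """
    if flank_bp < 0:
        raise ValueError("flank_bp must be non-negative")
    return sorted(
        snp
        for snp, position in snp_positions.items()
        if abs(position - locus_position) <= flank_bp
    )
```

Widening `flank_bp` to, e.g., 1,000,000 or 5,000,000 corresponds to the longer +/−1 Mb and +/−5 Mb consensus haplotypes mentioned above.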
In some embodiments, the investigated SNPs are disease-informative SNPs. In some embodiments, the SNPs are haplotype-informative SNPs. As used herein, the term “disease-informative SNP” refers to a SNP flanking a founder mutation for a disease. As the SNP flanks the mutation that causes/contributes to the disease, the SNP is thus informative about the presence of the disease, even if the SNP itself is not responsible for the disease. Specific haplotypes are informative for the presence of a disease allele, thus disease-informative SNPs can also be haplotype-informative SNPs. As used herein, a “haplotype-informative SNP” is a SNP that distinguishes between two possible haplotypes. In some embodiments, a haplotype-informative SNP distinguishes between maternal and paternal haplotypes. In some embodiments, a haplotype-informative SNP distinguishes between a disease haplotype and a healthy haplotype. In some embodiments, a haplotype-informative SNP distinguishes between a familial haplotype and a population haplotype. As used herein, the terms “familial haplotype” and “family haplotype” are interchangeable.
In some embodiments, the consensus haplotype comprises at least 500, 750, 1000, 1100, 1200, 1250, 1300, 1400, 1500, 1600, 1700, 1750, 1800, 1900, or 2000 disease-informative SNPs or haplotype-informative SNPs. Each possibility represents a separate embodiment of the invention. In some embodiments, the consensus haplotype comprises at least 1500 disease-informative or haplotype-informative SNPs. In some embodiments, the consensus haplotype comprises about 1700 disease-informative or haplotype-informative SNPs.
The throughput of the above-mentioned sequencing-based methods can be increased with the use of indexing or barcoding. Thus, a sample or subject-specific index or barcode can be added to nucleic acid fragments in a particular nucleic acid sequencing library. Then, a number of such libraries, each with a sample or subject-specific index or barcode, are mixed together and sequenced together. Following the sequencing reactions, the sequencing data can be harvested from each sample or patient based on the barcode or index. This strategy can increase the throughput and thus the cost-effectiveness of embodiments of the current invention.
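The indexing strategy can be sketched, in a non-limiting manner, as a simple demultiplexing step. In the Python example below, the barcode is assumed to occupy the first bases of each read and to be matched exactly; the function name `demultiplex`, the barcode sequences, and the sample labels are hypothetical and do not describe any particular sequencing platform.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def demultiplex(
    reads: Iterable[str], barcode_to_sample: Dict[str, str], barcode_len: int
) -> Dict[str, List[str]]:
    """Assign pooled reads to samples by an exact-match leading barcode.

    The barcode is trimmed from each assigned read; reads whose barcode is
    not recognized are discarded.
    """
    by_sample: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode)
        if sample is not None:
            by_sample[sample].append(insert)
    return dict(by_sample)
```

In practice, tolerance for sequencing errors within the barcode would typically be added; exact matching is used here only to keep the sketch short.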
In one embodiment, the nucleic acid molecules in the biological sample can be selected or fractionated prior to quantitative genotyping (e.g. sequencing). In one variant, the nucleic acid molecules are treated with a device (e.g. a microarray) which can preferentially bind nucleic acid molecules from selected loci in the genome. Then, the sequencing can be performed preferentially on nucleic acid molecules captured by the device. This scheme will allow one to target the sequencing towards the genomic region of interest. In another embodiment, said sequencing is of loci comprising single nucleotide polymorphisms (SNPs), such as SNPs linked to a disease or disorder. One skilled in the art will appreciate that many SNPs are linked to a disease or disorder. In one embodiment, said SNP is linked to a founder mutation. In another embodiment, said sequencing is of founder mutation-flanking SNPs. In some embodiments, at least 500, 750, 1000, 1100, 1200, 1250, 1300, 1400, 1500, 1600, 1700, 1750, 1800, 1900, or 2000 mutation-flanking SNPs are investigated. Each possibility represents a separate embodiment of the invention.
As used herein, “founder mutation” refers to a mutation that appears in the DNA of one or more individuals who are founders of a distinct population. Founder mutations can initiate with changes that occur in the DNA and are typically passed down to other generations.
In one embodiment, said disease is Gaucher disease, such as Gaucher disease type I. In another embodiment, said founder mutation is N370S (c.1226A>G or p.N409S according to GenBank accession #: NM_001005741.2). In another embodiment, said founder mutation is 84GG (c.84dupG on GenBank sequence NM_001005741.2). Non-limiting examples of founder mutations for which the prenatal testing of the invention would be relevant include those implicated in long QT syndrome within the Finnish population (Marjamaa et al. 2009 Ann Med 41:234-240); the delF508 mutation in CFTR causing cystic fibrosis in the Caucasian European population (Moral et al. 1994, Nat Genet 7:169-175); a mutation in the SERPINA1 gene causing alpha1-antitrypsin deficiency in Scandinavian Caucasians (Cox et al. 1985, Nature 316:79-81); a mutation in Colombians causing early onset Alzheimer's disease (Lalli et al., 2013, Alzheimers Dement, S277-S283); scores of founder mutations in the Tunisian (Romdhane et al., 2012 Orphanet J Rare Dis 7:52) and Ashkenazi Jewish (AJ) populations (Zlotogora, J. 2014, Mendelian disorders among Jews); mutations residing in the HBB gene which cause beta-thalassemia in Mediterranean and Asian populations (Cao and Galanello. Genet Med. 2010 February; 12(2):61-76); and the mutation c.191dupA in the ANO5 gene which is highly predictive of adult limb-girdle muscular dystrophy (Bushby et al. 2011, Brain January; 134(Pt 1):171-82).
In one embodiment, the disease is cystic fibrosis. In some embodiments, the founder mutation is 3121-1G>A in an intron of the CFTR gene.
Founder mutations have also been identified in many types of cancers. Some non-limiting examples of cancer-related founder mutations are mutations in the BRCA1 and BRCA2 genes associated with breast cancer. The P57T, R603C, Q630C and A628K founder variants of the netrin-1 receptor UNC5C have been implicated in the predisposition and carcinogenesis leading to solid cancers in humans (EP patent application 2267153).
In another embodiment, the methods and kits disclosed herein are useful for determining the susceptibility to a microdeletion or microduplication syndrome, such as Prader-Willi syndrome, Angelman syndrome, DiGeorge syndrome, Smith-Magenis syndrome, Rubinstein-Taybi syndrome, Miller-Dieker syndrome, Williams syndrome, and Charcot-Marie-Tooth syndrome, or a disorder selected from the group consisting of Cri du Chat syndrome, Retinoblastoma, Wolf-Hirschhorn syndrome, Wilms tumor, spinobulbar muscular atrophy, cystic fibrosis, Gaucher disease, Marfan syndrome and sickle cell anemia.
One skilled in the art will appreciate that the length of sequence to be analyzed according to the methods described herein depends on the specific haplotype to be determined. In some embodiments, the number of loci along a chromosome that need to be sequenced is between 5,000 and 10,000 loci; between 10,000 and 50,000 loci; between 1,000 and 500 loci; between 500 and 300 loci; between 300 and 200 loci; between 200 and 150 loci; between 150 and 100 loci; between 100 and 50 loci; between 50 and 20 loci; or between 20 and 10 loci. In some embodiments, at least 2 loci, at least 10 loci, at least 20 loci, at least 50 loci, at least 100 loci, at least 1,000 loci, at least 5,000 loci or at least 10,000 loci are sequenced.
In another embodiment, the method further comprises analyzing said replicate of fetal nucleic acid sequence, wherein a high identity of said fetal haplotype to a consensus haplotype indicates that said fetus is a carrier of a maternal and/or paternal haplotype. In some embodiments, the consensus haplotype is a population haplotype. As used herein, a “population haplotype” refers to a haplotype that was generated for a given population that is not related to the fetus. In some embodiments, a population haplotype is not a familial haplotype. A population haplotype is not a haplotype generated from one particular family with one particular disease allele. In some embodiments, a population haplotype is constructed from a particular ethnicity. In some embodiments, the ethnicity is one with a high prevalence of a particular heritable disease. In some embodiments, the ethnicity is Ashkenazi Jews.
In some embodiments, the population haplotype is useful for comparison against fetal DNA from more than one unrelated fetus. In some embodiments, the methods of the invention are for use in non-invasively predicting an increased risk of a disease-associated parental haplotype inherited by more than one unrelated fetus from more than one unrelated pregnant female. In some embodiments, the methods of the invention are for non-invasively predicting an increased risk of a monogenic disease or disorder in more than one unrelated fetus. In some embodiments, the methods of the invention are universal NIPD assays that can be used for any fetus without knowledge of parental haplotypes.
In some embodiments, the methods of the invention further comprise generating a population haplotype from a general population to use as a reference for analyzing fetal nucleic acid sequences. The use of a population haplotype, together with at least 100× coverage for a given SNP or haplotype, allows for accurate universal diagnosis from fetal nucleic acids without knowledge of the parental haplotypes.
In some embodiments, the term “high identity” as used herein refers to at least 90% identity of said fetal haplotype to a consensus haplotype. In another embodiment high identity refers to at least 95% identity of said fetal haplotype to a consensus haplotype. In another embodiment high identity refers to at least 98% identity of said fetal haplotype to a consensus haplotype. In another embodiment high identity refers to at least 99% identity of said fetal haplotype to a consensus haplotype.
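By way of illustration only, the identity thresholds above can be sketched in Python. The source does not specify how identity is computed; the sketch assumes identity is the fraction of compared loci at which the fetal allele matches the consensus allele, and the function names `haplotype_identity` and `is_high_identity` are hypothetical.

```python
from typing import Sequence


def haplotype_identity(fetal: Sequence[str], consensus: Sequence[str]) -> float:
    """Return the fraction of loci at which the fetal and consensus alleles match.

    Assumption: both haplotypes are given as allele calls over the same
    ordered set of loci.
    """
    if len(fetal) != len(consensus):
        raise ValueError("haplotypes must cover the same loci")
    if not fetal:
        raise ValueError("at least one locus is required")
    matches = sum(f == c for f, c in zip(fetal, consensus))
    return matches / len(fetal)


def is_high_identity(fetal: Sequence[str], consensus: Sequence[str],
                     threshold: float = 0.95) -> bool:
    """Apply one of the thresholds named in the text (0.90, 0.95, 0.98 or 0.99)."""
    return haplotype_identity(fetal, consensus) >= threshold
```

For example, a fetal haplotype matching the consensus at 9 of 10 loci would satisfy the 90% embodiment but not the 95% embodiment.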
In another embodiment, the method further comprises analyzing said replicate of fetal nucleic acid sequence, wherein a high identity of said fetal haplotype to a family-based haplotype indicates that said fetus is a carrier of a maternal and/or paternal haplotype. As used herein, a “parental” haplotype refers to a maternal, paternal or common haplotype. In some embodiments, a parental haplotype comprises a haplotype generated from a combining of the paternal and maternal haplotype. In some embodiments, the parental haplotype is derived from sequencing of samples taken from first-degree relatives of either parent.
According to another embodiment, said analyzing said replicate of fetal nucleic acid sequence comprises determining one or more paternal haplotype informative single-nucleotide polymorphisms (SNPs) in at least one replicate of fetal nucleic acid, wherein said paternal haplotype informative SNPs are not present in the maternal genotype, thereby determining unique paternal SNPs identified in the fetus. In some embodiments, the methods of the invention are for use in non-invasively predicting an increased risk of a disease-associated paternal haplotype inherited by a fetus of a pregnant female, wherein the consensus haplotype is a consensus paternal haplotype derived from the father, a first-degree paternal family member or a combination thereof.
According to another embodiment, said analyzing said replicate of fetal nucleic acid sequence comprises determining maternal haplotype informative SNPs in one or more replicates of fetal nucleic acid, thereby determining maternal haplotype in said fetus.
One skilled in the art would appreciate that in instances where parental homozygosity overlaps with a consensus haplotype, larger genetic regions may be analysed, so as to increase the probability of heterozygote locus identification. In some embodiments, larger genetic regions include up to hundreds or thousands of additional SNPs.
According to some embodiments, said method is for predicting an increased risk of a monogenic disease or disorder in a fetus of a pregnant female. According to another embodiment, said maternal haplotype comprises a founder haplotype encompassing a founder mutation, said method being useful for predicting an increased risk of said founder mutation in said fetus. According to another embodiment, said monogenic disease or disorder is caused by, or strongly associated with, a founder mutation. According to another embodiment, said monogenic disease or disorder presents with autosomal recessive inheritance.
Non-limiting examples of diseases or disorders caused by, or strongly associated with, a founder mutation include Gaucher disease, cystic fibrosis, beta-thalassemia, sickle cell anemia, Amegakaryocytic Thrombocytopenia, Alpha 1-antitrypsin deficiency, Ataxia Telangiectasia, Autoimmune Polyglandular Syndrome, Bardet Biedl syndrome, Bloom syndrome, Canavan disease, Costeff syndrome, Cystinosis, Dihydrolipoamide dehydrogenase deficiency, Ellis-van Creveld syndrome, Familial Dysautonomia, Familial hyperinsulinemia, Fanconi anemia C, Glycogen Storage Disease Type Ia, Hermansky-Pudlak syndrome, Homocystinuria, autosomal recessive Hydrocephalus, Joubert syndrome 2, Leber congenital amaurosis, Leigh syndrome, Microcephaly with complex motor and sensory axonal neuropathy, Maple Syrup Urine Disease (MSUD), Megalencephalic leukoencephalopathy with subcortical cysts, Mitochondrial neurogastrointestinal encephalopathy syndrome, Mucolipidosis IV, Nemaline myopathy, Niemann-Pick disease A, Osteopetrosis, Pendred syndrome, Pontocerebellar hypoplasia type 1, Progressive cerebello-cerebral atrophy, Retinitis pigmentosa, Rothmund-Thomson syndrome, Senior-Loken syndrome, Tay-Sachs disease, Tyrosinemia, Usher syndrome I, Usher syndrome III, Walker Warburg syndrome and Zellweger syndrome.
According to another embodiment, the present invention provides a kit for identifying and/or analyzing fetal haplotype with a high degree of confidence. In one embodiment, the kit comprises one or more components for sequencing a nucleic acid sample (e.g., fetal nucleic acid sequence) at a depth of at least 100× coverage.
The kits may include, in some embodiments, ligands and buffers for practicing the disclosed methods. The kits may include, in some embodiments, at least one vial, test tube, flask, bottle, syringe or the like.
In another embodiment, there is provided a method for prenatal diagnosis of Gaucher type I. In another embodiment, said method comprises the method comprising: obtaining a fetal nucleic acid sequence sequenced, said fetal nucleic acid sequence being derived from plasma DNA samples obtained from a pregnant female; wherein at least one SNP listed in
As used herein, the term “Single Nucleotide Polymorphism” or “SNP” refers to a single nucleotide that may differ between the genomes of two members of the same species. The usage of the term should not imply any limit on the frequency with which each variant occurs.
The process of determining which specific nucleotide (i.e., allele) is present at each of one or more SNP positions is referred to as SNP genotyping. The present invention provides methods of SNP genotyping, such as for use in screening for a variety of disorders, or determining predisposition thereto, or determining responsiveness to a form of treatment, or prognosis, or in genome mapping or SNP association analysis.
According to one aspect the present invention provides a method for non-invasively predicting an increased risk of maternal and/or paternal haplotypes inherited by a fetus of a pregnant female, the method comprising: obtaining a fetal SNP genotype derived from DNA samples obtained from the pregnant female; and analyzing fetal SNP genotype, wherein at least 95% identity of said fetal SNP haplotype to a consensus haplotype indicates that said fetus is a carrier of a maternal and/or paternal haplotype; thereby predicting an increased risk of a maternal and/or paternal haplotype inherited by said fetus.
In another embodiment, determining at least part of a fetal genome could be used for paternity testing by comparing the deduced fetal genotype or haplotype with the genotype or haplotype of the alleged father.
Nucleic acid samples can be genotyped to determine which allele(s) is/are present at any given genetic region (e.g., SNP position) of interest by methods well known in the art. The neighboring sequence can be used to design SNP detection reagents such as oligonucleotide probes, which may optionally be implemented in a kit format. Exemplary SNP genotyping methods are described in Chen et al., “Single nucleotide polymorphism genotyping: biochemistry, protocol, cost and throughput”, Pharmacogenomics J. 2003; 3(2):77-96; Kwok et al., “Detection of single nucleotide polymorphisms”, Curr Issues Mol. Biol. 2003 April; 5(2):43-60; Shi, “Technologies for individual genotyping: detection of genetic polymorphisms in drug targets and disease genes”, Am J Pharmacogenomics. 2002; 2(3):197-205; and Kwok, “Methods for genotyping single nucleotide polymorphisms”, Annu Rev Genomics Hum Genet 2001; 2:235-58. Exemplary techniques for high-throughput SNP genotyping are described in Marnellos, “High-throughput SNP analysis for genetic association studies”, Curr Opin Drug Discov Devel. 2003 May; 6(3):317-21.
Common SNP genotyping methods include, but are not limited to, TaqMan assays, molecular beacon assays, nucleic acid arrays, allele-specific primer extension, allele-specific PCR, arrayed primer extension, homogeneous primer extension assays, primer extension with detection by mass spectrometry, pyrosequencing, multiplex primer extension sorted on genetic arrays, ligation with rolling circle amplification, homogeneous ligation, OLA (see, e.g., U.S. Pat. No. 4,988,167), multiplex ligation reaction sorted on genetic arrays, restriction-fragment length polymorphism, single base extension-tag assays, and the Invader assay. Such methods may be used in combination with detection mechanisms such as, for example, luminescence or chemiluminescence detection, fluorescence detection, time-resolved fluorescence detection, fluorescence resonance energy transfer, fluorescence polarization, mass spectrometry, and electrical detection.
In another embodiment, a “sequence” refers to a DNA sequence or a genetic sequence. It may refer to the primary, physical structure of the DNA molecule or strand in an individual. It may refer to the sequence of nucleotides found in that DNA molecule, or the complementary strand to the DNA molecule. It may refer to the information contained in the DNA molecule as its representation in silico.
In another embodiment, a “locus” refers to a particular region of interest on the DNA of an individual, which may refer to a SNP, the site of a possible insertion or deletion, or the site of some other relevant genetic variation. Disease-linked SNPs may also refer to disease-linked loci. Polymorphic Allele, also “Polymorphic Locus,” refers to an allele or locus where the genotype varies between individuals within a given species. Some examples of polymorphic alleles include single nucleotide polymorphisms, short tandem repeats, deletions, duplications, and inversions. Polymorphic Site refers to the specific nucleotides found in a polymorphic region that vary between individuals.
Haplotype refers to a combination of alleles at multiple loci that are typically inherited together on the same chromosome. Haplotype may refer to as few as two loci or to an entire chromosome depending on the number of recombination events that have occurred between a given set of loci. Haplotype can also refer to a set of SNPs on a single chromatid that are statistically associated.
Genetic data also “genotypic data” refers to the data describing aspects of the genome of one or more individuals. It may refer to one or a set of loci, partial or entire sequences, partial or entire chromosomes, or the entire genome. It may refer to the identity of one or a plurality of nucleotides; it may refer to a set of sequential nucleotides, or nucleotides from different locations in the genome, or a combination thereof. Genotypic data is typically in silico, however, it is also possible to consider physical nucleotides in a sequence as chemically encoded genetic data. Genotypic Data may be said to be “on,” “of,” “at,” “from” or “on” the individual(s). Genotypic Data may refer to output measurements from a genotyping platform where those measurements are made on genetic material.
“Genetic material” or “Genetic sample” refers to physical matter, such as tissue or blood, from one or more individuals comprising DNA or RNA.
Allelic data refers to a set of genotypic data concerning a set of one or more alleles. It may refer to the phased, haplotypic data. It may refer to SNP identities, and it may refer to the sequence data of the DNA, including insertions, deletions, repeats and mutations. It may include the parental origin of each allele.
Confidence refers to the statistical likelihood that the called SNP, allele or set of alleles correctly represents the real genetic state of the individual.
Homozygous refers to having similar alleles at corresponding chromosomal loci. Heterozygous refers to having dissimilar alleles at corresponding chromosomal loci.
Maternal Plasma refers to the plasma portion of the blood from a female who is pregnant. Parental context refers to the genetic state of a given SNP, on each of the two relevant chromosomes for one or both of the two parents of the target.
Clinical decision refers to any decision to take or not take an action that has an outcome that affects the health or survival of an individual. In the context of prenatal diagnosis, a clinical decision may refer to a decision to abort or not abort a fetus. A clinical decision may also refer to a decision to conduct further testing, to take actions to mitigate an undesirable phenotype, or to take actions to prepare for the birth of a child with abnormalities.
The term “HLA-type” refers to the complement of HLA antigens present on the cells of an individual. An individual's HLA-type may be used to predict favorable donor-recipient pairs for tissue transplant or blood transfusion or may be used as an indicator of the individual's susceptibility to certain diseases or conditions. In particular, an individual's HLA serotype can be used to predict compatibility between a blood transfusion donor and recipient. An HLA-type can be determined according to the proteins expressed from particular alleles of genes in the MHC region; for example an HLA-type can refer to specific HLA class I proteins or HLA class II proteins. Typically, genes that may be represented in an HLA-type include one or more genes selected from the group consisting of HLA-A, HLA-B, HLA-Cw, HLA-DR, HLA-DQ and HLA-DP. Terminology for specific HLA-types is usually expressed in accordance with reports released by the World Health Organization Committee on Nomenclature.
The term “HLA gene” as used herein refers to a genomic nucleotide sequence that expresses an HLA class I or HLA class II protein. Class I HLA genes include HLA-A, HLA-B and HLA-C, and class II HLA genes include HLA-DR, HLA-DQ, HLA-DQB1, and HLA-DP. The genes include a coding region which is a portion of the genomic sequence that is transcribed into mRNA and translated into a protein product. The genes further include portions of the genomic sequence that regulate expression of particular protein products. In another embodiment the present invention is a method for inferring fetal HLA genotype by comparison to a predetermined consensus haplotype.
In some embodiments, the methods of the invention comprise ASP-SEQ. In some embodiments, the methods of the invention are for predicting an increased risk of paternal haplotypes inherited by a fetus. In some embodiments, only the risk of paternal haplotypes can be predicted and not maternal haplotypes. In some embodiments, the increased risk of disease or disorder in the fetus is due to inheritance of a paternal disease or disorder causing or contributing allele. In some embodiments, the disease is cystic fibrosis.
In some embodiments, the methods of the invention are for very early non-invasive prenatal diagnosis. In some embodiments, the methods of the invention are performed very early during pregnancy. In some embodiments, very early is before week 9 of gestation. In some embodiments, very early is before week 8 of gestation. In some embodiments, very early is between weeks 4 and 10, 4 and 9, 4 and 8, 5 and 10, 5 and 9 or 5 and 8 of gestation. Each possibility represents a separate embodiment of the invention. In some embodiments, the DNA samples are obtained from the pregnant female during weeks 5 to 8 of gestation. In some embodiments, the methods of the invention can be performed from very early in gestation and onward.
In some embodiments, the replicate is obtained from week 5 of gestation and onward. In some embodiments, the replicate is obtained from week 4 and onward. In some embodiments, both samples of the replicate are obtained between weeks 5 and 8. In some embodiments, one sample of the replicate is obtained very early and the other sample is obtained after the very early time period. In some embodiments, one sample of the replicate is obtained between weeks 5 to 8 and the other sample is obtained after week 8.
As used herein, a “replicate” refers to at least two samples taken from the same mother, but at different time points. In some embodiments, the replicates are taken on different days. In some embodiments, the replicates are taken at least a week apart. In some embodiments, the replicates are taken at least 2 weeks apart. In some embodiments, the use of a replicate increases the accuracy and reliability of the methods of the invention over the performance of the same method but with only a single sample. In some embodiments, use of a replicate increases accuracy of the method by at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, or 300% as compared to performing the method with only one sample. Each possibility represents a separate embodiment of the invention.
In some embodiments, the consensus haplotype is a paternal haplotype. In some embodiments, the consensus haplotype is a paternal family haplotype. In some embodiments, the consensus haplotype is based on the fetus's father, at least one first-degree paternal family member, or a combination thereof. In some embodiments, the consensus haplotype is a parental haplotype. In some embodiments, the consensus haplotype is a maternal, paternal or common haplotype. As used herein, the term “common” in reference to haplotypes refers to a haplotype that is shared, or common, to both parents. In some embodiments, a common haplotype is a consensus haplotype generated from both the maternal and paternal haplotypes. Such a consensus haplotype, though common to the two, would not be identical to either of the parents' haplotypes. In some embodiments, the consensus parental haplotype is not a population haplotype.
As used herein, a “first-degree parental family member” refers to a family member with only one degree of separation from the parent. This includes parents, siblings and children of the parent of the fetus.
In some embodiments, at least one replicate of fetal nucleic acid sequence is obtained. In some embodiments, at least two replicates of fetal nucleic acid sequence are obtained. In some embodiments, the replicates are obtained during the very early prenatal period. In some embodiments, the replicates are obtained at least 1 week apart. In some embodiments, the replicates are obtained between weeks 5 to 8 of gestation.
In some embodiments, the fetal nucleic acid sequence comprises a proportion of the DNA sample obtained from the pregnant female too low to perform Targeted Deep Sequencing (TDS). In some embodiments, the fetal nucleic acid sequence comprises not more than 4%, 3.5%, 3%, 2.5%, 2%, 1.5%, or 1% of the DNA sample obtained from the pregnant female. Each possibility represents a separate embodiment of the invention. In some embodiments, the fetal nucleic acid sequence comprises less than or equal to 4%, 3.5%, 3%, 2.5%, 2%, 1.5%, 1% or 0.5% of the DNA samples obtained from the pregnant female. Each possibility represents a separate embodiment of the invention. In some embodiments, fetal nucleic acid sequences comprise less than or equal to 4%, 3.5%, 3%, 2.5%, 2%, 1.5%, 1% or 0.5% of the DNA sample obtained from the pregnant female. Each possibility represents a separate embodiment of the invention. In some embodiments, the fetal nucleic acid sequence comprises less than or equal to 4% of the DNA samples obtained from the pregnant female. In some embodiments, the fetal nucleic acid sequence comprises less than or equal to 1% of the DNA samples obtained from the pregnant female. In some embodiments, the fetal nucleic acid sequence comprises less than 4% of the DNA samples obtained from the pregnant female. In some embodiments, the fetal nucleic acid sequence comprises less than 1% of the DNA samples obtained from the pregnant female.
In some embodiments, the fetal nucleic acid sequence is present in the sample at a concentration too low to perform TDS. In some embodiments, the fetal nucleic acid sequence is present in the sample at a concentration of or less than 4, 3.5, 3, 2.5, 2, 1.5, 1, or 0.5 picograms (pg)/microlitre (ul). In some embodiments, the fetal nucleic acid sequence is present in the sample at a concentration of or less than 4 pg/ul. In some embodiments, the fetal nucleic acid sequence is present in the sample at a concentration of or less than 1 pg/ul.
As used herein, the term “about” when combined with a value refers to plus and minus 10% of the reference value. For example, a length of about 1000 nanometers (nm) refers to a length of 1000 nm ± 100 nm.
It is noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a polynucleotide” includes a plurality of such polynucleotides and reference to “the polypeptide” includes reference to one or more polypeptides and equivalents thereof known to those skilled in the art, and so forth. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements or use of a “negative” limitation.
In those instances where a convention analogous to “at least one of A, B, and C, etc.” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., “a system having at least one of A, B, and C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase “A or B” will be understood to include the possibilities of “A” or “B” or “A and B.”
It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination. All combinations of the embodiments pertaining to the invention are specifically embraced by the present invention and are disclosed herein just as if each and every combination was individually and explicitly disclosed. In addition, all sub-combinations of the various embodiments and elements thereof are also specifically embraced by the present invention and are disclosed herein just as if each and every such sub-combination was individually and explicitly disclosed herein.
Additional objects, advantages, and novel features of the present invention will become apparent to one ordinarily skilled in the art upon examination of the following examples, which are not intended to be limiting. Additionally, each of the various embodiments and aspects of the present invention as delineated hereinabove and as claimed in the claims section below finds experimental support in the following examples.
Materials and Methods
Sample Collection and DNA Extraction
Pregnant Ashkenazi Jewish (AJ) couples, carrying mutation(s) in the GBA gene, were recruited at the Shaare Zedek Medical Center (SZMC) Gaucher Clinic. Peripheral blood samples were collected from each couple, relevant mutation carrier family members, 8 unrelated AJ GBA N370S homozygotes, and 3 unrelated AJ GBA N370S heterozygote duos. Genomic DNA was then prepared from all samples using the FlexiGene DNA kit (QIAGEN) according to the manufacturer's protocol. For pregnant female indices, plasma was separated from peripheral blood by centrifugation at 1,900×g for 10 minutes at 4° C. The plasma supernatant was then recentrifuged at 16,000×g for 10 minutes at 4° C. and 3 ml of the resulting supernatant was used for cell-free DNA extraction with the QIAamp Circulating Nucleic Acid kit (QIAGEN) according to the manufacturer's protocol. The maternal plasma DNA extracts were then pre-amplified, in duplicate, with the SurePlex Amplification System (Illumina) ahead of downstream processing. All familial mutations in GBA were Sanger sequence verified prior to commencement of the study. Ethical approval for the study, including usage of materials from human subjects, was obtained from the local institutional review board and written informed consent was obtained from all study participants.
Two TruSeq Custom Amplicon panels were designed with Design Studio software (Illumina) to amplify and sequence GBA-flanking SNPs in all samples. The smaller panel sequenced 490 SNPs and the larger panel sequenced 5,000 SNPs. Indexed next generation sequencing libraries were prepared and normalized according to the manufacturer's protocol (Illumina) followed by 2×150 bp paired-end sequencing on a MiSeq (small panel) or NextSeq 500 (large panel) instrument (Illumina) to a mean depth of at least 500× or 3800× for genomic and plasma DNA samples, respectively. After sequencing runs, the data were aligned to target sequences on the human reference genome (hg19) using MiSeq Reporter software (Illumina) for the small panel or the TruSeq Amplicon v1.1 app on BaseSpace (https://basespace.illumina.com/) for the large panel. Genotyping data were extracted from each alignment using the SAMtools mpileup program to yield sample-specific SNP genotype profiles and then the SNPs were annotated by snpEff with dbSNP138 (small panel) or dbSNP141 (large panel). These profiles were then combined into single family-specific .csv files using in-house software so as to facilitate familial and fetal linkage analysis (see below). Prior to linkage analysis, non-GBA flanking SNP calls and SNP calls on heavily self-chained genomic segments were removed. Genomic DNA SNP genotype calls were categorized into one of 3 distinct classifications based on the percentage of non-reference genome allele (B allele) sequencing reads at each locus: homozygote reference allele (AA; 0%-20% B allele reads); homozygote non-reference allele (BB; 80%-100% B allele reads); or heterozygote (AB; 30%-70% B allele reads). Any loci that did not meet these classification criteria were excluded from further downstream analysis. As a rule, parental haplotypes were constructed with SNPs for which the parent was heterozygous and at least one of his/her first degree relatives was homozygous.
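The three-way genomic DNA genotype classification described above can be sketched as follows. This is an illustrative sketch rather than the in-house software referred to in the text; the function name `classify_genotype` and its read-count inputs are hypothetical, and the range boundaries are assumed to be inclusive.

```python
from typing import Optional


def classify_genotype(b_reads: int, total_reads: int) -> Optional[str]:
    """Classify a genomic DNA SNP call from its percentage of B-allele reads.

    Returns "AA" (0%-20% B), "AB" (30%-70% B), "BB" (80%-100% B), or None
    when the locus falls outside these ranges and is excluded from analysis.
    """
    if total_reads <= 0 or not 0 <= b_reads <= total_reads:
        return None  # no usable coverage or inconsistent counts
    pct_b = 100.0 * b_reads / total_reads
    if pct_b <= 20.0:
        return "AA"
    if pct_b >= 80.0:
        return "BB"
    if 30.0 <= pct_b <= 70.0:
        return "AB"
    return None  # ambiguous zone (20%-30% or 70%-80% B-allele reads)
```

Returning `None` for the two ambiguous zones mirrors the exclusion of loci that do not meet the classification criteria.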
Construction of Consensus AJ N370S and Familial Haplotypes
The initial consensus AJ N370S GBA-flanking haplotype was constructed by performing homozygosity mapping with custom SNP small panel NGS datasets from 7 unrelated AJ N370S homozygotes (14 N370S chromosomes). Subsequently, 6 more AJ N370S haplotypes were derived from linkage analysis on SNP NGS datasets from 6 unrelated AJ N370S mutation carrier duos. Each linkage-based N370S haplotype was then crossed with the consensus sequence derived from homozygosity mapping to identify inconsistencies. These sequence discrepancies were then used to mark consensus AJ N370S founder haplotype cut-offs (based on 20 N370S chromosomes, altogether, after the completion of all data intersections). The larger consensus AJ N370S GBA-flanking haplotype was constructed by performing homozygosity mapping with custom SNP large panel NGS datasets from 8 unrelated AJ N370S homozygotes (16 N370S chromosomes). Subsequently, 12 more AJ N370S haplotypes were derived from linkage analysis on SNP NGS datasets from 12 unrelated AJ N370S mutation carrier duos. The final consensus AJ N370S founder haplotype cut-offs (based on 28 N370S chromosomes, altogether, after the completion of all data intersections) were then set as described above regarding the initial consensus haplotype construct.
Identification of Fetal Alleles in Maternal Plasma DNA
In order to construct credible small fetal haplotypes (composed of <5 SNPs) with the small SNP sequencing panel, plasma DNA samples were sequenced in duplicate at high depth (>3,000× mean coverage) so as to augment statistical confidence in each individual fetal SNP genotype call. In all, four different combinations of parental SNP genotypes were analyzed in plasma DNA: A) Error rate informative (father and mother [of the fetus] both homozygote “AA”); B) Dosage informative (father and mother homozygote for opposite alleles); C) Paternal haplotype informative (father heterozygote and mother homozygote); and D) Maternal haplotype informative SNPs (mother heterozygote and father homozygote). Error rate informative SNPs measured the sequencing error rate in plasma DNA samples by assessing the appearance of biologically impossible SNP reads. At >1000× read depth, error rates of 0.6% ± 0.6% were measured in plasma DNA samples. Dosage informative SNPs (denoted hereinafter as “SNP I”) measured the paternal portion of fetal plasma DNA by determining the fraction of paternal alleles per maternal alleles. These SNPs also confirmed the presence of fetal DNA in maternal plasma. Paternal haplotype informative SNPs (denoted hereinafter as “SNP II”) feature a unique nucleotide in the fetus' father that is not present in the maternal genotype. When identified in maternal plasma DNA, the paternal unique allele is expected to comprise the same fraction as those of paternal alleles in dosage informative SNPs. In general, the paternal haplotype of the fetus was deduced wherever the father's unique SNP II allele was identified in one of 2 plasma DNA replicates (at a SNP position with >1000× sequencing depth) with relatively high frequency (>2σ from the mean sequencing error rate as determined from error rate informative SNPs) in maternal plasma DNA.
The computed sensitivity/specificity scores for this method are provided as a function of the number of unique paternal SNPs identified in the fetus (see Table 1).
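For illustration, the four parental genotype combinations and the paternal unique-allele call described above may be sketched as follows. The function names are hypothetical. The default error mean and standard deviation (0.6% each) are taken from the measured plasma error rate, and reading the detection threshold as the mean plus 2σ is an assumption about how the cut-off was applied.

```python
from typing import Iterable, Optional, Tuple

HOMOZYGOUS = {"AA", "BB"}


def parental_context(father: str, mother: str) -> Optional[str]:
    """Assign a SNP to one of the four informative parental combinations."""
    if father == "AA" and mother == "AA":
        return "error_rate"  # A) the text names only the "AA"/"AA" case
    if father in HOMOZYGOUS and mother in HOMOZYGOUS and father != mother:
        return "dosage"      # B) homozygous for opposite alleles (SNP I)
    if father == "AB" and mother in HOMOZYGOUS:
        return "paternal"    # C) paternal haplotype informative (SNP II)
    if mother == "AB" and father in HOMOZYGOUS:
        return "maternal"    # D) maternal haplotype informative (SNP III)
    return None              # all other combinations were not utilized


def paternal_allele_detected(replicates: Iterable[Tuple[float, int]],
                             mean_error: float = 0.006,
                             sd_error: float = 0.006,
                             min_depth: int = 1000) -> bool:
    """Return True if the father's unique allele is called in any replicate.

    Each replicate is (unique_allele_fraction, read_depth). Assumption: a
    call requires depth > min_depth and a fraction above mean + 2 sigma.
    """
    threshold = mean_error + 2.0 * sd_error
    return any(depth > min_depth and fraction > threshold
               for fraction, depth in replicates)
```

With the default values, the threshold evaluates to an allele fraction of about 1.8%.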
For plasma DNA samples with high fetal dosage (>30% paternal fetal fraction), the paternal haplotype in the fetus was also deduced from non-unique SNP II alleles (with >500× coverage) for which there were no discrepancies between replicate fetal haplotype calls. The computed sensitivity/specificity scores for this method are provided as a function of the number of non-unique paternal SNPs identified in the fetus (see Table 2).
The formula for these calculations was as follows: $1 - \left(\left[(0.5)(1 - er)\right]^{2}\right)^{n}$, where “n” represents the number of SNPs in the fetal haplotype and “er” represents the chance (which is 5%) of unique paternal allele detection at 2σ from the sequencing error rate as determined from error rate informative SNP sequences. For 1 to 4 SNP haplotypes, a 0.03% correction was applied to account for the sex-specific male recombination rate in the ±250 kb genomic region surrounding GBA, but if longer haplotypes do not flank the mutation, this correction should continue to be applied.
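As a worked illustration, the paternal formula above can be evaluated numerically. The sketch reads the trailing “2” and “n” in the formula as exponents, which is an assumption about the original typesetting; the function name `paternal_haplotype_score` is hypothetical, and the 0.03% recombination correction is omitted because the text does not state how it is applied.

```python
def paternal_haplotype_score(n: int, er: float = 0.05) -> float:
    """Evaluate 1 - ([(0.5)(1 - er)]^2)^n for a fetal haplotype of n SNPs.

    er is the chance of unique paternal allele detection at 2 sigma from the
    sequencing error rate (5% in the text). No recombination correction is
    applied in this sketch.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    per_snp = (0.5 * (1.0 - er)) ** 2
    return 1.0 - per_snp ** n


for n in range(1, 5):
    print(n, round(paternal_haplotype_score(n), 6))
```

Under this reading, the score rises with each additional unique paternal SNP in the haplotype.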
Maternal haplotype informative SNPs (denoted hereinafter as “SNP III”) were used to determine the maternal haplotype in the fetus at >1000× sequencing coverage. These SNPs indicated a heterozygous fetal genotype when allele-allele ratios were balanced, and a homozygous fetal genotype when these ratios were imbalanced by a number >3σ from the mean sequencing error rate (as determined from error rate informative SNPs). Depending on the father's homozygous allele, the maternal fetal allele was deduced based on the presence or absence of skewing (<50% non-reference nucleotide skewed representation if the father was homozygote A [for the reference nucleotide]; >50% non-reference nucleotide skewed if the father was homozygote B [for the non-reference nucleotide]) in maternal heterozygous SNP III loci on both plasma DNA replicates. The computed sensitivity/specificity scores for this method are provided as a function of the number of maternal haplotyped SNPs identified in the fetus (see Table 3).
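The skew-based deduction of the maternally transmitted allele can be sketched as below. This is a simplified illustration: the function names are hypothetical, and the minimum skew is assumed to be a deviation from a 50% B-allele fraction larger than the mean error rate plus 3σ, which is one possible reading of the criterion in the text.

```python
from typing import Optional, Sequence


def maternal_allele_from_replicate(b_fraction: float, father: str,
                                   mean_error: float = 0.006,
                                   sd_error: float = 0.006) -> Optional[str]:
    """Deduce the maternally inherited allele at one SNP III locus."""
    if father not in ("AA", "BB"):
        raise ValueError("father must be homozygous at a SNP III locus")
    min_skew = mean_error + 3.0 * sd_error  # assumed significance cut-off
    deviation = b_fraction - 0.5
    if abs(deviation) <= min_skew:
        # Balanced ratio: fetus is heterozygous, so the mother transmitted
        # the allele that the father does not carry.
        return "B" if father == "AA" else "A"
    if father == "AA" and deviation < 0:
        return "A"  # B allele under-represented: fetus is homozygous AA
    if father == "BB" and deviation > 0:
        return "B"  # B allele over-represented: fetus is homozygous BB
    return None     # skew direction inconsistent with the father's genotype


def maternal_allele(b_fractions: Sequence[float], father: str) -> Optional[str]:
    """Require the same call on every plasma DNA replicate, else no call."""
    calls = {maternal_allele_from_replicate(b, father) for b in b_fractions}
    return calls.pop() if len(calls) == 1 else None
```

Requiring agreement across replicates reflects the condition that skewing be assessed on both plasma DNA replicates.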
The formula for these calculations was as follows: $1 - \left[(0.5)^{2}\right]^{n}$, where “n” represents the number of SNPs in the fetal haplotype. For 1 to 4 SNP haplotypes, a 0.07% correction was applied to account for the sex-specific female recombination rate in the +/−250 kb GBA region; this correction should continue to be applied whenever longer haplotypes do not flank the mutation.
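The maternal formula above can be illustrated in the same way. This is a sketch only: the function names are hypothetical, the 0.07% recombination correction is not applied because the text does not say how it enters the calculation, and the helper that searches for a minimum SNP count is an illustrative addition rather than part of the described method.

```python
def maternal_haplotype_confidence(n_snps: int) -> float:
    """Evaluate 1 - [(0.5)^2]^n for a maternal fetal haplotype of n SNPs.

    The recombination correction mentioned in the text is NOT applied here.
    """
    if n_snps < 1:
        raise ValueError("n_snps must be at least 1")
    return 1.0 - (0.5 ** 2) ** n_snps


def min_snps_for_confidence(target: float) -> int:
    """Smallest n whose uncorrected score reaches the target (illustrative)."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    n = 1
    while maternal_haplotype_confidence(n) < target:
        n += 1
    return n


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, maternal_haplotype_confidence(n))
    print("SNPs needed for 0.99:", min_snps_for_confidence(0.99))
```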
Parental SNP combinations that fell outside the above guidelines were not utilized in this study. In order to construct large fetal haplotypes (composed of >5 SNPs) with the large SNP sequencing panel, plasma DNA samples were analyzed as above with the following modifications. Error rate informative SNPs indicated a 1% error rate at read depths exceeding 100×. Accordingly, paternal haplotype informative and maternal haplotype informative SNPs were assessed from a minimum read depth of 100×, whereupon only skewing exceeding 1% B-allele frequency in plasma DNA with respect to maternal DNA (at a particular locus) was considered significant enough for incorporation into the fetal haplotype. This filter was applied so as to reduce genotyping errors emerging from sequencing error and/or off-target sequence contamination.
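The depth and skew filter described above can be sketched as follows. The class and function names are hypothetical. Two details are assumptions made for illustration: the 100× minimum is applied to both the plasma and the maternal locus, and the skew is measured as the absolute difference in B-allele frequency between plasma and maternal DNA.

```python
from dataclasses import dataclass

MIN_DEPTH = 100        # minimum read depth (100x in the text)
MIN_BAF_SHIFT = 0.01   # 1% B-allele frequency skew (from the text)


@dataclass
class LocusCounts:
    """Read counts at one SNP locus (A = reference, B = non-reference)."""
    a_reads: int
    b_reads: int

    @property
    def depth(self) -> int:
        return self.a_reads + self.b_reads

    @property
    def baf(self) -> float:
        """B-allele frequency; 0.0 when the locus has no reads."""
        return self.b_reads / self.depth if self.depth else 0.0


def passes_skew_filter(plasma: LocusCounts, maternal: LocusCounts) -> bool:
    """True if the locus is deep enough and skewed by more than 1% BAF."""
    if plasma.depth < MIN_DEPTH or maternal.depth < MIN_DEPTH:
        return False
    return abs(plasma.baf - maternal.baf) > MIN_BAF_SHIFT


if __name__ == "__main__":
    mother = LocusCounts(a_reads=1000, b_reads=0)
    print(passes_skew_filter(LocusCounts(970, 30), mother))  # 3% skew
    print(passes_skew_filter(LocusCounts(995, 5), mother))   # 0.5% skew
```

Loci that fail either condition are simply left out of the fetal haplotype rather than being called in either direction.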
Ultimately, fetal diagnosis was achieved after comparing the paternal and maternal cell-free fetal DNA (cffDNA) haplotypes with family-based and/or N370S consensus or near consensus haplotypes as relevant. Altogether, the entire noninvasive NGS-based prenatal test, from blood sample processing to fetal diagnosis, was completed in 5 work days. In addition, all diagnoses were confirmed by post-natal genetic testing. For family 1, allelic inheritance of the N370S mutation was further confirmed by postnatal linkage analysis with short tandem repeat (STR) markers.
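The comparison of a fetal haplotype against candidate parental or consensus haplotypes, as described above, might be sketched as a simple concordance count. This is an illustration under stated assumptions, not the scoring actually used in the study: the function names, the SNP identifiers, and the rule of returning no call on a tie are all hypothetical.

```python
from typing import Dict, Optional, Tuple

Haplotype = Dict[str, str]  # SNP identifier -> allele on that haplotype


def concordance(fetal: Haplotype, reference: Haplotype) -> Tuple[int, int]:
    """Return (matching SNPs, shared SNPs) between two haplotypes."""
    shared = [snp for snp in fetal if snp in reference]
    matches = sum(fetal[snp] == reference[snp] for snp in shared)
    return matches, len(shared)


def best_matching_haplotype(
    fetal: Haplotype, candidates: Dict[str, Haplotype]
) -> Optional[str]:
    """Name of the candidate with the highest match fraction.

    Returns None when no candidate shares a SNP with the fetal haplotype
    or when the top two candidates tie (assumed no-call behaviour).
    """
    scored = []
    for name, hap in candidates.items():
        matches, shared = concordance(fetal, hap)
        if shared:
            scored.append((matches / shared, name))
    if not scored:
        return None
    scored.sort(reverse=True)
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None
    return scored[0][1]


if __name__ == "__main__":
    # Made-up SNP identifiers and alleles, for illustration only.
    fetal = {"snp1": "A", "snp2": "G", "snp3": "T"}
    candidates = {
        "mutation_linked": {"snp1": "A", "snp2": "G", "snp3": "T"},
        "wild_type_linked": {"snp1": "C", "snp2": "G", "snp3": "C"},
    }
    print(best_matching_haplotype(fetal, candidates))
```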
Sample Collection and DNA Extraction for ASP-SEQ
Couples undergoing preimplantation genetic diagnosis (PGD) at the Shaare Zedek Medical Center (SZMC) PGD or Assaf Harofeh IVF clinics were recruited into the study. Pregnant study participants and their partners provided one or more peripheral blood samples between weeks 5 and 8 of gestation. For pregnant female indices, plasma was separated from peripheral blood by centrifugation at 1,900×g for 10 minutes at 4° C. The plasma supernatant was then recentrifuged at 16,000×g for 10 minutes at 4° C. and 3 ml of the resulting supernatant was used for cell-free DNA extraction with the QIAamp Circulating Nucleic Acid kit (QIAGEN) according to the manufacturer's protocol. The maternal plasma DNA extracts were then pre-amplified, in duplicate, with the NEBNext® Ultra™ II DNA Library Prep kit (New England Biolabs) ahead of downstream processing. The inclusion criteria for the investigation were as follows: singleton clinical pregnancy had to be confirmed by ultrasound during week 6 gestation; the couple's first-degree family member genomic DNA samples needed to be available for parental haplotype phasing purposes; and a DNA sample from CVS or amniotic fluid testing from a later stage of pregnancy had to be provided during the course of the study for test validation purposes. Couples who did not meet all of the study's inclusion criteria were excluded from the investigation and not analyzed by any genetic testing methods. Generally, there was a preference to recruit PGD pregnant couples into the study who were carriers of CFTR mutations. However, in the PGD clinic, most couples refrain from performing follow-up invasive prenatal testing to confirm the genetic status of their fetus due to high PGD accuracy rates and fear of miscarriage of a “very precious” pregnancy.
Hence, CF-mutation carriage was not required from study participants who each signed informed consent to allow their plasma samples to be analyzed for non-pathogenic CFTR-proximal single nucleotide polymorphisms (SNPs). For the non-CF couples, all underwent screening for at least 14 common mutations in the CFTR gene prior to the study. Ethical approval for the study, including usage of materials from human subjects, was obtained from the local institutional review board and written informed consent was obtained from all study participants.
Next Generation Sequencing (NGS) of CFTR-Flanking Single Nucleotide Polymorphisms (SNPs)
Custom targeted deep sequencing (TDS) and ASP-SEQ panels were designed to sequence and genotype 1,700 CFTR-flanking SNPs. However, only ASP-SEQ panels were designed to sequence SNP targets in an allele-specific manner. TDS panels sequenced SNP targets without any allele-specificity. Accordingly, the TDS panel was applied to genotype all samples, genomic and plasma DNA, in the study while the ASP-SEQ panel was applied to plasma and genomic maternal DNA samples only. For both TDS and ASP-SEQ panels, indexed next generation sequencing libraries were prepared and normalized according to the manufacturer's protocol (Illumina) followed by 2×150 bp paired-end sequencing on a MiSeq or NextSeq 500 instrument (Illumina) to a mean depth of 1000× for genomic and plasma DNA samples. After sequencing runs, the data were aligned to target sequences on the human reference genome (hg19) and genotyping data were extracted from each alignment and annotated using GATK software (ref; Broad Institute). These profiles were then combined into single family-specific .csv files using in-house software so as to facilitate familial and fetal linkage analysis (see below). As a rule, parental haplotypes were constructed with SNPs for which the parent was heterozygous and at least one of his/her first-degree relatives was homozygous.
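The selection rule stated at the end of the paragraph above (parent heterozygous, at least one first-degree relative homozygous) can be sketched as a small filter. The function names and the two-character genotype encoding ("AA", "AB", "BB") are assumptions for illustration; they do not describe the in-house software mentioned in the text.

```python
from typing import Dict, List

Genotypes = Dict[str, str]  # SNP identifier -> genotype such as "AA" or "AB"


def is_heterozygous(genotype: str) -> bool:
    return len(genotype) == 2 and genotype[0] != genotype[1]


def is_homozygous(genotype: str) -> bool:
    return len(genotype) == 2 and genotype[0] == genotype[1]


def phasing_informative_snps(
    parent: Genotypes, relatives: List[Genotypes]
) -> List[str]:
    """SNPs where the parent is heterozygous and at least one
    first-degree relative with a call at that SNP is homozygous."""
    informative = []
    for snp, genotype in parent.items():
        if not is_heterozygous(genotype):
            continue
        if any(is_homozygous(rel[snp]) for rel in relatives if snp in rel):
            informative.append(snp)
    return informative


if __name__ == "__main__":
    # Made-up SNP identifiers and genotypes, for illustration only.
    parent = {"s1": "AB", "s2": "AA", "s3": "AB"}
    relatives = [{"s1": "AA", "s3": "AB"}, {"s3": "AB"}]
    print(phasing_informative_snps(parent, relatives))
```

A homozygous relative fixes which of the parent's two alleles was shared at that SNP, which is why only such loci are kept for phasing.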
Standard Haplotype Construction and Identification of Fetal Paternal Alleles in Maternal Plasma DNA Using TDS
For each genomic DNA sample in the study (whether from the pregnant couple or their first-degree family members), heterozygous genotype calls from TDS were trio-phased to obtain paternal allele-specific haplotypes. TDS was then used to identify paternal mutations, variants, or CFTR-flanking alleles in all plasma samples.
Identification of Fetal Paternal Alleles in Maternal Plasma DNA Using ASP-SEQ
ASP-SEQ was used to identify paternal mutations, variants, or CFTR-flanking alleles in all plasma samples by comparing ASP-SEQ results of maternal genomic DNA with its corresponding plasma DNA ASP-SEQ libraries. For every maternal genomic and accompanying plasma DNA sample with standard deep sequencing genotype information, two different targeted ASP-SEQ libraries were prepared. ASP-SeqA libraries amplified only reference SNP alleles (“A”) but not non-reference alleles (“B”). Conversely, ASP-SeqB libraries amplified only non-reference SNP alleles (“B”) but not reference alleles (“A”). After high throughput sequencing of each ASP-SEQ library, successfully amplified regions are mapped to the human genome and utilized to detect fetal DNA that does not exist in maternal only genomic DNA. Thus, for every fetal haplotype informative SNP locus, ASP-SEQ will determine whether a “child-specific” allele was transmitted to the fetus or not. In parallel, TDS was performed on paternal genomic DNA to determine if the “child-specific” alleles existed also in a particular paternal haplotype. Plasma DNA samples were sequenced in duplicate at high depth (>1,000× mean coverage) and only paternal haplotype informative SNPs (father heterozygote and mother homozygote) were analyzed. Paternal haplotype informative SNPs feature a unique nucleotide in the fetus' father that is not present in the maternal genotype. All other parental SNP combinations were not utilized for ASP-SEQ-based paternal allele derivation. Paternal haplotype informative SNPs were assessed from a minimum read depth of 100× whereupon only allele-specific amplification of the paternal “unique allele” in the plasma ASP-SEQ libraries that did not appear in maternal genomic DNA ASP-SEQ library controls were incorporated into the fetal haplotype. This filter was applied so as to reduce genotyping errors emerging from either sequencing error and/or off-target sequence contamination.
Ultimately, fetal diagnosis was achieved after comparing the paternal cell-free fetal DNA (cffDNA) haplotypes with family-based trio phase haplotypes as relevant. Altogether, the entire noninvasive NGS-based prenatal test, from blood sample processing to fetal diagnosis, was completed in 5 work days. In addition, all diagnoses were confirmed by prenatal amniotic fluid genetic testing.
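The ASP-SEQ selection and filtering logic described in this section can be sketched as follows. All names are hypothetical. Two readings of the text are assumptions made for illustration: the 100× minimum is applied to the unique-allele reads in every plasma replicate, and "did not appear" in the maternal genomic DNA control is taken to mean zero unique-allele reads.

```python
from typing import List, Optional

MIN_DEPTH = 100  # minimum read depth (100x in the text)


def paternal_unique_allele(father_gt: str, mother_gt: str) -> Optional[str]:
    """Return the paternal allele absent from the mother, if the SNP is
    paternal haplotype informative (father heterozygous, mother homozygous)."""
    if len(set(father_gt)) != 2 or len(set(mother_gt)) != 1:
        return None
    unique = set(father_gt) - set(mother_gt)
    return unique.pop() if len(unique) == 1 else None


def incorporate_into_fetal_haplotype(
    plasma_replicate_depths: List[int], maternal_control_depth: int
) -> bool:
    """Decide whether a unique paternal allele is added to the fetal haplotype.

    plasma_replicate_depths -- unique-allele read depth in each plasma
                               ASP-SEQ replicate library
    maternal_control_depth  -- unique-allele read depth in the maternal
                               genomic DNA ASP-SEQ control library
    """
    if maternal_control_depth > 0:   # allele appears in the control
        return False
    if not plasma_replicate_depths:  # no plasma data available
        return False
    return all(depth >= MIN_DEPTH for depth in plasma_replicate_depths)


if __name__ == "__main__":
    print(paternal_unique_allele("AB", "AA"))
    print(incorporate_into_fetal_haplotype([150, 220], 0))
    print(incorporate_into_fetal_haplotype([150, 40], 0))
```

A False result here means only that the locus is not incorporated; it is not evidence that the other paternal allele was transmitted.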
Study Description
Eight pregnant AJ couples, of which one or both partners were heteroallelic carriers of GBA N370S, were enrolled in the study (
Fine Mapping of the Consensus AJ N370S Founder Haplotype Region.
To fine map the N370S founder region, the inventors sequenced 7 unrelated homoallelic AJ mutation carriers on the targeted GBA-flanking SNP panel. Six of these homoallelic patients with type I Gaucher disease were homozygous for all 490 SNPs on the initial sequencing panel. The seventh sample shared the same haplotype within and 3′ to GBA, but a heterozygous region was clearly identified 144,388 nucleotides 5′ to the gene and beyond (at rs2306124, dbSNP 138). Hence, this sample was used to demarcate a preliminary consensus founder haplotype (
Preliminary NIPD of an Autosomal Recessive Founder Mutation.
For pilot testing, families 1 and 2 offered 3 different avenues with which to assess the utility of the consensus N370S haplotype. This was because both parents (of the fetus) in family 1 were N370S carriers in addition to the mother in family 2 (
Extended Fine Mapping of the Consensus AJ N370S Founder Haplotype Region.
Although initial testing of families 1 and 2 showed promising results regarding the utility of the consensus N370S haplotype for incorporation into NIPD, it was clear that for expanded N370S testing in a clinical setting a more sophisticated sequencing panel would be required to facilitate setup of a universal assay for noninvasive prenatal Gaucher disease testing. The concerns with the initial 490-SNP sequencing panel were 4-fold. As evidenced by HapMap and deCode data, meiotic recombination is quite infrequent in the immediate human GBA-flanking locus (±250 kb), which was the small target of the pilot sequencing panel. In this genomic context, homozygosity of an N370S mutation carrier parent, such as in the family 1 father, would be expected to occur commonly because DNA is rearranged at a reduced rate in the peri-GBA locus. Along these lines, low recombination rates translate into low genotypic complexity, which, in turn, leads to limited availability of linkage-informative SNPs, which are crucial to fetal haplotyping. Thus, small family-based haplotypes, which generally handicap fetal haplotyping, such as that of the family 1 father (3 SNPs) and that of the family 2 mother (11 SNPs; Table 4), would be predicted to represent the majority as opposed to the minority of cases. Another reason to consider looking beyond a distance of 250 kb from GBA would be to complete fine mapping of the 3′ boundary of the consensus N370S sequence, which proved so beneficial for fetal typing of family 1 and 2 maternal N370S-paired alleles. Finally, N370S aside, the implementation of a larger targeted sequencing panel should hypothetically be used to diagnose any mutation in GBA via familial linkage analysis, regardless of whether the mutation is a founder allele or not. 
For all these aforementioned reasons, a newer and much improved targeted deep-sequencing panel was designed to sequence 10 times the amount of GBA-flanking SNPs (˜5,000 SNPs) across an 8-fold-sized genomic region (GBA ±2 Mb) before moving forward with NIPD for other families in the study. The first priority, in terms of test implementation, was to use the new expanded sequencing panel to complete fine mapping of the founder N370S haplotype. As mentioned above, the original sequencing panel successfully demarcated a 5′ boundary for the consensus sequence that was approximately 17 kb upstream of GBA and at least 219 kb downstream. When repeating the same exercise (as that described in
To make effective use of the expanded near consensus N370S haplotype without allowing haplotype errors to corrupt downstream fetal analysis, the inventors carefully inspected each N370S chromosome in all mutation carrier parents in the study (families 1 through 8) using the large sequencing panel. It was found that, in some cases, recombination was detected in the true parent specific N370S-linked sequence with respect to the founder mutation near consensus haplotype (
NIPD of GBA N370S Using an Improved Targeted Sequencing Panel.
Having set up the framework with which to embark on streamlined NIPD for the N370S founder mutation, the inventors returned to families 1 and 2 and retested the same samples using the expanded sequencing panel. One of the primary issues with the previous analysis involving these families was the small size of linkage-based haplotypes in the family 1 paternal N370S allele and the family 2 maternal N370S allele (Table 4). As expected, the large sequencing panel clearly solved this issue for families 1 and 2 (and, essentially, all families in this study). Ranging from 113 to 336 phased SNPs, all parental family-based haplotypes in the current investigation were of substantial size and content to enable scoring of fetal haplotypes with generally high confidence (
The principles set forth in these preliminary tests were subsequently put into practice for fetal allele identification involving families 3 through 8 (
To summarize, the outcomes of this proof-of-concept study are presented in
First, a consensus DelF508 founder haplotype is identified and constructed, such as by the methods disclosed hereinabove, inter alia by using the publicly available haplotype database, such as HapMap or deCode or whole genome sequencing data from one or more ethnicities.
Subsequently, peripheral blood samples are collected from pregnant female indices and plasma is separated from peripheral blood by methods known in the art, e.g., centrifugation at 1,900×g for 10 minutes at 4° C. The plasma supernatant is then re-centrifuged at 16,000×g for 10 minutes at 4° C. and 3 ml of the resulting supernatant is used for cell-free DNA extraction such as with the QIAamp Circulating Nucleic Acid kit (QIAGEN) according to the manufacturer's protocol. The maternal plasma DNA extracts are then pre-amplified, in duplicate, such as with the SurePlex Amplification System (Illumina) ahead of downstream processing.
Thereafter, the DNA extracts suspected of having the DelF508 founder mutation are amplified with standard or allele-specific amplification methods followed by sequencing. Indexed next generation sequencing libraries are prepared and normalized (e.g., Illumina) according to the manufacturer's protocol followed by 2×150 bp paired-end sequencing to a mean depth of at least 500× for genomic and plasma DNA samples. After sequencing runs, the data are aligned to target sequences on the human reference and genotyping data is extracted
Fetal diagnosis of cystic fibrosis is ultimately achieved after comparing the paternal and maternal cell-free fetal DNA (cffDNA) haplotypes with DelF508 consensus haplotype.
First, a consensus for the G6V mutation in the HBB gene founder haplotype is identified and constructed, such as by the methods disclosed hereinabove, inter alia by using the publicly available haplotype database, such as HapMap or deCode or whole genome sequencing data from one or more ethnicities.
Subsequently, peripheral blood samples are collected from pregnant female indices and plasma is separated from peripheral blood by methods known in the art, e.g., centrifugation at 1,900×g for 10 minutes at 4° C. The plasma supernatant is then re-centrifuged at 16,000×g for 10 minutes at 4° C. and 3 ml of the resulting supernatant is used for cell-free DNA extraction such as with the QIAamp Circulating Nucleic Acid kit (QIAGEN) according to the manufacturer's protocol. The maternal plasma DNA extracts are then pre-amplified, in duplicate, such as with the SurePlex Amplification System (Illumina) ahead of downstream processing.
Thereafter, the DNA extracts suspected of having the G6V founder mutation are amplified with standard or allele-specific amplification methods followed by sequencing. Indexed next generation sequencing libraries are prepared and normalized (e.g., Illumina) according to the manufacturer's protocol followed by 2×150 bp paired-end sequencing to a mean depth of at least 500× for genomic and plasma DNA samples. After sequencing runs, the data are aligned to target sequences on the human reference and genotyping data is extracted
Fetal diagnosis of Beta-thalassemia is ultimately achieved after comparing the paternal and maternal cell-free fetal DNA (cffDNA) haplotypes with G6V consensus haplotype.
First, a consensus for the 736delATCTGAinsTAGATTC in the BLM gene founder haplotype is identified and constructed, such as by the methods disclosed hereinabove (e.g., using the HapMap or deCode or whole genome sequencing data from one or more ethnicities).
Subsequently, peripheral blood samples are collected from pregnant female indices and plasma is separated from peripheral blood by methods known in the art, e.g., centrifugation at 1,900×g for 10 minutes at 4° C. The plasma supernatant is then re-centrifuged at 16,000×g for 10 minutes at 4° C. and 3 ml of the resulting supernatant is used for cell-free DNA extraction such as with the QIAamp Circulating Nucleic Acid kit (QIAGEN) according to the manufacturer's protocol. The maternal plasma DNA extracts are then pre-amplified, in duplicate, such as with the SurePlex Amplification System (Illumina) ahead of downstream processing.
Thereafter, the DNA extracts suspected of having the 736delATCTGAinsTAGATTC founder mutation are amplified with standard or allele-specific amplification methods followed by sequencing. Indexed next generation sequencing libraries are prepared and normalized (e.g., Illumina) according to the manufacturer's protocol followed by 2×150 bp paired-end sequencing to a mean depth of at least 500× for genomic and plasma DNA samples. After sequencing runs, the data are aligned to target sequences on the human reference and genotyping data is extracted
Fetal diagnosis of Bloom syndrome is ultimately achieved after comparing the paternal and maternal cell-free fetal DNA (cffDNA) haplotypes with 736delATCTGAinsTAGATTC consensus haplotype.
First, a consensus for the G269S mutation in the HEXA gene founder haplotype is identified and constructed, such as by the methods disclosed hereinabove, inter alia by using the publicly available haplotype database, such as HapMap or deCode or whole genome sequencing data from one or more ethnicities.
Subsequently, peripheral blood samples are collected from pregnant female indices and plasma is separated from peripheral blood by methods known in the art, e.g., centrifugation at 1,900×g for 10 minutes at 4° C. The plasma supernatant is then re-centrifuged at 16,000×g for 10 minutes at 4° C. and 3 ml of the resulting supernatant is used for cell-free DNA extraction such as with the QIAamp Circulating Nucleic Acid kit (QIAGEN) according to the manufacturer's protocol. The maternal plasma DNA extracts are then pre-amplified, in duplicate, such as with the SurePlex Amplification System (Illumina) ahead of downstream processing.
Thereafter, the DNA extracts suspected of having the G269S founder mutation are amplified with standard or allele-specific amplification methods followed by sequencing. Indexed next generation sequencing libraries are prepared and normalized (e.g., Illumina) according to the manufacturer's protocol followed by 2×150 bp paired-end sequencing to a mean depth of at least 500× for genomic and plasma DNA samples. After sequencing runs, the data are aligned to target sequences on the human reference and genotyping data is extracted
Fetal diagnosis of Tay-Sachs is ultimately achieved after comparing the paternal and maternal cell-free fetal DNA (cffDNA) haplotypes with G269S consensus haplotype.
First, a consensus for the E342K mutation in the SERPINA1 gene founder haplotype is identified and constructed, such as by the methods disclosed hereinabove, inter alia by using the publicly available haplotype database, such as HapMap or deCode or whole genome sequencing data from one or more ethnicities.
Subsequently, peripheral blood samples are collected from pregnant female indices and plasma is separated from peripheral blood by methods known in the art, e.g., centrifugation at 1,900×g for 10 minutes at 4° C. The plasma supernatant is then re-centrifuged at 16,000×g for 10 minutes at 4° C. and 3 ml of the resulting supernatant is used for cell-free DNA extraction such as with the QIAamp Circulating Nucleic Acid kit (QIAGEN) according to the manufacturer's protocol. The maternal plasma DNA extracts are then pre-amplified, in duplicate, such as with the SurePlex Amplification System (Illumina) ahead of downstream processing.
Thereafter, the DNA extracts suspected of having the E342K founder mutation are amplified with standard or allele-specific amplification methods followed by sequencing. Indexed next generation sequencing libraries are prepared and normalized (e.g., Illumina) according to the manufacturer's protocol followed by 2×150 bp paired-end sequencing to a mean depth of at least 500× for genomic and plasma DNA samples. After sequencing runs, the data are aligned to target sequences on the human reference and genotyping data is extracted
Fetal diagnosis of alpha-1-antitrypsin deficiency is ultimately achieved after comparing the paternal and maternal cell-free fetal DNA (cffDNA) haplotypes with the E342K consensus haplotype.
Eleven pregnant couples were recruited into the study. Six of the couples achieved pregnancy via preimplantation genetic diagnosis (PGD) for cystic fibrosis (CF) and the rest had performed PGD for other genetic disorders and consented to allow their early pregnancy plasma samples to be used for allelic inheritance testing of intronic or gene-flanking CFTR single nucleotide polymorphisms (SNPs). Otherwise, detailed information regarding the ethnic background and CFTR mutation carriage (where relevant) of the study cohort are presented in a table in
As a key preliminary step in the process, a sensitive technique for paternal CF allele assessment in very early pregnancy plasma samples was sought. Currently accepted practice in the NIPD field is not to perform Mendelian disorder testing prior to week 9 gestation. The primary obstacle to mutation testing before this time is low fetal fraction which, prior to week 8 gestation, rarely rises above the widely reported 4% lower threshold for effective NIPT diagnosis. When fetal fraction is below 4%, it has been extremely difficult to discriminate the ‘background noise’ of sequencing or digital PCR errors from true biological events, such as wild type or mutant allele transmission to a fetus. For this reason, an ultra-sensitive method, termed ASP-SEQ, was developed for diffuse molecule detection even at dosages well below 0.5%, where fetal DNA concentration cannot be reliably measured.
ASP-SEQ is a new proprietary high throughput genotyping methodology (diagramed in
In a typical ASP-SEQ experiment (
For preliminary testing of the ASP-SEQ method, a NIPD simulation was devised using DNA samples from a CF PGD family, family A, comprised of a couple and their CF-affected daughter (
After demonstrating ASP-SEQ effectiveness in a model system, the technique was further challenged with ‘live’ early pregnancy plasma samples from the 11-couple study cohort. Of the 11 couples, 5 were carriers of CF mutations and the others were tested for allelic transmission of intronic or gene-flanking CFTR SNPs. Seven couples provided two or more early pregnancy plasma samples for testing while the other four couples provided only one plasma sample each. In all cases, paternal inheritance was tested by ASP-SEQ in pregnant indexes at different time points ranging from week 5 through week 8 gestation. Here too, TDS was used as a conventional NIPD technique for comparison. Altogether, the assayed fetal dosages, NIPD and subsequent amniotic fluid testing results, and other details from the ‘live’ early pregnancy study are summarized in a table in
Overall, testing outcome for each couple in the study was heavily influenced by the number of plasma samples provided for evaluation. With ASP-SEQ, correct allelic inheritance was determined for 6 out of 7 couples who provided 2 or more plasma samples for testing. For the seventh couple in this group (Family 3), allelic classification could not be determined, but importantly, there was no misdiagnosis (
Also of note, is the fact that the fetal load in most plasma samples in the study was markedly low, with an average and median dosage of 1.5% and 1.0%, respectively. Nonetheless, despite the low overall fetal concentration per sample, 7 out of 11 couples obtained accurate (amniotic fluid validated) paternal allele classification in their respective fetuses by ASP-SEQ testing. Moreover, haplotype classification was remarkably clear and unambiguous using the ASP-SEQ method (
Regarding TDS performance with the same early pregnancy cohort, the results were far less accurate. TDS derived paternal allele test results for only 4 out of 7 couples in the two or more plasma sample group, one result of which was incorrect as determined by amniotic fluid testing (see Family 6,
Although the invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims.
This application is a Continuation-in-Part of U.S. patent application Ser. No. 15/529,151, filed on May 24, 2017, which is a national phase of PCT Patent Application No. PCT/IL2015/051142, filed on Nov. 24, 2015, which claims the benefit of priority of U.S. Provisional Patent Application Nos. 62/208,935 filed on Aug. 24, 2015, 62/109,407 filed on Jan. 29, 2015 and 62/083,595 filed on Nov. 24, 2014. The contents of the above applications are all hereby expressly incorporated by reference, in their entirety.
Number | Date | Country
---|---|---
62083595 | Nov 2014 | US
62109407 | Jan 2015 | US
62208935 | Aug 2015 | US
Relation | Number | Date | Country
---|---|---|---
Parent | 15529151 | May 2017 | US
Child | 15877922 | | US