The invention is in the field of genetics, in particular the field of allelic mapping, haplotyping and diplotyping.
Variations in the genetic sequence have been identified at multiple sites in the human genome. Genetic variations or polymorphisms may have functional implications; for example, some polymorphisms may predispose an individual to a disease or may determine the way in which drugs are metabolized. Polymorphisms exist in different forms and include single nucleotide variations, multibase insertions, microsatellite repeats, di-nucleotide repeats, tri-nucleotide repeats and sequence rearrangements. Among these sequence polymorphisms, the most frequent polymorphisms in the human genome are single-base variations, also called single-nucleotide polymorphisms (SNPs).
In the human genome, SNPs occur approximately once every three hundred bases. Since humans are diploid organisms, multiple SNPs can be inherited together and appear on one strand of DNA, or inherited from both parents separately so that they appear on different copies of the same gene. The phase information, that is, whether the SNPs occur on the same strand (cis-) or on different strands (trans-) with each other, is important as it is known to affect disease risk, severity of disease phenotype, and drug response. Some mutations can mask the deleterious effects of another when they occur cis to each other. For example, thrombophilia is associated with two mutation sites in the methylenetetrahydrofolate reductase (MTHFR) gene: a C→T mutation at position 677 (C677T) and an A→C mutation at position 1298 (A1298C). However, it is only when these two mutations occur trans to each other that the diseased phenotype is observed. In other cases, the effects are compounded when the mutations are cis with each other, such as for two independent SNPs related to lung cancer and Parkinson's disease. There is also growing evidence that around 1-5% of human genes are expressed in an allele-specific manner. This means that one copy can be expressed at a different rate than the other; thus, if the combination of SNPs affects the function of the corresponding protein, the sequence information on each allele is also important.
Despite the importance of SNP phasing, this information is often lost in routine DNA sequencing technologies since the DNA is randomly fragmented prior to sequencing. Typically, phase information is obtained by processing the genotype data of a father-mother-child trio through computational and statistical algorithms such as PHASE and HelixTree. However, this method is limited by the accuracy of the bioinformatics software and the availability of family data. Rare variants which occur at low frequencies also cannot be phased. Alternatively, direct laboratory-based approaches may be employed. These include long-range sequencing combined with more powerful computational tools, conventional sequencing methods applied to a single molecule of DNA to detect only cis-SNPs, and the sequential addition of dibases to resolve phase information using the neighbouring bases.
Typically, phasing of haplotypes and diplotypes requires multiple iterative rounds of polymerase chain reaction (PCR), and the products of the second (or subsequent) rounds of PCR are sequenced in order to determine which SNPs are co-expressed. Currently, the most widely used routine phasing method is Sanger sequencing, which requires specialized instrumentation and is limited to SNPs within 700 nt of each other.
Presently, all the methods for determining haplotypes and diplotypes are limited by labour-intensive protocols and the need for specialized instrumentation and high-powered computing devices for data analyses.
There is a need to provide a simple and quick method to resolve the exact allelic content on the chromosomal copies to determine haplotypes and diplotypes that overcomes or at least ameliorates one or more of the disadvantages described above.
In one aspect, there is provided a method for determining a haplotype or a diplotype in a genetic sample comprising the steps of: a) contacting a probe-complex with the genetic sample, wherein the probe-complex comprises at least two probes; b) hybridising at least two probes to a polynucleotide sequence, wherein each of the at least two probes is specific to one of two or more genetic variants in said polynucleotide sequence; c) determining the presence or absence of at least one genetic variant by detecting a signal emitted from at least one probe, wherein detection of said signal is indicative of the presence of a genetic variant; d) removing or displacing at least one of said probes from said sample; and e) detecting a change in the signal to determine the haplotype or diplotype in the genetic sample.
In one aspect, there is provided a kit for use in the method as disclosed herein comprising at least two probes, wherein each of the at least two probes is specific to one of two or more genetic variants, and instructions for use.
As used herein, a “genetic variant” refers to a variation in one or more nucleotides in a genetic sequence relative to a reference nucleotide sequence.
As used herein, the term “haplotype” refers to two or more alleles on one chromosome or a part of a chromosome. The term “haplotype” may also refer to two or more single nucleotide polymorphisms (SNPs) on one chromosome or part of a chromosome.
As used herein, the term “diplotype” refers to the matched pair of haplotypes on homologous chromosomes.
As used herein, the term “allele” refers to any one of two or more different forms of a gene that occupy the same position (locus) on a chromosome.
As used herein, the term “phase” or “phasing” refers to the position of one genetic variant or SNP relative to another genetic variant or SNP. Two or more genetic variants or SNPs that occur on the same nucleic acid strand are in cis-configuration whilst two or more genetic variants or SNPs that occur on different nucleic acid strands are in trans-configuration.
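The cis/trans distinction can be illustrated with a minimal sketch. The representation used here (a diplotype as a pair of sets of variant labels) and the function name `phase_of` are illustrative assumptions and are not part of the disclosed method.

```python
from typing import FrozenSet, Optional, Tuple

# Hypothetical representation: each haplotype is the set of variant
# labels carried by one chromosomal copy; a diplotype is a pair of them.
Haplotype = FrozenSet[str]
Diplotype = Tuple[Haplotype, Haplotype]


def phase_of(diplotype: Diplotype, variant_a: str, variant_b: str) -> Optional[str]:
    """Return 'cis' if at least one copy carries both variants,
    'trans' if both variants are present but never on the same copy,
    and None if either variant is absent from the diplotype."""
    hap1, hap2 = diplotype
    if any(variant_a in hap and variant_b in hap for hap in (hap1, hap2)):
        return "cis"
    a_present = variant_a in hap1 or variant_a in hap2
    b_present = variant_b in hap1 or variant_b in hap2
    if a_present and b_present:
        return "trans"
    return None
```

For example, a sample whose two copies are `frozenset({"C677T", "A1298C"})` and `frozenset()` would be reported as cis, whereas a sample whose copies are `frozenset({"C677T"})` and `frozenset({"A1298C"})` would be reported as trans.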
As used herein, the term “hybridize” or grammatical variants thereof means that the probe anneals to a target polynucleotide sequence via a non-covalent interaction. It will be generally understood that any hybridization reaction is performed under stringent conditions. The term “stringent conditions” means any hybridisation conditions which allow the probe to bind specifically to a nucleotide sequence, but not to any other nucleotide sequences. It is within the ambit of the skilled person to vary the parameters of hybridization such as temperature, probe length and salt concentration such that specific hybridisation can be achieved.
As used herein, the term “probe” refers to a molecule designed to bind to a nucleotide sequence and may be used to identify a specific nucleotide sequence in a target or sample. The probe may comprise a nucleotide sequence complementary to the specific nucleotide sequence to be identified.
As used herein, the term “toehold” in the context of “toehold sequence” refers to a single stranded nucleic acid sequence within a probe which binds to a given target nucleic acid sequence. Binding of the toehold sequence to the target triggers separation of the strands of the probe.
As used herein, the term “polymorphism” refers to the occurrence of two or more alternative genomic sequences or alleles between or among different genomes or individuals. “Polymorphic” refers to the condition in which two or more variants of a specific genomic sequence can be found in a population. A “polymorphic site” is the locus at which the variation occurs.
As used herein, the term “single nucleotide polymorphism” or “SNP” is a single base pair change in a nucleotide sequence. Typically a single nucleotide polymorphism is the substitution of one nucleotide by another nucleotide at the polymorphic site. Deletion of a single nucleotide or insertion of a single nucleotide may also give rise to single nucleotide polymorphisms. Typically, between different genomes or between different individuals, the polymorphic site may be occupied by two different nucleotides.
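As a simple illustration of the substitution case described above, the following sketch lists the positions at which a sample sequence differs from a reference by a single-base substitution. It assumes the two sequences are already aligned and of equal length, so single-nucleotide insertions and deletions are not handled; the function name `find_substitutions` is hypothetical.

```python
from typing import List, Tuple


def find_substitutions(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for every
    position at which the two pre-aligned sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i + 1, r, s) for i, (r, s) in enumerate(pairs) if r != s]
```

Each returned tuple corresponds to one polymorphic site occupied by two different nucleotides in the two sequences.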
The invention will be better understood with reference to the detailed description when considered in conjunction with the non-limiting examples and the accompanying drawings, in which:
In a first aspect the present invention refers to a method for determining a haplotype or a diplotype in a genetic sample comprising the steps of: a) contacting a probe-complex with the genetic sample, wherein the probe-complex comprises at least two probes; b) hybridising at least two probes to a polynucleotide sequence, wherein each of the at least two probes is specific to one of two or more genetic variants in said polynucleotide sequence; c) determining the presence or absence of at least one genetic variant by detecting a signal emitted from at least one probe, wherein detection of said signal is indicative of the presence of a genetic variant; d) removing or displacing at least one of said probes from said sample; and e) detecting a change in the signal to determine the haplotype or diplotype in the genetic sample.
In one embodiment, a probe-complex may comprise at least two probes. A probe-complex may also comprise a double-stranded nucleic acid molecule, for example a double-stranded DNA molecule; a three-stranded nucleic acid molecule, for example a three-stranded DNA molecule; or a four-stranded nucleic acid molecule, for example a four-stranded DNA molecule. The design of a probe-complex may be determined by the application of the probe-complex. For example, a probe-complex may be designed to enable simultaneous interrogation of two genetic variants or SNPs in a sample. In another example, a probe-complex may be designed to contain a universal fluorophore and quencher pair together with probes to identify specific genetic variants or SNPs.
In one embodiment, two discrete probes may be hybridized to a connector strand to form a three stranded DNA molecule.
It will generally be understood that each probe may interact with a specific target sequence on a polynucleotide sequence. The polynucleotide sequence may be isolated or purified from a genetic sample which may include but is not limited to blood, blood plasma, serum, buccal smear, amniotic fluid, prenatal tissue, sweat, nasal swab, urine, organs, tissues, fractions, and cells isolated from mammals including humans. A genetic sample may also include sections of the genetic sample including tissues (for example, sectional portions of an organ or tissue). In other embodiments, the isolated polynucleotide sequence may be amplified by methods known in the art. In a preferred embodiment, the polynucleotide sequence may be amplified using a polymerase chain reaction (PCR) or an isothermal amplification process.
Each probe may interact with its specific target sequence sequentially or simultaneously. The polynucleotide sequence may include nucleic acids, nucleic acid fragments, plasmids and other molecules such as gene fragments and the like. The probes may be nucleic acids, oligonucleotides, nucleic acid variants such as peptide nucleic acids (PNAs) or locked nucleic acids (LNAs), peptides, proteins, dyes, fluorophores, magnetic beads, lipids, drugs, or small molecules. Any combination of probe types may be used in a given experiment.
In some embodiments, the probe may comprise one or more nucleotide sequences that are specific to one or more target sequences.
A probe may further comprise a dye. A dye may refer to a substance used to color materials or to enable the generation of a luminescent or fluorescent signal. A dye may absorb or emit light at specific wavelengths and may be bound to a probe or a target by intercalation, noncovalent binding or covalent binding. A dye may be a chemiluminescent molecule or a fluorophore. Examples of chemiluminescent molecules include but are not limited to N-(4-Aminobutyl)-N-ethylisoluminol, luminol, coelenterazine, ruthenium complexes such as Ru(BPS)₃⁴⁻ (wherein BPS is 4,7-diphenyl-1,10-phenanthroline disulfonate or bathophenanthroline disulfonate), Ru(BPS)₂(bipy)²⁻ (where bipy is 2,2′-bipyridine), Ru(BPS)(bipy)₂ and tris(2,2′-bipyridine)ruthenium(II) (Ru(bipy)₃²⁺) and analogues of ruthenium. A fluorophore may be a protein or peptide, a small organic compound, or a synthetic oligomer or polymer. For example, a fluorophore may be a non-protein organic fluorophore selected from xanthene derivatives, cyanine derivatives, squaraine derivatives and ring-substituted squaraines, naphthalene derivatives, coumarin derivatives, oxadiazole derivatives, anthracene derivatives, pyrene derivatives, oxazine derivatives, arylmethine derivatives and tetrapyrrole derivatives. Other chemiluminescent or fluorophore molecules known in the art may be suitably used within the scope of the invention. Those skilled in the art will also recognise other dyes that may be used within the scope of the invention. It will also be understood by those skilled in the art that any set or combination of fluorophores or dyes may be used, but these should have non-overlapping excitation/emission spectra.
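The requirement that a set of dyes have non-overlapping spectra can be checked, in a simplified way, by treating each dye's emission window as a wavelength interval. The sketch below is only illustrative: the interval model, the function name `overlapping_pairs`, and any dye names or wavelength values passed to it are assumptions, not measured data from this disclosure.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical model: an emission window as (low_nm, high_nm), inclusive.
Band = Tuple[float, float]


def overlapping_pairs(bands: Dict[str, Band]) -> List[Tuple[str, str]]:
    """Return every pair of dye names whose emission windows overlap."""
    pairs = []
    for (name1, (lo1, hi1)), (name2, (lo2, hi2)) in combinations(sorted(bands.items()), 2):
        # Two closed intervals overlap when each starts before the other ends.
        if lo1 <= hi2 and lo2 <= hi1:
            pairs.append((name1, name2))
    return pairs
```

A candidate set of dyes would be acceptable under this simplified criterion only if the returned list is empty.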
In one embodiment, a probe may further comprise a quencher. A quencher is any molecule or agent that decreases chemiluminescence or fluorescence intensity. An example of a quencher may be an organic or inorganic molecule with a network of conjugated double-bonds. Other examples of quenchers include but are not limited to molecular oxygen, iodide ions and acrylamide. In one embodiment, the fluorophore and quencher may be located on separate strands of the probe.
In the method of the present invention, the at least two probes may comprise at least two distinct fluorophores. The at least two probes may further comprise at least two quenchers which may be identical or distinct from each other.
In a further embodiment, one or more quenchers may be added onto a connector strand hybridized to at least two discrete probes. The connector strand may be covalently modified with one or more quencher molecules at the 5′ and/or 3′ end of the connector strand.
In another embodiment, one of the at least two probes may further comprise a magnetic bead. The magnetic bead may be attached to one or more of the at least two probes via a streptavidin molecule. In one embodiment, the magnetic bead may be a streptavidin-modified magnetic bead functionally attached to a probe by a biotin modification in the probe. In another embodiment, the magnetic bead may be functionally attached to a probe by activation of a functional group (e.g. N-hydroxysuccinimide (NHS)) on the surface of the magnetic bead and reacting the magnetic bead with a probe comprising one or more amine-functionalized oligonucleotides.
In some embodiments, the probe may be immobilized onto a surface. The surface may be a solid surface or a substrate. Examples of a surface include but are not limited to gold or silica, a membrane such as egg shell membrane (ESM), a polymeric substrate or a gel. In some embodiments, the probe may be attached to a solid surface via a gold-thiol-DNA bond, a silica-NHS-amine-DNA interaction or a polymer-streptavidin-biotin-DNA interaction.
Hybridization of a probe to a nucleotide sequence may be achieved by any means that anneals the probe to the nucleotide sequence. In one embodiment, hybridization may be achieved by toehold-mediated strand displacement. Hybridization may be triggered by a toehold sequence on the probe annealing to a complementary sequence on the polynucleotide sequence. Annealing of the toehold sequence on the probe to the polynucleotide sequence may cause the strands of the probe to separate and the strand of the probe comprising the toehold sequence to hybridize to the polynucleotide sequence. The strand that does not comprise the toehold sequence is displaced. It will generally be understood that the specificity of the hybridization reaction of the probe to the polynucleotide sequence may be governed thermodynamically by the sequence of the probe and/or the length of the toehold region.
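The first step of this mechanism, the toehold finding a complementary site on the target, can be sketched as a search for the reverse complement of the toehold sequence. This is a deliberately simplified illustration: it requires a perfect match, whereas actual specificity is governed thermodynamically as noted above. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def toehold_sites(toehold: str, target: str) -> list:
    """Return 0-based start positions in `target` where the toehold could
    anneal with perfect complementarity (simplifying assumption)."""
    site = reverse_complement(toehold)
    target = target.upper()
    return [i for i in range(len(target) - len(site) + 1) if target.startswith(site, i)]
```

Under this simplification, a single-base variant within the toehold binding site removes the perfect match, which is the property a variant-specific probe relies on.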
Hybridization of the probe to the polynucleotide sequence may result in the emission of a signal from the probe. Detection of the presence of an emitted signal may be indicative of the presence of a genetic variant. In other embodiments, the intensity of the emitted signal may be measured to determine the presence or number of copies of a genetic variant in a genetic sample. The intensity of the emitted signal may be measured relative to a reference signal. A reference signal may be a signal emitted from a genetic sample with known genetic variants, for example, a wild type sample. A reference signal may also be a signal emitted from the same genetic sample prior to the addition of an enzyme or prior to the removal or displacement of one or more probes. A reference signal may also be a signal emitted from a genetic sample in the absence of hybridization of a probe to the polynucleotide sequence.
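Measuring the emitted signal relative to a reference can be expressed as a simple ratio. The sketch below assumes the reference is a background-type signal (for example, a sample without probe hybridization); the threshold `min_fold` and both function names are hypothetical and would need to be set empirically for a given assay.

```python
def relative_signal(sample_intensity: float, reference_intensity: float) -> float:
    """Return the sample intensity as a multiple of the reference intensity."""
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return sample_intensity / reference_intensity


def variant_detected(sample_intensity: float, reference_intensity: float,
                     min_fold: float = 2.0) -> bool:
    """Call a variant present when the signal exceeds the reference by at
    least `min_fold` (an illustrative, assay-dependent threshold)."""
    return relative_signal(sample_intensity, reference_intensity) >= min_fold
```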
As described herein, the presence or absence of a genetic variant may be determined by detecting a signal emitted from at least one probe using the method of the present invention. The method of the present invention further allows the phase of at least two or more genetic variants to be determined by removing or displacing at least one of said probes from the sample.
In one embodiment, the at least one probe may be removed by magnetic separation. It will be understood that magnetic separation may be used to separate a probe attached to a magnetic bead from a probe that is not attached to a magnetic bead.
In another embodiment, the at least one probe may be removed from the sample by a washing step. In one embodiment, the at least one probe may be immobilized to a surface and after the target nucleic acid hybridizes with the immobilized probe, any probes that are not immobilized to the surface after the hybridization step may be removed from the sample by washing.
In another embodiment, the at least one probe may be displaced from said sample by the action of a polymerase. In some embodiments, a probe that is bound to the polynucleotide sequence may act as a primer for the polymerase. Extension of the primer by the polymerase enzyme may displace another probe that is bound to the polynucleotide sequence.
In a preferred embodiment, the polymerase may be a high fidelity DNA polymerase. The high fidelity DNA polymerase may have no 5′ to 3′ exonuclease activity.
In one embodiment, one of the at least two probes further comprises a modification that prevents polymerase extension. An example of a modification that prevents polymerase extension is an overhanging region. For example, an overhanging region may comprise a 3′ poly A tail. An overhanging region may also comprise a hairpin region. In a preferred embodiment, one of the at least two probes may be modified with a 3′ poly A tail.
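The polymerase-based displacement described above can be pictured with a toy model. It assumes, as an interpretation of the description rather than a stated rule, that an extendable probe acting as a primer displaces only probes bound further along the same template strand, and that a probe with a blocked 3′ end (for example, a poly A tail) cannot prime. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BoundProbe:
    name: str
    strand: str       # identifier of the template strand the probe is bound to
    position: int     # binding position, increasing in the direction of extension
    extendable: bool  # False if the 3' end is blocked (e.g. by a poly A tail)


def displaced_probes(probes: List[BoundProbe]) -> List[str]:
    """Return names of probes lying downstream of an extendable probe
    on the same strand, i.e. those the polymerase would displace."""
    displaced = []
    for target in probes:
        for primer in probes:
            if (primer.extendable
                    and primer.strand == target.strand
                    and primer.position < target.position):
                displaced.append(target.name)
                break
    return displaced
```

In this model, a blocked probe bound on the same strand as an extendable probe is displaced, whereas a blocked probe bound on a different strand is left in place.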
Removal or displacement of the at least one probe from the genetic sample may result in a change in the presence or level of intensity of one or more signals emitted from the genetic sample. Detection of a change in one or more signals emitted may be used to determine the haplotype or diplotype in the genetic sample. In one example, detection of a decrease in an emitted signal may indicate that two genetic variants are located in cis configuration. In another example, detection of a decrease in an emitted signal may indicate that two genetic variants are located in trans configuration.
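Because a decrease in signal may indicate either configuration depending on the assay design, an interpretation step needs to be told which convention applies. The following sketch assumes that both variants have already been detected and that a fractional drop threshold (`min_drop_fraction`, a hypothetical parameter) separates a real decrease from noise.

```python
def signal_decreased(before: float, after: float, min_drop_fraction: float = 0.5) -> bool:
    """Return True if the signal fell by at least `min_drop_fraction`
    of its initial value (illustrative threshold)."""
    if before <= 0:
        raise ValueError("initial signal must be positive")
    return (before - after) / before >= min_drop_fraction


def call_phase(before: float, after: float, decrease_means: str = "cis",
               min_drop_fraction: float = 0.5) -> str:
    """Map a signal change to 'cis' or 'trans'. `decrease_means` states
    which configuration a decrease indicates for the assay in use."""
    if decrease_means not in ("cis", "trans"):
        raise ValueError("decrease_means must be 'cis' or 'trans'")
    other = "trans" if decrease_means == "cis" else "cis"
    return decrease_means if signal_decreased(before, after, min_drop_fraction) else other
```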
The method of the present invention may be used to determine the haplotype or diplotype of two or more genetic variants. The two or more genetic variants may be located less than 1 kilobase (1 kb) from each other, or more than 1 kb from each other. It will generally be understood that the two or more genetic variants may be located at a distance of up to the length of a chromosome apart. For example, the two or more genetic variants may be located at least 100 nucleotides (nt), at least 200 nt, at least 300 nt, at least 400 nt, at least 500 nt, at least 600 nt, at least 700 nt, at least 800 nt, at least 900 nt, at least 1000 nt, at least 1500 nt or at least 2000 nt from each other. In one embodiment, the two or more genetic variants are located on a chromosome. In a preferred embodiment, the two or more genetic variants may be located at least 700 nt apart.
The present invention also provides a probe-complex for use in the method of the present invention, comprising at least two probes, a fluorophore and a quencher.
In one embodiment, the probe-complex may further comprise a connector strand hybridized to the at least two probes. In another embodiment, the probe-complex further comprises a magnetic bead attached to one or more of the at least two probes.
The present invention also provides a kit for use in the method as disclosed herein comprising at least two probes wherein each of the at least two probes is specific to one of two or more genetic variants, and instructions for use.
In one embodiment, the kit may further comprise a polymerase enzyme.
The invention illustratively described herein may suitably be practiced in the absence of any element or elements, limitation or limitations, not specifically disclosed herein. Thus, for example, the terms “comprising”, “including”, “containing”, etc. shall be read expansively and without limitation. Additionally, the terms and expressions employed herein have been used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Thus, it should be understood that although the present invention has been specifically disclosed by preferred embodiments and optional features, modification and variation of the inventions embodied therein herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention.
The invention has been described broadly and generically herein. Each of the narrower species and subgeneric groupings falling within the generic disclosure also form part of the invention. This includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.
Other embodiments are within the following claims and non-limiting examples. In addition, where features or aspects of the invention are described in terms of Markush groups, those skilled in the art will recognize that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group.
Non-limiting examples of the invention and comparative examples will be further described in greater detail by reference to specific Examples, which should not be construed as in any way limiting the scope of the invention.
A two-step reaction was designed wherein the presence of the two SNPs was first interrogated, and if both SNPs were present, the phase information was deduced from a second set of nucleic acid reactions (see
A modification of the previous example was introduced by using a universal probe, (henceforth referred to as “X-probe” because of its shape as shown in
The present invention also provides a method where an enzyme is not needed. Instead, two fluorophore-labelled probes that interrogate the two different SNP sites (referred to as SNPs A and B) were initially hybridized together with a connector strand that was covalently modified with quencher molecules on both the 5′ and the 3′ end. This three-stranded probe was then attached to a streptavidin-modified magnetic bead via a biotin modification in one of the fluorophore-labelled probes. The fluorescence signal was measured twice: first after incubation of the magnetic bead-conjugated probe with the appropriate target DNA, and a second time after separation, washing, and reconstitution (to the same volume) of the magnetic bead. In the first measurement, no signal was registered without the target (or in the presence of a wild type target) and one fluorophore was used to detect when the target contained either of the SNPs. Both fluorophores emitted a signal when both were present (
The method in Example 1 previously described was extrapolated to phase three SNP sites. Phasing multiple SNP sites increases the repertoire of the diseases and conditions that can be identified and expands the possible applications of this technology. In this case, two reaction vessels containing the same sample (containing any one, two, or three SNPs) were incubated with three fluorophore-quencher pairs (probe A, B, and C) to interrogate the presence of the three SNPs, SNPs A, B, and C, respectively. The wavelengths of the fluorescence signals emitted were indicative of the presence of the corresponding SNP (
Similar to Example 4, the conditional separation by a magnetic particle (Example 3) was extrapolated to interrogate and phase three SNPs. This required two reaction vessels wherein the magnetic bead was attached to probe C in the first sample well, and attached to probe B in the second sample well (
In the first sample well, the fluorescence profile after magnetic separation provided the phase information relative to probe C, while the second sample well provided the phase information relative to probe B (
The conditional displacement assay was tested on 10 possible diplotypes using the four templates (TD, TA, TB and WT) as shown in
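The count of 10 possible diplotypes is consistent with choosing an unordered pair of templates, with repetition allowed, from the four templates: 4 × 5 / 2 = 10. The sketch below enumerates these pairs; it assumes that each diplotype corresponds to one such unordered pair, and it uses the template labels only as opaque names.

```python
from itertools import combinations_with_replacement
from typing import List, Tuple

TEMPLATES = ("TD", "TA", "TB", "WT")


def possible_diplotypes(templates: Tuple[str, ...] = TEMPLATES) -> List[Tuple[str, str]]:
    """Return all unordered pairs of templates, allowing a template
    to be paired with itself."""
    return list(combinations_with_replacement(templates, 2))
```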
The foregoing examples are presented for the purpose of illustrating the invention and should not be construed as imposing any limitation on the scope of the invention. It will readily be apparent that numerous modifications and alterations may be made to the specific embodiments of the invention described above and illustrated in the examples without departing from the principles underlying the invention. All such modifications and alterations are intended to be embraced by this application.
This application claims the benefit of priority of U.S. provisional application 62/601,136, filed on 14 Mar. 2017, the contents of which are hereby incorporated by reference in their entirety for all purposes.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2018/078938 | 3/14/2018 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62601136 | Mar 2017 | US |