The present invention relates to diagnostic and analytical assays and to a method and kit comprising reagents and means for the identification, detection and/or quantification of a large number of (micro)organisms of different groups (classes, families, genera, species, individuals, among others) by their identification, or the identification of a component thereof, on the same array.
The invention is especially suited for the simultaneous identification and/or quantification of groups and sub-groups of (micro)organisms or related genes present in the same biological sample.
The present invention also provides a two-step method that first detects the presence of any of the searched (micro)organisms and then identifies it.
The present invention is in the field of diagnosis and is related to a method and kit comprising reagents and means for the identification (detection and/or quantification) of different (micro)organisms among other ones having different nucleotide sequences, by identification of their nucleotide sequences through hybridization on specific immobilized capture molecules after amplification by PCR.
The invention is especially suited for the identification and/or quantification of different (micro)organisms of the same genus or family or for the detection and/or quantification of different genes in a specific (micro)organism present in a biological sample.
Identification of an organism or microorganisms can be performed based on the presence in their genetic material of specific sequences. Identification of a specific organism can be performed easily by amplification of a given sequence of the organism using specific primers and detecting or identifying the amplified sequence.
However, in many applications, especially in diagnostics, the organisms possibly present in biological samples are numerous and belong to different families, genera, species, subspecies or even individuals. Amplifying each of the possible organisms separately is difficult and expensive. A simple method is thus required for such multi-parametric, multi-level analysis.
Amplification of a given sequence is performed by several methods, the most common being the polymerase chain reaction (PCR) (U.S. Pat. Nos. 4,683,195 and 4,683,202), the ligase chain reaction (LCR) (Wu and Wallace, 1989 Genomics 4:560-569) and the Cycling Probe Reaction (CPR) (U.S. Pat. No. 5,011,769). One particular way to detect the presence of a given sequence, and thus of a particular organism, is to follow the appearance of amplicons during the amplification cycles. This method is called real-time PCR. A fluorescent signal appears as the amplicons are formed, and the amplification is considered positive when the signal reaches a threshold.
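By way of illustration only, the threshold criterion used in real-time PCR can be sketched as follows. The function name, the fluorescence readings and the threshold value are hypothetical and are not taken from any cited document; the sketch simply reports the first cycle at which the signal reaches the threshold.

```python
def threshold_cycle(fluorescence, threshold):
    """Return the 1-based cycle at which the signal first reaches the
    threshold, or None if the threshold is never reached (negative result)."""
    for cycle, signal in enumerate(fluorescence, start=1):
        if signal >= threshold:
            return cycle
    return None


# Hypothetical readings, one per amplification cycle (arbitrary units).
readings = [0.1, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4]
ct = threshold_cycle(readings, threshold=1.0)
print("positive at cycle", ct) if ct is not None else print("negative")
```

In this simplified form, a sample whose signal never reaches the threshold is reported as negative.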
Detecting the amplicons can also be performed after the amplification by methods based on the specific recognition of amplicons to complementary sequences. The first supports used for such hybridization were the nitrocellulose or nylon membranes. However, the methods were miniaturized and new supports such as conducting surfaces, silica, and glass were proposed together with the miniaturization of the detection process. Micro-arrays or DNA Chips are used for multiple analysis of DNA or RNA sequences either after an amplification step or after a reverse transcription into a cDNA. The target sequences to be detected are labeled during the amplification or copying step and are then detected and possibly quantified on arrays. The presence of a specific target sequence on the arrays is indicative of the presence of a given gene or DNA sequence in the sample and thus of a given organism which may then be identified. The problem of detection becomes difficult when several sequences are homologous to each other, but have to be specifically discriminated upon the same array. This technical problem is the condition to use arrays for many diagnostic purposes since organisms or micro-organisms of interest are often very similar to others on a taxonomic basis and have almost identical DNA sequences.
The Company Affymetrix Inc. has developed a method for direct synthesis of oligonucleotides upon a solid support, at specific locations by using masks at each step of the processing. Said method comprises the addition of a new nucleotide on a growing oligonucleotide in order to obtain a desired sequence at a desired location. This method is derived from the photolithographic technology and is coupled with the use of photoprotective groups, which are released before a new nucleotide is added (EP-A1-0476014, U.S. Pat. No. 5,445,934, U.S. Pat. No. 5,143,854 and U.S. Pat. No. 5,510,270). However, only small oligonucleotides are present on the surface, and said method finds applications mainly for sequencing or identifying a pattern of positive spots corresponding to each specific oligonucleotide bound on the array. The characterization of a target sequence is obtained by comparison of such pattern with a reference. Said technique was applied to the identification of Mycobacterium tuberculosis rpoB gene (WO97/29212 and WO98/28444), wherein the capture nucleotide sequence comprises less than 30 nucleotides and from the analysis of two different sequences that may differ by a single nucleotide (the identification of SNPs or genotyping). Small capture nucleotide sequences (having a length comprised between 10 and 20 nucleotides) are preferred since the discrimination between two oligonucleotides differing in one base is higher, when their length is smaller.
The method is complicated by the fact that it cannot directly detect amplicons resulting from genetic amplification (PCR). A double amplification is performed with primer(s) bearing a T3 or T7 sequence and then a reverse transcription with an RNA polymerase. These RNAs are cut into pieces of about 40 bases before being detected on an array (example 1 of WO 97/29212). Each sequence requires the presence of 10 capture nucleotide sequences and 10 control nucleotide sequences to be identified on the array. The reason for this complex procedure is that long DNA or RNA fragments hybridize very slowly on small oligonucleotide capture nucleotide sequences present on the surface. Said methods are therefore not suited for the detection of homologous sequences, since the homology varies along the sequences and so part of the pieces will hybridize on the same capture nucleotide sequences. Therefore, software is incorporated in the method to allow interpretation of the obtained data. The main reason not to perform a single hybridization of the amplicons on the array is that the amplicons will rehybridize in solution much faster than they hybridize on the small capture nucleotide sequences of the array.
One consequence of such constraints is that polynucleotides are analyzed on oligonucleotide-based arrays only after being cut into oligonucleotides. Said methods are therefore not suited for the detection of homologous sequences, since the homology varies along the sequences and so part of the pieces could hybridize on the same capture probes. For gene expression arrays, which are based on the detection of a cDNA copy of the mRNA, the problem still exists but is less acute since the cDNA is single stranded. The fragments are also cut into smaller species and the method requires the use of several capture oligonucleotide sequences in order to obtain a pattern of signals which attests to the presence of a given gene. Said cutting also decreases the number of labeled nucleotides, and thus reduces the obtained signal. In the case of cDNA analysis, the use of long capture polynucleotide sequences gives a much better sensitivity to the detection. In many gene expression applications, the use of long capture nucleotide sequences is not a problem when the cDNAs to be detected originate from genes having different sequences, since the difference in sequence is sufficient to avoid cross-reactions between them even on a sequence longer than 100 bases, so that polynucleotides can be used as capture nucleotide sequences. Long capture nucleotide sequences give the required sensitivity, but they will hybridize to other homologous sequences.
However, for gene expression arrays, which are based on the cDNA copy of mRNA, the same problem is encountered when using small capture probe arrays: the rate of hybridisation is low. Therefore, the fragments are cut into smaller pieces and the method requires the use of several capture nucleotide sequences in order to obtain a pattern of signals which attests to the presence of a given gene (WO97/10364 and WO97/27317). Said cutting also decreases the number of labeled nucleotides, and thus reduces the obtained signal. In this case, the use of long capture nucleotide sequences gives a much better sensitivity to the detection. In many gene expression applications, when the cDNA to be detected originates from genes having different sequences, the use of long capture probes is not a problem, since there are no cross-reactions between them. Long capture nucleotide sequences give the required sensitivity; however, they will hybridize to other homologous sequences.
The detection of Single Nucleotide Polymorphisms (SNPs) in DNA is just one particular aspect of the detection of homologous sequences. The use of arrays has been proposed to discriminate two sequences differing by one nucleotide at a particular location of the sequence. Since DNA or RNA sequences are present in low copy numbers, their sequences are first amplified so that double stranded sequences are analyzed on the array. Several methods have been proposed to detect such a base change in one location. The document WO 97/31256 proposes the use of two oligonucleotide sequences: the first one with a specific part and an addressable part, the second one with a specific part and a labeled part. After ligation in solution, the product is immobilized on an array with capture nucleotide sequences having at least a part complementary to the addressable part. The detection of SNPs is the basis for polymorphism determination of an individual organism, but also for its genotyping, since the genomes of individuals of the same species or subspecies differ from each other by said SNPs. The presence of a particular SNP affects the activities of enzymes like the P450 and makes them more or less active in the metabolism of a drug.
The capture oligonucleotide present on the array can also be used as primers for extension once the target nucleotide hybridized. The document WO 96/31622 proposes to identify a nucleotide at a given location upon a sequence by elongation of a capture nucleotide sequence with detectable modified nucleotides in order to detect the given spots, where the target has been bound with the last nucleotide of the capture nucleotide sequence being complementary of a target sequence at this particular position. The document WO 98/28438 proposes to complete several cycles of hybridization-elongation steps to label a spot in order to compensate for a low hybridization yield of the target sequence. This method allows identification of a nucleotide at a given location of a sequence by labeling of a spot of the elongated capture nucleotide sequence.
Prior to elongation, the capture nucleotide sequences present on the array can be digested by a nuclease in order to differentiate between matched and unmatched heteroduplexes (U.S. Pat. No. 5,753,439). The use of a nuclease for the identification of sequences has also been proposed (EP 0721016). It has also been proposed to add to the hybridized targets a second labeled nucleotide sequence complementary to the targets, which is ligated to the capture nucleotide sequence if the last nucleotide of the targets is complementary to the targets at this position (WO 96/31622).
The document EP-0785280 proposes a detection of polymorphism based on the hybridization of the target nucleotides on blocks containing several oligonucleotide sequences differing by one base each and obtain a ratio of intensity for determining which sequences are the perfect hybridization matches.
With membranes or nylon supports, the incorporation of a spacer between the support and the capture nucleotide sequences has been proposed to increase the sensitivity of detection on the solid support. Van Ness et al. (1991 Nucleic Acids Res. 19:3345) describe a poly(ethyleneimine) arm for the binding of DNA on nylon membranes. The European patent application EP-0511559 describes a hexaethylene glycol derivative as spacer for the binding of small oligonucleotides upon a membrane. When membranes like nylon are used as support, there is no control of the site of binding between the solid support and the oligonucleotides, and it was observed that a poly dT tail increased the fixation yield and so the resulting hybridization (WO089/11548). Similar results are obtained with repeated capture sequences present in a polymer (U.S. Pat. No. 5,683,872).
Guo et al. (1994 Nucleic Acids Res. 22:5456) teach the use of poly dT of 15 bases as spacer for the binding of oligonucleotides on glass with increased sensitivity of hybridization.
The document WO99/16780 describes the detection of 4 homologous sequences of the gene femA on nylon strips. However, no data on the sensitivity of the method and the detection is presented. In said document, the capture nucleotide sequences comprise between 15 and 350 bases with homology less than 50% with a consensus sequence.
The publication of Anthony et al. (J. Clin. Microbiol. 38:7817-8820) describes the use of a membrane array for the discrimination, with low sensitivity, of homologous sequences originating from several related organisms. The targets to be detected are rDNA amplified from bacteria by consensus PCR, and the detection is obtained on a nylon array containing capture nucleotide sequences for said bacteria. The capture nucleotide sequences have between 20 and 30 bases and are covalently linked to the nylon, and there is no control of the portion of the sequence which is available for hybridization.
However, these patents neither described nor suggested that it was possible to use a component of a (micro)organism, especially a genetic sequence, to identify said (micro)organism together with the identification of the group to which these (micro)organisms belong. Also, there is neither an indication nor a suggestion in the state of the art that polynucleotides can be used as capture sequences in microarrays in order to differentiate a binding between homologous polynucleotide sequences and to permit identification of one target sequence among sequences of other species, genera or families of (micro)organisms.
Also there is no indication or suggestion that homologous sequences differing by one nucleotide at one location of the sequence (such as observed in polymorphism analysis) could be detected by hybridization of the amplified sequences on corresponding capture nucleotide sequences.
Prior to the invention, it was unknown that it is possible to identify in a two-step process, i.e. an amplification followed by a direct hybridization of the amplicons on an array, organisms belonging to the same group, or to two or more groups, together with the specific identification of the groups as such. It was also unknown that it was possible to identify organisms belonging to a group and sub-group together with the specific identification of this group and sub-group, and that such identification could be obtained by using polynucleotides as capture sequences for all detections.
Also it was unknown that polynucleotides could be used for the identification of homologous polynucleotide sequences differing by one nucleotide present in a particular location of the sequence.
Also it was unknown that homologous polynucleotide sequences could be discriminated and detected on an array directly after amplification with a very high sensitivity.
The development of the biochips technology allows the detection of multiple nucleotide sequences simultaneously in a given assay and thus allows the identification of the corresponding organism or part of the organism. Arrays are solid supports containing on their surface a series of discrete regions bearing capture nucleotide sequences (or probes) that are able to bind (by hybridisation) to a corresponding target nucleotide sequence(s) possibly present in a sample to be analysed. The present invention enables the detection of the full length double stranded amplicons produced by PCR on the capture probes fixed on a support like the array. If the target sequence is labelled with modified nucleotides during a reverse transcription or an amplification of said sequence, then a signal can be detected and measured at the binding location. Its intensity gives an estimation of the amount of target sequences present in the sample. Such technology allows the identification and/or quantification of genes or species for diagnostic or screening purposes. More particularly, the present invention extends the specific amplification-detection processes of multiple nucleotide sequences even to non homologous sequences.
The present invention provides a new method and device to improve microarrays or biochips technology for the easy identification (detection and/or quantification) of a large number of (micro)organisms or portions of (micro)organisms like their gene transcripts having very different nucleotide sequences. The method is well suited for miniaturized assays where a large amount of information has to be tested or obtained on a small amount of biological material.
Typical applications fitting with these needs are identifications of organisms like bacteria or other pathogenic organisms among many possible other ones which can be responsible for a disease. The present invention may be used for the determination of the presence of SNPs in a genome in order to detect possible genetic diseases. The present invention may also be used for expression analysis on clinical samples where sometimes a few milligrams or even a few micrograms of tissue are available.
The present invention further provides a method and device for getting specific and sensitive detection even for assays suitable for multiple targets. The method is made simple both for the specific amplification of multiple nucleotide sequences even if non homologous by providing derivative nucleic acids which are all amplified by a single primer pair and identifying (detection and/or quantification) the amplified sequences by their direct hybridization on specific capture molecules immobilized in specific locations. Preferably, the invention allows identification and/or recording of single signals upon said locations. The method is particularly suitable when the sample contains nucleotide sequences to be detected at very different concentrations and/or when genomic DNA is present.
The method may be used in diagnostic procedures which employ a closed system containing all reagents for performing this amplification method and which employ a single amplification reaction of all the sequences present in the sample.
The method is also suited for an identification of the genome of pathogenic organisms. It is also useful for quantification of gene expression in cells or tissues, even in degraded form. The method is compatible with detection of amplified target sequences in real time PCR and on microarrays.
The present invention is premised in part on the discovery that arrays can be used to obtain a discrimination between a homologous (biological) component (such as a genetic sequence) of different (micro)organisms belonging to several groups together with the identification of these groups as such.
The present invention is especially useful in using arrays to discriminate between homologous genetic sequences (amino acid sequences and nucleotide sequences) belonging to several groups of organisms together with the identification of these groups as such.
The invention provides a method and a device which are based upon a simplified technology requiring the use of a single or limited number of primer pair(s) in an amplification step to detect the presence of the specific target or group of target sequence(s) and followed by the identification (detection and/or quantification) of said specific target or groups of target genetic sequence(s) by recording in a single spot identification upon said micro-array and in the same experimental protocol, said signal being either specific of the organism or the group or sub-group of organisms.
The present invention further provides means for an identification of organisms differing by single base difference of a given nucleotide sequence followed by hybridization of their amplified polynucleotide sequences upon arrays.
The terms “nucleic acid”, “oligonucleotide”, “array”, “nucleotide sequence”, “target nucleic acid”, “bind substantially”, “hybridizing specifically to”, “background”, “quantifying” are the ones described in the international patent application WO 97/27317 incorporated herein by reference. The term polynucleotide refers to nucleotide or nucleotide like sequences of more than 100 bases long.
The terms “nucleotide triphosphate”, “nucleotide”, “primer sequence” are those described in the documents WO 00/72018 and WO 01/31055, incorporated herein by reference.
The terms “homologous genetic sequences” mean amino acid or nucleotide sequences having a percentage of amino acids or nucleotides identical at corresponding positions which is higher than in purely random alignments. They are considered as homologous when they show a minimum of homology (or sequence identity) defined as the percentage of identical nucleotides or amino acids found at each position compared to the total of nucleotides or amino acids, after the sequences have been optimally aligned taking into account additions or deletions (like gaps) in one of the two sequences to be compared. Genes coding for a given protein but present in genetically different sources like different organisms are usually homologous. Also, in a given organism, genes coding for proteins or enzymes of the same family (Interleukins, Cytochrome b, Cytochrome P450) are usually homologous. The degree of homology (or sequence identity) can vary a lot, as homologous sequences may be homologous only in one part, in a few parts or portions, or all along their sequences. The parts or portions of the sequences that are identical in both sequences are said to be conserved. Protein domains which present a conserved three dimensional structure are usually coded by homologous sequences and even often by a unique exon. The sequences showing a high degree of invariance are said to be highly conserved and they present a high degree of homology.
The terms “group, sub-group and sub-sub-group” refer first to the classification of biological organisms into taxa: kingdoms, branches, classes, orders, families, genera, species, sub-species, varieties or individuals. These constitute different levels of biological taxonomical organization. Groups also refer to organisms which have some aspects in common but some genetic differences, like, for example, GMO plants and transgenic or chimeric animals. For the purpose of this invention, the common aspects have to be reflected in common or homologous DNA or RNA sequences, and the dissimilarities in differences in DNA sequences. Gene sequences can also be classified in groups and sub-groups independently of their organism of origin and are as such part of the invention. They will then refer to groups or sub-groups of genes which belong to a given family such as the cytochrome P450 genes, the protein kinases, the G protein-coupled receptors and others. These genes are homologous to each other as defined here above.
The classification of genes (nucleotide sequences) is used as the basis of molecular paleontology for establishing the classification of organisms into species, genera, families, orders, classes, branches, kingdoms and taxa.
The terms “hybridization” or “annealing” refer to the formation of duplex DNA strands by nucleotide base pairing. Hybridization yield and specificity are strongly dependent on the incubation conditions, especially the temperature and the solution stringency. Conditions have to be worked out in order to optimize the hybridization yield of the specific strands and to minimize the hybridization of unrelated sequences. Stability of the duplex is estimated by the melting temperature (Tm), which represents the temperature at which 50% of the strands will dissociate in given conditions. Determination of the duplex stability can be performed empirically by those skilled in the art considering variables such as, but not limited to, the length of the duplex, base composition, ionic strength, and number and position of the mismatches. The Tm will also strongly depend on solution composition, on the ionic strength and on the pH. The Tm for perfectly matched small sequences of around 20 bp, such as primers, can be estimated in reference conditions, in a first approximation, by available software such as Primer Express or Oligo 6.
Reaction conditions have to be adjusted in order to obtain stringent hybridization conditions in which the complementary sequences will fully or nearly fully hybridize. Such conditions are presented for example in Sambrook et al. (1985 Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.) presented here as reference. Typical stringent solutions used in the PCR are in the range of 0.1 M salt concentration at pH 8. The working conditions are typically chosen in order to be around 5° C. lower than the Tm of the primers and are then adjusted if necessary, taking into account the possible presence of mismatches.
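By way of illustration only, a first-approximation Tm for a short primer, and a working temperature around 5° C. below it, can be sketched as follows. The sketch uses the well-known rule of thumb Tm = 2 × (A+T) + 4 × (G+C) (often called the Wallace rule); this is an assumption made for the example and is not the algorithm of the software packages named above. The function names and the example primer are hypothetical.

```python
def wallace_tm(seq):
    """Rough Tm estimate (degrees C) for a short, perfectly matched
    oligonucleotide: 2 degrees per A/T and 4 degrees per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


def working_temperature(seq, margin=5):
    """Starting working temperature, chosen `margin` degrees below the
    estimated Tm; to be adjusted experimentally if mismatches are expected."""
    return wallace_tm(seq) - margin


primer = "ATGCATGCATGCATGCATGC"  # hypothetical 20-base primer
print(wallace_tm(primer), working_temperature(primer))
```

Such an estimate ignores ionic strength, pH and mismatches, which is why the conditions are adjusted empirically afterwards.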
The term “homologous sequences” means nucleotide sequences having a percentage of nucleotides identical at corresponding positions which is higher than in purely random alignments. They are considered as homologous when they show a minimum of homology (or sequence identity) defined as the percentage of identical nucleotides found at each position compared to the total nucleotides, after the sequences have been optimally aligned taking into account additions or deletions (like gaps) in one of the two sequences to be compared. Genes coding for a given protein but present in genetically different sources like different organisms are usually homologous. Also, in a given organism, genes coding for proteins or enzymes of the same family (Interleukins, Cytochrome b, P450) are homologous. The degree of homology (or sequence identity) can vary a lot, as homologous sequences may be homologous only in one part, in a few parts or portions, or all along their sequences. The parts or portions of the sequences that are identical in both sequences are said to be conserved. They show identity of sequences. The overall different sequences which include such identical portions of sequences are said to be homologous since some portions of their sequences show a perfect alignment. In some embodiments, the homologous sequences have at least 50%, preferably at least 70% and even 90%, nucleotide identity.
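By way of illustration only, the percentage of sequence identity between two sequences that have already been aligned can be sketched as follows. The function names are hypothetical, gaps are assumed to be written as "-", and the choice of denominator (all aligned columns, excluding columns that are gaps in both sequences) is an assumption of this sketch. The 50% default is the lowest of the identity levels mentioned above.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def is_homologous(a, b, min_identity=50.0):
    """True if the aligned sequences reach the chosen identity level."""
    return percent_identity(a, b) >= min_identity


print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
print(percent_identity("ACGT-CGT", "ACGTACGT"))
```

The alignment itself is assumed to have been produced beforehand by one of the alignment programs.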
Methods of alignment of sequences are based on local homology algorithms which have been computerised and are available as for example (but not limited to) Clustal®, (Intelligenetics, Mountain Views, Calif.), or GAP®, BESTFIT®, FASTA® and TFASTA® (Wisconsin Genetics Software Package, Genetics Computer Group Madison, Wis., USA) or Boxshade®.
The term “consensus sequence” is a sequence determined after alignment of the several homologous sequences to be considered (calculated as the base which is the most commonly found in each position in the compared, aligned, homologous sequences).
The consensus sequence represents a sort of “average” sequence which is as close as possible to all the compared sequences. For highly homologous sequences, or if the consensus sequence is long enough and the reaction conditions are not too stringent, it can bind to all the homologous sequences. This is especially useful for the amplification of homologous sequences with the same primers, called consensus primers. Experimentally, the consensus sequence calculated by the programs above can be adapted in order to obtain such a property.
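By way of illustration only, the calculation of a consensus sequence as the most commonly found base at each position of a set of aligned homologous sequences can be sketched as follows. The function name and the example sequences are hypothetical; ties are resolved in favour of the base encountered first, which is an arbitrary choice of this sketch.

```python
from collections import Counter


def consensus(aligned):
    """Return the most common base at each position of aligned sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have the same length")
    result = []
    for column in zip(*aligned):
        counts = Counter(base.upper() for base in column)
        # most_common() lists equally frequent bases in order of first occurrence
        result.append(counts.most_common(1)[0][0])
    return "".join(result)


print(consensus(["ACGTA", "ACGTT", "ACCTA"]))
```

As noted above, a calculated consensus may still need to be adapted experimentally before it is used as a consensus primer.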
The terms “primer”, “universal primer” or “specific primer”, “amplification reaction mixture”, “thermostable polymerase” “volume exclusion agent” as mainly used here are defined in the EP141113 (cited above).
The term “primer” refers to an oligonucleotide, whether natural or synthetic, capable of acting as a point of initiation of DNA synthesis under conditions in which synthesis of a primer extension product complementary to a nucleic acid strand is induced, i.e., in the presence of four different nucleoside triphosphates and an agent for polymerization (i.e., DNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. Oligonucleotide analogues, such as “peptide nucleic acids”, can act as primers and are encompassed within the meaning of the term “primer” as used herein. A primer is preferably a single-stranded oligodeoxyribonucleotide. The appropriate length of a primer depends on the intended use of the primer but typically ranges from 6 to 50 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template but must be sufficiently complementary to hybridize with a template. The primers are specific to given sequences or to a family or sequence related polynucleotide and are then considered as consensus for these related sequences.
The PCR reagents described herein are provided and used in PCR in suitable concentrations to provide amplification of the target nucleic acid. The minimal amount of DNA polymerase is generally at least about 1 unit/100 μl of solution, with from about 4 to about 25 units/100 μl being preferred. A “unit” is defined herein as the amount of enzyme activity required to incorporate 10 nmoles of total nucleotides (dNTPs) into an extending nucleic acid chain in 30 minutes at 74° C. The concentration of each primer is at least about 0.025 μmolar and less than about 1 μmolar with from about 0.05 to about 0.2 μmolar being preferred. All primers are present in about the same amount (within a variation of 10% of each). The cofactor is generally present in an amount of from about 1 to about 15 μmolar, and each dNTP is generally present at from about 0.15 to about 3.5 mmolar in the reaction mixture. The volume exclusion agent is present in an amount of at least about 1 weight percent, with amounts within the range of from about 1 to about 20 weight % being preferred. As used in defining the amounts of materials, the term “about” refers to a variation of +/−10% of the indicated amount.
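By way of illustration only, the primer concentration ranges stated above can be checked as follows. The function names are hypothetical. The sketch applies the nominal bounds without the +/-10% allowance attached to the term “about”, and it interprets “within a variation of 10% of each” as the largest and smallest concentrations differing by no more than 10% of the smallest; both simplifications are assumptions of this sketch.

```python
def primer_status(conc_um):
    """Classify a primer concentration given in micromolar."""
    if not (0.025 <= conc_um < 1.0):
        return "out of range"
    if 0.05 <= conc_um <= 0.2:
        return "preferred"
    return "acceptable"


def primers_balanced(concs_um, tolerance=0.10):
    """True if all primer concentrations agree within the tolerance."""
    if not concs_um:
        raise ValueError("at least one concentration is required")
    lowest, highest = min(concs_um), max(concs_um)
    return highest - lowest <= tolerance * lowest


print(primer_status(0.1), primers_balanced([0.10, 0.105]))
```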
An “amplification reaction mixture”, as used herein, refers to an aqueous solution comprising the various amplification reagents used to amplify a target nucleic acid. The reagents include primers, enzymes, aqueous buffers, salts, target nucleic acid, and deoxynucleoside triphosphates (both conventional and unconventional). Depending on the context, the mixture can be either a complete or incomplete reaction mixture. A “PCR reaction mixture” typically contains oligonucleotide primers, a thermostable DNA polymerase, dNTPs, and a divalent metal cation in a suitable buffer.
A reaction mixture is referred to as complete if it contains all reagents necessary to enable the reaction, and incomplete if it contains only a subset of the necessary reagents. It will be understood by those skilled in the art that reaction components are routinely stored as separate solutions, each containing a subset of the total components, for reasons of convenience, storage stability, and to allow for independent adjustment of the concentrations of the components depending on the application, and, furthermore, that reaction components are combined prior to the reaction to create a complete reaction mixture.
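By way of illustration only, the distinction between complete and incomplete reaction mixtures can be sketched with sets. The required components are those listed above as typical of a PCR reaction mixture; the function names and the way the stock solutions are split are hypothetical.

```python
# Components typically found in a PCR reaction mixture, per the definition above.
PCR_REQUIRED = frozenset({
    "primers",
    "thermostable DNA polymerase",
    "dNTPs",
    "divalent metal cation",
    "buffer",
})


def is_complete(mixture):
    """True if the mixture contains every required component."""
    return PCR_REQUIRED <= set(mixture)


def combine(*stocks):
    """Pool separately stored stock solutions into one mixture."""
    combined = set()
    for stock in stocks:
        combined |= set(stock)
    return combined


# Hypothetical stock solutions, each holding a subset of the components.
enzyme_stock = {"thermostable DNA polymerase", "buffer"}
primer_stock = {"primers", "dNTPs"}
cation_stock = {"divalent metal cation"}

print(is_complete(enzyme_stock))
print(is_complete(combine(enzyme_stock, primer_stock, cation_stock)))
```

Each stock on its own is an incomplete mixture; only their combination prior to the reaction is complete.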
The term “thermostable DNA polymerase” refers to an enzyme that is relatively stable to heat and catalyzes the polymerization of nucleoside triphosphates to form primer extension products that are complementary to one of the nucleic acid strands of the target sequence. The enzyme initiates synthesis at the 3′ end of the primer and proceeds in the direction toward the 5′ end of the template until synthesis terminates. Purified thermostable DNA polymerases can be selected from the genera Thermus, Pyrococcus, Thermococcus and Thermotoga, preferably Thermus aquaticus, Pyrococcus furiosus, Pyrococcus woesei, Pyrococcus spec. (strain KOD1), Pyrococcus spec. GB-D, Thermococcus litoralis, Thermococcus sp. 9° N-7, Thermotoga maritima, Pyrococcus spec. ES4 (endeavori), Pyrococcus spec. OT3 (horikoshii), Pyrococcus profundus, Thermococcus stetteri, Thermococcus spec. AN1 (zilligii), Thermococcus peptonophilus, Thermococcus celer and Thermococcus fumicolans.
The term “thermostable enzyme” refers to an enzyme that is relatively stable to heat. The thermostable enzymes can withstand the high temperature incubation used to remove the modifier groups, typically greater than 50° C., without suffering an irreversible loss of activity. The hot start DNA polymerases are enzymes, or enzymes under conditions, that are less active in the initial conditions but whose activity is increased by a first heating at high temperature, usually above 90° C.
The term “volume exclusion agent”, as defined herein, refers to one or more water-soluble or water-swellable, nonionic, polymeric volume exclusion agents.
The general principles and conditions for amplification and detection of nucleic acids using polymerase chain reaction are quite well known and are described in numerous references including U.S. Pat. No. 4,683,195, U.S. Pat. No. 4,683,202 and U.S. Pat. No. 4,965,188, incorporated herein by reference. Thus, in view of the teaching in the art and the specific teaching provided herein, a worker skilled in the art should have no difficulty in practicing the present invention by making the adjustments taught herein to co-amplify several nucleic acids, one of which may be a low copy target nucleic acid or may be preferentially amplified.
The term “Real Time PCR” means a method which allows detecting and/or quantifying the presence of the amplicons during the PCR cycles. In the Real Time PCR, the presence of the amplicons is detected and/or quantified in at least one of the cycles of amplification. The increase of amplicons or signal related to the amount of amplicons formed during the PCR cycles is used for the detection and/or quantification of a given nucleotide sequence in the PCR solution.
Micro-arrays are described extensively in EP1266034 and in US20040229225, the disclosures of which are incorporated herein by reference in their entireties. “Micro-array” means a support on which multiple capture molecules are immobilized in order to be able to bind to the given specific target molecule. The micro-array is preferentially composed of capture molecules present at specifically localized areas on the surface or within the support or on the substrate covering the support. A specifically localized area is the area of the surface which contains bound capture molecules specific for a determined target molecule. The specific localized area is either known by the method of building the micro-array or is defined during or after the detection. A spot is the area where specific target molecules are fixed on their capture molecules and seen by the detector. Immobilization of capture molecules on insoluble support is also possible in the form of lines.
The term “organisms” includes live microbial entities as such, such as bacteria or fungi, and comprises parts thereof, the presence of which may be identified with the present method. Hence, in case an organism produces a particular entity, such as a particular protein, the identification of the genetic material of said organism (such as its genomic DNA or its mRNA) allows the determination of whether said part of the organism is present in the sample.
The present invention is related to an identification and/or quantification method of a biological (micro)organism or a (biological) component thereof, said (micro)organism or its component being possibly present in a sample, preferably a biological sample, among at least two, preferably at least four, other related (micro)organisms or components; said method comprising the step of:
Advantageously, said method further comprises the step of identifying and/or quantifying the presence of several groups, subgroups or sub-subgroups of components or (micro)organisms, comprising said components being related to each other until possible individual genetic sequences (nucleotide and/or amino acid sequences) wherein the binding of targets and corresponding specific capture molecules forms a signal at an expected location allowing the identification of a target specific of a group, sub-group or sub-subgroup of components or (micro)organisms comprising said components.
Therefore, the biological component according to the invention could be a nucleotide sequence specific of a (micro)organism or an amino acid sequence (peptide) specific of a (micro)organism. Examples of said molecules are homologous nucleotide sequences or peptides presenting a high homology such as receptors, HLA molecules, cytochrome P450, etc.
Furthermore, the inventors have discovered that it is possible to drastically simplify the identification or quantification of one or several (micro)organisms among many other ones present in such biological sample, said identification and/or quantification being obtained by combining a single amplification using common primer pairs and an identification of the possible (micro)organisms by detecting, quantifying and/or possibly recording upon an array the presence of a single signal resulting only between a capture nucleotide sequence and its corresponding target nucleotide sequence and thereafter correlating the presence of said detected target nucleotide sequence to the identification of a nucleotide sequence specific of said (micro)organism(s).
This means that the method and device according to the invention will allow the easy identification/detection of a specific target nucleotide sequence among other homologous sequences and possibly its quantification (characterization of the number of copies or presence of said organisms in a biological sample), said target sequence having a nucleotide sequence specific of said (micro)organisms.
Such identification may be obtained directly, after washing of possible contaminants (unbound sequences), by detecting and possibly recording a single spot signal at one specific location, wherein said capture nucleotide sequence was previously bound and said identification is not a result of an analysis of a specific pattern upon the microarray as proposed in the system of the state of the art. Therefore, said method and device do not necessarily need a detailed analysis of said pattern by an image processing and a software analysis.
This invention was made possible by discovering that target sequences can be discriminated from other homologous ones upon an array with high sensitivity by using bound capture nucleotide sequences composed of at least two parts, one being a spacer bound by a single and advantageously predetermined (defined) link to the support (preferably a non porous support) and the other part being a specific nucleotide sequence able to hybridize with the nucleotide target sequence.
Furthermore, said detection is greatly increased, if high concentrations of capture nucleotide sequences are bound to the surface of the solid support.
The present invention is related to the identification of a target nucleotide sequence obtained from a biological (micro)organism or a portion thereof, especially a gene possibly present in a biological sample from at least 4 other homologous (micro)organisms or a portion thereof, said other (micro)organisms could be present in the same biological sample and have homologous nucleotide sequences with the target.
Said identification is obtained firstly by a genetic amplification of said nucleotide sequences (target and homologous sequences) by common primer pairs followed (after washing) by discrimination between the possible different target amplified nucleotide sequences. Said discrimination is advantageously obtained by hybridization upon the surface of an array containing capture nucleotide sequences at a given location, specific for a target nucleotide sequence specific for each (micro)organism to be possibly present in the biological sample and by the identification of said specific target nucleotide sequence through the identification and possibly the recording of a signal resulting from the specific binding of this target nucleotide sequence upon its corresponding capture nucleotide sequence at the expected location (single location signal being specific).
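By way of non-limiting illustration, the reading of a single location signal can be sketched as a lookup from array location to the (micro)organism whose specific capture nucleotide sequence is bound there. The layout, the signal values, the threshold and the function name below are hypothetical and serve only to show the principle that no pattern analysis is required.

```python
def identify(spot_signals, layout, threshold):
    """Return the organisms whose capture spot gives a signal at or above threshold.

    spot_signals: dict mapping array location -> measured signal
    layout:       dict mapping array location -> organism whose specific
                  capture nucleotide sequence is bound at that location
    """
    return sorted(
        layout[location]
        for location, signal in spot_signals.items()
        if location in layout and signal >= threshold
    )


# Hypothetical array layout and hypothetical signals (arbitrary units).
layout = {(0, 0): "S. aureus", (0, 1): "S. epidermidis", (1, 0): "S. hominis"}
signals = {(0, 0): 5200, (0, 1): 130, (1, 0): 95}

print(identify(signals, layout, threshold=1000))
```

Each positive location is interpreted on its own, which corresponds to the statement that a single location signal is specific.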
According to the invention, the preferred method for genetic amplification is the PCR using two anti-parallel consensus primers which can recognize all said target homologous nucleotide sequences but other genetic amplification methods may be used.
Therefore, said (micro)organisms could be present in any biological material or sample, including genetic material obtained from a virus, fungus, bacterium, plant or animal cell (including from the human body). The biological sample can also be any culture medium wherein microorganisms, xenobiotics or pollutants are present, as well as an extract obtained from a plant or an animal (including a human) organ, tissue, cell or biological fluid (blood, serum, urine, sputum, etc).
The method according to the invention can be performed by using a specific identification (diagnostic and/or quantification) kit or device comprising at least an insoluble solid support upon which are bound single stranded capture nucleotide sequences (preferably bound to the surface of the solid support by a direct covalent link or by the intermediate of a spacer) according to an array with a density of at least 4, preferably at least 10, 16, 20, 50, 100, 1000, 4000, 10 000 or more, different single stranded capture nucleotide sequences/cm2 insoluble solid support surface, said single stranded capture nucleotide sequences having advantageously a length comprised between about 30 and about 600 bases (including the spacer) and containing a sequence of about 3 to about 60 bases, said sequence being specific for the target (which means that said bases of said sequence are able to form a binding with their complementary bases upon the sequence of the target by complementary hybridization). Preferably, said hybridization is obtained under stringent conditions (under conditions well-known to the person skilled in the art).
In the method and kit or device according to the invention, the capture nucleotide sequence is a sequence having between 16 and 600 bases, preferably between 30 and 300 bases, more preferably between 40 and 150 bases and the spacer is a chemical chain at least 6.8 nm long (of at least 4 carbon chains), a nucleotide sequence of more than 15 bases or a nucleotide derivative such as PMA.
The method, kit and device according to the invention are particularly suitable for the identification of a target, being preferably a biological (micro)organism or a part of it, possibly present in a biological sample where at least 4, 12, 15 or even more homologous sequences are present. Because of the high homology, said nucleotide sequences can be amplified by common primer(s) so that the identification of the target nucleotide sequence is obtained specifically by the discrimination following its binding with the corresponding capture nucleotide sequence, previously bound at a given location upon the microarray. The sensitivity can also be greatly increased if capture nucleotide sequences are spotted onto the solid support surface by a robot at high density according to an array. A preferred embodiment of the invention is to use an amount of capture nucleotide sequences spotted on the array resulting in the binding of between about 0.01 and about 5 pmoles of sequence equivalent/cm2 of solid support surface.
The kit or device according to the invention may also incorporate various media or devices for performing the method according to the invention. Said kit (or device) can also be included in an automatic apparatus such as a high throughput screening apparatus for the detection and/or the quantification of multiple nucleotide sequences present in a biological sample to be analyzed. Said kit or apparatus can be adapted for performing all the steps or only several specific steps of the method according to the invention.
In the method, the kit (device) or apparatus according to the invention, the length of the bound capture nucleotide sequences is preferably comprised between about 30 and about 600 bases, preferably between about 40 and about 400 bases and more preferably between about 40 and about 150 bases. Longer nucleotide sequences can be used if they do not lower the binding yield of the target nucleotide sequences usually by adopting hairpin based secondary structure or by interaction with each other.
In a preferred embodiment, the specific part of the capture nucleotide sequence is bound onto a nucleotide sequence of between 20 and 600 bases.
In another preferred embodiment, all capture molecules are polynucleotides of more than 100 bases long.
In another embodiment, the capture nucleotide sequence is linked to a polymer molecule bound to the solid support. The polymer is preferably a chain of at least 10 atoms, selected from the group consisting of poly-ethyleneglycol, polyaminoacids, polyacrylamide, poly-aminosaccharides, polyglucides, polyamides, polyacrylate, polycarbonate, polyepoxides or poly-ester (possibly branched polymers).
If the homology between the sequences to be detected is low (between 30 and 60%), parts of the sequence which are specific in each sequence can be used for the design of specific capture nucleotide sequences binding each of the different target sequences. However, it is more difficult to find a part of the sequence sufficiently conserved to design “consensus” sequences which will amplify or copy all desired sequences. If one pair of consensus primers is not enough to amplify all the homologous sequences, then a mixture of two or more primer pairs is added in order to obtain the desired amplifications. The minimum number of homologous sequences amplified by the same consensus primer is two, but there is no upper limit to said number.
If the sequences show a high degree of homology, higher than 60% and even higher than 90%, then a common sequence for consensus primers is easily found, but the choice of specific capture nucleotide sequences becomes more difficult.
In another preferred embodiment of the invention, the capture nucleotide sequences are chemically synthesized oligonucleotide sequences shorter than 100 bases (easily produced on a programmed automatic synthesizer). Such sequences can bear a functionalized group for covalent attachment upon the support, at high concentrations.
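By way of non-limiting illustration, the trade-off between consensus primer design and specific capture sequence design can be related to a simple percent identity computed on two pre-aligned sequences. The sketch below assumes the sequences have already been aligned to equal length by some other tool; the function names and the example sequences are hypothetical, and the thresholds are the 30%, 60% figures given above.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def design_hint(identity):
    """Summarize which design task is expected to be harder at a given homology."""
    if identity > 60:
        return "consensus primers easy, specific capture sequences harder"
    if identity >= 30:
        return "specific capture sequences easy, consensus primers harder"
    return "below the homology range discussed"


# Hypothetical 10-base aligned fragments differing at one position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, design_hint(identity))
```

This simple count treats every aligned position equally and does not handle gaps in any special way, which is a simplification adopted for the example.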
Longer capture nucleotide sequences are preferably synthesized by (PCR) amplification (of a sequence incorporated into a plasmid containing the specific part of the capture nucleotide sequence and the non specific part (spacer)).
In a further embodiment of the invention, the specific sequence of the capture nucleotide sequence is separated from the surface of the solid support by a distance of at least about 6.8 nm, equivalent to the length of a nucleotide sequence of at least 20 base pairs in double helix form.
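The equivalence between 6.8 nm and 20 base pairs follows from the approximate rise of 0.34 nm per base pair in a B-form double helix, a textbook value that is assumed here rather than stated in the text. A minimal sketch of the arithmetic, with hypothetical function names:

```python
import math

RISE_PER_BP_NM = 0.34  # approximate rise per base pair, B-form DNA (assumed textbook value)


def helix_length_nm(base_pairs):
    """Length of a double helix of the given number of base pairs, in nm."""
    return base_pairs * RISE_PER_BP_NM


def min_base_pairs(distance_nm):
    """Smallest whole number of base pairs spanning at least distance_nm."""
    # The small offset guards against floating-point noise at exact multiples.
    return math.ceil(distance_nm / RISE_PER_BP_NM - 1e-9)


print(round(helix_length_nm(20), 2))
print(min_base_pairs(6.8))
```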
In the method, kit (device) or apparatus according to the invention, the portion(s) (or part(ies)) of the capture nucleotide sequences complementary to the target is comprised between about 3 and about 60 bases, preferably between about 15 and about 40 bases and more preferably between about 20 and about 30 bases. These bases are preferably assigned as a continuous sequence located at or near the extremity of the capture nucleotide sequence. This sequence is considered as the specific sequence for the detection. In a preferred form of the invention, the sequence located between the specific capture nucleotide sequence and the support is a non specific sequence.
In another embodiment of the invention, a specific nucleotide sequence comprising between about 3 and about 60 bases, preferably between about 15 and about 40 bases and more preferably between about 20 and about 30 bases is located on a capture nucleotide sequence comprising a sequence between about 30 and about 600 bases.
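By way of non-limiting illustration, the length ranges given above for the capture nucleotide sequence and for its specific part can be expressed as a simple check. The function names and return strings are hypothetical, and the "about" tolerance is ignored for simplicity.

```python
def check_capture(total_len, specific_len):
    """Return a list of problems with the given lengths (in bases); empty if none."""
    issues = []
    if not 30 <= total_len <= 600:
        issues.append("total length outside 30-600 bases")
    if not 3 <= specific_len <= 60:
        issues.append("specific part outside 3-60 bases")
    if specific_len > total_len:
        issues.append("specific part longer than the whole capture sequence")
    return issues


def preference(specific_len):
    """Classify the length of the specific part according to the stated ranges."""
    if 20 <= specific_len <= 30:
        return "more preferred"
    if 15 <= specific_len <= 40:
        return "preferred"
    if 3 <= specific_len <= 60:
        return "allowed"
    return "outside stated range"


# Hypothetical capture sequence: 25 specific bases on a 95-base non specific spacer.
print(check_capture(total_len=120, specific_len=25), preference(25))
```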
The method, kit (device) or apparatus according to the invention are suitable for the detection and/or the quantification of a target which is made of DNA or RNA, including sequences which are partially or totally homologous upon their total length.
The method according to the invention can be performed even when a target presents a homology (or sequence identity) greater than 30%, greater than 60% and even greater than 80% with other molecules.
In the method, kit (device) or apparatus according to the invention, the capture nucleotide sequences are advantageously covalently bound (or fixed) upon the insoluble solid support, preferably by one of their extremities as described hereafter.
The method according to the invention gives significant results which allow identification (detection and quantification) of amplicons in solution at concentrations lower than about 10 nM, lower than about 1 nM, preferably lower than about 0.1 nM and more preferably lower than about 0.01 nM (=1 fmole/100 μl).
Another important aspect of this invention is the use of highly concentrated capture nucleotide sequences on the surface. If the concentration is too low, the yield of the binding drops quickly and becomes undetectable. Concentrations of capture nucleotide sequences between about 600 and about 3,000 nM in the spotting solutions are preferred. However, concentrations as low as about 100 nM still give positive results in favorable cases (when the yield of covalent fixation is high or when the target to be detected is single stranded and present in high concentrations). Such low spotting concentrations would give a density of capture nucleotide sequences as low as 20 fmoles per cm2. On the other hand, higher densities were limited in the assays only by the concentrations of the capture solutions, and concentrations higher than 3,000 nM still give good results.
The use of these very high concentrations and of long nucleotide sequences are two unexpected characteristic features of the invention. The theory of DNA hybridization proposed that the rate of hybridization between two complementary DNA sequences in solution is proportional to the square root of the DNA length, the smaller one being the limiting factor (Wetmur, J. G. and Davidson, N. 1968 J. Mol. Biol. 3:584). In order to obtain the required specificity, the specific sequences of the capture nucleotide sequences had to be small compared to the target. Moreover, the targets were obtained after PCR amplification and were double stranded, so that they reassociate in solution much faster than they hybridize on small sequences fixed on a solid support, where diffusion is low, thus reducing the rate of reaction even more. It was unexpected to observe so large an increase in the yield of hybridization with the same short specific sequence.
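The equivalence 0.01 nM = 1 fmole/100 μl follows from the fact that 1 nM corresponds to 1 fmole per μl. A minimal sketch of the conversion, with hypothetical function names and Avogadro's number rounded to four significant figures:

```python
COPIES_PER_FMOL = 6.022e8  # Avogadro's number (about 6.022e23 per mole) times 1e-15


def fmol_in_volume(conc_nM, volume_ul):
    """Amount in femtomoles for a concentration in nM and a volume in microliters.

    1 nM = 1e-9 mol/L = 1e-15 mol/ul = 1 fmol/ul.
    """
    return conc_nM * volume_ul


def copies_in_volume(conc_nM, volume_ul):
    """Approximate number of molecules in the given volume."""
    return fmol_in_volume(conc_nM, volume_ul) * COPIES_PER_FMOL


print(fmol_in_volume(0.01, 100))    # amount in fmol for 0.01 nM in 100 ul
print(copies_in_volume(0.01, 100))  # corresponding number of amplicon copies
```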
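By way of non-limiting illustration, the square-root dependence mentioned above can be used to compare the expected hybridization rate of a short specific sequence with that of a longer one. The sketch uses only the stated proportionality, with all other constants omitted; the function name, the reference length and the example lengths are hypothetical.

```python
import math


def relative_rate(len_a, len_b, reference_len):
    """Rate relative to a duplex whose shorter strand has reference_len bases.

    Uses only the proportionality rate ~ sqrt(L), where L is the length of
    the shorter of the two complementary sequences (the limiting one).
    """
    limiting = min(len_a, len_b)
    return math.sqrt(limiting / reference_len)


# A 25-base specific sequence against a 400-base target, compared with
# a 400-base duplex taken as reference.
print(relative_rate(25, 400, reference_len=400))
```

On this simple model a short specific sequence would be expected to hybridize several-fold more slowly than a long one, which is the expectation against which the observed increase in yield is described as unexpected.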
The amount of a target which “binds” on the spots is small compared to the amount of capture nucleotide sequences present. So there is a large excess of capture nucleotide sequence and there was no increase of binding if more capture nucleotide sequences were present.
One may perform the detection on the full length sequence obtained after amplification or copy and when labeling is performed by incorporation of labeled nucleotides, more markers are present on the hybridized target making the assay sensitive.
The method, kit and apparatus according to the invention may comprise the use of other bound capture nucleotide sequences, which may have the same characteristics as the previous ones and may be used to identify a target from another group of homologous sequences (preferably amplified by common primer(s)).
In the microbiological field, one may use consensus primer(s) specific for each family, or genus, of micro-organisms and then identify some or all the species of these various families in an array by using capture nucleotide sequences of the invention. Detection of other sequences can be advantageously performed on the same array (i.e. by allowing a hybridization with a standard nucleotide sequence used for the quantification, with consensus capture nucleotide sequences for the same or different micro-organism strains, with a sequence allowing a detection of a possible antibiotic resistance gene of micro-organisms or for positive or negative control of hybridization). Said other capture nucleotide sequences have (possibly) a specific sequence longer than 10 to 60 bases and a total length as high as 600 bases and are also bound upon the insoluble solid support (preferably in the array made with the other bound capture nucleotide sequences related to the invention). A long capture nucleotide sequence may also be present on the array as a consensus capture nucleotide sequence for hybridization with all sequences of the microorganisms from the same family or genus, thus giving the information on the presence or not of a microorganism of such family or genus in the biological sample.
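By way of non-limiting illustration, the combined reading of consensus and species-specific capture nucleotide sequences on one array can be sketched as a two-level lookup. The spot names, the layout and the function name are hypothetical; E. faecalis is used only as an example species.

```python
def interpret(positive_spots, consensus_spots, species_spots):
    """Group positive species-level spots under their genus.

    positive_spots:  set of spot names giving a signal
    consensus_spots: dict spot name -> genus detected by a consensus capture sequence
    species_spots:   dict spot name -> (genus, species) for specific capture sequences
    Returns a dict genus -> list of identified species (possibly empty).
    """
    report = {}
    for spot, genus in consensus_spots.items():
        if spot in positive_spots:
            report.setdefault(genus, [])
    for spot, (genus, species) in species_spots.items():
        if spot in positive_spots:
            report.setdefault(genus, []).append(species)
    return report


consensus = {"C1": "Staphylococcus", "C2": "Enterococcus"}
species = {
    "S1": ("Staphylococcus", "S. aureus"),
    "S2": ("Staphylococcus", "S. epidermidis"),
    "S3": ("Enterococcus", "E. faecalis"),
}

print(interpret({"C1", "S1"}, consensus, species))
print(interpret({"C2"}, consensus, species))
```

A genus reported with an empty species list corresponds to the case where the consensus capture sequence is positive but none of the species-specific capture sequences present on the array gives a signal.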
The same array can also bear capture nucleotide sequences specific for a bacterial group and as specific application to Gram-positive or Gram-negative strains or even all the bacteria.
Another application is the detection of homologous genes from a consensus protein of the same species, such as various cytochromes P450 by specific capture nucleotide sequences with or without the presence of a consensus capture nucleotide sequence for all the cytochromes P450 possibly present in a biological sample. Such detection is performed at the gene level by reverse transcription into cDNA.
The solid support according to the invention can be or can be made with materials selected from the group consisting of glasses, electronic devices, silicon supports, plastic supports, silica, metal or a mixture thereof in format such as slides, compact discs, gel layers, microbeads. Advantageously, said solid support is a single glass slide which may comprise additional means (barcodes, markers, etc.) or media for improving the method according to the invention.
The amplification step used in the method according to the invention is advantageously obtained by well known amplification protocols, preferably selected from the group consisting of PCR, RT-PCR, LCR, CPT, NASBA, ICR or Avalanche DNA techniques.
Advantageously, the target nucleotide sequence to be identified is labeled previously to its hybridization with the single stranded capture nucleotide sequences. Said labeling (with known techniques from the person skilled in the art) is preferably also obtained upon the amplified sequence previously to the denaturation (if the method includes an amplification step).
Advantageously, the length of the target nucleotide sequence is selected as being of a limited length, preferably between 50 and 2000 bases, more preferably between 100 and 400 bases and even more preferably between 100 and 200 bases. This preferred requirement depends on the possibility of finding consensus primers to amplify the required sequences possibly present in the sample. Target nucleotide sequences that are too long may reassociate faster and adopt secondary structures which can inhibit the fixation on the capture nucleotide sequences.
The amplified target nucleotide sequence can be cut before the hybridization, and the use of one capture sequence for each target sequence makes the interpretation of the results easy.
The detection of homologous expressed genes is obtained by first reverse transcription of the mRNA by a consensus primer, the preferred one being the poly dT. In one embodiment, the reverse transcribed cDNA is then amplified by consensus primers as described in this invention.
According to a further aspect of the present invention, the method, kit (device) or apparatus according to the invention is advantageously used for the identification of different Staphylococcus species or variants, preferably S. aureus, S. epidermidis, S. saprophyticus, S. hominis or S. haemolyticus, for homologous organisms present together or separately in the biological sample, said identification being obtained by detecting the genetic variants of the FemA gene in said different species, preferably by using common locations in the FemA genetic sequence (examples 4, 5, 6, 7). In another aspect of the invention, 16 Staphylococcus species could be detected after amplification by the same primers and identification on the array (example 7).
Preferably, the primer(s) and the specific portions of said FemA sequence used for obtaining amplified products are the ones described hereafter in Example 2. These primers have been selected as consensus primers for the amplification of the FemA genes of all of the 16 Staphylococcus tested and they probably will amplify the FemA from all other possible Staphylococcus species.
A further aspect of the invention is the detection of Mycobacteria species, M. tuberculosis and other species, preferably M. avium, M. gastri, M. gordonae, M. intracellulare, M. leprae, M. kansasii, M. malmoense, M. marinum, M. scrofulaceum, M. simiae, M. szulgai, M. xenopi, M. ulcerans (Example 8).
In a further application of the invention, one array can specifically detect amplified sequences from several bacterial species belonging to the same genus (Examples 7 and 8) or from several genera like Staphylococcus, Streptococcus, Enterococcus, Haemophilus (see Table 1) or different bacterial species and genera belonging to the Gram-positive bacteria and/or to the Gram-negative bacteria (Examples 16 and 22).
Preferably, the primer(s) and the specific portions of gyrase (sub-unit A) sequences are used for obtaining amplified products. These primers have been selected as consensus primers for the amplification of the gyrase genes of all of the bacteria tested and they probably will amplify the gyrase from many other possible bacterial species, genera and families.
The invention is particularly suitable for detection of bacteria belonging to at least two of the following genus families: Staphylococcus, Enterococcus, Streptococcus, Haemolyticus, Pseudomonas, Campylobacter, Enterobacter, Neisseria, Proteus, Salmonella, Simonsiella, Riemerella, Escherichia, Meningococcus, Moraxella, Kingella, Chromobacterium, Branhamella.
The array allows the MAGE number to be read by observation of the lines that are positive for signal and bear the specific capture nucleotide sequences.
The same application was developed for the G Protein Coupled Receptors (GPCR). These receptors bind all sorts of ligands and are responsible for the signal transduction to the cytoplasm and very often to the nucleus by modulating the activity of the transcriptional factors. Consensus primers are formed for the various subtypes of GPCR for dopamine and for serotonin and histamine. The same is possible for the histamine and other ligands.
The detection of the various HLA types is also one of the applications of the invention. HLA are homologous sequences which differ from one individual to the other. The determination of the HLA type is especially useful in tissue transplantation in order to determine the degree of compatibility between the donor and the recipient. It is also a useful parameter for immunization. Given the large number of subtypes and the close relation between the homologous sequences it was not always possible to perfectly discriminate one sequence among all the other ones and for some of them there was one or two cross-reactions. In this case, a second capture nucleotide sequence complementary to another location of the amplified sequence was added on the array, in order to make the identification absolute.
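By way of non-limiting illustration, the use of a second capture nucleotide sequence to remove a cross-reaction can be viewed as an intersection of candidate sets: each positive capture sequence is compatible with one or more types, and only the types compatible with every positive capture sequence remain. The probe names, type names and reactivity table below are hypothetical.

```python
def resolve(positive_probes, probe_reactivity):
    """Return the set of types compatible with every positive capture probe.

    probe_reactivity: dict probe name -> set of types that probe binds
                      (its intended type plus any cross-reacting ones)
    """
    candidates = None
    for probe in positive_probes:
        reactive = set(probe_reactivity[probe])
        candidates = reactive if candidates is None else candidates & reactive
    return candidates or set()


# Hypothetical table: P1 cross-reacts with type_B; P2 targets another
# location of the amplified sequence and cross-reacts with type_C instead.
reactivity = {"P1": {"type_A", "type_B"}, "P2": {"type_A", "type_C"}}

print(resolve(["P1"], reactivity))        # still ambiguous with one probe
print(resolve(["P1", "P2"], reactivity))  # resolved by the second probe
```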
Genetic sequences code for proteins so that homologous DNA sequences correspond to homologous amino acid sequences of the encoded proteins while variation in the DNA sequences correspond to variation in amino acid sequence. One embodiment of this invention is to use antibodies for specific capture of proteins from a sample in order to identify the protein and so the organism from which it originates. By choosing appropriate antibodies, the organisms or the group to which it belongs is determined. The HLA typing is given as example of the use of specific antibodies for discriminating the various HLA-A proteins on an array (Example 23).
Discrimination of the Cytochrome P450 forms is one particular application of the invention (Example 14).
Detection of polymorphism sequences (which can be considered as homologous even if differing by only one base) can be made also by the method according to the invention. This is especially useful for the Cytochrome P450 since the presence of certain isoforms modifies the metabolism of some drugs. The invention was found particularly useful for discriminating between the isoforms of Cyto P450 2D6 and 2C19. More generally the invention is particularly well adapted for the discrimination of sequences differing by one base mutation or deletion called Single Nucleotide Polymorphism (SNP). The originality of the invention is to perform the hybridization step directly on the amplified sequences without the necessity to copy into RNA and to cut them into pieces.
Furthermore, one array can specifically detect amplified sequences from several animal species and genera belonging to several families like Galinacea, Leporidae, Suidae and Bovidae (Table 2).
One array can specifically detect amplified sequences from several fish species, such as G. morhua, G. macrocephalus, P. flesus, M. merluccius, O. mykiss, P. platessa, P. virens, S. salar, S. pilchardus, A. thazard, T. alalunga, T. obesus, R. hippoglossoides, S. trutta, S. sarda, T. thynnus, S. scombrus belonging to several genera such as Auxis, Sarda, Scomber, Thunnus, Oncorhynchus, Salmo, Merluccius, Pleuronectes, Platichthys, Reinhardtius, Pollachius, Gadus, Sardina, from several families such as Scombridae, Salmonidae, Merlucciidae, Pleuronectidae, Gadidae and Clupeidae (Table 3). Other homologous sequences allow the determination of plant species and genera such as potato, tomato, oryza, zea, soja, wheat, barley, bean, carrot belonging to several families (Example 19).
According to a further aspect of the present invention, the method, kit (device) or apparatus according to the invention is advantageously used for the identification of the origin of meat (Table 2).
Preferably, the primer(s) and the specific portions of cytochrome b sequences used for obtaining amplified products are the ones described hereafter in Example 3. These primers have been selected as consensus primers for the amplification of the cytochrome B genes of all of the animals tested and they probably will amplify the cytochrome B from many other animal species, genera and families.
According to a further aspect of the present invention, the method, kit (device) or apparatus according to the invention is advantageously used for the identification of the origin of fishes (table 3).
Preferably, the primer(s) and the specific portions of said cytochrome b sequences used for obtaining amplified products are the ones described hereafter in Example 18. These primers have been selected as consensus primers for the amplification of the cytochrome B genes of all of the fishes tested and they probably will amplify the cytochrome B from many other fish species, genera and families.
According to a further aspect of the present invention, the method, kit (device) or apparatus according to the invention is advantageously used for the identification of the origin of plants.
Preferably, the primer(s) and the specific portions of said sucrose synthase sequences used for obtaining amplified products are the ones described hereafter in the examples. These primers have been selected as consensus primers for the amplification of the sucrose synthase genes of all the plants tested and they will probably amplify the sucrose synthase gene from many other plant species, genera and families.
According to a further aspect of the present invention, the method, kit (device) or apparatus according to the invention is advantageously used for the identification of the Genetically Modified Organism (GMO). The GMO are produced by insertion into the genome of an organism of one or several external genes together with other regulating or construction sequences.
Preferably, the primer(s) and the specific portions of said sucrose synthase sequences used for obtaining amplified products are the ones described hereafter in the examples. These primers have been selected as consensus primers.
Homologous DNA or RNA sequences lead to the expression in cells or tissues of proteins which are also homologous to each other. Therefore, a target component to be detected may be a protein which is related to other homologous ones which could be present in the same biological sample. Related proteins means proteins which have some part(s) of their sequence or conformation in common, while said proteins present other part(s) which are specific of the (micro)organisms or of a part of said (micro)organisms from which they originate.
Part or portion of the amino acid sequences are identical between proteins from the same group while other portions are specific of the target to be identified and possibly quantified. Said amino acid sequences present linear or conformational epitopes which can be recognized by specific (monoclonal) antibodies. The discrimination between said specific related targets is possible by specific antibodies or reconstructed antibodies like proteins bearing hypervariable portions of these antibodies. An identification of said common homologous sequences is also possible by using antibodies directed against the common sequence. Therefore, discrimination between groups, subgroups, sub-subgroups and individual proteins can be made in a single experiment.
Preferably, antibodies are bound to the solid support as an array and are used for the specific capture of the target components to be identified. For HLA identification, proteins are classified in class I, II and III antigens. Class I is divided into HLA-A, B, C, E, F and G, each of them being subdivided into HLA types and subtypes as given in the databank IMGT/HLA. There are more than 476 different alleles of the class I HLA antigens. The heavy chains of the HLA complex of type I possess regions such as the α1 and α2 domains which are very polymorphic, while other parts such as the α3 domain are more conserved (Auffray and Strominger, 1986 Advanced Hum. Genet. 15:197). Class II is divided into HLA-DR, HLA-DP and HLA-DQ. There are more than 430 alleles of the HLA class II. Each type is subdivided into subtypes and sub-subtypes which can be discriminated according to the present invention (Example 23).
In one of the aspects of the invention, typing of Cytochrome P450 proteins is performed using the antibodies directed against cytochrome P450 1A1, 1A2, 2A6, 2C11, 3A4, 4A. These antibodies are available from ABR (Golden, Colo., USA).
According to a further aspect of the present invention, the method, kit (device) or apparatus according to the invention is advantageously used for the identification of the organisms or part of it as provided in the examples cited here above and also the ones presented in the Examples 1 to 23.
Another aspect of the present invention is related to any part of biochips or microarray comprising said above described sequences (especially the specific capture nucleotide sequence described in the examples) as well as a general screening method for the identification of a target sequence specific of said microorganisms of family type discriminated from homologous sequences upon any type of microarrays or biochips by any method.
After hybridization on the array, the target sequences can be detected by current techniques. Without labeling, preferred methods are the identification of the target by mass spectrometry now adapted to the arrays (U.S. Pat. No. 5,821,060) or by intercalating agents followed by fluorescent detection (WO 97/27329).
The labeled associated detections are numerous. A review of the different labeling molecules is given in WO97/27317. They are obtained using either already labeled primer or by incorporation of labeled nucleotides during the copy or amplification step. A labeling can also be obtained by ligating a detectable moiety onto the RNA or DNA to be tested (a labeled oligonucleotide, which is ligated, at the end of the sequence by a ligase). Fragments of RNA or DNA can also incorporate labeled nucleotides at their 5′-OH or 3′-OH ends using a kinase, a transferase or a similar enzyme.
The most frequently used labels are fluorochromes like Cy3, Cy5 and Cy7 suitable for analyzing an array by using commercially available array scanners (General Scanning, Genetic Microsystem). Radioactive labeling, cold labeling or indirect labeling with small molecules recognized thereafter by specific ligands (streptavidin or antibodies) are common methods. The resulting signal of target fixation on the array is either fluorescent, colorimetric, diffusion, electroluminescent, bio- or chemiluminescent, magnetic, electric like impedometric or voltammetric (U.S. Pat. No. 5,312,527). A preferred method is based upon the use of the gold labeling of the bound target in order to obtain a precipitate or silver staining which is then easily detected and quantified by a scanner.
Quantification has to take into account not only the hybridization yield and detection scale on the array (which is identical for target and reference sequences) but also the extraction, the amplification (or copying) and the labeling steps.
The method according to the invention may also comprise means for obtaining a quantification of target nucleotide sequences by using a standard nucleotide sequence (external or internal standard) added at known concentration. A capture nucleotide sequence is also present on the array so as to fix the standard in the same conditions as said target (possibly after amplification or copying); the method comprising the step of quantification of a signal resulting from the formation of a double stranded nucleotide sequence formed by complementary base pairing between the capture nucleotide sequences and the standard and the step of a correlation analysis of signal resulting from the formation of said double stranded nucleotide sequence with the signal resulting from the double stranded nucleotide sequence formed by complementary base pairing between capture nucleotide sequence(s) and the target in order to quantify the presence of the original nucleotide sequence to be detected and/or quantified in the biological sample.
Advantageously the standard is added in the initial biological sample or after the extraction step and is amplified or copied with the same primers and/or has a length and a GC content identical to, or differing by no more than 20% from, those of the target. More preferably, the standard can be designed as a competitive internal standard having the characteristics of the internal standard found in the document WO 98/11253. Said internal standard has a part of its sequence common to the target and a specific part which is different. It also has at or near its two ends sequences which are complementary of the two primers used for amplification or copy of the target and similar GC content (WO 98/11253). In the preferred embodiment of this invention, the common part of the standard and the target means a nucleotide sequence which is homologous to all targets amplified by the same primers (i.e. which belong to the same family or organisms to be quantified).
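The length and GC content criterion stated above can be illustrated by the following minimal sketch. It assumes that "no more than 20%" is read as a relative difference with respect to the target; the function names are hypothetical and are not part of the described method.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for base in seq if base in "GC") / len(seq)


def within_tolerance(value: float, reference: float, tol: float = 0.20) -> bool:
    """True if value differs from reference by no more than tol (relative).

    Assumption: the 20% criterion is interpreted relative to the target.
    """
    return abs(value - reference) <= tol * reference


def standard_matches_target(standard: str, target: str, tol: float = 0.20) -> bool:
    """Check both the length and the GC content of a candidate standard."""
    return (within_tolerance(len(standard), len(target), tol)
            and within_tolerance(gc_content(standard), gc_content(target), tol))
```

A candidate standard failing either test would, under this reading, be redesigned before being used for quantification.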
Preferably, the hybridization yield of the standard through this specific sequence is identical to, or differs by no more than 20% from, the hybridization yield of the target sequence, and quantification is obtained as described in WO 98/11253.
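The correlation between the standard signal and the target signal may be sketched as follows. This is an illustration only: it assumes a linear detection response and equal amplification and hybridization yields for standard and target, and it is not the procedure of WO 98/11253. The function name and the background parameter are hypothetical.

```python
def quantify_target(target_signal: float,
                    standard_signal: float,
                    standard_amount: float,
                    background: float = 0.0) -> float:
    """Estimate the target amount from the ratio of target to standard signal.

    Assumptions (not from the source): linear signal response, and identical
    amplification and hybridization yields for standard and target.
    """
    net_target = target_signal - background
    net_standard = standard_signal - background
    if net_standard <= 0:
        raise ValueError("standard signal must exceed the background")
    # A target signal at or below background is reported as zero.
    return max(net_target, 0.0) * standard_amount / net_standard
```

For example, with a standard added at 1000 copies, a standard spot reading of 600, a target spot reading of 1100 and a background of 100, the sketch returns an estimate of 2000 copies.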
Said standard nucleotide sequence, external and/or internal standard, is also advantageously included in the kit (device) or apparatus according to the invention, possibly with all the media and means necessary for performing the different steps according to the invention (hybridization and culture media, polymerase and other enzymes, standard sequence(s), labeling molecule(s), etc.).
Advantageously, the solid support of the biochips also contains spots with various concentrations (e.g. 4) of labeled capture nucleotide sequences. These labeled capture nucleotide sequences are spotted from solutions of known concentration and their signals allow the conversion of the hybridization results into absolute amounts. They also allow the reproducibility of the detection to be tested.
The solid support of the biochips can be inserted in a support connected to another chamber and automatic machine through the control of liquid solution based upon the use of microfluidic technology. By being inserted into such a microlaboratory system, it can be incubated, heated, washed and labeled by automated devices, even for preliminary steps (like extraction of DNA, genetic amplification steps) or the identification and discrimination steps (labeling and detection). All these steps can be performed upon the same solid support.
The present invention is also related to a method to identify homologous sequences (and the groups to which they belong and possibly the organisms and their groups) possibly present in a biological sample by assay of their genetic material in an array-type format. The method is well adapted for determination of organisms belonging to several groups being themselves members of a super-group. The method is for example well adapted for a biological determination and/or classification of animals, plants, fungi or micro-organisms.
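The conversion of hybridization signals into absolute amounts by means of the labeled reference spots may be sketched as a simple calibration line. The linear model, the least-squares fit and the numerical values are assumptions made for illustration; the source does not specify the form of the calibration.

```python
def fit_line(amounts, signals):
    """Least-squares fit of signal = slope * amount + intercept."""
    n = len(amounts)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two (amount, signal) pairs")
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    if sxx == 0:
        raise ValueError("reference amounts must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def signal_to_amount(signal, slope, intercept):
    """Invert the calibration line to convert a spot signal into an amount."""
    if slope == 0:
        raise ValueError("calibration slope is zero")
    return (signal - intercept) / slope


# Hypothetical reference spots: 4 known amounts and their measured signals.
slope, intercept = fit_line([1, 2, 4, 8], [110, 210, 410, 810])
amount = signal_to_amount(510, slope, intercept)
```

The spread of replicate reference spots around the fitted line can likewise serve as an indication of the reproducibility of the detection.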
The method involves the use of multiple capture nucleotide sequences present as arrays, the capture of the corresponding target sequences and their analysis and possibly their quantification. The method also allows the identification of these organisms and their groups by characterization of the positive area of the arrays bearing the required capture nucleotide sequences. One particular feature of the invention is that a positive hybridization resulting in one spot on the array gives the person skilled in the art the information necessary for identifying the sequence, the organism, or the group or sub-group to which it belongs.
It also provides a method for sequential analysis, in which the presence of any of the organisms searched for is detected during the genetic amplification, followed by the detection of amplicons on the array and the identification of the corresponding organisms or groups thereafter.
Furthermore, the inventors have discovered that it is possible to obtain by the method of the invention a very quick and easy identification of such multiple sequences belonging to several groups or sub-groups or sub-sub-groups of sequences being homologous to each other, down to possible individual sequences, by combining a single nucleotide amplification, preferably by PCR, using common primer pair(s) together with an identification of the organisms at different level(s) by detecting and possibly recording upon an array having at least 5 different bound single stranded capture nucleotide sequences/cm2 of solid support surface, the presence of a single signal resulting from the binding between a capture sequence and its (or their) corresponding target sequence(s) and thereafter correlating the presence of said detected target sequences to the identification of a specific genetic sequence among the other ones. The method is especially well adapted for the identification of organism species, genus and family through the analysis of a given part of their genome or gene expressed, these sequences being homologous to each other in the different organisms.
A single signal means a signal which by itself is sufficient to identify one or more target nucleotide sequence(s) to which it is designed and therefore to give (if necessary) an unambiguous response for the presence or not of the organisms or groups of organism present in the sample or the organisms or group of organisms from which said sample has been obtained.
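The notion of a single signal being sufficient for identification can be illustrated by a direct lookup from spot location to the group, sub-group or organism that the spot was designed for. The layout below is hypothetical; only the taxon names are taken from the fish example of this description.

```python
# Hypothetical array layout: (row, column) -> (classification level, name).
ARRAY_LAYOUT = {
    (0, 0): ("family", "Salmonidae"),
    (0, 1): ("genus", "Salmo"),
    (0, 2): ("species", "S. salar"),
    (1, 0): ("family", "Scombridae"),
    (1, 1): ("genus", "Thunnus"),
    (1, 2): ("species", "T. thynnus"),
}


def identify(positive_spots):
    """Translate each positive spot into the taxon it was designed to detect.

    Each spot is interpreted on its own; no pattern analysis is involved.
    """
    result = {}
    for position in sorted(positive_spots):
        if position in ARRAY_LAYOUT:
            level, name = ARRAY_LAYOUT[position]
            result.setdefault(level, []).append(name)
    return result
```

In this sketch a positive spot at position (0, 2) alone is read as the presence of S. salar, without reference to the state of any other spot.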
The method and device according to the invention allow easy identification/detection of specific nucleotide sequences among other possible amplified nucleotide sequences and possibly their quantification (characterization of the number of copies or presence of said organisms in a biological sample), said target nucleotide sequences having a nucleotide sequence specific of said organisms or groups of organisms.
The array may contain capture nucleotide sequences from several genera of organisms and from several species of these genera. The capture nucleotide sequences may detect the genus, the species and also the family(ies) to which these genera belong. The capture nucleotide sequences may also detect the sub-species and even the individual organisms of one or several species. Individual organisms of a given species are considered as having very homologous sequences differing mainly by single bases within some of their DNA sequences or genes. Homology is important for getting consensus primers and a single base change is sufficient to obtain discrimination between two target amplicons. If the discrimination is not complete, it can be confirmed by the use of second capture nucleotide sequences present upon the array and able to bind the same amplicon at a different sequence location.
Said identification is obtained firstly by a genetic amplification of said nucleotide sequences (target sequences) using a common primer pair, followed (after washing) by discrimination between the possible different amplified targets according to the above described method.
The amplified sequences may belong to the same gene, may be part of the same DNA locus and are homologous to each other.
The method according to the invention further comprises the step of correlating the signal of detection (possibly recorded) to the presence of:
specific organism(s) groups,
specific organism(s) sub-groups until the possible individuals,
genetic characteristics of a sequence from an organism,
polymorphism of said sequence,
genotyping of organisms based on differences in DNA or RNA sequences,
diagnostic predisposition or evolution (monitoring) of genetic diseases, including cancer of a patient (including the human) from which the biological sample has been obtained.
The method also applies to the identification and possibly characterization of nucleotide sequences as such, independently of the organism. Genes or DNA sequences can be classified in groups, sub-groups and sub-sub-groups according to their sequence homology. Bioinformatic programs exist for sequence alignment and comparison (such as Clustal, Intelligenetics, Mountain View, Calif., or GAP, BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, Madison, Wis., USA, or Boxshade). A classification can be made according to the percentage of homology and alignment of the sequences. One interest of detecting and identifying the sequences from a given family in a given organism, tissue or cell is, for example, the possibility of detecting the effect of any given molecules or biological or pathological conditions (by proteomics, functional genomics, etc.) upon both the overall and the specific genes of one or several families.
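As an illustration of a classification based on the percentage of homology, the following sketch computes the percent identity between two sequences that have already been aligned (for instance by one of the programs cited above). Using identity over the non-gap positions as the homology measure is an assumption of this sketch, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Positions where either sequence carries a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

Sequences whose pairwise identity exceeds a chosen cut-off could then be placed in the same group, with stricter cut-offs defining sub-groups.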
The inventors also found that sensitivity of the assay was increased by using high density of capture nucleotide sequences fixed on the support, being preferably higher than about 100 fmoles/cm2 of solid support surface.
The capture nucleotide sequences specific for the determination of a group of organisms are designed so as to be able to specifically capture the different sequences belonging to the various groups. These capture nucleotide sequences are called consensus for this group of organisms. The consensus capture nucleotide sequences may contain specific sequences which are longer than the specific capture nucleotide sequences of the different members of the group. These capture nucleotide sequences are consensus sequences (i.e. sequences containing at each position the base which is the most frequent in the different sequences of the members of the group when aligned). In another embodiment the consensus capture nucleotide sequence has the length of the amplified sequences.
The inventors have found, unexpectedly, that the identification of several organisms of several groups can be performed at the organism level as well as at the group level in the same experimental conditions. Identification of the groups requires long capture nucleotide sequences while the specific identification of the organism requires small but specific capture sequences. The inventors found that using the characteristic of the invention, mainly the binding of the specific part of the sequences onto a spacer, it was possible to obtain both results in the same experimental conditions. The invention also allows the use of the same stringency conditions, mainly determined by the salt concentration, the temperature and the rate of reaction.
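The definition of a consensus sequence given above (the most frequent base at each position of the aligned sequences) can be sketched as follows. The tie-breaking rule, which takes the alphabetically first base, is an assumption of the sketch and is not specified in this description.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Return the most frequent base at each position of aligned sequences.

    Ties are resolved by taking the alphabetically first base (assumption).
    """
    if not aligned_sequences:
        raise ValueError("no sequences given")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("aligned sequences must have the same length")
    result = []
    for column in zip(*aligned_sequences):
        counts = Counter(base.upper() for base in column)
        # sorted() fixes the order so that max() resolves ties predictably.
        result.append(max(sorted(counts), key=counts.get))
    return "".join(result)
```

For three aligned group members ATGCA, ATGGA and TTGCA, the sketch returns ATGCA.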
According to the invention, organisms are identified as such by their specific polymorphism. Single base substitution in a particular location of genome is the characteristic of an individual organism among others of the same species. The method for identification of the polymorphism is part of the invention with direct hybridization of the amplified sequences on the capture nucleotide sequences of the array and detection of the fixed target sequence.
The detection of the target sequence being bound on capture nucleotide sequences is obtained through the labeling of the capture nucleotide sequence on which the target sequence is bound. A step of capture nucleotide sequences labeling is added after the hybridization step. The extension of the free end of the capture nucleotide sequence (preferably the 3′ end) is performed using a detectable nucleotide, preferably a biotin-labeled or fluorescent nucleotide, and a polymerization agent, preferably a DNA polymerase, together with the reagents necessary for the extension. The target sequence hybridized on the capture nucleotide sequence serves as template for the extension; the hybridized target sequences are then removed from the capture nucleotide sequence, rehybridized, and the extension of the capture nucleotide sequence is performed again.
The invention allows identification of the presence of a polymorphism by using an array having at least five different bound single stranded capture polynucleotide sequences/cm2 of solid support surface, the determination of a single signal resulting from the binding between the capture sequence and the target sequence, extending at least one polynucleotide primer of the hybrid beyond the 3′ terminal nucleotide thereof in the 3′ to 5′ direction using the polynucleotide sequence as a template, said extension being effected in the presence of a polymerization agent and nucleotide precursors wherein at least one nucleotide incorporated into the extended primer molecule is a detectably-modified nucleotide; denaturing the duplex to free the target sequence from the polynucleotide capture nucleotide sequence, carry out step one or more times and detecting the presence of a signal associated with the detectable modified nucleotide in the extended capture nucleotide sequence at the reaction zone to effect said determination.
The process is repeated as needed to obtain a signal detectable on the array. A preferred signal is obtained in colorimetry using the silver precipitation as proposed and detection of the array on a colorimetric detector (WO 00/72018). The arrays may be present on the surface of multiwell plates, with multiwell plate detectors used for reading the results.
In another embodiment, a second labeled nucleotide sequence complementary to the target sequence and adjacent to the capture nucleotide sequence is added on the hybridized amplicons and a ligation is performed. If the last base of the capture nucleotide sequence is complementary to the target sequence, then ligation will occur and the spot is labeled. If not, ligation will not occur even if the target amplicon is hybridized on the capture nucleotide sequence.
In a particular embodiment the array bears, in separate areas, several capture nucleotide sequences which are identical except for one nucleotide located at the same place in the capture nucleotide sequence, namely the last base at the free end, which is the interrogation base. The array is then able to identify the presence of any of the 4 bases present at a given location of the sequence. Such an array is especially useful when detecting a polymorphism in homozygote or heterozygote organisms or when the polymorphism is not known.
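The reading of four spots differing only by their interrogation base can be sketched as below. The sketch assumes that each spot is keyed by the interrogation base of its capture sequence, that the target base is its Watson-Crick complement, and that a fixed signal threshold separates labeled from unlabeled spots; the threshold and the function name are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def call_genotype(spot_signals, threshold):
    """Infer the target base(s) at the polymorphic site from four spots.

    spot_signals maps the interrogation base of each capture sequence
    ("A", "C", "G", "T") to the signal measured on its spot.
    """
    positive = [base for base, signal in spot_signals.items()
                if signal > threshold]
    # The base present in the target is complementary to the interrogation base.
    target_bases = sorted(COMPLEMENT[base] for base in positive)
    if len(target_bases) == 1:
        zygosity = "homozygote"
    elif len(target_bases) == 2:
        zygosity = "heterozygote"
    else:
        zygosity = "undetermined"
    return target_bases, zygosity
```

One labeled spot is read as a homozygote and two labeled spots as a heterozygote; any other outcome is left undetermined in this sketch.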
In the method, kit (device) or apparatus according to the invention, the portion(s) (or part(s)) of the capture nucleotide sequences complementary to the target sequence belong to at least two families. In the first family, the binding part comprises between about 5 and about 60 bases, preferably between about 15 and about 40 bases and more preferably between about 20 and about 30 bases. In the second capture family, the binding parts of the capture nucleotide sequences comprise between about 10 and 1000 bases and preferably between 100 and 600 bases. These bases are preferably assigned as a continuous sequence located at or near the extremity of the capture nucleotide sequence. This sequence is considered as the specific sequence for the detection. In a preferred form of the invention, the sequence located between the specific capture nucleotide sequence and the support surface is a non-specific sequence.
In another preferred embodiment of the invention, the first family of capture nucleotide sequences detect the members of a group while the second family of capture nucleotide sequences detect the group as such.
However, both families of capture nucleotide sequences can be polynucleotides.
All the capture sequences present on the array necessary for capturing the target sequences are polynucleotides and are able to detect both the members of a group and the groups or sub-groups themselves.
The consensus primers can be chosen in order to amplify different sequences and groups of sequences.
The same pair of primers amplifies several groups of sequences being different for the different groups of homologous sequences, each one being associated with one or several group of organism.
The pair of consensus primers may be associated with group identification and/or for species identification on the array.
A second or third primer (or even more) may be added for the amplification step in order to possibly amplify other sequences, related or not to one particular group, which are useful to detect in the sample. Viruses liable to be present in a clinical sample together with bacteria are one example where such an extension of the invention is particularly useful, as in the combination of the virus detection of Example 17 with the bacteria detection of Examples 7, 8 or 16.
Two pairs of (possibly consensus) primers may be used for the amplification, one for amplification of sequences of the gram-positive bacteria and the other one for the gram-negative bacteria; the amplified sequences are specific of each of the gram-positive or the gram-negative bacteria and are detected thereafter on the array as specific bacterial species and/or genus and/or family.
Each of the two primer pairs amplifies various sequences specific of one or several families which are then detected as specific species and/or genus and/or families on the array.
The same array can also bear capture nucleotide sequences specific for bacterial families or genus.
In one preferred embodiment of the invention, the presence of any member of the groups is first detected during the PCR using a method such as real time PCR, and the amplicons are thereafter used for identification on the array.
Real time PCR is performed in specific machines which along the PCR cycle detect the appearance of fluorescence in the solution. Increase in fluorescence is due to the insertion of fluorochromes such as in the double stranded amplicons produced during the PCR cycles.
Specific fluorescent labeled nucleotide sequences are added to the PCR solution for specific identification of the amplicons. These nucleotide sequences are complementary to the amplified target sequences and their fluorescence emission is limited by the presence at the right position of a scavenger. Once the nucleotide sequence is digested by the polymerase during the copying of the amplicons, the fluorochrome is released in solution where it is detected. Said method is called Fluorescence Resonance Energy Transfer (FRET). The sequence is chosen so as to bind to a consensus region of the detected amplicons, or several nucleotide sequences are chosen in consensus regions specific of the groups of sequences or organisms to be detected. These nucleotide sequences are preferably labeled with different fluorochromes so as to identify the group during the amplification step.
The fluorescent signal of the amplification solution is registered and, if it crosses a threshold, the solution is processed for hybridization on capture nucleotide sequences of the array. In a preferred embodiment a solid support bearing the array is placed in the amplification chamber and used in the hybridization process. In another preferred embodiment the hybridization is performed on the surface of the same chamber as the PCR. Chambers, preferably closed chambers, can be of any size, format and material compatible with arrays as already mentioned here above. The chambers may be made of polymers such as polycarbonate or polypropylene, or of glass such as capillaries. Polyacrylate based surfaces are particularly useful since they are transparent to light and allow covalent binding of the capture probes necessary for the arrays. The free end of the capture nucleotide sequence can be either a 5′ or 3′-OH or a phosphate group modified in order to avoid elongation. Preferably, the specific sequence portion of the capture nucleotide sequence has a melting temperature lower than that of the primers used for the amplification in order to avoid hybridization during the PCR cycles. The hybridization may also be performed at a given temperature using the heating and control system of the amplification cycler. A control process on the amplification cycler determines whether or not to continue with the detection on the array after the amplification steps.
The real time PCR may be performed with the primers amplifying the gram-positive and/or the gram-negative bacteria, and thereafter the families and/or the genes and/or the species are identified on the array.
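The control step deciding whether the amplification solution is processed on the array can be sketched as a threshold test on the fluorescence registered at each PCR cycle. The fixed threshold and the function names are assumptions made for illustration; an actual cycler would apply its own baseline correction.

```python
def threshold_cycle(fluorescence_per_cycle, threshold):
    """Return the first cycle (1-based) whose reading reaches the threshold.

    Returns None if the threshold is never reached.
    """
    for cycle, value in enumerate(fluorescence_per_cycle, start=1):
        if value >= threshold:
            return cycle
    return None


def proceed_to_hybridization(fluorescence_per_cycle, threshold):
    """Decide whether the amplified solution is hybridized on the array."""
    return threshold_cycle(fluorescence_per_cycle, threshold) is not None
```

A sample whose fluorescence never reaches the threshold is thus reported as negative without the array detection being run.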
One embodiment of the invention is to combine in one process the real time PCR together with the hybridization on capture probes for identification of the target molecules or organisms. In a preferred embodiment the process is performed in the same chamber and with the same machine device.
The present invention also covers the machine and apparatus necessary for performing the various steps of the process mainly for diagnostic and/or quantification of a (micro)organism or component possibly present in a sample among at least two, preferably at least 4 other related (micro)organisms which comprises:
capture molecules being bound to an insoluble solid support at specific locations according to an array, said capture molecules being able to discriminate between related (micro)organisms or components, said array having a density of at least 4 discrete regions per cm2 solid support surface;
a detection and/or quantification device of a signal formed at the location of the binding between said target compound with said capture molecule;
possibly reading device of information recorded upon said solid support;
a computer program to recognize the discrete regions bearing the target molecules and their locations; and
correlating the presence of the signal at these locations with the detection and/or quantification of the said (micro)organism or component.
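The recognition of the discrete regions bearing target molecules may be sketched as follows. The cut-off used here (mean of the negative control spots plus k standard deviations) is a common convention adopted as an assumption; it is not prescribed by this description, and the names and values are hypothetical.

```python
from statistics import mean, pstdev


def positive_regions(signals, negative_controls, k=3.0):
    """Return the locations whose signal exceeds the negative-control cut-off.

    signals maps a (row, column) location to its measured signal.
    Assumption: cut-off = mean + k * standard deviation of negative controls.
    """
    cutoff = mean(negative_controls) + k * pstdev(negative_controls)
    return sorted(loc for loc, value in signals.items() if value > cutoff)


# Hypothetical readings for three discrete regions and four negative controls.
hits = positive_regions({(0, 0): 200, (0, 1): 12, (1, 0): 15},
                        negative_controls=[10, 12, 8, 10])
```

The locations returned are then correlated with the capture molecule known to be bound at each of them in order to report the (micro)organism or component detected.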
In a particular embodiment, this apparatus also performs the genetic amplification of the nucleotide sequences by PCR performed previously or in real time together with the identification of a (micro)organism or its components.
Detection of other sequences can be advantageously performed on the same array (e.g. by allowing a hybridization with a standard nucleotide sequence used for the quantification, with consensus capture nucleotide sequences for the same or different micro-organism strains, with a sequence allowing the detection of a possible antibiotic resistance gene of micro-organisms, or for positive or negative control of hybridization). Said other capture nucleotide sequences have (possibly) a specific sequence longer than 10 to 60 bases and a total length as high as 600 bases and are also bound upon the insoluble solid support (preferably in the array made with the other bound capture nucleotide sequences related to the invention).
These characteristics, described in detail for a specific detection and analysis of nucleotide sequences, can be adapted by the person skilled in the art for other components of (micro)organisms such as receptors, antibodies, enzymes, etc.
The present invention will be described in detail in the following non-limiting examples in reference to the enclosed figures and tables.
Table 1 presents identification of 3 gram-positive and 1 gram-negative bacteria at the genus level (horizontally) and at the species level (vertically). These bacteria are detected with the method of the invention on biochips after PCR amplification with consensus primers. The PCR was realized on the gyrase (sub-unit A) sequences.
(*) Identification of the species
Table 2: The identification of meat animals at the family level (horizontally) and at the genus and species levels (vertically) (3 levels of classification), detected with the method of the invention on biochips after PCR amplification with consensus primers. The PCR was realized on Cytochrome B gene sequences.
Table 3 presents the identification of fishes at the family level (horizontally) and at the genus and species levels (vertically) (3 levels of classification), detected with the method of the invention on biochips after PCR amplification with consensus primers. The PCR was realized on Cytochrome B gene sequences.
The inventors have also discovered that it is possible to drastically simplify the identification of one or several (micro)organisms among many other ones having different sequences and being present in a sample even at different concentrations by combining a single amplification using a common primer pair together with amplification with primers specific of the different nucleotide sequences. In some embodiments, the invention involves detecting and possibly recording the presence of a single signal resulting only from a binding between an immobilized capture sequence and its corresponding target sequence and correlating the presence of said detected target sequence to the identification of a genetic sequence specific of said (micro)organism(s). The method and device according to the invention allow the easy identification/detection of a specific target sequence among other sequences and possibly its quantification (characterization of the number of copies or presence of said organisms in a biological sample), said target sequence having a nucleotide sequence specific of said (micro)organisms.
The present invention provides a method for identifying and/or quantifying an organism or part of an organism in a sample by detecting a nucleotide sequence specific of said organism, among at least 4 other nucleotide sequences from other organisms or from parts of the organism comprising the steps of:
In some embodiments, the identification is performed directly or after washing off possible contaminants (unbound sequences), by detecting and possibly recording a single spot signal at one specific location, wherein said capture nucleotide sequence was previously bound and said identification is the result of said signal at the expected location and is not a result of an analysis of a specific pattern upon a microarray as proposed in the system of the state of the art. Therefore, said method and device do not necessarily need a detailed analysis of said pattern by an image processing and a software analysis.
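By way of illustration only, the single-location read-out described above can be sketched as follows. The array layout, organism names, signal values and threshold are hypothetical and are not taken from the present description; the sketch only shows that each call relies on the signal at one expected location rather than on a pattern across the array.

```python
# Hypothetical layout: (row, column) of each spot -> organism whose
# capture nucleotide sequence was previously bound at that location.
ARRAY_LAYOUT = {
    (0, 0): "organism A",
    (0, 1): "organism B",
    (1, 0): "organism C",
}


def identify(signals, threshold):
    """Return the organisms whose own spot signal exceeds the threshold.

    signals:   dict mapping (row, column) -> measured signal (arbitrary units).
    threshold: assumed cut-off above background (illustrative value only).
    """
    calls = []
    for location, organism in ARRAY_LAYOUT.items():
        # Each call uses only the signal at the expected location.
        if signals.get(location, 0.0) > threshold:
            calls.append(organism)
    return sorted(calls)


example_signals = {(0, 0): 12.0, (0, 1): 850.0, (1, 0): 9.5}
print(identify(example_signals, threshold=100.0))
```

In this sketch no image-wide pattern analysis is needed: a simple lookup of the expected location is sufficient to make the call.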
This invention was made possible by discovering that target sequences can be discriminated from other ones upon an array with high sensitivity by using bound capture nucleotide sequences composed of at least two parts, one being a spacer bound by a single and advantageously predetermined (defined) link to the support (preferably a non porous support) and the other part being a specific nucleotide sequence able to hybridize with the nucleotide target sequence. The target molecule binds to its specific complementary sequence of the probe and this sequence is separated from the solid support surface by nucleotides acting as a spacer. Such configuration of the capture molecules leads to a high hybridization yield and/or to a stabilization of the target sequences which makes possible the detection of full length molecules even in the presence of their complementary sequences present in the same hybridization solution. This effect is reproducible and valid for different target molecule to be detected. This result also allows hybridization of the full length amplified sequences without them being further cut into pieces or without them being transformed into single strand sequences, which was unexpected given the constraints of the hybridization on solid support.
Furthermore, said detection is greatly increased, if high concentrations of capture nucleotide sequences are bound to the surface of the solid support.
In some embodiments, the present invention is related to the identification of a target sequence obtained from a biological (micro)organism or a portion thereof. For example, the target gene may be present in a biological sample which contains at least 4 other (micro)organisms or portions thereof. Such a method is also well applicable to detection of the components or portions of an organism such as different gene transcripts.
In some embodiments, said identification is obtained firstly by a genetic amplification of said nucleotide sequences (target and homologous sequences) by a common primer pair together with primers specific for the nucleotide sequences followed (after washing if necessary) by a discrimination between the possible different targets amplified. In some embodiments, said discrimination is advantageously obtained by hybridization upon a surface containing capture nucleotide sequences at a given location, specific for a target specific for each (micro)organism which may be possibly present in the biological sample and by the identification of said specific target through the identification and possibly the recording of a signal resulting from the specific binding of this target upon its corresponding capture sequence at the expected location (single location signal being specific for the target).
This embodiment of the invention is related to an unexpected improvement of multiplex amplification methods, preferably a PCR amplification, working in tandem with the detection on immobilized capture molecules allowing analysis of at least 5, 10, 20, 40 different polynucleotide target sequences being possibly present at different concentrations in a given sample. In the amplification step, this embodiment of the invention prevents the use of high concentrations of primers specific for each of the nucleotides to be detected as in a normal PCR. This embodiment of the invention opens the way for the detection of unrelated sequences and is useful in many biological applications such as pathogen detection or the identification of transcripts or of different polymorphisms.
According to this embodiment of the invention, the preferred method for genetic amplification is the PCR. Each nucleotide sequence to be detected is amplified by the combination of two primer pairs; the first one being specific for the nucleotide sequence and leading to the production of derived sequences and the second one being common (universal primer pair) for all the target nucleotides which will be detected and identified thereafter. The derived sequences are best obtained by amplification (or copy) of the nucleotide sequences present in the sample using specific primer pairs, each member of said primer pair comprising a sequence complementary to one of the two strands of a given polynucleotide sequence and a common sequence (U) serving as a universal amplifying sequence, being identical for all the specific primers. In some embodiments, U is at least 15, and preferably at least 20 nucleotides, in length and is located at the 5′ end of the primers. Preferably, U does not contain more than 10 and, more preferably no more than 5, successive bases complementary to the target polynucleotides. During the first amplification cycles, the common sequence (U) will be incorporated into the amplicons and it will then serve in the later cycles as a universal amplifying sequence. Its complementary sequence will then be recognized by the unique primer pair which in this particular embodiment is composed of a single primer having a sequence identical to the common sequence of the specific primers. “Identical sequences” means that they share at least 10 and, more preferably, 15 bases.
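The design rules given above for the common sequence (U) can be illustrated by a minimal sketch. The function names and the example sequences are hypothetical; the sketch assumes plain A/C/G/T strings and treats "successive bases complementary to the target" as a stretch of U able to base-pair with either strand of the target.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G and T."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def longest_complementary_run(u, target):
    """Length of the longest stretch of u able to pair with either target strand."""
    strands = (target, reverse_complement(target))
    best = 0
    for start in range(len(u)):
        # Only stretches longer than the current best are of interest.
        for end in range(start + best + 1, len(u) + 1):
            probe = reverse_complement(u[start:end])
            if any(probe in strand for strand in strands):
                best = end - start
            else:
                break  # a longer stretch from this start cannot pair either
    return best


def tailed_primer(u, specific):
    """Place the common sequence U at the 5' end of the target-specific part."""
    return u + specific


def check_u(u, target, min_len=15, max_run=10):
    """Apply the two rules from the text: minimum length of U and maximum
    number of successive bases complementary to the target."""
    return len(u) >= min_len and longest_complementary_run(u, target) <= max_run


# Hypothetical sequences, for illustration only.
U = "GACTCGAGTCGACATCGATT"
SPECIFIC = "ATGGCTAGCTAGGATCC"
print(tailed_primer(U, SPECIFIC))
```

The preferred limits of the text (U of at least 20 nucleotides, no more than 5 successive complementary bases) can be tested by calling `check_u(u, target, min_len=20, max_run=5)`.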
In a particular embodiment, the two members of a given primer pair have two different common sequences (U1 and U2) which are used as two universal amplifying sequences. Each specific nucleotide sequence needs two primers to be amplified, each one being complementary of one of the two nucleotide opposite strands. In this particular embodiment, the two primers of a given pair have two different common sequences and the unique primer pair is composed of a pair of sequences having a sequence identical to these two common sequences (U1 and U2). Preferably, the identical sequences share at least 10 and, more preferably, 15 bases.
In another embodiment, the length of the sequence complementary to one of the two strands of a given polynucleotide sequence of the specific primer pair is at least 6 and, more preferably, at least 15 nucleotides. In another embodiment, the sequences complementary to the strands of the polynucleotide sequences of the specific primer pairs show a homology of lower than 50% and preferably lower than 30%.
In the preferred embodiment, the nucleotide sequences of the sample to be detected have less than 50% and preferably less than 30% homology to each other. In a particular embodiment, the amplified target sequences likewise show a homology lower than 50% and preferably lower than 30%, so that they are not considered as homologous to each other.
The method according to this embodiment of the invention may further comprise the step of correlating the signal of detection (possibly recorded) to the presence of specific (micro)organism(s), genetic characteristics of a sequence, polymorphism of a sequence, diagnostic predisposition or evolution (monitoring) of genetic diseases, including cancer of a patient (including a human) from which the biological sample has been obtained.
Therefore, said (micro)organisms could be present in any biological material including genetic material obtained. The biological material may comprise virus, fungi, bacteria, plant or animal cells, including biological samples, obtained from humans. The biological sample can be also any culture medium wherein microorganisms, xenobiotics or pollutants are present, as well as such extract obtained from a plant or an animal (including a human) organ, tissue, cell or biological fluid (blood, serum, urine, etc).
The method according to this embodiment of the invention is performed by using a specific identification (diagnostic and/or quantification) kit or device comprising at least an insoluble solid support upon which are bound single stranded capture nucleotide sequences. Preferably the capture molecules are bound to an insoluble solid support (by a direct covalent link or by the intermediate of a spacer) at a specific location according to an array, said array having a density of at least 4, preferably at least 10, 16, 20, 50, 100, 1000, 4000, 10 000 or more, different bound single stranded capture nucleotide sequences/cm2 insoluble solid support surface. In another embodiment, the capture probes are bound to different solid supports. In some embodiments, the different solid supports are beads, each bead having bound a capture molecule specific for a target so that identification of the location of the binding of a specific capture molecule can be performed.
In some embodiments, the single stranded capture nucleotide sequences have a length comprised between about 30 and about 600 bases (including the spacer) and contain a specific sequence or capture portion (able to hybridize with their corresponding target nucleotide) of at least 15, preferably at least 40, more preferably at least 60 and even more preferably more than 100 contiguous nucleotides complementary to one of the two strands of the amplified target sequences, said sequence being specific for the target (which means that said bases of said sequence are able to bind with their complementary bases upon the sequence of the target by complementary hybridization). Preferably, said hybridization is obtained under stringent conditions (under conditions well-known to the person skilled in the art).
Advantageously, when the nucleotide sequence specific for the organism or part of the organism to be identified and/or quantified in a sample is non homologous or the homology is low (less than 30% homology) with other sequences from other organisms possibly present in the same sample, the length of the specific sequence of the capture nucleotide sequence can be increased significantly in order to have a high hybridization yield with the target amplified nucleotide sequences. As the homology between the amplified target nucleotide sequences is low, the risk of cross-hybridization on long capture nucleotide sequence is also low. As a consequence, the specificity of the assay is kept even when long specific sequences are used. In a particular embodiment, the length of the specific sequence of the capture nucleotide sequence is preferably of more than 100 bases, and even more than 200 bases and even more than 400 bases.
The length of the capture molecules is preferably to be limited in order to reduce or avoid cross-reaction with other target sequences. The detection of possible cross-reaction on the capture molecule can be first tested theoretically by comparison of the sequences with the appropriate software as known by the person skilled in the art and/or by experimental assay. Also, long nucleotide sequences can be used if they do not lower the binding yield of the target nucleotide sequences usually by adopting hairpin based secondary structure or by interaction with each other. The length of the target specific sequence of the capture nucleotide sequence is preferably limited to about 600 bases and preferably to about 450 bases and even to about 150 bases.
In the method and kit or device according to this embodiment of the invention, the capture nucleotide sequence is a sequence having between 16 and 600 bases, preferably between 30 and 300 bases, more preferably between 40 and 150 bases and the spacer or spacer portion is a chemical chain at least 6.8 nm long (corresponding to a nucleotide sequence of 20 bases), comprising a nucleotide sequence of at least 20 bases, better at least 40 bases and even longer than 60 bases, or is a nucleotide derivative such as PNA or LNA.
In a preferred embodiment, the nucleotide sequence located between the specific capture nucleotide sequence and the support is a non-specific sequence which is not homologous or identical to the target to be detected. In a particular embodiment, the spacer sequence of a particular capture molecule is a sequence which is complementary to the nucleotide sequences to be detected. It will serve as spacer by separation of the at least 15 bases complementary to the amplified target from the support by at least 20 and better 40 bases.
In a preferred embodiment, the binding of the amplicons on the capture probe is such as to produce two non complementary ends, one being a spacer end and the other one a non-spacer end, such that the spacer end is non-complementary to the spacer portion of the capture molecule and said spacer end exceeds said non-spacer end by at least 50 bases.
In still another preferred embodiment, the detection is performed by hybridization of the full length of amplified sequence upon capture molecules.
In a preferred embodiment, the quantification of the organism present in the biological sample is obtained by the quantification of the signal present at a particular location of the support.
The method, kit and device according to this embodiment of the invention are particularly suitable for the identification of a target, being preferably biological (micro)organisms or a part thereof, possibly present in a biological sample where at least 4, 10, 20 or even more different sequences are possibly present. In some embodiments, said sequences are amplified by specific primers and are made homologous by the incorporation of common sequence(s) present on the specific primers and are then amplified by common primers. Because of the presence of common sequences at two different locations of their sequence, the derived sequences are amplified to a detectable amount by a common primer pair. Also, given their difference in sequences, their identification is obtained by the discrimination following their binding with the corresponding capture nucleotide sequence, previously bound at a given location upon a solid support. The sensitivity can also be greatly increased if capture nucleotide sequences are spotted to the solid support surface by a robot at high density according to an array. A preferred embodiment of the invention is to use an amount of capture nucleotide sequences spotted on the array resulting in the binding of between about 0.01 to about 5 pmoles of sequence equivalent/cm2 of solid support surface.
The kit or device according to this embodiment of the invention may also incorporate various media or devices for performing the method according to the invention. Said kit (or device) can also be included in an automatic apparatus such as a high throughput screening apparatus for the detection and/or the quantification of multiple nucleotide sequences present in a biological sample to be analyzed. Said kit or apparatus can be adapted for performing all the steps or only several specific steps of the method according to this embodiment of the invention.
In the method, the kit (device) or apparatus according to this embodiment of the invention, the bound single stranded capture nucleotide sequences contain a sequence of between about 10 and about 600 bases, preferably between about 50 and about 450 bases specific for a target nucleotide sequence to be detected and/or quantified and having a total length comprised between about 30 and about 800 bases comprising a spacer having a nucleotide sequence of at least about 20 bases, preferably at least about 40 bases, preferably at least about 60 bases.
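The dimensional ranges stated above for a bound capture nucleotide sequence can be expressed as a simple check. This is an illustrative sketch only: the function name is hypothetical, and the "about" bounds of the text are treated as exact limits for simplicity.

```python
def capture_probe_ok(specific_len, spacer_len):
    """Check a capture probe design against the broadest ranges of the text.

    specific_len: length (bases) of the target-specific part.
    spacer_len:   length (bases) of the nucleotide spacer.
    """
    total_len = specific_len + spacer_len
    return (
        10 <= specific_len <= 600   # specific sequence: about 10 to about 600 bases
        and spacer_len >= 20        # spacer: at least about 20 bases
        and 30 <= total_len <= 800  # total length: about 30 to about 800 bases
    )


# Illustrative designs (hypothetical values).
print(capture_probe_ok(specific_len=25, spacer_len=40))
print(capture_probe_ok(specific_len=25, spacer_len=10))
```

The preferred sub-ranges (for example a specific part of about 50 to about 450 bases, or a spacer of at least about 40 or 60 bases) could be checked in the same manner by tightening the bounds.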
It was found that multiple genes or genomic DNA sequences which are unrelated to each other in terms of sequence could be amplified together in one amplification (PCR) solution containing a limited concentration of primers specific of the different nucleotide sequences to be detected, so that the total primer concentration in the amplification solution is comprised between 0.5 and 4 μM, and better between 0.5 and 2 μM. A high yield of the different amplicons is thereafter achieved with the help of a unique primer pair, so that the amplicons are in sufficient amount to be detected after binding on specific probes.
The method is especially useful when the assay is designed to detect a large number of possible organisms (such as 10 or even 20 or more than 40) so that the amplification solution has to contain specific primers for all these organism nucleotide sequences but the number of actually detected organisms in the sample is limited. This is typically the situation of a diagnostic assay where the number of possible pathogens is large but only one or a few of them are present in a given sample. In this case the amplification method provides specificity through the use of specific primers while avoiding the problems occurring with the use of high primer concentrations. The use of a common primer pair at high concentration finally provides enough amplicons of any of the present pathogen(s) for their detection on the specific capture probes.
The present amplification method drastically reduces the non-specific amplification due to very low concentrations in the amplification solution of the different target specific primers. This feature is especially useful when working on real biological samples which contain genomic DNA from the host.
In a particular embodiment, the production of the derived sequences from the organism nucleotide sequences of the sample and the amplification by the universal primers are performed in one amplification process.
Preferably the length of the sequence complementary to one of the two strands of a given polynucleotide sequence of the specific primer pair is at least 6 and, more preferably, at least 15 nucleotides. In the preferred embodiment, the primers are designed to be specific for a given nucleotide sequence to be detected. However in a particular application, the primers are random sequences of small sequences of around 6 to 14 bases which are used for the amplification (for example for the transcripts). One of the applications is the amplification of transcripts or random sequences in a genome. The primers can be linked by a ligase to a polynucleotide if needed and processed for further amplification with the universal primers.
In another embodiment, the primers specific for the targets are at a concentration lower than 100 nM in the PCR solution, and may be even lower than 50 nM, or even lower than 20 or 10 nM.
Also in a preferred embodiment, the concentration of the universal primers is at least 100 nM, and preferably at least 500 nM, and even more preferably at least 1000 nM.
Also in a preferred embodiment, the ratio between the concentration of universal primers and the concentration of the specific target primers in the amplification PCR solution is at least 20:1, more preferably at least 50:1, and even more preferably at least 100:1.
In a preferred embodiment, the total concentration of the overall specific primers does not exceed 2000 nM, and preferably does not exceed 1000 nM, and still more preferably does not exceed 500 nM.
In still another preferred embodiment, the concentration of the overall specific primers does not exceed the concentrations of the universal primers.
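The concentration limits listed in the preceding paragraphs can be gathered into one check of a candidate primer mix. The sketch below is illustrative only: the function name and the example mix are hypothetical, and the universal-to-specific ratio is assumed here to be computed against the most concentrated individual specific primer.

```python
def check_primer_mix(specific_nM, universal_nM):
    """Compare a primer mix with the broadest limits stated in the text.

    specific_nM:  list with the concentration (nM) of each target-specific primer.
    universal_nM: concentration (nM) of the universal primer.
    """
    total_specific = sum(specific_nM)
    highest_specific = max(specific_nM)
    return {
        "each_specific_below_100nM": highest_specific < 100,
        "universal_at_least_100nM": universal_nM >= 100,
        # Assumption: ratio taken per individual specific primer.
        "ratio_at_least_20_to_1": universal_nM / highest_specific >= 20,
        "total_specific_at_most_2000nM": total_specific <= 2000,
        "total_specific_not_above_universal": total_specific <= universal_nM,
    }


# Hypothetical mix: 40 specific primers at 10 nM each, universal primer at 1000 nM.
report = check_primer_mix([10.0] * 40, 1000.0)
for rule, satisfied in report.items():
    print(rule, satisfied)
```

With this hypothetical mix the total specific primer concentration is 400 nM and the universal-to-specific ratio is 100:1, so every listed limit is satisfied.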
Preferably, the universal primers have a Tm within ±5° C., and, preferably, within ±2° C. of the Tm of the primers specific for the sample nucleotide sequences.
In still another preferred embodiment, the annealing temperature of the PCR cycles is at least 5° C., and, preferably, at least 7° C. lower than the Tm of the specific and the universal primers.
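The two temperature conditions above can be sketched as a single check. The function name, the example Tm values and the annealing temperature are hypothetical; the Tm values are assumed to be supplied by the user (for instance from a primer design tool) and are not computed here.

```python
def thermal_window_ok(specific_tms, universal_tm, annealing_temp,
                      tm_tolerance=5.0, annealing_margin=5.0):
    """Check the Tm match and the annealing margin (all values in degrees C).

    specific_tms:     Tm of each target-specific primer.
    universal_tm:     Tm of the universal primer.
    annealing_temp:   annealing temperature of the PCR cycles.
    tm_tolerance:     allowed Tm difference (5 in the text, preferably 2).
    annealing_margin: required gap below the Tm (5 in the text, preferably 7).
    """
    # The universal primer Tm must lie within the tolerance of every specific primer Tm.
    tm_match = all(abs(universal_tm - tm) <= tm_tolerance for tm in specific_tms)
    # The annealing temperature must sit below the lowest Tm by at least the margin.
    lowest_tm = min(list(specific_tms) + [universal_tm])
    annealing_ok = lowest_tm - annealing_temp >= annealing_margin
    return tm_match and annealing_ok


# Hypothetical values for illustration.
print(thermal_window_ok([62.0, 64.0, 60.0], universal_tm=63.0, annealing_temp=55.0))
```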
In a particular embodiment, the PCR amplification is obtained with less than 25, and, preferably, less than 20, and even more preferably less than 15 cycles and the detection is performed on such amplified sequences.
In a particular embodiment, the concentration ratio between 2 different polynucleotide target sequences being detected is higher than 10.
In a particular embodiment, the amplification (PCR) solution comprises at least 20 and preferably at least 40 and even more preferably at least 60 different target specific primers.
In a preferred embodiment, the ratio between the concentrations of the two universal primers in the amplification solution is comprised between 1.2 and 2.
In a particular embodiment, the amount of non specific amplified sequences represents less than 50% and even less than 20% of the specific amplified sequences.
In a particular embodiment, the PCR amplification is performed by a DNA polymerase which is a hot-start DNA polymerase.
In a particular embodiment, the PCR amplification is performed by a DNA polymerase which is a Topo Taq DNA polymerase.
The method is not only applicable to amplification and detection of full size genes, but also to degraded genes and is well suited for degraded genes extracted from paraffin embedded tissues, where some chemical modifications of the mRNA occur due to the presence of chemical fixing agents. The present method is fully compatible and well adapted in terms of sensitivity and specificity in combination with detection on microarray and also with a real time PCR performed on arrays.
In another preferred embodiment of the invention, the capture nucleotide sequences are chemically synthesized oligonucleotide sequences shorter than 100 bases (easily performed on programmed automatic synthesizer). Such sequences can bear a functionalized group such as an amino group for covalent attachment upon the support, at high concentrations.
Longer capture nucleotide sequences are preferably synthesized by PCR amplification (of a sequence incorporated into a plasmid containing the specific part of the capture nucleotide sequence and the non specific part (spacer)).
In a further embodiment of the invention, the specific sequence of the capture nucleotide sequence is separated from the surface of the solid support by a spacer which is at least about 6.8 nm long, equivalent to the distance of at least 20 base pair long nucleotides in double helix form or equivalent to the size of the streptavidin or avidin protein when used as a linker between the capture molecules and the support.
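The equivalence between a 6.8 nm spacer and 20 base pairs can be verified with a short calculation. The sketch assumes the commonly cited rise of about 0.34 nm per base pair for a B-form double helix; this value and the function name are assumptions, not taken from the present description.

```python
# Assumed axial rise per base pair of a B-form double helix, in nanometres.
RISE_PER_BASE_PAIR_NM = 0.34


def spacer_length_nm(n_base_pairs):
    """Approximate length (nm) of a double-helical stretch of n base pairs."""
    return n_base_pairs * RISE_PER_BASE_PAIR_NM


# Spacer lengths mentioned in the text: 20, 40 and 60 bases.
for n in (20, 40, 60):
    print(n, "bases ->", round(spacer_length_nm(n), 1), "nm")
```

Under this assumption, 20 base pairs correspond to about 6.8 nm, in agreement with the minimum spacer length stated above.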
In another embodiment of the invention, a specific nucleotide sequence comprising between about 10 and about 60 bases, preferably between about 15 and about 40 bases and more preferably between about 20 and about 30 bases is located on a capture nucleotide sequence comprising a sequence between about 30 and about 600 bases.
The method, kit (device) or apparatus according to the invention are suitable for the detection and/or the quantification of a target which is made of DNA or RNA, including sequences which are partially or totally homologous upon their total length.
In the method, kit (device) or apparatus according to the invention, the capture nucleotide sequences are advantageously covalently bound (or fixed) upon the insoluble solid support, preferably by one of their extremities as described hereafter.
The method according to the invention gives significant results which allows identification (detection and quantification) with amplicons in solutions at concentration of lower than about 10 nM, of lower than about 1 nM, preferably of lower than about 0.1 nM and more preferably of lower than about 0.01 nM (=1 fmole/100 μl).
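The conversion quoted above (0.01 nM corresponding to 1 fmole in 100 μl) follows from the fact that 1 nM equals 1 fmole per μl. A minimal sketch of this arithmetic, with a hypothetical function name and the 100 μl volume taken from the parenthetical in the text:

```python
def amount_fmol(concentration_nM, volume_uL):
    """Amount of amplicon in femtomoles.

    1 nM = 1 nmol/L = 1e6 fmol per 1e6 uL = 1 fmol/uL,
    so the amount is simply concentration (nM) x volume (uL).
    """
    return concentration_nM * volume_uL


# Concentration limits listed in the text, for a 100 ul volume.
for conc in (10.0, 1.0, 0.1, 0.01):
    print(conc, "nM in 100 ul ->", round(amount_fmol(conc, 100.0), 3), "fmol")
```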
In another aspect of this embodiment of the invention, very concentrated capture nucleotide sequences are used on the surface. The density of capture nucleotide sequences bound to the surface at a specific location is preferably higher than about 10 fmoles, and preferably higher than about 100 fmoles per cm2 of solid support surface. If the amount of capture nucleotides is too low, the yield of the binding drops quickly and the binding becomes undetectable. Concentrations of capture nucleotide sequences between about 600 and about 3,000 nM in the spotting or binding solutions are preferred. However, concentrations as low as about 100 nM still give positive results in favorable cases (when the yield of covalent fixation is high or when the target to be detected is single stranded and present in high concentrations). Such low spotting concentrations would give a density of capture nucleotide sequence as low as 20 fmoles per cm2. On the other hand, higher density was only limited in the assays by the concentrations of the capture solutions, but concentrations still higher than 3,000 nM give good results.
The use of these very high concentrations and long probes are two unexpected characteristic features of the invention. The theory of DNA hybridization proposed that the rate of hybridization between two DNA complementary sequences in solution is proportional to the square root of the DNA length, the smaller one being the limiting factor (Wetmur, J. G. and Davidson, N. 1968 J. Mol. Biol. 3:584). In order to obtain the required specificity, the specific sequences of the capture nucleotide sequences had to be small or limited in length compared to the target. Moreover, the targets were obtained after PCR amplification and were double stranded so that they reassociate in solution much faster than they hybridise on small sequences fixed on a solid support where diffusion is low, thus reducing even more the rate of reaction. It was unexpected to observe such a large increase in the yield of hybridisation with the same short specific sequence.
In one embodiment, the amount of a target which “binds” on the spots is very small compared to the amount of capture nucleotide sequences present. So there is a large excess of capture nucleotide sequence and there was no reason to expect more binding with even more capture nucleotide sequences.
One may perform the detection on the full length sequence after amplification or copying and when labeling is performed by incorporation of labeled nucleotides, more signal is present on the hybridized target making the assay sensitive. Since the method is highly sensitive, the capture probes are also able to capture cut target amplified sequences very efficiently. Cutting the sequences is preferably performed by enzymatic digestion, such as with DNase, or by chemical treatment, such as heating in alkaline solution.
The method, kit and apparatus according to this embodiment of the invention may comprise the use of other bound capture nucleotide sequences, which may have the same characteristics as the previous ones and may be used to identify a target from another group of homologous sequences (preferably amplified by common primer(s)).
In the microbiological field, one may use the present invention for the amplification-detection of various microorganisms from the same genus or from different genera and then identify the species by using capture nucleotide sequences of the invention. The finding of specific sequences is best performed by alignment programs using software on DNA or genomic sequence databases. Given the genome sequencing programs for the different pathogenic organisms, it is feasible to find specific sequences for the amplification by specific primers and for the detection on specific probes. Detection of other sequences can be advantageously performed on the same array, e.g. by allowing a hybridization with a standard nucleotide sequence used for the quantification or for positive or negative controls of hybridization. Said other capture nucleotide sequences have (possibly) a specific sequence longer than 10 to 60 bases and a total length as high as 600 bases and are also bound upon the insoluble solid support (preferably in the array made with the other bound capture nucleotide sequences related to the invention). A long capture nucleotide sequence may also be present on the array as consensus capture nucleotide sequence for hybridization with all sequences of the microorganisms from the same family or genus, thus giving the information on the presence or not of a microorganism of such family or genus in the biological sample.
The same array can also bear capture nucleotide sequences specific for a bacterial group (Gram positive or Gram negative strains or even all the bacteria).
The solid support according to the invention can be or can be made with materials selected from the group consisting of gel layers, glasses, electronic devices, silicon or plastic support, polymers, compact discs, metallic supports or a mixture thereof (see EP 0 535 242, U.S. Pat. No. 5,736,257, WO99/35499, U.S. Pat. No. 5,552,270, etc). Advantageously, said solid support is a single glass slide which may comprise additional means (barcodes, markers, etc.) or media for improving the method according to the invention. In a particular embodiment, the insoluble solid support is in the form of a multiwell plate.
In another particular embodiment, the different capture molecules are immobilized on different beads and, more preferably, the different beads with different capture molecules are labeled so as to be discriminated from each other. This is best achieved by using a mixture of beads having particular features, usually a particular fluorescent emission spectra, and distinguishable from each other in order to quantify the bound molecules on a particular bead. In this embodiment, one bead or a population of beads is then considered as a spot having a capture molecule specific of one target molecule.
The amplification step used in the method according to the invention is advantageously obtained by well known amplification protocols, preferably selected from the group consisting of PCR, RT-PCR, LCR, CPT, NASBA, ICR, Avalanche DNA techniques or isothermal amplification.
One particular isothermal amplification method which is suitable for RNA is the WT-Ovation™ Pico RNA Amplification System based on the Ribo-SPIA™ technology (NuGEN, San Carlos, Calif., USA). It produces amplified cDNA from total RNA for gene expression analysis. Amplification is initiated at the 3′ end as well as randomly throughout the whole transcriptome in the sample.
Ribo-SPIA™ technology is a three-step process; (1) Generation of First strand cDNA, (2) Generation of a DNA/RNA Heteroduplex Double Strand cDNA and (3) SPIA™ Amplification.
First, the cDNA is generated from the RNA by reverse transcription using a mix of DNA/RNA chimeric poly T primer and DNA/RNA chimeric random primer. These chimeric primers contain a sequence part which is common to all primers. The second part of the sequences each has a DNA portion that hybridizes either to the 5′ portion of the poly (A) sequence or randomly across the transcript. RT extends the 3′ DNA end of each primer generating first strand cDNA. The resulting cDNA/mRNA hybrid molecule contains a unique RNA sequence at the 5′ end common for all of the cDNA strands. Fragmentation of the mRNA within the cDNA/mRNA complex creates priming sites for DNA polymerase to synthesize a second strand, which includes DNA complementary to the 5′ unique sequence from the first strand chimeric primers. The result is a double stranded cDNA with a unique DNA/RNA heteroduplex at one end. RNase H is then used to degrade RNA in the DNA/RNA heteroduplex at the 5′ end of the first cDNA strand. This results in the exposure of a DNA sequence that is available for binding a second SPIA™ DNA/RNA chimeric primer. DNA polymerase then initiates replication at the 3′ end of the primer, displacing the existing forward strand. The RNA portion at the 5′ end of the newly synthesized strand is again removed by RNase H, exposing part of the unique priming site for initiation of the next round of cDNA synthesis.
The process of SPIA™ DNA/RNA primer binding, DNA replication, strand displacement and RNA cleavage is repeated, resulting in rapid accumulation of cDNA with sequence complementary to the original mRNA. The method is a linear amplification and the company claims an amplification of 15,000-fold. The method is applicable to 500 pg of starting total RNA. The method is especially well suited for the amplification of small RNA as present in paraffin embedded tissues.
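To put the quoted 15,000-fold figure in perspective, the sketch below computes how many doublings an idealized exponential reaction would need to reach the same fold amplification. The perfect-doubling assumption and the function name are illustrative only and are not part of the described method.

```python
import math


def doublings_for_fold(fold: float) -> int:
    """Smallest number of perfect doublings that gives at least `fold` amplification.

    Assumes an idealized exponential reaction in which every cycle exactly
    doubles the product; real reactions are less efficient.
    """
    if fold < 1:
        raise ValueError("fold must be >= 1")
    return math.ceil(math.log2(fold))


# 15,000-fold is the figure quoted for the linear SPIA amplification.
cycles = doublings_for_fold(15_000)
print(cycles)  # 14, since 2**13 = 8192 < 15000 <= 2**14 = 16384
```

Under these assumptions, the linear method reaches a yield comparable to about 14 ideal doublings.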
Advantageously, the target to be identified is labeled prior to its hybridization with the single stranded capture nucleotide sequences. Said labeling (with techniques known to the person skilled in the art) is preferably also obtained during the amplification step. Hybridization on capture probes preferably requires the denaturation of the double stranded amplified target sequences. However, the inventors have found that this denaturation is not mandatory and hybridization can take place even without the denaturation step.
Advantageously, the length of the target is selected as being of a limited length preferably between about 60 and about 200 bases, preferably between about 80 and about 400 bases and more preferably between about 80 and about 800 bases. This preferred requirement depends on the possibility to find specific primers to amplify the required sequences possibly present in the sample. Targets that are too long may reassociate faster and adopt secondary structures which can inhibit the fixation on the capture nucleotide sequences.
In a particular embodiment, the detection and/or the quantification of the amplified target sequences is obtained after their hybridization on corresponding capture probes in the amplification solution. Preferably, the amplification and the detection are performed in the same closed device. In a particular embodiment, the detection of the amplified sequences is performed during the PCR cycles. The amplification is preferably a real time PCR.
In a preferred embodiment, the detection of the presence of pathogenic organisms (being or not micro organisms such as bacteria or viruses) is obtained by detection of their genomic DNA sequences.
In a preferred embodiment, the detection of the presence of Genetically Modified Organisms (GMO) is performed by the detection of their genomic DNA sequences.
In another embodiment, the method is used for detection of the presence of mutations or deletions in some specific parts of a genome or in genes.
In a preferred embodiment, the original sequence to be detected and/or quantified in the sample belongs to the cytochrome P450 forms family.
Detection of genes is also a preferred application of this invention. In one embodiment the detection of homologous genes is obtained by first reverse transcription of the mRNA and then amplification by specific and universal primers as described in this invention. More particularly, the original nucleotide sequences to be detected and/or quantified are RNA sequences submitted to a reverse transcription of the 3′ or 5′ end by using poly dT oligonucleotide or random primers of 6 or 8 or even 10 bases long. In another embodiment, the amplification is obtained by using random primers 6, 8 or 10 nucleotides long, which is useful when the mRNAs present in the sample are the result of a degradation of the RNA transcripts and are found in small fragments.
In yet another embodiment, the amplification is the result of an isothermal amplification. In another embodiment, the amplification is a linear amplification. In a specific embodiment, one common sequence is used for all of the different specific primers used for the amplification of the RNA present in the sample as proposed by WT-Ovation™ Pico RNA Amplification System of NuGEN (San Carlos, Calif.).
More specifically the invention is related to a method for identifying and/or quantifying at least 5 transcripts of a cell in a sample comprising the steps of:
producing derived sequences from the parts of the transcript sequences present in the cell extract by incorporation of at least one common sequence in said parts of transcript sequences in order to obtain a partial homology between the said derived nucleotide sequences,
amplifying said derived nucleotide sequences as to produce full-length target nucleotide sequences having between 50 and 800 bases;
contacting said full-length target nucleotide sequences resulting from the amplifying step with at least 5 different single-stranded capture nucleotide sequences having between 55 and 800 bases, preferably between about 200 and about 450 bases, said single-stranded capture nucleotide sequences being covalently bound in a microarray to insoluble solid support(s) and said capture nucleotide sequences comprising a nucleotide sequence of at least 15 bases which is able to specifically bind to said full-length target nucleotide sequence, and said specific sequence is separated from the surface of the solid support by a nucleotide sequence of at least 40 bases in length; and
detecting specific hybridization of said target nucleotide sequence to said capture nucleotide sequences and quantifying the transcript expression level in the cell.
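The length limits recited in the steps above can be expressed as a simple design check. The sketch below is illustrative only: the class and function names are hypothetical, and it assumes that the total capture length is the sum of the spacer and the specific part.

```python
from dataclasses import dataclass


@dataclass
class CaptureProbe:
    specific_len: int  # bases able to bind the target specifically
    spacer_len: int    # bases separating the specific part from the support

    @property
    def total_len(self) -> int:
        # Assumption: the capture sequence consists of spacer + specific part only.
        return self.specific_len + self.spacer_len


def check_design(probe: CaptureProbe, target_len: int) -> list:
    """Return the list of recited limits that the design violates (empty if none)."""
    problems = []
    if not 50 <= target_len <= 800:
        problems.append("target must have between 50 and 800 bases")
    if not 55 <= probe.total_len <= 800:
        problems.append("capture sequence must have between 55 and 800 bases")
    if probe.specific_len < 15:
        problems.append("specific part must have at least 15 bases")
    if probe.spacer_len < 40:
        problems.append("spacer must have at least 40 bases")
    return problems


print(check_design(CaptureProbe(specific_len=27, spacer_len=40), target_len=120))
print(check_design(CaptureProbe(specific_len=15, spacer_len=20), target_len=120))
```

The first design passes all limits; the second is flagged for both its spacer and its total length.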
The present method allows best the detection and quantification of at least 10, preferably at least 20, and even more preferably more than 50 gene transcripts. In a preferred embodiment, the detection and/or quantification of the nucleotide sequence is performed on degraded RNA extracted from paraffin embedded tissue.
Because of the degradation of the RNA, the full length target amplified sequences are best produced by random primers so that the sequence which is amplified may be any part of the transcripts. Since their concentration is low, a first amplification step based on the use of random primers is necessary. The length of the capture molecule which gives the most reproducible and sensitive assay from one sample to the other is a sequence between about 55 and about 800 nucleotides long, preferably between about 200 and about 450 nucleotides long. The inventors have found that the use of long probes complementary to the transcripts gives a very efficient and sensitive method, reproducible from one sample to the other, for the detection of the cDNA coming from the small RNA present in the paraffin embedded tissues. The levels of the detection signals are also very high and well adapted for the determination of the transcripts pattern in the tissues even with analysis performed on small fragments of such transcripts. The particular feature of the method is the possibility to obtain a quantification of a particular transcript from the detection of the amplified sequences from RNA present in the tissue as small fragments which are randomly produced so that there is a collection of different fragments for each transcript.
In a preferred embodiment, the different single-stranded capture nucleotide sequences bound to the support have their entire sequences complementary or identical to one part of the transcript sequence to be detected.
In a particular embodiment, the capture nucleotide sequences comprise a nucleotide sequence of at least about 50 bases which is able to specifically bind to said full-length target nucleotide sequence without binding to said at least 4 other derived nucleotide sequences.
In a particular embodiment, the detection and/or quantification of the nucleotide sequence is performed on target amplified cDNA having a full length of between about 50 and about 150 bases long.
In a particular embodiment, the full-length target nucleotide sequences are double stranded DNA produced by PCR. In a particular embodiment, the full-length target nucleotide sequences are single stranded DNA produced by isothermal amplification.
According to a further aspect of the present invention, the method, kit (device) or apparatus according to the invention is advantageously used for the identification of different bacterial species belonging to different genera, among them Salmonella, Escherichia coli, Yersinia, Vibrio, Enterobacterium, Pseudomonas.
According to a further aspect of the present invention, the method, kit (device) or apparatus according to the invention is advantageously used for the identification of at least 5 GMO obtained after amplification of one of their DNA sequences with specific primers and detection on specific capture molecules present on an array.
The present method also allows the detection of antibiotic resistance genes or genetic variants such as the FemA gene, the Gyrase gene or the MexR gene.
The method of the present invention allows the detection of mutations or deletions in some specific parts of a genome or in genes for the polymorphism analysis of a genome or particular genes. Examples of polymorphism are given in example 5 on the genes gyrase and MexR related to antibiotic resistance. Detection of polymorphism is especially useful for the detection of genetic diseases and for analyzing specific susceptibilities of patients to drugs, such as a cytochrome P450, where the presence of certain isoforms modifies the metabolism of some drugs.
Another aspect of the present invention is related to any part of biochips or microarray comprising said above described sequences (especially the specific capture nucleotide sequence described in the examples) as well as a general screening method for the identification of a target sequence specific of said microorganisms of family type discriminated from other sequences upon any type of microarrays or biochips by any method.
After hybridization on the array, the target sequence is detected by any current techniques suitable for micro detection on arrays or on equivalent support. Without labelling, preferred methods are the identification of the target by mass spectrometry now adapted to the arrays (U.S. Pat. No. 5,821,060) or by intercalating agents followed by fluorescent detection (WO97/27329 or Fodor et al. 1993 Nature 364: 555).
The detection methods employing labels are numerous. A review of the different labeling molecules is given in WO 97/27317. They are obtained using either already labeled primer or by incorporation of labeled nucleotides during the copying or amplification step. Labeling can also be obtained by ligating a detectable moiety onto the RNA or DNA to be tested (a labeled oligonucleotide, which is ligated, at the end of the sequence by a ligase). Fragments of RNA or DNA can also incorporate labeled nucleotides at their 5′OH or 3′OH ends using a kinase, a transferase or a similar enzyme.
The most frequently used labels are fluorochromes like Cy3, Cy5 and Cy7 suitable for analyzing an array by using commercially available array scanners (General Scanning, Genetic Microsystem, etc.). Radioactive labeling, cold labeling or indirect labeling with small molecules recognized thereafter by specific ligands (streptavidin or antibodies) are common methods. The resulting signal of target fixation on the array is either fluorescent, colorimetric, diffusion, electroluminescent, bio- or chemiluminescent, magnetic, electric like impedometric or voltammetric (U.S. Pat. No. 5,312,527). A preferred method is based upon the use of the gold labeling of the bound target in order to obtain a precipitate or silver staining which is then easily detected and quantified by a scanner.
Quantification has to take into account not only the hybridization yield and detection scale on the array (which is identical for target and reference sequences) but also the extraction, the amplification (or copying) and the labeling steps.
The method according to the invention may also comprise means for obtaining a quantification of target nucleotide sequences by using a standard nucleotide sequence (external or internal standard) added at known concentration. A capture nucleotide sequence is also present on the array so as to hybridize to the standard in the same conditions as said target (possibly after amplification or copying). In this embodiment, the method comprises the quantification of a signal resulting from the formation of a double stranded nucleotide sequence formed by complementary base pairing between the capture nucleotide sequences and the standard and the step of a correlation analysis of signal resulting from the formation of said double stranded nucleotide sequence with the signal resulting from the double stranded nucleotide sequence formed by complementary base pairing between capture nucleotide sequence(s) and the target in order to quantify the presence of the original nucleotide sequence to be detected and/or quantified in the biological sample.
Advantageously the standard is added in the initial biological sample or after the extraction step and is amplified or copied with the same primers and/or has a length and a GC content identical or differing by no more than 20% from the target. More preferably, the standard can be designed as a competitive internal standard having the characteristics of the internal standard found in the document WO98/11253, the disclosure of which is incorporated herein by reference in its entirety. Said internal standard has a part of its sequence common to the target and a specific part which is different. It also has at or near its two ends sequences which are complementary of the two primers used for amplification or copy of the target and similar GC content (WO98/11253).
Preferably, the hybridization yield of the standard through this specific sequence is identical or differs by no more than 20% from the hybridization yield of the target sequence and quantification is obtained as described in WO 98/11253.
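The correlation between the standard and target signals can be illustrated with a simple ratio calculation. The sketch below is not the procedure of WO 98/11253; it assumes background-subtracted signals, a response proportional to the amount hybridized, and equal amplification and hybridization yields for standard and target. The function name and the numeric values are hypothetical.

```python
def quantify_target(target_signal: float,
                    standard_signal: float,
                    standard_amount: float) -> float:
    """Estimate the target amount from the signal ratio to a known standard.

    Assumes both signals are background-subtracted and lie in the
    proportional range of the detection, and that the standard behaves
    like the target in amplification and hybridization.
    """
    if standard_signal <= 0:
        raise ValueError("standard signal must be positive")
    return standard_amount * target_signal / standard_signal


# Hypothetical example: the standard was added at 10 units and gives a
# signal of 150; the target spot gives a signal of 300.
print(quantify_target(300.0, 150.0, 10.0))  # 20.0
```

If the yields of standard and target differ (the text tolerates up to 20%), the estimate carries a corresponding error.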
Said standard nucleotide sequence, external and/or internal standard, is also advantageously included in the kit (device) or apparatus according to the invention, possibly with all the media and means necessary for performing the different steps according to the invention (hybridization and incubation media, polymerase and other enzymes, standard sequence(s), labeling molecule(s), etc.).
The present invention also covers the means for performing the method. Particularly, the invention includes a diagnostic and/or quantification kit which comprises:
an insoluble solid support upon which single stranded capture nucleotide sequences are bound in an array, said single stranded capture nucleotide sequences containing a sequence of between about 10 and about 600 bases, and preferably between 50 and 450 bases specific for a target nucleotide sequence to be detected and/or quantified and having a total length comprised between about 30 and about 800 600 bases comprising a spacer having a nucleotide sequence of at least about 20 bases and preferably of at least about 40 bases and, in some embodiments even longer than about 60 bases, said single stranded capture nucleotide sequences being disposed upon the surface of the solid support according to an array with a density of at least 4 single stranded capture nucleotide sequences/cm2 of the solid support, and.
an amplification (PCR) solution that comprises at least 5 different target specific primers and a universal primer pair, a thermostable DNA polymerase, a plurality of dNTPs and a buffered solution having a pH comprised between 7 and 9 for containing the primers.
In a preferred embodiment, the kit comprises a device having a chamber for performing the amplification reaction together with detection and possibly a quantification of amplified target sequences. The kit preferably comprises the amplification reagents for the performance of the PCR amplification together with the hybridization on the immobilized capture molecules.
In another embodiment, the insoluble solid support of the kit is selected from the group consisting of glasses, electronic devices, silicon supports, plastic supports, compact discs, gel layers, metallic supports or a mixture thereof.
In another preferred embodiment of the kit, the single stranded capture nucleotide sequences are disposed upon the surface of the solid support as an array with a density of at least 4 single stranded capture nucleotide sequences/cm2 of the solid support surface.
In another embodiment, the insoluble solid support of the kit is in the form of a multiwell plate.
In another preferred embodiment, the insoluble solid support is a series of microbeads. The biochip is composed of a collection of beads on which the capture molecules are bound with one particular bead having only one capture molecule sequence. The beads are labeled so that they can be recognized preferably by a bead analyzer and counter such as a FACS machine.
In a preferred embodiment of the kit, the capture nucleotide sequences are specific to a target nucleotide sequence to be detected and/or quantified which is specific for a gene selected from the group consisting of bacteria, human cells, cytochrome P450 forms family.
In another embodiment, the diagnostic kit comprises biochips, for identification and/or quantification of GMOs obtained after amplification of one of their DNA sequences with specific primers and detection on specific capture molecules present on an array. In some embodiments, the kit allows identification and/or quantification of at least 5 GMOs. Preferably the specific capture molecules present on an array contain at least 5 bases located on either side of the 3′ or 5′ flanking regions of the foreign DNA incorporated into the genome of the plant in order to obtain an identification of the GMO.
In a specific embodiment, the diagnostic kit comprises biochips, for identification and/or quantification of bacterial species obtained after amplification of one of their DNA sequences with specific primers and universal primer(s) and detection on an array. In some embodiments, the kit allows the identification and/or quantification of at least 5 bacterial species.
In another preferred embodiment, the diagnostic kit comprises biochips, for identification and/or quantification of different single nucleotide polymorphism (SNP) located at different locations in the genome of an organism.
In another embodiment, the kit comprises biochips, for identification and/or quantification of at least 5 gene transcripts obtained after amplification of one of their RNA or cDNA sequences with specific primer(s) and detection on specific capture molecules present on an array.
Advantageously, the biochips also contain spots with various concentrations (e.g., 4) of labeled capture nucleotide sequences. These labeled capture nucleotide sequences are spotted from solutions of known concentration and their signals allow the conversion of the results of hybridization into absolute amounts. They also allow testing for the reproducibility of the detection.
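One way to convert hybridization signals into absolute amounts from such calibration spots is piecewise-linear interpolation between the spotted concentrations. The sketch below is an illustration under that assumption; the calibration values, units and function name are hypothetical and do not come from the examples.

```python
from bisect import bisect_left


def signal_to_amount(signal: float, calibration: list) -> float:
    """Interpolate an amount from (amount, signal) calibration pairs.

    `calibration` holds the known amounts of the labeled control spots and
    the signals they produced. Signals outside the calibrated range are
    rejected rather than extrapolated.
    """
    points = sorted(calibration, key=lambda p: p[1])
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal outside the calibrated range")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][0]
    (a0, s0), (a1, s1) = points[i - 1], points[i]
    return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)


# Four hypothetical control spots: (amount, measured signal).
CALIBRATION = [(1.0, 100.0), (10.0, 1000.0), (100.0, 8000.0), (1000.0, 40000.0)]
print(signal_to_amount(550.0, CALIBRATION))  # 5.5
```

Repeating the same control spots across slides also gives a direct reproducibility check, since their signals should agree from one slide to the next.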
In a particular embodiment, the support for the capture molecules is a multiwell plate.
Alternatively, the biochip is composed of a collection of beads on which the capture molecules are bound with one particular bead having only one capture molecule sequence. The beads are labeled so that they can be recognized preferably by a bead analyzer and counter such as a FACS machine.
The solid support (biochip) can be inserted in a support connected to another chamber and automatic machine through the control of liquid solution based upon the use of microfluidic technology. By being inserted into such a microlaboratory system, it can be incubated, heated, washed and labeled by automates, even for previous steps (like extraction of DNA, amplification by PCR) or the following step (labeling and detection). All these steps can be performed upon the same solid support. In a preferred embodiment, the mixing is performed by movement of the liquid by physical means such as pump, opening and closing valves, electrostatic waves or piezoelectric vibrations.
Preferably the support containing the capture molecules is part of a device having a chamber for performing the amplification reaction and a chamber having capture molecules for performing the hybridization and the detection of the target molecules.
Preferably the chamber for performing the PCR reaction is in a material resistant to 95° C. Preferably the material is selected from the group consisting of glass, polymer, polycarbonate (PC), polyethylene (PE), Cycloolefin copolymer (COC), cyclic olefin polymer (COP) and a mixture thereof. In still another embodiment, the chamber for PCR has a thickness of material of less than 2 mm and better less than 1 mm.
In a preferred embodiment, the incubation system provides conditions so that the thickness of the solution being in contact with the micro-array is constant above all the arrayed spots or localized areas. The difference of thickness between two spots or localized areas of the arrayed surface is preferably lower than 100 micrometers and may be lower than 10 micrometers and/or even lower than 1 micrometer. In another embodiment, the incubation system provides conditions for the thickness of the solution which is in contact with the micro-array to be changed between two measurements. In still another embodiment the chamber having the capture molecules has a surface having a transmission of more than 90% and better more than 95% at the wavelength of detection of the target label. In still another embodiment the chamber having the capture molecules has a surface allowing the same detection efficiency on the overall surface covered by the micro-array to be analyzed.
Preferably the detection and/or the quantification of the amplified target sequences is obtained after their hybridization on corresponding capture probes in the amplification solution.
In still another embodiment the PCR chamber and the array chambers are the same chamber.
In a particular embodiment, the amplification and the detection are performed in the same closed device. In still another embodiment, the detection of the amplified sequences is performed during the PCR cycles and preferably the detection is a real time PCR.
The present invention will be described in detail in the following non-limiting examples in reference to the enclosed figures.
Production of the Capture Nucleotide Sequences and of the Targets
The FemA genes corresponding to the different Staphylococci species were amplified separately by PCR using the following primers:
The PCR was performed in a final volume of 50 μl containing: 1.5 mM MgCl2, 10 mM Tris pH 8.4, 50 mM KCl, 0.8 μM of each primer, 50 μM of each dNTP, 50 μM of biotin-16-dUTP, 1.5 U of Taq DNA polymerase Biotools, 7.5% DMSO, 5 ng of plasmid containing FemA gene. Samples were first denatured at 94° C. for 3 min. Then 40 cycles of amplification were performed consisting of 30 sec at 94° C., 30 sec at 60° C. and 30 sec at 72° C. and a final extension step of 10 min at 72° C. Water controls were used as negative controls of the amplification. The sizes of the amplicons obtained using these primers were 108 bp for S. saprophyticus, 139 bp for S. aureus, 118 bp for S. hominis, 101 bp for S. epidermidis and 128 bp for S. haemolyticus. The sequences of the capture nucleotide sequences were the same as the corresponding amplicons but they were single strands.
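The cycling program and reagent concentrations given above can be written down as data, which makes it easy to derive the total programmed time and the absolute reagent amounts per reaction. The sketch below only transcribes the stated values; ramp times between temperatures are ignored, and the function names are hypothetical.

```python
# Cycling program transcribed from the protocol above (times in seconds).
INITIAL_DENATURATION_S = 3 * 60               # 94 deg C for 3 min
CYCLES = 40
CYCLE_STEPS = [(94, 30), (60, 30), (72, 30)]  # (temperature in deg C, seconds)
FINAL_EXTENSION_S = 10 * 60                   # 72 deg C for 10 min


def total_hold_time_s() -> int:
    """Sum of all programmed hold times; thermal ramping is not included."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


def amount_pmol(conc_uM: float, volume_ul: float) -> float:
    """Amount of a reagent in pmol (1 uM x 1 ul = 1 pmol)."""
    return conc_uM * volume_ul


print(total_hold_time_s() / 60)          # 73.0 minutes of programmed holds
print(round(amount_pmol(0.8, 50.0), 6))  # 40.0 pmol of each primer in 50 ul
```

The same helper gives 2500 pmol for each dNTP at 50 μM in the 50 μl reaction.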
The biochips also contained positive controls which were CMV amplicons hybridized on their corresponding capture nucleotide sequence and negative controls which were capture nucleotide sequences for a HIV-I sequence on which the CMV could not bind.
Capture Nucleotide Sequence Immobilization
The protocol described by Schena et al. (1996 PNAS. USA 93:10614) was followed for the grafting of aminated DNA to aldehyde derivatized glass. The aminated capture nucleotide sequences were spotted from solutions at concentrations ranging from 150 to 3000 nM. The capture nucleotide sequences were printed onto the silylated microscopic slides with a home made robotic device (250 μm pins from Genetix (UK) and silylated (aldehyde) microscope slides from Cell associates (Houston, USA)). The spots are 400 μm in diameter and the volume dispensed is about 0.5 nl. Slides were dried at room temperature and stored at 4° C. until used.
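From the spotting concentration and the dispensed volume, the amount of capture nucleotide sequence deposited per spot follows directly. The sketch below is a back-of-the-envelope calculation; it gives the amount dispensed, which is only an upper bound on what is actually grafted to the glass, and the function name is hypothetical.

```python
def amol_dispensed(conc_nM: float, volume_nl: float) -> float:
    """Amount dispensed per spot in attomoles.

    1 nM x 1 nl = 1e-9 mol/l x 1e-9 l = 1e-18 mol = 1 amol.
    """
    return conc_nM * volume_nl


SPOT_VOLUME_NL = 0.5  # "about 0.5 nl" in the protocol above

for conc in (150.0, 3000.0):  # limits of the stated concentration range
    print(conc, "nM ->", amol_dispensed(conc, SPOT_VOLUME_NL), "amol per spot")
```

The stated range therefore corresponds to roughly 75 amol to 1.5 fmol dispensed per spot.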
Hybridization
To 65 μl of hybridization solution (AAT, Namur, Belgium) were added 5 μl of amplicons and the solution was loaded on the array framed by a hybridization chamber. For positive controls we added 2 nM biotinylated CMV amplicons of 437 bp to the solution; their corresponding capture nucleotide sequences were spotted on the array. The chamber was closed with a coverslip and slides were denatured at 95° C. for 5 min. The hybridization was carried out at 60° C. for 2 h. Samples were washed 4 times with a washing buffer.
Colorimetric Detection
The glass samples were incubated 45 min at room temperature with 800 μl of streptavidin labeled with colloidal gold 1000× diluted in blocking buffer (Maleic buffer 100 mM pH 7.5, NaCl 150 mM, Gloria milk powder 0.1%). After 5 washes with washing buffer, the presence of gold served for catalysis of silver reduction using a staining revelation solution (AAT, Namur, Belgium). The slides were incubated 3 times 10 min with 800 μl of revelation mixture, then rinsed with water, dried and analyzed using a microarray reader. Each slide was then quantified by a specific quantification software.
Fluorescence Detection
The glass samples were incubated 45 min at room temperature with 800 μl of Cyanin 3 or Cyanin 5 labeled streptavidin. After washing, the slides were dried before being stored at room temperature. The detection was performed in the array-scanner GSM 418 (Genetic Microsystem, Woburn, Mass., USA). Each slide was then quantified by a specific quantification software.
The results show cross-reaction between the species. For example, S. epidermidis amplicons hybridized on their own capture nucleotide sequence give a value of 152, but give values of 144, 9, 13 and 20 respectively for the S. saprophyticus, S. aureus, S. haemolyticus and S. hominis capture nucleotide sequences.
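The extent of cross-reaction can be made explicit by expressing each signal as a percentage of the signal on the specific capture nucleotide sequence. The sketch below uses the values reported above; the function name is hypothetical.

```python
# Signals obtained with S. epidermidis amplicons on each capture sequence.
SIGNALS = {
    "S. epidermidis": 152,
    "S. saprophyticus": 144,
    "S. aureus": 9,
    "S. haemolyticus": 13,
    "S. hominis": 20,
}


def relative_signals(signals: dict, specific: str) -> dict:
    """Express every signal as a percentage of the specific (cognate) signal."""
    reference = signals[specific]
    return {name: 100.0 * value / reference for name, value in signals.items()}


for name, percent in relative_signals(SIGNALS, "S. epidermidis").items():
    print(f"{name}: {percent:.1f}%")
```

On these figures the S. saprophyticus capture sequence gives about 95% of the specific signal, so the two species cannot be distinguished with these full-length capture sequences.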
Protocols for capture nucleotide sequences immobilization and silver staining detection were described in Example 1 but the capture nucleotide sequences specific of the 5 Staphylococcus species were spotted at concentrations of 600 nM and are the following:
In this case, the targets are fragments of the FemA gene sequence corresponding to the different Staphylococci species which were amplified by a PCR using the following consensus primers:
This PCR was performed in a final volume of 100 μl containing: 3 mM MgCl2, 1 mM Tris pH 8, 1 μM of each primer, 200 μM of dATP, dCTP and dGTP, 150 μM of dTTP, 50 μM of biotin-16-dUTP, 2.5 U of Taq DNA polymerase (Boehringer Mannheim, Germany), 1 U of Uracil-DNA-glycosylase heat labile (Boehringer Mannheim, Germany), 1 ng of plasmid containing FemA gene. Samples were first denatured at 94° C. for 5 min. Then 40 cycles of amplification were performed consisting of 1 min at 94° C., 1 min at 50° C. and 1 min at 72° C. and a final extension step of 10 min at 72° C. Water controls were used as negative controls of the amplification. The sizes of the amplicons obtained using these primers were 489 bp for all species.
The hybridization solution was prepared as in example 1 and loaded on the slides. Slides were denatured at 98° C. for 5 min. Hybridization was carried out at 50° C. for 2 h. Samples were then washed 4 times with a washing buffer. The values were very low and almost undetectable.
The experiment was conducted as described in Example 2 with the same amplicons but the capture nucleotide sequences used are the following:
a The spacer sequences are underlined.
The target amplicons were 489 bp long while the capture nucleotide sequences were 47, 67 or 87 bases single stranded DNA with a specific sequence of 27 bases.
The experiment was conducted as described in example 2 but the capture nucleotide sequences were spotted at concentrations of 3000 nM and are the following:
a The spacer sequence is underlined. The specific sequences were of 27 bases.
The targets are fragments of the FemA gene sequence corresponding to the different Staphylococci species which were amplified by PCR using the following consensus primers:
A consensus sequence is present on the biochips which detects all the tested Staphylococcus species. All target sequences were amplified by PCR with the same pair of primers.
The size of the amplicons obtained using these primers was 587 bp for all species. The consensus capture nucleotide sequence was a 489 base long single stranded DNA complementary to the amplicons of S. hominis as amplified in example 2. The detection was made in fluorescence. Homology between the consensus capture nucleotide sequence and the sequences of the FemA from the 15 Staphylococcus species was between 66 and 85%. All the sequences hybridized on this consensus capture nucleotide sequence.
The experiment was conducted as described in example 4 but at a temperature of 43° C. and the capture nucleotide sequences used are presented in the table here joined. The numbers after the names indicate the length of the specific sequences.
The FemA amplicons of S. anaerobius (a subspecies of S. aureus) were hybridized on an array bearing capture nucleotide sequences of 67 single stranded bases with either 15, 27 or 40 bases specific for S. aureus, S. anaerobius and S. epidermidis at their extremities. The difference between the capture nucleotide sequences of S. anaerobius and S. aureus was only one base in the 15 base capture nucleotide sequence and 2 bases in the 27 and the 40 base sequences.
The amplicons of the FemA from the three Staphylococcus species were hybridized on the array.
The experiment was conducted as described in example 4 with the capture nucleotide sequences spotted at concentrations of 3000 nM. The bacterial FemA sequences were serially diluted before the PCR and being incubated with the arrays.
The consensus primers and the amplicons were the same as described in the example 4 but the capture probes were chosen for the identification of 15 Staphylococcus species. The experiment is conducted as in example 4. The capture nucleotide sequences contain a spacer fixed on the support by its 5′ end and of the following sequence 5′ GAATTCAAAGTTGCTGAGAATAGTTCAATGGAAGGAAGCG 3′ (SEQ ID NO: 36) followed by the following specific sequences for the various femA from the different Staphylococcus:
The P34 genes present in all Mycobacteria were all amplified with the following consensus primers:
Sense
MycU4 5′ CATGCAGTGAATTAGAACGT 3′ (SEQ ID NO: 53) located at the position 496-515 of the gene, Tm=56° C.
Antisense
APmcon02 5′ GTASGTCATRRSTYCTCC 3′ (SEQ ID NO: 54) located at the position 733-750 of the gene, Tm=52-58° C.; S=C or G; R=A or G; Y=T or C.
The size of amplified products ranges from 123 to 258 bp.
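A consensus primer written with degenerate positions stands for a mixture of defined oligonucleotides. The sketch below enumerates that mixture for the antisense primer APmcon02 using the degeneracy codes given in this description (S, R, Y, and also M, K and W as used for the dopamine receptor primers); the function name is hypothetical.

```python
from itertools import product

# Degeneracy codes as defined in the primer descriptions.
DEGENERACY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG", "R": "AG", "Y": "CT",
    "M": "AC", "K": "GT", "W": "AT",
}


def expand_degenerate(primer: str) -> list:
    """List every defined sequence represented by a degenerate primer."""
    choices = [DEGENERACY[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate("GTASGTCATRRSTYCTCC")  # APmcon02
print(len(variants))  # 32 sequences: five two-fold degenerate positions
print(variants[0])
```

Each additional two-fold degenerate position doubles the number of sequences in the primer mixture, which lowers the effective concentration of any single variant.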
The following capture nucleotide sequences were chosen for the specific capture of the Mycobacteria sequences:
Each of the sequences above comprises a spacer at its 5′ end. Spacer sequence: 5′ GAATTCAAAGTTGCTGAGAATAGTTCAATGGAAGGAAGCGTCTTC 3′ (SEQ ID NO: 74). Capture nucleotide sequences were aminated at their 5′ end.
MAGE genes were all amplified with the following consensus primers:
Sense
DPSCONS2 5′ GGGCTCCAGCAGCCAAGAAGAGGA 3′ (SEQ ID NO: 75), located at the 398-421 position of the gene, Tm=78° C.
Other primers were added as sense primers in order to increase the efficiency of the PCR for some MAGEs:
Antisense
DPASCONB4 5′ CGGTACTCCAGGTAGTTTTCCTGC 3′ (SEQ ID NO: 79), located at the position 913-936 of the gene, Tm=74° C.
The size of the amplified products is around 530 bp.
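The description does not say how the primer melting temperatures were calculated. As an illustration only, the sketch below applies the simple rule Tm = 2 x (A+T) + 4 x (G+C), which happens to reproduce the values stated for the two MAGE consensus primers; the choice of this rule and the function name are assumptions.

```python
def simple_tm(primer: str) -> int:
    """Melting temperature estimate in deg C: 2 per A/T and 4 per G/C.

    This is a rough rule of thumb for short oligonucleotides; it is an
    assumption here, not a method stated in the description.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer contains non-ACGT characters")
    return 2 * at + 4 * gc


print(simple_tm("GGGCTCCAGCAGCCAAGAAGAGGA"))  # DPSCONS2 -> 78
print(simple_tm("CGGTACTCCAGGTAGTTTTCCTGC"))  # DPASCONB4 -> 74
```

Degenerate primers are rejected by this function; for them the estimate would have to be evaluated over each defined variant, which gives a range rather than a single value.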
The following capture nucleotide sequences of 27 nucleotides were chosen for the specific capture of the MAGE sequences:
Each of the sequences above comprises a spacer aminated at its 5′ end in order to be covalently linked to the glass. Spacer sequence 5′ GAATTCAAAGTTGCTGAGAATAGTTCAATGGAAGGAAGCG 3′ (SEQ ID NO: 36).
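The construction of a capture nucleotide sequence from the spacer and a specific part can be sketched as a simple concatenation, with the spacer at the 5′ end where it is linked to the glass. The spacer below is SEQ ID NO: 36 as given above; the 27-base specific sequence is a made-up placeholder, not one of the MAGE sequences, and the function name is hypothetical.

```python
SPACER = "GAATTCAAAGTTGCTGAGAATAGTTCAATGGAAGGAAGCG"  # SEQ ID NO: 36, 40 bases


def make_capture_sequence(specific: str, spacer: str = SPACER) -> str:
    """Return the 5'->3' capture sequence: spacer followed by the specific part."""
    specific = specific.upper()
    if set(specific) - set("ACGT"):
        raise ValueError("specific sequence must contain only A, C, G, T")
    return spacer + specific


# Placeholder 27-base specific part (hypothetical, for illustration only).
PLACEHOLDER_SPECIFIC = "ACGTACGTACGTACGTACGTACGTACG"

capture = make_capture_sequence(PLACEHOLDER_SPECIFIC)
print(len(SPACER), len(PLACEHOLDER_SPECIFIC), len(capture))  # 40 27 67
```

A 40-base spacer plus a 27-base specific part gives the 67-base capture sequences used in the preceding examples.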
They were spotted on aldehyde bearing glasses and used for the detection of the MAGEs amplified by the consensus primers given here above. The results showed an unequivocal identification of the MAGEs present in the tumors compared to identification using 12 specific PCRs, one for each MAGE sequence.
Dopamine Receptors coupled to the G-protein were all amplified with the following consensus primers:
Sense
CONSENSUS2-3-4: 5′ TGCAGACMACCACCAACTACTT 3′ (SEQ ID NO: 92) located at the position 221-242 of the gene, Tm=66° C.; M=A or C;
CONSENSUS1-5: 5′ TGMGGKCCAAGATGACCAACWT 3′ (SEQ ID NO: 93) (22 nt) located at the position 221-240 of the gene, Tm=66° C.; M=A or C; K=G or T; W=A or T.
Antisense
5′ TCATGRCRCASAGGTTCAGGAT 3′ (SEQ ID NO: 94) located at the position 395-416 of the gene, Tm=64-68° C.; R=A or G; S=C or G.
The size of the amplified product is 196 bp.
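The quoted product size follows from the primer coordinates when both are counted on the same sequence numbering. A minimal Python sketch of that arithmetic is given below; it assumes inclusive, 1-based positions, and the helper name `amplicon_length` is hypothetical.

```python
def amplicon_length(sense_start, antisense_end):
    """Length in bp of a PCR product spanning two inclusive, 1-based positions.

    sense_start   -- first position covered by the sense primer
    antisense_end -- last position covered by the antisense primer
    """
    if antisense_end < sense_start:
        raise ValueError("antisense primer must lie downstream of the sense primer")
    return antisense_end - sense_start + 1

# Dopamine receptor consensus primers: positions 221 and 416.
print(amplicon_length(221, 416))  # 196
```

The same calculation applied to the coordinates given for the HLA-A primers (181 and 754) and for the tuf primers (443 and 1011) returns 574 and 569 bp, in agreement with the sizes stated for those examples.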
The following capture nucleotide sequences of 27 nucleotides were chosen for the specific capture of the dopamine receptor sequences:
Each of the sequences above comprised an aminated spacer at its 5′ end. Spacer sequence 5′ GAATTCAAAGTTGCTGAGAATAGTTCAATGGAAGGAAGCG 3′ (SEQ ID NO: 36).
Histamine Receptors coupled to the G-protein were all amplified with the following primers:
Sense
H1sense: 5′ CTCCGTCCAGCAACCCCT 3′ (SEQ ID NO: 100) (18 nt) located at the Position 381-398 of the gene, Tm=60° C.
H2sense: 5′ CTGTGCTGGTCACCCCAGT 3′ (SEQ ID NO: 101) (19 nt) located at the Position 380-398 of the gene, Tm=62° C.
H3sense: 5′ ACTCATCAGCTATGACCGATT 3′ (SEQ ID NO: 102) (21 nt) located at the Position 378-398 of the gene, Tm=60° C.
Antisense
H1antisense: 5′ ACCTTCCTTGGTATCGTCTG 3′ (SEQ ID NO: 103) (20 nt) located at the Position 722-741 of the gene, Tm=60° C.
H2antisense: 5′ GAAACCAGCAGATGATGAACG 3′ (SEQ ID NO: 104) (21 nt) located at the Position 722-742 of the gene, Tm=62° C.
H3antisense: 5′ GCATCTGGTGGGGGTTCTG 3′ (SEQ ID NO: 105) (19 nt) located at the Position 722-740 of the gene, Tm=62° C.
Size of the amplified product ranged from 359 to 364 bp.
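The Tm values quoted for these primers are consistent with the simple rule of thumb of 2° C. per A or T and 4° C. per G or C (often called the Wallace rule). The source does not state which formula was used, so the following Python sketch is an assumption offered only as a cross-check; the function name `wallace_tm` is illustrative.

```python
def wallace_tm(primer):
    """Approximate Tm in degrees C: 2 per A/T plus 4 per G/C (non-degenerate primers)."""
    seq = primer.upper()
    at = sum(seq.count(base) for base in "AT")
    gc = sum(seq.count(base) for base in "GC")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc

histamine_primers = {
    "H1sense": "CTCCGTCCAGCAACCCCT",
    "H2sense": "CTGTGCTGGTCACCCCAGT",
    "H3sense": "ACTCATCAGCTATGACCGATT",
    "H1antisense": "ACCTTCCTTGGTATCGTCTG",
    "H2antisense": "GAAACCAGCAGATGATGAACG",
    "H3antisense": "GCATCTGGTGGGGGTTCTG",
}

for name, seq in histamine_primers.items():
    print(name, len(seq), "nt", wallace_tm(seq))
```

This rule is only a rough guide for short oligonucleotides; it ignores salt concentration and nearest-neighbour effects.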
The following capture nucleotide sequences were chosen for the specific capture of the histamine receptor sequences:
Each of the sequences above comprised a spacer at its 5′ end.
Spacer sequence 5′ GAATTCAAAGTTGCTGAGAATAGTTCAATGGAAGGAAGCG 3′ (SEQ ID NO: 36). Capture nucleotide sequences were aminated at their 5′ end.
Serotonin receptors coupled to the G-protein were all amplified with the following primers:
Sense
Antisense
The following capture nucleotide sequences were chosen for the specific capture of the serotonin receptor subtypes sequences:
Each of the sequences above comprises a spacer at its 5′ end.
Spacer sequence 5′ GAATTCAAAGTTGCTGAGAATAGTTCAATGGAAGGAAGCG 3′ (SEQ ID NO: 36). Capture nucleotide sequences were aminated at their 5′ end.
The HLA-A subtypes were amplified with the following consensus primers:
Sense
located at the position 181-200 of the gene, Tm=70° C.
Antisense
located at the position 735-754 of the gene, Tm=74° C.
The size of the amplified product was 574 bp.
The following capture nucleotide sequences of 27 nucleotides were chosen for the specific capture of the HLA-A sequences:
Each of the sequences above comprised an aminated spacer at its 5′ end. Spacer sequence 5′ GAATTCAAAGTTGCTGAGAATAGTTCAATGGAAGGAAGCG 3′ (SEQ ID NO: 36).
The Cytochrome P450 forms were amplified with the following consensus primers:
Sense
Consensus: 5′ GCCAGAGCCTGAGGA 3′ (SEQ ID NO: 183) located at the position 1297-1311 of the 3a3 gene, Tm=50° C.
Antisense
Consensus a3, a23, a1, a2: 5′ TCAAAAGAAATTAACAGAGA 3′ (SEQ ID NO: 184) located at the position 1839-1858 of the 3a3 gene, Tm=50° C.
Specific a9: 5′ ACAATGAAGGTAACATAGG 3′ (SEQ ID NO: 185) located at the position 2015-2033 of the 3a9 gene Tm=52° C.
Specific a18: 5′ ACTGATGGAACTAACTGG 3′ (SEQ ID NO: 186) located at the position 1830-1846 of the 3a18 gene Tm=52° C.
The length of the PCR product was around 560 bp.
The following capture nucleotide sequences were chosen for the specific capture of the cytochrome P-450 3a sequences:
Each of the sequences above comprised a spacer at its 5′ end.
Spacer sequence 5′ GAATTCAAAGTTGCTGAGAATAGTTCAATGGAAGGAAGCG 3′ (SEQ ID NO: 36). Capture nucleotide sequences were aminated at their 5′ end.
Each of the sequences above comprises a spacer at its 5′ end.
Spacer sequence 5′ GAATTCAAAGTTGCTGAGAATAGTTCAATGGAAGGAAGCG 3′ (SEQ ID NO: 36).
The following primers were chosen for the amplification step of the GMO.
Consensus primers to detect GMO on biochips:
These primers allowed the amplification of the following genes:
1) CTP1, CTP2, CP4EPSPS, S CryIAb and hsp 70 Int. in Mon 809 (corn, Monsanto);
2) hsp 70 Int. and S CryIAb in Mon 810 (corn, Monsanto);
3) S CryIAb and S Pat in Bt 11 (corn, Novartis);
4) CTP4 and EPSPS in GTS40-3-2 (soybean, Monsanto).
The capture nucleotide sequences were chosen in these sequences to allow discrimination. Each of the sequences above comprised a spacer at its 5′ end.
The following sequences were chosen as specific capture probes of the GMO:
Amplification of the Sequences
The amplified target sequences are fragments of the gyrase gene (sub-unit A) sequences corresponding to the different genus and species (table 1) which were amplified by a PCR using the following consensus primers:
The PCR was performed in a final volume of 100 μl containing: 3 mM MgCl2, 1 mM Tris pH 8, 1 μM of each primer, 200 μM of dATP, dCTP and dGTP, 150 μM of dTTP, 50 μM of biotin-16-dUTP, 2.5 U of Taq DNA polymerase (Boehringer Mannheim, Germany), 1 U of heat-labile Uracil-DNA-glycosylase (Boehringer Mannheim, Germany), 1 ng of plasmid containing the gyrase gene. Samples were first denatured at 94° C. for 5 min. Then 40 cycles of amplification were performed consisting of 30 sec at 94° C., 45 sec at 48° C. and 30 sec at 72° C., followed by a final extension step of 10 min at 72° C. Water controls were used as negative controls of the amplification. The sizes of the amplicons obtained using these primers were 166 bp for all genera.
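The cycling programme and the labelling ratio described above can be summarised numerically. The following Python sketch simply adds up the programmed hold times (ramping between temperatures is ignored, which is an assumption) and computes the share of biotin-16-dUTP in the combined dTTP plus dUTP pool; the variable names are illustrative.

```python
# Programmed hold times, in seconds, taken from the protocol above.
INITIAL_DENATURATION_S = 5 * 60
CYCLES = 40
CYCLE_STEPS_S = (30, 45, 30)      # 94, 48 and 72 degrees C steps
FINAL_EXTENSION_S = 10 * 60

total_hold_s = INITIAL_DENATURATION_S + CYCLES * sum(CYCLE_STEPS_S) + FINAL_EXTENSION_S
print(total_hold_s / 60, "min of programmed hold time")  # 85.0

# Share of labelled nucleotide in the dTTP + biotin-16-dUTP pool (concentrations in uM).
DTTP_UM = 150
BIOTIN_DUTP_UM = 50
biotin_fraction = BIOTIN_DUTP_UM / (DTTP_UM + BIOTIN_DUTP_UM)
print(biotin_fraction)  # 0.25
```

The combined dTTP and biotin-16-dUTP concentration is 200 μM, the same as each of the other three nucleotides.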
Production of the Capture Nucleotide Sequences and of the Targets
The capture nucleotide sequences contain a spacer fixed on the support by its 5′ end and of the following sequence
followed by the following specific sequences for the various Gyrase from the different bacteria:
The capture nucleotide sequences were first synthesized chemically and later on produced by PCR amplification after cloning of the sequences into the plasmid pGEM-T Easy Vector System (Promega, Madison, USA). The capture nucleotide sequences were then produced by amplification of the plasmids using a common 5′ aminated primer 5′ GAATTCAAAGTTGCTGAGAATAGTTCA (SEQ ID NO: 221) and a second primer of 27 bases complementary of each capture nucleotide sequence.
The aminated capture polynucleotide sequences (longer than 100 bases) were spotted from solutions at concentrations ranging from 150 to 3000 nM. The capture nucleotide sequences were printed onto the aldehyde microscopic slides with a home-made robotic device (250 μm pins from Genetix, UK). The spotting solutions were from AAT (Namur, Belgium). The spots are 400 μm in diameter and the volume dispensed is about 0.5 nl. Slides are dried at room temperature and stored at 4° C. until used.
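From the dispensed volume and the spotting concentrations stated above, the amount of capture molecule deposited per spot can be estimated. The Python sketch below uses the fact that nanolitres multiplied by nanomolar gives attomoles; it assumes that the whole dispensed volume stays on the spot and says nothing about how much is actually immobilized. The function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def deposited_amol(volume_nl, concentration_nm):
    """Amount deposited in attomoles: 1 nl x 1 nM = 1e-18 mol = 1 amol."""
    return volume_nl * concentration_nm

def deposited_molecules(volume_nl, concentration_nm):
    """Number of molecules deposited per spot."""
    return deposited_amol(volume_nl, concentration_nm) * 1e-18 * AVOGADRO

for conc_nm in (150, 3000):
    amol = deposited_amol(0.5, conc_nm)
    print(conc_nm, "nM ->", amol, "amol,", f"{deposited_molecules(0.5, conc_nm):.2e}", "molecules")
```

Over the stated concentration range this corresponds to between 75 amol and 1.5 fmol of capture sequence per 0.5 nl spot.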
Hybridization
To 65 μl of hybridization solution (AAT, Namur, Belgium) were added 5 μl of amplicons, and the solution was loaded on the array framed by a hybridization chamber. For positive controls, 2 nM biotinylated CMV amplicons of 437 bp were added to the solution; their corresponding capture nucleotide sequences were spotted on the array. The chamber was closed with a coverslip and slides were denatured at 95° C. for 5 min. The hybridization was carried out at 65° C. for 30 min. Samples were then washed 4 times with a washing buffer.
Colorimetric Detection
The glass samples were incubated 45 min at room temperature with 800 μl of streptavidin labeled with colloidal gold 1000× diluted in blocking buffer (Maleic buffer 100 mM pH 7.5, NaCl 150 mM, Gloria milk powder 0.1%). After 5 washes with washing buffer, the presence of gold served for catalysis of silver reduction using a staining solution (Silver Blue Solution, AAT, Namur, Belgium). The slides were incubated 10 min with 800 μl of revelation mixture, then rinsed with water, dried and analyzed using a microarray reader (Workstation, AAT, Namur, Belgium). The spots of the arrays were then quantified by a specific quantification software.
The viruses to be detected were the adenovirus and the herpes viruses 1, 5 and 4. The consensus primers for the virus amplification were
The amplicons are respectively 315, 331, 779 and 820 bases long for the 4 viruses, corresponding to the sequences N°420-734, 7924-8254, 1562-2340 and 120761-130580.
The conditions for the PCR amplification were as described in example 1 but with an annealing temperature of 45° C. After amplification, the amplicons were hybridized on an array bearing the capture nucleotide sequences for each virus species and subtypes. The capture nucleotide sequences were composed of a spacer fixed by its 5′ end to the slides and have the sequence as in example 16 and a specific part located on the 3′ end of the capture nucleotide sequence.
Specific sequences of the capture nucleotide sequences:
The hybridization, the colorimetry labeling and the quantification were performed as in example 1.
The amplified target sequences are fragments of the cytochrome b gene sequences corresponding to the different species, which were amplified by a PCR using the following consensus primers:
The PCR was performed as in example 1. The sizes of the amplicons obtained using these primers were between 130 and 147 bp for all genera. After amplification, the amplicons were hybridized on an array bearing the capture nucleotide sequences for each species. The capture nucleotide sequences were composed of a spacer fixed by its 5′ end to the slides and having the same sequence as in example 1 and a specific part located on the 3′ end of the capture nucleotide sequence.
Specific sequences of the capture nucleotide sequences:
The consensus capture nucleotide sequence for the detection of all these animals:
To identify the cow species, another pair of consensus primers was designed:
Specific capture nucleotide sequences have been designed:
The hybridization, the colorimetry labeling and the quantification were performed as in example 1.
The amplified targets are fragments of the sucrose synthase gene sequences corresponding to the different species, which were amplified by a PCR using the following consensus primers:
The PCR was performed as in example 1. The sizes of the amplicons obtained using these primers were 221 bp for all genera. After amplification, the amplicons were hybridized on an array bearing the capture nucleotide sequences for each species. The capture nucleotide sequences were composed of a spacer fixed by its 5′ end to the slides and having the following sequence and a specific part located on the 3′ end of the capture nucleotide sequence.
Specific sequences of the capture nucleotide sequences:
The hybridization, the colorimetry labeling and the quantification were performed as in example 1.
The amplified target sequences are fragments of the cytochrome b gene sequences corresponding to the different species, which were amplified by a PCR using the following consensus primers:
The PCR was performed as in example 1. The sizes of the amplicons obtained using these primers were 170 bp for all genera. After amplification, the amplicons were hybridized on an array bearing the capture nucleotide sequences for each species. The capture nucleotide sequences were composed of a spacer fixed by its 5′ end to the slides and having the following sequence and a specific part located on the 3′ end of the capture nucleotide sequence.
Specific sequences of the capture nucleotide sequences for the species:
Specific sequences of the capture nucleotide sequences for the families:
Among this family, a consensus capture nucleotide sequence was designed to detect the Thunnus genus: ATTCCACATCGGCCG (SEQ ID NO: 291)
Consensus capture nucleotide sequences for these various fish families:
The hybridization, the colorimetry labeling and the quantification were performed as in example 1.
The amplified targets are fragments of the cytochrome P450 gene sequences corresponding to the different families which were amplified by a PCR using the following consensus primers:
The conditions for the PCR amplification are the same as in example 1. The sizes of the amplicons obtained using these primers were 970 bp. After amplification, the amplicons were hybridized on an array bearing the capture nucleotide sequences for each single point mutation.
The capture nucleotide sequences were composed of a spacer fixed by its 5′ end to the slides and having the following sequence and a specific part located on the 3′ end of the capture nucleotide sequence.
Specific sequences of the capture nucleotide sequences for the single point mutations from different families of cytochrome p450.
The hybridization, the colorimetry labeling and the quantification were performed as in example 1.
Detection of the main bacteria responsible for meningitis by real-time PCR on cerebrospinal fluid was combined with genus and species sequence identification on a DNA microarray.
The tuf gene is phylogenetically well conserved among bacteria; it encodes an elongation factor (TE). The biological sample for the detection of meningitis was cerebrospinal fluid. Indeed, this medium is normally sterile and, if there is an infection, it would be contaminated by only one pathogen. This limits the risk of amplifying other genera with consensus primers.
For the real-time PCR, consensus primers for the tuf gene amplified all genera and species of interest, and the consensus probe for the tuf gene was labeled with two fluorochromes (quencher and emitter) as internal control of the PCR.
Biochips bore specific capture probes for the bacterial genera and species commonly found in meningitis infections:
Neisseria meningitidis serogroup A;
Neisseria meningitidis serogroup B;
Haemophilus influenzae;
Escherichia coli;
Streptococcus pneumoniae;
Streptococcus agalactiae;
Staphylococcus aureus;
Staphylococcus epidermidis;
Staphylococcus haemolyticus;
Staphylococcus hominis;
Staphylococcus saprophyticus.
The consensus sense primer was: 5′ GAATTRGTTGAAATGGAA 3′, 18 nt (SEQ ID NO: 305), (R=A or G), position 443-460, Tm=46-48° C., 1 mismatch maximum.
The consensus antisense primer was: 5′ GTAGTACGGAARTAGAA 3′, 17 nt (SEQ ID NO: 306), (R=A or G), position 995-1011, Tm=46-48° C., 1 mismatch maximum.
The double-labeled probe (sense) was: 5′ GGTGTTGAAATGTTCC 3′, 16 nt (SEQ ID NO: 307), position 776-792, Tm=46° C., 1 mismatch maximum.
Size of the amplified product: 569 bp.
Genus Specific Capture Probes
Identical for serogroup A and B and a minimum of 5 mismatches against the other genera.
Identical for Streptococcus pneumoniae and Streptococcus agalactiae and a minimum of 5 mismatches against the other genera.
Identical for Staphylococcus aureus, Staphylococcus epidermidis, Staphylococcus haemolyticus, Staphylococcus hominis, Staphylococcus saprophyticus and a minimum of 6 mismatches against the other genera.
Each of the sequences above comprised a spacer at its 5′ end Spacer sequence
Capture probes were aminated at their 5′ end.
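The selection criterion quoted for the genus-specific probes (identical within a genus, with a minimum number of mismatches against the other genera) can be checked by counting position-by-position differences between aligned sequences of equal length. The Python sketch below illustrates this; the sequences are hypothetical placeholders, not the probes of this example, and the function names are illustrative.

```python
def mismatches(a, b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned and of equal length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))

def min_mismatches(probe, other_targets):
    """Smallest mismatch count between a probe and any non-target sequence."""
    return min(mismatches(probe, target) for target in other_targets)

# Hypothetical sequences, used only to show the calculation.
probe = "ACGTACGTACGT"
other_genera = ["TCGAACCTAGGT", "AGGTTCGAACCA"]

worst_case = min_mismatches(probe, other_genera)
print(worst_case, "mismatches; meets a threshold of 5:", worst_case >= 5)
```

A probe is retained under this criterion only if its closest non-target sequence still differs at the required number of positions.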
The glass surface was activated in order to bear aldehyde groups as proposed by EP-00870184.9. The slides were then incubated with Protein A at 5 μg/ml in PBS solution for 60 min. The slides were washed in PBS and then incubated for 5 min. in NaBH4 solution at 2.5 mg/ml. After washing they were incubated for 2 h with 10% milk powder and then washed again. Antibodies at a concentration of 0.1 mg/ml were spotted on the glass slides with solid pins of 0.250 mm diameter and the final spots were around 0.35 mm in diameter. The spotting solution contained 0.05 M borate buffer pH 8, glycerol 40% and NP40 0.02%. After 3 washes with 0.01 M phosphate pH 7.4, non-specific binding sites were blocked with PBS containing milk powder at 0.1% for 1 h at 20° C.
For the reaction of the targets, the slides were incubated for 1 h at 20° C. with the samples in the presence of PBS containing milk powder at 0.1%. After 4 washes of one minute with a 10 mM maleate buffer containing 15 mM NaCl (washing buffer) the slides were incubated for 45 min. at 20° C. with an antibody common for the various targets potentially present in the samples, then with a conjugate of anti-IgG/gold particles of 10 nm diameter (diluted 100 times) in 100 mM maleate buffer containing 150 mM NaCl.
The slides were washed 5 times in the same washing buffer as before and then incubated for 10 min. in the Silver Blue detection solution (AAT Namur) for obtaining the silver crystal precipitation. The slides were finally washed in water before being read in the Silver Blue Reader (AAT).
The HLA-A typing was obtained using antibodies specific for the types or subtypes. The antibodies against HLA-ABC common, HLA-B7 and HLA-B27 were obtained from Cymbus Biotechnology, Ltd., Hampshire, UK. Other antibodies were from Pel-Freez, especially the antibodies directed against HLA-A2, A203 and A210 or HLA-B39, B3901 and B3902, which allow typing and subtyping of the HLA. Lymphocytes were isolated from the blood according to the classical microlymphocytotoxicity assay (Pel-Freez, Brown Deer, Wis., USA). Lymphocytes at 10×10⁶ cells/ml were incubated for 30 min. at 37° C. with the antibody array in RPMI 1640 medium with Hepes buffer. The arrays were then washed 4 times in the same medium. The second antibodies for cells were directed against CD-2 and CD-19. Then the anti-IgG/nano-gold complexes were incubated, followed by the Silver Blue (AAT, Namur, Belgium) for the detection.
The list of the four targeted GMO and the targeted plant species (invertase for Maize) to be detectable in the assay is presented in the table below together with the two primers used for each of the gene sequences to be amplified. Two common sequences are present on the two primers of the same pair. These common sequences which are later used as universal amplifying sequences are in bold. Specific sequences are in italics.
Amplifications were carried out in 25 μl volume reactions, with 100 ng of genomic DNA (each GMO was diluted in 100% non-GMO maize DNA to a final concentration of 0.1%). The GMO DNAs were obtained from the American Oil Chemist Society (AOCS) (Boulder, Urbana, Ill., USA). The sample contains the DNA of T25, Mon531 and LLRice62 at 0.1% in 99.7% DNA of non-GMO maize. The PCR reaction contained both the specific and the universal primers. The universal primers were TCTATATGCTCCACAGTATGCGA (Universal Primer A) (SEQ ID NO: 332) and ACTATATGCTCCACTCTATGCCT (Universal Primer B) (SEQ ID NO: 333). The PCR was performed using 10 nM of each of the specific primers, 0.667 μM of the biotinylated Universal Primer A, 1 μM of the biotinylated Universal Primer B, 1× QIAGEN Multiplex PCR Master Mix (Qiagen, Hilden, Germany), 600 μM dUTP (Roche, Mannheim, Germany), 0.5 U of UNG (USB corp., Cleveland, USA). Samples were first incubated at 22° C. for 10 min (activation of UNG) and then incubated at 94° C. for 15 min (inactivation of UNG/activation of Taq polymerase). The first ten amplification cycles were performed with 94° C. for 30 s, 55° C. for 90 s and 72° C. for 30 s. The next 30 cycles were performed with 94° C. for 30 s, 60° C. for 90 s and 72° C. for 30 s, followed by a final extension step of 10 min at 72° C.
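The sample composition stated above can be converted into the mass of each DNA present in the 100 ng template. The Python sketch below does this with exact decimal arithmetic and checks that the stated percentages add up to 100%; it assumes the percentages are mass fractions of the total template, and the names used are illustrative.

```python
from decimal import Decimal

TOTAL_DNA_NG = Decimal("100")

# Percent of total template DNA, as stated for the sample.
composition_percent = {
    "T25": Decimal("0.1"),
    "Mon531": Decimal("0.1"),
    "LLRice62": Decimal("0.1"),
    "non-GMO maize": Decimal("99.7"),
}

def mass_ng(percent, total_ng=TOTAL_DNA_NG):
    """Mass in ng contributed by a component present at the given percentage."""
    return total_ng * percent / Decimal("100")

total_percent = sum(composition_percent.values())
masses_ng = {name: mass_ng(pct) for name, pct in composition_percent.items()}

print("total:", total_percent, "%")
for name, mass in masses_ng.items():
    print(name, mass, "ng")
```

Each GMO therefore contributes 0.1 ng of DNA to the reaction, against 99.7 ng of non-GMO maize background.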
The resulting target amplicons were between 119 and 150 bp long.
The capture nucleotide sequences contained specific binding sequence for their respective target. The specific parts of the capture molecule are presented in the table below.
Each capture probe comprised a spacer at its 5′ end which has the following sequence: ataaaaaagtgggtcttagaaataaatttcgaagtgcaataattattattcacaacatttcgatttttgcaactacttcagttcactccaaatta (SEQ ID NO: 339).
The last nucleotide contained a free amino group for binding on the activated glass.
The capture molecules were chemically synthesised by Eurogentec (Liege, Belgium).
The capture molecules were spotted on Diaglass which are glass slides activated according to the process described in the EP01313677B1.
Each spot of the array was obtained according to the technology developed for the DualChips (Eppendorf; Array Technologies, Namur, Belgium) by deposit at a location on the slide of around 0.2 nl of spotting solution containing the capture molecules at 3 mM.
After amplification, the amplicons were hybridized on the arrays. The hybridization mix containing 9 μl of PCR product, 5 μl of Sensihyb solution (Eppendorf, Hamburg, Germany), 4 μl of hybridization control (Eppendorf, Hamburg, Germany) and 27 μl of water was denatured with 5 μl of NaOH and then incubated 5 min at room temperature. 50 μl of hybridisation solution (Eppendorf, Hamburg, Germany) were added to the mix and the solution was loaded on the array framed by a hybridisation chamber. The chamber was closed with a coverslip. The hybridisation was carried out at 60° C. for 1 h. Samples were washed with several washing buffers as described in the DualChip Manual.
The detection was performed in colorimetry using the Silverquant labeling provided by Eppendorf (Hamburg, Germany) and described in EP1179180B1. In short, the glass samples were first incubated 45 min at room temperature with colloidal gold-conjugated IgG Anti-biotin 1000× diluted in blocking buffer. After 5 washes with washing buffer, the presence of gold served for catalysis of silver reduction using a staining revelation solution. The slides were then incubated 3 times 10 min with the revelation mixture, then rinsed with water, dried and analysed using the Silverquant scanner. Each slide was then quantified by the Silverquant data analysis software. Data were corrected for the local background and the triplicates were averaged.
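The data treatment described above (local background correction followed by averaging of triplicate spots) amounts to the following calculation. This Python sketch is not the Silverquant software; the intensity values are hypothetical and the function name is illustrative.

```python
from statistics import mean

def corrected_mean(spots):
    """Average of (signal - local background) over replicate spots.

    spots -- iterable of (raw_signal, local_background) pairs, one per replicate
    """
    return mean(signal - background for signal, background in spots)

# Hypothetical triplicate readings for one capture probe.
triplicate = [(1200, 200), (1100, 150), (1300, 250)]
print(corrected_mean(triplicate))  # 1000
```

How negative or very low corrected values are reported (for example as not detectable) is a separate decision that this sketch does not model.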
Results
Signals for the four GMO and for the maize reference gene (invertase) were detected. Signal intensities after hybridization of PCR products are given in the table below as the raw values after background subtraction on a scale of 65 536.
The table also gives the signal values when the universal primers are not incorporated into the PCR. ND denotes a non-detectable value, considered as absent. The experiments were repeated two times and led to identical conclusions concerning the presence of the four different GMO and of the maize plant reference gene (invertase).
The list of the targeted GMO to be detected is presented in the table below together with the two primers used for each of the gene sequence to be amplified. Two common sequences are present on the two primers of the same pair. These common sequences which are later used as universal amplifying sequences are in bold. Specific sequences are in italics.
The PCR was performed on 100 ng DNA extracted from the samples using the standard CTAB procedure of Rogers and Bendich (1985, Plant Biol. 5:69-76). The PCR solution contained both the specific and the universal primers. The universal primers were TCTATATGCTCCACAGTATGCGA (SEQ ID NO: 366) and ACTATATGCTCCACTCTATGCCT (SEQ ID NO: 367). The PCR was performed using 10 nM of each of the specific primers, 1 μM of each biotinylated universal primer, 1× QIAGEN Multiplex PCR Master Mix (Qiagen, Hilden, Germany), 600 μM dUTP (Roche, Mannheim, Germany), 0.5 U of UNG (USB corp., Cleveland, USA), 100 ng of target genomic DNA in a final volume of 25 μl. Samples were first incubated at 22° C. for 10 min (activation of UNG) and then incubated at 94° C. for 15 min (inactivation of UNG/activation of Taq polymerase). The first ten amplification cycles were performed with 94° C. for 30 s, 55° C. for 90 s and 72° C. for 30 s. The next 30 cycles were performed with 94° C. for 30 s, 60° C. for 90 s and 72° C. for 30 s, followed by a final extension step of 10 min at 72° C.
The resulting target amplicons were between 104 and 180 bp long.
The capture nucleotide sequences contained specific binding sequence for their respective target. The specific parts of the capture molecule are presented in the table below.
Each capture probe comprised a spacer at its 5′ end which has the following sequence: ataaaaaagtgggtcttagaaataaatttcgaagtgcaataattattattcacaacatttcgatttttgcaactacttcagtcactccaaatta (SEQ ID NO: 381).
The last nucleotide contains a free amino group for binding on the activated glass.
The capture molecules were chemically synthesised by Eurogentec (Liege, Belgium).
The capture molecules were spotted on Diaglass which are glass slides activated according to the process described in the EP01313677B1.
Each spot of the array was obtained according to the technology developed for the DualChips (Eppendorf; Array Technologies, Namur, Belgium) by deposit at a location on the slide of around 0.2 nl of spotting solution containing the capture molecules at 3 mM.
The hybridization, the detection on the arrays and the quantification were performed as in Example 24.
The experiment was conducted as described in Example 24.
The list of the targeted genes is presented in the Table below together with the two primers used for each of the gene sequence to be amplified. The two universal amplifying sequences are in bold. Specific sequences are in italics. The SNP are present in Pseudomonas aeruginosa.
The method was based on a two-step PCR. The PCR reaction was performed using 0.05 μM of each of the specific primers, 1 μM of each of the two biotinylated universal primers, 1× TAQ polymerase reaction buffer (Eppendorf, Hamburg, Germany), 200 μM dNTP, 2.5 U of Taq polymerase (Eppendorf, Hamburg, Germany), 100 ng of target genomic DNA in a final volume of 25 μl. Samples were first denatured at 95° C. for 2 min. The first ten amplification cycles were performed with 95° C. for 30 s, 55° C. for 30 s and 72° C. for 30 s. The next 30 cycles were performed with 95° C. for 30 s, 60° C. for 30 s and 72° C. for 30 s, followed by a final extension step of 10 min at 72° C.
The resulting target amplicons were between 197 and 417 bp long.
The capture nucleotide sequences contained specific binding sequence for their respective SNP target. Wt represents the wild type sequence while the mut represents the mutated gene. The specific parts of the capture molecules are presented in the Table below.
Each capture probe comprised a spacer at its 5′ end which has the following sequence: ataaaaaagtgggtcttagaaataaatttcgaagtgcaataattattattcacaacatttcgatttttgcaactacttcagttcactccaaatta (SEQ ID NO: 402).
The last nucleotide contains a free amino group for binding on the activated glass.
After amplification, the amplicons were hybridized on the arrays. The hybridization mix containing 9 μl of PCR product, 5 μl of Sensihyb solution (Eppendorf, Hamburg, Germany), 4 μl of hybridization control (Eppendorf, Hamburg, Germany) and 27 μl of water was denatured with 5 μl of NaOH and then incubated 5 min at room temperature. 50 μl of hybridisation solution (Eppendorf, Hamburg, Germany) were added to the mix and the solution was loaded on the array framed by a hybridisation chamber. The chamber was closed with a coverslip. The hybridisation was carried out at 60° C. for 1 h. Samples were washed with several washing buffers as described in the DualChip Manual.
The detection was performed in colorimetry using the Silverquant labeling provided by Eppendorf (Hamburg, Germany) and described in EP1179180B1. In short, the glass samples were first incubated 45 min at room temperature with colloidal gold-conjugated IgG Anti-biotin 1000× diluted in blocking buffer. After 5 washes with washing buffer, the presence of gold served for catalysis of silver reduction using a staining revelation solution. The slides were then incubated 3 times 10 min with the revelation mixture, then rinsed with water, dried and analysed using the Silverquant scanner. Each slide was then quantified by the Silverquant data analysis software. Data were corrected for the local background and then averaged.
The experiment was conducted as described in Example 24.
The list of the targeted bacteria is presented in the Table below together with the two primers used for each of the gene sequence to be amplified. Universal amplifying sequences are in bold. Specific sequences are in italics.
The PCR reaction was performed using 0.05 μM of each of the specific primers, 1 μM of biotinylated universal primer, 1× TAQ polymerase reaction buffer (Eppendorf, Hamburg, Germany), 200 μM dNTP, 2.5 U of Taq polymerase (Eppendorf, Hamburg, Germany), 100 ng of target genomic DNA in a final volume of 25 μl. Samples were first denatured at 95° C. for 2 min. The first ten amplification cycles were performed with 95° C. for 30 s, 55° C. for 30 s and 72° C. for 30 s. The next 30 cycles were performed with 95° C. for 30 s, 60° C. for 30 s and 72° C. for 30 s, followed by a final extension step of 10 min at 72° C.
The resulting target amplicons were between 140 and 346 bp long.
The capture nucleotide sequences contained specific binding sequence for their respective target. The specific parts of the capture molecule are presented in the Table below.
Each capture probe comprised a spacer at its 5′ end which has the following sequence: ataaaaaagtgggtcttagaaataaatttcgaagtgcaataattattattcacaacatttcgatttttgcaactacttcagttcactccaaatta (SEQ ID NO: 439).
The last nucleotide contains a free amino group for binding on the activated glass.
After amplification, the amplicons were hybridized as described in example 28 and the detection is performed in colorimetry using the Silverquant labelling.
Six genes were selected as being expressed in the sample or being used as housekeeping genes. For each of them a specific primer pair was designed, one primer having a specific sequence complementary to the sense strand and the other one to the antisense strand. The primer sequences of the 6 different genes are described in the table below. Each of the different primers had an additional 5′ end universal sequence: TGCTATGCTCACAGATGCGA (SEQ ID NO: 440). The lengths of the amplified targets were between 80 bp and 107 bp. The same sequence TGCTATGCTCACAGATGCGA (SEQ ID NO: 441) was used for the universal primer.
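The primer design described above, a gene-specific 3′ part preceded by a common 5′ universal sequence, can be illustrated as follows. In this Python sketch the universal sequence is the one given in the text, while the two gene-specific parts are hypothetical placeholders, not primers of this example; the function name `tailed_primer` is illustrative.

```python
UNIVERSAL_TAIL = "TGCTATGCTCACAGATGCGA"  # universal sequence quoted in the text

def tailed_primer(specific_part):
    """Return a primer written 5' to 3': universal tail followed by the specific part."""
    return UNIVERSAL_TAIL + specific_part.upper()

# Hypothetical gene-specific parts, for illustration only.
forward = tailed_primer("GATCCTGAAGCTGACAGC")
reverse = tailed_primer("CTTGGAGTCCATGACAGG")

for name, primer in (("forward", forward), ("reverse", reverse)):
    print(name, len(primer), "nt", primer)
```

Because both primers of every pair carry the same 5′ tail, products made in the first cycles acquire the tail at both ends, so that a single universal primer of identical sequence can then amplify all targets.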
Total RNA Extraction
The total RNA extraction was performed using the RecoverAll Total Nucleic Acid Isolation kit from Ambion (Cat#1975). Part I of the isolation, the “Deparaffinization”, was performed starting from a Formalin-Fixed Paraffin-Embedded human tonsil section thinner than 80 μm, in a final volume of 1 ml 100% xylene (Fisher, #0287K), by incubating for 3 min at 50° C. then centrifuging for 2 min at maximum speed. The digested pellet was washed twice with 1 ml 100% ethanol (MERCK, 8.18760.1000). Part II, the “Protease digestion”, was performed by adding Digestion buffer and Protease, then by incubating for 16 h at 50° C. Part III, the “Nucleic Acid Isolation”, was performed by adding 480 μl Isolation additive and 1.1 ml 100% ethanol (MERCK, 8.18760.1000), then filtering and washing. Part IV, the “Nuclease Digestion and Final Purification”, was performed by incubating 60 μl of DNase for 30 min at room temperature, then by washing and eluting twice in 30 μl nuclease-free water.
FFPE extracted RNA yield was assessed in a ND-1000 spectrophotometer (Nanodrop Technologies, Inc. Wilmington, USA) and size distribution was obtained by capillary electrophoresis with the Agilent 2100 BioAnalyzer® (Agilent Technologies, Palo Alto, Calif.).
RT-PCR
RT-PCR was performed using an amplification kit from Promega (Access RT-PCR system, Cat# A1250). The RT-PCR was performed in a final volume of 50 μl; the following reagents were added in a reaction tube: 1× AMV/Tfl 5× Reaction Buffer, 200 μM of dNTP mix, 50 nM of each specific primer, 1 μM of the universal primer, 1 mM of MgSO4, 5 U of AMV Reverse Transcriptase (5 U/μl), 5 U of Tfl DNA Polymerase (5 U/μl), 1 μg of Breast Adenocarcinoma (MCF7) Total RNA from Ambion (Cat# AM7846), 32 μM of biotin-11-dATP (Perkin Elmer, NEL540, 1 mM) and 32 μM of biotin-11-dCTP (Perkin Elmer, NEL538, 1 mM).
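For the reagents whose stock concentration is stated above, the volume to pipette into the 50 μl reaction follows from C1·V1 = C2·V2. The Python sketch below covers only those reagents (the biotinylated nucleotides at 1 mM and the two enzymes at 5 U/μl); stock concentrations of the other components are not given in the text and are therefore left out. The helper name is illustrative.

```python
FINAL_VOLUME_UL = 50

def stock_volume_ul(final_amount, stock_per_ul_or_conc, final_volume_ul=None):
    """Volume of stock to add.

    For concentrations (same unit for final and stock) pass final_volume_ul:
        V_stock = C_final * V_final / C_stock
    For enzyme units pass only the units wanted and the stock in U/ul:
        V_stock = units / (U per ul)
    """
    if final_volume_ul is None:
        return final_amount / stock_per_ul_or_conc
    return final_amount * final_volume_ul / stock_per_ul_or_conc

volumes_ul = {
    "biotin-11-dATP (1 mM stock, 32 uM final)": stock_volume_ul(32, 1000, FINAL_VOLUME_UL),
    "biotin-11-dCTP (1 mM stock, 32 uM final)": stock_volume_ul(32, 1000, FINAL_VOLUME_UL),
    "AMV Reverse Transcriptase (5 U at 5 U/ul)": stock_volume_ul(5, 5),
    "Tfl DNA Polymerase (5 U at 5 U/ul)": stock_volume_ul(5, 5),
}

for reagent, volume in volumes_ul.items():
    print(f"{reagent}: {volume:.1f} ul")
```

This gives 1.6 μl of each biotinylated nucleotide and 1 μl of each enzyme per 50 μl reaction.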
The reaction tubes were then placed in a thermocycler programmed as follows: (i) reverse transcription of 45 min at 48° C., (ii) AMV RT inactivation at 94° C. for 2 min, (iii) 35 PCR cycles including a denaturation step of 30 sec at 94° C., annealing step of 60 sec at 54° C. and extension step of 2 min at 68° C. and a final extension step of 7 min at 68° C.
Water controls were used as negative controls of the amplification.
Microarray
DualChip human breast microarrays (Eppendorf, Hamburg, Germany) were used for the detection and the quantification of the amplified sequences. The DualChips were obtained by spotting aminated capture molecules on aldehyde-activated glass obtained according to EP01313677B1, the disclosure of which is incorporated herein by reference in its entirety, using a home-made robotic device. The capture molecules were part of the Xmer technology of Eppendorf and were between 200 and 450 bp long. The spots were around 250 μm in diameter. The slides were stored at 4° C.
The capture probes for the different genes detected in this example are presented in the table below. Their sequences are complementary of the gene transcripts which have to be detected. The sequence complementary of the amplified target sequence is shown in bold. The sequence located in the 5′ end of the capture molecules serves as spacer for the binding of the target amplified sequences.
Capture probe sequence for the different detected genes. The sequence complementary of the amplified target sequence is shown in bold.
Hybridization
For each condition, 10 μl of PCR product were added to the hybridization mix (Eppendorf, Hamburg, Germany) to a final volume of 100 μl. The hybridization mix was injected slowly by the injection port of the hybridization frame of the DualChip. The frame was sealed with an aluminium pad and immediately after the sealing, the slides were placed in the Thermomixer comfort (Eppendorf) and incubated overnight (for 12-16 h) at 60° C.
After the hybridization step, the slides were washed 4 times for 2 min with a washing buffer as described in the DualChip Manual.
Fluorescence Detection
The slides were incubated 45 min at room temperature with the Cy3-conjugated IgG Anti-biotin (Jackson Immuno Research Laboratories, Inc #200-162-096) diluted 1/1000× Conjugate-Cy3 in the blocking buffer and protected from light. After this incubation, the slides were washed 5 times for 2 min with the washing buffer and 2 times for 2 min with distilled water, and then dried before being stored at room temperature. The detection was performed in a confocal laser scanner “Autoloader ScanArray” (Packard, USA) and quantified by specific quantification software. The signal intensity for each spot was corrected by the subtraction of the local background and then averaged. The quantification process was described in detail by de Longueville et al. (2002 Biochem. Pharmacol. 64:137-149).
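The background correction and averaging described above can be illustrated as follows. This is a simplified sketch and not the quantification software referred to in the text: the function name, the number of replicate spots and the intensity values are all hypothetical.

```python
from statistics import mean


def corrected_signal(spots):
    """Average background-corrected intensity of replicate spots for one capture probe.

    spots: iterable of (foreground, local_background) pairs, one per replicate spot.
    Each spot is corrected by its own local background before averaging.
    """
    corrected = [foreground - background for foreground, background in spots]
    if not corrected:
        raise ValueError("at least one spot is required")
    return mean(corrected)


# Hypothetical readings for three replicate spots of one gene.
example_spots = [(1200.0, 200.0), (1100.0, 180.0), (1300.0, 220.0)]
print(corrected_signal(example_spots))  # 1000.0
```

Subtracting the local background per spot, rather than one global value per slide, compensates for uneven background across the array surface.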
Results
The extraction was performed on 4 slices of 10 μm FFPE human tonsil and 1 μg of extracted total RNA was used in the RT-PCR amplification.
The signal intensities of hybridization on DualChip human breast cancer for the different amplified gene transcripts are given in the table below. The scale of the scanner is from 1 to 65536.
Signal intensities of 6 amplified gene transcripts after hybridization on DualChip human breast cancer.
Total RNA Extraction
The extraction was performed on 4 slices of 10 μm FFPE human breast cancer samples from two different patients and 100 ng of extracted total RNA was used in RNA amplification. The patients were characterized based on the expression status of the estrogen receptor-α (ERα, gene ESR1).
The total RNA extraction was performed using the RecoverAll Total Nucleic Acid Isolation kit from Ambion (Cat#1975). Part I of the isolation, the “Deparaffinization”, was performed starting from Formalin-Fixed Paraffin-Embedded (FFPE) human breast cancer thin sections of 10 μm, in a final volume of 1 ml 100% xylene (Fisher, #0287K), by incubating for 3 min at 50° C. and then centrifuging for 2 min at maximum speed. The digested pellet was washed twice with 1 ml 100% ethanol (MERCK, 8.18760.1000). Part II, the “Protease digestion”, was performed by adding Digestion buffer and Protease, then by incubating for 16 h at 50° C. Part III, the “Nucleic Acid Isolation”, was performed by adding 480 μl Isolation additive and 1.1 ml 100% ethanol (MERCK, 8.18760.1000), then filtering and washing. Part IV, the “Nuclease Digestion and Final Purification”, was performed by incubating samples with 60 μl of DNase for 30 min at room temperature, then by washing and eluting twice in 30 μl nuclease-free water.
FFPE extracted RNA yield was assessed in a ND-1000 spectrophotometer (Nanodrop Technologies, Inc. Wilmington, USA) and size distribution was obtained by capillary electrophoresis with the Agilent 2100 BioAnalyzer® (Agilent Technologies, Palo Alto, Calif.).
Amplification
One hundred ng of total RNA was amplified and biotin labelled using a whole transcriptome amplification kit from NuGEN (WT Ovation™ Pico RNA amplification system) according to the manufacturer's instructions. The amplified product was a single-stranded cDNA in the antisense direction. After the amplification, 5 μg of purified amplified cDNA was labelled using NuGEN's FL-Ovation™ cDNA Biotin Module V2. The biotin labelling was performed in a final volume of 50 μl. The following reagents were added in a reaction tube: 15 μl of Labelling Buffer mix, 1.5 μl of Labelling Reagent, 1.5 μl of Labelling Enzyme mix, 64 μM of biotin-11-dATP (Perkin Elmer, NEL540, 1 mM), 64 μM of biotin-11-dCTP (Perkin Elmer, NEL538, 1 mM) and 5 μg of purified amplified cDNA. After the biotin labelling, the labelled cDNA was purified using the CyScribe GFX purification (Amersham) according to the instruction manual.
For the labelling reaction, the reaction tubes were placed in a thermocycler programmed as follows: (i) labelling for 60 min at 37° C., (ii) 10 min at 70° C. and (iii) hold at 4° C.
Microarray
DualChip human breast microarrays (Eppendorf, Hamburg, Germany) were used for the detection and the quantification of the amplified sequences. The DualChips were obtained by spotting aminated capture molecules on aldehyde-activated glass obtained according to EP01313677B1 using a home-made robotic device. The capture molecules were part of the Xmer technology of Eppendorf and were between 200 and 450 bp long. The spots were around 250 μm in diameter. The slides were stored at 4° C.
The capture probes for the different genes detected in this example are presented in the table below. Their sequences are complementary of the gene transcripts which have to be detected.
Capture probe sequence for the different detected genes.
Hybridization
For each condition, 20 μl of biotin-labelled cDNA were denatured for 2 min at 99° C., placed directly on ice for 5 min and then added to the hybridization mix (Eppendorf, Hamburg, Germany) to a final volume of 100 μl. The hybridization mix was injected slowly through the injection port of the hybridization frame of the DualChip. The frame was sealed with an aluminium pad and, immediately after the sealing, the slides were placed in the Thermomixer comfort (Eppendorf) and incubated overnight (for 12-16 h) at 60° C.
After the hybridization step, the slides were washed 4 times for 2 min with a washing buffer as described in the DualChip Manual.
Fluorescence Detection
The slides were incubated 45 min at room temperature with the Cy3-conjugated IgG Anti-biotin (Jackson Immuno Research Laboratories, Inc #200-162-096) diluted 1/1000× Conjugate-Cy3 in the blocking buffer and protected from light. After this incubation, the slides were washed 5 times for 2 min with the washing buffer and 2 times for 2 min with distilled water, and then dried before being stored at room temperature. The detection was performed in a confocal laser scanner “Autoloader ScanArray” (Packard, USA) and quantified by specific quantification software. The signal intensity for each spot was corrected by the subtraction of the local background and then averaged. The quantification process was described in detail by de Longueville et al. (2002 Biochem. Pharmacol. 64:137-149).
Results
The signal intensities of hybridization on DualChip human breast cancer for the different amplified gene transcripts are given in the table below. The scale of the scanner is from 1 to 65536.
Six internal standards were selected. They are part of the DualChip product and their capture probes are present on the predefined DualChip. They were polyadenylated RNAs produced by in vitro transcription using a T7 RNA polymerase. They were quantified and diluted in order to obtain the required concentrations. For each of them a primer pair was designed, one primer having a specific sequence complementary to the sense strand and the other one to the antisense strand. Each of the different primers had an additional universal sequence at its 5′ end, TGCTATGCTCACAGATGCGA (SEQ ID NO: 460), which was also the sequence of the universal primer.
The specific part of the primers sequences for the different genes are described in the table below. We also provide the concentrations of the various mRNA incorporated into the Primer sequences of the six internal standards.
The RT-PCR, hybridization, fluorescence detection and quantification experiments were conducted as described in Example 27. The detection of the amplified targets was performed on the DualChip human breast (Eppendorf, Hamburg, Germany). The DualChips were obtained by spotting aminated capture molecules on aldehyde-activated glass obtained according to EP01313677B1 using a home-made robotic device. The capture molecules were part of the Xmer technology of Eppendorf and were between 200 and 450 bp long. The spots were around 250 μm in diameter. The slides were stored at 4° C.
The capture probes for the different genes detected in this example are presented in the table below. Their sequences are complementary of the gene transcripts which have to be detected. The sequence complementary of the amplified target sequence is shown in bold. The sequence located in the 5′ end of the capture molecules serves as spacer for the binding of the target amplified sequences.
Capture probe sequence for the different detected genes. The sequence complementary of the amplified target sequence is shown in bold.
The bacteria to be detected, the genes to be amplified, the primer pairs and the amplification conditions are as in Example 26. The universal primers were biotinylated at the 5′ terminus.
The capture molecules had the same sequences as the probes of Example 26 with an amino group at the 5′ end. The beads were the xMAP Multi-analyte COOH Microspheres from Luminex (Oosterhout, The Netherlands). The beads were labelled with fluorescent dyes and contained a surface layer of avidin which was used for the binding of the biotinylated probes. The beads were obtained at a concentration of 2.5×10⁶ beads per ml. One capture probe was bound to one particular bead population. The coupling of the probes on their respective beads was performed as proposed by Cowan, L. et al. (2004 J. Clin. Microbiol. 42:474-477).
To couple the probes to the microspheres (Luminex Corp.), 200 μmol of polynucleotide probes, 2.5×10⁶ microspheres, and 25 μg of freshly purchased N-(3-dimethylaminopropyl)-N′-ethylcarbodiimide (Pierce Chemical, Rockford, Ill.) were combined in 25 μl of 100 mM 2-(N-morpholino)ethanesulfonic acid (MES), pH 4.5 (Sigma, St. Louis, Mo.). The reaction mixtures were incubated at room temperature in the dark for 30 min. The N-(3-dimethylaminopropyl)-N′-ethylcarbodiimide addition and subsequent incubation were repeated once. After coupling, the microspheres were washed with 0.5 ml of 0.02% Tween 20 followed by 0.5 ml of 0.1% sodium dodecyl sulfate. The prepared microspheres were suspended in 50 μl of Tris-EDTA, pH 8.0, and stored at 4° C. in the dark. A microsphere mix was prepared by combining equal volumes of each of the different beads bearing the different capture molecules.
For hybridization, the amplicons were first denatured by preparation of a hybridization mix containing 5 μl of PCR product, 5 μl of sensihyb solution (Eppendorf, Hamburg, Germany), 4 μl of hybridization control (Eppendorf, Hamburg, Germany) and 6 μl of water. 5 μl of NaOH was then added to the mix and then incubated 5 min at room temperature.
The diluted microsphere mix was prepared by diluting the microsphere mix in the hybridisation solution (Eppendorf, Hamburg, Germany) to a final concentration of approximately 150 microspheres of each set/μl. PCR product (25 μl) and diluted microsphere mix (25 μl) were combined in a Thermowell 96-well plate (VWR International, West Chester, Pa.). The reaction mixtures were incubated for 60 min at 60° C. in a GeneAmp 9700 PCR System (Perkin-Elmer, Foster City, Calif.). The plate was centrifuged at 2,250×g for 3 min, the supernatant was removed by pipette, and the microspheres were resuspended with 75 μl of detection buffer (R-phycoerythrin-conjugated streptavidin [Molecular Probes, Eugene, Oreg.] diluted to 4 μg/ml with 1× hybridization buffer). Following a 5-min incubation at 52° C., the samples were analyzed in the Luminex 100, version 1.7; a minimum of 100 events/microsphere set were analyzed.
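The dilution needed to reach approximately 150 microspheres of each set per μl can be estimated as sketched below. The calculation assumes that all 2.5×10⁶ coupled microspheres of a set are recovered in the 50 μl Tris-EDTA suspension and that the mix combines equal volumes of every set; the number of bead sets (10 here) is a hypothetical value, since it depends on the number of capture probes used.

```python
def per_set_concentration(beads_per_set, suspension_volume_ul, n_sets):
    """Beads of one set per ul after pooling equal volumes of n_sets suspensions."""
    if n_sets < 1:
        raise ValueError("at least one bead set is required")
    stock_per_ul = beads_per_set / suspension_volume_ul
    return stock_per_ul / n_sets


def dilution_factor(current_per_ul, target_per_ul):
    """Fold dilution that brings current_per_ul down to target_per_ul."""
    return current_per_ul / target_per_ul


N_SETS = 10            # hypothetical number of capture probes / bead sets
TARGET_PER_UL = 150.0  # from the text: ~150 microspheres of each set per ul
MIX_VOLUME_UL = 25.0   # diluted mix added to each well

pooled = per_set_concentration(2.5e6, 50.0, N_SETS)
fold = dilution_factor(pooled, TARGET_PER_UL)
beads_per_well = TARGET_PER_UL * MIX_VOLUME_UL

print(f"pooled mix: {pooled:.0f} beads of each set per ul")
print(f"dilute about {fold:.1f}-fold in hybridisation solution")
print(f"about {beads_per_well:.0f} beads of each set per well")
```

Under these assumptions each well receives a few thousand beads per set, comfortably above the minimum of 100 events per microsphere set that were analyzed.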
The beads were then analyzed in a Luminex 100 IS system (Oosterhout, The Netherlands), a flow-cell fluorometer that uses two different lasers to detect both the beads, according to their fluorescent dyes, and the fluorochrome attached to the labelled targets. The Luminex 100 system associated the presence of a specific capture probe on a bead carrying a particular dye with the intensity of the fluorochrome resulting from the binding of the target to this capture molecule. The quantification was performed as presented by Spiro A. and M. Lowe (2002 Appl. Environ. Microbiol. 68:1010-1013). The intensity values of the target signals (reporter signals) were converted into units known as molecules of equivalent soluble fluorochrome (MESF) using Quantum 27 (R-PE) Reference Standards (Bangs Laboratories, Inc.) according to standard procedures. Cytometry data were analyzed with FCS Express version 1.065 (De Novo Software). The mean intensity (Is) of the reporter signal and the intersample standard deviation (SD) were determined by running ≧7 replicate tubes. A similar procedure was used for the background signal (Ib). The uncertainty in the fluorescence response F=Is−Ib was calculated using the standard error SD in the difference of means.
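The fluorescence response F=Is−Ib and its uncertainty can be computed as in the sketch below. The text does not spell out the formula, so the usual standard error of a difference of two independent means, sqrt(SDs²/ns + SDb²/nb), is assumed here; the function name and the replicate values are hypothetical.

```python
import math
from statistics import mean, stdev


def fluorescence_response(signal_replicates, background_replicates):
    """Return (F, SE) where F = Is - Ib and SE is the standard error of the difference.

    Assumption: signal and background replicates are independent, so
    SE = sqrt(SDs**2 / ns + SDb**2 / nb), with sample standard deviations.
    """
    if len(signal_replicates) < 2 or len(background_replicates) < 2:
        raise ValueError("at least two replicates are needed for each measurement")
    i_s = mean(signal_replicates)
    i_b = mean(background_replicates)
    se = math.sqrt(
        stdev(signal_replicates) ** 2 / len(signal_replicates)
        + stdev(background_replicates) ** 2 / len(background_replicates)
    )
    return i_s - i_b, se


# Hypothetical MESF values from 7 replicate tubes each.
signal = [100.0, 102.0, 98.0, 101.0, 99.0, 103.0, 97.0]
background = [10.0, 11.0, 9.0, 10.0, 12.0, 8.0, 10.0]

response, error = fluorescence_response(signal, background)
print(f"F = {response:.1f} +/- {error:.2f} MESF")
```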
Number | Date | Country | Kind |
---|---|---|---|
00870055.1 | Mar 2000 | EP | regional |
00870204.5 | Sep 2000 | EP | regional |
This is a continuation-in-part of U.S. patent application Ser. No. 10/056,229, filed Jan. 23, 2002, which is a continuation-in-part of U.S. patent application Ser. No. 09/817,014 filed Mar. 23, 2001, which claims priority to European Application Serial Number 00870055.1 filed on Mar. 24, 2000, and European Application Serial Number 00870204.5 filed on Sep. 15, 2000, the disclosures of all of which are incorporated herein by reference in their entireties.
Relationship | Number | Date | Country |
---|---|---|---|
Parent | 10056229 | Jan 2002 | US |
Child | 11694871 | Mar 2007 | US |
Parent | 09817014 | Mar 2001 | US |
Child | 10056229 | Jan 2002 | US |