The present invention is directed to a method of sequential signal-encoding of analytes in a sample, a use of a set of decoding oligonucleotides to sequentially signal-encode analytes in a sample, and to a kit for sequentially signal-encoding of analytes in a sample.
The present invention relates to the field of molecular biology, more particularly to the detection of analytes in a sample, preferably the detection of biomolecules such as nucleic acid molecules and/or proteins in a biological sample.
The analysis and detection of small quantities of analytes in biological and non-biological samples has become a routine practice in the clinical and analytical environment. Numerous analytical methods have been established for this purpose. Some of them use encoding techniques assigning a particular readable code to a specific first analyte which differs from a code assigned to a specific second analyte.
One of the prior art techniques in this field is the so-called ‘single molecule fluorescence in situ hybridization’ (smFISH) essentially developed to detect mRNA molecules in a sample. In Lubeck et al. (2014), Single-cell in situ RNA profiling by sequential hybridization, Nat. Methods 11(4), p. 360-361, the mRNAs of interest are detected via specific directly labeled probe sets. After one round of hybridization and detection, the set of mRNA specific probes is eluted from the mRNAs and the same set of probes with other (or the same) fluorescent labels is used in the next round of hybridization and imaging to generate gene specific color-code schemes over several rounds. The technology needs several differently tagged probe sets per transcript and needs to denature these probe sets after every detection round.
A further development of this technology does not use directly labeled probe sets. Instead, the oligonucleotides of the probe sets provide nucleic acid sequences that serve as initiator for hybridization chain reactions (HCR), a technology that enables signal amplification; see Shah et al. (2016), In situ transcription profiling of single cells reveals spatial organization of cells in the mouse hippocampus, Neuron 92(2), p. 342-357.
Another technique referred to as ‘multiplexed error robust fluorescence in situ hybridization’ (merFISH) is described by Chen et al. (2015), RNA imaging. Spatially resolved, highly multiplexed RNA profiling in single cells, Science 348(6233):aaa6090. There, the mRNAs of interest are detected via specific probe sets that provide additional sequence elements for the subsequent specific hybridization of fluorescently labeled oligonucleotides. Each probe set provides four different sequence elements out of a total of 16 sequence elements. After hybridization of the specific probe sets to the mRNAs of interest, the so-called readout hybridizations are performed. In each readout hybridization one out of the 16 fluorescently labeled oligonucleotides complementary to one of the sequence elements is hybridized. All readout oligonucleotides use the same fluorescent color. After imaging, the fluorescent signals are destroyed via illumination and the next round of readout hybridization takes place without a denaturing step. As a result, a binary code is generated for each mRNA species. A unique signal signature of 4 signals in 16 rounds is created using only a single hybridization round for binding of specific probe sets to the mRNAs of interest, followed by 16 rounds of hybridization of readout oligonucleotides labeled by a single fluorescence color.
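By way of illustration only, the size of such a binary code space may be checked with a few lines of arithmetic. The following Python sketch counts the 16-bit binary words that contain exactly four signals; the function name constant_weight_words is hypothetical, and the count is merely an upper bound that disregards any error-robustness constraint the cited publication may impose on the code words actually used.

```python
from itertools import combinations
from math import comb


def constant_weight_words(n_rounds, n_signals):
    """Yield every binary word of length n_rounds with exactly n_signals ones."""
    for on_rounds in combinations(range(n_rounds), n_signals):
        # A '1' marks a readout round in which a fluorescent signal is seen.
        yield tuple(1 if r in on_rounds else 0 for r in range(n_rounds))


# 4 signals distributed over 16 readout rounds (numbers taken from the text).
words = list(constant_weight_words(16, 4))
print(len(words), comb(16, 4))
```

The enumeration and the closed-form binomial coefficient agree, which confirms that the signature "4 signals in 16 rounds" admits at most C(16, 4) distinct binary words.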
A further development of this technology improves the throughput by using two different fluorescent colors, eliminating the signals via disulfide cleavage between the readout-oligonucleotides and the fluorescent label and an alternative hybridization buffer; see Moffitt et al. (2016), High-throughput single-cell gene-expression profiling with multiplexed error-robust fluorescence in situ hybridization, Proc. Natl. Acad. Sci. USA. 113(39), p. 11046-11051.
A technology referred to as ‘intron seqFISH’ is described in Shah et al. (2018), Dynamics and spatial genomics of the nascent transcriptome by intron seqFISH, Cell 174(2), p. 363-376. There, the mRNAs of interest are detected via specific probe sets that provide additional sequence elements for the subsequent specific hybridization of fluorescently labeled oligonucleotides. Each probe set provides one out of 12 possible sequence elements (representing the 12 ‘pseudocolors’ used) per color-coding round. Each color-coding round consists of four serial hybridizations. In each of these serial hybridizations, three readout probes, each labeled with a different fluorophore, are hybridized to the corresponding elements of the mRNA-specific probe sets. After imaging, the readout probes are stripped off by a 55% formamide buffer and the next hybridization follows. After 5 color-coding rounds with 4 serial hybridizations each, the color-codes are completed.
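The numbers given above can be combined in a short calculation. The following Python sketch is an illustration only: the constant names are hypothetical, and the resulting code space is the plain arithmetic upper bound under the assumption that every color-coding round contributes an independent pseudocolor, which the above summary does not state.

```python
# Figures taken from the description of intron seqFISH above.
FLUOROPHORES_PER_HYBRIDIZATION = 3
SERIAL_HYBRIDIZATIONS_PER_ROUND = 4
COLOR_CODING_ROUNDS = 5

# Each serial hybridization images 3 fluorophores, so one round
# distinguishes 3 * 4 pseudocolors.
pseudocolors = FLUOROPHORES_PER_HYBRIDIZATION * SERIAL_HYBRIDIZATIONS_PER_ROUND

# Total number of hybridization/imaging cycles the experiment needs.
total_hybridizations = SERIAL_HYBRIDIZATIONS_PER_ROUND * COLOR_CODING_ROUNDS

# Upper bound on distinct color codes if all rounds are independent (assumption).
code_space_upper_bound = pseudocolors ** COLOR_CODING_ROUNDS

print(pseudocolors, total_hybridizations, code_space_upper_bound)
```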
EP 0 611 828 discloses the use of a bridging element to recruit a signal generating element to probes that specifically bind to an analyte. A more specific statement describes the detection of nucleic acids via specific probes that recruit a bridging nucleic acid molecule. These bridging nucleic acids eventually recruit signal generating nucleic acids. This document also describes the use of a bridging element with more than one binding site for the signal generating element for signal amplification like branched DNA.
Player et al. (2001), Single-copy gene detection using branched DNA (bDNA) in situ hybridization, J. Histochem. Cytochem. 49(5), p. 603-611, describe a method where the nucleic acids of interest are detected via specific probe sets providing an additional sequence element. In a second step, a preamplifier oligonucleotide is hybridized to this sequence element. This preamplifier oligonucleotide comprises multiple binding sites for amplifier oligonucleotides that are hybridized in a subsequent step. These amplifier oligonucleotides provide multiple sequence elements for the labeled oligonucleotides. This way a branched oligonucleotide tree is built up that leads to an amplification of the signal.
A further development of this method referred to as ‘RNAscope’ is described by Wang et al. (2012), RNAscope: a novel in situ RNA analysis platform for formalin-fixed, paraffin-embedded tissues, J. Mol. Diagn. 14(1), p. 22-29, which uses another design of the mRNA-specific probes. Here two of the mRNA-specific oligonucleotides have to hybridize in close proximity to provide a sequence that can recruit the preamplifier oligonucleotide. This way the specificity of the method is increased by reducing the number of false positive signals.
Choi et al. (2010), Programmable in situ amplification for multiplexed imaging of mRNA expression, Nat. Biotechnol. 28(11), p. 1208-1212, disclose a method known as ‘HCR-hybridization chain reaction’. The mRNAs of interest are detected via specific probe sets that provide an additional sequence element. The additional sequence element is an initiator sequence to start the hybridization chain reaction. Basically, the hybridization chain reaction is based on metastable oligonucleotide hairpins that self-assemble into polymers after a first hairpin is opened via the initiator sequence.
A further development of the technology uses so-called split initiator probes that have to hybridize in close proximity to form the initiator sequence for HCR. Similarly to the RNAscope technology, this reduces the number of false positive signals; see Choi et al. (2018), Third-generation in situ hybridization chain reaction: multiplexed, quantitative, sensitive, versatile, robust. Development 145(12).
Mateo et al. (2019), Visualizing DNA folding and RNA in embryos at single-cell resolution, Nature 568, p. 49ff., disclose a method called ‘optical reconstruction of chromatin structure’ (ORCA). This method is intended to make the chromosome line visible.
The methods known in the art, however, have numerous disadvantages. In particular, they are inflexible, expensive, complex, time consuming and quite often provide inaccurate results. In particular, the encoding capacities of the existing methods are low and do not meet the requirements of modern molecular biology and medicine.
Against this background, it is an object underlying the present invention to provide a method by means of which the disadvantages of the prior art methods can be reduced or even avoided.
The present invention satisfies these and other needs.
The present invention provides a method of sequential signal-encoding of analytes in a sample, the method comprising the steps:
(1) providing a set of analyte-specific probes, each analyte-specific probe comprising:
(2) incubating the set of analyte-specific probes with the sample, thereby allowing a specific binding of the analyte-specific probes to the analyte to be encoded;
(3) removing non-bound probes from the sample;
(4) providing a set of decoding oligonucleotides, each decoding oligonucleotide comprising:
(5) incubating the set of decoding oligonucleotides with the sample, thereby allowing a specific hybridization of the decoding oligonucleotides to the unique identifier sequence;
(6) removing non-bound decoding oligonucleotides from the sample;
(7) providing a set of signal oligonucleotides, each signal oligonucleotide comprising:
The inventors have realized that this novel method provides the essential steps required to set up a process allowing the specific quantitative and/or spatial detection or counting of different analytes or different single analyte molecules in a sample in parallel via specific hybridization. The technology allows distinguishing a higher number of analytes than there are different signals available. In contrast to other state-of-the-art methods the oligonucleotides providing the detectable signal are not directly interacting with sample-specific nucleic acid sequences but are mediated by so-called ‘decoding-oligonucleotides’. This mechanism decouples the dependency between the analyte-specific oligonucleotides and the signal oligonucleotides and, therefore, results in a dramatic increase of the encoding capacity.
Another subject-matter of the invention is the use of a set of decoding oligonucleotides to sequentially signal-encode analytes in a sample, each decoding oligonucleotide comprising:
A still further subject-matter of the present invention is a kit for sequentially signal-encoding of analytes in a sample, comprising
The use of decoding-oligonucleotides allows a much higher flexibility while dramatically decreasing the number of different signal oligonucleotides needed which in turn increases the encoding capacity achieved. The utilization of decoding-oligonucleotides leads to a sequential signal-coding technology that is more flexible, cheaper, simpler, faster and/or more accurate than other methods. In particular, the invention results in a significant increase of the encoding capacity in comparison to the prior art methods.
The use of decoding oligonucleotides breaks the dependencies between the target specific probes and the signal oligonucleotides. Without decoupling target specific probes and signal generation as in the methods of the state of the art, two different signals can only be generated for a certain target if using two different molecular tags. Each of these molecular tags can only be used once. Multiple readouts of the same molecular tag do not increase the information about the target. In order to create an encoding scheme, a change of the target specific probe set after each round is required (SeqFISH) or multiple molecular tags must be present on the same probe set (like merFISH, intronSeqFISH). These restrictions in the art are very relevant and reduce the flexibility, coding capacity, accuracy, reproducibility and increase the costs of the experiment.
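The relationship between the number of available signals, the number of rounds and the encoding capacity may be illustrated by the following Python sketch. It is a simplified model under stated assumptions, not a definition of the invention: every round is assumed to yield exactly one out of num_signals distinguishable signals per analyte, and neither a "no signal" state nor any error-correcting redundancy is considered. The function names are hypothetical.

```python
def encoding_capacity(num_signals, num_rounds):
    """Number of distinct code words if each round shows one of num_signals signals."""
    return num_signals ** num_rounds


def rounds_needed(num_analytes, num_signals):
    """Smallest number of rounds whose capacity covers num_analytes (assumed model)."""
    if num_signals < 2 or num_analytes < 1:
        raise ValueError("need at least two signals and one analyte")
    rounds = 1
    while encoding_capacity(num_signals, rounds) < num_analytes:
        rounds += 1
    return rounds


# Three distinguishable signals read out over five rounds.
print(encoding_capacity(3, 5))
# Rounds required to give 1000 analytes a distinct code word with three signals.
print(rounds_needed(1000, 3))
```

Under this model the capacity grows exponentially with the number of rounds, whereas the number of different signal oligonucleotides stays fixed at the number of signals.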
According to the invention an “analyte” is the subject to be specifically detected as being present or absent in a sample and, in case of its presence, to encode it. It can be any kind of entity, including a protein or a nucleic acid molecule (RNA or DNA) of interest. The analyte provides at least one site for specific binding with analyte-specific probes. Sometimes herein the term “analyte” is replaced by “target”. An “analyte” according to the invention includes a complex of subjects, e.g. at least two individual nucleic acid, protein or peptide molecules. In an embodiment of the invention an “analyte” excludes a chromosome. In another embodiment of the invention an “analyte” excludes DNA.
A “sample” as referred to herein is a composition in liquid or solid form suspected of comprising the analytes to be encoded.
An “oligonucleotide” as used herein, refers to a short nucleic acid molecule, such as DNA, PNA, LNA or RNA. The length of the oligonucleotides is within the range 4-200 nucleotides (nt), preferably 6-80 nt, more preferably 8-60 nt, more preferably 10-50 nt, more preferably 12-35 nt, depending on the number of consecutive sequence elements. The nucleic acid molecule can be fully or partially single-stranded. The oligonucleotides may be linear or may comprise hairpin or loop structures. The oligonucleotides may comprise modifications such as biotin, labeling moieties, blocking moieties, or other modifications.
The “analyte-specific probe” consists of at least two elements, namely the so-called binding element (S) which specifically interacts with one of the analytes, and a so-called identifier element (T) comprising the ‘unique identifier sequence’. The binding element (S) may be a nucleic acid such as a hybridization sequence or an aptamer, or a peptidic structure such as an antibody. The “unique identifier sequence” as comprised by the analyte-specific probe is unique in its sequence compared to other unique identifiers. “Unique” in this context means that it specifically identifies only one analyte, such as Cyclin A, Cyclin D, Cyclin E etc., or, alternatively, it specifically identifies only a group of analytes, independently of whether the group of analytes comprises a gene family or not. Therefore, the analyte or a group of analytes to be encoded by this unique identifier can be distinguished from all other analytes or groups of analytes that are to be encoded based on the unique identifier sequence of the identifier element (T). Or, in other words, there is only one ‘unique identifier sequence’ for a particular analyte or a group of analytes, but not more than one, i.e. not even two. Due to the uniqueness of the unique identifier sequence the identifier element (T) hybridizes to exactly one type of decoding oligonucleotides. The length of the unique identifier sequence is within the range 8-60 nt, preferably 12-40 nt, more preferably 14-20 nt, depending on the number of analytes encoded in parallel and the stability of interaction needed. A unique identifier may be a sequence element of the analyte-specific probe, attached directly or by a linker, a covalent bond or high affinity binding modes, e.g. antibody-antigen interaction, streptavidin-biotin interaction etc.
It is understood that the term “analyte specific probe” includes a plurality of probes which may differ in their binding elements (S) in a way that each probe binds to the same analyte but possibly to different parts thereof, for instance to different (e.g. neighboring) or overlapping sections of the nucleotide sequence comprised by the nucleic acid molecule to be encoded. However, each of the plurality of the probes comprises the same identifier element (T).
A “decoding oligonucleotide” consists of at least two sequence elements. One sequence element that can specifically bind to a unique identifier sequence, referred to as “first connector element” (t), and a second sequence element specifically binding to a signal oligonucleotide, referred to as “translator element” (c). The length of the sequence elements is within the range 8-60 nt, preferably 12-40 nt, more preferably 14-20 nt, depending on the number of analytes to be encoded in parallel, the stability of interaction needed and the number of different signal oligonucleotides used. The length of the two sequence elements may or may not be the same.
A “signal oligonucleotide” as used herein comprises at least two elements, a so-called “second connector element” (C) having a nucleotide sequence specifically hybridizable to at least a section of the nucleotide sequence of the translator element (c) of the decoding oligonucleotide, and a “signal element” which provides a detectable signal. This element can either actively generate a detectable signal or provide such a signal via manipulation, e.g. fluorescent excitation. Typical signal elements are, for example, enzymes that catalyze a detectable reaction, fluorophores, radioactive elements or dyes.
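The interplay of the three kinds of molecules defined above may be illustrated by the following Python sketch. It is a deliberately simplified model: all class and function names as well as the example sequences are hypothetical, only DNA with the bases A, C, G and T is considered, and hybridization is reduced to perfect reverse complementarity, whereas the definitions herein also admit essentially complementary sequences.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class AnalyteSpecificProbe:
    binding_element: str     # (S): binds the analyte
    identifier_element: str  # (T): unique identifier sequence


@dataclass(frozen=True)
class DecodingOligonucleotide:
    first_connector: str  # (t): hybridizes to (T)
    translator: str       # (c): hybridizes to (C)


@dataclass(frozen=True)
class SignalOligonucleotide:
    second_connector: str  # (C): hybridizes to (c)
    signal_element: str    # e.g. the name of a fluorophore


def signal_for(probe, decoder, signal_oligo):
    """Return the signal element if (T)-(t) and (c)-(C) pair perfectly, else None."""
    if decoder.first_connector != reverse_complement(probe.identifier_element):
        return None
    if signal_oligo.second_connector != reverse_complement(decoder.translator):
        return None
    return signal_oligo.signal_element


# Hypothetical 16-nt sequences, chosen only to make the example runnable.
T = "ACGTTGCAAGGCTTAC"
c = "GGATCCTTAGCAATCG"
probe = AnalyteSpecificProbe("ATGGCCATTGTAATGGGCCGC", T)
decoder = DecodingOligonucleotide(reverse_complement(T), c)
signal = SignalOligonucleotide(reverse_complement(c), "fluorophore_A")
print(signal_for(probe, decoder, signal))
```

In this model the signal oligonucleotide never refers to the identifier element (T); exchanging the decoding oligonucleotide is sufficient to attach a different signal to the same probe.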
A “set” refers to a plurality of moieties or subjects, e.g. analyte-specific probes or decoding oligonucleotides, whether the individual members of said plurality are identical or different from each other. In an embodiment of the invention a single set refers to a plurality of oligonucleotides.
An “analyte specific probe set” refers to a plurality of moieties or subjects, e.g. analyte-specific probes that are different from each other and bind to independent regions of the analyte. A single analyte specific probe set is further characterized by the same unique identifier.
A “decoding oligonucleotide set” refers to a plurality of decoding oligonucleotides specific for a certain unique identifier needed to realize the encoding independent of the length of the code word. Each and all of the decoding oligonucleotides included in a “decoding oligonucleotide set” bind to the same unique identifier element (T) of the analyte-specific probe.
“Essentially complementary” means, when referring to two nucleotide sequences, that both sequences can specifically hybridize to each other under stringent conditions, thereby forming a hybrid nucleic acid molecule with a sense and an antisense strand connected to each other via hydrogen bonds (Watson-and-Crick base pairs). “Essentially complementary” includes not only perfect base-pairing along the entire strands, i.e. perfectly complementary sequences, but also imperfectly complementary sequences which, however, still have the capability to hybridize to each other under stringent conditions. Among experts it is well accepted that an “essentially complementary” sequence has at least 88% sequence identity to a fully or perfectly complementary sequence.
“Percent sequence identity” or “percent identity” in turn means that a sequence is compared to a claimed or described sequence after alignment of the sequence to be compared (the “Compared Sequence”) with the described or claimed sequence (the “Reference Sequence”). The percent identity is then determined according to the following formula: percent identity = 100 × [1 − (C/R)]
If an alignment exists between the Compared Sequence and the Reference Sequence for which the percent identity as calculated above is about equal to or greater than a specified minimum Percent Identity then the Compared Sequence has the specified minimum percent identity to the Reference Sequence even though alignments may exist in which the herein above calculated percent identity is less than the specified percent identity.
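The above formula may be applied as in the following Python sketch. Since the meaning of C and R is not spelled out in the present passage, the sketch rests on an explicitly labeled assumption: C is taken as the number of aligned positions at which the Compared Sequence differs from the Reference Sequence (gaps counting as differences), and R as the number of positions of the alignment. The function name and the example sequences are hypothetical, and the two sequences are expected to be supplied already aligned, with '-' marking a gap.

```python
def percent_identity(reference_aligned, compared_aligned):
    """Evaluate 100 * [1 - (C/R)] for two pre-aligned sequences of equal length.

    Assumption: R is the alignment length, C the number of differing positions.
    """
    if len(reference_aligned) != len(compared_aligned):
        raise ValueError("aligned sequences must have the same length")
    r = len(reference_aligned)
    if r == 0:
        raise ValueError("alignment must not be empty")
    # Count every position where the two aligned sequences disagree.
    c = sum(1 for a, b in zip(reference_aligned, compared_aligned) if a != b)
    return 100.0 * (1.0 - c / r)


# Hypothetical 20-nt alignment with two mismatches (positions 5 and 14).
reference = "ACGTTGCAAGGCTTACGGAT"
compared = "ACGTAGCAAGGCTAACGGAT"
print(percent_identity(reference, compared))
```

With two differences over twenty aligned positions the example evaluates to about 90 percent, which would lie above the 88% threshold mentioned for essentially complementary sequences.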
In the “incubation” steps as understood herein the respective moieties or subjects, such as probes or oligonucleotides, are brought into contact with each other under conditions well known to the skilled person allowing a specific binding or hybridization reaction, e.g. pH, temperature, salt conditions etc. Such steps may therefore preferably be carried out in a liquid environment such as a buffer system which is well known in the art.
The “removing” steps according to the invention may include the washing away of the moieties or subjects to be removed such as the probes or oligonucleotides by certain conditions, e.g. pH, temperature, salt conditions etc., as known in the art.
It is understood that in an embodiment of the method according to the invention a plurality of analytes can be encoded in parallel. This requires the use of different sets of analyte-specific probes in step (1). The analyte-specific probes of a particular set differ from the analyte-specific probes of another set. This means that the analyte-specific probes of set 1 bind to analyte 1, the analyte-specific probes of set 2 bind to analyte 2, the analyte-specific probes of set 3 bind to analyte 3, etc. In this embodiment also the use of different sets of decoding oligonucleotides is required in step (4). The decoding oligonucleotides of a particular set differ from the decoding oligonucleotides of another set. This means, the decoding oligonucleotides of set 1 bind to the analyte-specific probes of above set 1 of analyte-specific probes, the decoding oligonucleotides of set 2 bind to the analyte-specific probes of above set 2 of analyte-specific probes, the decoding oligonucleotides of set 3 bind to the analyte-specific probes of above set 3 of analyte-specific probes, etc. In this embodiment where a plurality of analytes is to be encoded in parallel the different sets of analyte-specific probes may be provided in step (1) as a premixture of different sets of analyte-specific probes and/or the different sets of decoding oligonucleotides may be provided in step (4) as a premixture of different sets of decoding oligonucleotides. Each mixture may be contained in a single vial. Alternatively, the different sets of analyte-specific probes and/or the different sets of decoding oligonucleotides may be provided in steps (1) and/or (4) singularly.
A “kit” is a combination of individual elements useful for carrying out the use and/or method of the invention, wherein the elements are optimized for use together in the methods. The kits may also contain additional reagents, chemicals, buffers, reaction vials etc. which may be useful for carrying out the method according to the invention. Such kits unify all essential elements required to work the method according to the invention, thus minimizing the risk of errors. Therefore, such kits also allow semi-skilled laboratory staff to perform the method according to the invention.
The features, characteristics, advantages and embodiments specified herein apply to the method, use, and kit according to the invention, even if not specifically indicated.
In an embodiment of the invention the sample is a biological sample, preferably comprising biological tissue, further preferably comprising biological cells. A biological sample may be derived from an organ, organoids, cell cultures, stem cells, cell suspensions, primary cells, samples infected by viruses, bacteria or fungi, eukaryotic or prokaryotic samples, smears, disease samples, a tissue section.
The method is particularly suited to encode, identify, detect, count or quantify analytes or single analyte molecules in a biological sample, i.e. a sample which contains nucleic acids or proteins as said analytes. It is understood that the biological sample may be in a form as it is in its natural environment (i.e. liquid, semiliquid, solid etc.), or processed, e.g. as a dried film on the surface of a device which may be re-liquefied before the method is carried out.
In another embodiment of the invention prior to step (2) the biological tissue and/or biological cells are fixed.
This measure has the advantage that the analytes to be encoded, e.g. the nucleic acids or proteins, are immobilized and cannot escape. In doing so, the analytes are then prepared for a better detection or encoding by the method according to the invention. The fixation of the sample can be carried out, e.g., by means of formalin, ethanol, methanol or other components well known to the skilled person.
In yet a further embodiment within the set of analyte-specific probes the individual analyte-specific probes comprise binding elements (S1, S2, S3, S4, S5) which specifically interact with different sub-structures of one of the analytes to be encoded.
By this measure the method becomes even more robust and reliable because the signal intensity obtained at the end of the method or a cycle, respectively, is increased. It is understood that the individual probes of a set, while binding to the same analyte, differ in their binding position or binding site at or on the analyte. The binding elements S1, S2, S3, S4, S5 etc. of the first, second, third, fourth, fifth etc. analyte-specific probes therefore bind to or at a different position which, however, may or may not overlap.
In another embodiment of the method according to the invention it comprises the following additional steps:
By this measure the method is further developed to such an extent that the encoded analytes can be detected by any means which is adapted to visualize the signal element. Examples of detectable physical features include e.g. light, chemical reactions, molecular mass, radioactivity, etc.
In a further embodiment of the method according to the invention the following additional step is carried out:
By this measure the requirements for another round of binding further decoding oligonucleotides to the same analyte-specific probes are established, thus finally resulting in a code or encoding scheme comprising more than one signal. This step is realized by applying conditions and factors well known to the skilled person, e.g. pH, temperature, salt conditions, oligonucleotide concentration, polymers etc.
In another embodiment of the invention, the method comprises the following step:
With this measure a code of more than one signal is set up, i.e. of two, three, four, five etc. signals in case of two [(4₂)-(11₂)], three [(4₃)-(11₃)], four [(4₄)-(11₄)], five [(4₅)-(11₅)], etc. [(4ₙ)-(11ₙ)] rounds which are carried out by the user, where ‘n’ is an integer representing the number of rounds. The encoding capacity of the method according to the invention is herewith increased depending on the nature of the analyte and the needs of the operator.
In an embodiment of the invention said encoding scheme is predetermined and allocated to the analyte to be encoded.
This measure enables a precise experimental set-up by providing the appropriate sequential order of the employed decoding and signal oligonucleotides and, therefore, allows the correct allocation of a specific analyte to a respective encoding scheme.
The decoding oligonucleotides which are used in repeated steps (4₂)-(11₂) may comprise a translator element (c2) which is identical with the translator element (c1) of the decoding oligonucleotides used in previous steps (4)-(11). In another embodiment of the invention decoding oligonucleotides are used in repeated steps (4₂)-(11₂) comprising a translator element (c2) which differs from the translator element (c1) of the decoding oligonucleotides used in previous steps (4)-(11).
It is understood that the decoding elements may or may not be changed from round to round, i.e. in the second round (4₂)-(11₂) comprising the translator element c2, in the third round (4₃)-(11₃) comprising the translator element c3, in the fourth round (4₄)-(11₄) comprising the translator element c4, in the fifth round comprising the translator element c5, in the n-th round (4ₙ)-(11ₙ) comprising the translator element cn, etc., wherein ‘n’ is an integer representing the number of rounds.
The signal oligonucleotides which are used in repeated steps (4₂)-(11₂) may comprise a signal element which is identical with the signal element of the signal oligonucleotides used in previous steps (4)-(11). In a further embodiment of the invention signal oligonucleotides are used in repeated steps (4₂)-(11₂) comprising a signal element which differs from the signal element of the signal oligonucleotides used in previous steps (4)-(11).
By this measure each round the same or a different signal is provided resulting in an encoding scheme characterized by a signal sequence consisting of numerous different signals. This measure allows the creation of a unique code or code word which differs from all other code words of the encoding scheme.
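The creation of unique code words from a sequence of signals may be illustrated by the following Python sketch. It is an example under stated assumptions and not a prescribed encoding scheme: code words are simply assigned in the enumeration order of itertools.product, no error-correcting distance between code words is enforced, and the function names as well as the signal labels "A" and "B" are hypothetical. The analyte names are those used as examples hereinabove.

```python
from itertools import product


def build_codebook(analytes, signals, n_rounds):
    """Assign one distinct code word (a tuple of n_rounds signals) to each analyte."""
    if len(analytes) > len(signals) ** n_rounds:
        raise ValueError("more analytes than available code words")
    # zip stops after the last analyte, so only the first code words are used.
    return dict(zip(analytes, product(signals, repeat=n_rounds)))


def decode(codebook, observed):
    """Return the analyte whose code word equals the observed signal sequence."""
    inverse = {word: analyte for analyte, word in codebook.items()}
    return inverse.get(tuple(observed))


# Three analytes are distinguished with only two signals over two rounds.
codebook = build_codebook(["Cyclin A", "Cyclin D", "Cyclin E"], ["A", "B"], n_rounds=2)
print(codebook)
print(decode(codebook, ["B", "A"]))
```

In each round, the decoding oligonucleotide applied for a given analyte would be the one whose translator element recruits the signal listed at the corresponding position of its code word; an observed signal sequence that matches no code word is returned as None.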
In another embodiment of the invention the binding element (S) of the analyte-specific probe comprises a nucleic acid comprising a nucleotide sequence allowing a specific binding to the analyte to be encoded, preferably a specific hybridization to the analyte to be encoded.
This measure creates the condition for encoding a nucleic acid analyte, such as specific DNA molecules, e.g. genomic DNA, nuclear DNA, mitochondrial DNA, viral DNA, bacterial DNA, extra- or intracellular DNA etc., or specific mRNA molecules, e.g. hnRNA, miRNA, viral RNA, bacterial RNA, extra- or intracellular RNA, etc.
In an alternative embodiment of the invention the binding element (S) of the analyte-specific probe comprises an amino acid sequence allowing a specific binding to the analyte to be encoded, preferably the binding element is an antibody.
This measure creates the condition for encoding a nucleic acid analyte, such as an mRNA, e.g. such an mRNA coding for a particular protein.
In another embodiment the analyte to be encoded or detected is a nucleic acid, preferably DNA or RNA, further preferably mRNA, and/or, alternatively the analyte to be decoded is a peptide or a protein.
By this measure the invention is adapted to the detection of such kinds of analytes which are of utmost importance in the clinical routine or the focus of biological questions.
It is to be understood that the before-mentioned features and those to be mentioned in the following can be used not only in the combination indicated in the respective case, but also in other combinations or in an isolated manner without departing from the scope of the invention.
The invention is now further explained by means of embodiments resulting in additional features, characteristics and advantages of the invention. The embodiments are of purely illustrative nature and do not limit the scope or range of the invention. The features mentioned in the specific embodiments are general features of the invention which are not only applicable in the specific embodiment but also in an isolated manner in the context of any embodiment of the invention.
The invention is now described and explained in further detail by referring to the following non-limiting examples and figures.
The method disclosed herein is used for specific detection of many different analytes in parallel. The technology allows distinguishing a higher number of analytes than there are different signals available. The process preferably includes at least two consecutive rounds of specific binding, signal detection and selective denaturation (if a next round is required), eventually producing a signal code. To decouple the dependency between the analyte-specific binding and the oligonucleotides providing the detectable signal, a so-called “decoding” oligonucleotide is introduced. The decoding oligonucleotide transcribes the information of the analyte-specific probe set to the signal oligonucleotides.
In a first application variant, the analyte or target is nucleic acid, e.g. DNA or RNA, and the probe set comprises oligonucleotides that are partially or completely complementary to the whole sequence or a subsequence of the nucleic acid sequence to be detected (
In a second application variant, the analyte or target is a protein and the probe set comprises one or more proteins, e.g. antibodies (
In a third application variant, at least one analyte is a nucleic acid and at least a second analyte is a protein and at least the first probe set binds to the nucleic acid sequence and at least the second probe set binds specifically to the protein analyte. Other combinations are possible as well.
In order to elucidate the workflow in more depth, the following workflow is restricted to the first application variant. A well-trained person can easily adapt the exemplary workflow to other applications. The method steps are depicted in the flowchart of
Step 1: Applying the analyte- or target-specific probe set. The target nucleic acid sequence is incubated with a probe set consisting of oligonucleotides with sequences complementary to the target nucleic acid. In this example, a probe set of 5 different probes is shown, each comprising a sequence element complementary to an individual subsequence of the target nucleic acid sequence (S1 to S5). In this example, the regions do not overlap. Each of the oligonucleotides targeting the same nucleic acid sequence comprises the identifier element or unique identifier sequence (T), respectively.
Step 2: Hybridization of the probe set. The probe set is hybridized to the target nucleic acid sequence under conditions allowing a specific hybridization. After the incubation, the probes are hybridized to their corresponding target sequences and provide the identifier element (T) for the next steps.
Step 3: Eliminating non-bound probes. After hybridization, the unbound oligonucleotides are eliminated, e.g. by washing steps.
Step 4: Applying the decoding oligonucleotides. The decoding oligonucleotides consisting of at least two sequence elements (t) and (c) are applied. While sequence element (t) is complementary to the unique identifier sequence (T), the sequence element (c) provides a region for the subsequent hybridization of signal oligonucleotides (translator element).
Step 5: Hybridization of decoding oligonucleotides. The decoding oligonucleotides are hybridized with the unique identifier sequences of the probes (T) via their complementary first sequence elements (t). After incubation, the decoding oligonucleotides provide the translator sequence element (c) for a subsequent hybridization step.
Step 6: Eliminating the excess of decoding oligonucleotides. After hybridization, the unbound decoding oligonucleotides are eliminated, e.g. by washing steps.
Step 7: Applying the signal oligonucleotide. The signal oligonucleotides are applied. The signal oligonucleotides comprise at least one second connector element (C) that is essentially complementary to the translator sequence element (c) and at least one signal element that provides a detectable signal (F).
Step 8: Hybridization of the signal oligonucleotides. The signal oligonucleotides are hybridized via the complementary sequence connector element (C) to the translator element (c) of decoding oligonucleotide. After incubation, the signal oligonucleotides are hybridized to their corresponding decoding oligonucleotides and provide a signal (F) that can be detected.
Step 9: Eliminating the excess of signal oligonucleotides. After hybridization, the unbound signal oligonucleotides are eliminated, e.g. by washing steps.
Step 10: Signal detection. The signals provided by the signal oligonucleotides are detected.
The following steps (steps 11 and 12) are unnecessary for the last detection round.
Step 11: Selective denaturation. The hybridization between the unique identifier sequence (T) and the first sequence element (t) of the decoding oligonucleotides is dissolved. The destabilization can be achieved via different mechanisms well known to the trained person like for example: increased temperature, denaturing agents, etc. The target- or analyte-specific probes are not affected by this step.
Step 12: Eliminating the denatured decoding oligonucleotides. The denatured decoding oligonucleotides and signal oligonucleotides are eliminated (e.g. by washing steps), leaving the specific probe sets with free unique identifier sequences, reusable in a next round of hybridization and detection (steps 4 to 10). This detection cycle (steps 4 to 12) is repeated ‘n’ times until the planned encoding scheme is completed.
Note that in every round of detection, the type of signal provided by a certain unique identifier is controlled by the use of a certain decoding oligonucleotide. As a result, the sequence of decoding oligonucleotides applied in the detection cycles transcribes the binding specificity of the probe set into a unique signal sequence.
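The detection cycle of steps 4 to 12 can be sketched in a few lines of Python. This is only an illustrative model, not part of the disclosed method: the function name `run_detection_cycles`, the dictionary layout and the labels `F1` and `F2` are assumptions chosen for the example. Each round maps a unique identifier (T) through the decoding oligonucleotide applied in that round to a translator element (c), and from there to the signal (F) of the matching signal oligonucleotide.

```python
from typing import Dict, List


def run_detection_cycles(
    identifiers: Dict[str, str],
    decoding_rounds: List[Dict[str, str]],
    signal_oligos: Dict[str, str],
) -> Dict[str, List[str]]:
    """Return the sequence of signals read out for each target.

    identifiers:     target name -> unique identifier of its probe set (T)
    decoding_rounds: one dict per round, identifier (T) -> translator element (c)
    signal_oligos:   translator element (c) -> detectable signal (F)
    All names are hypothetical and only serve this sketch.
    """
    codes: Dict[str, List[str]] = {target: [] for target in identifiers}
    for decoding_mix in decoding_rounds:
        for target, identifier in identifiers.items():
            translator = decoding_mix[identifier]  # steps 4 to 6
            signal = signal_oligos[translator]     # steps 7 to 9
            codes[target].append(signal)           # step 10
        # Steps 11 and 12 remove decoding and signal oligonucleotides.
        # The identifiers stay bound, so no state has to be reset here.
    return codes


if __name__ == "__main__":
    result = run_detection_cycles(
        identifiers={"A": "T1"},
        decoding_rounds=[{"T1": "c1"}, {"T1": "c2"}],
        signal_oligos={"c1": "F1", "c2": "F2"},
    )
    print(result)
```

In this model the probe set and its identifier never change between rounds; only the decoding mix does, which is what the note above describes.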
The steps of decoding oligonucleotide hybridization (steps 4 to 6) and signal oligonucleotide hybridization (steps 7 to 9) can also be combined in two alternative ways as shown in
Opt. 1: Simultaneous hybridization. Instead of the steps 4 to 9 of
Opt. 2: Preincubation. Additionally to option 1 of
Step 1: Target nucleic acids. In this example three different target nucleic acids (A), (B) and (C) have to be detected and differentiated by using only two different types of signal. Before starting the experiment, a certain encoding scheme is set. In this example, the three different nucleic acid sequences are encoded by three rounds of detection with two different signals (1) and (2) and a resulting Hamming distance of 2 to allow for error detection. The planned code words are:
Step 2: Hybridization of the probe sets. For each target nucleic acid, a dedicated probe set is applied, specifically hybridizing to the corresponding nucleic acid sequence of interest. Each probe set provides a unique identifier sequence (T1), (T2) or (T3). This way each different target nucleic acid is uniquely labeled. In this example sequence (A) is labeled with (T1), sequence (B) with (T2) and sequence (C) with (T3). The illustration summarizes steps 1 to 3 of
Step 3: Hybridization of the decoding oligonucleotides. For each unique identifier present, a certain decoding oligonucleotide is applied specifically hybridizing to the corresponding unique identifier sequence by its first sequence element (here (t1) to (T1), (t2) to (T2) and (t3) to (T3)). Each of the decoding oligonucleotides provides a translator element that defines the signal that will be generated after hybridization of signal oligonucleotides. Here nucleic acid sequences (A) and (B) are labeled with the translator element (c1) and sequence (C) is labeled with (c2). The illustration summarizes steps 4 to 6 of
Step 4: Hybridization of signal oligonucleotides. For each type of translator element, a signal oligonucleotide with a certain signal (2), differentiable from signals of other signal oligonucleotides, is applied. This signal oligonucleotide can specifically hybridize to the corresponding translator element. The illustration summarizes steps 7 to 9 of
Step 5: Signal detection for the encoding scheme. The different signals are detected. Note that in this example the nucleic acid sequence (C) can be distinguished from the other sequences by the unique signal (2) it provides, while sequences (A) and (B) provide the same kind of signal (1) and cannot be distinguished after the first cycle of detection. This is because the number of different nucleic acid sequences to be detected exceeds the number of different signals available. The illustration corresponds to step 10 of
Step 6: Selective denaturation. The decoding (and signal) oligonucleotides of all nucleic acid sequences to be detected are selectively denatured and eliminated as described in steps 11 and 12 of
Step 7: Second round of detection. A next round of hybridization and detection is done as described in steps 3 to 5. Note that in this new round the mix of different decoding oligonucleotides is changed. For example, the decoding oligonucleotide of nucleic acid sequence (A) used in the first round comprised the sequence elements (t1) and (c1), while the new decoding oligonucleotide comprises the sequence elements (t1) and (c2). Note that now all three sequences can clearly be distinguished due to the unique combination of first and second round signals.
Step 8: Third round of detection. Again, a new combination of decoding oligonucleotides is used, leading to new signal combinations. After signal detection, the resulting code words for the three different nucleic acid sequences are not only unique and therefore distinguishable but also have a Hamming distance of 2 to the other code words. Due to the Hamming distance, an error in the detection of the signals (signal exchange) would not result in a valid code word and could therefore be detected. In this way three different nucleic acids can be distinguished in three detection rounds with two different signals, allowing error detection.
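The error-detection property of such an encoding scheme can be checked with a short Python sketch. The code words used here are assumptions: they are chosen only to be consistent with the first two rounds described above ((A) and (B) give signal 1 and (C) gives signal 2 in round one, (A) switches to signal 2 in round two) and are not necessarily the code words planned in the example. The names `CODE_WORDS`, `hamming`, `min_distance` and `decode` are likewise hypothetical.

```python
from itertools import combinations
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical code words (signal per round), not taken from the source.
CODE_WORDS: Dict[str, Tuple[int, ...]] = {
    "A": (1, 2, 2),
    "B": (1, 1, 1),
    "C": (2, 1, 2),
}


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of rounds in which two code words differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(code_words: Dict[str, Tuple[int, ...]]) -> int:
    """Smallest pairwise Hamming distance of the whole code."""
    return min(hamming(a, b) for a, b in combinations(code_words.values(), 2))


def decode(observed: Sequence[int],
           code_words: Dict[str, Tuple[int, ...]]) -> Optional[str]:
    """Return the target for a valid code word, or None if invalid."""
    for target, word in code_words.items():
        if word == tuple(observed):
            return target
    return None


if __name__ == "__main__":
    print("minimum Hamming distance:", min_distance(CODE_WORDS))
    print("valid read-out (1, 2, 2):", decode((1, 2, 2), CODE_WORDS))
    print("one exchanged signal (1, 1, 2):", decode((1, 1, 2), CODE_WORDS))
```

With a minimum distance of 2, any single exchanged signal yields a word outside the code, so `decode` returns `None` and the error is detected rather than silently assigned to another target.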
Compared to state-of-the-art methods, one particular advantage of the method according to the invention is the use of decoding oligonucleotides breaking the dependencies between the target specific probes and the signal oligonucleotides.
Without decoupling target specific probes and signal generation, two different signals can only be generated for a certain target if two different molecular tags are used. Each of these molecular tags can only be used once. Multiple readouts of the same molecular tag do not increase the information about the target. In order to create an encoding scheme, a change of the target specific probe set after each round is required (seqFISH) or multiple molecular tags must be present on the same probe set (like merFISH, intronSeqFISH).
Following the method according to the invention, different signals are achieved by using different decoding oligonucleotides reusing the same unique identifier (molecular tag) and a small number of different, mostly cost-intensive signal oligonucleotides. This leads to several advantages in contrast to the other methods.
Coding Capacity
All three methods compared in Table 1 below use specific probe sets that are not denatured between different rounds of detection. For intronSeqFISH, four detection rounds are needed to produce the pseudo colors of one coding round; therefore, data is only given for rounds 4, 8, 12, 16 and 20. The merFISH method uses a constant number of 4 signals; therefore, the data starts with the smallest number of rounds possible. After 8 detection rounds our method exceeds the maximum coding capacity reached with 20 rounds of merFISH (depicted with one asterisk) and after 12 rounds of detection the maximum coding capacity of intronSeqFISH is exceeded (depicted with two asterisks). For the method according to the invention usage of 3 different signals is assumed (as is with intronSeqFISH).
As shown in
Note that this maximum efficiency of coding capacity is also reached in the case of seqFISH, where specific probes are denatured after every detection round and a new probe set is specifically hybridized to the target sequence for each detection round. However, this method has major downsides compared to technologies using only one specific hybridization for their encoding scheme (all other methods):
Due to these reasons all other methods use a single specific hybridization event and accept the major downside of lower code complexity and therefore the need of more detection rounds and a higher oligonucleotide design complexity.
The method according to the invention combines the advantages of seqFISH (mainly complete freedom concerning the encoding scheme) with all advantages of methods using only one specific hybridization event while eliminating the major problems of such methods.
Note that the high numbers of code words produced after 20 rounds can also be used to introduce higher hamming distances (differences) between different codewords, allowing error detection of 1, 2 or even more errors and even error corrections. Therefore, even very high coding capacities are still practically relevant.
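The growth of the coding capacity with the number of detection rounds can be illustrated with a minimal calculation. The sketch assumes that every signal can be chosen freely in every round and that no code words are reserved for error detection, so that the number of possible code words is the number of signals raised to the number of rounds. The function name `coding_capacity` is hypothetical, and the printed values are computed under these assumptions rather than copied from Table 1.

```python
def coding_capacity(num_signals: int, num_rounds: int) -> int:
    """Number of distinct code words with a free choice of signal per round."""
    return num_signals ** num_rounds


if __name__ == "__main__":
    # Three signals, as assumed for the method according to the invention.
    for rounds in (4, 8, 12, 16, 20):
        print(rounds, coding_capacity(3, rounds))
```

Requiring a minimum Hamming distance between code words reduces the usable number of code words below this upper bound, which is the trade-off described in the preceding paragraph.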
A key factor of the method according to the invention is the consecutive process of decoding oligonucleotide binding, signal oligonucleotide binding, signal detection and selective denaturation. In order to generate an encoding scheme, this process has to be repeated several times (depending on the length of the code word). Because the same unique identifier is reused in every detection cycle, all events from the first to the last detection cycle depend on each other. Additionally, the selective denaturation depends on two different events: while the decoding oligonucleotide has to be dissolved from the unique identifier with highest efficiency, the specific probes have to stay hybridized with highest efficiency.
Due to this the efficiency E of the whole encoding process can be described by the following equation:
$$E = B_{sp} \times \left(B_{de} \times B_{si} \times E_{de} \times S_{sp}\right)^{n}$$
Based on this equation the efficiency of each single step can be estimated for a given total efficiency of the method. The calculation is based on the assumption that each process has the same efficiency. The total efficiency describes the portion of successfully decodable signals of the total signals present.
The total efficiency of the method is dependent on the efficiency of each single step of the different factors described by the equation. Under the assumption of an equally distributed efficiency the total efficiency can be plotted against the single step efficiency as shown in
Experimentally, the inventors achieved a total decoding efficiency of about 30% to 65% based on 5 detection cycles. A calculation of the efficiency of each single step (Bsp, Bde, Bsi, Ede, Ssp) by the formula given above revealed an average efficiency of about 94.4% to 98%. These high efficiencies are very surprising and cannot easily be anticipated by a well-trained person in this field.
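The single-step estimate can be reproduced with a short calculation. Under the stated assumption that all steps are equally efficient, the equation contains one factor for the binding of the specific probes and four factors per detection cycle, that is 1 + 4n factors in total, so the single-step efficiency is the (1 + 4n)-th root of the total efficiency. The function names below are hypothetical and only serve this sketch.

```python
def total_efficiency(step_efficiency: float, n_cycles: int) -> float:
    """Total efficiency E if every step has the same efficiency.

    One factor for Bsp plus four factors (Bde, Bsi, Ede, Ssp) per cycle.
    """
    return step_efficiency ** (1 + 4 * n_cycles)


def single_step_efficiency(total: float, n_cycles: int) -> float:
    """Invert total_efficiency: the (1 + 4n)-th root of the total efficiency."""
    return total ** (1.0 / (1 + 4 * n_cycles))


if __name__ == "__main__":
    for total in (0.30, 0.65):
        step = single_step_efficiency(total, n_cycles=5)
        print(f"total {total:.0%} over 5 cycles -> per step {step:.1%}")
```

For 5 detection cycles this gives 21 factors, and total efficiencies of 30% and 65% correspond to single-step efficiencies of roughly 94.4% and 98%, in line with the values stated above.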
The experiment shows the specific detection of 10 to 50 different mRNA species in parallel with single molecule resolution. It is based on 5 detection cycles, 3 different fluorescent signals and an encoding scheme without signal gaps and a Hamming distance of 2 (error detection). The experiment demonstrates the enablement and functionality of the method according to the invention.
Oligonucleotides and their Sequences
All oligonucleotide sequences used in the experiment (target specific probes, decoding oligonucleotides, signal oligonucleotides) are listed in the sequence listing of the appendix. The signal oligonucleotide R:ST05*O_Atto594 was ordered from biomers.net GmbH. All other oligonucleotides were ordered from Integrated DNA Technologies. Oligonucleotides were dissolved in water. The stock solutions (100 μM) were stored at −20° C.
The 50 different target specific probe sets are divided into 5 groups. The name of the transcript to be detected and the name of the target specific probe set are the same (transcript variant names of www.ensemble.org). The term “new” indicates a revised probe design. All oligonucleotide sequences of the probe sets can be found in the sequence listing. The table lists the unique identifier name of the probe set as well as the names of the decoding oligonucleotides used in the different detection cycles. The resulting code shows the sequence of fluorescent signals generated during the 5 detection cycles (G(reen)=Alexa Fluor 488, O(range)=Atto 594, Y(ellow)=Alexa Fluor 546).
Some variations of the experiment have been performed. Experiments 1 to 4 mainly differ in the number of transcripts detected in parallel. The groups listed as target specific probe sets refer to table 6. Experiments 5 to 8 are single round, single target controls for comparison with the decoded signals.
Before hybridization, cells were equilibrated with 200 μl sm-wash-buffer. The sm-wash-buffer comprises 30 mM Na3Citrate, 300 mM NaCl, pH7, 10% formamide (Roth, Cat.:P040.1) and 5 mM Ribonucleoside Vanadyl Complex (NEB, Cat.: S1402S). For each target-specific probe set 1 μl of a 100 μM oligonucleotide stock solution was added to the mixture. The oligonucleotide stock solution comprises equimolar amounts of all target specific oligonucleotides of the corresponding target specific probe set. The total volume of the mixture was adjusted to 100 μl with water and mixed with 100 μl of a 2× concentrated hybridization buffer solution. The 2× concentrated hybridization buffer comprises 120 mM Na3Citrate, 1200 mM NaCl, pH7, 20% formamide and 20 mM Ribonucleoside Vanadyl Complex. The resulting 200 μl hybridization mixture was added to the corresponding well and incubated at 37° C. for 2 h. Afterwards cells were washed three times with 200 μl per well for 10 min with target probe wash buffer at 37° C. The target probe wash buffer comprises 30 mM Na3Citrate, 300 mM NaCl, pH7, 20% formamide and 5 mM Ribonucleoside Vanadyl Complex.
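The final concentrations in the 200 μl hybridization mixture follow from simple dilution arithmetic, sketched below. The helper `final_concentration` and the dictionary keys are hypothetical names. The sketch also assumes that the stated 100 μM refers to the total oligonucleotide concentration of a probe set stock; the text does not say whether it is the total or the per-oligonucleotide concentration.

```python
def final_concentration(stock_conc: float, stock_volume: float,
                        final_volume: float) -> float:
    """Concentration after diluting stock_volume of stock to final_volume."""
    return stock_conc * stock_volume / final_volume


# 2x hybridization buffer as described in the text (mM or percent).
HYB_BUFFER_2X = {
    "Na3Citrate_mM": 120.0,
    "NaCl_mM": 1200.0,
    "formamide_pct": 20.0,
    "RVC_mM": 20.0,
}

# 100 ul of 2x buffer mixed with 100 ul of probe mixture gives 200 ul.
HYB_MIX_1X = {
    name: final_concentration(conc, 100.0, 200.0)
    for name, conc in HYB_BUFFER_2X.items()
}

# 1 ul of a 100 uM probe set stock in 200 ul (assumed to be the total
# oligonucleotide concentration of the probe set).
PROBE_SET_UM = final_concentration(100.0, 1.0, 200.0)

if __name__ == "__main__":
    print(HYB_MIX_1X)
    print("probe set concentration (uM):", PROBE_SET_UM)
```

Under these assumptions the hybridization takes place in 60 mM Na3Citrate, 600 mM NaCl, 10% formamide and 10 mM Ribonucleoside Vanadyl Complex, with 0.5 μM of each probe set.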
Cells were washed once with 200 μl of imaging buffer per well at room temperature. In experiments without Trolox (see table 7, last column) imaging buffer comprises 30 mM Na3Citrate, 300 mM NaCl, pH7 and 5 mM Ribonucleoside Vanadyl Complex. In experiments with Trolox, imaging buffer additionally contains 10% VectaCell Trolox Antifade Reagent (Vector laboratories, Cat.: CB-1000), resulting in a final Trolox concentration of 10 mM.
Steps (E) to (H) were repeated 5 times in experiments 1 to 4. Step (H) was omitted for the 5th detection cycle.
Table 4 shows a very low number of incorrectly decoded signals compared to the number of correctly decoded signals. The absolute values for decoded signals of a certain transcript are very similar between different regions of one experiment. The fraction of the total number of signals that can be successfully decoded is between 27.1% and 64.5%. This fraction depends on the number of transcripts and/or the total number of signals present in the respective region/experiment.
The method according to the invention produces a low amount of incorrectly assigned code words and can therefore be considered specific. The fraction of successfully decodable signals is very high, even with very high numbers of signals per region and very high numbers of transcripts detected in parallel. The high fraction of assignable signals and the high specificity make the method practically useful.
The relative abundances of transcripts correlate very well between different regions of one experiment but also between different experiments. This can be clearly seen by the comparisons of
Next to the reliability of quantification, the point clouds of multi round experiments also show the same intracellular and intercellular distribution patterns of transcripts. This is clearly proven by the direct comparison of the assigned point clouds with signals from single round experiments detecting only one characteristic mRNA-species.
The three decoded point clouds of cell cycle dependent proteins shown in
In the accompanying sequence listing SEQ ID Nos. 1-1247 refer to nucleotide sequences of exemplary target-specific oligonucleotides. The oligonucleotides listed consist of a target specific binding site (5′-end), a spacer/linker sequence (gtaac or tagac) and the unique identifier sequence, which is the same for all oligonucleotides of one probe set.
In the accompanying sequence listing SEQ ID Nos. 1248-1397 refer to nucleotide sequences of exemplary decoding oligonucleotides.
In the accompanying sequence listing SEQ ID Nos. 1398-1400 refer to the nucleotide sequences of exemplary signal oligonucleotides. For each signal oligonucleotide the corresponding fluorophore is present twice. One fluorophore is covalently linked to the 5′-end and one fluorophore is covalently linked to the 3′-end. SEQ ID No. 1398 comprises at its 5′ terminus “5Alex488N”, and at its 3′ terminus “3AlexF488N”. SEQ ID No. 1399 comprises at its 5′ terminus “5Alex546”, and at its 3′ terminus “3Alex546N”. SEQ ID No. 1400 comprises at its 5′ terminus and at its 3′ terminus “Atto594”.
| Number | Date | Country | Kind |
|---|---|---|---|
| 19 181 051.4 | Jun 2019 | EP | regional |
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/EP2020/067010 | 6/18/2020 | WO | 00 |