The present invention relates to fluorescence in situ hybridization (FISH). In particular, the invention relates to a pair of non-naturally occurring nucleic acid probes for detecting a polynucleotide analyte for fluorescence in situ hybridization.
As an attractive approach to spatial transcriptomics, multiplexed fluorescent in situ hybridization (FISH) allows combinatorial imaging of the transcriptome, and promises to reveal the state-to-function relationships of single cells in native tissues. A key challenge to making multiplexed FISH more broadly applicable to all tissue types is the difficulty in accurately detecting individual RNA molecules in complex tissue environments, which often suffer from low signals and tissue-dependent background. To address this limitation, much effort has been focused on signal amplification to generate brighter RNA spots. However, such approaches can only improve the signal relative to the tissue auto-fluorescence. In addition, since all probes are equally amplified, these amplification methods do not help to distinguish real RNA spots (true positives) from non-specifically bound probes (false positives).
Off-target binding of FISH probes generates background fluorescence and spurious signals. These problems are exacerbated in multiplexed FISH because of the use of highly diverse (usually consisting of thousands of sequences) and concentrated probe solutions. One approach to solve these problems is to use customized tissue clearing approaches to remove cellular proteins and lipids, thereby reducing non-specific probe binding. However, clearing does not remove the non-specific binding of probes to non-target RNAs inside the cells and tissues. In addition, tissue clearing creates another source of technical variability from sample to sample, and it entails lengthy protocols that may require customization for each tissue type.
Accordingly, it is generally desirable to overcome or ameliorate one or more of the above-mentioned difficulties.
In one aspect, there is provided a pair of non-naturally occurring nucleic acid probes for detecting a polynucleotide analyte, comprising:
In one aspect, there is provided a probe system as defined herein.
In one embodiment, there is provided a probe system comprising:
In one embodiment, the probe binding arm in the first and/or second nucleic acid probe comprises an identification portion for binding to a unique bridge probe. The identification portion may allow a pair (or multiple pairs) of nucleic acid probes to be recognized by a unique bridge probe. Multiple pairs of nucleic acid probes may comprise the same identification portion for binding to the same unique bridge probe. This may allow each pair of nucleic acid probes (or a set of nucleic acid probe pairs) to be distinguishable from one another in a library comprising a plurality of nucleic acid probe pairs.
In one aspect, there is provided a method of detecting a polynucleotide analyte in a sample, the method comprising:
In one aspect, there is provided a library for detecting two or more polynucleotide analytes in a sample; the library comprising two or more pairs of non-naturally occurring nucleic acid probes or a plurality of probe systems as defined herein,
In one aspect, there is provided a method of detecting two or more polynucleotide analytes in a sample, the method comprising:
The method may comprise providing a unique bridge probe that is configured to bind to a specific pair (or multiple pairs) of nucleic acid probes prior to step b). A plurality of unique bridge probes may be provided either concurrently, sequentially or combinatorically to enable detection of a plurality of polynucleotide analytes.
In one aspect, there is provided a method of detecting or visualising the expression of one or more polynucleotide analytes in a sample, the method comprising
In one aspect, there is provided a kit comprising a pair of non-naturally occurring nucleic acid probes as defined herein or a plurality of probe systems or a library as defined herein.
In one embodiment, the kit further comprises one or more bridge probes.
Certain embodiments are illustrated by the following figures. It is to be understood that the following description is for the purpose of describing particular embodiments only and is not intended to be limiting with respect to the description.
RNA FISH images of split bridge sequence length (x) 7-12 nucleotides (nt) in (b) unpaired and (c) paired split probes (orange and light blue sequences). Shorter (7-9 nucleotides) bridge lengths were able to suppress the binding of unpaired probes. However, using bridge lengths that were too short (7+7 nucleotides) resulted in poor binding even in paired probes. 9+9 nucleotides appeared to be the optimal length.
At each round of imaging, bridge probes are introduced and allowed to hybridize, followed by dye-labelled readout probes. After imaging, both bridge and readout probes are washed out in preparation for the next round. (b) Decoded transcript locations for the region in
Colors represent different genes. Length of the scale bar is 10 μm. Scatter plot of total counts per gene vs bulk RNA-sequencing FPKM values for AML12, with Log Pearson correlation in red. Scatter plot of counts per cell between split-FISH and conventional, for the 10 genes common to both schemes. The y=x line is shown in red. (c) Scatter plot of total counts per gene vs bulk RNA-sequencing FPKM values for brain, kidney, ovary, and liver tissues. Log Pearson correlation values in red. (d) Comparison of ‘blank’ counts per cell between conventional multiplexed FISH and split-FISH for mouse brain and liver tissues. Eight and seven ‘blank’ barcodes were tested for split-FISH (317 genes) and conventional (133 genes) schemes respectively. Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; all data points shown in blue.
The specification discloses a pair of non-naturally occurring nucleic acid probes for detecting a polynucleotide analyte.
Provided herein is a pair of non-naturally occurring nucleic acid probes for detecting a polynucleotide analyte, comprising
wherein binding of the first polynucleotide analyte binding arm to the first analyte target region and binding of the second polynucleotide analyte binding arm to the second analyte target region permit binding of the first probe binding arm to the first probe target region and binding of the second probe binding arm to the second probe target region, thereby detecting the polynucleotide analyte.
In one aspect, there is provided a probe system comprising:
Without being bound by theory, the inventors have found a way to decrease non-specific background when detecting polynucleotide analytes in a cell or tissue (such as using Fluorescence in-situ hybridization). This can be done by using a set of split probes whereby a fluorescence signal is generated only when two independent hybridization events are co-localized (termed as split-FISH). In the split-FISH scheme (
The probe system may further comprise the bridge probe.
The pair of non-naturally occurring nucleic acid probes for detecting a polynucleotide analyte may also be referred to as a pair of non-naturally occurring nucleic acid split probes.
The pair of non-naturally occurring nucleic acid probes may also be referred to as “encoding probes”.
The pair of nucleic acid probes may be a pair of single-stranded nucleic acid probes.
The “bridge probe” may hybridize to the nucleic acid probes when the first and second nucleic acid probes hybridize with the polynucleotide analyte. The “bridge probe” may therefore detect the binding of the first and second nucleic acid probes to the polynucleotide analyte.
Each pair of nucleic acid probes may be configured to hybridize to a unique bridge probe. In one embodiment, the probe binding arm in the first and/or second nucleic acid probes comprises an identification portion for binding to a unique bridge probe. The identification portion may allow a pair (or multiple pairs) of nucleic acid probes to be recognized by a unique bridge probe. This may allow each pair of nucleic acid probes (or a set of nucleic acid probe pairs) to be distinguishable from one another in a library comprising a plurality of nucleic acid probe pairs.
Also provided herein is the use of a pair of non-naturally occurring nucleic acid probes for detecting a polynucleotide analyte. Also provided herein is a pair of non-naturally occurring nucleic acid probes when used to detect a polynucleotide analyte.
In one embodiment, the probe binding arm in the first and/or second nucleic acid probes consists of 9 or 10 nucleotides. In one embodiment, the probe binding arm in the first and/or second nucleic acid probes consists of 9 nucleotides. It was found that the length of the split bridge may affect non-specific background signal, and a length of about 9 nucleotides was surprisingly able to produce a level of non-specific background signal that is virtually undetectable. For example, the first nucleic acid probe may comprise a first probe binding arm at the 3′ terminus that is complementary to and selectively hybridizes to a first probe target region of a bridge probe, wherein the first probe binding arm is ATTTAACCG (SEQ ID NO: 592) (see Table 9). The second nucleic acid probe may comprise a second probe binding arm at the 5′ terminus that is complementary to and selectively hybridizes with a second probe target region of the bridge probe, wherein the second probe binding arm is CCCATTACC (SEQ ID NO: 593). The bridge probe may have a sequence of GGTAATGGGCGGTTAAAT (SEQ ID NO: 594). The bridge probe may further comprise one or two readout sequences (e.g. ATTGTAAAGCGTGAGAAA (SEQ ID NO: 595)) that allow the bridge probe to be detected or recognised by a readout probe.
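By way of non-limiting illustration, the relationship between the example probe binding arms and the example bridge probe can be checked with a short script. The sketch below uses only the three sequences quoted above; the function names are hypothetical, and the order in which the two target regions are concatenated is inferred from the quoted sequences rather than stated as a design rule.

```python
# Sequences quoted in the description (SEQ ID NOs: 592, 593 and 594).
FIRST_ARM = "ATTTAACCG"            # first probe binding arm (3' terminus)
SECOND_ARM = "CCCATTACC"           # second probe binding arm (5' terminus)
BRIDGE = "GGTAATGGGCGGTTAAAT"      # example bridge probe

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def bridge_from_arms(first_arm: str, second_arm: str) -> str:
    """Concatenate the regions complementary to the second and first arms.

    Assumption (inferred from the quoted sequences): the region pairing
    with the second arm precedes the region pairing with the first arm.
    """
    return reverse_complement(second_arm) + reverse_complement(first_arm)


predicted_bridge = bridge_from_arms(FIRST_ARM, SECOND_ARM)
bridge_matches = predicted_bridge == BRIDGE
print(predicted_bridge, bridge_matches)
```

Under these assumptions, each 9-nucleotide half of the 18-nucleotide bridge probe is the reverse complement of one probe binding arm, which is consistent with the 9+9 split described above.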
In one embodiment, the polynucleotide analyte binding arm in the first or second nucleic acid probes consists of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49 or 50 nucleotides. In one embodiment, the polynucleotide analyte binding arm in the first or second nucleic acid probes consists of 25 nucleotides.
In one embodiment, a linker is positioned between the probe binding arm and the polynucleotide analyte binding arm. The linker may be a short linker that is about 1 to 10 nucleotides. The linker may be a short linker of 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleobases. In one embodiment, the linker is about 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3 or 1 to 2 nucleobases. In one embodiment, the linker is about 1 to 5 nucleobases. In one embodiment, the linker is 1, 2, 3, 4 or 5 nucleobases. In one embodiment, the linker is 2 or 3 nucleobases. In one example, the linker is TAT (see Table 8a under Paired (circular) split probe sequences).
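As a minimal sketch of how a probe pair with the above layout might be assembled in silico, the following example joins an analyte binding arm, a linker and a probe binding arm for each probe. The two 25-nucleotide target regions are invented placeholders, the use of the same TAT linker in both probes is an assumption, and the function and class names are hypothetical; the probe binding arm is placed at the 3′ terminus of the first probe and the 5′ terminus of the second probe as described above.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a 5'->3' sequence; U is read as T."""
    return seq.upper().replace("U", "T").translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class SplitProbePair:
    first: str   # 5'-[analyte binding arm]-[linker]-[probe binding arm]-3'
    second: str  # 5'-[probe binding arm]-[linker]-[analyte binding arm]-3'


def assemble_pair(target_region_1: str, target_region_2: str,
                  first_arm: str = "ATTTAACCG",
                  second_arm: str = "CCCATTACC",
                  linker: str = "TAT") -> SplitProbePair:
    """Build both probes; each analyte binding arm is the reverse
    complement of its target region so that it can hybridize to it."""
    analyte_arm_1 = reverse_complement(target_region_1)
    analyte_arm_2 = reverse_complement(target_region_2)
    return SplitProbePair(
        first=analyte_arm_1 + linker + first_arm,
        second=second_arm + linker + analyte_arm_2,
    )


# Hypothetical 25-nt target regions (placeholders, not real transcripts).
region_1 = "ACGTACGTACGTACGTACGTACGTA"
region_2 = "GGCCTTAAGGCCTTAAGGCCTTAAG"
pair = assemble_pair(region_1, region_2)
print(pair.first)
print(pair.second)
```

With 25-nucleotide analyte binding arms, a 3-nucleotide linker and 9-nucleotide probe binding arms, each probe in this sketch is 37 nucleotides long.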
The term “nucleic acid” refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally occurring nucleotides.
As used herein, the term “nucleic acid”, and equivalent terms such as “polynucleotide”, refer to a polymeric form of nucleotides of any length, such as ribonucleotides, deoxyribonucleotides or peptide nucleic acids (PNAs), that comprise purine and pyrimidine bases, or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases. The nucleic acid may be double stranded or single stranded. References to single stranded nucleic acids include references to the sense or antisense strands. The backbone of the polynucleotide can comprise sugars and phosphate groups, as may typically be found in RNA or DNA, or modified or substituted sugar or phosphate groups. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. The sequence of nucleotides may be interrupted by non-nucleotide components. The terms nucleoside, nucleotide, deoxynucleoside and deoxynucleotide generally include complements, fragments and variants of the nucleoside, nucleotide, deoxynucleoside and deoxynucleotide, or analogs thereof.
In one embodiment, the first analyte target region is immediately adjacent to the second analyte target region. In another embodiment, the first analyte target region is spaced from the second analyte target region by no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleobases.
In one embodiment, the first probe target region is immediately adjacent to the second probe target region. In another embodiment, the first probe target region is spaced from the second probe target region by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 nucleobases.
An “oligonucleotide” as used herein is a single stranded molecule which may be used in hybridization or amplification technologies. In general, an oligonucleotide may be any integer from about 15 to about 100 nucleotides in length, but may also be of greater length.
The term “probe” refers to any molecule which is capable of selectively binding to a specifically intended target molecule, for example, a nucleotide transcript. Probes can be either synthesized by one skilled in the art, or derived from appropriate biological preparations.
The nucleic acid probes (or nucleic acid split probes) of the present invention may be useful for detecting the presence or absence of one or more polynucleotide analytes in one or more samples known to contain or suspected of containing the polynucleotide analytes. The nucleic acid probes can also be used to quantify the amount of polynucleotide analytes within the sample. The nucleic acid probes are useful for detecting unamplified polynucleotide target in a sample such as, for example, RNA, mRNA, rRNA, plasmid DNA, viral DNA, bacterial DNA, and chromosomal DNA. Additionally, the nucleic acid probes may be useful in conjunction with the amplification of a polynucleotide target by well-known methods such as PCR, ligase chain reaction, Q-beta replicase, strand-displacement amplification (SDA), rolling-circle amplification (RCA), nucleic acid sequence-based amplification (NASBA), and the like.
In one embodiment, the bridge probe is coupled or conjugated to a label (such as a fluorescent label). Such a bridge probe may be referred to as a readout probe. In one embodiment, the bridge probe is detected via hybridization to a secondary detection probe (or readout probe) that is conjugated to a label (such as a fluorescent label). The bridge probe may comprise a specific (or unique) tag or barcode sequence that enables it to be recognised via hybridisation to a secondary detection probe (or readout probe).
Examples of fluorescent labels include, but are not limited to, rare earth chelates (europium chelates), Texas Red, rhodamine, fluorescein, dansyl, phycoerythrin, phycocyanin, spectrum orange, spectrum green, and/or derivatives of any one or more of the above. Multiple probes used in the assay may be labeled with more than one distinguishable fluorescent or pigment color. These color differences provide a means to identify, for example, the hybridization positions of specific probes. Moreover, probes that are not separated spatially can be identified by a different color of light (e.g., red+green=yellow) or pigment (e.g., blue+yellow=green) resulting from mixing two other colors, or by using a filter set that passes only one color at a time. Probes can be labeled directly or indirectly with the fluorophore, utilizing conventional methodology. Additional probes and colors may be added to refine and extend this general procedure to include more genetic abnormalities or serve as internal controls.
In one embodiment, the secondary detection probe (or readout probe) hybridizes to a terminal region of the bridge probe.
In one embodiment, two secondary detection probes hybridize to both terminal regions of the bridge probe.
In one embodiment, the secondary detection probe or probes (or readout probes) hybridize to a central region of the bridge probe.
In one embodiment, the bridge probe has the same sequence as the polynucleotide analyte.
In one embodiment, the readout probe has the same sequence as the polynucleotide analyte.
In one embodiment, there is provided a pair of non-naturally occurring nucleic acid probes for detecting a polynucleotide analyte, the pair of nucleic acid probes comprising two anti-parallel nucleic acid strands, wherein:
wherein hybridization of the first and second nucleic acid strands with the polynucleotide analyte enables hybridization to the readout probe and detection of the polynucleotide analyte.
The term “complementary” refers to the base pairing between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid to be sequenced or amplified. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single stranded RNA or DNA molecules are said to be complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100% of the nucleotides of the other strand. Alternatively, complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization will occur when there is at least about 65% complementarity over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, and more preferably at least about 90% complementarity.
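The percentage thresholds in this definition can be illustrated with a simple calculation. The sketch below is an assumption-laden simplification: it compares two equal-length strands in antiparallel orientation without insertions or deletions, whereas the definition above also permits alignment with gaps. The function name is hypothetical.

```python
# Watson-Crick pairs named in the definition (A-T, A-U, C-G).
_PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"),
          ("C", "G"), ("G", "C")}


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percentage of positions that base-pair when two equal-length
    strands (both written 5'->3') are aligned antiparallel, ungapped."""
    a = strand_a.upper()
    b = strand_b.upper()[::-1]  # reverse so position i faces position i
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length strands")
    paired = sum((x, y) in _PAIRS for x, y in zip(a, b))
    return 100.0 * paired / len(a)


# The example first probe binding arm against its perfect complement,
# and against a variant carrying a single mismatch.
perfect = percent_complementarity("ATTTAACCG", "CGGTTAAAT")
one_mismatch = percent_complementarity("ATTTAACCG", "CGGTTAAAA")
print(perfect, round(one_mismatch, 1))
```

In this example, a single mismatch in a 9-nucleotide arm already lowers the figure to about 89%, which shows how sensitive short arms are to individual mismatches under this simplified measure.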
As used herein, the term “hybridization” or “hybridizes” refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide. The term “hybridization” may also refer to triple-stranded hybridization. The resulting (usually) double-stranded polynucleotide is a “hybrid”. The proportion of the population of polynucleotides that forms stable hybrids is referred to herein as the “degree of hybridization.”
Hybridization conditions will typically include salt concentrations of less than about 1M, more usually less than about 500 mM and less than about 200 mM. Hybridization temperatures can be as low as 5° C., but are typically greater than 22° C., more typically greater than about 30° C., and preferably in excess of about 37° C. Hybridizations are usually performed under stringent conditions, i.e. conditions under which a probe will hybridize to its target. Stringent conditions are sequence-dependent and are different under different circumstances. Longer fragments may require higher hybridization temperatures for specific hybridization. As other factors may affect the stringency of hybridization, including base composition and length of the complementary strands, presence of organic solvents and extent of base mismatching, the combination of parameters is more important than the absolute measure of any one alone. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH and nucleic acid composition) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. Typically, stringent conditions include salt concentration of at least 0.01 M to no more than 1 M Na ion concentration (or other salts) at a pH 7.0 to 8.3 and a temperature of at least 25° C. For example, conditions of 5×SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30° C. are suitable for allele-specific probe hybridizations.
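As a hedged illustration only, the typical ranges stated above (0.01 M to 1 M Na ion, pH 7.0 to 8.3, temperature of at least 25° C.) can be expressed as a simple check. The function name is hypothetical, and treating the 750 mM NaCl of 5×SSPE as the only sodium contribution (ignoring sodium from the phosphate buffer and EDTA) is a simplifying assumption.

```python
def within_stated_stringent_range(na_molar: float, ph: float,
                                  temp_celsius: float) -> bool:
    """Check conditions against the typical ranges quoted in the text:
    0.01 M <= [Na+] <= 1 M, pH 7.0 to 8.3, temperature >= 25 C."""
    return (0.01 <= na_molar <= 1.0
            and 7.0 <= ph <= 8.3
            and temp_celsius >= 25.0)


# 5x SSPE as quoted: 750 mM NaCl, pH 7.4.
# Assumption: only the NaCl is counted towards the Na ion concentration.
ssc_ok_25 = within_stated_stringent_range(0.750, 7.4, 25.0)
ssc_ok_22 = within_stated_stringent_range(0.750, 7.4, 22.0)
print(ssc_ok_25, ssc_ok_22)
```

Such a check captures only the stated ranges; as noted above, the combination of parameters matters more than the absolute value of any single one.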
A “label” refers to a reporter molecule or enzyme that is capable of generating a measurable signal and is covalently or non-covalently joined to a polynucleotide.
The term “labelled”, with regard to, for example, a probe, is intended to encompass direct labelling of the probe by coupling (i.e., physically linking) a detectable substance to the probe, as well as indirect labelling of the probe by reactivity with another reagent that is directly labelled. Examples of indirect labelling include detection of a bridge probe (bound to a nucleic acid pair in the presence of a polynucleotide analyte) using a fluorescently labelled secondary probe (or readout probe).
The term “polynucleotide analyte” may refer to any polynucleotide that may be detected or analyzed by a pair of nucleic acid probes or probe system as defined herein. The analyte may be naturally-occurring or synthetic. A polynucleotide analyte may be present in a sample obtained using any methods known in the art. In some cases, a sample may be processed before analyzing it for a polynucleotide analyte. The polynucleotide may include DNA, RNA, peptide nucleic acids, and any hybrid thereof, where the polynucleotide contains any combination of deoxyribo- and/or ribo-nucleotides. Polynucleotides may be single stranded or double stranded, or contain portions of both double stranded or single stranded sequence. Polynucleotides may contain any combination of nucleotides or bases, including, for example, uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine and any nucleotide derivative thereof. As used herein, the term “nucleotide” may include nucleotides and nucleosides, as well as nucleoside and nucleotide analogs, and modified nucleotides, including both synthetic and naturally occurring species. Polynucleotides may be any suitable polynucleotide, including but not limited to cDNA, mitochondrial DNA (mtDNA), messenger RNA (mRNA), ribosomal RNA (rRNA), transfer RNA (tRNA), nuclear RNA (nRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), small Cajal body-specific RNA (scaRNA), microRNA (miRNA), double stranded RNA (dsRNA), ribozyme, riboswitch or viral RNA. Polynucleotides may be contained within any suitable vector, such as a plasmid, cosmid, fragment, chromosome, or genome. The polynucleotide analyte can be a nucleic acid endogenous to the cell.
As another example, the polynucleotide analyte can be a nucleic acid introduced to or expressed in the cell by infection of the cell with a pathogen, for example, a viral or bacterial genomic RNA or DNA, a plasmid, a viral or bacterial mRNA, or the like.
Genomic DNA may be obtained from naturally occurring or genetically modified organisms or from artificially or synthetically created genomes. Polynucleotide analytes comprising genomic DNA may be obtained from any source and using any methods known in the art. For example, genomic DNA may be isolated with or without amplification. Amplification may include PCR amplification, rolling circle amplification and other amplification methods. Genomic DNA may also be obtained by cloning or recombinant methods, such as those involving plasmids and artificial chromosomes or other conventional methods (see Sambrook and Russell, Molecular Cloning: A Laboratory Manual., cited supra.) Polynucleotide analytes may be isolated using other methods known in the art, for example as disclosed in Genome Analysis: A Laboratory Manual Series (Vols. I-IV) or Molecular Cloning: A Laboratory Manual. If the isolated polynucleotide analyte is an mRNA, it may be reverse transcribed into cDNA using conventional techniques, as described in Sambrook and Russell, Molecular Cloning: A Laboratory Manual., cited supra.
The term “gene” is used broadly to refer to any nucleic acid associated with a biological function. Genes typically include coding sequences and/or the regulatory sequences required for expression of such coding sequences. The term gene can apply to a specific genomic sequence, as well as to a cDNA or an mRNA encoded by that genomic sequence. Genes also include non-expressed nucleic acid segments that, for example, form recognition sequences for other proteins. Non-expressed regulatory sequences include promoters and enhancers, to which regulatory proteins such as transcription factors bind, resulting in transcription of adjacent or nearby sequences.
As used herein, the term “sample” includes tissues, cells, body fluids and isolates thereof etc., isolated from a subject, as well as tissues, cells and fluids etc. present within a subject (i.e. the sample is in vivo). Examples of samples include: whole blood, blood fluids (e.g. serum and plasma), lymph and cystic fluids, sputum, stool, tears, mucus, hair, skin, ascitic fluid, urine, nipple exudates, nipple aspirates, sections of tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes, archival samples, explants and primary and/or transformed cell cultures derived from patient tissues etc.
The sample (such as a tissue or cell sample) may be fixed and permeabilized before hybridization with a pair of nucleic acid probes as defined herein, to retain the polynucleotide analytes in the cell and to permit the nucleic acid probes, bridge probes, etc. to enter the sample. The sample is optionally washed to remove materials not captured to one of the polynucleotide analytes. The sample can be washed after any of the various steps, for example, after hybridization of the nucleic acid probes to the polynucleotide analytes to remove unbound nucleic acid probes, or after hybridization with the nucleic acid probes and bridge probes to remove unbound nucleic acid probes and bridge probes.
The terms “restriction enzyme” and “restriction endonuclease” as used herein mean an endonuclease enzyme that recognises and cleaves a specific sequence of DNA (recognition sequence).
In one aspect, there is provided a method of detecting a polynucleotide analyte in a sample, the method comprising:
In one embodiment, there is provided a method of determining the level of a polynucleotide analyte in a sample, the method comprising:
The various hybridization steps can be performed simultaneously or sequentially, in essentially any convenient order. In one embodiment, a hybridization step with the multiple pairs (or library) of nucleic acid probes is accomplished for all of the polynucleotide analytes at the same time. For example, all the nucleic acid probes can be added to the sample at once and permitted to hybridize to their corresponding targets, and the sample can then be washed. Corresponding bridge probes can be hybridized to the nucleic acid probes and the sample can be washed again prior to detection of the bridge probes. It will be evident that double-stranded polynucleotide analyte(s) are preferably denatured, e.g., by heat, prior to hybridization of the corresponding pair(s) of nucleic acid probes to the polynucleotide analyte.
The method may comprise the step of hybridizing a bridge probe to the pair of non-naturally occurring nucleic acid probes that are bound to the polynucleotide analyte that is present. Any unbound bridge probe may be removed or washed off.
The bridge probe may be coupled or conjugated to a label (such as a fluorescent label) that enables detection of the bridge probe and thus enables detection of the polynucleotide analyte. Such a bridge probe may also be referred to as a “readout probe”.
Alternatively, a secondary detection probe (i.e. a readout probe) may be hybridized to the bridge probe and allows the bridge probe (and the polynucleotide analyte) to be detected.
The bridge probe may comprise a specific tag or barcode sequence (such as a 6 nucleotide sequence). This may enable the bridge probe to be recognised by the secondary detection probe (or readout probe).
The method may allow the detection of the presence or levels of the polynucleotide analyte based on the signal that is detected.
The method may involve detecting one or more polynucleotide analytes. The polynucleotide analytes may be detected concurrently or sequentially.
In the case where the polynucleotide analytes are detected sequentially, this may involve multiple rounds of hybridization for each polynucleotide analyte with a specific pair of nucleic acid probes, and subsequent detection with bridge and/or readout probes. There may also be a step of washing or removal of signal (by, for example, bleaching) in between detection of each polynucleotide analyte.
In one aspect, there is provided a library for detecting two or more polynucleotide analytes in a sample; the library comprising two or more pairs of non-naturally occurring nucleic acid probes or a plurality of probe systems as defined herein, wherein each pair of nucleic acid probes is specific to each polynucleotide analyte; and wherein each pair of nucleic acid probes is configured to hybridize to a unique bridge probe in the presence of the polynucleotide analyte.
The term “unique bridge probe” may refer to the ability of a bridge probe to recognise a specific pair of nucleic acid probes. Each pair of nucleic acid probes in a library may comprise an “identification portion” (or barcode) in the probe binding arm of either the first or second nucleic acid probe (or both) for binding to a unique bridge probe. In one embodiment, the identification portion consists of 6 nucleotides (e.g. actcta). The bridge probe may have a corresponding barcode sequence that recognises the identification portion in the pair of nucleic acid probes.
More than one pair of nucleic acid probes (e.g. a set of nucleic acid probes) may comprise the same identification portion (or barcode) that allows them to bind to a unique bridge probe. A library of nucleic acid probe pairs may be grouped according to nucleic acid probe pairs that share the same identification portion (or barcode). This may allow for the combinatorial detection of polynucleotide analytes based on addition of a corresponding unique bridge probe that recognises nucleic acid probe pairs that share the same identification portion.
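The grouping described above can be sketched as a simple lookup. In the example below, the probe pair names and the second barcode are invented placeholders (only "actcta" appears in the description), and a bridge probe is represented by the identification portion it recognises rather than by its actual complementary sequence; both are simplifying assumptions.

```python
from collections import defaultdict

# Hypothetical library entries: (probe pair name, identification portion).
LIBRARY = [
    ("geneA_pair1", "actcta"),
    ("geneA_pair2", "actcta"),
    ("geneB_pair1", "tgacca"),
    ("geneC_pair1", "actcta"),
]


def group_by_identification(library):
    """Group probe pair names by their shared identification portion."""
    groups = defaultdict(list)
    for pair_name, identification in library:
        groups[identification].append(pair_name)
    return dict(groups)


def pairs_recognised_by(bridge_identification, groups):
    """Probe pairs expected to bind a bridge probe that recognises the
    given identification portion (empty list if none share it)."""
    return groups.get(bridge_identification, [])


groups = group_by_identification(LIBRARY)
recognised = pairs_recognised_by("actcta", groups)
print(recognised)
```

Adding one bridge probe therefore addresses every probe pair in its group at once, which is what permits the combinatorial detection scheme.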
A library of identification portions (or barcodes) may be used in certain embodiments, e.g., containing at least 10, at least 10², at least 10³, at least 10⁴, at least 10⁵, at least 10⁶, at least 10⁷, at least 10⁸, etc. unique sequences. The unique sequences may be all individually determined (e.g., randomly), although in some cases, the identification portion may be defined as a plurality of variable portions (or “bits”), e.g., in sequence. For example, an identification portion may include at least 2, at least 3, at least 5, at least 6, at least 7, at least 10, at least 15, at least 20, at least 25, at least 30, at least 40, or at least 50 variable portions. Each of the variable portions may include at least 2, at least 3, at least 4, at least 5, at least 7, at least 10, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, or more possibilities. In one embodiment, the identification portion consists of 6 variable portions.
Thus, for example, an identification portion defined with 22 variable regions and 2 unique possibilities per variable region would define a library of identification portions with 2²²=4,194,304 members. As another non-limiting example, an identification portion may be defined with 10 variable regions and 7 unique possibilities per variable region to define a library of identification portions with 7¹⁰ members. It should be understood that a variable portion may include any suitable number of nucleotides, and different variable portions within an identification portion may independently have the same or different numbers of nucleotides. Different variable regions also may have the same or different numbers of unique possibilities. For example, a variable portion may be defined having a length of at least 2, at least 3, at least 4, at least 5, at least 7, at least 10, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, or more nucleotides, and/or a maximum length of no more than 50, no more than 40, no more than 30, no more than 25, no more than 20, no more than 15, no more than 10, no more than 7, no more than 5, no more than 4, no more than 3, or no more than 2 nucleotides. Combinations of these are also possible, e.g., a variable portion may have a length of between 5 and 50 nt, or between 15 and 25 nt, etc. A non-limiting example of a library is illustrated with identification sequences 1-1, 1-0, 2-1, 2-0, etc. through 22-1 and 22-0, which may be concatenated together (e.g., identification sequence 1-identification sequence 2-identification sequence 3- . . . -identification sequence 22) to produce a bridge sequence (in this non-limiting example, each sequence position 1, 2, . . . 22 may have one of two possibilities, identified with -0 and -1, e.g., sequence position 1 can be either identification sequence 1-1 or 1-0, sequence position 2 can be either identification sequence 2-1 or 2-0, etc.).
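The library-size arithmetic for 22 two-valued variable regions and for 10 seven-valued variable regions can be verified with a short calculation. The sketch below also shows one way of naming the identification sequences selected at each position (1-1, 2-0, and so on); the function names are hypothetical.

```python
from math import prod


def library_size(possibilities_per_region):
    """Number of distinct identification portions, given the number of
    possibilities available at each variable region."""
    return prod(possibilities_per_region)


def identification_sequence_names(bits):
    """Name the identification sequence chosen at each position, e.g.
    bits (1, 0, 1) -> ['1-1', '2-0', '3-1']."""
    return [f"{position}-{value}"
            for position, value in enumerate(bits, start=1)]


size_binary_22 = library_size([2] * 22)   # 22 regions, 2 possibilities
size_seven_10 = library_size([7] * 10)    # 10 regions, 7 possibilities
names = identification_sequence_names((1, 0, 1))
print(size_binary_22, size_seven_10, "-".join(names))
```

Because `library_size` multiplies the per-region counts, it also covers the case noted above in which different variable regions have different numbers of unique possibilities.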
Similarly, according to certain embodiments, information could also be included in the absence of such sequences. For example, the same information included in the presence of one sequence (e.g., sequence 1-0) could also be determined from the absence of another sequence (e.g., sequence 1-1). Each identification sequence position may be thought of as a “bit” (e.g., 1 or 0 in this example), although it should be understood that the number of possibilities for each “bit” is not necessarily limited to only 2, unlike in a computer. In other embodiments, there may be 3 possibilities (i.e., a “trit”), 4 possibilities (i.e., a “quad-bit”), 5 possibilities, etc., instead of only 2 possibilities as in some embodiments.
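By way of non-limiting illustration only, the library-size arithmetic described above may be sketched in Python as follows. The function names `library_size` and `enumerate_identification_portions`, the use of position labels such as "1-0" in place of actual nucleotide sequences, and the "|" separator are assumptions made for this sketch and do not form part of the described embodiments.

```python
from itertools import product


def library_size(possibilities_per_position):
    """Return the number of distinct identification portions.

    possibilities_per_position: one integer per variable portion, giving the
    number of unique possibilities available at that position.
    """
    size = 1
    for n in possibilities_per_position:
        size *= n
    return size


def enumerate_identification_portions(options_per_position):
    """Yield every concatenation that takes exactly one option per position.

    The options here are placeholder labels (e.g. "1-0", "1-1"); in practice
    each label would stand for a nucleotide sequence.
    """
    for choice in product(*options_per_position):
        yield "|".join(choice)


# 22 variable portions with 2 possibilities each ("bits").
print(library_size([2] * 22))
# 10 variable portions with 7 possibilities each.
print(library_size([7] * 10))

# A small 3-position example that can be listed in full.
small = [["1-0", "1-1"], ["2-0", "2-1"], ["3-0", "3-1"]]
for member in enumerate_identification_portions(small):
    print(member)
```

Positions with different numbers of possibilities (e.g., a mixture of “bits” and “trits”) may be handled by passing a list with unequal entries, e.g., `library_size([2, 3, 2])`.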
The method for generating a library may comprise (a) associating barcode sequences with a plurality of oligonucleotide sequences and a plurality of codewords, wherein the codewords comprise a number of positions that is less than the number of targets, and (b) grouping the pairs of nucleic acid probes based on a plurality of codewords, wherein each of the bridge probes corresponds to a specific value of a unique position within the codewords. The method may comprise exposing a sample to one of the bridge probes; imaging the sample; and repeating the exposing and imaging steps one or more times, before repeating with a different bridge probe. This process may be repeated for at least 10, 15, 20, 50, 80, 100, or 500 repetitions.
In one aspect, there is provided a method of detecting two or more polynucleotide analytes in a sample, the method comprising:
In one embodiment, there is provided a method for combinatorial detection of two or more polynucleotide analytes in a sample, the method comprising:
In one embodiment, there is provided a method of determining the levels of two or more polynucleotide analytes in a sample, the method comprising:
In one embodiment, two or more nucleic acid probe pairs may be configured to bind to the same unique bridge probe to allow the two or more polynucleotide analytes to be detected combinatorically.
The terms “detecting”, “determining”, “measuring”, “evaluating”, “assessing” and “assaying” are used interchangeably herein to refer to any form of measurement, and include determining if an element is present or not. These terms include both quantitative and/or qualitative determinations. Assessing may be relative or absolute. The method as defined herein may comprise measuring or visualising the levels of two or more polynucleotide analytes in a sample.
In one embodiment, the method comprises contacting the sample with a unique (or bar-coded) bridge probe for each polynucleotide analyte.
In one embodiment, the multiple polynucleotide analytes are detected concurrently based on hybridization to a unique bridge probe for each polynucleotide analyte.
In one embodiment, the multiple polynucleotide analytes are detected sequentially based on multiple rounds of hybridization to a unique bridge probe for each polynucleotide analyte.
In one embodiment, the method comprises detecting the unique bridge probe via hybridization to a readout probe that is conjugated to a label.
In one embodiment, the method comprises contacting the sample with a unique readout probe for each polynucleotide analyte.
The method may comprise removing any bound or unbound bridge and/or readout probe (such as by washing) in between detection of each polynucleotide analyte.
The method may comprise removing any signal from any bound or unbound readout probe in between detection of each polynucleotide analyte. This may be done by, for example, bleaching or quenching a signal.
In one aspect, there is provided a kit comprising a pair of non-naturally occurring nucleic acid probes as defined herein or a library as defined herein. The kit may further comprise bridge probes for detecting nucleic acid probes that are bound to polynucleotide analytes. The bridge probes may be labelled to enable detection or measurement of the analyte. Alternatively, the kit may further comprise readout probes that bind to the bridge probes. The kit optionally also includes instructions for detecting one or more polynucleotide analytes in a sample, one or more buffered solutions (e.g., diluent, hybridization buffer, and/or wash buffer), and/or reference cell(s) comprising one or more of the polynucleotide analytes.
In one embodiment, there is provided a method of performing an array-based assay. Provided herein is also an array-based assay. The term “array” encompasses the term “microarray” and refers to an ordered array presented for binding to nucleic acids and the like. An “array,” includes any two-dimensional or substantially two-dimensional (as well as a three-dimensional) arrangement of spatially addressable regions bearing nucleic acids, particularly oligonucleotides or synthetic mimetics thereof, and the like.
Provided herein is a method of performing a multiplex fluorescence in situ hybridisation (FISH) assay.
Provided herein is a composition, the composition comprising a pair of non-naturally occurring nucleic acid probes as defined herein.
Essentially any type of cell that can be differentiated based on its nucleic acid content (presence, absence, expression level or copy number of one or more nucleic acids) can be detected and identified using the nucleic acid probes as defined herein to detect a suitable selection of polynucleotide analytes. The cell can, for example, be a circulating tumor cell, a virally infected cell, a fetal cell in maternal blood, a bacterial cell or other microorganism in a biological sample (e.g., blood or other body fluid), an endothelial cell, precursor endothelial cell, or myocardial cell in blood, a stem cell, or a T-cell. Rare cell types can be enriched prior to performing the methods, if necessary, by methods known in the art (e.g., lysis of red blood cells, isolation of peripheral blood mononuclear cells, further enrichment of rare target cells through magnetic-activated cell separation (MACS), etc.). The methods are optionally combined with other techniques, such as DAPI staining for nuclear DNA. It will be evident that a variety of different types of nucleic acid markers are optionally detected simultaneously by the methods and used to identify the cell. For example, a cell can be identified based on the presence or relative expression level of one nucleic acid target in the cell and the absence of another nucleic acid target from the cell; e.g., a circulating tumor cell can be identified by the presence or level of one or more markers found in the tumor cell and not found (or found at different levels) in blood cells, and its identity can be confirmed by the absence of one or more markers present in blood cells and not circulating tumor cells. The principle may be extended to using any other type of markers such as protein based markers in single cells.
Provided herein are methods of diagnosis of a disease. The disease may be cancer, or viral or bacterial infection or a genetic disorder due to the presence of a defective gene. The method may comprise detecting the presence or absence of one or more polynucleotide analytes in a sample obtained from a subject. Provided herein are also methods of treating the disease following detection of the disease.
By “subject” or “patient” is meant any single subject for which therapy is desired, including humans, cattle, horses, pigs, goats, sheep, dogs, cats, guinea pigs, rabbits, chickens, insects and so on. Also intended to be included as a subject are any subjects involved in clinical research trials not showing any clinical sign of disease, or subjects involved in epidemiological studies, or subjects used as controls.
One or more polynucleotide analytes associated with cancer can be detected using the nucleic acid probes as defined herein, e.g., those that encode overexpressed or mutated polypeptide growth factors (e.g., sis), overexpressed or mutated growth factor receptors (e.g., erb-B1), overexpressed or mutated signal transduction proteins such as G-proteins (e.g., Ras), or non-receptor tyrosine kinases (e.g., abl), or overexpressed or mutated regulatory proteins (e.g., myc, myb, jun, fos, etc.) and/or the like. In general, cancer can often be linked to signal transduction molecules and corresponding oncogene products, e.g., nucleic acids encoding Mos, Ras, Raf, and Met; and transcriptional activators and suppressors, e.g., p53, Tat, Fos, Myc, Jun, Myb, Rel, and/or nuclear receptors, p53. For detection of circulating tumor cells (CTC), a variety of suitable polynucleotide analytes are known. For example, a multiplex panel of markers for CTC detection could include one or more of the following markers: epithelial cell-specific (e.g., CK19, Muc1, EpCAM), blood cell-specific as negative selection (e.g., CD45), tumor origin-specific (e.g., PSA, PSMA, HPN for prostate cancer and mam, mamB, her-2 for breast cancer), proliferating potential-specific (e.g., Ki-67, CEA, CA15-3), apoptosis markers (e.g., BCL-2, BCL-XL), and other markers for metastatic, genetic and epigenetic changes.
Similarly, one or more polynucleotide analytes from pathogenic or infectious organisms can be detected by the nucleic acid probes as defined herein, e.g., for infectious fungi, e.g., Aspergillus, or Candida species; bacteria, particularly E. coli, which serves as a model for pathogenic bacteria (and, of course, certain strains of which are pathogenic), as well as medically important bacteria such as Staphylococci (e.g., aureus), or Streptococci (e.g., pneumoniae); protozoa such as sporozoa (e.g., Plasmodia), rhizopods (e.g., Entamoeba) and flagellates (Trypanosoma, Leishmania, Trichomonas, Giardia, etc.); viruses such as (+) RNA viruses (examples include Poxviruses, e.g., vaccinia; Picornaviruses, e.g., polio; Togaviruses, e.g., rubella; Flaviviruses, e.g., HCV; and Coronaviruses), (−) RNA viruses (e.g., Rhabdoviruses, e.g., VSV; Paramyxoviruses, e.g., RSV; Orthomyxoviruses, e.g., influenza; Bunyaviruses; and Arenaviruses), dsDNA viruses (e.g., Reoviruses), RNA to DNA viruses, i.e., Retroviruses, e.g., HIV and HTLV, and certain DNA to RNA viruses such as Hepatitis B.
Gene amplification or deletion events can be detected at a chromosomal level using the nucleic acid probes as described herein, as can altered or abnormal expression levels. Some polynucleotide analytes include oncogenes or tumor suppressor genes subject to such amplification or deletion. Exemplary nucleic acid targets include integrin (e.g., deletion), receptor tyrosine kinases (RTKs; e.g., amplification, point mutation, translocation, or increased expression), NF1 (e.g., deletion or point mutation), Akt (e.g., amplification, point mutation, or increased expression), PTEN (e.g., deletion or point mutation), MDM2 (e.g., amplification), SOX (e.g., amplification), RAR (e.g., amplification), CDK2 (e.g., amplification or increased expression), Cyclin D (e.g., amplification or translocation), Cyclin E (e.g., amplification), Aurora A (e.g., amplification or increased expression), P53 (e.g., deletion or point mutation), NBS1 (e.g., deletion or point mutation), Gli (e.g., amplification or translocation), Myc (e.g., amplification or point mutation), HPV-E7 (e.g., viral infection), and HPV-E6 (e.g., viral infection).
If a polynucleotide analyte is used as a reference, suitable reference nucleic acids have similarly been described in the art or can be determined. For example, a variety of genes whose copy number is stably maintained in various tumor cells is known in the art. Housekeeping genes whose transcripts can serve as references in gene expression analyses include, for example, 18S rRNA, 28S rRNA, GAPD, ACTB, and PPIB.
Provided herein is a method of detecting or visualising the expression of one or more polynucleotide analytes in a sample, the method comprising a) contacting a sample with a library as defined herein, and b) detecting or visualising the expression of each polynucleotide analyte based on hybridisation to a unique bridge probe in the presence of the one or more polynucleotide analytes.
The method may comprise detecting the presence or level of mRNA in a sample.
The sample may be a cell or tissue sample.
Throughout this specification, unless the context requires otherwise, the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element or integer or method step or group of elements or integers or method steps but not the exclusion of any other element or integer or method steps or group of elements or integers or method steps.
As used in the subject specification, the singular forms “a”, “an” and “the” include plural aspects unless the context clearly dictates otherwise. Thus, for example, reference to “a method” includes a single method, as well as two or more methods; reference to “an agent” includes a single agent, as well as two or more agents; reference to “the disclosure” includes a single and multiple aspects taught by the disclosure; and so forth. Aspects taught and enabled herein are encompassed by the term “invention”. Any variants and derivatives contemplated herein are encompassed by “forms” of the invention.
SPLIT-FISH library design. Targeting regions (pairs of 25-nt sequences with 2-nt spacing in between the pair) were identified using a previously published algorithm. First, reference transcript sequences were downloaded from the GENCODE website (human v24 and mouse m4 respectively). A specificity table was calculated using a 15-nt seed, and a specificity cut-off of 0.2 was applied. Quartet repeats (‘AAAA’, ‘TTTT’, ‘GGGG’, and ‘CCCC’), KpnI restriction sites (‘GGTACC’ (SEQ ID NO: 1) and ‘CCATGG’ (SEQ ID NO: 2)), and EcoRI restriction sites (‘GAATTC’ (SEQ ID NO: 3) and ‘CTTAAG’ (SEQ ID NO: 4)) were excluded from the possible target regions. Then, the right targeting region pairs were concatenated with the right bridge sequence (e.g., ‘CactctaCCTAT’ (SEQ ID NO: 5); lowercase indicates variable bases that form the 6-nt barcode, and TAT is a linker between the bridge sequence and the targeting region). The left targeting region pairs were concatenated with the left bridge sequence ‘TAT ATTTAACCG’ (SEQ ID NO: 6). Finally, KpnI and EcoRI restriction sites, as well as the forward and reverse PCR primers, were introduced at both ends of each side of the probes.
Removal of the PCR primers via restriction digestion is required for efficient subsequent hybridization of the bridge sequence. The list of encoding probes can be found in Table 1. The bridge sequences were flanked by readout sequences at both ends. The list of bridge sequences can be found in Table 3. The readout sequences used were ‘/5Cy5/TTACTCACGCACCCATCA’ (SEQ ID NO: 7) and ‘/5Alex750NfTTCTCACGCTTTACAAT’ (SEQ ID NO: 8). To construct the 317-gene combinatorial library, a ‘26 choose 2’ coding scheme was used. Eight of the 325 possible code-words were blanks, which are not assigned to any gene (no encoding probes), to act as negative controls that estimate the levels of the false-positive background. For each gene, 72 pairs of target regions were split into two pools. Each pool was assigned a 6-nt barcode according to the gene's ‘on’ bits. The gene codebook assignment for the 317-gene library can be found in Table 2. The conventional multiplexed FISH probe and library were designed as previously described. The conventional encoding probe library and readout sequences can be found in Tables 4 and 5 respectively. The conventional codebook can be found in Table 6.
Probe amplification and preparation. Probe library (Twist Bioscience) was made using a slightly modified version of a previously published protocol. Briefly, the oligopool was first amplified by limited cycle PCR using Phusion Hot Start Flex 2× master mix (NEB, Cat: M0536L) with an annealing temperature of 66° C., followed by an overnight in vitro transcription using a high yield in vitro transcription kit (NEB, Cat: E2050S). T7 promoter sequence was introduced on the reverse primer during the PCR. Next, reverse transcription from the RNA template (ThermoFisher Cat: EP0753) was performed. The RNA was then cleaved off using alkaline hydrolysis, leaving behind ssDNA which was then purified via spin column purification (Zymo Cat: C1016-50), and eluted in nuclease-free water (Ambion, Cat: AM9930). Cut primers, complementary to the EcoRI and KpnI restriction sites, were then annealed to the ssDNA probes before performing a double restriction digest for 16 hours at 37° C. using high fidelity enzymes (NEB Cat: R3101M, R3142M) to cleave off the forward and reverse primers. Finally, the ssDNA probes were purified using a spin column (Zymo, Cat: C1016-50) or magnetic beads (Beckman Coulter, Cat: A63882) and eluted in nuclease-free water. Probes were dried and stored at −20° C. The primers used for PCR are ‘AACGAACGGAGGGTCATTGG’ (SEQ ID NO: 9) and ‘TAATACGACTCACTATAGGGAGGCTCTACTCGCATTAGGG’ (SEQ ID NO: 10); the primers used for restriction digestion are ‘TACTCGCATTAGGGGAATTCNN’ (SEQ ID NO: 11) and ‘NNGTACCCCAATGACCCTCCGT’ (SEQ ID NO: 12).
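For illustration only, the ‘26 choose 2’ coding scheme described above may be sketched in Python as follows. The order in which code-words are enumerated, and the choice of the last eight code-words as blanks, are assumptions of this sketch; the actual gene-to-code-word assignment is that given in Table 2.

```python
from itertools import combinations

N_BITS = 26     # number of barcoded bridge sequences ("bits")
N_GENES = 317   # number of target genes in the library

# Every code-word has exactly two 'on' bits out of 26.
on_bit_pairs = list(combinations(range(N_BITS), 2))
codewords = [
    tuple(1 if bit in pair else 0 for bit in range(N_BITS))
    for pair in on_bit_pairs
]

# Hypothetical split: the first 317 code-words for genes, the rest as blanks.
gene_codewords = codewords[:N_GENES]
blank_codewords = codewords[N_GENES:]


def hamming(a, b):
    """Number of positions at which two code-words differ."""
    return sum(x != y for x, y in zip(a, b))


min_distance = min(hamming(a, b) for a, b in combinations(codewords, 2))

print(len(codewords))        # 325 possible code-words
print(len(blank_codewords))  # 8 blanks
print(min_distance)          # 2
```

Because any two distinct code-words with two ‘on’ bits each differ in at least two positions, the minimum Hamming distance of this scheme is 2. Each gene's two ‘on’ bits correspond to the two pools of encoding probes described above.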
Cell culture sample preparation. Human foreskin fibroblasts (ATCC® CRL-2097™), human A549 (ATCC® CCL-185™), and AML12 (ATCC® CRL-2254™) cells were cultured in Dulbecco's High Glucose Modified Eagles Medium (Hyclone™ Cat: SH30022.01), supplemented with 10% fetal bovine serum (Thermofisher, Cat: 26140079). A549 cells were cultured in DMEM/F12 1:1, supplemented with 10% fetal bovine serum. Cells were grown in 6-well plates on 22 mm×22 mm No. 1 coverslips (Marienfeld-Superior Cat: 0101050) for the XLOC_010514 and MUC5AC experiments, or 40 mm diameter No. 1 coverslips (Warner Instruments Cat: 64-1500) for the FLNA experiments. Cells were grown to ˜80% confluency before fixation in 4% vol/vol paraformaldehyde (Electron Microscopy Sciences Cat: 15714) in 1× PBS for 15 minutes at room temperature. Following fixation, the samples were quenched in 0.1 M Glycine (1st BASE) for 1 minute at room temperature. The cells were then permeabilized in 70% ethanol overnight at 4° C.
Tissue sample preparation and coverslip functionalization. All animal care and experiments were carried out in accordance with Agency for Science, Technology and Research (A*STAR) Institutional Animal Care and Use Committee (IACUC) guidelines. Coverslip functionalization and tissue processing were based on a slightly modified version of a previously published protocol. Briefly, coverslips (Warner Instruments Cat: 64-1500) were cleaned with 1 M KOH in an ultrasonic water bath for 20 minutes, rinsed thrice with MilliQ water followed by 100% methanol. Then, the coverslips were immersed in an amino-silane solution (3% vol/vol (3-Aminopropyl)triethoxysilane [MERCK Cat: 440140], 5% vol/vol acetic acid [Sigma Cat: 537020] in methanol) for 2 minutes at room temperature before being rinsed thrice with MilliQ water and air dried. Functionalized coverslips can then be used immediately or stored in a dry, desiccated environment at room temperature for several weeks. Histology work was performed by the Advanced Molecular Pathology Laboratory, IMCB, A*STAR, Singapore. Briefly, C57BL/6NTac mice aged 8 weeks (InVivos) were euthanized with ketamine; the kidney, liver, brain, and ovary were quickly harvested, cut into smaller pieces, frozen immediately in Optimal Cutting Temperature compound (Tissue-Tek O.C.T.; VWR, 25608-930), and stored at −80° C. 7 μm sections of fresh frozen tissues were cut using a cryotome onto functionalized coverslips. Sections were air-dried for 5 minutes at room temperature prior to fixation in 4% vol/vol paraformaldehyde in 1× PBS for 15 minutes. Following fixation, samples were rinsed once with 1× PBS and either permeabilized in 70% ethanol overnight at 4° C. or stored at −80° C.
XLOC_010514, MUC5AC, and FLNA experiments. After permeabilization, the cultured cells were equilibrated to room temperature before rehydration in 2× saline-sodium citrate (SSC, Axil Scientific Cat: BUF-3050-20X1L) for 5 minutes. Samples were incubated in a 10% formamide wash buffer, containing 10% deionized formamide (Ambion™ Cat: AM9342, AM9344) and 2×SSC, for 30 minutes at room temperature. The split probes were diluted in a 10% hybridization buffer to a final concentration of 20 nM per probe. The 10% hybridization buffer was composed of 10% deionized formamide (vol/vol) and 10% dextran sulfate (Sigma Cat: D8906) (wt/vol) in 2×SSC. The encoding probes were stained overnight at 37° C. in a humidified chamber. Following hybridization of the encoding probes, the samples were washed in a 10% formamide wash buffer twice, incubating for 15 minutes at 37° C. per wash. The samples were then removed from the 10% formamide wash buffer and stained with either the bridge probe or the conventional readout probe. The probes were diluted to a concentration of 10 nM in 10% hybridization buffer and stained for 20 minutes at room temperature. The cells were then washed once with 10% formamide wash buffer and then twice with 2×SSC at room temperature. DAPI (Sigma Cat: D9564) was stained at a concentration of 1 μg/mL in 2×SSC for 10 minutes at room temperature. The samples were then washed twice with 2×SSC and either imaged immediately or stored for no longer than 12 hours at 4° C. in 2×SSC before imaging. The list of XLOC_010514, MUC5AC, and FLNA sequences can be found in Tables 7, 8, and 9 respectively.
Multiplexed FISH experiments in tissue. Tissue samples were stained as described above, using 20% formamide concentration in the hybridization and wash buffers instead of 10%. For tissue samples, pre-hybridization was also extended to 3 hours at 37° C. in 20% formamide wash buffer. The samples were stained overnight or longer at a final probe concentration of 500 μM (2 to 3 fold higher concentration than used in the conventional experiment) in 20% hybridization buffer. After two 20% formamide washes, the samples were washed twice with 2×SSC and either imaged immediately or stored in 2×SSC for no longer than one week at 4° C. prior to imaging.
Split-FISH imaging cycle. Samples were then mounted into a flow chamber (Bioptechs Cat: FCS2), which was secured to the microscope stage. Hybridization of the bridge and readout probes in the flow chamber was done sequentially by buffer exchange controlled by a custom-built, computer-controlled fluidics system. The system consisted of three daisy-chained eight-way valves for buffer selection and a peristaltic pump providing the driving force for fluid flow, as previously described. The bridge probe solution contained 5 nM of each bridge sequence in a 10% hybridization buffer. The sample was incubated in the solution for 10 minutes at room temperature. Next, 5 nM of fluorescently labeled readout probe in 10% hybridization buffer was flowed into the chamber and incubated for another 10 minutes at room temperature. Following hybridization, the sample was washed with 10% formamide wash buffer to remove unbound probes. Imaging buffer was then flowed into the chamber before images were acquired. The imaging buffer consisted of 2×SSC, 50 mM Tris-HCl pH 8, 10% glucose, 2 mM Trolox (Sigma, Cat: 238813), 0.5 mg/ml glucose oxidase (Sigma, Cat: G2133) and 40 μg/ml catalase (Sigma, Cat: C30). To remove the fluorescent signals, the samples were washed with 40% formamide wash buffer. This hybridization and wash cycle was repeated until all the bits were imaged. With two-color imaging, 26 bits were completed in 13 cycles. 133-genes (Modified Hamming Distance 4) multiplexed FISH imaging using the conventional probes was performed as previously described. The conventional probe library correlated well with bulk RNA-seq (
Imaging Setup 1. The XLOC_010514 and MUC5AC experiments were performed using a custom-built microscope that was constructed around a Nikon Ti-E body, MS-200 ASI X-Y stage, CFI Plan Apo Lambda 100×1.45 N.A. oil-immersion objective, and Andor iXon Ultra 888 EMCCD camera. DAPI was excited by 405 nm (LuxX, 405-20), and Cy5 was excited by 638 nm (LuxX, 638-100) solid-state lasers (Omicron). For each laser excitation, Z-stacks were acquired at five different Z positions spaced 400 nm apart. The exposure time was 1 second.
Imaging Setup 2. The FLNA and multiplexed FISH experiments were performed using a second custom-built microscope that was constructed around a Nikon Ti2-E body, Marzhauser SCANplus IM 130×85 motorized X-Y stage, a Nikon CFI Plan Apo Lambda 60×1.4 N.A. oil-immersion objective, and an Andor Sona 4.2B-11 sCMOS camera. Focus was maintained using the Nikon Perfect Focus system and only one Z position was imaged per field of view per cycle. The DAPI channel was excited by a Coherent Obis 405 100 mW laser. The following two fiber lasers from MPB Communications: 2RU-VFL-P-1000-647-B1R (1000 mW), 2RU-VFL-P-500-750-B1R (500 mW) were used as illumination for Cy5 (647 nm) and Alexa750 (750 nm) respectively. All laser channels were combined and launched into a Newport F-SM8-C-2FCA fiber. The resulting beam was collimated and flattened using an AdlOptica 6_6 series Pi-shaper, then expanded before being sent into a 300 mm lens near the back-port of the Ti-2 to illuminate an approximately 230 μm×230 μm field of view. Custom multi-wavelength filters ZET488/532/592/647/750m (Chroma) and ZT488/532/592/647/750rpc-UF2 (Chroma) were used. A Finger Lakes Instrumentation HS-632 High Speed Filter Wheel, containing FF01-433/24-32, FF02-684/24-32 and FF01-776/LP-32 emission filters (Semrock), was attached to the output port between the microscope and the camera, allowing different emission filters to be used when imaging respective channels. The exposure time was 500 ms.
Image analysis. The multiplexed FISH images were processed by a custom Python pipeline, following a previously published approach but with modified pre-processing, gene callout filtering, and mosaic-stitching procedures. Briefly, the images from each hybridization cycle were first corrected for field and chromatic distortion. Images were then registered for translation relative to a selected frame in the Cy5 channel by phase correlation using a subpixel registration algorithm provided in the Scikit-image package. For each dataset, a global bit-wise normalization was performed by pooling all pixels above the 99.9th percentile of intensity in each field of view, then taking the 50th percentile of the pooled pixel intensities as a normalization value for the bit. Images were filtered in the frequency domain using a second order 2D band-pass Butterworth filter to remove cell background (low frequency cutoff) and camera noise (high frequency cutoff). The n-dimensional vector (where n is the number of bits) for each aligned pixel is then normalized to the unit length by dividing by its magnitude (L2 norm). The same normalization was done for each code-word in the set of genes. The Euclidean distance from the pixel vector to each gene's code-word was then calculated. All pixels were filtered for maximum Euclidean distance (distance threshold) to a gene's code-word, using a threshold of 0.52 for conventional and 0.33 for split-FISH. The L2 norm of each pixel vector was used as a second filter (magnitude threshold) to remove called pixels with too low intensities. The called and filtered pixels were then grouped into connected regions (4-connected neighbourhood) for each gene. Regions with only 1 pixel were subject to a second more stringent intensity threshold. Sets of parameters which yielded both good correlation to bulk FPKM counts and high gene counts were chosen. 
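As a minimal, non-limiting sketch of the pixel decoding steps described above (unit-length normalization, Euclidean distance to each code-word, and the distance and magnitude thresholds), the following Python code may be considered. The function name `decode_pixels`, the array layout, the toy 4-bit codebook, and the magnitude threshold value are assumptions made for illustration; only the distance threshold of 0.33 for split-FISH is taken from the text, and this sketch is not the custom pipeline itself.

```python
import numpy as np


def decode_pixels(pixel_vectors, codebook, distance_threshold, magnitude_threshold):
    """Assign each pixel to its nearest code-word, or -1 if it is not called.

    pixel_vectors: (n_pixels, n_bits) array of filtered, bit-normalized intensities.
    codebook:      (n_genes, n_bits) binary array of code-words.
    """
    pixel_vectors = np.asarray(pixel_vectors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)

    # L2 norm of each pixel vector; also used as the magnitude filter.
    magnitudes = np.linalg.norm(pixel_vectors, axis=1)
    safe = np.where(magnitudes > 0, magnitudes, 1.0)  # avoid division by zero
    unit_pixels = pixel_vectors / safe[:, None]

    # The same unit-length normalization is applied to each code-word.
    unit_codes = codebook / np.linalg.norm(codebook, axis=1, keepdims=True)

    # Euclidean distance from every pixel to every code-word.
    diffs = unit_pixels[:, None, :] - unit_codes[None, :, :]
    distances = np.linalg.norm(diffs, axis=2)

    nearest = distances.argmin(axis=1)
    nearest_distance = distances[np.arange(len(nearest)), nearest]

    called = (nearest_distance <= distance_threshold) & (magnitudes >= magnitude_threshold)
    return np.where(called, nearest, -1)


# Toy example: 3 code-words over 4 bits (hypothetical, not the 26-bit codebook).
toy_codebook = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]]
toy_pixels = [
    [5.0, 5.0, 0.1, 0.0],    # bright and close to code-word 0
    [0.01, 0.01, 0.0, 0.0],  # right direction, but too dim
    [1.0, 1.0, 1.0, 1.0],    # bright, but not close to any code-word
]
print(decode_pixels(toy_pixels, toy_codebook,
                    distance_threshold=0.33, magnitude_threshold=1.0))
```

In this sketch the full pixel-by-code-word distance matrix is held in memory, which is convenient for a small example; for whole fields of view, the pixels would typically be processed in batches. Grouping the called pixels into connected regions would follow as a separate step.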
The number of regions for each gene across all fields of view was then summed, and total counts for each gene compared to the respective FPKM values by calculating the Pearson correlation. The FPKM values from bulk RNA sequencing of mouse tissues were downloaded from the ENCODE portal (https://www.encodeproject.org/) with the following identifiers: ENCSR000BZC (ovary), ENCFF478QMU (kidney replicate 1), ENCFF638NYA (kidney replicate 2), ENCFF844MJF (liver replicate 1), ENCFF271DWG (liver replicate 2), ENCFF653BKJ (frontal cortex replicate 1), and ENCFF703SOK (frontal cortex replicate 2). The FPKM values of the AML12 cell line were obtained by performing bulk RNA sequencing in-house. Briefly, RNA was extracted using Isolate II RNA Mini Kit (Bioline), sequencing was performed at the GIS next generation sequencing platform, A*STAR, Singapore, and the sequences were analyzed using Salmon. The FPKM values (or their mean if the tissue has sequencing replicates) used for the Pearson correlation analysis are listed in Table 10. Cells were manually counted using the DAPI and RNA images. For the split-FISH library, 789, 4043, 7484, 13405, and 26001 cells were imaged for the AML-12, brain, liver, ovary and kidney experiments respectively. For the conventional library, 1382, 2581 and 2729 cells were imaged for the AML-12, brain and liver experiments respectively. Brightness and spot counting analysis for the MUC5AC and FLNA experiments (for
First, the split probe sequence was optimized using single-molecule FISH on MUC5AC transcripts in A549 cells (
Next, the inventors focused on optimizing the split-FISH workflow (
The performance of split-FISH was then compared against conventional multiplexed FISH in mouse cell cultures and mouse tissue slices. To demonstrate the combinatorial labelling of RNAs, 317 genes were randomly selected as targets, and 26 barcoded bridge sequences were designed. An ‘N Choose 2’ barcoding scheme (Table 2) was designed by assigning each of the two required barcodes to half of the available encoding probes (Table 1). Compared with samples stained with the conventional probe library, samples stained with the split probe library showed decreases in non-specific background (estimated as the median value of all the raw images) that was about 16% in cultured mouse hepatocytes (AML12,
To demonstrate that split-FISH works robustly without any tissue-specific clearing, the same probe set for the 317 genes was used and split-FISH imaging of three additional mouse tissues (kidney, liver, and ovary) was performed. The transcript counts from all the tissues also correlated strongly with bulk RNA-seq results, with log Pearson correlation values between 0.54 and 0.75 (
For each tissue type that was imaged, diverse localization patterns of the single-cell transcriptome were observed. For example, Map4 transcripts were found to be highly enriched in the neuronal processes in the frontal cortex (
In conclusion, the inventors showed accurate multiplexed FISH of 317 genes in diverse mouse tissues without requiring tissue clearing, demonstrating the prowess of split-FISH not only in simplifying tissue preparation protocols for multiplexed FISH, but also in broadening the range of accessible tissue types.
AAGCCCAGGGGTACTCCTTATATCCACCGAACCCTTAC (SEQ ID NO: 265)
AAGCCCAGGGGTACTCCTTA TAT TATCCACCG (SEQ ID NO: 329)
Number | Date | Country | Kind |
---|---|---|---|
10202001453Y | Feb 2020 | SG | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/SG2020/050353 | 6/24/2020 | WO |