Embodiments of the present invention relate to a method for analyzing a sample, in particular a biological sample. Further, embodiments of the present invention relate to a device for analyzing a biological sample. The method or the device may also be used for a chemical compound or a chemical element.
In order to address key problems in the field of life sciences, it is vital to precisely detect the presence of an analyte or molecular target in one of a biological sample (e.g. tissue samples or cell cultures), an environmental sample (e.g. a soil sample), a water sample, a diagnostic procedure (e.g. in a solid or liquid biopsy or a sample prepared from either) or a lysate or extract from such a sample. This can be done by introducing markers into the sample that bind to specific structures, e.g. specific biomolecules. These markers typically comprise an affinity reagent that attaches to the structure in question and a fluorescent dye that is either directly conjugated to the affinity reagent or attached to the affinity reagent by means of a secondary affinity reagent. There are various techniques for analyzing biological samples prepared in this way. The plexing level in fluorescence microscopy, i.e. the number of different fluorescent dyes that can be read out at the same time, is generally low, typically in the range of 1-5 dyes for channel-based readouts and 5-12 dyes for readouts with spectral detectors, which use dispersive optical elements such as prisms or gratings in combination with multiple detectors or array detectors. In fluorescence-based cytometry and sorting, somewhat higher plexing levels have been attained, but in this case too plexing is limited to a low number of dyes, and consequently of markers, that can be read out in one experiment.
“Fluorescent cell barcoding” is a multiplexing technique developed by Krutzik and Nolan (2006) that is based on using different mixtures of three fluorescent dyes, as described in Nat Methods. 2006 May; 3(5):361-8. doi: 10.1038/nmeth872.
Citing from Tsai et al. 2020: “Fluorescent cell barcoding (FCB) is a multiplexing technique for high-throughput flow cytometry (FCM). Although powerful in minimizing staining variability, it remains a subjective FCM technique because of inter-operator variability and differences in data analysis” (J Immunol Methods 2020 February; 477:112667. doi: 10.1016/j.jim.2019.112667. Epub 2019 Nov. 11). Both the subjectiveness of the technique and the inter-operator variability of this method are inherently related to the fact that it encodes part of the information in the hues of dyes, i.e. in intensity variations such as light green, green, and dark green, which severely limits the use of this technique.
However, no technique allows for the fluorescence-based readout of a high number of different markers, wherein readout may refer to image-based or non-image-based readouts.
Embodiments of the present invention provide a method for analyzing a sample. The sample includes a plurality of affinity reagents, at least one of the plurality of affinity reagents being attached to an analyte, and a first plurality of combinations of dyes. Each combination of dyes is unique within the first plurality of combinations of dyes. Each combination of dyes includes at least two dyes having different characteristics for at least one of excitation or emission. Each one of the unique combinations of dyes is attached to an associated affinity reagent of the plurality of affinity reagents according to a first mapping. The method includes directing excitation light at the sample, the excitation light having characteristics for exciting at least one of the at least two dyes having different characteristics, generating at least one first readout from emission light emitted by the excited dyes, and determining, by at least one computer processor, at least one affinity reagent present in the sample based on the at least one first readout.
Subject matter of the present disclosure will be described in even greater detail below based on the exemplary figures. All features described and/or illustrated herein can be used alone or combined in different combinations. The features and advantages of various embodiments will become apparent by reading the following detailed description with reference to the attached drawings, which illustrate the following:
Fluorescence microscopy allows for imaging the sample with high spatial resolution but involves only a low number of different fluorescent dyes, typically between 1 and 5. The available markers have to accommodate markers that are used to identify cell types, functional markers like a protein of interest, and general morphological markers in the same experiment. As a result, cell types in most imaging experiments are only poorly identified, so that rather broad multi-cell-type populations are being studied, which severely limits the predictive power and translational value of the results generated. While modern approaches exist that allow for a much more reliable and robust identification of cell types, e.g. based on the analysis of genetic regulatory networks (GRNs), they require a much higher number of different markers to be read out from the sample.
While in the adjacent field of cytometry, mass cytometry and imaging mass cytometry techniques can distinguish between around 12 to 30 different markers, they do so with a low spatial resolution.
Spatial profiling techniques can distinguish a number of different markers several orders of magnitude higher, albeit at an even lower spatial resolution as they are based on hybridizing oligonucleotides to the sample and then selectively releasing bound oligonucleotides in a region-selective fashion followed by next-generation sequencing of the released oligonucleotides.
Embodiments of the present invention allow a very high number of markers to be read out by means of a fluorescence-based optical readout, which may be based on a continuous data readout stream or a discrete readout (digital or analog) and may be based on point detectors, line detectors or area detectors such as cameras or hyperspectral cameras. The method is therefore widely applicable in life sciences, diagnostics, environmental sciences, healthcare and quality control and can be combined with a wide array of optical readout devices. These include but are not limited to cytometers, plate readers, microscopes, and imaging systems.
Embodiments of the present invention achieve marker discrimination capability and coverage rates attainable presently with next-generation sequencing-based readouts on the basis of an optical fluorescence-based readout and can be implemented on commercially available fluorescence imaging systems, such as for example the STELLARIS 8 confocal microscope platform (from Leica Microsystems).
Embodiments of the present invention are based on “looking at” microscopy as an encoding/decoding problem rather than a problem of registering spatially located intensities in an image, which is essentially a matrix of intensity values. While the method according to embodiments of the present invention is compatible with image-based readouts, the “images” generated by the method and device described herein should be regarded as probabilistic mathematical models of the reality of the sample under investigation, in which the presence of a target molecule is detected or called (presence calling) based upon the decision of the user to accept its presence based on a measure of statistical confidence and a certain level of statistical confidence in the presence of the respective target molecule or analyte in the readout volume.
Making the step of accepting the presence of a certain marker, a certain affinity reagent, and a certain target molecule based on a measure of statistical confidence and/or a certain level of statistical confidence (i.e. presence calling) generates a mathematical truth. This is important as it also implies that, following presence calling, one is operating in the axiomatic domain of mathematics, which is inherently free of influences that complicate measurements in the non-mathematical domain, i.e. the physical, chemical, or biological domain. This fact therefore has important implications and also indicates that this method enables a completely new microscopic modality.
The statistical methods to provide a measure of statistical confidence on a per marker-basis or per target-molecule basis, may well be a combined measure and may in many ways be similar or identical to methods, which are used in transcriptomics and genomics, where enrichment scores and p values are commonly used.
In the sense of this document the following terms are used in the following way:
“Sample”: In the sense of this document “sample” refers to a biological sample which may also be named a biological specimen including, for example, blood, serum, plasma, tissue, bodily fluids (e.g. lymph, saliva, semen, interstitial fluid, cerebrospinal fluid), feces, solid biopsy, liquid biopsy, explants, whole embryos (e.g. zebrafish, Drosophila), entire model organisms (e.g. zebrafish larvae, Drosophila embryos, C. elegans), cells (e.g. prokaryotes, eukaryotes, archaea), multicellular organisms (e.g. Volvox), suspension cell cultures, monolayer cell cultures, 3D cell cultures (e.g. spheroids, tumoroids, organoids derived from various organs such as intestine, brain, heart, liver, etc.), a lysate of any of the aforementioned, or a virus. In the sense of this document “sample” further refers to a volume surrounding a biological sample. For example, in assays in which secreted proteins such as growth factors or extracellular matrix constituents are being studied, the extracellular environment surrounding a cell up to a certain assay-dependent distance is also referred to as the “sample”. Specifically, affinity reagents brought into this surrounding volume are referred to in the sense of this document as being “introduced into the sample”.
“Affinity reagent”: In the sense of this document the term “affinity reagent” may in particular be an antibody, a single-domain antibody (also known as nanobody), a combination of at least two single-domain antibodies, an aptamer, an oligonucleotide, a morpholino, a PNA complementary to a predetermined RNA, DNA target sequence, a ligand (e.g. a drug or a drug-like molecule), or a toxin, e.g. Phalloidin a toxin that binds to an actin filament. In the sense of this document an affinity reagent is configured to bind a target molecule or to an analyte with a certain affinity and specificity such that it can be said that the affinity reagent is substantially specific to the target molecule or predetermined target structure. In the sense of this document “plurality of affinity reagents” (S2) contains the affinity reagents (a1, a2, a3, . . . an), which are configured to specifically bind to a predetermined target structure within the biological sample or to a predetermined chemical compound or to a predetermined chemical element or to an analyte. At least some of the affinity reagents from the plurality of affinity reagents (A) are “introduced to the sample” such that the affinity reagents can attach to the respective predetermined target structure within the sample. In this context and in the sense of this document and as described above “introduced to the sample” may refer to being physically introduced into the volume of the sample or into a volume surrounding and assigned to the sample. An example of the latter case may be assays for secreted molecules for instance, which are best assessed in the extracellular space where they might be outside of the sample, but within a certain spatial context or vicinity of the sample.
“Predetermined target structure”: In the sense of this document “predetermined target structure” refers to a target molecule or a target structure or to an analyte, which may for example be a protein (e.g. a certain protein), an RNA sequence (e.g. the mRNA of a certain gene), a peptide (e.g. somatostatin), a DNA sequence (e.g. a genetic locus or element), a metabolite (e.g. lactic acid), a hormone (e.g. estradiol), a neurotransmitter (e.g. dopamine), a vitamin (e.g. cobalamin), a micronutrient (e.g. biotin), or a metal ion (e.g. metal and heavy metal ions like Cd(II), Co(II), Pb(II), Hg(II), U(VI)).
“Dye”: In the sense of this document the terms “fluorescent dye”, “fluorophore”, “fluorochrome”, “dye” are used interchangeably to denote a fluorescent chemical compound or structure and can be in particular one of the following: a fluorescent organic dye, a fluorescent quantum dot, a fluorescent dyad, a fluorescent carbon dot, graphene quantum dot or other carbon-based fluorescent nanostructure, a fluorescent protein, a fluorescent DNA origami-based nanostructure. From the organic fluorescent dyes in particular derivatives of the following are meant by the term “fluorescent dye”: xanthene (e.g. fluorescein, rhodamine, Oregon green, Texas red), cyanine (e.g. cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, merocyanine), squaraine rotaxane, naphthalene, coumarin, oxadiazole, anthracene (anthraquinones, DRAQ5, DRAQ7, CyTRAK Orange), pyrene (cascade blue), oxazine (Nile red, Nile blue, cresyl violet, oxazine 170), acridine (proflavine, acridine orange, acridine yellow), arylmethine (auramine, crystal violet, malachite green), tetrapyrrole (porphin, phthalocyanine, bilirubin), dipyrromethene (BODIPY, aza-BODIPY), a phosphorescent dye, or a luminescent dye. The following trademark groups designate commercially available fluorescent dyes, which may include dyes belonging to different chemical families: CF dye (Biotium), DRAQ and CyTRAK probes (BioStatus), BODIPY (Invitrogen), EverFluor (Setareh Biotech), Alexa Fluor (Invitrogen), Bella Fluor (Setareh Biotech), DyLight Fluor (Thermo Scientific), Atto and Tracy (Sigma-Aldrich), FluoProbes (Interchim), Abberior Dyes (Abberior), Dy and MegaStokes Dyes (Dyomics), Sulfo Cy dyes (Cyandye), HiLyte Fluor (AnaSpec), Seta, SeTau and Square Dyes (SETA BioMedicals), Quasar and Cal Fluor dyes (Biosearch Technologies), SureLight Dyes (Columbia Biosciences), Vio Dyes (Miltenyi Biotec) [list modified from: https://en.wikipedia.org/wiki/Fluorophore].
From the group of fluorescent proteins in particular the members of the green fluorescent protein (GFP) family including GFP and GFP-like proteins (e.g., DsRed, TagRFP) and their (monomerized) derivatives (e.g., EBFP, ECFP, EYFP, Cerulean, mTurquoise2, YFP, mCitrine, Venus, YPet, Superfolder GFP, mCherry, mPlum) are meant by the term “fluorescent dye” in the sense of this document. Further from the group of fluorescent proteins the term “fluorescent dye” in the sense of this document may include fluorescent proteins whose absorbance or emission characteristics change upon binding of a ligand, like for example BFPms1, or in response to changes in the environment, like for example redox-sensitive roGFP or pH-sensitive variants. Further from the group of fluorescent proteins the term “fluorescent dye” in the sense of this document may include the small ultra red fluorescent protein smURFP, a derivative of a cyanobacterial phycobiliprotein, as well as fluorescent protein nanoparticles that can be derived from smURFP. An overview of fluorescent proteins can be found in Rodriguez et al. 2017 in Trends Biochem Sci. 2017 February; 42(2): 111-129. The term “fluorescent dye” in the sense of this document may further refer to a fluorescent quantum dot. The term “fluorescent dye” in the sense of this document may further refer to a fluorescent carbon dot, a fluorescent graphene quantum dot, or a fluorescent carbon-based nanostructure as described in Yan et al. 2019 in Microchimica Acta (2019) 186: 583 and Iravani and Varma 2020 in Environ Chem Lett. 2020 Mar. 10: 1-25. The term “fluorescent dye” in the sense of this document may further refer to a fluorescent polymer dot (Pdot) or nanodiamond. The term “fluorescent dye” in the sense of this document may further refer to a fluorescent dyad, like for example a dyad of a perylene antenna and a triangulenium emitter as described in Kacenauskaite et al. 2021 J. Am. Chem. Soc. 2021, 143, 1377-1385.
The term “fluorescent dye” in the sense of this document may further refer to an organic dye, a dyad, a quantum dot, a polymer dot, a graphene dot, a carbon-based nanostructure, a DNA origami-based nanostructure, a nanoruler, a polymer bead with incorporated dyes, a fluorescent protein, an inorganic fluorescent dye, a SMILE, or a microcapsule filled with any of the aforementioned.
The term “fluorescent dye” in the sense of this document may further refer to a FRET-pair having at least one fluorescent dye as a FRET donor and at least one fluorescent dye as a FRET acceptor, or a FRET-triplet, which is used to generate a three-component Förster resonance energy transfer. In particular, the FRET-pair or FRET-triplet is connected by a complementary linker or by a linking element.
The term “fluorescent dye” in the sense of this document may further refer to a FRET n-tuple of physically connected dyes.
“Plurality of combinations of dyes” (S1): In the sense of this document the term “Plurality of combinations of dyes” (S1) refers to a plurality of combinations of dyes for which each combination of dyes (s1, s2, s3, . . . sn) is unique within the plurality of combinations of dyes (S1) and each combination of dyes (s1, s2, s3, . . . sn) comprises at least two different dyes (|s|>=2); wherein the plurality of combinations of dyes (S1) is composed such that each dye (y1, y2, y3, . . . yσ) in the plurality of combinations of dyes (S1) can be read out by a readout device; wherein dyes can be separated by a readout device into channels, each channel corresponding to one of the dyes (y1, y2, y3, . . . yσ).
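As a minimal illustration of this definition, the following Python sketch enumerates every combination of at least two dyes that can be formed from a small set of dyes. The dye labels and the function name `combinations_of_dyes` are hypothetical and chosen only for illustration; this document does not prescribe how the plurality of combinations of dyes (S1) is constructed.

```python
from itertools import combinations


def combinations_of_dyes(dyes, min_size=2):
    """Enumerate all unique combinations containing at least `min_size` dyes."""
    unique_dyes = sorted(set(dyes))
    result = []
    for size in range(min_size, len(unique_dyes) + 1):
        for combo in combinations(unique_dyes, size):
            # A frozenset ignores ordering, so each combination is unique.
            result.append(frozenset(combo))
    return result


# Hypothetical dye labels y1..y4 (not taken from the document).
S1 = combinations_of_dyes(["y1", "y2", "y3", "y4"])
print(len(S1))  # 2**4 - 4 - 1 = 11 combinations with at least two dyes
```

With σ dyes, this construction yields 2^σ − σ − 1 combinations, since the empty combination and the σ single-dye combinations are excluded.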
“Marker”: In the sense of this document “marker” is used to denote both a single molecule used as marker and a collection of identical molecules used as marker. A “marker” in the sense of this document is the combination of an affinity reagent configured to attach to a predetermined structure also referred to as a target molecule or an analyte and/or a “reporter”. As such the “marker” is the virtual assignment or mapping of an affinity reagent to a particular combination of dyes (virtual marker) and the physical assembly of an affinity reagent with the combination of dyes (physical marker). In the sense of this document, the physical assembly of an affinity reagent with the combination of dyes may occur before, during, or after the introduction of the respective affinity reagent into the sample. Like for example when oligonucleotide sequence barcoded antibodies are used as affinity reagents, they may be brought into a sample and allowed to attach to their predetermined target structure, e.g. by physically attaching the unique combination of dyes (si) to the assigned affinity reagent (ai), before or after introducing at least some affinity reagents from the plurality of affinity reagents (A) to the sample or to the chemical compound or to the chemical element, or before a generation of a readout from emission light emitted by excited dyes. In an iterative staining-imaging-dye deactivation process an affinity reagent bound to a predetermined target structure may be cyclically connected to a sequence of different combinations of dyes, as in a first combination of dyes in a first iteration and a second combination of dyes in a second iteration, a strategy to which we refer as “Primary qualitative iterative multi-species readout volume decoding by reassigning codes in between iterations (“code swapping”)”. In other words, one or more of the markers in the sample will change between iterations.
“Reporter”: In the sense of this document “reporter” is used to denote both a single molecule/structure used as reporter and a collection of identical molecules/structures used as reporter. A “reporter” in the sense of this document is the combination of a unique “combination of dyes” and “linker”, configured to connect the combination of “dyes” with the “affinity reagent”.
“Linker”: In the sense of this document the linker denotes a unipartite chemical structure (e.g. a monomeric molecule or a polymer) or multipartite assembly of chemical structures linking a combination of fluorescent dyes to an affinity reagent. A linker might be directly or covalently coupled to the dyes and to the affinity reagent or indirectly through, for example, an affinity tag-affinity ligand combination such as a streptavidin-biotin interaction, a hapten, or an oligonucleotide. In the case of covalent coupling this may be a site-selective coupling. Commonly used coupling chemistries such as NHS, maleimide, azide-alkyne and a range of further so-called click chemistries may be used to couple the linker to the affinity reagent and/or the linker to a dye. A linker may in particular comprise an oligonucleotide (e.g. DNA, RNA, LNA, PNA, morpholino, other artificial oligonucleotide), a peptide, a DNA-origami-based structure such as for example a nanoruler, a micro-/nanobead, a polymer, a micro-/nanocapsule, a micro-/nanocrystal, a carbon nanotube, or a carbon-based nanostructure (e.g. graphene).
A linker may in particular comprise an oligonucleotide and another element of the group mentioned before, like for example comprise an oligonucleotide and a peptide.
“Readout device”: In the sense of this document “readout device” refers to a device used to perform fluorescence multicolour reading or imaging. A readout device typically includes at least one excitation light source and a detection system including at least one detection channel, and may as well contain filters and/or dispersive optical elements such as prisms and/or gratings to route excitation light to the sample and/or to route emission light from the sample onto a detector or onto an appropriate area of the detector. The detection system in the sense of this document may comprise several detection channels, may be a spectral detector detecting multiple bands of the spectrum in parallel, or a hyperspectral detector detecting a contiguous part of the spectrum. The detection system comprises at least one detector, which may be a point detector (e.g. a photomultiplier, an avalanche diode, a hybrid detector), an array detector, a camera, or a hyperspectral camera. The detection system may record intensities per channel as is typically the case in cytometers or may be an imaging detection system that records images as in the case of plate readers or microscopes. A readout device with one detector channel, like for example a camera or a photomultiplier, may generate readouts with multiple detection channels using, for example, different excitation and emission bands. Readout devices allow a certain number of dyes to be analysed from a given biological sample in a given run. A “run” may refer to an “iteration” or “round”, i.e. the production of at least one readout for a given set of combinations of dyes and a given mapping of affinity reagents to combinations of dyes, wherein the affinity reagents are attached to the analytes. This number typically depends on the number of detection channels, n, the readout device is configured to provide, i.e. is able to spectrally resolve.
In the case of microscopes the number of detection channels is typically 4-5 in the case of camera-based widefield detections (e.g. widefield epifluorescence microscopes, spinning disk microscopes, light sheet fluorescence microscopes), 5-12 detection channels in the case of microscopes with spectral detection concepts that typically rely on excitation or emission fingerprinting and (spectral/linear) unmixing. Hyperspectral imaging instead, which can differentiate a high number of dyes by providing very fine spectral resolution over a wide and contiguous spectral range is not yet widely deployed in microscopy. In addition to spectral properties also the lifetime of fluorescent dyes can be used to discern multiple dye species and differentiate them from autofluorescence effectively increasing the number of detection channels, y, which might correspond also to the maximum number of dyes in a set (i.e. excitable by the same excitation light) that can be reliably separated.
“Oligonucleotide”: in the sense of this document refers to DNA, RNA, peptide nucleic acid, morpholino or locked nucleic acid, glycol nucleic acid, threose nucleic acid, hexitol nucleic acid or another form of artificial nucleic acid.
“Spot”: in the sense of this document refers to a volume in the sample or region surrounding the sample, which is being read out. The size and shape of spots is dictated by the effective point spread function of the imaging system used to acquire the data.
“Point spread function”: In the sense of this document the term “point spread function” is used to denote the main maximum of the point spread function and unless otherwise denoted the term refers to the effective point spread function (PSF) of the imaging system, which is generally elliptical, i.e. the lateral resolution is better than the axial resolution, but may approach an almost spherical shape as more views are acquired from preferably equidistant angles.
“Readout”: In the sense of this document the term “readout” refers to an image-based readout, which may be acquired on a microscope like a point-scanning confocal or a camera-based/widefield imaging system like for example a spinning disk microscope, a light sheet fluorescence microscope, a light field microscope, or a stereomicroscope. Further the term “readout” refers to non-image-based readouts like for example in a cytometer or a flow-through based readout device with at least one point detector or a line detector. A readout may consist of a discrete readout, like for example a single acquisition of an emission spectrum or image stack, or it may be a readout data stream, like for example in a point-scanning confocal or cytometer, which is substantially continuous. Further a readout may be a sequence of images, for example a spectral or hyperspectral image stack, wherein in each image fluorescence emission of a different wavelength band is recorded.
“Readout volume”: In the sense of this document the term “readout volume” refers to the volume which is effectively detected by an optical system such as a microscope or a cytometer at a given moment in time. For systems with a continuous data stream, the “readout volume” is determined by a clock like a “pixel clock”, which divides a continuous data stream into chunks that are then assigned to a certain time point or spatial location. The readout volume might depend on the effective point spread function of the imaging system, e.g. an effective point spread function might define or confine the maximum extent of a readout volume.
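The role of a pixel clock can be pictured with the following minimal Python sketch, which divides a continuous stream of detector samples into chunks and sums each chunk into one value per readout volume. It assumes, purely for illustration, that the clock is modelled as a fixed number of samples per chunk; the function name `chunk_stream` and the sample values are hypothetical.

```python
def chunk_stream(samples, samples_per_pixel):
    """Divide a continuous stream of samples into chunks of equal length.

    Each complete chunk is summed to one value per readout volume; a trailing
    incomplete chunk is discarded (an assumption made for this sketch).
    """
    if samples_per_pixel <= 0:
        raise ValueError("samples_per_pixel must be positive")
    n_chunks = len(samples) // samples_per_pixel
    return [
        sum(samples[i * samples_per_pixel:(i + 1) * samples_per_pixel])
        for i in range(n_chunks)
    ]


# Hypothetical photon counts from a point detector.
stream = [1, 0, 2, 3, 1, 1, 0, 0, 5, 7]
per_volume = chunk_stream(stream, samples_per_pixel=3)
print(per_volume)  # [3, 5, 5]; the last sample does not fill a chunk
```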
“Readout sequence”: In the sense of this document the term “readout sequence” is used to refer to a readout of a “readout volume” that reads out, at least once, all dyes (y1, y2, y3, . . . yσ) in the plurality of dyes YD from which the combinations of dyes in the plurality of combinations of dyes (S1) are composed, i.e. all dyes (y1, y2, y3, . . . yσ) in the plurality of dyes YD are excited at least once and the emitted fluorescence light is detected and separated by the readout device into channels, each channel corresponding to one of the dyes (y1, y2, y3, . . . yσ). After obtaining the readout sequence, the presence or absence of a dye in the readout volume can thus be assessed qualitatively and/or quantitatively, wherein qualitatively refers to calling a dye present in the readout volume when the intensity in the corresponding channel is above a certain user-defined threshold, and quantitatively refers to calling a dye present in the readout volume and assigning a relative intensity value or absolute number of molecules to it. The threshold may be a fixed threshold, a fixed channel-specific threshold, or a dynamically adjusted threshold. The threshold may be a combination of thresholds like for example an intensity threshold and a statistical confidence in the dye separation result. The decision to call a dye present (presence calling) may be made dependent on passing a combination of multiple thresholds.
Preferably, a readout sequence results from exciting the sample with a first excitation light A, detecting the emitted fluorescence, and assigning it to yA channels corresponding to DyeA1, DyeA2, DyeA3, . . . DyeAyn, then exciting the sample with a second excitation light B, detecting the emitted fluorescence, and assigning it to yB channels corresponding to DyeB1, DyeB2, DyeB3 . . . DyeByn, and repeating the process until Dyen1, Dyen2, Dyen3, . . . Dyenyn (i.e. the entire plurality of dyes (YD) with yσ members) have been read out at least once.
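Qualitative presence calling on such a readout sequence can be sketched as follows. This is a minimal Python illustration that assumes fixed channel-specific intensity thresholds only; the function name `call_presence`, the intensity values, and the threshold values are hypothetical, and a statistical-confidence criterion is not modelled.

```python
def call_presence(intensities, thresholds):
    """Call each dye present (1) or absent (0) in the readout volume.

    A dye is called present when the intensity in its channel is above the
    channel-specific threshold.
    """
    return {
        dye: int(value > thresholds[dye])
        for dye, value in intensities.items()
    }


# Hypothetical channel intensities after excitation lights A and B.
readout_a = {"DyeA1": 120.0, "DyeA2": 8.0}
readout_b = {"DyeB1": 95.0, "DyeB2": 3.0}

# The readout sequence covers all dyes, each read out once.
readout_sequence = {**readout_a, **readout_b}

# Hypothetical channel-specific thresholds.
thresholds = {"DyeA1": 50.0, "DyeA2": 50.0, "DyeB1": 40.0, "DyeB2": 40.0}

calls = call_presence(readout_sequence, thresholds)
present = {dye for dye, flag in calls.items() if flag}
print(sorted(present))  # ['DyeA1', 'DyeB1']
```

A quantitative variant could return the intensity values of the called dyes instead of binary flags.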
A “code” in the sense of this document is defined as follows: S and T are two finite sets, with S being named the “source alphabet” and T being named the “target alphabet”. A code
C: S → T*
is a total function or algorithm that uniquely represents an element from S as a sequence of symbols over T. The extension C′ of C is a homomorphism of S* into T*, which naturally maps each sequence of source symbols to a sequence of target symbols. In the language used in computer science a code is generally referred to as an algorithm and a sequence of symbols as an encoded string (modified from: Code. (n.d.) In Wikipedia. Retrieved Jun. 17, 2021 from https://en.wikipedia.org/wiki/Code). In the sense of this document the finite set S1 is also named the “plurality of combinations of dyes”, and T1* is the finite set of strings over T1 and corresponds to the “plurality of affinity reagents”, which may also be named A or S2.
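The general definition of a code and its extension can be made concrete with a short Python sketch. The alphabets and the code words below are hypothetical and unrelated to dyes; they only illustrate that the extension C′ maps a sequence of source symbols to the concatenation of their code words and is therefore a homomorphism with respect to concatenation.

```python
def extend(code):
    """Return the extension C' of a code C given as a dict from S to strings over T."""
    def extension(source_sequence):
        # Concatenate the code word of every source symbol in order.
        return "".join(code[symbol] for symbol in source_sequence)
    return extension


# Hypothetical code over the source alphabet S = {a, b, c} and target alphabet T = {0, 1}.
C = {"a": "0", "b": "10", "c": "11"}
C_ext = extend(C)

encoded = C_ext("abc")
print(encoded)  # 01011

# Homomorphism property: C'(uv) equals C'(u) followed by C'(v).
print(C_ext("ab" + "c") == C_ext("ab") + C_ext("c"))  # True
```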
Alternatively, or in addition to the encoding/decoding of combinations of dyes users may encrypt/decrypt combinations of dyes using a cipher X
Two different cases α and β are being discerned and can be regarded as different directionalities of encoding/encryption.
The method disclosed in this document is compatible with both cases α and β, as well as with cases in which multiple codes C1, C2, C3, . . . Cn and/or ciphers X1, X2, X3, . . . Xn are being used, as long as they are total functions and as long as the resulting mapping is injective or bijective. In both of these cases the codes C1, C2, C3, . . . Cn and/or ciphers X1, X2, X3, . . . Xn are functions that can be inverted, i.e. decoded. In a preferred embodiment of the invention, a bijective mapping (encoding or encryption) is used, which means that there is a one-to-one correspondence between an element (ai) of the plurality of affinity reagents (S2), also named (A) or (T1*), and an element (si) of the plurality of combinations of dyes (S1). This allows easy decoding of the combinations of dyes (s1 to sk) contained in the readout volume and thereby the identification of their associated affinity reagents (a1 to ak) based on a readout sequence that assesses the presence of all dyes (y1, y2, y3, . . . yσ) in the readout volume qualitatively (presence calling, e.g. “yes”=“1”, “no”=“0”) and/or quantitatively (presence calling with relative or absolute quantitation). The microscopic examination of the readout volume as described in this document can be regarded as an encoding/decoding problem, which is solved by labeling target molecules with affinity reagents that are (dynamically) linked to, and associated with, combinations of dyes, which encode those target molecules in the readout volume labeled in this way. Retrieving the identity of the target molecules, which have a one-to-one mapping (bijective association) with the affinity reagents from the plurality of affinity reagents, is thus a decoding problem. It is important to state that if the presence of a certain dye from the plurality of dyes has been accepted in a readout sequence based on a certain degree of statistical confidence, then that presence becomes a mathematical truth.
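The invertibility requirement can be illustrated with a minimal Python sketch in which a mapping from affinity reagents to combinations of dyes is checked for injectivity and then inverted for decoding. The reagent and dye labels and the function name `invert_mapping` are hypothetical, and the sketch covers only the simple case in which the called dyes match exactly one assigned combination.

```python
def invert_mapping(mapping):
    """Invert a mapping from affinity reagents to combinations of dyes.

    Raises ValueError if two reagents share a combination, because the mapping
    would then not be injective and could not be decoded unambiguously.
    """
    inverse = {}
    for reagent, combo in mapping.items():
        key = frozenset(combo)
        if key in inverse:
            raise ValueError(f"combination shared by {inverse[key]} and {reagent}")
        inverse[key] = reagent
    return inverse


# Hypothetical first mapping of affinity reagents (a_i) to combinations of dyes (s_i).
mapping = {
    "a1": {"y1", "y2"},
    "a2": {"y1", "y3"},
    "a3": {"y2", "y3", "y4"},
}
inverse = invert_mapping(mapping)

# Dyes called present in a readout volume (hypothetical result of presence calling).
called = frozenset({"y1", "y3"})
print(inverse[called])  # a2
```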
The decoding of combinations of dyes (s1 to sk) is possible, when a readout sequence is observed that does not allow all possible combinations of dyes (s1, s2, s3, . . . sn) in the plurality of combinations of dyes (S1) to be subsumed under it. In other words, if a readout sequence is observed under which all possible combinations of dyes (s1, s2, s3, . . . sn) in the plurality of combinations of dyes (S1) can be subsumed, than one does not gain knowledge about the contents of the readout volume. This would be the case if a readout sequence would indicate that all dyes (y1, y2, y3, . . . yσ) from the plurality of dyes (PD) are called “present” in the readout volume. This case is, however, unlikely even for cases in which large numbers of affinity reagents are being used, as the cardinality of the plurality of combinations of dyes (S1) grows exponentially, while the number of available affinity reagents is limited to the number of targets molecules of interest. For example the entire human genome contains about 20,000 coding genes, so even if, one would use 20,000 affinity reagents in the plurality of affinity reagents to label these target molecules with a unique combination of dyes from the plurality of combination of dyes (S1), it would be easy to define a plurality of dyes (PD), which is large enough to ensure that the number of elements in S1>>20,000, i.e. several orders of magnitude higher like for example 106 to 1010. In consequence, it is easily possible to define conditions in which the fraction a of actually assigned combinations of dyes to all available combinations of dyes from the plurality of combination of dyes (S1) becomes very small. In this case the likelihood to observe false-positive, i.e. combinations of dyes not assigned to a marker (type I false-positive) and/or combinations of dyes assigned to an affinity reagent not physically present in the readout volume (type II false-positive) subsumable under a first readout sequences becomes lower. 
If the conditions are such that a single iteration does not yield satisfactory levels of statistical confidence for presence calling for any of the following: combination of dyes; affinity reagents; and target molecules contained in the readout volume, it is possible to significantly improve the analysis in different ways which will be described herein.
In a preferred embodiment of the invention, a first readout sequence is acquired in a first step and the “first set of combinations of dyes” subsumable under this first readout sequence is stored in a memory device. In a next step, the encoding/encryption may be changed for at least some affinity reagents from the plurality of affinity reagents (A). This can be done by deactivating the dyes introduced in the first step by means of eluting the affinity reagents, bleaching the dyes or severing the linkage between the combinations of dyes and the affinity reagents. Depending on the choice of method, the target molecules are then re-labeled in a second step with at least some affinity reagents from the plurality of affinity reagents (A), wherein at least some affinity reagents are assigned to a different, second combination of dyes. In a next step the “second set of combinations of dyes” is derived from a second readout sequence, i.e. a second readout sequence is generated in the same manner as the first readout sequence, and dyes from the second set of combinations of dyes are identified in the second readout sequence. All second combinations of dyes subsumable under this second readout sequence are retrieved and stored in a memory device. In a further step the “first set of combinations of dyes” and the “second set of combinations of dyes” may be compared to define the overlap, and at least one statistical confidence measure is computed for each combination of dyes and/or affinity reagent and/or target molecule and/or analyte detected in the overlap.
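The two-round comparison described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names `subsumable` and `reagents_in_overlap`, the dye labels y1 to y4 and the reagent labels a1 to a3 are hypothetical, a readout sequence is modelled simply as the set of dyes called present, and a combination counts as subsumable when all of its dyes are in that set.

```python
def subsumable(mapping, readout):
    """Return the reagents whose dye combination is fully contained in the readout.

    mapping: dict of reagent label -> frozenset of dye labels (assumed injective)
    readout: set of dye labels called "present" in the readout volume
    """
    return {reagent for reagent, dyes in mapping.items() if dyes <= readout}


def reagents_in_overlap(mapping_1, readout_1, mapping_2, readout_2):
    """Reagents that remain candidates in both rounds (overlap of the two sets)."""
    return subsumable(mapping_1, readout_1) & subsumable(mapping_2, readout_2)


# Hypothetical example: a1 and a2 are physically present, a3 is not.
round_1 = {"a1": frozenset({"y1", "y2"}),
           "a2": frozenset({"y2", "y3"}),
           "a3": frozenset({"y1", "y3"})}
round_2 = {"a1": frozenset({"y1", "y4"}),
           "a2": frozenset({"y2", "y4"}),
           "a3": frozenset({"y3", "y4"})}
readout_1 = {"y1", "y2", "y3"}  # dyes contributed by a1 and a2 under the first mapping
readout_2 = {"y1", "y2", "y4"}  # dyes contributed by a1 and a2 under the second mapping

first_set = subsumable(round_1, readout_1)  # a3 is also subsumable here
overlap = reagents_in_overlap(round_1, readout_1, round_2, readout_2)
print(sorted(first_set), sorted(overlap))
```

In this toy case the first round alone cannot exclude a3, because its dyes are all contributed by a1 and a2; changing the mapping in the second round removes it from the overlap.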
A certain combination of dyes and/or a certain affinity reagent and/or a certain target molecule and/or analyte is then said or called to be present in the readout volume, when the at least one statistical confidence measure computed for this particular certain combination of dyes and/or certain affinity reagent and/or certain target molecule and/or analyte is acceptable based on criteria, which may be fixed and a priori defined or dynamically derived and adjusted during the experiment.
In principle this process may be repeated until an acceptable level of statistical confidence in the acceptance or rejection of the presence of a certain combination of dyes and/or affinity reagents and/or target molecules and/or analytes of interest has been reached. Importantly, while in strict mathematical terms each iteration in this iterative process analyses exactly the same readout volume, it is possible, and a preferred embodiment of the present invention, to allow small deviations in the spatial position (for example fractions 1/10000, 1/1000, 1/100, 1/10, ¼, ½ of the lateral extent of the effective PSF) and/or temporal position (for example fractions 1/10000, 1/1000, 1/100, 1/10, ¼, ½ of the time a sample takes to traverse the lateral extent of the effective PSF) of the readout volume between a first and a second readout sequence. In this case a first readout sequence generates a priori knowledge about the second readout sequence in the sense of Bayesian probabilities according to Bayes' theorem. This is not unlike pretest probability in diagnostic testing, in which a positive result for a symptomatic patient is typically substantially less likely to be a false positive than a positive result for an asymptomatic patient. In analogous fashion one can argue that if an affinity reagent was detected in a first readout volume, then this influences its probability of being detected in an overlapping second readout volume, wherein the overlap may be understood as spatial or temporal.
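The role of a priori knowledge from a first readout can be illustrated with a short Bayes' theorem calculation. This is only a sketch: the function name `posterior_presence` and all numerical values (prior, sensitivity, false-positive rate) are hypothetical and chosen for illustration, not taken from any measurement.

```python
def posterior_presence(prior, sensitivity, false_positive_rate):
    """Posterior probability that a reagent is present, given a positive call.

    prior:               probability of presence before the readout (a priori knowledge)
    sensitivity:         P(positive call | present)
    false_positive_rate: P(positive call | absent)
    """
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("a positive call has zero probability under these inputs")
    return sensitivity * prior / evidence


# Same detection characteristics, different prior knowledge (illustrative values).
low_prior = posterior_presence(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
high_prior = posterior_presence(prior=0.50, sensitivity=0.95, false_positive_rate=0.05)
print(round(low_prior, 3), round(high_prior, 3))
```

With identical detection characteristics, the positive call made with the higher prior (for example after a detection in an overlapping first readout volume) is far more likely to be correct.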
Preferably, a “code” in the sense of this document may be, for example, a linear code (e.g. a binary code), a fixed-length code, a variable-length code, or an error-correcting code. In a preferred embodiment, the codes are “independent and identically-distributed”. In a preferred embodiment of the present invention, a binary code is used.
“set of combinations of dyes subsumable under a readout sequence”: In the sense of this document, “set of combinations of dyes subsumable under a readout sequence” refers to the set containing all combinations of dyes from the plurality of combinations of dyes that can be subsumed under a certain readout sequence. The “set of combinations of dyes subsumable under a readout sequence” contains 1 elements.
“assignment rate”: is the proportion of unique codes (also referred to as combination of dyes) from the set of unique codes, which may also be referred to as the plurality of combinations of dyes (S1), that are actually assigned to a marker and is denoted as α.
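As a worked illustration of the assignment rate α, the following sketch computes α for 20,000 assigned combinations under one possible code space. The assumptions are ours and not part of the definition: a binary encoding in which every combination contains at least two of the available dyes, and the hypothetical helper names `binary_code_space` and `assignment_rate`.

```python
from math import comb


def binary_code_space(num_dyes, min_dyes=2):
    """Number of distinct dye combinations containing at least `min_dyes` dyes."""
    return sum(comb(num_dyes, r) for r in range(min_dyes, num_dyes + 1))


def assignment_rate(num_assigned, code_space):
    """Fraction of available combinations that are actually assigned to a marker."""
    if not 0 <= num_assigned <= code_space:
        raise ValueError("cannot assign more combinations than are available")
    return num_assigned / code_space


# Assignment rate for 20,000 affinity reagents and a growing plurality of dyes.
for num_dyes in (15, 20, 30):
    space = binary_code_space(num_dyes)
    print(num_dyes, space, assignment_rate(20_000, space))
```

Under these assumptions a plurality of 30 dyes already yields more than 10^9 combinations, so that α falls well below 10^-4 for 20,000 assigned combinations.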
Embodiments of the present invention provide a method and a device for analyzing a sample, preferably a biological sample that allows analyzing a very high number of markers in a very short time.
In accordance with embodiments of the invention, there is provided a method for analyzing a sample, the sample comprising: a plurality of affinity reagents, each affinity reagent being configured to attach to an analyte, at least one of the affinity reagents being attached to an analyte; and a first plurality of combinations of dyes, each combination of dyes being unique within the first plurality of combinations of dyes and each combination of dyes comprising at least two dyes having different characteristics for at least one of: excitation and emission, wherein each one of the unique combinations of dyes is attached to an associated affinity reagent according to a first mapping, the method comprising:
The method may be a computer-implemented method.
In this way, the method provides an improved method for detecting the presence of analytes in a sample. In particular, by directing excitation light having characteristics for exciting dyes having different excitation and/or emission characteristics, the readout generated can contain information allowing for the determination of a greater number of analytes per readout than is possible using known methods, as will be described in greater detail herein.
The method may be further defined in that each unique combination of dyes in the first plurality of combinations of dyes is attached to only one affinity reagent, such that no unique combination of dyes is associated with more than one affinity reagent in the first mapping. In this way, the detection of a unique combination of dyes within a readout can, with confidence, be used to determine that an analyte is present in the sample. It may also be said that the mapping of the plurality of combinations of dyes to affinity reagents is at least injective, preferably bijective.
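The injectivity requirement can be checked mechanically before an experiment. The sketch below is illustrative only; `is_injective`, `build_decoder` and the labels a1, a2, y1 to y3 are hypothetical names, and a mapping is represented as a dictionary from affinity reagent to a frozenset of dyes.

```python
def is_injective(mapping):
    """True if no dye combination is assigned to more than one affinity reagent."""
    combos = list(mapping.values())
    return len(combos) == len(set(combos))


def build_decoder(mapping):
    """Invert an injective reagent -> combination mapping for decoding."""
    if not is_injective(mapping):
        raise ValueError("mapping is not injective and cannot be decoded uniquely")
    return {combo: reagent for reagent, combo in mapping.items()}


good = {"a1": frozenset({"y1", "y2"}), "a2": frozenset({"y1", "y3"})}
bad = {"a1": frozenset({"y1", "y2"}), "a2": frozenset({"y2", "y1"})}  # same combination twice

decoder = build_decoder(good)
print(decoder[frozenset({"y1", "y3"})], is_injective(bad))
```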
Generating at least one first readout may further comprise: separating the emission light emitted by the excited dyes into detection channels, wherein each detection channel substantially corresponds to a dye from the plurality of dyes or wherein the detection channel corresponds to a dye from the plurality of dyes. Each combination of dyes may be selected to comprise one dye per detection channel.
In this way, a greater number of dyes may be read out on a readout device, as will be described in greater detail herein.
The at least two dyes may have different excitation characteristics, and excitation light having each excitation characteristic may be directed to the sample at different times. In this way, the information available to the readout device can be increased, and many more combinations of dyes can be unique. For example, dyes having different excitation characteristics and the same emission characteristics may be harder to distinguish from one another if light having both of their excitation characteristics is directed towards the sample simultaneously. By separating the excitation lights, and noting when light having the shared emission characteristic was emitted, the dyes can be more easily distinguished. This leads to a greater number of feasible combinations of dyes.
The at least two dyes may have different excitation characteristics, and excitation light having each excitation characteristic may be directed to the sample simultaneously. In this way, the method may be more efficient both computationally and in terms of time to run. Exciting all dyes at the same time may be permissible for smaller numbers of analytes in the sample, i.e., where fewer unique combinations of dyes are required.
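The benefit of temporally separated excitation can be illustrated with a deliberately simplified toy model. The assumptions are ours: each dye is described by a single excitation wavelength and a single emission wavelength in nanometres, each dye responds only to its own excitation light, and the wavelengths, dye labels and function name `ambiguous_dyes` are hypothetical.

```python
from collections import Counter


def ambiguous_dyes(dyes, sequential):
    """Return the dyes whose observable signature is shared with another dye.

    dyes:       dict of dye label -> (excitation_nm, emission_nm)
    sequential: True if excitation lights are applied at different times, so that
                the excitation wavelength is known for every detected emission
    """
    def signature(props):
        excitation, emission = props
        return (excitation, emission) if sequential else emission

    counts = Counter(signature(props) for props in dyes.values())
    return {name for name, props in dyes.items() if counts[signature(props)] > 1}


# y1 and y2 share an emission band but differ in excitation (illustrative values).
dyes = {"y1": (405, 600), "y2": (561, 600), "y3": (488, 520)}
print(sorted(ambiguous_dyes(dyes, sequential=False)))
print(sorted(ambiguous_dyes(dyes, sequential=True)))
```

In this model y1 and y2 cannot be told apart under simultaneous excitation, whereas all three dyes are distinguishable once the excitation lights are applied one after another.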
The method may further comprise: providing the plurality of affinity reagents; and providing the first plurality of combinations of dyes.
The method may further comprise:
In this way, it can be controlled how the attachments between affinity reagent and analyte, and between combination of dyes or reporter and affinity reagent, are achieved. Depending on the affinity reagent, analyte, and dyes, among other variables, it may be advantageous to form the markers before attachment to the analytes. In other cases, it may be advantageous to attach the affinity reagents to the analytes, and add the dyes later, forming the markers “in situ” within the sample.
Attaching the plurality of affinity reagents to the first plurality of combinations of dyes may comprise:
The method may further comprise at least one of:
In this way, more information can be generated that pertains to the same sample. A single combination of dyes and/or single mapping may not produce a readout from which any or all analytes can be determined. It is therefore desirable to obtain more information relating to the original sample. The method may therefore deactivate, or allow to deactivate, at least one of the dyes or attachments, such that the (second) readout generated when steps i) to iii) are run again will be different to the original (first) readout. A second readout having new information relative to the first can then be used to determine further analytes.
The method may further comprise suggesting, by a computer processor, at least one dye and/or combination of dyes for the second plurality of combinations of dyes and/or rules for the second mapping based on the at least one first readout. In this way, the number of iterations can advantageously be reduced, thereby optimizing the process.
The method may further comprise iteratively repeating the steps of at least one of deactivating, removing attachment, waiting, and suggesting for at least one of: a number of pluralities of combinations of dyes; and a number of mappings, until all affinity reagents attached to analytes in the sample are determined. In this way, the method determines all analytes present in the sample.
Determining, by at least one computer processor, at least one affinity reagent present in the sample, may comprise:
In this way, the method can analyse differences between the readouts, and the known differences between the mappings and/or combinations of dyes, to aid the determination of analytes in the sample.
The characteristics of a dye for at least one of excitation and emission may comprise at least one of: excited wavelength; emitted wavelength; fluorescence intensity; and fluorescence lifetime. In this way, the readout can effectively be retrieved and analysed, and the dyes distinguished from one another.
Determining, by at least one computer processor, at least one affinity reagent present in the sample based on a readout may comprise:
In this way, the method provides a computationally efficient way to “decode” the information in each readout, such that the presence of analytes in the sample can be determined.
In accordance with embodiments of the invention, there is provided a device for analyzing a sample, the device being configured to perform the method according to embodiments of the invention.
In accordance with embodiments of the invention, there is provided a linker, configured to couple to an affinity reagent, the linker comprising:
In accordance with embodiments of the invention, there is provided a reporter, comprising:
In accordance with embodiments of the invention, there is provided a marker, comprising:
In accordance with embodiments of the invention, there is provided a plurality of markers according to embodiments of the invention, wherein each reporter comprises a unique combination of dyes, and wherein each reporter is attached to an affinity reagent configured for attachment to an analyte, such that no unique combination of dyes is associated with more than one affinity reagent.
In these ways, embodiments of the invention provide the building blocks to put methods in accordance with embodiments of the invention into effect. The linker, reporter, marker, and plurality of markers described above allow the advantageous effects, including determination of a greater number of analytes per readout, as described in relation to the method.
In accordance with embodiments of the invention, there is provided a solution comprising at least one of a combination of dyes according to embodiments of the invention, a linker according to embodiments of the invention, a reporter according to embodiments of the invention, a marker according to embodiments of the invention, and a plurality of markers according to embodiments of the invention. A solution may be manufactured to comprise linkers, reporters, markers, or a plurality of markers, by any suitable methods apparent to a skilled person. By way of example only, the solution may comprise water and/or saline, for example a phosphate-buffered saline, and may comprise further minerals in alternative formulations.
In accordance with embodiments of the invention, there is provided a lyophilized solid comprising at least one of a combination of dyes according to the embodiments of the invention, a linker according to embodiments of the invention, a reporter according to embodiments of the invention, a marker according to embodiments of the invention, and a plurality of markers according to embodiments of the invention. A lyophilized solid may be manufactured to comprise linkers, reporters, markers, or a plurality of markers, by any suitable methods apparent to a skilled person. By way of example only, a lyophilized solid may be manufactured by processes of freezing and drying, optionally under vacuum.
In accordance with embodiments of the invention, there is provided a computer program with a program code for performing the method according to embodiments of the invention when the computer program is run on a processor.
In accordance with embodiments of the invention, there is provided a computer readable storage medium storing the computer program according to embodiments of the invention.
In accordance with embodiments of the invention, there is provided a database comprising information corresponding to:
In this way, embodiments of the invention may keep track of the relevant information in an efficient way, which allows the rapid retrieval and storage requiring minimal amounts of memory on a memory device.
In further exemplary embodiments, a method for analyzing a biological sample comprises the following steps:
In a preferred embodiment of the invention the determination of the presence of affinity reagents in the readout volume is established based on a measure of statistical confidence and a certain level of statistical confidence.
A measure of statistical confidence is computed for each marker and/or affinity reagent and/or combination of dyes and/or predetermined target molecule. This may be a combined measure consisting of multiple measures of statistical confidence assessing related aspects. The measure of statistical confidence may incorporate a priori knowledge and use, for example, Bayes' theorem to adjust the probability of observing a given marker and/or affinity reagent and/or combination of dyes and/or predetermined target molecule based on a priori knowledge about that marker and/or affinity reagent and/or combination of dyes and/or predetermined target molecule. This a priori knowledge may be generated before or during the experiment. This means that, for example, a p value is computed for each marker and/or affinity reagent and/or combination of dyes and/or predetermined target molecule, which assesses the probability that the detected presence (qualitative decoding) and/or quantity (relative or absolute quantitative decoding) is observed when the null hypothesis is true, i.e. when the respective marker and/or affinity reagent and/or combination of dyes and/or predetermined target molecule is actually not present in the readout volume. The presence calling, i.e. the user's decision to accept the presence of a given marker and/or affinity reagent and/or combination of dyes and/or predetermined target molecule in the readout volume, is then based on attaining a sufficient level of statistical confidence. The decision may be automated by using thresholds, which may be fixed and the same across all markers, affinity reagents, combinations of dyes, and target molecules, or which may differ between them, for example based on a priori knowledge. Furthermore, the thresholds may be adjusted dynamically throughout the experiment; for example, they may be made more or less stringent.
This is advantageous as it allows a higher statistical confidence to be demanded for target molecules which are of particular interest.
Embodiments of the present invention are based on “looking at” microscopy as an encoding/decoding problem rather than a problem of registering spatially located intensities in an image, which is essentially a matrix of intensity values. While the method described herein is compatible with image-based readouts, the “images” generated by the method and device described herein should be regarded as probabilistic mathematical models of the reality of the sample under investigation, in which the presence of a target molecule is detected or called (presence calling) based upon the decision of the user to accept its presence based on a measure of statistical confidence and a certain level of statistical confidence in the presence of the respective target molecule in the readout volume.
Making the step of accepting the presence of a certain marker, a certain affinity reagent, and a certain target molecule based on a measure of statistical confidence and a certain level of statistical confidence (i.e. presence calling) generates a mathematical truth. This is important as it also implies that, following presence calling, one is operating in the axiomatic domain of mathematics, which is inherently free of influences that complicate measurements in the non-mathematical domain, i.e. the physical, chemical, biological domain. This fact therefore has important implications and also indicates that this method enables a completely new microscopic modality.
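Automated presence calling with a common default threshold and more stringent thresholds for selected targets might look as follows. This is a hedged sketch: the function name `call_presence`, the target labels t1 to t3, the p values and the threshold values are all hypothetical, and how the p values are obtained is outside the scope of the example.

```python
def call_presence(p_values, default_threshold=0.05, per_target_thresholds=None):
    """Call a target present when its p value does not exceed its threshold.

    p_values:              dict of target label -> p value under the null hypothesis
                           "target is not present in the readout volume"
    default_threshold:     threshold applied to targets without an individual one
    per_target_thresholds: optional dict of target label -> stricter or looser threshold
    """
    thresholds = per_target_thresholds or {}
    return {target: p <= thresholds.get(target, default_threshold)
            for target, p in p_values.items()}


p_values = {"t1": 0.001, "t2": 0.03, "t3": 0.2}
# t2 is assumed to be of particular interest, so a higher confidence is demanded for it.
calls = call_presence(p_values, default_threshold=0.05,
                      per_target_thresholds={"t2": 0.01})
print(calls)
```

Adjusting thresholds dynamically during an experiment would amount to calling the function again with updated threshold values.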
The statistical methods to provide a measure of statistical confidence on a per marker-basis or per target-molecule basis, may well be a combined measure and may in many ways be similar or identical to methods, which are used in transcriptomics and genomics, where enrichment scores and p values are commonly used.
Embodiments of the present invention relate to the patent application with the title “Method and device for analyzing a biological sample” with the application number PCT/EP2021/063310, which leverages the capability of the “IHP method” to image or read out a plurality of dyes with a high number of dyes in a single round. In contrast to the method disclosed in PCT/EP2021/063310, wherein there is a 1:1 relationship between a given dye and a given affinity reagent, such that each marker is unique in a round, here we disclose a method in which this principle is combined with a combinatorial code such that there is a 1:many relationship between marker and dyes. The content of PCT/EP2021/063310 is completely included herein by reference.
In a preferred embodiment of the method, a given affinity reagent, like for example an antibody, a single domain antibody, an oligonucleotide probe, an aptamer or a toxin, is assignable to a reporter comprising a unique combination of dyes and a linker forming a virtual marker, i.e. wherein the bijective pair (604) or injective association (606a, 606b) between an affinity reagent (ai) and a combination of dyes corresponds to a marker (μi) within a plurality of markers (M).
Further in this embodiment, dyes in the plurality of dyes can be assigned to sets of dyes A to n (n being 0 or an element of the natural numbers), wherein each set of dyes A to n contains yA to yn dyes, such that the plurality of dyes (PD) contains yA+yB+yC+ . . . +yn=yσ members; wherein dyes assigned to one of the sets of dyes A to n are excitable with the same excitation light, e.g. with excitation light having respective different wavelengths λ1 to λn; wherein at least all dyes in each set of dyes A to n can be separated by a readout device into channels, wherein each dye is read out in an individual channel and each channel corresponds to one of the dyes. Using the dyes and the readout device in this way, which is based on the “IHP method”, maximizes the number of dyes that can be read out and separated reliably, i.e. increases yσ and thereby the cardinality of the plurality of combinations of dyes (S1). The unique combination of dyes (si) is in this case assigned to the affinity reagent (ai) either prior to or following the introduction of the affinity reagent into the sample, but prior to the generation of the readout. In a next step at least some affinity reagents from the plurality of affinity reagents (S2) are introduced into the sample. Then excitation lights are directed to the sample in order to excite the fluorescent dyes of the markers (μ1, μ2, μ3, . . . μn); this means that all dyes in the plurality of combinations of dyes (S1) are excited at least once. Following excitation of the dyes, at least one readout, preferably a complete readout, is generated from fluorescence light emitted by the excited dyes located in a readout volume of the sample, the readout comprising at least two channels, each channel corresponding to one of the dyes. In other words, in the method described in this document each dye is read out in an individual channel.
In a following step the markers present in the readout volume are determined based on the at least one readout sequence obtained in step (d) which may be made dependent on a measure of statistical confidence and attaining a certain level of statistical confidence (“presence calling”).
In a preferred embodiment of the method, the plurality of combinations of dyes (S1) is mapped uniquely to the plurality of affinity reagents (A=S2=T1*) using at least one code Cα1 to Cαn and/or at least one cipher Xα1 to Xαn, wherein C: S->T* or X: S->T* are total functions which are preferably bijective or at least injective, wherein S is the “source alphabet” and T is the “target alphabet”, and wherein S and T are finite sets [case α].
In a preferred embodiment of the method, the plurality of affinity reagents (A=S2=T1*) is mapped uniquely to the plurality of combinations of dyes S1 using at least one code Cβ1 to Cβn and/or at least one cipher Xβ1 to Xβn, wherein C: S->T* or X: S->T* are total functions which are preferably bijective or at least injective, wherein S2 is the “source alphabet” and T1 is the “target alphabet”, and wherein S2 and T1 are finite sets [case β].
In both cases α and β the method disclosed in this document can be performed as long as the codes C1, C2, C3, . . . Cn, and/or ciphers X1, X2, X3, . . . Xn used for encoding/decoding and/or encrypting/decrypting are total functions that are either injective or bijective. In the following, preferred embodiments are proposed as examples which use a certain code Ci, but this is not intended to express a restriction of the method to any particular code in any way, as all the codes C1, C2, C3, . . . Cn, and/or ciphers X1, X2, X3, . . . Xn which are total functions and are either injective or bijective may be used for encoding/decoding and/or encrypting/decrypting to perform the method disclosed in this document.
There may be a 1:n relationship between a given affinity reagent and the dyes assigned to or attached to it, as each combination of dyes contains at least two dyes. Importantly, n in this case defines the number of different dye species that are being used, not the number of dye molecules.
In a particularly preferred embodiment of the present invention, the combination of dyes is established by randomly selecting one dye from each of the sets of dyes A to n, with yA+yB+yC+ . . . +yn=yσ members in total. If, for example, the method disclosed in “IHP method” is used and each set A to n comprises yA to yn markers, and n different excitation lights are used for generating n different readouts, the number of unique markers that can be read out and discerned from each other is yA+yB+yC+ . . . +yn. If, by contrast, the method disclosed in this document is used and each marker is labeled with n dyes, and n different excitation lights are used for generating n different images, the number of unique markers that can be read out and discerned from each other is yA×yB×yC× . . . ×yn. In other words, this leads to yA×yB×yC× . . . ×yn unique codes. We refer to this preferred embodiment as “set-based encoding”.
In a preferred embodiment of the present invention the plurality of dyes (PD) formed by all fluorescent dyes of the plurality of combinations of dyes (S1) comprises at least 10, 20, 50, 100, 1000, or 10000 different fluorescent dyes.
In a further preferred embodiment of the method, a given affinity reagent, like for example an antibody, a single domain antibody, an oligonucleotide probe, an aptamer or a toxin, is connected to a set of up to n×y labels, such that there is a variable and random relationship between a given affinity reagent and the dyes attached to it. This leads to a binary encoding, in which the absence of a particular dye is counted as “0” and the presence as “1” in the code with up to n×y digits, generating a set of unique codes with 2(y
Thus, the method described in this document vastly increases the number of markers and/or combinations of dyes and/or affinity reagents and/or predetermined target structures that can be read out without having to remove or deactivate the previous markers, and without additional staining.
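The difference between the additive count and the multiplicative count of set-based encoding can be made concrete with a small sketch. The set sizes (4, 5 and 3 dyes), the dye labels and the function names are hypothetical and serve only as an illustration.

```python
from itertools import product
from math import prod


def count_one_dye_per_marker(set_sizes):
    """Markers distinguishable when each marker carries a single dye: yA + yB + ..."""
    return sum(set_sizes)


def count_set_based_codes(set_sizes):
    """Codes available when each marker carries one dye from every set: yA x yB x ..."""
    return prod(set_sizes)


def enumerate_set_based_codes(dye_sets):
    """List every combination that takes exactly one dye from each set."""
    return list(product(*dye_sets))


dye_sets = [["A1", "A2", "A3", "A4"],
            ["B1", "B2", "B3", "B4", "B5"],
            ["C1", "C2", "C3"]]
sizes = [len(dye_set) for dye_set in dye_sets]
codes = enumerate_set_based_codes(dye_sets)
print(count_one_dye_per_marker(sizes), count_set_based_codes(sizes), codes[0])
```

With these illustrative set sizes, 12 dyes yield 12 markers when used one dye per marker, but 60 unique codes under set-based encoding.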
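The binary encoding described above, in which each dye contributes one digit to the code, can be sketched as follows. The function names `encode` and `decode`, the dye labels and the fixed dye order are hypothetical. The only general fact relied upon is that a code word of k binary digits can take 2**k values, including the all-zero word in which no dye is present.

```python
def encode(dyes_present, dye_order):
    """Binary code word: "1" where a dye is present, "0" where it is absent."""
    unknown = set(dyes_present) - set(dye_order)
    if unknown:
        raise ValueError(f"dyes not in the plurality of dyes: {sorted(unknown)}")
    return "".join("1" if dye in dyes_present else "0" for dye in dye_order)


def decode(code_word, dye_order):
    """Recover the set of dyes called present from a binary code word."""
    if len(code_word) != len(dye_order):
        raise ValueError("code word length must equal the number of dyes")
    return {dye for bit, dye in zip(code_word, dye_order) if bit == "1"}


dye_order = ["y1", "y2", "y3", "y4"]
word = encode({"y1", "y3"}, dye_order)
print(word, sorted(decode(word, dye_order)), 2 ** len(dye_order))
```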
Importantly, there are many other codes or ciphers, i.e. ways of encoding/encrypting and decoding/decrypting that can be used in conjunction with the method disclosed in this document.
Each affinity reagent targets its combination of dyes to its predetermined structure which may also be named a target molecule or an analyte within the biological sample or lysate, e.g. a specific biomolecule.
Owing to combinatorial encoding for both methods the number of combinations of dyes which can be regarded as unique codes grows exponentially and quickly exceeds the number of protein-coding genes, which is in the range of 20,000. This is an important reference value as proteins carry out the majority of biological functions and are therefore of great interest. For this reason, the bulk of fluorescence microscopy performed today analyzes protein targets.
In both embodiments, the method remains compatible with various means of amplification including multiple binding sites or amplification strategies based on enzymatic reactions such as for example rolling circle DNA amplification.
In both embodiments, the method remains compatible with various analyte classes as the affinity reagent can be for example an antibody (protein target) or an oligonucleotide (RNA/DNA target).
In a preferred embodiment of the present invention each marker (μi) comprises a linker having at least two different attachment sites, the combination of attachment sites being unique to the marker; and wherein each dye is connected to a complementary linker to form a reporter, the complementary linker being unique to the dye and configured to attach to a predetermined attachment site.
In a preferred embodiment of the present invention the linker and/or the complementary linkers are oligonucleotides comprising DNA, RNA, peptide nucleic acid, morpholino or locked nucleic acid, glycol nucleic acid, threose nucleic acid, hexitol nucleic acid, or another form of artificial nucleic acid.
In a preferred embodiment of the present invention, the linker and/or complementary linkers contain a site for enzymatic cleavage or photolysis. This allows the efficient and easy releasing of a first combination of dyes in order to deactivate the first combination of dyes, which may be followed by relabeling with a second combination of dyes.
In a preferred embodiment of the present invention, the reporters are attached to their respective attachment sites before the markers are introduced into the sample.
In a preferred embodiment of the present invention, at least two readouts are generated, and the reporters are dynamically associated with and/or dissociated from their respective attachment sites between the generation of the first and second readouts in order to achieve a stochastic labeling. This is a strategy to increase spatial resolution and to render the decoding of multi-species readout volumes simpler. Such stochastic labeling may be based on super resolution microscopy such as STORM, PALM, GSDIM or a related method which leverages blinking.
In a preferred embodiment of this invention, stochastic labeling is achieved by combining the method with DNA-PAINT.
In a preferred embodiment of this invention, the plurality of dyes formed by all fluorescent dyes of the markers is divided into sets of dyes A to n, with yA to yn members, with yA+yB+yC+ . . . +yn=yσ, with y being a natural number and yσ being the total number of dyes in the plurality of dyes (PD); wherein each dye in the same set can be excited by light of essentially one wavelength spectrum or by the same wavelength spectrum; wherein at least one excitation light for each set of dyes is directed at the sample in order to excite the fluorescent dyes of the respective set; wherein at least one readout for each set of dyes is generated from fluorescence light emitted by the excited dyes located in the readout volume of the sample, the readout comprising at least two channels, each channel corresponding to one of the dyes of the respective set. This embodiment uses the “IHP method”, which allows a higher number of dyes to be read out on a readout device. This is advantageous as a higher yσ leads to a higher cardinality of the plurality of combinations of dyes (S1) and consequently to a lower assignment rate α and higher statistical power of the method.
In a preferred embodiment of the present invention, excitation lights for exciting the sets of dyes A to n are directed onto the sample in a sequence temporally following each other.
In a preferred embodiment of the present invention, the readout is an image, or a microscopic image, or a readout image data stream of the readout volume.
In a preferred embodiment of the present invention, the readout is or contains a hyperspectral image of the sample. This is advantageous as it allows a high total number of dyes yσ to be used and leads to a higher cardinality of the plurality of combinations of dyes (S1) and consequently lower assignment rate α and higher statistical power of the method.
In a preferred embodiment of the present invention, the method comprises the further step of stabilizing the fluorescence lifetime of at least one fluorescent dye. This can be achieved by placing the fluorescent dye in a shielded environment by at least one of encapsulating, polymer-matrix embedding, and co-crystallizing. SMILEs are an advantageous class of dyes in this regard. In a preferred embodiment of the present invention, at least one dye in the plurality of dyes (PD) is a SMILEs.
In a preferred embodiment of the present invention, the step of generating the channels is based on at least one of channel unmixing, spectral unmixing, excitation spectral imaging, spectral phasor analysis, spectral FLIM phasor, a fluorescence lifetime of the fluorescent dyes and an excitation fingerprint of the fluorescent dyes. This is advantageous as it allows a high total number of dyes yσ to be used and leads to a higher cardinality of the plurality of combinations of dyes (S1) and consequently lower assignment rate α and higher statistical power of the method.
In a preferred embodiment of the present invention, the step of generating the channels is based on at least two orthogonal contrasts. When orthogonal contrasts are obtained from these methods and used in conjunction, they can strongly increase the total number of dyes yσ, i.e. separate a much higher number of dyes. For example, excitation fingerprint information may be combined with fluorescence emission spectral information, and fluorescence lifetime information may be combined with either or both of the aforementioned.
In a preferred embodiment of the present invention, the step of generating the channels is based on at least one of machine learning, deep learning, or artificial intelligence.
In a preferred embodiment of the present invention, the following steps are repeated at least twice in order to create a series of images or readouts of the sample: providing a second plurality of markers, introducing the second plurality of markers into the sample, directing the at least one excitation light onto the sample, generating the at least one readout, and determining the markers present in the readout volume; or wherein the steps a) to e) of the methods described above are repeated at least twice.
In a preferred embodiment of the present invention, the reporters labeling the second plurality of markers comprise combinations of dyes that were determined based on the first series of images or readouts of the sample.
In a preferred embodiment of the present invention, the reporters are assembled by adding a mix of dyes wherein each dye is connected to a complementary linker to form reporters with linker molecules containing dye-specific attachment sites for all dyes in the plurality of dyes, such that adding a mix of dyes corresponding to a unique combination of dyes to a linker molecule in a coupling reaction volume leads to a stoichiometric coupling.
In a preferred embodiment of the present invention, the reporters are assembled by adding a mix of dyes wherein each dye is connected to a complementary linker to form reporters with linker molecules containing dye-unspecific attachment sites for all dyes in the plurality of dyes, such that adding a mix of dyes corresponding to a unique combination of dyes to a linker molecule in a coupling reaction volume leads to a stochastic coupling.
In a preferred embodiment of the present invention, the excitation light is coherent light.
In a preferred embodiment of the present invention, the excitation light comprises a wavelength range being smaller than 50 nm, smaller than 30 nm, smaller than 10 nm or a single wavelength.
In a preferred embodiment of the present invention, a device for analyzing a biological sample is adapted to carry out the method according to one of the methods described above.
In a preferred embodiment of the present invention, the device comprises a microscope, preferably a lens-free microscope, a light field microscope, a widefield microscope, a fluorescence widefield microscope, a light sheet microscope, a scanning microscope, or a confocal scanning microscope; a plate reader; a cytometer; an imaging cytometer; or a fluorescence activated cell sorter, configured to generate the at least one readout.
In a preferred embodiment of the present invention, the device is configured to determine a fluorescence emission intensity, a fluorescence lifetime, an emission spectrum, an excitation fingerprint, and/or a fluorescence anisotropy of fluorescent dyes in the sample.
In a preferred embodiment of the present invention the device is configured to perform separation of the readout into the at least two channels by at least one of a spectrometer comprising a prism or a grating and at least one detector.
In a preferred embodiment of the present invention, the device is configured to perform separation of the readout into the at least two channels by at least one of a spectrometer comprising a prism or a grating and at least one detector, and the device comprises a time-sensitive detector.
In a preferred embodiment of the present invention, the device may comprise a memory device for storing a unique identifier that identifies the affinity reagent, the predetermined structure, and the unique combination of dyes for each marker.
In a preferred embodiment of the present invention, the device may comprise a calibration unit configured to receive the fluorescence light emitted by the excited dye, and to generate calibration data based on the received fluorescence light; wherein the at least one readout is generated based on the calibration data.
In a preferred embodiment of the present invention, no combination of dyes is assigned to more than one affinity reagent.
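By way of a non-limiting illustration, the memory device entries and the condition that no combination of dyes is assigned to more than one affinity reagent may be sketched in Python as follows. The class names MarkerRecord and MarkerRegistry, the field names and the example entries are hypothetical and chosen only for this sketch; they are not part of the disclosed device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerRecord:
    """One stored entry; all field names are illustrative assumptions."""
    marker_id: str              # unique identifier of the marker
    affinity_reagent: str       # affinity reagent of the marker
    target_structure: str       # predetermined structure the reagent binds
    dye_combination: frozenset  # unique combination of dyes (the code)


class MarkerRegistry:
    """In-memory stand-in for the memory device (assumption)."""

    def __init__(self):
        self._by_code = {}

    def register(self, record):
        # Reject a code that is already assigned to a different affinity reagent.
        existing = self._by_code.get(record.dye_combination)
        if existing is not None and existing.affinity_reagent != record.affinity_reagent:
            raise ValueError(
                "combination of dyes already assigned to " + existing.affinity_reagent
            )
        self._by_code[record.dye_combination] = record

    def lookup(self, dye_combination):
        # Return the record stored for a combination of dyes, or None.
        return self._by_code.get(frozenset(dye_combination))


registry = MarkerRegistry()
registry.register(
    MarkerRecord("M1", "reagent_1", "structure_1", frozenset({"DyeA.1", "DyeB.2"}))
)
print(registry.lookup({"DyeB.2", "DyeA.1"}).marker_id)
```

Keying the store by the combination of dyes makes the uniqueness check a single dictionary lookup, and the same key can be reused when decoding a readout.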
The following examples illustrate the power of the approach and likewise underscore its feasibility. In the following example n=5 different excitation light wavelengths, e.g. 405 nm, 488 nm, 560 nm, 630 nm, 700 nm, are used to excite n=5 sets of fluorescent dyes with yA=yB=yC=yD=yE=5 members, such that the total number of different dyes yσ used in this example is 25. In this case, owing to the exponential nature of combinatorial coding, the total number of unique codes is 5^5=3125. In other words, the method presented here can be used to read out the entire human secretome based on a simple 5 channel fluorescence-based readout, for example a microscope or a cytometer, using commercially available dyes. In relation to the prior art, which enables multiplexing in the range of 30-60 biomarkers using an iterative process (i.e. ˜markers per round), this is a significant 50-100× improvement. Using the same set of dyes and an iterative process in which the sample is stained, imaged, and then blanked (i.e. dyes are removed or inactivated), it would be possible to probe about 30,000 targets in 10 rounds, which is roughly equivalent to the number of coding genes in the human genome. Today about 20,380 human protein-coding genes have been registered in UNIPROT.
In a further example n=6 different excitation lights, e.g. 350 nm, 405 nm, 488 nm, 560 nm, 630 nm, 700 nm, are used to excite n=6 sets of fluorescent dyes with yA=yB=yC=yD=yE=yF=10 members, such that the total number of different dyes yσ used in this example is 60. In this case, owing to the exponential nature of combinatorial coding, the total number of unique codes is 10^6=1,000,000. In other words, the method presented here can be used to read out the entire human proteome, estimated to have on the order of 80,000-400,000 distinct proteins, in a single round.
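The arithmetic of the two examples above may be reproduced with the following short Python sketch. It assumes, consistent with the stated totals of 3125 and 1,000,000 codes, that each unique code contains exactly one dye from each set, so that the number of codes is the product of the set sizes; the function names are hypothetical.

```python
from math import ceil, prod


def total_dyes(set_sizes):
    # y_sigma: total number of dyes across all sets.
    return sum(set_sizes)


def unique_codes(set_sizes):
    # Assumption: one dye per set in every code, so the counts multiply.
    return prod(set_sizes)


def rounds_needed(n_targets, codes_per_round):
    # Number of stain/image/blank rounds needed to cover n_targets.
    return ceil(n_targets / codes_per_round)


first_example = [5] * 5     # five sets of five dyes
second_example = [10] * 6   # six sets of ten dyes

print(total_dyes(first_example), unique_codes(first_example))    # 25 3125
print(total_dyes(second_example), unique_codes(second_example))  # 60 1000000
print(rounds_needed(30_000, unique_codes(first_example)))        # 10
```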
Further, the method is easily adapted to a cytometer, a plate reader, or a fluorescence microscope, allowing a very high number of markers to be read out. In other words, the method is compatible with image-based and non-image-based readouts.
Further, the method is easily adapted to a fluorescence microscope, allowing a very high number of markers to be read out with very high spatial resolution.
In a preferred embodiment the method is based on detecting individual disparate spots, i.e. spots that can be resolved from each other by the readout device. A very high number of spots can be read out at the same time using area detectors that image a field of view, for example a camera. Disparity of spots may result from different locations in the X, Y direction in a single field of view in an image of a microscope, or from a different point in time T in passing through a flow cell, as for example in a cytometer or imaging cytometer, or in a laser scanning microscope. In a system which uses image-based readouts, such as microscopes and plate readers, disparity in densely labeled structures, for instance when trying to image a very high number of markers in a cell, may be achieved by stochastic dye blinking, i.e. by temporal separation. This is a commonly used strategy in stochastic optical reconstruction microscopy (STORM) and related modalities, which rely on, for example, Gaussian fitting to find the location of disparate emitters in densely labeled samples.
The disparity of spots may be an immediate consequence of the assay format, for example in a bead-based assay in flow-through, wherein a plurality of beads passes the flow cell (to which a readout device is adapted) in sequence. A spot may result from a structure which is bigger or smaller than the readout volume. In a biological sample like a cell, intracellular targets may be at different X, Y, Z locations. In some cases, when label density is too high, an iterative approach may be employed to reduce the label density to an acceptable level. Other strategies to achieve disparity of spots may involve stochastic labeling, for instance by reducing the concentration of labeling reagent. Further, disparity may be achieved by expanding the sample using protocols known as expansion microscopy, which are described in Wassie et al., Nature Methods, Volume 16, pages 33-41 (2019). Further, a suitable spacing between spots may be achieved by increasing the number of sets of dyes n. To this end, tunable light sources or continuous light sources can be combined with the method. Further strategies for densely labeled samples are discussed in the following.
Combining the Disclosed Method with Iterative Staining Processes
In another preferred embodiment the following steps are repeated at least twice in order to create a series of readouts/images of the sample: Staining the sample. Directing the first excitation onto the sample. Generating the first readouts/images. Directing the second excitation onto the sample. Generating the second readouts/images. The steps defined in claim 1 describe a single round of readouts/images acquisition. Additional rounds may be performed in order to acquire a series of readouts/images of the sample. In particular, the series of subsequent readouts/images may be used in order to observe changes in the sample that occur over time. In particular, the series of subsequent readouts/images may be used in order to further increase the number of markers that can be readout or to make sure that the number of markers readout in a single round is not too high in densely labeled samples, i.e. in this case it may be useful to reduce the number of markers to, for example, 1000 markers per round, which is still significantly higher than the methods described in the prior art for fluorescence-based imaging-compatible readout.
Mono-Species Readout Volumes Vs. Multi-Species Readout Volumes
When using the method a spot in the sample, which is defined by the size of the main maximum of the effective point spread function and in the case of confocal microscopy also referred to as the confocal volume, may contain only one marker or a plurality of markers of a single specificity (mono-species readout volume) or may contain markers of multiple specificities (multi-species readout volume). The method presented in this document allows the robust decoding of mono-species readout volumes and provides very high numbers of unique codes. For multi-species readout volumes, however, it cannot be guaranteed that the markers of multiple specificities located in a spot can be decoded, thereby finding only a single possible combination of unique codes, i.e. the decoding of a multi-species readout volume may lead to multiple possible combinations of markers. In the case that a multi-species readout volume is encountered, however, the method can recognize this event reliably and inform the user that a multi-species readout volume was encountered for which an unambiguous solution was not found. The method may further find a limited number of possible alternative solutions and may based on these solutions suggest a labelling strategy for at least one further round of staining and imaging with a subset of markers labeled with a new set of fluorescent combinatorial codes. Alternatively or in addition, the user may resort to the approaches described in section “Strategies for densely labeled samples”.
The likelihood to encounter multi-species readout volumes depends on both the number of markers, which are to be read out in a single round, and also on their subcellular location. A cell has multiple meta-compartments such as the nucleus, the cytoplasm, and the secretory pathway as well as a range of compartments including for example the nuclear membrane, nucleoli (˜7%, ˜1300 proteins), the nucleoplasm, actin filaments, intermediate filaments, centrosomes, microtubules, the cytosol, mitochondria, the endoplasmic reticulum, the Golgi apparatus, the plasma membrane, secreted proteins, vesicles, which are further divided into sub-compartments. For example, endosomes, lipid droplets, lysosomes, peroxisomes and vesicles are grouped in the cohort of vesicles. While some proteins have a given location in a cell, other proteins are multi-localizing proteins (MLPs), which constitute ˜55% (n=7106) of the localized proteins in the Cell Atlas (source: https://www.proteinatlas.org/humanproteome/cell/multilocalizing). The number of localized proteins is therefore in the ˜14,000 range with ˜55% localizing in multiple (sub)compartments. Nucleoli are a dense structure and so far ˜7% or 1361 proteins have been detected in one or multiple nucleolar sub-compartments: nucleoli (1008), nucleoli fibrillar center (300), nucleoli rim (100) (source: https://www.proteinatlas.org/humanproteome/cell/nucleoli). A typical nucleolus may be in the range of 0.2-3.5 μm in diameter, which means that a small nucleolus of 0.2 μm diameter has a volume of roughly 0.0335 μm³, which is about 1.3-fold larger than the volume of the effective PSF of an NA 1.4 oil immersion objective. For this reason, the nucleolus may be regarded as a challenging structure with respect to multi-species readout volumes.
One could estimate that in the worst case ˜1500 distinct localized proteins (non-localized proteins excluded) would have to be resolved at the same time in a confocal volume, if one wanted to achieve this in a single round of imaging. Similarly, roughly 24% or 4770 proteins have been detected in the cytosol in humans (source: https://www.proteinatlas.org/humanproteome/cell/cytosol), which consists of ˜70% water and 20-30% protein. Cytosolic proteins may be evenly distributed throughout the cytosol or in punctate patterns, like aggresome, cytoplasmic bodies, rods & rings. The odds of encountering a multi-species readout volume in an iterative staining and imaging process can therefore be minimized by taking this a priori knowledge of protein localization into account when defining the sets of markers for each round in a way that they are distributed across the maximally possible number of sub compartments.
Multi-species readout volumes can be reliably detected by the method by acquiring a first readout sequence and retrieving (from a memory device) or computing all combinations of dyes from the plurality of combinations of dyes (S1) subsumable under the first readout sequence. A multi-species readout volume is detected when more than one combination of dyes from the plurality of combinations of dyes (S1) is subsumable under the first readout sequence. In this case the user is notified, e.g. by a software program, that the corresponding readout volume contains multiple species of target molecules. Depending on the possible species in the spot, and preferably on a priori knowledge from the aforementioned protein expression databases, an optimized second, deterministically assembled set of combinations of dyes from the plurality of combinations of dyes (S1) may be suggested to decode a multi-species readout volume in an iterative decoding process with a minimum number of iterations based on a certain acceptable level of statistical confidence. Alternatively, or in addition, a second independent and identically distributed set of combinations of dyes from the plurality of combinations of dyes (S1) may be assigned to the plurality of affinity reagents in a second round (this may be regarded as repeated random drawing with replacement).
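The detection rule described above may be sketched in Python as follows. The sketch assumes that a readout sequence can be represented as the set of dyes detected in the readout volume and that a combination of dyes is subsumable under it when all of its dyes were detected; the function names, marker names and dye names are hypothetical.

```python
def subsumable_markers(detected_dyes, assigned_codes):
    # Markers whose assigned combination of dyes lies entirely within
    # the set of detected dyes (assumed meaning of "subsumable").
    detected = frozenset(detected_dyes)
    return sorted(
        marker for marker, code in assigned_codes.items()
        if frozenset(code) <= detected
    )


def classify_readout_volume(detected_dyes, assigned_codes):
    # More than one subsumable code flags a multi-species readout volume.
    candidates = subsumable_markers(detected_dyes, assigned_codes)
    if not candidates:
        return "no marker", candidates
    if len(candidates) == 1:
        return "mono-species", candidates
    return "multi-species", candidates


# Hypothetical assignment using two sets of dyes (A and B).
codes = {
    "marker_1": {"DyeA.1", "DyeB.1"},
    "marker_2": {"DyeA.2", "DyeB.2"},
    "marker_3": {"DyeA.1", "DyeB.2"},
}

print(classify_readout_volume({"DyeA.1", "DyeB.1"}, codes))
print(classify_readout_volume({"DyeA.1", "DyeA.2", "DyeB.1", "DyeB.2"}, codes))
```

In the second call all three codes are subsumable even if only marker_1 and marker_2 are present, which illustrates why a multi-species readout volume may need a further round to be decoded unambiguously.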
It is important to note, though, that a multi-species readout volume that cannot be decoded unambiguously in a single round of readout can still be decoded unambiguously by multiple rounds of readout with varying sets of combinations of dyes attached to the same set of affinity reagents. It is also important to note that, due to the exponential nature of combinatorial encoding, the cardinality of the set of unique codes can be very high in relation to the cardinality of the set of genes in the human genome or the cardinality of analytes to be identified/decoded, even when a limited number of dyes is used to generate the codes. For this reason, it is easy to perform experiments in which only a fraction of the available codes is actually assigned to a marker and a target molecule. For example, the number of protein-coding genes in the human genome may be estimated to be roughly 20,000. In an example where n=5 and yA=yB=yC=yD=yE=10, which would generate 10^5=100,000 unique codes, this means that ˜20% of the available codes would actually be assigned. Further, it means that the fraction of actually assigned codes may be easily adjusted over a wide range by adding a further set of dyes or expanding the number of dyes in a set. For example, n=6 and yA=yB=yC=yD=yE=yF=15 would yield 15^6=11,390,625, i.e. more than 10,000,000, unique codes and drop the fraction of actually assigned codes needed to encode 20,000 markers to ˜0.2%. Importantly, 6 excitation lines, for example 360 nm, 405 nm, 488 nm, 560 nm, 630 nm, 700 nm, can be easily provided, e.g. on commercial confocal microscopes, and sets containing 15 dyes, of which ˜5 each fall into one of three major classes according to their fluorescence lifetime (e.g. <1 ns, 1-5 ns, >10 ns), can be derived from existing fluorescent dyes through modification of the base structure of the dyes.
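How the fraction of assigned codes responds to the number of sets and the number of dyes per set may be explored with the following Python sketch. It assumes equally sized sets and one dye per set in every code; the function names and the target fractions passed in the usage lines are hypothetical choices for illustration.

```python
def assigned_fraction(n_markers, n_sets, dyes_per_set):
    # Fraction of available codes that is actually assigned to markers.
    return n_markers / (dyes_per_set ** n_sets)


def smallest_set_size(n_markers, n_sets, max_fraction):
    # Smallest number of dyes per set for which the assigned fraction
    # does not exceed max_fraction (simple linear search).
    dyes_per_set = 1
    while assigned_fraction(n_markers, n_sets, dyes_per_set) > max_fraction:
        dyes_per_set += 1
    return dyes_per_set


print(assigned_fraction(20_000, 5, 10))   # 0.2
print(assigned_fraction(20_000, 6, 15))   # about 0.0018
print(smallest_set_size(20_000, 5, 0.25))
print(smallest_set_size(20_000, 6, 0.002))
```

Under these assumptions the search returns 10 dyes per set for five sets and 15 dyes per set for six sets, in line with the two configurations discussed above.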
Likewise, several approaches to use fluorescence emission spectral information and lifetime information in conjunction are available and include spectral and fluorescence lifetime, gating, unmixing, phasor-based approaches, machine or deep learning-based classification strategies.
For this reason, it is possible to decode a multi-species readout volume reliably using a preferred embodiment of the method disclosed in this document, using an iterative process based on a certain level of statistical confidence, for example a p value. A p value measures the probability of obtaining a test result at least as extreme as the actually measured value under the assumption that the null hypothesis is true. In the method described in this document, a p value can be calculated for each marker that measures the probability that the marker was observed in the readout volume (i.e. the confocal volume or effective point spread function of the readout device) despite the fact that it was not actually present in the readout volume, i.e. given that the null hypothesis (the marker is not present in the readout volume) is true. Importantly, the confidence in the decoding result grows quickly with each iteration. For this reason, generally acceptable statistical confidence levels, i.e. p values, are attainable with a limited number of iterations, for example 1-10.
This is achieved by (1) providing a first set of markers with a first set of affinity reagents labeled with a first set of combinations of dyes, which are (randomly or deterministically) selected from the set of combinations of dyes, reading out the spot or the readout volume and assembling the first set of all possible combinations of dyes subsumable under the first readout sequence; (2) providing a second set of markers with the first set of affinity reagents labeled with a second set of combinations of dyes, which are either randomly or deterministically selected from the set of combinations of dyes, reading out the spot or the readout volume again and assembling the second set of all possible combinations of dyes subsumable under the second readout sequence; (3) comparing the first set of combinations of dyes subsumable under the first readout sequence with the second set of combinations of dyes subsumable under the second readout sequence and removing all combinations of dyes which are not shared by the first and the second set of combinations of dyes, i.e. defining the overlap; (4) calculating p values and/or other suitable measures of statistical confidence for each combination of dyes and/or affinity reagent and/or marker and/or target molecule detected in the overlap; (5) comparing all calculated p values and/or other suitable measures of statistical confidence against a user-defined threshold to perform “presence calling” based on a certain level of statistical confidence for at least some of the combinations of dyes and/or affinity reagents and/or markers and/or target molecules detected in the overlap; and (6) repeating the process until all multi-species readout volumes of interest have been decoded at a satisfactory level of statistical confidence for at least each combination of dyes and/or affinity reagent and/or marker and/or target molecule in a user-defined subset (e.g. a group of target molecules of interest).
Alternatively or in addition to the method described above, decoding of readout sequences may leverage intensity information. It can be postulated that all dyes exhibit substantially comparable brightness and a substantially linear response in the regime of conditions under which the measurements are performed. Furthermore, differences in the brightness of individual dyes can be accounted for by performing a suitable calibration. This is a general assumption underlying, for example, fluorescence microscopy measurements. Under this assumption it can be stated that the likelihood of observing a false-positive result is lower when higher signal intensities are observed. For example, when a first readout sequence contains a “1” for DyeA.1 and DyeB.2, which means that they were both detected, the intensities of these dyes may be different; for example, DyeA.1 may have an intensity of 1 AU and DyeB.2 of 10 AU. In this case, codes subsumable under the readout sequence that have a “1” in position DyeB.2 correspond to markers that have a higher probability of being actually present in the readout volume, i.e. better intensity-adjusted p1 values and/or other suitable measures of statistical confidence.
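Steps (1) to (3) above may be sketched in Python as follows. Because the combinations of dyes change between rounds while the affinity reagents stay the same, the sketch compares the candidates of each round by the marker they are assigned to; this is an interpretation made for the purpose of illustration. The readout is simulated as the union of the dyes of the markers actually present, and all names are hypothetical.

```python
def simulate_readout(present_markers, assignment):
    # Assumption: every dye of every present marker is detected.
    detected = set()
    for marker in present_markers:
        detected |= set(assignment[marker])
    return detected


def candidates(detected_dyes, assignment):
    # Markers whose code is subsumable under the readout of this round.
    detected = frozenset(detected_dyes)
    return {m for m, code in assignment.items() if frozenset(code) <= detected}


def decode_overlap(rounds):
    # Intersect the candidate sets of all rounds (the "overlap").
    overlap = None
    for detected_dyes, assignment in rounds:
        found = candidates(detected_dyes, assignment)
        overlap = found if overlap is None else overlap & found
    return overlap if overlap is not None else set()


present = {"marker_1", "marker_2"}   # hypothetical true content of the spot
round_1 = {"marker_1": {"A.1", "B.1"}, "marker_2": {"A.2", "B.2"},
           "marker_3": {"A.1", "B.2"}, "marker_4": {"A.3", "B.3"}}
round_2 = {"marker_1": {"A.3", "B.1"}, "marker_2": {"A.1", "B.3"},
           "marker_3": {"A.2", "B.2"}, "marker_4": {"A.3", "B.3"}}

readouts = [(simulate_readout(present, a), a) for a in (round_1, round_2)]
print(sorted(candidates(*readouts[0])))   # includes marker_3 by chance
print(sorted(candidates(*readouts[1])))   # includes marker_4 by chance
print(sorted(decode_overlap(readouts)))   # only the markers actually present
```

In this toy example each round alone yields one chance candidate, and the two chance candidates differ between rounds, so the overlap contains only the two markers that are actually present.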
Alternatively, or in addition to primary and secondary qualitative decoding a multi-species readout volume may also be quantitatively decoded. This may be brought about by finding the scaling of the proportions of markers in the overlap such that they match the observed intensity profile in the best possible way. As both the intensity profile as well as the identities of the markers in the readout spot are known after primary and/or secondary qualitative decoding (based on a certain level of statistical confidence) this becomes a fully determined set of linear equations, which is essentially comparable to linear unmixing. Using the aforementioned steps multi-species readout volumes can be decoded reliably using a limited number of rounds/iterations providing the identity of markers in the readout spot/volume (i.e. confocal volume/effective PSF) based on suitable measures of statistical confidence and attaining a certain level of statistical confidence, which can be provided in the form of for example marker-specific p values or intensity-adjusted p1 values, or on other suitable combinations of dyes-/affinity reagent-/marker-/target molecule-specific measures of statistical confidence. Furthermore, relative quantitative or absolute quantitative information may be derived as well. In this case the response of the readout device and the dye have to be in the linear regime. For an absolute quantitative readout suitable calibrations have to be performed to relate the area under the curve (AUC) for a given dye emission spectrum back to the number of dye molecules. It is important to note that the method does not require an absolute quantitative readout.
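The quantitative decoding described above may be sketched as a small least-squares problem. The sketch assumes that all dyes are equally bright after calibration, that the response is linear, and that the intensity measured for a dye is the sum of the abundances of all markers in the overlap whose code contains that dye. It solves the normal equations with a plain Gaussian elimination, which is one possible choice and not necessarily the solver used in practice; all names and numbers are hypothetical.

```python
def solve_linear(matrix, rhs):
    # Gaussian elimination with partial pivoting for a small square system.
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("abundances are not uniquely determined")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - rest) / aug[i][i]
    return solution


def estimate_abundances(intensities, codes):
    # Least-squares fit of marker abundances to per-dye intensities.
    dyes = sorted(intensities)
    markers = sorted(codes)
    design = [[1.0 if d in codes[m] else 0.0 for m in markers] for d in dyes]
    observed = [float(intensities[d]) for d in dyes]
    k, rows = len(markers), range(len(dyes))
    gram = [[sum(design[r][i] * design[r][j] for r in rows) for j in range(k)]
            for i in range(k)]
    moment = [sum(design[r][i] * observed[r] for r in rows) for i in range(k)]
    return dict(zip(markers, solve_linear(gram, moment)))


# Two markers sharing DyeA.1; intensities in arbitrary units (hypothetical).
overlap_codes = {"marker_1": {"DyeA.1", "DyeB.1"}, "marker_3": {"DyeA.1", "DyeB.2"}}
measured = {"DyeA.1": 4.0, "DyeB.1": 3.0, "DyeB.2": 1.0}
print(estimate_abundances(measured, overlap_codes))
```

For these hypothetical intensities the fit attributes three units to marker_1 and one unit to marker_3, i.e. a relative proportion of 3:1.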
In case the method presented in this document is used for samples labeled with very high densities, for example genome-wide labeling (e.g. 20,000 markers), multi-species readout volumes may occur at high frequency and each multi-species readout volume may contain a high number of species, i.e. different target molecules (e.g. 100-1000 different target molecule species). This leads to the question of how high the likelihood is that the same non-present marker is detected multiple times in an iterative decoding scheme. Consider an example with n=5 and yA=yB=yC=yD=yE=10, ν=20,000 markers and ψ=100,000 available unique codes. How high is the probability to observe a non-present marker in the overlap multiple times, i.e. to obtain the same wrong or untrue result? In this sense it is useful to consider the probability after the second round. After obtaining the first readout sequence, the first set of combinations of dyes subsumable under the first readout sequence, containing κ1 codes, is retrieved from the memory. In our example κ1 may be equal to 1 for a mono-species readout volume or between 1 and 20,000, the maximum number of used codes, for multi-species readout volumes. Generally, it can be expected, however, that for multi-species readout volumes κ may be in the range of 10-5000. After the first round of decoding, therefore, significant uncertainty exists with respect to the true content of a multi-species readout volume. This leads to the question: How high is the probability pκ to obtain a readout sequence under which κ codes can be subsumed? As markers and codes are assigned to each other in a random fashion, it can be postulated that each set of subsumable codes consists of a stable overlap, i.e. the codes corresponding to the markers actually present in the sample, plus a random assortment of markers whose codes can by chance be subsumed under the observed readout sequence.
The probability pκ that a given marker out of ν=20,000 markers is subsumable under the readout sequence depends on the readout sequence and is given by κ/ψ, wherein ψ is the cardinality of the plurality of combinations of dyes (S1). For example, for κ=1,000 in each round the probability to observe the same non-present marker twice is 1,000:100,000×1,000:100,000 or 1:10,000. Importantly, the likelihood of observing a certain κ can be estimated a priori and used to guide the choices of the user with respect to how many dyes shall be used, i.e. the cardinality of the set of unique codes, and how many iterations would be needed to decode a certain number of markers at a certain level of statistical confidence. Whether a higher or lower κ is observed depends on a number of parameters. For example, κ depends on the assignment rate α=ν/ψ, which corresponds to the ratio of the number of combinations of dyes assigned to a marker to the overall number of combinations of dyes available. Further, κ depends on the entropy S in the set of combinations of dyes assigned to a marker and is inversely proportional to S, κ˜1/S, i.e. a higher entropy in the set of combinations of dyes assigned to a marker allows better p values to be obtained in fewer iterations.
In this preferred embodiment the method, which is based on an iterative process of staining the sample, reading out the sample, and inactivating the dyes, further comprises a step of deactivating at least one of the plurality of markers, at least one set of markers, or at least one marker. In this document, deactivating one or more markers means preventing the associated fluorescent dye from emitting fluorescence light from the sample in the future. This can be done by either removing the fluorescent dye from the sample or by bleaching the fluorescent dye. Thereby, crosstalk between fluorescent dyes associated with different sets of markers is greatly reduced. In other words, by deactivating a set of markers, the structure marked by said set will not be visible in future images/readouts. This means, for example, that fluorescent dyes with similar emission spectra may be used in subsequent images, thereby increasing the overall number of markers that can be used in a single round, a single experiment and/or with a single biological sample.
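The estimate above may be reproduced with the following Python sketch. It assumes that the rounds are independent, i.e. that the combinations of dyes are reassigned at random in every round, so that the per-round probabilities κ/ψ multiply; exact fractions are used to avoid rounding effects. The function names and the threshold in the usage lines are hypothetical.

```python
from fractions import Fraction


def chance_probability(kappa, psi):
    # Probability that a given non-present marker is subsumable in one round.
    return Fraction(kappa, psi)


def repeated_chance_probability(kappas, psi):
    # Probability that the same non-present marker is subsumable in every
    # round, assuming independent rounds.
    probability = Fraction(1)
    for kappa in kappas:
        probability *= chance_probability(kappa, psi)
    return probability


def rounds_for_threshold(kappa, psi, threshold):
    # Smallest number of rounds after which the probability of a repeated
    # chance observation is at or below the threshold.
    per_round = chance_probability(kappa, psi)
    if not 0 < per_round < 1:
        raise ValueError("kappa must be positive and smaller than psi")
    rounds, probability = 1, per_round
    while probability > threshold:
        rounds += 1
        probability *= per_round
    return rounds


print(repeated_chance_probability([1_000, 1_000], 100_000))    # 1/10000
print(rounds_for_threshold(1_000, 100_000, Fraction(1, 100_000)))  # 3
```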
Preferably, the deactivating step is done by at least one of bleaching the fluorescent dye unique to the at least one marker and removing the at least one marker from the sample, preferably by at least one of dissociating or cleaving the fluorescent dye from the affinity reagent or dissociating the affinity reagent from the target structure.
In another preferred embodiment the step of generating the channels is based on at least one of spectral unmixing (which may also be referred to as spectral imaging and linear unmixing, or channel unmixing), a fluorescence lifetime of the fluorescent dyes and an excitation fingerprint of the fluorescent dyes. Spectral unmixing may be performed in various ways including but not limited to linear unmixing, principle component analysis, learning unsupervised means of spectra, support vector machines, neural networks, (spectral) phasor approach, and Monte Carlo unmixing algorithm. In order to reduce crosstalk between the fluorescent dyes associated with different markers, several techniques may be employed. The unmixing techniques are used to separate contributions from different fluorescent dyes to the same detection channel, i.e. the crosstalk due to overlapping emission spectra. Employing these techniques can greatly enhance the sensitivity of the method due to reduced noise. Further, the fluorescence lifetime and the excitation fingerprint of a fluorescent dye can be used in order to correctly identify the fluorescent dye. Phasor S-FLIM for example (as described in Scipioni, L., Rossetta, A., Tedeschi, G. et al. Phasor S-FLIM: a new paradigm for fast and robust spectral fluorescence lifetime imaging. Nat Methods 18, 542-550 (2021)) is a suitable approach to leverage both the emission spectrum and fluorescence lifetime information to increase the overall number of dyes that can be reliably discerned. This can be used to employ more sets of markers per image, i.e. have more sets of markers in one set. In turn, this vastly increases the overall number of markers that can be imaged. Both τ gating and τ unmixing are suitable strategies to take advantage of fluorescence lifetime to increase the number of discernible dyes y per set.
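A simple form of τ gating in combination with an emission band may be sketched in Python as follows. The three lifetime gates use the example classes mentioned elsewhere in this document (<1 ns, 1-5 ns, >10 ns); the assumption of five emission bands per excitation line, the band names, the dye names and the function names are hypothetical and serve only to show how two orthogonal contrasts multiply the number of discernible dyes per set.

```python
def lifetime_gate(tau_ns):
    # Assign a measured fluorescence lifetime to one of three example gates.
    if tau_ns < 1.0:
        return "short"
    if tau_ns <= 5.0:
        return "medium"
    if tau_ns > 10.0:
        return "long"
    return None  # 5-10 ns lies between the example gates and is not assigned


def identify_dye(emission_band, tau_ns, dye_table):
    # Look up the dye from the pair (emission band, lifetime gate).
    gate = lifetime_gate(tau_ns)
    if gate is None:
        return None
    return dye_table.get((emission_band, gate))


# Hypothetical set of dyes for one excitation line: 5 bands x 3 gates.
bands = ["band_1", "band_2", "band_3", "band_4", "band_5"]
gates = ["short", "medium", "long"]
dye_table = {
    (band, gate): "Dye_" + band + "_" + gate for band in bands for gate in gates
}

print(len(dye_table))                            # 15 discernible dyes
print(identify_dye("band_2", 0.6, dye_table))    # Dye_band_2_short
print(identify_dye("band_2", 12.0, dye_table))   # Dye_band_2_long
print(identify_dye("band_2", 7.0, dye_table))    # None
```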
Today fluorescence lifetime is not widely used as an orthogonal contrast in microscopy and cytometry. This is probably related to the fact that most organic dyes, which account for the vast majority of commercially available fluorescent dyes, have fluorescence lifetimes in the 1-5 ns range, which renders the lifetime-based separation challenging. Further, and maybe more importantly, fluorescence lifetime is strongly dependent on the molecular environment, and many dyes show a shortening of fluorescence lifetime in aqueous or polar environments, which are typical for biological specimens. Nevertheless, the widespread use of fluorescence lifetime as an orthogonal contrast seems feasible. In a preferred embodiment of this invention, fluorescence lifetimes of the dyes are stabilized against the environmental conditions by means of one of the following: encapsulation, caging, dyad formation, deriving rotaxanes from dyes, co-crystallizing dyes into for example SMILES, polymerizing dyes, and incorporating dyes into nano- or microstructures such as polymer beads.
In another preferred embodiment machine learning, deep learning or other artificial intelligence approaches are used to train a classifier to discern dyes based on a combination of at least two of the following properties: excitation fingerprint, fluorescence emission spectrum, fluorescence lifetime, fluorescence intensity, brightness. Such a trainable classifier may be similar to “Learning Unsupervised Means of Spectra” (LUMOS) described in McRae T D, Oleksyn D, Miller J, Gao Y-R (2019) Robust blind spectral unmixing for fluorescence microscopy using unsupervised learning. PLoS ONE 14(12): e0225410, which is based on k-means clustering.
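As a simplified illustration of clustering-based dye discrimination, the following sketch applies plain k-means to per-pixel feature vectors that combine two properties, namely normalized emission in two channels and a fluorescence lifetime. It is not the LUMOS implementation; the feature values, the initial centroids and the function names are assumptions made for illustration, and in practice the features would typically be scaled so that no single property dominates the distance.

```python
# Minimal k-means sketch (illustrative; not the LUMOS implementation).

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    distances = [squared_distance(point, c) for c in centroids]
    return distances.index(min(distances))

def kmeans(points, initial_centroids, iterations=10):
    centroids = [list(c) for c in initial_centroids]
    for _ in range(iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            [sum(column) / len(cluster) for column in zip(*cluster)] if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical features per pixel: (emission channel 1, emission channel 2, lifetime in ns).
pixels = [
    (0.80, 0.20, 1.0),
    (0.75, 0.25, 1.2),
    (0.30, 0.70, 3.8),
    (0.25, 0.75, 4.0),
]
centroids, labels = kmeans(pixels, initial_centroids=[pixels[0], pixels[-1]])
```

Here the first two pixels fall into one cluster and the last two into another, so each cluster can be associated with one dye.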
In a preferred embodiment, a learning algorithm based on machine learning, deep learning, or artificial intelligence techniques including but not limited to support vector machines, classic neural networks, convolutional neural networks, recurrent neural networks, generative adversarial networks, self-organizing maps, Boltzmann Machines, deep reinforcement learning, or autoencoders is trained to separate dyes based on their emission spectrum.
In a preferred embodiment, a learning algorithm based on machine learning, deep learning, or artificial intelligence techniques including but not limited to support vector machines, classic neural networks, convolutional neural networks, recurrent neural networks, generative adversarial networks, self-organizing maps, Boltzmann Machines, deep reinforcement learning, or autoencoders is trained to separate dyes based on both their emission spectrum and fluorescence lifetime. This may be based on simple fluorescence lifetime gating or more sophisticated fluorescence lifetime analysis. A suitable means to derive training data for this approach may be Phasor S-FLIM, for example (as described in Scipioni, L., Rossetta, A., Tedeschi, G. et al. Phasor S-FLIM: a new paradigm for fast and robust spectral fluorescence lifetime imaging. Nat Methods 18, 542-550 (2021)).
In another preferred embodiment the method further comprises a step of capturing a hyperspectral image of the sample. In contrast to multispectral imaging, which captures a limited number of wavelength bands, typically less than or around 10, a hyperspectral image captures tens or hundreds of wavelength bands per pixel. In other words, hyperspectral images have a very high spectral resolution. This allows for a much finer differentiation of fluorescent dyes based on their emission spectrum and thereby increases the sensitivity and reliability of the method.
The robustness of the readout is an important consideration. Ideally each spot is read out individually, which means that spots that are read out in parallel are separated spatially such that the optical system used for their detection can resolve them as separate spots. If two or more markers with distinct reactivities/specificities are in too close spatial proximity, i.e. both located substantially within the same readout volume, and are read out simultaneously, then it may happen that an unambiguous decoding of the encoded information cannot be obtained in a single round. In this case, strategies for densely labelled samples may be employed as described below. If densely labelled samples are to be used with a very high number of markers, for example in genome-wide studies, in which multi-species readout volumes may occur with high frequency, the method can be adapted to reliably detect multi-species readout volumes and decode the contained information, i.e. the identity of the markers in the spot, in an iterative process with a limited number of rounds. This is a preferred embodiment of the invention and a breakthrough with respect to the prior art in terms of the ‘plexing’ level that can be attained per round, which is several orders of magnitude higher than that of currently available methods. This is described above and referred to as “Primary qualitative iterative multi-species readout volume decoding by reassigning codes in between iterations” or “code swapping”. Further, as described above, it is also possible to perform relative and absolute quantitative decoding/decryption.
When the method disclosed herein is intended to be used in conjunction with densely labelled samples, it may be beneficial to adapt the method. One such adaptation is based on a priori knowledge of protein location, such as nuclear, cytoplasmic, nucleocytoplasmic, secreted proteins, proteins located on or in organelles, or on the cell membrane both intracellular and extracellular, which allows the stratification of the plurality of markers into multiple sub-pluralities that are then brought into the sample and acquired in multiple rounds of an iterative staining and imaging process in a way that minimizes the chances of two distinct markers colocalizing in the same spot in the same round. This may be combined with expansion microscopy protocols as described by Martinez et al., Scientific Reports, Volume 10, Article number: 2917 (2020) to expand the sample by roughly a factor of 4 in all spatial directions and thereby further reduce the odds of two distinct markers colocalizing in the same spot or readout volume in the same round. Further, the number of sets n can be increased and the plurality of dyes may be divided into sub-pluralities of dyes; each sub-plurality of dyes may then be used to generate combinations of dyes for respective sub-pluralities of markers. As more and more dyes become available with narrower excitation and emission spectra, it will be easier to accommodate a higher n and/or a higher y. A similar problem exists in super resolution microscopy; stochastic labelling of the target structure or stochastic blinking of fluorescent dyes may be used to avoid said problem. Blinking of fluorescent dyes can be achieved in various ways: while some dyes, such as for example quantum dots, generally blink, other fluorophores may for example be photoactivated, photoswitched, or ground state depleted to make them blink. These techniques can be adapted for the method disclosed in this document to allow the imaging of densely labelled samples.
In a preferred embodiment, DNA-PAINT is used and markers are read out stochastically such that a 1 to n readout is reiterated i times to obtain a first readout sequence. In a preferred embodiment, stochastic optical reconstruction microscopy or a related blinking method is used and markers are read out stochastically such that an A to n readout is reiterated i times to obtain a first readout sequence.
In order to read out the combination of dyes, it is preferable in some embodiments of the present invention, for example in whole secretome profiling, to ensure that markers of a given specificity are located in disparate locations or spots in the sample, and in this case it is useful to perform a spot detection. Spot detection is based on image segmentation. A spot is a kind of feature in the sense of this document. Combinations of dyes are preferably read out on a per spot basis in assay formats in which the majority of readout volumes are mono-species readout volumes.
The image segmentation analysis can be carried out with classical approaches, artificial intelligence based techniques including machine learning and neural-networks/deep learning, or other techniques including thresholding techniques, dimensionality reduction techniques, clustering methods, compression-based methods, histogram-based methods, edge detection, dual clustering method, region-growing methods, partial differential equation-based methods, variational methods, graph partitioning methods (for example Markov random fields), a watershed transformation, model-based segmentation, multi-scale segmentation, semi-automatic segmentation, trainable segmentation using various machine learning, neural network and artificial intelligence approaches, for example pulse-coupled neural networks (PCNNs), convolutional neural networks (e.g. U-Net) and recurrent neural networks (RNNs), as well as object co-segmentation methods such as Markov networks, convolutional neural networks, or long short-term memory (LSTM), for example. Alternatively, or in addition, characteristics such as size and/or colour and/or fluorescence intensity and/or fluorescence lifetime can be used to identify the constituent parts of the marker from the image data. Various algorithms, including Harris Corner, scale invariant feature transform (SIFT), speeded up robust feature (SURF), features from accelerated segment test (FAST), and oriented FAST and rotated BRIEF (ORB), are known and can be used to identify the constituent parts and/or features of the marker from the image data.
In another preferred embodiment the method further comprises a step of applying the second excitation light temporally after the first excitation light. Preferably, the time between applying the first excitation light and applying the second excitation light is longer than the fluorescence lifetime of the fluorescent dyes of the first set/which are excited by the first excitation light. This ensures that only fluorescence light emitted by the fluorescent dyes of the second set/which are excited by the second excitation light is captured for generating the second image/readout. Thereby, crosstalk between fluorescent dyes can be reduced and the sensitivity of the method is further improved.
In another preferred embodiment at least one of the first wavelength spectrum and the second wavelength spectrum for dye excitation comprises a wavelength range being smaller than 50 nm, smaller than 30 nm, smaller than 10 nm or a single wavelength. These wavelength bands are typical ranges of e.g. dichroic beam splitters or bandpass filters used in fluorescence microscopy. Various methods can be used in order to generate the respective wavelength spectrum for sample illumination or fluorescent dye excitation. For example, a bandpass filter which filters out a wavelength range might be used in combination with a light source emitting light having a broad spectrum of wavelengths, e.g. a mercury or xenon lamp. Alternatively, or additionally, a white light laser emitting supercontinuum white light in combination with an AOTF for selecting one or more single wavelengths of the emitted light could be used.
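The relation between the delay separating the two excitation lights and the remaining emission of the first set can be estimated with a short calculation. The sketch below assumes an idealized mono-exponential decay, for which the fraction of fluorescence still to be emitted after a delay t is exp(-t/τ); the function names and the numerical values are illustrative assumptions and not part of the method as such.

```python
import math

def residual_fraction(delay_ns, lifetime_ns):
    """Fraction of a mono-exponential decay that remains after delay_ns."""
    return math.exp(-delay_ns / lifetime_ns)

def minimum_delay(lifetime_ns, max_residual):
    """Delay after which at most max_residual of the emission remains."""
    return -lifetime_ns * math.log(max_residual)

lifetime_ns = 4.0  # hypothetical dye lifetime within the 1-5 ns range
after_one_lifetime = residual_fraction(lifetime_ns, lifetime_ns)  # about 0.37
delay_for_one_percent = minimum_delay(lifetime_ns, 0.01)          # about 18.4 ns
```

Under this assumption a delay of exactly one lifetime still leaves roughly 37% of the emission of the first set, so a delay of several lifetimes may be chosen where low crosstalk is required.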
In another preferred embodiment the fluorescent dyes unique to each set can be excited by essentially one wavelength spectrum or by the same wavelength spectrum. This allows the fluorescent dyes of a single set to be excited by a single light source with e.g. a narrow emission spectrum. This embodiment of the method can be easily implemented with existing fluorescence microscopes which often comprise such light sources.
In another preferred embodiment the fluorescent dyes of a single set of dyes A to n comprise emission spectra of at least partially different wavelength ranges. Thereby, the fluorescent dyes of a single set of dyes A to n can be easily distinguished from one another by their emission spectra. This reduces or eliminates the computational load of the unmixing necessary to separate the channels of each image and makes the method faster and more reliable.
In another preferred embodiment at least two fluorescent dyes, each unique to one set of dyes A to n, have different fluorescence lifetimes. Thereby, the at least two fluorescent dyes can be distinguished from one another by their lifetimes. In particular, this can be used to increase the number of channels per image, i.e. capture more markers per image. Thus, the overall number of dyes per set that can be imaged is vastly increased. In particular, existing fluorescent dyes may be engineered to generate derivative fluorescent dyes with similar excitation and/or emission spectra, but different fluorescence lifetimes by modifying the base structure or putting the fluorescent dye into a different molecular environment. In particular, existing fluorescent dyes may be encapsulated in micro- or nanocapsules, embedded into a polymer like polystyrene, caged, or co-crystallized in for example SMILES to stabilize their molecular environment and thereby their fluorescence lifetime. This strategy may also be used to generate sets of dyes of the same dye species with different fluorescence lifetimes. Rotaxanes and in particular rotaxanes derived from squaraine are interesting dyes in this regard, as the interlocking of the dye molecule in a macrocycle stabilizes the molecular environment and thereby the fluorescence lifetime.
In a preferred embodiment at least one of the dyes or labels is a small-molecule ionic isolation lattice (SMILES). These are small crystals that consist of cationic dyes which are co-crystallized with counterions such as for example anion-binding cyanostar or alternative agents, such as Bis-amide, cyclodextrin, Tetra-phenyl or pyrene as described in Benson et al. 2020 Chem 6, 1978-1997, Aug. 6, 2020.
In a preferred embodiment at least one of the dyes or labels is a polymer microbead or nanostructure containing a small-molecule ionic isolation lattice (SMILES).
In a preferred embodiment at least one of the dyes or labels is a rotaxane dye, for example a squaraine-rotaxane dye.
In a preferred embodiment at least one of the dyes or labels is a dyad consisting of an antenna moiety and an emitter moiety.
In a preferred embodiment at least one of the dyes or labels is a FRET pair of at least two dyes, a donor and an acceptor, connected by a linker, for example a nucleic acid. At least one of the fluorescent dyes may be FRET-pair based, having at least one fluorescent dye as a FRET donor and at least one fluorescent dye as a FRET acceptor. The FRET pair might be physically connected by a linker comprising DNA, RNA, peptide nucleic acid, morpholino or locked nucleic acid, glycol nucleic acid, threose nucleic acid, hexitol nucleic acid or another form of artificial nucleic acid, a DNA nanostructure and/or a peptide.
In another preferred embodiment at least two fluorescent dyes, each unique to one marker, in the first and/or second sub-pluralities comprise emission spectra of essentially the same wavelength ranges and essentially the same fluorescence lifetime at a first condition, for example a certain first pH value, a certain first solvent, a certain first redox level, a certain first temperature, or a certain first concentration of a ligand (e.g. lower concentration of one of the following Cu(II), Zn(II), a small molecule) of the sample, and comprise emission spectra of essentially the same wavelength ranges and substantially different fluorescence lifetimes at a second condition, for example a certain second pH value, a certain second solvent, a certain second redox level, a certain second temperature, or a certain second concentration of a ligand (e.g. lower concentration of one of the following Cu(II), Zn(II), a small molecule) of the sample.
Embodiments of the present invention also relate to a device for analyzing a biological sample being adapted to carry out the method for analyzing a biological sample described above. The device has the same advantages as the method and can be supplemented using the features of the dependent claims directed at the method. The device may in particular be configured to image samples in an array format such as a microplate.
In a preferred embodiment the device is configured to read out samples flowing through a flow cell.
In a preferred embodiment the device comprises at least one of a first light source configured to emit the first excitation light, and at least one second light source configured to emit the second excitation light. Alternatively, or additionally, the device comprises a tunable light source configured to emit the first and second excitation light. Preferably at least one of the first excitation light and the second excitation light is coherent light.
In another preferred embodiment a separation of the first and/or second images (readouts) into the at least two channels is done by at least one of a spectrometer comprising a prism or a grating and at least one detector. Dispersive or diffractive elements can be used to optically or spatially separate the captured fluorescence light by wavelength into distinct channels, e.g. by directing different wavelengths onto different parts of a single detector or onto different detectors. Since these channels are created by detector hardware, they will also be called detection channels in the following. An example of such a spectrometer arrangement for a confocal scanning microscope is disclosed e.g. in U.S. Pat. No. 6,614,526 B1.
In another preferred embodiment a separation of the first and/or second images (readouts) into the at least two channels is done by at least one time-sensitive detector. Such detectors register not only the wavelength spectrum but also the arrival time of the captured fluorescence light. They may also be time-gated, i.e. configured to register events within discrete segments of time, so-called time gates, enabling e.g. the determination of lifetime information from the arrival time of the captured fluorescence light. Thereby, fluorescent dyes having significantly overlapping emission spectra but different fluorescence lifetimes can be separated reliably into different channels. This further increases the number of markers that can be grouped into a single sub-plurality, i.e. imaged at the same time.
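How lifetime information may be derived from time-gated detection can be illustrated with a two-gate estimate. The sketch assumes a mono-exponential decay and two contiguous time gates of equal width, in which case the ratio of the counts in the first and second gate equals exp(w/τ) for gate width w, so that τ = w / ln(N1/N2). The function names and the example values are hypothetical, and real data would additionally require handling of noise and background.

```python
import math

def gate_counts(arrival_times_ns, gate_start_ns, gate_width_ns, n_gates):
    """Count photon arrival times falling into contiguous, equal-width time gates."""
    counts = [0] * n_gates
    for t in arrival_times_ns:
        index = int((t - gate_start_ns) // gate_width_ns)
        if 0 <= index < n_gates:
            counts[index] += 1
    return counts

def two_gate_lifetime(count_gate1, count_gate2, gate_width_ns):
    """Lifetime estimate for a mono-exponential decay from two equal-width gates."""
    return gate_width_ns / math.log(count_gate1 / count_gate2)

# Binning a few hypothetical arrival times into two gates of 2 ns width.
counts = gate_counts([0.5, 1.5, 2.5, 5.0, -1.0], 0.0, 2.0, 2)

# Idealized expected counts for a dye with a lifetime of 3 ns.
expected_first = 1000.0
expected_second = expected_first * math.exp(-2.0 / 3.0)
estimated_lifetime = two_gate_lifetime(expected_first, expected_second, 2.0)
```

A dye could then be assigned to the channel whose nominal lifetime lies closest to the estimate, which is one conceivable way of separating dyes with overlapping emission spectra.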
Embodiments of the present invention further relate to a microscope system comprising the device for analyzing a biological sample described above. The microscope system is preferably a lens-free microscope, a light field microscope, widefield microscope, a fluorescence widefield microscope, a light sheet microscope, a scanning microscope, or a confocal scanning microscope.
It is important to note that this depends conceptually on the readout: a readout that provides orthogonal contrasts, such as emission spectrum and fluorescence lifetime, can differentiate more dyes than a readout that provides only one kind of contrast.
In the sense of this document, 124 shall represent not only dimers but multimerized (single-domain) antibodies in general. Such combinations 124 may be engineered in order to achieve specific affinities that are not obtainable otherwise (bispecific reactivity) or to increase avidity.
Direct and indirect immunofluorescence labelling is widely used in life science research and diagnostic applications to analyse the presence of molecular targets such as proteins, RNA, DNA, and other molecules. Typically, such fluorescence-based assays are read out using at least one of a plate reader, a high content screening device, a microscope, a cytometer, or a fluorescence-activated cell sorter.
Read-out devices used to perform fluorescence multicolour imaging typically include at least one light source for generating excitation light and a detection system including at least one detection channel, and may also contain filters and/or dispersive optical elements such as prisms and gratings to route excitation light to the sample and emission from the sample to the detection system and/or onto appropriate areas of the detector.
Read-out devices allow a certain number of dyes to be analysed from a given biological sample in a given run. This number typically depends on the number of detection channels, n, that the read-out device is configured to provide, i.e. is able to spectrally resolve or discern. In the case of microscopes, the number of detection channels is typically 4-5 for camera-based widefield detection (e.g. widefield epifluorescence microscopes, spinning disk microscopes, light sheet fluorescence microscopes) and 5-12 for microscopes with spectral detection concepts that typically rely on excitation or emission fingerprinting and (spectral/linear) unmixing. In order to detect the emission from multiple fluorophores, quantum dots, and/or fluorescent proteins and to assign detected photons to the corresponding fluorophore, quantum dot, and/or fluorescent protein from which they are emitted, read-out devices may employ various strategies. Emission filters are commonly used to direct desired bands of the spectrum to the detector. Multiple emission filters are typically installed on filter wheels such that bands of emission light reaching the detector can be swiftly changed. Alternatively, or in addition dispersive optical elements schematically shown in
At least some affinity reagents a from the plurality of affinity reagents A are uniquely assigned to a combination of dyes s from the plurality of combinations of dyes S1. In both mappings, no combination of dyes s is assigned to more than one affinity reagent a from the plurality of affinity reagents A. However, in a one-to-many mapping, an element aj from the plurality of affinity reagents A may be assigned to more than one combination of dyes s from the plurality of combinations of dyes S1. In this case multiple unordered pairs including aj are formed each corresponding to a given marker. For example, the markers μh={aj,sf}, μh′={aj,sz}, μh″={aj,sr} may be three distinct markers that share the same affinity reagent aj (e.g. in the same round or in different rounds). In such a case a given target would be addressed by multiple markers with different combinations of dyes s. Further the same target may be addressed with multiple markers using distinct affinity reagents a from the plurality of affinity reagents A that bind to the same predetermined structure or analyte. Alternatively, or in addition ordered pairs between affinity reagents a from the plurality of affinity reagents A and combinations of dyes s from the plurality of combinations of dyes S1 are formed. Exemplary illustrations of markers forming bijective pairings are shown in
“Set-based encoding” and “Binary encoding” are based on certain “rules of forming combinations of dyes”. In “set-based encoding” dyes y from the plurality of dyes YD are grouped into sets A to n excitable with the respective excitation lights A to n. A combination is formed by selecting one dye from each set A to n such that the number of combinations of dyes in the plurality of combinations of dyes S1 is maximized and such that each combination of dyes is unique within the plurality of combinations of dyes. In set-based encoding each combination of dyes is an n-tuple with n members. In “binary encoding” the “rule of forming combinations of dyes” is such that each dye from the plurality of dyes y1, y2, y3, . . . yσ corresponds to a particular digit 700 in a binary code. In “binary encoding” according to this preferred embodiment each combination of dyes comprises or contains 1 to yσ members.
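The two rules of forming combinations of dyes may be illustrated by a short sketch. It enumerates set-based codes as the Cartesian product of the sets, i.e. one dye per set, and binary codes as all non-empty subsets of the dyes, with each dye corresponding to one binary digit. The dye names, the set sizes and the function names are hypothetical and serve only as an example.

```python
from itertools import product

def set_based_codes(dye_sets):
    """All n-tuples formed by selecting exactly one dye from each set."""
    return list(product(*dye_sets))

def binary_codes(dyes):
    """All non-empty subsets of dyes; dye i corresponds to binary digit i."""
    codes = []
    for mask in range(1, 2 ** len(dyes)):
        codes.append(tuple(d for i, d in enumerate(dyes) if (mask >> i) & 1))
    return codes

# Hypothetical example with two sets: set A holds three dyes, set B holds two.
dye_sets = [["A1", "A2", "A3"], ["B1", "B2"]]
all_dyes = [dye for dye_set in dye_sets for dye in dye_set]

set_codes = set_based_codes(dye_sets)  # 3 * 2 = 6 codes, each a 2-tuple
bin_codes = binary_codes(all_dyes)     # 2**5 - 1 = 31 codes with 1 to 5 members
```

Even in this small example binary encoding yields considerably more codes than set-based encoding for the same dyes, at the cost of codes with a variable number of members.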
In another preferred embodiment of the invention each dye corresponds to a digit 700 such that each code 704b may comprise/contain a variable number of dyes which is randomly selected and between 1 and yA+yB+yC . . . +yn=yσ. This is named as “binary encoding” in this document and yields 2(y
While combinatorial coding has been described in the prior art, the number of codes that could be obtained was limited, as the number of dyes that could be used was limited to around 5. Using the method disclosed in this document it is feasible to define pluralities of dyes with 25 or more members from commercially available fluorescent dyes that can be read out using commercially available readout devices. Using available detector and dye technology it is feasible, based on the method disclosed herein, to define pluralities of dyes with 120 or more members from available fluorescent dyes that can be read out using dedicated readout devices. This is feasible when, for example, n=8 excitation lines (e.g. excitation lines 1-8: 360 nm, 405 nm, 440 nm, 488 nm, 560 nm, 590 nm, 630 nm, 700 nm) are used with sets of yA=yB=yC . . . =yn=15 dyes, which are grouped into 3 τ classes, such that each τ class holds 5 dyes, which are sufficiently spectrally separated. In this case the cardinality, i.e. the number of combinations of dyes that can be encoded in the set of combinations of dyes, is in the range of 2.56 billion combinations of dyes for the set-based encoding (15^8) and on the order of 1.33×10^36 in the case of binary encoding (2^120). While this number is significantly higher than, for example, 20,000 to 30,000, which is a rough estimate for the number of protein-coding genes in the human genome, it may still be preferable to work with such high numbers of available codes as this allows setting up experiments in a way that only a small percentage would actually be used. If an n=8 and y=15 set were used with this method to analyze 20,000 target molecules, α would be only around 0.00078% of the available codes, given the 2.56 billion combinations of dyes in our example for set-based encoding.
If binary encoding were used for the same example, then α, the fraction of combinations of dyes actually assigned to a marker from the set of unique codes, would be in the range of 0.000000000000000000000000000002%. It is important to note that this is a situation which is already attainable based on existing dyes and readout technology. Working with a very small α means that the entropy of a set of randomly assigned combinations of dyes will be higher, and consequently the decoding will be easier. A small α may also lead to a lower probability of observing type II false-positives a number of times.
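The figures of the n=8, y=15 example can be reproduced with a few lines of arithmetic. The variable names are chosen for illustration; the calculation assumes one dye per set for set-based encoding and one binary digit per dye for binary encoding.

```python
n_sets = 8           # excitation lines / sets A to n
dyes_per_set = 15    # yA = yB = ... = yn
targets = 20_000     # number of target molecules to be analyzed

total_dyes = n_sets * dyes_per_set           # 120 dyes in total
set_based_codes = dyes_per_set ** n_sets     # 15**8 = 2,562,890,625
binary_codes = 2 ** total_dyes               # 2**120, about 1.33e36

# Fraction of available codes actually assigned, expressed in percent.
alpha_set_percent = 100 * targets / set_based_codes    # about 0.00078 %
alpha_binary_percent = 100 * targets / binary_codes    # about 1.5e-30 %
```

This confirms the order of magnitude of the cardinalities and of α stated for both encodings.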
In a preferred embodiment of the invention this strategy is used to improve the readout of spots in densely labeled samples.
As shown in
As illustrated by the bar chart in
Optionally, following the capture of the images or the non-image-based readout, the fluorescent dyes 100 are deactivated in step S1806. Deactivation is done in order to prevent the fluorescent dyes 100 from emitting fluorescence light in the future. Methods for deactivating a fluorescent dye 100 include bleaching the fluorescent dye 100, either by chemically inactivating the fluorescent dye 100 or by photophysical bleaching; or removing the fluorescent dye 100 from the sample. In order to remove the fluorescent dye 100 from the sample, the connection between the primary affinity reagent 108 to 118 and the predetermined structure or target molecule or analyte 900 has to be severed. This can be done for example by antibody elution in case the affinity reagent is an antibody 108 to 118. Alternatively, the fluorescent dye 100 could be removed from either the primary affinity reagent 108 to 118 or the secondary affinity reagent (not shown for sake of clarity). This can be done for example through enzymatic cleaving at a cleavage site 904 of the peptide 4802 or oligonucleotide 402 binding that connects the fluorescent dye 100 and the affinity reagent 108 to 118. It is also possible to reversibly bind the fluorescent dye 100 to the affinity reagent 108 to 118, e.g. through oligonucleotide hybridization and the use of barcoded antibodies. In the case of oligonucleotides 116, which are hybridization-based, the oligonucleotides may be hybridized and dehybridized by using standard in situ hybridization or fluorescent in situ hybridization (FISH) protocols. In this case it is possible to hybridize the fluorescent dyes onto the linker and bring the fully constituted marker into the sample. Alternatively, or in addition, one could stain the sample with the affinity reagents 108 to 118 bearing a barcoded linker 902 and add the dyes at a later point in time. Unlabeled primary affinity reagents 122 to 132 are likewise shown in
Certain steps may be omitted or repeated, other steps not shown in the
In a step S2202 the image data is pre-processed, which may include background removal, by means of a variety of feature detection methods including segmentation and filtering. The constituent parts of the marker may be identified by their keypoint features, edges, and interest points/feature points. Feature or interest points may be any detectable object, for example a microbead 2000. A keypoint feature could be all the objects that have a certain neighbourhood for instance. For example, microbeads 2000 may be segmented and their centre of mass determined.
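The segmentation of microbeads and the determination of their centre of mass may, purely as an example, be sketched as follows. The sketch applies a simple intensity threshold, labels 4-connected regions by flood fill and computes the intensity-weighted centre of mass of each region. The threshold, the toy image and the function name are assumptions made for illustration; any of the segmentation approaches named in this document could be used instead.

```python
from collections import deque

def segment_and_locate(image, threshold):
    """Return intensity-weighted centres (row, col) of 4-connected bright regions."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centres = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or image[r0][c0] <= threshold:
                continue
            # Flood fill one region starting from this seed pixel.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            total = sum_r = sum_c = 0.0
            while queue:
                r, c = queue.popleft()
                weight = image[r][c]
                total += weight
                sum_r += weight * r
                sum_c += weight * c
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and not seen[rr][cc] and image[rr][cc] > threshold):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            centres.append((sum_r / total, sum_c / total))
    return centres

# Toy image with two bright objects on a dark background (hypothetical values).
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 5, 0, 0, 0],
    [0, 5, 5, 0, 0, 0],
    [0, 0, 0, 0, 8, 0],
    [0, 0, 0, 0, 0, 0],
]
centres = segment_and_locate(image, threshold=1)
```

For the toy image, the two objects are located at (1.5, 1.5) and (3.0, 4.0) in row and column coordinates.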
The image segmentation analysis in step S2204 can be carried out with at least one of: classical approaches, artificial intelligence based techniques including machine learning and neural-networks/deep learning, or other techniques including thresholding techniques, clustering methods, compression-based methods, histogram-based methods, edge detection, dual clustering method, region-growing methods, partial differentiation equation-based methods, variational methods, graph partitioning methods (for example Markov random fields), a watershed transformation, model-based segmentation, multi-scale segmentation, semi-automatic segmentation, trainable segmentation using various machine learning, neural network and artificial intelligence approaches for example pulse-coupled neural networks (PCNNs), and convolutional neural network (U-Net), recurring neural networks (RNNs) as well as object co-segmentation methods such as Markov networks, convolutional neural networks, or long short-term memory (LSTM), for example. Alternatively, or in addition, characteristics such as size and/or colour and/or fluorescent intensity and/or fluorescent lifetime can be used to identify the constituent parts of the marker from the image data. Various algorithms can be used for identification including Harris Corner, scale invariant feature transform (SIFT), speeded up robust feature (SURF), features from accelerated segment test (FAST), and oriented FAST and rotated BRIEF (ORB) are known and can be used to identify the constituent parts and/or features of the marker from the image data.
It is preferable, that spot detection and/or feature extraction and/or feature classification analysis S2206 are performed leveraging machine and deep learning approaches, such as for example content aware feature enhancement. This is especially advantageous when the marker is generated using fluorescent microbeads, fluorescent nanorulers or similar structures as this allows neural networks to be pre-trained to perform content aware feature enhancement specifically for these features. Likewise, in this case substantial pre-training can be easily performed against a ground truth, i.e. image data from the hydrogel beads containing only the marker. Importantly, the ground truth can be generated on a different imaging system. In particular the ground truth can be generated on an imaging system that has a high optical performance with respect to e.g. numerical aperture, resolution, light-collecting efficiency, Etendue (flux), signal-to-noise, chromatic or spherical aberrations as well as any other imaging aberration. Importantly, the method can be implemented in a way that each run generates respective training data that potentially improves the quality of the generated image data by improving for example denoising, background removal, image correction, deconvolution, amongst others as well as the performance of feature extraction. Similarly, networks can be pre-trained using suitable reference samples to classify features. In particular networks can be pre-trained to classify a feature as the hydrogel bead, a fluorescent microbead, a fluorescent nanoruler, or similar, and a cell, a group of cells, or a different kind of biological sample.
Following feature extraction and classification, the spots are read out and their combinations of dyes are decoded in step S2208 by looking up the identity of the corresponding marker and its target molecule/predetermined structure in the memory device. Each spot labelled with a certain combination of dyes is then counted and the result is stored alongside the intensity information in a memory device. The process ends in step S2210.
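By way of non-limiting illustration, the lookup and counting of step S2208 may be sketched as follows. The codebook contents, the dye and marker names and the function decode_spots are hypothetical; a Python dictionary stands in for the memory device.

```python
from collections import Counter

# Hypothetical codebook standing in for the memory device: each combination
# of dyes maps to a marker identifier and its target structure.
CODEBOOK = {
    frozenset({"dye_A", "dye_B"}): ("marker_1", "target_X"),
    frozenset({"dye_A", "dye_C"}): ("marker_2", "target_Y"),
    frozenset({"dye_B", "dye_C", "dye_D"}): ("marker_3", "target_Z"),
}


def decode_spots(spots, codebook=CODEBOOK):
    """Decode spots given as (detected dyes, intensity) pairs.

    Returns the count per marker, the intensities stored per marker, and the
    number of spots whose dye combination is not in the codebook.
    """
    counts = Counter()
    intensities = {}
    unassigned = 0
    for dyes, intensity in spots:
        entry = codebook.get(frozenset(dyes))  # order of dyes is irrelevant
        if entry is None:
            unassigned += 1
            continue
        marker, _target = entry
        counts[marker] += 1
        intensities.setdefault(marker, []).append(intensity)
    return counts, intensities, unassigned


# Illustrative spots, not measured data.
spots = [
    (["dye_A", "dye_B"], 120.0),
    (["dye_B", "dye_A"], 95.5),
    (["dye_A", "dye_C"], 40.0),
    (["dye_D"], 10.0),
]
counts, intensities, unassigned = decode_spots(spots)
```

Representing each combination as an unordered set reflects that a marker is identified by which dyes are present rather than by the order in which they are detected.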
In a preferred embodiment of the invention, the detector and the light source are configured to perform spectral fluorescence lifetime imaging, which may be used for example with spectral FLIM phasors and provides a high dye separation capacity of the readout device.
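By way of non-limiting illustration, the lifetime part of a phasor analysis may be sketched as follows, using the commonly used definition of the phasor coordinates g and s as the intensity-weighted cosine and sine transforms of a fluorescence decay. The sketch assumes that the decay is recorded essentially to completion and that a single modulation frequency is used; the 80 MHz repetition rate, the 2 ns lifetime and the function names are illustrative assumptions and are not taken from the described embodiments. The spectral dimension of a spectral FLIM phasor is not covered.

```python
import math


def phasor_from_decay(times_ns, counts, omega):
    """Phasor coordinates (g, s) of a sampled decay at angular frequency omega.

    g and s are the intensity-weighted averages of cos(omega*t) and
    sin(omega*t) over the sampled decay.
    """
    total = sum(counts)
    g = sum(n * math.cos(omega * t) for t, n in zip(times_ns, counts)) / total
    s = sum(n * math.sin(omega * t) for t, n in zip(times_ns, counts)) / total
    return g, s


def phasor_mono_exponential(tau_ns, omega):
    """Closed-form phasor of an ideal mono-exponential decay with lifetime tau."""
    x = omega * tau_ns
    return 1.0 / (1.0 + x * x), x / (1.0 + x * x)


# Assumed example values: 80 MHz modulation, 2 ns lifetime, 10 ps bins.
omega = 2.0 * math.pi * 0.08            # rad/ns
dt = 0.01                               # ns
times = [k * dt for k in range(10000)]  # 0 to 100 ns
decay = [math.exp(-t / 2.0) for t in times]

g_num, s_num = phasor_from_decay(times, decay, omega)
g_ref, s_ref = phasor_mono_exponential(2.0, omega)
```

Dyes with different lifetimes fall at different positions in the (g, s) plane, which is one way in which lifetime information can contribute to dye separation; mono-exponential decays lie on the semicircle of radius 0.5 centred at (0.5, 0).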
Alternatively, or in addition, the capture beads 2000a may be incubated with a lysate of a cell or the lysate of a sample or an environmental sample to detect the presence of an analyte in the sample. In a preferred embodiment of the method, this is used to detect the presence of a large number of analytes in the same experiment using the described bead-based format and cytometry as a readout.
Optionally, the capture beads 2000a may be sorted using a fluorescence activated cell sorter (FACS) 2502, which uses deflector plates 2504 that direct beads into respective collection tubes 2506.
In a preferred embodiment of the invention the assays described in
In a further preferred embodiment of the invention the assays described in
In a further preferred embodiment of the invention the assays described in
In contrast to a mono-species readout volume a multi-species readout volume cannot simply be decoded by obtaining the readout sequence and retrieving the underlying markers. As illustrated in
The following discussion first addresses strategies to mitigate this problem and then turns to a robust solution of it.
In terms of mitigating the challenge to decode multi-species readout volumes, it is possible to use blinking of dyes and methods that are used for localization microscopy, which effectively turns a multi-species readout volume into a number of temporally separated sub-diffraction localized mono-species readout volumes. This is depicted by only one dye lighting up in each of the PSFs shown in
In a preferred embodiment of the invention, the challenge of multi-species readout volume decoding is solved in principle. As illustrated in
In step S3716, the first and second sets of combinations of dyes subsumable under the first and the second readout sequence, respectively, are compared with each other, and the overlap is established and stored in a memory device. In step S3720, a statistical confidence in the decoding result is evaluated, and p values and/or another suitable measure of statistical confidence are calculated for each marker in the overlap. This may include the use of intensity information as well as further information from multiple measurements in overlapping confocal volumes/effective point spread functions. If the p values are acceptable to the user in step S3722, the process can be ended in step S3724. Alternatively, further iterations may be performed as indicated by the arrow pointing back to step S3708.
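By way of non-limiting illustration, the comparison of step S3716 may be sketched as follows. The sketch assumes that a combination of dyes is "subsumable" under a readout sequence when all of its dyes are contained in that readout sequence, and that the two rounds are compared at the level of the affinity reagents to which the combinations are assigned. The reagent names, dye names and assignments are hypothetical, and the statistical evaluation of step S3720 is not shown.

```python
def candidates(readout_dyes, assignment):
    """Affinity reagents whose assigned dye combination is fully contained
    in the set of dyes observed in the readout volume."""
    observed = set(readout_dyes)
    return {reagent for reagent, combo in assignment.items()
            if set(combo) <= observed}


# Hypothetical assignments of dye combinations to affinity reagents.
# Between the rounds the combinations are switched, e.g. by melting and
# re-hybridizing reporters, while the affinity reagents stay bound.
round1 = {
    "anti_A": {"d1", "d2"},
    "anti_B": {"d2", "d3"},
    "anti_C": {"d1", "d3"},
    "anti_D": {"d4", "d5"},
}
round2 = {
    "anti_A": {"d1", "d4"},
    "anti_B": {"d2", "d5"},
    "anti_C": {"d3", "d4"},
    "anti_D": {"d1", "d3"},
}

# Illustrative readout sequences from one multi-species readout volume
# that actually contains anti_A and anti_B.
readout1 = {"d1", "d2", "d3"}
readout2 = {"d1", "d2", "d4", "d5"}

first_set = candidates(readout1, round1)
second_set = candidates(readout2, round2)
overlap = first_set & second_set
```

In this toy case the first round alone cannot exclude anti_C, whereas the overlap of both rounds retains only anti_A and anti_B; further rounds would narrow the overlap in less favourable cases.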
Alternatively, or in addition, the linker may carry a unique oligonucleotide sequence barcode (UOSB), which identifies a certain combination of dyes and can be used to attach the reporter in a flexible way to any affinity reagent carrying a sequence complementary to the respective UOSB. Using this strategy, or simply attaching the reporter by means of a generic oligonucleotide sequence to an oligonucleotide conjugated to the affinity reagent, is advantageous as it allows the flexible attaching and removing of dyes by means of hybridization and melting. Using oligonucleotides to connect antibody and linker by means of hybridization is preferable, as it allows the affinity reagent to be brought into, and to stay inside, the sample independently of the reporter. Furthermore, once affinity reagents are bound to their target structures, this strategy allows easy switching of combinations of dyes as part of the iterative multi-spot decoding process, as it does not necessitate the removal of the affinity reagents from the bound target.
In a preferable embodiment of the invention shown in
In a further preferred embodiment of the invention shown in
In a further preferred embodiment of the invention shown in
In a further preferred embodiment of the invention shown in
In a further preferred embodiment of the invention shown in
In a further embodiment of the invention shown in
As shown in the embodiments above reporters can be linked in various ways to the affinity reagent.
In another embodiment of the present invention the linker comprises at least a combination of an oligonucleotide and a peptide sequence and may be referred to as oligonucleotide-peptide-based linker 4802.
In another preferred embodiment, the linker is a nanostructure 4500 and in particular a DNA-origami-based structure or a nanoruler and may be referred to as nanostructure-based or nanoruler-based linker 4804.
In another preferred embodiment of the present invention a peptide-based linker 4806 is used.
In a preferred embodiment of the invention a linker 4808 comprising at least one nano-/microbead is used.
Reporters may be either directly conjugated to the affinity reagents through standard coupling chemistries such as NHS, maleimide, or various “click chemistries” such as azide-alkyne coupling, or they may be non-covalently linked using for instance nucleic acids and hybridization between a UOSB and a complementary sequence, or a high-affinity interaction between an affinity ligand 4900 and an affinity tag 4902 such as for example the biotin-streptavidin interaction as depicted in
An optional imaging unit 5006 of the device 5000 is configured to generate readouts which may be images or non-image-based readouts from the fluorescence light emitted by the excited dyes 100. The imaging unit 5006 comprises an objective directed at the sample for capturing the fluorescence light. The captured fluorescence light is then directed onto a detection unit 2320, 5012a, 5012b by the beam splitting unit 2316. The detection unit 2320, 5012a, 5012b comprises at least one detector element and a diffractive element 204, 206 or filters for splitting the fluorescence light into different detection channels as shown in
After imaging the sample, the fluorescent dyes 100 might need to be deactivated. This can be done for example by photobleaching the fluorescent dyes 100 with coherent light emitted by at least one of the light sources of the excitation unit 2314, 5002a, 5002b. Alternatively, a bleaching agent for chemically deactivating the fluorescent dyes 1320 can be introduced into the sample 1002 with the staining unit 5004. Further, it is possible to remove the fluorescent dye 100 from either the primary or secondary affinity reagent. This can be done for example by introducing an enzymatic cleaving agent into the sample with the staining unit 5004. Alternatively, or in addition, the fluorescent dye 100 may be deactivated by antibody elution or by dehybridization (i.e. melting) and elution in the case of fluorescently labeled oligonucleotides. Thus, the excitation unit 5002a, 5002b and/or the staining unit 5004 form a marker deactivation unit configured to deactivate at least one set of markers present in the sample.
As used herein the term “and/or” includes any and all combinations of one or more of the associated listed items and may be abbreviated as “/”.
Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
Some embodiments relate to a microscope comprising a system as described in connection with one or more of the
The computer system 5016 may be a local computer device (e.g. personal computer, laptop, tablet computer or mobile phone) with one or more processors and one or more storage devices or may be a distributed computer system (e.g. a cloud computing system with one or more processors and one or more storage devices distributed at various locations, for example, at a local client and/or one or more remote server farms and/or data centers). The computer system 5016 may comprise any circuit or combination of circuits. In one embodiment, the computer system 5016 may include one or more processors which can be of any type. As used herein, processor may mean any type of computational circuit, such as but not limited to a microprocessor, a microcontroller, a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a graphics processor, a digital signal processor (DSP), multiple core processor, a field programmable gate array (FPGA), for example, of a microscope or a microscope component (e.g. camera) or any other type of processor or processing circuit. Other types of circuits that may be included in the computer system 5016 may be a custom circuit, an application-specific integrated circuit (ASIC), or the like, such as, for example, one or more circuits (such as a communication circuit) for use in wireless devices like mobile telephones, tablet computers, laptop computers, two-way radios, and similar electronic systems. The computer system 5016 may include one or more storage devices, which may include one or more memory elements suitable to the particular application, such as a main memory in the form of random access memory (RAM), one or more hard drives, and/or one or more drives that handle removable media such as compact disks (CD), flash memory cards, digital video disk (DVD), and the like. 
The computer system 5016 may also include a display device, one or more speakers, and a keyboard and/or controller, which can include a mouse, trackball, touch screen, voice-recognition device, or any other device that permits a system user to input information into and receive information from the computer system 5016.
Some or all of the method steps may be executed by (or using) a hardware apparatus, like for example, a processor, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a non-transitory storage medium such as a digital storage medium, for example a floppy disc, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine readable carrier.
Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
In other words, an embodiment of the present invention is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
A further embodiment of the present invention is, therefore, a storage medium (or a data carrier, or a computer-readable medium) comprising, stored thereon, the computer program for performing one of the methods described herein when it is performed by a processor. The data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory. A further embodiment of the present invention is an apparatus as described herein comprising a processor and the storage medium.
A further embodiment of the invention is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example, via the internet.
A further embodiment comprises a processing means, for example, a computer or a programmable logic device, configured to, or adapted to, perform one of the methods described herein.
A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
In some embodiments, a programmable logic device (for example, a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.
The following is a non-exhaustive list of numbered embodiments:
1. A method for analyzing a biological sample or a chemical compound or a chemical element, the method comprising the steps of:
2. The method according to embodiment 1, wherein the determination of the presence of affinity reagents in the readout volume is established based on a measure or an estimation of a statistical confidence.
3. The method according to embodiment 1 or 2, wherein the plurality of combinations of dyes (S1) is mapped uniquely to the plurality of affinity reagents (A=S2=T1*) using at least one code (Cα1 to Cαn) and/or at least one cipher (Xα1 to Xαn), wherein C: S->T* or X: S->T* are total functions which are preferably bijective or at least injective, wherein S2 is the “source alphabet” and T1 is the “target alphabet”, and wherein S2 and T1 are finite sets.
4. The method according to embodiment 1 or 2, wherein the plurality of affinity reagents (A=S2=T1*) is mapped uniquely to the plurality of combinations of dyes S1 using at least one code Cβ1 to Cβn and/or at least one cipher Xβ1 to Xβn, wherein C: S->T* or X: S->T* are total functions which are preferably bijective or at least injective, wherein S2 is the “source alphabet” and T1 is the “target alphabet”, and wherein S2 and T1 are finite sets.
5. The method according to any of the preceding embodiments,
6. The method according to any of the preceding embodiments, wherein the affinity reagent is configured to allow attaching at least one combination of dyes (si) selected from a plurality of combination of dyes (S1) in a reversible manner.
7. The method according to any of the preceding embodiments, comprising providing a plurality of reporters (R), wherein each reporter (r1, r2, r3, . . . rn) comprises a linker and a combination of dyes (si).
8. The method according to embodiment 1, wherein a plurality of dyes (YD) formed by all fluorescent dyes of the plurality of combination of dyes (S1) comprises at least 10, 20, 50, 100, 1000, or 10000 different fluorescent dyes.
9. The method according to embodiment 1 or 2, wherein the steps c) and d), preferably the steps c) to g), are repeated for two or more different readout volumes of the sample.
10. The method according to any of the preceding embodiments, wherein each marker (μi) comprises a linker having at least two different attachment sites, the combination of attachment sites being unique to the marker; and wherein each dye is connected to a complementary linker to form a reporter, the complementary linker being unique to the dye and configured to attach to a predetermined attachment site.
11. The method according to embodiment 10, wherein the linker and/or the complementary linkers are oligonucleotides.
12. The method according to embodiment 10 or 11, wherein the linker and/or complementary linkers contain a site for enzymatic cleavage or photolysis.
13. The method according to one of the embodiments 10 to 12, wherein the reporters are attached to their respective attachment sites before the markers are introduced into the sample.
14. The method according to one of the embodiments 10 to 13, wherein at least two readouts are generated; and wherein the reporters are dynamically associated with and/or dissociated from their respective attachment sites between the generation of the first and second readouts in order to achieve a stochastic labeling.
15. The method according to embodiment 14, wherein the stochastic labeling is achieved by DNA-PAINT.
16. The method according to embodiment 13 or 14, wherein the stochastic labeling is achieved by a blinking method, for example by super resolution microscopy such as STORM, PALM, GSDIM or a related method which leverages blinking.
17. The method according to any of the preceding embodiments, wherein the plurality of dyes formed by all fluorescent dyes of the markers is divided into sets of dyes A to n, with yA to yn members, with yA+yB+yC+ . . . yn=yσ, with y being a natural number and yσ being the total number of dyes in the plurality of dyes (YD); wherein each dye in the same set can be excited by essentially one wavelength spectrum or by the same wavelength spectrum;
18. The method according to embodiment 17, wherein the excitation lights are directed onto the sample in a sequence temporally following each other.
19. The method according to any of the preceding embodiments, wherein the readout is an image or a readout image data stream of the readout volume.
20. The method according to any of the preceding embodiments, comprising the further step of capturing a hyperspectral image of the sample.
21. The method according to any of the preceding embodiments, comprising the further step of stabilizing the fluorescence lifetime of at least one fluorescent dye, for example by placing the fluorescent dye in a shielded environment by at least one of encapsulating, polymer-matrix embedding, and co-crystallizing.
22. The method according to any of the preceding embodiments, wherein the step of generating the channels is based on at least one of channel unmixing, spectral unmixing, excitation spectral imaging, spectral phasor analysis, spectral FLIM phasor, a fluorescence lifetime of the fluorescent dyes and an excitation fingerprint of the fluorescent dyes.
23. The method according to any of the preceding embodiments, wherein the step of generating the channels is based on at least two orthogonal contrasts.
24. The method according to any of the preceding embodiments, wherein the step of generating the channels is based on at least one of machine learning, deep learning or artificial intelligence.
25. The method according to any of the preceding embodiments, comprising the further step of deactivating at least one marker.
26. The method according to embodiment 25, wherein the deactivating step is done by at least one of bleaching at least one fluorescent dye of the at least one marker and removing the at least one marker from the sample, preferably by at least one of dissociating or cleaving the fluorescent dye from the affinity reagent or dissociating the affinity reagent from the target structure.
27. The method according to any of the preceding embodiments, wherein the following steps are repeated at least twice in order to create a series of images or readouts of the sample: providing a second plurality of markers, introducing the second plurality of markers into the sample, directing the at least one excitation light onto the sample, generating the at least one readout, and determining the markers present in the readout volume; or wherein the steps a) to e) of embodiment 1 are repeated at least twice.
28. The method according to any of the preceding embodiments, wherein each marker comprises a linker having at least two different attachment sites, the combination of attachment sites being unique to the marker; and wherein each dye is connected to a complementary linker to form a reporter, the complementary linker being unique to the dye and configured to attach to a predetermined attachment site.
29. The method according to embodiment 28, wherein the reporters labeling the second plurality of markers comprise combinations of dyes that were determined based on the first series of images or readouts of the sample.
30. The method according to embodiment 28 or 29, wherein the reporters are assembled by adding a mix of dyes, wherein each dye is connected to a complementary linker to form reporters with linker molecules containing dye-specific attachment sites for all dyes in the plurality of dyes, such that adding a mix of dyes corresponding to a unique combination of dyes to a linker molecule in a coupling reaction volume leads to a stoichiometric coupling.
31. The method according to embodiment 30, wherein the reporters are assembled by adding a mix of dyes, wherein each dye is connected to a complementary linker to form reporters with linker molecules containing dye-unspecific attachment sites for all dyes in the plurality of dyes, such that adding a mix of dyes corresponding to a unique combination of dyes to a linker molecule in a coupling reaction volume leads to a stochastic coupling.
32. The method according to any of the preceding embodiments, wherein the excitation light is coherent light.
33. The method according to any of the preceding embodiments, wherein the excitation light comprises a wavelength range being smaller than 50 nm, smaller than 30 nm, smaller than 10 nm or a single wavelength.
34. A device for analyzing a biological sample being adapted to carry out the method according to one of the embodiments 1 to 33.
35. The device according to embodiment 34, comprising a microscope, preferably a lens-free microscope, a light field microscope, a widefield microscope, a fluorescence widefield microscope, a light sheet microscope, a scanning microscope, or a confocal scanning microscope, a plate reader, a cytometer, an imaging cytometer, or a fluorescence activated cell sorter configured to generate the at least one readout.
36. The device according to embodiment 34 or 35, configured to determine a fluorescence emission intensity, a fluorescence lifetime, an emission spectrum, an excitation fingerprint, and/or a fluorescence anisotropy of fluorescent dyes in the sample.
37. The device according to any of the embodiments 34 to 36, wherein a separation of the readout into the at least two channels is done by at least one of a spectrometer comprising a prism or a grating and at least one detector.
38. The device according to any of the embodiments 34 to 37, comprising a time-sensitive detector.
39. The device according to any of the embodiments 34 to 38, comprising a memory device for storing a unique identifier that identifies the affinity reagent, the predetermined structure, and the unique combination of dyes for each marker.
40. The device according to any of the embodiments 34 to 39, comprising a calibration unit configured to receive fluorescence light emitted by the excited dye, and to generate calibration data based on the received fluorescence light; wherein the at least one readout is generated based on the calibration data.
41. The method according to any of the preceding embodiments, wherein no combination of dyes is assigned to more than one affinity reagent.
42. A database comprising information about at least one of the affinity reagents; characteristics about the affinity agents; the plurality of dyes; the plurality of combinations of dyes; the characteristics of each combination of dyes; the plurality of markers; characteristics about the markers; linkers; complementary linkers; reporters; and information about the assignment of each affinity reagent of the plurality of affinity reagents to at least one combination of dyes from the plurality of combinations of dyes; which might be necessary to carry out the method of one of the embodiments 1 to 33 or which might be necessary to operate the device of one of the embodiments 34 to 41.
43. A plurality of combinations of dyes as composed in accordance of step c) of embodiment 1 or with the characteristics as described in one of the embodiments 1 to 41.
44. A plurality of combination of dyes as composed in accordance of step c) of embodiment 1 or with the characteristics as described in one of the embodiments 1 to 41.
45. A device adapted to carry out the method of one of the embodiments 1 to 33.
46. A computer program with a program code for performing the method according to one of the embodiments 1 to 33 or for operating the device of one of the embodiments 34 to 41.
47. A computer-readable medium comprising the computer program of embodiment 46.
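By way of non-limiting illustration, the unique mapping required by embodiments 3, 4 and 41, in which no combination of dyes is assigned to more than one affinity reagent, may be checked and inverted for decoding as sketched below. The assignment contents and the function names are hypothetical, and a Python dictionary stands in for the memory device or database.

```python
def is_injective(assignment):
    """True if no combination of dyes is assigned to more than one reagent."""
    seen = set()
    for combo in assignment.values():
        key = frozenset(combo)  # a combination is an unordered set of dyes
        if key in seen:
            return False
        seen.add(key)
    return True


def invert(assignment):
    """Build the decoding table (combination of dyes -> affinity reagent).

    Raises ValueError if the assignment is not injective, because decoding
    would then be ambiguous.
    """
    if not is_injective(assignment):
        raise ValueError("a combination of dyes is assigned more than once")
    return {frozenset(combo): reagent for reagent, combo in assignment.items()}


# Hypothetical assignments for illustration only.
valid = {
    "anti_A": {"d1", "d2"},
    "anti_B": {"d1", "d3"},
    "anti_C": {"d2", "d3"},
}
ambiguous = {
    "anti_A": {"d1", "d2"},
    "anti_B": {"d2", "d1"},  # same combination as anti_A, order is irrelevant
}

decoding_table = invert(valid)
```

An injective assignment guarantees that each observed combination of dyes identifies at most one affinity reagent, which is the property the decoding lookup relies on.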
While subject matter of the present disclosure has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. Any statement made herein characterizing the invention is also to be considered illustrative or exemplary and not restrictive as the invention is defined by the claims. It will be understood that changes and modifications may be made, by those of ordinary skill in the art, within the scope of the following claims, which may include any combination of features from different embodiments described above.
The terms used in the claims should be construed to have the broadest reasonable interpretation consistent with the foregoing description. For example, the use of the article “a” or “the” in introducing an element should not be interpreted as being exclusive of a plurality of elements. Likewise, the recitation of “or” should be interpreted as being inclusive, such that the recitation of “A or B” is not exclusive of “A and B,” unless it is clear from the context or the foregoing description that only one of A and B is intended. Further, the recitation of “at least one of A, B and C” should be interpreted as one or more of a group of elements consisting of A, B and C, and should not be interpreted as requiring at least one of each of the listed elements A, B and C, regardless of whether A, B and C are related as categories or otherwise. Moreover, the recitation of “A, B and/or C” or “at least one of A, B or C” should be interpreted as including any singular entity from the listed elements, e.g., A, any subset from the listed elements, e.g., A and B, or the entire list of elements A, B and C.
Number | Date | Country | Kind |
---|---|---|---|
PCT/EP2021/063310 | May 2021 | WO | international |
PCT/EP2021/066645 | Jun 2021 | WO | international |
This application is a U.S. National Phase application under 35 U.S.C. § 371 of International Application No. PCT/EP2021/073819, filed on Aug. 28, 2021, and claims benefit to International Patent Application No. PCT/EP2021/066645, filed on Jun. 18, 2021 and International Patent Application No. PCT/EP2021/063310, filed on May 19, 2021. The International Application was published in English on Nov. 24, 2022 as WO 2022/242887 A1 under PCT Article 21(2).
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/073819 | 8/28/2021 | WO |