This invention generally relates to techniques for fluorescence labelling, and to methods, apparatus and computer program code implementing numerical algorithms for processing fluorescence signal data. The techniques we describe are particularly useful in biotechnology applications.
A common biological problem is the measurement of the optical emission from spatially coincident fluorophores (dyes). Imaging the functional components of a living cell often involves the registration of multiple fluorescent markers. Quantifying the hybridisation of labelled nucleic acids (probes) to immobilised target molecules in a microarray (“gene chip”) can also require the simultaneous detection of multiple-component fluorescent spectra.
We have previously described, in WO 03/023376 (hereby incorporated by reference in its entirety), cryogenic detector technology, in particular employing a superconducting tunnel junction (STJ), for the detection of a fluorescent signal from, for example, a DNA (deoxyribonucleic acid) microarray. An STJ device is sensitive over a range of wavelengths (colours or energies), generally down to the single photon level, as well as exhibiting a highly linear response and high signal-to-noise ratio. The energy-resolving capability of the STJ in the optical band (embodiments of the device may be described as hyperspectral) facilitates simultaneous multi-colour detection of hybridisation to microarrays for applications such as drug discovery.
In a typical microarray experiment two samples or targets are reverse transcribed into cDNA (complementary DNA) and labelled using different fluorescent dyes. The DNA microarray comprises an array of DNA sequences which act as probes, and the targets are mixed and hybridised with these probes and then, after removal of excess unbound material by washing, the microarray is imaged, generally using a scanner which responds to the fluorescence signal at each of the array spots. The differential hybridisation of the two targets to a probe sequence is, broadly speaking, determined by the ratio of the fluorescence intensities at the spot on the microarray for the probe sequence. In this way, the relative abundance of each of the probe sequences in the two targets may be assessed. There are a number of variants of this basic technique. Currently the majority of microarrays comprise DNA (which here includes cDNA), but microarrays may also be fabricated using RNA (ribonucleic acid), proteins, antibodies, antigens and the like.
A conventional scanner typically employs a photomultiplier to record the signals from the microarray but we have described how an STJ detector device can be used to provide substantial improvements in performance (ibid; also Review of Scientific Instruments, Volume 74, Number 9, September 2003, “Detection of multiple fluorescent labels using superconducting tunnel junction detectors”, G. W. Fraser, J. S. Heslop-Harrison, T. Schwarzacher, A. D. Holland, P. Verhoeve and A. Peacock). For example, one detector used in a study of biological fluorescence was a single 30×30 μm² STJ with 100 nm thick Ta layers and 30 nm thick Al layers on either side of the tunnel barrier. The detector was made using photolithographic techniques from a Ta/Al multilayer deposited on a polished sapphire substrate. Cooling to 300 mK in a ³He cryostat (i.e., T ~ Tc/15, where Tc denotes the superconducting transition temperature) kept the thermally excited quasiparticle current well below the leakage current level. The STJ had a measured resolving power (λ/Δλ) of 14.1 at 600 nm. Samples were stimulated with a Leica microscope with mercury lamp excitation. Preferential selection of colour from fluorophore samples could be made using an Omega triple filter set, which gives transmission in narrow bands centred on 450 nm (blue), 520 nm (green) and 620 nm (red). The integration times were ~30 s.
Such Ta/Al devices (resolving power ~10-20) are capable of simultaneously measuring at least four well-separated fluorophores. STJ devices with a smaller band gap and a lower operating temperature offer better resolving power (e.g. Hf with R ~ 80 or Mo with R ~ 40) and are potentially even better. The modest throughput of single pixel STJs can be improved by the development of large format arrays. (See Nuclear Instruments and Methods in Physics Research A 559 (2006) 782-784, “Optical fluorescence of biological samples using STJs”, G. W. Fraser, J. S. Heslop-Harrison, T. Schwarzacher, P. Verhoeve, A. Peacock and S. J. Smith.)
Typically, a scanner will include an image capture/processing system, for example based upon a digital signal processor or a suitably programmed general purpose computer, and the fluorescence signals from the imaged microarray are typically output as a colour image file in an industry standard format, such as a 16-bit GIF (graphics interchange format) or TIFF (tagged image file format) format. Different fluorophores have different emission peaks and, in general, different (shorter-wavelength) absorption peaks. The scanner may either excite both absorption peaks simultaneously with a single wavelength and then read both emission wavelengths simultaneously, or the microarray may be scanned at first one absorption wavelength and then at the other(s). Often lasers are employed to excite the fluorescence. It is generally preferable to read multiple fluorescence signals (colours) simultaneously, as re-scanning can damage the hybridised entities and can, in particular, cause photo bleaching. Generally, passband filtering is employed to discriminate between the excitation illumination and the fluorescent emission as well as, optionally, between the different fluorescence signals.
Background material relating to microarray data analysis can be found in “Microarray data analysis: from disarray to consolidation and consensus”, David B. Allison, Xiangqin Cui, Grier P. Page and Mahyar Sabripour, NATURE REVIEWS, GENETICS, Volume 7, January 2006, pages 55-65; and “Improving false discovery rate estimation”, Bioinformatics 20(11), 2004, pages 1737-1745. Material published after the earliest priority date of this application can be found in “Speed-mapping quantitative trait loci using microarrays”, Chao-Qiang Lai, Jeff Leips, Wei Zou, Jessica F Roberts, Kurt R Wollenberg, Laurence D Parnell, Zhao-Bang Zeng, Jose M Ordovas & Trudy F C Mackay, NATURE METHODS, Vol. 4 No. 10, October 2007, pages 839-841; and “HoughFeature, a novel method for assessing drug effects in three-color cDNA microarray experiments”, Hongya Zhao and Hong Yan, 17 Jul. 2007, BMC Bioinformatics 2007, 8:256, doi: 10.1186/1471-2105-8-256.
Conventionally, the processing of fluorescence signals from a microarray has been based upon some implicit assumptions, in particular that there is a linear relationship between the fluorescent signals from a particular spot and the relative abundances of the labelled sample or target. The inventors have, however, recognised that this is not generally true and that coupling between two different fluorophores will introduce non-linearities. The inventors have further recognised that non-linearities can occur even in a single fluorophore system. The degrees of non-linearity will in part depend upon the fluorophores employed. We will describe techniques by which these non-linearities can be taken into account. More particularly, we will describe both techniques for improved processing of signals from entities labelled or associated with two different fluorophores, and techniques relating to the determination of an optimum degree of labelling i.e. one which produces maximum brightness.
According to a first aspect of the invention, there is therefore provided a method of determining respective first and second degree-of-labelling signals for different respective first and second fluorophores associated with a common entity, the method comprising: determining a first fluorescence signal from said first and second fluorophores under first conditions; determining a second fluorescence signal from said first and second fluorophores under second conditions different to said first conditions; and determining said first and second degree-of-labelling signals for said first and second fluorophores from said first and second fluorescence signals; and wherein said determining of said first and second degree-of-labelling signals is responsive to at least one coupling value (c12; c21) representing a coupling of energy between said fluorophores, or the absorption of light emitted by one fluorophore by the other fluorophore.
The skilled person will understand that in a microarray experiment signal intensities (corrected for the effects of non-linearity) are measured. The degree of labelling may be, for example, either the number of fluorophores on a particular molecule or entity or the number of fluorescent labelled molecules which bind to an entity. An example of the first case is where multiple fluorophores bind, at spatial intervals, to DNA. An example of the second case is where, say, an antibody has multiple binding sites and binds to a plurality of molecules simultaneously each carrying a single fluorophore. Embodiments of the technique can still further be used in a situation where the degree of labelling signals associated with a common entity arise from a physical mixture of different fluorophores attached to different individual molecules or entities of the same type. An example of this is where a microarray spot contains a physical mixture of the same conjugate molecule some of which have one or more fluor A moieties attached and others of which have one or more fluor B moieties attached.
Embodiments of the technique allow the degree-of-labelling signals from two different fluorophores to be separated, thus dispensing with the need for re-scanning and minimising the deleterious effects of photo bleaching. In some particularly preferred embodiments, the technique is employed with fluorescence signals from a superconducting tunnel junction detector device as described above. The techniques are particularly advantageous with this type of detector because of the relatively small number of photons which may be detected. However, more generally, embodiments of the method may be employed with any type of microarray scanning system, as well as in the context of other systems in which optical emission from spatially substantially coincident fluorophores may be observed. Thus, for example, the method may be embodied as computer programme code to implement a front end for conventional microarray scan analysis software. The degree-of-labelling signals determined for the first and second chromophores (fluorophores) may either comprise a degree-of-labelling per se, or, for example, separated signals from the two fluorophores from which respective degrees of labelling or other hybridisation information may later be derived.
In preferred embodiments of the method, at least one coupling value represents a coupling between light emitted by one of the fluorophores and absorbed by the other of the fluorophores.
In embodiments of the method the first fluorophore has an emission peak at a longer wavelength than that of the second fluorophore and the coupling value represents a coupling between light emitted by the second fluorophore and absorbed by the first fluorophore; in embodiments of the method coupling in the other direction may be substantially neglected.
The conditions under which the first and second fluorescence signals are determined generally define one or both of different illumination wavelengths and different detection wavelengths for the determination of the first and second fluorescence signals. Thus, a common illumination signal may be applied to a microarray in a single scan, using different wavelengths or wavelength bands, for example selected by filters, to determine the two fluorescence signals.
In embodiments an estimate for the one or more coupling values may be determined by performing a calibration over a range of combinations of the first and second fluorophores in different proportions. In preferred embodiments of the method the determining of the two degree-of-labelling signals also takes into account respective parameters for the two fluorophores representing a respective degree of self-quenching. Such parameters may be available or derivable from published data or may again be determined by performing a calibration, here for each fluorophore separately.
The skilled person will appreciate that the techniques we describe may be applied to a range of entities with which the first and second fluorophores are associated. For example, in a microarray experiment the two fluorophores may be associated with a common probe entity to which the separately tagged targets are attached, for example to determine a degree of relative hybridisation. Additionally or alternatively the two fluorophores may be associated with a common sample or target entity. Typically the two fluorophores will be part of a probe-target experiment, but, potentially, they may also be incorporated into the structure of a common molecule, for example as differently fluorescently tagged bases in a strand of DNA or RNA.
In some preferred embodiments the method is employed to process fluorescence data from a microarray. Generally this will comprise a microarray of DNA or RNA, although the microarray may additionally or alternatively comprise antibodies or antigens; in other applications the technique may be employed to process fluorescence data from a sandwich assay.
Thus in a further aspect there is provided a method of processing fluorescence data from a microarray, the microarray being labelled with two or more different fluorophores, the method comprising: inputting said fluorescence data, the fluorescence data representing fluorescence signals from said microarray at two or more wavelengths; determining data representing a line of parity for said fluorescence signals, said line of parity being a line along which signal intensities from fluorescence at said two or more wavelengths are expected to represent substantially equal quantities of the entities to which the fluorophores are attached; and correcting a said fluorescence signal from said microarray at one of said wavelengths using said determined line of parity.
The skilled person will appreciate that the technique may be extended to three or more fluorophores, in which case the concept of a line of parity may be extended accordingly (i.e. to a surface of equivalence, or set of such surfaces). Thus in the above method “line of parity” includes “surface of parity”. In embodiments of the method the fluorescence signal corresponds to one or more biological parameters, for example a level of gene expression or the like. In embodiments of the method it is preferable that at least one of the fluorescence signals used for determining the line of parity represents a control level of fluorescence (although this is not essential since one signal may be used as a control for the other even where both represent a level of a biological parameter). Where one signal is used as a control, the other generally represents a level of a biological parameter, as previously mentioned.
In embodiments the determining of the line of parity comprises determining first and second end points of the line, optionally excluding outlier data signals. One end point may correspond to fluorescence intensity signals at first and second wavelengths being substantially zero (in terms of the later equations, assuming n is approximately unity). Thus one point on the line of parity may be determined using:
$S_G / S_R = (a_G - b_G)/(a_R - b_R)$
where a and b represent characteristics of a fluorophore, the subscripts G and R representing fluorophores which fluoresce primarily at first and second respective wavelengths, for example a green fluorophore such as Cyanine 3 fluorescent dye Cy3® and a red fluorophore such as Cyanine 5 fluorescent dye Cy5®.
In preferred embodiments a second point on the line is determined using:
$S_G / S_R = (b_G + c_{RG})/b_R$
where cRG accounts for emission from one fluorophore which is absorbed by the other fluorophore. More particularly, the cRG term takes account of quenching of a short wavelength fluorescence emitter by a longer wavelength fluorescence emitter (so the longer wavelength emitter is less perturbed by this effect than the shorter wavelength emitter). In some preferred embodiments a value for cRG may be determined from the fluorescence data, for example by a best-fit technique. The point on the line of parity determined by this method may be a maximum fluorescence end point, that is, where the fluorescence intensity signals at the two or more wavelengths are substantially at a maximum.
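By way of illustration only, the two end-point expressions above can be evaluated as in the following minimal sketch. The function and argument names are hypothetical and the numerical values are placeholders, not measured fluorophore constants.

```python
def parity_ratio_low(a_g, b_g, a_r, b_r):
    """S_G/S_R at the low end point: (a_G - b_G) / (a_R - b_R)."""
    return (a_g - b_g) / (a_r - b_r)


def parity_ratio_high(b_g, b_r, c_rg):
    """S_G/S_R at the second end point: (b_G + c_RG) / b_R."""
    return (b_g + c_rg) / b_r


# Placeholder constants for a green (G) and a red (R) fluorophore.
low = parity_ratio_low(a_g=1.0, b_g=0.1, a_r=0.8, b_r=0.05)
high = parity_ratio_high(b_g=0.1, b_r=0.05, c_rg=0.02)
print(low, high)
```

In practice cRG would be obtained from the data (for example by a best fit) rather than supplied by hand.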
In preferred embodiments the correcting process comprises compensating for a difference between a measurement variable (i.e. a non-control) fluorescence signal and a value of that fluorescence signal predicted from the line of parity, for example by subtracting one from the other. In some preferred embodiments the method also comprises correcting for systematic noise comprising one or both of: fixed pattern noise from the microarray, and noise resulting from division of one digital number by another. The fixed pattern noise may arise, for example, from artefacts due to deposition of the microarray and may have a cyclic repetition, for example at row or column sub-array intervals.
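The subtraction described above may be sketched as follows. This is a simplified illustration under stated assumptions: the line of parity is taken to be a straight line through two end points given as (control signal, measurement signal) pairs, and the names `line_of_parity` and `correct_signal` are hypothetical.

```python
def line_of_parity(p0, p1):
    """Slope and intercept of the straight line through two end points.

    Each end point is an (S_R, S_G) pair: control signal, measurement signal.
    """
    (r0, g0), (r1, g1) = p0, p1
    slope = (g1 - g0) / (r1 - r0)
    intercept = g0 - slope * r0
    return slope, intercept


def correct_signal(s_r, s_g, p0, p1):
    """Subtract the value predicted by the line of parity from each S_G."""
    slope, intercept = line_of_parity(p0, p1)
    return [g - (slope * r + intercept) for r, g in zip(s_r, s_g)]


# Three spots; the end points (0, 0) and (200, 100) are placeholders.
residuals = correct_signal([0.0, 100.0, 200.0], [0.0, 60.0, 90.0],
                           (0.0, 0.0), (200.0, 100.0))
print(residuals)
```

A spot lying on the line of parity yields a corrected value of zero; positive and negative residuals indicate departure from parity in either direction.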
Thus, in a further aspect the invention provides a method of processing fluorescence data from a microarray, the method comprising: inputting said fluorescence data, the fluorescence data representing fluorescence signals from said microarray at a plurality of different spot locations: and processing said fluorescence data to determine a biological parameter associated with fluorescence from a said spot; and wherein said processing includes: compensating said fluorescence data for systematic noise comprising one or both of fixed pattern noise from said microarray and noise resulting from the division of one digital number by another.
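One possible way of compensating for cyclic fixed pattern noise is sketched below, purely as an assumption-laden example: the background values are folded modulo an assumed repeat period (for instance a sub-array row length), averaged at each position within the period, and the resulting pattern is subtracted from the spot signals. The function names and the ordering of spots by a single index are hypothetical; the specification does not prescribe this particular procedure.

```python
def fixed_pattern(background, period):
    """Mean background at each position within the repeat period."""
    sums = [0.0] * period
    counts = [0] * period
    for i, value in enumerate(background):
        sums[i % period] += value
        counts[i % period] += 1
    # Positions never sampled are given a zero correction.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def remove_fixed_pattern(signal, background, period):
    """Subtract the folded background pattern from each spot signal."""
    pattern = fixed_pattern(background, period)
    return [s - pattern[i % period] for i, s in enumerate(signal)]


background = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
signal = [11.0, 12.0, 13.0, 11.0, 12.0, 13.0]
print(remove_fixed_pattern(signal, background, period=3))
```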
The invention further provides processor control code, in particular on a carrier, to implement embodiments of the above described method. The carrier may comprise a disc such as a CD (compact disc)- or DVD (digital video disc)-ROM, programmed memory such as a read only memory, or a data carrier such as an optical or electrical signal carrier. The processor control code may comprise source, object or executable code in any conventional programming language, for example C, or code for a hardware description language. As the skilled person will appreciate, such code and/or associated data may be distributed between a plurality of coupled components in communication with one another.
The invention further provides apparatus configured to implement a method as described above. In general such apparatus comprises an input to receive fluorescence data to be processed, an output to provide the processed fluorescence data, either for further analysis or, for example, as a set of gene expression levels, and a data processor coupled to the input and the output, to working memory, and to program memory storing processor control code to implement a fluorescence data processing method.
In a related aspect, the invention provides apparatus for determining respective first and second degree-of-labelling signals for different respective first and second fluorophores associated with a common entity, the apparatus comprising: means for determining a first fluorescence signal from said first and second fluorophores under first conditions; means for determining a second fluorescence signal from said first and second fluorophores under second conditions different to said first conditions; means for determining said first and second degree-of-labelling signals for said first and second fluorophores from said first and second fluorescence signals; and wherein said means for determining said first and second degree-of-labelling signals is responsive to at least one coupling value (c12; c21) representing a coupling of energy between said fluorophores.
The inventors have further recognised that the above-described techniques may also be employed to determine an optimum degree-of-labelling of an entity with a fluorophore and, more particularly, by two or more fluorophores. This is advantageous because a procedure to label entities with one or more fluorescent tags generally involves multiple labelling experiments which are tedious and time consuming. The inventors have recognised that, depending upon the data available, only a single labelling experiment or, potentially no labelling experiments may be necessary.
Thus, according to a further aspect of the invention, there is provided a method of labelling an entity with a fluorophore, the method comprising inputting a first parameter dependent on a light creation efficiency of said fluorophore; inputting a second parameter dependent on a degree of self-quenching of said fluorophore; determining an estimate of an optimum degree-of-labelling of said entity by said fluorophore using said first and second parameters; and labelling said entity with said fluorophore in accordance with said estimated optimum degree-of-labelling.
In embodiments the estimated optimum degree-of-labelling comprises an estimated degree-of-labelling (number of fluorophores per entity, e.g. molecule) at which fluorescence intensity (brightness) is predicted to be a maximum.
Preferably the first parameter is further dependent on one or more of a structure of the entity, a structure of the fluorophore, and a bonding between the fluorophore and the entity. In some particularly preferred embodiments, the method is employed with a plurality of fluorophores, using a third parameter dependent upon the degree of coupling between at least two of the plurality of fluorophores to determine an estimated optimum degree-of-labelling for each of the fluorophores in the presence of the other. The degree of coupling may be estimated, for example, by performing a calibration experiment using entities labelled with a range of different respective combinations of the plurality of fluorophores.
In a further aspect the invention provides an entity labelled with one or more fluorophores using the above described method. Thus in embodiments the entity has a substantially optimum degree-of-labelling by the one or more fluorophores. The optimum degree-of-labelling may be defined as a degree-of-labelling which corresponds to substantially the maximum fluorescent light yield from the labelled entity for the relevant fluorophore.
Thus in further aspects the invention provides an entity labelled with a plurality of different fluorophores, and a kit of fluorescent probes, respective numbers of said different fluorophores being such that a fluorescence signal (S(n)) from each said fluorophore is substantially maximised.
The invention still further provides a fluorophore-labelled entity having a number of labelling fluorophores determined by a product of a figure of merit (R), an example of which is described later, multiplied by a constant of proportionality determined from a measured set of peak values of fluorescence against respective figures of merit for a plurality of different other fluorophores. For example the constant of proportionality may be that shown in
The invention still further provides a method of manufacturing a kit of fluorophore labelled probes, the method comprising: determining a combination of fluorophores for said kit; and manufacturing said kit using said determined combination of fluorophores; and wherein said determining of said combination of fluorophores comprises: selecting one or both of a set of fluorophores for said kit of fluorophore labelled probes and a degree of labelling of said probes by said fluorophores using a fluorescence brightness figure of merit function (R) for a candidate said fluorophore of the set.
In embodiments, the fluorescence brightness figure of merit function is dependent on a degree of overlap between emission and absorption spectra of a said candidate fluorophore. Preferably R also comprises a function dependent on the quantum yield of a said candidate fluorophore and on a maximum value of an extinction coefficient of the fluorophore. Additionally or alternatively said function comprises a function of i) a parameter dependent on a light creation efficiency of the fluorophore, and ii) a parameter depending on a degree of self-quenching of the fluorophore, said selecting further being dependent on iii) a parameter dependent on a degree of coupling between the fluorophore and another fluorophore.
In embodiments the kit of fluorophore labelled probes comprises a calibration kit for a microarray or another diagnostic platform. In embodiments the number of calibration fluorophores matches the number of experimental fluorophores. In some preferred embodiments the kit of fluorophore labelled probes is for use with an STJ detector.
The invention also provides a kit of fluorophore labelled probes comprising a kit of fluorophores, and wherein one or both of a set of fluorophores for said kit of fluorophore labelled probes and a degree-of-labelling of said probes by said fluorophores are selected using a fluorescence figure of merit function (R) for a candidate said fluorophore of the set, and wherein said fluorescence brightness figure of merit function is dependent on a degree of overlap between emission and absorption spectra of a said candidate said fluorophore.
In embodiments the fluorophores and/or degree-of-labelling of the fluorophores is selected to optimise subsequent signal detection and/or measurement and/or spectral deconvolution.
These and other aspects of the invention will now be further described, by way of example only, with reference to the accompanying figures in which:
FIGS. 9a to 9c show, respectively, a microarray data scatter plot showing construction of a line of parity according to an embodiment of the invention, the plot of
FIGS. 10a to 10d show, respectively, background noise from the HFF-PDS (human foreskin fibroblasts infected with the PDS strain of the parasite Toxoplasma gondii, a close relation of the organism which causes malaria) data set folded modulo-28 illustrating artefacts due to fixed pattern noise, systematic noise arising from the division of one small digital number by another, results of a simulation illustrating how noise could be misinterpreted as gene expression, and a similar example showing real, raw microarray data (from SMD data set 3932).
Broadly speaking we will describe a model of the self-quenching of fluorescent emission and compare this with measurements of light yield versus degree-of-labelling for a number of fluorophores (dyes) commonly used in biology. The model is physically based on the emission and absorption of light by molecules of the same species. The model shows that the optimum degree-of-labelling corresponding to maximum light yield, is predictable from a combination of basic parameters of the fluorophore. However the maximum can also depend on the fluorophore's conjugate molecule. Extension of the model to multi-fluorophore systems is described, as is a method for determining degree-of-labelling signals in such systems, and procedures for the recovery of biological information in such systems in the presence of non-linearities.
In dye-labelled biological systems, the fluorescent signal Is is not always linearly related to the degree-of-labelling n, the number of fluorophores present per conjugate molecule. (In what follows,
$I_s = \int_{\lambda_1}^{\lambda_2} i(\lambda)\,d\lambda$
formally denotes the integral of the wavelength-dependent emission function i over an output filter bandpass [λ1 ≤ λ ≤ λ2].) The conjugate molecule is the biologically active protein or antibody to which the fluorophore is attached. The phenomenon of self quenching (the decrease in fluorescent signal observed for high labelling densities) is poorly understood [S. Hamann, J. F. Kiilgaard, T. Litman, F. J. Alvarez-Leefmans, B. R. Winther and Zeuthen, J. Fluorescence 12 (2002) 139], even though it leads in many cases to a well-defined degree of labelling (n=npeak) for maximum brightness.
Of the possible quenching mechanisms, dynamic fluorescence quenching cannot be considered a universal process because it depends on collisional energy exchange with a quenching agent distinct from the fluorophore itself. Static quenching, by contrast, depends on the formation of non-emitting molecular complexes by the fluorophore and the quencher. Thirdly, the process of fluorescent resonant energy transfer (FRET) requires the overlap of the donor emission spectrum and the absorption spectrum of the acceptor in a system whose two components are separated by distances of only about a nanometre.
Here, we assume that self quenching, which leads to signal non-linearity, is primarily due to the absorption by the fluorophore molecules of their own emitted light. In other words, we suppose that, fundamentally, self-quenching arises from the overlap of the absorption and emission spectra in the same dye molecule in an otherwise transparent system. In what follows, we compare the predictions of our model with measurements of the brightness function S(n) reported in the literature, and calculate npeak, the labelling density which corresponds to maximum signal.
Table 1 summarises the acronyms used below to denote specific fluorophores, conjugate molecules and reference standards for the estimation of quantum yield [R. F. Rubin and A. N. Fletcher, J. Luminescence 27 (1982) 445].
Consider the absorption of light from a monochromatic source of intensity I0 (photons/cm2/s) and wavelength λs in a dye-labelled biological sample whose thickness is d and whose volume is V. In a weakly absorbing system, the absorbance A due to the fluorophore may be written as either:
A=ε·d·C -(1a)
or
A=N·σ·d -(1b)
where:
$I_s = I_0 A Q [V/d][d\Omega/4\pi]$ (1c)
Eqs. (1a, b) express, for biologists and physicists respectively, the same underlying Beer-Lambert law. We see from eq. (1b) that it is the number of fluorophores per unit volume, N, which quantitatively determines the degree of absorption, but note from above that the parameter almost always reported in biology is n, the number of fluorophores per conjugate molecule, measured in mol/mol. The relationship between N and n, however, is simple and linear if M, the mass of the conjugate molecule (e.g. 52,800 Da for the protein streptavidin), is much greater than that of the fluorophore (e.g. 300-900 Da for the fluorophores below). With M in grams:
$N = [n \cdot N_a \cdot \rho]/M$ (2)
where:
Substituting for A (from eq. 1b) and N (from eq. 2) in eq. (1c), we find, after some manipulation:
$I_s = I_0 n Q N_c \sigma [d\Omega/4\pi] = I_0 S(n) N_c \sigma [d\Omega/4\pi]$ (3)
where Nc is the number of conjugate molecules in the sample and the physical brightness function S(n)=nQ incorporates the possibility that the quantum efficiency depends on the degree of labelling, n.
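The chain from eqs. (1b), (1c) and (2) to eq. (3) can be checked numerically with the following minimal sketch. It rests on several assumptions that are ours rather than the text's: Na is taken to be Avogadro's number, ρ the sample density in g/cm³, M the molar mass in g/mol, and Nc is taken as Na·ρ·V/M; all function names and numerical values are illustrative only.

```python
import math

AVOGADRO = 6.02214076e23  # per mole (assumed meaning of N_a)


def fluorophore_number_density(n, density, molar_mass):
    """Eq. (2): N = n * N_a * rho / M, in fluorophores per cm^3."""
    return n * AVOGADRO * density / molar_mass


def conjugate_count(density, volume, molar_mass):
    """Assumed relation N_c = N_a * rho * V / M (conjugates in the sample)."""
    return AVOGADRO * density * volume / molar_mass


def signal_from_absorbance(i0, absorbance, q, volume, thickness, solid_angle):
    """Eq. (1c): I_s = I_0 * A * Q * (V/d) * (dOmega / 4 pi)."""
    return i0 * absorbance * q * (volume / thickness) * solid_angle / (4.0 * math.pi)


def fluorescent_signal(i0, n, q, n_c, sigma, solid_angle):
    """Eq. (3): I_s = I_0 * n * Q * N_c * sigma * (dOmega / 4 pi)."""
    return i0 * n * q * n_c * sigma * solid_angle / (4.0 * math.pi)


# Placeholder values; M is the streptavidin figure quoted in the text.
n, rho, M = 3.0, 1.35, 52800.0
V, d = 1e-6, 1e-3                      # cm^3, cm
sigma, Q, I0, dOmega = 1e-16, 0.3, 1e15, 0.1

N = fluorophore_number_density(n, rho, M)
A = N * sigma * d                      # eq. (1b)
print(signal_from_absorbance(I0, A, Q, V, d, dOmega))
print(fluorescent_signal(I0, n, Q, conjugate_count(rho, V, M), sigma, dOmega))
```

Under these assumptions the two printed values agree to within rounding, which is the content of the "after some manipulation" step: the sample thickness d cancels and the volume is absorbed into Nc.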
This analysis suggests that the brightness function S(n) should be higher, for a given n value, the lower the mass of the conjugate molecule, since then there will be proportionately more fluorophores per unit volume, provided the density varies little between conjugates. This hypothesis is tested later.
In order to estimate the form of S(n), we need to consider first the absorption of the source flux and, second, the reabsorption of the fluorescent emission by the same population of dye molecules. Applying eqs. (1b, 2), the probability of fluorescent light emission and the probability of reabsorption of that fluorescent photon within the sample are both proportional to n, the number of fluorophores per conjugate molecule. We write λp(>λs) for the wavelength corresponding to the peak of the fluorophore's emission spectrum.
Thus, disregarding details of both the sample geometry and of the interaction of the fluorophore with its host molecule, the brightness function, in arbitrary units, is then found from the product of the production and reabsorption probabilities as follows:
S=[k1n][1−k2n] -(4a)
which is of the form:
S=an−bn² -(4b)
where: a=k1 and b=ak2 are characteristics of the fluorophore. A close identification of the former constant follows from the biological definition of brightness function:
S(n)=n·RQY(n) -(5)
where RQY(n) denotes the relative quantum yield for a degree-of-labelling, n. From eq. (4b) it follows that:
lim(n→0)[S(n)/n]=lim(n→0)RQY(n)=a -(6)
Differentiating eq. (4b) and setting dS/dn=a−2bn to zero we find that, if npeak is the value of n which corresponds to the maximum light yield, we have:
npeak=a/(2b)=1/(2k2) -(7)
The optimum degree-of-labelling can therefore be estimated for any fluorophore for which the constants a and b (or simply k2) have been determined. The same analysis gives the maximum useful labelling density, nzero, for which the signal is totally quenched:
nzero=2npeak -(8)
Thus, a defining characteristic of a fluorophore exhibiting self-quenching by self-absorption is that the maximum degree of labelling is exactly twice the value corresponding to maximum light yield. Formally, this result is in conflict with the starting mathematical assumption of weak absorption and with the physical observation that, for most fluorophores, the absorption spectrum does not completely overlap the emission spectrum. The physical (above) and biological definitions of the brightness function differ only by a multiplicative factor which is the absolute quantum yield of a reference standard (see below).
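The quadratic brightness function and its two characteristic labelling densities can be sketched in a few lines of Python. This is an illustration only: the function names are assumptions, and the constants used are the values a=0.55 and b=0.044 quoted later in this description for the green fluorophore.

```python
def brightness(n, a, b):
    """Quadratic brightness function S(n) = a*n - b*n**2."""
    return a * n - b * n * n


def n_peak(a, b):
    """Degree of labelling at maximum light yield, from dS/dn = a - 2*b*n = 0."""
    return a / (2.0 * b)


def n_zero(a, b):
    """Non-zero degree of labelling at which S(n) falls to zero (total quenching)."""
    return a / b


a, b = 0.55, 0.044  # constants quoted later in the text for the green fluorophore

print("n_peak    =", n_peak(a, b))
print("n_zero    =", n_zero(a, b))
print("S(n_peak) =", brightness(n_peak(a, b), a, b))
```

With these constants the sketch gives npeak of about 6.25 and nzero of about 12.5, so that nzero is twice npeak as stated above.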
Comparison with Published Data
It is useful to be able to predict the optimum degree of labelling for new fluorophores, given only basic physical data.
Returning to eq. (4), the fluorophore constant k1 accounts for the details of the fluorescent light creation process and for the wavelength-dependent losses of fluorescent light in a given absorber geometry; it includes the subtleties of the chemical bonding between the fluorophore and its conjugate biomolecule. In terms of measurable quantities, therefore, k1 is, as already demonstrated (eq. (6)), related to the quantum yield Q (fluorescent photons/absorbed photon), and to the maximum value of the extinction coefficient, εmax. The fluorophore constant k2, by contrast, should also account for the degree of overlap between the emission and absorption spectra of the fluorophore. One measure of this overlap is the value of the extinction coefficient at the wavelength of maximum emission, ε(λp). One may argue, therefore, that the combination of measurable quantities:
R=Qεmax/ε(λp) -(9)
should track npeak, if the self-absorption model of self-quenching is correct.
In order to test this hypothesis, we use values of the absolute quantum yield Q, measured (ideally) for a common degree-of-labelling, n=1. Readily available efficiency values usually refer to the pairwise comparison of spectrally similar fluorophores and are measured relative to a reference standard (such as S101) for degrees-of-labelling chosen to match the absorbances of the fluorophore pair. Table 2 (below) gives the values for relative quantum yield RQY(n) and the correction factors used to arrive at the desired absolute quantum yield Q(n=1) via:
Thus, we are able to construct, for eight fluorophores conjugated to GAM for which all the basic data is available, the relationship between npeak and the figure of merit R shown in
If two fluorophores (subscripted 1 and 2) label the same conjugate molecule to degrees n1 and n2 respectively, and both fluorophores independently exhibit self-quenching due to self-absorption, the brightness function S will have the form:
S(n1,n2)=a1n1−b1n1²+a2n2−b2n2²−(c12+c21)n1n2 -(11a)
The term c12 accounts for emission from fluorophore 1 being absorbed by fluorophore 2 while c21 describes emission from fluorophore 2 being absorbed by fluorophore 1. In other words, there is likely to be mutual quenching observed in the two signal channels.
Typically, the fluorophores chosen for a dual labelling experiment have very distinct (absorption and) emission spectra. Measuring in a well-defined bandpass, one would therefore hope to see the signature of only one fluorophore of the pair, but our model indicates otherwise. In the bandpass appropriate to fluorophore number 1:
S1(n1,n2)=S(n1)−c12n1n2<S(n1) -(11b)
and
(Is)1=I0S1(n1,n2)Ncσ1[dΩ/4π] -(11c)
with similar expressions for fluorophore number 2:
S2(n1,n2)=S(n2)−c21n1n2<S(n2) -(11d)
(Is)2=I0S2(n1,n2)Ncσ2[dΩ/4π] -(11e)
A number of points emerge from this discussion, which have useful implications for the interpretation of (for example) DNA microarray data.
Even in the complete absence of spillover of the emission spectra between output bandpasses, dual (or, by extension, multiple) labelling experiments can be expected to exhibit inter-dependent signal intensities in the output channels because of mutual quenching. The fluorescent intensity from one fluorophore, its concentration held constant, is reduced by an increase in the concentration of a second fluorophore.
The signal intensities described in eqs. (11c) and (11e) therefore lead to underestimates of the “true” degrees of expression in a microarray experiment. Furthermore, if the fluorophores are ordered by increasing wavelength of maximum emission, then we expect that c21<<c12. In other words, the response of the red fluorophore in a dual labelling experiment should be less perturbed by the presence of the second species than that of the blue fluorophore.
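The mutual quenching described by eqs. (11a), (11b) and (11d) can be illustrated with a short Python sketch. All numerical constants below are hypothetical, chosen only so that c21 is much smaller than c12; the function names are likewise assumptions.

```python
def single_brightness(n, a, b):
    """Single-fluorophore brightness S(n) = a*n - b*n**2."""
    return a * n - b * n * n


def channel_brightness(n_own, n_other, a, b, c):
    """Eqs. (11b)/(11d): own-channel brightness reduced by the cross term c*n_own*n_other."""
    return single_brightness(n_own, a, b) - c * n_own * n_other


def total_brightness(n1, n2, a1, b1, a2, b2, c12, c21):
    """Eq. (11a): combined brightness of both fluorophores."""
    return (single_brightness(n1, a1, b1) + single_brightness(n2, a2, b2)
            - (c12 + c21) * n1 * n2)


# Hypothetical constants: fluorophore 1 is the shorter-wavelength (blue) dye.
A1, B1, C12 = 0.6, 0.05, 0.02
A2, B2, C21 = 0.8, 0.10, 0.001

n1 = 2.0  # degree of labelling of fluorophore 1, held constant
for n2 in (0.0, 1.0, 2.0, 4.0):
    s1 = channel_brightness(n1, n2, A1, B1, C12)
    print(f"n2 = {n2:.0f}: S1 = {s1:.3f}")
```

In this sketch the channel-1 brightness falls steadily as n2 is raised with n1 fixed, while the much smaller c21 leaves channel 2 almost unaffected by fluorophore 1.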
The skilled person will also understand that the above considerations can be used to select two or more fluorophores for a kit of fluorophores for detector calibration. The fluorophores may, for example, be selected to optimise (maximise) the signal from each according to the above equations, optimally taking into account detector sensitivity.
These predictions can be tested experimentally by means of a dilution experiment, in particular with a superconducting tunnel junction (STJ) detector, which permits the registration of fluorescent spectra on a photon-by-photon basis. In such an experiment, spectra are registered from DNA labelled, in proportions ranging from 1:4 to 4:1, with both Alexa® 488 (emission in the 510-543 nm band of a triple band output filter) and Cy3 (emission in a longer wavelength band, 607-659 nm). The counting rates in these two bands are summarised in Table 3 (below). The ratio of count rates follows the dilution ratio more or less linearly, although the dynamic range is only 6.4:1, rather than the 16:1 expected from the known amounts of fluorophore. Of more interest in the present context is the decrease (from 0.45 to 0.27 to 0.32 counts/second) in the Alexa® 488 signal as the amount of Cy3 present is increased above parity, and the decrease in the Cy3 signal (from 1.85 to 0.8 to 1.39 counts/second) as the amount of Alexa® 488 increases. These results could be artefacts of subtle variations in the absolute amounts of dye present in the various samples, but the suppression of one fluorophore's signal intensity by the increased presence of a second fluorophore is also clearly embodied in eqs. (11a-e).
These results suggest that the full recovery of biological information from multiple fluorophore systems (such as microarrays) will benefit from preparatory calibration experiments of the kind summarised in Table 3, in order to account for non-linear behaviour due to self- and mutual quenching.
The skilled person will appreciate that the brightness function described above may be employed in a number of different methods. For example a value for c12 (and optionally also c21) may be determined by a calibration procedure in which the combined brightness function S is measured for a range of different values of n1 and n2. Then a value for S(n1,n2) may be measured and, knowing c12, a value for S(n1) may be determined; and similarly for S(n2). Using equation 11a self-quenching may be taken into account (through b, which is related to k2).
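One possible form of such a calibration is sketched below in Python. It assumes, as a simplification, that the channel-1 brightness of eq. (11b) is measured directly and that the single-fluorophore constants a1 and b1 are already known; c12 is then the only unknown and has a closed-form least-squares estimate. The data are synthetic and noise-free, and all names and numerical values are illustrative assumptions.

```python
def single_brightness(n, a, b):
    """Single-fluorophore brightness S(n) = a*n - b*n**2."""
    return a * n - b * n * n


def fit_c12(calibration, a1, b1):
    """Least-squares estimate of c12 from channel-1 calibration data.

    calibration -- iterable of (n1, n2, s1_measured) tuples
    Model, eq. (11b): s1_measured = S(n1) - c12 * n1 * n2
    """
    num = 0.0
    den = 0.0
    for n1, n2, s1 in calibration:
        x = n1 * n2
        num += (single_brightness(n1, a1, b1) - s1) * x
        den += x * x
    if den == 0.0:
        raise ValueError("calibration needs at least one point with n1*n2 != 0")
    return num / den


def recover_single_label_brightness(s1_measured, n1, n2, c12):
    """Undo the mutual-quenching term of eq. (11b) to estimate S(n1)."""
    return s1_measured + c12 * n1 * n2


# Synthetic calibration set generated from hypothetical constants.
A1, B1, TRUE_C12 = 0.55, 0.044, 0.02
calibration = [
    (n1, n2, single_brightness(n1, A1, B1) - TRUE_C12 * n1 * n2)
    for n1 in (1.0, 2.0, 4.0)
    for n2 in (1.0, 2.0, 4.0)
]

c12_fit = fit_c12(calibration, A1, B1)
print("fitted c12 =", c12_fit)
```

The same pattern, with the roles of the two fluorophores exchanged, would give c21 from channel-2 data. If only the combined brightness of eq. (11a) is measured, this approach yields the sum c12+c21 rather than the individual constants.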
The desired result of an example two-colour microarray analysis is the identification of those genes which are significantly over- or under-expressed in a disease or experimental state relative to a normal or control state. The experimental state signal is represented by the fluorescent intensity in one colour channel (Green, represented by the fluorophore Cy3, in the example below) relative to the intensity in a second colour channel (Red, represented by Cy5). The crucial step in any microarray analysis, therefore, is to establish a line of parity—the locus of points where degrees of expression in the two channels—represented externally by the signal intensities—are equal. The above analysis suggests an unambiguous method to closely approximate the line of parity.
Let the ratio of experimental to control intensities be denoted G/R. Then:
G/R≈SG(n)/SR(n)=[aGn−(bG+cRG)n²]/[aRn−bRn²] -(12)
This general expression is not particularly useful, since it is not easy to identify physical values of n in a microarray context, but one can identify two limiting cases which are useful—as n tends to unity, when:
G/R≈[aG−bG−cRG]/[aR−bR] -(13a)
and as n tends to infinity, when:
G/R≈[bG+cRG]/[bR] -(13b)
For any given fluorophore there are two wavelength-independent constants: a and b. In the above equations the subscript G denotes the green fluorophore (e.g. Cy3). The subscript R denotes the red fluorophore (e.g. Cy5). The values for aG, bG, aR and bR are determined independently from analysis of published curves of light yield versus degree-of-labelling. The remaining coupling constant (cRG) expresses the mutual quenching between fluorophores and is a free parameter fixed by fitting to the data.
All the parameters in eqs. (13a, b) are known from the earlier characterisation of the individual Cy3 and Cy5 fluorophores, except cRG, which can safely be approximated by zero in the dilute case, eq. (13a), but not in the intensive case, eq. (13b). Thus, we have limiting expressions for the local line of parity, with only one free parameter to be fit to data (cRG), and with no a priori assumptions regarding the biology of the system.
The procedure is illustrated in
aG=0.55
bG=0.044
aR=1.65
bR=0.55
cRG=0.4 (fitted)
SG=k·SR
For line 900, k=(bG+cRG)/bR
For line 902, k=(aG−bG)/(aR−bR)
For line 904, k=1
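For orientation, the slopes k of the three lines can be evaluated directly from the constants listed above. The short Python sketch below does only this arithmetic; the variable names are assumptions.

```python
# Constants as listed above.
a_G, b_G = 0.55, 0.044
a_R, b_R = 1.65, 0.55
c_RG = 0.4  # fitted

k_900 = (b_G + c_RG) / b_R         # large-n limit, eq. (13b)
k_902 = (a_G - b_G) / (a_R - b_R)  # dilute limit, eq. (13a) with c_RG taken as zero
k_904 = 1.0

print(f"line 900: k = {k_900:.3f}")  # approximately 0.807
print(f"line 902: k = {k_902:.3f}")  # approximately 0.460
print(f"line 904: k = {k_904:.3f}")
```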
The overall line of parity is taken to be the diagonal straight line joining the bottom left (bottom of line 902) and top right (top of line 904) of the “box” containing the bulk of the data points.
b) shows the results of collapsing the data onto this line of parity, by subtracting on a point-by-point basis the difference between the measured Green signal intensity and that predicted from the Red signal intensity and a knowledge of the true line of parity.
Thus the above example uses the limiting cases of the fluorophore brightness function ratio S(n)Cy3/S(n)Cy5 to “box” the microarray data scatter plot. Joining the corners of the box gives a good approximation to the true line of parity, taking account of the non-linearity of the two fluorophore responses. Then the data is collapsed onto the nominal line of parity by calculating the difference in the y-axis between the data point and the line of parity established.
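The collapse step can be sketched as follows, assuming that the two corners of the "box" have already been read off the scatter plot. The corner coordinates and data points below are hypothetical, the function names are assumptions, and the sketch works in whatever coordinates the scatter plot uses (linear or logarithmic).

```python
def parity_line(corner_low, corner_high):
    """Return (slope, intercept) of the straight line joining two box corners.

    Each corner is a (red, green) pair in the coordinates of the scatter plot.
    """
    (r0, g0), (r1, g1) = corner_low, corner_high
    if r1 == r0:
        raise ValueError("corners must have different red coordinates")
    slope = (g1 - g0) / (r1 - r0)
    return slope, g0 - slope * r0


def collapse(points, slope, intercept):
    """Replace each green value by its offset from the line of parity.

    Returns (red, green_measured - green_predicted) pairs; a value of zero
    means the point lies exactly on the line of parity.
    """
    return [(r, g - (slope * r + intercept)) for r, g in points]


# Hypothetical corners and data points.
slope, intercept = parity_line((1.0, 0.5), (9.0, 8.5))
data = [(5.0, 4.5), (5.0, 6.0), (3.0, 1.5)]

for red, residual in collapse(data, slope, intercept):
    print(f"R = {red:.1f}: offset from parity = {residual:+.2f}")
```

Positive offsets then indicate points lying above the line of parity in the green channel and negative offsets points lying below it.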
There is evidence for fixed pattern noise related to the 28×27 sub-array pattern of the microarray data for the system HFF-PDS (no. 3932). In
c provides a model template to which the real data can be transformed.
An analysis therefore preferably includes a systematic error analysis and correction, in particular one or more of: removal of data points corresponding to fixed pattern noise locations, in particular modulo-x elimination of data points corresponding to fixed pattern noise locations (in the example above, x=28); and an approach as described above which aims for (objective) minimisation of the mismatch between theoretical and measured Ln(G/R) versus Ln(R) patterns (R and G comprise first and second colour, e.g. Red and Green, signal data).
On closer examination it turns out that the modulo-x (e.g. 28) fixed pattern noise effect is not the most effective discriminant against “falsely expressed” genes. If the pin array has a size x×y (e.g. 27×28) there is also a modulo-xy (e.g. 756) cycle, which has been confirmed experimentally. The modulo-28 effect is observable as a small ripple, but there is also a repeating “wave”, and “giant excursions” occur in the same parts of the cycle. Thus a more efficient way of reducing fixed pattern noise than simply rejecting “every 28th event” is to reject every event with a mean noise value greater than a threshold, or outside a determined range, e.g. 200-550 (say), in both Channel 1 and Channel 2.
Imposing the requirement that the mean noise be less than a threshold level, say 550, reduces the number of apparently under-expressed genes above the S/N=1 level of ~8 bits almost to zero. Imposing the condition that the mean noise level should not fall below a threshold level, e.g. 200, has a similar effect on the over-expressed genes. No doubt many other effective alternatives will occur to the skilled person. In particular, application of the techniques we describe is not limited to use with STJ-type detectors.
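The two rejection schemes discussed above can be sketched in Python as follows. The record layout (a dictionary per spot), the field names and the phase of the modulo rejection are assumptions made for illustration. The noise window is read here as requiring both channels to lie inside the range for a spot to be kept.

```python
def within_noise_window(noise_ch1, noise_ch2, low=200.0, high=550.0):
    """True if the mean noise in both channels lies inside [low, high]."""
    return low <= noise_ch1 <= high and low <= noise_ch2 <= high


def reject_noisy_spots(spots, low=200.0, high=550.0):
    """Keep only spots whose mean noise passes the window in both channels."""
    return [
        s for s in spots
        if within_noise_window(s["noise_ch1"], s["noise_ch2"], low, high)
    ]


def reject_modulo(spots, period=28, phase=0):
    """Simpler alternative: drop every spot whose index equals phase modulo period.

    The phase (which position in each cycle is dropped) is a hypothetical
    parameter; the text does not specify it.
    """
    return [s for i, s in enumerate(spots) if i % period != phase]


# Hypothetical spot records.
spots = [
    {"id": 0, "noise_ch1": 310.0, "noise_ch2": 295.0},
    {"id": 1, "noise_ch1": 720.0, "noise_ch2": 300.0},  # above the upper threshold
    {"id": 2, "noise_ch1": 250.0, "noise_ch2": 150.0},  # below the lower threshold
    {"id": 3, "noise_ch1": 550.0, "noise_ch2": 200.0},  # on the window edges
]

kept = reject_noisy_spots(spots)
print([s["id"] for s in kept])
```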
The techniques we describe may also be applied to correcting fluorescence images from comparative biochemistry carried out in microtitre plates, including experiments involving whole or live cells, for example for high throughput drug discovery. They may also be applied to fluorescence imaging of whole or live cells, or of synthetic particles with fluorophore-tagged moieties attached, in biological experiments involving flow cytometry; and to tissue and whole cell imaging by fluorescence microscopy or confocal microscopy down to the level of single molecule detection, particularly, for example, for the identification of the very earliest stages of the development of cancers. The spatial location and movement of individual proteins in whole cells is under early development for both basic biomedical research and for drug discovery. Most whole cell experiments are currently qualitative in nature, but there will be an increasing demand for quantitative imaging, which will require understanding of, and correction for, coupling and energy transfer between fluorophores. The techniques may similarly be applied in an important area of future research, at present in its infancy, which involves the characterisation of the autofluorescence of biomarkers, particularly in whole cells.
It will be understood that the invention is not limited to the described embodiments and encompasses modifications apparent to those skilled in the art lying within the spirit and scope of the claims appended hereto.
Number | Date | Country | Kind |
---|---|---|---|
0700189.4 | Jan 2007 | GB | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/GB2008/050009 | 1/4/2008 | WO | 00 | 11/30/2009 |