Mass Spectrometer

Abstract
A mass spectrometer and method of mass spectrometry are disclosed wherein two separate samples are mass analysed and then the relative intensity, concentration or expression level of one or more components, molecules or analytes in a first sample is quantitated relative to the intensity, concentration or expression level of one or more components, molecules or analytes in a second sample. The relative quantitation is performed probabilistically without the need to resort to using internal calibrants.
Description

Various embodiments of the present invention will now be described, by way of example only, and with reference to the accompanying drawings in which:



FIG. 1 shows some simulated noisy data where measurements for some analytes are not available; and



FIG. 2 shows the actual relationship between sample quantity and analyte expression and the relationship as determined according to the preferred embodiment.





A preferred embodiment of the present invention will now be described. FIG. 1 shows simulated data with numbers produced using a random number generator. Four samples were considered and two replicate experiments were modelled for each sample. Accordingly, a total of eight experiments were performed. The actual, underlying or true relationships or ratios between the sample quantities h1-h8 and between the analyte expressions L1-L4 are shown in FIG. 2. FIG. 2 also shows the experimentally determined relationships or ratios as reconstructed according to the preferred embodiment from the noisy and incomplete data as shown in FIG. 1.


It is apparent from FIG. 1 that in the sixth experiment no data was modelled as being present or obtained for the internal standard or invariant ions. Nonetheless, as can be seen from FIG. 2, the ratio h6/h1 has still been recovered successfully by the method of the preferred embodiment, despite the lack of any internal standard in this experiment.


It is to be noted that all of the sample ratios were successfully recovered and, as shown in FIG. 2, are consistent with the true values to within the reported uncertainties. This would not be possible using conventional techniques.


A number of further modifications to the preferred embodiment are contemplated. According to a modification the Poisson distribution given in Equation 3 above may be replaced by a Gaussian approximation to a Poisson distribution.
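By way of illustration only, the following Python sketch compares a Poisson probability with its Gaussian approximation. It assumes the usual approximation in which the Gaussian has mean and variance both equal to the expected count; the function names and the expected counts used are hypothetical and do not form part of the disclosure.

```python
import math


def poisson_pmf(n: int, lam: float) -> float:
    """Poisson probability of observing n counts when lam counts are expected."""
    # Evaluated in log space so that large counts do not overflow.
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))


def gaussian_approx(n: float, lam: float) -> float:
    """Gaussian density with mean lam and variance lam (assumed approximation)."""
    return math.exp(-((n - lam) ** 2) / (2.0 * lam)) / math.sqrt(2.0 * math.pi * lam)


# Compare the two at the expected count for increasing signal levels.
for lam in (5.0, 50.0, 500.0):
    n = int(lam)
    print(lam, poisson_pmf(n, lam), gaussian_approx(n, lam))
```

The agreement improves as the expected count increases, so such a replacement would be most appropriate for intense signals.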


According to another embodiment the exponential prior probability distribution function as presented in Equation 4 above may be replaced by a gamma distribution for any of the parameters G, L, h or k. For example, according to an embodiment:

$$\Pr(L \mid D) = \frac{L^{a-1} \exp(-L/t)}{\Gamma(a)\, t^{a}} \qquad (7)$$

According to a further embodiment, the exponential prior probability distribution function as given in Equation 3 above may be replaced by a normal distribution for any of the parameters G, L, h or k. For example:

$$\Pr(L \mid D) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(L - \mu)^{2}}{2\sigma^{2}} \right) \qquad (8)$$

The exponential prior probability distribution function as given in Equation 3 may according to another embodiment be replaced by a lognormal distribution for any of the parameters G, L, h or k. For example:

$$\Pr(L \mid D) = \frac{1}{S L \sqrt{2\pi}} \exp\left( -\frac{(\ln L - M)^{2}}{2 S^{2}} \right) \qquad (9)$$

According to an embodiment the value L0 in Equation 3 above is set to the average datum size.
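The alternative prior distributions of Equations 7 to 9 may be sketched in Python as follows. This is a minimal illustration only. The form assumed here for the exponential prior, exp(-L/L0)/L0, is an assumption, as are the function names and the example datum sizes. The parameter names a, t, mu, sigma, M and S follow Equations 7 to 9.

```python
import math


def exponential_prior(L: float, L0: float) -> float:
    """Assumed exponential prior: exp(-L / L0) / L0 for L >= 0."""
    return math.exp(-L / L0) / L0 if L >= 0 else 0.0


def gamma_prior(L: float, a: float, t: float) -> float:
    """Equation 7: L^(a-1) exp(-L/t) / (Gamma(a) t^a) for L > 0."""
    if L <= 0:
        return 0.0
    return math.exp((a - 1) * math.log(L) - L / t - math.lgamma(a) - a * math.log(t))


def normal_prior(L: float, mu: float, sigma: float) -> float:
    """Equation 8: normal density with mean mu and standard deviation sigma."""
    return math.exp(-((L - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))


def lognormal_prior(L: float, M: float, S: float) -> float:
    """Equation 9: lognormal density with log-mean M and log-deviation S."""
    if L <= 0:
        return 0.0
    return math.exp(-((math.log(L) - M) ** 2) / (2 * S**2)) / (S * L * math.sqrt(2 * math.pi))


# Hypothetical datum sizes; L0 is set to their average as described above.
data = [120.0, 80.0, 100.0]
L0 = sum(data) / len(data)
print(exponential_prior(50.0, L0), gamma_prior(50.0, 1.0, L0))
```

With a = 1 the gamma distribution of Equation 7 reduces to the assumed exponential form with t in place of L0, so the gamma prior may be regarded as a generalisation of the exponential prior.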

It is contemplated that a dimension could be removed from the model. According to such an embodiment, L may be multiplied by a constant and k divided by the same constant without changing the likelihood (Equation 2). A constraint could be added such as:

$$\prod_{i} h_{i} = 1$$

and the dependence on h could be recast in hyperbolic coordinates. This provides an alternative to marginalisation as a method of simplifying the probability distribution. Rather than integrating a parameter out of the equation, as in marginalisation, a limit is instead imposed on its possible values, so that there is less “space” for the algorithm to explore. To understand the concept of “space”, a graph of h2 plotted against h1 can be considered. If no limit is imposed on the values of h, then the algorithm must explore all positive values, from zero to infinity, for h1 and likewise for h2, i.e. the entire positive region of the graph. By requiring that the product h1h2=1, the space that the algorithm needs to explore is limited to a single hyperbola on this graph (h2=1/h1, i.e. y=1/x). This leaves the values of h with some flexibility, and so is a better approximation than simply assigning h1=1. This constraint can be imposed since the likelihood will remain the same if the value of k is altered accordingly.
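The effect of such a constraint may be illustrated by the following Python sketch. It assumes, purely for illustration, a toy signal model in which the predicted intensity for analyte a in experiment j is L[a] * k[a] * h[j]; this form, the function names and the numerical values are assumptions and are not taken from Equation 2.

```python
import math


def predicted(L, k, h):
    """Predicted signals under the assumed toy model L[a] * k[a] * h[j]."""
    return [[L[a] * k[a] * hj for hj in h] for a in range(len(L))]


def impose_unit_product(h, k):
    """Rescale h so that the product of its entries is 1, compensating in k."""
    c = math.exp(sum(math.log(x) for x in h) / len(h))  # geometric mean of h
    return [x / c for x in h], [x * c for x in k]


L = [1.0, 2.5]   # hypothetical analyte expressions
k = [0.8, 1.2]   # hypothetical response factors
h = [2.0, 8.0]   # hypothetical sample amounts

h_c, k_c = impose_unit_product(h, k)
print(h_c, math.prod(h_c))   # constrained h lies on the hyperbola h1 * h2 = 1
print(predicted(L, k, h))
print(predicted(L, k_c, h_c))  # predictions agree up to rounding
```

Since the predicted signals agree before and after rescaling, any likelihood that depends on the parameters only through these predictions is also unchanged.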


According to another embodiment marginalisation may proceed by integrating over h instead of k.


As discussed above, since according to the preferred embodiment the values of L and Data are the only ones of particular interest, all other values (i.e. G, h, k) in the joint probability function (see Equation 6 above) can be considered as nuisance parameters, i.e. parameters required for the calculation but otherwise unnecessary for the output. One of these values can be removed from the joint probability function by integrating both sides with respect to this value. For instance, to remove k, it is necessary to integrate with respect to k, giving:

$$\Pr(L, h, G, \mathrm{Data}) = \int \left( \Pr(L)\,\Pr(h)\,\Pr(k)\,\Pr(G) \prod_{\mathrm{Data}} \Pr(D_{O} \mid L, h, k) \right) dk \qquad (10)$$

thus leaving the algorithm one fewer parameter to explore and saving computational time. The result of such an integral is unlikely to be a simple function, so further integration is unlikely to be possible. It is not usually possible to integrate the function with respect to G; the program usually integrates with respect to h or k.
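The removal of a nuisance parameter by integration may be illustrated numerically. The following Python sketch is a toy case only: it assumes a single observed count d, a Poisson likelihood with mean L*h*k, exponential priors on L and k, a fixed h, and no noise factor G. All of these choices, and the function names, are assumptions made for the purpose of illustration.

```python
import math


def poisson(d: int, mean: float) -> float:
    """Poisson probability of d counts given the expected mean."""
    if mean <= 0:
        return 1.0 if d == 0 else 0.0
    return math.exp(d * math.log(mean) - mean - math.lgamma(d + 1))


def exp_prior(x: float, scale: float) -> float:
    """Assumed exponential prior density."""
    return math.exp(-x / scale) / scale


def joint(L, k, d, h=1.0, L0=1.0, k0=1.0):
    """Toy joint probability Pr(L) Pr(k) Pr(d | L, h, k)."""
    return exp_prior(L, L0) * exp_prior(k, k0) * poisson(d, L * h * k)


def marginal_over_k(L, d, h=1.0, L0=1.0, k0=1.0, k_max=40.0, steps=4000):
    """Integrate the joint probability over k using the trapezium rule."""
    dk = k_max / steps
    ys = [joint(L, i * dk, d, h, L0, k0) for i in range(steps + 1)]
    return dk * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


# For this toy case the integral over k also has a closed form,
# which provides a check on the numerical result.
L, d = 2.0, 3
closed_form = exp_prior(L, 1.0) * (L ** d) / (L + 1.0) ** (d + 1)
print(marginal_over_k(L, d), closed_form)
```

After the integration the result depends on L alone, so a sampling algorithm would have one fewer dimension to explore.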


The analytes could according to an embodiment be processed one at a time along with the internal standard rather than modelling the whole data set at once.


According to an embodiment the problem may be tackled in two parts. Firstly, h may be inferred and then L may be inferred given the inference about h.
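A greatly simplified two-part procedure may be sketched in Python as follows. The sketch uses point estimates rather than probability distributions and assumes that an internal standard signal is available in every experiment, which the probabilistic treatment described above does not require. The function names and intensity values are hypothetical.

```python
def infer_h(standard):
    """Part 1: relative sample amount per experiment, normalised to the first."""
    return [s / standard[0] for s in standard]


def infer_L(analyte, h, sample_of):
    """Part 2: relative expression per sample, given the estimate of h."""
    sums, counts = {}, {}
    for y, hj, s in zip(analyte, h, sample_of):
        # Correct each intensity for the amount of sample loaded.
        sums[s] = sums.get(s, 0.0) + y / hj
        counts[s] = counts.get(s, 0) + 1
    means = {s: sums[s] / counts[s] for s in sums}
    ref = means[sample_of[0]]  # report relative to the first sample
    return {s: m / ref for s, m in means.items()}


# Two samples with two replicate experiments each (hypothetical intensities).
standard = [100.0, 200.0, 100.0, 50.0]   # internal standard signal
analyte = [10.0, 20.0, 30.0, 15.0]       # analyte signal
sample_of = [0, 0, 1, 1]                 # sample index of each experiment

h = infer_h(standard)
print(h)
print(infer_L(analyte, h, sample_of))
```

In a probabilistic version each part would yield a distribution rather than a single value, and the uncertainty in h would be carried through to the inference of L.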


According to an embodiment there may not be any daughters (e.g. peptides), i.e. it may be possible to quantify directly on the analytes. Alternatively, it may not be possible to make the associations described above, in which case each daughter may be treated as a separate analyte.


A further embodiment is contemplated wherein different approximations may be made to the joint probability distribution given in Equation 6 above. For example, up to six terms or eight terms may be kept, or all terms may be retained. It is also contemplated that the joint probability distribution could be explored without marginalisation.


Although the present invention has been described with reference to preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made without departing from the scope of the invention as set forth in the accompanying claims.

Claims
  • 1. A method of mass spectrometry comprising: providing a first sample comprising a first mixture of components, molecules or analytes; providing a second different sample comprising a second mixture of components, molecules or analytes; and probabilistically determining or quantifying the relative intensity, concentration or expression level of a component, molecule or analyte in said first sample relative to the intensity, concentration or expression level of a component, molecule or analyte in said second sample.
  • 2. A method as claimed in claim 1, further comprising providing a plurality of further samples each comprising a mixture of components, molecules or analytes.
  • 3-5. (canceled)
  • 6. A method as claimed in claim 1, further comprising: digesting said first mixture of components, molecules or analytes; and/or digesting said second mixture of components, molecules or analytes; and/or digesting further mixtures of components, molecules or analytes.
  • 7. (canceled)
  • 8. (canceled)
  • 9. A method as claimed in claim 1, further comprising: dividing said first sample into one or more first replicate samples; and/or dividing said second sample into one or more second replicate samples; and/or dividing further samples into one or more further replicate samples; and/or dividing said first complex mixture into one or more first replicate samples; and/or dividing said second complex mixture into one or more second replicate samples; and/or dividing said further complex mixtures into one or more further replicate samples.
  • 10. A method as claimed in claim 1, further comprising: separating components, analytes or molecules in said first sample by means of a separation process; and/or separating components, analytes or molecules in said second sample by means of a separation process; and/or separating components, analytes or molecules in further samples by means of a separation process; and/or separating components, analytes or molecules in said first replicate samples by means of a separation process; and/or separating components, analytes or molecules in said second replicate samples by means of a separation process; and/or separating components, analytes or molecules in further replicate samples by means of a separation process.
  • 11. A method as claimed in claim 10, wherein said separation process comprises liquid chromatography.
  • 12. (canceled)
  • 13. (canceled)
  • 14. A method as claimed in claim 1, further comprising: mass analysing components, analytes or molecules in said first sample; and/or mass analysing components, analytes or molecules in said second sample; and/or mass analysing components, analytes or molecules in further samples; and/or mass analysing components, analytes or molecules in said first replicate samples; and/or mass analysing components, analytes or molecules in said second replicate samples; and/or mass analysing components, analytes or molecules in further replicate samples; wherein said step of mass analysing components, analytes or molecules further comprises producing mass spectral data comprising a plurality of mass peaks.
  • 15-18. (canceled)
  • 19. A method as claimed in claim 14, further comprising clustering mass peaks from said first sample and/or said second sample and/or further samples and/or from said first replicate sample and/or said second replicate sample and/or further replicate samples.
  • 20-24. (canceled)
  • 25. A method as claimed in claim 19, wherein components, analytes or molecules are recognised or identified on the basis of chromatographic retention time.
  • 26. (canceled)
  • 27. (canceled)
  • 28. A method as claimed in claim 19, further comprising: identifying or recognising components, molecules or analytes in said first sample on the basis of fragment, daughter or product ions; and/or identifying or recognising components, molecules or analytes in said second sample on the basis of fragment, daughter or product ions; and/or identifying or recognising components, molecules or analytes in further samples on the basis of fragment, daughter or product ions.
  • 29-36. (canceled)
  • 37. A method as claimed in claim 1, further comprising determining, formulating or assigning a prior probability distribution function Pr(L) for the relative amount or concentration L of components, molecules or analytes present in each sample.
  • 38-42. (canceled)
  • 43. A method as claimed in claim 1, further comprising determining, formulating or assigning a prior probability distribution function Pr(k) for the overall response factor k of each component, molecule or analyte in said sample.
  • 44. A method as claimed in claim 43, wherein k includes one or more of the following: (i) digestion efficiency; (ii) relative product yield; (iii) losses in delivery; (iv) ionisation efficiency; (v) transmission efficiency; and (vi) detection efficiency.
  • 45-48. (canceled)
  • 49. A method as claimed in claim 1, further comprising determining, formulating or assigning a prior probability distribution function Pr(h) for the relative amount of sample h of each component, molecule or analyte in each sample used in an analysis.
  • 50-54. (canceled)
  • 55. A method as claimed in claim 1, further comprising determining, formulating or assigning a prior probability distribution function Pr(G) for the noise contribution factor G assumed for observed signal intensities and/or applied to predicted signal intensities.
  • 56-60. (canceled)
  • 61. A method as claimed in claim 1, further comprising locating, determining, identifying or choosing one or more internal standards or references.
  • 62-67. (canceled)
  • 68. A method as claimed in claim 1, further comprising predicting what would be observed for each mass peak intensity given probability distribution functions Pr(L) and/or Pr(k) and/or Pr(h) and/or Pr(G) and/or given the probability p of correct identification.
  • 69. A method as claimed in claim 68, further comprising comparing peak intensities that are predicted with those that are observed.
  • 70. A method as claimed in claim 68, further comprising adjusting the value of L or the probability distribution function Pr(L); and/or adjusting the value of k or the probability distribution function Pr(k); and/or adjusting the value of h or the probability distribution function Pr(h); and/or adjusting the value of G or the probability distribution function Pr(G).
  • 71-73. (canceled)
  • 74. A method as claimed in claim 68, further comprising predicting what would be observed for each mass peak intensity given said adjusted probability distribution functions Pr(L) and/or Pr(k) and/or Pr(h) and/or Pr(G) and/or given the probability p of correct identification.
  • 75. (canceled)
  • 76. A method as claimed in claim 74, further comprising accepting or rejecting adjusted probability distribution functions.
  • 77. (canceled)
  • 78. A method as claimed in claim 76, further comprising determining the ratios Lij of relative concentrations L of each component, molecule or analyte in each of said samples for every pair i,j of samples.
  • 79-82. (canceled)
  • 83. A method as claimed in claim 1, wherein said first sample and/or said second sample and/or further samples comprise a plurality of different biopolymers, proteins, protein digest products, peptides, fragments of peptides, polypeptides, oligonucleotides, oligonucleosides, amino acids, carbohydrates, sugars, lipids, fatty acids, vitamins, hormones, portions or fragments of DNA, portions or fragments of cDNA, portions or fragments of RNA, portions or fragments of mRNA, portions or fragments of tRNA, polyclonal antibodies, monoclonal antibodies, ribonucleases, enzymes, metabolites, polysaccharides, phosphorylated peptides, phosphorylated proteins, glycopeptides, glycoproteins or steroids.
  • 84. (canceled)
  • 85. (canceled)
  • 86. A method as claimed in claim 1, wherein either: (i) said first sample is taken from a diseased organism and said second sample is taken from a non-diseased organism; (ii) said first sample is taken from a treated organism and said second sample is taken from a non-treated organism; or (iii) said first sample is taken from a mutant organism and said second sample is taken from a wild type organism.
  • 87. A method as claimed in claim 1, further comprising identifying components, molecules or analytes in said first sample and/or said second sample and/or further samples.
  • 88. A method as claimed in claim 1, wherein said components, molecules or analytes in said first sample and/or said second sample and/or further samples are only identified if the intensity of said components, molecules or analytes in said first sample differs from the intensity of said components, molecules or analytes in said second sample and/or further samples by more than a predetermined amount.
  • 89. A method as claimed in claim 1, wherein said components, molecules or analytes in said first sample and/or said second sample and/or further samples are only identified if the average intensity of a plurality of different components, molecules or analytes in said first sample differs from the average intensity of a plurality of different components, molecules or analytes in said second sample and/or further samples by more than a predetermined amount.
  • 90-92. (canceled)
  • 93. A mass spectrometer comprising means arranged to probabilistically determine or quantify the relative intensity, concentration or expression level of a component, molecule or analyte in a first sample relative to the intensity, concentration or expression level of a component, molecule or analyte in a second sample.
  • 94-100. (canceled)
  • 101. A method of relatively quantifying one or more molecular species among several samples, said method comprising: dividing each sample into multiple replicate samples; for each of said replicate samples obtaining a signal for each of several tentatively identified digestion products of said molecular species in question, wherein the signal is proportional to the concentration of the parent species subject to random noise; obtaining or assigning probabilities that each tentative identification is correct; assigning a prior probability distribution function for the relative amount L of each molecular species in each sample; assigning a prior probability distribution function for the relative amount k of digestion product produced from each molecular species; assigning a prior probability distribution function for the relative amount h of sample for each replicate sample; assigning a prior probability distribution function for the noise level G in each sample; choosing an internal standard wherein the concentration of said internal standard is known to be the same in all of said replicate samples; updating the probability distribution for the relative amount L of each molecular species in each sample; obtaining samples according to said probability distribution for the relative amount L of each molecular species in each sample of a monotonic function of the ratios L_i to L_j for every distinct pair i,j of said replicate samples; and calculating a mean value and standard deviation of the function for each of said pairs.
  • 102. (canceled)
  • 103. (canceled)
Priority Claims (2)
Number Date Country Kind
0409677.2 Apr 2004 GB national
0411248.8 May 2004 GB national
PCT Information
Filing Document Filing Date Country Kind 371c Date
PCT/GB05/01679 5/3/2005 WO 00 12/10/2007