1. Field of the Invention
The present invention relates to a data processing apparatus that obtains structural information on unknown compounds by searching a mass spectrum database of known compounds for the mass spectra of the unknown compounds, the mass spectra being obtained by mass spectrometry capable of MSn analysis.
2. Background Art
To determine the structure of an unknown compound measured by mass spectrometry, the measured mass spectrum of the unknown compound is commonly searched against a mass spectrum database of known compounds. JP Patent Publication (Kokai) No. 11-64285 A (1999) (Patent Document 1) and JP Patent Publication (Kokai) No. 2001-50945 A (Patent Document 2), for example, disclose mass spectrum database search methods.
Mass spectrometry uses an apparatus that ionizes a sample with an ion source and then separates and detects the ions on the basis of their mass-to-charge ratio (m/z). The usual mass analysis method, in which the sample ions produced by the ion source are detected as they are, is referred to as MS1. In the method referred to as MS2, ions of a specific mass selected from the MS1 spectrum (precursor ions) are given energy so that they fragment, and the plurality of product ions thus generated are mass-separated to obtain a second mass spectrum.
Each bond in a sample molecule differs in how readily it cleaves, depending on the structure of the molecule. Thus, the fragment ions in an MS2 spectrum have differing intensities and show a molecule-specific mass spectrum pattern. In other words, even when different compounds show the same mass spectrum pattern in their MS1 spectra, they show different patterns in their MS2 spectra. More accurate identification is therefore possible by searching a database with the MS2 spectrum as well as the MS1 spectrum. JP Patent Publication (Kokai) No. 8-124519 A (1996) (Patent Document 3), JP Patent Publication (Kokai) No. 2001-249114 A (Patent Document 4), and U.S. Pat. No. 6,624,408 (Patent Document 5) show examples of database search methods that use the MS2 spectrum.
Conventionally, the object of a database search for an unknown compound is to identify that compound. For this object, the unknown compound and the known compound retrieved from the database must have the same molecular weight, so that, when searching MSn spectra, it is meaningless to compare mass spectra of different generations (different values of n). Thus, in a conventional database search, the MSn spectra of unknown compounds and known compounds are compared only within the same generation: MS1 with MS1, MS2 with MS2, and so on.
Biopolymers such as carbohydrates and peptides generally form many series of related compounds in which various side chains are attached to the same principal chain. In their structural analysis, once the structure of the principal chain is determined from the MSn spectrum in which the principal chain is cleaved, the entire structure can be analyzed by estimating the side chains on the basis of the principal chain, even if the entire structure is not clear. However, depending on the compound, a series of related compounds has different numbers of MSn generations (n) (
Also, compounds such as carbohydrates, in which a plurality of structural units of the same mass are bound together, have a multitude of structural isomers whose molecular weights are equal to one another. Thus, in many cases, compounds that show the same mass spectrum pattern in MS1 are in fact different compounds. Consequently, it is difficult to accurately determine or identify the structure by conventional database search methods.
It is an object of the present invention to enable, in a database search of MSn spectra, retrieval of a known compound whose principal chain is identical to that of an unknown compound even if the entire structure of the known compound in the database and that of the measured unknown compound are not identical, so that the entire structure can be readily analyzed.
In order to achieve the aforementioned object, the present invention provides a data processing apparatus for mass spectrometry capable of MSn analysis of an ionized sample. The data processing apparatus is provided with a database that stores, for each compound, mass spectrum data obtained by MSn analysis of known compounds, and it searches the stored mass spectrum data by comparing them with MSm spectra (m≧1) obtained by MSm analysis of unknown compounds. The data processing apparatus is characterized in that, upon database search, it has a function of comparing MSn data of differing generations.
The MSm spectrum of an unknown compound that serves as the comparison target in the present invention is characterized in that it is the spectrum with the smallest value of m among those mass spectra in which the intensity of other ions relative to the base ions is greater than a threshold.
The present invention is also characterized in that the MSm measurement of an unknown compound ends when the intensity of other ions relative to the base ions in the MSm spectrum exceeds the threshold.
The present invention is further characterized in that the m mass spectra obtained by the MSm analysis of an unknown compound are compared with all the mass spectra in the database successively from m=1, depending on the structure of the database.
According to the present invention, in the MSn measurement of a series of related compounds that have various side chains on the same principal chain, as in the case of biopolymers, the structure of the principal chain can be determined by a database search even if the entire structure is not clear, and the entire structure can then be estimated on the basis of the principal chain structure. Further, because the database search determines the structure of the principal chain, it is possible to determine the structures of a greater number of related compounds than the number of known compounds registered in the database.
Moreover, in the MSn measurement of compounds that have a structure where a plurality of structural units that have the same mass are bound and the molecular weights of isomers are equivalent to each other, it is possible to identify the isomers using a database search even if the mass spectrum patterns in the results of MS1 are the same.
In the following, the embodiments of the present invention are described.
Sample ions ionized by the ion source 1 are introduced into the mass separating unit 2. In the mass separating unit 2, the sample ions are mass-separated. Also, MSn (n=2, 3, 4 . . . ) is successively conducted in accordance with the setting performed by an observer. The mass-separated sample ions are sent to the detector 3 and detected in the form of a mass spectrum. The mass spectrum is sent to the data processing unit 5 for processing via the signal wire 6, and is displayed via the display unit 7.
A sugar chain has a structure in which a principal chain of many linked sugars is modified by various side chains. When the sugar chain is subjected to MSn measurement, cleavage proceeds successively, starting from the bonds between the principal chain and the side chains. Thus, the number of generations (n) of MSn required before a bond in the principal chain is cleaved differs from compound to compound. Also, the sugars constituting the principal chain of the sugar chain are isomers whose masses are equal to one another, so that it is difficult to identify the structure of the principal chain from the daughter ions corresponding to the principal chain that are obtained when the side chains are lost in MSn−1. Thus, it is necessary to continue MSn until the principal chain is cleaved.
In
In this case, when the MSn measurement of an unknown compound is conducted, one method is for the observer to estimate in advance a number of MSn generations (n) sufficient for the principal chain of the unknown compound to be cleaved, and to specify that n as a measurement condition.
Also, by setting a threshold for the intensity ratio of the base ions, which represent the strongest peak in a mass spectrum, to other ions in advance, it is possible to determine a mass spectrum in which the intensity ratio exceeds the threshold as a mass spectrum that shows structural information about the principal chain. On the basis of this, MSn relative to the base ions can be automatically repeated until a mass spectrum that shows structural information about the principal chain can be obtained. For example, in the aforementioned case, measurement can be automatically conducted up to MS2 in 200.2 and MS3 in 210.2 by establishing conditions whereby the principal chain is determined to be cleaved when the intensity of other ions exceeds 40% of the intensity of the base ions in a mass spectrum. A percentage from 10% to 50% is suitable for the threshold.
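The automatic repetition described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the apparatus's actual control logic: a spectrum is represented as a mapping from m/z to intensity, "other ions" is interpreted as the strongest peak other than the base peak, the base peak is used as the precursor for the next generation, and `acquire` is a hypothetical callable standing in for the instrument.

```python
from typing import Callable, Dict, List, Optional

# Assumed representation (not from the specification): m/z -> intensity.
Spectrum = Dict[float, float]


def other_to_base_ratio(spectrum: Spectrum) -> float:
    """Intensity of the strongest non-base peak divided by the base peak.

    Interpreting "other ions" as the second-strongest peak is an assumption.
    """
    intensities = sorted(spectrum.values(), reverse=True)
    if len(intensities) < 2 or intensities[0] <= 0:
        return 0.0
    return intensities[1] / intensities[0]


def measure_until_cleaved(
    acquire: Callable[[int, Optional[float]], Spectrum],
    threshold: float = 0.40,
    max_generation: int = 10,
) -> List[Spectrum]:
    """Repeat MSn until the ratio exceeds the threshold.

    `acquire(n, precursor_mz)` is a hypothetical stand-in for the instrument;
    it returns the MSn spectrum obtained by fragmenting `precursor_mz`
    (None for MS1). `max_generation` is a safety limit added for this sketch.
    """
    spectra: List[Spectrum] = []
    precursor: Optional[float] = None
    for n in range(1, max_generation + 1):
        spectrum = acquire(n, precursor)
        spectra.append(spectrum)
        if not spectrum or other_to_base_ratio(spectrum) > threshold:
            break  # principal chain judged to be cleaved (or nothing detected)
        # Assumption: the base peak becomes the precursor of the next generation.
        precursor = max(spectrum, key=spectrum.get)
    return spectra
```

With a 40% threshold, a compound whose MS1 and MS2 spectra are dominated by a single peak and whose MS3 spectrum has a second peak at 60% of the base peak would stop at MS3, so the last element of the returned list is the spectrum that carries the principal chain information.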
A case is considered where the MS2 spectrum of 200.2 from data gained as mentioned above is registered in a database, and the results of MS3 analysis of 210.2 are searched for in the database (
In the present invention, a mass spectrum that best shows structural information of the principal chain in the results of MS3 analysis of 210.2 is first selected from among the three mass spectra of MS1, MS2, and MS3.
The selection is made in one of two ways, depending on how the MSn analysis of 210.2 was conducted.
If measurement is conducted with a specified number of MSn generations (n) regardless of the mass spectrum patterns, the mass spectrum that shows the structure of the principal chain is selected from the n mass spectra obtained. Among the mass spectra in which the intensity ratio of other ions to the base ions is not less than a certain threshold, the one with the smallest n is taken to be the mass spectrum in which the principal chain is cleaved. In the case of 210.2, when the threshold is set to 40%, the MS3 spectrum, in which the intensity of other ions relative to the base ions exceeds 40%, is selected. A percentage from 10% to 50% is suitable for the threshold.
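The selection rule for a fixed number of generations can be illustrated with a short sketch. It assumes, for illustration only, that each spectrum is a mapping from m/z to intensity, that the spectra are supplied in the order MS1, MS2, and so on, and that "other ions" means the strongest peak other than the base peak; the function names are hypothetical.

```python
from typing import Dict, List, Optional

# Assumed representation (not from the specification): m/z -> intensity.
Spectrum = Dict[float, float]


def other_to_base_ratio(spectrum: Spectrum) -> float:
    """Strongest non-base peak divided by the base peak (an assumed reading)."""
    intensities = sorted(spectrum.values(), reverse=True)
    if len(intensities) < 2 or intensities[0] <= 0:
        return 0.0
    return intensities[1] / intensities[0]


def select_principal_chain_generation(
    spectra: List[Spectrum], threshold: float = 0.40
) -> Optional[int]:
    """Return the smallest n whose MSn spectrum has a ratio >= threshold.

    `spectra[0]` is MS1, `spectra[1]` is MS2, and so on. Returns None when
    no spectrum reaches the threshold, i.e. more generations are needed.
    """
    for n, spectrum in enumerate(spectra, start=1):
        if other_to_base_ratio(spectrum) >= threshold:
            return n
    return None
```

Returning the generation number rather than the spectrum itself keeps the caller aware of which generation was chosen, which matters later when spectra of different generations are compared.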
In contrast, if the MSn measurement of 210.2 is conducted with automatic determination of the mass spectrum in which the principal chain is cleaved, the mass spectrum that shows the structure of the principal chain is the one obtained at the end of the MSn measurement, namely the MS3 spectrum, and that spectrum is selected.
The selected MS3 spectrum is compared with all the mass spectra registered in the database. As a result, the MS2 spectrum of 200.2 that shows a similar mass spectrum pattern is displayed as a search result, and the principal chain of 210.2 is determined to be the same as that of 200.2. An observer can analyze each mass spectrum of MS1 and MS2 using determined principal chain information, and can determine the entire structure (
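A search that ignores the generation of the registered spectra might look like the sketch below. The specification does not state how similarity between mass spectrum patterns is computed; cosine similarity over m/z bins is used here purely as an assumed stand-in, and the database layout (compound name mapped to a list of spectra ordered MS1, MS2, ...) and all names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# Assumed representation (not from the specification): m/z -> intensity.
Spectrum = Dict[float, float]


def binned(spectrum: Spectrum, bin_width: float = 1.0) -> Dict[int, float]:
    """Sum intensities into m/z bins so that nearby peaks can be matched."""
    out: Dict[int, float] = {}
    for mz, intensity in spectrum.items():
        key = round(mz / bin_width)
        out[key] = out.get(key, 0.0) + intensity
    return out


def cosine_similarity(a: Spectrum, b: Spectrum, bin_width: float = 1.0) -> float:
    """Cosine similarity of two binned spectra (an assumed similarity measure)."""
    va, vb = binned(a, bin_width), binned(b, bin_width)
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_all_generations(
    query: Spectrum, database: Dict[str, List[Spectrum]]
) -> List[Tuple[float, str, int]]:
    """Compare `query` with every registered spectrum of every generation.

    Returns (similarity, compound, generation) tuples, best match first.
    """
    hits = []
    for compound, spectra in database.items():
        for generation, spectrum in enumerate(spectra, start=1):
            hits.append((cosine_similarity(query, spectrum), compound, generation))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits
```

Because the loop runs over every generation of every compound, an MS3 query can match a registered MS2 spectrum, which is the cross-generation behavior described above.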
The second embodiment includes the constitution of the first embodiment shown in
In the present invention, an MSn spectrum (n≧1) of 210.2 is first compared with all the mass spectra registered in the database successively from n=1 and any similar mass spectra are selected. In the present embodiment, the MS1 spectrum of 200.2 that is similar to the MS2 spectrum of 210.2 is selected. This comparison determines a mass spectrum that shows daughter ions corresponding to the principal chain.
Then, for the selected MSm spectrum of the unknown compound and the matching MSl spectrum in the database (l denoting the generation of the database spectrum), the next-generation spectra, namely the MS(m+1) spectrum and the MS(l+1) spectrum, are compared. In the present embodiment, the MS3 spectrum of 210.2 and the MS2 spectrum of 200.2 in the database are compared. The comparison is conducted between the MSm spectrum and the MSn spectrum in which the base ions are cleaved. As a result, search results are displayed in descending order of similarity, and the principal chain of 210.2 is determined to be the same as that of 200.2 (
Using determined principal chain information, an observer can determine the entire structure by analyzing each mass spectrum of MS2 and MS1.
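The two-stage comparison of the second embodiment can be sketched as follows. This is a hedged illustration rather than the claimed procedure: the similarity measure is left as a pluggable parameter because the specification does not define one, `peak_overlap` is a deliberately simple toy measure that ignores intensities, and the `cutoff` value and the data layout (lists of spectra ordered MS1, MS2, ...) are assumptions.

```python
from typing import Callable, Dict, List, Tuple

# Assumed representation (not from the specification): m/z -> intensity.
Spectrum = Dict[float, float]


def peak_overlap(a: Spectrum, b: Spectrum) -> float:
    """Toy similarity: Jaccard overlap of integer-rounded m/z values."""
    peaks_a = {round(mz) for mz in a}
    peaks_b = {round(mz) for mz in b}
    union = peaks_a | peaks_b
    return len(peaks_a & peaks_b) / len(union) if union else 0.0


def two_stage_search(
    unknown: List[Spectrum],
    database: Dict[str, List[Spectrum]],
    similarity: Callable[[Spectrum, Spectrum], float] = peak_overlap,
    cutoff: float = 0.5,
) -> List[Tuple[float, str, int, int]]:
    """Stage 1: find MSm (unknown) and MSl (database) spectra that are similar.
    Stage 2: score each such pair by comparing MS(m+1) with MS(l+1).

    Returns (score, compound, m, l) tuples in descending order of score.
    """
    results = []
    # Only generations that have a successor can take part in stage 2.
    for m, query in enumerate(unknown[:-1], start=1):
        for compound, spectra in database.items():
            for l, reference in enumerate(spectra[:-1], start=1):
                if similarity(query, reference) < cutoff:
                    continue
                # List index m holds generation m + 1 (lists start at MS1).
                score = similarity(unknown[m], spectra[l])
                results.append((score, compound, m, l))
    results.sort(key=lambda r: r[0], reverse=True)
    return results
```

In the situation described for this embodiment, stage 1 would pair the MS2 spectrum of the unknown compound with a registered MS1 spectrum (m=2, l=1), and stage 2 would then score the MS3 spectrum against the registered MS2 spectrum.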
Number | Date | Country | Kind |
---|---|---|---|
2004-9977 | Jan 2004 | JP | national |