Discriminating between mass signals generated for ions having similar mass-to-charge (m/z) ratios can be a difficult problem in mass spectrometry.
In top-down mass spectrometry (MS) protein analysis, for example, overlapping of mass or mass-to-charge (m/z) peaks in a mass spectrum is a significant problem. In this type of analysis, a wide range of different fragment or product ions are produced, including product ions that have lengths of 1-200 amino acids and have 1-50 different charge states. The product ion peaks are heavily overlapped with each other in a single spectrum. In addition, the overlap can be so extensive that even mass spectrometers with the highest mass resolution (Fourier transform ion cyclotron resonance (FT-ICR) or Orbitrap) cannot deconvolve such overlapped peaks. As a result, large product ions are often lost in top-down protein analysis, limiting the sequence coverage of large proteins.
For compound identification, mass spectra are usually converted into a list of monoisotopic masses corresponding to different compounds. To find such masses, the following strategy is often employed. First, each peak in the mass spectrum is assigned to a corresponding isotopic cluster and the charge state of that cluster is found. Following this, the lowest m/z peak is found for each cluster, which is the peak corresponding to the monoisotopic mass. Knowing the cluster charges, the monoisotopic peaks of each cluster can be converted to a zero-charge list of monoisotopic masses, which can then be used in subsequent algorithms attributing mass spectral peaks to chemical compounds. In practice, correct charge state assignment to a feature (isotopic cluster) in the mass spectrum is a key step towards compound identification.
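The conversion described above can be illustrated with a minimal sketch. It assumes positive-ion mode with protons as the charge carriers and a resolved isotopic cluster whose adjacent peaks are separated by roughly one 13C-12C mass difference divided by the charge; neither assumption is stated in the text. The function names `charge_from_cluster` and `monoisotopic_neutral_mass` are hypothetical.

```python
C13_C12_DELTA = 1.0033548   # Da, mass difference between 13C and 12C
PROTON_MASS = 1.007276466   # Da, assumed charge carrier (protonation)


def charge_from_cluster(cluster_mz):
    """Estimate the charge state from the mean spacing of isotope peaks."""
    peaks = sorted(cluster_mz)
    if len(peaks) < 2:
        raise ValueError("at least two isotope peaks are needed")
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return max(1, round(C13_C12_DELTA / mean_spacing))


def monoisotopic_neutral_mass(cluster_mz, charge=None):
    """Convert the lowest m/z peak of a cluster to a zero-charge mass."""
    z = charge if charge is not None else charge_from_cluster(cluster_mz)
    mono_mz = min(cluster_mz)  # lowest m/z peak, taken as monoisotopic
    return z * (mono_mz - PROTON_MASS)


cluster = [500.0, 500.5, 501.0]
z = charge_from_cluster(cluster)
mass = monoisotopic_neutral_mass(cluster)
```

For the example cluster, the 0.5 m/z spacing gives a charge of 2. Note that the sketch breaks down exactly where the text says conventional approaches do: when clusters are inter-digitated or overlapped, the spacing no longer reflects a single charge state.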
Conventionally, charge deconvolution algorithms in the m/z domain are used for charge state identification. However, if there is severe spectral overlap, which includes inter-digitated peaks and overlapping peaks, this approach is challenging. This is often the case for complex spectra of mixtures or product ion spectra of large biopolymers, such as top-down analysis spectra.
It has long been recognized that the detection response for both electron-multiplier and image-charge detection systems can be proportional to the charge state of the measured ion (e.g. see references listed in PCT/IB2020/050795, incorporated herein by reference). Therefore, in theory, the charge state can be determined upon careful investigation of such intensities. Interestingly, few attempts have been made to exploit this phenomenon for charge state inference, because doing so is technologically challenging.
First, detection events from multiple acquisitions are conventionally summed into a single spectrum to compress the data. Such compression, however, prevents any further analysis of the detector response of each individual ion event, rendering it impossible to infer the charge state. A complete record of each ion detection event intensity and its mass spectral feature, e.g. time of flight or oscillation frequency, is therefore preferred for such analysis. Alternative data compression strategies can also be utilized to retain some information about individual detector responses while still compressing the data. For example, each detection event can be co-added to one of multiple spectra forming detector response bands, similar to the approach described in PCT/IB2020/050795, incorporated by reference in its entirety.
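One possible reading of this compression strategy is sketched below; it is not taken from the referenced application. It assumes each detection event is a hypothetical `(mz, response)` pair and that the detector response bands are defined by ascending intensity thresholds (`band_edges`), with one co-added spectrum kept per band.

```python
from collections import defaultdict


def band_spectra(events, mz_bin_width, band_edges):
    """Co-add detection events into one spectrum per detector response band.

    events       : iterable of (mz, response) pairs, one per ion detection
    mz_bin_width : width of an m/z bin
    band_edges   : ascending response thresholds separating the bands
    Returns a list of {mz_bin_index: event_count} dicts, one per band.
    """
    spectra = [defaultdict(int) for _ in range(len(band_edges) + 1)]
    for mz, response in events:
        # Band index = number of thresholds the response reaches or exceeds.
        band = sum(response >= edge for edge in band_edges)
        mz_bin = int(mz // mz_bin_width)
        spectra[band][mz_bin] += 1
    return [dict(s) for s in spectra]


events = [(500.02, 3.0), (500.03, 12.0), (500.51, 11.0)]
bands = band_spectra(events, mz_bin_width=0.1, band_edges=[5.0, 10.0])
```

Reading the same m/z bin across all bands gives a coarse histogram of detector responses for that bin, which is the kind of per-bin information that a single summed spectrum discards.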
Second, multiple co-detected ions can generate a detector response that is substantially a sum of the detector responses generated by each co-arriving ion. It is therefore not always possible to infer the charge state of the ions using only the detector response intensity of the detected signal. In general, a sufficiently low ion flux is preferred for charge state determination using detector response, such that the number of detection events with co-arriving ions is minimized.
Third, another challenge in such methods is that the detector response distributions for each particular type of ion are wide and often overlap for different species.
The problem of wide intensity distributions for direct identification of charge state using detection response intensity has been recognized, and a few strategies have been proposed to deal with it for mass spectrometers employing image-charge based detectors. In such systems, the wide distribution can predominantly be attributed to collisions with residual neutral molecules during the measurement, which quench the coherent oscillation of the ion of interest and effectively stop the detection of its signal, making its contribution dependent on the actual ion measurement time. Therefore, it was proposed to filter out the detection events attributed to ions that experienced a collision during the acquisition (Kafader et al. Anal. Chem. 2019, 91, 4, 2776-2783). This approach, however, leads to a large number of ions being discarded, thus significantly increasing the time needed to obtain good ion statistics. In addition, approaches to reduce base pressure and decrease ion velocity were also proposed; however, those adversely affect mass analyzer characteristics. Finally, it was proposed to employ sophisticated data processing techniques to detect the exact time of the collision and hence scale the measured signal intensity according to the actual detection time (Kafader et al. J. Am. Soc. Mass. Spectrom. 2019, 11, 2200-2203).
For mass spectrometers that use an electron-multiplier detector, the average number of secondary emission electrons is well defined for each ion with a particular m/z and charge, but the exact number of emitted primary electrons defining the magnitude of the observed response is a probabilistic quantity. Both secondary emission yield and collisions with the bath gas are described by Poisson statistics, but the underlying physics of the process is very different. Therefore, none of the techniques proposed to deal with the wide distributions for mass spectrometers employing image-charge induced detectors are applicable for the mass spectrometers with electron-multiplier based detection systems.
Therefore, there is a need for methods that address these problems.
In some embodiments, a method is provided for assigning charge state. In some aspects, the method may include assigning a molecular weight based on the assigned charge state.
In some embodiments, the method may include capturing from a detector a detector response signal corresponding to a plurality of ion arrival events, the detector response signal comprising information related to individual ion responses generated by the detector for each ion arrival event. The method may further comprise combining the detector response signal with one or more additional features corresponding to the ion arrival event to assign a charge state for that ion arrival event. In some aspects, the one or more additional features may be selected from a group including: m/z; ion mobility; DMS parameter; and chromatographic time. In some embodiments, the method may further comprise calculating a mass corresponding to the ion arrival events based on the assigned charge states and the m/z corresponding to those ion arrival events.
In some embodiments, the combining the detector response signal with one or more additional features may comprise: grouping m/z bins based on one or more of the features and producing a simplified mass spectrum from the combination of the detector response signal and the one or more features.
In some aspects, the one or more features comprises the recorded detector response.
In some aspects, the grouping comprises applying principal component analysis (PCA) to the detector response signal.
In some aspects, the grouping comprises: generating a list of elementary detector response profiles and corresponding m/z bins, identifying detector response profiles attributed to unique compounds, and decomposing one or more remaining detector response profiles and corresponding m/z bins based on the identified detector response profiles attributed to the unique compounds.
In some aspects, the grouping comprises: generating a list of unique detector response profiles, finding elementary detector response profiles attributed to elementary features, and attributing remaining mixed groups to said elementary features.
In some aspects, the grouping comprises: generating a list of unique detector response profiles, identifying elementary detector response profiles attributed to said elementary features, and attributing the remaining mixed groups to said elementary features.
In some aspects, the grouping may further comprise updating the generated list based on contributions of said corresponding elementary detector response profiles.
In some aspects, the grouping comprises applying a grouping algorithm, and wherein the method further comprises: identifying unique groups; identifying ion groups from the unique groups based on elementary features; and, attributing remaining mixed groups to said elementary features.
In some embodiments, a device is provided for assigning charge states. The device may include: at least one processing element; and non-transitory memory storing program code that, when executed by the at least one processing element, causes the device to: capture a detector response signal corresponding to a plurality of ion arrival events, the detector response signal comprising information related to individual ion responses generated by the detector for each ion arrival event; and, combine the detector response signal with one or more additional features corresponding to the ion arrival event to assign a charge state for that ion arrival event.
In some aspects, the device may be further operative to: calculate a mass corresponding to the ion arrival events based on the assigned charge states and the m/z corresponding to those ion arrival events. The one or more additional features may be selected, for instance, from a group including: m/z; ion mobility; DMS parameter; and, chromatographic time.
In some aspects, the device may be further operative to: calculate a mass corresponding to the ion arrival events based on the assigned charge states and the m/z corresponding to those ion arrival events.
The one or more additional features may include, for instance, m/z domain information.
In some embodiments, a device may be provided for assigning charge states. The device may include, for instance: at least one processing element; non-transitory memory storing program code that, when executed by the at least one processing element, causes the device to: generate, from mass analysis data, a plurality of detector response profiles, each detector response profile comprising an m/z range containing a portion of a mass spectrum extracted from the mass analysis data; evaluate the plurality of detector response profiles to group similar detector response profiles; reduce each group of similar detector response profiles to a simplified mass spectrum representative of that group; and, associate each simplified mass spectrum with a corresponding compound and related charge state.
The device may be operative to associate one or more additional separation domains with the detector response profiles. The additional separation domains may, for instance, be selected from the group including: retention time, drift time, and DMS operational parameters.
In some embodiments, a device is provided for assigning charge states. The device may include, for instance: at least one processing element; non-transitory memory storing program code that, when executed by the at least one processing element, causes the device to: generate, from mass analysis data, a plurality of detector response profiles, each detector response profile comprising an m/z range containing a portion of a mass spectrum extracted from the mass analysis data; and, compare the detector response profiles with a previously generated library of detector response profiles to identify at least one of an associated compound and related charge state.
In some aspects, the previously generated library of detector response profiles comprises a plurality of simplified mass spectra.
Although the detector response profile alone is insufficient for accurate determination of the charge state, it can be very helpful for separating signals originating from different compounds. This, in combination with the fact that accurate charge state information is encoded in the m/z domain, allows for substantially improved performance if conventional charge determination algorithms are coupled with the detector response domain for charge state determination.
Importantly, because the separation happens at the last step of the mass spectrometry analysis, this method can be applicable in some cases where alternative approaches will not work. Specifically, LC methods can provide separation of compounds; however, they are of little use for separating product ions originating from the same precursor, and fragments from the same precursor can still substantially overlap. Similarly, ion mobility separation, which is conventionally performed before fragmentation (e.g. differential mobility separation (DMS)), requires significant modifications to set up post-fragmentation separation.
One approach to enhance performance of the conventional charge determination algorithms is to leverage the detector response profiles for grouping the data. Often the same chemical compound has multiple isotopes forming an isotope cluster, which may or may not be resolved in the m/z domain. The m/z bins corresponding to the positions of those isotopes under certain circumstances will have similar detection response profiles. For example, the detector response profiles will be similar if at least two conditions are satisfied. First, the signal does not overlap (i.e. m/z bin does not contain signal from multiple different species); second, for all m/z bins, which contain the signal from those isotopes, the signal is acquired under predominantly single ion arrival conditions. This allows grouping of m/z bins containing information from the same compound effectively splitting the signal between multiple channels. This yields substantially simplified spectra for subsequent charge detection analysis by conventional algorithms.
The signal grouping may be performed by a variety of grouping algorithms such as, for example, principal component analysis (PCA), k-means clustering or other grouping algorithms known in the art.
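As an illustration of grouping m/z bins by their detector response profiles, the sketch below uses a greedy cosine-similarity grouping. This is a deliberately simple stand-in, not PCA or k-means, chosen only so that the example has no dependencies. The profile histograms, the `threshold` value and the function names are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two detector response histograms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_bins(profiles, threshold=0.95):
    """Group m/z bins whose response profiles resemble a group's first member.

    profiles : {mz_bin: [counts per detector response bin]}
    Returns a list of groups, each a list of m/z bins.
    """
    groups = []  # each group: {"ref": reference profile, "bins": [mz bins]}
    for mz_bin in sorted(profiles):
        profile = profiles[mz_bin]
        for group in groups:
            if cosine(profile, group["ref"]) >= threshold:
                group["bins"].append(mz_bin)
                break
        else:
            groups.append({"ref": profile, "bins": [mz_bin]})
    return [group["bins"] for group in groups]


# Two inter-digitated isotope clusters with distinct response profiles.
profiles = {
    500.0: [10, 30, 10, 0, 0],
    500.2: [0, 0, 10, 30, 10],
    500.5: [9, 31, 11, 0, 0],
    500.7: [0, 1, 9, 29, 11],
}
groups = group_bins(profiles)
```

In this example the bins at 500.0 and 500.5 fall into one group and those at 500.2 and 500.7 into another, so each group can be passed separately to a conventional charge determination algorithm.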
In some embodiments, additional steps can be performed, which may include, for instance, generating a library of detector response profiles and their associated charge states using well-characterized compounds. The library may be a generic library, applicable to a number of instruments or, alternatively, the library may be a custom library generated for a particular instrument. The library of detector response profiles and associated charge states for each of the well-characterized compounds provides reference templates that may be stored and then later accessed for comparison in subsequent analysis. For instance, in a subsequent analysis, a captured detector response profile may be compared to the stored detector response profiles in the library to identify a corresponding stored detector response profile, and thereby an associated charge state for the captured detector response profile.
Optionally, an m/z position of a compound may be stored in the library of detector response profiles and associated charge states in association with a compound of interest. In this embodiment, a step of charge state assignment is performed based on a degree of similarity between a captured detector response profile, generated from mass analysis data captured for the compound of interest, and a stored detector response profile in the library associated with that compound. The m/z position defines one or more m/z bins attributed to a corresponding one or more adjacent charge states for the compound. In a subsequent step, the defined one or more m/z bins may then be co-extracted from the captured mass analysis data for subsequent analysis.
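A library comparison of this kind might be sketched as follows. The text does not specify a similarity measure, so the cosine similarity, the `min_similarity` cut-off, the dictionary layout of the library entries and the example profiles are all illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two detector response histograms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_profile(captured, library, min_similarity=0.9):
    """Return (best library entry, score), or (None, 0.0) if nothing matches."""
    best_entry, best_score = None, 0.0
    for entry in library:
        score = cosine_similarity(captured, entry["profile"])
        if score >= min_similarity and score > best_score:
            best_entry, best_score = entry, score
    return best_entry, best_score


# Hypothetical reference templates recorded for well-characterized compounds.
library = [
    {"compound": "reference A", "charge": 5, "profile": [5, 20, 50, 20, 5, 0, 0]},
    {"compound": "reference B", "charge": 10, "profile": [0, 0, 5, 20, 50, 20, 5]},
]

entry, score = match_profile([4, 22, 48, 21, 5, 0, 0], library)
assigned_charge = entry["charge"] if entry else None
```

Returning `None` below the cut-off keeps a poorly matching profile unassigned rather than forcing it onto the nearest template.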
Often, overlapping features have not only inter-digitated peaks, but also overlapping peaks, where a single m/z bin contains a signal that originated from multiple different species reaching the detector. In some cases, such a signal can be accurately attributed to those overlapping features. Indeed, if there are no co-detected events, the total signal is a sum of the respective contributions originating from the different species and can therefore be decomposed into individual contributions using conventional linear algebra algorithms such as, for instance, a non-negative least squares algorithm, among other decomposition techniques. It is convenient to call a detector response profile originating from a single species and recorded under a single ion arrival condition an elementary detector response profile. In some aspects, a plurality of detector response profiles may be captured, each corresponding to its own elementary detector response profile or to an associated combination of elementary detector response profiles. In either case, each detector response profile corresponding to an overlapping peak can be decomposed into its elementary detector response profile(s).
In cases where the condition of single ion arrivals is not satisfied for every ion, there will be arrival events where the m/z bins containing signal from the same type of ions have different detector response distributions, depending upon the number of ions that arrived at that event. Indeed, the signal is effectively summed on the detector, and having multiple ions arrive simultaneously will shift the detector response distributions towards higher intensity.
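The decomposition of a mixed profile into elementary detector response profiles can be sketched as a non-negative least squares fit. To keep the example dependency-free, the sketch uses a basic projected-gradient solver as a stand-in for a library routine (for instance `scipy.optimize.nnls`); the solver choice, iteration count and example profiles are assumptions for illustration only.

```python
def nnls_projected_gradient(elementary, mixed, iterations=2000):
    """Minimise ||sum_k c[k] * elementary[k] - mixed||^2 subject to c[k] >= 0.

    elementary : list of elementary detector response profiles
    mixed      : detector response profile of an overlapped m/z bin
    Returns the list of non-negative contributions c.
    """
    coeffs = [0.0] * len(elementary)
    # The sum of squared entries bounds the largest eigenvalue of A^T A,
    # so its reciprocal is a safe gradient step size.
    step = 1.0 / sum(x * x for profile in elementary for x in profile)
    for _ in range(iterations):
        model = [
            sum(c * profile[i] for c, profile in zip(coeffs, elementary))
            for i in range(len(mixed))
        ]
        residual = [m - y for m, y in zip(model, mixed)]
        for k, profile in enumerate(elementary):
            gradient = sum(r * p for r, p in zip(residual, profile))
            # Gradient step followed by projection onto c >= 0.
            coeffs[k] = max(0.0, coeffs[k] - step * gradient)
    return coeffs


# Two elementary profiles and a bin containing 30 and 70 ions of each species.
elementary = [
    [0.1, 0.6, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.2, 0.5, 0.3],
]
mixed = [3.0, 18.0, 23.0, 35.0, 21.0]
contributions = nnls_projected_gradient(elementary, mixed)
```

For this noiseless example the fit recovers contributions close to 30 and 70. As the text notes, the linear model holds only when there are no co-detected events.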
Importantly, single ion arrivals and multiple ion arrivals can be distinguished by a simple examination of the frequency of observed detection events in the m/z bin. The process can be modeled, for instance, using the Poisson distribution, and with a simple calculation of 'no detection' occurrences for specific m/z bins, it is possible to calculate the frequencies of each ion multiplicity in the same bin. Such frequencies can then be used as an input to the grouping algorithms to help assign ions with different multiplicities to the same group of ions.
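Under the Poisson model mentioned above, the fraction of acquisitions with no detection in a bin equals exp(-λ), so the mean arrival rate λ and the frequency of each multiplicity follow directly. The sketch below assumes that a count of acquisitions and a count of empty acquisitions are available per m/z bin; the function and parameter names are hypothetical.

```python
import math


def multiplicity_frequencies(n_acquisitions, n_empty, max_multiplicity=4):
    """Estimate ion multiplicity frequencies in one m/z bin.

    n_acquisitions : total number of acquisitions
    n_empty        : acquisitions with no detection event in this bin
    Returns [P(0), P(1), ..., P(max_multiplicity)] for a Poisson process.
    """
    if not 0 < n_empty <= n_acquisitions:
        raise ValueError("need at least one empty acquisition to estimate the rate")
    # P(0) = exp(-lam), so lam = -ln(P(0)).
    lam = -math.log(n_empty / n_acquisitions)
    return [
        math.exp(-lam) * lam ** k / math.factorial(k)
        for k in range(max_multiplicity + 1)
    ]


freqs = multiplicity_frequencies(n_acquisitions=1000, n_empty=900)
# Share of non-empty events that involve more than one ion.
multi_share = (1.0 - freqs[0] - freqs[1]) / (1.0 - freqs[0])
```

With 900 empty acquisitions out of 1000, roughly 5% of the non-empty events are expected to contain more than one ion, which gives a quick check on whether a bin was acquired under predominantly single ion arrival conditions.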
Cases of overlapping features at higher multiplicity may be resolved using a Bayesian framework, or other suitable technique.
An alternative approach would be to use detector response profile and m/z position information in a single algorithm.
Based on these building blocks, a number of different embodiments are possible that combine the m/z and detector response domains to address the charge state determination problem.
The methods may be implemented employing a computing device including at least one processing element operable to execute program code stored in non-transitory memory. When executed, the program code renders the computing device operable to execute any of the methods described above. The computing device may be communicatively coupled to a mass spectrometry system, or may be integral therewith.
The mass analyzer 1003 can be any type of mass analyzer used for a desired technique, such as a time-of-flight (TOF), an ion trap, or a quadrupole mass analyzer. The detector 1004 may be an appropriate detector for detecting ions and generating the signals discussed herein. For example, the detector 1004 may include an electron multiplier detector that may include analog-to-digital conversion (ADC) circuitry. The detector 1004 may produce detection pulses for detected ions. The detector 1004 may also be an image charge induced detector.
The computing elements of the system 1000, such as the processor 1005 and memory 1006, may be included in the mass spectrometer itself, located adjacent to the mass spectrometer, or be located remotely from the mass spectrometer. In general, the computing elements of the system may be in electronic communication with the detector 1004 such that the computing elements are able to receive the signals generated from the detector 1004. The processor 1005 may include multiple processors and may include any type of suitable processing components for processing the signals and generating the results discussed herein. Depending on the exact configuration, memory 1006 (storing, among other things, mass analysis programs and instructions to perform the operations disclosed herein) can be volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.), or some combination of the two. Other computing elements may also be included in the system 1000. For instance, the system 1000 may include storage devices (removable and/or non-removable) including, but not limited to, solid-state devices, magnetic or optical disks, or tape. The system 1000 may also have input device(s) such as touch screens, keyboard, mouse, pen, voice input, etc., and/or output device(s) such as a display, speakers, printer, etc. One or more communication connections, such as local-area network (LAN), wide-area network (WAN), point-to-point, Bluetooth, RF, etc., may also be incorporated into the system 1000.
This application is being filed on Aug. 6, 2021, as a PCT International Patent Application and claims the benefit of priority to U.S. Patent Application Ser. No. 63/062,231, filed Aug. 6, 2020, the entire disclosure of which is hereby incorporated by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IB2021/000514 | 8/6/2021 | WO | |

Number | Date | Country |
---|---|---|
63062231 | Aug 2020 | US |