Cross-Platform and Graphical Software Tool for Adaptive LC/MS and GC/MS Metabolomics Data Preprocessing

Information

  • Research Project
  • ApplicationId
    10234033
  • Core Project Number
    U01CA235507
  • Full Project Number
    5U01CA235507-04
  • Serial Number
    235507
  • FOA Number
    RFA-RM-17-012
  • Sub Project Id
  • Project Start Date
    9/19/2018
  • Project End Date
    8/31/2022
  • Program Officer Name
    ZANETTI, KRISTA A
  • Budget Start Date
    9/1/2021
  • Budget End Date
    8/31/2022
  • Fiscal Year
    2021
  • Support Year
    04
  • Suffix
  • Award Notice Date
    8/19/2021

Project Summary / Abstract

Data preprocessing is critical for the success of any MS-based untargeted metabolomics study, as it is the first informatics step for making sense of the data. Despite the enormous contributions that existing software tools have made to metabolomics, errors in compound identification and relative quantitation still plague the field. This issue is becoming more serious as the sensitivity of LC/MS and GC/MS platforms continues to increase.

Preprocessing involves peak detection, peak grouping and annotation for LC/MS or spectral deconvolution for GC/MS data, and peak alignment. Existing software tools invariably yield an immense number of false positive and false negative peaks, produce inaccurate peak groups, misalign detected peaks, and extract inaccurate information on relative metabolite quantitation. These errors can translate downstream into spurious or missing compound identifications and cause misleading interpretations of the metabolome. Furthermore, users must specify a large number of parameters for existing software tools to work. Unfortunately, general users usually do not understand how to optimize these parameters, and maximizing one aspect (e.g., sensitivity) often has deleterious effects on another (e.g., specificity).

We will address these challenges by developing more accurate algorithms that improve the rigor and reproducibility of data preprocessing. The proposed algorithms will be implemented in Java and integrated with the widely used MZmine 2, making the software cross-platform and user-friendly with rich visualization capabilities. In addition, the implementation will be optimized for memory efficiency and computing speed, allowing large-scale data preprocessing. Extensive testing of the software will be conducted in close collaboration with metabolomics core facilities and users around the world.
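
The trade-off between sensitivity and specificity described in the abstract can be illustrated with a toy example. The sketch below is not the project's algorithm and does not use the MZmine 2 API. It is a minimal local-maximum peak picker with a single hypothetical parameter, `min_height`, applied to made-up intensity values. The names `detect_peaks` and `count_errors`, the signal, and the "true" peak positions are all assumptions chosen for illustration.

```python
def detect_peaks(intensities, min_height):
    """Return indices of local maxima whose intensity is at least min_height."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        left, mid, right = intensities[i - 1], intensities[i], intensities[i + 1]
        if mid >= min_height and mid > left and mid >= right:
            peaks.append(i)
    return peaks


def count_errors(detected, true_peaks):
    """Return (false positives, false negatives) against known peak positions."""
    detected_set, true_set = set(detected), set(true_peaks)
    false_pos = len(detected_set - true_set)
    false_neg = len(true_set - detected_set)
    return false_pos, false_neg


# Made-up chromatogram (illustrative only): a strong true peak at index 3,
# a weak true peak at index 9, and noise spikes at indices 6 and 12.
signal = [1, 2, 5, 40, 6, 2, 9, 2, 3, 8, 3, 1, 3, 1]
true_peaks = [3, 9]

for threshold in (10, 5, 2):
    found = detect_peaks(signal, threshold)
    fp, fn = count_errors(found, true_peaks)
    print(f"min_height={threshold}: peaks={found}, "
          f"false positives={fp}, false negatives={fn}")
```

In this toy signal one noise spike is taller than the weak true peak, so no single threshold removes both kinds of error: a high threshold misses the weak peak, and a lower one admits noise. Real preprocessing tools expose many such parameters, which is the difficulty the abstract describes.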

IC Name
NATIONAL CANCER INSTITUTE
  • Activity
    U01
  • Administering IC
    CA
  • Application Type
    5
  • Direct Cost Amount
    422610
  • Indirect Cost Amount
    104982
  • Total Cost
    324528
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    393
  • Ed Inst. Type
    SCHOOLS OF ARTS AND SCIENCES
  • Funding ICs
    OD:324528
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    ZRG1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    UNIVERSITY OF NORTH CAROLINA CHARLOTTE
  • Organization Department
    BIOSTATISTICS & OTHER MATH SCI
  • Organization DUNS
    066300096
  • Organization City
    CHARLOTTE
  • Organization State
    NC
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    282230001
  • Organization District
    UNITED STATES