Development of Trans Proteomic Pipeline, an Analysis Suite for Mass Spectrometry

Information

  • Research Project
  • 9936396
  • ApplicationId
    9936396
  • Core Project Number
    R01GM087221
  • Full Project Number
    5R01GM087221-10
  • Serial Number
    087221
  • FOA Number
    PA-14-156
  • Sub Project Id
  • Project Start Date
    9/1/2010
  • Project End Date
    4/30/2022
  • Program Officer Name
    RAVICHANDRAN, VEERASAMY
  • Budget Start Date
    5/1/2020
  • Budget End Date
    4/30/2021
  • Fiscal Year
    2020
  • Support Year
    10
  • Suffix
  • Award Notice Date
    5/1/2020


Project Summary: Mass spectrometry (MS) based proteomics is a key technology for the identification, quantification, and comparison of proteins and their post-translational modifications across all aspects of biology. MS datasets have been growing ever larger with the advancement of instrumentation, as has the archive of experimental data available for re-analysis and comparison. To meet the needs of the proteomics community for coping with big data, we have been developing our end-to-end suite of data processing and analysis tools, called the Trans-Proteomic Pipeline (TPP). This project will advance the widely used TPP software suite to become even more useful to its user community, enabling users to perform their analyses faster with less human effort, and adding capabilities that are currently unavailable or only in testing stages. We will add full end-to-end TPP support for data-independent acquisition (DIA) workflows, such as SWATH-MS, and proteogenomics workflows, such as RNA-seq assisted proteomics. The TPP already has partial support for these workflows, but needs additional finishing, hardening, and extension to high-capacity cloud computing platforms to become truly useful to all our users. As protein abundance quantification becomes essential to more experiments, we will enhance our existing tools for isotopic and isobaric labeled data as well as label-free data, and build a new analysis workbench that will give our users access to advanced statistical analysis and comparison routines that already exist but are difficult for many users to handle.
In addition to bundling this statistical software, we will build a framework that allows users to take their quantitative results from any of the traditional or new workflows, transform them into the formats that the statistical packages require, and then visualize and interactively explore the outputs of statistical analysis, so that trends can be uncovered and outliers verified in the original data. A substantial number of smaller enhancements will make the TPP tools smarter, so that users are relieved of the burden of setting parameters and shepherding data through various tools. We will develop new modes of operation for existing tools to handle the challenges our users present, based on the feedback we receive from them. We will continue our many outreach efforts, which include teaching software courses several times per year, hosting workshops and booths at scientific conferences to meet with and gain feedback from our users, and developing many more publicly available tutorials and recipes for applying the tools to various circumstances. We will also continue to disseminate the advancements of the TPP through articles in the literature and presentations at scientific conferences. In summary, this proposed program will continue to advance the TPP as the preeminent free and open-source end-to-end software analysis tool suite for routine and big data applications in proteomics.

IC Name
NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
  • Activity
    R01
  • Administering IC
    GM
  • Application Type
    5
  • Direct Cost Amount
    286770
  • Indirect Cost Amount
    238019
  • Total Cost
    524789
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    859
  • Ed Inst. Type
  • Funding ICs
    NIGMS:524789
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    BDMA
  • Study Section Name
    Biodata Management and Analysis Study Section
  • Organization Name
    INSTITUTE FOR SYSTEMS BIOLOGY
  • Organization Department
  • Organization DUNS
    135646524
  • Organization City
    SEATTLE
  • Organization State
    WA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    981095263
  • Organization District
    UNITED STATES