Protein Knowledge Networks and Semantic Computing for Disease Discovery

Information

  • ApplicationId
    10207002
  • Core Project Number
    R35GM141873
  • Full Project Number
    1R35GM141873-01
  • Serial Number
    141873
  • FOA Number
    PAR-19-367
  • Sub Project Id
  • Project Start Date
    8/25/2021
  • Project End Date
    7/31/2026
  • Program Officer Name
    RAVICHANDRAN, VEERASAMY
  • Budget Start Date
    8/25/2021
  • Budget End Date
    7/31/2022
  • Fiscal Year
    2021
  • Support Year
    01
  • Suffix
  • Award Notice Date
    8/20/2021
Organizations

Protein Knowledge Networks and Semantic Computing for Disease Discovery

The growing volume and breadth of information in the scientific literature and biomedical databases make it challenging for the research community to exploit that content for discovery. This MIRA grant application will advance our knowledge mining and semantic computing system to accelerate data-driven discovery of gene-disease-drug relationships. We have employed natural language processing and machine learning approaches in a generalizable framework for bioentity and relation extraction from large-scale text. Our Protein Ontology supports protein-centric semantic integration of biomedical data for both human understanding and computational reasoning. We have also developed a resource to support functional interpretation and analysis of protein post-translational modifications (PTMs) across modification types and organisms.

Building on our computational algorithms, bioinformatics infrastructure and community interactions, we will further develop literature mining tools to support automated information extraction across the bibliome, and open linked data models for semantic integration of biomedical data from heterogeneous resources. Our text mining tools will be trained for different use cases using deep learning methods. We will develop RDF (Resource Description Framework) semantic models in an increasingly computable, inferable and explainable knowledge system to assist in hypothesis generation. We will present evidence in the form of textual artifacts and semantic models to ensure unbiased analysis and interpretation of results and to promote rigorous and reproducible research.

We will develop scientific case studies to drive the system development. Examples include PTM disease variant and enrichment analyses for drug target identification, genotype-phenotype knowledge mining for understanding Alzheimer's disease, and gene-disease-drug knowledge network construction for COVID-19 drug repurposing. To foster community engagement, we will host workshops and hackathons to address critical fundamental research questions and emerging disease scenarios. We have fully adopted the FAIR (Findable, Accessible, Interoperable, Reusable) principles for resource sharing. All data, tools and research results will be broadly disseminated from the project website, accessible programmatically via RESTful API, queryable via SPARQL endpoints, and dockerized for community code reuse. The successful completion of this research will thus support scalable, integrative and collaborative knowledge discovery to accelerate disease understanding and drug target discovery.
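As a rough illustration of what an inferable and explainable gene-disease-drug knowledge network could look like, the sketch below stores subject-predicate-object triples (the data shape RDF uses) and derives drug repurposing candidates together with the evidence path that supports each one. It is a minimal sketch under stated assumptions, not the project's implementation: the `ex:` identifiers, the predicate names (`ex:targets`, `ex:associatedWith`, `ex:treats`), and the helper functions are all hypothetical, and the toy data are placeholders rather than real biomedical facts.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy graph. Every identifier is a placeholder, not real data.
TRIPLES: Set[Triple] = {
    ("ex:DrugA", "ex:targets", "ex:GeneX"),
    ("ex:DrugB", "ex:targets", "ex:GeneZ"),
    ("ex:GeneX", "ex:associatedWith", "ex:DiseaseY"),
    ("ex:GeneZ", "ex:associatedWith", "ex:DiseaseY"),
    ("ex:DrugB", "ex:treats", "ex:DiseaseY"),
}


def match(
    triples: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for t in triples:
        if (
            (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ):
            yield t


def repurposing_candidates(triples: Set[Triple]) -> List[Dict[str, object]]:
    """Join drug->gene and gene->disease edges into candidate drug-disease
    pairs, skipping pairs already linked by an explicit 'treats' triple."""
    results: List[Dict[str, object]] = []
    for drug, _, gene in match(triples, p="ex:targets"):
        for _, _, disease in match(triples, s=gene, p="ex:associatedWith"):
            if (drug, "ex:treats", disease) in triples:
                continue  # already known, so not a new hypothesis
            results.append(
                {
                    "drug": drug,
                    "disease": disease,
                    # The two supporting triples make the inference explainable.
                    "evidence": [
                        (drug, "ex:targets", gene),
                        (gene, "ex:associatedWith", disease),
                    ],
                }
            )
    return sorted(results, key=lambda r: (str(r["drug"]), str(r["disease"])))


candidates = repurposing_candidates(TRIPLES)
for c in candidates:
    print(c["drug"], "->", c["disease"], "via", c["evidence"])
```

In this toy graph only `ex:DrugA` is reported for `ex:DiseaseY`, because `ex:DrugB` is already linked to that disease by a `treats` triple. A production system would more likely express the same join as a SPARQL query against an RDF store; plain Python is used here only to keep the sketch self-contained.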

  • IC Name
    NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
  • Activity
    R35
  • Administering IC
    GM
  • Application Type
    1
  • Direct Cost Amount
    275000
  • Indirect Cost Amount
    158379
  • Total Cost
    433379
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    859
  • Ed Inst. Type
    BIOMED ENGR/COL ENGR/ENGR STA
  • Funding ICs
    NIGMS:433379
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    ZRG1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    UNIVERSITY OF DELAWARE
  • Organization Department
    BIOSTATISTICS & OTHER MATH SCI
  • Organization DUNS
    059007500
  • Organization City
    NEWARK
  • Organization State
    DE
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    197160099
  • Organization District
    UNITED STATES