Learning Dynamics of Biological Processes from Time Course Omics Datasets

Information

  • Research Project
  • ApplicationId
    10242091
  • Core Project Number
    R01GM135926
  • Full Project Number
    5R01GM135926-03
  • Serial Number
    135926
  • FOA Number
    PAR-19-001
  • Sub Project Id
  • Project Start Date
    9/23/2019 - 4 years ago
  • Project End Date
    8/31/2023 - 9 months ago
  • Program Officer Name
    BRAZHNIK, PAUL
  • Budget Start Date
    9/1/2021 - 2 years ago
  • Budget End Date
    8/31/2022 - a year ago
  • Fiscal Year
    2021
  • Support Year
    03
  • Suffix
  • Award Notice Date
    8/23/2021 - 2 years ago

Complex biological processes, including organ development, immune response and disease progression, are inherently dynamic. Learning their regulatory architecture requires understanding how components of a large system dynamically interact with each other and give rise to emergent behavior. Recent experimental advances have made it possible to investigate these biological systems in a data-driven fashion at high temporal resolution, allowing identification of new genes and their regulatory interactions. Longitudinal omics data sets are becoming increasingly common in clinical practice as well. Information on these collections of interacting genes can be integrated to gain systems-level insights into the roles of biological pathways and processes, including progression of diseases. Consequently, developing interpretable methods for learning functional relationships among genes, proteins or metabolites from high-dimensional time series data has become a timely research problem. The nature of these time-course data sets presents exciting opportunities and interesting challenges from a statistical perspective. Typical time-course omics data sets are challenging because of their high dimensionality and the non-linear relationships among system components. To tackle these challenges, one needs sophisticated dimension-reduction techniques that are biologically meaningful, computationally efficient and allow uncertainty quantification. Methods that incorporate prior biological information (e.g., pathway membership, protein-protein interactions) into the data analysis are good candidates for analyzing such high-dimensional systems using small samples.
Here, we will develop three core methods to address the above challenges: (Aim 1) an empirical Bayes framework for clustering high-dimensional omics time-course data using prior biological knowledge; (Aim 2) a quantile-based Granger causality framework for learning interactions among genes or metabolites from their lead-lag relationships; and (Aim 3) a decision tree ensemble framework for searching cascades of interactions among genes from their temporal expression profiles. Our interdisciplinary team of statisticians and scientists will analyze time-course omics data from three research projects: (i) innate immune response systems in Drosophila, (ii) developmental processes in mouse models, and (iii) longitudinal metabolite profiling of TB patients. Insights from these analyses will be used to build and validate our methodology, which will be implemented in publicly available software. This proposal is innovative in its incorporation of prior biological knowledge in the framework of novel dimension reduction techniques for interrogating high-dimensional time-course omics data. This research is significant in that it will impact basic sciences by elucidating data-driven, testable hypotheses on the regulatory architecture of biological processes, and clinical practice by monitoring disease progression and prognosis.
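
As an illustration of the lead-lag idea behind Aim 2, the sketch below computes a classical, mean-based Granger causality F statistic with ordinary least squares. It is not the quantile-based framework proposed here, which the abstract does not specify; the function name granger_f_stat, the lag of 1, and the synthetic series x and y are all assumptions made for the example.

```python
import numpy as np

def granger_f_stat(x, y, lag=1):
    """F statistic for 'x Granger-causes y' (hypothetical helper, OLS-based).

    Compares a restricted model (y on its own lags) with a full model
    (y on its own lags plus lags of x).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y) - lag
    target = y[lag:]
    # Column k holds the series shifted back by k + 1 time steps.
    y_lags = np.column_stack([y[lag - k - 1 : len(y) - k - 1] for k in range(lag)])
    x_lags = np.column_stack([x[lag - k - 1 : len(x) - k - 1] for k in range(lag)])
    ones = np.ones((n, 1))
    restricted = np.hstack([ones, y_lags])
    full = np.hstack([ones, y_lags, x_lags])

    def rss(design):
        # Residual sum of squares of the least-squares fit.
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_f = rss(restricted), rss(full)
    df_num = lag                  # number of extra predictors in the full model
    df_den = n - full.shape[1]    # residual degrees of freedom of the full model
    return ((rss_r - rss_f) / df_num) / (rss_f / df_den)

# Synthetic example (assumed data): y follows x with a one-step delay.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
noise = rng.normal(size=200)
y = np.zeros(200)
y[1:] = 0.8 * x[:-1] + 0.1 * noise[1:]

f_xy = granger_f_stat(x, y, lag=1)  # evidence that x leads y
f_yx = granger_f_stat(y, x, lag=1)  # evidence that y leads x
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.1f}")
```

In this toy setting the statistic for x leading y should be far larger than the reverse, which is the asymmetry a lead-lag analysis looks for. A quantile-based variant would presumably replace the least-squares fits with quantile regressions, but that detail is not given in the abstract.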

IC Name
NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
  • Activity
    R01
  • Administering IC
    GM
  • Application Type
    5
  • Direct Cost Amount
    232470
  • Indirect Cost Amount
    124893
  • Total Cost
    357363
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    859
  • Ed Inst. Type
    EARTH SCIENCES/RESOURCES
  • Funding ICs
    NIGMS:357363
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    ZGM1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    CORNELL UNIVERSITY
  • Organization Department
    BIOSTATISTICS & OTHER MATH SCI
  • Organization DUNS
    872612445
  • Organization City
    ITHACA
  • Organization State
    NY
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    148502820
  • Organization District
    UNITED STATES