Deep Learning Methods for Fine Mapping and Discovery in Genomic Association Studies

Information

  • Research Project
  • 10350124
  • ApplicationId
    10350124
  • Core Project Number
    P20GM109035
  • Full Project Number
    5P20GM109035-05
  • Serial Number
    109035
  • FOA Number
    PAR-14-035
  • Sub Project Id
    9781
  • Project Start Date
    3/1/2020
  • Project End Date
    8/3/2021
  • Program Officer Name
    MATUKUMALLI, LAKSHMI KUMAR
  • Budget Start Date
    3/1/2020
  • Budget End Date
    2/28/2022
  • Fiscal Year
    2020
  • Support Year
    05
  • Suffix
  • Award Notice Date
    9/17/2021
Organizations

Deep Learning Methods for Fine Mapping and Discovery in Genomic Association Studies

Nonlinear genetic effects have been proposed as key contributors to missing heritability: the proportion of heritability in a trait that is not explained by the top associated additive variants in genome-wide association (GWA) studies. To this end, probabilistic machine learning approaches have been shown to be useful tools that exhibit great performance gains in genomic selection-based analyses. This is often attributed to the fact that popular kernel regression functions and deep neural networks offer scalable implementations that implicitly enumerate all possible polynomial interaction effects for all variables in the data. Recently, however, these same algorithms have also been criticized as "black box" techniques. A fundamental interpretability issue remains: understanding how genetic features are ranked within machine learning methods is an important, yet open, problem. Here, we propose to develop a suite of novel methodological approaches that make probabilistic machine learning and deep neural networks fully amenable to fine mapping and discovery in genomic sequencing studies (i.e., opening up the black box). Our efforts will lead to unified frameworks that produce interpretable summaries detailing associations on multiple genomic scales (e.g., SNPs, genes, signaling pathways). The first aim of this project is to develop an interpretable significance measure for probabilistic machine learning. The second aim is to develop a unified deep learning framework for gene-level and pathway enrichment analysis in genome-wide association studies. The third aim is to create distributable software and use it to characterize nonlinear genetic effects at multiple genomic scales in real data applications.
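The statement that kernel functions implicitly enumerate polynomial interaction effects can be illustrated with a minimal sketch. It is not the project's method. It assumes a homogeneous degree-2 polynomial kernel and toy genotype vectors coded 0/1/2, and the function names (explicit_pairwise_features, poly2_kernel) are hypothetical. The sketch checks that one kernel evaluation equals the inner product over every explicit pairwise product of SNPs.

```python
from itertools import product

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def explicit_pairwise_features(x):
    """Hypothetical helper: every ordered product x_i * x_j,
    i.e. all squared terms and all pairwise interaction terms."""
    return [xi * xj for xi, xj in product(x, repeat=2)]

def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel, k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2

# Toy genotypes for two individuals at four SNPs (assumed 0/1/2 coding).
x = [0, 1, 2, 1]
z = [1, 1, 0, 2]

# Explicit route: build all 16 pairwise terms, then take the inner product.
explicit = dot(explicit_pairwise_features(x), explicit_pairwise_features(z))

# Implicit route: one inner product over the 4 SNPs, then square it.
implicit = poly2_kernel(x, z)

print(explicit, implicit)
```

The two values agree, although the kernel never forms the interaction terms. With p SNPs the explicit route needs p^2 features, while the kernel needs a single length-p inner product. This is also why the contribution of any individual SNP is not directly visible in a kernel model.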

IC Name
NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
  • Activity
    P20
  • Administering IC
    GM
  • Application Type
    5
  • Direct Cost Amount
    162283
  • Indirect Cost Amount
    88901
  • Total Cost
  • Sub Project Total Cost
    251184
  • ARRA Funded
    False
  • CFDA Code
  • Ed Inst. Type
  • Funding ICs
    NIGMS:251184
  • Funding Mechanism
    RESEARCH CENTERS
  • Study Section
    ZGM1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    BROWN UNIVERSITY
  • Organization Department
  • Organization DUNS
    001785542
  • Organization City
    PROVIDENCE
  • Organization State
    RI
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    029129002
  • Organization District
    UNITED STATES