Supporting IGVF by modeling genetics, function, and phenotype with machine learning

Information

  • Research Project
  • ApplicationId
    10297060
  • Core Project Number
    U01HG012022
  • Full Project Number
    1U01HG012022-01
  • Serial Number
    012022
  • FOA Number
    RFA-HG-20-047
  • Sub Project Id
  • Project Start Date
    9/1/2021
  • Project End Date
    5/31/2026
  • Program Officer Name
    GILCHRIST, DANIEL A
  • Budget Start Date
    9/1/2021
  • Budget End Date
    5/31/2022
  • Fiscal Year
    2021
  • Support Year
    01
  • Suffix
  • Award Notice Date
    9/3/2021
Organizations

PROJECT SUMMARY Leveraging the power of the human genome to understand the risks, causes, and treatments of human disease remains a grand challenge for all of biology and medicine. While sequencing costs have plummeted, and clinical implementation has become commonplace, interpreting human genomes remains a highly challenging task. It is our hypothesis that understanding the function of the genome and its products at a molecular, tissue, and phenotypic level using advanced machine learning will help unlock the door to better interpretation for scientific discovery and better clinical outcomes based on genomic medicine. To that end, our team has spent the past two decades working to develop computational models of biology, to predict how those models are perturbed through changes in the genome, and to use those perturbations to model phenotype and disease. We have had many research outputs in this area, having developed and published a number of widely used methods that predict biochemical and phenotypic changes caused by genetic variants to infer phenotype and pathogenicity. However, we believe that there is a coming convergence between the variability in clinical interpretation, high-throughput biotechnology assays, and modern machine learning methodology that will result in more accurate clinical assessments and improved clinical care. Therefore, in this ambitious proposal, we are addressing important questions in variant and genome interpretation consistent with this view and the mission of the IGVF Consortium.
Our major goals include (1) developing advanced semi-supervised approaches to predict variants that disrupt molecular function and/or are capable of altering phenotypes; (2) identifying informative assays, variants, and genes to automate experimental design with an emphasis on resource allocation and reduction of ascertainment bias in the Consortium; and (3) developing machine learning approaches to integrate these models into a workflow of the IGVF Consortium and enable the interaction between computation and experiment in order to catalyze advances in both genetic variant interpretation and predictive model development.

IC Name
NATIONAL HUMAN GENOME RESEARCH INSTITUTE
  • Activity
    U01
  • Administering IC
    HG
  • Application Type
    1
  • Direct Cost Amount
    255408
  • Indirect Cost Amount
    100856
  • Total Cost
    356264
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    172
  • Ed Inst. Type
    SCHOOLS OF ARTS AND SCIENCES
  • Funding ICs
    NHGRI:356264
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    ZHG1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    NORTHEASTERN UNIVERSITY
  • Organization Department
    CHEMISTRY
  • Organization DUNS
    001423631
  • Organization City
    BOSTON
  • Organization State
    MA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    021155005
  • Organization District
    UNITED STATES