Identifying disease-relevant cell types by integrating genetic and functional genomics data

Information

  • Research Project
  • ApplicationId
    10247697
  • Core Project Number
    DP5OD024582
  • Full Project Number
    5DP5OD024582-05
  • Serial Number
    024582
  • FOA Number
    RFA-RM-16-006
  • Sub Project Id
  • Project Start Date
    9/1/2017
  • Project End Date
    8/31/2022
  • Program Officer Name
    MILLER, BECKY
  • Budget Start Date
    9/1/2021
  • Budget End Date
    8/31/2022
  • Fiscal Year
    2021
  • Support Year
    05
  • Suffix
  • Award Notice Date
    8/31/2021

Project Summary: Large-scale datasets such as those generated by GTEx, the Roadmap Epigenomics Consortium, and the ENCODE project are valuable new resources for understanding the genetic basis of disease. We now have data on gene expression and on many functional elements, such as histone modifications and DNase I hypersensitivity sites (DHSs), in a variety of human cell types and tissues. Analysis of these datasets, together with data from genome-wide association studies (GWAS), has the potential to lead to breakthroughs in our understanding of the causes of disease. While statistical and computational methods for integrative analysis of these datasets with GWAS datasets have already led to many interesting advances, there is a great need for further methodological progress to translate this abundance of data into concrete mechanistic insights. We will focus on the fundamental problem of identifying disease-relevant cell types and tissues via integrative analysis of these datasets. Our work is motivated by the fact that the substantial majority of disease heritability lies in non-coding regions, and regulatory elements often exhibit strong cell-type specificity. Thus, to understand the mechanistic consequences of genetic variation by either computational or experimental means, we need to identify the cell types and tissues in which the relevant processes are occurring. While these are known for some complex phenotypes, they are uncertain or unknown for many. For example, although it is known that schizophrenia is a brain disease, recent evidence indicates that the complement system is involved in schizophrenia pathogenesis through its role in synaptic pruning, and the relevant cell types remain unresolved. Despite the importance of this problem, developing a powerful method for identifying cell types and tissues from GWAS data remains an open problem.
Our approach will have two components. First, we will develop methods for using genetic data to assess whether a given genomic annotation (i.e., a subset of the genome) is important for the phenotype we are studying. We will build on a method we previously developed for enrichment analysis that powerfully leverages polygenic signal, extending it so that it can analyze rare variant data, combine signal from multiple sources of data about a single cell type/tissue, and investigate shared cell types/tissues across traits. Second, we will use gene expression data and functional genomics data to construct, for each candidate cell type/tissue, genomic annotations that are maximally informative about cell-type specific activity. We will begin by using specifically expressed genes, which have not been fully leveraged in this context, and we will also develop new methods for constructing maximally informative genomic annotations from chromatin data like that available from Roadmap. We will continue our practice of releasing open-source, user-friendly software and data. Together, our new methods and annotations will allow for powerful identification of disease-relevant cell types and tissues from GWAS data, functional genomic data, and gene expression data.
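
To make the idea of testing whether a genomic annotation is "important" concrete, the toy sketch below computes one common summary, heritability enrichment: the share of heritability attributed to SNPs inside the annotation divided by the share of SNPs inside it. This is an illustration only and not the project's method. The function name `enrichment` and the per-SNP heritability values are hypothetical; in practice per-SNP heritability is not observed and has to be estimated from GWAS data.

```python
def enrichment(per_snp_h2, in_annotation):
    """Toy heritability enrichment for a single annotation.

    per_snp_h2    -- hypothetical per-SNP heritability contributions
    in_annotation -- booleans, True if the SNP lies inside the annotation
    Returns (proportion of heritability) / (proportion of SNPs).
    """
    if len(per_snp_h2) != len(in_annotation):
        raise ValueError("inputs must have the same length")

    total_h2 = sum(per_snp_h2)
    n_in = sum(1 for flag in in_annotation if flag)
    if total_h2 <= 0 or n_in == 0:
        raise ValueError("need positive total heritability and a non-empty annotation")

    # Heritability attributed to SNPs inside the annotation.
    h2_in = sum(h for h, flag in zip(per_snp_h2, in_annotation) if flag)

    prop_h2 = h2_in / total_h2
    prop_snps = n_in / len(in_annotation)
    return prop_h2 / prop_snps


# Made-up example: 2 of 6 SNPs fall in the annotation and carry 60% of the heritability.
toy_h2 = [0.3, 0.3, 0.1, 0.1, 0.1, 0.1]
toy_annotation = [True, True, False, False, False, False]
print(enrichment(toy_h2, toy_annotation))  # approximately 1.8
```

A value above 1 means the annotation carries more heritability than its size alone would predict; a value near 1 means no enrichment. Comparing such values across annotations built for different candidate cell types and tissues is the kind of ranking the summary describes.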

IC Name
OFFICE OF THE DIRECTOR, NATIONAL INSTITUTES OF HEALTH
  • Activity
    DP5
  • Administering IC
    OD
  • Application Type
    5
  • Direct Cost Amount
    250000
  • Indirect Cost Amount
    195000
  • Total Cost
    445000
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    310
  • Ed Inst. Type
  • Funding ICs
    NIDCR: 1; OD: 444999
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    ZRG1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    BROAD INSTITUTE, INC.
  • Organization Department
  • Organization DUNS
    623544785
  • Organization City
    CAMBRIDGE
  • Organization State
    MA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    02142-1027
  • Organization District
    UNITED STATES