Computational epigenetics modeling of cell identity genes

Information

  • ApplicationId
    10450361
  • Core Project Number
    R01GM125632
  • Full Project Number
    7R01GM125632-04
  • Serial Number
    125632
  • FOA Number
    PA-21-268
  • Sub Project Id
  • Project Start Date
    7/1/2018
  • Project End Date
    4/30/2022
  • Program Officer Name
    BRAZHNIK, PAUL
  • Budget Start Date
    11/2/2020
  • Budget End Date
    4/30/2021
  • Fiscal Year
    2020
  • Support Year
    04
  • Suffix
  • Award Notice Date
    9/17/2021

Computational epigenetics modeling of cell identity genes

Cell identity genes are a group of functionally linked genes that jointly implement the phenotype of a given cell type. A major constraint on the study of cell identity is the lack of a robust method to define the catalogue of identity genes for a cell type and to identify the master transcription factors that regulate the expression network of cell identity genes and drive cell identity specification. Intrigued by our recent discoveries, we hypothesize that cell identity genes can be identified using epigenetic features that manifest their distinct transcriptional regulation mechanism. We and several other groups discovered that cell identity genes display unique epigenetic features, e.g., broad H3K4me3 (Chen et al., Nature Genetics, 2015) and super-enhancers. We showed that these features are associated with strong and stable transcription activation signals for cell identity genes in their associated cell type, but not in other cell types. Biologists have recently used super-enhancers or broad H3K4me3 as markers to nominate cell identity genes. However, this approach remains challenging for most biologists because the required bioinformatics tools are not yet available. Our overall goal in this proposal is to extend the development of our computational epigenetic methods for cell identity gene discovery. Leveraging the early success of our bioinformatics algorithms DANPOS (Chen et al., Genome Research, 2013) and DANPOS2 (Chen et al., Nature Genetics, 2015), we will develop a series of new algorithms to (1) define epigenetic features for cell identity genes, (2) customize parameters for ChIP-Seq analysis of epigenetic features, (3) collect known cell identity genes on the basis of a thorough literature search followed by manual inspection, (4) systematically identify unknown cell identity genes, and (5) define master transcription factors that regulate the network of cell identity genes and drive cell identity specification.
As a proof of principle, we will apply our novel methods to study cell identity determinants for endothelial cells (ECs) in collaboration with Drs. John P. Cooke, Longhou Fang, and Qi Cao, three experts in EC biology, angiogenesis, and epigenetics. Successful completion of this study is expected to have a broad positive impact on the study of cell identity determination, transcriptional regulation, and chromatin epigenetics. The scientific community will be able to use the bioinformatics tools developed in this proposal to define histone modification features with improved accuracy, and to systematically predict identity genes and their master transcription factors for given cell types in numerous biological systems or disease models. Our functional assays for new identity genes of ECs will improve mechanistic understanding of endothelial differentiation, development, and phenotypes, and will better guide the discovery of therapeutic targets for the treatment of vascular diseases. Although we focus on histone modification features for EC identity genes, our proposed bioinformatics methods can be readily adapted to investigate many other chromatin marks and gene categories in other cell types.
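
To make the "broad H3K4me3" idea in the abstract concrete, the following is a minimal illustrative sketch, not the DANPOS or DANPOS2 implementation and not the algorithms proposed here. It assumes peaks have already been called elsewhere, scores each gene by the width of the widest peak overlapping a window around its transcription start site (TSS), and nominates the top-ranked genes as candidates. The names `Peak`, `Gene`, `widest_promoter_peak`, `rank_by_breadth`, and `candidate_identity_genes`, the 1,000 bp flank, the top-fraction cutoff, and the toy coordinates are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Peak(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED-style; an assumption of this sketch)
    end: int    # exclusive


class Gene(NamedTuple):
    name: str
    chrom: str
    tss: int    # transcription start site position


def widest_promoter_peak(gene: Gene, peaks: List[Peak], flank: int = 1000) -> int:
    """Width of the widest peak overlapping [tss - flank, tss + flank); 0 if none."""
    lo, hi = gene.tss - flank, gene.tss + flank
    widths = [
        p.end - p.start
        for p in peaks
        if p.chrom == gene.chrom and p.start < hi and p.end > lo
    ]
    return max(widths, default=0)


def rank_by_breadth(genes: List[Gene], peaks: List[Peak],
                    flank: int = 1000) -> List[Tuple[str, int]]:
    """Return (gene name, peak width) pairs, broadest first (ties keep input order)."""
    scored = [(g.name, widest_promoter_peak(g, peaks, flank)) for g in genes]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def candidate_identity_genes(genes: List[Gene], peaks: List[Peak],
                             top_fraction: float = 0.05,
                             flank: int = 1000) -> List[str]:
    """Nominate the top fraction of genes by peak breadth (hypothetical cutoff)."""
    ranked = rank_by_breadth(genes, peaks, flank)
    n = max(1, int(len(ranked) * top_fraction))
    # Genes with no overlapping peak are never nominated.
    return [name for name, width in ranked[:n] if width > 0]


# Toy data, invented purely to exercise the functions.
toy_genes = [
    Gene("GENE_A", "chr1", 10_000),
    Gene("GENE_B", "chr1", 50_000),
    Gene("GENE_C", "chr2", 20_000),
    Gene("GENE_D", "chr2", 80_000),
]
toy_peaks = [
    Peak("chr1", 9_000, 14_000),   # broad peak over GENE_A
    Peak("chr1", 49_800, 50_600),  # narrow peak over GENE_B
    Peak("chr2", 19_500, 20_900),  # intermediate peak over GENE_C
]

if __name__ == "__main__":
    print(rank_by_breadth(toy_genes, toy_peaks))
    print(candidate_identity_genes(toy_genes, toy_peaks, top_fraction=0.25))
```

Ranking by a single width value is a deliberate simplification. A practical method would also need to choose peak-calling parameters per mark and to validate the cutoff against curated identity genes, which correspond to aims (2) and (3) of the abstract.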

IC Name
NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
  • Activity
    R01
  • Administering IC
    GM
  • Application Type
    7
  • Direct Cost Amount
    105495
  • Indirect Cost Amount
    81231
  • Total Cost
    186726
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    859
  • Ed Inst. Type
  • Funding ICs
    NIGMS:186726
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    GCAT
  • Study Section Name
    Genomics, Computational Biology and Technology Study Section
  • Organization Name
    BOSTON CHILDREN'S HOSPITAL
  • Organization Department
  • Organization DUNS
    076593722
  • Organization City
    BOSTON
  • Organization State
    MA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    021155724
  • Organization District
    UNITED STATES