III-CXT: Learning from graph-structured data: new algorithms for modeling physical interactions in cellular networks

Information

  • NSF Award
  • 0835494
Owner
  • Award Id
    0835494
  • Award Effective Date
    8/15/2007
  • Award Expiration Date
    7/31/2011
  • Award Amount
    $ 788,325.00
  • Award Instrument
    Continuing grant

The complex behavior of the cell derives from an intricate network of molecular interactions among thousands of genes and their products. Understanding how this network operates and predicting its behavior are primary goals of biology and have broad implications for life science, medicine and biotechnology.

The genomic information revolution of the last ten years has enabled new systems-level and data-driven approaches for studying cellular networks. In particular, using machine learning to model gene regulatory networks (the switching on and off of genes by regulatory proteins that bind to non-coding DNA) has emerged as a central problem in systems biology. Now, an explosion of new high-throughput technologies for measuring physical interactions between proteins and between protein and DNA provides a new data integration challenge for computational modeling of gene regulation. These new data can all be viewed as graph-structured data, or physical interaction networks.

The central computational goal of this project is to develop new machine learning algorithms for exploiting graph-structured data, including: (1) boosting with efficient graph mining; (2) graph kernels based on subgraph histogramming; and (3) information-based graph partitioning. These new algorithms will be used to integrate physical interaction network data into models of gene regulation in order to better represent underlying biological mechanisms. The focus will be two fundamental modeling problems: inferring signal transduction pathways and modeling cis-regulatory modules at the level of DNA sequence and interacting regulatory proteins. The algorithms will be applied both to publicly available data and to primary gene expression data provided by one of the investigators to study hypoxia in yeast and the response to environmental toxins in mammalian neural cells.

This project will learn systems-level models that lead to new insight into the underlying mechanisms of gene regulation and open the way to broader biological discoveries. All data, results and source code will be publicly available via the Web (http://www.cs.columbia.edu/compbio/cellular-networks) and disseminated through courses and bioinformatics software packages. The project will also create undergraduate research opportunities for joint dry and wet lab projects and outreach activities to introduce New York City public high school students to new interdisciplinary areas of science.
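
As a rough illustration of what a graph kernel based on subgraph histogramming can look like, the sketch below counts the two connected three-node patterns (open paths and triangles) in each graph and takes the dot product of the resulting histograms. This is a generic textbook-style example, not the project's algorithm; the function names `subgraph_histogram` and `histogram_kernel`, the choice of three-node patterns, and the toy node labels are all assumptions made for illustration.

```python
from itertools import combinations


def subgraph_histogram(edges):
    """Count induced 3-node connected subgraphs in an undirected graph.

    `edges` is an iterable of (u, v) pairs. Returns counts of open
    paths ("wedge") and triangles. Hypothetical helper for illustration.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    hist = {"wedge": 0, "triangle": 0}
    for a, b, c in combinations(list(adj), 3):
        # Number of edges present among the three distinct nodes.
        n_edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if n_edges == 2:
            hist["wedge"] += 1
        elif n_edges == 3:
            hist["triangle"] += 1
    return hist


def histogram_kernel(edges_g, edges_h):
    """Similarity of two graphs as the dot product of their histograms."""
    hist_g = subgraph_histogram(edges_g)
    hist_h = subgraph_histogram(edges_h)
    return sum(hist_g[k] * hist_h[k] for k in hist_g)


# Toy interaction networks with made-up node labels.
path = [("A", "B"), ("B", "C")]
triangle_with_tail = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
similarity = histogram_kernel(path, triangle_with_tail)
```

Enumerating all node triples costs cubic time in the number of nodes, so this form only suits small graphs; practical kernels on large interaction networks would rely on sampling or more efficient subgraph mining.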

  • Program Officer
    Vasant G. Honavar
  • Min Amd Letter Date
    6/5/2008
  • Max Amd Letter Date
    7/1/2009
  • ARRA Amount

Institutions

  • Name
    Sloan Kettering Institute For Cancer Research
  • City
    New York
  • State
    NY
  • Country
    United States
  • Address
    1275 York Avenue
  • Postal Code
    100650000
  • Phone Number
    6462273273

Investigators

  • First Name
    Christina
  • Last Name
    Leslie
  • Email Address
    cleslie@cbio.mskcc.org
  • Start Date
    6/5/2008