Scalable tools for the analysis of chemical compounds using graph-based querying

Information

  • Research Project
  • 7293378
  • ApplicationId
    7293378
  • Core Project Number
    R43GM081328
  • Full Project Number
    1R43GM081328-01
  • Serial Number
    81328
  • FOA Number
    PAR-06-535
  • Sub Project Id
  • Project Start Date
    9/1/2007
  • Project End Date
    8/31/2009
  • Program Officer Name
    OKITA, RICHARD T.
  • Budget Start Date
    9/1/2007
  • Budget End Date
    8/31/2009
  • Fiscal Year
    2007
  • Support Year
    1
  • Suffix
  • Award Notice Date
    8/21/2007
Scalable tools for the analysis of chemical compounds using graph-based querying

DESCRIPTION (provided by applicant): The generation, manipulation, storage and retrieval of chemical structures and subsequent calculation of various properties, often related to their biological activity, have become extremely important for drug discovery. The resulting field of Cheminformatics has blossomed in recent years and has been a hotbed for the application of data mining and database principles to collections of chemical compounds. The wide adoption of these techniques has led to improved methods for representation of chemical structures, similarity-based retrieval of chemical compounds, diversity analysis, and substructure mining. The representation of chemical compounds as graphs captures the essential aspects of chemical structures in a natural way that can be communicated easily. Recent techniques for graph querying and mining have demonstrated great promise for scalability as well as an improved quality of results over traditional representation techniques such as fingerprints. These techniques include novel ways of graph matching, the organization of graphs in a hierarchical index structure, and the mining of a set of graphs to find statistically over-represented motifs. The proposed research will develop computational tools based on these ideas and investigate the feasibility of the techniques on diverse and large data sets. Graph-based techniques for similar compound retrieval, diversity analysis, and substructure mining will be compared to competing techniques based on other representations of chemical structures. Finally, a system that integrates chemical compound databases with biological databases will be developed. The resulting analysis methods are expected to make a significant impact on the complex, time-consuming, and expensive process of drug discovery.

Graph-based representation of chemical compounds results in a more accurate realization of the chemical space. The use of recent techniques in graph querying and mining will enable data analysis that can scale to millions of compounds. The developed system will also integrate information on chemical compounds with biological activity and protein interaction networks, thus enabling more efficient drug discovery.
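
To make the graph representation mentioned in the description concrete, the following is a minimal sketch of a molecule stored as a labeled graph (atoms as nodes, bonds as edges) together with a naive backtracking substructure query. It is an illustration only and not the project's method: the names `MolGraph` and `find_substructure` are hypothetical, bond orders and hydrogens are ignored, and the search is exponential in the worst case, so it does not reflect the indexing or scalability techniques the abstract refers to.

```python
from typing import Dict, List, Optional, Set, Tuple


class MolGraph:
    """Hypothetical container: a molecule as an undirected labeled graph."""

    def __init__(self, atoms: List[str], bonds: List[Tuple[int, int]]):
        self.atoms = atoms  # element symbol per node, e.g. "C", "O"
        self.adj: Dict[int, Set[int]] = {i: set() for i in range(len(atoms))}
        for a, b in bonds:  # bond order is ignored in this sketch
            self.adj[a].add(b)
            self.adj[b].add(a)


def find_substructure(query: MolGraph, target: MolGraph) -> Optional[Dict[int, int]]:
    """Return one mapping {query atom -> target atom}, or None if no match exists."""
    mapping: Dict[int, int] = {}
    used: Set[int] = set()

    def extend(q: int) -> bool:
        if q == len(query.atoms):
            return True  # every query atom has been placed
        for t in range(len(target.atoms)):
            if t in used or target.atoms[t] != query.atoms[q]:
                continue
            # Each already-mapped neighbor of q must be bonded to t in the target.
            if all(mapping[n] in target.adj[t] for n in query.adj[q] if n in mapping):
                mapping[q] = t
                used.add(t)
                if extend(q + 1):
                    return True
                del mapping[q]  # backtrack
                used.discard(t)
        return False

    return dict(mapping) if extend(0) else None


# Heavy-atom skeleton of ethanol (C-C-O) and a C-O query fragment.
ethanol = MolGraph(["C", "C", "O"], [(0, 1), (1, 2)])
c_o = MolGraph(["C", "O"], [(0, 1)])
print(find_substructure(c_o, ethanol))  # {0: 1, 1: 2}
```

Unlike a fixed-length fingerprint, this representation keeps the full connectivity, so a query either maps onto the target atom by atom or fails outright; the cost is that each query is a subgraph search rather than a bit comparison.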

IC Name
NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
  • Activity
    R43
  • Administering IC
    GM
  • Application Type
    1
  • Direct Cost Amount
  • Indirect Cost Amount
  • Total Cost
    223300
  • Sub Project Total Cost
  • ARRA Funded
  • CFDA Code
    859
  • Ed Inst. Type
  • Funding ICs
    NIGMS:223300
  • Funding Mechanism
  • Study Section
    BDMA
  • Study Section Name
    Biodata Management and Analysis Study Section
  • Organization Name
    ACELOT, INC.
  • Organization Department
  • Organization DUNS
    784692001
  • Organization City
    SANTA BARBARA
  • Organization State
    CA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    931111471
  • Organization District
    UNITED STATES