Centralized assay datasets for modelling support of small drug discovery organizations

Information

  • Research Project
  • ApplicationId
    9751326
  • Core Project Number
    R44GM122196
  • Full Project Number
    5R44GM122196-03
  • Serial Number
    122196
  • FOA Number
    PA-17-302
  • Sub Project Id
  • Project Start Date
    1/1/2017
  • Project End Date
    7/31/2020
  • Program Officer Name
    RAVICHANDRAN, VEERASAMY
  • Budget Start Date
    8/1/2019
  • Budget End Date
    7/31/2020
  • Fiscal Year
    2019
  • Support Year
    03
  • Suffix
  • Award Notice Date
    7/29/2019


Project Summary: The growing importance of artificial intelligence (AI) is evident in the rising number of companies, and of deals over the past year between pharma and smaller companies, that use machine learning to assist in drug discovery. The steady growth of structure-activity data for diverse targets, diseases and molecular properties poses a considerable challenge because these data are generally not readily accessible for machine learning: content resides in a mixture of public databases (with differing levels of curation), disparate files within research groups, and non-curated literature publications. In Phase I, Collaborations Pharmaceuticals Inc. developed a prototype of the Assay Central software and used it with a wide variety of structure-activity data from public and private sources, formatted and unformatted, to enable work on neglected, rare or common disease targets. Public data were combined with collaborator- and customer-contributed data, using original software and the applied chemistry judgment of an expert team. In Phase I we also created error checking and correction software, built and validated Bayesian models with the collected and cleaned datasets, and developed new data visualization tools. The resulting software environment lets the user compile structure-activity data for building computational models and create selections of these models for sharing with collaborators as needed. The software can in turn be used to score new molecules and to visualize the multiple outputs in various formats. We have enabled ~14 collaborative projects that have shared models on specific targets such as PyrG for tuberculosis (identifying a lead compound), HIV reverse transcriptase, and whole-cell screening for leishmaniasis, as well as P450 and nuclear receptor models (e.g. estrogen receptor) relevant to toxicology.
We have used Assay Central in our ongoing internal projects on Ebola, HIV and tuberculosis small-molecule drug discovery. In Phase II, we propose the following aims to develop Assay Central into a production tool for enabling drug discovery collaborations, which will remain our focus. In Phase I we performed a preliminary analysis of different machine learning algorithms with select drug discovery datasets. In Phase II we will perform a thorough evaluation and selection of additional machine learning algorithms and molecular descriptors, and assess combinations of algorithms (e.g. Bayesian and Deep Learning). We will implement disease/target definitions for machine learning models to facilitate drug discovery. We will enable molecule selection and automated design and optimization. Having a tool such as Assay Central readily available will empower scientists to leverage public data, private data, or a combination of both in their drug discovery tasks. Developing this software suite of computational models with public data will enable us to identify foundations, academics and potential collaborators that generate preliminary data to test models. These efforts will dramatically increase the number of projects we can work on, create new IP, and generate employment using machine learning focused on drug discovery, particularly in the area of rare and neglected diseases. Assay Central benefits include: 1. Ease of deployment and use, with a Java file executed by users without the need for IT support; 2. Built on industry standard technologies; 3. Graphical display of models that provides instant feedback; 4. Model applicability with multiple methods to assess scores and graphics.

IC Name
NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
  • Activity
    R44
  • Administering IC
    GM
  • Application Type
    5
  • Direct Cost Amount
  • Indirect Cost Amount
  • Total Cost
    692845
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    859
  • Ed Inst. Type
  • Funding ICs
    NIGMS:692845
  • Funding Mechanism
    SBIR-STTR RPGs
  • Study Section
    ZRG1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    COLLABORATIONS PHARMACEUTICALS, INC.
  • Organization Department
  • Organization DUNS
    079704473
  • Organization City
    FUQUAY VARINA
  • Organization State
    NC
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    275269278
  • Organization District
    UNITED STATES