Clinicopathologic and Genetic Profiling through Machine Learning and Natural Language Processing for Precision Lung Cancer Management

Information

  • Research Project
  • ApplicationId
    10250521
  • Core Project Number
    R01CA249758
  • Full Project Number
    5R01CA249758-03
  • Serial Number
    249758
  • FOA Number
    PAR-16-404
  • Sub Project Id
  • Project Start Date
    9/25/2019
  • Project End Date
    8/31/2023
  • Program Officer Name
    OSSANDON, MIGUEL
  • Budget Start Date
    9/1/2021
  • Budget End Date
    8/31/2022
  • Fiscal Year
    2021
  • Support Year
    03
  • Suffix
  • Award Notice Date
    8/25/2021

PROJECT SUMMARY/ABSTRACT
Lung cancer is the second most common type of cancer and the leading cause of cancer death in men and women. Among the different types of lung cancer, non-small cell lung cancer (NSCLC) is the most common, constituting 85% to 90% of all lung cancer cases. Current cancer research has shown that multiple somatic mutations affect the sensitivity of patients to various drugs used for NSCLC treatment. These mutations are essential factors for determining the most effective, "personalized" treatment for each NSCLC patient; however, most NSCLC patients develop resistance to these targeted therapies in their first year of treatment. Many mechanisms of this resistance are still unknown. Designing and prescribing better targeted therapies for NSCLC patients requires further understanding, particularly of the relationship between NSCLC tumors' pathological and clinical findings, genetic profiles, and targeted therapy responses and resistance. Currently, there is no computational method to connect observations and findings from pathology reports, medical records, somatic mutations, and targeted therapy resistance. This project provides a plan to build a novel computational method to identify statistically significant associations between the pathological findings of NSCLC tumors and the presence of clinically actionable somatic mutations. Furthermore, these associations, in combination with an innovative set of feature analyses from pathology reports and electronic medical records, will be leveraged to build and validate a machine-learning model to identify NSCLC patients with clinically actionable somatic mutations. Finally, the associated clinical, pathological, and genetic findings for NSCLC patients will be used in a new machine-learning framework to predict patients' time-to-resistance to targeted therapies.
The data required to build and validate the proposed models in this project will be obtained through a collaboration with the Department of Pathology's Laboratory for Clinical Genomics and Advanced Technologies at Dartmouth-Hitchcock Medical Center. In addition to internal validation, the investigators in this proposal have established a collaboration with the Department of Pathology at the University of Vermont Medical Center to apply and validate the developed models on an external data source. Upon successful implementation of this bioinformatics approach, the developed models will be able to reveal statistically significant links between clinical and pathological findings, clinically actionable somatic mutations, and targeted-therapy responses for a better understanding of NSCLC tumor development and treatment. The proposed approach will provide an accurate, fast, and inexpensive pre-selection method for screening NSCLC patients with clinically actionable mutations for translational research and precision medicine. Furthermore, the proposed machine-learning method to identify NSCLC patients' resistance to targeted therapies will help healthcare providers select the best treatment strategies for these patients, improve their health outcomes, and establish this precision medicine paradigm for other types of cancer.
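
The abstract does not say which statistical test will be used to find associations between pathological findings and somatic mutations. As an illustration only, the sketch below assumes a two-sided Fisher's exact test on a 2x2 table (finding present/absent versus mutation present/absent). The function name and the counts are hypothetical and are not taken from the project.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Illustrative helper (hypothetical name), not the project's method.
    Rows: pathological finding present / absent.
    Columns: mutation present / absent.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    p_value = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts, invented for illustration only.
    p = fisher_exact_two_sided(12, 8, 5, 25)
    print(f"two-sided p-value: {p:.4g}")
```

In practice, one such test per finding-mutation pair would also need a multiple-comparison correction before any association is called significant.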

IC Name
NATIONAL CANCER INSTITUTE
  • Activity
    R01
  • Administering IC
    CA
  • Application Type
    5
  • Direct Cost Amount
    228750
  • Indirect Cost Amount
    146400
  • Total Cost
    375150
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    394
  • Ed Inst. Type
    SCHOOLS OF MEDICINE
  • Funding ICs
    NCI:375150
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    BLR
  • Study Section Name
    Biomedical Library and Informatics Review Committee
  • Organization Name
    DARTMOUTH COLLEGE
  • Organization Department
    INTERNAL MEDICINE/MEDICINE
  • Organization DUNS
    041027822
  • Organization City
    HANOVER
  • Organization State
    NH
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    037551421
  • Organization District
    UNITED STATES