Multi-objective representation learning methods for interpretable predictions of patient outcomes using electronic health records

Information

  • Research Project
  • ApplicationId
    10453863
  • Core Project Number
    R00LM012926
  • Full Project Number
    4R00LM012926-03
  • Serial Number
    012926
  • FOA Number
    PA-18-398
  • Sub Project Id
  • Project Start Date
    9/1/2021
  • Project End Date
    8/31/2024
  • Program Officer Name
    VANBIERVLIET, ALAN
  • Budget Start Date
    9/1/2021
  • Budget End Date
    8/31/2022
  • Fiscal Year
    2021
  • Support Year
    03
  • Suffix
  • Award Notice Date
    8/30/2021

Multi-objective representation learning methods for interpretable predictions of patient outcomes using electronic health records

Project Summary/Abstract

This project proposes new methods for representing data in electronic health records (EHR) to improve predictive modeling and interpretation of patient outcomes. EHR data offer a promising opportunity for advancing the understanding of how clinical decisions and patient conditions interact over time to influence patient health. However, EHR data are difficult to use for predictive modeling due to the various data types they contain (continuous, categorical, text, etc.), their longitudinal nature, the high amount of non-random missingness for certain measurements, and other concerns. Furthermore, patient outcomes often have heterogeneous causes and require information to be synthesized from several clinical lab measures and patient visits. The core challenge at hand is overcoming the mismatch between data representations in the EHR and the assumptions underlying commonly used statistical and machine learning (ML) methods. To this end, this project proposes novel wrapper-based methods for learning informative features from EHR data. Both methods propose specialized operators to handle sequential data, time delays, and variable interactions, and have the capacity to discover underlying clinical rules/decisions that affect patient outcomes. Importantly, both methods also produce archives of possible models that represent the best trade-offs between complexity and accuracy, which assists in model interpretation. These methodological advances are made possible by encoding a rich set of data operations as nodes in a directed acyclic graph, and optimizing the graph structures using multi-objective optimization. The central hypothesis of this research is that multi-objective optimization can learn effective data representations from the EHR to produce accurate, explanatory models of patient outcomes.
Preliminary work has shown that these methods can effectively learn low-order data representations that improve the predictive ability of several state-of-the-art ML methods. This technique demonstrates good scaling properties with high-dimensional biomedical data. Aim 1 (K99) is to develop a multi-objective feature engineering method that pairs with existing ML methods to iteratively improve their performance by constructing new features from the raw data and using feedback from the trained model to guide feature construction. In Aim 2 (K99), this method is applied to form predictive models of the risk of heart disease and heart failure using longitudinal EHR data. The resultant models will be interpreted with the help of mentors in order to translate predictions into clinical recommendations. For Aim 3 (R00), a second method is proposed that uses a similar framework to optimize existing neural network approaches in order to simplify their structure as much as possible while maintaining accuracy. The goal of Aim 4 (R00) is to identify hospital patients who are at risk of readmission and propose point-of-care strategies to mitigate that risk. This goal is facilitated through the application of the proposed methods to patient data collected from the Hospital of the University of Pennsylvania, the Geisinger Health System, and publicly available EHR databases.
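
The abstract describes archives of models that represent the best trade-offs between complexity and accuracy. The following is a minimal sketch of how such an archive could be selected, not the project's actual implementation. The `Candidate` type, the `dominates` and `pareto_archive` helpers, the choice of node count as the complexity measure, and all numbers are hypothetical and made up for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical candidate model scored on two objectives (both minimized)."""
    name: str        # illustrative label, not a real model
    complexity: int  # assumed measure, e.g. node count of the feature graph
    error: float     # assumed measure, e.g. validation error


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and better on one."""
    no_worse = a.complexity <= b.complexity and a.error <= b.error
    strictly_better = a.complexity < b.complexity or a.error < b.error
    return no_worse and strictly_better


def pareto_archive(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only non-dominated candidates, ordered from simplest to most complex."""
    front = [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
    return sorted(front, key=lambda c: (c.complexity, c.error))


# Made-up scores, for illustration only.
candidates = [
    Candidate("m1", 3, 0.30),
    Candidate("m2", 5, 0.22),
    Candidate("m3", 5, 0.25),   # dominated by m2 (same size, higher error)
    Candidate("m4", 9, 0.21),
    Candidate("m5", 12, 0.24),  # dominated by m2 and m4
]

archive = pareto_archive(candidates)
for c in archive:
    print(f"{c.name}: complexity={c.complexity}, error={c.error:.2f}")
```

Reading the archive from simplest to most complex shows how much accuracy each increase in model size buys, which is one way such an archive can support interpretation.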

IC Name
NATIONAL LIBRARY OF MEDICINE
  • Activity
    R00
  • Administering IC
    LM
  • Application Type
    4
  • Direct Cost Amount
    133644
  • Indirect Cost Amount
    102906
  • Total Cost
    236550
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    879
  • Ed Inst. Type
  • Funding ICs
    NLM:236550
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    NSS
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    BOSTON CHILDREN'S HOSPITAL
  • Organization Department
  • Organization DUNS
    076593722
  • Organization City
    BOSTON
  • Organization State
    MA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    021155724
  • Organization District
    UNITED STATES