Harnessing Patient Generated Data to Find Causes and Effects of Diet in Pregnancy

Information

  • Research Project
  • ApplicationId
    9980490
  • Core Project Number
    R01LM013308
  • Full Project Number
    5R01LM013308-02
  • Serial Number
    013308
  • FOA Number
    PAR-19-004
  • Sub Project Id
  • Project Start Date
    8/1/2019
  • Project End Date
    4/30/2022
  • Program Officer Name
    YE, JANE
  • Budget Start Date
    5/1/2020
  • Budget End Date
    4/30/2021
  • Fiscal Year
    2020
  • Support Year
    02
  • Suffix
  • Award Notice Date
    4/9/2020

Enormous amounts of biomedical data are generated by hospitals, but most of this data is available only after people become ill. For people with chronic diseases such as diabetes, though, many important events happen outside of the medical system. Patient generated health data (PGHD) can provide detailed insight into an individual's health during daily life. With long-term continuous glucose data, activity data, and food logs, we could develop personalized models of how these factors affect blood glucose and deliver personalized guidance to patients on how to better manage it. Transforming PGHD into information to guide decisions is a highly general problem that applies to all forms of diabetes and to other chronic diseases.
We specifically focus on identifying dietary and lifestyle risk factors for gestational diabetes mellitus (GDM). GDM occurs in 9% of pregnancies and leads to a 7-fold increase in type 2 diabetes risk after birth, making it a significant public health problem. Pregnancy provides an ideal test bed for methods designed to make use of PGHD and uncover causes, as outcomes can be captured in a limited study duration. Motivated by trying to find causes and effects of nutrition in pregnancy, we develop generalizable algorithms that address widespread challenges in the use of PGHD for causal inference.
First, existing causal inference methods assume we have well-defined variables (e.g., body weight), but nutrition can be measured in many ways (calories, macronutrients, food groups). Choosing among these representations places a large burden on users and limits the potential for data-driven inference. We introduce the first causal inference algorithm that automatically identifies optimal variable granularity for each relationship by leveraging ontologies. This allows identification of different effects between, say, protein and specific meats on health outcomes, without users needing to specify such hypotheses.
Second, while individual-level data is essential for personalized inference, only limited data may be available when a treatment decision must be made or when health status is changing over time, such as during pregnancy. Leveraging population data can yield more accurate inferences, but existing methods are unable to identify relevant data dynamically, and pregnant individuals may be more similar to others at the same stage of pregnancy than to themselves in the recent past. We introduce new methods for dynamic causal transfer learning that continually identify and adapt relevant population data for personalized causal inference. We initially test our approach on publicly available ICU, diabetes, and nutrition datasets, before collecting a unique dietary and activity dataset from 150 pregnant individuals.
RELEVANCE (See instructions): Gestational diabetes mellitus (GDM) is a significant public health problem, and while changes in diet during pregnancy may increase risk, it is currently unknown which dietary factors have the most influence. This project develops better methods for continually assessing GDM risk and gaining insight into diet in a long-term, individualized way. The methods developed will be generalizable to other types of health data and may in particular yield insights into causes and effects of diabetes and chronic disease.

IC Name
NATIONAL LIBRARY OF MEDICINE
  • Activity
    R01
  • Administering IC
    LM
  • Application Type
    5
  • Direct Cost Amount
    225576
  • Indirect Cost Amount
    42429
  • Total Cost
    268005
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    879
  • Ed Inst. Type
    BIOMED ENGR/COL ENGR/ENGR STA
  • Funding ICs
    NLM:268005
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    ZLM1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    STEVENS INSTITUTE OF TECHNOLOGY
  • Organization Department
    BIOSTATISTICS & OTHER MATH SCI
  • Organization DUNS
    064271570
  • Organization City
    HOBOKEN
  • Organization State
    NJ
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    070305906
  • Organization District
    UNITED STATES