Statistical Methods for Multi-Study Predictions

Information

  • NSF Award
  • 1810829
  • Award Id
    1810829
  • Award Effective Date
    8/1/2018
  • Award Expiration Date
    7/31/2021
  • Award Amount
    $ 102,794.00
  • Award Instrument
    Continuing grant


Science faces significant challenges with the replicability of studies. These challenges affect prediction models used across a broad spectrum of business, scientific, and social activities. The investigators have identified underutilized opportunities to make most prediction modeling techniques more likely to produce replicable results by training them on multiple studies and rewarding good replicability during this training phase. Recent work indicates that this novel and general strategy provides insight into the replicability of predictions and is a promising avenue for systematic improvement. As many areas of science and technology become data-rich, multiple datasets are more commonly available for training, and it becomes more important to consider them simultaneously and use them systematically to improve replicability. Steps toward more easily replicable predictions would increase public confidence in the scientific process, facilitate dissemination of results, and strengthen public engagement with science and technology.

The goal of this project is to make progress in the area of cross-study replication of predictions. The investigators have identified two fundamental and underutilized opportunities: 1) to train on multiple studies; 2) to leverage ensembles of prediction models, each trained on one study or on a subset of the studies. The combination of these two elements can be used to design robust prediction algorithms that are trained to incorporate replicability across different contexts and populations. In this project, the investigators propose to implement and evaluate specific prediction techniques within this paradigm; to investigate their statistical properties theoretically and empirically; to compare them to existing alternative multi-study statistical methods; and to build free, open-source software to implement the successful strategies.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
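
As an illustration of the two elements named in the abstract (training on multiple studies, and combining models that were each trained on a single study), the following is a minimal sketch and not the investigators' method or software. It assumes simulated data, ordinary least squares as the single-study learner, and a hypothetical weighting rule in which each model is weighted by the inverse of its mean squared error on the studies it was not trained on. All function names are invented for this example.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply a fitted coefficient vector (intercept first) to new features."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def cross_study_weights(studies, models):
    """Weight each single-study model by how well it predicts the OTHER studies.

    Assumption for illustration: weights are proportional to the inverse of the
    average held-out-study mean squared error, then normalized to sum to one.
    """
    errors = []
    for k, coef in enumerate(models):
        mse = [
            np.mean((predict_linear(coef, X) - y) ** 2)
            for j, (X, y) in enumerate(studies)
            if j != k  # skip the study this model was trained on
        ]
        errors.append(np.mean(mse))
    inv = 1.0 / np.asarray(errors)
    return inv / inv.sum()


def ensemble_predict(models, weights, X):
    """Weighted average of the single-study models' predictions."""
    preds = np.stack([predict_linear(c, X) for c in models])
    return weights @ preds


def simulate_study(rng, n, beta, noise):
    """Simulated study: linear signal plus Gaussian noise (hypothetical data)."""
    X = rng.normal(size=(n, len(beta)))
    y = X @ beta + rng.normal(scale=noise, size=n)
    return X, y


rng = np.random.default_rng(0)
base = np.array([1.0, -2.0, 0.5])

# Four studies whose true coefficients differ slightly, mimicking heterogeneity.
studies = [
    simulate_study(rng, 200, base + rng.normal(scale=0.2, size=3), 1.0)
    for _ in range(4)
]

# One model per study, then weights that reward cross-study performance.
models = [fit_linear(X, y) for X, y in studies]
weights = cross_study_weights(studies, models)

# Predict on a new, unseen study.
X_new, y_new = simulate_study(rng, 200, base, 1.0)
y_hat = ensemble_predict(models, weights, X_new)
```

In this sketch, a model that fits its own study well but transfers poorly to the others receives a small weight, which is one simple way to reward replicability during training. Other learners or weighting rules could be substituted without changing the overall structure.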

  • Program Officer
    Nandini Kannan
  • Min Amd Letter Date
    7/31/2018
  • Max Amd Letter Date
    7/31/2018
  • ARRA Amount

Institutions

  • Name
    Dana-Farber Cancer Institute
  • City
    Boston
  • State
    MA
  • Country
    United States
  • Address
    Office of Grants and Contracts
  • Postal Code
    022155450
  • Phone Number
    6176323940

Investigators

  • First Name
    Giovanni
  • Last Name
    Parmigiani
  • Email Address
    gp@jimmy.harvard.edu
  • Start Date
    7/31/2018
  • First Name
    Lorenzo
  • Last Name
    Trippa
  • Email Address
    ltrippa@jimmy.harvard.edu
  • Start Date
    7/31/2018

Program Element

  • Text
    STATISTICS
  • Code
    1269