Collaborative Research: Identifying Reproducible Research Using Human-in-the-loop Machine Learning

Information

  • Award Id
    2022443
  • Award Effective Date
9/1/2020
  • Award Expiration Date
8/31/2022
  • Award Amount
$155,741.00
  • Award Instrument
    Standard Grant


Research quality thrives under healthy skepticism, where scientists retest hypotheses, creating higher levels of confidence in the original findings and a sturdy foundation for extending the work. Recent attempts to sample scientific research in psychology, economics, and medicine, however, have shown that more papers fail manual replication tests than pass them. Consequently, several approaches have been proposed to make replication studies more efficient, including new statistics, surveys, and prediction markets. However, new statistics have been adopted slowly, and the high costs associated with surveys and prediction markets make these methods impractical for estimating the reproducibility of more than a few hundred studies out of the millions of research papers that serve as building blocks for current and future work. The proposed research aims to develop metrics and tools that make replication studies of existing work more efficient, with one additional benefit: helping scientists, scholars, and technologists self-evaluate their work before publishing it.

This proposal combines efforts to create new datasets, 'reproducibility' metrics, and machine learning models that estimate a confidence level in the reproducibility of a published work. The deliverables will include new datasets covering the success and failure of hundreds of scientific papers in psychology, economics, and their related subfields. The metrics will go beyond a binary classification of whether a publication is estimated to be reproducible; they will quantify a level of confidence that the work is likely to be reproducible. The machine learning models will also help scientists interpret and explain confidence scores, aiding them in learning about the factors that correlate with reproducibility. In all three areas, the project aims to provide scientists with better tools to evaluate the reproducibility of their own and others' work, creating a better foundation of knowledge for advancing research.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
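To make the idea of a graded confidence score concrete, the following Python sketch shows one common way such an estimate could be produced: a calibrated classifier whose predicted probability serves as a confidence level for reproducibility, alongside simple feature importances for interpretation. This is a minimal illustration only; the feature names, synthetic data, and model choice are hypothetical placeholders and do not describe the project's actual datasets, metrics, or models.

```python
"""
Illustrative sketch: a calibrated classifier that outputs a confidence
score for reproducibility rather than a bare yes/no label, plus rough
feature attributions. All features and data here are hypothetical.
"""
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical paper-level features (placeholders, not the project's real ones):
feature_names = ["log_sample_size", "p_value", "preregistered", "open_data"]
X = np.column_stack([
    rng.normal(4.0, 1.0, 500),    # log of study sample size
    rng.uniform(0.0, 0.05, 500),  # reported p-value
    rng.integers(0, 2, 500),      # pre-registration flag
    rng.integers(0, 2, 500),      # open-data flag
])
# Synthetic label: 1 = replication succeeded, 0 = failed.
y = (X[:, 0] + X[:, 2] + X[:, 3] - 60 * X[:, 1]
     + rng.normal(0, 1, 500) > 4).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Calibrate the classifier so predicted probabilities behave like confidence levels.
base = RandomForestClassifier(n_estimators=200, random_state=0)
model = CalibratedClassifierCV(base, method="isotonic", cv=5)
model.fit(X_train, y_train)

# Confidence that each held-out paper would replicate, instead of a binary call.
confidence = model.predict_proba(X_test)[:, 1]
print("Example reproducibility confidence scores:", confidence[:5].round(2))

# Rough interpretability: which features the underlying forest relied on.
base.fit(X_train, y_train)
for name, importance in zip(feature_names, base.feature_importances_):
    print(f"{name}: {importance:.2f}")
```

The calibration step matters for the "confidence level" framing: an uncalibrated score of 0.8 need not mean an 80% chance of successful replication, whereas a calibrated probability is meant to be read that way.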

  • Program Officer
    Joshua Trapani
  • Min Amd Letter Date
9/8/2020
  • Max Amd Letter Date
10/13/2020
  • ARRA Amount

Institutions

  • Name
    Northern Illinois University
  • City
DeKalb
  • State
    IL
  • Country
    United States
  • Address
    301 Lowden Hall
  • Postal Code
60115-2828
  • Phone Number
(815) 753-1581

Investigators

  • First Name
    David
  • Last Name
    Koop
  • Email Address
    dakoop@niu.edu
  • Start Date
9/8/2020
  • First Name
    Hamed
  • Last Name
    Alhoori
  • Email Address
    alhoori@niu.edu
  • Start Date
9/8/2020

Program Element

  • Text
    Science of Science

Program Reference

  • Text
    SCIENCE OF SCIENCE POLICY
  • Code
    7626