Inference After Predictor Selection

Information

  • NSF Award
  • 1307642
  • Award Id
    1307642
  • Award Effective Date
    8/15/2013
  • Award Expiration Date
    2/28/2014
  • Award Amount
    $ 139,999.00
  • Award Instrument
    Standard Grant

There are three goals for this project. The first goal is to develop data-driven assessments of the complexity of data generators and of the complexity of the predictive techniques to be used for a data generator, and then to relate the two. It is expected that a complexity matching principle between data generators and their predictors will be established. The motivation is to speed the search for predictors that have low generalization error. The second goal is to develop techniques to derive modeling information from good predictors. The motivation is to be able to make statements about the data generator beyond numerical prediction. The third goal is to use these techniques on a complex data set for which a predictive approach is essential because the extreme complexity of the data means it defies conventional modeling. The motivation is to verify that the complexity-based techniques give reliable inferences for an important question such as 'which of those who have suffered a traumatic event are likely to get post-traumatic stress disorder'.

The motivation for the overall project is to find ways to get information out of data that is so complex that conventional techniques are ineffective. Such data is becoming increasingly common as the number of data types increases and as databases become more comprehensive. The problem with conventional techniques seems to be that they assume a physically meaningful model before there is a strong enough basis even to propose one. The approach here is significant because it is overtly predictive: instead of proposing models, one can propose predictors that are easier to test and then study the predictors to make statements about whatever it was that generated the data. This reverses the usual approach in which one models first and then predicts.

  • Program Officer
    Gabor J. Szekely
  • Min Amd Letter Date
    8/7/2013
  • Max Amd Letter Date
    8/7/2013
  • ARRA Amount

Institutions

  • Name
    University of Miami School of Medicine
  • City
    Coral Gables
  • State
    FL
  • Country
    United States
  • Address
    1320 S. Dixie Highway Suite 650
  • Postal Code
    331462926
  • Phone Number
    3052843924

Investigators

  • First Name
    BERTRAND
  • Last Name
    CLARKE
  • Email Address
    bclarke3@unl.edu
  • Start Date
    8/7/2013 12:00:00 AM

Program Element

  • Text
    STATISTICS
  • Code
    1269