Collaborative Research: Prediction and Model Selection for New Challenging Problems with Complex Data

Information

  • NSF Award
  • 1509557
Owner
  • Award Id
    1509557
  • Award Effective Date
    8/15/2015
  • Award Expiration Date
    7/31/2018
  • Award Amount
    $ 102,185.00
  • Award Instrument
    Standard Grant

Mixed model prediction, that is, prediction based on a class of statistical models known as mixed effects models, has a fairly long history. Its traditional fields of application include genetics, agriculture, education, and surveys. New and challenging problems have now emerged from fields such as business and the health sciences, in addition to the traditional fields. Methods of mixed model prediction are potentially applicable to these problems, but not without further methodological and computational development. Some of these problems occur when interest lies at the subject level, as in personalized medicine, or at the level of a (small) sub-population, such as a small community, rather than at the level of a large population. In such cases, it is possible to make substantial gains in prediction accuracy by identifying the class to which a new subject belongs. Other challenging problems occur when applying existing model search strategies to incomplete or missing data, in model search or selection when prediction is of primary interest, and in making statistical inference based on the result of model search or selection. This collaborative research project aims to solve these challenging problems in prediction and model selection for complex data, such as incomplete or missing data and data that are correlated due to the presence of random effects.

In this collaborative research project, the PIs develop a novel statistical method, called classified mixed model prediction, to identify the subject class. The new subject is thereby associated with a random effect corresponding to the same class in the training data, so that the mixed model prediction method can be used to make the best prediction. Furthermore, the PIs develop a recently proposed method, called the E-MS algorithm, for model selection in the presence of incomplete or missing data. The PIs also develop an idea called predictive model selection by deriving a predictive measure of lack-of-fit and combining this measure with a recently developed class of model selection strategies called the fence methods. Finally, the PIs develop a unified jackknife method to accurately assess uncertainty in mixed model analysis after model selection. Theories will be established for these new methods, and their performance and potential gains will be studied through extensive Monte Carlo simulations. The new methods will be implemented in the R language and environment for statistical computing and graphics. All of the developed methodologies will be applied and tested in a number of applications through close collaborations with experts who will provide access to the data as well as guidance in the interpretation and dissemination of findings. The fields of application include genetics, health and medicine, agriculture, education, business, and economics. The research project will also promote teaching, training, and learning that involve under-represented groups, and build research networks between the PIs' institutions.
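
The idea of matching a new subject to a class in the training data can be illustrated with a small sketch. It is not the PIs' implementation (the abstract states that the methods will be implemented in R), and it rests on several simplifying assumptions: a random-intercept model y_ij = mu + alpha_i + e_ij, a variance ratio sigma_e^2 / sigma_alpha^2 that is treated as known, and the grand mean as the estimate of mu. The function names `fit_random_intercepts` and `classify_and_predict` are hypothetical. The new subject is matched to the class whose fitted mean lies closest to the average of the new observations, and that class's shrunken random effect is then used for prediction.

```python
from statistics import mean


def fit_random_intercepts(groups, variance_ratio=1.0):
    """Estimate class-level random intercepts by shrinkage.

    groups: dict mapping a class label to a list of responses.
    variance_ratio: assumed-known sigma_e^2 / sigma_alpha^2 (an assumption
        of this sketch; in practice it would be estimated).
    Returns (mu_hat, effects), where effects maps label -> shrunken intercept.
    """
    all_values = [y for ys in groups.values() for y in ys]
    mu_hat = mean(all_values)  # simple grand-mean estimate of the fixed effect
    effects = {}
    for label, ys in groups.items():
        n = len(ys)
        shrink = n / (n + variance_ratio)  # shrinkage factor for a class of size n
        effects[label] = shrink * (mean(ys) - mu_hat)
    return mu_hat, effects


def classify_and_predict(new_obs, mu_hat, effects):
    """Match new observations to the closest class, then predict its mean.

    The class is chosen by minimizing the squared distance between the
    candidate predictor mu_hat + alpha_hat_i and the mean of new_obs.
    """
    target = mean(new_obs)
    best = min(effects, key=lambda label: (mu_hat + effects[label] - target) ** 2)
    return best, mu_hat + effects[best]


if __name__ == "__main__":
    # Hypothetical toy data: three classes with clearly separated means.
    training = {
        "a": [1.0, 1.2, 0.8],
        "b": [5.0, 5.2, 4.8],
        "c": [9.0, 9.2, 8.8],
    }
    mu_hat, effects = fit_random_intercepts(training, variance_ratio=1.0)
    label, prediction = classify_and_predict([8.5, 9.1], mu_hat, effects)
    print(label, prediction)
```

Shrinking each class mean toward the overall mean is what distinguishes this from a plain nearest-mean rule: small classes borrow strength from the rest of the data, and the amount of borrowing is governed by the assumed variance ratio.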

  • Program Officer
    Gabor J. Szekely
  • Min Amd Letter Date
    8/7/2015
  • Max Amd Letter Date
    8/7/2015
  • ARRA Amount

Institutions

  • Name
    Oregon Health and Science University
  • City
    Portland
  • State
    OR
  • Country
    United States
  • Address
    3181 S W Sam Jackson Park Rd
  • Postal Code
    972393098
  • Phone Number
    5034947784

Investigators

  • First Name
    Thuan
  • Last Name
    Nguyen
  • Email Address
    nguythua@ohsu.edu
  • Start Date
    8/7/2015 12:00:00 AM

Program Element

  • Text
    STATISTICS
  • Code
    1269