Collaborative Research: New Regression Models and Methods for Studying Multiple Categorical Responses

Information

  • Award Id
    2415067
  • Award Effective Date
    1/15/2024
  • Award Expiration Date
    8/31/2024
  • Award Amount
    $ 67,380.00
  • Award Instrument
    Continuing Grant

In many areas of scientific study, including bioengineering, epidemiology, genomics, and neuroscience, an important task is to model the relationship between multiple categorical outcomes and a large number of predictors. In cancer research, for example, it is crucial to model whether a patient has cancer of subtype A, B, or C and whether the patient's mortality risk is high or low, given the expression of thousands of genes. However, existing statistical methods either cannot be applied, fail to capture the complex relationships between the response variables, or lead to models that are difficult to interpret and thus yield little scientific insight. The PIs address this deficiency by developing multiple new statistical methods. For each new method, the PIs will provide theoretical justifications and fast computational algorithms. Along with graduate and undergraduate students, the PIs will also create publicly available software that will enable applications across both academia and industry.

This project aims to address a fundamental problem in multivariate categorical data analysis: how to parsimoniously model the joint probability mass function of many categorical random variables given a common set of high-dimensional predictors. The PIs will tackle this problem by using emerging techniques in tensor decompositions, dimension reduction, and both convex and non-convex optimization. The project focuses on three research directions: (1) a latent variable approach for the low-rank decomposition of a conditional probability tensor; (2) a new overlapping convex penalty for intrinsic dimension reduction in a multivariate generalized linear regression framework; and (3) a direct non-convex optimization-based approach for low-rank tensor regression utilizing explicit rank constraints on the Tucker tensor decomposition. Unlike the approach of regressing each (univariate) categorical response on the predictors separately, the new models and methods will allow practitioners to characterize the complex and often interesting dependencies between the responses.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
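
To make research direction (1) concrete, the following is a minimal sketch of a generic latent class model for two categorical responses. It is not the PIs' method or software. It assumes a rank-R decomposition in which the predictors affect only the latent class weights, so that P(Y1 = j, Y2 = k | x) = sum over r of pi_r(x) * A[j, r] * B[k, r]. The names `joint_pmf`, `W`, `A`, and `B` and all parameter values are hypothetical, chosen to mirror the cancer example above (3 subtypes, 2 risk levels).

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def joint_pmf(x, W, A, B):
    """Joint pmf of (Y1, Y2) given x under an assumed latent class model.

    pi(x) = softmax(W @ x) gives the latent class weights, shape (R,).
    A[:, r] and B[:, r] are the class-specific pmfs of Y1 and Y2.
    """
    pi = softmax(W @ x)
    # Sum over latent classes r of pi_r * A[j, r] * B[k, r].
    return np.einsum("r,jr,kr->jk", pi, A, B)

# Hypothetical sizes: p predictors, R latent classes, J subtypes, K risk levels.
rng = np.random.default_rng(0)
p, R, J, K = 5, 2, 3, 2
W = rng.normal(size=(R, p))
A = rng.dirichlet(np.ones(J), size=R).T  # shape (J, R), each column sums to 1
B = rng.dirichlet(np.ones(K), size=R).T  # shape (K, R), each column sums to 1
x = rng.normal(size=p)

P = joint_pmf(x, W, A, B)  # shape (J, K), entries sum to 1

# Fitting Y1 and Y2 separately recovers only the marginals, whose outer
# product is the joint pmf implied by conditional independence given x.
P_indep = np.outer(P.sum(axis=1), P.sum(axis=0))
print(P)
print(P_indep)
```

Comparing `P` with `P_indep` shows the dependence between responses that separate univariate regressions cannot represent, while the parameter count grows with R rather than with the full J-by-K table.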

  • Program Officer
    Yong Zeng, yzeng@nsf.gov, 7032927299
  • Min Amd Letter Date
    1/26/2024
  • Max Amd Letter Date
    1/26/2024
  • ARRA Amount

Institutions

  • Name
    University of Minnesota-Twin Cities
  • City
    MINNEAPOLIS
  • State
    MN
  • Country
    United States
  • Address
    200 OAK ST SE
  • Postal Code
    554552009
  • Phone Number
    6126245599

Investigators

  • First Name
    Aaron
  • Last Name
    Molstad
  • Email Address
    amolstad@umn.edu
  • Start Date
    1/26/2024 12:00:00 AM

Program Element

  • Text
    STATISTICS
  • Code
    126900

Program Reference

  • Text
    Machine Learning Theory