New nonparametric theory and methods for censored data

Information

  • Award Id
    2412746
  • Award Effective Date
    10/1/2024
  • Award Expiration Date
    9/30/2027
  • Award Amount
    $300,000.00
  • Award Instrument
    Standard Grant

Estimating the survival time distribution given a set of predictors is of great importance in biomedical research and epidemiological studies, where the survival time is often censored, and thus unobserved, because of dropouts or limited follow-up time. To obtain robust results, weak model assumptions are desirable. This project focuses on two sets of tools that make no assumptions about the functional form of any predictor's effect on the survival time: a classical approach based on spline basis expansion, and a modern approach based on deep neural networks. For the spline method, the project aims to develop a generally applicable distributional theory for the estimates of unknown functions in several widely used survival models; such a theory is needed for proper statistical inference but is still lacking in the current literature. The deep neural network approach makes the weakest possible assumption and is the most flexible method for estimating an unknown multivariate function. The project considers deep neural networks with a full likelihood-based loss function for censored survival data and more general types of predictors that can vary randomly with time. The project aims to investigate both the numerical implementation and the theory for the estimation precision. This work will foster interdisciplinary research with epidemiologists, nephrologists, neurologists, and other scientists working on real scientific studies, and will contribute significantly to human well-being and the scientific community through its versatile real-life applications, thus creating an impact within and beyond the field of statistics.

Spline basis expansion is a commonly used approach for approximating an unknown smooth function and is hence widely applied in estimating functional parameters. Too often in practice, however, the approximation is treated as “exact”, so that a nonparametric estimation problem is treated as a parametric one, because the asymptotic distributional theory for spline estimation is lacking for models beyond the nonparametric linear model. The project takes advantage of recent developments in random matrix theory to tackle the distributional theory of spline estimates in a broad range of commonly used statistical models in censored data analysis. The most general nonparametric problem is to estimate the conditional distribution rather than, for example, the conditional mean or median. With the conditional distribution at hand, prediction becomes a choice of a particular characteristic of the conditional distribution, and a prediction interval can be easily obtained. The project focuses on full likelihood-based loss functions characterized by the conditional hazard function and applies either deep neural networks or deep operator networks to estimate the conditional distribution function given functional covariates. In survival analysis, the functional covariates can be time-varying covariates that affect the hazard function in an arbitrary way, for which no estimating method exists in the literature. Convergence rates of the considered neural network methods with functional inputs will be established.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
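
As a rough illustration of the "full likelihood-based loss function characterized by the conditional hazard function" mentioned in the abstract, the sketch below evaluates the standard negative log-likelihood for right-censored data, which sums the cumulative hazard up to each observed time and subtracts the log hazard at each observed event. This is not the project's implementation: the function names, the `hazard(t, x)` interface, and the trapezoidal integration are all assumptions made for the example, and the constant hazard stands in for whatever spline or neural network model would supply the hazard in practice.

```python
import math


def cumulative_hazard(hazard, t, x, n_grid=200):
    """Approximate the integral of hazard(s, x) over [0, t] by the trapezoidal rule.

    `hazard` is a hypothetical callable (time, covariate) -> positive float.
    """
    step = t / n_grid
    total = 0.5 * (hazard(0.0, x) + hazard(t, x))
    for k in range(1, n_grid):
        total += hazard(k * step, x)
    return total * step


def censored_nll(times, events, covariates, hazard, n_grid=200):
    """Negative log-likelihood for right-censored survival data.

    times      : observed times, i.e. the minimum of event time and censoring time
    events     : 1 if the event was observed, 0 if the observation was censored
    covariates : one covariate value per subject (assumed interface)
    hazard     : conditional hazard function, hazard(t, x)
    """
    nll = 0.0
    for t, d, x in zip(times, events, covariates):
        if d:
            # Observed events contribute the log hazard at the event time.
            nll -= math.log(hazard(t, x))
        # Every subject contributes the cumulative hazard up to the observed time.
        nll += cumulative_hazard(hazard, t, x, n_grid)
    return nll


# Toy check with a constant hazard of 0.5 (illustrative values only).
constant_hazard = lambda t, x: 0.5
nll = censored_nll([1.0, 2.0], [1, 0], [0.0, 0.0], constant_hazard)
```

Because `hazard` receives the time argument, `x` could in principle be replaced by a function of time to mimic a time-varying covariate; how the project actually handles such functional inputs is not specified in the abstract.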

  • Program Officer
    Jun Zhu, jzhu@nsf.gov, 7032924551
  • Min Amd Letter Date
    4/23/2024
  • Max Amd Letter Date
    4/23/2024
  • ARRA Amount

Institutions

  • Name
    University of California-Irvine
  • City
    IRVINE
  • State
    CA
  • Country
    United States
  • Address
    160 ALDRICH HALL
  • Postal Code
    926970001
  • Phone Number
    9498247295

Investigators

  • First Name
    Bin
  • Last Name
    Nan
  • Email Address
    nanb@uci.edu
  • Start Date
    4/23/2024 12:00:00 AM

Program Element

  • Text
    STATISTICS
  • Code
    126900

Program Reference

  • Text
    Machine Learning Theory