Data-adaptive multivariate inference with finite sample guarantees

Information

  • Award Id
    2413885
  • Award Effective Date
    8/1/2024
  • Award Expiration Date
    7/31/2027
  • Award Amount
    $ 300,000.00
  • Award Instrument
    Standard Grant

Data-adaptive multivariate inference with finite sample guarantees

Working with complex data requires a good data structure in order to efficiently organize these data in a computer. The research project will show that a certain popular data structure also has surprising and advantageous statistical properties. It will be shown how these properties can be used to give performance guarantees for a number of statistical procedures and that these guarantees lead to optimal statistical inference. In particular, the project will show how these properties can be used to overcome a problem that affects many multivariate statistical analyses and which is known as the 'curse of dimensionality' or 'empty space phenomenon'. The research project will also involve mentoring undergraduate students in the context of summer research projects and provide research training opportunities for graduate students.

k-d trees are space-partitioning binary trees that are popular in computer science because of their computational efficiency. This project will show that k-d trees also have advantageous stochastic properties that can be used to effectively address a number of challenging statistical problems. In particular, the research will show how the data-adaptive multiresolution partitions generated by a k-d tree can be used to avoid the 'curse of dimensionality' or 'empty space phenomenon' that afflicts many multivariate statistical procedures. It will also show that the resulting inference comes with finite sample guarantees and that it satisfies certain optimality properties. The to-be-developed methodology will be used to address a number of important problems, such as inference about a multivariate log-concave distribution. Finding the maximum likelihood estimator for such a distribution is computationally very expensive, even for low-dimensional observations. The research will show how the data-adaptive partition can be used to compute a confidence band for a log-concave density in a fast way, and it will establish statistical optimality properties for these bands.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
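
As a rough illustration of the data-adaptive partition mentioned in the abstract, the following Python sketch builds a k-d-tree-style partition by repeated median splits. It is not the project's methodology: the function name kd_partition, the max_depth parameter, the rule of cycling through the coordinates, and the simulated Gaussian sample are all assumptions made for this example. The point it shows is that splitting at the sample median gives every cell about the same number of observations, so no cell is empty, whereas a fixed grid in several dimensions typically leaves most cells empty.

```python
import random


def kd_partition(points, depth=0, max_depth=3):
    """Split points at the median of one coordinate, cycling through the axes.

    Hypothetical helper for illustration only. Returns a list of cells,
    where each cell is the list of points that fall into it.
    """
    if depth == max_depth or len(points) <= 1:
        return [points]
    axis = depth % len(points[0])  # cycle through the coordinates
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2  # median split: the halves differ by at most one point
    return (kd_partition(ordered[:mid], depth + 1, max_depth)
            + kd_partition(ordered[mid:], depth + 1, max_depth))


# Simulated two-dimensional sample (an assumption, not data from the project).
random.seed(0)
sample = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(64)]

cells = kd_partition(sample, max_depth=3)
cell_sizes = [len(cell) for cell in cells]
print(cell_sizes)  # 64 points and 3 levels of splits give 8 cells of 8 points each
```

Stopping the recursion at different depths yields coarser or finer partitions of the same sample, which is one way to read the word "multiresolution" in the abstract.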

  • Program Officer
    Yong Zeng, yzeng@nsf.gov, 7032927299
  • Min Amd Letter Date
    7/25/2024
  • Max Amd Letter Date
    7/25/2024
  • ARRA Amount

Institutions

  • Name
    Stanford University
  • City
    STANFORD
  • State
    CA
  • Country
    United States
  • Address
    450 JANE STANFORD WAY
  • Postal Code
    943052004
  • Phone Number
    6507232300

Investigators

  • First Name
    Guenther
  • Last Name
    Walther
  • Email Address
    Walther@stat.stanford.edu
  • Start Date
    7/25/2024 12:00:00 AM

Program Element

  • Text
    STATISTICS
  • Code
    126900

Program Reference

  • Text
    Machine Learning Theory