Statistical and Computational Guarantees of Estimation of Generative Models and Optimal Transport

Information

  • Award Id
    2412573
  • Award Effective Date
    8/1/2024
  • Award Expiration Date
    7/31/2027
  • Award Amount
    $ 225,000.00
  • Award Instrument
    Standard Grant

Generative machine learning models are reshaping the artificial intelligence (AI) community with their ability to create novel images and text. At its heart, generative AI addresses a high-dimensional density estimation problem. It can also be viewed as a transport problem: transforming a simple, known distribution (noise) into a complex, unknown distribution. Despite the development of numerous successful algorithms, the literature lacks statistical guarantees to underpin them theoretically, and concerns persist about the environmental impact of the extensive computation they require. Among these models, score-based diffusion models are currently replacing generative adversarial networks and are at the forefront in popularity and efficacy. However, the score training process can be exceedingly slow and energy intensive. To address this, the investigator will study the more computationally and energetically efficient rectified flow algorithm and its variants, which turn high-dimensional density estimation into an iterative regression problem; this iterative regression leads to an optimal transport. The research will advance the understanding of why models in generative AI succeed. The intrinsic connections to be explored among those models will help transfer statistical guarantees from one generative model to another and lead to novel and improved algorithms, which would eventually advance the state of the art of generative AI.

The investigator aims to study the statistical and computational guarantees of rectified flow and diffusion models and to explore two connections among those models: score matching and solving ordinary/stochastic differential equations, with an intriguing link to nonparametric empirical Bayes. The following questions will be addressed: 1) Can we show that the iterative rectified flow achieves optimal density and transport estimation in just one step of regression? 2) How fast does the iterative regression of rectified flow converge to the optimal transport? 3) Can we propose an algorithm that improves on the rectified flow with better statistical and computational guarantees? 4) What are the statistical and computational guarantees of diffusion models? 5) Can we improve denoising diffusion probabilistic models with an iterative algorithm that obtains the optimal transport? In addition, the project will explore applications of generative models and optimal transport to neuroscience and autism spectrum disorder. Research results from this proposal will be disseminated through articles, workshops, and interdisciplinary seminar series. The project will integrate research and education by teaching monograph courses and organizing workshops and seminars to enhance the career development of the next generation of statisticians and data scientists, with a particular focus on underrepresented groups in the mathematical sciences.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
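
As an illustration of the abstract's remark that rectified flow turns density estimation into a regression problem, the following is a minimal one-dimensional sketch and not the investigator's method. It assumes a Gaussian noise source, a Gaussian target with mean 3 and standard deviation 0.5, an independent coupling of the two, and a separate linear least-squares fit at each time step; the function names `fit_velocity` and `transport` are hypothetical.

```python
import numpy as np

def fit_velocity(x0, x1, t_grid):
    """One regression step: at each time t, regress the displacement
    (x1 - x0) on the interpolated point x_t = (1 - t) * x0 + t * x1.
    Returns an array of (slope, intercept) pairs, one per time in t_grid.
    A linear fit is an assumption that suits this Gaussian toy case only."""
    coefs = []
    for t in t_grid:
        xt = (1.0 - t) * x0 + t * x1
        design = np.column_stack([xt, np.ones_like(xt)])
        sol, *_ = np.linalg.lstsq(design, x1 - x0, rcond=None)
        coefs.append(sol)
    return np.array(coefs)

def transport(z, coefs, dt):
    """Push noise z through the fitted velocity field with forward Euler
    steps of size dt, i.e. approximately solve dx/dt = v(x, t) on [0, 1]."""
    x = np.array(z, dtype=float)
    for slope, intercept in coefs:
        x = x + dt * (slope * x + intercept)
    return x

rng = np.random.default_rng(0)
n, n_steps = 20_000, 100
t_grid = np.arange(n_steps) / n_steps        # times 0, 1/n_steps, ..., < 1

x0 = rng.standard_normal(n)                  # simple, known noise
x1 = 3.0 + 0.5 * rng.standard_normal(n)      # stand-in for the unknown target

coefs = fit_velocity(x0, x1, t_grid)
samples = transport(rng.standard_normal(n), coefs, 1.0 / n_steps)
print(samples.mean(), samples.std())         # compare with the target's 3.0 and 0.5
```

For two Gaussians the conditional mean of the displacement given the interpolated point is linear, which is why a linear fit at each time suffices here; a realistic application would replace it with a flexible regressor such as a neural network. In this toy case the resulting map is increasing, which in one dimension is the form the optimal transport map takes under quadratic cost.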

  • Program Officer
    Yulia Gel, ygel@nsf.gov, 7032920000
  • Min Amd Letter Date
    7/24/2024
  • Max Amd Letter Date
    7/24/2024
  • ARRA Amount

Institutions

  • Name
    Yale University
  • City
    NEW HAVEN
  • State
    CT
  • Country
    United States
  • Address
    150 MUNSON ST
  • Postal Code
    065113572
  • Phone Number
    2037854689

Investigators

  • First Name
    Huibin
  • Last Name
    Zhou
  • Email Address
    huibin.zhou@yale.edu
  • Start Date
    7/24/2024 12:00:00 AM

Program Element

  • Text
    OFFICE OF MULTIDISCIPLINARY AC
  • Code
    125300
  • Text
    STATISTICS
  • Code
    126900

Program Reference

  • Text
    Artificial Intelligence (AI)
  • Text
    STATISTICS
  • Code
    1269
  • Text
    ARTIFICIAL INTELL & COGNIT SCI
  • Code
    6856