Manifold learning in Wasserstein space using Laplacians: From graphs to submanifolds

Information

  • NSF Award
  • 2410140
Owner
  • Award Id
    2410140
  • Award Effective Date
    8/1/2024
  • Award Expiration Date
    7/31/2027
  • Award Amount
    $ 419,962.00
  • Award Instrument
    Standard Grant


Manifold learning algorithms are tools used to reveal the underlying structure of high-dimensional datasets. This can be achieved by finding a lower-dimensional representation of the dataset, thereby enhancing the efficiency of subsequent data analysis. These algorithms find applications across various fields such as single-cell analysis, natural language processing, and neuroscience. While most existing algorithms are designed for datasets represented in vector spaces, real-world data often comprises distributions or point-clouds, presenting both theoretical and computational challenges for manifold learning algorithms. This project will develop manifold learning algorithms tailored for distributional or point-cloud datasets, with a particular emphasis on theoretical analysis and computational efficiency. Leveraging the framework of optimal transport and established manifold learning theory in vector spaces, the project will address these challenges. This project will also train students in interdisciplinary aspects of the research.

This project will develop and analyze algorithms for uncovering low-dimensional intrinsic structures of datasets within Wasserstein space, a natural space for distributions or point-clouds. This is motivated by the recent success in representing data as elements in Wasserstein space, as opposed to Euclidean space, and the necessity to develop efficient algorithms for their analysis. To accomplish the goals of this project, the research team will leverage the eigenvectors of a Laplacian matrix built from a data-dependent graph. Specifically, consistency theory of operators such as the Laplacian between the discrete (graph) and the continuous (submanifold) setting will be developed, drawing inspiration from the well-established theory for finite-dimensional Riemannian manifolds. The project will develop theoretically provable methods that provide algorithmic insights, which in turn can be used to design efficient algorithms. The aims are threefold: (1) define dimensionality reduction algorithms for point-cloud data that can uncover curved submanifolds through suitable embeddings, (2) provide theoretical guarantees for these embeddings, and (3) design efficient algorithms for applications in high-dimensional settings such as single-cell data analysis.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
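
As an illustration of the graph-Laplacian idea mentioned in the abstract, the following is a minimal sketch and not the project's method. It assumes one-dimensional point clouds with equal numbers of equally weighted points, for which the 2-Wasserstein distance reduces to matching sorted samples. The function names `w2_distance_1d` and `laplacian_embedding`, the Gaussian kernel, the median bandwidth, and the unnormalized Laplacian are all illustrative choices in the style of Laplacian eigenmaps.

```python
import numpy as np

def w2_distance_1d(x, y):
    """2-Wasserstein distance between two 1D empirical measures with the
    same number of equally weighted points (optimal matching is sorted order)."""
    xs, ys = np.sort(x), np.sort(y)
    return np.sqrt(np.mean((xs - ys) ** 2))

def laplacian_embedding(clouds, n_components=1, sigma=None):
    """Embed point clouds using eigenvectors of a graph Laplacian whose
    edge weights come from pairwise Wasserstein distances (hypothetical helper)."""
    n = len(clouds)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = w2_distance_1d(clouds[i], clouds[j])
    if sigma is None:
        # Assumed bandwidth heuristic: median of the nonzero distances.
        sigma = np.median(dist[dist > 0])
    weights = np.exp(-dist**2 / (2 * sigma**2))  # Gaussian kernel affinities
    np.fill_diagonal(weights, 0.0)               # no self-loops
    laplacian = np.diag(weights.sum(axis=1)) - weights  # unnormalized L = D - W
    eigvals, eigvecs = np.linalg.eigh(laplacian)        # eigenvalues ascending
    # Skip the first eigenvector (constant, eigenvalue ~0 for a connected graph).
    return eigvals, eigvecs[:, 1:n_components + 1]

# Toy data: translates of one base cloud, so the clouds lie on a
# one-parameter family in Wasserstein space.
rng = np.random.default_rng(0)
base = rng.normal(size=50)
shifts = np.linspace(0.0, 3.0, 20)
clouds = [base + t for t in shifts]
eigvals, embedding = laplacian_embedding(clouds, n_components=1)
```

In this toy setting the Wasserstein distance between two translates equals the difference of their shifts, so the first nontrivial eigenvector is expected to order the clouds by shift. General point clouds in higher dimensions would need an optimal transport solver in place of the sorted matching.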

  • Program Officer
    Troy D. Butler, tdbutler@nsf.gov, 7032922084
  • Min Amd Letter Date
    5/29/2024
  • Max Amd Letter Date
    5/29/2024
  • ARRA Amount

Institutions

  • Name
    University of North Carolina at Chapel Hill
  • City
    CHAPEL HILL
  • State
    NC
  • Country
    United States
  • Address
    104 AIRPORT DR STE 2200
  • Postal Code
    275995023
  • Phone Number
    9199663411

Investigators

  • First Name
    Caroline
  • Last Name
    Moosmueller
  • Email Address
    cmoosm@unc.edu
  • Start Date
    5/29/2024 12:00:00 AM
  • First Name
    Shiying
  • Last Name
    Li
  • Email Address
    shiyl@unc.edu
  • Start Date
    5/29/2024 12:00:00 AM

Program Element

  • Text
    COMPUTATIONAL MATHEMATICS
  • Code
    127100

Program Reference

  • Text
    Machine Learning Theory
  • Text
    COMPUTATIONAL SCIENCE & ENGING
  • Code
    9263