Manifold learning algorithms reveal the underlying structure of high-dimensional datasets by finding a lower-dimensional representation of the data, which makes subsequent analysis more efficient. They are applied in fields such as single-cell analysis, natural language processing, and neuroscience. Most existing algorithms are designed for data represented in vector spaces, but real-world data often consists of distributions or point clouds, which poses both theoretical and computational challenges. This project will develop manifold learning algorithms tailored to distributional and point-cloud datasets, with particular emphasis on theoretical analysis and computational efficiency. It will address these challenges by building on the framework of optimal transport and on established manifold learning theory for vector spaces. The project will also train students in interdisciplinary aspects of the research.

This project will develop and analyze algorithms for uncovering the low-dimensional intrinsic structure of datasets in Wasserstein space, a natural space for distributions and point clouds. The work is motivated by the recent success of representing data as elements of Wasserstein space rather than Euclidean space, and by the need for efficient algorithms to analyze such data. To accomplish these goals, the research team will use the eigenvectors of a Laplacian matrix built from a data-dependent graph. Specifically, the team will develop a consistency theory for operators such as the Laplacian between the discrete (graph) and the continuous (submanifold) settings, drawing inspiration from the well-established theory for finite-dimensional Riemannian manifolds. The project will develop methods with provable theoretical guarantees, and the resulting insights will in turn guide the design of efficient algorithms. The aims are threefold: (1) define dimensionality reduction algorithms for point-cloud data that can uncover curved submanifolds through suitable embeddings, (2) provide theoretical guarantees for these embeddings, and (3) design efficient algorithms for applications in high-dimensional settings such as single-cell data analysis.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
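
As an illustration of the general idea described above (eigenvectors of a Laplacian built from a data-dependent graph on point clouds), the following is a minimal sketch and not the project's method. It assumes each data point is a one-dimensional sample of equal size, so that the 2-Wasserstein distance can be computed exactly by sorting. The function names `w2_distance_1d` and `wasserstein_laplacian_embedding`, the Gaussian kernel, the unnormalized Laplacian, and the bandwidth value are all illustrative assumptions.

```python
import numpy as np


def w2_distance_1d(x, y):
    """2-Wasserstein distance between two equal-size 1-D samples.

    Assumption: both samples carry uniform weights and have the same
    number of points, in which case the optimal coupling matches
    sorted values to sorted values.
    """
    x_sorted, y_sorted = np.sort(x), np.sort(y)
    return np.sqrt(np.mean((x_sorted - y_sorted) ** 2))


def wasserstein_laplacian_embedding(clouds, bandwidth, n_components=1):
    """Embed point clouds using eigenvectors of a graph Laplacian.

    Hypothetical helper: the graph weights come from a Gaussian kernel
    applied to pairwise Wasserstein distances.
    """
    n = len(clouds)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = w2_distance_1d(clouds[i], clouds[j])

    # Gaussian kernel weights; no self-loops.
    weights = np.exp(-(dist ** 2) / bandwidth ** 2)
    np.fill_diagonal(weights, 0.0)

    # Unnormalized graph Laplacian L = D - W (symmetric, PSD).
    laplacian = np.diag(weights.sum(axis=1)) - weights

    # eigh returns eigenvalues in ascending order. The first eigenvector
    # is constant for a connected graph, so skip it.
    _, eigvecs = np.linalg.eigh(laplacian)
    return eigvecs[:, 1:1 + n_components]


# Synthetic data: samples from N(shift, 1). The family of distributions
# is a one-parameter curve in Wasserstein space, indexed by `shifts`.
rng = np.random.default_rng(0)
shifts = np.linspace(0.0, 3.0, 30)
clouds = [rng.normal(loc=s, scale=1.0, size=500) for s in shifts]

embedding = wasserstein_laplacian_embedding(clouds, bandwidth=1.0)

# The single embedding coordinate should vary with the hidden parameter
# (up to sign, since eigenvectors are defined only up to sign).
print(abs(np.corrcoef(embedding[:, 0], shifts)[0, 1]))
```

The sorting shortcut applies only in one dimension. For point clouds in higher dimensions, each pairwise distance would require an optimal transport solver, which is one reason computational efficiency matters in this setting.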