Like taking an opinion poll, downsampling is a key technique for grappling with the ever-growing amounts of data used in large-scale machine learning (ML) tasks. In particular, the objective function optimized when training an ML model is defined over the full data set, and evaluating this objective function repeatedly during training is impractical, whereas evaluating it on a sample of the data scales much better with minimal loss in accuracy. Instead of sampling uniformly, the sampling process can be biased to prefer parts of the data set that have greater influence on the objective function, resulting in faster convergence and smaller errors in the learning process. The classical approach to sampling, known as Markov chain Monte Carlo (MCMC), relies on the Markov assumption: the next sample depends only on the current one, with no explicit dependence on the samples that came before. It remains to be seen whether a non-Markovian sampling process, one that allows an explicit dependence on past samples, can be turned into a better ML training procedure, either practically or theoretically.

The overarching theme of this project is to transcend the current limitations of sampling, optimization, and machine learning algorithms that have predominantly been built upon the Markovian approach, i.e., MCMC, by exploiting the full potential of going beyond traditional Markov chains in the analysis and design of efficient distributed algorithms. Specifically, this project aims to explore the following three inter-related thrusts. The first is to maximally enhance the sampling efficiency of multiple, interacting nonlinear Markov chains in the form of self-repellent random walks (SRRWs) by designing adaptive degrees of spatio-temporal repellency among multiple walkers, as well as with their "collective" history, while providing theoretical performance guarantees. Second, this project will assess the performance implications for distributed algorithms in ML/optimization and decentralized learning, cast as stochastic approximation and its variants, when driven by a set of adaptive, interacting nonlinear Markov chains such as SRRWs instead of traditional MCMC inputs, and obtain usable performance bounds in both finite-time/finite-sample and asymptotic regimes, striking a balance between faster convergence and efficiency. Third, this project seeks to develop an algorithmic framework in which one can, for a given Markovian environment, always speed up the stochastic approximation algorithm by augmenting it into multi-timescale versions with low computational complexity, as well as co-design such algorithms with carefully constructed nonlinear Markovian sampling strategies when the environment is tunable, as in decentralized learning.
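To make the first two thrusts concrete, the following minimal Python sketch illustrates the kind of nonlinear Markov chain the abstract refers to: an SRRW reweights the transition probabilities of a base chain by the walker's own empirical visit distribution, so frequently visited states are repelled, and the empirical distribution itself evolves as a stochastic-approximation iterate. The kernel form K[x](i, j) proportional to P(i, j)(x_j/mu_j)^(-alpha) follows the published SRRW construction; the function names, the example graph, and all parameter choices below are illustrative assumptions, not artifacts of this project.

```python
import numpy as np

def srrw_trajectory(P, mu, alpha, n_steps, seed=0):
    """Minimal sketch of a single self-repellent random walk (SRRW).

    P      : transition matrix of a base Markov chain on n states
    mu     : stationary distribution of P (the sampling target)
    alpha  : repellence strength; alpha = 0 recovers the base chain
    Returns the visited states and the final empirical distribution.
    """
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    x = np.full(n, 1.0 / n)              # running empirical visit distribution
    state = rng.integers(n)
    visits = np.empty(n_steps, dtype=int)
    for t in range(n_steps):
        # Nonlinear kernel: K[x](i, j) ~ P(i, j) * (x_j / mu_j)^(-alpha),
        # so moves toward over-visited states are down-weighted.
        w = P[state] * (x / mu) ** (-alpha)
        w /= w.sum()
        state = rng.choice(n, p=w)
        # Stochastic-approximation update of the empirical distribution.
        e = np.zeros(n); e[state] = 1.0
        x += (e - x) / (t + 2)
        visits[t] = state
    return visits, x

# Example: base chain = lazy random walk on a 5-cycle (uniform mu).
n = 5
P = np.zeros((n, n))
for i in range(n):
    P[i, (i - 1) % n] = P[i, (i + 1) % n] = 0.25
    P[i, i] = 0.5
visits, x = srrw_trajectory(P, np.full(n, 1.0 / n), alpha=2.0, n_steps=10_000)
```

Roughly speaking, the second thrust asks what happens when samples produced this way replace i.i.d. or MCMC inputs X in a generic stochastic-approximation iterate of the form theta <- theta + step * H(theta, X); the sketch above is only meant to convey that coupling, not the project's actual algorithms or guarantees.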
Broadly speaking, this project has potential impact on a broad range of multi-disciplinary applications where standard MCMC methods and Markovian-driven stochastic and iterative algorithms have been dominant and taken for granted, including sampling from high-dimensional state spaces with graphical constraints, Markovian random walks on general graphs and their applications to distributed inference tasks, learning algorithms and stochastic approximation in a Markovian environment, stochastic optimization with Markovian noise, and beyond.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.