Decision-making in many fields involves managing complex systems made up of smaller, interconnected parts. These systems can be modeled as weakly coupled Markov decision processes (WCMDPs), which are collections of smaller Markov decision processes (MDPs) linked by shared constraints. WCMDPs arise in fields such as job scheduling, resource allocation, electric vehicle charging, and supply chain management. Despite this widespread applicability, many fundamental questions about WCMDPs remain unanswered. Efficiently computing near-optimal decision rules, i.e., policies, for WCMDPs is still an open problem. Furthermore, when the problem parameters are unknown, reinforcement learning (RL) approaches are needed, but effective RL algorithms for WCMDPs are currently lacking. A key challenge is that the shared constraints couple the smaller MDPs, which prevents making decisions for each MDP individually and thus leads to hardness results when the number of MDPs is large.

This proposal aims to establish a theoretical foundation and develop new algorithm designs for WCMDPs. The proposed research will develop theory and techniques to “decouple” large WCMDPs into their smaller parts and then “reassemble” them properly. This research will draw on a new approach devised in the preliminary work, named the “one-to-many” approach, for tackling decision-making in large, complex stochastic systems. This new approach will be combined with classical techniques from large stochastic systems, including the Lyapunov drift method, Stein’s method, and the rate conservation law, as well as recent advances in reinforcement learning. The resulting algorithms and theory will be evaluated both on simulated problems and on a resource management problem in large-scale computing systems, using real-world data traces from Google’s datacenters.
The results from this project are expected to enrich the traditional algorithms and theory not only for WCMDPs but also for large-scale MDPs in general. This research will be accompanied by curriculum development, mentoring programs, and initiatives at conferences designed to recruit students from underrepresented backgrounds into research on decision-making in large stochastic systems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.