Collaborative Research: New Algorithms and Theory for Weakly Coupled Markov Decision Processes

Information

  • Award Id
    2432545
  • Award Effective Date
    10/1/2024
  • Award Expiration Date
    9/30/2027
  • Award Amount
    $275,000.00
  • Award Instrument
    Standard Grant

Decision-making in many fields involves managing complex systems made up of smaller, interconnected parts. These systems can be modeled as weakly coupled Markov decision processes (WCMDPs), which are groups of smaller Markov decision processes (MDPs) linked by shared constraints. WCMDPs arise in job scheduling, resource allocation, electric vehicle charging, and supply chain management. Despite this wide range of applications, many fundamental questions about WCMDPs remain unanswered. Efficiently computing near-optimal decision rules, i.e., policies, for WCMDPs is still an open problem. Furthermore, when the problem parameters are unknown, reinforcement learning (RL) approaches are needed, but effective RL algorithms for WCMDPs are currently lacking. A key challenge is that the shared constraints couple the smaller MDPs, which prevents making decisions for each MDP individually and leads to hardness results when the number of MDPs is large.

This proposal aims to establish a theoretical foundation and develop new algorithm designs for WCMDPs. The proposed research will develop theory and techniques to “decouple” large WCMDPs into their smaller parts and then “reassemble” them properly. This research will draw on a new approach devised in the preliminary work, named the “one-to-many” approach, for tackling decision-making in large, complex stochastic systems. This new approach will be combined with classical techniques from large stochastic systems, including the Lyapunov drift method, Stein’s method, and the rate conservation law, as well as recent advances in reinforcement learning. The resulting algorithms and theory will be evaluated both on simulated problems and on a resource management problem in large-scale computing systems, using real-world data traces from Google’s datacenters. The results from this project are expected to enrich the traditional algorithms and theory not only for WCMDPs but also for large-scale MDPs in general. This research will be accompanied by curriculum development, mentoring programs, and initiatives at conferences designed to recruit students from underrepresented backgrounds into research on decision-making in large stochastic systems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

  • Program Officer
    Anthony Kuh, akuh@nsf.gov, 7032924714
  • Min Amd Letter Date
    8/15/2024
  • Max Amd Letter Date
    8/15/2024
  • ARRA Amount

Institutions

  • Name
    Carnegie-Mellon University
  • City
    PITTSBURGH
  • State
    PA
  • Country
    United States
  • Address
    5000 FORBES AVE
  • Postal Code
    15213-3815
  • Phone Number
    4122688746

Investigators

  • First Name
    Weina
  • Last Name
    Wang
  • Email Address
    weinaw@cs.cmu.edu
  • Start Date
    8/15/2024 12:00:00 AM

Program Element

  • Text
    EPCN-Energy-Power-Ctrl-Netwrks
  • Code
    760700

Program Reference

  • Text
    LEARNING & INTELLIGENT SYSTEMS
  • Code
    8888