Collaborative Research: SLES: No Bad Surprises: Aligning Agent and Human Norms via Specification Refinements

Information

  • Award Id
    2416459
  • Award Effective Date
9/1/2024
  • Award Expiration Date
8/31/2028
  • Award Amount
$750,000.00
  • Award Instrument
    Standard Grant

Abstract

Autonomous robots hold the potential to revolutionize society in areas such as healthcare, transportation, and manufacturing. These systems frequently employ learning-enabled components in their perception, planning, and control modules, necessitating complex design choices to ensure safe operation. However, design decisions that initially appear sound may lead to unexpected problems during testing or, even worse, post-deployment. For example, an autonomous vehicle once exhibited erratic swerving to localize itself for lane-keeping, a failure mode unforeseen by the system's designers and developers. Such surprises indicate that the agent's norms (what it considers permissible and obligatory) are inappropriate in certain situations. As learning-enabled systems become more complex, operate in open environments, and interact with humans and other robots, these challenges are likely to be exacerbated. This project focuses on safety failures of reinforcement learning (RL) agents stemming from two primary sources: the misalignment between design intent and the agent's perceived norms, and the gap between the knowledge the agent needs for safe operation and its actual perception capabilities. The goal is to equip researchers and practitioners with tools to design provably safe autonomous systems, encompassing all major stages of design, verification, and deployment.

The project develops a process to iteratively align an RL agent's norms with those of its designers and to formally verify the resulting behavior. Key activities include: (1) developing inverse reinforcement learning algorithms that learn a reward function from demonstrations, constrained by deontic logic; (2) systematically exploring the trained agent's norms to uncover unknowns by generating norms that would surprise the engineer; (3) querying the agent to explain its reward function when it produces undesired behavior; (4) defining a new class of obligations related to knowledge, along with a corresponding formal specification logic; (5) designing run-time monitors that predict action and knowledge safety violations during operation; (6) implementing online metareasoning, coupled with introspective perception modules, to restore safe behavior; and (7) iteratively improving system alignment by updating the agent's learning process using verification and run-time monitoring results. The project's outcomes are validated on an industrial simulator of a real-world bipedal robot, scaled-down autonomous race cars, and a campus-wide fleet of delivery robots.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
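
To make activity (1) concrete, the following is a minimal sketch of feature-matching, maximum-entropy-style inverse reinforcement learning in which a deontic rule enters as a hard mask on impermissible state-action pairs. The toy gridworld, the one-hot state features, and every name below are illustrative assumptions, not the project's algorithms or code.

    import numpy as np

    N_STATES, GAMMA = 16, 0.9  # assumed 4x4 gridworld with discounting

    def soft_value_iteration(w, P, forbidden, iters=100):
        """Softmax policy for the linear reward R(s) = w . phi(s); with
        one-hot state features, R = w. `forbidden` is a boolean (S, A)
        mask derived from the deontic rule; we assume every state keeps
        at least one permitted action."""
        V = np.zeros(N_STATES)
        for _ in range(iters):
            Q = w[:, None] + GAMMA * (P @ V)    # P: (S, A, S) transitions
            Q[forbidden] = -np.inf              # impermissible pairs: never chosen
            V = np.logaddexp.reduce(Q, axis=1)  # soft (max-ent) Bellman backup
        return np.exp(Q - V[:, None])           # policy; exactly 0 on forbidden pairs

    def occupancy(pi, P, start, horizon=50):
        """Discounted state-visitation counts; with one-hot features these
        are exactly the policy's feature expectations."""
        d = np.zeros(N_STATES); d[start] = 1.0
        mu = np.zeros(N_STATES)
        for t in range(horizon):
            mu += (GAMMA ** t) * d
            d = np.einsum('s,sa,sat->t', d, pi, P)  # one-step push-forward
        return mu

    def irl(expert_mu, P, forbidden, start, lr=0.1, steps=200):
        """Gradient ascent on reward weights until the norm-constrained
        policy matches the expert's feature expectations."""
        w = np.zeros(N_STATES)
        for _ in range(steps):
            pi = soft_value_iteration(w, P, forbidden)
            w += lr * (expert_mu - occupancy(pi, P, start))
        return w

Here `expert_mu` would be estimated by averaging discounted one-hot state counts over the demonstration trajectories, and the learned `w` is a reward under which the demonstrated, norm-compliant behavior is soft-optimal.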
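
Activity (5) can be illustrated just as briefly. The sketch below monitors a bounded-time action-safety invariant of the form "always(clearance >= D_MIN)", flagging both a current violation and a forecast one; the thresholds, the constant-closing-speed extrapolation, and all names are illustrative assumptions.

    from dataclasses import dataclass

    D_MIN = 0.5    # assumed minimum safe clearance (meters)
    HORIZON = 10   # assumed look-ahead window (control steps)
    DT = 0.1       # assumed control period (seconds)

    @dataclass
    class Verdict:
        violated: bool    # the invariant is false right now
        predicted: bool   # a violation is forecast within HORIZON steps

    def monitor(clearance: float, closing_speed: float) -> Verdict:
        """Check the invariant at the current step, then extrapolate the
        clearance forward assuming the closing speed stays constant."""
        violated = clearance < D_MIN
        predicted = any(
            clearance - closing_speed * DT * k < D_MIN
            for k in range(1, HORIZON + 1)
        )
        return Verdict(violated, predicted)

    # Example: 1.0 m of clearance shrinking at 0.8 m/s is safe now but
    # trips the predictive flag (it falls below 0.5 m within 1 second).
    print(monitor(clearance=1.0, closing_speed=0.8))
    # Verdict(violated=False, predicted=True)

Raising a predictive flag, rather than only detecting a violation after the fact, is what would give an online metareasoning layer like the one in activity (6) time to intervene and restore safe behavior.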

  • Program Officer
Jie Yang, jyang@nsf.gov, (703) 292-4768
  • Min Amd Letter Date
8/21/2024
  • Max Amd Letter Date
8/21/2024

Institutions

  • Name
    Oregon State University
  • City
    CORVALLIS
  • State
    OR
  • Country
    United States
  • Address
    1500 SW JEFFERSON AVE
  • Postal Code
97331-8655
  • Phone Number
(541) 737-4933

Investigators

  • First Name
    Houssam
  • Last Name
    Abbas
  • Email Address
    houssam.abbas@oregonstate.edu
  • Start Date
8/21/2024
  • First Name
    Sandhya
  • Last Name
    Saisubramanian
  • Email Address
    sandhya.sai@oregonstate.edu
  • Start Date
8/21/2024

Program Element

  • Text
    AI-Safety