Compositional Shielding and Reinforcement Learning for Multi-Agent Systems

Asger Horn Brorholt*, Kim Guldstrand Larsen, Christian Schilling

*Corresponding author for this work

Research output: Contribution to book/anthology/report/conference proceeding, Article in proceeding, Research, peer-review

Abstract

Deep reinforcement learning has emerged as a powerful tool for obtaining high-performance policies. However, the safety of these policies has been a long-standing issue. One promising paradigm to guarantee safety is a shield, which "shields" a policy from taking unsafe actions. However, the cost of computing a shield scales exponentially in the number of state variables. This is a particular concern in multi-agent systems with many agents. In this work, we propose a novel approach for multi-agent shielding. We address scalability by computing individual shields for each agent. The challenge is that typical safety specifications are global properties, but the shields of individual agents only ensure local properties. Our key to overcoming this challenge is to apply assume-guarantee reasoning. Specifically, we present a sound proof rule that decomposes a (global, complex) safety specification into (local, simple) obligations for the shields of the individual agents. Moreover, we show that applying the shields during reinforcement learning significantly improves the quality of the policies obtained for a given training budget. We demonstrate the effectiveness and scalability of our multi-agent shielding framework in two case studies, reducing the computation time from hours to seconds and achieving fast learning convergence.
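
To make the idea of per-agent shields with assume-guarantee obligations concrete, here is a minimal Python sketch. It is not the authors' implementation or proof rule. The toy setting (agents in a chain on an integer line), the action set, and the names `ACTIONS`, `safe_actions`, `shield`, and `simulate` are all assumptions made for illustration. Each follower's shield guarantees a gap of at least 1 to the agent ahead, under the assumption that the agent ahead moves back by at most one cell per step. Together, these local obligations imply the global property that no two agents ever share a cell.

```python
import random
from typing import List

# Hypothetical toy dynamics: move back, stay, or move forward by one cell.
ACTIONS = (-1, 0, 1)


def safe_actions(own_pos: int, ahead_pos: int) -> List[int]:
    """Local shield of one follower.

    Assumption: the agent ahead moves by at least -1 in this step.
    Guarantee:  the gap to the agent ahead stays >= 1 after the step.
    """
    worst_ahead = ahead_pos - 1  # worst case permitted by the assumption
    return [a for a in ACTIONS if worst_ahead - (own_pos + a) >= 1]


def shield(proposed: int, own_pos: int, ahead_pos: int) -> int:
    """Pass the proposed action through if it is safe, else override it.

    Requires a safe starting gap (>= 1); then a safe action always exists.
    """
    allowed = safe_actions(own_pos, ahead_pos)
    return proposed if proposed in allowed else max(allowed)


def simulate(positions: List[int], steps: int, rng: random.Random) -> List[int]:
    """Run shielded random policies; positions are sorted, last agent leads."""
    positions = list(positions)
    for _ in range(steps):
        moves = []
        for i, pos in enumerate(positions):
            action = rng.choice(ACTIONS)  # stand-in for a learned policy
            if i + 1 < len(positions):    # the leader has nobody ahead
                action = shield(action, pos, positions[i + 1])
            moves.append(action)
        positions = [p + m for p, m in zip(positions, moves)]
        # Global property implied by the local guarantees: no collisions.
        assert all(b - a >= 1 for a, b in zip(positions, positions[1:]))
    return positions


if __name__ == "__main__":
    print(simulate([0, 2, 4, 6], steps=100, rng=random.Random(0)))
```

Each shield in this sketch inspects only two positions, its own and that of the agent ahead, so its cost does not grow with the number of agents. This is the scalability benefit the abstract attributes to computing individual shields. The assumption about the agent ahead holds here because every action, shielded or not, lies in `ACTIONS`.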
Original language: English
Title of host publication: Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems
Editors: Yevgeniy Vorobeychik
Number of pages: 9
Place of Publication: Richland, SC, USA
Publisher: Association for Computing Machinery (ACM)
Publication date: 5 Jun 2025
Edition: 24
Pages: 399-407
Chapter: Research Paper Track
ISBN (Electronic): 979-8-4007-1426-9
Publication status: Published - 5 Jun 2025
Event: 24th International Conference on Autonomous Agents and Multiagent Systems - Renaissance Center, Detroit, United States
Duration: 19 May 2025 - 23 May 2025
Conference number: 24
https://aamas2025.org/

Conference

Conference: 24th International Conference on Autonomous Agents and Multiagent Systems
Number: 24
Location: Renaissance Center
Country/Territory: United States
City: Detroit
Period: 19/05/2025 - 23/05/2025

Keywords

  • Multi-agent reinforcement learning
  • Shielding
  • Safety
  • Assume-Guarantee Reasoning
  • Reinforcement Learning
