Shield Synthesis for Reinforcement Learning

Bettina Könighofer*, Florian Lorber, Nils Jansen, Roderick Bloem

*Corresponding author for this work

Research output: Contribution to book/anthology/report/conference proceedingArticle in proceedingResearchpeer-review

9 Citations (Scopus)

Abstract

Reinforcement learning algorithms discover policies that maximize reward. However, these policies generally do not adhere to safety, leaving safety in reinforcement learning (and in artificial intelligence in general) an open research problem. Shield synthesis is a formal approach to synthesize a correct-by-construction reactive system called a shield that enforces safety properties of a running system while interfering with its operation as little as possible. A shield attached to a learning agent guarantees safety during learning and execution phases. In this paper we summarize three types of shields that are synthesized from different specification languages, and discuss their applicability to reinforcement learning. First, we discuss deterministic shields that enforce specifications expressed as linear temporal logic specifications. Second, we discuss the synthesis of probabilistic shields from specifications in probabilistic temporal logic. Third, we discuss how to synthesize timed shields from timed automata specifications. This paper summarizes the application areas, advantages, disadvantages and synthesis approaches for the three types of shields and gives an overview of experimental results.

Original languageEnglish
Title of host publicationLeveraging Applications of Formal Methods, Verification and Validation : Verification Principles - 9th International Symposium on Leveraging Applications of Formal Methods, ISoLA 2020, Proceedings
EditorsTiziana Margaria, Bernhard Steffen
Number of pages17
PublisherSpringer Science+Business Media
Publication date2020
Pages290-306
ISBN (Print)978-3-030-61361-7
ISBN (Electronic)978-3-030-61362-4
DOIs
Publication statusPublished - 2020
Event9th International Symposium on Leveraging Applications of Formal Methods, Verification and Validation, ISoLA 2020 - Rhodes, Greece
Duration: 20 Oct 202030 Oct 2020

Conference

Conference9th International Symposium on Leveraging Applications of Formal Methods, Verification and Validation, ISoLA 2020
Country/TerritoryGreece
CityRhodes
Period20/10/202030/10/2020
SeriesLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume12476 LNCS
ISSN0302-9743

Bibliographical note

Publisher Copyright:
© 2020, Springer Nature Switzerland AG.

Fingerprint

Dive into the research topics of 'Shield Synthesis for Reinforcement Learning'. Together they form a unique fingerprint.

Cite this