𝐿*-based learning of Markov decision processes: Extended version

Martin Tappler, Bernhard K. Aichernig*, Giovanni Bacci, Maria Eichlseder, Kim Guldstrand Larsen

*Kontaktforfatter

Publikation: Bidrag til tidsskriftTidsskriftartikelForskningpeer review

13 Citationer (Scopus)
43 Downloads (Pure)

Abstract

Automata learning techniques automatically generate systemmodels fromtest observations. Typically, these techniques fall into two categories: passive and active. On the one hand, passive learning assumes no interaction with the system under learning and uses a predetermined training set, e.g., system logs. On the other hand, active learning techniques collect training data by actively querying the system under learning, allowing one to steer the discovery ofmeaningful information about the systemunder learning leading to effective learning strategies. A notable example of active learning technique for regular languages is Angluin’s L-algorithm. The L-algorithm describes the strategy of a student who learns the minimal deterministic finite automaton of an unknown regular language L by asking a succinct number of queries to a teacher who knows L. In this work, we study L -based learning of deterministic Markov decision processes, a class of Markov decision processes where an observation following an action uniquely determines a successor state. For this purpose, we first assume an ideal setting with a teacher who provides perfect information to the student. Then, we relax this assumption and present a novel learning algorithm that collects information by sampling execution traces of the system via testing. Experiments performed on an implementation of our sampling-based algorithm suggest that our method achieves better accuracy than state-of-the-art passive learning techniques using the same amount of test obser vations. In contrast to existing learning algorithms which assume a predefined number of states, our algorithm learns the complete model structure including the state space.

OriginalsprogEngelsk
TidsskriftFormal Aspects of Computing
Vol/bind33
Udgave nummer4-5
Sider (fra-til)575-615
Antal sider41
ISSN0934-5043
DOI
StatusUdgivet - 31 mar. 2021

Fingeraftryk

Dyk ned i forskningsemnerne om '𝐿*-based learning of Markov decision processes: Extended version'. Sammen danner de et unikt fingeraftryk.

Citationsformater