Learning Markov Decision Processes for Model Checking

Research output: Contribution to journal › Conference article in Journal › Research › peer-review

Abstract

Constructing an accurate system model for formal model verification can be both resource demanding
and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically
learning system models based on observed system behaviors. In this paper we extend the algorithm
on learning probabilistic automata to reactive systems, where the observed system behavior is in
the form of alternating sequences of inputs and outputs. We propose an algorithm for automatically
learning a deterministic labeled Markov decision process model from the observed behavior of a
reactive system. The proposed learning algorithm is adapted from algorithms for learning deterministic
probabilistic finite automata, and extended to include both probabilistic and nondeterministic transitions.
The algorithm is empirically analyzed and evaluated by learning system models of slot machines. The
evaluation is performed by analyzing the probabilistic linear temporal logic properties of the system
as well as by analyzing the schedulers, in particular the optimal schedulers, induced by the learned
models.
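The first step of such an approach can be illustrated with a small sketch: from a set of observed traces of alternating inputs and outputs, build a frequency prefix tree and estimate the conditional output distributions by relative frequencies. This is a simplified, hypothetical illustration only; the paper's actual algorithm additionally merges statistically compatible states (in the style of algorithms for learning deterministic probabilistic finite automata), which is omitted here, and all names and the slot-machine-flavored symbols are invented for the example.

```python
from collections import defaultdict


def build_io_counts(traces):
    """Count, for each prefix-tree node and input symbol, how often each
    output symbol follows.

    A node is identified by the alternating input/output history leading
    to it, so the structure is an input/output frequency prefix tree.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        history = ()
        for inp, out in trace:
            counts[(history, inp)][out] += 1
            history = history + (inp, out)
    return counts


def estimate_probs(counts):
    """Normalize the counts into conditional distributions
    P(output | node, input) -- the probabilistic transitions of the
    candidate model; the choice of input remains nondeterministic.
    """
    probs = {}
    for key, outs in counts.items():
        total = sum(outs.values())
        probs[key] = {o: c / total for o, c in outs.items()}
    return probs


# Two toy traces from a hypothetical slot machine:
traces = [
    [("coin", "spin"), ("stop", "win")],
    [("coin", "spin"), ("stop", "lose")],
]
probs = estimate_probs(build_io_counts(traces))
print(probs[((), "coin")])                 # {'spin': 1.0}
print(probs[(("coin", "spin"), "stop")])   # {'win': 0.5, 'lose': 0.5}
```

Note that the estimates are per prefix-tree node; obtaining a compact deterministic labeled MDP requires the subsequent state-merging phase described in the paper.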
Original language: English
Journal: Electronic Proceedings in Theoretical Computer Science
Volume: 103
Pages (from-to): 49-63
ISSN: 2075-2180
DOI: 10.4204/EPTCS.103.6
Publication status: Published - 2012
Event: Quantities in formal methods - Paris, France
Duration: 28 Aug 2012 - 28 Aug 2012
Conference number: 1

Workshop

Workshop: Quantities in formal methods
Number: 1
Country: France
City: Paris
Period: 28/08/2012 - 28/08/2012

Fingerprint

Model checking
Temporal logic
Finite automata
Learning algorithms
Learning systems

Cite this

@inproceedings{bc0b2d17dc0e4031bee0b8a9ad2df5fa,
title = "Learning Markov Decision Processes for Model Checking",
abstract = "Constructing an accurate system model for formal model verification can be both resource demanding and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically learning system models based on observed system behaviors. In this paper we extend the algorithm on learning probabilistic automata to reactive systems, where the observed system behavior is in the form of alternating sequences of inputs and outputs. We propose an algorithm for automatically learning a deterministic labeled Markov decision process model from the observed behavior of a reactive system. The proposed learning algorithm is adapted from algorithms for learning deterministic probabilistic finite automata, and extended to include both probabilistic and nondeterministic transitions. The algorithm is empirically analyzed and evaluated by learning system models of slot machines. The evaluation is performed by analyzing the probabilistic linear temporal logic properties of the system as well as by analyzing the schedulers, in particular the optimal schedulers, induced by the learned models.",
author = "Hua Mao and Yingke Chen and Manfred Jaeger and Nielsen, {Thomas Dyhre} and Larsen, {Kim Guldstrand} and Brian Nielsen",
year = "2012",
doi = "10.4204/EPTCS.103.6",
language = "English",
volume = "103",
pages = "49--63",
journal = "Electronic Proceedings in Theoretical Computer Science",
issn = "2075-2180",
publisher = "Open Publishing Association",
}

Learning Markov Decision Processes for Model Checking. / Mao, Hua; Chen, Yingke; Jaeger, Manfred; Nielsen, Thomas Dyhre; Larsen, Kim Guldstrand; Nielsen, Brian.

In: Electronic Proceedings in Theoretical Computer Science, Vol. 103, 2012, p. 49-63.


TY - GEN

T1 - Learning Markov Decision Processes for Model Checking

AU - Mao, Hua

AU - Chen, Yingke

AU - Jaeger, Manfred

AU - Nielsen, Thomas Dyhre

AU - Larsen, Kim Guldstrand

AU - Nielsen, Brian

PY - 2012

Y1 - 2012

N2 - Constructing an accurate system model for formal model verification can be both resource demanding and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically learning system models based on observed system behaviors. In this paper we extend the algorithm on learning probabilistic automata to reactive systems, where the observed system behavior is in the form of alternating sequences of inputs and outputs. We propose an algorithm for automatically learning a deterministic labeled Markov decision process model from the observed behavior of a reactive system. The proposed learning algorithm is adapted from algorithms for learning deterministic probabilistic finite automata, and extended to include both probabilistic and nondeterministic transitions. The algorithm is empirically analyzed and evaluated by learning system models of slot machines. The evaluation is performed by analyzing the probabilistic linear temporal logic properties of the system as well as by analyzing the schedulers, in particular the optimal schedulers, induced by the learned models.

AB - Constructing an accurate system model for formal model verification can be both resource demanding and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically learning system models based on observed system behaviors. In this paper we extend the algorithm on learning probabilistic automata to reactive systems, where the observed system behavior is in the form of alternating sequences of inputs and outputs. We propose an algorithm for automatically learning a deterministic labeled Markov decision process model from the observed behavior of a reactive system. The proposed learning algorithm is adapted from algorithms for learning deterministic probabilistic finite automata, and extended to include both probabilistic and nondeterministic transitions. The algorithm is empirically analyzed and evaluated by learning system models of slot machines. The evaluation is performed by analyzing the probabilistic linear temporal logic properties of the system as well as by analyzing the schedulers, in particular the optimal schedulers, induced by the learned models.

U2 - 10.4204/EPTCS.103.6

DO - 10.4204/EPTCS.103.6

M3 - Conference article in Journal

VL - 103

SP - 49

EP - 63

JO - Electronic Proceedings in Theoretical Computer Science

JF - Electronic Proceedings in Theoretical Computer Science

SN - 2075-2180

ER -