Learning Markov Decision Processes for Model Checking

Research output: Contribution to journal › Conference article in Journal › Research › peer-review

Abstract

Constructing an accurate system model for formal model verification can be both resource-demanding and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically learning system models based on observed system behaviors. In this paper we extend algorithms for learning probabilistic automata to reactive systems, where the observed system behavior takes the form of alternating sequences of inputs and outputs. We propose an algorithm for automatically learning a deterministic labeled Markov decision process model from the observed behavior of a reactive system. The proposed learning algorithm is adapted from algorithms for learning deterministic probabilistic finite automata and extended to include both probabilistic and nondeterministic transitions. The algorithm is empirically analyzed and evaluated by learning system models of slot machines. The evaluation is performed by analyzing the probabilistic linear temporal logic properties of the system, as well as by analyzing the schedulers, in particular the optimal schedulers, induced by the learned models.
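
As a rough illustration of the kind of data such a learner consumes: algorithms for learning deterministic probabilistic finite automata typically begin by organizing the observed traces into a frequency-annotated prefix tree, which is then reduced by state merging. The Python sketch below builds such a tree from alternating input/output traces of a hypothetical slot machine. The trace data, the names build_prefix_tree and output_distribution, and the omission of the merging and compatibility tests are illustrative assumptions, not the paper's implementation.

```python
from collections import defaultdict

class Node:
    """Node of an input/output frequency prefix tree (illustrative sketch)."""
    def __init__(self, output):
        self.output = output                                  # output label observed on entering this node
        self.children = defaultdict(dict)                     # input -> {output -> Node}
        self.counts = defaultdict(lambda: defaultdict(int))   # input -> output -> observed frequency

def build_prefix_tree(traces, initial_output="init"):
    """Build the tree from traces given as lists of (input, output) pairs."""
    root = Node(initial_output)
    for trace in traces:
        node = root
        for inp, out in trace:
            node.counts[inp][out] += 1
            if out not in node.children[inp]:
                node.children[inp][out] = Node(out)
            node = node.children[inp][out]
    return root

def output_distribution(node, inp):
    """Empirical output distribution after choosing input `inp` in `node`."""
    total = sum(node.counts[inp].values())
    return {out: c / total for out, c in node.counts[inp].items()} if total else {}

# Hypothetical observations of a slot machine: alternating input/output pairs.
traces = [
    [("spin", "lose"), ("spin", "win")],
    [("spin", "lose"), ("spin", "lose")],
]
tree = build_prefix_tree(traces)
print(output_distribution(tree, "spin"))  # {'lose': 1.0} at the root for these traces
```

A state-merging learner would then test whether nodes with compatible output distributions can be identified, collapsing the tree into a deterministic labeled MDP whose transitions carry the estimated probabilities.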
Original language: English
Journal: Electronic Proceedings in Theoretical Computer Science
Volume: 103
Pages (from-to): 49-63
ISSN: 2075-2180
DOIs
Publication status: Published - 2012
Event: Quantities in formal methods - Paris, France
Duration: 28 Aug 2012 - 28 Aug 2012
Conference number: 1

Workshop

Workshop: Quantities in formal methods
Number: 1
Country/Territory: France
City: Paris
Period: 28/08/2012 - 28/08/2012
