Learning Markov Decision Processes for Model Checking

Publication: Contribution to journal › Conference article in journal › Research › peer review

Abstract

Constructing an accurate system model for formal model verification can be both resource demanding
and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically
learning system models based on observed system behaviors. In this paper we extend algorithms
for learning probabilistic automata to reactive systems, where the observed system behavior is in
the form of alternating sequences of inputs and outputs. We propose an algorithm for automatically
learning a deterministic labeled Markov decision process model from the observed behavior of a
reactive system. The proposed learning algorithm is adapted from algorithms for learning deterministic
probabilistic finite automata, and extended to include both probabilistic and nondeterministic transitions.
The algorithm is empirically analyzed and evaluated by learning system models of slot machines. The
evaluation is performed by analyzing the probabilistic linear temporal logic properties of the system
as well as by analyzing the schedulers, in particular the optimal schedulers, induced by the learned
models.
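
The paper itself contains no code, but the general idea can be illustrated with a short sketch. The following is a hypothetical, simplified illustration of ALERGIA-style learning from alternating input/output traces: it builds a frequency prefix tree, shows a Hoeffding-style compatibility test of the kind used to decide state merges, and normalizes counts into an output distribution per input, leaving the input choice nondeterministic (to be resolved by a scheduler). All function names and the slot-machine-like example data are illustrative assumptions, not the paper's exact algorithm or experiments.

# Hypothetical sketch of ALERGIA-style learning from input/output traces.
# Illustrative only; not the algorithm or data from the paper.
from collections import defaultdict
from math import log, sqrt

def build_io_frequency_tree(traces):
    """Build a frequency prefix tree over alternating (input, output) pairs.

    Each trace is a list of (input, output) pairs. A node is identified by the
    prefix of pairs leading to it; counts[node][(i, o)] records how often
    output o was observed after issuing input i at that node.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        node = ()
        for inp, out in trace:
            counts[node][(inp, out)] += 1
            node = node + ((inp, out),)
    return counts

def hoeffding_compatible(n1, f1, n2, f2, alpha=0.05):
    """Hoeffding-style check: are the empirical frequencies f1/n1 and f2/n2
    close enough to come from the same underlying probability? Tests of this
    kind drive the state-merging decisions in ALERGIA-like algorithms."""
    if n1 == 0 or n2 == 0:
        return True
    bound = sqrt(0.5 * log(2.0 / alpha)) * (1.0 / sqrt(n1) + 1.0 / sqrt(n2))
    return abs(f1 / n1 - f2 / n2) < bound

def estimate_output_distributions(counts):
    """Normalize counts into, for each node and each input, a probability
    distribution over outputs. This is the probabilistic part of the learned
    model; the choice of input stays nondeterministic and is resolved by a
    scheduler during model checking."""
    model = {}
    for node, freqs in counts.items():
        totals = defaultdict(int)
        for (inp, _out), c in freqs.items():
            totals[inp] += c
        per_input = defaultdict(dict)
        for (inp, out), c in freqs.items():
            per_input[inp][out] = c / totals[inp]
        model[node] = dict(per_input)
    return model

# Tiny made-up example in the spirit of the slot-machine case study.
traces = [[("spin", "win"), ("spin", "lose")],
          [("spin", "lose"), ("stop", "idle")]]
tree = build_io_frequency_tree(traces)
print(estimate_output_distributions(tree)[()])  # {'spin': {'win': 0.5, 'lose': 0.5}}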
Original language: English
Journal: Electronic Proceedings in Theoretical Computer Science
Volume: 103
Pages (from-to): 49-63
ISSN: 2075-2180
DOI: 10.4204/EPTCS.103.6
Status: Published - 2012
Event: Quantities in formal methods - Paris, France
Duration: 28 Aug 2012 → 28 Aug 2012
Conference number: 1

Workshop

Workshop: Quantities in formal methods
Number: 1
Country: France
City: Paris
Period: 28/08/2012 → 28/08/2012

Fingerprint

Model checking
Temporal logic
Finite automata
Learning algorithms
Learning systems

Cite this

@inproceedings{bc0b2d17dc0e4031bee0b8a9ad2df5fa,
title = "Learning Markov Decision Processes for Model Checking",
abstract = "Constructing an accurate system model for formal model verification can be both resource demanding and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically learning system models based on observed system behaviors. In this paper we extend algorithms for learning probabilistic automata to reactive systems, where the observed system behavior is in the form of alternating sequences of inputs and outputs. We propose an algorithm for automatically learning a deterministic labeled Markov decision process model from the observed behavior of a reactive system. The proposed learning algorithm is adapted from algorithms for learning deterministic probabilistic finite automata, and extended to include both probabilistic and nondeterministic transitions. The algorithm is empirically analyzed and evaluated by learning system models of slot machines. The evaluation is performed by analyzing the probabilistic linear temporal logic properties of the system as well as by analyzing the schedulers, in particular the optimal schedulers, induced by the learned models.",
author = "Mao, Hua and Chen, Yingke and Jaeger, Manfred and Nielsen, {Thomas Dyhre} and Larsen, {Kim Guldstrand} and Nielsen, Brian",
year = "2012",
doi = "10.4204/EPTCS.103.6",
language = "English",
volume = "103",
pages = "49--63",
journal = "Electronic Proceedings in Theoretical Computer Science",
issn = "2075-2180",
publisher = "Open Publishing Association",

}

Learning Markov Decision Processes for Model Checking. / Mao, Hua; Chen, Yingke; Jaeger, Manfred; Nielsen, Thomas Dyhre; Larsen, Kim Guldstrand; Nielsen, Brian.

In: Electronic Proceedings in Theoretical Computer Science, Vol. 103, 2012, pp. 49-63.

