Teaching Stratego to Play Ball: Optimal Synthesis for Continuous Space MDPs

Manfred Jaeger; Peter Gjøl Jensen; Kim Guldstrand Larsen; Axel Bernard E Legay; Sean Sedwards; Jakob Haahr Taankvist

doi:10.1007/978-3-030-31784-3_5

Manfred Jaeger, Peter Gjøl Jensen^*, Kim Guldstrand Larsen, Axel Bernard E Legay, Sean Sedwards, Jakob Haahr Taankvist

^*Corresponding author

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

16 Citations (Scopus)

312 Downloads (Pure)

Abstract

Formal models of cyber-physical systems, such as priced timed Markov decision processes, require a state space with continuous and discrete components. The problem of controller synthesis for such systems can then be cast as finding optimal strategies for Markov decision processes over a Euclidean state space. We develop two different reinforcement learning strategies that tackle the problem of continuous state spaces via online partition refinement techniques. We provide theoretical insights into the convergence of partition refinement schemes. Our techniques are implemented in Uppaal Stratego. Experimental results show the advantages of our new techniques over previous optimization algorithms of Uppaal Stratego.

Original language	English
Title	Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Proceedings: ATVA 2019: Automated Technology for Verification and Analysis
Editors	Yu-Fang Chen, Chih-Hong Cheng, Javier Esparza
Number of pages	17
Publisher	Springer
Publication date	28 Oct 2019
Pages	81-97
ISBN (Print)	978-3-030-31783-6
ISBN (Electronic)	978-3-030-31784-3
DOI	https://doi.org/10.1007/978-3-030-31784-3_5
Status	Published - 28 Oct 2019
Event	International Symposium on Automated Technology for Verification and Analysis - Taipei, Taiwan. Duration: 28 Oct 2019 → 31 Oct 2019

Conference

Conference	International Symposium on Automated Technology for Verification and Analysis
Country/Territory	Taiwan
City	Taipei
Period	28/10/2019 → 31/10/2019

Name	Lecture Notes in Computer Science
Volume	11781
ISSN	0302-9743

Access to document

10.1007/978-3-030-31784-3_5

Accepted manuscript, 795 KB

Citation formats

Jaeger, M., Jensen, P. G., Larsen, K. G., Legay, A. B. E., Sedwards, S., & Taankvist, J. H. (2019). Teaching Stratego to Play Ball: Optimal Synthesis for Continuous Space MDPs. In Y-F. Chen, C-H. Cheng, & J. Esparza (Eds.), Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Proceedings: ATVA 2019: Automated Technology for Verification and Analysis (pp. 81-97). Springer. https://doi.org/10.1007/978-3-030-31784-3_5

Jaeger, Manfred ; Jensen, Peter Gjøl ; Larsen, Kim Guldstrand et al. / Teaching Stratego to Play Ball : Optimal Synthesis for Continuous Space MDPs. Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Proceedings: ATVA 2019: Automated Technology for Verification and Analysis. ed. / Yu-Fang Chen ; Chih-Hong Cheng ; Javier Esparza. Springer, 2019. pp. 81-97 (Lecture Notes in Computer Science, Vol. 11781).

@inproceedings{8215b9e2fec048ab9b02b61d0058b401,

title = "Teaching Stratego to Play Ball: Optimal Synthesis for Continuous Space MDPs",

abstract = "Formal models of cyber-physical systems, such as priced timed Markov decision processes, require a state space with continuous and discrete components. The problem of controller synthesis for such systems can then be cast as finding optimal strategies for Markov decision processes over a Euclidean state space. We develop two different reinforcement learning strategies that tackle the problem of continuous state spaces via online partition refinement techniques. We provide theoretical insights into the convergence of partition refinement schemes. Our techniques are implemented in Uppaal Stratego. Experimental results show the advantages of our new techniques over previous optimization algorithms of Uppaal Stratego.",

author = "Manfred Jaeger and Jensen, {Peter Gj{\o}l} and Larsen, {Kim Guldstrand} and Legay, {Axel Bernard E} and Sean Sedwards and Taankvist, {Jakob Haahr}",

year = "2019",

month = oct,

day = "28",

doi = "10.1007/978-3-030-31784-3_5",

language = "English",

isbn = "978-3-030-31783-6",

series = "Lecture Notes in Computer Science",

publisher = "Springer",

pages = "81--97",

editor = "Yu-Fang Chen and Chih-Hong Cheng and Javier Esparza",

booktitle = "Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Proceedings",

address = "Germany",

note = "International Symposium on Automated Technology for Verification and Analysis, ATVA ; Conference date: 28-10-2019 Through 31-10-2019",

}

Jaeger, M, Jensen, PG, Larsen, KG, Legay, ABE, Sedwards, S & Taankvist, JH 2019, Teaching Stratego to Play Ball: Optimal Synthesis for Continuous Space MDPs. in Y-F Chen, C-H Cheng & J Esparza (eds), Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Proceedings: ATVA 2019: Automated Technology for Verification and Analysis. Springer, Lecture Notes in Computer Science, vol. 11781, pp. 81-97, International Symposium on Automated Technology for Verification and Analysis, Taipei, Taiwan, 28/10/2019. https://doi.org/10.1007/978-3-030-31784-3_5

Teaching Stratego to Play Ball: Optimal Synthesis for Continuous Space MDPs. / Jaeger, Manfred ; Jensen, Peter Gjøl ; Larsen, Kim Guldstrand et al.
Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Proceedings: ATVA 2019: Automated Technology for Verification and Analysis. ed. / Yu-Fang Chen; Chih-Hong Cheng; Javier Esparza. Springer, 2019. pp. 81-97 (Lecture Notes in Computer Science, Vol. 11781).

TY - GEN

T1 - Teaching Stratego to Play Ball

T2 - International Symposium on Automated Technology for Verification and Analysis

AU - Jaeger, Manfred

AU - Jensen, Peter Gjøl

AU - Larsen, Kim Guldstrand

AU - Legay, Axel Bernard E

AU - Sedwards, Sean

AU - Taankvist, Jakob Haahr

PY - 2019/10/28

Y1 - 2019/10/28

N2 - Formal models of cyber-physical systems, such as priced timed Markov decision processes, require a state space with continuous and discrete components. The problem of controller synthesis for such systems can then be cast as finding optimal strategies for Markov decision processes over a Euclidean state space. We develop two different reinforcement learning strategies that tackle the problem of continuous state spaces via online partition refinement techniques. We provide theoretical insights into the convergence of partition refinement schemes. Our techniques are implemented in Uppaal Stratego. Experimental results show the advantages of our new techniques over previous optimization algorithms of Uppaal Stratego.

AB - Formal models of cyber-physical systems, such as priced timed Markov decision processes, require a state space with continuous and discrete components. The problem of controller synthesis for such systems can then be cast as finding optimal strategies for Markov decision processes over a Euclidean state space. We develop two different reinforcement learning strategies that tackle the problem of continuous state spaces via online partition refinement techniques. We provide theoretical insights into the convergence of partition refinement schemes. Our techniques are implemented in Uppaal Stratego. Experimental results show the advantages of our new techniques over previous optimization algorithms of Uppaal Stratego.

U2 - 10.1007/978-3-030-31784-3_5

DO - 10.1007/978-3-030-31784-3_5

M3 - Article in proceeding

SN - 978-3-030-31783-6

T3 - Lecture Notes in Computer Science

SP - 81

EP - 97

BT - Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Proceedings

A2 - Chen, Yu-Fang

A2 - Cheng, Chih-Hong

A2 - Esparza, Javier

PB - Springer

Y2 - 28 October 2019 through 31 October 2019

ER -

Jaeger M, Jensen PG, Larsen KG, Legay ABE, Sedwards S, Taankvist JH. Teaching Stratego to Play Ball: Optimal Synthesis for Continuous Space MDPs. In Chen Y-F, Cheng C-H, Esparza J, editors, Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Proceedings: ATVA 2019: Automated Technology for Verification and Analysis. Springer. 2019. p. 81-97. (Lecture Notes in Computer Science, Vol. 11781). doi: 10.1007/978-3-030-31784-3_5