Teaching Stratego to Play Ball: Optimal Synthesis for Continuous Space MDPs

Manfred Jaeger, Peter Gjøl Jensen*, Kim Guldstrand Larsen, Axel Bernard E Legay, Sean Sedwards, Jakob Haahr Taankvist

*Corresponding author

Publication: Contribution to book/anthology/report/conference proceeding; conference article in proceeding; research; peer-reviewed

16 Citations (Scopus)
312 Downloads (Pure)

Abstract

Formal models of cyber-physical systems, such as priced timed Markov decision processes, require a state space with continuous and discrete components. The problem of controller synthesis for such systems can then be cast as finding optimal strategies for Markov decision processes over a Euclidean state space. We develop two different reinforcement learning strategies that tackle the problem of continuous state spaces via online partition refinement techniques. We provide theoretical insights into the convergence of partition refinement schemes. Our techniques are implemented in Uppaal Stratego. Experimental results show the advantages of our new techniques over previous optimization algorithms of Uppaal Stratego.
Original language: English
Title: Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Proceedings
Editors: Yu-Fang Chen, Chih-Hong Cheng, Javier Esparza
Number of pages: 17
Publisher: Springer
Publication date: 28 Oct 2019
Pages: 81-97
ISBN (Print): 978-3-030-31783-6
ISBN (Electronic): 978-3-030-31784-3
DOI
Status: Published - 28 Oct 2019
Event: International Symposium on Automated Technology for Verification and Analysis - Taipei, Taiwan
Duration: 28 Oct 2019 to 31 Oct 2019

Conference

Conference: International Symposium on Automated Technology for Verification and Analysis
Country/Territory: Taiwan
City: Taipei
Period: 28/10/2019 to 31/10/2019
Name: Lecture Notes in Computer Science
Volume: 11781
ISSN: 0302-9743
