Approximating Euclidean by Imprecise Markov Decision Processes

Research output: Contribution to book/anthology/report/conference proceedingArticle in proceedingResearchpeer-review

1 Citation (Scopus)

Abstract

Euclidean Markov decision processes are a powerful tool for modeling control problems under uncertainty over continuous domains. Finite state imprecise, Markov decision processes can be used to approximate the behavior of these infinite models. In this paper we address two questions: first, we investigate what kind of approximation guarantees are obtained when the Euclidean process is approximated by finite state approximations induced by increasingly fine partitions of the continuous state space. We show that for cost functions over finite time horizons the approximations become arbitrarily precise. Second, we use imprecise Markov decision process approximations as a tool to analyse and validate cost functions and strategies obtained by reinforcement learning. We find that, on the one hand, our new theoretical results validate basic design choices of a previously proposed reinforcement learning approach. On the other hand, the imprecise Markov decision process approximations reveal some inaccuracies in the learned cost functions.
Original languageEnglish
Title of host publicationInternational Symposium on Leveraging Applications of Formal Methods (ISoLA 2020)
EditorsTiziana Margaria, Bernhard Steffen
Number of pages15
Volume12476
Place of PublicationLNCS
PublisherSpringer
Publication date2020
Pages275-289
ISBN (Print)978-3-030-61361-7
ISBN (Electronic)978-3-030-61362-4
DOIs
Publication statusPublished - 2020
EventInternational Symposium on Leveraging Applications of Formal Methods 2020: ISoLA 2020 - Rhodos, Greece
Duration: 20 Oct 202030 Oct 2020

Conference

ConferenceInternational Symposium on Leveraging Applications of Formal Methods 2020
CountryGreece
CityRhodos
Period20/10/202030/10/2020
SeriesLecture Notes in Computer Science
ISSN0302-9743

Fingerprint Dive into the research topics of 'Approximating Euclidean by Imprecise Markov Decision Processes'. Together they form a unique fingerprint.

Cite this