[SAP25]
Yannik Schnitzer, Alessandro Abate and David Parker.
Learning Provably Robust Policies in Uncertain Parametric Environments.
In Proc. 31st International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS'25), Springer.
May 2025.
[pdf]
[bib]
[Presents a framework for robust policy learning in unknown environments, implemented on top of PRISM.]
|
Notes:
The original publication is available at link.springer.com.
|
Links:
[Google]
[Google Scholar]
|
Abstract.
We present a data-driven approach for producing policies that
are provably robust across unknown stochastic environments. Existing
approaches can learn models of a single environment as an interval
Markov decision process (IMDP) and produce a robust policy with
a probably approximately correct (PAC) guarantee on its performance.
However, these are unable to reason about the impact of environmental
parameters underlying the uncertainty. We propose a framework based
on parametric Markov decision processes with unknown distributions
over parameters. We learn and analyse IMDPs for a set of unknown
sample environments induced by parameters. The key challenge is then
to produce meaningful performance guarantees that combine the two
layers of uncertainty: (1) multiple environments induced by parameters
with an unknown distribution; (2) unknown induced environments which
are approximated by IMDPs. We present a novel approach based on
scenario optimisation that yields a single PAC guarantee quantifying
the risk level for which a specified performance level can be assured in
unseen environments, plus a means to trade off risk and performance.
We implement and evaluate our framework using multiple robust policy
generation methods on a range of benchmarks. We show that our approach
produces tight bounds on a policy’s performance with high confidence.
|