
Belief Propagation of Pareto Front in Multi-Objective MDP Graphs

Buonanno A.
2023-01-01

Abstract

In the context of Markov Decision Processes (MDPs), the framework of forward-backward probability propagation on factor graphs has proven useful for finding optimal policies. However, in cases involving vector rewards, one must evaluate a trade-off among the constituent objectives. In this work, assuming multiple rewards, we show how to use the framework of belief propagation to dynamically generate the Pareto front and propagate it as a forward flow distribution. The idea is applied to path planning on discrete 1D and 2D grids where different sets of states have vector rewards in the form of priors.
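The core operation the abstract describes, propagating a set of non-dominated reward vectors forward through the graph, can be illustrated with a minimal sketch. This is not the paper's implementation; the chain structure, function names, and the assumption of additive two-objective rewards on a 1D grid are illustrative choices for exposition.

```python
def dominates(a, b):
    """True if vector a Pareto-dominates b (maximization):
    a is at least as good in every objective and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_filter(points):
    """Keep only the non-dominated reward vectors (the current front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def propagate_front(chain_rewards):
    """Forward pass over a 1D chain of states. Each state offers a set of
    candidate reward vectors (its 'prior'); the running front is the
    Pareto filter of all sums of a front point and a state candidate."""
    front = [(0.0, 0.0)]  # start with the zero accumulated reward
    for candidates in chain_rewards:
        sums = {tuple(f[i] + c[i] for i in range(len(c)))
                for f in front for c in candidates}  # set removes duplicates
        front = pareto_filter(list(sums))
    return front


# Two states, each offering reward (1,0) or (0,1): three trade-offs survive.
front = propagate_front([[(1.0, 0.0), (0.0, 1.0)],
                         [(1.0, 0.0), (0.0, 1.0)]])
print(sorted(front))  # → [(0.0, 2.0), (1.0, 1.0), (2.0, 0.0)]
```

Pruning dominated vectors at every step keeps the propagated set small while preserving every optimal trade-off, which is what lets the front be carried forward as a flow rather than enumerating all paths.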
Keywords: Belief propagation; Multi-objective MDP; Pareto front
Files for this product:
No files are associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12079/76987
Citations
  • Scopus: 0