Jean-Yves Le Boudec

  1. Mean field for Markov Decision Processes: from Discrete to Continuous Optimization.

    Authors: Nicolas Gast, Bruno Gaujal, Jean-Yves Le Boudec
    Subjects: Artificial Intelligence
    Abstract:

    We study the convergence of Markov Decision Processes made of a large number
    of objects to optimization problems on ordinary differential equations (ODE).
    We show that the optimal reward of such a Markov Decision Process, satisfying a
    Bellman equation, converges to the solution of a continuous
    Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of
    the Markov Decision Process. We give bounds on the difference of the rewards,
    and a constructive algorithm for deriving an approximating solution to the
    Markov Decision Process from a solution of the HJB equations.
