The Partially Observable Markov Decision Process has long been recognized as
a rich framework for real-world planning and control problems, especially in
robotics. However exact solutions in this framework are typically
computationally intractable for all but the smallest problems. A well-known
technique for speeding up POMDP solving involves performing value backups at
specific belief points, rather than over the entire belief simplex. The
efficiency of this approach, however, depends greatly on the selection of
points.
Standard value function approaches to finding policies for Partially
Observable Markov Decision Processes (POMDPs) are generally considered to be
intractable for large models. The intractability of these algorithms is to a
large extent a consequence of computing an exact, optimal policy over the
entire belief space. However, in real-world POMDP problems, computing the
optimal policy for the full belief space is often unnecessary for good control
even for problems with complicated policy classes.