A decision process in which rewards depend on history rather than merely on
the current state is called a decision process with non-Markovian rewards
(NMRDP). In decision-theoretic planning, where many desirable behaviours are
more naturally expressed as properties of execution sequences rather than as
properties of states, NMRDPs form a more natural model than the commonly
adopted fully Markovian decision process (MDP) model.