This paper describes a computationally feasible approximation to the AIXI
agent, a universal reinforcement learning agent for arbitrary environments.
AIXI is scaled down in two key ways: First, the class of environment models is
restricted to all prediction suffix trees of a fixed maximum depth. This allows
a Bayesian mixture of environment models to be computed in time proportional to
the logarithm of the size of the model class. Secondly, the finite-horizon
expectimax search is approximated by an asymptotically convergent Monte Carlo
Tree Search technique. This scaled down AIXI agent is empirically shown to be
effective on a wide class of toy problem domains, ranging from simple fully
observable games to small POMDPs. We explore the limits of this approximate
agent and propose a general heuristic framework for scaling this technique to
much larger problems.