Following a recent surge in using history-based methods for resolving
perceptual aliasing in reinforcement learning, we introduce an algorithm based
on the feature reinforcement learning framework called PhiMDP. To create a
practical algorithm we devise a stochastic search procedure for a class of
context trees based on parallel tempering and a specialized proposal
distribution. We provide the first empirical evaluation for PhiMDP.
This article is a brief personal account of the past, present, and future of
algorithmic randomness, emphasizing its role in inductive inference and
artificial intelligence. It is written for a general audience interested in
science and philosophy. Intuitively, randomness is a lack of order or
predictability. If randomness is the opposite of determinism, then algorithmic
randomness is the opposite of computability. Besides many other things, these
concepts have been used to quantify Ockham's razor, solve the induction
problem, and define intelligence.
This encyclopedic article gives a mini-introduction into the theory of
universal learning, founded by Ray Solomonoff in the 1960s and significantly
developed and extended in the last decade. It explains the spirit of universal
learning, but necessarily glosses over technical subtleties.
Hutter (2007) recently introduced the loss rank principle (LoRP) as a
generalpurpose principle for model selection. The LoRP enjoys many attractive
properties and deserves further investigations. The LoRP has been well-studied
for regression framework in Hutter and Tran (2010). In this paper, we study the
LoRP for classification framework, and develop it further for model selection
problems in unsupervised learning where the main interest is to describe the
associations between input measurements, like cluster analysis or graphical
modelling.
The problem of identifying the 3D pose of a known object from a given 2D
image has important applications in Computer Vision ranging from robotic vision
to image analysis. Our proposed method of registering a 3D model of a known
object on a given 2D photo of the object has numerous advantages over existing
methods: It does neither require prior training nor learning, nor knowledge of
the camera parameters, nor explicit point correspondences or matching features
between image and model.
Increasingly encompassing models have been suggested for our world. Theories
range from generally accepted to increasingly speculative to apparently bogus.
The progression of theories from ego- to geo- to helio-centric models to
universe and multiverse theories and beyond was accompanied by a dramatic
increase in the sizes of the postulated worlds, with humans being expelled from
their center to ever more remote and random locations.
This paper introduces a principled approach for the design of a scalable
general reinforcement learning agent. This approach is based on a direct
approximation of AIXI, a Bayesian optimality notion for general reinforcement
learning agents. Previously, it has been unclear whether the theory of AIXI
could motivate the design of practical algorithms. We answer this hitherto open
question in the affirmative, by providing the first computationally feasible
approximation to the AIXI agent.
We are studying long term sequence prediction (forecasting). We approach this
by investigating criteria for choosing a compact useful state representation.
The state is supposed to summarize useful information from the history. We want
a method that is asymptotically consistent in the sense it will provably
eventually only choose between alternatives that satisfy an optimality property
related to the used criterion.
The two parameter Poisson-Dirichlet process is also known as the Pitman-Yor
Process and related to the Chinese Restaurant Process, is a generalisation of
the Dirichlet Process, and is increasingly being used for probabilistic
modelling in discrete areas such as language and images. This article reviews
the theory of the Poisson-Dirichlet process in terms of its consistency for
estimation, the convergence rates and the posteriors of data. This theory has
been well developed for continuous distributions (more generally referred to as
non-atomic distributions).
A key issue in statistics and machine learning is to automatically select the
"right" model complexity, e.g., the number of neighbors to be averaged over in
k nearest neighbor (kNN) regression or the polynomial degree in regression with
polynomials. We suggest a novel principle - the Loss Rank Principle (LoRP) -
for model selection in regression and classification. It is based on the loss
rank, which counts how many other (fictitious) data would be fitted better.
LoRP selects the model that has minimal loss rank.
Walley's Imprecise Dirichlet Model (IDM) for categorical i.i.d. data extends
the classical Dirichlet model to a set of priors. It overcomes several
fundamental problems which other approaches to uncertainty suffer from. Yet, to
be useful in practice, one needs efficient ways for computing the
imprecise=robust sets or intervals. The main objective of this work is to
derive exact, conservative, and approximate, robust and credible interval
estimates under the IDM for a large class of statistical estimators, including
the entropy and mutual information.
While statistics focusses on hypothesis testing and on estimating (properties
of) the true sampling distribution, in machine learning the performance of
learning algorithms on future data is the primary issue. In this paper we
bridge the gap with a general principle (PHI) that identifies hypotheses with
best predictive performance. This includes predictive point and interval
estimation, simple and composite hypothesis testing, (mixture) model selection,
and others as special cases. For concrete instantiations we will recover
well-known methods, variations thereof, and new ones.
Finding the three-dimensional representation of all or a part of a scene from
a single two dimensional image is a challenging task. In this paper we propose
a method for identifying the pose and location of objects with circular
protrusions in three dimensions from a single image and a 3d representation or
model of the object of interest. To do this, we present a method for
identifying ellipses and their properties quickly and reliably with a novel
technique that exploits intensity differences between objects and a geometric
technique for matching an ellipse in 2d to a circle in 3d.
The Minimum Description Length (MDL) principle selects the model that has the
shortest code for data plus model. We show that for a countable class of
models, MDL predictions are close to the true distribution in a strong sense.
The result is completely general. No independence, ergodicity, stationarity,
identifiability, or other assumption on the model class need to be made. More
formally, we show that for any countable class of models, the distributions
selected by MDL (or MAP) asymptotically predict (merge with) the true measure
in the class in total variation distance.
This paper describes a computationally feasible approximation to the AIXI
agent, a universal reinforcement learning agent for arbitrary environments.
AIXI is scaled down in two key ways: First, the class of environment models is
restricted to all prediction suffix trees of a fixed maximum depth. This allows
a Bayesian mixture of environment models to be computed in time proportional to
the logarithm of the size of the model class. Secondly, the finite-horizon
expectimax search is approximated by an asymptotically convergent Monte Carlo
Tree Search technique.