Detecting and locating changes in highly multivariate data is a major concern
in several current statistical applications. In this context, the first
contribution of the paper is a novel non-parametric two-sample homogeneity test
for multivariate data based on the well-known Wilcoxon rank statistic. The
proposed two-sample homogeneity test statistic can be extended to deal with
ordinal or censored data as well as to test for the homogeneity of more than
two samples.
Online (also called "recursive" or "adaptive") estimation of fixed model
parameters in hidden Markov models is a topic of much interest in times series
modelling. In this work, we propose an online parameter estimation algorithm
that combines two key ideas. The first one, which is deeply rooted in the
Expectation-Maximization (EM) methodology consists in reparameterizing the
problem using complete-data sufficient statistics. The second ingredient
consists in exploiting a purely recursive form of smoothing in HMMs based on an
auxiliary recursion.
This paper presents a finite-time analysis of the KL-UCB algorithm, an
online, horizon-free index policy for stochastic bandit problems. We prove two
distinct results: first, for arbitrary bounded rewards, the KL-UCB algorithm
satisfies a uniformly better regret bound than UCB or UCB2; second, in the
special case of Bernoulli rewards, it reaches the lower bound of Lai and
Robbins. Furthermore, we show that simple adaptations of the KL-UCB algorithm
are also optimal for specific classes of (possibly unbounded) rewards,
including those generated from exponential families of distributions.
We propose a non-parametric statistical procedure for detecting multiple
change-points in multidimensional signals. The method is based on a test
statistic that generalizes the well-known Kruskal-Wallis procedure to the
multivariate setting. The proposed approach does not require any knowledge
about the distribution of the observations and is parameter-free. It is
computationally efficient thanks to the use of dynamic programming and can also
be applied when the number of change-points is unknown.
We consider model-based reinforcement learning in finite Markov Decision
Processes (MDPs), focussing on so-called optimistic strategies. Optimism is
usually implemented by carrying out extended value iterations, under a
constraint of consistency with the estimated model transition probabilities. In
this paper, we strongly argue in favor of using the Kullback-Leibler (KL)
divergence for this purpose. By study- ing the linear maximization problem
under KL constraints, we provide an efficient algorithm for solving
KL-optimistic extended value iteration.
We propose a novel approach for distributed statistical detection of
change-points in high-volume network traffic. We consider more specifically the
task of detecting and identifying the targets of Distributed Denial of Service
(DDoS) attacks. The proposed algorithm, called DTopRank, performs distributed
network anomaly detection by aggregating the partial information gathered in a
set of network monitors.
Conditional Random Fields (CRFs) constitute a popular and efficient approach
for supervised sequence labelling. CRFs can cope with large description spaces
and can integrate some form of structural dependency between labels. In this
contribution, we address the issue of efficient feature selection for CRFs
based on imposing sparsity through an L1 penalty. We first show how sparsity of
the parameter set can be exploited to significantly speed up training and
labelling. We then introduce coordinate descent parameter update schemes for
CRFs with L1 regularization.