Consider a random sample from a bivariate distribution function $F$ in the
max-domain of attraction of an extreme-value distribution function $G$. This
$G$ is characterized by two extreme-value indices and a spectral measure, the
latter determining the tail dependence structure of $F$. A major issue in
multivariate extreme-value theory is the estimation of the spectral measure
$\Phi_p$ with respect to the $L_p$ norm.
This paper discusses asymptotically distribution free tests for the classical
goodness-of-fit hypothesis of an error distribution in nonparametric regression
models. These tests are based on the same martingale transform of the residual
empirical process as used in the one sample location model. This transformation
eliminates extra randomization due to covariates but not due to the errors, which
is intrinsically present in the estimators of the regression function. Thus,
tests based on the transformed process have, generally, better power.
We suggest a robust nearest-neighbor approach to classifying high-dimensional
data. The method enhances sensitivity by employing a threshold and truncates to
a sequence of zeros and ones in order to reduce the deleterious impact of
heavy-tailed data. Empirical rules are suggested for choosing the threshold.
They require the bare minimum of data; only one data vector is needed from each
population. Theoretical and numerical aspects of performance are explored,
paying particular attention to the impacts of correlation and heterogeneity
among data components.
The $p_1$ model is a directed random graph model used to describe dyadic
interactions in a social network in terms of effects due to differential
attraction (popularity) and expansiveness, as well as an additional effect due
to reciprocation. In this article we carry out an algebraic statistics analysis
of this model. We show that the $p_1$ model is a toric model specified by a
multi-homogeneous ideal. We conduct an extensive study of the Markov bases for
$p_1$ models that incorporate explicitly the constraint arising from
multi-homogeneity.
We consider estimating an unknown signal, both blocky and sparse, which is
corrupted by additive noise. We study three interrelated least squares
procedures and their asymptotic properties. The first procedure is the fused
lasso, put forward by Friedman et al. [Ann. Appl. Statist. 1 (2007) 302--332],
which we modify into a different estimator, called the fused adaptive lasso,
with better properties.
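For orientation, a fused-lasso-type least squares criterion for a signal that is both blocky and sparse is commonly written as below. The notation ($y_i$ for the noisy observations, $\mu_i$ for the signal, $\lambda_1,\lambda_2\geq 0$ for the tuning parameters) is assumed here for illustration and is not taken from the abstract.

\[
\hat{\mu}=\arg\min_{\mu\in\mathbb{R}^n}\Biggl\{\frac{1}{2}\sum_{i=1}^n(y_i-\mu_i)^2
+\lambda_1\sum_{i=1}^n|\mu_i|
+\lambda_2\sum_{i=2}^n|\mu_i-\mu_{i-1}|\Biggr\}.
\]

The $\lambda_1$ penalty encourages sparsity, while the $\lambda_2$ penalty on successive differences encourages piecewise-constant (blocky) solutions.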
Let $p_n(y)=\sum_k\hat{\alpha}_k\phi(y-k)+\sum_{l=0}^{j_n-1}\sum_k\hat
{\beta}_{lk}2^{l/2}\psi(2^ly-k)$ be the linear wavelet density estimator, where
$\phi$, $\psi$ are a father and a mother wavelet (with compact support),
$\hat{\alpha}_k$, $\hat{\beta}_{lk}$ are the empirical wavelet coefficients
based on an i.i.d.
Using the asymptotic minimax framework, we examine the equivalence of convergence
rates between a continuous functional deconvolution model and its
real-life discrete counterpart, over a wide range of Besov balls and for the
$L^2$-risk. For this purpose, all possible models are divided into three
groups: {\it uniform}, {\it regular} and {\it irregular}. We formulate the
conditions when each of these situations takes place.
We discuss connecting tables with zero-one entries by a subset of a Markov
basis. In this paper, as a Markov basis we consider the Graver basis, which
corresponds to the unique minimal Markov basis for the Lawrence lifting of the
original configuration. Since the Graver basis tends to be large, it is of
interest to clarify conditions such that a subset of the Graver basis, in
particular a minimal Markov basis itself, connects tables with zero-one
entries. We give some theoretical results on the connectivity of tables with
zero-one entries.
This is a technical appendix to "Adaptive estimation of stationary Gaussian
fields". We present several proofs that have been skipped in the main paper.
We reconsider the existing kernel estimators for a copula function, as
proposed in Gijbels and Mielniczuk [Comm. Statist. Theory Methods 19 (1990)
445--464], Fermanian, Radulovi\v{c} and Wegkamp [Bernoulli 10 (2004) 847--860]
and Chen and Huang [Canad. J. Statist. 35 (2007) 265--282]. All of these
estimators have as a drawback that they can suffer from a corner bias problem.
A way to deal with this is to impose rather stringent conditions on the copula,
thereby ruling out many classical families of copulas.
In this manuscript we introduce a generalisation of the log-Normal
distribution that is inspired by a modification of the Kapteyn multiplicative
process using the $q$-product of Borges [Physica A \textbf{340}, 95 (2004)].
Depending on the value of $q$, the distribution increases the tail weight for small (when
$q<1$) or large (when $q>1$) values of the variable under analysis. The usual
log-Normal distribution is retrieved when $q=1$. The main statistical features
of this distribution are presented, as well as a related random number
generator and tables of quantiles of the Kolmogorov-Smirnov distance.
We present theoretical properties of the log-concave maximum likelihood
estimator of a density based on an independent and identically distributed
sample in $\mathbb{R}^d$. Our study covers both the case where the true
underlying density is log-concave, and where this model is misspecified. We
begin by showing that for a sequence of log-concave densities, convergence in
distribution implies much stronger types of convergence -- in particular, it
implies convergence in Hellinger distance and even in certain exponentially
weighted total variation norms.
In this paper, we study the asymptotic posterior distribution of linear
functionals of the density. In particular, we give general conditions to obtain
a semiparametric version of the Bernstein--von Mises theorem. We then apply this
general result to nonparametric priors based on infinite dimensional
exponential families. As a byproduct, we also derive adaptive nonparametric
rates of concentration of the posterior distributions under these families of
priors on the class of Sobolev and Besov spaces.
Consider a continuous random pair $(X,Y)$ whose dependence is characterized
by an extreme-value copula with Pickands dependence function $A$. When the
marginal distributions of $X$ and $Y$ are known, several consistent estimators
of $A$ are available. Most of them are variants of the estimators due to
Pickands [Bull. Inst. Internat. Statist. 49 (1981) 859--878] and
Cap\'{e}ra\`{a}, Foug\`{e}res and Genest [Biometrika 84 (1997) 567--577]. In
this paper, rank-based versions of these estimators are proposed for the more
common case where the margins of $X$ and $Y$ are unknown.
Principal component analysis (PCA) is a classical method for dimensionality
reduction based on extracting the dominant eigenvectors of the sample
covariance matrix. However, PCA is well known to behave poorly in the ``large
$p$, small $n$'' setting, in which the problem dimension $p$ is comparable to
or larger than the sample size $n$. This paper studies PCA in this
high-dimensional regime, but under the additional assumption that the maximal
eigenvector is sparse, say, with at most $k$ nonzero components.
In arXiv:0907.0079 by Cator and Lopuhaa, an asymptotic expansion for the MCD
estimators is established in a very general framework. This expansion requires
the existence and non-singularity of the derivative in a first-order Taylor
expansion. In this paper, we prove the existence of this derivative for
multivariate distributions that have a density and provide an explicit
expression. Moreover, under suitable symmetry conditions on the density, we
show that this derivative is non-singular.
It is shown that a necessary and sufficient condition for an Archimedean
copula generator to generate a $d$-dimensional copula is that the generator is
a $d$-monotone function. The class of $d$-dimensional Archimedean copulas is
shown to coincide with the class of survival copulas of $d$-dimensional
$\ell_1$-norm symmetric distributions that place no point mass at the origin.
The $d$-monotone Archimedean copula generators may be characterized using a
little-known integral transform of Williamson [Duke Math. J.
In 1985, for detecting changes in distributions Pollak introduced a specific
minimax performance metric and a randomized version of the Shiryaev-Roberts
procedure where the zero initial condition is replaced by a random variable
sampled from the quasi-stationary distribution. Pollak proved that this
procedure is third-order asymptotically optimal as the mean time to false alarm
becomes large. The question whether Pollak's procedure is strictly minimax for
any false alarm rate has been open for more than two decades, and there were
several attempts to prove this strict optimality.
Ancestral graphs can encode conditional independence relations that arise in
directed acyclic graph (DAG) models with latent and selection variables.
However, for any ancestral graph, there may be several other graphs to which it
is Markov equivalent. We state and prove conditions under which two maximal
ancestral graphs are Markov equivalent to each other, thereby extending
analogous results for DAGs given by other authors. These conditions lead to an
algorithm for determining Markov equivalence that runs in time that is
polynomial in the number of vertices in the graph.
We give a finite-sample analysis of predictive inference procedures after
model selection in regression with random design. The analysis is focused on a
statistically challenging scenario where the number of potentially important
explanatory variables can be infinite, where no regularity conditions are
imposed on unknown parameters, where the number of explanatory variables in a
"good" model can be of the same order as sample size and where the number of
candidate models can be of larger order than sample size.
In this paper we examine the use of topological methods for multivariate
statistics. Using persistent homology from computational algebraic topology, a
random sample is used to construct estimators of persistent homology. This
estimation procedure can then be evaluated using the bottleneck distance
between the estimated persistent homology and the true persistent homology. The
connection to statistics comes from the fact that when viewed as a
nonparametric regression problem, the bottleneck distance is bounded by the
sup-norm loss.
Stochastic approximation Monte Carlo (SAMC) has recently been proposed by
Liang, Liu and Carroll [J. Amer. Statist. Assoc. 102 (2007) 305--320] as a
general simulation and optimization algorithm. In this paper, we propose to
improve its convergence using smoothing methods and discuss the application of
the new algorithm to Bayesian model selection problems. The new algorithm is
tested through a change-point identification example. The numerical results
indicate that the new algorithm can outperform SAMC and reversible jump MCMC
significantly for the model selection problems.
We consider nonparametric Bayesian inference using a rescaled
smooth Gaussian field as a prior for a multidimensional function. The rescaling
is achieved using a Gamma variable and the procedure can be viewed as choosing
an inverse Gamma bandwidth. The procedure is studied from a frequentist
perspective in three statistical settings involving replicated observations
(density estimation, regression and classification).
Consider the problem of estimating the $\gamma$-level set
$G^*_{\gamma}=\{x:f(x)\geq\gamma\}$ of an unknown $d$-dimensional density
function $f$ based on $n$ independent observations $X_1,...,X_n$ from the
density. This problem has been addressed under global error criteria related to
the symmetric set difference. However, in certain applications a spatially
uniform mode of convergence is desirable to ensure that the estimated set is
close to the target set everywhere. The Hausdorff error criterion provides this
degree of uniformity and, hence, is more appropriate in such situations.
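To make the contrast with global criteria concrete, here is a minimal Python sketch of the Hausdorff distance between two finite point sets, used as stand-ins for discretized level sets. The function name `hausdorff_distance` and the toy point sets are illustrative assumptions, not part of the paper.

```python
import numpy as np

def hausdorff_distance(A, B):
    """Hausdorff distance between two finite point sets (one point per row)."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.atleast_2d(np.asarray(B, dtype=float))
    # Pairwise Euclidean distances: D[i, j] = ||A[i] - B[j]||
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    # Largest distance from a point of A to its nearest point of B, and vice versa
    return max(D.min(axis=1).max(), D.min(axis=0).max())

# Hypothetical target set and an estimate with one spurious far-away point
target = [[0.0, 0.0], [1.0, 0.0]]
estimate = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
print(hausdorff_distance(target, estimate))
```

In this toy case the estimate contains the whole target plus a single stray point, yet the Hausdorff distance equals the distance from that stray point to the target. A measure-based criterion such as the symmetric set difference would barely register such a small spurious component.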
This paper discusses the problem of adaptive estimation of a univariate
object like the value of a regression function at a given point or a linear
functional in a linear inverse problem. We consider an adaptive procedure
originating from Lepski [Theory Probab. Appl. 35 (1990) 454--466] that selects
in a data-driven way one estimate out of a given class of estimates ordered by
their variability. A serious problem with using this and similar procedures is
the choice of some tuning parameters like thresholds.
We consider estimation of quantile curves for a general class of
nonstationary processes. Consistency and central limit results are obtained for
local linear quantile estimates under a mild short-range dependence condition.
Our results are applied to environmental data sets. In particular, our results
can be used to address the problem of whether climate variability has changed,
an important problem raised by IPCC (Intergovernmental Panel on Climate Change)
in 2001.
Bayesian methods for linear inverse problems are studied using
hierarchical Gaussian models. The problems are considered with different
discretizations, and we analyze the phenomena which appear when the
discretization becomes finer. A hierarchical solution method for signal
restoration problems is introduced and studied with arbitrarily fine
discretization. We show that the maximum a posteriori estimate converges to a
minimizer of the Mumford--Shah functional. A new result regarding the existence
of a minimizer of the Mumford--Shah functional is proved.
Bayesian and frequentist methods differ in many aspects, but share some basic
optimality properties. In practice, there are situations in which one of the
methods is preferred according to some criterion. We consider the case of inference
about a set of multiple parameters, which can be divided into two disjoint
subsets. On one set, a frequentist method may be favored and on the other, the
Bayesian.
We consider the problem of estimating the slope parameter in circular
functional linear regression, where scalar responses $Y_1,...,Y_n$ are modeled as
depending on 1-periodic, second-order stationary random functions $X_1,...,X_n$.
We consider an orthogonal series estimator of the slope function, obtained by replacing
the first $m$ theoretical coefficients of its expansion in the trigonometric
basis by adequate estimators.
In this paper we are interested in empirical likelihood (EL) as a method of
estimation, and we address the following two problems: (1) selecting among
various empirical discrepancies in an EL framework and (2) demonstrating that
EL has a well-defined probabilistic interpretation that would justify its use
in a Bayesian context. Using the large deviations approach, a Bayesian law of
large numbers is developed that implies that EL and the Bayesian maximum a
posteriori probability (MAP) estimators are consistent under misspecification
and that EL can be viewed as an asymptotic form of MAP.
In the analysis of cluster data, the regression coefficients are frequently
assumed to be the same across all clusters. This hampers the ability to study
the varying impacts of factors on each cluster. In this paper, a semiparametric
model is introduced to account for varying impacts of factors over clusters by
using cluster-level covariates. It achieves the parsimony of parametrization
and allows the explorations of nonlinear interactions. The random effect in the
semiparametric model also accounts for within-cluster correlation.
We consider tests of hypotheses when the parameters are not identifiable
under the null in semiparametric models, where regularity conditions for
profile likelihood theory fail. Exponential average tests based on integrated
profile likelihood are constructed and shown to be asymptotically optimal under
a weighted average power criterion with respect to a prior on the
nonidentifiable aspect of the model. These results extend existing results for
parametric models, which involve more restrictive assumptions on the form of
the alternative than do our results.
This paper establishes a necessary and sufficient condition for the
asymptotic normality of the nonparametric estimator of sample coverage proposed
by Good [Biometrika 40 (1953) 237--264]. This new necessary and sufficient
condition extends the validity of the asymptotic normality beyond the
previously proven cases.
Research on developing a general methodology for the construction of good
nonregular designs has been very active in the last decade. Recent research by
Xu and Wong [Statist. Sinica 17 (2007) 1191--1213] suggested a new class of
nonregular designs constructed from quaternary codes. This paper explores the
properties and uses of quaternary codes toward the construction of
quarter-fraction nonregular designs. Some theoretical results are obtained
regarding the aliasing structure of such designs.
Mixture models have received considerable attention recently and Newton
[Sankhy\={a} Ser. A 64 (2002) 306--322] proposed a fast recursive algorithm for
estimating a mixing distribution. We prove almost sure consistency of this
recursive estimate in the weak topology under mild conditions on the family of
densities being mixed. This recursive estimate depends on the data ordering and
a permutation-invariant modification is proposed, which is an average of the
original over permutations of the data sequence.
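A minimal Python sketch of a recursive update of this kind follows, evaluated on a fixed grid. The Gaussian kernel, the weight sequence $w_i=1/(i+1)$, the uniform starting guess and the function name `newton_recursive` are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def newton_recursive(x, theta, weights=None, sigma=1.0):
    """Recursive estimate of a mixing density on a uniform grid `theta`.

    Assumes a N(theta, sigma^2) kernel and, by default, weights w_i = 1/(i+1).
    """
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta, dtype=float)
    dtheta = theta[1] - theta[0]  # grid spacing (uniform grid assumed)
    # Uniform initial guess, normalized under the Riemann sum
    f = np.full(theta.shape, 1.0 / (dtheta * theta.size))
    if weights is None:
        weights = 1.0 / (np.arange(1, x.size + 1) + 1.0)
    for xi, wi in zip(x, weights):
        # Kernel density of the observation xi at each grid value of theta
        lik = np.exp(-0.5 * ((xi - theta) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        # "Posterior" of theta given xi, with the current estimate as prior
        post = lik * f
        post /= post.sum() * dtheta
        # Convex combination of the current estimate and the one-step posterior
        f = (1.0 - wi) * f + wi * post
    return f
```

Because the observations are processed one at a time, the output generally depends on their order; averaging the result over several random permutations of `x` is one way to approximate the permutation-invariant modification mentioned above.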
Normal mixture distributions are arguably the most important mixture models,
and also the most technically challenging. The likelihood function of the
normal mixture model is unbounded based on a set of random samples, unless an
artificial bound is placed on its component variance parameter. Moreover, the
model is not strongly identifiable, so it is hard to differentiate between
overdispersion caused by the presence of a mixture and that caused by a large
variance, and it has infinite Fisher information with respect to mixing
proportions.
Response-adaptive randomization has recently attracted a lot of attention in
the literature. In this paper, we propose a new and simple family of
response-adaptive randomization procedures that attain the Cram\'{e}r--Rao lower
bounds on the allocation variances for any allocation proportions, including
optimal allocation proportions. The allocation probability functions of
proposed procedures are discontinuous. The existing large sample theory for
adaptive designs relies on Taylor expansions of the allocation probability
functions, which do not apply to nondifferentiable cases.
We study a class of hypothesis testing problems in which, upon observing the
realization of an n-dimensional Gaussian vector, one has to decide whether the
vector was drawn from a standard normal distribution or, alternatively, whether
there is a subset of the components belonging to a certain given class of sets
whose elements have been "contaminated," that is, have a mean different from
zero. We establish some general conditions under which testing is possible and
others under which testing is hopeless with a small risk.
We consider the fundamental problem of estimating the mean of a vector
$y=X\beta+z$, where $X$ is an $n\times p$ design matrix in which one can have
far more variables than observations, and $z$ is a stochastic error term--the
so-called "$p>n$" setup. When $\beta$ is sparse, or, more generally, when there
is a sparse subset of covariates providing a close approximation to the unknown
mean vector, we ask whether or not it is possible to accurately estimate
$X\beta$ using a computationally tractable algorithm.
This paper proves fixed domain asymptotic results for estimating a smooth
invertible transformation $f:\Bbb{R}^2\to\Bbb{R}^2$ when observing the deformed
random field $Z\circ f$ on a dense grid in a bounded, simply connected domain
$\Omega$, where $Z$ is assumed to be an isotropic Gaussian random field on
$\Bbb{R}^2$. The estimate $\hat{f}$ is constructed on a simply connected domain
$U$, such that $\overline{U}\subset\Omega$ and is defined using kernel smoothed
quadratic variations, Bergman projections and results from quasiconformal
theory.
We consider the problem of estimating a density $f_X$ using a sample
$Y_1,...,Y_n$ from $f_Y=f_X\star f_{\epsilon}$, where $f_{\epsilon}$ is an
unknown density. We assume that an additional sample
$\epsilon_1,...,\epsilon_m$ from $f_{\epsilon}$ is observed. Estimators of
$f_X$ and its derivatives are constructed by using nonparametric estimators of
$f_Y$ and $f_{\epsilon}$ and by applying a spectral cut-off in the Fourier
domain.
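As a generic illustration of a spectral cut-off estimator (a sketch, not necessarily the exact construction of the paper), write $\hat{\varphi}_Y$ and $\hat{\varphi}_{\epsilon}$ for the empirical characteristic functions of the two samples and $\omega>0$ for a cut-off frequency; these symbols are assumed here. Since $f_Y=f_X\star f_{\epsilon}$ implies $\varphi_Y=\varphi_X\varphi_{\epsilon}$, one may invert the ratio on a bounded frequency band:

\[
\hat{\varphi}_Y(t)=\frac{1}{n}\sum_{j=1}^n e^{itY_j},\qquad
\hat{\varphi}_{\epsilon}(t)=\frac{1}{m}\sum_{j=1}^m e^{it\epsilon_j},\qquad
\hat{f}_X(x)=\frac{1}{2\pi}\int_{-\omega}^{\omega}e^{-itx}\,
\frac{\hat{\varphi}_Y(t)}{\hat{\varphi}_{\epsilon}(t)}\,dt .
\]

In practice the ratio is only used at frequencies where $|\hat{\varphi}_{\epsilon}(t)|$ is not too small, since division by a noisy near-zero estimate is unstable.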
Given a sample from a discretely observed L\'evy process $X=(X_t)_{t\geq 0}$
with finite jump activity, we study the problem of nonparametric estimation
of the L\'evy density $\rho$ corresponding to the process $X.$ Our estimator of
$\rho$ is based on a suitable inversion of the L\'evy-Khintchine formula and a
plug-in device. The main result of the paper deals with an upper bound on the
mean square error of the estimator of $\rho$ at a fixed point $x.$ We also show
that the estimator attains the minimax convergence rate over a suitable class
of L\'evy densities.
We consider a class of doubly weighted rank-based estimating methods for the
transformation (or accelerated failure time) model with missing data as arise,
for example, in case-cohort studies. The weights considered may not be
predictable as required in a martingale stochastic process formulation. We
treat the general problem as a semiparametric estimating equation problem and
provide proofs of asymptotic properties for the weighted estimators, with
either true weights or estimated weights, by using empirical process theory
where martingale theory may fail.
We define a generalized index of jump activity, propose estimators of that
index for a discretely sampled process and derive the estimators' properties.
These estimators are applicable despite the presence of Brownian volatility in
the process, which makes it more challenging to infer the characteristics of
the small, infinite activity jumps. When the method is applied to high
frequency stock returns, we find evidence of infinitely active jumps in the
data and estimate their index of activity.
We consider regression models with parametric (linear or nonlinear)
regression function and allow responses to be ``missing at random.'' We assume
that the errors have mean zero and are independent of the covariates. In order
to estimate expectations of functions of covariate and response we use a fully
imputed estimator, namely an empirical estimator based on estimators of
conditional expectations given the covariate.
The problem we concentrate on is as follows: given (1) a convex compact set
$X$ in ${\mathbb{R}}^n$, an affine mapping $x\mapsto A(x)$, a parametric family
$\{p_{\mu}(\cdot)\}$ of probability densities and (2) $N$ i.i.d. observations
of the random variable $\omega$, distributed with the density $p_{A(x)}(\cdot)$
for some (unknown) $x\in X$, estimate the value $g^Tx$ of a given linear form
at $x$.
This paper explores the following question: what kind of statistical
guarantees can be given when doing variable selection in high-dimensional
models? In particular, we look at the error rates and power of some multi-stage
regression methods. In the first stage we fit a set of candidate models. In the
second stage we select one model by cross-validation. In the third stage we use
hypothesis testing to eliminate some variables.
Regression models to relate a scalar $Y$ to a functional predictor $X(t)$ are
becoming increasingly common. Work in this area has concentrated on estimating
a coefficient function, $\beta(t)$, with $Y$ related to $X(t)$ through
$\int\beta(t)X(t) dt$. Regions where $\beta(t)\ne0$ correspond to places where
there is a relationship between $X(t)$ and $Y$. Alternatively, points where
$\beta(t)=0$ indicate no relationship.
In this paper we deal with the regression problem in a random design setting.
We investigate asymptotic optimality from the minimax point of view of various
Bayesian rules based on warped wavelets and show that they nearly attain
optimal minimax rates of convergence over the Besov smoothness class
considered. Warped wavelets were introduced recently; they are easy to compute
and to implement while being well adapted to the
statistical problem at hand.
We derive sharp performance bounds for least squares regression with $L_1$
regularization from parameter estimation accuracy and feature selection quality
perspectives. The main result proved for $L_1$ regularization extends a similar
result in [Ann. Statist. 35 (2007) 2313--2351] for the Dantzig selector. It
gives an affirmative answer to an open question in [Ann. Statist. 35 (2007)
2358--2364]. Moreover, the result leads to an extended view of feature
selection that allows less restrictive conditions than some recent work.
We design a particle interpretation of Feynman-Kac measures on path spaces
based on a backward Markovian representation combined with a traditional mean
field particle interpretation of the flow of their final time marginals. In
contrast to traditional genealogical tree based models, these new particle
algorithms can be used to compute normalized additive functionals "on-the-fly"
as well as their limiting occupation measures with a given precision degree
that does not depend on the final time horizon.
Let M be a smooth compact oriented manifold without boundary, embedded in a
Euclidean space E and let f be a smooth map of M into a Riemannian manifold N.
An unknown state x in M is observed via X=x+su where s>0 is a small parameter
and u is a white Gaussian noise. For a given smooth prior on M and smooth
estimators g of the map f we have derived a second-order asymptotic expansion
for the related Bayesian risk (see arXiv:0705.2540). In this paper, we apply
this technique to a variety of examples.
We provide a new algorithm for the treatment of the noisy inversion of the
Radon transform using an appropriate thresholding technique adapted to a well
chosen new localized basis. We establish minimax results and prove their
optimality. In particular we prove that the procedures provided here are able
to attain minimax bounds for any $\mathbb{L}_p$ loss. It is important to notice that
most of the minimax bounds obtained here are new to our knowledge.