We consider probability measures supported on a finite discrete interval
$[0,n]$. We introduce a new finite difference operator $\nabla_n$, defined as a
linear combination of left and right finite differences. We show that this
operator $\nabla_n$ plays a key role in a new Poincar\'e (spectral gap)
inequality with respect to binomial weights, with the orthogonal Krawtchouk
polynomials acting as eigenfunctions of the relevant operator. We briefly
discuss the relationship of this operator to the problem of optimal transport
of probability measures.
Normal variance-mean mixtures encompass a large family of useful
distributions such as the generalized hyperbolic distribution, which itself
includes the Student t, Laplace, hyperbolic, normal inverse Gaussian, and
variance gamma distributions as special cases. We study shape properties of
normal variance-mean mixtures, in both the univariate and multivariate cases,
and determine conditions for unimodality and log-concavity of the density
functions. This leads to a short proof of the unimodality of all generalized
hyperbolic densities.
We study a bandit problem where observations from each arm have an
exponential family distribution and different arms are assigned independent
conjugate priors. At each of $n$ stages, one arm is to be selected based on past
observations. The goal is to find a strategy that maximizes the expected
discounted sum of the $n$ observations. Two structural results hold in broad
generality: (i) for a fixed prior weight, an arm becomes more desirable as its
prior mean increases; (ii) for a fixed prior mean, an arm becomes more
desirable as its prior weight decreases.
One of two independent stochastic processes (arms) is to be selected at each
of n stages. The selection is sequential and depends on past observations as
well as the prior information. Observations from arm i are independent given a
distribution P_i, and, following Clayton and Berry (1985), the P_i's have
independent Dirichlet process priors. The objective is to maximize the expected
future-discounted sum of the n observations. We study structural properties of
the bandit, in particular how the maximum expected payoff and the optimal
strategy vary with the Dirichlet process priors.
The relative log-concavity ordering $\leq_{\mathrm{lc}}$ between probability
mass functions (pmf's) on non-negative integers is studied. Given three pmf's
$f,g,h$ that satisfy $f\leq_{\mathrm{lc}}g\leq_{\mathrm{lc}}h$, we present a
pair of (reverse) triangle inequalities: if $\sum_i i f_i=\sum_i i g_i<\infty$,
then \[D(f|h)\geq D(f|g)+D(g|h)\] and if $\sum_i i g_i=\sum_i i h_i<\infty$, then
\[D(h|f)\geq D(h|g)+D(g|f),\] where $D(\cdot|\cdot)$ denotes the
Kullback--Leibler divergence.
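As a numerical illustration of the first inequality (a sketch, not taken from the paper), one can take $f$ = Binomial(5, 0.4), $g$ = Binomial(10, 0.2) and $h$ = Poisson(2). These particular pmf's are our own choice: all three have mean 2, and the ratios $f/g$ and $g/h$ are log-concave, so $f\leq_{\mathrm{lc}}g\leq_{\mathrm{lc}}h$ should hold. The helper names below are hypothetical.

```python
from math import comb, exp, factorial, log

def binomial_pmf(n, p, size):
    """Binomial(n, p) pmf listed on {0, ..., size-1}; zero beyond n."""
    return [comb(n, i) * p**i * (1 - p)**(n - i) if i <= n else 0.0
            for i in range(size)]

def poisson_pmf(lam, size):
    """Poisson(lam) pmf listed on {0, ..., size-1} (truncated, not renormalized)."""
    return [exp(-lam) * lam**i / factorial(i) for i in range(size)]

def kl(p, q):
    """D(p|q) in nats; terms with p_i = 0 contribute zero."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

SIZE = 11  # covers the supports {0..5} and {0..10} of the two binomials
f = binomial_pmf(5, 0.4, SIZE)    # assumed example, mean 2
g = binomial_pmf(10, 0.2, SIZE)   # assumed example, mean 2
h = poisson_pmf(2.0, SIZE)        # assumed example, mean 2

lhs = kl(f, h)
rhs = kl(f, g) + kl(g, h)
print(f"D(f|h) = {lhs:.6f}, D(f|g) + D(g|h) = {rhs:.6f}")
```

Only the first inequality is checked here, since $D(h|f)$ is infinite when $h$ has larger support than $f$. The truncation of the Poisson pmf is harmless because each divergence sums only over the support of its first argument.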
Comparison results are obtained for the inclusion probabilities in some
unequal probability sampling plans without replacement. For either successive
sampling or Hajek's rejective sampling, the larger the sample size, the more
uniform the inclusion probabilities in the sense of majorization. In
particular, the inclusion probabilities are more uniform than the drawing
probabilities. For the same sample size, and given the same set of drawing
probabilities, the inclusion probabilities are more uniform for rejective
sampling than for successive sampling.
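The comparison for successive sampling can be checked by brute force on a small population. The sketch below assumes that "more uniform" compares inclusion probabilities divided by the sample size (so that both vectors sum to one); that normalization, the drawing probabilities, and the function names are our assumptions, and the enumeration is only practical for very small populations.

```python
from itertools import permutations

def successive_inclusion(p, k):
    """Inclusion probabilities when k units are drawn one at a time without
    replacement, each draw proportional to p among the units still available."""
    incl = [0.0] * len(p)
    for order in permutations(range(len(p)), k):
        prob, remaining = 1.0, 1.0
        for i in order:
            prob *= p[i] / remaining   # chance of drawing unit i at this step
            remaining -= p[i]
        for i in order:
            incl[i] += prob            # every unit in the sample is "included"
    return incl

def majorizes(a, b, tol=1e-12):
    """True if a majorizes b: equal totals, and the partial sums of a sorted
    in decreasing order dominate those of b."""
    a, b = sorted(a, reverse=True), sorted(b, reverse=True)
    if abs(sum(a) - sum(b)) > tol:
        return False
    sa = sb = 0.0
    for x, y in zip(a, b):
        sa += x
        sb += y
        if sa < sb - tol:
            return False
    return True

p = [0.5, 0.3, 0.2]                    # assumed drawing probabilities
pi2 = successive_inclusion(p, 2)       # inclusion probabilities, sample size 2
scaled = [x / 2 for x in pi2]          # normalized to sum to one
print(pi2, majorizes(p, scaled))
```

With these inputs the drawing probabilities majorize the normalized inclusion probabilities, in line with the statement that the latter are more uniform.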
We use the minorization-maximization principle (Lange, Hunter and Yang 2000)
to establish the monotonicity of a multiplicative algorithm for computing
Bayesian D-optimal designs. This proves a conjecture of Dette, Pepelyshev and
Zhigljavsky (2008).
Improved EM strategies, based on the idea of efficient data augmentation
(Meng and van Dyk 1997, 1998), are presented for ML estimation of mixture
proportions. The resulting algorithms inherit the simplicity, ease of
implementation, and monotonic convergence properties of EM, but have
considerably improved speed. Conventional EM tends to be slow when the mixture
components overlap heavily, so reformulating the problem to reduce the overlap
improves the speed without sacrificing simplicity or stability.
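For reference, the conventional EM update for mixture proportions with known component densities can be sketched as follows. This is the baseline algorithm, not the improved data-augmentation schemes of the paper; the data, the two unit-variance normal components, and the function names are illustrative assumptions.

```python
from math import exp, log, pi, sqrt

def normal_pdf(x, mean, sd=1.0):
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2 * pi))

def loglik(weights, dens):
    """Log-likelihood of the mixture; dens[i][j] is component j's density at x_i."""
    return sum(log(sum(w * d for w, d in zip(weights, row))) for row in dens)

def em_step(weights, dens):
    """One EM update: each new weight is the average posterior membership."""
    n = len(dens)
    new = [0.0] * len(weights)
    for row in dens:
        mix = sum(w * d for w, d in zip(weights, row))
        for j, (w, d) in enumerate(zip(weights, row)):
            new[j] += w * d / mix / n
    return new

# Hypothetical data and two heavily overlapping components (means 0 and 1).
data = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.1, 1.8, 2.5]
means = [0.0, 1.0]
dens = [[normal_pdf(x, m) for m in means] for x in data]

weights = [0.5, 0.5]
history = [loglik(weights, dens)]
for _ in range(200):
    weights = em_step(weights, dens)
    history.append(loglik(weights, dens))
print(weights, history[-1])
```

The recorded log-likelihoods never decrease, which is the monotonicity property that the improved algorithms are said to inherit.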
We study a class of multiplicative algorithms introduced by Silvey et al.
(1978) for computing D-optimal designs. Strict monotonicity is established for
a variant considered by Titterington (1978). A formula for the rate of
convergence is also derived. This is used to explain why modifications
considered by Titterington (1978) and Dette et al. (2008) usually converge
faster.
One of the difficulties in calculating the capacity of certain Poisson
channels is that H(lambda), the entropy of the Poisson distribution with mean
lambda, is not available in a simple form. In this work we derive upper and
lower bounds for H(lambda) that are asymptotically tight and easy to compute.
The derivation of such bounds involves only simple probabilistic and analytic
tools. This complements the asymptotic expansions of Knessl (1998), Jacquet and
Szpankowski (1999), and Flajolet (1999).
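For comparison purposes, H(lambda) can be evaluated numerically by summing the series directly. The sketch below does that and sets the result against the Gaussian approximation (1/2) log(2 pi e lambda) and the classical bound (1/2) log(2 pi e (lambda + 1/12)) for integer-valued random variables. These are standard reference quantities, not the bounds derived in the paper, and the truncation rule and function names are our assumptions.

```python
from math import e, exp, lgamma, log, pi, sqrt

def poisson_entropy(lam, tail=12.0):
    """Entropy (nats) of Poisson(lam), summing -p_k log p_k for k = 0..hi,
    where hi lies well beyond the bulk of the distribution."""
    hi = int(lam + tail * sqrt(lam) + 30)
    total = 0.0
    for k in range(hi + 1):
        logp = -lam + k * log(lam) - lgamma(k + 1)  # log of the Poisson pmf
        total -= exp(logp) * logp
    return total

def gaussian_approx(lam):
    """Leading-order approximation 0.5 * log(2*pi*e*lam)."""
    return 0.5 * log(2 * pi * e * lam)

def discrete_upper_bound(lam):
    """Classical bound 0.5 * log(2*pi*e*(variance + 1/12))."""
    return 0.5 * log(2 * pi * e * (lam + 1.0 / 12.0))

for lam in (0.1, 1.0, 10.0, 50.0):
    print(lam, poisson_entropy(lam), gaussian_approx(lam), discrete_upper_bound(lam))
```

The gap between the exact value and the Gaussian approximation shrinks as lambda grows, which is the regime in which asymptotically tight bounds are most informative.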
Monotonic convergence is established for a general class of multiplicative
algorithms introduced by Silvey et al. (1978) for computing optimal designs. A
conjecture of Titterington (1978) is confirmed as a consequence. Optimal
designs for logistic regression are used as an illustration.
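A minimal sketch of the basic multiplicative update for D-optimality, $w_i \leftarrow w_i\, d_i(w)/p$ with $d_i(w)=x_i^\top M(w)^{-1}x_i$, is given below. It uses simple linear regression on a five-point grid rather than the logistic example, so that the $2\times 2$ information matrix can be handled in closed form; the grid, starting weights, and function names are assumptions made for illustration.

```python
def moments(w, ts):
    """First and second moments of the design; M(w) = [[1, m1], [m1, m2]]."""
    m1 = sum(wi * t for wi, t in zip(w, ts))
    m2 = sum(wi * t * t for wi, t in zip(w, ts))
    return m1, m2

def det_info(w, ts):
    """Determinant of the information matrix for the model E[y] = a + b*t."""
    m1, m2 = moments(w, ts)
    return m2 - m1 * m1

def multiplicative_step(w, ts):
    """w_i <- w_i * d_i / p with p = 2 and d_i = x_i' M^{-1} x_i, x_i = (1, t_i)."""
    m1, m2 = moments(w, ts)
    det = m2 - m1 * m1
    d = [(m2 - 2 * m1 * t + t * t) / det for t in ts]
    return [wi * di / 2 for wi, di in zip(w, d)]

ts = [-1.0, -0.5, 0.0, 0.5, 1.0]   # assumed candidate design points
w = [0.1, 0.3, 0.3, 0.2, 0.1]      # assumed starting weights
dets = [det_info(w, ts)]
for _ in range(500):
    w = multiplicative_step(w, ts)
    dets.append(det_info(w, ts))
print(w, dets[0], dets[-1])
```

Because $\sum_i w_i d_i(w) = p$, the weights stay normalized without an explicit rescaling step. In this example the determinant increases at every iteration and the mass drifts toward the two endpoints, although slowly, which is the kind of behavior that motivates faster variants.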
A "cocktail algorithm" is proposed for numerical computation of (approximate)
D-optimal designs. This new algorithm combines and extends the multiplicative
algorithm of Silvey et al. (1978) and the vertex exchange method (VEM) of
Bohning (1986), and shares their simplicity and monotonic convergence
properties. Numerical examples show that the cocktail algorithm can lead to
dramatically improved speed, sometimes by orders of magnitude, relative to
either VEM or the multiplicative algorithm.
We compare weighted sums of i.i.d. positive random variables according to the
usual stochastic order. The main inequalities are derived using majorization
techniques under certain log-concavity assumptions. Specifically, let Y_i be
i.i.d. random variables on R_+. Assuming that log Y_i has a log-concave
density, we show that sum a_i Y_i is stochastically smaller than sum b_i Y_i,
if (log a_1, ..., log a_n) is majorized by (log b_1, ..., log b_n).
We investigate stochastic comparisons between exponential family
distributions and their mixtures with respect to the usual stochastic order,
the hazard rate order, the reversed hazard rate order, and the likelihood ratio
order. A general theorem based on the notion of relative log-concavity is shown
to unify various specific results for the Poisson, binomial, negative binomial,
and gamma distributions in recent literature.
The bivariate distribution with exponential conditionals (BEC) was introduced
by Arnold and Strauss [Bivariate distributions with exponential conditionals,
J. Amer. Statist. Assoc. 83 (1988) 522--527]. This work presents a simple and
fast algorithm for simulating random variates from this density.
We present an entropy comparison result concerning weighted sums of
independent and identically distributed random variables.
An inequality concerning ratios of gamma functions is proved. This answers a
question of Guo and Qi (2003).
The Arimoto--Blahut algorithm for computing the capacity of a discrete
memoryless channel is revisited. A so-called ``squeezing'' strategy is used to
design algorithms that preserve its simplicity and monotonic convergence
properties, but have provably better rates of convergence.
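For orientation, the classical Arimoto--Blahut iteration can be sketched as follows. This is the standard algorithm that the paper takes as its starting point, not the ``squeezed'' variants; the binary symmetric channel, the starting input distribution, and the function names are illustrative assumptions.

```python
from math import exp, log

def output_dist(p, W):
    """Output distribution q(y) = sum_x p(x) W[x][y]."""
    return [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(len(W[0]))]

def mutual_information(p, W):
    """I(p, W) in nats, skipping zero-probability terms."""
    q = output_dist(p, W)
    return sum(p[x] * W[x][y] * log(W[x][y] / q[y])
               for x in range(len(p)) for y in range(len(q))
               if p[x] > 0 and W[x][y] > 0)

def arimoto_blahut_step(p, W):
    """p(x) <- p(x) c(x) / Z, where log c(x) = D(W[x] | q)."""
    q = output_dist(p, W)
    c = [exp(sum(W[x][y] * log(W[x][y] / q[y])
                 for y in range(len(q)) if W[x][y] > 0))
         for x in range(len(p))]
    z = sum(px * cx for px, cx in zip(p, c))
    return [px * cx / z for px, cx in zip(p, c)]

# Binary symmetric channel with crossover probability 0.1 (assumed example).
W = [[0.9, 0.1], [0.1, 0.9]]
p = [0.9, 0.1]                      # deliberately poor starting point
history = [mutual_information(p, W)]
for _ in range(200):
    p = arimoto_blahut_step(p, W)
    history.append(mutual_information(p, W))

capacity = log(2) + 0.9 * log(0.9) + 0.1 * log(0.1)  # closed form for this channel
print(p, history[-1], capacity)
```

The mutual information increases monotonically toward the closed-form capacity of the channel, and the input distribution approaches the uniform one.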
The data augmentation (DA) algorithm is a simple and powerful tool in
statistical computing. In this note basic information theory is used to prove a
nontrivial convergence theorem for the DA algorithm.
We consider the entropy of sums of independent discrete random variables, in
analogy with Shannon's Entropy Power Inequality, where equality holds for
normals. In our case, infinite divisibility suggests that equality should hold
for Poisson variables. We show that some natural analogues of the Entropy Power
Inequality do not in fact hold, but propose an alternative formulation which
does always hold. The key to many proofs of Shannon's Entropy Power Inequality
is the behaviour of entropy on scaling of continuous random variables.