Bayesian shrinkage

Authors: 
Anirban Bhattacharya, Debdeep Pati, Natesh S. Pillai, David B. Dunson
Duke University, Florida State University, Harvard University, Duke University

Feb 2 2013

Penalized regression methods, such as L1 regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimensions. This has motivated an amazing variety of continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians, facilitating computation. In sharp contrast to the corresponding frequentist literature, very little is known about the properties of such priors. Focusing on a broad class of shrinkage priors, we provide precise results on prior and posterior concentration. Interestingly, we demonstrate that most commonly used shrinkage priors, including the Bayesian Lasso, are suboptimal in high-dimensional settings. A new class of Dirichlet-Laplace (DL) priors is proposed; these priors possess optimal concentration and lead to efficient posterior computation exploiting results from normalized random measure theory. Finite sample performance of Dirichlet-Laplace priors relative to alternatives is assessed in simulations.

Keywords: 

Bayesian, high dimensions, posterior convergence, shrinkage prior, sub-optimality

Manuscript: 

2012-16.pdf