A Note on the Bias in Estimating Posterior Probabilities in Variable Selection

Merlise Clyde, Joyee Ghosh
Duke University, University of Iowa

Jan 16 2012

Monte Carlo algorithms are commonly used to identify a set of models
for Bayesian model selection or model averaging. Because empirical
frequencies of models are often zero or one in high-dimensional
problems, posterior probabilities calculated from the observed
marginal likelihoods, renormalized over the sampled models, are
often employed. Such estimates are the only recourse in several
newer stochastic search algorithms. In this paper, we prove that
renormalization of posterior probabilities over the set of sampled
models generally leads to bias, which may dominate mean squared error.
Viewing the model space as a finite population, we propose a new
estimator based on a ratio of Horvitz-Thompson estimators, which
incorporates observed marginal likelihoods but is approximately
unbiased. This is shown to lead to a reduction in mean squared
error compared to the empirical or renormalized estimators, with
little increase in computational cost.
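
The contrast between the renormalized estimator and a ratio of Horvitz-Thompson estimators can be illustrated on a toy problem. The following is only a minimal sketch under stated assumptions, not the procedure of the paper: the model space (all subsets of 3 variables), the unnormalized masses, the sample size T, and the function names are all hypothetical. Models are drawn independently with replacement from the true posterior, so the probability that a model appears in the sample is 1 - (1 - p)^T. Those inclusion probabilities are computed here from the known posterior, which would not be available in practice.

```python
import itertools
import random


def renormalized(sampled, mass, j):
    """Inclusion probability of variable j, renormalizing masses over sampled models."""
    total = sum(mass[g] for g in sampled)
    return sum(mass[g] for g in sampled if g[j]) / total


def ht_ratio(sampled, mass, pi, j):
    """Ratio of two Horvitz-Thompson totals: each sampled model is weighted by 1/pi."""
    num = sum(mass[g] / pi[g] for g in sampled if g[j])
    den = sum(mass[g] / pi[g] for g in sampled)
    return num / den


# Hypothetical finite population: every subset of 3 candidate variables.
models = list(itertools.product([0, 1], repeat=3))
# Made-up unnormalized posterior masses (marginal likelihood times prior).
mass = {g: 1.0 + 2.0 * g[0] + 0.5 * g[1] * g[2] for g in models}
total = sum(mass.values())
post = {g: m / total for g, m in mass.items()}
truth = sum(post[g] for g in models if g[0])  # true inclusion probability, variable 0

T = 5  # number of i.i.d. draws per replicate (assumption of this sketch)
# Probability that a model appears at least once among T draws.
pi = {g: 1.0 - (1.0 - post[g]) ** T for g in models}

rng = random.Random(0)
reps = 2000
weights = [post[g] for g in models]
rn_vals, ht_vals = [], []
for _ in range(reps):
    sampled = set(rng.choices(models, weights=weights, k=T))  # unique sampled models
    rn_vals.append(renormalized(sampled, mass, 0))
    ht_vals.append(ht_ratio(sampled, mass, pi, 0))

print("true inclusion probability:", truth)
print("mean renormalized estimate:", sum(rn_vals) / reps)
print("mean HT ratio estimate:    ", sum(ht_vals) / reps)
```

Comparing the two printed averages with the true value gives a rough sense of the bias of each estimator in this toy setting. When every model is sampled and all inclusion probabilities are equal, the two estimators coincide.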


Bayesian model averaging, Horvitz-Thompson estimator, Inclusion probability, Markov chain Monte Carlo, Median probability model, Model uncertainty, Variable selection


2010-11.pdf

BibTeX Citation: 

  author = 	 {Merlise A Clyde and Joyee Ghosh},
  title = 	 {Finite Population Estimators in Stochastic Search Variable Selection},
  journal = 	 {Biometrika},
  year = 	 {to appear},