Bayesian Conditional Tensor Factorizations for High-Dimensional Classification

Authors: 
Yun Yang, David Dunson
Duke University

Jul 12 2012

In many application areas, data are collected on a categorical response and high-dimensional categorical predictors, with the goals of building a parsimonious model for classification and making inferences about the important predictors. In settings such as genomics, there can be complex interactions among the predictors. By using a carefully structured Tucker factorization, we define a model that can characterize any conditional probability while facilitating variable selection and the modeling of higher-order interactions. Following a Bayesian approach, we propose a Markov chain Monte Carlo algorithm for posterior computation that accommodates uncertainty in which predictors are included. Under near-sparsity assumptions, the posterior distribution for the conditional probability is shown to achieve close to the parametric rate of contraction even in ultra high-dimensional settings. The methods are illustrated using simulation examples and biomedical applications.
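
As a rough illustration of the kind of model the abstract describes, the sketch below evaluates a conditional probability under a Tucker-style factorization. The abstract does not spell out the exact form, so the following is an assumption: each predictor j is softly assigned to one of k_j latent classes, and a core tensor holds a response distribution for every combination of latent classes. The function name, argument names, and all numbers are hypothetical and chosen only for this toy example; this is not the authors' implementation, and no posterior computation is shown.

```python
from itertools import product


def conditional_prob(x, core, weights):
    """Return [P(y = c | x) for each class c] under an assumed Tucker-style form.

    x       : tuple of observed predictor levels, one per predictor
    weights : weights[j][x_j] is a probability vector over the k_j latent
              classes of predictor j (hypothetical parameterization)
    core    : dict mapping each latent-class tuple (h_1, ..., h_p) to a
              probability vector over the response classes
    """
    n_classes = len(next(iter(core.values())))
    probs = [0.0] * n_classes
    # Enumerate every combination of latent classes (h_1, ..., h_p).
    ranges = [range(len(weights[j][x[j]])) for j in range(len(x))]
    for h in product(*ranges):
        # Weight of this combination: product of per-predictor weights.
        w = 1.0
        for j, hj in enumerate(h):
            w *= weights[j][x[j]][hj]
        # Mix the core tensor's response distribution with that weight.
        for c in range(n_classes):
            probs[c] += w * core[h][c]
    return probs


# Toy setup (made-up numbers): two binary predictors, binary response.
# Predictor 0 has k_0 = 2 latent classes. Predictor 1 has k_1 = 1, so in
# this sketch it cannot influence y, which is one way a factorization of
# this kind can express variable selection.
weights = [
    {0: [0.9, 0.1], 1: [0.2, 0.8]},
    {0: [1.0], 1: [1.0]},
]
core = {
    (0, 0): [0.7, 0.3],
    (1, 0): [0.1, 0.9],
}

p_a = conditional_prob((0, 0), core, weights)
p_b = conditional_prob((0, 1), core, weights)
```

In this toy setup, `p_a` and `p_b` coincide because the second predictor has a single latent class, while changing the first predictor's level changes the result. The enumeration grows with the product of the k_j, so the sketch is only practical when few predictors have k_j greater than 1.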

Keywords: 

Classification, Convergence rate, Nonparametric Bayes, Tensor factorization, Ultra high-dimensional, Variable selection

Manuscript: 

2012-12.pdf