Nonparametric Bayesian Models for Supervised Dimension
Reduction and Regression
Abstract
We propose nonparametric Bayesian models for supervised dimension reduction and
regression problems. Supervised dimension reduction is a setting where one needs to reduce
the dimensionality of the predictors, or find the dimension reduction subspace, while
losing little or no predictive information. Our first method retrieves the dimension reduction
subspace in the inverse regression framework by utilizing a dependent Dirichlet process
that allows for natural clustering of the data in terms of both the response and predictor
variables. Our second method is based on ideas from the gradient learning framework and
retrieves the dimension reduction subspace through coherent nonparametric Bayesian kernel
models. We also discuss and provide a new rationalization of kernel regression based
on nonparametric Bayesian models allowing for direct and formal inference on the uncertain
regression functions. Our proposed models apply to high-dimensional cases where
the number of variables far exceeds the sample size, and hold for both the classical setting
of Euclidean subspaces and the Riemannian setting where the marginal distribution is
concentrated on a manifold. Our Bayesian perspective adds appropriate probabilistic and
statistical frameworks that allow for rich inference, such as uncertainty estimation, which is
important for assessing the reliability of the estimates. Formal probabilistic models with likelihoods and
priors are given and efficient posterior sampling can be obtained by Markov chain Monte
Carlo methodologies, particularly Gibbs sampling schemes. For supervised dimension
reduction, the posterior draws are linear subspaces, which are points on a Grassmann
manifold, so we carry out posterior inference with respect to geodesics on the Grassmannian.
The utility of our approaches is illustrated on simulated and real examples.
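
To make the notion of geodesics on the Grassmannian concrete, the following is a minimal sketch of the standard geodesic (arc-length) distance between two linear subspaces, computed from their principal angles. It is an illustration only and not the posterior inference procedure of the thesis. The function names, the use of NumPy, and the assumption that each input matrix has full column rank are choices made for this example.

```python
import numpy as np


def orthonormal_basis(A):
    """Return an orthonormal basis for the column span of A (reduced QR).

    Assumes A has full column rank (an assumption of this sketch).
    """
    Q, _ = np.linalg.qr(A)
    return Q


def grassmann_geodesic_distance(A, B):
    """Geodesic distance between span(A) and span(B) on the Grassmannian.

    A and B are p x d matrices whose columns span the two subspaces.
    The principal angles are the arccosines of the singular values of
    Qa^T Qb, and the distance is the Euclidean norm of those angles.
    """
    Qa = orthonormal_basis(A)
    Qb = orthonormal_basis(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Guard against round-off pushing values slightly outside [-1, 1].
    cosines = np.clip(cosines, -1.0, 1.0)
    angles = np.arccos(cosines)
    return float(np.sqrt(np.sum(angles ** 2)))


if __name__ == "__main__":
    # Two lines in the plane: the x-axis and the diagonal.
    A = np.array([[1.0], [0.0]])
    B = np.array([[1.0], [1.0]])
    print(grassmann_geodesic_distance(A, B))
```

The distance depends only on the subspaces and not on the bases chosen to represent them, which is why summaries of posterior draws of subspaces are naturally phrased in terms of such geodesic quantities rather than entrywise averages of basis matrices.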