Kai Mao

Graduation Year: 2009

Employment Info

Associate Vice President
Citigroup

Dissertation

Nonparametric Bayesian Models for Supervised Dimension Reduction and Regression

We propose nonparametric Bayesian models for supervised dimension reduction and regression problems. Supervised dimension reduction is the setting in which one seeks to reduce the dimensionality of the predictors, or to find the dimension reduction subspace, while losing little or no predictive information. Our first method retrieves the dimension reduction subspace in the inverse regression framework by using a dependent Dirichlet process that allows for natural clustering of the data in terms of both the response and the predictor variables. Our second method is based on ideas from the gradient learning framework and retrieves the dimension reduction subspace through coherent nonparametric Bayesian kernel models. We also discuss and provide a new rationalization of kernel regression based on nonparametric Bayesian models, allowing for direct and formal inference on the uncertain regression functions. Our proposed models apply to high-dimensional cases where the number of variables far exceeds the sample size, and hold for both the classical setting of Euclidean subspaces and the Riemannian setting where the marginal distribution is concentrated on a manifold. Our Bayesian perspective adds appropriate probabilistic and statistical frameworks that allow for rich inference, such as uncertainty estimation, which is important for assessing the reliability of the estimates. Formal probabilistic models with likelihoods and priors are given, and efficient posterior sampling can be obtained by Markov chain Monte Carlo methodologies, particularly Gibbs sampling schemes. For supervised dimension reduction, because the posterior draws are linear subspaces, which are points on a Grassmann manifold, we carry out posterior inference with respect to geodesics on the Grassmannian. The utility of our approaches is illustrated on simulated and real examples.
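The abstract does not spell out how geodesics on the Grassmannian are computed. As a rough illustration only, the sketch below uses the standard principal-angle formula for the geodesic distance between two subspaces; it is not taken from the dissertation. The function names (`orthonormal_basis`, `principal_angles`, `grassmann_distance`) are hypothetical, and the code assumes each subspace is given as a matrix whose columns are linearly independent and span it.

```python
import numpy as np


def orthonormal_basis(A):
    """Return an orthonormal basis for the column space of A.

    Assumes A has full column rank (an assumption of this sketch).
    """
    Q, _ = np.linalg.qr(A)  # reduced QR: Q has the same shape as A
    return Q


def principal_angles(A, B):
    """Principal angles (radians) between the column spaces of A and B."""
    Qa = orthonormal_basis(A)
    Qb = orthonormal_basis(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against round-off pushing values slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))


def grassmann_distance(A, B):
    """Geodesic (arc-length) distance: the 2-norm of the principal angles."""
    return float(np.linalg.norm(principal_angles(A, B)))


# Usage: two lines through the origin in R^2, 45 degrees apart.
line_x = np.array([[1.0], [0.0]])
line_diag = np.array([[1.0], [1.0]])
d = grassmann_distance(line_x, line_diag)  # close to pi / 4
```

Because the distance depends only on the spanned subspaces, rescaling or recombining the basis columns leaves it unchanged up to round-off. That invariance is why summaries of posterior draws of subspaces are naturally stated in terms of such distances and not in terms of the basis matrices themselves.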