Professor of Statistical Science
My research areas include nonparametric methods, high-dimensional inference, scalable inference, Bayesian modeling, and applied statistics in biomedicine. I am particularly interested in tackling the statistical and computational challenges of analyzing massive data (especially large-n problems) using nonparametric methods.
Traditional nonparametric methods enjoy many well-established theoretical properties, but their computational cost typically scales polynomially with the sample size, rendering them intractable for big data. Moreover, applying such methods to data arising from complex sampling mechanisms often requires resampling (e.g., permutation or the bootstrap) for inference. This is computationally prohibitive for massive data sets with millions or more observations, since even a single run of the nonparametric method on such data can be demanding.
A recent focus of my methodological research is on constructing and understanding models and algorithms under a multi-scale divide-and-conquer strategy, which allows various inferential and predictive tasks to be carried out nonparametrically with computational cost that scales linearly in the sample size. Such methods provide a general framework for tackling the computational bottleneck while preserving many of the theoretical guarantees enjoyed by classical methods.

A current applied focus of my research is on developing statistical models and methods for the effective analysis of high-dimensional data generated by modern high-throughput biomedical experiments. Examples include microbiome sequencing data and flow cytometry data.
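To give a flavor of the divide-and-conquer idea mentioned above, here is a minimal toy sketch (not my actual methodology): the data are randomly split into shards, a nonparametric statistic is computed on each shard, and the shard estimates are combined. Each observation is handled a constant number of times, so the cost grows essentially linearly with the sample size rather than polynomially. The function name and the choice of the median as the statistic are illustrative assumptions.

```python
import random
import statistics

def shard_median(data, n_shards=10, seed=None):
    """Toy divide-and-conquer estimator (illustrative only).

    Randomly assign observations to shards, compute the statistic
    (here: the median) on each shard, and average the shard estimates.
    Each shard computation touches only n / n_shards observations,
    in contrast to resampling schemes that repeatedly reprocess the
    full data set.
    """
    rng = random.Random(seed)
    data = data[:]        # work on a copy
    rng.shuffle(data)     # random shard assignment
    shards = [data[i::n_shards] for i in range(n_shards)]
    return statistics.fmean(statistics.median(s) for s in shards)

# Example: estimate the center of a Gaussian sample shifted to 2.0.
gen = random.Random(0)
data = [gen.gauss(2.0, 1.0) for _ in range(50_000)]
est = shard_median(data, n_shards=50, seed=1)
```

In practice, combining shard estimates well (and quantifying the resulting uncertainty) is exactly where the statistical subtlety lies; a naive average is only the simplest possible combination rule.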