STA 701 Talk

Monday, September 16


Moderator: Marie Neubrander

Speakers: Yunran Chen and Rihui Ou

Yunran Chen

Shrinkage-based Covariance Matrix Estimation

Covariance matrix estimation is fundamental to multivariate statistical analysis. However, challenges arise in high-dimensional settings or when the sample size is limited, particularly when the number of variables p is comparable to or exceeds the sample size n. In such cases, conventional methods for estimating covariance matrices often become unreliable. In this paper, we propose three shrinkage-based estimators for improving covariance matrix estimation. The first approach assumes an unknown block structure in the covariance matrix, with variables within each block considered exchangeable. We employ a mixture of finite mixtures model to identify the blocks via the marginal likelihood, computed from a canonical representation of the block covariance matrix. To enhance estimation accuracy, we utilize a hierarchical prior to construct a shrinkage estimator, where both the shrinkage intensity and the target are learned adaptively. The second and third estimators are developed by modifying the sample eigenvalues in the eigendecomposition of the sample covariance matrix. We assume the sample eigenvalues follow a mixture model with an unknown mixing density. In one method, we replace the sample eigenvalues with their posterior means. The alternative approach constructs an optimal nonlinear shrinkage estimator using the Hilbert transform of the density of the sample eigenvalues. We provide a comparative analysis of the performance of these estimators across various scenarios.
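To make the eigenvalue-modification idea concrete, here is a minimal sketch (not the talk's estimators, which use posterior means and a Hilbert-transform-based rule): keep the sample eigenvectors but replace each sample eigenvalue with a linear shrinkage toward the grand mean. The function name and the shrinkage weight `alpha` are illustrative assumptions.

```python
import numpy as np

def eigenvalue_shrinkage(X, alpha=0.5):
    """Shrink sample eigenvalues toward their grand mean.

    A simplified stand-in for the eigenvalue-based shrinkage described
    above: retain the sample eigenvectors, but replace each sample
    eigenvalue l_i with (1 - alpha) * l_i + alpha * mean(l).
    """
    S = np.cov(X, rowvar=False)        # p x p sample covariance
    vals, vecs = np.linalg.eigh(S)     # eigendecomposition of S
    shrunk = (1 - alpha) * vals + alpha * vals.mean()
    return vecs @ np.diag(shrunk) @ vecs.T

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))      # n = 20 < p = 50: S is singular
Sigma_hat = eigenvalue_shrinkage(X)
print(Sigma_hat.shape)                 # (50, 50)
```

Note that when n < p the sample covariance is rank-deficient, but the shrunken eigenvalues are bounded away from zero, so the estimator is positive definite.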

Rihui Ou

Bayesian Inference on High-Dimensional Multivariate Binary Responses

It has become increasingly common to collect high-dimensional binary response data, for example with the emergence of new sampling techniques in ecology. In lower dimensions, multivariate probit (MVP) models are routinely used for inference. However, algorithms for fitting such models face issues in scaling up to high dimensions due to the intractability of the likelihood, which involves an integral over a multivariate normal distribution with no analytic form. Although a variety of algorithms have been proposed to approximate this intractable integral, these approaches are difficult to implement and/or inaccurate in high dimensions. Our main focus is on accommodating high-dimensional binary response data with a small-to-moderate number of covariates. We propose a two-stage approach for inference on model parameters that propagates uncertainty between the stages. We use the special structure of latent Gaussian models to reduce the highly expensive computation involved in joint parameter estimation, focusing inference on marginal distributions of model parameters. This makes the method embarrassingly parallel in both stages.
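The "embarrassingly parallel" first stage can be illustrated with a minimal sketch: each binary outcome gets its own marginal probit fit, independent of the others, so the loop over outcomes can be a parallel map. This is an assumption-laden toy (shared true coefficients, independent latent errors, and a plain maximum-likelihood fit), not the talk's Bayesian two-stage method; the second stage, which recovers the latent dependence and propagates uncertainty, is omitted.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_marginal_probit(X, y):
    """Stage 1 (illustrative): maximum-likelihood probit fit for one
    binary outcome. Each outcome's fit touches no other outcome's data,
    which is what makes this stage embarrassingly parallel."""
    def negloglik(beta):
        p = norm.cdf(X @ beta).clip(1e-10, 1 - 1e-10)
        return -(y * np.log(p) + (1 - y) * np.log(1 - p)).sum()
    return minimize(negloglik, np.zeros(X.shape[1]), method="BFGS").x

rng = np.random.default_rng(1)
n, q = 500, 8                          # n observations, q binary outcomes
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta_true = np.array([0.3, 1.0])
# Latent Gaussian responses thresholded at zero (independent errors here).
Y = (X @ beta_true)[:, None] + rng.standard_normal((n, q)) > 0

# This loop could be a parallel map over the q outcomes.
betas = np.array([fit_marginal_probit(X, Y[:, j]) for j in range(q)])
print(betas.shape)                     # (8, 2)
```

In the MVP setting the latent errors are correlated across outcomes, which is precisely what the marginal fits ignore and what a second stage must recover.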


Contact

Lori Rauch