MAW Barycenter for Gaussian Mixture Models with Application to Multi-source Single-cell Data Integration

Speaker(s): Lynn Lin, Assistant Professor of Biostatistics & Bioinformatics, Duke University

The Minimized Aggregated Wasserstein (MAW) distance for Gaussian mixture models (GMMs) has been used as a computationally efficient approximation to the Wasserstein metric. Recent theoretical advances on MAW have provided deep insight into its optimality. In this talk, we develop a new algorithm for computing the barycenter of GMMs under the MAW distance and prove that this barycenter has the same expectation as the Wasserstein barycenter. We also illustrate its practical use with examples of integrating single-cell data from different sources, and demonstrate that the new method achieves better clustering results on several single-cell RNA-seq data sets than some other popular methods.

Sponsor

Statistical Science

Contact

Statistical Science Seminar Coordinator