Brian S. St. Thomas
Data Scientist, Spotify, Jun 2017-Present
Linear Subspace and Manifold Learning Via Extrinsic Geometry
In the last few decades, data analysis techniques have had to expand to handle large data sets with complicated structure. This includes identifying low-dimensional structure in high-dimensional data, analyzing shape and image data, and learning from or classifying large corpora of text documents. Common Bayesian and machine learning techniques rely on the unique geometry of these data types; however, departing from Euclidean geometry can result in both theoretical and practical complications. Bayesian nonparametric approaches can be particularly challenging in these areas. This dissertation proposes a novel approach to these challenges by working with convenient embeddings of the manifold-valued parameters of interest, commonly making use of an extrinsic distance or measure on the manifold. Carefully selected extrinsic distances are shown to reduce computational cost and to increase the accuracy of inference. The embeddings are also used to yield straightforward derivations for nonparametric techniques. The methods developed are applied to subspace learning in dimension reduction problems, planar shapes, shape-constrained regression, and text analysis.