Senior Data Scientist, Netflix - Science & Algorithms
Bayesian Modelling and Computation in Dynamic and Spatial Systems
Applied studies of spatial and dynamic systems increasingly challenge our modelling and computational abilities as data volumes grow and as spatial and temporal scales move to ever higher resolutions, with a parallel increase in the complexity of dependency patterns. Motivated by two challenging problems of this sort, the study of cellular dynamics in bacterial communication and the prediction of global carbon monoxide (CO) emissions from high-resolution global satellite imagery, this dissertation focuses on building sparse models and computational methods for data-dense dynamic, spatial and spatio-dynamic systems. The first part of the thesis develops a novel particle filtering algorithm for very long state-space models with sparse observations, which arise in studies of dynamic cellular networks. The need for larger sample sizes as dimension increases is met with parallel developments in informed resample-move strategies and distributed implementation. Fundamental innovations in the particle filtering literature are identified and used to design an efficient particle filter. The second part of the thesis focuses on sparse spatial modelling of high-resolution lattice data. Gaussian Markov random field models, defined through spatial autoregressions, are adopted for their computational properties. Their potential is demonstrated in an applied example in atmospheric chemistry, where the focus is on inverting satellite data combined with computer model predictions to infer ground-level CO emissions from multiple candidate sources on a global scale. Further, extending the framework of simultaneous autoregressive models, a novel hierarchical autoregressive model is developed for non-homogeneous spatial random fields. The final part of the thesis develops a novel space-time model for data on a rectangular lattice. The dynamic spatial factor model framework is extended with matrix normal spatial factor loadings.
A new class of Gaussian Markov random field models for random matrices, defined with low-dimensional row and column conditional independence graphs, is used to model sparse spatial factor loadings. Further dimensionality reduction is achieved through the dynamic factor model framework, which makes this class of models extremely attractive for systematically evolving non-homogeneous, high-resolution space-time data on rectangular lattices. Flexible choices for prior distributions and posterior computations are presented and illustrated with a synthetic data example.
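To make the matrix-variate Gaussian Markov random field idea concrete, the following is a minimal Python sketch and not the dissertation's own construction. It assumes a zero-mean matrix normal whose row and column precision matrices are tridiagonal (a first-order chain graph on each margin, chosen here only for illustration), and it relies on the standard fact that the precision of the column-stacked matrix is the Kronecker product of the column and row precisions. The function names `chain_precision`, `matrix_gmrf_precision` and `sample_matrix_gmrf`, and the value `rho=0.4`, are hypothetical.

```python
import numpy as np

def chain_precision(n, rho=0.4):
    """Tridiagonal precision for a first-order chain graph (illustrative choice).

    With |rho| < 0.5 the matrix is strictly diagonally dominant, hence
    positive definite.
    """
    Q = np.eye(n)
    idx = np.arange(n - 1)
    Q[idx, idx + 1] = -rho
    Q[idx + 1, idx] = -rho
    return Q

def matrix_gmrf_precision(Q_row, Q_col):
    """Precision of vec(X) (columns stacked) for a matrix normal X.

    Zeros in Q_row and Q_col carry over to the Kronecker product, so
    low-dimensional row and column graphs induce a sparse joint precision.
    """
    return np.kron(Q_col, Q_row)

def sample_matrix_gmrf(Q_row, Q_col, rng):
    """Draw X with row covariance inv(Q_row) and column covariance inv(Q_col)."""
    L_row = np.linalg.cholesky(Q_row)   # Q_row = L_row @ L_row.T
    L_col = np.linalg.cholesky(Q_col)   # Q_col = L_col @ L_col.T
    Z = rng.standard_normal((Q_row.shape[0], Q_col.shape[0]))
    X = np.linalg.solve(L_row.T, Z)         # inv(L_row.T) @ Z
    X = np.linalg.solve(L_col.T, X.T).T     # ... @ inv(L_col)
    return X

Q_row = chain_precision(4)
Q_col = chain_precision(3)
Q = matrix_gmrf_precision(Q_row, Q_col)
X = sample_matrix_gmrf(Q_row, Q_col, np.random.default_rng(0))
```

In this sketch the joint precision is never needed for sampling: only the two small Cholesky factors are computed, which is the kind of computational saving that sparse row and column graphs make possible on large rectangular lattices.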