Vice President for Research & EducationKuang-Chi Institute of Advanced TechnologyApr 2010-Present
Advances in Bayesian Modelling and Computation: Spatio-temporal Processes, Model Assessment and Adaptive MCMC
The modelling and analysis of complex stochastic systems with increasingly large data sets, state-spaces and parameters provides major stimulus to research in Bayesian nonparametric methods and Bayesian computation. This dissertation presents advances in both nonparametric modelling and statistical computation stimulated by challenging problems of analysis in complex spatio-temporal systems and core computational issues in model fitting and model assessment. The first part of the thesis, represented by chapters 2 to 4, concerns novel, nonparametric Bayesian mixture models for spatial point processes, with advances in modelling, computation and applications in biological contexts. Chapter 2 describes and develops models for spatial point processes in which the point outcomes are latent, where indirect observations related to the point outcomes are available, and in which the underlying spatial intensity functions are typically highly heterogenous. Spatial intensities of inhomogeneous Poisson processes are represented via flexible nonparametric Bayesian mixture models. Computational approaches are presented for this new class of spatial point process mixtures and extended to the context of unobserved point process outcomes. Two examples drawn from a central, motivating context, that of immunofluorescence histology analysis in biological studies generating high-resolution imaging data, demonstrate the modelling approach and computational methodology. Chapters 3 and 4 extend this framework to define a class of flexible Bayesian nonparametric models for inhomogeneous spatiotemporal point processes, adding dynamic models for underlying intensity patterns. Dependent Dirichlet process mixture models are introduced as core components of this new time-varying spatial model. Utilizing such nonparametric mixture models for the spatial process intensity functions allows the introduction of time variation via dynamic, state-space models for parameters characterizing the intensities. Bayesian inference and model-fitting is addressed via novel particle filtering ideas and methods. Illustrative simulation examples include studies in problems of extended target tracking and substantive data analysis in cell fluorescent microscopic imaging tracking problems. The second part of the thesis, consisting of chapters 5 and chapter 6, concerns advances in computational methods for some core and generic Bayesian inferential problems. Chapter 5 develops a novel approach to estimation of upper and lower bounds for marginal likelihoods in Bayesian modelling using refinements of existing variational methods. Traditional variational approaches only provide lower bound estimation; this new lower/upper bound analysis is able to provide accurate and tight bounds in many problems, so facilitates more reliable computation for Bayesian model comparison while also providing a way to assess adequacy of variational densities as approximations to exact, intractable posteriors. The advances also include demonstration of the significant improvements that may be achieved in marginal likelihood estimation by marginalizing some parameters in the model. A distinct contribution to Bayesian computation is covered in Chapter 6. This concerns a generic framework for designing adaptive MCMC algorithms, emphasizing the adaptive Metropolized independence sampler and an effective adaptation strategy using a family of mixture distribution proposals. This work is coupled with development of a novel adaptive approach to computation in nonparametric modelling with large data sets; here a sequential learning approach is defined that iteratively utilizes smaller data subsets. Under the general framework of importance sampling based marginal likelihood computation, the proposed adaptive Monte Carlo method and sequential learning approach can facilitate improved accuracy in marginal likelihood computation. The approaches are exemplified in studies of both synthetic data examples, and in a real data analysis arising in astro-statistics. Finally, chapter 7 summarizes the dissertation and discusses possible extensions of the specific modelling and computational innovations, as well as potential future work.