Mengyang (Michael) Gu
Assistant Research Professor, Department of Applied Mathematics and Statistics, Johns Hopkins University
Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output
Computer models have been widely used to reproduce the behavior of engineering, physical, biological and human processes. The rapid development of technology has empowered scientists and engineers to implement large-scale simulations via computer models and to collect real data from many kinds of sources, with the ultimate goal of predicting real processes through modeling. Achieving this goal typically requires extensive interaction between data and statistics, a process that has come to be called uncertainty quantification (UQ). This thesis develops statistical models that address two aspects of UQ: computational feasibility for huge functional data and robust parameter estimation in model fitting. To achieve these two goals, new techniques, theories and numerical procedures are developed and studied. Chapter 1 frames the issue of modeling data from computer experiments and introduces the Gaussian stochastic process (GaSP) emulator as a crucial step in UQ. Chapter 2 provides a practical approach for simultaneously emulating/approximating computer models that produce massive output, for instance output over a huge range of space-time coordinates (as is necessary for the discussed application of hazard quantification for the Soufrière Hills volcano on the island of Montserrat). Chapter 3 and Chapter 4 both address the parameter estimation problem for the GaSP emulator. Chapter 3 provides new criteria for parameter estimation, called ‘robustness parameter estimation criteria’. Properties of the reference prior are studied in a general setting, and a new robust estimation procedure is proposed based on estimation by marginal posterior modes with optimal parameterizations. Chapter 4 introduces a new class of priors, called jointly robust priors, for the GaSP model when inert inputs (inputs that barely affect the computer model output) are present. This prior has many of the good properties of the reference prior and is more computationally tractable.
Chapter 5 discusses another problem with big functional data, in which the number of observations in a function is large. An exact algorithm that computes the likelihood of the GaSP model in time linear in the number of observations is introduced, and regression, separable GaSP models and nonseparable GaSP models are discussed in a unified framework. Finally, Chapter 6 provides some concluding remarks and describes possible future work.