Statistical ScientistBerry Consultants, Inc.June 2019 - Present
Bayesian Dynamic Modeling and Forecasting of Count Time Series
Abstract Problems of forecasting related time series of counts arise in a diverse array of applications such as consumer sales, epidemiology, ecology, law enforcement, and tourism. Characteristics of high-frequency count data including many zeros, high variation, extreme values, and varying means make the application of traditional time series methods inappropriate. In many settings, an additional challenge is producing on- line, multi-step forecasts for thousands of individual series in an efficient and flexible manner. This dissertation introduces novel classes of models to address efficiency, efficacy and scalability of dynamic models based on the concept of decouple/recouple applied to multiple series that are individually represented via novel univariate state- space models. The novel dynamic count mixture model involves dynamic generalized linear models for binary and conditionally Poisson time series, with dynamic random effects for overdispersion, and the use of dynamic covariates in both binary and non-zero components. New multivariate models then enable information sharing in contexts where data at a more highly aggregated level provide more incisive inference on shared patterns such as trends and seasonality. This novel decouple/recouple strategy incorporates cross-series linkages while insulating parallel estimation of univariate models. We extend these models to a general framework appropriate for settings in which count data arises through a compound process. The motivating application is in consumer sales contexts where variability in high-frequency sales data arises from the compounding effects of the number of transactions and the number of sales-per-transactions. This framework involves adapting the dynamic count mixture model to forecast transactions, coupled with a binary cascade concept using a sequence of Bayesian models to predict the number of units per transaction. The motivation behind the binary cascade is that the appropriate way to model rare events is through a sequence of conditional probabilities of increasingly rare outcomes. Several case studies in many-item, multi-step ahead supermarket sales forecasting demonstrate improved forecasting performance using the proposed models, with discussion of forecast accuracy metrics and the benefits of probabilistic forecast accuracy assessment.