Director, Statistical Innovation Group, GSK Pharmaceutical R&D; Associate Fellow, Department of Statistics, University of Warwick, UK
Beta Stacy Survival Models and Bayesian Weibull Survival Trees
The present thesis illustrates three statistical frameworks for the Bayesian analysis of survival data in the presence of random right censoring. The thesis is divided into eight Chapters. Chapter 1 includes a general introduction describing the rationale behind the development of the thesis. Chapters 2 and 5 outline the relevant theory of the three survival models developed in Chapters 3, 4, 6 and 7. The latter Chapters include several numerical results illustrating the fit and predictions obtained by a Matlab implementation of the three survival models. Chapter 3 illustrates a semi-parametric proportional hazards survival model which employs the Beta Stacy process (Walker and Muliere, 1997) as the baseline cumulative distribution function. Fit for the model's hyper-parameters and predictions of future survival times are derived under a finite approximation of the baseline process. Numerical results are provided for a set of simulated data and for a set of acute leukemia remission times (Cox, 1972). The posterior estimates are found to be consistent with the previous literature. The sensitivity of the posterior inferences and predictions to the prior specification of the hyper-parameters is addressed within the analysis of the Cox leukemia data. Chapter 4 develops a non-parametric Beta Stacy regression model for grouped survival data. The hyper-measure of the random survival process includes two group-specific parameters and a set of common regression effects. The model represents a unique example of a dependent non-parametric statistical framework for the analysis of survival data. Inference is carried out by marginalising over a discrete time Beta Stacy process by means of the generalised Pólya urn scheme of Walker and Muliere, whereas predictions are derived under the finite dimensional approximation of the process introduced in Chapter 3. The Chapter includes the analyses of two sets of data, the first of which is simulated.
The results provided for this synthetic set of data suggest that the current implementation of the model yields estimates and predictions consistent with the simulation mechanism. The second dataset reports the survival times of a pool of breast cancer patients collected at the Koo Foundation Sun Yat-Sen Cancer Center (KFSYSCC) in Taiwan. Fit and predictions for this second dataset are derived under two alternative prior structures for the model's hyper-parameters in order to assess sensitivity to the prior specification. Chapter 6 concerns a Bayesian Weibull survival tree model in which both the form of the regression function, as represented by the tree structure, and the Weibull parameters characterising the sampling densities within the tree leaves are to be inferred from the data. A novel shrinkage prior mechanism is introduced along with an original Markov chain Monte Carlo technique, which is employed to draw samples from the marginal posterior distribution over the tree space. The proposal mechanism used within the stochastic exploration of the model space is based on parallel Markov chains of models. This provides a novel application of the simulated tempering framework in the context of model uncertainty assessment. Mixing in the tree space is further improved by incorporating the original notion of predictive equivalence in the proposal distribution. Chapter 7 illustrates extensively the results obtained by employing the model for the fitting and prediction of both simulated data and a set of liver cancer survival data. Chapter 8 concludes the thesis by summarising present achievements and future developments.
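
As a minimal illustration of the random right censoring setting referred to above, the following sketch evaluates the log-likelihood of a single Weibull model, such as the one attached to one leaf of a survival tree. It is not taken from the thesis, whose implementation is in Matlab: the function name, the shape/scale parameterisation and the choice of Python are assumptions made here for illustration. An observed event contributes the log density, while a censored time contributes only the log survival function.

```python
import math


def weibull_censored_loglik(times, events, shape, scale):
    """Log-likelihood of right-censored data under a Weibull(shape, scale) model.

    Hypothetical helper for illustration only.
    times  : positive observed times (event or censoring time)
    events : 1 if the event was observed, 0 if the time is right censored
    Assumed parameterisation: S(t) = exp(-(t / scale) ** shape).
    """
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    total = 0.0
    for t, d in zip(times, events):
        if t <= 0:
            raise ValueError("times must be positive")
        if d:
            # Log hazard term, added only for observed events:
            # log(shape) - shape * log(scale) + (shape - 1) * log(t)
            total += (
                math.log(shape)
                - shape * math.log(scale)
                + (shape - 1.0) * math.log(t)
            )
        # Log survival term, added for every observation: -(t / scale) ** shape
        total -= (t / scale) ** shape
    return total
```

With `shape = 1` the model reduces to the exponential distribution, which gives a convenient sanity check: for times `[1.0, 2.0]`, events `[1, 0]` and unit scale, the log-likelihood is minus the total time at risk.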