Essays on Propensity Score Methods for Causal Inference in Observational Studies
Abstract In this dissertation, I present three essays from three research projects, each involving a different use of propensity scores for drawing causal inferences in observational studies. Chapter 1 introduces the general idea of causal inference and the concepts of randomized experiments and observational studies. It also introduces the three projects and their contributions to the literature. Chapter 2 gives a critical review and an extensive discussion of several commonly used propensity score methods for data with a multilevel structure, including matching, weighting, stratification, and methods that combine these with regression. The use of these methods is illustrated with a data set on endoscopic vein-graft harvesting in coronary artery bypass graft (CABG) surgeries. We discuss important aspects of implementing these methods, such as model specification and standard error calculation. Based on the comparison, we provide general guidelines for using propensity score methods with multilevel data in practice. We also provide the relevant code as an R package, available on GitHub. In observational studies, subjects are not assigned to treatment at random as they are in randomized experiments, so an association between the treatment and the outcome may be due to an unmeasured variable that affects both. Chapter 3 focuses on sensitivity analysis to assess the robustness of the estimated quantity when the unconfoundedness assumption is violated. Two approaches to sensitivity analysis are presented; both extend previous work to accommodate a count outcome. The first method is based on the subclassification estimator and relies on maximum likelihood estimation. The second method is based on simulations and is more flexible in the choice of estimation method.
We illustrate both methods using a data set from a traffic safety study of the safety effectiveness (measured as the reduction in crash counts) of the combined application of center line rumble strips and shoulder rumble strips on two-lane rural roads in Pennsylvania. Chapter 4 proposes a method for estimating heterogeneous causal effects in observational studies by augmenting additive-interactive Gaussian process regression with the propensity scores, yielding a flexible yet robust way to predict the potential outcome surface, from which the conditional treatment effects can be calculated. We show that our method works well even in the presence of strong confounding and illustrate this by comparing it with commonly used methods in different settings using simulated data. Finally, Chapter 5 concludes the dissertation and discusses possible future work for each of the projects.