Bayesian Inference Via Partitioning Under Differential Privacy
In this thesis, I develop differentially private methods for reporting posterior probabilities and posterior quantiles of linear regression coefficients. Both methods randomly partition the data, compute an intermediate outcome within each partition, aggregate the intermediate outcomes into an approximation of the statistic of interest, and add Laplace noise to guarantee differential privacy. For posterior probabilities, I assume that the variance of the posterior distribution given the data in a single partition is proportional to the variance of the posterior distribution given the full dataset. The posterior probability computed within each partition serves as the intermediate outcome; these probabilities are averaged, with the posterior variance rescaled so that the average approximates the posterior probability given the full dataset, and Laplace noise is added so that the released quantity satisfies differential privacy. For posterior quantiles, I fit the Bayesian model to the data within each partition with the likelihood inflated to rescale the posterior variance so that it approximates the posterior variance given the full dataset. The posterior quantile computed within each partition serves as the intermediate outcome, and the quantiles are averaged to approximate the posterior quantile given the full dataset, again with Laplace noise added to ensure differential privacy. Simulations show that both the partitioning methods and the noise mechanism return accurate estimates of the statistics they perturb.
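The partition-aggregate-perturb recipe described above can be sketched as a subsample-and-aggregate mechanism. This is a minimal illustration, not the thesis's implementation: the function names are hypothetical, and the usage example swaps the linear regression model for a toy Gaussian mean model with a flat prior, where the partition posterior variance is rescaled from sigma^2/n_j to sigma^2/n to approximate the full-data posterior.

```python
from math import erf, sqrt

import numpy as np

def dp_partition_average(data, statistic, epsilon, m, lower, upper, rng):
    """Randomly partition `data` into m pieces, compute a bounded intermediate
    outcome on each piece, average the outcomes, and add Laplace noise.

    A single record lies in exactly one partition, so changing one record moves
    the average by at most (upper - lower) / m; Laplace noise with scale
    sensitivity / epsilon then gives epsilon-differential privacy.
    """
    idx = rng.permutation(len(data))
    parts = np.array_split(idx, m)
    outcomes = np.clip([statistic(data[p]) for p in parts], lower, upper)
    sensitivity = (upper - lower) / m
    return float(np.mean(outcomes) + rng.laplace(scale=sensitivity / epsilon))

# --- toy usage: posterior probability that a Gaussian mean exceeds zero ---
rng = np.random.default_rng(0)
data = rng.normal(loc=0.3, scale=1.0, size=2000)
m, sigma, n = 20, 1.0, len(data)

def post_prob_positive(x):
    # Flat-prior posterior for the mean of N(mu, sigma^2) data is
    # N(xbar, sigma^2 / len(x)); the variance is rescaled to sigma^2 / n
    # (the full-data posterior variance), mirroring the rescaling step.
    z = x.mean() / (sigma / sqrt(n))
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

est = dp_partition_average(data, post_prob_positive, epsilon=1.0,
                           m=m, lower=0.0, upper=1.0, rng=rng)
```

Because the intermediate outcomes are probabilities bounded in [0, 1], the sensitivity of the average is simply 1/m, which is what makes the Laplace calibration straightforward; the released `est` may fall slightly outside [0, 1] after noise and would typically be clipped before release.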