Differentially Private Verifications of Predictions from Synthetic Data
When data are confidential, one approach for releasing publicly available files is to make synthetic data, i.e., data simulated from statistical models estimated on the confidential data. Given access only to synthetic data, users cannot tell whether the synthetic data preserve the validity of their analyses. Thus, I present methods that help users make such assessments automatically while controlling disclosure risks for the confidential data. This thesis presents three verification methods: differentially private prediction tolerance intervals, differentially private prediction histograms, and a differentially private Kolmogorov-Smirnov test. I use simulations to illustrate these prediction verification methods.
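As a minimal sketch of the general idea, not the thesis's specific algorithms: a verification measure computed on the confidential data (here, hypothetically, the fraction of confidential records whose responses fall inside a tolerance interval built from synthetic data) can be released under epsilon-differential privacy by adding Laplace noise scaled to the measure's sensitivity. The function name, parameters, and data below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_coverage(inside_interval, epsilon):
    """Release a noisy coverage proportion for n binary indicators.

    Hypothetical illustration: changing one record changes the
    proportion by at most 1/n, so the global sensitivity is 1/n,
    and Laplace noise with scale (1/n)/epsilon gives
    epsilon-differential privacy.
    """
    n = len(inside_interval)
    true_coverage = np.mean(inside_interval)
    noise = rng.laplace(loc=0.0, scale=(1.0 / n) / epsilon)
    return true_coverage + noise

# Illustrative data: 1000 records, 90% of predictions fall inside
# the synthetic-data tolerance interval.
indicators = np.array([1] * 900 + [0] * 100)
noisy = dp_coverage(indicators, epsilon=1.0)
```

Because the sensitivity shrinks with the number of records, the released proportion is close to the true coverage for moderate sample sizes even at modest privacy budgets.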