A Comparison of Multiple Imputation Methods for Categorical Data
This thesis evaluates the performance of several multiple imputation methods for categorical data: multiple imputation by chained equations (MICE) using generalized linear models, MICE using classification and regression trees, and nonparametric Bayesian multiple imputation based on the Dirichlet process mixture of products of multinomial distributions model. Each method is evaluated in repeated sampling studies using housing unit data from the 2012 American Community Survey. These data allow exploration of practical problems such as multicollinearity and high dimensionality. The thesis highlights the advantages and limitations of each method relative to the others and recommends which method to prefer, along with the conditions under which those recommendations hold.
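To make the chained-equations idea concrete, the following is a minimal sketch and not the implementation evaluated in the thesis. It assumes categorical records stored as lists with `None` marking a missing value, and it swaps in a simple frequency table as the conditional model where the thesis uses generalized linear models or classification and regression trees. The function names (`draw`, `chained_imputation`, `multiple_imputation`) and the toy housing records are hypothetical.

```python
import random
from collections import Counter, defaultdict


def draw(counter, rng):
    """Draw one category with probability proportional to its count."""
    cats = sorted(counter)
    return rng.choices(cats, weights=[counter[c] for c in cats], k=1)[0]


def chained_imputation(rows, n_iter=5, seed=0):
    """Return one completed copy of `rows` (None marks a missing value).

    Assumes every variable has at least one observed value.
    """
    rng = random.Random(seed)
    n_vars = len(rows[0])
    missing = [[v is None for v in row] for row in rows]
    data = [list(row) for row in rows]

    # Step 1: fill each gap with a draw from the observed marginal distribution.
    marginals = [
        Counter(r[j] for r in rows if r[j] is not None) for j in range(n_vars)
    ]
    for i, row in enumerate(data):
        for j in range(n_vars):
            if missing[i][j]:
                row[j] = draw(marginals[j], rng)

    # Step 2: cycle through the variables, re-imputing each one from its
    # conditional distribution given the current values of the others.
    for _ in range(n_iter):
        for j in range(n_vars):
            # Stand-in conditional model (an assumption of this sketch):
            # a frequency table keyed by the other variables, fitted only
            # on rows where variable j was actually observed.
            table = defaultdict(Counter)
            for i, row in enumerate(data):
                if not missing[i][j]:
                    table[tuple(row[:j] + row[j + 1:])][row[j]] += 1
            for i, row in enumerate(data):
                if missing[i][j]:
                    key = tuple(row[:j] + row[j + 1:])
                    # Fall back to the marginal if this predictor cell is empty.
                    row[j] = draw(table.get(key, marginals[j]), rng)
    return data


def multiple_imputation(rows, m=5, n_iter=5, seed=0):
    """Create m completed datasets, each from an independently seeded chain."""
    return [chained_imputation(rows, n_iter, seed + k) for k in range(m)]


# Toy records: (tenure, building type), with two missing entries.
rows = [
    ["own", "house"], ["rent", "apt"], ["own", None],
    [None, "apt"], ["rent", "apt"], ["own", "house"],
]
completed = multiple_imputation(rows, m=3)
```

Replacing the frequency table with a fitted regression or tree model for each variable would give the GLM and CART variants in outline. The frequency table is used here only because it keeps the sketch short and free of dependencies.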
I am a fourth-year PhD candidate working on developing statistical methodology for handling missing and faulty data, with particular emphasis on applications that intersect with the social sciences. I am especially motivated to develop methods that can be readily applied by statistical agencies and data analysts. My work focuses on creating a coherent imputation engine that can handle missing data and reporting error, leverage auxiliary information on marginal distributions, incorporate survey weights, and scale up to a large number of categorical variables.
Akande, O, Li, F, and Reiter, J. "An Empirical Comparison of Multiple Imputation Methods for Categorical Data." The American Statistician 71.2 (April 3, 2017): 162-170.
Akande, O, Reiter, J, and Barrientos, A. "Multiple Imputation of Missing Values in Household Data with Structural Zeros." Survey Methodology (accepted).
Akande, O, Barrientos, A, and Reiter, J. "Simultaneous Edit and Imputation for Household Data with Structural Zeros." Journal of Survey Statistics and Methodology (accepted).