Record Linkage

As costs of mounting new data collection efforts increase, many statistical agencies and data analysts are turning to integrating data from multiple sources. Faculty members in the department are developing methodology for record linkage and for data fusion. In record linkage, analysts attempt to merge records from two or more databases by matching on variables common to the files. Typically these variables contain errors or do not result in unique matches, opening the potential for linkage errors. Faculty in the department work on techniques for accounting for uncertainty due to inexact matching. In data fusion, analysts combine information from two or more databases with disjoint sets of individuals and some variables that do not overlap. Faculty members work on techniques for incorporating auxiliary information, such as data from online polls or large administrative data sources, to improve data fusion inferences.

Faculty in this Research Area