Regression analysis of probability-linked data
Data obtained by probability linkage of administrative registers will typically include errors, because some linked records contain data items sourced from different individuals. If ignored, such errors can induce bias in standard statistical analyses. In this report we describe some approaches to eliminating this bias in linear regression analysis and, more generally, when inference is based on an estimating equation, with an emphasis on logistic regression. We present simulation results that illustrate the gains from allowing for linkage error in linear and logistic regression analysis, as well as extensions of the approach to situations where a sample is linked to a register and where the linked registers are of unequal size.
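To make the source of the bias concrete, the following is a minimal simulation sketch and not the report's own code. It assumes a simple exchangeable linkage error model in which each record is correctly linked with a known probability `lam` and is otherwise linked to a response from another record. The function names, the parameter values and the particular adjustment shown (regressing the linked response on the expected design matrix under that model) are illustrative assumptions; the adjustment is one commonly used correction and may differ from the estimators developed in the report.

```python
import numpy as np


def ols(design, response):
    """Ordinary least squares coefficients via a least-squares solve."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


def demo(n=20000, lam=0.8, beta=2.0, seed=0):
    """Return (naive slope, adjusted slope) for one simulated linked file.

    Assumptions (hypothetical, for illustration only): lam is known, and
    incorrectly linked records receive a response from another record.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + beta * x + rng.normal(scale=0.5, size=n)

    # Each record is mislinked with probability 1 - lam; mislinked records
    # have their responses shuffled among themselves.
    mislinked = np.flatnonzero(rng.random(n) >= lam)
    y_star = y.copy()
    y_star[mislinked] = y[rng.permutation(mislinked)]

    X = np.column_stack([np.ones(n), x])

    # Naive analysis: treat the linked response as if it were correct.
    naive = ols(X, y_star)

    # Adjusted analysis: under the exchangeable model E[y_star | X] = A X b,
    # with A = (lam - off) * I + off * 1 1^T and off = (1 - lam) / (n - 1).
    # A X is computed directly, without forming the n x n matrix A.
    off = (1.0 - lam) / (n - 1)
    AX = (lam - off) * X + off * X.sum(axis=0)
    adjusted = ols(AX, y_star)

    return naive[1], adjusted[1]


if __name__ == "__main__":
    naive_slope, adjusted_slope = demo()
    print("naive slope:   ", naive_slope)
    print("adjusted slope:", adjusted_slope)
```

In this setup the naive slope is attenuated towards roughly `lam * beta`, while the adjusted slope is close to the true `beta`. In practice `lam` would have to be estimated, for example from a clerically reviewed audit sample, which adds uncertainty that this sketch ignores.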