pldamixture - Post-Linkage Data Analysis Based on Mixture Modelling
Perform inference in the secondary analysis setting with
linked data potentially containing mismatch errors. In this
setting, only the linked data file may be accessible, and
information about the record linkage process may be limited or
unavailable.
Implements the 'General Framework for Regression with
Mismatched Data' developed by Slawski et al. (2023)
<doi:10.48550/arXiv.2306.00909>. The framework uses a mixture
model for pairs of linked records whose two components reflect
distributions conditional on match status, i.e., correct match
or mismatch. Inference is based on composite likelihood and the
Expectation-Maximization (EM) algorithm. The package currently
supports Cox Proportional Hazards Regression (right-censored
data only) and Generalized Linear Regression Models (Gaussian,
Gamma, Poisson, and binary Logistic). Information
about the underlying record linkage process can be incorporated
into the method if available (e.g., assumed overall mismatch
rate, safe matches, predictors of match status, or predicted
probabilities of correct matches).
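
The two-component mixture and EM iteration described above can be illustrated with a minimal sketch in plain Python. This is not the package's interface or implementation: the function names (simulate_linked_data, weighted_line_fit, em_mismatch_regression) are hypothetical, the model is restricted to Gaussian linear regression with a single covariate, and the mismatch component is assumed to be the marginal distribution of the response, approximated here by a fitted Gaussian. Linked pairs are treated as independent, and the mismatch rate is a single unknown constant.

```python
import math
import random


def normal_pdf(value, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (value - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def simulate_linked_data(n=2000, mismatch_rate=0.2, seed=1):
    """Simulate y = 1 + 2x + noise, then shuffle y within a random subset
    of records to mimic mismatched links (illustrative data only)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
    wrong = rng.sample(range(n), int(mismatch_rate * n))
    shuffled = wrong[:]
    rng.shuffle(shuffled)
    y_linked = y[:]
    for i, j in zip(wrong, shuffled):
        y_linked[i] = y[j]
    return x, y_linked


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1, sigma)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum(wi * (yi - b0 - b1 * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return b0, b1, math.sqrt(rss / total)


def em_mismatch_regression(x, y, alpha=0.1, n_iter=200):
    """EM for the mixture (1 - alpha) * f(y | x) + alpha * g(y)."""
    n = len(y)
    # Mismatch component g: marginal of y, approximated by a Gaussian
    # (an assumption of this sketch).
    y_mean = sum(y) / n
    y_sd = math.sqrt(sum((yi - y_mean) ** 2 for yi in y) / n)
    marginal = [normal_pdf(yi, y_mean, y_sd) for yi in y]
    # Start from the naive fit that ignores mismatches.
    b0, b1, sigma = weighted_line_fit(x, y, [1.0] * n)
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a correct match.
        p = []
        for xi, yi, gi in zip(x, y, marginal):
            correct = (1.0 - alpha) * normal_pdf(yi, b0 + b1 * xi, sigma)
            p.append(correct / (correct + alpha * gi))
        # M-step: weighted regression and updated mismatch rate.
        b0, b1, sigma = weighted_line_fit(x, y, p)
        alpha = 1.0 - sum(p) / n
    return {"intercept": b0, "slope": b1, "sigma": sigma,
            "mismatch_rate": alpha, "match_prob": p}


if __name__ == "__main__":
    x, y = simulate_linked_data()
    naive = weighted_line_fit(x, y, [1.0] * len(x))
    fit = em_mismatch_regression(x, y)
    print("naive slope:", round(naive[1], 3))
    print("EM slope:", round(fit["slope"], 3))
    print("estimated mismatch rate:", round(fit["mismatch_rate"], 3))
```

In this simulation the naive least-squares slope is pulled toward zero by the mismatched pairs, while the mixture fit down-weights pairs that look like mismatches. The per-pair weights in match_prob correspond loosely to the predicted probabilities of correct matches mentioned above. If an assumed overall mismatch rate were available, it could be held fixed in place of the alpha update.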