J Am Med Inform Assoc 15:654-660 doi:10.1197/jamia.M2265
  • Original Investigation

Ignoring Dependency between Linking Variables and Its Impact on the Outcome of Probabilistic Record Linkage Studies

Table 4

Impact on the Classification of Record Pairs Between the Naïve and Nonnaïve Strategy in Scenario 2 (Empirical Datasets) and Scenario 3 (Simulated Datasets) Both Using 5 Linking Variables

                                Scenario 2: Empirical Datasets   Scenario 3: Simulated Datasets
Dataset 1                       129,576                          40,000
Dataset 2                       116,390                          40,000
Number of pairs                 15,081,350,640                   1,600,000,000

                                Naïve           Nonnaïve         Naïve        Nonnaïve     Truth
Estimated prevalence            8.30E-05        4.37E-06         7.07E-05     4.37E-06     4.38E-06
Number of estimated matches     1,251,752       65,951           113,069      6,998        7,000
Number of links                 1,226,322       65,639           112,988      6,983        7,000
Number of false-positive links  NA              NA               106,009      51           0
Number of false-negative links  NA              NA               20           68           0
  • NA = not applicable.
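
Two rows of Table 4 follow from simple arithmetic on the other counts: the number of pairs is the product of the two dataset sizes, and the estimated prevalence is the number of estimated matches divided by the number of pairs. The minimal Python sketch below reproduces those figures from the table's values. The function and variable names are illustrative assumptions, not taken from the article, and the false-positive share of links is a derived quantity that the table itself does not report.

```python
# Values are copied from Table 4; all names here are illustrative only.

def number_of_pairs(n1: int, n2: int) -> int:
    """All possible record pairs between two files (full cross product)."""
    return n1 * n2


def prevalence(estimated_matches: int, pairs: int) -> float:
    """Share of all record pairs that are estimated to be matches."""
    return estimated_matches / pairs


def false_positive_share(false_positive_links: int, links: int) -> float:
    """Share of links that are false positives (derived, not in the table)."""
    return false_positive_links / links


pairs_s2 = number_of_pairs(129_576, 116_390)  # Scenario 2, empirical
pairs_s3 = number_of_pairs(40_000, 40_000)    # Scenario 3, simulated

print(pairs_s2)  # 15081350640
print(pairs_s3)  # 1600000000

# Estimated prevalence under the naive strategy in each scenario.
print(f"{prevalence(1_251_752, pairs_s2):.2E}")  # 8.30E-05
print(f"{prevalence(113_069, pairs_s3):.2E}")    # 7.07E-05

# Scenario 3 only: false-positive share of links for each strategy.
fp_naive = false_positive_share(106_009, 112_988)
fp_nonnaive = false_positive_share(51, 6_983)
print(f"naive: {fp_naive:.1%}, nonnaive: {fp_nonnaive:.1%}")
```

In Scenario 3 this gives a false-positive share of roughly 94% of links under the naïve strategy, against under 1% under the nonnaïve strategy.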
