Links and Legibility: Making Sense of Historical U.S. Census Automated Linking Methods

A-Tier
Journal: Journal of Business & Economic Statistics
Year: 2024
Volume: 42
Issue: 2
Pages: 579-590

Score contribution per author:

1.341 = (α=2.01 / 3 authors) × 2.0 (A-tier multiplier)

α: calibrated so average coauthorship-adjusted count equals average raw count
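
The arithmetic behind the per-author score can be sketched as below. This is a minimal illustration, not the database's own code: the function name and parameter names are hypothetical, and the only inputs are the figures shown in this record (α = 2.01, 3 authors, a 2.0 multiplier for the A tier). With α as displayed, the result is 1.34. The gap from the listed 1.341 presumably reflects α being shown rounded to two decimals, which is an assumption.

```python
def score_per_author(alpha: float, n_authors: int, tier_multiplier: float) -> float:
    """Coauthorship-adjusted score credited to each author (hypothetical helper)."""
    if n_authors < 1:
        raise ValueError("n_authors must be at least 1")
    # Split the calibrated weight alpha across authors, then scale by journal tier.
    return alpha / n_authors * tier_multiplier


# Figures taken from this record; alpha is the rounded value as displayed.
score = score_per_author(alpha=2.01, n_authors=3, tier_multiplier=2.0)
print(round(score, 3))
```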

Abstract

How does handwriting legibility affect the performance of algorithms that link individuals across census rounds? We propose a measure of legibility, which we implement at scale for the 1940 U.S. Census, and find strikingly wide variation in enumeration-district-level legibility. Using boundary discontinuities in enumeration districts, we estimate the causal effect of low legibility on the quality of linked samples, measured by linkage rates and share of validated links. Our estimates imply that, across eight linking algorithms, perfect legibility would increase the linkage rate by 5–10 percentage points. Improvements in transcription could substantially increase the quality of linked samples.

Technical Details

RePEc Handle
repec:taf:jnlbes:v:42:y:2024:i:2:p:579-590
Journal Field
Econometrics
Author Count
3
Added to Database
2026-01-25