Breakthroughs in historical record linking using genealogy data: The Census Tree project

B-Tier
Journal: Explorations in Economic History
Year: 2025
Volume: 98
Issue: C

Authors (4)

Buckles, Kasey (University of Notre Dame); Haws, Adrian (not in RePEc); Price, Joseph (not in RePEc); Wilbert, Haley E.B. (not in RePEc)

Score contribution per author:

0.503 = (α = 2.01 / 4 authors) × 1.0 (B-tier multiplier)

α: calibrated so average coauthorship-adjusted count equals average raw count
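
The per-author score above is simple arithmetic, and a minimal sketch of it may help when checking other records. The function name `score_per_author` and its parameter names are hypothetical, chosen for illustration; only the values (α = 2.01, 4 authors, a 1.0 B-tier multiplier) come from this record. How α itself is calibrated is not shown here.

```python
def score_per_author(alpha: float, n_authors: int, tier_multiplier: float) -> float:
    """Split the calibrated weight alpha evenly across authors, then scale by the tier multiplier.

    Hypothetical helper: mirrors the formula shown in this record,
    (alpha / n_authors) * tier_multiplier.
    """
    if n_authors < 1:
        raise ValueError("n_authors must be at least 1")
    return (alpha / n_authors) * tier_multiplier


# Values taken from this record: alpha = 2.01, 4 authors, B-tier multiplier 1.0.
score = score_per_author(alpha=2.01, n_authors=4, tier_multiplier=1.0)
print(score)
```

With these inputs the unrounded value is 2.01 / 4 = 0.5025, which the record reports as 0.503. The displayed α is presumably itself rounded, so the third decimal may not reproduce exactly from the two-decimal figure.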

Abstract

The Census Tree is the largest-ever database of record links among the historical U.S. censuses, with over 700 million links for people living in the United States between 1850 and 1940. To create the Census Tree, we begin with a collection of high-quality links contributed by the users of a free online genealogy platform, many of which would be difficult or impossible to find using currently available linking technologies. We then use these links as training data for a machine learning algorithm to make new matches, and incorporate other recent efforts to link the historical U.S. censuses. Finally, we introduce a procedure for filtering the links and adjudicating disagreements. Our complete Census Tree achieves match rates across adjacent censuses that are between 69 and 86% for men and between 58 and 79% for women, a major breakthrough compared to previous linking efforts. The size of the Census Tree allows researchers in the social sciences and other disciplines to construct longitudinal datasets that are highly representative of the population. We validate the accuracy of these links and provide researchers with a simple tool for choosing their preferred tradeoff between sample size and accuracy. To demonstrate the advantages of the Census Tree, we extend the work of Abramitzky, Boustan, Jácome, and Pérez (2021) to include intergenerational mobility estimates for additional immigrant nationalities and for women.

Technical Details

RePEc Handle: repec:eee:exehis:v:98:y:2025:i:c:s0014498325000646
Journal Field: Economic History
Author Count: 4
Added to Database: 2026-01-25