Synthetic Individual Income Tax Data: Promises and Challenges

B-Tier
Journal: National Tax Journal
Year: 2022
Volume: 75
Issue: 4
Pages: 767 - 790

Authors (8)

Claire McKay Bowen (not in RePEc); Victoria L. Bryant (not in RePEc); Leonard Burman (not in RePEc); Surachai Khitatrakun (not in RePEc); Robert McClelland (Urban Institute); Livia Mucciolo (not in RePEc); Madeline Pickens (not in RePEc); Aaron R. Williams (not in RePEc)

Score contribution per author:

0.251 = (α / number of authors) × tier multiplier = (2.01 / 8) × 1.0 (B-tier)

α: calibrated so average coauthorship-adjusted count equals average raw count
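The per-author score above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the score is simply α divided evenly among coauthors and then scaled by the tier multiplier, as the formula on this page suggests. The function and parameter names are hypothetical and are not taken from the database itself.

```python
def score_per_author(alpha: float, n_authors: int, tier_multiplier: float) -> float:
    """Split alpha evenly among coauthors, then scale by the journal tier multiplier.

    All names here are illustrative assumptions, not the database's own API.
    """
    if n_authors < 1:
        raise ValueError("n_authors must be at least 1")
    return alpha / n_authors * tier_multiplier


# Values as listed on this page: alpha = 2.01, 8 authors, B-tier multiplier 1.0.
score = score_per_author(alpha=2.01, n_authors=8, tier_multiplier=1.0)
print(f"{score:.3f}")
```

The unrounded value is 2.01 / 8 = 0.25125, which the page reports to three decimal places.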

Abstract

Tax data are invaluable for research, but privacy concerns severely limit access. Although the US Internal Revenue Service produces a public-use file (PUF), improved technology and the proliferation of individual data have made it increasingly difficult to protect. Synthetic data are an alternative that reproduces the statistical properties of administrative data without revealing individual taxpayer information. This paper evaluates the quality and safety of the first fully synthetic PUF and demonstrates its performance in tax model microsimulations. The synthetic PUF could also be used to develop and debug statistical programs that could then be safely run on confidential data via a validation server.

Technical Details

RePEc Handle: repec:ucp:nattax:doi:10.1086/722094
Journal Field: Public
Author Count: 8
Added to Database: 2026-01-26