Combining survey and census data for improved poverty prediction using semi-supervised deep learning

A-Tier

Journal: Journal of Development Economics

Year: 2025

Volume: 172

Issue: C

Authors (5)

Echevin, Damien; Fotso, Guy (not in RePEc); Bouroubi, Yacine (not in RePEc); Coulombe, Harold; Li, Qing (not in RePEc)

Score contribution per author:

0.807 = (α=2.02 / 5 authors) × 2.0x A-tier

α: calibrated so average coauthorship-adjusted count equals average raw count

Abstract

This paper presents a methodology for predicting poverty using semi-supervised learning techniques, specifically pseudo-labeling, and deep learning algorithms. Standard poverty prediction models rely on limited household survey data, whereas our approach exploits large amounts of unlabeled census data to improve prediction accuracy. By applying pseudo-labeling, we improve key performance metrics across various African regions, where our models outperform conventional approaches to identifying poor individuals. Deep neural networks (DNNs) trained on pseudo-labeled data exhibited area under the curve (AUC) scores ranging from 0.8 to over 0.9, a notable improvement over previous machine learning survey-based methods. Furthermore, random undersampling was key to refining model performance, balancing higher coverage with some reduction in precision. These findings have significant implications for poverty targeting, enabling more accurate identification of poor individuals and supporting better resource allocation.
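
The pseudo-labeling and random undersampling steps named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: a small logistic regression stands in for the deep neural network, the households are synthetic toy data with two made-up features, the 0.9 confidence threshold is a hypothetical choice, and undersampling is applied after pseudo-labeling, an order the abstract does not specify.

```python
import math
import random


def predict_proba(w, b, x):
    """Logistic probability that household x is poor (label 1)."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=300, lr=0.5):
    """Batch gradient descent on log loss (stand-in for the paper's DNN)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_w = [g + err * xij for g, xij in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def pseudo_label(X_lab, y_lab, X_unlab, threshold=0.9):
    """Label unlabeled rows the model is confident about and add them."""
    w, b = train_logistic(X_lab, y_lab)
    X_aug, y_aug = list(X_lab), list(y_lab)
    for x in X_unlab:
        p = predict_proba(w, b, x)
        if p >= threshold or p <= 1.0 - threshold:  # keep confident rows only
            X_aug.append(x)
            y_aug.append(1 if p >= 0.5 else 0)
    return X_aug, y_aug


def random_undersample(X, y, rng):
    """Drop majority-class rows at random until both classes are equal."""
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi == 0]
    k = min(len(pos), len(neg))
    keep = rng.sample(pos, k) + rng.sample(neg, k)
    return [X[i] for i in keep], [y[i] for i in keep]


rng = random.Random(0)


def make_household(poor):
    """Synthetic toy household: two features centered by poverty status."""
    center = -1.0 if poor else 1.0
    return [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]


# Small labeled "survey" (10 poor, 30 non-poor) and large unlabeled "census".
y_survey = [1] * 10 + [0] * 30
X_survey = [make_household(label == 1) for label in y_survey]
X_census = [make_household(rng.random() < 0.25) for _ in range(400)]

X_aug, y_aug = pseudo_label(X_survey, y_survey, X_census, threshold=0.9)
X_bal, y_bal = random_undersample(X_aug, y_aug, rng)
w, b = train_logistic(X_bal, y_bal)  # final model on the balanced, augmented set
print(len(X_survey), len(X_aug), len(X_bal))
```

Lowering the threshold admits more census rows at the cost of noisier pseudo-labels. Balancing the classes tends to raise the share of poor households the model flags while admitting more false positives, which matches the trade-off between coverage and precision that the abstract describes.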
