This study constructs a novel dataset on the ethnicity of individuals in an ethnically diverse country in sub-Saharan Africa. We measure ethnicity using a machine learning algorithm that exploits variation in surnames across ethnolinguistic groups. We apply this approach to voter registration data from Uganda’s 2016 general election. The resulting data capture local variation in ethnicity over a wide geographic scale. We pair these data with election outcomes from polling stations throughout Uganda to estimate the relationship between ethnicity and voting behavior. Our regression analyses both control for location and include interactions between ethnic groups within the same polling station. Local variation in ethnicity is associated with voting behavior at the level of the polling station, and these relationships vary with the presence of other ethnic groups at the polling station. The results suggest the importance of studying ethnic voting using local variation in ethnicity at scale.