We study differences in teacher evaluations of student performance relative to those measured by test scores. While much literature is concerned with estimating various types of teacher biases, we show conceptually that there is no single ‘teacher bias’ effect. Even if teachers have no group bias, teacher evaluation differences by group may systematically deviate from test score differences if the distribution of test scores differs across groups. Commonly used approaches are not equivalent and can lead to different conclusions because they target different estimands. We demonstrate our findings using Monte Carlo simulations and, using two recent UK cohort surveys, show that these conceptual issues matter in practice when we evaluate whether teachers are likely to overestimate female performance in English. Finally, we use the methods to examine an issue of substantive importance: gender differences in teacher perceptions of comparative advantage in English relative to mathematics. Our findings suggest that it is unlikely that teacher misperceptions of comparative advantage by gender are an important cause of the gender gap in STEM.