Score contribution per author:
α: calibrated so average coauthorship-adjusted count equals average raw count
Student-driven teaching evaluations are commonly used in faculty assessments but are known to exhibit gender and racial biases. Despite this, some argue that student comments provide meaningful feedback. Using over 8 million reviews from RateMyProfessors.com, we apply sentiment analysis to evaluate bias in student comments. We find that comments exhibit biases similar to those in numerical scores, although the effects are more muted. On average, numerical ratings are 2% lower for female instructors and 6% lower for minority instructors. Sentiment scores are 2% lower for female instructors and 3.6%–8.6% lower for minority instructors. These penalties increase with course difficulty but do not vary by institution type. In economics departments, numerical ratings are lower for female and minority instructors; these differences remain statistically significant when sentiment scores are considered. Importantly, nearly 30% of the variation in evaluation scores can be explained by student sentiment, highlighting the influence of subjective language.