The p-beauty contest is a broad class of games in which players try to guess the most popular strategy among the other players. In its classic form, players guess a fraction p of the mean of the numbers chosen by all players, a behavioral experiment designed to test iterative reasoning patterns among various groups of people. The previous literature shows that the opponents' level of sophistication is an important factor affecting the outcome of the game. Smarter decision makers choose strategies that are closer to the theoretical Nash equilibrium and converge to equilibrium faster in iterated contests with information revelation. We replicate a series of classic experiments by running virtual experiments in which large language models (LLMs) play against various groups of virtual players. Our results show that LLMs recognize the strategic context of the game and adapt as expected to changes in the game parameters. LLMs systematically behave in a more sophisticated way than the participants of the original experiments. Nevertheless, all LLMs fail to identify dominant strategies in a two-player game. Our results contribute to the discussion on how accurately artificial intelligence can model human economic agents.
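The game described above can be illustrated with a minimal Python sketch. It is not the experimental setup of the paper: the value p = 2/3, the level-0 mean of 50, the integer guesses in [0, 100], and the win/tie/loss payoffs of 1, 0.5, and 0 are assumptions chosen for illustration, and the function names are hypothetical. The sketch shows how iterated best responses shrink toward the Nash equilibrium of 0 when p < 1, and that in a two-player contest the guess 0 is never beaten.

```python
def level_k_guess(k, p=2/3, level0_mean=50.0):
    """Guess of a level-k player who best-responds to level-(k-1) players.

    Assumes level-0 players guess level0_mean on average, so each extra
    reasoning step multiplies the guess by p.
    """
    return level0_mean * p ** k


def two_player_payoff(mine, other, p=2/3):
    """Payoff in a two-player contest: 1.0 for a win, 0.5 for a tie, 0.0 for a loss.

    The winner is the player whose guess is closer to p times the mean.
    """
    target = p * (mine + other) / 2
    d_mine = abs(mine - target)
    d_other = abs(other - target)
    if d_mine < d_other:
        return 1.0
    if d_mine > d_other:
        return 0.0
    return 0.5


# Guesses of level-0 through level-4 players shrink geometrically toward 0.
guesses = [level_k_guess(k) for k in range(5)]

# With two players and p < 1, the target lies below the midpoint of the two
# guesses, so the lower guess wins. Guessing 0 therefore does at least as
# well as any other guess against every opponent choice.
zero_weakly_dominates = all(
    two_player_payoff(0, other) >= two_player_payoff(mine, other)
    for mine in range(101)
    for other in range(101)
)

print(guesses)
print(zero_weakly_dominates)
```

Under these assumptions, a deeper reasoner guesses lower in the many-player game, whereas in the two-player game no iteration is needed because the lower guess always wins.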