Out of One, Many: Using Language Models to Simulate Human Samples

Published online by Cambridge University Press: 21 February 2023

Lisa P. Argyle*
Affiliation:
Department of Political Science, Brigham Young University, Provo, UT, USA. e-mail: lpargyle@byu.edu, ethan.busby@byu.edu, jgub@byu.edu
Ethan C. Busby
Affiliation:
Department of Political Science, Brigham Young University, Provo, UT, USA. e-mail: lpargyle@byu.edu, ethan.busby@byu.edu, jgub@byu.edu
Nancy Fulda
Affiliation:
Department of Computer Science, Brigham Young University, Provo, UT, USA. e-mail: nfulda@cs.byu.edu, christophermichaelrytting@gmail.com, wingated@cs.byu.edu
Joshua R. Gubler
Affiliation:
Department of Political Science, Brigham Young University, Provo, UT, USA. e-mail: lpargyle@byu.edu, ethan.busby@byu.edu, jgub@byu.edu
Christopher Rytting
Affiliation:
Department of Computer Science, Brigham Young University, Provo, UT, USA. e-mail: nfulda@cs.byu.edu, christophermichaelrytting@gmail.com, wingated@cs.byu.edu
David Wingate
Affiliation:
Department of Computer Science, Brigham Young University, Provo, UT, USA. e-mail: nfulda@cs.byu.edu, christophermichaelrytting@gmail.com, wingated@cs.byu.edu
* Corresponding author: Lisa P. Argyle

Abstract

We propose and explore the possibility that language models can be studied as effective proxies for specific human subpopulations in social science research. Practical and research applications of artificial intelligence tools have sometimes been limited by problematic biases (such as racism or sexism), which are often treated as uniform properties of the models. We show that the “algorithmic bias” within one such tool, the GPT-3 language model, is instead both fine-grained and demographically correlated, meaning that proper conditioning will cause it to accurately emulate response distributions from a wide variety of human subgroups. We term this property algorithmic fidelity and explore its extent in GPT-3. We create “silicon samples” by conditioning the model on thousands of sociodemographic backstories from real human participants in multiple large surveys conducted in the United States. We then compare the silicon and human samples to demonstrate that the information contained in GPT-3 goes far beyond surface similarity. It is nuanced, multifaceted, and reflects the complex interplay between ideas, attitudes, and sociocultural context that characterizes human attitudes. We suggest that language models with sufficient algorithmic fidelity thus constitute a novel and powerful tool to advance understanding of humans and society across a variety of disciplines.
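As a rough illustration of the workflow the abstract describes, the sketch below builds a first-person backstory prompt from one respondent's demographics and then compares the response distributions of a human sample and a silicon sample. The field names, the prompt wording, and the use of total variation distance are assumptions made for this illustration, not the authors' templates or metrics. The call to a language model is omitted, so the silicon responses are supplied by hand.

```python
from collections import Counter


def build_backstory(respondent: dict) -> str:
    """Turn one respondent's demographics into a first-person conditioning prompt.

    The fields and sentence templates are hypothetical; the prompt ends
    mid-sentence so that a language model would complete it with an answer.
    """
    return (
        f"Racially, I am {respondent['race']}. "
        f"I am {respondent['gender']}. "
        f"Ideologically, I am {respondent['ideology']}. "
        f"I am {respondent['age']} years old. "
        "In the last presidential election, I voted for"
    )


def response_shares(responses: list) -> dict:
    """Return the share of each distinct answer in a list of responses."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {answer: n / total for answer, n in counts.items()}


def total_variation(p: dict, q: dict) -> float:
    """Total variation distance between two discrete distributions (0 = identical)."""
    answers = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in answers)


if __name__ == "__main__":
    respondent = {"race": "white", "gender": "female",
                  "ideology": "moderate", "age": 45}
    print(build_backstory(respondent))

    # Made-up answers standing in for survey data and model completions.
    human = ["A", "A", "B", "B"]
    silicon = ["A", "B", "B", "B"]
    print(total_variation(response_shares(human), response_shares(silicon)))
```

A distance near zero would indicate that the silicon sample reproduces the human response distribution for that subgroup; repeating the comparison within demographic subgroups is one way to probe the fine-grained correspondence the abstract calls algorithmic fidelity.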

Type
Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of the Society for Political Methodology

Footnotes

Edited by Jeff Gill

Supplementary material

Argyle et al. Dataset (link)
Argyle et al. supplementary material (PDF, 1.2 MB)