Text Analysis in Python for Social Scientists: Prediction and Classification

Dirk Hovy

doi:10.1017/9781108960885

Series: Elements in Quantitative and Computational Methods for the Social Sciences

Published online by Cambridge University Press: 15 February 2022

Affiliation: Università Commerciale Luigi Bocconi, Milan

Summary

Text contains a wealth of information about a wide variety of sociocultural constructs. Automated prediction methods can infer these quantities; sentiment analysis is probably the best-known application. However, there is virtually no limit to the kinds of things we can predict from text: power, trust, and misogyny are all signaled in language. These algorithms easily scale to corpus sizes infeasible for manual analysis. Prediction algorithms have become steadily more powerful, especially with the advent of neural network methods. However, applying these techniques usually requires profound programming knowledge and machine learning expertise. As a result, many social scientists do not apply them. This Element provides the working social scientist with an overview of the most common methods for text classification, an intuition of their applicability, and Python code to execute them. It covers both the ethical foundations of such work and the emerging potential of neural network methods.

Keywords

text analysis, natural language processing, computational linguistics, classification, prediction

Type: Element

DOI: https://doi.org/10.1017/9781108960885

Online ISBN: 9781108960885

Publisher: Cambridge University Press

Print publication: 17 March 2022
