References
Antoniak, M., & Mimno, D. (2018). Evaluating the stability of embedding-based word similarities. Transactions of the Association for Computational Linguistics, 6, 107–119.
Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3, 1137–1155.
Bhatia, S. (2017). Associative judgment and vector space semantics. Psychological Review, 124(1), 1.
Bianchi, F., Terragni, S., & Hovy, D. (2020). Pre-training is a hot topic: Contextualized document embeddings improve topic coherence. arXiv preprint arXiv:2004.03974.
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.
Blodgett, S. L., Green, L., & O'Connor, B. (2016). Demographic dialectal variation in social media: A case study of African-American English. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (pp. 1119–1130). Stroudsburg, PA: Association for Computational Linguistics.
Boyd-Graber, J., Mimno, D., & Newman, D. (2014). Care and feeding of topic models: Problems, diagnostics, and improvements. In E. M. Airoldi, D. Blei, E. A. Erosheva, & S. E. Fienberg (Eds.), Handbook of mixed membership models and their applications (pp. 225–254). Boca Raton, FL: CRC Press.
Chen, S. F., & Goodman, J. (1996). An empirical study of smoothing techniques for language modeling. Paper presented at the 34th annual meeting of the Association for Computational Linguistics. Retrieved from http://aclweb.org/anthology/P96-1041
Chollet, F. (2017). Deep learning with Python. Shelter Island, NY: Manning.
Crystal, D. (2003). The Cambridge encyclopedia of the English language (3rd ed.). Cambridge, England: Cambridge University Press.
Das, R., Zaheer, M., & Dyer, C. (2015). Gaussian LDA for topic models with word embeddings. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing: Vol. 1. Long papers (pp. 795–804). Stroudsburg, PA: Association for Computational Linguistics.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391–407.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1–22.
Denny, M. J., & Spirling, A. (2018). Text preprocessing for unsupervised learning: Why it matters, when it misleads, and what to do about it. Political Analysis, 26(2), 168–189.
Dieng, A. B., Ruiz, F. J., & Blei, D. M. (2019). Topic modeling in embedding spaces. arXiv preprint arXiv:1907.04907.
Eisenstein, J. (2019). Introduction to natural language processing. Cambridge, MA: MIT Press.
Evans, J. A., & Aceves, P. (2016). Machine translation: Mining text for social theory. Annual Review of Sociology, 42, 21–50.
Firth, J. R. (1957). A synopsis of linguistic theory, 1930–1955. In Studies in linguistic analysis (Vol. 1, pp. 1–32). Oxford, England: Basil Blackwell.
Fromkin, V., Rodman, R., & Hyams, N. (2018). An introduction to language. Boston, MA: Wadsworth Cengage Learning.
Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences, 115(16), E3635–E3644.
Gentzkow, M., Kelly, B. T., & Taddy, M. (2017). Text as data (technical report). Cambridge, MA: National Bureau of Economic Research.
Goldberg, Y. (2016). A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research, 57, 345–420.
Goldberg, Y. (2017). Neural network methods for natural language processing (G. Hirst, Ed.). Synthesis Lectures on Human Language Technologies, 10(1), 1–309. San Rafael, CA: Morgan & Claypool.
Goldberg, Y., & Levy, O. (2014). word2vec Explained: Deriving Mikolov et al.’s negative-sampling word-embedding method. arXiv preprint arXiv:1402.3722.
Grave, E., Bojanowski, P., Gupta, P., Joulin, A., & Mikolov, T. (2018). Learning word vectors for 157 languages. Paper presented at the International Conference on Language Resources and Evaluation (LREC 2018).
Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267–297.
Hamilton, W. L., Leskovec, J., & Jurafsky, D. (2016). Diachronic word embeddings reveal statistical laws of semantic change. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (pp. 1489–1501). Stroudsburg, PA: Association for Computational Linguistics.
Hartmann, J., Huppertz, J., Schamp, C., & Heitmann, M. (2018). Comparing automated text classification methods. International Journal of Research in Marketing, 36(1), 20–38.
Hovy, D. (2010). An evening with … EM (technical report). Los Angeles, CA: University of Southern California.
Hovy, D., & Purschke, C. (2018). Capturing regional variation with distributed place representations and geographic retrofitting. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 4383–4394). Stroudsburg, PA: Association for Computational Linguistics.
Humphreys, A., & Wang, R. J.-H. (2017). Automated text analysis for consumer research. Journal of Consumer Research, 44(6), 1274–1306.
Jagarlamudi, J., Daumé, H., III, & Udupa, R. (2012). Incorporating lexical priors into topic models. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (pp. 204–213). Stroudsburg, PA: Association for Computational Linguistics.
Jelinek, F., & Mercer, R. (1980). Interpolated estimation of Markov source parameters from sparse data. In Proceedings of the Workshop on Pattern Recognition in Practice (pp. 381–397). Amsterdam: North Holland Publishing Company.
Jurafsky, D. (2014). The language of food: A linguist reads the menu. New York: W. W. Norton.
Jurafsky, D., & Martin, J. H. (2014). Speech and language processing (3rd ed.). London: Pearson.
Katz, S. (1987). Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Transactions on Acoustics, Speech, and Signal Processing, 35(3), 400–401.
Kulkarni, V., Al-Rfou, R., Perozzi, B., & Skiena, S. (2015). Statistically significant detection of linguistic change. In Proceedings of the 24th International Conference on World Wide Web (pp. 625–635). New York, NY: Association for Computing Machinery.
Labov, W. (1972). Sociolinguistic patterns. Philadelphia, PA: University of Pennsylvania Press.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211–240.
Lang, S. (2012). Introduction to linear algebra. New York: Springer Science & Business Media.
Lau, J. H., & Baldwin, T. (2016). An empirical evaluation of doc2vec with practical insights into document embedding generation. In Proceedings of the 1st Workshop on Representation Learning for NLP (pp. 78–86). Stroudsburg, PA: Association for Computational Linguistics.
Le, Q., & Mikolov, T. (2014). Distributed representations of sentences and documents. In Proceedings of the 31st International Conference on Machine Learning (ICML-14) (pp. 1188–1196). New York, NY: Association for Computing Machinery.
Loper, E., & Bird, S. (2002). NLTK: The Natural Language Toolkit. Paper presented at the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics.
Maaten, L. v. d., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579–2605.
Manning, C. D., & Schütze, H. (1999). Foundations of statistical natural language processing. Cambridge, MA: MIT Press.
Marsland, S. (2015). Machine learning: An algorithmic perspective (2nd ed.). New York: Chapman and Hall/CRC.
McDonald, R., Nivre, J., Quirmbach-Brundage, Y., Goldberg, Y., Das, D., Ganchev, K., et al. (2013). Universal dependency annotation for multilingual parsing. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: Vol. 2. Short papers (pp. 92–97). Stroudsburg, PA: Association for Computational Linguistics.
Mikolov, T., Karafiát, M., Burget, L., Černocký, J., & Khudanpur, S. (2010). Recurrent neural network based language model. Paper presented at the 11th annual conference of the International Speech Communication Association.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (pp. 3111–3119). San Diego, CA: Neural Information Processing Systems Foundation.
Mimno, D., Wallach, H., Talley, E., Leenders, M., & McCallum, A. (2011). Optimizing semantic coherence in topic models. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (pp. 262–272). Stroudsburg, PA: Association for Computational Linguistics.
Murphy, K. P. (2012). Machine learning: A probabilistic perspective. Cambridge, MA: MIT Press.
Niculae, V., Kumar, S., Boyd-Graber, J., & Danescu-Niculescu-Mizil, C. (2015). Linguistic harbingers of betrayal: A case study on an online strategy game. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing: Vol. 1. Long papers (pp. 1650–1659). Stroudsburg, PA: Association for Computational Linguistics.
Nivre, J., Agić, Ž., Aranzabe, M. J., Asahara, M., Atutxa, A., Ballesteros, M., et al. (2015). Universal dependencies 1.2. Universal Dependencies Consortium. Retrieved from https://universaldependencies.org/
Nivre, J., de Marneffe, M.-C., Ginter, F., Goldberg, Y., Hajič, J., Manning, C. D., et al. (2016, May). Universal dependencies v1: A multilingual treebank collection. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16) (pp. 1659–1666). Portorož, Slovenia: European Language Resources Association (ELRA). Retrieved from www.aclweb.org/anthology/L16-1262
Pennebaker, J. W. (2011). The secret life of pronouns: What our words say about us. New York: Bloomsbury Press.
Pennebaker, J. W., Francis, M. E., & Booth, R. J. (2001). Linguistic inquiry and word count: LIWC2001. Mahwah, NJ: Lawrence Erlbaum.
Pennington, J., Socher, R., & Manning, C. D. (2014). GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1532–1543). Stroudsburg, PA: Association for Computational Linguistics.
Petrov, S., Das, D., & McDonald, R. (2011). A universal part-of-speech tagset. In Proceedings of LREC. Paris: European Language Resources Association.
Porter, M. F. (1980). An algorithm for suffix stripping. Program, 14(3), 130–137.
Prabhakaran, V., Rambow, O., & Diab, M. (2012). Predicting overt display of power in written dialogs. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 518–522). Stroudsburg, PA: Association for Computational Linguistics.
Resnik, P., & Hardisty, E. (2010). Gibbs sampling for the uninitiated (technical report). College Park, MD: University of Maryland Institute for Advanced Computer Studies.
Roberts, M. E., Stewart, B. M., Tingley, D., Airoldi, E. M., et al. (2013). The structural topic model and applied social science. In Advances in neural information processing systems workshop on topic models: Computation, application, and evaluation (pp. 1–20). San Diego, CA: Neural Information Processing Systems Foundation.
Röder, M., Both, A., & Hinneburg, A. (2015). Exploring the space of topic coherence measures. In Proceedings of the 8th ACM International Conference on Web Search and Data Mining (pp. 399–408). New York, NY: Association for Computing Machinery.
Rong, X. (2014). word2vec parameter learning explained. arXiv preprint arXiv:1411.2738.
Schwartz, H. A., Eichstaedt, J., Blanco, E., Dziurzynski, L., Kern, M., Ramones, S., et al. (2013). Choosing the right words: Characterizing and reducing error of the word count approach. In Second Joint Conference on Lexical and Computational Semantics (*SEM): Vol. 1. Proceedings of the main conference and the shared task: Semantic textual similarity (pp. 296–305). Stroudsburg, PA: Association for Computational Linguistics.
Sparck Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28(1), 11–21.
Srivastava, A., & Sutton, C. (2017). Autoencoding variational inference for topic models. arXiv preprint arXiv:1703.01488.
Stevens, K., Kegelmeyer, P., Andrzejewski, D., & Buttler, D. (2012, July). Exploring topic coherence over many models and many topics. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (pp. 952–961). Jeju Island, Korea: Association for Computational Linguistics. Retrieved from www.aclweb.org/anthology/D12-1087
Trudgill, P. (2000). Sociolinguistics: An introduction to language and society. London: Penguin.
Zipf, G. K. (1935). The psycho-biology of language: An introduction to dynamic philology. Boston, MA: Houghton Mifflin.