Molly Roberts (University of California, San Diego), Brandon Stewart (Princeton), and Dustin Tingley (Harvard), "stm: An R package for Structural Topic Models"

Selection committee: Jeffrey Arnold (University of Washington), Sarah Bouchat (Northwestern), Adam Glynn (Emory, Chair)


Structural topic models have advanced the use of topic modeling for diverse text corpora by incorporating document-level metadata and structure into model estimation. The stm package represents a significant research contribution to methodological innovation in text analysis, as several prominent papers introducing and using the method have demonstrated. The package facilitates the inclusion of covariates in topic model estimation and provides auxiliary utilities for regression and visualization of results. Its ease of use at each stage of analysis, and the importance of evaluating structural components in the text analysis of topics, are reflected in more than 100 citations of the package across a variety of research applications. The authors have also invested considerable effort in addressing the inferential issues associated with topic models and in making estimation fast enough for the package to be widely used. The stm package is available on CRAN and at http://www.structuraltopicmodel.com/, and is featured in a forthcoming article in the Journal of Statistical Software.