Modern Dimension Reduction

Philip D. Waggoner

doi:10.1017/9781108981767

Series: Elements in Quantitative and Computational Methods for the Social Sciences

Published online by Cambridge University Press: 10 July 2021

Author affiliation: Philip D. Waggoner, University of Chicago

Summary

Data are not only ubiquitous in society but also increasingly complex in both size and dimensionality. Dimension reduction offers researchers and scholars the ability to make such complex, high-dimensional data spaces simpler and more manageable. This Element offers readers a suite of modern unsupervised dimension reduction techniques, along with hundreds of lines of R code, to efficiently represent the original high-dimensional data space in a simplified, lower-dimensional subspace. Launching from the earliest dimension reduction technique, principal components analysis (PCA), and using real social science data, I introduce and walk readers through the application of the following techniques: locally linear embedding (LLE), t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), self-organizing maps (SOM), and deep autoencoders. The result is a well-stocked toolbox of unsupervised algorithms for tackling the complexities of high-dimensional data so common in modern society. All code is publicly accessible on GitHub.

Keywords

dimension reduction; unsupervised machine learning; computational social science; big data; social computing

Type: Element

DOI: https://doi.org/10.1017/9781108981767

Online ISBN: 9781108981767

Publisher: Cambridge University Press

Print publication: 05 August 2021
