  • Cited by 17
    • Publisher:
      Cambridge University Press
      Publication date:
      10 July 2021
      05 August 2021
      ISBN:
      9781108981767
      9781108986892
      Dimensions:
      (229 x 152 mm)
      Weight & Pages:
      0.16kg, 98 Pages

    Book description

    Data are not only ubiquitous in society but also increasingly complex in both size and dimensionality. Dimension reduction offers researchers and scholars the ability to make such complex, high-dimensional data spaces simpler and more manageable. This Element offers readers a suite of modern unsupervised dimension reduction techniques, along with hundreds of lines of R code, to efficiently represent the original high-dimensional data space in a simplified, lower-dimensional subspace. Launching from the earliest dimension reduction technique, principal components analysis (PCA), and using real social science data, I introduce and walk readers through the application of the following techniques: locally linear embedding (LLE), t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), self-organizing maps (SOM), and deep autoencoders. The result is a well-stocked toolbox of unsupervised algorithms for tackling the complexities of high-dimensional data so common in modern society. All code is publicly accessible on GitHub.
