Abstract
We report a comprehensive computational study of unsupervised machine learning for extracting chemically relevant information from X-ray absorption near edge structure (XANES) and valence-to-core X-ray emission spectra (VtC-XES), applied to the classification of a broad ensemble of sulforganic molecules. By progressively relaxing the constraining assumptions of the unsupervised machine learning algorithm, moving from principal component analysis to a variational autoencoder to t-distributed stochastic neighbor embedding (t-SNE), we find improved sensitivity to steadily more refined chemical information. Surprisingly, even in merely two dimensions, t-SNE distinguishes not just oxidation state and general sulfur bonding environment but also the aromaticity of the bonding radical group, with 87% accuracy, and it identifies even finer details of electronic structure within the aromatic and aliphatic sub-classes. We find that the chemical information in XANES and VtC-XES is very similar, although the two techniques show an unexpected tendency to differ in sensitivity within a given molecular class.
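As a rough illustration of the first and most constrained step in this progression, the sketch below applies principal component analysis to a set of spectra and keeps a two-dimensional embedding. It is not the authors' pipeline: the spectra are synthetic Gaussian peaks standing in for XANES data, the two "classes" differ only by a hypothetical peak shift, and the helper names `synthetic_spectra` and `pca_embed` are introduced here purely for illustration.

```python
import numpy as np


def synthetic_spectra(n_per_class=50, n_points=200, seed=0):
    """Toy stand-in for spectra (assumption, not real XANES data).

    Each spectrum is a single Gaussian peak plus noise; the two
    hypothetical classes differ in where the peak sits on the axis.
    """
    rng = np.random.default_rng(seed)
    energy = np.linspace(0.0, 1.0, n_points)  # arbitrary energy units
    spectra, labels = [], []
    for label, center in enumerate((0.35, 0.65)):
        for _ in range(n_per_class):
            c = center + rng.normal(0.0, 0.02)  # small per-sample shift
            peak = np.exp(-0.5 * ((energy - c) / 0.05) ** 2)
            spectra.append(peak + rng.normal(0.0, 0.01, n_points))
            labels.append(label)
    return np.array(spectra), np.array(labels)


def pca_embed(X, n_components=2):
    """Project rows of X onto the leading principal components.

    PCA is computed from the SVD of the mean-centered data matrix;
    the rows of Vt are the principal axes.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


X, y = synthetic_spectra()
Z = pca_embed(X)  # shape (n_spectra, 2): one 2D point per spectrum
```

Because PCA is a linear projection, it can only separate classes along directions of high variance; the less constrained methods named in the abstract (a variational autoencoder, then t-SNE) would replace `pca_embed` in a sketch like this one while taking the same spectra matrix as input.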
Supplementary materials
Title
Supplemental information: Unsupervised Machine Learning for Unbiased Chemical Classification in X-ray Absorption Spectroscopy and X-ray Emission Spectroscopy
Description
Supplemental information for the main manuscript.