Identifying Structure-Property Relationships through SMILES Syntax Analysis With Self-Attention Mechanism

Shuangjia Zheng; Xin Yan; Yuedong Yang; Jun Xu

doi:10.26434/chemrxiv.7295903.v2

Theoretical and Computational Chemistry

14 November 2018, Version 2

Working Paper

This content is an early or alternative research output and has not been peer-reviewed by Cambridge University Press at the time of posting.

Abstract

Recognizing substructures and their relations embedded in a molecular structure representation is a key step in structure-activity or structure-property relationship (SAR/SPR) studies. A molecular structure can be represented explicitly either as a connection table (CT) or as a linear notation such as SMILES, a language that describes the connectivity of atoms in the molecular structure. Conventional SAR/SPR approaches rely on partitioning the CT into a set of predefined substructures that serve as structural descriptors. In this work, we propose a new method for identifying SAR/SPR through syntax analysis of a linear notation (for example, SMILES) with a self-attention mechanism, an interpretable deep learning architecture. The method has been evaluated by predicting chemical properties, toxicity, and bioactivity from experimental data sets. Our results demonstrate that the method yields superior performance compared with state-of-the-art methods. Moreover, the method produces chemically interpretable results, which chemists can use to design and synthesize compounds with improved activity or properties.
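To make the idea of attending over SMILES syntax concrete, the following is a minimal sketch, not the authors' model. It assumes a simplified regular-expression tokenizer, random (untrained) token embeddings, and generic scaled dot-product self-attention, which may differ from the attention variant used in the paper. The names `tokenize`, `self_attention_weights`, and `attend` are hypothetical and exist only for this illustration.

```python
import re
import numpy as np

# Simplified SMILES token pattern (an assumption for illustration):
# bracket atoms, two-letter halogens, %nn ring closures, then any single character.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into atom, bond, branch, and ring-closure tokens."""
    return TOKEN_RE.findall(smiles)


def self_attention_weights(embeddings):
    """Return row-normalized scaled dot-product attention weights for one sequence."""
    d = embeddings.shape[1]
    scores = embeddings @ embeddings.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def attend(smiles, dim=8, seed=0):
    """Tokenize a SMILES string and compute token-to-token attention weights."""
    tokens = tokenize(smiles)
    vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
    rng = np.random.default_rng(seed)
    table = rng.normal(size=(len(vocab), dim))  # untrained, hypothetical embeddings
    x = table[[vocab[t] for t in tokens]]
    return tokens, self_attention_weights(x)


tokens, weights = attend("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
```

Each row of `weights` sums to one and indicates how strongly one token attends to every other token. In a trained model, weights of this kind are what could be mapped back onto atoms and substructures for interpretation; with the random embeddings used here they carry no chemical meaning.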

Keywords

structure-property relationship

drug discovery

Now Published

Identifying Structure–Property Relationships through SMILES Syntax Analysis with Self-Attention Mechanism

Shuangjia Zheng, Xin Yan, Yuedong Yang, Jun Xu (journal article)

Journal of Chemical Information and Modeling, Volume 59, Issue 2

Online publication date: Jan 22, 2019

Version History

Nov 14, 2018 Version 2

Nov 05, 2018 Version 1

Version Notes

fixed typographic mistakes

Metrics

Views: 5,251

Downloads: 3,360

License

The content is available under CC BY NC 4.0

Funding

National Science & Technology Major Project of the Ministry of Science and Technology of China (2018ZX09735010), GD Frontier & Key Techn. Innovation Program (2015B010109004), GD-NSF (2016A030310228), Natural Science Foundation of China (U1611261), and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211)

Author’s competing interest statement

The authors declare that they have no competing interests.
