Abstract
Faced with low-resource reaction training samples, we construct a chemical platform to address the small-scale reaction prediction problem. Using a self-supervised pretraining strategy called MASS, the transformer model absorbs the chemical information of about 1 billion molecules and is then fine-tuned on small-scale reaction prediction, in contrast to previous works that rely only on reaction samples. To demonstrate the broad applicability of our approach, we adopt three different name reactions in our work. In the Baeyer-Villiger, Heck, and Sharpless asymmetric epoxidation reaction prediction tasks, the average accuracies increase by 5.7%, 10.8%, and 4.8%, respectively, marking an important step toward low-resource reaction prediction.
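To make the pretrain-then-finetune recipe concrete, below is a minimal sketch (not the authors' code) of MASS-style masked sequence-to-sequence pretraining on tokenized SMILES strings: a contiguous span of the encoder input is masked, and the decoder is trained to reconstruct exactly that span. The vocabulary size, special-token ids, model dimensions, and the `smiles_batch` tensor are hypothetical placeholders; the same encoder-decoder would subsequently be fine-tuned on reactant-to-product pairs with an ordinary seq2seq cross-entropy loss.

```python
# Minimal MASS-style pretraining sketch on SMILES (assumed setup, not the paper's code).
import torch
import torch.nn as nn

PAD, MASK, VOCAB = 0, 1, 64  # hypothetical special-token ids and vocab size

model = nn.Transformer(d_model=256, nhead=8, batch_first=True)
embed = nn.Embedding(VOCAB, 256, padding_idx=PAD)
head = nn.Linear(256, VOCAB)
opt = torch.optim.Adam(
    list(model.parameters()) + list(embed.parameters()) + list(head.parameters()),
    lr=1e-4,
)

def mass_step(tokens: torch.Tensor, span: slice) -> torch.Tensor:
    """One MASS pretraining step: mask a contiguous span in the encoder
    input and train the decoder to predict the masked span."""
    src = tokens.clone()
    src[:, span] = MASK                                # encoder sees the corrupted SMILES
    tgt_in = tokens[:, span.start - 1:span.stop - 1]   # decoder input: span shifted right
    tgt_out = tokens[:, span]                          # decoder target: the masked span
    causal = nn.Transformer.generate_square_subsequent_mask(tgt_in.size(1))
    hidden = model(embed(src), embed(tgt_in), tgt_mask=causal)
    loss = nn.functional.cross_entropy(head(hidden).transpose(1, 2), tgt_out)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss

# Stand-in batch of tokenized SMILES; in practice this would stream
# from the unlabeled molecule corpus before reaction fine-tuning.
smiles_batch = torch.randint(2, VOCAB, (8, 32))
mass_step(smiles_batch, slice(8, 16))
```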
Supplementary materials
Title: Self-supervised molecular pretraining strategy for low-resource reaction prediction scenarios
Description: Supplementary Materials for "Self-supervised molecular pretraining strategy for low-resource reaction prediction scenarios"