Abstract
Machine learning (ML) is rapidly reshaping the chemical sciences, with applications spanning molecular property prediction, chemical reaction design, molecular structure generation, and other forms of data-driven discovery. As ML becomes more deeply integrated into chemical research, undergraduate chemistry students increasingly need training that bridges traditional chemical education and ML methods. Here we present Machine Learning in Chemistry (MLChem), an undergraduate-level course designed from a chemistry-first perspective to lower the barrier to entry into ML while maintaining disciplinary relevance. The course introduces fundamental ML algorithms using chemical datasets, such as a small-molecule solubility dataset and a peptide activity dataset, and progresses from traditional ML algorithms to neural networks. Each chapter is accompanied by tutorial notebooks and homework assignments focused on chemistry-relevant tasks. The course materials are open-source and available at https://xuhuihuang.github.io/mlchem. The fundamental chapters are complemented by advanced modules on emerging topics such as reinforcement learning for retrosynthesis, ML-based force fields, and deep learning for the prediction of protein structure and dynamics. By combining chemical context with hands-on coding and exposure to frontier applications, MLChem equips undergraduate chemistry students with both conceptual foundations and practical skills, preparing them to participate in ML-driven chemical research.
Supplementary weblinks
Title
Jupyter Notebook Tutorials for Machine Learning in Chemistry (MLChem)
Description
The Jupyter notebook tutorials and example homework assignments for MLChem.