
Machine learning approaches to identify and design low thermal conductivity oxides for thermoelectric applications

Published online by Cambridge University Press:  09 September 2020

Abhishek Tewari*
Affiliation:
Department of Metallurgical and Materials Engineering, Indian Institute of Technology Roorkee, Haridwar, India
Siddharth Dixit
Affiliation:
Department of Mathematics, Shiv Nadar University, Gautam Buddha Nagar, India
Niteesh Sahni
Affiliation:
Department of Mathematics, Shiv Nadar University, Gautam Buddha Nagar, India
Stéphane P.A. Bordas
Affiliation:
Department of Engineering, Institute of Computational Engineering, University of Luxembourg, Esch-sur-Alzette, Luxembourg Institute of Mechanics and Advanced Materials, School of Engineering, Cardiff University, Cardiff, United Kingdom
*
*Corresponding author. E-mail: abhishek@mt.iitr.ac.in

Abstract

The search space for new thermoelectric oxides has been limited to the alloys of a few known systems, such as ZnO, SrTiO3, and CaMnO3. Notwithstanding their high power factor, their high thermal conductivity is a roadblock to achieving higher efficiency. In this paper, we apply machine learning (ML) models to discover novel transition metal oxides with low lattice thermal conductivity ($ {k}_L $). A two-step process is proposed to address the problem of small datasets frequently encountered in materials informatics. First, a gradient-boosted tree classifier is trained to categorize unknown compounds into three categories of $ {k}_L $: low, medium, and high. In the second step, we fit regression models on the targeted class (i.e., low $ {k}_L $) to estimate $ {k}_L $ with an $ {R}^2>0.9 $. The gradient-boosted tree model was also used to identify the key material properties influencing the classification of $ {k}_L $, namely lattice energy per atom, atom density, band gap, mass density, and the ratio of oxygen to transition metal atoms. Only fundamental material properties describing the crystal symmetry, compound chemistry, and interatomic bonding were used in the classification process, so the approach can be readily applied in the initial phases of materials design. The proposed two-step process addresses the problem of small datasets and improves the predictive accuracy. The ML approach adopted in the present work is generic in nature and can be combined with high-throughput computing for the rapid discovery of new materials for specific applications.
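The two-step process described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: the synthetic descriptors, the tertile class thresholds, and the use of scikit-learn's GradientBoostingClassifier (as a stand-in for the XGBoost classifier) and RandomForestRegressor are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 5))  # five made-up descriptors (e.g., lattice energy, atom density)
# Synthetic lattice thermal conductivity depending on the first two descriptors
k_L = np.exp(X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n))

# Step 1: bin k_L into low / medium / high classes and train a boosted-tree classifier
bins = np.quantile(k_L, [1 / 3, 2 / 3])
y_class = np.digitize(k_L, bins)  # 0 = low, 1 = medium, 2 = high
clf = GradientBoostingClassifier(random_state=0).fit(X, y_class)

# Step 2: fit a regression model only on the targeted (low-k_L) class
low = y_class == 0
reg = RandomForestRegressor(random_state=0).fit(X[low], k_L[low])

print(clf.score(X, y_class))          # classification accuracy on the training set
print(reg.score(X[low], k_L[low]))    # R^2 on the low-k_L subset (training set)
```

In a real workflow, the scores would of course be reported on held-out data (as the paper does via cross-validation); the training-set scores here merely demonstrate that the pipeline fits.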

Information

Type
Research Article
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2020. Published by Cambridge University Press

Figure 1. Two-step machine learning process, where the first step filters low $ {k}_L $ compounds using only fundamental material properties, such as details about crystal structure, interatomic bonding, and compound chemistry. In the second step, a multifidelity machine learning surrogate regression model is built to predict numerical $ {k}_L $ values.


Figure 2. Relative comparison of the accuracy obtained using machine learning and deep learning classifiers. XGBoost and Random Forest surpass deep neural networks and the other machine learning approaches to achieve the best classification accuracy. Cohen's kappa coefficient is also used to compare the classification models. Abbreviations: kNN, k-nearest neighbors; SVM, support vector machine with RBF kernel; LDA, linear discriminant analysis.
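Cohen's kappa, used above to compare the classifiers, corrects raw accuracy for chance agreement and can be computed directly from a confusion matrix. The 3×3 matrix below is made up for illustration; it is not the paper's Table 1.

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows = true class, columns = predicted)
C = np.array([[50,  5,  2],
              [ 4, 40,  6],
              [ 1,  7, 45]])

n = C.sum()
p_o = np.trace(C) / n                                # observed agreement (accuracy)
p_e = (C.sum(axis=0) * C.sum(axis=1)).sum() / n**2   # agreement expected by chance
kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 3))
```

A kappa of 1 indicates perfect agreement and 0 indicates chance-level agreement, which is why it is a fairer yardstick than accuracy when the three $ {k}_L $ classes are not equally easy to predict.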


Table 1. Confusion matrix for predictions made on the validation set by the XGBoost classifier.


Table 2. A detailed comparison of different classifiers and their relative performance.


Figure 3. Feature importance plot generated by the XGBoost classifier. The relative importance of each descriptor is calculated from how often it is used to make key splitting decisions in the decision trees.


Figure 4. Predictions of the regression models, where lighter shades of red and larger point sizes represent higher residuals: (a) random forest fitted on the entire dataset, (b) random forest fitted on the dataset of low $ {k}_L $ compounds, and (c) Cubist model fitted on low $ {k}_L $ compounds, including the Debye temperature and Grüneisen parameter.
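The residuals and $ {R}^2 $ underlying a parity plot like Figure 4 reduce to a few lines of arithmetic. The measured/predicted arrays below are made-up stand-ins for the regression output, not values from the paper.

```python
import numpy as np

# Hypothetical measured vs. predicted k_L values (arbitrary units)
y_true = np.array([1.2, 2.5, 3.1, 4.8, 5.0])
y_pred = np.array([1.0, 2.7, 3.0, 4.5, 5.4])

residuals = y_true - y_pred                       # plotted as point size/shade in Fig. 4
ss_res = np.sum(residuals**2)                     # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean())**2)      # total sum of squares
r2 = 1 - ss_res / ss_tot                          # coefficient of determination
print(round(r2, 3))
```

This is the same $ {R}^2 $ definition behind the paper's reported $ {R}^2>0.9 $ for the low-$ {k}_L $ regression models.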


Table 3. Ten-fold cross-validation results obtained from the regression model including the Grüneisen parameter and Debye temperature.

Supplementary material: File

Tewari et al. Supplementary Materials (1.5 KB)