
Deep materials informatics: Applications of deep learning in materials science

Published online by Cambridge University Press:  13 June 2019

Ankit Agrawal*
Affiliation:
Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60201, USA
Alok Choudhary
Affiliation:
Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60201, USA
*Address all correspondence to Ankit Agrawal at ankitag@eecs.northwestern.edu

Abstract

The growing application of data-driven analytics in materials science has led to the rise of materials informatics. Within the arena of data analytics, deep learning has emerged as a game-changing technique in the last few years, enabling numerous real-world applications, such as self-driving cars. In this paper, the authors present an overview of deep learning, its advantages and challenges, and its recent applications to different types of materials data. The increasing availability of materials databases and big data in general, along with groundbreaking advances in deep learning, offers great promise to accelerate the discovery, design, and deployment of next-generation materials.

Information

Type
Artificial Intelligence Prospectives
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © Materials Research Society 2019

Figure 1. The four paradigms of science in the context of materials. Historically, science was largely empirical or observational, which is known today as the experimental branch of science. When calculus was invented in the 17th century, it became possible to describe natural phenomena in the form of mathematical equations, marking the beginning of the second paradigm of science: model-based theoretical science. With time, these equations became larger and more complex, and it was only in the 20th century, when computers were invented, that such large and complex theoretical models (systems of equations) became solvable, enabling large-scale simulations of real-world phenomena, the third paradigm of science. The last two decades have seen an explosive growth in the generation of data from the first three paradigms, which has far outstripped our capacity to make sense of it. All of this collected data can serve as a valuable resource for learning and for augmenting the knowledge from the first three paradigms, and has led to the emergence of the fourth paradigm of science: (big) data-driven science (reproduced from Ref. 2 under CC-BY license).


Figure 2. The PSPP (processing-structure-property-performance) relationships of materials science and engineering, where science flows from left to right and engineering flows from right to left. Interestingly, each relationship from left to right is many-to-one: many different processing routes can result in the same structure, and similarly, the same property can be achieved by multiple material structures. Materials informatics approaches can help decipher these relationships via fast and accurate forward models, which in turn can help realize the more difficult inverse models of materials discovery and design (reproduced from Ref. 2 under CC-BY license).


Figure 3. Pros and cons of deep learning. As with any technique, there are advantages and challenges of using deep learning that need to be considered carefully for successful application.


Figure 4. A fully connected deep ANN with four inputs, one output, and five hidden layers with varying numbers of neurons (left). The ReLU activation function (right).
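The architecture sketched in Fig. 4 maps almost directly onto a few lines of modern deep learning code. Below is a minimal, illustrative sketch using the Keras Sequential API; the hidden-layer widths, optimizer, and loss are assumptions chosen for demonstration, not values taken from the figure.

# Minimal sketch of a fully connected deep network like the one in Fig. 4.
# Hidden-layer widths (16-32-32-16-8) are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(4,)),              # four input features
    layers.Dense(16, activation="relu"),   # hidden layer 1 (ReLU activation)
    layers.Dense(32, activation="relu"),   # hidden layer 2
    layers.Dense(32, activation="relu"),   # hidden layer 3
    layers.Dense(16, activation="relu"),   # hidden layer 4
    layers.Dense(8, activation="relu"),    # hidden layer 5
    layers.Dense(1),                       # single output (e.g., a predicted property)
])
model.compile(optimizer="adam", loss="mse")  # assumed regression setup
model.summary()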


Figure 5. A CNN with three convolution layers, two pooling layers, and three fully connected layers. It takes a 64 × 64 RGB image (i.e., three channels) as input. The first convolution layer has two filters, resulting in a feature map with two channels (depicted in purple and blue). The second convolution layer has three filters, thereby producing a feature map with three channels. It is then followed by a 2 × 2 pooling layer, which reduces the spatial dimensions of the feature map from 64 × 64 to 32 × 32. This is followed by another convolution layer with five filters, and another pooling layer that reduces the feature map to 16 × 16 (five channels). Next, the feature map is flattened into a 1-D vector of 16 × 16 × 5 = 1280 values, which is fed into three fully connected layers of 640, 64, and 1 neuron(s), respectively, finally producing the output value.
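For concreteness, the network described in Fig. 5 can be expressed in Keras as follows. The 3 × 3 filter size and "same" padding are assumptions made so that the feature-map sizes match those stated in the caption (64 to 64 to 32 to 32 to 16); the filter counts, pooling, and fully connected layer sizes follow the figure.

# Keras sketch of the CNN described in Fig. 5 (illustrative, untrained).
import tensorflow as tf
from tensorflow.keras import layers

cnn = tf.keras.Sequential([
    layers.Input(shape=(64, 64, 3)),                          # 64 x 64 RGB input
    layers.Conv2D(2, 3, padding="same", activation="relu"),   # 2 filters -> 64 x 64 x 2
    layers.Conv2D(3, 3, padding="same", activation="relu"),   # 3 filters -> 64 x 64 x 3
    layers.MaxPooling2D(2),                                   # 2 x 2 pooling -> 32 x 32 x 3
    layers.Conv2D(5, 3, padding="same", activation="relu"),   # 5 filters -> 32 x 32 x 5
    layers.MaxPooling2D(2),                                   # 2 x 2 pooling -> 16 x 16 x 5
    layers.Flatten(),                                         # 16 * 16 * 5 = 1280 values
    layers.Dense(640, activation="relu"),                     # fully connected layer 1
    layers.Dense(64, activation="relu"),                      # fully connected layer 2
    layers.Dense(1),                                          # single output value
])
cnn.summary()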


Figure 6. A GAN consists of two neural networks, a generator and a discriminator, and with proper training it is capable of generating realistic images/data from noise.
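The two-network structure in Fig. 6 can likewise be sketched in a few lines of Keras. The latent (noise) dimension, layer widths, and 28 × 28 single-channel output are illustrative assumptions, and the adversarial training loop, which alternates discriminator and generator updates, is omitted for brevity.

# Minimal sketch of the generator/discriminator pair of a GAN (Fig. 6).
import tensorflow as tf
from tensorflow.keras import layers

latent_dim = 100  # size of the input noise vector (assumption)

# Generator: maps random noise to a synthetic image.
generator = tf.keras.Sequential([
    layers.Input(shape=(latent_dim,)),
    layers.Dense(256, activation="relu"),
    layers.Dense(28 * 28, activation="sigmoid"),
    layers.Reshape((28, 28, 1)),
])

# Discriminator: classifies an image as real (1) or generated (0).
discriminator = tf.keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")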