
CRYSTAL: a multi-agent AI system for automated mapping of materials' crystal structures

Published online by Cambridge University Press:  24 April 2019

Carla P. Gomes*
Affiliation:
Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
Junwen Bai
Affiliation:
Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
Yexiang Xue
Affiliation:
Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
Johan Björck
Affiliation:
Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
Brendan Rappazzo
Affiliation:
Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
Sebastian Ament
Affiliation:
Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
Richard Bernstein
Affiliation:
Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
Shufeng Kong
Affiliation:
Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
Santosh K. Suram
Affiliation:
Joint Center for Artificial Photosynthesis, California Institute of Technology, Pasadena CA 91125, USA
R. Bruce van Dover
Affiliation:
Department of Materials Science and Engineering, Cornell University, Ithaca, NY, USA
John M. Gregoire*
Affiliation:
Joint Center for Artificial Photosynthesis, California Institute of Technology, Pasadena CA 91125, USA
*Address all correspondence to Carla P. Gomes at gomes@cs.cornell.edu and John M. Gregoire at gregoire@caltech.edu

Abstract

We introduce CRYSTAL, a multi-agent AI system for crystal-structure phase mapping. CRYSTAL is the first system that can automatically generate a portfolio of physically meaningful phase diagrams for expert-user exploration and selection. CRYSTAL outperforms previous methods in solving the example Pd-Rh-Ta phase diagram, enabling the discovery of a mixed-intermetallic methanol oxidation electrocatalyst. The integration of multiple data-knowledge sources with learning and reasoning algorithms, combined with the exploitation of problem decompositions, relaxations, and parallelism, empowers AI to supersede human scientific data interpretation capabilities and enable otherwise inaccessible scientific discovery in materials science and beyond.

Information

Type
Artificial Intelligence Research Letters
Copyright
Copyright © Materials Research Society 2019 

Figure 1. Materials discovery cycle. (a) Synthesis of materials using sputter co-deposition from palladium (Pd), rhodium (Rh), and tantalum (Ta) sources to form a “materials library” thin film with continuous composition variation. Collection of both elastically (XRD) and inelastically (XRF) scattered x-rays, using a synchrotron x-ray beam, to characterize the materials' crystal structure and composition, respectively, the latter enabling a ternary composition map of the Pd-Rh-Ta library. (b) Each library is screened for catalytic activity using an electrochemical imaging strategy in which the best catalysts are identified using a fluorescent marker.[11] Materials that appear active in the absence of methanol are denoted as “unstable.” (c) The triangle-composition plot contains the 197 distilled XRD/XRF measurements that comprise the input for phase mapping. The 12 XRD patterns along the Pd-Rh composition line illustrate composition-dependent peak shifting due to the two elements alloying in a single crystal structure. (d) CRYSTAL's phase map solution identifies five phases (purple, yellow, orange, blue, and red) and six multi-phase fields; each sample's XRD pattern is explained by either a single phase or a mixture of phases. (e) The XRD pattern for a phase “3” sample is shown with red sticks denoting the known peak pattern of the face-centered cubic (fcc) crystal structure, indicating that the broad range of compositions in the fcc phase field crystallize into this same structure. (f) The average atomic radius varies systematically with the alloy composition, which CRYSTAL captures by mapping the composition-dependent fcc lattice constant.
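
The composition-dependent peak shifting described in panels (c) and (f) can be illustrated with a minimal sketch. It assumes Vegard's law (a linear interpolation of lattice constants, a common approximation that the caption itself does not invoke) and approximate textbook fcc lattice constants for Pd and Rh. The function names and numeric constants are illustrative assumptions and are not part of CRYSTAL.

```python
import math

# Approximate room-temperature fcc lattice constants in angstroms.
# Illustrative textbook values, NOT taken from the article.
A_PD = 3.89
A_RH = 3.80


def vegard_lattice_constant(x_pd):
    """Vegard's-law estimate of the fcc lattice constant of Pd_x Rh_(1-x)."""
    if not 0.0 <= x_pd <= 1.0:
        raise ValueError("x_pd must be a fraction between 0 and 1")
    return x_pd * A_PD + (1.0 - x_pd) * A_RH


def is_fcc_allowed(hkl):
    """fcc selection rule: a reflection appears only if h, k, l are all even or all odd."""
    return len({index % 2 for index in hkl}) == 1


def fcc_peak_q(a, hkl):
    """Peak position Q = 2*pi/d for a cubic lattice, with d = a / sqrt(h^2 + k^2 + l^2)."""
    h, k, l = hkl
    return 2.0 * math.pi * math.sqrt(h * h + k * k + l * l) / a


if __name__ == "__main__":
    # Track the (111) peak along the Pd-Rh line in steps of 25 at.% Pd.
    for step in range(5):
        x_pd = step / 4
        a = vegard_lattice_constant(x_pd)
        print(f"x_Pd={x_pd:.2f}  a={a:.4f} A  Q(111)={fcc_peak_q(a, (1, 1, 1)):.4f} 1/A")
```

Under these assumptions, adding the larger Pd atom expands the lattice and moves every allowed fcc peak to lower Q, which is the kind of systematic shift a phase-mapping method must attribute to one alloyed phase rather than to many distinct phases.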


Figure 2. Outline of the CRYSTAL system. CRYSTAL incorporates a diverse collection of fast and specialized algorithms with different types of knowledge and computational capabilities. IAFD integrates the AgileFD, Gibbs, Gibbs Alloy, and Phase Connectivity bots, constituting CRYSTAL's core phase-mapping engine: AgileFD performs agile factor decomposition to learn the factors or basis patterns, corresponding to pure crystal structures, and its three partner bots enforce physical constraints. The Phase Matching bot matches the basis patterns discovered by IAFD to known crystal structure patterns from databases. The Phase Dimension Analysis bot analyzes and validates the generated phase maps and infers the system's maximum number of pure phases, which dictates how many system configurations CRYSTAL explores, using parallelism and randomization, to produce a large number of candidate phase maps. The hierarchical Clustering bot uses automated thresholding to identify a small set of representative candidate solutions, which are provided to either the CRYSTAL Planner for solution refinement or to the Analysis & Reporting bot to generate phase diagrams and other visualizations for human-expert inspection. The Visualizer & Interface bot enables users to interact with CRYSTAL for solution selection and fine-tuning.
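
As an illustration of the kind of physical constraint the partner bots enforce, the sketch below checks a constraint derived from Gibbs' phase rule: at fixed temperature and pressure, a system with C components can have at most C coexisting phases, so a sample in a ternary library should mix at most three phases. The repair step shown here (keep the largest activations and renormalize) is a deliberately simple stand-in and is not the algorithm used by CRYSTAL's Gibbs bot. All function names and the tolerance value are hypothetical.

```python
def max_coexisting_phases(n_components):
    """Gibbs' phase rule at fixed temperature and pressure: at most C phases coexist."""
    return n_components


def violates_gibbs(activations, n_components=3, tol=1e-6):
    """True if a sample mixes more phases than the phase rule allows."""
    n_active = sum(1 for value in activations if value > tol)
    return n_active > max_coexisting_phases(n_components)


def repair_gibbs(activations, n_components=3, tol=1e-6):
    """Naive repair (illustrative only): keep the largest activations, renormalize to sum 1."""
    limit = max_coexisting_phases(n_components)
    ranked = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    keep = set(ranked[:limit])
    kept = [value if (i in keep and value > tol) else 0.0
            for i, value in enumerate(activations)]
    total = sum(kept)
    return [value / total for value in kept] if total > 0 else kept


if __name__ == "__main__":
    sample = [0.4, 0.3, 0.2, 0.1]  # phase fractions of four candidate phases in one sample
    print(violates_gibbs(sample))  # four active phases in a ternary system
    print(repair_gibbs(sample))    # weakest phase dropped, remainder renormalized
```

Truncation of this kind treats each sample independently; constraints that couple neighboring samples, such as phase connectivity, need information about the composition graph and are outside the scope of this sketch.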


Figure 3. Solving phase mapping using an unsupervised generative approach. The IAFD bot network solves phase mapping as a constrained matrix factorization problem in which the input XRD pattern matrix (A) is decomposed into factors W and H such that W × H approximates A while satisfying physical constraints. W encodes the characteristic patterns of pure crystal phases (including shifted versions) and H their activations, which dictate both the amount and the extent of pattern shifting of each pure phase in each XRD measurement. IAFD starts with p (typically three) rounds of interactions between the AgileFD and Gibbs bots, followed by rounds of iterations between the Gibbs Alloy and Phase Connectivity bots, until all the constraints are satisfied. AgileFD performs matrix factorization using lightweight multiplicative updating rules, without enforcing the combinatorial physical constraints. The AgileFD solution's violations of the connectivity constraint and the constraints based on Gibbs' phase rule are repaired by the corresponding bots in an interleaved manner using efficient algorithms (red circles highlight repaired activations of H). The entire procedure is repeated for solution refinement (typically q = 2), and the resulting basis patterns are passed to the Phase Matching bot to identify the crystal structures by comparison with International Centre for Diffraction Data (ICDD) patterns and/or determine whether the solution potentially contains a new phase. The figure illustrates a representative XRD pattern of the Pd-Rh-Ta system (#69) that is decomposed into shifted versions of two different basis patterns (with weights of 0.16 and 0.84, respectively).
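
The factorization of A into W and H can be sketched with the classic multiplicative update rules for nonnegative matrix factorization (Lee and Seung, Frobenius-norm objective). This is a simplified stand-in: it does not model peak shifting and enforces none of the physical constraints, so it should not be read as AgileFD itself. The helper names, iteration count, and toy matrices are assumptions made for illustration.

```python
import random


def transpose(X):
    return [list(row) for row in zip(*X)]


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def frob_error(A, W, H):
    """Frobenius norm of the residual A - W*H."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb)) ** 0.5


def nmf(A, k, iters=300, seed=0, eps=1e-9):
    """Factor nonnegative A (n x m) into W (n x k) and H (k x m) by multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(A), len(A[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T A) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[h * p / (q + eps) for h, p, q in zip(hr, pr, qr)]
             for hr, pr, qr in zip(H, num, den)]
        # W <- W * (A H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[w * p / (q + eps) for w, p, q in zip(wr, pr, qr)]
             for wr, pr, qr in zip(W, num, den)]
    return W, H


if __name__ == "__main__":
    # Toy data: two basis patterns (columns of W0) mixed into three measurements.
    W0 = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8]]
    H0 = [[0.16, 1.0, 0.3], [0.84, 0.0, 0.7]]
    A = matmul(W0, H0)
    W, H = nmf(A, k=2)
    print("residual:", frob_error(A, W, H))
```

Because every update multiplies by a nonnegative ratio, W and H stay nonnegative without any projection step, which is what makes this family of updates inexpensive. The recovered factors are unique only up to scaling and permutation of the basis patterns.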


Figure 4. CRYSTAL's solution to the Pd-Rh-Ta catalyst system. (a) CRYSTAL automatically generates 2500 phase diagrams in parallel, from which the Phase Dimension Analysis bot identifies 1639 valid solutions and the Clustering bot identifies 100 representative solutions for additional refinement. (b) From the 100 refined phase diagrams, CRYSTAL automatically identifies the span of solutions with different physical meaning, which is 20 phase diagrams in this case. (c) The 20 selected phase diagrams, each representing its respective cluster. (d) The final solution resulting from expert consideration of CRYSTAL's report. The expert user also provided minor manual refinement of the phase diagram, in particular small phase field boundary adjustments in composition regions with sparse measurement data. (e) Color scheme for the phase fields, where the single-phase fields are labeled and phase combinations are denoted by linkages. The 11 phase fields marked with a black circle appear in the final solution. (f) The basis patterns for the final solution, with stick patterns from the International Centre for Diffraction Data (ICDD) shown in red. Composition maps of the relative lattice constant for each phase reveal alloying-based shifts due to the different atomic radii of the elements (Ta > Pd > Rh). The dot size denotes the phase concentration. (g) The methanol oxidation onset potential for the ternary and binary composition spaces, where Rh-Ta is the only binary to exhibit catalytic activity. The overlay of the final solution's phase field boundaries reveals that the best activity (lowest onset potential) is observed in the mixed orth-Rh2Ta + hex-Pd3Ta phase field.
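
The reduction from many candidate phase diagrams to a few representatives in panels (a) to (c) can be sketched as hierarchical clustering cut at a distance threshold, followed by choosing one medoid per cluster. The sketch assumes each candidate is summarized as a tuple of phase-field labels, one per sample, uses a fixed threshold rather than the automated thresholding mentioned in the Figure 2 caption, and uses single linkage. These are all illustrative choices, and the function names are hypothetical.

```python
def hamming_fraction(a, b):
    """Fraction of samples assigned to different phase fields by two candidate maps."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_labelings(labelings, threshold):
    """Single-linkage clustering cut at `threshold`, implemented with union-find."""
    parent = list(range(len(labelings)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(labelings)):
        for j in range(i + 1, len(labelings)):
            if hamming_fraction(labelings[i], labelings[j]) <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(labelings)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def medoid(indices, labelings):
    """Cluster member with the smallest total distance to the other members."""
    return min(indices, key=lambda i: sum(
        hamming_fraction(labelings[i], labelings[j]) for j in indices))


def representatives(labelings, threshold):
    """One representative candidate index per cluster."""
    return [medoid(cluster, labelings)
            for cluster in cluster_labelings(labelings, threshold)]


if __name__ == "__main__":
    # Four toy candidate maps over four samples; labels are phase-field ids.
    candidates = [(0, 0, 1, 1), (0, 0, 1, 2), (2, 2, 0, 0), (2, 2, 0, 1)]
    print(cluster_labelings(candidates, threshold=0.25))
    print(representatives(candidates, threshold=0.25))
```

Raising the threshold merges clusters and yields fewer representatives, so the threshold controls how coarse the final portfolio is. A label-by-label comparison like this one also assumes that phase-field ids are consistent across candidates, which a real pipeline would have to establish first.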


Table I. Comparison of constraint satisfaction for solutions generated by different algorithms.

Supplementary material: File

Gomes et al. supplementary material
