Symbolic regression in materials science

Published online by Cambridge University Press: 21 June 2019

Yiqun Wang
Affiliation:
Department of Materials Science and Engineering, Northwestern University, Evanston, IL 60208, USA
Nicholas Wagner
Affiliation:
Department of Materials Science and Engineering, Northwestern University, Evanston, IL 60208, USA
James M. Rondinelli*
Affiliation:
Department of Materials Science and Engineering, Northwestern University, Evanston, IL 60208, USA
*Address all correspondence to James M. Rondinelli at jrondinelli@northwestern.edu

Abstract

The authors showcase the potential of symbolic regression as an analytic method for use in materials research. First, the authors briefly describe the current state-of-the-art method, genetic programming-based symbolic regression (GPSR), and recent advances in symbolic regression techniques. Next, the authors discuss industrial applications of symbolic regression and its potential applications in materials science. The authors then present two GPSR use cases: formulating a transformation kinetics law, in which the learning scheme rediscovers the well-known Johnson–Mehl–Avrami–Kolmogorov form, and learning the Landau free energy functional form for the displacive tilt transition in perovskite LaNiO3. Finally, the authors propose that materials scientists consider symbolic regression techniques as an alternative to other machine learning-based regression models for learning from data.
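To make the first use case concrete, the Johnson–Mehl–Avrami–Kolmogorov (JMAK) form is commonly written as f(t) = 1 - exp(-k t^n), where f is the transformed fraction. The sketch below is not GPSR and is not the authors' method: it assumes the functional form is already known and only recovers the two parameters through the classical Avrami linearization, ln(-ln(1 - f)) = ln k + n ln t. The data are synthetic and noise-free, and the function names and the values of `k_true` and `n_true` are hypothetical, chosen only for illustration. A symbolic regression run would instead have to discover the functional form itself from such data.

```python
import math


def jmak_fraction(t, k, n):
    """Transformed fraction f(t) = 1 - exp(-k * t**n) (JMAK form)."""
    return 1.0 - math.exp(-k * t ** n)


def fit_avrami(times, fractions):
    """Least-squares line through the Avrami plot.

    Uses ln(-ln(1 - f)) = ln(k) + n * ln(t), so the slope is n and
    the intercept is ln(k). Requires t > 0 and 0 < f < 1.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - f)) for f in fractions]
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    n = sxy / sxx                          # slope of the Avrami plot
    k = math.exp(y_mean - n * x_mean)      # exp(intercept)
    return k, n


# Synthetic, noise-free data; k_true and n_true are arbitrary
# illustration values, not taken from the paper.
k_true, n_true = 0.01, 2.5
times = [0.5 * i for i in range(1, 21)]    # t = 0.5, 1.0, ..., 10.0
fractions = [jmak_fraction(t, k_true, n_true) for t in times]

k_fit, n_fit = fit_avrami(times, fractions)
print(f"k = {k_fit:.6f}, n = {n_fit:.6f}")
```

With real, noisy measurements the points near f = 0 and f = 1 dominate the error in the linearized plot, so a weighted or nonlinear fit would usually be preferable to this plain least-squares line.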

Type
Artificial Intelligence Prospectives
Copyright
Copyright © Materials Research Society 2019 

Footnotes

These authors contributed equally to this work.
