Abstract
Understanding material surfaces and interfaces is vital in applications such as catalysis or electronics. By combining energies from electronic structure with statistical mechanics, ab initio simulations can, in principle, predict the structure of material surfaces as a function of thermodynamic variables. However, accurate energy simulations are prohibitive when coupled to the vast phase space that must be statistically sampled. Here we present a bi-faceted computational loop to predict surface phase diagrams of multicomponent materials that accelerates both the energy scoring and statistical sampling methods. Fast, scalable and data-efficient machine learning interatomic potentials are trained on high-throughput density-functional-theory calculations through closed-loop active learning. Markov chain Monte Carlo sampling in the semigrand canonical ensemble is enabled by using virtual surface sites. The predicted surfaces for GaN(0001), Si(111) and SrTiO3(001) are in agreement with past work and indicate that the proposed strategy can model complex material surfaces and discover previously unreported surface terminations.
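As an illustration of the sampling idea only, the following minimal sketch runs Metropolis Monte Carlo in a semigrand canonical ensemble over a fixed set of virtual surface sites. Everything in it is an assumption made for illustration: the toy energy function, the species, the chemical potentials and the function names are hypothetical and are not taken from the VSSR-MC code, which scores structures with an interatomic potential.

```python
import math
import random

EMPTY = None  # marker for an unoccupied virtual site


def toy_energy(occupancy):
    """Hypothetical stand-in for an interatomic potential: each adsorbed atom
    contributes -1.0 eV and each pair of occupied neighbouring sites +0.3 eV."""
    n_ads = sum(s is not EMPTY for s in occupancy)
    n_pairs = sum(
        a is not EMPTY and b is not EMPTY
        for a, b in zip(occupancy, occupancy[1:])
    )
    return -1.0 * n_ads + 0.3 * n_pairs


def grand_potential(occupancy, mu):
    """Semigrand objective: E - sum_i mu_i * N_i over the adsorbed species."""
    return toy_energy(occupancy) - sum(mu[s] for s in occupancy if s is not EMPTY)


def sample(n_sites, mu, kT, n_steps, seed=0):
    """Metropolis sampling; each move changes the state of one virtual site."""
    rng = random.Random(seed)
    states = [EMPTY] + sorted(mu)  # a site is empty or holds one species
    occ = [EMPTY] * n_sites
    omega = grand_potential(occ, mu)
    for _ in range(n_steps):
        site = rng.randrange(n_sites)
        # Symmetric proposal: pick any state other than the current one.
        new_state = rng.choice([s for s in states if s != occ[site]])
        trial = occ.copy()
        trial[site] = new_state
        omega_trial = grand_potential(trial, mu)
        delta = omega_trial - omega
        # Accept with probability min(1, exp(-delta / kT)).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            occ, omega = trial, omega_trial
    return occ, omega


if __name__ == "__main__":
    final_occ, final_omega = sample(
        n_sites=8, mu={"Ga": -0.5, "N": -2.0}, kT=0.05, n_steps=2000
    )
    print(final_occ, round(final_omega, 3))
```

Because insertion, removal and species swaps are all expressed as a change of state at one site, the composition can vary without changing the number of sites, which is the role the abstract assigns to virtual surface sites.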
Data availability
The trained models, DFT data and Jupyter notebooks used for data analysis are available on Zenodo at https://doi.org/10.5281/zenodo.7758174 (ref. 72). Source data are provided with this paper.
Code availability
The VSSR-MC algorithm reported in this work is available on GitHub at https://github.com/learningmatter-mit/surface-sampling. The version of code used in this work is available on Zenodo at https://doi.org/10.5281/zenodo.10086398 (ref. 73).
References
1. Shi, R., Waterhouse, G. I. & Zhang, T. Recent progress in photocatalytic CO2 reduction over perovskite oxides. Solar RRL 1, 1700126 (2017).
2. Sumaria, V., Nguyen, L., Tao, F. F. & Sautet, P. Atomic-scale mechanism of platinum catalyst restructuring under a pressure of reactant gas. J. Am. Chem. Soc. 145, 392–401 (2023).
3. Fabbri, E. et al. Dynamic surface self-reconstruction is the key of highly active perovskite nano-electrocatalysts for water splitting. Nat. Mater. 16, 925–931 (2017).
4. Zhang, Z., Wei, Z., Sautet, P. & Alexandrova, A. N. Hydrogen-induced restructuring of a Cu(100) electrode in electroreduction conditions. J. Am. Chem. Soc. 144, 19284–19293 (2022).
5. Sha, Z., Shen, Z., Cali, E., Kilner, J. A. & Skinner, S. J. Understanding surface chemical processes in perovskite oxide electrodes. J. Mater. Chem. A 11, 5645–5659 (2023).
6. Jung, S.-K. et al. Understanding the degradation mechanisms of LiNi0.5Co0.2Mn0.3O2 cathode material in lithium ion batteries. Adv. Energy Mater. 4, 1300787 (2014).
7. Han, B. et al. From coating to dopant: how the transition metal composition affects alumina coatings on Ni-rich cathodes. ACS Appl. Mater. Interfaces 9, 41291–41302 (2017).
8. Xu, C. et al. Bulk fatigue induced by surface reconstruction in layered Ni-rich cathodes for Li-ion batteries. Nat. Mater. 20, 84–92 (2021).
9. Hirata, A., Saiki, K., Koma, A. & Ando, A. Electronic structure of a SrO-terminated SrTiO3(100) surface. Surf. Sci. 319, 267–271 (1994).
10. Castell, M. R. Scanning tunneling microscopy of reconstructions on the SrTiO3(001) surface. Surf. Sci. 505, 1–13 (2002).
11. Erdman, N. et al. The structure and chemistry of the TiO2-rich surface of SrTiO3(001). Nature 419, 55–58 (2002).
12. Heifets, E., Piskunov, S., Kotomin, E. A., Zhukovskii, Y. F. & Ellis, D. E. Electronic structure and thermodynamic stability of double-layered SrTiO3(001) surfaces: ab initio simulations. Phys. Rev. B 75, 115417 (2007).
13. Li, H., Jiao, Y., Davey, K. & Qiao, S.-Z. Data-driven machine learning for understanding surface structures of heterogeneous catalysts. Angew. Chem. Int. Ed. 135, e202216383 (2023).
14. Merte, L. R. et al. Structure of an ultrathin oxide on Pt3Sn(111) solved by machine learning enhanced global optimization. Angew. Chem. Int. Ed. 61, e202204244 (2022).
15. Foiles, S. M., Baskes, M. I. & Daw, M. S. Embedded-atom-method functions for the fcc metals Cu, Ag, Au, Ni, Pd, Pt, and their alloys. Phys. Rev. B 33, 7983–7991 (1986).
16. Nord, J., Albe, K., Erhart, P. & Nordlund, K. Modelling of compound semiconductors: analytical bond-order potential for gallium, nitrogen and gallium nitride. J. Phys. Condensed Matter 15, 5649 (2003).
17. Kolpak, A. M., Li, D., Shao, R., Rappe, A. M. & Bonnell, D. A. Evolution of the structure and thermodynamic stability of the BaTiO3(001) surface. Phys. Rev. Lett. 101, 036102 (2008).
18. Wexler, R. B., Qiu, T. & Rappe, A. M. Automatic prediction of surface phase diagrams using ab initio grand canonical Monte Carlo. J. Phys. Chem. C 123, 2321–2328 (2019).
19. Zhou, X.-F., Oganov, A. R., Shao, X., Zhu, Q. & Wang, H.-T. Unexpected reconstruction of the α-boron (111) surface. Phys. Rev. Lett. 113, 176101 (2014).
20. Timmermann, J. et al. IrO2 surface complexions identified through machine learning and surface investigations. Phys. Rev. Lett. 125, 206101 (2020).
21. Wales, D. J. & Doye, J. P. K. Global optimization by basin-hopping and the lowest energy structures of Lennard–Jones clusters containing up to 110 atoms. J. Phys. Chem. A 101, 5111–5116 (1997).
22. Panosetti, C., Krautgasser, K., Palagin, D., Reuter, K. & Maurer, R. J. Global materials structure search with chemically motivated coordinates. Nano Lett. 15, 8044–8048 (2015).
23. Obersteiner, V., Scherbela, M., Hörmann, L., Wegner, D. & Hofmann, O. T. Structure prediction for surface-induced phases of organic monolayers: overcoming the combinatorial bottleneck. Nano Lett. 17, 4453–4460 (2017).
24. Egger, A. T. et al. Charge transfer into organic thin films: a deeper insight through machine-learning-assisted structure search. Adv. Sci. 7, 2000992 (2020).
25. Bauer, M. N., Probert, M. I. J. & Panosetti, C. Systematic comparison of genetic algorithm and basin hopping approaches to the global optimization of Si(111) surface reconstructions. J. Phys. Chem. A 126, 3043–3056 (2022).
26. Wang, Q., Oganov, A. R., Zhu, Q. & Zhou, X.-F. New reconstructions of the (110) surface of rutile TiO2 predicted by an evolutionary method. Phys. Rev. Lett. 113, 266101 (2014).
27. Schusteritsch, G. & Pickard, C. J. Predicting interface structures: from SrTiO3 to graphene. Phys. Rev. B 90, 035424 (2014).
28. Meldgaard, S. A., Mortensen, H. L., Jørgensen, M. S. & Hammer, B. Structure prediction of surface reconstructions by deep reinforcement learning. J. Phys. Condensed Matter 32, 404005 (2020).
29. Hess, F. & Yildiz, B. Polar or not polar? The interplay between reconstruction, Sr enrichment, and reduction at the La0.75Sr0.25MnO3(001) surface. Phys. Rev. Mater. 4, 015801 (2020).
30. Unke, O. T. et al. Machine learning force fields. Chem. Rev. 121, 10142–10186 (2021).
31. Axelrod, S. et al. Learning matter: materials design with machine learning and atomistic simulations. Acc. Mater. Res. 3, 343–357 (2022).
32. Bisbo, M. K. & Hammer, B. Efficient global structure optimization with a machine-learned surrogate model. Phys. Rev. Lett. 124, 086102 (2020).
33. Bisbo, M. K. & Hammer, B. Global optimization of atomic structure enhanced by machine learning. Phys. Rev. B 105, 245404 (2022).
34. Timmermann, J. et al. Data-efficient iterative training of Gaussian approximation potentials: application to surface structure determination of rutile IrO2 and RuO2. J. Chem. Phys. 155, 244107 (2021).
35. Rønne, N. et al. Atomistic structure search using local surrogate model. J. Chem. Phys. 157, 174115 (2022).
36. Han, Y. et al. Prediction of surface reconstructions using MAGUS. J. Chem. Phys. 158, 174109 (2023).
37. Xu, J., Xie, W., Han, Y. & Hu, P. Atomistic insights into the oxidation of flat and stepped platinum surfaces using large-scale machine learning potential-based grand-canonical Monte Carlo. ACS Catal. 12, 14812–14824 (2022).
38. Bernardin, F. E. & Rutledge, G. C. Semi-grand canonical Monte Carlo (SGMC) simulations to interpret experimental data on processed polymer melts and glasses. Macromolecules 40, 4691–4702 (2007).
39. Damewood, J., Schwalbe-Koda, D. & Gómez-Bombarelli, R. Sampling lattices in semi-grand canonical ensemble with autoregressive machine learning. npj Comput. Mater. 8, 61 (2022).
40. Carrete, J., Montes-Campos, H., Wanzenböck, R., Heid, E. & Madsen, G. K. H. Deep ensembles vs committees for uncertainty estimation in neural-network force fields: comparison and application to active learning. J. Chem. Phys. 158, 204801 (2023).
41. Tan, A. R., Urata, S., Goldman, S., Dietschreit, J. C. B. & Gómez-Bombarelli, R. Single-model uncertainty quantification in neural network potentials does not consistently outperform model ensembles. Preprint at https://arxiv.org/abs/2305.01754 (2023).
42. Schwalbe-Koda, D., Tan, A. R. & Gómez-Bombarelli, R. Differentiable sampling of molecular geometries with uncertainty-based adversarial attacks. Nat. Commun. 12, 5104 (2021).
43. Fu, X. et al. Forces are not enough: benchmark and critical evaluation for machine learning force fields with molecular simulations. Transactions on Machine Learning Research https://openreview.net/forum?id=A8pqQipwkt (2023).
44. Damewood, J. et al. Representations of materials for machine learning. Annu. Rev. Mater. Res. 53, 399–426 (2023).
45. Stephenson, P. C. L., Radny, M. W. & Smith, P. V. A modified Stillinger–Weber potential for modelling silicon surfaces. Surf. Sci. 366, 177–184 (1996).
46. Northrup, J. E., Neugebauer, J., Feenstra, R. M. & Smith, A. R. Structure of GaN(0001): the laterally contracted Ga bilayer model. Phys. Rev. B 61, 9932–9935 (2000).
47. Štich, I., Payne, M. C., King-Smith, R. D., Lin, J.-S. & Clarke, L. J. Ab initio total-energy calculations for extremely large systems: application to the Takayanagi reconstruction of Si(111). Phys. Rev. Lett. 68, 1351–1354 (1992).
48. Smeu, M., Guo, H., Ji, W. & Wolkow, R. A. Electronic properties of Si(111)-7×7 and related reconstructions: density functional theory calculations. Phys. Rev. B 85, 195315 (2012).
49. Herger, R. et al. Surface of strontium titanate. Phys. Rev. Lett. 98, 076102 (2007).
50. Hong, C. et al. Anomalous intense coherent secondary photoemission from a perovskite oxide. Nature 617, 493–498 (2023).
51. Szot, K. & Speier, W. Surfaces of reduced and oxidized SrTiO3 from atomic force microscopy. Phys. Rev. B 60, 5909–5926 (1999).
52. Kubo, T. & Nozoye, H. Surface structure of SrTiO3(100). Surf. Sci. 542, 177–191 (2003).
53. Winter, G. & Gómez-Bombarelli, R. Simulations with machine learning potentials identify the ion conduction mechanism mediating non-Arrhenius behavior in LGPS. J. Phys. Energy 5, 024004 (2023).
54. Millan, R., Bello-Jurado, E., Moliner, M., Boronat, M. & Gomez-Bombarelli, R. Effect of framework composition and NH3 on the diffusion of Cu+ in Cu-CHA catalysts predicted by machine-learning accelerated molecular dynamics. ACS Cent. Sci. 9, 2044–2056 (2023).
55. Thompson, A. P. et al. LAMMPS—a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales. Comput. Phys. Commun. 271, 108171 (2022).
56. Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condensed Matter 29, 273002 (2017).
57. Boes, J. R., Mamun, O., Winther, K. & Bligaard, T. Graph theory approach to high-throughput surface adsorption structure generation. J. Phys. Chem. A 123, 2281–2285 (2019).
58. Ong, S. P. et al. Python Materials Genomics (pymatgen): a robust, open-source python library for materials analysis. Comput. Mater. Sci. 68, 314–319 (2013).
59. Momma, K. & Izumi, F. VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data. J. Appl. Crystallogr. 44, 1272–1276 (2011).
60. Jain, A. et al. The Materials Project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).
61. Schütt, K., Unke, O. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In Proc. 38th International Conference on Machine Learning, Proc. Machine Learning Research Vol. 139 (eds Meila, M. & Zhang, T.) 9377–9388 (PMLR, 2021).
62. Martinez-Cantin, R., Tee, K. & McCourt, M. Practical Bayesian optimization in the presence of outliers. In Proc. Twenty-First International Conference on Artificial Intelligence and Statistics, Proc. Machine Learning Research Vol. 84 (eds Storkey, A. & Perez-Cruz, F.) 1722–1731 (PMLR, 2018).
63. Ramachandran, P., Zoph, B. & Le, Q. V. Searching for activation functions. Preprint at https://arxiv.org/abs/1710.05941 (2017).
64. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations, ICLR 2015 (eds Bengio, Y. & LeCun, Y.) (2015).
65. Gasteiger, J., Giri, S., Margraf, J. T. & Günnemann, S. Fast and uncertainty-aware directional message passing for non-equilibrium molecules. Machine Learning for Molecules Workshop, NeurIPS 2020 https://ml4molecules.github.io/papers2020/ML4Molecules_2020_paper_35.pdf (2020).
66. Reuter, K. & Scheffler, M. Composition, structure, and stability of RuO2(110) as a function of oxygen pressure. Phys. Rev. B 65, 035406 (2001).
67. Heifets, E., Ho, J. & Merinov, B. Density functional simulation of the BaZrO3(011) surface structure. Phys. Rev. B 75, 155431 (2007).
68. Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996).
69. Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758–1775 (1999).
70. Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
71. Tadmor, E. B., Elliott, R. S., Sethna, J. P., Miller, R. E. & Becker, C. A. The potential of atomistic simulations and the knowledgebase of interatomic models. JOM 63, 17 (2011).
72. Du, X. Data for: Machine-learning-accelerated simulations to enable automatic surface reconstruction. Zenodo https://doi.org/10.5281/zenodo.7758174 (2023).
73. Du, X. learningmatter-mit/surface-sampling. Zenodo https://doi.org/10.5281/zenodo.10086398 (2023).
Acknowledgements
We thank G. Winter, J. Peng, N. Frey and M. Liu for helpful discussions. We also appreciate editing by J. Peng and A. Hoffman. X.D. acknowledges support from the National Science Foundation Graduate Research Fellowship under grant no. 2141064. J.K.D. was supported by the Department of Defense through the National Defense Science and Engineering Graduate Fellowship Program. We are grateful for computation time allocated on the MIT SuperCloud cluster, the MIT Engaging cluster and the NERSC Perlmutter cluster. This material is based on work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering. Delivered to the US Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (February 2014). Notwithstanding any copyright notice, US Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the US Government may violate any copyrights that exist in this work.
Author information
Authors and Affiliations
Contributions
X.D. implemented the sampling algorithm, performed surface modeling, ran DFT calculations, trained the neural networks and carried out surface stability analysis. J.K.D. assisted with sampling algorithm implementation and provided guidance with surface modeling. J.R.L. provided guidance with surface modeling and ran DFT calculations. R.M. provided guidance with neural network training and active learning. B.Y. provided guidance with the choice of surfaces and surface stability analysis. L.L. supervised the research and contributed to securing funding. R.G.-B. conceived the project, supervised the research and contributed to securing funding. All authors contributed to results discussion and paper writing.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Mie Andersen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Top view of additional GaN(0001) MC-sampled structures.
The surface reconstructions are rotated in comparison with the reference structure from ref. 46 but contain the same hexagonal pattern.
Extended Data Fig. 2 Comparing classical potential and DFT energies of Si(111) sampled surface reconstructions.
a–c, Structures shown were obtained from constant-composition (canonical) VSSR-MC sampling using the SRS modified Stillinger–Weber potential (ref. 45) with 3×3 (a), 5×5 (b) and 7×7 (c) unit cells. The SRS energies were obtained from the depicted structures while the DFT energies came from structures further relaxed at the DFT level. * Further relaxation using DFT resulted in the 3×3 DAS structure.
Extended Data Fig. 3 Correlation plot of force MAE with force s.d. over AL generations.
At each AL generation, an ensemble of just three NFF models was able to estimate force s.d. that correlated strongly with force error. Each individual data point represents a sampled structure. Each blue ‘X’ represents a binned average and a best-fit line is drawn through the binned averages. The binned average is calculated by dividing both the force s.d. and force MAE into equal-sized bins. The average force MAE is then plotted against the median force s.d. for each corresponding bin.
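The binning described in this caption can be sketched as follows. This is one possible reading of the caption, not the authors' analysis code: structures are sorted by force s.d., split into bins holding (nearly) equal numbers of structures, and each bin is reduced to its median force s.d. and mean force MAE. The function names, the flat-list input layout and the use of the population standard deviation across ensemble members are assumptions made for illustration.

```python
from statistics import fmean, median, pstdev


def ensemble_force_sd(forces_per_model):
    """forces_per_model: one flat list of force components per model, all of
    the same length. Returns the s.d. across models, averaged over components
    (population s.d. is an assumption of this sketch)."""
    per_component = zip(*forces_per_model)
    return fmean(pstdev(values) for values in per_component)


def binned_average(force_sd, force_mae, n_bins):
    """Sort structures by force s.d., split them into bins of near-equal size
    and return a list of (median s.d., mean MAE) points, one per bin."""
    pairs = sorted(zip(force_sd, force_mae))
    size, extra = divmod(len(pairs), n_bins)
    points, start = [], 0
    for i in range(n_bins):
        # The first `extra` bins take one additional structure each.
        stop = start + size + (1 if i < extra else 0)
        chunk = pairs[start:stop]
        start = stop
        if chunk:  # skip empty bins when n_bins exceeds the number of points
            points.append(
                (median(sd for sd, _ in chunk), fmean(mae for _, mae in chunk))
            )
    return points


if __name__ == "__main__":
    # Three hypothetical ensemble members predicting two force components.
    print(ensemble_force_sd([[0.0, 1.0], [0.2, 1.1], [0.1, 0.9]]))
    print(binned_average([1, 2, 3, 4], [10, 20, 30, 40], n_bins=2))
```

A best-fit line through the returned points then summarizes how well the ensemble s.d. tracks the force error.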
Extended Data Fig. 4 Force distribution over AL generations.
The majority of high-force structures were added in AL generations 1, 2 and 6, which correspond either to random structures or structures obtained through adversarial attack. The three VSSR-MC AL generations produced structures with low force values mostly around 50 eV Å⁻¹ or less.
Extended Data Fig. 5 Test performance of the best NFF model.
As described in the main paper, the test data is obtained from VSSR-MC runs using the sixth-generation NFF model.
Extended Data Fig. 6 Strengths and limitations of VSSR-MC.
a,b, Comparison of limited fixed on-lattice sites (a) and denser algorithmically-generated virtual surface sites that can overlap (b). c, Off-lattice reconstructions can be obtained following VSSR-MC discrete sampling at virtual sites and continuous relaxation of surface atoms and adsorbates. d, Amorphous reconstructions with many local minima, however, will likely be difficult for VSSR-MC to sample.
Extended Data Fig. 7 Side view of virtual sites for surfaces studied in this work.
a–d, Pymatgen (a) and CatKit (b) virtual sites for GaN(0001) against the contracted Ga monolayer reconstruction, two-layer pymatgen sites for Si(111) against the 5×5 DAS reconstruction (c), and pymatgen virtual sites for SrTiO3(001) against the double-layer TiO2 reconstruction (d). The dashed lines are a guide for the eye.
Extended Data Fig. 8 Visualizations in the latent space.
a, Clustering of VSSR-MC structures in the NFF latent space visualized in the first three principal components. In the VSSR-MC with clustering AL method, the surface from each cluster with the highest force s.d. is selected for DFT evaluation. b, PCA of training data and the dominant terminations (term.) in the latent space of the sixth-generation model.
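The selection rule in panel a (one structure per cluster, chosen by highest force s.d.) can be sketched as below. The cluster labels are assumed to come from any clustering of the latent-space embeddings; the function name and the example values are hypothetical and not taken from the paper's code.

```python
def select_per_cluster(cluster_labels, force_sd):
    """Return {cluster label: index of the structure with the highest force
    s.d. in that cluster}. Ties keep the first structure encountered."""
    best = {}
    for idx, (label, sd) in enumerate(zip(cluster_labels, force_sd)):
        if label not in best or sd > force_sd[best[label]]:
            best[label] = idx
    return dict(sorted(best.items()))


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 1, 2]                 # hypothetical cluster labels
    sds = [0.10, 0.40, 0.30, 0.20, 0.50, 0.05]  # hypothetical force s.d.
    print(select_per_cluster(labels, sds))      # {0: 1, 1: 4, 2: 5}
```

Picking one structure per cluster keeps the batch sent for DFT evaluation diverse, while the s.d. criterion favors the structures the ensemble is least certain about.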
Supplementary information
Supplementary Information
Supplementary Sections: (1) abbreviations used; and (2) surface stability analysis.
Peer Review File
Supplementary Data 1
Comparison of AutoSurfRecon with existing computational methods for surface reconstruction. AutoSurfRecon automatically samples across many surface compositions and configurations while training an accurate NFF for low-cost energy prediction.
Source data
Source Data Fig. 3
Statistical source data: Typical GaN(0001) VSSR-MC run profile.
Source Data Fig. 4
Statistical source data: (b) force error and predicted force s.d. for the sixth-generation model; (c) latent space embedding PCA of surfaces acquired at each AL generation; (d) force and energy predictions of the model at each AL generation on the final test set.
Source Data Fig. 5
Statistical source data: (b) predicted surface free energies for each dominant termination across Sr and O chemical potentials; (c–e) predicted surface free energies of sampled structures at Sr chemical potentials of −10, −7 and −4 eV and O chemical potential of 0 eV.
Source Data Extended Data Fig. 3
Statistical source data: force error and predicted force s.d. over six AL generations.
Source Data Extended Data Fig. 4
Statistical source data: distribution of force magnitudes over six AL generations.
Source Data Extended Data Fig. 5
Statistical source data: predictions of the sixth-generation AL model on final test data.
Source Data Extended Data Fig. 8
Statistical source data: (a) PCA of test data in the latent space of the sixth-generation model; (b) PCA of the sixth-generation training data and dominant terminations in the latent space of the sixth-generation model.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Du, X., Damewood, J.K., Lunger, J.R. et al. Machine-learning-accelerated simulations to enable automatic surface reconstruction. Nat Comput Sci (2023). https://doi.org/10.1038/s43588-023-00571-7
- Received: 11 May 2023
- Accepted: 13 November 2023
- Published: 07 December 2023
- DOI: https://doi.org/10.1038/s43588-023-00571-7