Best Practices for Fine-Tuning Machine Learning Interatomic Potentials for Catalysis

Presenter: Veena Chauhan
University: Texas Tech University
Program: LRGF
Year: 2026

Machine learning interatomic potentials (MLIPs) now approach the accuracy of density functional theory (DFT) at orders-of-magnitude lower cost, making them increasingly viable as direct replacements for DFT in catalysis workflows. Realizing this promise for a specific system, however, typically requires fine-tuning on initial-state-like and final-state-like structures drawn from that system. Despite its widespread use, fine-tuning remains more art than science. Key open questions include (i) how to generate training data efficiently, (ii) how well a fine-tuned model generalizes beyond its training distribution and whether it suffers catastrophic forgetting, and (iii) which hyperparameter choices actually matter.
This work outlines what we have determined to be best practices for fine-tuning MACE-MH-1, along with factors to consider when fine-tuning other MLIPs. We examine several strategies for generating fine-tuning data, including geometry optimization, machine-learning molecular dynamics (ML-MD) followed by DFT single-point calculations, and ab initio molecular dynamics (AIMD) trajectories. We then assess how each data-generation method performs on tasks relevant to catalysis modeling, such as geometry optimization and canonical-ensemble (NVT) molecular dynamics simulations.
In addition to data-generation methods, we evaluate how the balance between initial-state-like and final-state-like training data affects downstream quantities such as reaction energies, given the possibility of forgetting during fine-tuning. In particular, we show that energy differences computed with MLIPs can exhibit mean absolute errors (MAEs) substantially different from the MAE of either the initial- or final-state training frames alone. To evaluate these approaches, we consider two case studies: methane dehydrogenation on Ni supported on La₂O₃, and N₂ dissociation on a stepped Ru surface. Finally, we examine hyperparameter optimization for MACE-MH-1, evaluating the number of epochs, learning rate, learning-rate decay, and patience. We find that the default hyperparameters are not optimal and that tuning them yields meaningful improvements in the fine-tuned MLIP.
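The point about energy differences can be illustrated with a small toy calculation. The sketch below is not the analysis used in this work: the error values are hypothetical, and the helper names `mae` and `reaction_energy_errors` are introduced only for illustration. It assumes each initial-state frame is paired with a final-state frame and that errors are signed (model minus DFT), so the error in a reaction energy is the final-state error minus the initial-state error.

```python
from statistics import mean


def mae(errors):
    """Mean absolute error of a list of signed errors (model minus reference)."""
    return mean(abs(e) for e in errors)


def reaction_energy_errors(is_errors, fs_errors):
    """Signed error in E_FS - E_IS for paired initial- and final-state frames."""
    return [fs - is_ for is_, fs in zip(is_errors, fs_errors)]


# Hypothetical signed energy errors in eV (model minus DFT); NOT data from this work.
is_errors = [0.10, -0.10, 0.10, -0.10]

# Case A: final-state errors track the initial-state errors, so they cancel
# when the difference is taken.
fs_correlated = [0.10, -0.10, 0.10, -0.10]

# Case B: final-state errors have the opposite sign, so they add up
# when the difference is taken.
fs_anticorrelated = [-0.10, 0.10, -0.10, 0.10]

for label, fs_errors in [
    ("correlated", fs_correlated),
    ("anticorrelated", fs_anticorrelated),
]:
    diff_errors = reaction_energy_errors(is_errors, fs_errors)
    print(
        f"{label}: IS MAE = {mae(is_errors):.2f} eV, "
        f"FS MAE = {mae(fs_errors):.2f} eV, "
        f"reaction-energy MAE = {mae(diff_errors):.2f} eV"
    )
```

In both toy cases the per-state MAE is 0.10 eV, yet the reaction-energy MAE is 0.00 eV when the errors are correlated and 0.20 eV when they are anticorrelated. The per-state MAEs alone therefore do not determine the error in a reaction energy; the correlation between initial- and final-state errors matters as well.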