Scale-Bridging Computational Materials Science: Heterogeneous Algorithms for Heterogeneous Platforms
Los Alamos National Laboratory
Computational materials scientists have been among the earliest and heaviest users of leadership-class supercomputers. The codes and algorithms that have been developed span a wide range of physical scales, and have been useful not only for gaining scientific insight, but also as testbeds for exploring new approaches to tackling evolving challenges, including massive (nearly million-way) concurrency, an increased need for fault and power management, and data bottlenecks. Multiscale, or scale-bridging, techniques are attractive from both materials science and computational perspectives, particularly as we look ahead from the current petascale era towards the exascale platforms expected to be deployed by the end of this decade. In particular, the increasingly heterogeneous and hierarchical nature of computer architectures demands that algorithms, programming models, and tools mirror these characteristics if they are to thrive in this environment. Following a review of the current state of the art in petascale single-scale materials science application codes, and in scale-bridging algorithms, I will describe our UQ-driven adaptive physics refinement scale-bridging strategy for modeling materials at extreme mechanical and irradiation conditions, in which coarse-scale simulations spawn sub-scale direct numerical simulations as needed. This task-based MPMD approach leverages the extensive concurrency and heterogeneity expected at exascale, while enabling novel data models, power management, and fault tolerance strategies within applications. The programming models and runtime task/resource management and data sharing systems required to support such an approach will also enable in situ visualization and analysis, thus alleviating much of the I/O burden.
The concept of computational "co-design" is essential for developing these algorithms and the programming models and runtime systems (middleware) needed to execute them, and for influencing potential architecture design choices for future exascale systems.