High-performance stencil computations on GPUs

Samuel Skillman, University of Colorado at Boulder

The numerical solution of many partial differential equations in scientific applications relies on stencil calculations. As such, there is great interest in the design and implementation of high-performance stencil calculations. At the same time, the supercomputing landscape is undergoing a drastic change with the shift to many-core CPUs and the proliferation of hardware accelerators such as graphics processing units (GPUs). To take full advantage of these advances in hardware technology, many scientific applications require a rewrite of their core computations. Here we present the development of a number of high-performance stencil calculations designed to run on GPUs. Using the CUDA programming model, we review the optimizations needed to exploit the memory hierarchy of the underlying hardware and outline many of the pitfalls associated with the requirements of stencil computations.
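As an illustration of the kind of memory-hierarchy optimization referred to above, the following is a minimal sketch, not the implementation presented in this work. It assumes a 1D three-point stencil, out[i] = in[i-1] - 2*in[i] + in[i+1], with boundary values copied through unchanged. Each thread block stages its portion of the input, plus a one-element halo on each side, in shared memory so that neighboring threads reuse loaded values instead of re-reading global memory. The names stencil1d_shared, run_gpu_stencil, cpu_stencil and max_abs_diff, and the values of RADIUS and BLOCK, are hypothetical choices made for this example.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int RADIUS = 1;    // assumed: three-point stencil
constexpr int BLOCK  = 256;  // assumed: threads per block (a tuning parameter)

// The kernel must be launched with blockDim.x == BLOCK.
__global__ void stencil1d_shared(const float* in, float* out, int n)
{
    __shared__ float tile[BLOCK + 2 * RADIUS];

    int g = blockIdx.x * blockDim.x + threadIdx.x;  // global index
    int l = threadIdx.x + RADIUS;                   // index into the tile

    // Each thread stages its own element; out-of-range slots are zero-filled.
    tile[l] = (g < n) ? in[g] : 0.0f;

    // The first RADIUS threads of the block also load the left and right halos.
    if (threadIdx.x < RADIUS) {
        int gl = g - RADIUS;
        int gr = g + BLOCK;
        tile[l - RADIUS] = (gl >= 0) ? in[gl] : 0.0f;
        tile[l + BLOCK]  = (gr < n)  ? in[gr] : 0.0f;
    }
    __syncthreads();  // the tile must be complete before any thread reads neighbors

    if (g >= n) return;
    if (g < RADIUS || g >= n - RADIUS) { out[g] = tile[l]; return; }  // boundary: copy
    out[g] = tile[l - 1] - 2.0f * tile[l] + tile[l + 1];
}

// Host wrapper: copy in, launch, copy out. Error checking is omitted for brevity.
std::vector<float> run_gpu_stencil(const std::vector<float>& h_in)
{
    int n = static_cast<int>(h_in.size());
    std::vector<float> h_out(n);
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    int blocks = (n + BLOCK - 1) / BLOCK;
    stencil1d_shared<<<blocks, BLOCK>>>(d_in, d_out, n);
    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}

// CPU reference with the same boundary treatment, used only for checking.
std::vector<float> cpu_stencil(const std::vector<float>& in)
{
    int n = static_cast<int>(in.size());
    std::vector<float> out(in);
    for (int i = RADIUS; i < n - RADIUS; ++i)
        out[i] = in[i - 1] - 2.0f * in[i] + in[i + 1];
    return out;
}

float max_abs_diff(const std::vector<float>& a, const std::vector<float>& b)
{
    float m = 0.0f;
    for (size_t i = 0; i < a.size(); ++i) m = std::fmax(m, std::fabs(a[i] - b[i]));
    return m;
}

int main()
{
    const int n = 1000;  // deliberately not a multiple of BLOCK
    std::vector<float> in(n);
    for (int i = 0; i < n; ++i) in[i] = std::sin(0.01f * i);
    float err = max_abs_diff(run_gpu_stencil(in), cpu_stencil(in));
    std::printf("max abs difference vs CPU reference: %g\n", err);
    return err < 1e-6f ? 0 : 1;
}
```

With a radius of one the saving from shared memory is modest, since caches already capture much of the reuse. The same tiling pattern is expected to matter more for wider stencils and for 2D or 3D grids, where each input value is read by many more threads and the halo handling becomes one of the main sources of complexity.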

Abstract Author(s): Samuel W. Skillman, Wayne Joubert