Maximilian Bremer

University of Texas at Austin

When Max Bremer arrived at Lawrence Berkeley National Laboratory (LBNL) for his Department of Energy Computational Science Graduate Fellowship (DOE CSGF) practicum, he didn’t realize he already had the seed of a perfect research project. He had earned a bachelor’s degree in aerospace engineering and had taken steps toward computationally modeling hurricane storm surges. But at LBNL, he learned he could make a unique contribution to computational science.

“This hurricane simulation stuff just fell into my lap,” he says. His work doesn’t center on writing new modeling codes but on revising them to boost efficiency on future high-performance computing (HPC) architectures. “It’s a really good problem, and I find it really interesting. So I just kind of stuck with it.”

When a hurricane threatens, local officials must make rapid evacuation decisions based, in part, on simulations from an HPC code called Advanced Circulation (ADCIRC).

“They are life-or-death decisions,” says Clint Dawson, Bremer’s University of Texas at Austin advisor. “For example, if they decide to close a road, and the road didn’t need to be closed, then that could slow down evacuations by several days.”

Bremer joined Dawson’s group, which has helped develop ADCIRC, as an undergraduate working on DGSWEM, an experimental code used to test concepts for incorporation into the main code. After graduating, Bremer went to the University of Cambridge to delve deeper into pure mathematics. He later rejoined Dawson’s group.

When he began his 2016 summer practicum with Cy Chan in LBNL’s Computer Architecture Group, Bremer was interested in the unit’s emphasis on HPC optimization, especially task-based parallelism – a way to improve simulations’ efficiency. “There’s only so much smaller you can make these computer chips,” Bremer says. To accelerate computing, “you just have more computers” – additional processors.

That means greater parallelism – breaking big problems into discrete tasks and doling them out to individual processors, which solve the separate pieces simultaneously. But because some jobs are bigger than others, processors can sit idle, waiting as every task is completed before they all move on to the next set. These brief idle times add up to significant lost compute time and prevent researchers from drawing on a machine’s full capability. In task-based parallelism, each task starts as soon as the data it depends on is ready, rather than waiting at a global synchronization point, reducing the wait between jobs.
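The difference is easy to sketch in a few lines of C++, the language Bremer worked in at LBNL. The example below is purely illustrative rather than DGSWEM code: the task costs are invented, and std::async stands in for a real task runtime. Uneven tasks launch concurrently, and work that depends on a fast task can proceed without waiting at a barrier for the slowest one.

```cpp
// Minimal task-parallelism sketch (illustrative only; not DGSWEM code).
#include <chrono>
#include <cstdio>
#include <future>
#include <thread>
#include <vector>

// Stand-in for a simulation kernel whose cost varies from task to task.
int work(int task_id, int cost_ms) {
    std::this_thread::sleep_for(std::chrono::milliseconds(cost_ms));
    return task_id;
}

int main() {
    // Hypothetical, uneven workloads; in a bulk-synchronous code every
    // processor would wait for the 200 ms task before moving on.
    int costs[] = {50, 200, 10, 120};
    std::vector<std::future<int>> tasks;
    for (int id = 0; id < 4; ++id)
        tasks.push_back(std::async(std::launch::async, work, id, costs[id]));

    // Consume results as each task finishes: work dependent on task 0 can
    // start after ~50 ms even though task 1 is still running.
    for (auto& t : tasks) {
        int id = t.get();
        std::printf("task %d finished; dependent work can start\n", id);
    }
    return 0;
}
```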

At Berkeley Lab, Bremer set out to learn task-based parallelism and load balancing, another optimization method. He planned to take the methods to Dawson’s group and apply them to DGSWEM and, perhaps someday, to ADCIRC and other codes.

ADCIRC is a mature code, Bremer says. “It’s probably hard to move the bar on what’s been done in the mathematical or algorithmic space,” he says. “Rather than come up with a new mathematical model, what if we can just use the machines better?”

Chan and his co-workers helped Bremer tackle another inefficiency in hurricane storm surge simulations. At the outset, dry areas demand no computational work. But some soon require significant processing as they become inundated. Because no one can predict precisely which dry areas will suddenly demand more resources, the computer’s workload becomes imbalanced. “To achieve efficient utilization of the machine, you need to move these patches around on the fly,” Bremer says. “That’s load balancing.”
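A toy version of that idea can be written in a few lines; it is not the scheme Bremer implemented, and the Patch type, rebalance function, and costs below are hypothetical. It sorts patches by estimated cost and greedily hands each to the least-loaded processor, so a patch that suddenly floods shifts to whichever rank has spare capacity.

```cpp
// Greedy load-balancing sketch (illustrative only; names and costs invented).
#include <algorithm>
#include <cstdio>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

struct Patch { int id; double cost; };  // cost grows when a dry patch floods

// Assign the most expensive remaining patch to the least-loaded rank.
std::vector<std::vector<Patch>> rebalance(std::vector<Patch> patches, int ranks) {
    std::sort(patches.begin(), patches.end(),
              [](const Patch& a, const Patch& b) { return a.cost > b.cost; });
    using Load = std::pair<double, int>;  // (current load, rank index)
    std::priority_queue<Load, std::vector<Load>, std::greater<Load>> heap;
    for (int r = 0; r < ranks; ++r) heap.push({0.0, r});

    std::vector<std::vector<Patch>> assignment(ranks);
    for (const Patch& p : patches) {
        auto [load, r] = heap.top();  // least-loaded rank so far
        heap.pop();
        assignment[r].push_back(p);
        heap.push({load + p.cost, r});
    }
    return assignment;
}

int main() {
    // Hypothetical costs: patch 2 just became inundated and expensive.
    std::vector<Patch> patches = {{0, 1.0}, {1, 1.2}, {2, 9.0}, {3, 0.8}};
    auto plan = rebalance(patches, 2);
    for (int r = 0; r < 2; ++r)
        for (const Patch& p : plan[r])
            std::printf("rank %d <- patch %d (cost %.1f)\n", r, p.id, p.cost);
    return 0;
}
```

Production schemes also weigh the cost of migrating a patch against the imbalance it removes, but the greedy heuristic captures the core move: track per-patch cost and reassign work whenever the load profile shifts.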

During the practicum, Bremer learned C++, then used it to implement task-based parallelism and load balancing. Before incorporating the methods directly into DGSWEM, Bremer created DGSim, a skeletonized version of the program, essentially writing a simulator for the simulator. “It allowed us to use a lot less of the machine. So I can run the simulation on my laptop whereas normally I would need thousands of cores to do it.”
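The skeletonization idea can be suggested in miniature; the interfaces below are hypothetical and not DGSim’s actual design. The skeleton keeps the mesh, the stepping structure, and the wet/dry load pattern, but swaps the expensive shallow-water solve for a cost model that merely charges time to a virtual clock, which is why the proxy can run on a laptop.

```cpp
// Skeleton-app sketch (hypothetical interfaces; not DGSim's actual design).
#include <cstdio>
#include <vector>

struct Element { bool wet; double cost_estimate; };

// The real code would solve shallow-water equations here; the skeleton only
// charges simulated time according to a cost model.
void skeleton_step(const Element& e, double& virtual_clock) {
    double cost = e.wet ? e.cost_estimate : 0.05;  // dry elements are cheap
    virtual_clock += cost;  // account for the work without doing it
}

int main() {
    std::vector<Element> mesh = {{true, 1.0}, {false, 1.0}, {true, 2.5}};
    double virtual_clock = 0.0;
    for (int step = 0; step < 3; ++step)
        for (const Element& e : mesh)
            skeleton_step(e, virtual_clock);
    std::printf("simulated cost: %.2f units\n", virtual_clock);
    return 0;
}
```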

The group validated DGSim on Edison, a Cray XC30 supercomputer at LBNL’s NERSC (National Energy Research Scientific Computing Center), Chan says. “We were able to reduce the number of time steps calculated but still capture the same overall dynamic load profile of the hurricane.” DGSim saved more than 5,000 core-hours compared to running a DGSWEM simulation, and the new algorithms improved hurricane-simulation performance by more than 50 percent. Bremer, Chan and their colleague John Bachan presented the results at SC18, the international supercomputing conference in Dallas.

Bremer served another practicum with Chan’s group in the summer of 2019, exploring how to make time-stepping not only asynchronous but also locally determined, with each element in a simulation taking its cues from neighboring elements.
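A minimal sketch of that kind of locally determined stepping, assuming a simple rule that an element may advance only when it is not ahead of its neighbors (the actual admissibility condition in Bremer’s work may differ, and the step sizes here are invented), might look like this:

```cpp
// Local time-stepping sketch (illustrative rule, not the actual scheme).
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    // A 1-D chain of elements; each has a local clock and local step size.
    std::vector<double> t  = {0.0, 0.0, 0.0, 0.0};
    std::vector<double> dt = {0.1, 0.05, 0.2, 0.1};  // hypothetical limits

    double t_end = 0.4;
    bool advanced = true;
    while (advanced) {
        advanced = false;
        for (std::size_t i = 0; i < t.size(); ++i) {
            double left  = (i == 0) ? t_end : t[i - 1];
            double right = (i + 1 == t.size()) ? t_end : t[i + 1];
            // Advance only if the neighbors have caught up to our clock,
            // so each update is cued locally rather than by a global step.
            if (t[i] < t_end && t[i] <= left && t[i] <= right) {
                t[i] += dt[i];  // element-local update
                advanced = true;
            }
        }
    }
    std::printf("all elements reached t = %.1f asynchronously\n", t_end);
    return 0;
}
```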

Image caption: Forecasted maximum water level at the start of a two-day span for 2017’s Hurricane Harvey on the Middle Texas Coast, based on a National Hurricane Center advisory and generated using the ADCIRC+SWAN Surge Guidance System. The water level color scale is on the right; latitude and longitude are along the left and bottom. The city of Houston is at the top. The computational model uses advisories as inputs to simulate how the storm affects the ocean and to predict flooding. Credit: UT Austin Computational Hydraulics Research Group.

Read the entire article in DEIXIS, the DOE CSGF annual. [PDF, pages 16-18]