### Argonne National Laboratory

Guiding the Alternating Direction Method of Multipliers with Filters

Robert Baraldi, University of Washington

**Practicum Year:**2020

**Practicum Supervisor:**Sven Leyffer, Project Leader/Sr. Computational Mathematician, Mathematics and Computer Science, Argonne National Laboratory

Large, nonconvex, and nonlinear problems arise naturally in physical inverse problems and machine learning applications. Examples include the nonsmooth regularization of generalized lasso, low-rank regularized matrix completion, sparse convolutional neural networks, and physics-based, PDE-constrained optimization. This practicum will focus on developing a novel method for problems with these characteristics and then exploring the descent and convergence properties of this algorithm in the context of the aforementioned settings. In typical inverse problems, one minimizes some cost/data misfit function to find model parameters that effectively capture the collected data. The issue is that cost functions for physics and machine learning applications are challenging for many standard minimization algorithms, which often make convexity and smoothness assumptions about the problem. While progress has been made in adapting existing algorithms to these difficult settings, the issues of computational cost and convergence guarantees remain. One promising algorithm is the Alternating Direction Method of Multipliers (ADMM). ADMM is one of the workhorses for many machine learning and common optimization settings, and is considered a useful alternative to stochastic gradient descent (SGD) in nonlinear and nonconvex realms with large computational cost. It works primarily by splitting a difficult cost function into simpler parts and solving these simpler problems iteratively until a convergence criterion is met. However, for most nonconvex and nonlinear problems these methods still lack global convergence guarantees and tend to converge slowly even towards local minima. Hence, solutions given by ADMM and SGD can come burdened with a lot of uncertainty. In addition, slow convergence to even local minima increases the computational cost of these algorithms, since each additional iteration requires another run of an already expensive model.
At ANL, we plan to study these issues through the lens of filter methods for ADMM. Filter methods have proved particularly promising for enforcing convergence in nonlinear programs. They have been shown to increase performance for sequential quadratic programming and interior point methods, as well as to provide first-order convergence guarantees for a variety of function types and constraints. The goal of this practicum is to develop an ADMM filter method for highly nonlinear and nonconvex problems, and then to implement this method numerically on sparsity-enforcing cost functions (i.e., p-norms for 0
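
As a rough illustration of the splitting idea described above, the sketch below applies plain ADMM to a small convex toy problem: minimize 0.5*||x - b||^2 + lam*||z||_1 subject to x = z. It does not include the filter mechanism that is the subject of the practicum. The toy problem, the function names, and the values of `lam`, `rho`, and `iters` are assumptions chosen only for illustration.

```python
def soft_threshold(v, kappa):
    """Proximal operator of kappa * |.| applied to a scalar."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def admm_l1_denoise(b, lam=1.0, rho=1.0, iters=200):
    """Scaled-form ADMM for min 0.5*||x-b||^2 + lam*||z||_1 s.t. x = z."""
    n = len(b)
    x = [0.0] * n
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: minimize the smooth quadratic part (closed form)
        x = [(b[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update: minimize the nonsmooth l1 part (soft-thresholding)
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # dual update: accumulate the constraint violation x - z
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z
```

For this separable toy problem the exact minimizer is the soft-thresholding of `b` by `lam`, so `admm_l1_denoise([3.0, -0.5, 1.5])` should approach `[2.0, 0.0, 0.5]`. Each subproblem has a closed form here, which is the property that makes the splitting attractive; nonconvex penalties would change the z-update and void the convergence guarantee.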

Machine Learning/Artificial Intelligence for Molecular Dynamics simulations

Anda Trifan, University of Illinois at Urbana-Champaign

**Practicum Year:**2020

**Practicum Supervisor:**Arvind Ramanathan, Computational Biologist, Data Science & Learning Division, Argonne National Laboratory

The algorithms I learned during my practicum were applied to advance science, especially as related to the current COVID-19 pandemic. We applied deep learning to identify conformational changes in the spike protein of the COVID-19 virion and to identify drug targets for other COVID-19 targets such as PLPro. This work was submitted for consideration for the Gordon Bell Prize.

Neutrinos in HACC - Reducing noise in cosmological N-body simulations with neutrino particles

James Sullivan, University of California, Berkeley

**Practicum Year:**2019

**Practicum Supervisor:**Salman Habib, Group Leader/Senior Physicist, High Energy Physics, Argonne National Laboratory

The project was to implement and test a popular new scheme for generating initial conditions for massively parallel N-body simulations with neutrinos. Specifically, I looked into various non-physical numerical effects that could contaminate science results. This work was done using HACC (Hardware/Hybrid Accelerated Cosmology Code), which is highly parallelizable and scales well across many nodes.

Computational and Experimental Studies of Liquid Structure

Malia Wenny, Harvard University

**Practicum Year:**2019

**Practicum Supervisor:**Byeongdu Lee, Physicist, X-Ray Science Division, Argonne National Laboratory

In my practicum, I aimed to explore techniques for studying the liquid structure of materials both computationally and experimentally. In the experimental portion of my practicum, I used small-angle and wide-angle x-ray scattering to study nanometer-scale ordering in a variety of liquid samples, including nanoparticles dispersed in solution. I also used x-ray absorption fine structure spectroscopy to study the local coordination environment of metals in materials in both the solid and liquid states. In the computational portion of my project, I attempted to study the liquid structure of metal-salt hydrates using ab initio molecular dynamics in order to study the interactions of metal cations and water molecules in the liquid state.

Optimization under uncertainty using chance constraints

Morgan Kelley, University of Texas at Austin

**Practicum Year:**2018

**Practicum Supervisor:**Sven Leyffer, Mathematics and Computer Science Division, Argonne National Laboratory

Optimization under uncertainty involves making optimization decisions based on parameters that are not fully known at the time of solution, i.e., parameters that are described or predicted by a distribution. Determining how to account for this uncertainty in an optimization problem is important for finding a solution that is robust but not overly conservative. Chance-constrained methods are considered a key approach to solving optimization problems under uncertainty: such methods ensure that the probability of meeting a constraint in an optimization problem is above a certain level.
This project seeks to develop and compare various formulations for integrating chance constraints into nonlinear optimization problems of both convex and non-convex varieties. Formulations examined include big-M, perspective, slack method, vanishing constraints, and mixed-integer methods. Each chance constraint formulation is applied to a set of test problems, and the success of each is quantified by solution time, with connections drawn between solution time and the number of variables and constraints introduced by each method.
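
To make the notion of a chance constraint concrete, the sketch below estimates by Monte Carlo sampling the probability that a constraint g(x, xi) <= 0 holds for a fixed decision x and a random parameter xi, and compares it with a required level. This is only a feasibility check on a given x, not one of the formulations compared in the project; the function names, the sample size, and the example constraint are assumptions for illustration.

```python
import random


def satisfaction_probability(x, constraint, sample_xi, n_samples=20000, seed=0):
    """Monte Carlo estimate of P[constraint(x, xi) <= 0] over random xi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        xi = sample_xi(rng)  # draw one realization of the uncertain parameter
        if constraint(x, xi) <= 0.0:
            hits += 1
    return hits / n_samples


def meets_chance_constraint(x, constraint, sample_xi, level=0.95, **kwargs):
    """True if the estimated satisfaction probability reaches the given level."""
    return satisfaction_probability(x, constraint, sample_xi, **kwargs) >= level
```

For example, with the hypothetical constraint `lambda x, xi: xi * x - 1.0` and xi drawn uniformly from [0, 2] via `lambda rng: rng.uniform(0.0, 2.0)`, the decision x = 0.5 satisfies the constraint for every sample, while x = 1.0 satisfies it only about half the time and so fails a 95% requirement.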

Non-gaussianity in cosmological simulations

Zachary Weiner, University of Illinois at Urbana-Champaign

**Practicum Year:**2018

**Practicum Supervisor:**Katrin Heitmann and Salman Habib, High Energy Physics/Cosmology, Argonne National Laboratory

Applying stochastic methods (dynamical evolution of a Langevin-type system) to produce non-Gaussian initial conditions for cosmological simulations of large-scale structure.

Automated Thermochemistry and Kinetics for Combustion

Sarah Elliott, University of Georgia

**Practicum Year:**2018

**Practicum Supervisor:**Stephen Klippenstein, Dr., Chemical Sciences and Engineering, Argonne National Laboratory

We are building predictive and automated combustion chemistry (PACC) modeling software. This code will accurately predict the temperature and pressure dependence of the thousands of gas-phase reactions in a combustion mechanism by employing electronic structure theory, transition state theory, classical trajectory simulations, and the master equation.
During my previous practicum, we automated the process of generating thermochemical data. Our Quantum Thermochemistry (QTC) driver takes a list of chemical species as input and, for each, uses Monte Carlo sampling to obtain a starting 3D geometry, performs composite ab initio electronic structure calculations coupled with high-level treatments of anharmonicity and nonlocal motions (including torsional optimizations and scans), forms partition functions, and then generates thermochemical data represented as NASA polynomials. Concurrently, it populates a large database with the properties computed with the electronic structure methods (e.g., electronic energies, frequencies, zero-point vibrational energies) and the thermochemical properties (e.g., temperature-dependent heats of formation, entropies, and heat capacities).
In this second practicum, we added automated, pressure-independent kinetics to PACC. This required a second driver, which controls reaction mechanism generation to produce a list of the species and the hundreds to thousands of reactions between them for a given fuel, automates the transition state search for each reaction, calls QTC to generate thermochemical data, and then uses transition state theory or master equations to get rate information. Currently, our code is able to produce kinetic information for abstraction, addition, and hydrogen migration reactions.
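
The NASA polynomial representation mentioned above can be illustrated with the widely used seven-coefficient form, in which heat capacity, enthalpy, and entropy at temperature T are all evaluated from one coefficient set. The sketch below assumes that form; the source does not say which variant QTC emits, and the function names are hypothetical.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol K)


def nasa7_cp(T, a):
    """Heat capacity Cp in J/(mol K) from coefficients a[0..6]."""
    return R * (a[0] + a[1] * T + a[2] * T**2 + a[3] * T**3 + a[4] * T**4)


def nasa7_h(T, a):
    """Enthalpy H in J/mol; a[5] is the enthalpy integration constant."""
    return R * T * (a[0] + a[1] * T / 2 + a[2] * T**2 / 3
                    + a[3] * T**3 / 4 + a[4] * T**4 / 5 + a[5] / T)


def nasa7_s(T, a):
    """Entropy S in J/(mol K); a[6] is the entropy integration constant."""
    return R * (a[0] * math.log(T) + a[1] * T + a[2] * T**2 / 2
                + a[3] * T**3 / 3 + a[4] * T**4 / 4 + a[6])
```

Because the enthalpy polynomial is the temperature integral of the heat capacity polynomial, a fitted coefficient set can be sanity-checked by confirming that a finite-difference derivative of `nasa7_h` reproduces `nasa7_cp`.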

Calculation of free energy profiles for reactions on metal surfaces using advanced sampling methods

Thomas Ludwig, Stanford University

**Practicum Year:**2018

**Practicum Supervisor:**Juan de Pablo, Professor, Materials Science Division, Argonne National Laboratory

In this project, we investigated the free energy profiles of several surface reactions relevant to heterogeneous catalysis. Free energy sampling methods such as the adaptive biasing force (ABF) method were coupled to first-principles density functional theory (DFT) calculations to calculate free energy profiles of surface reactions. Several important and experimentally relevant surface reaction steps were investigated. This involved choosing DFT parameters to balance sufficient accuracy with performance on high-performance computing resources such that sufficient statistics could be gathered to converge free energy calculations, as well as testing of the parameters of the free energy sampling method itself.

Phonon bottlenecks in low dimensional materials

Nicholas Rivera, Massachusetts Institute of Technology

**Practicum Year:**2018

**Practicum Supervisor:**Pierre Darancet, Staff Scientist, Center for Nanoscale Materials, Argonne National Laboratory

In this project, I worked on developing an algorithm for solving the time-dependent Boltzmann equation for low-dimensional materials to find the non-equilibrium steady state, if one exists. The goal was to find the time-dependent temperature distribution of the different phonon modes in materials such as transition metal dichalcogenides, given a non-equilibrium electron distribution that could be created by pumping a suitable current through the material in question.

Using PETSc to parallelize finite element simulations of arbitrarily curved and deforming surfaces

Amaresh Sahu, University of California, Berkeley

**Practicum Year:**2018

**Practicum Supervisor:**Barry Smith, Senior Computing Mathematician, Mathematics and Computer Science, Argonne National Laboratory

Using PETSc to parallelize a novel isoparametric arbitrary Lagrangian-Eulerian finite element formulation for curved and deforming surfaces.

Computational Calculations of the Effects of Defects on HfO2 Properties

Emily Crabb, Massachusetts Institute of Technology

**Practicum Year:**2017

**Practicum Supervisor:**Olle G. Heinonen, Materials Scientist & Strategic Initiative Lead, Computational Chemistry and Materials, Argonne National Laboratory

Current Si-CMOS technologies are reaching their limit because transistors can only be made finitely small before quantum effects interfere with their behavior. As a result, it will not be possible to maintain Moore's law solely by relying on existing technology in the near future. One method to extend these technologies is to improve the gate materials used. Current Si-MOSFETs use HfO2 as a thin gate material with a high dielectric constant. However, one problem with HfO2 gates is leakage current that limits their efficiency. Impurities in the HfO2 gates increase their efficiency by decreasing this leakage current, but the exact mechanisms are not fully understood. The goal of this project was to use density-functional theory (DFT) based calculations to examine the energetics of oxygen vacancies and nitrogen and fluorine defects in HfO2. DFT was used because it is fast, relatively inexpensive, and well-established as an electronic structure method for finding materials' properties. In order to account for electron correlations, we used the so-called DFT+U approach, in which an on-site parameter U is used for electronic correlations. For transition metal oxides like HfO2, the electron correlations are often too strong to use DFT to accurately determine the materials' properties, and the U parameter cannot accurately capture all effects stemming from correlations. As such, there is a growing area of research that uses Quantum Monte Carlo (QMC) simulations to better model these materials. QMC simulations are more expensive, as they require more computational power and time, but they are also capable of achieving greater accuracy. We therefore also used QMC as a follow-on to the DFT calculations to elucidate the effect of correlations on the electronic structure.

Automated Thermochemistry for Combustion

Sarah Elliott, University of Georgia

**Practicum Year:**2017

**Practicum Supervisor:**Stephen Klippenstein, Dr., Chemical Sciences and Engineering, Argonne National Laboratory

As part of a large Exascale Computing Project grant for fuel simulations, we are building automated chemistry software for full combustion processes. Our program couples EStokTP, Gaussian, Molpro, NWChem, and MOPAC along with in-house codes to explore the thousands of reactions that make up a combustion mechanism. We accurately predict the temperature and pressure dependence of gas-phase reactions by employing electronic structure theory, transition state theory, classical trajectory simulations, and the master equation. Concurrently, we are populating a large MongoDB database with the properties computed with the electronic structure methods (e.g., electronic energies, frequencies, zero-point vibrational energies) and the thermochemical properties (e.g., temperature-dependent heats of formation, entropies, and heat capacities).

Nonlinear Robust Optimization

Carson Kent, Stanford University

**Practicum Year:**2017

**Practicum Supervisor:**Sven Leyffer, Senior Computational Mathematician, Mathematics and Computer Science, Argonne National Laboratory

Robust optimization is an extension of classical constrained optimization that allows us to account for uncertainty/noise that might be present in the parameters/constraints of a given problem. In short, it produces solutions that remain feasible with respect to perturbations of a set of constraints. Unfortunately, most robust optimization methods rely heavily on convexity assumptions in order to operate, and these assumptions are violated in many DOE applications such as additive manufacturing and nano-fabrication.
The objective of our project was to develop robust optimization methods that work for general nonlinear problems and do not rely on convexity assumptions. To this end, we developed a new method that works in this general case of nonlinear, non-convex constraints and is competitive, in terms of computational performance, with the best available robust optimization methods. Our method achieves this performance by combining discretization and constraint reformulation techniques to solve a sequence of local approximations of the nonlinear robust constraints.

Halo Occupation Distribution Modeling of RedMagic Galaxies

Andres Salcedo, The Ohio State University

**Practicum Year:**2017

**Practicum Supervisor:**Salman Habib, High Energy Physics, Argonne National Laboratory

Using the OuterRim cosmological N-body simulation, I modeled the connection between dark matter halos and Dark Energy Survey Year One RedMagic-selected galaxies using a Halo Occupation Distribution (HOD) model. I then modeled the galaxy-galaxy projected correlation function as a Taylor series in HOD parameters, which I used to produce a fit to the observed clustering of RedMagic galaxies.
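
The two ingredients described above, an HOD and a Taylor expansion in its parameters, can be sketched as follows. The source does not state which HOD parameterization was used, so a commonly used five-parameter form (an error-function step for centrals, a power law for satellites) is assumed here. The Taylor surrogate is truncated at first order and built from central finite differences purely for brevity; all function and parameter names are hypothetical.

```python
import math


def mean_ncen(mass, log_mmin, sigma_logm):
    """Mean number of central galaxies in a halo of the given mass."""
    return 0.5 * (1.0 + math.erf((math.log10(mass) - log_mmin) / sigma_logm))


def mean_nsat(mass, log_mmin, sigma_logm, m0, m1, alpha):
    """Mean number of satellite galaxies; zero below the cutoff mass m0."""
    if mass <= m0:
        return 0.0
    return mean_ncen(mass, log_mmin, sigma_logm) * ((mass - m0) / m1) ** alpha


def taylor_first_order(f, p0, step=1e-4):
    """Linear surrogate of f around parameter vector p0 (central differences)."""
    f0 = f(p0)
    grad = []
    for i in range(len(p0)):
        h = step * max(1.0, abs(p0[i]))
        hi = list(p0)
        lo = list(p0)
        hi[i] += h
        lo[i] -= h
        grad.append((f(hi) - f(lo)) / (2.0 * h))

    def surrogate(p):
        # f(p) ~ f(p0) + grad . (p - p0)
        return f0 + sum(g * (pi - p0i) for g, pi, p0i in zip(grad, p, p0))

    return surrogate
```

The appeal of such a surrogate is that the expensive statistic (here it would be the projected correlation function measured from the simulation) is evaluated only a handful of times around a fiducial parameter set, after which fitting reduces to evaluating a cheap polynomial.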

Inverse optimization technique for targeted self-assembly of nanostructures

Mukarram Tahir, Massachusetts Institute of Technology

**Practicum Year:**2015

**Practicum Supervisor:**Stefan Wild, Computational Mathematician, Mathematics and Computer Science Division, Argonne National Laboratory

A promising approach for designing materials at the nanoscale resolution is self-assembly from anisotropic nanometer-sized building blocks. In my practicum research, we focused on the inverse design problem of identifying building blocks that would spontaneously self-assemble into a target superstructure. We developed a mutable representation for building block morphology, and then utilized non-linear least squares optimization (namely the POUNDERS algorithm from Argonne's TAO) to evolve the associated parameters so that the resulting building block would spontaneously self-assemble into targeted nanostructures.

Computational Studies on Inhomogeneous Surface Plasmon Polaritons in Metamaterials

Gerald Wang, Massachusetts Institute of Technology

**Practicum Year:**2015

**Practicum Supervisor:**Stephen Gray, Group Leader, Theory and Modeling, Center for Nanoscale Materials, Argonne National Laboratory

This practicum work focused on developing analytical and computational models of inhomogeneous surface plasmon polaritons (ISPPs), i.e., surface waves at a metal-dielectric interface with the property that their decay and attenuation directions are not aligned. In particular, the project focused on whether these wave phenomena could be found within negative-index metamaterials (NIMs). This work has many potential applications in sub-diffraction-limit optics, chemical sensing, and possibly even optical cloaking.

Massively Parallelized Equation-Of-Motion Coupled Cluster Singles and Doubles (EOMCCSD) for Excited State Characterization

Samuel Blau, Harvard University

**Practicum Year:**2014

**Practicum Supervisor:**Jeff Hammond, Assistant Computational Scientist, Leadership Computing Facility, Argonne National Laboratory

Excited states are notoriously difficult to calculate, yet most systems of technological interest involve excitations. Time-dependent density functional theory (TDDFT) is the most widely used method for calculating excited states, but it has well documented issues with conjugated systems and charge-transfer (CT) states. Additionally, users of TDDFT must choose between a range of functionals and empirical parameters, which casts serious doubt on its ability to be truly predictive. EOMCCSD is significantly more expensive than TDDFT, but is much more accurate and has no problem with conjugated systems, CT states, or the user's choices. However, given its cost, in order to apply it to technologically relevant systems it has to be done in parallel.
Thankfully, Edgar Solomonik and Devin Matthews, two CSGFs who did practicums with Jeff a couple of years ago, have already implemented a massively-parallelized tensor contraction engine and built ground-state CCSD on top of that. Thus implementing massively-parallelized EOMCCSD was a feasible practicum project.

Performance Analysis of Full-Core Reactor Simulations on the Xeon Phi Architecture

David Ozog, University of Oregon

**Practicum Year:**2014

**Practicum Supervisor:**Andrew Siegel, Director, CESAR, Math and Computer Science / Nuclear Engineering, Argonne National Laboratory

A primary characteristic of history-based Monte Carlo neutron transport simulation is the application of MIMD-style parallelism: the path of each neutron particle is largely independent of all other particles, so threads of execution perform independent instructions with respect to other threads. This conflicts with the growing trend of HPC vendors exploiting SIMD hardware, which accomplishes better parallelism and more FLOPS per Watt. Event-based neutron transport suits vectorization better than history-based transport, but it is difficult to implement and complicates data management and transfer. However, the Intel Xeon Phi architecture supports the familiar x86 instruction set and memory model, mitigating difficulties in vectorizing neutron transport codes. My project compared the event-based and history-based approaches for exploiting SIMD in Monte Carlo neutron transport simulations, with the Xeon Phi as the target architecture.
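
The contrast between the two approaches can be sketched with a deliberately simplified toy model in which every particle undergoes collisions until it is absorbed. The physics, the function names, and the per-particle random streams are assumptions made for illustration, and plain Python does not vectorize; the point is only the loop structure.

```python
import random


def history_based(n_particles, p_absorb, seed=0):
    """Follow each particle from birth to absorption before starting the next."""
    collisions = [0] * n_particles
    for i in range(n_particles):
        rng = random.Random(seed + i)  # independent stream per particle
        while True:
            collisions[i] += 1
            if rng.random() < p_absorb:
                break  # particle absorbed, history ends
    return collisions


def event_based(n_particles, p_absorb, seed=0):
    """Apply the same event to all live particles, then compact the bank."""
    rngs = [random.Random(seed + i) for i in range(n_particles)]
    collisions = [0] * n_particles
    alive = list(range(n_particles))
    while alive:
        # One collision event applied across the whole bank; in a real code
        # this uniform inner loop is what SIMD lanes would execute.
        survivors = []
        for i in alive:
            collisions[i] += 1
            if rngs[i].random() >= p_absorb:
                survivors.append(i)
        alive = survivors  # bank compaction: the extra data management cost
    return collisions
```

Because each particle owns its random stream, both functions return identical collision counts for the same seed. Only the traversal order differs: the history-based loop has a data-dependent trip count per particle, while the event-based loop does uniform work per pass at the price of maintaining and compacting the particle bank.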

High Resolution Modeling of Tropical Cyclones

David Plotkin, University of Chicago

**Practicum Year:**2014

**Practicum Supervisor:**Robert Jacob, Computational Climate Scientist, Mathematics and Computer Science, Argonne National Laboratory

The goal of this project is to improve simulation of tropical cyclones in global climate models, with a secondary goal of predicting changes in cyclone activity caused by global warming. Global warming is often predicted to increase intensity but decrease frequency of cyclones. However, global climate models do not reproduce current cyclone intensities (wind speeds are too low compared to pressure minima), so these predictions are highly uncertain. This project aims to improve the model physics in order to better capture the observed cyclone intensities.

Metabolic Modeling of Trophic Interactions in Microbial Communities

Andrew Stine, Northwestern University

**Practicum Year:**2014

**Practicum Supervisor:**Christopher Henry, Computational Biologist, Mathematics and Computer Science Department, Argonne National Laboratory

Dr. Chris Henry's research group has previously been involved in developing a software suite known as KBase, which is capable of generating a preliminary metabolic model for an organism from its sequenced genome. This software has been successfully utilized to generate metabolic models for a wide variety of pure culture organisms. However, no work has yet been done to apply this type of analysis to organisms growing together in communities. For this project, the Henry lab is utilizing KBase to study different microbial communities and compare the computational predictions with experimental results.

Enhancing scalability of the coastal ocean model SELFE

Jesse Lopez, Oregon Health and Science University

**Practicum Year:**2013

**Practicum Supervisor:**Jed Brown, Argonne Scholar, Laboratory for Advanced Numerical Simulation, Argonne National Laboratory

The goal of the project was to identify and alleviate bottlenecks preventing the code from exhibiting good strong scaling characteristics with problem sizes currently used by researchers on modern clusters.

Applying DFT to the study of charge density effects in fullerene semiconductors

Kenley Pelzer, University of Chicago

**Practicum Year:**2013

**Practicum Supervisor:**Seth Darling, Nanoscientist, Center for Nanoscale Materials, Argonne National Laboratory

The project originally started as a study of the PTB7:PC71BM organic photovoltaic but eventually morphed into a project that is applicable to the general study of organic semiconductors. DFT (density functional theory, an electronic structure theory method) was applied to study the effect of adding point charges near the fullerene molecules that are involved in electron transport in organic semiconductors. These point charges were used to simulate the effects of charge density with the goal of assessing whether and why electron transport is more efficient in materials where charges are fairly densely packed into the material.

p3calc: An efficient tool for calculating properties of potential energy surfaces for chemical kinetics

Jason Bender, University of Minnesota

**Practicum Year:**2012

**Practicum Supervisor:**Stephen Klippenstein, Chemical Sciences and Engineering, Argonne National Laboratory

Chemical kinetic modeling is a key area of research within the DOE for the development and analysis of next-generation biofuels. Transition state theory is a powerful statistical approach that is widely used for predicting chemical reaction rates in detailed combustion mechanisms. My practicum at Argonne National Laboratory, from May to August 2012 under the supervision of Dr. Stephen Klippenstein, focused on the development of an efficient software tool for calculating molecular potential energy surface properties that are needed in state-of-the-art transition state theory computations. In particular, the program was designed with the goal of evaluating molecular partition functions that accurately account for torsional vibrational modes. This addresses a deficiency in current modeling of biofuel combustion reactions. We plan to use the software to more accurately predict relevant chemical reaction rates, contributing to a better understanding of fundamental interactions in the combustion of advanced fuels.

Molecular Dynamics Simulations for Extracting Phase Field Model Parameters

Carmeline Dsilva, Princeton University

**Practicum Year:**2012

**Practicum Supervisor:**Mihai Anitescu, Computational Mathematician, Mathematics and Computer Science Division, Argonne National Laboratory

There is interest in modeling material microstructure (i.e. the grains within a metal) in order to predict material behavior and potential failures. Researchers have looked at using phase field models for such predictions, as these models are able to access much longer time and length scales compared to traditional atomistic simulations such as molecular dynamics. However, phase field models require material parameters that must be obtained from experiments or simulations. In my project, I used molecular dynamics simulations to obtain the relevant material parameters for a phase field simulation of uranium dioxide.

Parallelization and optimization of lattice-Boltzmann method

Christopher Ivey, Stanford University

**Practicum Year:**2012

**Practicum Supervisor:**Ramesh Balakrishnan, Computational Scientist, Argonne Leadership Computing Facility, Argonne National Laboratory

The purpose of my practicum was to parallelize and optimize a lattice-Boltzmann method (LBM) solver to perform direct numerical simulations of fluid flow with varying levels of description. The LBM is based on the discrete Boltzmann equation, where simplified kinetic models are constructed to incorporate the essential mesoscopic physics such that the macroscopically averaged properties obey their respective macroscopic equations. The mesoscopic description, through augmented problem dimensionality, provides an interesting framework to study parallelism and to incorporate 'extra' physics, e.g., multiphase representation of fluid flow. Of primary interest to the researchers was the general scalability of the LBM on the Blue Gene architectures using MPI, OpenMP, and vectorization. Secondary goals included investigations of lattice geometries, boundary conditions, and contemporary LBM phase descriptions and collision operators.
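
The collide-and-stream structure at the heart of the LBM can be sketched on the smallest standard lattice, D1Q3, with a single-relaxation-time (BGK) collision operator and periodic boundaries. This is a serial toy in plain Python, not the solver from the practicum; the lattice choice, the function names, and the relaxation time `tau` are assumptions for illustration.

```python
W = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)  # D1Q3 lattice weights
C = (0, 1, -1)                          # D1Q3 lattice velocities


def equilibrium(rho, u):
    """Second-order equilibrium populations for density rho and velocity u."""
    return [w * rho * (1.0 + 3.0 * c * u + 4.5 * (c * u) ** 2 - 1.5 * u * u)
            for w, c in zip(W, C)]


def lbm_step(f, tau):
    """One BGK collide-and-stream step on a periodic 1D lattice.

    f[i][x] is the population moving with velocity C[i] at site x.
    """
    n = len(f[0])
    post = [[0.0] * n for _ in range(3)]
    for x in range(n):
        # macroscopic moments recovered from the mesoscopic populations
        rho = f[0][x] + f[1][x] + f[2][x]
        u = (f[1][x] - f[2][x]) / rho
        feq = equilibrium(rho, u)
        for i in range(3):
            # collision: relax toward local equilibrium (purely local work)
            post[i][x] = f[i][x] - (f[i][x] - feq[i]) / tau
    # streaming: shift each population one site along its velocity
    return [[post[i][(x - C[i]) % n] for x in range(n)] for i in range(3)]
```

The collision loop touches only one site at a time and the streaming step touches only nearest neighbors, which is why the method maps naturally onto domain decomposition with MPI, threading within a node, and vectorization over sites.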

GPU-Accelerated Coarse-Grained Simulations of Polymer-Nanotube Complexes

Zachary Ulissi, Massachusetts Institute of Technology

**Practicum Year:**2012

**Practicum Supervisor:**Gregory Voth, Distinguished Service Professor, Computation Institute, Argonne National Laboratory

Polymer complexes are inherently difficult to simulate at the atomistic level due to the large length scales and long time scales involved. For my practicum, I built on previous atomistic simulations of SWCNT-polymer assemblies, with the aim to simulate polymer adsorption at the microsecond timescale.
As part of this project, I learned HOOMD, a GPU-accelerated molecular mechanics code, and implemented my simulations in it.

Fragment molecular orbital molecular dynamics on the BlueGene/P

Kurt Brorsen, Iowa State University

**Practicum Year:**2011

**Practicum Supervisor:**Graham Fletcher, Project Specialist, Leadership Computing Facility, Argonne National Laboratory

The main accomplishment of the practicum was the porting of the GAMESS fragment molecular orbital-molecular dynamics (FMO-MD) code to the BlueGene/P. Running FMO-MD on the BlueGene/P required the standard changes to the GAMESS code as well as additional changes to GAMESS' I/O, the Distributed Data Interface (DDI), and the FMO routines. Since the main goal of porting FMO-MD to the BlueGene/P was to run molecular dynamics to calculate bulk properties of water clusters using second-order perturbation theory (MP2), the FMO-MP2 gradient was interfaced with DDI. This replaced the old FMO-MP2 gradient, which required extensive writing to disk. These accomplishments have resulted in an FMO-MD code that is stable and fully functional on the BlueGene/P and should allow the use of FMO-MD on larger systems, using better levels of theory than ever before.
Once the code was stable, preliminary benchmarking of the FMO-MD code was undertaken. This benchmarking identified a realistic system size for the proposed FMO-MD-MP2 simulations of water clusters and determined a current bottleneck in the FMO-MD code that must be resolved before practical MD simulations can be undertaken. Possible ideas to eliminate this bottleneck were implemented and tested, but none proved successful. Besides benchmarking, calculations were performed that showed that recent improvements to the FMO gradient result in perfect energy conservation for FMO-MD. Energy conservation from the gradient improvements had been assumed, but needed to be shown to prove the validity of the FMO-MD method.

A Linearly Constrained Augmented Lagrangian Method for PDE-Constrained Optimization

Evan Gawlik, Stanford University

**Practicum Year:**2011

**Practicum Supervisor:**Todd Munson, Dr., Mathematics and Computer Science Division, Argonne National Laboratory

We developed a linearly-constrained augmented Lagrangian method for solving optimization problems with partial differential equation constraints. By exploiting the problem structure, the algorithm requires just two linearized forward and adjoint PDE solves per iteration. We tested the algorithm on a collection of model problems in PDE-constrained optimization proposed by Haber & Hanson, which consists of three parameter estimation problems constrained by elliptic, parabolic, and hyperbolic PDEs. All experiments were performed on the Fusion cluster at Argonne National Laboratory. An implementation of the algorithm is now provided in the Toolkit for Advanced Optimization (TAO), a software package for large-scale optimization that is built upon the popular PETSc suite developed at Argonne.

Algorithms for Local and Parallel Structured Tensor Contractions

Devin Matthews, University of Texas

**Practicum Year:**2011

**Practicum Supervisor:**Jeff Hammond, Director's Postdoctoral Fellow, Argonne Leadership Computing Facility, Argonne National Laboratory

Our project was the development of new algorithms for performing tensor contractions, exploiting special structure of both the algorithm and the tensors themselves. The kinds of structure that we use are diverse, but work together to give a novel algorithm that should be highly efficient.
For the tensors, we exploit the structure of index permutation symmetry, where the interchange of two indices may induce a change in overall sign, but keeps the magnitude of the element the same. This symmetry allows us to pack the tensors into a smaller storage area (by a ratio that grows factorially with the number of indices), and use an implicit algorithm to operate on it. For the local tensor contraction algorithm, we also exploit the index permutation symmetry of the tensors to avoid performing redundant computation.
The parallel algorithm uses two more types of structure: cyclic distribution of the tensor elements and the topology of the underlying communication network. Distributing the tensor elements cyclically allows for much better load balancing than blocked distributions and also maintains the index permutation symmetry for the local sub-tensors. Using the topology of the underlying communication network (currently only n-dimensional tori) allows us to use generalizations of the well-known parallel Cannon and SUMMA matrix multiplication algorithms, which utilize all dimensions of the network efficiently.
Work on both the local and parallel algorithms is ongoing.

Conserving angular momentum in the master equation

Kenley Pelzer, University of Chicago

**Practicum Year:**2011

**Practicum Supervisor:**Stephen Klippenstein, Chemist, Chemistry (Chemical Dynamics in the Gas Phase), Argonne National Laboratory

I worked in a group that had previously calculated rates of chemical reactions accounting only for conservation of energy and neglecting conservation of angular momentum. I wrote C++ code to treat the same problems with conservation of angular momentum included.

A framework for parallel symmetric tensor contractions

Edgar Solomonik, University of California, Berkeley

**Practicum Year:**2011

**Practicum Supervisor:**Jeff Hammond, Dr, ALCF, Argonne National Laboratory

We developed a set of new algorithms and a parallel framework for performing tensor contractions. The algorithms and framework support distributed packed layouts for partial and full symmetries. The framework automatically maps tensors of any dimension and symmetry to a physical processor grid of another dimension. This framework is integrated with a quantum chemistry electronic structure calculation code. The resulting framework should allow much more efficient calculations of electronic structure since our algorithms are well-structured and perform less communication than previous frameworks (TCE/Global Arrays).

Investigation of Scalable Adjoint Frameworks for Multi-physics Nuclear Reactor Simulations

Hayes Stripling, Texas A&M University

**Practicum Year:**2011

**Practicum Supervisor:**Mihai Anitescu, Computational Mathematician, Mathematics and Computer Science Division, Argonne National Laboratory

I participated in some scoping work for the CESAR (Center for Exascale Simulation of Advanced Reactors) exascale center, a recently funded DOE research initiative. The center is located at Argonne, but faculty in my home department and University will be contributing Uncertainty Quantification studies to the center. Specifically, we are interested in the co-design of the next generation machines for efficient adjoint and automatic differentiation calculations for nuclear reactor simulations. Each of these techniques poses challenges at exascale because they tend to be memory-intensive. A major initiative at the CESAR center is exploring new techniques and machine design features that will allow for efficient calculations without the large state-storage task.

Investigating Hybrid Programming through Solid Mechanics

Piotr Fidkowski, Massachusetts Institute of Technology

**Practicum Year:**2010

**Practicum Supervisor:**Pavan Balaji, Assistant Computer Scientist, Mathematics and Computer Science Division, Argonne National Laboratory

Heterogeneous clusters that add hardware accelerators to nodes are becoming more popular, due to trends in hardware as well as constraints in computing towards the exascale. Three of the top ten computers in the Top500 list use hardware accelerators, including the current #2 supercomputer. Achieving optimal performance on such machines requires the use of hybrid parallelization, combining shared memory at the nodal level and message passing in between nodes.
Over the summer, I explored hybrid parallelization in an attempt to address some of the missing pieces in accelerator computing. I took an existing MPI parallel application for computational solid mechanics used by an MIT research group and hybridized it with CUDA. The application is an explicit finite element solver on unstructured meshes, designed for relatively "fast" and nonlinear problems. The domain decomposition used for unstructured FEM provides a natural hierarchy for hardware acceleration at the node level, since accelerator cards can be used to compute results for the thousands to hundreds of thousands of elements per node. By exploiting the memory hierarchy in GPU computing I have been able to achieve a performance improvement of 20-30x using a hybrid CUDA+MPI approach.
While implementing this hybridization, I noted a couple of ways that MPI could be extended to make GPU hybridization more convenient. The most obvious extension is implementing versions of send and receive that can take GPU pointers. Further possible improvements include mechanisms for coordinating kernels on multiple GPU nodes. Finally, I would like to note that the performance improvement I achieved with my application was only achievable by fully exploiting all levels of the memory hierarchy in a very application-specific way, and I do not believe automatic cross compilation for the GPU will be competitive in the near future.

Universal Gradient Enhanced Kriging Methods for Surrogate Modeling in Nuclear Engineering

Brian Lockwood, University of Wyoming

**Practicum Year:**2010

**Practicum Supervisor:**Mihai Anitescu, Computational Mathematician, Mathematics and Computer Science Division, Argonne National Laboratory

My practicum project involved the development of surrogate models for the purposes of inexpensive uncertainty quantification. Typical approaches for uncertainty quantification rely on Monte Carlo sampling, requiring thousands of simulations to acquire reliable statistics. In order to reduce the computational cost of exhaustive sampling, the simulation output can be approximated by an inexpensive surrogate model which is built from a limited number of simulation results. My practicum work focused on universal Kriging models, which combine polynomial regression techniques with statistical methods developed for modeling random processes. These methods possess the rapid convergence associated with regression-based approaches while maintaining the advantages of statistical approaches, such as the ability to efficiently incorporate additional training data and the ability to provide estimates for the variance associated with model predictions. Additionally, derivative observations can be incorporated into the construction of the model, giving a higher quality model at reduced computational cost when combined with adjoint techniques. I explored the applicability of these universal gradient-enhanced Kriging models for modeling the output of reactor safety codes.

Application of Anisotropic Diffusion Tensor in 3-D and Implementation in DIF3D

Travis Trahan, University of Michigan

**Practicum Year:**2010

**Practicum Supervisor:**Michael Smith, Dr., Nuclear Reactor Analyst, Nuclear Engineering Division, Argonne National Laboratory

Neutron diffusion theory is an approximation that accurately describes the neutron transport process for optically thick, scattering-dominated systems in which the angular neutron flux is a weak function of angle. These conditions imply that standard fine-mesh diffusion theory cannot be used for the design and analysis of Very High Temperature Reactors (VHTRs) in which narrow but nearly-voided channels occurring in the reactor core can lead to boundary layer effects and angular fluxes that are strong functions of angle. It has previously been shown that a new, accurate diffusion equation can be derived for such problems. It yields a new diffusion approximation which is the same as the standard equation away from optically thin regions, but which has different anisotropic diffusion coefficients near the optically thin regions. This anisotropic diffusion approximation significantly improves the accuracy of diffusion simulations while having lower computational cost than higher order transport methods. As part of a DOE Computational Science Graduate Fellowship summer practicum, this method has been implemented in the DIF3D finite-difference diffusion code. A 3-dimensional, Cartesian geometry, Monte Carlo code has been developed for calculation of the anisotropic diffusion tensor. The Monte Carlo for Anisotropic Diffusion (MCAD) code is also useful for obtaining reference solutions. The anisotropic diffusion method is applied to a broader class of problems and extended to three dimensions. Results confirm that anisotropic diffusion theory offers significant advantages for these special problems. More importantly, it is shown that anisotropic diffusion can be effectively implemented in a production code, making it a practical and useful tool for reactor physicists.

Parallel Implementation of Charge and Thermal Transport Simulation in Carbon Nanotubes

Zlatan Aksamija, University of Illinois at Urbana-Champaign

**Practicum Year:**2009

**Practicum Supervisor:**Paul Fischer, Computational Scientist, Mathematics and Computer Science Division, Argonne National Laboratory

Approximate Dynamic Programming Models of Public Health Emergency Response

Kathleen King, Cornell University

**Practicum Year:**2009

**Practicum Supervisor:**Sven Leyffer, Computational Mathematician, Mathematics and Computer Science Division, Argonne National Laboratory

We developed approximate dynamic programs (ADP), linear programs (LP), and integer programs (IP) to model the United States' planned response in the event of an anthrax attack. We implemented all of these models using the optimization modeling languages AMPL and GAMS. We considered theoretical relationships between the different models, using the LP and IP models to obtain bounds on the ADP results. We constructed a variety of ADP algorithms and classified the strengths and weaknesses of each algorithm type.

Genome wide analysis of epigenetic features during development of Drosophila melanogaster.

Paul Loriaux, University of California, San Diego

**Practicum Year:**2009

**Practicum Supervisor:**Dr. Kevin White, Director, IGSB, Mathematics and Computer Science, Argonne National Laboratory

ChIP-chip and ChIP-SEQ are two modern, high-throughput techniques by which to ascertain the presence of a protein or epigenetic feature over an entire genome of interest and for a particular experimental context of interest. In the case of my practicum, I worked with a group that was measuring the prevalence of six epigenetic features and two functional proteins over the course of the developing fruit fly. The purpose of the project was to ascertain the functional consequence of these features in a complex transcriptional program like embryo development. Knowledge of this function would have immediate consequences for our understanding of genomic regulation and our ability to control and influence cell development, the latter being the present goal of stem cell therapy.

Systematic comparison of very different computational neuronal network models of epilepsy

Anne Warlaumont, University of Memphis

**Practicum Year:**2009

**Practicum Supervisor:**Mark Hereld, Mathematics and Computer Science Division, Argonne National Laboratory

Computational modeling is emerging as an important method for understanding and predicting seizure onset in epileptic patients. My project compared computational neural network models of the epileptic brain. The models differed in their levels of abstraction/realism. The motivation was to determine when abstract models may be used in place of very detailed models in cases where computational resources are limited or when abstraction is desired for interpretive reasons. Since we knew of no method for systematically comparing such models' behaviors, we developed a pipeline for analyzing the range of behaviors produced by a given model and for comparing models to one another and to real data. We investigated quantitative measures of model behavior. We also explored various methods of visualizing the relationship between behaviors from different models or from a model and real data. We found that the abstract model we simulated produced a range of behaviors at least as wide as (if not wider than) the range produced by more detailed models.

Optimization of simulation-based optimization problems with constraints on the computational budget

Stefan Wild, Cornell University

**Practicum Year:**2006

**Practicum Supervisor:**Jorge More, Senior Computational Mathematician -Director, LANS, Mathematics and Computer Science Division, Argonne National Laboratory

This summer Jorge More and I focused on the problem of derivative-free optimization in the case when the objective function is computationally expensive, a theme I am researching for my doctoral dissertation. While previous work in optimization has focused on long-run "tail" convergence, we sought to develop an algorithm which makes significant progress during early stages, a feature which is especially attractive to computational scientists whose optimization runs are constrained by a computational budget.

Finite element model of mantle convection in icy satellites including grain size-dependent viscosity

Emma Rainey, California Institute of Technology

**Practicum Year:**2005

**Practicum Supervisor:**Matthew Knepley, Argonne National Laboratory

The project was to create a finite element mantle convection simulation for icy satellites using the PETSc solvers. The model would be easily customized to include complex rheologies.

A Discontinuous Galerkin Method for the Incompressible Navier-Stokes Equations on Structured Meshes.

Krzysztof Fidkowski, Massachusetts Institute of Technology

**Practicum Year:**2004

**Practicum Supervisor:**Paul Fischer, Mathematics and Computer Science Division, Argonne National Laboratory

Paul Fischer has done extensive work on the incompressible Navier-Stokes equations using a continuous spectral-element formulation. The applications of his code are numerous, including blood flow in arteriovenous grafts. However, the continuous discretization possesses stability problems for high Reynolds number applications, requiring artificial stabilization through filtering. In my practicum I worked on an alternate discretization with the end-goal of improving stability and perhaps accuracy.

Developing a Full-scale Implementation of the Hessian-implicit Primal-dual Method

Jaydeep Bardhan, Massachusetts Institute of Technology

**Practicum Year:**2003

**Practicum Supervisor:**Jorge More, Mathematics and Computer Science, Argonne National Laboratory

In my research at school, we have developed a novel kind of optimization technique for certain classes of problems. During my practicum, I worked with S. Leyffer and S. Benson to develop an application that could solve large optimization problems using this technique.

Non-newtonian dynamics and thermal structure in subduction zones: applying the Portable Extensible Toolkit for Scientific Computing (PETSc) to a highly non-linear problem in CFD

Richard Katz, Columbia University

**Practicum Year:**2003

**Practicum Supervisor:**Barry Smith, Scientist (?), Mathematics and Computer Science Division, ANL, Argonne National Laboratory

The rheology of mantle rock has been shown experimentally to be highly sensitive to temperature and strain rate (Karato, Science '93). This leads to strong non-linearity in the Stokes equation that must be solved to resolve the flow structure of the slowly convecting mantle over geologic time. PETSc provides robust parallel solvers ideal for handling this computationally challenging problem.

Krylov Based Exponential Propagators

Daniel Horner, University of California, Berkeley

**Practicum Year:**2002

**Practicum Supervisor:**Paul Fischer, Mathematics and Computer Science, Argonne National Laboratory

Development and application of Krylov-based exponential propagators in the context of incompressible a promising new approach to highly accurate time stepping schemes in numerical solution of partial differential equations, particularly those exhibiting disparate time scales.

Sediment Particle Dynamics

Samuel Schofield, University of Arizona

**Practicum Year:**2002

**Practicum Supervisor:**Paul Fischer, Mathematics and Computer Science Division, Argonne National Laboratory

The project focused on studying the forces on sediment particles in oscillatory boundary layers. A full 3D Navier-Stokes calculation is being performed to determine the lift and drag on sand particles in different configurations.

Variation in Metabolic Pathways

Pauline Ng, University of Washington

**Practicum Year:**2001

**Practicum Supervisor:**Natalia Maltsev, Asst. Comp. Biologist, MCS-Mathematics & Computer Science Division, Argonne National Laboratory

A metabolic pathway takes a substrate and transforms it into the desired product through a series of enzymatic steps. Understanding pathways is essential for understanding cell behavior. The components of a pathway are proteins, each responsible for executing one enzymatic step so that the product is eventually formed.
My project was to study the selective pressures on metabolic pathways, specifically the pathways involved in synthesizing purine and pyrimidine, which are DNA components. While one may expect the proteins in a pathway to be under the same selective pressure since they are involved in the same pathway, our preliminary results show that steps in the pathway are not equivalent.
Results:
1) Allosterically regulated proteins are under stronger selective constraints than other proteins.
This was especially true for the pyrimidine pathway (p < 0.05). Although the protein involved in allosteric regulation for purine synthesis was among the most conserved proteins, this value was not significant.
2) Certain steps in the pathway can tolerate variation.
A redundant step can tolerate variation. Another enzymatic step that tolerated variation can be rescued by certain environmental conditions. The steps that tolerated variation also correlated with nonorthologous transfer (p < 0.01) in the organisms studied and in eukaryotes. These steps are wobbly at a micro- and macro-molecular level and are more susceptible to convergent evolution.

Investigation of the suitability of WENO schemes for the Rayleigh-Taylor instability

John Costello, University of Arizona

**Practicum Year:**1999

**Practicum Supervisor:**Paul Fischer, Mathematics and Computer Science, Argonne National Laboratory

We looked at attacking the Rayleigh-Taylor instability problem numerically using a relatively new class of finite difference schemes known as "weighted essentially non-oscillatory" schemes. The Rayleigh-Taylor problem is an important problem in (among other areas of research) stellar dynamics, and it is both computationally and experimentally very difficult to attack.

Developing the Beam Element for Finite Element System Using Absolute Nodal Coordinate Formulation

Larisa Goldmints, Carnegie Mellon University

**Practicum Year:**1998

**Practicum Supervisor:**Dr. Thomas Canfield, Argonne National Laboratory

The absolute nodal coordinate formulation was introduced recently for flexible multibody dynamics systems. This formulation is proposed to simulate rigid body motion exactly and to simplify calculations by providing a constant mass matrix.

Implementation of Von Mises plasticity in SUMAA3D - a scalable unstructured mesh algorithms and applications package

William Barry, Carnegie Mellon University

**Practicum Year:**1996

**Practicum Supervisor:**Dr. Paul Plassmann, Mathematics and Computer Science Division, Argonne National Laboratory

A Von Mises plasticity material model was developed for a simple three-noded triangular finite element. The formulations were prototyped and refined using MATLAB and then incorporated into an existing adaptive mesh refinement computer code.

A Simplex Based Primal-Dual Algorithm for Finding a Perfect Two-Matching in an Undirected Graph

Paul Bunch, Purdue University

**Practicum Year:**1996

**Practicum Supervisor:**Dr. Stephen Wright, Mathematics and Computer Science Division, Argonne National Laboratory

This work is in the area of optimization. Specifically, my work focuses on the development of solution algorithms for large scale combinatorial optimization problems.

Performance Optimization and Parallelization of FANS-3D

Kent Carlson, Florida State University

**Practicum Year:**1992

**Practicum Supervisor:**Dr. William Gropp, Mathematics and Computer Science, Argonne National Laboratory

I spent the summer working with Bill Gropp enhancing the computational performance of my research code, FANS-3D (Finite Analytic Numerical Simulation-Three Dimensional). We replaced the linear system solver in FANS-3D with a more efficient one written by Bill and Barry Smith, which significantly lowered the run time necessary to solve our test problem. In addition, we are in the process of parallelizing the code, to run on multiprocessor machines and clusters of workstations. When completed, this change will result in a significant performance improvement in the code.