Sandia National Laboratories, California
Randomized Algorithms for Decomposing Sparse Tensors
Brett Larsen, Stanford University
Practicum Year: 2018
Practicum Supervisor: Tamara Kolda, Distinguished Member of Technical Staff, Data Science and Cyber Analytics, Sandia National Laboratories, California
Tensor decompositions have consistently proven to be an essential tool in large-scale, high-dimensional data analysis. However, conventional algorithms for finding low-rank tensor decompositions can be unwieldy or even infeasible for extremely large and sparse tensors. In my practicum, we developed two different approaches for applying randomized algorithms to this problem, essentially solving a much smaller sampled version at each iterate with probabilistic guarantees on the approximation errors. In the first method, sketching was applied to the highly overdetermined least squares problem that occurs repeatedly in alternating least squares, the standard approach for the CANDECOMP/PARAFAC (CP) decomposition. In contrast to previous work, we performed sketching through leverage score sampling, which avoids working with dense tensors created by random mixing. In the second, we turned to an "all-at-once" optimization perspective and examined weighted sampling for stochastic gradient descent. We considered how sampling by estimates of the local smoothness of the loss function can push towards elements where the loss is steepest, and we presented progress towards efficiently estimating the local smoothness for different variations of the Generalized CP decomposition.
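As a rough illustration of the first approach, the following Python sketch solves a tall least squares problem from a small set of rows sampled in proportion to their leverage scores. It is a minimal sketch on a generic dense matrix, not the practicum's algorithm: the function names, the problem sizes, and the exact QR-based score computation are assumptions made for illustration, and a practical method would need to obtain the sampling probabilities without factoring the full matrix.

```python
import numpy as np

def leverage_scores(A):
    """Row leverage scores of a tall matrix A via a thin QR factorization."""
    Q, _ = np.linalg.qr(A)           # Q has orthonormal columns spanning range(A)
    return np.sum(Q**2, axis=1)      # squared row norms of Q

def sampled_least_squares(A, b, n_samples, rng):
    """Approximately solve min ||Ax - b|| from rows sampled by leverage score."""
    scores = leverage_scores(A)
    probs = scores / scores.sum()
    idx = rng.choice(A.shape[0], size=n_samples, replace=True, p=probs)
    # Rescale each sampled row so the sketched problem is unbiased in expectation.
    scale = 1.0 / np.sqrt(n_samples * probs[idx])
    x, *_ = np.linalg.lstsq(A[idx] * scale[:, None], b[idx] * scale, rcond=None)
    return x

# Hypothetical test problem: 20000 equations, 10 unknowns, small noise.
rng = np.random.default_rng(0)
A = rng.standard_normal((20000, 10))
x_true = rng.standard_normal(10)
b = A @ x_true + 0.01 * rng.standard_normal(20000)

x_sketch = sampled_least_squares(A, b, n_samples=500, rng=rng)
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
rel_err = np.linalg.norm(x_sketch - x_full) / np.linalg.norm(x_full)
```

Here `rel_err` compares the solution from 500 sampled rows against the solution that uses all 20000 rows.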
Numerical Performance Evaluation of Asynchrony-Tolerant Finite Difference Schemes for Nonreacting and Reacting Flow Equations
Emmet Cleary, California Institute of Technology
Practicum Year: 2017
Practicum Supervisor: Jacqueline H. Chen, Dr., Combustion Research Facility, Sandia National Laboratories, California
Communication and data synchronization between processing elements (PEs) are likely to pose a major challenge in the scalability of solvers at the exascale. Recently developed asynchrony-tolerant (AT) finite difference schemes address this issue by relaxing communication and synchronization between PEs at a mathematical level while preserving accuracy, resulting in improved scalability. The performance of these schemes has been validated for simple linear and nonlinear homogeneous PDEs. However, many problems of practical interest are governed by a highly nonlinear system of PDEs, whose solution may be sensitive to perturbations caused by communication asynchrony. For this project, I applied AT schemes to a wide range of one-dimensional fluids problems, including traveling waves, turbulence, shocks, and premixed flames. The results showed that these schemes are applicable to practical fluids problems, introducing asynchrony errors only at negligibly small length scales.
Data-Driven Approach to Modeling Material Uncertainty in Additively Manufactured Materials
Clay Sanders, Duke University
Practicum Year: 2017
Practicum Supervisor: Jeremy Alan Templeton, R&D S&E, Mechanical Engineering, Thermal/Fluid Sciences & Engineering, Sandia National Laboratories, California
I collaborated with an ongoing research initiative at Sandia National Laboratories to characterize and model uncertainty in properties and behavior of additively manufactured (AM) materials. Uncertainty in additively manufactured material behavior arises from manufacturing defects and microstructure heterogeneity. Proper characterization of this uncertainty, which varies over length scales and amongst different materials, is necessary for the design and analysis of AM components. The objective of the project was to develop data-driven constitutive models of polycrystalline metals that capture the variability of the material characteristics. The project ultimately aims to use both high-throughput experimental data and synthetic data from crystal-plasticity simulations to train neural network models to predict nonlinear, evolving plastic properties in AM materials.
Dimensionality Reduction of Neural Data by Canonical Polyadic Tensor Decomposition
Alexander Williams, Stanford University
Practicum Year: 2016
Practicum Supervisor: Tamara Kolda, Data Science and Cyber Analytics, Sandia National Laboratories, California
Modern technologies enable neuroscientists to record from many hundreds of neurons for weeks or even months. Commonly used methods for dimensionality reduction identify low-dimensional features of within-trial neural dynamics, but do not model changes in neural activity across trials. We represent multi-trial data as a three-dimensional data array (a third-order tensor) with each entry indicating the activity of a particular neuron at a particular time on a particular trial, and use canonical polyadic (CP) tensor decomposition to find low-dimensional signatures of within- and across-trial changes in neural dynamics. This approach produces more informative descriptions of experimental and synthetic multi-trial neural data than classic techniques such as principal and independent components analysis (PCA and ICA).
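The tensor model described above can be sketched in a few lines of Python. The example below fits a rank-2 CP model to a synthetic neuron x time x trial array by alternating least squares. It is a minimal sketch under stated assumptions: the function names, array sizes, and noiseless synthetic data are hypothetical, and alternating least squares is only one common way to fit a CP model, not necessarily the method used in this work.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product; rows are indexed by (row of B, row of C)."""
    return np.einsum('jr,kr->jkr', B, C).reshape(-1, B.shape[1])

def cp_als(X, rank, n_iter=200, seed=0):
    """Fit X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r] by alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A, B, C = (rng.standard_normal((n, rank)) for n in (I, J, K))
    X1 = X.reshape(I, -1)                        # mode-1 unfolding, columns (j, k)
    X2 = X.transpose(1, 0, 2).reshape(J, -1)     # mode-2 unfolding, columns (i, k)
    X3 = X.transpose(2, 0, 1).reshape(K, -1)     # mode-3 unfolding, columns (i, j)
    for _ in range(n_iter):
        # With two factors fixed, the third solves a linear least squares problem.
        A = np.linalg.lstsq(khatri_rao(B, C), X1.T, rcond=None)[0].T
        B = np.linalg.lstsq(khatri_rao(A, C), X2.T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(A, B), X3.T, rcond=None)[0].T
    return A, B, C

# Synthetic data: 20 neurons, 30 time points, 15 trials, two underlying components.
rng = np.random.default_rng(1)
true_factors = [rng.standard_normal((n, 2)) for n in (20, 30, 15)]
X = np.einsum('ir,jr,kr->ijk', *true_factors)

A, B, C = cp_als(X, rank=2)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
rel_err = np.linalg.norm(X_hat - X) / np.linalg.norm(X)
```

In this layout each column of `A` is a weighting over neurons, each column of `B` a within-trial time course, and each column of `C` a profile across trials.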
Disambiguation of data error and model error: new algorithms for uncertainty quantification
Jason Bender, University of Minnesota
Practicum Year: 2015
Practicum Supervisor: Dr. Habib N. Najm, Distinguished Member of the Technical Staff, Combustion Research Facility - Reacting Flow, Sandia National Laboratories, California
Uncertainty quantification (UQ) has become a crucial part of computational science in recent years. Engineering and scientific models rely on inputs that rarely can be determined exactly. A principal goal of UQ is to determine how uncertainty in those inputs propagates into uncertainty in the outputs. Often, models contain numerical parameters that must be specified from experimental data. This calibration procedure arises, for example, when constructing detailed models of combustion chemical kinetics. There are two sources of error in such problems: the "data" error resulting from imperfect measurements and instrument noise, and the "model" error resulting from the fact that the theoretical model is an inherent simplification of physical reality. Current UQ methods do not treat model error in a consistent way. For my practicum at Sandia National Laboratories in Livermore, CA, I worked with Dr. Habib N. Najm and Dr. Khachik Sargsyan to develop and analyze algorithms to separate data and model error in calibration problems. The research built on earlier work by Sargsyan, Najm, and Ghanem (Int. J. Chem. Kinet. 47, 246 (2015)), which considered representative cases with model error but no data error. Their approach consisted of embedding a probabilistic structure (based on polynomial chaos expansions) in the model parameters vector M; then, using a Bayesian formulation and Markov chain Monte Carlo (MCMC) techniques, a new parameters vector P was calculated, describing the probability density function for M. During my time at Sandia, I extended this strategy to include cases with data noise, and I explored several different constructions of the MCMC likelihood function. The approach proved successful in several important prototypical cases. While important questions remain, we feel that we made significant progress in understanding this problem of disambiguating data and model error.
Creating a geometry-dependent volume meshing algorithm for use in an automated mesh resolution study
Michael Rosario, Duke University
Practicum Year: 2013
Practicum Supervisor: Sarah Scott, R&D S&E Mechanical Engineer, Thermal and Fluid Processes, Sandia National Laboratories, California
I was in charge of creating an algorithm that would mesh complicated objects incorporating geometric information while requiring only limited input from the user. The main challenge of this project was to use geometric features of the object to determine the variable mesh sizes throughout the entire object. In addition to automatically recognizing regions that would benefit from finer or coarser mesh sizes, the code also created smooth transitions between these different mesh sizes. With a code that could complete these tasks, we then automatically meshed an object at differing levels of fineness and coarseness in order to test whether our meshes were fine enough to capture the dynamics of heat transfer throughout the object.
Analysis of ensemble classification algorithms
Miles Lopes, University of California, Berkeley
Practicum Year: 2012
Practicum Supervisor: Philip Kegelmeyer, Senior Scientist, Informatics and Decision Sciences, Sandia National Laboratories, California
Within the field of machine learning, the topic of classification deals with the task of using labeled examples to predict the labels of new objects (e.g. classifying spam email). A widely used approach to classification is to collect the predictions of many classification algorithms, and then take the majority vote of those predictions. Methods relying on this principle are commonly known as "ensemble methods". In general, the accuracy of this kind of procedure tends to converge to a particular value once a large number of votes has been collected. However, each vote incurs a computational cost, and it is desirable to collect the smallest number of votes that is needed to achieve an accuracy that is close to the limiting value. Our project focused on the problem of finding an optimal number of votes. More specifically, we studied the rate at which the accuracy of ensemble algorithms converges as the number of votes increases. As the project progressed, we were able to calculate precisely what this rate is, and we performed simulations to validate the consequences of our analysis. Currently, we are working to incorporate our theoretical results into software that can be used by practitioners.
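The convergence of majority-vote accuracy with the number of votes can be illustrated with a toy calculation. The sketch below assumes that votes are independent and that test points fall into two hypothetical groups, one where each vote is correct with probability 0.7 and one where it is correct with probability 0.4. These numbers and function names are assumptions for illustration, and the independence assumption is a simplification that real ensembles generally do not satisfy.

```python
from math import comb

def majority_vote_accuracy(n_votes, p_correct):
    """Probability that a strict majority of n independent votes is correct."""
    k_min = n_votes // 2 + 1
    return sum(
        comb(n_votes, k) * p_correct**k * (1 - p_correct) ** (n_votes - k)
        for k in range(k_min, n_votes + 1)
    )

def ensemble_accuracy(n_votes, groups):
    """Average accuracy over groups of (fraction of test points, per-vote accuracy)."""
    return sum(frac * majority_vote_accuracy(n_votes, p) for frac, p in groups)

# Hypothetical population: 80% "easy" points, 20% "hard" points.
groups = [(0.8, 0.7), (0.2, 0.4)]

# With infinitely many votes, a point is classified correctly exactly when
# its per-vote accuracy exceeds one half.
limit = sum(frac for frac, p in groups if p > 0.5)

# Gap between finite-ensemble accuracy and the limiting accuracy (odd sizes only).
gaps = {n: abs(ensemble_accuracy(n, groups) - limit) for n in (1, 5, 25, 101, 501)}
```

Inspecting `gaps` shows how many votes are needed before the accuracy is within a chosen tolerance of its limiting value, which is the trade-off the project studied.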
Efficiently Inferring Optimal and Near-Optimal Approximations for Large Stochastic Dynamical Systems
Christopher Quinn, University of Illinois at Urbana-Champaign
Practicum Year: 2012
Practicum Supervisor: Ali Pinar, Technical Staff, Information & Decision Sciences Dept., Sandia National Laboratories, California
For this practicum I worked on algorithms to efficiently approximate the structure of large graphs representing networks of stochastic processes. For instance, in a large social network people might have hundreds of friends, though only a few close friends. These algorithms could be useful to identify the most important friends, in an efficient manner.
Systems Dynamics Analysis of Biomass to Energy Conversion
Aaron Sisto, Stanford University
Practicum Year: 2012
Practicum Supervisor: Daniel Dedrick, Manager, Hydrogen and Combustion Technologies, Combustion Research Facility, Sandia National Laboratories, California
Although analyses to date have examined the theoretical limitations of biomass-to-energy in terms of potential resource availability, tradeoffs of biomass conversion into transportation fuel versus electricity have not been thoroughly investigated. Existing studies have focused on energy crops and cellulosic residues for biomass-to-energy inputs, but few have investigated waste streams as biomass resources. Municipal solid waste (MSW) is a low-cost waste resource with an existing, well defined supply infrastructure and does not compete for land area or food supply, making it a potentially attractive, scalable, renewable feedstock. MSW chemical composition is geographically and annually variable in terms of C-N ratio, cellulosic content, water and soil content, and other factors that affect its conversion chemistry. Existing MSW-to-energy studies are limited to the efficiency of converting MSW to electricity, and ignore the resulting impact on land availability from MSW diversion and changes in landfill greenhouse gases (GHGs). We propose a system dynamics modeling approach to examine the interdependencies between MSW composition and chemical conversion to fuels versus electricity. Key metrics will include energy output, displacement of fossil energy, reduction in landfill use, and GHG impacts of these pathways.
Making tensor factorizations robust against non-Gaussian noise
Eric Chi, Rice University
Practicum Year: 2010
Practicum Supervisor: Tamara Kolda, Principal Member of Technical Staff, Informatics and Decision Sciences, Sandia National Laboratories, California
Tensors are multi-way arrays, and the CANDECOMP/PARAFAC (CP) tensor factorization has found application in many different domains, ranging from chemistry and psychology to neuroimaging. The CP model is typically fit using a least squares objective function, which yields a maximum likelihood estimate under the assumption of i.i.d. Gaussian noise. This loss function can be highly sensitive to non-Gaussian noise. Therefore, we investigated a loss function based on the 1-norm because it can accommodate both Gaussian and grossly non-Gaussian perturbations.
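The sensitivity of least squares to gross outliers, and the robustness of a 1-norm loss, can be seen in the simplest possible setting of fitting a single constant to data. The sketch below is a scalar analogue chosen for illustration, not the tensor algorithm from this work, and the data values are made up. It uses the standard facts that the mean minimizes the sum of squared residuals and the median minimizes the sum of absolute residuals.

```python
import statistics

def squared_loss(c, data):
    """Least squares objective for fitting the constant c."""
    return sum((x - c) ** 2 for x in data)

def absolute_loss(c, data):
    """1-norm objective for fitting the constant c."""
    return sum(abs(x - c) for x in data)

clean = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
corrupted = clean + [50.0]                       # one gross, non-Gaussian outlier

least_squares_fit = statistics.mean(corrupted)   # minimizer of squared_loss
one_norm_fit = statistics.median(corrupted)      # minimizer of absolute_loss
```

The single outlier drags `least_squares_fit` far from the bulk of the data, while `one_norm_fit` stays at a typical clean value.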
Modeling Variable-Density Turbulence in One-Dimension
Curtis Hamman, Stanford University
Practicum Year: 2009
Practicum Supervisor: Alan Kerstein, Deputy Director, Combustion Research Facility, Sandia National Laboratories, California
A turbulence model was compared against existing direct numerical simulations.
Investigation of the intrusive polynomial chaos eigenstructure of random dynamical systems
Benjamin Sonday, Princeton University
Practicum Year: 2009
Practicum Supervisor: Habib Najm, Technical Staff, Reacting Flow Research, Sandia National Laboratories, California
Random dynamical systems can be written intrusively in terms of polynomial chaos. An N-dimensional dynamical system, for instance, can be expanded in terms of polynomial chaos of order P, yielding an N(P+1)-dimensional dynamical system which is now deterministic. The eigenstructure of this new, "intrusive" system may differ from that of the original system, however.
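A small worked example may make the expansion concrete. The sketch below takes the scalar (N = 1) equation dx/dt = -(k0 + k1*xi) x with a standard normal xi, projects it onto probabilists' Hermite polynomials up to order P, and builds the resulting (P+1)-dimensional deterministic linear system. The test equation, parameter values, and function name are assumptions chosen for illustration and are not taken from the project itself.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def intrusive_jacobian(k0, k1, order):
    """Galerkin system matrix for dx/dt = -(k0 + k1*xi) x with xi ~ N(0, 1).

    Uses xi*He_n = He_{n+1} + n*He_{n-1}, so the equation for mode i couples
    to mode i-1 with weight 1 and to mode i+1 with weight i+1.
    """
    size = order + 1
    M = np.zeros((size, size))
    for i in range(size):
        if i > 0:
            M[i, i - 1] = 1.0
        if i + 1 < size:
            M[i, i + 1] = i + 1.0
    return -(k0 * np.eye(size) + k1 * M)

k0, k1, order = 1.0, 0.5, 3
J = intrusive_jacobian(k0, k1, order)
eigenvalues = np.sort(np.linalg.eigvals(J).real)

# For this linear test problem the eigenvalues are the decay rates evaluated
# at the Gauss-Hermite nodes of degree P+1.
nodes, _ = hermegauss(order + 1)
expected = np.sort(-(k0 + k1 * nodes))
```

The deterministic equation with the mean rate k0 has the single eigenvalue -k0, whereas the expanded system has P+1 distinct eigenvalues, so the eigenstructure has indeed changed.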
Viewing Organic Behavior through the Lens of Computation
Arnab Bhattacharyya, Massachusetts Institute of Technology
Practicum Year: 2008
Practicum Supervisor: Rob Armstrong, Researcher, Principal Member, Dept. 8961 - High Performance Computing Research, Sandia National Laboratories, California
The goal of the research was to explore how the structure of a complex system affects the robustness of that system in response to perturbations in its environment. To make the problem tractable, and allow us to bring well-developed analytical tools to bear on the problem, we chose to study the structure-robustness relationship in the context of Boolean networks. Boolean networks are graphs in which each node has a Boolean state (0 or 1) that changes at the next timestep according to some function of the current state and the state of the node's neighbors. While simple, Boolean networks can model complicated dynamics of interest in fields such as biology, epidemiology, electrical engineering, computer science, etc. In particular, VLSI chips composed of digital logic circuits can be represented directly as Boolean networks. Our research has explored what constraints on the structure of Boolean networks lead to robustness in the presence of environmental perturbations. We believe that these results can be generalized to show constraints on the structure of real-world complex systems that show robustness in the presence of environmental change, such as social organizations (including terrorist networks), biological organisms, computer networks, etc. Our results can hopefully inform efforts to model such systems. Furthermore, we believe our research results might have direct applicability to the problem of designing logic hardware, software, and protocols that resist faults and attacks.
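The Boolean network model described above is simple to simulate. The following Python sketch builds a random network, updates all nodes synchronously, and measures how far a single flipped bit spreads after a number of timesteps. It is a minimal sketch under stated assumptions: the random wiring, the random truth tables, and all function names are hypothetical choices for illustration, not the constructions studied in the project.

```python
import random

def make_network(n_nodes, k_inputs, rng):
    """Random Boolean network: each node reads k inputs through a random truth table."""
    inputs = [rng.sample(range(n_nodes), k_inputs) for _ in range(n_nodes)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)] for _ in range(n_nodes)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every node from the current state."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for j in node_inputs:
            index = (index << 1) | state[j]      # pack input bits into a table index
        new_state.append(table[index])
    return new_state

def perturbation_spread(state, inputs, tables, flip, n_steps):
    """Hamming distance between the original and one-bit-flipped trajectories."""
    perturbed = list(state)
    perturbed[flip] ^= 1
    for _ in range(n_steps):
        state = step(state, inputs, tables)
        perturbed = step(perturbed, inputs, tables)
    return sum(a != b for a, b in zip(state, perturbed))

rng = random.Random(0)
inputs, tables = make_network(n_nodes=50, k_inputs=2, rng=rng)
initial = [rng.randint(0, 1) for _ in range(50)]
spread = perturbation_spread(initial, inputs, tables, flip=0, n_steps=20)
```

A small value of `spread` means the network absorbed the perturbation, and a large value means it amplified it, which gives one crude measure of robustness for a given structure.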
Dimensionality reduction for large scale uncertainty quantification
James Martin, University of Texas
Practicum Year: 2008
Practicum Supervisor: Youssef M. Marzouk, Dr., Biological & Energy Sciences Center, Sandia National Laboratories, California
This was a very open-ended research project, and work is still ongoing. Computational uncertainty quantification (UQ) is a problem of great interest in scientific problems, and of critical importance in engineering and design certification. Unfortunately, many of the most straightforward approaches (e.g. polynomial chaos methods) to these problems scale very poorly both in the number of uncertain parameters which must be quantified, and in the polynomial order of approximation of these parameters. We thus seek to find a reduced basis of the most important modes, which we can then use to solve a smaller, but still representative, UQ problem. Previous work simply uses the most significant modes from the prior probability distribution, but takes no account of the physical model underlying the problem, and hence produces (often extremely) inefficient representations of the parameter space, thus requiring substantial computational effort. This summer, we worked on methods to include local gradient and Hessian information from the physical model with these significant prior modes in order to determine the ideal basis in which we will learn the most new information by solving the UQ problem.
Analysis and Reduction of Chemical Models Under Uncertainty
Geoffrey Oxberry, Massachusetts Institute of Technology
Practicum Year: 2008
Practicum Supervisor: Habib Najm, Technical Staff, Combustion Research Facility, Sandia National Laboratories, California
While models of combustion processes have been successful in developing engines with improved fuel economy, more costly simulations are required to accurately model pollution chemistry. These simulations will also involve significant parametric uncertainties. Computational singular perturbation (CSP) and polynomial chaos-uncertainty quantification (PC-UQ) can be used to mitigate the additional computational cost of modeling combustion with uncertain parameters. PC-UQ was used to interrogate and analyze the Davis-Skodje model, where the deterministic parameter in the model was replaced with an uncertain parameter. In addition, PC-UQ was combined with CSP to explore how model reduction could be combined with uncertainty quantification to understand how reduced models are affected by parametric uncertainty.
Electronic nature of looped carbon nanotubes
Aron Cummings, Arizona State University
Practicum Year: 2005
Practicum Supervisor: Francois Leonard, Senior Member of Technical Staff, Nanoscale Science and Technology Department, Sandia National Laboratories, California
A process known as alternating-current dielectrophoresis (ACDEP) has been used to deposit carbon nanotubes onto metallic contacts. It has been observed that when ACDEP is used, loops are sometimes formed in the nanotubes. The purpose of this project was to use computer simulations to investigate the charge and electric potential profiles of the looped carbon nanotube structure in order to gain a better understanding of the effects the loop may have on the electronic behavior of the structure as a whole.
Towards the Understanding of Silicon Micro-Needle Failure in a Biological MEMS Device
Alex Lindblad, University of Washington
Practicum Year: 2005
Practicum Supervisor: Jonathan Zimmerman, Sandia National Laboratories, California
The goal of this research was to determine why silicon micro-needles repeatedly fail during insertion into brain cells of the sea slug, Tritonia diomedea, by modeling the interaction between the needle and the cell. Once there was a better understanding of the reasons the needles fail, recommendations could be made to improve the fabrication process and/or the insertion process.
Alex Lindblad, University of Washington
Practicum Year: 2005
Practicum Supervisor: Jaideep Ray, Technical Staff, Advanced Software Research and Development, Sandia National Laboratories, California
The goal of the project was to develop a parallel environment to model the spread of a biological attack in a specific population. The long term goal is to have the capability to rapidly predict how a biological attack will affect a target city, and how to determine the source of an attack once an outbreak occurs.
Application of One-Dimensional Turbulence (ODT) to Large Eddy Simulation Bulk Flow Closure
Randall McDermott, University of Utah
Practicum Year: 2002
Practicum Supervisor: Alan Kerstein, Distinguished Member of Technical Staff, Combustion Research Facility, Sandia National Laboratories, California
To date the One-Dimensional Turbulence (ODT) model of Kerstein has found use in several classes of flows, including shear layers, boundary layers, jets, and even astrophysical flows such as Rayleigh-Bénard convection in stars. Most recently ODT was used by Schmidt et al. as the near-wall closure in a channel flow Large Eddy Simulation (LES). This project develops a framework which extends ODT to the bulk flow closure in LES. The resulting method is quite different from typical LES. The standard LES momentum equations are not integrated. Instead, a new set of equations is formulated and evolved on three separate orthogonal grids. The method resembles a dimensional splitting approach, with strict adherence to finite volume filtering operations. The ODT field, which is a good approximation to the surface filtered subgrid field, is then explicitly filtered to the LES scale. This operation occurs at a time scale commensurate with the LES. The LES velocities are then corrected to satisfy continuity via the pressure Poisson equation. The ODT subgrid field is brought into conformity with the new LES velocity to begin the next sequence of ODT evolutions.
Finite Elements in Hartree-Fock Theory
Christopher Rinderspacher, University of Georgia
Practicum Year: 2002
Practicum Supervisor: Matthew Leininger, PhD, Group 8915, Sandia National Laboratories, California
The Hartree-Fock equations were implemented in a finite element approach using the Poisson equation to solve for the electron-electron interactions.
Applying ODT to the Dredge-up Problem in Novae
Lewis Dursi, University of Chicago
Practicum Year: 2000
Practicum Supervisor: Alan Kerstein, Sandia National Laboratories, California
A nova is thought to result from a partially degenerate thermonuclear runaway on the surface of a white dwarf. The runaway occurs in material, primarily hydrogen, accreted from a companion star, but material from the underlying star must be 'dredged up' into the atmosphere to produce an explosion with the observed energies; the carbon and oxygen serve as catalysts for the hydrogen burning in the atmosphere. However, the physical mechanism for the dredge-up is unknown. This summer, I extended an ODT code written by Scott Wunsch and Alan Kerstein. ODT is a One-Dimensional Turbulence model developed by Alan Kerstein at Sandia National Labs which contains much of the physics of a turbulent cascade, and has been used to explore turbulent mixing phenomena under a wide range of conditions. I worked to allow its use in modeling a nova atmosphere, and began using it as a mixing model so that we can explore a wide range of physical scenarios for possible dredge-up.
Co-Planar Crack Interaction in Cleaved Mica
Judith Hill, Carnegie Mellon University
Practicum Year: 2000
Practicum Supervisor: Patrick Klein, Sandia National Laboratories, California
The interaction of co-planar cracks in layered muscovite mica has been studied numerically with a cohesive zone approach. With a double-cantilevered beam geometry, an end crack is propagated and interacts with an internal stationary crack created by an inclusion. The instability and eventual coalescence of the two cracks have been modeled and compared with existing experimental measurements to provide validation of the model.
Convergence Analysis and Troubleshooting of MPJet, a Low Mach Number Reacting Flow Code
Susanne Essig, Massachusetts Institute of Technology
Practicum Year: 1999
Practicum Supervisor: Dr. Habib Najm, Ph.D., Sandia National Laboratories, California
During my practicum at the Combustion Research Facility at SNL/CA, I worked with a reacting flow code called MPJet, developed to study two-dimensional turbulent jet diffusion flames consisting of a jet fuel stream surrounded by co-flow air. MPJet is a massively parallel coupled Eulerian-Lagrangian code which uses a finite difference scheme with adaptive mesh refinement for solving the scalar conservation equations, and the vortex method for the momentum equation, along with the necessary coupling terms. My primary task was to perform a convergence analysis to verify that the order of convergence for this code was as expected, and, if not, to investigate sources of error.
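A convergence analysis of this kind typically compares errors at successive grid refinements and checks the observed order against the expected one. The Python sketch below shows the calculation on a stand-in problem, a second-order central difference applied to sin(x), since the MPJet code and its test cases are not described here. The function names, test function, and grid spacings are assumptions for illustration only.

```python
import math

def observed_order(errors, spacings):
    """Observed convergence order between each pair of successive refinements."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(spacings[i] / spacings[i + 1])
        for i in range(len(errors) - 1)
    ]

def central_difference_error(h, x=1.0):
    """Error of the central difference approximation to d/dx sin(x) at x."""
    approx = (math.sin(x + h) - math.sin(x - h)) / (2 * h)
    return abs(approx - math.cos(x))

# Halve the grid spacing four times and measure the error at each level.
spacings = [0.1 / 2**k for k in range(5)]
errors = [central_difference_error(h) for h in spacings]
orders = observed_order(errors, spacings)
```

If the entries of `orders` settle near the scheme's formal order (2 for this stand-in), the implementation converges as expected. A persistent shortfall points to an error source worth investigating.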
The Study of SiH2 Interacting with NH3 and PH3 Using the BAC-MP4 Method
Michael Mysinger, Stanford University
Practicum Year: 1997
Practicum Supervisor: Dr. Carl Melius, Sandia National Laboratories, California
We studied the thermodynamics and kinetics of SiH2 + NH3 and subsequent reactions, as well as SiH2 + PH3 and subsequent reactions for comparison. We used the BAC-MP4 method developed by Carl Melius to do first-principles calculations.
Modelling the Dynamics of Computer Networks
Robert Fischer, Harvard University
Practicum Year: 1996
Practicum Supervisor: Dr. Peter Dean, Networks Division, Sandia National Laboratories, California
I set a project and a goal: to develop a practical network traffic source model. I independently researched long-range statistical analysis towards this goal, and came close to a working model in the summer. I continued the research to reach the goal in November.
Direct numerical simulation (DNS) of turbulent combustion
Erik Monsen, Stanford University
Practicum Year: 1992
Practicum Supervisor: Dr. Jacqueline Chen, Combustion Research Facility, Sandia National Laboratories, California
Direct numerical simulation (DNS) of turbulent combustion has recently become an important computational approach used to understand fundamental turbulence-chemistry interactions in premixed and nonpremixed flames. The full compressible Navier-Stokes equations along with a reduced chemical mechanism are solved with no turbulence models. All of the relevant spatial and temporal scales are resolved in the computation. Large numerical databases created from the simulations can be used to study the effects of heat release, finite-rate chemical kinetics, and differential diffusion of species and to validate existing turbulent combustion models. The simulations are computationally intensive, requiring significant amounts of memory and CPU time. The simulations are currently being run on a Cray Y-MP machine. To approach Reynolds numbers more characteristic of those in experiments and to incorporate chemical kinetics more representative of hydrocarbon fuels, it would be desirable to port the DNS code to massively parallel machines like the Intel iPSC/860 at Sandia National Laboratories and at NASA Ames Research Center. The conversion should be relatively straightforward. Similar DNS codes have already been converted at NASA Ames.