It took Brett Larsen a while to find his research niche. He entered Arizona State University set on math or science, but most of his early experiences only showed him what he did not want to do.
Larsen studied low-power sensor and circuit design, interning at Sandia National Laboratories in New Mexico, but found he was more interested in extracting signals from his prototypes' noisy data. “It showed me that I wanted to go more in a math-simulation focus,” says Larsen, now a Department of Energy Computational Science Graduate Fellowship (DOE CSGF) recipient.
A year overseas at the University of Cambridge cemented Larsen’s interest in a tool for analyzing massive data: optimization, the math of most efficiently minimizing or maximizing a quantity. He’d already been accepted for Ph.D. study at Stanford University, but his work on identifying groups in large, complex data helped focus his goals.
In his research with Shaul Druckmann, Larsen combines math with biophysics and neuroscience to better understand and apply neural networks, brain-circuit-inspired machine-learning methods. Such algorithms identify characteristics in known data and use them to execute previously unseen tasks, such as finding objects in a random environment. But researchers don’t fully understand how neural networks work, limiting their use in studying biological brains.
Larsen investigates a neuroscience conundrum: the prevalence of brain cells with recurrent connections that loop back on themselves and others. Computational recurrent neural networks primarily tackle time-oriented tasks, but recurrent connections are found even in brain regions unassociated with temporal-type processing. Larsen and his colleagues want to grasp why this is so and explore what problems recurrent networks are best at solving.
The researchers trained recurrent neural networks with varying architectures to detect connections between two points in an image, then compared their performance. What the team learned provided principles for identifying jobs the networks can do well.
“Recurrence is nice for tasks that can be expressed as the composition of a bunch of simple functions,” he says. Importantly, these tasks needn’t be composed over time, so recurrent networks could work well for non-temporal problems.
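The image-connectivity task above has this compositional character: a recurrent network can decide whether two points are linked by applying one simple operation over and over. Here is a minimal NumPy sketch of that idea, not the team's actual network; the single repeated step grows a reachable region by one pixel per iteration, and connectivity falls out of composing that same step many times.

```python
import numpy as np

def step(reach, image):
    # One "simple function": grow the reachable set by one pixel
    # in each of the four grid directions, staying inside the figure.
    grown = reach.copy()
    grown[1:, :] |= reach[:-1, :]
    grown[:-1, :] |= reach[1:, :]
    grown[:, 1:] |= reach[:, :-1]
    grown[:, :-1] |= reach[:, 1:]
    return grown & image

def connected(image, a, b, steps=None):
    # Recurrently composing the same step decides connectivity.
    steps = steps or image.size
    reach = np.zeros_like(image, dtype=bool)
    reach[a] = image[a]
    for _ in range(steps):
        reach = step(reach, image)
    return bool(reach[b])

img = np.array([[1, 1, 0],
                [0, 1, 0],
                [1, 0, 1]], dtype=bool)
print(connected(img, (0, 0), (1, 1)))  # True: a path of 1s links the pixels
print(connected(img, (0, 0), (2, 2)))  # False: the bottom-right 1 is isolated
```

The key point is that nothing here is temporal: the loop index is just depth of composition, which is why recurrence can pay off on static problems like this.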
This and other research assume neural networks behave optimally, efficiently exploring possible solutions for the best answer. Each combination of network configuration and input data has a loss value measuring network performance. The optimal pairing minimizes this loss.
Optimization can be described as changing network parameters to move through a landscape with peaks and valleys. The algorithm seeks the fastest course to the lowest point, representing the smallest loss. But it can get stuck in a valley, fooling users into thinking that’s the best solution when another is lower. The algorithm also could land on a saddle point, where only a few of the many directions lead down. Larsen, Druckmann and postdoctoral researcher Abbas Kazemipour analyze the landscape to verify that no saddle points or false valleys exist or to tailor algorithms so they avoid them.
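A toy calculation makes the saddle-point hazard concrete. The sketch below, an illustration rather than any of the group's actual analyses, runs plain gradient descent on the textbook saddle f(x, y) = x² − y²: a trajectory started exactly on the x-axis stalls at the saddle, while a tiny perturbation escapes downhill.

```python
import numpy as np

# Toy loss landscape with a saddle point at the origin:
# f(x, y) = x**2 - y**2 (uphill along x, downhill along y).
def grad(p):
    return np.array([2 * p[0], -2 * p[1]])

def descend(p, lr=0.1, steps=100):
    # Plain gradient descent: follow the steepest downhill direction.
    for _ in range(steps):
        p = p - lr * grad(p)
    return p

# Started exactly on the x-axis, the trajectory stalls at the saddle.
stuck = descend(np.array([1.0, 0.0]))
# A tiny perturbation off the axis lets it slide away along y.
escaped = descend(np.array([1.0, 1e-6]))
```

In higher dimensions the same thing happens when only a few of many directions lead down, which is why the landscape analysis matters.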
Meanwhile, Larsen continues a project from his 2018 practicum with mathematician Tamara Kolda at Sandia California. It focused on large, multidimensional data structures called tensors. In one test case, Larsen and Kolda used information from the website Reddit in a three-dimensional tensor with billions of entries, each corresponding to the co-occurrence of an individual user, a specific word and a particular subreddit.
To analyze multidimensional data, mathematicians use tensor decomposition algorithms, finding hidden structures that more simply represent the information. “More importantly, it’s an unsupervised way to find trends. You don’t specify exactly what you’re looking for, but these factors pop out,” Larsen says.
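The structure such a decomposition finds can be shown with a small hypothetical example. In the canonical polyadic (CP) model, a common form of tensor decomposition, each entry is a sum over a few rank-one components, and each component pairs a group of users with the words they favor and the subreddits where they post. The sketch below, with made-up dimensions, builds a tiny user-by-word-by-subreddit tensor from two such components.

```python
import numpy as np

# Hypothetical factors for R = 2 latent "trends" (illustrative sizes).
rng = np.random.default_rng(0)
U = rng.random((4, 2))   # 4 users x 2 components
W = rng.random((5, 2))   # 5 words x 2 components
S = rng.random((3, 2))   # 3 subreddits x 2 components

# CP model: T[i, j, k] = sum_r U[i, r] * W[j, r] * S[k, r]
T = np.einsum('ir,jr,kr->ijk', U, W, S)
print(T.shape)  # (4, 5, 3)
```

Decomposition algorithms run this construction in reverse: given only T, they recover factors like U, W and S, and those factors are the trends that "pop out" without anyone specifying what to look for.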
Algorithms for finding these decompositions are infeasible when the tensor is large and the data are sparse, with mostly blank entries. Larsen and Kolda are developing ways to apply randomized algorithms, extending to sparse tensors techniques that previously worked only on dense tensors, in which most entries have data.
The technique samples rows from the tensor, solving a smaller version of each iteration's subproblem while bounding how far its solution can deviate from that of the original problem. The method assigns a score to the rows of the factors used to model the tensor, making it more likely to sample the rows that most affect accuracy.
The approach should work well even on tensors too large to fit on a single computer. Larsen has tested it on Kahuna, a Sandia cluster. He’s working on a paper about the research.
Besides letting Larsen explore a different optimization area, the practicum provided a better sense of work the national laboratories do. He hopes to snag a postdoctoral post at one or an industry position after graduation.
Image caption: Brett Larsen’s optimization research explores neural network configurations to find one with the lowest loss value. By changing the network parameters, the algorithm explores the landscape of possible loss values in search of the lowest point. The trajectories could land on a saddle point, shown here, where only a few of the many directions lead down. Larsen and his colleagues analyze the landscape to verify that no saddle points or false valleys exist or to tailor algorithms so they avoid them. Credit: Brett Larsen.