Joy Yang

Massachusetts Institute of Technology

Pac-Man opened the door to Joy Yang’s computational science career.

Dan Klein at the University of California, Berkeley, used the video game to teach an artificial intelligence class Yang took as an undergraduate. Writing code that helped the hungry yellow dot learn how to navigate mazes and avoid ghosts taught her about algorithms, probability and more, awakening her to limitless possibilities.

“It was at that point that I began to understand what computers are capable of,” says Yang, now a doctoral candidate in computational and systems biology at the Massachusetts Institute of Technology and a Department of Energy Computational Science Graduate Fellowship recipient. She switched to a statistics major by the end of the semester.

After graduation, Julia Segre at the National Institutes of Health hired Yang for her statistical training and put her to work developing a way to cheaply type microbe strains in order to track infections that spread through hospitals. Yang knew she would go to graduate school but liked using her education immediately. “It really takes implementing a method for me to truly internalize it and believe that it works,” Yang says. “The application is really important to me.”

Yang’s doctoral research fits the bill. With advisor Martin Polz, she uses computing to understand how viruses called phages infect bacteria and use them to reproduce. What she learns could have wide impact.

One thimbleful of seawater contains 10 million phages and 1 million bacteria, Yang says. This scale, spread through vast oceans, makes the tiny organisms big players in our ecosystem and the carbon cycle. Knowledge of host-phage interactions also could have implications for combatting antibiotic resistance and improving human health.

Yang’s research revolves around a collection of 243 bacterial strains and 241 viruses. Using the decoded genomes of each, “you can very easily predict the infections,” Yang says, but the data are highly structured and contain many confounding variables, “so finding the actual pathways that allow these infections to happen is harder.”

It’s a complex problem. Between the number of phages and bacteria and the genes in each, Yang could examine around 10 million parameter combinations. She’s seeking computational techniques to make the problem manageable. “We’re still brainstorming to come up with the best way to do this,” she says. Because the researchers work with the bacteria and phages and have the strains available, they also can use lab experiments to validate the model’s findings.

A second part of Yang’s work zooms in on a particular question: why some phages carry their own transfer RNA (tRNA), a key molecule in translating genetic instructions into the proteins that conduct cell functions. “Usually viruses have no reason to carry their own tRNA because they can hijack their host machinery,” she says. She’s statistically comparing phage tRNA with bacterial tRNA to find answers.

Yang’s 2016 Lawrence Berkeley National Laboratory practicum also had two parts, both related to genetics and a revolutionary engineering technique: CRISPR, which enlists genetic machinery bacteria use to defend their DNA from foreign DNA, such as that from a virus. CRISPR can be programmed to target particular DNA segments, letting scientists edit genes at specific locations.

LBNL researcher Adam Arkin’s lab used the technique to block one gene at a time across the entire genome in their model organism, Escherichia coli bacteria. The modified organisms were cultured to observe what defects, if any, each exhibited, providing clues to which genes are essential to the germ’s function. Yang analyzed this genome-wide experiment, seeking relationships between genes and characteristics in the resulting organisms. Besides providing gene function data, the results also could help design CRISPR tools that target genes more accurately.

The practicum also put Yang in the laboratory, where she explored using CRISPR to modify phage genes. She designed tools to target genes in phages that commonly infect E. coli and tested to see if the changes decreased infection rates. More work is needed to see if the method could be effective.

Yang has often applied what she learned during her practicum to overcome roadblocks in her thesis research. “I’m grateful to the DOE CSGF for providing me opportunities to see many different approaches and for pushing me to learn what I came to graduate school to learn,” she says.

Yang expects to graduate in 2019 and hopes to find a postdoctoral fellowship with an applied statistics research group.

Image caption: Vibriophage infection network. The outer ring depicts bacteria and the nodes in the interior depict the phages. If a phage is able to infect a bacterium, the two nodes are connected. Credit: From Kauffman KM, Hussain FA, Yang JY, Arevalo P, Brown JM, Chang WK, VanInsberghe D, Elsherbini J, Sharma RS, Cutler MB, Kelly L, Polz MF. "A major lineage of nontailed dsDNA viruses as unrecognized killers of marine bacteria." Nature (2018).
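For readers curious how a network like the one in the image might be represented in code, here is a minimal sketch. It is not the researchers' method or data: the function name `build_infection_network` and the toy phage and strain labels are hypothetical, chosen only to illustrate the idea that a phage and a bacterium are connected when the phage can infect it.

```python
from collections import defaultdict


def build_infection_network(infections):
    """Build a two-sided (bipartite) network from observed infections.

    `infections` is an iterable of (phage, bacterium) pairs, one pair per
    observed infection. Returns two dictionaries: phage -> set of hosts,
    and bacterium -> set of phages that infect it.
    """
    phage_to_hosts = defaultdict(set)
    host_to_phages = defaultdict(set)
    for phage, bacterium in infections:
        # An edge connects the two nodes only when infection is possible.
        phage_to_hosts[phage].add(bacterium)
        host_to_phages[bacterium].add(phage)
    return dict(phage_to_hosts), dict(host_to_phages)


# Hypothetical toy data, not taken from the study.
toy_infections = [
    ("phage_A", "strain_1"),
    ("phage_A", "strain_2"),
    ("phage_B", "strain_2"),
]
phage_to_hosts, host_to_phages = build_infection_network(toy_infections)
```

Storing the edges in both directions makes it easy to ask either question: which strains a given phage infects, or which phages attack a given strain. A strain that no phage infects simply does not appear as a key in the host-side dictionary.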