Pacific Northwest National Laboratory
Harvesting the Fruits of the Genome Revolution
An underlying principle of genetics, she explains, is that genes that lie very close to one another on a chromosome often are controlled by a single regulatory element and are turned on or off in concert. Therefore, understanding where the genes that encode proteins are located in relation to one another on a chromosome can provide valuable insight into their control, or co-expression.
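The proximity principle described above can be illustrated with a minimal Python sketch. It is not how PQuad works internally; it simply groups neighboring genes on the same strand into candidate co-regulated clusters. The gene names, coordinates, and the 50-base-pair gap threshold are all hypothetical values chosen for illustration.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str    # hypothetical gene label
    start: int   # first base of the gene on the chromosome
    end: int     # last base of the gene on the chromosome
    strand: str  # "+" or "-"


def cluster_by_proximity(genes: List[Gene], max_gap: int = 50) -> List[List[Gene]]:
    """Group genes into candidate co-regulated clusters.

    Two consecutive genes join the same cluster when they sit on the
    same strand and the gap between them is at most max_gap bases.
    The threshold is an illustrative assumption, not a published value.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    clusters: List[List[Gene]] = []
    for gene in ordered:
        if clusters:
            prev = clusters[-1][-1]
            if gene.strand == prev.strand and gene.start - prev.end <= max_gap:
                clusters[-1].append(gene)
                continue
        clusters.append([gene])
    return clusters


# Made-up coordinates for four hypothetical genes.
example_genes = [
    Gene("geneA", 100, 400, "+"),
    Gene("geneB", 420, 900, "+"),
    Gene("geneC", 2000, 2500, "+"),
    Gene("geneD", 2510, 3000, "-"),
]
example_clusters = [[g.name for g in c] for c in cluster_by_proximity(example_genes)]
```

Here geneA and geneB fall into one cluster because they are only 20 bases apart on the same strand, while geneC and geneD stay separate: geneC is far from geneB, and geneD lies on the opposite strand.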
“PQuad is nice in that the developers have recognized that people are looking for ways to highlight the relevant information,” Adkins says. “PQuad is good in that it will do comparisons between experiments and place the results over sequence information. By doing this, you can then go over it visually and then by a color map as you scan the genome you can quickly recognize areas of interest.”
As the computational biologists worked with the biologists, they realized that making PQuad useful would require both groups to collaborate closely and to learn something of each other's field.
That’s where Department of Energy Computational Science Graduate Fellowship alumnus Chris Oehmen helped accelerate the process, Webb-Robertson says. Oehmen’s training in modeling biological systems (during his practicum, he built heart function models) came into play as he helped with user evaluation and further development of PQuad.
“Understanding the language of biology has made it really easy for me to talk to people in biology,” says Oehmen. “I know what a protein is and how it is made. I can kind of be the middle person. I can talk to the software design people, I can talk to the people who are creating the data, and I can understand both sides enough to keep things moving along.”
But Does It Work?
The PQuad team has released the software for public use at http://ncrr.pnl.gov/software/software_register.asp?id=23 . [Editor’s note: since this article was written, PQuad has been replaced by VESPA (Visual Exploration and Statistics for Proteomics Analyses) which is available at http://omics.pnl.gov/software/VESPA.php .] Now, they are eager to show what it can do. Oehmen and Webb-Robertson, along with PNNL colleagues Adkins, microbiologist Lee Ann McCue, and programmers, data analysts, mathematicians, and graphics experts, collaborated to enter the Supercomputing 2006 analytics challenge. The challenge lets researchers showcase computationally intensive applications that use high-performance computing, networking, and visualization to solve real-world, complex problems.
For their demonstration problem, the research team analyzed the biochemical pathways that the bacterium Salmonella uses to produce toxins that cause food poisoning. The project addresses the laboratory’s mission to understand the fundamental biochemical pathways of microorganisms. The group used proteomic datasets from samples taken under growth conditions designed to mimic environments in which the organism produces or doesn’t produce toxins. The full set of proteins produced under each of the two growth conditions was first digested by enzymes into small pieces called peptides. The peptides were then aerosolized and injected into a mass spectrometer, which records the mass-to-charge ratio of each peptide fragment.
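The digestion and measurement steps can be sketched in a few lines of Python. The article does not name the enzyme, so this sketch assumes trypsin, a common choice in proteomics, which cuts after lysine (K) or arginine (R) unless the next residue is proline (P). The function names and the example sequence are hypothetical; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of the H2O added to a free peptide's termini
PROTON = 1.00728   # mass of the proton that carries each positive charge


def digest(protein: str) -> list:
    """Split a protein into peptides using an assumed trypsin rule:
    cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # trailing piece with no cut site
    return peptides


def mass_to_charge(peptide: str, charge: int = 1) -> float:
    """Return the m/z of a peptide carrying `charge` extra protons."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


# Hypothetical sequence: the K followed by P is not cut, the R is.
example_peptides = digest("AAKPGRMK")
example_mz = {p: mass_to_charge(p, charge=2) for p in example_peptides}
```

A real instrument reports many peaks per peptide (isotopes, several charge states, fragment ions); this sketch computes only the single theoretical value that analysis software would try to match against those peaks.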