Getting the Most Out of Statistical Potentials: Optimized Representations for Protein-Protein Interactions and Fold Recognition

Julian Mintseris, Boston University

Knowledge-based statistical potentials have been used extensively and successfully in biophysical studies such as protein threading, folding, protein-protein interactions, and small-molecule drug design. Choosing a protein representation is the first step in working with statistical potentials, and many different atom type classifications have been suggested based on a variety of chemical, physical, and biological assumptions. Here we show that all such “hand-made” schemes, including the standard amino acid representation, are suboptimal. Drawing direct parallels between protein structure energetics, information theory, and statistical theory, we derive an optimal reduced protein representation for an arbitrary number of atom types. Using a set of protein decoys, we demonstrate that fold recognition results can be improved dramatically simply by changing the way the problem is represented. For protein-protein interactions, we show a marked improvement in docking results with a representation optimized on protein complex interfaces. A detailed look at several representations also provides insights into the nature of protein structures and protein-protein interfaces.
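The parallel between statistical potentials and information theory can be illustrated with a small sketch. It is not the procedure used in this work, only one plausible reading of it. It assumes a contact-count table between atom types, scores a representation by the mutual information of that table, and greedily looks for the pair of types whose merger loses the least information. The function names (`mutual_information`, `merge_types`, `best_merge`) and the toy counts are hypothetical and chosen purely for illustration.

```python
import math


def mutual_information(counts):
    """Mutual information (in nats) of a square contact-count table.

    counts[i][j] is the number of observed contacts between atom types
    i and j. Each term p_ij * ln(p_ij / (p_i * p_j)) is the negative of a
    typical log-odds contact potential, weighted by the contact frequency.
    """
    n = len(counts)
    total = sum(sum(row) for row in counts)
    row_p = [sum(counts[i]) / total for i in range(n)]
    col_p = [sum(counts[i][j] for i in range(n)) / total for j in range(n)]
    mi = 0.0
    for i in range(n):
        for j in range(n):
            p = counts[i][j] / total
            if p > 0:
                mi += p * math.log(p / (row_p[i] * col_p[j]))
    return mi


def merge_types(counts, a, b):
    """Return a new table in which atom types a and b (a < b) are one type."""
    n = len(counts)
    keep = [k for k in range(n) if k != b]

    def members(k):
        # The merged type occupies the slot of a and pools a and b.
        return [a, b] if k == a else [k]

    return [
        [sum(counts[i][j] for i in members(r) for j in members(c)) for c in keep]
        for r in keep
    ]


def best_merge(counts):
    """Find the pair of types whose merger loses the least mutual information."""
    n = len(counts)
    base = mutual_information(counts)
    best = None
    for a in range(n):
        for b in range(a + 1, n):
            loss = base - mutual_information(merge_types(counts, a, b))
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best


# Hypothetical toy data: types 0 and 1 have identical contact preferences.
toy_counts = [
    [10, 10, 40],
    [10, 10, 40],
    [40, 40, 20],
]
loss, a, b = best_merge(toy_counts)
print(f"merge types {a} and {b}; information lost = {loss:.3g} nats")
```

In the toy table, types 0 and 1 make contacts in the same proportions, so merging them costs essentially no information, while merging either with type 2 does. Repeating such a step until a target number of types remains is one simple way to reach a reduced representation of a given size; an optimal scheme may need a more thorough search than this greedy one.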

Abstract Author(s): Julian Mintseris, Rong Chen, and Zhiping Weng