Getting at the Basis of Protein-Protein Recognition Through Multivariate Analysis

Julian Mintseris, Boston University

Understanding the basis of protein-protein interaction and recognition is an important part of the overall picture of biologically relevant functional relationships. Efforts to compare and classify different types of protein complexes have been hindered by the lack of an appropriate measure of similarity between them. Here we show that sequence similarity between the interacting protein partners is not always appropriate for this purpose, and we introduce an intuitive measure of structural similarity for protein interfaces. We use this similarity measure for unsupervised clustering of all protein complexes in the PDB, and then apply factor analysis and discriminant analysis to two classification problems. For discriminating homodimers from crystal contacts, we used an existing dataset and obtained better results than published methods that use only structural information. We also obtained good results for classifying recognition (transient) versus folding (permanent) complexes, and identified the most discriminating features.
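To illustrate how a pairwise similarity measure can drive unsupervised clustering, here is a minimal sketch in Python. It is not the method used in this work: the abstract specifies neither the interface similarity measure nor the clustering algorithm. So average-linkage agglomerative clustering, the function name average_linkage_clusters, the threshold parameter, and the toy similarity values are all assumptions made for illustration only.

```python
def average_linkage_clusters(similarity, threshold):
    """Agglomerative clustering on a symmetric pairwise similarity matrix.

    Repeatedly merges the two clusters with the highest average pairwise
    similarity, and stops when no pair of clusters reaches `threshold`.
    (Average linkage is an assumed choice, not taken from the abstract.)
    """
    # Start with every item (e.g. one protein interface) in its own cluster.
    clusters = [[i] for i in range(len(similarity))]

    def avg_sim(a, b):
        # Mean similarity over all cross-cluster pairs.
        return sum(similarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best_score, best_pair = float("-inf"), None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                score = avg_sim(clusters[x], clusters[y])
                if score > best_score:
                    best_score, best_pair = score, (x, y)
        if best_score < threshold:
            break  # the remaining clusters are too dissimilar to merge
        x, y = best_pair
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return sorted(clusters)


# Toy similarity matrix for four hypothetical interfaces (values invented).
toy_similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
clusters = average_linkage_clusters(toy_similarity, threshold=0.5)
print(clusters)
```

In this toy case the first two and the last two items form separate groups, because their cross-group average similarity falls below the chosen threshold. With real data, the threshold and the linkage rule would need to be chosen to suit the similarity measure.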

