Caitlin Whitter, Purdue University

Photo of Caitlin Whitter

Graphs are popular data structures for describing the topology of non-linear structures. Graph kernels, functions that quantify the similarity between two graphs, have become a widely used approach in machine learning for performing classification tasks on graphs.

Domains such as cheminformatics have large quantities of structured data that lend themselves naturally to graph representations and the graph kernel framework. For example, a marginalized graph kernel-based machine learning pipeline can learn and predict the atomization energy of molecules with high accuracy (Y.H. Tang, W.A. de Jong, 2019).

Computing the marginalized graph kernel amounts to solving a linear system of equations. However, as graphs scale in size, these calculations can become prohibitively expensive. Using an iterative method like conjugate gradient alone may not result in fast convergence.

We examine several preconditioning techniques for efficiently computing the marginalized graph kernel. To compare these methods, we perform experiments on graphs from the QM7 molecular data set (L.C. Blum, J.L. Reymond, 2009; M. Rupp, A. Tkatchenko, K.R. Müller, O.A. von Lilienfeld, 2012) and report the results.

Abstract Author(s): Caitlin Whitter