Efficient Matching for Recognition and Retrieval

Kristen Grauman, Massachusetts Institute of Technology


Local image features have emerged as a powerful way to describe images of objects and scenes. Their stability under variable image conditions is critical for success in a wide range of recognition and retrieval applications. However, comparing images represented by their collections of local features is challenging, since the sets may vary in cardinality and their elements lack a meaningful ordering. Existing methods compare feature sets by searching for explicit correspondences between their elements, which is too computationally expensive in many realistic settings.

I will present the pyramid match, which efficiently forms an implicit partial matching between two sets of feature vectors. The matching has linear time complexity, naturally forms a Mercer kernel, and is robust to clutter or outlier features, a critical advantage for handling images with variable backgrounds, occlusions, and viewpoint changes. I will show how this dramatic gain in efficiency enables accurate and flexible image comparisons to be made on large-scale data sets, and removes the need to artificially limit the size of images’ local descriptions. As a result, we can now access a new class of applications that relies on the analysis of rich visual data, such as place or object recognition and metadata labeling. I will provide results on several important vision tasks, including our algorithm’s state-of-the-art recognition performance on a challenging data set of object categories.
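
To make the idea of an implicit partial matching concrete, here is a minimal sketch of one way a pyramid-style match score could be computed. It is an illustration under stated assumptions, not the implementation from the talk: the function names (`histogram`, `intersection`, `pyramid_match`), the restriction to non-negative integer coordinates, the unshifted grids whose cell side doubles at each level, and the 1/side weighting are all choices made here for simplicity. Refinements that a full implementation would likely need, such as normalizing for set size, are omitted.

```python
from collections import Counter
from math import ceil, log2


def histogram(points, side):
    """Count points per grid cell of the given side length."""
    return Counter(tuple(int(c) // side for c in p) for p in points)


def intersection(h1, h2):
    """Histogram intersection: how many points can be paired within shared cells."""
    return sum(min(n, h2[cell]) for cell, n in h1.items())


def pyramid_match(xs, ys, diameter):
    """Unnormalized pyramid-style match score between two feature sets.

    Assumptions of this sketch: features are tuples of non-negative
    integers smaller than `diameter`, and grids are not shifted.
    """
    levels = ceil(log2(diameter)) + 1  # coarsest level puts everything in one cell
    score = 0.0
    prev = 0  # matches already counted at finer levels
    for i in range(levels):
        side = 2 ** i
        cur = intersection(histogram(xs, side), histogram(ys, side))
        # Only matches that are new at this level count, and they are
        # down-weighted because coarser cells imply looser matches.
        score += (cur - prev) / side
        prev = cur
    return score


if __name__ == "__main__":
    a = [(0, 0), (5, 5)]
    b = [(0, 0), (6, 6), (15, 15)]  # (15, 15) plays the role of clutter
    print(pyramid_match(a, b, diameter=16))
```

No explicit correspondence search takes place: each level only bins the points and intersects two histograms, so the work per level grows linearly with the number of features. The unmatched point in `b` never contributes to the score, which is the sense in which the matching is partial.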

Abstract Author(s): Kristen Grauman