Hierarchical Matching Pursuit for Recognition: Architecture and Fast Algorithms
Advances in Neural Information Processing Systems (NIPS) 2011
Abstract
Extracting good representations from images is essential for many
computer vision tasks. In this paper, we propose hierarchical
matching pursuit (HMP), which builds a feature hierarchy
layer-by-layer using an efficient matching pursuit encoder. It
includes three modules: batch (tree) orthogonal matching pursuit,
spatial pyramid max pooling, and contrast normalization. We
investigate the architecture of HMP, and show that all three
components are critical for good performance. To speed up the
orthogonal matching pursuit, we propose a batch tree orthogonal
matching pursuit that is particularly suitable for encoding a large
number of observations that share the same large dictionary. HMP is
scalable and can efficiently handle full-size images. In addition,
HMP enables linear support vector machines (SVMs) to match the
performance of nonlinear SVMs while being scalable to large
datasets. We compare HMP with many state-of-the-art algorithms
including convolutional deep belief networks, SIFT-based single-layer
sparse coding, and kernel-based feature learning. HMP consistently
yields superior accuracy on three types of visual recognition
problems: object recognition (Caltech-101), scene recognition
(MIT-Scene), and static event recognition (UIUC-Sports).
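
The three modules named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it uses plain greedy orthogonal matching pursuit rather than the batch tree variant proposed in the paper, and it pools over a single cell rather than a full spatial pyramid. The function names (omp, max_pool, contrast_normalize), the choice to pool absolute values, and the eps parameter are assumptions made for illustration.

```python
import numpy as np

def omp(D, y, k):
    """Plain greedy OMP (not the batch tree variant): approximate y
    with at most k columns (atoms) of the dictionary D."""
    x = np.zeros(D.shape[1])
    residual = np.asarray(y, dtype=float).copy()
    support = []
    for _ in range(k):
        # Select the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Re-fit all selected coefficients jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        x = np.zeros(D.shape[1])
        x[support] = coef
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    return x

def max_pool(codes):
    """Max pooling over one spatial cell (assumed: on absolute values).
    codes has shape (n_patches, n_atoms)."""
    return np.max(np.abs(codes), axis=0)

def contrast_normalize(f, eps=0.1):
    """Divide a pooled feature by its regularized L2 norm (eps is assumed)."""
    return f / np.sqrt(np.sum(f ** 2) + eps)

# Usage: encode a few random patches against a random unit-norm dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)
patches = rng.standard_normal((5, 16))
codes = np.stack([omp(D, p, k=4) for p in patches])
feature = contrast_normalize(max_pool(codes))
```

Each patch is encoded with at most k nonzero coefficients, the codes in a cell are reduced to one vector by max pooling, and the result is rescaled so that features from cells with different contrast are comparable.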