A Scalable Tree-based Approach for Joint Object and Pose Recognition
Twenty-Fifth Conference on Artificial Intelligence (AAAI), 2011
Abstract
Recognizing possibly thousands of objects is a crucial capability for an
autonomous agent to understand and interact with everyday environments.
Practical object recognition comes in multiple forms: Is this a coffee
mug? (category recognition). Is this Alice's coffee mug? (instance
recognition). Is the mug's handle facing left or right? (pose
recognition). We present a scalable framework, Object-Pose Tree, which
efficiently organizes data into a semantically structured tree. The
tree structure enables both scalable training and testing, allowing us to
solve recognition over thousands of object poses in near real-time. Moreover, by
simultaneously optimizing all three tasks, our approach outperforms standard
nearest-neighbor and 1-vs-all classifiers, with large improvements on pose
recognition. We evaluate the proposed technique on a dataset of 300
household objects collected using a Kinect-style 3D camera. Experiments
demonstrate that our system achieves robust and efficient object category,
instance, and pose recognition on challenging everyday objects.
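
To make the idea of a semantically structured tree concrete, the following is a minimal sketch of a category, instance, and pose hierarchy with greedy top-down inference. The abstract does not specify how nodes are scored or trained, so the per-node linear scorers, the greedy descent, and all names here (Node, score, descend, the toy labels and weights) are illustrative assumptions rather than the method of the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    weights: List[float]  # assumed linear scorer; not taken from the paper
    children: List["Node"] = field(default_factory=list)


def score(node: Node, x: List[float]) -> float:
    """Dot product between a node's weights and the feature vector."""
    return sum(w * xi for w, xi in zip(node.weights, x))


def descend(root: Node, x: List[float]) -> List[str]:
    """Greedy descent: at each level keep only the best-scoring child."""
    path = []
    node = root
    while node.children:
        node = max(node.children, key=lambda c: score(c, x))
        path.append(node.label)
    return path


def poses() -> List[Node]:
    # Toy pose level: the third feature separates the two handle directions.
    return [Node("handle_left", [0.0, 0.0, 1.0]),
            Node("handle_right", [0.0, 0.0, -1.0])]


# Toy tree: root -> category -> instance -> pose.
mug = Node("mug", [1.0, 0.0, 0.0], [
    Node("alice_mug", [0.0, 1.0, 0.0], poses()),
    Node("bob_mug", [0.0, -1.0, 0.0], poses()),
])
bowl = Node("bowl", [-1.0, 0.0, 0.0], [
    Node("bowl_1", [0.0, 1.0, 0.0], [Node("upright", [0.0, 0.0, 1.0])]),
])
root = Node("root", [], [mug, bowl])

result = descend(root, [0.9, 0.4, -0.7])
print(result)  # ['mug', 'alice_mug', 'handle_right']
```

In this sketch a query only evaluates the children along one root-to-leaf path instead of scoring every pose of every object, and a single descent yields the category, instance, and pose labels together.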