Proc. of International Conference on Robotics and Automation (ICRA), 2012
Abstract
We propose a view-based approach for labeling
objects in 3D scenes reconstructed from RGB-D (color+depth)
videos. We utilize sliding window detectors trained from object
views to assign class probabilities to pixels in every RGB-D
frame. These probabilities are projected into the reconstructed
3D scene and integrated using a voxel representation. We
perform efficient inference on a Markov Random Field over the
voxels, combining cues from view-based detection and 3D shape,
to label the scene. Our detection-based approach produces
accurate scene labeling on the RGB-D Scenes Dataset and
improves the robustness of object detection.
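
As an illustration of the projection and integration step, the following is a minimal sketch and not the authors' implementation. It assumes that each pixel has already been back-projected to a 3D point in world coordinates, and that a voxel combines the class probabilities falling into it by averaging their logarithms and renormalizing; the abstract does not specify the combination rule. The names `voxel_key` and `integrate_frames` and the 1 cm voxel size are hypothetical. The Markov Random Field inference is not shown.

```python
import math
from collections import defaultdict


def voxel_key(point, voxel_size):
    """Map a 3D point (x, y, z) to integer voxel indices."""
    return tuple(int(math.floor(c / voxel_size)) for c in point)


def integrate_frames(frames, num_classes, voxel_size=0.01):
    """Accumulate per-pixel class probabilities into voxels.

    frames: iterable of frames; each frame is a list of
            (point_3d, class_probs) pairs, with point_3d already
            expressed in world coordinates (an assumption of this sketch).
    Returns a dict mapping voxel index -> normalized class distribution.
    """
    log_sums = defaultdict(lambda: [0.0] * num_classes)
    counts = defaultdict(int)

    for frame in frames:
        for point, probs in frame:
            key = voxel_key(point, voxel_size)
            for c, p in enumerate(probs):
                # Clamp to avoid log(0) for classes with zero probability.
                log_sums[key][c] += math.log(max(p, 1e-9))
            counts[key] += 1

    voxels = {}
    for key, sums in log_sums.items():
        # Assumed rule: average the log-probabilities, then renormalize.
        mean_logs = [s / counts[key] for s in sums]
        peak = max(mean_logs)
        exps = [math.exp(v - peak) for v in mean_logs]
        total = sum(exps)
        voxels[key] = [e / total for e in exps]
    return voxels


# Two frames observe the same voxel with different class probabilities.
frames = [
    [((0.001, 0.002, 0.003), [0.8, 0.2])],
    [((0.004, 0.001, 0.002), [0.6, 0.4])],
]
voxels = integrate_frames(frames, num_classes=2)
```

Averaging in log space keeps a voxel seen from many views from becoming overconfident simply because it was observed more often; other combination rules would fit the same interface.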