Thursday, October 06, 2011

Kinect Object Datasets: Berkeley's B3DO, UW's RGB-D, and NYU's Depth Dataset

Why Kinect?
The Kinect, made by Microsoft, is becoming a common item in Robotics and Computer Vision research.  While the Robotics community has been using the Kinect as a cheap stand-in for a laser range sensor for obstacle avoidance, the vision community has been excited about using the Kinect's 2.5D data (a depth value for every pixel) for object detection and recognition.  The possibility of building object recognition systems which have access to pixel features as well as 2.5D features is truly exciting for the vision hacker community!
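For readers new to the term, here is a minimal sketch of what "2.5D" means in practice: each pixel carries a depth value, and with a pinhole camera model you can back-project it into a 3D point.  The intrinsics below (FX, FY, CX, CY) are assumed placeholder values for illustration, not the calibration of any of the datasets mentioned in this post, and the function name is my own.

```python
# Assumed pinhole intrinsics, for illustration only (not a real calibration).
FX = FY = 525.0
CX, CY = 319.5, 239.5


def backproject(depth, fx=FX, fy=FY, cx=CX, cy=CY):
    """Turn a depth image (rows of depths in metres, 0 = no reading)
    into a list of (x, y, z) points in the camera frame."""
    points = []
    for v, row in enumerate(depth):        # v = pixel row
        for u, z in enumerate(row):        # u = pixel column
            if z <= 0:                     # skip pixels with no depth reading
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 "depth image" with two valid readings.
tiny_depth = [[0.0, 1.0],
              [2.0, 0.0]]
cloud = backproject(tiny_depth)
```

Only surfaces visible from the camera get a point, which is why people say 2.5D rather than full 3D.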

Berkeley's B3DO
First of all, it looks like the Berkeley Vision Group has jumped on the Kinect bandwagon.  Their data collection effort will be crowdsourced, so they need your help!  They need you to use your Kinect to capture your own home/office environments and upload the captures to their servers.  This way, a very large dataset will be collected, and we, the vision hackers, can use machine learning techniques to learn what sofas, desks, chairs, monitors, and paintings look like.  The Berkeley hackers have a paper on this at one of the ICCV 2011 workshops in Barcelona.  Here is the paper information:

A Category-Level 3-D Object Dataset: Putting the Kinect to Work
Allison Janoch, Sergey Karayev, Yangqing Jia, Jonathan T. Barron, Mario Fritz, Kate Saenko, Trevor Darrell
ICCV-W 2011

UW's RGB-D Object Dataset
On another note, if you want to use 3D for your own object recognition experiments, then you might want to check out the following dataset: University of Washington's RGB-D Object Dataset.  With this dataset you'll be able to compare against UW's current state-of-the-art results.

In this dataset you will find RGB+Kinect3D data for many household items taken from different views.  Here is the really cool paper which got me excited about the RGB-D Dataset:
A Scalable Tree-based Approach for Joint Object and Pose Recognition
Kevin Lai, Liefeng Bo, Xiaofeng Ren, and Dieter Fox
In the Twenty-Fifth Conference on Artificial Intelligence (AAAI), August 2011.

NYU's Depth Dataset
I have to admit that I did not know about this dataset (created by Nathan Silberman of NYU) until after I blogged about the other two datasets.  However, the internet is great, and only a few hours after I posted this short blog post, somebody let me know that I had left out this really cool NYU dataset.  Check out the NYU Depth Dataset homepage.  In fact, it looks like this particular dataset might be at the LabelMe level regarding dense object annotations, but with accompanying Kinect data.  Rob Fergus & Co strike again!

Nathan Silberman, Rob Fergus. Indoor Scene Segmentation using a Structured Light Sensor. To appear in the ICCV 2011 Workshop on 3D Representation and Recognition.