Thursday, January 12, 2006

recognition as segmentation across time

Let me start out my brief discussion with a segmentation result:

The image is the New Year's Party picture that I posted a few posts ago. This is the result of Felzenszwalb's graph-based segmentation executable. I ran the executable many times, uniformly sampling the input parameter space, and selected one output image that looked particularly nice.
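For the curious, the parameter sweep can be sketched roughly as below. This is only a sketch, not the script I actually ran: the parameter ranges, the file names, and the helper names `sample_parameters` and `build_commands` are all made up for illustration, and the argument order (sigma, k, min size, input, output) is my assumption about how the executable is invoked. The sketch only builds the command lines; actually running them is left as a comment.

```python
import random


def sample_parameters(n, seed=0):
    """Draw n (sigma, k, min_size) triples uniformly at random.

    The ranges below are illustrative assumptions, not recommended values.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        sigma = rng.uniform(0.3, 1.5)     # Gaussian smoothing width (assumed range)
        k = rng.uniform(100.0, 1500.0)    # merge threshold scale (assumed range)
        min_size = rng.randint(20, 500)   # smallest allowed region, in pixels (assumed range)
        samples.append((sigma, k, min_size))
    return samples


def build_commands(samples, input_ppm="party.ppm", exe="./segment"):
    """Build one command line per parameter triple.

    The executable path, file names, and argument order are assumptions.
    """
    commands = []
    for i, (sigma, k, min_size) in enumerate(samples):
        output_ppm = f"seg_{i:03d}.ppm"
        commands.append(
            [exe, f"{sigma:.2f}", f"{k:.0f}", str(min_size), input_ppm, output_ppm]
        )
    return commands


if __name__ == "__main__":
    for cmd in build_commands(sample_parameters(5)):
        print(" ".join(cmd))
        # To actually run each one: subprocess.run(cmd, check=True)
```

Picking the "particularly nice" result is still done by eye: look through the output files and keep the one you like.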

Segmentation is generally referred to as a mid-level process, i.e., a process that groups together similar regions. It is still not a high-level process, though, because it doesn't use any information from other images. When the concept of segmentation was introduced into the vision community, researchers thought that it would be a good pre-processing step that would further aid object recognition. However, the community quickly realized that an object-consistent segmentation is only possible after the objects have been identified in the image!

Image segmentation is still very popular as of 2006; however, the traditional definition of segmentation as something you do before recognition is slowly becoming outdated. Modern research on segmentation has a significant object-recognition feel to it, and one interesting question that remains is: how does one incorporate information from a large set of images to segment a single image?

Every problem in computer vision can be solved with an object recognition module; unfortunately, recognition is the most difficult problem of all. Computer vision is not simply image processing. Computer vision strives to build machines that can make synthetic a posteriori statements about the world. If Immanuel Kant were alive today, he would be a vision hacker.