Tuesday, May 20, 2008

dude, where's my image?

Check out IM2GPS: estimating geographic information from a single image. This is CVPR 2008 work by James Hays and Alexei Efros. Some crazy titles that were suggested to James can also be seen on his project site -- some of them are rather funny too!

Anyway, you can just read his abstract and browse his results if you are interested in the kind of computer vision research that uses millions of images. The basic idea is to predict the location of an image using only the information embedded inside the image (and a training set of over 6 million geo-tagged Flickr images).
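To make the basic idea concrete, here is a minimal sketch, assuming the simplest possible approach: look up the most similar geo-tagged training image and borrow its GPS tag. This is not the authors' actual pipeline. The feature vectors, the toy training set, and the function names are all made up for illustration.

```python
import math

# Hypothetical toy training set: (feature vector, (latitude, longitude)).
# In a real system the features would be computed from image content and
# there would be millions of entries; these three are invented.
TOY_TRAINING_SET = [
    ([0.9, 0.1, 0.0], (48.86, 2.35)),
    ([0.1, 0.8, 0.1], (40.71, -74.01)),
    ([0.0, 0.2, 0.9], (-33.87, 151.21)),
]

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def estimate_location(query_features, training_set):
    """Return the GPS tag of the training image closest to the query."""
    _, location = min(
        training_set,
        key=lambda item: euclidean(query_features, item[0]),
    )
    return location

# The query is closest to the first toy entry, so it inherits that tag.
guess = estimate_location([0.85, 0.15, 0.05], TOY_TRAINING_SET)
```

The point of the sketch is only that the geographic label comes entirely from the training data; nothing about the query image other than its visual features is used.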

Saturday, May 17, 2008

what is recognition?

I want to briefly discuss what the terms recognition, classification, and categorization mean to me and how they relate to fields such as computer vision, machine learning, and psychology.

From my understanding, "category" == "class" and thus categorization and classification are the same thing! It is correct to say that when we categorize, we affix a label to some entity. But these labels do refer to categories, or classes. One can attribute the popularity of the term "classification" to the field of machine learning. Categorization is a term that was more heavily used in psychology and has only recently started popping up in computer vision papers.

Because I see classification and categorization as the same thing, I don't agree that only one can be hierarchical.

Regarding the term recognition, the answer is a bit more complicated. In the field of computer vision, when people say that they are interested in recognition, they usually mean recognizing novel instances from some predefined list of classes. To stress the interest in discriminating between a large number of object classes, vision researchers have recently begun using terms such as "a visual categorization system" or talking about "object class recognition."

Then there is the term "identification." In all the places where I have seen it pop up, it refers to specific instances. A face identification system might be designed to find faces of George Bush and might work on top of a face-class recognition system. The problem is that early work in computer vision was usually concerned with a fixed number of objects and the goal was to find those exact object instances inside an image -- and this was referred to simply as "recognition." Nowadays, we often use the term "recognition" to refer to category-level recognition and not to specific objects.

In conclusion, recognition is a very general term that has been applied to both category-level recognition (dog vs. cat vs. car vs. person) and recognition of specific object instances (this particular blue ball vs. this particular face). To be more precise, one can use the terms "category-level recognition" and "identification."
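The distinction can be sketched in a few lines of code. This is a toy illustration only, assuming nearest-neighbor matching over made-up two-dimensional features; the gallery entries and function names are hypothetical. Identification here runs on top of category-level recognition, as in the face example above.

```python
import math

# Hypothetical gallery: (feature vector, category label, instance name).
GALLERY = [
    ([1.0, 0.0], "face", "person_A"),
    ([0.9, 0.2], "face", "person_B"),
    ([0.0, 1.0], "ball", "blue_ball"),
    ([0.2, 0.9], "ball", "red_ball"),
]

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize_category(features):
    """Category-level recognition: which class does this belong to?"""
    nearest = min(GALLERY, key=lambda entry: distance(features, entry[0]))
    return nearest[1]

def identify_instance(features):
    """Identification: which specific known object is this?

    First recognizes the category, then compares only against
    gallery entries of that category.
    """
    category = recognize_category(features)
    candidates = [entry for entry in GALLERY if entry[1] == category]
    nearest = min(candidates, key=lambda entry: distance(features, entry[0]))
    return category, nearest[2]
```

With this toy gallery, `recognize_category` answers "face vs. ball," while `identify_instance` answers "which face" or "which ball."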

This post was written in response to Vidit Jain's blog post titled "Etymology of common learning-related words such as recognize."