Sunday, June 22, 2008

Going to Anchorage

I am going to Anchorage in a couple of hours! CVPR 2008 should be really fun.

On another note, yesterday I spent much of the day in beautiful San Francisco. I definitely need to make a few more visits there.

Thursday, June 05, 2008

Computer Vision Summer Internship at Google

I am going to Mountain View, California next week to start my three-month summer internship at Google. While I don't know the specific details of what I'll be working on, I will be working with the Computer Vision group there.

Here is an idea: imagine making sense of the billions of objects embedded in the images of the Google Street View database. Google is already blurring faces in these images -- which means they are running vision algorithms on this dataset -- but are Google researchers finding makes/models of cars, reading street signs, analyzing building facades to see which homes are Victorian/ranch/etc., aligning visual information with Google Maps, etc.?

Google Street View is an excellent portal from the machine to the world. If there is ever any hope of visual recognition happening on a robot, it will have to happen at Google first, initially by using immense computational power. If that works, why not outsource visual recognition capabilities to a company like Google? Imagine a little computer onboard your favorite humanoid robot communicating with Google's servers via some standard recognition API. What the robot sees is sent over to Google for analysis -- then 'image understanding' data is propagated back. I imagine such a service could be set up, and for a fairly low price.
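To make the round trip a bit more concrete, here is a minimal sketch of what such an exchange might look like. Everything in it is hypothetical: no such Google API exists as far as I know, and the JSON fields (robot_id, image_b64, objects) and the function names are made up purely for illustration. The "server" is a local stand-in that does no recognition at all; it only shows the shape of the request and the reply.

```python
import base64
import json


def build_request(image_bytes: bytes, robot_id: str) -> str:
    """Package one camera frame as a JSON request (hypothetical schema)."""
    return json.dumps({
        "robot_id": robot_id,
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
    })


def fake_recognition_server(request_json: str) -> str:
    """Local stand-in for the remote service; a real one would run vision models."""
    request = json.loads(request_json)
    image = base64.b64decode(request["image_b64"])
    # Placeholder "image understanding": echo the payload size and a canned label.
    return json.dumps({
        "robot_id": request["robot_id"],
        "num_bytes": len(image),
        "objects": [{"label": "unknown", "confidence": 0.0}],
    })


def parse_response(response_json: str) -> list:
    """Pull the list of detected objects out of the reply."""
    return json.loads(response_json)["objects"]


if __name__ == "__main__":
    frame = b"\x00\x01\x02"  # pretend this is an encoded camera frame
    reply = fake_recognition_server(build_request(frame, "robot-1"))
    print(parse_response(reply))
```

In a real deployment the call to fake_recognition_server would be a network request, and the robot would have to cope with latency and dropped connections, which this sketch ignores.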

Wednesday, June 04, 2008

Shimon Edelman: Constraints on the nature of the neural representation of the visual world

I came across a nice short article written by Shimon Edelman titled "Constraints on the nature of the neural representation of the visual world." Shimon got his PhD from the Weizmann Institute of Science in 1988 under the guidance of the renowned Prof. Shimon Ullman and is now a Professor of Psychology at Cornell. If you are a vision hacker -- a code-writing, graphical-model-advocating graduate student in computer science -- then you might ask yourself: Why should I care what psychologists/philosophers have to say about vision?

The problem is that overfitting to what is currently *hot* at CVPR isn't very productive if you want to solve big problems. Philosophers, psychologists, roboticists, and cognitive neuroscientists have a lot to say about vision and offer plenty of ideas as to what they expect to see in a successful vision system. To a CS graduate student, solving "the problem of computer vision" might seem like a rather grand goal; however, scientists from these other fields suggest that a pure CS approach is unlikely to get the glory.

The paper brings up several concepts: ontological strategy, context, inherent ambiguities in segmentation, ineffability of the visual world, and multidimensional similarity space. I think looking at vision from a philosophical point of view is not only enlightening, but also suggests that what we should be after is more than just solving the problem of computer vision. What does it mean to solve the problem of computer vision after all? What we should be after is a theory of intelligence -- a theory of mind -- and we should strive to build truly intelligent machines.