Friday, June 30, 2006

congrats to "Putting Objects in Perspective" and Geometric Context

Congratulations to Derek, Alexei, and Martial for getting their work Slashdotted and winning this year's Best Paper Award at CVPR in New York City. This work reinforces the fact that Carnegie Mellon University (especially The Robotics Institute) is the place you want to be if you want to study Computer Vision (and/or Machine Learning).

The work which won the Best Paper Award at CVPR is titled "Putting Objects in Perspective".

Quoting Derek's project description, "Image understanding requires not only individually estimating elements of the visual world but also capturing the interplay among them. We provide a framework for placing local object detection in the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint. Most object detection methods consider all scales and locations in the image as equally likely. We show that with probabilistic estimates of 3D geometry, both in terms of surfaces and world coordinates, we can put objects into perspective and model the scale and location variance in the image. Our approach reflects the cyclical nature of the problem by allowing probabilistic object hypotheses to refine geometry and vice-versa. Our framework allows painless substitution of almost any object detector and is easily extended to include other aspects of image understanding."

The Slashdot story (June 14th) can be found here:

Researchers Teach Computers To Perceive 3D from 2D

Tuesday, June 27, 2006

what is segmentation?

According to the computer vision community, a segmentation is a disjoint partition of an image into K regions. Popular segmentation strategies include (but are not limited to) normalized cuts, graph cuts, mean-shift, and watershed. Researchers sometimes use the outputs of these 'segmentation engines' in the middle of their own algorithms. However, according to Jitendra Malik (paraphrased from CVPR 2006), segmentation is the result of recognition and not the disjoint partition that is returned by your favorite 'segmentation engine'.

Perhaps Jitendra would prefer to call these 'segmentation engines' something else, such as 'hypothetical perceptual grouping' engines. I would have to agree with Jitendra that segmentation is what we want at the end of recognition.
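
To make the "disjoint partition into K regions" definition concrete, here is a minimal sketch in plain Python. It is only an illustration of the definition, not any particular segmentation algorithm; the function names and the toy label map are my own assumptions.

```python
from collections import defaultdict

def regions_from_labels(labels):
    """Group pixel coordinates by label; labels is a 2D list of ints."""
    regions = defaultdict(set)
    for r, row in enumerate(labels):
        for c, label in enumerate(row):
            regions[label].add((r, c))
    return dict(regions)

def is_disjoint_partition(regions, height, width):
    """True if regions are non-empty, pairwise disjoint, and cover every pixel."""
    all_pixels = {(r, c) for r in range(height) for c in range(width)}
    seen = set()
    for pixels in regions.values():
        if not pixels or pixels & seen:
            return False  # empty region, or overlap with an earlier region
        seen |= pixels
    return seen == all_pixels

# Toy 3x4 "image" split into K = 3 regions (hypothetical labels).
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 1],
]
regions = regions_from_labels(labels)
print(len(regions), is_disjoint_partition(regions, 3, 4))
```

A per-pixel label map like this is a disjoint partition by construction, which is exactly what a 'segmentation engine' returns. A set of overlapping grouping hypotheses would fail the check.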

Sunday, June 18, 2006


I returned from Poland this past Thursday, and I attended the first day of CVPR yesterday. On this first day, I went to the "Beyond Patches" workshop; however, the most exciting part of the conference will be tomorrow through Wednesday.

Also, I've started reading Ishmael by Daniel Quinn. I think Ishmael discusses some interesting topics that should be compared to the ideas presented in Isaac Asimov's Foundation series. When I finish Ishmael, I will write more on this.

Monday, June 05, 2006


Greetings from Wloclawek, Poland. I'm currently sitting in an internet cafe checking up on my email. I finished Dan Brown's The Da Vinci Code, and am currently reading Kurt Vonnegut's Slaughterhouse-Five.