Tombone's Computer Vision Blog

Saturday, August 19, 2006

Computer Vision TA + discovering music that I like

This upcoming semester I'll be one of the two Teaching Assistants for Martial Hebert's Computer Vision class. (The other TA will be Ankur Datta.) Most Robotics PhD students take this graduate course to satisfy their perception requirement. Since the 1st semester of the 1st year is a very popular time to take this course (at least it was a popular time for my incoming class), this opportunity will give me a chance to meet the new Robograds.

On another note, I recently found out about Pandora Internet Radio, which I have been listening to over the past few days. On this website, you input a favourite song or artist and a radio station is automatically created to match your interests in music. You can then vote on the songs that you hear. The basic idea is to get introduced to music you've never heard but should enjoy. Other great sources for internet radio are Shoutcast and archive.org (download live shows or just stream them!).

Thursday, August 03, 2006

Latent Dirichlet Allocation + Logistic-Normal + Detecting Objects

As a Robotics PhD student at Carnegie Mellon University, I took a Computer Vision course called "Advanced Perception." This Machine Learning-intensive course was taught by Alexei (Alyosha) Efros and my final project was all about detecting objects using a variant of Latent Dirichlet Allocation.

Jonathan Huang and I (Tomasz Malisiewicz) created these documents. So if you are into object recognition or just into Bayesian hierarchical models, you can check out these docs:

Detecting Objects via Multiple Segmentations and Latent Topic Models

Correlated Topic Model Details

You can also download some MATLAB code that Jon put up on the web:
fitting a Hierarchical Logistic-Normal distribution

You can also download our MATLAB Latent Dirichlet Allocation implementation (variational inference)

Enjoy!

karamazov

Yesterday I started reading The Brothers Karamazov by Fyodor Dostoyevsky. Last summer I read The Count of Monte Cristo and I really enjoyed it, so I decided to start another great classic. I wasn't impressed with the ending of Ishmael and decided that I wanted to read something that has stood the test of time.

Wednesday, July 05, 2006

normalized cuts on an image of a cat

This is an image of a cat from the MSRC Image Database.

The segmentation engine used here is normalized-cuts; however, what you see here is not the raw output of normalized-cuts.

Friday, June 30, 2006

congrats to "Putting Objects in Perspective" and Geometric Context

Congratulations to Derek, Alexei, and Martial for getting their work Slashdotted and winning this year's Best Paper Award at CVPR in New York City. This work reinforces the fact that Carnegie Mellon University (especially The Robotics Institute) is the place you want to be if you want to study Computer Vision (and/or Machine Learning).

The work which won the Best Paper Award at CVPR is titled "Putting Objects in Perspective".

Quoting Derek's project description, "Image understanding requires not only individually estimating elements of the visual world but also capturing the interplay among them. We provide a framework for placing local object detection in the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint. Most object detection methods consider all scales and locations in the image as equally likely. We show that with probabilistic estimates of 3D geometry, both in terms of surfaces and world coordinates, we can put objects into perspective and model the scale and location variance in the image. Our approach reflects the cyclical nature of the problem by allowing probabilistic object hypotheses to refine geometry and vice-versa. Our framework allows painless substitution of almost any object detector and is easily extended to include other aspects of image understanding."

The Slashdot story (June 14th) can be found here:

Researchers Teach Computers To Perceive 3D from 2D

Tuesday, June 27, 2006

what is segmentation?

What is segmentation?

According to the computer vision community, a segmentation is a disjoint partition of an image into K regions. Popular segmentation strategies include (but are not limited to) normalized cuts, graph cuts, mean-shift, and watershed. Researchers sometimes use the outputs of these 'segmentation engines' in the middle of their own algorithms. However, according to Jitendra Malik (paraphrased from CVPR 2006), segmentation is the result of recognition and not the disjoint partition that is returned by your favorite 'segmentation engine'.
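
To make the "disjoint partition into K regions" definition concrete, here is a minimal sketch in plain Python. It assumes a segmentation is stored as an integer label map (one region id per pixel); the helper names masks_from_labels and is_disjoint_partition are made up for this illustration and do not come from any particular segmentation engine.

```python
def masks_from_labels(labels):
    """Split an integer label map (a list of rows) into one binary mask per region id."""
    region_ids = sorted({v for row in labels for v in row})
    return {
        k: [[1 if v == k else 0 for v in row] for row in labels]
        for k in region_ids
    }


def is_disjoint_partition(masks, height, width):
    """Return True if every pixel is covered by exactly one mask."""
    for y in range(height):
        for x in range(width):
            if sum(mask[y][x] for mask in masks.values()) != 1:
                return False
    return True


# A toy 3x4 "image" split into K = 3 regions (hypothetical data).
labels = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 1],
]

masks = masks_from_labels(labels)
num_regions = len(masks)
partition_ok = is_disjoint_partition(masks, 3, 4)

# Adding a region that overlaps existing ones breaks the partition property.
overlapping = dict(masks)
overlapping["extra"] = [
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
overlap_ok = is_disjoint_partition(overlapping, 3, 4)

print(num_regions, partition_ok, overlap_ok)
```

A label map satisfies the definition by construction, since each pixel holds exactly one id. A set of overlapping region hypotheses, like the second case, is not a segmentation in this strict sense.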

Perhaps Jitendra would prefer to call these 'segmentation engines' something else, such as 'hypothetical perceptual grouping' engines. I would have to agree with Jitendra that segmentation is what we want at the end of recognition.

Sunday, June 18, 2006

CVPR

I returned from Poland this past Thursday, and I attended the first day of CVPR yesterday. On this first day, I went to the "Beyond Patches" workshop; however, the most exciting part of the conference will be tomorrow through Wednesday.

Also, I've started reading Ishmael by Daniel Quinn. I think Ishmael discusses some interesting topics that should be compared to the ideas presented in Isaac Asimov's Foundation series. When I finish Ishmael, I will write more on this.