Showing posts with label martial hebert. Show all posts

Thursday, June 21, 2012

CVPR 2012 Day 2: optimize, optimize, optimize

Due to popular request, here is my overview of some of the coolest stuff from Day 2 of CVPR 2012 in Providence, RI.  While the Lobster dinner was the highlight for many of us, there were also some serious learning/optimization-based papers presented during Day 2 worthy of sharing.  Here are some of the papers which left me with a very positive impression.


Dennis Strelow of Google Research in Mountain View presented a general framework for Wiberg minimization.  This is a strategy for minimizing objective functions with multiple sets of variables -- objectives which are typically tackled in an EM-style alternating fashion.  The idea is to express one set of variables as a linear function of the other, effectively making the problem depend on only one set of variables.  The technique is quite general and has been shown to produce state-of-the-art results on a bundle adjustment problem.  I know Dennis from my second internship at Google where we worked on some sparse-coding problems.  If you solve lots of matrix decomposition problems, check out his paper!


Dennis Strelow
General and Nested Wiberg Minimization
CVPR 2012
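
To make the "eliminate one set of variables" idea concrete, here is a toy sketch of my own (not code from Dennis's paper, and much simpler than Wiberg minimization proper).  It fits y = a * exp(-b * t): for any fixed decay rate b, the best amplitude a has a closed-form least-squares solution, so the objective can be rewritten as a function of b alone and minimized with a one-dimensional search.  The model, the data, and all function names are assumptions chosen purely for illustration.

```python
import math

def eliminate_amplitude(b, t, y):
    """Closed-form least-squares amplitude a for a fixed decay rate b."""
    phi = [math.exp(-b * ti) for ti in t]
    return sum(p * yi for p, yi in zip(phi, y)) / sum(p * p for p in phi)

def reduced_objective(b, t, y):
    """Squared error as a function of b only; a has been eliminated."""
    a = eliminate_amplitude(b, t, y)
    return sum((yi - a * math.exp(-b * ti)) ** 2 for ti, yi in zip(t, y))

def golden_section_minimize(f, lo, hi, iters=80):
    """Minimize a unimodal 1-D function on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d  # the minimum lies in [lo, d]
        else:
            lo = c  # the minimum lies in [c, hi]
    return (lo + hi) / 2

# Synthetic, noise-free data generated with a = 2.0 and b = 1.5 (illustrative values).
t = [0.1 * k for k in range(21)]
y = [2.0 * math.exp(-1.5 * ti) for ti in t]

b_hat = golden_section_minimize(lambda b: reduced_objective(b, t, y), 0.0, 5.0)
a_hat = eliminate_amplitude(b_hat, t, y)
print(a_hat, b_hat)
```

An alternating scheme would instead bounce between updating a and updating b; here the search only ever sees b, and a comes along for free.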


Another cool paper which is all about learning is Hossein Mobahi's algorithm for optimizing objectives by smoothing them to avoid getting stuck in local minima.  This paper is not about blurry images, but about applying Gaussians to objective functions.  In fact, for the problem of image alignment, Hossein provides closed-form versions of image operators.  Now when you apply these operators to images, you efficiently smooth the underlying cross-correlation alignment objective.  You decrease the blur, while following the path of the optimum, and get much nicer answers than doing naive image alignment.


Hossein Mobahi, C. Lawrence Zitnick, Yi Ma
Seeing through the Blur
CVPR 2012
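
Here is a one-dimensional toy version of the smooth-then-sharpen idea.  It is my own sketch and has nothing to do with images or the operators in the paper: the objective f(x) = x^2 + 10 * (1 - cos(2*pi*x)) is an assumption picked because its Gaussian-smoothed version has a closed form (smoothing with standard deviation sigma damps the cosine term by exp(-2 * pi^2 * sigma^2) and adds a constant).  The step size, the sigma schedule, and the starting point are likewise arbitrary illustrative choices.

```python
import math

def smoothed_grad(x, sigma):
    """Gradient of f(x) = x^2 + 10*(1 - cos(2*pi*x)) after Gaussian smoothing.

    Convolving with a Gaussian of standard deviation sigma scales the cosine
    term by exp(-2 * pi^2 * sigma^2); the x^2 term only gains a constant.
    """
    damping = math.exp(-2.0 * math.pi ** 2 * sigma ** 2)
    return 2.0 * x + 20.0 * math.pi * damping * math.sin(2.0 * math.pi * x)

def gradient_descent(x, sigma, steps=3000, lr=0.002):
    """Plain gradient descent on the objective smoothed at a fixed sigma."""
    for _ in range(steps):
        x -= lr * smoothed_grad(x, sigma)
    return x

x0 = 3.3

# Naive: descend on the unsmoothed objective and get trapped near x = 3.
x_naive = gradient_descent(x0, sigma=0.0)

# Continuation: start heavily smoothed, then decrease the blur step by step,
# warm-starting each stage from the previous solution.
x_cont = x0
for sigma in (1.0, 0.5, 0.25, 0.1, 0.0):
    x_cont = gradient_descent(x_cont, sigma)

print(x_naive, x_cont)
```

The naive run settles in the local minimum next to its starting point, while the continuation run ends at the global minimum at x = 0.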


Ira Kemelmacher-Shlizerman, of Photobios fame, showed a really cool algorithm for computing optical flow between two different faces based on learning a subspace (using a large database of faces).  The idea is quite simple and allows for flowing between two very different faces where the underlying operation produces a sequence of intermediate faces in an interpolation-like manner.  She shared this video with us during her presentation, but it is on YouTube, so now you can enjoy it for yourself.


Ira Kemelmacher-Shlizerman, Steven M. Seitz
Collection Flow
CVPR 2012



Now talk about cool ideas!  Pyry, of CMU fame, presented a recommendation engine for classifiers.  The idea is to take techniques from collaborative filtering (think Netflix!) and apply them to the classifier selection problem.  Pyry has been working on action recognition, and the ideas presented in this work are not only quite general, but also quite intuitive and likely to benefit anybody working with large collections of classifiers.

Pyry Matikainen, Rahul Sukthankar, Martial Hebert
Model Recommendation for Action Recognition
CVPR 2012
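
To give a flavor of what "Netflix for classifiers" could look like, here is a minimal factorization-style sketch.  It is an assumption of mine about one way to set this up, not the method from the paper: rows are classifiers, columns are previously seen tasks, and entries are accuracies.  For a new task we score only a few "probe" classifiers and predict the rest from a low-rank basis.  The function name `recommend`, the rank, and the synthetic data are all hypothetical.

```python
import numpy as np

def recommend(ratings, probe_idx, probe_scores, rank=2):
    """Predict every classifier's score on a new task from a few probe scores.

    ratings: (n_classifiers, n_tasks) accuracies on previously seen tasks.
    probe_idx: indices of the classifiers actually evaluated on the new task.
    probe_scores: their measured scores on the new task.
    """
    # Low-rank basis describing how classifier scores co-vary across tasks.
    U, _, _ = np.linalg.svd(ratings, full_matrices=False)
    basis = U[:, :rank]
    # Fit the new task's coefficients using only the probed classifiers.
    coef, *_ = np.linalg.lstsq(basis[probe_idx], probe_scores, rcond=None)
    return basis @ coef

# Synthetic rank-2 "accuracy" data: 6 classifiers, 5 known tasks (illustrative only).
rng = np.random.default_rng(0)
classifier_factors = rng.random((6, 2))
ratings = classifier_factors @ rng.random((2, 5))
true_new_task = classifier_factors @ np.array([0.3, 0.7])

probe_idx = np.array([0, 2, 4])
predicted = recommend(ratings, probe_idx, true_new_task[probe_idx])
print(int(np.argmax(predicted)), int(np.argmax(true_new_task)))
```

Because the toy data is exactly rank 2, three probes are enough to recover all six scores; real ratings are noisy, and the rank and number of probes would have to be tuned.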


And finally, a super-easy algorithm presented for metric learning by Martin Köstinger had me intrigued!  This is a Mahalanobis distance metric learning paper which uses equivalence relationships.  This means that you are given pairs of similar items and pairs of dissimilar items.  The underlying algorithm is really not much more than fitting two covariance matrices, one to the positive equivalence relations, and another to the non-equivalence relations.  They have lots of code online, and if you don't believe that such a simple algorithm can beat LMNN (Large-Margin Nearest Neighbor from Kilian Weinberger), then get their code and hack away!

Martin Köstinger, Martin Hirzer, Paul Wohlhart, Peter M. Roth, Horst Bischof
Large Scale Metric Learning from Equivalence Constraints
CVPR 2012
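
Here is a minimal NumPy sketch of the two-covariance recipe as I understand it; it is not the authors' released code.  It fits one covariance to the feature differences of similar pairs and one to those of dissimilar pairs, takes the difference of their inverses as the metric, and clips negative eigenvalues so the result is a valid (positive semi-definite) Mahalanobis matrix.  The function names and the synthetic two-class data are assumptions for illustration.

```python
import numpy as np

def fit_pairwise_metric(X, similar_pairs, dissimilar_pairs):
    """Mahalanobis matrix from covariances of pairwise feature differences."""
    def diff_cov(pairs):
        i, j = np.asarray(pairs).T
        d = X[i] - X[j]
        return d.T @ d / len(d)  # differences are treated as zero-mean

    M = np.linalg.inv(diff_cov(similar_pairs)) - np.linalg.inv(diff_cov(dissimilar_pairs))
    # Project onto the PSD cone by clipping negative eigenvalues.
    w, V = np.linalg.eigh(M)
    return (V * np.clip(w, 0.0, None)) @ V.T

def mahalanobis_sq(M, a, b):
    """Squared distance between a and b under the learned matrix M."""
    d = a - b
    return float(d @ M @ d)

# Synthetic data: two classes separated along axis 0, with heavy noise on axis 1.
rng = np.random.default_rng(0)
n = 200
X = np.vstack([
    rng.normal([-3.0, 0.0], [0.3, 3.0], size=(n, 2)),
    rng.normal([3.0, 0.0], [0.3, 3.0], size=(n, 2)),
])
labels = np.repeat([0, 1], n)

pairs = rng.integers(0, 2 * n, size=(4000, 2))
same = labels[pairs[:, 0]] == labels[pairs[:, 1]]
M = fit_pairwise_metric(X, pairs[same], pairs[~same])
print(M)
```

On this toy data the learned matrix puts nearly all of its weight on axis 0, the direction that actually separates the two classes, and almost none on the noisy axis 1.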



CVPR 2012 gave us many very math-oriented papers, and while I cannot list all of them, I hope you found my short list useful.



Wednesday, May 23, 2012

Why your vision lab needs a reading group

I have a certain attitude when it comes to computer vision research -- don't do it in isolation. Reading vision papers on your own is not enough.  Learning how your peers analyze computer vision ideas will only strengthen your own understanding of the field and help you become a more critical thinker.  And that is why at places like CMU and MIT we have computer vision reading groups.  The computer vision reading group at CMU (also known as MISC-read to the CMU vision hackers) has a long tradition, and Martial Hebert has made sure it is a strong part of the CMU vision culture.  Other ex-CMU hackers such as Sanjiv Kumar have carried the vision reading group tradition to places such as Google Research in NY (correct me if this is no longer the case).  I have brought the reading group tradition to MIT (where I'm currently a postdoc) because I was surprised there wasn't one already!  In reality, we spend so much time talking about papers in an informal setting, that I felt it was a shame to not do so in a more organized fashion.
My personal philosophy is that as a vision researcher, the way towards the goal of creating novel long-lasting ideas is learning how others think about the field.  There's a lot of value in being able to analyze, criticize, and re-synthesize other researchers' ideas.  Believe me when I say that a lot of new vision papers come out of top tier vision conferences every year.  You should be reading them!  But not just reading, also criticizing them among your peers.  Because once you learn to criticize others' ideas, you will become better at promulgating your own.  Do not equate criticism with nasty words for the sake of being nasty -- good criticism stems from a keen understanding of what must be done in science to convince a broad audience of your ideas.

In case you want to start your own computer vision reading group, I've collected some tips, tricks, and advice:

1. You don't need faculty.  If you can't find a seasoned vision veteran to help you organize the event, do not worry.  You just need 3+ people interested in vision and the motivation to maintain weekly meetings.  Who cares if you don't understand every detail of every paper!  Nobody besides the authors will ever understand every detail.

2. Be fearless.  Ask dumb questions.  Alyosha Efros taught me that if you're reading a paper or listening to a presentation and you don't understand something, then there's a good chance you're not the only one in the audience with the same question.  Sometimes younger PhD students are afraid of "asking a dumb question" in front of an audience.  But if you love knowledge, then it is your duty to ask.  Silence will not get you far.  Be bold, be curious, and grow wise.

3. Choose your own papers to present.  Do not present papers that others want you to present -- that is better left for a seminar course led by a faculty member.  In a reading group it is very important that you care about the problems you will be discussing with your peers.  If you keep up with this trend then when it comes to "paper writing time" you should be up to date on many relevant papers in your field and you will know about your other lab mates' research interests.

4. It is better to show a paper PDF up on a projector than cancel a meeting.  Even if everybody is busy, and the presenter didn't have time to create slides, it is important to keep the momentum going.

5. After a major conference, have all of the people who attended the conference present their "top K papers."  The week after CVPR it will be valuable to have such a massive vision brain dump onto your peers because it is unlikely that everybody got to attend.

6. Book a room every week and try to have the meeting at the same time and place.  Have either the presenter or the reading group organizer send out an announcement with the paper they will be presenting ahead of time.  At MIT we share a Google Doc with the information about interesting papers and the upcoming presenter usually chooses the paper one week in advance so that the following week's presenter doesn't choose the same paper.  If somebody has already presented your paper, don't do it a second time!  Choose another paper.  cvpapers.com is a great resource to find upcoming papers.

At CMU, there is a long rotating schedule which includes every vision student and faculty member.  Once it is your turn to present, you can only get off the hook if you swap your slot with somebody else.  Being on a schedule months in advance means you'll have lots of time to prepare your slides.  At MIT, we are currently following the object recognition / scene understanding / object detection theme where we (Prof. Torralba, his students, his postdocs, his visiting students, etc.) choose a paper highly relevant to our interests.  By keeping such a focus, we can really jump into the relevant details without having to explain fundamental concepts such as SVMs, features, etc.  However, at CMU the reading group is much broader because on the queue are students/profs interested in all aspects of vision and related fields such as graphics, illumination, geometry, learning, etc.