Wednesday, January 26, 2011

If you are starting your research in the field of object recognition / object detection...

If you are an aspiring computer vision graduate student and hope to one day shatter the boundaries of machine perception, a good place to start is on the shoulders of giants.  A key ingredient of successful object recognition research is a powerful codebase, which you will hopefully one day outgrow and/or extend.  The single best place to get starter code is the following work:

Discriminatively Trained Deformable Part Models




Why not start with some easy-to-understand MATLAB code so you can start advancing your research this year, not this decade!?!  Also, if you are able to build on this work, you will have an easier time publishing object detection papers that contemporary vision researchers will take seriously.  So my advice is to get voc-release-3.1 and read the following PAMI paper.

P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan
Object Detection with Discriminatively Trained Part Based Models
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 9, September 2010
pdf Source code

Be warned that pff is probably smarter than you, so you will not be able to understand 100% of everything he says, but because the code is well written you will not have to understand all of it. If you want to be a Vision Jedi, look at the code, read the paper, discard the downloaded code, and write it yourself.

8 comments:

  1. Anonymous, 1:06 AM

    Perfect timing, I just added you to my Google Reader feed a few days ago. I am a senior undergrad and am relatively new to the field!

  2. I'm glad you found the links useful! Enjoy the most exciting field in the world, aka Computer Vision.

  3. Insightful.
    There is something about well-written code... it completes the paper.

  4. Anonymous, 10:45 AM

    There is a new version of the code ... 4.0

  5. Anonymous, 10:06 AM

    Yes, but 4.0 is a more complicated version. That's possibly why Tomasz advises 3.1. Version 4.0 also frames optimization within grammars, which are not as easy to understand (at least for me).

  6. Hi Tomasz,
    Thanks a lot for the advice on trying version 3.1 first. I started with the latest (4.0) but failed to understand the implementation.

    Kaushik

  7. Hello Sir,
    I have a query for you: can this method be used to count the number of raised fingers on a human hand?

  8. I forgot to check "Notify me"
