Monday, November 22, 2010

I, for one, welcome our new Visual Memex-based overlords

Welcome to the era of visual intelligence -- the era of Visual Memex-based overlords (now in 3D!)
  



The goal of today's post is simple: to empower you, the reader, with an exciting and fresh perspective on the problem of visual reasoning.  This simple idea is one of the central tenets promulgated in my upcoming doctoral dissertation -- but I'd like to give this potent meme a head start.  Visual Memex-style reasoning is not the kind of reasoning that is described in classic graduate-level textbooks on AI (e.g. first-order logic).  In the case that you've mentally over-fit to a graduate-level CS curriculum, you might even portray my iconoclastic views as the ramblings of a lunatic -- this is okay, I know at least Ludwig would be proud.

The Visual Memex is a mentality/perspective which, I believe, can overcome many limitations faced by modern computer vision systems.  What the Visual Memex can do for visual intelligence is akin to what the World Wide Web has done for knowledge (see Weinberger's excellent book "Everything is Miscellaneous" for the full argument).  It's akin to using Google for acquiring knowledge instead of going to the library -- maybe knowledge was never meant to be embedded in bookshelves.  The idea is embarrassingly simple: replace visual object categories with object exemplars and relationships between those exemplars.  Maybe the linguistic categories that we (as humans) cannot seem to live without are mere shadows cast on the wall of a dark cave.  Psychologists have long abandoned rigid categories in their models of how humans think about concepts, but the notion of a class is so fundamental to contemporary Machine Learning that many haven't even bothered to question its tenuous foundations.  While categories (also referred to as classes) definitely make learning algorithms easier to formalize, maybe it's better to let the data speak for itself.  Free the data!
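
For the programmers in the audience, here is one way to picture "exemplars plus relationships" in code.  This is only a toy sketch under my own assumptions, not the implementation from the dissertation: the class name VisualMemex, its method names, the two-dimensional feature vectors, and the Euclidean distance are all hypothetical stand-ins chosen for illustration.  The point is simply that nothing in it ever mentions a category label.

```python
import math
from collections import defaultdict


class VisualMemex:
    """Toy exemplar graph: nodes are object exemplars, edges are relationships.

    Hypothetical illustration only; no category labels are stored anywhere.
    """

    def __init__(self):
        self.features = {}               # exemplar id -> feature vector
        self.context = defaultdict(set)  # exemplar id -> exemplars seen in the same scene

    def add_exemplar(self, ex_id, feature):
        self.features[ex_id] = list(feature)

    def add_context_edge(self, a, b):
        # Context edges are symmetric: a appeared alongside b, and vice versa.
        self.context[a].add(b)
        self.context[b].add(a)

    def nearest(self, query, k=1):
        # Rank stored exemplars by Euclidean distance to the query feature
        # (an assumed similarity measure; any visual similarity would do).
        ranked = sorted(self.features,
                        key=lambda e: math.dist(self.features[e], query))
        return ranked[:k]

    def predict_context(self, query, k=1):
        # Reason by association: what has been seen next to things
        # that look like the query?
        out = set()
        for e in self.nearest(query, k):
            out |= self.context.get(e, set())
        return out


memex = VisualMemex()
memex.add_exemplar("img1/obj0", [0.9, 0.1])
memex.add_exemplar("img1/obj1", [0.1, 0.8])
memex.add_exemplar("img2/obj0", [0.5, 0.5])
memex.add_context_edge("img1/obj0", "img1/obj1")

neighbors = memex.nearest([0.85, 0.15])
likely_context = memex.predict_context([0.85, 0.15])
```

A new object is never asked "what class are you?"; it is asked "which things I have seen before do you resemble, and what did I see next to them?"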



One upcoming research paper inspired by this category-free mentality is: Context-Based Search for 3D Models, by Matthew Fisher and Pat Hanrahan, of Stanford University.  This paper will be presented at SIGGRAPH Asia 2010.  Maybe it is time to abandon those rigid categories and memexify your own research problem?





Saturday, November 13, 2010

CVPR, the A+'s of yesteryear, and robots need us

It is November yet again, and I'm proud to announce my last CVPR submission as a graduate student!  It is that time of the year -- the post-CVPR downtime.  It is time to mentally tuck away the fruits of our labor (NOTE: you might want to create a readme.txt which explains how to use the 20,000 lines of code you wrote in the 7 days preceding the deadline), consider the long-term impact of our work, and perhaps even reconsider our position in life.

I want to build intelligent machines, and I feel vision is the right place to start -- even roboticists such as Rodney Brooks started out in vision.  However, I don't feel that churning out 'cute' CVPR papers is going to do much.  Perhaps if all one cares about in life is getting tenure at a top-ranked university, then proof-of-concept papers might be the path of least resistance.  But remember when you were a teen, and you wanted to build a rocket which lets you travel at relativistic speeds -- allowing you to go back in time?  Or remember when you wanted to build those humanoid robots that would both entertain your kid sister and help out your mother with house chores?

So why did so many intelligent people I know abandon those grand dreams and settle for bread crumbs?  Getting your paper submitted to a peer-reviewed conference, so that you can pad your CV with another publication, is incommensurable with the dreams you once had.  The publication of today is the A+ of yesteryear, and it is just way too easy for us intellectuals to stay comfortable with those A's, without asking for more.  But robots need us: CVPR papers won't assemble themselves into intelligent machines.

But the deadline is over, and now it's time to relax.  If my rant did not make sense to you, then I envy you.  I have to move on to more positive things -- I need to finish reading Pinker's Blank Slate, read some more Wittgenstein (and fully assimilate his criticism of Augustine's theory of language acquisition), waste two days playing with the Riemann Zeta function (because the Basel problem was only the beginning), play some guitar, etc.