"I think a key to AI is the need for several representations of the knowledge, such that when the system is stuck (using one representation) it can jump to use another. When David Marr at MIT moved into computer vision, he generated a lot of excitement, but he hit up against the problem of knowledge representation; he had no good representations for knowledge in his vision systems." -- Marvin Minsky
Check out the full interview with Marvin Minsky here -- a must-read for anybody serious about building intelligent machines! The interview appears to be part of a larger volume, Hal's Legacy.
I believe that in order to make the enterprise of computer vision a success, we must seriously broaden our outlook on the problem. Are we really expecting algorithms to delineate object boundaries in real images based on statistics of patch descriptors, without any sort of model of the world?
I don't know about you, but I seriously want to build intelligent machines. I don't think there will ever be any sort of low-level SIFT-esque algorithm that "solves vision." It is a much grander picture of intelligence that I'm really after -- and successful computer vision will be a result (component?) of a higher-level intelligent machine. Machines need to know about a whole lot more than is found in a single image -- and the necessary conceptual tools might not be present in the computer vision community.
A recurring theme in my blog is my belief that we must become Renaissance men -- a union of *nix hackers, vision scientists, cognitive scientists, philosophers, athletes, machine learning scientists, skilled orators, and much more -- if we are to have any hope of chiseling away at the problem of computational intelligence. Minsky was a pioneer of computational intelligence, and his words revitalize my own research efforts in this direction.