Wednesday, September 28, 2011

Kant's Intuitions, the intentional stance, and reverse-engineering the mind

“Thoughts without content are empty, intuitions without concepts are blind. The understanding can intuit nothing, the senses can think nothing. Only through their unison can knowledge arise.” -- Immanuel Kant

“We live in a world that is subjectively open. And we are designed by evolution to be "informavores", epistemically hungry seekers of information, in an endless quest to improve our purchase on the world, the better to make decisions about our subjectively open future.” -- Daniel Dennett

"For scientists studying how humans cometo understand their world, the central challenge is this: How do our minds get so much from so little? We build rich causal models,make strong generalizations, and construct powerful abstractions, whereas the input data are sparse, noisy, and ambiguous—in every way far too limited. A massive mismatch looms between the information coming in through our senses and the outputs of cognition." -- Josh Tenenbaum

Organizing by space (space, time, and physics)
There are two faculties of understanding which it is unlikely we have acquired from experience.  The first is that of understanding objects as extended bodies in 3D space, and thus as occupying some volume.  I believe it was Kant who argued best against the hardcore British Empiricists, who proclaimed that experience is the sole originator of knowledge.  Experiences are the pen strokes which fill the Empiricist’s tabula rasa.  Kant argued (against Hume) that the concept of a spatially extended object is not acquired from experience – the very notion of experience requires that we already possess the notion of an object in order to have a meaningful percept.  It is as if the Empiricists failed to acknowledge that to make strokes on a sheet of paper, we need to already have a pen.  Kant’s intuitions are the pens of experience.  The requirement of having suitable intuitions for grouping percepts into experiences is part of what Kant called transcendental idealism.  “Objectness” is a faculty of human understanding, not something acquired from experience.  If you are a vision researcher, being aware of this can have drastic implications for your research programme.

It has also been argued that some primitive notions of object dynamics, aka folk physics, are possessed by very young children.  Given the uniformity of human experience (at least I have no ostensible reason to doubt that my colleagues’ experiences differ significantly from my own), and the diversity of our individual upbringings, it is also unlikely that folk physics is learned from experience.  However, I don't want to make any strong claims regarding folk physics.  I feel safe saying that Quantum Mechanics is another story -- it requires years of mathematics and thousands of hours of deliberate problem solving to grasp.

Organizing by mind (psychology, mind, and intent)
The second faculty of understanding, which can be found in many aspects of human intelligence, is that of understanding the world in terms of cognitive agents.  Humans have an amazing capacity for attributing minds to things.  This way of thinking about the world is so common and uniform among children all over the world that the differences in their upbringing cannot be reconciled with the uniformity of their capacity to project humanness onto objects.  Consider the following video (thanks to J. Tenenbaum's videos/lectures for pointing this out).

We cannot just view this video as triangles, dots, and lines.  Each one of us understands the story as a narrative based on agents and their intent.  We are stimulated by the external world, we take in sense-data, and the brain helps us make sense of it -- it turns the hodgepodge of data into experience.  But the brain is a mold: it conforms percepts to some shape defined by the mold.  These molds are the faculties of understanding which let us understand things; it is as if the faculties of understanding are basis vectors onto which we project all input sense-data.  The data is weak and noisy, the priors are strong, and understanding is the result of their union.  An experience without a proper basis is blind -- it is just a ball of percepts.  These faculties allow us to have experience.  Experiences, coupled with memory, allow us to obtain understanding -- where understanding is the relationship between a given experience and past experiences, either in the form of direct associations between currently-experienced objects and previously-experienced objects, or rules abstracted away from previously-experienced objects being applied directly to the current sense data.
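The claim that weak, noisy data plus strong priors yields understanding has a simple Bayesian reading, which is essentially the framing Tenenbaum's quote above points to. As a minimal illustrative sketch (the function and the numbers are my own, not from any of the authors discussed), a conjugate normal-normal update shows how a confident prior dominates a single noisy observation, while enough data eventually overrides the prior:

```python
def posterior(prior_mean, prior_var, obs, obs_var):
    """Conjugate normal-normal update: combine a Gaussian prior over an
    unknown quantity with noisy Gaussian observations of it."""
    n = len(obs)
    obs_mean = sum(obs) / n
    # Precision (inverse variance) weights each source of information.
    prior_prec = 1.0 / prior_var
    data_prec = n / obs_var
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * obs_mean)
    return post_mean, post_var

# A strong prior (small variance) pulls one noisy observation toward itself:
m, v = posterior(prior_mean=0.0, prior_var=0.1, obs=[2.0], obs_var=1.0)
print(m)  # about 0.18: far closer to the prior mean than to the datum
```

The same update run with a hundred observations at 2.0 lands near 1.8: weak data bends to the prior, but abundant data reshapes it.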

What I am talking about is what the philosopher Daniel Dennett refers to as the “intentional stance.”  Given my background in AI and philosophy of mind, it is very likely that Dennett and I have had the same influences.  I like to juxtapose my ideas with those of philosophers such as Descartes, Locke, Kant, Wittgenstein, and Pinker -- I’m not sure how Dennett motivates his philosophy, nor do I know against whose ideas he juxtaposes his own stance.

At MIT, J. Tenenbaum is pushing these ideas to the next level.  I only wish there were more perception in his work -- toy worlds just don't do it for me.  I want to build intelligent machines, and I really cannot afford to sidestep the issue of perception.  Here is a great talk by Josh Tenenbaum on reverse-engineering the mind, from NIPS 2010 -- just click the link to watch the video.

Implications for Artificial Intelligence and Machine Vision
Following Josh Tenenbaum, I think that a criticism of classical machine learning is long overdue.  Machine Learning, as a field, has been spewing out hardcore empiricists.  “Let me download your features; my machine learning algorithm will take care of the rest,” they say.  It is as if the glory is in the mathematics which manipulates N-D vectors.  But I argue that intelligence isn’t “in the calculus” -- it is in what the primitives of the calculus actually represent.  As an undergraduate I proclaimed, “I am not a mathematician, I am a physicist.  I care about the structure of the world, not the structure of proofs.”  As a graduate student I proclaimed, “The glory isn’t in the manipulation of vectors; the glory is in understanding the what/why of encoding information about the world into vectors.  I am a computer vision researcher, not a machine learning researcher.”  That is why the view of the world as coming from K different classes is wrong -- it is merely a convenient view if the statistician’s toolbox is at your disposal.  It is all about structuring the input to match a researcher’s high-level intuitions about the world.


  1. Aren't machines now better than people at learning arbitrary N-D vectors? (I heard that state-of-the-art neural networks are more accurate at classifying handwritten digits than people, and those aren't even arbitrary..) So you're right, maybe it's time to stop obsessing over additional half-percents on such tasks, and to solve problems where general algorithms don't even come close to giving the right answer...
    By the way, wasn't Pinker also arguing that, for example, much of language acquisition builds on built-in structure in the brain?

  2. Simon, you are right: Pinker argued passionately that there is something about the structure of the brain which lets us acquire language. In fact, if I can correctly trace my influences, it was my research which led me to Wittgenstein, and Wittgenstein who led me to Pinker. I then came across Pinker's "Stuff of Thought," but it was too language-oriented for my liking -- his "Blank Slate" was a much more philosophical work, which led me to Immanuel Kant.

    Pinker believes language is the hazy window through which we can peer to reflect upon the nature of the brain's inner-workings. Being a vision person, I think that studying object and scene recognition in humans can also be used to probe into human nature.

  3. Anonymous, 7:51 PM

    I once went to a conference about identifying the structure of proteins from microscopic images. It was very interdisciplinary -- there were mathematicians, computer scientists, and biologists. The computer scientists made what I think was the mistake of taking the images as mere collections of n by n pixels, and running standard spectral graph theory on them. The mathematicians took advantage of the fact that the images were projections of a 3-d object onto a viewing plane -- that is, they took notice of the "physics" or the "geometry" of the situation.

    I believe that's an example of the sort of thing you're talking about.

    I think it can be dangerous to abstract context away too quickly. It may not be irrelevant to face-recognition algorithms that the things they are recognizing are faces, for example.

  4. Hey Anonymous,

    I agree with you that the scenario you described calls for understanding objects as more than pixel patterns. While in your example it was the mathematicians who used their intuition about the problem to gain deeper insights, I believe machines must also do something like this.

    I don't think physics-based or geometry-based reasoning can be learned, and as artificial intelligence engineers, we must somehow "inject" this capacity for understanding into our machines. It is not clear how to do this in a general setting, but I'm sure others will see the light and augment their hardcore "pure" machine-learning-based approaches with these insights.

  5. This is a great post. It's good to read someone in this field (accurately) giving a synopsis of Kant's epistemology.

  6. Hey jwdink, I think computer science researchers are somewhat blind to 2000+ years of great philosophy with great ideas regarding the mind, cognition, and intelligence. My personal belief has always been that if one wants their ideas to gain broad appreciation, one must juxtapose their own ideas against the ideas proposed by titans such as Kant, Newton, Descartes, Locke, Aristotle, Plato, Darwin, Einstein, etc.

    Kant has been one of my idols (don't tell my superiors, they are not fond of my relationship with philosophy) and in the next decade of my life I hope to enlighten my fellow contemporary computer scientists regarding the products of philosophy. We, computer vision researchers, are in the game of building intelligent machines. This means that by definition we are computational epistemologists (or computational metaphysicians) -- unfortunately too few of us actually know what "epistemology" means.

  7. Good to hear, keep up the good fight! I'm coming from a philosophy undergraduate background myself, but I'm just starting a cognitive psychology program. The advantage is that cognitive psych is (often) more aware and accepting of philosophical thinking and problems, and it's probably the reason I was able to get in without training in some of the more modern tools (computer programming, etc.). The disadvantage is that no one's particularly willing or able to ask questions like the above, except for some developmental researchers.

    Any tips on staying in the loop with machine learning/ machine vision?