Tombone's Computer Vision Blog: william james

Friday, July 03, 2009

Linguistic Idealism

I have been an anti-realist since I was a freshman in college. Because I lacked the philosophical vocabulary, I might even have called myself an idealist back then, but looking back, 'anti-realist' would have been the better word. I was mainly opposed to the correspondence theory of truth, which presupposes an external, observer-independent reality to which our thoughts and notions are supposed to adhere. I acquired my strong anti-realist views in the context of the Philosophy of Science, developing them while taking Quantum Mechanics, Epistemology, and Artificial Intelligence courses at the same time. Pragmatism -- the offspring of William James -- was the single view that best summarized my philosophical outlook. While pragmatism is a rejection of absolutes and an abandonment of metaphysics, it does not get in the way of making progress in science. It is merely a new perspective on science, a view that does not undermine the creativity of the creator of scientific theories, a re-rendering of the scientist as more of an artist and less of a machine.

However, pragmatism is not the anything-goes postmodern philosophy that many believe it to be. It is as if there is something about the world which compels scientists to do science in a similar way and compels their ideas to converge. I recently came across the concept of Linguistic Idealism, and since I am a recent reader of Wittgenstein, it is a truly novel concept for me. Linguistic Idealism is a sort of dependence on language, the Game-of-all-games that we (humans) play. It is a sort of epiphany that all statements we make about the world are statements made within the customs of language, which calls into question the validity of those statements as correspondences to an external reality. The criticism stems from the fact that such statements rely on language, a somewhat arbitrary set of customs and rules which we follow when we communicate. Philosophers such as Sellars have gone as far as to say that all awareness is linguistically mediated. If we step back, can we say anything at all about perception?

I'm currently reading a book on Wittgenstein called "Wittgenstein's Copernican Revolution: The Question of Linguistic Idealism."

Tuesday, June 16, 2009

On Edelman's "On what it means to see"

I previously mentioned Shimon Edelman on this blog and explained why his ideas are important for the advancement of computer vision. Today I want to post a review of a powerful and potentially influential 2009 piece written by Edelman.

Below is a review of the June 16th, 2009 version of this paper:
Shimon Edelman, On what it means to see, and what we can do about it, in Object Categorization: Computer and Human Vision Perspectives, S. Dickinson, A. Leonardis, B. Schiele, and M. J. Tarr, eds. (Cambridge University Press, 2009, in press). Penultimate draft.

I will refer to the article as OWMS (On What it Means to See).

The goal of Edelman's article is to demonstrate the limitations of conceptual vision (referred to as "seeing as"), to criticize the modern computer vision paradigm as overly conceptual, and to show that advancing computer vision requires a richer representation of a scene.

Edelman proposes non-conceptual vision, where categorization isn't forced on an input -- "because the input may best be left altogether uninterpreted in the traditional sense." (OWMS) I have to agree with the author: abstracting the image away into a conceptual map not only gives an impoverished view of the world, it is also unclear whether such a limited representation is useful for other tasks that rely on vision (consider something like the bottom of Figure 1.2 in OWMS, or the figure below, taken from my Recognition by Association talk).

Building a Conceptual Map = Abstracting Away

Drawing on insights from the influential philosopher Wittgenstein, Edelman discusses the difference between "seeing" and "seeing as." "Seeing as" is the easy-to-formalize map-pixels-to-objects attitude which modern computer vision students are spoon-fed from the first day of graduate school -- and precisely the attitude which Edelman attacks in this wonderful article. To explain "seeing," Edelman uses some nice prose from Wittgenstein's Philosophical Investigations; however, instead of repeating the passages Edelman selected, I will complement the discussion with a relevant passage by William James:

The germinal question concerning things brought for the first time before consciousness is not the theoretic "What is that?" but the practical "Who goes there?" or rather, as Horwicz has admirably put it, "What is to be done?" ... In all our discussions about the intelligence of lower animals the only test we use is that of their acting as if for a purpose. (William James in Principles of Psychology, page 941)

"Seeing as" is a non-invertible process that abstracts away visual information to produce a lower-dimensional conceptual map (see Figure above), whereas "seeing" provides a richer representation of the input scene. It's not exactly clear what the best way to operationalize this notion of "seeing" in a computer vision system is, but this resistance to formalization might be one of the subtle points Edelman is trying to make about non-conceptual vision. Quoting Edelman, when "seeing" we are "letting the seething mass of categorization processes that in any purposive visual system vie for the privilege of interpreting the input be the representation of the scene, without allowing any one of them to gain the upper hand." (OWMS) Edelman goes on to criticize "seeing as" on the grounds that vision systems have to be open-ended, in the sense that we cannot specify ahead of time all the tasks that vision will be applied to. According to Edelman, conceptual vision cannot capture the ineffability (or richness) of the human visual experience. Linguistic concepts capture a mere subset of visual experience, and casting the goal of vision as providing a linguistic (or conceptual) interpretation is limiting. The sparsity of conceptual understanding is one key limitation of the modern computer vision paradigm. Edelman also criticizes the notion of a "ground-truth" segmentation in computer vision, arguing that a fragmentation of the scene into useful chunks is in the eye of the beholder.
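
To make the contrast a little more concrete, here is a toy sketch of my own; it is not taken from OWMS and is certainly not a proposal for how to operationalize "seeing." It assumes, purely for illustration, that each categorization process can be modeled as a graded response to a feature vector. The category names, the prototype values, and the functions `see` and `see_as` are all hypothetical.

```python
import math

# Hypothetical category prototypes in a toy 2-D feature space.
# These names and numbers are illustrative assumptions, not from OWMS.
PROTOTYPES = {
    "car": (1.0, 0.0),
    "tree": (0.0, 1.0),
    "building": (0.7, 0.7),
}


def see(features):
    """Keep every category's graded response as the representation.

    No single categorization process is allowed to win.
    """
    return {
        name: math.exp(-math.dist(features, proto) ** 2)
        for name, proto in PROTOTYPES.items()
    }


def see_as(features):
    """Collapse all the responses into the single winning label."""
    responses = see(features)
    return max(responses, key=responses.get)


# Two different inputs in the toy feature space.
a = (0.9, 0.1)
b = (0.8, 0.3)

# Both collapse to the same label, so the label alone cannot be
# inverted to tell the two inputs apart.
print(see_as(a), see_as(b))

# The full set of responses still distinguishes them.
print(see(a) != see(b))
```

The point of the sketch is only that taking the winner throws information away, while keeping all of the competing responses does not. Whether anything this simple deserves to be called "seeing" is exactly the kind of question Edelman's article leaves open.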

To summarize, Edelman points out that "The missing component is the capacity for having rich visual experiences... The visual world is always more complex than can be expressed in terms of a fixed set of concepts, most of which, moreover, only ever exist in the imagination of the beholder." (OWMS) As a pragmatist, I find that many of these words resonate deeply within my soul, and I'm particularly attracted to elements of Edelman's antirealism.

I have to give two thumbs up to this article for pointing out the flaws in the current way computer vision scientists go about tackling vision problems (in other words, researchers too often work blindly inside the current computer vision paradigm and do not often enough question the fundamental assumptions, which is what helps new paradigms arise). I have already pointed out many similar concerns regarding Computer Vision on this blog, and it is reassuring to find others pointing to similar paradigmatic weaknesses. Such insights need to somehow leave the Philosophy/Psychology literature and make a long-lasting impact in the CVPR/NIPS/ICCV/ECCV/ICML communities. The problem is that too many of the researchers/hackers actually building vision systems and teaching Computer Vision courses have no clue who Wittgenstein is, or that they can gain invaluable insights from Philosophy and Psychology alike. What Computer Vision lacks is not computational methods but critical insights that cannot be found inside an Emacs buffer. In order to advance the field, one needs to: read, write, philosophize, as well as mathematize, exercise, diversify, be a hacker, be a speaker, be one with the terminal, be one with prose, be a teacher, always a student, a master of all trades; or simply put, be a Computer Vision Jedi.
