Wednesday, July 30, 2008

The Duality of Perception and Action

If we have managed to build an unambiguous 3D model of the viewed scene, is the process then complete? The answer to this question must be no. Computer vision systems, per se, are only part of a more embracing system which is simultaneously concerned with making sense of the environment and interacting with the environment. Without action, perception is futile, without perception, action is futile. Both are complementary, but strongly related activities and any intelligent action in which the system engages in the environment, i.e., anything it does, it does with an understanding of its action, and it gains this quite often by on going visual perception. Computer vision, then, is not an end in itself, that is, while the task of constructing an unambiguous explicit 3D representation of the world is a large part of its function, there is more to vision than just the explication of structural organisation. In essence, computer vision systems, or image understanding systems, are as concerned with cause and effect with purpose, with action and reaction as they are with structural organisation.

This is a rather remarkable excerpt from "Advanced Image Understanding and Autonomous Systems," by David Vernon of the Department of Computer Science at Trinity College Dublin, Ireland.

Thursday, July 24, 2008

More Newton's Method Fractals on YouTube

I've posted another cool fractal video. I used ffmpeg and ImageMagick to get the screenshots from an OpenGL C++ program running on my MacBook Pro.

Friday, July 11, 2008

Learning Per-Exemplar Distance Functions == Learning Anisotropic Per-Exemplar Kernels for Non-Parametric Density Estimation

When estimating probability densities, one can be parametric or non-parametric. A parametric approach goes like this: a.) assume the density has a certain form, then b.) estimate parameters of that distribution. A Gaussian is a simple example. In the parametric case, having more data means that we will get "better" estimates of the parameters. Unfortunately, the parameters will only be "better" if the modeling distribution is close to "the truth."
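The two parametric steps can be sketched in a few lines of Python. This is only an illustration for the 1D Gaussian case; the function names `fit_gaussian` and `gaussian_pdf` and the toy data are made up for this example.

```python
import math

def fit_gaussian(samples):
    """Step b: maximum-likelihood estimates of a 1D Gaussian's mean and variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var

def gaussian_pdf(x, mean, var):
    """Step a: the assumed form of the density (a single Gaussian)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

# Toy data, purely illustrative.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
mean, var = fit_gaussian(data)
density_at_mean = gaussian_pdf(mean, mean, var)
```

Once the fit is done, the data can be thrown away: the whole density is summarized by two numbers. If the true density is not Gaussian, though, more data will not repair the mismatch.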

Non-parametric approaches make very weak assumptions about the underlying distribution -- however, the number of parameters in a non-parametric density estimator scales with the number of data points. Generally, estimation proceeds as follows: a.) store all the data, b.) use something like a Parzen density estimate, where a tiny Gaussian kernel is placed around each data point.
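A Parzen estimate with an isotropic Gaussian kernel might look like the following 1D sketch. The name `parzen_density`, the bandwidth value, and the sample points are assumptions chosen for illustration.

```python
import math

def parzen_density(x, samples, bandwidth):
    """Average of Gaussian kernels of width `bandwidth`, one centered on each stored sample."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

# "Store all the data": the samples themselves are the model.
samples = [0.0, 0.1, 2.0]
near_cluster = parzen_density(0.05, samples, bandwidth=0.2)
in_gap = parzen_density(1.0, samples, bandwidth=0.2)
```

There is no fitting step here at all. The only free choice is the bandwidth, and every query has to visit every stored sample.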

The theoretical strength of the non-parametric approach is that *in theory* we can approximate any underlying distribution. In reality, we would need a crazy amount of data to approximate any underlying distribution -- the pragmatic answer to "why be non-parametric?" is that there is no real learning going on. In a non-parametric approach, we don't have a parameter estimation stage. While estimating the parameters of a single Gaussian is quite easy, consider a simple Gaussian mixture model -- the parameter estimation (generally done via EM) is not so simple.
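To give a feel for the extra machinery, here is a minimal sketch of EM for a 1D mixture of two Gaussians. The function name `em_two_gaussians`, the fixed iteration count, the variance floor, and the starting means are all assumptions made for this example; a real implementation would also need a convergence check and a sensible initialization.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_two_gaussians(samples, init_means, n_iter=50, var_floor=1e-6):
    """EM for a two-component 1D Gaussian mixture, starting from `init_means`."""
    weights = [0.5, 0.5]
    means = list(init_means)
    variances = [1.0, 1.0]
    n = len(samples)
    for _ in range(n_iter):
        # E-step: how responsible is each component for each sample?
        resp = []
        for x in samples:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate each component from its softly assigned samples.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, samples)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, samples)) / nk
            variances[k] = max(var_floor, spread)  # floor guards against collapse
    return weights, means, variances

# Two well-separated toy clusters, purely illustrative.
samples = [-0.2, 0.0, 0.2, 4.8, 5.0, 5.2]
weights, means, variances = em_two_gaussians(samples, init_means=[-1.0, 6.0])
```

Compared with the single Gaussian, which has a closed-form answer, this needs iteration, an initialization, and guards against degenerate components, and it can still end up in a poor local optimum.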

With the availability of inexpensive memory, data-driven approaches have become popular in computer vision. Does this mean that by using more data we can simply bypass parameter estimation?

An alternative is to combine the best of both worlds. Use lots of data, make very few assumptions about the underlying density, and still allow learning to improve density estimates. Usually a single isotropic kernel is used in non-parametric density estimation -- if we really have no knowledge this might be the only thing possible. But maybe something better than an isotropic kernel can be used, and perhaps we can use learning to find out the shape of the kernel.

The local per-exemplar distance function learning algorithm that I presented in my CVPR 2008 paper can be thought of as such an anisotropic per-data-point kernel estimator.
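To make the density-estimation view concrete, here is a rough sketch of a kernel density estimate in which every exemplar carries its own axis-aligned (diagonal) Gaussian kernel. This is not the algorithm from the paper, and it does no learning: the name `per_exemplar_density` and the per-exemplar scales are hand-picked assumptions standing in for whatever a learning stage would produce.

```python
import math

def per_exemplar_density(x, exemplars, scales):
    """Kernel density estimate with one axis-aligned Gaussian per exemplar.

    scales[i][d] is the (hypothetical, hand-set) standard deviation of
    exemplar i along dimension d, so each exemplar has its own kernel shape.
    """
    total = 0.0
    for mu, sigma in zip(exemplars, scales):
        log_k = 0.0
        for xd, md, sd in zip(x, mu, sigma):
            log_k += -0.5 * ((xd - md) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))
        total += math.exp(log_k)
    return total / len(exemplars)

# One exemplar whose kernel is wide along the first axis and narrow along the second.
exemplars = [(0.0, 0.0)]
scales = [(2.0, 0.5)]
along_wide_axis = per_exemplar_density((1.0, 0.0), exemplars, scales)
along_narrow_axis = per_exemplar_density((0.0, 1.0), exemplars, scales)
```

With an isotropic kernel, the two query points would get the same density because they are equally far from the exemplar. With the anisotropic kernel, the point displaced along the wide axis scores higher, which is the kind of per-exemplar notion of "nearby" that learning is meant to supply.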

I put some MATLAB distance function learning code up on the project web page. Check it out!