Wednesday, December 24, 2008

Newton's Method Fractal Yet Again

Yet another Newton's Method fractal animation. This one was created with the OpenGL C++ program I wrote some time ago on my Macbook Pro. I dumped the frames as PPMs, used ImageMagick to convert them to PNGs (a shell script one-liner), FrameByFrame (a great free OS X application) to assemble the frames into a movie, and iMovie to add music/titles. The song in the background is a New Deal cover of Journey's Separate Ways from 2003-02-21 (check it out on archive.org).
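Incidentally, the PPM-to-PNG step can also be done entirely in MATLAB rather than via a shell one-liner. Here is a minimal sketch; the frame_*.ppm naming pattern is a placeholder, not my actual file names:

```matlab
% Minimal sketch: convert dumped PPM frames to PNGs (hypothetical filenames).
files = dir('frame_*.ppm');
for i = 1:numel(files)
    img = imread(files(i).name);              % imread handles PPM natively
    [p, base, ext] = fileparts(files(i).name);
    imwrite(img, [base '.png']);              % write a lossless PNG per frame
end
```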



In the future I plan on synchronizing the music with the fractals. Here is a cool screenshot from the movie when the background becomes white.

Friday, December 05, 2008

Using Computer Vision to Solve Jigsaw Puzzles

This past Thanksgiving I took a little bit of time to see if I could solve Jonathan Huang's Puzzle. While I haven't solved the full task yet -- I could only afford to put a couple of hours of work into it -- here is a nice debug screenshot of the local puzzle-piece alignment strategy I've been pursuing.

In this image, puzzle piece A is fixed and shown in red, and puzzle piece B is shown along with some likely transformations that, when applied to B, snap it onto piece A. If I have more time over Xmas break and get to finish the full puzzle -- I'll be sure to post the details.
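For the curious, here is roughly the kind of computation "snapping" boils down to -- a generic least-squares (Procrustes-style) rigid alignment in MATLAB. This is only a sketch, not my actual alignment code, and it assumes the corresponding boundary points ptsA and ptsB are already given:

```matlab
% Hedged sketch: least-squares rigid transform (R, t) that maps piece B's
% boundary points onto piece A's. ptsA and ptsB are assumed N-by-2 matrices
% of corresponding points -- finding the correspondences is the hard part.
function [R, t] = align_rigid(ptsA, ptsB)
    muA = mean(ptsA, 1);  muB = mean(ptsB, 1);
    A0 = ptsA - repmat(muA, size(ptsA,1), 1);   % center both point sets
    B0 = ptsB - repmat(muB, size(ptsB,1), 1);
    [U, S, V] = svd(B0' * A0);                  % 2x2 cross-covariance
    R = V * diag([1, det(V*U')]) * U';          % best rotation, no reflection
    t = muA' - R * muB';                        % best translation
end
% Snapping B onto A:  snapped = (R * ptsB' + repmat(t, 1, size(ptsB,1)))';
```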

Tuesday, November 18, 2008

Algorithmic Simplicity + Data > Algorithmic Complexity

Decades ago (during the era of Rodney Brooks, Takeo Kanade, and other such great computational thinkers) computer vision researchers were manually designing complex AI programs for image analysis. Back then, if an algorithm worked on a single real image it was publishable. The parameters of some of these complicated models were often tuned by hand -- and that was okay -- there simply wasn't enough image data to fit these models from examples.

We are now living in a Machine Learning generation where hand-tweaked parameters are looked down upon, and if you want to publish an object recognition paper you'll need to test your algorithm on a standard dataset containing hundreds of images spanning many different types of objects. There is still a lot of excitement about Machine Learning in the air, and new approaches are constantly being introduced as the new 'state-of-the-art' on canonical datasets. The problem with this mentality is that researchers are introducing a lot of complicated machinery, and it is often unclear whether these new techniques will stand the test of time.

Peter Norvig -- now at Google -- advocates an alternative view. Rather than designing ever more advanced machinery to squeeze performance out of a measly 20,000 or so training images for an object recognition task, he argues, we shouldn't be too eager to draw conclusions from such paltry training sets in the first place. In a recent video lecture of his that I watched, Norvig showed some interesting results where the algorithms that obtained the best performance on a small dataset no longer did the best once the size of the training set was increased by an order of magnitude. In some cases, with the test set fixed, the simplest algorithms given an order of magnitude more training data outperformed the most advanced 'state-of-the-art.' Also, algorithms that were mediocre in the small-training-set regime often outperformed their more complicated counterparts once more data was utilized.

The next generation of researchers will inevitably be using much more training data than we are at the moment, so if we want our scientific contributions to stand the test of time, we have to focus on designing simple yet principled algorithms. Focus on simplicity. Consider a particular recognition task, namely car recognition. Without any training data we are back in the 1960s/1970s generation, where we have to hard-code rules about what it means to be a car in order for an algorithm to work on a novel image. With a small amount of labeled training data, we can learn the parameters of a general parts-based car detector -- we can even learn the appearance of the parts. But what can we do with millions of images of cars? Do we even need much more than a large-scale nearest neighbor lookup?
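To make the nearest-neighbor idea concrete, here is a toy MATLAB sketch of recognition-as-lookup; trainX, trainY, and q are placeholders for whatever features and labels one might have:

```matlab
% Toy sketch: classify a query by copying the label of its nearest neighbor.
% trainX is N-by-D (one feature vector per labeled image), trainY is N-by-1,
% and q is a 1-by-D query descriptor -- all assumed to exist already.
dists = sum((trainX - repmat(q, size(trainX,1), 1)).^2, 2);   % squared L2
[min_dist, best] = min(dists);
predicted_label = trainY(best);   % with enough data, the data is the model
```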

As Rodney Brooks once said, "The world is its own best representation," and perhaps we should follow Google's mentality and simply equip our ideas with more, more, more training data.

Tuesday, November 04, 2008

Computer Vision as immature?

Ashutosh Saxena points out on his web page an interesting quote from Wikipedia about Computer Vision from August 2007. I just checked out the Wikipedia article on Computer Vision, and it seems this paragraph is still there. Parts of it go as follows:

The field of computer vision can be characterized as immature and diverse ... Consequently there is no standard formulation of "the computer vision problem." ... no standard formulation of how computer vision problems should be solved.

I agree that there is no elegant equation akin to F=ma or Schrodinger's wave equation that magically explains how meaning is to be attributed to images. While this might seem like a weak point, especially to the mathematically inclined who are always seeking to generalize and abstract away, I am skeptical of Computer Vision ever being grounded in such an all-encompassing mathematical theory.

Being a discipline centered on perception and reasoning, there is something about Computer Vision that will make it forever escape formalization. State-of-the-art computer vision systems that operate on images can return many different types of information. Some systems return bounding boxes of all object instances from a single category, some systems break up the image into regions (segmentation) and say nothing about object classes/categories, and other systems assign a single object-level category to the entire image without performing any localization/segmentation. Aside from objects, some systems (see Hoiem et al. and Saxena et al.) return a geometric 3D layout of the scene. While it seems that humans can do extremely well at all these tasks, it makes sense that different robotic agents interacting with the real world should perceive the world differently to accomplish their own varying tasks. Think of biological vision -- do we see the same world as dogs? Is there an objective, observer-independent reality that we are supposed to see? To me, perception is very personal, and while my hardware (brain) might appear similar to another human's, I'm not convinced that we see/perceive/understand the world the same way.

I can imagine researchers/scientists ~40 years ago trying to come up with an abstract theory of computation that would allow one to run arbitrary computer programs. What we have today is a myriad of operating systems and programming languages suited for different crowds and different applications. While the humanoid robot in our living room is nowhere to be found, I believe that if we wait until that day and inspect its internal workings, we will not see a beautiful, rigorous mathematical theory. We will see AI/mechanical components developed by different research groups and integrated by other researchers -- the fruits of a long engineering effort. These bots will be always learning, always updating, always receiving updates, and always getting replaced by newer and better ones.

Linear Support Vector Machine (SVM) in the Primal

I often solve linear SVMs. These are convex optimization problems, much like logistic regression, where the goal is to find a linear decision boundary between two classes. Unlike logistic regression, only the points near the decision boundary affect its parameters. Data points that are correctly classified by a margin do not influence the decision boundary.

Quite often when SVMs are taught in a Machine Learning course, the dual formulation and kernelization are jumped into very quickly. While that formulation looks nice and elegant, I'm not sure how many students could come home after such a lecture and implement an SVM on their own in a language such as MATLAB.

I have to admit that I never really obtained a good grasp of Support Vector Machines until I sat through John Lafferty's lectures in 10-702: Statistical Machine Learning, which demystified them. The main idea is that an SVM is just like logistic regression but with a different loss function -- the hinge loss.
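To see how close the two really are, here is a tiny MATLAB sketch that just plots both losses as a function of the margin m = y*f(x):

```matlab
% Sketch: logistic loss vs. hinge loss as a function of the margin m = y*f(x).
m = linspace(-3, 3, 200);
logistic_loss = log(1 + exp(-m));   % smooth, never exactly zero
hinge_loss    = max(0, 1 - m);      % exactly zero once the margin exceeds 1
plot(m, logistic_loss, 'b', m, hinge_loss, 'r');
legend('logistic', 'hinge'); xlabel('margin y*f(x)'); ylabel('loss');
```

The flat part of the hinge loss beyond margin 1 is exactly why points that are correctly classified by a margin have no influence on the SVM's decision boundary.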

A very good article on SVMs, and on how they can be efficiently tackled in the primal (which is super easy to understand), is Olivier Chapelle's Training a Support Vector Machine in the Primal. Chapelle is a big proponent of primal optimization and he has some arguments for why primal approximations are better than dual approximations. On his webpage one can also find MATLAB code for SVMs which is very fast in practice since it uses a second-order optimization technique. In order to use such a second-order technique, the squared hinge loss is used (see the article for why). I have used this code many times in the past, even for linear binary classification problems with over 6 million data points embedded in 7 dimensions.

In fact, this is the code that I have used in the past for distance function learning. So the next time you want to throw something simple and fast at a binary classification problem, check out Chapelle's code.
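To give a flavor of how easy the primal view is to understand, here is a hedged MATLAB sketch that minimizes a squared-hinge objective with plain gradient descent. This is purely illustrative -- Chapelle's second-order code is what you should actually run -- and it assumes X is an N-by-D data matrix and y holds labels in {-1,+1}:

```matlab
% Hedged sketch: primal linear SVM with the squared hinge loss,
%   min_{w,b}  0.5*lambda*||w||^2 + sum_i max(0, 1 - y_i*(x_i*w + b))^2,
% trained with fixed-step gradient descent (illustrative, not Chapelle's code).
lambda = 1e-2;  step = 1e-3;
w = zeros(size(X,2), 1);  b = 0;
for iter = 1:500
    margins = y .* (X*w + b);
    viol = margins < 1;                          % only margin violators matter
    slack = 1 - margins(viol);
    grad_w = lambda*w - 2 * (X(viol,:)' * (y(viol) .* slack));
    grad_b = -2 * sum(y(viol) .* slack);
    w = w - step*grad_w;  b = b - step*grad_b;
end
% A new point x (1-by-D) is classified as sign(x*w + b).
```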

Thursday, October 30, 2008

creating visualizations via Google Earth and Matlab

I've recently been generating KML files in Matlab for visualizing "stuff" on the Earth's surface. It is really easy to generate overlays as well as place "placemarks" showing images. Google Earth is an amazing tool which is going to be around for quite some time. Rather than generating static figures, why not let people interact with your figures? Google Earth seems to have taken care of the interaction part.
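To show how little is involved, here is a minimal sketch that writes a KML file containing a couple of placemarks with fprintf; the names and coordinates below are made up:

```matlab
% Minimal sketch: write a KML file with one placemark per (lon, lat, name).
% The coordinates and names are just placeholders.
lons = [-79.943, -122.084];  lats = [40.444, 37.422];
names = {'placeA', 'placeB'};
fid = fopen('points.kml', 'w');
fprintf(fid, '<?xml version="1.0" encoding="UTF-8"?>\n');
fprintf(fid, '<kml xmlns="http://www.opengis.net/kml/2.2">\n<Document>\n');
for i = 1:numel(lons)
    fprintf(fid, '  <Placemark>\n    <name>%s</name>\n', names{i});
    fprintf(fid, '    <Point><coordinates>%f,%f,0</coordinates></Point>\n', ...
            lons(i), lats(i));                 % KML order is lon,lat,alt
    fprintf(fid, '  </Placemark>\n');
end
fprintf(fid, '</Document>\n</kml>\n');
fclose(fid);
```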

In addition, if you have lots of data you can use the level-of-detail capabilities of KML to cut your data up into little chunks. It's just like an octree from your computer graphics class, and pretty easy to code up. All your data can be stored on a server so nobody has to download it up front. It's also possible to make a web server emit the .kml files with the proper content type, rather than as plain ASCII, so that Google Earth can open them directly.

Once I *finalize* some kml files I'll show simple examples of the Level of Detail visualizations.

Sunday, September 21, 2008

6.870 Object Recognition and Scene Understanding

This semester at MIT, Antonio Torralba is teaching 6.870 Object Recognition and Scene Understanding. Topics include object recognition, multiclass categorization, objects in context, internet vision, 3D object models, scene-level analysis, as well as "What happens if we solve object recognition?"

Why should anybody care what courses new faculty are offering, especially if they are taught at another academic institution? The answer is simple. The new rising stars (a.k.a. the new faculty) teach graduate-level courses that reflect the ideas these professors are truly passionate about. Aside from the first few semesters, when new faculty sometimes have to teach introductory-level courses, these special-topics courses (most often grad-level courses) capture what has been going on in their heads for the past 10 years. Such courses reflect the past decade of research interests pursued by the new professor, and the material is often presented in such a way that the students get inspired and have the best opportunity to one day surpass the professor. I'm a big advocate of letting faculty teach their own courses -- of course, introductory-level undergraduate courses still have to be taught somehow...

A new professor's publication list is a depiction of what kind of research was actually pursued; however, the material comprising a special topic course presents a theme -- a conceptual layout -- which is sometimes a better predictor of where a professor's ideas (and inadvertently the community's) are actually going long-term. If you want to see where Computer Vision is going, just see what new faculty are teaching.

On the course homepage, Professor Torralba mentions other courses taught at other universities by the new breed of professors, such as Computer Vision by Rob Fergus and Internet Vision by Tamara Berg. Note: I have only listed Antonio's, Rob's, and Tamara's courses since they are Fall 2008 offerings -- many other courses exist but are from Fall 2007 or other semesters.

Thanks to my advisor, Alyosha Efros, for pointing out this new course.

On another note, I'm back at CMU from a Google summer internship where I was working with Thomas Leung on computer vision related problems.

Sunday, August 10, 2008

What is segmentation? What is image segmentation?

According to Merriam-Webster, segmentation is "the process of dividing into segments" and a segment is "a separate piece of something; a bit, or a fragment." This is a rather broad definition which suggests that segmentation is nothing mystical -- it is just taking a whole and partitioning it into pieces. One can segment sentences, time periods, tasks, inhabitants of a country, and digital images.

Segmentation is a term that often pops up in technical fields such as Computer Vision. I have attempted to write a short article on Knol about Image Segmentation and how it pertains to Computer Vision. Deciding to abstain from discussing specific algorithms -- which might be of interest to graduate students but not the population as a whole -- I instead target the high-level question, "Why segment images?" The answer, in my view, is that image segmentation (and any other image processing task) should be performed solely to assist object recognition and image understanding.

Wednesday, July 30, 2008

The Duality of Perception and Action

If we have managed to build an unambiguous 3D model of the viewed scene, is the process then complete? The answer to this question must be no. Computer vision systems, per se, are only part of a more embracing system which is simultaneously concerned with making sense of the environment and interacting with the environment. Without action, perception is futile, without perception, action is futile. Both are complementary, but strongly related activities and any intelligent action in which the system engages in the environment, i.e., anything it does, it does with an understanding of its action, and it gains this quite often by on going visual perception. Computer vision, then, is not an end in itself, that is, while the task of constructing an unambiguous explicit 3D representation of the world is a large part of its function, there is more to vision than just the explication of structural organisation. In essence, computer vision systems, or image understanding systems, are as concerned with cause and effect with purpose, with action and reaction as they are with structural organisation.

This is a rather remarkable excerpt from "Advanced Image Understanding and Autonomous Systems," by David Vernon from Department of Computer Science in Trinity College Dublin, Ireland.

Thursday, July 24, 2008

More Newton's Method Fractals on Youtube

I've posted another cool fractal video. I used ffmpeg and imagemagick to get the screenshots from an OpenGL C++ program running on my Macbook Pro.

Friday, July 11, 2008

Learning Per-Exemplar Distance Functions == Learning Anisotropic Per-Exemplar Kernels for Non-Parametric Density Estimation

When estimating probability densities, one can be parametric or non-parametric. A parametric approach goes like this: a.) assume the density has a certain form, then b.) estimate parameters of that distribution. A Gaussian is a simple example. In the parametric case, having more data means that we will get "better" estimates of the parameters. Unfortunately, the parameters will only be "better" if the modeling distribution is close to "the truth."

A non-parametric approach makes very weak assumptions about the underlying distribution -- however, the number of parameters in a non-parametric density estimator scales with the number of data points. Generally, estimation proceeds as follows: a.) store all the data, b.) use something like a Parzen density estimate where a tiny Gaussian kernel is placed around each data point.
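Here is a hedged one-dimensional MATLAB sketch of the Parzen recipe, with an isotropic Gaussian kernel and a hand-picked bandwidth sigma:

```matlab
% Hedged sketch: 1-D Parzen density estimate -- store all the data, then
% place a small Gaussian of width sigma around every stored point.
data  = randn(500, 1);                % stand-in for "all the data"
sigma = 0.2;                          % kernel bandwidth, hand-picked here
xs = linspace(-4, 4, 200)';
p = zeros(size(xs));
for i = 1:numel(data)
    p = p + exp(-(xs - data(i)).^2 / (2*sigma^2));
end
p = p / (numel(data) * sigma * sqrt(2*pi));   % normalize to a proper density
plot(xs, p);
```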

The theoretical strength of the non-parametric approach is that *in theory* we can approximate any underlying distribution. In reality, we would need a crazy amount of data to approximate an arbitrary underlying distribution -- the pragmatic answer to "why be non-parametric?" is that there is no real learning going on. In a non-parametric approach, we don't need a parameter estimation stage. While estimating the parameters of a single Gaussian is quite easy, consider a simple Gaussian mixture model -- the parameter estimation (generally done via EM) is not so simple.

With the availability of inexpensive memory, data-driven approaches have become popular in computer vision. Does this mean that by using more data we can simply bypass parameter estimation?

An alternative is to combine the best of both worlds. Use lots of data, make very few assumptions about the underlying density, and still allow learning to improve density estimates. Usually a single isotropic kernel is used in non-parametric density estimation -- if we really have no knowledge this might be the only thing possible. But maybe something better than an isotropic kernel can be used, and perhaps we can use learning to find out the shape of the kernel.
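As a toy illustration of what such an estimator looks like (this is not the learning algorithm from my CVPR 2008 paper, just the kind of density it produces), here is a sketch where every stored point carries its own axis-aligned kernel widths:

```matlab
% Hedged sketch of an anisotropic per-exemplar kernel density estimate.
% q is a 1-by-D query, X is N-by-D stored points, and sig is N-by-D
% per-point kernel widths (in practice these are what learning provides).
function p = anisotropic_kde(q, X, sig)
    [N, D] = size(X);
    p = 0;
    for i = 1:N
        d2 = sum(((q - X(i,:)) ./ sig(i,:)).^2);      % per-exemplar metric
        Zi = prod(sig(i,:)) * (2*pi)^(D/2);           % kernel normalizer
        p = p + exp(-0.5 * d2) / Zi;
    end
    p = p / N;
end
```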

The local per-exemplar distance function learning algorithm that I presented in my CVPR 2008 paper can be thought of as such an anisotropic per-data-point kernel estimator.

I put some MATLAB distance function learning code up on the project web page. Check it out!

Sunday, June 22, 2008

Going to Anchorage

I am going to Anchorage in a couple of hours! CVPR 2008 should be really fun.

On another note, yesterday I spent much of the day in beautiful San Francisco. I definitely need to make a few more visits there.

Thursday, June 05, 2008

computer vision summer internship at google

I am going to Mountain View, California next week to start my 3 month-long summer internship at Google. While I don't know the specific details on what I'll be working on, I will be working with the Computer Vision group there.

Here is an idea: imagine making sense of the billions of objects embedded in the images contained in Google's Street View image database. Google is already blurring faces in these images -- which means they are running vision algorithms on this dataset -- but are Google researchers finding makes/models of cars, reading street signs, analyzing building facades to see which homes are Victorian/ranch/etc., aligning visual information with Google Maps, and so on?

Google Street View is an excellent portal from the machine to the world. If there is ever any hope of visual recognition happening on a robot, then it will have to happen at Google first, using immense computational power. If that works, why not outsource visual recognition capabilities to a company like Google? Imagine a little computer onboard your favorite humanoid robot that communicates via some standard recognition API with Google's servers. What the robot sees is sent over to Google for analysis -- then 'image understanding' data is propagated back. I imagine such a service could be set up, and for a fairly cheap price.

Wednesday, June 04, 2008

Shimon Edelman: Constraints on the nature of the neural representation of the visual world

I came across a nice short article written by Shimon Edelman titled Constraints on the nature of the neural representation of the visual world. Shimon got his PhD from the Weizmann Institute of Science in 1988 under the guidance of the famous Prof. Shimon Ullman and is now a Professor of Psychology at Cornell. If you are a vision hacker -- a code-writing, graphical-model-advocating graduate student in computer science -- then you might ask yourself: why should I care what psychologists/philosophers have to say about vision?

The problem is that overfitting to what is currently *hot* at CVPR isn't very productive if you want to solve big problems. Philosophers, psychologists, roboticists, and cognitive neuroscientists have a lot to say about vision and offer plenty of ideas as to what they expect to see in a successful vision system. To a CS graduate student, something like "the problem of computer vision" might seem like a rather grand goal; however, these scientists from other fields suggest that it is unlikely that a pure CS approach will get the glory.

Some concepts that are brought up in this paper are the following: ontological strategy, context, inherent ambiguities in segmentation, ineffability of the visual world, multidimensional similarity space. I think looking at vision from a philosophical point of view is not only enlightening, but suggests that what we should be after is more than just solving the problem of computer vision. What does it mean to solve the problem of computer vision after all? What we should be after is a theory of intelligence -- a theory of mind -- and strive to build truly intelligent machines.

Tuesday, May 20, 2008

dude, where's my image?

Check out IM2GPS: estimating geographic information from a single image. This is CVPR2008 work done by James Hays and Alexei Efros. Some crazy titles that have been suggested to James can also be seen on his project site -- some of them are rather funny too!

Anyways, you can just read his abstract and browse his results if you are interested in the kind of computer vision research that uses millions of images. The basic idea is to predict the location of an image using only information embedded inside the image (and a training set of over 6 million geo-tagged Flickr images.)

Saturday, May 17, 2008

what is recognition?

I want to briefly discuss what the terms recognition, classification, and categorization mean to me and how they relate to the fields such as computer vision, machine learning, and psychology.

From my understanding, "category" == "class", and thus categorization and classification are the same thing! It is correct to say that when we categorize, we affix a label to some entity. But these labels do refer to categories, or classes. One can attribute the popularity of the term 'classification' to the field of machine learning. Categorization is a term that was more heavily used in psychology, and only recently has it been popping up in computer vision papers.

Because I see classification and categorization as the same thing, I don't agree that only one can be hierarchical.

Regarding the term recognition, the answer is a bit more complicated. In the field of computer vision, when one says that they are interested in recognition they are usually interested in recognizing novel instances from some predefined list of classes. To stress the interest in discrimination between a large number of object classes, vision researchers have recently begun using terms such as "a visual categorization system" or they talk about "object class recognition."

Everywhere I have seen it used, the term "identification" refers to specific instances. A face identification system might be designed to find faces of George Bush and might work on top of a face-class recognition system. The complication is that early work in computer vision was usually concerned with a fixed number of objects, and the goal was to find those exact object instances inside an image -- and this was referred to simply as "recognition." Nowadays, we often use the term "recognition" to refer to category-level recognition and not specific objects.

In conclusion, recognition is a very general term that has been applied to both category-level recognition (dog vs. cat vs. car vs. person) and recognition of specific object instances (this particular blue ball vs. this particular face). To be more precise, one can use the terms "category-level recognition" and "identification."

This post has been written in response to Vidit Jain's blog post titled "Etymology of common learning-related words such as recognize."

Wednesday, April 23, 2008

newton's method fractal

Back in high school I was 'into' Newton's method fractals. Some old images can be seen by clicking on the following image.


When people make fractal videos (check them out on YouTube), they are usually zooming into a fixed fractal. I have generated a fractal animation where the axes are fixed and the equation is changing. Check it out!
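The actual animation came from an OpenGL C++ program, but here is a hedged MATLAB sketch of a single frame: color each pixel by the root of z^3 - 1 that Newton's method converges to from that point.

```matlab
% Hedged sketch (single frame): Newton's method fractal for f(z) = z^3 - 1.
[X, Y] = meshgrid(linspace(-2, 2, 600));
Z = X + 1i*Y;
for iter = 1:40
    Z = Z - (Z.^3 - 1) ./ (3*Z.^2);        % Newton update z <- z - f(z)/f'(z)
end
roots3 = exp(2i*pi*(0:2)/3);               % the three cube roots of unity
[mind, idx] = min(abs(repmat(Z(:), 1, 3) - repmat(roots3, numel(Z), 1)), [], 2);
imagesc(reshape(idx, size(Z)));  axis image off;  colormap(jet(3));
% Swapping in a different polynomial (and its derivative) each frame, with
% the axes held fixed, gives the kind of animation described above.
```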

Tuesday, April 08, 2008

Recognition by Association via Learning Per-exemplar Distances

Tomasz Malisiewicz, Alexei A. Efros. Recognition by Association via Learning Per-exemplar Distances. In CVPR, June 2008.

Abstract:

We pose the recognition problem as data association. In this setting, a novel object is explained solely in terms of a small set of exemplar objects to which it is visually similar. Inspired by the work of Frome et al., we learn separate distance functions for each exemplar; however, our distances are interpretable on an absolute scale and can be thresholded to detect the presence of an object. Our exemplars are represented as image regions and the learned distances capture the relative importance of shape, color, texture, and position features for that region. We use the distance functions to detect and segment objects in novel images by associating the bottom-up segments obtained from multiple image segmentations with the exemplar regions. We evaluate the detection and segmentation performance of our algorithm on real-world outdoor scenes from the LabelMe dataset and also show some promising qualitative image parsing results.

http://www.cs.cmu.edu/~tmalisie/projects/cvpr08/

Thursday, April 03, 2008

Vocabulary Lesson: Transductive Learning

The goal of this blog post isn't to necessarily provide new insights into the relationship between Transductive Learning versus Semi-Supervised Learning. I will attempt to simply answer the question: "What is Transductive Learning?" To understand what Transductive means, we have to understand what induction (or Inductive Learning) means.

Induction, as opposed to deduction, is a form of reasoning that makes generalizations based on individual instances. It is important to note that induction isn't the kind of reasoning that predicate calculus or any other logic system was meant to handle. The conclusions produced from induction might have a high probability of being true but are never as certain as the inputs. The generalizations obtained from induction can be propagated onto newly observed inputs. One can think of a generalization obtained from induction as a function -- an abstract entity that can always map inputs to outputs.

The Merriam-Webster definition of Transduction states that it is: the transfer of genetic material from one microorganism to another by a viral agent (as a bacteriophage). While this definition has its roots in one particular branch of science, the crucial component of the definition is still present: transduction is the transfer of something from entity A to entity B.

The Machine Learning definition of Transduction states that it is reasoning from observed inputs to specific test inputs. The key difference between induction and transduction is that induction refers to learning a function that can be applied to any novel inputs, while transduction is only concerned with transferring some property onto a specific set of test inputs.
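As a toy illustration of what a transductive method can look like in code, here is a generic label-propagation sketch over a Gaussian-affinity graph (not the WILLOW segmentation work mentioned below); the labeled points Xl, yl and the unlabeled test points Xu are assumed to be given:

```matlab
% Hedged sketch of transduction: spread labels from labeled points (Xl, yl in
% {-1,+1}) onto a *specific* set of unlabeled test points Xu. No reusable
% decision function is produced -- only labels for these particular points.
X = [Xl; Xu];  n = size(X, 1);  nl = size(Xl, 1);
sigma = 1.0;                                   % affinity bandwidth (assumed)
D2 = zeros(n);
for i = 1:n
    D2(i, :) = sum((X - repmat(X(i,:), n, 1)).^2, 2)';
end
W = exp(-D2 / (2*sigma^2));                    % pairwise affinities
P = W ./ repmat(sum(W, 2), 1, n);              % row-stochastic transitions
f = [yl; zeros(n - nl, 1)];
for iter = 1:100
    f = P * f;                                 % propagate
    f(1:nl) = yl;                              % clamp the known labels
end
yu = sign(f(nl+1:end));                        % predictions for Xu only
```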

Rather than paraphrasing Wikipedia, the interested reader should do some follow-up research of their own into the merits of Transductive Learning.

To conclude: Olivier Duchenne, a member of the WILLOW research team, gave a talk about their CVPR 2008 work on applying Transductive Learning to the problem of image segmentation. This was my first exposure to the concepts of transductive learning, and it is always a good thing to learn new things.

Monday, March 31, 2008

keyword analysis

I've been looking at my logs, and here are the top things people searched for when they stumbled across my blog within the last month or so. It seems everybody wants to know about the ndseg fellowship -- especially when they will hear back. And I unfortunately can't provide any advice regarding the ndseg fellowship. Good luck to you, graduate students.

33.87% ndseg
4.84% ipod turn on
3.23% latent dirichlet allocation
3.23% bmvc 2007
3.23% ipod won't turn on
3.23% logistic normal latent dirichlet allocation
1.61% burton clash
1.61% ndseg fellowship 2007
1.61% park city utah blog
1.61% ipod turning on
1.61% 2008 ndseg winners
1.61% my dream car
1.61% ndseg thegradcafe
1.61% ndseg fellowship offers 2008
1.61% cvpr 2007
1.61% ndseg, heard back
1.61% ndseg anyone
1.61% computer vision grad school
1.61% ndseg forum
1.61% ipod display support url
1.61% nsf graduate fellowship hear back
1.61% my ipod wont turn on
1.61% latent dirichlet allocation gibbs sampling
1.61% jogging in pittsburgh, squirrel hill
1.61% ndseg 2008 winners
1.61% eye inverse optics
1.61% multiple segmentations
1.61% nsf graduate fellowship heard yet?
1.61% burton clash guitar
1.61% nsf graduate research fellowships
1.61% thegradcafe ndseg
1.61% nsf grf march
1.61% my first paper on cvpr conference
1.61% ndseg fellowship
1.61% burton clash 2005
1.61% ndseg anyone heard

Wednesday, March 19, 2008

Understanding the past

While a certain degree of advancement is possible when working in isolation on a scientific problem, interaction with the scientific community can drastically hasten one's progress. Most people have their own experiences with 'isolation' and 'interaction with a community' but I should explicitly delineate how I intend to use these terms. While 'interaction with a community' usually implies two-sided communication such as directly working together on a problem or simply discussing one's research with a group of other scientists, I want to consider a subtler form of interaction.

By reading about past accomplishments and former ideologies in a particular field, one is essentially communicating with the ideas of the past. While many scholarly articles -- in a field such as Computer Vision -- are mostly devoted to algorithmic details and experimental evaluations, it isn't too difficult to find manuscripts which reveal the philosophic underpinnings of the proposed research. It is even possible to find papers which are entirely devoted to understanding the philosophical motivations of a past generation of research.

A prime example of interaction with the past is the paper "Object Recognition in the Geometric Era: A Retrospective," by Joseph L. Mundy from Brown University. Such a compilation of ideas -- perhaps even a mini-summa -- is quite accessible to any researcher in the field of Computer Vision. Avoiding the specific details of any algorithm developed in the so-called Geometric Era of Computer Vision, this text is both entertaining and highly educational. By reading such texts one is effectively communicating (albeit one-way) with a larger scientific community of the past.

To conclude, I would like to point out that I neither agree with all of the past paradigms of Computer Vision, nor am I a die-hard proponent of the modern statistical machine learning school of thought. However, to explore new territories, what better way to scope out the world around you than by standing on the shoulders of giants? We should be aware of what has been done in the past, and sometimes de-emphasize algorithmic minutiae in order to understand the philosophical motivations behind former paradigms.

Wednesday, February 20, 2008

On Geometry and Computer Science

What do you think of when you hear the term 'Geometry' ? Perhaps you think of elementary mathematics courses you've taken in the past. Perhaps you think of a branch of mathematics concerned with properties of space such as length and volume. If you're like me then you probably don't think too much about the origins of this term -- that Geometry means measurement (metry) of the Earth (Geo). This is a case where the discipline has impacted so many other fields and thus transcended its origins in such a way that most of us don't necessarily think about the earth when we think about Geometry.

How about the term 'Computer Science'? Most of us probably still think about computer programming when we think about computer science. I believe that one day Computer Science will encompass so much of our daily lives that we will forget about the origins of this term. Dijkstra once said, "Computer Science is no more about computers than astronomy is about telescopes." I have to agree with him in the sense that Computer Science is a mental framework for solving problems -- it doesn't necessarily require computers.

How about 'Computer Vision'? Being a much younger discipline than Computer Science, we will have to wait and see what happens to this term. I've argued in earlier posts that it will become clear in the future that to solve the problem of Computer Vision, the field will inevitably need to become more concerned with intelligence, learning, and metaphysics, and less with visual attributes and image processing. Maybe there will be no term 'Computer Vision' in the future and the field of Machine Learning will take the glory. Or perhaps the term will stick but become so commonplace that we will forget how Computer Vision initially started out.

Saturday, February 02, 2008

Paris Adventure

This Sunday, instead of watching the Superbowl I will be flying somewhere over the Atlantic Ocean heading towards Paris, France. Even though I've been learning the French language at home, living in Paris for three months will teach me more than just the language -- conversing in French will be only a small part of my French experience.

I'm going to use a bunch of Google Maps features to track the places I've seen and visited. I will also be relying on Skype for communicating with friends and family overseas.

Au revoir,
Tomasz