Since I started monitoring the IPs of the people who visit my blog (via StatCounter's free service), I've recorded 452 unique hits. This quickly got me thinking about the purpose of a blog, so here are some remarks (feel free to comment if you agree or disagree with any of these views):
The Diary-Blog
A blog can be used to keep track of one's daily life. One can reconstruct significant events in one's own life by reading old blog entries. This is the 'my blog is for me' view. The only reason this portrayal isn't perfectly aligned with the traditional notion of a diary is that a blog is inherently public. Anybody can read anybody else's blog. The next few categories revolve around this very important depiction of a blog as a non-private collection of entries.
The News-About-Me-Blog
Since a blog is public and can be read by anybody with an internet connection, it is a way for the world to obtain information about the blogger without direct communication. In this view, the blog is the interface between the blogger and the rest of the world. A {stranger, friend, foe} doesn't have to bother the blogger by calling them to find out what they are up to, and they don't even have to check their IM away message in the middle of the night. The blog is always up since it is posted on the internet. However, in this view the outside world that reads the blog is nameless; it is a faceless corpus of readers.
The Philosophy-Blog (where comments are key)
By keeping the blog interesting, the blogger can keep customers coming back. Here I use the word customer to denote a blog reader. Generally the satisfied customers are people who are interested in some of the topics conveyed throughout the blog. This allows the blogger to gear certain blog entries toward that particular crowd. For example, a recurring theme throughout my blog is the philosophical question "What is the world made of?" and how it relates to my current life as a researcher in the fields of computational vision and learning. By entertaining the customers who are also interested in such deeper questions and interacting with them via comments, a blog can help exchange ideas with a broad audience.
The I-am-talking-to-You-Blog
With tools such as StatCounter, a blogger who is competent in statistics can make many insightful inferences about the blog-viewing habits of his/her customers. Now, allow me (the blogger) to present you (the customer) with a rather intriguing use of the blog. While the Philosophy-Blog was geared toward a large audience of people who share similar interests, the I-am-talking-to-you-Blog portrayal is centered on the key observation that one particular person will read the blog entry with an exceptionally high probability. Under this model, a particular blog entry is geared for one person, and one person only. However, due to the non-private nature of a blog, it is usually not obvious whether a blog entry was written for one person and who that target person might be. Surely the blogger could explicitly state who the blog entry was for and why, but that would defeat the whole purpose of person-targeting on a public blog! If a blogger wanted to say something directly, then they would use {phone, email, IM} as opposed to posting something on their blog! The genius behind the I-am-talking-to-you-Blog paradigm is that one can gear a blog entry for a particular customer and never state so directly; therefore, the blogger can always deny that the blog entry was geared toward any particular person! Clearly, a customer can only infer the true intentions/target of a blog entry when the blogger uses sophisticated obfuscatory techniques. By discussing content that is general enough for a broad audience to classify as random-talk, yet specific enough that the target customer can transcend the seemingly random arrangement of words, the blogger can steer his voice in the proper direction.
In conclusion, I have written some Python code that automatically downloads people's blogs (their entire archive, actually) in an attempt to mine the internet. The internet, combined with well-formatted blogging, is an ideal interface between people's most intimate thoughts and machines.
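For the curious, here is a minimal sketch of the kind of script I mean (this is an illustration, not the actual code; the Blogger-style archive URL pattern and the example blog address are made up):

```python
# A minimal sketch of the kind of script I mean (not the actual code).
# It assumes a Blogger-style blog with monthly archive pages at hypothetical
# URLs like http://example.blogspot.com/2005_11_01_archive.html.
import urllib.request

def download_archives(base_url, years, months):
    """Fetch monthly archive pages and return them as a list of HTML strings."""
    pages = []
    for year in years:
        for month in months:
            url = f"{base_url}/{year}_{month:02d}_01_archive.html"
            try:
                with urllib.request.urlopen(url) as response:
                    pages.append(response.read().decode("utf-8", errors="ignore"))
            except Exception:
                # Months with no posts simply have no archive page; skip them.
                continue
    return pages

# Example usage (hypothetical blog address):
# archive_html = download_archives("http://example.blogspot.com", [2005], range(1, 13))
```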
Monday, November 28, 2005
Latent Topics and the Turing Test
Researchers in statistical language modeling employ the concept of a stoplist. A stoplist is a list of commonly occurring words such as "the", "of", and "are." When using a statistical technique based on the bag-of-words assumption (word exchangeability), these stopwords are discarded with the hope that the remaining words are the truly informative ones. Although suitable for classification and clustering tasks, such an approach falls short of modelling the syntax of the English language.
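To make the setup concrete, here is a toy sketch of a bag-of-words representation with a stoplist (the stoplist and the example sentence are purely illustrative):

```python
# A toy illustration of the standard bag-of-words pipeline with a stoplist:
# stopwords are dropped and only the counts of the remaining words are kept.
from collections import Counter

STOPLIST = {"the", "of", "are", "a", "an", "and", "is", "to"}  # illustrative only

def bag_of_words(text, stoplist=STOPLIST):
    """Lowercase, tokenize on whitespace, drop stopwords, and count the rest."""
    tokens = text.lower().split()
    return Counter(t for t in tokens if t not in stoplist)

print(bag_of_words("The books on the shelf are thick books"))
# Counter({'books': 2, 'on': 1, 'shelf': 1, 'thick': 1})
```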
I believe that we should stop using stoplists. These 'meaningless' words are the glue that binds together the informative words, and if we want to be able to perform tasks such as grammar checking or spell checking then we have to look beyond bag-of-words. By complementing models such as LDA with a latent syntactic label per word, we can attain partial exchangeability.
Latent Semantic Topic = A topic that denotes high-level information about the target sentence.
Latent Syntactic Topic = A topic that denotes the type of word (such as noun, verb, or adjective).
Consider the sentence:
I read thick books.
This sentence is generated from the syntactic skeleton [noun,verb,adjective,noun].
Ideally, we want to understand text in such a way that the generative process produces text that appears to have been written by a human.
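Here is a toy sketch of what such a generative process might look like (the topics, word lists, and skeleton below are invented purely for illustration; a real model would learn these distributions from data):

```python
# A toy generative sketch: a latent semantic topic picks the content words,
# while a latent syntactic skeleton decides which kind of word goes where.
# All topics and word lists below are made up for illustration.
import random

# Each semantic topic has its own distribution over words for each word class.
TOPICS = {
    "reading": {"noun": ["I", "books", "papers"], "verb": ["read", "skim"],
                "adjective": ["thick", "dense"]},
    "vision":  {"noun": ["I", "images", "pixels"], "verb": ["segment", "label"],
                "adjective": ["blurry", "bright"]},
}

def generate_sentence(topic, skeleton):
    """Fill a syntactic skeleton (a list of word classes) with words drawn
    from the chosen semantic topic."""
    return " ".join(random.choice(TOPICS[topic][word_class])
                    for word_class in skeleton)

print(generate_sentence("reading", ["noun", "verb", "adjective", "noun"]))
# e.g. "I read thick books"
```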
-----------------------------------
I would like to thank Jon for pointing out the Integrating Topics and Syntax paper which talks about this. Only two days after I posted this entry he showed me this paper (of course Blei is one of the authors).
Tuesday, November 22, 2005
pittsburgh airport wireless
I'm currently sitting at the Pittsburgh Airport Terminal browsing the web. I finished reading Cryptonomicon 3 minutes ago.
I finished my Machine Learning project on Latent Dirichlet Allocation yesterday. I also won (tied for 1st, actually) a photography competition in my Appearance Modeling class. Congrats to Jean-Francois for splitting the win with me! Coincidentally, we are working together on a project for that class.
Today in my Machine Learning class, Prof. Mitchell talked about dimensionality reduction techniques. It still feels a bit weird when people call PCA and SVD "unsupervised dimensionality reduction" techniques. People should really make sure that they understand the singular value decomposition not only as "some recondite matrix factorization," but as "the linear operator factorization."
I was thinking about applying SVD to Latent Dirichlet Allocation features (the output of my machine learning project). As an unsupervised machine learning technique, LDA automatically learns hidden topics. The output of LDA is the probability of a word belonging to a particular topic, P(w|z). Variational inference can then be used to find information about an unseen document: given this novel document, we can determine P(z|d). In other words, LDA maps a document of arbitrary length to a vector on the K-simplex (the K-topic mixture proportions form a multinomial random variable).
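As a concrete illustration of that mapping (this is a sketch using an off-the-shelf LDA implementation, not the code from my project, and the tiny corpus is made up):

```python
# A sketch (not my project code) of mapping documents onto the K-topic simplex
# with an off-the-shelf LDA implementation; the tiny corpus is made up.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = [
    "the camera captures images of the scene",
    "we infer latent topics from word counts",
    "pixels and edges form the image",
    "documents are mixtures of hidden topics",
]

K = 3
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)  # document-word count matrix
lda = LatentDirichletAllocation(n_components=K, random_state=0).fit(X)

# For a novel document, transform() gives the posterior topic mixture P(z|d):
# a point on the K-simplex (non-negative weights that sum to one).
novel = vectorizer.transform(["hidden topics in camera images"])
print(lda.transform(novel))
```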
By constructing a large data matrix with each row being the multinomial mixture weights of a particular document (this matrix would have as many columns as topics) and performing SVD, we would hope to be able to create a new set of K' uncorrelated topics. This is just like diagonalizing a matrix.
It would also be interesting to run LDA with different numbers of topics (K = {30, 40, 50, 60, ..., 300}) and determine the rank (or just look at the spectrum) of the data matrix. The rank would tell us how many 'truly independent' categories are present in our corpus. Here a category would be defined as a linear combination of latent topics, and it would be interesting to see how these 'orthonormal' categories obtained from SVD relate to the original newsgroup categories.
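A rough sketch of that experiment, with random stand-in data playing the role of the LDA output:

```python
# A sketch of the experiment: stack the per-document topic mixtures P(z|d)
# into a matrix, take its SVD, and look at the singular value spectrum to
# estimate how many 'truly independent' categories the corpus contains.
import numpy as np

def effective_rank(doc_topic_matrix, tol=1e-2):
    """Return the singular values and the number above a relative tolerance."""
    # Rows are documents, columns are the K topic mixture weights.
    singular_values = np.linalg.svd(doc_topic_matrix, compute_uv=False)
    rank = int(np.sum(singular_values > tol * singular_values[0]))
    return singular_values, rank

# Random stand-in data; in the real experiment each row would be LDA's P(z|d)
# for one document in the corpus.
theta = np.random.dirichlet(alpha=np.ones(50), size=2000)  # 2000 docs, 50 topics
spectrum, rank = effective_rank(theta)
print(rank, spectrum[:5])
```

The right singular vectors of the same decomposition would then give the 'orthonormal' categories, each expressed as a linear combination of the original latent topics.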
Sunday, November 20, 2005
Root Reps in Cryptonomicon
While reading Neal Stephenson's Cryptonomicon I came across an interesting exchange between Enoch Root and Randy Waterhouse about Root Reps. A Root Rep (short for Root Representation) is an internal representation of Enoch Root, a mystical character in the novel. As stated in the book, the Root Rep is "some pattern of neurological activity," while the physical Enoch Root is some "big slug of carbon and oxygen and some other stuff." This concept was introduced by Enoch Root to reinforce the idea that the Root Rep is "the thing that you'll carry around in your brain for the rest of your life." Enoch Root said to Randy that instead of "thinking about me qua this big slug of carbon, you are thinking about the Root Rep."
I think that people should be aware of how their actions influence their Reps (representations of themselves for other people). In this case a person's Rep is the neurological activity that is induced in another human's brain. We (made up of matter) can never truly enter another person's direct consciousness in a way that transcends the Rep. Although each one of us has a self-Rep that might be difficult to alter, each time we spread our Rep (by allowing others to communicate with us, read about us, or think about us in any way) we can influence the Rep creation.
What do you want to be today? Perhaps you cannot alter your own self-Rep very easily, but you can easily alter the Rep that is created inside other minds. Essentially, the Rep that represents us inside other minds is based on a finite set of observations and thoughts that most often result from direct interactions. By carefully controlling our interactions with others we can help shape the Rep. Cultivate a Rep today, be anything you want tomorrow.
Friday, November 18, 2005
Cartesian Philosophy, Computational Idealism, and a Machine-in-a-vat
Once my ideas congeal beyond the state of utter ineffability, I will upload an interesting short essay on Cartesian Philosophy to my blog. I will attempt to expose contemporary research in artificial intelligence as overly Cartesian. Additionally, I will try to explain how a postmodern philosophy based on social constructivism is necessary to advance the field of computational intelligence. Relating my ideas to the Cartesian concept of a "brain in a vat," I will also paint a Matrix-like picture where the key player in the machine intelligence game is the internet.
Wednesday, November 16, 2005
shackles of vocabulary
At first glance, the myriad of interchangeable terms found in the English language appears to enhance its expressive power. On the contrary, the freedom that one has in choosing the precise words to express his/her ideas can sometimes hinder communication. Not only can one uncompress a sentence into its primary intended meaning, but one can also extract additional secondary information from the mere choice of words. This additional channel of information could be used for stealth communication between two parties; however, it is most often used for a slightly different purpose. By entering the world of metaphor and dabbling in the field of primary meaning invariance, one can encode a sentence with a hierarchy of secondary meanings. While also useful for surreptitious exchange of information, the plurality of meaning provides the author with a mechanism for saying things that they don't want to say directly.
But you might ask yourself: why encode nonlinear meaning into a message as opposed to keeping it straightforward? Isn't there a possibility that the receiving party will fail to receive the hierarchy of secondary meanings? Sure, if we aren't trying to hide anything from an intermediate party, then this hierarchical injection of information does nothing but obfuscate the primary message. But some of us still do it on a daily basis. I'd like to know why. It's not like we're trying to be poetic here.
Saturday, November 12, 2005
softcore study of consciousness is for wimps
What is softcore study of consciousness? My personal view is that softcore study of anything is study performed by people who lack hardcore quantitative skills. For example, consider the contemporary philosopher who conveys his ideas by writing large corpora of text as opposed to performing any type of analysis (whether it be an empirical study or dabbling in Gedanken-Hilbert space).
If somebody wants to convince me that I should read their long publications on consciousness, they better be a hardcore scientist and not some type of calculus-avoiding softee.
Allow me to now boast of MIT's Center for Biological & Computational Learning. It's not like I one day decided to learn about biological research; I know of Tomaso Poggio (the big name associated with this lab) because a few weeks ago I wanted to learn about Reproducing Kernel Hilbert Spaces. Awesome! These guys are no dabblers and I personally encourage them to speak of consciousness. If you take a look at their publications list you'll notice that it is well aligned with my current academic interests. Their entire research plan supports the lemma that computer vision isn't all about machines!
Wednesday, November 09, 2005
Broader Impacts Season Ends
Broader Impacts season has just ended. Final Standings reported below:
Tomasz: 12-4
Broader Impacts: 11-5
Intellectual Merit: 16-0
NSF: 4-12
Sleep: 6-10
I would like to thank Justin, Vince, Aiah, and most notably Alyosha for their insightful comments regarding my NSF essays. Now it's time to focus on unsupervised learning and probabilistic graphical models.
-Tomasz
Tuesday, November 08, 2005
Action at a Distance and Computer Vision
The problem of action at a distance, which has been around since the time of Newton, still plagues us. While introduced in the context of gravitational attraction between two heavenly bodies, it has recently come up again in the context of object independence. Allow me to quickly explain.
The original problem was: how can two objects instantaneously 'communicate' via a gravitational attraction? How can scientists make sense of this action at a distance?
In the context of vision, how does the localization of one object influence the localization of another object in a scene? In other words, how can information about object A's configuration be embedded in object B's configuration?
Being the postmodern idealist that I am, I am not afraid to posit the thesis that we, the perceivers, are the quark-gluon plasma that binds together the seemingly distinct bits of information we acquire from the world. Perhaps what we semantically segment and label as object A is nothing but a subjective boundary that allows our perception to relate it to another subjective semantic segmentation called object B. When working on your next research project, remember that maybe the world isn't made up of things that you can see.
Saturday, November 05, 2005
a bag of words and can of whoop ass
If you are interested in object recognition then you must check out the ICCV 2005 short course on Recognizing and Learning Object Categories.
From the link:
This course reviews current methods for object category recognition, dividing them into four main areas: bag of words models; parts and structure models; discriminative methods and combined recognition and segmentation. The emphasis will be on the important general concepts rather than in depth coverage of contemporary papers. The course is accompanied by extensive Matlab demos.
Friday, November 04, 2005
vision people like eyes
My life as a graduate student consists of coming up with crazy ideas related to machine perception of the visual world. Somewhere between the cold objective yet uninterpretable reality and the resplendent human mind lies the human visual system. I'm not actually into retinas per se. However, I strive to understand the process that is responsible for understanding the visual world. Inadvertently, I am into gateways between the objective and the subjective (i.e., I'm into eyes in the metaphorical sense).
Many people would agree with the statement that the personal experience of 'thinking' is rather subjective, while the personal experience of 'seeing' is rather objective when two different people are looking at the same thing. However, I don't agree with this statement. I'm rather skeptical of a hard split between the objective world and the subjective realm inside us. If vision is observation and thinking is theory, then Popper's thesis that all observations are theory-laden translates as follows: seeing is tainted with intelligence. I'm not saying that the eyes are doing anything magical to transform the cold soulless material world into the warm and fuzzy subjective realm of inner consciousness; however, visual information is processed via this gateway, and if we ever want to reconstruct (or at least fit a naive model of) someone's inner realm then we have to start hacking the eyes.
Perhaps we should not be looking for intelligence in people's brains. Perhaps we (I?) should look at the gateway between the objective and the subjective.
I like to wear sunglasses so that people don't know what I'm thinking. The gateway is the weakest link and I don't want to be exposed to crackers. Even though I'm not overly afraid of Van Eck phreakers peeking into my soul, I still wear shades. But you and I don't really need expensive Van Eck phreaking apparatus (fMRI?) when we have our own eyes. Peek into an eye today and learn an idea tomorrow.
Tuesday, November 01, 2005
An analogy, Karl Popper, and science for machines
There is a nice analogy between the problem of segmentation and the problem of object detection/classification/recognition.
Segmentation is grouping on an intra-image spatial level.
Detection/classification/recognition is grouping on an inter-image level.
Tracking is grouping at the inter-frame temporal level.
--------------------
Let me remind the reader about the Theory-Observation Distinction that is mainly attributed to Karl Popper. "All observation is selective and theory-laden," and similar quotes can be found in the Stanford Encyclopedia of Philosophy entry on Karl Popper. The entry further states that Popper repudiates induction, rejects the view that it is the characteristic method of scientific investigation and inference, and substitutes falsifiability in its place.
Researchers in the field of machine vision could learn from the philosophy of science. When placed in the context of machine intelligence, Popper's ideas sound like this:
The notion of training a system to classify images by presenting it with a large set of labeled examples and building a visual model is analogous to using induction over a finite set of observables. However, since the philosophy of science teaches us that there is much to say about positing a theory, maybe we should be less concerned with machines that perform data-driven model building and more concerned with building machines that can posit models and verify them.
----------
Should we be building machines that posit scientific theories, or are we doing this already?