Tombone's Computer Vision Blog: Deep Learning vs Machine Learning vs Pattern Recognition

Friday, March 20, 2015

Deep Learning vs Machine Learning vs Pattern Recognition

Let's take a close look at three related terms (Deep Learning vs Machine Learning vs Pattern Recognition) and see how they relate to some of the hottest tech themes of 2015 (namely Robotics and Artificial Intelligence). In our short journey through jargon, you should acquire a better understanding of how computer vision fits in, as well as gain an intuitive feel for how the machine learning zeitgeist has slowly evolved over time.

Fig 1. Putting a human inside a computer is not Artificial Intelligence

(Photo from WorkFusion Blog)

If you look around, you'll see no shortage of jobs at high-tech startups looking for machine learning experts. While only a fraction of them are looking for Deep Learning experts, I bet most of these startups can benefit from even the most elementary kind of data scientist. So how do you spot a future data-scientist? You learn how they think.

The three highly-related "learning" buzz words

“Pattern recognition,” “machine learning,” and “deep learning” represent three different schools of thought. Pattern recognition is the oldest (and as a term is quite outdated). Machine Learning is the most fundamental (one of the hottest areas for startups and research labs as of today, early 2015). And Deep Learning is the new, the big, the bleeding-edge -- we’re not even close to thinking about the post-deep-learning era. Just take a look at the following Google Trends graph. You'll see that a) Machine Learning is rising like a true champion, b) Pattern Recognition started as synonymous with Machine Learning, c) Pattern Recognition is dying, and d) Deep Learning is new and rising fast.

1. Pattern Recognition: The birth of smart programs

Pattern recognition was a term popular in the 70s and 80s. The emphasis was on getting a computer program to do something “smart” like recognize the character "3". And it really took a lot of cleverness and intuition to build such a program. Just think of "3" vs "B" and "3" vs "8". Back in the day, it didn’t really matter how you did it as long as there was no human-in-a-box pretending to be a machine (see Figure 1). So if your algorithm would apply some filters to an image, localize some edges, and apply morphological operators, it was definitely of interest to the pattern recognition community. Optical Character Recognition grew out of this community, and it is fair to call “Pattern Recognition” the “Smart” Signal Processing of the 70s, 80s, and early 90s. Decision trees, heuristics, quadratic discriminant analysis, etc. all came out of this era. Pattern Recognition became something CS folks did, and not EE folks. One of the most popular books from that time period is the ~~infamous~~ invaluable Duda & Hart "Pattern Classification" book, which is still a great starting point for young researchers. But don't get too caught up in the vocabulary; it's a bit dated.

The character "3" partitioned into 16 sub-matrices. Custom rules, custom decisions, and custom "smart" programs used to be all the rage.
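To make the flavor of those hand-built programs concrete, here is a minimal sketch in Python of a rule-based recognizer. It splits a tiny binary image into a 4x4 grid (16 cells), measures how much ink falls in each cell, and applies one hand-written rule. The 8x8 glyphs, the function names, and the 0.25 threshold are all made up for illustration; they are not taken from any historical OCR system.

```python
# Hypothetical 8x8 glyphs: '#' is ink, '.' is background.
THREE = [
    ".######.",
    "......##",
    "......##",
    "..#####.",
    "......##",
    "......##",
    "......##",
    ".######.",
]
EIGHT = [
    ".######.",
    "##....##",
    "##....##",
    ".######.",
    "##....##",
    "##....##",
    "##....##",
    ".######.",
]


def to_bitmap(rows):
    """Turn the text glyph into a grid of 0/1 pixels."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def cell_densities(image, grid=4):
    """Split a square image into grid x grid cells; return the ink fraction of each."""
    step = len(image) // grid
    densities = []
    for gr in range(grid):
        row = []
        for gc in range(grid):
            ink = sum(
                image[r][c]
                for r in range(gr * step, (gr + 1) * step)
                for c in range(gc * step, (gc + 1) * step)
            )
            row.append(ink / (step * step))
        densities.append(row)
    return densities


def looks_like_three(image):
    """Hand-written rule: a '3' is open on the left and inked on the right
    in the two middle rows of cells. The threshold is an arbitrary guess."""
    d = cell_densities(image)
    left_open = d[1][0] < 0.25 and d[2][0] < 0.25
    right_inked = d[1][3] > 0.25 and d[2][3] > 0.25
    return left_open and right_inked


print(looks_like_three(to_bitmap(THREE)))
print(looks_like_three(to_bitmap(EIGHT)))
```

Notice that nothing here is learned: every threshold came out of somebody's head, and telling "3" from "B" would need yet another rule.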

Quiz: The most popular Computer Vision conference is called CVPR and the PR stands for Pattern Recognition. Can you guess the year of the first CVPR conference?

2. Machine Learning: Smart programs can learn from examples

Sometime in the early 90s people started realizing that a more powerful way to build pattern recognition algorithms is to replace an expert (who probably knows way too much about pixels) with data (which can be mined from cheap laborers). So you collect a bunch of face images and non-face images, choose an algorithm, and wait for the computations to finish. This is the spirit of machine learning. "Machine Learning" emphasizes that the computer program (or machine) must do some work after it is given data. The Learning step is made explicit. And believe me, waiting 1 day for your computations to finish scales better than inviting your academic colleagues to your home institution to design some classification rules by hand.
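A toy sketch of that recipe in Python might look like the following: labeled examples go in, an explicit learning step runs, and a classifier comes out. It assumes each image has already been boiled down to two made-up feature values, and it uses a nearest-centroid classifier only because that fits in a few lines; the data, labels, and function names are hypothetical.

```python
def fit_centroids(examples):
    """The learning step: average the feature vectors seen for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical training set: (feature vector, label) pairs.
training = [
    ([0.9, 0.8], "face"), ([0.8, 0.9], "face"), ([1.0, 0.7], "face"),
    ([0.1, 0.2], "non-face"), ([0.2, 0.1], "non-face"), ([0.0, 0.3], "non-face"),
]

centroids = fit_centroids(training)
print(predict(centroids, [0.7, 0.6]))
print(predict(centroids, [0.3, 0.2]))
```

The contrast with a hand-written rule is that swapping in different training data changes the classifier without anyone touching the code.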

"What is Machine Learning" from Dr Natalia Konstantinova's Blog. The most important part of this diagram is the "Gears," which suggest that crunching/working/computing is an important step in the ML pipeline.

As Machine Learning grew into a major research topic in the mid 2000s, computer scientists began applying these ideas to a wide array of problems. No longer was it only character recognition, cat vs. dog recognition, and other “recognize a pattern inside an array of pixels” problems. Researchers started applying Machine Learning to Robotics (reinforcement learning, manipulation, motion planning, grasping), to genome data, as well as to predict financial markets. Machine Learning was married with Graph Theory under the brand “Graphical Models,” every robotics expert had no choice but to become a Machine Learning Expert, and Machine Learning quickly became one of the most desired and versatile computing skills. However, "Machine Learning" says nothing about the underlying algorithm. We've seen convex optimization, Kernel-based methods, Support Vector Machines, as well as Boosting have their winning days. Together with some custom manually engineered features, we had lots of recipes, lots of different schools of thought, and it wasn't entirely clear how a newcomer should select features and algorithms. But that was all about to change...

Further reading: To learn more about the kinds of features that were used in Computer Vision research see my blog post: From feature descriptors to deep learning: 20 years of computer vision.

3. Deep Learning: one architecture to rule them all

Fast forward to today and what we’re seeing is a large interest in something called Deep Learning. The most popular kinds of Deep Learning models, as they are used in large-scale image recognition tasks, are known as Convolutional Neural Nets, or simply ConvNets.

ConvNet diagram from Torch Tutorial

Deep Learning emphasizes the kind of model you might want to use (e.g., a deep convolutional multi-layer neural network) and that you can use data to fill in the missing parameters. But with deep learning comes great responsibility. Because you are starting with a model of the world which has a high dimensionality, you really need a lot of data (big data) and a lot of crunching power (GPUs). Convolutions are used extensively in deep learning (especially computer vision applications), and the architectures are far from shallow.
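For readers who have never seen the core operation, here is a minimal pure-Python sketch of what a single convolutional filter computes. As in most ConvNet code, the kernel is slid over the image without being flipped (strictly speaking, a cross-correlation). The tiny image and the hand-picked edge kernel are assumptions for illustration; in a real ConvNet the kernel values are exactly the "missing parameters" that get filled in from data.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over every position where it fits entirely inside
    the image ("valid" mode) and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 4x4 image: dark on the left half, bright on the right half.
image = [[0, 0, 1, 1] for _ in range(4)]

# Hand-picked 1x2 kernel that responds where brightness increases left to right.
edge_kernel = [[-1, 1]]

response = conv2d_valid(image, edge_kernel)
for row in response:
    print(row)
```

A deep network stacks many such filters, with a nonlinearity between layers, which is where both the parameter count and the appetite for data and GPUs come from.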

If you're starting out with Deep Learning, simply brush up on some elementary Linear Algebra and start coding. I highly recommend Andrej Karpathy's Hacker's guide to Neural Networks. Implementing your own CPU-based backpropagation algorithm on a non-convolution based problem is a good place to start.
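As one possible starting exercise, here is a minimal CPU-only backpropagation sketch in plain Python on a non-convolutional toy problem (XOR). The architecture (one tanh hidden layer, a sigmoid output, cross-entropy loss), the learning rate, the step count, and every name below are assumptions chosen for brevity; this is not code from Karpathy's guide.

```python
import math
import random

XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def init_params(hidden, rng):
    """Small random weights, zero biases."""
    return {
        "W1": [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "W2": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b2": 0.0,
    }


def forward(p, x):
    """Return hidden activations and the predicted probability of label 1."""
    h = [math.tanh(w[0] * x[0] + w[1] * x[1] + b) for w, b in zip(p["W1"], p["b1"])]
    z = sum(w * hj for w, hj in zip(p["W2"], h)) + p["b2"]
    return h, 1.0 / (1.0 + math.exp(-z))


def loss(p, data):
    """Mean cross-entropy over the dataset."""
    total = 0.0
    for x, y in data:
        _, prob = forward(p, x)
        total -= y * math.log(prob) + (1 - y) * math.log(1 - prob)
    return total / len(data)


def gradients(p, data):
    """Backpropagation: apply the chain rule from the loss back to each parameter."""
    hidden = len(p["W2"])
    g = {"W1": [[0.0, 0.0] for _ in range(hidden)], "b1": [0.0] * hidden,
         "W2": [0.0] * hidden, "b2": 0.0}
    for x, y in data:
        h, prob = forward(p, x)
        dz = (prob - y) / len(data)  # d(loss)/d(pre-sigmoid output)
        g["b2"] += dz
        for j in range(hidden):
            g["W2"][j] += dz * h[j]
            dpre = dz * p["W2"][j] * (1.0 - h[j] ** 2)  # back through tanh
            g["b1"][j] += dpre
            g["W1"][j][0] += dpre * x[0]
            g["W1"][j][1] += dpre * x[1]
    return g


def gradient_step(p, g, lr):
    """Move every parameter a small step against its gradient."""
    for j in range(len(p["W2"])):
        p["W1"][j][0] -= lr * g["W1"][j][0]
        p["W1"][j][1] -= lr * g["W1"][j][1]
        p["b1"][j] -= lr * g["b1"][j]
        p["W2"][j] -= lr * g["W2"][j]
    p["b2"] -= lr * g["b2"]


params = init_params(hidden=4, rng=random.Random(0))
initial_loss = loss(params, XOR)
for _ in range(5000):
    gradient_step(params, gradients(params, XOR), lr=0.5)
final_loss = loss(params, XOR)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

A good habit when writing backprop by hand is to compare a few entries of `gradients` against finite differences of `loss` before trusting any training run.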

There are still lots of unknowns. The theory of why deep learning works is incomplete, and no single guide or book is better than true machine learning experience. There are lots of reasons why Deep Learning is gaining popularity, but Deep Learning is not going to take over the world. As long as you continue brushing up on your machine learning skills, your job is safe. But don't be afraid to chop these networks in half, slice 'n dice at will, and build software architectures that work in tandem with your learning algorithm. The Linux Kernel of tomorrow might run on Caffe (one of the most popular deep learning frameworks), but great products will always need great vision, domain expertise, market development, and most importantly: human creativity.

Other related buzz-words

Big-data is the philosophy of measuring all sorts of things, saving that data, and looking through it for information. For business, this big-data approach can give you actionable insights. In the context of learning algorithms, we’ve only started seeing the marriage of big-data and machine learning within the past few years. Cloud-computing, GPUs, DevOps, and PaaS providers have put large-scale computing within reach of the researcher and the ambitious "everyday" developer.

Artificial Intelligence is perhaps the oldest term, the most vague, and the one that has gone through the most ups and downs in the past 50 years. When somebody says they work on Artificial Intelligence, you are either going to want to laugh at them or take out a piece of paper and write down everything they say.

Further reading: My 2011 Blog post Computer Vision is Artificial Intelligence.

Conclusion

Machine Learning is here to stay. Don't think about it as Pattern Recognition vs Machine Learning vs Deep Learning; just realize that each term emphasizes something a little bit different. But the search continues. Go ahead and explore. Break something. We will continue building smarter software and our algorithms will continue to learn, but we've only begun to explore the kinds of architectures that can truly rule-them-all.

If you're interested in real-time vision applications of deep learning, namely those suitable for robotic and home automation applications, then you should check out what we've been building at vision.ai. Hopefully in a few days, I'll be able to say a little bit more. :-)

Until next time.

See discussion about this blog post on Hacker News.

23 comments:

Tomasz Malisiewicz, 3:03 PM
The "Pattern Classification" book which I mentioned in this blog post is the orange/red book which looks like this: http://www.wiley.com/WileyCDA/WileyTitle/productCd-0471056693.html
It's really a great book (even though it is a bit dated).

A more recent machine learning textbook is Christopher Bishop's "Pattern Recognition and Machine Learning," which is still used in ML classrooms today.

http://research.microsoft.com/en-us/um/people/cmbishop/prml/

TheStranger, 11:28 AM
Thank you for great article!

Tomasz Malisiewicz, 2:23 PM
There will be some posts describing what we've been up to with vision.ai, but here are 10 Beta keys for the vision.ai software. Please only use one, and be warned that they are first come first served. I will post an update when they are all gone:

6db56b8c-b775-4516-9843-f186ab3cf938
182cddf3-78f7-4aeb-abeb-e47166142f34
4f2b41f5-26d5-4175-8920-c499c9c62e28
72c2a7f3-2e1f-4761-8f9b-424c86926e6b
1393dcca-057a-4f32-b840-daa4dd1f51a7
6bdcd723-3fb2-4c6b-816f-72f13a4c0b4d
f40125e9-85fe-4b5b-8de0-ba9ceeacafa8
f61d71b8-676b-4650-8fe5-efd320b39a2b
30d58085-e25c-4b30-abf6-3871ae0bde8d
cfd23407-0ec1-483c-bae5-b3653838b5f0

Tomasz Malisiewicz, 9:56 PM
Thanks for eating up some of these licenses, folks. They work for the current version and will work for all future updates of our VMX software. The tutorials will be out soon, but if you're eager then check it out. 5 licenses left.

If you sign up to the vision.ai mailing list, we will keep you updated with news of our imminent "big launch".

You can train detectors using our GUI and get a taste of what next generation Machine Learning is all about. But you don't actually need to know anything about Machine Learning to start training your own object recognition models.

Anonymous, 12:13 AM
tried. Great tools! but poor ui:( much things have to be done. somehow like machine learning module on Azure.

MD, 8:24 AM
OK, now I'm wondering: how is deep learning different (or distinct?) from just using artificial neural networks (ANNs; which have been used as models in machine learning since the inception of this field)?

Say, compared to the MLP (multilayer perceptron ANN -- as traditional as it gets, I suppose) -- what's new in deep learning?

Is it the kind of neural network (ConvNets instead of MLPs?) -- is it the number or size of layers (but that's not really new, then, since large, multilayer MLPs aren't nothing new, either)?

Or, is it the way they're being trained (hardware choices, like GPUs -- as well as math algorithms choices, like stochastic gradient descent; or avoiding the numerical difficulties, like the vanishing gradient problem, bringing us to LSTM, pretraining? -- if so, what exactly about any of these developments "identifies" whether one is using deep learning -- and how?).

It's not easy to disambiguate between ANNs and deep learning for me at the moment -- for instance, everything in the Andrej Karpathy's guide is just good, old, traditional machine learning applying classic ANNs.

Any more details would be greatly appreciated! :-)

Tomasz Malisiewicz, 7:29 PM
Hi MD,

You're rightfully confused. The boundary between multilayer perceptron ANNs and deep learning systems is blurry, and because working on Deep Learning is so fashionable as of late, lots of systems are claiming to be deep.

Deep learning vs old ANNs is based on 3 things:

1.) the amount of data involved. ImageNet is orders of magnitude larger than the kind of MNIST-style datasets that were popular in the 80s and 90s.

2.) computers are much faster. It's too late for me to do the back-of-the-envelope calculation, but I wouldn't be surprised if we're running today's learning algorithms 1000x longer than we previously thought necessary.

3.) algorithms are a little bit smarter. There are tricks like ReLU activation functions and dropout, but they probably aren't as important as good ol' minibatch gradient descent and a lot of patience.
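For anyone who wants to see how small those three tricks are in code, here is a rough pure-Python sketch. The function names and the "inverted" flavor of dropout (rescaling the surviving units during training) are choices made for this illustration, not a reference implementation from any framework.

```python
import random


def relu(values):
    """Rectified linear unit: negative activations are clipped to zero."""
    return [max(0.0, v) for v in values]


def dropout(values, keep_prob, rng, training=True):
    """Inverted dropout: during training, zero each unit with probability
    1 - keep_prob and rescale survivors; at test time, pass values through."""
    if not training:
        return list(values)
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


def minibatches(data, batch_size, rng):
    """Yield the dataset in shuffled chunks, one chunk per gradient step."""
    order = list(range(len(data)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [data[i] for i in order[start:start + batch_size]]


rng = random.Random(0)
print(relu([-1.0, 0.0, 2.5]))
print(dropout([1.0, 1.0, 1.0, 1.0], keep_prob=0.5, rng=rng))
for batch in minibatches(list(range(10)), batch_size=4, rng=rng):
    print(batch)
```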

Yisong Yue gives a great overview of Deep Learning on his Machine Learning blog, and the commentary is full of feedback from some of the biggest names in the field. Here is the link: http://yyue.blogspot.com/2015/01/a-brief-overview-of-deep-learning.html

An intimate understanding of Deep Learning requires that you know the insides of classic ANNs and backprop like the back of your hand. It's still the best place to start. Karpathy's guides are good.

Richard, 2:53 AM
Thank you for the article. For people like me trying to do more with computers in countries that are only developing this is important.

Aaron, 12:08 PM
Deep Learning is a type of Machine Learning is a technique for Pattern Recognition.

Neil, 4:40 AM
Nice read!

Jim, 1:04 PM
Tomasz: Excellent summary, it will help a lot of people get grounded in the field. Of course there are numerous ways to organize the topics, but the differentiating aspects are well presented. Nice job!

Unknown, 10:59 AM
Very inspiring!
Thank you :)

Unknown, 2:17 AM
Thanks for sharing such valuable information.
