Thursday, March 26, 2015

Mobileye's quest to put Deep Learning inside every new car

In Amnon Shashua's vision of the future, every car can see. He's convinced that the key technology behind the imminent driving revolution is going to be computer vision, and to experience this technology, we won't have to wait for fully autonomous cars to become mainstream. I had the chance to hear Shashua's vision of the future this past Monday, and from what I'm about to tell you, it looks like there's going to be a whole lot of Deep Learning inside tomorrow's car.

Cars equipped with Deep Learning-based pedestrian avoidance systems (see Figure 1) can sense people and dangerous situations while you're behind the wheel. From winning large-scale object recognition competitions like ImageNet, to heavy internal use by Google, Deep Learning is now at the foundation of many high-tech startups and giants. And when it comes to cars, Deep Learning promises to give us both safer roads and the highly anticipated hands-free driving experience.

Fig 1. Mobileye's Deep Learning-based Pedestrian Detector

Mobileye Co-founder Amnon Shashua shares his vision during an invited lecture at MIT
Amnon Shashua is the Co-founder & CTO of Mobileye and this past Monday (March 23, 2015) he gave a compelling talk at MIT’s Brains, Minds & Machines Seminar Series titled “Computer Vision that is Changing Our Lives”. Shashua discussed Mobileye’s Deep Learning chips, robots, and autonomous driving, and introduced his most recent project, a wearable computer vision unit called OrCam.

Fig 2. Prof Amnon Shashua, CTO of Mobileye

Let's take a deeper look at the man behind Mobileye and his vision. Below is my summary of Shashua's talk as well as some personal insights regarding Mobileye's embedded computer vision technology and how it relates to cloud-based computer vision.

Mobileye's academic roots
You might have heard stories of bold entrepreneurs dropping out of college to form million-dollar startups, but this isn't one of them. This is the story of a professor who turned his ideas into a publicly traded company, Mobileye (NYSE:MBLY). Amnon Shashua is a Professor at Hebrew University, and his lifetime achievements suggest that for high-tech entrepreneurship, it is pretty cool to stay in school. And while Shashua and I never overlapped academically (he is 23 years older than me), both of us spent some time at MIT as postdoctoral researchers.

Deep Learning's impact on Mobileye
During his presentation at MIT, Amnon Shashua showcased a wide array of computer vision problems that are currently being solved by Mobileye's real-time computer vision systems. These systems are image-based and do not require expensive 3D sensors such as the ones commonly found on top of self-driving cars. He showed videos of real-time lane detection, pedestrian detection, animal detection, and road surface detection. I have seen many similar visualizations during my academic career; however, Shashua emphasized that deep learning is now used to power most of Mobileye's computer vision systems.

Question: I genuinely wonder how much the shift to Deep methods improved Mobileye's algorithms, or if the move is a strategic technology upgrade to stay relevant in an era where Google and the competition are feverishly pouncing on the landscape of deep learning. There's a lot of competition on the hardware front, and it seems like the chase for ASIC-like Deep Learning Miners/Trainers is on.

The AlexNet CNN diagram from the popular Krizhevsky/Sutskever/Hinton paper. Shashua explicitly mentioned the AlexNet model during his MIT talk, and it appears that Mobileye has done their Deep Learning homework.

The early Mobileye: Mobileye didn’t wait for the deep learning revolution to happen. They started shipping computer vision technology for vehicles using traditional techniques more than a decade ago. In fact, I attended a Mobileye presentation at CMU almost a full decade ago -- it was given by Andras Ferencz at the 2005 CMU VASC Seminar.  This week's talk by Shashua suggests that Mobileye was able to successfully modernize their algorithms to use deep learning.

Further reading: To learn about object recognition methods in computer vision which were popular before Deep Learning, see my January blog post, titled From feature descriptors to deep learning: 20 years of computer vision.

Fig 3. "Deep Learning at Mobileye" presentation at the 2015 Deutsche Bank Global Auto Industry Conference.

Mobileye's custom Computer Vision hardware
Mobileye is not a software computer vision company -- they bake their algorithms into custom computer vision chips. Shashua reported some impressive computation speeds on what appears to be tiny vision chips. Their custom hardware is more specific than GPUs (which are quite common for deep learning, scientific computations, computer graphics, and actually affordable). But Mobileye chips do not need to perform the computationally expensive big-data training stage onboard, so their devices can be much leaner than GPUs. Mobileye has lots of hardware experience, and regarding machine learning, Shashua mentioned that Mobileye has more vehicle-related training data than they know what to do with.  
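To make the training-versus-inference distinction concrete, here is a minimal sketch of an inference-only forward pass. It assumes a single convolution filter whose weights were fixed offline; the kernel values and the names `conv2d_valid`, `relu`, `infer`, and `FROZEN_KERNEL` are hypothetical and chosen only for illustration. This is not Mobileye's pipeline, just the general idea that a deployed device only has to evaluate a network, never update it.

```python
# Illustrative only: an inference-only forward pass with frozen weights.
# Nothing here is Mobileye's code; names and values are hypothetical.

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Clamp negative responses to zero."""
    return [[max(0, v) for v in row] for row in feature_map]

# Weights are decided offline (here: a hand-written dark-to-bright edge filter)
# and never change on the device, so no gradients or optimizer are needed.
FROZEN_KERNEL = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

def infer(image):
    """The only work done at run time: one forward pass."""
    return relu(conv2d_valid(image, FROZEN_KERNEL))

# A 3x4 image that goes from dark (0) on the left to bright (9) on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
response = infer(image)
print(response)  # [[27, 27]]
```

Because the run-time path is just multiply-adds with constant weights, it is the kind of workload that can plausibly be mapped onto small, specialized hardware, while the data-hungry training step stays in the lab.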

Fig 4. The Mobileye EyeQ2 lane detection chip.

Embedded vs. Cloud-based computer vision
While Mobileye makes a strong case for embedded computer vision, there are many scenarios today where the alternative cloud-based computer vision approach triumphs. Cloud-based computer vision is about delivering powerful algorithms as a service, over the web. In a cloud-based architecture, the algorithms live in a data center and applications talk to the vision backend via an API layer. And while certain mission-critical applications cannot have a cloud component (e.g., a drone flying over the desert), cloud-based vision systems promise to turn laptops and smartphones into smart devices, without the need to bake algorithms into chips. In-home surveillance apps, home-automation apps, exploratory robotics projects, and even scientific research can benefit from cloud-based computer vision. Most importantly, cloud-based deployment means that startups can innovate faster, and entire products can evolve much faster.
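As a rough illustration of what "talking to the vision backend via an API layer" can look like from the application side, here is a minimal client sketch. Everything about the service is an assumption: the URL `https://vision.example.com/v1/detect`, the JSON field names, and the helper names `build_detection_request` and `parse_detections` are hypothetical and do not describe any real product. The request is only constructed, not sent, so the example runs without a network.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint; no real service is implied.
API_URL = "https://vision.example.com/v1/detect"

def build_detection_request(image_bytes, labels):
    """Package an image and the labels of interest as a JSON POST request."""
    payload = {
        # Field names are assumptions made for this sketch.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "labels": list(labels),
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def parse_detections(response_body, min_score=0.5):
    """Keep only detections the (hypothetical) backend is confident about."""
    detections = json.loads(response_body)["detections"]
    return [d for d in detections if d["score"] >= min_score]

# Build the request a thin client (laptop, phone, robot) would send.
request = build_detection_request(b"fake image bytes", ["pedestrian"])

# A made-up response body standing in for what the backend might return.
sample_response = json.dumps({
    "detections": [
        {"label": "pedestrian", "score": 0.91, "box": [10, 20, 50, 120]},
        {"label": "pedestrian", "score": 0.12, "box": [0, 0, 5, 5]},
    ]
})
kept = parse_detections(sample_response)
print(len(kept), kept[0]["label"])
```

In a real application the request would be passed to `urllib.request.urlopen` and the response body fed to `parse_detections`. The point of the design is that the client stays thin: all of the heavy vision work lives behind the API.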

Unlike Mobileye's decade-long journey, I suspect cloud-based computer vision platforms are going to make computer vision development much faster, giving developers a Heroku-like button for visual AI. Choosing diverse compilation targets such as a custom chip or JavaScript will be handled by the computer vision platform, allowing computer vision developers to work smarter and deploy to more devices.

Conclusion and Predictions
Even if you don't believe that today's computer vision-based safety features make cars smart enough to call them robots, driving tomorrow's car is sure going to feel different.  I will leave you with one final note: Mobileye's CTO hinted that if you are going to design a car in 2015 on top of computer vision tech, you might reconsider traditional safety features such as airbags, and create a leaner, less-expensive AI-enabled vehicle.

Fig 5. Mobileye technology illustration.

Watch the Mobileye presentation on YouTube: If you are interested in embedded deep learning, autonomous vehicles, or want to get a taste of how the industry veterans compile their deep networks into chips, you can watch Amnon's full 38-minute Mobileye presentation from January 2015.

I hope you learned a little bit about vehicle computer vision systems, embedded Deep Learning, and got a glimpse of the visual intelligence revolution that is happening today. Feel free to comment below, follow me on Twitter (@quantombone), or sign-up to the mailing list if you are a developer interested in taking's cloud-based computer vision platform for a spin.