Source: Medium | December 12, 2018
Author: Karen Hao
Researchers borrowed equations from calculus to redesign the core machinery of deep learning so it can model continuous processes like changes in health.
David Duvenaud was working on a project involving medical data when he hit upon a major shortcoming in AI.
An AI researcher at the University of Toronto, he wanted to build a deep-learning model that would predict a patient’s health over time. But data from medical records is kind of messy: throughout your life, you might visit the doctor at different times for different reasons, generating a smattering of measurements at arbitrary intervals. A traditional neural network struggles to handle this. Its design requires it to learn from data with clear stages of observation. Thus it is a poor tool for modeling continuous processes, especially ones that are measured irregularly over time.
The challenge led Duvenaud and his collaborators at the university and the Vector Institute to redesign neural networks as we know them. Last week their paper was one of four crowned “best paper” at the Neural Information Processing Systems conference, one of the largest AI research gatherings in the world.
Neural nets are the core machinery that makes deep learning so powerful. A traditional neural net is made up of stacked layers of simple computational nodes that work together to find patterns in data. The discrete layers are what keep it from effectively modeling continuous processes (we’ll get to that).
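For readers who think in code, here is a minimal sketch of what "stacked layers" means. It is an illustration only, not the researchers' code: the layer sizes, the random weights, the tanh activation, and the function names (`make_layer`, `apply_layer`, `forward`) are all assumptions made for the example.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Build one fully connected layer with random weights and zero biases (hypothetical helper)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def apply_layer(layer, x):
    """One discrete step: every node takes a weighted sum of its inputs, then applies tanh."""
    weights, biases = layer
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def forward(layers, x):
    """Pass the input through the stack one layer at a time, recording each intermediate state."""
    states = [x]
    for layer in layers:
        x = apply_layer(layer, x)
        states.append(x)
    return states

rng = random.Random(0)  # fixed seed so the sketch is repeatable
layers = [make_layer(3, 4, rng), make_layer(4, 4, rng), make_layer(4, 2, rng)]
states = forward(layers, [0.5, -1.2, 0.3])
print(len(states))  # 4: the input plus one state per layer
```

The network's internal state exists only at whole-numbered steps: after layer 1, after layer 2, after layer 3. There is no such thing as the state "halfway between" two layers, which is the sense in which the layers are discrete.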
In response, Duvenaud’s design scraps the layers entirely. (He’s quick to note that his team didn’t come up with this idea. They were just the first to implement it in a generalizable way.) To understand how this is possible, let’s walk through what the layers do in the first place.