The Hype of Machine Learning

Many folks in the tech industry are skeptical of the hype around machine learning: they have been promised the world before, and the reality never really panned out. This has happened so many times that the machine learning community has a term for the slump that follows: an AI winter. In 2015, though, a number of fundamental factors are different, and they warrant taking a closer look this time around.

Why Machine Learning is the Real Deal

  • Larger data sets - The explosion of big data is probably the number one driver of machine learning's new effectiveness. It turns out that, using the exact same techniques, more data can yield significantly better predictions. This new data comes from many sources:
    • Data collection - this is the obvious one. More devices such as smartwatches, smartphones, and computers are generating data, more analytics packages are recording it, and more people are spending their time in a digital world.
    • Data labeling - getting more out of unlabeled data by first running unsupervised learning methods on it, such as auto-encoders
    • Feature engineering - creating new training examples from existing ones through skewing, translations, rotations, and other transformations (often called data augmentation)
  • Improved Hardware - Following Moore's Law, the processing power of chips and other hardware components continues to improve steadily over time, which leads directly to faster training.
    • GPU vs CPU vs FPGA - the emergence of GPUs for training machine learning algorithms has given a huge speed boost to matrix operations. A GPU is less flexible than a CPU, but it can focus on doing one kind of operation, across many values in parallel, really, really well. The same is true of using FPGAs for training neural networks.
    • Better Processors - The chips themselves keep getting faster and more powerful. Although there are reports that the rate of improvement is slowing, things are still getting better.
  • Better Algorithms - Lastly, there have also been advances in how we train these machines to learn. While this third component is the one academic researchers and the media like to focus on, it is worth stating again that the first two are equally important advances. Examples include:
    • Neural networks - whereas most earlier approaches used some combination of statistics (Naive Bayes), data transformations (kernels in SVMs), or increasingly sophisticated ensembling techniques (Random Forests), neural networks model the problem from a totally different perspective. Loosely modeled on how the human brain functions, neural networks are able to achieve a surprising level of complexity without losing the flexibility to adapt to unique domains.
    • Network architecture - as researchers continued to explore neural networks, they began adding significantly more layers to the model, giving rise to deep learning. There were not only more layers, though; different layers also performed different functions, such as convolution, max-pooling, or activation. Finally, the network sometimes also grew at the front, through layers pre-trained with unsupervised methods, or at the end, through extra recurrent layers from RNNs.
    • Sum is greater than the parts - lots of other little wins that individually are not worth much, but combined have drastically improved the accuracy of machine learning. Without going into too much detail, some noteworthy advances include improved initialization methods (LeCun, Glorot, He) to start off training and adaptive learning-rate methods (RMSProp, AdaGrad, Adam) to speed up the process. And let's not forget improved activation functions (moving from sigmoid and tanh to ReLU) to avoid neuron saturation and regularization techniques (lasso, ridge, dropout) to avoid overfitting on training data.
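
To make a couple of those "little wins" concrete, here is a minimal Python sketch, not taken from any particular library. It shows how the gradients of sigmoid and tanh shrink toward zero for large inputs (saturation) while ReLU's gradient does not, and how the Glorot and He schemes pick the scale of the initial random weights from the layer size. All function names here are made up for illustration, and the layer sizes are arbitrary assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s). At most 0.25, near zero for large |x|.
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2. Also near zero for large |x|.
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise (no saturation for x > 0).
    return 1.0 if x > 0 else 0.0


def glorot_std(fan_in, fan_out):
    # Glorot (Xavier) normal initialization: variance = 2 / (fan_in + fan_out).
    return math.sqrt(2.0 / (fan_in + fan_out))


def he_std(fan_in):
    # He normal initialization, commonly paired with ReLU: variance = 2 / fan_in.
    return math.sqrt(2.0 / fan_in)


def init_weights(fan_in, fan_out, std, seed=0):
    # Hypothetical helper: a fan_in x fan_out matrix of Gaussian weights.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


if __name__ == "__main__":
    print("x      sigmoid'   tanh'      relu'")
    for x in (0.0, 2.0, 10.0):
        print(f"{x:<6} {sigmoid_grad(x):<10.2e} {tanh_grad(x):<10.2e} {relu_grad(x)}")

    # Arbitrary example layer: 256 inputs, 128 outputs.
    print("Glorot std:", glorot_std(256, 128))
    print("He std:", he_std(256))
    weights = init_weights(256, 128, he_std(256))
    print("weight matrix shape:", len(weights), "x", len(weights[0]))
```

Running it prints the three gradients side by side. The sigmoid and tanh columns collapse toward zero as the input grows, which is why deep stacks of those units learn slowly, while the ReLU column stays at 1.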

Examples of Improved Machine Learning

Frankly, words can't do justice to the amazing advances that have been made in the last year, so just watch.

  • Image Recognition - Five years ago, computers could not tell the difference between an image of a cat and an image of a dog. Now, they can identify objects better than most people:
  • NLP and Translation - Not too long ago, computers could barely understand the words you were speaking, even when you spoke slowly and used a limited vocabulary. Now, they can translate entire sentences on the fly:
  • Memory Networks - Not too long ago, computers could not really interpret any meaning or context from data. Now, they can chain sentences together to reason logically.

It's no surprise then that all the big names are joining in on the fun.

Conclusion

To be clear, we aren't in this wonderful new world yet. The majority of machine learning news you hear these days is simply about incremental improvements in data science that allow a researcher to better predict outcome Y based on features X1, X2, and X3. But we are quickly moving toward a reality where machines will simply understand our desires and cater to those personal interests with little to no prompting. And because technology tends to improve exponentially, the age of machine learning and artificial intelligence is arriving much sooner than you may have been led to believe.