Though we’re living through a time of extraordinary innovation in GPU-accelerated machine learning, the latest research papers frequently (and prominently) feature algorithms that are decades, in certain cases 70 years old.
Some might contend that many of these older methods fall into the camp of ‘statistical analysis’ rather than machine learning, and prefer to date the advent of the sector back only so far as 1957, with the invention of the Perceptron.
Given the extent to which these older algorithms support and are enmeshed in the latest trends and headline-grabbing developments in machine learning, it’s a contestable stance. So let’s take a look at some of the ‘classic’ building blocks underpinning the latest innovations, as well as some newer entries that are making an early bid for the AI hall of fame.
In 2017 Google Research led a research collaboration culminating in the paper Attention Is All You Need. The work outlined a novel architecture that promoted attention mechanisms from ‘piping’ in encoder/decoder and recurrent network models to a central transformational technology in their own right.
The approach was dubbed Transformer, and has since become a revolutionary methodology in Natural Language Processing (NLP), powering, amongst many other examples, the autoregressive language model and AI poster-child GPT-3.