
The Deep Learning Revolution in AI

Published online by Cambridge University Press: 18 September 2025

Shalom Lappin*
Affiliation:
School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London E1 4NS, UK. Email: s.lappin@qmul.ac.uk

Abstract

The early development of Artificial Intelligence (AI) in the latter half of the twentieth century was marked by limited, hand-crafted systems and fluctuating perceptions of the field’s potential. Early research explored a range of paradigms – including symbolic, neural and probabilistic approaches – but was constrained by severe hardware and data limitations. Key technological advances, such as the invention of microchips, GPUs and, later, TPUs, significantly enhanced computational capacity, enabling more complex AI experimentation. Concurrently, the proliferation of digital data through the internet addressed longstanding bottlenecks in data availability. The most transformative shift, however, came from architectural innovations in neural networks, culminating in the deep learning revolution. This unfolded in two phases: the emergence of Recurrent and Convolutional Neural Networks, followed by the development of transformer-based models, which underpin today’s Large Language Models (LLMs).

Information

Type
AE Annual Conference Lecture
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Academia Europaea

Figure 1. A simple Recurrent Neural Network (from Lappin 2021).
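
As a textual companion to Figure 1, the following is a minimal sketch of the recurrence it depicts, written in PyTorch; the layer sizes and sequence length are illustrative assumptions, not values from the article.

import torch
import torch.nn as nn

# A single-layer Elman-style RNN: at each step the current input is
# combined with the previous hidden state (illustrative sizes).
rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(1, 5, 8)         # a batch of one sequence with 5 steps
outputs, h_n = rnn(x)            # outputs: the hidden state at every step
print(outputs.shape, h_n.shape)  # torch.Size([1, 5, 16]) torch.Size([1, 1, 16])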

Figure 2. An LSTM (from Olah 2015, with permission).
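
A corresponding PyTorch sketch of the LSTM in Figure 2, again with illustrative sizes: the gated cell state is what lets the network carry information across long spans.

import torch
import torch.nn as nn

# An LSTM replaces the plain recurrence with gated updates to a
# separate cell state, as the diagram shows (illustrative sizes).
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(1, 5, 8)
outputs, (h_n, c_n) = lstm(x)    # c_n is the gated cell state
print(outputs.shape, c_n.shape)  # torch.Size([1, 5, 16]) torch.Size([1, 1, 16])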

Figure 3. A CNN (from Saha 2015, with permission).
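
A small CNN in the spirit of Figure 3, sketched in PyTorch with assumed MNIST-style input dimensions: convolution and pooling extract local features, which a linear layer then classifies.

import torch
import torch.nn as nn

# Convolution + pooling extract local features; a linear layer maps
# the flattened feature maps to class scores (illustrative sizes).
cnn = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),  # 1x28x28 -> 4x28x28
    nn.ReLU(),
    nn.MaxPool2d(2),                            # 4x28x28 -> 4x14x14
    nn.Flatten(),
    nn.Linear(4 * 14 * 14, 10),                 # scores for 10 classes
)

x = torch.randn(1, 1, 28, 28)    # one 28x28 greyscale image
print(cnn(x).shape)              # torch.Size([1, 10])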

Figure 4. A bidirectional LSTM with an attention layer (from Bahdanau et al. 2015, with permission).
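
The attention mechanism of Figure 4, reduced to a minimal PyTorch sketch (the dimensions are assumptions): each target position computes a weighted average over the encoder’s hidden states.

import torch
import torch.nn as nn

# A single target-side query attends over five encoder states; the
# returned weights form a distribution over the source positions.
attn = nn.MultiheadAttention(embed_dim=16, num_heads=1, batch_first=True)

encoder_states = torch.randn(1, 5, 16)  # five source positions
query = torch.randn(1, 1, 16)           # one target position
context, weights = attn(query, encoder_states, encoder_states)
print(weights.shape, weights.sum())     # (1, 1, 5); the weights sum to 1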

Figure 5. The architecture of a transformer (from Lappin 2021, based on Vaswani et al. 2017).
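
Figure 5’s encoder block, condensed into a PyTorch sketch with assumed dimensions: multi-head self-attention followed by a position-wise feed-forward layer, the combination that transformers stack in place of recurrence.

import torch
import torch.nn as nn

# One transformer encoder layer: self-attention plus a feed-forward
# sublayer, with residual connections and layer norm built in.
block = nn.TransformerEncoderLayer(
    d_model=16, nhead=2, dim_feedforward=32, batch_first=True
)

x = torch.randn(1, 5, 16)        # embeddings for five tokens
print(block(x).shape)            # torch.Size([1, 5, 16]): same shape, contextualized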

Figure 6. ChatGPT-4 interprets a sequence of images (from OpenAI 2023, arXiv.org non-exclusive license to distribute).