
The engine behind the AI revolution: NVIDIA’s chief scientist William Dally on what really drives computing forward

NUS Newsroom Global
While attention often focuses on the algorithms and models that enable AI to advance, much credit is also owed to hardware, specifically graphics processing units (GPUs). They are designed to process large volumes of data simultaneously, making them essential for training AI models, said Dr William Dally, Chief Scientist and Senior Vice President of Research at American technology company NVIDIA, at the NUS120 Distinguished Speaker Series.

Delivering a lecture on “Shaping the Future Through Computing Innovation” on 20 May 2026, the renowned computer scientist noted that the algorithms behind deep learning were largely developed in the 1980s, and the data needed to train AI systems was available by the late 2000s. What was lacking at the time was enough computing power to put the two together. That missing ingredient was the GPU, a chip originally designed for video games, said Dr Dally, who has shaped the foundations of today’s AI revolution through his pioneering work in parallel computing and high-performance interconnects.

“Think of that (data and algorithms) as a fuel in the air, and they were waiting for a spark to ignite them and really light off the AI revolution,” he told some 180 students, faculty, alumni, and members of the public.

That spark was sufficient computing power, exemplified by the landmark AlexNet model in 2012, which was trained on just two NVIDIA GPUs. Since then, the computing power required to train a state-of-the-art AI model has grown by roughly 10 million times. A single NVIDIA GPU has become around 5,000 times more powerful over the same period.
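As a back-of-the-envelope check (not a figure quoted in the lecture), dividing the two growth factors shows how much of the increase a faster single chip cannot account for. The variable names in this short Python sketch are illustrative only.

```python
# Growth factors quoted in the lecture, both measured since AlexNet in 2012.
training_compute_growth = 10_000_000  # compute needed to train a state-of-the-art model
single_gpu_growth = 5_000             # computing power of one GPU

# The part of the growth that a faster single chip does not cover.
remaining_factor = training_compute_growth // single_gpu_growth
print(remaining_factor)  # 2000
```

The remaining factor of roughly 2,000 presumably has to come from elsewhere, such as spreading the work across many more GPUs or training for longer.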
But with AI rapidly permeating all aspects of life, the demand for computing power to fuel AI has grown to unprecedented levels.

“As models grow more ambitious, the infrastructure behind them has become as consequential as the ideas within them. This tension is reshaping how systems are designed,” said NUS President Professor Tan Eng Chye in his opening remarks.

Behind every frontier AI model lies a coordination issue: getting thousands of processors to work in concert, communicating at speed and scale. “Dr Dally’s work on high-performance interconnects has been instrumental in making that coordination feasible for the large-scale systems that define the AI we see today,” added Prof Tan.

Getting more for less

To illustrate how AI models process information, Dr Dally offered a simple example. When a person asks a large language model (an AI system trained on vast amounts of text to learn patterns in language), “Is an apple a fruit?”, the system reads every word of the question at once. But when the model generates the answer, it produces one word at a time, and producing each word requires reading all of the model’s stored values before the next one can appear. With today’s largest models containing trillions of stored values, that adds up fast.

One way to improve efficiency is to make those stored values lighter, noted Dr Dally, who was a professor at the Massachusetts Institute of Technology and Stanford University before joining NVIDIA in 2009.

Computers represent numbers as strings of 0s and 1s, called bits. More bits mean greater precision, but also more energy per calculation. Because AI models can function with cruder approximations than traditional computing, NVIDIA has progressively reduced the number of bits used: from 32 to 16, then eight, and now four. Each step roughly quadrupled energy efficiency.

To mitigate the risk of reduced accuracy caused by the accumulation of rounding errors, Dr Dally shared how a technique called sparsity can be applied.
Zeroes are identified and skipped during calculation, effectively doubling the useful work done per unit of energy, a gain that has been built into every NVIDIA GPU since the Ampere generation, introduced in 2020.

The next frontier

Despite this substantial improvement, a single GPU still falls far short of what the most demanding AI models require. Dr Dally explained how modern AI workloads are distributed across many GPUs at once, with different processors handling different parts of a model. This requires fast, reliable links between chips. NVIDIA’s NVLink technology, for instance, allows GPUs within a single cabinet to exchange data at around 1.8 terabytes per second.

However, he noted that hardware improvements can only go so far without software to match. For example, when NVIDIA optimised its software after a new chip launch, performance improved by between two-and-a-half and three times on identical hardware.

The lecture was followed by a Q&A moderated by Professor Tulika Mitra, Vice Provost (Special Projects) and Dean of NUS Computing, with audience questions ranging from the trade-offs between specialised and general-purpose hardware to the outlook for computing formats beyond 4-bit.

Looking ahead, Dr Dally speculated that AI might one day help design future chips, shortening a process that currently requires thousands of man-years of engineering work. “I would love to see people look at applying AI to reducing the amount of energy required to turn out a new GPU,” he said.
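As a rough illustration of the two efficiency ideas described under “Getting more for less”, the Python sketch below rounds a handful of made-up values to 16, 8 and 4 bits and then skips zeroes in a small calculation. It is not NVIDIA's method: the sample numbers, the simple integer rounding scheme and the function names `quantize`, `max_error` and `sparse_dot` are all assumptions chosen for readability, and real GPUs do this work in hardware with their own number formats.

```python
def quantize(values, bits):
    """Round each value to a level that a signed integer of `bits` bits can hold."""
    levels = 2 ** (bits - 1) - 1                 # 7 for 4 bits, 127 for 8 bits
    scale = max(abs(v) for v in values) / levels
    return [round(v / scale) * scale for v in values]


def max_error(values, bits):
    """Largest gap between an original value and its rounded version."""
    return max(abs(a - b) for a, b in zip(values, quantize(values, bits)))


def sparse_dot(weights, inputs):
    """Multiply-and-add that skips zero weights; also counts the multiplies done."""
    total, multiplies = 0.0, 0
    for w, x in zip(weights, inputs):
        if w == 0.0:
            continue                             # a zero contributes nothing, so skip it
        total += w * x
        multiplies += 1
    return total, multiplies


# Hypothetical stored values, invented for this example.
weights = [0.82, -0.31, 0.05, -0.67, 0.44, -0.12, 0.93, -0.58]
for bits in (16, 8, 4):
    print(bits, "bits: largest rounding error", max_error(weights, bits))

# The same values with half of them set to zero (an assumed pattern).
pruned = [0.82, 0.0, 0.0, -0.67, 0.44, 0.0, 0.93, 0.0]
inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
result, multiplies = sparse_dot(pruned, inputs)
print("multiplies performed:", multiplies, "of", len(pruned))
```

With these sample values the rounding error grows as the bit count shrinks, which is the accuracy cost the article mentions, while skipping the zeroes halves the number of multiplications.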

Summary generated from the RSS feed of NUS Newsroom. All article rights belong to the original publisher. Click through to read the full piece on news.nus.edu.sg.