
Introducing TPU v4: Google’s Supercomputer for Large Language Models

Simeon Spencer
Apr 14, 2023

Updated: Apr 21, 2023

As usage of ChatGPT and other AI services grows at breakneck speed, so does the demand for higher-performance infrastructure to support them. To meet that demand, Google has launched Tensor Processing Unit (TPU) v4, the latest version of its AI accelerator, an application-specific integrated circuit (ASIC) built for machine learning (ML) and AI. TPUs are designed as matrix processors for neural network workloads, and they aim to reduce the memory-access bottleneck that slows down GPUs and CPUs on these workloads. TPU v4 features the following improvements:

Each TPU v4 TensorCore has 4 matrix multiply units (MXUs), up from 2 in TPU v3.
TPU v4 is the first supercomputer to deploy reconfigurable optical circuit switches (OCSes), which reduce congestion and improve scalability and performance, among other benefits.
TPU v4 is the first supercomputer with hardware support for embeddings, provided by SparseCores, which makes recommendation models run 3x faster than on TPU v3.
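
To make the embedding point concrete, here is a minimal Python sketch of the kind of lookup a recommendation model performs. It is illustrative only: the table contents, the `embedding_lookup` helper, and the sum pooling are assumptions made for this example, and it says nothing about how SparseCores implement the operation in hardware. The point is that the work is dominated by scattered memory reads rather than dense matrix math.

```python
# Illustrative sketch only: a toy embedding lookup with sum pooling.
# The table values and the helper name are hypothetical, not from the TPU v4 paper.

def embedding_lookup(table, ids):
    """Gather the rows of `table` named by `ids` and sum them element-wise."""
    dim = len(table[0])
    pooled = [0.0] * dim
    for i in ids:
        row = table[i]  # one scattered memory read per ID
        for d in range(dim):
            pooled[d] += row[d]
    return pooled

# A tiny 4-row table with 3-dimensional embeddings.
table = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [1.0, 1.0, 1.0],
]

# A single example that references rows 0 and 3 (for instance, two item IDs).
pooled = embedding_lookup(table, [0, 3])
print(pooled)
```

Real recommendation models use tables with millions of rows, so each lookup touches only a tiny, unpredictable slice of memory.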

Overall, TPU v4 outperforms TPU v3 by 2.1x and improves performance per watt by 2.7x. The flexibility of the OCS design benefits large language models such as LaMDA, MUM, and PaLM, all of which were trained on TPU v4. TPU v4 also supports multidimensional model-partitioning techniques that enable low-latency, high-throughput inference for large language models.
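
As a rough back-of-the-envelope check, the two headline ratios can be combined to estimate relative power draw. This sketch assumes both figures are averages over the same workloads, which this article does not confirm, so treat the result as an estimate rather than a measured number. The variable names are made up for this example.

```python
# Back-of-the-envelope sketch using the two ratios quoted above.
# Assumption: both ratios describe the same workloads, so they can be divided.

speedup = 2.1        # TPU v4 performance relative to TPU v3
perf_per_watt = 2.7  # TPU v4 performance/watt relative to TPU v3

# performance/watt = performance / power, so power = performance / (performance/watt)
relative_power = speedup / perf_per_watt

print(f"Implied TPU v4 power relative to TPU v3: {relative_power:.2f}x")
```

Under that assumption, TPU v4 would deliver about twice the performance of TPU v3 while drawing roughly 0.78x the power.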
The table below shows the share of TPU usage by deep neural network model type (lower is better except for Transformer). Over 90% of training at Google runs on TPUs, and the table shows how quickly production workloads change at Google.

Source: TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings Research Paper

So, What Does This Mean?

First, NVIDIA’s latest H100 GPUs finally have a worthy competitor for AI/ML workloads. Second, more powerful ASICs and GPUs can lower the cost of cloud AI/ML compute services, as hardware becomes more energy efficient and less of it is needed for the same work. Overall, this will fuel the trend of tech companies rushing to develop their own generative AI models to compete for a share of the new generative AI industry. The ultimate winners, for hardware at least, will be Google and NVIDIA.
