Nvidia H100's dominance in machine learning benchmarks is still untouched

Nvidia H100 is the current market leader

Published on June 13, 2024

by Rafly Gilang

Key notes

Nvidia’s H100 system dominates MLPerf’s new AI benchmarks for fine-tuning LLMs and GNNs.
Using 11,616 GPUs, Nvidia set records in five out of nine benchmarks, outpacing Google and Intel.
Nvidia also achieved a 27% improvement in GPT-3 training times with software optimizations and flash attention.

Nvidia has dominated the AI chip market for quite some time, and with good reason. The tech giant’s H100 system is the current market leader, and so far no competitor has emerged to challenge that position.

MLPerf, one of the most popular (if not the most accurate) benchmarks for measuring AI chip performance, has just launched a new set of tests. They cover fine-tuning large language models (LLMs) and graph neural networks (GNNs), and in these tests Nvidia’s H100 system set records.

Nvidia’s submission used 11,616 H100 GPUs, making it the largest system tested in the MLPerf benchmarks. It achieved top performance across all nine benchmarks, setting records in five of them, as this report details.

Competitors like Google and Intel participated with their own AI accelerators but were outperformed by Nvidia. Google’s TPU systems showed significant speed improvements, and Intel’s accelerators also made notable progress, but neither could match the performance of Nvidia’s largest system.

Nvidia also saw a 27% improvement in GPT-3 training times compared with the June 2023 benchmarks, thanks to several software optimizations. These included better use of 8-bit floating point operations, more efficient power management of compute engines, and improved communication among GPUs.

Nvidia also implemented flash attention, an algorithm that speeds up transformer networks by minimizing memory writes, which contributed a reduction of about 10% in training times.
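
To give a rough sense of how an attention computation can avoid writing large intermediate results to memory, here is a minimal pure-Python sketch of the underlying idea: process the keys and values in small blocks and keep only a running softmax state, instead of storing every attention score first. This is an illustration under simplifying assumptions (a single query vector, plain Python lists), not Nvidia’s implementation; real flash attention is a fused GPU kernel, and the function names `naive_attention` and `tiled_attention` are hypothetical.

```python
import math

def naive_attention(q, keys, values):
    """Reference version: store every score, then softmax, then weight the values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]

def tiled_attention(q, keys, values, block_size=2):
    """Same result, computed block by block with a running (online) softmax.

    Only one block of scores exists at a time; the full score list is
    never materialized.
    """
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = float("-inf")   # largest score seen so far
    running_sum = 0.0             # softmax denominator, relative to running_max
    acc = [0.0] * dim             # unnormalized weighted sum of values
    for start in range(0, len(keys), block_size):
        block_keys = keys[start:start + block_size]
        block_vals = values[start:start + block_size]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in block_keys]
        new_max = max(running_max, max(scores))
        # Rescale earlier partial results to the new maximum
        # (the factor is 0.0 on the first block, when nothing is accumulated yet).
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, block_vals):
            w = math.exp(s - new_max)
            running_sum += w
            for j in range(dim):
                acc[j] += w * v[j]
        running_max = new_max
    return [a / running_sum for a in acc]
```

Both functions return the same output up to floating-point rounding. The tiled version just never holds more than `block_size` scores at once, which is the property that lets a GPU kernel keep its working data in fast on-chip memory.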
