UPDATED 14:00 EDT / OCTOBER 10 2024


AMD attacks AI workloads with next-generation Instinct MI325X accelerator and networking chips

As demand for the powerful hardware needed to support artificial intelligence workloads continues to surge, Advanced Micro Devices Inc. is stepping up its game with the launch of its most sophisticated AI accelerator yet.

At a media and analyst event called “Advancing AI” in San Francisco today, the company showed off its next-generation AI chip, the AMD Instinct MI325X accelerator, alongside a new networking platform based on the AMD Pensando Salina data processing unit.

The new technologies are set to launch next year, with AMD promising that they’ll set a new standard for generative AI performance. They’re part of AMD’s continued push into data center chips — an area where it has stolen a march on longtime rival Intel Corp., capturing 34% of data center chip revenues — by focusing on making them better for AI.

“Our goal is to make AMD the end-to-end AI leader,” Chief Executive Lisa Su (pictured, holding an AMD EPYC chip) said this morning during her keynote. She trotted out a Who’s Who of executives from AI leaders such as Microsoft Corp., Meta Platforms Inc., Databricks Inc. and Oracle Corp., as well as startups such as Reka AI Inc., Essential AI Labs Inc., Fireworks AI and Luma AI Inc., who lauded their partnerships with AMD.

The chipmaker, which has rapidly emerged as a growing threat to the dominance of Nvidia Corp. in the AI infrastructure industry, is building on the success of its fast-selling MI300X AI chip, which launched earlier this year and is forecast to drive more than $4 billion in AI chip sales for the company.

AMD’s most powerful AI chip yet

The AMD Instinct MI325X, built on the company’s CDNA 3 architecture, is designed to enable blazing-fast performance combined with greater energy efficiency for the most demanding AI tasks, including training large language models, fine-tuning their performance and AI inference, where the models deliver results to users.

In a press briefing, Brad McCredie, AMD’s corporate vice president of GPU platforms, said the MI325X chip delivers more than double the performance of the existing MI300 for AI inference and training workloads.

McCredie proceeded to reel off a list of impressive numbers, saying the MI325X features 256 gigabytes of high-performance HBM3E memory and up to 6 terabytes per second of memory bandwidth. That equates to 1.8 times the capacity and 1.3 times the bandwidth of Nvidia’s current most powerful AI chip, the Nvidia H200 graphics processing unit. In addition, the MI325X provides 1.3 times greater peak theoretical FP16 and FP8 performance than its most powerful competitor.
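As a rough sanity check, those ratios line up with a back-of-the-envelope calculation using Nvidia’s published H200 figures of 141 gigabytes of HBM3E and 4.8 terabytes per second of bandwidth (those H200 numbers are our reference points here, not part of AMD’s announcement):

```python
# Sanity-check AMD's claimed memory ratios against Nvidia's published
# H200 specs. The H200 figures are Nvidia's public numbers, used here
# purely for illustration, not as a benchmark.
MI325X_MEMORY_GB = 256        # AMD's stated HBM3E capacity
MI325X_BANDWIDTH_TBS = 6.0    # AMD's stated memory bandwidth, TB/s
H200_MEMORY_GB = 141          # Nvidia's published H200 capacity
H200_BANDWIDTH_TBS = 4.8      # Nvidia's published H200 bandwidth, TB/s

print(f"Capacity ratio:  {MI325X_MEMORY_GB / H200_MEMORY_GB:.2f}x")         # ~1.82x
print(f"Bandwidth ratio: {MI325X_BANDWIDTH_TBS / H200_BANDWIDTH_TBS:.2f}x")  # 1.25x
```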

“This industry-leading memory and compute can deliver up to 1.3 times the inference performance of the H200 on Mistral 7B at FP16, and 1.2 times the inference performance on Llama 3.1 70B at FP8,” McCredie said.

Those numbers suggest the AMD MI325X chips will pack a powerful punch for AI developers, and the best news is they won’t have to wait that long to see how well they stack up. AMD said the MI325X is slated to enter production by the fourth quarter, with companies such as Dell Technologies Inc., Hewlett Packard Enterprise Co., Lenovo Group Ltd., Gigabyte Technology Co. Ltd. and Super Micro Computer Inc. planning to put servers using them on sale by the first quarter of next year.

How much they will challenge Nvidia, however, remains to be seen. “AMD certainly remains well-positioned in the data center, but I think their CPU efforts are still their best-positioned products,” said Ben Bajarin, CEO and principal analyst at Creative Strategies Inc. “The market for AI acceleration and GPUs is still heavily favoring Nvidia and I don’t see that changing anytime soon.”

Building the network foundation for AI

AMD intends to pair the latest Instinct accelerators with new networking technologies, including the AMD Pensando Salina DPU and the AMD Pensando Pollara 400, which it claimed is the industry’s first Ultra Ethernet-Ready AI network interface card. The new technologies are critical for linking AMD’s new AI accelerators and ensuring sufficient throughput for sharing data.

The AMD Pensando Salina DPU represents the front end of AMD’s network, which delivers data to clusters of Instinct accelerators, while the AMD Pensando Pollara 400 represents the back end, which manages data transfer between individual accelerators and clusters. They’ll be available early next year.

The Pensando Salina DPU is the third generation of AMD’s DPU line, delivering twice the performance of its predecessor and more than doubling the available bandwidth and scale. All told, it supports 400-gigabit throughput, ensuring some of the fastest data transfer rates ever seen in a data center. As such, it will serve as a critical component of AI front-end network clusters, helping to optimize the performance, efficiency and scalability of AI applications.

In the briefing, Soni Jiandani, senior vice president and general manager of AMD’s Network Technology and Solutions Group and a co-founder of Pensando Systems Inc., which AMD acquired in 2022, stressed the importance of networking. She explained that AI systems need to connect to the front end of the network for users, while at the back end they must be linked to thousands of GPUs to ensure performance.

“Back-end networks drive AI system performance,” she said. “Meta says 30% of its training cycle time typically elapses while waiting for networking. So networking is not only critical, it’s foundational to driving AI performance.”

IDC analyst Brandon Hoff agreed. “AI workloads, especially generative AI workloads, are the first to be able to consume all of the compute, memory, storage and networking in a server node,” he explained. “AI also scales beyond a single AI Factory node, which requires all of the GPUs to talk to each other.”

As a result, he added, “the time to communicate between AI Factory nodes is called ‘time in network’ and can be up to 60% of the processing time for a training or multi-node inferencing AI run. To put it a different way, if a hyperscaler spent $1 billion on their GPUs, they get $400 million of work done and $600 million of the GPUs sitting idle. High-performance networking is essential, and the second most important piece.”
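Hoff’s arithmetic is easy to check with a back-of-the-envelope sketch; the 60% time-in-network figure below is the analyst’s estimate for large training or multi-node inference runs, not a measured result:

```python
# Illustration of Hoff's "time in network" math for a hypothetical
# $1 billion GPU investment. The 60% time-in-network figure is the
# analyst's estimate, not a measurement.
gpu_spend = 1_000_000_000   # hypothetical GPU investment, in dollars
time_in_network = 0.60      # fraction of run time spent waiting on the network

productive = gpu_spend * (1 - time_in_network)
idle = gpu_spend * time_in_network
print(f"Productive GPU spend: ${productive:,.0f}")  # $400,000,000
print(f"Idle GPU spend:       ${idle:,.0f}")        # $600,000,000
```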

So for AMD, he said, “having a strong networking set of products is an essential part of building their AI business. These are the right products for DPUs and SmartNICs, and Ethernet is the right technology to invest in.”

Bajarin said AMD is making good progress technically in networking. “I imagine the more AMD can integrate this into their full-stack approach to optimizing for the racks via the ZT Systems purchase, then I think their networking stuff becomes even more important,” he said.

Future GPUs in the works

The Salina DPU and the Pollara 400 will also launch early next year, but while companies are waiting for those technologies to become available, they can at least contemplate what’s coming further down the line.

In addition to its new chips and networking technologies coming out shortly, AMD also previewed its next generation of accelerators, the AMD Instinct MI350 series, slated to arrive in the second half of next year.

According to AMD, the Instinct MI350 will be built on the company’s CDNA 4 architecture and deliver a 35-fold improvement in inference performance compared with the CDNA 3-based Instinct MI325X, while providing 288 gigabytes of HBM3E memory.

The company also mentioned plans for a new MI355X accelerator chip that’s expected to start shipping in volume in late 2025, followed by the MI400 chip that will be based on an entirely new architecture when it launches sometime in 2026.

Next-Gen EPYC server chips

Though everyone’s focused on AI, AMD has no intention of letting up on its wider assault on the market for data center servers. At the event, it also lifted the lid on its latest EPYC central processing units, formerly codenamed “Turin,” aimed at enterprise, AI and cloud workloads.

The 5th Gen AMD EPYC Series processors will be built on the company’s Zen 5 core architecture, with the company aiming to offer a wide range of core counts to suit different use cases, from as few as eight all the way up to 192 cores for the most demanding workloads.

The company promised that 5th Gen EPYC chips will build on the performance of its existing 4th Gen EPYC platform, with the biggest, 192-core EPYC 9005 Series chip offering 2.7 times better performance than its most powerful existing chip. Meanwhile, the new 64-core EPYC 9575F is customized for GPU-powered AI workloads, offering boost speeds of up to 5 gigahertz and providing 28% faster processing than competing chips.

Dan McNamara, senior vice president and general manager of AMD’s server business, said customers can put their trust in the company’s performance claims. “With five generations of on-time roadmap execution, AMD has proven it can meet the needs of the data center market and give customers the standard for data center performance, efficiency, solutions and capabilities for cloud, enterprise and AI workloads,” he said.

Ryzen AI Pro chips for PCs

Finally, AMD teased the imminent launch of its third generation of mobile processors for laptops and notebooks. The new Ryzen AI Pro 300 Series processors are built on an advanced four-nanometer process and are said to be powerful enough to support on-device AI workloads such as generative AI copilots, live captioning and AI-powered translations. They offer a threefold increase in AI performance over the previous generation of Ryzen chips, together with advanced security and manageability features for enterprise users.

Like the latest EPYC chips, the new Ryzen processors are built on AMD’s Zen 5 architecture, and they’ll provide up to 40% better performance and up to 14% faster productivity performance compared with the Intel Core Ultra 7 165U. They’ll also “significantly” extend device battery life, the company added.

In addition, they’ll come with an integrated neural processing unit that can deliver 50 trillion operations per second of AI processing power, exceeding Microsoft Corp.’s requirements for its AI-powered Copilot tools, AMD said.
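For context, Microsoft’s published baseline for Copilot+ PCs is a 40-TOPS NPU (that figure is Microsoft’s public requirement, cited here for comparison rather than taken from AMD’s announcement), which leaves the new chips some headroom:

```python
# Compare the Ryzen AI Pro 300's claimed NPU throughput with Microsoft's
# published 40-TOPS baseline for Copilot+ PCs. The 40-TOPS figure is
# Microsoft's public requirement, not part of AMD's announcement.
ryzen_npu_tops = 50            # AMD's claimed NPU throughput, in TOPS
copilot_requirement_tops = 40  # Microsoft's Copilot+ PC baseline

headroom = (ryzen_npu_tops / copilot_requirement_tops - 1) * 100
print(f"Headroom over requirement: {headroom:.0f}%")  # 25%
```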

Jack Huynh, senior vice president and general manager of AMD’s Computing and Graphics Group, said the PRO 300 Series is designed to address enterprises’ growing demand for more compute power and efficiency in their business machines. “Our third-generation AI-enabled processors for business PCs deliver unprecedented AI processing capabilities with incredible battery life and seamless compatibility for the applications users depend on,” he said.

The bottom line, said Creative Strategies’ Bajarin, is that “the data center is under a complete transformation and we are still only in the early days of that, which makes this still a wide-open competitive field over the arc of time, 10-plus years. I’m not sure we can say with any certainty how this shakes out over that time, but the bottom line is there is a lot of market share and money to go around to keep AMD, Nvidia and Intel busy.”

With reporting from Robert Hof

Featured photo: Robert Hof/SiliconANGLE; images: AMD
