Lately, it seems like it’s Nvidia’s world, and everyone in tech and the explosive AI industry is living in it. Between its well-timed market entry, leading-edge hardware research, and a robust software ecosystem tailored for its GPUs, the company dominates AI development and the stock market. Its latest earnings report, released late today, revealed that quarterly sales had tripled, pushing its share price even higher.
Nonetheless, long-time rival chipmaker AMD is still pushing hard to establish a foothold in AI, telling the builders behind the key technologies in the nascent space that they can also do their work on AMD hardware.
“I just wanted to remind everybody that if you’re using PyTorch, TensorFlow or JAX, you can use your notebooks or scripts, they’ll just run on AMD,” declared AMD senior director Ian Ferreira at the Microsoft Build 2024 conference earlier on Wednesday. “Same with inferencing engines. vLLM and ONNX also work out of the box.”
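Ferreira’s point rests on a detail of PyTorch’s ROCm builds: AMD GPUs are exposed through the same `torch.cuda` API that CUDA-style scripts already use, so device-agnostic code runs unchanged. A minimal sketch (my illustration, not AMD’s official example) of what such a portable script looks like:

```python
import torch

# On ROCm builds of PyTorch, torch.cuda.is_available() returns True for
# AMD GPUs, so this one line works unmodified on Nvidia and AMD hardware.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# An ordinary tensor computation; nothing here is vendor-specific.
x = torch.randn(4, 8, device=device)
w = torch.randn(8, 2, device=device)
y = x @ w
print(y.shape)  # torch.Size([4, 2])
```

The same pattern is why existing notebooks need no rewrite: the vendor difference is absorbed by the PyTorch build, not the user’s code.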
The company used its time on stage to show examples of how AMD GPUs can natively run powerful AI models like Stable Diffusion and Microsoft Phi, efficiently performing computationally intensive training tasks without depending on Nvidia’s technology or hardware.
Conference host Microsoft bolstered the message by announcing the availability of AMD-based virtual machines on its Azure cloud computing platform, powered by the company’s MI300X accelerators. The chips were announced last June, began shipping in the new year, and were recently implemented in Microsoft Azure’s OpenAI service and Hugging Face’s infrastructure.
Nvidia’s proprietary CUDA technology, which includes a full programming model and API designed specifically for Nvidia GPUs, has become the industry standard for AI development. AMD’s main message, therefore, was that its solutions could slot right into the same workflows.
Seamless compatibility with existing AI systems could be a game-changer, as developers can now leverage AMD’s less costly hardware without overhauling their codebase.
“Of course, we understand that you need more than just frameworks, you need a bunch of upstream stuff, you need a bunch of experimentation stuff, distributed training—all of that is enabled and works on AMD,” Ferreira assured.
He then demonstrated how AMD handles different tasks, from running small models like ResNet 50 and Phi-3 to fine-tuning and training GPT-2—all using the same code that runs on Nvidia cards.
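The training side of that demo works for the same reason as the inference side: a standard PyTorch training loop contains nothing vendor-specific. A tiny stand-in sketch (a toy linear model rather than GPT-2, purely for illustration) that is identical whether `device` resolves to an Nvidia or an AMD (ROCm) GPU:

```python
import torch
import torch.nn as nn

# Identical on CUDA and ROCm; falls back to CPU if neither is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(16, 4).to(device)       # toy stand-in for a real model
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(32, 16, device=device)
targets = torch.randint(0, 4, (32,), device=device)

# A conventional training loop: forward, loss, backward, optimizer step.
for step in range(10):
    opt.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    opt.step()
print(loss.item())
```

Swapping in GPT-2 via a library like Hugging Face Transformers changes the model line, not the loop, which is what makes the “same code” claim credible.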
One of the key advantages AMD touted is the ability to handle large language models efficiently.
“You can load up to 70 billion parameters on one GPU, with eight of those on this instance,” he explained. “You can have eight different Llama 70Bs loaded, or take a big model like Llama 3 400B and deploy that on a single instance.”
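The 70B-on-one-GPU claim checks out on the back of an envelope (my arithmetic, not AMD’s): each MI300X carries 192 GB of HBM3, and a 70-billion-parameter model stored in 16-bit precision needs about two bytes per parameter.

```python
# Rough memory estimate for serving a 70B-parameter model in fp16/bf16.
params = 70e9
bytes_per_param = 2                        # 16-bit weights
weights_gb = params * bytes_per_param / 1e9
print(f"{weights_gb:.0f} GB of weights")   # 140 GB

mi300x_hbm_gb = 192                        # HBM3 per MI300X
print(weights_gb < mi300x_hbm_gb)          # True: fits on a single GPU
```

That leaves roughly 50 GB of headroom for the KV cache and activations, and it is why an eight-GPU instance can hold eight independent 70B models, or pool its memory for a single much larger one.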
Challenging Nvidia’s dominance is no easy feat, as the Santa Clara, Calif.-based company has fiercely protected its turf. Nvidia has already taken legal action against projects attempting to provide CUDA compatibility layers for third-party GPUs like AMD’s, arguing that such layers violate CUDA’s terms of service. This has limited the development of open-source solutions and made it harder for developers to embrace alternatives.
AMD’s strategy to circumvent Nvidia’s blockade is to leverage its open-source ROCm framework, which competes directly with CUDA. The company has been making significant strides in this regard, partnering with Hugging Face, the world’s largest repository of open-source AI models, to provide support for running code on AMD hardware.
This partnership has already yielded promising results, with AMD offering native support and additional acceleration tools like ONNX model execution on ROCm-powered GPUs, Optimum-Benchmark, DeepSpeed for ROCm-powered GPUs using Transformers, GPTQ, TGI, and more.
Ferreira also pointed out that this integration is native, eliminating the need for third-party solutions or middlemen that could make processes less efficient.
“You can take your existing notebooks, your existing scripts, and you can run them on AMD, and that’s important, because a lot of the other accelerators require transcoding and all kinds of pre-compiling scripts,” he said. “Our stuff just works out of the box—and it’s really, really fast.”
While AMD’s move is undoubtedly bold, dethroning Nvidia will be a considerable challenge. Nvidia is not resting on its laurels, continuously innovating and making it difficult for developers to migrate away from the de facto CUDA standard.
However, with its open-source approach, strategic partnerships, and a focus on native compatibility, AMD is positioning itself as a viable alternative for developers seeking more options in the AI hardware market.
Edited by Ryan Ozawa.