As part of its extended collaboration with AWS, GCP, Microsoft, IBM, and Oracle, the chip designer will share its new Blackwell GPU platform and foundational models, and integrate its software across the hyperscalers' platforms. Credit: Nvidia

Nvidia is extending its existing partnerships with hyperscalers Amazon Web Services (AWS), Google Cloud Platform, Microsoft Azure, and Oracle Cloud Infrastructure to make its latest GPUs and foundational large language models (LLMs) available, and to integrate its software across their platforms.

AWS, for instance, will offer Nvidia's Blackwell GPU platform as part of its cloud, featuring the new GB200 NVL72 server rack, which combines 72 Blackwell GPUs and 36 Grace CPUs interconnected by NVLink, Nvidia's high-speed GPU interconnect. "When connected with Amazon's powerful networking (EFA), and supported by advanced virtualization (AWS Nitro System) and hyper-scale clustering (Amazon EC2 UltraClusters), enterprises can scale to thousands of GB200 Superchips," the companies said in a joint statement.

Further, the companies said they expect the availability of Nvidia's Blackwell platform on AWS to speed up inference workloads for multi-trillion-parameter LLMs.

Nvidia will also make the Blackwell GB200 GPUs available in the AWS cloud via DGX Cloud, its own AI training service, which it hosts in other vendors' clouds. DGX Cloud was initially available only in Microsoft Azure and Oracle Cloud Infrastructure, but last November AWS said it would begin offering it too.

Another feature of the expanded partnership is that Nvidia will offer its NIM microservices inside Amazon SageMaker, AWS' machine learning platform, to help enterprises deploy foundational LLMs that are pre-compiled and optimized to run on Nvidia GPUs. This will reduce the time to market for generative AI applications, the companies said.
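Once a NIM microservice is running, whether behind SageMaker or elsewhere, applications reach it over an OpenAI-compatible chat-completions HTTP API. A minimal sketch of building such a request body follows; the model identifier and field values are illustrative assumptions, not details from the announcement.

```python
import json

# Sketch only: NIM microservices expose an OpenAI-compatible API, so a
# chat request is a JSON body posted to /v1/chat/completions. The model
# name below is an illustrative example, not tied to any specific deployment.
def build_chat_request(model: str, prompt: str, max_tokens: int = 64) -> str:
    """Build the JSON body for an OpenAI-compatible chat-completions call."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })

body = build_chat_request("meta/llama3-8b-instruct",
                          "Summarize NVLink in one sentence.")
```

In practice this body would be POSTed to the endpoint's `/v1/chat/completions` path with an appropriate auth header, which is what makes NIM deployments drop-in compatible with existing OpenAI-style client code.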
Other collaborations between AWS and Nvidia include the use of Nvidia's BioNeMo foundational model for generative chemistry, protein structure prediction, and understanding how drug molecules interact with targets, via AWS' HealthOmics offering. The two companies' healthcare teams are also working together to launch generative AI microservices to advance drug discovery, medtech, and digital health, they said.

Google Cloud to get Blackwell-powered DGX Cloud

Google Cloud Platform, like AWS, will be getting the new Blackwell GPU platform and integrating Nvidia's NIM suite of microservices into Google Kubernetes Engine (GKE) to speed up AI inferencing and deployment. In addition, Nvidia DGX Cloud is now generally available on Google Cloud A3 VM instances powered by Nvidia H100 Tensor Core GPUs, Google and Nvidia said in a joint statement.

The two companies are also extending their partnership to bring JAX, Google's machine learning framework for transforming numerical functions, to Nvidia's GPUs. This means that enterprises will be able to use JAX for LLM training on Nvidia's H100 GPUs via MaxText and the Accelerated Processing Kit (XPK), the companies said.

To help enterprises with data science and analytics, Google said that its Vertex AI machine learning platform will now support Google Cloud A3 VMs powered by Nvidia's H100 GPUs and G2 VMs powered by Nvidia's L4 Tensor Core GPUs. "This provides MLOps teams with scalable infrastructure and tooling to manage and deploy AI applications. Dataflow has also expanded support for accelerated data processing on Nvidia GPUs," the companies said.

Oracle and Microsoft too

Other hyperscalers, such as Microsoft and Oracle, have also partnered with Nvidia to integrate the chipmaker's hardware and software into their offerings. Not only are both companies adopting the Blackwell GPU platform across their services, they are also expected to offer Blackwell-powered DGX Cloud.
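The JAX integration described above hinges on JAX's core trick: tracing a numerical Python function and compiling it, via XLA, for whatever accelerator backend is present, be it a CPU or an Nvidia H100. A minimal sketch of that pattern, with a jitted gradient-descent step for a toy linear model (the function and data here are illustrative, unrelated to MaxText):

```python
import jax
import jax.numpy as jnp

# Minimal sketch: JAX traces this Python function once and compiles it with
# XLA for the available backend (CPU here; a CUDA-enabled jaxlib would target
# an Nvidia GPU such as the H100 with no code changes).
@jax.jit
def step(w, x, y, lr=0.1):
    # One gradient-descent step on the mean squared error of x @ w against y.
    loss = lambda w: jnp.mean((x @ w - y) ** 2)
    return w - lr * jax.grad(loss)(w)

x = jnp.ones((4, 2))          # toy inputs
y = jnp.zeros(4)              # toy targets
w = jnp.array([1.0, -1.0])    # initial weights (x @ w is already 0, so loss is 0)
w_new = step(w, x, y)
```

The same transformation-based model is what MaxText builds on for LLM training at scale; XPK then handles scheduling such workloads onto GPU clusters.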
IBM, on the other hand, said nothing about Nvidia hardware, but its consulting team will integrate Nvidia software components such as the NIM microservices suite to help enterprises on their AI development journeys.