
Microsoft launches newest Azure virtual machines optimized for AI supercomputing, the ND H200 v5 series



The need for high-performance, scalable infrastructure continues to grow exponentially as the AI landscape advances. Our customers rely on Azure AI infrastructure to develop innovative AI-powered solutions, which is why today we are delivering new cloud-based AI supercomputing clusters built with Azure ND H200 v5 series virtual machines (VMs). These VMs are now generally available and are designed to handle the growing complexity of advanced AI workloads, from foundational model training to generative inference. The scale, efficiency, and improved performance of our ND H200 v5 VMs are already driving customer adoption and powering Microsoft AI services such as Azure Machine Learning and Azure OpenAI Service.

“We are excited to adopt the new Azure H200 VMs. We’ve seen that H200 delivers improved performance with minimal porting effort, and we look forward to using these VMs to accelerate our research, improve the ChatGPT experience, and further our mission,” said Trevor Cai, head of infrastructure at OpenAI.

Azure ND H200 v5 VMs are designed with Microsoft’s systems approach to improve efficiency and performance, and feature eight NVIDIA H200 Tensor Core GPUs. Specifically, they address the gap created by GPU raw compute growing at a much faster rate than the attached memory and memory bandwidth. Compared with the previous generation of Azure ND H100 v5 VMs, the Azure ND H200 v5 series offers a 76% increase in high-bandwidth memory (HBM), to 141 GB, and a 43% increase in HBM bandwidth, to 4.8 TB/s. This increase in HBM bandwidth lets GPUs access model parameters faster, helping to reduce overall application latency, a critical metric for real-time applications such as interactive agents. ND H200 v5 VMs can also accommodate more complex large language models (LLMs) within the memory of a single VM, improving performance by helping customers avoid the overhead of running jobs distributed across multiple VMs.
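To make the single-VM point concrete, the sketch below is a rough, back-of-envelope estimate of whether a model's weights alone fit within the aggregate HBM of one ND H200 v5 VM (eight GPUs at 141 GB each). The function and precision table are illustrative assumptions rather than an official sizing tool, and the estimate ignores KV cache, activations, and framework overhead.

```python
# Back-of-envelope check (illustrative only): do a model's weights fit within
# the aggregate HBM of a single ND H200 v5 VM (8 GPUs x 141 GB)?

GPUS_PER_VM = 8
HBM_PER_GPU_GB = 141  # ND H200 v5 (previous generation ND H100 v5: 80 GB per GPU)

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}  # common inference precisions


def weights_fit_single_vm(num_params_billion: float, precision: str = "fp8") -> bool:
    """Rough test of whether model weights alone fit in one VM's GPU memory.

    Ignores KV cache, activations, and framework overhead, so real headroom
    is smaller; treat this purely as a first-pass estimate.
    """
    weight_gb = num_params_billion * BYTES_PER_PARAM[precision]  # 1e9 params * bytes ~= GB
    total_hbm_gb = GPUS_PER_VM * HBM_PER_GPU_GB  # ~1,128 GB per ND H200 v5 VM
    return weight_gb <= total_hbm_gb


# Example: a 405B-parameter model needs roughly 405 GB of weights in FP8
# (about 810 GB in FP16), comfortably under the ~1,128 GB of aggregate HBM.
print(weights_fit_single_vm(405, "fp8"))   # True
print(weights_fit_single_vm(405, "fp16"))  # True
```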

The design of our H200 supercomputing clusters also enables more efficient GPU memory management for model weights, key-value cache, and batch sizes, all of which directly affect throughput, latency, and cost efficiency in LLM-based generative AI inference workloads. With its larger HBM capacity, the ND H200 v5 VM can support larger batch sizes, driving better GPU utilization and performance than the ND H100 v5 series for inference workloads on both small language models (SLMs) and LLMs. In early testing, we observed up to a 35% performance increase with ND H200 v5 VMs compared to the ND H100 v5 series for inference workloads running the Llama 3.1 405B model (with world size 8, input length 128, output length 8, and maximum batch sizes of 32 for H100 and 96 for H200). For more details on Azure high-performance computing benchmarks, please read more here or visit our AI benchmarking guide in the Azure GitHub repository.
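The interplay between HBM headroom, KV cache, and batch size can be sketched with simple arithmetic. The snippet below uses hypothetical model dimensions (not the Llama 3.1 405B benchmark configuration above) to illustrate why leaving more HBM free after loading weights translates directly into larger feasible batch sizes.

```python
# Rough KV-cache sizing sketch (illustrative assumptions, not the benchmark setup).
# Per-token KV cache = 2 (K and V) * num_layers * num_kv_heads * head_dim * bytes_per_value.

def kv_cache_gb_per_sequence(num_layers: int, num_kv_heads: int, head_dim: int,
                             seq_len: int, bytes_per_value: int = 2) -> float:
    """Approximate KV-cache footprint of one sequence, in GB."""
    per_token_bytes = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token_bytes * seq_len / 1e9


def max_batch_size(free_hbm_gb: float, per_sequence_gb: float) -> int:
    """Largest batch that fits in the HBM left over after loading model weights."""
    return int(free_hbm_gb // per_sequence_gb)


# Hypothetical large model: 126 layers, 8 KV heads (grouped-query attention),
# head dimension 128, FP16 cache, 8K-token context (~4.2 GB of KV cache per sequence).
per_seq = kv_cache_gb_per_sequence(num_layers=126, num_kv_heads=8,
                                   head_dim=128, seq_len=8192)

# The same model supports a larger batch when more HBM is left free after the
# weights, which is what drives the higher GPU utilization on H200 vs. H100.
print(max_batch_size(free_hbm_gb=100, per_sequence_gb=per_seq))  # ~23 sequences
print(max_batch_size(free_hbm_gb=500, per_sequence_gb=per_seq))  # ~118 sequences
```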

ND H200 v5 VMs come pre-integrated with Azure Batch, Azure Kubernetes Service, Azure OpenAI Service, and Azure Machine Learning, helping businesses get started right away. Please visit here for more detailed technical documentation on the new Azure ND H200 v5 VMs.
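As one possible starting point, the sketch below shows how a GPU compute cluster might be requested through the Azure Machine Learning Python SDK (azure-ai-ml). The VM size string is an assumption based on prior-generation naming conventions; confirm the exact ND H200 v5 SKU name against the technical documentation linked above.

```python
# Minimal sketch: requesting an ND H200 v5 compute cluster through the
# Azure Machine Learning Python SDK (azure-ai-ml). The VM size string below is
# an assumption based on prior-generation naming; verify the exact SKU name in
# the ND H200 v5 documentation before use.
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import AmlCompute

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)

# Autoscaling cluster of ND H200 v5 nodes (eight H200 GPUs per node).
cluster = AmlCompute(
    name="nd-h200-v5-cluster",
    size="Standard_ND96isr_H200_v5",  # assumed SKU name; confirm in the docs
    min_instances=0,
    max_instances=2,
    tier="Dedicated",
)

ml_client.compute.begin_create_or_update(cluster).result()
```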


