Inside Nvidia’s brand-new AI supercomputer

With its Arm-based Grace processor at the core, Nvidia has unveiled a supercomputer designed to perform AI processing powered by a CPU/GPU combination.

The new system, the DGX GH200 supercomputer, was formally introduced at the Computex tech conference in Taipei. It is powered by 256 Grace Hopper Superchips, technology that combines Nvidia’s Grace CPU, a 72-core Arm processor designed for high-performance computing, and the Hopper GPU. The two are connected by Nvidia’s proprietary NVLink-C2C high-speed interconnect.

The DGX GH200 features an enormous shared memory space of more than 144TB of HBM3 memory linked by its NVLink-C2C interconnect technology. The system is a simplified design, and its processors are seen by their software as one giant GPU with one huge memory pool, said Ian Buck, vice president and general manager of Nvidia’s hyperscale and HPC business unit.

He said the system can be deployed and trained, with Nvidia’s help, on AI models that require memory beyond the bounds of what a single GPU supports.

“We need an entirely new system architecture that can break through one terabyte of memory in order to train these giant models,” he said.

Nvidia claims an exaFLOP of performance, but that figure is for eight-bit FP8 processing. Today, most AI processing is done using 16-bit bfloat16 instructions, which would take twice as long. One way of looking at it is that you could have a supercomputer that ranks in the top 10 of the TOP500 supercomputer list and occupies a relatively modest space.

By using NVLink instead of standard PCI Express interconnects, the bandwidth between GPU and CPU is seven times faster and requires a fifth of the interconnect power.

Google Cloud, Meta, and Microsoft are among the first expected to gain access to the DGX GH200 to explore its capabilities for generative AI workloads. Nvidia also intends to provide the DGX GH200 design as a blueprint to cloud service providers and other hyperscalers so they can further customize it for their infrastructure. Nvidia DGX GH200 supercomputers are expected to be available by the end of the year.

Software is included

These supercomputers come with Nvidia software installed to provide a turnkey product that includes Nvidia AI Enterprise, the main software layer for its AI platform, including frameworks, pretrained models, and development tools; and Base Command for enterprise-level cluster management.

DGX GH200 is the first supercomputer to pair Grace Hopper Superchips with Nvidia’s NVLink Switch System, the interconnect that enables the GPUs in the system to work together as one. The previous-generation system maxed out at eight GPUs working in tandem.

Getting to the full-sized system still requires significant data-center real estate. Each 15 rack-unit chassis holds eight compute nodes, and there are two chassis per rack (or pod, in Nvidia parlance) in addition to NVswitch Ethernet and IP connectivity. Up to eight of the pods can be connected for as many as 256 processors.

The system is air cooled despite the fact that Hopper GPUs draw 700 watts of power, which means considerable heat. Nvidia said that it is internally developing liquid-cooled systems and is discussing them with customers and partners, but for now the DGX GH200 is cooled by fans.

So far, potential users of the system aren’t ready for liquid cooling, said Charlie Boyle, vice president of DGX systems at Nvidia. “There will be points in the future where we’ll …
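The headline figures quoted in the article lend themselves to a quick back-of-the-envelope check. The sketch below is only illustrative: the function names are hypothetical, the binary conversion of 1 TB = 1,024 GB is an assumption, and the halving of throughput for bfloat16 simply applies the article's "twice as long" remark rather than any measured number.

```python
# Figures taken from the article.
SUPERCHIPS = 256            # Grace Hopper Superchips in a full DGX GH200
SHARED_MEMORY_TB = 144      # shared memory quoted for the full system
FP8_EXAFLOPS = 1.0          # Nvidia's claimed FP8 performance
BF16_SLOWDOWN = 2.0         # article: bfloat16 "would take twice as long"


def memory_per_superchip_gb(total_tb, chips, gb_per_tb=1024):
    """Average share of the memory pool per superchip, in GB.

    gb_per_tb=1024 is an assumption (binary units); pass 1000 for decimal.
    """
    return total_tb * gb_per_tb / chips


def bf16_exaflops(fp8_exaflops, slowdown):
    """Rough bfloat16 throughput implied by the FP8 figure.

    Assumes throughput scales inversely with the slowdown factor.
    """
    return fp8_exaflops / slowdown


def fits_in_shared_memory(model_tb, total_tb=SHARED_MEMORY_TB):
    """True if a working set of model_tb terabytes fits in the shared pool."""
    return model_tb <= total_tb


if __name__ == "__main__":
    print(memory_per_superchip_gb(SHARED_MEMORY_TB, SUPERCHIPS), "GB per superchip")
    print(bf16_exaflops(FP8_EXAFLOPS, BF16_SLOWDOWN), "exaFLOPS at bfloat16 (estimate)")
    print(fits_in_shared_memory(1.0))  # the one-terabyte threshold Buck mentions
```

Under these assumptions, each superchip's average share of the pool is well under the one-terabyte mark Buck cites, which is consistent with his point that very large models need the pooled memory rather than any single GPU's memory.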
