Advances in Technology are Democratizing AI and HPC, but Only for the Wealthy?

2Q 2022 | IN-6514

Heterogeneous compute and hardware abstraction enhance productivity, but access to them still comes at a premium cost.


Advancements in Full Stack CPUs

NEWS


NVIDIA’s March 2022 GPU Technology Conference (GTC) saw the US-based technology company issue a raft of technology announcements that backed its stance as a full-stack company. Not long has passed since NVIDIA’s graduation from Graphics Processing Unit (GPU) producer to Central Processing Unit (CPU) producer, and its new hardware announcements back up its position as a full-stack supplier.

NVIDIA defines the stack as having four distinct layers. At the top is the applications layer, which technology consumers interact with most heavily. Directly beneath it, and very tightly coupled to it, is the platform layer, which currently comprises four platforms. The third layer down is the system software layer, and the bottom layer is the hardware layer, which includes servers and components such as CPUs and GPUs.

NVIDIA made announcements about new products or enhancements at GTC that touched every layer of its stack and will undoubtedly increase productivity within their various domains. With respect to High-Performance Compute (HPC), Artificial Intelligence (AI), Machine Learning (ML), and data center workloads specifically, the multiple hardware announcements grabbed attention. One such announcement was the new Hopper architecture and the H100 GPU based upon it. This GPU, the successor to the A100 GPU, is claimed to offer a six-fold improvement over the outgoing model and offers per-instance workload isolation, a first for GPUs. Eight H100 GPUs have been packaged together with high bandwidth memory (HBM) onto the HGX system board, with additional x86 CPU and InfiniBand networking modules. The resulting product is the NVIDIA DGX H100, an AI-focused server that is available from NVIDIA and partner channels.

NVIDIA went on to announce that up to 32 DGX H100s can further be linked using NVLink to form a DGX POD, and that those PODs can be combined to form SuperPODs, which are, to all intents and purposes, one giant GPU.

These high-end systems are not cheap. The A100-based DGX system retailed at US$199,000, and while NVIDIA has a track record of making its DGX models both cheaper and more powerful between generations, the productivity gains claimed for the next-generation model mean that its price is unlikely to see a heavy reduction, if any reduction at all. With a DGX POD being composed of up to 32 H100 DGXs, the list price looks set to be eye-watering, so can NVIDIA really claim that it is democratizing anything at these prices?
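To put a rough number on "eye-watering", the short Python sketch below multiplies the figures quoted above. It assumes, purely for illustration, that an H100-based DGX is priced the same as the A100-based system; no H100 list price is given here, and the result ignores networking, storage, and support costs. The function name is hypothetical.

```python
# Illustrative arithmetic only. The per-system price is the A100-based DGX
# figure quoted in the text, reused as a stand-in for the H100-based system
# (an assumption: no H100 price is stated).
A100_DGX_LIST_PRICE_USD = 199_000  # from the text
MAX_DGX_PER_POD = 32               # from the text


def pod_hardware_estimate(systems: int,
                          price_per_system: int = A100_DGX_LIST_PRICE_USD) -> int:
    """Sum of per-system list prices; excludes networking, storage, support."""
    if not 1 <= systems <= MAX_DGX_PER_POD:
        raise ValueError(f"systems must be between 1 and {MAX_DGX_PER_POD}")
    return systems * price_per_system


full_pod = pod_hardware_estimate(MAX_DGX_PER_POD)
print(f"{MAX_DGX_PER_POD} systems at US${A100_DGX_LIST_PRICE_USD:,} each: US${full_pod:,}")
```

Under that assumption, a fully populated POD would come to roughly US$6.4 million in servers alone.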

Democratizing Productivity, Not HPC

IMPACT


The increase in productivity that these new systems offer does not have a fixed value. Research facilities and government-sponsored bodies, as well as some privately funded organizations, will always build very expensive High-Performance Compute clusters that are well out of reach of most enterprises. Either the productivity increases warrant the investment from a commercial perspective, or the capabilities that these large systems bring enable new workloads to be run in the hope of advancing scientific discovery. In continuing to innovate in silicon, technology companies like NVIDIA, Intel, AMD, and Arm are providing the data center tools that scientists need to maintain a healthy cadence of innovation. In building the ecosystems that allow the hardware to be customized to specific workloads and that maximize the productivity gains made in silicon, they are lowering the technical barriers to entry for this technology. So, in this respect, the democratization claim rings true.

To further bolster the claim to democratization that chip manufacturers are making, we can look to the modular design to which the new generation of HPC clusters is being built. The flexibility enabled by NVIDIA’s plans to make its interconnects available across its whole range of silicon, as well as customer and partner silicon, means that the number of different types of processors that can be combined into a coherent system is greatly expanded. Intel has a very close working relationship with its customers and a similar range of CPU and accelerator technologies, which ensures that it is developing commodity products that meet an equally diverse range of use cases. All of this means that systems can be built targeting very specific and niche workloads, something that was previously only available in high-end bespoke systems, often built at great cost to service a very sparse range of workloads. Targeted productivity is therefore also feeling the benefit of the democratization of AI and HPC.

The biggest claim to democratization that can be made by all the major silicon vendors is the fact that their modular systems make financial sense at scale, which makes them highly attractive to hyperscalers. Very small improvements in productivity are amplified greatly at hyperscale, where cost per flop, byte, or transaction is king. NVIDIA’s per-instance isolation on its H100 GPU means that the hyperscalers can allow multiple tenants to share GPU hardware securely, bringing down the cost of usage and enabling more customers to access this cutting-edge technology affordably.

Lead, Don't Follow

RECOMMENDATIONS


Silicon companies and the hyperscalers understand the value that their products bring to the enterprise, and the prices they charge are justified in terms of the productivity they bring. But just because leading-edge technology democratizes these high-end systems does not mean that the enterprise needs to use them to feel the benefits of that democratization. The ecosystems, applications, Application Programming Interfaces (APIs), and Software Development Kits (SDKs) move forward in line with the silicon that they utilize, gaining new functionality, model calculation tweaks, and optimizations. Where these are not dependent on specific hardware, they will often be available for use on current-generation hardware as well.

The important thing for the enterprise to take away is that it needs to understand its own workloads, understand the value that the new technology brings to its compute demands, and make informed decisions as to whether the value proposition rings true. There are tools available to help the enterprise with this. NVIDIA makes its DGX SuperPOD and NVIDIA-Certified Systems available for evaluation via its LaunchPad service. Often, hyperscalers will be the first to deploy the newest technology, meaning that the enterprise can evaluate it in the cloud, and understand the cost of performing its workloads against different technology generations and specifications, before making a commitment.
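One way to frame that cloud evaluation is as a cost-per-run comparison: multiply each option's hourly price by the measured runtime of the same workload. The Python sketch below illustrates the idea. Every name, price, and runtime in it is a hypothetical placeholder and not a figure from any vendor; real values would come from an organization's own benchmarks and its provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str               # hypothetical label for a cloud instance type
    hourly_rate_usd: float  # hypothetical on-demand price per hour
    runtime_hours: float    # measured wall-clock time for one run of the workload

    @property
    def cost_per_run_usd(self) -> float:
        return self.hourly_rate_usd * self.runtime_hours


def cheapest(options: list[Option]) -> Option:
    """Return the option with the lowest cost for one run of the workload."""
    return min(options, key=lambda o: o.cost_per_run_usd)


# Placeholder numbers only: a pricier instance can still win if it finishes sooner.
options = [
    Option("current-gen", hourly_rate_usd=30.0, runtime_hours=10.0),
    Option("next-gen", hourly_rate_usd=60.0, runtime_hours=4.0),
]
best = cheapest(options)
print(f"{best.name}: US${best.cost_per_run_usd:,.2f} per run")
```

With these placeholder inputs, the newer instance costs twice as much per hour but is cheaper per run, because it finishes in less than half the time. With a smaller speed-up, the conclusion would reverse, which is why measuring the organization's own workload matters.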

Some silicon producers shout louder than others, but all of them improve their product performance or functionality generation over generation. NVIDIA goes to great lengths to launch its next-generation products, while Intel has a constant pipeline of products in development. Both work hard to understand the future demands that their customers will have. Intel’s customers trust it to bring the features that are required, so its launch events bring fewer surprises. Cut through the hype, understand your workload, and, critically, understand the productivity value that new silicon and features bring to your organization. Let that guide your technology decisions.

 
