NVIDIA and SoftBank Make Bold Claims for GPU-Based RAN
| NEWS |
NVIDIA and SoftBank made bold claims in November 2024, announcing that the operator’s trials with Graphics Processing Unit (GPU)-based Artificial Intelligence (AI)-Radio Access Network (RAN) delivered better performance and lower power consumption, while also providing new revenue opportunities through GPU-as-a-Service (GPUaaS) business models. It is worth diving deeper into this announcement and deciphering which configuration and infrastructure were used. The trial included 20 Radio Units (RUs) and 100 User Equipment (UE) devices, and allowed multi-tenancy in the AI-RAN. Specifically, the trial was run on a GB200-NVL2 server powered by two Grace Central Processing Units (CPUs) and two Blackwell GPUs, which is estimated to consume up to 3 Kilowatts (kW) of power at full load and cost approximately US$60,000 to US$65,000.
The trial concluded that the Aerial RAN computer with the above specifications achieved 60% lower power consumption (Watts per Gigabit per Second (W/Gbps)) than an x86 system and 40% lower than custom silicon, with similar efficiencies across Distributed-RAN (D-RAN) and Centralized-RAN (C-RAN). NVIDIA’s AI Aerial PHY software was used, while SoftBank provided the Medium Access Control (MAC) layer software; the software stack is open to different partners, for example through containerized Centralized Unit (CU)/Distributed Unit (DU) functions.
The more interesting claims concern the GPUaaS business opportunity: each server can generate 25,000 tokens/second for Generative Artificial Intelligence (Gen AI) models (Llama-3-70B FP4 was used) and, at an average price of US$0.25/token, this could provide US$20/hour per server, or US$1 billion/year across 6,000 AI-RAN servers. This translates to US$5 of new revenue per US$1 spent on AI-RAN infrastructure, with a 219% profit margin for AI-heavy workloads, and the model remains profitable even in RAN-heavy scenarios (profitability drops to 33% in this case), where most of the processing in the network is consumed by the RAN. The model uses Operational Expenditure (OPEX) costs typical of Japan, assumes a 5-year amortization period, and assumes that these servers are deployed at aggregation sites.
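The fleet-level revenue claim can be sanity-checked with simple arithmetic. The sketch below uses only the per-server rate and fleet size quoted in the announcement; the assumption of continuous 24/7 utilization is ours, not NVIDIA's or SoftBank's.

```python
# Back-of-envelope check of the NVIDIA-SoftBank GPUaaS revenue claim.
# Inputs are the figures quoted in the announcement; 24/7 utilization
# across the whole fleet is an assumption made for this sketch.

servers = 6_000          # AI-RAN servers in the forecast
revenue_per_hour = 20.0  # US$ per server per hour (claimed)
hours_per_year = 365 * 24

annual_revenue = servers * revenue_per_hour * hours_per_year
print(f"Fleet revenue: US${annual_revenue / 1e9:.2f} billion/year")
# → Fleet revenue: US$1.05 billion/year
```

At full utilization, the quoted figures land almost exactly on the claimed US$1 billion/year, which suggests the forecast assumes near-continuous monetization of the servers' spare capacity.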
However, it is necessary to dive deeper into these numbers and compare them directly with Tier One vendor infrastructure.
Tier One Vendor RAN Provides >10X Efficiency Compared to GPU-Based RAN
| IMPACT |
The AI-RAN trial specifications indicate that the radios deployed were 4T4R with 100 Megahertz (MHz) of bandwidth. Each AI-RAN computer can power up to 20 cells, which translates to 2 to 6 cell site locations, depending on the cell configuration (3, 6, or 9 sectors). The latest cuPHY software release supports 20 peak-loaded 4T4R cells with 100 MHz of bandwidth, or 3 cells with 64TR Massive Multiple Input, Multiple Output (MIMO) support. Effectively, an Aerial RAN computer would be deployed at the cell site, or very near it, because it can only provide processing for a few cell site locations. Because most 5G networks are already deployed, especially in developed markets, the question is whether mobile operators would replace existing 4T4R infrastructure with GPU-based RAN, even if the business case is proven. According to ABI Research assumptions and calculations, the maximum capacity this server can support is ~1 Gbps per cell (4 component carriers of 20 MHz each, aggregated with a 4T4R MIMO configuration, assuming optimal channel conditions).
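The ~1 Gbps per-cell figure can be reproduced with a standard peak-throughput approximation (bandwidth × MIMO layers × spectral efficiency). The per-layer spectral efficiency below is our assumption, chosen to reflect high-order modulation with typical coding and overhead under optimal channel conditions; it is not a figure from the trial.

```python
# Rough per-cell peak throughput, reproducing ABI Research's ~1 Gbps estimate.
# The effective spectral efficiency per MIMO layer is an assumed value
# (roughly 256-QAM with typical coding/overhead, optimal conditions).

carriers = 4                # aggregated component carriers
bw_per_carrier_hz = 20e6    # 20 MHz each
mimo_layers = 4             # 4T4R, up to four spatial layers
se_per_layer = 3.125        # bps/Hz per layer (assumed)

peak_bps = carriers * bw_per_carrier_hz * mimo_layers * se_per_layer
print(f"Peak cell throughput: {peak_bps / 1e9:.2f} Gbps")
# → Peak cell throughput: 1.00 Gbps
```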
If we consider custom silicon, Ericsson’s RAN Processor 6651 baseband supports up to 300 cells in a C-RAN deployment and is powered by Ericsson custom silicon developed in collaboration with Intel. ABI Research estimates that the unit consumes 1.5 kW of power and costs a fraction of the GPU-based server: US$5,000 to US$10,000, depending on contract volume and configuration. The unit can handle up to 20 Gbps, according to Ericsson specifications.
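Putting the two platforms side by side makes the efficiency gap concrete. The sketch below uses the power, cost, and throughput figures quoted above; the GPU server's 20 Gbps assumes 20 cells at ~1 Gbps each (the ABI Research estimate), and the cost values are midpoints of the quoted ranges, which is our simplification.

```python
# Efficiency comparison using the figures quoted in the text.
# GPU-server throughput assumes 20 cells x ~1 Gbps (ABI estimate);
# costs are midpoints of the quoted price ranges (our simplification).

systems = {
    # name: (power_kw, throughput_gbps, cost_usd)
    "GB200-NVL2 (GPU-based)":      (3.0, 20.0, 62_500),
    "Ericsson RAN Processor 6651": (1.5, 20.0, 7_500),
}

for name, (kw, gbps, cost) in systems.items():
    print(f"{name}: {kw * 1000 / gbps:.0f} W/Gbps, US${cost / gbps:,.0f}/Gbps")
# → GB200-NVL2 (GPU-based): 150 W/Gbps, US$3,125/Gbps
# → Ericsson RAN Processor 6651: 75 W/Gbps, US$375/Gbps
```

On these assumptions, the custom-silicon unit is roughly 2X more power efficient and over 8X cheaper per Gbps of RAN capacity, which is where the headline's ">10X" combined advantage comes from.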
A direct comparison illustrates that, in terms of RAN processing alone, Ericsson equipment outperforms the GPU-based server. However, it is necessary to include the additional GPUaaS revenue that SoftBank and NVIDIA claim will reverse this balance for enterprise vertical use cases: in this case, robotics, manufacturing, automotive, and smart cities.
GPUaaS in the RAN: A Great Opportunity for Risk Takers
| RECOMMENDATIONS |
The NVIDIA-SoftBank forecasts assume that 6,000 AI-RAN servers deployed in the network will generate US$1 billion per year. If the Aerial RAN computer and the Ericsson baseband specified earlier are of similar specification and are used for a similar cell configuration, then the mobile operator in question would have to spend an extra US$60,000 per cell site on infrastructure, excluding software, installation, maintenance, and other fees that are assumed to be the same for either custom-silicon or AI-RAN processing. This additional cost would translate to US$360 million for a 6,000-server deployment and well over US$1 billion for larger deployments.
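The incremental spend works out as follows, using the price gap quoted above and the fleet size from the NVIDIA-SoftBank forecast; the one-server-per-site mapping is the simplifying assumption stated in the text.

```python
# Extra infrastructure spend implied by choosing GPU-based servers over
# custom-silicon baseband, using the hardware price gap quoted in the text.
# Assumes one AI-RAN server per cell site, as in the comparison above.

extra_per_site = 60_000  # US$ price gap per site (GPU server vs. baseband)
sites = 6_000            # matching the NVIDIA-SoftBank fleet size

total_extra = extra_per_site * sites
print(f"Total extra spend: US${total_extra / 1e6:.0f} million")
# → Total extra spend: US$360 million
```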
This additional cost is not prohibitive. An extra US$60,000 per site is minor, because this equipment is typically depreciated over a 5- or 10-year period, during which new business models and revenue streams would surely be discovered and GPUaaS deployment models would mature. However, the rate at which GPUs depreciate may differ significantly from that of RAN infrastructure, especially now that NVIDIA launches new GPUs every 1 to 2 years. The current gap in performance and power efficiency reflects the maturity of the software running on the GPU servers, which is relatively new compared to the decades of experience Ericsson, Huawei, and Nokia have accumulated, and the millions of man-hours they have spent developing, streamlining, and optimizing their software and custom silicon. The delta between custom silicon and GPU-based systems will close, and ABI Research expects that even the Tier One vendors will eventually adopt GPUs for RAN processing, breaking free of the long and expensive development cycle of custom silicon. But for now, considering RAN processing alone, custom silicon is king.