Cloud AI Inference Workload Capacity Consumption to Surpass Training by 2033, Reaching 46 GW by 2035
Enterprise AI deployment is shifting cloud economics from model building to production-scale inference, with code generation emerging as the biggest long-term demand driver
Global technology intelligence firm ABI Research forecasts that AI inference workloads will grow at a 42% compound annual growth rate (CAGR) to surpass 46 gigawatts (GW) of capacity consumption by 2035, overtaking training workloads by 2033 as the dominant force in cloud AI infrastructure. The finding signals a structural shift in the market, as enterprise adoption, better token economics, and stronger production return on investment (ROI) push operators to prioritize inference-optimized capacity.
“Inference is the commercial engine of the AI market, and its market activity is accelerating at an incredible pace with better model capabilities and the computational demands of agentic systems,” said Larbi Belkhit, Senior Analyst at ABI Research. “For the last several years, the cloud demand and build-out centered predominantly on training frontier models, but the next wave of competition will be won by providers that can deliver inference at scale with the right balance of performance, latency, cost, and compute utilization.”
Training demand will still expand aggressively, reaching 36 GW by 2035, but its composition is changing. Fine-tuning workloads are set to surpass foundation model training by 2032 and grow to 21 GW by 2035, while foundation model training climbs to nearly 13 GW, underscoring how enterprise customization is becoming a major contributor to cloud AI capacity consumption.
Among inference workloads, code generation is the standout category in the enterprise market, scaling to roughly 24 GW by 2035 and accounting for more than half of total inference capacity consumption. Text generation will continue growing, but only to about 7 GW by 2035, while audio generation is projected to become the fastest-growing inference workload at a 42% CAGR as model capabilities and economics improve.
The market shift will also reshape competitive dynamics across cloud providers. ABI Research expects neocloud providers to nearly catch hyperscalers in total inference capacity consumption by 2035, reaching 15 GW and 16 GW respectively, as they expand infrastructure built specifically for inference-heavy AI services.
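As a rough illustration of how the 2035 figures relate to one another, the short Python sketch below converts the quoted capacity numbers into shares of total inference consumption. It assumes the rounded values stated in this release (46 GW total inference, roughly 24 GW code generation, about 7 GW text generation, 16 GW hyperscalers, 15 GW neoclouds); the underlying report's exact values may differ, and the variable and function names are illustrative only.

```python
# Rounded 2035 figures (GW) as quoted in the release. These are assumptions
# for illustration; the report's exact values may differ.
INFERENCE_TOTAL_GW = 46.0

workloads_gw = {
    "code generation": 24.0,
    "text generation": 7.0,
}

providers_gw = {
    "hyperscalers": 16.0,
    "neoclouds": 15.0,
}


def share_of_total(part_gw: float, total_gw: float) -> float:
    """Return part_gw as a fraction of total_gw."""
    if total_gw <= 0:
        raise ValueError("total_gw must be positive")
    return part_gw / total_gw


workload_shares = {
    name: share_of_total(gw, INFERENCE_TOTAL_GW) for name, gw in workloads_gw.items()
}
provider_shares = {
    name: share_of_total(gw, INFERENCE_TOTAL_GW) for name, gw in providers_gw.items()
}

for name, share in {**workload_shares, **provider_shares}.items():
    print(f"{name}: {share:.1%} of 2035 inference capacity")
```

On these rounded inputs, code generation comes out slightly above half of inference capacity, consistent with the "more than half" statement, and hyperscalers and neoclouds each account for roughly a third.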
“These forecasts show that AI infrastructure strategy is moving into a new phase, where success depends less on headline model size and more on operationalizing AI across real-world workloads,” Belkhit said. “Cloud providers, enterprises, and the wider AI supply chain are preparing for a far more heterogeneous compute landscape shaped by long-running autonomous agentic systems, multi-modal workloads, and scalable inference demand.”
These findings are from ABI Research’s AI Cloud Workloads market data report, part of the company’s AI & Machine Learning research service, which includes research, data, and ABI Insights.
Contact ABI Research
Media Contacts
Americas: +1.516.624.2542
Europe: +44.(0).203.326.0142
Asia: +65 6950.5670
Related Research
Market Data | 2Q 2026 | MD-AICW-101
