AI Infrastructure Investment Grows with ROI Concerns Reaching a Fever Pitch, While AI Server OEMs Seek to Address OPEX Burden Through Full Stack Investment & Innovation
By Reece Hayden | 11 Dec 2025 | IN-8006
NEWS: AI Server Prices Are Set to Rise, Bringing Already Pervasive Infrastructure ROI Questions to the Fore Once Again
Investment in Artificial Intelligence (AI) infrastructure continues to dominate headlines. Vultr, a rapidly growing neocloud provider, has announced a further US$1 billion investment in AMD processors for a new data center in Ohio. U.K.-based neocloud Nscale has revealed plans to deploy more than 300,000 NVIDIA Graphics Processing Units (GPUs) across multiple sites over the next 2 years, while Iren, a former crypto-mining company, intends to raise US$2 billion in convertible bonds to finance its AI data center expansion. These announcements represent only the visible edge of a rapidly expanding market, with 2026 expected to usher in further capital commitments.
However, this bullish sentiment is not universal. Industry leaders are increasingly questioning whether the surge in AI data center Capital Expenditure (CAPEX) will deliver meaningful long-term returns. IBM’s Chief Executive Officer (CEO) recently warned that current levels of AI investment are unsustainable, while Google’s CEO cautioned that the market is showing signs of “irrationality.” These warnings should resonate with operators aggressively expanding AI infrastructure—particularly at a time when server prices across the market continue to climb.
Recent reports indicate that server vendors are preparing price increases in 2026, driven by expectations of continued inelastic demand for AI systems. But demand is not the sole pressure point. The Bill of Materials (BOM) is rising sharply, most notably with Dynamic Random-Access Memory (DRAM) costs reportedly up by around 60%. These increases will inevitably be passed from suppliers to cloud providers, pushing CAPEX per server even higher and further compressing infrastructure margins. Reshoring manufacturing to the United States and Europe will add further pressure through higher labor costs.
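To see how a single line-item increase propagates to total server cost, consider a minimal sketch. The baseline BOM figure and the DRAM share of BOM below are hypothetical assumptions chosen only to show the arithmetic; only the roughly 60% DRAM rise is taken from the reports above:

```python
def server_bom_after_dram_increase(baseline_bom, dram_share, dram_increase):
    """Estimate a new server BOM when only the DRAM line item rises.

    baseline_bom  -- original bill of materials cost in USD (hypothetical)
    dram_share    -- fraction of BOM attributable to DRAM, 0-1 (hypothetical)
    dram_increase -- fractional DRAM price rise (e.g., 0.60 for +60%)
    """
    dram_cost = baseline_bom * dram_share
    other_cost = baseline_bom - dram_cost
    return other_cost + dram_cost * (1 + dram_increase)

# Hypothetical example: a US$200,000 AI server with 10% of BOM in DRAM
# and the reported ~60% DRAM price increase. The DRAM line rises from
# $20,000 to $32,000, lifting the total BOM by 6%.
new_bom = server_bom_after_dram_increase(200_000, 0.10, 0.60)
```

The point of the sketch is that even a modest DRAM share turns a 60% component increase into a material CAPEX-per-server increase, before any vendor margin adjustments are layered on top.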
While CAPEX will continue to command headlines, the more critical battleground lies beneath: the long-term cost of operating AI infrastructure. Success will depend not just on how much is spent, but on how effectively these systems are run. Recognizing this, Original Equipment Manufacturers (OEMs) are investing across the stack to reduce server Operational Expenditure (OPEX) in an attempt to differentiate in a crowded and increasingly homogeneous market.
IMPACT: Server Costs May Be Increasing, but OEMs Aim to Solve ROI Bottlenecks Through Innovation Targeting Critical OPEX Drivers
AI server OEMs are accelerating innovation with a clear focus on OPEX, targeting improvements across energy efficiency, infrastructure management, deployment speed, cooling effectiveness, and vendor lock-in mitigation. Below are some of the most notable recent developments:
- Penguin Solutions Releases ICE ClusterWare Management Software 13.0: This release is designed to address performance bottlenecks in AI infrastructure by sustaining peak cluster efficiency while enabling secure, multi-tenant access to GPU resources. By supporting the secure provisioning of individual GPUs and sub-clusters to diverse user groups, it significantly improves utilization through safe resource sharing without compromising performance or isolation. This has important implications for GPU-as-a-Service (GPUaaS) models. The platform enables faster creation of sub-clusters with built-in security and segmentation, reducing the operational friction that often leads to deployment delays, underutilized resources, and inefficient workloads. The result is improved efficiency, higher utilization rates, and faster time-to-value.
- Penguin Solutions Partners with SK Hynix to Tackle Memory Bottlenecks: The partnership focuses on advancing next-generation memory technologies for AI data centers, particularly for memory-intensive inference workloads. By addressing persistent memory bandwidth and capacity constraints, the collaboration promises meaningful performance gains measured in tokens per watt. These improvements translate directly into lower OPEX through enhanced power efficiency and reduced latency, strengthening the overall Return on Investment (ROI) case for AI infrastructure deployments.
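The tokens-per-watt framing can be made concrete with a short sketch. All throughput, power, and electricity-price figures below are hypothetical, chosen only to illustrate the arithmetic; they are not figures from this insight or from the partnership:

```python
def inference_energy_cost(tokens_per_second, watts, price_per_kwh):
    """Electricity cost (USD) to generate one million tokens.

    tokens_per_second -- sustained inference throughput (hypothetical)
    watts             -- average system power draw (hypothetical)
    price_per_kwh     -- electricity price in USD per kWh (hypothetical)
    """
    seconds = 1_000_000 / tokens_per_second  # time to serve 1M tokens
    kwh = watts * seconds / 3_600_000        # convert W·s to kWh
    return kwh * price_per_kwh

# Hypothetical example: 5,000 tokens/s on a 10 kW system at $0.10/kWh.
cost_per_million_tokens = inference_energy_cost(5_000, 10_000, 0.10)
# A tokens-per-watt gain (higher throughput at the same power, or the same
# throughput at lower power) lowers this energy cost proportionally.
```

This is why tokens per watt translates directly into OPEX: at data center scale, a double-digit improvement in the metric compounds across billions of served tokens per day.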
- Supermicro Introduces Advanced Air-Cooling for AMD Instinct MI355X: Performance gains from each new generation of accelerators typically drive higher cooling costs, with some deployment Service-Level Agreements (SLAs) necessitating direct-to-chip liquid cooling systems, rather than traditional air approaches. Supermicro’s latest announcement challenges this trend by delivering high-performance GPU system designs optimized for air-cooled environments. By leveraging Open Compute Project (OCP)-compliant architectures, these platforms achieve double-digit performance improvements, while remaining compatible with air cooling. This directly addresses growing Total Cost of Ownership (TCO) concerns associated with liquid cooling infrastructure, especially for on-premises enterprise deployments, and accelerates time-to-value, improving ROI timelines for AI cloud providers and enterprises alike.
While server OEMs may be passing some BOM increases on through higher CAPEX, they are simultaneously addressing fundamental ROI constraints through OPEX-focused investment and innovation.
RECOMMENDATIONS: Looking into 2026, How Can Server Vendors Continue to Accelerate Differentiation?
Looking ahead to 2026, server OEMs must continue to sharpen their value propositions as competition intensifies. Several strategic priorities are clear:
- Deepen Engagement and Co-Development with ASIC Vendors: While NVIDIA remains the dominant force in AI acceleration, 2025 marked a meaningful expansion of AMD’s market position through major customer wins and ecosystem partnerships, including Vultr’s continued investment in AMD platforms, the HPE–AMD partnership, and Crusoe’s US$400 million commitment to AMD-based AI infrastructure. Collaboration with these established leaders will remain essential, but OEMs must also take a longer-term view by engaging with emerging challengers. Vendors such as SambaNova, Cerebras, and Groq are already gaining traction, while newer players, including Furiosa and Axelera, present particularly compelling opportunities, especially in the sovereignty-focused European and Asian markets, where domestic compute capabilities are becoming an increasingly strategic priority.
- Expand Networking Co-Development Beyond Incumbent Ecosystems: Memory has rightly received significant attention due to its impact on AI performance, yet networking remains an underappreciated constraint. NVIDIA’s strong position in InfiniBand has historically limited choice and innovation in interconnects, but growing interest in alternative fabrics (such as the Ultra Ethernet Consortium) is beginning to shift the market. To stay ahead, server OEMs must broaden partnerships across the open, Ethernet-based networking ecosystem and reduce reliance on a single vendor or architecture. Nokia is one example of a company actively expanding its role in this space, and its recent collaboration with Nscale illustrates the growing appetite for infrastructure diversification.
- Prioritize Hybrid Cooling Compatibility: Cooling is rapidly emerging as one of the most critical bottlenecks in AI data centers. Next-generation AI systems are driving adoption of direct-to-chip liquid cooling, introducing new complexities and cost pressures across both CAPEX and OPEX. Hybrid cooling support and an ongoing commitment to air-cooled systems should, therefore, be a top priority for OEMs. By building reference designs and validating compatibility across multiple cooling approaches, vendors can provide cloud operators and enterprises with greater flexibility, faster deployment, and lower operational risk.
These technical priorities form the foundation of a compelling value proposition, but commercial considerations are equally important, especially when addressing the growing neocloud segment. Flexibility, time to market, supply chain visibility, and digital sovereignty are now central to procurement decisions. Customers such as Nscale, planning to deploy more than 300,000 GPUs across Europe and the United States within the next 1 to 2 years, require transparency and confidence that supply chains will remain resilient in the face of geopolitical and economic disruption. At the same time, commercial models are becoming more adaptive: buyers are negotiating contracts that include extensions, payment pauses for unused capacity, and other consumption-driven terms. OEMs must account for these evolving expectations when structuring partnerships with these customers.