The Memory Super-Cycle: Can Used GPUs Alleviate AI Market Demand?
By Paul Schell | 04 Feb 2026 | IN-8041
NEWS: Concentrated Returns Among Micron, SK hynix, and Samsung
The surge in high-performance memory demand underpinning data center Artificial Intelligence (AI) compute has driven vast profits for the primary memory vendors and their suppliers. Demand has dramatically outstripped supply, and efforts to boost manufacturing capacity have yet to close the gap. Three primary vendors are at the center of this cycle:
- Micron (United States): Designs and manufactures Dynamic Random Access Memory (DRAM), NAND flash and High-Bandwidth Memory (HBM) used in leading Graphics Processing Units (GPUs). It is less integrated than its rivals and has relied on partners for some elements of production, such as advanced packaging. The share price has surged almost 35% in the last month.
- SK hynix (South Korea): Designs and manufactures server DRAM, enterprise Solid State Drives (SSDs) and NAND, and HBM. It was the first to market with HBM4 and has experienced a 33% share price increase in the past month.
- Samsung (South Korea): Designs, manufactures, and packages its own DRAM, server DRAM, and HBM. It is slightly behind its rivals on next-generation HBM4 mass production, but is the most vertically integrated of the three. Shares have climbed more than 30% in the past month.
All three vendors supply HBM to NVIDIA, AMD, and other AI silicon challengers, as well as to hyperscalers designing accelerators in-house; HBM is a critical input for the leading data center GPUs that run AI training and inference workloads. Inference demand has spiked recently as enterprises adopt agentic systems to raise worker productivity. The same vendors also supply the DRAM and other critical memory components needed to build AI servers, but much of that capacity has been reallocated to HBM. All three are expanding their manufacturing capacity, including fab construction in the United States, and Micron is also aiming to bring more advanced packaging capabilities in-house. However, such construction projects take around 2 years, and some recently announced sites, such as Micron’s Singapore facility, are only expected to mass produce wafers in 2028.
IMPACT: OEMs and Cloud Providers Won't Absorb Entire Price Increases
Most memory manufacturing capacity is reported to have been dedicated to server modules to meet demand from the data center infrastructure build-out of the past 2 years. The squeeze is clearest in the DRAM market, and it has spilled over into other form factors that rely on DRAM, including DDR4/DDR5 in Personal Computers (PCs) and LPDDR4X/LPDDR5X in mobile devices such as smartphones, laptops, and tablets. Server prices have also been affected, particularly toward the end of 2025: even hyperscalers, which hold the greatest buying power, were unable to have their orders fulfilled in 4Q 2025.
Naturally, this had knock-on effects further along the value chain, as AI server Original Equipment Manufacturers (OEMs) raised prices to cover memory costs, and some GPU cloud instance prices are rumored to have risen as operators pass through higher server and energy costs. Server prices increased by up to 15% around the end of 2025, reflecting the memory-related rise in the Bill of Materials (BOM) across GPUs and accelerators, Central Processing Units (CPUs), and storage. Concurrently, quote validity windows are tightening, with some OEMs honoring pricing for as little as 14 days.
Even before the acute pressure from memory shortages (and the associated price increases), AI-first neoclouds and other cloud operators were looking to repurpose older generations of NVIDIA systems for inference-based services and on-demand cloud instances. This was driven largely by capital constraints, the already prohibitively expensive latest NVIDIA Blackwell and AMD Instinct systems, and concerns about the rate of depreciation of GPU assets from a financing perspective.
RECOMMENDATIONS: Older Systems for Modern Workloads
Now, AI server procurement teams are being advised to bring orders forward with server OEMs and Original Design Manufacturers (ODMs) to get ahead of the supply crunch and lock in pricing and availability for new hardware. But regardless of the availability or price of new servers, older hardware can service today’s inference-heavy workloads, including some fine-tuning of smaller models, as well as Agentic AI. This includes NVIDIA V100, A100, and T4 servers, alongside end-to-end DGX A100 systems. Refurbished, older hardware offers a number of benefits that the market should consider during procurement decisions:
- Circular GPU Economics: An ecosystem of established resellers has formed, including Alta Technologies and NewServerLife in the United States, and TechBuyer and ServerMall in Europe. Most refurbished systems sold through these resellers come with some level of warranty and lifecycle support, similar to new systems.
- Quantization & Software Optimization: Advances in software optimization for older GPU generations, combined with the quantization of many popular AI models, mean that older systems can handle today’s workloads at a viable level of accuracy (see the quantization sketch after this list).
- Smaller, Agentic AI Models: The memory constraints of some smaller Agentic AI workloads are not as acute as those of larger monolithic Large Language Models (LLMs), making deployment on older hardware viable. For more memory-constrained workloads (e.g., iterative loops with high KV cache requirements), high-tier A100 GPUs offer the same amount of Video Random Access Memory (VRAM) as their newer H100 counterparts at 80 Gigabytes (GB); a worked KV cache sizing example also follows this list.
- Classical Machine Learning (ML) & Non-Transformer Networks: Beyond the newest agentic workloads, older GPUs still serve classical AI and ML workloads, which do not require the latest GPUs and accelerators.
- Testing & Development: As enterprises look to test and productize AI in-house (especially in commercially sensitive industries such as healthcare and finance), it is more cost-effective to trial on older hardware before committing to the newest, considerably more expensive systems.
- Heterogeneous Systems: Older GPU servers can sit alongside newer servers to serve a diversity of workloads that do not require the latest GPUs. Such “mixed fleets” can route less latency-critical tasks to older GPUs.
- On-Premises Deployment: Older systems, including the DGX A100, are typically air-cooled, which significantly lowers enterprise installation complexity and lifecycle management burden compared to liquid-cooled systems.
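To make the quantization point concrete, below is a minimal sketch of loading an open LLM in 4-bit precision with Hugging Face Transformers and bitsandbytes so the weights fit in the VRAM of an older GPU. The model name is an illustrative assumption, and kernel support and accuracy impact vary by model and GPU generation.

```python
# Minimal sketch: 4-bit quantized inference on an older GPU.
# Assumes the transformers, accelerate, and bitsandbytes packages are installed;
# the model ID below is illustrative, not a recommendation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # hypothetical model choice

# NF4 quantization roughly quarters weight memory versus FP16,
# often with acceptable accuracy loss for inference workloads.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # places layers on the available GPU(s)
)

prompt = "Summarize the memory supply crunch:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```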
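And to illustrate why the A100’s 80 GB of VRAM matters for KV cache-heavy agentic loops, here is a back-of-the-envelope sizing calculation. The layer and head counts approximate a 70B-class transformer and are assumptions for illustration, not measurements of any specific model.

```python
# Back-of-the-envelope KV cache sizing for a decoder-only transformer.
# The architecture figures below (80 layers, 128-dim heads) approximate a
# 70B-class model and are illustrative assumptions, not measurements.

def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                 seq_len: int, batch: int, bytes_per_elem: int = 2) -> float:
    """Memory for cached keys + values across all layers, in GiB (FP16 by default)."""
    elems = 2 * layers * kv_heads * head_dim * seq_len * batch  # the 2 covers K and V
    return elems * bytes_per_elem / 2**30

# A long agentic loop (32k context) at batch 4 with full multi-head attention:
print(f"{kv_cache_gib(80, 64, 128, 32_768, 4):.1f} GiB")  # 320.0 GiB -> far beyond one GPU
# The same run on a model using grouped-query attention with 8 KV heads:
print(f"{kv_cache_gib(80, 8, 128, 32_768, 4):.1f} GiB")   # 40.0 GiB -> fits on one 80 GB card
```

The takeaway is that the KV cache, not just the model weights, drives VRAM requirements for long agentic loops, which is one reason 80 GB A100s remain serviceable for these workloads.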
The market for used and certified AI hardware is still in its early stages, and several key areas still need addressing, such as identifying the provenance of GPUs and understanding how their usage history affects residual value. For instance, there is a difference between a GPU used for training in an air-cooled environment and one used for lighter inference workloads in a liquid-cooled server. Nonetheless, a sign of the growing commercial interest in (and maturation of) this market is the recent entry of financial services and fintech companies offering structured lending and insurance on used GPUs.
Written by Paul Schell