
Why Is AI Causing Shortages? Inside the Data-Center Surge, the Coming Edge Wave, and How Rand Keeps You Shipping

AI has moved from experiment to ecosystem—and it’s happening faster than traditional supply plans can adapt. Hyperscalers are scaling compute footprints; enterprises are racing to modernize back-end infrastructure; and, increasingly, device makers are preparing for on-device intelligence. Behind the buzzword fog lies a clear operational reality: AI is creating the most sustained, multi-layered hardware shortage of the modern era.

From Pandemic Whiplash to AI Overdrive: Why the Supply Side Is Too Lean

The COVID→LTA→Inventory Hangover Loop

Klein recounts how COVID’s demand shock drove customers to double- and triple-order, forcing component makers to stretch commitments and push Long-Term Agreements (LTAs). “Every customer was so hysterical for parts they started to double and triple order… suppliers couldn’t tell what was real demand,” she notes.

Many OEMs signed multi-year LTAs; when post-pandemic consumption cooled in 2023–2024, product still arrived monthly. OEMs ended up with thin demand, swollen inventory, and near-term cash strain, just as AI infrastructure started to surge.

Utilization Cuts and Deferred Capex

Facing soft demand and bloated channel inventory, component makers cut fab utilization and deferred capital expenditure to protect margins. That prudence made sense right up until AI demand accelerated.

Now the Pendulum Swings—Hard

Today, hyperscaler buildouts are pulling demand forward across servers, storage, networking, power, and advanced memory. Independent analyses echo what we see on the ground. Dell’Oro reported data-center capex up >30% in 2025 (with growth moderating in 2026 but staying structurally high), driven by AI deployments and platform refreshes. (PR Newswire) Dell’Oro also sees data-center semiconductors and components up ~46% in 2025, with NVIDIA leading and HBM-exposed memory vendors (SK hynix, Samsung) surging. Deloitte, meanwhile, flags grid, power, and supply-chain gaps as AI campuses densify. (Deloitte)

Bottom line: utilization cuts and deferred capex left the supply side too lean to absorb AI’s wave. The result is tightening supply, rising lead times, and price pressure wherever AI-critical content concentrates.

The Hyperscaler Engine: How Data Centers Ignite the First Wave

Klein is explicit: “The only sector that was driving growth was hyperscalers—Tencent, Alibaba, Baidu, Amazon, Google, Meta.” The reason is simple: AI training clusters need thousands of accelerators (GPUs/ASICs), enormous memory bandwidth, high-speed interconnects, and robust power/thermal envelopes. That cascade pulls on:

  • Compute: accelerated servers (e.g., Blackwell platforms)
  • Memory: HBM on package; DRAM; caching tiers
  • Storage: NVMe flash for data pipelines; object storage back ends
  • Networking: high-speed fabrics and optics
  • Power/Thermals: PSUs, PDUs, coolers, chillers

The deployments are very real: volume shipments of Blackwell-class systems are underway at leading server OEMs/ODMs, with TrendForce projecting Blackwell >80% of NVIDIA’s high-end GPU shipments in 2025. (Supermicro)

Externally, the funding and scale signal a multi-year, secular buildout. Dell’Oro’s capex view is one datapoint; the broader ecosystem—from power solutions to bespoke silicon—confirms the trajectory, with major deals to provision AI campuses and behind-the-meter energy, such as the Oracle partnership with Brookfield to deploy fuel cells to its AI data centers. (Investors)

Memory Realignment: DDR4 Tightens as Capacity Shifts to HBM

By Q2 2025, the big three memory manufacturers (Samsung, SK hynix, and Micron) announced end-of-life (EOL) plans for DDR4. That sent a shock through the industry.

The core dynamic is unmistakable, even if road maps vary by supplier: wafer and packaging capacity is shifting to HBM (HBM3E→HBM4), the lifeblood of AI accelerators. That reallocation tightens supply elsewhere—particularly legacy DRAM—and introduces price volatility.

  • HBM demand accelerates; shipments projected to surpass 30B Gb by 2026, with HBM4 overtaking HBM3E by late 2026. TrendForce
  • Conventional DRAM (including DDR4) is pricing up—TrendForce cites 8–13% QoQ increases in Q4’25, with HBM-inclusive products up 13–18%; NAND pricing is also firming (a quick compounding check follows this list). Astute Group
  • NAND/SSD: AI inference is pulling enterprise flash; a 2% NAND supply shortfall by 2026 could widen to ~8% if nearline SSDs displace HDDs faster than expected. TrendForce
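To put those quarterly figures in perspective, here is a rough, illustrative compounding check (a sketch, not a forecast): it simply assumes the cited 8–13% QoQ range repeats for four quarters and computes the cumulative effect on a DRAM line item.

```python
# Illustrative only: compounding a quarterly DRAM price increase over one year.
# The 8% and 13% rates echo the QoQ range cited above; assuming they repeat for
# four straight quarters is a simplification for budgeting intuition, not a forecast.
def compounded_increase(qoq_rate: float, quarters: int = 4) -> float:
    """Cumulative price increase after `quarters` of steady `qoq_rate` growth."""
    return (1 + qoq_rate) ** quarters - 1

for rate in (0.08, 0.13):
    print(f"{rate:.0%} QoQ sustained for 4 quarters -> {compounded_increase(rate):.0%} cumulative")
# Roughly +36% and +63% respectively: even "modest" quarterly firmness compounds fast.
```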

What this means for OEMs/EMS:

  • DDR4-dependent platforms risk unexpected tightness (and cost variance).
  • DDR5 is not immune: node transitions and binning for AI use cases can pinch supply for certain densities and speeds.
  • HBM packaging capacity (CoWoS-class) remains a structural bottleneck for the next several years.

The Edge Awakens: Why the Next Leg of Demand Will Be Even Harder

So far, AI’s first wave has centered on training clusters in big iron data centers. The next wave, inference at the edge, is where demand explodes across billions of endpoints.

Klein frames it bluntly: “When AI really gets to the edge, there will have to be a change-out in all the equipment. Think about when the world went from black-and-white TV to color.”

She points to the first full-stack handset as a tipping point: Apple’s “iPhone 17,” built around an Apple ASIC that integrates CPU, GPU, and an NPU and ships with a developer SDK that lights up on-device AI. “It’s the first time in history that a single device and closed ecosystem offers a ready full stack for developers,” she says; “in the next 6–12 months, expect apps that none of us have ever even imagined.”

Whether or not any given device hits that exact profile, the direction is crystal clear across the industry: once compelling on-device AI arrives at scale, every laptop, handset, IoT node, vehicle, and industrial controller begins a hardware refresh. That simultaneously pulls on:

  • Compute changeover: from CPU-centric to heterogeneous (CPU+GPU/NPU)
  • Board/thermal/power redesigns: higher TDPs, denser layouts, stricter SI/PI
  • Memory/storage shifts: higher bandwidth DRAM; more robust flash tiers for local models
  • Connectivity: low-latency fabrics and high-speed I/O at the edge

We want to emphasize that inference at the edge feeds back into cloud training, further increasing data-center workloads.

Why This Shortage Will Last Years (Not Quarters)

This will be the most prolonged shortage in the history of this industry. Suppliers have been cutting capacity for years. It’ll take them three to five years to catch up.

Several forces make this cycle durably tight:

  1. Capex Cycles Are Long by Physics
    Expanding front-end capacity, advanced packaging, and HBM stacks is multi-year work, even with strong balance sheets and government incentives. Dell’Oro sees the 2025 surge moderating in 2026 but remaining structurally high thereafter. (PR Newswire) TrendForce similarly expects cloud service provider (CSP) capex to keep climbing as CSPs secure GPUs and develop ASICs. (TrendForce)
  2. Packaging & Power Are New Chokepoints
    HBM and chiplet architectures elevate advanced packaging (e.g., CoWoS-class) to first-order bottlenecks. In parallel, power delivery and cooling constraints slow campus rollouts; utilities and operators are turning to behind-the-meter solutions to keep timelines. Deloitte
  3. Demand Is Layered: Cloud + Enterprise + Edge
    The training boom isn’t abating (Blackwell/GB300 ramps confirm it), while inference moves into enterprises and devices. TrendForce expects Blackwell to dominate high-end GPU shipments through 2025, with successor roadmaps sampling already. EE Times Asia
  4. Global Realignment & Tariffs Add Friction
    De-risking from China to Southeast Asia reshapes supply nodes, local skills, and qualification cadence—slowing execution during the transition. Klein notes the tariff and trade uncertainty weighing on forecasts and capacity decisions.

Net result: a multi-year constraint regime in which different parts of the stack go short at different times: compute today, memory variants next, then power/thermals or interconnects, creating rolling bottlenecks.

A Reality Check on Power & Sites: Campuses Need More Than Chips

AI doesn’t just consume silicon; it consumes power and real estate. Deloitte highlights the collision between AI campus timelines and grid/headroom constraints, pushing operators toward modular builds, microgrids, and alternative power. (Deloitte) Capital flows confirm the pivot—multi-billion dollar power partnerships are emerging to keep AI “factories” on track despite utility interconnection queues. (Investors)

What this means for you: even if your boards are ready, site-level dependencies (PDUs, switchgear, thermal capacity) can still bottleneck production, field deployment, or acceptance milestones. Treat campus power/thermal hardware as critical path items and plan materials accordingly.

What Makes This Cycle Different (and Riskier) Than 2020–2022

  • It’s not a single spike; it’s layered. Training demand (cloud) plus inference (enterprise/edge) create stacked pull across categories with different lead times.
  • Packaging and power have joined wafers as first-order constraints.
  • Financing is evolving. Analysts are flagging the increasing use of debt and complex structures to fund AI infrastructure—a sign the boom is now testing financial resilience as well. (Even if your company isn’t financing data centers, that systemic pressure can ripple into upstream suppliers and raise delivery risk.) Financial Times

Translation: volatility will migrate around the stack; yesterday’s green light can be tomorrow’s red flag. Your procurement posture has to be dynamic.

We’re going to build out all those capabilities… We’re going to accelerate, expand, and take our seat at the big table.

We leverage real-time market analytics, historical cross-cycle patterns, and program-level visibility to pre-position parts we know will gate builds, so you’re not bidding at the last minute. Practically, that means faster turns, better coverage by commodity, and deeper regional reach, all pointed at your most time-sensitive material needs.
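As a concrete illustration of the idea, the sketch below flags parts whose quoted lead time overruns the build that needs them, which is the basic trigger for pre-positioning. The data model, part numbers, and buffer are hypothetical placeholders, not Rand’s actual analytics.

```python
# Hypothetical lead-time-vs-need-date check for build-gating parts.
# Field names, part numbers, and the 4-week buffer are illustrative assumptions.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class LineItem:
    part_number: str
    commodity: str        # e.g., "DDR4", "VRM", "optics"
    lead_time_weeks: int  # currently quoted lead time
    need_date: date       # when the build consumes the part

def gates_build(item: LineItem, today: date, buffer_weeks: int = 4) -> bool:
    """Flag parts whose quoted lead time plus a safety buffer overruns the need date."""
    projected_arrival = today + timedelta(weeks=item.lead_time_weeks + buffer_weeks)
    return projected_arrival > item.need_date

bom = [
    LineItem("DDR4-MODULE-X", "DDR4", lead_time_weeks=26, need_date=date(2026, 3, 1)),
    LineItem("400G-OPTIC-Y", "optics", lead_time_weeks=14, need_date=date(2026, 9, 1)),
]
at_risk = [i.part_number for i in bom if gates_build(i, today=date(2025, 12, 1))]
print("Pre-position candidates:", at_risk)  # -> ['DDR4-MODULE-X']
```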

A C-Suite Playbook: What to Do Now

If you’re leading procurement, supply chain, or hardware product at an OEM/EMS, here’s a concise action plan we recommend—and help execute:

  1. Triage Your AI Exposure
    • Identify which programs are exposed to AI-sensitive content (accelerators, DDR families, NVMe, VRMs, high-current passives, high-speed interconnects); a minimal BOM-triage sketch follows this list.
    • Flag legacy DDR4 dependencies; stage bridge supply or accelerated DDR5 qualifications where feasible. (Expect DRAM price firmness; plan accordingly.) Astute Group
  2. Use Quality as a Velocity Multiplier
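Referenced in item 1 above, here is a minimal, hedged sketch of what that triage could look like in practice. The commodity keywords and CSV layout are assumptions for the illustration; adapt them to your own BOM schema.

```python
# Illustrative BOM triage: flag line items exposed to AI-sensitive commodities
# and legacy DDR4 dependencies. Keywords and the CSV column names are assumptions.
import csv

AI_SENSITIVE = {"accelerator", "gpu", "hbm", "ddr4", "ddr5", "nvme", "vrm",
                "high-current passive", "high-speed interconnect", "optics"}

def triage_bom(path: str):
    """Scan a BOM CSV with 'part_number' and 'commodity' columns; return
    (ai_exposed, ddr4_dependent) part-number lists."""
    ai_exposed, ddr4_dependent = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            commodity = row["commodity"].strip().lower()
            if commodity in AI_SENSITIVE:
                ai_exposed.append(row["part_number"])
            if commodity == "ddr4":
                ddr4_dependent.append(row["part_number"])
    return ai_exposed, ddr4_dependent

# Example usage (file name is hypothetical):
# exposed, legacy_ddr4 = triage_bom("program_bom.csv")
```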

Looking 6–12–24 Months Out: What We Expect

0–6 months (now):

  • AI server ramps intensify; spot pockets of DDR4 tightness persist; NAND/SSD pricing firms; intermittent squeezes on power/thermal content. TrendForce
  • Hyperscaler capex cadence remains robust; more campus power workarounds materialize to keep timelines. PR Newswire

6–12 months:

  • The first wave of edge-AI endpoints and enterprise inference stacks begins stimulating refreshes across boards and A-commodities (connectors, passives, PMICs).
  • HBM packaging remains a bottleneck; HBM4 sampling/early ramps are beginning to reposition road maps. TrendForce

12–24 months:

  • Rolling constraints migrate around the stack: networking optics today, VRMs tomorrow, thermals the next quarter. Plan for rotation, not resolution.
  • Capacity expansions and new sites reduce some pressure, but structural tightness persists as demand layers (cloud + enterprise + edge) continue to compound. Light Reading

The Supply Chain’s Defining Cycle

The AI revolution is not just software—it’s atoms: wafers, packages, boards, connectors, memory, power. And for the first time, AI demand is compounding across data centers and devices.