The Architecture of Scarcity

Last week, all the AI hyperscalers reported their earnings on the same day and each presented the same underlying tone – one of scarcity. Sundar Pichai, Alphabet’s CEO, said, “We are compute constrained in the near term,” and Amazon noted that capacity is being monetised as fast as it can be installed. Microsoft expects to remain constrained through the remainder of 2026, while Meta raised its spending forecast and cited the cost of components it cannot get fast enough.

 

Together, Alphabet, Amazon, Meta, and Microsoft have committed to spending more than US$700 billion on AI infrastructure in 2026. That is almost double the amount spent in 2025, and approximately US$100 billion more than their own estimates in the prior quarter. Amazon has committed US$200 billion and Microsoft US$190 billion, up 61% over the prior year. Alphabet is expecting to land between US$180 billion and US$190 billion. Meta lifted its range to between US$125 billion and US$145 billion, up US$10 billion on its prior guidance, citing higher component costs.
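As a rough sanity check, the sketch below simply adds up the individual commitments quoted above. The figures and ranges are those cited in this article and nothing further is assumed; taking the midpoints of the ranged guidance, the combined total lands at roughly US$710 billion, consistent with the headline figure.

```python
# Rough reconciliation of the individual 2026 AI capex commitments cited above
# against the "more than US$700 billion" headline. Figures are as quoted, in
# US$ billions; where guidance is a range, both ends are carried through.

commitments = {
    "Amazon":    (200, 200),
    "Microsoft": (190, 190),
    "Alphabet":  (180, 190),
    "Meta":      (125, 145),
}

low = sum(lo for lo, _ in commitments.values())
high = sum(hi for _, hi in commitments.values())
print(f"Combined 2026 AI capex: US${low}bn to US${high}bn")  # -> US$695bn to US$725bn
```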

 

The signal across the hyperscalers is staggering and consistent: demand for tokens outstrips supply, and physical scarcity is creating a barrier at a time of accelerating adoption.

 

A chain of constraints

Since ChatGPT arrived in late 2022, the AI infrastructure build-out has been a sequential story of constraints. Each bottleneck, once partially resolved, has revealed the next one waiting behind it. For investors who read that sequence, it has been consistently rewarding.

 

It started with GPUs and LLM training, with Nvidia’s H100 on allocation and waitlists growing as the tempo of development accelerated. The speed of change, the scale of the buildout, and the market’s fixation on whether this was a bubble hid the constraints emerging behind it: advanced packaging, high-bandwidth memory, data centre construction, land, permitting, labour, and building materials – not forgetting power, grid interconnections, transformers, high-voltage switchgear, and optical interconnects.

 

All benefitted from orders and capital flow. Even the CPU, long dismissed as an old technology with limited opportunity, re-emerged as a genuine beneficiary as inference exploded and workloads proved far more processor-intensive than most originally expected.

 

The question now is whether that sequence has run its course or whether this is structural and set to endure.

 

The gap between intent and execution

With more than US$700 billion of committed capital waiting to be spent, the question we now face is whether it can actually be deployed: only five gigawatts (GW) of the 12GW to 16GW of US data centre capacity previously announced for completion in 2026 is under construction.

 

That gap between announcement and construction is widening, and whether future projects will be delayed or cancelled outright remains to be seen, but it is a concern. The US wants to be at the forefront of the AI revolution, but delays in compute deployment will hamper that ambition and perhaps cede ground to others facing less inertia.

 

Last month, Maine came close to becoming the first US state to say no to the ongoing and aggressive buildout of data centres. Its legislature passed a bill that would have frozen approvals for new large data centres until October 2027 while the state worked out their impact on power bills, water consumption, the environment, and the electrical grid. In a surprise move, however, Governor Janet Mills vetoed it, on the grounds that the state badly needs the jobs such projects create.

 

The reasons for the slow pace of deployment are not exotic: transformers and switchgear carry lead times approaching three years, and tariffs have compounded the supply constraints, especially in areas predominantly reliant on China, which supplies over 40% of battery imports and approximately 30% of transformer and switchgear capacity. Permitting processes that were slow before the AI boom have not accelerated to match, and significant, well-funded opposition groups are now active across many US states. Several states are considering construction moratorium legislation.

 

The companies driving the revolution are not short of capital or the conviction to spend, but with the hyperscalers committing the bulk of what will amount to over US$1 trillion in capex this year, the deployment bottleneck may be difficult to overcome. With that, investor concern surrounding lofty expectations may grow.

 

Digital scarcity

With demand exploding from consumers and now the enterprise, that physical shortage is creating a second form of scarcity layered on top of it. As recently as mid-2025, AI model providers built their businesses on the assumption that, at least for a while, usage would coalesce around conversational queries of a few hundred tokens at a time, with a human at the keyboard. One technological fork in the road changed all of that, and the cornerstone assumption has been overtaken by events.

 

Agentic AI, where models write code, browse the web, and execute multi-step tasks autonomously, consumes thousands of tokens per session and runs continuously rather than sporadically. Visa consumed two trillion tokens in a single month, after consuming one trillion the month before, and Uber exhausted its annual AI budget in three months.

 

This change has happened so fast that the pricing model of the AI industry as a whole has broken. Flat-rate pricing that balanced consumption across all users was designed at a time when each AI use case was under close scrutiny; it was not designed for a world of agents running 24/7.

 

Think of it as a party: you invite 50 people but cater for 30, expecting 40% to drop out, yet the drawcard is so great that they all arrive. Worse, some bring a plus-one! The AI model providers face two challenges: the first is the deployment of capital; the second, at least in the short term, is how to engineer controlled demand destruction to bring utilisation back to something like balance.

 

Anthropic has raised prices to enterprise customers and shifted away from flat-rate billing toward consumption-based contracts, with model performance reported to have been throttled for lower-tier users and rationed toward higher-end, paid workloads. OpenAI has temporarily shut down its Sora video generation tool. The cost to serve a frontier model at current usage levels remains substantially higher than the market prices used to acquire customers, and the subsidy is under growing pressure as usage accelerates and the economics worsen.

 

The demand destruction dynamic is visible through Cloudflare, a close partner of several major model providers, which disclosed that it had pivoted its internal AI development tools from a leading proprietary model to an open-source alternative, cutting costs by 77% on a single high-volume workflow. 

 

By way of a description, at its recent ‘Agents Week’ event Cloudflare said, “If the more than 100 million knowledge workers in the US each used an agentic assistant at ~15% concurrency, you’d need capacity for approximately 24 million simultaneous sessions. At 25–50 users per CPU, that’s somewhere between 500K and 1M server CPUs – just for the US, with one agent per person.

 

“Now picture each person running several agents in parallel. Now picture the rest of the world with more than 1 billion knowledge workers. We’re not a little short on compute. We’re orders of magnitude away.”
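A back-of-envelope sketch of that arithmetic is below. The session count, users-per-CPU range, and global knowledge-worker base are taken straight from the quote; the agents-per-person multiplier is a purely illustrative assumption.

```python
# Back-of-envelope sketch of the capacity arithmetic in Cloudflare's quote.
# Inputs are the figures quoted above, treated as illustrative assumptions
# rather than measured data.

us_simultaneous_sessions = 24_000_000            # ~24M concurrent agent sessions (quoted)
users_per_cpu_low, users_per_cpu_high = 25, 50   # sessions served per server CPU (quoted range)

# Server CPUs needed just for the US, one agent per knowledge worker
cpus_high = us_simultaneous_sessions / users_per_cpu_low    # fewer users per CPU -> more CPUs
cpus_low = us_simultaneous_sessions / users_per_cpu_high    # more users per CPU -> fewer CPUs
print(f"US, one agent each: {cpus_low:,.0f} to {cpus_high:,.0f} server CPUs")
# -> roughly 480,000 to 960,000, i.e. the "500K and 1M" in the quote

# Rough scale-up: several agents per person and a global knowledge-worker base
agents_per_person = 3     # hypothetical multiplier
global_vs_us_workers = 10 # >1 billion knowledge workers worldwide vs >100 million in the US
print(f"Global, {agents_per_person} agents each: "
      f"{cpus_low * agents_per_person * global_vs_us_workers:,.0f} to "
      f"{cpus_high * agents_per_person * global_vs_us_workers:,.0f} server CPUs")
```

Even on these rough numbers, the jump from under a million server CPUs to tens of millions shows why Cloudflare frames the shortfall in orders of magnitude rather than percentages.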

 

What it means

For the hyperscalers with owned compute, scarcity is largely good news, and the market has been surprised by just how good. Cloud operating margins are expanding rather than contracting, the opposite of what had been expected as depreciation from massive capital spending hit the income statement. AWS operating margins reached 37.7% and are moving higher, while Google Cloud’s hit 32.9%. Pricing power is real for those that have the capacity to allocate compute, especially where their own silicon provides a significant cost advantage.

 

For those without their own compute and dependent on rented infrastructure, the picture is considerably harder. They are now at the pricing whim of their compute providers, and this creates a ceiling on growth, forces difficult choices about which customers and workloads to prioritise, and exposes pricing commitments – written at an earlier time when compute was plentiful – to painful renegotiation.

 

For investors, the central question is whether this cycle follows the pattern of previous technology infrastructure buildouts, where shortages eventually resolve into oversupply and compress margins, or whether the demand for AI compute is genuinely durable enough to keep the cycle tight for longer than prior analogues suggest. For example, in prior cycles, memory companies would have aggressively expanded capacity, leading to oversupply, but there is little evidence of ill-discipline at the moment.

 

The constraints on doing so are everywhere, and the companies supplying into the buildout – the chip makers, the power infrastructure names, the optical networking players and the like – are generating cash at levels rarely seen in their histories. The hyperscalers providing that cash waterfall, however, are watching their own free cash flow compress under the weight of commitments that show no sign of slowing.

 

The evidence from last week’s earnings, at least for now, sits firmly on the side of duration. Cloud growth rates are accelerating, backlogs are expanding, and the companies building this infrastructure are not blinking.

 

What is less certain is whether the physical world can keep pace with the ambition. This is not a repeat of the dotcom bubble, where dark fibre had no market; in this revolution, demand is insatiable. If anything, the constraints are more basic: not just chips and other hardware, but the speed at which concrete sets.

 

There is no particular reason to think we are nearing the end game, at least not yet, but investor expectations are high and any demand weakness will manifest itself in share prices, and quickly.

 

Tim Chesterfield is CIO of the Perpetual Guardian Group and the founding CIO and Director of its investment management business, PG Investments. With $2.8 billion in funds under management and $8 billion in total assets under management, Perpetual Guardian Group is a leading financial services provider to New Zealanders.

 

Disclaimer

Information provided in this publication is not personalised and does not take into account the particular financial situation, needs or goals of any person. Professional investment advice should be taken before making an investment. The information provided in this article is not a recommendation to buy, sell, or hold any of the companies mentioned. PG Investments is not responsible for, and expressly disclaims all liability for, damages of any kind arising out of use, reference to, or reliance on any information contained within this article, and no guarantee is given that the information provided in this article is correct, complete, and up to date.
