‘Tokenmaxxing’ is dead: How the hardware shortage is blowing up AI budgets - SMH.com.au - News Bunkers

June 29, 2026 — 4:58pm
Tech giant Google has begun capping access to its Gemini artificial intelligence models, the clearest signal yet that the global compute squeeze is killing off the “tokenmaxxing” era and forcing the cost discipline Australian boardrooms have so far failed to impose on their own AI rollouts.
Tokens are the tiny, standardised units of data that AI models process for every task. Last year they gave rise to a tech-industry fad, “tokenmaxxing”, in which companies that treated token use as a proxy for productivity pushed staff to consume as many as possible.
But companies such as Uber and Meta have now stepped away from the practice as the cost of token usage has soared to astronomical levels, and capacity constraints have forced companies such as Google to curb sales.
The Financial Times reported this week that Google told Meta around March it could not provide all the Gemini capacity the social media giant wanted to buy, with the cap still in place and other Google customers also affected. Meta is one of the world’s largest enterprise AI customers; that even it cannot get all the compute it wants signals how severe the global shortage has become.
New Australian research from search firm Elastic shows the squeeze is already changing how local businesses think about AI spend. One in three Australian organisations exceeded their AI budget last financial year and 32 per cent have paused, cancelled or wound back deployments because the cost could not be justified.
Nvidia chief executive Jensen Huang said in March he would be “deeply alarmed” if a developer paid $US500,000 ($724,000) spent less than $US250,000 on tokens. Meta engineers reportedly consumed more than 60 trillion tokens in 30 days, an outlay one estimate put at roughly $US900 million, and Uber’s chief technology officer said in April the ride-share company had burnt through its full-year AI budget in four months.
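To put those figures in perspective, the rough arithmetic can be sketched as below. This is a back-of-envelope illustration only: the function names are hypothetical, the inputs are the reported estimates quoted above, and the result is a blended average rather than any provider's actual price list.

```python
# Back-of-envelope arithmetic on the figures reported in the article.
# Function names are illustrative; inputs are the article's estimates.

def implied_price_per_million(total_cost_usd: float, total_tokens: float) -> float:
    """Average cost in US dollars per one million tokens."""
    return total_cost_usd / (total_tokens / 1_000_000)


def burn_multiple(months_to_exhaust_annual_budget: float) -> float:
    """How many times faster than planned a 12-month budget is being spent."""
    return 12 / months_to_exhaust_annual_budget


# Meta: roughly $US900 million for more than 60 trillion tokens in 30 days.
meta_price = implied_price_per_million(900_000_000, 60_000_000_000_000)

# Uber: full-year AI budget spent in four months.
uber_multiple = burn_multiple(4)

# Huang's benchmark: $US250,000 of tokens for a $US500,000 developer.
huang_ratio = 250_000 / 500_000

print(f"Implied Meta price: ${meta_price:.2f} per million tokens")
print(f"Uber burn rate: {uber_multiple:.1f}x the planned pace")
print(f"Huang's token spend as share of pay: {huang_ratio:.0%}")
```

On those reported numbers, Meta's usage works out to about $US15 per million tokens, Uber was spending at three times its budgeted pace, and Huang's benchmark implies token spend equal to half a developer's pay.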
Since then, Amazon and Meta have deleted internal leaderboards for token use. “Please don’t use AI just for the sake of using AI,” senior vice president David Treadwell told staff in May. That month, Uber’s chief operating officer told a podcast that the “link is not there yet” between token use and genuine productivity.
The hardware to sustain that pace does not exist. Memory chipmakers SK Hynix, Samsung and Micron have sold out most of their supply of the high-bandwidth memory AI models depend on, while rental prices for Nvidia’s older H100 graphics processing units are up about 30 per cent since November.
To survive the squeeze, engineering teams are increasingly abandoning massive foundation models in favour of specialised small language models and other alternatives that can be hosted locally at a fraction of the cost.
The scramble to build physical infrastructure has triggered an unprecedented land grab. Australian-founded Firmus, planning an ASX listing valuing it at up to $12 billion, said on Monday it had signed a deal with Nvidia to build AI data centre capacity in Indonesia that it forecasts will generate $US25 billion to $US30 billion in revenue over its first six years.
David Alonso, the national AI market lead at Deloitte Australia, said model providers had shifted from licence and subscription pricing to pricing based on consumption, removing the implicit subsidy that had made AI feel cheap. “It’s the end of the era of AI subsidy,” Alonso told this masthead. “Tokenmaxxing … becomes now more of a problem in itself.”
Elastic country manager Jeremy Pell said the squeeze would force companies to control their AI costs. “Because demand is outstripping the physical infrastructure the basic laws of economics will take over, and token costs are inevitably going to rise,” he said.
Alonso said the shift did not mean Australian companies would spend less on AI overall, however. “If anything, they will … keep growing and … your cost line is highly likely to still go up,” he said. “But it’s now that clear need to link this to value.”
The survey, conducted by Pure Profile and commissioned by Elastic, suggests local businesses are not yet measuring whether their AI spend delivers. Only 8 per cent of decision-makers track AI’s contribution to revenue or cost savings. Yet half of them plan to increase AI spend over the next 12 months, with 32 per cent saying they will only do so with clearer proof of value.
“Over the next 12 months we predict a massive shift from AI usage to strict AI accountability,” Pell said. “The era of evaluating success by how busy your usage dashboards look is officially coming to a close.”
Alonso said Australia had a “window of opportunity” of about two years to attract investment in domestic data centre capacity that would give Australian businesses local compute and more control over their token costs.