H100 GPU rental prices have reversed course. After three years of steady depreciation, the chips now rent for more than they did when they first hit the market. That's not a typo, and it's not a temporary spike. The economics of AI inference just shifted.
The Math That Changed
When H100s launched, the assumption was straightforward - newer chips would always outperform older ones, driving prices down as better hardware arrived. That's how Moore's Law conditioned us to think. Faster silicon makes slower silicon cheaper.
But something unexpected happened. Improved reasoning models and better inference software made existing hardware significantly more valuable. An H100 running optimised inference code today delivers far more useful compute than the same chip did 18 months ago. The silicon didn't change. The software stack around it did.
This matters because it breaks the upgrade cycle that has dominated tech infrastructure planning for decades. If software improvements can extract 2-3x more value from existing hardware, the business case for constantly upgrading weakens. Companies that bought H100s early aren't sitting on depreciating assets - they're holding chips that generate more revenue per hour than when they were purchased.
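To see the arithmetic, put toy numbers on a GPU-hour. The sketch below uses illustrative throughput and pricing assumptions (none of these figures come from the underlying analysis); the point is that revenue per GPU-hour scales directly with serving throughput, so a software speedup reprices the same chip.

```python
# Back-of-envelope: revenue from one GPU-hour, before and after software
# gains. All figures are illustrative assumptions, not measured data.

PRICE_PER_MILLION_TOKENS = 0.50  # hypothetical market rate, USD

def revenue_per_gpu_hour(tokens_per_second: float) -> float:
    """Revenue one GPU earns per hour at a given serving throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return tokens_per_hour / 1_000_000 * PRICE_PER_MILLION_TOKENS

baseline = revenue_per_gpu_hour(1_000)   # H100 on a launch-era stack
optimised = revenue_per_gpu_hour(2_500)  # same H100, 2.5x software speedup

print(f"baseline:  ${baseline:.2f}/GPU-hour")   # $1.80
print(f"optimised: ${optimised:.2f}/GPU-hour")  # $4.50
```

Same silicon, same power draw, two and a half times the revenue per hour. That's the mechanism behind the price reversal.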
Why This Reversal Happened
Two forces converged. First, models and the techniques around them got dramatically better at squeezing performance from available compute. Speculative decoding, quantisation, and model distillation mean you can run more sophisticated models on the same hardware without sacrificing output quality.
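To make one of those techniques concrete, here is a minimal sketch of speculative decoding in its simplest greedy form. The `draft` and `target` callables are hypothetical stand-ins for a small fast model and the large model being served; production systems use an acceptance-sampling variant that preserves the target's full sampling distribution, whereas this sketch only preserves greedy decoding.

```python
# Greedy speculative decoding, minimal sketch. `draft` and `target` are
# hypothetical callables mapping a token sequence to the greedy next token.

def speculative_decode(prompt, draft, target, k=4, max_new_tokens=64):
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposed, ctx = [], list(tokens)
        for _ in range(k):
            t = draft(ctx)
            proposed.append(t)
            ctx.append(t)
        # 2. The target model verifies the proposals. In a real stack all k
        #    verifications happen in ONE batched forward pass, which is where
        #    the speedup comes from; the sketch spells them out sequentially.
        accepted = []
        for t in proposed:
            expected = target(tokens + accepted)
            if expected == t:
                accepted.append(t)         # draft guessed right: a free token
            else:
                accepted.append(expected)  # mismatch: take target's token
                break
        tokens.extend(accepted)
    return tokens
```

The output is identical to decoding with the target model alone; the hardware just spends fewer sequential large-model passes producing it.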
Second, inference software matured. The tooling for deploying and optimising models improved faster than the models themselves. A well-optimised inference pipeline on older hardware now outperforms a poorly optimised setup on newer chips. That shifts the performance bottleneck from silicon to software engineering.
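vLLM is one example of this maturing tooling: continuous batching and paged attention extract far more throughput from a fixed GPU than naive request-at-a-time serving. A minimal offline-inference sketch follows; the model name is a placeholder, and exact flags vary across vLLM versions.

```python
# Sketch of high-throughput serving with vLLM. The model name is a
# placeholder; any HuggingFace-format model the library supports works.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=256)

# Continuous batching schedules all of these requests together, keeping the
# GPU saturated instead of idling between one request and the next.
outputs = llm.generate(["Explain paged attention briefly."] * 100, params)
print(outputs[0].outputs[0].text)
```

The point isn't this particular library; it's that gains of this kind now ship as software, on hardware you already own.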
The result: companies that invested in H100s early and learned how to optimise their inference stacks now hold infrastructure that has appreciated in value. In technology, that's rare.
What This Means for Builders
If you're planning AI infrastructure, this changes the calculation. Waiting for the next generation of chips might not be the optimal strategy if current-generation hardware can deliver more value through better software. The question shifts from "what's the fastest silicon?" to "what's our inference optimisation capability?"
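A toy version of that calculation, with every number an illustrative assumption:

```python
# Toy decision sketch: optimise the stack you own vs. wait for next-gen
# silicon. All numbers are illustrative assumptions.

revenue_per_unit = 10_000   # USD/year per normalised unit of throughput

h100_today    = 1.0         # normalised H100 throughput on the current stack
software_gain = 2.5         # plausible multiplier from stack optimisation
nextgen_gain  = 1.8         # hypothetical next-generation chip speedup
nextgen_capex = 30_000      # hypothetical per-GPU purchase price, USD

optimise = h100_today * software_gain * revenue_per_unit
upgrade  = h100_today * nextgen_gain * revenue_per_unit

print(f"optimise existing H100: ${optimise:,.0f}/yr, zero new capex")
print(f"buy next-gen silicon:   ${upgrade:,.0f}/yr, after ${nextgen_capex:,} capex")
# Under these assumptions, optimisation wins on both revenue and cost, and
# nothing stops the two gains compounding if you eventually do both.
```

Change the assumed multipliers and the answer flips, which is exactly the point: the decision now hinges on your optimisation capability, not on a chip roadmap.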
For businesses already running on H100s, this is validation. The hardware investment isn't depreciating on schedule - it's holding value longer than expected because the software ecosystem around it keeps improving. That extends the useful life of capital expenditure.
For cloud providers, this cuts both ways. If rental prices are rising because each compute hour is more valuable, that's revenue growth from existing infrastructure. But it also gives customers a stronger incentive to optimise their own inference pipelines and buy fewer hours. The relationship between provider and customer shifts when both sides stand to gain from software improvements.
The Broader Pattern
This isn't just about GPUs. It's a signal that AI infrastructure economics are stabilising in unexpected ways. The wild west phase where newer was always better is giving way to a more nuanced market where software optimisation matters as much as hardware specs.
That's actually good news for the industry. It means companies can make longer-term infrastructure investments without worrying that their hardware will be obsolete in 12 months. It means expertise in inference optimisation becomes a competitive advantage. And it means the relationship between compute cost and model capability depends on more than raw silicon speed.
The H100 price reversal isn't an anomaly. It's a sign that the AI infrastructure market is maturing, and that software innovation can drive value as powerfully as hardware advances. For anyone building in this space, that's worth understanding.
Full analysis available at Latent Space.