Theo from t3.gg just proved GitHub's usage metering has a serious flaw. He consumed $46,000 worth of Copilot tokens for a $40 payment. Not through a hack - through normal API usage that the platform failed to rate-limit properly.
The exploit video walks through the mechanics, but the core issue is simple: GitHub's billing system couldn't keep up with actual token consumption. Send enough requests fast enough, and the metering falls behind. The platform let him consume far more tokens than his plan paid for, at volumes that should have triggered throttling or account suspension.
Why This Matters Beyond One User
This isn't about Theo gaming the system - he disclosed it publicly. It's about what the flaw reveals. If one developer can consume 1,150x what they paid ($46,000 of usage against a $40 payment), the accounting infrastructure underneath isn't production-ready.
Every SaaS company building on LLM APIs faces the same problem. Per-token prices look negligible on a single request, but cost grows in direct proportion to volume. If your metering lags behind actual usage, you're exposed. A single user with a misconfigured loop could run up more cost than you can cover before the bill arrives.
GitHub has the resources to absorb a $46,000 loss. Most startups don't. This is the kind of vulnerability that kills early-stage companies - the one where your infrastructure costs spike faster than your revenue model can handle.
The Rate-Limiting Gap
Rate limits exist to prevent exactly this scenario. But rate-limiting at the API gateway level doesn't help if the underlying billing system can't enforce quotas in real time. Theo's exploit worked because the checks happened too late - after the tokens were already consumed.
This is a distributed systems problem. When usage spikes, metering systems lag. The lag is usually fine - a few seconds of drift doesn't matter for most products. But with token-based billing, where costs scale instantly with usage, that lag becomes a financial exposure.
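The size of that exposure can be estimated with simple arithmetic: tokens consumed per second, times the length of the lag, times the token price. The sketch below is illustrative only. The function name and its parameters are hypothetical, and the numbers in the usage note are made up for the example, not taken from the Copilot incident.

```python
def metering_exposure_usd(tokens_per_second, usd_per_million_tokens, lag_seconds):
    """Cost of the tokens consumed while metering is still catching up.

    All three inputs are assumptions supplied by the caller.
    """
    unmetered_tokens = tokens_per_second * lag_seconds
    return unmetered_tokens / 1_000_000 * usd_per_million_tokens
```

With hypothetical inputs of 50,000 tokens per second, $10 per million tokens, and a five-minute lag, `metering_exposure_usd(50_000, 10.0, 300)` gives 150.0, meaning $150 of usage is consumed before any quota check could see it. The exposure grows linearly with both the request rate and the lag.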
The fix isn't simple. Real-time usage tracking at scale is hard. You need low-latency reads, consistent writes across distributed systems, and graceful degradation when parts fail. From the outside, GitHub's problem looks like a product that shipped before the billing infrastructure could support it.
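One common way to close the gap between consumption and enforcement is to reserve a request's worst-case cost before doing the work, then settle to the actual cost afterwards. The following is a minimal single-process sketch of that idea. It is not GitHub's design, and `QuotaLedger`, `QuotaExceeded`, and their methods are hypothetical names. A real deployment would need a shared store with atomic updates in place of the in-memory lock.

```python
import threading


class QuotaExceeded(RuntimeError):
    """Raised when a reservation would push usage past the limit."""


class QuotaLedger:
    """In-memory ledger that holds worst-case cost before work starts."""

    def __init__(self, limit_tokens):
        self._limit = limit_tokens
        self._settled = 0    # tokens confirmed as consumed
        self._reserved = 0   # tokens held for in-flight requests
        self._lock = threading.Lock()

    def reserve(self, max_tokens):
        """Hold max_tokens up front; refuse if the quota cannot cover it."""
        with self._lock:
            if self._settled + self._reserved + max_tokens > self._limit:
                raise QuotaExceeded("request would exceed quota")
            self._reserved += max_tokens
        return max_tokens

    def settle(self, reserved, actual_tokens):
        """Release the hold and record what the request really used."""
        with self._lock:
            self._reserved -= reserved
            self._settled += actual_tokens

    def remaining(self):
        """Tokens still available after settled usage and open holds."""
        with self._lock:
            return self._limit - self._settled - self._reserved
```

Because the check happens in `reserve`, before any tokens are generated, a burst of concurrent requests cannot outrun the quota: each one either gets a hold or is rejected. The trade-off is that holds are pessimistic, so some capacity sits idle until requests settle.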
What Builders Should Learn
If you're building on token-based APIs, put hard limits in your own code. Don't rely on the provider's metering alone. Track usage client-side and fail fast when you hit a threshold. It's the one control you fully own, and the most dependable protection against runaway costs.
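A client-side guard of this kind can be small. The sketch below assumes you can estimate or read back a token count for each request. `TokenBudget` and `BudgetExceeded` are hypothetical names, not part of any provider SDK. It enforces a hard lifetime cap plus a sliding-window rate cap, and refuses a request before it is sent.

```python
import time
from collections import deque


class BudgetExceeded(RuntimeError):
    """Raised when a request would push usage past a configured limit."""


class TokenBudget:
    """Client-side guard: a hard total cap plus a sliding-window rate cap."""

    def __init__(self, max_total_tokens, max_tokens_per_window,
                 window_seconds=60.0, clock=time.monotonic):
        self.max_total_tokens = max_total_tokens
        self.max_tokens_per_window = max_tokens_per_window
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the guard can be tested
        self._total = 0
        self._recent = deque()   # (timestamp, tokens) pairs inside the window

    def _window_usage(self, now):
        # Drop entries that have aged out of the window, then sum the rest.
        while self._recent and now - self._recent[0][0] >= self.window_seconds:
            self._recent.popleft()
        return sum(tokens for _, tokens in self._recent)

    def charge(self, tokens):
        """Record tokens before sending a request, or raise without recording."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        now = self._clock()
        if self._total + tokens > self.max_total_tokens:
            raise BudgetExceeded("total token budget exhausted")
        if self._window_usage(now) + tokens > self.max_tokens_per_window:
            raise BudgetExceeded("per-window token limit reached")
        self._total += tokens
        self._recent.append((now, tokens))

    @property
    def total(self):
        return self._total
```

Call `budget.charge(estimated_tokens)` immediately before each API request and let `BudgetExceeded` propagate, so a misconfigured loop stops at the first refused call instead of retrying. The limits themselves are yours to choose. Nothing here depends on the provider reporting usage correctly.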
The broader lesson: any system that bills based on consumption needs real-time usage tracking as a first-class concern, not something bolted on afterwards. Theo's exploit shows that even large platforms get this wrong. If GitHub can't meter Copilot usage accurately, don't assume your provider can.
For businesses evaluating AI tools, this is a risk assessment question. How confident are you in the provider's billing accuracy? What happens if your team accidentally triggers exponential usage? Do you have client-side controls in place? Because the provider's controls might not catch it in time.
The $46,000 gap between payment and consumption isn't just a bug. It's a signal that the infrastructure layer underneath LLM products is still immature. Usage-based billing at AI scale is a harder problem than most platforms realised when they launched.