
Claude Code Max: The Most Lopsided Deal in AI

In Part 1 of this series, I talked about how benchmarks stopped mattering for real work. This post is about the economics behind that work — specifically, what it actually costs to run Claude Code as your primary development tool, and why the Max plan might be the most lopsided deal in AI right now.

The short version: thousands of dollars in API-equivalent compute on my laptop in 14 days. Thousands more on my cloud devbox over 20 days. The monthly API equivalent is staggering. The Max plan is a flat monthly fee. A 40-50x value multiplier.


The Setup

I use ccusage to track my Claude Code token consumption. It reads the local session logs and calculates what the equivalent API cost would have been based on published per-token pricing. This gives you a ground-truth view of how much compute you're actually burning — something Anthropic doesn't surface in the Max plan dashboard because, well, they probably don't want you to see these numbers.
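Conceptually, the calculation is simple: sum the per-type token counts from the session logs and price each type at its per-token rate. Here's a minimal sketch of that idea in Python, with hypothetical per-million-token rates and a simplified log shape (the real ccusage handles per-model pricing, multiple log fields, and more; check Anthropic's published pricing for actual rates):

```python
import json
from collections import Counter

# Hypothetical $/million-token rates, for illustration only.
# Use Anthropic's published pricing for real numbers.
RATES_PER_M = {
    "input_tokens": 15.00,
    "output_tokens": 75.00,
    "cache_creation_input_tokens": 18.75,
    "cache_read_input_tokens": 1.50,
}

def api_equivalent_cost(log_lines):
    """Sum token usage from JSONL session-log records and price it at API rates."""
    totals = Counter()
    for line in log_lines:
        usage = json.loads(line).get("usage", {})
        for kind in RATES_PER_M:
            totals[kind] += usage.get(kind, 0)
    # Convert each token total to millions, multiply by its rate, and sum.
    return sum(totals[k] / 1_000_000 * RATES_PER_M[k] for k in totals)
```

Run that over every session log on the machine and you get the number the Max plan hides: what this month would have cost on metered billing.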

Here's what that looks like in raw numbers.


The Numbers

ccusage tracks daily breakdowns across all models (Haiku, Opus, Sonnet). Over about 20 active days, the numbers tell a clear story: billions of tokens consumed, with API-equivalent costs in the thousands of dollars. On a flat monthly plan.

On two spike days — heavy agentic coding sessions — the single-day API equivalent exceeded the entire monthly plan cost by 6-8x each. That's one day of serious coding burning more compute than the plan costs for a full month.


What's Driving the Cost

The model mix

Most of the cost comes from Opus 4.6 — the most expensive model in the Claude lineup, with premium per-token rates. On the Max plan, Claude Code defaults to Opus for complex tasks and routes to Sonnet or Haiku for simpler operations. You don't think about it. You don't configure it. You just get the best model for every task, always.

On API billing, the economics are different. Every time Claude Code wants to spin up a research agent or do a deep analysis pass, I'd be watching Opus tokens tick up at premium output rates. The rational response is to force Sonnet everywhere and only use Opus for the tasks that truly need it. That's cognitive overhead I don't want.

The spike days

Feb 27 and Mar 2 were both heavy coding days. The cache read numbers on those days are enormous (3.4B and 2.4B tokens respectively) because each agent is reading large chunks of the codebase into context.
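To put the cache traffic in perspective, here's the arithmetic for those two days. The cache-read rate below is an assumption for illustration, not Anthropic's actual price, and cache reads are the cheapest token type there is:

```python
# Assumed cache-read rate in $/million tokens -- illustrative, not Anthropic's published price.
CACHE_READ_RATE_PER_M = 1.50

def cache_read_cost(tokens, rate_per_m=CACHE_READ_RATE_PER_M):
    """Dollar cost of reading `tokens` tokens from prompt cache at the given rate."""
    return tokens / 1_000_000 * rate_per_m

print(cache_read_cost(3_400_000_000))  # Feb 27: 3.4B cache-read tokens
print(cache_read_cost(2_400_000_000))  # Mar 2: 2.4B cache-read tokens
```

Even at a heavily discounted cache rate, a single day's reads alone run into the thousands of dollars, before a single output token is counted.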

This is what agentic coding actually looks like at scale. It's not a human typing prompts and waiting for responses. It's an orchestration system spinning up multiple AI threads that each consume context proportional to the codebase size. The burn rate in this mode bears no resemblance to interactive chat usage.

This is only one machine

The laptop figure is from one machine alone. I also run Claude Code on a cloud devbox for server-side work — roughly 20 active days of usage there, mostly Opus-heavy backend sessions. Conservative estimate: the devbox adds thousands more in API equivalent.

Combined, a single month of my Claude Code usage represents tens of thousands of dollars in API-equivalent compute. For a flat monthly fee.


The Math

Let's make this concrete.

Metric                                   Value
---------------------------------------  -----------------
Max plan monthly cost                    Flat fee
API-equivalent cost (laptop + devbox)    Tens of thousands
Value multiplier                         ~40-50x
Days to break even                       < 1
Heaviest single day vs monthly plan      6-8x

The plan pays for itself within the first day of serious usage each month. Everything after that is effectively free. If you use Claude Code for even 2 focused hours per day on a real codebase, you are almost certainly burning more than the monthly plan cost in API compute by the end of the first week.
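The break-even claim is just a division. The figures below are placeholders, since this post deliberately keeps exact dollar amounts vague; plug in your own plan fee and your own ccusage daily totals:

```python
import math

def days_to_break_even(monthly_fee, daily_api_equivalent):
    """Whole days of usage before API-equivalent spend exceeds the flat fee."""
    return math.ceil(monthly_fee / daily_api_equivalent)

# Placeholder numbers: if one spike day burns 6x the monthly fee
# (the "6-8x" figure from the text), break-even is day one.
fee = 200.0          # assumed flat monthly fee, not the actual price
spike_day = 6 * fee
print(days_to_break_even(fee, spike_day))  # 1
```

Even at a far tamer burn of a quarter of the fee per day, you cross break-even in the first week.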


The Psychological Effect

On API billing, you are constantly aware of cost. Every decision has a price tag. Should I run this agent? That's Opus tokens. Should I do one more research pass? More Opus tokens. Should I let Claude Code explore three different approaches in parallel? That's 3x the tokens. The rational behavior on API billing is to be conservative — use cheaper models, skip the extra research pass, run fewer parallel agents.

That conservatism directly degrades your output quality. The whole point of agentic coding is that the AI can explore broadly, try multiple approaches, and burn compute on your behalf so you don't have to burn time. If you're rationing compute, you're defeating the purpose.

Max plan removes the meter entirely. I never think about which model is being used. I never hesitate to let Claude Code run parallel agents on a complex problem. The AI operates at full capacity because there's no economic pressure to throttle it.

Removing cost anxiety improves output in ways that don't show up in any token count.


Who This Is For

Solo developers and small teams who use AI as a primary development tool. If you're building full products through Claude Code — not just asking it to write a function here and there, but using it as your primary engineering partner with agentic workflows, parallel agents, and deep context — you are almost certainly burning well over the Max plan fee in API equivalent.

People who would otherwise self-censor on model choice. If you find yourself thinking "I should use Sonnet for this to save money" when Opus would give better results, the Max plan fixes that problem permanently.

Anyone burning 1B+ tokens/month. At that volume, the API cost far exceeds the Max plan fee. The crossover point is probably around 500M tokens/month, and if you're doing serious agentic coding, you blow past that in the first week.
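The crossover point depends entirely on your blended $/token rate, which in turn depends on how much of your traffic is discounted cache reads. The rate below is an assumption chosen to show the shape of the calculation, not a measured figure:

```python
# Assumed blended rate across input/output/cache traffic, in $/million tokens.
# Heavy cache-read workloads pull this down; output-heavy workloads push it up.
BLENDED_RATE_PER_M = 0.40

def crossover_tokens(monthly_fee, blended_rate_per_m=BLENDED_RATE_PER_M):
    """Monthly token volume at which API billing matches the flat fee."""
    return monthly_fee / blended_rate_per_m * 1_000_000

print(f"{crossover_tokens(200.0) / 1e6:.0f}M tokens/month")
```

Under these assumptions a $200 fee crosses over around 500M tokens/month; a cheaper blended rate pushes the crossover higher, a pricier one pulls it lower.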

If you use Claude Code casually — a few prompts a day, no agentic workflows, mostly Sonnet-level tasks — the Pro plan is probably fine. Max is for people who treat AI compute as a core production input, not an occasional convenience.


The Bigger Picture

AI compute is being subsidized right now.

Anthropic is not making money on my Max subscription. They're losing money — potentially a lot of it. My usage alone costs them many multiples of what I pay. Multiply that across every power user on the Max plan, and the economics are clearly unsustainable at current pricing.

Why do it? Because they're buying market share. They're building a user base of developers who build their entire workflow around Claude Code. Once you're deeply integrated — your muscle memory, your development process, your team's tooling — switching costs are enormous. This is the classic SaaS playbook, except the subsidy is orders of magnitude larger than normal because AI compute is orders of magnitude more expensive than typical SaaS infrastructure.

This window will not last forever. At some point, Max will either get more expensive, get usage caps, or get tiered in ways that bring the economics closer to break-even. A 40-50x value multiplier is a land-grab price, not an equilibrium price.

The practical move is to exploit this window while it exists. Opus-class AI at effectively unlimited volume for a flat monthly fee — that's not a normal market condition.

I've been treating it accordingly: PanPanMao, Mio v1, now Mio v2 — none of it would have been economically rational at API pricing.

Software is becoming disposable. The agent economy is forming. The builders who move now will have shipped real products on subsidized infrastructure. Their competitors will pay full price for the same capability six months from now.

The meter is off. It won't be off forever.


© Xingfan Xia 2024 - 2026 · CC BY-NC 4.0