The AI industry is burning capital at a scale with few historical parallels. At the same time, token prices have dropped 280× in roughly 18 months[1]. Both are true. Sort them out, and you understand why the next three years stay historically cheap for AI users — and why caution comes after. Let's start with your own bill.
01 · Your bill
What your $100 to Anthropic likely actually costs
You pay $100. You likely caused costs of ~$300–600[△].
Estimate for a heavy user on Claude Max ($100/month, "5× Pro" tier) working with Claude Code every day. Basis: Epoch AI Cost Analysis 2025[7] and published inference cost data. Given as a range because no official unit economics exist.
This subsidy is finite. Anthropic internally projects turning cash-flow positive in 2027/2028 (per investor materials, not audited[9]); OpenAI projects the same for 2029/2030. So two to four more years of subsidy look reasonably safe. After that, providers have to start pricing for real.
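To make the size of that subsidy concrete, here is a minimal sketch of the arithmetic. The $300–600 cost range is the article's own estimate, not an official provider figure, and the function name is purely illustrative.

```python
# Illustrative arithmetic only: the cost range is an estimate,
# not an official figure from any provider.
PRICE_PAID = 100.0               # USD per month for the subscription
EST_COST_RANGE = (300.0, 600.0)  # USD per month, estimated inference cost


def subsidy(price: float, cost: float) -> tuple[float, float]:
    """Return (absolute subsidy in USD, share of the cost the user pays)."""
    return cost - price, price / cost


for cost in EST_COST_RANGE:
    gap, share = subsidy(PRICE_PAID, cost)
    print(f"cost ${cost:.0f}: subsidy ${gap:.0f}/month, user pays {share:.0%} of cost")
```

Under these assumptions the user covers somewhere between a sixth and a third of what the usage costs to serve.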
A paradox at first glance: why does the $20 base tier stay stable when the subsidy ends? The answer is tier segmentation. Per-token inference cost drops about 10× per year, per Sequoia[5]. In two years, the real cost of GPT-3.5-class operations is a fraction of today's. The $20 list price can hold steady — with tighter token caps — and still turn profitable. Providers need the budget tier as top-of-funnel.
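A rough sketch of why a flat list price can turn profitable: if the per-token cost really falls about 10× per year (the Sequoia figure cited above, treated here as a constant rate, which is an assumption), the cost of serving the same workload shrinks quickly.

```python
def relative_cost(years: float, drop_per_year: float = 10.0) -> float:
    """Cost as a fraction of today's cost after `years` of steady decline."""
    return 1.0 / (drop_per_year ** years)


for years in (1, 2, 3):
    print(f"after {years} year(s): {relative_cost(years):.1%} of today's cost")
```

A steady 10× rate is unlikely to hold forever, so read the later years as an upper bound on the savings rather than a forecast.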
The heavy-use tier — where you land as a serious AI user — gets more expensive. Claude Pro has already tightened limits. ChatGPT Plus throttles o3 access. The "Max" and "Ultra" tiers at $100–$200 are the new normal. What costs $100 today either costs $100 in 2028 with dramatically fewer tokens included, or $200–$300 for the same depth of use.
If you want to know why the subsidy exists and why it has to end, look at the macro numbers next.
02 · Scale
Big 4 are burning $700B, against $50B in industry revenue
A single NVIDIA GPU runs $25,000 to $40,000[14]. The big hyperscalers buy them by the hundreds of thousands. A modern AI datacenter with liquid cooling costs $20–30M per of installed power[15], with hardware on top. The ratio of investment to revenue is absurd. Here are the three hardest contrasts:
03 · Token price collapse
What you pay keeps going down
Here's the good news, and it's surprisingly good. Stanford HAI's AI Index 2025[1] documents a 280× price collapse for GPT-3.5-class performance in roughly 18 months, from November 2022 to October 2024. Put differently: what cost you $20 per million tokens at the end of 2022 costs 7 cents today, for the same performance.
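The figures above can be sanity-checked with a few lines. This is a sketch using only the two prices and the "roughly 18 months" span quoted in the text; the implied halving time assumes a smooth exponential decline, which is a simplification.

```python
import math


def price_collapse(old_price: float, new_price: float, months: float) -> dict:
    """Summarize a price drop: fold change, percent decline, implied halving time."""
    fold = old_price / new_price
    return {
        "fold": fold,
        "decline_pct": (1 - new_price / old_price) * 100,
        # Assumes a constant exponential decline over the whole span.
        "halving_months": months / math.log2(fold),
    }


stats = price_collapse(old_price=20.0, new_price=0.07, months=18)
print(f"{stats['fold']:.0f}x cheaper, {stats['decline_pct']:.2f}% decline, "
      f"price halves every {stats['halving_months']:.1f} months")
```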
One thing matters for the chart below: every value is in the same unit — US dollars per million input tokens, measured against the same performance bar (GPT-3.5 level on the benchmark). Otherwise you're comparing apples to oranges.
What's interesting is what's happened since the end of 2024: the price for raw GPT-3.5-class performance has hit a floor — roughly 5 to 10 cents per million tokens, and not much room left to fall. What keeps shifting dramatically: what you get for that price. Models like Gemini 2.0 Flash, GPT-4.1 Nano, or DeepSeek V3.2 run around 10 cents per million input tokens today[13] and deliver well above GPT-3.5-class output — roughly on par with original GPT-4 (which still cost $30 in May 2023). Performance per dollar has multiplied again, even though the headline token price stays flat.
What does performance cost today? Top-tier models like Claude Opus 4 or GPT-5 still run $3 to $15 per million input tokens[13]. But these models do reasoning, multi-hour coding sessions, and tool use — none of which was possible in 2023. The market is segmenting: mass usage turns commodity, frontier stays valuable.
04 · Bulls vs. bears
Is AI changing the world, or is it the next bubble?
I've collected the strongest arguments on both sides — only with numbers from primary sources. No speculative "might be".
05 · Investment paradox
Seven cents of revenue per dollar invested
Goldman Sachs put it well[3]: in 2025, roughly $350B flows into AI infrastructure (the Big 4 alone)[6], and AI-native inference revenue (OpenAI, Anthropic, etc., by the Sequoia/Goldman definition) sits at roughly $25B[5]. That's seven cents of revenue per dollar invested[△]. Profit is nowhere in sight — the providers are paying out of pocket. (The total AI market including hyperscaler cloud-AI services is much larger, but the ratio of CapEx to direct AI revenue stays structurally lopsided.)
[Chart: data labels $5B, $13B, $25B, $50B]
Even if AI revenue doubles in 2026 (to $50B) and CapEx grows "only" to $700B, the ratio does not improve: $50B against $700B is the same seven cents per dollar. From an investor's view, that's a 1:14 ratio of revenue to infrastructure investment. From an AI user's view it means: you're using infrastructure worth a multiple of what all users combined are paying for it.
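The ratios in this section are easy to recompute. A minimal sketch using the article's round numbers (in billions of US dollars); the helper name is illustrative.

```python
def revenue_per_dollar(revenue_bn: float, capex_bn: float) -> float:
    """Revenue earned per dollar of infrastructure investment."""
    return revenue_bn / capex_bn


scenarios = {
    "2025 (about $25B revenue, $350B CapEx)": (25.0, 350.0),
    "2026 scenario ($50B revenue, $700B CapEx)": (50.0, 700.0),
}

for label, (revenue, capex) in scenarios.items():
    ratio = revenue_per_dollar(revenue, capex)
    print(f"{label}: {ratio * 100:.1f} cents per dollar, roughly 1:{capex / revenue:.0f}")
```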
06 · Dotcom or Telecom?
Why the fiber bust is the more honest analogy
Most people compare this with Amazon or remember the dotcom bubble of the late 1990s. Amazon doesn't fit because the order of magnitude is wrong: Amazon's cumulative losses up to its first real profitability in 2003 were around $3B. OpenAI burns that today in about four months.
The dotcom bubble gets us closer: hundreds of overvalued software companies, Pets.com, Webvan, Boo.com, almost all gone. But the structurally more accurate parallel is its lesser-known twin crisis that collapsed at the same time: the telecom bubble. While the dotcom bubble wiped out the software layer, telecom wiped out the infrastructure, and infrastructure is what's at stake with AI today. If you don't remember the telecom boom: WorldCom, Global Crossing, and Nortel put over $500B into fiber infrastructure and bonds because everyone believed internet data volume would "double every 100 days"[16]. It didn't. The result: a roughly 90% price collapse for bandwidth, massive bankruptcies, and WorldCom's July 2002 insolvency in the wake of what was then the largest accounting fraud in US history ($11B+ in inflated assets)[16].
- 01 Over $500B invested.[16] WorldCom, Global Crossing, AT&T, Lucent — all financed with debt, based on growth forecasts that later proved too optimistic.
- 02 Bandwidth prices dropped ~90%. Overcapacity led to the price collapse. Only a fraction of the laid fiber was actually used.[16]
- 03 Bankruptcies and crash. WorldCom became insolvent in July 2002 after what was then the largest accounting fraud in US history[16]. Investors lost hundreds of billions.
- 04 Who won: the users and the app layer. Google bought dark fiber for pennies. YouTube was only possible because of cheap bandwidth. Streaming, cloud, social media — all built on the wreckage.
- 01 $350B in AI CapEx in 2025 alone.[6] Big 4 finance from cashflow (unlike back then!), but the order of magnitude is a one-to-one match — built on the assumption of exponentially growing AI demand.
- 02 Token prices fell 99.7%. 280× price drop in ~18 months[1]. Same pattern — overcapacity meeting slower-growing real demand.
- 03 Risk: individual providers fail. OpenAI investor documents put losses at ~$9B per year[9]; Anthropic has accumulated losses in the low double-digit billions over 5 years. Both figures come from pitch decks and are not audited. When the subsidy ends, not everyone survives.
- 04 Who wins: probably the users and the app layer, again. Anyone building robust AI workflows today wins in every scenario — no matter who ends up buying the GPU wrecks.
07 · Jevons paradox
Efficiency is near the limit, consumption explodes anyway
Every new GPU generation is more energy efficient, and every new datacenter has a better PUE (Power Usage Effectiveness, the ratio of total energy to compute energy). Lower is better: PUE 1.0 means no overhead loss, PUE 2.0 means half the energy goes to cooling and networking. Best-in-class hyperscalers (Google, Meta) sit at ~1.1 today[10], which is where most AI workloads run. And yet power consumption triples by 2030[2].
[Charts: "Global datacenter power (TWh / year)" and "Efficiency: headroom nearly gone"]
Meaning: there's not much left on the datacenter efficiency side. Future savings have to come from better chips (NVIDIA Rubin) and algorithmic efficiency — not datacenter layout.
Per-token efficiency doubles — total consumption triples. That's Jevons paradox in real time: when something gets cheaper, we use so much more of it that total consumption rises anyway.
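The claim above implies a specific growth in usage. A small sketch, assuming total energy is simply usage multiplied by energy per token (a deliberate simplification that ignores idle load and training):

```python
def implied_usage_growth(total_energy_factor: float, efficiency_factor: float) -> float:
    """Usage growth needed for total energy to rise despite efficiency gains.

    energy = usage * energy_per_token, and energy_per_token shrinks by
    1 / efficiency_factor, so usage must grow by total * efficiency.
    """
    return total_energy_factor * efficiency_factor


growth = implied_usage_growth(total_energy_factor=3.0, efficiency_factor=2.0)
print(f"usage must grow about {growth:.0f}x")
```

In other words, if efficiency doubles and total consumption still triples, token volume has to grow roughly sixfold over the same period.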
08 · Two truths
What we know for sure, and what's an estimate
Confirmed (from primary sources and SEC filings): Big-4 CapEx is real and comes from cashflow, not debt[6]. Inference prices are collapsing[1]. Energy efficiency is maxed out[10]. Total consumption doubles by 2030[2]. Alphabet's Q1 2026 included a $28.7B book gain from the Anthropic stake[4] — almost half of the quarterly profit with not a single dollar of cash inflow.
Estimate / not officially confirmed: OpenAI's internal cost-to-revenue ratios ("$2 spent per $1 of inference revenue", $9B loss against $13B revenue) come from leaked investor documents cited by Fortune, The Information, and wheresyoured.at[9] — not from audited statements. Anthropic's of ~50% and of ~$211/month are third-party analyses (SaaStr subscription-mix analysis[12], Sequoia token economics[5]). The Claude Max inference cost of $300–600/month is a model calculation[△] based on Epoch AI "Inference Cost Analysis" 2025[7] — a range, because no official unit economics exist. Profitability forecasts (Anthropic 2027/28, OpenAI 2029/30) are internal models, not audited statements[9]. Read these numbers in the article as order-of-magnitude reference points, not as verified balance-sheet items.
The most likely resolution: we're living through two truths at the same time. On one side, an investor bubble at the big cloud providers pouring hundreds of billions into infrastructure that may never pay off. On the other side, a historic bargain for anyone using AI — you're getting service worth an estimated $300 to $600 for $100. Both are real. Both can be true simultaneously.
The telecom boom proved this: investors lost, the equipment makers won (NVIDIA is the new Cisco), users won. The internet, YouTube, Spotify — all possible because of the cheap bandwidth that was left behind. With AI it'll be similar. Only the subsidy phase is finite — and in this phase you should build your workflows so you're not stuck when pricing power kicks in.
09 · If you build something
Not everything needs AI, and that becomes important soon
Here's where the article gets practical. If you build your own tools, write software, or want to put a business model on top of AI, the end of the subsidy phase dramatically changes the rules after 2028. Anyone building a SaaS today that makes an LLM call per user click has a scaling problem in two years. Classical deterministic software scales cheaply: an if-then block, written once, costs a few cents per million calls. An LLM call costs hundreds to thousands of dollars per million.
The rule of thumb: use AI where it actually earns its keep — creative, contextual, linguistic, generative. Not where deterministic software does the work cheaper and more reliably. A simple calculation, a database query, a validator check, a routing decision — all of that is classical programming. Replacing it with an LLM makes it 1,000× more expensive and less reliable. It works today because the LLM call is subsidized. In three years it doesn't.
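To illustrate the gap, here is a sketch that puts a deterministic validator next to the cost of doing the same job with a model. The 500 tokens per call and the $3 per million tokens are assumptions chosen for illustration (the price is the low end of the frontier range quoted earlier), and the regex is a deliberately loose email check, not a full validator.

```python
import re

# Deliberately simple pattern: something@something.tld, no whitespace.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_valid_email(value: str) -> bool:
    """Deterministic check: no model call, same answer every time."""
    return EMAIL_RE.fullmatch(value) is not None


def llm_cost_per_million_calls(tokens_per_call: int, usd_per_million_tokens: float) -> float:
    """Cost of one million LLM calls.

    One million calls consume tokens_per_call * 1e6 tokens, so the bill in
    USD equals tokens_per_call * usd_per_million_tokens.
    """
    return tokens_per_call * usd_per_million_tokens


print(is_valid_email("jane@example.com"), is_valid_email("not an email"))
print(f"${llm_cost_per_million_calls(500, 3.0):,.0f} per million LLM calls")
```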
Anyone shipping software should ask the following before every AI call:
- →Can you solve this with classical logic? Then do that.
- →Do you actually need frontier intelligence, or is a small, cheap model enough?
- →Can the result be cached, so the same call doesn't run twelve times?
- →How much would your business model lose if the token price tripled tomorrow?
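The caching question in the checklist above can be sketched with the standard library. `fake_model` is a hypothetical stand-in for a real provider call; it only counts how often it is invoked.

```python
from functools import lru_cache

calls = {"count": 0}


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an expensive LLM call."""
    calls["count"] += 1
    return prompt.upper()


@lru_cache(maxsize=1024)
def cached_answer(prompt: str) -> str:
    """Identical prompts are served from memory after the first call."""
    return fake_model(prompt)


for _ in range(12):
    cached_answer("summarize invoice 4711")

print(f"model invoked {calls['count']} time(s) for 12 identical requests")
```

An in-process cache like this only helps when prompts repeat exactly and answers may be reused; anything user-specific or time-sensitive needs a more careful cache key.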
For solo developers: build your personal tools so you can swap them out without breaking your business model. If your workflow says "absolutely needs Claude Opus at every step", you're exposed.
For companies: cost of inference has to be a hard metric on every AI project, not just a footnote. Ask on every project: what does it cost us per user, per month, at 10× scale? Which components are non-negotiably AI, and which are better built classically? Any business model built on 100,000 users generating millions of LLM calls per month has an existential problem when pricing power kicks in. Software that uses AI intelligently and sparingly wins.
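A back-of-the-envelope version of that question might look like the sketch below. All inputs (30 calls per user, 1,000 tokens per call, $3 per million tokens) are made-up assumptions; swap in your own measurements.

```python
def monthly_inference_cost(
    users: int,
    calls_per_user: int,
    tokens_per_call: int,
    usd_per_million_tokens: float,
    price_multiplier: float = 1.0,
) -> float:
    """Total monthly token bill in USD for a given usage profile."""
    total_tokens = users * calls_per_user * tokens_per_call
    return total_tokens / 1_000_000 * usd_per_million_tokens * price_multiplier


# Hypothetical usage profile, not measured data.
base = dict(calls_per_user=30, tokens_per_call=1_000, usd_per_million_tokens=3.0)

today = monthly_inference_cost(users=100_000, **base)
scaled = monthly_inference_cost(users=1_000_000, **base)
shocked = monthly_inference_cost(users=1_000_000, price_multiplier=3.0, **base)

print(f"today: ${today:,.0f}/month")
print(f"10x users: ${scaled:,.0f}/month")
print(f"10x users and tripled token price: ${shocked:,.0f}/month")
```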
10 · How to hedge
What you should do concretely now
The honest recommendation isn't "don't worry" and it isn't "stop paying". It's: use the subsidy phase fully, but build your workflows so you're not exposed when pricing power arrives. Concretely, six things:
Use the subsidy. Avoid the lock-in.
- 01 Use it fully now. The price-to-performance ratio is historically rare. Build workflows, automate processes, get as much as you can. Wait and you leave the subsidy on the table.
- 02 Build classical where possible. A calculation, a database lookup, a validator, a routing decision — a few cents per million calls. Same job via LLM: hundreds to thousands of dollars. Use AI where language, context, or generation are needed — not as a universal hammer. The hammer approach works today because the LLM call is subsidized. In three years it doesn't.
- 03 Build provider-flexible. Write your tools so you can flip from Anthropic to OpenAI or Google with one switch. Tools like or make this practically trivial. For companies: parallel contracts with two providers, never bet 100% on one. How I did it myself: setting up cloud with Antigravity.
- 04 Keep your data local. Prompts, memory, workflows, custom knowledge bases — whatever lives inside a provider system isn't yours. Regular exports. For sensitive company data, add storage and clear contractual terms on data egress.
- 05 Keep self-hosting on the table. -class models run on a Mac Studio today ( is surprisingly good for local LLMs). For companies: a dedicated GPU server pays off above a certain volume — and makes you independent of provider pricing. Build plan B before you need it.
- 06 Cheap models for bulk work. Frontier model only where it needs actual reasoning. For classification, extraction, simple answers, a 10–20× cheaper model is enough. That saves over 80% and decouples you from price shocks on the top tier. If you work with Claude Code: a token statusbar shows live which workflow eats the most tokens.
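The "over 80%" figure in the last item can be reproduced with a simple blend. This sketch assumes 90% of calls can move to a model that is 10× cheaper; both numbers are illustrative assumptions, not measurements.

```python
def blended_cost(share_on_cheap: float, cheap_factor: float) -> float:
    """Cost relative to sending every call to the frontier model (1.0 = no savings)."""
    return (1 - share_on_cheap) + share_on_cheap / cheap_factor


relative = blended_cost(share_on_cheap=0.9, cheap_factor=10.0)
print(f"blended cost: {relative:.0%} of frontier-only, savings {1 - relative:.0%}")
```

The savings depend far more on the share of traffic you can route away from the frontier model than on how cheap the small model is.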
Rule of thumb: anyone who builds routines that don't work without AI during the subsidy phase will be exposed in the pricing-power phase. Anyone who builds routines that migrate to a different model in minutes wins today and keeps options tomorrow.
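One way to keep that migration cheap is to hide the provider behind a single switch. The sketch below uses stub functions instead of real SDK calls; `provider_a` and `provider_b` are hypothetical names, and a real implementation would wrap each vendor's client behind the same signature.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# From the workflow's point of view a provider is just "prompt in, text out".
Completion = Callable[[str], str]


def _stub_provider(name: str) -> Completion:
    """Build a fake provider; replace with a wrapper around a real client."""
    def complete(prompt: str) -> str:
        return f"[{name}] {prompt}"
    return complete


PROVIDERS: Dict[str, Completion] = {
    "provider_a": _stub_provider("provider_a"),
    "provider_b": _stub_provider("provider_b"),
}


@dataclass
class Workflow:
    provider: str = "provider_a"  # the single switch

    def run(self, prompt: str) -> str:
        return PROVIDERS[self.provider](prompt)


print(Workflow().run("draft release notes"))
print(Workflow(provider="provider_b").run("draft release notes"))
```

Because the workflow depends only on the `Completion` signature, changing vendors means editing one registry entry rather than every call site.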
Bottom line
AI infrastructure investment is the most expensive bet in history — and a meaningful share of it is rational. Token prices are dropping for real. Efficiency is at its limit. Demand grows faster than both combined.
The risk sits with investors and hyperscalers, not users — at least for now. The historical precedent (the telecom boom) shows: the infrastructure gets built anyway, the prices collapse anyway, and the biggest winners are the ones who use the cheap infrastructure most effectively — not the ones who held WorldCom stock.
What you need today isn't skepticism, and it isn't naive enthusiasm. You need a plan to use the subsidy phase without being exposed in the next one. Take the plan above literally — or as a starting point for your own.