OpenRouter Weekly Token Rankings 2026: Billing Data Reveals Who Really Leads

28.9T weekly volume · China-US shift · DeepSeek matrix leads · Token vs dollar truth · six-step weekly tracking


If you keep bouncing between MMLU charts and production reality and want to know which models are actually being called in 2026, the weekly token throughput on OpenRouter Rankings is a more honest signal than any benchmark deck. For the week ending May 24, 2026, global weekly volume hit 28.9 trillion tokens, the fifth straight week of growth. Chinese models reached 9.223T and have led US models for four weeks running. The three DeepSeek models total 5.74T, which puts DeepSeek at the top of the vendor chart. This article is for developers and tech leads responsible for model routing and cost control. It covers data source notes, that week's Top 10, the split between token share and dollar revenue, the a16z inverse-benchmark finding, a six-step weekly tracking runbook, and why a monthly Mac Mini M4 rental still makes sense for long-running Agents.

01

Why billing data beats benchmark leaderboards: five traps

OpenRouter is the largest neutral AI model API aggregator: 300+ models, 60+ providers, 8M+ users, roughly 100T tokens per month. Its public leaderboard (openrouter.ai/rankings) ranks by 7-day rolling token throughput, counting both input and output. That is developers voting with wallets—not vendor radar charts.

A year ago OpenRouter processed about 2.4T tokens per week; a single week now reaches 28.9T, roughly 12x growth. Token volume works as a commercial weather vane: investors read it to track AI monetization, and developers read it to choose multi-vendor routing.

  1. Benchmarks can be gamed: High MMLU or HumanEval scores do not mean stable XML/JSON tool calls in Agent workflows, or thirty minutes of autonomous coding without drift.

  2. Volume reflects production trust: Developers keep paying and burning compute when a model passes stability, latency, and price in real workloads.

  3. Weekly cadence catches spikes: DeepSeek V4-Flash jumped +66% week over week, a signal monthly snapshots smooth away.

  4. Free models skew the chart: Zero-price models like Owl Alpha inflate experiment traffic. Read both token share and dollar revenue share.

  5. Coding is now the top use case: OpenRouter + a16z (100T tokens of anonymous metadata) show coding share rising from 11% in early 2025 to over 50%; Top 10 models optimize for Agents and code.
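The weekly-cadence point can be made concrete with a small helper. This is a minimal sketch, assuming you log weekly volumes by hand from the rankings page; the function names, the 50% threshold, and the sample history are all illustrative assumptions, not OpenRouter data or an OpenRouter API.

```python
def wow_change(previous: float, current: float) -> float:
    """Return the week-over-week change as a percentage."""
    if previous <= 0:
        raise ValueError("previous must be positive")
    return (current - previous) / previous * 100.0


def flag_spikes(history: dict[str, list[float]], threshold_pct: float = 50.0) -> list[str]:
    """Return models whose latest weekly volume rose by more than threshold_pct."""
    flagged = []
    for model, weekly in history.items():
        if len(weekly) < 2:
            continue  # need at least two weeks to compare
        if wow_change(weekly[-2], weekly[-1]) > threshold_pct:
            flagged.append(model)
    return flagged


# Hypothetical hand-logged volumes in trillions of tokens (illustrative only).
history = {
    "model-a": [2.0, 3.4],   # a large weekly jump
    "model-b": [1.3, 1.35],  # roughly flat
}
print(flag_spikes(history))

# The year-over-year figure from the text: 2.4T -> 28.9T per week.
print(f"{28.9 / 2.4:.1f}x")
```

A fixed percentage threshold is the simplest possible rule; a team with longer history might prefer comparing against a rolling average instead.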

It is not who is smartest—it is who gets called that drives real AI adoption. Billing numbers are more honest than any eval leaderboard.

02

May 18–24, 2026: 28.9T global weekly volume and the China-US shift

The tables below summarize OpenRouter public data (7-day rolling weekly stats, through May 24, 2026). The figures were cross-checked against NBD (2026-05-25), the official OpenRouter rankings, and MACCOME commentary from the same period.

| Metric | Value | WoW change |
| --- | --- | --- |
| Global weekly volume | 28.9T tokens | +7.4% (five weeks up) |
| China model weekly volume | 9.223T tokens | +19.89% |
| US model weekly volume | 4.93T tokens | +16.27% |
| China vs US rank | China leads US for four straight weeks | Global #1 region |

China model share timeline

| Period | China model traffic share |
| --- | --- |
| Early 2025 | < 2% |
| Feb 2026 | First time above US |
| May 2026 | ~45%+, four weeks ahead of US |

Scope note: OpenRouter assigns regional share by model vendor. DeepSeek, Tencent, MiniMax, StepFun count toward China; Anthropic, Google, xAI count toward the US.

03

That week's model Top 10: DeepSeek matrix takes three slots

Ranked by weekly tokens for May 18–24, 2026. DeepSeek V4-Flash, V3.2, and V4-Pro all land in the top seven (#1, #4, and #7); the series totals 5.74T (+25.9% WoW) and has led all vendors for two weeks, ahead of Anthropic and Google. Kimi K2.6, #6 the prior week, dropped out of the Top 10.

| Rank | Model | Vendor | Weekly tokens | WoW | Notes |
| --- | --- | --- | --- | --- | --- |
| 1 | DeepSeek-V4-Flash | DeepSeek | 3.43T | +66% | Agent workflows, ultra-low price |
| 2 | Tencent Hy3 Preview | Tencent | 3.07T | +16% | Still growing after promo ended |
| 3 | Claude Sonnet 4.6 | Anthropic | 1.35T | | 1M context, enterprise coding |
| 4 | DeepSeek-V3.2 | DeepSeek | 1.31T | | Low-cost long tail |
| 5 | Owl Alpha | OpenRouter | 1.15T | +29% | Free Agent model, 1M context |
| 6 | Gemini 3 Flash Preview | Google | 1.06T | | Multimodal, academic/medical |
| 7 | DeepSeek-V4-Pro | DeepSeek | 1.00T | | Matrix flagship (5.74T series total) |
| 8 | MiniMax M2.7 | MiniMax | 806B | | Long-context value pick |
| 9 | Grok 4.1 Fast | xAI | 721B | | 2M context, legal workloads |
| 10 | Step 3.5 Flash | StepFun | 673B | | Fast, cheap batch jobs |
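
The vendor-level total quoted for DeepSeek can be reproduced from the Top 10 rows. The sketch below is illustrative: the figures are copied from the table above, while the `TOP10` and `vendor_totals` names are assumptions made for this example. It only counts models inside the Top 10, so it understates any vendor with additional models further down the chart.

```python
from collections import defaultdict

# (model, vendor, weekly tokens in trillions), copied from the Top 10 table.
TOP10 = [
    ("DeepSeek-V4-Flash", "DeepSeek", 3.43),
    ("Tencent Hy3 Preview", "Tencent", 3.07),
    ("Claude Sonnet 4.6", "Anthropic", 1.35),
    ("DeepSeek-V3.2", "DeepSeek", 1.31),
    ("Owl Alpha", "OpenRouter", 1.15),
    ("Gemini 3 Flash Preview", "Google", 1.06),
    ("DeepSeek-V4-Pro", "DeepSeek", 1.00),
    ("MiniMax M2.7", "MiniMax", 0.806),
    ("Grok 4.1 Fast", "xAI", 0.721),
    ("Step 3.5 Flash", "StepFun", 0.673),
]


def vendor_totals(rows):
    """Sum weekly tokens per vendor and return (vendor, total) pairs, largest first."""
    totals = defaultdict(float)
    for _model, vendor, tokens in rows:
        totals[vendor] += tokens
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


for vendor, total in vendor_totals(TOP10):
    print(f"{vendor:<10} {total:.2f}T")
```

The first line printed is the DeepSeek series at 5.74T, matching the total stated in the text.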
04

Vendor landscape: the dual truth of token share vs dollar revenue

Token volume alone misses pricing. Anthropic shows a classic premium paradox: token share near 12% (down from 25% a year ago) while dollar revenue share stays near 46%. Enterprise users still pay premium rates for Claude, but traffic leadership moved elsewhere. Claude Opus 4.6 earns about $25M/month on a fraction of DeepSeek's token count.

| Segment | Example models | Token pattern | Revenue pattern |
| --- | --- | --- | --- |
| High value, low volume | Claude Opus series | Share declining | Complex enterprise reasoning, strong ARPU |
| Mid price, steady volume | Google Gemini Flash | Stable growth | Multimodal and academic use |
| Ultra-low price, high volume | DeepSeek / MiniMax / StepFun | Share expanding fast | Agents, coding, batch dominate |

The OpenRouter + a16z 2025 AI Usage Report adds a counter-intuitive point: benchmark scores and market share often move inversely. Developers optimize for inference cost and API stability over peak capability—matching DeepSeek and Hy3 atop the weekly chart while some benchmark champions sit outside the Top 10.
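
The premium paradox described in this section reduces to one ratio: dollar share divided by token share. The sketch below is a rough illustration using the approximate shares quoted above (~12% of tokens, ~46% of dollars); the `relative_price_index` name is an assumption for this example, and the result is an implied average, not a published price.

```python
def relative_price_index(token_share: float, revenue_share: float) -> float:
    """Revenue share divided by token share.

    A value of 1.0 means the vendor earns exactly the platform-average
    revenue per token; higher means it is paid more per token than average.
    """
    if not 0 < token_share <= 1 or not 0 <= revenue_share <= 1:
        raise ValueError("shares must be fractions between 0 and 1")
    return revenue_share / token_share


# Approximate shares quoted in the text for Anthropic.
anthropic = relative_price_index(0.12, 0.46)
# Everything that is not Anthropic: the remaining tokens and dollars.
everyone_else = relative_price_index(1 - 0.12, 1 - 0.46)

print(f"Anthropic:     {anthropic:.2f}x platform average")
print(f"Everyone else: {everyone_else:.2f}x platform average")
print(f"Ratio:         {anthropic / everyone_else:.1f}x")
```

Under those rounded inputs, Anthropic earns roughly 3.8x the platform-average revenue per token and about 6.2x what the rest of the field earns per token, which is why a shrinking token share can coexist with a large dollar share.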

05

Six-step runbook: track OpenRouter weekly and tune model routing

Rankings refresh weekly; routing should too. This runbook fits Claude Code, Cursor, OpenClaw, or a custom gateway—turning leaderboard insight into config changes.

  1. Every Monday, open Rankings: Visit openrouter.ai/rankings. Log global totals, the China-US split, and Top 10 moves. Screenshot for team review.

  2. Split token vs dollar views: Check both token share and revenue share so free models (Owl Alpha) are not mistaken for production defaults.

  3. Map models to tasks: Agent/batch → DeepSeek-V4-Flash; enterprise reasoning → Claude Opus; multimodal → Gemini Flash; watch new entrants (Hy3, Owl Alpha) as breakout signals.

  4. Run a regression on a fixed prompt set: Each week, rerun the same coding issue subset. Track tool-call failure rate against leaderboard shifts.

  5. Update routing JSON and budget caps: Raise Flash concurrency and hard-cap Opus monthly spend; use the fallback chain Sonnet → V4-Flash → human queue.

  6. Bind a 24/7 host to validate routing: Routing can live anywhere; if Agents need macOS (Claude Code, OpenClaw), deploy daemons on a monthly Mac Mini rental instead of a sleeping laptop.
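
The regression step in the runbook can be tracked with very little code. This is a minimal sketch under stated assumptions: the `RunResult` record, the 5-point tolerance, and the sample results are hypothetical, and how you actually run the prompts and judge a tool call as failed is left to your own harness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    model: str
    prompt_id: str
    tool_call_ok: bool  # True if the model's tool call parsed and executed


def failure_rates(results):
    """Return {model: fraction of runs whose tool call failed}."""
    totals, failures = {}, {}
    for r in results:
        totals[r.model] = totals.get(r.model, 0) + 1
        if not r.tool_call_ok:
            failures[r.model] = failures.get(r.model, 0) + 1
    return {model: failures.get(model, 0) / n for model, n in totals.items()}


def regressions(last_week, this_week, tolerance=0.05):
    """Models whose failure rate rose by more than `tolerance` (absolute)."""
    return sorted(
        model
        for model, rate in this_week.items()
        if model in last_week and rate - last_week[model] > tolerance
    )


# Hypothetical results on the same two prompts, one week apart.
last_week = failure_rates([
    RunResult("model-a", "issue-1", True),
    RunResult("model-a", "issue-2", True),
    RunResult("model-b", "issue-1", True),
    RunResult("model-b", "issue-2", False),
])
this_week = failure_rates([
    RunResult("model-a", "issue-1", True),
    RunResult("model-a", "issue-2", False),
    RunResult("model-b", "issue-1", True),
    RunResult("model-b", "issue-2", False),
])
print(regressions(last_week, this_week))
```

Keeping the prompt set fixed is what makes the week-to-week comparison meaningful; a model that climbs the leaderboard while its failure rate rises on your own tasks is a reason to hold the routing change.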

json · multi-model routing tuned to weekly rankings (concept)
{
  "weekly_review": "2026-05-24",
  "routes": {
    "agent_batch": "openrouter/deepseek/deepseek-v4-flash",
    "enterprise": "openrouter/anthropic/claude-sonnet-4.6",
    "complex_reasoning": "openrouter/anthropic/claude-opus-4.6",
    "multimodal": "openrouter/google/gemini-3-flash-preview",
    "experiment": "openrouter/owl-alpha"
  },
  "fallback": ["enterprise", "agent_batch"],
  "monthly_cap_usd": 800
}
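
One way a gateway might consume a config like this is sketched below. It is not an OpenRouter client: `call_model` is a hypothetical callable you would supply, `BudgetExceeded`, `candidates`, and `route` are names invented for this example, and the model ID strings are copied from the concept JSON without confirming they are valid identifiers.

```python
import json

# A trimmed copy of the concept config above.
CONFIG = json.loads("""
{
  "routes": {
    "agent_batch": "openrouter/deepseek/deepseek-v4-flash",
    "enterprise": "openrouter/anthropic/claude-sonnet-4.6"
  },
  "fallback": ["enterprise", "agent_batch"],
  "monthly_cap_usd": 800
}
""")


class BudgetExceeded(RuntimeError):
    """Raised when month-to-date spend has reached the configured cap."""


def candidates(config, task):
    """Ordered model IDs to try: the task's own route, then the fallback chain."""
    routes = config["routes"]
    order = [task] + [name for name in config["fallback"] if name != task]
    return [routes[name] for name in order if name in routes]


def route(config, task, spent_usd, call_model):
    """Try each candidate in order; refuse to call anything once over budget."""
    if spent_usd >= config["monthly_cap_usd"]:
        raise BudgetExceeded(f"spent ${spent_usd:.2f} of ${config['monthly_cap_usd']}")
    last_error = None
    for model_id in candidates(config, task):
        try:
            return call_model(model_id)
        except Exception as exc:  # any failure moves on to the next candidate
            last_error = exc
    raise RuntimeError("all routes failed; hand off to the human queue") from last_error


def fake_call(model_id):
    """Stand-in for a real API call; simulates an outage on the Claude route."""
    if "claude" in model_id:
        raise TimeoutError("simulated outage")
    return f"handled by {model_id}"


print(route(CONFIG, "enterprise", spent_usd=120.0, call_model=fake_call))
```

With the simulated outage, the enterprise request falls through to the Flash route, mirroring the Sonnet → V4-Flash → human queue chain from step 5; the final `RuntimeError` is where a real system would enqueue the task for a person.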
06

Citable hard data and Agent host choices

For internal memos or architecture reviews, these points are cross-checked against OpenRouter public data and contemporaneous press (week of May 18–24, 2026):

  • Global weekly volume: 28.9T tokens, +7.4% WoW, five weeks up; ~2.4T a year ago (~12x/year).
  • DeepSeek matrix: V4-Flash 3.43T + V4-Pro 1.00T + V3.2 1.31T = 5.74T, vendor #1.
  • Coding share: OpenRouter + a16z report: 11% early 2025 → over 50%, largest single category.
  • Anthropic premium: ~12% token share vs ~46% dollar share; Opus 4.6 ~$25M/month.
  • China model share: <2% early 2025 → ~45%+ May 2026, four weeks above US.

OpenRouter solves inference vendor switching; it does not replace process supervision, key boundaries, or Apple tooling. Teams often cut API cost sharply on Flash tiers and still lose overnight Agent runs when a laptop sleeps, or run into missing Metal, Keychain, and Xcode support on Linux VPS hosts. The OpenRouter trends selection guide and the guide to renting a Mac Mini for OpenClaw make the same point: models are swapped on token pricing, while host uptime is an OpEx contract. For teams that treat multi-model routing as infrastructure while also running iOS CI/CD and overnight Agents, VpsMesh Mac Mini M4 cloud rental is usually steadier than a personal MacBook. Plans: Mac Mini M4 rental pricing. Setup: help center.

FAQ

Three questions readers ask most

Weekly token volume reflects real paid production traffic, which makes it a market thermometer. Benchmarks suit peak-capability comparisons, but the OpenRouter + a16z data shows benchmark rank often moves inversely to market share. Combine weekly trends with private regression on a fixed task set and monthly checks at openrouter.ai/rankings.

DeepSeek V4-Flash lists near $0.10/$0.40 per M tokens (input/output), which suits Agent and batch work at scale (3.43T that week). Claude runs 30–50x higher per token, so Anthropic holds a low token share but ~46% of dollar share. Pick by scenario, not hype; see the OpenRouter trends selection guide.
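
To see what that price gap means for a budget, the sketch below prices one workload at the Flash rates quoted above and at the 30x and 50x multipliers. The monthly workload size is a made-up assumption, and the multipliers stand in for Claude pricing only because the text gives a range rather than a rate card.

```python
def cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost in USD given token counts and per-million-token prices."""
    return (input_tokens * input_price_per_m + output_tokens * output_price_per_m) / 1_000_000


# USD per million tokens (input, output), as quoted in the text.
FLASH_IN, FLASH_OUT = 0.10, 0.40

# Hypothetical monthly workload: 2B input tokens and 500M output tokens.
INPUT_TOKENS, OUTPUT_TOKENS = 2_000_000_000, 500_000_000

flash = cost_usd(INPUT_TOKENS, OUTPUT_TOKENS, FLASH_IN, FLASH_OUT)
print(f"Flash tier:  ${flash:,.0f}")
for multiplier in (30, 50):
    print(f"{multiplier}x pricing: ${flash * multiplier:,.0f}")
```

Under these assumptions the workload costs about $400 a month on the Flash tier and $12,000 to $20,000 at the premium multipliers, which is the gap a spend cap and per-task routing are meant to manage.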

Not always. Pure OpenRouter API usage works on Linux. If your stack includes Claude Code, Xcode, or OpenClaw daemons, a Mac Mini M4 monthly rental beats a sleeping laptop. Start with one month to validate weekly routing and daemons: see Mac Mini M4 rental pricing, and order at the order page.