☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart visualizes the distribution of speeds for each model, since speeds vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
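A minimal sketch of how one such measurement could work, assuming a streaming client that yields one item per generated token. The names `measure_tokens_per_second`, `summarize`, and `fake_stream` are hypothetical and only for illustration, not the site's actual code; a real job would replace `fake_stream` with a provider call.

```python
import statistics
import time
from typing import Callable, Iterable, Iterator, List


def fake_stream(n: int = 50, delay: float = 0.001) -> Iterator[str]:
    """Stand-in for a provider's streaming response (hypothetical)."""
    for i in range(n):
        time.sleep(delay)  # simulate the time taken to generate one token
        yield f"tok{i}"


def measure_tokens_per_second(stream_tokens: Callable[[], Iterable[str]]) -> float:
    """Time one streamed completion and return generated tokens per second."""
    start = time.perf_counter()
    token_count = sum(1 for _ in stream_tokens())
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return token_count / elapsed


def summarize(samples: List[float]) -> dict:
    """Summarize the spread of repeated runs for one provider/model pair."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    return {
        "runs": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Repeating the measurement gives a distribution rather than one number.
    speeds = [measure_tokens_per_second(fake_stream) for _ in range(3)]
    print(summarize(speeds))
```

Repeating the measurement on a schedule and keeping every sample is what makes a distribution chart possible, since a single run says little about how speed shifts with load.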

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any provider that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 297 tok/s
2 | qwen-3-32b | groq | 249 tok/s
3 | llama-3.1-8b | cerebras | 226 tok/s
4 | llama-4-scout | groq | 210 tok/s
5 | gpt-oss-120b | cerebras | 207 tok/s
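
The "updated <24h" ranking appears to skip results whose last run is a day or more old, which would explain why cerebras qwen-3-32b (256 tok/s, last run 1d ago) is not listed. A rough sketch of that filter, assuming the "57m ago" style labels from the results table; `Result`, `age_in_minutes`, and `fastest_recent` are hypothetical names, and the `h` unit is an assumption since only `m` and `d` appear on the page.

```python
import re
from dataclasses import dataclass
from typing import List

# Minutes per unit; "h" is assumed, the page only shows "m" and "d".
_UNIT_MINUTES = {"m": 1, "h": 60, "d": 24 * 60}


@dataclass
class Result:
    provider: str
    model: str
    last_run: str  # relative label such as "57m ago"
    speed: float   # tokens per second


def age_in_minutes(label: str) -> int:
    """Convert a label like '57m ago' or '1d ago' to minutes."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", label.strip())
    if match is None:
        raise ValueError(f"unrecognized age label: {label!r}")
    return int(match.group(1)) * _UNIT_MINUTES[match.group(2)]


def fastest_recent(results: List[Result], limit: int = 5,
                   max_age_minutes: int = 24 * 60) -> List[Result]:
    """Keep results newer than max_age_minutes and return the fastest ones."""
    fresh = [r for r in results if age_in_minutes(r.last_run) < max_age_minutes]
    return sorted(fresh, key=lambda r: r.speed, reverse=True)[:limit]


# The eight fastest rows from the full results table.
SAMPLE = [
    Result("groq", "llama-3.1-8b", "57m ago", 297.0),
    Result("cerebras", "qwen-3-32b", "1d ago", 256.0),
    Result("groq", "qwen-3-32b", "57m ago", 249.0),
    Result("cerebras", "llama-3.1-8b", "58m ago", 226.0),
    Result("groq", "llama-4-scout", "57m ago", 210.0),
    Result("cerebras", "gpt-oss-120b", "58m ago", 207.0),
    Result("cerebras", "llama-3.3-70b", "1d ago", 207.0),
    Result("groq", "llama-3.3-70b", "57m ago", 206.0),
]

if __name__ == "__main__":
    for rank, r in enumerate(fastest_recent(SAMPLE), start=1):
        print(rank, r.model, r.provider, f"{r.speed:.0f} tok/s")
```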

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 86 of 86 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
Provider | Model | Status | Last run | Speed (tok/s) | Other values (unseparated)
groq | llama-3.1-8b | Active | 57m ago | 297.00 | 87471120.00
cerebras | qwen-3-32b | Active | 1d ago | 256.00 | 4444290.00
groq | qwen-3-32b | Active | 57m ago | 249.00 | 46391140.00
cerebras | llama-3.1-8b | Active | 58m ago | 226.00 | 6365480.00
groq | llama-4-scout | Active | 57m ago | 210.00 | 23331220.00
cerebras | gpt-oss-120b | Active | 58m ago | 207.00 | 4346720.00
cerebras | llama-3.3-70b | Active | 1d ago | 207.00 | 17338360.00
groq | llama-3.3-70b | Active | 57m ago | 206.00 | 94322120.00
groq | llama-4-maverick | Active | 57m ago | 168.00 | 19310500.00
cerebras | qwen-3-235b-instruct | Active | 14d ago | 160.00 | 22641000.00
together | llama-3.1-8b | Active | 55m ago | 153.00 | 2232540.00
groq | kimi-k2 | Active | 57m ago | 144.00 | 21201220.00
bedrock | nova-micro | Active | 38m ago | 126.00 | 69151260.00
bedrock | llama-4-maverick | Active | 38m ago | 109.00 | 36142250.00
openai | o3 Mini | Never Succeeded (Medium) | 57m ago | 106.00 | 211590.00
bedrock | llama-4-scout | Active | 38m ago | 102.00 | 1140350.00
bedrock | nova-lite | Active | 38m ago | 101.00 | 39132290.00
bedrock | llama-3.3-70b | Active | 38m ago | 101.00 | 15137240.00
together | qwen-2.5-7b | Active | 56m ago | 92.00 | 3146300.00
together | mistral-7b | Active | 55m ago | 91.00 | 2165480.00
bedrock | nova-pro | Active | 38m ago | 88.40 | 10124370.00
google | (missing) | Active | 55m ago | 80.10 | 34132500.00
openai | gpt-3.5-turbo | Active | 56m ago | 79.00 | 12125440.00
openai | gpt-4.1-nano | Active | 57m ago | 73.10 | 27131380.00
google | claude-3-haiku | Active | 29d ago | 72.00 | 6376460.00
together | llama-3.1-70b | Active | 55m ago | 71.60 | 7144400.00
together | llama-3.2-3b | Active | 55m ago | 69.00 | 51451220.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 55m ago | 65.40 | 51001130.00
openai | gpt-4o | Active | 56m ago | 65.10 | 71541380.00
fireworks | mixtral-8x22b | Active | 57m ago | 63.80 | 37112530.00
openai | GPT-5.1-codex-max | Active | 57m ago | 62.70 | 11061840.00
google | gemini-2.0-flash | Active | 13d ago | 60.40 | 1588600.00
google | gemini-2.0-flash-lite | Active | 13d ago | 58.40 | 1180690.00
deepinfra | mixtral-8x22b | Stale (Medium) | 58m ago | 55.40 | 1480330.00
together | llama-3.3-70b | Active | 56m ago | 54.70 | 21361450.00
together | mixtral-8x7b | Active | 56m ago | 52.80 | 13110210.00
together | qwen-2.5-72b | Active | 11d ago | 52.40 | 471390.00
deepinfra | mistral-7b | Stale (Medium) | 58m ago | 50.70 | 3124560.00
openai | o4 Mini | Never Succeeded (Medium) | 57m ago | 50.00 | 15760.00
anthropic | claude-haiku-4.5 | Active | 58m ago | 49.50 | 1580670.00
openai | gpt-4.1-mini | Active | 57m ago | 48.30 | 1597500.00
bedrock | llama-3.2-90b | Active | 38m ago | 47.60 | 2951340.00
deepinfra | llama-3.1-8b | Stale (Medium) | 58m ago | 47.00 | 11021080.00
deepinfra | devstral-small | Never Succeeded (Medium) | 58m ago | 44.90 | 3131550.00
deepinfra | llama-3-8b | Stale (Medium) | 58m ago | 44.60 | 771300.00
fireworks | llama-3.3-70b | Active | 57m ago | 44.00 | 7901190.00
bedrock | mistral-large | Active | 38m ago | 43.70 | 747250.00
bedrock | claude-haiku-4.5 | Active | 38m ago | 43.60 | 4631000.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 55m ago | 42.30 | 11721530.00
openai | gpt-4o-mini | Active | 56m ago | 41.90 | 895400.00
together | deepseek-r1 | Active | 56m ago | 38.70 | 1691860.00
deepinfra | llama-3.2-90b | Stale (Medium) | 58m ago | 36.30 | 288950.00
google | claude-3-5-sonnet | Active | 29d ago | 34.30 | 3335680.00
openai | gpt-4.1 | Active | 57m ago | 34.00 | 670470.00
bedrock | claude-3-7-sonnet | Active | 39m ago | 33.60 | 644730.00
openai | gpt-4-turbo | Active | 56m ago | 32.90 | 251570.00
deepinfra | llama-3-70b | Stale (Medium) | 58m ago | 32.40 | 448530.00
bedrock | claude-3-5-sonnet | Active | 39m ago | 32.30 | 643580.00
deepinfra | llama-2-70b | Stale (Medium) | 58m ago | 32.30 | 349430.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 58m ago | 32.10 | 247700.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 58m ago | 31.20 | 1673460.00
bedrock | claude-3-5-haiku | Active | 39m ago | 29.20 | 138900.00
openai | GPT-5.1 | Active | 57m ago | 29.20 | 3561030.00
deepinfra | llama-3.2-3b | Stale (Medium) | 58m ago | 27.70 | 190730.00
deepinfra | llama-3.2-1b | Stale (Medium) | 58m ago | 27.60 | 290520.00
openai | GPT-5.2 | Active | 57m ago | 27.00 | 942980.00
openai | gpt-4 | Active | 56m ago | 26.20 | 249670.00
openai | GPT-5.1-codex | Active | 57m ago | 25.70 | 3491300.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 58m ago | 22.50 | 1591110.00
together | deepseek-v3 | Active | 25d ago | 22.50 | 1592610.00
openai | GPT-5.1-codex-mini | Active | 57m ago | 21.80 | 1551400.00
bedrock | claude-sonnet-4.5 | Active | 38m ago | 21.70 | 2291730.00
together | llama-3.1-405b | Active | 11d ago | 21.60 | 2301270.00
deepinfra | llama-3.1-405b | Stale (Medium) | 58m ago | 20.60 | 1341640.00
deepinfra | llama-3.1-70b | Stale (Medium) | 58m ago | 20.60 | 346620.00
anthropic | claude-4-sonnet | Active | 58m ago | 19.60 | 1301940.00
anthropic | claude-opus-4.5 | Active | 58m ago | 19.50 | 7271760.00
bedrock | claude-3-opus | Active | 39m ago | 19.10 | 422850.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 58m ago | 18.90 | 1432390.00
anthropic | Claude Opus 4.1 | Active | 58m ago | 18.50 | 5251400.00
anthropic | claude-4-opus | Active | 58m ago | 18.10 | 9241210.00
bedrock | claude-opus-4.5 | Active | 38m ago | 17.90 | 3232120.00
deepinfra | llama-3.2-11b | Stale (Medium) | 58m ago | 17.90 | 1682050.00
openai | (missing) | Active | 57m ago | 12.10 | 2221990.00
openai | o1-pro | Likely Deprecated (Medium) | 57m ago | 9.64 | 119140.00
openai | GPT-5.2-pro | Active | 57m ago | 6.58 | 1136240.00

📈 Time Series 📈

One chart per model: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-405b, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-2.5-72b, qwen-3-32b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, devstral-small, gemini-2.5-flash, gemini-2.5-pro, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-7b, qwen-3-235b, gemini-2.0-flash, gemini-2.0-flash-lite.