
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of measured speeds for each model, since throughput varies with provider load. For readability, not all models are shown in the chart; the full results are in the table below.
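
As a rough illustration of what one such measurement might look like, here is a minimal sketch that times a streamed response and reduces several runs to summary statistics. It is not the code behind this site: `stream_tokens` is a hypothetical callable standing in for a provider's streaming API, and the function names are assumptions made for the example.

```python
import time
from statistics import mean, median
from typing import Callable, Dict, Iterable, Sequence


def measure_tokens_per_second(
    stream_tokens: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one streamed generation and return tokens per second.

    `stream_tokens` is a hypothetical stand-in for a provider call that
    yields tokens as they arrive. `clock` is injectable so the function
    can be exercised without real waiting.
    """
    start = clock()
    n_tokens = sum(1 for _ in stream_tokens())  # consume the whole stream
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed


def summarize(speeds: Sequence[float]) -> Dict[str, float]:
    """Reduce repeated measurements to the spread a distribution chart shows."""
    if not speeds:
        raise ValueError("need at least one measurement")
    ordered = sorted(speeds)
    return {
        "min": ordered[0],
        "median": median(ordered),
        "mean": mean(ordered),
        "max": ordered[-1],
    }
```

A scheduled job could call `measure_tokens_per_second` once per provider and model on each run, append the result to storage, and compute `summarize` over the stored history when rendering the page.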

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking for any that do not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

#   Model           Provider    Speed
1   llama-3.1-8b    groq        325 tok/s
2   qwen-3-32b      groq        238 tok/s
3   llama-4-scout   groq        213 tok/s
4   llama-3.1-8b    cerebras    205 tok/s
5   llama-3.3-70b   groq        201 tok/s
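
The "updated <24h" filter can be sketched as follows. This is an assumption about how the leaderboard is derived, not the site's actual code: it keeps only results whose last run is under 24 hours old, then sorts by speed. The `Result` type and both function names are hypothetical, and the sample rows are copied from the full results table (provider, model, last run, and the first numeric column read as tok/s).

```python
import re
from dataclasses import dataclass
from typing import List, Sequence

# Hours represented by each unit suffix used in labels such as "33m ago".
_UNIT_HOURS = {"m": 1 / 60, "h": 1.0, "d": 24.0}


def age_in_hours(label: str) -> float:
    """Convert a relative label like '3h ago' or '15d ago' to hours."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", label.strip())
    if match is None:
        raise ValueError(f"unrecognized age label: {label!r}")
    return int(match.group(1)) * _UNIT_HOURS[match.group(2)]


@dataclass(frozen=True)
class Result:
    provider: str
    model: str
    last_run: str  # relative label as shown in the table
    speed: float   # tokens per second


def fastest_recent(
    results: Sequence[Result], limit: int = 5, max_age_hours: float = 24.0
) -> List[Result]:
    """Return the fastest results whose last run is newer than max_age_hours."""
    recent = [r for r in results if age_in_hours(r.last_run) < max_age_hours]
    return sorted(recent, key=lambda r: r.speed, reverse=True)[:limit]


# Sample rows taken from the full results table.
ROWS = [
    Result("groq", "llama-3.1-8b", "3h ago", 325.0),
    Result("cerebras", "qwen-3-32b", "15d ago", 246.0),
    Result("groq", "qwen-3-32b", "3h ago", 238.0),
    Result("groq", "llama-4-scout", "3h ago", 213.0),
    Result("cerebras", "llama-3.1-8b", "6h ago", 205.0),
    Result("groq", "llama-3.3-70b", "3h ago", 201.0),
    Result("cerebras", "gpt-oss-120b", "3h ago", 195.0),
]

if __name__ == "__main__":
    for rank, r in enumerate(fastest_recent(ROWS), start=1):
        print(rank, r.model, r.provider, f"{r.speed:.0f} tok/s")
```

Under this assumption, the cerebras qwen-3-32b result (246 tok/s, last run 15 days ago) is excluded for being too old, which is consistent with its absence from the top five above.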

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 83 of 83 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

Provider   Model                  Status                      Last run  Speed (tok/s)  Other columns (run together)
groq       llama-3.1-8b           Active                      3h ago    325.00         8747190.00
cerebras   qwen-3-32b             Active                      15d ago   246.00         4444380.00
groq       qwen-3-32b             Active                      3h ago    238.00         2391280.00
groq       llama-4-scout          Active                      3h ago    213.00         23335210.00
cerebras   llama-3.1-8b           Active                      6h ago    205.00         2343650.00
groq       llama-3.3-70b          Active                      3h ago    201.00         68322130.00
cerebras   gpt-oss-120b           Active                      3h ago    195.00         13380810.00
cerebras   llama-3.3-70b          Active                      15d ago   186.00         17338440.00
groq       llama-4-maverick       Active                      3h ago    181.00         12307430.00
together   llama-3.1-8b           Active                      3h ago    146.00         2232560.00
groq       kimi-k2                Active                      3h ago    144.00         19215260.00
bedrock    nova-micro             Active                      33m ago   122.00         65149270.00
cerebras   qwen-3-235b-instruct   Active                      27d ago   117.00         19252950.00
openai     o3 Mini                Never Succeeded (Medium)    3h ago    110.00         211600.00
bedrock    llama-4-maverick       Active                      33m ago   108.00         3139260.00
bedrock    llama-4-scout          Active                      33m ago   102.00         4132270.00
bedrock    nova-lite              Active                      33m ago   101.00         39130300.00
bedrock    llama-3.3-70b          Active                      33m ago   99.30          3137280.00
together   qwen-2.5-7b            Active                      3h ago    93.30          11145230.00
bedrock    nova-pro               Active                      33m ago   91.60          13124350.00
openai     gpt-3.5-turbo          Active                      3h ago    79.10          13126460.00
openai     GPT-5.1-codex-max      Active                      3h ago    76.30          111181490.00
google     (missing)              Active                      3h ago    75.80          34132520.00
together   llama-3.1-70b          Active                      5d ago    73.00          7144390.00
openai     gpt-4.1-nano           Active                      3h ago    70.90          9149430.00
together   mistral-7b             Active                      5d ago    70.10          291530.00
openai     gpt-4o                 Active                      3h ago    68.90          71731330.00
fireworks  mixtral-8x22b          Active                      1d ago    67.80          29112430.00
google     gemini-2.5-flash       Never Succeeded (Medium)    3h ago    66.60          51011070.00
deepinfra  mistral-7b             Stale (Medium)              3h ago    62.40          5136560.00
together   llama-3.2-3b           Active                      3h ago    59.60          51211460.00
deepinfra  mixtral-8x22b          Stale (Medium)              13d ago   58.40          1478370.00
together   mixtral-8x7b           Active                      3h ago    57.80          13114160.00
google     gemini-2.0-flash       Active                      27d ago   57.70          2978610.00
deepinfra  devstral-small         Never Succeeded (Medium)    3h ago    56.50          3131650.00
google     gemini-2.0-flash-lite  Active                      27d ago   54.00          1173920.00
together   llama-3.3-70b          Active                      3h ago    53.60          21461250.00
deepinfra  llama-3.1-8b           Stale (Medium)              3h ago    53.40          6100430.00
together   qwen-2.5-72b           Active                      24d ago   52.90          471490.00
anthropic  claude-haiku-4.5       Active                      3h ago    50.60          1674600.00
openai     gpt-4.1-mini           Active                      3h ago    50.40          1597480.00
fireworks  llama-3.3-70b          Active                      3h ago    49.70          4951250.00
openai     o4 Mini                Never Succeeded (Medium)    3h ago    49.10          14760.00
bedrock    llama-3.2-90b          Active                      33m ago   47.30          651350.00
deepinfra  llama-3-8b             Stale (Medium)              3h ago    44.90          771310.00
together   deepseek-r1            Active                      3h ago    43.60          1691040.00
openai     gpt-4o-mini            Active                      3h ago    43.20          895380.00
bedrock    mistral-large          Active                      33m ago   42.50          347330.00
bedrock    claude-haiku-4.5       Active                      33m ago   42.00          4631070.00
google     gemini-2.5-pro         Never Succeeded (Medium)    3h ago    41.40          6721590.00
openai     gpt-4.1                Active                      3h ago    37.30          682490.00
deepinfra  llama-2-70b            Stale (Medium)              3h ago    35.60          357510.00
deepinfra  llama-3-70b            Stale (Medium)              3h ago    35.10          255610.00
deepinfra  llama-3.2-1b           Stale (Medium)              3h ago    34.70          1100630.00
deepinfra  llama-3.2-90b          Stale (Medium)              3h ago    34.50          376840.00
deepinfra  llama-3.2-3b           Stale (Medium)              3h ago    34.50          197800.00
bedrock    claude-3-7-sonnet      Active                      34m ago   33.30          544740.00
deepinfra  qwen-2.5-72b           Stale (Medium)              3h ago    32.50          250640.00
deepinfra  Qwen 2.5 Coder 32B     Never Succeeded (Medium)    3h ago    32.20          1742160.00
openai     gpt-4-turbo            Active                      3h ago    32.10          252530.00
bedrock    claude-3-5-sonnet      Active                      34m ago   31.90          244660.00
bedrock    claude-3-5-haiku       Active                      34m ago   30.20          138790.00
openai     GPT-5.1                Active                      3h ago    29.90          357970.00
openai     gpt-4                  Active                      3h ago    27.20          247660.00
openai     GPT-5.2                Active                      3h ago    26.60          940950.00
openai     GPT-5.1-codex          Active                      3h ago    25.70          3481310.00
openai     GPT-5.1-codex-mini     Active                      3h ago    24.40          1551280.00
deepinfra  llama-3.1-405b         Stale (Medium)              3h ago    23.50          1341130.00
deepinfra  llama-3.1-70b          Stale (Medium)              3h ago    23.20          346650.00
bedrock    claude-sonnet-4.5      Active                      33m ago   22.30          5291610.00
together   llama-3.1-405b         Active                      25d ago   21.00          6291060.00
anthropic  claude-opus-4.5        Active                      3h ago    19.80          2331880.00
deepinfra  llama-3.3-70b          Never Succeeded (Medium)    3h ago    19.80          1492050.00
anthropic  claude-4-sonnet        Active                      3h ago    19.40          1312120.00
bedrock    claude-3-opus          Active                      12d ago   19.10          422860.00
bedrock    claude-opus-4.5        Active                      33m ago   18.30          4241990.00
anthropic  Claude Opus 4.1        Active                      3h ago    18.00          8271480.00
anthropic  claude-4-opus          Active                      3h ago    17.80          5241280.00
deepinfra  llama-3.2-11b          Stale (Medium)              3h ago    16.30          1622400.00
deepinfra  qwen-3-235b            Never Succeeded (Medium)    9h ago    12.70          1384670.00
openai     (missing)              Active                      3h ago    12.70          1221870.00
openai     o1-pro                 Likely Deprecated (Medium)  3h ago    8.90           118650.00
openai     GPT-5.2-pro            Active                      3h ago    8.21           1135300.00

📈 Time Series 📈

[Time-series charts, one per model: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, devstral-small, gemini-2.5-flash, gemini-2.5-pro, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.1-405b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-72b, qwen-2.5-7b, qwen-3-235b, qwen-3-32b]