
☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart visualizes the distribution of speeds for each model, since they can vary with provider load. For readability, not all models are shown in the chart, but you can see the full results in the table below.
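As a rough illustration of the kind of measurement described above, here is a minimal sketch of timing a streamed completion and summarizing repeated runs. It is not the code behind this site: `tokens_per_second`, `summarize`, and `fake_stream` are hypothetical names, and `fake_stream` stands in for a real provider call that yields tokens as they arrive.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def tokens_per_second(
    generate: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one streamed completion and return tokens generated per second."""
    start = clock()
    n_tokens = sum(1 for _ in generate())  # consume the whole stream
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed


def summarize(speeds: List[float]) -> Dict[str, float]:
    """Collapse repeated runs into a few numbers describing the distribution."""
    return {
        "runs": len(speeds),
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
    }


def fake_stream() -> Iterable[str]:
    # Hypothetical stand-in for a provider's streaming API: yields 50 tokens.
    for i in range(50):
        yield f"tok{i}"


if __name__ == "__main__":
    speeds = [tokens_per_second(fake_stream) for _ in range(5)]
    print(summarize(speeds))
```

Running a function like this on a schedule and storing each result would give one sample per run, which is enough to draw a per-model distribution.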

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any service that does not require purchasing a dedicated endpoint for hosting, which is why some models may appear to be missing. If you have any suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

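| # | Model | Provider | Speed |
|---|-------|----------|-------|
| 1 | llama-3.1-8b | groq | 307 tok/s |
| 2 | qwen-3-32b | groq | 256 tok/s |
| 3 | llama-3.1-8b | cerebras | 217 tok/s |
| 4 | llama-4-scout | groq | 217 tok/s |
| 5 | llama-3.3-70b | groq | 209 tok/s |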

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 84 of 84 models

Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

| Provider | Model | Status | Last run | Speed (tok/s) | Remaining columns (fused in source) |
|----------|-------|--------|----------|---------------|-------------------------------------|
| groq | llama-3.1-8b | Active | 2h ago | 307.00 | 87471100.00 |
| groq | qwen-3-32b | Active | 2h ago | 256.00 | 2391260.00 |
| cerebras | qwen-3-32b | Active | 4d ago | 253.00 | 4444310.00 |
| cerebras | llama-3.1-8b | Active | 2h ago | 217.00 | 6350510.00 |
| groq | llama-4-scout | Active | 2h ago | 217.00 | 23331200.00 |
| groq | llama-3.3-70b | Active | 2h ago | 209.00 | 99322110.00 |
| cerebras | gpt-oss-120b | Active | 2h ago | 202.00 | 4346780.00 |
| cerebras | llama-3.3-70b | Active | 4d ago | 200.00 | 17338380.00 |
| groq | llama-4-maverick | Active | 2h ago | 173.00 | 25310460.00 |
| cerebras | qwen-3-235b-instruct | Active | 17d ago | 155.00 | 22521170.00 |
| together | llama-3.1-8b | Active | 2h ago | 153.00 | 2232470.00 |
| groq | kimi-k2 | Active | 2h ago | 147.00 | 21208210.00 |
| bedrock | nova-micro | Active | 36m ago | 125.00 | 69151260.00 |
| bedrock | llama-4-maverick | Active | 36m ago | 109.00 | 36142250.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 2h ago | 108.00 | 211600.00 |
| bedrock | llama-4-scout | Active | 36m ago | 102.00 | 1140350.00 |
| bedrock | nova-lite | Active | 36m ago | 101.00 | 39131290.00 |
| bedrock | llama-3.3-70b | Active | 36m ago | 101.00 | 11137240.00 |
| together | qwen-2.5-7b | Active | 2h ago | 90.00 | 3141310.00 |
| bedrock | nova-pro | Active | 36m ago | 89.60 | 13124370.00 |
| together | mistral-7b | Active | 2h ago | 84.60 | 2165500.00 |
| openai | gpt-3.5-turbo | Active | 2h ago | 79.80 | 12125440.00 |
| google |  | Active | 2h ago | 78.10 | 34132510.00 |
| together | llama-3.1-70b | Active | 2h ago | 71.70 | 7144380.00 |
| openai | gpt-4.1-nano | Active | 2h ago | 70.80 | 27127390.00 |
| openai | GPT-5.1-codex-max | Active | 2h ago | 67.70 | 11081680.00 |
| openai | gpt-4o | Active | 2h ago | 66.30 | 71731370.00 |
| together | llama-3.2-3b | Active | 2h ago | 65.90 | 51431300.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 2h ago | 65.70 | 51001090.00 |
| fireworks | mixtral-8x22b | Active | 2h ago | 65.50 | 37112490.00 |
| google | gemini-2.0-flash | Active | 17d ago | 60.00 | 1688580.00 |
| google | gemini-2.0-flash-lite | Active | 17d ago | 57.00 | 1173730.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 2d ago | 56.70 | 1480340.00 |
| together | llama-3.3-70b | Active | 2h ago | 55.20 | 21461360.00 |
| together | mixtral-8x7b | Active | 2h ago | 54.10 | 13114190.00 |
| together | qwen-2.5-72b | Active | 14d ago | 53.10 | 471390.00 |
| deepinfra | mistral-7b | Stale (Medium) | 2h ago | 52.10 | 3124590.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 2h ago | 51.60 | 5102420.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 2h ago | 50.10 | 15760.00 |
| anthropic | claude-haiku-4.5 | Active | 2h ago | 49.90 | 1680640.00 |
| openai | gpt-4.1-mini | Active | 2h ago | 48.80 | 1597500.00 |
| bedrock | llama-3.2-90b | Active | 36m ago | 47.50 | 1951340.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 2h ago | 46.30 | 3131580.00 |
| fireworks | llama-3.3-70b | Active | 2h ago | 46.30 | 7941070.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 2h ago | 44.30 | 771300.00 |
| bedrock | mistral-large | Active | 36m ago | 43.40 | 347270.00 |
| bedrock | claude-haiku-4.5 | Active | 36m ago | 42.60 | 4631070.00 |
| openai | gpt-4o-mini | Active | 2h ago | 42.10 | 895400.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 2h ago | 42.00 | 11721540.00 |
| together | deepseek-r1 | Active | 2h ago | 40.30 | 1691580.00 |
| openai | gpt-4.1 | Active | 2h ago | 35.10 | 670470.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 2h ago | 34.90 | 2881000.00 |
| bedrock | claude-3-7-sonnet | Active | 36m ago | 33.80 | 644730.00 |
| openai | gpt-4-turbo | Active | 2h ago | 32.80 | 251570.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 2h ago | 32.50 | 448530.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 2h ago | 32.50 | 349440.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 2h ago | 32.40 | 247730.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 2h ago | 32.30 | 1672610.00 |
| bedrock | claude-3-5-sonnet | Active | 36m ago | 31.80 | 243650.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 2h ago | 31.50 | 190610.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 2h ago | 31.40 | 190740.00 |
| openai | GPT-5.1 | Active | 2h ago | 29.50 | 3561010.00 |
| bedrock | claude-3-5-haiku | Active | 36m ago | 29.30 | 138880.00 |
| openai | GPT-5.2 | Active | 2h ago | 27.10 | 942950.00 |
| openai | gpt-4 | Active | 2h ago | 26.70 | 249670.00 |
| openai | GPT-5.1-codex | Active | 2h ago | 25.10 | 3491330.00 |
| openai | GPT-5.1-codex-mini | Active | 2h ago | 22.30 | 1551390.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 2h ago | 22.20 | 1591320.00 |
| bedrock | claude-sonnet-4.5 | Active | 36m ago | 22.00 | 2291680.00 |
| together | deepseek-v3 | Active | 28d ago | 21.80 | 5311810.00 |
| together | llama-3.1-405b | Active | 15d ago | 21.70 | 2291230.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 2h ago | 21.40 | 1341340.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 2h ago | 21.20 | 346620.00 |
| anthropic | claude-4-sonnet | Active | 2h ago | 19.70 | 1301940.00 |
| anthropic | claude-opus-4.5 | Active | 2h ago | 19.60 | 7301750.00 |
| bedrock | claude-3-opus | Active | 2d ago | 19.10 | 422860.00 |
| anthropic | Claude Opus 4.1 | Active | 2h ago | 18.40 | 5251420.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 2h ago | 18.40 | 1682330.00 |
| bedrock | claude-opus-4.5 | Active | 36m ago | 18.00 | 4232060.00 |
| anthropic | claude-4-opus | Active | 2h ago | 17.90 | 9241220.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 2h ago | 17.30 | 1382830.00 |
| openai |  | Active | 5h ago | 12.40 | 2221920.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 2h ago | 9.33 | 119330.00 |
| openai | GPT-5.2-pro | Active | 2h ago | 7.11 | 1135860.00 |

📈 Time Series 📈

Per-model time series charts: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-3-32b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, devstral-small, gemini-2.5-flash, gemini-2.5-pro, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.1-405b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-72b, qwen-2.5-7b, qwen-3-235b.