
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart shows the distribution of measured speeds for each model, since speed varies with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
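As a rough illustration of the measurement described above, the sketch below times one generation call, converts it to tokens per second, and summarizes repeated runs as a distribution. The `generate` callable, the `fake_generate` stub, and the choice of summary statistics are assumptions made for illustration; they are not the benchmark's actual code.

```python
import statistics
import time
from typing import Callable, Dict, List


def tokens_per_second(generate: Callable[[str], int], prompt: str) -> float:
    """Time one call to `generate` (hypothetical: returns the output token count)."""
    start = time.perf_counter()
    n_tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed


def summarize(speeds: List[float]) -> Dict[str, float]:
    """Summarize repeated measurements as min / median / max / mean."""
    return {
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
        "mean": statistics.fmean(speeds),
    }


def fake_generate(prompt: str) -> int:
    """Stand-in for a provider API call (assumption): sleeps briefly, reports 50 tokens."""
    time.sleep(0.01)
    return 50


if __name__ == "__main__":
    # Repeated runs give a distribution rather than a single number,
    # which matters because speeds vary with provider load.
    runs = [tokens_per_second(fake_generate, "Hello") for _ in range(5)]
    print(summarize(runs))
```

In a real setup, a scheduler such as cron would invoke a script like this at a fixed interval and append each result to storage, so the distribution builds up over time.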

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, focusing on those that do not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have any suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

| # | Model | Provider | Speed |
|---|---|---|---|
| 1 | llama-3.1-8b | groq | 286 tok/s |
| 2 | qwen-3-32b | cerebras | 255 tok/s |
| 3 | qwen-3-32b | groq | 244 tok/s |
| 4 | llama-3.1-8b | cerebras | 225 tok/s |
| 5 | gpt-oss-120b | cerebras | 216 tok/s |
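A ranking like the "Fastest Models Right Now (updated <24h)" list above could be produced by keeping only recent measurements and sorting by speed. The sketch below shows one way to do that; the `Measurement` record, the `fastest_recent` helper, and the fixed `now` timestamp are hypothetical names and values chosen for illustration, and the site's real selection logic may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Measurement:
    """One benchmark result (hypothetical record layout)."""
    provider: str
    model: str
    tokens_per_second: float
    measured_at: datetime


def fastest_recent(
    rows: List[Measurement],
    now: datetime,
    n: int = 5,
    max_age: timedelta = timedelta(hours=24),
) -> List[Measurement]:
    """Return the n fastest rows measured less than max_age before now."""
    recent = [r for r in rows if now - r.measured_at < max_age]
    return sorted(recent, key=lambda r: r.tokens_per_second, reverse=True)[:n]


if __name__ == "__main__":
    now = datetime(2025, 1, 10, 12, 0)  # arbitrary reference time (assumption)
    rows = [
        Measurement("groq", "llama-3.1-8b", 286.0, now - timedelta(hours=1)),
        Measurement("cerebras", "qwen-3-235b-instruct", 163.0, now - timedelta(days=9)),
        Measurement("together", "llama-3.1-8b", 153.0, now - timedelta(hours=1)),
    ]
    # The 9-day-old measurement is excluded by the 24-hour cutoff.
    for rank, r in enumerate(fastest_recent(rows, now), start=1):
        print(rank, r.model, r.provider, f"{r.tokens_per_second:.0f} tok/s")
```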

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 86 of 86 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

| Provider | Model | Status | Last run | Speed (tok/s) | Remaining values (unsplit) |
|---|---|---|---|---|---|
| groq | llama-3.1-8b | Active | 1h ago | 286.00 | 95447120.00 |
| cerebras | qwen-3-32b | Active | 1h ago | 255.00 | 4444300.00 |
| groq | qwen-3-32b | Active | 1h ago | 244.00 | 46391150.00 |
| cerebras | llama-3.1-8b | Active | 1h ago | 225.00 | 6365520.00 |
| cerebras | gpt-oss-120b | Active | 1h ago | 216.00 | 4346690.00 |
| cerebras | llama-3.3-70b | Active | 1h ago | 213.00 | 17338360.00 |
| groq | llama-3.3-70b | Active | 1h ago | 204.00 | 79280120.00 |
| groq | llama-4-scout | Active | 1h ago | 202.00 | 23328240.00 |
| groq | llama-4-maverick | Active | 1h ago | 169.00 | 19310500.00 |
| cerebras | qwen-3-235b-instruct | Active | 9d ago | 163.00 | 2264850.00 |
| together | llama-3.1-8b | Active | 1h ago | 153.00 | 2232590.00 |
| groq | kimi-k2 | Active | 1h ago | 142.00 | 21203230.00 |
| bedrock | nova-micro | Active | 19m ago | 126.00 | 69151260.00 |
| bedrock | llama-4-maverick | Active | 19m ago | 109.00 | 28142250.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 1h ago | 105.00 | 201580.00 |
| bedrock | llama-4-scout | Active | 19m ago | 102.00 | 1140350.00 |
| bedrock | nova-lite | Active | 19m ago | 101.00 | 42135290.00 |
| bedrock | llama-3.3-70b | Active | 19m ago | 101.00 | 15137230.00 |
| together | mistral-7b | Active | 1h ago | 99.90 | 2166470.00 |
| together | qwen-2.5-7b | Active | 1h ago | 96.10 | 3146290.00 |
| bedrock | nova-pro | Active | 19m ago | 85.50 | 10124400.00 |
| google | | Active | 1h ago | 79.90 | 34113510.00 |
| openai | gpt-3.5-turbo | Active | 1h ago | 77.70 | 12129460.00 |
| openai | gpt-4.1-nano | Active | 1h ago | 74.60 | 13138400.00 |
| together | llama-3.1-70b | Active | 1h ago | 72.70 | 7147410.00 |
| together | llama-3.2-3b | Active | 1h ago | 70.90 | 51451170.00 |
| google | claude-3-haiku | Active | 24d ago | 68.10 | 4379520.00 |
| openai | gpt-4o | Active | 1h ago | 63.60 | 71541410.00 |
| fireworks | mixtral-8x22b | Active | 1h ago | 62.40 | 37112550.00 |
| google | gemini-2.0-flash | Active | 8d ago | 60.90 | 1588590.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 1h ago | 60.40 | 5901310.00 |
| google | gemini-2.0-flash-lite | Active | 8d ago | 59.80 | 1180630.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 1h ago | 53.30 | 2880300.00 |
| together | llama-3.3-70b | Active | 1h ago | 52.80 | 21361620.00 |
| together | qwen-2.5-72b | Active | 6d ago | 51.30 | 371510.00 |
| together | mixtral-8x7b | Active | 1h ago | 50.50 | 7110270.00 |
| openai | GPT-5.1-codex-max | Active | 1h ago | 50.40 | 11051660.00 |
| anthropic | claude-haiku-4.5 | Active | 1h ago | 50.20 | 1580660.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 1h ago | 49.10 | 15750.00 |
| openai | gpt-4.1-mini | Active | 1h ago | 47.60 | 1897480.00 |
| bedrock | llama-3.2-90b | Active | 19m ago | 47.50 | 2951340.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 1h ago | 44.80 | 771300.00 |
| bedrock | claude-haiku-4.5 | Active | 19m ago | 44.60 | 463940.00 |
| deepinfra | mistral-7b | Stale (Medium) | 1h ago | 44.50 | 380540.00 |
| bedrock | mistral-large | Active | 19m ago | 43.80 | 847250.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 1h ago | 41.70 | 11021250.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 1h ago | 41.50 | 384510.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 1h ago | 41.50 | 11631580.00 |
| fireworks | llama-3.3-70b | Active | 1h ago | 41.40 | 6841290.00 |
| openai | gpt-4o-mini | Active | 1h ago | 40.90 | 895420.00 |
| together | deepseek-r1 | Active | 1h ago | 37.80 | 1691860.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 1h ago | 37.60 | 193920.00 |
| google | claude-3-5-sonnet | Active | 24d ago | 34.60 | 2642710.00 |
| openai | gpt-4.1 | Active | 1h ago | 33.80 | 767470.00 |
| openai | gpt-4-turbo | Active | 1h ago | 33.40 | 251560.00 |
| bedrock | claude-3-7-sonnet | Active | 19m ago | 33.10 | 642740.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 1h ago | 32.70 | 849400.00 |
| bedrock | claude-3-5-sonnet | Active | 20m ago | 32.60 | 743570.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 1h ago | 32.50 | 246580.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 1h ago | 31.70 | 247860.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 1h ago | 29.60 | 1674710.00 |
| bedrock | claude-3-5-haiku | Active | 20m ago | 29.20 | 138910.00 |
| openai | GPT-5.1 | Active | 1h ago | 28.60 | 3561070.00 |
| together | deepseek-v3 | Active | 19d ago | 28.10 | 1671680.00 |
| openai | GPT-5.2 | Active | 1h ago | 27.40 | 944990.00 |
| openai | GPT-5.1-codex | Active | 1h ago | 26.40 | 3491100.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 1h ago | 26.20 | 190710.00 |
| openai | gpt-4 | Active | 1h ago | 25.90 | 252660.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 1h ago | 25.90 | 383450.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 1h ago | 22.70 | 159910.00 |
| bedrock | claude-sonnet-4.5 | Active | 19m ago | 21.50 | 1291810.00 |
| together | llama-3.1-405b | Active | 6d ago | 21.30 | 1301520.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 1h ago | 21.20 | 143780.00 |
| openai | GPT-5.1-codex-mini | Active | 1h ago | 21.00 | 1451220.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 1h ago | 20.50 | 344580.00 |
| anthropic | claude-4-sonnet | Active | 1h ago | 19.90 | 8301630.00 |
| anthropic | claude-opus-4.5 | Active | 1h ago | 19.70 | 7301740.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 1h ago | 19.20 | 1321780.00 |
| bedrock | claude-3-opus | Active | 19m ago | 19.00 | 422850.00 |
| anthropic | Claude Opus 4.1 | Active | 1h ago | 18.80 | 5251350.00 |
| anthropic | claude-4-opus | Active | 1h ago | 18.20 | 9241170.00 |
| bedrock | claude-opus-4.5 | Active | 19m ago | 18.00 | 1232220.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 1h ago | 16.50 | 1682010.00 |
| openai | | Active | 1h ago | 11.90 | 2202070.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 1d ago | 9.62 | 119170.00 |
| openai | GPT-5.2-pro | Active | 1h ago | 5.62 | 1125940.00 |

📈 Time Series 📈

One time-series chart per model: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-405b, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-2.5-72b, qwen-3-32b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, devstral-small, gemini-2.0-flash, gemini-2.0-flash-lite, gemini-2.5-flash, gemini-2.5-pro, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-7b, qwen-3-235b, qwen-3-235b-instruct.