
☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, since speeds vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
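As a rough illustration of what one such scheduled measurement could look like, here is a minimal sketch. It assumes tokens arrive as a stream and that speed is simply the token count divided by wall-clock time. The names `measure_tokens_per_second`, `summarize`, and `fake_stream` are hypothetical and are not taken from the actual benchmark code; `fake_stream` stands in for a real provider call.

```python
import statistics
import time
from typing import Callable, Iterable


def measure_tokens_per_second(stream: Callable[[], Iterable[str]]) -> float:
    """Time one streamed completion and return tokens per second."""
    start = time.perf_counter()
    token_count = sum(1 for _ in stream())  # consume the whole stream
    elapsed = time.perf_counter() - start
    if token_count == 0 or elapsed <= 0:
        raise ValueError("no tokens were generated")
    return token_count / elapsed


def summarize(speeds: list[float]) -> dict[str, float]:
    """Summarize repeated runs so load-dependent variation is visible."""
    q1, median, q3 = statistics.quantiles(speeds, n=4)
    return {
        "min": min(speeds),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(speeds),
    }


def fake_stream() -> Iterable[str]:
    """Hypothetical stand-in for a provider's streaming API."""
    for i in range(50):
        time.sleep(0.001)  # simulate per-token latency
        yield f"tok{i}"


if __name__ == "__main__":
    # A cron job would run this on a schedule and store each result.
    runs = [measure_tokens_per_second(fake_stream) for _ in range(5)]
    print(summarize(runs))
```

Collecting many runs per model, rather than a single number, is what makes a distribution chart possible: the quartiles show how much a provider's speed moves around under load.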

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any that do not require purchasing dedicated hosting endpoints (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 268 tok/s
2 | qwen-3-32b | cerebras | 251 tok/s
3 | qwen-3-32b | groq | 240 tok/s
4 | gpt-oss-120b | cerebras | 219 tok/s
5 | llama-3.3-70b | cerebras | 218 tok/s
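A ranking like the one above can be sketched as a filter on recency followed by a sort on speed. This is only an illustration: the `Result` record, its field names, and the `fastest` helper are hypothetical, and the 24-hour cutoff is taken from the "updated <24h" label. The sample values are copied from the results on this page.

```python
from dataclasses import dataclass


@dataclass
class Result:
    provider: str
    model: str
    speed_tok_s: float
    age_hours: float  # hours since the last run (hypothetical field)


def fastest(results, limit=5, max_age_hours=24.0):
    """Return the fastest recently updated results, fastest first."""
    fresh = [r for r in results if r.age_hours < max_age_hours]
    return sorted(fresh, key=lambda r: r.speed_tok_s, reverse=True)[:limit]


sample = [
    Result("cerebras", "qwen-3-32b", 251.0, 2),
    Result("groq", "llama-3.1-8b", 268.0, 2),
    Result("together", "deepseek-v3", 30.4, 240),  # last run 10 days ago
    Result("groq", "qwen-3-32b", 240.0, 2),
]

if __name__ == "__main__":
    for rank, r in enumerate(fastest(sample), start=1):
        print(rank, r.model, r.provider, f"{r.speed_tok_s:.0f} tok/s")
```

Filtering before sorting keeps old measurements, such as a run from ten days ago, out of a "right now" list regardless of how fast they were.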

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 83 of 83 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
Provider | Model | Status | Last run | Speed (tok/s) | Other columns (unsplit)
groq | llama-3.1-8b | Active | 2h ago | 268.00 | 95411140.00
cerebras | qwen-3-32b | Active | 2h ago | 251.00 | 4369440.00
groq | qwen-3-32b | Active | 2h ago | 240.00 | 46387160.00
cerebras | gpt-oss-120b | Active | 2h ago | 219.00 | 4346720.00
cerebras | llama-3.3-70b | Active | 2h ago | 218.00 | 14338380.00
cerebras | llama-3.1-8b | Active | 2h ago | 218.00 | 4365620.00
groq | llama-3.3-70b | Active | 2h ago | 205.00 | 79280120.00
groq | llama-4-scout | Active | 2h ago | 191.00 | 23255270.00
groq | llama-4-maverick | Active | 2h ago | 169.00 | 19310510.00
cerebras | qwen-3-235b-instruct | Active | 2h ago | 166.00 | 2266760.00
together | llama-3.1-8b | Active | 2h ago | 147.00 | 2230480.00
groq | kimi-k2 | Active | 2h ago | 144.00 | 21203230.00
bedrock | nova-micro | Active | 3m ago | 129.00 | 69154250.00
together | mistral-7b | Active | 2h ago | 118.00 | 3166320.00
bedrock | llama-4-maverick | Active | 3m ago | 109.00 | 28149250.00
openai | o3 Mini | Never Succeeded (Medium) | 2h ago | 104.00 | 201510.00
together | qwen-2.5-7b | Active | 2h ago | 103.00 | 3146260.00
bedrock | nova-lite | Active | 3m ago | 102.00 | 42135290.00
bedrock | llama-4-scout | Active | 3m ago | 102.00 | 1140350.00
bedrock | llama-3.3-70b | Active | 3m ago | 101.00 | 9136230.00
openai | gpt-4.1-nano | Active | 2h ago | 80.20 | 13138380.00
bedrock | nova-pro | Active | 3m ago | 79.90 | 10124420.00
openai | gpt-3.5-turbo | Active | 2h ago | 78.00 | 12129450.00
together | llama-3.2-3b | Active | 2h ago | 72.10 | 51451300.00
together | llama-3.1-70b | Active | 2h ago | 71.80 | 4147530.00
google | claude-3-haiku | Active | 15d ago | 68.30 | 2782520.00
openai | gpt-4o | Active | 2h ago | 64.20 | 91411390.00
google | gemini-2.0-flash-lite | Active | 2h ago | 61.80 | 1186590.00
google | gemini-2.0-flash | Active | 2h ago | 61.70 | 888580.00
fireworks | mixtral-8x22b | Active | 2h ago | 60.60 | 3783590.00
together | llama-3.3-70b | Active | 2h ago | 51.50 | 21331400.00
anthropic | claude-haiku-4.5 | Active | 2h ago | 50.70 | 1580630.00
openai | o4 Mini | Never Succeeded (Medium) | 2h ago | 50.30 | 22750.00
together | qwen-2.5-72b | Active | 2h ago | 50.10 | 270520.00
together | mixtral-8x7b | Active | 2h ago | 48.60 | 6111430.00
bedrock | llama-3.2-90b | Active | 3m ago | 47.40 | 2351350.00
deepinfra | mixtral-8x22b | Stale (Medium) | 2h ago | 46.90 | 2580320.00
openai | gpt-4.1-mini | Active | 2h ago | 45.90 | 1869460.00
deepinfra | llama-3-8b | Stale (Medium) | 2h ago | 45.90 | 771290.00
bedrock | claude-haiku-4.5 | Active | 3m ago | 45.90 | 865880.00
deepinfra | mistral-7b | Stale (Medium) | 2h ago | 44.90 | 389550.00
bedrock | mistral-large | Active | 3m ago | 44.00 | 847250.00
deepinfra | devstral-small | Never Succeeded (Medium) | 2h ago | 43.20 | 485460.00
fireworks | llama-3.3-70b | Active | 2h ago | 41.90 | 6761320.00
deepinfra | llama-3.2-90b | Stale (Medium) | 2h ago | 39.90 | 193840.00
openai | gpt-4o-mini | Active | 2h ago | 36.70 | 878440.00
together | deepseek-r1 | Active | 2h ago | 36.30 | 1671930.00
google | claude-3-5-sonnet | Active | 15d ago | 34.50 | 2446710.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 2h ago | 34.50 | 1674340.00
openai | gpt-4-turbo | Active | 2h ago | 34.10 | 249540.00
openai | gpt-4.1 | Active | 2h ago | 34.00 | 767460.00
deepinfra | llama-2-70b | Stale (Medium) | 2h ago | 33.80 | 844380.00
deepinfra | llama-3-70b | Stale (Medium) | 2h ago | 33.00 | 244590.00
bedrock | claude-3-7-sonnet | Active | 3m ago | 32.80 | 142800.00
bedrock | claude-3-5-sonnet | Active | 3m ago | 32.50 | 542580.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 2h ago | 31.90 | 246810.00
deepinfra | llama-3.1-8b | Stale (Medium) | 2h ago | 31.30 | 11021270.00
together | deepseek-v3 | Active | 10d ago | 30.40 | 1671260.00
bedrock | claude-3-5-haiku | Active | 3m ago | 29.00 | 1381250.00
openai | GPT-5.2 | Active | 2h ago | 27.70 | 6441000.00
openai | GPT-5.1 | Active | 2h ago | 27.10 | 3561100.00
openai | gpt-4 | Active | 2h ago | 26.10 | 252590.00
openai | GPT-5.1-codex-max | Active | 11h ago | 25.90 | 1701250.00
openai | GPT-5.1-codex | Active | 2h ago | 25.10 | 449710.00
deepinfra | llama-3.2-3b | Stale (Medium) | 2h ago | 25.10 | 590390.00
deepinfra | llama-3.2-1b | Stale (Medium) | 2h ago | 24.90 | 783370.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 2h ago | 23.60 | 159770.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 2h ago | 22.30 | 243530.00
together | llama-3.1-405b | Active | 2h ago | 21.60 | 1301360.00
deepinfra | llama-3.1-70b | Stale (Medium) | 2h ago | 21.50 | 343480.00
bedrock | claude-sonnet-4.5 | Active | 3m ago | 21.40 | 1291840.00
openai | GPT-5.1-codex-mini | Active | 2h ago | 20.00 | 145820.00
anthropic | claude-4-sonnet | Active | 2h ago | 19.90 | 8321680.00
anthropic | claude-opus-4.5 | Active | 5h ago | 19.60 | 11301740.00
bedrock | claude-3-opus | Active | 3m ago | 19.00 | 722840.00
anthropic | Claude Opus 4.1 | Active | 2h ago | 18.70 | 5251350.00
bedrock | claude-opus-4.5 | Active | 3m ago | 18.00 | 1232300.00
anthropic | claude-4-opus | Active | 2h ago | 17.70 | 4241240.00
deepinfra | llama-3.1-405b | Stale (Medium) | 2h ago | 15.80 | 1313530.00
deepinfra | llama-3.2-11b | Stale (Medium) | 2h ago | 12.40 | 1682130.00
google | claude-3-opus | Active | 29d ago | 11.70 | 10132030.00
openai | o1-pro | Likely Deprecated (Medium) | 2h ago | 10.30 | 119130.00
openai | GPT-5.2-pro | Active | 2h ago | 1.85 | 145110.00

📈 Time Series 📈

One time-series chart per model: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-405b, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-2.5-72b, qwen-3-32b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, deepseek-v3, devstral-small, gemini-2.0-flash, gemini-2.0-flash-lite, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-7b, qwen-3-235b, qwen-3-235b-instruct.