☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically test the token generation speed of different cloud LLM providers. The chart visualizes the distribution of measured speeds, which can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
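As a rough illustration of what one such measurement could look like, here is a minimal sketch that times a single streamed completion and converts it to tokens per second. The function name `measure_tokens_per_second` and the `stream_tokens` callable are hypothetical stand-ins, not the code behind this page, and whether the real benchmark includes time to first token in the elapsed time is an assumption.

```python
import time
from typing import Callable, Iterable


def measure_tokens_per_second(
    stream_tokens: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one streamed completion and return tokens generated per second.

    `stream_tokens` is a hypothetical callable that yields tokens from a
    provider; the timer starts before the request, so time to first token
    is included (an assumption about how speed is defined).
    """
    start = clock()
    token_count = sum(1 for _ in stream_tokens())  # consume the whole stream
    elapsed = clock() - start
    if token_count == 0 or elapsed <= 0:
        raise ValueError("no tokens were generated or no time elapsed")
    return token_count / elapsed
```

A cron job could call a function like this once per provider and model, then append the result to a log so that the spread of speeds over time can be plotted.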

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, and I consider any provider that does not require purchasing a dedicated endpoint for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

| # | Model | Provider | Speed |
|---|---|---|---|
| 1 | llama-3.1-8b | groq | 292 tok/s |
| 2 | qwen-3-32b | cerebras | 253 tok/s |
| 3 | qwen-3-32b | groq | 248 tok/s |
| 4 | llama-3.1-8b | cerebras | 227 tok/s |
| 5 | gpt-oss-120b | cerebras | 211 tok/s |
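The ranking above can be thought of as a filter plus a sort. The sketch below is one way to express that, assuming "updated <24h" means the latest result is less than 24 hours old; the `Result` class and `fastest_recent` function are hypothetical names, and the sample rows are taken from the results table on this page.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Result:
    provider: str
    model: str
    tokens_per_second: float
    age: timedelta  # time since the result was last updated (assumed meaning)


def fastest_recent(results, max_age=timedelta(hours=24), limit=5):
    """Return the fastest results updated within `max_age`, fastest first."""
    fresh = [r for r in results if r.age < max_age]
    return sorted(fresh, key=lambda r: r.tokens_per_second, reverse=True)[:limit]


results = [
    Result("cerebras", "qwen-3-235b-instruct", 161.0, timedelta(days=11)),
    Result("groq", "llama-3.1-8b", 292.0, timedelta(hours=2)),
    Result("together", "llama-3.1-8b", 151.0, timedelta(hours=2)),
    Result("cerebras", "qwen-3-32b", 253.0, timedelta(hours=2)),
]

for rank, r in enumerate(fastest_recent(results, limit=3), start=1):
    print(rank, r.model, r.provider, f"{r.tokens_per_second:.0f} tok/s")
```

In this sketch the 11-day-old cerebras qwen-3-235b-instruct result is skipped even though it is faster than the together result, because it falls outside the 24-hour window.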

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 86 of 86 models

Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

| Provider | Model | Status | Updated | Speed (tok/s) | Other metrics (unsplit) |
|---|---|---|---|---|---|
| groq | llama-3.1-8b | Active | 2h ago | 292.00 | 95447120.00 |
| cerebras | qwen-3-32b | Active | 2h ago | 253.00 | 4444300.00 |
| groq | qwen-3-32b | Active | 2h ago | 248.00 | 46391140.00 |
| cerebras | llama-3.1-8b | Active | 2h ago | 227.00 | 6365470.00 |
| cerebras | gpt-oss-120b | Active | 2h ago | 211.00 | 4346720.00 |
| cerebras | llama-3.3-70b | Active | 5h ago | 211.00 | 17338350.00 |
| groq | llama-4-scout | Active | 2h ago | 207.00 | 23328220.00 |
| groq | llama-3.3-70b | Active | 2h ago | 206.00 | 94291120.00 |
| groq | llama-4-maverick | Active | 2h ago | 172.00 | 19310480.00 |
| cerebras | qwen-3-235b-instruct | Active | 11d ago | 161.00 | 2264900.00 |
| together | llama-3.1-8b | Active | 2h ago | 151.00 | 2232550.00 |
| groq | kimi-k2 | Active | 2h ago | 144.00 | 21203230.00 |
| bedrock | nova-micro | Active | 42m ago | 126.00 | 69151260.00 |
| bedrock | llama-4-maverick | Active | 42m ago | 109.00 | 39142250.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 2h ago | 106.00 | 201590.00 |
| bedrock | llama-4-scout | Active | 42m ago | 102.00 | 1140350.00 |
| bedrock | nova-lite | Active | 42m ago | 101.00 | 44135290.00 |
| bedrock | llama-3.3-70b | Active | 42m ago | 101.00 | 15137240.00 |
| together | mistral-7b | Active | 2h ago | 95.20 | 2166510.00 |
| together | qwen-2.5-7b | Active | 2h ago | 93.40 | 3146300.00 |
| bedrock | nova-pro | Active | 42m ago | 87.90 | 10124380.00 |
| google | | Active | 2h ago | 82.00 | 34132490.00 |
| openai | gpt-3.5-turbo | Active | 2h ago | 78.80 | 12125440.00 |
| openai | gpt-4.1-nano | Active | 2h ago | 74.50 | 13138410.00 |
| together | llama-3.1-70b | Active | 2h ago | 73.00 | 7147400.00 |
| together | llama-3.2-3b | Active | 2h ago | 70.30 | 51451170.00 |
| google | claude-3-haiku | Active | 27d ago | 69.60 | 6279490.00 |
| openai | gpt-4o | Active | 2h ago | 64.90 | 71541390.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 2h ago | 64.60 | 51001180.00 |
| fireworks | mixtral-8x22b | Active | 2h ago | 63.10 | 37112540.00 |
| google | gemini-2.0-flash | Active | 11d ago | 61.20 | 1588580.00 |
| google | gemini-2.0-flash-lite | Active | 11d ago | 59.10 | 1180660.00 |
| openai | GPT-5.1-codex-max | Active | 2h ago | 57.90 | 11061780.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 2h ago | 54.60 | 2980310.00 |
| together | llama-3.3-70b | Active | 2h ago | 53.70 | 21361490.00 |
| together | qwen-2.5-72b | Active | 8d ago | 52.80 | 471390.00 |
| together | mixtral-8x7b | Active | 2h ago | 52.30 | 13110210.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 2h ago | 49.80 | 15750.00 |
| anthropic | claude-haiku-4.5 | Active | 2h ago | 49.60 | 1580670.00 |
| openai | gpt-4.1-mini | Active | 2h ago | 48.30 | 1597490.00 |
| bedrock | llama-3.2-90b | Active | 42m ago | 47.50 | 2951340.00 |
| deepinfra | mistral-7b | Stale (Medium) | 2h ago | 47.40 | 3124550.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 2h ago | 45.00 | 11021140.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 2h ago | 45.00 | 771300.00 |
| bedrock | claude-haiku-4.5 | Active | 42m ago | 44.70 | 463930.00 |
| bedrock | mistral-large | Active | 42m ago | 43.80 | 747250.00 |
| fireworks | llama-3.3-70b | Active | 2h ago | 43.30 | 7841190.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 2h ago | 43.30 | 3114530.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 2h ago | 42.60 | 11721520.00 |
| openai | gpt-4o-mini | Active | 2h ago | 41.80 | 895410.00 |
| together | deepseek-r1 | Active | 2h ago | 38.10 | 1691860.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 2h ago | 37.70 | 293930.00 |
| openai | gpt-4.1 | Active | 2h ago | 34.30 | 670470.00 |
| google | claude-3-5-sonnet | Active | 27d ago | 34.30 | 2642720.00 |
| bedrock | claude-3-7-sonnet | Active | 42m ago | 33.50 | 644730.00 |
| openai | gpt-4-turbo | Active | 2h ago | 33.30 | 251570.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 2h ago | 33.00 | 448510.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 2h ago | 33.00 | 849420.00 |
| bedrock | claude-3-5-sonnet | Active | 42m ago | 32.50 | 643580.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 2h ago | 31.90 | 247690.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 2h ago | 29.80 | 1673960.00 |
| bedrock | claude-3-5-haiku | Active | 42m ago | 29.30 | 138900.00 |
| openai | GPT-5.1 | Active | 2h ago | 28.80 | 3561060.00 |
| together | deepseek-v3 | Active | 22d ago | 28.00 | 1672030.00 |
| openai | GPT-5.2 | Active | 2h ago | 27.10 | 942980.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 2h ago | 26.70 | 190730.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 2h ago | 26.60 | 290530.00 |
| openai | gpt-4 | Active | 2h ago | 26.40 | 252660.00 |
| openai | GPT-5.1-codex | Active | 2h ago | 26.10 | 3491220.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 2h ago | 22.30 | 1591040.00 |
| bedrock | claude-sonnet-4.5 | Active | 42m ago | 21.90 | 2291680.00 |
| together | llama-3.1-405b | Active | 9d ago | 21.50 | 2301280.00 |
| openai | GPT-5.1-codex-mini | Active | 5h ago | 21.30 | 1531330.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 2h ago | 21.00 | 346590.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 2h ago | 20.30 | 1341620.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 5h ago | 20.20 | 1431490.00 |
| anthropic | claude-opus-4.5 | Active | 2h ago | 19.90 | 7301710.00 |
| anthropic | claude-4-sonnet | Active | 2h ago | 19.60 | 1301940.00 |
| bedrock | claude-3-opus | Active | 42m ago | 19.00 | 422850.00 |
| anthropic | Claude Opus 4.1 | Active | 2h ago | 18.70 | 5251380.00 |
| anthropic | claude-4-opus | Active | 2h ago | 18.10 | 9241190.00 |
| bedrock | claude-opus-4.5 | Active | 42m ago | 18.00 | 3232090.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 2h ago | 17.80 | 1681890.00 |
| openai | | Active | 2h ago | 11.80 | 2202100.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 2h ago | 9.75 | 119150.00 |
| openai | GPT-5.2-pro | Active | 2h ago | 6.16 | 1136130.00 |

📈 Time Series 📈

Charts (one per model): llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-405b, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-2.5-72b, qwen-3-32b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, devstral-small, gemini-2.0-flash, gemini-2.0-flash-lite, gemini-2.5-flash, gemini-2.5-pro, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-7b, qwen-3-235b, qwen-3-235b-instruct.