
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart shows the distribution of measured speeds for each model, since throughput varies with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
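As a rough illustration of what one such measurement might look like, here is a minimal sketch that times a token stream and summarizes several runs. It is not the benchmark's actual code: `measure_tokens_per_second`, `summarize`, and `fake_stream` are hypothetical names, and the fake stream stands in for a real provider's streaming response.

```python
import statistics
import time
from typing import Callable, Iterable, Iterator


def measure_tokens_per_second(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume a token stream and return tokens generated per second."""
    start = clock()
    count = 0
    for _ in stream:
        count += 1
    elapsed = clock() - start
    if count == 0 or elapsed <= 0:
        return 0.0
    return count / elapsed


def summarize(speeds: list[float]) -> dict[str, float]:
    """Reduce repeated speed measurements to min, median, and max."""
    if not speeds:
        raise ValueError("need at least one measurement")
    return {
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
    }


def fake_stream(n_tokens: int, delay_s: float) -> Iterator[str]:
    """Hypothetical stand-in for a provider's streaming API."""
    for i in range(n_tokens):
        time.sleep(delay_s)
        yield f"tok{i}"


if __name__ == "__main__":
    # Three short runs against the fake stream; real runs would be
    # scheduled (for example by cron) and spread over time.
    runs = [measure_tokens_per_second(fake_stream(50, 0.002)) for _ in range(3)]
    print(summarize(runs))
```

Collecting many such runs per provider and model is what makes a distribution chart meaningful, since a single run only captures the load at one moment.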

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any service that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

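| # | Model | Provider | Speed |
|---|-------|----------|-------|
| 1 | llama-3.1-8b | groq | 290 tok/s |
| 2 | qwen-3-32b | cerebras | 253 tok/s |
| 3 | qwen-3-32b | groq | 247 tok/s |
| 4 | llama-3.1-8b | cerebras | 226 tok/s |
| 5 | llama-3.3-70b | cerebras | 213 tok/s |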

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 86 of 86 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

| Provider | Model | Status | Updated | Speed (tok/s) | Other columns (fused) |
|----------|-------|--------|---------|---------------|-----------------------|
| groq | llama-3.1-8b | Active | 3h ago | 290.00 | 95447120.00 |
| cerebras | qwen-3-32b | Active | 3h ago | 253.00 | 4444300.00 |
| groq | qwen-3-32b | Active | 3h ago | 247.00 | 46391140.00 |
| cerebras | llama-3.1-8b | Active | 3h ago | 226.00 | 6365480.00 |
| cerebras | llama-3.3-70b | Active | 3h ago | 213.00 | 17338350.00 |
| cerebras | gpt-oss-120b | Active | 3h ago | 212.00 | 4346720.00 |
| groq | llama-3.3-70b | Active | 3h ago | 205.00 | 79291120.00 |
| groq | llama-4-scout | Active | 3h ago | 205.00 | 23328230.00 |
| groq | llama-4-maverick | Active | 3h ago | 168.00 | 19310490.00 |
| cerebras | qwen-3-235b-instruct | Active | 10d ago | 165.00 | 2264870.00 |
| together | llama-3.1-8b | Active | 3h ago | 150.00 | 2232550.00 |
| groq | kimi-k2 | Active | 3h ago | 143.00 | 21203230.00 |
| bedrock | nova-micro | Active | 16m ago | 126.00 | 69151260.00 |
| bedrock | llama-4-maverick | Active | 15m ago | 109.00 | 28142250.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 3h ago | 106.00 | 201590.00 |
| bedrock | llama-4-scout | Active | 15m ago | 102.00 | 1140350.00 |
| bedrock | nova-lite | Active | 16m ago | 101.00 | 42135290.00 |
| bedrock | llama-3.3-70b | Active | 16m ago | 101.00 | 15137240.00 |
| together | mistral-7b | Active | 3h ago | 96.80 | 2166470.00 |
| together | qwen-2.5-7b | Active | 3h ago | 94.40 | 3146300.00 |
| bedrock | nova-pro | Active | 16m ago | 87.10 | 10124390.00 |
| google | | Active | 3h ago | 81.50 | 34113500.00 |
| openai | gpt-3.5-turbo | Active | 3h ago | 78.20 | 12129450.00 |
| openai | gpt-4.1-nano | Active | 3h ago | 74.20 | 13138410.00 |
| together | llama-3.1-70b | Active | 3h ago | 72.80 | 7147410.00 |
| together | llama-3.2-3b | Active | 3h ago | 70.00 | 51451170.00 |
| google | claude-3-haiku | Active | 26d ago | 69.10 | 6079500.00 |
| openai | gpt-4o | Active | 3h ago | 64.30 | 71541400.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 3h ago | 62.70 | 5901230.00 |
| fireworks | mixtral-8x22b | Active | 3h ago | 62.60 | 37112550.00 |
| google | gemini-2.0-flash | Active | 10d ago | 61.20 | 1588580.00 |
| google | gemini-2.0-flash-lite | Active | 10d ago | 59.30 | 1180650.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 3h ago | 54.40 | 2980310.00 |
| openai | GPT-5.1-codex-max | Active | 3h ago | 54.10 | 11061690.00 |
| together | llama-3.3-70b | Active | 3h ago | 53.90 | 21361500.00 |
| together | qwen-2.5-72b | Active | 7d ago | 52.90 | 471380.00 |
| together | mixtral-8x7b | Active | 3h ago | 51.80 | 13110210.00 |
| anthropic | claude-haiku-4.5 | Active | 3h ago | 49.60 | 1580670.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 3h ago | 49.60 | 15750.00 |
| openai | gpt-4.1-mini | Active | 3h ago | 47.90 | 1597490.00 |
| bedrock | llama-3.2-90b | Active | 16m ago | 47.50 | 2951340.00 |
| deepinfra | mistral-7b | Stale (Medium) | 3h ago | 46.40 | 3122540.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 3h ago | 44.80 | 771300.00 |
| bedrock | claude-haiku-4.5 | Active | 16m ago | 44.50 | 463950.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 3h ago | 43.90 | 11021240.00 |
| bedrock | mistral-large | Active | 15m ago | 43.80 | 747250.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 3h ago | 43.30 | 3114510.00 |
| fireworks | llama-3.3-70b | Active | 3h ago | 42.40 | 7841200.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 3h ago | 41.70 | 11631550.00 |
| openai | gpt-4o-mini | Active | 3h ago | 41.40 | 895410.00 |
| together | deepseek-r1 | Active | 3h ago | 38.30 | 1691850.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 3h ago | 37.40 | 193970.00 |
| google | claude-3-5-sonnet | Active | 26d ago | 34.50 | 2642720.00 |
| openai | gpt-4.1 | Active | 3h ago | 33.80 | 667470.00 |
| bedrock | claude-3-7-sonnet | Active | 16m ago | 33.40 | 644730.00 |
| openai | gpt-4-turbo | Active | 3h ago | 33.30 | 251560.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 3h ago | 33.10 | 448510.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 3h ago | 33.00 | 849420.00 |
| bedrock | claude-3-5-sonnet | Active | 16m ago | 32.50 | 643580.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 3h ago | 31.70 | 247770.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 3h ago | 29.40 | 1674220.00 |
| bedrock | claude-3-5-haiku | Active | 16m ago | 29.30 | 138900.00 |
| openai | GPT-5.1 | Active | 3h ago | 28.80 | 3561060.00 |
| together | deepseek-v3 | Active | 21d ago | 28.50 | 1671870.00 |
| openai | GPT-5.2 | Active | 3h ago | 27.20 | 942990.00 |
| openai | gpt-4 | Active | 3h ago | 26.30 | 252650.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 3h ago | 26.20 | 190740.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 3h ago | 26.10 | 290530.00 |
| openai | GPT-5.1-codex | Active | 3h ago | 26.00 | 3491180.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 3h ago | 22.20 | 159980.00 |
| bedrock | claude-sonnet-4.5 | Active | 16m ago | 21.80 | 2291700.00 |
| together | llama-3.1-405b | Active | 8d ago | 21.40 | 1301440.00 |
| openai | GPT-5.1-codex-mini | Active | 3h ago | 21.10 | 1451300.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 3h ago | 20.90 | 346580.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 3h ago | 20.80 | 143940.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 3h ago | 19.80 | 1341640.00 |
| anthropic | claude-opus-4.5 | Active | 3h ago | 19.80 | 7301720.00 |
| anthropic | claude-4-sonnet | Active | 3h ago | 19.60 | 1301950.00 |
| bedrock | claude-3-opus | Active | 16m ago | 19.00 | 422850.00 |
| anthropic | Claude Opus 4.1 | Active | 3h ago | 18.80 | 5251360.00 |
| anthropic | claude-4-opus | Active | 3h ago | 18.10 | 9241190.00 |
| bedrock | claude-opus-4.5 | Active | 16m ago | 18.10 | 3232090.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 3h ago | 17.40 | 1681940.00 |
| openai | | Active | 3h ago | 11.90 | 2202080.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 3h ago | 9.68 | 119160.00 |
| openai | GPT-5.2-pro | Active | 3h ago | 6.00 | 1126110.00 |

📈 Time Series 📈

[Time series charts, one panel per model: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-405b, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-2.5-72b, qwen-3-32b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, devstral-small, gemini-2.0-flash, gemini-2.0-flash-lite, gemini-2.5-flash, gemini-2.5-pro, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-7b, qwen-3-235b, qwen-3-235b-instruct]