
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, since speeds can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
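To make the measurement concrete, here is a minimal sketch of how a single run could be timed and how repeated runs could be summarized. It is not the code behind this site: `stream_tokens` is a hypothetical callable standing in for a provider SDK call that yields tokens as they arrive, and the summary fields are my own choice.

```python
import statistics
import time
from typing import Callable, Iterable


def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Generation speed for one run: tokens produced divided by wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return token_count / elapsed_seconds


def measure(stream_tokens: Callable[[], Iterable[str]]) -> float:
    """Time one streamed completion.

    `stream_tokens` is a hypothetical stand-in for a real provider call;
    it only needs to yield one item per generated token.
    """
    start = time.perf_counter()
    count = sum(1 for _ in stream_tokens())
    elapsed = time.perf_counter() - start
    return tokens_per_second(count, elapsed)


def summarize(speeds: list[float]) -> dict:
    """Summarize repeated runs so load-dependent variation is visible."""
    if not speeds:
        raise ValueError("need at least one measurement")
    ordered = sorted(speeds)
    return {
        "runs": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }
```

A cron entry would call something like `measure` once per provider and model, append the result to storage, and a later step would run `summarize` over the stored values for each model.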

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, covering any service that does not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

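| # | Model | Provider | Speed |
|---|-------|----------|-------|
| 1 | llama-3.1-8b | groq | 268 tok/s |
| 2 | qwen-3-32b | cerebras | 251 tok/s |
| 3 | qwen-3-32b | groq | 239 tok/s |
| 4 | gpt-oss-120b | cerebras | 220 tok/s |
| 5 | llama-3.3-70b | cerebras | 220 tok/s |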
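A leaderboard like the one above can be derived by keeping only results measured within the last 24 hours and sorting by speed. The sketch below assumes a simple `Result` record and a `fastest` helper; both names are hypothetical and only illustrate the idea, not how this site actually computes the list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Result:
    """One benchmark result (hypothetical record shape)."""
    provider: str
    model: str
    tokens_per_second: float
    measured_at: datetime


def fastest(
    results: list[Result],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),
    limit: int = 5,
) -> list[Result]:
    """Return the `limit` fastest results that are newer than `max_age`."""
    fresh = [r for r in results if now - r.measured_at < max_age]
    # sorted() is stable, so ties keep their input order.
    return sorted(fresh, key=lambda r: r.tokens_per_second, reverse=True)[:limit]
```

With rows taken from the results table, a result last measured 14 days ago would be dropped before ranking, while one measured an hour ago stays in.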

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 84 of 84 models. Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled.
| Provider | Model | Status | Last run | Speed (tok/s) | Other metrics (unseparated in source) |
|----------|-------|--------|----------|---------------|----------------------------------------|
| groq | llama-3.1-8b | Active | 1h ago | 268.00 | 95411140.00 |
| cerebras | qwen-3-32b | Active | 1h ago | 251.00 | 4369440.00 |
| groq | qwen-3-32b | Active | 1h ago | 239.00 | 46387160.00 |
| cerebras | gpt-oss-120b | Active | 1h ago | 220.00 | 4346700.00 |
| cerebras | llama-3.3-70b | Active | 1h ago | 220.00 | 14338370.00 |
| cerebras | llama-3.1-8b | Active | 1h ago | 218.00 | 4365610.00 |
| groq | llama-3.3-70b | Active | 1h ago | 205.00 | 79280120.00 |
| groq | llama-4-scout | Active | 1h ago | 190.00 | 23255270.00 |
| groq | llama-4-maverick | Active | 1h ago | 170.00 | 19310500.00 |
| cerebras | qwen-3-235b-instruct | Active | 1h ago | 168.00 | 2266740.00 |
| groq | kimi-k2 | Active | 1h ago | 147.00 | 21203210.00 |
| together | llama-3.1-8b | Active | 1h ago | 146.00 | 2230480.00 |
| bedrock | nova-micro | Active | 45m ago | 129.00 | 69154250.00 |
| together | mistral-7b | Active | 1h ago | 119.00 | 3166330.00 |
| bedrock | llama-4-maverick | Active | 45m ago | 109.00 | 28149250.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 1h ago | 105.00 | 201510.00 |
| together | qwen-2.5-7b | Active | 1h ago | 103.00 | 3146260.00 |
| bedrock | nova-lite | Active | 45m ago | 102.00 | 42135290.00 |
| bedrock | llama-4-scout | Active | 45m ago | 102.00 | 1140350.00 |
| bedrock | llama-3.3-70b | Active | 45m ago | 101.00 | 9136230.00 |
| openai | gpt-4.1-nano | Active | 1h ago | 80.60 | 13138380.00 |
| bedrock | nova-pro | Active | 45m ago | 79.50 | 10124420.00 |
| openai | gpt-3.5-turbo | Active | 1h ago | 77.70 | 12129460.00 |
| together | llama-3.2-3b | Active | 1h ago | 72.80 | 51451280.00 |
| together | llama-3.1-70b | Active | 1h ago | 71.60 | 4147530.00 |
| google | claude-3-haiku | Active | 14d ago | 68.40 | 2782520.00 |
| openai | gpt-4o | Active | 1h ago | 64.60 | 91511390.00 |
| google | gemini-2.0-flash-lite | Active | 1h ago | 62.30 | 1386560.00 |
| google | gemini-2.0-flash | Active | 1h ago | 61.90 | 888570.00 |
| fireworks | mixtral-8x22b | Active | 1h ago | 60.20 | 3774600.00 |
| together | llama-3.3-70b | Active | 1h ago | 52.00 | 21331410.00 |
| anthropic | claude-haiku-4.5 | Active | 1h ago | 50.90 | 1580620.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 1h ago | 50.70 | 22750.00 |
| together | qwen-2.5-72b | Active | 1h ago | 50.10 | 270520.00 |
| together | mixtral-8x7b | Active | 1h ago | 48.40 | 6111450.00 |
| bedrock | llama-3.2-90b | Active | 45m ago | 47.40 | 2351350.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 1h ago | 46.60 | 2580320.00 |
| bedrock | claude-haiku-4.5 | Active | 45m ago | 46.20 | 865860.00 |
| openai | gpt-4.1-mini | Active | 1h ago | 46.10 | 1869450.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 1h ago | 45.90 | 771290.00 |
| deepinfra | mistral-7b | Stale (Medium) | 1h ago | 44.70 | 389570.00 |
| bedrock | mistral-large | Active | 45m ago | 44.10 | 947250.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 1h ago | 43.20 | 485460.00 |
| fireworks | llama-3.3-70b | Active | 1h ago | 42.10 | 6761310.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 4h ago | 40.50 | 193850.00 |
| openai | gpt-4o-mini | Active | 1h ago | 36.70 | 878440.00 |
| together | deepseek-r1 | Active | 1h ago | 36.10 | 1661940.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 1h ago | 35.10 | 1674220.00 |
| google | claude-3-5-sonnet | Active | 14d ago | 34.50 | 2446710.00 |
| openai | gpt-4.1 | Active | 1h ago | 34.30 | 773460.00 |
| openai | gpt-4-turbo | Active | 1h ago | 34.20 | 249540.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 1h ago | 33.50 | 844380.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 1h ago | 32.90 | 244590.00 |
| bedrock | claude-3-7-sonnet | Active | 45m ago | 32.80 | 142800.00 |
| bedrock | claude-3-5-sonnet | Active | 45m ago | 32.50 | 542580.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 1h ago | 31.90 | 246810.00 |
| together | deepseek-v3 | Active | 10d ago | 30.50 | 1671250.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 1h ago | 30.20 | 11021280.00 |
| bedrock | claude-3-5-haiku | Active | 45m ago | 29.00 | 1381260.00 |
| openai | GPT-5.2 | Active | 1h ago | 27.90 | 644990.00 |
| openai | GPT-5.1 | Active | 1h ago | 27.50 | 3561060.00 |
| openai | gpt-4 | Active | 1h ago | 26.30 | 752590.00 |
| openai | GPT-5.1-codex-max | Active | 22h ago | 25.70 | 1701200.00 |
| openai | GPT-5.1-codex | Active | 1h ago | 25.10 | 449680.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 1h ago | 25.00 | 590390.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 1h ago | 24.80 | 783370.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 1h ago | 23.60 | 159760.00 |
| anthropic | claude-3-opus | Active | 29d ago | 23.10 | 2324730.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 1h ago | 22.30 | 243520.00 |
| together | llama-3.1-405b | Active | 1h ago | 21.70 | 1301340.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 1h ago | 21.50 | 343450.00 |
| bedrock | claude-sonnet-4.5 | Active | 45m ago | 21.50 | 1291840.00 |
| openai | GPT-5.1-codex-mini | Active | 1h ago | 20.30 | 245790.00 |
| anthropic | claude-4-sonnet | Active | 1h ago | 20.10 | 8321640.00 |
| anthropic | claude-opus-4.5 | Active | 1h ago | 19.60 | 11301720.00 |
| bedrock | claude-3-opus | Active | 45m ago | 19.00 | 722840.00 |
| anthropic | Claude Opus 4.1 | Active | 1h ago | 18.70 | 5251340.00 |
| bedrock | claude-opus-4.5 | Active | 45m ago | 18.00 | 1232310.00 |
| anthropic | claude-4-opus | Active | 1h ago | 17.80 | 4241230.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 1h ago | 15.70 | 1313670.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 1h ago | 12.10 | 1682150.00 |
| google | claude-3-opus | Active | 28d ago | 12.00 | 10132070.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 1h ago | 10.50 | 119130.00 |
| openai | GPT-5.2-pro | Active | 4h ago | 1.83 | 145060.00 |
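The status labels in the table (for example "Never Succeeded (Medium)") can be mapped back to the flagged status names listed above the table. The sketch below is an assumption about how that mapping could work, not this site's logic: the helper names are hypothetical, and the parenthetical such as "(Medium)" is simply dropped because the page does not define it.

```python
import re

# Flagged statuses exactly as listed above the results table.
FLAGGED_STATUSES = {
    "likely_deprecated",
    "deprecated",
    "failing",
    "stale",
    "never_succeeded",
    "disabled",
}


def normalize_status(label: str) -> str:
    """Turn a display label like 'Never Succeeded (Medium)' into 'never_succeeded'."""
    # Drop a trailing parenthetical; its meaning is not defined on this page.
    base = re.sub(r"\s*\([^)]*\)\s*$", "", label).strip()
    return re.sub(r"\s+", "_", base).lower()


def is_flagged(label: str) -> bool:
    """True when the row's status is one of the flagged statuses."""
    return normalize_status(label) in FLAGGED_STATUSES
```

Filtering the table with `is_flagged` would separate rows such as the Stale deepinfra entries from the Active ones before comparing speeds.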

📈 Time Series 📈

One chart per model: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-405b, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-2.5-72b, qwen-3-32b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, deepseek-v3, devstral-small, gemini-2.0-flash, gemini-2.0-flash-lite, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-7b, qwen-3-235b, qwen-3-235b-instruct.