
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, since throughput varies with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
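As a rough illustration of what one such measurement could look like, here is a minimal sketch that times a streamed completion and summarizes repeated runs. The actual benchmark code is not shown on this page, so `measure_tokens_per_second`, `summarize`, and `fake_stream` are hypothetical names, and the fake stream stands in for a real provider call.

```python
import statistics
import time
from typing import Callable, Iterable


def measure_tokens_per_second(
    stream_tokens: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one streamed completion and return generated tokens per second."""
    start = clock()
    # Consume the whole stream, counting tokens as they arrive.
    token_count = sum(1 for _ in stream_tokens())
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return token_count / elapsed


def summarize(speeds: list[float]) -> dict[str, float]:
    """Summarize repeated runs; the spread is what a distribution chart shows."""
    return {
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
    }


def fake_stream() -> Iterable[str]:
    # Hypothetical stand-in for a provider's streaming response.
    for token in ["Hello", ",", " world", "!"]:
        yield token
```

A cron entry would call something like `measure_tokens_per_second` once per provider and model, then append the result to a store so that `summarize` (or a chart) can show how the speed varies over time.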

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any service that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

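| # | Model | Provider | Speed |
|---|-------|----------|-------|
| 1 | llama-3.1-8b | groq | 268 tok/s |
| 2 | qwen-3-32b | cerebras | 251 tok/s |
| 3 | qwen-3-32b | groq | 239 tok/s |
| 4 | gpt-oss-120b | cerebras | 221 tok/s |
| 5 | llama-3.3-70b | cerebras | 221 tok/s |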
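A ranking like the one above can be derived from the full results by keeping only rows updated within the last 24 hours and sorting by speed. The sketch below assumes rows shaped as `(provider, model, age, speed)` with ages written like `34m ago`; `parse_age` and `fastest_recent` are hypothetical helpers, and the site may apply additional filters (such as status) that are not shown here.

```python
import re
from datetime import timedelta

# Maps the unit suffix in strings like "34m ago" to a timedelta keyword.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}

# A few rows taken from the results table: (provider, model, age, tok/s).
SAMPLE_ROWS = [
    ("google", "claude-3-haiku", "14d ago", 68.4),
    ("cerebras", "qwen-3-32b", "35m ago", 251.0),
    ("openai", "GPT-5.2-pro", "6h ago", 1.83),
    ("groq", "llama-3.1-8b", "34m ago", 268.0),
]


def parse_age(text: str) -> timedelta:
    """Convert a relative age such as '34m ago' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", text.strip())
    if match is None:
        raise ValueError(f"unrecognized age: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def fastest_recent(rows, limit=5, max_age=timedelta(hours=24)):
    """Return the fastest rows whose last update is newer than max_age."""
    fresh = [row for row in rows if parse_age(row[2]) < max_age]
    return sorted(fresh, key=lambda row: row[3], reverse=True)[:limit]
```

With the sample rows, `fastest_recent(SAMPLE_ROWS, limit=2)` keeps the groq and cerebras entries and drops the 14-day-old google entry.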

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 84 of 84 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

| Provider | Model | Status | Last run | Speed (tok/s) | Other metrics (unsplit) |
|----------|-------|--------|----------|---------------|-------------------------|
| groq | llama-3.1-8b | Active | 34m ago | 268.00 | 95411140.00 |
| cerebras | qwen-3-32b | Active | 35m ago | 251.00 | 4369440.00 |
| groq | qwen-3-32b | Active | 34m ago | 239.00 | 46387160.00 |
| cerebras | gpt-oss-120b | Active | 35m ago | 221.00 | 4346700.00 |
| cerebras | llama-3.3-70b | Active | 35m ago | 221.00 | 14338370.00 |
| cerebras | llama-3.1-8b | Active | 35m ago | 218.00 | 4365610.00 |
| groq | llama-3.3-70b | Active | 34m ago | 205.00 | 79280120.00 |
| groq | llama-4-scout | Active | 34m ago | 190.00 | 23255270.00 |
| groq | llama-4-maverick | Active | 34m ago | 170.00 | 19310500.00 |
| cerebras | qwen-3-235b-instruct | Active | 35m ago | 167.00 | 2266750.00 |
| together | llama-3.1-8b | Active | 31m ago | 146.00 | 2230480.00 |
| groq | kimi-k2 | Active | 34m ago | 146.00 | 21203220.00 |
| bedrock | nova-micro | Active | 21m ago | 129.00 | 69154250.00 |
| together | mistral-7b | Active | 32m ago | 119.00 | 3166330.00 |
| bedrock | llama-4-maverick | Active | 21m ago | 109.00 | 28149250.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 34m ago | 105.00 | 201510.00 |
| together | qwen-2.5-7b | Active | 32m ago | 103.00 | 3146260.00 |
| bedrock | nova-lite | Active | 21m ago | 102.00 | 42135290.00 |
| bedrock | llama-4-scout | Active | 21m ago | 102.00 | 1140350.00 |
| bedrock | llama-3.3-70b | Active | 21m ago | 101.00 | 9136230.00 |
| openai | gpt-4.1-nano | Active | 34m ago | 80.60 | 13138380.00 |
| bedrock | nova-pro | Active | 21m ago | 79.60 | 10124420.00 |
| openai | gpt-3.5-turbo | Active | 32m ago | 77.80 | 12129460.00 |
| together | llama-3.2-3b | Active | 31m ago | 72.30 | 51451300.00 |
| together | llama-3.1-70b | Active | 31m ago | 71.60 | 4147530.00 |
| google | claude-3-haiku | Active | 14d ago | 68.40 | 2782520.00 |
| openai | gpt-4o | Active | 32m ago | 64.50 | 91511390.00 |
| google | gemini-2.0-flash-lite | Active | 31m ago | 62.20 | 1386560.00 |
| google | gemini-2.0-flash | Active | 31m ago | 61.90 | 888570.00 |
| fireworks | mixtral-8x22b | Active | 34m ago | 60.30 | 3774590.00 |
| together | llama-3.3-70b | Active | 32m ago | 51.70 | 21331410.00 |
| anthropic | claude-haiku-4.5 | Active | 35m ago | 50.80 | 1580630.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 34m ago | 50.60 | 22750.00 |
| together | qwen-2.5-72b | Active | 32m ago | 50.10 | 270520.00 |
| together | mixtral-8x7b | Active | 32m ago | 48.40 | 6111450.00 |
| bedrock | llama-3.2-90b | Active | 21m ago | 47.40 | 2351350.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 35m ago | 46.70 | 2580320.00 |
| openai | gpt-4.1-mini | Active | 34m ago | 46.10 | 1869450.00 |
| bedrock | claude-haiku-4.5 | Active | 21m ago | 46.10 | 865860.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 34m ago | 45.90 | 771290.00 |
| deepinfra | mistral-7b | Stale (Medium) | 35m ago | 44.80 | 389550.00 |
| bedrock | mistral-large | Active | 21m ago | 44.10 | 947250.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 35m ago | 43.20 | 485460.00 |
| fireworks | llama-3.3-70b | Active | 34m ago | 41.90 | 6761320.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 35m ago | 40.40 | 193840.00 |
| openai | gpt-4o-mini | Active | 32m ago | 36.70 | 878440.00 |
| together | deepseek-r1 | Active | 32m ago | 36.10 | 1671940.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 35m ago | 35.00 | 1674230.00 |
| google | claude-3-5-sonnet | Active | 14d ago | 34.50 | 2446710.00 |
| openai | gpt-4.1 | Active | 34m ago | 34.20 | 773460.00 |
| openai | gpt-4-turbo | Active | 32m ago | 34.20 | 249540.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 34m ago | 33.60 | 844380.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 34m ago | 32.90 | 244590.00 |
| bedrock | claude-3-7-sonnet | Active | 21m ago | 32.80 | 142800.00 |
| bedrock | claude-3-5-sonnet | Active | 21m ago | 32.50 | 542580.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 35m ago | 32.00 | 246810.00 |
| together | deepseek-v3 | Active | 10d ago | 30.50 | 1671260.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 34m ago | 30.50 | 11021280.00 |
| bedrock | claude-3-5-haiku | Active | 22m ago | 29.00 | 1381260.00 |
| openai | GPT-5.2 | Active | 34m ago | 27.90 | 644990.00 |
| openai | GPT-5.1 | Active | 34m ago | 27.40 | 3561060.00 |
| openai | gpt-4 | Active | 32m ago | 26.30 | 752590.00 |
| openai | GPT-5.1-codex-max | Active | 1d ago | 25.80 | 1701220.00 |
| openai | GPT-5.1-codex | Active | 34m ago | 25.10 | 449690.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 34m ago | 25.10 | 590390.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 34m ago | 24.90 | 783370.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 35m ago | 23.60 | 159750.00 |
| anthropic | claude-3-opus | Active | 29d ago | 23.20 | 2324730.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 35m ago | 22.30 | 243530.00 |
| together | llama-3.1-405b | Active | 31m ago | 21.70 | 1301340.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 34m ago | 21.60 | 343450.00 |
| bedrock | claude-sonnet-4.5 | Active | 21m ago | 21.50 | 1291840.00 |
| openai | GPT-5.1-codex-mini | Active | 34m ago | 20.20 | 145800.00 |
| anthropic | claude-4-sonnet | Active | 35m ago | 20.10 | 8321640.00 |
| anthropic | claude-opus-4.5 | Active | 35m ago | 19.60 | 11301730.00 |
| bedrock | claude-3-opus | Active | 21m ago | 19.00 | 722840.00 |
| anthropic | Claude Opus 4.1 | Active | 35m ago | 18.70 | 5251350.00 |
| bedrock | claude-opus-4.5 | Active | 21m ago | 18.00 | 1232310.00 |
| anthropic | claude-4-opus | Active | 35m ago | 17.80 | 4241230.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 34m ago | 15.80 | 1313530.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 34m ago | 12.30 | 1682150.00 |
| google | claude-3-opus | Active | 28d ago | 12.10 | 10132050.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 34m ago | 10.50 | 119130.00 |
| openai | GPT-5.2-pro | Active | 6h ago | 1.83 | 145060.00 |
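The flagged statuses listed above the table use snake_case keys, while the Status column shows display labels such as "Never Succeeded(Medium)". A minimal sketch of mapping one to the other follows; it assumes the parenthesized part is a severity or confidence tag that can be ignored when matching, and `status_key` and `is_flagged` are hypothetical names rather than the site's own code.

```python
# Status keys copied from the "Flagged statuses" line above the table.
FLAGGED_STATUSES = frozenset({
    "likely_deprecated",
    "deprecated",
    "failing",
    "stale",
    "never_succeeded",
    "disabled",
})


def status_key(label: str) -> str:
    """Turn a display label like 'Never Succeeded(Medium)' into 'never_succeeded'."""
    # Assumption: the parenthesized suffix is a tag, not part of the status name.
    base = label.split("(", 1)[0].strip()
    return base.lower().replace(" ", "_")


def is_flagged(label: str) -> bool:
    """Report whether a display label corresponds to a flagged status."""
    return status_key(label) in FLAGGED_STATUSES
```

For example, `is_flagged("Stale(Medium)")` is true while `is_flagged("Active")` is false, which makes it easy to exclude flagged rows before computing rankings or averages.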

📈 Time Series 📈

[Time-series charts, one panel per model: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-405b, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-2.5-72b, qwen-3-32b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, deepseek-v3, devstral-small, gemini-2.0-flash, gemini-2.0-flash-lite, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-7b, qwen-3-235b, qwen-3-235b-instruct]