
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically test the token generation speed of different cloud LLM providers. The chart visualizes the distribution of speeds for each model, since speeds vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
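As a rough illustration of what a speed figure and its distribution mean here, the sketch below turns timed generations into tokens per second and summarizes the spread. The `Run` record, the function names, and the sample numbers are all assumptions made for illustration; this is not the benchmark code behind this page.

```python
import statistics
from dataclasses import dataclass


@dataclass
class Run:
    """One timed generation (hypothetical record shape)."""
    output_tokens: int  # tokens the model generated
    seconds: float      # wall-clock time spent generating them


def tokens_per_second(run: Run) -> float:
    """Generation speed of a single run."""
    if run.seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return run.output_tokens / run.seconds


def summarize(runs: list[Run]) -> dict[str, float]:
    """Min, median, and max speed across repeated runs of one model."""
    speeds = sorted(tokens_per_second(r) for r in runs)
    return {
        "min": speeds[0],
        "median": statistics.median(speeds),
        "max": speeds[-1],
    }


# Made-up sample: three runs of the same model under different load.
runs = [Run(64, 0.5), Run(64, 0.25), Run(80, 0.5)]
print(summarize(runs))
```

Reporting a median alongside the extremes is one simple way to keep a single slow or fast run from dominating the headline number.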

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking anywhere that does not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have any suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 303 tok/s
2 | qwen-3-32b | groq | 252 tok/s
3 | llama-3.1-8b | cerebras | 224 tok/s
4 | llama-4-scout | groq | 212 tok/s
5 | llama-3.3-70b | groq | 205 tok/s
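One way a leaderboard like this could be derived from the full results is to keep only rows updated within the last 24 hours and then sort by speed. The `Result` shape and the `fastest` helper are hypothetical; the sample values are copied from the results table on this page, with "3h ago" and "3d ago" converted to hours.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One row of benchmark results (hypothetical record shape)."""
    provider: str
    model: str
    speed: float      # tok/s
    age_hours: float  # hours since the row was last updated


def fastest(results, limit=5, max_age_hours=24.0):
    """Return the fastest rows among those updated recently enough."""
    fresh = [r for r in results if r.age_hours < max_age_hours]
    return sorted(fresh, key=lambda r: r.speed, reverse=True)[:limit]


results = [
    Result("groq", "llama-3.1-8b", 303.0, 3),
    Result("cerebras", "qwen-3-32b", 256.0, 72),     # 3d ago, filtered out
    Result("groq", "qwen-3-32b", 252.0, 3),
    Result("cerebras", "llama-3.1-8b", 224.0, 3),
    Result("groq", "llama-4-scout", 212.0, 3),
    Result("groq", "llama-3.3-70b", 205.0, 3),
    Result("cerebras", "llama-3.3-70b", 203.0, 72),  # 3d ago, filtered out
]

for rank, r in enumerate(fastest(results), start=1):
    print(rank, r.model, r.provider, f"{r.speed:.0f} tok/s")
```

The freshness filter would explain why a row can be faster than some leaderboard entries and still not appear in the list.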

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 84 of 84 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

Provider | Model | Status | Updated | Speed (tok/s) | Other columns (run together in source)
groq | llama-3.1-8b | Active | 3h ago | 303.00 | 87471110.00
cerebras | qwen-3-32b | Active | 3d ago | 256.00 | 4444300.00
groq | qwen-3-32b | Active | 3h ago | 252.00 | 2391270.00
cerebras | llama-3.1-8b | Active | 3h ago | 224.00 | 6365480.00
groq | llama-4-scout | Active | 3h ago | 212.00 | 23331210.00
groq | llama-3.3-70b | Active | 3h ago | 205.00 | 94322120.00
cerebras | llama-3.3-70b | Active | 3d ago | 203.00 | 17338370.00
cerebras | gpt-oss-120b | Active | 3h ago | 202.00 | 4346760.00
groq | llama-4-maverick | Active | 3h ago | 172.00 | 19310480.00
cerebras | qwen-3-235b-instruct | Active | 15d ago | 158.00 | 22641080.00
together | llama-3.1-8b | Active | 3h ago | 155.00 | 2232470.00
groq | kimi-k2 | Active | 3h ago | 145.00 | 21208220.00
bedrock | nova-micro | Active | 23m ago | 125.00 | 69151260.00
bedrock | llama-4-maverick | Active | 23m ago | 109.00 | 36142250.00
openai | o3 Mini | Never Succeeded (Medium) | 3h ago | 107.00 | 211590.00
bedrock | llama-4-scout | Active | 23m ago | 102.00 | 1140350.00
bedrock | nova-lite | Active | 23m ago | 101.00 | 39132290.00
bedrock | llama-3.3-70b | Active | 23m ago | 101.00 | 15137240.00
together | qwen-2.5-7b | Active | 3h ago | 91.80 | 3146300.00
bedrock | nova-pro | Active | 23m ago | 89.20 | 10124370.00
together | mistral-7b | Active | 3h ago | 88.00 | 2165500.00
openai | gpt-3.5-turbo | Active | 3h ago | 79.50 | 12125440.00
google | (blank) | Active | 3h ago | 78.50 | 34132510.00
openai | gpt-4.1-nano | Active | 3h ago | 72.30 | 27131390.00
together | llama-3.1-70b | Active | 3h ago | 71.10 | 7144400.00
together | llama-3.2-3b | Active | 3h ago | 67.70 | 51431250.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 3h ago | 65.50 | 51001110.00
openai | gpt-4o | Active | 3h ago | 65.40 | 71731380.00
fireworks | mixtral-8x22b | Active | 3h ago | 64.60 | 37112510.00
openai | GPT-5.1-codex-max | Active | 3h ago | 64.00 | 11061770.00
google | gemini-2.0-flash | Active | 15d ago | 60.40 | 1688580.00
google | gemini-2.0-flash-lite | Active | 15d ago | 57.70 | 1178710.00
deepinfra | mixtral-8x22b | Stale (Medium) | 1d ago | 56.30 | 1480330.00
together | llama-3.3-70b | Active | 3h ago | 55.30 | 21361460.00
together | mixtral-8x7b | Active | 3h ago | 53.30 | 13110190.00
together | qwen-2.5-72b | Active | 12d ago | 52.90 | 471390.00
deepinfra | mistral-7b | Stale (Medium) | 3h ago | 51.80 | 3124560.00
openai | o4 Mini | Never Succeeded (Medium) | 3h ago | 49.80 | 15760.00
anthropic | claude-haiku-4.5 | Active | 3h ago | 49.60 | 1580660.00
openai | gpt-4.1-mini | Active | 3h ago | 48.60 | 1597500.00
deepinfra | llama-3.1-8b | Stale (Medium) | 3h ago | 48.30 | 1102750.00
bedrock | llama-3.2-90b | Active | 23m ago | 47.50 | 1951340.00
deepinfra | devstral-small | Never Succeeded (Medium) | 3h ago | 45.50 | 3131580.00
fireworks | llama-3.3-70b | Active | 3h ago | 44.90 | 7941140.00
deepinfra | llama-3-8b | Stale (Medium) | 3h ago | 44.40 | 771300.00
bedrock | mistral-large | Active | 23m ago | 43.60 | 747250.00
bedrock | claude-haiku-4.5 | Active | 23m ago | 42.70 | 4631060.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 3h ago | 42.20 | 11721520.00
openai | gpt-4o-mini | Active | 3h ago | 41.90 | 895410.00
together | deepseek-r1 | Active | 3h ago | 38.90 | 1691850.00
deepinfra | llama-3.2-90b | Stale (Medium) | 3h ago | 35.50 | 288990.00
openai | gpt-4.1 | Active | 3h ago | 34.10 | 670470.00
bedrock | claude-3-7-sonnet | Active | 23m ago | 33.60 | 644730.00
openai | gpt-4-turbo | Active | 3h ago | 32.80 | 251570.00
deepinfra | llama-2-70b | Stale (Medium) | 3h ago | 32.60 | 349430.00
deepinfra | llama-3-70b | Stale (Medium) | 3h ago | 32.40 | 448530.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 3h ago | 32.00 | 247720.00
bedrock | claude-3-5-sonnet | Active | 23m ago | 31.90 | 243600.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 3h ago | 31.80 | 1673000.00
deepinfra | llama-3.2-3b | Stale (Medium) | 3h ago | 29.40 | 190730.00
openai | GPT-5.1 | Active | 3h ago | 29.40 | 3561020.00
bedrock | claude-3-5-haiku | Active | 23m ago | 29.30 | 138890.00
deepinfra | llama-3.2-1b | Stale (Medium) | 3h ago | 29.20 | 190610.00
openai | GPT-5.2 | Active | 3h ago | 27.00 | 942970.00
openai | gpt-4 | Active | 3h ago | 26.30 | 249670.00
openai | GPT-5.1-codex | Active | 3h ago | 25.10 | 3491340.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 3h ago | 22.40 | 1591200.00
bedrock | claude-sonnet-4.5 | Active | 23m ago | 21.80 | 2291720.00
openai | GPT-5.1-codex-mini | Active | 3h ago | 21.70 | 1551400.00
together | llama-3.1-405b | Active | 13d ago | 21.60 | 2291200.00
deepinfra | llama-3.1-405b | Stale (Medium) | 3h ago | 21.00 | 1341620.00
deepinfra | llama-3.1-70b | Stale (Medium) | 3h ago | 20.90 | 346620.00
anthropic | claude-4-sonnet | Active | 3h ago | 19.70 | 1301920.00
together | deepseek-v3 | Active | 26d ago | 19.70 | 1313470.00
anthropic | claude-opus-4.5 | Active | 3h ago | 19.60 | 7261750.00
bedrock | claude-3-opus | Active | 53m ago | 19.10 | 422850.00
anthropic | Claude Opus 4.1 | Active | 3h ago | 18.50 | 5251410.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 3h ago | 18.40 | 1432450.00
anthropic | claude-4-opus | Active | 3h ago | 18.00 | 9241210.00
deepinfra | llama-3.2-11b | Stale (Medium) | 3h ago | 18.00 | 1682300.00
bedrock | claude-opus-4.5 | Active | 23m ago | 17.80 | 3232130.00
openai | (blank) | Active | 3h ago | 12.20 | 2221960.00
openai | o1-pro | Likely Deprecated (Medium) | 3h ago | 9.38 | 119230.00
openai | GPT-5.2-pro | Active | 3h ago | 6.76 | 1136020.00

📈 Time Series 📈

Per-model time series charts, one per model: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-3-32b, llama-3.1-405b, qwen-2.5-72b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, devstral-small, gemini-2.5-flash, gemini-2.5-pro, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-7b, qwen-3-235b.