
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart visualizes the distribution of speeds, which can vary somewhat with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
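As a rough illustration of the kind of measurement described above, here is a minimal sketch that times one streamed completion and summarizes repeated runs. It is not the code behind this site: `stream_tokens` is a hypothetical callable standing in for whatever provider client is used, and `measure_tokens_per_second` and `summarize` are names made up for this example.

```python
import time
from statistics import mean
from typing import Callable, Dict, Iterable, List


def measure_tokens_per_second(
    stream_tokens: Callable[[str], Iterable[str]],  # hypothetical provider wrapper
    prompt: str,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one streamed completion and return generated tokens per second."""
    start = clock()
    # Consume the whole stream, counting tokens as they arrive.
    n_tokens = sum(1 for _ in stream_tokens(prompt))
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed


def summarize(samples: List[float]) -> Dict[str, float]:
    """Reduce repeated speed measurements to mean, min, and max."""
    if not samples:
        raise ValueError("need at least one sample")
    return {"mean": mean(samples), "min": min(samples), "max": max(samples)}


if __name__ == "__main__":
    def fake_stream(prompt: str) -> Iterable[str]:
        # Stand-in for a real provider: emits 20 tokens with a small delay each.
        for i in range(20):
            time.sleep(0.001)
            yield f"tok{i}"

    runs = [measure_tokens_per_second(fake_stream, "hello") for _ in range(3)]
    print(summarize(runs))
```

Scheduling a script like this from cron and storing each run is one way to collect enough samples to plot a distribution per provider and model.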

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models. I include any that do not require purchasing a dedicated endpoint for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

| # | Model | Provider | Speed |
|---|---|---|---|
| 1 | llama-3.1-8b | groq | 326 tok/s |
| 2 | qwen-3-32b | groq | 242 tok/s |
| 3 | llama-4-scout | groq | 215 tok/s |
| 4 | llama-3.1-8b | cerebras | 208 tok/s |
| 5 | llama-3.3-70b | groq | 203 tok/s |
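The list above looks like a freshness filter followed by a sort on speed. The sketch below shows that idea under stated assumptions: the `Result` record and `fastest_recent` function are hypothetical names, the 24-hour cutoff is taken from the heading, and the sample rows are copied from the full results table. Whether the site also filters on status is not stated, so this sketch does not.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Result:
    # Hypothetical record shape; field names are illustrative only.
    provider: str
    model: str
    tokens_per_second: float
    hours_since_run: float


def fastest_recent(
    results: List[Result], max_age_hours: float = 24.0, top_n: int = 5
) -> List[Result]:
    """Keep results newer than max_age_hours, then return the top_n by speed."""
    fresh = [r for r in results if r.hours_since_run < max_age_hours]
    return sorted(fresh, key=lambda r: r.tokens_per_second, reverse=True)[:top_n]


# Sample rows taken from the full results table (ages converted to hours).
SAMPLE = [
    Result("groq", "llama-3.1-8b", 326.0, 2),
    Result("cerebras", "qwen-3-32b", 248.0, 13 * 24),
    Result("groq", "qwen-3-32b", 242.0, 2),
    Result("groq", "llama-4-scout", 215.0, 2),
    Result("cerebras", "llama-3.1-8b", 208.0, 2),
    Result("groq", "llama-3.3-70b", 203.0, 2),
    Result("cerebras", "gpt-oss-120b", 199.0, 5),
]

if __name__ == "__main__":
    for rank, r in enumerate(fastest_recent(SAMPLE), start=1):
        print(rank, r.model, r.provider, f"{r.tokens_per_second:.0f} tok/s")
```

With these rows, the cerebras qwen-3-32b result (248 tok/s, last run 13 days ago) is dropped by the age filter, which matches its absence from the list despite being faster than the second entry.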

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 83 of 83 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
| Provider | Model | Status | Last run | Speed (tok/s) |  |
|---|---|---|---|---|---|
| groq | llama-3.1-8b | Active | 2h ago | 326.00 | 8747190.00 |
| cerebras | qwen-3-32b | Active | 13d ago | 248.00 | 4444370.00 |
| groq | qwen-3-32b | Active | 2h ago | 242.00 | 2391270.00 |
| groq | llama-4-scout | Active | 2h ago | 215.00 | 23335210.00 |
| cerebras | llama-3.1-8b | Active | 2h ago | 208.00 | 12350500.00 |
| groq | llama-3.3-70b | Active | 2h ago | 203.00 | 68322120.00 |
| cerebras | gpt-oss-120b | Active | 5h ago | 199.00 | 13378780.00 |
| cerebras | llama-3.3-70b | Active | 13d ago | 189.00 | 17338430.00 |
| groq | llama-4-maverick | Active | 2h ago | 179.00 | 12310450.00 |
| together | llama-3.1-8b | Active | 2h ago | 148.00 | 2232560.00 |
| groq | kimi-k2 | Active | 2h ago | 145.00 | 19212250.00 |
| bedrock | nova-micro | Active | 48m ago | 123.00 | 65149270.00 |
| cerebras | qwen-3-235b-instruct | Active | 26d ago | 116.00 | 19252880.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 2h ago | 110.00 | 211600.00 |
| bedrock | llama-4-maverick | Active | 48m ago | 108.00 | 3139260.00 |
| bedrock | llama-4-scout | Active | 48m ago | 102.00 | 4132270.00 |
| bedrock | nova-lite | Active | 48m ago | 101.00 | 39130300.00 |
| bedrock | llama-3.3-70b | Active | 48m ago | 99.80 | 3137260.00 |
| together | qwen-2.5-7b | Active | 2h ago | 92.50 | 11145230.00 |
| bedrock | nova-pro | Active | 48m ago | 92.20 | 13124350.00 |
| openai | gpt-3.5-turbo | Active | 2h ago | 79.50 | 13126460.00 |
| google |  | Active | 2h ago | 75.90 | 34132510.00 |
| openai | GPT-5.1-codex-max | Active | 2h ago | 75.40 | 21181490.00 |
| together | llama-3.1-70b | Active | 4d ago | 72.90 | 7144380.00 |
| openai | gpt-4.1-nano | Active | 2h ago | 70.50 | 9149440.00 |
| together | mistral-7b | Active | 4d ago | 70.30 | 291510.00 |
| openai | gpt-4o | Active | 2h ago | 69.00 | 71731340.00 |
| fireworks | mixtral-8x22b | Active | 20h ago | 67.50 | 29112430.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 2h ago | 67.00 | 51011060.00 |
| together | llama-3.2-3b | Active | 2h ago | 61.20 | 51211400.00 |
| deepinfra | mistral-7b | Stale (Medium) | 2h ago | 61.10 | 5136550.00 |
| google | gemini-2.0-flash | Active | 26d ago | 60.90 | 2988560.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 11d ago | 58.20 | 1478370.00 |
| together | mixtral-8x7b | Active | 2h ago | 56.80 | 13114180.00 |
| google | gemini-2.0-flash-lite | Active | 26d ago | 56.70 | 1173800.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 2h ago | 55.40 | 3131630.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 2h ago | 54.80 | 7100430.00 |
| together | qwen-2.5-72b | Active | 23d ago | 54.00 | 471450.00 |
| together | llama-3.3-70b | Active | 2h ago | 53.80 | 21461230.00 |
| openai | gpt-4.1-mini | Active | 2h ago | 50.40 | 1597490.00 |
| anthropic | claude-haiku-4.5 | Active | 2h ago | 50.30 | 1674610.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 2h ago | 49.30 | 14760.00 |
| fireworks | llama-3.3-70b | Active | 2h ago | 49.00 | 4941240.00 |
| bedrock | llama-3.2-90b | Active | 48m ago | 47.30 | 651350.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 2h ago | 44.50 | 771310.00 |
| together | deepseek-r1 | Active | 2h ago | 43.70 | 169870.00 |
| openai | gpt-4o-mini | Active | 2h ago | 43.20 | 895380.00 |
| bedrock | mistral-large | Active | 48m ago | 42.60 | 347320.00 |
| bedrock | claude-haiku-4.5 | Active | 48m ago | 42.20 | 4631070.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 2h ago | 41.30 | 6721590.00 |
| openai | gpt-4.1 | Active | 2h ago | 37.80 | 682480.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 2h ago | 35.40 | 357510.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 14h ago | 35.00 | 376840.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 2h ago | 34.90 | 255620.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 2h ago | 33.60 | 190640.00 |
| bedrock | claude-3-7-sonnet | Active | 48m ago | 33.40 | 544730.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 2h ago | 33.30 | 188790.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 2h ago | 32.70 | 1672040.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 2h ago | 32.70 | 250630.00 |
| openai | gpt-4-turbo | Active | 2h ago | 32.20 | 252530.00 |
| bedrock | claude-3-5-sonnet | Active | 48m ago | 32.00 | 244660.00 |
| bedrock | claude-3-5-haiku | Active | 48m ago | 30.20 | 138790.00 |
| openai | GPT-5.1 | Active | 2h ago | 30.20 | 357970.00 |
| openai | gpt-4 | Active | 2h ago | 27.20 | 247660.00 |
| openai | GPT-5.2 | Active | 2h ago | 26.60 | 940950.00 |
| openai | GPT-5.1-codex | Active | 2h ago | 25.60 | 3491320.00 |
| openai | GPT-5.1-codex-mini | Active | 2h ago | 24.10 | 1551290.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 2h ago | 23.30 | 1341130.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 2h ago | 23.20 | 346650.00 |
| bedrock | claude-sonnet-4.5 | Active | 48m ago | 22.40 | 5291600.00 |
| together | llama-3.1-405b | Active | 24d ago | 21.30 | 629970.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 2h ago | 20.00 | 1492040.00 |
| anthropic | claude-opus-4.5 | Active | 2h ago | 19.80 | 6321790.00 |
| anthropic | claude-4-sonnet | Active | 2h ago | 19.40 | 1302100.00 |
| bedrock | claude-3-opus | Active | 11d ago | 19.10 | 422860.00 |
| bedrock | claude-opus-4.5 | Active | 48m ago | 18.30 | 4241990.00 |
| anthropic | Claude Opus 4.1 | Active | 2h ago | 18.10 | 8271470.00 |
| anthropic | claude-4-opus | Active | 2h ago | 17.70 | 5241280.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 2h ago | 16.40 | 1622450.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 14h ago | 13.30 | 1384570.00 |
| openai |  | Active | 5h ago | 12.70 | 1221880.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 2h ago | 8.95 | 118650.00 |
| openai | GPT-5.2-pro | Active | 2h ago | 8.06 | 1135290.00 |

📈 Time Series 📈

Charts in this section, one per model: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-3-32b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, devstral-small, gemini-2.5-flash, gemini-2.5-pro, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.1-405b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-72b, qwen-2.5-7b, qwen-3-235b.