
☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart visualizes the distribution of speeds for each model, which can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
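As a rough illustration of what one benchmark run might measure, here is a minimal sketch in Python. It is not the code behind this site: the function names `tokens_per_second` and `summarize` are hypothetical, and it assumes the provider response is available as an iterable that yields one token at a time.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def tokens_per_second(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume a token stream and return generated tokens per second.

    `stream` is assumed to yield one token per item; `clock` is injectable
    so the function can be checked without a real provider.
    """
    start = clock()
    count = 0
    for _ in stream:
        count += 1
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed


def summarize(speeds: List[float]) -> Dict[str, float]:
    """Summarize repeated runs so the load-dependent spread is visible."""
    return {
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
    }
```

A scheduled job could call `tokens_per_second` once per provider and model, store the result, and later pass the stored values to `summarize` to describe the spread across runs.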

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any that do not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.
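For readers who want to reproduce a similar ranking from their own measurements, the sketch below shows one way to do it in Python. The record fields (`provider`, `model`, `tok_s`, `at`), the function name `fastest_recent`, and the choice of the median as the ranking statistic are all assumptions made for illustration; the site's actual method is not described here.

```python
from datetime import datetime, timedelta
from statistics import median


def fastest_recent(runs, now, days=7, top=5):
    """Rank (provider, model) pairs by median tok/s over the last `days` days.

    `runs` is a list of dicts with the hypothetical keys
    "provider", "model", "tok_s" (float) and "at" (datetime).
    """
    cutoff = now - timedelta(days=days)
    by_pair = {}
    for run in runs:
        # Keep only runs that fall inside the recent window.
        if run["at"] >= cutoff:
            key = (run["provider"], run["model"])
            by_pair.setdefault(key, []).append(run["tok_s"])
    # Sort pairs from fastest to slowest by their median speed.
    ranked = sorted(by_pair.items(), key=lambda kv: median(kv[1]), reverse=True)
    return [(p, m, median(speeds)) for (p, m), speeds in ranked[:top]]
```

The median is used here because it is less sensitive to a single unusually slow or fast run than the mean; a different statistic would slot into the same structure.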


Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 317 tok/s
2 | qwen-3-32b | groq | 223 tok/s
3 | llama-3.3-70b | groq | 203 tok/s
4 | llama-4-scout | groq | 199 tok/s
5 | gpt-oss-120b | cerebras | 188 tok/s

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 80 of 80 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
Provider | Model | Status | Last run | Speed (tok/s) | Other metrics (unsplit)
groq | llama-3.1-8b | Active | 35m ago | 317.00 | 8747190.00
cerebras | qwen-3-32b | Active | 24d ago | 261.00 | 32406240.00
groq | qwen-3-32b | Active | 35m ago | 223.00 | 2374310.00
groq | llama-3.3-70b | Active | 35m ago | 203.00 | 68340130.00
groq | llama-4-maverick | Active | 3d ago | 200.00 | 1307630.00
groq | llama-4-scout | Active | 35m ago | 199.00 | 38335240.00
cerebras | llama-3.3-70b | Active | 24d ago | 190.00 | 17316430.00
cerebras | gpt-oss-120b | Active | 6h ago | 188.00 | 13801070.00
cerebras | llama-3.1-8b | Active | 38m ago | 184.00 | 13531190.00
groq | kimi-k2 | Active | 35m ago | 140.00 | 12215310.00
together | llama-3.1-8b | Active | 34m ago | 138.00 | 3228350.00
bedrock | nova-micro | Active | 18m ago | 121.00 | 65152270.00
openai | o3 Mini | Never Succeeded (Medium) | 35m ago | 109.00 | 81600.00
bedrock | llama-4-maverick | Active | 18m ago | 108.00 | 3139270.00
bedrock | nova-lite | Active | 18m ago | 101.00 | 39132300.00
bedrock | llama-4-scout | Active | 18m ago | 101.00 | 6127270.00
bedrock | llama-3.3-70b | Active | 18m ago | 97.10 | 3136300.00
together | qwen-2.5-7b | Active | 34m ago | 93.90 | 1145490.00
bedrock | nova-pro | Active | 18m ago | 89.10 | 36121350.00
openai | GPT-5.1-codex-max | Active | 35m ago | 80.90 | 111181380.00
openai | gpt-3.5-turbo | Active | 34m ago | 77.00 | 13126480.00
deepinfra | mistral-7b | Stale (Medium) | 37m ago | 76.10 | 5148640.00
google | gemini-2.5-flash-lite | Active | 34m ago | 73.60 | 10132550.00
together | llama-3.1-70b | Active | 15d ago | 72.40 | 12129350.00
openai | gpt-4.1-nano | Active | 35m ago | 71.40 | 9149460.00
together | mistral-7b | Active | 15d ago | 70.50 | 291550.00
deepinfra | devstral-small | Never Succeeded (Medium) | 38m ago | 70.00 | 9140600.00
openai | gpt-4o | Active | 34m ago | 69.50 | 131731310.00
fireworks | mixtral-8x22b | Active | 35m ago | 67.40 | 29111430.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 34m ago | 67.30 | 51011010.00
together | mixtral-8x7b | Active | 34m ago | 61.10 | 14114160.00
deepinfra | mixtral-8x22b | Stale (Medium) | 22d ago | 58.20 | 1466500.00
together | llama-3.2-3b | Active | 6d ago | 55.80 | 51211570.00
fireworks | llama-3.3-70b | Active | 35m ago | 53.90 | 1951700.00
together | llama-3.3-70b | Active | 34m ago | 53.10 | 11461270.00
anthropic | claude-haiku-4.5 | Active | 38m ago | 52.00 | 1974530.00
openai | gpt-4.1-mini | Active | 35m ago | 51.20 | 15109460.00
openai | o4 Mini | Never Succeeded (Medium) | 35m ago | 49.20 | 4760.00
together | deepseek-r1 | Active | 34m ago | 48.80 | 1113940.00
bedrock | llama-3.2-90b | Active | 18m ago | 47.00 | 251360.00
deepinfra | llama-3-8b | Stale (Medium) | 35m ago | 45.60 | 1871320.00
deepinfra | llama-3.1-8b | Stale (Medium) | 36m ago | 45.40 | 385670.00
deepinfra | llama-3.2-1b | Stale (Medium) | 37m ago | 43.90 | 1100740.00
deepinfra | llama-3.2-3b | Stale (Medium) | 37m ago | 42.70 | 299700.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 34m ago | 41.10 | 2721680.00
bedrock | mistral-large | Active | 18m ago | 41.00 | 247540.00
openai | gpt-4o-mini | Active | 34m ago | 39.80 | 765390.00
bedrock | claude-haiku-4.5 | Active | 18m ago | 39.70 | 3621190.00
openai | gpt-4.1 | Active | 35m ago | 39.30 | 683500.00
deepinfra | llama-3.2-90b | Stale (Medium) | 37m ago | 35.50 | 382760.00
deepinfra | llama-2-70b | Stale (Medium) | 35m ago | 35.00 | 357620.00
deepinfra | llama-3-70b | Stale (Medium) | 35m ago | 33.90 | 255680.00
bedrock | claude-3-7-sonnet | Active | 18m ago | 32.80 | 244740.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 37m ago | 32.60 | 150810.00
bedrock | claude-3-5-sonnet | Active | 18m ago | 32.40 | 244660.00
openai | gpt-4-turbo | Active | 34m ago | 32.40 | 252530.00
bedrock | claude-3-5-haiku | Active | 18m ago | 31.20 | 938660.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 38m ago | 31.10 | 1823230.00
openai | GPT-5.1 | Active | 35m ago | 29.20 | 3571030.00
openai | gpt-4 | Active | 34m ago | 27.60 | 347680.00
openai | GPT-5.2 | Active | 35m ago | 26.90 | 438970.00
openai | GPT-5.4 | Active | 35m ago | 26.30 | 19361250.00
deepinfra | llama-3.1-405b | Stale (Medium) | 36m ago | 26.10 | 139730.00
openai | GPT-5.1-codex | Active | 35m ago | 25.70 | 1481270.00
openai | GPT-5.1-codex-mini | Active | 6h ago | 25.10 | 1551200.00
deepinfra | llama-3.1-70b | Stale (Medium) | 36m ago | 22.90 | 1461100.00
bedrock | claude-sonnet-4.5 | Active | 18m ago | 22.00 | 2291660.00
anthropic | claude-opus-4.5 | Active | 38m ago | 20.30 | 2331800.00
anthropic | claude-4-sonnet | Active | 38m ago | 20.20 | 1311980.00
openai | GPT-5.3-codex | Active | 35m ago | 20.20 | 7321290.00
bedrock | claude-3-opus | Active | 21d ago | 19.50 | 822850.00
bedrock | claude-opus-4.5 | Active | 18m ago | 18.50 | 1242040.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 38m ago | 18.30 | 1462580.00
anthropic | Claude Opus 4.1 | Active | 38m ago | 17.80 | 8271500.00
anthropic | claude-4-opus | Active | 38m ago | 17.40 | 5241310.00
openai | gpt-5.2-codex | Active | 3h ago | 13.00 | 1271790.00
deepinfra | llama-3.2-11b | Stale (Medium) | 37m ago | 10.70 | 1612750.00
openai | o1-pro | Likely Deprecated (Medium) | 3h ago | 9.50 | 118640.00
openai | GPT-5.2-pro | Active | 35m ago | 8.61 | 4144940.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 37m ago | 8.11 | 1406960.00

📈 Time Series 📈