
☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart shows how each model's speed is distributed, since speeds vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
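As a rough illustration of what one scheduled benchmark run might measure, here is a minimal sketch in Python. It is not the code behind this site: `generate` is a hypothetical stand-in for a provider client that returns the number of output tokens, and the `clock` parameter exists only so the timing can be replaced in tests.

```python
import statistics
import time
from typing import Callable


def measure_tokens_per_second(
    generate: Callable[[str], int],
    prompt: str,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one generation call and return output tokens per second.

    `generate` is a hypothetical callable (an assumption, not a real
    provider API): it takes a prompt and returns the output token count.
    """
    start = clock()
    tokens = generate(prompt)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return tokens / elapsed


def summarize(speeds: list[float]) -> dict[str, float]:
    """Summarize tokens/sec samples collected over repeated runs."""
    ordered = sorted(speeds)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }
```

Collecting many such samples per model over time is what makes a distribution chart meaningful, because a single run only reflects the provider's load at that moment.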

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking anywhere that does not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


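Fastest Models Right Now (updated <24h)

| # | Model | Provider | Speed |
|---|-------|----------|-------|
| 1 | llama-3.1-8b | groq | 318 tok/s |
| 2 | qwen-3-32b | groq | 227 tok/s |
| 3 | llama-4-scout | groq | 204 tok/s |
| 4 | llama-3.3-70b | groq | 201 tok/s |
| 5 | gpt-oss-120b | cerebras | 188 tok/s |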
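A ranking like this one can be sketched as a freshness filter followed by a sort. The snippet below is an assumption about how such a list could be built, not the site's actual implementation; the `Result` record and the `fastest_recent` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Result:
    """One benchmark row (hypothetical shape, for illustration only)."""
    provider: str
    model: str
    tokens_per_second: float
    last_run: datetime


def fastest_recent(
    results: list[Result],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),
    top_n: int = 5,
) -> list[Result]:
    """Return the top_n fastest results whose last run is newer than max_age."""
    # Keep only rows strictly fresher than max_age.
    fresh = [r for r in results if now - r.last_run < max_age]
    # Highest throughput first.
    ranked = sorted(fresh, key=lambda r: r.tokens_per_second, reverse=True)
    return ranked[:top_n]
```

Under this filter, a fast model whose last successful run is several days old would drop out of the list even though its measured speed is high.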

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 78 of 78 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

| Provider | Model | Status | Last run | Speed (tok/s) | Other columns (fused) |
|----------|-------|--------|----------|---------------|-----------------------|
| groq | llama-3.1-8b | Active | 3h ago | 318.00 | 8747190.00 |
| cerebras | qwen-3-32b | Active | 22d ago | 263.00 | 32444230.00 |
| groq | qwen-3-32b | Active | 3h ago | 227.00 | 2374300.00 |
| groq | llama-4-scout | Active | 3h ago | 204.00 | 38335230.00 |
| groq | llama-3.3-70b | Active | 3h ago | 201.00 | 68340130.00 |
| groq | llama-4-maverick | Active | 1d ago | 197.00 | 1307620.00 |
| cerebras | gpt-oss-120b | Active | 6h ago | 188.00 | 13801050.00 |
| cerebras | llama-3.1-8b | Active | 3h ago | 186.00 | 13481160.00 |
| cerebras | llama-3.3-70b | Active | 22d ago | 185.00 | 17316420.00 |
| groq | kimi-k2 | Active | 3h ago | 141.00 | 12215310.00 |
| together | llama-3.1-8b | Active | 3h ago | 140.00 | 3232340.00 |
| bedrock | nova-micro | Active | 51m ago | 121.00 | 65152270.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 3h ago | 110.00 | 211600.00 |
| bedrock | llama-4-maverick | Active | 51m ago | 108.00 | 3139260.00 |
| bedrock | nova-lite | Active | 51m ago | 101.00 | 39132300.00 |
| bedrock | llama-4-scout | Active | 51m ago | 101.00 | 4129280.00 |
| bedrock | llama-3.3-70b | Active | 51m ago | 97.60 | 3136290.00 |
| together | qwen-2.5-7b | Active | 3h ago | 93.80 | 1145490.00 |
| bedrock | nova-pro | Active | 51m ago | 89.50 | 35121350.00 |
| openai | GPT-5.1-codex-max | Active | 3h ago | 80.10 | 111181410.00 |
| openai | gpt-3.5-turbo | Active | 3h ago | 77.60 | 13126480.00 |
| google | gemini-2.5-flash-lite | Active | 3h ago | 74.70 | 10132540.00 |
| deepinfra | mistral-7b | Stale (Medium) | 6h ago | 72.40 | 5148620.00 |
| together | llama-3.1-70b | Active | 13d ago | 72.40 | 10129360.00 |
| openai | gpt-4.1-nano | Active | 3h ago | 71.70 | 9149450.00 |
| together | mistral-7b | Active | 13d ago | 70.60 | 291500.00 |
| openai | gpt-4o | Active | 3h ago | 69.60 | 131731310.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 3h ago | 67.20 | 9140600.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 3h ago | 66.90 | 51011020.00 |
| fireworks | mixtral-8x22b | Active | 9d ago | 66.70 | 29111440.00 |
| together | mixtral-8x7b | Active | 3h ago | 60.60 | 14114160.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 20d ago | 59.40 | 1478450.00 |
| together | llama-3.2-3b | Active | 4d ago | 55.20 | 51211580.00 |
| together | llama-3.3-70b | Active | 3h ago | 53.30 | 11461300.00 |
| fireworks | llama-3.3-70b | Active | 3h ago | 52.50 | 2951420.00 |
| anthropic | claude-haiku-4.5 | Active | 3h ago | 51.60 | 1974550.00 |
| openai | gpt-4.1-mini | Active | 3h ago | 51.20 | 15109450.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 3h ago | 49.60 | 14760.00 |
| together | deepseek-r1 | Active | 3h ago | 48.10 | 1113950.00 |
| bedrock | llama-3.2-90b | Active | 51m ago | 47.00 | 251350.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 3h ago | 46.30 | 385650.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 3h ago | 45.50 | 971310.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 3h ago | 45.00 | 1100700.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 3h ago | 43.70 | 299620.00 |
| bedrock | mistral-large | Active | 51m ago | 41.40 | 247470.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 3h ago | 40.80 | 2721690.00 |
| bedrock | claude-haiku-4.5 | Active | 51m ago | 40.60 | 4621130.00 |
| openai | gpt-4o-mini | Active | 3h ago | 39.90 | 1865370.00 |
| openai | gpt-4.1 | Active | 3h ago | 38.90 | 683510.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 6h ago | 35.00 | 382850.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 3h ago | 34.90 | 357570.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 3h ago | 34.00 | 255680.00 |
| bedrock | claude-3-7-sonnet | Active | 51m ago | 32.90 | 244740.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 3h ago | 32.50 | 150770.00 |
| openai | gpt-4-turbo | Active | 3h ago | 32.40 | 252530.00 |
| bedrock | claude-3-5-sonnet | Active | 51m ago | 32.30 | 244660.00 |
| bedrock | claude-3-5-haiku | Active | 51m ago | 31.10 | 938660.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 3h ago | 30.90 | 1823010.00 |
| openai | GPT-5.1 | Active | 3h ago | 29.80 | 357960.00 |
| openai | gpt-4 | Active | 3h ago | 27.70 | 347670.00 |
| openai | GPT-5.2 | Active | 3h ago | 26.90 | 1240910.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 3h ago | 25.90 | 339710.00 |
| openai | GPT-5.1-codex | Active | 3h ago | 25.50 | 1481310.00 |
| openai | GPT-5.1-codex-mini | Active | 3h ago | 24.80 | 1551200.00 |
| bedrock | claude-sonnet-4.5 | Active | 51m ago | 22.10 | 2291650.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 3h ago | 21.90 | 1461120.00 |
| anthropic | claude-4-sonnet | Active | 3h ago | 20.30 | 1311960.00 |
| anthropic | claude-opus-4.5 | Active | 3h ago | 20.20 | 2331830.00 |
| bedrock | claude-3-opus | Active | 19d ago | 19.40 | 622870.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 3h ago | 18.80 | 1462520.00 |
| bedrock | claude-opus-4.5 | Active | 51m ago | 18.40 | 1242060.00 |
| anthropic | Claude Opus 4.1 | Active | 3h ago | 17.80 | 8271510.00 |
| anthropic | claude-4-opus | Active | 3h ago | 17.50 | 5241310.00 |
| openai | gpt-5.2-codex | Active | 9h ago | 12.90 | 1271820.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 3h ago | 11.80 | 1622700.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 3h ago | 9.38 | 118710.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 3h ago | 8.89 | 1406590.00 |
| openai | GPT-5.2-pro | Active | 3h ago | 8.54 | 4144950.00 |

📈 Time Series 📈