☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart shows the distribution of measured speeds for each model, since speeds vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
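As a rough illustration of what one scheduled measurement might compute, here is a minimal sketch. It is not the benchmark's actual code: `generate` is a hypothetical stand-in for a provider request that returns the number of output tokens, and the summary fields are an assumption about what is useful to chart.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def measure_tokens_per_second(
    generate: Callable[[], int],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one generation call and return output tokens per second.

    `generate` is a hypothetical placeholder for a real provider request;
    it must return the number of output tokens it produced.
    """
    start = clock()
    n_tokens = generate()
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed


def summarize(speeds: Sequence[float]) -> Dict[str, float]:
    """Summarize repeated runs so load-dependent variation is visible."""
    return {
        "mean": statistics.mean(speeds),
        "median": statistics.median(speeds),
        "min": min(speeds),
        "max": max(speeds),
    }
```

Calling `measure_tokens_per_second` once per cron run and passing the collected values to `summarize` gives the kind of spread that a distribution chart would display.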

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, covering any provider that does not require purchasing a dedicated endpoint for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.

Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 322 tok/s
2 | qwen-3-32b | groq | 234 tok/s
3 | llama-4-scout | groq | 210 tok/s
4 | llama-3.3-70b | groq | 203 tok/s
5 | llama-3.1-8b | cerebras | 196 tok/s
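The "updated <24h" qualifier suggests that the ranking drops entries whose last run is older than a day. The sketch below shows one way such a filter could work, using a few rows copied from the full results table. The function names, the row layout, and the age-label parsing are assumptions for illustration, not the site's implementation.

```python
import re
from datetime import timedelta

# Map the single-letter suffix of an age label to a timedelta keyword.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_age(label: str) -> timedelta:
    """Turn an age label such as '51m ago' or '18d ago' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", label.strip())
    if match is None:
        raise ValueError(f"unrecognized age label: {label!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def fastest_recent(rows, max_age=timedelta(hours=24), limit=5):
    """Return the fastest rows whose last run is newer than max_age."""
    fresh = [row for row in rows if parse_age(row[3]) < max_age]
    return sorted(fresh, key=lambda row: row[2], reverse=True)[:limit]


# (provider, model, speed in tok/s, last run) taken from the results table.
ROWS = [
    ("groq", "llama-3.1-8b", 322.0, "51m ago"),
    ("cerebras", "qwen-3-32b", 245.0, "18d ago"),
    ("groq", "qwen-3-32b", 234.0, "51m ago"),
    ("groq", "llama-4-scout", 210.0, "51m ago"),
    ("groq", "llama-3.3-70b", 203.0, "51m ago"),
    ("cerebras", "llama-3.1-8b", 196.0, "53m ago"),
    ("cerebras", "gpt-oss-120b", 191.0, "53m ago"),
]

for rank, (provider, model, speed, _) in enumerate(fastest_recent(ROWS), 1):
    print(rank, model, provider, f"{speed:.0f} tok/s")
```

Under this assumption, cerebras qwen-3-32b (245 tok/s, last run 18d ago) is excluded despite being the second-fastest entry overall, which matches the list above.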

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 80 of 80 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
Provider | Model | Status | Last run | Speed (tok/s) | Remaining metrics (unsplit)
groq | llama-3.1-8b | Active | 51m ago | 322.00 | 8747190.00
cerebras | qwen-3-32b | Active | 18d ago | 245.00 | 4444410.00
groq | qwen-3-32b | Active | 51m ago | 234.00 | 2391300.00
groq | llama-4-scout | Active | 51m ago | 210.00 | 38335210.00
groq | llama-3.3-70b | Active | 51m ago | 203.00 | 68322120.00
cerebras | llama-3.1-8b | Active | 53m ago | 196.00 | 1348930.00
cerebras | gpt-oss-120b | Active | 53m ago | 191.00 | 13380820.00
groq | llama-4-maverick | Active | 51m ago | 188.00 | 12307400.00
cerebras | llama-3.3-70b | Active | 18d ago | 187.00 | 17318400.00
together | llama-3.1-8b | Active | 49m ago | 142.00 | 2232530.00
groq | kimi-k2 | Active | 51m ago | 141.00 | 12215310.00
bedrock | nova-micro | Active | 29m ago | 122.00 | 65150270.00
openai | o3 Mini | Never Succeeded (Medium) | 51m ago | 110.00 | 211600.00
bedrock | llama-4-maverick | Active | 29m ago | 108.00 | 3139270.00
bedrock | nova-lite | Active | 29m ago | 101.00 | 39130300.00
bedrock | llama-4-scout | Active | 29m ago | 101.00 | 4129270.00
bedrock | llama-3.3-70b | Active | 29m ago | 98.70 | 3137290.00
together | qwen-2.5-7b | Active | 49m ago | 93.70 | 11145240.00
bedrock | nova-pro | Active | 29m ago | 90.50 | 13121350.00
openai | gpt-3.5-turbo | Active | 49m ago | 78.50 | 13126470.00
openai | GPT-5.1-codex-max | Active | 51m ago | 78.10 | 111181430.00
google | gemini-2.5-flash-lite | Active | 49m ago | 75.30 | 10132530.00
together | llama-3.1-70b | Active | 8d ago | 72.20 | 7129400.00
openai | gpt-4.1-nano | Active | 51m ago | 71.30 | 9149450.00
together | mistral-7b | Active | 8d ago | 70.50 | 291520.00
openai | gpt-4o | Active | 49m ago | 69.20 | 111731330.00
fireworks | mixtral-8x22b | Active | 5d ago | 67.80 | 29112430.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 49m ago | 66.60 | 51011030.00
deepinfra | mistral-7b | Stale (Medium) | 52m ago | 64.50 | 5136610.00
together | mixtral-8x7b | Active | 49m ago | 59.60 | 24114150.00
deepinfra | devstral-small | Never Succeeded (Medium) | 53m ago | 59.20 | 9131590.00
deepinfra | mixtral-8x22b | Stale (Medium) | 16d ago | 59.10 | 1478390.00
together | llama-3.2-3b | Active | 49m ago | 56.80 | 51211500.00
together | qwen-2.5-72b | Active | 28d ago | 55.60 | 4371310.00
together | llama-3.3-70b | Active | 49m ago | 53.80 | 11461420.00
anthropic | claude-haiku-4.5 | Active | 54m ago | 51.80 | 1974550.00
fireworks | llama-3.3-70b | Active | 51m ago | 51.10 | 4951210.00
openai | gpt-4.1-mini | Active | 51m ago | 50.90 | 1585470.00
openai | o4 Mini | Never Succeeded (Medium) | 51m ago | 49.50 | 14760.00
deepinfra | llama-3.1-8b | Stale (Medium) | 51m ago | 49.30 | 496540.00
bedrock | llama-3.2-90b | Active | 29m ago | 47.20 | 651350.00
deepinfra | llama-3-8b | Stale (Medium) | 51m ago | 45.40 | 971310.00
together | deepseek-r1 | Active | 49m ago | 44.30 | 11041000.00
openai | gpt-4o-mini | Active | 49m ago | 42.90 | 2095360.00
bedrock | mistral-large | Active | 29m ago | 42.10 | 347360.00
bedrock | claude-haiku-4.5 | Active | 30m ago | 41.50 | 4621080.00
deepinfra | llama-3.2-1b | Stale (Medium) | 52m ago | 41.10 | 1100630.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 49m ago | 40.80 | 2721700.00
deepinfra | llama-3.2-3b | Stale (Medium) | 52m ago | 40.50 | 198790.00
openai | gpt-4.1 | Active | 50m ago | 37.80 | 682500.00
deepinfra | llama-2-70b | Stale (Medium) | 51m ago | 35.40 | 357540.00
deepinfra | llama-3-70b | Stale (Medium) | 51m ago | 35.10 | 255660.00
deepinfra | llama-3.2-90b | Stale (Medium) | 52m ago | 34.20 | 376880.00
bedrock | claude-3-7-sonnet | Active | 30m ago | 33.10 | 544740.00
bedrock | claude-3-5-sonnet | Active | 30m ago | 32.20 | 244660.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 52m ago | 32.10 | 250690.00
openai | gpt-4-turbo | Active | 49m ago | 32.10 | 252540.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 53m ago | 31.00 | 1742560.00
bedrock | claude-3-5-haiku | Active | 30m ago | 30.60 | 438690.00
openai | GPT-5.1 | Active | 51m ago | 30.00 | 357950.00
openai | gpt-4 | Active | 49m ago | 27.50 | 347660.00
openai | GPT-5.2 | Active | 51m ago | 26.70 | 940930.00
openai | GPT-5.1-codex | Active | 51m ago | 25.70 | 3481280.00
openai | GPT-5.1-codex-mini | Active | 51m ago | 25.40 | 1551180.00
deepinfra | llama-3.1-405b | Stale (Medium) | 52m ago | 24.30 | 136990.00
deepinfra | llama-3.1-70b | Stale (Medium) | 52m ago | 22.70 | 146990.00
bedrock | claude-sonnet-4.5 | Active | 30m ago | 22.30 | 5291620.00
together | llama-3.1-405b | Active | 28d ago | 21.30 | 1129960.00
anthropic | claude-4-sonnet | Active | 53m ago | 20.20 | 1311980.00
anthropic | claude-opus-4.5 | Active | 54m ago | 20.10 | 2331850.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 53m ago | 19.40 | 1462430.00
bedrock | claude-3-opus | Active | 15d ago | 19.20 | 422860.00
bedrock | claude-opus-4.5 | Active | 30m ago | 18.40 | 4241980.00
anthropic | Claude Opus 4.1 | Active | 53m ago | 18.00 | 8271470.00
anthropic | claude-4-opus | Active | 53m ago | 17.70 | 5241280.00
deepinfra | llama-3.2-11b | Stale (Medium) | 52m ago | 14.40 | 1622590.00
openai | gpt-5.2-codex | Active | 6h ago | 12.80 | 1271860.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 53m ago | 10.90 | 1385760.00
openai | o1-pro | Likely Deprecated (Medium) | 50m ago | 9.16 | 118650.00
openai | GPT-5.2-pro | Active | 51m ago | 8.56 | 4144910.00

📈 Time Series 📈