☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart shows how those speeds are distributed, since they vary with provider load. For readability, the chart omits some models; the table below has the full results.
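The benchmark harness itself is not shown on this page, so the following is only a minimal sketch of how one speed sample and its distribution summary could be computed. It assumes a hypothetical `generate` callable that performs one request and returns the number of output tokens; the function names and the choice of summary statistics are illustrative, not the site's actual implementation.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_tokens_per_second(
    generate: Callable[[], int],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one generation call and return output tokens per second.

    `generate` is a hypothetical placeholder: it should send one request
    to a provider and return how many tokens came back.
    """
    start = clock()
    tokens = generate()
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return tokens / elapsed


def summarize(speeds: List[float]) -> Dict[str, float]:
    """Reduce repeated tok/s samples to a small distribution summary."""
    if not speeds:
        raise ValueError("need at least one sample")
    ordered = sorted(speeds)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }
```

A scheduled job could call `measure_tokens_per_second` once per provider and model on each run, store the sample, and feed the stored samples to `summarize` when drawing the chart.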

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am adding providers and models daily, as long as they do not require purchasing a dedicated hosting endpoint, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.

Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 318 tok/s
2 | qwen-3-32b | groq | 228 tok/s
3 | llama-4-scout | groq | 207 tok/s
4 | llama-3.3-70b | groq | 202 tok/s
5 | llama-4-maverick | groq | 198 tok/s
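The page does not say how this ranking is derived from the 7-day slice. As a hedged sketch, one plausible approach is to keep only runs inside the window, take the median speed per provider and model, and sort descending. The `Run` record, its field names, and the use of the median are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Run(NamedTuple):
    """Hypothetical record for one benchmark run."""
    provider: str
    model: str
    tokens_per_second: float
    finished_at: datetime


def fastest_models(
    runs: Iterable[Run],
    now: datetime,
    days: int = 7,
    top_n: int = 5,
) -> List[Tuple[str, str, float]]:
    """Return up to `top_n` (model, provider, median tok/s), fastest first."""
    cutoff = now - timedelta(days=days)
    samples: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for run in runs:
        # Ignore runs outside the window so old results cannot rank.
        if cutoff <= run.finished_at <= now:
            samples[(run.provider, run.model)].append(run.tokens_per_second)
    ranked = sorted(
        ((median(values), provider, model)
         for (provider, model), values in samples.items()),
        reverse=True,
    )
    return [(model, provider, speed) for speed, provider, model in ranked[:top_n]]
```

Filtering by recency before ranking is what keeps an entry that was last measured weeks ago out of a "right now" list, even if its last recorded speed was high.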

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 78 of 78 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

Provider | Model | Status | Updated | Speed (tok/s) | Other (unsplit)
groq | llama-3.1-8b | Active | 2h ago | 318.00 | 8747190.00
cerebras | qwen-3-32b | Active | 21d ago | 260.00 | 32444240.00
groq | qwen-3-32b | Active | 2h ago | 228.00 | 2391300.00
groq | llama-4-scout | Active | 2h ago | 207.00 | 38335220.00
groq | llama-3.3-70b | Active | 2h ago | 202.00 | 68340130.00
groq | llama-4-maverick | Active | 5h ago | 198.00 | 1307610.00
cerebras | llama-3.1-8b | Active | 2h ago | 189.00 | 13481120.00
cerebras | gpt-oss-120b | Active | 5h ago | 188.00 | 13801060.00
cerebras | llama-3.3-70b | Active | 21d ago | 185.00 | 17316410.00
together | llama-3.1-8b | Active | 2h ago | 142.00 | 3232340.00
groq | kimi-k2 | Active | 2h ago | 141.00 | 12215310.00
bedrock | nova-micro | Active | 33m ago | 121.00 | 65152270.00
openai | o3 Mini | Never Succeeded (Medium) | 2h ago | 111.00 | 211600.00
bedrock | llama-4-maverick | Active | 33m ago | 108.00 | 3139260.00
bedrock | nova-lite | Active | 33m ago | 102.00 | 39132300.00
bedrock | llama-4-scout | Active | 33m ago | 101.00 | 4129280.00
bedrock | llama-3.3-70b | Active | 33m ago | 98.20 | 3136290.00
together | qwen-2.5-7b | Active | 2h ago | 93.90 | 7145270.00
bedrock | nova-pro | Active | 33m ago | 89.90 | 35121350.00
openai | GPT-5.1-codex-max | Active | 2h ago | 80.60 | 111181410.00
openai | gpt-3.5-turbo | Active | 2h ago | 78.10 | 13126470.00
google | gemini-2.5-flash-lite | Active | 2h ago | 75.70 | 10132520.00
openai | gpt-4.1-nano | Active | 2h ago | 72.00 | 9149450.00
together | llama-3.1-70b | Active | 12d ago | 71.80 | 7129390.00
together | mistral-7b | Active | 12d ago | 70.80 | 291490.00
deepinfra | mistral-7b | Stale (Medium) | 5h ago | 70.60 | 5148620.00
openai | gpt-4o | Active | 2h ago | 69.70 | 131731310.00
fireworks | mixtral-8x22b | Active | 8d ago | 67.00 | 29112440.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 2h ago | 66.90 | 51011030.00
deepinfra | devstral-small | Never Succeeded (Medium) | 2h ago | 65.40 | 9140600.00
together | mixtral-8x7b | Active | 2h ago | 61.10 | 14114150.00
deepinfra | mixtral-8x22b | Stale (Medium) | 19d ago | 59.40 | 1478430.00
together | llama-3.2-3b | Active | 3d ago | 56.20 | 51211540.00
together | llama-3.3-70b | Active | 2h ago | 53.80 | 11461350.00
fireworks | llama-3.3-70b | Active | 2h ago | 52.30 | 2951410.00
anthropic | claude-haiku-4.5 | Active | 2h ago | 52.00 | 1974540.00
openai | gpt-4.1-mini | Active | 2h ago | 51.20 | 1585450.00
openai | o4 Mini | Never Succeeded (Medium) | 2h ago | 49.60 | 14760.00
bedrock | llama-3.2-90b | Active | 33m ago | 47.10 | 251350.00
together | deepseek-r1 | Active | 2h ago | 47.00 | 1113950.00
deepinfra | llama-3.1-8b | Stale (Medium) | 2h ago | 46.70 | 396630.00
deepinfra | llama-3.2-1b | Stale (Medium) | 2h ago | 46.30 | 1100630.00
deepinfra | llama-3-8b | Stale (Medium) | 2h ago | 45.50 | 971310.00
deepinfra | llama-3.2-3b | Stale (Medium) | 2h ago | 45.00 | 299540.00
bedrock | mistral-large | Active | 33m ago | 41.70 | 247430.00
bedrock | claude-haiku-4.5 | Active | 33m ago | 41.20 | 4621100.00
openai | gpt-4o-mini | Active | 2h ago | 41.20 | 1895360.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 2h ago | 41.00 | 2721690.00
openai | gpt-4.1 | Active | 2h ago | 38.70 | 683510.00
deepinfra | llama-3.2-90b | Stale (Medium) | 2h ago | 35.30 | 382870.00
deepinfra | llama-2-70b | Stale (Medium) | 2h ago | 34.90 | 357570.00
deepinfra | llama-3-70b | Stale (Medium) | 2h ago | 34.30 | 255680.00
bedrock | claude-3-7-sonnet | Active | 33m ago | 33.10 | 244740.00
openai | gpt-4-turbo | Active | 2h ago | 32.50 | 252530.00
bedrock | claude-3-5-sonnet | Active | 33m ago | 32.40 | 244660.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 2h ago | 32.40 | 150760.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 2h ago | 31.20 | 1822980.00
bedrock | claude-3-5-haiku | Active | 33m ago | 30.90 | 438680.00
openai | GPT-5.1 | Active | 2h ago | 29.90 | 357950.00
openai | gpt-4 | Active | 2h ago | 27.90 | 347670.00
openai | GPT-5.2 | Active | 2h ago | 27.00 | 1240900.00
deepinfra | llama-3.1-405b | Stale (Medium) | 2h ago | 25.50 | 139870.00
openai | GPT-5.1-codex | Active | 2h ago | 25.50 | 1481310.00
openai | GPT-5.1-codex-mini | Active | 2h ago | 25.30 | 1551180.00
bedrock | claude-sonnet-4.5 | Active | 33m ago | 22.20 | 2291630.00
deepinfra | llama-3.1-70b | Stale (Medium) | 2h ago | 21.70 | 1461130.00
anthropic | claude-4-sonnet | Active | 2h ago | 20.30 | 1311950.00
anthropic | claude-opus-4.5 | Active | 2h ago | 20.20 | 2331830.00
bedrock | claude-3-opus | Active | 18d ago | 19.40 | 622870.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 2h ago | 19.00 | 1462510.00
bedrock | claude-opus-4.5 | Active | 33m ago | 18.40 | 1242060.00
anthropic | Claude Opus 4.1 | Active | 2h ago | 17.90 | 8271500.00
anthropic | claude-4-opus | Active | 2h ago | 17.50 | 5241310.00
openai | gpt-5.2-codex | Active | 2h ago | 12.90 | 1271830.00
deepinfra | llama-3.2-11b | Stale (Medium) | 2h ago | 12.70 | 1622670.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 2h ago | 9.63 | 1406100.00
openai | o1-pro | Likely Deprecated (Medium) | 2h ago | 9.44 | 118640.00
openai | GPT-5.2-pro | Active | 2h ago | 8.52 | 4144960.00

📈 Time Series 📈