☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of measured speeds for each model, since speeds vary with provider load. For readability, not all models are shown in the chart, but you can see the full results in the table below.
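As a rough illustration of what a single speed measurement might look like, here is a minimal sketch that times a token stream and reports tokens per second. It is an assumption about the general approach, not the benchmark's actual code: `tokens_per_second` and `fake_provider_stream` are hypothetical names, and the fake stream stands in for a real provider's streaming API.

```python
import time
from typing import Callable, Iterable, Iterator


def tokens_per_second(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume a token stream and return tokens per second of wall time."""
    start = clock()
    count = 0
    for _ in stream:
        count += 1
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed


def fake_provider_stream(n_tokens: int, delay_s: float) -> Iterator[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    for i in range(n_tokens):
        time.sleep(delay_s)  # simulate generation latency per token
        yield f"tok{i}"


if __name__ == "__main__":
    speed = tokens_per_second(fake_provider_stream(50, 0.002))
    print(f"{speed:.1f} tok/s")
```

A cron entry would then run a script like this on a schedule and append each result to storage, which is what makes a distribution of speeds available to plot.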

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any that do not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub!! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 323 tok/s
2 | qwen-3-32b | groq | 233 tok/s
3 | llama-4-scout | groq | 210 tok/s
4 | llama-3.3-70b | groq | 202 tok/s
5 | llama-3.1-8b | cerebras | 199 tok/s
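A ranking like this can be derived from raw benchmark samples in a few lines. The sketch below assumes each sample is a `(provider, model, tok/s)` tuple and that models are ranked by median speed; both are assumptions for illustration, as is the `fastest_models` name, and the sample values are made up rather than taken from the benchmark.

```python
from collections import defaultdict
from statistics import median


def fastest_models(samples, top_n=5):
    """Rank (provider, model) pairs by median tokens per second, fastest first."""
    by_key = defaultdict(list)
    for provider, model, tps in samples:
        by_key[(provider, model)].append(tps)
    ranked = sorted(
        ((median(speeds), provider, model)
         for (provider, model), speeds in by_key.items()),
        reverse=True,
    )
    return [(model, provider, speed) for speed, provider, model in ranked[:top_n]]


if __name__ == "__main__":
    # Illustrative, made-up samples (not real benchmark data).
    samples = [
        ("provider-x", "model-a", 100.0),
        ("provider-x", "model-a", 120.0),
        ("provider-x", "model-a", 300.0),
        ("provider-y", "model-b", 90.0),
        ("provider-y", "model-b", 110.0),
    ]
    for rank, (model, provider, speed) in enumerate(fastest_models(samples), 1):
        print(rank, model, provider, f"{speed:.0f} tok/s")
```

The median is used here because a single unusually fast or slow run (such as the 300.0 sample above) would otherwise skew the ranking.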

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 83 of 83 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
Provider | Model | Status | Updated | Speed (tok/s) | Remaining columns (undelimited)
groq | llama-3.1-8b | Active | 3h ago | 323.00 | 8747190.00
cerebras | qwen-3-32b | Active | 17d ago | 244.00 | 4444400.00
groq | qwen-3-32b | Active | 3h ago | 233.00 | 2391300.00
groq | llama-4-scout | Active | 3h ago | 210.00 | 38335210.00
groq | llama-3.3-70b | Active | 3h ago | 202.00 | 68322130.00
cerebras | llama-3.1-8b | Active | 3h ago | 199.00 | 1343910.00
cerebras | gpt-oss-120b | Active | 3h ago | 192.00 | 13380830.00
groq | llama-4-maverick | Active | 3h ago | 185.00 | 12307410.00
cerebras | llama-3.3-70b | Active | 17d ago | 183.00 | 17318430.00
together | llama-3.1-8b | Active | 3h ago | 143.00 | 2232560.00
groq | kimi-k2 | Active | 3h ago | 142.00 | 12215280.00
bedrock | nova-micro | Active | 39m ago | 121.00 | 65149270.00
openai | o3 Mini | Never Succeeded (Medium) | 3h ago | 110.00 | 211600.00
bedrock | llama-4-maverick | Active | 39m ago | 108.00 | 3139260.00
bedrock | nova-lite | Active | 39m ago | 101.00 | 39130300.00
bedrock | llama-4-scout | Active | 39m ago | 101.00 | 4129270.00
bedrock | llama-3.3-70b | Active | 39m ago | 99.20 | 3137290.00
together | qwen-2.5-7b | Active | 3h ago | 93.20 | 11145240.00
bedrock | nova-pro | Active | 39m ago | 90.80 | 13121350.00
openai | gpt-3.5-turbo | Active | 3h ago | 78.30 | 13126470.00
openai | GPT-5.1-codex-max | Active | 3h ago | 77.60 | 111181440.00
google | gemini-2.5-flash-lite | Active | 3h ago | 75.10 | 10132540.00
together | llama-3.1-70b | Active | 7d ago | 72.60 | 7129390.00
openai | gpt-4.1-nano | Active | 3h ago | 71.10 | 9149450.00
together | mistral-7b | Active | 7d ago | 70.50 | 291550.00
openai | gpt-4o | Active | 3h ago | 69.00 | 71731340.00
fireworks | mixtral-8x22b | Active | 3d ago | 67.80 | 29112430.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 3h ago | 66.30 | 51011060.00
deepinfra | mistral-7b | Stale (Medium) | 3h ago | 64.30 | 5136580.00
deepinfra | mixtral-8x22b | Stale (Medium) | 15d ago | 59.40 | 1478380.00
together | mixtral-8x7b | Active | 3h ago | 58.70 | 13114170.00
deepinfra | devstral-small | Never Succeeded (Medium) | 3h ago | 58.30 | 3131670.00
together | llama-3.2-3b | Active | 3h ago | 56.30 | 51211520.00
google | gemini-2.0-flash | Active | 29d ago | 54.00 | 2966680.00
together | llama-3.3-70b | Active | 3h ago | 53.80 | 11461430.00
together | qwen-2.5-72b | Active | 26d ago | 52.20 | 471570.00
google | gemini-2.0-flash-lite | Active | 29d ago | 52.00 | 4562710.00
anthropic | claude-haiku-4.5 | Active | 3h ago | 51.30 | 1974570.00
deepinfra | llama-3.1-8b | Stale (Medium) | 3h ago | 51.10 | 697450.00
fireworks | llama-3.3-70b | Active | 3h ago | 50.90 | 4951200.00
openai | gpt-4.1-mini | Active | 3h ago | 50.80 | 1597470.00
openai | o4 Mini | Never Succeeded (Medium) | 3h ago | 49.50 | 14760.00
bedrock | llama-3.2-90b | Active | 39m ago | 47.20 | 651350.00
deepinfra | llama-3-8b | Stale (Medium) | 3h ago | 45.30 | 971310.00
together | deepseek-r1 | Active | 3h ago | 43.70 | 1691010.00
openai | gpt-4o-mini | Active | 3h ago | 43.40 | 2095350.00
bedrock | mistral-large | Active | 39m ago | 42.30 | 347340.00
bedrock | claude-haiku-4.5 | Active | 39m ago | 41.80 | 4621060.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 3h ago | 40.50 | 2721720.00
deepinfra | llama-3.2-1b | Stale (Medium) | 3h ago | 38.90 | 1100630.00
deepinfra | llama-3.2-3b | Stale (Medium) | 3h ago | 38.30 | 197790.00
openai | gpt-4.1 | Active | 3h ago | 37.60 | 682490.00
deepinfra | llama-2-70b | Stale (Medium) | 3h ago | 35.50 | 357520.00
deepinfra | llama-3-70b | Stale (Medium) | 3h ago | 35.00 | 255640.00
deepinfra | llama-3.2-90b | Stale (Medium) | 3h ago | 34.00 | 376890.00
bedrock | claude-3-7-sonnet | Active | 40m ago | 33.20 | 544740.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 3h ago | 32.00 | 250710.00
openai | gpt-4-turbo | Active | 3h ago | 32.00 | 252540.00
bedrock | claude-3-5-sonnet | Active | 40m ago | 32.00 | 244660.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 3h ago | 31.20 | 1742650.00
bedrock | claude-3-5-haiku | Active | 40m ago | 30.40 | 138750.00
openai | GPT-5.1 | Active | 3h ago | 30.00 | 357970.00
openai | gpt-4 | Active | 3h ago | 27.40 | 347670.00
openai | GPT-5.2 | Active | 3h ago | 26.60 | 940940.00
openai | GPT-5.1-codex | Active | 3h ago | 25.70 | 3481290.00
openai | GPT-5.1-codex-mini | Active | 3h ago | 25.20 | 1551200.00
deepinfra | llama-3.1-405b | Stale (Medium) | 3h ago | 23.90 | 1341120.00
deepinfra | llama-3.1-70b | Stale (Medium) | 3h ago | 23.20 | 146700.00
cerebras | qwen-3-235b-instruct | Active | 29d ago | 22.90 | 19272750.00
bedrock | claude-sonnet-4.5 | Active | 39m ago | 22.30 | 5291620.00
together | llama-3.1-405b | Active | 27d ago | 20.50 | 6291180.00
anthropic | claude-4-sonnet | Active | 3h ago | 19.90 | 1312050.00
anthropic | claude-opus-4.5 | Active | 3h ago | 19.90 | 2331860.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 3h ago | 19.80 | 1492250.00
bedrock | claude-3-opus | Active | 14d ago | 19.20 | 422860.00
bedrock | claude-opus-4.5 | Active | 39m ago | 18.30 | 4241990.00
anthropic | Claude Opus 4.1 | Active | 3h ago | 18.10 | 8271470.00
anthropic | claude-4-opus | Active | 3h ago | 17.80 | 5241260.00
deepinfra | llama-3.2-11b | Stale (Medium) | 3h ago | 15.10 | 1622540.00
openai | gpt-5.2-codex | Active | 3h ago | 12.80 | 1271860.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 3h ago | 11.30 | 1385490.00
openai | o1-pro | Likely Deprecated (Medium) | 3h ago | 8.98 | 118650.00
openai | GPT-5.2-pro | Active | 3h ago | 8.38 | 1145080.00

📈 Time Series 📈