
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, since speeds can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
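As a rough illustration of what one such measurement might look like, here is a minimal sketch in Python. It is not the site's actual benchmark code: `generate` is a hypothetical callable standing in for a provider client, and the helper names (`tokens_per_second`, `time_generation`, `summarize`) are assumptions made for this example.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Throughput of a single run: generated tokens divided by wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return n_tokens / elapsed_s


def time_generation(generate: Callable[[str], Sequence[str]], prompt: str) -> float:
    """Time one call to `generate` (a hypothetical provider wrapper that
    returns the generated tokens) and return its tokens per second."""
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return tokens_per_second(len(tokens), elapsed)


def summarize(speeds: Sequence[float]) -> Dict[str, float]:
    """Summarize repeated runs so that load-dependent variation stays visible."""
    return {
        "runs": len(speeds),
        "mean": statistics.fmean(speeds),
        "min": min(speeds),
        "max": max(speeds),
    }
```

Running `time_generation` on a schedule (for example from a cron entry) and feeding the collected speeds to `summarize` would give the kind of per-model spread that a distribution chart visualizes.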

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any provider that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 319 tok/s
2 | qwen-3-32b | groq | 228 tok/s
3 | llama-4-scout | groq | 208 tok/s
4 | llama-4-maverick-free ranking order continues below
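
A ranking like this can be sketched as a freshness filter followed by a sort on mean speed. The snippet below is only an assumption about how such a list could be built, not the site's actual logic: the function name `fastest`, the tuple layout of `runs`, and the 24-hour default window are all hypothetical. A filter of this kind would be consistent with cerebras qwen-3-32b (259 tok/s, but last seen 21d ago in the full results) not appearing in the list.

```python
from datetime import datetime, timedelta
from statistics import fmean
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (provider, model, timestamp, tokens_per_second)
Run = Tuple[str, str, datetime, float]


def fastest(
    runs: Iterable[Run],
    now: datetime,
    window: timedelta = timedelta(hours=24),
    top_n: int = 5,
) -> List[Tuple[str, str, float]]:
    """Return up to `top_n` (model, provider, mean tok/s) entries,
    using only runs recorded within `window` of `now`."""
    cutoff = now - window
    speeds: Dict[Tuple[str, str], List[float]] = {}
    for provider, model, ts, speed in runs:
        if ts >= cutoff:  # drop results older than the freshness window
            speeds.setdefault((provider, model), []).append(speed)
    # Sort by mean speed, fastest first
    ranked = sorted(((fmean(v), k) for k, v in speeds.items()), reverse=True)
    return [(model, provider, mean) for mean, (provider, model) in ranked[:top_n]]
```

Passing `window=timedelta(days=7)` instead would produce a 7-day slice with the same function.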

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 78 of 78 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

Provider | Model | Status | Updated | Speed (tok/s) | Remaining columns (unseparated)
groq | llama-3.1-8b | Active | 1h ago | 319.00 | 8747190.00
cerebras | qwen-3-32b | Active | 21d ago | 259.00 | 32444240.00
groq | qwen-3-32b | Active | 1h ago | 228.00 | 2391300.00
groq | llama-4-scout | Active | 1h ago | 208.00 | 38335220.00
groq | llama-3.3-70b | Active | 1h ago | 201.00 | 68340130.00
groq | llama-4-maverick | Active | 1h ago | 198.00 | 1307600.00
cerebras | llama-3.1-8b | Active | 1h ago | 190.00 | 13481050.00
cerebras | gpt-oss-120b | Active | 1h ago | 189.00 | 13801050.00
cerebras | llama-3.3-70b | Active | 21d ago | 184.00 | 17316410.00
together | llama-3.1-8b | Active | 1h ago | 142.00 | 3232340.00
groq | kimi-k2 | Active | 1h ago | 141.00 | 12215310.00
bedrock | nova-micro | Active | 31m ago | 121.00 | 65152270.00
openai | o3 Mini | Never Succeeded (Medium) | 1h ago | 111.00 | 211600.00
bedrock | llama-4-maverick | Active | 31m ago | 108.00 | 3139260.00
bedrock | nova-lite | Active | 31m ago | 102.00 | 39132300.00
bedrock | llama-4-scout | Active | 31m ago | 101.00 | 4129280.00
bedrock | llama-3.3-70b | Active | 31m ago | 98.30 | 3136290.00
together | qwen-2.5-7b | Active | 1h ago | 94.30 | 11145240.00
bedrock | nova-pro | Active | 31m ago | 90.00 | 35121350.00
openai | GPT-5.1-codex-max | Active | 1h ago | 80.50 | 111181420.00
openai | gpt-3.5-turbo | Active | 1h ago | 78.20 | 13126470.00
google | gemini-2.5-flash-lite | Active | 1h ago | 75.70 | 10132520.00
openai | gpt-4.1-nano | Active | 1h ago | 72.20 | 9149450.00
together | llama-3.1-70b | Active | 12d ago | 71.70 | 7129400.00
together | mistral-7b | Active | 12d ago | 70.90 | 291490.00
deepinfra | mistral-7b | Stale (Medium) | 1h ago | 70.50 | 5148620.00
openai | gpt-4o | Active | 1h ago | 69.70 | 131731310.00
fireworks | mixtral-8x22b | Active | 8d ago | 67.10 | 29112440.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 1h ago | 66.80 | 51011030.00
deepinfra | devstral-small | Never Succeeded (Medium) | 1h ago | 65.20 | 9140600.00
together | mixtral-8x7b | Active | 1h ago | 61.10 | 14114150.00
deepinfra | mixtral-8x22b | Stale (Medium) | 19d ago | 59.50 | 1478430.00
together | llama-3.2-3b | Active | 3d ago | 56.40 | 51211540.00
together | llama-3.3-70b | Active | 1h ago | 53.60 | 11461360.00
fireworks | llama-3.3-70b | Active | 1h ago | 52.10 | 2951420.00
anthropic | claude-haiku-4.5 | Active | 1h ago | 52.00 | 1974550.00
openai | gpt-4.1-mini | Active | 1h ago | 51.30 | 1585450.00
openai | o4 Mini | Never Succeeded (Medium) | 1h ago | 49.60 | 14760.00
bedrock | llama-3.2-90b | Active | 31m ago | 47.10 | 251350.00
deepinfra | llama-3.1-8b | Stale (Medium) | 1h ago | 47.10 | 496580.00
together | deepseek-r1 | Active | 1h ago | 46.90 | 1113950.00
deepinfra | llama-3.2-1b | Stale (Medium) | 1h ago | 46.40 | 1100630.00
deepinfra | llama-3-8b | Stale (Medium) | 1h ago | 45.50 | 971310.00
deepinfra | llama-3.2-3b | Stale (Medium) | 1h ago | 45.10 | 299540.00
bedrock | mistral-large | Active | 31m ago | 41.70 | 247430.00
openai | gpt-4o-mini | Active | 1h ago | 41.40 | 1895360.00
bedrock | claude-haiku-4.5 | Active | 31m ago | 41.30 | 4621090.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 1h ago | 41.00 | 2721690.00
openai | gpt-4.1 | Active | 1h ago | 38.60 | 683510.00
deepinfra | llama-3.2-90b | Stale (Medium) | 1h ago | 35.30 | 382870.00
deepinfra | llama-2-70b | Stale (Medium) | 1h ago | 34.90 | 357570.00
deepinfra | llama-3-70b | Stale (Medium) | 1h ago | 34.40 | 255680.00
bedrock | claude-3-7-sonnet | Active | 31m ago | 33.10 | 244740.00
openai | gpt-4-turbo | Active | 1h ago | 32.50 | 252530.00
bedrock | claude-3-5-sonnet | Active | 31m ago | 32.40 | 244660.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 1h ago | 32.40 | 150760.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 1h ago | 31.30 | 1822980.00
bedrock | claude-3-5-haiku | Active | 31m ago | 30.90 | 438680.00
openai | GPT-5.1 | Active | 1h ago | 29.90 | 357950.00
openai | gpt-4 | Active | 1h ago | 27.90 | 347670.00
openai | GPT-5.2 | Active | 1h ago | 27.00 | 1240900.00
openai | GPT-5.1-codex | Active | 1h ago | 25.60 | 1481300.00
deepinfra | llama-3.1-405b | Stale (Medium) | 1h ago | 25.40 | 139870.00
openai | GPT-5.1-codex-mini | Active | 4h ago | 25.20 | 1551180.00
bedrock | claude-sonnet-4.5 | Active | 31m ago | 22.20 | 2291620.00
deepinfra | llama-3.1-70b | Stale (Medium) | 1h ago | 21.60 | 1461130.00
anthropic | claude-4-sonnet | Active | 1h ago | 20.30 | 1311950.00
anthropic | claude-opus-4.5 | Active | 1h ago | 20.20 | 2331840.00
bedrock | claude-3-opus | Active | 18d ago | 19.40 | 622870.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 1h ago | 19.10 | 1462510.00
bedrock | claude-opus-4.5 | Active | 31m ago | 18.40 | 1242060.00
anthropic | Claude Opus 4.1 | Active | 1h ago | 17.90 | 8271490.00
anthropic | claude-4-opus | Active | 1h ago | 17.60 | 5241310.00
openai | gpt-5.2-codex | Active | 4h ago | 12.90 | 1271830.00
deepinfra | llama-3.2-11b | Stale (Medium) | 1h ago | 12.70 | 1622680.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 1h ago | 9.76 | 1406030.00
openai | o1-pro | Likely Deprecated (Medium) | 1h ago | 9.49 | 118640.00
openai | GPT-5.2-pro | Active | 1h ago | 8.52 | 4144990.00
Lifecycle snapshot

📈 Time Series 📈