
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart shows the distribution of measured speeds for each model, since throughput varies with provider load. For readability, the chart omits some models; the table below has the full results.
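As a rough illustration of what one such measurement could look like, here is a minimal sketch. It is not the benchmark's actual code: `generate` is a hypothetical callable that streams tokens for a prompt, and the summary fields are just one reasonable way to describe the spread of repeated runs.

```python
import statistics
import time
from typing import Callable, Iterable


def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Throughput of one run: generated tokens divided by wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return n_tokens / elapsed_s


def time_generation(generate: Callable[[str], Iterable[str]], prompt: str) -> float:
    """Time a single call to `generate` (hypothetical token-streaming function)."""
    start = time.perf_counter()
    n_tokens = sum(1 for _ in generate(prompt))  # consume the whole stream
    elapsed = time.perf_counter() - start
    return tokens_per_second(n_tokens, elapsed)


def summarize(samples: list[float]) -> dict[str, float]:
    """Describe the spread of repeated runs (needs at least two samples)."""
    q1, median, q3 = statistics.quantiles(samples, n=4)
    return {
        "min": min(samples),
        "p25": q1,
        "median": median,
        "p75": q3,
        "max": max(samples),
    }
```

Collecting one `time_generation` result per scheduled run and passing the list to `summarize` gives the kind of spread that a distribution chart visualizes.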

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I add providers and models daily, covering any service that does not require purchasing a dedicated endpoint for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 319 tok/s
2 | qwen-3-32b | groq | 227 tok/s
3 | llama-4-scout | groq | 206 tok/s
4 | llama-3.3-70b | groq | 201 tok/s
5 | llama-4-maverick | groq | 197 tok/s
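The "updated <24h" filter can be sketched as follows. This is an assumption about how such a ranking might be derived, not the site's actual logic: the rows are copied from the Full Results table below, ages are assumed to use only the `m`/`h`/`d` suffixes seen there, and `parse_age` and `fastest_recent` are hypothetical helper names.

```python
import re
from datetime import timedelta

# Assumed unit suffixes, based on the "39m ago" / "9h ago" / "21d ago" strings.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_age(text: str) -> timedelta:
    """Convert a relative age such as '39m ago' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", text.strip())
    if match is None:
        raise ValueError(f"unrecognized age: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def fastest_recent(rows, max_age=timedelta(hours=24), limit=5):
    """Keep rows updated within max_age, then return the fastest `limit` rows."""
    fresh = [row for row in rows if parse_age(row[2]) < max_age]
    return sorted(fresh, key=lambda row: row[3], reverse=True)[:limit]


# (provider, model, last run, speed in tok/s), taken from the full table.
ROWS = [
    ("groq", "llama-3.1-8b", "39m ago", 319.0),
    ("cerebras", "qwen-3-32b", "21d ago", 260.0),
    ("groq", "qwen-3-32b", "39m ago", 227.0),
    ("groq", "llama-4-scout", "39m ago", 206.0),
    ("groq", "llama-3.3-70b", "39m ago", 201.0),
    ("groq", "llama-4-maverick", "9h ago", 197.0),
    ("cerebras", "gpt-oss-120b", "9h ago", 188.0),
]

for rank, (provider, model, _, speed) in enumerate(fastest_recent(ROWS), start=1):
    print(rank, model, provider, f"{speed:.0f} tok/s")
```

With these rows, the 260 tok/s cerebras entry drops out because its last run was 21 days ago, which is consistent with the five groq entries listed above.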

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 78 of 78 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
Provider | Model | Status | Last run | Speed (tok/s) | Other metrics (run together in source)
groq | llama-3.1-8b | Active | 39m ago | 319.00 | 8747190.00
cerebras | qwen-3-32b | Active | 21d ago | 260.00 | 32444240.00
groq | qwen-3-32b | Active | 39m ago | 227.00 | 2374300.00
groq | llama-4-scout | Active | 39m ago | 206.00 | 38335220.00
groq | llama-3.3-70b | Active | 39m ago | 201.00 | 68340130.00
groq | llama-4-maverick | Active | 9h ago | 197.00 | 1307610.00
cerebras | gpt-oss-120b | Active | 9h ago | 188.00 | 13801060.00
cerebras | llama-3.1-8b | Active | 41m ago | 188.00 | 13481140.00
cerebras | llama-3.3-70b | Active | 21d ago | 186.00 | 17316410.00
groq | kimi-k2 | Active | 39m ago | 141.00 | 12215310.00
together | llama-3.1-8b | Active | 37m ago | 141.00 | 3232340.00
bedrock | nova-micro | Active | 23m ago | 121.00 | 65152270.00
openai | o3 Mini | Never Succeeded (Medium) | 39m ago | 111.00 | 211600.00
bedrock | llama-4-maverick | Active | 23m ago | 108.00 | 3139260.00
bedrock | nova-lite | Active | 23m ago | 101.00 | 39132300.00
bedrock | llama-4-scout | Active | 23m ago | 101.00 | 4129280.00
bedrock | llama-3.3-70b | Active | 23m ago | 98.10 | 3136290.00
together | qwen-2.5-7b | Active | 37m ago | 94.30 | 7145260.00
bedrock | nova-pro | Active | 23m ago | 89.80 | 35121350.00
openai | GPT-5.1-codex-max | Active | 39m ago | 80.50 | 111181420.00
openai | gpt-3.5-turbo | Active | 38m ago | 78.00 | 13126470.00
google | gemini-2.5-flash-lite | Active | 37m ago | 75.60 | 10132520.00
together | llama-3.1-70b | Active | 12d ago | 72.50 | 10129360.00
openai | gpt-4.1-nano | Active | 39m ago | 71.90 | 9149450.00
deepinfra | mistral-7b | Stale (Medium) | 40m ago | 71.30 | 5148620.00
together | mistral-7b | Active | 12d ago | 70.70 | 291490.00
openai | gpt-4o | Active | 38m ago | 69.60 | 131731300.00
fireworks | mixtral-8x22b | Active | 8d ago | 67.00 | 29112440.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 37m ago | 66.70 | 51011030.00
deepinfra | devstral-small | Never Succeeded (Medium) | 41m ago | 65.80 | 9140600.00
together | mixtral-8x7b | Active | 37m ago | 60.90 | 14114150.00
deepinfra | mixtral-8x22b | Stale (Medium) | 19d ago | 59.20 | 1478430.00
together | llama-3.2-3b | Active | 3d ago | 55.90 | 51211560.00
together | llama-3.3-70b | Active | 38m ago | 53.40 | 11461410.00
fireworks | llama-3.3-70b | Active | 39m ago | 52.40 | 2951420.00
anthropic | claude-haiku-4.5 | Active | 42m ago | 51.90 | 1974550.00
openai | gpt-4.1-mini | Active | 39m ago | 51.30 | 1585450.00
openai | o4 Mini | Never Succeeded (Medium) | 39m ago | 49.80 | 14760.00
together | deepseek-r1 | Active | 37m ago | 47.30 | 1113940.00
bedrock | llama-3.2-90b | Active | 23m ago | 47.10 | 251350.00
deepinfra | llama-3.1-8b | Stale (Medium) | 40m ago | 46.20 | 396640.00
deepinfra | llama-3.2-1b | Stale (Medium) | 40m ago | 45.90 | 1100660.00
deepinfra | llama-3-8b | Stale (Medium) | 39m ago | 45.50 | 971310.00
deepinfra | llama-3.2-3b | Stale (Medium) | 40m ago | 44.60 | 299560.00
bedrock | mistral-large | Active | 23m ago | 41.60 | 247450.00
openai | gpt-4o-mini | Active | 38m ago | 41.10 | 1895360.00
bedrock | claude-haiku-4.5 | Active | 23m ago | 41.00 | 4621110.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 37m ago | 41.00 | 2721690.00
openai | gpt-4.1 | Active | 39m ago | 38.70 | 683500.00
deepinfra | llama-3.2-90b | Stale (Medium) | 40m ago | 35.10 | 382870.00
deepinfra | llama-2-70b | Stale (Medium) | 39m ago | 35.00 | 357570.00
deepinfra | llama-3-70b | Stale (Medium) | 40m ago | 34.10 | 255680.00
bedrock | claude-3-7-sonnet | Active | 23m ago | 33.00 | 244740.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 41m ago | 32.50 | 150760.00
bedrock | claude-3-5-sonnet | Active | 23m ago | 32.40 | 244660.00
openai | gpt-4-turbo | Active | 38m ago | 32.40 | 252530.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 41m ago | 31.10 | 1823000.00
bedrock | claude-3-5-haiku | Active | 23m ago | 30.90 | 438680.00
openai | GPT-5.1 | Active | 39m ago | 29.80 | 357960.00
openai | gpt-4 | Active | 38m ago | 27.80 | 347670.00
openai | GPT-5.2 | Active | 39m ago | 26.90 | 1240910.00
deepinfra | llama-3.1-405b | Stale (Medium) | 40m ago | 25.60 | 139870.00
openai | GPT-5.1-codex | Active | 39m ago | 25.40 | 1481310.00
openai | GPT-5.1-codex-mini | Active | 39m ago | 25.30 | 1551180.00
bedrock | claude-sonnet-4.5 | Active | 23m ago | 22.20 | 2291640.00
deepinfra | llama-3.1-70b | Stale (Medium) | 40m ago | 21.80 | 1461130.00
anthropic | claude-4-sonnet | Active | 42m ago | 20.30 | 1311960.00
anthropic | claude-opus-4.5 | Active | 42m ago | 20.10 | 2331830.00
bedrock | claude-3-opus | Active | 18d ago | 19.40 | 622870.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 41m ago | 18.80 | 1462540.00
bedrock | claude-opus-4.5 | Active | 23m ago | 18.40 | 1242060.00
anthropic | Claude Opus 4.1 | Active | 42m ago | 17.80 | 8271510.00
anthropic | claude-4-opus | Active | 41m ago | 17.50 | 5241310.00
openai | gpt-5.2-codex | Active | 3h ago | 12.90 | 1271820.00
deepinfra | llama-3.2-11b | Stale (Medium) | 40m ago | 12.70 | 1622680.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 41m ago | 9.53 | 1406200.00
openai | o1-pro | Likely Deprecated (Medium) | 39m ago | 9.43 | 118640.00
openai | GPT-5.2-pro | Active | 39m ago | 8.53 | 4144960.00
Lifecycle snapshot

📈 Time Series 📈