
☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of measured speeds for each model, which can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
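As a rough illustration of what a single speed measurement might look like, here is a minimal sketch that times a token stream and reports tokens per second. The page does not describe the actual benchmark code, so `measure_tokens_per_second`, `fake_stream`, and the injectable `clock` parameter are all hypothetical names, and the fake stream stands in for a real provider's streaming response.

```python
import time
from typing import Callable, Iterable, Iterator, Optional


def measure_tokens_per_second(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a token stream and report throughput.

    `stream` is any iterable that yields one token at a time (an assumption:
    a real provider client would need an adapter to produce this shape).
    """
    start = clock()
    first_token_at: Optional[float] = None
    count = 0
    for _ in stream:
        if first_token_at is None:
            # Record when the first token arrives, for time-to-first-token.
            first_token_at = clock()
        count += 1
    elapsed = clock() - start
    return {
        "tokens": count,
        "seconds": elapsed,
        "tokens_per_second": count / elapsed if elapsed > 0 else 0.0,
        "time_to_first_token": (
            first_token_at - start if first_token_at is not None else None
        ),
    }


def fake_stream(n_tokens: int, delay_s: float = 0.0) -> Iterator[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    for i in range(n_tokens):
        if delay_s:
            time.sleep(delay_s)
        yield f"tok{i}"


if __name__ == "__main__":
    result = measure_tokens_per_second(fake_stream(200, delay_s=0.001))
    print(f"{result['tokens_per_second']:.1f} tok/s")
```

Running something like this on a schedule and storing each result would give the per-model samples that a distribution chart can be drawn from.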

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking anywhere that does not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have any suggestions, let me know on GitHub!! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

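| # | Model | Provider | Speed |
|---|-------|----------|-------|
| 1 | llama-3.1-8b | groq | 318 tok/s |
| 2 | qwen-3-32b | groq | 225 tok/s |
| 3 | llama-4-scout | groq | 201 tok/s |
| 4 | llama-3.3-70b | groq | 201 tok/s |
| 5 | gpt-oss-120b | cerebras | 187 tok/s |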
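The ranking above can be approximated by keeping only rows that ran within the last 24 hours and sorting by speed. The sketch below is an assumption about how such a list could be derived, not the site's actual logic: `age_in_hours` and `fastest_fresh` are hypothetical helpers, the rows are a small sample copied from the full results table, and any additional filtering by status is left out.

```python
import re

# Sample rows copied from the results table:
# (provider, model, last run, speed in tok/s).
ROWS = [
    ("groq", "llama-3.1-8b", "32m ago", 318.0),
    ("cerebras", "qwen-3-32b", "23d ago", 267.0),
    ("groq", "qwen-3-32b", "32m ago", 225.0),
    ("groq", "llama-4-scout", "32m ago", 201.0),
    ("groq", "llama-3.3-70b", "32m ago", 201.0),
    ("groq", "llama-4-maverick", "2d ago", 199.0),
    ("cerebras", "llama-3.3-70b", "23d ago", 189.0),
    ("cerebras", "gpt-oss-120b", "6h ago", 187.0),
]

# Conversion factors for the "m", "h", and "d" suffixes seen in the table.
HOURS_PER_UNIT = {"m": 1 / 60, "h": 1.0, "d": 24.0}


def age_in_hours(last_run: str) -> float:
    """Convert a relative timestamp such as '32m ago' into hours."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", last_run)
    if match is None:
        raise ValueError(f"unrecognized relative time: {last_run!r}")
    return int(match.group(1)) * HOURS_PER_UNIT[match.group(2)]


def fastest_fresh(rows, max_age_hours=24.0, limit=5):
    """Return the fastest rows whose last run is newer than max_age_hours."""
    fresh = [row for row in rows if age_in_hours(row[2]) < max_age_hours]
    # sorted() is stable, so rows with equal speed keep their table order.
    return sorted(fresh, key=lambda row: row[3], reverse=True)[:limit]


if __name__ == "__main__":
    for rank, (provider, model, _, speed) in enumerate(fastest_fresh(ROWS), 1):
        print(rank, model, provider, f"{speed:.0f} tok/s")
```

With this sample, the rows last run 23 days and 2 days ago are excluded even though some of them are faster than entries that make the list.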

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 80 of 80 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
| Provider | Model | Status | Last run | Speed (tok/s) | Remaining metrics (column boundaries lost) |
|----------|-------|--------|----------|---------------|--------------------------------------------|
| groq | llama-3.1-8b | Active | 32m ago | 318.00 | 8747190.00 |
| cerebras | qwen-3-32b | Active | 23d ago | 267.00 | 32444240.00 |
| groq | qwen-3-32b | Active | 32m ago | 225.00 | 2374310.00 |
| groq | llama-4-scout | Active | 32m ago | 201.00 | 38335240.00 |
| groq | llama-3.3-70b | Active | 32m ago | 201.00 | 68340130.00 |
| groq | llama-4-maverick | Active | 2d ago | 199.00 | 1307620.00 |
| cerebras | llama-3.3-70b | Active | 23d ago | 189.00 | 17316410.00 |
| cerebras | gpt-oss-120b | Active | 6h ago | 187.00 | 13801060.00 |
| cerebras | llama-3.1-8b | Active | 34m ago | 183.00 | 13531190.00 |
| groq | kimi-k2 | Active | 32m ago | 141.00 | 12215310.00 |
| together | llama-3.1-8b | Active | 30m ago | 140.00 | 3228340.00 |
| bedrock | nova-micro | Active | 34m ago | 121.00 | 65152270.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 31m ago | 109.00 | 81600.00 |
| bedrock | llama-4-maverick | Active | 34m ago | 108.00 | 3139260.00 |
| bedrock | nova-lite | Active | 34m ago | 101.00 | 39132300.00 |
| bedrock | llama-4-scout | Active | 34m ago | 101.00 | 4127280.00 |
| bedrock | llama-3.3-70b | Active | 34m ago | 97.20 | 3136300.00 |
| together | qwen-2.5-7b | Active | 30m ago | 93.50 | 1145490.00 |
| bedrock | nova-pro | Active | 34m ago | 89.30 | 36121350.00 |
| openai | GPT-5.1-codex-max | Active | 31m ago | 80.20 | 111181410.00 |
| openai | gpt-3.5-turbo | Active | 30m ago | 77.40 | 13126480.00 |
| deepinfra | mistral-7b | Stale (Medium) | 33m ago | 74.20 | 5148630.00 |
| google | gemini-2.5-flash-lite | Active | 30m ago | 73.70 | 10132550.00 |
| together | llama-3.1-70b | Active | 14d ago | 72.50 | 12129360.00 |
| openai | gpt-4.1-nano | Active | 31m ago | 71.40 | 9149460.00 |
| together | mistral-7b | Active | 14d ago | 70.70 | 291520.00 |
| openai | gpt-4o | Active | 30m ago | 69.60 | 131731310.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 34m ago | 68.40 | 9140600.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 30m ago | 67.30 | 51011010.00 |
| fireworks | mixtral-8x22b | Active | 32m ago | 67.00 | 29111430.00 |
| together | mixtral-8x7b | Active | 30m ago | 60.70 | 14114160.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 21d ago | 59.20 | 1477470.00 |
| together | llama-3.2-3b | Active | 5d ago | 55.50 | 51211600.00 |
| together | llama-3.3-70b | Active | 30m ago | 53.40 | 11461270.00 |
| fireworks | llama-3.3-70b | Active | 32m ago | 53.10 | 1951720.00 |
| anthropic | claude-haiku-4.5 | Active | 34m ago | 51.90 | 1974530.00 |
| openai | gpt-4.1-mini | Active | 31m ago | 51.40 | 15109460.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 31m ago | 49.00 | 4760.00 |
| together | deepseek-r1 | Active | 30m ago | 48.60 | 1113940.00 |
| bedrock | llama-3.2-90b | Active | 34m ago | 47.00 | 251350.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 32m ago | 45.80 | 2071310.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 32m ago | 45.60 | 385660.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 32m ago | 44.30 | 1100740.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 33m ago | 43.20 | 299630.00 |
| bedrock | mistral-large | Active | 34m ago | 41.20 | 247510.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 30m ago | 40.80 | 2721690.00 |
| bedrock | claude-haiku-4.5 | Active | 34m ago | 40.20 | 4621150.00 |
| openai | gpt-4o-mini | Active | 30m ago | 39.90 | 1865370.00 |
| openai | gpt-4.1 | Active | 31m ago | 39.20 | 683500.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 33m ago | 35.10 | 382810.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 32m ago | 34.70 | 357620.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 32m ago | 33.80 | 255720.00 |
| bedrock | claude-3-7-sonnet | Active | 34m ago | 32.90 | 244740.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 33m ago | 32.50 | 150810.00 |
| bedrock | claude-3-5-sonnet | Active | 34m ago | 32.40 | 244660.00 |
| openai | gpt-4-turbo | Active | 30m ago | 32.40 | 252530.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 34m ago | 31.20 | 1823020.00 |
| bedrock | claude-3-5-haiku | Active | 34m ago | 31.10 | 938660.00 |
| openai | GPT-5.1 | Active | 31m ago | 29.50 | 3571020.00 |
| openai | gpt-4 | Active | 30m ago | 27.60 | 347670.00 |
| openai | GPT-5.2 | Active | 31m ago | 26.80 | 440980.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 32m ago | 26.10 | 339720.00 |
| openai | GPT-5.1-codex | Active | 31m ago | 25.50 | 1481290.00 |
| openai | GPT-5.4 | Active | 32m ago | 24.80 | 19311540.00 |
| openai | GPT-5.1-codex-mini | Active | 31m ago | 24.60 | 1551220.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 32m ago | 22.20 | 1461120.00 |
| bedrock | claude-sonnet-4.5 | Active | 34m ago | 22.10 | 2291650.00 |
| anthropic | claude-opus-4.5 | Active | 34m ago | 20.30 | 2331820.00 |
| anthropic | claude-4-sonnet | Active | 34m ago | 20.30 | 1311960.00 |
| openai | GPT-5.3-codex | Active | 32m ago | 20.20 | 7321250.00 |
| bedrock | claude-3-opus | Active | 20d ago | 19.50 | 822850.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 34m ago | 18.50 | 1462520.00 |
| bedrock | claude-opus-4.5 | Active | 34m ago | 18.40 | 1242050.00 |
| anthropic | Claude Opus 4.1 | Active | 34m ago | 17.80 | 8271500.00 |
| anthropic | claude-4-opus | Active | 34m ago | 17.40 | 5241310.00 |
| openai | gpt-5.2-codex | Active | 32m ago | 13.00 | 1271800.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 33m ago | 11.20 | 1622690.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 31m ago | 9.44 | 118700.00 |
| openai | GPT-5.2-pro | Active | 31m ago | 8.57 | 4144980.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 34m ago | 8.35 | 1406890.00 |

📈 Time Series 📈