
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically test the token generation speed of different cloud LLM providers. The chart shows how those speeds are distributed, since they can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
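
As a rough illustration of what one benchmark run might compute, here is a minimal sketch that times a single generation call and converts it to tokens per second, then summarizes several runs. The names `run_once`, `summarize`, and `fake_generate` are hypothetical, and the stub stands in for a real provider call; the actual benchmark scripts may measure things differently (for example, excluding time to first token).

```python
import statistics
import time
from typing import Callable


def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Convert a token count and a wall-clock duration into tok/s."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


def run_once(generate: Callable[[str], int], prompt: str) -> float:
    """Time one call to `generate`, which returns the number of output tokens."""
    start = time.perf_counter()
    num_tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return tokens_per_second(num_tokens, elapsed)


def summarize(samples: list[float]) -> dict[str, float]:
    """Summarize repeated runs, since speed varies with provider load."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def fake_generate(prompt: str) -> int:
    """Hypothetical stand-in for a provider API call (assumption, not a real client)."""
    time.sleep(0.01)
    return 50


if __name__ == "__main__":
    samples = [run_once(fake_generate, "Hello") for _ in range(5)]
    print(summarize(samples))
```

Running this from a scheduler such as cron at a fixed interval would build up the kind of per-model sample set that a distribution chart needs.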

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, drawing on any service that does not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.

Fastest Models Right Now (updated <24h)

#  Model          Provider  Speed
1  llama-3.1-8b   groq      316 tok/s
2  qwen-3-32b     groq      221 tok/s
3  llama-3.3-70b  groq      200 tok/s
4  llama-4-scout  groq      199 tok/s
5  llama-3.1-8b   cerebras  184 tok/s
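
The five entries above match what you get by keeping only rows updated within the last 24 hours and sorting by speed. The sketch below reproduces that selection from the fastest rows of the full results table. This is an inference from the published numbers, not necessarily how the site computes the list; `age_hours` and `fastest_fresh` are hypothetical helper names.

```python
import re

# Hours represented by each unit suffix in labels such as "3h ago" or "26d ago".
UNIT_HOURS = {"m": 1 / 60, "h": 1.0, "d": 24.0}


def age_hours(label: str) -> float:
    """Parse a relative age label like '27m ago' into hours."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", label)
    if match is None:
        raise ValueError(f"unrecognized age label: {label!r}")
    return int(match.group(1)) * UNIT_HOURS[match.group(2)]


def fastest_fresh(rows, limit=5, max_age_hours=24.0):
    """Return the fastest rows whose last update is newer than max_age_hours."""
    fresh = [row for row in rows if age_hours(row[2]) < max_age_hours]
    return sorted(fresh, key=lambda row: row[3], reverse=True)[:limit]


# (provider, model, updated, speed in tok/s), copied from the full results table.
ROWS = [
    ("groq", "llama-3.1-8b", "3h ago", 316.0),
    ("cerebras", "qwen-3-32b", "26d ago", 282.0),
    ("groq", "qwen-3-32b", "3h ago", 221.0),
    ("groq", "llama-4-maverick", "5d ago", 201.0),
    ("groq", "llama-3.3-70b", "3h ago", 200.0),
    ("groq", "llama-4-scout", "3h ago", 199.0),
    ("cerebras", "gpt-oss-120b", "2d ago", 187.0),
    ("cerebras", "llama-3.3-70b", "26d ago", 185.0),
    ("cerebras", "llama-3.1-8b", "3h ago", 184.0),
]

if __name__ == "__main__":
    for rank, (provider, model, _, speed) in enumerate(fastest_fresh(ROWS), start=1):
        print(f"{rank}  {model}  {provider}  {speed:.0f} tok/s")
```

Under this reading, cerebras qwen-3-32b (282 tok/s, updated 26d ago) and groq llama-4-maverick (201 tok/s, updated 5d ago) drop out of the list because their latest results are older than 24 hours.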

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 80 of 80 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
Provider   Model                  Status                      Updated  Speed (tok/s)
groq       llama-3.1-8b           Active                      3h ago   316.00         8747190.00
cerebras   qwen-3-32b             Active                      26d ago  282.00         59406230.00
groq       qwen-3-32b             Active                      3h ago   221.00         2374310.00
groq       llama-4-maverick       Active                      5d ago   201.00         1307650.00
groq       llama-3.3-70b          Active                      3h ago   200.00         68340130.00
groq       llama-4-scout          Active                      3h ago   199.00         38335240.00
cerebras   gpt-oss-120b           Active                      2d ago   187.00         13801100.00
cerebras   llama-3.3-70b          Active                      26d ago  185.00         35280380.00
cerebras   llama-3.1-8b           Active                      3h ago   184.00         13531210.00
together   llama-3.1-8b           Active                      18h ago  141.00         3228350.00
groq       kimi-k2                Active                      3h ago   140.00         12215310.00
bedrock    nova-micro             Active                      27m ago  121.00         65152270.00
openai     o3 Mini                Never Succeeded (Medium)    3h ago   110.00         81640.00
bedrock    llama-4-maverick       Active                      27m ago  108.00         3139270.00
bedrock    nova-lite              Active                      27m ago  101.00         39132300.00
bedrock    llama-4-scout          Active                      27m ago  101.00         6127270.00
bedrock    llama-3.3-70b          Active                      27m ago  97.50          3136290.00
together   qwen-2.5-7b            Active                      3h ago   92.20          1145510.00
bedrock    nova-pro               Active                      27m ago  88.50          36121360.00
openai     GPT-5.1-codex-max      Active                      3h ago   82.00          111181350.00
deepinfra  mistral-7b             Stale (Medium)              3h ago   78.10          5148620.00
openai     gpt-3.5-turbo          Active                      3h ago   76.60          13126490.00
google     gemini-2.5-flash-lite  Active                      3h ago   73.50          10132550.00
together   llama-3.1-70b          Active                      17d ago  72.90          12129350.00
openai     gpt-4.1-nano           Active                      3h ago   71.60          9149470.00
deepinfra  devstral-small         Never Succeeded (Medium)    6h ago   71.40          9140610.00
together   mistral-7b             Active                      17d ago  70.90          291420.00
openai     gpt-4o                 Active                      3h ago   69.80          131731300.00
fireworks  mixtral-8x22b          Active                      3h ago   68.40          29111420.00
google     gemini-2.5-flash       Never Succeeded (Medium)    3h ago   67.10          5101970.00
together   mixtral-8x7b           Active                      3h ago   61.20          14114160.00
deepinfra  mixtral-8x22b          Stale (Medium)              24d ago  57.60          1466520.00
together   llama-3.2-3b           Active                      8d ago   57.50          51211450.00
fireworks  llama-3.3-70b          Active                      3h ago   54.60          11081680.00
together   llama-3.3-70b          Active                      3h ago   53.80          11461250.00
openai     gpt-4.1-mini           Active                      3h ago   51.90          15109450.00
anthropic  claude-haiku-4.5       Active                      3h ago   51.80          1974540.00
together   deepseek-r1            Active                      3h ago   50.50          1113740.00
openai     o4 Mini                Never Succeeded (Medium)    3h ago   49.80          4760.00
bedrock    llama-3.2-90b          Active                      27m ago  46.90          251360.00
deepinfra  llama-3-8b             Stale (Medium)              3h ago   46.00          1871310.00
deepinfra  llama-3.1-8b           Stale (Medium)              3h ago   45.40          385670.00
deepinfra  llama-3.2-1b           Stale (Medium)              3h ago   42.40          1100780.00
deepinfra  llama-3.2-3b           Stale (Medium)              3h ago   41.40          299730.00
google     gemini-2.5-pro         Never Succeeded (Medium)    3h ago   41.30          2721660.00
bedrock    mistral-large          Active                      27m ago  40.90          247540.00
openai     gpt-4.1                Active                      3h ago   40.50          1083500.00
openai     gpt-4o-mini            Active                      3h ago   39.80          764390.00
bedrock    claude-haiku-4.5       Active                      28m ago  39.60          3621170.00
deepinfra  llama-3.2-90b          Stale (Medium)              3h ago   35.70          382750.00
deepinfra  llama-2-70b            Stale (Medium)              3h ago   34.70          357620.00
deepinfra  llama-3-70b            Stale (Medium)              3h ago   33.80          255660.00
bedrock    claude-3-7-sonnet      Active                      28m ago  32.90          244740.00
deepinfra  qwen-2.5-72b           Stale (Medium)              3h ago   32.70          150790.00
bedrock    claude-3-5-sonnet      Active                      28m ago  32.60          246650.00
openai     gpt-4-turbo            Active                      3h ago   32.50          252530.00
bedrock    claude-3-5-haiku       Active                      28m ago  31.50          938650.00
deepinfra  Qwen 2.5 Coder 32B     Never Succeeded (Medium)    3h ago   30.60          1823440.00
openai     GPT-5.1                Active                      3h ago   29.50          3571030.00
openai     gpt-4                  Active                      3h ago   27.70          547630.00
openai     GPT-5.4                Active                      3h ago   27.70          19361120.00
openai     GPT-5.2                Active                      3h ago   27.00          438970.00
openai     GPT-5.1-codex          Active                      3h ago   26.00          1481260.00
deepinfra  llama-3.1-405b         Stale (Medium)              3h ago   25.90          139740.00
openai     GPT-5.1-codex-mini     Active                      3h ago   25.30          1551200.00
deepinfra  llama-3.1-70b          Stale (Medium)              3h ago   23.10          1441110.00
bedrock    claude-sonnet-4.5      Active                      27m ago  22.20          2291660.00
openai     GPT-5.3-codex          Active                      3h ago   21.10          7321250.00
anthropic  claude-opus-4.5        Active                      3h ago   20.50          2331800.00
anthropic  claude-4-sonnet        Active                      3h ago   20.10          1312000.00
bedrock    claude-3-opus          Active                      23d ago  19.70          822840.00
bedrock    claude-opus-4.5        Active                      28m ago  18.80          1272010.00
deepinfra  llama-3.3-70b          Never Succeeded (Medium)    3h ago   18.10          1462640.00
anthropic  Claude Opus 4.1        Active                      3h ago   17.70          8271530.00
anthropic  claude-4-opus          Active                      3h ago   17.40          5241330.00
openai     gpt-5.2-codex          Active                      3h ago   13.30          1271760.00
deepinfra  llama-3.2-11b          Stale (Medium)              3h ago   9.72           1612610.00
openai     o1-pro                 Likely Deprecated (Medium)  3h ago   9.69           118670.00
openai     GPT-5.2-pro            Active                      3h ago   8.60           4145030.00
deepinfra  qwen-3-235b            Never Succeeded (Medium)    3h ago   8.35           1536690.00

📈 Time Series 📈