
☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, which can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
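As a rough illustration of what one benchmark run measures, here is a minimal sketch that counts the tokens in a streamed response and divides by elapsed wall-clock time. The function names and the fake stream are hypothetical stand-ins, not the code the cron jobs actually run, and real provider clients expose streaming in their own ways.

```python
import time
from typing import Callable, Iterable, Iterator


def tokens_per_second(
    token_stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume a token stream and return tokens divided by elapsed seconds."""
    start = clock()
    count = 0
    for _ in token_stream:
        count += 1
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed


def fake_stream(n_tokens: int, delay_s: float) -> Iterator[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    for i in range(n_tokens):
        time.sleep(delay_s)  # simulate per-token latency
        yield f"tok{i}"


if __name__ == "__main__":
    speed = tokens_per_second(fake_stream(n_tokens=50, delay_s=0.01))
    print(f"{speed:.1f} tok/s")
```

Passing the clock in as a parameter keeps the timing logic testable without real network calls. Repeating the measurement on a schedule is what produces a distribution of speeds rather than a single number.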

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, covering any provider that does not require purchasing a dedicated endpoint for hosting, which is why some models may appear to be missing. If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

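| # | Model | Provider | Speed |
|---|-------|----------|-------|
| 1 | llama-3.1-8b | groq | 317 tok/s |
| 2 | qwen-3-32b | groq | 222 tok/s |
| 3 | llama-3.3-70b | groq | 201 tok/s |
| 4 | llama-4-scout | groq | 198 tok/s |
| 5 | llama-3.1-8b | cerebras | 185 tok/s |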
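One plausible way to build a list like this is to keep only results updated in the last 24 hours and then sort by speed. The sketch below assumes that rule (the site does not state its exact ranking logic) and uses a hypothetical `Result` record filled with the ten fastest rows from the full results table.

```python
from datetime import timedelta
from typing import List, NamedTuple


class Result(NamedTuple):
    provider: str
    model: str
    tokens_per_second: float
    age: timedelta  # time since the last benchmark run


def fastest_fresh(
    results: List[Result],
    max_age: timedelta = timedelta(hours=24),
    top_n: int = 5,
) -> List[Result]:
    """Keep results newer than max_age, then return the top_n by speed."""
    fresh = [r for r in results if r.age < max_age]
    return sorted(fresh, key=lambda r: r.tokens_per_second, reverse=True)[:top_n]


# Rows copied from the full results table (speed in tok/s, age of last run).
results = [
    Result("groq", "llama-3.1-8b", 317.0, timedelta(minutes=34)),
    Result("cerebras", "qwen-3-32b", 264.0, timedelta(days=25)),
    Result("groq", "qwen-3-32b", 222.0, timedelta(minutes=34)),
    Result("groq", "llama-3.3-70b", 201.0, timedelta(minutes=34)),
    Result("groq", "llama-4-maverick", 200.0, timedelta(days=4)),
    Result("groq", "llama-4-scout", 198.0, timedelta(minutes=34)),
    Result("cerebras", "gpt-oss-120b", 186.0, timedelta(days=1)),
    Result("cerebras", "llama-3.3-70b", 186.0, timedelta(days=25)),
    Result("cerebras", "llama-3.1-8b", 185.0, timedelta(minutes=36)),
    Result("groq", "kimi-k2", 140.0, timedelta(minutes=34)),
]

top = fastest_fresh(results)
for rank, r in enumerate(top, start=1):
    print(rank, r.model, r.provider, f"{r.tokens_per_second:.0f} tok/s")
```

Under this assumed rule, faster entries such as cerebras qwen-3-32b (264 tok/s, last run 25d ago) drop out because their latest run is older than a day, which is consistent with the five rows shown above.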

📊 Speed Distribution 📊
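For readers who want numbers rather than a chart, a distribution of repeated speed measurements can be summarized with a five-number summary. This is only a sketch: the `summarize` helper is hypothetical, and the sample values are made up for illustration, not taken from the benchmark data.

```python
import statistics
from typing import Dict, Sequence


def summarize(speeds: Sequence[float]) -> Dict[str, float]:
    """Return min, quartiles, and max for a set of tok/s measurements."""
    ordered = sorted(speeds)
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # data as the whole population rather than a sample of a larger one.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }


# Made-up measurements for one model across five runs (tok/s).
sample_runs = [300, 310, 317, 320, 330]
print(summarize(sample_runs))
```

A wide gap between `q1` and `q3` suggests the provider's speed swings with load, while a narrow gap suggests consistent throughput.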

📚 Full Results 📚

Showing 80 of 80 models. Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled.
| Provider | Model | Status | Last run | Speed (tok/s) | Other metrics (unsplit) |
|----------|-------|--------|----------|---------------|-------------------------|
| groq | llama-3.1-8b | Active | 34m ago | 317.00 | 8747190.00 |
| cerebras | qwen-3-32b | Active | 25d ago | 264.00 | 32406250.00 |
| groq | qwen-3-32b | Active | 34m ago | 222.00 | 2374310.00 |
| groq | llama-3.3-70b | Active | 34m ago | 201.00 | 68340130.00 |
| groq | llama-4-maverick | Active | 4d ago | 200.00 | 1307640.00 |
| groq | llama-4-scout | Active | 34m ago | 198.00 | 38335240.00 |
| cerebras | gpt-oss-120b | Active | 1d ago | 186.00 | 13801090.00 |
| cerebras | llama-3.3-70b | Active | 25d ago | 186.00 | 17280460.00 |
| cerebras | llama-3.1-8b | Active | 36m ago | 185.00 | 13531190.00 |
| groq | kimi-k2 | Active | 34m ago | 140.00 | 12215310.00 |
| together | llama-3.1-8b | Active | 32m ago | 140.00 | 3228350.00 |
| bedrock | nova-micro | Active | 28m ago | 121.00 | 65152270.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 33m ago | 109.00 | 81600.00 |
| bedrock | llama-4-maverick | Active | 28m ago | 108.00 | 3139270.00 |
| bedrock | nova-lite | Active | 28m ago | 101.00 | 39132300.00 |
| bedrock | llama-4-scout | Active | 28m ago | 101.00 | 6127270.00 |
| bedrock | llama-3.3-70b | Active | 28m ago | 97.30 | 3136290.00 |
| together | qwen-2.5-7b | Active | 32m ago | 93.40 | 1145490.00 |
| bedrock | nova-pro | Active | 28m ago | 88.90 | 36121360.00 |
| openai | GPT-5.1-codex-max | Active | 33m ago | 81.40 | 111181370.00 |
| deepinfra | mistral-7b | Stale (Medium) | 35m ago | 77.90 | 5148630.00 |
| openai | gpt-3.5-turbo | Active | 33m ago | 76.70 | 13126490.00 |
| google | gemini-2.5-flash-lite | Active | 32m ago | 73.30 | 10132550.00 |
| together | llama-3.1-70b | Active | 16d ago | 72.40 | 12129350.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 3h ago | 71.50 | 9140600.00 |
| openai | gpt-4.1-nano | Active | 33m ago | 71.30 | 9149470.00 |
| together | mistral-7b | Active | 16d ago | 71.00 | 291410.00 |
| openai | gpt-4o | Active | 33m ago | 69.60 | 131731300.00 |
| fireworks | mixtral-8x22b | Active | 34m ago | 68.20 | 29111420.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 32m ago | 67.30 | 5101970.00 |
| together | mixtral-8x7b | Active | 32m ago | 61.40 | 14114160.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 23d ago | 57.80 | 1466510.00 |
| together | llama-3.2-3b | Active | 7d ago | 56.40 | 51211500.00 |
| fireworks | llama-3.3-70b | Active | 34m ago | 54.40 | 11081680.00 |
| together | llama-3.3-70b | Active | 32m ago | 53.70 | 11461250.00 |
| anthropic | claude-haiku-4.5 | Active | 36m ago | 51.80 | 1974540.00 |
| openai | gpt-4.1-mini | Active | 33m ago | 51.50 | 15109450.00 |
| together | deepseek-r1 | Active | 32m ago | 50.10 | 1113740.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 33m ago | 49.50 | 4760.00 |
| bedrock | llama-3.2-90b | Active | 28m ago | 46.90 | 251360.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 34m ago | 46.00 | 1871320.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 34m ago | 45.20 | 385670.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 34m ago | 43.10 | 1100810.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 35m ago | 41.90 | 299740.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 32m ago | 41.30 | 2721660.00 |
| bedrock | mistral-large | Active | 28m ago | 40.90 | 247540.00 |
| openai | gpt-4.1 | Active | 33m ago | 39.80 | 683500.00 |
| openai | gpt-4o-mini | Active | 33m ago | 39.80 | 765390.00 |
| bedrock | claude-haiku-4.5 | Active | 28m ago | 39.50 | 3621190.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 35m ago | 35.70 | 382750.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 34m ago | 34.80 | 357620.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 34m ago | 33.90 | 255670.00 |
| bedrock | claude-3-7-sonnet | Active | 28m ago | 32.80 | 244740.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 35m ago | 32.70 | 150800.00 |
| bedrock | claude-3-5-sonnet | Active | 28m ago | 32.50 | 246660.00 |
| openai | gpt-4-turbo | Active | 33m ago | 32.40 | 252530.00 |
| bedrock | claude-3-5-haiku | Active | 28m ago | 31.40 | 938650.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 36m ago | 30.90 | 1823380.00 |
| openai | GPT-5.1 | Active | 33m ago | 29.30 | 3571030.00 |
| openai | gpt-4 | Active | 33m ago | 27.60 | 547630.00 |
| openai | GPT-5.2 | Active | 33m ago | 27.00 | 438970.00 |
| openai | GPT-5.4 | Active | 34m ago | 26.60 | 19361200.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 34m ago | 25.90 | 139750.00 |
| openai | GPT-5.1-codex | Active | 33m ago | 25.80 | 1481260.00 |
| openai | GPT-5.1-codex-mini | Active | 33m ago | 25.00 | 1551200.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 34m ago | 23.20 | 1461110.00 |
| bedrock | claude-sonnet-4.5 | Active | 28m ago | 22.10 | 2291670.00 |
| openai | GPT-5.3-codex | Active | 34m ago | 20.70 | 7321280.00 |
| anthropic | claude-opus-4.5 | Active | 36m ago | 20.40 | 2331810.00 |
| anthropic | claude-4-sonnet | Active | 36m ago | 20.00 | 1312020.00 |
| bedrock | claude-3-opus | Active | 22d ago | 19.50 | 822850.00 |
| bedrock | claude-opus-4.5 | Active | 28m ago | 18.60 | 1262030.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 35m ago | 18.20 | 1462600.00 |
| anthropic | Claude Opus 4.1 | Active | 36m ago | 17.80 | 8271500.00 |
| anthropic | claude-4-opus | Active | 36m ago | 17.40 | 5241330.00 |
| openai | gpt-5.2-codex | Active | 34m ago | 13.10 | 1271780.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 35m ago | 9.90 | 1612750.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 33m ago | 9.61 | 118610.00 |
| openai | GPT-5.2-pro | Active | 33m ago | 8.62 | 4144950.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 35m ago | 8.35 | 1536730.00 |

📈 Time Series 📈