
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart visualizes the distribution of speeds, which can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
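As a rough illustration of what a single measurement might look like, here is a minimal sketch that times one streamed generation and divides the token count by the elapsed time. It is not the code behind these benchmarks: `tokens_per_second`, `fake_stream`, and the injectable `clock` parameter are hypothetical names chosen for this example, and a real run would stream from a provider's API instead of a local generator.

```python
import time
from typing import Callable, Iterable


def tokens_per_second(
    generate: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one generation and return tokens per second.

    `generate` is a hypothetical callable that yields tokens one at a time.
    `clock` is injectable so the function can be tested without sleeping.
    """
    start = clock()
    n_tokens = sum(1 for _ in generate())  # consume the stream, counting tokens
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed


def fake_stream() -> Iterable[str]:
    """Stand-in for a provider's streaming response (assumption, not a real API)."""
    for token in ["Hello", ",", " world", "!"]:
        time.sleep(0.01)  # simulate network and generation latency
        yield token


if __name__ == "__main__":
    print(f"{tokens_per_second(fake_stream):.1f} tok/s")
```

Because load changes between runs, a scheduler such as cron would call something like this repeatedly and store each result, which is what makes a distribution chart possible.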

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, covering any that do not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 313 tok/s
2 | qwen-3-32b | groq | 219 tok/s
3 | llama-3.3-70b | groq | 198 tok/s
4 | llama-4-scout | groq | 197 tok/s
5 | llama-3.1-8b | cerebras | 179 tok/s
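The "updated <24h" filter explains why some faster entries in the full results do not appear in this list. The sketch below shows one way such a ranking could be computed: parse the relative age of each benchmark, keep only rows newer than 24 hours, and sort by speed. The rows are the nine fastest entries copied from the full results table. The function names and the exact filtering rule are assumptions for illustration, not the site's actual logic.

```python
import re

# Minutes per unit for ages written like "34m ago", "3h ago", "27d ago".
UNIT_MINUTES = {"m": 1, "h": 60, "d": 24 * 60}


def age_minutes(text: str) -> int:
    """Convert a relative age such as '34m ago' into minutes."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", text)
    if match is None:
        raise ValueError(f"unrecognized age: {text!r}")
    return int(match.group(1)) * UNIT_MINUTES[match.group(2)]


# (provider, model, last run, speed in tok/s), taken from the full results.
ROWS = [
    ("groq", "llama-3.1-8b", "34m ago", 313.0),
    ("cerebras", "qwen-3-32b", "27d ago", 292.0),
    ("groq", "qwen-3-32b", "34m ago", 219.0),
    ("groq", "llama-4-maverick", "6d ago", 203.0),
    ("groq", "llama-3.3-70b", "34m ago", 198.0),
    ("groq", "llama-4-scout", "34m ago", 197.0),
    ("cerebras", "gpt-oss-120b", "3d ago", 188.0),
    ("cerebras", "llama-3.3-70b", "27d ago", 188.0),
    ("cerebras", "llama-3.1-8b", "36m ago", 179.0),
]


def fastest_recent(rows, limit=5, max_age_minutes=24 * 60):
    """Return the fastest rows whose last benchmark is newer than the cutoff."""
    fresh = [row for row in rows if age_minutes(row[2]) < max_age_minutes]
    return sorted(fresh, key=lambda row: row[3], reverse=True)[:limit]


if __name__ == "__main__":
    for rank, (provider, model, _, speed) in enumerate(fastest_recent(ROWS), 1):
        print(rank, model, provider, f"{speed:.0f} tok/s")
```

With these rows, the entries last benchmarked 3, 6, and 27 days ago drop out, leaving the same five models listed above.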

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 80 of 80 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

Provider | Model | Status | Last run | Speed (tok/s) | Remaining columns (fused in source)
groq | llama-3.1-8b | Active | 34m ago | 313.00 | 8747190.00
cerebras | qwen-3-32b | Active | 27d ago | 292.00 | 189406200.00
groq | qwen-3-32b | Active | 34m ago | 219.00 | 2374310.00
groq | llama-4-maverick | Active | 6d ago | 203.00 | 1307650.00
groq | llama-3.3-70b | Active | 34m ago | 198.00 | 68340130.00
groq | llama-4-scout | Active | 34m ago | 197.00 | 38335240.00
cerebras | gpt-oss-120b | Active | 3d ago | 188.00 | 13801120.00
cerebras | llama-3.3-70b | Active | 27d ago | 188.00 | 35256400.00
cerebras | llama-3.1-8b | Active | 36m ago | 179.00 | 13531260.00
together | llama-3.1-8b | Active | 1d ago | 141.00 | 3228350.00
groq | kimi-k2 | Active | 34m ago | 139.00 | 12215310.00
bedrock | nova-micro | Active | 23m ago | 121.00 | 65152270.00
openai | o3 Mini | Never Succeeded (Medium) | 34m ago | 110.00 | 81640.00
bedrock | llama-4-maverick | Active | 23m ago | 108.00 | 3139270.00
bedrock | nova-lite | Active | 23m ago | 101.00 | 39132300.00
bedrock | llama-4-scout | Active | 23m ago | 101.00 | 6127280.00
bedrock | llama-3.3-70b | Active | 23m ago | 97.20 | 3136290.00
together | qwen-2.5-7b | Active | 32m ago | 91.60 | 1145510.00
bedrock | nova-pro | Active | 23m ago | 87.80 | 36121360.00
openai | GPT-5.1-codex-max | Active | 34m ago | 82.40 | 111181320.00
deepinfra | mistral-7b | Stale (Medium) | 36m ago | 77.70 | 5148630.00
openai | gpt-3.5-turbo | Active | 33m ago | 76.30 | 13126500.00
together | llama-3.1-70b | Active | 18d ago | 73.90 | 15129320.00
google | gemini-2.5-flash-lite | Active | 32m ago | 72.70 | 10132550.00
openai | gpt-4.1-nano | Active | 34m ago | 71.70 | 9149470.00
deepinfra | devstral-small | Never Succeeded (Medium) | 3h ago | 71.30 | 9140620.00
together | mistral-7b | Active | 18d ago | 70.30 | 291440.00
openai | gpt-4o | Active | 33m ago | 69.50 | 131731310.00
fireworks | mixtral-8x22b | Active | 34m ago | 68.90 | 29111410.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 32m ago | 67.00 | 5105970.00
together | mixtral-8x7b | Active | 32m ago | 61.20 | 14114160.00
together | llama-3.2-3b | Active | 9d ago | 57.50 | 51211440.00
deepinfra | mixtral-8x22b | Stale (Medium) | 25d ago | 56.80 | 1466530.00
fireworks | llama-3.3-70b | Active | 34m ago | 54.90 | 11081670.00
together | llama-3.3-70b | Active | 32m ago | 53.20 | 11461250.00
anthropic | claude-haiku-4.5 | Active | 36m ago | 51.90 | 1974530.00
openai | gpt-4.1-mini | Active | 34m ago | 51.90 | 15109440.00
together | deepseek-r1 | Active | 32m ago | 51.80 | 1113740.00
openai | o4 Mini | Never Succeeded (Medium) | 34m ago | 49.60 | 4760.00
bedrock | llama-3.2-90b | Active | 23m ago | 46.80 | 251370.00
deepinfra | llama-3-8b | Stale (Medium) | 34m ago | 45.60 | 1871320.00
deepinfra | llama-3.1-8b | Stale (Medium) | 35m ago | 44.90 | 385680.00
deepinfra | llama-3.2-1b | Stale (Medium) | 35m ago | 41.60 | 1100810.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 32m ago | 41.20 | 2721680.00
bedrock | mistral-large | Active | 23m ago | 40.90 | 247550.00
openai | gpt-4.1 | Active | 34m ago | 40.90 | 1083500.00
deepinfra | llama-3.2-3b | Stale (Medium) | 35m ago | 40.60 | 299770.00
openai | gpt-4o-mini | Active | 33m ago | 39.80 | 764390.00
bedrock | claude-haiku-4.5 | Active | 23m ago | 39.40 | 3621190.00
deepinfra | llama-3.2-90b | Stale (Medium) | 36m ago | 35.50 | 382730.00
deepinfra | llama-2-70b | Stale (Medium) | 34m ago | 34.60 | 357610.00
deepinfra | llama-3-70b | Stale (Medium) | 34m ago | 33.60 | 255660.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 36m ago | 33.00 | 150790.00
bedrock | claude-3-5-sonnet | Active | 23m ago | 32.80 | 246650.00
bedrock | claude-3-7-sonnet | Active | 23m ago | 32.70 | 243750.00
openai | gpt-4-turbo | Active | 33m ago | 32.40 | 252530.00
bedrock | claude-3-5-haiku | Active | 23m ago | 31.60 | 938650.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 36m ago | 30.50 | 1823540.00
openai | GPT-5.1 | Active | 34m ago | 29.50 | 2571100.00
openai | GPT-5.4 | Active | 34m ago | 27.90 | 19361080.00
openai | gpt-4 | Active | 33m ago | 27.50 | 547640.00
openai | GPT-5.2 | Active | 34m ago | 27.00 | 440970.00
openai | GPT-5.1-codex | Active | 34m ago | 26.20 | 1481250.00
deepinfra | llama-3.1-405b | Stale (Medium) | 35m ago | 25.60 | 139760.00
openai | GPT-5.1-codex-mini | Active | 34m ago | 25.50 | 1551190.00
deepinfra | llama-3.1-70b | Stale (Medium) | 35m ago | 22.90 | 1441110.00
bedrock | claude-sonnet-4.5 | Active | 23m ago | 22.10 | 2281690.00
openai | GPT-5.3-codex | Active | 34m ago | 21.30 | 7321230.00
anthropic | claude-opus-4.5 | Active | 36m ago | 20.60 | 2331800.00
anthropic | claude-4-sonnet | Active | 36m ago | 20.10 | 6311770.00
bedrock | claude-3-opus | Active | 24d ago | 19.60 | 822840.00
bedrock | claude-opus-4.5 | Active | 23m ago | 18.90 | 1272000.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 36m ago | 17.90 | 1462670.00
anthropic | Claude Opus 4.1 | Active | 36m ago | 17.60 | 7271560.00
anthropic | claude-4-opus | Active | 36m ago | 17.40 | 5221320.00
openai | gpt-5.2-codex | Active | 34m ago | 13.40 | 1271740.00
openai | o1-pro | Likely Deprecated (Medium) | 34m ago | 9.64 | 118670.00
deepinfra | llama-3.2-11b | Stale (Medium) | 36m ago | 8.97 | 1612630.00
openai | GPT-5.2-pro | Active | 34m ago | 8.58 | 4144890.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 36m ago | 8.35 | 1536610.00

Lifecycle snapshot

📈 Time Series 📈