☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of measured speeds for each model, since throughput varies with provider load. For readability, not all models appear in the chart, but the full results are in the table below.
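
As a rough illustration of what one benchmark run measures, here is a minimal Python sketch. It is not the site's actual code: `tokens_per_second`, `time_generation`, and `summarize` are hypothetical names, and `generate` stands in for whatever provider client yields tokens. The only assumption is that speed means generated tokens divided by wall-clock time, with repeated runs summarized for the distribution chart.

```python
import statistics
import time
from typing import Callable, Iterable


def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Throughput for one run: generated tokens divided by wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


def time_generation(generate: Callable[[], Iterable[str]]) -> float:
    """Time one call to `generate` (a hypothetical stand-in for a provider
    client that yields tokens) and return its tokens-per-second rate."""
    start = time.perf_counter()
    count = sum(1 for _ in generate())  # consume the whole token stream
    elapsed = time.perf_counter() - start
    return tokens_per_second(count, elapsed)


def summarize(samples: list[float]) -> dict[str, float]:
    """Collapse repeated runs into summary statistics for a distribution."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "mean": statistics.fmean(samples),
        "max": max(samples),
    }
```

A cron entry would call something like `time_generation` once per provider and model, store the result, and leave `summarize` to the charting step.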

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I add providers and models daily, covering any that do not require purchasing a dedicated hosting endpoint (which is why some models may appear to be missing). If you have suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.

Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 318 tok/s
2 | qwen-3-32b | groq | 229 tok/s
3 | llama-4-scout | groq | 208 tok/s
4 | llama-3.3-70b | groq | 202 tok/s
5 | llama-4-maverick | groq | 196 tok/s
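
The ranking above is consistent with a simple rule: keep only results refreshed within the last 24 hours, then sort by speed. The Python sketch below is an assumption about that rule, not the site's actual code. `Result`, `age_in_hours`, and `fastest` are hypothetical names, and the sample rows are copied from the full results table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    provider: str
    model: str
    age: str      # relative age as shown in the table, e.g. "36m ago"
    speed: float  # tokens per second


# Hours represented by each unit suffix used in the table.
_UNIT_HOURS = {"m": 1 / 60, "h": 1.0, "d": 24.0}


def age_in_hours(age: str) -> float:
    """Convert a string such as "36m ago" or "21d ago" into hours."""
    value = age.split()[0]  # e.g. "36m"
    return int(value[:-1]) * _UNIT_HOURS[value[-1]]


def fastest(results, limit=5, max_age_hours=24.0):
    """Return the `limit` fastest results newer than `max_age_hours`."""
    fresh = [r for r in results if age_in_hours(r.age) < max_age_hours]
    return sorted(fresh, key=lambda r: r.speed, reverse=True)[:limit]


# Sample rows taken from the full results table.
SAMPLE = [
    Result("groq", "llama-3.1-8b", "36m ago", 318.0),
    Result("cerebras", "qwen-3-32b", "21d ago", 260.0),
    Result("groq", "qwen-3-32b", "36m ago", 229.0),
    Result("groq", "llama-4-scout", "36m ago", 208.0),
    Result("groq", "llama-3.3-70b", "36m ago", 202.0),
    Result("groq", "llama-4-maverick", "36m ago", 196.0),
    Result("cerebras", "gpt-oss-120b", "37m ago", 192.0),
]

for rank, r in enumerate(fastest(SAMPLE), start=1):
    print(rank, r.model, r.provider, f"{r.speed:.0f} tok/s")
```

Under this rule, cerebras qwen-3-32b (260 tok/s) drops out because its last result is 21 days old, which matches its absence from the list above.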

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 78 of 78 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

Provider | Model | Status | Updated | Speed (tok/s) | Remaining columns (digits fused in source)
groq | llama-3.1-8b | Active | 36m ago | 318.00 | 8747190.00
cerebras | qwen-3-32b | Active | 21d ago | 260.00 | 32444240.00
groq | qwen-3-32b | Active | 36m ago | 229.00 | 2391300.00
groq | llama-4-scout | Active | 36m ago | 208.00 | 38335220.00
groq | llama-3.3-70b | Active | 36m ago | 202.00 | 68340130.00
groq | llama-4-maverick | Active | 36m ago | 196.00 | 1307610.00
cerebras | gpt-oss-120b | Active | 37m ago | 192.00 | 13380840.00
cerebras | llama-3.1-8b | Active | 37m ago | 192.00 | 13481000.00
cerebras | llama-3.3-70b | Active | 21d ago | 185.00 | 17316400.00
together | llama-3.1-8b | Active | 34m ago | 142.00 | 3232380.00
groq | kimi-k2 | Active | 36m ago | 141.00 | 12215310.00
bedrock | nova-micro | Active | 19m ago | 122.00 | 65152270.00
openai | o3 Mini | Never Succeeded (Medium) | 35m ago | 111.00 | 211600.00
bedrock | llama-4-maverick | Active | 19m ago | 108.00 | 3139260.00
bedrock | nova-lite | Active | 19m ago | 102.00 | 39132300.00
bedrock | llama-4-scout | Active | 19m ago | 101.00 | 4129280.00
bedrock | llama-3.3-70b | Active | 19m ago | 98.40 | 3136290.00
together | qwen-2.5-7b | Active | 34m ago | 94.20 | 11145240.00
bedrock | nova-pro | Active | 19m ago | 90.20 | 31121350.00
openai | GPT-5.1-codex-max | Active | 35m ago | 80.50 | 111181420.00
openai | gpt-3.5-turbo | Active | 34m ago | 78.20 | 13126470.00
google | gemini-2.5-flash-lite | Active | 34m ago | 76.00 | 10132520.00
openai | gpt-4.1-nano | Active | 35m ago | 72.30 | 9149450.00
together | llama-3.1-70b | Active | 11d ago | 72.10 | 7129390.00
together | mistral-7b | Active | 11d ago | 71.00 | 291480.00
deepinfra | mistral-7b | Stale (Medium) | 36m ago | 70.10 | 5148620.00
openai | gpt-4o | Active | 34m ago | 69.70 | 131731310.00
fireworks | mixtral-8x22b | Active | 7d ago | 67.00 | 29112440.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 34m ago | 66.90 | 51011030.00
deepinfra | devstral-small | Never Succeeded (Medium) | 37m ago | 64.60 | 9140580.00
together | mixtral-8x7b | Active | 34m ago | 61.20 | 14114150.00
deepinfra | mixtral-8x22b | Stale (Medium) | 19d ago | 59.70 | 1478420.00
together | llama-3.2-3b | Active | 2d ago | 57.30 | 51211510.00
together | llama-3.3-70b | Active | 34m ago | 54.00 | 11461350.00
fireworks | llama-3.3-70b | Active | 36m ago | 51.80 | 2951380.00
anthropic | claude-haiku-4.5 | Active | 37m ago | 51.70 | 1974550.00
openai | gpt-4.1-mini | Active | 35m ago | 51.40 | 1585450.00
openai | o4 Mini | Never Succeeded (Medium) | 35m ago | 49.80 | 14760.00
deepinfra | llama-3.1-8b | Stale (Medium) | 36m ago | 47.40 | 496570.00
bedrock | llama-3.2-90b | Active | 19m ago | 47.10 | 251350.00
together | deepseek-r1 | Active | 34m ago | 46.80 | 1113930.00
deepinfra | llama-3.2-1b | Stale (Medium) | 36m ago | 46.70 | 1100610.00
deepinfra | llama-3.2-3b | Stale (Medium) | 36m ago | 45.80 | 299520.00
deepinfra | llama-3-8b | Stale (Medium) | 36m ago | 45.80 | 971310.00
openai | gpt-4o-mini | Active | 34m ago | 42.00 | 2095360.00
bedrock | mistral-large | Active | 19m ago | 41.90 | 347390.00
bedrock | claude-haiku-4.5 | Active | 19m ago | 41.50 | 4621090.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 34m ago | 41.20 | 2721680.00
openai | gpt-4.1 | Active | 35m ago | 38.60 | 683500.00
deepinfra | llama-3.2-90b | Stale (Medium) | 36m ago | 35.30 | 382870.00
deepinfra | llama-2-70b | Stale (Medium) | 36m ago | 34.80 | 357580.00
deepinfra | llama-3-70b | Stale (Medium) | 36m ago | 34.30 | 255680.00
bedrock | claude-3-7-sonnet | Active | 19m ago | 33.20 | 244730.00
openai | gpt-4-turbo | Active | 34m ago | 32.60 | 252530.00
bedrock | claude-3-5-sonnet | Active | 19m ago | 32.40 | 244660.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 36m ago | 32.40 | 150760.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 37m ago | 31.40 | 1822710.00
bedrock | claude-3-5-haiku | Active | 19m ago | 30.90 | 438680.00
openai | GPT-5.1 | Active | 35m ago | 30.10 | 357950.00
openai | gpt-4 | Active | 34m ago | 27.90 | 347670.00
openai | GPT-5.2 | Active | 35m ago | 26.90 | 940910.00
openai | GPT-5.1-codex | Active | 35m ago | 25.80 | 1481290.00
openai | GPT-5.1-codex-mini | Active | 35m ago | 25.30 | 1551170.00
deepinfra | llama-3.1-405b | Stale (Medium) | 36m ago | 25.20 | 139970.00
bedrock | claude-sonnet-4.5 | Active | 19m ago | 22.30 | 2291620.00
deepinfra | llama-3.1-70b | Stale (Medium) | 36m ago | 21.70 | 1461120.00
anthropic | claude-4-sonnet | Active | 37m ago | 20.40 | 1311930.00
anthropic | claude-opus-4.5 | Active | 37m ago | 20.10 | 2331840.00
bedrock | claude-3-opus | Active | 18d ago | 19.40 | 622870.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 37m ago | 19.10 | 1462510.00
bedrock | claude-opus-4.5 | Active | 19m ago | 18.40 | 1242050.00
anthropic | Claude Opus 4.1 | Active | 37m ago | 17.90 | 8271490.00
anthropic | claude-4-opus | Active | 37m ago | 17.60 | 5241300.00
openai | gpt-5.2-codex | Active | 6h ago | 12.90 | 1271830.00
deepinfra | llama-3.2-11b | Stale (Medium) | 36m ago | 12.80 | 1622660.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 37m ago | 10.10 | 1405950.00
openai | o1-pro | Likely Deprecated (Medium) | 35m ago | 9.47 | 118640.00
openai | GPT-5.2-pro | Active | 36m ago | 8.53 | 4144990.00

📈 Time Series 📈