
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, which can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
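As a rough illustration of what one such measurement could look like (a minimal sketch, not the actual benchmark code), the snippet below times a streamed completion and reports tokens per second, then summarizes repeated runs so the spread is visible. The names `measure_tokens_per_second`, `summarize`, and the `stream_tokens` callable are hypothetical; a real job would wrap a provider's streaming API in `stream_tokens`.

```python
import statistics
import time
from typing import Callable, Iterable


def measure_tokens_per_second(
    stream_tokens: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one streamed completion and return tokens per second.

    `stream_tokens` is a hypothetical callable that yields tokens as the
    provider streams them; `clock` is injectable so the timing can be tested.
    """
    start = clock()
    n_tokens = sum(1 for _ in stream_tokens())  # consume the whole stream
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed


def summarize(speeds: list[float]) -> dict[str, float]:
    """Summarize repeated runs: the spread matters, not just the average."""
    return {
        "mean": statistics.mean(speeds),
        "median": statistics.median(speeds),
        "min": min(speeds),
        "max": max(speeds),
    }
```

Collecting many such samples per provider and model over time is what makes a distribution chart possible, since a single run says little about behavior under varying load.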

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking anywhere that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.
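One way a "latest 7-day slice" ranking could be computed is sketched below. This is an assumption about the approach, not the site's actual logic: it keeps only runs inside the window, takes the median speed per provider/model pair, and returns the fastest few. The function name `fastest_in_window` and the tuple layout of `runs` are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median


def fastest_in_window(runs, now, days=7, top_n=5):
    """Rank provider/model pairs by median tokens/sec over a recent window.

    `runs` is an iterable of (provider, model, timestamp, tokens_per_second)
    tuples; this layout is an assumption made for the example.
    """
    cutoff = now - timedelta(days=days)
    by_key = {}
    for provider, model, ts, tps in runs:
        if ts >= cutoff:  # ignore runs older than the window
            by_key.setdefault((provider, model), []).append(tps)
    # Sort by median speed, fastest first
    ranked = sorted(((median(v), k) for k, v in by_key.items()), reverse=True)
    return [(provider, model, speed) for speed, (provider, model) in ranked[:top_n]]
```

A windowed ranking like this would leave out a model whose only fast result is weeks old, which is one reason a "right now" list can differ from the top of an all-time table.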


Fastest Models Right Now (updated <24h)

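| # | Model | Provider | Speed |
|---|-------|----------|-------|
| 1 | llama-3.1-8b | groq | 310 tok/s |
| 2 | qwen-3-32b | groq | 217 tok/s |
| 3 | llama-3.3-70b | groq | 195 tok/s |
| 4 | llama-4-scout | groq | 193 tok/s |
| 5 | llama-3.1-8b | cerebras | 171 tok/s |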

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 80 of 80 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
| Provider | Model | Status | Last run | Speed (tok/s) | Other metrics (fused in source) |
|----------|-------|--------|----------|---------------|---------------------------------|
| cerebras | qwen-3-32b | Active | 29d ago | 366.00 | 366366140.00 |
| groq | llama-3.1-8b | Active | 56m ago | 310.00 | 87471100.00 |
| groq | qwen-3-32b | Active | 56m ago | 217.00 | 2374310.00 |
| groq | llama-4-maverick | Active | 8d ago | 203.00 | 1307680.00 |
| groq | llama-3.3-70b | Active | 56m ago | 195.00 | 68340140.00 |
| groq | llama-4-scout | Active | 56m ago | 193.00 | 38335250.00 |
| cerebras | gpt-oss-120b | Active | 5d ago | 184.00 | 13801170.00 |
| cerebras | llama-3.1-8b | Active | 57m ago | 171.00 | 13531320.00 |
| together | llama-3.1-8b | Active | 3d ago | 141.00 | 3228350.00 |
| groq | kimi-k2 | Active | 56m ago | 138.00 | 12215320.00 |
| bedrock | nova-micro | Active | 35m ago | 121.00 | 65152270.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 55m ago | 109.00 | 81640.00 |
| bedrock | llama-4-maverick | Active | 35m ago | 108.00 | 3139270.00 |
| bedrock | llama-4-scout | Active | 35m ago | 101.00 | 6130280.00 |
| bedrock | nova-lite | Active | 35m ago | 100.00 | 22132300.00 |
| cerebras | llama-3.3-70b | Active | 29d ago | 97.80 | 9798620.00 |
| bedrock | llama-3.3-70b | Active | 35m ago | 96.50 | 3136300.00 |
| together | qwen-2.5-7b | Active | 54m ago | 92.60 | 1145500.00 |
| bedrock | nova-pro | Active | 35m ago | 86.10 | 19121370.00 |
| openai | GPT-5.1-codex-max | Active | 56m ago | 81.70 | 111181220.00 |
| deepinfra | mistral-7b | Stale (Medium) | 57m ago | 79.00 | 5148610.00 |
| openai | gpt-3.5-turbo | Active | 54m ago | 75.40 | 13126510.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 57m ago | 74.30 | 9140580.00 |
| together | llama-3.1-70b | Active | 20d ago | 73.80 | 15129340.00 |
| google | gemini-2.5-flash-lite | Active | 54m ago | 72.10 | 10117550.00 |
| openai | gpt-4.1-nano | Active | 55m ago | 70.30 | 9149480.00 |
| together | mistral-7b | Active | 20d ago | 70.20 | 691380.00 |
| fireworks | mixtral-8x22b | Active | 56m ago | 69.90 | 29111400.00 |
| openai | gpt-4o | Active | 54m ago | 68.40 | 141731320.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 54m ago | 66.00 | 5105990.00 |
| together | mixtral-8x7b | Active | 54m ago | 60.80 | 14114170.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 27d ago | 55.00 | 1466580.00 |
| together | llama-3.2-3b | Active | 11d ago | 54.90 | 51211510.00 |
| fireworks | llama-3.3-70b | Active | 56m ago | 54.50 | 11081670.00 |
| together | deepseek-r1 | Active | 54m ago | 53.90 | 1113740.00 |
| together | llama-3.3-70b | Active | 54m ago | 51.70 | 11461240.00 |
| openai | gpt-4.1-mini | Active | 55m ago | 51.50 | 15109440.00 |
| anthropic | claude-haiku-4.5 | Active | 57m ago | 51.40 | 1973550.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 55m ago | 49.00 | 4760.00 |
| bedrock | llama-3.2-90b | Active | 35m ago | 46.70 | 251370.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 56m ago | 45.00 | 1869320.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 56m ago | 44.30 | 385690.00 |
| openai | gpt-4.1 | Active | 55m ago | 40.80 | 1083510.00 |
| bedrock | mistral-large | Active | 35m ago | 40.70 | 247560.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 54m ago | 40.50 | 2721700.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 56m ago | 40.30 | 1100860.00 |
| openai | gpt-4o-mini | Active | 54m ago | 39.70 | 764390.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 56m ago | 39.40 | 299830.00 |
| bedrock | claude-haiku-4.5 | Active | 36m ago | 39.30 | 3621200.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 57m ago | 34.80 | 382760.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 56m ago | 34.60 | 357600.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 56m ago | 33.90 | 255650.00 |
| bedrock | claude-3-5-sonnet | Active | 36m ago | 32.70 | 246650.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 57m ago | 32.50 | 150800.00 |
| bedrock | claude-3-7-sonnet | Active | 36m ago | 32.20 | 242760.00 |
| openai | gpt-4-turbo | Active | 55m ago | 32.20 | 752520.00 |
| bedrock | claude-3-5-haiku | Active | 36m ago | 31.70 | 938650.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 57m ago | 31.40 | 1823530.00 |
| openai | GPT-5.1 | Active | 55m ago | 29.50 | 2571100.00 |
| openai | gpt-4 | Active | 54m ago | 27.30 | 847630.00 |
| openai | GPT-5.4 | Active | 56m ago | 27.20 | 15361050.00 |
| openai | GPT-5.2 | Active | 56m ago | 27.10 | 440970.00 |
| openai | GPT-5.1-codex | Active | 55m ago | 26.40 | 1481240.00 |
| openai | GPT-5.1-codex-mini | Active | 55m ago | 25.30 | 1521190.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 56m ago | 25.10 | 139790.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 56m ago | 23.00 | 1441100.00 |
| bedrock | claude-sonnet-4.5 | Active | 35m ago | 21.90 | 1281700.00 |
| openai | GPT-5.3-codex | Active | 56m ago | 21.30 | 7321230.00 |
| anthropic | claude-opus-4.5 | Active | 57m ago | 20.50 | 2331810.00 |
| bedrock | claude-3-opus | Active | 26d ago | 19.60 | 822860.00 |
| anthropic | claude-4-sonnet | Active | 57m ago | 19.60 | 6311870.00 |
| bedrock | claude-opus-4.5 | Active | 35m ago | 19.10 | 1271990.00 |
| anthropic | Claude Opus 4.1 | Active | 57m ago | 17.60 | 7271550.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 57m ago | 17.60 | 1462590.00 |
| anthropic | claude-4-opus | Active | 57m ago | 17.40 | 5221320.00 |
| openai | gpt-5.2-codex | Active | 56m ago | 13.60 | 1271720.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 55m ago | 9.57 | 118670.00 |
| openai | GPT-5.2-pro | Active | 3h ago | 8.61 | 4144770.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 57m ago | 8.38 | 1536190.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 56m ago | 8.14 | 1612650.00 |
Lifecycle snapshot

📈 Time Series 📈