☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of measured speeds, since they can vary somewhat depending on load. For readability, not all models are shown in the chart, but the full results are in the table below.
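
As a rough illustration of what one such scheduled test might measure, here is a minimal Python sketch. It is not the actual benchmark code: the `generate` callable (assumed to send a prompt to a provider and return the number of output tokens), `measure_tokens_per_second`, and `summarize` are all hypothetical names introduced for this example.

```python
import time
from statistics import median
from typing import Callable, Dict, List


def measure_tokens_per_second(
    generate: Callable[[str], int],
    prompt: str,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one generation call and return output tokens per second.

    `generate` is a hypothetical callable that sends `prompt` to a provider
    and returns how many output tokens were produced.
    """
    start = clock()
    tokens = generate(prompt)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return tokens / elapsed


def summarize(speeds: List[float]) -> Dict[str, float]:
    """Summarize repeated runs so load-dependent variation is visible."""
    if not speeds:
        raise ValueError("need at least one measurement")
    return {"min": min(speeds), "median": median(speeds), "max": max(speeds)}
```

Collecting many such measurements over time, rather than a single run, is what makes a distribution chart meaningful, because any one run may land on a busy or quiet moment for the provider.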

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any provider that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

| # | Model | Provider | Speed |
|---|---|---|---|
| 1 | llama-3.1-8b | groq | 320 tok/s |
| 2 | qwen-3-32b | groq | 233 tok/s |
| 3 | llama-4-scout | groq | 210 tok/s |
| 4 | llama-3.3-70b | groq | 202 tok/s |
| 5 | llama-3.1-8b | cerebras | 196 tok/s |
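
A ranking like this could be derived from raw benchmark runs roughly as follows. This is only a sketch under stated assumptions: the `Run` record, the `fastest_models` function, and the choice of the median as the summary statistic are hypothetical and not taken from the site's implementation.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median
from typing import Dict, List, NamedTuple, Tuple


class Run(NamedTuple):
    """One benchmark result (hypothetical record layout)."""
    provider: str
    model: str
    finished_at: datetime
    tokens_per_second: float


def fastest_models(
    runs: List[Run],
    now: datetime,
    window: timedelta = timedelta(days=7),
    top_n: int = 5,
) -> List[Tuple[str, str, float]]:
    """Rank (provider, model) pairs by median speed within a recent window."""
    cutoff = now - window
    grouped: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for run in runs:
        # Ignore runs outside the window so old results do not dominate.
        if cutoff <= run.finished_at <= now:
            grouped[(run.provider, run.model)].append(run.tokens_per_second)
    ranked = sorted(
        ((provider, model, median(speeds))
         for (provider, model), speeds in grouped.items()),
        key=lambda row: row[2],
        reverse=True,
    )
    return ranked[:top_n]
```

Passing `window=timedelta(hours=24)` would restrict the ranking to recently updated models, which is one plausible way to keep results that have not been refreshed for days out of a "right now" list.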

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 80 of 80 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
| Provider | Model | Status | Updated | Speed (tok/s) | Other metrics (unseparated) |
|---|---|---|---|---|---|
| groq | llama-3.1-8b | Active | 2h ago | 320.00 | 8747190.00 |
| cerebras | qwen-3-32b | Active | 19d ago | 250.00 | 4444390.00 |
| groq | qwen-3-32b | Active | 2h ago | 233.00 | 2391300.00 |
| groq | llama-4-scout | Active | 2h ago | 210.00 | 38335210.00 |
| groq | llama-3.3-70b | Active | 2h ago | 202.00 | 68322130.00 |
| cerebras | llama-3.1-8b | Active | 3h ago | 196.00 | 1348960.00 |
| cerebras | gpt-oss-120b | Active | 3h ago | 194.00 | 13380800.00 |
| groq | llama-4-maverick | Active | 2h ago | 189.00 | 1307640.00 |
| cerebras | llama-3.3-70b | Active | 19d ago | 187.00 | 17316390.00 |
| together | llama-3.1-8b | Active | 2h ago | 142.00 | 3232390.00 |
| groq | kimi-k2 | Active | 2h ago | 141.00 | 12215320.00 |
| bedrock | nova-micro | Active | 23m ago | 122.00 | 65150270.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 2h ago | 111.00 | 211600.00 |
| bedrock | llama-4-maverick | Active | 23m ago | 108.00 | 3139260.00 |
| bedrock | nova-lite | Active | 23m ago | 101.00 | 39132300.00 |
| bedrock | llama-4-scout | Active | 23m ago | 101.00 | 4129270.00 |
| bedrock | llama-3.3-70b | Active | 23m ago | 98.90 | 3137290.00 |
| together | qwen-2.5-7b | Active | 2h ago | 94.20 | 11145240.00 |
| bedrock | nova-pro | Active | 23m ago | 90.60 | 25121350.00 |
| openai | GPT-5.1-codex-max | Active | 2h ago | 78.80 | 111181440.00 |
| openai | gpt-3.5-turbo | Active | 2h ago | 78.50 | 13126470.00 |
| google | gemini-2.5-flash-lite | Active | 2h ago | 75.50 | 10132530.00 |
| together | llama-3.1-70b | Active | 9d ago | 72.60 | 7129380.00 |
| openai | gpt-4.1-nano | Active | 2h ago | 72.00 | 9149450.00 |
| together | mistral-7b | Active | 9d ago | 70.40 | 291530.00 |
| openai | gpt-4o | Active | 2h ago | 69.60 | 131731320.00 |
| fireworks | mixtral-8x22b | Active | 5d ago | 67.60 | 29112430.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 2h ago | 66.70 | 51011030.00 |
| deepinfra | mistral-7b | Stale (Medium) | 2h ago | 66.00 | 5139620.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 3h ago | 60.40 | 9138590.00 |
| together | mixtral-8x7b | Active | 2h ago | 59.70 | 24114150.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 17d ago | 58.80 | 1478390.00 |
| together | llama-3.2-3b | Active | 20h ago | 57.00 | 51211500.00 |
| together | qwen-2.5-72b | Active | 28d ago | 55.80 | 4371310.00 |
| together | llama-3.3-70b | Active | 2h ago | 53.80 | 11461420.00 |
| anthropic | claude-haiku-4.5 | Active | 3h ago | 51.70 | 1974560.00 |
| fireworks | llama-3.3-70b | Active | 2h ago | 51.10 | 4951260.00 |
| openai | gpt-4.1-mini | Active | 2h ago | 51.00 | 1585460.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 2h ago | 49.50 | 14760.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 2h ago | 49.20 | 496540.00 |
| bedrock | llama-3.2-90b | Active | 23m ago | 47.10 | 651350.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 2h ago | 45.50 | 971310.00 |
| together | deepseek-r1 | Active | 2h ago | 45.30 | 11131000.00 |
| openai | gpt-4o-mini | Active | 2h ago | 42.80 | 2095360.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 2h ago | 42.50 | 1100620.00 |
| bedrock | mistral-large | Active | 23m ago | 42.00 | 347370.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 2h ago | 41.80 | 198790.00 |
| bedrock | claude-haiku-4.5 | Active | 23m ago | 41.60 | 4621070.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 2h ago | 40.90 | 2721700.00 |
| openai | gpt-4.1 | Active | 2h ago | 38.10 | 682500.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 2h ago | 35.20 | 357560.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 2h ago | 34.80 | 255670.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 6h ago | 34.40 | 376880.00 |
| bedrock | claude-3-7-sonnet | Active | 23m ago | 33.10 | 544740.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 2h ago | 32.30 | 250680.00 |
| openai | gpt-4-turbo | Active | 2h ago | 32.30 | 252530.00 |
| bedrock | claude-3-5-sonnet | Active | 24m ago | 32.30 | 244650.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 3h ago | 31.30 | 1752690.00 |
| bedrock | claude-3-5-haiku | Active | 24m ago | 30.80 | 438680.00 |
| openai | GPT-5.1 | Active | 2h ago | 30.10 | 357940.00 |
| openai | gpt-4 | Active | 2h ago | 27.70 | 347660.00 |
| openai | GPT-5.2 | Active | 2h ago | 26.70 | 940920.00 |
| openai | GPT-5.1-codex | Active | 2h ago | 25.90 | 3481290.00 |
| openai | GPT-5.1-codex-mini | Active | 2h ago | 25.80 | 1551160.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 2h ago | 24.60 | 136990.00 |
| together | llama-3.1-405b | Active | 29d ago | 22.70 | 1328690.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 2h ago | 22.40 | 1461010.00 |
| bedrock | claude-sonnet-4.5 | Active | 23m ago | 22.30 | 5291610.00 |
| anthropic | claude-4-sonnet | Active | 3h ago | 20.30 | 1311970.00 |
| anthropic | claude-opus-4.5 | Active | 3h ago | 20.10 | 2331840.00 |
| bedrock | claude-3-opus | Active | 16d ago | 19.30 | 422860.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 6h ago | 19.00 | 1462500.00 |
| bedrock | claude-opus-4.5 | Active | 23m ago | 18.40 | 4241980.00 |
| anthropic | Claude Opus 4.1 | Active | 3h ago | 18.00 | 8271480.00 |
| anthropic | claude-4-opus | Active | 3h ago | 17.70 | 5241290.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 2h ago | 13.80 | 1622620.00 |
| openai | gpt-5.2-codex | Active | 6h ago | 12.90 | 1271850.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 2h ago | 10.90 | 1405850.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 2h ago | 9.29 | 118650.00 |
| openai | GPT-5.2-pro | Active | 2h ago | 8.59 | 4144840.00 |

📈 Time Series 📈