
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically test the token generation speed of different cloud LLM providers. The chart visualizes the distribution of measured speeds, which can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
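As a rough illustration of what one benchmark run might measure, here is a minimal sketch. It assumes the provider response can be consumed as an iterable of tokens; `tokens_per_second` and `summarize` are hypothetical names, not the site's actual code, and the timing here includes the wait for the first token.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def tokens_per_second(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume a token stream and return tokens generated per second.

    `stream` stands in for a provider's streaming response (an assumption).
    """
    start = clock()
    count = 0
    for _ in stream:
        count += 1
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed


def summarize(speeds: List[float]) -> Dict[str, float]:
    """Summarize repeated runs so load-dependent variation is visible."""
    ordered = sorted(speeds)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }
```

Collecting one speed per scheduled run and passing the list to `summarize` gives the kind of spread a distribution chart would show.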

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, and I include any that do not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

| # | Model | Provider | Speed |
|---|---|---|---|
| 1 | llama-3.1-8b | groq | 319 tok/s |
| 2 | qwen-3-32b | groq | 232 tok/s |
| 3 | llama-4-scout | groq | 211 tok/s |
| 4 | llama-3.3-70b | groq | 203 tok/s |
| 5 | llama-3.1-8b | cerebras | 194 tok/s |
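One way to read the "updated <24h" qualifier is as a freshness filter applied before ranking. The sketch below assumes that interpretation; the `Result` type, the `fastest_fresh` function, and the reference time are hypothetical and chosen only for illustration. The three sample rows use speeds and ages from the full results table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Result:
    provider: str
    model: str
    tokens_per_second: float
    updated: datetime


def fastest_fresh(
    results: List[Result],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),
    top_n: int = 5,
) -> List[Result]:
    """Keep results updated within `max_age`, then rank by speed (descending)."""
    fresh = [r for r in results if now - r.updated < max_age]
    return sorted(fresh, key=lambda r: r.tokens_per_second, reverse=True)[:top_n]


# Arbitrary reference time, used only to make the example reproducible.
now = datetime(2025, 1, 1, 12, 0)
results = [
    Result("groq", "llama-3.1-8b", 319.0, now - timedelta(minutes=58)),
    Result("cerebras", "qwen-3-32b", 256.0, now - timedelta(days=19)),
    Result("groq", "qwen-3-32b", 232.0, now - timedelta(minutes=58)),
]
top = fastest_fresh(results, now, top_n=2)
```

Under this assumption, the cerebras qwen-3-32b result is dropped despite its higher speed because it is 19 days old, which would be consistent with its absence from the list above.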

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 80 of 80 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
| Provider | Model | Status | Updated | Speed (tok/s) | |
|---|---|---|---|---|---|
| groq | llama-3.1-8b | Active | 58m ago | 319.00 | 8747190.00 |
| cerebras | qwen-3-32b | Active | 19d ago | 256.00 | 4444400.00 |
| groq | qwen-3-32b | Active | 58m ago | 232.00 | 2391300.00 |
| groq | llama-4-scout | Active | 58m ago | 211.00 | 38335210.00 |
| groq | llama-3.3-70b | Active | 58m ago | 203.00 | 68340130.00 |
| cerebras | llama-3.1-8b | Active | 1h ago | 194.00 | 1348970.00 |
| groq | llama-4-maverick | Active | 58m ago | 194.00 | 1307620.00 |
| cerebras | gpt-oss-120b | Active | 1h ago | 193.00 | 13380800.00 |
| cerebras | llama-3.3-70b | Active | 19d ago | 182.00 | 17316400.00 |
| together | llama-3.1-8b | Active | 57m ago | 142.00 | 3232380.00 |
| groq | kimi-k2 | Active | 58m ago | 141.00 | 12215310.00 |
| bedrock | nova-micro | Active | 8m ago | 122.00 | 65150270.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 58m ago | 111.00 | 211600.00 |
| bedrock | llama-4-maverick | Active | 8m ago | 108.00 | 3139260.00 |
| bedrock | nova-lite | Active | 8m ago | 102.00 | 39132300.00 |
| bedrock | llama-4-scout | Active | 8m ago | 101.00 | 4129270.00 |
| bedrock | llama-3.3-70b | Active | 8m ago | 98.90 | 3137290.00 |
| together | qwen-2.5-7b | Active | 57m ago | 94.20 | 11145240.00 |
| bedrock | nova-pro | Active | 8m ago | 90.50 | 31121350.00 |
| openai | GPT-5.1-codex-max | Active | 58m ago | 80.10 | 111181390.00 |
| openai | gpt-3.5-turbo | Active | 57m ago | 78.70 | 13126470.00 |
| google | gemini-2.5-flash-lite | Active | 57m ago | 75.90 | 10132530.00 |
| together | llama-3.1-70b | Active | 10d ago | 72.60 | 7129390.00 |
| openai | gpt-4.1-nano | Active | 58m ago | 72.40 | 9149450.00 |
| together | mistral-7b | Active | 10d ago | 70.70 | 291540.00 |
| openai | gpt-4o | Active | 57m ago | 69.90 | 131731310.00 |
| deepinfra | mistral-7b | Stale (Medium) | 59m ago | 68.00 | 5145620.00 |
| fireworks | mixtral-8x22b | Active | 6d ago | 67.30 | 29112430.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 57m ago | 67.00 | 51011030.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 59m ago | 62.50 | 9140580.00 |
| together | mixtral-8x7b | Active | 57m ago | 60.80 | 24114150.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 17d ago | 59.40 | 1478400.00 |
| together | llama-3.2-3b | Active | 1d ago | 57.90 | 51211480.00 |
| together | qwen-2.5-72b | Active | 29d ago | 55.70 | 4963340.00 |
| together | llama-3.3-70b | Active | 57m ago | 53.90 | 11461350.00 |
| anthropic | claude-haiku-4.5 | Active | 1h ago | 51.80 | 1974550.00 |
| openai | gpt-4.1-mini | Active | 58m ago | 51.60 | 1585460.00 |
| fireworks | llama-3.3-70b | Active | 58m ago | 51.20 | 2951390.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 58m ago | 49.80 | 14760.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 59m ago | 48.20 | 496560.00 |
| bedrock | llama-3.2-90b | Active | 8m ago | 47.10 | 651350.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 58m ago | 45.70 | 971310.00 |
| together | deepseek-r1 | Active | 57m ago | 45.50 | 11131000.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 59m ago | 44.30 | 1100610.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 59m ago | 43.50 | 298530.00 |
| openai | gpt-4o-mini | Active | 57m ago | 42.90 | 2095360.00 |
| bedrock | mistral-large | Active | 8m ago | 42.00 | 347380.00 |
| bedrock | claude-haiku-4.5 | Active | 8m ago | 41.90 | 4621060.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 57m ago | 41.10 | 2721690.00 |
| openai | gpt-4.1 | Active | 58m ago | 38.60 | 682500.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 59m ago | 34.80 | 376870.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 58m ago | 34.80 | 357570.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 58m ago | 34.30 | 255680.00 |
| bedrock | claude-3-7-sonnet | Active | 8m ago | 33.20 | 544730.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 59m ago | 32.60 | 250650.00 |
| openai | gpt-4-turbo | Active | 57m ago | 32.50 | 252530.00 |
| bedrock | claude-3-5-sonnet | Active | 8m ago | 32.30 | 244650.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 1h ago | 31.50 | 1752580.00 |
| bedrock | claude-3-5-haiku | Active | 8m ago | 30.90 | 438680.00 |
| openai | GPT-5.1 | Active | 58m ago | 30.20 | 357940.00 |
| openai | gpt-4 | Active | 57m ago | 27.90 | 347660.00 |
| openai | GPT-5.2 | Active | 58m ago | 26.80 | 940910.00 |
| openai | GPT-5.1-codex | Active | 58m ago | 25.90 | 3481290.00 |
| openai | GPT-5.1-codex-mini | Active | 58m ago | 25.80 | 1551160.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 59m ago | 24.80 | 136990.00 |
| together | llama-3.1-405b | Active | 29d ago | 24.50 | 2425450.00 |
| bedrock | claude-sonnet-4.5 | Active | 8m ago | 22.40 | 2291610.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 59m ago | 21.80 | 1461110.00 |
| anthropic | claude-4-sonnet | Active | 1h ago | 20.30 | 1311950.00 |
| anthropic | claude-opus-4.5 | Active | 1h ago | 20.10 | 2331840.00 |
| bedrock | claude-3-opus | Active | 16d ago | 19.30 | 422860.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 59m ago | 18.90 | 1462530.00 |
| bedrock | claude-opus-4.5 | Active | 8m ago | 18.50 | 4241970.00 |
| anthropic | Claude Opus 4.1 | Active | 1h ago | 18.00 | 8271480.00 |
| anthropic | claude-4-opus | Active | 1h ago | 17.70 | 5241290.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 59m ago | 13.30 | 1622680.00 |
| openai | gpt-5.2-codex | Active | 58m ago | 12.90 | 1271830.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 59m ago | 10.70 | 1405860.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 58m ago | 9.46 | 118650.00 |
| openai | GPT-5.2-pro | Active | 58m ago | 8.59 | 4144830.00 |
Lifecycle snapshot

📈 Time Series 📈