
☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart helps visualize the distribution of speeds for each model, which can vary somewhat with provider load. For readability, not all models are shown in the chart, but you can see the full results in the table below.
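As a rough illustration of what a single benchmark run measures, here is a minimal sketch in Python. It is not the site's actual benchmark code: `measure_tokens_per_second` and `fake_stream` are hypothetical names, and the stream stands in for whatever streaming response a real provider client would return.

```python
import time
from typing import Callable, Iterable


def measure_tokens_per_second(
    stream_tokens: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume a token stream and return tokens generated per second."""
    start = clock()
    token_count = sum(1 for _ in stream_tokens())
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return token_count / elapsed


def fake_stream() -> Iterable[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    for token in ["Hello", ",", " world", "!"]:
        time.sleep(0.01)  # simulate per-token latency
        yield token


if __name__ == "__main__":
    print(f"{measure_tokens_per_second(fake_stream):.1f} tok/s")
```

A scheduled job could call a function like this once per provider and model and store each result, which is what makes a distribution of speeds available rather than a single number.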

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking anywhere that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub!! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

#  Model             Provider  Speed
1  llama-3.1-8b      groq      319 tok/s
2  qwen-3-32b        groq      231 tok/s
3  llama-4-scout     groq      211 tok/s
4  llama-3.3-70b     groq      202 tok/s
5  llama-4-maverick  groq      196 tok/s
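One plausible way to derive a "fastest right now" list from the full results is to keep only models whose last run is under 24 hours old and sort by speed. The sketch below assumes that rule; the site's real selection logic is not shown here, and `Result`, `age_in_hours`, and `fastest_recent` are hypothetical names. The sample rows are copied from the full results table.

```python
from typing import NamedTuple


class Result(NamedTuple):
    provider: str
    model: str
    last_run: str  # relative age as shown in the table, e.g. "3h ago"
    speed: float   # tokens per second


# Hours represented by each unit suffix used in the "last run" column.
HOURS_PER_UNIT = {"m": 1 / 60, "h": 1.0, "d": 24.0}


def age_in_hours(last_run: str) -> float:
    """Convert strings like "25m ago", "3h ago", or "20d ago" to hours."""
    amount = last_run.split()[0]  # "3h ago" -> "3h"
    return int(amount[:-1]) * HOURS_PER_UNIT[amount[-1]]


def fastest_recent(results, max_age_hours=24.0, limit=5):
    """Return the fastest results whose last run is newer than max_age_hours."""
    fresh = [r for r in results if age_in_hours(r.last_run) < max_age_hours]
    return sorted(fresh, key=lambda r: r.speed, reverse=True)[:limit]


RESULTS = [
    Result("groq", "llama-3.1-8b", "3h ago", 319.0),
    Result("cerebras", "qwen-3-32b", "20d ago", 261.0),
    Result("groq", "qwen-3-32b", "3h ago", 231.0),
    Result("groq", "llama-4-scout", "3h ago", 211.0),
    Result("groq", "llama-3.3-70b", "3h ago", 202.0),
    Result("groq", "llama-4-maverick", "3h ago", 196.0),
    Result("cerebras", "gpt-oss-120b", "3h ago", 194.0),
]

for rank, r in enumerate(fastest_recent(RESULTS), start=1):
    print(f"{rank}  {r.model}  {r.provider}  {r.speed:.0f} tok/s")
```

Under this assumed rule, cerebras qwen-3-32b (261 tok/s) drops out because its last run was 20 days ago, which is consistent with the five groq rows listed above.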

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 78 of 78 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
Provider   Model                  Status                      Last run  Speed (tok/s)  Other metrics (unsplit)
groq       llama-3.1-8b           Active                      3h ago    319.00         8747190.00
cerebras   qwen-3-32b             Active                      20d ago   261.00         32444230.00
groq       qwen-3-32b             Active                      3h ago    231.00         2391300.00
groq       llama-4-scout          Active                      3h ago    211.00         38335210.00
groq       llama-3.3-70b          Active                      3h ago    202.00         68340130.00
groq       llama-4-maverick       Active                      3h ago    196.00         1307610.00
cerebras   gpt-oss-120b           Active                      3h ago    194.00         13380800.00
cerebras   llama-3.1-8b           Active                      3h ago    192.00         1348980.00
cerebras   llama-3.3-70b          Active                      20d ago   184.00         17316400.00
together   llama-3.1-8b           Active                      5h ago    143.00         3232380.00
groq       kimi-k2                Active                      3h ago    142.00         12215300.00
bedrock    nova-micro             Active                      25m ago   122.00         65152270.00
openai     o3 Mini                Never Succeeded (Medium)    3h ago    112.00         211600.00
bedrock    llama-4-maverick       Active                      25m ago   108.00         3139260.00
bedrock    nova-lite              Active                      25m ago   102.00         39132300.00
bedrock    llama-4-scout          Active                      25m ago   101.00         4129280.00
bedrock    llama-3.3-70b          Active                      25m ago   98.70          3137290.00
together   qwen-2.5-7b            Active                      2h ago    94.10          11145240.00
bedrock    nova-pro               Active                      25m ago   90.40          31121350.00
openai     GPT-5.1-codex-max      Active                      3h ago    80.80          111181430.00
openai     gpt-3.5-turbo          Active                      2h ago    78.40          13126470.00
google     gemini-2.5-flash-lite  Active                      2h ago    76.00          10132520.00
openai     gpt-4.1-nano           Active                      3h ago    72.60          9149450.00
together   llama-3.1-70b          Active                      11d ago   72.30          7129390.00
together   mistral-7b             Active                      11d ago   71.20          291470.00
openai     gpt-4o                 Active                      2h ago    69.90          131731310.00
deepinfra  mistral-7b             Stale (Medium)              3h ago    69.00          5145620.00
fireworks  mixtral-8x22b          Active                      7d ago    67.30          29112430.00
google     gemini-2.5-flash       Never Succeeded (Medium)    2h ago    67.00          51011030.00
deepinfra  devstral-small         Never Succeeded (Medium)    3h ago    63.70          9140570.00
together   mixtral-8x7b           Active                      2h ago    60.80          14114150.00
deepinfra  mixtral-8x22b          Stale (Medium)              18d ago   59.80          1478410.00
together   llama-3.2-3b           Active                      2d ago    57.30          51211500.00
together   llama-3.3-70b          Active                      2h ago    54.20          11461340.00
anthropic  claude-haiku-4.5       Active                      3h ago    52.00          1974550.00
fireworks  llama-3.3-70b          Active                      3h ago    51.60          2951380.00
openai     gpt-4.1-mini           Active                      3h ago    51.50          1585450.00
openai     o4 Mini                Never Succeeded (Medium)    3h ago    49.80          14760.00
deepinfra  llama-3.1-8b           Stale (Medium)              3h ago    47.80          496550.00
bedrock    llama-3.2-90b          Active                      25m ago   47.10          651350.00
deepinfra  llama-3-8b             Stale (Medium)              3h ago    45.80          971310.00
together   deepseek-r1            Active                      2h ago    45.70          11131000.00
deepinfra  llama-3.2-1b           Stale (Medium)              3h ago    45.60          1100610.00
deepinfra  llama-3.2-3b           Stale (Medium)              3h ago    44.80          298530.00
openai     gpt-4o-mini            Active                      2h ago    42.70          2095360.00
bedrock    mistral-large          Active                      25m ago   41.90          347380.00
bedrock    claude-haiku-4.5       Active                      25m ago   41.80          4621060.00
google     gemini-2.5-pro         Never Succeeded (Medium)    2h ago    41.30          2721680.00
openai     gpt-4.1                Active                      3h ago    38.60          682500.00
deepinfra  llama-3.2-90b          Stale (Medium)              6h ago    35.20          382870.00
deepinfra  llama-2-70b            Stale (Medium)              3h ago    34.80          357580.00
deepinfra  llama-3-70b            Stale (Medium)              3h ago    34.10          255680.00
bedrock    claude-3-7-sonnet      Active                      25m ago   33.20          244730.00
openai     gpt-4-turbo            Active                      2h ago    32.60          252520.00
deepinfra  qwen-2.5-72b           Stale (Medium)              3h ago    32.50          150750.00
bedrock    claude-3-5-sonnet      Active                      25m ago   32.40          244650.00
deepinfra  Qwen 2.5 Coder 32B     Never Succeeded (Medium)    3h ago    31.10          1752640.00
bedrock    claude-3-5-haiku       Active                      25m ago   30.90          438680.00
openai     GPT-5.1                Active                      3h ago    30.30          357940.00
openai     gpt-4                  Active                      2h ago    27.90          347660.00
openai     GPT-5.2                Active                      3h ago    27.00          940910.00
openai     GPT-5.1-codex          Active                      3h ago    26.00          3481280.00
openai     GPT-5.1-codex-mini     Active                      3h ago    25.50          1551160.00
deepinfra  llama-3.1-405b         Stale (Medium)              3h ago    25.10          139980.00
bedrock    claude-sonnet-4.5      Active                      25m ago   22.40          2291610.00
deepinfra  llama-3.1-70b          Stale (Medium)              3h ago    21.70          1461120.00
anthropic  claude-4-sonnet        Active                      3h ago    20.40          1311930.00
anthropic  claude-opus-4.5        Active                      3h ago    20.10          2331840.00
bedrock    claude-3-opus          Active                      17d ago   19.40          622860.00
deepinfra  llama-3.3-70b          Never Succeeded (Medium)    3h ago    19.00          1462510.00
bedrock    claude-opus-4.5        Active                      25m ago   18.50          1242030.00
anthropic  Claude Opus 4.1        Active                      3h ago    18.00          8271480.00
anthropic  claude-4-opus          Active                      3h ago    17.60          5241300.00
deepinfra  llama-3.2-11b          Stale (Medium)              3h ago    12.90          1622690.00
openai     gpt-5.2-codex          Active                      3h ago    12.90          1271820.00
deepinfra  qwen-3-235b            Never Succeeded (Medium)    3h ago    10.60          1405820.00
openai     o1-pro                 Likely Deprecated (Medium)  3h ago    9.54           118640.00
openai     GPT-5.2-pro            Active                      3h ago    8.54           4144930.00

📈 Time Series 📈