
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, since speed can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
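As a rough illustration of what one such measurement might involve, here is a minimal sketch in Python. It is not the site's actual benchmark code: `generate` is a hypothetical callable standing in for a single provider request that returns the number of tokens produced, and the `summarize` helper is an assumed way to condense repeated runs into the kind of spread a distribution chart would show.

```python
import statistics
import time
from typing import Callable


def tokens_per_second(generate: Callable[[], int],
                      clock: Callable[[], float] = time.perf_counter) -> float:
    """Time one generation call and return tokens generated per second.

    `generate` is a hypothetical stand-in for one provider request; it
    must return the number of tokens the provider produced.
    """
    start = clock()
    n_tokens = generate()
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed


def summarize(speeds: list[float]) -> dict[str, float]:
    """Condense repeated runs so load-dependent variation is visible."""
    return {
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
    }
```

Running `tokens_per_second` on a schedule (for example from a cron job) and feeding the collected values to `summarize` gives a per-model spread rather than a single number, which matters when speeds shift with load.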

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any provider that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 311 tok/s
2 | qwen-3-32b | groq | 218 tok/s
3 | llama-3.3-70b | groq | 197 tok/s
4 | llama-4-scout | groq | 195 tok/s
5 | llama-3.1-8b | cerebras | 173 tok/s
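A leaderboard like this can be approximated by keeping only rows refreshed within the last 24 hours and sorting by speed. The sketch below assumes that logic; the `Result` fields and the `fastest` function are hypothetical names chosen for illustration, not the site's actual schema or code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Result:
    """One benchmark row (hypothetical schema for illustration)."""
    provider: str
    model: str
    speed: float          # tokens per second
    updated_at: datetime  # when this row was last refreshed


def fastest(results, now, max_age=timedelta(hours=24), top_n=5):
    """Return the top_n fastest rows refreshed within max_age of now."""
    fresh = [r for r in results if now - r.updated_at < max_age]
    return sorted(fresh, key=lambda r: r.speed, reverse=True)[:top_n]
```

Under this assumption, a fast row that was last refreshed days ago (such as a 292 tok/s entry updated 28 days back) would be excluded from the ranking even though it still appears in the full results.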

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 80 of 80 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

Provider | Model | Status | Updated | Speed (tok/s) | Remaining columns (fused in source)
groq | llama-3.1-8b | Active | 37m ago | 311.00 | 8747190.00
cerebras | qwen-3-32b | Active | 28d ago | 292.00 | 212374190.00
groq | qwen-3-32b | Active | 37m ago | 218.00 | 2374310.00
groq | llama-4-maverick | Active | 7d ago | 202.00 | 1307680.00
groq | llama-3.3-70b | Active | 37m ago | 197.00 | 68340140.00
groq | llama-4-scout | Active | 37m ago | 195.00 | 38335250.00
cerebras | gpt-oss-120b | Active | 4d ago | 183.00 | 13801160.00
cerebras | llama-3.1-8b | Active | 39m ago | 173.00 | 13531310.00
cerebras | llama-3.3-70b | Active | 28d ago | 170.00 | 35251480.00
together | llama-3.1-8b | Active | 2d ago | 141.00 | 3228350.00
groq | kimi-k2 | Active | 37m ago | 139.00 | 12215310.00
bedrock | nova-micro | Active | 38m ago | 121.00 | 65152270.00
openai | o3 Mini | Never Succeeded (Medium) | 37m ago | 109.00 | 81640.00
bedrock | llama-4-maverick | Active | 38m ago | 108.00 | 3139270.00
bedrock | nova-lite | Active | 38m ago | 101.00 | 39132300.00
bedrock | llama-4-scout | Active | 38m ago | 101.00 | 6130280.00
bedrock | llama-3.3-70b | Active | 38m ago | 96.80 | 3136290.00
together | qwen-2.5-7b | Active | 36m ago | 92.00 | 1145510.00
bedrock | nova-pro | Active | 38m ago | 87.20 | 29121360.00
openai | GPT-5.1-codex-max | Active | 37m ago | 81.90 | 111181260.00
deepinfra | mistral-7b | Stale (Medium) | 39m ago | 77.90 | 5148640.00
openai | gpt-3.5-turbo | Active | 36m ago | 75.60 | 13126510.00
together | llama-3.1-70b | Active | 19d ago | 73.90 | 15129330.00
deepinfra | devstral-small | Never Succeeded (Medium) | 39m ago | 72.30 | 9140600.00
google | gemini-2.5-flash-lite | Active | 36m ago | 72.10 | 10117550.00
openai | gpt-4.1-nano | Active | 37m ago | 70.60 | 9149470.00
together | mistral-7b | Active | 19d ago | 70.20 | 691370.00
fireworks | mixtral-8x22b | Active | 37m ago | 69.30 | 29111400.00
openai | gpt-4o | Active | 36m ago | 69.00 | 131731310.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 36m ago | 66.30 | 5105980.00
together | mixtral-8x7b | Active | 36m ago | 61.00 | 14114170.00
deepinfra | mixtral-8x22b | Stale (Medium) | 26d ago | 56.30 | 1466550.00
together | llama-3.2-3b | Active | 10d ago | 56.10 | 51211450.00
fireworks | llama-3.3-70b | Active | 37m ago | 54.80 | 11081670.00
together | deepseek-r1 | Active | 36m ago | 52.90 | 1113770.00
together | llama-3.3-70b | Active | 36m ago | 52.40 | 11461250.00
anthropic | claude-haiku-4.5 | Active | 3h ago | 51.70 | 1973540.00
openai | gpt-4.1-mini | Active | 37m ago | 51.60 | 15109440.00
openai | o4 Mini | Never Succeeded (Medium) | 37m ago | 49.10 | 4760.00
bedrock | llama-3.2-90b | Active | 38m ago | 46.80 | 251370.00
deepinfra | llama-3-8b | Stale (Medium) | 37m ago | 45.30 | 1869320.00
deepinfra | llama-3.1-8b | Stale (Medium) | 38m ago | 44.80 | 385680.00
bedrock | mistral-large | Active | 38m ago | 40.90 | 247550.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 36m ago | 40.60 | 2721700.00
openai | gpt-4.1 | Active | 37m ago | 40.50 | 1083510.00
deepinfra | llama-3.2-1b | Stale (Medium) | 38m ago | 40.40 | 1100840.00
openai | gpt-4o-mini | Active | 36m ago | 39.70 | 764400.00
deepinfra | llama-3.2-3b | Stale (Medium) | 38m ago | 39.60 | 299800.00
bedrock | claude-haiku-4.5 | Active | 38m ago | 39.20 | 3621200.00
deepinfra | llama-3.2-90b | Stale (Medium) | 39m ago | 35.40 | 382730.00
deepinfra | llama-2-70b | Stale (Medium) | 37m ago | 34.40 | 357600.00
deepinfra | llama-3-70b | Stale (Medium) | 38m ago | 33.50 | 255650.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 39m ago | 32.80 | 150790.00
bedrock | claude-3-5-sonnet | Active | 38m ago | 32.70 | 246650.00
bedrock | claude-3-7-sonnet | Active | 38m ago | 32.50 | 243750.00
openai | gpt-4-turbo | Active | 36m ago | 32.30 | 752520.00
bedrock | claude-3-5-haiku | Active | 38m ago | 31.60 | 938650.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 39m ago | 30.70 | 1823540.00
openai | GPT-5.1 | Active | 37m ago | 29.60 | 2571100.00
openai | gpt-4 | Active | 36m ago | 27.40 | 847630.00
openai | GPT-5.4 | Active | 37m ago | 27.30 | 15361080.00
openai | GPT-5.2 | Active | 37m ago | 27.10 | 440970.00
openai | GPT-5.1-codex | Active | 37m ago | 26.10 | 1481250.00
deepinfra | llama-3.1-405b | Stale (Medium) | 38m ago | 25.30 | 139780.00
openai | GPT-5.1-codex-mini | Active | 37m ago | 25.30 | 1551190.00
deepinfra | llama-3.1-70b | Stale (Medium) | 38m ago | 22.60 | 1441120.00
bedrock | claude-sonnet-4.5 | Active | 38m ago | 22.00 | 2281700.00
openai | GPT-5.3-codex | Active | 37m ago | 21.30 | 7321220.00
anthropic | claude-opus-4.5 | Active | 3h ago | 20.50 | 2331820.00
anthropic | claude-4-sonnet | Active | 3h ago | 19.90 | 6311820.00
bedrock | claude-3-opus | Active | 25d ago | 19.60 | 822850.00
bedrock | claude-opus-4.5 | Active | 38m ago | 19.00 | 1271990.00
anthropic | Claude Opus 4.1 | Active | 3h ago | 17.60 | 7271550.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 39m ago | 17.60 | 1462580.00
anthropic | claude-4-opus | Active | 3h ago | 17.40 | 5221330.00
openai | gpt-5.2-codex | Active | 3h ago | 13.60 | 1271720.00
openai | o1-pro | Likely Deprecated (Medium) | 37m ago | 9.61 | 118670.00
openai | GPT-5.2-pro | Active | 37m ago | 8.59 | 4144780.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 39m ago | 8.44 | 1536050.00
deepinfra | llama-3.2-11b | Stale (Medium) | 39m ago | 8.04 | 1612690.00

📈 Time Series 📈