
☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, which can vary with provider load. For readability, not all models are shown in the chart, but the full results are in the table below.
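As a rough illustration of what one such measurement could look like, here is a minimal sketch in Python. It is not the code behind this page: `measure_tokens_per_second`, `summarize`, and the `stream` callable (anything that yields generated tokens one at a time) are hypothetical names, and the real jobs may count tokens or time requests differently.

```python
import statistics
import time
from typing import Callable, Iterable


def measure_tokens_per_second(
    stream: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one streamed completion and return generated tokens per second.

    `stream` is a hypothetical stand-in for a provider call that yields tokens.
    """
    start = clock()
    token_count = sum(1 for _ in stream())  # consume the whole stream
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return token_count / elapsed


def summarize(speeds: list[float]) -> dict[str, float]:
    """Reduce repeated runs to a few figures describing the distribution."""
    return {
        "min": min(speeds),
        "median": statistics.median(speeds),
        "mean": statistics.fmean(speeds),
        "max": max(speeds),
    }
```

Running the measurement on a schedule and keeping every result, rather than only the latest one, is what makes a distribution chart possible: each run adds one point per model.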

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any service that does not require purchasing dedicated hosting endpoints (which is why some models may appear to be missing). If you have suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

#  Model          Provider  Speed
1  llama-3.1-8b   groq      325 tok/s
2  qwen-3-32b     groq      236 tok/s
3  llama-4-scout  groq      213 tok/s
4  llama-3.1-8b   cerebras  202 tok/s
5  llama-3.3-70b  groq      202 tok/s
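The ranking above looks like a freshness filter followed by a sort on speed: cerebras qwen-3-32b is faster than most of these entries in the full results, but its last update was 15 days ago, so it does not appear. A minimal sketch of that selection, assuming a hypothetical `Result` record and `fastest_recent` helper (the page's actual logic is not shown and may apply further rules, such as status checks):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    provider: str
    model: str
    speed_tok_s: float
    hours_since_update: float


def fastest_recent(results, max_age_hours=24.0, limit=5):
    """Keep results updated within the window, then rank by speed, fastest first."""
    fresh = [r for r in results if r.hours_since_update < max_age_hours]
    # sorted() is stable, so equal speeds keep their input order.
    return sorted(fresh, key=lambda r: r.speed_tok_s, reverse=True)[:limit]


# A few rows taken from the full results table; ages are converted to hours.
rows = [
    Result("groq", "llama-3.1-8b", 325.0, 2),
    Result("cerebras", "qwen-3-32b", 245.0, 15 * 24),
    Result("groq", "qwen-3-32b", 236.0, 2),
    Result("groq", "llama-4-scout", 213.0, 2),
    Result("cerebras", "llama-3.1-8b", 202.0, 2),
    Result("groq", "llama-3.3-70b", 202.0, 2),
    Result("cerebras", "gpt-oss-120b", 194.0, 2),
]

for rank, r in enumerate(fastest_recent(rows), start=1):
    print(rank, r.model, r.provider, f"{r.speed_tok_s:.0f} tok/s")
```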

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 83 of 83 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

Provider   Model                  Status                      Updated  Speed (tok/s)  Other columns (unsplit)
groq       llama-3.1-8b           Active                      2h ago   325.00         8747190.00
cerebras   qwen-3-32b             Active                      15d ago  245.00         4444390.00
groq       qwen-3-32b             Active                      2h ago   236.00         2391280.00
groq       llama-4-scout          Active                      2h ago   213.00         38335210.00
cerebras   llama-3.1-8b           Active                      2h ago   202.00         2343720.00
groq       llama-3.3-70b          Active                      2h ago   202.00         68322130.00
cerebras   gpt-oss-120b           Active                      2h ago   194.00         13380810.00
groq       llama-4-maverick       Active                      2h ago   182.00         12307430.00
cerebras   llama-3.3-70b          Active                      15d ago  181.00         17338450.00
together   llama-3.1-8b           Active                      2h ago   144.00         2232560.00
groq       kimi-k2                Active                      5h ago   143.00         12215290.00
bedrock    nova-micro             Active                      31m ago  122.00         65149270.00
openai     o3 Mini                Never Succeeded (Medium)    2h ago   110.00         211600.00
bedrock    llama-4-maverick       Active                      31m ago  108.00         3139260.00
bedrock    nova-lite              Active                      31m ago  101.00         39130300.00
bedrock    llama-4-scout          Active                      31m ago  101.00         4129270.00
bedrock    llama-3.3-70b          Active                      31m ago  99.10          3137280.00
together   qwen-2.5-7b            Active                      2h ago   92.90          11145240.00
bedrock    nova-pro               Active                      31m ago  91.20          13121350.00
cerebras   qwen-3-235b-instruct   Active                      28d ago  85.20          191711270.00
openai     gpt-3.5-turbo          Active                      2h ago   78.70          13126470.00
openai     GPT-5.1-codex-max      Active                      2h ago   77.50          111181460.00
google     gemini-2.5-flash-lite  Active                      2h ago   75.30          34132520.00
together   llama-3.1-70b          Active                      6d ago   72.50          7129390.00
openai     gpt-4.1-nano           Active                      2h ago   71.20          9149430.00
together   mistral-7b             Active                      6d ago   70.30          291540.00
openai     gpt-4o                 Active                      2h ago   68.80          71731340.00
fireworks  mixtral-8x22b          Active                      2d ago   68.00          29112420.00
google     gemini-2.5-flash       Never Succeeded (Medium)    2h ago   66.60          51011060.00
deepinfra  mistral-7b             Stale (Medium)              2h ago   63.70          5136570.00
deepinfra  mixtral-8x22b          Stale (Medium)              13d ago  59.10          1478370.00
together   mixtral-8x7b           Active                      2h ago   58.50          13114160.00
together   llama-3.2-3b           Active                      2h ago   58.00          51211490.00
deepinfra  devstral-small         Never Succeeded (Medium)    2h ago   57.40          3131660.00
together   llama-3.3-70b          Active                      2h ago   53.60          11461430.00
google     gemini-2.0-flash       Active                      28d ago  53.30          2968700.00
together   qwen-2.5-72b           Active                      25d ago  53.00          471510.00
deepinfra  llama-3.1-8b           Stale (Medium)              2h ago   52.10          697440.00
google     gemini-2.0-flash-lite  Active                      28d ago  51.50          1164960.00
anthropic  claude-haiku-4.5       Active                      2h ago   51.00          1674590.00
openai     gpt-4.1-mini           Active                      2h ago   50.50          1597480.00
fireworks  llama-3.3-70b          Active                      2h ago   50.50          4951200.00
openai     o4 Mini                Never Succeeded (Medium)    2h ago   49.10          14760.00
bedrock    llama-3.2-90b          Active                      31m ago  47.20          651350.00
deepinfra  llama-3-8b             Stale (Medium)              2h ago   44.90          971310.00
together   deepseek-r1            Active                      2h ago   44.30          1691010.00
openai     gpt-4o-mini            Active                      2h ago   43.50          2095350.00
bedrock    mistral-large          Active                      31m ago  42.40          347340.00
bedrock    claude-haiku-4.5       Active                      31m ago  41.80          4631080.00
google     gemini-2.5-pro         Never Succeeded (Medium)    2h ago   40.60          2721730.00
openai     gpt-4.1                Active                      2h ago   37.40          682490.00
deepinfra  llama-3.2-1b           Stale (Medium)              2h ago   37.10          1100630.00
deepinfra  llama-3.2-3b           Stale (Medium)              2h ago   36.60          197800.00
deepinfra  llama-2-70b            Stale (Medium)              2h ago   35.70          357510.00
deepinfra  llama-3-70b            Stale (Medium)              2h ago   35.30          255610.00
deepinfra  llama-3.2-90b          Stale (Medium)              2h ago   33.80          376880.00
bedrock    claude-3-7-sonnet      Active                      31m ago  33.20          544740.00
deepinfra  qwen-2.5-72b           Stale (Medium)              2h ago   32.20          250700.00
bedrock    claude-3-5-sonnet      Active                      31m ago  32.00          244660.00
openai     gpt-4-turbo            Active                      2h ago   32.00          252530.00
deepinfra  Qwen 2.5 Coder 32B     Never Succeeded (Medium)    5h ago   31.30          1742490.00
bedrock    claude-3-5-haiku       Active                      31m ago  30.40          138750.00
openai     GPT-5.1                Active                      2h ago   29.90          357970.00
openai     gpt-4                  Active                      2h ago   27.30          247660.00
openai     GPT-5.2                Active                      2h ago   26.60          940950.00
openai     GPT-5.1-codex          Active                      2h ago   25.60          3481300.00
openai     GPT-5.1-codex-mini     Active                      2h ago   24.80          1551250.00
deepinfra  llama-3.1-405b         Stale (Medium)              2h ago   23.70          1341120.00
deepinfra  llama-3.1-70b          Stale (Medium)              2h ago   23.30          346660.00
bedrock    claude-sonnet-4.5      Active                      31m ago  22.30          5291610.00
together   llama-3.1-405b         Active                      26d ago  20.50          6291120.00
deepinfra  llama-3.3-70b          Never Succeeded (Medium)    2h ago   20.00          1492040.00
anthropic  claude-opus-4.5        Active                      2h ago   19.80          2331880.00
anthropic  claude-4-sonnet        Active                      2h ago   19.70          1312090.00
bedrock    claude-3-opus          Active                      13d ago  19.10          422870.00
bedrock    claude-opus-4.5        Active                      31m ago  18.30          4241990.00
anthropic  Claude Opus 4.1        Active                      2h ago   18.00          8271470.00
anthropic  claude-4-opus          Active                      2h ago   17.70          5241280.00
deepinfra  llama-3.2-11b          Stale (Medium)              2h ago   15.60          1622470.00
openai     gpt-5.2-codex          Active                      5h ago   12.80          1271860.00
deepinfra  qwen-3-235b            Never Succeeded (Medium)    2h ago   12.00          1385130.00
openai     o1-pro                 Likely Deprecated (Medium)  2h ago   8.94           118650.00
openai     GPT-5.2-pro            Active                      2h ago   8.36           1145190.00

📈 Time Series 📈