
☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, since throughput varies with provider load. For readability, the chart omits some models; the table below has the full results.
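As a rough illustration of what one such measurement could look like, here is a minimal sketch. It is not the site's actual benchmark code: `measure_tokens_per_second`, `summarize`, and `fake_provider` are hypothetical names, and the stand-in generator replaces what would be a real streaming API call to a provider.

```python
import time
from statistics import median
from typing import Callable, Dict, Iterable, List


def measure_tokens_per_second(
    stream_tokens: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume one token stream and return tokens generated per second."""
    start = clock()
    count = 0
    for _ in stream_tokens():
        count += 1
    elapsed = clock() - start
    if count == 0 or elapsed <= 0:
        return 0.0
    return count / elapsed


def summarize(samples: List[float]) -> Dict[str, float]:
    """Reduce repeated runs to the spread a distribution chart would show."""
    return {"min": min(samples), "median": median(samples), "max": max(samples)}


def fake_provider() -> Iterable[str]:
    # Hypothetical stand-in for a streaming completion request.
    for i in range(50):
        time.sleep(0.001)
        yield f"tok{i}"


if __name__ == "__main__":
    # A cron job would run something like this on a schedule and store the samples.
    runs = [measure_tokens_per_second(fake_provider) for _ in range(3)]
    print(summarize(runs))
```

Repeating the measurement and keeping the minimum, median, and maximum is one simple way to capture the load-dependent variation mentioned above; the clock is injectable so the timing logic can be checked without real delays.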

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I add providers and models daily, covering any service that does not require purchasing a dedicated endpoint for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 287 tok/s
2 | qwen-3-32b | groq | 202 tok/s
3 | llama-3.1-8b | cerebras | 191 tok/s
4 | llama-4-scout | groq | 187 tok/s
5 | llama-3.3-70b | groq | 174 tok/s
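The list above appears to be the full results filtered to entries updated within the last 24 hours and sorted by speed. The sketch below shows one way to reproduce that selection. The page does not state the exact rule, so the `Active`-only filter, the `Row` type, and the helper names `age_in_hours` and `fastest_recent` are assumptions; the sample rows are copied from the full results table.

```python
import re
from typing import List, NamedTuple


class Row(NamedTuple):
    provider: str
    model: str
    status: str
    updated: str  # relative age as shown on the page, e.g. "1h ago"
    speed: float  # tokens per second


_UNIT_HOURS = {"m": 1 / 60, "h": 1.0, "d": 24.0}


def age_in_hours(text: str) -> float:
    """Convert a relative age such as '32m ago' or '24d ago' to hours."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", text.strip())
    if match is None:
        raise ValueError(f"unrecognized age: {text!r}")
    return int(match.group(1)) * _UNIT_HOURS[match.group(2)]


def fastest_recent(rows: List[Row], limit: int = 5, max_age_hours: float = 24.0) -> List[Row]:
    """Keep active rows updated recently, fastest first (assumed selection rule)."""
    fresh = [
        r for r in rows
        if r.status == "Active" and age_in_hours(r.updated) < max_age_hours
    ]
    return sorted(fresh, key=lambda r: r.speed, reverse=True)[:limit]


ROWS = [
    Row("groq", "llama-3.1-8b", "Active", "1h ago", 287.0),
    Row("groq", "llama-4-maverick", "Active", "24d ago", 223.0),
    Row("groq", "qwen-3-32b", "Active", "1h ago", 202.0),
    Row("cerebras", "llama-3.1-8b", "Active", "1h ago", 191.0),
    Row("groq", "llama-4-scout", "Active", "1h ago", 187.0),
    Row("cerebras", "gpt-oss-120b", "Active", "21d ago", 175.0),
    Row("groq", "llama-3.3-70b", "Active", "1h ago", 174.0),
]

if __name__ == "__main__":
    for rank, row in enumerate(fastest_recent(ROWS), start=1):
        print(rank, row.model, row.provider, f"{row.speed:.0f} tok/s")
```

Under these assumptions, llama-4-maverick (24d ago) and gpt-oss-120b (21d ago) drop out despite their speed, which is consistent with their absence from the list above.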

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 94 of 94 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
Provider | Model | Status | Updated | Speed (tok/s) | Other metrics (fused in source)
groq | llama-3.1-8b | Active | 1h ago | 287.00 | 130424100.00
groq | llama-4-maverick | Active | 24d ago | 223.00 | 13021600.00
groq | qwen-3-32b | Active | 1h ago | 202.00 | 11284210.00
cerebras | llama-3.1-8b | Active | 1h ago | 191.00 | 13531020.00
groq | llama-4-scout | Active | 1h ago | 187.00 | 7333290.00
cerebras | gpt-oss-120b | Active | 21d ago | 175.00 | 13481740.00
groq | llama-3.3-70b | Active | 1h ago | 174.00 | 40340180.00
together | llama-3.1-8b | Active | 19d ago | 148.00 | 41228220.00
groq | kimi-k2 | Active | 1h ago | 129.00 | 12211350.00
bedrock | nova-micro | Active | 32m ago | 124.00 | 64152260.00
openai | o3 Mini | Never Succeeded (Medium) | 4h ago | 110.00 | 81690.00
bedrock | llama-4-maverick | Active | 31m ago | 105.00 | 1145530.00
openai | o3-mini-2025-01-31 | Active | 4h ago | 105.00 | 151600.00
bedrock | nova-lite | Active | 32m ago | 98.80 | 20132300.00
bedrock | llama-4-scout | Active | 31m ago | 98.10 | 3130310.00
bedrock | llama-3.3-70b | Active | 31m ago | 94.00 | 2128310.00
openai | GPT-5.4-nano | Active | 4h ago | 91.60 | 42134400.00
openai | GPT-5.1-codex-max | Active | 4h ago | 91.10 | 141171120.00
deepinfra | mistral-7b | Stale (Medium) | 1h ago | 88.90 | 10148490.00
together | qwen-2.5-7b | Active | 1h ago | 88.00 | 1139530.00
openai | GPT-5.4-nano-2026-03-17 | Active | 4h ago | 86.90 | 36125450.00
openai | o1 | Active | 4h ago | 81.90 | 2114740.00
deepinfra | devstral-small | Never Succeeded (Medium) | 1h ago | 76.90 | 9140540.00
bedrock | nova-pro | Active | 32m ago | 76.70 | 19118390.00
openai | GPT-5.4-mini | Active | 4h ago | 76.60 | 16111480.00
openai | gpt-4.1-nano | Active | 4h ago | 74.90 | 18139410.00
google | gemini-2.5-flash-lite | Active | 1h ago | 74.70 | 10117530.00
fireworks | mixtral-8x22b | Active | 1h ago | 74.70 | 28111330.00
openai | GPT-5.4-mini-2026-03-17 | Active | 4h ago | 73.80 | 9119520.00
openai | gpt-3.5-turbo | Active | 4h ago | 73.10 | 4125520.00
together | deepseek-r1 | Active | 1h ago | 65.00 | 5113560.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 1h ago | 64.70 | 61051020.00
openai | gpt-4o | Active | 4h ago | 59.40 | 51421570.00
together | mixtral-8x7b | Active | 1h ago | 58.50 | 8114190.00
fireworks | llama-3.3-70b | Active | 1h ago | 57.40 | 11081270.00
openai | gpt-4.1-mini | Active | 4h ago | 53.60 | 18109390.00
together | llama-3.2-3b | Active | 27d ago | 52.80 | 121091290.00
openai | GPT-5-chat-latest | Active | 4h ago | 52.40 | 1383550.00
openai | o4-mini-2025-04-16 | Active | 4h ago | 52.20 | 28770.00
together | llama-3.3-70b | Active | 1h ago | 51.80 | 2121930.00
openai | o4 Mini | Never Succeeded (Medium) | 4h ago | 50.60 | 4770.00
anthropic | claude-haiku-4.5 | Active | 1h ago | 48.20 | 373670.00
bedrock | llama-3.2-90b | Active | 32m ago | 46.60 | 250380.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 1h ago | 46.20 | 1842050.00
deepinfra | llama-3-8b | Stale (Medium) | 1h ago | 45.40 | 1869320.00
openai | gpt-4.1 | Active | 4h ago | 43.60 | 1585510.00
bedrock | mistral-large | Active | 31m ago | 41.00 | 247520.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 1h ago | 40.50 | 7651520.00
deepinfra | llama-3.2-1b | Stale (Medium) | 1h ago | 40.00 | 3100750.00
openai | o3-2025-04-16 | Active | 4h ago | 40.00 | 9710.00
deepinfra | llama-3.2-3b | Stale (Medium) | 1h ago | 39.40 | 399750.00
openai | gpt-4o-mini | Active | 4h ago | 39.00 | 764430.00
openai | o3 | Active | 4h ago | 38.80 | 9690.00
bedrock | claude-haiku-4.5 | Active | 32m ago | 38.70 | 3651200.00
deepinfra | llama-3.1-8b | Stale (Medium) | 1h ago | 35.80 | 278760.00
openai | GPT-5.1 | Active | 4h ago | 33.90 | 264940.00
openai | GPT-5.1-2025-11-13 | Active | 4h ago | 33.80 | 1062840.00
openai | gpt-4-turbo | Active | 8d ago | 33.10 | 149520.00
bedrock | claude-3-5-haiku | Active | 32m ago | 32.50 | 538660.00
bedrock | claude-3-5-sonnet | Active | 2d ago | 31.90 | 146800.00
bedrock | claude-3-7-sonnet | Active | 32m ago | 31.80 | 242800.00
deepinfra | llama-3.2-90b | Stale (Medium) | 1h ago | 31.30 | 482820.00
openai | GPT-5.4 | Active | 4h ago | 30.10 | 945760.00
openai | GPT-5.4-2026-03-05 | Active | 4h ago | 30.10 | 842700.00
openai | GPT-5.2 | Active | 4h ago | 29.60 | 447820.00
openai | GPT-5.2-2025-12-11 | Active | 4h ago | 29.50 | 1643770.00
deepinfra | llama-3-70b | Stale (Medium) | 1h ago | 28.90 | 451570.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 1h ago | 28.80 | 1462470.00
deepinfra | llama-2-70b | Stale (Medium) | 1h ago | 28.80 | 452630.00
openai | GPT-5.1-codex | Active | 4h ago | 28.80 | 1521160.00
openai | GPT-5.1-chat-latest | Active | 4h ago | 28.40 | 352970.00
openai | GPT-5.1-codex-mini | Active | 4h ago | 26.10 | 1511210.00
openai | gpt-4 | Active | 4h ago | 25.80 | 446650.00
openai | GPT-5.3-codex | Active | 4h ago | 25.70 | 740840.00
deepinfra | llama-3.2-11b | Stale (Medium) | 1h ago | 25.20 | 1811440.00
deepinfra | llama-3.1-405b | Stale (Medium) | 1h ago | 21.70 | 1391120.00
anthropic | claude-opus-4.5 | Active | 1h ago | 21.40 | 4311560.00
bedrock | claude-sonnet-4.5 | Active | 32m ago | 20.90 | 1291810.00
deepinfra | llama-3.1-70b | Stale (Medium) | 1h ago | 20.60 | 1421080.00
anthropic | claude-4-sonnet | Active | 1h ago | 18.90 | 7321940.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 1h ago | 18.80 | 1432000.00
bedrock | claude-opus-4.5 | Active | 32m ago | 18.20 | 1272470.00
anthropic | Claude Opus 4.1 | Active | 1h ago | 17.80 | 7251460.00
openai | gpt-5.2-codex | Active | 4h ago | 17.30 | 2371410.00
anthropic | claude-4-opus | Active | 1h ago | 17.20 | 8241330.00
openai | GPT-5.2-chat-latest | Active | 4h ago | 10.70 | 1271520.00
openai | o1-pro | Likely Deprecated (Medium) | 4h ago | 9.91 | 118710.00
openai | GPT-5.2-pro | Active | 4h ago | 9.11 | 4144530.00
openai | GPT-5-codex | Active | 19h ago | 8.13 | 1231940.00
openai | o3-pro | Active | 4h ago | 7.55 | 115360.00
openai | o3-pro-2025-06-10 | Active | 4h ago | 7.46 | 214470.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 1h ago | 7.21 | 1535800.00
openai | GPT-5-pro | Active | 4h ago | 4.04 | 180.00
openai | GPT-5.2-pro-2025-12-11 | Active | 10h ago | 1.90 | 148040.00
Lifecycle snapshot

📈 Time Series 📈