
☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, since speeds can vary with provider load. For readability, not all models are shown in the chart, but you can see the full results in the table below.
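As a rough illustration of what one timed run could look like, here is a minimal sketch. It is not the benchmark code behind this page: `tokens_per_second`, `summarize`, and `fake_stream` are hypothetical names, and the stream is a stand-in for a provider's streaming response. The idea is simply to count tokens as they arrive, divide by elapsed time, and summarize repeated runs so the spread caused by varying load is visible.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def tokens_per_second(
    stream: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume one streamed completion and return tokens generated per second."""
    start = clock()
    count = sum(1 for _ in stream())  # count tokens as they arrive
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed


def summarize(speeds: List[float]) -> Dict[str, float]:
    """Condense repeated runs so load-dependent variation is visible."""
    return {
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
    }


def fake_stream() -> Iterable[str]:
    # Hypothetical stand-in for a provider's streaming response: 50 tokens.
    return iter(["tok"] * 50)
```

In a scheduled job, `stream` would wrap a real streaming API call, and each result would be stored with a timestamp so the distribution can be plotted later. The injectable `clock` is only there to make the function easy to test without real delays.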

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking anywhere that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

#  Model          Provider  Speed
1  llama-3.1-8b   groq      290 tok/s
2  qwen-3-32b     groq      202 tok/s
3  llama-3.1-8b   cerebras  199 tok/s
4  llama-4-scout  groq      187 tok/s
5  llama-3.3-70b  groq      168 tok/s
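The ranking above can be reproduced from the full results with a small filter-and-sort. The sketch below assumes that "updated <24h" means rows whose relative age is under 24 hours; the page does not state the exact selection rule, so treat that as an assumption. `Row`, `age_in_hours`, and `fastest_fresh` are hypothetical names, and the sample rows are copied from the full results table.

```python
import re
from typing import Iterable, List, NamedTuple


class Row(NamedTuple):
    provider: str
    model: str
    updated: str  # relative age as shown in the table, e.g. "3h ago"
    speed: float  # tokens per second


_UNIT_MINUTES = {"m": 1, "h": 60, "d": 1440}


def age_in_hours(updated: str) -> float:
    """Convert a relative age such as '28m ago' or '4d ago' to hours."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", updated.strip())
    if match is None:
        raise ValueError(f"unrecognized age: {updated!r}")
    return int(match.group(1)) * _UNIT_MINUTES[match.group(2)] / 60


def fastest_fresh(
    rows: Iterable[Row], max_age_hours: float = 24.0, top_n: int = 5
) -> List[Row]:
    """Keep rows updated within max_age_hours, then rank by speed, fastest first."""
    fresh = [r for r in rows if age_in_hours(r.updated) < max_age_hours]
    return sorted(fresh, key=lambda r: r.speed, reverse=True)[:top_n]


# Sample rows taken from the full results table.
ROWS = [
    Row("groq", "llama-3.1-8b", "3h ago", 290.0),
    Row("groq", "llama-4-maverick", "28d ago", 241.0),
    Row("groq", "qwen-3-32b", "3h ago", 202.0),
    Row("cerebras", "llama-3.1-8b", "3h ago", 199.0),
    Row("groq", "llama-4-scout", "3h ago", 187.0),
    Row("cerebras", "gpt-oss-120b", "25d ago", 186.0),
    Row("groq", "llama-3.3-70b", "3h ago", 168.0),
]

for rank, row in enumerate(fastest_fresh(ROWS), start=1):
    print(rank, row.model, row.provider, f"{row.speed:.0f} tok/s")
```

With these rows, llama-4-maverick and gpt-oss-120b drop out despite their high speeds because their last update is weeks old, which matches the list shown above.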

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 93 of 93 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled

Provider   Model                    Status                      Updated  Speed (tok/s)  Remaining columns (fused in source)
groq       llama-3.1-8b             Active                      3h ago   290.00         130424100.00
groq       llama-4-maverick         Active                      28d ago  241.00         73302180.00
groq       qwen-3-32b               Active                      3h ago   202.00         9282200.00
cerebras   llama-3.1-8b             Active                      3h ago   199.00         3353740.00
groq       llama-4-scout            Active                      3h ago   187.00         7333290.00
cerebras   gpt-oss-120b             Active                      25d ago  186.00         13482670.00
groq       llama-3.3-70b            Active                      3h ago   168.00         40277190.00
together   llama-3.1-8b             Active                      23d ago  146.00         41228230.00
groq       kimi-k2                  Active                      3h ago   136.00         12211280.00
bedrock    nova-micro               Active                      28m ago  123.00         64154260.00
openai     o3 Mini                  Never Succeeded (Medium)    4d ago   111.00         81690.00
openai     o3-mini-2025-01-31       Active                      4d ago   105.00         151600.00
bedrock    llama-4-maverick         Active                      28m ago  104.00         1145530.00
bedrock    llama-4-scout            Active                      28m ago  97.80          3130320.00
bedrock    nova-lite                Active                      28m ago  97.60          20131310.00
bedrock    llama-3.3-70b            Active                      28m ago  93.60          2128300.00
openai     GPT-5.1-codex-max        Active                      4d ago   91.80          141171120.00
openai     GPT-5.4-nano             Active                      4d ago   91.60          42134400.00
together   qwen-2.5-7b              Active                      3h ago   87.10          1138530.00
openai     GPT-5.4-nano-2026-03-17  Active                      4d ago   86.90          36125450.00
deepinfra  mistral-7b               Stale (Medium)              3h ago   85.90          5148530.00
openai     o1                       Active                      4d ago   81.90          2114740.00
openai     GPT-5.4-mini             Active                      4d ago   76.60          16111480.00
openai     gpt-4.1-nano             Active                      4d ago   75.60          18139390.00
bedrock    nova-pro                 Active                      28m ago  75.20          19118400.00
fireworks  mixtral-8x22b            Active                      3h ago   74.50          28111330.00
openai     GPT-5.4-mini-2026-03-17  Active                      4d ago   73.80          9119520.00
google     gemini-2.5-flash-lite    Active                      3h ago   73.30          18117520.00
deepinfra  devstral-small           Never Succeeded (Medium)    3h ago   73.00          10136560.00
openai     gpt-3.5-turbo            Active                      4d ago   72.80          4125520.00
together   deepseek-r1              Active                      3h ago   65.50          5109620.00
google     gemini-2.5-flash         Never Succeeded (Medium)    3h ago   64.90          61051020.00
together   mixtral-8x7b             Active                      3h ago   60.40          8113190.00
fireworks  llama-3.3-70b            Active                      3h ago   58.60          11081030.00
openai     gpt-4o                   Active                      4d ago   57.00          51421660.00
openai     gpt-4.1-mini             Active                      4d ago   53.50          18109390.00
openai     GPT-5-chat-latest        Active                      4d ago   52.40          1383550.00
openai     o4-mini-2025-04-16       Active                      4d ago   52.20          28770.00
together   llama-3.3-70b            Active                      3h ago   51.60          2118910.00
openai     o4 Mini                  Never Succeeded (Medium)    4d ago   50.90          4770.00
deepinfra  Qwen 2.5 Coder 32B       Never Succeeded (Medium)    3h ago   50.80          1841500.00
anthropic  claude-haiku-4.5         Active                      3h ago   47.30          370700.00
bedrock    llama-3.2-90b            Active                      28m ago  46.50          250390.00
deepinfra  llama-3-8b               Stale (Medium)              3h ago   45.90          1869320.00
openai     gpt-4.1                  Active                      4d ago   43.80          1585510.00
bedrock    mistral-large            Active                      28m ago  41.10          247500.00
google     gemini-2.5-pro           Never Succeeded (Medium)    3h ago   40.00          7651540.00
openai     o3-2025-04-16            Active                      4d ago   40.00          9710.00
openai     gpt-4o-mini              Active                      4d ago   39.10          764430.00
bedrock    claude-haiku-4.5         Active                      28m ago  38.80          3651200.00
openai     o3                       Active                      4d ago   38.80          9690.00
deepinfra  llama-3.1-8b             Stale (Medium)              3h ago   37.30          278650.00
deepinfra  llama-3.2-1b             Stale (Medium)              3h ago   35.80          399750.00
deepinfra  llama-3.2-3b             Stale (Medium)              3h ago   35.50          399750.00
openai     GPT-5.1                  Active                      4d ago   34.20          264930.00
openai     GPT-5.1-2025-11-13       Active                      4d ago   33.80          1062840.00
openai     gpt-4-turbo              Active                      12d ago  32.80          149520.00
bedrock    claude-3-5-haiku         Active                      28m ago  32.60          342650.00
bedrock    claude-3-7-sonnet        Active                      28m ago  32.00          242810.00
deepinfra  llama-3.2-11b            Stale (Medium)              3h ago   31.70          1811130.00
bedrock    claude-3-5-sonnet        Active                      6d ago   31.60          146840.00
openai     GPT-5.4                  Active                      4d ago   30.10          945760.00
openai     GPT-5.4-2026-03-05       Active                      4d ago   30.10          842700.00
openai     GPT-5.2                  Active                      4d ago   30.00          447810.00
openai     GPT-5.2-2025-12-11       Active                      4d ago   29.50          1643770.00
openai     GPT-5.1-codex            Active                      4d ago   29.20          1521150.00
deepinfra  llama-3-70b              Stale (Medium)              6h ago   29.10          351580.00
deepinfra  llama-3.2-90b            Stale (Medium)              3h ago   29.00          382870.00
deepinfra  qwen-2.5-72b             Stale (Medium)              3h ago   28.90          1452450.00
deepinfra  llama-2-70b              Stale (Medium)              6h ago   28.90          452590.00
openai     GPT-5.1-chat-latest      Active                      4d ago   28.40          352970.00
openai     GPT-5.1-codex-mini       Active                      4d ago   26.40          2511210.00
openai     GPT-5.3-codex            Active                      4d ago   25.70          740840.00
openai     gpt-4                    Active                      4d ago   25.50          445650.00
deepinfra  llama-3.1-70b            Stale (Medium)              3h ago   22.10          142670.00
anthropic  claude-opus-4.5          Active                      3h ago   22.00          4371470.00
bedrock    claude-sonnet-4.5        Active                      28m ago  20.90          1291790.00
deepinfra  llama-3.1-405b           Stale (Medium)              3h ago   20.30          1391390.00
deepinfra  llama-3.3-70b            Never Succeeded (Medium)    3h ago   19.20          1421720.00
anthropic  claude-4-sonnet          Active                      3h ago   18.70          7322010.00
bedrock    claude-opus-4.5          Active                      28m ago  18.10          1272470.00
anthropic  Claude Opus 4.1          Active                      3h ago   18.00          7251430.00
openai     gpt-5.2-codex            Active                      4d ago   17.80          2371360.00
anthropic  claude-4-opus            Active                      3h ago   17.10          8241320.00
openai     GPT-5.2-chat-latest      Active                      4d ago   10.70          1271520.00
openai     o1-pro                   Likely Deprecated (Medium)  4d ago   9.83           118810.00
openai     GPT-5.2-pro              Active                      4d ago   9.13           5144740.00
openai     GPT-5-codex              Active                      4d ago   8.13           1231940.00
deepinfra  qwen-3-235b              Never Succeeded (Medium)    3h ago   7.56           1535320.00
openai     o3-pro                   Active                      4d ago   7.55           115360.00
openai     o3-pro-2025-06-10        Active                      4d ago   7.46           214470.00
openai     GPT-5-pro                Active                      4d ago   4.04           180.00
openai     GPT-5.2-pro-2025-12-11   Active                      4d ago   1.90           148040.00
Lifecycle snapshot

📈 Time Series 📈