☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, which varies with provider load. For readability, not all models are shown in the chart; the full results are in the table below.
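
As a rough illustration of what one such measurement could look like, here is a minimal Python sketch that times a streamed completion and reports tokens per second. It is not the code behind this site: `measure_throughput`, `summarize`, and `fake_stream` are hypothetical names, and `fake_stream` stands in for a real provider's streaming call.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def measure_throughput(
    stream_tokens: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Consume one streamed completion and report tokens, seconds, and tok/s."""
    start = clock()
    tokens = sum(1 for _ in stream_tokens())  # count tokens as they arrive
    elapsed = clock() - start
    rate = tokens / elapsed if elapsed > 0 else 0.0
    return {"tokens": tokens, "seconds": elapsed, "tokens_per_second": rate}


def summarize(speeds: List[float]) -> Dict[str, float]:
    """Collapse repeated runs into min / median / max to show the spread."""
    return {
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
    }


def fake_stream() -> Iterable[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    yield from ["The", " quick", " brown", " fox"]


if __name__ == "__main__":
    # A cron job would call this once per schedule tick and store the result.
    runs = [measure_throughput(fake_stream)["tokens_per_second"] for _ in range(3)]
    print(summarize(runs))
```

The clock is injectable so the timing logic can be checked without a real network call. Repeating the measurement over time is what produces a distribution rather than a single number.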

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking anywhere that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.

Fastest Models Right Now (updated <24h)

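| # | Model | Provider | Speed |
|---|-------|----------|-------|
| 1 | llama-3.1-8b | groq | 289 tok/s |
| 2 | qwen-3-32b | groq | 203 tok/s |
| 3 | llama-3.1-8b | cerebras | 193 tok/s |
| 4 | llama-4-scout | groq | 189 tok/s |
| 5 | llama-3.3-70b | groq | 174 tok/s |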
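
A ranking like the one above can be sketched as a freshness filter followed by a sort. The Python below is an assumption about the selection rule, not the site's actual code: `Result` and `fastest_recent` are hypothetical names, and the 24-hour cutoff is taken from the "updated <24h" label. The sample rows are copied from the full results table, with each "last run" age converted to hours.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Result:
    provider: str
    model: str
    tokens_per_second: float
    hours_since_run: float


def fastest_recent(
    results: List[Result], max_age_hours: float = 24.0, limit: int = 5
) -> List[Result]:
    """Keep results newer than max_age_hours, then return the fastest few."""
    fresh = [r for r in results if r.hours_since_run < max_age_hours]
    return sorted(fresh, key=lambda r: r.tokens_per_second, reverse=True)[:limit]


# Sample rows from the full results table (ages converted to hours).
RESULTS = [
    Result("groq", "llama-3.1-8b", 289.0, 2),
    Result("groq", "llama-4-maverick", 219.0, 25 * 24),  # last run 25d ago
    Result("groq", "qwen-3-32b", 203.0, 2),
    Result("cerebras", "llama-3.1-8b", 193.0, 2),
    Result("groq", "llama-4-scout", 189.0, 2),
    Result("cerebras", "gpt-oss-120b", 185.0, 22 * 24),  # last run 22d ago
    Result("groq", "llama-3.3-70b", 174.0, 2),
    Result("together", "llama-3.1-8b", 147.0, 20 * 24),  # last run 20d ago
]

if __name__ == "__main__":
    for rank, r in enumerate(fastest_recent(RESULTS), start=1):
        print(rank, r.model, r.provider, f"{r.tokens_per_second:.0f} tok/s")
```

Under this assumed rule, llama-4-maverick on groq (219 tok/s, last run 25d ago) and gpt-oss-120b on cerebras (185 tok/s, last run 22d ago) drop out despite their speeds, which is consistent with the five entries listed.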

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 94 of 94 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
| Provider | Model | Status | Last run | Speed (tok/s) | Other metrics (unsplit) |
|----------|-------|--------|----------|---------------|-------------------------|
| groq | llama-3.1-8b | Active | 2h ago | 289.00 | 130424100.00 |
| groq | llama-4-maverick | Active | 25d ago | 219.00 | 13021880.00 |
| groq | qwen-3-32b | Active | 2h ago | 203.00 | 11284200.00 |
| cerebras | llama-3.1-8b | Active | 2h ago | 193.00 | 3353810.00 |
| groq | llama-4-scout | Active | 2h ago | 189.00 | 7333280.00 |
| cerebras | gpt-oss-120b | Active | 22d ago | 185.00 | 13481710.00 |
| groq | llama-3.3-70b | Active | 2h ago | 174.00 | 40340180.00 |
| together | llama-3.1-8b | Active | 20d ago | 147.00 | 41228220.00 |
| groq | kimi-k2 | Active | 2h ago | 132.00 | 12211330.00 |
| bedrock | nova-micro | Active | 26m ago | 124.00 | 64152260.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 1d ago | 111.00 | 81690.00 |
| bedrock | llama-4-maverick | Active | 26m ago | 105.00 | 1145530.00 |
| openai | o3-mini-2025-01-31 | Active | 1d ago | 105.00 | 151600.00 |
| bedrock | nova-lite | Active | 26m ago | 98.50 | 20132300.00 |
| bedrock | llama-4-scout | Active | 26m ago | 98.00 | 3130310.00 |
| bedrock | llama-3.3-70b | Active | 26m ago | 93.90 | 2128310.00 |
| openai | GPT-5.4-nano | Active | 1d ago | 91.60 | 42134400.00 |
| openai | GPT-5.1-codex-max | Active | 1d ago | 91.50 | 141171120.00 |
| deepinfra | mistral-7b | Stale (Medium) | 2h ago | 87.90 | 10148500.00 |
| together | qwen-2.5-7b | Active | 2h ago | 87.80 | 1138530.00 |
| openai | GPT-5.4-nano-2026-03-17 | Active | 1d ago | 86.90 | 36125450.00 |
| openai | o1 | Active | 1d ago | 81.90 | 2114740.00 |
| openai | GPT-5.4-mini | Active | 1d ago | 76.60 | 16111480.00 |
| bedrock | nova-pro | Active | 26m ago | 76.50 | 19118390.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 2h ago | 76.10 | 9140540.00 |
| openai | gpt-4.1-nano | Active | 1d ago | 75.30 | 18139400.00 |
| google | gemini-2.5-flash-lite | Active | 2h ago | 74.90 | 18117520.00 |
| fireworks | mixtral-8x22b | Active | 2h ago | 74.50 | 28111330.00 |
| openai | GPT-5.4-mini-2026-03-17 | Active | 1d ago | 73.80 | 9119520.00 |
| openai | gpt-3.5-turbo | Active | 1d ago | 73.20 | 4125510.00 |
| together | deepseek-r1 | Active | 2h ago | 66.40 | 5113550.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 2h ago | 65.00 | 61051020.00 |
| together | llama-3.2-3b | Active | 28d ago | 60.10 | 121091350.00 |
| openai | gpt-4o | Active | 1d ago | 59.40 | 51421560.00 |
| together | mixtral-8x7b | Active | 2h ago | 59.00 | 8114190.00 |
| fireworks | llama-3.3-70b | Active | 2h ago | 57.30 | 11081260.00 |
| openai | gpt-4.1-mini | Active | 1d ago | 53.80 | 18109390.00 |
| openai | GPT-5-chat-latest | Active | 1d ago | 52.40 | 1383550.00 |
| openai | o4-mini-2025-04-16 | Active | 1d ago | 52.20 | 28770.00 |
| together | llama-3.3-70b | Active | 2h ago | 52.10 | 2121920.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 1d ago | 50.60 | 4770.00 |
| anthropic | claude-haiku-4.5 | Active | 2h ago | 48.10 | 373670.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 2h ago | 47.60 | 1841800.00 |
| bedrock | llama-3.2-90b | Active | 26m ago | 46.60 | 250380.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 2h ago | 45.40 | 1869320.00 |
| openai | gpt-4.1 | Active | 1d ago | 43.70 | 1585510.00 |
| bedrock | mistral-large | Active | 26m ago | 41.00 | 247520.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 2h ago | 40.50 | 7651520.00 |
| openai | o3-2025-04-16 | Active | 1d ago | 40.00 | 9710.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 2h ago | 39.20 | 3100750.00 |
| openai | gpt-4o-mini | Active | 1d ago | 39.10 | 764430.00 |
| bedrock | claude-haiku-4.5 | Active | 27m ago | 38.80 | 3651200.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 2h ago | 38.80 | 399750.00 |
| openai | o3 | Active | 1d ago | 38.80 | 9690.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 2h ago | 36.00 | 278750.00 |
| openai | GPT-5.1 | Active | 1d ago | 34.00 | 264940.00 |
| openai | GPT-5.1-2025-11-13 | Active | 1d ago | 33.80 | 1062840.00 |
| openai | gpt-4-turbo | Active | 9d ago | 33.10 | 149520.00 |
| bedrock | claude-3-5-haiku | Active | 27m ago | 32.50 | 542660.00 |
| bedrock | claude-3-5-sonnet | Active | 3d ago | 31.90 | 146810.00 |
| bedrock | claude-3-7-sonnet | Active | 27m ago | 31.90 | 242790.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 2h ago | 30.80 | 482820.00 |
| openai | GPT-5.4 | Active | 1d ago | 30.10 | 945760.00 |
| openai | GPT-5.4-2026-03-05 | Active | 1d ago | 30.10 | 842700.00 |
| openai | GPT-5.2 | Active | 1d ago | 29.70 | 447820.00 |
| openai | GPT-5.2-2025-12-11 | Active | 1d ago | 29.50 | 1643770.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 2h ago | 29.00 | 1462460.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 2h ago | 28.90 | 451550.00 |
| openai | GPT-5.1-codex | Active | 1d ago | 28.90 | 1521150.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 2h ago | 28.60 | 452620.00 |
| openai | GPT-5.1-chat-latest | Active | 1d ago | 28.40 | 352970.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 2h ago | 26.80 | 1811350.00 |
| openai | GPT-5.1-codex-mini | Active | 1d ago | 26.00 | 1511210.00 |
| openai | gpt-4 | Active | 1d ago | 25.80 | 446650.00 |
| openai | GPT-5.3-codex | Active | 1d ago | 25.70 | 740840.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 2h ago | 21.60 | 1391120.00 |
| anthropic | claude-opus-4.5 | Active | 2h ago | 21.50 | 4311540.00 |
| bedrock | claude-sonnet-4.5 | Active | 27m ago | 21.00 | 1291800.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 2h ago | 20.80 | 1421020.00 |
| anthropic | claude-4-sonnet | Active | 2h ago | 18.90 | 7321930.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 2h ago | 18.70 | 1432000.00 |
| bedrock | claude-opus-4.5 | Active | 27m ago | 18.10 | 1272460.00 |
| anthropic | Claude Opus 4.1 | Active | 2h ago | 17.90 | 7251450.00 |
| openai | gpt-5.2-codex | Active | 1d ago | 17.40 | 2371400.00 |
| anthropic | claude-4-opus | Active | 2h ago | 17.10 | 8241330.00 |
| openai | GPT-5.2-chat-latest | Active | 1d ago | 10.70 | 1271520.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 1d ago | 9.97 | 118730.00 |
| openai | GPT-5.2-pro | Active | 1d ago | 9.14 | 5144610.00 |
| openai | GPT-5-codex | Active | 1d ago | 8.13 | 1231940.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 2h ago | 7.96 | 1535490.00 |
| openai | o3-pro | Active | 1d ago | 7.55 | 115360.00 |
| openai | o3-pro-2025-06-10 | Active | 1d ago | 7.46 | 214470.00 |
| openai | GPT-5-pro | Active | 1d ago | 4.04 | 180.00 |
| openai | GPT-5.2-pro-2025-12-11 | Active | 1d ago | 1.90 | 148040.00 |

Lifecycle snapshot

📈 Time Series 📈