☁️ Cloud Benchmarks ☁️

I run cron jobs that periodically measure the token generation speed of different cloud LLM providers. The chart visualizes the distribution of measured speeds, which can vary with provider load. For readability, not all models are shown in the chart; the full results are in the table below.
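
As a rough illustration of what one such measurement might look like, the sketch below times a single generation call and divides the number of generated tokens by the elapsed wall-clock time, then summarizes repeated runs. The `generate` callable, the `measure_tokens_per_second` and `summarize` names, and the fake provider are all hypothetical stand-ins. This is not the benchmark's actual code, and a real run would call each provider's API instead.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_tokens_per_second(generate: Callable[[str], List[str]], prompt: str) -> float:
    """Time one generation call and return generated tokens per second.

    `generate` is a hypothetical stand-in for a provider call that returns
    the generated tokens; it is not part of any real SDK.
    """
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return len(tokens) / elapsed


def summarize(speeds: List[float]) -> Dict[str, float]:
    """Summarize repeated runs, since speeds vary with provider load."""
    return {
        "runs": len(speeds),
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
    }


def fake_generate(prompt: str) -> List[str]:
    """Fake provider used only to make the sketch runnable offline."""
    time.sleep(0.01)  # pretend the provider took about 10 ms
    return ["tok"] * 50


if __name__ == "__main__":
    speeds = [measure_tokens_per_second(fake_generate, "Hello") for _ in range(5)]
    print(summarize(speeds))
```

Collecting several runs per model, rather than a single number, is what makes a distribution chart possible in the first place.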

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking at any provider that does not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊

Fastest Models Right Now (updated <24h)

| # | Model | Provider | Speed |
|---|---|---|---|
| 1 | llama-3.1-8b | groq | 316 tok/s |
| 2 | qwen-3-32b | groq | 248 tok/s |
| 3 | llama-4-scout | groq | 215 tok/s |
| 4 | llama-3.1-8b | cerebras | 212 tok/s |
| 5 | llama-3.3-70b | groq | 206 tok/s |
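
The ranking above appears to combine a freshness filter with a sort by speed: in the full results, cerebras qwen-3-32b measured 249 tok/s but was last updated 9d ago, and it is absent here. A minimal sketch of that selection follows, assuming a 24-hour cutoff and age strings of the form "47m ago" or "9d ago" (the "h" unit is an assumption, since no hour-based ages appear on this page). The `parse_age` and `fastest_recent` helpers are hypothetical, and the rows are copied from the full results table.

```python
import re
from datetime import timedelta
from typing import List, Tuple

# (provider, model, age text, speed in tok/s), copied from the full results.
Row = Tuple[str, str, str, float]

_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_age(text: str) -> timedelta:
    """Convert an age string such as '47m ago' or '9d ago' to a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd]) ago", text.strip())
    if match is None:
        raise ValueError(f"unrecognized age: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def fastest_recent(rows: List[Row],
                   max_age: timedelta = timedelta(hours=24),
                   limit: int = 5) -> List[Row]:
    """Keep rows updated within max_age, then return the fastest `limit`."""
    fresh = [row for row in rows if parse_age(row[2]) < max_age]
    return sorted(fresh, key=lambda row: row[3], reverse=True)[:limit]


ROWS: List[Row] = [
    ("groq", "llama-3.1-8b", "47m ago", 316.0),
    ("cerebras", "qwen-3-32b", "9d ago", 249.0),
    ("groq", "qwen-3-32b", "47m ago", 248.0),
    ("groq", "llama-4-scout", "47m ago", 215.0),
    ("cerebras", "llama-3.1-8b", "50m ago", 212.0),
    ("groq", "llama-3.3-70b", "47m ago", 206.0),
    ("cerebras", "gpt-oss-120b", "50m ago", 197.0),
]

if __name__ == "__main__":
    for rank, (provider, model, _, speed) in enumerate(fastest_recent(ROWS), start=1):
        print(f"{rank} {model} {provider} {speed:.0f} tok/s")
```

With these rows, the stale cerebras qwen-3-32b entry is dropped and the remaining top five match the list above.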

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 83 of 83 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
| Provider | Model | Status | Updated | Speed (tok/s) | |
|---|---|---|---|---|---|
| groq | llama-3.1-8b | Active | 47m ago | 316.00 | 87471100.00 |
| cerebras | qwen-3-32b | Active | 9d ago | 249.00 | 4444330.00 |
| groq | qwen-3-32b | Active | 47m ago | 248.00 | 2391270.00 |
| groq | llama-4-scout | Active | 47m ago | 215.00 | 23335200.00 |
| cerebras | llama-3.1-8b | Active | 50m ago | 212.00 | 6350490.00 |
| groq | llama-3.3-70b | Active | 47m ago | 206.00 | 79322120.00 |
| cerebras | gpt-oss-120b | Active | 50m ago | 197.00 | 4346840.00 |
| cerebras | llama-3.3-70b | Active | 9d ago | 194.00 | 17338400.00 |
| groq | llama-4-maverick | Active | 47m ago | 165.00 | 25310500.00 |
| together | llama-3.1-8b | Active | 45m ago | 151.00 | 2232470.00 |
| groq | kimi-k2 | Active | 47m ago | 146.00 | 19212230.00 |
| cerebras | qwen-3-235b-instruct | Active | 22d ago | 138.00 | 52521180.00 |
| bedrock | nova-micro | Active | 33m ago | 123.00 | 69151270.00 |
| bedrock | llama-4-maverick | Active | 33m ago | 109.00 | 3142260.00 |
| openai | o3 Mini | Never Succeeded (Medium) | 47m ago | 109.00 | 211600.00 |
| bedrock | llama-4-scout | Active | 33m ago | 102.00 | 4132280.00 |
| bedrock | llama-3.3-70b | Active | 33m ago | 101.00 | 11137240.00 |
| bedrock | nova-lite | Active | 33m ago | 100.00 | 39131300.00 |
| together | qwen-2.5-7b | Active | 45m ago | 92.50 | 3145300.00 |
| bedrock | nova-pro | Active | 33m ago | 91.30 | 13124360.00 |
| openai | gpt-3.5-turbo | Active | 46m ago | 80.00 | 12124440.00 |
| google | | Active | 45m ago | 75.90 | 34132520.00 |
| together | mistral-7b | Active | 45m ago | 75.40 | 2157520.00 |
| together | llama-3.1-70b | Active | 45m ago | 72.50 | 7144390.00 |
| openai | GPT-5.1-codex-max | Active | 47m ago | 72.20 | 11181650.00 |
| openai | gpt-4.1-nano | Active | 47m ago | 69.60 | 9126420.00 |
| openai | gpt-4o | Active | 46m ago | 67.50 | 71731340.00 |
| google | gemini-2.5-flash | Never Succeeded (Medium) | 45m ago | 66.80 | 51011100.00 |
| fireworks | mixtral-8x22b | Active | 47m ago | 66.00 | 35112480.00 |
| together | llama-3.2-3b | Active | 45m ago | 63.50 | 51211320.00 |
| google | gemini-2.0-flash | Active | 21d ago | 60.10 | 1688590.00 |
| deepinfra | mixtral-8x22b | Stale (Medium) | 7d ago | 57.30 | 1479360.00 |
| deepinfra | mistral-7b | Stale (Medium) | 49m ago | 56.10 | 5136580.00 |
| google | gemini-2.0-flash-lite | Active | 21d ago | 56.10 | 1173790.00 |
| together | mixtral-8x7b | Active | 45m ago | 55.70 | 13114180.00 |
| deepinfra | llama-3.1-8b | Stale (Medium) | 47m ago | 55.20 | 7102400.00 |
| together | qwen-2.5-72b | Active | 19d ago | 53.60 | 471400.00 |
| together | llama-3.3-70b | Active | 46m ago | 53.40 | 21461360.00 |
| deepinfra | devstral-small | Never Succeeded (Medium) | 50m ago | 50.30 | 3131600.00 |
| openai | gpt-4.1-mini | Active | 47m ago | 49.80 | 1597490.00 |
| openai | o4 Mini | Never Succeeded (Medium) | 47m ago | 49.60 | 15760.00 |
| anthropic | claude-haiku-4.5 | Active | 50m ago | 49.20 | 1674650.00 |
| bedrock | llama-3.2-90b | Active | 33m ago | 47.40 | 651340.00 |
| fireworks | llama-3.3-70b | Active | 47m ago | 46.20 | 4941280.00 |
| deepinfra | llama-3-8b | Stale (Medium) | 47m ago | 44.80 | 771300.00 |
| bedrock | mistral-large | Active | 33m ago | 43.00 | 347300.00 |
| openai | gpt-4o-mini | Active | 46m ago | 42.60 | 895400.00 |
| bedrock | claude-haiku-4.5 | Active | 34m ago | 42.50 | 4631060.00 |
| together | deepseek-r1 | Active | 45m ago | 42.20 | 1691520.00 |
| google | gemini-2.5-pro | Never Succeeded (Medium) | 45m ago | 42.00 | 11721530.00 |
| deepinfra | llama-3.2-90b | Stale (Medium) | 49m ago | 36.80 | 388770.00 |
| openai | gpt-4.1 | Active | 47m ago | 36.60 | 682470.00 |
| deepinfra | llama-3.2-3b | Stale (Medium) | 48m ago | 33.80 | 190780.00 |
| deepinfra | llama-2-70b | Stale (Medium) | 47m ago | 33.60 | 357410.00 |
| bedrock | claude-3-7-sonnet | Active | 34m ago | 33.50 | 644730.00 |
| deepinfra | llama-3.2-1b | Stale (Medium) | 48m ago | 33.50 | 190630.00 |
| deepinfra | llama-3-70b | Stale (Medium) | 47m ago | 33.30 | 453540.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 50m ago | 33.20 | 1672240.00 |
| openai | gpt-4-turbo | Active | 46m ago | 32.50 | 252570.00 |
| deepinfra | qwen-2.5-72b | Stale (Medium) | 49m ago | 32.30 | 250740.00 |
| bedrock | claude-3-5-sonnet | Active | 34m ago | 31.60 | 244650.00 |
| openai | GPT-5.1 | Active | 47m ago | 30.20 | 456940.00 |
| bedrock | claude-3-5-haiku | Active | 34m ago | 29.50 | 138870.00 |
| openai | GPT-5.2 | Active | 47m ago | 26.80 | 940940.00 |
| openai | gpt-4 | Active | 46m ago | 26.50 | 247670.00 |
| openai | GPT-5.1-codex | Active | 47m ago | 25.60 | 3491310.00 |
| openai | GPT-5.1-codex-mini | Active | 47m ago | 23.80 | 1551330.00 |
| bedrock | claude-sonnet-4.5 | Active | 33m ago | 22.40 | 2291630.00 |
| deepinfra | llama-3.1-405b | Stale (Medium) | 48m ago | 22.40 | 1341210.00 |
| together | llama-3.1-405b | Active | 19d ago | 21.80 | 2291110.00 |
| deepinfra | llama-3.1-70b | Stale (Medium) | 48m ago | 21.30 | 346670.00 |
| deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 50m ago | 21.10 | 1491770.00 |
| anthropic | claude-opus-4.5 | Active | 50m ago | 19.60 | 6301790.00 |
| anthropic | claude-4-sonnet | Active | 50m ago | 19.60 | 1302000.00 |
| bedrock | claude-3-opus | Active | 6d ago | 19.00 | 422860.00 |
| bedrock | claude-opus-4.5 | Active | 34m ago | 18.20 | 4232010.00 |
| anthropic | Claude Opus 4.1 | Active | 50m ago | 18.10 | 8251450.00 |
| anthropic | claude-4-opus | Active | 50m ago | 17.80 | 9241250.00 |
| deepinfra | llama-3.2-11b | Stale (Medium) | 49m ago | 17.20 | 1622540.00 |
| deepinfra | qwen-3-235b | Never Succeeded (Medium) | 49m ago | 15.50 | 1383970.00 |
| openai | | Active | 47m ago | 12.70 | 2221850.00 |
| openai | o1-pro | Likely Deprecated (Medium) | 47m ago | 8.93 | 119580.00 |
| openai | GPT-5.2-pro | Active | 47m ago | 7.65 | 1135580.00 |
📈 Time Series 📈

Time series charts, one per model: llama-3.3-70b, llama-3.1-8b, claude-3-5-sonnet, claude-haiku-4.5, claude-opus-4.5, llama-3.1-70b, llama-3.2-3b, llama-3.2-90b, llama-4-maverick, llama-4-scout, mistral-7b, mixtral-8x22b, qwen-3-32b, Claude Opus 4.1, claude-3-5-haiku, claude-3-7-sonnet, claude-3-opus, claude-4-opus, claude-4-sonnet, claude-sonnet-4.5, deepseek-r1, devstral-small, gemini-2.5-flash, gemini-2.5-pro, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, GPT-5.1, GPT-5.1-codex, GPT-5.1-codex-max, GPT-5.1-codex-mini, GPT-5.2, GPT-5.2-pro, gpt-oss-120b, kimi-k2, llama-2-70b, llama-3-70b, llama-3-8b, llama-3.1-405b, llama-3.2-11b, llama-3.2-1b, mistral-large, mixtral-8x7b, nova-lite, nova-micro, nova-pro, o1-pro, o3 Mini, o4 Mini, Qwen 2.5 Coder 32B, qwen-2.5-72b, qwen-2.5-7b, qwen-3-235b.