I run cron jobs that periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of measured speeds for each model, since speeds vary with provider load. For readability, the chart omits some models; the full results are in the table below.
Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.
I am working daily to add more providers and models, and I include any provider that does not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have suggestions, let me know on GitHub! 😊
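
As a rough illustration of what one such speed measurement might look like, here is a minimal sketch in Python. It is not the code behind this site: `measure_stream` and `summarize` are hypothetical names, the token stream is assumed to be any iterable that yields one item per generated token, and speed is assumed to be total tokens divided by total elapsed time (including the wait for the first token).

```python
import statistics
import time
from typing import Callable, Dict, Iterable, Sequence


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Consume a token stream and time it.

    Assumption: `tokens` yields one item per generated token. The clock is
    injectable so the function can be exercised without a real provider.
    """
    start = clock()
    first_token_at = None
    count = 0
    for _ in tokens:
        now = clock()
        if first_token_at is None:
            first_token_at = now  # arrival time of the first token
        count += 1
    end = clock()

    if count == 0:
        raise ValueError("stream produced no tokens")

    elapsed = end - start
    return {
        "tokens": count,
        "ttft_s": first_token_at - start,  # time to first token, in seconds
        "tokens_per_s": count / elapsed if elapsed > 0 else float("inf"),
    }


def summarize(speeds: Sequence[float]) -> Dict[str, float]:
    """Describe the spread of repeated speed measurements (tokens per second)."""
    return {
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
    }
```

In practice, `tokens` would be the streaming response from a provider's client library, and a scheduled job would store each result so that `summarize` can describe the spread over many runs.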
| # | Model | Provider | Speed |
|---|---|---|---|
| 1 | llama-3.1-8b | groq | 268 tok/s |
| 2 | qwen-3-32b | cerebras | 251 tok/s |
| 3 | qwen-3-32b | groq | 239 tok/s |
| 4 | gpt-oss-120b | cerebras | 221 tok/s |
| 5 | llama-3.3-70b | cerebras | 221 tok/s |

| Provider | Model | Status | Last run | Speed (tok/s) | | | |
|---|---|---|---|---|---|---|---|
| groq | llama-3.1-8b | Active | 34m ago | 268.00 | 95 | 411 | 140.00 |
| cerebras | qwen-3-32b | Active | 35m ago | 251.00 | 4 | 369 | 440.00 |
| groq | qwen-3-32b | Active | 34m ago | 239.00 | 46 | 387 | 160.00 |
| cerebras | gpt-oss-120b | Active | 35m ago | 221.00 | 4 | 346 | 700.00 |
| cerebras | llama-3.3-70b | Active | 35m ago | 221.00 | 14 | 338 | 370.00 |
| cerebras | llama-3.1-8b | Active | 35m ago | 218.00 | 4 | 365 | 610.00 |
| groq | llama-3.3-70b | Active | 34m ago | 205.00 | 79 | 280 | 120.00 |
| groq | llama-4-scout | Active | 34m ago | 190.00 | 23 | 255 | 270.00 |
| groq | llama-4-maverick | Active | 34m ago | 170.00 | 19 | 310 | 500.00 |
| cerebras | qwen-3-235b-instruct | Active | 35m ago | 167.00 | 2 | 266 | 750.00 |
| together | llama-3.1-8b | Active | 31m ago | 146.00 | 2 | 230 | 480.00 |
| groq | kimi-k2 | Active | 34m ago | 146.00 | 21 | 203 | 220.00 |
| bedrock | nova-micro | Active | 21m ago | 129.00 | 69 | 154 | 250.00 |
| together | mistral-7b | Active | 32m ago | 119.00 | 3 | 166 | 330.00 |
| bedrock | llama-4-maverick | Active | 21m ago | 109.00 | 28 | 149 | 250.00 |
| openai | o3 Mini | Never Succeeded(Medium) | 34m ago | 105.00 | 20 | 151 | 0.00 |
| together | qwen-2.5-7b | Active | 32m ago | 103.00 | 3 | 146 | 260.00 |
| bedrock | nova-lite | Active | 21m ago | 102.00 | 42 | 135 | 290.00 |
| bedrock | llama-4-scout | Active | 21m ago | 102.00 | 1 | 140 | 350.00 |
| bedrock | llama-3.3-70b | Active | 21m ago | 101.00 | 9 | 136 | 230.00 |
| openai | gpt-4.1-nano | Active | 34m ago | 80.60 | 13 | 138 | 380.00 |
| bedrock | nova-pro | Active | 21m ago | 79.60 | 10 | 124 | 420.00 |
| openai | gpt-3.5-turbo | Active | 32m ago | 77.80 | 12 | 129 | 460.00 |
| together | llama-3.2-3b | Active | 31m ago | 72.30 | 5 | 145 | 1300.00 |
| together | llama-3.1-70b | Active | 31m ago | 71.60 | 4 | 147 | 530.00 |
| | claude-3-haiku | Active | 14d ago | 68.40 | 27 | 82 | 520.00 |
| openai | gpt-4o | Active | 32m ago | 64.50 | 9 | 151 | 1390.00 |
| | gemini-2.0-flash-lite | Active | 31m ago | 62.20 | 13 | 86 | 560.00 |
| | gemini-2.0-flash | Active | 31m ago | 61.90 | 8 | 88 | 570.00 |
| fireworks | mixtral-8x22b | Active | 34m ago | 60.30 | 37 | 74 | 590.00 |
| together | llama-3.3-70b | Active | 32m ago | 51.70 | 2 | 133 | 1410.00 |
| anthropic | claude-haiku-4.5 | Active | 35m ago | 50.80 | 15 | 80 | 630.00 |
| openai | o4 Mini | Never Succeeded(Medium) | 34m ago | 50.60 | 22 | 75 | 0.00 |
| together | qwen-2.5-72b | Active | 32m ago | 50.10 | 2 | 70 | 520.00 |
| together | mixtral-8x7b | Active | 32m ago | 48.40 | 6 | 111 | 450.00 |
| bedrock | llama-3.2-90b | Active | 21m ago | 47.40 | 23 | 51 | 350.00 |
| deepinfra | mixtral-8x22b | Stale(Medium) | 35m ago | 46.70 | 25 | 80 | 320.00 |
| openai | gpt-4.1-mini | Active | 34m ago | 46.10 | 18 | 69 | 450.00 |
| bedrock | claude-haiku-4.5 | Active | 21m ago | 46.10 | 8 | 65 | 860.00 |
| deepinfra | llama-3-8b | Stale(Medium) | 34m ago | 45.90 | 7 | 71 | 290.00 |
| deepinfra | mistral-7b | Stale(Medium) | 35m ago | 44.80 | 3 | 89 | 550.00 |
| bedrock | mistral-large | Active | 21m ago | 44.10 | 9 | 47 | 250.00 |
| deepinfra | devstral-small | Never Succeeded(Medium) | 35m ago | 43.20 | 4 | 85 | 460.00 |
| fireworks | llama-3.3-70b | Active | 34m ago | 41.90 | 6 | 76 | 1320.00 |
| deepinfra | llama-3.2-90b | Stale(Medium) | 35m ago | 40.40 | 1 | 93 | 840.00 |
| openai | gpt-4o-mini | Active | 32m ago | 36.70 | 8 | 78 | 440.00 |
| together | deepseek-r1 | Active | 32m ago | 36.10 | 1 | 67 | 1940.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded(Medium) | 35m ago | 35.00 | 1 | 67 | 4230.00 |
| | claude-3-5-sonnet | Active | 14d ago | 34.50 | 24 | 46 | 710.00 |
| openai | gpt-4.1 | Active | 34m ago | 34.20 | 7 | 73 | 460.00 |
| openai | gpt-4-turbo | Active | 32m ago | 34.20 | 2 | 49 | 540.00 |
| deepinfra | llama-2-70b | Stale(Medium) | 34m ago | 33.60 | 8 | 44 | 380.00 |
| deepinfra | llama-3-70b | Stale(Medium) | 34m ago | 32.90 | 2 | 44 | 590.00 |
| bedrock | claude-3-7-sonnet | Active | 21m ago | 32.80 | 1 | 42 | 800.00 |
| bedrock | claude-3-5-sonnet | Active | 21m ago | 32.50 | 5 | 42 | 580.00 |
| deepinfra | qwen-2.5-72b | Stale(Medium) | 35m ago | 32.00 | 2 | 46 | 810.00 |
| together | deepseek-v3 | Active | 10d ago | 30.50 | 1 | 67 | 1260.00 |
| deepinfra | llama-3.1-8b | Stale(Medium) | 34m ago | 30.50 | 1 | 102 | 1280.00 |
| bedrock | claude-3-5-haiku | Active | 22m ago | 29.00 | 1 | 38 | 1260.00 |
| openai | GPT-5.2 | Active | 34m ago | 27.90 | 6 | 44 | 990.00 |
| openai | GPT-5.1 | Active | 34m ago | 27.40 | 3 | 56 | 1060.00 |
| openai | gpt-4 | Active | 32m ago | 26.30 | 7 | 52 | 590.00 |
| openai | GPT-5.1-codex-max | Active | 1d ago | 25.80 | 1 | 70 | 1220.00 |
| openai | GPT-5.1-codex | Active | 34m ago | 25.10 | 4 | 49 | 690.00 |
| deepinfra | llama-3.2-3b | Stale(Medium) | 34m ago | 25.10 | 5 | 90 | 390.00 |
| deepinfra | llama-3.2-1b | Stale(Medium) | 34m ago | 24.90 | 7 | 83 | 370.00 |
| deepinfra | llama-3.3-70b | Never Succeeded(Medium) | 35m ago | 23.60 | 1 | 59 | 750.00 |
| anthropic | claude-3-opus | Active | 29d ago | 23.20 | 23 | 24 | 730.00 |
| deepinfra | qwen-3-235b | Never Succeeded(Medium) | 35m ago | 22.30 | 2 | 43 | 530.00 |
| together | llama-3.1-405b | Active | 31m ago | 21.70 | 1 | 30 | 1340.00 |
| deepinfra | llama-3.1-70b | Stale(Medium) | 34m ago | 21.60 | 3 | 43 | 450.00 |
| bedrock | claude-sonnet-4.5 | Active | 21m ago | 21.50 | 1 | 29 | 1840.00 |
| openai | GPT-5.1-codex-mini | Active | 34m ago | 20.20 | 1 | 45 | 800.00 |
| anthropic | claude-4-sonnet | Active | 35m ago | 20.10 | 8 | 32 | 1640.00 |
| anthropic | claude-opus-4.5 | Active | 35m ago | 19.60 | 11 | 30 | 1730.00 |
| bedrock | claude-3-opus | Active | 21m ago | 19.00 | 7 | 22 | 840.00 |
| anthropic | Claude Opus 4.1 | Active | 35m ago | 18.70 | 5 | 25 | 1350.00 |
| bedrock | claude-opus-4.5 | Active | 21m ago | 18.00 | 1 | 23 | 2310.00 |
| anthropic | claude-4-opus | Active | 35m ago | 17.80 | 4 | 24 | 1230.00 |
| deepinfra | llama-3.1-405b | Stale(Medium) | 34m ago | 15.80 | 1 | 31 | 3530.00 |
| deepinfra | llama-3.2-11b | Stale(Medium) | 34m ago | 12.30 | 1 | 68 | 2150.00 |
| | claude-3-opus | Active | 28d ago | 12.10 | 10 | 13 | 2050.00 |
| openai | o1-pro | Likely Deprecated(Medium) | 34m ago | 10.50 | 1 | 19 | 130.00 |
| openai | GPT-5.2-pro | Active | 6h ago | 1.83 | 1 | 4 | 5060.00 |