I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart helps visualize the distribution of speeds for each model, since they can vary somewhat depending on provider load. For readability, not all models are shown in the chart, but you can see the full results in the table below.
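As a rough illustration of what one such measurement could look like, here is a minimal sketch that times a stream of tokens and reports throughput. It is not the code behind this site: the function name `measure_stream`, the returned field names, and the choice to divide the token count by total elapsed time are all assumptions made for the example.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a token stream and return simple timing statistics.

    Hypothetical helper: `tokens` is any iterable that yields tokens as
    they arrive, and `clock` is injectable so the function can be tested.
    """
    start = clock()
    first_token_at: Optional[float] = None
    count = 0
    for _ in tokens:
        if first_token_at is None:
            # Record when the first token arrived.
            first_token_at = clock()
        count += 1
    end = clock()

    if count == 0:
        # Nothing was generated, so there is no speed to report.
        return {"tokens": 0, "ttft_s": None, "tokens_per_s": 0.0}

    elapsed = end - start
    # Assumption: speed is tokens divided by total elapsed time.
    tokens_per_s = count / elapsed if elapsed > 0 else 0.0
    return {
        "tokens": count,
        "ttft_s": first_token_at - start,
        "tokens_per_s": tokens_per_s,
    }


def simulated_stream(n: int, delay_s: float):
    """Stand-in for a provider's streaming response (illustration only)."""
    for i in range(n):
        time.sleep(delay_s)
        yield f"tok{i}"


if __name__ == "__main__":
    print(measure_stream(simulated_stream(20, 0.01)))
```

In a real run, the simulated stream would be replaced by the iterator a provider's streaming API returns, and a scheduler such as cron would call the script at a fixed interval and store each result.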
Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.
I am working daily to add more providers and models, looking anywhere that does not require purchasing dedicated endpoints for hosting, which is why some models may appear to be missing. If you have any more suggestions, let me know on GitHub! 😊
| # | Model | Provider | Speed |
|---|---|---|---|
| 1 | llama-3.1-8b | groq | 326 tok/s |
| 2 | qwen-3-32b | groq | 242 tok/s |
| 3 | llama-4-scout | groq | 215 tok/s |
| 4 | llama-3.1-8b | cerebras | 208 tok/s |
| 5 | llama-3.3-70b | groq | 203 tok/s |

| Provider | Model | Status | Last run | Speed (tok/s) | | | |
|---|---|---|---|---|---|---|---|
| groq | llama-3.1-8b | Active | 2h ago | 326.00 | 87 | 471 | 90.00 |
| cerebras | qwen-3-32b | Active | 13d ago | 248.00 | 4 | 444 | 370.00 |
| groq | qwen-3-32b | Active | 2h ago | 242.00 | 2 | 391 | 270.00 |
| groq | llama-4-scout | Active | 2h ago | 215.00 | 23 | 335 | 210.00 |
| cerebras | llama-3.1-8b | Active | 2h ago | 208.00 | 12 | 350 | 500.00 |
| groq | llama-3.3-70b | Active | 2h ago | 203.00 | 68 | 322 | 120.00 |
| cerebras | gpt-oss-120b | Active | 5h ago | 199.00 | 13 | 378 | 780.00 |
| cerebras | llama-3.3-70b | Active | 13d ago | 189.00 | 17 | 338 | 430.00 |
| groq | llama-4-maverick | Active | 2h ago | 179.00 | 12 | 310 | 450.00 |
| together | llama-3.1-8b | Active | 2h ago | 148.00 | 2 | 232 | 560.00 |
| groq | kimi-k2 | Active | 2h ago | 145.00 | 19 | 212 | 250.00 |
| bedrock | nova-micro | Active | 48m ago | 123.00 | 65 | 149 | 270.00 |
| cerebras | qwen-3-235b-instruct | Active | 26d ago | 116.00 | 19 | 252 | 880.00 |
| openai | o3 Mini | Never Succeeded(Medium) | 2h ago | 110.00 | 21 | 160 | 0.00 |
| bedrock | llama-4-maverick | Active | 48m ago | 108.00 | 3 | 139 | 260.00 |
| bedrock | llama-4-scout | Active | 48m ago | 102.00 | 4 | 132 | 270.00 |
| bedrock | nova-lite | Active | 48m ago | 101.00 | 39 | 130 | 300.00 |
| bedrock | llama-3.3-70b | Active | 48m ago | 99.80 | 3 | 137 | 260.00 |
| together | qwen-2.5-7b | Active | 2h ago | 92.50 | 11 | 145 | 230.00 |
| bedrock | nova-pro | Active | 48m ago | 92.20 | 13 | 124 | 350.00 |
| openai | gpt-3.5-turbo | Active | 2h ago | 79.50 | 13 | 126 | 460.00 |
| | | Active | 2h ago | 75.90 | 34 | 132 | 510.00 |
| openai | GPT-5.1-codex-max | Active | 2h ago | 75.40 | 2 | 118 | 1490.00 |
| together | llama-3.1-70b | Active | 4d ago | 72.90 | 7 | 144 | 380.00 |
| openai | gpt-4.1-nano | Active | 2h ago | 70.50 | 9 | 149 | 440.00 |
| together | mistral-7b | Active | 4d ago | 70.30 | 2 | 91 | 510.00 |
| openai | gpt-4o | Active | 2h ago | 69.00 | 7 | 173 | 1340.00 |
| fireworks | mixtral-8x22b | Active | 20h ago | 67.50 | 29 | 112 | 430.00 |
| | gemini-2.5-flash | Never Succeeded(Medium) | 2h ago | 67.00 | 5 | 101 | 1060.00 |
| together | llama-3.2-3b | Active | 2h ago | 61.20 | 5 | 121 | 1400.00 |
| deepinfra | mistral-7b | Stale(Medium) | 2h ago | 61.10 | 5 | 136 | 550.00 |
| | gemini-2.0-flash | Active | 26d ago | 60.90 | 29 | 88 | 560.00 |
| deepinfra | mixtral-8x22b | Stale(Medium) | 11d ago | 58.20 | 14 | 78 | 370.00 |
| together | mixtral-8x7b | Active | 2h ago | 56.80 | 13 | 114 | 180.00 |
| | gemini-2.0-flash-lite | Active | 26d ago | 56.70 | 11 | 73 | 800.00 |
| deepinfra | devstral-small | Never Succeeded(Medium) | 2h ago | 55.40 | 3 | 131 | 630.00 |
| deepinfra | llama-3.1-8b | Stale(Medium) | 2h ago | 54.80 | 7 | 100 | 430.00 |
| together | qwen-2.5-72b | Active | 23d ago | 54.00 | 4 | 71 | 450.00 |
| together | llama-3.3-70b | Active | 2h ago | 53.80 | 2 | 146 | 1230.00 |
| openai | gpt-4.1-mini | Active | 2h ago | 50.40 | 15 | 97 | 490.00 |
| anthropic | claude-haiku-4.5 | Active | 2h ago | 50.30 | 16 | 74 | 610.00 |
| openai | o4 Mini | Never Succeeded(Medium) | 2h ago | 49.30 | 14 | 76 | 0.00 |
| fireworks | llama-3.3-70b | Active | 2h ago | 49.00 | 4 | 94 | 1240.00 |
| bedrock | llama-3.2-90b | Active | 48m ago | 47.30 | 6 | 51 | 350.00 |
| deepinfra | llama-3-8b | Stale(Medium) | 2h ago | 44.50 | 7 | 71 | 310.00 |
| together | deepseek-r1 | Active | 2h ago | 43.70 | 1 | 69 | 870.00 |
| openai | gpt-4o-mini | Active | 2h ago | 43.20 | 8 | 95 | 380.00 |
| bedrock | mistral-large | Active | 48m ago | 42.60 | 3 | 47 | 320.00 |
| bedrock | claude-haiku-4.5 | Active | 48m ago | 42.20 | 4 | 63 | 1070.00 |
| | gemini-2.5-pro | Never Succeeded(Medium) | 2h ago | 41.30 | 6 | 72 | 1590.00 |
| openai | gpt-4.1 | Active | 2h ago | 37.80 | 6 | 82 | 480.00 |
| deepinfra | llama-2-70b | Stale(Medium) | 2h ago | 35.40 | 3 | 57 | 510.00 |
| deepinfra | llama-3.2-90b | Stale(Medium) | 14h ago | 35.00 | 3 | 76 | 840.00 |
| deepinfra | llama-3-70b | Stale(Medium) | 2h ago | 34.90 | 2 | 55 | 620.00 |
| deepinfra | llama-3.2-1b | Stale(Medium) | 2h ago | 33.60 | 1 | 90 | 640.00 |
| bedrock | claude-3-7-sonnet | Active | 48m ago | 33.40 | 5 | 44 | 730.00 |
| deepinfra | llama-3.2-3b | Stale(Medium) | 2h ago | 33.30 | 1 | 88 | 790.00 |
| deepinfra | Qwen 2.5 Coder 32B | Never Succeeded(Medium) | 2h ago | 32.70 | 1 | 67 | 2040.00 |
| deepinfra | qwen-2.5-72b | Stale(Medium) | 2h ago | 32.70 | 2 | 50 | 630.00 |
| openai | gpt-4-turbo | Active | 2h ago | 32.20 | 2 | 52 | 530.00 |
| bedrock | claude-3-5-sonnet | Active | 48m ago | 32.00 | 2 | 44 | 660.00 |
| bedrock | claude-3-5-haiku | Active | 48m ago | 30.20 | 1 | 38 | 790.00 |
| openai | GPT-5.1 | Active | 2h ago | 30.20 | 3 | 57 | 970.00 |
| openai | gpt-4 | Active | 2h ago | 27.20 | 2 | 47 | 660.00 |
| openai | GPT-5.2 | Active | 2h ago | 26.60 | 9 | 40 | 950.00 |
| openai | GPT-5.1-codex | Active | 2h ago | 25.60 | 3 | 49 | 1320.00 |
| openai | GPT-5.1-codex-mini | Active | 2h ago | 24.10 | 1 | 55 | 1290.00 |
| deepinfra | llama-3.1-405b | Stale(Medium) | 2h ago | 23.30 | 1 | 34 | 1130.00 |
| deepinfra | llama-3.1-70b | Stale(Medium) | 2h ago | 23.20 | 3 | 46 | 650.00 |
| bedrock | claude-sonnet-4.5 | Active | 48m ago | 22.40 | 5 | 29 | 1600.00 |
| together | llama-3.1-405b | Active | 24d ago | 21.30 | 6 | 29 | 970.00 |
| deepinfra | llama-3.3-70b | Never Succeeded(Medium) | 2h ago | 20.00 | 1 | 49 | 2040.00 |
| anthropic | claude-opus-4.5 | Active | 2h ago | 19.80 | 6 | 32 | 1790.00 |
| anthropic | claude-4-sonnet | Active | 2h ago | 19.40 | 1 | 30 | 2100.00 |
| bedrock | claude-3-opus | Active | 11d ago | 19.10 | 4 | 22 | 860.00 |
| bedrock | claude-opus-4.5 | Active | 48m ago | 18.30 | 4 | 24 | 1990.00 |
| anthropic | Claude Opus 4.1 | Active | 2h ago | 18.10 | 8 | 27 | 1470.00 |
| anthropic | claude-4-opus | Active | 2h ago | 17.70 | 5 | 24 | 1280.00 |
| deepinfra | llama-3.2-11b | Stale(Medium) | 2h ago | 16.40 | 1 | 62 | 2450.00 |
| deepinfra | qwen-3-235b | Never Succeeded(Medium) | 14h ago | 13.30 | 1 | 38 | 4570.00 |
| openai | | Active | 5h ago | 12.70 | 1 | 22 | 1880.00 |
| openai | o1-pro | Likely Deprecated(Medium) | 2h ago | 8.95 | 1 | 18 | 650.00 |
| openai | GPT-5.2-pro | Active | 2h ago | 8.06 | 1 | 13 | 5290.00 |