Provider Snapshot
Models tracked: 19
Average tokens/second: 49.23
Average TTF (ms): 0.00
Jun 25, 2026
Key Takeaways
19 together models are actively benchmarked with 1029 total measurements across 991 benchmark runs.
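qwen-2-1.5b-instruct leads the fleet at an average of 124.00 tokens/second; qwen-2.5-7b, the sixth-fastest model, averages 66.50 tok/s.
The leader is about 86.5% faster than qwen-2.5-7b, and averages across the full together lineup range from 3.18 tok/s (Llama-Guard-4-12B) to 124.00 tok/s, which may reflect different optimization targets for different use cases.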
Average throughput across the together model fleet has a coefficient of variation of 53.8% (standard deviation divided by the 49.23 tok/s mean), consistent with a lineup of diverse model architectures.
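The headline figures can be re-derived from the Avg Toks/Sec column of the All Models table. The sketch below is only an illustration: the variable names are hypothetical, and the use of the population standard deviation is an assumption, chosen because it reproduces the published 53.8% figure.

```python
from statistics import fmean, pstdev

# Avg Toks/Sec values copied from the All Models table on this page.
avg_tok_s = {
    "Qwen2.5-7B-Instruct-Turbo": 67.00,
    "qwen-2-1.5b-instruct": 124.00,
    "DeepSeek-V4-Pro": 24.60,
    "gemma-3n-e4b-it": 38.00,
    "llama-3.3-70b": 35.80,
    "Llama-Guard-4-12B": 3.18,
    "Meta-Llama-3-8B-Instruct-Lite": 67.90,
    "MiniMax-M3": 64.50,
    "Kimi-K2.6": 32.70,
    "Kimi-K2.7-Code": 47.70,
    "nemotron-3-ultra-550b-a55b": 76.10,
    "GPT-oss-120b": 35.70,
    "GPT-oss-20b": 39.90,
    "gemma-4-31b-it": 18.70,
    "qwen-2.5-7b": 66.50,
    "Qwen3-235B-A22B-Instruct-2507-FP8": 23.10,
    "GLM-5": 57.50,
    "GLM-5.1": 41.00,
    "GLM-5.2": 71.50,
}

values = list(avg_tok_s.values())
mean = fmean(values)

# Coefficient of variation = standard deviation / mean.
# Assumption: population standard deviation (pstdev), not the sample version.
cv = pstdev(values) / mean

# Relative gap between the leader and qwen-2.5-7b, the pair named in the takeaways.
fastest = avg_tok_s["qwen-2-1.5b-instruct"]
reference = avg_tok_s["qwen-2.5-7b"]
gap = (fastest - reference) / reference

print(f"models tracked: {len(values)}")          # 19
print(f"mean tok/s: {mean:.2f}")                 # 49.23
print(f"coefficient of variation: {cv:.1%}")     # 53.8%
print(f"leader vs qwen-2.5-7b: {gap:.1%}")       # 86.5%
```

Using the sample standard deviation instead would give a slightly higher coefficient, so the choice matters when comparing against the published number.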
Fastest Models
| Provider | Model | Avg Toks/Sec | Min | Max | Avg TTF (ms) |
|---|---|---|---|---|---|
| together | qwen-2-1.5b-instruct | 124.00 | 59.00 | 148.00 | 0.00 |
| together | nemotron-3-ultra-550b-a55b | 76.10 | 23.40 | 123.00 | 0.00 |
| together | GLM-5.2 | 71.50 | 2.08 | 158.00 | 0.00 |
| together | Meta-Llama-3-8B-Instruct-Lite | 67.90 | 7.22 | 114.00 | 0.00 |
| together | Qwen2.5-7B-Instruct-Turbo | 67.00 | 27.30 | 83.20 | 0.00 |
| together | qwen-2.5-7b | 66.50 | 15.30 | 90.70 | 0.00 |
All Models
Complete list of all together models tracked in the benchmark system.
| Provider | Model | Avg Toks/Sec | Min | Max | Avg TTF (ms) |
|---|---|---|---|---|---|
| together | Qwen2.5-7B-Instruct-Turbo | 67.00 | 27.30 | 83.20 | 0.00 |
| together | qwen-2-1.5b-instruct | 124.00 | 59.00 | 148.00 | 0.00 |
| together | DeepSeek-V4-Pro | 24.60 | 2.63 | 54.50 | 0.00 |
| together | gemma-3n-e4b-it | 38.00 | 10.40 | 59.30 | 0.00 |
| together | llama-3.3-70b | 35.80 | 1.25 | 67.40 | 0.00 |
| together | Llama-Guard-4-12B | 3.18 | 1.89 | 3.73 | 0.00 |
| together | Meta-Llama-3-8B-Instruct-Lite | 67.90 | 7.22 | 114.00 | 0.00 |
| together | MiniMax-M3 | 64.50 | 31.40 | 110.00 | 0.00 |
| together | Kimi-K2.6 | 32.70 | 2.31 | 90.00 | 0.00 |
| together | Kimi-K2.7-Code | 47.70 | 0.96 | 130.00 | 0.00 |
| together | nemotron-3-ultra-550b-a55b | 76.10 | 23.40 | 123.00 | 0.00 |
| together | GPT-oss-120b | 35.70 | 5.73 | 82.60 | 0.00 |
| together | GPT-oss-20b | 39.90 | 3.23 | 86.10 | 0.00 |
| together | gemma-4-31b-it | 18.70 | 8.51 | 40.30 | 0.00 |
| together | qwen-2.5-7b | 66.50 | 15.30 | 90.70 | 0.00 |
| together | Qwen3-235B-A22B-Instruct-2507-FP8 | 23.10 | 2.72 | 66.00 | 0.00 |
| together | GLM-5 | 57.50 | 10.40 | 100.00 | 0.00 |
| together | GLM-5.1 | 41.00 | 3.07 | 71.40 | 0.00 |
| together | GLM-5.2 | 71.50 | 2.08 | 158.00 | 0.00 |
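The Fastest Models table appears to be the six rows of this list with the highest Avg Toks/Sec. As a rough sketch of how that ranking could be reproduced from the pipe-delimited rows, the following uses hypothetical helper names (`parse_rows`, `fastest`) that are not part of the benchmark system.

```python
# Rows copied verbatim from the All Models table on this page.
TABLE = """\
| together | Qwen2.5-7B-Instruct-Turbo | 67.00 | 27.30 | 83.20 | 0.00 |
| together | qwen-2-1.5b-instruct | 124.00 | 59.00 | 148.00 | 0.00 |
| together | DeepSeek-V4-Pro | 24.60 | 2.63 | 54.50 | 0.00 |
| together | gemma-3n-e4b-it | 38.00 | 10.40 | 59.30 | 0.00 |
| together | llama-3.3-70b | 35.80 | 1.25 | 67.40 | 0.00 |
| together | Llama-Guard-4-12B | 3.18 | 1.89 | 3.73 | 0.00 |
| together | Meta-Llama-3-8B-Instruct-Lite | 67.90 | 7.22 | 114.00 | 0.00 |
| together | MiniMax-M3 | 64.50 | 31.40 | 110.00 | 0.00 |
| together | Kimi-K2.6 | 32.70 | 2.31 | 90.00 | 0.00 |
| together | Kimi-K2.7-Code | 47.70 | 0.96 | 130.00 | 0.00 |
| together | nemotron-3-ultra-550b-a55b | 76.10 | 23.40 | 123.00 | 0.00 |
| together | GPT-oss-120b | 35.70 | 5.73 | 82.60 | 0.00 |
| together | GPT-oss-20b | 39.90 | 3.23 | 86.10 | 0.00 |
| together | gemma-4-31b-it | 18.70 | 8.51 | 40.30 | 0.00 |
| together | qwen-2.5-7b | 66.50 | 15.30 | 90.70 | 0.00 |
| together | Qwen3-235B-A22B-Instruct-2507-FP8 | 23.10 | 2.72 | 66.00 | 0.00 |
| together | GLM-5 | 57.50 | 10.40 | 100.00 | 0.00 |
| together | GLM-5.1 | 41.00 | 3.07 | 71.40 | 0.00 |
| together | GLM-5.2 | 71.50 | 2.08 | 158.00 | 0.00 |
"""


def parse_rows(text):
    """Turn pipe-delimited table rows into dictionaries with numeric fields."""
    rows = []
    for line in text.strip().splitlines():
        # Drop the outer pipes, then split the remaining cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        provider, model, avg, low, high, ttf = cells
        rows.append({
            "provider": provider,
            "model": model,
            "avg": float(avg),
            "min": float(low),
            "max": float(high),
            "ttf_ms": float(ttf),
        })
    return rows


def fastest(rows, n=6):
    """Return the n rows with the highest average tokens/second."""
    return sorted(rows, key=lambda row: row["avg"], reverse=True)[:n]


rows = parse_rows(TABLE)
top = [row["model"] for row in fastest(rows)]
for rank, name in enumerate(top, start=1):
    print(rank, name)
```

Sorting on the average alone ignores the Min and Max columns, so a model with a wide spread can still rank highly.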
Frequently Asked Questions
Based on recent tests, qwen-2-1.5b-instruct shows the highest average throughput among tracked together models.
This provider summary aggregates 1029 individual prompts measured across 991 monitoring runs over the past month.