Benchmark Overview
Average throughput: 22.70 tokens/second
Average time to first token: 690.00 ms
Benchmark runs: 8
Latest run: Feb 6, 2026, 06:04 PM
Key Insights
llama-3.1-405b streams at 22.70 tokens/second on average across the last 8 benchmark runs.
Throughput ranged from 13.00 to 27.60 tokens/second, a spread of 14.60 tokens/second (64.3% of the mean), indicating variable behavior across benchmark runs.
Average time to first token is 690.00 ms (good latency), suitable for latency-sensitive workloads.
Latest measurements completed on Feb 6, 2026, 06:04 PM based on 8 total samples.
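The variability figure can be reproduced from the reported minimum, maximum, and mean. The following is a minimal sketch, assuming the 64.3% is the min-to-max spread divided by the mean (the arithmetic matches). A coefficient of variation in the statistical sense (standard deviation over mean) would need the individual run values, which are not listed on this page. The function names are illustrative, not part of the benchmark tooling.

```python
from statistics import mean, pstdev


def spread_stats(min_tps, max_tps, avg_tps):
    """Return the min-to-max spread and that spread as a percentage of the mean."""
    spread = max_tps - min_tps
    return spread, spread / avg_tps * 100


def coefficient_of_variation(samples):
    """Population standard deviation over mean, in percent.

    Needs the per-run samples, which this page does not publish.
    """
    return pstdev(samples) / mean(samples) * 100


# Values taken from the Benchmark Samples table.
spread, pct = spread_stats(13.00, 27.60, 22.70)
print(f"{spread:.2f} tokens/second, {pct:.1f}% of the mean")
# prints: 14.60 tokens/second, 64.3% of the mean
```

Spread over mean is sensitive to a single slow or fast run, so with only 8 samples it tends to overstate typical run-to-run variation compared with a standard-deviation-based measure.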
Performance Distribution
Distribution of throughput measurements showing performance consistency across benchmark runs.
Performance Over Time
Historical performance trends showing how throughput has changed over the benchmarking period (series: llama-3.1-405b).
Benchmark Samples
| Provider | Model | Avg Toks/Sec | Min Toks/Sec | Max Toks/Sec | Avg TTFT (ms) |
|---|---|---|---|---|---|
| together | llama-3.1-405b | 22.70 | 13.00 | 27.60 | 690.00 |
Frequently Asked Questions
The latest rolling average throughput for llama-3.1-405b on together is 22.70 tokens per second, with an average time to first token of 690.00 ms across the 8 most recent runs.
Benchmarks refresh automatically whenever the monitoring cron runs. The most recent run completed on Feb 6, 2026, 06:04 PM.
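A rolling average over the most recent runs can be sketched with a fixed-size window that drops the oldest sample whenever a new one arrives. This is only an illustration of the idea: the `RollingBenchmark` class, its field names, and the sample values are hypothetical, and the dashboard's actual implementation is not shown on this page.

```python
from collections import deque


class RollingBenchmark:
    """Keeps the last `window` runs and summarizes them (hypothetical helper)."""

    def __init__(self, window=8):
        # deque(maxlen=...) discards the oldest entry automatically when full.
        self.tps = deque(maxlen=window)
        self.ttft_ms = deque(maxlen=window)

    def record(self, tokens_per_second, ttft_ms):
        self.tps.append(tokens_per_second)
        self.ttft_ms.append(ttft_ms)

    def summary(self):
        if not self.tps:
            raise ValueError("no samples recorded")
        n = len(self.tps)
        return {
            "runs": n,
            "avg_tps": sum(self.tps) / n,
            "min_tps": min(self.tps),
            "max_tps": max(self.tps),
            "avg_ttft_ms": sum(self.ttft_ms) / n,
        }


# Made-up samples for illustration only; these are not the real benchmark runs.
bench = RollingBenchmark(window=3)
for tps, ttft in [(10.0, 900.0), (20.0, 700.0), (30.0, 600.0), (40.0, 500.0)]:
    bench.record(tps, ttft)

print(bench.summary())  # the first sample has already fallen out of the window
```

With a window of 8, each cron-triggered run would replace the oldest of the 8 samples, so the reported averages shift gradually rather than resetting.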