Groq LLM Benchmarks – Performance & Latency

Provider Snapshot

Models Tracked: 6
Avg Tokens / Second: 182.00
Avg Time to First Token (ms): 0.00
Last Updated: Jun 25, 2026

Six Groq models are actively benchmarked, with 1730 total measurements across 1330 benchmark runs.
GPT-oss-safeguard-20b leads the fleet at an average of 300.00 tokens/second, while llama-4-scout is the slowest at 127.00 tokens/second.
Average throughput differs by 136.2% between the slowest and fastest models in the Groq lineup, that is, (300.00 - 127.00) / 127.00.
Across the six per-model averages, the coefficient of variation (standard deviation divided by the mean) is 32.9%, so throughput depends strongly on which model is chosen. The spread likely reflects differences in model size and architecture.
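
The summary figures can be reproduced from the per-model averages listed on this page. The sketch below is a minimal Python check. It assumes the 136.2% figure is the fastest model's lead over the slowest, and that the 32.9% figure uses the population standard deviation. The page states neither formula, so both are inferences from the numbers, and the `summarize` helper is a hypothetical name used only for illustration.

```python
from statistics import mean, pstdev

# Average tokens/second per model, copied from the table on this page.
avg_tps = {
    "GPT-oss-safeguard-20b": 300.00,
    "llama-3.1-8b": 216.00,
    "qwen-3-32b": 158.00,
    "llama-3.3-70b": 154.00,
    "qwen3.6-27b": 137.00,
    "llama-4-scout": 127.00,
}


def summarize(values):
    """Return (mean, spread_pct, cv_pct) for a collection of throughputs.

    spread_pct: how much faster the fastest value is than the slowest.
    cv_pct: population standard deviation as a percentage of the mean
            (assumed formula; a sample standard deviation gives about 36%).
    """
    vals = list(values)
    avg = mean(vals)
    spread_pct = (max(vals) - min(vals)) / min(vals) * 100
    cv_pct = pstdev(vals) / avg * 100
    return avg, spread_pct, cv_pct


avg, spread_pct, cv_pct = summarize(avg_tps.values())
print(f"mean={avg:.2f} spread={spread_pct:.1f}% cv={cv_pct:.1f}%")
# prints: mean=182.00 spread=136.2% cv=32.9%
```

All three values match the page, which suggests the headline average is an unweighted mean of the six per-model averages rather than a mean over all 1730 measurements.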

Provider	Model	Avg Tokens/Sec	Min Tokens/Sec	Max Tokens/Sec
groq	GPT-oss-safeguard-20b	300.00	35.20	703.00
groq	llama-3.1-8b	216.00	12.10	336.00
groq	qwen-3-32b	158.00	25.40	233.00
groq	llama-3.3-70b	154.00	79.70	275.00
groq	qwen3.6-27b	137.00	31.50	321.00
groq	llama-4-scout	127.00	18.50	245.00
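
The Min and Max columns show how far individual measurements stray from each average. As a rough sketch, assuming the rows are available as tab-separated text in the column order used on this page, the following Python ranks models by the ratio of their fastest to slowest measurement. The `load_rows` helper and the max/min ratio are illustrative choices, not part of the benchmark system.

```python
import csv
import io

# Data rows copied from the table above (tab-separated, header omitted).
# Column order: provider, model, avg, min, max (all tokens/second).
ROWS_TSV = (
    "groq\tGPT-oss-safeguard-20b\t300.00\t35.20\t703.00\n"
    "groq\tllama-3.1-8b\t216.00\t12.10\t336.00\n"
    "groq\tqwen-3-32b\t158.00\t25.40\t233.00\n"
    "groq\tllama-3.3-70b\t154.00\t79.70\t275.00\n"
    "groq\tqwen3.6-27b\t137.00\t31.50\t321.00\n"
    "groq\tllama-4-scout\t127.00\t18.50\t245.00\n"
)


def load_rows(text):
    """Parse tab-separated rows into dicts with numeric throughput fields."""
    rows = []
    for provider, model, avg, lo, hi in csv.reader(io.StringIO(text), delimiter="\t"):
        rows.append({
            "provider": provider,
            "model": model,
            "avg": float(avg),
            "min": float(lo),
            "max": float(hi),
        })
    return rows


rows = load_rows(ROWS_TSV)

# A larger max/min ratio means less consistent throughput across measurements.
by_ratio = sorted(rows, key=lambda r: r["max"] / r["min"], reverse=True)
for r in by_ratio:
    print(f'{r["model"]:<24} max/min = {r["max"] / r["min"]:.1f}x')
```

By this measure llama-3.1-8b has the widest range (roughly 28x between its slowest and fastest measurement) and llama-3.3-70b the narrowest (about 3.5x), so a high average does not by itself imply consistent throughput.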
