The end-to-end latency benchmark for agents.
Most published inference benchmarks measure single-call performance. We measure what actually predicts agent UX: TTFT, sustained throughput, and end-to-end task latency under concurrency.
Synaptix vs. major providers.
Concurrency: 16. Hardware: provider default. Where a model is unavailable on a provider, the cell shows "—". All numbers are medians across 100 runs from us-east. Reproduce with the harness in our methodology pack; a minimal probe sketch follows the table.
| Model | Metric | Synaptix | Bedrock | Vertex | Together | Fireworks |
|---|---|---|---|---|---|---|
| gpt-oss-120B | TTFT (ms) | 182 | — | — | 318 | 291 |
| gpt-oss-120B | Tokens/sec (single) | 412 | — | — | 248 | 276 |
| Llama 4 405B | TTFT (ms) | 240 | 612 | 598 | 445 | 402 |
| Llama 4 405B | p95 e2e task (s) | 3.1 | 9.4 | 8.8 | 5.7 | 5.2 |
| Kimi-K2.5 (long ctx 1M) | TTFT (ms) | 388 | — | — | 972 | — |
| Qwen3-Coder | Tokens/sec (single) | 528 | — | — | 342 | 381 |
| Mixed agent workload | Cost / 1k tasks (USD) | $2.40 | $8.10 | $7.65 | $4.20 | $3.95 |
Methodology, harness scripts, and raw runs are available under NDA.
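The full harness is part of the methodology pack; the fragment below is only a minimal sketch of the probe shape, assuming an OpenAI-compatible streaming endpoint. `ENDPOINT`, `MODEL`, and the `API_KEY` environment variable are placeholders, and it approximates one SSE chunk as one token, which the real harness does not.

```python
import asyncio
import json
import os
import statistics
import time

import aiohttp

ENDPOINT = "https://api.example.com/v1/chat/completions"  # placeholder URL
MODEL = "gpt-oss-120b"                                    # placeholder model id
CONCURRENCY = 16   # matches the concurrency used for the table
RUNS = 100         # medians are taken across 100 runs

async def one_run(session: aiohttp.ClientSession, sem: asyncio.Semaphore):
    """Stream one completion; return (ttft_ms, tokens_per_sec)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": "Summarize HTTP/1.1 in one sentence."}],
        "stream": True,
        "max_tokens": 256,
    }
    headers = {"Authorization": f"Bearer {os.environ['API_KEY']}"}
    async with sem:
        start = time.perf_counter()
        first = None
        chunks = 0
        async with session.post(ENDPOINT, json=payload, headers=headers) as resp:
            async for raw in resp.content:  # aiohttp yields the SSE stream line by line
                line = raw.decode().strip()
                if not line.startswith("data: ") or line == "data: [DONE]":
                    continue
                delta = json.loads(line[len("data: "):])["choices"][0]["delta"]
                if delta.get("content"):
                    if first is None:
                        first = time.perf_counter()  # first content chunk = TTFT
                    chunks += 1  # approximation: one streamed chunk ~ one token
        end = time.perf_counter()
        if first is None:  # empty stream; drop this run from the medians
            return None
        return (first - start) * 1000.0, chunks / max(end - first, 1e-9)

async def main():
    sem = asyncio.Semaphore(CONCURRENCY)
    async with aiohttp.ClientSession() as session:
        runs = await asyncio.gather(*(one_run(session, sem) for _ in range(RUNS)))
    runs = [r for r in runs if r is not None]
    print(f"median TTFT: {statistics.median(r[0] for r in runs):.0f} ms")
    print(f"median tokens/sec: {statistics.median(r[1] for r in runs):.0f}")

if __name__ == "__main__":
    asyncio.run(main())
```

The semaphore caps in-flight requests at 16, so TTFT is measured under the same contention the table reports rather than on an idle connection.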
**TTFT (ms).** Time to first token. Drives perceived responsiveness.
**Tokens/sec.** Sustained tokens per second under concurrency. Drives capacity.
**p95 e2e task (s).** Wall-clock task completion at the 95th percentile. Drives UX.
**Cost / 1k tasks (USD).** Total tokens × prices, plus tool spend, attributed per completed task.
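As a sketch of how the last two roll-ups can be computed from per-task records: the `TaskRun` fields and the per-token prices below are illustrative assumptions, not our billing model or any provider's rates.

```python
import math
import statistics
from dataclasses import dataclass

# Illustrative per-token prices; real attribution uses each provider's
# published rates plus metered tool spend.
PRICE_IN_USD = 0.50 / 1_000_000    # per prompt token (assumed)
PRICE_OUT_USD = 1.50 / 1_000_000   # per completion token (assumed)

@dataclass
class TaskRun:
    completed: bool
    e2e_seconds: float       # wall clock from task start to final answer
    prompt_tokens: int
    completion_tokens: int
    tool_cost_usd: float     # search, code exec, etc., metered per call

def p95(values):
    """95th percentile via the nearest-rank method."""
    ordered = sorted(values)
    return ordered[math.ceil(0.95 * len(ordered)) - 1]

def rollup(runs: list[TaskRun]) -> dict:
    done = [r for r in runs if r.completed]  # cost is attributed per *completed* task
    total_cost = sum(
        r.prompt_tokens * PRICE_IN_USD
        + r.completion_tokens * PRICE_OUT_USD
        + r.tool_cost_usd
        for r in done
    )
    return {
        "p95_e2e_s": p95(r.e2e_seconds for r in done),
        "cost_per_1k_tasks_usd": 1000 * total_cost / len(done),
    }
```

Attributing cost only to completed tasks means retries and abandoned runs inflate the per-task figure, which is the behavior the table's cost row reflects.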
Want to see your workload benchmarked?
We'll run your representative agent against the major providers and share the raw numbers.