OpenAI compatible API. Attested gateway. Public status.

Venice

Venice models on TrustedRouter with prices, routes, policy notes, and source links.

Verify gateway

1 URLbase_url migration

100smodels and routes

0prompt logs by default

`venice`

Confidential

All providers

Provider	Venice
Models	11 public models
Prepaid routes	11
BYOK routes	11
Zero data retention	yes
Confidential compute	yes
Provider E2EE	yes
Policy note	Tracked as confidential — Venice documents no logging or storage of prompts/responses plus TEE-isolated, end-to-end-encrypted inference. (Caveat: requests Venice proxies to external frontier models inherit those providers' policies; TR routes Venice-native open models here.) Policy source

Measured performance

308 samples

Continuously sampled across Venice's routed models — p50 TTFT, throughput, and success rate. Unsupported route and probe-configuration rows are separated from provider downtime. No prompt or output content stored.

p50 TTFT	1198 ms
Throughput	—
Uptime	57.47%

Model	p50 TTFT	p50 TTFB	Throughput	Uptime	Config excluded	Samples
qwen/qwen3-235b-a22b-thinking-2507	992 ms	911 ms	—	72.00%	—	25
qwen/qwen3.5-9b	1073 ms	991 ms	—	67.86%	—	28
qwen/qwen3.6-27b	1131 ms	1028 ms	—	52.00%	—	25
z-ai/glm-4.7-flash	1138 ms	1055 ms	—	65.38%	—	26
z-ai/glm-5.1	1198 ms	1107 ms	—	62.50%	—	40
z-ai/glm-4.7	1281 ms	1261 ms	—	77.78%	—	27
z-ai/glm-5	1289 ms	1288 ms	—	62.07%	—	29
z-ai/glm-4.6	1334 ms	1330 ms	—	73.91%	—	23
qwen/qwen3.5-397b-a17b	1675 ms	1598 ms	—	58.33%	—	24
z-ai/glm-5-turbo	1914 ms	1835 ms	—	68.18%	—	22
z-ai/glm-5v-turbo	—	—	—	0.00%	—	39

Full provider & model leaderboard.

Provider models

Models served by Venice.

Each row links to pricing, provider, benchmark, and API pages for the model.

Model	Context	Endpoints	Prompt	Completion	Routes
`qwen/qwen3-235b-a22b-thinking-2507` Qwen: Qwen3 235B A22B Thinking 2507 benchmarks performance api	262,144	2	$0.495/1M	$3.85/1M	prepaid BYOK
`qwen/qwen3.5-397b-a17b` Qwen: Qwen3.5 397B A17B benchmarks performance api	262,144	2	$0.825/1M	$4.95/1M	prepaid BYOK
`qwen/qwen3.5-9b` Qwen: Qwen3.5-9B benchmarks performance api	262,144	2	$0.11/1M	$0.165/1M	prepaid BYOK
`qwen/qwen3.6-27b` Qwen: Qwen3.6 27B benchmarks performance api	262,144	2	$0.363/1M	$3.575/1M	prepaid BYOK
`z-ai/glm-4.6` Z.ai: GLM 4.6 benchmarks performance api	202,752	2	$0.935/1M	$3.025/1M	prepaid BYOK
`z-ai/glm-4.7` Z.ai: GLM 4.7 benchmarks performance api	202,752	2	$0.605/1M	$2.915/1M	prepaid BYOK
`z-ai/glm-4.7-flash` Z.ai: GLM 4.7 Flash benchmarks performance api	202,752	2	$0.143/1M	$0.55/1M	prepaid BYOK
`z-ai/glm-5` Z.ai: GLM 5 benchmarks performance api	204,800	2	$1.1/1M	$3.52/1M	prepaid BYOK
`z-ai/glm-5-turbo` Z.ai: GLM 5 Turbo benchmarks performance api	202,752	2	$1.32/1M	$4.4/1M	prepaid BYOK
`z-ai/glm-5.1` Z.ai: GLM 5.1 benchmarks performance api	202,752	2	$1.925/1M	$6.05/1M	prepaid BYOK
`z-ai/glm-5v-turbo` Z.ai: GLM 5V Turbo benchmarks performance api	202,752	2	$1.65/1M	$5.5/1M	prepaid BYOK