Measured performance

Provider & model performance

Measured time-to-first-token, time-to-first-byte, throughput, and uptime for every LLM provider and model TrustedRouter routes to — continuously sampled, not vendor-claimed.

Last updated 2026-06-19T08:00:48Z
Sampled continuously from TrustedRouter's monitor regions over the 5,000-sample benchmark set. Time-to-first-token (TTFT), time-to-first-byte (TTFB), throughput, and success rate are measured on real streaming requests rather than taken from vendor claims. Unsupported-route and probe-configuration rows are reported separately and do not count as provider downtime. No prompt or output content is ever stored.
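
The exclusion rule can be made concrete with a small sketch. It assumes each probe result carries a single outcome label and that excluded rows are dropped from the uptime denominator. The labels `provider_error` and `probe_config_error` appear in the tables on this page; the `ok` and `unsupported_route` labels and the `uptime_percent` helper are hypothetical names chosen for illustration, and the actual monitoring pipeline may differ.

```python
from collections import Counter
from typing import Iterable, Optional

# Outcomes treated as configuration problems rather than provider failures.
# "probe_config_error" appears in the tables; "unsupported_route" is a
# hypothetical label for the unsupported-route rows the text mentions.
EXCLUDED = {"probe_config_error", "unsupported_route"}


def uptime_percent(outcomes: Iterable[str]) -> tuple[Optional[float], int]:
    """Return (uptime %, excluded count).

    Uptime is None when no countable samples remain after exclusion.
    """
    counts = Counter(outcomes)
    excluded = sum(counts[label] for label in EXCLUDED)
    counted = sum(counts.values()) - excluded
    if counted == 0:
        return None, excluded
    # "ok" is an assumed label for a successful streaming request.
    return round(100.0 * counts["ok"] / counted, 2), excluded


# Illustrative data only: 3 successes, 1 provider failure, 1 excluded row.
probes = ["ok", "ok", "provider_error", "probe_config_error", "ok"]
print(uptime_percent(probes))  # (75.0, 1)
```

Under this assumption the excluded row changes neither the numerator nor the denominator, so a misconfigured probe cannot lower a provider's uptime.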

Providers

Ranked by measured p50 time-to-first-token across all of a provider's models in the 5,000-sample benchmark set (23 providers · 2,653 samples).
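
One way to read this ranking is sketched below. It assumes that a provider's p50 is the median over the pooled TTFT samples of all its models, and that providers with no successful sample sort last. The page does not state how samples are pooled, so `rank_providers` and the sample tuple layout are illustrative assumptions, not the site's implementation.

```python
from collections import defaultdict
from statistics import median
from typing import Iterable, Optional

# (provider, model, TTFT in ms, or None when the request failed)
Sample = tuple[str, str, Optional[float]]


def rank_providers(samples: Iterable[Sample]) -> list[tuple[str, Optional[float]]]:
    """Rank providers by the median TTFT pooled across all of their models."""
    pooled: dict[str, list[float]] = defaultdict(list)
    for provider, _model, ttft_ms in samples:
        values = pooled[provider]  # registers the provider even if every probe failed
        if ttft_ms is not None:
            values.append(ttft_ms)
    rows = [(p, median(v) if v else None) for p, v in pooled.items()]
    # Providers with a measured p50 come first, fastest first; the rest sort last.
    rows.sort(key=lambda r: (r[1] is None, r[1] or 0.0, r[0]))
    return rows


# Illustrative data only; these are not measurements from the table.
demo = [
    ("a", "m1", 900.0), ("a", "m2", 1100.0), ("a", "m1", 1000.0),
    ("b", "m3", 700.0),
    ("c", "m4", None),
]
for rank, (provider, p50) in enumerate(rank_providers(demo), start=1):
    print(rank, provider, p50)
```

Note that `statistics.median` averages the two middle values when the sample count is even; a dashboard using a different percentile method would report slightly different numbers.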

#   Provider     Models  p50 TTFT  Throughput  Uptime   Errors               Config excluded       Samples
1   zai          2       901 ms    22 tok/s    100.00%  -                    -                     193
2   mistral      8       1193 ms   -           100.00%  -                    -                     134
3   grok         2       1203 ms   -           100.00%  -                    -                     100
4   cerebras     4       1219 ms   -           100.00%  -                    -                     86
5   lightning    1       1254 ms   -           96.77%   ReadTimeout 3%       -                     124
6   deepinfra    7       1272 ms   24 tok/s    97.46%   provider_error 3%    -                     118
7   parasail     25      1296 ms   -           70.65%   provider_error 29%   1 probe_config_error  92
8   together     3       1365 ms   -           98.15%   provider_error 2%    -                     108
9   nebius       14      1379 ms   -           98.41%   ReadTimeout 2%       -                     63
10  fireworks    5       1429 ms   -           98.92%   ReadTimeout 1%       -                     93
11  venice       11      1491 ms   -           82.24%   provider_error 18%   -                     107
12  openai       11      1525 ms   -           100.00%  -                    -                     112
13  novita       60      1747 ms   -           93.64%   ReadTimeout 6%       -                     110
14  siliconflow  7       2075 ms   13 tok/s    98.26%   ReadTimeout 2%       -                     115
15  anthropic    10      2085 ms   80 tok/s    79.82%   provider_error 20%   -                     109
16  minimax      6       2391 ms   -           100.00%  -                    -                     91
17  xiaomi       5       2462 ms   -           100.00%  -                    -                     106
18  phala        17      2970 ms   -           99.06%   ReadTimeout 1%       -                     106
19  gemini       7       4630 ms   79 tok/s    99.44%   provider_error 1%    5 probe_config_error  179
20  gmi          5       6190 ms   -           73.79%   empty_stream 26%     -                     103
21  tinfoil      6       20720 ms  41 tok/s    81.25%   provider_error 19%   -                     192
22  deepseek     2       -         -           0.00%    provider_error 100%  -                     109
23  kimi         3       -         -           0.00%    provider_error 100%  -                     103

Models

Models sampled in the 5,000-sample benchmark set, fastest measured TTFT first. Rows with few samples are marked "limited data"; their figures will firm up as more samples are collected.
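
The per-row percentiles and the "limited data" marker can be sketched as follows. Two assumptions are made here and neither is documented on this page: percentiles use the nearest-rank method (which gives p50 equal to p95 for a single-sample row, as the table shows for such rows), and the marker is applied below 20 samples, a threshold inferred from which rows in the table carry it. The names `nearest_rank`, `summarize`, and `LIMITED_DATA_BELOW` are hypothetical.

```python
from typing import Optional, Sequence

LIMITED_DATA_BELOW = 20  # inferred from the table, not a documented threshold


def nearest_rank(values: Sequence[float], pct: int) -> Optional[float]:
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n)."""
    if not values:
        return None
    ordered = sorted(values)
    # Integer ceiling avoids floating-point surprises such as 0.95 * 20.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(ttfts_ms: Sequence[float], total_samples: int) -> dict:
    """Summarize one model/provider row.

    ttfts_ms holds TTFT values from successful requests only;
    total_samples also counts failed requests.
    """
    return {
        "p50": nearest_rank(ttfts_ms, 50),
        "p95": nearest_rank(ttfts_ms, 95),
        "limited_data": total_samples < LIMITED_DATA_BELOW,
    }


# Illustrative data only; these are not measurements from the table.
print(summarize([400.0, 100.0, 300.0, 200.0], total_samples=4))
# {'p50': 200.0, 'p95': 400.0, 'limited_data': True}
```

With only a handful of samples the p95 is simply the slowest observation, which is why percentile figures on marked rows should be read with caution.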

# Model Provider p50 TTFT p95 TTFT p50 TTFB Throughput Uptime Config excluded Samples
1 meta-llama/llama-4-maverick-17b-128e-instruct-fp8 limited data novita 590 ms 834 ms 589 ms 100.00% 2
2 minimax/minimax-m2.5 limited data parasail 713 ms 1669 ms 712 ms 100.00% 2
3 sao10k/l3-8b-lunaris limited data novita 762 ms 926 ms 761 ms 100.00% 2
4 deepseek/deepseek-prover-v2-671b limited data novita 773 ms 3136 ms 771 ms 100.00% 2
5 google/gemma-4-31b-it limited data tinfoil 783 ms 783 ms 782 ms 100.00% 1
6 google/gemma-3-12b-it deepinfra 842 ms 2729 ms 779 ms 100.00% 20
7 qwen/qwen3.5-27b limited data deepinfra 842 ms 3443 ms 809 ms 100.00% 10
8 z-ai/glm-5.1 limited data parasail 842 ms 8959 ms 840 ms 100.00% 2
9 qwen/qwen3-235b-a22b-thinking-2507 limited data venice 868 ms 1402 ms 867 ms 100.00% 6
10 qwen/qwen2.5-vl-72b-instruct limited data phala 890 ms 2406 ms 888 ms 100.00% 4
11 z-ai/glm-4.6v limited data zai 901 ms 1750 ms 899 ms 100.00% 2
12 NousResearch/Hermes-4-405B limited data nebius 915 ms 2572 ms 813 ms 100.00% 6
13 openai/gpt-oss-120b limited data cerebras 988 ms 2537 ms 928 ms 100.00% 18
14 moonshotai/kimi-k2.5 limited data fireworks 1019 ms 11761 ms 933 ms 100.00% 16
15 google/gemma-4-26b-a4b-it limited data parasail 1043 ms 2024 ms 939 ms 100.00% 6
16 openai/gpt-oss-120b limited data fireworks 1045 ms 3054 ms 1014 ms 100.00% 17
17 mistralai/ministral-3b-2512 mistral 1046 ms 2728 ms 1045 ms 100.00% 24
18 meta-llama/llama-3.3-70b-instruct limited data parasail 1046 ms 1046 ms 943 ms 100.00% 1
19 google/gemma-3-27b-it limited data parasail 1051 ms 4315 ms 947 ms 100.00% 10
20 x-ai/grok-4.20 grok 1074 ms 3377 ms 1017 ms 100.00% 46
21 qwen/qwen-2.5-72b-instruct limited data novita 1075 ms 1075 ms 971 ms 100.00% 1
22 mistralai/ministral-8b-2512 limited data mistral 1088 ms 2132 ms 996 ms 100.00% 15
23 cerebras/gpt-oss-120b limited data cerebras 1089 ms 2077 ms 1088 ms 100.00% 14
24 google/gemma-3-4b-it limited data deepinfra 1133 ms 1869 ms 1133 ms 90.00% 10
25 qwen/qwen-2.5-7b-instruct together 1135 ms 3557 ms 1134 ms 100.00% 36
26 openai/gpt-oss-20b limited data phala 1138 ms 12284 ms 1033 ms 100.00% 4
27 qwen/qwen3-max limited data novita 1145 ms 2271 ms 1143 ms 100.00% 2
28 qwen/qwen3-omni-30b-a3b-instruct limited data novita 1150 ms 1723 ms 1149 ms 100.00% 3
29 mistralai/mistral-small-3.2-24b-instruct limited data parasail 1151 ms 5076 ms 1049 ms 100.00% 4
30 mistralai/mistral-large limited data mistral 1159 ms 2462 ms 1158 ms 100.00% 11
31 google/gemini-2.5-flash-lite limited data gemini 1176 ms 2400 ms 1175 ms 100.00% 6
32 deepseek-ai/DeepSeek-V4-Pro limited data nebius 1178 ms 1661 ms 1177 ms 66.67% 3
33 sao10k/l31-70b-euryale-v2.2 limited data novita 1181 ms 1181 ms 1077 ms 100.00% 1
34 qwen/qwen3.5-35b-a3b limited data novita 1186 ms 3492 ms 1184 ms 100.00% 2
35 mistralai/ministral-14b-2512 limited data mistral 1193 ms 2983 ms 1177 ms 100.00% 19
36 x-ai/grok-4.3 grok 1203 ms 2729 ms 1143 ms 100.00% 54
37 qwen/qwen3.6-27b limited data novita 1207 ms 1582 ms 1207 ms 100.00% 2
38 openai/gpt-4o limited data openai 1208 ms 2450 ms 1208 ms 100.00% 6
39 z-ai/glm-4.7 cerebras 1219 ms 4025 ms 1219 ms 100.00% 25
40 openai/gpt-4.1-nano limited data openai 1236 ms 3063 ms 1235 ms 100.00% 11
41 z-ai/glm-4.6 limited data venice 1250 ms 3297 ms 1147 ms 100.00% 10
42 google/gemma-4-31b-it lightning 1254 ms 2503 ms 1224 ms 96.77% 124
43 mistralai/mistral-small-3.2-24b-instruct limited data mistral 1255 ms 3087 ms 1255 ms 100.00% 19
44 qwen/qwen3-235b-a22b-fp8 limited data novita 1257 ms 1257 ms 1256 ms 100.00% 1
45 mistralai/mistral-medium-3-5 limited data mistral 1265 ms 3653 ms 1163 ms 100.00% 18
46 moonshotai/kimi-k2.5 limited data novita 1268 ms 1268 ms 1266 ms 100.00% 1
47 google/gemma-4-31b-it deepinfra 1272 ms 23961 ms 1271 ms 24 tok/s 100.00% 36
48 qwen/qwen2.5-vl-72b-instruct limited data parasail 1282 ms 1820 ms 1281 ms 100.00% 4
49 google/gemma-3-27b-it limited data deepinfra 1286 ms 3534 ms 1285 ms 100.00% 16
50 mistralai/mistral-nemo limited data mistral 1288 ms 3303 ms 1288 ms 100.00% 17
51 qwen/qwen3-vl-8b-instruct limited data novita 1288 ms 1288 ms 1288 ms 100.00% 1
52 qwen/qwen3.5-9b limited data venice 1291 ms 3237 ms 1290 ms 90.00% 10
53 qwen/qwen3-next-80b-a3b-instruct limited data parasail 1296 ms 4531 ms 1296 ms 100.00% 4
54 openai/gpt-oss-120b limited data nebius 1302 ms 1460 ms 1204 ms 100.00% 5
55 meta-llama/llama-4-maverick limited data parasail 1303 ms 1698 ms 1303 ms 100.00% 5
56 meta-llama/llama-3.3-70b-instruct limited data tinfoil 1306 ms 1360 ms 1306 ms 100.00% 3
57 zai-org/glm-4.6v limited data novita 1306 ms 1554 ms 1202 ms 100.00% 3
58 openai/gpt-oss-20b limited data parasail 1308 ms 1329 ms 1307 ms 100.00% 2
59 mistralai/mistral-small-2603 limited data mistral 1317 ms 3321 ms 1317 ms 100.00% 11
60 google/gemini-2.5-flash limited data gemini 1332 ms 1867 ms 1331 ms 100.00% 9
61 google/gemini-3.1-flash-lite-preview limited data gemini 1334 ms 1767 ms 1333 ms 100.00% 6
62 NousResearch/Hermes-4-70B limited data nebius 1339 ms 3420 ms 1338 ms 100.00% 9
63 google/gemma-4-26b-a4b-it limited data deepinfra 1352 ms 6079 ms 1351 ms 100.00% 13
64 minimax/minimax-m2.7 limited data novita 1354 ms 2338 ms 1353 ms 100.00% 2
65 Sao10K/L3-8B-Stheno-v3.2 limited data novita 1354 ms 1354 ms 1353 ms 100.00% 1
66 meta-llama/llama-3.1-70b-instruct limited data deepinfra 1355 ms 2957 ms 1355 ms 84.62% 13
67 meta-llama/llama-3.3-70b-instruct together 1365 ms 6403 ms 1365 ms 100.00% 36
68 google/gemini-3.1-flash-lite limited data gemini 1371 ms 2397 ms 1370 ms 100.00% 7
69 qwen/qwen3-235b-a22b-thinking-2507 limited data novita 1378 ms 1619 ms 1378 ms 100.00% 4
70 Qwen/Qwen3-30B-A3B-Instruct-2507 limited data nebius 1379 ms 3647 ms 1378 ms 100.00% 9
71 nvidia/nemotron-3-super-120b-a12b limited data nebius 1380 ms 1828 ms 1378 ms 100.00% 4
72 openai/gpt-5.4-mini limited data openai 1393 ms 3504 ms 1392 ms 100.00% 11
73 openai/gpt-oss-120b limited data novita 1393 ms 1393 ms 1292 ms 100.00% 1
74 nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 limited data nebius 1399 ms 1934 ms 1398 ms 100.00% 2
75 qwen/qwen3.6-27b limited data venice 1419 ms 1745 ms 1317 ms 100.00% 13
76 deepseek/deepseek-v4-flash limited data siliconflow 1419 ms 3767 ms 1315 ms 13 tok/s 93.75% 16
77 z-ai/glm-4.7 limited data venice 1421 ms 3624 ms 1421 ms 100.00% 7
78 deepseek/deepseek-v4-pro fireworks 1429 ms 3783 ms 1428 ms 100.00% 20
79 anthropic/claude-haiku-4.5 limited data anthropic 1446 ms 4101 ms 1446 ms 100.00% 13
80 openai/gpt-oss-120b limited data tinfoil 1447 ms 1825 ms 1446 ms 100.00% 7
81 qwen/qwen3-coder-next limited data novita 1448 ms 1448 ms 1345 ms 100.00% 1
82 anthropic/claude-sonnet-4.5 limited data anthropic 1449 ms 2521 ms 1448 ms 100.00% 14
83 qwen/qwen3-coder-next limited data parasail 1451 ms 3882 ms 1450 ms 100.00% 3
84 openai/gpt-4.1-mini limited data openai 1455 ms 1851 ms 1394 ms 100.00% 7
85 Qwen/Qwen3-32B limited data nebius 1456 ms 2914 ms 1455 ms 100.00% 5
86 Qwen/Qwen2.5-VL-72B-Instruct limited data nebius 1461 ms 1588 ms 1460 ms 100.00% 3
87 deepseek/deepseek-v3-turbo limited data novita 1476 ms 2253 ms 1475 ms 100.00% 2
88 z-ai/glm-5 limited data venice 1491 ms 2348 ms 1474 ms 100.00% 14
89 google/gemini-3-flash-preview limited data gemini 1493 ms 1803 ms 1390 ms 100.00% 3
90 cerebras/zai-glm-4.7 cerebras 1494 ms 4073 ms 1401 ms 100.00% 29
91 deepseek/deepseek-r1-distill-llama-70b limited data novita 1495 ms 1495 ms 1494 ms 100.00% 1
92 moonshotai/kimi-k2.6 together 1501 ms 7574 ms 1501 ms 94.44% 36
93 meta-llama/Llama-3.3-70B-Instruct limited data nebius 1501 ms 2550 ms 1501 ms 100.00% 2
94 qwen/qwen3-vl-235b-a22b-instruct limited data parasail 1511 ms 4640 ms 1510 ms 100.00% 5
95 openai/gpt-4.1 limited data openai 1515 ms 2452 ms 1515 ms 100.00% 12
96 openai/o3 limited data openai 1525 ms 3609 ms 1480 ms 100.00% 14
97 Qwen/Qwen3-235B-A22B-Instruct-2507 limited data nebius 1533 ms 2960 ms 1532 ms 100.00% 2
98 qwen/qwen3.5-397b-a17b limited data venice 1542 ms 5809 ms 1542 ms 100.00% 8
99 google/gemma-4-26b-a4b-it limited data novita 1547 ms 5695 ms 1547 ms 100.00% 2
100 google/gemini-3.5-flash limited data gemini 1560 ms 3315 ms 1558 ms 100.00% 8
101 moonshotai/kimi-k2.6 limited data tinfoil 1582 ms 4349 ms 1477 ms 100.00% 4
102 openai/gpt-oss-20b limited data novita 1610 ms 2000 ms 1609 ms 100.00% 3
103 qwen/qwen3.5-27b limited data novita 1610 ms 1610 ms 1610 ms 100.00% 1
104 qwen/qwen3-omni-30b-a3b-thinking limited data novita 1613 ms 1956 ms 1612 ms 100.00% 2
105 qwen/qwen3.5-397b-a17b limited data novita 1627 ms 1627 ms 1524 ms 100.00% 1
106 minimax/minimax-m2 limited data novita 1639 ms 4324 ms 1546 ms 100.00% 3
107 mistralai/mistral-nemo limited data novita 1647 ms 1647 ms 1647 ms 100.00% 1
108 moonshotai/kimi-k2.6 fireworks 1648 ms 18398 ms 1647 ms 96.15% 26
109 moonshotai/kimi-k2.6 limited data novita 1649 ms 2015 ms 1543 ms 100.00% 2
110 bytedance/ui-tars-1.5-7b limited data parasail 1667 ms 1667 ms 1666 ms 100.00% 1
111 openai/gpt-4o-mini limited data openai 1673 ms 3317 ms 1673 ms 100.00% 13
112 deepseek/deepseek-v4-pro limited data tinfoil 1707 ms 4483 ms 1706 ms 100.00% 7
113 deepseek/deepseek-v4-pro limited data novita 1707 ms 1707 ms 1602 ms 100.00% 1
114 z-ai/glm-5.1 limited data fireworks 1708 ms 4562 ms 1707 ms 100.00% 14
115 qwen/qwen3-vl-30b-a3b-instruct limited data phala 1740 ms 9605 ms 1738 ms 100.00% 8
116 kwaipilot/kat-coder-pro limited data novita 1747 ms 1852 ms 1644 ms 100.00% 3
117 openai/gpt-oss-120b limited data phala 1748 ms 3316 ms 1748 ms 100.00% 6
118 openai/o4-mini limited data openai 1792 ms 2870 ms 1790 ms 100.00% 11
119 zai-org/glm-4.7 limited data novita 1801 ms 1801 ms 1799 ms 100.00% 1
120 thedrummer/cydonia-24b-v4.1 limited data parasail 1805 ms 5567 ms 1702 ms 100.00% 2
121 google/gemma-4-31b-it limited data parasail 1819 ms 5621 ms 1818 ms 100.00% 4
122 arcee-ai/trinity-large-thinking limited data parasail 1862 ms 4597 ms 1759 ms 100.00% 3
123 meta-llama/llama-3-70b-instruct limited data novita 1871 ms 1871 ms 1870 ms 100.00% 1
124 minimax/minimax-m2 limited data minimax 1888 ms 3768 ms 1785 ms 100.00% 13
125 minimax/minimax-m3 limited data siliconflow 1908 ms 2762 ms 1804 ms 100.00% 15
126 anthropic/claude-sonnet-4.6 limited data anthropic 1909 ms 6235 ms 1909 ms 100.00% 12
127 zai-org/glm-4.6 limited data novita 1914 ms 1914 ms 1912 ms 100.00% 1
128 z-ai/glm-5-turbo limited data venice 1923 ms 6392 ms 1921 ms 57.14% 7
129 deepseek/deepseek-v4-flash limited data novita 1932 ms 1932 ms 1931 ms 100.00% 1
130 xiaomi/mimo-v2.5-pro limited data xiaomi 1945 ms 3125 ms 1945 ms 100.00% 15
131 inclusionai/ling-2.6-flash limited data novita 1961 ms 1975 ms 1960 ms 100.00% 2
132 z-ai/glm-5v-turbo limited data siliconflow 1990 ms 6058 ms 1927 ms 100.00% 17
133 zai-org/autoglm-phone-9b-multilingual limited data novita 2006 ms 2006 ms 2005 ms 100.00% 1
134 deepseek/deepseek-v3.1-terminus limited data novita 2065 ms 2065 ms 1962 ms 100.00% 1
135 qwen/qwen3-vl-8b-instruct limited data parasail 2069 ms 2069 ms 2069 ms 100.00% 1
136 tencent/hunyuan-a13b-instruct limited data siliconflow 2075 ms 3644 ms 2075 ms 100.00% 13
137 anthropic/claude-opus-4.5 limited data anthropic 2085 ms 2704 ms 1983 ms 100.00% 8
138 qwen/qwen3-coder-480b-a35b-instruct limited data novita 2101 ms 2101 ms 2101 ms 100.00% 1
139 xiaomi/mimo-v2.5-pro-ultraspeed xiaomi 2121 ms 4472 ms 2082 ms 100.00% 26
140 anthropic/claude-opus-4.8 limited data anthropic 2132 ms 3768 ms 2131 ms 100.00% 13
141 openai/o3-mini limited data openai 2141 ms 2590 ms 2037 ms 100.00% 11
142 google/gemma-3-27b-it limited data phala 2151 ms 2855 ms 2060 ms 100.00% 8
143 deepseek/deepseek-v4-pro gmi 2167 ms 11999 ms 2166 ms 100.00% 20
144 z-ai/glm-4.7-flash limited data phala 2170 ms 4748 ms 2169 ms 100.00% 3
145 qwen/qwen3-235b-a22b-instruct-2507 limited data novita 2178 ms 2229 ms 2127 ms 100.00% 2
146 qwen/qwen3-vl-30b-a3b-thinking limited data novita 2179 ms 2203 ms 2075 ms 100.00% 2
147 minimax/minimax-m3 limited data minimax 2204 ms 4077 ms 2203 ms 100.00% 8
148 deepseek/deepseek-v4-pro siliconflow 2236 ms 4183 ms 2133 ms 95.24% 21
149 z-ai/glm-5 limited data siliconflow 2241 ms 10712 ms 2240 ms 100.00% 19
150 minimax/minimax-m2.1 limited data novita 2256 ms 2730 ms 2255 ms 100.00% 2
151 openai/gpt-5.5 limited data openai 2274 ms 3954 ms 2273 ms 100.00% 8
152 Qwen/Qwen3-Next-80B-A3B-Thinking limited data nebius 2282 ms 2944 ms 2281 ms 100.00% 6
153 deepseek/deepseek-v3-0324 limited data novita 2293 ms 2293 ms 2292 ms 100.00% 1
154 baidu/ernie-4.5-vl-424b-a47b limited data novita 2293 ms 2293 ms 2292 ms 100.00% 1
155 minimax/minimax-m2.1-highspeed limited data minimax 2361 ms 9374 ms 2359 ms 100.00% 19
156 moonshotai/kimi-k2-0905 limited data novita 2377 ms 2377 ms 2273 ms 100.00% 1
157 z-ai/glm-5.1 limited data venice 2378 ms 8076 ms 2276 ms 100.00% 15
158 minimax/minimax-m2.5-highspeed limited data minimax 2391 ms 5852 ms 2390 ms 100.00% 17
159 anthropic/claude-opus-4.7 limited data anthropic 2427 ms 3007 ms 2426 ms 80 tok/s 100.00% 7
160 moonshotai/kimi-k2-thinking limited data novita 2446 ms 4816 ms 2445 ms 100.00% 2
161 xiaomi/mimo-v2-pro xiaomi 2462 ms 4164 ms 2454 ms 100.00% 20
162 moonshotai/kimi-k2.5 limited data phala 2478 ms 4852 ms 2478 ms 83.33% 6
163 tencent/hy3-preview limited data siliconflow 2504 ms 3695 ms 2503 ms 100.00% 14
164 minimax/minimax-m2.7-highspeed minimax 2522 ms 5371 ms 2521 ms 100.00% 23
165 openai/gpt-oss-120b limited data parasail 2552 ms 4035 ms 2551 ms 100.00% 3
166 anthropic/claude-opus-4.6 limited data anthropic 2572 ms 11740 ms 2517 ms 100.00% 12
167 xiaomi/mimo-v2.5 limited data xiaomi 2582 ms 6058 ms 2520 ms 100.00% 19
168 qwen/qwen-2.5-7b-instruct limited data phala 2638 ms 10870 ms 2638 ms 100.00% 7
169 deepseek/deepseek-v3.1 limited data novita 2640 ms 2997 ms 2640 ms 100.00% 3
170 xiaomi/mimo-v2-flash xiaomi 2766 ms 5863 ms 2663 ms 100.00% 26
171 deepseek/deepseek-r1-turbo limited data novita 2789 ms 2789 ms 2788 ms 100.00% 1
172 z-ai/glm-5.1 limited data phala 2814 ms 5889 ms 2814 ms 100.00% 5
173 anthropic/claude-opus-4.1 limited data anthropic 2814 ms 3681 ms 2723 ms 100.00% 8
174 z-ai/glm-5 limited data phala 2970 ms 8398 ms 2970 ms 100.00% 7
175 minimax/minimax-m2.5-highspeed limited data novita 2986 ms 3243 ms 2882 ms 100.00% 2
176 deepseek/deepseek-chat-v3.1 limited data phala 2993 ms 12596 ms 2889 ms 100.00% 6
177 google/gemma-4-31b-it limited data novita 3119 ms 5839 ms 3118 ms 100.00% 2
178 zai-org/glm-4.7-flash limited data novita 3271 ms 4318 ms 3167 ms 100.00% 7
179 zai-org/glm-5.1 limited data novita 3281 ms 4020 ms 3281 ms 100.00% 4
180 moonshotai/kimi-k2.6 limited data phala 3288 ms 10187 ms 3285 ms 100.00% 7
181 minimax/minimax-m2.5 limited data phala 3321 ms 8430 ms 3321 ms 100.00% 6
182 deepseek/deepseek-r1-0528 limited data novita 3598 ms 4397 ms 3598 ms 100.00% 3
183 qwen/qwen-mt-plus limited data novita 3648 ms 3648 ms 3648 ms 100.00% 1
184 zai-org/glm-4.5 limited data novita 3685 ms 3685 ms 3684 ms 100.00% 1
185 microsoft/wizardlm-2-8x22b limited data novita 3825 ms 3825 ms 3824 ms 100.00% 1
186 minimax/minimax-m2.7 limited data minimax 3863 ms 12125 ms 3862 ms 100.00% 11
187 openai/o1 limited data openai 3904 ms 6946 ms 3903 ms 100.00% 8
188 z-ai/glm-4.7 limited data phala 3983 ms 20356 ms 3982 ms 100.00% 11
189 google/gemma-4-31b-it gmi 4069 ms 22922 ms 4066 ms 52.38% 21
190 z-ai/glm-4.7-flash limited data venice 4069 ms 24493 ms 4069 ms 28.57% 7
191 qwen/qwen3-30b-a3b-instruct-2507 limited data phala 4364 ms 9715 ms 4363 ms 100.00% 8
192 deepseek/deepseek-ocr limited data novita 4422 ms 4422 ms 4422 ms 100.00% 1
193 google/gemini-3.1-pro-preview gemini 4630 ms 4630 ms 4629 ms 79 tok/s 99.29% 5 probe_config_error 140
194 zai-org/GLM-5.1 limited data nebius 4814 ms 5155 ms 4711 ms 100.00% 2
195 qwen/qwen3.5-397b-a17b limited data phala 5119 ms 12769 ms 5118 ms 100.00% 5
196 google/gemma-4-26b-a4b-it gmi 6190 ms 29342 ms 6087 ms 35.00% 20
197 google/gemma-3-27b-it limited data nebius 6206 ms 14874 ms 6205 ms 100.00% 5
198 z-ai/glm-5 limited data gmi 6362 ms 28961 ms 6361 ms 78.95% 19
199 z-ai/glm-5.1 gmi 7290 ms 21370 ms 7290 ms 100.00% 23
200 thedrummer/skyfall-36b-v2 limited data parasail 7377 ms 13486 ms 7376 ms 75.00% 4
201 deepseek/deepseek-v3.2 limited data phala 8660 ms 12787 ms 8556 ms 100.00% 5
202 zai-org/glm-4.5v limited data novita 8668 ms 22559 ms 8668 ms 100.00% 3
203 z-ai/glm-5.2 tinfoil 20720 ms 104883 ms 41 tok/s 78.82% 170
204 z-ai/glm-5.2 zai 22 tok/s 100.00% 191
205 google/gemma-3-12b-it limited data novita 0.00% 3
206 deepseek/deepseek-v4-pro deepseek 0.00% 50
207 moonshotai/kimi-k2.7-code kimi 0.00% 28
208 deepseek/deepseek-v4-flash deepseek 0.00% 59
209 moonshotai/kimi-k2.6 kimi 0.00% 37
210 moonshotai/kimi-k2.6 limited data parasail 0.00% 8
211 qwen/qwen3.5-397b-a17b limited data parasail 0.00% 1 probe_config_error 1
212 anthropic/claude-opus-4 limited data anthropic 0.00% 13
213 z-ai/glm-5v-turbo limited data venice 0.00% 10
214 moonshotai/kimi-k2.5 kimi 0.00% 38
215 anthropic/claude-sonnet-4 limited data anthropic 0.00% 9
216 baidu/ernie-4.5-vl-28b-a3b limited data novita 0.00% 2
217 deepseek/deepseek-v4-pro limited data parasail 0.00% 1
218 moonshotai/kimi-k2.5 limited data parasail 0.00% 8
219 stepfun/step-3.5-flash limited data parasail 0.00% 4
220 google/gemma-3-27b-it limited data novita 0.00% 2
221 deepseek/deepseek-v3.2 limited data parasail 0.00% 4
