ai-xray 1.2.0 → 2.0.0
- package/PRD.md +421 -280
- package/README.md +2 -2
- package/dist/cli.js +771 -0
- package/dist/cli.js.map +1 -0
- package/package.json +36 -24
- package/src/cli.ts +155 -118
- package/src/client.ts +203 -0
- package/src/commands/bench.ts +99 -0
- package/src/commands/compare.ts +76 -0
- package/src/commands/id.ts +139 -0
- package/src/commands/ping.ts +55 -0
- package/src/commands/probe.ts +136 -0
- package/src/commands/tokenize.ts +96 -0
- package/src/utils/http.ts +86 -0
- package/src/utils/output.ts +36 -123
- package/src/utils/timer.ts +75 -0
- package/tests/bench.test.ts +13 -0
- package/tests/client.test.ts +37 -0
- package/tests/compare.test.ts +24 -0
- package/tests/http.test.ts +12 -0
- package/tests/id.test.ts +13 -0
- package/tests/ping.test.ts +12 -0
- package/tests/probe.test.ts +13 -0
- package/tests/tokenize.test.ts +32 -0
- package/tsup.config.ts +11 -11
- package/vitest.config.ts +13 -0
- package/ana-suggestions.md +0 -105
- package/tests/cli.test.ts +0 -172
- package/tests/diff.test.ts +0 -169
- package/tests/env.test.ts +0 -69
- package/tests/init.test.ts +0 -164
- package/tests/output.test.ts +0 -49
- package/tests/read.test.ts +0 -169
- package/tests/scout.test.ts +0 -248
- package/tests/tree.test.ts +0 -222
package/PRD.md
CHANGED
@@ -1,397 +1,538 @@
-# ai-xray
+# ai-xray — AI Model Diagnostics Toolkit (PRD v2.0.0)

->
->
-> **One-line positioning:** Give your AI coding agent x-ray vision into any codebase.
+> **Tagline:** X-ray your AI. Know what's under the hood.
+> **npm:** `ai-xray` | **GitHub:** `10iii/ai-xray` | **Author:** `10iii <zh@ngxil.in>`

 ---

-## 1.
+## 1. Repositioning

-
-**`ai-xray` is the X-ray beam that cuts through the dark.** One command, and you instantly see through a project's skeleton.
+`ai-xray` was originally a codebase context scanner for AI agents. That functionality has been migrated to `ai-see`.

--
-- Not DevTools (not aimed at humans)
-- It is the **AI assistant's eyes**: a structured, token-friendly local environment probe designed for LLMs
+**New mission:** `ai-xray` is now a **diagnostic and probing toolkit for AI models and agents themselves.** It answers the question: *"What AI am I talking to, and what can it actually do?"*

-
+Think of it as `curl` for AI models — a lightweight CLI that lets you probe, benchmark, and fingerprint any LLM through its API.

-## 2.
+## 2. Problem Statement

-
+Developers and AI agent systems work with dozens of LLM providers (OpenAI, Anthropic, Google, Ollama, Groq, etc.) but have no standardized way to:
+1. **Identify** which model is actually responding (API version, model ID, context window).
+2. **Probe** what capabilities a model supports (function calling, vision, streaming, JSON mode).
+3. **Benchmark** response speed, token throughput, and cost across providers.
+4. **Compare** multiple models on the same prompt side-by-side.


-
-|---|---|
-| **Pure JSON stdout** | On success, stdout contains exactly one valid JSON object and nothing else |
-| **Structured stderr** | On failure, stderr emits `{"error": "...", "code": "..."}` plus a non-zero exit code |
-| **Zero interactivity** | Never waits for user input. All options must be passed as arguments |
-| **No ANSI escapes** | Output contains no color codes, progress bars, or emoji |
+## 3. Core Design Principles

-
+1. **Provider-Agnostic:** Works with any OpenAI-compatible API endpoint (covers 90%+ of providers).
+2. **Zero External Dependencies:** Uses Node.js built-in `https`/`http` modules for API calls.
+3. **Machine-First Output:** All output is JSON by default (can format for humans with `--pretty`).
+4. **Non-Destructive:** Read-only probing. Never sends harmful or creative prompts — only diagnostic prompts.

-
+## 4. Technology Stack

-
+- **Language:** TypeScript
+- **Build:** tsup (CJS output, target node18)
+- **Test:** vitest
+- **Dependencies:** Zero runtime dependencies

-##
+## 5. File Structure

-
-
-
-
-
-
-
+```
+ai-xray/
+├── package.json
+├── tsconfig.json
+├── tsup.config.ts
+├── vitest.config.ts
+├── README.md
+├── src/
+│   ├── cli.ts              # Entry point
+│   ├── client.ts           # Universal LLM API client
+│   ├── commands/
+│   │   ├── ping.ts         # ai-xray ping — basic connectivity test
+│   │   ├── id.ts           # ai-xray id — model fingerprinting
+│   │   ├── probe.ts        # ai-xray probe — capability detection
+│   │   ├── bench.ts        # ai-xray bench — performance benchmark
+│   │   ├── tokenize.ts     # ai-xray tokens — token counting
+│   │   └── compare.ts      # ai-xray compare — multi-model comparison
+│   └── utils/
+│       ├── http.ts         # HTTP request helper (stdlib only)
+│       ├── timer.ts        # High-res timing utilities
+│       └── output.ts       # JSON / pretty output formatting
+└── tests/
+    ├── client.test.ts
+    ├── ping.test.ts
+    ├── id.test.ts
+    ├── probe.test.ts
+    ├── bench.test.ts
+    ├── tokenize.test.ts
+    ├── compare.test.ts
+    └── http.test.ts
+```

 ---

-##
-
-### Command quick reference
-
-```
-ai-xray scout [--budget=N]                            One-shot global recon (env+tree+readme+rules+diff)
-ai-xray env                                           Project profile
-ai-xray tree [dir] [--depth=N] [--budget=N]           Directory structure x-ray
-ai-xray read <file...> [--lines=N-M] [--keys=a,b] [--budget=N]   Precise file reading
-ai-xray diff [--full] [--file=path]                   Git change summary
-ai-xray init [--format=cursorrules|claude|agents]     Inject/generate AI rules files
-ai-xray help                                          JSON-format command guide designed for AI
-```
+## 6. Global Flags

-
-
-
+| Flag | Alias | Description |
+|---|---|---|
+| `--help` | `-h` | Print help summary with all commands and usage examples. Exit 0. |
+| `--version` | `-v` | Print `ai-xray@<version>` and exit 0. |
+| `--pretty` | — | Format JSON output as human-readable tables (default is compact JSON). |
+| `--provider` | `-p` | Specify which provider config to use (default: env vars). |

 ---

-
+## 7. Configuration

-
+`ai-xray` reads provider config from environment variables or a config file.

+**Environment Variables (primary):**
 ```bash
-
+AI_XRAY_API_KEY=sk-xxxxx
+AI_XRAY_BASE_URL=https://api.openai.com/v1   # default
+AI_XRAY_MODEL=gpt-4o                         # default
 ```

-**
+**Config File (optional):** `~/.ai-xray.json`
 ```json
 {
-  "
-
-
-
-
-
-  "
+  "providers": {
+    "openai": {
+      "baseUrl": "https://api.openai.com/v1",
+      "apiKey": "sk-xxxxx",
+      "model": "gpt-4o"
+    },
+    "anthropic": {
+      "baseUrl": "https://api.anthropic.com/v1",
+      "apiKey": "sk-ant-xxxxx",
+      "model": "claude-sonnet-4-20250514"
+    },
+    "ollama": {
+      "baseUrl": "http://localhost:11434/v1",
+      "model": "llama3"
+    }
  }
 }
 ```

-
+**Function Signature:**
+```typescript
+// src/client.ts
+export interface ProviderConfig {
+  baseUrl: string;
+  apiKey?: string;
+  model: string;
+}

-
+export function loadConfig(provider?: string): ProviderConfig;
+```

-
+---
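A minimal sketch of how `loadConfig` could resolve the precedence described above (a named `--provider` reads `~/.ai-xray.json`; otherwise the documented environment variables apply, with the stated defaults). The exact precedence and error handling are assumptions, not the shipped implementation.

```typescript
// Assumed resolution order: explicit provider name → config file entry;
// no provider name → environment variables with documented defaults.
import { readFileSync, existsSync } from 'node:fs';
import { homedir } from 'node:os';
import { join } from 'node:path';

export interface ProviderConfig {
  baseUrl: string;
  apiKey?: string;
  model: string;
}

export function loadConfig(provider?: string): ProviderConfig {
  if (provider) {
    // Named provider: look it up under "providers" in ~/.ai-xray.json.
    const file = join(homedir(), '.ai-xray.json');
    if (existsSync(file)) {
      const parsed = JSON.parse(readFileSync(file, 'utf8'));
      const entry = parsed.providers?.[provider];
      if (entry) {
        return { baseUrl: entry.baseUrl, apiKey: entry.apiKey, model: entry.model };
      }
    }
    throw new Error(`unknown provider: ${provider}`);
  }
  // Fallback: the documented AI_XRAY_* environment variables.
  return {
    baseUrl: process.env.AI_XRAY_BASE_URL ?? 'https://api.openai.com/v1',
    apiKey: process.env.AI_XRAY_API_KEY,
    model: process.env.AI_XRAY_MODEL ?? 'gpt-4o',
  };
}
```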

-
+## 8. Command Reference

-
+### 8.1 `ai-xray ping [--provider=<name>]` — Connectivity Test

-
-1.
-2.
-3.
-4. IF CLAUDE.md / .cursorrules / AGENTS.md exist
-   THEN read their contents (these are AI convention files, extremely important)
-5. IF it is a git repo THEN output diff stat (common)
-6. IF package.json has a workspaces field
-   THEN list all workspace package names and paths (monorepo scenario)
-```
+**Behavior:**
+1. Send a minimal request (`messages: [{ role: "user", content: "hi" }]`, `max_tokens: 1`) to the API.
+2. Measure response time.
+3. Report: reachable (true/false), latency, model echoed back, rate limit headers.

-**Output
+**Output:**
 ```json
 {
-  "
-
-
-
-  "
-  "
-    "name": "vitest",
-    "runCommand": "npx vitest run --reporter=json",
-    "parseHint": "Look for testResults[].assertionResults[].failureMessages"
-  },
-  "lintFramework": {
-    "name": "eslint",
-    "runCommand": "npx eslint . --format=json"
-  }
-  },
-  "tree": {
-    "src/": {
-      "components/": { "_files": ["Button.tsx", "Modal.tsx"], "_more": 9 },
-      "pages/": { "index.tsx": {}, "auth/": { "login.tsx": {} } },
-      "lib/": { "_files": ["db.ts", "auth.ts", "utils.ts"] }
-    },
-    "package.json": {},
-    "README.md": {}
-  },
-  "readme": "# My Project\nA Next.js application for...\n...(truncated at 60 lines)",
-  "agentRules": null,
-  "diff": {
-    "branch": "main",
-    "ahead": 0,
-    "behind": 2,
-    "staged": [],
-    "unstaged": [],
-    "untracked": []
-  },
-  "workspaces": null,
-  "_meta": {
-    "tokensEstimate": 1850,
-    "budget": 3000,
-    "truncated": false
+  "reachable": true,
+  "latency_ms": 342,
+  "model": "gpt-4o-2024-08-06",
+  "rate_limit": {
+    "remaining": 9998,
+    "reset_at": "2026-03-07T21:00:00Z"
  }
 }
 ```

-
-
-
+**Function Signature:**
+```typescript
+// src/commands/ping.ts
+export interface PingResult {
+  reachable: boolean;
+  latency_ms: number;
+  model: string | null;
+  rate_limit: { remaining: number | null; reset_at: string | null };
+  error?: string;
+}

-
+export async function ping(config: ProviderConfig): Promise<PingResult>;
+```

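The ping flow above can be sketched as follows. The real command would go through the `chat()` client from `src/client.ts`; here the transport is an injected function (an assumption for illustration) so the timing and result-shaping logic is visible on its own, and the rate-limit header names are the common `x-ratelimit-*` convention, not a guarantee for every provider.

```typescript
// Illustrative sketch: measure latency around one minimal request and map
// transport failures to reachable=false. `Send` stands in for the client.
export interface PingResult {
  reachable: boolean;
  latency_ms: number;
  model: string | null;
  rate_limit: { remaining: number | null; reset_at: string | null };
  error?: string;
}

type Send = (body: object) => Promise<{ model?: string; headers: Record<string, string> }>;

export async function ping(model: string, send: Send): Promise<PingResult> {
  const started = Date.now();
  try {
    // Minimal one-token request, exactly as specified in the behavior above.
    const res = await send({
      model,
      messages: [{ role: 'user', content: 'hi' }],
      max_tokens: 1,
    });
    const remaining = res.headers['x-ratelimit-remaining-requests'];
    return {
      reachable: true,
      latency_ms: Date.now() - started,
      model: res.model ?? null,
      rate_limit: {
        remaining: remaining !== undefined ? Number(remaining) : null,
        reset_at: res.headers['x-ratelimit-reset-requests'] ?? null,
      },
    };
  } catch (err) {
    return {
      reachable: false,
      latency_ms: Date.now() - started,
      model: null,
      rate_limit: { remaining: null, reset_at: null },
      error: err instanceof Error ? err.message : String(err),
    };
  }
}
```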
-
+### 8.2 `ai-xray id [--provider=<name>]` — Model Fingerprinting

-
-
-
+**Behavior:**
+Send a series of diagnostic prompts to identify the model:
+1. `"What model are you? Reply with only the model identifier."` → Extract self-reported model name.
+2. `"What is your knowledge cutoff date? Reply YYYY-MM only."` → Extract training cutoff.
+3. `"What is your maximum context window in tokens? Reply with only the number."` → Extract context size.
+4. Parse API response headers for `x-model`, `x-ratelimit-*`, `openai-organization`.

+**Output:**
 ```json
 {
-  "
-
-
-
-  "scripts": {
-    "dev": "next dev",
-    "build": "next build",
-    "test": "vitest",
-    "lint": "eslint ."
-  },
-  "testFramework": {
-    "name": "vitest",
-    "runCommand": "npx vitest run --reporter=json",
-    "parseHint": "Look for testResults[].assertionResults[].failureMessages"
+  "self_reported": {
+    "model": "GPT-4o",
+    "cutoff": "2025-06",
+    "context_window": 128000
  },
-  "
-  "
-  "
-    "parseHint": "Each element has filePath and messages[]"
+  "api_reported": {
+    "model": "gpt-4o-2024-08-06",
+    "organization": "org-xxxxx"
  },
-  "
-  "
-  "
-    "remoteUrl": "https://github.com/10iii/ai-xray.git"
+  "fingerprint": {
+    "provider": "openai",
+    "confidence": 0.95
  }
 }
 ```

+**Function Signature:**
+```typescript
+// src/commands/id.ts
+export interface IdResult {
+  self_reported: {
+    model: string | null;
+    cutoff: string | null;
+    context_window: number | null;
+  };
+  api_reported: {
+    model: string | null;
+    organization: string | null;
+  };
+  fingerprint: {
+    provider: string;
+    confidence: number;
+  };
+}

-
-ai-xray tree                  # whole project, default depth=3
-ai-xray tree src/             # src only
-ai-xray tree --depth=1        # first level only
-ai-xray tree --budget=300     # token budget cap
+export async function identify(config: ProviderConfig): Promise<IdResult>;
 ```

+### 8.3 `ai-xray probe [--provider=<name>]` — Capability Detection
+
+**Behavior:**
+Run a battery of lightweight tests to detect model capabilities:
+
+| Test | How | Result |
+|---|---|---|
+| **JSON mode** | Send with `response_format: { type: "json_object" }` | `true/false` |
+| **Function calling** | Send with a dummy `tools` array | `true/false` |
+| **Vision** | Send a tiny base64 image in content | `true/false` |
+| **Streaming** | Send with `stream: true` | `true/false` |
+| **System prompt** | Send with a system message | `true/false` |
+| **Temperature control** | Send with `temperature: 0` | `true/false` |
+
+**Output:**
 ```json
 {
-  "
-
-
-  "
-
-
-
-  },
-  "pages/": {
-    "index.tsx": {},
-    "auth/": { "login.tsx": {}, "register.tsx": {} }
-  },
-  "lib/": {
-    "_files": ["db.ts", "auth.ts", "utils.ts"]
-  }
-  },
-  "tests/": { "_more": 8 },
-  "package.json": {},
-  "tsconfig.json": {}
-  },
-  "stats": {
-    "totalFiles": 47,
-    "displayed": 16,
-    "ignored": ["node_modules", "dist", ".next", ".git"]
+  "capabilities": {
+    "json_mode": true,
+    "function_calling": true,
+    "vision": true,
+    "streaming": true,
+    "system_prompt": true,
+    "temperature_control": true
  },
-  "
+  "probe_duration_ms": 4521
 }
 ```

-
-
-
-
+**Function Signature:**
+```typescript
+// src/commands/probe.ts
+export interface ProbeResult {
+  capabilities: Record<string, boolean>;
+  probe_duration_ms: number;
+}

-
+export async function probe(config: ProviderConfig): Promise<ProbeResult>;
+```

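The probe pattern in the table reduces to one move: attempt a request with the feature flag set and map an API rejection to `false`. A sketch under stated assumptions — the `attempt` transport is injected for illustration, and only three of the six capability checks are shown:

```typescript
// Each capability test merges one extra field into the request; an accepted
// request means "supported", a rejection means "unsupported".
type Attempt = (extra: object) => Promise<unknown>;

async function supports(attempt: Attempt, extra: object): Promise<boolean> {
  try {
    await attempt(extra);
    return true;  // request accepted → capability present
  } catch {
    return false; // 400/422 or similar → capability absent
  }
}

export async function probeCapabilities(attempt: Attempt) {
  const started = Date.now();
  const capabilities = {
    json_mode: await supports(attempt, { response_format: { type: 'json_object' } }),
    streaming: await supports(attempt, { stream: true }),
    temperature_control: await supports(attempt, { temperature: 0 }),
  };
  return { capabilities, probe_duration_ms: Date.now() - started };
}
```

Running the checks sequentially keeps the load on the provider minimal; a real implementation might parallelize them to cut the reported `probe_duration_ms`.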
-###
+### 8.4 `ai-xray bench [--provider=<name>] [--rounds=5]` — Performance Benchmark

-
+**Behavior:**
+1. Send a fixed prompt (`"Write a haiku about coding."`) N times.
+2. Measure: time-to-first-token (TTFT), total response time, tokens generated, tokens/second.
+3. Compute statistics: mean, median, p95.

-**
-```
-
+**Output:**
+```json
+{
+  "rounds": 5,
+  "stats": {
+    "ttft_ms": { "mean": 280, "median": 265, "p95": 410 },
+    "total_ms": { "mean": 1200, "median": 1150, "p95": 1800 },
+    "tokens_per_second": { "mean": 42.5, "median": 40.1, "p95": 55.2 },
+    "output_tokens": { "mean": 51, "total": 255 }
+  }
+}
 ```

-**
-```
-
-
+**Function Signature:**
+```typescript
+// src/commands/bench.ts
+export interface BenchStats {
+  mean: number;
+  median: number;
+  p95: number;
+}

-
-
-
-
+export interface BenchResult {
+  rounds: number;
+  stats: {
+    ttft_ms: BenchStats;
+    total_ms: BenchStats;
+    tokens_per_second: BenchStats;
+    output_tokens: { mean: number; total: number };
+  };
+}

-
-
-
+export async function bench(
+  config: ProviderConfig,
+  options?: { rounds?: number }
+): Promise<BenchResult>;
 ```

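The mean/median/p95 aggregation from step 3 can be written as a pure helper over the per-round samples. This is a sketch of an assumed internal helper (the PRD only specifies the `BenchStats` shape), using the nearest-rank method for p95:

```typescript
// Collapse a list of per-round measurements (e.g. TTFT in ms) into the
// BenchStats shape used throughout the bench output.
export interface BenchStats { mean: number; median: number; p95: number; }

export function summarize(samples: number[]): BenchStats {
  const sorted = [...samples].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  const mid = sorted.length / 2;
  // Median: middle element, or the average of the two middle elements.
  const median = sorted.length % 2
    ? sorted[Math.floor(mid)]
    : (sorted[mid - 1] + sorted[mid]) / 2;
  // p95 by nearest rank; with only 5 rounds this is simply the max sample.
  const p95 = sorted[Math.min(sorted.length - 1, Math.ceil(sorted.length * 0.95) - 1)];
  return { mean, median, p95 };
}
```

With the default 5 rounds, p95 degenerates to the maximum, so raising `--rounds` is what makes the tail statistic meaningful.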
-
+### 8.5 `ai-xray tokens <text_or_file>` — Token Counter
+
+**Behavior:**
+1. Accept text from argument, stdin pipe, or file path.
+2. Estimate token count using a simple whitespace/BPE heuristic (no tiktoken dependency).
+3. Report: estimated tokens, character count, word count, cost estimate (if model pricing known).
+
+**Output:**
 ```json
 {
-  "
-
-
-
-
+  "characters": 1542,
+  "words": 287,
+  "estimated_tokens": 412,
+  "cost_estimate": {
+    "model": "gpt-4o",
+    "input_cost_usd": 0.00103
+  }
 }
 ```

-
-
-
+**Function Signature:**
+```typescript
+// src/commands/tokenize.ts
+export interface TokenResult {
+  characters: number;
+  words: number;
+  estimated_tokens: number;
+  cost_estimate?: {
+    model: string;
+    input_cost_usd: number;
+  };
+}

-
-
-
-
+export async function countTokens(
+  input: string,
+  options?: { model?: string }
+): Promise<TokenResult>;
 ```

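One plausible shape for the whitespace/BPE heuristic in step 2 is to blend the two classic English-text rules of thumb (~4 characters per token and ~0.75 words per token). The constants and the blend are assumptions for illustration, not the package's published tuning:

```typescript
// Rough token estimate without a tokenizer: average the chars/4 rule and
// the words/0.75 rule, rounding up. Cost estimation is omitted here.
export interface TokenEstimate {
  characters: number;
  words: number;
  estimated_tokens: number;
}

export function estimateTokens(input: string): TokenEstimate {
  const characters = input.length;
  const trimmed = input.trim();
  const words = trimmed === '' ? 0 : trimmed.split(/\s+/).length;
  const estimated_tokens = Math.ceil((characters / 4 + words / 0.75) / 2);
  return { characters, words, estimated_tokens };
}
```

Such a heuristic lands well within the ±20% tolerance the test plan asks for on typical English prose, but it will undercount for code or non-Latin scripts, where BPE tokenizers emit far more tokens per character.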
-
+### 8.6 `ai-xray compare --providers=openai,anthropic,ollama` — Multi-Model Comparison
+
+**Behavior:**
+1. Send the same prompt to all specified providers.
+2. Run `ping` + `bench` on each.
+3. Return a side-by-side comparison table.
+
+**Output:**
 ```json
 {
-  "
-  "
-
-
-    { "
-  ]
-  "unstaged": [
-    { "file": "src/auth.ts", "status": "M", "insertions": 15, "deletions": 2 }
-  ],
-  "untracked": ["src/components/LoginForm.tsx"]
+  "prompt": "Write a haiku about coding.",
+  "results": [
+    { "provider": "openai", "model": "gpt-4o", "ttft_ms": 280, "total_ms": 1200, "tokens": 51 },
+    { "provider": "anthropic", "model": "claude-sonnet-4-20250514", "ttft_ms": 350, "total_ms": 1400, "tokens": 48 },
+    { "provider": "ollama", "model": "llama3", "ttft_ms": 120, "total_ms": 800, "tokens": 45 }
+  ]
 }
 ```

-
+**Function Signature:**
+```typescript
+// src/commands/compare.ts
+export interface CompareResult {
+  prompt: string;
+  results: Array<{
+    provider: string;
+    model: string;
+    ttft_ms: number;
+    total_ms: number;
+    tokens: number;
+    error?: string;
+  }>;
+}
+
+export async function compare(
+  providers: string[],
+  options?: { prompt?: string; rounds?: number }
+): Promise<CompareResult>;
+```

 ---

-
+## 9. Universal LLM API Client

-
+**File:** `src/client.ts`

-
+```typescript
+export interface ChatMessage {
+  role: 'system' | 'user' | 'assistant';
+  content: string | Array<{ type: string; [key: string]: unknown }>;
+}

-
-
-
+export interface ChatRequest {
+  model: string;
+  messages: ChatMessage[];
+  max_tokens?: number;
+  temperature?: number;
+  stream?: boolean;
+  response_format?: { type: string };
+  tools?: unknown[];
+}

-
-
-
-
-
-
+export interface ChatResponse {
+  id: string;
+  model: string;
+  choices: Array<{
+    message: { role: string; content: string };
+    finish_reason: string;
+  }>;
+  usage?: {
+    prompt_tokens: number;
+    completion_tokens: number;
+    total_tokens: number;
+  };
+  headers: Record<string, string>;
+  latency_ms: number;
+}

-
+export async function chat(
+  config: ProviderConfig,
+  request: ChatRequest
+): Promise<ChatResponse>;
 ```

-
+**Key behavior:**
+- Uses Node.js `https.request` (no `fetch`, no axios) for maximum compatibility.
+- Automatically adds `Authorization: Bearer <key>` header.
+- Captures response headers for rate limit info.
+- Measures latency with `performance.now()`.

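The request-assembly half of that key behavior can be sketched separately from the transport: derive the URL from `baseUrl`, serialize the JSON body, and attach the bearer header only when a key is configured (Ollama, for instance, needs none). `buildRequest` is an assumed helper for illustration; the shipped client wires an equivalent options object into `https.request` directly.

```typescript
// Assemble the pieces https.request needs for an OpenAI-compatible
// POST /chat/completions call. Streaming and response parsing are omitted.
interface ProviderConfig { baseUrl: string; apiKey?: string; model: string; }

export function buildRequest(config: ProviderConfig, body: object) {
  const url = new URL(config.baseUrl.replace(/\/$/, '') + '/chat/completions');
  const payload = JSON.stringify(body);
  const headers: Record<string, string> = {
    'content-type': 'application/json',
    'content-length': String(Buffer.byteLength(payload)),
  };
  // Bearer auth only when a key is configured.
  if (config.apiKey) headers['authorization'] = `Bearer ${config.apiKey}`;
  return {
    hostname: url.hostname,
    port: url.port || (url.protocol === 'https:' ? 443 : 80),
    path: url.pathname,
    method: 'POST' as const,
    headers,
    payload,
  };
}
```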
-
-<!-- ai-xray:start -->
-## Project Context Protocol (powered by ai-xray)
+---

-
-npx ai-xray scout --budget=2000
+## 10. Test Plan

-
-
+**Framework:** vitest
+**Run:** `npx vitest run`

-
-
-
+### 10.1 `tests/client.test.ts` (5 tests)
+```
+✓ should construct correct request headers
+✓ should handle 401 unauthorized gracefully
+✓ should handle network timeout
+✓ should parse response JSON correctly
+✓ should measure latency
 ```

-
-```
-
-
-
+### 10.2 `tests/ping.test.ts` (4 tests)
+```
+✓ should report reachable=true for successful response
+✓ should report reachable=false for connection error
+✓ should extract rate limit headers
+✓ should measure latency_ms
 ```

-
-```
-
-
-
-
-
-}
+### 10.3 `tests/id.test.ts` (5 tests)
+```
+✓ should extract self-reported model name
+✓ should extract knowledge cutoff
+✓ should extract context window size
+✓ should extract API-reported model from response
+✓ should compute provider fingerprint with confidence score
 ```

-
+### 10.4 `tests/probe.test.ts` (7 tests)
+```
+✓ should detect JSON mode support (supported)
+✓ should detect JSON mode support (unsupported → graceful false)
+✓ should detect function calling support
+✓ should detect vision support
+✓ should detect streaming support
+✓ should detect system prompt support
+✓ should measure total probe duration
+```

-
+### 10.5 `tests/bench.test.ts` (5 tests)
+```
+✓ should run specified number of rounds
+✓ should compute mean correctly
+✓ should compute median correctly
+✓ should compute p95 correctly
+✓ should report output token totals
+```

-
+### 10.6 `tests/tokenize.test.ts` (4 tests)
+```
+✓ should count characters accurately
+✓ should count words accurately
+✓ should estimate tokens within ±20% of actual
+✓ should compute cost estimate for known models
+```

-
-
-
-
-
-
-
+### 10.7 `tests/compare.test.ts` (4 tests)
+```
+✓ should query all specified providers
+✓ should handle one provider failing gracefully
+✓ should return results in consistent format
+✓ should accept custom prompt
+```

-
+### 10.8 `tests/http.test.ts` (3 tests)
+```
+✓ should make HTTPS POST request
+✓ should handle redirect
+✓ should timeout after specified duration
+```

-
+**Total: ~37 test cases**

-
-|---|---|
-| **M1** | `env` + `tree` + `scout` (core engine) |
-| **M2** | `read` + `diff` |
-| **M3** | `init` (self-propagation engine) |
-| **M4** | `--budget` token budget system |
-| **M5** | tsup build → `ai-xray@1.0.0` release |
+---

-
+## 11. package.json (Final)

-
-
+```json
+{
+  "name": "ai-xray",
+  "version": "2.0.0",
+  "description": "X-ray your AI. Probe, benchmark, and fingerprint any LLM.",
+  "main": "dist/cli.js",
+  "bin": { "ai-xray": "dist/cli.js" },
+  "scripts": {
+    "dev": "tsup --watch",
+    "build": "tsup",
+    "test": "vitest run"
+  },
+  "keywords": ["ai","llm","benchmark","probe","diagnostics","model","agent"],
+  "author": "10iii <zh@ngxil.in> (https://zha.ngxil.in)",
+  "repository": { "type": "git", "url": "git+https://github.com/10iii/ai-xray.git" },
+  "homepage": "https://github.com/10iii/ai-xray#readme",
+  "license": "MIT",
+  "devDependencies": {
+    "@types/node": "^20.0.0",
+    "tsup": "^8.0.0",
+    "typescript": "^5.0.0",
+    "vitest": "^1.0.0"
+  }
+}
+```