@deepsql/mcp 0.8.0 → 0.10.1
- package/AGENT-SETUP.md +289 -0
- package/CLAUDE.md +330 -0
- package/package.json +3 -1
- package/src/auth/store.js +32 -1
- package/src/auth/store.test.js +22 -0
- package/src/cli.js +40 -5
- package/src/commands/_connections.js +26 -3
- package/src/commands/_connections.test.js +21 -4
- package/src/commands/_session.js +4 -1
- package/src/commands/access.js +0 -2
- package/src/commands/admin.test.js +37 -0
- package/src/commands/anti-patterns.js +0 -1
- package/src/commands/brain-context.js +27 -8
- package/src/commands/business-rules.js +0 -1
- package/src/commands/connections.js +579 -9
- package/src/commands/digest.js +3 -5
- package/src/commands/explain.js +0 -1
- package/src/commands/query.js +0 -1
- package/src/commands/relationships.js +0 -1
- package/src/commands/schema.js +0 -1
- package/src/commands/slow-queries.js +2 -3
- package/src/commands/whoami.js +11 -5
- package/src/connections/schema.js +213 -0
- package/src/connections/secrets.js +167 -0
- package/src/connections/secrets.test.js +151 -0
package/AGENT-SETUP.md
ADDED
@@ -0,0 +1,289 @@
+# DeepSQL — agent-driven setup
+
+> Paste this file's content into Claude Code, Codex CLI, Cursor, or any
+> agent capable of running shell commands. The agent will install, log in,
+> register your databases, and wire DeepSQL into your editor — typically in
+> under 5 minutes. Each step has a clear "exit on failure" so the agent
+> won't paper over real problems.
+
+You are setting up DeepSQL for the user. DeepSQL is a self-hosted AI database
+performance assistant. The goal of this conversation is to install the CLI,
+authorize it against the user's DeepSQL host, register one or more database
+connections, and wire the MCP integration into the user's editor.
+
+**Throughout setup, never echo passwords back to the user, never write
+secrets to a tracked file, and clean up tempfiles you create.** When you
+need credentials, ask one question at a time and stop typing them in
+clear text once they're in your head.
+
+---
+
+## Step 1 — Install the CLI
+
+```bash
+npm install -g @deepsql/mcp@latest
+deepsql --version
+```
+
+Confirm Node ≥ 20. If the install fails, ask the user about their npm
+permissions and recommend `npm config set prefix '~/.npm-global'` if it's
+an EACCES on `/usr/local`.
+
+---
+
+## Step 2 — Log in to DeepSQL
+
+Ask the user for their DeepSQL host URL (e.g., `https://deepsql.example.com`).
+Then:
+
+```bash
+deepsql login --url <host>
+```
+
+This opens a browser tab against the user's DeepSQL host. The user clicks
+**Approve** on the device-authorization page; the CLI receives a token and
+saves it to `~/.config/deepsql/auth.json` (mode 0600).
+
+If the user is on a remote/SSH box without a browser, fall back to:
+
+```bash
+deepsql login --url <host> --device
+```
+
+The CLI prints a code; the user opens the URL on their laptop and approves.
+
+Verify:
+
+```bash
+deepsql whoami
+```
+
+Should print the username, role, URL. If it doesn't, stop and surface the
+error to the user — don't proceed with setup.
+
+---
+
+## Step 3 — Inspect the connection config schema
+
+```bash
+deepsql connections schema --json
+```
+
+This prints the JSON Schema for the connection config. Use it as the
+contract for the next step. **Required fields:** `connectionName`, `dbType`
+(`postgres` | `mysql`), `host`, `port`, `database`, `username`, `password`.
+**Conditional fields:** `sshEnabled` triggers `sshHost`/`sshUsername` and
+either `sshPassword` or `sshPrivateKey`. `sslMode` (`server-only` |
+`server-client`) triggers SSL-cert fields.
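
As a rough illustration of the contract described in Step 3, the required and conditional fields could be pre-checked along these lines before calling the CLI. This is a hypothetical sketch and not the package's own validator: the function name `checkConnectionConfig` and the error strings are assumptions, and only the field names come from the text above. SSL-cert fields are left out because they are not named here.

```javascript
// Hypothetical pre-flight check; the real schema validation in the package
// may differ. Field names are taken from the Step 3 description.
const REQUIRED = [
  "connectionName", "dbType", "host", "port",
  "database", "username", "password",
];

function checkConnectionConfig(cfg) {
  const errors = [];
  // Every required field must be present and non-empty.
  for (const key of REQUIRED) {
    if (cfg[key] === undefined || cfg[key] === "") {
      errors.push(`${key}: required`);
    }
  }
  if (cfg.dbType !== undefined && !["postgres", "mysql"].includes(cfg.dbType)) {
    errors.push("dbType: must be postgres or mysql");
  }
  // sshEnabled pulls in the SSH host/user plus one credential.
  if (cfg.sshEnabled === true) {
    for (const key of ["sshHost", "sshUsername"]) {
      if (!cfg[key]) errors.push(`${key}: required when sshEnabled is true`);
    }
    if (!cfg.sshPassword && !cfg.sshPrivateKey) {
      errors.push("sshPassword or sshPrivateKey: one is required when sshEnabled is true");
    }
  }
  return errors;
}
```

An empty array means the config satisfies the fields listed above; the authoritative contract remains whatever `deepsql connections schema --json` prints.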
+
+---
+
+## Step 4 — Gather credentials
+
+Ask the user about each database they want to connect to. For each one:
+
+1. **Friendly name** (will be shown in the CLI everywhere)
+2. **Database engine** — `postgres` or `mysql`
+3. **Host** and **port** (defaults: 5432 for postgres, 3306 for mysql)
+4. **Database name** and **username**
+5. **Password** — collect securely; do not paste it back into chat
+6. **SSH bastion?** If yes: SSH host, port (22), user, key path or password
+7. **SSL?** If yes: which mode, and the cert paths if `server-client`
+8. **Cloud metadata** (optional but improves DeepSQL's tuning advice):
+   AWS / Azure / GCP / self-hosted, managed service (RDS / Aurora / etc.),
+   instance class, vCPU / memory / storage type / IOPS
+
+For each connection, write a tempfile with `mktemp`, set mode 0600, and
+keep it short-lived. **Never write the JSON to a path the user has open in
+their editor or that lives inside a git repo.** Use this exact pattern:
+
+```bash
+tmp=$(mktemp /tmp/deepsql-conn-XXXXXX.json)
+chmod 600 "$tmp"
+cat > "$tmp" <<'EOF'
+{
+  "connectionName": "prod-mysql",
+  "dbType": "mysql",
+  "host": "db.example.com",
+  "port": 3306,
+  "database": "myapp",
+  "username": "deepsql_reader",
+  "password": "REPLACE_BEFORE_RUNNING",
+
+  "sshEnabled": true,
+  "sshAuthType": "PRIVATE_KEY",
+  "sshHost": "bastion.example.com",
+  "sshPort": 22,
+  "sshUsername": "ec2-user",
+  "sshPrivateKey": "@file:~/.ssh/bastion_ed25519",
+
+  "sslMode": "server-only",
+
+  "cloudProvider": "aws",
+  "managedService": "rds",
+  "instanceClass": "db.r6g.xlarge",
+  "instanceVcpus": 4,
+  "instanceMemoryGb": 32.0,
+  "storageType": "gp3"
+}
+EOF
+```
+
+Notes on the secret refs supported in any string field:
+
+- `"$VAR_NAME"` — pulled from `process.env` at CLI runtime, never persisted.
+- `"@file:<path>"` — file contents read at runtime; mode 0600 enforced.
+- Plaintext is allowed but warns if the JSON file is in a git tree.
+
+The `mktemp` path is guaranteed to be outside any tracked directory, so the
+plaintext warning won't fire — and the file gets deleted next.
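
The secret-ref rules listed in Step 4 might be implemented roughly as follows. This is a hypothetical sketch and not the code in `src/connections/secrets.js`, which this diff does not show: the name `resolveSecretRef`, the `~/` expansion, and the exact permission test (rejecting any group or other bits) are assumptions made for illustration.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical resolver for the three documented forms:
// "@file:<path>", "$VAR_NAME", and plaintext.
function resolveSecretRef(value, env = process.env) {
  if (typeof value !== "string") return value;

  if (value.startsWith("@file:")) {
    let file = value.slice("@file:".length);
    // Assumed convenience: expand a leading "~/" to the home directory.
    if (file.startsWith("~/")) file = path.join(os.homedir(), file.slice(2));
    const mode = fs.statSync(file).mode & 0o777;
    // Assumed interpretation of "mode 0600 enforced": no group/other access.
    if ((mode & 0o077) !== 0) {
      throw new Error(`${file}: mode ${mode.toString(8)} is too open, expected 600`);
    }
    return fs.readFileSync(file, "utf8");
  }

  if (/^\$[A-Za-z_][A-Za-z0-9_]*$/.test(value)) {
    const name = value.slice(1);
    if (env[name] === undefined) {
      throw new Error(`environment variable ${name} is not set`);
    }
    return env[name];
  }

  return value; // plaintext passes through unchanged
}
```

Resolving at runtime in this way keeps the secret out of the JSON file itself, which is the property the notes above describe.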
+
+---
+
+## Step 5 — Test, save, wait
+
+Always test before saving:
+
+```bash
+deepsql connections test --from-file "$tmp"
+```
+
+The output is a privilege report: `✓` for granted privileges, `✗` for
+missing ones, plus `connectionSuccessful` and `sshTunnelSuccessful` flags.
+**Stop if `connectionSuccessful=false`.** Common causes:
+
+- Wrong host / port → user fixes the JSON
+- Bad SSH key path → check `@file:~/.ssh/...` permissions (must be 0600)
+- Bad password → user re-enters
+- Missing privileges → DeepSQL still saves (read-only privileges are
+  enough), but flag the warning to the user so they know which features
+  may be limited
+
+If the test passes, save it:
+
+```bash
+deepsql connections add --from-file "$tmp" --delete-after --wait
+```
+
+`--delete-after` removes the tempfile on success. `--wait` polls
+`GET /connections/{id}/init-status` until brain initialization completes
+(or fails). This can take a few minutes on large databases — DeepSQL is
+ingesting the schema and indexing it for retrieval. Don't proceed until
+this finishes successfully.
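
The `--wait` behaviour described in Step 5 is a poll-until-done loop with a cap. A minimal sketch of that pattern follows, under stated assumptions: `fetchStatus` is a caller-supplied stand-in for a request to `GET /connections/{id}/init-status`, the status strings `COMPLETED` and `FAILED` are invented for the example (the real response shape is not shown in this diff), and the 30-minute default mirrors the cap mentioned in the troubleshooting table.

```javascript
// Hypothetical polling loop; not the CLI's actual implementation.
async function waitForInit(
  fetchStatus,
  { intervalMs = 2000, timeoutMs = 30 * 60 * 1000 } = {},
) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const { status } = await fetchStatus();
    if (status === "COMPLETED") return status;
    if (status === "FAILED") {
      throw new Error("brain initialization failed");
    }
    // Give up if the next poll would land past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error("timed out waiting for brain initialization");
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A timeout here only stops the client from waiting; as the troubleshooting table notes, initialization keeps running on the backend.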
+
+---
+
+## Step 6 — Pin the primary connection
+
+```bash
+deepsql connections use <connectionName>
+```
+
+This sets the active default for the profile. Every subsequent command
+(`deepsql brain-context`, `deepsql query`, `deepsql digest`, etc.) uses it
+automatically — the user no longer has to pass `--connection`.
+
+If the user has more than one connection, ask which they want pinned. They
+can switch later with another `connections use`.
+
+---
+
+## Step 7 — Wire the MCP integration into the editor
+
+> *Coming in `0.11.0`: `deepsql mcp config --install --for cursor|claude-code|codex`.*
+> *Until then, use the manual steps below.*
+
+For **Claude Code**: edit `~/.claude/settings.json` and add an `mcpServers`
+entry:
+
+```json
+{
+  "mcpServers": {
+    "deepsql": {
+      "command": "deepsql",
+      "args": ["mcp"]
+    }
+  }
+}
+```
+
+For **Cursor**: edit `~/.cursor/mcp.json` with the same shape.
+
+For **Codex CLI**: see `~/.codex/config.toml` — a `[mcp_servers.deepsql]`
+section with `command = "deepsql"` and `args = ["mcp"]`.
+
+The CLI's saved auth token at `~/.config/deepsql/auth.json` is used
+automatically — no token needs to be embedded in the editor config.
+
+---
+
+## Step 8 — Validate end-to-end
+
+```bash
+deepsql brain-context "list a few tables on this database" --top-k 5 --json
+```
+
+You should see 5 ranked results pointing at real tables. If the response
+is empty or the pipeline returned `skipped: simple_schema_question`,
+re-run with a more semantic question:
+
+```bash
+deepsql brain-context "what is the primary fact table for orders" --top-k 5
+```
+
+If retrieval works, setup is complete. Tell the user that:
+
+- They can now use DeepSQL from their editor (the MCP tools `list_connections`,
+  `get_brain_context`, `execute_readonly_sql`, `explain_readonly_sql`,
+  `analyze_slow_queries`, etc. are all available there).
+- The companion file `CLAUDE.md` (in this same npm package, at
+  `node_modules/@deepsql/mcp/CLAUDE.md`) covers the day-to-day usage
+  patterns and common mistakes for editor agents.
+
+---
+
+## Troubleshooting
+
+| Symptom | Likely cause | Diagnose with |
+|---|---|---|
+| `deepsql login` opens browser but never returns | User closed the tab without clicking Approve | Re-run `deepsql login`. Tell the user to click **Approve** on the page. |
+| `whoami` shows wrong user | Stale cached profile from a prior install | `deepsql logout` then `deepsql login --url <host>` |
+| `connections test` fails: "DNS resolution" | Host typo or VPC-local hostname not reachable | Verify the host with `dig`/`nslookup`, or set up SSH tunnel |
+| `connections test` fails: "SSH tunnel connection failed" | Wrong SSH host, wrong key file path, or key file mode > 600 | `ssh -i ~/.ssh/<key> ec2-user@<host>` to validate the bastion manually |
+| `connections add` returns "Missing privileges: SELECT, ..." | DB user has insufficient grants | Save still succeeds; ask the user to grant the missing privileges |
+| `connections add --wait` polls forever | Brain init is genuinely running on a huge DB | Default cap is 30 min. Pass `--wait-timeout 60m` or just `Ctrl-C` (init keeps running on the backend) |
+| `brain-context` returns `skipped: simple_schema_question` | The question was too short / too schema-y | Ask a more semantic question, or pass `--top-k 5` to force ranked retrieval |
+| `connections add` errors before contacting the backend | JSON validation failed | Read the `path: message` lines in the error; fix the JSON; re-run |
+
+---
+
+## Idempotency / re-runs
+
+This entire flow is safe to re-run. `connections use` is idempotent. To
+update an existing connection:
+
+```bash
+deepsql connections add --from-file "$tmp" --upsert --delete-after --wait
+```
+
+`--upsert` does PUT instead of POST when a name collision exists; the
+backend's PATCH-style merge preserves any secrets you don't include in
+the new JSON.
+
+To remove a connection:
+
+```bash
+deepsql connections remove <connectionName> --yes
+```
+
+If it was the active default, the pin is cleared automatically.
+
+---
+
+## Reference
+
+- `deepsql --help` — full command list
+- `deepsql connections schema [--json]` — full input contract
+- `node_modules/@deepsql/mcp/CLAUDE.md` — runtime guidance for AI agents
+  using DeepSQL's MCP tools
+- `https://github.com/DeepSQLAI/dba-agent` — source repo (private)
package/CLAUDE.md
ADDED
@@ -0,0 +1,330 @@
+# DeepSQL — guidance for AI agents (Claude Code / Cursor / Codex / etc.)
+
+> Read this file before invoking DeepSQL tools. It exists because there are
+> **two surfaces** (MCP tools + CLI commands) that look similar and agents
+> tend to pick the wrong one or use the right one inefficiently.
+
+DeepSQL is an autonomous database performance assistant. It exposes itself
+to AI agents in two ways:
+
+| Surface | Where it lives | When you use it |
+|---|---|---|
+| **MCP tools** (programmatic) | The stdio server you connected via `deepsql mcp` | Default. You're an MCP client and these are first-class tool calls. |
+| **CLI** (`deepsql` binary) | The user's `$PATH`, invoked via Bash | Only when an MCP tool can't do it (admin ops, auth, multi-step user flows) **or** the user explicitly asked you to "run `deepsql ...`". |
+
+If you have both available, **prefer MCP tools.** They're structured, typed,
+faster, and don't depend on the user's shell environment.
+
+---
+
+## Decision tree — "I want to..."
+
+```
+1. Find which databases I can query
+   → list_connections (returns id, name, type for each)
+
+2. Understand a database's structure
+   → get_brain_context (RAG retrieval — best when you have a
+     natural-language question; returns ranked
+     tables/columns/FKs/docs/business rules)
+   → get_schema (full deterministic schema dump — when you
+     need every table; expensive on large DBs)
+   → get_database_objects (tables/views/functions/procedures only)
+
+3. Answer a business question about data
+   → get_brain_context first (retrieves what tables hold what, FK edges,
+     business rules; gives you grounded context)
+   → then construct SQL yourself, then:
+   → explain_readonly_sql (validate the plan)
+   → execute_readonly_sql (run it)
+
+4. Find inferred relationships between tables
+   → get_relationships (returns FK candidates with confidence scores)
+
+5. Read business rules / data-access policies
+   → list_business_rules (active rules + guardrails for a connection;
+     pass `question` to scope to relevant ones)
+
+6. Find anti-patterns
+   → get_anti_patterns kind=table (schema/structural anti-patterns)
+   → get_anti_patterns kind=query (slow/expensive query patterns)
+
+7. Investigate slow queries
+   → analyze_slow_queries (recent slow queries with fingerprints + ms)
+
+8. Run SQL to inspect data
+   → execute_readonly_sql (read-only — backend rejects mutations)
+```
+
+**Rule of thumb for question-answering:** start with `get_brain_context`,
+not `execute_readonly_sql`. The brain context tells you which tables matter,
+their FKs, what the columns mean, and what business rules apply. Skipping
+straight to SQL is how you write queries against the wrong tables.
+
+---
+
+## MCP tool reference (10 tools)
+
+Every tool requires a `connectionId` (string UUID) **except** `list_connections`.
+Always call `list_connections` first if you don't already know the ID.
+
+### Discovery
+
+#### `list_connections`
+- **Args:** none
+- **Returns:** array of `{ id, connectionName, databaseType, ... }`
+- **Use when:** the user mentions a DB by name and you need its ID, or you
+  don't know which DBs are available.
+
+#### `get_schema(connectionId)`
+- **Returns:** the cached schema metadata for the whole DB.
+- **Use when:** you need an exhaustive listing. **Avoid** when the DB is
+  large (hundreds of tables) — `get_brain_context` ranks the relevant
+  subset much faster.
+
+#### `get_database_objects(connectionId)`
+- **Returns:** tables, views, functions, procedures.
+- **Use when:** the user asks "what views/functions exist?" — narrower
+  than `get_schema`.
+
+### RAG / brain retrieval (preferred for question-answering)
+
+#### `get_brain_context(connectionId, question, topK?)`
+- **Args:**
+  - `question` — natural-language question used for retrieval ranking
+  - `topK` (optional, 1–100) — when provided, returns ranked diagnostic
+    snippets (good for "show me the top 5 most relevant tables"). When
+    omitted, returns the rich training-context payload (tables + columns
+    + FKs + business rules + docs assembled for prompt-grounding).
+- **Use when:** the user asks any analytical question. This is the cheapest
+  way to ground yourself before generating SQL.
+- **Output:** typically includes `trainingContext` (text block ready to feed
+  into your own context window) plus structured ranked results.
+
+#### `list_business_rules(connectionId, question?)`
+- **Returns:** `activeRules` + `applicableGuardrails` + `guardrailContext`.
+- **Use when:** before generating SQL that touches sensitive entities. If
+  the rules say "PII columns are blocked," respect that in your output.
+  Pass `question` to filter to rules applicable to the user's intent.
+
+#### `get_relationships(connectionId)`
+- **Returns:** array of `{ sourceTable, sourceColumn, targetTable, targetColumn, confidence, inferenceMethod, validationStatus }`.
+- **Use when:** writing JOINs and the actual FK constraint isn't declared
+  in the schema. Anything `confidence >= 0.8` is safe; lower confidence
+  means inferred from naming patterns or data — verify with the user.
+
+#### `get_anti_patterns(connectionId, kind?, limit?)`
+- **`kind="table"` (default):** schema-level anti-patterns (missing
+  indexes, wide tables, etc.).
+- **`kind="query":** query-level patterns; pass `limit` (1–500).
+- **Use when:** the user asks "what's wrong with this DB?" or you've
+  generated a query and want to sanity-check it.
+
+### Operations
+
+#### `analyze_slow_queries(connectionId, thresholdMs?, limit?)`
+- **Args:** `thresholdMs` defaults 100, `limit` defaults 10.
+- **Returns:** recent slow queries from `pg_stat_statements` with
+  fingerprints, durations, example statements.
+- **Use when:** the user asks "what's slow?" or you're triaging a
+  performance incident.
+
+### Execution
+
+#### `execute_readonly_sql(connectionId, query, limit?, timeoutSeconds?)`
+- **Read-only enforced at four layers:** client SQL parser, backend SQL
+  parser, per-connection ACL on the calling user's token, and the DB
+  role itself usually only has SELECT/EXPLAIN. Mutations (INSERT, UPDATE,
+  DELETE, DDL, etc.) are rejected — **don't try to work around this**.
+- **Multi-statement SQL is rejected** in phase 1. Send one statement.
+- **Defaults:** 100-row `limit`, backend default `timeoutSeconds`.
+- **Use when:** you've grounded yourself with `get_brain_context` and need
+  to fetch concrete numbers.
+
+#### `explain_readonly_sql(connectionId, query)`
+- **Don't include `EXPLAIN` in the query string** — the tool wraps it.
+  `ANALYZE` is also rejected (read-only).
+- **Use when:** you want to validate a plan before running it, or you're
+  diagnosing why a query is slow.
+
+---
+
+## CLI commands (run via Bash, only when MCP isn't enough)
+
+The CLI exposes the same data plane plus admin operations the MCP server
+deliberately doesn't expose. **Only run CLI commands when the user explicitly
+asks you to**, or when an MCP tool can't do what's needed (admin, auth,
+multi-step flows).
+
+### Quick reference
+
+```
+# Auth (the user typically did this once; don't re-run unless asked)
+deepsql login --url https://<host>
+deepsql whoami
+deepsql logout
+
+# Connections — the human's "active DB" pin (CLI-only; MCP tools don't read this)
+deepsql connections list # marks active with *
+deepsql connections use <name> # pin
+deepsql connections current # show pinned
+deepsql connections unset
+
+# Read-only data ops (mirror MCP tools — same backend, same guardrails)
+deepsql query "SELECT ..." --connection <name>
+deepsql explain "SELECT ..." --connection <name>
+deepsql schema [tables|objects] --connection <name>
+
+# Brain / RAG (mirror the MCP brain tools)
+deepsql brain-context "<question>" --connection <name> [--top-k N]
+deepsql business-rules --connection <name> [--question "..."]
+deepsql relationships --connection <name>
+deepsql anti-patterns --connection <name> [--kind table|query] [--limit N]
+
+# Slack daily digest
+deepsql digest [N] --connection <name>
+
+# Slow-query operations
+deepsql slow-queries latest --connection <name>
+deepsql slow-queries history --connection <name> [N]
+deepsql slow-queries analyze --connection <name>
+deepsql slow-queries optimize --connection <name> --query-id <id> # SSE stream
+
+# Admin (require ADMIN role)
+deepsql users list | get | add | set-role | lock | unlock | disable | delete
+deepsql access list | grant | revoke | policy
+deepsql permissions list | override | reset
+deepsql setup [--skip-email] [--skip-slack] # post-install wizard
+```
+
+### When CLI is the right call (vs MCP)
+
+- The user said "run `deepsql ...`" or "use the CLI."
+- The operation is admin (`users`, `access`, `permissions`, `setup`) — these
+  aren't exposed via MCP intentionally.
+- The user wants Slack digest content (`digest`).
+- You're in a script context where structured stdin/stdout is preferable.
+
+### When CLI is the **wrong** call
+
+- For everything in the decision tree above. The MCP equivalents are
+  faster and don't depend on the user's `$PATH`, env, or saved auth.
+- For executing SQL the user is paying you to write — use
+  `execute_readonly_sql`, not `Bash("deepsql query ...")`.
+
+---
+
+## Common mistakes — and how to avoid them
+
+### ❌ Generating SQL without retrieving brain context first
+The user asks: *"How many active customers do we have?"*
+
+**Wrong:** call `execute_readonly_sql("SELECT COUNT(*) FROM customers WHERE active = true")` —
+guesses at the table name (`customers` vs `dim_customer` vs `users`),
+guesses at the column (`active` vs `is_active` vs `status='ACTIVE'`),
+ignores any business rule that defines what "active" means.
+
+**Right:**
+1. `get_brain_context(connectionId, "how many active customers")` — returns the
+   right table (`dim_customer`) and the column convention.
+2. `list_business_rules(connectionId, "active customers")` — returns the rule
+   if "active" has a workspace-specific definition.
+3. Generate SQL using the names + rules from #1 and #2.
+4. `explain_readonly_sql(...)` — sanity check.
+5. `execute_readonly_sql(...)` — run it.
+
+### ❌ Calling `get_schema` on every analysis question
+`get_schema` returns the entire DB. On a 200-table OLAP warehouse that's a
+huge response and most of it is irrelevant to the question. **Use
+`get_brain_context` for question-scoped retrieval.** Reserve `get_schema`
+for exhaustive listing tasks.
+
+### ❌ Trying to mutate data
+Every execution path is read-only. INSERT/UPDATE/DELETE/CREATE/DROP/ALTER/
+TRUNCATE are all rejected at the SQL parser layer. If the user asks for a
+mutation, **stop and tell them DeepSQL is read-only**, then offer to draft
+the SQL for them to run themselves.
+
+### ❌ Forgetting the connectionId
+Every tool except `list_connections` requires it. If the user mentions a DB
+by name (e.g., "look at prod-replica"), call `list_connections` first to
+resolve the name → UUID. Don't guess.
+
+### ❌ Re-fetching context on every turn
+Schema and brain context don't change minute-to-minute. If you already
+called `get_brain_context` for a related question this conversation, reuse
+the result. Don't re-call unless the question has shifted topics.
+
+### ❌ Mixing CLI invocations and MCP tool calls in the same session
+Pick one. If you have MCP available, stay in MCP. If you only have Bash,
+use the CLI. Mixing forces the user to debug two surfaces.
+
+### ❌ Calling `analyze_slow_queries` and immediately querying the slow-query log table directly
+The MCP tool already does the right query against `pg_stat_statements` (or
+the equivalent for MySQL) with the right thresholds. Don't reinvent it.
+
+---
+
+## Output handling tips
+
+- **`get_brain_context` returns a `trainingContext` text block.** It's
+  designed to drop into your prompt as-is. Don't summarize it before
+  generating SQL — let the structured names flow through.
+- **`execute_readonly_sql` returns `{ result: { columns, rows, rowCount, totalRowCount, isLimited, ... }, success, queryType }`.**
+  `rows` is array-of-arrays (column-positional), not array-of-objects. The
+  CLI's `query` command renders this; if you're consuming the structured
+  response yourself, zip `columns` and `rows[i]` to get an object.
+- **`explain_readonly_sql` returns the plan as JSON.** Postgres-style
+  textual EXPLAIN is in `plan` if available; structured form may be
+  alongside.
+- **`analyze_slow_queries` returns slow queries with fingerprints, not raw
+  SQL.** Fingerprints are normalized (`?` for literals). Use the
+  `queryId` to feed back into `optimize` flows.
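
The "zip `columns` and `rows[i]`" tip for `execute_readonly_sql` can be sketched as below. This is only an illustration: the helper name `rowsToObjects` is made up, and it assumes `columns` is an array of plain column-name strings, which the response shape quoted above does not spell out.

```javascript
// Hypothetical helper: turn column-positional rows into keyed objects.
// Assumes result = { columns: ["id", ...], rows: [[1, ...], ...] }.
function rowsToObjects(result) {
  return result.rows.map((row) =>
    Object.fromEntries(result.columns.map((name, i) => [name, row[i]])),
  );
}

// Example with a made-up two-row result.
const sample = { columns: ["id", "name"], rows: [[1, "a"], [2, "b"]] };
console.log(rowsToObjects(sample));
```

If `columns` turns out to carry objects (for example with a name and a type), the mapping would need to read the name property instead.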
+
+---
+
+## Multi-database situations
+
+DeepSQL doesn't support cross-connection JOINs at the SQL layer. If the user
+asks a question that spans DBs:
+
+1. Call `list_connections` to enumerate.
+2. For each relevant DB, call `get_brain_context` and/or `execute_readonly_sql`.
+3. Combine the results in your reasoning, not in SQL.
+
+The CLI's "active connection" pin (`deepsql connections use`) is **not** read
+by MCP tools — it only saves typing for human CLI users. As an MCP client,
+always pass `connectionId` explicitly per call.
+
+---
+
+## Authentication & security model
+
+You don't need to manage auth — the MCP server was launched with a saved
+token from `~/.config/deepsql/auth.json`. Every tool call carries that
+token. The token is bound to a specific user identity:
+
+- The user's role and per-connection ACLs are enforced **server-side**.
+  If you call a tool and get an authorization error, surface it to the
+  user — don't retry with different parameters.
+- The user may have **chat-access policies** (plain-English rules
+  attached to a connection). The brain context already reflects them; if
+  a query you generate triggers a policy violation, the backend rejects
+  it. Trust the rejection and ask the user how to proceed.
+- **Read-only is enforced at four independent layers** (client parser,
+  backend parser, per-connection ACL, DB role). Don't try to bypass any of
+  them — each rejection is a real signal that the operation isn't safe.
+
+---
+
+## When in doubt
+
+1. Call `list_connections` first if you don't have a connectionId.
+2. Call `get_brain_context` second if you have a question.
+3. Generate your SQL using the names and rules from those calls.
+4. Call `explain_readonly_sql` if performance matters.
+5. Call `execute_readonly_sql` last.
+
+That five-step flow handles 80% of legitimate analytical workloads. Anything
+that doesn't fit this pattern probably warrants asking the user a
+clarifying question instead of guessing.
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@deepsql/mcp",
-  "version": "0.
+  "version": "0.10.1",
   "description": "DeepSQL CLI and stdio MCP server for self-hosted deployments",
   "bin": {
     "deepsql": "./bin/deepsql.js",
@@ -9,6 +9,8 @@
   "main": "./deepsql-phase1-server.js",
   "files": [
     "README.md",
+    "CLAUDE.md",
+    "AGENT-SETUP.md",
     "bin",
     "src",
     "deepsql-phase1-server.js",
package/src/auth/store.js
CHANGED
@@ -11,10 +11,18 @@
  * {
  *   "default": "http://localhost:8080",
  *   "profiles": {
- *     "<base-url>": {
+ *     "<base-url>": {
+ *       token, username, tokenId, createdAt,
+ *       defaultConnection?: "<connection-name-or-uuid>"
+ *     }
  *   }
  * }
  *
+ * `defaultConnection` is the active connection for commands that need one
+ * (query/explain/schema/digest/slow-queries/brain-*). Set via
+ * `deepsql connections use <name>`. Resolution order in commands:
+ * --connection flag → DEEPSQL_CONNECTION env → profile.defaultConnection
+ *
  * The file is written with mode 0600 and the parent dir with 0700. We refuse to
  * read a file with looser perms unless DEEPSQL_INSECURE_AUTH=1 is set, since
  * tokens grant access to the user's databases.
@@ -133,6 +141,27 @@ function defaultBaseUrl() {
   return state.default || null;
 }
 
+function getDefaultConnection(baseUrl) {
+  const profile = getProfile(baseUrl);
+  return profile && profile.defaultConnection ? profile.defaultConnection : null;
+}
+
+function setDefaultConnection(baseUrl, connectionName) {
+  const state = load();
+  const key = normalizeBaseUrl(baseUrl);
+  if (!state.profiles[key]) {
+    throw new Error(
+      `No profile saved for ${key}. Run \`deepsql login --url ${key}\` first.`,
+    );
+  }
+  if (connectionName == null || connectionName === "") {
+    delete state.profiles[key].defaultConnection;
+  } else {
+    state.profiles[key].defaultConnection = connectionName;
+  }
+  save(state);
+}
+
 function listProfiles() {
   return load();
 }
@@ -141,6 +170,7 @@ module.exports = {
   authFilePath,
   configDir,
   defaultBaseUrl,
+  getDefaultConnection,
   getProfile,
   listProfiles,
   load,
@@ -148,5 +178,6 @@ module.exports = {
   removeProfile,
   save,
   setDefault,
+  setDefaultConnection,
   setProfile,
 };
|