@datasynx/agentic-ai-cartography 1.1.1 → 2.2.0

package/llms-full.txt ADDED
@@ -0,0 +1,758 @@
1
+ # @datasynx/agentic-ai-cartography — full documentation
2
+
3
+ > MCP-first infrastructure & agentic-AI cartography — install once, every AI agent knows your system landscape. Read-only discovery exposed over the Model Context Protocol.
4
+
5
+
6
+ ---
7
+
8
+ <!-- source: docs/tutorials/index.md -->
9
+
10
+ # Tutorial: from zero to an agent that knows your system
11
+
12
+ A guided, first-run walkthrough. By the end you'll have discovered your local
13
+ landscape and queried it from an AI client.
14
+
15
+ ## 1. Discover (read-only, no LLM required)
16
+
17
+ ```bash
18
+ npx -y --package @datasynx/agentic-ai-cartography datasynx-cartography discover
19
+ ```
20
+
21
+ This scans your machine deterministically — installed apps, listening ports,
22
+ browser bookmarks — and writes a catalog. Nothing leaves your machine.
23
+
24
+ ## 2. Run the MCP server
25
+
26
+ ```bash
27
+ npx -y --package @datasynx/agentic-ai-cartography cartography-mcp
28
+ ```
29
+
30
+ The server speaks the Model Context Protocol over stdio.
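To make "over stdio" concrete: MCP messages are JSON-RPC 2.0 objects, and the stdio transport carries each one as a single newline-terminated line. The sketch below only builds the request a client would write to the server's stdin to read the summary resource. The helper name `frame_request` and the request id are illustrative assumptions, and a real client has to complete the `initialize` handshake before sending it.

```python
import json


def frame_request(request_id: int, method: str, params: dict) -> bytes:
    """Encode one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


# Hypothetical request id; the URI is the summary resource named in this tutorial.
line = frame_request(2, "resources/read", {"uri": "cartography://graph/summary"})
print(line.decode("utf-8"), end="")
```

In practice an MCP client library does this framing for you; the sketch is only meant to show what travels over the pipe.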
31
+
32
+ ## 3. Connect a client
33
+
34
+ Let the harness write the config for you:
35
+
36
+ ```bash
37
+ datasynx-cartography install --client claude-code
38
+ ```
39
+
40
+ Restart the host, then ask it: *"Read cartography://graph/summary and describe my system."*
41
+
42
+ Next: the [How-to guides](/how-to/) for specific tasks, or the [Reference](/reference/).
43
+
44
+
45
+ ---
46
+
47
+ <!-- source: docs/how-to/install.md -->
48
+
49
+ # How to install Cartography into a client
50
+
51
+ ## Claude Code — one-step plugin (recommended)
52
+
53
+ Cartography ships as a Claude Code plugin in the shared Datasynx marketplace, so
54
+ no manual config editing is needed:
55
+
56
+ ```text
57
+ /plugin marketplace add datasynx/claude-plugins
58
+ /plugin install cartography@datasynx
59
+ ```
60
+
61
+ Verify the server is live with `/mcp`. This is the same flow as the
62
+ [`shadowing`](https://github.com/datasynx/agentic-ai-shadowing) plugin; the
63
+ plugin manifest lives in [`plugin/`](https://github.com/datasynx/agentic-ai-cartography/tree/main/plugin)
64
+ of this repository.
65
+
66
+ ## Every other host — the `install` harness
67
+
68
+ The `install` command parses your host's existing config and merges in the
69
+ Cartography MCP server **without clobbering** your other servers.
70
+
71
+ ```bash
72
+ datasynx-cartography list-clients # see supported hosts
73
+ datasynx-cartography install --client <id> # write the config
74
+ datasynx-cartography install --client <id> --dry-run # preview the merge diff
75
+ ```
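The "merge without clobbering" behaviour can be pictured as follows. This is a minimal Python sketch of the idea, not the harness's actual code: `merge_server` and the pre-existing `github` entry are hypothetical, and only the `cartography` entry mirrors what the docs show.

```python
import copy


def merge_server(config: dict, name: str, entry: dict, key: str = "mcpServers") -> dict:
    """Return a copy of `config` with `entry` registered under config[key][name]."""
    merged = copy.deepcopy(config)      # never mutate the caller's config
    servers = merged.setdefault(key, {})
    servers[name] = entry               # only this one key is added or replaced
    return merged


# Hypothetical existing host config with an unrelated server and setting.
existing = {"mcpServers": {"github": {"command": "gh-mcp"}}, "theme": "dark"}
entry = {
    "command": "npx",
    "args": ["-y", "--package", "@datasynx/agentic-ai-cartography", "cartography-mcp"],
}
merged = merge_server(existing, "cartography", entry)
print(sorted(merged["mcpServers"]))
```

For a host such as VS Code, which uses the `servers` key instead of `mcpServers`, the same sketch would be called with `key="servers"`.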
76
+
77
+ ## Scopes
78
+
79
+ - `--global` (default) — your user-level config.
80
+ - `--project` — a project-local config (e.g. `.mcp.json`, `.vscode/mcp.json`).
81
+
82
+ ## Options
83
+
84
+ | Flag | Purpose |
85
+ | --- | --- |
86
+ | `--dry-run` | Print the merge diff; write nothing. |
87
+ | `--name <server>` | Server key to register (default `cartography`). |
88
+ | `--http` / `--url <url>` | Register the Streamable HTTP endpoint instead of stdio. |
89
+ | `--db <path>` | Serve a specific catalog. |
90
+ | `--session <id>` | Serve a specific discovery session. |
91
+ | `--deeplink` | Print a one-click Cursor/VS Code install link instead of writing. |
92
+
93
+ ## One-click deeplinks
94
+
95
+ ```bash
96
+ datasynx-cartography install --client cursor --deeplink
97
+ datasynx-cartography install --client vscode --deeplink
98
+ ```
99
+
100
+ See the full host matrix in the [Reference → Supported clients](/reference/clients).
101
+
102
+
103
+ ---
104
+
105
+ <!-- source: docs/reference/mcp.md -->
106
+
107
+ # MCP tools & resources
108
+
109
+ The Cartography MCP server exposes read-only **resources**, query **tools** and
110
+ reusable **prompts**.
111
+
112
+ ## Resources
113
+
114
+ | URI | Description |
115
+ | --- | --- |
116
+ | `cartography://graph/summary` | Low-token aggregate index — read this first. |
117
+ | `cartography://nodes` | Lightweight list of all nodes. |
118
+ | `cartography://nodes/{id}` | Full node record plus incident edges. |
119
+ | `cartography://services` | Service-type nodes. |
120
+ | `cartography://databases` | Data-store nodes. |
121
+ | `cartography://dependencies/{id}` | Transitive downstream dependencies. |
122
+ | `cartography://sessions` | Discovery sessions in the catalog. |
123
+
124
+ ## Tools
125
+
126
+ <!-- AUTO-GENERATED:tools START — regenerated by `npm run docs:tables` -->
127
+ | Tool | Read-only | Description |
128
+ | --- | --- | --- |
129
+ | `classify_drift` | ✅ | Compare two discovery sessions and return a severity-classified drift alert (`info`, `warning`, or `critical` per item, plus an overall severity). Defaults to the two most recent sessions. Read-only: never dispatches to sinks. |
130
+ | `diff_topology` | ✅ | Compare two discovery sessions and report added/removed/changed nodes and added/removed edges, plus newly-appearing structural anomalies (3.6). Defaults to the two most recent sessions (base = second-most-recent, current = most-recent). |
131
+ | `get_activity_events` | ✅ | Recent executed tool calls and their result sizes for the current session. |
132
+ | `get_cost_summary` | ✅ | FinOps rollup: cost by domain and owner, currency/period-bucketed (3.3). |
133
+ | `get_dependencies` | ✅ | Traverse the dependency graph from a node (downstream/upstream/both) with a depth limit. |
134
+ | `get_node` | ✅ | Fetch a single node with its incident edges. |
135
+ | `get_summary` | ✅ | Low-token overview of the whole landscape (counts, types, domains, most-connected, anomalies). |
136
+ | `list_services` | ✅ | List discovered services or data stores. |
137
+ | `query_infrastructure` | ✅ | Search the topology by name/id/domain (optionally filtered by node type). Returns compact node records. |
138
+ | `query_natural_language` | ✅ | Answer a plain-English topology question (e.g. "services that depend on the payments DB"). Deterministically parses the question into a structured intent, then anchors via search and traverses dependencies, applying any node-type filter to the results. Echoes the parsed intent for explainability. Read-only, LLM-free. |
139
+ | `run_discovery` | — | Scan the local system (read-only) and update the catalog. Returns counts of nodes/edges found. Pass `update: true` to rescan the served session in place and return the delta (2.1 incremental discovery). |
140
+ | `score_compliance` | ✅ | Grade the served session against a compliance ruleset (baseline/cis/soc2/iso27001 starter sets) and list gaps with the node ids that caused them. Read-only; never throws. |
141
+ | `search_topology` | ✅ | Find nodes related to a concept by meaning (semantic search when available, lexical otherwise). |
142
+ <!-- AUTO-GENERATED:tools END -->
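To illustrate what a depth-limited traversal such as `get_dependencies` does, here is a small Python sketch over an in-memory edge list. It is an assumption-laden illustration, not the server's implementation: the function body, the `(source, target)` edge shape (read as "source depends on target"), and the sample node names are all hypothetical.

```python
from collections import deque


def get_dependencies(edges, start, direction="downstream", max_depth=2):
    """Breadth-first traversal over (source, target) edges, bounded by max_depth."""
    neighbours = {}
    for source, target in edges:
        if direction in ("downstream", "both"):
            neighbours.setdefault(source, set()).add(target)
        if direction in ("upstream", "both"):
            neighbours.setdefault(target, set()).add(source)

    depth_of = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth_of[node] == max_depth:
            continue  # do not expand past the depth limit
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in depth_of:
                depth_of[nxt] = depth_of[node] + 1
                queue.append(nxt)
    depth_of.pop(start)  # report only the nodes reached, not the anchor itself
    return depth_of


edges = [("web", "api"), ("api", "payments-db"), ("api", "cache"), ("batch", "payments-db")]
print(get_dependencies(edges, "web", max_depth=1))
print(get_dependencies(edges, "payments-db", direction="upstream"))
```

The depth limit is what keeps a response bounded on a large graph: the caller decides how far to walk instead of receiving the full transitive closure.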
143
+
144
+ ## Prompts
145
+
146
+ | Prompt | Description |
147
+ | --- | --- |
148
+ | `audit-attack-surface` | Review externally-reachable services and risky dependencies. |
149
+ | `map-service-dependencies` | Produce a dependency map for a given service. |
150
+ | `onboard-to-system` | Explain the system landscape to a new engineer. |
151
+
152
+
153
+ ---
154
+
155
+ <!-- source: docs/reference/cli.md -->
156
+
157
+ # CLI reference
158
+
159
+ Two binaries are provided: `datasynx-cartography <command>` (the discovery/management CLI) and
160
+ `cartography-mcp` (the MCP server binary).
161
+
162
+ | Command | Purpose |
163
+ | --- | --- |
164
+ | `discover` | Scan and map your infrastructure (`--output-format text\|json\|stream-json`, `--name <name>`). |
165
+ | `diff [base] [current]` | Compare two sessions for drift (`--format text\|json\|mermaid`). |
166
+ | `schedule --config <file>` | Run discovery on a schedule and record per-run drift (`--once` / `--watch`, config-file driven). |
167
+ | `seed` | Manually add known tools/DBs/APIs. |
168
+ | `install --client <id>` | Register the MCP server into a host's config. |
169
+ | `list-clients` | List supported hosts. |
170
+ | `mcp` | Run the MCP server (stdio by default; `--http` for Streamable HTTP). |
171
+ | `export [session]` | Export Mermaid / JSON / YAML / HTML. |
172
+ | `show [session]` | Show session details. |
173
+ | `sessions` | List all sessions. |
174
+ | `overview` | Aggregate overview across sessions. |
175
+ | `bookmarks` | View browser bookmarks. |
176
+ | `doctor` | Check requirements (kubectl, aws, gcloud, az). |
177
+ | `prune` | Remove old sessions. |
178
+ | `docs` | Full in-terminal feature reference. |
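The comparison behind `diff` (and the `diff_topology` tool) reduces to set differences between two snapshots. The sketch below assumes a simplified data shape, nodes as `{id: attributes}` and edges as sets of `(source, target)` pairs, which is an illustration rather than the catalog's real schema.

```python
def diff_topology(base_nodes, current_nodes, base_edges, current_edges):
    """Compare two snapshots and report added, removed, and changed items."""
    shared = base_nodes.keys() & current_nodes.keys()
    return {
        "added_nodes": sorted(current_nodes.keys() - base_nodes.keys()),
        "removed_nodes": sorted(base_nodes.keys() - current_nodes.keys()),
        "changed_nodes": sorted(n for n in shared if base_nodes[n] != current_nodes[n]),
        "added_edges": sorted(current_edges - base_edges),
        "removed_edges": sorted(base_edges - current_edges),
    }


# Hypothetical sessions: the API changed port, the cache was replaced by a database.
base_nodes = {"api": {"port": 8080}, "cache": {"port": 6379}}
current_nodes = {"api": {"port": 8443}, "payments-db": {"port": 5432}}
report = diff_topology(
    base_nodes, current_nodes, {("api", "cache")}, {("api", "payments-db")}
)
print(report)
```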
179
+
180
+ ## `mcp` flags
181
+
182
+ | Flag | Default | Purpose |
183
+ | --- | --- | --- |
184
+ | `--http` | off | Use Streamable HTTP instead of stdio. |
185
+ | `--port <n>` | `3737` | HTTP port. |
186
+ | `--host <h>` | `127.0.0.1` | HTTP host. |
187
+ | `--allowed-hosts <list>` | — | Host allowlist (required for non-loopback `--host`). |
188
+ | `--db <path>` | default catalog | Catalog to serve. |
189
+ | `--session <id>` | `latest` | Session to serve. |
190
+ | `--no-semantic` | — | Disable semantic (vector) search. |
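The rule that `--allowed-hosts` is required for a non-loopback `--host` can be expressed as a small pre-flight check. This is a hedged Python sketch of that rule only; `validate_bind` is a hypothetical name, and treating unresolvable hostnames as non-loopback is an assumption made here to stay on the safe side.

```python
import ipaddress


def validate_bind(host: str, allowed_hosts: list[str]) -> None:
    """Raise ValueError when a non-loopback bind has no Host allowlist."""
    if host == "localhost":
        is_loopback = True
    else:
        try:
            is_loopback = ipaddress.ip_address(host).is_loopback
        except ValueError:
            is_loopback = False  # not an IP literal: assume it is reachable remotely
    if not is_loopback and not allowed_hosts:
        raise ValueError(f"--host {host} is not loopback; --allowed-hosts is required")


validate_bind("127.0.0.1", [])                        # default bind: accepted
validate_bind("0.0.0.0", ["map.example.internal"])    # hypothetical allowlisted name
```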
191
+
192
+
193
+ ---
194
+
195
+ <!-- source: docs/reference/clients.md -->
196
+
197
+ # Supported clients
198
+
199
+ The `install` harness writes the correct config for each host. Run
200
+ `datasynx-cartography list-clients` for the live list.
201
+
202
+ <!-- AUTO-GENERATED:clients START — regenerated by `npm run docs:tables` -->
203
+ | id | Host | Format | Notes |
204
+ | --- | --- | --- | --- |
205
+ | `claude-code` | Claude Code | json | |
206
+ | `cursor` | Cursor | json | |
207
+ | `vscode` | VS Code (Copilot) | json | Uses the `servers` key (not `mcpServers`) — the most common copy-paste mistake. |
208
+ | `codex` | Codex CLI | toml | Project scope only loads in "trusted" projects. |
209
+ | `windsurf` | Windsurf | json | |
210
+ | `cline` | Cline | json | |
211
+ | `roo` | Roo Code | json | Project .roo/mcp.json takes precedence over the global settings. |
212
+ | `zed` | Zed | json | Manual servers need "source": "custom"; remote uses an mcp-remote bridge. |
213
+ | `junie` | JetBrains / Junie | json | |
214
+ | `gemini` | Gemini CLI | json | |
215
+ | `goose` | Goose | yaml | Verify the extension shape against current Goose docs; built-ins are left untouched. |
216
+ | `openhands` | OpenHands | toml | SHTTP is preferred; SSE is legacy. Only api_key is supported (no arbitrary headers). |
217
+ | `claude-desktop` | Claude Desktop | json | One-click install is also available via the .mcpb bundle (npm run build:mcpb). |
218
+ <!-- AUTO-GENERATED:clients END -->
219
+
220
+ ## Copy-paste config per host
221
+
222
+ The exact entry the `install` harness writes for each host (global scope shown).
223
+
224
+ <!-- AUTO-GENERATED:quickstarts START — regenerated by `npm run docs:tables` -->
225
+ ### Claude Code (`claude-code`)
226
+
227
+ `~/.claude.json`
228
+
229
+ ```json
230
+ {
231
+ "mcpServers": {
232
+ "cartography": {
233
+ "command": "npx",
234
+ "args": [
235
+ "-y",
236
+ "--package",
237
+ "@datasynx/agentic-ai-cartography",
238
+ "cartography-mcp"
239
+ ]
240
+ }
241
+ }
242
+ }
243
+ ```
244
+
245
+ ### Cursor (`cursor`)
246
+
247
+ `~/.cursor/mcp.json`
248
+
249
+ ```json
250
+ {
251
+ "mcpServers": {
252
+ "cartography": {
253
+ "command": "npx",
254
+ "args": [
255
+ "-y",
256
+ "--package",
257
+ "@datasynx/agentic-ai-cartography",
258
+ "cartography-mcp"
259
+ ]
260
+ }
261
+ }
262
+ }
263
+ ```
264
+
265
+ ### VS Code (Copilot) (`vscode`)
266
+
267
+ `~/.config/Code/User/mcp.json`
268
+
269
+ ```json
270
+ {
271
+ "servers": {
272
+ "cartography": {
273
+ "type": "stdio",
274
+ "command": "npx",
275
+ "args": [
276
+ "-y",
277
+ "--package",
278
+ "@datasynx/agentic-ai-cartography",
279
+ "cartography-mcp"
280
+ ]
281
+ }
282
+ }
283
+ }
284
+ ```
285
+
286
+ ### Codex CLI (`codex`)
287
+
288
+ `~/.codex/config.toml`
289
+
290
+ ```toml
291
+ [mcp_servers.cartography]
292
+ command = "npx"
293
+ args = [ "-y", "--package", "@datasynx/agentic-ai-cartography", "cartography-mcp" ]
294
+ ```
295
+
296
+ ### Windsurf (`windsurf`)
297
+
298
+ `~/.codeium/windsurf/mcp_config.json`
299
+
300
+ ```json
301
+ {
302
+ "mcpServers": {
303
+ "cartography": {
304
+ "command": "npx",
305
+ "args": [
306
+ "-y",
307
+ "--package",
308
+ "@datasynx/agentic-ai-cartography",
309
+ "cartography-mcp"
310
+ ]
311
+ }
312
+ }
313
+ }
314
+ ```
315
+
316
+ ### Cline (`cline`)
317
+
318
+ `~/.config/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json`
319
+
320
+ ```json
321
+ {
322
+ "mcpServers": {
323
+ "cartography": {
324
+ "command": "npx",
325
+ "args": [
326
+ "-y",
327
+ "--package",
328
+ "@datasynx/agentic-ai-cartography",
329
+ "cartography-mcp"
330
+ ],
331
+ "alwaysAllow": [],
332
+ "disabled": false
333
+ }
334
+ }
335
+ }
336
+ ```
337
+
338
+ ### Roo Code (`roo`)
339
+
340
+ `~/.config/Code/User/globalStorage/rooveterinaryinc.roo-cline/settings/cline_mcp_settings.json`
341
+
342
+ ```json
343
+ {
344
+ "mcpServers": {
345
+ "cartography": {
346
+ "command": "npx",
347
+ "args": [
348
+ "-y",
349
+ "--package",
350
+ "@datasynx/agentic-ai-cartography",
351
+ "cartography-mcp"
352
+ ]
353
+ }
354
+ }
355
+ }
356
+ ```
357
+
358
+ ### Zed (`zed`)
359
+
360
+ `~/.config/zed/settings.json`
361
+
362
+ ```json
363
+ {
364
+ "context_servers": {
365
+ "cartography": {
366
+ "source": "custom",
367
+ "command": "npx",
368
+ "args": [
369
+ "-y",
370
+ "--package",
371
+ "@datasynx/agentic-ai-cartography",
372
+ "cartography-mcp"
373
+ ]
374
+ }
375
+ }
376
+ }
377
+ ```
378
+
379
+ ### JetBrains / Junie (`junie`)
380
+
381
+ `~/.junie/mcp/mcp.json`
382
+
383
+ ```json
384
+ {
385
+ "mcpServers": {
386
+ "cartography": {
387
+ "command": "npx",
388
+ "args": [
389
+ "-y",
390
+ "--package",
391
+ "@datasynx/agentic-ai-cartography",
392
+ "cartography-mcp"
393
+ ]
394
+ }
395
+ }
396
+ }
397
+ ```
398
+
399
+ ### Gemini CLI (`gemini`)
400
+
401
+ `~/.gemini/settings.json`
402
+
403
+ ```json
404
+ {
405
+ "mcpServers": {
406
+ "cartography": {
407
+ "command": "npx",
408
+ "args": [
409
+ "-y",
410
+ "--package",
411
+ "@datasynx/agentic-ai-cartography",
412
+ "cartography-mcp"
413
+ ]
414
+ }
415
+ }
416
+ }
417
+ ```
418
+
419
+ ### Goose (`goose`)
420
+
421
+ `~/.config/goose/config.yaml`
422
+
423
+ ```yaml
424
+ extensions:
425
+ cartography:
426
+ name: cartography
427
+ type: stdio
428
+ enabled: true
429
+ command: npx
430
+ args:
431
+ - -y
432
+ - --package
433
+ - "@datasynx/agentic-ai-cartography"
434
+ - cartography-mcp
435
+ ```
436
+
437
+ ### OpenHands (`openhands`)
438
+
439
+ `~/.openhands/config.toml`
440
+
441
+ ```toml
442
+ [[mcp.stdio_servers]]
443
+ name = "cartography"
444
+ command = "npx"
445
+ args = [ "-y", "--package", "@datasynx/agentic-ai-cartography", "cartography-mcp" ]
446
+ ```
447
+
448
+ ### Claude Desktop (`claude-desktop`)
449
+
450
+ `~/.config/Claude/claude_desktop_config.json`
451
+
452
+ ```json
453
+ {
454
+ "mcpServers": {
455
+ "cartography": {
456
+ "command": "npx",
457
+ "args": [
458
+ "-y",
459
+ "--package",
460
+ "@datasynx/agentic-ai-cartography",
461
+ "cartography-mcp"
462
+ ]
463
+ }
464
+ }
465
+ }
466
+ ```
467
+
468
+ <!-- AUTO-GENERATED:quickstarts END -->
469
+
470
+
471
+ ---
472
+
473
+ <!-- source: docs/adapters.md -->
474
+
475
+ # Native adapters for non-MCP frameworks
476
+
477
+ Some agent frameworks don't read a config file — they load MCP tools through their
478
+ own adapter classes. Cartography needs **no special support** for these; point them
479
+ at the standard stdio command and they'll pick up every tool, just like an MCP host.
480
+
481
+ Standard launch command (used in every snippet below):
482
+
483
+ ```bash
484
+ npx -y --package @datasynx/agentic-ai-cartography cartography-mcp
485
+ ```
486
+
487
+ > Run a discovery first (`datasynx-cartography discover`) so the catalog has a
488
+ > topology to serve. Cartography's MCP **prompts** and **resources** are available
489
+ > in full MCP hosts; some adapters below load **tools only** (noted inline).
490
+
491
+ ---
492
+
493
+ ## LangGraph / LangChain (Python)
494
+
495
+ ```bash
496
+ pip install langchain-mcp-adapters
497
+ ```
498
+
499
+ ```python
500
+ from langchain_mcp_adapters.client import MultiServerMCPClient
501
+
502
+ client = MultiServerMCPClient({
503
+ "cartography": {
504
+ "command": "npx",
505
+ "args": ["-y", "--package", "@datasynx/agentic-ai-cartography", "cartography-mcp"],
506
+ "transport": "stdio",
507
+ },
508
+ # or a remote Streamable HTTP server:
509
+ # "cartography": {"url": "http://127.0.0.1:3737/mcp", "transport": "streamable_http"},
510
+ })
511
+ tools = await client.get_tools()  # call inside an async function; hand `tools` to create_react_agent / create_agent
512
+ ```
513
+
514
+ `MultiServerMCPClient` is stateless by default (a new session per tool call); use
515
+ `client.session("cartography")` for a stateful session. JS: `@langchain/mcp-adapters`.
516
+
517
+ ## Microsoft AutoGen (Python)
518
+
519
+ ```bash
520
+ pip install autogen-agentchat "autogen-ext[mcp]"
521
+ ```
522
+
523
+ ```python
524
+ from autogen_agentchat.agents import AssistantAgent
+ from autogen_ext.tools.mcp import McpWorkbench, StdioServerParams
525
+
526
+ params = StdioServerParams(
527
+ command="npx",
528
+ args=["-y", "--package", "@datasynx/agentic-ai-cartography", "cartography-mcp"],
529
+ read_timeout_seconds=60,
530
+ )
531
+ async with McpWorkbench(params) as mcp:
532
+ agent = AssistantAgent("assistant", model_client=..., workbench=mcp)
533
+ ```
534
+
535
+ > AutoGen is in maintenance mode; for new projects Microsoft points to the
536
+ > **Microsoft Agent Framework (MAF)**, which speaks MCP + A2A.
537
+
538
+ ## CrewAI (Python)
539
+
540
+ ```bash
541
+ pip install "crewai-tools[mcp]"
542
+ ```
543
+
544
+ ```python
545
+ from crewai import Agent
+ from crewai_tools import MCPServerAdapter
546
+ from mcp import StdioServerParameters
547
+
548
+ server_params = StdioServerParameters(
549
+ command="npx",
550
+ args=["-y", "--package", "@datasynx/agentic-ai-cartography", "cartography-mcp"],
551
+ )
552
+ with MCPServerAdapter(server_params) as tools:
553
+ agent = Agent(role="SRE", goal="Map the system", backstory="...", tools=tools)
554
+ ```
555
+
556
+ > `MCPServerAdapter` exposes **tools only** (no prompts/resources).
557
+
558
+ ## Pydantic AI (Python)
559
+
560
+ ```bash
561
+ pip install "pydantic-ai-slim[mcp]"
562
+ ```
563
+
564
+ ```python
565
+ from pydantic_ai import Agent
566
+ from pydantic_ai.mcp import MCPServerStdio
567
+
568
+ server = MCPServerStdio("npx", args=["-y", "--package", "@datasynx/agentic-ai-cartography", "cartography-mcp"])
569
+ agent = Agent("openai:gpt-5.2", toolsets=[server])
570
+ ```
571
+
572
+ `load_mcp_servers("config.json")` also reads an `mcpServers` JSON block directly.
573
+
574
+ ## OpenAI Agents SDK (Python)
575
+
576
+ MCP support is built in:
577
+
578
+ ```python
579
+ from agents import Agent
580
+ from agents.mcp import MCPServerStdio
581
+
582
+ async with MCPServerStdio(
583
+ name="Cartography",
584
+ params={"command": "npx", "args": ["-y", "--package", "@datasynx/agentic-ai-cartography", "cartography-mcp"]},
585
+ ) as server:
586
+ agent = Agent(name="Assistant", instructions="...", mcp_servers=[server])
587
+ ```
588
+
589
+ Options: `cache_tools_list`, `tool_filter`, `max_retry_attempts`, `require_approval`.
590
+
591
+ ## Smolagents (Python)
592
+
593
+ ```bash
594
+ pip install "smolagents[mcp]"
595
+ ```
596
+
597
+ ```python
598
+ from smolagents import ToolCollection, CodeAgent
599
+ from mcp import StdioServerParameters
600
+
601
+ params = StdioServerParameters(
602
+ command="npx",
603
+ args=["-y", "--package", "@datasynx/agentic-ai-cartography", "cartography-mcp"],
604
+ )
605
+ with ToolCollection.from_mcp(params, trust_remote_code=True) as tc:
606
+ agent = CodeAgent(tools=[*tc.tools], model=...)
607
+ ```
608
+
609
+ ## Vercel AI SDK (TypeScript)
610
+
611
+ ```ts
612
+ import { experimental_createMCPClient as createMCPClient } from 'ai';
613
+ import { Experimental_StdioMCPTransport as StdioMCPTransport } from 'ai/mcp-stdio';
614
+
615
+ const mcp = await createMCPClient({
616
+ transport: new StdioMCPTransport({
617
+ command: 'npx',
618
+ args: ['-y', '--package', '@datasynx/agentic-ai-cartography', 'cartography-mcp'],
619
+ }),
620
+ });
621
+ const tools = await mcp.tools(); // MCP tools → AI SDK tools, any model
622
+ ```
623
+
624
+ > Define your **own** tools with `inputSchema` (renamed from `parameters` in AI SDK
625
+ > **v5** — using `parameters` yields an empty schema / 400 errors). The MCP client
626
+ > is lightweight: **tools only**, no session management or resources.
627
+
628
+
629
+ ---
630
+
631
+ <!-- source: docs/explanation/index.md -->
632
+
633
+ # Why MCP-first?
634
+
635
+ Cartography's primary interface is a **Model Context Protocol** server, not a CLI or
636
+ a library. That choice is deliberate.
637
+
638
+ ## One integration surface, every host
639
+
640
+ The [Model Context Protocol](https://modelcontextprotocol.io) is the common
641
+ denominator across AI hosts and agent frameworks. By exposing discovery as an MCP
642
+ server, Cartography works in Claude Code, Cursor, VS Code, Cline, Windsurf, Zed,
643
+ LangGraph, CrewAI and more — without bespoke integrations for each.
644
+
645
+ ## Read-only by construction
646
+
647
+ Every tool except `run_discovery` (which rescans and updates the catalog) is annotated `readOnlyHint: true`; the command allowlist rejects anything
648
+ that mutates. The server *describes* your landscape — it never changes it.
649
+
650
+ ## Progressive disclosure
651
+
652
+ Agents read `cartography://graph/summary` first (a low-token index), then drill into
653
+ specific nodes. This keeps token usage bounded even for large landscapes — important
654
+ where hosts cap tool output or total tool count.
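A rough picture of why the summary stays small: it aggregates counts instead of listing records. The Python sketch below builds such an index from a toy graph. The field names and the `{id: type}` node shape are assumptions for illustration and may not match what `cartography://graph/summary` actually returns.

```python
from collections import Counter


def summarize(nodes, edges, top=3):
    """Build a small aggregate index from nodes ({id: type}) and (source, target) edges."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return {
        "node_count": len(nodes),
        "edge_count": len(edges),
        "types": dict(Counter(nodes.values())),
        "most_connected": [node for node, _ in degree.most_common(top)],
    }


# Hypothetical three-node landscape.
nodes = {"web": "service", "api": "service", "payments-db": "database"}
edges = [("web", "api"), ("api", "payments-db")]
summary = summarize(nodes, edges)
print(summary)
```

The size of this index grows with the number of node types and the `top` cutoff, not with the number of nodes, which is the property that lets an agent read it first and only then fetch individual nodes.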
655
+
656
+ ## The CLI and SDK are adapters
657
+
658
+ The `datasynx-cartography` CLI and the embeddable library are thin layers over the
659
+ same core. The MCP server is the headline; everything else is convenience.
660
+
661
+ ## See also
662
+
663
+ - [Threat model](./threat-model.md) — attacker model, trust boundaries, and the mitigation
664
+ enforced at each, mapped to the code.
665
+
666
+
667
+ ---
668
+
669
+ <!-- source: docs/explanation/threat-model.md -->
670
+
671
+ # Threat model
672
+
673
+ Cartography performs **read-only** infrastructure discovery and exposes the result over the Model
674
+ Context Protocol. Its safety boundary is the read-only allowlist in
675
+ [`src/allowlist.ts`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/allowlist.ts),
676
+ enforced for every command spawned by `run()`
677
+ ([`src/platform.ts`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/platform.ts))
678
+ regardless of origin — scanner template, agent, or MCP tool. This page makes the model behind that
679
+ boundary explicit: who the attacker is, what is worth protecting, where trust changes hands, and
680
+ which mechanism defends each crossing.
681
+
682
+ It complements the guarantee list in
683
+ [`SECURITY.md`](https://github.com/datasynx/agentic-ai-cartography/blob/main/SECURITY.md); that file
684
+ is the contract, this one is the reasoning.
685
+
686
+ ## Attacker model
687
+
688
+ Three attackers are in scope:
689
+
690
+ 1. **A malicious or compromised MCP client / agent.** It can call any exposed tool with any
691
+ arguments and is assumed to *want* to run destructive commands, inject extra shell commands
692
+ through scan parameters, or exfiltrate credentials. It is *not* trusted.
693
+ 2. **Untrusted scanned content.** Bookmark titles, browser-history entries, and the stdout of host
694
+ CLIs (`aws`, `gcloud`, `az`, `kubectl`, database clients) are attacker-influenceable data. A
695
+ payload hidden there may try to smuggle instructions into the agent's context (prompt injection)
696
+ or blow up the context window.
697
+ 3. **A network attacker against the HTTP transport.** When the Streamable HTTP transport is bound to
698
+ a non-loopback address, an unauthenticated peer or a DNS-rebinding origin may try to reach it.
699
+
700
+ Out of scope: an attacker who already has the user's shell, the host CLIs' own credential stores, or
701
+ the integrity of the operating system. Cartography trusts the host it runs on (see *Residual risk*).
702
+
703
+ ## Assets
704
+
705
+ - **The local command-execution surface.** The single most valuable target — code that can run
706
+ shell commands on the user's machine.
707
+ - **Cloud and cluster credentials.** AWS/GCP/Azure/Kubernetes configs the host CLIs read on
708
+ Cartography's behalf.
709
+ - **Scanned personal data.** Browser bookmarks and history, installed applications.
710
+ - **The catalog.** Node ids, metadata, and edge evidence persisted to SQLite — which later re-enter
711
+ an LLM context when an agent queries the topology.
712
+
713
+ ## Trust boundaries
714
+
715
+ | # | Boundary | Untrusted side → trusted side |
716
+ |---|----------|-------------------------------|
717
+ | B1 | Agent/client → command execution | Tool calls and their string arguments → `run()` shell |
718
+ | B2 | Scanner output → catalog / LLM context | CLI stdout, bookmark/history text → persisted nodes, agent context |
719
+ | B3 | Network → HTTP transport | Remote requests → the MCP server |
720
+ | B4 | Catalog persistence | Node ids/metadata containing secrets → durable storage |
721
+
722
+ ## Mitigations per boundary
723
+
724
+ Each mitigation is enforced in code; the citations point at the implementing lines.
725
+
726
+ | Boundary / threat | Mitigation | Location |
727
+ |---|---|---|
728
+ | **B1** Arbitrary / destructive command execution | Positive read-only **allowlist** (known-read-only binaries + per-tool verb rules), not a denylist — anything not provably read-only is rejected | [`src/allowlist.ts:14-29,44-65,181-222`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/allowlist.ts#L14-L222) |
729
+ | **B1** Command injection via substitution | `$()` and backticks rejected before execution | [`src/allowlist.ts:211`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/allowlist.ts#L211) |
730
+ | **B1** Shell-arg injection through scan parameters | `assertSafeScanArg` validates region/profile/project/namespace/etc. against strict regexes before they are spliced into a command | [`src/tools.ts:83-114`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/tools.ts#L83-L114) |
731
+ | **B1** Defense-in-depth at the execution chokepoint | `run()` re-checks `checkReadOnly()` immediately before `execSync`, regardless of origin | [`src/platform.ts:77-95`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/platform.ts#L77-L95) |
732
+ | **B1** Secret env leaking into child processes | `safeEnv()` passes only an allowlist of environment keys to spawned commands | [`src/platform.ts:60-75`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/platform.ts#L60-L75) |
733
+ | **B1** Agent-driven Bash in the optional Claude loop | `safetyHook` PreToolUse denies non-read-only Bash before it runs | [`src/safety.ts:1-42`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/safety.ts#L1-L42) |
734
+ | **B2** Hidden prompt-injection in untrusted text | `sanitizeUntrusted` strips invisible/bidi/format/control Unicode (NFC-normalized) before text enters the catalog or an LLM context | [`src/sanitize.ts:18-45`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/sanitize.ts#L18-L45), applied at [`src/db.ts:475-492,539-550`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/db.ts#L475-L550) |
735
+ | **B2** Context-window exhaustion from large output | `clampText` caps a single tool response at `maxToolResponseBytes` (default 100 000) | [`src/tools.ts:48-65`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/tools.ts#L48-L65), [`src/types.ts:194,213`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/types.ts#L194-L213) |
736
+ | **B3** Unauthenticated HTTP access | Non-loopback bind requires a bearer token; tokens are compared in constant time | [`src/mcp/transports.ts:36-107`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/mcp/transports.ts#L36-L107) |
737
+ | **B3** DNS-rebinding (CVE-2025-66414) | Non-loopback bind requires an explicit `allowedHosts` Host allowlist | [`src/mcp/transports.ts:36-107`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/mcp/transports.ts#L36-L107) |
738
+ | **B4** Credentials persisted in node ids / metadata | `stripSensitive`, `redactSecrets`, `redactValue` remove `user:password@` and query/path secrets before persistence | [`src/tools.ts:67-81,111-126`](https://github.com/datasynx/agentic-ai-cartography/blob/main/src/tools.ts#L67-L126) |
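The B1 rows describe a positive allowlist plus argument validation. The following Python sketch shows the general shape of both checks. It is not a port of `src/allowlist.ts` or `src/tools.ts`: the verb table, the argument regex, and the function names are hypothetical, and the sketch deliberately rejects more shell metacharacters than the `$()` and backticks that the table mentions.

```python
import re
import shlex

# Hypothetical rules for illustration; the real rule set lives in src/allowlist.ts.
READ_ONLY_VERBS = {
    "kubectl": {"get", "describe"},
    "docker": {"ps", "images"},
}
# Conservative assumption: refuse substitution, chaining, piping, and redirection.
FORBIDDEN = re.compile(r"[;&|<>`$\n]")
SAFE_ARG = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,62}")


def check_read_only(command: str) -> bool:
    """Allow a command only if its binary and verb are both on the allowlist."""
    if FORBIDDEN.search(command):
        return False
    try:
        parts = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: not provably read-only
    if len(parts) < 2:
        return False
    binary, verb = parts[0], parts[1]
    return verb in READ_ONLY_VERBS.get(binary, set())


def assert_safe_scan_arg(value: str) -> str:
    """Validate a value such as a namespace before it is spliced into a command."""
    if not SAFE_ARG.fullmatch(value):
        raise ValueError(f"unsafe scan argument: {value!r}")
    return value


print(check_read_only("kubectl get pods"))
print(check_read_only("kubectl get pods $(whoami)"))
```

The design point is the default: a command that matches no rule is refused, so a newly added destructive verb in some CLI is rejected without anyone having to anticipate it.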
739
+
740
+ These mechanisms are exercised by `test/safety.test.ts`, `test/tools-hardening.test.ts`,
741
+ `test/sanitize.test.ts`, and `test/transports.test.ts`.
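For the B2 rows, a Python approximation of the two text defences may help. It assumes that "invisible/bidi/format/control" maps to the Unicode general categories `Cc` and `Cf`, and that newlines and tabs are kept; the real `sanitizeUntrusted` and `clampText` may draw those lines differently.

```python
import unicodedata


def sanitize_untrusted(text: str) -> str:
    """NFC-normalize, then drop control (Cc) and format (Cf) characters."""
    normalized = unicodedata.normalize("NFC", text)
    return "".join(
        ch for ch in normalized
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    )


def clamp_text(text: str, max_bytes: int = 100_000) -> str:
    """Truncate text so that its UTF-8 encoding fits in max_bytes."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a multi-byte character cut in half at the boundary.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


# A bookmark title carrying a zero-width space and a right-to-left override.
print(sanitize_untrusted("pay\u200bments\u202e"))
print(len(clamp_text("x" * 200_000)))
```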
742
+
743
+ ## Residual risk and assumptions
744
+
745
+ - **The host is trusted.** Cartography assumes the machine it runs on, its installed CLIs
746
+ (`aws`/`gcloud`/`az`/`kubectl`/database clients), and those CLIs' credential stores are not already
747
+ compromised. It reads through them; it does not sandbox them.
748
+ - **Allowlist correctness is the trust root.** The read-only guarantee is exactly as strong as
749
+ `checkReadOnly()`. A gap there is a vulnerability — see *Reporting* below.
750
+ - **Out-of-process secret hygiene is the operator's job.** Cartography redacts secrets it persists,
751
+ but it does not manage how cloud credentials are stored on the host.
752
+ - **Hosted Smithery runs are read-only.** The managed runtime serves a catalog with no host CLIs and
753
+ no secrets (`smithery.yaml` declares `env: {}`); the cloud scanners are intended for
754
+ local/self-hosted use only.
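As an illustration of the redaction that happens before persistence, the sketch below strips `user:password@` userinfo from URLs and masks a few secret-looking query parameters. It is a single hypothetical Python function, not a rendering of `stripSensitive`, `redactSecrets`, or `redactValue`, and the parameter names it matches are assumptions.

```python
import re

# Userinfo between "://" and "@", for example "admin:hunter2@".
USERINFO = re.compile(r"(?<=://)[^/\s@]+@")
# Hypothetical list of query parameter names treated as secrets.
SECRET_PARAM = re.compile(r"(?i)([?&](?:password|token|api_key|secret)=)[^&\s#]+")


def redact_secrets(value: str) -> str:
    """Remove URL userinfo and mask secret-looking query parameters."""
    value = USERINFO.sub("", value)
    return SECRET_PARAM.sub(r"\1REDACTED", value)


print(redact_secrets("postgres://admin:hunter2@db.internal:5432/app"))
print(redact_secrets("https://api.example.com/v1?token=abc123&page=2"))
```

Pattern-based redaction like this only covers the shapes it knows about, which is one reason the list above leaves credential storage on the host to the operator.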
755
+
756
+ If you find a way around any boundary above, it is a vulnerability — please report it privately via
757
+ [`SECURITY.md`](https://github.com/datasynx/agentic-ai-cartography/blob/main/SECURITY.md).
758
+