vgxness 1.5.0 → 1.5.2
- package/README.md +23 -2
- package/dist/agents/agent-seed-service.js +10 -0
- package/dist/agents/canonical-agent-manifest.js +177 -0
- package/dist/agents/canonical-agent-projection.js +146 -0
- package/dist/agents/renderers/claude-renderer.js +30 -52
- package/dist/cli/bun-bin.js +6 -0
- package/dist/cli/cli-help.js +3 -0
- package/dist/cli/commands/agent-skill-dispatcher.js +6 -5
- package/dist/cli/commands/mcp-dispatcher.js +65 -3
- package/dist/cli/index.js +1 -1
- package/dist/governance/governance-report-builder.js +45 -26
- package/dist/mcp/claude-code-agent-config.js +79 -0
- package/dist/mcp/claude-code-config.js +84 -0
- package/dist/mcp/client-install-claude-code-contract.js +86 -0
- package/dist/mcp/client-install-claude-code.js +85 -0
- package/dist/mcp/control-plane.js +2 -0
- package/dist/mcp/index.js +5 -0
- package/dist/mcp/opencode-default-agent-config.js +7 -113
- package/dist/mcp/provider-canonical-agent-manifest.js +39 -0
- package/dist/mcp/provider-change-plan.js +57 -1
- package/dist/mcp/provider-doctor.js +54 -0
- package/dist/mcp/provider-health-types.js +3 -1
- package/dist/mcp/provider-status.js +82 -2
- package/dist/mcp/schema.js +11 -2
- package/dist/mcp/stdio-server.js +2 -0
- package/dist/mcp/validation.js +23 -1
- package/dist/memory/memory-service.js +59 -0
- package/dist/memory/repositories/sessions.js +1 -1
- package/dist/sdd/sdd-workflow-service.js +129 -59
- package/dist/setup/providers/claude-setup-adapter.js +7 -4
- package/docs/architecture.md +54 -112
- package/docs/cli.md +53 -0
- package/docs/code-runtime.md +218 -0
- package/docs/contributing.md +120 -0
- package/docs/glossary.md +211 -0
- package/docs/mcp.md +144 -0
- package/docs/prd.md +23 -26
- package/docs/providers.md +123 -0
- package/docs/roadmap.md +88 -0
- package/docs/safety.md +147 -0
- package/docs/storage.md +93 -0
- package/package.json +1 -1
- package/docs/funcionamiento-del-sistema.md +0 -865
- package/docs/harness-gap-analysis.md +0 -243
- package/docs/vgxcode.md +0 -87
- package/docs/vgxness-code.md +0 -48
|
@@ -0,0 +1,123 @@
# Providers

VGXNESS is provider-agnostic at the core: the registry stores provider-neutral definitions and adapters translate them into provider-specific config and runtime behavior. This document covers the two adapter layers: the **control-plane adapter** (OpenCode renderer today, with a Claude preview renderer in the tree) and the **code-runtime provider adapter** (OpenAI-compatible + fake).

## Status (v1.5.1)

| Provider | Control plane | Code runtime | Notes |
|---|---|---|---|
| OpenCode | `managed` | n/a (target) | Primary supported provider. The configurator renders OpenCode MCP config and manager/SDD agent definitions into the chosen scope. |
| Claude Code | `preview/manual` | n/a | The canonical agent manifest declares Claude support as `preview` with an explicit reason: VGXNESS does not install Claude or write `.claude/` or `CLAUDE.md`. Owners of Claude-only workflows must run setup themselves. |
| Antigravity | `placeholder` | n/a | Listed in the TUI Installation surface as coming-soon. |
| Custom / future | `extension` | extension point | Per the [Architecture](./architecture.md) decision, anything not OpenCode or Claude is a custom extension. |
| OpenAI-compatible | n/a | `openai-compatible-provider-adapter.ts` | Real adapter used by `vgxness code`. Speaks to any OpenAI-compatible endpoint. |
| Fake | n/a | `fake-provider-adapter.ts` | Deterministic, offline; for unit tests and CI. |

## Control-plane adapter contract

The control-plane adapter takes a root agent plus optional registered subagents and returns previewable artifacts. It does not mutate the registry, write provider config, or call providers.

| Type | Purpose |
|---|---|
| `ProviderRenderer` | Named renderer for one output format/provider. |
| `ProviderRenderInput` | A root agent plus optional registered subagents. |
| `ProviderRenderResult` | Generated artifacts, provider name, `installable: false`, warnings. |
| `ProviderRenderArtifact` | Relative path, content type, and generated contents. |

### OpenCode renderer

`src/agents/renderers/opencode-renderer.ts` renders a single OpenCode config preview with `$schema` and an `agent` object. Top-level agents default to `primary`; subagents render as `subagent`. Agent keys are sanitized deterministically from registry names, and rendering rejects key collisions instead of overwriting generated config.

```bash
vgxness agents render --provider opencode --project vgxness --name apply-agent
```

The output is previewable JSON; the renderer does not write to `.opencode/`, `.claude/`, or any user/global provider config.
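To make the "sanitize deterministically, reject collisions" behaviour of the OpenCode renderer concrete, here is a minimal sketch. The function names and the exact sanitization rule (lowercase, non-alphanumerics collapsed to dashes) are assumptions for illustration; the real rules in `opencode-renderer.ts` are not shown in this document and may differ.

```typescript
// Hypothetical sanitizer: the renderer's actual rules are not documented here.
function sanitizeAgentKey(name: string): string {
  return name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of other characters into one dash
    .replace(/^-+|-+$/g, ""); // strip leading and trailing dashes
}

// Builds a key -> registry name map and refuses to overwrite an existing key.
function buildAgentKeys(names: string[]): Record<string, string> {
  const keys: Record<string, string> = {};
  for (const name of names) {
    const key = sanitizeAgentKey(name);
    if (Object.prototype.hasOwnProperty.call(keys, key)) {
      throw new Error(
        `Agent key collision: "${keys[key]}" and "${name}" both map to "${key}"`,
      );
    }
    keys[key] = name;
  }
  return keys;
}
```

Failing loudly on a collision, rather than letting the later agent win, is what keeps a preview from silently dropping one of two similarly named agents.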

### JSON renderer

`src/agents/renderers/json-renderer.ts` produces a debug/export shape. It includes the matching `adapters.json` as `selectedAdapter` so downstream consumers can replay the same rendering deterministically.

```bash
vgxness agents render --provider json --project vgxness --name apply-agent
```

### Claude renderer (preview)

`src/agents/renderers/claude-renderer.ts` exists in the tree as a preview renderer. The shape of an install-safe Claude artifact is not yet finalized, so the renderer is not enabled for end-to-end install flows.

### OpenCode injection preview

`OpenCodeInjectionPreviewService` (in `src/providers/opencode/`) composes existing read-only outputs into a single envelope:

| Output | Source |
|---|---|
| `providerArtifacts` | OpenCode renderer for the selected agent and registered subagents. |
| `skillPayload` | Skill registry payload builder for the selected SDD phase and OpenCode adapter. |
| `sdd` | SDD workflow status and readiness for the selected project/change/phase. |
| `context` and `safety` | OpenCode preview layer metadata for future OpenCode/MCP/hook callers. |

The envelope is always `installable: false` and `readOnly: true`. It does not execute OpenCode, install hooks, create MCP servers, create runs, record skill usage, or touch provider/user/global config. Future live injection should build on this contract only after a separate approved change defines execution, hook, or MCP safety rules.

```bash
vgxness opencode preview --provider opencode --agent apply-agent --project vgxness --change checkout-flow --phase apply-progress
```

## Code-runtime provider adapter

The code runtime speaks to a model through `CodeProviderAdapter` (`src/code/providers/provider-adapter.ts`):

```ts
export interface CodeProviderAdapter {
  readonly id: string;
  readonly displayName: string;
  readonly capabilities: ProviderCapabilities;
  createResponse(request: ProviderRequest): Promise<ProviderResponse>;
  streamResponse?(request: ProviderRequest): AsyncIterable<ProviderStreamEvent>;
  countTokens?(input: ProviderTokenInput): Promise<TokenUsageEstimate>;
  diagnostics?(model: string): Promise<ProviderDiagnostics>;
}
```

Errors are surfaced through `CodeProviderError` with a `code` of `missing_credentials`, `blocked_credentials`, `provider_error`, or `retryable_provider_error`. The `retryable` flag tells callers whether a transient retry is meaningful.
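As a rough illustration of how a caller might use the error codes and the `retryable` flag, consider the sketch below. The class body, the rule that only `retryable_provider_error` sets `retryable`, and the `shouldRetry` helper are assumptions; the real `CodeProviderError` may carry more fields or derive the flag differently.

```typescript
type CodeProviderErrorCode =
  | "missing_credentials"
  | "blocked_credentials"
  | "provider_error"
  | "retryable_provider_error";

// Sketch of the error shape described above; not the package's actual class.
class CodeProviderError extends Error {
  readonly code: CodeProviderErrorCode;
  readonly retryable: boolean;

  constructor(code: CodeProviderErrorCode, message: string) {
    super(message);
    // Keeps `instanceof` working when compiled to older targets.
    Object.setPrototypeOf(this, CodeProviderError.prototype);
    this.name = "CodeProviderError";
    this.code = code;
    // Assumption: only the retryable code marks the error as transient.
    this.retryable = code === "retryable_provider_error";
  }
}

// Hypothetical caller-side helper: retry only transient provider failures,
// and only while the attempt budget has not been used up.
function shouldRetry(error: unknown, attempt: number, maxAttempts: number): boolean {
  return error instanceof CodeProviderError && error.retryable && attempt < maxAttempts;
}
```

Credential problems are never retried in this sketch, since repeating the call cannot fix a missing or blocked key.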

### `openai-compatible-provider-adapter.ts`

The real adapter. Speaks to any OpenAI-compatible endpoint; credentials come from environment references, never embedded in prompts or reports. The adapter streams responses through `stream-normalizer.ts` and maps messages with `message-mapper.ts`.

### `fake-provider-adapter.ts`

Deterministic, offline. Used by `test/code/` and `bun run smoke:opentui-code` to keep CI hermetic. The fake adapter is intentionally thin: it does not model provider quirks, only the contract.

### Adding a new code-runtime provider

1. Implement `CodeProviderAdapter` and a `CodeProviderError`-throwing path. The interface is small; most of the work is in the request/response shape and the stream normalizer.
2. Add credentials handling in `src/code/providers/credentials.ts`. Do not embed secrets; accept environment references only.
3. Add a smoke test under `test/code/providers.test.ts` that exercises `createResponse`, `streamResponse`, and `countTokens` (where applicable).
4. If the new provider has a different default tool shape, add a normalizer. If the stream events are different, extend `stream-normalizer.ts`.
5. Update [Code runtime](./code-runtime.md) and this document with the new adapter id and capabilities.

## OpenCode provider install and doctor

Provider install and doctor flows live in `src/mcp/client-install-opencode.ts` and `src/mcp/provider-doctor.ts` and are exposed through the MCP server and the CLI:

- `vgxness mcp install opencode --plan`: read-only plan; never writes config.
- `vgxness mcp install opencode --yes`: explicit write path; creates a backup first.
- `vgxness mcp doctor opencode`: JSON report of provider health.
- `vgxness provider status` / `vgxness provider doctor` (planned CLI): same shape through the operator surface.
- `vgxness setup rollback --backup <path>`: restores a previous OpenCode config byte-for-byte after validation.

Writes happen only through `apply` with explicit consent. Plans, status, doctor, change-plan, and previews are read-only by contract.

## Safety boundary

Adapters and renderers must not:

- Write provider config (`.opencode/`, `.claude/`, `opencode.json`, `CLAUDE.md`).
- Call providers (`opencode`, `claude`, etc.) during preview or status.
- Install global memory.
- Create `openspec/`.
- Bypass permission policy.
- Mutate other tools' state outside the VGXNESS SQLite store.

Adapter code is reviewed against the safety boundary tests in `test/mcp/` and `test/agents/provider-renderer.test.ts` before merge.
package/docs/roadmap.md
ADDED
|
@@ -0,0 +1,88 @@
# Roadmap

This document tracks planned work that is not yet shipped. The [Architecture](./architecture.md) document describes what is built; this is what is next. Items are grouped by area and ordered roughly by impact and dependency.

## Runtime execution

The lifecycle, policy recording, approval records, and retry policy are all in place. What is still test-only is the actual executor that turns a reserved attempt into a real-world side effect.

- **Real provider/tool executor.** `RunService.executeOperation(...)` takes an injected `RunOperationExecutor`, but the only executors available today are the fake/deterministic ones used in tests. The plan is to ship a sandboxed executor that respects the `ExecutionIsolationPlan` (`workspace`/`git-worktree`/`process-sandbox`) and the SDD phase permission matrix.
- **CLI/MCP orchestration for `resume-after-approval`.** Once the executor is real, the `vgxness_run_resume_inspect` and `vgxness_run_resume_gate` tools need an `apply` partner that calls the executor only after a human resolves the pending approval.
- **Sandbox/worktree execution strategies.** `src/runs/sandbox-worktree-planning.ts` produces a plan; the plan needs to materialize into a real worktree or sandbox. Symlink and realpath hardening is already in the policy evaluator; the next step is wiring it to a runner.
- **Richer verification evidence summaries.** Verification currently records `pass`/`fail`/`skipped`; the next step is to summarize per-task verification into the run, task list, and SDD cockpit so blocked phases have concrete evidence.

## Code runtime

The four modes (`inspect`/`plan`/`craft-preview`/`craft`) and the 19 tools are in. What is left is provider coverage, ergonomics, and the OpenTUI shell promotion.

- **Native Anthropic provider.** The code runtime speaks to any OpenAI-compatible endpoint. A native Anthropic adapter would remove the need to route Claude through an OpenAI-compatible bridge.
- **More providers.** The adapter contract is small and provider-neutral; additional adapters (Anthropic, Google, local model servers) can be added incrementally.
- **`vgxcode` OpenTUI shell promotion.** The shell is currently root-owned during development. To promote it, we need: deterministic event contracts, snapshot tests, and a packaging path that does not require `bun src/cli/tui/opentui/code/index.ts`.
- **Workspace-executor for the `craft` mode.** Workspace mutations (`apply_patch`, `create_file`, `update_file`, `delete_file`) currently use the `WorkspaceToolExecutor` with policy gating. The next step is a real file-system backend with rollback on failure.
- **Repair loop.** The `--verification repair` option exists in the CLI but is not wired to a loop in the runtime.

## SDD governance

The hard acceptance gate and the cockpit blockers are in. The remaining work is mostly UX and stronger cross-linking.

- **Verification evidence linked to the cockpit.** The cockpit currently returns blockers; surfacing the per-phase verification evidence (pass/fail/skipped counts, last-run timestamps) would make "why is this blocked" answers more useful.
- **Per-phase model/profile routing in the cockpit.** The manager profile overlay exists. The cockpit should recommend a model for the next phase based on the active overlay.
- **Migration of `openspec/`-style workflows.** Some users bring artifacts from other tools. A formal `import` path through the artifact portability service would help, but it should not silently convert external artifacts into accepted SDD phases.

## Skills and agents

The registries, version model, payloads, and improvement-proposal lifecycle are in.

- **Skill evaluation harness.** `013_skill_evaluation_scenarios.sql` is the storage; the runtime that runs scenarios against a proposed skill version is not yet built.
- **Skill improvement suggestion agent.** Skill proposals today are user-driven. A trace-driven candidate detector is planned but explicitly out of scope per the "no silent skill mutation" rule.
- **Canonical manifest v7.** v6 is in the tree (WIP commit). v7 would address the uncommitted `canonical-agent-manifest.ts` and `canonical-agent-projection.ts` work and ship a stable, validated manifest as the source of truth.

## Storage and portability

SQLite, scopes, migrations, and the run snapshot export are in.

- **`import` path for SDD artifacts.** The `ArtifactPortabilityService` exports; an import path that re-creates acceptance records with explicit human re-acceptance is planned.
- **Sensitive-data redaction during export.** `src/export/redaction.ts` exists. Wiring it as a default into the export path and a CLI flag (`--redact`) is the next step.
- **Database upgrade tooling.** Forward-only migrations work; downgrade requires a snapshot. A small CLI helper for "is my DB on the latest migration?" would be useful for support.

## TUI

The main menu and setup screens are in via OpenTUI. The dashboard directory is empty in the current tree.

- **TUI dashboard screen.** Real-time view of active runs, pending approvals, and SDD blockers. Should match the cockpit surface from MCP.
- **TUI runs screen.** Run list, drill into a run, see events/approvals/attempts, with read-only safe actions.
- **TUI approvals screen.** Resolve pending approvals directly from the TUI, with the same audit trail as `vgxness_sdd_accept_artifact`.

## Observability and evaluation

95 test files cover the eval targets in v1.5.1.

- **Eval gate.** A `vgxness verify --evals` runner that orchestrates a quality gate beyond `bun run verify`. The architecture document lists 11 eval targets; lifting them into a single runner would close the loop.
- **Token/cost tracking.** A nice-to-have that the trace model already supports structurally (`ProviderTokenInput`).
- **Timeline export.** `RunSnapshotPackageV1` exists. A `runs timeline <id> --format html|md|json` CLI would make sharing easier.

## Provider coverage

- **Claude Code install path.** Preview/manual only today. The renderer exists; an install-safe Claude artifact shape is the blocker.
- **Antigravity and other custom adapters.** Listed as placeholders; a real adapter would unblock those workflows.
- **Provider doctor upgrades.** Doctor checks could surface real config drift, not just existence of expected files.

## Out of scope (still)

These remain future expansion areas per the [PRD](./prd.md):

- Cloud sync across machines.
- Team/shared memory spaces.
- Hosted web console.
- Distributed agent workers.
- Hosted observability and evaluation UI.
- Skill marketplace or shared skill catalog.

## How work gets into this list

When the architecture drifts from the code, or when a real product gap is identified, the right path is:

1. Open an SDD change with `vgxness sdd status --project <project> --change <id>` (or a fresh change id).
2. Move the item from here to an `explore` artifact.
3. Walk the SDD phases: `explore → proposal → spec → design → tasks → apply-progress → verify → archive`.
4. Once shipped, retire the item from this roadmap and update [Architecture](./architecture.md) and the [CHANGELOG](../../CHANGELOG.md) to reflect the new state.
package/docs/safety.md
ADDED
|
@@ -0,0 +1,147 @@
# Safety model

VGXNESS treats safety as a core domain, not as adapter-specific behavior. This document describes the three layers that make up the safety model: the **policy evaluator** for the control plane, the **per-SDD-phase permission matrix**, and the **code runtime approval flow** with redactors.

If a tool, surface, or new provider wants to take an action, the action must:

1. Be classified into a permission category.
2. Resolve through the policy evaluator or the per-SDD-phase matrix.
3. If the decision is `ask`, surface an approval prompt through the configured approval channel.
4. Record the decision in the run event stream regardless of outcome.

## Permission categories

The category list lives in `src/permissions/schema.ts` (`permissionCategories`). The default policy (`defaultPermissionPolicy` in `src/permissions/policy-evaluator.ts`) defines the baseline decision for each.

| Category | Default | Risky | Notes |
|---|---|---|---|
| `read` | `allow` | no | Files, memory, artifacts. Allowed only when path stays inside `workspaceRoot`. |
| `edit` | `ask` | yes | Write/patch/modify files. |
| `implementation-edit` | `ask` | yes | Edits tied to a SDD `apply-progress` phase. |
| `spec-write` | `ask` | yes | Writes to a spec artifact. |
| `design-write` | `ask` | yes | Writes to a design artifact. |
| `task-write` | `ask` | yes | Writes to a task artifact. |
| `shell` | `ask` | yes | Commands, scripts, package managers. |
| `test-run` | `ask` | yes | Test execution. |
| `install` | `ask` | yes | Dependency installation mutates local state. |
| `network` | `ask` | yes | Web fetch, API calls, package downloads. |
| `git` | `ask` | yes | Status, diff, branch, commit. |
| `git-write` | `ask` | yes | Push, merge, branch mutation. |
| `memory` | `ask` | yes | Memory read/write/search. |
| `memory-write` | `ask` | yes | Memory upsert/update. |
| `external-directory` | `deny` | yes | Access outside project/user-approved roots. Cannot be relaxed by agent overrides. |
| `provider-tool` | `ask` | yes | Opaque adapter/provider tool calls. |
| `secrets` | `deny` | yes | Environment variables, credentials, tokens. |

The risky categories (`isRiskyPermissionCategory`) get a forced `ask` even when an agent override would otherwise allow the category.
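The interaction between defaults, agent overrides, and the forced `ask` for risky categories can be sketched as follows. This is an illustration only, not the evaluator in `policy-evaluator.ts`: the `resolveDecision` name, the trimmed category list, and the choices to fail closed on unknown categories and to never relax a default `deny` are assumptions.

```typescript
type Decision = "allow" | "ask" | "deny";

// A subset of the categories from the table above, with their documented defaults.
const categoryDefaults: Record<string, { decision: Decision; risky: boolean }> = {
  read: { decision: "allow", risky: false },
  edit: { decision: "ask", risky: true },
  shell: { decision: "ask", risky: true },
  secrets: { decision: "deny", risky: true },
};

function resolveDecision(category: string, agentOverride?: Decision): Decision {
  const entry = categoryDefaults[category];
  if (entry === undefined) return "deny"; // assumption: unknown categories fail closed
  if (entry.decision === "deny") return "deny"; // assumption: default denials are never relaxed
  const requested = agentOverride ?? entry.decision;
  // Risky categories are pulled back to `ask` when an override would allow them.
  if (entry.risky && requested === "allow") return "ask";
  return requested;
}
```

An override can therefore tighten a decision (for example `edit` to `deny`) but cannot turn a risky category into an unattended `allow`.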

## Decisions

Every permission request resolves to one of three decisions:

- `allow`: execute and record `succeeded` or `failed`.
- `ask`: create a pending approval record, record `pending-approval`, and pause until the approval is resolved.
- `deny`: record `blocked` and do not invoke the executor.

In addition, the SDD-phase permission matrix uses four `PermissionMode` values:

- `allow`: pass through.
- `audit`: log and pass through.
- `require-preflight`: must run preflight and receive an `allow` decision before the operation executes.
- `deny`: block.

## Per-SDD-phase permission matrix

The matrix (`sddPhasePermissionMatrix`, version `sdd-phase-permissions-v1`) gives every (phase, category) pair a mode. The matrix version is exported as a constant so consumers can compare.

| Phase | Distinctive modes (vs. the planning baseline) |
|---|---|
| `explore`, `proposal`, `spec`, `design`, `tasks`, `archive` | Planning baseline. Reads allowed; edits denied; spec/design/task writes, shell, install, network, git, memory all `require-preflight`; provider-tool `audit`; external-directory and secrets `deny`. |
| `apply-progress` | Edits and `implementation-edit` are `require-preflight` (instead of `deny`). Shell and `test-run` are `require-preflight`. |
| `verify` | Edits and `implementation-edit` stay `deny`. Shell and `test-run` are `require-preflight`. |

Workspace reads are allowed only when the target path stays inside `workspaceRoot`. The evaluator resolves the real path with `realpathSync` to defeat symlink escapes and refuses to relax workspace boundary denials through agent overrides.
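One way to picture the matrix is as a planning baseline plus per-phase overrides. The sketch below encodes only a few categories taken from the table above; the `modeFor` name, the two-table layout, and the choice to return `deny` for pairs this sketch does not list are assumptions, not the structure of `sddPhasePermissionMatrix`.

```typescript
type PermissionMode = "allow" | "audit" | "require-preflight" | "deny";
type Phase =
  | "explore" | "proposal" | "spec" | "design"
  | "tasks" | "apply-progress" | "verify" | "archive";

// Planning baseline for a handful of categories, as described in the table.
const planningBaseline: Record<string, PermissionMode> = {
  read: "allow",
  edit: "deny",
  "implementation-edit": "deny",
  shell: "require-preflight",
  "provider-tool": "audit",
  secrets: "deny",
};

// Only the differences from the baseline are listed per phase.
const phaseOverrides: Partial<Record<Phase, Record<string, PermissionMode>>> = {
  "apply-progress": {
    edit: "require-preflight",
    "implementation-edit": "require-preflight",
    "test-run": "require-preflight",
  },
  verify: { "test-run": "require-preflight" },
};

function modeFor(phase: Phase, category: string): PermissionMode {
  // Assumption: pairs missing from both tables fail closed.
  return phaseOverrides[phase]?.[category] ?? planningBaseline[category] ?? "deny";
}
```

For example, `modeFor("spec", "edit")` yields `deny`, while the same category in `apply-progress` yields `require-preflight`.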

## Code runtime approval flow

The code runtime layers a second, finer-grained decision on top of the policy evaluator. Per-tool definitions declare whether a tool is `read`, `confirm`, or `restricted` (see [Code runtime](./code-runtime.md)). The approval coordinator (`src/code/runtime/approval-coordinator.ts`) wires that to a broker and a gateway.

```text
tool call
   │
   ▼
PolicyApprovalBroker ──► ConservativePermissionGateway.evaluate(tool, mode, context)
   │                          │
   │                          ├── allow → execute, emit CodeRuntimeEvent(succeeded)
   │                          ├── ask   → ApprovalPrompt
   │                          │             ├── stdio channel: read line from stdin
   │                          │             └── auto channel: injected broker (e.g. MCP client)
   │                          └── deny  → emit CodeRuntimeEvent(blocked), return error
   ▼
CodeRuntimeEventSink (stream JSONL to consumer; OpenTUI shell, scripts, tests)
```

The gateway is conservative by default: a non-`read` tool in a non-`apply-progress` phase defaults to `ask`. Even with `--approval-policy allow`, the gateway can re-promote to `ask` for risky categories or workspace boundary issues.

## Redaction

`src/code/reporting/redaction.ts` ships three helpers used by every consumer that emits prompts, reports, checkpoints, transcripts, or memory saves.

| Helper | What it does |
|---|---|
| `redactSecrets(value: string): string` | Replaces secret-shaped substrings with `[REDACTED]`. Patterns include `*TOKEN=...`, `*SECRET=...`, `*KEY=...`, `*PASSWORD=...`, `sk_*`/`pk_*` keys, `Bearer ...` headers, and PEM private keys. |
| `redactJson(value: JsonValue): JsonValue` | Recursively walks JSON, redacts string values, and replaces values whose key matches `(token\|secret\|password\|api.?key\|authorization\|credential)` with the literal `"[REDACTED]"`. |
| `omitSensitiveCommandOutput<T extends JsonValue>(value: T): JsonValue` | For tool results that include `stdout`/`stderr`, replaces those fields with `"[omitted by default]"` unless the consumer explicitly opts in. |

These helpers are deterministic and are the only place secret-shaped values should be stripped. Tool results that include shell output must use `omitSensitiveCommandOutput` before they are recorded into the run event stream or rendered into a checkpoint.
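The recursive walk performed by `redactJson` can be illustrated with a simplified stand-in. The version below is a sketch under stated assumptions, not the code in `redaction.ts`: `redactSecrets` here covers only `Bearer` headers and `*TOKEN=`/`*SECRET=`/`*KEY=`/`*PASSWORD=` assignments, it keeps the variable name and replaces only the value, and the key pattern is matched case-insensitively.

```typescript
type JsonValue =
  | string | number | boolean | null
  | JsonValue[]
  | { [key: string]: JsonValue };

// Key pattern quoted from the table above; the `i` flag is an assumption.
const SENSITIVE_KEY = /(token|secret|password|api.?key|authorization|credential)/i;

// Simplified stand-in: handles Bearer headers and NAME_TOKEN=value assignments only.
function redactSecrets(value: string): string {
  return value
    .replace(/Bearer\s+\S+/g, "Bearer [REDACTED]")
    .replace(/\b(\w*(?:TOKEN|SECRET|KEY|PASSWORD))=\S+/g, "$1=[REDACTED]");
}

function redactJson(value: JsonValue): JsonValue {
  if (typeof value === "string") return redactSecrets(value);
  if (value === null || typeof value !== "object") return value; // numbers, booleans, null
  if (Array.isArray(value)) return value.map((item) => redactJson(item));
  const out: { [key: string]: JsonValue } = {};
  for (const key of Object.keys(value)) {
    // A sensitive key hides the whole value, whatever its type.
    out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : redactJson(value[key] as JsonValue);
  }
  return out;
}
```

The function builds a new structure instead of mutating its input, so the unredacted value can still be used in memory while only the redacted copy is written out.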

## Approval lifecycle

Approvals are first-class records linked to a permission-decision event:

```text
executeOperation(...)
   │
   ▼
evaluate permission
   │
   ├── allow → no approval record, executor runs
   ├── deny  → no approval record, executor blocked
   └── ask   → pending approval record + reserved operation attempt
                  │
                  ▼
           human resolves approval (approved | rejected | cancelled)
                  │
                  ▼
           resumeApprovedOperation({ approvalId, executor })
                  │
                  ▼
           reserved attempt → succeeded | failed | abandoned
```

Retry policy evaluation is separate from execution. `vgxness_run_resume_gate` evaluates the policy and returns an `OperationRetryDecision` (`allowed`/`reasonCode`/`reason`/attempt metadata). The default policy is `never`, so any prior `reserved`/`succeeded`/`failed`/`abandoned` attempt blocks later resume before the executor is invoked.

| Policy | Allows a new attempt after | Always blocks |
|---|---|---|
| `never` | No previous attempt only | `reserved`, `succeeded`, `failed`, `abandoned` |
| `after-abandoned` | latest attempt is `abandoned` | active `reserved`, `succeeded`, `failed` |
| `after-failure` | latest attempt is `failed` | active `reserved`, `succeeded`, `abandoned` |
| `after-failure-or-abandoned` | latest attempt is `failed` or `abandoned` | active `reserved`, `succeeded` |

`RunService.abandonReservedOperationAttempt({ attemptId, actor, reason })` is recovery-only: it transitions a stuck attempt to `abandoned` and appends an `operation-execution` audit event. It does not call an executor or retry.
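The retry policies in this section can be read as a small decision function over the attempt history. The sketch below is one possible reading, not the gate behind `vgxness_run_resume_gate`: the `evaluateRetry` name and the `reasonCode` strings are invented for illustration, and it assumes that any `reserved` or `succeeded` attempt in the history blocks a retry while the `failed`/`abandoned` check looks only at the latest attempt.

```typescript
type AttemptStatus = "reserved" | "succeeded" | "failed" | "abandoned";
type RetryPolicy =
  | "never" | "after-abandoned" | "after-failure" | "after-failure-or-abandoned";

interface RetryDecision {
  allowed: boolean;
  reasonCode: string; // hypothetical codes, for illustration only
}

// Which latest-attempt statuses each policy accepts before a new attempt.
const retryableStatuses: Record<RetryPolicy, AttemptStatus[]> = {
  never: [],
  "after-abandoned": ["abandoned"],
  "after-failure": ["failed"],
  "after-failure-or-abandoned": ["failed", "abandoned"],
};

// `attempts` is ordered oldest to newest.
function evaluateRetry(policy: RetryPolicy, attempts: AttemptStatus[]): RetryDecision {
  if (attempts.length === 0) return { allowed: true, reasonCode: "no-previous-attempt" };
  if (attempts.indexOf("reserved") !== -1) return { allowed: false, reasonCode: "attempt-in-progress" };
  if (attempts.indexOf("succeeded") !== -1) return { allowed: false, reasonCode: "already-succeeded" };
  const latest = attempts[attempts.length - 1] as AttemptStatus;
  const allowed = retryableStatuses[policy].indexOf(latest) !== -1;
  return { allowed, reasonCode: allowed ? "retry-permitted" : "blocked-by-policy" };
}
```

Because the function only inspects history and returns a decision, it can be evaluated without ever invoking an executor, which mirrors the separation between retry evaluation and execution described above.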

## Safety boundaries (enforced everywhere)

- Read-only/preview commands must stay non-mutating: setup plans, MCP setup previews, OpenCode previews, workflow previews, status views, and the natural-language orchestrator preview never write provider config, call providers, or install global memory.
- Provider config writes (`opencode.json`, `.opencode/`, `.claude/`, `CLAUDE.md`) require explicit consent (`--yes` or an equivalent confirmed flow) plus backup/rollback behavior.
- The control plane and the code runtime do not create or write `openspec/`. SDD artifacts are stored through the local SQLite artifact service under canonical topic keys.
- Human acceptance is distinct from artifact presence: `vgxness_sdd_accept_artifact` records explicit human-only acceptance; saving a draft never implies acceptance.
- Unrelated user work is preserved. Workspace boundary denials cannot be relaxed by agent or subagent overrides.

## Tests

The policy and approval flow are covered by:

- `test/permissions/policy-evaluator.test.ts`: decisions, defaults, workspace boundary, SDD phase matrix.
- `test/runs/`: approval records, reserved attempts, retry policy, abandonment, resume inspect/gate.
- `test/code/`: approval broker, tool definitions, runtime approval flow.
package/docs/storage.md
ADDED
|
@@ -0,0 +1,93 @@
# Storage

VGXNESS keeps all durable state in local SQLite. There is no required cloud account, no remote sync in v1.5.1, and no implicit sharing between scopes. Project data and personal/global data must not be collapsed into one scope.

## Database location

The default is a per-user global database. The path is resolved by `resolveMemoryDatabasePath(...)` in `src/memory/storage-paths.ts` with this precedence:

1. `--db <path>` flag (`source: 'flag'`).
2. `VGXNESS_DB_PATH` env var (`source: 'environment'`).
3. Global default (`source: 'global-default'`).

| Platform | Global default |
|---|---|
| macOS | `~/Library/Application Support/vgxness/memory.sqlite` |
| Linux | `${XDG_DATA_HOME:-~/.local/share}/vgxness/memory.sqlite` |
| Windows | `%LOCALAPPDATA%\vgxness\memory.sqlite` when available; otherwise `%APPDATA%\vgxness\memory.sqlite` |

If the global default cannot be resolved (for example, `HOME` is not set on Linux), the resolver returns a `store_unavailable` error rather than falling back to the current working directory. The fix is to pass `--db` or set `VGXNESS_DB_PATH`.
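The precedence rules above (flag, then environment, then platform default, with no fallback to the working directory) can be sketched as a pure function. This is not `resolveMemoryDatabasePath` itself: the `resolveDatabasePath` name, the input and result shapes, and passing the environment and platform in as arguments are assumptions made to keep the example self-contained.

```typescript
type PathSource = "flag" | "environment" | "global-default";

interface ResolveInput {
  flagPath?: string; // value of --db, if given
  env: Record<string, string | undefined>; // e.g. a copy of the process environment
  platform: "darwin" | "linux" | "win32";
}

type ResolveResult =
  | { ok: true; path: string; source: PathSource }
  | { ok: false; error: "store_unavailable" };

// Per-platform data directory, or undefined when it cannot be determined.
function globalDataDir({ env, platform }: ResolveInput): string | undefined {
  const home = env["HOME"];
  if (platform === "darwin") return home ? `${home}/Library/Application Support` : undefined;
  if (platform === "linux") return env["XDG_DATA_HOME"] || (home ? `${home}/.local/share` : undefined);
  return env["LOCALAPPDATA"] || env["APPDATA"] || undefined;
}

function resolveDatabasePath(input: ResolveInput): ResolveResult {
  if (input.flagPath) return { ok: true, path: input.flagPath, source: "flag" };
  const fromEnv = input.env["VGXNESS_DB_PATH"];
  if (fromEnv) return { ok: true, path: fromEnv, source: "environment" };
  const dir = globalDataDir(input);
  // No silent fallback to the current working directory.
  if (dir === undefined) return { ok: false, error: "store_unavailable" };
  const separator = input.platform === "win32" ? "\\" : "/";
  return { ok: true, path: [dir, "vgxness", "memory.sqlite"].join(separator), source: "global-default" };
}
```

Returning an explicit `store_unavailable` result keeps a misconfigured environment from quietly creating a database in whatever directory the command happened to run from.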
|
|
20
|
+
|
|
21
|
+
The parent directory is created on demand by `prepareMemoryDatabasePath(...)`. Existing project-local databases remain compatible; opt in explicitly when needed:
|
|
22
|
+
|
|
23
|
+
```bash
|
|
24
|
+
vgxness memory search --project vgxness --db .vgx/memory.sqlite
|
|
25
|
+
```

## Runtime engine

Local SQLite storage is supported through `bun:sqlite`. Node.js is not the storage runtime; it remains development/build/test tooling. The release gate `bun run verify:bun-sqlite` proves SQLite readiness on a clean database by:

- Copying the source migrations into a temporary directory.
- Applying them against a temporary database.
- Checking foreign keys, busy timeout, FTS/search, transaction rollback, integrity, and cleanup.

## Migration layout

SQL migrations live in `src/memory/sqlite/migrations/` and are applied in lexicographic order. The package build copies them into `dist/memory/sqlite/migrations` so installed bins can apply them on first use. The full list as of v1.5.1:

| # | File | Adds |
|---|---|---|
| 001 | `001_initial.sql` | Base memory observations table. |
| 002 | `002_observation_revisions.sql` | Observation revision history. |
| 003 | `003_agent_registry.sql` | Agent and subagent registry. |
| 004 | `004_run_runtime.sql` | Run records, events, checkpoints. |
| 005 | `005_run_approvals.sql` | Approval records linked to permission-decision events. |
| 006 | `006_run_operation_attempts.sql` | Reserved operation attempts. |
| 007 | `007_abandoned_operation_attempts.sql` | Recovery-only `abandoned` state. |
| 008 | `008_run_execution_plan_events.sql` | Execution isolation plan events. |
| 009 | `009_multiple_operation_attempts.sql` | Multiple ordered attempts per approval. |
| 010 | `010_skill_registry.sql` | Skill identity, versions, source metadata. |
| 011 | `011_skill_usage_resolution_outcomes.sql` | Skill usage records from resolution. |
| 012 | `012_skill_improvement_proposals.sql` | Versioned, reviewable skill proposals. |
| 013 | `013_skill_evaluation_scenarios.sql` | Skill evaluation harness. |
| 014 | `014_manager_profile_overlays.sql` | Manager profile overlays. |
| 015 | `015_artifact_metadata.sql` | Artifact metadata for SDD. |
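
Because ordering is lexicographic, the zero-padded numeric prefix is what keeps `010` after `002`. A minimal sketch of that ordering, using a hypothetical helper `orderMigrations` that is not part of VGXNESS:

```typescript
// Hypothetical helper: order migration file names lexicographically.
function orderMigrations(fileNames: string[]): string[] {
  return fileNames
    .filter((name) => name.endsWith(".sql"))
    // Plain string comparison; for ASCII names this is lexicographic order.
    .sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
}

const ordered = orderMigrations([
  "010_skill_registry.sql",
  "002_observation_revisions.sql",
  "001_initial.sql",
]);
// ordered: 001_initial.sql, 002_observation_revisions.sql, 010_skill_registry.sql

// Without zero padding, the same comparison would put "10_..." before "2_...".
const unpadded = orderMigrations(["2_b.sql", "10_a.sql"]);
// unpadded: 10_a.sql, 2_b.sql
```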

## Scopes

Two scopes share the database:

| Scope | Purpose |
|---|---|
| `project` | Repo-specific memory, SDD artifacts, run history, project agents/skills, project manager profile overlay. |
| `personal` | User preferences, reusable skills, cross-project patterns, personal manager profile overlay. |

Scopes are not stored in separate databases in v1.5.1; they are columns on the relevant tables. Code that wants to read or write a scope must pass it explicitly. Memory writes default to `project`; pass `scope: "personal"` for personal/global observations.
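
As a rough illustration of scope-as-a-column with a `project` default, consider the in-memory sketch below. The function and type names (`saveObservationSketch`, `listByScope`, `ObservationInput`) are hypothetical and stand in for the real memory service, whose signatures are not shown in this document.

```typescript
type Scope = "project" | "personal";

interface ObservationInput {
  topicKey: string;
  content: string;
  scope?: Scope; // optional on write; defaults to "project"
}

interface StoredObservation {
  topicKey: string;
  content: string;
  scope: Scope; // stored as a column, not as a separate database
}

// One shared store stands in for the single SQLite database.
const rows: StoredObservation[] = [];

function saveObservationSketch(input: ObservationInput): StoredObservation {
  const row: StoredObservation = {
    topicKey: input.topicKey,
    content: input.content,
    scope: input.scope ?? "project",
  };
  rows.push(row);
  return row;
}

// Reads name the scope explicitly, so the two scopes are never mixed.
function listByScope(scope: Scope): StoredObservation[] {
  return rows.filter((row) => row.scope === scope);
}

saveObservationSketch({ topicKey: "build/notes", content: "repo-specific note" });
saveObservationSketch({ topicKey: "prefs/editor", content: "personal preference", scope: "personal" });
```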

## Tables and lifecycle

The schema follows a small set of conventions:

- **Memory observations** are immutable for content; updates create revisions tracked by `observation_revisions`. `topicKey` is the durable upsert key: re-saving with the same key updates the existing observation rather than creating a duplicate.
- **SDD artifacts** are stored under canonical topic keys `sdd/{change}/{phase}`. Acceptance is recorded separately on `SddArtifactAcceptanceRecord` and is not a content column.
- **Runs** are append-only in spirit. `RunRecord.status` transitions through the lifecycle described in [Safety model](./safety.md); `RunEvent` is the audit trail; `RunCheckpoint` is the resumable JSON state. Terminal runs cannot be finalized again, and the final `outcome` must match the terminal `status`.
- **Operation attempts** are reserved through `RunService.executeOperation(...)` and finalized in one transaction. `reserved` is exclusive; once finalized, a new attempt requires a fresh reservation and policy admission.
- **Skills** are versioned, with `currentVersionId` pointing at the active version. Skill improvement proposals do not mutate the active skill until `applySkillImprovementProposal(...)` is called on an `approved` proposal.
- **Manager profile overlays** are a thin layer over the canonical agent manifest; they record per-user/per-project model and prompt-contract preferences without mutating the built-in manifest.
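
The `topicKey` upsert convention from the first bullet can be pictured with a small in-memory model. This is only a sketch under assumptions: `upsertObservationSketch`, the `Revision` shape, and the choice to keep the previous content in the revision list are hypothetical, and the change name in the example key is made up to fit the `sdd/{change}/{phase}` pattern.

```typescript
interface Revision {
  revision: number;
  content: string; // content that was current before the update
}

interface Observation {
  topicKey: string;
  content: string;
  revisions: Revision[];
}

// Keyed by topicKey, so a key can never map to two observations.
const byTopicKey = new Map<string, Observation>();

function upsertObservationSketch(topicKey: string, content: string): Observation {
  const existing = byTopicKey.get(topicKey);
  if (!existing) {
    const created: Observation = { topicKey, content, revisions: [] };
    byTopicKey.set(topicKey, created);
    return created;
  }
  // Record the outgoing content as a revision instead of discarding it.
  existing.revisions.push({
    revision: existing.revisions.length + 1,
    content: existing.content,
  });
  existing.content = content;
  return existing;
}

upsertObservationSketch("sdd/example-change/design", "first draft");
const updated = upsertObservationSketch("sdd/example-change/design", "second draft");
// byTopicKey still holds a single entry; updated.revisions has one item.
```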

## Backup and recovery

There is no automatic backup of the SQLite database in v1.5.1. The recommended pattern is to treat the database file as a normal file in the user's data directory and use OS-level snapshots if needed.

Provider config (OpenCode) has its own backup/rollback path through `vgxness setup rollback --backup <path>`. That is separate from the database and is documented in the [CLI reference](./cli.md).

## Wiping and migrating

To reset state for a project, point `--db` at a new path; do not delete a database file that another `vgxness` process might still be using. Migrations are forward-only and idempotent within a single database version. If a downgrade is required, restore the previous database file from a snapshot; VGXNESS does not support automatic down-migrations.
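
One common way to get forward-only, repeat-safe behavior is to track which migrations have already been applied and run only the remainder. The sketch below illustrates that idea; it is an assumption about the general technique, not a description of how VGXNESS tracks applied migrations, and `pendingMigrations` and `applyAllSketch` are hypothetical names.

```typescript
// Hypothetical: migrations not yet applied, in lexicographic order.
function pendingMigrations(available: string[], applied: ReadonlySet<string>): string[] {
  return available
    .filter((name) => !applied.has(name))
    .sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
}

// Hypothetical runner: marks each pending migration as applied.
function applyAllSketch(available: string[], applied: Set<string>): string[] {
  const ran: string[] = [];
  for (const name of pendingMigrations(available, applied)) {
    // A real runner would execute the SQL file here, inside a transaction.
    applied.add(name);
    ran.push(name);
  }
  return ran;
}

const appliedSoFar = new Set<string>(["001_initial.sql"]);
const files = ["002_observation_revisions.sql", "001_initial.sql"];
const firstRun = applyAllSketch(files, appliedSoFar);
const secondRun = applyAllSketch(files, appliedSoFar);
// firstRun contains only 002_observation_revisions.sql; secondRun is empty.
```

There is deliberately no "down" step in the sketch: going back means restoring an earlier database file, as the text describes.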

## Safety

- The database file holds potentially sensitive content (memory observations, run transcripts, prompt excerpts). Treat it as you would any other local credential-bearing file. The redaction helpers in `src/code/reporting/redaction.ts` are applied at the prompt/report/checkpoint boundary, not at the storage boundary.
- `--db` accepts any path the calling user can write to. VGXNESS refuses to write through symlinks that escape the parent directory of the resolved path.
- `*.sqlite*`, `.vgx/`, and `.opencode/` are git-ignored.