vgxness 1.5.1 → 1.5.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (42)
  1. package/README.md +23 -2
  2. package/dist/agents/agent-seed-service.js +10 -0
  3. package/dist/agents/canonical-agent-manifest.js +177 -0
  4. package/dist/agents/canonical-agent-projection.js +146 -0
  5. package/dist/agents/renderers/claude-renderer.js +30 -52
  6. package/dist/cli/bun-bin.js +6 -0
  7. package/dist/cli/cli-help.js +3 -0
  8. package/dist/cli/commands/agent-skill-dispatcher.js +6 -5
  9. package/dist/cli/commands/mcp-dispatcher.js +65 -3
  10. package/dist/cli/index.js +1 -1
  11. package/dist/governance/governance-report-builder.js +45 -26
  12. package/dist/mcp/claude-code-agent-config.js +79 -0
  13. package/dist/mcp/claude-code-config.js +84 -0
  14. package/dist/mcp/client-install-claude-code-contract.js +86 -0
  15. package/dist/mcp/client-install-claude-code.js +85 -0
  16. package/dist/mcp/index.js +5 -0
  17. package/dist/mcp/opencode-default-agent-config.js +7 -113
  18. package/dist/mcp/provider-canonical-agent-manifest.js +39 -0
  19. package/dist/mcp/provider-change-plan.js +57 -1
  20. package/dist/mcp/provider-doctor.js +54 -0
  21. package/dist/mcp/provider-status.js +82 -2
  22. package/dist/mcp/schema.js +2 -2
  23. package/dist/mcp/validation.js +1 -1
  24. package/dist/memory/memory-service.js +4 -0
  25. package/dist/sdd/sdd-workflow-service.js +129 -59
  26. package/dist/setup/providers/claude-setup-adapter.js +7 -4
  27. package/docs/architecture.md +54 -112
  28. package/docs/cli.md +53 -0
  29. package/docs/code-runtime.md +218 -0
  30. package/docs/contributing.md +120 -0
  31. package/docs/glossary.md +211 -0
  32. package/docs/mcp.md +144 -0
  33. package/docs/prd.md +23 -26
  34. package/docs/providers.md +123 -0
  35. package/docs/roadmap.md +88 -0
  36. package/docs/safety.md +147 -0
  37. package/docs/storage.md +93 -0
  38. package/package.json +1 -1
  39. package/docs/funcionamiento-del-sistema.md +0 -865
  40. package/docs/harness-gap-analysis.md +0 -243
  41. package/docs/vgxcode.md +0 -87
  42. package/docs/vgxness-code.md +0 -48
package/docs/harness-gap-analysis.md DELETED
@@ -1,243 +0,0 @@
1
- # Historical Harness Systems Gap Analysis
2
-
3
- > **Status:** historical planning note. This document predates much of the v1 runtime foundation. Use `docs/architecture.md`, `docs/prd.md`, and `docs/cli.md` as the current product references. Keep this file for design context, not as a live gap checklist.
4
-
5
- ## Current interpretation after v1.3.0
6
-
7
- Several items below now exist as v1 foundations: local run records, preflight/approval planning, SDD artifacts, memory-backed storage, agent/subagent registries, provider setup previews, package evidence, and MCP control-plane tools.
8
-
9
- The remaining strategic gaps are narrower:
10
-
11
- - real provider/executor dispatch instead of planning-only run execution;
12
- - stronger SDD governance gates that combine artifact presence, human acceptance, verification evidence, and blockers;
13
- - operational TUI screens for SDD, runs, approvals, doctor, and settings;
14
- - sandbox/worktree enforcement beyond advisory planning;
15
- - export/import/redaction and upgrade/rollback evidence for beta readiness.
16
-
17
- This research compares current agent harness patterns against the `vgxness` PRD and identifies what the product still needs before it can become a serious local-first SDD harness.
18
-
19
- ## Executive summary
20
-
21
- The current PRD has the right product direction: local-first, provider-agnostic, memory-backed, SDD-first, and agent/subagent aware.
22
-
23
- What is still missing is the **runtime contract**: permissions, sandboxing, run state, provider adapters, observability, evaluation, and artifact portability. Without these, `vgxness` risks becoming “memory + prompts” instead of a real harness.
24
-
25
- ## Systems reviewed
26
-
27
- | System | Relevant lessons for `vgxness` |
28
- |---|---|
29
- | Anthropic agent patterns | Keep workflows simple and composable; distinguish predictable workflows from autonomous agents; invest heavily in tool design and transparency. |
30
- | Claude Code subagents | Subagents need isolated context, explicit tools, permissions, model selection, memory scopes, lifecycle hooks, and clear delegation descriptions. |
31
- | OpenCode agents | Provider/tool configuration should support primary agents, subagents, per-agent permissions, model routing, task permissions, and markdown/JSON definitions. |
32
- | OpenAI Agents SDK | Useful primitives: agents, handoffs, agents-as-tools, guardrails, sessions, human-in-the-loop, tracing, MCP, sandbox agents, and resumable workspaces. |
33
- | LangGraph | Durable execution, checkpoints, streaming, human-in-the-loop, stateful workflows, memory, and deep traces matter for long-running agents. |
34
- | AutoGen | Multi-agent systems benefit from layers: simple AgentChat, lower-level event-driven Core, extensions, distributed runtimes, and UI/studio tooling. |
35
- | CrewAI | Productized multi-agent systems commonly include agents, crews, flows, tasks, memory, knowledge, guardrails, observability, persistence, and resume. |
36
-
37
- ## What the PRD already covers well
38
-
39
- - Local-first memory.
40
- - Project and personal/global memory scopes.
41
- - SDD-first workflow.
42
- - Agent and subagent registry from the MVP.
43
- - Provider-agnostic model with OpenCode/Claude Code adapters.
44
- - CLI for setup/configuration and integrations for day-to-day usage.
45
- - Cloud sync and team workflows correctly deferred until later.
46
-
47
- ## Missing or underdefined areas
48
-
49
- ### 1. Runtime/run model
50
-
51
- `vgxness` needs a first-class concept of a **run**.
52
-
53
- Minimum fields:
54
-
55
- - run id
56
- - project id/path
57
- - user intent
58
- - phase/workflow
59
- - selected agent/subagent
60
- - provider adapter
61
- - model
62
- - tool calls
63
- - artifacts read/written
64
- - memory reads/writes
65
- - approvals
66
- - verification evidence
67
- - final status
68
-
69
- Why it matters: without runs, the harness cannot resume, debug, audit, or explain agent behavior.
70
-
71
- ### 2. Permission and sandbox model
72
-
73
- The PRD mentions agents and integrations, but not the security boundary.
74
-
75
- Needed capabilities:
76
-
77
- - Read/write/shell/network/git/memory permission categories.
78
- - Per-agent and per-tool permissions.
79
- - Human approval gates for destructive, external, or privileged operations.
80
- - Workspace boundary enforcement.
81
- - Optional sandbox/worktree strategy for implementation agents.
82
-
83
- This is NOT optional. A harness that can run agents without strong permissions is a loaded weapon.
84
-
85
- ### 3. Provider adapter contract
86
-
87
- Provider-agnostic intent is correct, but the PRD needs an adapter interface.
88
-
89
- Each adapter should declare:
90
-
91
- - supported agent definition fields
92
- - supported permissions
93
- - supported memory injection modes
94
- - supported subagent/task model
95
- - supported hooks/lifecycle events
96
- - config file locations
97
- - limitations
98
- - export/render format
99
-
100
- This prevents `vgxness` from pretending all tools support the same features.
101
-
102
- ### 4. Agent definition schema
103
-
104
- The agent registry needs a neutral schema, not just “store agents”.
105
-
106
- Suggested minimum schema:
107
-
108
- - name
109
- - description/delegation trigger
110
- - role/system instructions
111
- - mode: primary/subagent/workflow-phase
112
- - capabilities
113
- - allowed tools
114
- - denied tools
115
- - model preference
116
- - memory scopes
117
- - SDD phases supported
118
- - max steps/turns
119
- - required approvals
120
- - adapter overrides
121
-
122
- ### 5. Tool/ACI design
123
-
124
- Agent-computer interface design is a product feature.
125
-
126
- Needed:
127
-
128
- - Tool descriptions optimized for model usage.
129
- - Safe input schemas.
130
- - Examples and edge cases per tool.
131
- - Clear boundaries between similar tools.
132
- - Tool-level tests/evals to catch misuse.
133
-
134
- Bad tools create bad agents. This is where a lot of harnesses quietly fail.
135
-
136
- ### 6. Durable execution and resume
137
-
138
- SDD creates long-running work. Long-running work needs checkpoints.
139
-
140
- Needed:
141
-
142
- - run checkpoints
143
- - phase checkpoints
144
- - apply-progress merge rules
145
- - resumable interrupted runs
146
- - idempotency expectations for tools
147
- - failure classification: blocked, failed, needs-human, cancelled, completed
148
-
149
- ### 7. Observability and debugging
150
-
151
- The product needs traces, not just logs.
152
-
153
- Minimum trace entities:
154
-
155
- - run
156
- - phase
157
- - agent/subagent invocation
158
- - tool call
159
- - memory operation
160
- - artifact operation
161
- - approval decision
162
- - verification command/result
163
-
164
- Nice-to-have later:
165
-
166
- - token/cost tracking
167
- - model latency
168
- - failure heatmap
169
- - timeline UI/export
170
-
171
- ### 8. Evaluation and quality gates
172
-
173
- The PRD has success criteria, but not evals.
174
-
175
- Needed MVP evals:
176
-
177
- - agent resolution chooses the expected agent
178
- - SDD artifact chain remains complete
179
- - memory upsert/revision behavior is durable
180
- - provider adapter renders valid config
181
- - permission model blocks unsafe operations
182
- - resume restores the expected run state
183
-
184
- ### 9. Artifact portability
185
-
186
- Memory-only artifacts are fast, but PRD/review workflows need portability.
187
-
188
- Needed:
189
-
190
- - export SDD artifacts to markdown/json
191
- - import artifacts back into memory
192
- - snapshot a run for debugging or sharing
193
- - redact sensitive data during export
194
-
195
- ### 10. CLI surface definition
196
-
197
- The PRD says CLI, but the first command set is still open.
198
-
199
- Candidate MVP commands:
200
-
201
- - `vgx init`
202
- - `vgx memory search|get|save|update`
203
- - `vgx agent list|add|render|validate`
204
- - `vgx sdd new|continue|status|archive`
205
- - `vgx run list|show|resume`
206
- - `vgx adapter doctor|render`
207
-
208
- ## Recommended MVP additions to PRD
209
-
210
- Add these as explicit MVP requirements:
211
-
212
- 1. **Run lifecycle model** — every agentic operation is captured as a resumable/auditable run.
213
- 2. **Permission model** — per-agent tool permissions with human approval gates.
214
- 3. **Provider adapter contract** — adapters translate neutral `vgxness` definitions into provider-specific configs.
215
- 4. **Agent schema** — neutral registry schema for agents/subagents/workflow-phase agents.
216
- 5. **Trace model** — structured trace records for runs, tools, memory, artifacts, approvals, and verification.
217
- 6. **Artifact export/import** — SDD and memory artifacts can be exported for review/debugging.
218
- 7. **Evaluation harness** — tests/evals for agent resolution, adapters, permissions, memory, and resume.
219
-
220
- ## Suggested next SDD change
221
-
222
- Create a new SDD change named `harness-runtime-foundation`.
223
-
224
- Scope it narrowly:
225
-
226
- - define run lifecycle schema
227
- - define agent registry schema
228
- - define permission categories
229
- - define provider adapter interface
230
- - add CLI validation/render skeleton
231
- - add tests for schemas and adapter rendering
232
-
233
- Do **not** implement full cloud sync, distributed agents, web UI, or team workflows yet.
234
-
235
- ## Sources
236
-
237
- - Anthropic: Building effective agents
238
- - Claude Code: subagents documentation
239
- - OpenCode: agents documentation
240
- - OpenAI Agents SDK documentation
241
- - LangGraph overview
242
- - Microsoft AutoGen documentation
243
- - CrewAI documentation
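
The run model in section 1 of the deleted note, together with the failure classification in section 6, could be sketched roughly as follows. This is an illustrative sketch only: the note lists field names, not types, so every type below, the `newRun` helper, and the `isResumable` rule are assumptions rather than the package's actual schema.

```typescript
// Status values taken from the note's failure classification list.
type RunStatus = "blocked" | "failed" | "needs-human" | "cancelled" | "completed";

// Hypothetical record shape: field names follow the note's "Minimum fields"
// list; the types are assumed for illustration.
interface RunRecord {
  runId: string;
  projectPath: string;
  userIntent: string;
  phase: string;
  agent: string;
  providerAdapter: string;
  model: string;
  toolCalls: string[];
  artifactsRead: string[];
  artifactsWritten: string[];
  memoryReads: string[];
  memoryWrites: string[];
  approvals: { action: string; approved: boolean }[];
  verificationEvidence: string[];
  finalStatus: RunStatus | null; // null while the run is still in progress
}

// Hypothetical helper: creates an empty in-progress run with placeholder values.
function newRun(runId: string, projectPath: string, userIntent: string): RunRecord {
  return {
    runId,
    projectPath,
    userIntent,
    phase: "unassigned",
    agent: "unassigned",
    providerAdapter: "unassigned",
    model: "unassigned",
    toolCalls: [],
    artifactsRead: [],
    artifactsWritten: [],
    memoryReads: [],
    memoryWrites: [],
    approvals: [],
    verificationEvidence: [],
    finalStatus: null,
  };
}

// Assumed rule: a run can be resumed while it is in progress or waiting on
// something external (blocked, needs-human); other statuses are terminal.
function isResumable(run: RunRecord): boolean {
  return (
    run.finalStatus === null ||
    run.finalStatus === "blocked" ||
    run.finalStatus === "needs-human"
  );
}
```

Keeping `finalStatus` nullable separates "still running" from the five terminal or waiting classifications, which is one way to make resume decisions explicit.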
package/docs/vgxcode.md DELETED
@@ -1,87 +0,0 @@
1
- # VGXNESS Code OpenTUI shell (`vgxcode`)
2
-
3
- Experimental Bun/OpenTUI coding interface for VGXNESS Code.
4
-
5
- **Naming rule:** `VGXNESS Code` is the public product/runtime surface (`vgxness code ...`; `vgx code ...` remains a compatibility alias). `vgxcode` is the internal root-owned OpenTUI shell that renders and drives that runtime during repository development.
6
-
7
- ## Why this is root-owned
8
-
9
- VGXNESS ships the existing `vgxness`/`vgx` CLI bins while the OpenTUI coding interface remains an internal root-owned surface. The repository root owns `@opentui/core`, the Bun lockfile, and verification.
10
-
11
- This keeps the shipped `vgxness`/`vgx` CLI stable while letting us build the OpenTUI experience directly from root source.
12
-
13
- ## Run
14
-
15
- Prerequisite: install Bun.
16
-
17
- ```bash
18
- bun src/cli/tui/opentui/code/index.ts
19
- ```
20
-
21
- Interactive mode starts in read-only `inspect`. Type a task/question and press `Enter`; `vgxcode` runs the root Bun CLI bridge as one of:
22
-
23
- ```bash
24
- bun run cli:bun -- code inspect "<your prompt>" --events-jsonl
25
- bun run cli:bun -- code plan "<your prompt>" --events-jsonl
26
- bun run cli:bun -- code craft-preview "<your prompt>" --events-jsonl
27
- bun run cli:bun -- code craft "<your prompt>" --events-jsonl --approval-channel stdio
28
- ```
29
-
30
- The OpenTUI shell uses `bun run --silent cli:bun -- ...` internally so Bun lifecycle output does not pollute the JSONL event stream.
31
-
32
- The prompt defaults to `inspect`. Press `Tab` to toggle between `inspect` and `plan`, or prefix a prompt with `/inspect`, `/plan`, `/craft-preview`, or `/craft`:
33
-
34
- ```text
35
- /plan outline a safe implementation
36
- /inspect summarize the current architecture
37
- /craft-preview show the diff you would make
38
- /craft apply the smallest approved fix
39
- ```
40
-
41
- After submit, the prompt input is cleared and the submitted prompt remains visible as `Last submitted`. The UI shows explicit `idle`, `running`, `completed`, and `error` states.
42
-
43
- `vgxcode` does not own mutation policy. `inspect`, `plan`, and `craft-preview` are read-only/preview paths. `/craft` is approval-capable and may mutate only through the VGXNESS Code runtime and its explicit approval channel; the OpenTUI shell only renders pending approvals and writes approve/deny decisions to the live runtime process.
44
-
45
- To replay real read-only runtime events without spawning the root CLI, pipe the root Bun CLI JSONL bridge into `vgxcode`:
46
-
47
- ```bash
48
- bun run cli:bun -- code inspect "What is this project?" --events-jsonl | bun src/cli/tui/opentui/code/index.ts
49
- bun run cli:bun -- code plan "Plan a safe change" --events-jsonl | bun src/cli/tui/opentui/code/index.ts
50
- bun run cli:bun -- code craft-preview "Preview a safe change" --events-jsonl | bun src/cli/tui/opentui/code/index.ts
51
- ```
52
-
53
- Use `bun run cli:bun -- ...` for OpenTUI-adjacent local testing. `npm run cli -- ...` uses Node/tsx and can fail when a path loads `@opentui/core`.
54
-
55
- Press `Ctrl+C` to exit.
56
-
57
- ## Current scope
58
-
59
- The shell reads newline-delimited `CodeRuntimeEvent` JSON from stdin when piped. If stdin has events or parse errors, `vgxcode` renders that stream and does not spawn the root CLI. If stdin is a TTY, the OpenTUI entrypoint opens the interactive prompt and uses `inspect` by default through the JSONL bridge.
60
-
61
- Errors are shown in the Activity panel when JSONL parsing fails, unsupported runtime events appear, npm/lifecycle banners appear in the stream, or the spawned root CLI exits non-zero.
62
-
63
- ## Checks
64
-
65
- ```bash
66
- npm run check:bun-lock # from the repository root; read-only/advisory
67
- bun run verify:typecheck
68
- node --import tsx --test test/cli/tui/opentui-code.test.ts
69
- bun run smoke:opentui-code
70
- ```
71
-
72
- The root `npm run check:bun-lock` command compares root `package.json`
73
- dependency specifiers with the root `bun.lock` without installing Bun or
74
- mutating `node_modules`. The root lockfile is the repository dependency
75
- authority; package evidence is validated by `bun run package:bun:evidence`.
76
-
77
- Manual interactive check:
78
-
79
- ```bash
80
- bun src/cli/tui/opentui/code/index.ts
81
- # type: What is this project?
82
- # press Enter
83
- ```
84
-
85
- ## Safety rule
86
-
87
- `vgxcode` renders state and user decisions. It must not execute tools directly, bypass approvals, or own mutation policy. Runtime, approvals, verification, SDD, and memory stay in the VGXNESS core runtime.
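
The stdin handling described under "Current scope" in the deleted `vgxcode.md` (newline-delimited events, with parse failures and stray banners surfaced as errors) might look something like the sketch below. The real `CodeRuntimeEvent` shape is not shown in this diff, so the `{ type: string }` event shape, the `parseEventStream` name, and the error messages are all hypothetical.

```typescript
// Hypothetical result shape: parsed events plus human-readable error messages.
interface ParsedStream {
  events: { type: string }[];
  errors: string[];
}

// Parses newline-delimited JSON. Lines that are not valid JSON (for example a
// package-manager lifecycle banner) or that lack a string `type` field are
// reported as errors instead of aborting the whole stream.
function parseEventStream(text: string): ParsedStream {
  const result: ParsedStream = { events: [], errors: [] };
  text.split("\n").forEach((raw, index) => {
    const line = raw.trim();
    if (line === "") return; // skip blank lines
    try {
      const value: unknown = JSON.parse(line);
      if (
        typeof value === "object" &&
        value !== null &&
        typeof (value as { type?: unknown }).type === "string"
      ) {
        result.events.push(value as { type: string });
      } else {
        result.errors.push(`line ${index + 1}: not an event object`);
      }
    } catch {
      result.errors.push(`line ${index + 1}: invalid JSON`);
    }
  });
  return result;
}
```

Collecting errors per line rather than throwing matches the documented behavior of rendering problems in an Activity panel while still showing the events that did parse.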
package/docs/vgxness-code.md DELETED
@@ -1,48 +0,0 @@
1
- # VGXNESS Code Readiness Notes
2
-
3
- VGXNESS Code is the native VGXNESS coding CLI/runtime. It is not an OpenCode wrapper, fork, compatibility layer, config format, prompt copy, or branded re-skin. Provider adapters translate VGXNESS-native requests only.
4
-
5
- `vgxcode` is the internal root-owned OpenTUI shell for VGXNESS Code development. Public commands stay under `vgxness code ...`; the OpenTUI shell should render runtime state and approval decisions without becoming a separate mutation policy layer.
6
-
7
- ## Commands
8
-
9
- - `vgxness code inspect "<question>"` — read-only repository investigation.
10
- - `vgxness code plan "<task>"` — read-only implementation planning.
11
- - `vgxness code craft "<task>"` — bounded edit-capable work with approval gates.
12
- - `vgxness code sdd <change> <phase>` — SDD-backed phase work; use `--save-artifact` only when persistence is intended.
13
-
14
- Useful controls: `--provider`, `--model`, `--stream`, `--json`, `--max-source-bytes`, `--approval-policy ask|allow|deny`, `--verification none|suggest|run|repair`, `--transcript off|summary|full`, and `--memory off|ask|auto`.
15
-
16
- ## Configuration and Reporting
17
-
18
- Safe defaults are local and conservative: fake provider, read-only posture, approval policy `ask`, verification `suggest`, transcript `summary`, memory `off`, bounded prompt/context size, and no repair loop unless explicitly enabled.
19
-
20
- Transcript modes:
21
- - `off`: no transcript in the final summary.
22
- - `summary`: checkpoint labels/timestamps only.
23
- - `full`: sanitized checkpoints and tool summaries; command stdout/stderr are omitted by default.
24
-
25
- Memory modes:
26
- - `off`: never save learnings.
27
- - `ask`: prepare a sanitized memory-save checkpoint but do not persist.
28
- - `auto`: save sanitized learnings only through a configured memory gateway.
29
-
30
- ## Safety Model
31
-
32
- VGXNESS Code routes edits, shell, network, git mutation, SDD persistence, and memory saves through explicit policy decisions. External workspace edits are denied, destructive commands require approval, git mutation is blocked by default unless explicitly approved, and network access requires approval. Prompts, reports, checkpoints, transcripts, and memory saves redact secret-like values.
33
-
34
- ## SDD Mode
35
-
36
- SDD mode loads existing artifacts for the requested change/phase and exposes phase-appropriate tools. Non-implementation phases stay read/artifact oriented. `apply-progress` may expose edit and shell tools; `verify` may expose verification shell tools. Artifact saves require explicit persistence intent.
37
-
38
- ## Verification and Provider Setup
39
-
40
- Project detection reports repository root, stack hints, config files, and verification presets such as `npm run typecheck` or `npm run test` when package scripts exist. The fake provider is deterministic for local tests. OpenAI-compatible providers are credential-gated by environment references; secret values are not inserted into prompts or reports.
41
-
42
- ## Rollout Checklist
43
-
44
- - Config: safe defaults documented; transcript/memory/provider controls exposed.
45
- - Safety: external edits, destructive shell, git mutation, network, secrets, and unrelated user work covered by tests.
46
- - Verification: detected presets reported; verification results are honest pass/fail/skipped evidence.
47
- - Reporting: transcripts are configurable and sanitized; sensitive command output is omitted by default.
48
- - Provider behavior: core runtime remains provider-neutral and native VGXNESS Code.
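
The "Safety Model" paragraph in the deleted `vgxness-code.md` reads naturally as a small decision function. The sketch below is an assumption-heavy illustration: the `Action` union, the `decide` name, and in particular the way the `--approval-policy ask|allow|deny` value combines with each rule are not specified in this diff.

```typescript
// Values of the documented --approval-policy flag.
type ApprovalPolicy = "ask" | "allow" | "deny";
type Decision = "allow" | "ask" | "deny";

// Hypothetical action categories, named after the operations the note lists.
type Action =
  | { kind: "edit"; insideWorkspace: boolean }
  | { kind: "shell"; destructive: boolean }
  | { kind: "git-mutation" }
  | { kind: "network" };

function decide(action: Action, policy: ApprovalPolicy): Decision {
  // Per the note: external workspace edits are denied outright.
  if (action.kind === "edit" && !action.insideWorkspace) return "deny";

  // Per the note: destructive commands, git mutation, and network access
  // require approval. Assumption: in-workspace edits are gated the same way.
  const needsApproval =
    action.kind === "edit" ||
    action.kind === "git-mutation" ||
    action.kind === "network" ||
    (action.kind === "shell" && action.destructive);

  // Assumption: gated actions follow the policy value directly, and
  // non-destructive shell commands are allowed without approval.
  return needsApproval ? policy : "allow";
}
```

Under the documented default policy `ask`, every gated action in this sketch resolves to `ask`, which keeps git mutation and network access blocked until someone explicitly approves them.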