vibe-coding-master 0.4.41 → 0.5.0
- package/dist/backend/api/task-routes.js +7 -0
- package/dist/backend/api/translation-routes.js +11 -0
- package/dist/backend/errors.js +2 -0
- package/dist/backend/gateway/gateway-service.js +31 -61
- package/dist/backend/server.js +11 -1
- package/dist/backend/services/round-service.js +34 -0
- package/dist/backend/services/task-launch-service.js +91 -0
- package/dist/backend/services/translation-service.js +87 -0
- package/dist/backend/services/translation-worker-service.js +228 -72
- package/dist-frontend/assets/index-BaDS9Ohj.js +96 -0
- package/dist-frontend/index.html +1 -1
- package/docs/ARCHITECTURE.md +123 -0
- package/docs/TESTING.md +121 -73
- package/docs/known-issues.md +155 -0
- package/package.json +1 -1
- package/dist-frontend/assets/index-CsxS5H0d.js +0 -96
- package/docs/claude-code-translation-plan.md +0 -1268
- package/docs/full-harness-baseline.md +0 -160
- package/docs/gate-review-gates.md +0 -132
- package/docs/gateway-design.md +0 -813
- package/docs/v0.2-implementation-plan.md +0 -408
- package/docs/v0.4-harness-optimization-plan.md +0 -664
package/dist-frontend/index.html
CHANGED
|
@@ -4,7 +4,7 @@
|
|
|
4
4
|
<meta charset="UTF-8" />
|
|
5
5
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
|
6
6
|
<title>VibeCodingMaster</title>
|
|
7
|
-
<script type="module" crossorigin src="/assets/index-
|
|
7
|
+
<script type="module" crossorigin src="/assets/index-BaDS9Ohj.js"></script>
|
|
8
8
|
<link rel="stylesheet" crossorigin href="/assets/index-BM6nSKae.css">
|
|
9
9
|
</head>
|
|
10
10
|
<body>
|
package/docs/ARCHITECTURE.md
CHANGED
|
@@ -1 +1,124 @@
|
|
|
1
1
|
# Architecture
|
|
2
|
+
|
|
3
|
+
Project-level architecture for VibeCodingMaster (VCM). VCM is a single npm
|
|
4
|
+
package that provides a local GUI cockpit for running and orchestrating multiple
|
|
5
|
+
Claude Code role sessions around one engineering task.
|
|
6
|
+
|
|
7
|
+
This document is architect-owned. It gives the project-wide module overview,
|
|
8
|
+
responsibilities, relationships, dependency direction, and constraints. The
|
|
9
|
+
module-level detailed design lives in [`ARCHITECTURE.md`](../ARCHITECTURE.md) at
|
|
10
|
+
the repository root (the single workspace module recorded in
|
|
11
|
+
`.ai/generated/module-index.json`).
|
|
12
|
+
|
|
13
|
+
## Module / Layer Overview
|
|
14
|
+
|
|
15
|
+
VCM is one workspace module (`vibe-coding-master`) organized into three source
|
|
16
|
+
layers plus supporting tools.
|
|
17
|
+
|
|
18
|
+
| Layer | Path | Responsibility |
|
|
19
|
+
| --- | --- | --- |
|
|
20
|
+
| Backend | `src/backend` | Fastify HTTP + WebSocket server, role/session runtime over `node-pty`, services, external adapters, mobile gateway, and the downstream harness templates. |
|
|
21
|
+
| Frontend | `src/frontend` | React 19 + Vite single-page GUI: task workspace, role tabs, embedded `xterm` terminals, harness/translation panels, and client state stores. |
|
|
22
|
+
| Shared | `src/shared` | Cross-layer TypeScript types, constants (role definitions, ports), and zod-backed validation helpers consumed by both backend and frontend. |
|
|
23
|
+
| Tools / scripts | `.ai/tools`, `scripts` | Generated-context generators, long-running validation wrappers, bash guard, and harness install/verify scripts. |
|
|
24
|
+
|
|
25
|
+
### Backend sub-areas (`src/backend`)
|
|
26
|
+
|
|
27
|
+
- `api/`: Fastify route modules, one per domain (project, task, session, round,
|
|
28
|
+
message, harness, gate-review, translation, gateway, diagnostics, artifacts,
|
|
29
|
+
runtime-state, app-settings, claude-hook). Routes are thin and delegate to
|
|
30
|
+
services.
|
|
31
|
+
- `services/`: business logic. Key services include `task-service`,
|
|
32
|
+
`task-launch-service` (backend-owned one-click task start, shared by the GUI
|
|
33
|
+
endpoint and the gateway), `session-service`, `round-service`,
|
|
34
|
+
`runtime-coordinator-service`, `message-service`, `harness-service`,
|
|
35
|
+
`gate-review-service`, `translation-service`/`translation-worker-service`,
|
|
36
|
+
`job-guard-service`, and `command-dispatcher`.
|
|
37
|
+
- `runtime/`: PTY-backed terminal runtime (`node-pty-runtime`,
|
|
38
|
+
`terminal-runtime`, `session-registry`, `terminal-submit`) that supervises one
|
|
39
|
+
Claude Code process per role.
|
|
40
|
+
- `adapters/`: side-effect boundaries — `claude-adapter`, `git-adapter`,
|
|
41
|
+
`command-runner`, `filesystem`.
|
|
42
|
+
- `gateway/`: mobile gateway service plus channel implementations
|
|
43
|
+
(Weixin iLink, Lark) and command parsing.
|
|
44
|
+
- `templates/`: message/handoff/role-command templates and, under
|
|
45
|
+
`templates/harness/`, the source of truth for the VCM harness that VCM installs
|
|
46
|
+
into downstream repositories.
|
|
47
|
+
- `ws/`: WebSocket bridge (`terminal-ws`) streaming PTY I/O to the frontend.
|
|
48
|
+
- `server.ts`, `main.ts`, `app-version.ts`, `vcm-data-dir.ts`, `errors.ts`:
|
|
49
|
+
composition root, CLI entry, version, data-dir resolution, error types.
|
|
50
|
+
|
|
51
|
+
### Frontend sub-areas (`src/frontend`)
|
|
52
|
+
|
|
53
|
+
- `routes/`: top-level views (`project-dashboard`, `task-workspace`).
|
|
54
|
+
- `components/`: GUI building blocks (app shell, session console/toolbar, role
|
|
55
|
+
session tabs, harness panel/studio, translation panel, message timeline,
|
|
56
|
+
repo connect form, diff modal, error center).
|
|
57
|
+
- `state/`: client stores and helpers (`app-store`, `session-store`,
|
|
58
|
+
`api-client`, polling schedulers, translation feed, UI error handling).
|
|
59
|
+
- `terminal/`: `xterm` view and terminal websocket client.
|
|
60
|
+
|
|
61
|
+
### Shared sub-areas (`src/shared`)
|
|
62
|
+
|
|
63
|
+
- `types/`: domain type contracts shared across layers.
|
|
64
|
+
- `validation/`: pure validators (`artifact-check`, `language-detect`,
|
|
65
|
+
`slug-check`).
|
|
66
|
+
- `constants.ts`: role definitions and default ports.
|
|
67
|
+
|
|
68
|
+
## Module Relationships and Dependency Direction
|
|
69
|
+
|
|
70
|
+
```
|
|
71
|
+
frontend --depends on--> shared <--depends on-- backend
|
|
72
|
+
| ^
|
|
73
|
+
+-------- HTTP /api + WS /ws (api-client) ----------+
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
- `shared` is the leaf layer. It must not import from `backend` or `frontend`.
|
|
77
|
+
- `backend` and `frontend` both depend on `shared`, and must not depend on each
|
|
78
|
+
other at the module level.
|
|
79
|
+
- The only runtime coupling between frontend and backend is the HTTP `/api`
|
|
80
|
+
surface and the `/ws` WebSocket, mediated on the client by
|
|
81
|
+
`src/frontend/state/api-client.ts` and
|
|
82
|
+
`src/frontend/terminal/terminal-client.ts`.
|
|
83
|
+
- Within the backend, the intended direction is
|
|
84
|
+
`api -> services -> (runtime | adapters | gateway | templates)`. Routes should
|
|
85
|
+
not contain business logic; services should reach the outside world only
|
|
86
|
+
through adapters and the runtime.
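
The layer rules above can be expressed as a small lookup. The following is a minimal sketch, not part of VCM: `layerOf` and `isImportAllowed` are hypothetical helper names, and only the three `src/` layers named in this document are modeled.

```typescript
// Hypothetical helper, not shipped with VCM. Layer names mirror the overview table.
type Layer = "backend" | "frontend" | "shared";

// Import targets each layer may use, per the dependency rules:
// shared is the leaf; backend and frontend may use shared but not each other.
const ALLOWED_IMPORTS: Record<Layer, readonly Layer[]> = {
  shared: ["shared"],
  backend: ["backend", "shared"],
  frontend: ["frontend", "shared"],
};

// Map a repo-relative path such as "src/backend/server.ts" to its layer.
function layerOf(path: string): Layer | undefined {
  const match = /^src\/(backend|frontend|shared)\//.exec(path);
  return match ? (match[1] as Layer) : undefined;
}

// True when an import from `fromPath` to `toPath` respects the layer boundary.
function isImportAllowed(fromPath: string, toPath: string): boolean {
  const from = layerOf(fromPath);
  const to = layerOf(toPath);
  if (!from || !to) return true; // paths outside the three layers are not covered here
  return ALLOWED_IMPORTS[from].includes(to);
}
```

A check like this could run over resolved import pairs in a lint step. It does not enforce the finer `api -> services -> (runtime | adapters | gateway | templates)` direction inside the backend.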
|
|
87
|
+
|
|
88
|
+
## Project-Wide Constraints
|
|
89
|
+
|
|
90
|
+
- Single npm package, ESM only, TypeScript strict mode. Node `^20 || >=22`.
|
|
91
|
+
- Keep the layer boundary: no `shared -> backend/frontend` imports, no direct
|
|
92
|
+
`frontend <-> backend` imports.
|
|
93
|
+
- Backend and frontend compile under separate tsconfigs
|
|
94
|
+
(`tsconfig.node.json`, `tsconfig.json`); new files must fall inside the correct
|
|
95
|
+
`include` globs, and `npm run typecheck` must pass both.
|
|
96
|
+
- Downstream harness behavior is defined by `src/backend/templates/harness/**`;
|
|
97
|
+
change harness output there, not in generated target-repo files.
|
|
98
|
+
- Long-running and background process rules from the VCM managed block in
|
|
99
|
+
`CLAUDE.md` apply; never detach processes.
|
|
100
|
+
- The npm package ships only built artifacts (`dist`, `dist-frontend`, `docs`,
|
|
101
|
+
`scripts`, `README.md`).
|
|
102
|
+
|
|
103
|
+
## Generated Context Ownership
|
|
104
|
+
|
|
105
|
+
Generated indexes under `.ai/generated/` are machine-maintained and regenerated
|
|
106
|
+
by the tools in `.ai/tools/`:
|
|
107
|
+
|
|
108
|
+
- `.ai/generated/module-index.json` — produced by
|
|
109
|
+
`.ai/tools/generate-module-index`. Maps the workspace to layers, modules,
|
|
110
|
+
manifests, module docs, source files, and test files. Use it to locate code and
|
|
111
|
+
confirm module boundaries.
|
|
112
|
+
- `.ai/generated/public-surface.json` — produced by
|
|
113
|
+
`.ai/tools/generate-public-surface` (after `module-index.json` exists). It is the
|
|
114
|
+
authoritative machine index of module-to-module public APIs, routes, and
|
|
115
|
+
externally consumed surfaces. Treat it as the full public-surface listing;
|
|
116
|
+
module docs explain meaning and design intent rather than duplicating it.
|
|
117
|
+
|
|
118
|
+
Regenerate both after changing module layout, public exports, or HTTP routes.
|
|
119
|
+
|
|
120
|
+
## Module-Level Architecture Docs
|
|
121
|
+
|
|
122
|
+
- Root module: [`ARCHITECTURE.md`](../ARCHITECTURE.md) — detailed design,
|
|
123
|
+
boundaries, behavior, public surface explanation, risks, and update triggers
|
|
124
|
+
for the `vibe-coding-master` workspace module.
|
package/docs/TESTING.md
CHANGED
|
@@ -1,82 +1,130 @@
|
|
|
1
1
|
# Testing
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Reviewer-owned validation strategy for VibeCodingMaster (VCM). This document maps
|
|
4
|
+
the VCM validation levels to project-native commands, where tests live, how to
|
|
5
|
+
select what to run, and the current testing gaps.
|
|
4
6
|
|
|
5
|
-
## Validation Levels
|
|
7
|
+
## Validation Levels and Commands
|
|
6
8
|
|
|
7
|
-
|
|
8
|
-
| --- | --- | --- |
|
|
9
|
-
| L0 | format / typecheck | `npm run typecheck` (`tsc -p tsconfig.json --noEmit && tsc -p tsconfig.node.json --noEmit`) |
|
|
10
|
-
| L1/L2 | unit + module/integration tests | `npm run test` (vitest, runs `tests/unit/**` and `tests/integration/**`) |
|
|
11
|
-
| L3 | smoke / E2E (browser) | `npm run e2e` (Playwright) |
|
|
12
|
-
| L4 | full regression / release | `npm run prepack` (build + `verify:package`) + full `npm run test` + `npm run e2e` |
|
|
13
|
-
|
|
14
|
-
Targeted run: `npx vitest run <path/to/test.ts>`.
|
|
15
|
-
|
|
16
|
-
### Test environment note
|
|
17
|
-
|
|
18
|
-
`vitest.config.ts` runs in `environment: "node"` (no jsdom). Frontend tests are
|
|
19
|
-
therefore either pure-helper tests or component tests rendered with
|
|
20
|
-
`react-dom/server` `renderToStaticMarkup` (static HTML assertions), not DOM/event
|
|
21
|
-
tests. Component modules that load browser globals at import time (e.g. the
|
|
22
|
-
xterm-backed `terminal/xterm-view`) must be stubbed with `vi.mock` before the
|
|
23
|
-
component under test is imported.
|
|
24
|
-
|
|
25
|
-
## Required prerequisite: build before full `npm run test`
|
|
26
|
-
|
|
27
|
-
`tests/unit/backend/harness-templates-sync.test.ts` spawns the real harness
|
|
28
|
-
installer (`scripts/install-vcm-harness.mjs`), which runs the compiled CLI at
|
|
29
|
-
`dist/backend/cli/install-vcm-harness.js`, falling back to the TypeScript source
|
|
30
|
-
only when the `tsx` binary is present. In a clean checkout with no `dist/` and no
|
|
31
|
-
`node_modules/.bin/tsx`, the installer exits with
|
|
32
|
-
`compiled CLI not found. Run npm run build first.` and this single test fails.
|
|
33
|
-
|
|
34
|
-
**Run `npm run build` before a full `npm run test`** (or before final acceptance).
|
|
35
|
-
All other unit tests run without a build. This is a documented prerequisite, not a
|
|
36
|
-
product defect or known issue.
|
|
9
|
+
All commands run from the repository root (the task worktree during a VCM task).
|
|
37
10
|
|
|
11
|
+
| Level | Scope | Command(s) |
|
|
12
|
+
| --- | --- | --- |
|
|
13
|
+
| L0 fast checks | Format/lint/typecheck/boundary. Project ships typecheck across both tsconfigs. | `npm run typecheck` |
|
|
14
|
+
| L1 coder unit checks | Changed behavior + direct regressions via Vitest unit tests. | `npm test` (optionally a scoped `npx vitest run <path>`) |
|
|
15
|
+
| L2 module / integration checks | Module/API/runtime wiring. Vitest config reserves `tests/integration/api/**` and `tests/integration/runtime/**`. | `npm test` (runs unit + any integration tests that exist) |
|
|
16
|
+
| L3 smoke E2E checks | Core GUI journeys via Playwright. | `npm run e2e` |
|
|
17
|
+
| L4 full regression / release | Build + package verification before publish. | `npm run build` then `npm run verify:package` |
|
|
18
|
+
|
|
19
|
+
Notes:
|
|
20
|
+
|
|
21
|
+
- `npm run typecheck` runs `tsc` against both `tsconfig.json` (frontend + shared)
|
|
22
|
+
and `tsconfig.node.json` (backend), so it is the boundary/type gate for the
|
|
23
|
+
whole module.
|
|
24
|
+
- `npm test` is `vitest run`. Its `include` globs already cover unit and the
|
|
25
|
+
reserved integration directories, so a single `npm test` is both L1 and L2 once
|
|
26
|
+
integration tests exist.
|
|
27
|
+
- `npm run e2e` is `playwright test` against `tests/e2e`, and its `webServer`
|
|
28
|
+
starts `npm run dev` automatically (reusing an existing server if one is up).
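
For orientation, a Playwright configuration matching the `npm run e2e` note might look roughly like the fragment below. This is a sketch under assumptions, not the project's actual `playwright.config.ts`: the `url` value is taken from the GUI address used in the E2E case list, and any other settings the real config carries are omitted.

```typescript
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // Specs live under tests/e2e, as described in the note above.
  testDir: "tests/e2e",
  webServer: {
    // Start the dev server before the run.
    command: "npm run dev",
    // Assumption: the dev GUI address named in the E2E case list.
    url: "http://127.0.0.1:5173",
    // Reuse a server that is already up instead of starting a second one.
    reuseExistingServer: true,
  },
});
```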
|
|
29
|
+
|
|
30
|
+
## Validation Selection Rules
|
|
31
|
+
|
|
32
|
+
- Docs-only or comment-only change: L0 (`npm run typecheck`) is usually enough; no
|
|
33
|
+
code behavior to retest.
|
|
34
|
+
- `src/shared/**` change: always L0 + L1, because shared types/validators are
|
|
35
|
+
cross-cutting to both backend and frontend.
|
|
36
|
+
- `src/backend/**` change: L0 + the affected `tests/unit/backend/**` files; run
|
|
37
|
+
full `npm test` before handoff. Add L2 when touching runtime, routes, or
|
|
38
|
+
cross-service wiring.
|
|
39
|
+
- `src/frontend/**` change: L0 + the affected `tests/unit/frontend/**` files; add
|
|
40
|
+
L3 (`npm run e2e`) when changing a core user journey (connect repo, create task,
|
|
41
|
+
start/resume a role session, send a message, translation panel).
|
|
42
|
+
- `src/backend/templates/harness/**` change: L0 + `npm test` (harness template
|
|
43
|
+
sync and harness service/route tests guard these), because output ships into
|
|
44
|
+
downstream repos.
|
|
45
|
+
- `.ai/tools/**` or `scripts/harness-tools/**` change: run
|
|
46
|
+
`tests/unit/backend/harness-tools.test.ts` and `vcm-bash-guard.test.ts`.
|
|
47
|
+
- Pre-publish / release: L4 (`npm run build` + `npm run verify:package`).
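
The path-based part of these rules can be sketched as a function from changed paths to validation levels. This is a simplified illustration, not a VCM tool: `levelsForChange` is a hypothetical name, the `api/` and `runtime/` prefixes stand in for "runtime, routes", and the judgment calls (cross-service wiring, core user journeys, release) are left to the reviewer.

```typescript
type Level = "L0" | "L1" | "L2" | "L3" | "L4";

// Hypothetical helper: derive the minimum levels implied by path prefixes alone.
function levelsForChange(changedPaths: string[]): Level[] {
  const levels = new Set<Level>(["L0"]); // every rule starts from L0
  for (const path of changedPaths) {
    if (path.startsWith("src/shared/")) {
      levels.add("L1"); // shared types/validators cut across both layers
    } else if (path.startsWith("src/backend/")) {
      levels.add("L1");
      // Assumption: routes and runtime live under these two directories.
      if (path.startsWith("src/backend/api/") || path.startsWith("src/backend/runtime/")) {
        levels.add("L2");
      }
    } else if (path.startsWith("src/frontend/")) {
      levels.add("L1"); // L3 depends on whether a core journey changed
    }
  }
  return Array.from(levels).sort();
}
```

For example, a change limited to `docs/TESTING.md` yields only `L0`, while touching `src/backend/api/task-routes.ts` yields `L0`, `L1`, and `L2`.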
|
|
48
|
+
|
|
49
|
+
## Long-Running Validation
|
|
50
|
+
|
|
51
|
+
Use the `vcm-long-running-validation` skill (`.ai/tools/run-long-check` +
|
|
52
|
+
`.ai/tools/watch-job`) for any command that may exceed ~2 minutes (notably
|
|
53
|
+
`npm run e2e` and full builds). Never run validation as a detached/background
|
|
54
|
+
process; the job guard denies it. Honor the 60-minute per-job ceiling.
|
|
55
|
+
|
|
56
|
+
## Test Layout
|
|
42
57
|
|
|
43
58
|
```
|
|
44
|
-
|
|
59
|
+
tests/
|
|
60
|
+
unit/
|
|
61
|
+
backend/ # services, routes, runtime, adapters, gateway, harness, tools
|
|
62
|
+
frontend/ # api-client, stores, components (message timeline, harness panel, translation panel)
|
|
63
|
+
shared/ # pure validators (artifact-check, language-detect, slug-check)
|
|
64
|
+
integration/ # reserved by vitest config: api/**, runtime/** (not yet present)
|
|
65
|
+
e2e/ # reserved by playwright config (not yet present)
|
|
45
66
|
```
|
|
46
67
|
|
|
68
|
+
- Place unit tests next to their layer under `tests/unit/<layer>/` named
|
|
69
|
+
`<subject>.test.ts`.
|
|
70
|
+
- Place integration tests under `tests/integration/api/**` or
|
|
71
|
+
`tests/integration/runtime/**` so the existing Vitest `include` picks them up.
|
|
72
|
+
- Place Playwright specs under `tests/e2e/`.
|
|
73
|
+
|
|
74
|
+
## Integration / E2E Case List
|
|
75
|
+
|
|
76
|
+
These are reserved by configuration but not yet implemented. They are the
|
|
77
|
+
recommended first cases when integration/E2E coverage is added.
|
|
78
|
+
|
|
79
|
+
### Integration (reserved: `tests/integration/api/**`, `tests/integration/runtime/**`)
|
|
80
|
+
|
|
81
|
+
| ID | Scenario | Entry point | Proves | Key assertions | When to run | Limitation |
|
|
82
|
+
| --- | --- | --- | --- | --- | --- | --- |
|
|
83
|
+
| INT-API-001 | Project + task lifecycle over HTTP | Fastify app via `project-routes` / `task-routes` | Routes + services persist task state correctly | Create project, create task, read back task, status transitions | L2, on backend api/service change | Not yet implemented |
|
|
84
|
+
| INT-API-002 | Message bus round trip | `message-routes` / `message-service` | Route-file dispatch and history persistence | Posted message is persisted and retrievable in order | L2, on messaging change | Not yet implemented |
|
|
85
|
+
| INT-RT-001 | Session start/resume lifecycle | `runtime-coordinator-service` + `session-registry` | PTY session can start, persist id, and resume | Session id persisted; resume reuses id; stop cleans registry | L2, on runtime change | Not yet implemented; needs `claude`/pty test doubles |
|
|
86
|
+
|
|
87
|
+
### E2E (reserved: `tests/e2e/`)
|
|
88
|
+
|
|
89
|
+
| ID | Scenario | Entry point | Proves | Key assertions | When to run | Limitation |
|
|
90
|
+
| --- | --- | --- | --- | --- | --- | --- |
|
|
91
|
+
| E2E-001 | Connect repository and create a task | GUI at `http://127.0.0.1:5173` | Core onboarding journey works end to end | Repo connects, branch/status render, task appears in list | L3, before release / on shell or routing change | Not yet implemented; requires a real `claude` binary for live sessions |
|
|
92
|
+
| E2E-002 | Start a role session and observe terminal output | Task workspace role tabs | Embedded terminal streams PTY output over `/ws` | Session starts, xterm receives output, status badge updates | L3, on runtime/terminal change | Not yet implemented; environment-dependent |
|
|
93
|
+
| E2E-003 | Translation panel renders translated transcript | Translation panel | Translator session reads transcript JSONL and renders | Panel shows translated entries without mutating handoffs | L3, on translation change | Not yet implemented |
|
|
94
|
+
| E2E-004 | Auto-orchestration journey (one-click → auto-follow → flow-pause) | GUI task workspace, auto mode | The relocated backend-owned orchestration drives the GUI end to end | One-click starts the roster via `POST /api/tasks/:slug/one-click-start`; the role tab follows `roundState.activeRole`; a stopped round with no next turn raises the `roundState.flowPause` notice | L3, on one-click/round/role-follow change | Not yet implemented; needs a Playwright harness + live `claude`/pty. Until then the three contracts are covered at integration level: task-routes inject + gateway inbound (P1), active-role-follow + app wiring (P2), round-service flowPause matrix + flow-pause-alert (P3) |
|
|
95
|
+
|
|
96
|
+
## Generated-Context Freshness Checks
|
|
97
|
+
|
|
98
|
+
- Regenerate `.ai/generated/module-index.json` with `.ai/tools/generate-module-index`
|
|
99
|
+
after adding/removing/moving modules, source files, or test files.
|
|
100
|
+
- Regenerate `.ai/generated/public-surface.json` with
|
|
101
|
+
`.ai/tools/generate-public-surface` (after module-index exists) after changing
|
|
102
|
+
exported APIs, HTTP routes, or shared types.
|
|
103
|
+
- Treat stale generated indexes as a validation failure during review: if a
|
|
104
|
+
source/route/export change is not reflected in the indexes, regenerate before
|
|
105
|
+
acceptance.
|
|
106
|
+
|
|
107
|
+
## Final-Validation Cleanup
|
|
108
|
+
|
|
109
|
+
- Remove temporary scripts, scratch files, and any test-only fixtures created
|
|
110
|
+
during investigation before final acceptance.
|
|
111
|
+
- Do not leave `.only`/`.skip` in committed Vitest or Playwright specs.
|
|
112
|
+
- Ensure no detached/background validation jobs remain running.
|
|
113
|
+
- Confirm generated indexes are regenerated and committed when source/surface
|
|
114
|
+
changed.
|
|
115
|
+
- Before publish, `npm run build` and `npm run verify:package` must pass.
|
|
116
|
+
|
|
117
|
+
## Known Testing Gaps
|
|
118
|
+
|
|
119
|
+
- No integration tests exist yet; `tests/integration/**` is configured but empty.
|
|
120
|
+
- No E2E tests exist yet; `tests/e2e/**` is configured (Playwright) but empty.
|
|
121
|
+
- E2E and live runtime tests depend on a real `claude` binary and `node-pty`,
|
|
122
|
+
which are environment-sensitive and not currently stubbed for CI.
|
|
123
|
+
- There is no lint command in `package.json`; L0 is currently typecheck-only.
|
|
124
|
+
- Coverage thresholds are not enforced by configuration.
|
|
125
|
+
- `tests/unit/backend/harness-templates-sync.test.ts` shells out to
|
|
126
|
+
`scripts/install-vcm-harness.mjs`, which needs the compiled CLI (`dist/main.js`).
|
|
127
|
+
On a clean checkout with no `dist/`, those 3 cases fail with
|
|
128
|
+
"compiled CLI not found. Run npm run build first." Run `npm run build` before
|
|
129
|
+
`npm test` (or treat these specific failures as build-state, not regressions)
|
|
130
|
+
when validating from a clean tree.
|
package/docs/known-issues.md
CHANGED
|
@@ -1 +1,156 @@
|
|
|
1
1
|
# Known Issues
|
|
2
|
+
|
|
3
|
+
Durable open issues and accepted limitations for VibeCodingMaster (VCM). This is
|
|
4
|
+
a current open-issue snapshot, not a task log. Each entry is architect-owned and
|
|
5
|
+
should be removed or rewritten once the underlying gap is resolved.
|
|
6
|
+
|
|
7
|
+
Issues are grouped by category. Severity reflects architectural/correctness/
|
|
8
|
+
security risk, not delivery priority.
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Security & Exposure
|
|
13
|
+
|
|
14
|
+
### KI-001 — Unauthenticated HTTP API and `/ws` terminal surface
|
|
15
|
+
|
|
16
|
+
- **Status**: Open (accepted limitation by design; needs explicit user/PM decision before any non-loopback use).
|
|
17
|
+
- **Category**: Product / security.
|
|
18
|
+
- **Affected modules / surfaces**: `src/backend/server.ts` (no auth hook, no CORS/origin policy), all `src/backend/api/*` routes, `src/backend/ws/terminal-ws.ts`, `src/main.ts` (`--host=` flag).
|
|
19
|
+
- **Current gap**: The backend registers every `/api/*` route and the `/ws/terminal/:id` WebSocket with no authentication, authorization, or origin/CSRF check. These surfaces can spawn processes (`runtime.createSession`), run `git`, read/write the filesystem and `vcmDataDir`, manage gateway tokens, and write raw bytes directly into role PTYs (`runtime.write` via the terminal WebSocket). The default bind is `127.0.0.1`, but `--host=` lets a user bind to `0.0.0.0`/LAN, and the WS upgrade path performs no `Origin` validation (cross-site WebSocket hijacking is possible if a browser session is open).
|
|
20
|
+
- **Impact**: Binding to any non-loopback interface exposes a powerful, fully unauthenticated remote-code-execution-equivalent surface to the local network. Even on loopback, the lack of `Origin` checks means a malicious web page in the user's browser could drive the API/terminal.
|
|
21
|
+
- **Mitigation / workaround**: Keep the default `127.0.0.1` bind; do not pass `--host=` with a routable address; do not run VCM on shared/untrusted machines.
|
|
22
|
+
- **Resolution condition**: Either (a) document and enforce loopback-only as a hard product constraint, or (b) add an auth/token + `Origin` allowlist before allowing non-loopback binds. Requires a product decision (route through the full code-change flow if a fix is chosen).
|
|
23
|
+
- **Related**: KI-002, KI-007.
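
Option (b) in the resolution condition mentions an `Origin` allowlist. A minimal sketch of such a check follows; it is not VCM code, `isAllowedOrigin` is a hypothetical name, and rejecting requests with no `Origin` header is a design choice of this sketch that would also block non-browser clients.

```typescript
// Hypothetical check for the Origin header of an HTTP request or WS upgrade.
// `allowedHosts` holds host[:port] values, for example "127.0.0.1:5173".
function isAllowedOrigin(
  origin: string | undefined,
  allowedHosts: readonly string[],
): boolean {
  if (!origin) return false; // sketch choice: treat a missing Origin as untrusted
  let url: URL;
  try {
    url = new URL(origin);
  } catch {
    return false; // malformed or opaque origins such as "null"
  }
  return allowedHosts.includes(url.host);
}
```

A server could call this in a pre-handler for `/api/*` and before accepting the `/ws/terminal/:id` upgrade. An allowlist alone does not authenticate the caller, so it complements a token rather than replacing one.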
|
|
24
|
+
|
|
25
|
+
### KI-002 — Gateway bot token and app secret stored in plaintext at rest
|
|
26
|
+
|
|
27
|
+
- **Status**: Open (accepted limitation).
|
|
28
|
+
- **Category**: Product / security.
|
|
29
|
+
- **Affected modules / surfaces**: `src/backend/gateway/gateway-settings-service.ts` (`writeJsonAtomic(settingsPath, cachedSettings)`), gateway channel credentials (`binding.token`, `binding.appSecret`).
|
|
30
|
+
- **Current gap**: Gateway channel credentials (Weixin iLink bot token, Lark app secret) are persisted unencrypted as JSON under `vcmDataDir`. Status responses correctly expose only `tokenConfigured`/`appSecretConfigured` booleans, so the leak is at-rest only, not over the status API.
|
|
31
|
+
- **Impact**: Anyone with read access to the user's `vcmDataDir` (backups, sync tools, other local users) can recover live chat-platform bot credentials.
|
|
32
|
+
- **Mitigation / workaround**: Protect `vcmDataDir` filesystem permissions; rotate tokens if the data dir is exposed.
|
|
33
|
+
- **Resolution condition**: Encrypt secrets at rest or delegate to an OS keychain; or formally accept and document the plaintext-at-rest model.
|
|
34
|
+
- **Related**: KI-001.
|
|
35
|
+
|
|
36
|
+
---
|
|
37
|
+
|
|
38
|
+
## Correctness & Robustness
|
|
39
|
+
|
|
40
|
+
### KI-003 — Spawn failures are indistinguishable from real `git` exit code 1
|
|
41
|
+
|
|
42
|
+
- **Status**: Open.
|
|
43
|
+
- **Category**: Product / correctness.
|
|
44
|
+
- **Affected modules / surfaces**: `src/backend/adapters/command-runner.ts` (catch branch returns `{ exitCode: 1, stderr: error.message }`), `src/backend/adapters/git-adapter.ts` (`isIgnored`, `branchExists` treat `exitCode === 1` as a definitive "false").
|
|
45
|
+
- **Current gap**: `command-runner.run` collapses every `execa` failure — including a spawn error such as `git` not being installed/launchable (ENOENT) — into `exitCode: 1`. Several git-adapter methods (`isIgnored`, `branchExists`) interpret `exitCode === 1` as a meaningful negative result ("not ignored" / "branch does not exist"). A missing or unspawnable `git` therefore returns a confident wrong answer instead of surfacing the real failure.
|
|
46
|
+
- **Impact**: Downstream logic (worktree creation, ignore checks, branch existence gating) can silently make wrong decisions when `git` is absent or the spawn fails, masking the root cause and producing confusing secondary errors.
|
|
47
|
+
- **Mitigation / workaround**: Ensure `git` is installed and on `PATH` before use.
|
|
48
|
+
- **Resolution condition**: Distinguish spawn/launch errors from non-zero process exits in `command-runner` (e.g., a sentinel exit code or a typed `spawnFailed` flag) and have git-adapter treat spawn failure as an error rather than a `false` result. Requires a cross-file contract change → route through the full code-change flow.
|
|
49
|
+
- **Related**: none.
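
The typed `spawnFailed` flag suggested in the resolution condition could look like the sketch below. It uses Node's built-in `spawnSync` rather than `execa`, so it illustrates the contract, not the actual `command-runner`; `run` and `interpretAsBoolean` are hypothetical names.

```typescript
import { spawnSync } from "node:child_process";

interface RunResult {
  spawnFailed: boolean; // true when the process never produced an exit status
  exitCode: number | null;
  stderr: string;
}

// Run a command and keep "could not start" separate from "exited non-zero".
function run(command: string, args: string[]): RunResult {
  const result = spawnSync(command, args, { encoding: "utf8" });
  if (result.error) {
    // spawnSync sets `error` when the child could not be spawned (e.g. ENOENT).
    // It is also set on timeout or buffer overflow, which this sketch does not separate.
    return { spawnFailed: true, exitCode: null, stderr: result.error.message };
  }
  return { spawnFailed: false, exitCode: result.status, stderr: result.stderr };
}

// Caller-side rule for yes/no probes such as branch-exists or is-ignored checks.
function interpretAsBoolean(result: RunResult): boolean {
  if (result.spawnFailed) {
    throw new Error(`command could not be started: ${result.stderr}`);
  }
  if (result.exitCode === 0) return true;
  if (result.exitCode === 1) return false; // a real negative answer from the tool
  throw new Error(`unexpected exit code ${result.exitCode}: ${result.stderr}`);
}
```

With this split, a missing `git` surfaces as an error at the call site instead of a confident `false`.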
|
|
50
|
+
|
|
51
|
+
### KI-010 — Translation queue stuck-head recovery is enqueue-triggered, not continuous
|
|
52
|
+
|
|
53
|
+
- **Status**: Open (accepted limitation; the primary issue #13 case is resolved).
|
|
54
|
+
- **Category**: Product / robustness (operability).
|
|
55
|
+
- **Affected modules / surfaces**: `src/backend/services/translation-worker-service.ts` (`dispatchNext` / `reconcileStuckActiveItem` / `STALE_CONVERSATION_ITEM_MS`), `src/backend/services/translation-service.ts` (`waitForConversationResult` poll loop).
|
|
56
|
+
- **Current gap**: Recovery of a stuck active queue item (whose Translator `Stop`/`StopFailure` hook was lost) runs only when `dispatchNext` is invoked — i.e. when a new item is enqueued or a hook arrives. The request poll loop (`waitForConversationResult` -> `getState`) does not call `dispatchNext`. The primary case (the result already written to disk, hook lost) self-heals immediately on the next enqueue: conversation output lives in a single shared, self-describing `runtime/conversations/result.json`, and the recovery association key is the in-file `batchId` validated all-or-nothing (`conversationResultAvailable` requires the file to exist, parse, have `batchId` equal to the active item's `batchId`, and contain every expected `batchIndex`). The secondary case (Translator session gone with no result written) is only released after the item passes the 90s `STALE_CONVERSATION_ITEM_MS` window *and* a subsequent enqueue occurs; if the stuck head is still younger than 90s when the next translation is requested, that request can still time out once.
|
|
57
|
+
- **Impact**: Low. A narrow window can still produce a single `translation timed out` (HTTP 502) for the "session gone, no result, head <90s old, no further enqueue" case; it self-heals on the next translation attempt after the stale window. No permanent queue block remains, and a backend restart with a pre-existing stuck item recovers immediately (its `updatedAt` is already stale, or the result is on disk).
|
|
58
|
+
- **Mitigation / workaround**: Retry the translation once; the retry's enqueue triggers reconciliation.
|
|
59
|
+
- **Resolution condition**: Add a periodic / poll-driven reconcile (e.g. reconcile on `getState` or a timer) so stuck heads are released without depending on a new enqueue. Requires a code change → route through the full code-change flow if pursued.
|
|
60
|
+
- **Related**: KI-011.
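
The all-or-nothing `batchId` validation described in the current gap can be sketched as a pure function over the file contents. This is an illustration only: the `items` field name and the `resultMatchesBatch` helper are assumptions, and the real `conversationResultAvailable` also checks that the file exists.

```typescript
// Assumed shape of result.json; only `batchId` and `batchIndex` come from the text above.
interface ConversationResult {
  batchId: string;
  items: { batchIndex: number }[];
}

// Accept the result only if it parses, carries the active batchId,
// and contains every expected batchIndex. Any miss rejects the whole file.
function resultMatchesBatch(
  raw: string,
  activeBatchId: string,
  expectedIndexes: number[],
): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return false; // partially written or corrupt file
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  const candidate = parsed as Partial<ConversationResult>;
  if (candidate.batchId !== activeBatchId) return false;
  if (!Array.isArray(candidate.items)) return false;
  const seen = new Set(candidate.items.map((item) => item?.batchIndex));
  return expectedIndexes.every((index) => seen.has(index));
}
```

Rejecting a partial match is what keeps the worst case at a dropped result rather than a result attached to the wrong batch.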
|
|
61
|
+
|
|
62
|
+
### KI-011 — Conversation result cleanup deletes the shared dir without a batchId guard
|
|
63
|
+
|
|
64
|
+
- **Status**: Open (accepted limitation; harm effectively unreachable today).
|
|
65
|
+
- **Category**: Product / robustness (operability).
|
|
66
|
+
- **Affected modules / surfaces**: `src/backend/services/translation-worker-service.ts` (`validateConversationResult` cleanup of the shared `runtime/conversations/` directory holding `result.json`).
|
|
67
|
+
- **Current gap**: Because conversation translation now uses one shared `result.json` (KI-010), cleanup after a consumed result removes the shared `conversations/` directory (the `batchResultPath` dirname) rather than a per-batch directory, and it is not guarded by a `batchId` match against the file actually on disk. In principle a delete could race a newly written `result.json` for a later batch.
|
|
68
|
+
- **Impact**: Negligible in practice. Cleanup fires on the ~500ms consumer poll, far ahead of when a subsequent batch's Translator (LLM latency ≫ 500ms) could write a new `result.json`; and the all-or-nothing in-file `batchId` validation means the worst case is a recoverable dropped result, never a mis-assignment — within the design's accepted "drop over mis-assign" tolerance.
|
|
69
|
+
- **Mitigation / workaround**: None needed; a dropped conversation result self-recovers via re-translate / stale-release.
|
|
70
|
+
- **Resolution condition**: Optional hardening — scope the cleanup to delete only when the on-disk `result.json` `batchId` matches the just-consumed batch (or delete the file, not the directory). Requires a small code change → full code-change flow if pursued.
|
|
71
|
+
- **Related**: KI-010.
|
|
72
|
+
|
|
73
|
+
### KI-004 — Claude transcript project-directory hashing does not match Claude Code's encoding
|
|
74
|
+
|
|
75
|
+
- **Status**: Open.
|
|
76
|
+
- **Category**: Product / correctness (external coupling to Claude Code's on-disk format).
|
|
77
|
+
- **Affected modules / surfaces**: `src/backend/services/claude-transcript-service.ts` (`projectHash`, `projectsTranscriptDir`, `claudeTranscriptPath`, `resolveExistingClaudeTranscriptPath`), translation panel and question/todo extraction that depend on it.
|
|
78
|
+
- **Current gap**: `projectHash` replaces only `[/\s]+` with `-`. Claude Code's actual project-directory encoding also encodes `.` and other path characters, so the two can diverge and the primary path lookup can miss. Correctness currently leans on the fallback full scan `findClaudeTranscriptPathBySessionId`, which picks the most-recently-modified `<sessionId>.jsonl` across all project dirs.
|
|
79
|
+
- **Impact**: If Claude Code changes its encoding, or two project directories produce a colliding hash or share a session-id filename, transcript resolution can attach to the wrong file or fail to find one, breaking the translation feed and question/todo surfacing. The fallback masks the brittleness rather than fixing it.
|
|
80
|
+
- **Mitigation / workaround**: Rely on the `session.transcriptPath` / `claudeSessionId` resolution path; the mtime-sorted fallback usually recovers the right file.
|
|
81
|
+
- **Resolution condition**: Mirror Claude Code's real directory-encoding scheme (or resolve transcript paths via a documented Claude API/contract) instead of an approximate replace. Treat the encoding as an external-contract assumption to re-verify on Claude Code upgrades.
|
|
82
|
+
- **Related**: KI-005.
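
The divergence described in the current gap can be shown with a short sketch. The first function mirrors the replace rule quoted above, assuming it is applied globally. The second is an assumed broader rule that maps every non-alphanumeric character to `-`. It is a guess at the kind of rule needed, not a documented Claude Code contract, and both function names are hypothetical.

```typescript
// Approximation described in this entry: runs of "/" or whitespace become one "-".
export function approximateProjectHash(projectPath: string): string {
  return projectPath.replace(/[\/\s]+/g, "-");
}

// ASSUMPTION: a broader rule that maps every character outside [A-Za-z0-9] to "-".
// This is NOT a documented Claude Code contract; verify it against real
// project directories before relying on it.
export function assumedProjectDirName(projectPath: string): string {
  return projectPath.replace(/[^A-Za-z0-9]/g, "-");
}

// The two rules disagree as soon as the path contains a ".":
//   approximateProjectHash("/home/dev/my.app") -> "-home-dev-my.app"
//   assumedProjectDirName("/home/dev/my.app")  -> "-home-dev-my-app"
```

Whichever rule is adopted, a regression test against directories actually produced by the installed Claude Code version is what keeps the external-contract assumption honest.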
|
|
83
|
+
|
|
84
|
+
---
|
|
85
|
+
|
|
86
|
+
## Performance & Scalability
|
|
87
|
+
|
|
88
|
+
### KI-005 — Synchronous filesystem I/O and full-file replay on the event loop in `TranscriptTail`
|
|
89
|
+
|
|
90
|
+
- **Status**: Open.
|
|
91
|
+
- **Category**: Product / performance.
|
|
92
|
+
- **Affected modules / surfaces**: `src/backend/services/claude-transcript-service.ts` (`TranscriptTail.start/flush/replayHistory/replaySince`), translation worker/feed consumers.
|
|
93
|
+
- **Current gap**: Transcript tailing uses synchronous `statSync`/`openSync`/`readSync` on every flush and `readFileSync` for replay, all on the main event-loop thread, with a 1s poll timer per subscribed session. Replay (`replayHistory`/`replaySince`) reads the entire JSONL transcript into memory and parses every line synchronously.
|
|
94
|
+
- **Impact**: Long-lived sessions accumulate large transcripts; with multiple concurrent role sessions each tailing + replaying, synchronous reads can stall the event loop and spike memory, degrading API/WS responsiveness.
|
|
95
|
+
- **Mitigation / workaround**: Practical session/transcript sizes are usually small; impact is bounded by transcript length and session count.
|
|
96
|
+
- **Resolution condition**: Move to async/streamed reads, bound replay (cap bytes/lines read), and/or offload tailing; treat as a scalability hardening item.
|
|
97
|
+
- **Related**: KI-004, KI-006.
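
One way to bound replay, as the resolution condition suggests, is to read only the last N bytes of the transcript asynchronously and discard the first (possibly partial) line. The sketch below is illustrative only: `parseJsonlChunk` and `replayTailBounded` are hypothetical names, not existing `TranscriptTail` methods.

```typescript
import { open } from "node:fs/promises";

// Parse a JSONL chunk. When the chunk starts mid-file, its first line may be
// partial, so it is dropped (this can also drop one complete line if the read
// happened to start exactly on a line boundary).
export function parseJsonlChunk(chunk: string, startsMidFile: boolean): unknown[] {
  const lines = chunk.split("\n");
  if (startsMidFile) lines.shift();
  const entries: unknown[] = [];
  for (const line of lines) {
    if (line.trim() === "") continue;
    try {
      entries.push(JSON.parse(line));
    } catch {
      // Skip malformed lines, e.g. a line that is still being written.
    }
  }
  return entries;
}

// Read at most maxBytes from the end of the file without blocking the event loop.
export async function replayTailBounded(filePath: string, maxBytes: number): Promise<unknown[]> {
  const handle = await open(filePath, "r");
  try {
    const { size } = await handle.stat();
    const start = Math.max(0, size - maxBytes);
    const length = size - start;
    const buffer = Buffer.alloc(length);
    const { bytesRead } = await handle.read(buffer, 0, length, start);
    return parseJsonlChunk(buffer.toString("utf8", 0, bytesRead), start > 0);
  } finally {
    await handle.close();
  }
}
```

The trade-off is that history older than the byte cap is no longer replayed, so callers that need full history would have to page backwards explicitly.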
|
|
98
|
+
|
|
99
|
+
### KI-006 — O(n)-per-chunk terminal replay buffer recomputation
|
|
100
|
+
|
|
101
|
+
- **Status**: Open.
|
|
102
|
+
- **Category**: Product / performance.
|
|
103
|
+
- **Affected modules / surfaces**: `src/backend/runtime/node-pty-runtime.ts` (`appendTerminalReplay`, `tailTerminalReplay`, invoked on every `child.onData`).
|
|
104
|
+
- **Current gap**: On every PTY output chunk, `appendTerminalReplay` concatenates the existing buffer with the new data and re-tails to the 2 MB cap, and `tailTerminalReplay` recomputes `Buffer.byteLength` inside a trimming loop. This is O(buffer size) per chunk regardless of chunk size.
|
|
105
|
+
- **Impact**: Chatty/high-throughput Claude sessions trigger repeated multi-MB string copies and byte-length scans, a measurable CPU hotspot under sustained output.
|
|
106
|
+
- **Mitigation / workaround**: Output bursts are typically short; the 2 MB cap bounds memory.
|
|
107
|
+
- **Resolution condition**: Use a chunked/ring buffer or amortized trimming so per-chunk cost is proportional to the new data, not the whole buffer.
|
|
108
|
+
- **Related**: KI-005.
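
A chunked buffer with amortized trimming, as proposed in the resolution condition, might be sketched as follows. `ChunkedReplayBuffer` is a hypothetical class rather than the existing `appendTerminalReplay`/`tailTerminalReplay` code. It keeps PTY output as a list of `Buffer` chunks and drops whole chunks from the front, so appending never copies the retained data.

```typescript
// Hypothetical replacement for string concatenation plus re-tailing.
export class ChunkedReplayBuffer {
  private readonly maxBytes: number;
  private chunks: Buffer[] = [];
  private totalBytes = 0;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  // Cost is proportional to the new data plus the number of chunks dropped.
  append(data: string): void {
    const chunk = Buffer.from(data, "utf8");
    this.chunks.push(chunk);
    this.totalBytes += chunk.length;
    // Drop whole leading chunks while the remainder still fills the cap.
    while (this.chunks.length > 1) {
      const head = this.chunks[0];
      if (head === undefined || this.totalBytes - head.length < this.maxBytes) break;
      this.chunks.shift();
      this.totalBytes -= head.length;
    }
  }

  // The exact trim to maxBytes happens only when a replay is requested.
  snapshot(): string {
    const all = Buffer.concat(this.chunks, this.totalBytes);
    const tail = all.length > this.maxBytes ? all.subarray(all.length - this.maxBytes) : all;
    return tail.toString("utf8");
  }
}
```

Retained memory stays below the cap plus one chunk in the common case. Slicing on a byte boundary in `snapshot` can cut a multi-byte character at the very start of the replay, which decodes as a replacement character.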
|
|
109
|
+
|
|
110
|
+
---
|
|
111
|
+
|
|
112
|
+
## Maintainability
|
|
113
|
+
|
|
114
|
+
### KI-007 — Non-matching `/ws` upgrade requests leak the socket
|
|
115
|
+
|
|
116
|
+
- **Status**: Open.
|
|
117
|
+
- **Category**: Product / robustness.
|
|
118
|
+
- **Affected modules / surfaces**: `src/backend/ws/terminal-ws.ts` (`app.server.on("upgrade", ...)`).
|
|
119
|
+
- **Current gap**: When an upgrade request's path does not match `/ws/terminal/:id`, the handler `return`s without calling `socket.destroy()` (or writing a `400`/`426` response). The half-upgraded socket is left hanging until a timeout. There is also no `Origin` check at the upgrade boundary (see KI-001).
|
|
120
|
+
- **Impact**: Low. Stray or unrelated `/ws` upgrade attempts hold a connection open until a timeout instead of being cleanly rejected, which adds minor resource pressure and gives the client no explicit rejection signal.
|
|
121
|
+
- **Mitigation / workaround**: Only the intended `/ws/terminal/:id` path is used by the shipped frontend.
|
|
122
|
+
- **Resolution condition**: Destroy (or explicitly reject) the socket on non-matching upgrade paths and add an `Origin` allowlist.
|
|
123
|
+
- **Related**: KI-001.
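
The resolution condition could be approached along these lines. This is a sketch under stated assumptions: the helper names, the status codes, and the policy of rejecting requests with no `Origin` header are illustrative choices, not the behavior of `terminal-ws.ts`.

```typescript
import type { IncomingMessage } from "node:http";
import type { Duplex } from "node:stream";

// Path shape taken from this entry: /ws/terminal/:id
const TERMINAL_PATH = /^\/ws\/terminal\/[^/]+$/;

export type UpgradeDecision = "accept" | "reject-path" | "reject-origin";

// Pure decision function so the policy can be unit-tested without sockets.
export function classifyUpgrade(
  url: string | undefined,
  origin: string | undefined,
  allowedOrigins: readonly string[],
): UpgradeDecision {
  const pathname = (url ?? "").split("?")[0] ?? "";
  if (!TERMINAL_PATH.test(pathname)) return "reject-path";
  // ASSUMPTION: requests without an Origin header are rejected as well.
  if (origin === undefined || !allowedOrigins.includes(origin)) return "reject-origin";
  return "accept";
}

// Send a minimal HTTP response, then release the socket instead of leaving it hanging.
export function rejectUpgrade(socket: Duplex, statusLine: string): void {
  socket.end(`HTTP/1.1 ${statusLine}\r\nConnection: close\r\n\r\n`, () => socket.destroy());
}

// Hypothetical wiring for a server "upgrade" listener.
export function handleUpgrade(
  req: IncomingMessage,
  socket: Duplex,
  allowedOrigins: readonly string[],
  accept: () => void,
): void {
  const decision = classifyUpgrade(req.url, req.headers.origin, allowedOrigins);
  if (decision === "accept") {
    accept();
    return;
  }
  rejectUpgrade(socket, decision === "reject-path" ? "400 Bad Request" : "403 Forbidden");
}
```

Requiring an `Origin` header would also block non-browser clients that omit it, so that part of the policy needs a deliberate decision.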
|
|
124
|
+
|
|
125
|
+
### KI-008 — Oversized service modules concentrate orchestration complexity
|
|
126
|
+
|
|
127
|
+
- **Status**: Open (maintainability hazard, not a defect).
|
|
128
|
+
- **Category**: Product / maintainability.
|
|
129
|
+
- **Affected modules / surfaces**: `src/backend/services/harness-service.ts` (~2160 lines), `translation-worker-service.ts` (~2155), `translation-service.ts` (~1721), `session-service.ts` (~1682), `gate-review-service.ts` (~980), `claude-hook-service.ts` (~769).
|
|
130
|
+
- **Current gap**: Several service files greatly exceed comfortable single-file cohesion and bundle orchestration, retry/error handling, and side-effect coordination together. This makes the intended `api -> services -> (runtime | adapters | gateway | templates)` boundary harder to reason about and raises regression risk on edits.
|
|
131
|
+
- **Impact**: Higher change cost and review/regression risk in the highest-traffic backend logic; harder to localize behavior and test seams.
|
|
132
|
+
- **Mitigation / workaround**: Existing unit tests cover many of these services; keep edits narrowly scoped.
|
|
133
|
+
- **Resolution condition**: Incrementally extract cohesive sub-modules (with explicit cross-file contracts captured in module `ARCHITECTURE.md`) when these areas are next changed. No standalone refactor mandated.
|
|
134
|
+
- **Related**: none.
|
|
135
|
+
|
|
136
|
+
### KI-009 — Error responses surface raw subprocess stderr and runtime diagnostics to clients
|
|
137
|
+
|
|
138
|
+
- **Status**: Open (low risk on loopback; compounds with KI-001).
|
|
139
|
+
- **Category**: Product / information exposure.
|
|
140
|
+
- **Affected modules / surfaces**: `src/backend/server.ts` global error handler (returns `hint` and `runtime` diagnostics), `src/backend/adapters/git-adapter.ts` (sets `hint: result.stderr`).
|
|
141
|
+
- **Current gap**: API error payloads include `hint` (often raw `git` stderr) and `diagnosticsService.getErrorRuntimeInfo()`. On loopback this is acceptable developer feedback, but it leaks local paths/environment detail to any caller — which matters if combined with a non-loopback bind (KI-001).
|
|
142
|
+
- **Impact**: Low in the default configuration; an information-exposure amplifier when the API is exposed beyond loopback.
|
|
143
|
+
- **Mitigation / workaround**: Keep the loopback bind (KI-001).
|
|
144
|
+
- **Resolution condition**: Gate verbose `hint`/`runtime` detail behind a dev flag, or sanitize before returning, if non-loopback exposure is ever supported.
|
|
145
|
+
- **Related**: KI-001, KI-007.
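
Gating the verbose fields behind a flag, as the resolution condition proposes, can be as small as the sketch below. The `hint` and `runtime` field names come from this entry. The `message` field, the `ApiErrorPayload` shape, and the `verbose` flag are assumptions made for illustration.

```typescript
// ASSUMPTION: simplified payload shape; the real error payload may carry more fields.
export interface ApiErrorPayload {
  message: string;
  hint?: string; // often raw git stderr, per this entry
  runtime?: unknown; // runtime diagnostics, per this entry
}

// Return the full payload only when verbose diagnostics are explicitly enabled
// (for example in a loopback-only dev configuration); otherwise strip them.
export function toClientError(payload: ApiErrorPayload, verbose: boolean): ApiErrorPayload {
  if (verbose) return payload;
  return { message: payload.message };
}
```

Dropping the fields outright is the bluntest option. Redacting paths inside `hint` would keep more developer feedback at the cost of a sanitizer that has to be maintained.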
|
|
146
|
+
|
|
147
|
+
### KI-012 — `flowPause.role` / `flowPause.since` are emitted but unused by the GUI
|
|
148
|
+
|
|
149
|
+
- **Status**: Open (accepted minor redundancy; not a defect).
|
|
150
|
+
- **Category**: Product / maintainability (cleanup).
|
|
151
|
+
- **Affected modules / surfaces**: `src/shared/types/round.ts` (`VcmFlowPauseState`), `src/backend/services/round-service.ts` (`computeFlowPause`), `src/frontend/app.tsx` (flow-pause alert mechanics).
|
|
152
|
+
- **Current gap**: The authoritative `roundState.flowPause` carries `role` and `since`, but the GUI alert mechanics still read equivalent round-level fields — `roundState.activeRole` for the pause-notice label and `getFlowPauseDurationMs(roundState)` for sound severity. Both sources derive from the same `currentRound`, so the values are equivalent and the redundancy is harmless.
|
|
153
|
+
- **Impact**: None functionally; mild contract over-provisioning (fields provided that no consumer reads), which can confuse future maintainers ("why does `flowPause` carry `role`/`since`?").
|
|
154
|
+
- **Mitigation / workaround**: None needed.
|
|
155
|
+
- **Resolution condition**: Either point the GUI label/severity at `flowPause.role`/`flowPause.since` (consume what the signal already provides), or drop the two fields from `VcmFlowPauseState`. Small, optional.
|
|
156
|
+
- **Related**: none.
|