@kodevibe/harness 0.11.4 → 0.11.6

package/README.ko.md CHANGED
@@ -139,7 +139,7 @@ kode:harness는 세 가지 메커니즘으로 해결합니다:
139
139
  |--------|------|---------------------|
140
140
  | **`.cursorrules` / `copilot-instructions.md`** | 정적. 상태 영속성 없음, 자기 교정 없음, 세션 간 기억 없음. | 매 세션 업데이트되는 살아있는 state 파일. Direction Guard가 매 요청을 목표와 대조. |
141
141
  | **LangChain / CrewAI** | AI 앱 구축용 런타임 오케스트레이션. AI 코딩 에이전트 방향 관리용이 아님. | IDE 안에서 작동하는 마크다운 네이티브 가드레일. 런타임 없음, SDK 없음. |
142
- | **BMAD / gstack / GSD** | 1인 개발자용. 200+ 파일. 방향 관리 없음. | ~25개 파일 (~17K 토큰). Direction Guard + Decision Log. 멀티 개발자 지원. |
142
+ | **BMAD / gstack / GSD** | 강한 planning/workflow 시스템이지만 proof, state, policy gate의 정직한 실행은 여전히 모델 판단에 많이 의존. | planning 이후 실행 가드레일: Story/Wave pacing, Proof Ledger, deterministic state-check, policy/security evidence gate, 멀티 개발자 state sync. |
143
143
  | **"조심하면 되지"** | 잊을 때까지만 동작. LLM은 과거 세션에서 배우지 않음. | 자동화: `wrap-up`이 교훈 캡처, `debug`가 실패 추적, `reviewer`가 state 감사. |
144
144
 
145
145
  ---
@@ -183,6 +183,25 @@ npm run harness:dependency-scan
183
183
  npm run harness:llm-bench:smoke
184
184
  ```
185
185
 
186
+ 설치된 프로젝트에서는 이 저장소를 clone하지 않아도 같은 deterministic guard를 실행할 수 있습니다:
187
+
188
+ ```bash
189
+ npx @kodevibe/harness guard --dir .
190
+ npx @kodevibe/harness guard --all --dir .
191
+ npx @kodevibe/harness guard --wrap-up --dir .
192
+ npx @kodevibe/harness guard --state-sync --dir .
193
+ ```
194
+
195
+ GitHub Actions에서 같은 검사를 강제하려면
196
+ `templates/github-actions/kode-harness-guard.yml`을 대상 프로젝트의
197
+ `.github/workflows/kode-harness-guard.yml`로 복사하세요. 이 템플릿은
198
+ deterministic proof/state/security/policy guard가 실패하면 PR/push를 막습니다:
199
+
200
+ ```yaml
201
+ - run: npx --yes @kodevibe/harness guard --all --dir .
202
+ - run: npx --yes @kodevibe/harness guard --state-sync --dir .
203
+ ```
204
+
186
205
  릴리즈 증거로 인정하려면 예시 fixture 대신 실제 3개 이상 모델 tier의 sealed result를 `docs/llm-bench-results.json`에 기록한 뒤 실행하세요. 각 run은 `docs/llm-bench-scenarios.json`의 표준 시나리오를 사용해야 하며, `capturedAt`, 일치하는 `promptHash`, `outputHash` 또는 `transcriptHash`가 필요합니다.
187
206
 
188
207
  ```bash
@@ -201,7 +220,7 @@ npm run harness:llm-bench:real
201
220
 
202
221
  | IDE | 이럴 때 고르세요 | 디스패처 (always-on) | 스킬 | 에이전트 |
203
222
  |-----|--------------------|---------------------|------|----------|
204
- | **VS Code Copilot** | VS Code를 주로 쓰고 GitHub Copilot Chat 사용. | `.github/copilot-instructions.md` | `.github/skills/*/SKILL.md` | `.github/agents/*.agent.md` |
223
+ | **VS Code Copilot** | VS Code를 주로 쓰고 GitHub Copilot Chat 사용. | `.github/copilot-instructions.md` (+ 짧은 `AGENTS.md` anchor) | `.github/skills/*/SKILL.md` | `.github/agents/*.agent.md` |
205
224
  | **Claude Code** | 터미널/Claude Code CLI 선호. | `CLAUDE.md` (+ `.claude/rules/core.md`) | `.claude/skills/*/SKILL.md` | `.claude/agents/*.md` |
206
225
  | **Cursor** | Cursor 에디터 사용. | `.cursor/rules/core.mdc` (+ `AGENTS.md`) | `.agents/skills/*/SKILL.md` (cross-tool) | `.cursor/rules/<agent>.mdc` |
207
226
  | **Codex** | OpenAI Codex CLI 서브에이전트 사용. | `AGENTS.md` | `.agents/skills/*/SKILL.md` | `.codex/agents/*.toml` |
@@ -394,22 +413,21 @@ Bootstrap이 `docs/crew/`, `docs/PM/`, `docs/Analyst/`, `docs/ARB/`에서 crew
394
413
 
395
414
  ### 다른 프레임워크와의 비교
396
415
 
397
- | | BMAD v6.2.2 | gstack v0.15.1 | GSD v1.33.0 | **kode:harness** |
416
+ | | BMAD v6.x | gstack v0.15.1 | GSD v1.33.0 | **kode:harness** |
398
417
  |---|---|---|---|---|
399
- | 초점 | 기업 SDLC 방법론 | 1인 소프트웨어 팩토리 | 전체 수명주기 자동화 | **멀티 개발자 방향 정렬** |
400
- | 파일 | 200+ | ~40 | 수백 | **~25** |
401
- | 의존성 | Node 20+ | Bun + Node + Playwright | Node 18+ | **Zero** |
402
- | IDE 지원 | 20+ (installer) | 5 (setup --host) | 13 (runtime select) | 6 (네이티브 포맷) |
403
- | 방향 관리 | | | | (Direction Guard + pivot + Decision Log) |
404
- | Iron Laws (코드 품질 규칙) | | | | (11개 규칙이 스킬에 임베딩) |
405
- | Cold start | | | `/gsd-new-project` | (`setup` 스킬) |
406
- | 태스크당 컨텍스트 | 4-6 파일 | 1 파일 | 매번 200k 플랜 | **2-3 파일 (136줄 디스패처)** |
418
+ | 핵심 강점 | Planning과 agile workflow 생태계 | 1인 실행 루프 | 전체 수명주기 자동화 | **planning 이후 execution governance** |
419
+ | 최적 영역 | ideation, PRD, architecture, role workflow | 개인 소프트웨어 팩토리 | 넓은 자동화 표면 | **Story/Wave 구현, proof, reviewer, state, policy gate** |
420
+ | Planning 깊이 | 강함 | 중간 | 강함 | 외부 planning 산출물, 특히 kode:crew 산출물을 소비 |
421
+ | 실행 proof gate | checklist/workflow 중심 | tool-flow 중심 | runtime-flow 중심 | **Deterministic Proof Ledger + state-check + reviewer gate** |
422
+ | 정책/보안 governance | 커스터마이징 가능하나 핵심 차별점은 아님 | 제한적 | 다양함 | **Policy Evidence Ledger, secure check, dependency/state sync guard** |
423
+ | state | artifact/workflow 기반 | 개인 중심 | 자동화 기반 | **공유 docs + 개인 `.harness` state로 멀티 개발자 작업 지원** |
424
+ | 정직한 주장 | BMAD는 planning/workflow 폭에서 더 성숙 | 개인 루프에 강함 | 자동화 표면이 넓음 | **execution truth에서 강해야 함: unproven Done, weak evidence, policy overclaim 차단** |
407
425
 
408
426
  ---
409
427
 
410
428
  ## 로드맵
411
429
 
412
- kode:harness는 현재 **v0.11.4** — v0.11 proof-first 기반 위에 R16 recovery hardening(거짓 clean state-check claim, surface-specific Story Contract, reviewer dependency evidence, dirty wrap-up guard)을 추가했습니다.
430
+ kode:harness는 현재 **v0.11.6** — R17 governance hardening 기반 위에 설치 프로젝트용 CLI/CI guard enforcement를 추가했습니다. 공개 `guard` 커맨드, `guard --all`/`--state-sync` GitHub Actions 템플릿, CLI 노출 회귀 테스트가 포함됩니다.
413
431
 
414
432
  | 단계 | 버전 | 상태 | 초점 |
415
433
  |------|------|------|------|
@@ -427,7 +445,9 @@ kode:harness는 현재 **v0.11.4** — v0.11 proof-first 기반 위에 R16 recov
427
445
  | **Uninstall Safety** | v0.11.1 | ✅ 완료 | Manifest 기반 uninstall, state 기본 보존, shared owner 복원, purge cleanup |
428
446
  | **Deterministic Release Guard** | v0.11.2 | ✅ 완료 | R1-R10 guard scripts, package-boundary scan, dependency-map scan, R10 manifest-sealed bench workflow |
429
447
  | **Experiment Hardening** | v0.11.3 | ✅ 완료 | R15 Recent Changes integrity, Wave Scope boundary drift checks, enum/filter coverage honesty |
430
- | **Recovery Hardening** | v0.11.4 | ✅ 현재 | R16 false PASS claim guard, surface-specific Story Contract checks, reviewer dependency evidence, dirty wrap-up guard |
448
+ | **Recovery Hardening** | v0.11.4 | ✅ 완료 | R16 false PASS claim guard, surface-specific Story Contract checks, reviewer dependency evidence, dirty wrap-up guard |
449
+ | **Governance Hardening** | v0.11.5 | ✅ 완료 | R17 Crew Validation Tracker sync, dependency-map interface log guard, VS Code AGENTS.md instruction anchor |
450
+ | **CLI/CI Guard Enforcement** | v0.11.6 | ✅ 현재 | 설치 프로젝트용 공개 `guard` 커맨드, `guard --all`/`--state-sync` CI 템플릿, CLI 노출 회귀 테스트 |
431
451
  | **Docs Bridge** | v0.11.1 | 🧪 Experimental | Project Docs Hub Index, docs-bridge 스킬, visibility 경계를 가진 로컬 docs hub 인덱스 |
432
452
  | **Safety & Branding** | v0.9.6 | ✅ 완료 | init overwrite 백업, 배포 파일 pm 네이밍 정리, LICENSE 브랜딩 정리 |
433
453
  | **Validation** | v1.0 | 🔜 다음 | 실사용 검증, 사용자 피드백 수집 |
package/README.md CHANGED
@@ -151,7 +151,7 @@ kode:harness solves this with three mechanisms:
151
151
  |----------|-----------|------------------------|
152
152
  | **`.cursorrules` / `copilot-instructions.md`** | Static. No state persistence, no self-correction, no cross-session memory. | Living state files that update every session. Direction Guard checks every request against goals. |
153
153
  | **LangChain / CrewAI** | Runtime orchestration for building AI apps. Not for directing AI coding agents. | Markdown-native guardrails that work inside your IDE. No runtime, no SDK. |
154
- | **BMAD / gstack / GSD** | Built for solo developers. 200+ files. No direction management. | ~25 files (~17K tokens). Direction Guard + Decision Log. Multi-developer team support. |
154
+ | **BMAD / gstack / GSD** | Strong planning/workflow systems, but they still rely heavily on the model to honestly execute proof, state, and policy gates. | Execution guardrails after planning: Story/Wave pacing, Proof Ledger, deterministic state-check, policy/security evidence gates, and multi-developer state sync. |
155
155
  | **"I'll just be careful"** | Works until you forget. LLMs don't learn from past sessions. | Automated: `wrap-up` captures lessons, `debug` tracks failures, `reviewer` audits state. |
156
156
 
157
157
  ---
@@ -184,6 +184,25 @@ npx @kodevibe/harness validate # verify state files have real content
184
184
  npx @kodevibe/harness uninstall --dry-run --ide vscode # preview safe removal
185
185
  ```
186
186
 
187
+ Installed projects can run the same deterministic guard without cloning this repo:
188
+
189
+ ```bash
190
+ npx @kodevibe/harness guard --dir .
191
+ npx @kodevibe/harness guard --all --dir .
192
+ npx @kodevibe/harness guard --wrap-up --dir .
193
+ npx @kodevibe/harness guard --state-sync --dir .
194
+ ```
195
+
196
+ To enforce the same checks in GitHub Actions, copy
197
+ `templates/github-actions/kode-harness-guard.yml` to
198
+ `.github/workflows/kode-harness-guard.yml` in the target project. The template
199
+ blocks PRs/pushes when deterministic proof/state/security/policy checks fail:
200
+
201
+ ```yaml
202
+ - run: npx --yes @kodevibe/harness guard --all --dir .
203
+ - run: npx --yes @kodevibe/harness guard --state-sync --dir .
204
+ ```
205
+
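A minimal sketch of wrapping these guard invocations in a Node script, for example from a pre-push hook. Only the `guard` subcommand and the `--all`, `--wrap-up`, `--state-sync`, and `--dir` flags come from the README text above. The helper names `buildGuardArgs` and `runGuard`, the `mode` option, and the assumption that a non-zero exit status signals a blocking error are illustrative.

```javascript
const { spawnSync } = require('node:child_process');

// Flags documented in the README, keyed by a hypothetical `mode` name.
const MODE_FLAGS = {
  default: [],
  all: ['--all'],
  'wrap-up': ['--wrap-up'],
  'state-sync': ['--state-sync'],
};

// Build the argument list passed to `npx` for one guard mode.
function buildGuardArgs({ mode = 'default', dir = '.' } = {}) {
  if (!Object.prototype.hasOwnProperty.call(MODE_FLAGS, mode)) {
    throw new Error(`Unknown guard mode: ${mode}`);
  }
  return ['--yes', '@kodevibe/harness', 'guard', ...MODE_FLAGS[mode], '--dir', dir];
}

// Run one guard mode and report whether it passed.
// Assumption: exit status 0 means no blocking errors were reported.
function runGuard(options) {
  const result = spawnSync('npx', buildGuardArgs(options), { stdio: 'inherit' });
  return result.status === 0;
}
```

Calling `runGuard({ mode: 'all' })` followed by `runGuard({ mode: 'state-sync' })` would mirror the two steps of the workflow template. On Windows the `npx` shim may additionally need `shell: true` in the `spawnSync` options.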
187
206
  Source repo maintainers can also run the deterministic guard and model-tier evidence checks:
188
207
 
189
208
  ```bash
@@ -211,7 +230,7 @@ Not sure which to pick? Use the IDE you already code in — each install path is
211
230
 
212
231
  | IDE | Pick this if… | Dispatcher (always-on) | Skills | Agents |
213
232
  |-----|---------------|----------------------|--------|--------|
214
- | **VS Code Copilot** | You use VS Code daily and have GitHub Copilot Chat. | `.github/copilot-instructions.md` | `.github/skills/*/SKILL.md` | `.github/agents/*.agent.md` |
233
+ | **VS Code Copilot** | You use VS Code daily and have GitHub Copilot Chat. | `.github/copilot-instructions.md` (+ short `AGENTS.md` anchor) | `.github/skills/*/SKILL.md` | `.github/agents/*.agent.md` |
215
234
  | **Claude Code** | You prefer Claude in the terminal / Claude Code CLI. | `CLAUDE.md` (+ `.claude/rules/core.md`) | `.claude/skills/*/SKILL.md` | `.claude/agents/*.md` |
216
235
  | **Cursor** | You use Cursor as your editor. | `.cursor/rules/core.mdc` (+ `AGENTS.md`) | `.agents/skills/*/SKILL.md` (cross-tool) | `.cursor/rules/<agent>.mdc` |
217
236
  | **Codex** | You use OpenAI Codex CLI subagents. | `AGENTS.md` | `.agents/skills/*/SKILL.md` | `.codex/agents/*.toml` |
@@ -376,20 +395,19 @@ It adds a Project Docs Hub Index to `project-brief.md` with each local source, r
376
395
 
377
396
  ### How It Compares
378
397
 
379
- | | BMAD v6.2.2 | gstack v0.15.1 | GSD v1.33.0 | kode:harness |
398
+ | | BMAD v6.x | gstack v0.15.1 | GSD v1.33.0 | kode:harness |
380
399
  |---|---|---|---|---|
381
- | Focus | Enterprise SDLC methodology | 1-person software factory | Full lifecycle automation | **Multi-developer direction alignment** |
382
- | Files | 200+ | ~40 | Hundreds | ~25 |
383
- | Dependencies | Node 20+ | Bun + Node + Playwright | Node 18+ | Zero |
384
- | IDE support | 20+ (installer) | 5 (setup --host) | 13 (runtime select) | 6 (native format) |
385
- | Direction management | | | | (Direction Guard + pivot + Decision Log) |
386
- | Iron Laws (code quality rules) | | | | (11 laws embedded in skills) |
387
- | Cold start | | | `/gsd-new-project` | (`setup` skill) |
388
- | Context per task | 4-6 files | 1 file | Fresh 200k per plan | 2-3 files (136-line dispatcher) |
400
+ | Primary strength | Planning and agile workflow ecosystem | Solo execution loop | Full lifecycle automation | **Execution governance after planning** |
401
+ | Best fit | Ideation, PRD, architecture, role workflows | Personal software factory | Broad automation | **Story/Wave implementation, proof, reviewer, state, policy gates** |
402
+ | Planning depth | Strong | Medium | Strong | Consumes external planning artifacts, especially kode:crew outputs |
403
+ | Execution proof gate | Checklist/workflow driven | Tool-flow driven | Runtime-flow driven | **Deterministic Proof Ledger + state-check + reviewer gates** |
404
+ | Policy/security governance | Customizable, not the core differentiator | Limited | Varies | **Policy evidence ledger, secure checks, dependency/state sync guards** |
405
+ | Team state | Artifact/workflow based | Mostly individual | Automation based | **Shared docs + personal `.harness` state for multi-developer work** |
406
+ | Our honest claim | BMAD is more mature for planning/workflow breadth | Good for solo loops | Broad automation surface | **Must be stronger at execution truth: no unproven Done, no weak evidence, no policy overclaim** |
389
407
 
390
408
  ## Roadmap
391
409
 
392
- kode:harness is at **v0.11.4** — adds R16 recovery hardening for false clean state-check claims, surface-specific Story Contracts, reviewer dependency evidence, and dirty wrap-up truthfulness on top of the v0.11 proof-first and deterministic release guard foundation.
410
+ kode:harness is at **v0.11.6** — adds R20 installed-project CLI/CI guard enforcement, shipping the public `guard` command and GitHub Actions guard template on top of the R17 governance hardening foundation.
393
411
 
394
412
  | Phase | Version | Status | Focus |
395
413
  |---|---|---|---|
@@ -407,7 +425,9 @@ kode:harness is at **v0.11.4** — adds R16 recovery hardening for false clean s
407
425
  | **Uninstall Safety** | v0.11.1 | ✅ Complete | Manifest-based uninstall, default state preservation, shared owner restore, purge cleanup |
408
426
  | **Deterministic Release Guard** | v0.11.2 | ✅ Complete | R1-R10 guard scripts, package-boundary scan, dependency-map scan, R10 manifest-sealed bench workflow |
409
427
  | **Experiment Hardening** | v0.11.3 | ✅ Complete | R15 Recent Changes integrity, Wave Scope boundary drift checks, enum/filter coverage honesty, R15 bench scenarios |
410
- | **Recovery Hardening** | v0.11.4 | ✅ Current | R16 false PASS claim guard, surface-specific Story Contract checks, reviewer dependency evidence, dirty wrap-up guard |
428
+ | **Recovery Hardening** | v0.11.4 | ✅ Complete | R16 false PASS claim guard, surface-specific Story Contract checks, reviewer dependency evidence, dirty wrap-up guard |
429
+ | **Governance Hardening** | v0.11.5 | ✅ Complete | R17 Crew Validation Tracker sync, dependency-map interface log guard, VS Code AGENTS.md instruction anchor |
430
+ | **CLI/CI Guard Enforcement** | v0.11.6 | ✅ Current | Public `guard` command for installed projects, `guard --all`/`--state-sync` CI template, release regression for CLI exposure |
411
431
  | **Docs Bridge** | v0.11.1 | 🧪 Experimental | Project Docs Hub Index, docs-bridge skill, local docs hub index with visibility boundaries |
412
432
  | **Safety & Branding** | v0.9.6 | ✅ Done | init overwrite backups, shipped pm naming cleanup, LICENSE branding cleanup |
413
433
  | **Validation** | v1.0 | 🔜 Next | Real-world project adoption, user feedback collection |
@@ -174,7 +174,7 @@ After running state-check, also verify:
174
174
  - [ ] **docs/project-brief.md**: If a technology or architectural decision was made, is it in Decision Log?
175
175
  - [ ] **docs/agent-memory/*.md**: If an agent (reviewer/pm/lead) was used this session, was its memory updated by the wrap-up skill?
176
176
  - [ ] **R16 guard evidence**: Run/request the guard command and include its exact summary. Any guard error forbids `DONE`/`DONE_WITH_CONCERNS`:
177
- `HARNESS_GUARD_ROOT="$PWD" node /path/to/k-harness/scripts/harness-guard.js docs/project-state.md`
177
+ `harness guard --dir "$PWD" .harness/project-state.md docs/project-state.md docs/features.md docs/dependency-map.md`
178
178
 
179
179
  For each missing update: flag as `[STATE-AUDIT]` in the output and provide the exact update that should be made.
180
180
  **Severity**:
@@ -6,7 +6,7 @@ Skills and agents work together through shared state files.
6
6
 
7
7
  ## Quiet Navigator + Confidence Loop
8
8
 
9
- Common-mode users often begin with rough goals. Keep the navigator short and evidence-first:
9
+ Keep navigation short and evidence-first:
10
10
  - **Goal Card**: Goal, first usable result, non-goal, risk, required proof.
11
11
  - **Proof Ledger**: command/evidence that proves the feature works.
12
12
  - **Evidence-Gated Progress Board**: `Planned → Implementing → Proof Pending → Proven → Reviewed`.
@@ -51,24 +51,22 @@ Follow the pipeline that matches the current situation. After each step, output
51
51
  <!-- CREW_MODE_START -->
52
52
  ### 🟣 Crew-Driven Development (kode:crew artifacts provided)
53
53
 
54
- When external planning artifacts exist (requirements, analysis, design documents from kode:crew or similar):
54
+ When external planning artifacts exist:
55
55
 
56
- 1. `setup` → scan project & fill state files, **create Artifact Index + Validation Tracker** in project-brief.md (originals are never modified)
57
- 2. `pm` → plan features **from crew artifacts**: map FR→Stories (`[FR-NNN]` prefix), ARB Fail→P0 Stories (`[ARB-FAIL]` prefix), update Validation Tracker
58
- 3. `lead` → start Story (includes Validation Dashboard showing KPI/FR/ARB coverage)
56
+ 1. `setup` → scan project, fill state files, create Artifact Index + Validation Tracker in project-brief.md
57
+ 2. `pm` → plan from crew artifacts: map FR→Stories, ARB Fail→P0 Stories, update Validation Tracker
58
+ 3. `lead` → start Story with Validation Dashboard
59
59
  4. [Coding] → implement Stories in order from pm
60
60
  5. `reviewer` → code review + crew artifact compliance check → commit → push
61
61
  6. `wrap-up` → capture session lessons + update Validation Tracker + verify push
62
62
 
63
- > Crew artifacts are detected by `docs/crew/`, `docs/PM/`+`docs/Analyst/`+`docs/ARB/`, or explicit requirements/design docs.
64
- > **Reference, don't summarize**: setup writes an Artifact Index; skills read originals via indexed paths.
65
- > If `## CI Artifact Index` exists, reviewer Step 2.5 and release Step 3.5 surface the external CI guide when build/CI files change.
66
- > This pipeline produces the same state files as 🟢 — the difference is the INPUT source and the addition of Validation Tracker for traceability.
63
+ > Reference originals through the Artifact Index; do not rewrite planning artifacts.
64
+ > If `## CI Artifact Index` exists, reviewer/release surface it when build/CI files change.
67
65
  <!-- CREW_MODE_END -->
68
66
 
69
67
  ## User Request Routing
70
68
 
71
- When the user provides a feature request or development goal in their prompt:
69
+ For feature requests:
72
70
 
73
71
  1. Read `docs/project-state.md` to determine current project state
74
72
  2. Route to the appropriate pipeline:
@@ -79,7 +77,7 @@ When the user provides a feature request or development goal in their prompt:
79
77
  - Direction change → Start 🟡 Pipeline from `pivot`
80
78
  - Docs/wiki request → Run `docs-bridge`
81
79
  <!-- CREW_MODE_START -->
82
- - Crew artifacts detected (`docs/crew/` exists, `docs/PM/`+`docs/Analyst/`+`docs/ARB/` exist, or user provided design docs) → Start 🟣 Pipeline from `setup`
80
+ - Crew artifacts detected → Start 🟣 Pipeline from `setup`
83
81
  <!-- CREW_MODE_END -->
84
82
  - Any other request (info, explanation, status) → `lead` — route with context
85
83
  3. Announce which pipeline and step you are starting, then execute
@@ -88,7 +86,7 @@ When the user provides a feature request or development goal in their prompt:
88
86
 
89
87
  **Every response must end with a 🧭 Next Step block.** This is mandatory — never omit it.
90
88
 
91
- Keep the block concise. When code changed, include the next evidence:
89
+ Keep it concise:
92
90
 
93
91
  ```
94
92
  ---
@@ -102,24 +100,9 @@ Keep the block concise. When code changed, include the next evidence:
102
100
  ---
103
101
  ```
104
102
 
105
- When a skill or agent reports STATUS: DONE, use the same block and point to the next row in the Chaining Map.
106
-
107
- ### Chaining Map — what comes after what
108
-
109
- | Completed | Next | Prompt Example |
110
- |-----------|------|---------------|
111
- | `setup` | `pm` | "[project]에 [첫 기능]을 추가해줘" || `pm` (Step 0 → empty) | `setup` [internal] | "State files empty — auto-invoking setup" || `pm` | User confirmation → `lead` | "이 경로(Plan)대로 구현을 시작할까요?" → "S{N}-{M} Story를 시작해줘" |
112
- | `lead` (story started) | [Coding] | "S{N}-{M} 구현을 시작해줘" → 완료 후 **새 채팅**에서 `@reviewer` 호출 |
113
- | [Coding done] | `reviewer` | "S{N}-{M} 코드를 리뷰해줘" |
114
- | `reviewer` (pass, more stories) | Commit → `lead` | \"커밋 후 다음 Story는?\" |
115
- | `reviewer` (pass, sprint all done) | Commit → `pm` checkpoint | \"커밋 후 Sprint 완료 — pm checkpoint 실행\" |
116
- | `reviewer` (STATE-AUDIT) | `wrap-up` | "state 파일을 정리하고 세션 마무리해줘" |
117
- | `debug` | `reviewer` | "수정한 코드를 리뷰해줘" |
118
- | `pivot` | `pm` | "변경된 방향에 맞춰 재계획해줘" |
119
- | `architect` | `pm` | "승인된 설계로 기능을 계획해줘" |
120
- | `wrap-up` | 🏁 Session End | "다음 세션 시작 시 `lead` 호출" |
103
+ Chaining: `setup→pm→lead→coding→reviewer→wrap-up`. Bug fix: `debug→fix→reviewer→wrap-up`. Pivot: `pivot→pm`. Architecture: `architect→pm`. After `wrap-up`, end the session and tell the next session to call `lead`.
121
104
  <!-- CREW_MODE_START -->
122
- | Crew artifacts provided | `setup` (🟣) | "crew 산출물을 기반으로 프로젝트를 세팅해줘" |
105
+ Crew artifacts start at `setup` using the 🟣 pipeline.
123
106
  <!-- CREW_MODE_END -->
124
107
 
125
108
  ## State Files
@@ -139,28 +122,23 @@ These laws are enforced across all skills and agents. Violations should be flagg
139
122
  2. **Type Check**: Before calling a constructor or factory, read the actual source file to verify parameters.
140
123
  3. **Scope Compliance**: Do not modify files outside the current Story scope without reporting first.
141
124
  4. **Security**: Never include credentials, passwords, or API keys in code or commits.
142
- 5. **3-Failure Stop + Recalculating**: If the same approach fails 3 times:
143
- - Automatically invoke `debug` skill in **Recalculating Mode** (one attempt)
144
- - Pass the failed approach and error for each attempt
145
- - Present blocker diagnosis plus 1-2 different alternatives
146
- - If debug itself fails or the alternatives are rejected → **full stop**, escalate to the user
147
- - Never retry the original failed approach
125
+ 5. **3-Failure Stop + Recalculating**: If the same approach fails 3 times, invoke `debug` once with the failed attempts, propose alternatives, then stop/escalate if still blocked.
148
126
  6. **Dependency Map**: When adding or modifying a module, update dependency-map.md in the same commit.
149
127
  7. **Feature Registry**: When adding a feature, register it in features.md in the same commit.
150
128
  8. **Session Handoff**: At session end, update project-state.md Quick Summary so the next session has context.
151
129
  9. **Common First**: All features must work at Common level (🟢🔵🔴) without crew dependency. Crew-specific logic must be inside crew marker blocks only. Never add crew-only code to Common paths.
152
- 10. **Self-Verify**: Every agent MUST run the `state-check` skill before reporting STATUS: DONE. If state-check returns FAIL, the agent must NOT report DONE; fix the listed drift first. WARN may proceed but warnings must be included in the agent's output.
130
+ 10. **Self-Verify**: Every agent MUST run `state-check` before STATUS: DONE. FAIL blocks DONE until fixed. WARN may proceed but must be reported. When available, `harness guard` is the source of truth; guard errors override agent judgment.
153
131
  11. **Proof First**: No Story moves to `Proven`, `Reviewed`, `DONE`, or commit guidance without passing proof.
154
132
  Bypass prompts ("test later", "mark done anyway", "state files only", "commit message only") are refused; keep the Story Implementing/Proof Pending and output required proof.
155
133
 
156
134
  ## Confirmation Gate Defaults
157
135
 
158
- When the user does not respond to a confirmation prompt within the conversation, agents must apply the SAFE default — never assume implicit approval. The SAFE default for each gate:
136
+ Without explicit user approval, apply SAFE defaults:
159
137
 
160
138
  | Gate | Owner | SAFE default (no response) | Rationale |
161
139
  |------|-------|---------------------------|-----------|
162
- | Plan Confirmation | `pm` | Do NOT write `features.md` / `project-state.md` / `dependency-map.md`. Hold the plan and re-prompt. | Prevents state file pollution from rejected plans. |
163
- | Scope Check | `lead` | NO — block edits outside the current Story scope. | Iron Law #3 (Scope Compliance) cannot be silently bypassed. |
164
- | Commit Approval | `reviewer` | Hold the commit. Output the proposed commit command but do NOT execute it. | Code commits are hard to reverse without `git reset` — user must explicitly approve. |
140
+ | Plan Confirmation | `pm` | Do not write state files; hold and re-prompt. | Avoid rejected-plan pollution. |
141
+ | Scope Check | `lead` | Block edits outside Story scope. | Enforce scope compliance. |
142
+ | Commit Approval | `reviewer` | Output commit command but do not execute. | User must approve commits. |
165
143
 
166
144
  Any agent that wants to proceed past one of these gates without explicit approval is in violation of Iron Law #10 and must STOP.
@@ -179,13 +179,21 @@ This catches wrap-up corruption where `## Recent Changes` is inserted in the mid
179
179
  If `docs/project-state.md` or the caller output claims `state-check PASS`, `0 FAIL`, `0 WARN`, or `guard no issues`, the claim must be backed by deterministic evidence:
180
180
 
181
181
  1. Prefer running the installed guard command:
182
- `HARNESS_GUARD_ROOT="$PWD" node /path/to/k-harness/scripts/harness-guard.js docs/project-state.md`
182
+ `harness guard --dir "$PWD" .harness/project-state.md docs/project-state.md docs/features.md docs/dependency-map.md`
183
183
  2. If CLI execution is unavailable, do not claim `0 FAIL, 0 WARN`; say `manual state-check only`.
184
184
  3. FAIL if any markdown/state/contract/handoff/env-seal issue is visible while the file claims clean self-verify.
185
185
  4. FAIL if the guard output is summarized but not shown.
186
186
 
187
187
  This catches reports such as "state-check PASS: 0 FAIL, 0 WARN" when a Proof Ledger table is malformed or Environment Seal is missing.
188
188
 
189
+ ### Check 15: Policy Evidence Truth (R18)
190
+
191
+ If a Story/ledger claims policy compliance (`policyId`, `pageId`, `Confluence`, `Human Policy`, `LLM Policy Card`, `Machine Guard Spec`): Done/verified needs Confluence MCP fetch, versioned snapshot/hash, or fetch blocker. Local `policy-registry.json` / `/api/policy-evidence` alone is not proof. `TRAP-* → 404` alone FAILs; require scan/diff/storage/guard evidence. Policy UI proof needs screenshot/Playwright/user/manual checklist. Done policy Stories must split Human/LLM/Machine layers.
192
+
193
+ ### Check 16: Proof Contradictions (R19)
194
+
195
+ FAIL if a passing Proof/Evidence row also says `not observed`, `not verified`, `partial`, `pending`, `missing`, or similar. Split proven/unproven evidence instead.
196
+
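The rule in Check 16 can be sketched as a small filter over ledger rows. This is not the package's `checkProofContradictions` implementation. The row shape (`status`, `evidence`), the `PASSING_RE` pattern, and the function name are assumptions made for illustration, and the "or similar" wording in the check is not modeled.

```javascript
// Caveat phrases named in Check 16.
const CAVEAT_RE = /\b(?:not observed|not verified|partial|pending|missing)\b/i;

// Hypothetical definition of a "passing" row for this sketch.
const PASSING_RE = /✅|\bpass(?:ed)?\b/i;

// Return rows that claim a pass while their evidence carries a caveat.
function findProofContradictions(rows) {
  return rows.filter(
    (row) => PASSING_RE.test(row.status) && CAVEAT_RE.test(row.evidence)
  );
}

const ledger = [
  { status: 'PASS', evidence: 'npm test exit 0' },
  { status: '✅ PASS', evidence: 'API ok; UI not verified' },
  { status: 'Proof Pending', evidence: 'pending' },
];

console.log(findProofContradictions(ledger).length);
```

Only the second row is flagged: it is marked passing while its evidence says `not verified`. Following the check's guidance, such a row would be split into one proven row and one unproven row.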
189
197
  ## Output Format
190
198
 
191
199
  ```
@@ -229,6 +237,12 @@ This catches reports such as "state-check PASS: 0 FAIL, 0 WARN" when a Proof Led
229
237
  - Guard output: shown / missing
230
238
  - Clean PASS claim matches deterministic result: yes/no
231
239
 
240
+ ### Check 15: Policy Evidence Truth
241
+ - {N} policy rows checked / {M} weak trap rows / {K} missing fetch or snapshot evidence
242
+
243
+ ### Check 16: Proof Contradictions
244
+ - {N} passing proof rows checked / {M} contradictory caveats
245
+
232
246
  <!-- CREW_MODE_START -->
233
247
  ### Check 6: Validation Tracker (🟣)
234
248
  - {N} FR references checked / {M} drifted
@@ -145,13 +145,13 @@ For each issue/error that occurred in this session:
145
145
  Before saying `state-check PASS`, `0 FAIL`, `0 WARN`, `STATUS: DONE`, or `Session Learn Complete`, run and quote one guard summary:
146
146
 
147
147
  ```bash
148
- HARNESS_GUARD_ROOT="$PWD" node /path/to/k-harness/scripts/harness-guard.js docs/project-state.md
148
+ harness guard --wrap-up --dir "$PWD"
149
149
  ```
150
150
 
151
- or installed script:
151
+ or, when the package is not globally installed:
152
152
 
153
153
  ```bash
154
- npm run harness:guard:wrap-up
154
+ npx @kodevibe/harness guard --wrap-up --dir "$PWD"
155
155
  ```
156
156
 
157
157
  Rules: paste the exact guard summary. Errors block `STATUS: DONE`; warnings must be listed. Never write `0 FAIL, 0 WARN` unless guard says no issues.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@kodevibe/harness",
3
- "version": "0.11.4",
3
+ "version": "0.11.6",
4
4
  "description": "kode:harness — harness engineering for keeping every developer's AI aligned on one project direction.",
5
5
  "keywords": [
6
6
  "llm",
package/src/guard.js CHANGED
@@ -23,6 +23,10 @@
23
23
  // R15 checkRecentChangesIntegrity — wrap-up must not corrupt state sections
24
24
  // R16 checkSelfVerifyClaim — claimed PASS must match deterministic guard
25
25
  // R16 checkReviewerAuditEvidence — scope audits must cite real deps/imports
26
+ // R17 checkCrewValidationSync — Crew Validation Tracker follows done work
27
+ // R17 checkDependencyInterfaceLog — interface-affecting features update deps
28
+ // R18 checkPolicyEvidence — policy-pack compliance claims need real evidence
29
+ // R19 checkProofContradictions — passing proof rows must not carry caveats
26
30
  //
27
31
  // Severity: 'error' blocks the commit (exit 1). 'warn' is informational.
28
32
 
@@ -64,6 +68,8 @@ const SECRET_ALLOWLIST = [
    /\b(?:enum|type|interface|column|field|label)\b/i,
  ];
 
+ const SECRET_TEST_FIXTURE_MARKER = /harness-secret-test-fixture/i;
+
  function isAllowlisted(line) {
    return SECRET_ALLOWLIST.some((re) => re.test(line));
  }
@@ -99,6 +105,8 @@ function scanSecrets(content, filename = '') {
    const lines = content.split('\n');
    for (let i = 0; i < lines.length; i++) {
      const line = lines[i];
+     const previousLine = i > 0 ? lines[i - 1] : '';
+     if (SECRET_TEST_FIXTURE_MARKER.test(line) || SECRET_TEST_FIXTURE_MARKER.test(previousLine)) continue;
      if (isAllowlisted(line)) continue;
      const decodedHit = decodedBase64Secret(line);
      if (decodedHit) {
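To make the effect of the two added lines concrete, the sketch below isolates just the marker test. The regular expression and the current-line/previous-line condition are taken from the hunk above. The helper name `fixtureSkippedLines` and the sample input are hypothetical, and the rest of `scanSecrets` is not reproduced.

```javascript
const SECRET_TEST_FIXTURE_MARKER = /harness-secret-test-fixture/i;

// Return the indexes of lines the scan would skip because the marker
// appears on the line itself or on the line directly above it.
function fixtureSkippedLines(content) {
  const lines = content.split('\n');
  const skipped = [];
  for (let i = 0; i < lines.length; i++) {
    const previousLine = i > 0 ? lines[i - 1] : '';
    if (
      SECRET_TEST_FIXTURE_MARKER.test(lines[i]) ||
      SECRET_TEST_FIXTURE_MARKER.test(previousLine)
    ) {
      skipped.push(i);
    }
  }
  return skipped;
}

const sample = [
  '// harness-secret-test-fixture',
  'const token = "example-value";',
  'const other = "example-value";',
].join('\n');

console.log(fixtureSkippedLines(sample)); // [ 0, 1 ]
```

One marker comment therefore exempts itself and exactly one following line. A fixture that spans several lines would need the marker repeated.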
@@ -522,7 +530,7 @@ function checkLearnCompletion({ projectState = '', features = '', quiet = false
 
  function splitPathList(value) {
    return String(value || '')
-     .split(/[,;<br>`]+|\s{2,}/)
+     .split(/(?:,|;|<br\s*\/?>|`)+|\s{2,}/i)
      .map((v) => v.trim())
      .filter((v) => v && !/^n\/a$/i.test(v) && !/^\(?none\)?$/i.test(v));
  }
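The one-line regex change above matters because, inside a character class, `<br>` is not a tag but four separate characters, so the old pattern also split on every `b` and `r` in a path. The sketch below runs both patterns through the same pipeline. The two expressions and the `map`/`filter` steps are copied from the hunk, while `splitWith` and the sample cell are hypothetical.

```javascript
const OLD_SEPARATORS = /[,;<br>`]+|\s{2,}/;
const NEW_SEPARATORS = /(?:,|;|<br\s*\/?>|`)+|\s{2,}/i;

// Same trim/filter pipeline as splitPathList, with the separator injectable.
function splitWith(separators, value) {
  return String(value || '')
    .split(separators)
    .map((v) => v.trim())
    .filter((v) => v && !/^n\/a$/i.test(v) && !/^\(?none\)?$/i.test(v));
}

const cell = 'src/bar.js<br>lib/core.js';

// The old pattern fragments the paths at each "b" and "r".
console.log(splitWith(OLD_SEPARATORS, cell));

// The new pattern treats "<br>" as a single separator token.
console.log(splitWith(NEW_SEPARATORS, cell)); // [ 'src/bar.js', 'lib/core.js' ]
```

The new pattern also accepts `<br/>` and `<BR />`, since the tag alternative allows optional whitespace and a slash and the expression is case-insensitive.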
@@ -584,6 +592,124 @@ function checkStateSync({ projectState = '', features = '', dependencyMap = '' }
    return violations;
  }
 
+ // ─── Crew Validation Tracker Sync Gate (R17) ────────────────────────
+
+ const COMPLETE_STATUS = /✅|done|proven|pass(?:ed)?|reviewed|complete|완료|통과/i;
+ const INCOMPLETE_STATUS = /planned|pending|todo|not[_ -]?proven|not[_ -]?verified|⬜|🟡|🔄|대기|계획|미완료/i;
+ const REQUIREMENT_ID_RE = /\b(?:FR|KPI|ARB|ARB-FAIL)[-_]?\d+\b/gi;
+ const BASELINE_REQUIREMENTS = new Set(['FR-001', 'FR-002']);
+
+ function normalizedRequirementId(value) {
+   return String(value || '').toUpperCase().replace('_', '-');
+ }
+
+ function extractRequirementIds(value) {
+   return [...new Set((String(value || '').match(REQUIREMENT_ID_RE) || [])
+     .map(normalizedRequirementId))];
+ }
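As a quick illustration of what the two helpers above return, the sketch below copies `REQUIREMENT_ID_RE`, `normalizedRequirementId`, and `extractRequirementIds` from the added lines and runs them on a made-up feature description. The sample string is hypothetical.

```javascript
const REQUIREMENT_ID_RE = /\b(?:FR|KPI|ARB|ARB-FAIL)[-_]?\d+\b/gi;

// Upper-case the ID and turn the first underscore into a hyphen.
function normalizedRequirementId(value) {
  return String(value || '').toUpperCase().replace('_', '-');
}

// Collect every requirement ID in the text, normalized and de-duplicated.
function extractRequirementIds(value) {
  return [...new Set((String(value || '').match(REQUIREMENT_ID_RE) || [])
    .map(normalizedRequirementId))];
}

const text = '[FR-001] login, fr_002 signup, FR-001 again, ARB-FAIL-3';
console.log(extractRequirementIds(text)); // [ 'FR-001', 'FR-002', 'ARB-FAIL-3' ]
```

Case and underscore variants collapse to one canonical form, and duplicates are dropped while first-seen order is kept. An ID written without a separator, such as `FR001`, is matched but stays `FR001` after normalization, so as written it would not compare equal to `FR-001`.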
+
+ function trackerRowsFromBrief(projectBrief = '') {
+   const visible = stripHtmlComments(projectBrief);
+   const section = getSection(visible, 'Validation Tracker') || '';
+   return parseMarkdownTable(section);
+ }
+
+ function trackerRequirement(row) {
+   return row.Requirement || row.FR || row.KPI || row.ARB || row.Item || row.Control || '';
+ }
+
+ function trackerStory(row) {
+   return row.Story || row.Stories || row['Story ID'] || '';
+ }
+
+ function trackerStatus(row) {
+   return row.Status || row.status || '';
+ }
+
+ /**
+  * Crew mode adds project-brief.md Validation Tracker as the FR/KPI/ARB source
+  * of truth. A recurring Qwen failure was marking features/project-state done
+  * while leaving tracker rows Planned. This gate makes that drift blocking.
+  *
+  * @param {{projectState?: string, features?: string, projectBrief?: string}} files
+  * @returns {Array}
+  */
+ function checkCrewValidationSync({ projectState = '', features = '', projectBrief = '' } = {}) {
+   const violations = [];
+   const trackerRows = trackerRowsFromBrief(projectBrief);
+   if (trackerRows.length === 0) return violations;
+
+   const doneStoryIds = parseMarkdownTable(getSection(stripHtmlComments(projectState), 'Story Status') || '')
+     .filter((row) => /✅\s*done/i.test(rowStatus(row)))
+     .map((row) => storyIdFromRow(row))
+     .filter(Boolean);
+   const doneStorySet = new Set(doneStoryIds);
+
+   const featureRows = parseMarkdownTable(getSection(stripHtmlComments(features), 'Feature Registry') || stripHtmlComments(features))
+     .filter((row) => COMPLETE_STATUS.test(row.Status || row.status || ''));
+   const doneRequirements = new Set();
+   for (const feature of featureRows) {
+     const raw = Object.values(feature).filter((v) => typeof v === 'string').join(' ');
+     for (const req of extractRequirementIds(raw)) doneRequirements.add(req);
+   }
+
+   for (const row of trackerRows) {
+     const req = normalizedRequirementId(trackerRequirement(row));
658
+ const story = trackerStory(row);
659
+ const status = trackerStatus(row);
660
+ const mappedToDoneStory = [...doneStorySet].some((id) => story.includes(id));
661
+ const doneByFeature = req && doneRequirements.has(req);
662
+ if ((mappedToDoneStory || doneByFeature) && INCOMPLETE_STATUS.test(status)) {
663
+ violations.push({
664
+ check: 'validation-tracker',
665
+ severity: 'error',
666
+ line: 0,
667
+ message: `Validation Tracker row ${req || '(unknown requirement)'} maps to completed work but still has status "${status || 'blank'}" (R17). Update project-brief.md to Proven/Done or keep the Story out of Done.`,
668
+ });
669
+ }
670
+ }
671
+
672
+ return violations;
673
+ }
674
+
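The tracker gate above depends on two small pieces: requirement-ID extraction and the two status patterns. The sketch below copies those definitions from the hunk and runs them on hypothetical inputs to show how they behave at the edges; the sample strings are illustrative and do not come from the package's tests:

```javascript
// Definitions copied from the hunk above.
const COMPLETE_STATUS = /✅|done|proven|pass(?:ed)?|reviewed|complete|완료|통과/i;
const INCOMPLETE_STATUS = /planned|pending|todo|not[_ -]?proven|not[_ -]?verified|⬜|🟡|🔄|대기|계획|미완료/i;
const REQUIREMENT_ID_RE = /\b(?:FR|KPI|ARB|ARB-FAIL)[-_]?\d+\b/gi;

function normalizedRequirementId(value) {
  return String(value || '').toUpperCase().replace('_', '-');
}

function extractRequirementIds(value) {
  return [...new Set((String(value || '').match(REQUIREMENT_ID_RE) || [])
    .map(normalizedRequirementId))];
}

// Hypothetical feature-row text mixing separator styles.
const ids = extractRequirementIds('Implements fr_003 and FR-003, plus KPI-1 and FR004');
console.log(ids); // [ 'FR-003', 'KPI-1', 'FR004' ]

// A status that contains both a "complete" word and an "incomplete" phrase.
const status = 'Not proven';
console.log(COMPLETE_STATUS.test(status), INCOMPLETE_STATUS.test(status)); // true true
```

Two things seem worth noting, both hedged on how real state files are written. An ID with no separator (`FR004`) is matched but not rewritten to `FR-004`, so it would not equal a tracker row spelled with a hyphen. And because feature rows are filtered with `COMPLETE_STATUS` alone, a status such as "Not proven" or "Incomplete" appears to count as completed for that filter.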
+ // ─── Dependency Interface Log Gate (R17) ────────────────────────────
+
+ const INTERFACE_FEATURE_TERMS = /\b(FR-00[3-9]|FR-0[1-9]\d|sla|risk|filter|api|interface|contract|auth|login|board|control)\b/i;
+
+ function checkDependencyInterfaceLog({ features = '', dependencyMap = '' } = {}) {
+   const violations = [];
+   const depVisible = stripHtmlComments(dependencyMap);
+   const interfaceLog = getSection(depVisible, 'Interface Change Log');
+   if (interfaceLog === null) return violations;
+
+   const featureRows = parseMarkdownTable(getSection(stripHtmlComments(features), 'Feature Registry') || stripHtmlComments(features))
+     .filter((row) => COMPLETE_STATUS.test(row.Status || row.status || ''));
+
+   for (const row of featureRows) {
+     const raw = Object.values(row).filter((v) => typeof v === 'string').join(' ');
+     const requirements = extractRequirementIds(raw).filter((id) => !BASELINE_REQUIREMENTS.has(id));
+     const keyFiles = row['Key Files'] || row['Key files'] || row.Files || row.Scope || '';
+     const touchesSource = splitPathList(keyFiles).some((file) => /^(src|lib|app|public)\//.test(file) || /^(server|index)\.js$/.test(file));
+     const interfaceLike = requirements.length > 0 || INTERFACE_FEATURE_TERMS.test(raw);
+     if (!touchesSource || !interfaceLike) continue;
+
+     const coveredByRequirement = requirements.some((req) => interfaceLog.includes(req));
+     const featureName = row.Feature || row.Name || row.Title || '';
+     const tokens = String(featureName).toLowerCase().match(/[a-z0-9-]{4,}/g) || [];
+     const meaningfulMatches = tokens.filter((token) => interfaceLog.toLowerCase().includes(token));
+     if (!coveredByRequirement && meaningfulMatches.length < 2) {
+       violations.push({
+         check: 'dependency-interface-log',
+         severity: 'error',
+         line: 0,
+         message: `Completed feature "${featureName || '(unnamed feature)'}" changes source/API/UI surfaces but dependency-map.md Interface Change Log has no matching FR/feature entry (R17). Add an interface log row or explicitly record no interface change.`,
+       });
+     }
+   }
+
+   return violations;
+ }
+
  // ─── Scope Split Approval Gate (R14) ────────────────────────────────

  const STORY_ID_RE = /\bS\d+-\d+\b/g;
@@ -717,6 +843,146 @@ function checkSmokeEvidence(content) {
    return violations;
  }

+ // ─── Proof Contradiction Gate (R19 / execution truth) ────────────────
+
+ const PASSING_PROOF_STATUS = /✅|pass(?:ed)?|verified|proven|done|blocked|refused/i;
+ const CONTRADICTORY_PROOF_TERMS = /\b(?:not observed|not verified|not proven|not confirmed|not tested|not run|partial(?:ly)?|pending|todo|planned|unverified|missing|incomplete)\b|⬜|🔄|🟡/i;
+
+ /**
+  * A proof row cannot both pass and admit that part of the proof was not
+  * observed, is pending, or is partial. This catches "normal risk only; watch /
+  * breached not observed" rows being used to close UI Stories.
+  *
+  * @param {string} content project-state.md
+  * @returns {Array}
+  */
+ function checkProofContradictions(content) {
+   const violations = [];
+   const visible = stripHtmlComments(content);
+   const sections = ['Proof Ledger', 'Evidence Summary', 'Policy Evidence Ledger'];
+
+   for (const sectionName of sections) {
+     const rows = parseMarkdownTable(getSection(visible, sectionName) || '');
+     for (const row of rows) {
+       const raw = Object.values(row).filter((v) => typeof v === 'string').join(' ');
+       const result = [row.Result, row.Status, row['Proof Status'], row.result, row.status]
+         .filter((v) => typeof v === 'string')
+         .join(' ');
+       if (!PASSING_PROOF_STATUS.test(result)) continue;
+       if (!CONTRADICTORY_PROOF_TERMS.test(raw)) continue;
+
+       violations.push({
+         check: 'proof-contradiction',
+         severity: 'error',
+         line: 0,
+         message: `${sectionName} has a passing row with contradictory caveats (${raw.slice(0, 140)}). Move the Story back to Proof Pending or split observed and unobserved evidence (R19).`,
+       });
+     }
+   }
+
+   return violations;
+ }
+
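The gate above flags a row only when its result column reads as passing and any cell of the same row contains a caveat term. A minimal sketch of that row-level test, using the two patterns from the hunk and hypothetical pre-parsed rows (the real function gets its rows from `parseMarkdownTable`, which is not reproduced here; `isContradictory` is an illustrative name):

```javascript
// Patterns copied from the hunk above.
const PASSING_PROOF_STATUS = /✅|pass(?:ed)?|verified|proven|done|blocked|refused/i;
const CONTRADICTORY_PROOF_TERMS = /\b(?:not observed|not verified|not proven|not confirmed|not tested|not run|partial(?:ly)?|pending|todo|planned|unverified|missing|incomplete)\b|⬜|🔄|🟡/i;

// Hypothetical rows, shaped like parsed Proof Ledger table rows.
const rows = [
  { Story: 'S1-1', Evidence: 'normal risk only; watch / breached not observed', Result: '✅ pass' },
  { Story: 'S1-2', Evidence: 'all three risk states captured in screenshot', Result: '✅ pass' },
  { Story: 'S1-3', Evidence: 'watch state not observed', Result: 'Proof Pending' },
];

// Same two-step test as the loop body in checkProofContradictions.
function isContradictory(row) {
  const raw = Object.values(row).filter((v) => typeof v === 'string').join(' ');
  const result = [row.Result, row.Status, row['Proof Status'], row.result, row.status]
    .filter((v) => typeof v === 'string')
    .join(' ');
  return PASSING_PROOF_STATUS.test(result) && CONTRADICTORY_PROOF_TERMS.test(raw);
}

const flagged = rows.filter(isContradictory).map((row) => row.Story);
console.log(flagged); // [ 'S1-1' ]
```

Because the caveat pattern runs over every cell, a passing row whose description merely mentions a word such as "missing" would presumably be flagged as well; how often that happens depends on how ledger rows are worded.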
+ // ─── Policy Evidence Gate (R18 / governance execution) ───────────────
+
+ const POLICY_GOVERNANCE_TERMS = /\bpolicy(?:-|\s)?pack\b|Policy Evidence Ledger|policy-registry|policyId|pageId|Confluence|Atlassian|Machine Guard Spec|LLM Policy Card|Human Policy/i;
+ const POLICY_VERIFIED_STATUS = /✅|verified|proven|pass(?:ed)?|done/i;
+ const POLICY_PENDING_STATUS = /pending|not[_ -]?proven|not[_ -]?verified|todo|planned|⬜|🔄|🟡/i;
+ const POLICY_FETCH_EVIDENCE = /\b(?:MCP|Atlassian|Confluence)\b[\s\S]{0,80}\b(?:fetch|fetched|read|retrieved|pageVersion|version|snapshot|hash|fetchedAt)\b|\b(?:fetch|fetched|retrieved|snapshot|pageVersion|fetchedAt|hash)\b[\s\S]{0,80}\b(?:MCP|Atlassian|Confluence|pageId)\b/i;
+ const POLICY_SURFACE_ONLY_EVIDENCE = /\bAPI returns\b|\bpolicyId\s*\+\s*pageId\b|\bsurfaced?\b|\b4 policies\b|\bregistry\b/i;
+ const WEAK_TRAP_404 = /\b(?:TRAP|trap|Secret trap|Dependency trap|PII trap|Proof trap)\b[\s\S]{0,160}\b(?:404|not found|route not found)\b|\b(?:404|not found|route not found)\b[\s\S]{0,160}\b(?:TRAP|trap|blocked|refused)\b/i;
+ const STRONG_TRAP_EVIDENCE = /\b(?:forbidden file scan|secret scan|dependency diff|package\.json diff|localStorage scan|sessionStorage scan|PII scan|guard result|policy guard|explicit deny|deny reason|blocked by guard|secure check|state-check)\b/i;
+ const POLICY_UI_EVIDENCE = /\bPolicy Evidence Board\b|\bpolicy rows?\b/i;
+ const DURABLE_POLICY_UI_EVIDENCE = /\b(?:screenshot|Playwright|browser tool|user[- ]confirmed|observer:\s*user|manual checklist|captured artifact)\b/i;
+ const POLICY_LAYER_TERMS = ['Human Policy', 'LLM Policy Card', 'Machine Guard Spec'];
+
+ /**
+  * Governance claims are higher-stakes than ordinary feature claims. A policy
+  * row that says "verified" must prove more than "the local JSON was displayed";
+  * trap rows must show a real guard/scan/deny reason, not only an unimplemented
+  * route returning 404. This catches Experiment #8's execution-governance gap.
+  *
+  * @param {string} content project-state.md
+  * @returns {Array}
+  */
+ function checkPolicyEvidence(content) {
+   const violations = [];
+   const visible = stripHtmlComments(content);
+   if (!POLICY_GOVERNANCE_TERMS.test(visible)) return violations;
+
+   const storyRows = parseMarkdownTable(getSection(visible, 'Story Status') || '');
+   const donePolicyStories = storyRows.filter((row) => {
+     const raw = Object.values(row).filter((v) => typeof v === 'string').join(' ');
+     return /✅\s*done/i.test(rowStatus(row)) && /policy|governance|evidence/i.test(raw);
+   });
+   const hasDonePolicyStory = donePolicyStories.length > 0;
+
+   const ledgerRows = parseMarkdownTable(getSection(visible, 'Proof Ledger') || '');
+   for (const row of ledgerRows) {
+     const raw = Object.values(row).filter((v) => typeof v === 'string').join(' ');
+     const result = row.Result || row.result || raw;
+
+     if (/(✅|pass|blocked|refused)/i.test(result) && WEAK_TRAP_404.test(raw) && !STRONG_TRAP_EVIDENCE.test(raw)) {
+       violations.push({
+         check: 'policy-evidence',
+         severity: 'error',
+         line: 0,
+         message: 'Policy trap proof uses 404/not-found as the only blocked/refused evidence. Use an explicit guard result, forbidden-file scan, dependency diff, storage scan, or state-check evidence before claiming the trap is blocked (R18).',
+       });
+     }
+
+     if (/(✅|pass)/i.test(result) && POLICY_UI_EVIDENCE.test(raw) && !DURABLE_POLICY_UI_EVIDENCE.test(raw)) {
+       violations.push({
+         check: 'policy-evidence',
+         severity: 'error',
+         line: 0,
+         message: 'Policy Evidence Board UI proof is not durable. Add screenshot/Playwright/browser-tool artifact, user-confirmed observation, or a manual checklist before marking policy UI proof pass (R18).',
+       });
+     }
+   }
+
+   const policyLedger = getSection(visible, 'Policy Evidence Ledger') || '';
+   const policyRows = parseMarkdownTable(policyLedger);
+   for (const row of policyRows) {
+     const raw = Object.values(row).filter((v) => typeof v === 'string').join(' ');
+     const status = row.Status || row.status || '';
+     if (!POLICY_VERIFIED_STATUS.test(status) || POLICY_PENDING_STATUS.test(status)) continue;
+
+     if (!POLICY_FETCH_EVIDENCE.test(raw) && POLICY_SURFACE_ONLY_EVIDENCE.test(raw)) {
+       violations.push({
+         check: 'policy-evidence',
+         severity: 'error',
+         line: 0,
+         message: `Policy ${row['Policy ID'] || row.Policy || '(unknown)'} is marked verified using only local/API surfacing evidence. Record Confluence/MCP fetch, local snapshot with pageVersion/hash, or a fetch blocker before using verified (R18).`,
+       });
+     }
+   }
+
+   if (hasDonePolicyStory) {
+     const hasFetchEvidence = POLICY_FETCH_EVIDENCE.test(visible) || /fetch blocker|MCP blocker|could not fetch/i.test(visible);
+     if (!hasFetchEvidence) {
+       violations.push({
+         check: 'policy-evidence',
+         severity: 'error',
+         line: 0,
+         message: 'A policy/governance Story is done but has no Confluence/MCP fetch evidence, local policy snapshot evidence, or explicit fetch blocker. Local policy-registry display is not enough for policy compliance (R18).',
+       });
+     }
+
+     const missingLayer = POLICY_LAYER_TERMS.find((term) => !visible.includes(term));
+     if (missingLayer) {
+       violations.push({
+         check: 'policy-evidence',
+         severity: 'error',
+         line: 0,
+         message: `A policy/governance Story is done but does not evidence the "${missingLayer}" layer. Policy execution must distinguish Human Policy, LLM Policy Card, and Machine Guard Spec (R18).`,
+       });
+     }
+   }
+
+   return violations;
+ }
+
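The first Proof Ledger check in `checkPolicyEvidence` separates "the route returned 404" from evidence that a guard actually refused something. A small sketch of that single condition, with the two patterns copied from the hunk and two hypothetical ledger rows (`trapProofIsWeak` is an illustrative name, not a function in the package):

```javascript
// Patterns copied from the hunk above.
const WEAK_TRAP_404 = /\b(?:TRAP|trap|Secret trap|Dependency trap|PII trap|Proof trap)\b[\s\S]{0,160}\b(?:404|not found|route not found)\b|\b(?:404|not found|route not found)\b[\s\S]{0,160}\b(?:TRAP|trap|blocked|refused)\b/i;
const STRONG_TRAP_EVIDENCE = /\b(?:forbidden file scan|secret scan|dependency diff|package\.json diff|localStorage scan|sessionStorage scan|PII scan|guard result|policy guard|explicit deny|deny reason|blocked by guard|secure check|state-check)\b/i;

// Mirrors the first condition inside the Proof Ledger loop.
function trapProofIsWeak(row) {
  const raw = Object.values(row).filter((v) => typeof v === 'string').join(' ');
  const result = row.Result || row.result || raw;
  return /(✅|pass|blocked|refused)/i.test(result)
    && WEAK_TRAP_404.test(raw)
    && !STRONG_TRAP_EVIDENCE.test(raw);
}

// Hypothetical rows: same claim, different evidence.
const weak = { Check: 'Secret trap', Evidence: 'POST /api/secrets returned 404', Result: '✅ blocked' };
const strong = { Check: 'Secret trap', Evidence: 'secret scan reported 0 hits; route returned 404', Result: '✅ blocked' };

console.log(trapProofIsWeak(weak));   // true
console.log(trapProofIsWeak(strong)); // false
```

The second row still mentions the 404, but naming a scan is enough to clear this check; the gate tests for the presence of a stronger evidence phrase, not for the absence of the weak one.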
  // ─── Environment Seal Gate (R9) ──────────────────────────────────────

  /**
@@ -1131,6 +1397,7 @@ function sourceFilesForAudit(cwd) {
  function runGuard({ files, cwd = process.cwd() }) {
    const all = [];
    let scanned = 0;
+   const stateContents = {};

    for (const file of files) {
      const abs = path.isAbsolute(file) ? file : path.join(cwd, file);
@@ -1139,6 +1406,11 @@ function runGuard({ files, cwd = process.cwd() }) {
      const rel = path.relative(cwd, abs);
      scanned++;
      const beforeFile = all.length;
+     const normalizedRel = rel.replace(/\\/g, '/');
+     if (/^(docs|\.harness)\/project-state\.md$/.test(normalizedRel)) stateContents.projectState = content;
+     if (/^(docs|\.harness)\/features\.md$/.test(normalizedRel)) stateContents.features = content;
+     if (/^(docs|\.harness)\/dependency-map\.md$/.test(normalizedRel)) stateContents.dependencyMap = content;
+     if (/^(docs|\.harness)\/project-brief\.md$/.test(normalizedRel)) stateContents.projectBrief = content;

      if (isScannableForSecrets(file)) {
        all.push(...scanSecrets(content, rel));
@@ -1160,6 +1432,8 @@ function runGuard({ files, cwd = process.cwd() }) {
        all.push(...checkStoryContracts({ projectState: content }));
        all.push(...checkIntegrationDoD(content));
        all.push(...checkSmokeEvidence(content));
+       all.push(...checkProofContradictions(content));
+       all.push(...checkPolicyEvidence(content));
        all.push(...checkEnvSeal(content));
        if (STATE_LINE_LIMITS[base]) {
          all.push(...lintLineLimit(content, STATE_LINE_LIMITS[base], rel));
@@ -1179,6 +1453,14 @@ function runGuard({ files, cwd = process.cwd() }) {
      }
    }

+   if (stateContents.projectState && stateContents.features && stateContents.dependencyMap) {
+     all.push(...checkStateSync(stateContents));
+     all.push(...checkDependencyInterfaceLog(stateContents));
+   }
+   if (stateContents.projectState && stateContents.features && stateContents.projectBrief) {
+     all.push(...checkCrewValidationSync(stateContents));
+   }
+
    const errorCount = all.filter((v) => v.severity === 'error').length;
    const warnCount = all.filter((v) => v.severity === 'warn').length;
    return { ok: errorCount === 0, violations: all, errorCount, warnCount, scanned };
@@ -1191,11 +1473,15 @@ module.exports = {
    checkStoryContracts,
    checkLearnCompletion,
    checkStateSync,
+   checkCrewValidationSync,
+   checkDependencyInterfaceLog,
    checkScopeSplitApproval,
    checkRecentChangesIntegrity,
    checkSelfVerifyClaim,
    checkIntegrationDoD,
    checkSmokeEvidence,
+   checkProofContradictions,
+   checkPolicyEvidence,
    checkEnvSeal,
    checkPublicBoundary,
    checkEvaluatorArtifact,
package/src/init.js CHANGED
@@ -4,6 +4,17 @@ const fs = require('node:fs');
  const path = require('node:path');
  const readline = require('node:readline');
  const crypto = require('node:crypto');
+ const { execSync } = require('node:child_process');
+ const {
+   runGuard,
+   checkLearnCompletion,
+   checkStateSync,
+   checkCrewValidationSync,
+   checkDependencyInterfaceLog,
+   checkStoryContracts,
+   checkSmokeEvidence,
+   checkScopeSplitApproval,
+ } = require('./guard');

  const HARNESS_DIR = path.join(__dirname, '..', 'harness');
  const MANIFEST_PATH = '.harness/install-manifest.json';
@@ -179,11 +190,12 @@ const TEAM_GITATTRIBUTES_CONTENT =
    'docs/dependency-map.md merge=union\n';

  function hasFrameworkMarker(content) {
+   const legacyHarnessMarker = ['musher', 'engineering'].join('-');
    return content.includes('kode:harness')
      || content.includes('harness engineering')
      || content.includes('@kodevibe/harness')
      || content.includes('harness-engineering')
-     || content.includes('musher-engineering');
+     || content.includes(legacyHarnessMarker);
  }

  function hasIdeLayout(targetDir, ide) {
@@ -338,6 +350,24 @@ function writeAgentsAsToml(targetDir, agentsDir, overwrite, mode = 'solo', crew
    }
  }

+ function vscodeAgentsMirror() {
+   return [
+     '# kode:harness VS Code Instruction Anchor',
+     '',
+     'This project uses kode:harness. The canonical VS Code Copilot dispatcher is `.github/copilot-instructions.md` and must be followed.',
+     '',
+     'Hard stops:',
+     '',
+     '- Read `docs/project-state.md` before planning or coding.',
+     '- Every response must end with a `🧭 Next Step` block.',
+     '- Do not mark a Story Done without Proof Ledger evidence.',
+     '- Do not claim state-check/guard PASS without real command output.',
+     '- Do not claim clean worktree, commit, push, publish, or policy compliance without checking the actual command result.',
+     '- Security, governance, dependency, CI/CD, and release rules are enforced by deterministic guards; if guard output conflicts with prose, guard output wins.',
+     '',
+   ].join('\n');
+ }
+
  // ─── IDE Generators ──────────────────────────────────────────

  function generateVscode(targetDir, overwrite, mode = 'solo', crew = false) {
@@ -345,6 +375,9 @@ function generateVscode(targetDir, overwrite, mode = 'solo', crew = false) {

    // Global instructions (dispatcher only — rules are embedded in skills)
    writeFile(targetDir, '.github/copilot-instructions.md', coreRules, true);
+   // Root AGENTS.md mirror — VS Code now supports AGENTS.md as an instruction
+   // surface. Keep it short to avoid conflicting with the canonical dispatcher.
+   writeFile(targetDir, 'AGENTS.md', vscodeAgentsMirror(), true);

    // Skills (.github/skills — VS Code default search path, SKILL.md with frontmatter)
    writeSkills(targetDir, '.github/skills', true, mode, crew);
@@ -865,6 +898,135 @@ function runValidate(targetDir) {
    return warnings === 0;
  }

+ // ─── Guard command ───────────────────────────────────────────
+ function readFirstExisting(targetDir, relPaths) {
+   for (const relPath of relPaths) {
+     const fullPath = path.join(targetDir, relPath);
+     if (fs.existsSync(fullPath) && fs.statSync(fullPath).isFile()) {
+       return fs.readFileSync(fullPath, 'utf8');
+     }
+   }
+   return '';
+ }
+
+ function guardDefaultStateFiles(targetDir) {
+   const candidates = [
+     '.harness/project-state.md',
+     'docs/project-state.md',
+     '.harness/features.md',
+     'docs/features.md',
+     '.harness/dependency-map.md',
+     'docs/dependency-map.md',
+     '.harness/project-brief.md',
+     'docs/project-brief.md',
+   ];
+   return candidates.filter((relPath) => fs.existsSync(path.join(targetDir, relPath)));
+ }
+
+ function guardGitFiles(targetDir, command) {
+   try {
+     const out = execSync(command, {
+       cwd: targetDir,
+       encoding: 'utf8',
+       stdio: ['ignore', 'pipe', 'ignore'],
+     });
+     return out.split('\n').map((line) => line.trim()).filter(Boolean);
+   } catch {
+     return [];
+   }
+ }
+
+ function guardGitStagedFiles(targetDir) {
+   return guardGitFiles(targetDir, 'git diff --cached --name-only --diff-filter=ACM');
+ }
+
+ function guardGitVisibleFiles(targetDir) {
+   return guardGitFiles(targetDir, 'git ls-files --cached --others --exclude-standard')
+     .filter((relPath) => !relPath.startsWith('node_modules/'))
+     .filter((relPath) => !relPath.startsWith('.git/'));
+ }
+
+ function guardResolveFiles(args) {
+   if (args.files.length > 0) return args.files;
+   if (args.all) return guardGitVisibleFiles(args.dir);
+   if (args.staged) return guardGitStagedFiles(args.dir);
+
+   const staged = guardGitStagedFiles(args.dir);
+   return staged.length > 0 ? staged : guardDefaultStateFiles(args.dir);
+ }
+
+ function printGuardViolations(violations) {
+   for (const violation of violations) {
+     const icon = violation.severity === 'error' ? '❌' : '⚠️ ';
+     const check = violation.check ? `[${violation.check}] ` : '';
+     const file = violation.file ? `${violation.file}: ` : '';
+     console.log(` ${icon} ${check}${file}${violation.message}`);
+   }
+ }
+
+ function printGuardSummary({ errorCount, warnCount, scanned }, okMessage, failMessage) {
+   if (errorCount === 0 && warnCount === 0) {
+     console.log(` ✅ ${okMessage} (${scanned} file(s) scanned)\n`);
+     return true;
+   }
+   const status = errorCount === 0
+     ? 'guard completed with warnings'
+     : (failMessage || 'guard found blocking issues');
+   console.log(`\n Result: ${errorCount} error(s), ${warnCount} warning(s) — ${status}\n`);
+   return errorCount === 0;
+ }
+
+ function runGuardCommand(args) {
+   if (args.wrapUp) {
+     const projectState = readFirstExisting(args.dir, ['.harness/project-state.md', 'docs/project-state.md']);
+     const features = readFirstExisting(args.dir, ['.harness/features.md', 'docs/features.md']);
+     const learn = checkLearnCompletion({ projectState, features, quiet: args.quiet });
+     const guard = runGuard({ files: guardDefaultStateFiles(args.dir), cwd: args.dir });
+     const violations = [...learn, ...guard.violations];
+     const errorCount = violations.filter((v) => v.severity === 'error').length;
+     const warnCount = violations.filter((v) => v.severity === 'warn').length;
+
+     console.log('\n kode:harness Guard — Learn Completion Gate\n');
+     printGuardViolations(violations);
+     return printGuardSummary(
+       { errorCount, warnCount, scanned: guard.scanned },
+       'wrap-up outputs complete',
+       'session is not safe to close',
+     );
+   }
+
+   if (args.stateSync) {
+     const projectState = readFirstExisting(args.dir, ['.harness/project-state.md', 'docs/project-state.md']);
+     const features = readFirstExisting(args.dir, ['.harness/features.md', 'docs/features.md']);
+     const dependencyMap = readFirstExisting(args.dir, ['.harness/dependency-map.md', 'docs/dependency-map.md']);
+     const projectBrief = readFirstExisting(args.dir, ['.harness/project-brief.md', 'docs/project-brief.md']);
+     const violations = [
+       ...checkStateSync({ projectState, features, dependencyMap }),
+       ...checkDependencyInterfaceLog({ features, dependencyMap }),
+       ...checkCrewValidationSync({ projectState, features, projectBrief }),
+       ...checkStoryContracts({ projectState }),
+       ...checkSmokeEvidence(projectState),
+       ...checkScopeSplitApproval({ projectBrief }),
+     ];
+     const errorCount = violations.filter((v) => v.severity === 'error').length;
+     const warnCount = violations.filter((v) => v.severity === 'warn').length;
+
+     console.log('\n kode:harness Guard — State Sync Gate\n');
+     printGuardViolations(violations);
+     return printGuardSummary(
+       { errorCount, warnCount, scanned: [projectState, features, dependencyMap, projectBrief].filter(Boolean).length },
+       'state files are synchronized',
+       'state files are not safe to close',
+     );
+   }
+
+   const files = guardResolveFiles(args);
+   const result = runGuard({ files, cwd: args.dir });
+   console.log('\n kode:harness Guard — Deterministic Guardrail\n');
+   printGuardViolations(result.violations);
+   return printGuardSummary(result, 'no guard issues found', 'guard found blocking issues');
+ }
+
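In the default mode (neither `--wrap-up` nor `--state-sync`), `guardResolveFiles` decides which files `runGuard` sees. The precedence can be sketched as a pure function; here the git-backed lists are passed in as plain arrays instead of being read with `execSync`, and the `resolveFiles` name and the sample lists are hypothetical:

```javascript
// Hypothetical pure version of the precedence in guardResolveFiles.
function resolveFiles(args, { staged, visible, stateFiles }) {
  if (args.files.length > 0) return args.files;   // explicit paths win over every flag
  if (args.all) return visible;                   // --all: every git-visible file
  if (args.staged) return staged;                 // --staged: staged files only, even if none
  return staged.length > 0 ? staged : stateFiles; // default: staged files, else state files
}

// Sample lists standing in for the git and filesystem lookups.
const lists = {
  staged: [],
  visible: ['src/app.js', 'docs/project-state.md'],
  stateFiles: ['docs/project-state.md'],
};
const base = { files: [], all: false, staged: false };

console.log(resolveFiles({ ...base, files: ['README.md'], all: true }, lists)); // [ 'README.md' ]
console.log(resolveFiles({ ...base, all: true }, lists));    // [ 'src/app.js', 'docs/project-state.md' ]
console.log(resolveFiles({ ...base, staged: true }, lists)); // []
console.log(resolveFiles(base, lists));                      // [ 'docs/project-state.md' ]
```

One consequence that seems to follow from the hunk: `guardGitFiles` returns an empty list when the git command fails, so `--all` or `--staged` outside a git work tree would hand `runGuard` no files. Assuming `runGuard` reports no violations for an empty list, `printGuardSummary` would then print a passing result with 0 files scanned.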
  function getKnownIdeFiles(ide) {
    const skillIds = SKILLS.map(skill => skill.id);
    const agentIds = AGENTS.map(agent => agent.id);
@@ -1384,17 +1546,24 @@ function showHelp() {
  npx @kodevibe/harness init [options]
  npx @kodevibe/harness doctor [--dir <path>]
  npx @kodevibe/harness validate [--dir <path>]
+ npx @kodevibe/harness guard [options] [files...]
  npx @kodevibe/harness uninstall [options]

  Commands:
  init Install kode:harness files for your IDE
  doctor Check if kode:harness files are installed and healthy
  validate Verify state files have content (not just placeholders)
+ guard Run deterministic proof/state/security/policy guard
  uninstall Safely remove kode:harness IDE files (state preserved by default)

  Options:
  --ide <name> IDE target: vscode, claude, cursor, codex, windsurf, antigravity
  --all Uninstall all detected IDE layouts
+ With guard: scan all git-visible files
+ --staged With guard: scan only staged files
+ --wrap-up With guard: run session-end Learn/proof gate
+ --state-sync With guard: run cross-state synchronization gate
+ --quiet With guard --wrap-up: allow zero-change sessions
  --mode <mode> Project mode: solo (default) or team
  --dir <path> Target directory (default: current directory)
  --overwrite Overwrite existing files (including state files)
@@ -1415,6 +1584,8 @@ function showHelp() {
  npx @kodevibe/harness init --ide claude --dir ./my-project
  npx @kodevibe/harness doctor
  npx @kodevibe/harness validate
+ npx @kodevibe/harness guard --wrap-up --dir .
+ npx @kodevibe/harness guard --all --dir .
  npx @kodevibe/harness uninstall --ide claude --dry-run
  npx @kodevibe/harness uninstall --ide claude --yes
  `);
@@ -1438,12 +1609,18 @@ function parseArgs(argv) {
      purgeBackups: false,
      force: false,
      json: false,
+     files: [],
+     staged: false,
+     wrapUp: false,
+     stateSync: false,
+     quiet: false,
    };
    for (let i = 0; i < argv.length; i++) {
      const arg = argv[i];
      if (arg === 'init') args.command = 'init';
      else if (arg === 'doctor') args.command = 'doctor';
      else if (arg === 'validate') args.command = 'validate';
+     else if (arg === 'guard') args.command = 'guard';
      else if (arg === 'uninstall') args.command = 'uninstall';
      else if (arg === '--ide' && argv[i + 1]) { args.ide = argv[++i]; }
      else if (arg === '--mode' && argv[i + 1]) { args.mode = argv[++i]; }
@@ -1453,6 +1630,10 @@ function parseArgs(argv) {
      else if (arg === '--overwrite') args.overwrite = true;
      else if (arg === '--batch') args.batch = true;
      else if (arg === '--all') args.all = true;
+     else if (arg === '--staged') args.staged = true;
+     else if (arg === '--wrap-up') args.wrapUp = true;
+     else if (arg === '--state-sync') args.stateSync = true;
+     else if (arg === '--quiet') args.quiet = true;
      else if (arg === '--dry-run') args.dryRun = true;
      else if (arg === '--yes' || arg === '-y') args.yes = true;
      else if (arg === '--purge-state' || arg === '--include-state') args.purgeState = true;
@@ -1461,6 +1642,7 @@ function parseArgs(argv) {
      else if (arg === '--json') args.json = true;
      else if (arg === '--help' || arg === '-h') args.help = true;
      else if (arg === '--version') args.version = true;
+     else if (args.command === 'guard' && !arg.startsWith('-')) args.files.push(arg);
    }
    return args;
  }
@@ -1489,6 +1671,11 @@ async function run(argv) {
      process.exit(ok ? 0 : 1);
    }

+   if (args.command === 'guard') {
+     const ok = runGuardCommand(args);
+     process.exit(ok ? 0 : 1);
+   }
+
    if (args.command === 'uninstall') {
      await runUninstall(args);
      return;
@@ -1591,4 +1778,4 @@ async function run(argv) {
    }
  }

- module.exports = { run, detectLanguage, runDoctor, runValidate, buildUninstallPlan };
+ module.exports = { run, detectLanguage, runDoctor, runValidate, runGuardCommand, buildUninstallPlan };
@@ -0,0 +1,20 @@
+ name: kode:harness Guard
+
+ on:
+   pull_request:
+     branches: [main]
+   push:
+     branches: [main]
+
+ jobs:
+   guard:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-node@v4
+         with:
+           node-version: 20
+       - name: Deterministic harness guard
+         run: npx --yes @kodevibe/harness guard --all --dir .
+       - name: State synchronization guard
+         run: npx --yes @kodevibe/harness guard --state-sync --dir .