
@harness-engineering/cli 1.9.0 → 1.11.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (86)

package/dist/agents/skills/claude-code/enforce-architecture/SKILL.md +4 -0
package/dist/agents/skills/claude-code/harness-autopilot/SKILL.md +7 -2
package/dist/agents/skills/claude-code/harness-brainstorming/SKILL.md +10 -1
package/dist/agents/skills/claude-code/harness-execution/SKILL.md +2 -2
package/dist/agents/skills/claude-code/harness-parallel-agents/SKILL.md +105 -20
package/dist/agents/skills/claude-code/harness-pre-commit-review/SKILL.md +37 -0
package/dist/agents/skills/gemini-cli/enforce-architecture/SKILL.md +4 -0
package/dist/agents/skills/gemini-cli/harness-autopilot/SKILL.md +7 -2
package/dist/agents/skills/gemini-cli/harness-brainstorming/SKILL.md +10 -1
package/dist/agents/skills/gemini-cli/harness-execution/SKILL.md +2 -2
package/dist/agents/skills/gemini-cli/harness-parallel-agents/SKILL.md +105 -20
package/dist/agents/skills/gemini-cli/harness-pre-commit-review/SKILL.md +37 -0
package/dist/agents-md-ZFV6RR5J.js +8 -0
package/dist/architecture-EXNUMH5R.js +13 -0
package/dist/bin/harness-mcp.d.ts +1 -0
package/dist/bin/harness-mcp.js +28 -0
package/dist/bin/harness.js +42 -8
package/dist/check-phase-gate-VZFOY2PO.js +12 -0
package/dist/chunk-2NCIKJES.js +470 -0
package/dist/chunk-2YPZKGAG.js +62 -0
package/dist/{chunk-CGSHUJES.js → chunk-2YSQOUHO.js} +4484 -2688
package/dist/chunk-3WGJMBKH.js +45 -0
package/dist/{chunk-ULSRSP53.js → chunk-6N4R6FVX.js} +11 -112
package/dist/{chunk-6JIT7CEM.js → chunk-72GHBOL2.js} +1 -1
package/dist/chunk-BM3PWGXQ.js +14 -0
package/dist/chunk-C2ERUR3L.js +255 -0
package/dist/chunk-EBJQ6N4M.js +39 -0
package/dist/chunk-GNGELAXY.js +293 -0
package/dist/chunk-GSIVNYVJ.js +187 -0
package/dist/chunk-HD4IBGLA.js +80 -0
package/dist/chunk-I6JZYEGT.js +4361 -0
package/dist/chunk-IDZNPTYD.js +16 -0
package/dist/chunk-JSTQ3AWB.js +31 -0
package/dist/chunk-K6XAPGML.js +27 -0
package/dist/chunk-KET4QQZB.js +8 -0
package/dist/chunk-L2KLU56K.js +125 -0
package/dist/chunk-MHBMTPW7.js +29 -0
package/dist/chunk-NC6PXVWT.js +116 -0
package/dist/chunk-NKDM3FMH.js +52 -0
package/dist/chunk-PA2XHK75.js +248 -0
package/dist/chunk-Q6AB7W5Z.js +135 -0
package/dist/chunk-QPEH2QPG.js +347 -0
package/dist/chunk-TEFCFC4H.js +15 -0
package/dist/chunk-TI4TGEX6.js +85 -0
package/dist/chunk-TRAPF4IX.js +185 -0
package/dist/chunk-VRFZWGMS.js +68 -0
package/dist/chunk-VUCPTQ6G.js +67 -0
package/dist/chunk-W6Y7ZW3Y.js +13 -0
package/dist/chunk-WJZDO6OY.js +103 -0
package/dist/chunk-WUJTCNOU.js +122 -0
package/dist/chunk-X3MN5UQJ.js +89 -0
package/dist/chunk-Z75JC6I2.js +189 -0
package/dist/chunk-ZOAWBDWU.js +72 -0
package/dist/{chunk-RTPHUDZS.js → chunk-ZWC3MN5E.js} +1944 -2779
package/dist/ci-workflow-K5RCRNYR.js +8 -0
package/dist/constants-5JGUXPEK.js +6 -0
package/dist/create-skill-WPXHSLX2.js +11 -0
package/dist/dist-D4RYGUZE.js +14 -0
package/dist/{dist-C5PYIQPF.js → dist-JVZ2MKBC.js} +108 -6
package/dist/dist-L7LAAQAS.js +18 -0
package/dist/{dist-I7DB5VKB.js → dist-M6BQODWC.js} +1145 -0
package/dist/docs-PWCUVYWU.js +12 -0
package/dist/engine-6XUP6GAK.js +8 -0
package/dist/entropy-4I6JEYAC.js +12 -0
package/dist/feedback-TNIW534S.js +18 -0
package/dist/generate-agent-definitions-MWKEA5NU.js +15 -0
package/dist/glob-helper-5OHBUQAI.js +52 -0
package/dist/graph-loader-KO4GJ5N2.js +8 -0
package/dist/index.d.ts +328 -12
package/dist/index.js +93 -34
package/dist/loader-4FIPIFII.js +10 -0
package/dist/mcp-MOKLYNZL.js +34 -0
package/dist/performance-BTOJCPXU.js +24 -0
package/dist/review-pipeline-3YTW3463.js +9 -0
package/dist/runner-VMYLHWOC.js +6 -0
package/dist/runtime-GO7K2PJE.js +9 -0
package/dist/security-4P2GGFF6.js +9 -0
package/dist/skill-executor-RG45LUO5.js +8 -0
package/dist/templates/orchestrator/WORKFLOW.md +48 -0
package/dist/templates/orchestrator/template.json +6 -0
package/dist/validate-JN44D2Q7.js +12 -0
package/dist/validate-cross-check-DB7RIFFF.js +8 -0
package/dist/version-KFFPOQAX.js +6 -0
package/package.json +13 -7
package/dist/create-skill-UZOHMXRU.js +0 -8
package/dist/validate-cross-check-VG573VZO.js +0 -7

package/dist/agents/skills/claude-code/enforce-architecture/SKILL.md CHANGED

@@ -47,6 +47,8 @@ Graph queries show the complete violation scope (not just the first occurrence p
 1. **Run `harness check-deps`** to analyze all import statements against the constraint model. Capture the full JSON output.
+1b. **Optionally run `harness check-arch`** for comprehensive architecture analysis beyond dependency checking. This covers circular dependencies, complexity, coupling, module size, and dependency depth in addition to layer violations.
 2. **Parse the results.** Each violation includes:
    - The violating file and line number
    - The forbidden import target
@@ -162,6 +164,8 @@ Module A re-exports from Module B, and Module B imports from Module A. The circu
 - **`harness-design-system`** — Provides the design token source of truth (`tokens.json`) that constraints validate against.
 - **`harness-accessibility`** — Provides WCAG contrast validation used by DESIGN-003 constraints.
 - **Design constraint category** — Controlled by `design.strictness` in `harness.config.json`. Design violations surface alongside architectural violations in the same report.
+- **`harness check-arch`** — Architecture assertion framework. Runs all 7 metric collectors against baseline and thresholds. Use for comprehensive structural health checks beyond layer dependencies. Supports `--update-baseline` to capture current state and `--json` for machine-readable output.
+- **`harness check-arch --module <path>`** — Scoped architecture check for a single module. Use when validating a specific subsystem.
 ## Success Criteria

package/dist/agents/skills/claude-code/harness-autopilot/SKILL.md CHANGED

@@ -339,7 +339,9 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 3. **Mark phase as `complete`** in state.
-4. **Check for next phase:**
+4. **Sync roadmap.** If `docs/roadmap.md` exists, call `manage_roadmap` with action `sync` and `apply: true`. This reflects the just-completed phase in the roadmap (e.g., updating the feature from `planned` to `in-progress`). If `manage_roadmap` is unavailable, fall back to direct file manipulation using `syncRoadmap()` from core. Skip silently if no roadmap exists. Do not use `force_sync: true` — the human-always-wins rule applies.
+5. **Check for next phase:**
    - If more phases remain: "Phase {N} complete. Next: Phase {N+1}: {name} (complexity: {level}). Continue? (yes / stop)"
      - **yes** — Increment `currentPhase`, reset `retryBudget`, transition to ASSESS.
      - **stop** — Save state and exit.
@@ -383,7 +385,9 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
    - [skill:harness-autopilot] [outcome:observation] {any notable patterns from the run}
    ```
-5. **Clean up state:** Set `currentState: "DONE"` in `{sessionDir}/autopilot-state.json`. Do not delete the file — it serves as a record.
+5. **Update roadmap to done.** If `docs/roadmap.md` exists and the current spec maps to a roadmap feature, call `manage_roadmap` with action `update` to set the feature status to `done`. Derive the feature name from the spec title (H1 heading) or the session's `handoff.json` `summary` field. If `manage_roadmap` is unavailable, fall back to direct file manipulation using `updateFeature()` from core. Skip silently if no roadmap exists or if the feature is not found. Do not use `force_sync: true`.
+6. **Clean up state:** Set `currentState: "DONE"` in `{sessionDir}/autopilot-state.json`. Do not delete the file — it serves as a record.
 ## Harness Integration
@@ -394,6 +398,7 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 - **Handoff** — `.harness/sessions/<slug>/handoff.json` is written by each delegated skill and read by the next. Autopilot writes a final handoff on DONE.
 - **Learnings** — `.harness/learnings.md` (global) is appended by both delegated skills and autopilot itself.
 - **Roadmap context** — During INIT, reads `docs/roadmap.md` (if present) for project-level priorities, blockers, and milestone status. Provides broader context for phase execution decisions.
+- **Roadmap sync** — During PHASE_COMPLETE, calls `manage_roadmap` with `sync` and `apply: true` to reflect phase progress. During DONE, calls `manage_roadmap` with `update` to set feature status to `done`. Both skip silently when no roadmap exists. Neither uses `force_sync: true`.
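
The decision order described for the PHASE_COMPLETE roadmap sync can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the `RoadmapSyncDeps` type, the `syncRoadmapAfterPhase` function, and the injected callbacks are hypothetical names. Only the `sync` action, the `apply: true` flag, and the rule against `force_sync` come from the skill text.

```typescript
// Hypothetical dependency shape; only the action name and the apply flag
// are taken from the skill text, everything else is illustrative.
type RoadmapSyncDeps = {
  // True when docs/roadmap.md is present.
  roadmapExists: () => boolean;
  // Present only when the manage_roadmap MCP tool is reachable.
  manageRoadmap?: (args: { action: "sync"; apply: true }) => void;
  // Stand-in for the core fallback that the skill calls syncRoadmap().
  syncRoadmapFallback: () => void;
};

type SyncOutcome = "skipped" | "tool" | "fallback";

function syncRoadmapAfterPhase(deps: RoadmapSyncDeps): SyncOutcome {
  // No roadmap: skip silently, without prompting the user.
  if (!deps.roadmapExists()) {
    return "skipped";
  }
  if (deps.manageRoadmap) {
    // force_sync is deliberately never passed, so human edits always win.
    deps.manageRoadmap({ action: "sync", apply: true });
    return "tool";
  }
  // Tool unavailable: fall back to direct file manipulation.
  deps.syncRoadmapFallback();
  return "fallback";
}
```

Injecting the three callbacks keeps the ordering testable without a real MCP server or roadmap file.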
 ## Success Criteria

package/dist/agents/skills/claude-code/harness-brainstorming/SKILL.md CHANGED

@@ -156,7 +156,15 @@ These keywords flow into the `handoff.json` `contextKeywords` field when the spe
    The human must explicitly approve before this skill is complete.
-6. **Write handoff and suggest transition.** After the human approves the spec:
+6. **Add feature to roadmap.** If `docs/roadmap.md` exists:
+   - Derive the feature name from the spec title (the H1 heading of the proposal).
+   - Call `manage_roadmap` with action `add`, `status: "planned"`, `milestone: "Current Work"`, and the spec path. Include a one-line summary from the spec overview.
+   - If the feature already exists in the roadmap (duplicate name), skip silently — the feature was likely added manually or by a prior brainstorming session.
+   - Log: `"Added '<feature-name>' to roadmap as planned"` (informational, not a prompt).
+   - If `manage_roadmap` is unavailable, fall back to direct file manipulation using `addFeature()` from core.
+   - If no roadmap exists, skip this step silently.
+7. **Write handoff and suggest transition.** After the human approves the spec:
    Write `.harness/handoff.json`:
@@ -257,6 +265,7 @@ Converge on a recommendation that addresses all concerns before presenting the d
 - **`harness check-docs`** — Run to verify the spec does not conflict with existing documentation.
 - **Spec location** — Specs go to `docs/changes/<feature>/proposal.md`. Follow existing naming patterns.
 - **Handoff to harness-planning** — Once the spec is approved, invoke harness-planning to create the implementation plan from the spec.
+- **Roadmap sync** — After spec approval, call `manage_roadmap` with action `add` to register the new feature as `planned` in `docs/roadmap.md`. Skip silently if no roadmap exists. Duplicates are silently ignored.
 - **`emit_interaction`** -- Call at the end of Phase 4 to suggest transitioning to harness-planning. Uses confirmed transition (waits for user approval).
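
The "add feature to roadmap" step can be illustrated with a small sketch. It assumes an in-memory list of features and exact-match duplicate detection. The `RoadmapFeature` type and both function names are hypothetical and are not the `addFeature()` API from core. The status, milestone, and log message are the ones quoted in the skill text.

```typescript
// Hypothetical in-memory representation of a roadmap entry.
type RoadmapFeature = {
  name: string;
  status: string;
  milestone: string;
  spec: string;
};

// Returns the text of the first Markdown H1 ("# Title"), or null if none.
// Simplification: a "# ..." line inside a fenced code block would also match.
function deriveFeatureName(specMarkdown: string): string | null {
  for (const line of specMarkdown.split(/\r?\n/)) {
    const match = /^#\s+(.+?)\s*$/.exec(line);
    if (match && match[1]) {
      return match[1];
    }
  }
  return null;
}

// Adds the feature as "planned" unless a feature with the same name exists.
// Assumption: duplicates are detected by exact name equality.
function addPlannedFeature(
  features: RoadmapFeature[],
  specMarkdown: string,
  specPath: string,
): { added: boolean; log?: string } {
  const name = deriveFeatureName(specMarkdown);
  if (name === null) {
    return { added: false };
  }
  if (features.some((feature) => feature.name === name)) {
    // Duplicate: skip silently, as the skill text requires.
    return { added: false };
  }
  features.push({ name, status: "planned", milestone: "Current Work", spec: specPath });
  return { added: true, log: `Added '${name}' to roadmap as planned` };
}
```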
 #### Requirement Phrasing

package/dist/agents/skills/claude-code/harness-execution/SKILL.md CHANGED

@@ -266,7 +266,7 @@ Skipping this step means subsequent graph queries (impact analysis, dependency h
    }
    ```
-5. **Sync roadmap (if present).** If `docs/roadmap.md` exists, trigger a roadmap sync to update linked feature statuses based on the just-completed execution state. Use the `manage_roadmap` MCP tool with `sync` action if available, or invoke `/harness:roadmap --sync`. This keeps the roadmap current as plans are executed. If no roadmap exists, skip this step silently.
+5. **Sync roadmap (mandatory when present).** If `docs/roadmap.md` exists, call `manage_roadmap` with action `sync` and `apply: true` to update linked feature statuses from the just-completed execution state. Do not use `force_sync: true` — the human-always-wins rule applies. If `manage_roadmap` is unavailable, fall back to direct file manipulation using `syncRoadmap()` from core. If no roadmap exists, skip silently.
 6. **Learnings are append-only.** Never edit or delete previous learnings. They are a chronological record.
@@ -327,7 +327,7 @@ These are non-negotiable. When any condition is met, stop immediately.
 - **`harness state learn "<message>"`** — Append a learning from the command line.
 - **`.harness/state.json`** — Read at session start to resume position. Updated after every task.
 - **`.harness/learnings.md`** — Append-only knowledge capture. Read at session start for prior context.
-- **Roadmap sync** — After completing plan execution, sync roadmap status via `manage_roadmap sync` if `docs/roadmap.md` exists. Keeps roadmap current with execution progress.
+- **Roadmap sync** — After completing plan execution, call `manage_roadmap` with action `sync` and `apply: true` to update roadmap status. Mandatory when `docs/roadmap.md` exists. Do not use `force_sync: true`. Falls back to `syncRoadmap()` from core if MCP tool is unavailable.
 - **`emit_interaction`** -- Call at plan completion to auto-transition to harness-verification. Uses auto-transition (proceeds immediately without user confirmation).
 ## Success Criteria

package/dist/agents/skills/claude-code/harness-parallel-agents/SKILL.md CHANGED

@@ -16,28 +16,78 @@
 ## Process
-### Step 1: Identify Independent Problem Domains
+### Step 1: Verify Task Independence
-Before dispatching anything in parallel, rigorously verify independence:
+Before dispatching anything in parallel, predict conflicts using `predict_conflicts` (preferred) or `check_task_independence` (fallback):
-1. **List the candidate tasks.** Pull from the plan, or identify from the current work.
+1. **List the candidate tasks.** Pull from the plan, or identify from the current work. For each task, identify the files it will read and write.
-2. **Check file overlap.** For each pair of tasks, compare the files they will read and write. Any overlap in WRITE targets means they are NOT independent. Overlap in READ targets is acceptable only if neither task writes to those files.
+2. **Call `check_task_independence`.** Pass the tasks with their file lists:
-3. **Check state overlap.** Do any tasks share database tables, configuration files, environment variables, or in-memory state? If yes, they are NOT independent.
+   ```json
+   {
+     "path": "<project-root>",
+     "tasks": [
+       { "id": "task-a", "files": ["src/module-a/index.ts", "src/module-a/index.test.ts"] },
+       { "id": "task-b", "files": ["src/module-b/index.ts", "src/module-b/index.test.ts"] }
+     ],
+     "depth": 1
+   }
+   ```
-4. **Check import graph overlap.** If Task A modifies module X and Task B imports module X, they are NOT independent — Task B's tests may be affected by Task A's changes.
+   The tool checks direct file overlap AND transitive dependency overlap (via the knowledge graph when available). It returns:
+   - **`pairs`**: Pairwise independence results with overlap details
+   - **`groups`**: Safe parallel dispatch groups (connected components of the conflict graph)
+   - **`verdict`**: Human-readable summary (e.g., "3 of 4 tasks can run in parallel in 2 groups")
+   - **`analysisLevel`**: `"graph-expanded"` (full analysis) or `"file-only"` (graph unavailable)
-5. **When in doubt, run serially.** The cost of a false parallel dispatch (merge conflicts, subtle bugs, wasted work) far exceeds the cost of running serially.
+   **Preferred: Use `predict_conflicts`** for severity-aware analysis with automatic regrouping:
+   ```json
+   {
+     "path": "<project-root>",
+     "tasks": [
+       { "id": "task-a", "files": ["src/module-a/index.ts", "src/module-a/index.test.ts"] },
+       { "id": "task-b", "files": ["src/module-b/index.ts", "src/module-b/index.test.ts"] }
+     ],
+     "depth": 1
+   }
+   ```
+   The `predict_conflicts` tool extends independence checking with:
+   - **`conflicts`**: Severity-classified conflict details with human-readable reasoning
+   - **`groups`**: Revised parallel dispatch groups (high-severity conflicts force serialization)
+   - **`summary`**: Conflict counts by severity and whether regrouping occurred
+   - **`verdict`**: Human-readable summary including severity breakdown
+   If `predict_conflicts` is unavailable, fall back to `check_task_independence`.
+3. **Act on the result.** Use the returned `groups` for dispatch. Flag any medium-severity conflicts to the coordinator. If high-severity conflicts forced regrouping (`summary.regrouped === true`), log which tasks were serialized and why. If all tasks are in one group, dispatch them all in parallel. If tasks are split across groups, dispatch each group as a separate parallel wave.
+4. **When in doubt, run serially.** The cost of a false parallel dispatch (merge conflicts, subtle bugs, wasted work) far exceeds the cost of running serially.
+#### Manual Fallback (when MCP tool is unavailable)
+If `check_task_independence` is not available, verify independence manually:
+1. **Check file overlap.** For each pair of tasks, compare the files they will read and write. Any overlap in WRITE targets means they are NOT independent. Overlap in READ targets is acceptable only if neither task writes to those files.
+2. **Check state overlap.** Do any tasks share database tables, configuration files, environment variables, or in-memory state? If yes, they are NOT independent.
+3. **Check import graph overlap.** If Task A modifies module X and Task B imports module X, they are NOT independent — Task B's tests may be affected by Task A's changes.
+4. **When in doubt, run serially.** Same principle as above.
 ### Graph-Enhanced Context (when available)
-When a knowledge graph exists at `.harness/graph/`, use graph queries for faster, more accurate independence verification:
+When a knowledge graph exists at `.harness/graph/`, `check_task_independence` automatically uses it for transitive dependency analysis (this is the `"graph-expanded"` analysis level). No manual graph queries are needed for independence checking.
+For custom queries beyond independence checking, these tools remain available:
-- `query_graph` — get the dependency subgraph per candidate task and check for node overlap between tasks
-- `get_impact` — verify tasks do not write to overlapping files or share transitive dependencies
+- `query_graph` — get the dependency subgraph for a specific module or file
+- `get_impact` — assess the impact radius of changes to a specific module
-Automated graph-based independence verification replaces manual import grep and catches transitive overlaps that file-level checks miss. Fall back to file-based commands if no graph is available.
+When no graph is available, `check_task_independence` falls back to file-only overlap detection and flags `analysisLevel: "file-only"` so you know transitive dependencies were not checked.
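
To make the `"file-only"` analysis level concrete, here is a rough sketch of direct file-overlap detection followed by a greedy grouping into parallel waves. It is an illustration under stated assumptions, not the algorithm used by `check_task_independence`: the function names are hypothetical, every listed file is conservatively treated as a potential write target, and transitive dependencies are ignored entirely.

```typescript
// Mirrors the task shape used in the request payloads above.
type Task = { id: string; files: string[] };

// Two tasks conflict when they list at least one file in common.
function filesOverlap(a: Task, b: Task): boolean {
  const seen = new Set(a.files);
  return b.files.some((file) => seen.has(file));
}

// Greedy grouping (an assumption, not the tool's method): each task joins the
// first wave in which it conflicts with nothing, or else starts a new wave.
function groupIntoWaves(tasks: Task[]): string[][] {
  const waves: Task[][] = [];
  for (const task of tasks) {
    const wave = waves.find((candidate) =>
      candidate.every((other) => !filesOverlap(task, other)),
    );
    if (wave) {
      wave.push(task);
    } else {
      waves.push([task]);
    }
  }
  return waves.map((wave) => wave.map((task) => task.id));
}
```

Tasks inside one wave share no files and could be dispatched together, while separate waves would run one after another. A graph-expanded analysis would additionally reject pairs that overlap only through imports.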
 ### Step 2: Create Focused Agent Tasks
@@ -101,7 +151,7 @@ For each independent task, write a focused agent brief:
 ## Success Criteria
-- Independence was verified before dispatch (file overlap, state overlap, import graph)
+- Independence was verified before dispatch via `check_task_independence` (or manual fallback if tool unavailable)
 - Each agent had a focused brief with explicit scope, goal, constraints, and expected output
 - All agents completed successfully (or blockers were reported)
 - Integration produced no merge conflicts (or conflicts were resolved)
@@ -117,17 +167,52 @@ For each independent task, write a focused agent brief:
 **Step 1: Verify independence**
+Call `check_task_independence`:
+```json
+{
+  "path": ".",
+  "tasks": [
+    {
+      "id": "task-4-user",
+      "files": [
+        "src/services/user/service.ts",
+        "src/services/user/service.test.ts",
+        "src/types/user.ts"
+      ]
+    },
+    {
+      "id": "task-5-product",
+      "files": [
+        "src/services/product/service.ts",
+        "src/services/product/service.test.ts",
+        "src/types/product.ts"
+      ]
+    },
+    {
+      "id": "task-6-notification",
+      "files": [
+        "src/services/notification/service.ts",
+        "src/services/notification/service.test.ts",
+        "src/types/notification.ts"
+      ]
+    }
+  ]
+}
 ```
-Task 4 (UserService):      writes src/services/user/*, reads src/types/user.ts
-Task 5 (ProductService):   writes src/services/product/*, reads src/types/product.ts
-Task 6 (NotificationService): writes src/services/notification/*, reads src/types/notification.ts
-File overlap: NONE (different directories, different type files)
-State overlap: NONE (different DB tables, no shared config)
-Import graph: NONE (no cross-service imports)
-Verdict: INDEPENDENT — safe to parallelize
+Result:
+```json
+{
+  "analysisLevel": "graph-expanded",
+  "groups": [["task-4-user", "task-5-product", "task-6-notification"]],
+  "verdict": "3 of 3 tasks can run in parallel in 1 group"
+}
 ```
+All tasks are independent — safe to parallelize.
 **Step 2: Create agent briefs**
 ```

package/dist/agents/skills/claude-code/harness-pre-commit-review/SKILL.md CHANGED

@@ -81,6 +81,26 @@ If a knowledge graph exists at `.harness/graph/` and code files have changed sin
 If no graph exists, skip this step — the tools fall back to non-graph behavior.
+### Impact Preview
+After mechanical checks pass, run `harness impact-preview` to surface the blast radius of staged changes. This is informational only — it never blocks the commit.
+```bash
+harness impact-preview
+```
+Include the output in the report between the mechanical checks section and the AI review section:
+```
+Impact Preview (3 staged files)
+  Code:   12 files   (routes/login.ts, middleware/verify.ts, +10)
+  Tests:   3 tests   (auth.test.ts, integration.test.ts, +1)
+  Docs:    2 docs    (auth-guide.md, api-reference.md)
+  Total:  17 affected
+```
+If no graph exists, the command prints a nudge message and returns — no action needed. If no files are staged, it says so. Neither case blocks the workflow.
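
The report layout shown above can be reproduced with a short formatter. This is a sketch inferred only from the sample output, not the `harness impact-preview` source: the `ImpactCategory` type and `formatImpactPreview` function are hypothetical, and the rules it encodes (list the first two names, collapse the rest into `+N`, omit empty categories) are assumptions read off the examples.

```typescript
// Hypothetical input shape: one entry per report row.
type ImpactCategory = { label: string; unit: string; names: string[] };

function formatImpactPreview(stagedCount: number, categories: ImpactCategory[]): string {
  const lines = [`Impact Preview (${stagedCount} staged files)`];
  let total = 0;
  for (const { label, unit, names } of categories) {
    // Categories with nothing affected are left out of the report.
    if (names.length === 0) continue;
    total += names.length;
    // Show at most two names, then a "+N" remainder marker.
    const shown = names.slice(0, 2);
    const rest = names.length - shown.length;
    const preview = rest > 0 ? [...shown, `+${rest}`].join(", ") : shown.join(", ");
    const head = `${(label + ":").padEnd(6)}${String(names.length).padStart(4)}`;
    lines.push(`  ${head} ${unit.padEnd(5)}   (${preview})`);
  }
  lines.push(`  Total:${String(total).padStart(4)} affected`);
  return lines.join("\n");
}
```

The fixed column widths match the sample reports. Counts above 9999 or longer labels would need wider padding.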
 ### Phase 2: Classify Changes
 Determine whether AI review is needed based on what changed.
@@ -192,6 +212,12 @@ Mechanical Checks:
 - Tests: PASS (12/12)
 - Security Scan: PASS (0 errors, 0 warnings)
+Impact Preview (3 staged files)
+  Code:   12 files   (routes/login.ts, middleware/verify.ts, +10)
+  Tests:   3 tests   (auth.test.ts, integration.test.ts, +1)
+  Docs:    2 docs    (auth-guide.md, api-reference.md)
+  Total:  17 affected
 AI Review: PASS (no issues found)
 ```
@@ -207,6 +233,11 @@ Mechanical Checks:
 - Security Scan: WARN (0 errors, 1 warning)
   - [SEC-NET-001] src/cors.ts:5 — CORS wildcard origin
+Impact Preview (2 staged files)
+  Code:    8 files   (cors.ts, server.ts, +6)
+  Tests:   2 tests   (cors.test.ts, server.test.ts)
+  Total:  10 affected
 AI Review: 2 observations
 1. [file:line] Possible null dereference — `user.email` accessed without null check after `findUser()` which can return null.
 2. [file:line] Debug artifact — `console.log('debug:', payload)` appears to be left from debugging.
@@ -244,6 +275,7 @@ fi
 - Complements harness-code-review (full review) — use pre-commit for quick checks, code-review for thorough analysis
 - **`assess_project`** — Used in Phase 1 for harness-specific health checks (validate + deps) in a single call.
 - **`review_changes`** — Used in Phase 4 with `depth: 'quick'` for fast pre-commit diff analysis.
+- **`harness impact-preview`** — Run after mechanical checks pass to show blast radius of staged changes. Informational only — never blocks.
 ## Success Criteria
@@ -264,6 +296,11 @@ Mechanical Checks:
 - Types: PASS
 - Tests: PASS (12/12)
+Impact Preview (2 staged files)
+  Code:    5 files   (auth.ts, login.ts, +3)
+  Tests:   2 tests   (auth.test.ts, login.test.ts)
+  Total:   7 affected
 AI Review: PASS (no issues found)
 ```

package/dist/agents/skills/gemini-cli/enforce-architecture/SKILL.md CHANGED

@@ -47,6 +47,8 @@ Graph queries show the complete violation scope (not just the first occurrence p
 1. **Run `harness check-deps`** to analyze all import statements against the constraint model. Capture the full JSON output.
+1b. **Optionally run `harness check-arch`** for comprehensive architecture analysis beyond dependency checking. This covers circular dependencies, complexity, coupling, module size, and dependency depth in addition to layer violations.
 2. **Parse the results.** Each violation includes:
    - The violating file and line number
    - The forbidden import target
@@ -162,6 +164,8 @@ Module A re-exports from Module B, and Module B imports from Module A. The circu
 - **`harness-design-system`** — Provides the design token source of truth (`tokens.json`) that constraints validate against.
 - **`harness-accessibility`** — Provides WCAG contrast validation used by DESIGN-003 constraints.
 - **Design constraint category** — Controlled by `design.strictness` in `harness.config.json`. Design violations surface alongside architectural violations in the same report.
+- **`harness check-arch`** — Architecture assertion framework. Runs all 7 metric collectors against baseline and thresholds. Use for comprehensive structural health checks beyond layer dependencies. Supports `--update-baseline` to capture current state and `--json` for machine-readable output.
+- **`harness check-arch --module <path>`** — Scoped architecture check for a single module. Use when validating a specific subsystem.
 ## Success Criteria

package/dist/agents/skills/gemini-cli/harness-autopilot/SKILL.md CHANGED

@@ -339,7 +339,9 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 3. **Mark phase as `complete`** in state.
-4. **Check for next phase:**
+4. **Sync roadmap.** If `docs/roadmap.md` exists, call `manage_roadmap` with action `sync` and `apply: true`. This reflects the just-completed phase in the roadmap (e.g., updating the feature from `planned` to `in-progress`). If `manage_roadmap` is unavailable, fall back to direct file manipulation using `syncRoadmap()` from core. Skip silently if no roadmap exists. Do not use `force_sync: true` — the human-always-wins rule applies.
+5. **Check for next phase:**
    - If more phases remain: "Phase {N} complete. Next: Phase {N+1}: {name} (complexity: {level}). Continue? (yes / stop)"
      - **yes** — Increment `currentPhase`, reset `retryBudget`, transition to ASSESS.
      - **stop** — Save state and exit.
@@ -383,7 +385,9 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
    - [skill:harness-autopilot] [outcome:observation] {any notable patterns from the run}
    ```
-5. **Clean up state:** Set `currentState: "DONE"` in `{sessionDir}/autopilot-state.json`. Do not delete the file — it serves as a record.
+5. **Update roadmap to done.** If `docs/roadmap.md` exists and the current spec maps to a roadmap feature, call `manage_roadmap` with action `update` to set the feature status to `done`. Derive the feature name from the spec title (H1 heading) or the session's `handoff.json` `summary` field. If `manage_roadmap` is unavailable, fall back to direct file manipulation using `updateFeature()` from core. Skip silently if no roadmap exists or if the feature is not found. Do not use `force_sync: true`.
+6. **Clean up state:** Set `currentState: "DONE"` in `{sessionDir}/autopilot-state.json`. Do not delete the file — it serves as a record.
 ## Harness Integration
@@ -394,6 +398,7 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 - **Handoff** — `.harness/sessions/<slug>/handoff.json` is written by each delegated skill and read by the next. Autopilot writes a final handoff on DONE.
 - **Learnings** — `.harness/learnings.md` (global) is appended by both delegated skills and autopilot itself.
 - **Roadmap context** — During INIT, reads `docs/roadmap.md` (if present) for project-level priorities, blockers, and milestone status. Provides broader context for phase execution decisions.
+- **Roadmap sync** — During PHASE_COMPLETE, calls `manage_roadmap` with `sync` and `apply: true` to reflect phase progress. During DONE, calls `manage_roadmap` with `update` to set feature status to `done`. Both skip silently when no roadmap exists. Neither uses `force_sync: true`.
 ## Success Criteria

package/dist/agents/skills/gemini-cli/harness-brainstorming/SKILL.md CHANGED

@@ -156,7 +156,15 @@ These keywords flow into the `handoff.json` `contextKeywords` field when the spe
    The human must explicitly approve before this skill is complete.
-6. **Write handoff and suggest transition.** After the human approves the spec:
+6. **Add feature to roadmap.** If `docs/roadmap.md` exists:
+   - Derive the feature name from the spec title (the H1 heading of the proposal).
+   - Call `manage_roadmap` with action `add`, `status: "planned"`, `milestone: "Current Work"`, and the spec path. Include a one-line summary from the spec overview.
+   - If the feature already exists in the roadmap (duplicate name), skip silently — the feature was likely added manually or by a prior brainstorming session.
+   - Log: `"Added '<feature-name>' to roadmap as planned"` (informational, not a prompt).
+   - If `manage_roadmap` is unavailable, fall back to direct file manipulation using `addFeature()` from core.
+   - If no roadmap exists, skip this step silently.
+7. **Write handoff and suggest transition.** After the human approves the spec:
    Write `.harness/handoff.json`:
@@ -257,6 +265,7 @@ Converge on a recommendation that addresses all concerns before presenting the d
 - **`harness check-docs`** — Run to verify the spec does not conflict with existing documentation.
 - **Spec location** — Specs go to `docs/changes/<feature>/proposal.md`. Follow existing naming patterns.
 - **Handoff to harness-planning** — Once the spec is approved, invoke harness-planning to create the implementation plan from the spec.
+- **Roadmap sync** — After spec approval, call `manage_roadmap` with action `add` to register the new feature as `planned` in `docs/roadmap.md`. Skip silently if no roadmap exists. Duplicates are silently ignored.
 - **`emit_interaction`** -- Call at the end of Phase 4 to suggest transitioning to harness-planning. Uses confirmed transition (waits for user approval).
 #### Requirement Phrasing

package/dist/agents/skills/gemini-cli/harness-execution/SKILL.md CHANGED

@@ -266,7 +266,7 @@ Skipping this step means subsequent graph queries (impact analysis, dependency h
    }
    ```
-5. **Sync roadmap (if present).** If `docs/roadmap.md` exists, trigger a roadmap sync to update linked feature statuses based on the just-completed execution state. Use the `manage_roadmap` MCP tool with `sync` action if available, or invoke `/harness:roadmap --sync`. This keeps the roadmap current as plans are executed. If no roadmap exists, skip this step silently.
+5. **Sync roadmap (mandatory when present).** If `docs/roadmap.md` exists, call `manage_roadmap` with action `sync` and `apply: true` to update linked feature statuses from the just-completed execution state. Do not use `force_sync: true` — the human-always-wins rule applies. If `manage_roadmap` is unavailable, fall back to direct file manipulation using `syncRoadmap()` from core. If no roadmap exists, skip silently.
 6. **Learnings are append-only.** Never edit or delete previous learnings. They are a chronological record.
@@ -327,7 +327,7 @@ These are non-negotiable. When any condition is met, stop immediately.
 - **`harness state learn "<message>"`** — Append a learning from the command line.
 - **`.harness/state.json`** — Read at session start to resume position. Updated after every task.
 - **`.harness/learnings.md`** — Append-only knowledge capture. Read at session start for prior context.
-- **Roadmap sync** — After completing plan execution, sync roadmap status via `manage_roadmap sync` if `docs/roadmap.md` exists. Keeps roadmap current with execution progress.
+- **Roadmap sync** — After completing plan execution, call `manage_roadmap` with action `sync` and `apply: true` to update roadmap status. Mandatory when `docs/roadmap.md` exists. Do not use `force_sync: true`. Falls back to `syncRoadmap()` from core if MCP tool is unavailable.
 - **`emit_interaction`** -- Call at plan completion to auto-transition to harness-verification. Uses auto-transition (proceeds immediately without user confirmation).
 ## Success Criteria
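
Review note: the reworded step 5 in this file amounts to a three-way decision (skip, call the MCP tool, or fall back to core). A minimal sketch of that decision follows. The tool name, the `sync` action, `apply: true`, and `syncRoadmap()` come from the diff text; the `SyncPlan` type, the `planRoadmapSync` function, and the exact argument shape are assumptions made for illustration, not the package's API.

```typescript
// Hypothetical argument shape: the diff only states action `sync` and `apply: true`.
interface RoadmapSyncArgs {
  action: "sync";
  apply: true;
}

// Hypothetical result type describing which path the agent should take.
type SyncPlan =
  | { kind: "skip" }
  | { kind: "mcp"; tool: "manage_roadmap"; args: RoadmapSyncArgs }
  | { kind: "core-fallback"; fn: "syncRoadmap" };

function planRoadmapSync(roadmapExists: boolean, mcpToolAvailable: boolean): SyncPlan {
  // No docs/roadmap.md: skip silently.
  if (!roadmapExists) return { kind: "skip" };
  // Roadmap present and tool available: sync with apply, never with force_sync.
  if (mcpToolAvailable) {
    return { kind: "mcp", tool: "manage_roadmap", args: { action: "sync", apply: true } };
  }
  // Roadmap present but tool unavailable: fall back to syncRoadmap() from core.
  return { kind: "core-fallback", fn: "syncRoadmap" };
}
```

Because `RoadmapSyncArgs` has no `force_sync` field, the sketch cannot express a forced sync, which mirrors the "human-always-wins" rule stated in the new wording.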

package/dist/agents/skills/gemini-cli/harness-parallel-agents/SKILL.md CHANGED

@@ -16,28 +16,78 @@
 ## Process
-### Step 1: Identify Independent Problem Domains
+### Step 1: Verify Task Independence
-Before dispatching anything in parallel, rigorously verify independence:
+Before dispatching anything in parallel, predict conflicts using `predict_conflicts` (preferred) or `check_task_independence` (fallback):
-1. **List the candidate tasks.** Pull from the plan, or identify from the current work.
+1. **List the candidate tasks.** Pull from the plan, or identify from the current work. For each task, identify the files it will read and write.
-2. **Check file overlap.** For each pair of tasks, compare the files they will read and write. Any overlap in WRITE targets means they are NOT independent. Overlap in READ targets is acceptable only if neither task writes to those files.
+2. **Call `check_task_independence`.** Pass the tasks with their file lists:
-3. **Check state overlap.** Do any tasks share database tables, configuration files, environment variables, or in-memory state? If yes, they are NOT independent.
+   ```json
+   {
+     "path": "<project-root>",
+     "tasks": [
+       { "id": "task-a", "files": ["src/module-a/index.ts", "src/module-a/index.test.ts"] },
+       { "id": "task-b", "files": ["src/module-b/index.ts", "src/module-b/index.test.ts"] }
+     ],
+     "depth": 1
+   }
+   ```
-4. **Check import graph overlap.** If Task A modifies module X and Task B imports module X, they are NOT independent — Task B's tests may be affected by Task A's changes.
+   The tool checks direct file overlap AND transitive dependency overlap (via the knowledge graph when available). It returns:
+   - **`pairs`**: Pairwise independence results with overlap details
+   - **`groups`**: Safe parallel dispatch groups (connected components of the conflict graph)
+   - **`verdict`**: Human-readable summary (e.g., "3 of 4 tasks can run in parallel in 2 groups")
+   - **`analysisLevel`**: `"graph-expanded"` (full analysis) or `"file-only"` (graph unavailable)
-5. **When in doubt, run serially.** The cost of a false parallel dispatch (merge conflicts, subtle bugs, wasted work) far exceeds the cost of running serially.
+   **Preferred: Use `predict_conflicts`** for severity-aware analysis with automatic regrouping:
+   ```json
+   {
+     "path": "<project-root>",
+     "tasks": [
+       { "id": "task-a", "files": ["src/module-a/index.ts", "src/module-a/index.test.ts"] },
+       { "id": "task-b", "files": ["src/module-b/index.ts", "src/module-b/index.test.ts"] }
+     ],
+     "depth": 1
+   }
+   ```
+   The `predict_conflicts` tool extends independence checking with:
+   - **`conflicts`**: Severity-classified conflict details with human-readable reasoning
+   - **`groups`**: Revised parallel dispatch groups (high-severity conflicts force serialization)
+   - **`summary`**: Conflict counts by severity and whether regrouping occurred
+   - **`verdict`**: Human-readable summary including severity breakdown
+   If `predict_conflicts` is unavailable, fall back to `check_task_independence`.
+3. **Act on the result.** Use the returned `groups` for dispatch. Flag any medium-severity conflicts to the coordinator. If high-severity conflicts forced regrouping (`summary.regrouped === true`), log which tasks were serialized and why. If all tasks are in one group, dispatch them all in parallel. If tasks are split across groups, dispatch each group as a separate parallel wave.
+4. **When in doubt, run serially.** The cost of a false parallel dispatch (merge conflicts, subtle bugs, wasted work) far exceeds the cost of running serially.
+#### Manual Fallback (when MCP tool is unavailable)
+If `check_task_independence` is not available, verify independence manually:
+1. **Check file overlap.** For each pair of tasks, compare the files they will read and write. Any overlap in WRITE targets means they are NOT independent. Overlap in READ targets is acceptable only if neither task writes to those files.
+2. **Check state overlap.** Do any tasks share database tables, configuration files, environment variables, or in-memory state? If yes, they are NOT independent.
+3. **Check import graph overlap.** If Task A modifies module X and Task B imports module X, they are NOT independent — Task B's tests may be affected by Task A's changes.
+4. **When in doubt, run serially.** Same principle as above.
 ### Graph-Enhanced Context (when available)
-When a knowledge graph exists at `.harness/graph/`, use graph queries for faster, more accurate independence verification:
+When a knowledge graph exists at `.harness/graph/`, `check_task_independence` automatically uses it for transitive dependency analysis (this is the `"graph-expanded"` analysis level). No manual graph queries are needed for independence checking.
+For custom queries beyond independence checking, these tools remain available:
-- `query_graph` — get the dependency subgraph per candidate task and check for node overlap between tasks
-- `get_impact` — verify tasks do not write to overlapping files or share transitive dependencies
+- `query_graph` — get the dependency subgraph for a specific module or file
+- `get_impact` — assess the impact radius of changes to a specific module
-Automated graph-based independence verification replaces manual import grep and catches transitive overlaps that file-level checks miss. Fall back to file-based commands if no graph is available.
+When no graph is available, `check_task_independence` falls back to file-only overlap detection and flags `analysisLevel: "file-only"` so you know transitive dependencies were not checked.
 ### Step 2: Create Focused Agent Tasks
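
Review note: the manual fallback in the hunk above reduces to a pairwise file-overlap check followed by grouping. The sketch below shows one way to do this at the `"file-only"` level. It is not the tool's implementation: the diff describes `groups` as connected components of the conflict graph, whereas this sketch uses a simpler greedy grouping in which no two tasks in a wave share a file. The `Task` shape mirrors the `id` and `files` fields from the request payload; `fileOverlap` and `groupIntoWaves` are hypothetical names. State overlap and import-graph overlap are not covered.

```typescript
// Mirrors the `tasks` entries in the request payload shown in the diff.
interface Task {
  id: string;
  files: string[];
}

// Files listed by both tasks. Any shared file is treated as a conflict here,
// which is stricter than the read/write distinction in the manual fallback.
function fileOverlap(a: Task, b: Task): string[] {
  const filesA = new Set(a.files);
  return b.files.filter((f) => filesA.has(f));
}

// Greedy grouping (an assumption, not the tool's algorithm): place each task
// in the first wave where it overlaps with no existing member.
function groupIntoWaves(tasks: Task[]): string[][] {
  const waves: Task[][] = [];
  for (const task of tasks) {
    const wave = waves.find((w) => w.every((other) => fileOverlap(task, other).length === 0));
    if (wave) {
      wave.push(task);
    } else {
      waves.push([task]);
    }
  }
  return waves.map((w) => w.map((t) => t.id));
}
```

With tasks whose file lists are fully disjoint, every task lands in a single wave. A task that shares a file with an earlier one is pushed into a later wave.
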
@@ -101,7 +151,7 @@ For each independent task, write a focused agent brief:
 ## Success Criteria
-- Independence was verified before dispatch (file overlap, state overlap, import graph)
+- Independence was verified before dispatch via `check_task_independence` (or manual fallback if tool unavailable)
 - Each agent had a focused brief with explicit scope, goal, constraints, and expected output
 - All agents completed successfully (or blockers were reported)
 - Integration produced no merge conflicts (or conflicts were resolved)
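
Review note: the new "Act on the result" step (Step 1, item 3) can be read as a small post-processing pass over the `predict_conflicts` response. A hedged sketch follows. The fields `conflicts`, `groups`, and `summary.regrouped` are named in the diff; the per-conflict fields (`severity`, `tasks`, `reason`), the `"low"` severity level, and the `planDispatch` function are assumptions for illustration only.

```typescript
// "medium" and "high" appear in the diff; "low" is an assumed third level.
type Severity = "low" | "medium" | "high";

// Hypothetical per-conflict shape; the diff does not specify these fields.
interface Conflict {
  severity: Severity;
  tasks: [string, string];
  reason: string;
}

interface PredictConflictsResult {
  conflicts: Conflict[];
  groups: string[][];
  summary: { regrouped: boolean };
}

interface DispatchPlan {
  waves: string[][];   // each inner array is dispatched as one parallel wave
  flagged: Conflict[]; // medium-severity conflicts to surface to the coordinator
  log: string[];       // which tasks were serialized, and why
}

function planDispatch(result: PredictConflictsResult): DispatchPlan {
  const flagged = result.conflicts.filter((c) => c.severity === "medium");
  const log: string[] = [];
  // Only report serialization when the tool says regrouping actually occurred.
  if (result.summary.regrouped) {
    for (const c of result.conflicts) {
      if (c.severity === "high") {
        log.push(`serialized ${c.tasks[0]} and ${c.tasks[1]}: ${c.reason}`);
      }
    }
  }
  // The returned groups are used as-is for dispatch.
  return { waves: result.groups, flagged, log };
}
```

The sketch deliberately does not recompute groups: per the diff, the tool's `groups` are already revised for high-severity conflicts, so the caller only reports and dispatches.
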
@@ -117,17 +167,52 @@ For each independent task, write a focused agent brief:
 **Step 1: Verify independence**
+Call `check_task_independence`:
+```json
+{
+  "path": ".",
+  "tasks": [
+    {
+      "id": "task-4-user",
+      "files": [
+        "src/services/user/service.ts",
+        "src/services/user/service.test.ts",
+        "src/types/user.ts"
+      ]
+    },
+    {
+      "id": "task-5-product",
+      "files": [
+        "src/services/product/service.ts",
+        "src/services/product/service.test.ts",
+        "src/types/product.ts"
+      ]
+    },
+    {
+      "id": "task-6-notification",
+      "files": [
+        "src/services/notification/service.ts",
+        "src/services/notification/service.test.ts",
+        "src/types/notification.ts"
+      ]
+    }
+  ]
+}
 ```
-Task 4 (UserService):      writes src/services/user/*, reads src/types/user.ts
-Task 5 (ProductService):   writes src/services/product/*, reads src/types/product.ts
-Task 6 (NotificationService): writes src/services/notification/*, reads src/types/notification.ts
-File overlap: NONE (different directories, different type files)
-State overlap: NONE (different DB tables, no shared config)
-Import graph: NONE (no cross-service imports)
-Verdict: INDEPENDENT — safe to parallelize
+Result:
+```json
+{
+  "analysisLevel": "graph-expanded",
+  "groups": [["task-4-user", "task-5-product", "task-6-notification"]],
+  "verdict": "3 of 3 tasks can run in parallel in 1 group"
+}
 ```
+All tasks are independent — safe to parallelize.
 **Step 2: Create agent briefs**
 ```