@jaimevalasek/aioson 1.23.3 → 1.28.1
- package/CHANGELOG.md +56 -0
- package/docs/en/4-agents/README.md +11 -8
- package/docs/en/4-agents/forge-run.md +165 -0
- package/docs/en/5-reference/README.md +1 -0
- package/docs/en/5-reference/cli-reference.md +199 -85
- package/docs/en/5-reference/executable-verification.md +165 -0
- package/docs/pt/4-agentes/README.md +2 -1
- package/docs/pt/4-agentes/forge-run.md +150 -0
- package/docs/pt/4-agentes/pm.md +8 -0
- package/docs/pt/4-agentes/qa.md +2 -0
- package/docs/pt/4-agentes/scope-check.md +19 -1
- package/docs/pt/4-agentes/sheldon.md +2 -0
- package/docs/pt/4-agentes/validator.md +20 -0
- package/docs/pt/5-referencia/autopilot-handoff.md +33 -0
- package/docs/pt/5-referencia/comandos-cli.md +64 -9
- package/docs/pt/5-referencia/fluxo-artefatos.md +40 -15
- package/docs/pt/5-referencia/loop-guardrails.md +19 -0
- package/docs/pt/5-referencia/sdd-automation-scripts.md +130 -26
- package/package.json +1 -1
- package/src/cli.js +70 -54
- package/src/commands/forge-compile.js +330 -0
- package/src/commands/harness-check.js +159 -0
- package/src/commands/harness.js +37 -2
- package/src/commands/spec-analyze.js +324 -0
- package/src/constants.js +118 -108
- package/src/harness/contract-schema.js +8 -0
- package/src/harness/plan-waves.js +77 -0
- package/src/harness/review-payload.js +230 -0
- package/src/i18n/messages/en.js +21 -15
- package/src/i18n/messages/es.js +15 -13
- package/src/i18n/messages/fr.js +15 -13
- package/src/i18n/messages/pt-BR.js +21 -15
- package/src/parser.js +3 -1
- package/template/.aioson/agents/dev.md +67 -66
- package/template/.aioson/agents/forge-run.md +57 -0
- package/template/.aioson/agents/pm.md +51 -45
- package/template/.aioson/agents/qa.md +22 -22
- package/template/.aioson/agents/scope-check.md +49 -46
- package/template/.aioson/agents/sheldon.md +1 -1
- package/template/.aioson/agents/validator.md +16 -5
- package/template/.aioson/docs/autopilot-handoff.md +34 -32
- package/template/.aioson/docs/sheldon/harness-contract.md +19 -2
- package/template/.claude/commands/aioson/agent/forge-run.md +17 -0
- package/template/AGENTS.md +15 -13
- package/template/CLAUDE.md +10 -9
- package/template/OPENCODE.md +24 -23
package/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,62 @@
|
|
|
2
2
|
|
|
3
3
|
All notable changes to this project will be documented in this file.
|
|
4
4
|
|
|
5
|
+
## [1.28.0] - 2026-06-11
|
|
6
|
+
|
|
7
|
+
### Added
|
|
8
|
+
- **`forge:compile` — spec → workflow-script compiler (Lane B, opt-in).** `aioson forge:compile [path] --feature=<slug> [--json]` compiles a MEDIUM feature's completed artifacts into `.aioson/plans/{slug}/forge-run.workflow.js` — an auditable, versionable dynamic-workflow script meant to be committed alongside the spec. Compiled structure: one `parallel()` stage per Wave (file-disjoint dev agents, blocked-wave early stop), a deterministic convergence loop on `aioson harness:check` bounded by the governor's `error_streak_limit` (fixes run **sequentially** — criteria don't prove file-disjointness, only waves do) plus a token-budget guard, 3-lens adversarial review (correctness/completeness/regression-risk, majority survives, refute-by-default) for binary criteria without `verification`, and a fresh-context validator stage that closes through the normal `harness:validate` → `last-validator-output.json` → `apply-validation` circuit-breaker cycle. Hard preflights — invalid/missing contract, zero executable criteria, plan without Wave column, `spec:analyze` errors, and `wave_file_overlap` (warning in analyze, **error** here) all refuse compilation with owner-agent guidance. Generated code honors the workflow runtime contract: pure-literal `meta`, plain JS, no `Date.now()`/`Math.random()`/`new Date()`, and all artifact-derived text embedded via `JSON.stringify` (no interpolable template literals — injection-safe). The script never runs `feature:close`/publish.
|
|
9
|
+
- **`@forge-run` agent — the Lane B entry point.** New opt-in agent (`/forge-run`): compile (refusals route to the owner agent), review the compile report with the user (cost warning included), execute the generated script via the workflow runtime (never hand-emulated), and report — PASS recommends the human run `feature:close`, FAIL routes to `@dev` through the normal lane. Registered across CLAUDE.md/AGENTS.md/OPENCODE.md, `.claude/commands` wrapper, `src/constants.js` (MANAGED_FILES + AGENTS), and all template mirrors.
|
|
10
|
+
- **`src/harness/plan-waves.js`** — shared Execution Sequence parser (waves + scope + done columns, `groupByWave`), extracted from `spec:analyze` and reused by the compiler. `spec:analyze` behavior unchanged.
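The injection-safety point in the `forge:compile` entry above can be illustrated with a small sketch. This is not the compiler's actual code: `emitPromptConstant` is a hypothetical helper and the sample text is invented. It only shows why serializing artifact text with `JSON.stringify` yields a plain string literal, whereas pasting the same text into a template literal could evaluate it.

```javascript
// Hypothetical helper (not part of aioson): emit one line of generated source
// that declares a string constant holding untrusted artifact text.
function emitPromptConstant(name, artifactText) {
  // JSON.stringify escapes quotes, backslashes and newlines and produces a
  // double-quoted literal, so `${...}` inside the text is never evaluated.
  return `const ${name} = ${JSON.stringify(artifactText)};`;
}

// Artifact text that would be dangerous if pasted into a template literal.
const hostile = 'Done when `${process.exit(1)}` "passes"';
const generated = emitPromptConstant('CRITERION_C1', hostile);

// Evaluating the generated source gives back the original text unchanged.
const roundTripped = new Function(`${generated} return CRITERION_C1;`)();
console.log(roundTripped === hostile);
```

The round trip is the useful invariant here: whatever the artifact contains, the generated script sees it as data.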
|
|
11
|
+
|
|
12
|
+
### Tests
|
|
13
|
+
- Added `forge-compile` suite (9 cases: preflight refusals, governor-derived fix-loop cap, wave→parallel structure, runtime-constraint bans, byte-identical recompilation determinism, template-injection invariant, JSON mode). Full suite green (3210 pass).
|
|
14
|
+
|
|
15
|
+
## [1.27.0] - 2026-06-11
|
|
16
|
+
|
|
17
|
+
### Added
|
|
18
|
+
- **Wave column — parallelism markers in the MEDIUM implementation plan.** `@pm`'s Execution Sequence gains a `Wave` column: phases sharing a Wave are file-disjoint and dependency-free with respect to each other (parallelizable via isolated subagents/worktrees); waves execute in ascending order. Marking rules are conservative by design — same Wave only when Primary files do not overlap AND neither phase consumes the other's output; when in doubt, sequential (a wrong sequential marking costs wall-clock; a wrong parallel marking costs a merge conflict). This is the cheap prerequisite for any future fan-out execution lane, and forces explicit file-boundary thinking even without one. Template mirror synced.
|
|
19
|
+
- **`wave_file_overlap` check in `spec:analyze`.** The deterministic pass now parses the Execution Sequence table and flags same-wave phases whose Primary files overlap (warning). Noise-guarded for backward compatibility: plans without a Wave column skip the check entirely; placeholder cells (`...`, `-`) and non-integer waves are ignored; paths normalized (backticks stripped, separators unified, case-insensitive).
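A minimal sketch of the overlap rule described above, assuming the Execution Sequence table has already been parsed into rows. The row shape (`phase`, `wave`, `files`) and the function names are hypothetical and chosen for illustration; the real parser lives in `spec:analyze` and may differ.

```javascript
// Normalize one Primary-files entry: strip backticks, unify separators,
// lowercase for case-insensitive comparison.
function normalizePath(p) {
  return p.replace(/`/g, '').trim().replace(/\\/g, '/').toLowerCase();
}

// phases: [{ phase, wave, files }] where `wave` is the raw cell text (assumed shape).
function findWaveOverlaps(phases) {
  const byWave = new Map();
  for (const { phase, wave, files } of phases) {
    if (!/^\d+$/.test(String(wave).trim())) continue; // non-integer waves are ignored
    const clean = files
      .map(normalizePath)
      .filter((f) => f !== '' && f !== '...' && f !== '-'); // placeholder cells
    const key = Number(wave);
    if (!byWave.has(key)) byWave.set(key, []);
    byWave.get(key).push({ phase, files: new Set(clean) });
  }
  const findings = [];
  for (const [wave, group] of byWave) {
    // Compare every pair of phases that share a wave.
    for (let i = 0; i < group.length; i++) {
      for (let j = i + 1; j < group.length; j++) {
        const shared = [...group[i].files].filter((f) => group[j].files.has(f));
        if (shared.length > 0) {
          findings.push({ wave, phases: [group[i].phase, group[j].phase], shared });
        }
      }
    }
  }
  return findings;
}

const overlaps = findWaveOverlaps([
  { phase: 'P1', wave: '1', files: ['`src/Cart.js`'] },
  { phase: 'P2', wave: '1', files: ['src\\cart.js', 'src/api.js'] },
  { phase: 'P3', wave: '2', files: ['src/cart.js'] },
  { phase: 'P4', wave: '-', files: ['src/cart.js'] },
]);
console.log(JSON.stringify(overlaps));
```

Only P1 and P2 are reported: they share wave 1 and the same file after normalization, while P3 is in a different wave and P4 has a placeholder wave.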
|
|
20
|
+
|
|
21
|
+
### Tests
|
|
22
|
+
- Added wave overlap/disjoint/legacy-plan cases to the `spec-analyze` suite (14 total). Full suite green (3201 pass).
|
|
23
|
+
|
|
24
|
+
## [1.26.0] - 2026-06-11
|
|
25
|
+
|
|
26
|
+
### Added
|
|
27
|
+
- **`spec:analyze` — deterministic cross-artifact content consistency.** `aioson spec:analyze [path] --feature=<slug> [--json]` is the content sibling of `artifact:validate` (chain presence — untouched): it confronts the feature's artifacts before the execution gate and reports findings by severity. Checks: REQ/AC **ID traceability** (ids declared in `requirements-{slug}.md` never referenced downstream = coverage gap; ids referenced downstream but never declared = orphan/drift signal — both noise-guarded: prose-style plans that cite no ids produce no gap findings), **staleness ordering** (upstream artifact modified after a downstream one was produced, 60s tolerance, project-global `architecture.md` excluded), **readiness states** (`blocked` = error, `ready_with_warnings` = info), **harness-contract sanity** (schema errors = error; executable-coverage warnings = info, via `validateContract`), and **AC→contract linkage** (no declared AC mentioned in the contract = info). Persists `spec-analyze-{slug}.json` to `.aioson/context/` (collected by `feature:export`/`archive`); `error` findings flip `ok: false` (exit 1 in `--json` mode) for gate scripting. Reuses `scanArtifacts`/`detectClassification` from the preflight engine.
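The ID traceability check above can be sketched as a set comparison. This is an illustration under stated assumptions, not the shipped implementation: the `REQ-n`/`AC-n` id pattern and the names `collectIds` and `traceability` are hypothetical.

```javascript
// Collect REQ-n / AC-n style ids from text (the id format is an assumption).
function collectIds(text) {
  return new Set(text.match(/\b(?:REQ|AC)-\d+\b/g) || []);
}

function traceability(requirementsText, downstreamTexts) {
  const declared = collectIds(requirementsText);
  const referenced = new Set();
  for (const text of downstreamTexts) {
    for (const id of collectIds(text)) referenced.add(id);
  }
  // Noise guard: downstream artifacts that cite no ids at all yield no gaps.
  const gaps = referenced.size === 0
    ? []
    : [...declared].filter((id) => !referenced.has(id));
  // Ids used downstream but never declared point at drift.
  const orphans = [...referenced].filter((id) => !declared.has(id));
  return { gaps, orphans };
}

const report = traceability(
  'REQ-1 login. REQ-2 logout. AC-1 session expires.',
  ['Phase 1 covers REQ-1 and AC-1.', 'Phase 2 covers REQ-9.']
);
console.log(report);
```

Here `REQ-2` is a coverage gap and `REQ-9` is an orphan; a prose-only plan would produce neither.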
|
|
28
|
+
|
|
29
|
+
### Changed
|
|
30
|
+
- **`@scope-check` preflight runs `spec:analyze`.** The deterministic pass executes before deep loads: `error` findings are blockers routed to the owner agent; `warning` findings enter the drift comparison as pre-computed evidence to confirm or dismiss explicitly. Template mirror synced.
|
|
31
|
+
|
|
32
|
+
### Tests
|
|
33
|
+
- Added `spec-analyze` suite (11 cases: traceability gaps/orphans, prose-plan noise guard, staleness via mtime, readiness blocked, contract schema/coverage/AC-linkage, persistence, JSON mode). Full suite green (3198 pass).
|
|
34
|
+
|
|
35
|
+
## [1.25.0] - 2026-06-11
|
|
36
|
+
|
|
37
|
+
### Added
|
|
38
|
+
- **Fresh-context review payload in `harness:validate`.** The generated `validator-prompt.txt` is now self-contained for isolated execution: it appends a review payload with the deterministic `harness:check` results (exit-code verdicts to copy verbatim), the changed-file list (including untracked, framework state filtered out), and the unified diff vs a resolved base ref — explicit `--base=<ref>`, the loop's `baseline.json` HEAD, merge-base with main/master, or `HEAD` as fallback. Diff is size-capped (`--max-diff-bytes`, default 200KB) with a line-boundary truncation marker; `--no-diff` skips the payload. Degrades gracefully outside a git repository (existing router flows untouched). New module `src/harness/review-payload.js`.
|
|
39
|
+
- **Fresh-context validation protocol.** `@validator` documents the generated prompt as its preferred activation surface — run in a fresh, isolated context (subagent/Task tool or separate session), never inline in the session that implemented the feature. `.aioson/docs/autopilot-handoff.md` post-dev cycle routes `@validator` through the isolated-subagent flow (check → validate → isolated run → re-validate to consume the verdict through the circuit breaker); `@qa`'s recommendation mentions the route. Template mirrors synced.
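The size cap on the embedded diff can be pictured with the sketch below. It assumes the cap is measured in UTF-8 bytes and that the cut falls on the last newline inside the cap; `capDiff` and the marker text are hypothetical, and the real marker in `review-payload.js` may read differently.

```javascript
// Cap a unified diff at maxBytes, cutting on a line boundary.
function capDiff(diff, maxBytes) {
  const buf = Buffer.from(diff, 'utf8');
  if (buf.length <= maxBytes) return { text: diff, truncated: false };
  // Find the last newline (0x0a) at or before the cap. The newline byte never
  // occurs inside a multi-byte UTF-8 sequence, so the cut is always safe.
  const cut = maxBytes > 0 ? buf.lastIndexOf(0x0a, maxBytes - 1) : -1;
  const kept = cut === -1 ? '' : buf.subarray(0, cut + 1).toString('utf8');
  // Marker text is an assumption made for this sketch.
  return { text: kept + '[diff truncated]\n', truncated: true };
}

const capped = capDiff('+aaa\n+bbb\n+ccc\n', 8);
console.log(JSON.stringify(capped));
```

With a cap of 8 bytes only the first complete line survives, followed by the marker, so a reader never sees half a diff line.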
|
|
40
|
+
|
|
41
|
+
### Changed
|
|
42
|
+
- **`harness:validate` next-steps guidance** now instructs running the prompt in a fresh isolated context, and the command result exposes a `reviewPayload` summary (base, changed-file count, truncation, checks included). The `waiting_validation`/`apply-validation` state machine is unchanged.
|
|
43
|
+
- **Parser:** `--no-diff` registered as a pure boolean flag (mirrors `--no-index` precedent).
|
|
44
|
+
|
|
45
|
+
### Tests
|
|
46
|
+
- Added `review-payload` suite (10 cases: git fixtures for base resolution, untracked + framework-state filtering, truncation, check-summary embedding, `harness:validate` integration, `--no-diff`). Full suite green (3187 pass).
|
|
47
|
+
|
|
48
|
+
## [1.24.0] - 2026-06-11
|
|
49
|
+
|
|
50
|
+
### Added
|
|
51
|
+
- **`harness:check` — standalone deterministic runner for `criteria[].verification`.** `aioson harness:check [path] --slug=<slug> [--criteria=C1,C2] [--timeout=<ms>] [--json]` executes the contract's executable checks outside `self:loop`, reusing the existing `runCriteria`/`executeInSandbox` stack (timeouts, process-tree kill, credential redaction, failure signatures). Read-only over `progress.json` — circuit/breaker state mutation remains exclusive to the `harness:validate`/`apply-validation` cycle. Persists the report to `.aioson/plans/{slug}/last-check-output.json` (mirroring `last-validator-output.json`), emits `criteria_check_failed` telemetry best-effort, auto-discovers the active contract when `--slug` is omitted, and supports criterion-subset runs via `--criteria`.
|
|
52
|
+
- **`verification` is now a first-class authored field.** The canonical contract doc (`.aioson/docs/sheldon/harness-contract.md` + template mirror) documents `criteria[].verification` with authoring rules (exit 0 = pass, deterministic, cross-platform, prefer the project test runner); `@sheldon` RF-05 instructs producing it for every mechanically checkable `binary: true` criterion. Legacy contracts without the field remain fully valid.
|
|
53
|
+
- **Executable-coverage warning in contract schema validation.** `validateContract` now emits an advisory warning (never an error) for each `binary: true` criterion lacking a `verification` command, surfacing verification debt at `harness:init`/preflight without breaking any existing contract.
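The "exit 0 = pass" rule for `criteria[].verification` can be sketched as follows. This is not the `runCriteria`/`executeInSandbox` stack (no sandbox, redaction, or process-tree kill here); `runVerification` and the `exit_code` result field are hypothetical names used only to show the mapping from exit code to verdict.

```javascript
const { spawnSync } = require('node:child_process');

// Run one criterion's verification command through the shell.
// The verdict is derived from the exit code alone: 0 means pass.
function runVerification(criterion, timeoutMs = 60000) {
  const res = spawnSync(criterion.verification, {
    shell: true,
    timeout: timeoutMs,
    encoding: 'utf8',
  });
  // status is null when the process was killed, for example on timeout,
  // which therefore also counts as a failure.
  return { id: criterion.id, exit_code: res.status, passed: res.status === 0 };
}

const contract = {
  criteria: [
    { id: 'C1', binary: true, verification: 'exit 0' },
    { id: 'C2', binary: true, verification: 'exit 3' },
    { id: 'C3', binary: true }, // no verification: left to the validator's judgment
  ],
};

const checkResults = contract.criteria
  .filter((c) => typeof c.verification === 'string')
  .map((c) => runVerification(c));
console.log(checkResults);
```

In a real contract the commands would typically call the project test runner, as the authoring rules above recommend.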
|
|
54
|
+
|
|
55
|
+
### Changed
|
|
56
|
+
- **`@validator` consumes deterministic checks first.** Step 2 of the validator protocol now runs `aioson harness:check . --slug={slug} --json` and copies each executable check's exit-code verdict verbatim into `results[].passed`; LLM judgment is reserved for criteria without `verification`. Output JSON schema unchanged — `harness:apply-validation` and the circuit-breaker cycle are untouched. `@qa`'s validator recommendation and `@dev`'s implementation strategy mention the new command (template mirrors synced).
|
|
57
|
+
|
|
58
|
+
### Tests
|
|
59
|
+
- Added `harness-check` suite (10 cases: pass/fail/signature, progress.json immutability, subset filter, unknown ids, active-contract auto-discovery, JSON mode, schema rejection) and coverage-warning cases in `harness-contract-schema`. Full suite green (3178 pass).
|
|
60
|
+
|
|
5
61
|
## [1.23.0] - 2026-06-10
|
|
6
62
|
|
|
7
63
|
### Added
|
|
package/docs/en/4-agents/README.md
@@ -1,12 +1,12 @@
|
|
|
1
1
|
# Agent cards — AIOSON (EN)
|
|
2
2
|
|
|
3
|
-
This layer will hold one card per AIOSON agent (29 total), translated from [`docs/pt/4-agentes/`](../../pt/4-agentes/README.md).
|
|
3
|
+
This layer will hold one card per AIOSON agent (29 total), translated from [`docs/pt/4-agentes/`](../../pt/4-agentes/README.md).
|
|
4
4
|
|
|
5
5
|
Cards are being translated progressively. Until a card is available here, the PT version is the canonical reference — it follows the same format and covers the same agents.
|
|
6
6
|
|
|
7
7
|
---
|
|
8
8
|
|
|
9
|
-
## The 29 agents (plus @pair alias)
|
|
9
|
+
## The 29 agents (plus @pair alias)
|
|
10
10
|
|
|
11
11
|
### Workflow core (pipeline order)
|
|
12
12
|
|
|
@@ -14,17 +14,18 @@ Cards are being translated progressively. Until a card is available here, the PT
|
|
|
14
14
|
|---|---|
|
|
15
15
|
| `@setup` | Project onboarding — detect stack, classify MICRO/SMALL/MEDIUM, write `project.context.md` |
|
|
16
16
|
| `@briefing` | Pre-PRD framing — turn `plans/` sketches into structured briefings with gap analysis |
|
|
17
|
-
| `@product` | PRD — vision, problem, users, scope, acceptance criteria |
|
|
18
|
-
| `@sheldon` | PRD quality guardian — gap detection, web research, sizing, in-place enrichment or phased plan |
|
|
19
|
-
| `@analyst` | Domain discovery — entities, flows, brownfield mapping |
|
|
20
|
-
| `@scope-check` | Alignment gate before implementation — validates intent vs plan and catches scope drift |
|
|
21
|
-
| `@architect` | Technical decisions — structure, libraries, integration boundaries |
|
|
17
|
+
| `@product` | PRD — vision, problem, users, scope, acceptance criteria |
|
|
18
|
+
| `@sheldon` | PRD quality guardian — gap detection, web research, sizing, in-place enrichment or phased plan |
|
|
19
|
+
| `@analyst` | Domain discovery — entities, flows, brownfield mapping |
|
|
20
|
+
| `@scope-check` | Alignment gate before implementation — validates intent vs plan and catches scope drift |
|
|
21
|
+
| `@architect` | Technical decisions — structure, libraries, integration boundaries |
|
|
22
22
|
| `@ux-ui` | Design system and UI component specs (MEDIUM) |
|
|
23
23
|
| `@pm` | Backlog and user stories (MEDIUM) |
|
|
24
24
|
| `@orchestrator` | Parallel lane coordination (MEDIUM) |
|
|
25
25
|
| `@dev` | Feature implementation — any stack |
|
|
26
26
|
| `@qa` | Risk-first review, test generation, autonomous fix/test loop |
|
|
27
27
|
| `@validator` | Binary contract verification against `harness-contract.json` |
|
|
28
|
+
| [`@forge-run`](./forge-run.md) | Lane B (opt-in) — compile a MEDIUM feature's specs into an executable workflow and run it (`forge:compile`) |
|
|
28
29
|
| `@tester` | Systematic test engineering — legacy and coverage gaps |
|
|
29
30
|
| `@pentester` | Adversarial security review — OWASP Top 10, LLM Top 10, supply chain |
|
|
30
31
|
|
|
@@ -50,8 +51,10 @@ Cards are being translated progressively. Until a card is available here, the PT
|
|
|
50
51
|
| `@design-hybrid-forge` | Combine two design skills into a hybrid |
|
|
51
52
|
| `@orache` | Domain investigation and strategic research |
|
|
52
53
|
| `@copywriter` | Conversion copy — landing pages, VSL scripts |
|
|
53
|
-
| [`@discovery-design-doc`](./discovery-design-doc.md) | Discovery, readiness, and design doc package |
|
|
54
|
+
| [`@discovery-design-doc`](./discovery-design-doc.md) | Discovery, readiness, and design doc package |
|
|
54
55
|
|
|
55
56
|
---
|
|
56
57
|
|
|
58
|
+
For the executable-verification theme that `@forge-run`, `@validator`, `@scope-check`, `@sheldon`, and `@pm` participate in, see [Executable verification](../5-reference/executable-verification.md).
|
|
59
|
+
|
|
57
60
|
Full PT cards with dialogue examples, disk outputs, and handoff maps: [`docs/pt/4-agentes/`](../../pt/4-agentes/README.md)
|
|
package/docs/en/4-agents/forge-run.md
@@ -0,0 +1,165 @@
|
|
|
1
|
+
# @forge-run - Compile and run the Lane B verification workflow
|
|
2
|
+
|
|
3
|
+
> **For whom:** people with a MEDIUM feature that has a binary contract and a wave-based plan, who want to run the entire executable-verification cycle as a single compiled workflow.
|
|
4
|
+
> **Reading time:** 4 min.
|
|
5
|
+
> **What you will learn:**
|
|
6
|
+
> - What Lane B is and why it is **opt-in** and **additive**
|
|
7
|
+
> - The protocol: `forge:compile` → review with you → execute → report
|
|
8
|
+
> - Why `@forge-run` never weakens a verification check and never closes the feature
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## What it is for
|
|
13
|
+
|
|
14
|
+
The default executable-verification lane (`@scope-check` → `@dev` → `@qa` → `@validator`) is **unchanged** and remains the recommended path. `@forge-run` is a **second lane (Lane B), optional and additive**: it compiles a MEDIUM feature's artifacts into a single **versionable workflow** and runs the whole deterministic-verification cycle end to end.
|
|
15
|
+
|
|
16
|
+
Instead of advancing stage by stage manually, `@forge-run` generates `.aioson/plans/{slug}/forge-run.workflow.js` — a Claude Code dynamic-workflow script — and executes it through the workflow runtime (never hand-emulated). The compiled structure mirrors the executable-verification roadmap:
|
|
17
|
+
|
|
18
|
+
- **`parallel()` per Wave** — same-wave phases are file-disjoint and run in parallel (see the `Wave` column produced by `@pm`).
|
|
19
|
+
- **Deterministic loop over `harness:check`** — bounded by the governor's `error_streak_limit`; fixes are sequential (only waves prove disjointness).
|
|
20
|
+
- **3-lens adversarial review** — for binary criteria that lack `verification` and therefore cannot be checked mechanically.
|
|
21
|
+
- **Fresh-context validator** — closes through the `harness:validate` → `apply-validation` cycle (see `@validator`).
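The bounded loop over `harness:check` might look roughly like the sketch below. It is a simplification under assumptions: `check` and `fix` are injected stand-ins for the real check run and fix agent, and the streak is counted as consecutive failing check rounds, which may not match the governor's exact definition of `error_streak_limit`.

```javascript
// check(): returns the ids of criteria that currently fail.
// fix(id): attempts one fix for a single criterion.
function convergenceLoop(check, fix, errorStreakLimit) {
  let streak = 0;
  let rounds = 0;
  for (;;) {
    const failing = check();
    rounds += 1;
    if (failing.length === 0) return { converged: true, rounds };
    streak += 1; // one more consecutive failing round
    if (streak >= errorStreakLimit) return { converged: false, rounds, failing };
    // Fixes run one at a time: criteria do not prove file-disjointness.
    for (const id of failing) fix(id);
  }
}

// Toy state: C1 needs two fix attempts, C2 already passes.
const remaining = { C1: 2, C2: 0 };
const outcome = convergenceLoop(
  () => Object.keys(remaining).filter((id) => remaining[id] > 0),
  (id) => { remaining[id] -= 1; },
  3
);
console.log(outcome);
```

With a limit of 3 the toy run converges on the third check; with a limit of 2 it would stop early and report `C1` as still failing.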
|
|
22
|
+
|
|
23
|
+
**Hard rule:** one feature per run. `@forge-run` never runs `feature:close` and never publishes.
|
|
24
|
+
|
|
25
|
+
---
|
|
26
|
+
|
|
27
|
+
## When to invoke
|
|
28
|
+
|
|
29
|
+
- A **MEDIUM** feature whose `harness-contract.json` carries `verification` per criterion (authored by `@sheldon`).
|
|
30
|
+
- An implementation plan with the `Wave` column filled in (produced by `@pm`).
|
|
31
|
+
- A clean `aioson spec:analyze` (no `errors`) — the execution-gate precondition.
|
|
32
|
+
- When you want to run the whole executable-verification cycle as a single reproducible, versionable workflow.
|
|
33
|
+
|
|
34
|
+
---
|
|
35
|
+
|
|
36
|
+
## When not to invoke
|
|
37
|
+
|
|
38
|
+
- MICRO or SMALL features → use the default lane (`@dev` → `@qa` → `@validator`).
|
|
39
|
+
- A contract without executable criteria, or a plan without a `Wave` column → the hard preflights refuse to compile.
|
|
40
|
+
- `spec:analyze` reporting `errors` or `wave_file_overlap` → resolve those first; Lane B does not compile over drift.
|
|
41
|
+
- You only want to validate, not execute → use `@validator` directly.
|
|
42
|
+
|
|
43
|
+
---
|
|
44
|
+
|
|
45
|
+
## Required inputs
|
|
46
|
+
|
|
47
|
+
Lane B only compiles when every prerequisite is present — otherwise the hard preflights refuse:
|
|
48
|
+
|
|
49
|
+
- `.aioson/plans/{slug}/harness-contract.json` — a **valid** contract with at least one executable criterion (`verification` present).
|
|
50
|
+
- `implementation-plan-{slug}.md` — a plan with the `Wave` column filled in.
|
|
51
|
+
- `spec-analyze-{slug}.json` with no `errors` and no `wave_file_overlap`.
|
|
52
|
+
|
|
53
|
+
---
|
|
54
|
+
|
|
55
|
+
## Where it fits in the workflow
|
|
56
|
+
|
|
57
|
+
Lane B is an opt-in branch off the standard MEDIUM flow, taken after the spec and plan are ready:
|
|
58
|
+
|
|
59
|
+
```text
|
|
60
|
+
@product -> @analyst -> @architect -> @sheldon (authors verification) ->
|
|
61
|
+
@pm (Wave column) -> @scope-check (spec:analyze clean) -> @forge-run (Lane B)
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
It never replaces the default lane — it compiles the same artifacts into a single executable workflow. (`@pm`, `@sheldon`, and `@validator` cards are tracked in the [agents index](./README.md); PT is canonical until each is translated.)
|
|
65
|
+
|
|
66
|
+
---
|
|
67
|
+
|
|
68
|
+
## Typical dialogue
|
|
69
|
+
|
|
70
|
+
```text
|
|
71
|
+
You > @forge-run
|
|
72
|
+
|
|
73
|
+
@forge-run > Feature: checkout-stripe (MEDIUM)
|
|
74
|
+
|
|
75
|
+
Lane B preflight:
|
|
76
|
+
- harness-contract.json valid ✓
|
|
77
|
+
- 5 criteria, 4 with verification (executable) ✓
|
|
78
|
+
- implementation-plan with Wave column ✓
|
|
79
|
+
- spec:analyze → ok, no errors, no wave_file_overlap ✓
|
|
80
|
+
|
|
81
|
+
$ aioson forge:compile . --feature=checkout-stripe
|
|
82
|
+
Compiled: .aioson/plans/checkout-stripe/forge-run.workflow.js
|
|
83
|
+
|
|
84
|
+
Workflow structure:
|
|
85
|
+
- Wave 1 (parallel): backend + migrations
|
|
86
|
+
- Wave 2: frontend
|
|
87
|
+
- harness:check loop (error_streak limit: 3, sequential fixes)
|
|
88
|
+
- 3-lens adversarial review for C5 (no verification)
|
|
89
|
+
- Fresh-context validator → apply-validation
|
|
90
|
+
|
|
91
|
+
⚠ Cost: this workflow runs multiple LLM executions.
|
|
92
|
+
Review the script before executing. Approve execution? (y/n)
|
|
93
|
+
|
|
94
|
+
You > y
|
|
95
|
+
|
|
96
|
+
@forge-run > Executing via the workflow runtime...
|
|
97
|
+
Wave 1 ✓ Wave 2 ✓
|
|
98
|
+
harness:check → 4/4 executable criteria PASS
|
|
99
|
+
Adversarial review C5 → PASS
|
|
100
|
+
Fresh-context validator → overall_score: 1
|
|
101
|
+
|
|
102
|
+
RESULT: PASS
|
|
103
|
+
Recommendation: run `aioson feature:close` manually.
|
|
104
|
+
```
|
|
105
|
+
|
|
106
|
+
---
|
|
107
|
+
|
|
108
|
+
## Disk outputs
|
|
109
|
+
|
|
110
|
+
| File | Content |
|
|
111
|
+
|---|---|
|
|
112
|
+
| `.aioson/plans/{slug}/forge-run.workflow.js` | Compiled, versionable workflow (Claude Code dynamic workflow) |
|
|
113
|
+
| `.aioson/plans/{slug}/last-check-output.json` | Last `harness:check` result consumed by the loop |
|
|
114
|
+
| `.aioson/plans/{slug}/last-validator-output.json` | Fresh-context validator verdict |
|
|
115
|
+
| `.aioson/plans/{slug}/progress.json` | Post-execution state (`circuit_state`, `ready_for_done_gate`) |
|
|
116
|
+
|
|
117
|
+
The generated code is deterministic by construction: pure-literal metadata, no `Date.now`/`Math.random`/`new Date`, artifact text always via `JSON.stringify`, and it **never** invokes `feature:close`.
|
|
118
|
+
|
|
119
|
+
---
|
|
120
|
+
|
|
121
|
+
## How it reads your project
|
|
122
|
+
|
|
123
|
+
- `.aioson/plans/{slug}/harness-contract.json` — the contract and criteria with `verification`
|
|
124
|
+
- `.aioson/context/implementation-plan-{slug}.md` — the plan with the `Wave` column
|
|
125
|
+
- `.aioson/context/spec-analyze-{slug}.json` — cross-artifact consistency (execution gate)
|
|
126
|
+
- `.aioson/plans/{slug}/progress.json` — state and the governor's `error_streak_limit`
|
|
127
|
+
|
|
128
|
+
---
|
|
129
|
+
|
|
130
|
+
## Related CLI commands
|
|
131
|
+
|
|
132
|
+
```bash
|
|
133
|
+
# Compile the feature's artifacts into the Lane B workflow
|
|
134
|
+
aioson forge:compile . --feature={slug}
|
|
135
|
+
|
|
136
|
+
# Parseable output
|
|
137
|
+
aioson forge:compile . --feature={slug} --json
|
|
138
|
+
```
|
|
139
|
+
|
|
140
|
+
See [forge:compile in the CLI reference](../5-reference/cli-reference.md) and [Executable verification](../5-reference/executable-verification.md) for the full theme.
|
|
141
|
+
|
|
142
|
+
---
|
|
143
|
+
|
|
144
|
+
## Hard rules
|
|
145
|
+
|
|
146
|
+
- **Never** bypass a failed preflight (invalid contract, zero executable criteria, plan without a `Wave` column, `spec:analyze` `errors` or `wave_file_overlap`).
|
|
147
|
+
- **Never** weaken or remove a `verification` check to make a criterion pass.
|
|
148
|
+
- **Never** run `feature:close` or publish — that is always a human decision.
|
|
149
|
+
- One feature per run.
|
|
150
|
+
|
|
151
|
+
---
|
|
152
|
+
|
|
153
|
+
## Typical handoff
|
|
154
|
+
|
|
155
|
+
- **Comes from:** opt-in entry by the user (Lane B); assumes `@sheldon`, `@pm`, and `@scope-check`/`spec:analyze` are already complete.
|
|
156
|
+
- **PASS:** recommends the **human** run `aioson feature:close` manually.
|
|
157
|
+
- **FAIL:** routes back to `@dev` via the **normal lane** to fix and re-verify.
|
|
158
|
+
|
|
159
|
+
---
|
|
160
|
+
|
|
161
|
+
## Next step
|
|
162
|
+
|
|
163
|
+
- `@pm` — produces the `Wave` column that becomes `parallel()` in the workflow (card in the [agents index](./README.md))
|
|
164
|
+
- `@validator` — the fresh-context validator that closes the cycle (card in the [agents index](./README.md))
|
|
165
|
+
- [Executable verification](../5-reference/executable-verification.md) — the full theme (Phases 1–5)
|
|
package/docs/en/5-reference/README.md
@@ -14,6 +14,7 @@ This layer currently holds the original EN feature guides. Additional reference
|
|
|
14
14
|
|---|---|
|
|
15
15
|
| [cli-reference.md](./cli-reference.md) | Full reference for every CLI command |
|
|
16
16
|
| [json-schemas.md](./json-schemas.md) | `--json` output contracts for all commands |
|
|
17
|
+
| [executable-verification.md](./executable-verification.md) | The executable-verification theme: `verification` + `harness:check`, fresh-context validator, `spec:analyze`, Wave markers, Lane B (`forge:compile` + `@forge-run`) |
|
|
17
18
|
|
|
18
19
|
---
|
|
19
20
|
|
|
package/docs/en/5-reference/cli-reference.md
@@ -663,88 +663,202 @@ aioson scout:commit --input=<path> --json
|
|
|
663
663
|
**Exit codes:** 0 = committed (or no-op); 1 = file not found, lock failure.
|
|
664
664
|
|
|
665
665
|
See [Sub-task Scout — CLI reference](../deyvin-subtask-scout/cli-commands.md) for full details.
|
|
666-750
|
-
[Removed lines 666-750 of cli-reference.md (the previous `harness:` command sections); their text is not recoverable from this extraction.]
|
|
666
|
+
|
|
667
|
+
---
|
|
668
|
+
|
|
669
|
+
## harness:check
|
|
670
|
+
|
|
671
|
+
Run the `criteria[].verification` shell commands from `harness-contract.json` deterministically — **outside** the self:loop and **read-only** over `progress.json`. Each criterion's command runs in the sandbox; exit code `0` = pass. Reuses the loop's `runCriteria`/`executeInSandbox` machinery (timeouts, process-tree kill, credential redaction, failure signatures). Persists `last-check-output.json` and emits `criteria_check_failed` telemetry on failure.
|
|
672
|
+
|
|
673
|
+
```bash
|
|
674
|
+
# Run every verifiable criterion of the `checkout` contract (omit --slug to auto-discover the active contract)
|
|
675
|
+
aioson harness:check . --slug=checkout
|
|
676
|
+
|
|
677
|
+
# Run only a subset of criteria
|
|
678
|
+
aioson harness:check . --slug=checkout --criteria=C1,C3
|
|
679
|
+
|
|
680
|
+
# Custom timeout and JSON output (exit 0 = pass)
|
|
681
|
+
aioson harness:check . --slug=checkout --timeout=120000 --json
|
|
682
|
+
```
|
|
683
|
+
|
|
684
|
+
**Options:**
|
|
685
|
+
- `--slug=<feature>` — feature slug matching the harness contract. If omitted, the active contract is auto-discovered.
|
|
686
|
+
- `--criteria=C1,C2` — run only the listed criteria instead of all verifiable ones.
|
|
687
|
+
- `--timeout=<ms>` — per-criterion timeout override.
|
|
688
|
+
- `--json` — structured output; exit code propagated.
|
|
689
|
+
|
|
690
|
+
**What it does:** the `verification` field is authored per criterion by `@sheldon` for every mechanically-checkable `binary: true` criterion (prefer the project test runner; deterministic; cross-platform; exit 0 = pass). `harness:check` is the standalone deterministic verification of those criteria — it never touches the circuit-breaker state (that stays exclusive to `harness:validate`/`apply-validation`). Legacy contracts without `verification` remain valid; `validateContract` only emits an advisory **warning** for `binary: true` criteria lacking it. `@validator` runs `harness:check` first and copies the exit-code verdicts verbatim into `results[].passed`, LLM-judging only the criteria without `verification`.
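How `@validator` might combine the two sources of verdicts can be sketched like this. The function name, the `judge` callback, and the `source` field are hypothetical; only `results[].passed` and the copy-verbatim rule come from the text above.

```javascript
// criteria: contract criteria; checkResults: [{ id, passed }] from harness:check.
// judge(criterion): stand-in for the validator's own judgment (assumed callback).
function buildValidatorResults(criteria, checkResults, judge) {
  const byId = new Map(checkResults.map((r) => [r.id, r.passed]));
  return criteria.map((c) => {
    if (c.verification && byId.has(c.id)) {
      // Deterministic verdict: copied as-is, never re-judged.
      return { id: c.id, passed: byId.get(c.id), source: 'harness:check' };
    }
    // No executable check (or no result for it): fall back to judgment.
    return { id: c.id, passed: judge(c), source: 'judgment' };
  });
}

const results = buildValidatorResults(
  [
    { id: 'C1', binary: true, verification: 'npm test' },
    { id: 'C5', binary: true },
  ],
  [{ id: 'C1', passed: false }],
  () => true
);
console.log(results);
```

Note that `C1` stays failed even though the judgment callback would have passed it; the exit-code verdict wins.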
|
|
691
|
+
|
|
692
|
+
See [Executable verification](./executable-verification.md) for the full theme.
|
|
693
|
+
|
|
694
|
+
---
|
|
695
|
+
|
|
696
|
+
## harness:validate
|
|
697
|
+
|
|
698
|
+
Generate the `validator-prompt.txt` for the binary success contract and append a self-contained **review payload** so the validator can judge in a fresh, isolated context. Consumes the verdict back through the circuit breaker.
|
|
699
|
+
|
|
700
|
+
```bash
|
|
701
|
+
# Generate the validator prompt with review payload
|
|
702
|
+
aioson harness:validate . --slug=checkout
|
|
703
|
+
|
|
704
|
+
# Resolve the diff against an explicit base
|
|
705
|
+
aioson harness:validate . --slug=checkout --base=main
|
|
706
|
+
|
|
707
|
+
# Skip the diff (review payload still includes check results + changed files)
|
|
708
|
+
aioson harness:validate . --slug=checkout --no-diff
|
|
709
|
+
|
|
710
|
+
# Cap the embedded diff size
|
|
711
|
+
aioson harness:validate . --slug=checkout --max-diff-bytes=200000
|
|
712
|
+
```
|
|
713
|
+
|
|
714
|
+
**Options:**
|
|
715
|
+
- `--slug=<feature>` — **required**. Feature slug matching the harness contract.
|
|
716
|
+
- `--base=<ref>` — git ref to diff against. Resolution order: `--base` > `baseline.json` head > merge-base with `main`/`master` > `HEAD`.
|
|
717
|
+
- `--no-diff` — pure boolean flag; omit the unified diff from the review payload.
|
|
718
|
+
- `--max-diff-bytes=<n>` — cap the embedded diff (default `200000`); truncation happens on a line boundary.
|
|
719
|
+
|
|
720
|
+
**What it does:** the review payload (built by `src/harness/review-payload.js`) contains (a) the `harness:check` results, (b) the changed-file list (untracked files included, `.aioson/**` framework state filtered out), and (c) a unified diff against the resolved base. It degrades gracefully outside a git repo. The protocol is that `@validator` runs in a **fresh, isolated context** (a subagent / Task tool, or a separate session) — never inline in the implementing session, because implementation history biases the verdict. Typical flow: `harness:check` → `harness:validate` → isolated subagent run → re-run `harness:validate` to consume the verdict through the circuit breaker.
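The base-ref precedence and the framework-state filter can be sketched as two pure functions. Both names and the input shape are hypothetical; the git lookups (baseline HEAD, merge-base) are assumed to have been resolved elsewhere and are passed in as plain strings.

```javascript
// Precedence: --base > baseline.json HEAD > merge-base with main/master > HEAD.
function resolveBase({ explicitBase, baselineHead, mergeBase } = {}) {
  if (explicitBase) return { ref: explicitBase, source: '--base' };
  if (baselineHead) return { ref: baselineHead, source: 'baseline.json' };
  if (mergeBase) return { ref: mergeBase, source: 'merge-base' };
  return { ref: 'HEAD', source: 'fallback' };
}

// Drop framework state under .aioson/ from the changed-file list.
function dropFrameworkState(files) {
  return files.filter((f) => !f.replace(/\\/g, '/').startsWith('.aioson/'));
}

console.log(resolveBase({ baselineHead: 'abc123', mergeBase: 'def456' }));
console.log(dropFrameworkState(['src/cart.js', '.aioson/plans/checkout/progress.json']));
```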

See [Executable verification](./executable-verification.md) for the full theme.

---

## harness:approve

Approve a pending human gate in the self:loop (loop guardrails). Persists the decision (who, when) to `.aioson/plans/{slug}/gates/{id}.json` and resumes the loop.

```bash
aioson harness:approve . --slug=<feature> --gate=<gate-id>
aioson harness:approve . --slug=checkout --gate=database_destructive_change-1
```

**Options:**
- `--slug=<feature>` — **required**. Feature slug matching the harness contract.
- `--gate=<id>` — **required**. Gate id shown in `harness:status` output.
- `--by=<name>` — override the "decided by" field (defaults to `git config user.name`).

**Idempotent:** re-approving an already-decided gate is a no-op with a warning.
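
A minimal sketch of what an idempotent gate decision could look like. Only the file path pattern comes from the text above. The record fields (`decision`, `by`, `at`), the function name `decideGate`, and the assumption that an existing file means "already decided" are hypothetical and may not match the real gate file schema.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical sketch of an idempotent gate decision.
// ASSUMPTION: a gate file that already exists means the gate was decided.
function decideGate(projectDir, slug, gateId, decision, by) {
  const file = path.join(projectDir, ".aioson", "plans", slug, "gates", `${gateId}.json`);
  if (fs.existsSync(file)) {
    // Already decided: leave the file untouched and surface a warning.
    const record = JSON.parse(fs.readFileSync(file, "utf8"));
    return { changed: false, warning: `gate ${gateId} is already decided`, record };
  }
  // Record who decided and when (field names are assumptions).
  const record = { decision, by, at: new Date().toISOString() };
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, JSON.stringify(record, null, 2) + "\n");
  return { changed: true, record };
}
```

Calling `decideGate` twice with the same gate id writes the file once; the second call returns the stored record plus a warning instead of overwriting it.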

---

## harness:reject

Reject a pending human gate. Ends the current loop attempt with a summary. Requires `--reason`.

```bash
aioson harness:reject . --slug=<feature> --gate=<gate-id> --reason="needs revert"
```

**Options:**
- `--slug`, `--gate` — same as `harness:approve`.
- `--reason=<text>` — **required** on reject. Recorded in the gate decision file.

---

## harness:status

Human-readable view of the current loop state for a feature.

```bash
aioson harness:status . --slug=<feature>
aioson harness:status . --slug=checkout --json
```

**Shows:** circuit state (open/closed), current iteration / max, estimated token budget (used/ceiling), last-attempt checks (passed/failed), last failure signature, pending human gates, and recommended next action.

**Options:**
- `--slug=<feature>` — **required**.
- `--json` — structured output.

---

## harness:retro

Deterministically mine the failure trail of a feature and materialize a retrospective dossier at `.aioson/context/retro/{slug}.md`. LLM-free, network-free. Source files are never modified.

```bash
aioson harness:retro . --feature=<slug>
aioson harness:retro . --last=<N> # last N features by PASS date
aioson harness:retro . --feature=checkout --json
```

**Options:**
- `--feature=<slug>` — mine a specific feature (mutually exclusive with `--last`).
- `--last=<N>` — mine the N most recently completed features.
- `--json` — structured output; exit codes are propagated.
- `--locale=<l>` — output locale (default: project `interaction_language`).

**Exit codes:** 0 = success (including empty dossier); 1 = unexpected I/O error; 12 = input error (invalid slug, conflicting flags, feature not found).
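
For scripts that wrap `harness:retro`, the exit codes and the `--feature`/`--last` exclusivity above could be handled roughly like this. The helper names `describeRetroExit` and `buildRetroArgs` are hypothetical and are not part of the package.

```javascript
// Hypothetical helpers for a wrapper script; not part of aioson itself.
const RETRO_EXIT = {
  0: "success (the dossier may be empty)",
  1: "unexpected I/O error",
  12: "input error (invalid slug, conflicting flags, or feature not found)",
};

// Translate a documented exit code into a readable message.
function describeRetroExit(code) {
  return RETRO_EXIT[code] ?? `undocumented exit code ${code}`;
}

// Build the argument list, rejecting the documented flag conflict early.
function buildRetroArgs({ feature, last, json = false } = {}) {
  if (feature !== undefined && last !== undefined) {
    throw new Error("--feature and --last are mutually exclusive");
  }
  const args = ["harness:retro", "."];
  if (feature !== undefined) args.push(`--feature=${feature}`);
  if (last !== undefined) args.push(`--last=${last}`);
  if (json) args.push("--json");
  return args;
}
```

A wrapper could pass the result of `buildRetroArgs` to Node's `child_process.spawnSync("aioson", args)` and feed the returned `status` to `describeRetroExit`.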

**Sources mined:** QA reports, correction plans, dossier FAIL→PASS cycles, execution events, attempt artifacts, failure signatures, devlogs.

---

## harness:preview

Display a truncated, UTF-8-safe preview of an artifact file. Used in self:loop criteria-fail feedback to avoid dumping full file contents into the agent context.

```bash
aioson harness:preview <file>
aioson harness:preview .aioson/context/retro/checkout.md
```

Read-only with respect to the source file; the preview artifact itself is written on a best-effort basis.
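
"UTF-8-safe" truncation means the cut never lands inside a multi-byte character. A minimal sketch of that idea, assuming a byte limit is passed in; the function name `previewUtf8` is hypothetical and the real preview limit is not stated here.

```javascript
// Hypothetical sketch: cut text to at most maxBytes (UTF-8) without
// splitting a multi-byte character.
function previewUtf8(text, maxBytes) {
  const buf = Buffer.from(text, "utf8");
  if (buf.length <= maxBytes) return text;
  let end = Math.max(0, maxBytes);
  // Continuation bytes look like 10xxxxxx. If the byte at the cut point is
  // one, the cut would split a character, so back up to that character's start.
  while (end > 0 && (buf[end] & 0xc0) === 0x80) end--;
  return buf.subarray(0, end).toString("utf8");
}
```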

---

## spec:analyze

The **content** sibling of `artifact:validate` (which checks chain **presence** — unchanged). Runs deterministic cross-artifact consistency checks before the execution gate. Persists `spec-analyze-{slug}.json`.

```bash
# Analyze cross-artifact consistency for a feature
aioson spec:analyze . --feature=checkout

# JSON output for gate scripting (errors → exit 1)
aioson spec:analyze . --feature=checkout --json
```

**Options:**
- `--feature=<slug>` — **required**. Feature slug.
- `--json` — structured output; `error` findings flip `ok: false` (exit 1).

**What it does:** runs five deterministic checks across the feature's artifacts:
1. **REQ/AC ID traceability** — declared-but-unreferenced IDs = coverage-gap warning; referenced-but-undeclared IDs = orphan/drift warning (noise-guarded for prose plans).
2. **Staleness** — an upstream artifact modified after a downstream one = warning (60s tolerance; the project-global `architecture.md` is excluded).
3. **Readiness** — `blocked` = error; `ready_with_warnings` = info.
4. **Harness-contract sanity** — schema errors = error; executable-coverage = info.
5. **AC→contract linkage** = info.
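
The first two checks in the list above reduce to set differences and a timestamp comparison. The sketch below illustrates that logic only. The finding shape (`check`, `severity`, `kind`) and all function names are assumptions, not the format written to `spec-analyze-{slug}.json`.

```javascript
// Hypothetical sketch of checks 1 and 2; names and finding shape are assumptions.

// Check 1: compare declared REQ/AC IDs with the IDs referenced downstream.
function traceability(declaredIds, referencedIds) {
  const declared = new Set(declaredIds);
  const referenced = new Set(referencedIds);
  const findings = [];
  for (const id of declared) {
    if (!referenced.has(id)) findings.push({ check: "traceability", severity: "warning", kind: "coverage-gap", id });
  }
  for (const id of referenced) {
    if (!declared.has(id)) findings.push({ check: "traceability", severity: "warning", kind: "orphan", id });
  }
  return findings;
}

// Check 2: warn when upstream changed after downstream, beyond the 60s tolerance.
const TOLERANCE_MS = 60000;
function staleness(upstream, downstream) {
  if (upstream.mtimeMs - downstream.mtimeMs > TOLERANCE_MS) {
    return [{ check: "staleness", severity: "warning", upstream: upstream.name, downstream: downstream.name }];
  }
  return [];
}

// Only findings with severity "error" flip ok to false.
function summarize(findings) {
  return { ok: !findings.some((f) => f.severity === "error"), findings };
}
```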

An `error` flips `ok: false` (exit 1 in `--json`). `@scope-check` runs `spec:analyze` in preflight: errors are blockers, warnings are pre-computed drift evidence. When the plan carries a `Wave` column, it also runs the `wave_file_overlap` check (same-wave phases sharing Primary files = warning; plans without a `Wave` column skip it).
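
The `wave_file_overlap` rule can be pictured as a lookup keyed by wave and file. In this sketch the phase shape (`id`, `wave`, `primaryFiles`) and the finding fields are hypothetical; how the real check parses the plan table is not shown here.

```javascript
// Hypothetical sketch of the wave_file_overlap idea.
// phases: [{ id, wave, primaryFiles: [...] }]; wave is undefined when the
// plan has no Wave column, in which case nothing is reported.
function waveFileOverlap(phases) {
  const findings = [];
  const owners = new Map(); // "wave\u0000file" -> id of the first phase seen
  for (const phase of phases) {
    if (phase.wave === undefined) continue;
    for (const file of phase.primaryFiles) {
      const key = `${phase.wave}\u0000${file}`;
      const first = owners.get(key);
      if (first === undefined) {
        owners.set(key, phase.id);
      } else if (first !== phase.id) {
        // Two phases in the same wave claim the same Primary file.
        findings.push({ check: "wave_file_overlap", severity: "warning", wave: phase.wave, file, phases: [first, phase.id] });
      }
    }
  }
  return findings;
}
```

Files shared across different waves are not flagged, since waves run one after another.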

See [Executable verification](./executable-verification.md) for the full theme.

---

## forge:compile

**Lane B.** Compile a MEDIUM feature's artifacts into `.aioson/plans/{slug}/forge-run.workflow.js` — an auditable, versionable Claude Code dynamic-workflow script that is committed alongside the spec. Opt-in entry point is the `@forge-run` agent.

```bash
# Compile the feature into a forge-run.workflow.js
aioson forge:compile . --feature=checkout

# JSON output (hard preflights may refuse compilation)
aioson forge:compile . --feature=checkout --json
```

**Options:**
- `--feature=<slug>` — **required**. Feature slug.
- `--json` — structured output; refusals are reported with the owning agent named.

**What it does:** the generated workflow mirrors the executable-verification roadmap:
- one `parallel()` per **Wave** (file-disjoint dev agents; blocked-wave early stop),
- a deterministic `harness:check` convergence loop bounded by the governor's `error_streak_limit` (sequential fixes — only waves prove disjointness) plus a token-budget guard,
- a 3-lens adversarial review (correctness / completeness / regression-risk; majority survives; refute-by-default) for binary criteria **without** `verification`,
- a fresh-context validator stage closing through `harness:validate` → `last-validator-output.json` → `apply-validation`.
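
One possible reading of the 3-lens review rule ("majority survives; refute-by-default") is sketched below: a criterion passes only when at least two of the three lenses explicitly accept it, and a missing or unclear verdict counts as a refutation. This interpretation, the verdict strings, and the function name `reviewCriterion` are assumptions, not the generated workflow's actual code.

```javascript
// Hypothetical sketch of one reading of the 3-lens review rule.
const LENSES = ["correctness", "completeness", "regression-risk"];

// verdicts: e.g. { correctness: "accept", completeness: "refute" }
function reviewCriterion(verdicts) {
  // Refute-by-default: anything other than an explicit "accept" is a refutation.
  const accepts = LENSES.filter((lens) => verdicts[lens] === "accept").length;
  // Majority survives: more than half of the lenses must accept.
  return { accepts, passed: accepts > LENSES.length / 2 };
}
```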

**Hard preflights** refuse compilation and name the owning agent: invalid/missing contract, zero executable criteria, plan without a `Wave` column, `spec:analyze` errors, and `wave_file_overlap` (a *warning* in `spec:analyze`, an **error** here). Generated code is deterministic by construction: pure-literal metadata, plain JS, no `Date.now`/`Math.random`/`new Date`, artifact text via `JSON.stringify` (injection-safe). It **never** runs `feature:close`. New module: `src/harness/plan-waves.js`.
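
Why `JSON.stringify` makes embedded artifact text injection-safe: its output is always a single valid JavaScript string literal, so quotes, backticks, and newlines in the artifact cannot break out into code. A minimal sketch, with the hypothetical helper name `emitConst`; the real generator in `forge-compile.js` may be structured differently.

```javascript
// Hypothetical sketch: embed untrusted artifact text in generated JS.
// JSON.stringify escapes quotes, backslashes, and control characters, so the
// text stays inside one string literal. No clock or random source is used,
// so the same input yields byte-identical output.
function emitConst(name, artifactText) {
  return `const ${name} = ${JSON.stringify(artifactText)};\n`;
}
```

Emitting the same artifact twice gives identical source, which keeps the committed `forge-run.workflow.js` stable across recompiles as long as the inputs do not change.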

See [@forge-run](../4-agents/forge-run.md) and [Executable verification](./executable-verification.md).
