npm - @jaimevalasek/aioson - Versions diffs - 1.23.1 → 1.28.0 - Mend

@jaimevalasek/aioson 1.23.1 → 1.28.0


Files changed (52)

package/CHANGELOG.md +56 -0
package/docs/en/4-agents/README.md +11 -8
package/docs/en/4-agents/forge-run.md +165 -0
package/docs/en/5-reference/README.md +1 -0
package/docs/en/5-reference/cli-reference.md +199 -85
package/docs/en/5-reference/executable-verification.md +165 -0
package/docs/pt/4-agentes/README.md +2 -1
package/docs/pt/4-agentes/forge-run.md +150 -0
package/docs/pt/4-agentes/pm.md +8 -0
package/docs/pt/4-agentes/qa.md +2 -0
package/docs/pt/4-agentes/scope-check.md +19 -1
package/docs/pt/4-agentes/sheldon.md +2 -0
package/docs/pt/4-agentes/validator.md +20 -0
package/docs/pt/5-referencia/autopilot-handoff.md +33 -0
package/docs/pt/5-referencia/comandos-cli.md +64 -9
package/docs/pt/5-referencia/fluxo-artefatos.md +40 -15
package/docs/pt/5-referencia/loop-guardrails.md +19 -0
package/docs/pt/5-referencia/sdd-automation-scripts.md +130 -26
package/package.json +1 -1
package/src/cli.js +70 -54
package/src/commands/context-select.js +1 -0
package/src/commands/forge-compile.js +330 -0
package/src/commands/harness-check.js +159 -0
package/src/commands/harness.js +37 -2
package/src/commands/spec-analyze.js +324 -0
package/src/constants.js +118 -108
package/src/context-selector.js +28 -2
package/src/gateway-pointer-merge.js +25 -4
package/src/harness/contract-schema.js +8 -0
package/src/harness/plan-waves.js +77 -0
package/src/harness/review-payload.js +230 -0
package/src/i18n/messages/en.js +21 -15
package/src/i18n/messages/es.js +15 -13
package/src/i18n/messages/fr.js +15 -13
package/src/i18n/messages/pt-BR.js +21 -15
package/src/parser.js +3 -1
package/template/.aioson/agents/dev.md +67 -66
package/template/.aioson/agents/deyvin.md +79 -74
package/template/.aioson/agents/forge-run.md +57 -0
package/template/.aioson/agents/pm.md +51 -45
package/template/.aioson/agents/qa.md +22 -22
package/template/.aioson/agents/scope-check.md +49 -46
package/template/.aioson/agents/sheldon.md +1 -1
package/template/.aioson/agents/validator.md +16 -5
package/template/.aioson/docs/autopilot-handoff.md +34 -32
package/template/.aioson/docs/sheldon/harness-contract.md +19 -2
package/template/.aioson/skills/process/aioson-spec-driven/SKILL.md +9 -7
package/template/.aioson/skills/process/aioson-spec-driven/references/deyvin.md +19 -15
package/template/.claude/commands/aioson/agent/forge-run.md +17 -0
package/template/AGENTS.md +7 -5
package/template/CLAUDE.md +4 -3
package/template/OPENCODE.md +24 -22

package/template/.aioson/agents/pm.md CHANGED

@@ -5,14 +5,14 @@
 ## Mission
 Enrich the living PRD with prioritization, sequencing, and testable acceptance clarity without rewriting product intent.
-## Context loading modes
-Use two explicit modes. Planning should consolidate upstream decisions, not reload every source document forever.
-- **PLANNING** — inspect workflow status, project context, PRD/frontmatter, Gate B status, dossier, and `context:select` output. Do not load full `.aioson/rules/`, `.aioson/docs/`, `.aioson/design-docs/`, or historical memories.
-- **EXECUTING** — before writing `implementation-plan-{slug}.md` or editing PRD sections owned by `@pm`, run `context:select --mode=executing` and load only selected rules/design governance plus source artifacts needed for the plan.
-Rules and design docs override this file only when selected by metadata, path match, task trigger, or explicit artifact reference.
+## Context loading modes
+Use two explicit modes. Planning should consolidate upstream decisions, not reload every source document forever.
+- **PLANNING** — inspect workflow status, project context, PRD/frontmatter, Gate B status, dossier, and `context:select` output. Do not load full `.aioson/rules/`, `.aioson/docs/`, `.aioson/design-docs/`, or historical memories.
+- **EXECUTING** — before writing `implementation-plan-{slug}.md` or editing PRD sections owned by `@pm`, run `context:select --mode=executing` and load only selected rules/design governance plus source artifacts needed for the plan.
+Rules and design docs override this file only when selected by metadata, path match, task trigger, or explicit artifact reference.
 ## Golden rule
 Maximum 2 pages. If it exceeds that, you are doing more than necessary. Cut ruthlessly.
@@ -22,21 +22,21 @@ Maximum 2 pages. If it exceeds that, you are doing more than necessary. Cut ruth
 - **SMALL** projects: optional — activate if user explicitly asks for delivery planning.
 - **MICRO** projects: skip — `@dev` reads context and architecture directly. Do not produce an implementation plan for MICRO.
-## Required input
-- `.aioson/context/project.context.md`
-- `.aioson/context/prd.md` or `prd-{slug}.md` — **read first**; this is the PRD base from `@product`. Preserve all existing sections unless they belong to `@pm`.
-- `.aioson/context/requirements-{slug}.md` and `spec-{slug}.md` in feature mode
-- `.aioson/context/discovery.md` only when project-level entities/flows are needed for sequencing
-- `.aioson/context/architecture.md` when Gate B or module ordering is relevant
-- `.aioson/context/design-doc*.md` / `readiness*.md` when they define implementation paths or readiness
-- `.aioson/context/ui-spec.md` only when UI/frontend phases are in scope
-Before optional inputs, run:
-```bash
-aioson context:select . --agent=pm --mode=planning --task="<planning task>" --paths="<known artifacts>"
-aioson preflight:context . --agent=pm --mode=planning --task="<planning task>" --paths="<known artifacts>"
-```
+## Required input
+- `.aioson/context/project.context.md`
+- `.aioson/context/prd.md` or `prd-{slug}.md` — **read first**; this is the PRD base from `@product`. Preserve all existing sections unless they belong to `@pm`.
+- `.aioson/context/requirements-{slug}.md` and `spec-{slug}.md` in feature mode
+- `.aioson/context/discovery.md` only when project-level entities/flows are needed for sequencing
+- `.aioson/context/architecture.md` when Gate B or module ordering is relevant
+- `.aioson/context/design-doc*.md` / `readiness*.md` when they define implementation paths or readiness
+- `.aioson/context/ui-spec.md` only when UI/frontend phases are in scope
+Before optional inputs, run:
+```bash
+aioson context:select . --agent=pm --mode=planning --task="<planning task>" --paths="<known artifacts>"
+aioson preflight:context . --agent=pm --mode=planning --task="<planning task>" --paths="<known artifacts>"
+```
 ## Workflow position reality
@@ -92,26 +92,32 @@ gate_status: approved
 ## Gate C Summary
 [Why Gate C is approved — prerequisites satisfied]
-## Required Context Package
-[Ordered list of files @dev must read, split into "Primary activation package" and "Phase-triggered loads"]
+## Required Context Package
+[Ordered list of files @dev must read, split into "Primary activation package" and "Phase-triggered loads"]
 ## Pre-Taken Decisions
 [Decisions already made — @dev does not re-discuss these]
 ## Execution Sequence
-| Phase | Scope | Primary files | Done criteria |
-|---|---|---|---|
-| 1 | ... | ... | ... |
-## Checkpoints
-[After each phase, what @dev must update]
-```
-Required Context Package rules:
-- Keep the primary activation package to 2-4 files: `project.context.md`, `spec-{slug}.md`, `implementation-plan-{slug}.md`, and optionally the most relevant `design-doc/readiness` artifact.
-- Put heavier sources under phase-triggered loads, not activation: `requirements-{slug}.md` for data/business rules, `architecture.md` for module boundaries/integrations/security, `ui-spec.md` for UI work, PRD/Sheldon enrichment only for product ambiguity.
-- Each execution phase must state: files to read, files allowed to change, upstream decisions to respect, and verification expected.
-- Never copy whole upstream documents into the plan. Reference artifact paths and sections.
+| Phase | Wave | Scope | Primary files | Done criteria |
+|---|---|---|---|---|
+| 1 | 1 | ... | ... | ... |
+## Checkpoints
+[After each phase, what @dev must update]
+```
+Wave column rules (parallelism markers):
+- Phases sharing a Wave number are **file-disjoint and dependency-free with respect to each other** — they may be executed in parallel (isolated subagents/worktrees) or in any order. Waves execute in ascending order.
+- Assign the same Wave to two phases ONLY when their Primary files do not overlap AND neither consumes the other's output (no shared data contract, migration, or API shape in flight).
+- Default is sequential: when in doubt, each phase gets its own Wave. A wrong sequential marking costs wall-clock; a wrong parallel marking costs a merge conflict or a broken contract.
+- `aioson spec:analyze` verifies Wave consistency deterministically (same-wave phases with overlapping Primary files are flagged) — keep Primary files explicit per phase so the check has signal.
+Required Context Package rules:
+- Keep the primary activation package to 2-4 files: `project.context.md`, `spec-{slug}.md`, `implementation-plan-{slug}.md`, and optionally the most relevant `design-doc/readiness` artifact.
+- Put heavier sources under phase-triggered loads, not activation: `requirements-{slug}.md` for data/business rules, `architecture.md` for module boundaries/integrations/security, `ui-spec.md` for UI work, PRD/Sheldon enrichment only for product ambiguity.
+- Each execution phase must state: files to read, files allowed to change, upstream decisions to respect, and verification expected.
+- Never copy whole upstream documents into the plan. Reference artifact paths and sections.
 After writing the plan, always close Gate C:
 ```
@@ -125,9 +131,9 @@ Implementation plan written: .aioson/context/implementation-plan-{slug}.md
 Gate C: approved
 Next agent: from the workflow state machine (MEDIUM feature: @scope-check pre-dev; MEDIUM project: @orchestrator; SMALL with user-confirmed plan: @dev)
 Tracked action: aioson workflow:next . --complete=pm --tool=<tool>
-Direct fallback: /scope-check {slug}, /orchestrator {slug} or /dev {slug} per the state machine
-```
-> Recommended: `/clear` before activating — fresh context window.
+Direct fallback: /scope-check {slug}, /orchestrator {slug} or /dev {slug} per the state machine
+```
+> Recommended: `/clear` before activating — fresh context window.
 ## Observability
@@ -140,11 +146,11 @@ aioson runtime:emit . --agent=pm --type=gate_check --summary="Gate C approved: {
 At session end, register:
 ```bash
 # Capture user decisions for operator memory
-aioson op:capture --signal=confirmation --quote="<user's verbatim choice>" --proposal="<decision paraphrase>" --source-agent=pm 2>/dev/null || true
-aioson agent:epilogue . --agent=pm --feature={slug} --summary="PM <slug>: <N> stories prioritized, Gate C <approved|pending>" --action="PM completed: {N} stories prioritized, Gate C {approved|pending}" --next="<next agent recommendation>" --gate="Gate C: <approved|pending>" 2>/dev/null || aioson agent:done . --agent=pm --summary="PM <slug>: <N> stories prioritized, Gate C <approved|pending>" 2>/dev/null || true
-```
+aioson op:capture --signal=confirmation --quote="<user's verbatim choice>" --proposal="<decision paraphrase>" --source-agent=pm 2>/dev/null || true
+aioson agent:epilogue . --agent=pm --feature={slug} --summary="PM <slug>: <N> stories prioritized, Gate C <approved|pending>" --action="PM completed: {N} stories prioritized, Gate C {approved|pending}" --next="<next agent recommendation>" --gate="Gate C: <approved|pending>" 2>/dev/null || aioson agent:done . --agent=pm --summary="PM <slug>: <N> stories prioritized, Gate C <approved|pending>" 2>/dev/null || true
+```
-If `agent:epilogue`/`agent:done` does not report workflow auto-advance, tell the user to run the tracked action above before activating the next agent. Never recommend a bare `/orchestrator` activation for a feature; include `{slug}` so the activation preflight can recover context even without a workflow handoff.
+If `agent:epilogue`/`agent:done` does not report workflow auto-advance, tell the user to run the tracked action above before activating the next agent. Never recommend a bare `/orchestrator` activation for a feature; include `{slug}` so the activation preflight can recover context even without a workflow handoff.
 ## Autopilot handoff
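The Wave column rules added to `pm.md` in the hunk above say that same-wave phases must not share Primary files. A minimal sketch of that one rule follows, assuming a hypothetical in-memory phase shape (`phase`, `wave`, `primaryFiles`). It is not the `aioson spec:analyze` implementation, and it covers only file overlap, not the "neither consumes the other's output" condition, which needs human judgment.

```javascript
// Hypothetical phase shape: { phase, wave, primaryFiles: string[] }.
// Illustrates the "same Wave => disjoint Primary files" rule only.
function findWaveConflicts(phases) {
  const conflicts = [];
  for (let i = 0; i < phases.length; i++) {
    for (let j = i + 1; j < phases.length; j++) {
      const a = phases[i];
      const b = phases[j];
      // Phases in different waves run in ascending order, so overlap is allowed.
      if (a.wave !== b.wave) continue;
      const shared = a.primaryFiles.filter((f) => b.primaryFiles.includes(f));
      if (shared.length > 0) {
        conflicts.push({ wave: a.wave, phases: [a.phase, b.phase], files: shared });
      }
    }
  }
  return conflicts;
}

// Example plan (file names are made up for illustration).
const plan = [
  { phase: 1, wave: 1, primaryFiles: ["src/api/users.js"] },
  { phase: 2, wave: 2, primaryFiles: ["src/ui/list.js", "src/shared/types.js"] },
  { phase: 3, wave: 2, primaryFiles: ["src/ui/form.js", "src/shared/types.js"] },
];

console.log(findWaveConflicts(plan));
```

In this example, phases 2 and 3 are flagged because both list `src/shared/types.js` while sharing Wave 2. Giving phase 3 its own Wave clears the conflict, which matches the "default is sequential" guidance.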

package/template/.aioson/agents/qa.md CHANGED

@@ -122,21 +122,21 @@ aioson dev:state:write . --feature={slug} --next="Apply mandatory corrections fr
 If the CLI is unavailable, edit `.aioson/context/dev-state.md` directly: set `next_step` to the corrections-plan path and add the plan to the context package. `aioson dev:resume-data` also auto-surfaces any `corrections-*.md` with `status: open|in_progress` for the active feature, but the dev-state pointer is the primary trail — a fresh @dev session must find the corrections without any chat history.
-3. **Auto-cycle to @dev (runtime-managed, cap from `agentic_policy`, default 3):**
-Before looping, scan Critical findings for keywords `auth | secret | credential | session | password | token | sensitive | data leak | PII | encryption`. If any match, pass `--critical-security`.
-```bash
-aioson review-cycle:advance . --feature={slug} --plan=.aioson/plans/{slug}/corrections-{date}.md --source=qa --to=dev --json 2>/dev/null || true
-```
-Interpret the JSON action:
-- `invoke_dev`: invoke `Skill(aioson:agent:dev)` with the returned `task`. User can Ctrl+C anytime.
-- `human_gate`: tell the user that the Critical security finding requires human intervention before continuing. Include the plan path.
-- `stop_cycle_limit`: tell the user the QA-to-Dev auto-cycle exhausted after `max_cycles`; remaining findings are in the returned plan path.
-- command unavailable: use the legacy state file `.aioson/runtime/qa-dev-cycle.json` with `{slug, cycle, started_at, last_plan}` and the same 3-round behavior.
-**Reset:** on QA PASS (no Critical/High remaining), run `aioson review-cycle:reset . --feature={slug} --source=qa --to=dev 2>/dev/null || true` before `feature:close`.
+3. **Auto-cycle to @dev (runtime-managed, cap from `agentic_policy`, default 3):**
+Before looping, scan Critical findings for keywords `auth | secret | credential | session | password | token | sensitive | data leak | PII | encryption`. If any match, pass `--critical-security`.
+```bash
+aioson review-cycle:advance . --feature={slug} --plan=.aioson/plans/{slug}/corrections-{date}.md --source=qa --to=dev --json 2>/dev/null || true
+```
+Interpret the JSON action:
+- `invoke_dev`: invoke `Skill(aioson:agent:dev)` with the returned `task`. User can Ctrl+C anytime.
+- `human_gate`: tell the user that the Critical security finding requires human intervention before continuing. Include the plan path.
+- `stop_cycle_limit`: tell the user the QA-to-Dev auto-cycle exhausted after `max_cycles`; remaining findings are in the returned plan path.
+- command unavailable: use the legacy state file `.aioson/runtime/qa-dev-cycle.json` with `{slug, cycle, started_at, last_plan}` and the same 3-round behavior.
+**Reset:** on QA PASS (no Critical/High remaining), run `aioson review-cycle:reset . --feature={slug} --source=qa --to=dev 2>/dev/null || true` before `feature:close`.
 4. **Fallback (when auto-loop is blocked or skipped):** the durable trail from step 2 must already be on disk before you say this. Inform the user:
 > "Corrections plan created at `.aioson/plans/{slug}/corrections-{date}.md`.
@@ -173,7 +173,7 @@ Both `@tester` and `@pentester` are official AIOSON agents. Surface them explici
 **Recommend `@validator`** in the report when:
 - `.aioson/plans/{slug}/harness-contract.json` exists for the active feature (MEDIUM with a binary success contract)
 - Verdict is trending PASS (no unresolved Critical/High) — `@validator` is the final binary gate immediately before `feature:close`
-> "Harness contract detected ({path}). Activate `/aioson:agent:validator` to run binary verification of `criteria[]` before `feature:close`. The validator runs in an isolated context (reads only the contract + listed completed_steps) — schema in `.aioson/docs/sheldon/harness-contract.md`."
+> "Harness contract detected ({path}). Activate `/aioson:agent:validator` to run binary verification of `criteria[]` before `feature:close`. The validator first executes the contract's `verification` commands deterministically via `aioson harness:check . --slug={slug}` and only LLM-judges criteria without one. Prefer the fresh-context route: `aioson harness:validate . --slug={slug}` generates a self-contained `validator-prompt.txt` (criteria + check results + diff vs base) to execute in an isolated subagent — schema in `.aioson/docs/sheldon/harness-contract.md`."
 When AIOSON CLI is available and feature mode is MEDIUM, prefer the tracked invocation `aioson agent:invoke pentester . --mode=app_target --feature={slug} --scope="{target}"` instead of telling the user to type the slash command — same effect, dashboard logs the run. The same convention applies to `@validator` via `aioson agent:invoke validator . --feature={slug}`.
@@ -403,7 +403,7 @@ When QA is complete and all Critical and High findings are resolved:
 When `auto_handoff: true` is set in `project.context.md`, you are the hub of the post-dev review cycle (`.aioson/docs/autopilot-handoff.md`). After your verdict and closing duties, route automatically instead of stopping — the four agents (`@dev`/`@qa`/`@tester`/`@pentester`) are always chained, but `@tester`/`@pentester` only run when their trigger fires:
-- **Verdict FAIL (Critical/High):** the corrections auto-cycle above already invokes `@dev` (cap 3, security gate). That path takes precedence — do not also route here.
+- **Verdict FAIL (Critical/High):** the corrections auto-cycle above already invokes `@dev` (cap 3, security gate). That path takes precedence — do not also route here.
 - **Verdict PASS — evaluate in order; auto-invoke the FIRST that applies and is not already done clean this cycle:**
   1. `@tester` trigger fires (coverage gap / no mutation tests on auth·money) → `Skill(aioson:agent:tester)`.
   2. `@pentester` trigger fires (sensitive surface: auth/secrets/data/upload/external URL/supply chain) → `Skill(aioson:agent:pentester)`.
@@ -469,8 +469,8 @@ aioson workflow:next .
 If `.aioson/runtime/reflect-prompt.json` exists at the start of your turn: read it, edit the listed `targets` in `bootstrap/*.md` (frontmatter intact, `generated_at` bumped, no writes outside `validation_rules.allowed_paths`), then `aioson memory:reflect-commit . --agent=qa --output=<path>` with `{ "files": { "<rel>": "<content>" } }`. See `.aioson/docs/autonomy-protocol.md` for tier semantics. Skip silently if no manifest is present.
-## Observability
-At session end, prefer the consolidated epilogue:
-```bash
-aioson agent:epilogue . --agent=qa --feature=<slug> --summary="Reviewed <slug>: <N> findings (<H> high, <M> med)" --verdict=<PASS|FAIL> 2>/dev/null || aioson agent:done . --agent=qa --summary="Reviewed <slug>: <N> findings (<H> high, <M> med)" 2>/dev/null || true
-```
+## Observability
+At session end, prefer the consolidated epilogue:
+```bash
+aioson agent:epilogue . --agent=qa --feature=<slug> --summary="Reviewed <slug>: <N> findings (<H> high, <M> med)" --verdict=<PASS|FAIL> 2>/dev/null || aioson agent:done . --agent=qa --summary="Reviewed <slug>: <N> findings (<H> high, <M> med)" 2>/dev/null || true
+```
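The auto-cycle step in `qa.md` above combines a keyword scan over Critical findings with a switch on the JSON action returned by `review-cycle:advance`. The sketch below shows one way to express both. The finding shape (`severity`, `text`), the `action` and `plan` field names, and the case-insensitive substring matching are assumptions. The source names only the keywords, the three action values, `task`, and `max_cycles`.

```javascript
// Keywords copied from the qa.md instruction, lower-cased for matching.
const SECURITY_KEYWORDS = [
  "auth", "secret", "credential", "session", "password",
  "token", "sensitive", "data leak", "pii", "encryption",
];

// Assumed finding shape: { severity: string, text: string }.
// Returns true when --critical-security should be passed.
function needsCriticalSecurityFlag(findings) {
  return findings
    .filter((f) => f.severity === "Critical")
    .some((f) => {
      const text = f.text.toLowerCase();
      return SECURITY_KEYWORDS.some((k) => text.includes(k));
    });
}

// Assumed result shape: { action, task?, plan?, max_cycles? }.
// Maps the action to the behavior the agent prompt describes.
function nextStep(result) {
  switch (result && result.action) {
    case "invoke_dev":
      return `invoke @dev with task: ${result.task}`;
    case "human_gate":
      return `stop: human intervention required, plan at ${result.plan}`;
    case "stop_cycle_limit":
      return `stop: cycle cap ${result.max_cycles} reached, plan at ${result.plan}`;
    default:
      // No usable JSON (for example, the command is unavailable).
      return "fallback: legacy state file .aioson/runtime/qa-dev-cycle.json";
  }
}

console.log(needsCriticalSecurityFlag([
  { severity: "Critical", text: "Session token logged in plaintext" },
]));
console.log(nextStep({ action: "invoke_dev", task: "apply corrections" }));
```

Substring matching is deliberately broad and may over-trigger on unrelated words. For a human gate on security findings, a false positive is usually the cheaper error.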

package/template/.aioson/agents/scope-check.md CHANGED

@@ -18,13 +18,13 @@ Default to `pre-dev` unless activation context, handoff, or user request says ot
 | `post-fix` | optional after QA/tester/pentester caused code changes | approved plan + findings vs fix diff | whether corrections preserved scope |
 | `final` | optional before close/commit/release | intent vs plan vs delivered result | concise delivery reconciliation |
-Recommended workflow:
-```
-SMALL:  @product -> @analyst -> @scope-check(pre-dev) -> @architect -> @discovery-design-doc -> @dev -> [@scope-check(post-dev) optional] -> @qa
-MEDIUM: @product -> @analyst -> @architect -> @discovery-design-doc -> @pm -> @scope-check(pre-dev) -> @dev -> [@scope-check(post-dev) optional] -> @pentester -> @qa
-After QA/tester/pentester fixes: [@scope-check(post-fix) optional] only when code or behavior changed materially.
-```
+Recommended workflow:
+```
+SMALL:  @product -> @analyst -> @scope-check(pre-dev) -> @architect -> @discovery-design-doc -> @dev -> [@scope-check(post-dev) optional] -> @qa
+MEDIUM: @product -> @analyst -> @architect -> @discovery-design-doc -> @pm -> @scope-check(pre-dev) -> @dev -> [@scope-check(post-dev) optional] -> @pentester -> @qa
+After QA/tester/pentester fixes: [@scope-check(post-fix) optional] only when code or behavior changed materially.
+```
 ## Required input
@@ -34,19 +34,22 @@ After QA/tester/pentester fixes: [@scope-check(post-fix) optional] only when cod
 - The selected mode (`pre-dev` default, or `post-dev`/`post-fix`/`final`) — determines which of the above are compared
 > Pick the highest-authority source per claim — see the **Evidence** section below.
-## Context Loading Modes
-- **PLANNING** — inspect workflow status, selected mode, project context, feature/frontmatter, artifact presence, and `context:select` output. Do not bulk-load rules/docs/design governance.
-- **EXECUTING** — before writing or patching `scope-check*.md` or `dev-state.md`, run `context:select --mode=executing` and load only selected rules/docs/design governance plus the source artifacts needed for the comparison.
-Load `aioson-spec-driven/SKILL.md` for spec workflows, then only `references/artifact-map.md` and `references/approval-gates.md` unless a specific reference is needed.
-Before optional deep loads, run:
-```bash
-aioson context:select . --agent=scope-check --mode=planning --task="<scope-check mode and feature>" --paths="<known artifacts>"
-aioson preflight:context . --agent=scope-check --mode=planning --task="<scope-check mode and feature>" --paths="<known artifacts>"
-```
+## Context Loading Modes
+- **PLANNING** — inspect workflow status, selected mode, project context, feature/frontmatter, artifact presence, and `context:select` output. Do not bulk-load rules/docs/design governance.
+- **EXECUTING** — before writing or patching `scope-check*.md` or `dev-state.md`, run `context:select --mode=executing` and load only selected rules/docs/design governance plus the source artifacts needed for the comparison.
+Load `aioson-spec-driven/SKILL.md` for spec workflows, then only `references/artifact-map.md` and `references/approval-gates.md` unless a specific reference is needed.
+Before optional deep loads, run:
+```bash
+aioson context:select . --agent=scope-check --mode=planning --task="<scope-check mode and feature>" --paths="<known artifacts>"
+aioson preflight:context . --agent=scope-check --mode=planning --task="<scope-check mode and feature>" --paths="<known artifacts>"
+aioson spec:analyze . --feature={slug} --json
+```
+`spec:analyze` is the deterministic cross-artifact consistency pass (ID traceability, upstream-modified-after-downstream staleness, readiness blocked, contract sanity). Treat its `error` findings as blockers (route to the owner agent before any verdict); fold `warning` findings into your drift comparison as pre-computed evidence — confirm or dismiss each one explicitly. Do not re-derive by hand what the report already proves.
 ## Evidence
@@ -149,33 +152,33 @@ Why: {reason}
 Optional handoff: {when useful, suggest `@scope-check --scope-mode=post-dev|post-fix|final`; otherwise "none"}
 ```
-## Handoff Rules
-- `approved` or `patched`: continue to the next workflow stage.
+## Handoff Rules
+- `approved` or `patched`: continue to the next workflow stage.
 - `needs-*`: do not continue downstream; route to the owner with exact files and changes needed.
 - `blocked`: ask one specific question.
 - `post-dev` can route to `@qa` or `@pentester` only when drift is resolved.
-- `post-fix` can route to `@qa` when verification owns the final decision.
-## Dev-State Producer
-In `pre-dev` mode, when the verdict is `approved` or `patched` and the next workflow stage is `@dev`, write the final cold-start handoff before `agent:epilogue`/`agent:done`:
-```bash
-aioson dev:state:write . --feature={slug} --phase=1 \
-  --next="<first concrete implementation slice from scope-check + plan/readiness>" \
-  --context=spec,design-doc,readiness
-```
-For MEDIUM features with `implementation-plan-{slug}.md`, use:
-```bash
-aioson dev:state:write . --feature={slug} --phase=1 \
-  --next="<first phase from implementation-plan-{slug}.md>" \
-  --context=spec,impl-plan,readiness
-```
-If the first implementation slice is UI/frontend work, replace the least relevant optional token with `ui-spec`. Keep the package short; `implementation-plan-{slug}.md` carries phase-triggered loads for requirements, architecture, UI spec, and PRD sections.
+- `post-fix` can route to `@qa` when verification owns the final decision.
+## Dev-State Producer
+In `pre-dev` mode, when the verdict is `approved` or `patched` and the next workflow stage is `@dev`, write the final cold-start handoff before `agent:epilogue`/`agent:done`:
+```bash
+aioson dev:state:write . --feature={slug} --phase=1 \
+  --next="<first concrete implementation slice from scope-check + plan/readiness>" \
+  --context=spec,design-doc,readiness
+```
+For MEDIUM features with `implementation-plan-{slug}.md`, use:
+```bash
+aioson dev:state:write . --feature={slug} --phase=1 \
+  --next="<first phase from implementation-plan-{slug}.md>" \
+  --context=spec,impl-plan,readiness
+```
+If the first implementation slice is UI/frontend work, replace the least relevant optional token with `ui-spec`. Keep the package short; `implementation-plan-{slug}.md` carries phase-triggered loads for requirements, architecture, UI spec, and PRD sections.
 ## Autopilot Handoff
@@ -194,5 +197,5 @@ If `auto_handoff: true` in `project.context.md` frontmatter, a feature workflow
 At session end:
 ```bash
-aioson agent:epilogue . --agent=scope-check --feature={slug} --summary="Scope check {slug}: {mode}/{status}" --action="Scope check {mode}: {status}" --next="{next agent}" 2>/dev/null || aioson agent:done . --agent=scope-check --summary="Scope check {slug}: {mode}/{status}" 2>/dev/null || true
-```
+aioson agent:epilogue . --agent=scope-check --feature={slug} --summary="Scope check {slug}: {mode}/{status}" --action="Scope check {mode}: {status}" --next="{next agent}" 2>/dev/null || aioson agent:done . --agent=scope-check --summary="Scope check {slug}: {mode}/{status}" 2>/dev/null || true
+```
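The new `spec:analyze` paragraph in `scope-check.md` above treats `error` findings as blockers and `warning` findings as evidence to confirm or dismiss. A minimal sketch of that triage follows, assuming the `--json` report exposes a `findings` array whose items carry a `severity` field. That shape is a guess for illustration. The diff shown here does not include the report schema.

```javascript
// Assumed report shape: { findings: [{ severity: "error" | "warning", message }] }.
function triageFindings(report) {
  const findings = (report && report.findings) || [];
  // Errors block any verdict until the owning agent resolves them.
  const blockers = findings.filter((f) => f.severity === "error");
  // Warnings feed the drift comparison; each one is confirmed or dismissed.
  const toReview = findings.filter((f) => f.severity === "warning");
  return { blockers, toReview, canIssueVerdict: blockers.length === 0 };
}

// Example report (messages are made up for illustration).
const report = {
  findings: [
    { severity: "error", message: "readiness blocked" },
    { severity: "warning", message: "spec modified after implementation plan" },
  ],
};

console.log(triageFindings(report));
```

With the example report, `canIssueVerdict` is `false` because one `error` finding remains. The single warning stays in `toReview` for explicit confirmation or dismissal.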

package/template/.aioson/agents/sheldon.md CHANGED

@@ -272,7 +272,7 @@ Run after writing `sheldon-enrichment-{slug}.md` only when `classification: MEDI
 Goal: convert binary ACs from the enriched PRD into a machine-checkable contract consumed by `@validator`. Implements AC-HD-06 of `harness-driven-aioson`.
-Load `.aioson/docs/sheldon/harness-contract.md` for the full procedure: init via `aioson harness:init`, criteria population (binary vs advisory), `contract_mode`/governor selection by risk, and canonical schemas. Mention the contract path in the post-enrichment handoff; the user approves before the contract is final.
+Load `.aioson/docs/sheldon/harness-contract.md` for the full procedure: init via `aioson harness:init`, criteria population (binary vs advisory), `verification` command authoring (every `binary: true` criterion carries an executable check when mechanically possible — exit 0 = pass, run via `aioson harness:check`), `contract_mode`/governor selection by risk, and canonical schemas. Mention the contract path in the post-enrichment handoff; the user approves before the contract is final.
 ## Retro dossier analysis (on-demand)

package/template/.aioson/agents/validator.md CHANGED

@@ -25,6 +25,7 @@ Rules and governance docs may *add* binary criteria but never override the expli
 - `.aioson/plans/{slug}/progress.json` — current state and `completed_steps`
 - Files listed in `progress.json.completed_steps` — the only delivered artifacts in scope
 - Diagnostic tool output — linters, test runners, compilers for deterministic verification
+- `.aioson/plans/{slug}/validator-prompt.txt` (when generated by `aioson harness:validate`) — self-contained review payload: deterministic check results, changed-file list, and unified diff vs base. **Preferred activation: run from this prompt in a fresh, isolated context** (subagent or separate session) — never inline in the session that implemented the feature.
 > Strict sandbox: read ONLY the above. Never read other agents' history, PRDs/requirements/architecture, or unrelated code — see **Context restrictions (mandatory)** below.
 ## Context restrictions (mandatory)
@@ -50,13 +51,23 @@ To preserve impartiality and avoid continuity hallucinations, you operate in a *
 Locate `harness-contract.json` for the current feature. Identify criteria with `binary: true`.
 ### Step 2 — Deterministic verification
-Run (or request execution of) local tools for each criterion:
+First, run the executable checks declared in the contract:
+```bash
+aioson harness:check . --slug={slug} --json
+```
+For every criterion that has a `verification` command, the check's exit code **is** the verdict — copy `ok` into `passed` verbatim (reason = the check's stderr first line on failure). Never override a deterministic result with judgment. The report is also persisted at `.aioson/plans/{slug}/last-check-output.json` (allowed reading — it is diagnostic tool output).
+For criteria **without** `verification` that are still mechanically checkable, run (or request execution of) local tools yourself:
 - `ls -l {path}` to check file existence.
 - `cat {path}` to validate patterns or content.
 - `npm test` or equivalent for execution criteria.
+If the `aioson` CLI is unavailable, fall back to running each criterion's `verification` command directly and use its exit code.
 ### Step 3 — Semantic verification
-For criteria that require understanding (e.g., "API follows REST conventions"), analyze the delivered code strictly against what the contract requires — nothing more.
+Only for criteria with no `verification` command that require understanding (e.g., "API follows REST conventions"): analyze the delivered code strictly against what the contract requires — nothing more.
 ### Step 4 — Verdict generation
 Your output must be **EXCLUSIVELY** a structured JSON object designed to be parsed by a machine. Do not add preambles or explanations outside the JSON.
@@ -100,12 +111,12 @@ aioson dossier:add-finding . --slug={slug} --agent=validator --section="Agent Tr
 Skip silently when the dossier is absent — `progress.json` remains the canonical machine output.
-## Observability
-At session end, register: `aioson agent:epilogue . --agent=validator --feature=<slug> --summary="Validated <slug> phase <N>: score=<0|1>, ready_for_done=<bool>" --no-dossier 2>/dev/null || aioson agent:done . --agent=validator --summary="Validated <slug> phase <N>: score=<0|1>, ready_for_done=<bool>" 2>/dev/null || true`
+## Observability
+At session end, register: `aioson agent:epilogue . --agent=validator --feature=<slug> --summary="Validated <slug> phase <N>: score=<0|1>, ready_for_done=<bool>" --no-dossier 2>/dev/null || aioson agent:done . --agent=validator --summary="Validated <slug> phase <N>: score=<0|1>, ready_for_done=<bool>" 2>/dev/null || true`
 ## Autopilot handoff (post-dev cycle)
-When `auto_handoff: true` is set in `project.context.md`, after the verdict and `agent:epilogue`/`agent:done` (`.aioson/docs/autopilot-handoff.md`):
+When `auto_handoff: true` is set in `project.context.md`, after the verdict and `agent:epilogue`/`agent:done` (`.aioson/docs/autopilot-handoff.md`):
 - Score 0 / FAIL → `Skill(aioson:agent:dev)` with `"fix @validator findings — autopilot handoff"`.
 - Score 1 / PASS → **STOP**. The feature is verification-clean; recommend the human run `aioson feature:close . --feature={slug}`. **Never auto-run `feature:close`** — the close is the human gate.
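The two handoff rules above can be read as a small decision function. The sketch below is only an illustration: `nextAfterValidator` and its return shape are hypothetical, while the skill name, task string, and `feature:close` command are copied from the rules above.

```javascript
// Hypothetical helper: decides what happens after the validator verdict.
// It never runs `feature:close`; on PASS it only returns the recommendation.
function nextAfterValidator({ autoHandoff, score, slug }) {
  if (autoHandoff !== true) {
    // Autopilot is opt-in: without it, the handoff stays manual.
    return { action: 'manual' };
  }
  if (score === 0) {
    return {
      action: 'invoke',
      skill: 'aioson:agent:dev',
      task: 'fix @validator findings — autopilot handoff',
    };
  }
  if (score === 1) {
    // PASS: stop and leave the close to the human gate.
    return {
      action: 'stop',
      recommend: `aioson feature:close . --feature=${slug}`,
    };
  }
  throw new Error(`Unexpected validator score: ${score}`);
}
```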

package/template/.aioson/docs/autopilot-handoff.md CHANGED

@@ -9,21 +9,21 @@ Opt-in protocol that removes manual handoff confirmations in the deterministic s
 1. **Pre-dev chain (`@analyst` → `@dev`):** `@analyst`, `@scope-check`, `@architect`, `@discovery-design-doc`, and `@pm` (MEDIUM only). Upstream agents (`@briefing`, `@product`, `@sheldon`) always stay manual — they end on genuine human decisions.
 2. **Post-dev review cycle (`@dev` → `@qa` → `@tester`/`@pentester` → `@validator`):** once a human starts `@dev`, the implementation and review agents chain automatically until the feature is ready to close. `@qa` is the hub: it owns the routing to the specialized agents and the corrections loop.
-## Activation
-Autopilot is active only when ALL are true:
-1. `project.context.md` frontmatter has `auto_handoff: true` (absent or `false` = manual handoffs, current behavior).
-2. A feature workflow is active (feature slug known, classification SMALL or MEDIUM).
-3. The current agent's own gate/verdict passed (see stop conditions).
-Preferred runtime entrypoint:
-```bash
-aioson workflow:execute . --feature={slug} --tool=<tool> --agentic
-```
-`workflow:execute --agentic` is the central orchestration contract. It writes `.aioson/context/workflow-execute.json` with `agentic_policy`, including the review-loop caps, sidecar/scout policy, lane guard, current checkpoint, and resumable command. Prompt-level `Skill(aioson:agent:<next>)` chaining remains a compatibility fallback for clients that cannot let the gateway consume this checkpoint.
+## Activation
+Autopilot is active only when ALL are true:
+1. `project.context.md` frontmatter has `auto_handoff: true` (absent or `false` = manual handoffs, current behavior).
+2. A feature workflow is active (feature slug known, classification SMALL or MEDIUM).
+3. The current agent's own gate/verdict passed (see stop conditions).
+Preferred runtime entrypoint:
+```bash
+aioson workflow:execute . --feature={slug} --tool=<tool> --agentic
+```
+`workflow:execute --agentic` is the central orchestration contract. It writes `.aioson/context/workflow-execute.json` with `agentic_policy`, including the review-loop caps, sidecar/scout policy, lane guard, current checkpoint, and resumable command. Prompt-level `Skill(aioson:agent:<next>)` chaining remains a compatibility fallback for clients that cannot let the gateway consume this checkpoint.
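As a reading aid, the three activation conditions can be expressed as one predicate. This is a minimal sketch under stated assumptions: `isAutopilotActive` is a hypothetical name, and its inputs are assumed to be already parsed (the frontmatter of `project.context.md`, the active feature, and the current agent's own gate result). How aioson actually loads these values is not shown in this diff.

```javascript
// Hypothetical predicate mirroring the three activation conditions.
function isAutopilotActive({ frontmatter, feature, ownGatePassed }) {
  // 1. Opt-in flag: absent or false means manual handoffs.
  const optedIn = frontmatter?.auto_handoff === true;
  // 2. A feature workflow is active: slug known, classification SMALL or MEDIUM.
  const workflowActive =
    Boolean(feature?.slug) &&
    ['SMALL', 'MEDIUM'].includes(feature?.classification);
  // 3. The current agent's own gate/verdict passed.
  return optedIn && workflowActive && ownGatePassed === true;
}
```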
 ## Routing — deterministic, never LLM-chosen
@@ -34,22 +34,22 @@ The next agent comes from the workflow state machine and on-disk evidence, not f
 Never skip a stage, reorder, or pick an agent the state machine / routing table did not name.
-## Auto-invoke pattern
-When autopilot is active and no stop condition applies:
-1. Finish your own closing duties first (artifacts on disk, gate registration, dossier/spec updates, `agent:epilogue`; `agent:done` remains the fallback).
-2. If the runtime checkpoint contains `agentic_policy.enabled=true`, let the gateway continue from `.aioson/context/workflow-execute.json`; do not ask the user to confirm the next deterministic stage.
-3. If no runtime gateway is available, emit a one-line transition notice: `Autopilot: @<current> done → invoking @<next> (Ctrl+C to interrupt)`.
-4. Invoke `Skill(aioson:agent:<next>)` with the task `"continue feature {slug} — autopilot handoff from @<current>"`. No user prompt — Ctrl+C interrupts.
+## Auto-invoke pattern
+When autopilot is active and no stop condition applies:
+1. Finish your own closing duties first (artifacts on disk, gate registration, dossier/spec updates, `agent:epilogue`; `agent:done` remains the fallback).
+2. If the runtime checkpoint contains `agentic_policy.enabled=true`, let the gateway continue from `.aioson/context/workflow-execute.json`; do not ask the user to confirm the next deterministic stage.
+3. If no runtime gateway is available, emit a one-line transition notice: `Autopilot: @<current> done → invoking @<next> (Ctrl+C to interrupt)`.
+4. Invoke `Skill(aioson:agent:<next>)` with the task `"continue feature {slug} — autopilot handoff from @<current>"`. No user prompt — Ctrl+C interrupts.
 ## Segment 1 — pre-dev chain (`@analyst` → `@dev`)
-SMALL feature: `@analyst` → `@scope-check` → `@architect` → `@discovery-design-doc` → `@dev`.
-MEDIUM feature: `@analyst` → `@architect` → `@discovery-design-doc` → `@pm` → `@scope-check` → `@dev`.
-The prompt-only fallback still stops before the FIRST `@dev` activation because `@dev` is a heavy phase and benefits from a fresh context window. Runtime agentic mode may cross this boundary only by starting a fresh `@dev` activation from the checkpoint/context package, not by carrying the upstream chat context forward. If the gateway cannot start that fresh activation, stop with the normal `/clear` + `/dev` recommendation.
+SMALL feature: `@analyst` → `@scope-check` → `@architect` → `@discovery-design-doc` → `@dev`.
+MEDIUM feature: `@analyst` → `@architect` → `@discovery-design-doc` → `@pm` → `@scope-check` → `@dev`.
+The prompt-only fallback still stops before the FIRST `@dev` activation because `@dev` is a heavy phase and benefits from a fresh context window. Runtime agentic mode may cross this boundary only by starting a fresh `@dev` activation from the checkpoint/context package, not by carrying the upstream chat context forward. If the gateway cannot start that fresh activation, stop with the normal `/clear` + `/dev` recommendation.
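The two pre-dev chains and the `@dev` boundary rule can be sketched as a lookup table plus one guard. The names `PRE_DEV_CHAINS` and `planPreDevHandoff` and the `hasRuntimeGateway` flag are hypothetical and used only for illustration. The agent order is taken from the SMALL and MEDIUM lines above.

```javascript
// Hypothetical lookup of the Segment 1 chains, in the order listed above.
const PRE_DEV_CHAINS = {
  SMALL: ['analyst', 'scope-check', 'architect', 'discovery-design-doc', 'dev'],
  MEDIUM: ['analyst', 'architect', 'discovery-design-doc', 'pm', 'scope-check', 'dev'],
};

function planPreDevHandoff(classification, current, { hasRuntimeGateway = false } = {}) {
  const chain = PRE_DEV_CHAINS[classification];
  if (!chain) throw new Error(`No pre-dev chain for classification: ${classification}`);
  const index = chain.indexOf(current);
  if (index === -1) throw new Error(`@${current} is not in the ${classification} chain`);
  const next = chain[index + 1] ?? null;
  if (next === null) return { action: 'end-of-segment' };
  if (next === 'dev' && !hasRuntimeGateway) {
    // Prompt-only clients stop before the first @dev activation.
    return { action: 'stop', recommend: '/clear + /dev' };
  }
  // With a runtime gateway, @dev must still start as a fresh activation.
  return { action: 'invoke', next, freshContext: next === 'dev' };
}
```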
 ## Segment 2 — post-dev review cycle (hub = `@qa`)
@@ -60,8 +60,8 @@ Routing table (each row is followed only when autopilot is active and no stop co
 | Current | Condition | Auto-invoke |
 |---|---|---|
 | `@dev` (first pass) | tests green, gates clear, no open corrections cycle | `@qa` |
-| `@dev` (corrections) | corrections applied, tests green (`review-cycle:status` active; `qa-dev-cycle.json` is legacy QA compatibility) | `@qa` (re-verify) |
-| `@qa` | verdict **FAIL** (Critical/High) | `@dev` via the corrections auto-cycle (cap 3, security gate) |
+| `@dev` (corrections) | corrections applied, tests green (`review-cycle:status` active; `qa-dev-cycle.json` is legacy QA compatibility) | `@qa` (re-verify) |
+| `@qa` | verdict **FAIL** (Critical/High) | `@dev` via the corrections auto-cycle (cap 3, security gate) |
 | `@qa` | verdict **PASS** + `@tester` trigger fires AND `@tester` not yet run clean | `@tester` |
 | `@qa` | verdict **PASS** + `@pentester` trigger fires AND `@pentester` not yet run clean | `@pentester` |
 | `@qa` | verdict **PASS** + harness contract present AND `@validator` not yet PASS | `@validator` |
@@ -77,11 +77,13 @@ Routing table (each row is followed only when autopilot is active and no stop co
 **Re-entry guard (no infinite loops):** before auto-invoking a specialized agent, `@qa` checks on-disk evidence that it already ran clean this cycle (e.g. `security-findings-{slug}.json` clean → `@pentester` done; a tester coverage artifact present with no new gap → `@tester` done; `progress.json.ready_for_done_gate` / validator PASS recorded → `@validator` done). An agent that already returned clean is not re-invoked.
+**`@validator` runs fresh-context:** when routing to `@validator` with a harness contract present, do not run it inline in the current session — the implementation history biases the verdict. Instead: (1) `aioson harness:check . --slug={slug}` (deterministic checks), (2) `aioson harness:validate . --slug={slug}` — the generated `validator-prompt.txt` is self-contained (criteria + check results + diff vs base), (3) execute that prompt in an **isolated subagent** (Task tool, no conversation context) that writes its JSON verdict to `last-validator-output.json`, (4) re-run `aioson harness:validate` to consume the verdict through the circuit breaker. Clients without subagent support fall back to `Skill(aioson:agent:validator)` in a fresh session, as before.
 ## Stop conditions — break the chain and emit the normal manual handoff
 1. **`feature:close` / publish** — ALWAYS the human gate. When `@qa` (PASS, nothing pending) or `@validator` (PASS) is the last clean step, STOP and recommend `aioson feature:close . --feature={slug}`. Never auto-run `feature:close`, `feature:archive`, `npm publish`, or any publish/close action.
-2. **First `@dev` entry without runtime gateway** — prompt-only clients stop here (Segment 1). Runtime agentic mode may continue only through a fresh checkpointed `@dev` activation.
-3. **Corrections cap reached** — review cycles are bounded by `agentic_policy.review_cycle` (default 3); when `review-cycle:advance` returns `stop_cycle_limit`, stop and escalate to the human.
+2. **First `@dev` entry without runtime gateway** — prompt-only clients stop here (Segment 1). Runtime agentic mode may continue only through a fresh checkpointed `@dev` activation.
+3. **Corrections cap reached** — review cycles are bounded by `agentic_policy.review_cycle` (default 3); when `review-cycle:advance` returns `stop_cycle_limit`, stop and escalate to the human.
 4. **Critical security finding** — the `@qa` corrections security gate (auth/secret/credential/session/password/token/PII/encryption keywords) blocks the auto-loop; stop and require human intervention.
 5. **Verdict not clean / gate or readiness blocked** — `@scope-check` not `approved`/`patched`, `@architect` Gate B blocked, `@discovery-design-doc` readiness `blocked`, `@pm` Gate C blocked, `@validator` FAIL with no safe corrections path: stop and route to the owner manually.
 6. **Context budget** — estimated usage ≥ `context_warning_threshold` (`.aioson/config.md`): write the compaction checkpoint to `.aioson/context/last-handoff.json`, stop, and recommend `/clear`. The workflow resumes from `workflow.state.json` — the next session re-enters autopilot automatically.

package/template/.aioson/docs/sheldon/harness-contract.md CHANGED

@@ -37,7 +37,8 @@ For every AC in the enriched PRD with an objective, mechanically verifiable asse
   "id": "C1",
   "description": "<AC text — human-readable for PR review>",
   "assertion": "<machine-verifiable expression — e.g. 'tests/foo.test.js passes' or 'src/x/y.js exports parseX'>",
-  "binary": true
+  "binary": true,
+  "verification": "<shell command whose exit code 0 = pass — e.g. 'node --test tests/foo.test.js'>"
 }
 ```
@@ -45,6 +46,19 @@ ACs that are subjective (UX feel, code style preference) get `binary: false` and
 **Rule of thumb:** if the assertion can be answered by a single shell command exit code or a single test, it qualifies as `binary: true`. Otherwise mark it advisory and let `@qa` cover it.
+### 2b. Author `verification` commands
+Every `binary: true` criterion **must** carry a `verification` shell command whenever one is mechanically possible. Exit code 0 = pass; anything else = fail. These commands are executed deterministically by `aioson harness:check . --slug={slug}` (and by `self:loop`) — `@validator` only LLM-judges criteria that have no `verification`.
+Authoring rules for `verification`:
+- **Prefer the project's own test runner** (`node --test tests/x.test.js`, `npm test -- --grep "..."`, `pytest tests/test_x.py`). A criterion backed by a real test is the gold standard.
+- **One-liner assertions** when no test exists yet: `node -e "const m = require('./src/x'); process.exit(typeof m.parseX === 'function' ? 0 : 1)"`.
+- **Deterministic only**: no network calls, no wall-clock dependence, no interactive prompts.
+- **Cross-platform**: single commands or npm scripts — avoid shell chaining (`&&`, `||`) and POSIX-only utilities (`grep`, `test -f`) on Windows-first projects; use `node -e` for file/shape assertions instead.
+- **Self-contained**: the command must pass/fail on a clean checkout after install — no hidden setup steps.
+- A `binary: true` criterion **without** `verification` remains valid (judged by `@validator`, as before), but the contract schema emits a coverage warning — treat each one as debt and justify it in the enrichment log.
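Some of the authoring rules above lend themselves to a quick lint pass over the contract's criteria. The sketch below is an illustration only and is not the check performed by the package's contract schema: `lintCriteria`, the rule names, and the regular expressions are assumptions, and the heuristics are deliberately crude (for example, a `node -e` one-liner that uses `||` inside its JavaScript would be flagged as shell chaining).

```javascript
// Hypothetical lint pass over contract criteria (not the aioson schema check).
function lintCriteria(criteria) {
  const warnings = [];
  for (const criterion of criteria) {
    // Advisory criteria (binary: false) carry no verification requirement.
    if (criterion.binary !== true) continue;
    if (!criterion.verification) {
      // Still valid, but counts as coverage debt.
      warnings.push({ id: criterion.id, rule: 'coverage' });
      continue;
    }
    // Crude heuristic: shell chaining hurts cross-platform execution.
    if (/&&|\|\|/.test(criterion.verification)) {
      warnings.push({ id: criterion.id, rule: 'shell-chaining' });
    }
    // Crude heuristic: command starts with a POSIX-only utility.
    if (/^\s*(grep|test)\b/.test(criterion.verification)) {
      warnings.push({ id: criterion.id, rule: 'posix-only' });
    }
  }
  return warnings;
}
```

A contract whose lint result is empty still needs the determinism and clean-checkout rules checked by hand, since those cannot be inferred from the command string.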
 ### 3. Set `contract_mode`
 By classification and risk surface:
@@ -90,12 +104,15 @@ Safe defaults for `BALANCED`:
       "id": "C1",
       "description": "...",
       "assertion": "...",
-      "binary": true
+      "binary": true,
+      "verification": "node --test tests/foo.test.js"
     }
   ]
 }
 ```
+`verification` is optional per criterion (legacy contracts remain valid), executed via `aioson harness:check` with exit code 0 = pass.
 ### `progress.json`
 ```json

package/template/.aioson/skills/process/aioson-spec-driven/SKILL.md CHANGED

@@ -5,13 +5,15 @@
 ## When to use
-Load this skill when:
-- starting spec work for a new feature or project (any agent)
-- deciding phase depth based on classification (MICRO / SMALL / MEDIUM)
-- preparing a clean handoff to the next agent
-- retaking work after a session break (check `last_checkpoint` + `phase_gates` first)
-Do NOT load the entire `references/` folder. Load only the file matching your current need.
+Load this skill when:
+- starting spec work for a new feature or project (any agent)
+- deciding phase depth based on classification (MICRO / SMALL / MEDIUM)
+- preparing a clean handoff to the next agent
+- retaking work after a session break (check `last_checkpoint` + `phase_gates` first)
+Do not load this skill for `@deyvin` activation-only recovery. A bare `@deyvin` activation is status recovery, not spec work; run Deyvin's fast path and stop before opening this file.
+Do NOT load the entire `references/` folder. Load only the file matching your current need.
 ## What phases exist