npm - @amsterdamdatalabs/enact-extensions - Versions diffs - 0.1.5 → 0.1.8 - Mend

@amsterdamdatalabs/enact-extensions 0.1.5 → 0.1.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (105)

package/extensions/enact-factory/skills/azdo-ci-strategy/references/troubleshooting.md ADDED

@@ -0,0 +1,53 @@
+# Troubleshooting & bootstrap gotchas
+Commands below assume the **Variables** block from `../SKILL.md` is set wherever they reference it.
+## Quick checks
+```bash
+az devops configure --list      # current config
+az account show                 # verify auth
+az extension show --name azure-devops
+```
+## PR commands live under `az repos`, not `az devops`
+```bash
+az devops pr create ...    # WRONG — 'pr' is not a subcommand of az devops
+az repos pr create --source-branch feature/x --target-branch integration --detect   # CORRECT
+```
+## Extension fails with `No module named 'msrestazure'`
+The `azure-devops` extension depends on `msrestazure`, which newer Homebrew
+`azure-cli` installs do not bundle. Install it into az's own Python:
+```bash
+az --version 2>&1 | grep "Python location"   # e.g. .../azure-cli/2.76.0/libexec/bin/python
+/opt/homebrew/Cellar/azure-cli/<version>/libexec/bin/python -m pip install msrestazure
+# If the extension directory is corrupt, remove and reinstall:
+rm -rf ~/.azure/cliextensions/azure-devops
+az extension add --name azure-devops
+```
+## Pipeline bootstrap gotchas
+### First PR that adds `azure-pipelines.yml` will not trigger CI
+AzDO evaluates the `pr:` trigger using the YAML **in the target branch**
+(`integration`). If `azure-pipelines.yml` doesn't exist in `integration` yet
+(because the PR adding it hasn't merged), the trigger never fires. Merge the
+bootstrap PR manually once — subsequent PRs auto-trigger.
+### First-run checklist (new pipeline bootstrap)
+1. **Validate YAML locally:**
+   `python3 -c "import yaml; yaml.safe_load(open('azure-pipelines.yml')); print('OK')"`
+2. **Check tool versions exist:** e.g. Zig at
+   `https://ziglang.org/download/<version>/zig-linux-x86_64-<version>.tar.xz`
+   (`curl -sI`, expect 200).
+3. **Variable group authorized:** see `policies-and-pipelines.md` →
+   variable-group authorization.
+4. **Multi-checkout repos accessible:** if `resources: repositories` is used,
+   confirm the service connection has read access to each sibling repo.
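Checklist item 2 can be scripted. The following is a minimal sketch and not part of the package: the function names `zig_tarball_url` and `url_exists` are hypothetical, the URL pattern is copied from the checklist as written, and the example version is only an illustration. Unlike `curl -sI`, `urllib` follows redirects, so a redirected download would also report success.

```python
import urllib.request

# URL pattern copied from checklist item 2; other tools need their own pattern.
ZIG_URL = "https://ziglang.org/download/{version}/zig-linux-x86_64-{version}.tar.xz"


def zig_tarball_url(version: str) -> str:
    """Fill the checklist's URL pattern with a concrete version string."""
    return ZIG_URL.format(version=version)


def url_exists(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and report whether the final response was 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors (404 etc.), DNS failures, and timeouts.
        return False
```

A call such as `url_exists(zig_tarball_url("0.13.0"))` needs network access, so it is best run on the same agent image that the pipeline uses.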

package/extensions/enact-factory/skills/deep-interview/SKILL.md ADDED

@@ -0,0 +1,72 @@
+---
+name: deep-interview
+description: "Intent-first clarification loop for vague, risky, or product-heavy work before planning or execution."
+---
+# Deep Interview
+## Purpose
+Use `$deep-interview` to turn a fuzzy request into an execution-ready spec. This is not generic
+brainstorming. It is a focused clarification loop that removes ambiguity before planning or coding.
+## Use When
+- the request describes outcomes, not behavior
+- the user is still discovering what they want
+- scope, non-goals, or decision boundaries are unclear
+- a wrong assumption would create expensive rework
+## Do Not Use When
+- the request already names files, symbols, and acceptance criteria
+- the user explicitly wants immediate execution and the risk is low
+- the only missing work is architectural decomposition, not intent clarity
+## Execution Policy
+- ask one question at a time
+- ask only the highest-leverage unresolved question
+- use repo facts before asking about codebase internals
+- force clarity on non-goals and decision boundaries before handoff
+- keep the interview moving toward a durable artifact, not an endless conversation
+## Question Order
+1. Why does this need to exist?
+2. What should be true when it is done?
+3. How far should it go?
+4. What should explicitly stay out?
+5. What may the agent decide without checking again?
+6. What constraints or preferences are hard?
+## Workflow
+1. Inspect the repo for brownfield context if relevant.
+2. Capture the current hypothesis in `.enact/loop/plans/<phase>-requirements.md`.
+3. Run a one-question loop until these are explicit:
+   - goal
+   - in-scope
+   - out-of-scope
+   - acceptance criteria
+   - decision boundaries
+4. Update:
+   - `.enact/loop/plans/<phase>-requirements.md`
+5. Hand off to `$plan` when the spec is concrete.
+## Output Standard
+The finished interview should leave behind:
+- clear goal
+- explicit non-goals
+- decision boundaries
+- testable acceptance criteria
+- constraints that downstream execution must honor
+## Stop Conditions
+Do not hand off while either of these is missing:
+- non-goals
+- decision boundaries
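The workflow and stop conditions above amount to a completeness check on the requirements spec. A minimal sketch, assuming the spec is held as a dictionary: the skill itself only describes a Markdown artifact, so the field names and both function names here are hypothetical.

```python
from typing import List

# Hypothetical field names mirroring the items in Workflow step 3.
REQUIRED_FIELDS = (
    "goal",
    "in_scope",
    "out_of_scope",
    "acceptance_criteria",
    "decision_boundaries",
)


def missing_fields(spec: dict) -> List[str]:
    """Return the required fields that are absent or empty, in a fixed order."""
    return [field for field in REQUIRED_FIELDS if not spec.get(field)]


def ready_for_handoff(spec: dict) -> bool:
    """Hand off to planning only when nothing required is missing."""
    return not missing_fields(spec)
```

The first entry returned by `missing_fields` is a reasonable candidate for the next question, which keeps the loop at one question at a time.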

package/extensions/enact-factory/skills/drive-loop/SKILL.md ADDED

@@ -0,0 +1,259 @@
+---
+name: drive-loop
+description: >-
+  Teaches factory how to drive a WorkItem through the enact-loop contract runner
+  using subagents. Use when delivering a work-item, running the loop, driving a
+  work-item to closure, executing the loop contract, dispatching grader subagents,
+  or integrating factory with enact-loop. Triggers on "drive a work-item",
+  "run the loop", "deliver work-item via loop", "loop contract", "grader dispatch",
+  "loop closure".
+metadata:
+  author: Amsterdam Data Labs
+  version: 1.0.0
+---
+# drive-loop
+## Model
+**enact-loop** is a domain-neutral contract-runner. **enact-factory** is the
+orchestrator. The relationship is process-boundary integration only — no
+cross-repo imports, no shared types. Integration is via the `enact-loop` MCP
+tools or CLI.
+```
+Factory builds CONTRACT (JSON)
+  → hands to enact-loop (MCP: loop_start | CLI: enact-loop)
+  → loop drives each STAGE to closure
+  → boulder (Stop hook) blocks the agent until every required stage has a PASS
+     and every judgment stage has an independent GO verdict
+  → CONTROL RETURNS TO FACTORY
+  → factory calls pushLoopClosure (enact-factory MCP/CLI)
+  → lifecycle booleans advance → WorkItem progresses on the board
+```
+No cross-repo imports. The integration boundary is the `enact-loop` MCP tools
+and the `enact-factory` MCP/CLI.
+---
+## Contract Schema
+Embed the contract as plain JSON when calling `loop_start`. Fields are required
+unless marked optional; `command` applies only to mechanical stages and `grader`
+only to judgment stages (see Rules).
+```json
+{
+  "id": "<uuid>",
+  "name": "<human-readable name>",
+  "stages": [
+    {
+      "id": "<stage-id>",
+      "name": "<stage name>",
+      "type": "mechanical | judgment",
+      "required": true,
+      "command": "<shell command>",
+      "grader": {
+        "role": "architect | critic | code-reviewer | verifier",
+        "model": "<model-id — MUST differ from executor>",
+        "harness": "paseo | subagent",
+        "rounds": 1
+      },
+      "passCriteria": "<optional plain-text pass condition>"
+    }
+  ]
+}
+```
+### Rules
+| Stage type | Required fields | Pass condition |
+|------------|----------------|----------------|
+| `mechanical` | `command` | exit-0 from the command |
+| `judgment` | `grader` (with `role`) | independent GO verdict |
+- `grader.model` **MUST** differ from the executor model. Cross-vendor is
+  preferred (e.g. executor = codex, grader = claude-opus-4-8). No self-grading.
+- `required: true` → stage must pass before closure is declared.
+- Closure = all required stages passed.
+---
+## Workflow
+### Step 1 — Build the contract from the WorkItem
+Inspect the WorkItem's `closureRequirements`. Map each requirement to a stage:
+- Deterministic check → `type: mechanical`, set `command`.
+- Subjective quality gate → `type: judgment`, set `grader`.
+### Step 2 — Start the loop
+```
+MCP:  loop_start({ contract: <contract JSON> })
+CLI:  enact-loop --contract ./contract.json
+```
+The loop runner takes ownership of the session's Stop hook (the boulder). The
+agent cannot stop until the loop signals closure.
+### Step 3 — Drive mechanical stages
+For each mechanical stage the loop surfaces:
+1. Run the stage's `command`.
+2. Capture stdout/stderr and exit code.
+3. Record the result: `loop_grade({ stageId, pass: <bool>, output: <string> })`.
+Repeat until the stage passes or the loop escalates to a retry/fail policy.
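Step 3 can be pictured as a small helper that runs the command and shapes the result for `loop_grade`. This is a sketch under assumptions, not the loop runner's implementation: `run_mechanical_stage` is a hypothetical name, and the returned dictionary simply mirrors the `{ stageId, pass, output }` parameters shown above.

```python
import subprocess


def run_mechanical_stage(stage_id: str, command: str) -> dict:
    """Run a stage command in a shell and build the arguments for loop_grade."""
    completed = subprocess.run(
        command,
        shell=True,           # stage commands are shell strings in the contract
        capture_output=True,  # capture both stdout and stderr
        text=True,
    )
    return {
        "stageId": stage_id,
        "pass": completed.returncode == 0,  # exit 0 is the pass condition
        "output": completed.stdout + completed.stderr,
    }
```

The result would then be passed to the `loop_grade` MCP tool by whatever client the executor uses. A production version would probably add a timeout and truncate very long output.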
+### Step 4 — Drive judgment stages (INDEPENDENT GRADER SUBAGENT)
+For each judgment stage:
+1. Call `loop_grader_dispatch({ stageId, role: <grader role> })`.
+2. Spawn an **independent subagent** on a **different model** (never the
+   executor's model or session):
+   ```
+   Agent({
+     subagent_type: "<role>",   // architect | critic | code-reviewer | verifier
+     model: "<grader model>",   // MUST differ from executor
+     prompt: "<artifact + passCriteria + GO/NO-GO instruction>"
+   })
+   ```
+3. The grader returns a GO or NO-GO verdict with rationale.
+4. Record the verdict: `loop_grader_verdict({ stageId, verdict: "GO | NO-GO", rationale })`.
+The boulder prevents the session from completing until every judgment stage
+has a GO from an independent grader.
+### Step 5 — Closure → factory lifecycle advance
+When the loop signals closure (all required stages passed):
+1. Control returns to factory.
+2. Call `pushLoopClosure` (enact-factory MCP/CLI) with the loop's closure
+   record.
+3. Factory maps closure → lifecycle booleans → WorkItem advances on the board.
+---
+## Agent Roster (grader candidates)
+Use these roles as `grader.role`. Each must run on a model/session distinct
+from the executor.
+| Role | Use for |
+|------|---------|
+| `architect` | design correctness, interface shape, system fit |
+| `critic` | adversarial quality and risk review |
+| `code-reviewer` | implementation correctness, test coverage, style |
+| `verifier` | evidence that the deliverable meets acceptance criteria |
+| `explore` | read-only scoping (typically pre-grader, not a grader itself) |
+| `planner` | plan review (judgment on the plan artifact, not execution) |
+---
+## Worked Example — Bug Delivery
+```json
+{
+  "id": "wi-2041-bug-null-crash",
+  "name": "Bug WI-2041: null crash in session handler",
+  "stages": [
+    {
+      "id": "unit-tests",
+      "name": "Unit tests pass",
+      "type": "mechanical",
+      "required": true,
+      "command": "cd codex-rs && cargo test --features codex-cli/enact --no-fail-fast -- --test-threads=2",
+      "passCriteria": "exit 0, no test failures"
+    },
+    {
+      "id": "regression-evidence",
+      "name": "Regression test covers the crash path",
+      "type": "mechanical",
+      "required": true,
+      "command": "grep -r 'null_session' codex-rs/src --include='*.rs' -l",
+      "passCriteria": "at least one file references the regression path"
+    },
+    {
+      "id": "code-review",
+      "name": "Independent code review: GO on fix correctness",
+      "type": "judgment",
+      "required": true,
+      "grader": {
+        "role": "code-reviewer",
+        "model": "claude-opus-4-8",
+        "harness": "subagent",
+        "rounds": 1
+      },
+      "passCriteria": "fix is minimal, regression is covered, no obvious regressions introduced"
+    },
+    {
+      "id": "verifier-signoff",
+      "name": "Verifier confirms acceptance criteria met",
+      "type": "judgment",
+      "required": true,
+      "grader": {
+        "role": "verifier",
+        "model": "claude-opus-4-8",
+        "harness": "subagent",
+        "rounds": 1
+      },
+      "passCriteria": "WI-2041 acceptance criteria satisfied; crash path eliminated"
+    }
+  ]
+}
+```
+**Execution trace:**
+1. Factory builds this contract from WI-2041's `closureRequirements`.
+2. `loop_start({ contract })` — boulder active.
+3. Loop surfaces `unit-tests` → executor runs `cargo test` → exit 0 →
+   `loop_grade({ stageId: "unit-tests", pass: true, output: "..." })`.
+4. Loop surfaces `regression-evidence` → executor runs grep → match found →
+   `loop_grade({ stageId: "regression-evidence", pass: true, output: "..." })`.
+5. Loop surfaces `code-review` →
+   `loop_grader_dispatch({ stageId: "code-review", role: "code-reviewer" })` →
+   spawn `code-reviewer` subagent on `claude-opus-4-8` →
+   grader returns GO →
+   `loop_grader_verdict({ stageId: "code-review", verdict: "GO", rationale: "..." })`.
+6. Loop surfaces `verifier-signoff` →
+   `loop_grader_dispatch(...)` →
+   spawn `verifier` subagent on `claude-opus-4-8` →
+   grader returns GO →
+   `loop_grader_verdict({ stageId: "verifier-signoff", verdict: "GO", rationale: "..." })`.
+7. All required stages passed → loop signals closure → boulder lifts.
+8. Factory calls `pushLoopClosure(closureRecord)` → WI-2041 advances to Done.
+---
+## Anti-patterns
+- **Never self-grade.** The executor that wrote the code cannot be the grader.
+  Grader must run on a different model and in a different session.
+- **Never skip the boulder.** Do not call loop completion tools until all
+  required stages have passed. The boulder enforces this; do not attempt to
+  work around it.
+- **Never import enact-loop types directly.** The boundary is MCP/CLI only.
+  No shared module imports between factory and loop.
+- **Judgment without a grader spec is invalid.** Every `type: judgment` stage
+  must have a `grader` field with at least `role` set.
+---
+## Related Skills
+- `workitem-triage` — read-only triage of the WorkItem board (use before
+  building the contract to confirm the WorkItem is ready for delivery).
+- `azdo-ci-strategy` — branch/PR/release strategy for integrating delivered
+  work into `integration` and `main`.
+- `testing-strategy` — when a mechanical stage runs tests, use this skill to
+  pick the right test mode.
+See `references/contract-schema.md` for the full contract schema reference.

package/extensions/enact-factory/skills/drive-loop/references/contract-schema.md ADDED

@@ -0,0 +1,107 @@
+# Contract Schema Reference
+Full schema for the JSON object passed to `loop_start`. This is a progressive
+disclosure supplement to `SKILL.md`.
+---
+## Contract (root)
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `id` | string (uuid) | yes | Stable unique identifier. Use the WorkItem id or a derived uuid. |
+| `name` | string | yes | Human-readable name shown in loop progress output. |
+| `stages` | Stage[] | yes | Ordered list of stages. Mechanical stages typically come first, but judgment stages may be interleaved. |
+---
+## Stage
+| Field | Type | Required when | Description |
+|-------|------|---------------|-------------|
+| `id` | string | always | Stable slug. Referenced by `loop_grade` and `loop_grader_verdict`. |
+| `name` | string | always | Human-readable label. |
+| `type` | `"mechanical" \| "judgment"` | always | Determines which fields are used and how pass is determined. |
+| `required` | boolean | always | `true` → must pass before closure. `false` → informational only. |
+| `command` | string | mechanical | Shell command run by the executor. Exit 0 = pass. |
+| `grader` | GraderSpec | judgment | Specification for the independent grader subagent. |
+| `passCriteria` | string | optional | Plain-text description of what constitutes a pass. Included in grader prompts. |
+---
+## GraderSpec
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `role` | `"architect" \| "critic" \| "code-reviewer" \| "verifier"` | yes | Determines the subagent type spawned. |
+| `model` | string | recommended | Model ID for the grader. **Must differ from the executor model.** Cross-vendor preferred. Omit only if the harness selects automatically. |
+| `harness` | `"paseo" \| "subagent"` | optional | How to dispatch. Default: `subagent`. Use `paseo` when the grader needs multi-step tool access. |
+| `rounds` | integer | optional | Number of independent grading rounds before aggregating. Default: `1`. |
+---
+## Validation Rules
+1. Every `type: judgment` stage **must** have `grader` with `role` set.
+2. Every `type: mechanical` stage **must** have `command` set.
+3. `grader.model` must differ from the executor's model — enforced at
+   `loop_grader_dispatch` time.
+4. At least one stage must be `required: true`, or the contract is trivially
+   closed and should not be submitted to the loop.
+5. Stage `id` values must be unique within the contract.
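The five rules above can be checked before a contract is submitted. The following is a minimal sketch, assuming the contract has already been parsed into a dictionary. `validate_contract` and its `executor_model` parameter are hypothetical, and rule 3 is only checked early when the executor model is known, since the reference says it is enforced at dispatch time.

```python
from typing import List, Optional


def validate_contract(contract: dict, executor_model: Optional[str] = None) -> List[str]:
    """Return a list of rule violations; an empty list means the contract looks valid."""
    errors = []
    stages = contract.get("stages", [])
    seen_ids = set()
    for stage in stages:
        stage_id = stage.get("id", "<missing id>")
        if stage_id in seen_ids:  # rule 5: ids must be unique
            errors.append(f"{stage_id}: duplicate stage id")
        seen_ids.add(stage_id)
        kind = stage.get("type")
        if kind == "mechanical":
            if not stage.get("command"):  # rule 2
                errors.append(f"{stage_id}: mechanical stage needs command")
        elif kind == "judgment":
            grader = stage.get("grader") or {}
            if not grader.get("role"):  # rule 1
                errors.append(f"{stage_id}: judgment stage needs grader.role")
            if executor_model and grader.get("model") == executor_model:  # rule 3
                errors.append(f"{stage_id}: grader model equals executor model")
        else:
            errors.append(f"{stage_id}: unknown stage type {kind!r}")
    if not any(stage.get("required") is True for stage in stages):  # rule 4
        errors.append("contract: no stage has required: true")
    return errors
```

Returning every violation at once, rather than raising on the first, lets the contract author fix all of them in one pass.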
+---
+## Closure Definition
+A contract reaches **closure** when:
+- Every stage where `required: true` has status `PASSED`.
+- `PASSED` means:
+  - `mechanical`: `loop_grade` was called with `pass: true`.
+  - `judgment`: `loop_grader_verdict` was called with `verdict: "GO"`.
+Optional stages (`required: false`) do not block closure but their results
+are included in the closure record returned to factory.
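The closure definition above reduces to a single predicate over the recorded results. A minimal sketch follows. The `results` mapping is a hypothetical stand-in for whatever state the loop keeps: it maps a stage id to `True`/`False` for mechanical stages and to `"GO"`/`"NO-GO"` for judgment stages.

```python
def stage_passed(stage: dict, results: dict) -> bool:
    """Apply the PASSED definition for one stage; unrecorded stages have not passed."""
    outcome = results.get(stage["id"])
    if stage["type"] == "mechanical":
        return outcome is True
    return outcome == "GO"


def is_closed(contract: dict, results: dict) -> bool:
    """A contract is closed when every required stage has passed."""
    return all(
        stage_passed(stage, results)
        for stage in contract["stages"]
        if stage["required"]
    )
```

With no required stages, `is_closed` returns `True` immediately, which is the trivially closed case that validation rule 4 warns against.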
+---
+## MCP Tool Reference
+| Tool | When to call | Key params |
+|------|-------------|------------|
+| `loop_start` | Once, at the start of the delivery | `{ contract: <Contract JSON> }` |
+| `loop_grade` | After each mechanical stage run | `{ stageId, pass, output }` |
+| `loop_grader_dispatch` | Before spawning the grader subagent | `{ stageId, role }` |
+| `loop_grader_verdict` | After the grader subagent returns | `{ stageId, verdict: "GO\|NO-GO", rationale }` |
+Factory-side tool (after loop closure):
+| Tool | When to call | Key params |
+|------|-------------|------------|
+| `pushLoopClosure` | Once, after loop signals closure | `{ closureRecord }` |
+---
+## Example: Minimal Contract (single mechanical stage)
+```json
+{
+  "id": "wi-9999-minimal",
+  "name": "Minimal: lint only",
+  "stages": [
+    {
+      "id": "lint",
+      "name": "Lint passes",
+      "type": "mechanical",
+      "required": true,
+      "command": "cargo clippy --workspace -- -D warnings"
+    }
+  ]
+}
+```
+## Example: Full Contract (mixed mechanical + judgment)
+See `SKILL.md` worked example (WI-2041 Bug Delivery) for a four-stage
+mixed contract with two mechanical and two judgment stages.

package/extensions/enact-factory/skills/hyperplan/SKILL.md ADDED

@@ -0,0 +1,51 @@
+---
+name: hyperplan
+description: "Three-round adversarial planning debate that hands a distilled bundle to $plan instead of writing the final plan directly."
+---
+# Hyperplan
+## Purpose
+Use `$hyperplan` when a plan needs adversarial pressure before execution starts.
+## Debate Team
+Use exactly this 4-agent team:
+- `critic`
+- `critic`
+- `architect`
+- `explore`
+The original openagent placeholder names are not used here.
+## Rounds
+Run exactly 3 rounds:
+1. independent analysis
+2. cross-attack
+3. defend, refine, or concede
+Persist each round under:
+- `.enact/loop/hyperplan/<sessionId>/round-1.md`
+- `.enact/loop/hyperplan/<sessionId>/round-2.md`
+- `.enact/loop/hyperplan/<sessionId>/round-3.md`
+## Handoff Rule
+The lead agent does not write the final implementation plan directly.
+After round 3, hand the distilled debate bundle to a separate `$plan` invocation.
+## Execution Notes
+- use `explore` to gather the evidence bundle before critique
+- use the two `critic` agents to take opposing positions in the cross-attack round
+- use `architect` to synthesize the final structured verdict and tradeoffs
+- do not skip a round because the first opinion looked sufficient
+## Activation
+Invoke with `$hyperplan` or `$enact-loop:hyperplan`. The skill is stateless — it produces a debate bundle and hands off to `$plan`. No durable workflow state is started automatically.

package/extensions/enact-factory/skills/looplan/SKILL.md ADDED

@@ -0,0 +1,103 @@
+---
+name: looplan
+description: "Plan-first loop execution wrapper: write a durable plan, enter the loop, drive to verified closure. Use when the work needs a reviewable plan before the loop starts."
+---
+# Looplan
+## Purpose
+Use `$looplan` when the work needs a reviewable execution plan before the loop starts. Looplan gates
+execution behind a plan the user or reviewer can inspect, then advances into the `$loop` contract
+runner with explicit stage verification.
+Looplan is the combination of `$plan` (write artifacts, define slices and verification) and `$loop`
+(durable contract-runner with closure gates). Neither sub-skill is optional.
+## Use When
+- the request is clear enough to plan but carries meaningful risk or complexity
+- a written plan artifact needs to exist before any code changes
+- verification stages need to be defined up front, not discovered mid-execution
+- the user wants to review the plan before execution begins
+## Do Not Use When
+- the request is still ambiguous — route through `$deep-interview` first, then return here
+- the work is trivial and a plan would add no value
+## Lifecycle
+```
+intent → $plan (write plan + verification map) → user review → $loop start → running → verifying → completed
+```
+Each transition is explicit. Never advance to the loop before the plan artifact exists and has been approved.
+## Workflow
+### Phase 1 — Plan
+1. Read current artifacts and inspect the repo for brownfield context.
+2. Write the plan to `.enact/loop/plans/<phase>.md` following `$plan` conventions:
+   - Named files per slice
+   - Explicit verification per slice
+   - Exit conditions stated
+3. Write the verification map to `.enact/loop/plans/<phase>-verification.md`.
+4. Present the plan for review. Do not advance until it is approved.
+### Phase 2 — Execute
+5. Start the loop with the goal and stages from the plan:
+   ```
+   loop_start  goal="<goal>"  contract=<stages from plan>
+   ```
+6. Confirm the loop is running: `loop_status`
+7. Execute each plan slice.
+   - Mechanical stages: run the command, then `loop_grade`.
+   - Judgment stages: `loop_grader_dispatch`, await `loop_grader_verdict`.
+8. If blocked: `loop_block` with reason, resolve, then `loop_advance` back to `running`.
+### Phase 3 — Verify and Complete
+9. When all stages have evidence, advance to verifying:
+   ```
+   loop_advance  toPhase=verifying
+   ```
+10. Run `$review` against the completed work and plan artifacts.
+11. If review passes, complete the loop:
+    ```
+    loop_complete  summary="<one-line summary>"
+    ```
+12. If the review requests changes, advance back to `running` and address the findings.
+## State Contract
+Reads:
+- `.enact/loop/plans/*` (plan and verification map written in Phase 1)
+- `.enact/loop/state/loop.json` (loop state)
+Writes:
+- `.enact/loop/plans/<phase>.md`
+- `.enact/loop/plans/<phase>-verification.md`
+- Loop state via MCP tools
+## MCP Tools
+- `loop_start` — start a new loop with goal and stages
+- `loop_status` — read current loop state
+- `loop_advance` — transition to a new phase
+- `loop_grade` — record a mechanical stage result with evidence
+- `loop_grader_dispatch` — dispatch an independent grader for a judgment stage
+- `loop_grader_verdict` — record an independent grader's GO/NO-GO
+- `loop_block` — mark blocked with a reason
+- `loop_complete` — close the loop with a summary
+- `loop_abort` — abort with a reason
+## Final Check
+- Plan artifact existed at `.enact/loop/plans/<phase>.md` before the loop was started
+- All stages have evidence recorded, not just a passed status
+- Judgment stages have valid independent grader verdicts
+- `$review` verdict is approved
+- `loop_status` shows phase as `completed`
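The lifecycle and workflow above imply a small set of allowed phase transitions. This sketch is one reading of the text and not the loop runner's actual state machine. The phase names `running`, `verifying`, and `completed` come from the skill, `blocked` is an assumed name for the state set by `loop_block`, and aborting is left out.

```python
# Allowed transitions as read from the looplan workflow; "blocked" is an assumed phase name.
ALLOWED_TRANSITIONS = {
    "running": {"verifying", "blocked"},
    "blocked": {"running"},
    "verifying": {"running", "completed"},  # back to running when review requests changes
    "completed": set(),
}


def advance(current: str, target: str) -> str:
    """Return the new phase, or raise ValueError if the transition is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move from {current!r} to {target!r}")
    return target
```

Under this reading there is no direct path from `running` to `completed`; the loop has to pass through `verifying` first.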