npm - @jamie-tam/forge - Versions diffs - 6.0.0 - Mend

@jamie-tam/forge 6.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (213) hide show

package/skills/support-task-force/SKILL.md ADDED Viewed

@@ -0,0 +1,265 @@
+---
+name: support-task-force
+description: "Use when the user prompts a punch list of independent tasks (numbered or bulleted, ≥3 items) OR explicitly invokes /task-force. Classifies each task by type, assembles the right specialist team per task, dispatches all in parallel, aggregates results. Phase-aware: light teams during prototype phases (fast iteration), full crew during harden/production (no shortcuts). Optional Codex adversarial pairing when consent is on. REFUSES command-shaped work — routes feature/bugfix/refactor items to their owning commands instead."
+---
+# Task Force
+## Overview
+Users don't always invoke `/feature` or `/bugfix`. The daily reality is a paste-in punch list: "do these 10 things: 1. update README; 2. explore option X; 3. add a test for Y; 4. ...". Doing those in series in one context window is slow, conflict-prone, and skips the Codex second-opinion that catches real bugs.
+This skill assembles a **task-force per item** — a specialist team sized for the task's type and the project's current phase — and dispatches them in parallel. Each task-force has a Claude reviewer/builder paired with a Codex sibling (when the task is non-trivial). The skill REFUSES to swallow command-shaped work; items that look like features/bugfixes/refactors get routed to their owning commands.
+**Core principle:** match team size to task complexity AND to the project's current phase. Prototype phases iterate fast with light teams. Production phases demand the full crew.
+**Announce at start:** "I'm using support-task-force to dispatch N task-forces in parallel — M routed to commands, K running in light mode, L running in full mode." If `!max` / `--full-power` detected: "...running in `!max` mode — production teams enforced regardless of phase."
+## When to Use
+Fire when **any** of:
+- User types `/task-force` explicitly
+- User prompt contains a numbered list (`1. xxx 2. xxx 3. xxx`) with ≥3 items
+- User prompt contains a bulleted list (`- xxx / - xxx / - xxx`) with ≥3 items AND items look like work-tasks (not casual mentions)
+Confirm with user before dispatching if auto-detected (one-line: "Detected a 5-item task list; dispatching task-forces — say 'no' to cancel").
+Do NOT use for:
+- A single task (use the appropriate command or skill directly)
+- A list where every item is command-shaped (route to commands instead, don't even invoke this skill)
+- Ad-hoc questions or conversation (this is for *do these things*, not *tell me about*)
+## Process
+### Step 0: Codex consent (per `protocols/codex.md`)
+Run the Codex consent flow ONCE for the whole task-force run. The choice (Verify / Takeover / Skip / Never) propagates to every per-task dispatch. Record in `.forge/task-force/{run-id}/codex.yaml`.
+### Step 1: Detect repo phase (or honor full-power override)
+**Magic keyword override.** If the user's prompt contains either `!max` (short form) OR `--full-power` (canonical form), skip all phase detection and resolve to `production` mode unconditionally. This forces:
+- Full-crew team templates for every task type (see Step 4 production tables)
+- Mandatory Codex consent (Verify mode) — no opt-out for the run
+- No "Codex skipped" prototype-mode shortcuts even on quick / docs tasks
+- Acknowledgement in the run announcement: "Running in `!max` mode — production teams enforced regardless of phase."
+Use the override when:
+- You're working in a prototype phase but the task touches code that WILL go to production
+- You want maximum review coverage regardless of the repo's current phase state
+- You're auditing or hardening prototype code before promoting it
+- You explicitly want the cost (more agents, more tokens) for the safety
+Otherwise, detect phase from the active work manifest: glob `.forge/work/*/*/manifest.yaml` and pick the most recently modified. Look at `phases.{phase}.status` and `phase_plan.{phase}`.
+Resolve to one of:
+| Phase mode | Detected by |
+|---|---|
+| **prototype** | Phase 4 active or phase_plan has `iterate-prototype: active` and `harden: skipped`/`not-yet`; OR no manifest exists AND `pocs/*` directory is present |
+| **production** | Phase 5+ active (`harden`, `production-build`, `quality`, `deliver`) OR no manifest AND the repo has shipped production code (presence of `src/`, `dist/`, packaged code) |
+| **ambiguous** | No manifest, no clear prototype/production signal — ask the user |
+Phase mode determines team sizing in Step 4. The override (`!max` / `--full-power`) short-circuits this to `production` regardless of detection.
+**Detection rules for the override:**
+- Match either form (whichever the user types):
+  - **Short form:** `!max` (literal, case-insensitive — `!MAX`, `!Max` also accepted)
+  - **Canonical form:** `--full-power` (literal, case-insensitive)
+- Must be a whole word: bounded by whitespace, line start, or line end
+- Do NOT match accidental variants: `max` alone (no bang), `!maximum`, `full power` (no dashes), `--fullpower`, `-full-power`
+- Strict matching prevents false triggers from casual English
+- Strip the keyword from each task's text before classification (it's a run-level flag, not part of any task)
+### Step 2: Parse the task list
+Extract items from the prompt. Number them 1..N. For each item, record:
+- raw text
+- detected verbs (build, fix, explore, document, audit, refactor, deploy, ...)
+- detected scope (file path, subsystem, "everywhere", ...)
+If N > 20 (hard cap), refuse and ask user to batch. If N > 10 (soft cap), warn and offer to batch.
+### Step 3: Classify each task
+For each task, apply this classifier in order — first match wins:
+| Pattern | Classification |
+|---|---|
+| "add a (new) feature", "implement Y end-to-end", "build Z product" | **command-shaped: /feature** |
+| "fix bug N", "investigate failure X" (single bug, multi-step) | **command-shaped: /bugfix** |
+| "refactor module M", "restructure X" | **command-shaped: /refactor** |
+| "start new project", "scaffold from scratch" | **command-shaped: /greenfield** |
+| "hotfix production", "rollback X" | **command-shaped: /hotfix** |
+| "audit security", "review for vulns", "check OWASP" | **security** (dispatch `quality-security-audit` skill) |
+| "review this code/PR" (with diff scope) | **review** (dispatch `quality-code-review` skill) |
+| "explore Y", "compare X vs Y", "find best approach for Z", "research Q" | **research** |
+| "fix bug" (single-line, obvious) OR "find why X fails" (one trace) | **debug** |
+| "write README", "document Y", "add comments to Z" | **docs** |
+| "build feature slice" / "add function" / "implement X" (small, scoped) | **dev** |
+| "fix typo", "bump dep", "rename variable" | **quick** |
+| ≥4 subsystems touched OR ≥3 verbs combined | **complex** |
+| Anything else | **ambiguous** (ask user) |
+Command-shaped items get **routed out** — they do NOT get a task-force. Surface them to the user with: "Item X looks like a `/feature` — running it as a forge command instead. The remaining items continue as task-forces." User can override.
+### Step 4: Assemble team per task
+Apply both the **task type** AND the **phase mode**:
+#### Phase mode: prototype (light teams, fast iteration)
+| Task type | Team |
+|---|---|
+| dev | `prototype-builder` only (Codex skipped unless user requested Verify) |
+| research | 1 explorer (`architect` OR `tracer` OR `doc-writer`, picked by scope) — no Codex |
+| debug | dispatch `support-debug` skill (owns its own chain) — no extra Codex |
+| docs | `doc-writer` only |
+| review | dispatch `quality-code-review` (it picks risk-tier; prototype work is usually Low) |
+| security | dispatch `quality-security-audit` |
+| quick | 1 agent inline (no dispatch, no Codex) |
+| complex | full crew defined below — but warn user "complex task in prototype mode; consider /feature instead" |
+#### Phase mode: production (full crew, mandatory Codex)
+| Task type | Team |
+|---|---|
+| dev | `builder` + dispatch `quality-code-review` skill + Codex (per consent) |
+| research | 2-3 explorers from `architect`, `tracer`, `doc-writer`, `spec-reviewer` (pick by scope) + Codex |
+| debug | dispatch `support-debug` skill + `gotcha-hunter` after tracer locks scope + Codex |
+| docs | `doc-writer` + Codex verify |
+| review | dispatch `quality-code-review` (full chain) + Codex adversarial |
+| security | dispatch `quality-security-audit` + `security-reviewer` + Codex adversarial |
+| quick | 1 agent + Codex verify (production is no-shortcuts even for one-liners that touch shipped code) |
+| complex | full crew: `architect` + `tracer` + (`builder` or `prototype-builder` by phase) + dispatch `quality-code-review` + `spec-reviewer` + `security-reviewer` (if risk triggers) + `e2e-runner` (if E2E plan exists) + Codex adversarial |
+#### Phase mode: ambiguous
+Ask the user: "Is this a prototype or production task list?" Then proceed with the matching template.
+### Step 5: Dispatch all task-forces in parallel
+For each non-routed task:
+1. Compose the prompt: include the task text, the detected type, phase mode, the team composition.
+2. Fire each agent in the team as a background subagent (`run_in_background: true`).
+3. For tasks whose team includes "dispatch X skill", spawn ONE agent that invokes that skill (the skill handles its own internal multi-agent chain).
+4. Tag each agent with `task-force/{run-id}/{task-n}/{role}` for telemetry.
+Batched dispatch ceiling: 20-25 agents per `<function_calls>` block. For 10 tasks × 4 agents = 40 agents, plan 2 dispatch rounds.
+### Step 6: Accumulate
+Background notifications arrive asynchronously:
+- Save each agent's result to `.forge/task-force/{run-id}/task-{n}/{role}.md`
+- Maintain a per-task status line: `Task 3 of 10: 2 of 3 agents complete (1 pending)`
+- DO NOT narrate every individual agent completion to the user — batch updates every ≥3 returns
+### Step 7: Per-task verdict
+When all agents for a task return, synthesize:
+- If Claude builder + Codex sibling agree → mark task `complete-consensus`
+- If they disagree → mark `complete-divergent`, surface the disagreement
+- If only one returned (other failed/timeout) → mark `complete-single-source`, flag for user attention
+- If both failed → mark `failed`, recommend retry or single-task fallback
+Per-task output goes to `.forge/task-force/{run-id}/task-{n}/verdict.md`.
+### Step 8: Run summary
+After all tasks finish (or hit batch limit), produce one consolidated report:
+```
+Task-Force Run {run-id} — {N} tasks, {M} routed to commands
+| # | Task | Type | Team size | Status | Conflicts |
+|---|------|------|-----------|--------|-----------|
+| 1 | update README | docs | 1 | consensus | — |
+| 2 | explore option X | research | 3 | consensus | — |
+| 3 | implement Y | dev | 2 | divergent | Claude says ship, Codex says blocking |
+| ... | ... | ... | ... | ... | ... |
+Routed to commands:
+- Item 4 (add payment feature) → /feature
+- Item 7 (fix critical bug) → /bugfix
+Followups:
+- Resolve divergence on task 3 (recommend manual review)
+- Run /feature for item 4 when ready
+```
+Write the summary to `.forge/task-force/{run-id}/run-summary.md` and surface to user.
+### Step 9: Stop
+This skill does ONE run per invocation. If the user wants to iterate on findings, they invoke the appropriate command or this skill again with a new list.
+## I/O Contract
+| Field | Value |
+|---|---|
+| Requires | A user-provided task list (parsed in Step 2); optionally an active manifest for phase detection |
+| Produces | `.forge/task-force/{run-id}/codex.yaml`, `.forge/task-force/{run-id}/tasks.yaml` (classification), `.forge/task-force/{run-id}/task-{n}/{role}.md` per agent, `.forge/task-force/{run-id}/task-{n}/verdict.md` per task, `.forge/task-force/{run-id}/run-summary.md` |
+| Updates manifest | If a feature/bugfix manifest is active, append `phases.{phase}.task-force-runs: [{run-id}]`. Otherwise no manifest update — task-force runs are not phase-gated. |
+| Feeds into | `/feature`, `/bugfix`, `/refactor` (for routed items); `support-gotcha` (for recurring patterns surfaced across tasks) |
+## Agent Dispatch
+This skill is itself a dispatcher. It does NOT have a single subagent type — it picks per task per phase from the table in Step 4.
+| Helper skill it dispatches | When |
+|---|---|
+| `quality-code-review` | Every dev/review task in production mode (skill owns its own agent chain) |
+| `support-debug` | Every debug task (skill owns tracer + gotcha-hunter chain) |
+| `quality-security-audit` | Every security task |
+| `deliver-onboarding` | Every docs task that's onboarding-scoped |
+Raw agents it dispatches directly:
+- `prototype-builder` (prototype dev)
+- `builder` (production dev, when not going through quality-code-review)
+- `architect`, `tracer`, `doc-writer`, `spec-reviewer`, `security-reviewer` (research/explore)
+- `e2e-runner` (complex tasks with E2E plan)
+- Codex sibling (via `codex:codex-rescue` subagent type or protocol invocation)
+## Integration
+| Step | Skill / Command | Why |
+|---|---|---|
+| Pre-dispatch consent | `protocols/codex.md` | Codex pairing requires consent ONCE per run |
+| Task routing-out | `/feature`, `/bugfix`, `/refactor`, `/greenfield`, `/hotfix` | Command-shaped items must use commands, not task-forces |
+| Per-task dev | `quality-code-review` skill | Production dev tasks get the full review chain |
+| Per-task debug | `support-debug` skill | Debug tasks get tracer + gotcha-hunter |
+| Per-task security | `quality-security-audit` skill | Security tasks get the OWASP-aware audit chain |
+| Post-run gotcha | `support-gotcha` | Patterns surfaced across multiple task-forces deserve a recorded lesson |
+## Red Flags
+| Red Flag | What it means |
+|---|---|
+| Every task gets the same team | Classifier is failing; user-pasted heterogeneous tasks don't all become "dev" |
+| Routed-to-command items >50% | User typed a feature description, not a punch list — recommend dropping this skill entirely and using `/feature` |
+| Codex consent skipped silently | Means no second-opinion across the whole run — for production-mode tasks this is a regression |
+| 0 conflicts across all tasks | Either every task was trivial OR Claude+Codex are echoing each other; sample-check |
+| Complex tasks in prototype mode | Probably should be a /feature; warn user before proceeding |
+| Run-id directories never get cleaned | Add a periodic prune; aiwiki/dream or /evolve should handle it |
+| Soft cap warning ignored repeatedly | User has misframed work — recommend splitting into milestones |
+## Anti-Patterns
+**Swallowing command-shaped work.** If the user types "add a payment feature, then audit security, then ship", DO NOT run a task-force — that's three workflows pretending to be three tasks. Route them: `/feature` → `quality-security-audit` → `/feature` deploy phase. Task-force is for items that DON'T fit a workflow.
+**Phase-blind sizing.** Spawning the full crew on a Phase-4 prototype task wastes tokens AND violates the prototype-throwaway principle. Always read phase first.
+**Skipping Codex on production tasks.** "Quick" is not an excuse in production mode. The whole point of pairing is to catch the bug Claude-alone would miss (this skill exists because the v6 audit demonstrated exactly that pattern, Phase 4 caught telemetry.sh regression Claude missed).
+**Ignoring divergent verdicts.** Claude+Codex disagreement is a SIGNAL, not noise. Surface it. Let the user adjudicate.
+**Letting `.forge/task-force/` grow forever.** Each run leaves a directory. Add to `support-dream` consolidation scope, or document a cleanup pattern.
+**Auto-dispatching without confirmation when auto-detected.** Numbered lists in user prose aren't always task lists (could be a survey of options, a sequence diagram, etc.). Confirm before firing 40 agents.
+## References
+- `references/dispatch-pattern.md` — bash recipes, prompt templates, sizing guidance
+- `references/synthesis-template.md` — per-task verdict + run-summary structures
+- `agents/critic.md` — voice calibration for research/review cells
+- `protocols/codex.md` — Codex consent flow
+- This skill's design lineage: born from the v6-migration audit's fanout pattern, rescoped to daily ad-hoc task lists per dogfood feedback.

package/skills/support-task-force/references/dispatch-pattern.md ADDED Viewed

@@ -0,0 +1,178 @@
+# Task-Force Dispatch Pattern
+Operational recipes for spawning per-task Claude+Codex teams.
+## Phase detection (Step 1)
+```bash
+# Check for !max / --full-power override FIRST (case-insensitive, whole-word)
+if echo "$USER_PROMPT" | grep -qiE '(^|[[:space:]])(!max|--full-power)($|[[:space:]])'; then
+  PHASE_MODE=production
+  CODEX_FORCED=verify
+  FULL_POWER=1
+  # Strip the keyword from task texts before classification
+else
+  FULL_POWER=0
+  # Glob the active manifest
+  manifest=$(ls -t .forge/work/*/*/manifest.yaml 2>/dev/null | head -1)
+  if [ -z "$manifest" ]; then
+    # No active manifest — check repo state
+    if [ -d pocs ] && [ ! -d dist ]; then
+      PHASE_MODE=prototype
+    elif [ -d dist ] || [ -d src ]; then
+      PHASE_MODE=production
+    else
+      PHASE_MODE=ambiguous
+    fi
+  else
+    # Manifest exists — read phase state
+    current=$(yq '.phases | to_entries | map(select(.value.status == "in-progress")) | .[0].key' "$manifest")
+    case "$current" in
+      concept|wireframe|prototype|iterate) PHASE_MODE=prototype ;;
+      harden|production-build|quality|deliver) PHASE_MODE=production ;;
+      *) PHASE_MODE=ambiguous ;;
+    esac
+  fi
+fi
+```
+**Magic keyword reference:**
+- Short form: `!max` (preferred for daily use)
+- Canonical form: `--full-power` (verbose, self-documenting)
+- Both case-insensitive
+- Must be whole words (bounded by whitespace or string boundary)
+- Either form forces `PHASE_MODE=production` and `CODEX_FORCED=verify`
+- Stripped from task text before classification (it's a run-level flag)
+Reject silent variants — do NOT match `max` alone (no bang), `!maximum`, `full power` (no dashes), `--fullpower` (no internal dash), `-full-power` (one dash). Strict matching prevents false positives from casual English.
+## Task classification (Step 3)
+Apply regex/heuristic match in order, first-match wins:
+```python
+classifiers = [
+    (r'(?i)add (a |an )?new feature|build .* product|implement .* end-to-end', 'command:/feature'),
+    (r'(?i)fix bug|investigate failure', 'command:/bugfix'),
+    (r'(?i)refactor|restructure', 'command:/refactor'),
+    (r'(?i)hotfix production|rollback', 'command:/hotfix'),
+    (r'(?i)scaffold .* (project|repo)|start new project', 'command:/greenfield'),
+    (r'(?i)audit security|owasp|vuln', 'security'),
+    (r'(?i)review (this )?(code|pr|diff)', 'review'),
+    (r'(?i)explore|compare .* (vs|or)|find best|research|investigate options', 'research'),
+    (r'(?i)why does .* fail|debug|trace', 'debug'),
+    (r'(?i)write (a |the )?readme|document|add comments|docstring', 'docs'),
+    (r'(?i)fix typo|bump dep|rename var|rename const', 'quick'),
+    # falls through to: dev (default for code work), complex (if multi-subsystem), ambiguous
+]
+```
+## Team assembly (Step 4) — by phase × type
+### Prototype mode teams (light, fast)
+```
+dev        → 1 agent: prototype-builder
+research   → 1 agent: architect | tracer | doc-writer (by scope)
+debug      → 1 dispatch: support-debug skill
+docs       → 1 agent: doc-writer
+review     → 1 dispatch: quality-code-review (low-risk tier)
+security   → 1 dispatch: quality-security-audit
+quick      → 1 agent inline (no dispatch)
+complex    → warn + full crew (Codex flagged this; consider /feature)
+```
+### Production mode teams (full crew, mandatory Codex)
+```
+dev        → 1 agent (builder) + 1 dispatch (quality-code-review) + 1 Codex
+research   → 2-3 agents (architect, tracer, doc-writer/spec-reviewer) + 1 Codex
+debug      → 1 dispatch (support-debug) + 1 agent (gotcha-hunter) + 1 Codex
+docs       → 1 agent (doc-writer) + 1 Codex verify
+review     → 1 dispatch (quality-code-review full chain) + 1 Codex adversarial
+security   → 1 dispatch (quality-security-audit) + 1 agent (security-reviewer) + 1 Codex adversarial
+quick      → 1 agent + 1 Codex verify
+complex    → architect + tracer + builder + dispatch quality-code-review + spec-reviewer + security-reviewer (if risk) + e2e-runner (if E2E plan) + Codex adversarial
+```
+## Per-agent prompt template
+```
+You are agent {role} in task-force task {n} of {N}, run-id {run-id}.
+Task type: {dev|research|debug|docs|review|security|quick|complex}
+Phase mode: {prototype|production}
+Codex consent: {verify|takeover|skip|never}
+TASK:
+{verbatim task text from user list}
+Repository context:
+- cwd: {abs path}
+- active manifest: {path or "none"}
+- last commit: {sha}
+Your role: {builder|reviewer|explorer|tracer|doc-writer|...}
+Your scope: {paths, subsystem, or "whole repo"}
+OUTPUT:
+- Result of doing the task (code diffs, findings, document, etc.)
+- One-line verdict for the task-force aggregator: PASS / CONCERNS / FAIL
+- Hand off any followups by writing them to .forge/task-force/{run-id}/task-{n}/followups.md
+Cap your output at {N} words. Save artifacts under .forge/task-force/{run-id}/task-{n}/{role}.{ext}.
+```
+## Batched dispatch ceiling
+20-25 Agent tool calls per `<function_calls>` block. For 10 tasks × avg 3 agents = 30 agents, plan 2 dispatch rounds.
+## Per-task aggregator (Step 7)
+```python
+verdicts = [parse_verdict(role_output) for role_output in role_outputs]
+if all(v == "PASS" for v in verdicts):
+    status = "complete-consensus"
+elif any(v == "FAIL" for v in verdicts) and "PASS" in verdicts:
+    status = "complete-divergent"
+elif all(v == "FAIL" for v in verdicts):
+    status = "failed"
+elif len(verdicts) < expected_count:
+    status = "complete-single-source"
+else:
+    status = "complete-with-concerns"
+```
+## Failure modes
+| Failure | Symptom | Action |
+|---|---|---|
+| Background agent timeout | Notification never arrives | Mark agent failed, fall through to single-source verdict |
+| Codex sandbox can't write | "operation blocked in sandbox" | Codex outputs uncommitted; Claude commits or main session commits |
+| Two agents touch same file | Edit tool conflict | Partition more aggressively |
+| Classifier picks command-shaped but rest heterogeneous | Some routed, some not | Surface routing decisions, continue with remainder |
+| User cancels mid-run | Receives interrupt | Mark in-flight tasks abandoned; preserve completed outputs |
+## Sizing examples
+| Punch list | Tasks | Phase | Agents fired |
+|---|---|---|---|
+| 3 docs items | 3 | prototype | 3 |
+| 5 quick chores | 5 | prototype | 5 |
+| 5 quick chores | 5 | production | 10 |
+| 2 dev + 1 research + 1 debug + 1 docs | 5 | prototype | 5 (very lean) |
+| Same 5 mixed | 5 | production | 12-14 |
+| 10 heterogeneous | 10 | production | 25-40 |
+| 15 tasks (soft cap warning) | 15 | production | 40-60 — offer batch |
+| 25 tasks (over hard cap) | 25 | any | REFUSE — split |
+## Cleanup
+`.forge/task-force/{run-id}/` accumulates per-run. Prune via:
+- `/evolve` (include task-force cleanup as a step)
+- Manual: `find .forge/task-force -maxdepth 1 -mtime +30 -exec rm -rf {} +`
+- `support-dream` could consolidate task-force runs into aiwiki summary entries before prune

package/skills/support-task-force/references/synthesis-template.md ADDED Viewed

@@ -0,0 +1,126 @@
+# Task-Force Output Templates
+Use these for per-task verdicts (Step 7) and run summaries (Step 8).
+## Per-task verdict
+```markdown
+# Task {n} Verdict — {short task title}
+**Type:** {dev|research|debug|docs|review|security|quick|complex}
+**Phase mode:** {prototype|production}
+**Team:** {list of roles}
+**Status:** {complete-consensus | complete-divergent | complete-single-source | failed}
+## Result
+{1-2 sentences summarizing what was done / found}
+## Agent verdicts
+| Role | Verdict | Brief |
+|------|---------|-------|
+| builder (Claude) | PASS | <one line> |
+| code-reviewer (Claude) | CONCERNS | <one line> |
+| code-reviewer (Codex) | PASS | <one line> |
+## Disagreements (if any)
+- Claude code-reviewer flagged X as a concern; Codex did not. Reason for divergence: <agent explanation>
+- Recommendation: <manual review | accept Claude | accept Codex | re-dispatch>
+## Artifacts
+- `.forge/task-force/{run-id}/task-{n}/builder.md`
+- `.forge/task-force/{run-id}/task-{n}/code-reviewer-claude.md`
+- `.forge/task-force/{run-id}/task-{n}/code-reviewer-codex.md`
+- `.forge/task-force/{run-id}/task-{n}/followups.md` (if any)
+## Followups (if any)
+- <item> — owner: <user | next task-force run | /evolve>
+```
+## Run summary
+```markdown
+# Task-Force Run {run-id}
+**Started:** {ISO-8601}
+**Completed:** {ISO-8601}
+**Phase mode:** {prototype|production|ambiguous}
+**Codex consent:** {verify|takeover|skip|never}
+## Input
+{verbatim user prompt or task list}
+## Classification
+| # | Task | Type | Routed |
+|---|------|------|--------|
+| 1 | Update README | docs | — |
+| 2 | Explore caching options | research | — |
+| 3 | Implement payment integration | command:/feature | ✅ routed |
+| ... | ... | ... | ... |
+## Outcomes
+| # | Status | Team size | Conflicts | Followups |
+|---|--------|-----------|-----------|-----------|
+| 1 | consensus | 1 | — | — |
+| 2 | consensus | 3 | — | — |
+| 4 | divergent | 2 | Claude PASS, Codex CONCERNS | manual review |
+| 5 | failed | 1 | both agents timed out | retry |
+## Routed to commands
+- Item 3 → `/feature` (run when ready)
+## Followup actions
+1. Resolve divergence on task 4 (recommend manual review of file X)
+2. Retry task 5 (consider /bugfix if recurring)
+3. Invoke `/feature` for routed item 3
+## Stats
+- Tasks: {N}
+- Routed: {M}
+- Executed: {N - M}
+- Total agents dispatched: {K}
+- Wall time: {minutes}
+- Cells with Claude+Codex consensus: {X}/{N - M}
+- Cells with divergence: {Y}/{N - M}
+## Next steps
+If divergence > 30%, sample-check a few task-force outputs — your classifier or team-assembly might be miscalibrated.
+If all tasks PASSED with no divergence, run a `/evolve` to consider whether patterns recurred enough to warrant capturing as a gotcha or rule.
+```
+## Severity / verdict definitions
+Per-agent verdict vocabulary:
+- **PASS** — task done as specified, no concerns
+- **CONCERNS** — done with caveats; followups recommended but not blocking
+- **FAIL** — task could not be completed (blocked, unknown, hit error)
+Per-task status:
+- **complete-consensus** — all team members agree PASS or all agree CONCERNS
+- **complete-divergent** — team members disagree (e.g., PASS vs FAIL); user should adjudicate
+- **complete-single-source** — only one agent returned; use with caution
+- **failed** — multiple agents failed; task did not complete
+## When to escalate to a command
+After run summary, if any task is `complete-divergent` AND the divergence is over a load-bearing question (security, data-loss, contract change), recommend escalating:
+- Dev task with divergent verdicts → `quality-code-review` skill standalone for that change
+- Debug task with divergent verdicts → `support-debug` skill standalone
+- Research task with divergent verdicts → user adjudication required
+Task-force is for parallel speed. When agreement breaks down, the right tool is depth, not parallelism.

package/skills/support-wiki-bootstrap/SKILL.md ADDED Viewed

@@ -0,0 +1,37 @@
+---
+name: support-wiki-bootstrap
+description: "Use when a command needs to create the aiwiki directory and template-backed wiki scaffolding before work starts."
+---
+# Wiki Bootstrap
+## Overview
+Scaffold `aiwiki/` exactly once so gotchas, decisions, conventions, architecture notes, sessions, raw captures, proposals, and schemas have a home before workflow phases write to them.
+**Announce at start:** "I'm using the support-wiki-bootstrap skill to ensure aiwiki is ready."
+## Procedure
+If `aiwiki/` already exists at the project root, skip silently. If `.claude/templates/aiwiki/` is missing, surface the broken install to the user and stop; the repair path is `npx @jamie-tam/forge init`.
+Run the bootstrap commands from the project root:
+```bash
+mkdir -p aiwiki/{decisions,gotchas,conventions,architecture,sessions,raw,proposed,schemas,oracles}
+cp .claude/templates/aiwiki/schemas/*.md aiwiki/schemas/
+cp .claude/templates/aiwiki/CLAUDE.md.template aiwiki/CLAUDE.md
+[ -f aiwiki/INDEX.md ] || printf "# Wiki Index\n\n(Empty -- populated as work happens.)\n" > aiwiki/INDEX.md
+[ -f aiwiki/projectbrief.md ] || printf "# %s\n\n(One-paragraph description.)\n" "$(basename "$(pwd)")" > aiwiki/projectbrief.md
+```
+If the calling command already knows the project name or stack, update `aiwiki/projectbrief.md` after bootstrapping rather than replacing an existing user-edited file.
+## I/O Contract
+| Field | Value |
+|---|---|
+| **Requires** | Project root with `.claude/templates/aiwiki/` installed |
+| **Produces** | `aiwiki/` directories, `aiwiki/CLAUDE.md`, schema files, `aiwiki/INDEX.md`, `aiwiki/projectbrief.md` |
+| **Feeds into** | Commands and skills that write gotchas, decisions, conventions, architecture notes, sessions, raw captures, or proposals |
+| **Updates manifest** | None |