npm - ai-factory - Versions diffs - 2.13.2 → 2.14.0 - Mend

ai-factory 2.13.2 → 2.14.0

Files changed (30)

package/dist/cli/wizard/skill-hints.d.ts.map +1 -1
package/dist/cli/wizard/skill-hints.js +1 -0
package/dist/cli/wizard/skill-hints.js.map +1 -1
package/package.json +2 -1
package/skills/aif/SKILL.md +23 -12
package/skills/aif/references/config-template.yaml +10 -0
package/skills/aif/references/update-config.mjs +1 -0
package/skills/aif-archive/SKILL.md +317 -0
package/skills/aif-distillation/SKILL.md +4 -2
package/skills/aif-distillation/references/LARGE-MATERIALS.md +21 -1
package/skills/aif-explore/SKILL.md +29 -4
package/skills/aif-fix/SKILL.md +23 -2
package/skills/aif-implement/SKILL.md +3 -0
package/skills/aif-improve/SKILL.md +96 -177
package/skills/aif-improve/references/CHECK-MODE.md +101 -0
package/skills/aif-improve/references/EXAMPLES.md +88 -0
package/skills/aif-improve/references/LIST-MODE.md +83 -0
package/skills/aif-improve/references/VALIDATOR.md +89 -0
package/skills/aif-plan/SKILL.md +23 -3
package/skills/aif-reference/SKILL.md +22 -3
package/skills/aif-review/SKILL.md +44 -17
package/skills/aif-review/references/CHECK-MODE.md +109 -0
package/skills/aif-review/references/SEVERITY.md +35 -0
package/skills/aif-review/references/VALIDATOR.md +103 -0
package/skills/aif-rules/SKILL.md +18 -2
package/skills/aif-security-checklist/SKILL.md +20 -3
package/skills/aif-skill-generator/SKILL.md +39 -19
package/skills/aif-skill-generator/references/SECURITY-SCANNING.md +5 -3
package/skills/aif-skill-generator/scripts/cleanup-blocked-skill.py +617 -0
package/skills/aif-verify/SKILL.md +14 -1

package/skills/aif-implement/SKILL.md CHANGED

@@ -48,6 +48,7 @@ Handoff sync is handled inline — see **Step 0.2** (after reading the plan file
 1. Read `.ai-factory/config.yaml` if it exists to resolve:
    - `paths.description`, `paths.architecture`, `paths.rules_file`, `paths.roadmap`, `paths.research`
    - `paths.plan`, `paths.plans`, `paths.fix_plan`, `paths.patches`
+   - `paths.archive`
    - `paths.rules`
    - `language.ui`, `language.artifacts`
    - `git.enabled`, `git.base_branch`, `git.create_branches`
@@ -461,6 +462,8 @@ Then continue with normal execution using the selected plan file.
 4. `paths.plan` (from `/aif-plan fast`) - fallback when no full plan exists
 5. `paths.fix_plan` - redirect to `/aif-fix` (from `/aif-fix` plan mode)
+**Note:** Plan discovery scans `paths.plans/` only. Plans archived to `paths.archive/plans/` by `/aif-archive` are excluded from discovery.
 **Read the plan file** to understand:
 - Context and settings (testing, logging preferences)

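The note added in this hunk says plan discovery scans `paths.plans/` only and that plans moved to `paths.archive/plans/` are excluded. A minimal sketch of that behavior, assuming discovery is a non-recursive listing of `*.md` files: the function names and the `.ai-factory/archive/plans` location are hypothetical, since the diff does not show the default value of `paths.archive`.

```python
import tempfile
from pathlib import Path


def discover_plans(plans_dir: Path) -> list[Path]:
    """List *.md plan files directly inside plans_dir (non-recursive)."""
    if not plans_dir.is_dir():
        return []
    return sorted(p for p in plans_dir.glob("*.md") if p.is_file())


def build_demo_tree(root: Path) -> tuple[Path, Path]:
    """Create a throwaway layout; the archive directory name is an assumption."""
    plans = root / ".ai-factory" / "plans"
    archived = root / ".ai-factory" / "archive" / "plans"
    plans.mkdir(parents=True)
    archived.mkdir(parents=True)
    (plans / "feature-user-auth.md").write_text("# active plan\n")
    (archived / "feature-old-login.md").write_text("# archived plan\n")
    return plans, archived


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        plans_dir, _ = build_demo_tree(Path(tmp))
        for plan in discover_plans(plans_dir):
            print(plan.name)
```

Because only the plans directory is listed and never walked recursively, an archived plan cannot be picked up by accident even if the archive directory were nested under it.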
package/skills/aif-improve/SKILL.md CHANGED

@@ -1,8 +1,8 @@
 ---
 name: aif-improve
-description: Refine and enhance an existing implementation plan with a second iteration. Re-analyzes the codebase, checks for gaps, missing tasks, wrong dependencies, and improves the plan quality. Use after /aif-plan to polish the plan before implementation, or to improve an existing /aif-fix plan.
-argument-hint: "[--list] [@plan-file] [improvement prompt or empty for auto-review]"
-allowed-tools: Read Write Edit Glob Grep Bash(git *) TaskCreate TaskUpdate TaskList TaskGet AskUserQuestion Questions
+description: Refine an existing implementation plan with a second iteration. Re-analyzes the codebase for gaps, missing tasks, and wrong dependencies. Use after /aif-plan or to improve an /aif-fix plan. Optional +check flag validates refinements via a fresh-context subagent.
+argument-hint: "[--list] [+check] [@plan-file] [improvement prompt or empty for auto-review]"
+allowed-tools: Read Write Edit Glob Grep Bash(git *) Task Agent TaskCreate TaskUpdate TaskList TaskGet AskUserQuestion Questions
 disable-model-invocation: false
 ---
@@ -22,11 +22,11 @@ enhanced plan with better tasks, correct dependencies, more detail
 ## Workflow
-### Step 0: Load Config & Find the Plan
+### Step 0: Load Config & Parse Arguments
 **FIRST:** Read `.ai-factory/config.yaml` if it exists to resolve:
-- **Paths:** `paths.plan`, `paths.plans`, `paths.fix_plan`, `paths.research`, `paths.description`, and `paths.patches`
-- **Language:** `language.ui` for prompts
+- **Paths:** `paths.plan`, `paths.plans`, `paths.fix_plan`, `paths.research`, `paths.description`, `paths.patches`, and `paths.archive`
+- **Language:** `language.ui` for prompts and summaries, `language.artifacts` for plan artifact updates, and `language.technical_terms` for human-readable technical terminology in plan artifacts
 - **Git:** `git.enabled`, `git.base_branch`, `git.create_branches`
 - **Workflow:** `workflow.plan_id_format` (default: `slug`) — used by branch-based plan discovery.
   Active values: `slug` and `sequential`. When `sequential`, the resolver globs
@@ -42,48 +42,44 @@ If config.yaml doesn't exist, use defaults:
 - research: `.ai-factory/RESEARCH.md`
 - patches/: `.ai-factory/patches/`
 - DESCRIPTION.md: `.ai-factory/DESCRIPTION.md`
-- Language: `en` (English)
+- `ui_language`: `en`
+- `artifact_language`: `en`
+- `technical_terms_policy`: `keep`
 - `workflow.plan_id_format`: `slug`
+Resolved language values:
+- `ui_language = language.ui || "en"`
+- `artifact_language = language.artifacts || language.ui || "en"`
+- `technical_terms_policy = language.technical_terms || "keep"`
+If `technical_terms_policy` is not one of `keep`, `translate`, or `mixed`, treat it as `keep`. Legacy values such as `english` also behave like `keep`.
+All AskUserQuestion prompts, progress updates, refinement reports, summaries, and next-step guidance MUST be written in `ui_language`.
+Any generated or updated plan artifact content under `paths.plan`, `paths.plans`, or `paths.fix_plan` MUST be written in `artifact_language`.
+Templates and examples define structure, not fixed English output. If `artifact_language` is not `en`, translate human-readable headings, labels, task prose, roadmap rationale, research summaries, improvement notes, and dependency notes before saving. Preserve markdown structure, checkbox syntax, task IDs, numeric prefixes, branch names, commit messages, commands, file paths, config keys, package names, API names, `WARN`/`INFO` labels, and raw errors unchanged. Apply `technical_terms_policy` to other human-readable terminology.
 **First parse arguments:**
 ```
 - --list    → list available plans only (read-only, then STOP)
+- +check    → after refinement, validate findings via a fresh-context subagent
 - @<path>   → explicit plan file override (highest priority)
 - remaining argument text → optional improvement prompt
 ```
-When both are present, `--list` wins and no refinement is executed.
+`+check` is orthogonal to the other flags and may appear anywhere in `$ARGUMENTS`. Strip it from the argument string before resolving `@<path>` and the improvement prompt.
+When `--list` is present, it wins and no refinement is executed. `+check` is silently ignored in `--list` mode (there is nothing to validate before refinement runs).
 ### Step 0.list: List Available Plans (`--list`)
-If `$ARGUMENTS` contains `--list`, run read-only discovery and stop.
+If `$ARGUMENTS` contains `--list`, execute the procedure in `references/LIST-MODE.md` and STOP. That document is the single source of truth for the discovery rules, output shape, and read-only contract (no refinement, no file modifications, `+check` is silently ignored). Do not duplicate its content here.
-```
-1. Get current branch:
-   git branch --show-current (git mode only)
-2. Convert branch to filename stem: replace "/" with "-" (git mode only)
-   → this is <branch-slug>
-3. Check existence of:
-   - <configured plans dir>/<branch-slug>.md (default `plan_id_format`)
-   - when `workflow.plan_id_format = sequential`: also glob
-     `<configured plans dir>/[0-9][0-9][0-9][0-9]_<branch-slug>.md`;
-     report all matches (highest-numbered first)
-   - if git mode is off or branch creation is disabled: any `*.md` full-mode plan in `<configured plans dir>/`
-     (a leading 4-digit prefix counts as a match)
-   - <resolved fast plan path>
-   - <resolved fix plan path>
-4. Print availability summary and usage hints:
-   - /aif-improve @<path> <optional prompt>
-   - /aif-improve <optional prompt>      # automatic priority
-5. If none found, suggest creating a plan via /aif-plan or /aif-fix
-6. STOP.
-```
+### Step 1: Resolve Active Plan
-**Important:** In `--list` mode:
-- Do not execute refinement
-- Do not modify files
-- Do not update TaskList/plan content
+This step runs in the default (non-`--list`) mode and picks **one** plan file for refinement using the priority chain below. The discovery-list logic for `--list` lives in `references/LIST-MODE.md` and is independent of this step.
 **Locate the active plan file using this priority:**
@@ -113,6 +109,8 @@ If `$ARGUMENTS` contains `--list`, run read-only discovery and stop.
 5. No full-mode plan and no resolved fast plan → Check the resolved fix plan path (from /aif-fix plan mode)
 ```
+**Note:** Plan discovery scans `paths.plans/` only. Plans archived to `paths.archive/plans/` by `/aif-archive` are excluded from discovery.
 **If NO plan file found at any location:**
 ```
@@ -126,11 +124,11 @@ To create a plan first, use:
 → **STOP here.** Do not proceed without a plan file.
-**If plan file found → read it and continue to Step 1.**
+**If plan file found → proceed to Step 2 (Load Context).**
-### Step 1: Load Context
+### Step 2: Load Context
-**1.1: Read the plan file**
+**2.1: Read the plan file**
 Read the found plan file completely. Understand:
 - Feature scope and goals
@@ -139,7 +137,7 @@ Read the found plan file completely. Understand:
 - Commit checkpoints
 - Which tasks are already completed (checkboxes `- [x]`)
-**1.2: Read project context**
+**2.2: Read project context**
 Read `.ai-factory/DESCRIPTION.md` (use path from config) if it exists:
 - Tech stack
@@ -149,7 +147,7 @@ Read `.ai-factory/DESCRIPTION.md` (use path from config) if it exists:
 Read `.ai-factory/RESEARCH.md` (use path from config) if it exists and is relevant to the plan being refined.
-**1.3: Read patches (limited fallback)**
+**2.3: Read patches (limited fallback)**
 Use patches as fallback context, not the default source:
@@ -181,7 +179,7 @@ codebase conventions, and tech-stack analysis. These rules are tailored to the c
 **Enforcement:** After generating any output artifact, verify it against all skill-context rules.
 If any rule is violated — fix the output before presenting it to the user.
-**1.4: Load current task list**
+**2.4: Load current task list**
 ```
 TaskList → Get all tasks with statuses
@@ -189,11 +187,11 @@ TaskList → Get all tasks with statuses
 Understand what's already been created, what's in progress, what's completed.
-### Step 2: Deep Codebase Analysis
+### Step 3: Deep Codebase Analysis
 Now do a **deeper** codebase exploration than what `/aif-plan` did initially:
-**2.1: Trace through existing code paths**
+**3.1: Trace through existing code paths**
 For each task in the plan, find the relevant files:
 ```
@@ -207,7 +205,7 @@ Look for:
 - Hidden dependencies the plan missed
 - Shared utilities or services the plan should use instead of creating new ones
-**2.2: Check for integration points**
+**3.2: Check for integration points**
 Look for things the plan might have missed:
 - API routes that need updating
@@ -217,7 +215,7 @@ Look for things the plan might have missed:
 - Middleware or guards that apply
 - Existing validation patterns
-**2.3: Check for edge cases**
+**3.3: Check for edge cases**
 Based on the tech stack and codebase:
 - Error handling patterns used in the project
@@ -226,45 +224,62 @@ Based on the tech stack and codebase:
 - Rate limiting, caching considerations
 - Data validation at boundaries
-### Step 3: Identify Improvements
+### Step 4: Identify Improvements
 Compare the plan against what you found. Categorize issues:
-**3.1: Missing tasks**
+**4.1: Missing tasks**
 - Tasks that should exist but don't (e.g., migration, config update, index creation)
 - Tasks for edge cases not covered
-**3.2: Task quality issues**
+**4.2: Task quality issues**
 - Descriptions too vague (no file paths, no specific implementation details)
 - Missing logging requirements
 - Missing error handling details
 - Incorrect file paths
-**3.3: Dependency issues**
+**4.3: Dependency issues**
 - Wrong task order (task A depends on B but B comes after A)
 - Missing dependencies (task C needs task A's output but isn't blocked by it)
 - Unnecessary dependencies (tasks could run in parallel)
-**3.4: Redundant or duplicate tasks**
+**4.4: Redundant or duplicate tasks**
 - Two tasks doing the same thing
 - Task that's unnecessary because the code already exists
 - Task that duplicates existing functionality
-**3.5: Scope issues**
+**4.5: Task size issues**
 - Tasks too large (should be split)
 - Tasks too small (should be merged)
-- Tasks outside the feature scope (gold-plating)
+- Split/merge findings go into the "📝 Task Improvements" report section (`improvements` group, alongside 4.2) — they restructure existing tasks rather than add or remove them.
-**3.6: User-prompted improvements (if $ARGUMENTS provided)**
+**4.6: Out-of-scope tasks**
+- Tasks already in the plan that look useful in themselves but are unrelated to the feature this plan is about (gold-plating)
+- On approval these are removed from the active plan — the same drop action as `removals` (see Step 6.4). The difference is the report only: an out-of-scope task goes to its own "💡 Out of scope" section instead of being lumped into "🗑️ Removals", so the user sees a useful-but-unrelated idea before it is dropped and can choose to capture it elsewhere. The skill itself does not persist out-of-scope items anywhere.
-If the user provided specific improvement instructions in `$ARGUMENTS` (excluding `--list` and `@<path>` tokens):
+**4.7: User-prompted improvements (if $ARGUMENTS provided)**
+If the user provided specific improvement instructions in `$ARGUMENTS` (excluding `--list`, `+check`, and `@<path>` tokens):
 - Apply the user's feedback to the plan
 - Look for tasks that need modification based on the prompt
 - Add new tasks if the user's prompt requires them
-### Step 4: Present Improvements
+This is a dispatcher step, not a separate finding category. Each finding it produces is routed to its natural group based on its nature: a new task goes to 4.1 (`missing`), a rewording or expansion of an existing task goes to 4.2 (`improvements`), an explicit removal request goes to 4.4 (`removals`), and a "useful-but-out-of-scope" idea goes to 4.6 (`out_of_scope`). There is no separate 4.7 group in the Step 5 report or in `+check` validation.
+### Optional: `+check` validation between Step 4 and Step 5
+When the `+check` flag is set (and `--list` is not), run the validation procedure from `references/CHECK-MODE.md` here, between Step 4 and Step 5. It re-reads cited files via a fresh-context subagent, then drops invented items, rewrites partially-correct ones, and recomputes dependencies on the filtered list. Without `+check`, skip this entirely — the output has no validator-related lines and the Summary block stays in its default shape without the two `+check` counter rows.
+### Step 5: Present Improvements
+Show the user what you found in a clear format. The emoji-grouped sections are kept for scannability, but the items inside "🆕 Missing Tasks", "📝 Task Improvements", "🗑️ Removals", and "💡 Out of scope" all follow the same prose shape — no labeled `Why:` / `Issue:` / `Fix:` fields:
-Show the user what you found in a clear format:
+1. **Behavioral impact** — what breaks or becomes harder if the plan stays as-is (missing capability, vague task that will be misimplemented, redundant task that wastes effort).
+2. **Optional note** — short citation from the codebase, an existing pattern the plan should match, or a consequence. Include only when it adds signal.
+3. **Plan anchor** — `Task #X` reference (or "after Task #X" for new tasks).
+4. **Suggested edit** — concrete change: what to add / how to reword / what to remove.
+The "🔗 Dependency Fixes" group is **not** restated in this shape — it is always computed after the four other groups (and after `+check` filtering when the flag is set, see `references/CHECK-MODE.md`) and uses the short legacy form: `Task #X should depend on Task #Y. Reason: …`. The dependency entries reference only tasks that survived filtering.
 ```
 ## Plan Refinement Report
@@ -275,35 +290,28 @@ Tasks analyzed: N
 ### Findings
 #### 🆕 Missing Tasks (N found)
-1. **[New task subject]**
-   Why: [reason this task is needed]
-   After: Task #X (dependency)
-2. **[New task subject]**
-   Why: [reason]
+1. The plan currently leaves authenticated requests without a session refresh step — long-running clients silently lose access after the access-token TTL. The existing middleware in `src/middleware/auth.ts` already exposes a `refresh()` hook, so the plan should reuse it instead of inventing a new one. After Task #3. Add a new task: "Wire `authMiddleware.refresh()` into the login flow and cover the expired-token path with an explicit test."
 #### 📝 Task Improvements (N found)
-1. **Task #X: [subject]**
-   Issue: [what's wrong]
-   Fix: [what should change]
-2. **Task #Y: [subject]**
-   Issue: [what's wrong]
-   Fix: [what should change]
+1. Task #4 ("Add validation") gives no field-by-field contract — implementer will either over-validate or skip the email format check that the rest of the codebase enforces via `validators/email.ts`. Task #4. Rewrite as: "Validate `email` (via `validators/email.ts`), `password` (min 12 chars), and `displayName` (1-64 chars) in `RegisterRequest`; return 422 with field-level errors when validation fails."
 #### 🔗 Dependency Fixes (N found)
-1. Task #X should depend on Task #Y
-   Reason: [why]
+1. Task #5 should depend on Task #2. Reason: Task #5 consumes the session helper introduced in Task #2.
 #### 🗑️ Removals (N found)
-1. **Task #X: [subject]**
-   Reason: [why it's redundant/unnecessary]
+1. Task #7 ("Create UserRepository") duplicates `src/repos/user.ts:12` which already exposes the same query surface — keeping the task will lead to a parallel implementation. Task #7. Remove the task; rely on the existing repository and adjust Task #8 to import it.
+#### 💡 Out of scope — for later (N found)
+1. Task #11 ("Refactor the logging module") looks reasonable on its own but is unrelated to the login feature this plan is about — keeping it expands scope without any concrete trigger from the current code paths. Task #11. Drop it from the active plan; the idea is surfaced here so you can capture it elsewhere (issue tracker, backlog note) if it's worth revisiting as its own feature later.
 #### 📋 Summary
 - Missing tasks: N
 - Tasks to improve: N
 - Dependencies to fix: N
 - Tasks to remove: N
+- Out of scope: N
+When `+check` ran successfully, two extra rows (`Hidden by +check: N`, `Adjusted by +check: M`) are appended to the Summary block — the exact wording and failure-mode replacements live in `references/CHECK-MODE.md`.
 AskUserQuestion: Apply these improvements?
@@ -320,6 +328,8 @@ Options:
 **If no improvements found:**
+The completion templates below define structure only. Render all human-readable text in these user-facing responses in `ui_language`. Preserve command names, paths, task counts, and numeric counts unchanged.
 ```
 ## Plan Review Complete
@@ -332,11 +342,11 @@ Ready to implement:
 /aif-implement
 ```
-### Step 5: Apply Approved Improvements
+### Step 6: Apply Approved Improvements
 Based on user's choice:
-**5.1: Apply task improvements**
+**6.1: Apply task improvements**
 For existing tasks that need better descriptions:
 ```
@@ -344,7 +354,7 @@ TaskGet(taskId) → read current
 TaskUpdate(taskId, description: "improved description", subject: "improved subject")
 ```
-**5.2: Add missing tasks**
+**6.2: Add missing tasks**
 For new tasks:
 ```
@@ -352,19 +362,23 @@ TaskCreate(subject, description, activeForm)
 TaskUpdate(taskId, addBlockedBy: [...]) → set dependencies
 ```
-**5.3: Fix dependencies**
+**6.3: Fix dependencies**
 ```
 TaskUpdate(taskId, addBlockedBy: [...])
 ```
-**5.4: Remove redundant tasks**
+**6.4: Remove redundant or out-of-scope tasks**
+Both `removals` and `out_of_scope` translate to the same plan-file action — drop the task:
 ```
 TaskUpdate(taskId, status: "deleted")
 ```
-**5.5: Update the plan file**
+The difference between the two is the report only. `removals` are dead-weight duplicates: mentioned once and forgotten. `out_of_scope` items appear in the "💡 Out of scope" section so the user sees the idea was noticed and consciously dropped from this plan, not removed without a trace. The skill does not persist out-of-scope tasks anywhere — capturing the idea elsewhere (issue tracker, backlog) is the user's call.
+**6.5: Update the plan file**
 **CRITICAL:** After all changes, update the plan file to reflect the new state:
@@ -377,13 +391,15 @@ TaskUpdate(taskId, status: "deleted")
 Use `Edit` to make surgical changes to the plan file, or `Write` to regenerate it if changes are extensive.
+When editing or regenerating the plan file, keep all human-readable artifact content in `artifact_language`; the examples above are structural only. Preserve completed `- [x]` checkboxes exactly.
 **Filename invariant:** when the existing plan filename matches the sequential
 pattern `^[0-9]{4}_.*\.md$` (e.g. `0042_feature-user-auth.md`), preserve the
 exact numeric prefix on rewrite. Never renumber a plan during an improve pass —
 the prefix is permanent and must survive any regeneration. Write back to the
 same absolute path you read from.
-**5.6: Confirm completion**
+**6.6: Confirm completion**
 ```
 ## Plan Refined
@@ -417,108 +433,11 @@ Suggest the user to free up context space if needed: `/clear` (full reset) or `/
 2. **Preserve completed work** — never modify or remove `- [x]` completed tasks
 3. **Traceable improvements** — every change must be justified by codebase analysis or user input
 4. **Respect settings** — if testing is "no", don't add test tasks. If logging is "minimal", don't add verbose logging tasks
-5. **No gold-plating** — don't add tasks outside the feature scope unless critical
+5. **No gold-plating** — don't propose adding tasks outside the feature scope unless critical. When you find a task already in the plan that drifts outside scope, route it to the "💡 Out of scope" report section, not to "🗑️ Removals" — the user should see useful-but-not-here ideas separately from dead-weight duplicates.
 6. **Minimal viable improvements** — suggest only what matters, not every possible enhancement
 7. **User approves first** — never apply changes without user confirmation
 8. **Keep plan file in sync** — the plan file MUST match the task list after improvements
 ## Examples
-### Example 1: Auto-review (no arguments)
-```
-User: /aif-improve
-→ Found plan: .ai-factory/plans/feature-user-auth.md
-→ 6 tasks in plan
-→ Deep codebase analysis...
-→ Found: project uses middleware pattern for auth, plan misses middleware task
-→ Found: Task #3 description doesn't mention existing UserService
-→ Found: Task #5 depends on Task #3 but no dependency set
-Report:
-- 1 missing task (auth middleware)
-- 1 task to improve (reference UserService)
-- 1 dependency to fix
-Apply? → Yes → Changes applied
-```
-### Example 2: With user prompt
-```
-User: /aif-improve добавь обработку ошибок и валидацию входных данных
-→ Found plan: <resolved fast plan path>
-→ 4 tasks in plan
-→ User wants: error handling + input validation
-→ Analyzing each task for missing error handling...
-→ Found: none of the tasks mention input validation
-→ Found: error handling is inconsistent
-Report:
-- 2 tasks improved (added validation details to descriptions)
-- 1 new task (create shared validation utils)
-- Updated task descriptions with error handling patterns from codebase
-Apply? → Yes → Changes applied
-```
-### Example 3: No plan found
-```
-User: /aif-improve
-→ Branch: <current-branch-or-empty>
-→ No matching branch-based full plan found
-→ No resolved fast plan found
-→ No resolved fix plan found
-→ No plan file found
-"No active plan found. Create one first:
-- /aif-plan full <description>
-- /aif-plan fast <description>
-- /aif-fix <bug description>"
-```
-### Example 4: Explicit plan file
-```
-User: /aif-improve @my-custom-plan.md add rollback and edge-case handling
-→ Explicit plan override: my-custom-plan.md
-→ Found plan: my-custom-plan.md
-→ User wants: rollback + edge-case handling
-→ Deep codebase analysis...
-→ Report prepared
-```
-### Example 5: List mode
-```
-User: /aif-improve --list
-## Available Plans
-Current branch: feature/user-auth
-- [x] .ai-factory/plans/feature-user-auth.md
-- [ ] <resolved fast plan path>
-- [x] <resolved fix plan path>
-Use:
-- /aif-improve @.ai-factory/plans/feature-user-auth.md
-- /aif-improve add validation and retries
-```
-### Example 6: Plan already looks good
-```
-User: /aif-improve
-→ Found plan: .ai-factory/plans/feature-product-search.md
-→ 5 tasks in plan
-→ Deep analysis... all tasks well-defined, dependencies correct
-→ No significant improvements found
-"Plan looks solid! Ready to implement:
-/aif-implement"
-```
+Worked examples for the default, prompt-driven, no-plan, explicit-plan-file, and "plan looks solid" flows live in `references/EXAMPLES.md`. The `--list` mode example lives in `references/LIST-MODE.md`; the `+check` mode example lives in `references/CHECK-MODE.md`.
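The "Resolved language values" rules added to Step 0 in this file reduce to three fallbacks plus one normalization. A rough sketch, assuming `.ai-factory/config.yaml` has already been parsed into a plain dictionary; `resolve_language` and `VALID_POLICIES` are hypothetical names, not part of the package.

```python
VALID_POLICIES = {"keep", "translate", "mixed"}


def resolve_language(config: dict) -> dict:
    """Apply the documented fallbacks to the `language` section of the config."""
    language = config.get("language") or {}
    ui_language = language.get("ui") or "en"
    # Artifacts fall back to the UI language before falling back to English.
    artifact_language = language.get("artifacts") or language.get("ui") or "en"
    policy = language.get("technical_terms") or "keep"
    # Unknown or legacy values such as "english" behave like "keep".
    if policy not in VALID_POLICIES:
        policy = "keep"
    return {
        "ui_language": ui_language,
        "artifact_language": artifact_language,
        "technical_terms_policy": policy,
    }
```

With only `language.ui` set, both prompts and plan artifacts resolve to that language, which matches the `language.artifacts || language.ui || "en"` chain in the diff.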

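The argument rules in the aif-improve diff above (`--list` wins, `+check` may appear anywhere and is stripped first, `@<path>` overrides plan discovery, the rest is the improvement prompt) can be sketched as a small parser. This is an illustration under assumptions: tokens are split on whitespace, so paths containing spaces are not handled, and only the first `@` token is treated as the plan path. `ImproveArgs` and `parse_improve_args` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImproveArgs:
    list_mode: bool
    check: bool
    plan_path: Optional[str]
    prompt: str


def parse_improve_args(arguments: str) -> ImproveArgs:
    """Split $ARGUMENTS into flags, an optional @path, and the prompt text."""
    tokens = arguments.split()
    list_mode = "--list" in tokens
    check_requested = "+check" in tokens
    # Strip both flags before resolving @<path> and the improvement prompt.
    rest = [t for t in tokens if t not in ("--list", "+check")]

    plan_path = None
    prompt_tokens = []
    for token in rest:
        if token.startswith("@") and len(token) > 1 and plan_path is None:
            plan_path = token[1:]
        else:
            prompt_tokens.append(token)

    return ImproveArgs(
        list_mode=list_mode,
        # +check is silently ignored in --list mode.
        check=check_requested and not list_mode,
        plan_path=plan_path,
        prompt=" ".join(prompt_tokens),
    )
```

For example, `parse_improve_args("+check @my-custom-plan.md add rollback")` yields a plan path of `my-custom-plan.md`, the prompt `add rollback`, and `check` set to true.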
package/skills/aif-improve/references/CHECK-MODE.md ADDED

@@ -0,0 +1,101 @@
+# `+check` validation procedure
+This file describes the optional findings-validation pass that runs when `aif-improve` is invoked with the `+check` flag. The parent skill defers to this document so the main `SKILL.md` stays focused on the default refinement workflow; `+check` is opt-in and most invocations do not need it.
+## When this runs
+`aif-improve` is invoked with `+check` and **without** `--list`. The pass executes between Step 4 (Identify Improvements) and Step 5 (Present Improvements). Without `+check`, skip this procedure entirely — there are no validator-related lines in the output and the Step 5 Summary block stays in its default shape without the two `+check` counter rows.
+`+check` together with `--list` is silently ignored (no refinement to validate).
+## Procedure
+The validation pass has two sequential phases.
+### Phase (a) — validate the four findings groups
+1. Collect items from the four findings groups built in Step 4: `missing` (4.1), `improvements` (4.2 + 4.5 — everything that rewords or expands existing tasks), `removals` (4.4 — redundant/duplicate tasks dropped without trace), and `out_of_scope` (4.6 — useful-but-unrelated tasks routed to the "💡 Out of scope" report section). Number them across all four groups in display order — group label is carried alongside each item. **If the combined list is empty, skip steps 2–5 of phase (a) entirely**: do not dispatch the validator, treat phase (a) as successful with `hidden = 0`, `adjusted = 0`, and proceed directly to phase (b) (Dependency Fixes still get recomputed normally).
+2. Build the project context block: working directory path, optional excerpt from `.ai-factory/DESCRIPTION.md`, a one-line summary of the plan being refined (plan path + task count), and the user's improvement prompt parsed in Step 0 — verbatim when the run had one, or the literal marker `none — bare auto-review` when `$ARGUMENTS` carried no prompt text. The validator needs the prompt to tell a user-requested task apart from agent-invented gold-plating.
+3. Read `references/VALIDATOR.md`. The reference declares two substitution slots at the top of the file — one for the project context block from step 2 and one for the items list from step 1 (each under its own `### Item N (group: …)` heading). Replace both before dispatch; the exact placeholder tokens are listed in the VALIDATOR.md header.
+4. Dispatch one call: `Task(subagent_type: general-purpose, prompt: <rendered template>)`. The subagent runs with fresh context. Read-only behavior (no writes, no commands) is enforced by the prompt inside `references/VALIDATOR.md`, not by the dispatch interface — `general-purpose` exposes the full tool set, so a tool-level restriction is not available.
+5. Parse the response by `### Item N` headings. The group of each item is always its **original** group from step 1 — the validator is forbidden by `references/VALIDATOR.md` from changing it. The `Group:` line in the response is an integrity check, not a control field: if its value differs from the original group, treat the whole item block as malformed (see failure modes below). For each well-formed item:
+   - `Verdict: keep` → keep the item unchanged in its original group.
+   - `Verdict: modify` → replace the item text with `Modified-text`, put it back in its original group. Increment `adjusted`.
+   - `Verdict: drop` → remove the item from the output. Increment `hidden`.
+### Phase (b) — recompute dependencies on the filtered list
+After phase (a) finishes, the main skill (not the validator) recomputes the 🔗 Dependency Fixes group against the **post-(a) plan state**:
+- start from the original plan tasks,
+- add tasks introduced by `missing.keep` and `missing.modify` (these are confirmed new tasks),
+- remove tasks targeted by `removals.keep`, `removals.modify`, `out_of_scope.keep`, and `out_of_scope.modify` (the validator confirmed the proposal to drop the task from the plan),
+- tasks rescued by `removals.drop` or `out_of_scope.drop` stay in the plan — the validator overruled the proposal — and remain valid dependency targets,
+- `improvements` only reword existing tasks; they never add or remove anything from the graph.
+Any dependency that points at a task absent from the post-(a) plan is discarded. Dependencies are NOT sent to the validator — the legacy short form (`Task #X should depend on Task #Y. Reason: …`) is preserved and the counters from phase (a) do not include this group.
+## Failure modes
+- **Per-item malformed response** (heading missing, no `Verdict` line, unknown verdict token, missing `Modified-text` line when `Verdict` is `modify`, or `Group:` value that differs from the item's original group): treat that item as `keep` and append one extra line at the very end of the Step 5 output: `WARN [+check]: validator response for item N was malformed, kept as-is`. Continue with the remaining items.
+- **Whole-dispatch failure** (empty response, exception, timeout, validator refusal): treat **all** items in phase (a) as `keep`, skip the `Hidden by +check` / `Adjusted by +check` Summary rows, and append one line at the end of Step 5: `WARN [+check]: validator failed (<reason>), all items kept as-is`. Phase (b) still runs against the unfiltered list — dependencies are recomputed normally.
+## Output additions
+When phase (a) ran successfully (no whole-dispatch failure), the Step 5 Summary block gains two extra rows at the end:
+```
+- Hidden by +check: N
+- Adjusted by +check: M
+```
+The counters cover the four validated groups (`missing`, `improvements`, `removals`, `out_of_scope`) — `Dependencies to fix` is computed after validation and is not part of the counters. Skip both rows entirely when `+check` was not set, when the whole-dispatch failure path applies (the single `WARN [+check]` line replaces them), or when Step 5 takes the no-improvements branch (the "Plan Review Complete" / "Plan looks solid" path has no Summary block to extend).
+## Examples
+### Success
+```
+User: /aif-improve +check
+→ Found plan: .ai-factory/plans/feature-user-auth.md
+→ Step 4 produced 4 missing, 3 improvements, 2 removals, 1 out_of_scope
+→ +check validator dispatched (see procedure above)
+→ Validator returned: 7 keep, 2 modify, 1 drop
+→ Dependencies recomputed against the post-(a) plan state
+Step 5 report:
+- Missing tasks: 3
+- Tasks to improve: 3
+- Dependencies to fix: 2
+- Tasks to remove: 2
+- Out of scope: 1
+- Hidden by +check: 1
+- Adjusted by +check: 2
+Apply? → Yes → Changes applied
+```
+### Whole-dispatch failure
+```
+User: /aif-improve +check
+→ Found plan: .ai-factory/plans/feature-user-auth.md
+→ Step 4 produced 4 missing, 3 improvements, 2 removals, 1 out_of_scope
+→ +check validator dispatched
+→ Validator failed (empty response)
+→ Phase (a): all items treated as keep (no Hidden/Adjusted counters emitted)
+→ Phase (b): dependencies still recomputed normally against the unfiltered list
+Step 5 report (original counters, no +check rows appended):
+- Missing tasks: 4
+- Tasks to improve: 3
+- Dependencies to fix: 2
+- Tasks to remove: 2
+- Out of scope: 1
+WARN [+check]: validator failed (empty response), all items kept as-is
+Apply? → Yes → Changes applied
+```
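Step 5 of phase (a) and the per-item failure mode described in CHECK-MODE.md can be sketched as follows. This assumes the validator response has already been parsed into one record per `### Item N` block; the actual response format lives in `references/VALIDATOR.md`, which this diff does not show, so `Finding`, `Verdict`, and `apply_verdicts` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

VERDICTS = ("keep", "modify", "drop")


@dataclass
class Finding:
    number: int
    group: str  # missing, improvements, removals, or out_of_scope
    text: str


@dataclass
class Verdict:
    group: str
    verdict: str
    modified_text: Optional[str] = None


def apply_verdicts(findings, verdicts):
    """Return (kept findings, hidden count, adjusted count, warning lines)."""
    kept, warnings = [], []
    hidden = adjusted = 0
    for item in findings:
        v = verdicts.get(item.number)
        malformed = (
            v is None
            or v.group != item.group  # Group: is an integrity check only
            or v.verdict not in VERDICTS
            or (v.verdict == "modify" and not v.modified_text)
        )
        if malformed:
            # Per-item failure mode: keep the item and emit one WARN line.
            kept.append(item)
            warnings.append(
                f"WARN [+check]: validator response for item {item.number} "
                "was malformed, kept as-is"
            )
        elif v.verdict == "drop":
            hidden += 1
        elif v.verdict == "modify":
            kept.append(Finding(item.number, item.group, v.modified_text))
            adjusted += 1
        else:
            kept.append(item)
    return kept, hidden, adjusted, warnings
```

Every item stays in its original group regardless of the verdict, and the two counters map onto the `Hidden by +check` and `Adjusted by +check` Summary rows. The whole-dispatch failure path and the phase (b) dependency recompute are left out of this sketch.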