npm - all-for-claudecode - Versions diffs - 2.9.1 → 2.11.0 - Mend

all-for-claudecode 2.9.1 → 2.11.0

Files changed (42)

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +2 -2
package/MIGRATION.md +1 -1
package/README.md +4 -0
package/package.json +2 -2
package/schemas/plugin.schema.json +2 -2
package/scripts/afc-consistency-check.sh +25 -23
package/scripts/afc-doctor.sh +10 -10
package/scripts/afc-qa-audit.sh +1 -1
package/scripts/afc-sync-cache.sh +1 -1
package/{commands/analyze.md → skills/analyze/SKILL.md} +10 -8
package/{commands/architect.md → skills/architect/SKILL.md} +3 -3
package/skills/auto/SKILL.md +1156 -0
package/{commands/clarify.md → skills/clarify/SKILL.md} +4 -3
package/{commands/clean.md → skills/clean/SKILL.md} +14 -13
package/{commands/consult.md → skills/consult/SKILL.md} +19 -18
package/{commands/implement.md → skills/implement/SKILL.md} +28 -15
package/{commands/init.md → skills/init/SKILL.md} +5 -1
package/{commands/learner.md → skills/learner/SKILL.md} +4 -4
package/{commands/plan.md → skills/plan/SKILL.md} +1 -122
package/skills/plan/plan-template.md +118 -0
package/{commands/pr-comment.md → skills/pr-comment/SKILL.md} +4 -4
package/{commands/principles.md → skills/principles/SKILL.md} +1 -1
package/{commands/qa.md → skills/qa/SKILL.md} +2 -2
package/{commands/release-notes.md → skills/release-notes/SKILL.md} +8 -4
package/{commands/review.md → skills/review/SKILL.md} +11 -11
package/{commands/security.md → skills/security/SKILL.md} +19 -4
package/{commands/spec.md → skills/spec/SKILL.md} +2 -77
package/skills/spec/spec-template.md +72 -0
package/{commands/triage.md → skills/triage/SKILL.md} +6 -7
package/commands/auto.md +0 -585
package/{commands/checkpoint.md → skills/checkpoint/SKILL.md} +0 -0
package/{commands/debug.md → skills/debug/SKILL.md} +0 -0
package/{commands/doctor.md → skills/doctor/SKILL.md} +0 -0
package/{commands/ideate.md → skills/ideate/SKILL.md} +0 -0
package/{commands/launch.md → skills/launch/SKILL.md} +0 -0
package/{commands/research.md → skills/research/SKILL.md} +0 -0
package/{commands/resume.md → skills/resume/SKILL.md} +0 -0
package/{docs → skills/spec}/nfr-templates.md +0 -0
package/{commands/tasks.md → skills/tasks/SKILL.md} +0 -0
package/{commands/test.md → skills/test/SKILL.md} +0 -0
package/{commands/validate.md → skills/validate/SKILL.md} +0 -0

package/{commands/clarify.md → skills/clarify/SKILL.md} RENAMED

@@ -50,10 +50,11 @@ Scan across 10 categories:
 | 9 | Completion criteria | Success criteria that cannot be measured |
 | 10 | Residual placeholders | TODO/TBD/??? |
+These categories serve as a comprehensive checklist, not a rigid classification. Adapt to the project's domain — skip categories irrelevant to the project type (e.g., skip 'UX flow' for CLI tools) and add domain-specific categories if needed (e.g., 'regulatory compliance' for healthcare/fintech projects).
 ### 3. Generate and Present Questions
-- Generate at most **5** questions
-- Priority: scope > security/privacy > UX > technical
+- Generate questions ranked by their impact on spec quality — how much would the answer change the spec's direction or completeness? Present the most impactful questions first. The number of questions should match the actual ambiguity level: deeply ambiguous specs may need more questions, while mostly-clear specs need fewer. Do not artificially cap at a fixed number, but keep the set focused and avoid overwhelming the user (aim for the minimum needed to resolve critical ambiguities).
 - Present **one at a time** via AskUserQuestion:
   - Use multiple choice when possible (2-4 options)
   - Include the meaning/impact of each option
@@ -79,7 +80,7 @@ Clarification complete
 ## Notes
-- **5-question limit**: If more than 5 questions arise, select only the most important. Resolve the rest during the plan phase.
+- **Question focus**: Ask only what is needed to resolve critical ambiguities. Defer lower-priority questions to the plan phase rather than overwhelming the user.
 - **Modify spec only**: Do not touch plan.md or tasks.md.
 - **Avoid redundancy**: Do not ask about items already clearly stated in spec.
 - **If `$ARGUMENTS` is provided**: Focus the scan on that area.

package/{commands/clean.md → skills/clean/SKILL.md} RENAMED

@@ -60,23 +60,24 @@ Set `PIPELINE_ARTIFACT_DIR` = `.claude/afc/specs/{feature}/`
 - **If retrospective.md exists** -> record as patterns missed by the Plan phase Critic Loop in `.claude/afc/memory/retrospectives/` (reuse as RISK checklist items in future runs)
 - **If review-report.md exists** -> copy to `.claude/afc/memory/reviews/{feature}-{date}.md` before .claude/afc/specs/ deletion
 - **If research.md exists** and was not already persisted in Plan phase -> copy to `.claude/afc/memory/research/{feature}.md`
-- **Agent memory consolidation**: check each agent's MEMORY.md line count -- if either exceeds 100 lines, invoke the respective agent to self-prune:
+- **Agent memory consolidation**: Check each agent's MEMORY.md for bloat — if it contains redundant, obsolete, or superseded entries that reduce signal-to-noise ratio, invoke the agent to self-prune:
   ```
   Task("Memory cleanup: afc-architect", subagent_type: "afc:afc-architect",
-    prompt: "Your MEMORY.md exceeds 100 lines. Read it, prune old/redundant entries, and rewrite to under 100 lines following your size limit rules.")
+    prompt: "Review your MEMORY.md. Read it, identify and prune old/redundant/obsolete entries, and rewrite it keeping only entries that are still relevant and non-overlapping.")
   ```
-  (Same pattern for afc-security if needed. Skip if both are under 100 lines.)
-- **Memory rotation**: for each memory subdirectory, check file count and prune oldest files if over threshold:
-  | Directory | Threshold | Action |
-  |-----------|-----------|--------|
-  | `quality-history/` | 30 files | Delete oldest files beyond threshold |
-  | `reviews/` | 40 files | Delete oldest files beyond threshold |
-  | `retrospectives/` | 30 files | Delete oldest files beyond threshold |
-  | `research/` | 50 files | Delete oldest files beyond threshold |
-  | `decisions/` | 60 files | Delete oldest files beyond threshold |
-  - Sort by filename ascending (oldest first), delete excess
+  Use semantic assessment (are entries still relevant? do entries overlap?) rather than a line-count threshold. (Same pattern for afc-security if needed.)
+- **Memory rotation**: For each memory subdirectory, assess whether the oldest files still provide value. Prune files that are superseded by newer entries, reference features/code that no longer exists, or overlap with other files. As a practical guideline, keep the most recent and relevant entries — if a directory has grown large enough that scanning it would be slow (roughly 30+ files), prioritize pruning the least relevant entries:
+  | Directory | Pruning Intent | Soft Guideline |
+  |-----------|---------------|----------------|
+  | `quality-history/` | Remove superseded or redundant quality records | ~30 files |
+  | `reviews/` | Remove reviews for features no longer in the codebase | ~40 files |
+  | `retrospectives/` | Remove retrospectives whose learnings are already captured elsewhere | ~30 files |
+  | `research/` | Remove research for libraries/patterns no longer used | ~50 files |
+  | `decisions/` | Remove decisions that have been reversed or are no longer relevant | ~60 files |
+  - These numbers are soft guidelines, not hard cutoffs — use judgment based on relevance
+  - Sort by filename ascending (oldest first) when pruning by recency
   - Log: `"Memory rotation: {dir} pruned {N} files"`
-  - Skip directories that do not exist or are under threshold
+  - Skip directories that do not exist or clearly do not need pruning
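
The new rotation rule can be pictured as a two-step process: a mechanical pass lists the oldest files beyond each soft guideline, and a relevance review then decides what is actually pruned. The following is a minimal Python sketch of the first step only, assuming the directory names and soft guidelines from the table above; the function name `rotation_candidates` and the `SOFT_GUIDELINES` mapping are hypothetical and not part of the package.

```python
from pathlib import Path

# Soft guidelines copied from the table above; they are hints, not hard cutoffs.
SOFT_GUIDELINES = {
    "quality-history": 30,
    "reviews": 40,
    "retrospectives": 30,
    "research": 50,
    "decisions": 60,
}


def rotation_candidates(memory_dir: Path) -> dict[str, list[str]]:
    """List the oldest files beyond each soft guideline. Nothing is deleted here."""
    candidates: dict[str, list[str]] = {}
    for name, guideline in SOFT_GUIDELINES.items():
        directory = memory_dir / name
        if not directory.is_dir():
            continue  # skip directories that do not exist
        # Filename ascending is treated as oldest first, as in the text above.
        files = sorted(p.name for p in directory.iterdir() if p.is_file())
        excess = len(files) - guideline
        if excess > 0:
            candidates[name] = files[:excess]
    return candidates
```

The returned names are only candidates for review; whether a file is superseded or still relevant remains a judgment call, which this sketch does not attempt to automate.
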
 ### 6. Quality Report

package/{commands/consult.md → skills/consult/SKILL.md} RENAMED

@@ -33,24 +33,25 @@ If `$ARGUMENTS` is empty → go to Step 2 (domain selection).
 **A. Explicit domain provided** → use it directly.
-**B. No domain, but question provided** → keyword matching:
-| Domain | Keywords |
-|--------|----------|
-| backend | API, database, schema, query, server, auth, JWT, REST, GraphQL, ORM, migration, endpoint, middleware, validation, session, cookie, token |
-| infra | deploy, Docker, CI/CD, cloud, monitoring, k8s, pipeline, Kubernetes, terraform, AWS, GCP, Azure, nginx, SSL, DNS, CDN, container, scaling |
-| pm | feature, user story, priority, roadmap, PRD, MVP, backlog, metric, KPI, retention, churn, persona, requirement, scope |
-| design | UI, UX, accessibility, component, layout, color, animation, responsive, wireframe, prototype, typography, spacing, contrast, WCAG |
-| marketing | SEO, analytics, content, growth, conversion, funnel, GA4, acquisition, retention, landing page, Open Graph, meta tag, social media |
-| legal | GDPR, CCPA, privacy, cookie, consent, license, GPL, MIT, compliance, terms of service, data protection, PII, HIPAA, regulation, policy |
-| security | XSS, CSRF, injection, OWASP, vulnerability, attack, exploit, encryption, secret, credential, CORS, CSP, rate limit, brute force, penetration |
-| advisor | library, framework, stack, tool, package, which to use, alternative, compare, choose, select, recommend, what exists, ecosystem, best option, switch to |
-| peer | think together, brainstorm, discuss, explore idea, talk through, figure out, pros and cons, what if, should I, direction, approach, trade-off, opinion, weigh options |
-Match rules:
-- Case-insensitive keyword matching against the question
-- If multiple domains match: pick the one with the most keyword hits
-- If tie: pick the first domain in the table order above
+**B. No domain, but question provided** → intent-based evaluation:
+Read the user's question and determine which domain's expertise would provide the most value. Consider the actual intent, not keyword presence.
+| Domain | When to route |
+|--------|---------------|
+| backend | User needs help with server-side logic, data modeling, API design, authentication flows, database decisions, or how application code processes and stores data |
+| infra | User needs help with how the application is deployed, operated, or monitored — infrastructure topology, CI/CD pipelines, cloud services, scaling, reliability |
+| pm | User needs help with product decisions: what to build, for whom, when, how to measure success, how to prioritize competing features, or how to define scope |
+| design | User needs help with how something looks or feels to a user — visual hierarchy, interaction patterns, accessibility, component design, or user flow |
+| marketing | User needs help reaching or retaining users outside the product: SEO, content strategy, acquisition funnels, analytics tracking, or growth tactics |
+| legal | User needs help understanding regulatory obligations, license compatibility, privacy requirements, or the legal implications of a design or data practice |
+| security | User needs help identifying or mitigating threats, vulnerabilities, or attack surfaces — secure coding, threat modeling, or compliance with security standards |
+| advisor | User is choosing between technologies, frameworks, libraries, or architectural approaches and wants an informed recommendation with trade-off analysis |
+| peer | User wants to think through a problem collaboratively, explore directions, weigh trade-offs, or have a structured dialogue rather than receive an answer |
+Evaluation rules:
+- Identify what specialized knowledge the user actually needs, not which domain's jargon appears in the text
+- If multiple domains seem relevant, identify the PRIMARY expertise gap — what specialized knowledge does the user need most?
 **C. No domain, no question, or no keyword match** → ask user:

package/{commands/implement.md → skills/implement/SKILL.md} RENAMED

@@ -116,7 +116,7 @@ If `.claude/afc/memory/retrospectives/` exists, load the **most recent 10 files*
 ### 3. Phase-by-Phase Execution
-Execute each phase in order. Choose the orchestration mode based on the number of [P] tasks in the phase:
+Execute each phase in order. Choose the orchestration mode by evaluating whether multi-agent coordination overhead would be justified given the tasks' characteristics:
 #### Mode Selection
@@ -126,12 +126,17 @@ Execute each phase in order. Choose the orchestration mode based on the number o
 |-----------|------|----------|
 | No [P] markers | Sequential | Main agent executes tasks one by one |
 | [P] tasks but delegation criteria NOT met | Sequential | Main agent executes directly (preserves full context) |
-| [P] tasks, delegation criteria ALL met, 3–5 [P] | Parallel Batch | Launch Task() calls in parallel |
-| [P] tasks, delegation criteria ALL met, 6+ [P] | Swarm | Create task pool → orchestrator pre-assigns tasks to worker agents |
+| [P] tasks, delegation criteria ALL met, coordination overhead justified, moderate parallelism | Parallel Batch | Launch Task() calls in parallel |
+| [P] tasks, delegation criteria ALL met, coordination overhead clearly justified, high parallelism | Swarm | Create task pool → orchestrator pre-assigns tasks to worker agents |
+**Mode judgment**: Ask — "Given these N tasks with their complexity, file scope, and interdependencies, would spawning multiple agents and merging their results be faster and safer than executing sequentially?" If the answer is not clearly yes, default to Sequential.
+- **Parallel Batch** is appropriate when there are enough independent tasks that parallel execution provides meaningful speed gain, but the total count is manageable enough that a single orchestrator round-trip suffices.
+- **Swarm** is appropriate when the number of independent tasks is large enough that a single batch of Task() calls would saturate the concurrent agent limit, requiring multiple orchestrator rounds.
 **Parallel delegation criteria** (ALL must be satisfied):
 1. Tasks have **no `depends:` edges** between them in the DAG (no ordering constraint)
-2. **≥ 3 parallelizable tasks** in the phase (2 tasks → sequential is cheaper)
+2. **Enough parallelizable tasks** that multi-agent overhead is worth it (a very small number of short tasks → sequential is cheaper)
 3. Each task is **self-contained** (does not require runtime results from other tasks in the same batch)
 4. Each task's **target files do not overlap** with any other task in the batch (no shared file writes)
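
Of the four delegation criteria, only criteria 1 and 4 can be checked mechanically; criteria 2 and 3 and the mode judgment are left to the agent. A minimal Python sketch of that split, assuming a hypothetical `Task` record and hypothetical boolean inputs (`overhead_justified`, `needs_multiple_rounds`) that stand in for the judgment calls:

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record; the field names are illustrative only."""
    id: str
    files: set[str]
    depends: set[str] = field(default_factory=set)
    parallel: bool = False  # True when the task carries a [P] marker


def mechanical_criteria_met(tasks: list[Task]) -> bool:
    """Check criterion 1 (no depends edges in the batch) and criterion 4 (no shared files)."""
    batch = [t for t in tasks if t.parallel]
    batch_ids = {t.id for t in batch}
    seen_files: set[str] = set()
    for task in batch:
        if task.depends & batch_ids:
            return False  # criterion 1: an ordering constraint exists inside the batch
        if task.files & seen_files:
            return False  # criterion 4: two tasks would write the same file
        seen_files |= task.files
    return bool(batch)  # no [P] tasks at all means nothing to delegate


def choose_mode(tasks: list[Task], overhead_justified: bool,
                needs_multiple_rounds: bool) -> str:
    """overhead_justified covers criteria 2 and 3 plus the mode judgment (assumed inputs)."""
    if not mechanical_criteria_met(tasks) or not overhead_justified:
        return "sequential"  # the default whenever the answer is not clearly yes
    return "swarm" if needs_multiple_rounds else "parallel-batch"
```

Keeping the judgment inputs as explicit parameters mirrors the change in this release: the numeric thresholds are gone, so the sketch does not invent new ones.
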
@@ -143,7 +148,7 @@ If ANY criterion fails → main agent sequential execution (context preservation
 - On task start: `▶ {ID}: {description}`
 - On completion: `✓ {ID} complete`
-#### Parallel Batch Mode (3–5 [P] tasks)
+#### Parallel Batch Mode (moderate [P] tasks)
 **Pre-validation**: Verify no file overlap (downgrade to sequential if overlapping).
@@ -203,14 +208,18 @@ Task("T004: Create AuthService", subagent_type: "afc:afc-impl-worker", isolation
 2. Capture the `agentId` from the failed agent's result (returned in Task tool output)
 3. Reset: `TaskUpdate(taskId, status: "pending")`
 4. Track: `TaskUpdate(taskId, metadata: { retryCount: N, lastAgentId: agentId })`
-5. If retryCount < 3 → re-launch with `resume: lastAgentId` in the next batch round. The resumed agent retains full context from the previous attempt (what it tried, what failed, partial progress), enabling more targeted retry instead of starting from scratch.
+5. **Classify the error before deciding to retry**:
+   - **First failure** (no `metadata.lastError` exists): store `metadata.lastError = {current error message}`. Classify as transient (no prior error to compare) and proceed with retry.
+   - **Subsequent failures** (`metadata.lastError` exists): Compare the current error with `metadata.lastError`. If the error is **the same** (deterministic failure — same message, same stack location) → stop immediately and mark as failed. Retrying a deterministic failure wastes cycles.
+   - If the error **differs** from the previous attempt (transient/flaky — different message, network blip, lock contention) → re-launch with `resume: lastAgentId`. The resumed agent retains full context from the previous attempt (what it tried, what failed, partial progress), enabling more targeted retry.
    - **Worktree caveat**: if the failed worker made no file changes, its worktree is auto-cleaned and `resume` will fail. In this case, fall back to a fresh launch (omit `resume`) for the retry.
-6. If retryCount >= 3 → mark as failed, report: `"T{ID} failed after 3 attempts: {last error}"`
+   - Update `metadata.lastError` with the current error on each attempt.
+6. If retryCount >= 5 (absolute safety cap) → mark as failed, report: `"T{ID} failed after {retryCount} attempts: {last error}"`
 7. Continue with remaining tasks — a single failure does not block the entire phase
-#### Swarm Mode (6+ [P] tasks)
+#### Swarm Mode (high [P] task count)
-When a phase has more than 5 parallelizable tasks, use the **orchestrator-managed swarm pattern**.
+When a phase has enough parallelizable tasks that a single batch of Task() calls would saturate the concurrent agent limit and require multiple orchestrator rounds, use the **orchestrator-managed swarm pattern**.
 > **Key constraint**: Claude Code's TaskUpdate uses **last-write-wins** with local file locking only. Multiple sub-agents calling TaskUpdate on the same task simultaneously can cause lost writes. The orchestrator must mediate task assignment to prevent collisions.
@@ -268,7 +277,7 @@ Task("Worker 2: T008, T010, T012", subagent_type: "afc:afc-impl-worker", isolati
 5. If unblocked tasks remain → assign to new worker batch (repeat Step 2)
 6. If all tasks complete → phase done
-**Worker count**: N = min(5, unblocked task count). Max 5 concurrent sub-agents per phase.
+**Worker count**: N = min(5, unblocked task count). Max 5 concurrent sub-agents per phase (5 is the Claude Code platform limit for concurrent agents — not a semantic preference).
 **Task assignment strategy**: Round-robin by file path — each worker gets tasks targeting different files to maximize isolation. If a worker has multiple tasks, order them by `depends:` topology.
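
One possible reading of the worker-count and round-robin rules above, as a minimal Python sketch. The `(task_id, target_file)` input shape and the function name are assumptions for illustration, and the ordering of a worker's tasks by `depends:` topology is not modeled.

```python
MAX_WORKERS = 5  # concurrent-agent limit stated in the text above


def assign_round_robin(tasks: list[tuple[str, str]]) -> list[list[str]]:
    """tasks holds (task_id, target_file) pairs for unblocked tasks.

    Returns one list of task IDs per worker, with N = min(5, unblocked task count).
    """
    worker_count = min(MAX_WORKERS, len(tasks))
    workers: list[list[str]] = [[] for _ in range(worker_count)]
    # Sort by file path so neighbouring paths are dealt to different workers.
    for index, (task_id, _path) in enumerate(sorted(tasks, key=lambda t: t[1])):
        workers[index % worker_count].append(task_id)
    return workers
```

With seven unblocked tasks this yields five workers, two of which carry two tasks each; with no unblocked tasks it returns an empty list.
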
@@ -280,9 +289,13 @@ When a worker agent returns an error:
 3. Capture the `agentId` from the failed worker's result
 4. Reset uncompleted tasks: `TaskUpdate(taskId, status: "pending")`
 5. Track retry count: `TaskUpdate(taskId, metadata: { retryCount: N, lastAgentId: agentId })`
-6. If retryCount < 3 → re-launch with `resume: lastAgentId` to preserve context from the previous attempt. The resumed agent retains its full conversation history (files read, changes attempted, errors encountered), enabling targeted retry.
+6. **Classify the error before deciding to retry**:
+   - **First failure** (no `metadata.lastError` exists): store `metadata.lastError = {current error message}`. Classify as transient (no prior error to compare) and proceed with retry.
+   - **Subsequent failures** (`metadata.lastError` exists): Compare the current error with `metadata.lastError`. If the error is **the same** (deterministic failure — same message, same stack location) → stop immediately and mark as failed. Retrying a deterministic failure wastes cycles.
+   - If the error **differs** from the previous attempt (transient/flaky — different message, network blip, lock contention) → re-launch with `resume: lastAgentId`. The resumed agent retains its full conversation history (files read, changes attempted, errors encountered), enabling targeted retry.
    - **Worktree caveat**: if the failed worker made no file changes, its worktree is auto-cleaned and `resume` will fail. In this case, fall back to a fresh launch (omit `resume`) for the retry.
-7. If retryCount >= 3 → mark as failed, report: `"T{ID} failed after 3 attempts: {last error}"`
+   - Update `metadata.lastError` with the current error on each attempt.
+7. If retryCount >= 5 (absolute safety cap) → mark as failed, report: `"T{ID} failed after {retryCount} attempts: {last error}"`
 8. Continue with remaining tasks
 > Single task failure does not block the phase. The orchestrator reassigns failed tasks to subsequent batches.
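
The classify-before-retry rule introduced in this release can be sketched as a small decision function. This is a minimal Python sketch under stated assumptions: `metadata` is a plain dict standing in for the task metadata, errors are compared as exact strings (the text also mentions stack location, which is not modeled), `retryCount` is taken to count failures so far, and the worktree caveat is ignored. The name `decide_retry` is hypothetical.

```python
MAX_RETRIES = 5  # absolute safety cap from the text above


def decide_retry(metadata: dict, error: str) -> str:
    """Return "retry" or "fail" and update the metadata in place."""
    last_error = metadata.get("lastError")
    metadata["retryCount"] = metadata.get("retryCount", 0) + 1
    metadata["lastError"] = error  # updated on every attempt
    if metadata["retryCount"] >= MAX_RETRIES:
        return "fail"  # hard cap reached, regardless of error type
    if last_error is not None and last_error == error:
        return "fail"  # same error twice: treated as deterministic
    return "retry"  # first failure, or a different error: treated as transient
```

A first failure always leads to a retry because there is no prior error to compare against; a repeated identical error stops at the second failure rather than running out the cap.
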
@@ -350,7 +363,7 @@ After CI passes, run a convergence-based Critic Loop to verify design alignment
 **Critic Loop until convergence** (safety cap: 5):
-- **SCOPE_ADHERENCE**: Compare `git diff` changed files against plan.md File Change List. Flag any file modified that is NOT in the plan. Flag any planned file NOT modified. Provide "M of N files match" count.
+- **SCOPE_ADHERENCE**: Compare `git diff` changed files against plan.md File Change Map. Flag any file modified that is NOT in the plan. Flag any planned file NOT modified. Provide "M of N files match" count.
 - **ARCHITECTURE**: Validate changed files against `{config.architecture}` rules (layer boundaries, naming conventions, import paths). Provide "N of M rules checked" count.
 - **CORRECTNESS**: Cross-check implemented changes against spec.md acceptance criteria (AC). Verify each AC has corresponding code. Provide "N of M AC verified" count.
 - **SIDE_EFFECT_SAFETY**: For tasks that changed call order, error handling, or state flow: verify that callee behavior is compatible with the new call pattern. Provide "{M} of {N} behavioral changes verified" count.
@@ -384,10 +397,10 @@ Implementation complete
 - **Architecture compliance**: follow {config.architecture} rules.
 - **{config.ci} gate**: must pass on phase completion. Do not bypass.
 - **Swarm workers**: max 5 concurrent. File overlap is strictly prohibited between parallel tasks.
-- **On error**: prevent infinite loops. Report to user after 3 attempts.
+- **On error**: classify errors before retrying. Stop immediately on deterministic (same) errors. Allow additional attempts for transient (different) errors. Hard cap at 5 retries total.
 - **Real-time tasks.md updates**: mark checkbox on each task completion.
 - **Default is direct execution**: main agent executes tasks directly unless all 4 parallel delegation criteria are met. This preserves full context and avoids multi-agent context loss.
-- **Mode selection is automatic**: do not manually override. Sequential (default), batch for 3–5 qualifying [P], swarm for 6+ qualifying [P].
+- **Mode selection is automatic**: do not manually override. Sequential (default), batch when moderate independent parallelism justifies coordination overhead, swarm when high task count requires multiple orchestrator rounds.
 - **NEVER use `run_in_background: true` on Task calls**: agents must run in foreground so results are returned before the next step.
 - **No worker self-claiming**: In swarm mode, the orchestrator pre-assigns tasks to workers. Workers do NOT call TaskList/TaskUpdate to claim tasks — this avoids last-write-wins race conditions on TaskUpdate.
 - **Phase-locked registration**: Only register (TaskCreate) the current phase's tasks. Never pre-register future phases. This is the primary mechanism for phase boundary enforcement.

package/{commands/init.md → skills/init/SKILL.md} RENAMED

@@ -72,6 +72,8 @@ Analyze the project and auto-infer configuration. Use `$ARGUMENTS` as additional
 - If no lockfile: check `packageManager` field in `package.json`
 - Non-JS projects: check `pyproject.toml` (Python), `Cargo.toml` (Rust), `go.mod` (Go)
+> These detection rules are starting-point heuristics, not definitive. If a project uses a tool not listed here, the model should still detect it from context (e.g., `bun.lockb` for Bun, `deno.lock` for Deno). Always confirm the detected setup with the user before proceeding.
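
The Step 1 heuristics can be sketched as a lookup with fallbacks. This is a minimal Python sketch, not the skill's actual logic: the npm, Yarn, and pnpm lockfile names are the conventional ones and are assumed to match the rules elided above this hunk, and the function name `detect_toolchain` is hypothetical. The result is only a starting point to confirm with the user.

```python
import json
from pathlib import Path
from typing import Optional

# Conventional lockfile names; bun.lockb and deno.lock come from the note above.
LOCKFILES = {
    "pnpm-lock.yaml": "pnpm",
    "yarn.lock": "yarn",
    "package-lock.json": "npm",
    "bun.lockb": "bun",
    "deno.lock": "deno",
}
NON_JS_MARKERS = {"pyproject.toml": "python", "Cargo.toml": "rust", "go.mod": "go"}


def detect_toolchain(project: Path) -> Optional[str]:
    """Return a best-guess package manager or language, or None if nothing matches."""
    for filename, manager in LOCKFILES.items():
        if (project / filename).is_file():
            return manager
    package_json = project / "package.json"
    if package_json.is_file():
        # No lockfile: fall back to the packageManager field, e.g. "pnpm@9.1.0".
        declared = json.loads(package_json.read_text(encoding="utf-8")).get("packageManager")
        if declared:
            return declared.split("@", 1)[0]
    for filename, language in NON_JS_MARKERS.items():
        if (project / filename).is_file():
            return language
    return None
```
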
 **Step 2. Framework Detection**
 - Determine from `package.json` dependencies/devDependencies:
@@ -91,9 +93,11 @@ Analyze the project and auto-infer configuration. Use `$ARGUMENTS` as additional
 - Non-JS: `pyproject.toml` → Django/FastAPI/Flask, `Cargo.toml` → Rust project, `go.mod` → Go project
 - Presence of `tsconfig.json` → TypeScript indicator
+> This list covers common frameworks but is not exhaustive. For unlisted frameworks, infer from package.json dependencies, project structure, and configuration files. Present the detection result to the user for confirmation.
 **Step 3. Architecture Detection**
 - Analyze directory structure:
-  - FSD: requires **at least 3** of `features/`, `entities/`, `shared/`, `widgets/`, `pages/` under `src/`
+  - FSD: If the project's src/ directory contains a combination of FSD-characteristic directories (`features/`, `entities/`, `shared/`, `widgets/`, `pages/`, `processes/`, `app/`), assess whether the project follows FSD principles. Variant FSD structures (e.g., using `processes/` instead of `pages/`) should also be detected. Confirm with the user if the detection is uncertain.
   - `src/domain/`, `src/application/`, `src/infrastructure/` → Clean Architecture
   - `src/modules/` → Modular
   - Other → Layered
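
Because the FSD rule no longer has a fixed threshold, a detection helper can only gather evidence for it. The following minimal Python sketch applies the deterministic rules for the other architectures and returns the FSD-characteristic directories it found, leaving the FSD decision to assessment and user confirmation. It assumes that all three of `domain/`, `application/`, and `infrastructure/` must be present for Clean Architecture, which the text does not state explicitly; the function name is hypothetical.

```python
from pathlib import Path

FSD_DIRS = {"features", "entities", "shared", "widgets", "pages", "processes", "app"}
CLEAN_DIRS = {"domain", "application", "infrastructure"}


def architecture_evidence(project: Path) -> tuple[str, list[str]]:
    """Return (non-FSD label, FSD-characteristic directories found under src/)."""
    src = project / "src"
    present = {p.name for p in src.iterdir() if p.is_dir()} if src.is_dir() else set()
    fsd_matches = sorted(present & FSD_DIRS)
    if CLEAN_DIRS <= present:  # assumption: all three directories are required
        label = "Clean Architecture"
    elif "modules" in present:
        label = "Modular"
    else:
        label = "Layered"
    # Whether fsd_matches amounts to FSD is a judgment call, not decided here.
    return label, fsd_matches
```
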

package/{commands/learner.md → skills/learner/SKILL.md} RENAMED

@@ -53,7 +53,7 @@ Analyze ALL queue entries together as a batch. For each entry, you receive struc
 **Classification rules:**
 1. Group semantically similar entries into clusters (e.g., "use const not let" + "always use const" = 1 cluster)
 2. For each cluster, determine:
-   - **Confidence**: high (explicit preference, ≥2 occurrences) / medium (single clear correction) / low (ambiguous excerpt)
+   - **Confidence**: Assess based on the strength and clarity of the signal, not occurrence count. A single explicit user correction ("never do X") is high confidence. Two ambiguous occurrences may still be medium. Consider: was the feedback direct and clear? Does it apply broadly or only to a specific case? Use high / medium / low accordingly.
    - **Rule type**: naming, style, workflow, testing, architecture
    - **Scope**: universal (all files) or file-type-specific (e.g., "In TypeScript files...")
@@ -73,7 +73,7 @@ For each candidate rule:
 ### 4. Present Suggestions
-Show clustered suggestions to user. Cap at **5 suggestions per review** (highest confidence first).
+Show clustered suggestions to user, most impactful first. Present the most impactful suggestions first. If there are many high-confidence patterns, present them all rather than artificially capping. If most are low-confidence, present fewer. Let relevance and confidence drive the count, not a fixed limit.
 ```markdown
 ## Learned Patterns ({N} pending, showing top {M})
@@ -125,7 +125,7 @@ For each approved rule:
 3. **Remove consumed entries** from `.claude/.afc-learner-queue.jsonl` (entries that were approved, skipped, or rejected — only keep entries not yet reviewed)
-4. **Rule count check**: If `afc-learned.md` now has ≥30 rules (count `<!-- afc:learned` markers), suggest consolidation:
+4. **Rule count check**: Suggest consolidation when the rules file becomes unwieldy — when rules overlap, contradict each other, or are too numerous to effectively guide behavior. Use judgment rather than a fixed count:
    ```
    afc-learned.md has {N} rules. Consider reviewing and consolidating related rules
    to keep context budget efficient. You can edit the file directly.
@@ -147,6 +147,6 @@ Learner review complete
 - **Opt-in only**: Learner signal collection requires `.claude/afc/learner.json` to exist. Run `/afc:learner enable` to start.
 - **Project-scoped rules**: All rules write to `.claude/rules/afc-learned.md` (git-tracked, team-visible). Never writes to root `CLAUDE.md`, `~/.claude/CLAUDE.md`, or auto memory.
 - **No raw prompts stored**: The signal queue contains only structured metadata (type, category, 80-char redacted excerpt, timestamp). Full prompt text is never persisted.
-- **Queue limits**: Max 50 entries, 7-day TTL. Stale entries are pruned at session start.
+- **Queue limits**: Manage queue size to prevent unbounded growth. Remove entries that have been reviewed, are no longer relevant (the code they reference has changed significantly), or are duplicates of already-processed patterns. As a practical guideline, keep the queue focused on recent, actionable items. Stale entries are pruned at session start.
 - **Safe by design**: Anti-injection guardrails prevent propagation of harmful instructions. Category blocklist prevents rules about permissions/security/hooks.
 - **Editable output**: `afc-learned.md` is a regular markdown file. Edit, delete, or reorganize rules at any time.

package/{commands/plan.md → skills/plan/SKILL.md} RENAMED

@@ -101,128 +101,7 @@ Future pipelines can reference prior research to avoid redundant investigation.
 ### 4. Phase 1 — Write Design
-Create `.claude/afc/specs/{feature}/plan.md`. **Must** follow the structure below:
-```markdown
-# Implementation Plan: {feature name}
-## Summary
-{summary of core requirements from spec + technical approach, 3-5 sentences}
-## Technical Context
-{Summarize key project settings from .claude/rules/afc-project.md (auto-loaded) and afc.config.md}
-- **Constraints**: {constraints extracted from spec}
-## Principles Check
-{if .claude/afc/memory/principles.md exists: validation results against MUST principles}
-{if violations possible: state explicitly + justification}
-## Architecture Decision
-### Approach
-{core idea of the chosen design}
-### Architecture Placement
-| Layer | Path | Role |
-|-------|------|------|
-| {entities/features/widgets/shared} | {path} | {description} |
-### State Management Strategy (omit if not applicable)
-{what combination of Zustand store / React Query / Context is used where}
-### API Design (omit if not applicable)
-{plan for new API endpoints or use of existing APIs}
-## Test Strategy
-> Written alongside the File Change Map. Classify each implementation file and decide test coverage level.
-> Determines which files need test coverage and at what level.
-### Code Classification
-| File | Code Type | Test Need | Reason |
-|------|-----------|:---------:|--------|
-| {path} | {business-logic / pure-function / side-effect / framework / config / UI} | {required / optional / unnecessary} | {brief justification} |
-> Classification guide:
-> - **business-logic / pure-function**: Required — unit tests (AAA pattern)
-> - **side-effect code** (external API, DB, file I/O): Required — integration tests with mocks
-> - **framework / config / getter-setter / boilerplate**: Unnecessary — no test
-> - **UI rendering** (no state logic): Optional — minimal snapshot or skip
-### Test Pyramid
-- **Unit tests**: {count} files ({which files})
-- **Integration tests**: {count} files ({which files}, if applicable)
-- **E2E tests**: {count} (if applicable, only for critical user flows)
-### Required Test Cases (derived from spec EARS requirements)
-{For each spec EARS requirement with `→ TC:` mapping, list the test case here}
-- `should_{behavior}_when_{trigger}` → covers FR-{NNN}
-- `should_{behavior}_while_{state}` → covers FR-{NNN}
-## File Change Map
-| File | Action | Description | Depends On | Phase |
-|------|--------|-------------|------------|-------|
-| {path} | create/modify/delete | {summary} | {file(s) or "—"} | {1-N} |
-> - **Depends On**: list file(s) that must be created/modified first (enables dependency-aware task generation in /afc:implement).
-> - **Phase**: implementation phase number. Same-phase + no dependency + different file = parallelizable.
-> - **Test files**: For each implementation file classified as "required" in Code Classification, include a corresponding test file in the same Phase. Test files are first-class citizens in the File Change Map.
-## Implementation Context
-> Auto-generated section for implementation agents. Compress to under 500 words.
-> This section travels with every sub-agent prompt during /afc:implement.
-- **Objective**: {1-sentence feature purpose from spec Overview}
-- **Key Constraints**: {NFR summaries + spec Constraints section, compressed}
-- **Critical Edge Cases**: {top 3 edge cases from spec, 1 line each}
-- **Risk Watchpoints**: {top risks from Risk & Mitigation table}
-- **Must NOT**: {explicit prohibitions — from spec constraints, principles.md, or CLAUDE.md}
-- **Acceptance Anchors**: {key acceptance criteria from spec that implementation must satisfy}
-## Risk & Mitigation
-| Risk | Impact | Mitigation |
-|------|--------|------------|
-| {risk} | {H/M/L} | {approach} |
-## Alternative Design
-### Approach 0: No Change (status quo)
-{Why might the current state be sufficient? What is the cost of doing nothing?}
-{If no change is clearly inferior: state specific reason — "Status quo lacks X, which is required by FR-001"}
-{If no change is viable: recommend it — avoid implementing for the sake of implementing}
-### Approach A: {chosen approach name}
-{Brief description — this is the approach detailed above}
-### Approach B: {alternative approach name}
-{Brief description of a meaningfully different approach}
-| Criterion | No Change | Approach A | Approach B |
-|-----------|-----------|-----------|-----------|
-| Complexity | None | {evaluation} | {evaluation} |
-| Risk | None | {evaluation} | {evaluation} |
-| Maintainability | Current | {evaluation} | {evaluation} |
-| Justification | {why not enough} | {why this} | {why this} |
-**Decision**: Approach {0/A/B} — {1-sentence rationale}
-{If Approach 0 chosen: abort plan, report: "No implementation needed — current state satisfies requirements."}
-## Phase Breakdown
-### Phase 1: Setup
-{project structure, type definitions, configuration}
-### Phase 2: Core Implementation
-{core business logic, state management}
-### Phase 3: UI & Integration
-{UI components, API integration}
-### Phase 4: Polish
-{error handling, performance optimization, tests}
-```
+Create `.claude/afc/specs/{feature}/plan.md` following the template in `${CLAUDE_SKILL_DIR}/plan-template.md`. Read it first, then generate the plan using that structure. **All sections are mandatory** unless marked "(omit if not applicable)".
 ### 4.5. File Path Verification

package/skills/plan/plan-template.md ADDED

@@ -0,0 +1,118 @@
+# Implementation Plan: {feature name}
+## Summary
+{summary of core requirements from spec + technical approach, 3-5 sentences}
+## Technical Context
+{Summarize key project settings from .claude/rules/afc-project.md (auto-loaded) and afc.config.md}
+- **Constraints**: {constraints extracted from spec}
+## Principles Check
+{if .claude/afc/memory/principles.md exists: validation results against MUST principles}
+{if violations possible: state explicitly + justification}
+## Architecture Decision
+### Approach
+{core idea of the chosen design}
+### Architecture Placement
+| Layer | Path | Role |
+|-------|------|------|
+| {entities/features/widgets/shared} | {path} | {description} |
+### State Management Strategy (omit if not applicable)
+{what combination of Zustand store / React Query / Context is used where}
+### API Design (omit if not applicable)
+{plan for new API endpoints or use of existing APIs}
+## Test Strategy
+> Written alongside the File Change Map. Classify each implementation file and decide test coverage level.
+> Determines which files need test coverage and at what level.
+### Code Classification
+| File | Code Type | Test Need | Reason |
+|------|-----------|:---------:|--------|
+| {path} | {business-logic / pure-function / side-effect / framework / config / UI} | {required / optional / unnecessary} | {brief justification} |
+> Classification guide:
+> - **business-logic / pure-function**: Required — unit tests (AAA pattern)
+> - **side-effect code** (external API, DB, file I/O): Required — integration tests with mocks
+> - **framework / config / getter-setter / boilerplate**: Unnecessary — no test
+> - **UI rendering** (no state logic): Optional — minimal snapshot or skip
+### Test Pyramid
+- **Unit tests**: {count} files ({which files})
+- **Integration tests**: {count} files ({which files}, if applicable)
+- **E2E tests**: {count} (if applicable, only for critical user flows)
+### Required Test Cases (derived from spec EARS requirements)
+{For each spec EARS requirement with `→ TC:` mapping, list the test case here}
+- `should_{behavior}_when_{trigger}` → covers FR-{NNN}
+- `should_{behavior}_while_{state}` → covers FR-{NNN}
+## File Change Map
+| File | Action | Description | Depends On | Phase |
+|------|--------|-------------|------------|-------|
+| {path} | create/modify/delete | {summary} | {file(s) or "—"} | {1-N} |
+> - **Depends On**: list file(s) that must be created/modified first (enables dependency-aware task generation in /afc:implement).
+> - **Phase**: implementation phase number. Same-phase + no dependency + different file = parallelizable.
+> - **Test files**: For each implementation file classified as "required" in Code Classification, include a corresponding test file in the same Phase. Test files are first-class citizens in the File Change Map.
+## Implementation Context
+> Auto-generated section for implementation agents. Compress to under 500 words.
+> This section travels with every sub-agent prompt during /afc:implement.
+- **Objective**: {1-sentence feature purpose from spec Overview}
+- **Key Constraints**: {NFR summaries + spec Constraints section, compressed}
+- **Critical Edge Cases**: {top 3 edge cases from spec, 1 line each}
+- **Risk Watchpoints**: {top risks from Risk & Mitigation table}
+- **Must NOT**: {explicit prohibitions — from spec constraints, principles.md, or CLAUDE.md}
+- **Acceptance Anchors**: {key acceptance criteria from spec that implementation must satisfy}
+## Risk & Mitigation
+| Risk | Impact | Mitigation |
+|------|--------|------------|
+| {risk} | {H/M/L} | {approach} |
+## Alternative Design
+### Approach 0: No Change (status quo)
+{Why might the current state be sufficient? What is the cost of doing nothing?}
+{If no change is clearly inferior: state specific reason — "Status quo lacks X, which is required by FR-001"}
+{If no change is viable: recommend it — avoid implementing for the sake of implementing}
+### Approach A: {chosen approach name}
+{Brief description — this is the approach detailed above}
+### Approach B: {alternative approach name}
+{Brief description of a meaningfully different approach}
+| Criterion | No Change | Approach A | Approach B |
+|-----------|-----------|-----------|-----------|
+| Complexity | None | {evaluation} | {evaluation} |
+| Risk | None | {evaluation} | {evaluation} |
+| Maintainability | Current | {evaluation} | {evaluation} |
+| Justification | {why not enough} | {why this} | {why this} |
+**Decision**: Approach {0/A/B} — {1-sentence rationale}
+{If Approach 0 chosen: abort plan, report: "No implementation needed — current state satisfies requirements."}
+## Phase Breakdown
+### Phase 1: Setup
+{project structure, type definitions, configuration}
+### Phase 2: Core Implementation
+{core business logic, state management}
+### Phase 3: UI & Integration
+{UI components, API integration}
+### Phase 4: Polish
+{error handling, performance optimization, tests}
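
The File Change Map note above ("Same-phase + no dependency + different file = parallelizable") can be sketched in a few lines. This is a minimal illustration, not code from the package: `Row` and `parallel_groups` are hypothetical names, the sample paths are invented, and reading "no dependency" as "does not depend on another file scheduled in the same phase" is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One File Change Map row (hypothetical in-memory form)."""
    path: str
    phase: int
    depends_on: tuple = ()


def parallel_groups(rows):
    """Return {phase: [paths]} of files that can be worked on in parallel.

    Assumption: a file is parallelizable within its phase when none of its
    dependencies are scheduled in that same phase.
    """
    by_phase = defaultdict(list)
    for row in rows:
        by_phase[row.phase].append(row)

    result = {}
    for phase, phase_rows in sorted(by_phase.items()):
        same_phase_paths = {r.path for r in phase_rows}
        result[phase] = sorted(
            r.path for r in phase_rows
            if not same_phase_paths.intersection(r.depends_on)
        )
    return result


# Invented sample rows: the test file waits for the file it covers.
rows = [
    Row("src/types.ts", 1),
    Row("src/config.ts", 1),
    Row("src/service.ts", 2, ("src/types.ts",)),
    Row("src/service.test.ts", 2, ("src/service.ts",)),
]
print(parallel_groups(rows))
```

Here the test file lands in the same phase as its implementation file, as the template requires, but it is excluded from the parallel set because it depends on a file in that phase.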

package/{commands/pr-comment.md → skills/pr-comment/SKILL.md} RENAMED

@@ -134,10 +134,10 @@ No issues found. Code looks good!
 Display the full review comment to the user in the console.
-Then determine the review event type:
-- **Critical findings exist** → `REQUEST_CHANGES`
-- **Only Warning/Info findings** → `COMMENT`
-- **No findings** → `APPROVE`
+Then determine the review event based on the actual severity and context of findings:
+- **REQUEST_CHANGES**: Use when findings indicate genuine risk to production code — bugs that would affect users, security vulnerabilities, or architectural violations that would be costly to fix later. A Critical finding in test code or documentation alone does not warrant blocking the PR.
+- **COMMENT**: Use when findings are improvements or concerns that the author should consider but that don't pose immediate risk. Also appropriate when Critical findings are in non-production code (tests, docs, config) or when the author has already acknowledged the concern in PR discussion.
+- **APPROVE**: Use when no findings exist, or when all findings are informational and the code is ready to merge.
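
One simplified reading of the three rules above is sketched below. The new wording deliberately asks for judgment, so this covers only the mechanical part. `Finding`, its fields, and `review_event` are hypothetical names, and treating an author-acknowledged Critical finding as non-blocking is an assumption drawn from the COMMENT rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str               # "Critical", "Warning", or "Info"
    in_production_code: bool    # False for tests, docs, config
    acknowledged: bool = False  # author already addressed it in PR discussion


def review_event(findings):
    """Pick a review event from a list of findings (simplified sketch)."""
    blocking = [
        f for f in findings
        if f.severity == "Critical" and f.in_production_code and not f.acknowledged
    ]
    if blocking:
        return "REQUEST_CHANGES"
    # all() is True for an empty list, so "no findings" also approves.
    if all(f.severity == "Info" for f in findings):
        return "APPROVE"
    return "COMMENT"


# A Critical finding in test code alone does not block the PR.
print(review_event([Finding("Critical", in_production_code=False)]))
```

The sketch cannot judge whether the code is actually ready to merge, so that part of the APPROVE rule stays with the reviewer.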
 Tell the user:
 ```

package/{commands/principles.md → skills/principles/SKILL.md} RENAMED

@@ -107,5 +107,5 @@ Collect principles interactively:
 - **Persistent storage**: Saved to .claude/afc/memory/principles.md and maintained across sessions.
 - **Auto-referenced**: Automatically loaded and validated by /afc:plan and /afc:architect.
-- **Keep it concise**: Maintain no more than 10 principles. Too many reduces effectiveness.
+- **Keep it concise**: Keep principles concise and actionable. If the list grows long enough that principles start overlapping or becoming too granular, suggest consolidation. The goal is a set of principles that can be held in working memory — typically under 15, but the right number depends on the project's complexity.
 - **Avoid duplication with CLAUDE.md**: Do not re-register rules already present in CLAUDE.md as principles.

package/{commands/qa.md → skills/qa/SKILL.md} RENAMED

@@ -82,7 +82,7 @@ Checks:
 Evaluate general code quality indicators.
 Checks:
-- **Complexity hotspots**: deeply nested logic, functions exceeding ~50 LOC
+- **Complexity hotspots**: deeply nested logic, functions with high cognitive complexity (many branches, side effects, or state mutations). Line count alone is not a reliable indicator — a 30-line function with nested conditionals and side effects may be more complex than a 60-line function with simple sequential logic.
 - **Duplication**: near-identical code blocks across files
 - **Magic numbers/strings**: unexplained literals in logic
 - **TODO/FIXME accumulation**: stale markers (count, age if git history available)
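
To show why line count alone can mislead, the sketch below reports two rough signals per function: the number of branching constructs and the deepest nesting level. It is a crude proxy built on Python's standard `ast` module, not a formal cognitive-complexity metric and not something the package ships. `complexity_signals`, the node sets, and the sample code are all assumptions made for illustration.

```python
import ast

# Assumed proxies: constructs that add a decision point or a nesting level.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)
NESTING_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.With)


def _max_depth(node, depth):
    """Deepest nesting level reached below this node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, _max_depth(child, child_depth))
    return deepest


def complexity_signals(source):
    """Return {function_name: (branch_count, max_nesting_depth)}."""
    signals = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            signals[node.name] = (branches, _max_depth(node, 0))
    return signals


SAMPLE = '''
def flat(a, b):
    x = a + b
    y = x * 2
    return y

def nested(items):
    for item in items:
        if item:
            if item > 10:
                return item
    return None
'''

print(complexity_signals(SAMPLE))
```

The two sample functions are about the same length, yet only `nested` shows branches and depth. Note that this proxy counts an `elif` chain as nested `if` statements, which overstates its depth.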
@@ -128,7 +128,7 @@ For each active category:
 ### 5. Critic Loop
-Apply `docs/critic-loop-rules.md` with **safety cap: 3 rounds**.
+**Always** read `${CLAUDE_PLUGIN_ROOT}/docs/critic-loop-rules.md` first and follow it. Safety cap: **5 passes**.
 Focus the critic on:
 - Are the findings actionable or just noise?

package/{commands/release-notes.md → skills/release-notes/SKILL.md} RENAMED

@@ -75,13 +75,17 @@ Flag any matches for the Breaking Changes section.
 Categorize each commit/PR into one of:
-| Category | Conventional Commit Prefixes | Fallback Heuristics |
-|----------|------------------------------|---------------------|
+| Category | Conventional Commit Prefixes | Fallback (no prefix) |
+|----------|------------------------------|----------------------|
 | Breaking Changes | `!:` suffix, `BREAKING` | Label: `breaking` |
-| New Features | `feat:` | "add", "new", "implement", "support" |
-| Bug Fixes | `fix:` | "fix", "resolve", "correct", "patch" |
+| New Features | `feat:` | Commit introduces new user-facing functionality |
+| Bug Fixes | `fix:` | Commit fixes broken or incorrect behavior |
 | Other Changes | `chore:`, `docs:`, `ci:`, `refactor:`, `perf:`, `test:`, `style:`, `build:` | Everything else |
+**Fallback classification rule** — For commits without conventional prefixes, read the commit message semantically. Classify based on what the commit actually does: did it introduce new user-facing functionality (New Features)? Did it fix broken behavior (Bug Fixes)? Or is it a maintenance/improvement change (Other Changes)? Do not match individual words — understand the commit's purpose.
+The word "add" in "add a new API endpoint" indicates a feature. The same word in "add missing test coverage" indicates a test (Other Changes). Context determines classification.
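
The prefix column of the table is mechanical and can be sketched directly; the fallback column is not, which is the point of the rule above. In the sketch below, `classify` is a hypothetical helper that returns `None` for commits without a recognized prefix instead of guessing from keywords. The regular expression is an assumption about how prefixes are written, not the package's own parser.

```python
import re

# Assumed prefix shape: type, optional (scope), optional "!", then a colon.
PREFIX_RE = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<bang>!)?:\s*")

OTHER_TYPES = {"chore", "docs", "ci", "refactor", "perf", "test", "style", "build"}


def classify(message):
    """Return a category, or None when a semantic read is needed."""
    subject = message.splitlines()[0] if message else ""
    match = PREFIX_RE.match(subject)
    if (match and match.group("bang")) or "BREAKING" in message:
        return "Breaking Changes"
    if not match:
        return None  # no prefix: classify by the commit's purpose, not its words
    kind = match.group("type")
    if kind == "feat":
        return "New Features"
    if kind == "fix":
        return "Bug Fixes"
    if kind in OTHER_TYPES:
        return "Other Changes"
    return None  # unknown prefix: also left to semantic classification


for msg in ("feat: add export command", "add missing test coverage"):
    print(msg, "->", classify(msg))
```

Returning `None` rather than a keyword-based guess keeps "add missing test coverage" out of New Features until someone, or some model, has read what the commit actually does.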
 **Rewriting rules** — transform each entry from developer-speak to user-facing language:
 1. Remove conventional commit prefixes (`feat:`, `fix(scope):`, etc.)