prizmkit 1.1.8 → 1.1.9 (npm package version diff, via Mend)

This diff shows the changes between two publicly released versions of the package, as they appear in the public registry. It is provided for informational purposes only.

Files changed (123)

package/bundled/skills/bug-planner/references/input-formats.md ADDED

@@ -0,0 +1,53 @@
+# Bug Input Format Detection & Extraction
+Auto-detect the user's input format and extract structured bug information accordingly.
+## Format A: Stack Trace / Error Log
+```
+TypeError: Cannot read property 'token' of null
+    at AuthService.handleLogin (src/services/auth.ts:42)
+    at LoginPage.onSubmit (src/pages/login.tsx:28)
+```
+Extract: `error_source.type="stack_trace"`, `error_message`, `stack_trace`, `affected_modules`
+## Format B: Natural Language User Report
+```
+When I click the login button with correct credentials, the page turns white.
+Expected: redirect to home page.
+Actual: white screen with no error message visible.
+```
+Extract: `error_source.type="user_report"`, `reproduction_steps`, `description` (expected vs actual)
+## Format C: Failed Test Output
+```
+FAIL src/services/__tests__/auth.test.ts
+  ● AuthService > handleLogin > should return token on success
+    Expected: "abc123"
+    Received: null
+```
+Extract: `error_source.type="failed_test"`, `failed_test_path`, `error_message`
+## Format D: Log Pattern
+```
+[2026-03-07 10:23:45] ERROR [auth-service] Connection timeout after 30000ms
+[2026-03-07 10:23:45] ERROR [auth-service] Failed to authenticate user: ETIMEDOUT
+[2026-03-07 10:23:46] ERROR [auth-service] Connection timeout after 30000ms
+```
+Extract: `error_source.type="log_pattern"`, `log_snippet`, `affected_modules`
+## Format E: Monitoring Alert
+```
+ALERT: CPU usage > 95% for auth-service pod (5min avg)
+ALERT: Error rate spike: 500 errors/min on /api/login endpoint
+```
+Extract: `error_source.type="monitoring_alert"`, `error_message`, `affected_modules`
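Reviewer note: the five formats above could be told apart with a few regular expressions. The following is a minimal sketch, not the detection logic prizmkit ships. The function name `detect_format`, the patterns, and the check order are assumptions made for illustration; only the `error_source.type` values come from the reference file.

```python
import re

# Ordered (type, pattern) pairs; the first match wins.
# These patterns are illustrative assumptions, not taken from prizmkit.
_PATTERNS = [
    ("failed_test", re.compile(r"^FAIL\s+\S+", re.MULTILINE)),
    ("monitoring_alert", re.compile(r"^ALERT:", re.MULTILINE)),
    (
        "log_pattern",
        re.compile(
            r"^\[\d{4}-\d{2}-\d{2} [\d:]+\]\s+(ERROR|WARN|FATAL)\b",
            re.MULTILINE,
        ),
    ),
    ("stack_trace", re.compile(r"^\s+at \S+ \(.+:\d+\)", re.MULTILINE)),
]


def detect_format(text: str) -> str:
    """Return an error_source.type value for a raw bug input."""
    for source_type, pattern in _PATTERNS:
        if pattern.search(text):
            return source_type
    # Anything unrecognised is treated as a natural-language report.
    return "user_report"
```

Falling back to `user_report` keeps the function total: free-form text has no reliable signature, so it is whatever remains after the structured formats are ruled out.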

package/bundled/skills/bug-planner/references/schema-validation.md ADDED

@@ -0,0 +1,25 @@
+# Schema Validation Checklist
+Use this checklist for manual validation when `validate-bug-list.py` is not available. The script is the source of truth — this checklist mirrors its logic.
+## Required Top-Level Fields
+- [ ] `$schema`: must be `"dev-pipeline-bug-fix-list-v1"`
+- [ ] `project_name`: non-empty string
+- [ ] `bugs`: non-empty array
+## Per-Bug Required Fields
+- [ ] `id`: matches pattern `B-NNN` (e.g., `B-001`)
+- [ ] `title`: non-empty string
+- [ ] `description`: non-empty string
+- [ ] `severity`: one of `critical`, `high`, `medium`, `low`
+- [ ] `error_source.type`: one of `stack_trace`, `user_report`, `failed_test`, `log_pattern`, `monitoring_alert`
+- [ ] `verification_type`: one of `automated`, `manual`, `hybrid`
+- [ ] `acceptance_criteria`: non-empty array of strings
+- [ ] `status`: must be `pending` for new bugs
+## Consistency Checks
+- [ ] No duplicate bug IDs
+- [ ] If `priority` is set, must be one of `high`, `medium`, `low`
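Reviewer note: the checklist above translates fairly directly into code. This is a hedged sketch of such a check, not a copy of `validate-bug-list.py`: the function name `check_bug_list`, the helper `_non_empty_str`, and the error wording are hypothetical, while the constants repeat the values visible in the script's diff.

```python
import re

# Constants repeat the values shown in the validate-bug-list.py diff.
VALID_SEVERITIES = {"critical", "high", "medium", "low"}
VALID_SOURCE_TYPES = {
    "stack_trace", "user_report", "failed_test", "log_pattern", "monitoring_alert",
}
VALID_VERIFICATION_TYPES = {"automated", "manual", "hybrid"}
BUG_ID_PATTERN = re.compile(r"^B-\d{3}$")
SCHEMA_VERSION = "dev-pipeline-bug-fix-list-v1"


def _non_empty_str(value) -> bool:
    return isinstance(value, str) and value.strip() != ""


def check_bug_list(data: dict) -> list:
    """Return checklist violations; an empty list means the data passed."""
    errors = []
    if data.get("$schema") != SCHEMA_VERSION:
        errors.append(f"$schema must be '{SCHEMA_VERSION}'")
    if not _non_empty_str(data.get("project_name")):
        errors.append("project_name must be a non-empty string")
    bugs = data.get("bugs")
    if not isinstance(bugs, list) or not bugs:
        errors.append("bugs must be a non-empty array")
        return errors

    seen_ids = set()
    for index, bug in enumerate(bugs):
        where = f"bugs[{index}]"
        bug_id = bug.get("id")
        if not isinstance(bug_id, str) or not BUG_ID_PATTERN.match(bug_id):
            errors.append(f"{where}: id must match B-NNN")
        elif bug_id in seen_ids:
            errors.append(f"{where}: duplicate id {bug_id}")
        else:
            seen_ids.add(bug_id)
        for field in ("title", "description"):
            if not _non_empty_str(bug.get(field)):
                errors.append(f"{where}: {field} must be a non-empty string")
        if bug.get("severity") not in VALID_SEVERITIES:
            errors.append(f"{where}: invalid severity")
        source = bug.get("error_source")
        source_type = source.get("type") if isinstance(source, dict) else None
        if source_type not in VALID_SOURCE_TYPES:
            errors.append(f"{where}: invalid error_source.type")
        if bug.get("verification_type") not in VALID_VERIFICATION_TYPES:
            errors.append(f"{where}: invalid verification_type")
        criteria = bug.get("acceptance_criteria")
        if not (isinstance(criteria, list) and criteria
                and all(isinstance(c, str) for c in criteria)):
            errors.append(f"{where}: acceptance_criteria must be a non-empty array of strings")
        if bug.get("status") != "pending":
            errors.append(f"{where}: status must be 'pending' for new bugs")
        priority = bug.get("priority")
        if priority is not None and priority not in ("high", "medium", "low"):
            errors.append(f"{where}: invalid priority")
    return errors
```

The sketch collects every violation instead of stopping at the first one, so a single pass reports everything that needs fixing.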

package/bundled/skills/bug-planner/references/severity-rules.md ADDED

@@ -0,0 +1,16 @@
+# Severity Auto-Classification Rules
+When extracting bugs, apply these rules to auto-suggest severity:
+| Severity | Indicators | Examples |
+|----------|------------|----------|
+| **critical** | System crash, data loss, security breach, OOM, unrecoverable error | `Segmentation fault`, `OutOfMemoryError`, `SQL injection vulnerability`, `Database corrupted` |
+| **high** | Core feature broken, authentication failure, data integrity issue, timeout | `Auth token invalid`, `Payment failed`, `Connection timeout`, `500 Internal Server Error` |
+| **medium** | Feature partially broken, workaround exists, incorrect output | `CSV encoding issue`, `Pagination not working`, `Wrong date format`, `Missing validation` |
+| **low** | Cosmetic issue, minor inconvenience, edge case | `UI misalignment`, `Typo in error message`, `Slow loading (non-critical page)`, `Non-breaking warning` |
+## Special Cases
+- Failed test → medium (unless test covers critical path, then high)
+- User report with "cannot use app" → high
+- User report with "annoying but works" → low
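Reviewer note: one way to read these rules is as a keyword lookup with a few fallbacks. The sketch below is an illustration only. The function `suggest_severity`, the keyword lists (lowercased from the table's example column), the decision to check keywords before the special cases, and the final `medium` fallback are all assumptions; the rules file does not specify an evaluation order.

```python
# Keywords are lowercased forms of the examples in the table above.
# Checked from most to least severe, so the strongest match wins.
_INDICATORS = [
    ("critical", ("segmentation fault", "outofmemoryerror",
                  "sql injection", "database corrupted")),
    ("high", ("auth token invalid", "payment failed",
              "connection timeout", "500 internal server error")),
    ("medium", ("csv encoding", "pagination not working",
                "wrong date format", "missing validation")),
    ("low", ("ui misalignment", "typo", "slow loading",
             "non-breaking warning")),
]


def suggest_severity(text, source_type="user_report", covers_critical_path=False):
    """Suggest a severity for a bug description (a suggestion, not a verdict)."""
    lowered = text.lower()
    for severity, keywords in _INDICATORS:
        if any(keyword in lowered for keyword in keywords):
            return severity
    # Special cases apply only when no keyword matched (assumed ordering).
    if source_type == "failed_test":
        return "high" if covers_critical_path else "medium"
    if source_type == "user_report":
        if "cannot use app" in lowered:
            return "high"
        if "annoying but works" in lowered:
            return "low"
    return "medium"  # assumed fallback; the rules do not define one
```

Because the result is only an auto-suggestion, a planner would still confirm it with the user before writing it to the bug list.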

package/bundled/skills/bug-planner/scripts/validate-bug-list.py CHANGED

@@ -19,7 +19,7 @@ import re
 VALID_SEVERITIES = {"critical", "high", "medium", "low"}
 VALID_SOURCE_TYPES = {"stack_trace", "user_report", "failed_test", "log_pattern", "monitoring_alert"}
 VALID_VERIFICATION_TYPES = {"automated", "manual", "hybrid"}
-VALID_STATUSES = {"pending", "in_progress", "fixed", "failed", "skipped", "needs_info"}
+VALID_STATUSES = {"pending", "triaging", "reproducing", "fixing", "verifying", "completed", "failed", "needs_info", "skipped"}
 BUG_ID_PATTERN = re.compile(r"^B-\d{3}$")
 SCHEMA_VERSION = "dev-pipeline-bug-fix-list-v1"
@@ -117,10 +117,6 @@ def validate(bug_list_path, feature_list_path=None):
             if priority not in ("high", "medium", "low"):
                 errors.append(f"{prefix} ({bug_id}): invalid priority '{priority}' — must be one of 'high', 'medium', 'low'")
-        # Cross-reference affected_feature
-        affected_feature = bug.get("affected_feature")
-        if affected_feature and feature_ids and affected_feature not in feature_ids:
-            warnings.append(f"{prefix} ({bug_id}): affected_feature '{affected_feature}' not found in feature-list.json")
     # Output results
     if errors:

package/bundled/skills/bugfix-pipeline-launcher/SKILL.md CHANGED

@@ -117,20 +117,16 @@ Detect user intent from their message, then follow the corresponding workflow:
    Use `AskUserQuestion` to present the following configuration choices. Each question is a separate selectable option:
-   **Question 1 — Critic review** (multiSelect: false):
-   - Off (default) — Skip adversarial review
-   - On — Enable critic review after bug fix (+2-5 min/bug for critical/high severity)
-   **Question 2 — Verbose logging** (multiSelect: false):
+   **Question 1 — Verbose logging** (multiSelect: false):
    - On (default) — Detailed AI session logs including tool calls and subagent activity
    - Off — Minimal logging
-   **Question 3 — Max retries** (multiSelect: false):
+   **Question 2 — Max retries** (multiSelect: false):
    - 3 (default)
    - 1
    - 5
-   **Question 4 — Session timeout** (multiSelect: false):
+   **Question 3 — Session timeout** (multiSelect: false):
    - None (default) — No timeout
    - 30 min — `SESSION_TIMEOUT=1800`
    - 1 hour — `SESSION_TIMEOUT=3600`
@@ -142,7 +138,6 @@ Detect user intent from their message, then follow the corresponding workflow:
    | Config choice | Environment variable |
    |-----------|---------------------|
-   | Critic: On | `ENABLE_CRITIC=true` |
    | Verbose: Off | `VERBOSE=0` |
    | Verbose: On | `VERBOSE=1` |
    | Max retries: N | `MAX_RETRIES=N` |
@@ -208,7 +203,7 @@ Detect user intent from their message, then follow the corresponding workflow:
    **If foreground**: Pipeline runs to completion in the terminal. After it finishes:
    - Summarize results: total bugs, fixed, failed, skipped
-   - If all fixed: each bug session has already run `prizmkit-retrospective` (structural sync) internally. Ask user what's next.
+   - If all fixed: each bug session has already run `prizmkit-retrospective` internally (structural sync by default; full retrospective when the fix changed interfaces, dependencies, or observable behavior). Ask user what's next.
    - If some failed: show failed bug IDs and suggest `retry-bugfix.sh <B-XXX>` or `dev-pipeline/reset-bug.sh <B-XXX> --clean --run`
    **If background daemon**:
@@ -303,7 +298,7 @@ When user says "retry B-001":
 dev-pipeline/retry-bugfix.sh B-001 .prizmkit/plans/bug-fix-list.json
 ```
-**Note:** `retry-bugfix.sh` automatically cleans bug artifacts and resets status before retrying. This is equivalent to `reset-feature.sh --clean --run` in the feature pipeline. No separate reset command is needed.
+**Note:** `retry-bugfix.sh` runs exactly one bug session and exits. It **preserves prior session artifacts and checkpoint state** — reads `retry_count` and `resume_from_phase` from `status.json` so the AI session can resume from where it left off. For a full clean retry, use `dev-pipeline/reset-bug.sh <B-XXX> --clean --run`.
 Environment variables (optional):
 ```bash

package/bundled/skills/feature-pipeline-launcher/SKILL.md CHANGED

@@ -170,6 +170,20 @@ Detect user intent from their message, then follow the corresponding workflow:
    | Max retries: N | `MAX_RETRIES=N` |
    | Timeout: value | `SESSION_TIMEOUT=<seconds>` |
+   **Advanced environment variables** (not exposed in interactive menu, pass via `--env`):
+   | Variable | Default | Purpose |
+   |----------|---------|---------|
+   | `MODEL` | (none) | AI model override (e.g. `claude-opus-4.6`) |
+   | `AUTO_PUSH` | `0` | Auto-push to remote after successful feature (`1` to enable) |
+   | `DEV_BRANCH` | auto-generated | Custom dev branch name (default: `dev/{feature_id}-YYYYMMDDHHmm`) |
+   | `HEARTBEAT_INTERVAL` | `30` | Heartbeat log interval in seconds |
+   | `HEARTBEAT_STALE_THRESHOLD` | `600` | Max seconds without heartbeat before marking stale |
+   | `PIPELINE_MODE` | (none) | Override mode for all features: `lite`\|`standard`\|`full` |
+   | `LOG_CLEANUP_ENABLED` | `1` | Run periodic log cleanup (`0` to disable) |
+   | `LOG_RETENTION_DAYS` | `14` | Delete logs older than N days |
+   | `LOG_MAX_TOTAL_MB` | `1024` | Keep total logs under N MB via oldest-first cleanup |
    ⚠️ STOP HERE and wait for user response before continuing to step 7.
 7. **Show final command**: After user confirms configuration in step 6, assemble the complete command from execution mode + user-confirmed configuration, and present it to the user.
@@ -324,9 +338,8 @@ SESSION_TIMEOUT=3600 dev-pipeline/retry-feature.sh F-003 .prizmkit/plans/feature
 ```
 Notes:
-- `retry-feature.sh` runs exactly one feature session and exits. **It always performs a full clean** (deletes session history and `.prizmkit/specs/` artifacts) before retrying, ensuring a fresh start. This is destructive — prior session logs and spec artifacts for this feature will be deleted.
-- `reset-feature.sh --clean --run` is equivalent to a manual clean + retry (same behavior as `retry-feature.sh`, but can also operate on ranges and filtered sets).
-- For a lighter retry that preserves prior session artifacts, use `run-feature.sh run <F-XXX> --no-reset` instead.
+- `retry-feature.sh` runs exactly one feature session and exits. It **preserves prior session artifacts and checkpoint state** — reads `retry_count` and `resume_from_phase` from `status.json` so the AI session can resume from where it left off rather than starting from scratch.
+- `reset-feature.sh --clean --run` performs a full clean (deletes session history and artifacts) before retrying — use this for a fresh start when checkpoint recovery is not desired.
 - Keep pipeline daemon mode for main run management (`launch-feature-daemon.sh`).
 ---
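Reviewer note: the advanced environment variables table added earlier in this file's diff can be summarised as a defaults map plus light type coercion. This is a sketch under stated assumptions: `resolve_advanced_config` is a hypothetical helper, and how the real pipeline scripts parse these variables is not shown in the diff. Only the variable names, defaults, and the three `PIPELINE_MODE` values come from the table.

```python
import os

# Defaults copied from the table; None means "no default".
ADVANCED_DEFAULTS = {
    "MODEL": None,
    "AUTO_PUSH": "0",
    "DEV_BRANCH": None,  # the pipeline auto-generates a name when unset
    "HEARTBEAT_INTERVAL": "30",
    "HEARTBEAT_STALE_THRESHOLD": "600",
    "PIPELINE_MODE": None,
    "LOG_CLEANUP_ENABLED": "1",
    "LOG_RETENTION_DAYS": "14",
    "LOG_MAX_TOTAL_MB": "1024",
}
VALID_PIPELINE_MODES = {"lite", "standard", "full"}
_INT_VARS = {
    "HEARTBEAT_INTERVAL", "HEARTBEAT_STALE_THRESHOLD",
    "LOG_RETENTION_DAYS", "LOG_MAX_TOTAL_MB",
}
_FLAG_VARS = {"AUTO_PUSH", "LOG_CLEANUP_ENABLED"}


def resolve_advanced_config(environ=None):
    """Merge environment overrides with the documented defaults."""
    env = os.environ if environ is None else environ
    config = {}
    for name, default in ADVANCED_DEFAULTS.items():
        raw = env.get(name, default)
        if raw is None or raw == "":
            config[name] = None
        elif name in _INT_VARS:
            config[name] = int(raw)
        elif name in _FLAG_VARS:
            config[name] = raw == "1"  # "1" enables, anything else disables
        else:
            config[name] = raw
    mode = config["PIPELINE_MODE"]
    if mode is not None and mode not in VALID_PIPELINE_MODES:
        raise ValueError(f"PIPELINE_MODE must be one of {sorted(VALID_PIPELINE_MODES)}")
    return config
```

Passing a plain dict as `environ` makes the helper easy to exercise without touching the real process environment.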

package/bundled/skills/feature-planner/SKILL.md CHANGED

@@ -22,7 +22,7 @@ For planning a **new application from scratch** (vision, tech stack, decompositi
 If the user's request is about planning a new app from scratch (vision, tech stack selection, app architecture), recommend `app-planner` instead and ask the user to confirm before switching.
-If you believe the task is better suited for a different workflow (e.g., fast path via `/prizmkit-plan`), you MUST:
+If you believe the task is better suited for a different workflow, you MUST:
 1. **Explain why** you think a different path is more appropriate
 2. **Ask the user explicitly** whether they want to switch or continue with feature-planner
 3. **Only switch if the user confirms** — otherwise proceed with feature-planner as invoked
@@ -39,14 +39,13 @@ The user chose this skill intentionally. Respect that choice.
 **Your ONLY writable outputs are:**
 1. `.prizmkit/plans/feature-list.json` (`.prizmkit/plans/`)
-2. Draft backups in `.prizmkit/planning/`
+2. Draft backups in `.prizmkit/plans/` (e.g., `feature-list.draft.json`)
 **After planning is complete**, you MUST:
-1. Present the summary and recommended next step
+1. Present the summary and recommended next step (invoking `feature-pipeline-launcher` )
 2. **Ask the user explicitly** whether they want to proceed to execution
-3. If the user agrees → recommend invoking `feature-pipeline-launcher` or running `run-feature.sh` (do NOT execute it yourself)
-4. If the user wants to adjust → continue refining `.prizmkit/plans/feature-list.json`
-5. **NEVER auto-execute** the pipeline, launcher, or any implementation step
+3. If the user wants to adjust → continue refining `.prizmkit/plans/feature-list.json`
+4. **NEVER auto-execute** the pipeline, launcher, or any implementation step
 ## When to Use
@@ -74,18 +73,16 @@ Do NOT use this skill when:
    - Browser interaction fields needed → read `${SKILL_DIR}/references/browser-interaction.md`
    - New feature set for a project (Route A) → read `${SKILL_DIR}/references/new-project-planning.md` for phase guide, quality rules, and delivery checklist
    - Feature decomposition from scratch → read `${SKILL_DIR}/references/decomposition-patterns.md` for common app patterns (CRUD, SaaS, Social, E-commerce)
+   - Phase 6 completeness review → read `${SKILL_DIR}/references/completeness-review.md`
-4. **Always validate output via script**:
-   ```bash
-   python3 ${SKILL_DIR}/scripts/validate-and-generate.py validate --input <output-path> --mode <new|incremental>
-   ```
+4. **Always validate output via script** — see §Output Rules for the validation command.
    If the script is not available, perform these manual validation checks:
    1. **ID sequence**: All feature IDs are sequential (F-001, F-002, F-003, ...)
    2. **No circular dependencies**: No feature depends (directly or transitively) on itself
    3. **Description length**: Minimum 15 words per description (error), 30/50/80 recommended
    4. **Dependency references**: All referenced features in dependencies exist in features array
-   5. **Priority enums**: All priority values are exactly "high", "medium", or "low" (case-sensitive)
+   5. **Priority enums**: All priority values are exactly "critical", "high", "medium", or "low" (case-sensitive)
    6. **Status enum**: All status values are one of: pending, in_progress, completed, failed, skipped, split, auto_skipped
    7. **Acceptance criteria**: At least 1 criterion per feature, each is a concrete, measurable statement
    8. **Browser interaction**: If present, has url, verify_steps array, and optional setup_command
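Reviewer note: several of the manual checks listed above (ID sequence, dependency references, circular dependencies, description length, and the priority and status enums) can be sketched in a few lines. This is not `validate-and-generate.py`. The function `check_feature_list` is hypothetical, and the JSON key names `id`, `dependencies`, `description`, `priority`, `status`, and `acceptance_criteria` are assumed from the wording of the checks, not confirmed against the schema file.

```python
VALID_PRIORITIES = {"critical", "high", "medium", "low"}
VALID_STATUSES = {
    "pending", "in_progress", "completed", "failed",
    "skipped", "split", "auto_skipped",
}
MIN_DESCRIPTION_WORDS = 15


def check_feature_list(features):
    """Return a list of violations for a list of feature dicts."""
    errors = []
    ids = [feature.get("id") for feature in features]
    expected = [f"F-{n:03d}" for n in range(1, len(features) + 1)]
    if ids != expected:
        errors.append("feature IDs are not sequential from F-001")

    known = set(ids)
    graph = {}
    for feature in features:
        fid = feature.get("id")
        deps = feature.get("dependencies", [])
        graph[fid] = deps
        for dep in deps:
            if dep not in known:
                errors.append(f"{fid}: unknown dependency {dep}")
        if len(feature.get("description", "").split()) < MIN_DESCRIPTION_WORDS:
            errors.append(f"{fid}: description shorter than {MIN_DESCRIPTION_WORDS} words")
        if feature.get("priority") not in VALID_PRIORITIES:
            errors.append(f"{fid}: invalid priority")
        if feature.get("status") not in VALID_STATUSES:
            errors.append(f"{fid}: invalid status")
        if not feature.get("acceptance_criteria"):
            errors.append(f"{fid}: needs at least one acceptance criterion")

    # Depth-first search with three states to detect cycles.
    unvisited, in_progress, done = 0, 1, 2
    state = {fid: unvisited for fid in graph}

    def has_cycle(node):
        state[node] = in_progress
        for dep in graph[node]:
            if dep not in state:
                continue  # unknown dependency, already reported above
            if state[dep] == in_progress:
                return True
            if state[dep] == unvisited and has_cycle(dep):
                return True
        state[node] = done
        return False

    if any(state[fid] == unvisited and has_cycle(fid) for fid in graph):
        errors.append("dependency graph contains a cycle")
    return errors
```

The three-state search catches transitive cycles as well as direct ones, which matches the "directly or transitively" wording of check 2.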
@@ -103,7 +100,7 @@ Before questions, check optional context files (never block if absent):
 - `.prizmkit/config.json` (existing stack preferences and detected tech stack)
 - `.prizmkit/plans/project-brief.md` (project context from app-planner, if available)
 - existing `.prizmkit/plans/feature-list.json` (required for incremental mode)
-- `.prizmkit/project-conventions.json` (project conventions from app-planner, if available)
+- `CLAUDE.md` / `CODEBUDDY.md` `### Project Conventions` section (project conventions from app-planner, if available)
 - If `.prizm-docs/root.prizm` is absent and the project has existing source code, scan the directory structure to understand the codebase layout:
   ```bash
   find . -maxdepth 2 -type d -not -path '*/node_modules/*' -not -path '*/.git/*' -not -path '*/dist/*' -not -path '*/build/*' -not -path '*/__pycache__/*' -not -path '*/vendor/*' | sed -e 's;[^/]*/;|____;g;s;____|; |;g'
@@ -116,7 +113,7 @@ Before questions, check optional context files (never block if absent):
 ## Global Context Population
-The `global_context` object in `.prizmkit/plans/feature-list.json` provides technology stack information to the pipeline for intelligent code generation.
+The `global_context` object in `.prizmkit/plans/feature-list.json` provides technology stack information. Populate it from `.prizmkit/config.json` if available, or ask the user during Phase 1.
 ### Recommended Fields by Project Type
@@ -136,37 +133,8 @@ The `global_context` object in `.prizmkit/plans/feature-list.json` provides tech
 - `language`, `frontend_framework`, `backend_framework`, `database`, `testing_strategy` (all recommended)
 - Additional: `frontend_styling`, `orm`, `bundler`, `runtime`
-### Auto-Population from app-planner
+All `global_context` fields are optional — including recommended fields improves downstream code generation quality. See `dev-pipeline/templates/feature-list-schema.json` for the full schema definition.
-If the project was planned with `app-planner`, `feature-planner` will automatically read:
-- `.prizmkit/config.json` (tech stack detected by app-planner)
-- `.prizmkit/plans/project-brief.md` (tech choices made during planning)
-- `.prizm-docs/root.prizm` (architecture decisions)
-**Recommendation**: Always run `app-planner` before `feature-planner` to ensure `global_context` is pre-populated correctly. This reduces planning time and improves code generation consistency.
-### Manual Population (if no app-planner context)
-If planning features directly without app-planner:
-1. During Phase 1 (Scope Clarification), identify the project's tech stack
-2. Ask user or read from existing `package.json` / `pyproject.toml` / `go.mod`
-3. Populate `global_context` in `.prizmkit/plans/feature-list.json` with detected values
-4. Confirm with user: "Using tech stack: [language] + [frameworks]. Correct?"
-### Pipeline Behavior with Missing Fields
-| Missing Field | Pipeline Behavior |
-|---------------|-------------------|
-| `language` | Uses generic pseudocode patterns; code generation may be generic |
-| `frontend_framework` | Frontend features generate generic HTML/JavaScript; no framework-specific patterns |
-| `backend_framework` | Backend features generate generic API patterns; no framework-specific scaffolding |
-| `database` | Assumes no persistent storage; generates unit tests only with mocks |
-| `testing_strategy` | Defaults to Jest (if JavaScript/TypeScript); pytest (if Python) |
-| All fields empty | Pipeline still works but code quality/consistency may suffer; recommend re-running with app-planner |
-### Note
-**global_context fields are all optional** — pipeline can execute without them. However, including the recommended fields dramatically improves code generation quality and framework-specific best practices.
 ---
@@ -229,46 +197,11 @@ Checkpoints catch cascading errors early — skipping one means the next phase b
 ## Pre-Generation Completeness Review (Phase 6)
-Before generating `.prizmkit/plans/feature-list.json`, review the full feature set holistically. Individual features may look fine in isolation but have gaps when viewed together.
-### Step 1: Description Adequacy Scan
+Before generating `.prizmkit/plans/feature-list.json`, review the full feature set holistically.
-For each feature, evaluate against the word-count thresholds in `planning-guide.md` §4:
-- Does the description cover: what to build, key behaviors, integration points, data model (if applicable), error/edge cases?
-- Is the description specific enough for an AI coding session to implement without guessing?
-- Flag any feature below the recommended word count for its complexity level (30/50/80 words for low/medium/high).
-### Step 2: Cross-Feature Completeness Check
-Look at the feature set as a whole:
-- **Implied functionality gaps**: Does feature A's acceptance criteria assume a capability that no other feature provides?
-- **Missing integration seams**: If two features share data or interact at runtime, is the interface specified?
-- **Scope leaks**: Does any feature's description reference functionality outside the agreed scope?
-### Step 3: Present Review to User
-Show a structured summary table:
-```
-Feature    | Description | Cross-Feature        | Recommendation
-           | Adequacy    | Issues               |
-F-001      | ✓ (65 words)| —                    | Ready
-F-002      | ⚠ (28 words)| —                    | Expand: add API endpoints, error handling
-F-003      | ✓ (52 words)| Assumes email from   | Clarify: who sends the notification?
-           |             | F-006 (not yet defined)|
-```
+→ Read `${SKILL_DIR}/references/completeness-review.md` for the full review process (description adequacy scan, cross-feature completeness check, user presentation, and interactive supplementation).
-Then ask if any features need further discussion.
-### Step 4: Interactive Supplementation
-For each feature the user wants to discuss:
-1. Ask targeted questions about the unclear aspects
-2. Propose concrete description supplements
-3. Update the feature description with agreed details
-4. Re-check: does the supplement resolve the gap?
-Continue until the user confirms all features are implementation-ready. This gate exists because fixing thin descriptions here costs minutes; fixing misimplemented features downstream costs hours.
+This gate ensures all features are implementation-ready before output generation. Thin descriptions here cost minutes to fix; misimplemented features downstream cost hours.
 ## Fast Path (Simple Incremental)
@@ -314,39 +247,27 @@ A feature is **exempt** when ANY true:
 ## Output Rules
-`.prizmkit/plans/feature-list.json` must satisfy:
-- `$schema` = `dev-pipeline-feature-list-v1`
-- non-empty `features`
+`.prizmkit/plans/feature-list.json` must conform to `dev-pipeline/templates/feature-list-schema.json` (`$schema` = `dev-pipeline-feature-list-v1`).
+Key requirements:
+- non-empty `features` array
 - sequential feature IDs (`F-001`, `F-002`, ...)
-- valid dependency DAG
-- `priority` must be a string: `"high"`, `"medium"`, or `"low"` (NOT numeric)
+- valid dependency DAG (no cycles, all referenced IDs exist)
+- `priority`: `"critical"`, `"high"`, `"medium"`, or `"low"` (string, NOT numeric)
 - new items default `status: "pending"`
 - English feature titles for stable slug generation
-- `model` field is optional
-- `critic` field defaults based on priority: `true` for high/medium, `false` for low
-- `critic_count` field defaults: `3` for high priority, `1` for medium
-- `browser_interaction` field auto-generated for qualifying frontend features
-- descriptions must be implementation-ready — minimum 15 words (error), recommended 30/50/80 words for low/medium/high complexity (warning). See `planning-guide.md` §4.
-## Testing Defaults (Phase 8)
-All three testing mechanisms are enabled by default. The user can opt out.
+- `critic` / `critic_count` defaults per Testing Defaults section
+- `browser_interaction` auto-generated for qualifying frontend features
+- descriptions: minimum 15 words (error), recommended 30/50/80 for low/medium/high complexity (warning)
-### 1. Test File Generation (TDD)
-Every feature includes tests. In the feature description, state testing expectations:
-| Complexity | Default Testing Expectation |
-|------------|----------------------------|
-| low | Unit tests for core logic |
-| medium | Unit tests + integration tests for key flows |
-| high | Unit tests + integration tests + edge case coverage |
-### 2. Browser Verification (Playwright)
+Run the validation script after generation:
+```bash
+python3 ${SKILL_DIR}/scripts/validate-and-generate.py validate --input <output-path> --mode <new|incremental>
+```
-Default ON for all qualifying frontend features.
+## Testing Defaults (Phase 8)
-### 3. Adversarial Critic Review
+Set default testing-related fields for each feature. The user can opt out.
 | Priority | `critic` | `critic_count` | Rationale |
 |----------|----------|----------------|-----------|
@@ -354,23 +275,13 @@ Default ON for all qualifying frontend features.
 | medium | `true` | `1` | Single critic review |
 | low | `false` | (omitted) | Skip critic |
+For frontend features with `browser_interaction`, Playwright verification is enabled by default.
 Present a consolidated testing summary table at Phase 8, then ask for confirmation.
 ## Next-Step Execution Policy (after planning)
-Recommend these three options in this strict order:
-1. **Preferred**: invoke `feature-pipeline-launcher` skill
-2. **Fallback A**: run daemon wrapper
-   ```bash
-   ./dev-pipeline/launch-feature-daemon.sh start .prizmkit/plans/feature-list.json
-   ./dev-pipeline/launch-feature-daemon.sh status
-   ```
-3. **Fallback B**: run direct foreground script
-   ```bash
-   ./dev-pipeline/run-feature.sh run
-   ./dev-pipeline/run-feature.sh status
-   ```
+Recommend invoking `feature-pipeline-launcher` to configure and launch the dev-pipeline. Do NOT recommend running shell scripts directly — that is the launcher's responsibility.
 ## Error Recovery & Resume
@@ -397,7 +308,7 @@ When the session appears to be ending:
 1. **Remind**: "You set out to produce `.prizmkit/plans/feature-list.json` but we haven't completed it yet."
 2. **Offer 3 options**:
    - **(a) Continue to completion**
-   - **(b) Save draft & exit** — write current progress as draft
+   - **(b) Save draft & exit** — write current progress as `feature-list.draft.json` to `.prizmkit/plans/`
    - **(c) Abandon** — exit without saving
 ## Handoff Message Template

package/bundled/skills/feature-planner/assets/evaluation-guide.md CHANGED

@@ -12,7 +12,7 @@ Requires npm setup:
 ```bash
 npm run skill:review -- \
-  --workspace /.codebuddy/skill-evals/feature-planner-workspace \
+  --workspace .prizmkit/skill-evals/feature-planner-workspace \
   --iteration iteration-N \
   --skill-name feature-planner \
   --skill-path ${SKILL_DIR} \

package/bundled/skills/feature-planner/assets/planning-guide.md CHANGED

@@ -6,7 +6,7 @@ For app-level design references (vision templates, tech stack matrix), see `app-
 ---
-## 4. Feature Description Writing Guide
+## Feature Description Writing Guide
 Feature descriptions are the **primary input** for autonomous pipeline sessions. A thin description forces the AI to guess — producing worse code. Invest in rich descriptions upfront.
@@ -52,9 +52,25 @@ Every description should cover these aspects (adapt per feature type):
 "Build a dashboard page at /dashboard as the post-login landing screen. Display: (1) summary cards showing total projects count, active tasks count, and recent activity count; (2) a recent activity feed listing the last 10 actions across all projects with timestamps; (3) a quick-access project list showing the 5 most recently updated projects. Fetch data via GET /api/dashboard/summary. Show loading skeleton on initial load, empty state when user has no projects."
 ```
+### Headless Execution Requirements
+Feature descriptions are consumed by **autonomous AI sessions running in headless mode** — no human is available to clarify ambiguities. This raises the bar for description quality:
+**Must include for headless readiness:**
+1. **Concrete deliverables** — specific files, endpoints, components, or models to create
+2. **Integration points** — which existing APIs to call, which models to import, which modules to extend
+3. **Key behaviors** — validation rules, state transitions, error codes, edge cases
+**Dependency descriptions:**
+When a feature depends on others, explicitly state what it needs from them:
+- ✅ "Uses the User model (id, email, display_name) from F-001 to create a foreign key user_id on the Project model"
+- ❌ "depends on F-001" — the AI won't know what F-001 built
+**Self-test:** Read the description as if you have no other context. Could you implement it without asking a single question? If not, add more detail.
 ---
-## 5. Acceptance Criteria Writing Guide
+## Acceptance Criteria Writing Guide
 Acceptance criteria define what "done" means for a feature. They should be specific, testable, and unambiguous.
@@ -95,7 +111,7 @@ Then [expected outcome]
 ---
-## 6. Complexity Estimation Guide
+## Complexity Estimation Guide
 | Complexity | Characteristics | Typical Scope |
 |------------|----------------|---------------|
@@ -122,7 +138,7 @@ Consider splitting a feature if it exhibits any of the following:
 ---
-## 7. Dependency Graph Rules
+## Dependency Graph Rules
 These rules ensure the feature dependency graph is valid and buildable.
@@ -150,7 +166,7 @@ These rules ensure the feature dependency graph is valid and buildable.
 ---
-## 8. Session Granularity Decision Rules
+## Session Granularity Decision Rules
 Session granularity determines whether a feature is implemented in a single coding session or split across multiple sub-feature sessions.

package/bundled/skills/feature-planner/references/browser-interaction.md CHANGED

@@ -4,11 +4,9 @@ For web apps with UI, features that involve user-facing pages or interactive flo
 ## How to Capture
-During Phase 4 (refine descriptions and acceptance criteria), for qualifying features ask:
+During Phase 4.2, auto-generate `browser_interaction` for all qualifying features (see SKILL.md §Browser Interaction Planning for auto-detection rules). Present a **batch summary** to the user showing which features received `browser_interaction` — do NOT ask per-feature. The user can opt out specific features from the summary.
-> "This feature has UI behavior. Want to add browser verification so the pipeline can auto-check it after implementation? (Y/n)"
-If yes, generate the `browser_interaction` object:
+For each qualifying feature, generate the `browser_interaction` object:
 ```json
 {

package/bundled/skills/feature-planner/references/completeness-review.md ADDED

@@ -0,0 +1,57 @@
+# Pre-Generation Completeness Review
+Before generating `.prizmkit/plans/feature-list.json`, review the full feature set holistically. Individual features may look fine in isolation but have gaps when viewed together.
+## Step 1: Description Adequacy Scan
+For each feature, evaluate against the word-count thresholds in `planning-guide.md`:
+- Does the description cover: what to build, key behaviors, integration points, data model (if applicable), error/edge cases?
+- Is the description specific enough for an AI coding session to implement without guessing?
+- Flag any feature below the recommended word count for its complexity level (30/50/80 words for low/medium/high).
+**Implementation clarity check** — Every feature description will be consumed by an autonomous AI session. Verify each description specifies:
+1. Concrete deliverables (files to create, endpoints to build, components to implement, models to define)
+2. Key behaviors and business rules (validation, state transitions, error handling)
+3. Integration points with other modules (which APIs to call, which models to use)
+**Dependency context check** — If the feature depends on others, the description should reference what it needs from them:
+- Good: "Uses User model from F-001 to link projects to users via userId foreign key"
+- Bad: "depends on F-001" (too vague)
+**Ambiguity check** — Flag vague phrases:
+- Bad: "Create a nice dashboard" (what components? what data? what layout?)
+- Good: "Create dashboard at /dashboard with: (1) summary cards showing total projects count, active tasks count; (2) recent activity feed (last 10 items); (3) quick-access project list (5 most recent). Fetch data via GET /api/dashboard/summary."
+If any feature description is unclear, **expand it now** before generating the output file.
+## Step 2: Cross-Feature Completeness Check
+Look at the feature set as a whole:
+- **Implied functionality gaps**: Does feature A's acceptance criteria assume a capability that no other feature provides?
+- **Missing integration seams**: If two features share data or interact at runtime, is the interface specified?
+- **Scope leaks**: Does any feature's description reference functionality outside the agreed scope?
+## Step 3: Present Review to User
+Show a structured summary table:
+```
+Feature    | Description | Cross-Feature        | Recommendation
+           | Adequacy    | Issues               |
+F-001      | ✓ (65 words)| —                    | Ready
+F-002      | ⚠ (28 words)| —                    | Expand: add API endpoints, error handling
+F-003      | ✓ (52 words)| Assumes email from   | Clarify: who sends the notification?
+           |             | F-006 (not yet defined)|
+```
+Then ask if any features need further discussion.
+## Step 4: Interactive Supplementation
+For each feature the user wants to discuss:
+1. Ask targeted questions about the unclear aspects
+2. Propose concrete description supplements
+3. Update the feature description with agreed details
+4. Re-check: does the supplement resolve the gap?
+Continue until the user confirms all features are implementation-ready. Fixing thin descriptions here costs minutes; fixing misimplemented features downstream costs hours.
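Reviewer note: the word-count part of Step 1 is mechanical enough to sketch. The function `description_adequacy` and its return shape are hypothetical. The 30/50/80 thresholds come from this file, and the 15-word minimum comes from the validation rules in the feature-planner `SKILL.md`. The qualitative checks (deliverables, integration points, ambiguity) still need a reader's judgment and are not attempted here.

```python
MIN_WORDS = 15  # below this, the validator reports an error
RECOMMENDED_WORDS = {"low": 30, "medium": 50, "high": 80}


def description_adequacy(description, complexity):
    """Classify a description as 'error', 'warning', or 'ok' by word count.

    Returns a (level, word_count) tuple so the count can be shown in the
    review table alongside the marker.
    """
    words = len(description.split())
    if words < MIN_WORDS:
        return ("error", words)
    if words < RECOMMENDED_WORDS[complexity]:
        return ("warning", words)
    return ("ok", words)
```

For the example table in Step 3, a 28-word description of a medium-complexity feature would come back as `("warning", 28)`, matching the `⚠ (28 words)` row.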