npm - @crewpilot/agent - Versions diffs - 1.0.0 - Mend

@crewpilot/agent 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (24)

package/README.md +107 -0
package/dist-npm/cli.js +25 -0
package/dist-npm/index.js +481 -0
package/package.json +69 -0
package/prompts/agent.md +266 -0
package/prompts/catalyst.config.json +72 -0
package/prompts/copilot-instructions.md +36 -0
package/prompts/skills/assure-code-quality/SKILL.md +112 -0
package/prompts/skills/assure-pr-intelligence/SKILL.md +148 -0
package/prompts/skills/assure-vulnerability-scan/SKILL.md +146 -0
package/prompts/skills/autopilot-meeting/SKILL.md +407 -0
package/prompts/skills/autopilot-worker/SKILL.md +623 -0
package/prompts/skills/daily-digest/SKILL.md +167 -0
package/prompts/skills/deliver-change-management/SKILL.md +132 -0
package/prompts/skills/deliver-deploy-guard/SKILL.md +144 -0
package/prompts/skills/deliver-doc-governance/SKILL.md +130 -0
package/prompts/skills/engineer-feature-builder/SKILL.md +270 -0
package/prompts/skills/engineer-root-cause-analysis/SKILL.md +150 -0
package/prompts/skills/engineer-test-first/SKILL.md +148 -0
package/prompts/skills/insights-knowledge-base/SKILL.md +181 -0
package/prompts/skills/insights-pattern-detection/SKILL.md +142 -0
package/prompts/skills/strategize-architecture-planner/SKILL.md +141 -0
package/prompts/skills/strategize-solution-design/SKILL.md +118 -0
package/scripts/postinstall.js +108 -0

package/prompts/skills/assure-pr-intelligence/SKILL.md ADDED

@@ -0,0 +1,148 @@
+# PR Intelligence
+> **Pillar**: Assure | **ID**: `assure-pr-intelligence`
+## Purpose
+Automated pull request analysis — generates structured summaries, risk assessments, reviewer guidance, and change impact analysis. Turns PRs from walls of diff into clear narratives.
+## Activation Triggers
+- "review this PR", "summarize PR", "PR summary", "pull request review"
+- "what changed in this PR", "is this PR safe to merge"
+- When a PR URL or branch diff is provided
+## Methodology
+### Process Flow
+```dot
+digraph pr_intelligence {
+    rankdir=LR;
+    node [shape=box];
+    inventory [label="Phase 1\nChange Inventory"];
+    narrative [label="Phase 2\nNarrative Summary"];
+    risk [label="Phase 3\nRisk Assessment"];
+    guidance [label="Phase 4\nReviewer Guidance"];
+    checklist [label="Phase 5\nMerge Readiness", shape=doublecircle];
+    inventory -> narrative;
+    narrative -> risk;
+    risk -> guidance;
+    guidance -> checklist;
+}
+```
+### Phase 0 — Acceptance Criteria Verification
+1. Fetch the linked issue/task (via `catalyst_board_get` or the PR description's `Closes #N`)
+2. Extract the acceptance criteria checklist from the issue description
+3. For each criterion, verify whether the PR's changes satisfy it:
+   - **Met** — Code changes clearly implement the criterion
+   - **Partially met** — Some work done but incomplete
+   - **Not met** — No evidence of this criterion in the diff
+4. Any **Not met** criteria are automatic blockers — the PR cannot be approved
+5. Include the acceptance criteria verdict in the output:
+   ```
+   ### Acceptance Criteria
+   - [x] Criterion 1 — Met (implemented in file.py)
+   - [ ] Criterion 2 — Not met (missing from PR)
+   - [~] Criterion 3 — Partially met (needs X)
+   ```
+### Phase 1 — Change Inventory
+1. Get the diff (via git or GitHub API)
+2. Categorize files changed:
+   - `core` — Business logic, domain models
+   - `api` — Endpoint changes, route modifications
+   - `infra` — CI/CD, Dockerfiles, IaC
+   - `test` — Test files
+   - `config` — Configuration, env vars
+   - `docs` — Documentation
+3. Calculate change metrics: files changed, lines added/removed, churn
+4. Identify new vs. modified vs. deleted files
+### Phase 2 — Narrative Summary
+Generate a human-readable summary:
+1. **What**: One paragraph explaining what the PR accomplishes
+2. **Why**: Inferred motivation (from commit messages, PR description, code context)
+3. **How**: Key implementation decisions and patterns used
+### Phase 3 — Risk Assessment
+Evaluate each risk dimension:
+| Dimension | Low | Medium | High |
+|---|---|---|---|
+| **Scope** | < 50 lines, 1-2 files | 50-200 lines, 3-5 files | > 200 lines or > 5 files |
+| **Complexity** | Simple refactors | New logic paths | Algorithm/architecture changes |
+| **Blast radius** | Isolated module | Shared utilities | Core framework, DB schema |
+| **Test coverage** | Well-tested changes | Partial coverage | No tests for new code |
+| **Reversibility** | Feature flag or easy revert | Rollback possible | DB migration, API contract |
+Produce overall risk score: **Low / Medium / High / Critical**
+### Phase 4 — Reviewer Guidance
+1. List files to review first (highest risk → lowest)
+2. Call out specific lines that need careful attention
+3. Suggest questions the reviewer should ask
+4. Identify what's NOT in the PR that probably should be (missing tests, missing docs, missing migration)
+### Phase 5 — Merge Readiness Checklist
+- [ ] Tests pass / test coverage adequate
+- [ ] No security findings above medium
+- [ ] Breaking changes documented
+- [ ] PR description matches actual changes
+- [ ] Dependencies updated safely
+## Tools Required
+- `githubRepo` — Fetch PR details, diff, commit history
+- `codebase` — Understand impacted areas in the broader codebase
+- `catalyst_git_diff` — Get precise diff data
+- `catalyst_git_log` — Understand commit narrative
+## Output Format
+```
+## [Catalyst → PR Intelligence]
+### Summary
+**What**: {one paragraph}
+**Why**: {motivation}
+**How**: {key decisions}
+### Change Inventory
+| Category | Files | Lines (+/-) |
+|---|---|---|
+| core | | |
+| test | | |
+| ... | | |
+### Risk Assessment: {Low/Medium/High/Critical}
+{risk table with evaluations}
+### Review Guide
+**Start with**: {ordered file list}
+**Pay attention to**:
+- {file}:{line} — {why}
+- ...
+**Missing from PR**:
+- {what's absent}
+### Merge Readiness
+{checklist with status}
+```
+## Chains To
+- `code-quality` — Deep review of flagged files
+- `vulnerability-scan` — If risk assessment flags security-adjacent changes
+- `change-management` — Verify commit message quality
+## Anti-Patterns
+- Do NOT rubber-stamp — always identify at least one concern or question
+- Do NOT summarize the diff line-by-line — synthesize into a narrative
+- Do NOT skip risk assessment for "small" PRs — small and dangerous is common
+- Do NOT ignore test absence — explicitly call it out
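
The Scope row of the Phase 3 risk table above can be read as a small decision rule. The following is a minimal sketch of that reading, not code from the published package: the names `Risk`, `scope_risk`, and `overall_risk` are hypothetical, the handling of mixed cases (few lines spread over several files) is an assumption, and so is taking the highest dimension as the overall score. The skill text does not say when a PR becomes Critical, so the sketch never returns that level.

```python
from enum import IntEnum
from typing import Mapping


class Risk(IntEnum):
    """Risk levels from the Phase 3 table (Critical is left out on purpose)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def scope_risk(lines_changed: int, files_changed: int) -> Risk:
    """Score the Scope dimension using the thresholds in the table."""
    # High: more than 200 lines OR more than 5 files.
    if lines_changed > 200 or files_changed > 5:
        return Risk.HIGH
    # Low: fewer than 50 lines AND at most 2 files.
    if lines_changed < 50 and files_changed <= 2:
        return Risk.LOW
    # Assumption: everything in between, including mixed cases, is Medium.
    return Risk.MEDIUM


def overall_risk(dimensions: Mapping[str, Risk]) -> Risk:
    """Assumption: the overall score is the highest single dimension."""
    return max(dimensions.values())


if __name__ == "__main__":
    dims = {
        "scope": scope_risk(lines_changed=120, files_changed=3),
        "test_coverage": Risk.HIGH,  # no tests for new code
    }
    print(overall_risk(dims).name)
```

Taking the maximum keeps a small PR with no tests from being averaged down to Low, which matches the anti-pattern about skipping risk assessment for small PRs.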

package/prompts/skills/assure-vulnerability-scan/SKILL.md ADDED

@@ -0,0 +1,146 @@
+# Vulnerability Scan
+> **Pillar**: Assure | **ID**: `assure-vulnerability-scan`
+## Purpose
+Security-focused code analysis mapping findings to OWASP Top 10 and CWE Top 25. Provides actionable remediation with severity scoring, not just warnings.
+## Activation Triggers
+- "security review", "vulnerability scan", "is this secure", "owasp check"
+- "audit for security", "cwe check", "pentest this code"
+- Automatically chained when `code-quality` detects security-adjacent patterns
+## Methodology
+### Process Flow
+```dot
+digraph vulnerability_scan {
+    rankdir=TB;
+    node [shape=box];
+    surface [label="Phase 1\nAttack Surface Mapping"];
+    owasp [label="Phase 2\nOWASP Top 10 Scan"];
+    cwe [label="Phase 3\nCWE Pattern Matching"];
+    remediate [label="Phase 4\nRemediation"];
+    deps [label="Phase 5\nDependency Audit"];
+    report [label="Report", shape=doublecircle];
+    surface -> owasp;
+    owasp -> cwe;
+    cwe -> remediate;
+    remediate -> deps;
+    deps -> report;
+}
+```
+### Phase 1 — Attack Surface Mapping
+1. Identify all entry points: API endpoints, user inputs, file uploads, URL params
+2. Map data flow from input → processing → storage → output
+3. Identify trust boundaries (authenticated vs. unauthenticated, internal vs. external)
+4. List dependencies and their known vulnerability status
+### Phase 2 — OWASP Top 10 Scan
+Check each applicable category (IDs follow the OWASP Top 10 2021 list):
+| ID | Category | What to Look For |
+|---|---|---|
+| A01 | Broken Access Control | Missing auth checks, IDOR, privilege escalation |
+| A02 | Cryptographic Failures | Weak hashing, plaintext secrets, poor TLS config |
+| A03 | Injection | SQL/NoSQL/OS/LDAP injection, template injection |
+| A04 | Insecure Design | Missing rate limits, business logic flaws |
+| A05 | Security Misconfiguration | Default creds, verbose errors, unnecessary features |
+| A06 | Vulnerable Components | Known CVEs in dependencies |
+| A07 | Auth Failures | Weak passwords, missing MFA, session fixation |
+| A08 | Data Integrity Failures | Insecure deserialization, unsigned updates |
+| A09 | Logging Failures | Insufficient logging, log injection, PII in logs |
+| A10 | SSRF | Unvalidated URLs, internal network access |
+### Phase 3 — CWE Pattern Matching
+Map findings to specific CWE entries (e.g., CWE-79 for XSS, CWE-89 for SQL injection). Include CWE ID in every finding.
+### Phase 4 — Remediation
+For each finding:
+1. Explain the vulnerability in plain language
+2. Show the vulnerable code
+3. Provide the fixed code
+4. Explain why the fix works
+5. Rate exploitability: `trivial / moderate / complex`
+### Phase 5 — Dependency Audit
+1. Parse dependency manifests (package.json, requirements.txt, go.mod, etc.)
+2. Flag dependencies with known CVEs
+3. Suggest version upgrades with breaking change warnings
+## Tools Required
+- `codebase` — Read source code and dependency files
+- `terminal` — Run `npm audit`, `pip-audit`, or equivalent
+- `fetch` — Check CVE databases for dependency vulnerabilities
+## Severity Scoring
+<HARD-GATE>
+Do NOT mark a scan as "clean" or "no issues" if any Critical or High severity findings exist.
+Do NOT downgrade severity to avoid blocking a deployment.
+Critical findings MUST be remediated before code is shipped.
+</HARD-GATE>
+| Level | Criteria |
+|---|---|
+| **Critical** | Remote code execution, auth bypass, data exfiltration — exploit is trivial |
+| **High** | Significant data exposure, privilege escalation — exploit is moderate |
+| **Medium** | Information disclosure, denial of service — exploit requires chaining |
+| **Low** | Best practice violation with no direct exploit path |
+## Output Format
+```
+## [Catalyst → Vulnerability Scan]
+### Attack Surface
+- Entry points: {N}
+- Trust boundaries: {list}
+- Dependencies: {N} total, {N} flagged
+### Findings
+#### [{severity}] {OWASP-ID} — {title} (CWE-{NNN})
+**File**: {path}:{line}
+**Vulnerability**: {plain language explanation}
+**Exploitability**: {trivial/moderate/complex}
+**Vulnerable code**:
+\`\`\`{lang}
+{code}
+\`\`\`
+**Remediation**:
+\`\`\`{lang}
+{fixed code}
+\`\`\`
+**Why this fixes it**: {explanation}
+---
+(repeat per finding)
+### Dependency Alerts
+| Package | Current | Vulnerable | Fixed In | CVE |
+|---|---|---|---|---|
+| | | | | |
+### Summary
+{critical}/{high}/{medium}/{low} findings | Exploitability: {overall risk}
+```
+## Chains To
+- `code-quality` — For non-security improvements found during scan
+- `deploy-guard` — Security findings should block deployment
+## Anti-Patterns
+- Do NOT report theoretical vulnerabilities in unreachable code
+- Do NOT flag every dependency without checking actual CVE relevance
+- Do NOT provide fixes that break functionality to achieve security
+- Do NOT skip the "why this fixes it" explanation — it's educational
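
The hard gate and the summary line of the Vulnerability Scan output can be expressed as three small checks. This is a sketch under stated assumptions, not code from the published package: `Finding`, `severity_counts`, `gate_allows_clean`, and `blocks_shipping` are hypothetical names, and severities are assumed to be stored as lowercase strings.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List

# Order matches the "{critical}/{high}/{medium}/{low}" summary line.
SEVERITIES = ("critical", "high", "medium", "low")


@dataclass(frozen=True)
class Finding:
    """One scan finding; field names are illustrative, not the package's."""
    severity: str   # one of SEVERITIES
    owasp_id: str   # e.g. "A03"
    cwe_id: int     # e.g. 89 for SQL injection
    title: str


def severity_counts(findings: Iterable[Finding]) -> str:
    """Build the counts portion of the summary line."""
    counts = Counter(f.severity for f in findings)
    return "/".join(str(counts[s]) for s in SEVERITIES)


def gate_allows_clean(findings: Iterable[Finding]) -> bool:
    """Hard gate: never report 'clean' while Critical or High findings exist."""
    return not any(f.severity in ("critical", "high") for f in findings)


def blocks_shipping(findings: Iterable[Finding]) -> bool:
    """Hard gate: Critical findings must be remediated before shipping."""
    return any(f.severity == "critical" for f in findings)


if __name__ == "__main__":
    found: List[Finding] = [
        Finding("high", "A03", 89, "SQL injection in search query"),
        Finding("medium", "A03", 79, "Reflected XSS in error page"),
    ]
    print(severity_counts(found), gate_allows_clean(found), blocks_shipping(found))
```

Keeping the gate as a pure function of the findings list means severity cannot be downgraded at reporting time without changing the finding itself.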

package/prompts/skills/autopilot-meeting/SKILL.md ADDED

@@ -0,0 +1,407 @@
+# Autopilot Meeting
+> **Pillar**: Orchestrate | **ID**: `autopilot-meeting`
+## Purpose
+Parse a meeting transcript (standup, planning, retro, customer call) to extract work items — then create **user stories with acceptance criteria and subtasks**, group them under **epics**, and push everything to the board. Turns a 30-minute meeting into a fully structured backlog with zero manual data entry.
+Specifically supports the **PM workflow**: customer meeting → epics → user stories → subtasks → sized & prioritized → sprint-ready.
+## Activation Triggers
+- meeting, transcript, standup, planning, retro, parse meeting, meeting notes, action items, from meeting, customer call, user stories from meeting, create stories, backlog from meeting
+## Tools Required
+- `catalyst_board_create` — create issues / user stories on board
+- `catalyst_board_create_epic` — create epics to group related stories
+- `catalyst_board_create_subtask` — create subtasks linked to a parent story
+- `catalyst_board_move` — update issue status for status updates
+- `catalyst_board_comment` — log blockers and decisions on existing issues
+- `catalyst_board_assign` — assign tasks to people mentioned in transcript
+- `catalyst_knowledge_store` — store decisions, customer context, and action items
+- `catalyst_worker_start` — optionally kick off autopilot for created tasks
+## Methodology
+### Process Flow
+```dot
+digraph autopilot_meeting {
+    rankdir=TB;
+    node [shape=box];
+    ingest [label="Phase 1\nTranscript Ingestion"];
+    extract [label="Phase 2\nExtraction & Classification"];
+    structure [label="Phase 3\nUser Story Structuring\n(stories, criteria, subtasks,\nsizing, epics)"];
+    review_gate [label="Phase 4\nHUMAN GATE:\nReview Backlog", shape=diamond, style=filled, fillcolor="#ffcccc"];
+    create [label="Phase 5\nBoard Creation Pipeline"];
+    summary [label="Phase 6\nSummary", shape=doublecircle];
+    ingest -> extract;
+    extract -> structure;
+    structure -> review_gate;
+    review_gate -> create [label="approve"];
+    review_gate -> structure [label="edit/split/merge"];
+    review_gate -> summary [label="cancel"];
+    create -> summary;
+}
+```
+### Phase 1 — Transcript Ingestion
+Accept transcript in any format:
+- Pasted text (most common)
+- File path to `.vtt`, `.txt`, or `.md` file (read via tools)
+- Structured notes with speaker labels
+Identify:
+- **Speaker labels** — look for "Name:", "Speaker 1:", timestamps with names
+- If no speaker labels, treat as unstructured notes and extract items without assignee attribution
+- **Meeting type** — standup (short updates), planning (task creation), retro (action items), **customer call** (feature requests, requirements)
+- **Customer context** — if this is a customer/stakeholder meeting, note: who the customer is, what their role is, business justification for requests
+### Phase 2 — Extraction & Classification
+For each speaker turn or paragraph, classify content into:
+| Type | Pattern Signals |
+|---|---|
+| **FEATURE_REQUEST** | "we need", "customers want", "can we add", "requirement is", "must have", "should support" |
+| **USER_STORY** | "as a user", "when I", "so that", "use case", "scenario", "workflow" |
+| **NEW_TASK** | "we need to", "can you", "let's build", "should implement", "create a", "add support for" |
+| **STATUS_UPDATE** | "I finished", "almost done", "completed", "working on", "made progress on" |
+| **BLOCKER** | "blocked on", "stuck", "waiting for", "can't proceed", "need access to", "depends on" |
+| **DECISION** | "we decided", "agreed to", "let's go with", "consensus is", "chose X over Y" |
+| **ACTION_ITEM** | "will do", "I'll take", "by Friday", "follow up on", name + commitment |
+| **NOISE** | "can you hear me", "you're muted", greetings, filler, tangents — SKIP these |
+| **BUG_REPORT** | "crashes", "broken", "doesn't work", "error", "500", "blank page", "regression", "users reporting", "intermittent", "something's wrong", "used to work", "data is wrong", "wrong result", "security issue", "vulnerability", "can't access", "slow", "performance degraded", "memory", "leak" |
+| **NEEDS_DESIGN** | "figure out the approach", "evaluate options", "which technology", "trade-offs", "compare solutions", "design doc needed", "need to think through", "spike on", "investigate how", "multiple ways to" |
+| **NEEDS_ARCHITECTURE** | "new system", "new module", "architecture plan", "system design", "how do components interact", "service boundary", "data flow", "new service", "infrastructure change", "cross-cutting concern" |
+### Phase 3 — User Story Structuring
+For each FEATURE_REQUEST, USER_STORY, or NEW_TASK, generate a **structured user story**.
+For each BUG_REPORT, generate a **structured bug report** instead (see 3a-bug below).
+#### 3a. User Story Format (for features/tasks)
+```
+Title: <concise action-oriented title, max 10 words>
+As a [persona/role],
+I want [capability/action],
+So that [business value/outcome].
+```
+#### 3a-bug. Bug Report Format (for BUG_REPORT items)
+Bugs are NOT user stories. They follow a different structure optimized for investigation:
+```
+Title: "Fix: <symptom in plain language>" (e.g., "Fix: Order history crashes for some users")
+## Symptoms
+- What users are seeing (exact error messages, behavior)
+- When it started / how often it happens
+- How many users are affected
+## Reproduction Steps (extracted from transcript)
+1. [Step extracted from discussion, or "Needs investigation" if not described]
+2. ...
+## Suspected Area
+- Module/component mentioned in discussion (if any)
+- Related recent changes (if mentioned)
+## Severity
+- P0: Security vulnerability, data loss, full outage
+- P1: Feature broken for subset of users, data integrity issue
+- P2: Degraded experience, workaround exists
+- P3: Minor annoyance, cosmetic, performance nit
+```
+**Critical rule for bugs**: The first stage when a worker picks up a bug is **Root Cause Analysis (Phase 2.5c)**, NOT implementation. The task description must frame the work as "Investigate and fix" — never "Implement" or "Build". This ensures the worker runs the RCA skill (hypothesis-driven debugging) before writing any code.
+**Bug severity auto-mapping** (extracted from transcript signals):
+| Signal in Transcript | Severity | Label |
+|---|---|---|
+| "security", "vulnerability", "unauthorized", "bypass", "data exposed" | P0 | `bug`, `security` |
+| "crashes", "500 error", "blank page", "data loss", "wrong totals" | P1 | `bug` |
+| "slow", "degraded", "intermittent", "workaround" | P2 | `bug` |
+| "annoying", "cosmetic", "minor", "eventually" | P3 | `bug` |
+#### 3b. Acceptance Criteria (generate 3-5 per story)
+```
+## Acceptance Criteria
+- [ ] Given [precondition], when [action], then [expected result]
+- [ ] Given [precondition], when [action], then [expected result]
+- [ ] Edge case: [scenario] is handled gracefully
+- [ ] Error state: [failure mode] shows appropriate message
+- [ ] Performance: [action] completes within [threshold] (if applicable)
+```
+#### 3c. Subtask Decomposition (generate 3-6 subtasks per story)
+Break each story into implementation subtasks:
+```
+Subtasks:
+  1. [Backend] <API/data layer work>
+  2. [Frontend] <UI component work> (if applicable)
+  3. [Validation] <input validation, error handling>
+  4. [Tests] <unit + integration tests>
+  5. [Docs] <API docs, user docs> (if applicable)
+  6. [Migration] <data migration, schema changes> (if applicable)
+```
+#### 3d. Sizing & Priority
+For each story, estimate:
+- **T-shirt size**: XS (< 2h) | S (half day) | M (1-2 days) | L (3-5 days) | XL (1+ week)
+- **Story points**: 1 | 2 | 3 | 5 | 8 | 13 (Fibonacci, based on complexity + uncertainty)
+- **Priority**: P0 (critical/blocking) | P1 (must-have this sprint) | P2 (should-have) | P3 (nice-to-have)
+- **Priority rationale**: one sentence explaining why this priority
+Use these signals for priority:
+- P0: "urgent", "blocking", "production issue", "customer escalation"
+- P1: "important", "committed", "this sprint", "promised to customer"
+- P2: "should do", "next sprint candidate", "good improvement"
+- P3: "nice to have", "someday", "low impact"
+#### 3e. Epic Grouping
+Group related stories under an epic:
+- If 3+ stories share a theme (e.g., "authentication", "dashboard", "API v2") → create an epic
+- Epic title format: `[Epic] <theme name>`
+- Each epic gets a high-level description summarizing the business goal
+- Stories reference their parent epic via label `epic:<epic-title-slug>`
+#### 3f. Dependency Detection
+Identify dependencies between stories:
+- "before we can do X, we need Y" → Y blocks X
+- "this depends on the API being ready" → dependency noted
+- Add dependency info to the Technical Notes section of the story
+#### 3g. Customer Context (for customer/stakeholder meetings)
+If customer context was detected in Phase 1, attach to each relevant story:
+```
+## Customer Context
+- **Requested by**: [customer name/company]
+- **Business justification**: [why they need this]
+- **Commitment**: [was anything promised? timeline?]
+- **Impact**: [how many users/accounts affected]
+```
+### Phase 4 — HUMAN GATE: Review Structured Backlog
+<HARD-GATE>
+Do NOT create any issues, epics, or stories on the board until the user has reviewed and approved the structured backlog.
+Do NOT skip this gate even if the user says "just create everything" before seeing the structured output.
+Present the full backlog first, then wait for explicit approval.
+</HARD-GATE>
+**STOP. Present the full structured backlog for approval:**
+```
+📋 Meeting → Backlog Results
+Meeting type: {standup|planning|retro|customer-call}
+Speakers identified: {list}
+Customer: {customer name, if applicable}
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+EPICS ({count}):
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+📦 Epic 1: "{epic title}"
+   {epic description}
+   Stories: {count} | Total points: {sum}
+   📝 Story 1.1: "{title}" [{T-shirt}] [{points}pts] [P{priority}]
+      As a {persona}, I want {X}, so that {Y}
+      Acceptance Criteria: {count} items
+      Subtasks: {count} items
+      Assignee: @{assignee} (or unassigned)
+      Dependencies: {list or "none"}
+   📝 Story 1.2: "{title}" [{T-shirt}] [{points}pts] [P{priority}]
+      ...
+📦 Epic 2: "{epic title}"
+   ...
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+STANDALONE ITEMS:
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+BUG REPORTS ({count}):
+  1. 🐛 [P{severity}] "{title}" — {symptom summary}
+     Labels: bug{, security if applicable}
+     Suspected area: {module}
+     Worker flow: RCA (Phase 2.5c) → Fix → Test → PR
+STATUS UPDATES ({count}):
+  1. 🔄 @{person}: {task} — {update}
+BLOCKERS ({count}):
+  1. 🚫 @{person}: {blocker description}
+DECISIONS ({count}):
+  1. 💡 {decision} — agreed by: {participants}
+ACTION ITEMS ({count}):
+  1. ⏰ @{person}: {action} — due: {date if mentioned}
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+SUMMARY:
+  Epics: {N} | Stories: {N} | Bugs: {N} | Subtasks: {N}
+  Total points: {N} | Avg priority: P{N}
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Options:
+  [A] Approve all → create everything on board
+  [E] Edit → modify specific items
+  [S] Split → break a story into smaller stories
+  [M] Merge → combine similar stories
+  [R] Re-prioritize → change priorities
+  [C] Cancel → discard
+```
+User can:
+- **Approve all** → proceed to create everything
+- **Edit** → modify specific items (change title, acceptance criteria, reassign, resize, re-prioritize, remove)
+- **Split** → break a large story into smaller ones
+- **Merge** → combine duplicate or overlapping stories
+- **Re-prioritize** → adjust priority assignments
+- **Cancel** → stop
+### Phase 5 — Board Creation Pipeline
+Execute in this order:
+**Step 1: Create Epics**
+For each epic:
+1. Call `catalyst_board_create_epic` with title, description, labels
+2. Note the created epic issue ID
+**Step 2: Create User Stories**
+For each story:
+1. Build the full description with Summary, User Story statement, Acceptance Criteria, Technical Notes, Customer Context (if applicable), Dependencies
+2. Call `catalyst_board_create` with:
+   - title
+   - structured description (see format below)
+   - assignee
+   - labels: `["user-story", "epic:<epic-slug>", "size:<t-shirt>", "priority:P{N}"]`
+     - **If the story was tagged NEEDS_DESIGN in Phase 2**, add `needs-design` to labels
+     - **If the story was tagged NEEDS_ARCHITECTURE in Phase 2**, add `needs-architecture` to labels
+     - **If the story was tagged BUG_REPORT in Phase 2**, add `bug` to labels (and `security` if it's a security bug). Use `"bug"` instead of `"user-story"` as the first label. **This is critical** — without the `bug` label, the worker will skip RCA (Phase 2.5c) and jump straight to implementation, meaning the root cause is never properly diagnosed.
+     - A story can have multiple signal labels (e.g., `bug` + `needs-design` if fixing requires design evaluation)
+   - priority
+   - points (Fibonacci story points)
+3. Note the created story issue ID
+**Story description format for board_create (features):**
+```markdown
+## Summary
+{what and why — 2-3 sentences}
+## User Story
+As a {persona}, I want {capability}, so that {business value}.
+## Acceptance Criteria
+- [ ] Given {precondition}, when {action}, then {expected result}
+- [ ] ...
+## Technical Notes
+- Stack: {relevant technologies}
+- Dependencies: {blocking stories or external deps}
+- Constraints: {performance, security, compatibility requirements}
+```
+**Bug description format for board_create (BUG_REPORT items):**
+```markdown
+## Summary
+Investigate and fix: {symptom description — 1-2 sentences}
+## Symptoms
+- {what users are experiencing}
+- {frequency / affected user count if mentioned}
+## Reproduction Steps
+1. {step from transcript or "Needs investigation"}
+2. ...
+## Suspected Area
+- {module/file/component mentioned in discussion}
+- {related recent changes if mentioned}
+## Technical Notes
+- Severity: {P0-P3 with rationale}
+- Stack: {relevant technologies}
+- Dependencies: {blocking issues or external deps}
+## Customer Context
+- Requested by: {customer} (if applicable)
+- Business justification: {why}
+```
+**Step 3: Create Subtasks**
+For each story's subtasks:
+1. Call `catalyst_board_create_subtask` with:
+   - parent_id: the story's issue ID
+   - title: `[Backend] Implement user authentication endpoint`
+   - description: implementation details
+   - labels: `["subtask", "epic:<epic-slug>"]`
+2. Note created subtask IDs
+**Step 4: Status Updates, Blockers, Decisions**
+Route the remaining item types as follows:
+- STATUS_UPDATE → `catalyst_board_comment` or `catalyst_board_move`
+- BLOCKER → `catalyst_board_comment` or create blocker issue
+- DECISION → `catalyst_knowledge_store` with type: "decision"
+- ACTION_ITEM → board issue or knowledge store
+**Step 5: Autopilot (optional)**
+If user requested autopilot → call `catalyst_worker_start` for each created story
+### Phase 6 — Summary
+Present final summary:
+```
+✅ Meeting → Backlog Complete
+Epics created:    {N} (#{ids})
+Stories created:  {N} (#{ids})
+Subtasks created: {N} (#{ids})
+Total points:     {N}
+Updated:          {N} existing items
+Blockers:         {N} logged
+Decisions:        {N} stored in knowledge base
+Autopilot:        {N} workflows started (if any)
+Board: Use catalyst_board_view to see the full board
+Dependency chain:
+  Story #{X} → blocks → Story #{Y}
+  Story #{Z} → blocks → Story #{W}
+```
+## Output Format
+Use the structured formats shown in each phase. Group by epic → story → subtask. Always show counts and point totals.
+## Anti-Patterns
+- Do NOT create issues without showing them to the user first (Phase 4 gate is mandatory)
+- Do NOT guess assignees if the transcript doesn't mention names — leave unassigned
+- Do NOT extract noise/filler as tasks
+- Do NOT create duplicate issues — if the transcript mentions an existing task, UPDATE it
+- Do NOT auto-start autopilot without explicit user consent
+- Do NOT include meeting transcript verbatim in issue descriptions — summarize
+- Do NOT create stories without acceptance criteria — every story needs at least 3 criteria
+- Do NOT skip subtask decomposition — every story gets 3-6 subtasks
+- Do NOT assign arbitrary story points — use complexity signals from the transcript
+- Do NOT create an epic for a single story — epics need 3+ related stories (otherwise standalone)
+## Chains To
+- `autopilot-worker` — for stories that should be implemented automatically
+- `knowledge-base` — decisions and customer context are always stored
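
The bug severity auto-mapping table and the label rules in Phase 5, Step 2 of the Autopilot Meeting skill can be sketched as two plain functions. This is an illustration, not code from the published package: `bug_severity` and `story_labels` are hypothetical names, plain substring matching is an assumption, and so is checking the most severe signals first when a transcript matches several rows.

```python
from typing import List, Optional, Set

# Signal phrases copied from the bug severity auto-mapping table,
# ordered most severe first (the ordering is an assumption).
BUG_SEVERITY_SIGNALS = [
    ("P0", ("security", "vulnerability", "unauthorized", "bypass", "data exposed")),
    ("P1", ("crashes", "500 error", "blank page", "data loss", "wrong totals")),
    ("P2", ("slow", "degraded", "intermittent", "workaround")),
    ("P3", ("annoying", "cosmetic", "minor", "eventually")),
]


def bug_severity(text: str) -> Optional[str]:
    """Return the first matching severity, or None if no signal is present."""
    lowered = text.lower()
    for severity, signals in BUG_SEVERITY_SIGNALS:
        if any(signal in lowered for signal in signals):
            return severity
    return None


def story_labels(tags: Set[str], size: str, priority: str,
                 epic_slug: Optional[str] = None,
                 security: bool = False) -> List[str]:
    """Build board labels; 'bug' replaces 'user-story' for BUG_REPORT items."""
    is_bug = "BUG_REPORT" in tags
    labels = ["bug" if is_bug else "user-story"]
    if epic_slug is not None:  # standalone items have no epic label
        labels.append(f"epic:{epic_slug}")
    labels += [f"size:{size}", f"priority:{priority}"]
    if is_bug and security:
        labels.append("security")
    if "NEEDS_DESIGN" in tags:
        labels.append("needs-design")
    if "NEEDS_ARCHITECTURE" in tags:
        labels.append("needs-architecture")
    return labels


if __name__ == "__main__":
    note = "Order history crashes with a 500 error and is slow for some users"
    print(bug_severity(note))
    print(story_labels({"BUG_REPORT", "NEEDS_DESIGN"}, size="M", priority="P1"))
```

Substring matching is deliberately crude here; a word such as "slowly" would also trigger the P2 row, so a real classifier would likely need word boundaries or model judgment.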