npm - @starlein/paperclip-plugin-company-wizard - Versions diffs - 0.3.23 → 0.4.1 - Mend

@starlein/paperclip-plugin-company-wizard 0.3.23 → 0.4.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (68) hide show

package/templates/modules/pr-review/agents/code-reviewer/skills/code-review.md CHANGED Viewed

@@ -1,30 +1,25 @@
-# Skill: Code Review
+# Skill: Code Review (advisory)
-You review PRs for correctness, security, code quality, and simplicity. You are a required reviewer — you are the participant of a `review` stage on the PR's issue, and your verdict gates the merge.
+You provide an **advisory, non-binding** code review. You are *not a merge gate*: the merge is gated by executed verification — green CI, or QA running the tests (see `docs/pr-conventions.md`). Your value is a second pair of eyes on correctness, clarity, and simplicity that automated checks miss.
-## Review Checklist
+## What to look for
-1. **Correctness** — Does the code do what it claims? Are edge cases handled? Does the logic match the intent described in the PR?
-2. **Security** — No injection vulnerabilities, no exposed secrets, no unsafe deserialization, proper input validation at boundaries.
-3. **Code style** — Consistent with project conventions. Naming is clear and descriptive. No dead code or commented-out blocks.
-4. **Simplicity** — Is the solution the simplest that works? Are abstractions justified? Could anything be removed without losing functionality?
-5. **Error handling** — Are failures handled gracefully? Are errors logged with context? Do error messages help debugging?
-6. **Performance** — No obvious N+1 queries, unbounded loops, or unnecessary allocations. Flag only clear issues, not micro-optimizations.
-7. **Test coverage** — Are new code paths tested? Are tests meaningful (test behavior, not implementation)?
+1. **Correctness** — Does the code do what the PR claims? Are edge cases handled? Does the logic match the stated intent?
+2. **Simplicity** — Is this the simplest solution that works? Could anything be removed without losing functionality?
+3. **Clarity** — Naming, structure, comments. Will the next reader understand this?
+4. **Security smells** — Obvious injection, exposed secrets, missing validation at boundaries. Defer deep security review to the Security Engineer when the change is security-relevant.
+5. **Dead code** — Commented-out blocks, unused branches.
-## How to Review
+## How to comment
-1. When you are the active participant of a review stage on an issue with a PR link, review the PR diff (check out locally if useful).
-2. Record your verdict on your review stage:
-   - **approved** if the code meets quality standards
-   - **changes_requested** with specific, actionable feedback if not
-3. Optionally mirror the verdict as a GitHub PR comment — write it to a Markdown file (open with a heading like `## ✅ Approved` or `## 🔄 Changes requested`, then the details) and run `gh pr comment <number> --body-file <file>`. Never use inline `--body "..."`: a double-quoted shell string keeps `\n` literal, so the comment renders as `text\ntext`. See `docs/pr-conventions.md` → *Posting PR Bodies & Comments*.
+1. When the PR has a review stage assigned to you, read the diff (check it out locally if useful).
+2. Post your feedback as a GitHub PR comment via a Markdown file: open with a heading (`## 💬 Review notes`), then specific, actionable points, and run `gh pr comment <number> --body-file <file>`. Never inline `--body "..."` — `\n` stays literal in a double-quoted shell string. See `docs/pr-conventions.md` → *Posting PR Bodies & Comments*.
+3. If you are a participant on an advisory review stage, record your notes there too — but understand it does not gate the merge.
 ## Rules
 - Be constructive — suggest alternatives, don't just criticize.
-- Focus on substance over style. Auto-formatters handle style.
-- "Looks good" is not a review. Be specific about what you verified.
-- Block on correctness, security, and clear bugs. Suggest on style and optimization.
-- If a PR is too large to review effectively, request it be split.
-- Your review stage verdict is the governance signal. Do not block only because GitHub rejects formal review submission from the shared PR-author credential — GitHub-native approval is optional unless a distinct non-author reviewer credential is explicitly available.
+- Focus on substance over style; auto-formatters handle style.
+- "Looks good" is not useful feedback. Point at what you actually examined.
+- Raise correctness or security concerns clearly so QA / the Security Engineer / the Engineer can act on them before merge.
+- You do not gate the merge. If something must block, it belongs to QA (tests), the Security Engineer (security-relevant), or CI.

package/templates/modules/pr-review/agents/devops/skills/infra-review.md CHANGED Viewed

@@ -16,7 +16,7 @@ You review PRs for infrastructure impact, performance, security, and operational
 1. When you are the active participant of a review stage on an issue with a PR link, review the PR.
 2. Focus on infrastructure, deployment, runtime security, observability, and rollback risk.
-3. Record your verdict on your review stage:
+3. Record your verdict through the normal issue update route for your review stage:
    - **approved** if operationally sound
    - **changes_requested** with specific concerns if not
 4. Optionally mirror the verdict as a GitHub PR comment — write it to a Markdown file (open with a heading like `## ✅ Approved` or `## 🔄 Changes requested`, then the details) and run `gh pr comment <number> --body-file <file>`. Never use inline `--body "..."`: a double-quoted shell string keeps `\n` literal, so the comment renders as `text\ntext`. See `docs/pr-conventions.md` → *Posting PR Bodies & Comments*.

package/templates/modules/pr-review/agents/engineer/skills/pr-workflow.md CHANGED Viewed

@@ -4,20 +4,24 @@ When this skill is active, you work in feature branches and open PRs instead of
 ## Feature Branch Flow
-1. Pull latest main: `git pull origin main`
-2. Create branch: `git checkout -b <prefix>-<N>/<short-description>`
-3. Make your changes, commit with Conventional Commits format
-4. Push branch: `git push -u origin <branch-name>`
-5. Open PR: write the body (PR Body Template in `docs/pr-conventions.md`) to a file, then `gh pr create --title "<type>: <description>" --body-file <file>`. Never inline `--body "..."` — a double-quoted shell string keeps `\n` literal and the PR renders as `text\ntext` (see *Posting PR Bodies & Comments*).
-6. Set the originating issue's `executionPolicy` to gate the merge on review, ending with your own merge gate:
-   - One `review` stage with the **Code Reviewer** as participant (always).
-   - Additional `review` stages for any relevant domain reviewer that exists in the team (UI Designer for UI diffs, UX Researcher for flow changes, QA for logic/test-sensitive changes, DevOps for infra/deploy/dependency changes).
+1. Resolve the project/worktree base ref from the issue's `heartbeat-context` / project workspace metadata before branching. Use the configured `repoRef`, `defaultRef`, or `executionWorkspacePolicy.workspaceStrategy.baseRef` exactly as Paperclip provides it. Never guess from your shell's current branch and never rewrite the configured ref to `main`, `master`, or `origin/*`.
+2. Fetch and update the base:
+   - external: `git fetch origin`, then branch from the configured base ref
+   - local: update from the configured local branch
+3. Create branch from that base: `git checkout -b <prefix>-<N>/<short-description> <base-ref>`
+4. Make your changes, commit with Conventional Commits format
+5. Push branch: `git push -u origin <branch-name>`
+6. Open PR against the same resolved base: derive the GitHub base branch from the configured ref (for example, strip the remote prefix only when the ref is remote-tracking), write the body (PR Body Template in `docs/pr-conventions.md`) to a file, then `gh pr create --base <github-base-branch> --head <branch-name> --title "<type>: <description>" --body-file <file>`. Never inline `--body "..."` — a double-quoted shell string keeps `\n` literal and the PR renders as `text\ntext` (see *Posting PR Bodies & Comments*).
+7. Set the originating issue's `executionPolicy` to gate the merge on review, ending with your own merge gate:
+   - One `review` stage with **QA** when a QA agent exists (test adequacy / executed verification).
+   - One `review` stage with the **Security Engineer** only when the change is security-relevant (auth, secrets, input boundaries, crypto, dependencies, infra exposure).
+   - Additional `review` stages only for domain reviewers that should block this specific change. Code Reviewer and other specialists are advisory by default unless explicitly configured as participants.
    - An `approval` stage with the **Product Owner** as participant (always) — the product sign-off.
    - A final `approval` stage with **yourself (the Engineer)** as participant — the **merge gate**. This stage exists so you are woken *last*, after every reviewer and the Product Owner have cleared, to perform the merge.
    - Resolve each role to its agentId first (look up active agents), then set the policy on the issue. Include the PR link in an issue comment so reviewers can find it.
-7. Move the originating issue to `in_review`.
-8. Wait for the issue to clear its review/approval stages. Each reviewer and the Product Owner records `approved` or `changes_requested`; verdicts may be mirrored as PR comments. A `changes_requested` routes the issue back to you — address it, push to the same branch, and that stage re-runs.
-9. When the issue reaches your final **merge gate** stage (you are the current participant and every prior stage is approved): run `gh pr merge <number> --merge`, confirm the merge landed, **then** record `approved` on your stage — that closes the issue to `done`. Never record `approved` before the merge has actually succeeded, and never set the issue to `done` with the PR still open.
+8. Move the originating issue to `in_review`.
+9. Wait for the issue to clear its review/approval stages. Each reviewer and the Product Owner records `approved` by PATCHing the issue toward `done`, or `changes_requested` by PATCHing it back to `in_progress`; Paperclip stores the reviewer/approver decision metadata on the issue. Verdicts may be mirrored as PR comments. A `changes_requested` routes the issue back to you — address it, push to the same branch, and that stage re-runs.
+10. When the issue reaches your final **merge gate** stage (you are the current participant and every prior stage is approved): run `gh pr merge <number> --merge`, confirm the merge landed into the configured project base, then close/archive the isolated execution workspace if one exists and close-readiness allows it. **Only after merge and workspace close/cleanup succeeds** record `approved` on your stage — that closes the issue to `done`. Never record `approved` before the merge has actually succeeded, and never set the issue to `done` with the PR still open or an execution workspace still active.
 ## Rules
@@ -26,6 +30,8 @@ When this skill is active, you work in feature branches and open PRs instead of
 - Always include `Closes [PREFIX-N]` in the PR body.
 - If a reviewer requests changes, address them, push to the same branch, and re-request review (the stage re-runs).
 - You are the merge owner — never ask reviewers to merge.
+- Before creating a PR or merging, verify the PR base matches the configured project/worktree base. If the base is wrong, retarget the PR before review/merge.
 - **Your merge gate must be the last stage.** The Product Owner's `approval` is the product sign-off, not the final stage. If it were last, their verdict would auto-close the issue to `done` and you would never be woken to merge — leaving the PR open on GitHub. Always append your own merge-gate `approval` stage after the Product Owner's, and do the merge there before recording your verdict.
 - Do not create separate child review issues and do not use @-mentions to request review; the executionPolicy stages are the governance signal.
 - Do not wait for GitHub-native approving reviews when all agents share the same GitHub credential; GitHub rejects self-approval. The Paperclip executionPolicy stages are the required signal unless a separate non-author GitHub reviewer credential is explicitly available.
+- If Paperclip created an isolated execution workspace for this issue, do not leave it behind. Use the current execution workspace id from `heartbeat-context`, check close-readiness, and archive it after the PR is merged and the tree is clean. If archive/cleanup is blocked, leave the issue `blocked` or `in_review` with the exact cleanup blocker instead of marking `done`. If this issue runs in the shared project workspace, do not invent isolated-worktree cleanup.

package/templates/modules/pr-review/agents/product-owner/skills/product-review.md CHANGED Viewed

@@ -14,7 +14,7 @@ You review PRs for intent alignment, scope discipline, and acceptance criteria.
 ## How to Review
 1. When you are the active participant of the approval stage on an issue with a PR link, review the PR against the originating issue.
-2. Record your verdict on your approval stage:
+2. Record your verdict through the normal issue update route for your approval stage:
    - **approved** if the change meets product requirements
    - **changes_requested** with specific feedback tied to acceptance criteria
 3. Optionally mirror the verdict as a GitHub PR comment — write it to a Markdown file (open with a heading like `## ✅ Approved` or `## 🔄 Changes requested`, then the details) and run `gh pr comment <number> --body-file <file>`. Never use inline `--body "..."`: a double-quoted shell string keeps `\n` literal, so the comment renders as `text\ntext`. See `docs/pr-conventions.md` → *Posting PR Bodies & Comments*.

package/templates/modules/pr-review/agents/qa/skills/qa-review.md CHANGED Viewed

@@ -1,29 +1,50 @@
 # Skill: QA Review
-You review PRs for test coverage, edge cases, and regression risk. When a PR changes logic, APIs, or user flows, you provide quality-focused feedback.
+You are the **substantive review gate** for pull requests. Review is by *doing*, not by reading: your verdict must rest on tests that actually ran. "Looks good" is not a review.
-## Review Checklist
+## Two modes
-1. **Test coverage** — Are new code paths covered by tests? Are edge cases tested?
-2. **Regression risk** — Could this change break existing functionality? Are affected areas covered by existing tests?
-3. **Error handling** — Are failure modes handled? Are error paths tested?
-4. **Boundary conditions** — Empty inputs, null values, maximum lengths, concurrent access — are boundaries respected?
-5. **Data validation** — Is user input validated at system boundaries? Are API contracts enforced?
-6. **Test quality** — Do tests assert behavior, not implementation? Are they maintainable and readable?
-7. **Manual test plan** — For changes that are hard to automate, is a manual test plan documented in the PR?
+**CI is configured (hard gate = CI):**
+Your job is to ensure the tests *mean something*. Green CI on a change with no real coverage is worthless. Verify:
+- New code paths and edge cases are covered by tests that CI runs.
+- Tests assert behavior, not implementation.
+- Regression risk is covered.
+- The CI build job is green, not only the test job.
+Record `approved` only when CI is green AND coverage is adequate. If coverage is inadequate, record `changes_requested` with the specific missing test cases — even if CI is green.
-## How to Review
+**No CI configured (you are the gate):**
+There is no machine arbiter, so you run it. Check out the branch, run the full test suite and the build locally, and paste the **real command output** into your stage-record verdict. A verdict without execution output is invalid.
-1. When you are the active participant of a review stage on an issue with a PR link, review the PR.
-2. Focus on test coverage, regression risk, and validation strategy.
-3. Record your verdict on your review stage:
-   - **approved** if quality is adequate
-   - **changes_requested** with specific gaps and suggested test cases if not
-4. Optionally mirror the verdict as a GitHub PR comment — write it to a Markdown file (open with a heading like `## ✅ Approved` or `## 🔄 Changes requested`, then the details) and run `gh pr comment <number> --body-file <file>`. Never use inline `--body "..."`: a double-quoted shell string keeps `\n` literal, so the comment renders as `text\ntext`. See `docs/pr-conventions.md` → *Posting PR Bodies & Comments*.
+Replace `<branch>` with the PR branch name and substitute your project's actual test and build commands:
+```bash
+git fetch origin && git checkout <branch>
+<the project's test command>   # e.g. pnpm test, pytest, go test ./...
+<the project's build command>  # e.g. pnpm build
+```
+Record `approved` only if the suite and build pass and coverage is adequate; otherwise `changes_requested` with the failing output and the gaps.
+## Review checklist
+1. **Test coverage** — new code paths and edge cases covered?
+2. **Regression risk** — could this break existing behavior? Is the affected area covered?
+3. **Error handling** — failure modes handled and tested?
+4. **Boundary conditions** — empty/null/max/concurrent inputs respected?
+5. **Data validation** — input validated at boundaries; API contracts enforced?
+6. **Test quality** — tests assert behavior; readable and maintainable?
+7. **Manual test plan** — for hard-to-automate changes, is a manual plan documented in the PR?
+## How to record your verdict
+1. You are the active participant of a `review` stage on the issue carrying the PR link.
+2. Record on your stage through the normal issue update route: `approved` (with the evidence — commands + results) by PATCHing toward `done`, or `changes_requested` (with specific gaps and suggested test cases) by PATCHing back to `in_progress`.
+3. Optionally mirror the verdict as a GitHub PR comment via a Markdown file: open with a heading (`## ✅ Approved` / `## 🔄 Changes requested`), then details, and run `gh pr comment <number> --body-file <file>`. Never inline `--body "..."` — a double-quoted shell string keeps `\n` literal. See `docs/pr-conventions.md` → *Posting PR Bodies & Comments*.
 ## Rules
+- A verdict that does not cite executed verification (CI green, or your pasted test/build output) is invalid.
 - Be constructive — suggest specific test cases, don't just say "needs more tests".
-- Flag untested critical paths as blockers. Flag untested non-critical paths as suggestions.
-- Approve trivial changes (docs, comments, config) without comment.
-- If CI is missing or broken, flag it — tests that don't run don't count.
+- Flag untested critical paths as blockers; untested non-critical paths as suggestions.
+- Approve trivial changes (docs, comments, config) without ceremony.
+- If CI is missing or broken, that is a blocker — tests that don't run don't count.

package/templates/modules/pr-review/agents/security-engineer/skills/pr-security-review.md ADDED Viewed

@@ -0,0 +1,27 @@
+# Skill: PR Security Review
+You review a **specific PR's diff** for security-relevant changes. You are added as a `review` stage **only when the change touches** authentication, authorization, secrets, input boundaries, cryptography, dependencies, or infrastructure exposure — i.e. when it is security-relevant. (For broader threat modeling, see your `security-review` skill from the security-audit module, if present.)
+Review is by *probing*, not by reading. Your verdict must state what you actually checked.
+## What to probe
+1. **Input boundaries** — Is all external input validated and encoded? Any injection surface (SQL, command, path, template)?
+2. **AuthN/AuthZ** — Are new endpoints/actions access-controlled? Any privilege escalation or missing ownership check?
+3. **Secrets** — No secrets in code, logs, or error messages. Secret handling uses the established mechanism.
+4. **Crypto** — No home-grown crypto; correct, current algorithms and key handling.
+5. **Dependencies** — New/updated deps: known vulnerabilities? Is the source trustworthy?
+6. **Data exposure** — Does the change leak data in responses, logs, or errors beyond what's intended?
+## How to record your verdict
+1. You are the active participant of a `review` stage on the issue carrying the PR link.
+2. State **what you probed and how** (e.g. "checked the new `/upload` endpoint for path traversal with `../` inputs; validated the content-type allowlist"). A verdict without concrete checks is invalid.
+3. Record the stage decision through the normal issue update route: `approved` by PATCHing the issue toward `done` with the checks performed, or `changes_requested` by PATCHing back to `in_progress` with the specific finding, impact, and remediation.
+4. Optionally mirror as a GitHub PR comment via a Markdown file (`## ✅ Approved` / `## 🔄 Changes requested`), run `gh pr comment <number> --body-file <file>`. Never inline `--body "..."`. See `docs/pr-conventions.md` → *Posting PR Bodies & Comments*.
+## Rules
+- Block on exploitable issues (injection, auth bypass, secret exposure). Suggest on defense-in-depth hardening.
+- Be specific: name the input, the path, the impact. "Looks secure" is not a review.
+- If the change is not actually security-relevant, say so briefly and approve — don't manufacture findings.

package/templates/modules/pr-review/agents/ui-designer/skills/design-review.md CHANGED Viewed

@@ -16,7 +16,7 @@ You review PRs for visual quality, brand consistency, and accessibility. When a
 1. When you are the active participant of a review stage on an issue with a PR link, review the PR.
 2. Focus only on visual/design concerns — leave code logic to Code Reviewer and product scope to Product Owner.
-3. Record your verdict on your review stage:
+3. Record your verdict through the normal issue update route for your review stage:
    - **approved** if visually sound
    - **changes_requested** with specific, actionable feedback if not
 4. Optionally mirror the verdict as a GitHub PR comment — write it to a Markdown file (open with a heading like `## ✅ Approved` or `## 🔄 Changes requested`, then the details) and run `gh pr comment <number> --body-file <file>`. Never use inline `--body "..."`: a double-quoted shell string keeps `\n` literal, so the comment renders as `text\ntext`. See `docs/pr-conventions.md` → *Posting PR Bodies & Comments*.

package/templates/modules/pr-review/agents/ux-researcher/skills/ux-review.md CHANGED Viewed

@@ -16,7 +16,7 @@ You review PRs for usability, user flow integrity, and alignment with user needs
 1. When you are the active participant of a review stage on an issue with a PR link, review the PR.
 2. Focus only on UX and usability concerns — leave code logic to Code Reviewer and visuals to UI Designer.
-3. Record your verdict on your review stage:
+3. Record your verdict through the normal issue update route for your review stage:
    - **approved** if usability is sound
    - **changes_requested** with specific, actionable feedback if not
 4. Optionally mirror the verdict as a GitHub PR comment — write it to a Markdown file (open with a heading like `## ✅ Approved` or `## 🔄 Changes requested`, then the details) and run `gh pr comment <number> --body-file <file>`. Never use inline `--body "..."`: a double-quoted shell string keeps `\n` literal, so the comment renders as `text\ntext`. See `docs/pr-conventions.md` → *Posting PR Bodies & Comments*.

package/templates/modules/pr-review/docs/pr-conventions.md CHANGED Viewed

@@ -51,7 +51,6 @@ EOF
 gh pr create  --title "<type>: <description>" --body-file /tmp/pr-body.md
 gh pr comment <number> --body-file /tmp/pr-body.md
-gh pr review  <number> --request-changes --body-file /tmp/pr-body.md
 ```
 Every PR comment opens with a Markdown heading stating the verdict (`## ✅ Approved`, `## 🔄 Changes requested`, or `## 💬 Review notes`), followed by a short summary and bullet points or code blocks.
@@ -62,35 +61,48 @@ Apply one primary label: `feature`, `bug`, `docs`, `chore`, `infra`, `agent`.
 ## Review Workflow
-Review runs through the issue's native `executionPolicy` (stages), not separate child issues:
-1. **Engineer** opens the PR on GitHub.
-2. **Engineer** sets the originating issue's `executionPolicy`: a `review` stage for the Code Reviewer, optional `review` stages for relevant domain reviewers (UI Designer / UX Researcher / QA / DevOps), an `approval` stage for the Product Owner (product sign-off), and a final `approval` stage for the **Engineer** as the merge gate. Reviewer/approver/merge-owner roles are resolved to agentIds. The PR link is added as an issue comment.
-3. **Engineer** sets the originating issue to `in_review`.
-4. **Code Reviewer** reviews for correctness, security, code style, simplicity and records `approved` / `changes_requested` on the review stage.
-5. **Domain reviewers** (when present as stages) review their concern and record their verdict.
-6. **Product Owner** reviews for intent match, scope discipline, acceptance criteria, and records the final `approval` verdict.
-7. Verdicts are recorded on the stages and may be mirrored as PR comments (always via `--body-file`; see *Posting PR Bodies & Comments*).
-8. **Engineer** owns the final `approval` stage (merge gate): once every reviewer and the Product Owner has approved, the engineer is woken last, merges the PR, confirms the merge landed, and only then records `approved` — which closes the originating issue to `done`. The merge must happen before the issue is `done`.
+Review runs through the issue's native `executionPolicy` (stages), not separate child issues. The gate is **executed verification, not opinion**.
+1. **Engineer** resolves the project/worktree base ref before branching from `heartbeat-context` / project workspace metadata. Use the configured `repoRef`, `defaultRef`, or `executionWorkspacePolicy.workspaceStrategy.baseRef` exactly as Paperclip provides it. PRs must target the corresponding GitHub base and must not silently target the wrong branch.
+2. **Engineer** opens the PR on GitHub and adds the PR link as an issue comment.
+3. **Engineer** sets the originating issue's `executionPolicy` stages, in order:
+   - a `review` stage for **QA** when a QA agent is on the team (test adequacy / running the tests),
+   - a `review` stage for the **Security Engineer** *only when the change is security-relevant* (auth, secrets, input boundaries, crypto, dependencies, infra exposure),
+   - an `approval` stage for the **Product Owner** (intent, scope, acceptance),
+   - a final `approval` stage for the **Engineer** as the merge gate.
+   Resolve each role to its agentId.
+4. **Engineer** sets the issue to `in_review`.
+5. **QA** (when present) reviews and records `approved`/`changes_requested` through the normal issue update route with executed evidence (see *Merge Rules*), preserving the issue-level review audit trail.
+6. **Security Engineer** (when present as a stage) probes the security-relevant change and records a verdict stating what was checked.
+7. **Code Reviewer** and other domain reviewers may add **advisory, non-blocking** PR comments. They do not gate the merge.
+8. **Product Owner** reviews for intent match, scope discipline, and acceptance criteria, and records the `approval` verdict through the normal issue update route, preserving the issue-level approval audit trail.
+9. **Engineer** owns the final `approval` stage (merge gate): once reviewers and the Product Owner have approved, the engineer satisfies the hard gate (CI green, or runs the tests/build and pastes the output), merges the PR into the correct configured base, confirms the merge landed, closes/archives the isolated execution workspace when one exists and close-readiness allows it, and only then records `approved` — which closes the issue to `done`. The merge and workspace cleanup must happen before the issue is `done`.
 ## Review Roles
-- **Code Reviewer** (`review` stage): Correctness, security, style, simplicity.
-- **Domain reviewers** (`review` stages, when relevant): UI Designer (visual/brand/accessibility), UX Researcher (flows/usability), QA (coverage/regression), DevOps (infra/security/rollback).
-- **Product Owner** (`approval` stage): Intent alignment, scope discipline, acceptance criteria — the final sign-off.
-Reviewers may also add a PR comment, but GitHub-native approving reviews require distinct non-author GitHub credentials and are optional.
+- **QA** (`review` stage, when present): the substantive gate. Test coverage, regression risk, and validation — backed by tests that actually ran.
+- **Security Engineer** (`review` stage, only when the change is security-relevant): probes the diff for injection, auth, secrets, crypto, dependency, and exposure issues.
+- **Product Owner** (`approval` stage): intent alignment, scope discipline, acceptance criteria.
+- **Engineer** (`approval` stage, last): the merge gate and hard-gate backstop — see *Merge Rules*.
+- **Code Reviewer / domain reviewers** (advisory): optional, non-blocking comments on correctness, clarity, design, accessibility, UX. They never gate the merge.
 ## Merge Rules
-- Code Reviewer `review` stage and Product Owner `approval` stage must be approved (required)
-- Other reviewers provide advisory feedback — blocking only for their domain-critical issues (e.g., security for DevOps, accessibility for UI Designer)
-- CI must pass
-- Do not configure GitHub branch protection to require approving reviews unless the project has distinct non-author GitHub reviewer credentials; all agents using one GitHub account cannot formally approve their own PRs
-- No force pushes
-- Merge using `gh pr merge <number> --merge`
-- Engineer is the merge owner — reviewers never merge
+The hard gate is **executed verification**, enforced on the Engineer's merge-gate stage and independent of which reviewers are present:
+- **With CI:** CI (lint/test/build) must be **green** before the Engineer merges. This is machine-verified and cannot be skipped.
+- **Without CI:** the Engineer must run the full test suite and build locally and paste the real output into the merge-gate verdict before merging. (When QA is present, QA already produced this evidence; the Engineer confirms it.)
+- A verdict that does not cite executed verification — green CI, or pasted test/build output — is not valid.
+- The Product Owner's `approval` stage must be approved.
+- QA's `review` stage (when present) and the Security Engineer's `review` stage (when added) must be approved.
+- The Code Reviewer and other domain reviewers are advisory — blocking only when they escalate a concern that QA, the Security Engineer, or the Engineer then acts on.
+- No force pushes.
+- Merge using `gh pr merge <number> --merge`.
+- Before merge, verify the PR base matches the configured project/worktree base from `heartbeat-context`. Retarget before review/merge if needed.
+- The Engineer is the merge owner — reviewers never merge.
 - The engineer's merge gate must be the **last** `approval` stage. If the Product Owner's approval were last, it would auto-close the issue to `done` and the merge would be skipped, leaving the PR open on GitHub.
+- If Paperclip created an isolated execution workspace for the issue, read its id from `heartbeat-context`, call close-readiness, and archive it after the PR is merged and the tree is clean. If cleanup is blocked or fails, do not mark the issue `done`; record the exact blocker and leave a concrete cleanup next action. If the issue runs in the shared project workspace, do not invent isolated-worktree cleanup.
+- Do not configure GitHub branch protection to require approving reviews unless the project has distinct non-author GitHub reviewer credentials; all agents using one GitHub account cannot formally approve their own PRs.
 ## Dev Cycle Rules

package/templates/modules/pr-review/module.meta.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "pr-review",
-  "description": "Coordinates pull request reviews through the issue's native executionPolicy (review/approval stages) instead of separate child issues. Reviewers may mirror verdicts as PR comments, but GitHub-native approvals require distinct non-author GitHub credentials.",
+  "description": "Coordinates substantive PR review through the issue's native executionPolicy. The merge gate is executed verification — green CI, or the engineer running the tests/build and pasting the output — not opinion. QA is the substantive reviewer when present; the Security Engineer reviews only security-relevant changes; the Code Reviewer is advisory.",
   "requires": [
     "github-repo"
   ],
@@ -11,6 +11,7 @@
     "ui-designer",
     "ux-researcher",
     "qa",
+    "security-engineer",
     "devops"
   ],
   "capabilities": [],
@@ -20,11 +21,11 @@
       "bootstrapPhase": "foundation",
       "assignTo": "engineer",
       "reviewGate": {
-        "reviewers": ["code-reviewer"],
+        "reviewers": ["qa"],
         "approver": "product-owner",
         "mergeGate": "engineer"
       },
-      "description": "Document and verify the PR workflow: feature branches, PR links, and review via the issue's executionPolicy — a review stage for the Code Reviewer (plus relevant domain reviewers), an approval stage for the Product Owner, and a final approval stage for the Engineer as the merge gate (the engineer is woken last to merge the PR before recording approval, which closes the issue). The merge gate must be the last stage so the Product Owner's approval does not auto-close the issue with the PR still open. Optional branch protection may disable direct pushes or require CI, but do not require GitHub-native approving reviews unless the project has distinct non-author GitHub reviewer credentials."
+      "description": "Document and verify the PR workflow via the issue's executionPolicy. The merge gate is executed verification: with CI, builds must be green before merge; without CI, the engineer runs the test suite/build and pastes the output before merging. Stages: a review stage for QA when present, a review stage for the Security Engineer only on security-relevant changes, an approval stage for the Product Owner, and a final approval stage for the Engineer as the merge gate (woken last to merge the PR before recording approval, which closes the issue). The merge gate must be the last stage so the Product Owner's approval does not auto-close the issue with the PR still open. The Code Reviewer and other domain reviewers add advisory, non-blocking comments. Optional branch protection may disable direct pushes, but do not require GitHub-native approving reviews unless the project has distinct non-author GitHub reviewer credentials."
     }
   ]
 }

package/templates/modules/stall-detection/README.md CHANGED Viewed

@@ -1,27 +1,27 @@
 # Module: stall-detection
-Adds periodic stall detection to the CEO heartbeat.
+Adds scheduled stall detection as an assigned routine run.
 ## What it adds
-- **CEO skill**: Stall check — detects issues stuck in `in_progress` or `in_review` for too long, and nudges the responsible agent or escalates.
+- **CEO skill**: Stall check — detects issues stuck in `in_progress` or `in_review` for too long, then records a structured next action, reassignment, blocker, or escalation.
 ## How it works
-On every heartbeat, the CEO checks:
+When the CEO is assigned a stall-detection routine run:
 1. Are there issues in `in_progress` or `in_review` that haven't been updated recently?
 2. If an issue appears stalled (no activity for a configurable period):
    - Check if the assigned agent is running or idle
-   - If idle: @-mention the agent on the issue to re-trigger
-   - If the agent has been nudged before without response: escalate to the board
-3. This catches failed handovers where an @-mention didn't trigger a wake.
+   - If idle: leave a structured next-action comment, reassign, or link a blocker as appropriate
+   - If the agent has already failed to respond: escalate to the board/CEO with exact evidence
+3. This catches dropped handoffs without relying on generic mentions.
 ## Best for
-- Any team using @-mention-based handover (which can be unreliable)
+- Any team with multi-stage handoffs where ownership can become unclear
 - Multi-agent teams where work can get dropped between agents
 - Ensuring nothing falls through the cracks
 ## Example
-An engineer finishes a PR and @-mentions the Code Reviewer, but the mention doesn't trigger a wake. The CEO detects the stalled issue on the next heartbeat and re-mentions the reviewer or escalates.
+An engineer finishes a PR and moves the issue to `in_review`, but the executionPolicy reviewer never acts. The CEO detects the stalled issue during the scheduled routine run, records the missing next action, and reassigns or escalates.

package/templates/modules/stall-detection/agents/ceo/heartbeat-section.md CHANGED Viewed

@@ -1,12 +1,3 @@
-## Stall Detection Check
+## Stall Detection Routine
-After handling assignments and before exit:
-1. Query active issues: `GET /api/companies/{companyId}/issues?status=in_progress,in_review`
-2. For each issue, check the latest comment/activity timestamp.
-3. If an issue has had no activity for more than 2 heartbeat cycles:
-   - Agent is `idle` → @-mention with a nudge.
-   - Agent is `running` → skip (may be mid-work).
-   - Agent is `error` or `paused` → escalate to the board.
-4. If already nudged and still no progress → escalate to the board.
-5. Record stall findings in daily notes.
+Do not scan for stalled work during a normal heartbeat. When you are assigned a routine-run issue titled like "Stall detection", use `skills/stall-detection.md`, then summarize findings on the routine issue and exit.

package/templates/modules/stall-detection/agents/ceo/skills/stall-detection.md CHANGED Viewed

@@ -1,21 +1,30 @@
 # Skill: Stall Detection
-Add this check to your heartbeat, after handling assignments and before exit.
+You own stall detection when you are explicitly assigned a stall-detection routine run. This is not an every-heartbeat background scan.
+## When To Use This Skill
+Use this only when the current assigned issue/routine is titled like "Stall detection" or explicitly asks you to inspect stalled work. Otherwise follow the normal Paperclip heartbeat rule: only work assigned issues and do not scan the whole board.
 ## Stall Check
-1. Query active issues: `GET /api/companies/{companyId}/issues?status=in_progress,in_review`
-2. For each issue, check the latest comment/activity timestamp.
-3. If an issue has had no activity for more than 2 heartbeat cycles:
-   - Check the assigned agent's status via `GET /api/agents/{agentId}`
-   - If agent is `idle`: @-mention them on the issue with a nudge: "This issue appears stalled. Please check and continue or report blockers."
-   - If agent is `running`: skip — they may be working on it now.
-   - If agent is `error` or `paused`: escalate to the board with a comment.
-4. If you've already nudged an agent on the same issue in a previous heartbeat and there's still no progress: escalate to the board.
-5. Record stall findings in your daily notes.
+1. Checkout the assigned routine-run issue.
+2. Query active issues for the relevant company/project: `todo`, `in_progress`, `in_review`, and blocked work where applicable.
+3. For each candidate, inspect latest comments/activity, execution state, blockers, approval/review state, and assigned agent status.
+4. Skip issues with an active run, recent activity, valid `blockedByIssueIds`, or pending executionPolicy approval/review.
+5. For a likely stall, leave a structured comment on the issue with:
+   - issue id/title
+   - assigned agent
+   - last activity timestamp/context
+   - why it appears stalled
+   - exact next action requested
+6. Prefer reassignment, blocker linkage, or escalation to CEO/Product Owner over informal nudges.
+7. If an agent is `error`, paused, or repeatedly non-responsive, escalate with an issue comment and assign the manager/CEO as appropriate.
+8. Summarize findings on the routine-run issue and mark it done.
 ## Rules
-- Don't nudge agents that are currently running — they may be mid-work.
-- Only escalate after one failed nudge attempt.
-- When escalating, be specific: which issue, which agent, how long stalled, what was the last activity.
+- Do not @-mention as a generic nudge; use assignment, status, blockers, and explicit next-action comments.
+- Do not interrupt running agents.
+- Do not close or cancel another agent's work unless the issue explicitly grants that authority.
+- Be specific: which issue, which agent, last activity, why stalled, and who owns the next action.

package/templates/modules/stall-detection/module.meta.json CHANGED Viewed

@@ -5,7 +5,7 @@
   "routines": [
     {
       "title": "Stall detection",
-      "description": "Check all in_progress issues for signs of stalling. Nudge assigned agents, escalate if blocked for >24h.",
+      "description": "Routine-run checklist: inspect in_progress/in_review issues for stale activity, verify blockers and executionPolicy gates, record a concrete next action/reassignment/escalation on each stalled issue, summarize on the routine issue, then exit.",
       "assignTo": "ceo",
       "schedule": "0 */3 * * *",
       "priority": "medium",

package/templates/modules/triage/skills/issue-triage.md CHANGED Viewed

@@ -1,42 +1,29 @@
 # Skill: Issue Triage
-You are responsible for processing inbound GitHub issues — classifying, responding, and converting them into actionable work.
+You are responsible for processing inbound GitHub issues when assigned a triage issue or triage routine. Classify, respond, and convert actionable reports into Paperclip work.
-## Triage Process
+## When To Use This Skill
-Run this on every heartbeat, after handling your own assignments.
+Use this only when the current assigned issue/routine asks for GitHub issue triage. Do not scan GitHub on every normal heartbeat.
-1. **Fetch new issues** — `gh issue list --state open --label "" --json number,title,body,labels,createdAt` to find untriaged issues (no classification label yet).
-2. **For each untriaged issue:**
-   a. Read the full issue body and any comments.
-   b. **Classify** by type:
-      - `bug` — Something is broken or behaves unexpectedly
-      - `feature` — New functionality request
-      - `enhancement` — Improvement to existing functionality
-      - `question` — User needs help or clarification
-      - `duplicate` — Already reported (link to original)
-      - `invalid` — Not actionable, out of scope, or spam
-   c. **Set priority** (P0–P3):
-      - P0: Production broken, data loss, security vulnerability
-      - P1: Major feature broken, blocking multiple users
-      - P2: Non-critical bug or important feature request
-      - P3: Minor issue, cosmetic, nice-to-have
-   d. **Apply labels** — `gh issue edit <number> --add-label "<type>,<priority>"`
-   e. **Respond to reporter:**
-      - Acknowledge the report
-      - Ask follow-up questions if reproduction steps are unclear (bugs)
-      - Set expectations ("we'll look into this" / "this is out of scope because...")
-      - For duplicates: link to the original issue and close
-      - For invalid: explain why and close politely
-   f. **Convert to Paperclip task** — For actionable issues (bug, feature, enhancement), create a corresponding issue in Paperclip via `POST /api/companies/{companyId}/issues` with the GitHub issue number in the description for traceability. Include the active `projectId` (and `goalId` / `parentId` when applicable).
+## Triage Process
-3. **Record** what you triaged in your daily notes.
+1. Checkout the assigned triage issue/routine.
+2. Fetch new issues: `gh issue list --state open --label "" --json number,title,body,labels,createdAt` to find untriaged issues (no classification label yet).
+3. For each untriaged issue:
+   - Read the full issue body and comments.
+   - Classify by type: `bug`, `feature`, `enhancement`, `question`, `duplicate`, `invalid`.
+   - Set priority P0-P3; P0 maps to urgent Paperclip priority.
+   - Apply labels with `gh issue edit <number> --add-label "<type>,<priority>"`.
+   - Respond to the reporter with acknowledgement, follow-up questions, or a polite close reason.
+   - For actionable issues, create a corresponding Paperclip issue via `POST /api/companies/{companyId}/issues`. Include GitHub issue URL/number, active `projectId`, and `goalId` / `parentId` when applicable.
+4. Link bidirectionally: GitHub comment references the Paperclip issue, Paperclip issue references GitHub.
+5. Summarize triage results on the assigned triage issue/routine and mark it done.
 ## Rules
-- Respond to every issue. No issue should sit without acknowledgment for more than one heartbeat cycle.
-- Be respectful and constructive in responses, even for invalid or duplicate issues.
-- Don't close legitimate issues without explanation. Always comment before closing.
-- Link GitHub issues to Paperclip tasks bidirectionally — include the GitHub URL in the Paperclip issue and the Paperclip task reference in a GitHub comment.
-- If you can't classify an issue (ambiguous report), ask the reporter for clarification and label as `needs-info`.
-- Escalate P0 issues immediately — create the Paperclip task with priority `urgent` and mention it in the CEO's daily notes.
+- Do not triage from unrelated heartbeats.
+- Respond respectfully and constructively, even for invalid or duplicate reports.
+- Do not close legitimate issues without explanation.
+- If you cannot classify an issue, ask for clarification and label `needs-info`.
+- Escalate P0 issues immediately via Paperclip issue assignment/status and a CEO/Product Owner comment.