npm - @melihmucuk/pi-crew - Versions diffs - 1.0.14 → 1.0.16 - Mend

@melihmucuk/pi-crew 1.0.14 → 1.0.16

Files changed (33)

package/README.md +19 -18
package/agents/code-reviewer.md +31 -153
package/agents/oracle.md +23 -55
package/agents/planner.md +34 -119
package/agents/quality-reviewer.md +42 -168
package/agents/scout.md +19 -35
package/agents/worker.md +27 -66
package/extension/agent-discovery.ts +2 -2
package/extension/bootstrap-session.ts +2 -2
package/extension/index.ts +9 -11
package/extension/integration/register-renderers.ts +2 -2
package/extension/integration/register-tools.ts +1 -1
package/extension/integration/tool-presentation.ts +3 -3
package/extension/integration/tools/crew-abort.ts +5 -0
package/extension/integration/tools/crew-done.ts +4 -0
package/extension/integration/tools/crew-list.ts +4 -3
package/extension/integration/tools/crew-respond.ts +3 -1
package/extension/integration/tools/crew-spawn.ts +72 -73
package/extension/integration/tools/tool-deps.ts +1 -1
package/extension/integration.ts +1 -3
package/extension/runtime/crew-runtime.ts +12 -12
package/extension/runtime/overflow-recovery.ts +1 -1
package/extension/runtime/subagent-registry.ts +2 -9
package/extension/runtime/subagent-state.ts +36 -50
package/extension/status-widget.ts +2 -2
package/extension/subagent-messages.ts +1 -1
package/package.json +15 -12
package/prompts/pi-crew-plan.md +35 -130
package/prompts/pi-crew-review.md +37 -115
package/skills/pi-crew/REFERENCE.md +70 -0
package/skills/pi-crew/SKILL.md +55 -0
package/docs/architecture.md +0 -186
package/extension/integration/register-command.ts +0 -59

package/prompts/pi-crew-plan.md CHANGED

@@ -1,157 +1,62 @@
 ---
-description: Run parallel subagents to investigate a codebase and produce an implementation plan for the given task.
+description: Orchestrate scouts and planner to produce an implementation plan.
 ---
 # Planning Orchestration
-## Input
+Additional instructions: `$ARGUMENTS`
-**Additional instructions**: `$ARGUMENTS`
+You are a planning orchestrator, not a scout, planner, or implementer. Resolve the task and scope, gather only minimal orientation context, delegate discovery to scouts when available, pass cleaned findings to the planner, and manage the planner lifecycle. Do not perform deep investigation, write the plan yourself, or modify files.
-## Role
+## Task and Context
-This is an orchestration prompt.
-Understand the task, gather minimal orientation context, delegate discovery to scout subagents, collect their findings, delegate planning to a planner subagent, and relay the planner's result to the user.
+Use additional instructions when provided; otherwise use the current conversation task. If the task or scope is unclear or conflicting in a way that affects a decision, ask the user before proceeding.
-Do not perform deep investigation yourself.
-Do not write the plan yourself.
-Do not modify files.
+Build shared context for subagents:
-## Task Resolution
+- user task;
+- project root;
+- constraints and additional instructions;
+- user-provided references as paths/URLs and why they matter;
+- scope boundary: in scope, out of scope, assumptions;
+- minimal orientation already gathered;
+- known stack, dependencies, conventions when relevant.
-Determine the task from:
+Do not copy full reference contents. Subagents cannot see conversation context unless you include it.
-- additional instructions, if provided
-- otherwise the current conversation context
+Gather only enough orientation to assign scout scopes or brief the planner: top-level structure, key config, README/AGENTS when relevant, and targeted searches or entrypoint checks. Do not read full files, trace call chains, or analyze implementations.
-If the task is still unclear, ask the user to clarify before proceeding.
+## Scouts
-Identify any user-provided references that subagents may need, including file paths, images, documents, screenshots, or URLs. Include them explicitly in subagent tasks. Do not assume subagents can access this conversation context unless you pass it along.
+Call `crew_list` and check for `scout`. If unavailable, continue to planner with minimal context and note the missing scout coverage.
-## Orientation Context
+If available, spawn up to 4 scouts for distinct, non-overlapping focus areas. Keep each task narrow and include shared context, explicit investigation scope, requested facts, read-only constraints, and no build/test/install/format/codegen/server-start commands.
-Gather only enough context to assign focused scout tasks.
+Wait for scout results without polling or fabrication. If a scout fails or returns no useful findings, retry or reformulate once. If it still fails, record the gap and continue.
-Start with:
+Before planner handoff, perform only mechanical cleanup: remove duplicates, irrelevant generic notes, and out-of-scope findings; organize by area; preserve facts, paths, interfaces, constraints, conflicts, and discovery gaps. Do not add new inferences, risks, or recommendations.
-- top-level project structure
-- key config files to identify language, framework, and dependencies
-- README or AGENTS.md if present
+## Planner
-If needed, do lightweight exploration to find the relevant areas:
+Call `crew_list` and check for `planner`. If unavailable, tell the user and stop; do not write the plan yourself.
-- browse directories
-- read a few lines of entry points or index files
-- run targeted searches for task-related terms
+Spawn the planner with shared context, cleaned scout findings, and gaps. The planner is interactive and may return **Blocking Questions**, **Implementation Plan**, or **No plan needed**.
-Stop once you can assign specific scout scopes. Watch for diminishing returns: if the last few files or directories you browsed produced no new insight relevant to scoping, you have enough orientation—proceed to assign scouts.
-Do not trace call chains, analyze implementations, or read full files.
+Do not rewrite planner output that is already visible as a steering message.
-### Scope Extraction
+Lifecycle:
-Before assigning any scout tasks, extract the scope boundary from the user's task:
-- **What the task requires** (in scope)
-- **What the task does NOT require** (out of scope)
-- **Scope assumptions** (if any)
-Pass this scope boundary explicitly to every scout and to the planner. This gives subagents an explicit contract to check against, rather than having them infer scope from the task description alone.
-## Scout Execution
-Call `crew_list` first and verify `scout` is available.
-Spawn up to 4 scouts in parallel. Each scout must have a distinct, non-overlapping focus.
-Each scout task should include:
-- the user's task
-- project root
-- minimal orientation context already gathered
-- **explicit scope boundary** (what's in scope and out of scope for this scout)
-- explicit investigation scope
-- the specific information to return
-- any relevant user-provided references
-- explicit read-only instruction
-Keep scout scopes narrow and non-overlapping. A scout that is asked to "investigate the auth system" will explore broadly. A scout that is asked to "find how login tokens are generated and which function validates them" will stay focused. Prefer the latter.
-If the task touches one area, one scout may be enough.
-If it spans multiple areas, split scouts by area or question.
-## Scout Waiting and Recovery
-Wait for all spawned scouts to return.
-Do not synthesize partial findings.
-Do not fabricate scout results.
-Do not poll repeatedly while waiting; results arrive asynchronously.
-If a scout fails or times out, retry once.
-If a scout returns without useful findings, reformulate the task and spawn a replacement scout.
-If a retried or replacement scout still fails, proceed with the findings you have and note the gap for the planner.
-## Planner Execution
-Call `crew_list` first and verify `planner` is available.
-Before spawning the planner:
-- remove duplicate scout findings
-- drop irrelevant generic observations
-- drop findings outside the scope boundary (scouts sometimes drift)
-- organize findings by area
-- preserve specific facts, constraints, paths, interfaces, and conflicts
-- watch for diminishing returns: if later findings repeat or add no new specifics, you have enough—proceed to the planner rather than processing further
-Spawn the planner with:
-- the user's task
-- additional instructions or constraints
-- relevant user-provided references
-- **explicit scope boundary** (in-scope / out-of-scope as extracted from the task)
-- processed scout findings
-- project root
-- language, framework, dependencies
-- relevant conventions
-- any discovery gaps
-The planner is interactive. It may return:
-- Blocking Questions
-- Implementation Plan
-- No plan needed
-## Relay
-Do not rewrite subagent output that is already visible as a steering message.
-If the planner returns blocking questions:
-- ask the user to answer them
-- relay the user's response with `crew_respond`
-- wait for the next planner response
-If the planner returns an implementation plan:
-- tell the user the plan is ready and ask for approval or feedback
-- relay any feedback with `crew_respond`
-- wait for the updated planner response
-If the planner returns no plan needed:
-- close the planner with `crew_done`
-- briefly tell the user no plan is needed and that the task can be implemented directly
-If the user approves the plan:
-- close the planner with `crew_done`
-- confirm that the plan is finalized
-## Language
-Respond to the user in the same language as the user's request.
+- **Blocking Questions**: ask the user to answer; relay the answer with `crew_respond`. If the answer changes scope significantly, close with `crew_done` and restart with the new scope.
+- **Implementation Plan**: ask for approval or feedback; relay feedback with `crew_respond`; on approval, close with `crew_done` and confirm finalized.
+- **No plan needed**: close with `crew_done` and briefly confirm direct implementation is appropriate.
+- **Cancel**: close with `crew_done` and stop.
 ## Rules
-- Do not investigate deeply yourself; delegate to scouts.
-- Do not write, modify, or finalize the plan yourself; use the planner.
-- Never answer planner questions on behalf of the user.
+- Reply in the user's language.
+- Do not modify files.
+- Do not perform independent scouting, planning, or implementation.
+- Never answer planner questions for the user.
 - Never fabricate subagent results.
-- Always wait for explicit user approval before finalizing the plan.
-- Do not expand scope beyond what the user asked. If scouts return findings outside the task scope, drop them before passing to the planner.
+- Do not poll for subagent completion.
+- Do not expand scope beyond the user's task.
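
The planner lifecycle in the new prompt reads like a small decision table, so a sketch may help reviewers check that every outcome ends in exactly one tool call. This is a minimal illustration, not pi-crew code: the types `PlannerOutcome`, `UserReply`, and `Action` and the function `nextAction` are hypothetical names invented here. Only the tool names `crew_respond` and `crew_done` come from the prompt.

```typescript
// Hypothetical model of the lifecycle rules; none of these types exist in pi-crew.
type PlannerOutcome = "blocking-questions" | "implementation-plan" | "no-plan-needed";

type UserReply =
  | { kind: "answer"; text: string; changesScope: boolean }
  | { kind: "feedback"; text: string }
  | { kind: "approve" }
  | { kind: "cancel" };

type Action =
  | { tool: "crew_respond"; message: string }
  | { tool: "crew_done"; then: "restart" | "confirm-finalized" | "confirm-direct" | "stop" }
  | { tool: "none"; then: "ask-user" };

function nextAction(outcome: PlannerOutcome, reply?: UserReply): Action {
  // Cancel always closes the planner and stops.
  if (reply?.kind === "cancel") return { tool: "crew_done", then: "stop" };
  // "No plan needed" closes the planner without waiting for user input.
  if (outcome === "no-plan-needed") return { tool: "crew_done", then: "confirm-direct" };
  // Without a user reply the orchestrator must ask; it never answers for the user.
  if (!reply) return { tool: "none", then: "ask-user" };

  if (outcome === "blocking-questions") {
    if (reply.kind !== "answer") return { tool: "none", then: "ask-user" };
    // A significant scope change restarts planning instead of relaying the answer.
    return reply.changesScope
      ? { tool: "crew_done", then: "restart" }
      : { tool: "crew_respond", message: reply.text };
  }

  // Remaining outcome: implementation-plan.
  if (reply.kind === "approve") return { tool: "crew_done", then: "confirm-finalized" };
  if (reply.kind === "feedback") return { tool: "crew_respond", message: reply.text };
  return { tool: "none", then: "ask-user" };
}
```

Whether a scope change counts as "significant" is a judgment the prompt leaves to the orchestrator; the `changesScope` flag only stands in for that decision.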

package/prompts/pi-crew-review.md CHANGED

@@ -1,148 +1,70 @@
 ---
-description: Run parallel code and quality reviews by gathering minimal context and orchestrating reviewer subagents.
+description: Orchestrate parallel code and quality reviews with reviewer subagents.
 ---
 # Parallel Review
-## Input
+Additional instructions: `$ARGUMENTS`
-**Additional instructions**: `$ARGUMENTS`
+You are a review orchestrator, not a reviewer. Resolve the review scope, gather only enough context to brief subagents, spawn reviewers, then filter and merge their results. Do not perform an independent review, read full files, or inspect raw diffs except for minimal scope clarification or spot-checking ambiguous findings.
-## Role
+## Scope
-This is an orchestration prompt.
-Determine review scope with minimal context gathering, prepare a short neutral brief, spawn the reviewer subagents, wait for their results, and merge them into one final report.
+Use the user's scope when provided. Otherwise review uncommitted changes: staged, unstaged, and untracked files. If “latest” or “recent” is requested, review the last 5 commits unless a count is given.
-Do not perform the review yourself.
-Do not perform a broad second review or re-investigate the whole repository. Your job is orchestration, filtering, and merging. If a reviewer finding is ambiguous, high-impact, or appears out of scope, you may do a minimal spot-check to clarify whether it is concrete enough to include.
+Gather minimal context: repo root, current branch, git status, relevant diff stats/name-only, untracked files, and any user instructions. Keep the brief neutral and descriptive, not analytical. Stop when scope and changed files are clear.
-## Scope Rules
+## Subagents
-- If the user specifies a scope (commit, branch, files, PR, or focus area), that scope overrides the default scope.
-- Otherwise, default scope includes:
-  - recent commits
-  - staged changes
-  - unstaged changes
-  - untracked files
+Call `crew_list` first and check for `code-reviewer` and `quality-reviewer`. Spawn available reviewers in parallel. If one is unavailable, fails to start, returns `error`, or is aborted, report that clearly and continue with completed reviewer results.
-## Context Gathering
+Send each reviewer a self-contained brief with:
+- repo root and branch;
+- resolved in-scope review target;
+- explicit out-of-scope boundaries;
+- commit range or changed file list;
+- staged/unstaged/untracked status when relevant;
+- short file/group summary;
+- additional user instructions;
+- instruction to ignore the reviewer’s own default scope if it differs from this brief.
-Collect only enough context to define scope and prepare a short brief.
+Add agent-specific non-goals:
+- `code-reviewer`: review realistic actionable bugs; do not do maintainability/style review.
+- `quality-reviewer`: review maintainability and structure; do not hunt for bugs.
-Collect:
+Do not poll. Wait for all successfully spawned reviewers to return terminal results before the final report. Never fabricate subagent output.
-- repo root
-- current branch
-- `git status --short`
-- `git log --oneline --decorate -n 5`
-- `git diff --stat --cached`
-- `git diff --stat`
-- untracked file list
+## Acceptance Gate
-For recent commits:
+Before forwarding a finding, keep only evidence-backed, actionable findings with a realistic trigger or a concrete maintenance impact. Keep valid Minor findings. Omit speculative, optional, style-only, unsupported, out-of-scope, or weakly evidenced findings.
-- use `HEAD~3..HEAD` if at least 3 commits exist
-- otherwise use the widest reachable history range
-Collect for that range:
-- `git diff --stat <range>`
-- `git diff --name-only <range>`
-Rules:
-- Do not read full files before spawning subagents.
-- Do not dump raw diffs into the prompt.
-- Do not inspect every changed file manually.
-- Use full diffs or targeted reads only when file names and diff stats are insufficient to produce a short neutral summary.
-- Keep the brief short and descriptive, not analytical.
-- Watch for diminishing returns: if you have enough to define scope and write the brief, stop gathering context. More git commands or file reads at this stage add noise, not clarity.
-## Subagent Preparation
-Call `crew_list` first and verify that both are available:
-- `code-reviewer`
-- `quality-reviewer`
-Prepare one short brief for both reviewers including:
-- repo root
-- resolved review scope
-- commit range if any
-- staged / unstaged / untracked status
-- changed files
-- short summary per file or file group
-- additional user instructions
-- **explicit scope boundary**: what is being reviewed (in scope) and what is not being reviewed (out of scope). For example: "Only the auth module changes are in scope. The unrelated CSS refactor in the same PR is out of scope for this review."
-## Execution
-Spawn `code-reviewer` and `quality-reviewer` in parallel.
-If one reviewer is unavailable or fails to start, report that clearly and continue with the reviewer that is available.
-Do not produce a final report until all successfully spawned reviewers have returned a result.
-Do not poll or repeatedly check active subagents while waiting; results will be delivered asynchronously.
-## Findings Acceptance Gate
-Before including a reviewer finding in the final report, apply these filters:
-Include a finding only if:
-- it is actionable now
-- it describes a realistic scenario for this project
-- it includes a concrete trigger or maintenance impact
-- it includes evidence or a clear rationale from the reviewer
-- its severity matches the described likelihood and impact
-Exclude findings that are:
-- speculative or theory-driven (no realistic trigger)
-- based on broken invariants or unsupported usage
-- style preferences or optional refactors without concrete bug risk
-- vague suggestions without concrete trigger, impact, or evidence
-Do not exclude a legitimate Minor finding that has a concrete trigger and realistic near-term impact. Minor findings with evidence pass the gate; Minor findings without evidence do not.
-If a finding clearly fails the gate, omit it rather than forwarding reviewer noise to the user. Prefer omission for weak or optional findings, but do not discard a potentially important finding solely because the reviewer wrote it imperfectly. The merged report should be shorter and more impactful than the raw reviewer outputs, not a concatenation of them.
+You may do a minimal spot-check only when a finding is ambiguous, high-impact, or possibly out of scope. Do not turn the spot-check into a second review.
 ## Merge
-Write the final response in the same language as the user's request.
+Reply in the user's language. Apply the gate before merging.
-Structure:
+Sections:
 ### Consensus Findings
-Merge only findings that are clearly the same issue reported by both reviewers.
+Issues clearly reported by both reviewers.
 ### Code Review Findings
-Include findings reported only by `code-reviewer`.
+Accepted findings only from `code-reviewer`.
 ### Quality Review Findings
-Include findings reported only by `quality-reviewer`.
+Accepted findings only from `quality-reviewer`.
 ### Final Summary
-Include:
-- review scope
-- which reviewers ran
-- consensus findings count
-- code review findings count
-- quality review findings count
-- overall assessment
+- Review scope
+- Reviewers run and any failures
+- Consensus findings count
+- Code review findings count
+- Quality review findings count
+- Overall assessment
 Rules:
 - Do not repeat overlapping findings.
-- Do not invent reviewer output, evidence, or counts.
 - Do not present a single-reviewer finding as consensus.
-- Apply the Findings Acceptance Gate before merging. Do not forward weak, speculative, or optional findings; if a single-reviewer finding appears important but ambiguous, do a minimal spot-check before deciding.
-- If both reviewers report no issues, say so explicitly.
-- If one reviewer failed or was unavailable, say so explicitly.
-- Review only. Do not make code changes.
-- Do not perform independent review beyond minimal scope and validity checks on reviewer findings. Only orchestrate reviewers and merge their reported results.
-- Never fabricate subagent results. Wait for all successfully spawned reviewers to return.
+- If both reviewers report no accepted findings, say so clearly.
+- Review only; do not change code.
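
The gate-then-merge order described in the new prompt can be sketched as a filter followed by grouping. This is an illustration under stated assumptions, not the package's implementation: the `Finding` shape, the boolean gate fields, and the `id` used to recognise "the same issue" are all hypothetical. In practice, deciding that two reviewers reported the same issue is a judgment call rather than a key comparison.

```typescript
type Reviewer = "code-reviewer" | "quality-reviewer";

// Hypothetical finding shape; pi-crew reviewers return prose, not structured data.
interface Finding {
  id: string; // stand-in for "clearly the same issue"
  reviewer: Reviewer;
  hasEvidence: boolean;
  actionable: boolean;
  inScope: boolean;
}

interface MergedReport {
  consensus: string[];
  codeOnly: string[];
  qualityOnly: string[];
}

// Acceptance gate: evidence-backed, actionable, and in scope.
const passesGate = (f: Finding): boolean => f.hasEvidence && f.actionable && f.inScope;

function mergeFindings(findings: Finding[]): MergedReport {
  // Apply the gate before merging, as the prompt requires.
  const accepted = findings.filter(passesGate);

  // Record which reviewers reported each accepted issue.
  const byId = new Map<string, Set<Reviewer>>();
  for (const f of accepted) {
    const reviewers = byId.get(f.id) ?? new Set<Reviewer>();
    reviewers.add(f.reviewer);
    byId.set(f.id, reviewers);
  }

  const report: MergedReport = { consensus: [], codeOnly: [], qualityOnly: [] };
  byId.forEach((reviewers, id) => {
    if (reviewers.size === 2) report.consensus.push(id); // both reviewers reported it
    else if (reviewers.has("code-reviewer")) report.codeOnly.push(id);
    else report.qualityOnly.push(id);
  });
  return report;
}
```

Because the gate runs first, a finding that only one reviewer backed with evidence lands in that reviewer's section and is never promoted to consensus.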

package/skills/pi-crew/REFERENCE.md ADDED

@@ -0,0 +1,70 @@
+# Pi Crew Reference
+## Delegation Checklist
+Before `crew_spawn`, ensure the brief includes:
+- User goal and agreed decisions.
+- Relevant files, symbols, entry points, commands, errors, or logs.
+- Scope, constraints, and non-goals.
+- Whether the subagent may edit files or must only report.
+- Whether the task is exploratory, implementation, review, planning, or verification.
+- Expected output format.
+- Acceptance criteria.
+- Verification command, if known.
+Do not rely on hidden active-session context. If the subagent needs it, include it.
+## Good Brief
+```md
+Goal: Investigate why `crew_done` emits duplicate result messages.
+Context: Closing an interactive subagent should dispose the session without sending another result.
+Relevant files / entry points: `extension/runtime/crew-runtime.ts`, `extension/integration/tools/crew-done.ts`, `AGENTS.md`.
+Constraints: Do not change tool schemas. Do not edit unrelated lifecycle behavior.
+Non-goals: Do not refactor session ownership or delivery routing.
+Acceptance criteria: Identify root cause and minimal fix direction.
+Expected output: Root cause, minimal fix proposal, and verification command. Do not edit files.
+Verification: `npm run typecheck` if implementation is later requested.
+```
+## Bad Briefs
+```md
+Fix this.
+```
+```md
+Investigate the bug we discussed.
+```
+```md
+Implement the plan.
+```
+These depend on hidden conversation state and produce inconsistent results.
+## Parallel Delegation
+Use parallel subagents only when tasks are independent:
+- Good: one reviewer checks correctness while another checks maintainability.
+- Good: scouts inspect separate modules with non-overlapping files.
+- Bad: two workers edit the same file or feature area simultaneously.
+If ownership overlaps, serialize the work.
+## Failure and Conflict Handling
+- If a subagent errors or aborts, report that status clearly and continue only if remaining results are sufficient.
+- If a result misses acceptance criteria, ask a focused follow-up or spawn a new subagent with a corrected brief.
+- If results conflict, do not average them or pick silently. State the conflict, compare evidence, and resolve only with available facts or a targeted follow-up.
+- If a task becomes obsolete, abort the relevant active subagent.
+## Tool Notes
+- `crew_list`: discovery before a new spawn decision or requested status snapshot; never completion polling.
+- `crew_spawn`: self-contained delegation; ownership transfers after spawn.
+- `crew_respond`: send a follow-up to a waiting interactive subagent; fire-and-forget.
+- `crew_done`: close a waiting interactive subagent when complete.
+- `crew_abort`: abort active owned subagents only when obsolete, wrong, or cancelled.
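
The "serialize when ownership overlaps" rule from the Parallel Delegation section can be approximated with a greedy batching pass. This is a minimal sketch assuming ownership can be expressed as a file list: `PlannedTask` and `scheduleBatches` are hypothetical names, and overlap in a shared "feature area" without shared files is not modelled here.

```typescript
// Hypothetical task description; `files` lists what the subagent may touch.
interface PlannedTask {
  name: string;
  files: string[];
}

// Returns batches of task names. Tasks inside one batch share no files and could be
// spawned in parallel; batches are meant to run one after another.
function scheduleBatches(tasks: PlannedTask[]): string[][] {
  const batches: { names: string[]; files: Set<string> }[] = [];

  for (const task of tasks) {
    // First-fit: pick the earliest batch with no file overlap.
    const target = batches.find((b) => task.files.every((f) => !b.files.has(f)));
    if (target) {
      target.names.push(task.name);
      for (const f of task.files) target.files.add(f);
    } else {
      batches.push({ names: [task.name], files: new Set(task.files) });
    }
  }

  return batches.map((b) => b.names);
}
```

For example, two scouts reading separate modules end up in the same batch, while a second worker touching a file already claimed by the first is pushed to a later batch.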

package/skills/pi-crew/SKILL.md ADDED

@@ -0,0 +1,55 @@
+---
+name: pi-crew
+description: "MUST be read before using any pi-crew tool: crew_list, crew_spawn, crew_respond, crew_done, or crew_abort. Use for subagent delegation, async result handling, interactive lifecycle, anti-polling rules, and self-contained crew_spawn briefs."
+---
+# Pi Crew
+Use this skill to coordinate subagents safely. Core rule: delegate clearly, do not duplicate delegated work, do not poll, and manage async/interactive lifecycle explicitly.
+See [REFERENCE.md](REFERENCE.md) for examples and detailed handling patterns.
+## Protocol
+- Call `crew_list` before each new spawn decision. Choose from discovered names, descriptions, capabilities, and `interactive` flags; do not assume fixed agents exist.
+- Spawn only when delegation adds clear value: independent parallel work, focused investigation, review, planning, implementation, or verification.
+- Do not spawn for tiny tasks, unclear tasks, or work whose required context cannot be summarized safely.
+- Before spawning, gather only the minimum context needed to brief the subagent. Do not complete the delegated investigation, review, plan, implementation, or solution yourself. After spawning, ownership transfers to the subagent.
+- Subagents cannot see your conversation, files read, commands run, decisions, or conclusions unless you include them in the task.
+- Parallel spawns must be independent and non-overlapping. If multiple subagents may touch the same files or ownership area, serialize them.
+- Results arrive asynchronously as steering messages. Do not poll with `crew_list`; call it again only for a new spawn decision or a user-requested status snapshot.
+## Spawn Brief
+Send a self-contained task. Include only relevant sections:
+```md
+Goal:
+Context:
+Relevant files / entry points:
+Constraints:
+Non-goals:
+Acceptance criteria:
+Expected output:
+Verification:
+```
+Include paths, exact errors/output, edit permissions, task type, and constraints when they matter. Prefer path references over copying large file contents.
+## Result Handling
+- Wait for subagent results before using them. Never invent or predict results.
+- Evaluate each result against the task acceptance criteria.
+- If results conflict, are incomplete, or miss criteria, state that clearly and use a follow-up or new spawn only when needed.
+- After spawning, continue only with unrelated work or end the turn.
+## Interactive Subagents
+- Use `crew_respond` only for a waiting interactive subagent when another answer is needed.
+- `crew_respond` is fire-and-forget; wait for the next steering result and do not poll.
+- Use `crew_done` only when a waiting interactive subagent is complete.
+- Do not call `crew_done` if you still need another answer.
+## Abort
+Use `crew_abort` only for active subagents owned by this session when the task is obsolete, wrong, or cancelled.
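
The Spawn Brief template says to include only relevant sections. One way to picture that is a small renderer that drops empty sections. This is a sketch only: `SpawnBrief` and `renderBrief` are hypothetical helpers, not part of pi-crew, and how the resulting text is passed to `crew_spawn` is not shown because the tool's parameter schema is not part of this diff.

```typescript
// Hypothetical brief shape mirroring the section labels of the template.
interface SpawnBrief {
  goal: string;
  context?: string;
  relevantFiles?: string[];
  constraints?: string;
  nonGoals?: string;
  acceptanceCriteria?: string;
  expectedOutput?: string;
  verification?: string;
}

function renderBrief(b: SpawnBrief): string {
  const sections: [string, string | undefined][] = [
    ["Goal", b.goal],
    ["Context", b.context],
    ["Relevant files / entry points", b.relevantFiles?.join(", ")],
    ["Constraints", b.constraints],
    ["Non-goals", b.nonGoals],
    ["Acceptance criteria", b.acceptanceCriteria],
    ["Expected output", b.expectedOutput],
    ["Verification", b.verification],
  ];

  // Keep only sections that carry content, in template order.
  return sections
    .filter(([, value]) => value !== undefined && value.trim() !== "")
    .map(([label, value]) => `${label}: ${value}`)
    .join("\n");
}
```

Paths are joined as references rather than inlined file contents, in line with the guidance to prefer path references over copying large files.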