npm - @oh-my-pi/pi-coding-agent - Versions diffs - 12.19.2 → 13.0.0 - Mend

@oh-my-pi/pi-coding-agent 12.19.2 → 13.0.0

Files changed (106)

package/CHANGELOG.md +53 -0
package/package.json +7 -7
package/src/commit/prompts/analysis-system.md +3 -3
package/src/commit/prompts/analysis-user.md +14 -14
package/src/commit/prompts/changelog-system.md +4 -4
package/src/commit/prompts/changelog-user.md +4 -4
package/src/commit/prompts/file-observer-system.md +2 -2
package/src/commit/prompts/file-observer-user.md +2 -2
package/src/commit/prompts/reduce-system.md +4 -4
package/src/commit/prompts/reduce-user.md +6 -6
package/src/commit/prompts/summary-system.md +4 -4
package/src/commit/prompts/summary-user.md +6 -6
package/src/config/settings-schema.ts +0 -11
package/src/discovery/helpers.ts +13 -1
package/src/internal-urls/docs-index.generated.ts +2 -2
package/src/internal-urls/index.ts +8 -3
package/src/internal-urls/local-protocol.ts +223 -0
package/src/internal-urls/{docs-protocol.ts → pi-protocol.ts} +12 -12
package/src/internal-urls/router.ts +1 -1
package/src/internal-urls/types.ts +1 -1
package/src/ipy/executor.ts +4 -32
package/src/main.ts +0 -1
package/src/memories/index.ts +1 -1
package/src/modes/components/settings-defs.ts +0 -5
package/src/modes/controllers/event-controller.ts +4 -4
package/src/modes/interactive-mode.ts +84 -64
package/src/modes/types.ts +11 -3
package/src/modes/utils/ui-helpers.ts +5 -3
package/src/patch/hashline.ts +42 -42
package/src/patch/index.ts +24 -21
package/src/patch/shared.ts +21 -43
package/src/plan-mode/approved-plan.ts +55 -0
package/src/prompts/agents/designer.md +6 -6
package/src/prompts/agents/explore.md +4 -4
package/src/prompts/agents/frontmatter.md +1 -0
package/src/prompts/agents/init.md +10 -10
package/src/prompts/agents/plan.md +6 -6
package/src/prompts/agents/reviewer.md +4 -3
package/src/prompts/agents/task.md +10 -10
package/src/prompts/compaction/branch-summary.md +3 -3
package/src/prompts/compaction/compaction-short-summary.md +7 -7
package/src/prompts/compaction/compaction-summary-context.md +1 -1
package/src/prompts/compaction/compaction-summary.md +5 -5
package/src/prompts/compaction/compaction-turn-prefix.md +3 -3
package/src/prompts/compaction/compaction-update-summary.md +11 -11
package/src/prompts/memories/consolidation.md +5 -5
package/src/prompts/memories/read-path.md +11 -0
package/src/prompts/memories/stage_one_input.md +1 -1
package/src/prompts/memories/stage_one_system.md +5 -5
package/src/prompts/review-request.md +4 -4
package/src/prompts/system/agent-creation-architect.md +17 -17
package/src/prompts/system/agent-creation-user.md +2 -2
package/src/prompts/system/custom-system-prompt.md +6 -6
package/src/prompts/system/plan-mode-active.md +20 -20
package/src/prompts/system/plan-mode-approved.md +9 -7
package/src/prompts/system/plan-mode-reference.md +2 -2
package/src/prompts/system/plan-mode-subagent.md +8 -8
package/src/prompts/system/subagent-submit-reminder.md +5 -5
package/src/prompts/system/subagent-system-prompt.md +9 -9
package/src/prompts/system/subagent-user-prompt.md +3 -5
package/src/prompts/system/summarization-system.md +1 -1
package/src/prompts/system/system-prompt.md +109 -84
package/src/prompts/system/title-system.md +2 -2
package/src/prompts/system/ttsr-interrupt.md +2 -2
package/src/prompts/system/web-search.md +16 -16
package/src/prompts/tools/ask.md +6 -6
package/src/prompts/tools/bash.md +9 -9
package/src/prompts/tools/browser.md +5 -5
package/src/prompts/tools/cancel-job.md +2 -2
package/src/prompts/tools/exit-plan-mode.md +13 -10
package/src/prompts/tools/find.md +2 -2
package/src/prompts/tools/gemini-image.md +7 -7
package/src/prompts/tools/grep.md +4 -3
package/src/prompts/tools/hashline.md +37 -39
package/src/prompts/tools/patch.md +5 -5
package/src/prompts/tools/poll-jobs.md +1 -1
package/src/prompts/tools/python.md +8 -10
package/src/prompts/tools/read.md +2 -12
package/src/prompts/tools/replace.md +6 -6
package/src/prompts/tools/ssh.md +2 -7
package/src/prompts/tools/task.md +34 -23
package/src/prompts/tools/todo-write.md +65 -49
package/src/prompts/tools/web-search.md +2 -2
package/src/prompts/tools/write.md +4 -3
package/src/sdk.ts +11 -9
package/src/session/agent-session.ts +92 -51
package/src/session/artifacts.ts +1 -1
package/src/session/messages.ts +1 -0
package/src/task/agents.ts +1 -0
package/src/task/index.ts +2 -1
package/src/task/render.ts +2 -2
package/src/task/types.ts +1 -0
package/src/tools/bash-interactive.ts +1 -1
package/src/tools/bash-skill-urls.ts +3 -2
package/src/tools/bash.ts +38 -19
package/src/tools/exit-plan-mode.ts +30 -2
package/src/tools/grep.ts +131 -75
package/src/tools/index.ts +13 -3
package/src/tools/path-utils.ts +2 -1
package/src/tools/plan-mode-guard.ts +8 -8
package/src/tools/python.ts +0 -2
package/src/tools/read.ts +2 -2
package/src/tools/todo-write.ts +276 -146
package/src/internal-urls/plan-protocol.ts +0 -95
package/src/modes/components/todo-display.ts +0 -114
package/src/prompts/memories/read_path.md +0 -11

package/src/prompts/tools/hashline.md CHANGED

@@ -3,19 +3,17 @@
 Apply precise file edits using `LINE#ID` tags, anchoring to the file content.
 <workflow>
-1. `read` the target range to capture current `LINE#ID` tags.
-2. Pick the smallest operation per change site (line/range/insert/content-replace).
-3. Direction-lock every edit: exact current text → intended text.
-4. Submit one `edit` call per file containing all operations.
-5. If another edit is needed in that file, re-read first (hashes changed).
-6. Output tool calls only; no prose.
+1. You MUST `read` the target range to capture current `LINE#ID` tags.
+2. You MUST pick the smallest operation per change site (line/range/insert/content-replace).
+3. You MUST direction-lock every edit: exact current text → intended text.
+4. You MUST submit one `edit` call per file containing all operations.
+5. If another edit is needed in that file, you MUST re-read first (hashes changed).
+6. You MUST output tool calls only; no prose.
 </workflow>
 <operations>
-- **Single line replace/delete**
-  - `{ op: "set", tag: "N#ID", content: […] }`
-  - `content: null` deletes the line; `content: [""]` keeps a blank line.
-- **Range replace/delete**
+- **Line or range replace/delete**
+  - `{ op: "replace", tag: "N#ID", content: […] }`
   - `{ op: "replace", first: "N#ID", last: "N#ID", content: […] }`
   - Use for swaps, block rewrites, or deleting a full span (`content: null`).
 - **Insert** (new content)
@@ -29,33 +27,33 @@ Apply precise file edits using `LINE#ID` tags, anchoring to the file content.
 </operations>
 <rules>
-1. **Minimize scope:** one logical mutation site per operation.
-2. **Preserve formatting:** keep indentation, punctuation, line breaks, trailing commas, brace style.
-3. **Prefer insertion over neighbor rewrites:** anchor on structural boundaries (`}`, `]`, `},`) not interior property lines.
-4. **No no-ops:** replacement content must differ from current content.
-5. **Touch only requested code:** avoid incidental edits.
-6. **Use exact current tokens:** never rewrite approximately; mutate the token that exists now.
-7. **For swaps/moves:** prefer one range operation over multiple single-line operations.
+1. **Minimize scope:** You MUST use one logical mutation site per operation.
+2. **Preserve formatting:** You MUST keep indentation, punctuation, line breaks, trailing commas, brace style.
+3. **Prefer insertion over neighbor rewrites:** You SHOULD anchor on structural boundaries (`}`, `]`, `},`) not interior property lines.
+4. **No no-ops:** replacement content MUST differ from current content.
+5. **Touch only requested code:** You MUST NOT make incidental edits.
+6. **Use exact current tokens:** You MUST NOT rewrite approximately; mutate the token that exists now.
+7. **For swaps/moves:** You SHOULD prefer one range operation over multiple single-line operations.
 </rules>
-<op_choice>
-- One wrong line → `set`
-- Adjacent block changed → `insert`
-- Missing line/block → insert with `append`/`prepend`
-</op_choice>
+<op-choice>
+- One wrong line → MUST use `set`
+- Adjacent block changed → MUST use `insert`
+- Missing line/block → MUST use `append`/`prepend`
+</op-choice>
-<tag_choice>
-- Copy tags exactly from the prefix of the `read` or error output.
-- Never guess tags.
-- For inserts, prefer `insert` > `append`/`prepend` when both boundaries are known.
-- Re-read after each successful edit call before issuing another on same file.
-</tag_choice>
+<tag-choice>
+- You MUST copy tags exactly from the prefix of the `read` or error output.
+- You MUST NOT guess tags.
+- For inserts, you SHOULD prefer `insert` > `append`/`prepend` when both boundaries are known.
+- You MUST re-read after each successful edit call before issuing another on same file.
+</tag-choice>
 <recovery>
 **Tag mismatch (`>>>`)**
-- Retry with the updated tags shown in error output.
-- Re-read only if required tags are missing from error snippet.
-- If mismatch repeats, stop and re-read the exact block.
+- You MUST retry with the updated tags shown in error output.
+- You MUST re-read only if required tags are missing from error snippet.
+- If mismatch repeats, you MUST stop and re-read the exact block.
 </recovery>
 <example name="fix a value or type">
@@ -63,7 +61,7 @@ Apply precise file edits using `LINE#ID` tags, anchoring to the file content.
 {{hlinefull 23 "  const timeout: number = 5000;"}}
 ```
 ```
-op: "set"
+op: "replace"
 tag: "{{hlineref 23 "  const timeout: number = 5000;"}}"
 content: ["  const timeout: number = 30_000;"]
 ```
@@ -75,7 +73,7 @@ content: ["  const timeout: number = 30_000;"]
 {{hlinefull 8 "const data = fetchSync(url);"}}
 ```
 ```
-op: "set"
+op: "replace"
 tag: "{{hlineref 7 "// @ts-ignore"}}"
 content: null
 ```
@@ -86,7 +84,7 @@ content: null
 {{hlinefull 14 "  placeholder: \"DO NOT SHIP\","}}
 ```
 ```
-op: "set"
+op: "replace"
 tag: "{{hlineref 14 "  placeholder: \"DO NOT SHIP\","}}"
 content: [""]
 ```
@@ -207,10 +205,10 @@ content: ["function validate() {", …, "}"]
 </example>
 <critical>
-Ensure:
+You MUST ensure:
 - Payload shape is `{ "path": string, "edits": [operation, …], "delete"?: boolean, "rename"?: string }`
-- Every edit matches exactly one variant
-- Every tag has been copied EXACTLY from a tool result as `N#ID`
-- Scope is minimal and formatting is preserved except targeted token changes
+- Every edit MUST match exactly one variant
+- Every tag MUST be copied EXACTLY from a tool result as `N#ID`
+- Scope MUST be minimal and formatting MUST be preserved except targeted token changes
 </critical>
-**Final reminder:** tags are immutable references to the last read snapshot. Re-read when state changes, then edit.
+**Final reminder:** tags are immutable references to the last read snapshot. You MUST re-read when state changes, then edit.
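
The merged `replace` operation in the hunk above can be sketched as a TypeScript type. This is only an illustrative reading of the prompt text: the names `ReplaceOp`, `isRangeReplace`, and `describe` are hypothetical and are not taken from the package source, and `ID` in the tags is a placeholder rather than a real hash.

```typescript
// Hypothetical types reconstructed from the prompt text above;
// these are NOT the package's actual definitions.
type Tag = string; // "N#ID" as copied from a `read` result

// From 13.0.0 the prompt documents a single `replace` op with two shapes:
// one anchored on a single tag, one anchored on a first/last range.
type ReplaceOp =
  | { op: "replace"; tag: Tag; content: string[] | null }
  | { op: "replace"; first: Tag; last: Tag; content: string[] | null };

// Type guard distinguishing the range shape from the single-line shape.
function isRangeReplace(
  e: ReplaceOp
): e is Extract<ReplaceOp, { first: Tag }> {
  return "first" in e;
}

// Summarise an edit. Per the prompt examples, `content: null` deletes the
// target and `content: [""]` leaves a single blank line in its place.
function describe(e: ReplaceOp): string {
  const target = isRangeReplace(e) ? `${e.first}..${e.last}` : e.tag;
  if (e.content === null) return `delete ${target}`;
  return `replace ${target} with ${e.content.length} line(s)`;
}

const deleteLine = describe({ op: "replace", tag: "7#ID", content: null });
const blankLine = describe({ op: "replace", tag: "14#ID", content: [""] });
const rangeSwap = describe({
  op: "replace",
  first: "3#ID",
  last: "5#ID",
  content: ["a", "b"],
});
```

Under this reading, what used to be a `set` call is now the single-tag shape of `replace`, which is why the three examples in the diff change only their `op` value.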

package/src/prompts/tools/patch.md CHANGED

@@ -43,11 +43,11 @@ Returns success/failure; on failure, error message indicates:
 </output>
 <critical>
-- Always read target file before editing
-- Copy anchors and context lines verbatim (including whitespace)
-- Never use anchors as comments (no line numbers, location labels, placeholders like `@@ @@`)
-- Do not place new lines outside intended block
-- If edit fails or breaks structure, re-read file and produce new patch from current content—do not retry same diff
+- You MUST read the target file before editing
+- You MUST copy anchors and context lines verbatim (including whitespace)
+- You MUST NOT use anchors as comments (no line numbers, location labels, placeholders like `@@ @@`)
+- You MUST NOT place new lines outside the intended block
+- If edit fails or breaks structure, you MUST re-read the file and produce a new patch from current content — you MUST NOT retry the same diff
 - **NEVER** use edit to fix indentation, whitespace, or reformat code. Formatting is a single command run once at the end (`bun fmt`, `cargo fmt`, `prettier --write`, etc.)—not N individual edits. If you see inconsistent indentation after an edit, leave it; the formatter will fix all of it in one pass.
 </critical>

package/src/prompts/tools/poll-jobs.md CHANGED

@@ -2,6 +2,6 @@
 Block until one or more background jobs complete, fail, or are cancelled.
-Use this instead of polling `read jobs://` in a loop when you need to wait for background task or bash results before continuing.
+You MUST use this instead of polling `read jobs://` in a loop when you need to wait for background task or bash results before continuing.
 Returns the status and results of all watched jobs once at least one finishes.

package/src/prompts/tools/python.md CHANGED

@@ -5,13 +5,13 @@ Runs Python cells sequentially in persistent IPython kernel.
 <instruction>
 Kernel persists across calls and cells; **imports, variables, and functions survive—use this.**
 **Work incrementally:**
-- One logical step per cell (imports, define function, test it, use it)
-- Pass multiple small cells in one call
-- Define small functions you can reuse and debug individually
-- Put explanations in assistant message or cell title, **not** in code
+- You SHOULD use one logical step per cell (imports, define function, test it, use it)
+- You SHOULD pass multiple small cells in one call
+- You SHOULD define small functions you can reuse and debug individually
+- You MUST put explanations in assistant message or cell title, MUST NOT put them in code
 **When something fails:**
 - Errors tell you which cell failed (e.g., "Cell 3 failed")
-- Resubmit only fixed cell (or fixed cell + remaining cells)
+- You SHOULD resubmit only the fixed cell (or fixed cell + remaining cells)
 </instruction>
 <prelude>
@@ -34,23 +34,21 @@ All helpers auto-print results and return values for chaining.
 </prelude>
 <output>
-Streams in real time, truncated after 100KB; if truncated, full output stored under $ARTIFACTS and referenced as `artifact://<id>` in metadata.
 User sees output like Jupyter notebook; rich displays render fully:
 - `display(JSON(data))` → interactive JSON tree
 - `display(HTML(...))` → rendered HTML
 - `display(Markdown(...))` → formatted markdown
 - `plt.show()` → inline figures
-  **You will see object repr** (e.g., `<IPython.core.display.JSON object>`). Trust `display()`; do not assume user sees only repr.
+  **You will see object repr** (e.g., `<IPython.core.display.JSON object>`). Trust `display()`; you MUST NOT assume user sees only repr.
 </output>
 <caution>
 - Per-call mode uses fresh kernel each call
-- Use `reset: true` to clear state when session mode active
+- You MUST use `reset: true` to clear state when session mode active
 </caution>
 <critical>
-- Use `run()` for shell commands; never raw `subprocess`
+- You MUST use `run()` for shell commands; you MUST NOT use raw `subprocess`
 </critical>
 <example name="good">

package/src/prompts/tools/read.md CHANGED

@@ -1,6 +1,6 @@
 # Read
-Reads files from local filesystem or internal URLs.
+Reads files from local filesystem or harness URLs.
 <instruction>
 - Reads up to {{DEFAULT_MAX_LINES}} lines default
@@ -14,17 +14,7 @@ Reads files from local filesystem or internal URLs.
 {{/if}}
 - Supports images (PNG, JPG) and PDFs
 - For directories, returns formatted listing with modification times
-- Parallelize reads when exploring related files
-- Supports internal URLs:
-  - `skill://<name>` - read SKILL.md for a skill
-  - `skill://<name>/<path>` - read relative path within skill directory
-  - `rule://<name>` - read rule content
-  - `memory://root` - read memory summary (`memory_summary.md`)
-  - `memory://root/<path>` - read relative path within project memory root
-  - `agent://<id>` - read agent output artifact
-  - `agent://<id>/<path>` or `agent://<id>?q=<query>` - extract JSON from agent output
-  - `docs://` - list available pi documentation files
-  - `docs://<file>.md` - read a specific pi documentation file
+- You SHOULD parallelize reads when exploring related files
 </instruction>
 <output>

package/src/prompts/tools/replace.md CHANGED

@@ -3,10 +3,10 @@
 String replacements in files with fuzzy whitespace matching.
 <instruction>
-- Use smallest edit that uniquely identifies change
-- If `old_text` not unique, expand to include more context or use `all: true` to replace all occurrences
+- You MUST use the smallest edit that uniquely identifies the change
+- If `old_text` not unique, you MUST expand to include more context or use `all: true` to replace all occurrences
 - Fuzzy matching handles minor whitespace/indentation differences automatically
-- Prefer editing existing files over creating new ones
+- You SHOULD prefer editing existing files over creating new ones
 </instruction>
 <output>
@@ -14,10 +14,10 @@ Returns success/failure status. On success, file modified in place with replacem
 </output>
 <critical>
-- Must read file at least once in conversation before editing. Tool errors if you attempt edit without reading file first.
+- You MUST read the file at least once in the conversation before editing. Tool errors if you attempt edit without reading file first.
 </critical>
-<bash*alternatives>
+<bash-alternatives>
 Replace for content-addressed changes—you identify \_what* to change by its text.
 For position-addressed or pattern-addressed changes, bash more efficient:
@@ -35,4 +35,4 @@ For position-addressed or pattern-addressed changes, bash more efficient:
 Use Replace when _content itself_ identifies location.
 Use bash when _position_ or _pattern_ identifies what to change.
-</bash_alternatives>
+</bash-alternatives>

package/src/prompts/tools/ssh.md CHANGED

@@ -3,7 +3,7 @@
 Run commands on remote hosts.
 <instruction>
-Build commands from reference below
+You MUST build commands from the reference below
 </instruction>
 <commands>
@@ -23,13 +23,8 @@ Build commands from reference below
 - Navigation: `cd`, `echo %CD%`
 </commands>
-<output>
-stdout/stderr combined, truncated at 50KB; exit code captured.
-If truncated, full output stored under $ARTIFACTS as `artifact://<id>`.
-</output>
 <critical>
-Verify shell type from "Available hosts", use matching commands.
+You MUST verify the shell type from "Available hosts" and use matching commands.
 </critical>
 <example name="linux">

package/src/prompts/tools/task.md CHANGED

@@ -2,12 +2,12 @@
 Launch subagents to execute parallel, well-scoped tasks.
 {{#if asyncEnabled}}
-Use `read jobs://` to inspect background task state and `read jobs://<job_id>` for detailed status/output when needed.
-When you need to wait for async results before continuing, call `poll_jobs` — it blocks until jobs complete. Do NOT poll `read jobs://` in a loop or yield and hope for delivery.
+Use `read jobs://` to inspect background task state and `read jobs://<job-id>` for detailed status/output when needed.
+When you need to wait for async results before continuing, call `poll_jobs` — it blocks until jobs complete. You MUST NOT poll `read jobs://` in a loop or yield and hope for delivery.
 {{/if}}
 ## What subagents inherit automatically
-Subagents receive the **full system prompt**, including AGENTS.md, context files, and skills. Do NOT repeat project rules, coding conventions, or style guidelines in `context` — they already have them.
+Subagents receive the **full system prompt**, including AGENTS.md, context files, and skills. You MUST NOT repeat project rules, coding conventions, or style guidelines in `context` — they already have them.
 ## What subagents do NOT have
 Subagents have no access to your conversation history. They don't know:
@@ -17,6 +17,8 @@ Subagents have no access to your conversation history. They don't know:
 - Requirements the user stated only in conversation
 Subagents CAN grep the parent conversation file for supplementary details.
+For large intermediate outputs (long traces, JSON payloads, temporary analysis snapshots), you SHOULD write them to `local://<path>` and pass the path in task context instead of inlining bulky text.
 ---
 ## Parameters
@@ -30,9 +32,9 @@ Agent type for all tasks in this batch.
 Shared background prepended verbatim to every task `assignment`. Use only for session-specific information subagents lack.
 <critical>
-Do NOT include project rules, coding conventions, or style guidelines — subagents already have AGENTS.md and context files in their system prompt. Repeating them wastes tokens and inflates context. Restating any rule from AGENTS.md in `context` is a bug — treat it like a lint error.
+You MUST NOT include project rules, coding conventions, or style guidelines — subagents already have AGENTS.md and context files in their system prompt. Repeating them wastes tokens and inflates context. Restating any rule from AGENTS.md in `context` is a bug — treat it like a lint error.
 </critical>
-**Before writing each line of context, ask:** "Would this sentence be true for ANY task in this repo, or only for THIS specific batch?" If it applies to any task → it's a project rule → the subagent already has it → delete the line.
+**Before writing each line of context, ask:** "Would this sentence be true for ANY task in this repo, or only for THIS specific batch?" If it applies to any task → it's a project rule → the subagent already has it → you MUST delete the line.
 WRONG — restating project rules the subagent already has:
 ```
@@ -42,7 +44,7 @@ WRONG — restating project rules the subagent already has:
 - Run the formatter after changes
 - Follow the logging convention
 ```
-Every line above restates a project convention. The subagent reads AGENTS.md. Delete them all.
+Every line above restates a project convention. The subagent reads AGENTS.md. You MUST delete them all.
 RIGHT — only session-specific decisions the subagent cannot infer from project files:
 ```
@@ -99,7 +101,7 @@ Run in isolated git worktree; returns patches. Use when tasks edit overlapping f
 {{/if}}
 ### `schema` (optional — recommended for structured output)
-JTD schema defining expected response structure. Use typed properties. If you care about parsing result, define here — **never describe output format in `context` or `assignment`**.
+JTD schema defining expected response structure. Use typed properties. If you care about parsing result, define here — you MUST NOT describe output format in `context` or `assignment`.
 <caution>
 **Schema vs agent mismatch causes null output.** Agents with `output="structured"` (e.g., `explore`) have a built-in schema. If you also pass `schema`, yours takes precedence — but if you describe output format in `context`/`assignment` instead, the agent's built-in schema wins. The agent gets confused trying to fit your requested format into its schema shape and submits `null`. Either: (1) use `schema` to override the built-in one, (2) use `task` agent which has no built-in schema, or (3) match your instructions to the agent's expected output shape.
@@ -110,7 +112,7 @@ JTD schema defining expected response structure. Use typed properties. If you ca
 <critical>## Task scope
-`assignment` must contain enough info for agent to act **without asking a clarifying question**.
+`assignment` MUST contain enough info for agent to act **without asking a clarifying question**.
 **Minimum bar:** assignment under ~8 lines or missing acceptance criteria = too vague. One-liners guaranteed failure.
 Use structure every assignment:
@@ -135,7 +137,7 @@ Use structure every assignment:
 - DO NOT include project-wide build/test/lint commands (see below)
 ```
-`context` carries shared background. `assignment` carries only delta: file-specific instructions, local edge cases, per-task acceptance checks. Never duplicate shared constraints across assignments.
+`context` carries shared background. `assignment` carries only delta: file-specific instructions, local edge cases, per-task acceptance checks. You MUST NOT duplicate shared constraints across assignments.
 ### Anti-patterns (ban these)
 **Vague assignments** — agent guesses wrong or stalls:
@@ -156,9 +158,9 @@ If a constraint appears in AGENTS.md, it MUST NOT appear in `context`. The subag
 If tempted to write above, expand using templates.
 **Output format in prose instead of `schema`** — agent returns null:
-Structured agents (`explore`, `reviewer`) have built-in output schemas. Describing a different output format in `context`/`assignment` without overriding via `schema` creates a mismatch — the agent can't reconcile your prose instructions with its schema and submits null data. Always use `schema` for output structure, or pick an agent whose built-in schema matches your needs.
+Structured agents (`explore`, `reviewer`) have built-in output schemas. Describing a different output format in `context`/`assignment` without overriding via `schema` creates a mismatch — the agent can't reconcile your prose instructions with its schema and submits null data. You MUST use `schema` for output structure, or pick an agent whose built-in schema matches your needs.
 **Test/lint commands in parallel tasks** — edit wars:
-Parallel agents share working tree. If two agents run `bun check` or `bun test` concurrently, they see each other's half-finished edits, "fix" phantom errors, loop. **Never tell parallel tasks run project-wide build/test/lint commands.** Each task edits, stops. Caller verifies after all tasks complete.
+Parallel agents share working tree. If two agents run `bun check` or `bun test` concurrently, they see each other's half-finished edits, "fix" phantom errors, loop. You MUST NOT tell parallel tasks to run project-wide build/test/lint commands. Each task edits, stops. Caller verifies after all tasks complete.
 **If you can't specify scope yet**, create **Discovery task** first: enumerate files, find callsites, list candidates. Then fan out with explicit paths.
 ### Delegate intent, not keystrokes
@@ -247,12 +249,12 @@ Do not touch TS bindings or downstream consumers — separate phase.
 ## Task scope
-Each task small, well-defined scope — **at most 3–5 files**.
+Each task MUST have small, well-defined scope — **at most 3–5 files**.
 **Signs task too broad:**
 - File paths use globs (`src/**/*.ts`) instead of explicit names
 - Assignment says "update all" / "migrate everything" / "refactor across"
 - Scope covers entire package or directory tree
-**Fix:** enumerate files first (grep/glob discovery), then fan out one task per file or small cluster.
+**Fix:** You MUST enumerate files first (grep/glob discovery), then fan out one task per file or small cluster.
 ---
 ## Parallelization
@@ -278,23 +280,32 @@ Each task small, well-defined scope — **at most 3–5 files**.
 ### Phased execution
+<caution>
+**Parallel agents share the working tree.** They see each other's half-finished edits in real time. This is why:
+- Parallel tasks MUST NOT run project-wide build/test/lint — they will collide on phantom errors
+- Tasks editing overlapping files MUST use `isolated: true` (worktree isolation) or be made sequential
+- The caller MUST run verification after all tasks complete, not inside any individual task
+</caution>
 Layered work with dependencies:
-**Phase 1 — Foundation** (do yourself or single task): define interfaces, create scaffolds, establish API shape. Never fan out until contract known.
+**Phase 1 — Foundation** (caller MUST do this, MUST NOT delegate): define interfaces, create scaffolds, establish API shape. You MUST NOT fan out until contract is known.
 **Phase 2 — Parallel implementation**: fan out tasks consuming same known interface. Include Phase 1 API contract in `context`.
-**Phase 3 — Integration** (do yourself): wire modules, fix mismatches, verify builds.
+**Phase 3 — Integration** (caller MUST do this, MUST NOT delegate): wire modules, fix mismatches, verify builds.
 **Phase 4 — Dependent layer**: fan out tasks consuming Phase 2 outputs.
 ---
 ## Pre-flight checklist
-Before calling tool, verify:
-- [ ] `context` includes only session-specific info not already in AGENTS.md/context files
-- [ ] Each `assignment` follows assignment template — not one-liner
-- [ ] Each `assignment` includes edge cases / "don’t break" items
-- [ ] Tasks truly parallel (no hidden dependencies)
-- [ ] Scope small, file paths explicit (no globs)
-- [ ] No task runs project-wide build/test/lint — you do after all tasks complete
-- [ ] `schema` used if you expect information
+<critical>
+Before calling tool, verify each item:
+- [ ] `context` MUST include only session-specific info not already in AGENTS.md/context files
+- [ ] Each `assignment` MUST follow the assignment template — one-liners are PROHIBITED
+- [ ] Each `assignment` MUST include edge cases / "don't break" items
+- [ ] Tasks MUST be truly parallel — you MUST be able to articulate why no task depends on another's output
+- [ ] Scope MUST be small; file paths MUST be explicit (no globs)
+- [ ] Tasks MUST NOT run project-wide build/test/lint — caller MUST verify after all tasks complete
+- [ ] `schema` MUST be used if you expect structured output
+</critical>
 ---
 ## Agents
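
Some of the "task too broad" signs and the minimum bar described in the task.md hunks above are mechanical enough to check in code. The following is a minimal sketch, assuming an assignment is plain text and file paths are passed separately. The function `lintTask` and its messages are hypothetical and do not exist in the package.

```typescript
// Hypothetical pre-flight check; illustrates the checklist, not the
// package's real behaviour.
function lintTask(assignment: string, files: string[]): string[] {
  const problems: string[] = [];

  // "assignment under ~8 lines ... = too vague" (blank lines ignored)
  const lines = assignment.split("\n").filter((l) => l.trim() !== "");
  if (lines.length < 8) problems.push("assignment under ~8 lines");

  // "at most 3-5 files" per task
  if (files.length > 5) problems.push("more than 5 files");

  // "File paths use globs instead of explicit names"
  if (files.some((f) => /[*?]/.test(f))) problems.push("file paths use globs");

  // 'Assignment says "update all" / "migrate everything" / "refactor across"'
  if (/update all|migrate everything|refactor across/i.test(assignment)) {
    problems.push("broad wording");
  }
  return problems;
}

const vague = lintTask("update all handlers", ["src/**/*.ts"]);

const scoped = lintTask(
  [
    "Target: src/tools/grep.ts",
    "Change: rename option `ctx` to `context`",
    "Keep the old name as a deprecated alias",
    "Edge case: both names supplied",
    "Edge case: neither name supplied",
    "Do not change the output format",
    "Acceptance: existing callers compile",
    "Acceptance: alias emits one warning",
  ].join("\n"),
  ["src/tools/grep.ts"]
);
```

A check like this covers only the countable items. Whether tasks are truly independent, or whether `context` restates AGENTS.md, still needs judgement from the caller.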

package/src/prompts/tools/todo-write.md CHANGED

@@ -1,65 +1,81 @@
 # Todo Write
-Create/manage structured task list for coding session.
+Manage a phased task list. Submit an `ops` array — each op mutates state incrementally.
+**Primary op: `update`.** Use it to mark tasks `in_progress` or `completed`. Only reach for other ops when the structure itself needs to change.
+<critical>
+You MUST call this tool twice per task:
+1. Before beginning — `{op: "update", id: "task-N", status: "in_progress"}`
+2. Immediately after finishing — `{op: "update", id: "task-N", status: "completed"}`
+You MUST keep exactly one task `in_progress` at all times. Mark `completed` immediately — no batching.
+</critical>
 <conditions>
-Use proactively:
-1. Complex multi-step tasks requiring 3+ steps/actions
-2. User requests todo list
-3. User provides multiple tasks (numbered/comma-separated)
-4. After new instructions—capture requirements as todos
-5. Starting task—mark in_progress BEFORE beginning
-6. After completing—mark completed, add follow-up tasks found
+Create a todo list when:
+1. Task requires 3+ distinct steps
+2. User explicitly requests one
+3. User provides a set of tasks to complete
+4. New instructions arrive mid-task — capture before proceeding
 </conditions>
 <protocol>
-1. **Task States**:
-	 - pending: not started
-	 - in_progress: working
-	 - completed: finished
-2. **Task Management**:
-   - Update status in real time
-   - Mark complete IMMEDIATELY after finishing (no batching)
-   - Keep exactly ONE task in_progress at a time
-   - Remove tasks no longer relevant
-   - Complete tasks in list order (do not mark later tasks completed while earlier tasks remain incomplete)
-3. **Task Completion Requirements**:
-   - ONLY mark completed when FULLY accomplished
-   - On errors/blockers/inability to finish, keep in_progress
-   - When blocked, create task describing what needs resolving
-4. **Task Breakdown**:
-	 - Create specific, actionable items
-	 - Keep each todo scoped to one logical unit of work; split unrelated work into separate items
-	 - Break complex tasks into smaller steps
-	 - Use clear, descriptive names
-</protocol>
+## Operations
-<output>
-Returns confirmation todo list updated.
-</output>
+|op|When to use|
+|---|---|
+|`update`|Mark a task in_progress / completed / abandoned, or edit content/notes|
+|`replace`|Initial setup, or full restructure when the plan changes significantly|
+|`add_phase`|Add a new phase of work discovered mid-task|
+|`add_task`|Add a task to an existing phase|
+|`remove_task`|Remove a task that is no longer relevant|
-<caution>
-When in doubt, use this.
-</caution>
+## Statuses
+|Status|Meaning|
+|---|---|
+|`pending`|Not started|
+|`in_progress`|Currently working — exactly one at a time|
+|`completed`|Fully done|
+|`abandoned`|Dropped intentionally|
+## Rules
+- You MUST mark `in_progress` **before** starting work, not after
+- You MUST mark `completed` **immediately** — never defer
+- You MUST keep exactly **one** task `in_progress`
+- You MUST complete phases in order — do not mark later tasks `completed` while earlier ones are `pending`
+- On blockers: keep `in_progress`, add a new task describing the blocker
+- Multiple ops can be batched in one call (e.g., complete current + start next)
+</protocol>
-<example name="use-dark-mode">
-User: Add dark mode toggle to settings. Run tests when done.
-→ Creates todos: toggle component, state management, theme styles, update components, run tests
+<avoid>
+- Single-step tasks — act directly
+- Conversational or informational requests
+- Tasks completable in under 3 trivial steps
+</avoid>
+<example name="start-task">
+Mark task-2 in_progress before beginning work:
+ops: [{op: "update", id: "task-2", status: "in_progress"}]
 </example>
-<example name="use-features">
-User: Implement user registration, product catalog, shopping cart, checkout.
-→ Creates todos per feature with subtasks
+<example name="complete-and-advance">
+Finish task-2 and start task-3 in one call:
+ops: [
+  {op: "update", id: "task-2", status: "completed"},
+  {op: "update", id: "task-3", status: "in_progress"}
+]
 </example>
-<example name="skip">
-User: Run npm install / Add a comment to this function / What does git status do?
-→ Do directly. Single-step/informational tasks need no tracking.
+<example name="initial-setup">
+Replace is for setup only. Prefer add_phase / add_task for incremental additions.
+ops: [{op: "replace", phases: [
+  {name: "Investigation", tasks: [{content: "Read source"}, {content: "Map callsites"}]},
+  {name: "Implementation", tasks: [{content: "Apply fix"}, {content: "Run tests"}]}
+]}]
 </example>
-<avoid>
-Skip when:
-1. Single straightforward task
-2. Task completable in <3 trivial steps
-3. Task purely conversational/informational
-</avoid>
+<example name="skip">
+User: "What does this function do?" / "Add a comment" / "Run npm install"
+→ Do it directly. No list needed.
+</example>
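
The incremental `ops` model introduced in this hunk can be sketched as a small reducer. This is an assumption-laden illustration, not the code in `package/src/tools/todo-write.ts`: only `replace` and `update` are modelled, the names `applyOps` and `inProgressCount` are hypothetical, and assigning sequential `task-N` ids on `replace` is a guess based on the ids used in the prompt examples.

```typescript
// Hypothetical model of the phased todo state described in the prompt.
type Status = "pending" | "in_progress" | "completed" | "abandoned";
interface Task { id: string; content: string; status: Status }
interface Phase { name: string; tasks: Task[] }

// Only two of the five documented ops are sketched here.
type Op =
  | { op: "update"; id: string; status: Status }
  | { op: "replace"; phases: { name: string; tasks: { content: string }[] }[] };

// Apply ops in order, returning a new state each time.
function applyOps(state: Phase[], ops: Op[]): Phase[] {
  let phases = state;
  for (const o of ops) {
    if (o.op === "replace") {
      // Assumption: ids are assigned sequentially across phases.
      let n = 0;
      phases = o.phases.map((p) => ({
        name: p.name,
        tasks: p.tasks.map((t) => ({
          id: `task-${++n}`,
          content: t.content,
          status: "pending" as Status,
        })),
      }));
    } else {
      phases = phases.map((p) => ({
        name: p.name,
        tasks: p.tasks.map((t) =>
          t.id === o.id ? { ...t, status: o.status } : t
        ),
      }));
    }
  }
  return phases;
}

// The prompt requires exactly one task `in_progress`; this counts them.
function inProgressCount(phases: Phase[]): number {
  return phases.reduce(
    (sum, p) => sum + p.tasks.filter((t) => t.status === "in_progress").length,
    0
  );
}

// Initial setup, then "complete-and-advance" batched in one call.
let todo = applyOps([], [
  { op: "replace", phases: [
    { name: "Investigation", tasks: [{ content: "Read source" }, { content: "Map callsites" }] },
    { name: "Implementation", tasks: [{ content: "Apply fix" }, { content: "Run tests" }] },
  ] },
  { op: "update", id: "task-1", status: "in_progress" },
]);
todo = applyOps(todo, [
  { op: "update", id: "task-1", status: "completed" },
  { op: "update", id: "task-2", status: "in_progress" },
]);
```

Batching the two `update` ops in one call, as in the `complete-and-advance` example, keeps the "exactly one `in_progress`" invariant true between calls.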

package/src/prompts/tools/web-search.md CHANGED

@@ -3,8 +3,8 @@
 Search the web for up-to-date information beyond Claude's knowledge cutoff.
 <instruction>
-- Prefer primary sources (papers, official docs) and corroborate key claims with multiple sources
-- Include links for cited sources in the final response
+- You SHOULD prefer primary sources (papers, official docs) and corroborate key claims with multiple sources
+- You MUST include links for cited sources in the final response
 </instruction>
 <output>

package/src/prompts/tools/write.md CHANGED

@@ -5,6 +5,7 @@ Creates or overwrites file at specified path.
 <conditions>
 - Creating new files explicitly required by task
 - Replacing entire file contents when editing would be more complex
+- Prefer `local://<path>` for large temporary artifacts, subagent handoff payloads, and reusable planning artifacts that should survive within the session
 </conditions>
 <output>
@@ -12,7 +13,7 @@ Confirmation of file creation/write with path. When LSP available, content may b
 </output>
 <critical>
-- Prefer Edit tool for modifying existing files (more precise, preserves formatting)
-- Create documentation files (*.md, README) only when explicitly requested
-- No emojis unless requested
+- You SHOULD use Edit tool for modifying existing files (more precise, preserves formatting)
+- You MUST NOT create documentation files (*.md, README) unless explicitly requested
+- You MUST NOT use emojis unless requested
 </critical>