ypi 0.4.0 → 0.5.1

Files changed (6)

package/CHANGELOG.md CHANGED

@@ -3,6 +3,31 @@
 All notable changes to ypi are documented here.
 Format based on [Keep a Changelog](https://keepachangelog.com/).
+## [0.5.1] - 2026-03-23
+### Fixed
+- **macOS mktemp compatibility**: BSD `mktemp` does not allow characters after the `XXXXXX` template suffix — moved `XXXXXX` to end of all templates and use `${TMPDIR:-/tmp}` for portable temp directory resolution
+- **Bash 3.2 unbound variable crash**: empty array expansion under `set -u` fails on macOS default bash — build argv incrementally with length checks in `ypi` launcher
+## [0.5.0] - 2026-02-15
+### Added
+- **Notify-done extension** (`contrib/extensions/notify-done.ts`): background task completion notifications via sentinel files — injects messages into conversation when tasks finish, no polling needed
+- **LSP extension** (`contrib/extensions/lsp/`): Language Server Protocol integration for code intelligence (diagnostics, references, definitions, rename, hover, symbols)
+- **Persist-system-prompt extension** (`contrib/extensions/persist-system-prompt.ts`): saves effective system prompt to session files for debugging and reproducibility
+- **Auto-title extension** (`contrib/extensions/auto-title.ts`): automatic session title generation
+- **Cachebro extension** (`contrib/extensions/cachebro.ts`): intelligent file caching with diff-aware invalidation and token estimation
+- **Context window awareness**: SYSTEM_PROMPT.md now teaches agents about finite context budgets and how to manage them
+- Tests for notify-done and persist-system-prompt extensions
+### Changed
+- **AGENTS.md**: added sentinel/notify-done workflow pattern, background task instructions
+- **SYSTEM_PROMPT.md**: context window awareness guidance
+- **contrib/README.md**: updated with new extensions documentation
+### Fixed
+- Notify-done extension: block broadcast sentinels, use `steer` for busy agents, `display: true` for visibility
 ## [0.4.0] - 2026-02-13
 ### Added
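
The 0.5.1 `mktemp` fix described in the changelog can be sketched as follows. This is a minimal illustration, not the package's code: `make_temp` and the `rlm_demo` prefix are hypothetical names. The only points taken from the diff are that the template ends in `XXXXXX` and that the directory comes from `${TMPDIR:-/tmp}`.

```bash
#!/usr/bin/env bash
set -eu

# make_temp PREFIX: create a temp file from a template that ends in XXXXXX.
# A template with a suffix after the X's (for example "name_XXXXXX.txt") is
# the form the changelog reports as failing with BSD mktemp on macOS.
# make_temp and PREFIX are hypothetical, used only for this sketch.
make_temp() {
    mktemp "${TMPDIR:-/tmp}/$1.XXXXXX"
}

tmpfile=$(make_temp rlm_demo)
trap 'rm -f "$tmpfile"' EXIT   # remove the file when the script exits
echo "created: $tmpfile"
```

Putting the random part last costs the file its extension, which is why the new templates in the diff drop `.txt`, `.md`, and `.jsonl`.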

package/README.md CHANGED

@@ -173,13 +173,32 @@ jj git push                  # Push to GitHub
 ### Testing
 ```bash
-make test-fast    # 54 tests, no LLM calls, seconds
-make test-e2e     # Real LLM calls, costs ~$0.05
-make test         # Both
+make test-fast         # unit + guardrails
+make test-extensions   # extension compatibility with installed pi
+make pre-push-checks   # shared local/CI gate (recommended before push)
+make test-e2e          # real LLM calls, costs money
+make test              # all of the above
+```
+Install hooks once per clone to run checks automatically on git push:
+```bash
+make install-hooks
+```
+Release/update helper:
+```bash
+make release-preflight   # same checks + upstream dry-run in one command
+make land                # deterministic-ish landing helper
 ```
 **Before any change to `rlm_query`:** run `make test-fast`. After: run it again. `rlm_query` is a live dependency of the agent's own execution — breaking it breaks the agent.
+CI helper commands:
+```bash
+make ci-status N=15      # recent workflow runs
+make ci-last-failure     # dump latest failing workflow log
+```
 ### History

package/SYSTEM_PROMPT.md CHANGED

@@ -4,11 +4,13 @@
 - You are a **recursive LLM** equipped with a Bash shell and the `rlm_query` tool.
 - The environment variable `RLM_DEPTH` tells you your current recursion depth; respect `RLM_MAX_DEPTH` and be more **conservative** (fewer sub‑calls, more direct actions) the deeper you are.
 - You can **read files, write files, run commands, and delegate work** to sub‑agents via `rlm_query`.
-- Sub‑agents inherit the same capabilities and receive their own isolated context.
+- Sub‑agents inherit the same capabilities and receive their own **fresh context window**.
 - All actions should aim to be **deterministic and reproducible**.
+- **Your context window is finite and non-renewable.** Every file you read, every tool output you receive, every message in this conversation — it all accumulates. When it fills up, older context gets compressed and you lose information. This is the fundamental constraint that shapes how you work.
 ## SECTION 2 – Recursive Decomposition
 You solve problems by **decomposing them**: break big tasks into smaller ones, delegate to sub‑agents, combine results. This works for any task — coding, analysis, refactoring, generation, exploration.
+**Why recurse?** Not because a problem is too hard — because it’s too *big* for one context window. A 10-file refactor doesn’t need more intelligence; it needs more context windows. Each child agent you spawn via `rlm_query` gets a fresh context budget. You get back only their answer — a compact result instead of all the raw material. This is how you stay effective on long tasks.
 Your original prompt is also available as a file at `$RLM_PROMPT_FILE` — use it when you need to manipulate the question programmatically (e.g., extracting exact strings, counting characters) rather than copying tokens from memory.
@@ -17,13 +19,15 @@ If a `$CONTEXT` file is set, it contains data relevant to your task. Treat it li
 **Core pattern: size up → search → delegate → combine**
 1. **Size up the problem** – How big is it? Can you do it directly, or does it need decomposition? For files: `wc -l` / `wc -c`. For code tasks: how many files, how complex?
 2. **Search & explore** – `grep`, `find`, `ls`, `head` — orient yourself before diving in.
-3. **Delegate** – use `rlm_query` to hand sub‑tasks to child agents. Two patterns:
+3. **Delegate** – use `rlm_query` to hand sub‑tasks to child agents. Three patterns:
    ```bash
-   # Pipe data as the child's context
+   # Pipe data as the child's context (synchronous — blocks until done)
    sed -n '100,200p' bigfile.txt | rlm_query "Summarize this section"
-   # Child inherits your environment (files, cwd, $CONTEXT)
+   # Child inherits your environment (synchronous)
    rlm_query "Refactor the error handling in src/api.py"
+   # ASYNC — returns immediately, child runs in background (PREFERRED for parallel work)
+   rlm_query --async "Write tests for the auth module"
+   # Returns: {"job_id": "...", "output": "/tmp/...", "sentinel": "/tmp/...done", "pid": 12345}
    ```
 4. **Combine** – aggregate results, deduplicate, resolve conflicts, produce the final output.
 5. **Do it directly when it's small** – don't delegate what you can do in one step.
@@ -42,11 +46,11 @@ cat src/config.py
 ```bash
 # Find all files that need updating
 grep -rl "old_api_call" src/
-# Delegate each file to a sub-agent (each gets its own jj workspace)
+# Delegate each file to a sub-agent using --async (non-blocking)
 for f in $(grep -rl "old_api_call" src/); do
-    rlm_query "In $f, replace all old_api_call() with new_api_call(). Update the imports too."
-done
+    rlm_query --async "In $f, replace all old_api_call() with new_api_call(). Update the imports. Then jj commit -m 'refactor: $f'"
+    done
+# Children run in parallel, each in its own jj workspace. Check sentinels for completion.
 ```
 **Example 3 – Large file analysis, chunk and search**
@@ -59,17 +63,27 @@ grep -n "ERROR\|FATAL" data/logs.txt
 sed -n '480,600p' data/logs.txt | rlm_query "What caused this error? Suggest a fix."
 ```
-**Example 4 – Parallel sub-tasks with different goals**
+**Example 4 – Parallel sub-tasks with --async (PREFERRED)**
 ```bash
-# Break a complex task into independent pieces
-SUMMARY=$(rlm_query "Read README.md and summarize what this project does in one paragraph.")
-ISSUES=$(rlm_query "Run the test suite and report any failures.")
-DEPS=$(rlm_query "Check for outdated dependencies in package.json.")
-# Combine into a report
-echo "Summary: $SUMMARY"
-echo "Test issues: $ISSUES"
-echo "Dependency status: $DEPS"
+# Break a complex task into independent pieces — all run in parallel
+JOB1=$(rlm_query --async "Read README.md and summarize what this project does in one paragraph.")
+JOB2=$(rlm_query --async "Run the test suite and report any failures.")
+JOB3=$(rlm_query --async "Check for outdated dependencies in package.json.")
+# Each returns immediately with {"job_id", "output", "sentinel", "pid"}
+# Check completion non-blockingly:
+for JOB in "$JOB1" "$JOB2" "$JOB3"; do
+    SENTINEL=$(echo "$JOB" | python3 -c "import sys,json; print(json.load(sys.stdin)['sentinel'])")
+    OUTPUT=$(echo "$JOB" | python3 -c "import sys,json; print(json.load(sys.stdin)['output'])")
+    [ -f "$SENTINEL" ] && echo "Done: $(cat $OUTPUT)" || echo "Still running..."
+done
+```
+**Example 5 – Sequential sub-tasks (when order matters)**
+```bash
+# Use synchronous rlm_query ONLY when each step depends on the previous
+SUMMARY=$(rlm_query "Read README.md and summarize what this project does.")
+ISSUES=$(rlm_query "Given this summary: $SUMMARY — what are the main risks?")
 ```
 **Example 5 – Iterative chunking over a huge file**
@@ -95,6 +109,7 @@ done
 - **Write files directly** with `write` or standard Bash redirection; do **not** merely describe the change.
 - When you need to create or modify multiple files, perform each action explicitly (e.g., `echo >> file`, `sed -i`, `cat > newfile`).
 - Any sub‑agents you spawn via `rlm_query` inherit their own jj workspaces, so their edits are also isolated.
+- **Always commit before exiting** — if you're in a jj workspace, run `jj commit -m 'description'` before you finish. Uncommitted work is **lost** when the workspace is forgotten on exit.
 ## SECTION 4 – Guardrails & Cost Awareness
 - **RLM_TIMEOUT** – if set, respect the remaining wall‑clock budget; avoid long‑running loops.
@@ -113,15 +128,24 @@ done
   rlm_sessions grep <pattern>      # search across sessions
   ```
   Available for debugging and reviewing what other agents in the tree have done.
+- **`rlm_cleanup`** – clean up stale temp files and jj workspaces from previous rlm_query runs:
+  ```bash
+  rlm_cleanup              # dry-run: show what would be cleaned
+  rlm_cleanup --force      # actually delete stale files and workspace dirs
+  rlm_cleanup --age 60     # override age threshold (default: 120 min)
+  ```
+  Run this when the machine feels slow or /tmp is filling up. The reaper in `rlm_query` runs automatically at depth 0, but this lets you trigger it manually or with a different age threshold.
 - **Depth awareness** – at deeper `RLM_DEPTH` levels, prefer **direct actions** (e.g., file edits, single‑pass searches) over spawning many sub‑agents.
 - Always **clean up temporary files** and respect `trap` handlers defined by the infrastructure.
+- **NEVER run `rlm_query` in a foreground for-loop** — this blocks the parent's conversation for the entire duration. Use `rlm_query --async` for parallel work. Synchronous `rlm_query` is only for single calls or when you need the result immediately for the next step.
 ## SECTION 5 – Rules
-1. **Size up first** – before delegating, check if the task is small enough to do directly. Read small files, edit simple things, answer obvious questions — don't over‑decompose.
-2. **Validate sub‑agent output** – if a sub‑call returns unexpected output, re‑query or do it yourself; never guess.
-3. **Computation over memorization** – use `python3`, `date`, `wc`, `grep -c` for counting, dates, and math. Don't eyeball it.
-4. **Act, don't describe** – when instructed to edit code, write files, or make changes, **do it** immediately.
-5. **Small, focused sub‑agents** – each `rlm_query` call should have a clear, bounded task. Keep the call count low.
-6. **Depth preference** – deeper depths ⇒ fewer sub‑calls, more direct Bash actions.
-7. **Say "I don't know" only when true** – only when the required information is genuinely absent from the context, repo, or environment.
-8. **Safety** – never execute untrusted commands without explicit intent; rely on the provided tooling.
+1. **Search before reading** – `grep`, `wc -l`, `head` before `cat` or unbounded `read`. Never ingest a file you haven’t sized up. If it’s over 50 lines, search for what you need instead of reading it all.
+2. **Size up first** – before delegating, check if the task is small enough to do directly. Read small files, edit simple things, answer obvious questions — don’t over‑decompose.
+3. **Validate sub‑agent output** – if a sub‑call returns unexpected output, re‑query or do it yourself; never guess.
+4. **Computation over memorization** – use `python3`, `date`, `wc`, `grep -c` for counting, dates, and math. Don’t eyeball it.
+5. **Act, don’t describe** – when instructed to edit code, write files, or make changes, **do it** immediately.
+6. **Small, focused sub‑agents** – each `rlm_query` call should have a clear, bounded task. Keep the call count low.
+7. **Depth preference** – deeper depths ⇒ fewer sub‑calls, more direct Bash actions.
+8. **Say “I don’t know” only when true** – only when the required information is genuinely absent from the context, repo, or environment.
+9. **Safety** – never execute untrusted commands without explicit intent; rely on the provided tooling.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ypi",
-  "version": "0.4.0",
+  "version": "0.5.1",
   "description": "ypi — a recursive coding agent. Pi that can call itself via rlm_query.",
   "license": "MIT",
   "author": "Raymond Weitekamp",

package/rlm_query CHANGED

@@ -9,6 +9,8 @@
 #   echo "some text" | rlm_query "What is the main topic?"
 #   sed -n '100,200p' "$CONTEXT" | rlm_query "Summarize this section"
 #   rlm_query --fork "Continue working on this refactor"
+#   rlm_query --async "Analyze the codebase"   # returns immediately with job JSON
+#   echo "data" | rlm_query --async --notify $$ "Summarize this"
 #
 # If stdin has data (piped), that becomes the child's context.
 # Otherwise, the child inherits the parent's $CONTEXT file.
@@ -16,6 +18,9 @@
 # Flags:
 #   --fork             Fork parent session into child (carries conversation history)
 #                      Default: fresh session per child (only data context, no history)
+#   --async            Spawn child in background, return immediately with job JSON
+#                      Output goes to a temp file; sentinel touched when done
+#   --notify PID       With --async: write result to target PID's peer inbox when done
 #
 # Environment:
 #   RLM_DEPTH         — current recursion depth (default: 0)
@@ -50,14 +55,18 @@ rlm_error() { echo "✗ $1" >&2; [ -n "${2:-}" ] && echo "  Why: $2" >&2; [ -n "
 # Parse flags
 # ----------------------------------------------------------------------
 FORK=false
+ASYNC=false
+NOTIFY_PID=""
 while [[ "${1:-}" == --* ]]; do
     case "$1" in
         --fork) FORK=true; shift ;;
+        --async) ASYNC=true; shift ;;
+        --notify) NOTIFY_PID="$2"; shift 2 ;;
         *) break ;;
     esac
 done
-PROMPT="${1:?Usage: rlm_query [--fork] \"your prompt here\"}"
+PROMPT="${1:?Usage: rlm_query [--fork] [--async] [--notify PID] \"your prompt here\"}"
 # ----------------------------------------------------------------------
 # Depth guard — refuse to go beyond max depth
@@ -72,6 +81,19 @@ if [ "$NEXT_DEPTH" -gt "$MAX_DEPTH" ]; then
     exit 1
 fi
+# ----------------------------------------------------------------------
+# Stale temp reaper — runs once per root invocation (depth 0 only)
+# Cleans up leaked files from crashed/killed processes (SIGKILL skips traps)
+# ----------------------------------------------------------------------
+if [ "$DEPTH" -eq 0 ]; then
+    (
+        # Delete rlm temp files older than 2 hours whose parent process is gone
+        find /tmp -maxdepth 1 -name 'rlm_*' -mmin +120 -delete 2>/dev/null
+        # Remove stale jj workspace directories older than 2 hours
+        find /tmp -maxdepth 1 -name 'rlm_ws_*' -type d -mmin +120 -exec rm -rf {} + 2>/dev/null
+    ) &
+fi
 PROVIDER="${RLM_PROVIDER:-}"
 MODEL="${RLM_MODEL:-}"
 SYSTEM_PROMPT_FILE="${RLM_SYSTEM_PROMPT:-}"
@@ -117,7 +139,7 @@ fi
 # Initialize cost file if budget is set but no file exists yet
 if [ -n "${RLM_BUDGET:-}" ] && [ -z "${RLM_COST_FILE:-}" ]; then
-    export RLM_COST_FILE=$(mktemp /tmp/rlm_cost_XXXXXX.jsonl)
+    export RLM_COST_FILE=$(mktemp "${TMPDIR:-/tmp}/rlm_cost.XXXXXX")
 fi
 # ----------------------------------------------------------------------
@@ -149,12 +171,12 @@ fi
 # ----------------------------------------------------------------------
 # Temporary child context file
 # ----------------------------------------------------------------------
-CHILD_CONTEXT=$(mktemp /tmp/rlm_ctx_d${NEXT_DEPTH}_XXXXXX.txt)
+CHILD_CONTEXT=$(mktemp "${TMPDIR:-/tmp}/rlm_ctx_d${NEXT_DEPTH}.XXXXXX")
 COMBINED_PROMPT=""
 # Write prompt to a file for symbolic access — agents can grep/sed the original
 # question instead of relying on in-context token copying.
-PROMPT_FILE=$(mktemp /tmp/rlm_prompt_d${NEXT_DEPTH}_XXXXXX.txt)
+PROMPT_FILE=$(mktemp "${TMPDIR:-/tmp}/rlm_prompt_d${NEXT_DEPTH}.XXXXXX")
 echo "$PROMPT" > "$PROMPT_FILE"
 # ----------------------------------------------------------------------
@@ -166,7 +188,7 @@ if [ "${RLM_JJ:-1}" != "0" ] \
    && command -v jj &>/dev/null \
    && jj root &>/dev/null 2>&1; then
     JJ_WS_NAME="rlm-d${NEXT_DEPTH}-$$"
-    JJ_WORKSPACE=$(mktemp -d /tmp/rlm_ws_d${NEXT_DEPTH}_XXXXXX)
+    JJ_WORKSPACE=$(mktemp -d "${TMPDIR:-/tmp}/rlm_ws_d${NEXT_DEPTH}.XXXXXX")
     if ! jj workspace add --name "$JJ_WS_NAME" "$JJ_WORKSPACE" &>/dev/null; then
         JJ_WORKSPACE=""
         JJ_WS_NAME=""
@@ -174,6 +196,8 @@ if [ "${RLM_JJ:-1}" != "0" ] \
 fi
 # Cleanup: remove temp context + forget jj workspace (updated in run section below)
+# In async mode, the child needs these files — skip cleanup (child cleans up after itself)
+if [ "$ASYNC" != true ]; then
 trap '{
     rm -f "$CHILD_CONTEXT" "$PROMPT_FILE"
     rm -f "${COMBINED_PROMPT:-}"
@@ -181,20 +205,28 @@ trap '{
         jj workspace forget "$JJ_WS_NAME" 2>/dev/null || true
     fi
 }' EXIT
+fi
 trap 'rlm_error "Interrupted" "Received signal" "Re-run the command"; exit 130' INT TERM
 # ----------------------------------------------------------------------
 # Detect piped stdin
 # ----------------------------------------------------------------------
 HAS_STDIN=false
-if [ -p /dev/stdin ]; then
+if [ -n "${RLM_STDIN:-}" ]; then
     HAS_STDIN=true
-elif [ -n "${RLM_STDIN:-}" ]; then
+elif [ -p /dev/stdin ]; then
     HAS_STDIN=true
 fi
 if [ "$HAS_STDIN" = true ]; then
     cat > "$CHILD_CONTEXT"
+    # Some CI runners expose stdin as an empty pipe even when nothing was piped
+    # to rlm_query. In that case, preserve inherited CONTEXT unless stdin was
+    # explicitly signaled via RLM_STDIN.
+    if [ ! -s "$CHILD_CONTEXT" ] && [ -z "${RLM_STDIN:-}" ] && [ -n "${CONTEXT:-}" ] && [ -f "${CONTEXT:-}" ]; then
+        cp "$CONTEXT" "$CHILD_CONTEXT"
+    fi
 else
     if [ -n "${CONTEXT:-}" ] && [ -f "${CONTEXT:-}" ]; then
         cp "$CONTEXT" "$CHILD_CONTEXT"
@@ -265,7 +297,7 @@ fi
 # Build combined system prompt with rlm_query source embedded
 COMBINED_PROMPT=""
 if [ -n "$SYSTEM_PROMPT_FILE" ] && [ -f "$SYSTEM_PROMPT_FILE" ]; then
-    COMBINED_PROMPT=$(mktemp /tmp/rlm_system_prompt_XXXXXX.md)
+    COMBINED_PROMPT=$(mktemp "${TMPDIR:-/tmp}/rlm_system_prompt.XXXXXX")
     cat "$SYSTEM_PROMPT_FILE" > "$COMBINED_PROMPT"
     SELF_SOURCE="$(cd "$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")" && pwd)/rlm_query"
     if [ -f "$SELF_SOURCE" ]; then
@@ -301,11 +333,63 @@ if [ -n "$JJ_WORKSPACE" ]; then
     cd "$JJ_WORKSPACE"
 fi
+# ----------------------------------------------------------------------
+# Async mode — spawn child in background and return immediately
+# ----------------------------------------------------------------------
+if [ "$ASYNC" = true ]; then
+    ASYNC_ID="rlm_async_${RLM_TRACE_ID}_c${RLM_CALL_COUNT}_$(head -c 4 /dev/urandom | od -An -tx1 | tr -d ' \n')"
+    ASYNC_OUTPUT="/tmp/${ASYNC_ID}.txt"
+    ASYNC_SENTINEL="/tmp/${ASYNC_ID}.done"
+    # Build the full command
+    CHILD_CMD=(pi "${CMD_ARGS[@]}" "$PROMPT")
+    if [ -n "$TIMEOUT_CMD" ]; then
+        CHILD_CMD=($TIMEOUT_CMD "${CHILD_CMD[@]}")
+    fi
+    # Spawn in background: run child, capture output, touch sentinel, optionally notify
+    (
+        "${CHILD_CMD[@]}" > "$ASYNC_OUTPUT" 2>&1
+        touch "$ASYNC_SENTINEL"
+        # If --notify PID was given, write to that peer's inbox
+        if [ -n "$NOTIFY_PID" ]; then
+            INBOX_DIR=$(find /tmp/pi_peer_* -maxdepth 0 -type d 2>/dev/null | while read d; do
+                if [ -f "$d/meta.json" ] && grep -q "\"pid\":$NOTIFY_PID" "$d/meta.json" 2>/dev/null; then
+                    echo "$d"; break
+                fi
+            done)
+            if [ -n "$INBOX_DIR" ] && [ -d "$INBOX_DIR" ]; then
+                RESULT=$(cat "$ASYNC_OUTPUT" | tail -c 50000)
+                MSG=$(cat <<MSGJSON
+{"from_pid":$$,"from_project":"rlm_query","message":"[rlm_query --async result]\n\n$RESULT","timestamp":"$(date -Iseconds)","id":"async_${ASYNC_ID}"}
+MSGJSON
+)
+                echo "$MSG" >> "$INBOX_DIR/inbox.jsonl"
+            fi
+        fi
+        # Clean up temp files the parent skipped
+        rm -f "$CHILD_CONTEXT" "$PROMPT_FILE" "${COMBINED_PROMPT:-}"
+        if [ -n "$JJ_WS_NAME" ]; then
+            jj workspace forget "$JJ_WS_NAME" 2>/dev/null || true
+        fi
+    ) &
+    ASYNC_PID=$!
+    disown $ASYNC_PID 2>/dev/null || true
+    # Return job metadata immediately
+    cat <<EOF
+{"job_id": "$ASYNC_ID", "output": "$ASYNC_OUTPUT", "sentinel": "$ASYNC_SENTINEL", "pid": $ASYNC_PID}
+EOF
+    exit 0
+fi
 # ----------------------------------------------------------------------
 # Run child Pi — JSON mode (default) or plain text
 # JSON mode streams text to stdout and captures cost via fd 3.
 # ----------------------------------------------------------------------
-COST_OUT=$(mktemp /tmp/rlm_cost_out_XXXXXX.json)
+COST_OUT=$(mktemp "${TMPDIR:-/tmp}/rlm_cost_out.XXXXXX")
 trap '{
     rm -f "$CHILD_CONTEXT" "$PROMPT_FILE"
     rm -f "${COMBINED_PROMPT:-}"

package/ypi CHANGED

@@ -66,7 +66,7 @@ mkdir -p "$RLM_SESSION_DIR"
 # Build combined system prompt: SYSTEM_PROMPT.md + rlm_query source
 # This way the agent sees the full implementation, not just usage docs.
-COMBINED_PROMPT=$(mktemp /tmp/ypi_system_prompt_XXXXXX.md)
+COMBINED_PROMPT=$(mktemp "${TMPDIR:-/tmp}/ypi_system_prompt.XXXXXX")
 trap 'rm -f "$COMBINED_PROMPT"' EXIT
 cat "$SCRIPT_DIR/SYSTEM_PROMPT.md" > "$COMBINED_PROMPT"
@@ -94,6 +94,7 @@ fi
 # We append to the combined prompt file rather than passing through,
 # since pi already gets --system-prompt from us.
 PASS_ARGS=()
+QUIET="${YPI_QUIET:-0}"
 while [[ $# -gt 0 ]]; do
     case "$1" in
         --append-system-prompt)
@@ -101,9 +102,13 @@ while [[ $# -gt 0 ]]; do
             printf '\n%s\n' "$1" >> "$COMBINED_PROMPT"
             shift
             ;;
+        --quiet|-q)
+            QUIET=1
+            shift
+            ;;
         --system-prompt)
             # User overriding ypi's system prompt entirely
-            echo "⚠️  Overriding ypi's system prompt. Did you mean --append-system-prompt?" >&2
+            [ "$QUIET" != "1" ] && echo "⚠️  Overriding ypi's system prompt. Did you mean --append-system-prompt?" >&2
             shift
             cat "$1" > "$COMBINED_PROMPT" 2>/dev/null || echo "$1" > "$COMBINED_PROMPT"
             shift
@@ -116,4 +121,7 @@ while [[ $# -gt 0 ]]; do
 done
 # Launch Pi with the combined system prompt, passing all args through
 # User's own extensions (hashline, etc.) are discovered automatically by Pi.
-exec pi --system-prompt "$COMBINED_PROMPT" "${YPI_EXT_ARGS[@]}" "${PASS_ARGS[@]}"
+PI_ARGV=(pi --system-prompt "$COMBINED_PROMPT")
+[ ${#YPI_EXT_ARGS[@]} -gt 0 ] && PI_ARGV+=("${YPI_EXT_ARGS[@]}")
+[ ${#PASS_ARGS[@]} -gt 0 ] && PI_ARGV+=("${PASS_ARGS[@]}")
+exec "${PI_ARGV[@]}"
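
The incremental argv pattern in the launcher hunk above can be exercised in isolation. In this sketch `EXT_ARGS` and `USER_ARGS` are hypothetical stand-ins for `YPI_EXT_ARGS` and `PASS_ARGS`, and `echo` replaces `pi` so the result is observable. It assumes, as the changelog states, that expanding an empty array under `set -u` fails on Bash 3.2, and it avoids that by checking each array's length before expanding it.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical stand-ins for YPI_EXT_ARGS and PASS_ARGS from the launcher.
EXT_ARGS=()
USER_ARGS=("--quiet" "hello world")

# Start with the command itself, then append each array only if it is
# non-empty, so an empty array is never expanded.
ARGV=(echo)
if [ ${#EXT_ARGS[@]} -gt 0 ]; then ARGV+=("${EXT_ARGS[@]}"); fi
if [ ${#USER_ARGS[@]} -gt 0 ]; then ARGV+=("${USER_ARGS[@]}"); fi

OUT=$("${ARGV[@]}")
echo "$OUT"
```

The `if` form is used here instead of `[ ... ] && ...` so that a failed length check never leaves a non-zero status as the last result. The launcher's `&&` form behaves the same in practice because `exec` follows immediately.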