prizmkit 1.0.26 → 1.0.28

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,5 +1,5 @@
  {
- "frameworkVersion": "1.0.26",
- "bundledAt": "2026-03-15T17:57:13.772Z",
- "bundledFrom": "896326f"
+ "frameworkVersion": "1.0.28",
+ "bundledAt": "2026-03-17T06:14:03.182Z",
+ "bundledFrom": "780f717"
  }
@@ -33,6 +33,7 @@ python3 dev-pipeline/scripts/init-pipeline.py \
  |---------|-------------|
  | `./run.sh run [feature-list.json] [options]` | Start or resume the pipeline. Processes features sequentially by dependency order. |
  | `./run.sh status [feature-list.json]` | Display current pipeline status: completed, pending, blocked, failed features. |
+ | `./run.sh test-cli` | Test AI CLI detection: show detected CLI, version, platform, and query the AI model identity. |
  | `./run.sh reset` | Clear all runtime state in `state/`. Pipeline starts fresh on next `run`. |
  | `./run.sh help` | Show usage help. |
  | `./retry-feature.sh <feature-id> [feature-list.json]` | Retry a single failed feature. Runs one session then exits. |
@@ -97,7 +98,9 @@ What is always reset (with or without `--clean`):
  | `MAX_RETRIES` | `3` | Maximum retry attempts per feature before marking as failed. |
  | `SESSION_TIMEOUT` | `0` (no limit) | Timeout in seconds per AI CLI session. 0 = no timeout. |
  | `AI_CLI` | auto-detect | AI CLI command name. Auto-detects `cbc` or `claude`. Set to override. |
+ | `MODEL` | (none) | AI model ID for the session. Passed as `--model` to the CLI. See [Model Selection](#model-selection). |
  | `CODEBUDDY_CLI` | (deprecated) | Legacy alias for `AI_CLI`. Prefer `AI_CLI`. |
+ | `VERBOSE` | `0` | Set to `1` to enable `--verbose` on AI CLI (shows subagent output). |
  | `HEARTBEAT_INTERVAL` | `30` | Seconds between heartbeat log output while a session is running. |
  | `HEARTBEAT_STALE_THRESHOLD` | `600` | Seconds before a session is considered stale/stuck. |
  | `LOG_CLEANUP_ENABLED` | `1` | Run log cleanup before pipeline execution (`1`=enabled, `0`=disabled). |
@@ -116,6 +119,93 @@ SESSION_TIMEOUT=7200 ./dev-pipeline/run.sh run feature-list.json
  LOG_RETENTION_DAYS=7 LOG_MAX_TOTAL_MB=512 ./dev-pipeline/run.sh run feature-list.json
  ```

+ ### AI CLI Configuration
+
+ The pipeline auto-detects which AI CLI to use. Detection priority:
+
+ 1. `AI_CLI` environment variable (highest)
+ 2. `.prizmkit/config.json` → `ai_cli` field
+ 3. `CODEBUDDY_CLI` environment variable (legacy)
+ 4. Auto-detect: `cbc` in PATH → `claude` in PATH (lowest)
+
+ To permanently configure a project to use a specific CLI, create `.prizmkit/config.json`:
+
+ ```json
+ {
+ "ai_cli": "claude-internal",
+ "platform": "claude"
+ }
+ ```
+
+ Or override per-invocation:
+
+ ```bash
+ AI_CLI=claude-internal ./dev-pipeline/run.sh run feature-list.json
+ ```
+
+ ### Model Selection
+
+ Use the `MODEL` environment variable to specify which AI model to use. The value is passed as `--model <id>` to the CLI.
+
+ ```bash
+ # Run pipeline with Sonnet (faster, cheaper)
+ MODEL=claude-sonnet-4.6 ./dev-pipeline/run.sh run feature-list.json
+
+ # Run pipeline with Opus (most capable)
+ MODEL=claude-opus-4.6 ./dev-pipeline/run.sh run feature-list.json
+
+ # Retry a feature with a specific model
+ MODEL=claude-opus-4.6 ./dev-pipeline/retry-feature.sh F-007
+
+ # Test which model the CLI is using
+ MODEL=claude-sonnet-4.6 ./dev-pipeline/run.sh test-cli
+ ```
+
+ Common model IDs (for `cbc`):
+
+ | Model ID | Description |
+ |----------|-------------|
+ | `claude-opus-4.6` | Most capable, slower, higher cost |
+ | `claude-sonnet-4.6` | Balanced speed/capability (recommended for pipeline) |
+ | `claude-haiku-4.5` | Fastest, cheapest, less capable |
+
+ > **Note**: `--model` support depends on the CLI. `cbc` fully supports it. `claude-internal` does not support `--model` in headless mode (only the interactive `/model` command). If `MODEL` is set but the CLI doesn't support it, the flag is silently ignored.
+
+ ### Testing AI CLI (`test-cli`)
+
+ Use `test-cli` to verify which CLI, version, and model the pipeline will use:
+
+ ```bash
+ # Basic test — uses auto-detected CLI and default model
+ ./dev-pipeline/run.sh test-cli
+
+ # Test with a specific model
+ MODEL=claude-sonnet-4.6 ./dev-pipeline/run.sh test-cli
+
+ # Test with a specific CLI
+ AI_CLI=cbc ./dev-pipeline/run.sh test-cli
+ ```
+
+ Example output:
+
+ ```
+ ============================================
+ Dev-Pipeline AI CLI Test
+ ============================================
+
+ Detected CLI: cbc
+ Platform: codebuddy
+ CLI Version: 2.62.1
+
+ Querying AI model (headless mode)...
+
+ AI Response: I'm CodeBuddy, running Claude Opus 4.6
+
+ ============================================
+ ```
+
+ The test sends a one-line prompt asking the AI to identify itself, with a 30-second timeout. If the CLI requires authentication or is unavailable, it shows a fallback message.
+
  ## How It Works

  ### Execution Flow
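The four-step detection order described in the added "AI CLI Configuration" section can be sketched in Python. This is an editor's illustration, not the package's implementation (which lives in shell, in the sourced common helpers); the function name `detect_ai_cli` and the config handling details are invented for the example.

```python
import json
import os
import shutil


def detect_ai_cli(config_path=".prizmkit/config.json"):
    """Resolve the AI CLI name following the documented priority order."""
    # 1. AI_CLI environment variable (highest)
    if os.environ.get("AI_CLI"):
        return os.environ["AI_CLI"]
    # 2. .prizmkit/config.json -> ai_cli field
    if os.path.exists(config_path):
        with open(config_path, encoding="utf-8") as f:
            cli = json.load(f).get("ai_cli")
        if cli:
            return cli
    # 3. CODEBUDDY_CLI environment variable (legacy alias)
    if os.environ.get("CODEBUDDY_CLI"):
        return os.environ["CODEBUDDY_CLI"]
    # 4. Auto-detect: cbc first, then claude (lowest)
    for candidate in ("cbc", "claude"):
        if shutil.which(candidate):
            return candidate
    return None
```

Because each rule returns immediately, a higher-priority source always wins even when lower-priority ones are also set.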
@@ -227,6 +227,11 @@ if [[ "$USE_STREAM_JSON" == "true" ]]; then
  fi

  # Spawn AI CLI session
+ MODEL_FLAG=""
+ if [[ -n "${MODEL:-}" ]]; then
+ MODEL_FLAG="--model $MODEL"
+ fi
+
  case "$CLI_CMD" in
  *claude*)
  "$CLI_CMD" \
@@ -234,6 +239,7 @@ case "$CLI_CMD" in
  -p "$(cat "$BOOTSTRAP_PROMPT")" \
  --yes \
  $STREAM_JSON_FLAG \
+ $MODEL_FLAG \
  > "$SESSION_LOG" 2>&1 &
  ;;
  *)
@@ -241,6 +247,7 @@ case "$CLI_CMD" in
  --print \
  -y \
  $STREAM_JSON_FLAG \
+ $MODEL_FLAG \
  < "$BOOTSTRAP_PROMPT" \
  > "$SESSION_LOG" 2>&1 &
  ;;
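The `MODEL_FLAG` pattern in the hunks above deliberately leaves `$MODEL_FLAG` unquoted so that `--model` and the ID split into two argv words. A Python rendering of the same optional-flag logic makes that explicit (this is an editor's sketch mirroring the diff, not project code; `build_session_argv` is an invented name):

```python
def build_session_argv(cli_cmd, model=None, stream_json=False):
    """Build the CLI argv, appending --model <id> only when a model is set."""
    argv = [cli_cmd, "--print", "-y"]
    if stream_json:
        argv += ["--output-format", "stream-json"]
    if model:
        # Mirrors: [[ -n "${MODEL:-}" ]] && MODEL_FLAG="--model $MODEL"
        argv += ["--model", model]
    return argv
```

An argv list sidesteps the word-splitting the shell version relies on, which would misbehave if a model ID ever contained whitespace.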
@@ -228,6 +228,11 @@ python3 "$SCRIPTS_DIR/update-feature-status.py" \
  --action start >/dev/null 2>&1 || true

  # Spawn AI CLI session
+ MODEL_FLAG=""
+ if [[ -n "${MODEL:-}" ]]; then
+ MODEL_FLAG="--model $MODEL"
+ fi
+
  case "$CLI_CMD" in
  *claude*)
  "$CLI_CMD" \
@@ -235,6 +240,7 @@ case "$CLI_CMD" in
  -p "$(cat "$BOOTSTRAP_PROMPT")" \
  --yes \
  $STREAM_JSON_FLAG \
+ $MODEL_FLAG \
  > "$SESSION_LOG" 2>&1 &
  ;;
  *)
@@ -242,6 +248,7 @@ case "$CLI_CMD" in
  --print \
  -y \
  $STREAM_JSON_FLAG \
+ $MODEL_FLAG \
  < "$BOOTSTRAP_PROMPT" \
  > "$SESSION_LOG" 2>&1 &
  ;;
@@ -79,6 +79,11 @@ spawn_and_wait_session() {
  stream_json_flag="--output-format stream-json"
  fi

+ local model_flag=""
+ if [[ -n "${MODEL:-}" ]]; then
+ model_flag="--model $MODEL"
+ fi
+
  case "$CLI_CMD" in
  *claude*)
  "$CLI_CMD" \
@@ -87,6 +92,7 @@ spawn_and_wait_session() {
  --yes \
  $verbose_flag \
  $stream_json_flag \
+ $model_flag \
  > "$session_log" 2>&1 &
  ;;
  *)
@@ -95,6 +101,7 @@ spawn_and_wait_session() {
  -y \
  $verbose_flag \
  $stream_json_flag \
+ $model_flag \
  < "$bootstrap_prompt" \
  > "$session_log" 2>&1 &
  ;;
@@ -20,6 +20,7 @@ set -euo pipefail
  # AI_CLI AI CLI command name (override; also readable from .prizmkit/config.json)
  # CODEBUDDY_CLI Legacy alias for AI_CLI (deprecated, use AI_CLI instead)
  # PRIZMKIT_PLATFORM Force platform: 'codebuddy' or 'claude' (auto-detected)
+ # MODEL AI model to use (e.g. claude-opus-4.6, claude-sonnet-4.6, claude-haiku-4.5)
  # VERBOSE Set to 1 to enable --verbose on AI CLI (shows subagent output)
  # HEARTBEAT_INTERVAL Heartbeat log interval in seconds (default: 30)
  # HEARTBEAT_STALE_THRESHOLD Heartbeat stale threshold in seconds (default: 600)
@@ -41,6 +42,7 @@ LOG_CLEANUP_ENABLED=${LOG_CLEANUP_ENABLED:-1}
  LOG_RETENTION_DAYS=${LOG_RETENTION_DAYS:-14}
  LOG_MAX_TOTAL_MB=${LOG_MAX_TOTAL_MB:-1024}
  VERBOSE=${VERBOSE:-0}
+ MODEL=${MODEL:-""}

  # Source shared common helpers (CLI/platform detection + logs + deps)
  source "$SCRIPT_DIR/lib/common.sh"
@@ -91,6 +93,11 @@ spawn_and_wait_session() {
  stream_json_flag="--output-format stream-json"
  fi

+ local model_flag=""
+ if [[ -n "$MODEL" ]]; then
+ model_flag="--model $MODEL"
+ fi
+
  case "$CLI_CMD" in
  *claude*)
  # Claude Code: prompt via -p argument, --yes for auto-accept
@@ -100,6 +107,7 @@ spawn_and_wait_session() {
  --yes \
  $verbose_flag \
  $stream_json_flag \
+ $model_flag \
  > "$session_log" 2>&1 &
  ;;
  *)
@@ -109,6 +117,7 @@ spawn_and_wait_session() {
  -y \
  $verbose_flag \
  $stream_json_flag \
+ $model_flag \
  < "$bootstrap_prompt" \
  > "$session_log" 2>&1 &
  ;;
@@ -790,6 +799,7 @@ show_help() {
  echo " run [feature-list.json] Run all features sequentially"
  echo " run <feature-id> [options] Run a single feature"
  echo " status [feature-list.json] Show pipeline status"
+ echo " test-cli Test AI CLI: show detected CLI, version, and model"
  echo " reset Clear all state and start fresh"
  echo " help Show this help message"
  echo ""
@@ -805,6 +815,7 @@ show_help() {
  echo " MAX_RETRIES Max retries per feature (default: 3)"
  echo " SESSION_TIMEOUT Session timeout in seconds (default: 0 = no limit)"
  echo " AI_CLI AI CLI command name (auto-detected: cbc or claude)"
+ echo " MODEL AI model ID (e.g. claude-opus-4.6, claude-sonnet-4.6, claude-haiku-4.5)"
  echo " HEARTBEAT_INTERVAL Heartbeat log interval in seconds (default: 30)"
  echo " HEARTBEAT_STALE_THRESHOLD Heartbeat stale threshold in seconds (default: 600)"
  echo " LOG_CLEANUP_ENABLED Run log cleanup before execution (default: 1)"
@@ -820,6 +831,8 @@ show_help() {
  echo " ./run.sh run F-007 --clean --mode standard # Clean + run standard"
  echo " ./run.sh status # Show pipeline status"
  echo " MAX_RETRIES=5 SESSION_TIMEOUT=7200 ./run.sh run # Custom config"
+ echo " MODEL=claude-sonnet-4.6 ./run.sh run # Use Sonnet model"
+ echo " MODEL=claude-haiku-4.5 ./run.sh test-cli # Test with Haiku"
  }

  case "${1:-run}" in
@@ -843,6 +856,64 @@ case "${1:-run}" in
  --state-dir "$STATE_DIR" \
  --action status
  ;;
+ test-cli)
+ echo ""
+ echo "============================================"
+ echo " Dev-Pipeline AI CLI Test"
+ echo "============================================"
+ echo ""
+ echo " Detected CLI: $CLI_CMD"
+ echo " Platform: $PLATFORM"
+ if [[ -n "$MODEL" ]]; then
+ echo " Requested Model: $MODEL"
+ fi
+
+ # Get CLI version (first line only)
+ cli_version=$("$CLI_CMD" -v 2>&1 | head -1 || echo "unknown")
+ echo " CLI Version: $cli_version"
+ echo ""
+ echo " Querying AI model (headless mode)..."
+
+ test_prompt="What AI assistant/platform are you and what model are you running? Reply in one line, e.g. \"I'm Claude Code, Claude Opus x.x\". No extra text."
+
+ local_model_flag=""
+ if [[ -n "$MODEL" ]]; then
+ local_model_flag="--model $MODEL"
+ fi
+
+ # Run headless query with 30s timeout (background + kill pattern for macOS)
+ tmpfile=$(mktemp)
+ (
+ unset CLAUDECODE
+ case "$CLI_CMD" in
+ *claude*)
+ "$CLI_CMD" --print -p "$test_prompt" --dangerously-skip-permissions --no-session-persistence $local_model_flag > "$tmpfile" 2>/dev/null
+ ;;
+ *)
+ echo "$test_prompt" | "$CLI_CMD" --print -y $local_model_flag > "$tmpfile" 2>/dev/null
+ ;;
+ esac
+ ) &
+ query_pid=$!
+ ( sleep 30 && kill "$query_pid" 2>/dev/null ) &
+ timer_pid=$!
+ wait "$query_pid" 2>/dev/null
+ kill "$timer_pid" 2>/dev/null
+ wait "$timer_pid" 2>/dev/null || true
+
+ model_reply=$(cat "$tmpfile" 2>/dev/null | head -3)
+ rm -f "$tmpfile"
+
+ if [[ -z "$model_reply" ]]; then
+ model_reply="(no response — CLI may require auth or is unavailable)"
+ fi
+
+ echo ""
+ echo " AI Response: $model_reply"
+ echo ""
+ echo "============================================"
+ echo ""
+ ;;
  reset)
  log_warn "Resetting pipeline state..."
  rm -rf "$STATE_DIR"
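The background-plus-killer-timer pattern in the `test-cli` block above exists because macOS does not ship GNU `timeout(1)`. For comparison, Python's `subprocess` module expresses the same bounded query in one call. This is an editor's illustrative equivalent, not part of the package; the argv passed in is whatever command line the caller has assembled:

```python
import subprocess


def query_model_identity(argv, prompt, timeout=30):
    """Run a CLI headlessly with the prompt on stdin, giving up after `timeout` seconds."""
    try:
        result = subprocess.run(
            argv, input=prompt, capture_output=True, text=True, timeout=timeout
        )
        reply = result.stdout.strip()
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # Timed out or the CLI binary is missing: fall through to the fallback message
        reply = ""
    return reply or "(no response — CLI may require auth or is unavailable)"
```

`subprocess.run(..., timeout=...)` kills the child and raises `TimeoutExpired`, which replaces the shell's manual `query_pid`/`timer_pid` bookkeeping.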
@@ -19,6 +19,7 @@ from datetime import datetime, timezone

  EXPECTED_SCHEMA = "dev-pipeline-feature-list-v1"
  FEATURE_ID_PATTERN = re.compile(r"^F-\d{3}$")
+ TERMINAL_STATUSES = {"completed", "failed", "skipped"}

  REQUIRED_FEATURE_FIELDS = [
  "id",
@@ -234,6 +235,12 @@ def create_state_directory(state_dir, feature_list_path, features):
  os.makedirs(abs_state_dir, exist_ok=True)
  os.makedirs(features_dir, exist_ok=True)

+ # Count features already in terminal status at init time
+ completed_count = sum(
+ 1 for f in features
+ if isinstance(f, dict) and f.get("status") in TERMINAL_STATUSES
+ )
+
  # Write pipeline.json
  pipeline_state = {
  "run_id": run_id,
@@ -241,7 +248,7 @@ def create_state_directory(state_dir, feature_list_path, features):
  "feature_list_path": abs_feature_list_path,
  "created_at": now,
  "total_features": len(features),
- "completed_features": 0,
+ "completed_features": completed_count,
  }
  pipeline_path = os.path.join(abs_state_dir, "pipeline.json")
  with open(pipeline_path, "w", encoding="utf-8") as f:
@@ -260,9 +267,13 @@ def create_state_directory(state_dir, feature_list_path, features):
  sessions_dir = os.path.join(feature_dir, "sessions")
  os.makedirs(sessions_dir, exist_ok=True)

+ # Respect existing terminal status from feature-list.json
+ fl_status = feature.get("status", "pending")
+ init_status = fl_status if fl_status in TERMINAL_STATUSES else "pending"
+
  feature_status = {
  "feature_id": fid,
- "status": "pending",
+ "status": init_status,
  "retry_count": 0,
  "max_retries": 3,
  "sessions": [],
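The `completed_count` logic added above can be exercised in isolation. A minimal sketch of the same counting rule, with the `TERMINAL_STATUSES` constant copied from the diff and the wrapper function name invented for the example:

```python
TERMINAL_STATUSES = {"completed", "failed", "skipped"}


def count_terminal(features):
    """Count features whose feature-list status is already terminal.

    Non-dict entries are skipped, matching the isinstance guard in the diff.
    """
    return sum(
        1 for f in features
        if isinstance(f, dict) and f.get("status") in TERMINAL_STATUSES
    )
```

Seeding `completed_features` this way keeps `pipeline.json` accurate when a feature list is re-initialized with some features already done.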
@@ -109,10 +109,15 @@ def now_iso():
  return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


- def load_feature_status(state_dir, feature_id):
+ def load_feature_status(state_dir, feature_id, feature_list_status=None):
  """Load the status.json for a feature.

  If the file does not exist, return a default pending status.
+
+ If feature_list_status is a terminal status (completed, failed, skipped),
+ it overrides the status field from status.json. This makes feature-list.json
+ the single source of truth for terminal statuses, while all other fields
+ (retry_count, sessions, etc.) still come from status.json.
  """
  status_path = os.path.join(
  state_dir, "features", feature_id, "status.json"
@@ -134,7 +139,7 @@ def load_feature_status(state_dir, feature_id):
  if err:
  # If we can't read it, treat as pending
  now = now_iso()
- return {
+ data = {
  "feature_id": feature_id,
  "status": "pending",
  "retry_count": 0,
@@ -145,6 +150,9 @@ def load_feature_status(state_dir, feature_id):
  "created_at": now,
  "updated_at": now,
  }
+ # feature-list.json wins for terminal statuses
+ if feature_list_status in TERMINAL_STATUSES:
+ data["status"] = feature_list_status
  return data


@@ -303,7 +311,7 @@ def action_get_next(feature_list_data, state_dir):
  fid = feature.get("id")
  if not fid:
  continue
- fs = load_feature_status(state_dir, fid)
+ fs = load_feature_status(state_dir, fid, feature.get("status"))
  status_map[fid] = fs.get("status", "pending")
  status_data_map[fid] = fs

@@ -574,7 +582,7 @@ def _format_duration(seconds):
  return "{}h{}m".format(h, m)


- def _estimate_remaining_time(features, state_dir, counts):
+ def _estimate_remaining_time(features, state_dir, counts, feature_list_data=None):
  """Estimate remaining time from the historical durations of completed features, weighted by complexity.

  Strategy:
@@ -588,6 +596,13 @@ def _estimate_remaining_time(features, state_dir, counts):
  # Complexity weights (used for estimation when no historical data exists)
  COMPLEXITY_WEIGHT = {"low": 1.0, "medium": 2.0, "high": 4.0}

+ # Build feature-list status map for terminal status override
+ fl_status_map = {}
+ if feature_list_data:
+ for f in feature_list_data.get("features", []):
+ if isinstance(f, dict) and f.get("id"):
+ fl_status_map[f["id"]] = f.get("status")
+
  # Collect durations of completed features, grouped by complexity
  duration_by_complexity = {} # complexity -> [duration_seconds]
  feature_complexity_map = {} # feature_id -> complexity
@@ -608,7 +623,7 @@ def _estimate_remaining_time(features, state_dir, counts):
  fid = feature.get("id")
  if not fid:
  continue
- fs = load_feature_status(state_dir, fid)
+ fs = load_feature_status(state_dir, fid, fl_status_map.get(fid))
  if fs.get("status") != "completed":
  continue
  duration = _calc_feature_duration(state_dir, fid)
@@ -638,7 +653,7 @@ def _estimate_remaining_time(features, state_dir, counts):
  fid = feature.get("id")
  if not fid:
  continue
- fs = load_feature_status(state_dir, fid)
+ fs = load_feature_status(state_dir, fid, fl_status_map.get(fid))
  fstatus = fs.get("status", "pending")
  if fstatus in TERMINAL_STATUSES:
  continue
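The docstring above describes a complexity-weighted remaining-time estimate: use per-complexity historical averages where they exist, otherwise scale an overall average by the `COMPLEXITY_WEIGHT` factors. The diff only shows fragments of that function, so the following is one plausible reading by the editor, not the package's code; `estimate_remaining` and its argument shapes are invented, while `COMPLEXITY_WEIGHT` is copied from the diff:

```python
COMPLEXITY_WEIGHT = {"low": 1.0, "medium": 2.0, "high": 4.0}


def estimate_remaining(completed_durations, remaining_complexities):
    """Estimate remaining seconds from (complexity, seconds) history pairs.

    Per-complexity averages are preferred; a complexity with no history
    falls back to the overall average scaled by its weight.
    """
    by_complexity = {}
    for complexity, seconds in completed_durations:
        by_complexity.setdefault(complexity, []).append(seconds)
    averages = {c: sum(v) / len(v) for c, v in by_complexity.items()}
    if not averages:
        return None  # no history at all: cannot estimate
    overall = sum(s for _, s in completed_durations) / len(completed_durations)
    total = 0.0
    for complexity in remaining_complexities:
        if complexity in averages:
            total += averages[complexity]
        else:
            total += overall * COMPLEXITY_WEIGHT.get(complexity, 1.0)
    return total
```

With two completed "low" features at 100 s and 200 s, a remaining "low" feature is estimated at the 150 s average, while a remaining "high" feature with no history gets 150 s scaled by its 4.0 weight.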
@@ -694,7 +709,7 @@ def action_status(feature_list_data, state_dir):
  fid = feature.get("id")
  if not fid:
  continue
- fs = load_feature_status(state_dir, fid)
+ fs = load_feature_status(state_dir, fid, feature.get("status"))
  status_map[fid] = fs.get("status", "pending")

  for feature in features:
@@ -705,7 +720,7 @@ def action_status(feature_list_data, state_dir):
  if not fid:
  continue

- fs = load_feature_status(state_dir, fid)
+ fs = load_feature_status(state_dir, fid, feature.get("status"))
  fstatus = fs.get("status", "pending")
  retry_count = fs.get("retry_count", 0)
  max_retries_val = fs.get("max_retries", 3)
@@ -797,7 +812,7 @@ def action_status(feature_list_data, state_dir):

  # Estimate remaining time
  est_remaining, confidence = _estimate_remaining_time(
- features, state_dir, counts
+ features, state_dir, counts, feature_list_data
  )

  summary_line = "Total: {} features | Completed: {} | In Progress: {}".format(
@@ -1,5 +1,5 @@
  {
- "version": "1.0.26",
+ "version": "1.0.28",
  "skills": {
  "prizm-kit": {
  "description": "Full-lifecycle dev toolkit. Covers spec-driven development, Prizm context docs, code quality, debugging, deployment, and knowledge management.",
@@ -197,13 +197,6 @@
  "hasAssets": false,
  "hasScripts": false
  },
- "refactor-skill": {
- "description": "Intelligent refactor review for existing skills with in-place vs v2 optimization and mandatory eval+graphical review.",
- "tier": "companion",
- "category": "Custom-skill",
- "hasAssets": false,
- "hasScripts": false
- },
  "bug-planner": {
  "description": "Interactive bug planning that produces bug-fix-list.json. Supports stack traces, user reports, failed tests, log patterns, monitoring alerts.",
  "tier": "companion",
@@ -255,7 +248,6 @@
  "prizmkit-retrospective",
  "feature-workflow",
  "refactor-workflow",
- "refactor-skill",
  "app-planner",
  "bug-planner",
  "dev-pipeline-launcher",
@@ -99,9 +99,34 @@ Detect user intent from their message, then follow the corresponding workflow:
  --action status 2>/dev/null
  ```

- 4. **Ask user to confirm**: "Ready to launch the pipeline? It will process N features in the background."
+ 4. **Ask execution mode**: Present the user with a choice before launching:
+ - **(1) Background daemon (recommended)**: Pipeline runs fully detached via `launch-daemon.sh`. Survives session closure.
+ - **(2) Foreground in session**: Pipeline runs in the current session via `run.sh run`. Visible output but will stop if session times out.
+ - **(3) Manual — show commands**: Display the exact commands the user can run themselves. No execution.

- 5. **Launch**:
+ Default to option 1 if user says "just run it" or doesn't specify.
+
+ **If option 2 (foreground)**:
+ ```bash
+ dev-pipeline/run.sh run feature-list.json
+ ```
+ Note: This will block the session. Warn user about timeout risk.
+
+ **If option 3 (manual)**: Print commands and stop. Do not execute anything.
+ ```
+ # To run in background (recommended):
+ dev-pipeline/launch-daemon.sh start feature-list.json
+
+ # To run in foreground:
+ dev-pipeline/run.sh run feature-list.json
+
+ # To check status:
+ dev-pipeline/launch-daemon.sh status
+ ```
+
+ 5. **Ask user to confirm**: "Ready to launch the pipeline? It will process N features in the background."
+
+ 6. **Launch**:
  ```bash
  dev-pipeline/launch-daemon.sh start feature-list.json
  ```
@@ -110,18 +135,18 @@ Detect user intent from their message, then follow the corresponding workflow:
  dev-pipeline/launch-daemon.sh start feature-list.json --env "SESSION_TIMEOUT=7200 MAX_RETRIES=5"
  ```

- 6. **Verify launch**:
+ 7. **Verify launch**:
  ```bash
  dev-pipeline/launch-daemon.sh status
  ```

- 7. **Start log monitoring** -- Use the Bash tool with `run_in_background: true`:
+ 8. **Start log monitoring** -- Use the Bash tool with `run_in_background: true`:
  ```bash
  tail -f dev-pipeline/state/pipeline-daemon.log
  ```
  This runs in background so you can continue interacting with the user.

- 8. **Report to user**:
+ 9. **Report to user**:
  - Pipeline PID
  - Log file location
  - "You can ask me 'pipeline status' or 'show logs' at any time"
@@ -144,12 +144,19 @@ Add new features to an existing project (incremental mode).
  Proceed? (Y/n)
  ```

- 2. **Invoke `dev-pipeline-launcher` skill**:
+ 2. **Ask execution mode**: Before invoking the launcher, present the choice:
+ - **(1) Background daemon (recommended)**: Runs detached, survives session closure.
+ - **(2) Foreground in session**: Runs in current session with visible output. Stops if session times out.
+ - **(3) Manual — show commands**: Display commands only, no execution.
+
+ Pass the chosen mode to `dev-pipeline-launcher`.
+
+ 3. **Invoke `dev-pipeline-launcher` skill**:
  - The launcher handles all prerequisite checks
  - Starts `launch-daemon.sh` in background
  - Returns PID and log file location

- 3. **Verify launch success**:
+ 4. **Verify launch success**:
  - Confirm pipeline is running
  - Record PID and log path for Phase 3

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "prizmkit",
- "version": "1.0.26",
+ "version": "1.0.28",
  "description": "Create a new PrizmKit-powered project with clean initialization — no framework dev files, just what you need.",
  "type": "module",
  "bin": {
package/src/index.js CHANGED
@@ -45,7 +45,6 @@ export async function runScaffold(directory, options) {
  const cliStatus = [
  detected.cbc ? chalk.green('cbc ✓') : chalk.gray('cbc ✗'),
  detected.claude ? chalk.green('claude ✓') : chalk.gray('claude ✗'),
- detected.claudeInternal ? chalk.green('claude-internal ✓') : chalk.gray('claude-internal ✗'),
  ].join(' ');
  console.log(` Detected CLI tools: ${cliStatus}`);
  console.log(` Target directory: ${projectRoot}`);
@@ -1,371 +0,0 @@
1
- ---
2
- name: "refactor-skill"
3
- tier: companion
4
- description: "Intelligent refactor review for existing skills: evaluates quality, proposes in-place upgrade, and enforces eval + graphical review after changes. (project) by using the newest skill-creator standard"
5
- ---
6
-
7
- # Refactor Skill
8
-
9
- Specialized workflow for reviewing and upgrading existing skills with measurable quality gates.
10
-
11
- ## When to Use
12
-
13
- Use this skill when user says:
14
- - "重构这个 skill", "优化技能设计", "评审并升级技能"
15
- - "review this skill and improve it"
16
- - "keep the same skill but improve quality"
17
- - "原地升级这个 skill" / "in-place upgrade this skill"
18
-
19
- Do NOT use when user only wants to run a pipeline immediately without changing skill design.
20
-
21
- ## Core Goals
22
-
23
- 1. Review current skill comprehensively and find concrete improvement points.
24
- 2. **Default and preferred mode: in-place upgrade** of the existing skill.
25
- 3. New-version fork (e.g., `-v2`) is **exception-only** and must be explicitly requested by user.
26
- 4. After any modification, **must** run standardized evaluation and graphical review.
27
-
28
- ## Context Readiness Gate (Mandatory)
29
-
30
- Before any refactor action, verify whether conversation context already contains:
31
- - target skill name/path
32
- - current project/workspace path
33
- - refactor objective and constraints (quality, speed, compatibility)
34
- - whether user explicitly requests a new-version fork (default is in-place upgrade)
35
-
36
- If any item is missing, do not block; gather context proactively:
37
- 1. Read `/core/skills/_metadata.json` to locate target skill and related neighbors.
38
- 2. Read target `SKILL.md` plus key assets/scripts under that skill.
39
- 3. Check recent evaluation artifacts under `/.codebuddy/skill-evals/` if present.
40
- 4. Ask only the minimum unresolved question(s).
41
-
42
- ## Review Dimensions (Mandatory Rubric)
43
-
44
- Assess and score each dimension (1-5):
45
-
46
- 1. **功能性 (Functionality)**
47
- - Trigger clarity and routing correctness
48
- - Workflow completeness and error recovery
49
- - Output contract correctness (schema/format compatibility)
50
-
51
- 2. **效率性 (Efficiency)**
52
- - Unnecessary steps, token/time overhead
53
- - Reusability of scripts/assets
54
- - Fast-path design and fallback strategy
55
-
56
- 3. **可维护性 (Maintainability)**
57
- - Instruction structure/readability
58
- - Coupling to environment and path robustness
59
- - Testability and observability (artifacts, checkpoints)
60
-
61
- Output a concise review summary with:
62
- - strengths
63
- - prioritized issues (P0/P1/P2)
64
- - expected impact for each fix
65
-
66
- ## Optimization Strategy Selection
67
-
68
- After review, apply **in-place upgrade** by default.
69
-
70
- ### Default Mode — In-Place Upgrade (Required Unless Explicitly Overridden)
71
- Use in almost all cases:
72
- - skill naming and contract remain stable
73
- - change scope is moderate or large but compatible
74
- - backward compatibility is required
75
-
76
- Actions:
77
- 1. edit existing skill files in place
78
- 2. preserve skill name/frontmatter compatibility
79
- 3. keep migration notes minimal
80
-
81
- ### Exception Mode — New Version via `skill-creator` (Explicit User Request Only)
82
- Only use when user clearly asks to fork a new version (e.g., `create <skill>-v2`).
83
-
84
- Additional required checks before using exception mode:
85
- - user confirms they need side-by-side old/new variants
86
- - user accepts added maintenance cost for two versions
87
-
88
- Actions:
89
- 1. copy current skill as baseline snapshot
90
- 2. create `<skill-name>-v2` and apply redesign
91
- 3. run evaluation against old version baseline
92
-
93
- ## Mandatory Post-Change Validation, Review, and Optimization Loop
94
-
95
- Run this full loop after **every** refactor (default: in-place; exception: new-version fork). Do not skip.
96
-
97
- ### Step 0: Freeze Refactor Scope (Input Gate)
98
- Capture and freeze:
99
- - target skill path
100
- - iteration id (`iteration-N`)
101
- - baseline type (`old-snapshot` preferred, fallback `without_skill`)
102
- - optimization goal for this round (quality, token, latency, or compatibility)
103
-
104
- Expected output:
105
- - one-line run plan: `skill + baseline + iteration + goal`
106
-
107
- ### Step 1: Structural Validation (Pre-Eval)
108
- Validate skill structure/frontmatter and required files first.
109
-
110
- Expected output:
111
- - validation pass/fail result
112
- - blocking fix list if failed
113
-
114
- ### Step 2: Execute Standardized Eval Runs (Mandatory)
115
- Create iteration workspace:
116
- - `/.codebuddy/skill-evals/<skill-name>-workspace/iteration-N/`
117
-
118
- Run both configurations for the same eval set in the same iteration:
119
- - `with_skill` (updated skill)
120
- - `baseline` (old snapshot or `without_skill`)
121
-
122
- Use multi-run strategy:
123
- - default: 3 runs (fast feedback)
124
- - release gate: 5 runs (stability check)
125
-
126
- Required artifacts per run:
127
- - `outputs/`
128
- - `timing.json`
129
- - `grading.json`
130
- - `eval_metadata.json` (per eval directory)
131
-
132
- Expected output:
133
- - complete run tree with paired `with_skill` vs `baseline` runs
134
- - no missing required artifact files
135
-
136
- ### Step 3: Score, Aggregate, and Build Benchmark
137
- Run grading and aggregation using standardized scripts. Keep metrics comparable across iterations.
138
-
139
- Required outputs:
140
- - `benchmark.json`
141
- - `benchmark.md`
142
-
143
- Required metrics:
144
- - pass rate
145
- - duration
146
- - token usage
147
- - with_skill vs baseline delta
148
- - variance (`stddev`) for stability judgment
149
-
150
- Expected output:
151
- - benchmark summary with clear win/lose/neutral conclusion per metric
152
-
153
- ### Step 4: Graphical Review (Mandatory via `generate_review`)
154
- Generate review UI using official `generate_review.py` (no custom viewer).
155
-
156
- Preferred modes:
157
- 1. server mode for interactive inspection
158
- 2. static HTML mode (`--static`) for headless fallback
159
-
160
- Expected output:
161
- - review entry recorded (URL or HTML path)
162
- - quick notes on representative good/bad runs linked to evidence
163
-
164
### Step 5: Analyze Results and Derive Optimization Actions
Translate benchmark and viewer evidence into prioritized actions:
- **P0**: contract/validation breakages
- **P1**: quality instability or high variance
- **P2**: token/time inefficiencies

For each action, define:
- root cause hypothesis
- exact file/section to modify
- expected metric impact
- rollback condition

Expected output:
- actionable optimization list (not generic advice)

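A single action entry might look like the following. This is a sketch, not a framework schema; the field names simply mirror the four attributes required above:

```json
{
  "priority": "P1",
  "action": "Constrain output field names in the skill's output contract",
  "root_cause_hypothesis": "Prompt permits free-form keys, causing grading variance",
  "target": "core/skills/<skill-name>/SKILL.md (Output Contract section)",
  "expected_impact": "pass-rate stddev reduced by half over 3 runs",
  "rollback_condition": "pass rate regresses on any eval in the re-run"
}
```
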
### Step 6: Implement Targeted Improvements
Apply only the actions selected for this iteration. Avoid mixing in unrelated changes, so that causal attribution stays clear.

Expected output:
- focused diff scoped to the chosen actions

### Step 7: Re-Run Evaluation and Compare Iterations
Re-run Steps 2–4 on the updated skill and compare against the previous iteration.

Decision rule:
- if goals are met with no regression: accept the iteration
- if partially improved: keep the gains and open the next iteration with a narrowed scope
- if regressed: roll back, or revise the hypothesis and repeat

Expected output:
- iteration verdict (`accepted` / `needs-next-iteration` / `rollback`)
- before/after comparison table

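The decision rule reduces to a small function. A sketch; how `goals_met` and `any_regression` are computed from the benchmark deltas is a project-specific assumption:

```python
def iteration_verdict(goals_met: bool, any_regression: bool) -> str:
    """Map Step 7 comparison outcomes onto an iteration verdict."""
    if any_regression:
        # regression: roll back (or revise the hypothesis and repeat)
        return "rollback"
    if goals_met:
        # goals met with no regression: accept the iteration
        return "accepted"
    # partial improvement: keep the gains, reopen with a narrowed scope
    return "needs-next-iteration"
```
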
### Step 8: Close the Loop (Mandatory Delivery)
Return:
1. what changed
2. measured impact (pass/time/tokens/variance deltas)
3. viewer entry
4. remaining risks
5. next iteration plan (if needed)

This closes the loop from **test review → evidence analysis → skill optimization → re-validation**.

### Standard Command Blueprint (Project-level)
Use the one-command review pipeline, with an optional grader hook:

```bash
npm run skill:review -- \
  --workspace /abs/.codebuddy/skill-evals/<skill-name>-workspace \
  --iteration iteration-N \
  --skill-name <skill-name> \
  --skill-path /abs/core/skills/<skill-name> \
  --runs 3 \
  --grader-cmd "python3 /abs/scripts/skill-evals/grade-eval-runs.py --workspace {workspace} --iteration {iteration} --validator /abs/core/skills/<skill-name>/scripts/validate-and-generate.py --baseline-input /abs/.codebuddy/skill-evals/<skill-name>-workspace/inputs/feature-list-existing.json"
```

Minimum expected deliverables per iteration:
- `<workspace>/<iteration>/benchmark.json`
- `<workspace>/<iteration>/benchmark.md`
- `<workspace>/<iteration>/review.html`
- optimization action list with priority and owner

## Execution Notes for `skill-creator` Integration

When available, follow the latest `skill-creator` evaluation/viewer workflow as the source of truth:
- parallelized run spawning (`with_skill` + `baseline`)
- assertion-based grading format compatibility
- benchmark aggregation via the official script
- viewer generation via the official script

## Output Contract of This Skill

After completion, return:
1. selected mode (`in-place` by default, or `new-version` if explicitly requested) and why
2. files changed/created
3. review rubric scores, before vs after
4. benchmark summary (pass/time/tokens delta)
5. graphical review entry (URL or static HTML path)
6. remaining risks and next-iteration suggestions

## Error Handling

- Missing target skill path: auto-discover under `/core/skills/`, then confirm.
- Missing baseline snapshot: create one before making modifications.
- Incomplete eval: mark the status as blocked and list the missing artifacts.
- Viewer runtime incompatibility: switch to `--static` mode and continue.

## Skill Registry Modification Guide

When adding or removing skills from the framework, follow this reference checklist.

### Adding a New Skill

**Step 1: Create skill definition**
```
core/skills/<skill-name>/SKILL.md    # Required: skill definition with frontmatter
core/skills/<skill-name>/assets/     # Optional: templates, configs, etc.
core/skills/<skill-name>/scripts/    # Optional: executable scripts
```
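A minimal `SKILL.md` might look like the following. This is a sketch under assumptions: the exact frontmatter fields the framework requires are enforced by its validation tests, so treat `name` and `description` as placeholders to verify against those tests:

```markdown
---
name: <skill-name>
description: Brief description of when and how to use this skill
---

# <skill-name>

Instructions the agent follows when this skill is activated.
```
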

**Step 2: Register in metadata**

Edit `core/skills/_metadata.json`:
```jsonc
{
  "skills": {
    "<skill-name>": {
      "description": "Brief description of the skill",
      "tier": "1",          // "foundation", "1", "2", or "companion"
      "category": "core",   // "core", "quality", "devops", "debugging", "documentation", "pipeline"
      "hasAssets": false,   // true if assets/ directory exists
      "hasScripts": false   // true if scripts/ directory exists
    }
  }
}
```

**Step 3: (Optional) Add to a suite**

If the skill belongs to the `core` or `minimal` suite, add it to the `suites` section in `_metadata.json`:
```json
{
  "suites": {
    "core": {
      "skills": ["<skill-name>", ...]
    }
  }
}
```

**Step 4: Regenerate derived artifacts**
```bash
# Update bundled directory for npm package
node scripts/bundle.js
```

**Step 5: Validate**
```bash
npm test
# or
node tests/validate-all.js
```

### Removing a Skill

**Step 1: Delete the skill directory**
```bash
rm -rf core/skills/<skill-name>/
```

**Step 2: Remove from metadata**

Edit `core/skills/_metadata.json`:
- Remove the entry from the `skills` object
- Remove it from any `suites` that reference it

**Step 3: Regenerate derived artifacts**
```bash
node scripts/bundle.js
```

**Step 4: Validate**
```bash
npm test
```

### Modification Checklist Summary

| File | Action | Required |
|------|--------|----------|
| `core/skills/<name>/SKILL.md` | Create/Delete | **Always** |
| `core/skills/_metadata.json` → `skills` | Add/Remove entry | **Always** |
| `core/skills/_metadata.json` → `suites` | Add/Remove from suite | If the skill belongs to a suite |
| `core/skills/<name>/assets/` | Create/Delete | If the skill has resources |
| `core/skills/<name>/scripts/` | Create/Delete | If the skill has scripts |
| `create-prizmkit/bundled/` | Regenerate via script | Automatic |

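The checklist lends itself to a quick consistency check. A sketch of a hypothetical helper (not one of the framework's bundled validators) that cross-checks `_metadata.json` against the on-disk skill directories:

```python
import json
from pathlib import Path

def registry_problems(skills_root: Path) -> list[str]:
    """Cross-check _metadata.json entries against on-disk skill directories."""
    meta = json.loads((skills_root / "_metadata.json").read_text())
    problems = []
    for name, info in meta["skills"].items():
        skill_dir = skills_root / name
        if not (skill_dir / "SKILL.md").is_file():
            problems.append(f"{name}: missing SKILL.md")
        if info.get("hasAssets", False) != (skill_dir / "assets").is_dir():
            problems.append(f"{name}: hasAssets flag does not match disk")
        if info.get("hasScripts", False) != (skill_dir / "scripts").is_dir():
            problems.append(f"{name}: hasScripts flag does not match disk")
    # Every suite member must exist in the skills object.
    for suite, spec in meta.get("suites", {}).items():
        for member in spec.get("skills", []):
            if member not in meta["skills"]:
                problems.append(f"suite {suite}: unknown skill {member}")
    return problems
```

An empty return value means the registry and the directory tree agree; anything else is a checklist step that was skipped.
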

### Documents That May Need Number Updates

**Recommendation: avoid hardcoding skill counts.** Use relative descriptions instead:
- ✅ "All skills" instead of "34 skills"
- ✅ "Core Tier 1 skills" instead of "Core Tier 1 skills (17 skills)"
- ✅ "symlink (skills)" instead of "symlink (35 skills)"

If you must include counts, maintain them in one place (`_metadata.json`) and update all references together.

Files that currently contain hardcoded skill counts:

| File | Current Pattern | Suggested Fix |
|------|-----------------|---------------|
| `README.md` | "**N Skills** covering..." | Remove the number, or use "Skills" |
| `CODEBUDDY.md` | "**N Skills** — ..." | Remove the number |
| `PK-Construct-Guide.md` | "N skills — each skill..." | Remove the number |
| `PK-Evolving-User-Guide.md` | "symlink (N skills)" | Use "symlink (skills)" |
| `core/skills/prizm-kit/SKILL.md` | "## Skill Inventory (N skills)" | Use "## Skill Inventory" |
| `core/skills/_metadata.json` | `"description": "All N skills"` | Use `"description": "All skills"` |

To find hardcoded numbers:
```bash
grep -rn "[0-9]\+ skills\|[0-9]\+ Skills" --include="*.md" --include="*.json" .
```

## Path Rules

- Prefer absolute paths in execution commands.
- Keep path references in instructions portable where possible (e.g., `${SKILL_DIR}` for intra-skill files).
- Never delete the `.codebuddy` directory.