npm - devlyn-cli - Versions diffs - 1.14.0 → 2.0.0 - Mend

devlyn-cli 1.14.0 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (148)

package/config/skills/devlyn:resolve/references/phases/plan.md ADDED

@@ -0,0 +1,42 @@
+# PHASE 1 — PLAN (canonical body)
+The per-engine adapter header from `_shared/adapters/<model>.md` is prepended at runtime. This file is engine-agnostic.
+<role>
+You translate a spec or generated criteria into a concrete plan: the file list to touch, the risks the implementation must navigate, and a verbatim restatement of what acceptance requires. The plan is the contract IMPLEMENT executes against.
+</role>
+<input>
+- Source: `pipeline.state.json:source.spec_path` (real spec) or `state.source.criteria_path` (`.devlyn/criteria.generated.md`).
+- Codebase at `state.base_ref.sha`.
+- For free-form mode: also `state.complexity` (trivial / medium / large) — informs depth.
+</input>
+<output>
+Write `.devlyn/plan.md` with three sections:
+1. **Files to touch** — explicit list. Each entry: path, change type (`new` / `edit` / `delete`), one-line rationale tied to a specific Requirement.
+2. **Risks** — out-of-scope expansions to refuse, ambiguous spec sections to interpret strictly, known failure modes for this language/framework.
+3. **Acceptance restatement** — verbatim copy of the spec's `## Verification` block (or generated criteria's equivalent). The plan is wrong if any verification command later fails because of a planning oversight.
+Also update `pipeline.state.json:phases.plan.{verdict, completed_at, duration_ms}`. Verdict: `PASS` if plan is shippable; `BLOCKED` if spec is internally contradictory or cannot be planned without violating constraints.
+</output>
+<quality_bar>
+- Scope first, then implementation. Decide what files to touch before deciding how to implement. Files not in the list are off-limits to IMPLEMENT.
+- Tooling artifacts and reporter output are not deliverables unless the spec lists them. Plan to configure tools to emit to gitignored paths.
+- Existing tests are contract. Plan to extend them; do not plan to remove or weaken them.
+- Spec frontmatter is read-only to PLAN and IMPLEMENT. The DOCS-style status flip happens in CLEANUP under a tight allowlist.
+- If a Requirement says "match the literal output X", restate the literal in the plan. Paraphrasing the contract here propagates into IMPLEMENT.
+</quality_bar>
+<runtime_principles>
+Read `_shared/runtime-principles.md` (Subtractive-first / Goal-locked / No-workaround / Evidence). Codex-routed phases receive the contract excerpt inlined:
+- Subtractive-first: prefer trimming an existing helper to introducing a new one. Pure-addition needs a cited prior failure mode or an explicit spec/user requirement.
+- Goal-locked: refuse "while I'm here" cleanups, speculative robustness, mid-flight re-scoping. Single test before any deviation: "did the user ask for this OR does the stated goal strictly require it?"
+- No-workaround: no `any`, no `@ts-ignore`, no silent `catch`, no hardcoded values, no helper scripts that bypass root cause.
+- Evidence: every claim cites file:line opened at planning time. Vague claims excluded.
+</runtime_principles>
+The task is: [orchestrator pastes the task description and spec context here]
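
The scope rule in the quality bar above (files not in the list are off-limits to IMPLEMENT) amounts to a set comparison between the planned paths and the paths a diff actually changed. The following is a minimal Python sketch of that idea, not code from the package: `PlanEntry` and `out_of_scope` are hypothetical names, the example paths are invented, and the entries are assumed to have been parsed out of `.devlyn/plan.md` already.

```python
from dataclasses import dataclass

# Hypothetical in-memory form of one "Files to touch" entry from .devlyn/plan.md.
@dataclass(frozen=True)
class PlanEntry:
    path: str
    change_type: str  # "new", "edit", or "delete", as plan.md requires
    rationale: str    # one line tied to a specific Requirement

VALID_CHANGE_TYPES = {"new", "edit", "delete"}

def out_of_scope(changed_paths, plan):
    """Return the changed paths that the plan never listed, sorted for stable output."""
    for entry in plan:
        if entry.change_type not in VALID_CHANGE_TYPES:
            raise ValueError(f"unknown change type: {entry.change_type!r}")
    planned = {entry.path for entry in plan}
    return sorted(set(changed_paths) - planned)

# Invented example data, for illustration only.
plan = [
    PlanEntry("src/cli.js", "edit", "Requirement 1: add --json flag"),
    PlanEntry("test/cli.test.js", "edit", "Requirement 1: cover --json output"),
]
print(out_of_scope(["src/cli.js", "README.md"], plan))  # prints ['README.md']
```

An empty result means the diff stayed inside the planned file list; anything else is a candidate for the scope finding that VERIFY describes.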

package/config/skills/devlyn:resolve/references/phases/verify.md ADDED

@@ -0,0 +1,69 @@
+# PHASE 5 — VERIFY (canonical body, fresh subagent context)
+Per-engine adapter header is prepended at runtime. **You are spawned with empty conversation context.** No carry-over from PLAN / IMPLEMENT / BUILD_GATE / CLEANUP. This is the structural guarantee of independence — the prompt body reinforces it but the spawn is what makes it real.
+<role>
+Independent quality layer. You answer one question: did the diff deliver what the spec said it would, with no scope creep, no quality regression, and no constraint violation? You produce findings only — you have no code-mutation tools.
+</role>
+<input>
+- `spec.md` (or `.devlyn/criteria.generated.md` for free-form mode) — the contract.
+- `spec.expected.json` — the mechanical acceptance contract per `_shared/expected.schema.json`.
+- The cumulative diff against `state.base_ref.sha`.
+- The spec hash (`state.source.spec_sha256`) — re-read the spec from disk and confirm the hash matches; if it does not, write `state.phases.verify.verdict: "BLOCKED"` with reason `spec_sha256_mismatch` and stop.
+You do NOT receive: PLAN, IMPLEMENT's reasoning, BUILD_GATE's findings, CLEANUP's allowlist negotiations. Reading those would compromise independence.
+</input>
+<sub_phases>
+### MECHANICAL (deterministic)
+Re-run the mechanical checks fresh, independent of BUILD_GATE's earlier run:
+1. `python3 .claude/skills/_shared/spec-verify-check.py` against the post-CLEANUP code.
+2. Re-scan `spec.expected.json.forbidden_patterns` against the diff (Python re.search; honor each pattern's `files` allowlist).
+3. Confirm `required_files` exist post-diff; confirm `forbidden_files` do not appear in the diff.
+4. Confirm `max_deps_added` is not exceeded (`git diff -- package.json` for Node; equivalent for other ecosystems).
+Emit findings to `.devlyn/verify-mechanical.findings.jsonl`. Each match = one finding. Severity from the pattern's `severity` field (disqualifier → CRITICAL, warning → MEDIUM).
+### JUDGE (fresh-context grading)
+Grade the diff against the spec on rubric axes:
+- **Spec compliance** — did every Requirement get an `evidence` record pointing at code that satisfies it?
+- **Scope** — does the diff touch only files PLAN listed (or the cleanup allowlist)? Out-of-scope file = HIGH finding `scope.out-of-scope-violation`.
+- **Quality** — does the implementation follow the framework's idiomatic patterns, or are there hand-rolled helpers replacing standard primitives? `design.unidiomatic-pattern` MEDIUM if so.
+- **Consistency** — internal style (naming, error shape, module boundaries) consistent with the surrounding code.
+For each finding, write file:line evidence. Do not paraphrase code; quote it.
+**Coverage check**: before declaring done, confirm you have evidence for every spec axis. If you could not exercise an axis (the spec asks for behavior X but the diff does not touch the code that produces X), set `state.verify.coverage_failed: true` and surface the missing-evidence finding rather than passing on assumption.
+**Anti-self-filter rule**: report every finding you observe, including ones you consider low-severity or low-confidence. Tag each with `confidence: high|medium|low` and let the harness's downstream filter rank them. Filtering at this stage suppresses recall.
+### Pair-mode (when triggered by orchestrator)
+When the orchestrator spawns a second VERIFY agent with the OTHER engine's adapter, both judgments are merged:
+- Any HIGH/CRITICAL finding either model surfaces is verdict-binding.
+- Lower-severity disagreements are logged but do not change the verdict.
+- The orchestrator handles merge; you only emit your own findings.
+</sub_phases>
+<output>
+- `.devlyn/verify-mechanical.findings.jsonl` — MECHANICAL findings.
+- `.devlyn/verify.findings.jsonl` — JUDGE findings.
+- `state.phases.verify.{verdict, completed_at, duration_ms, sub_verdicts: {mechanical, judge}, artifacts}`. Verdict: WORSE of the two sub-verdicts. `PASS` requires zero CRITICAL/HIGH findings AND coverage met.
+</output>
+<quality_bar>
+- Independence is structural (fresh context) and behavioral (no code mutation). Both must hold.
+- Quote, do not paraphrase. Findings without quoted file:line evidence are excluded.
+- Coverage > confidence. Missing-evidence findings outrank a confident "looks fine."
+</quality_bar>
+<runtime_principles>
+Read `_shared/runtime-principles.md`. VERIFY's discipline is "the spec is the contract, the diff is the evidence, the verdict is the comparison."
+</runtime_principles>
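
Step 2 of the MECHANICAL sub-phase (re-scan `forbidden_patterns` with `re.search`, honor each pattern's `files` allowlist, map severity) can be sketched roughly as follows. This is an illustration under assumptions, not the package's `spec-verify-check.py`: the `pattern` key name, glob matching via `fnmatch`, the shape of `added_lines`, and the keys of each finding are all hypothetical. Only the `files` and `severity` field names and the disqualifier/warning mapping come from the text above.

```python
import fnmatch
import re

# Mapping stated in verify.md: disqualifier -> CRITICAL, warning -> MEDIUM.
SEVERITY_MAP = {"disqualifier": "CRITICAL", "warning": "MEDIUM"}

def scan_forbidden(added_lines, forbidden_patterns):
    """Emit one finding per (rule, line) match.

    added_lines: iterable of (path, line_no, text) tuples taken from the diff.
    forbidden_patterns: list of dicts; the key names used here are assumptions.
    """
    findings = []
    for rule in forbidden_patterns:
        regex = re.compile(rule["pattern"])
        globs = rule.get("files") or []  # assumed: optional list of glob strings
        for path, line_no, text in added_lines:
            # Honor the per-pattern allowlist: skip files the rule does not cover.
            if globs and not any(fnmatch.fnmatch(path, g) for g in globs):
                continue
            if regex.search(text):
                findings.append({
                    "severity": SEVERITY_MAP[rule["severity"]],
                    "location": f"{path}:{line_no}",
                    "evidence": text,  # quoted, not paraphrased
                })
    return findings

# Invented example data, for illustration only.
rules = [{"pattern": r"@ts-ignore", "files": ["src/*.ts"], "severity": "disqualifier"}]
lines = [("src/a.ts", 3, "// @ts-ignore"), ("docs/notes.md", 1, "mentions @ts-ignore")]
for finding in scan_forbidden(lines, rules):
    print(finding["severity"], finding["location"])  # prints CRITICAL src/a.ts:3
```

Keeping the matched text in the finding follows the "quote, do not paraphrase" rule from the quality bar.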

package/config/skills/devlyn:resolve/references/state-schema.md ADDED

@@ -0,0 +1,106 @@
+# pipeline.state.json schema
+Single authoritative verdict source for `/devlyn:resolve`. The orchestrator branches on `state.phases.<name>.verdict` directly — never parses `.devlyn/*.findings.jsonl` for routing. Living document; bump `version` on a breaking change.
+## Top-level shape
+```json
+{
+  "version": "2.0",
+  "run_id": "rs-<UTC-timestamp>-<12-hex>",
+  "started_at": "2026-04-30T12:00:00Z",
+  "engine": "claude",
+  "mode": "spec",
+  "complexity": null,
+  "base_ref": { "branch": "main", "sha": "abc123..." },
+  "rounds": { "max_rounds": 4, "global": 0 },
+  "bypasses": [],
+  "implement_passed_sha": null,
+  "source": {
+    "type": "spec",
+    "spec_path": "docs/roadmap/phase-1/X.md",
+    "spec_sha256": "...",
+    "criteria_path": null,
+    "criteria_sha256": null
+  },
+  "criteria": [
+    { "id": "C1", "ref": "spec://requirements/0", "status": "pending", "evidence": [], "failed_by_finding_ids": [] }
+  ],
+  "phases": {
+    "plan": null,
+    "implement": null,
+    "build_gate": null,
+    "cleanup": null,
+    "verify": null,
+    "final_report": null
+  },
+  "verify": { "coverage_failed": false }
+}
+```
+## Field rules
+- **version** — string. Bump major on a breaking schema change.
+- **mode** — `"free-form" | "spec" | "verify-only"`.
+- **complexity** — `null | "trivial" | "medium" | "large"`. Free-form mode populates this; spec/verify-only mode leaves it null.
+- **engine** — `"claude" | "codex" | "auto"` initially; rewritten by engine-preflight if a downgrade fired.
+- **rounds.global** — incremented every fix-loop pass (BUILD_GATE → fix-loop OR VERIFY → fix-loop).
+- **bypasses** — array of phase names from `--bypass`. Valid: `"build-gate" | "cleanup"`. PLAN, IMPLEMENT, VERIFY are non-bypassable (orchestrator rejects at parse time).
+- **implement_passed_sha** — captured at end of PHASE 2; null until then. Activates the post-implement invariant for CLEANUP and VERIFY.
+- **criteria** — generated from spec's `## Requirements` checklist (one per `- [ ]`). `status: pending → implemented` is the legal transition. `failed_by_finding_ids` populates when VERIFY surfaces a finding tied to a criterion.
+- **verify.coverage_failed** — set by VERIFY's JUDGE sub-phase when a spec axis could not be exercised against the diff. Triggers pair-mode escalation when set.
+## Per-phase shape
+Each entry under `phases.<name>` (for `plan`, `implement`, `build_gate`, `cleanup`, `verify`, `final_report`):
+```json
+{
+  "started_at": "2026-04-30T12:00:01Z",
+  "completed_at": "2026-04-30T12:00:30Z",
+  "duration_ms": 29000,
+  "round": 0,
+  "triggered_by": null,
+  "verdict": "PASS",
+  "engine": "claude",
+  "model": "claude-opus-4-7",
+  "pre_sha": null,
+  "artifacts": { "findings_file": null, "log_file": null },
+  "sub_verdicts": null
+}
+```
+- `verdict` — `"PASS" | "PASS_WITH_ISSUES" | "FAIL" | "NEEDS_WORK" | "BLOCKED"`. PHASE 6 (FINAL_REPORT) writes its own verdict per the terminal-verdict precedence.
+- `triggered_by` — null on first run; one of `"build_gate" | "verify"` when the phase is a fix-loop respawn.
+- `pre_sha` — captured by orchestrator before CLEANUP and (if needed) other allowlist-enforced phases. Used to validate the post-spawn diff.
+- `sub_verdicts` — only populated for VERIFY: `{ "mechanical": "PASS|FAIL", "judge": "PASS|...", "pair_judge": "PASS|..." | null }`.
+## Write protocol
+1. **Before each phase spawn**: orchestrator writes `phases.<name>.{started_at, round, triggered_by}` and (when applicable) `pre_sha`.
+2. **After each agent returns**: orchestrator validates `verdict`, `completed_at`, `duration_ms`, `artifacts` are populated. Missing fields → orchestrator fills from observable state. Branching on a null verdict is undefined behavior.
+3. **Before archive** (PHASE 6 step 3): `phases.final_report.verdict` must be non-null. Archive prune skips runs whose final_report verdict is null (treated as in-flight).
+## Terminal verdict (PHASE 6)
+Precedence:
+1. `phases.<any>.verdict == "BLOCKED"` → terminal `BLOCKED:<reason>`.
+2. `phases.verify.verdict == "NEEDS_WORK"` after fix-loop exhaustion → terminal `NEEDS_WORK`.
+3. `phases.verify.verdict == "PASS_WITH_ISSUES"` → terminal `PASS_WITH_ISSUES`.
+4. `phases.verify.verdict == "PASS"` → terminal `PASS`.
+5. Verify-only mode: terminal = `phases.verify.verdict` directly (PHASES 1-4 are skipped).
+## Final-report shape
+Header: `run_id | engine | mode | complexity | verdict | wall_time_s`.
+Per-phase summary table: `phase | verdict | duration_ms | round | triggered_by | findings_count`.
+Findings table (post-IMPLEMENT phases only — they are findings-only): each finding's `severity | rule_id | file:line | message | confidence`.
+Follow-up notes: any `--continue-on-large` assumptions, any silent fallbacks (engine downgrade), any `state.verify.coverage_failed` axes.
+## Archive contract
+PHASE 6 step 4 moves `.devlyn/*` (excluding `.devlyn/runs/`) into `.devlyn/runs/<run_id>/`. The `.devlyn/runs/` directory keeps the last 10 completed runs (sorted by `started_at`). Best-effort prune; archive failure does not change the run's verdict.
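
The terminal-verdict precedence listed in the schema can be read as a short decision function. The sketch below is one possible reading in Python, not the orchestrator's implementation. It assumes a `reason` key on a blocked phase entry (the schema does not name that field), it assumes the caller has already established fix-loop exhaustion before treating `NEEDS_WORK` as terminal, and it returns `None` for combinations the precedence list does not cover.

```python
def terminal_verdict(state):
    """Apply the five precedence rules from state-schema.md, in listed order."""
    phases = state.get("phases") or {}

    # Rule 1: any BLOCKED phase wins.
    for entry in phases.values():
        if entry and entry.get("verdict") == "BLOCKED":
            reason = entry.get("reason", "unknown")  # 'reason' key is an assumption
            return f"BLOCKED:{reason}"

    # Rules 2-4: the verify verdict, checked in precedence order.
    verify_verdict = (phases.get("verify") or {}).get("verdict")
    for candidate in ("NEEDS_WORK", "PASS_WITH_ISSUES", "PASS"):
        if verify_verdict == candidate:
            return candidate

    # Rule 5: verify-only mode reports the verify verdict directly.
    if state.get("mode") == "verify-only":
        return verify_verdict

    return None  # not covered by the listed rules

# Invented example state, for illustration only.
state = {
    "mode": "spec",
    "phases": {"plan": {"verdict": "PASS"}, "verify": {"verdict": "PASS_WITH_ISSUES"}},
}
print(terminal_verdict(state))  # prints PASS_WITH_ISSUES
```

Because the function reads only `state.phases.<name>.verdict`, it stays consistent with the rule that routing never parses the findings files.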

package/{config/skills → optional-skills}/devlyn:design-system/SKILL.md RENAMED

@@ -1,4 +1,5 @@
 ---
+name: devlyn:design-system
 description: Extract all design values from selected style for exact reproduction
 argument-hint: <style-number> [platform] (e.g., "3", "3 flutter", "style 2 web")
 allowed-tools: Bash(ls:*), Bash(cat:*), Bash(grep:*), View, Edit, Write

package/optional-skills/devlyn:reap/SKILL.md ADDED

@@ -0,0 +1,105 @@
+---
+name: devlyn:reap
+description: Safely count and kill orphaned child processes (PPID=1) left behind by Claude Code MCP plugins, Superset terminal tabs, and codex wrappers. Use this whenever the user says "too many processes", "can't open terminals", "pty/process limit", "hundreds of bun/codex/workerd piling up", "clean up orphans", "reap processes", or reports new terminals failing to spawn on macOS. Also use proactively after long Claude sessions to prevent hitting kern.maxprocperuid or kern.tty.ptmx_max limits. ONLY touches a conservative whitelist of known leaks — never guesses on unknown processes.
+allowed-tools: Read, Bash(ps:*), Bash(lsof:*), Bash(pgrep:*), Bash(awk:*), Bash(id:*), Bash(sysctl:*), Bash(bash:*)
+argument-hint: [scan | kill | kill --force | kill --include workerd | kill --only telegram-bun]
+---
+<role>
+You are a process-hygiene janitor for macOS. Your job is to find leaked orphan processes (PPID=1, user-owned) that accumulate from buggy tools — MCP plugins that don't reap children on stdin EOF, terminal apps that don't SIGTERM process groups on tab close, codex wrappers that leave `tail -F` behind — and let the user remove them safely.
+Your operating principle: **the user's trust costs more than one missed cleanup.** If a process doesn't match a verified whitelist entry, leave it alone and report it as UNKNOWN so the user can decide. Never guess.
+</role>
+<user_input>
+$ARGUMENTS
+</user_input>
+<process>
+## Phase 1: Parse intent
+Look at `$ARGUMENTS` and classify:
+| Input | Mode |
+|---|---|
+| empty, `scan`, `status`, `count`, `list`, or anything non-imperative | **SCAN only** (default) |
+| starts with `kill`, `reap`, `clean`, `prune`, `죽여` (Korean for "kill"), `정리` (Korean for "clean up") | **KILL** mode |
+In KILL mode, also parse:
+- `--force` → SIGKILL instead of SIGTERM
+- `--include workerd` → extend the default whitelist with the workerd-dev category
+- `--only <category>` → restrict to a single category
+- `--dry-run` → list kills but don't send signals
+If the user's intent is ambiguous (e.g., they say "지워줘" (Korean for "delete it") but didn't specify force or include), **default to SCAN first**, show the result, and then ask whether to proceed with kill. Never escalate to `--force` without an explicit request.
+## Phase 2: SCAN
+Always run scan first — even in KILL mode — so the user sees what is about to happen.
+Run the bundled scanner. The skill is installed at `~/.claude/skills/devlyn:reap/`:
+```bash
+bash ~/.claude/skills/devlyn:reap/scripts/scan.sh
+```
+Report the output verbatim to the user. Then add your own 2-line summary:
+- total orphan count across whitelist categories
+- any UNKNOWN_ORPHANS that the user might want to investigate manually
+Also surface the macOS limits for context, only once per session:
+```bash
+sysctl kern.maxprocperuid kern.tty.ptmx_max 2>/dev/null
+```
+## Phase 3: KILL (only when requested)
+Run the reap script with the parsed flags:
+```bash
+bash ~/.claude/skills/devlyn:reap/scripts/reap.sh [flags]
+```
+Show the output verbatim. The script re-verifies `PPID==1 && user==current` for every PID right before signaling — a process that was legitimately adopted since the scan will be skipped, not killed.
+After kill, re-run scan to confirm the counts dropped. If any whitelisted PIDs are still present after SIGTERM and 2 seconds, mention that `--force` (SIGKILL) is available.
+## Phase 4: Recommend (only if signals of chronic leak)
+If `telegram-bun` count > 10 OR oldest whitelisted orphan > 24h, tell the user this is a recurring leak and suggest one of:
+1. **Patch the telegram plugin** — add `process.stdin.on('end', () => process.exit(0))` to `server.ts` so the child dies when Claude Code exits.
+2. **Schedule this skill** — run `/devlyn:reap kill` periodically (e.g., via the `/loop` skill or a launchd agent).
+3. **Update Superset** — newer versions may SIGTERM process groups on tab close.
+Do NOT apply these automatically. Recommend and let the user choose.
+</process>
+<safety>
+## Never-touch rules
+- **NEVER kill** a process whose command does not match a whitelist category in `scan.sh`. Unknown = informational only.
+- **NEVER kill** anything where `ps -o ppid=` returns something other than `1` at signal time.
+- **NEVER kill** processes owned by another user (the scripts check `id -un`).
+- **NEVER use** `killall`, `pkill -9`, or wildcard `kill $(pgrep ...)` in this skill. Always iterate PIDs individually with per-PID re-verification.
+- **NEVER suggest** `sudo` escalation — this is a user-scope cleanup tool.
+## Whitelist definitions
+These are the ONLY categories reap.sh will touch:
+| Category | Match | Why safe |
+|---|---|---|
+| `telegram-bun` | `bun server.ts` **AND** cwd contains `/plugins/cache/claude-plugins-official/telegram/` | Telegram MCP plugin leaks one per Claude session. Verified by cwd, not just cmdline. |
+| `superset-codex-bash` | `/bin/bash .*/.superset/bin/codex` with PPID=1 | `.superset/bin/codex` wrapper exits without killing its tail child; bash copies left behind. |
+| `superset-codex-tail` | `tail -F .*superset-codex-session-*.jsonl` with PPID=1 | Log tail from the same wrapper, always safe to stop. |
+| `workerd` (opt-in) | `@cloudflare/workerd-darwin-*/bin/workerd serve ` with PPID=1 | moonmaker-engine dev server that survives tab close. Opt-in because the user may have an active dev session. |
+If the user asks to add a new category, **edit scan.sh and reap.sh together** — both must know the same pattern so scan never promises a cleanup that reap won't deliver.
+</safety>
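
Phase 1 of the skill (classify `$ARGUMENTS` into SCAN or KILL, then pick up the KILL flags) is prose instructions for the agent, but the table translates fairly directly into code. The sketch below is a hypothetical Python rendering of that table: `parse_intent` and the returned dictionary shape are illustrative names, not part of the skill.

```python
# Prefixes that switch to KILL mode, copied from the Phase 1 table.
KILL_PREFIXES = ("kill", "reap", "clean", "prune", "죽여", "정리")

def parse_intent(arguments):
    """Classify raw skill arguments; anything non-imperative falls back to SCAN."""
    intent = {"mode": "SCAN", "force": False, "include": [], "only": None, "dry_run": False}
    tokens = arguments.split()
    if not tokens or not tokens[0].lower().startswith(KILL_PREFIXES):
        return intent

    intent["mode"] = "KILL"
    rest = iter(tokens[1:])
    for token in rest:
        if token == "--force":
            intent["force"] = True  # only ever set by an explicit flag
        elif token == "--dry-run":
            intent["dry_run"] = True
        elif token == "--include":
            value = next(rest, None)
            if value:
                intent["include"].append(value)
        elif token == "--only":
            intent["only"] = next(rest, None)
    return intent

print(parse_intent("kill --include workerd")["include"])  # prints ['workerd']
print(parse_intent("status")["mode"])                     # prints SCAN
```

Defaulting every field to the safe value first mirrors the instruction never to escalate to `--force` without an explicit request.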

package/optional-skills/devlyn:reap/scripts/reap.sh ADDED

@@ -0,0 +1,129 @@
+#!/usr/bin/env bash
+# devlyn:reap — kill orphan processes from safe whitelist categories.
+# Verifies PPID==1 and user-ownership AGAIN at kill time to avoid racing a
+# legitimately-reparented process. Unknown orphans are never killed.
+#
+# Usage:
+#   reap.sh                       # default categories, SIGTERM
+#   reap.sh --force               # SIGKILL instead of SIGTERM
+#   reap.sh --include workerd     # add workerd-dev to the default set
+#   reap.sh --only telegram-bun   # restrict to a single category
+#   reap.sh --dry-run             # print what WOULD be killed, kill nothing
+set -u
+LC_ALL=C
+export LC_ALL
+ME="$(id -un)"
+SIGNAL="TERM"
+DRY=0
+INCLUDE=""
+ONLY=""
+while [ $# -gt 0 ]; do
+  case "$1" in
+    --force)     SIGNAL="KILL" ;;
+    --dry-run)   DRY=1 ;;
+    --include)   shift; INCLUDE="${INCLUDE},$1" ;;
+    --only)      shift; ONLY="$1" ;;
+    -h|--help)
+      sed -n '2,14p' "$0"; exit 0 ;;
+    *)
+      printf 'unknown flag: %s\n' "$1" >&2; exit 2 ;;
+  esac
+  shift
+done
+DEFAULT_CATEGORIES="telegram-bun,superset-codex-bash,superset-codex-tail"
+if [ -n "$ONLY" ]; then
+  CATEGORIES="$ONLY"
+else
+  CATEGORIES="${DEFAULT_CATEGORIES}${INCLUDE}"
+fi
+SNAPSHOT="$(ps -eo pid=,ppid=,user=,etime=,command= 2>/dev/null | awk -v me="$ME" '$2==1 && $3==me')"
+collect_pids() {
+  local category="$1"
+  case "$category" in
+    telegram-bun)
+      # cwd-verified — same logic as scan.sh
+      printf '%s\n' "$SNAPSHOT" \
+        | grep -E '/bun[^ ]* server\.ts( |$)' \
+        | awk '{print $1}' \
+        | while read -r pid; do
+            cwd="$(lsof -a -d cwd -p "$pid" 2>/dev/null | awk 'NR==2 {for(i=9;i<=NF;i++) printf "%s ", $i; print ""}')"
+            case "$cwd" in
+              *"/plugins/cache/claude-plugins-official/telegram/"*) printf '%s\n' "$pid" ;;
+            esac
+          done
+      ;;
+    superset-codex-bash)
+      printf '%s\n' "$SNAPSHOT" | grep -E '/bin/bash .*/\.superset/bin/codex( |$)' | awk '{print $1}' ;;
+    superset-codex-tail)
+      printf '%s\n' "$SNAPSHOT" | grep -E 'tail .*superset-codex-session-.*\.jsonl' | awk '{print $1}' ;;
+    workerd)
+      printf '%s\n' "$SNAPSHOT" | grep -E '@cloudflare/workerd-darwin-[^/]+/bin/workerd serve ' | awk '{print $1}' ;;
+    *)
+      printf 'unknown category: %s\n' "$category" >&2
+      return 1 ;;
+  esac
+}
+TOTAL_KILLED=0
+TOTAL_SKIPPED=0
+# Split the comma-separated category list without letting IFS leak into the
+# inner loop that iterates newline-separated PIDs.
+CATS_ARR=()
+OLD_IFS="$IFS"
+IFS=,
+for c in $CATEGORIES; do
+  [ -n "$c" ] && CATS_ARR+=("$c")
+done
+IFS="$OLD_IFS"
+for cat in "${CATS_ARR[@]}"; do
+  pids="$(collect_pids "$cat")" || continue
+  if [ -z "$pids" ]; then
+    printf '[%s] nothing to kill\n' "$cat"
+    continue
+  fi
+  while IFS= read -r pid; do
+    [ -z "$pid" ] && continue
+    # Re-verify right before killing. Any of these mean "don't touch":
+    #   - process already gone
+    #   - PPID is no longer 1 (got adopted by a real parent — not our target)
+    #   - owner changed (extremely unlikely but cheap to check)
+    live_info="$(ps -o ppid=,user= -p "$pid" 2>/dev/null)"
+    if [ -z "$live_info" ]; then
+      printf '[%s] %s  skipped (already exited)\n' "$cat" "$pid"
+      TOTAL_SKIPPED=$((TOTAL_SKIPPED+1))
+      continue
+    fi
+    live_ppid="$(printf '%s' "$live_info" | awk '{print $1}')"
+    live_user="$(printf '%s' "$live_info" | awk '{print $2}')"
+    if [ "$live_ppid" != "1" ] || [ "$live_user" != "$ME" ]; then
+      printf '[%s] %s  skipped (ppid=%s user=%s — no longer orphan)\n' "$cat" "$pid" "$live_ppid" "$live_user"
+      TOTAL_SKIPPED=$((TOTAL_SKIPPED+1))
+      continue
+    fi
+    if [ "$DRY" -eq 1 ]; then
+      printf '[%s] %s  would SIG%s\n' "$cat" "$pid" "$SIGNAL"
+    else
+      if kill -s "$SIGNAL" "$pid" 2>/dev/null; then
+        printf '[%s] %s  SIG%s sent\n' "$cat" "$pid" "$SIGNAL"
+        TOTAL_KILLED=$((TOTAL_KILLED+1))
+      else
+        printf '[%s] %s  kill failed\n' "$cat" "$pid"
+        TOTAL_SKIPPED=$((TOTAL_SKIPPED+1))
+      fi
+    fi
+  done <<< "$pids"
+done
+if [ "$DRY" -eq 1 ]; then
+  printf '\ndry-run complete.\n'
+else
+  printf '\ndone. killed=%s skipped=%s\n' "$TOTAL_KILLED" "$TOTAL_SKIPPED"
+fi
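
The safety property of `reap.sh` rests on the re-check it performs right before each signal: the process must still exist, still have PPID 1, and still belong to the current user. The decision itself is a small pure function, sketched here in Python for illustration. `recheck` is a hypothetical name, and its input is assumed to be the raw text that `ps -o ppid=,user= -p PID` would print (empty when the process is gone).

```python
def recheck(live_info, me):
    """Decide whether a PID may still be signaled.

    Returns (ok, reason). ok is True only for a live, user-owned orphan.
    """
    fields = live_info.split()
    if len(fields) < 2:
        # ps printed nothing: the process exited between scan and kill.
        return False, "already exited"
    ppid, user = fields[0], fields[1]
    if ppid != "1" or user != me:
        # Adopted by a real parent, or owned by someone else: leave it alone.
        return False, f"ppid={ppid} user={user}"
    return True, "orphan confirmed"

print(recheck("    1 alice", "alice"))  # prints (True, 'orphan confirmed')
print(recheck("  812 alice", "alice"))  # prints (False, 'ppid=812 user=alice')
print(recheck("", "alice"))             # prints (False, 'already exited')
```

Separating the decision from the `kill` call makes the three skip conditions easy to exercise without spawning any processes.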

package/optional-skills/devlyn:reap/scripts/scan.sh ADDED

@@ -0,0 +1,116 @@
+#!/usr/bin/env bash
+# devlyn:reap — scan orphan processes by safe-to-kill category.
+# Read-only. Never kills anything. Always exits 0 on success.
+#
+# Output format: one TSV line per category with
+#   CATEGORY  COUNT  OLDEST_ETIME  PIDS  NOTE
+# Followed by an "UNKNOWN_ORPHANS" line reporting non-system orphans we
+# deliberately left out of the whitelist — these will NOT be touched by reap.sh.
+set -u
+LC_ALL=C
+export LC_ALL
+# PPID=1 user-owned processes. Column layout: PID  PPID  USER  ETIME  COMMAND...
+ME="$(id -un)"
+SNAPSHOT="$(ps -eo pid=,ppid=,user=,etime=,command= 2>/dev/null | awk -v me="$ME" '$2==1 && $3==me')"
+# -----------------------------------------------------------------------------
+# Category matchers (grep -E patterns). These target processes that are KNOWN
+# to leak from specific tools that do not reap their children on exit.
+# Conservative by design — if unsure, leave it UNKNOWN.
+# -----------------------------------------------------------------------------
+match_telegram_bun()      { grep -E '/bun[^ ]* server\.ts( |$)'; }
+match_superset_codex_sh() { grep -E '/bin/bash .*/\.superset/bin/codex( |$)'; }
+match_superset_codex_tl() { grep -E 'tail .*superset-codex-session-.*\.jsonl'; }
+match_workerd_dev()       { grep -E '@cloudflare/workerd-darwin-[^/]+/bin/workerd serve '; }
+emit() {
+  local name="$1"; shift
+  local note="$1"; shift
+  local lines; lines="$(cat)"
+  local count; count="$(printf '%s\n' "$lines" | grep -c . || true)"
+  if [ "${count:-0}" -eq 0 ]; then
+    printf '%-24s\t0\t-\t-\t%s\n' "$name" "$note"
+    return
+  fi
+  local pids oldest
+  # ps column order is: pid ppid user etime command...
+  pids="$(printf '%s\n' "$lines" | awk '{print $1}' | paste -sd, -)"
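+  # etime is [[dd-]hh:]mm:ss, so compare by total seconds rather than by string.
+  oldest="$(printf '%s\n' "$lines" | awk '{
+    n = split($4, t, /[-:]/); s = 0; mult = 1
+    for (i = n; i >= 1; i--) { s += t[i] * mult; mult *= ((i >= n - 1) ? 60 : 24) }
+    if (s >= max) { max = s; e = $4 }
+  } END { print e }')"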
+  printf '%-24s\t%s\t%s\t%s\t%s\n' "$name" "$count" "$oldest" "$pids" "$note"
+}
+printf 'CATEGORY                \tCOUNT\tOLDEST\tPIDS\tNOTE\n'
+# Verify the bun server belongs to the telegram plugin before classifying it.
+# cwd is the reliable signal; command line alone is ambiguous.
+TELEGRAM_PIDS=""
+if [ -n "$SNAPSHOT" ]; then
+  BUN_CANDIDATES="$(printf '%s\n' "$SNAPSHOT" | match_telegram_bun | awk '{print $1}')"
+  for pid in $BUN_CANDIDATES; do
+    cwd="$(lsof -a -d cwd -p "$pid" 2>/dev/null | awk 'NR==2 {for(i=9;i<=NF;i++) printf "%s ", $i; print ""}')"
+    case "$cwd" in
+      *"/plugins/cache/claude-plugins-official/telegram/"*)
+        TELEGRAM_PIDS="${TELEGRAM_PIDS}${pid}
+" ;;
+    esac
+  done
+fi
+if [ -n "$TELEGRAM_PIDS" ]; then
+  # Reconstruct rows for accurate ETIME/command display.
+  printf '%s' "$TELEGRAM_PIDS" | grep -v '^$' | while read -r pid; do
+    printf '%s\n' "$SNAPSHOT" | awk -v p="$pid" '$1==p'
+  done | emit "telegram-bun"         "cwd=.../telegram/ plugin — safe"
+else
+  printf '' | emit "telegram-bun"    "cwd=.../telegram/ plugin — safe"
+fi
+printf '%s\n' "$SNAPSHOT" | match_superset_codex_sh | emit \
+  "superset-codex-bash"     ".superset/bin/codex wrapper leak — safe"
+printf '%s\n' "$SNAPSHOT" | match_superset_codex_tl | emit \
+  "superset-codex-tail"     "superset-codex-session-*.jsonl tail — safe"
+printf '%s\n' "$SNAPSHOT" | match_workerd_dev | emit \
+  "workerd-dev"             "cloudflare dev server — opt-in (include=workerd)"
+# -----------------------------------------------------------------------------
+# UNKNOWN_ORPHANS: everything else that is PPID=1 and user-owned. Informational
+# only. These will NOT be killed without a human explicitly extending the
+# whitelist. macOS system helpers (launchd, /usr/libexec/**, Application
+# bundles, Electron helpers, etc.) are filtered out — they're not orphans in
+# the leak sense, they legitimately run under launchd.
+# -----------------------------------------------------------------------------
+SYSTEM_FILTER='(^|/)(launchd|aslmanager|cloudphotod|automountd|autofsd|usernotificationsd|voicebankingd|veraport)( |$)|^/System/|^/usr/libexec/|^/usr/sbin/|^/Library/Apple|^/Library/Developer/PrivateFrameworks/CoreSimulator|^/Library/PrivilegedHelperTools/|^/Applications/|CoreSimulator|raonsecure|TEK_|ChatGPTHelper|FigmaAgent|figma_agent|iniLINE|CrossEX|com\.apple\.|Superset Helper|Electron Framework|QuickLookUIService|SandboxHelper|MTLCompilerService|extensionkitservice|ssh-agent|Squirrel|app-server-broker\.mjs'
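+# Match the filter against the command column only; the raw ps row starts with
+# the PID, so the ^-anchored alternatives would never match the full line.
+UNKNOWN="$(printf '%s\n' "$SNAPSHOT" \
+  | SYSTEM_FILTER="$SYSTEM_FILTER" awk 'NF {
+      cmd = ""; for (i = 5; i <= NF; i++) cmd = cmd $i " "
+      if (cmd !~ ENVIRON["SYSTEM_FILTER"]) printf "%s\t%s\n", $1, cmd
+    }')"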
+# Strip already-whitelisted categories from the UNKNOWN set so we don't
+# double-count them.
+WHITELIST_PIDS="$( {
+  printf '%s' "$TELEGRAM_PIDS"
+  printf '%s\n' "$SNAPSHOT" | match_superset_codex_sh | awk '{print $1}'
+  printf '%s\n' "$SNAPSHOT" | match_superset_codex_tl | awk '{print $1}'
+  printf '%s\n' "$SNAPSHOT" | match_workerd_dev | awk '{print $1}'
+} | grep -v '^$' | sort -u)"
+printf '\nUNKNOWN_ORPHANS (informational — NOT killed by reap.sh):\n'
+if [ -z "$UNKNOWN" ]; then
+  printf '  (none)\n'
+else
+  # awk can't take a multi-line string via -v (literal newlines are rejected),
+  # so pass the whitelist as a temp file instead.
+  WL_TMP="$(mktemp -t devlyn-reap-wl)"
+  # shellcheck disable=SC2064
+  trap "rm -f '$WL_TMP'" EXIT
+  printf '%s\n' "$WHITELIST_PIDS" > "$WL_TMP"
+  printf '%s\n' "$UNKNOWN" | awk -v wlf="$WL_TMP" '
+    BEGIN {
+      while ((getline line < wlf) > 0) if (line != "") wh[line]=1
+      close(wlf)
+    }
+    { if (!($1 in wh)) print "  " $0 }
+  '
+fi
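
Both the OLDEST column in the scan output and the "oldest whitelisted orphan > 24h" test in the skill's Phase 4 depend on interpreting the `etime` column, which `ps` prints as `[[dd-]hh:]mm:ss`. The following Python sketch shows one way to turn that format into seconds so that ages can be compared numerically. `etime_to_seconds` and `oldest` are hypothetical helper names, not part of the package.

```python
def etime_to_seconds(etime):
    """Convert a ps etime string ([[dd-]hh:]mm:ss) to elapsed seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad missing hours with zero
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds

def oldest(etimes):
    """Return the etime string with the largest elapsed time."""
    return max(etimes, key=etime_to_seconds)

DAY_SECONDS = 24 * 60 * 60

samples = ["59:59", "1-00:00:00", "03:10:00"]
print(oldest(samples))                                         # prints 1-00:00:00
print(etime_to_seconds("05:30"))                               # prints 330
print(etime_to_seconds(oldest(samples)) > DAY_SECONDS)         # prints False
```

Comparing by seconds matters because a plain string sort would rank `59:59` above `1-00:00:00`, even though the second process has been running far longer.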

package/{config/skills → optional-skills}/devlyn:team-design-ui/SKILL.md RENAMED

@@ -1,3 +1,8 @@
+---
+name: devlyn:team-design-ui
+description: Assemble a world-class design team to generate 5 radically distinct, portfolio-worthy UI style explorations. Like /devlyn:design-ui but powered by a full team of design specialists.
+---
 Assemble a world-class design team to generate 5 radically distinct, portfolio-worthy UI style explorations. Like `/devlyn:design-ui` but powered by a full team of design specialists — Creative Director, Product Designer, Visual Designer, Interaction Designer, and Accessibility Designer — who collaborate to produce 5 stunning HTML design samples that go far beyond what a single designer could achieve.
 This is design exploration only. After the user picks a style:

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "devlyn-cli",
-  "version": "1.14.0",
+  "version": "2.0.0",
   "description": "AI development toolkit for Claude Code — ideate, auto-resolve, and ship with context engineering and agent orchestration",
   "homepage": "https://github.com/fysoul17/devlyn-cli#readme",
   "bin": {
@@ -13,9 +13,23 @@
     "!config/skills/preflight-workspace/**",
     "!config/skills/devlyn:ideate-workspace",
     "!config/skills/devlyn:ideate-workspace/**",
+    "!config/skills/devlyn:auto-resolve-workspace",
+    "!config/skills/devlyn:auto-resolve-workspace/**",
+    "!config/skills/roadmap-archival-workspace",
+    "!config/skills/roadmap-archival-workspace/**",
     "agents-config",
     "optional-skills",
-    "CLAUDE.md"
+    "benchmark/auto-resolve/BENCHMARK-DESIGN.md",
+    "benchmark/auto-resolve/README.md",
+    "benchmark/auto-resolve/RUBRIC.md",
+    "benchmark/auto-resolve/fixtures/SCHEMA.md",
+    "benchmark/auto-resolve/fixtures/F*/**",
+    "benchmark/auto-resolve/fixtures/test-repo/**",
+    "!benchmark/auto-resolve/fixtures/test-repo/node_modules/**",
+    "benchmark/auto-resolve/scripts/**",
+    "scripts/lint-skills.sh",
+    "CLAUDE.md",
+    "AGENTS.md"
   ],
   "keywords": [
     "claude",