npm - @a5c-ai/babysitter-cursor - Versions diffs - 0.1.5-staging.aaae75fb → 0.1.5-staging.b2dcdbb3 - Mend

@a5c-ai/babysitter-cursor 0.1.5-staging.aaae75fb → 0.1.5-staging.b2dcdbb3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/commands/doctor.md CHANGED

@@ -4,9 +4,9 @@ argument-hint: "[run-id] Optional run ID to diagnose. If omitted, uses the most
 allowed-tools: Read, Grep, Write, Task, Bash, Edit, Grep, Glob, WebFetch, WebSearch, Search, AskUserQuestion, TodoWrite, TodoRead, Skill, BashOutput, KillShell, MultiEdit, LS
 ---
-You are a diagnostic agent for the babysitter runtime. Your job is to perform a comprehensive health check across 10 areas and produce a structured diagnostic report. Follow each section methodically. Track results as you go and produce the final summary at the end.
+You are a diagnostic agent for the babysitter runtime. Your job is to perform a comprehensive health check across 14 areas and produce a structured diagnostic report. Follow each section methodically. Track results as you go and produce the final summary at the end.
-Initialize a results tracker with these 10 checks, all starting as PENDING:
+Initialize a results tracker with these 14 checks, all starting as PENDING:
 1. Run Discovery
 2. Journal Integrity
 3. State Cache Consistency
@@ -17,6 +17,10 @@ Initialize a results tracker with these 10 checks, all starting as PENDING:
 8. Disk Usage
 9. Process Validation
 10. Hook Execution Health
+11. Session-ID Provenance
+12. Ancestor Liveness
+13. Concurrent Session Detection
+14. Windows Ancestor-Walk Strategy
 ---
@@ -350,9 +354,65 @@ Mark as FAIL if:
 ---
+## 11. Session-ID Provenance
+**Goal:** Verify how the current babysitter session ID was resolved and flag stale or shadowed values.
+- Invoke: `npx babysitter session:whoami --json`
+- Parse the output and inspect the `resolvedFrom` field. Classify as follows:
+  - `resolvedFrom: "pid-marker"` → mark as PASS ("Session ID derives from the live Claude Code ancestor process -- authoritative").
+  - `resolvedFrom: "env-file"` → mark as PASS with a note ("CLAUDE_ENV_FILE was used; typically healthy").
+  - `resolvedFrom: "env-var"` → mark as WARN ("`BABYSITTER_SESSION_ID` is set without a corroborating PID marker. Likely stale from a prior Claude Code session -- see GitHub issue #130").
+    - Remediation: run `babysitter session:cleanup` and start a fresh Claude Code session, or `unset BABYSITTER_SESSION_ID` before invoking babysitter.
+  - `resolvedFrom: "none"` → mark as ERROR ("No session ID resolvable. Either no session-start hook fired, or the ancestor walk failed").
+**Env-var shadow check:**
+- Independently inspect `envVarPresent` and `envVarMatches` in the output.
+- If `envVarPresent && !envVarMatches`, mark as WARN ("`BABYSITTER_SESSION_ID` in env does not match the resolved session ID; a stale value is shadowing the authoritative one. Unset the env var").
+---
+## 12. Ancestor Liveness
+**Goal:** Confirm the PID marker references a live Claude Code process.
+- Reuse the `session:whoami --json` output from check 11.
+- Inspect the `ancestorAlive` field.
+- If `ancestorAlive === false`, mark as ERROR ("The PID marker references a dead Claude Code process").
+  - Remediation: `babysitter session:cleanup`.
+- Otherwise mark as PASS.
+---
+## 13. Concurrent Session Detection
+**Goal:** Surface multiple live harness sessions that may compete for the same session ID.
+- Enumerate files in `~/.a5c/` matching the pattern `current-session-*-pid-*`.
+- Count markers per harness (derived from the filename).
+- If more than one live marker exists for the same harness, mark as INFO ("Multiple live Claude Code / harness sessions detected; ensure each shell scopes `BABYSITTER_SESSION_ID` appropriately -- the PID marker handles this automatically").
+- Otherwise mark as PASS.
+---
+## 14. Windows Ancestor-Walk Strategy
+**Goal:** Verify the ancestor-walk strategy works on Windows, where `wmic` is no longer guaranteed to be present.
+- Only run this check when `process.platform === 'win32'`. On other platforms, mark as PASS ("Not applicable -- non-Windows platform").
+- Attempt the ancestor walk by invoking `npx babysitter session:whoami --json` (reuse output from check 11 if available).
+- If resolution succeeded (any `resolvedFrom` other than `none`), mark as PASS.
+- If `resolvedFrom: "none"` on Windows:
+  - Test `wmic` availability: `where wmic` via shell.
+  - If absent, document that Windows 11 24H2 removed `wmic`; the fallback PowerShell CIM path should handle this.
+  - If the PowerShell ancestor walk also failed, mark as ERROR with remediation: ensure PowerShell is available (`powershell -NoProfile -Command "Get-CimInstance Win32_Process -Filter ProcessId=$PID"` should work).
+- If the cascade works but is slow (>5s on first probe), add an INFO note on first-probe latency.
+---
 ## Final Report
-After completing all 10 checks, produce the diagnostic report in this format:
+After completing all 14 checks, produce the diagnostic report in this format:
 ```
 ============================================
@@ -379,6 +439,10 @@ OVERALL HEALTH: <HEALTHY | WARNING | CRITICAL>
 | 8  | Disk Usage               | <status> |
 | 9  | Process Validation       | <status> |
 | 10 | Hook Execution Health    | <status> |
+| 11 | Session-ID Provenance    | <status> |
+| 12 | Ancestor Liveness        | <status> |
+| 13 | Concurrent Session Detection | <status> |
+| 14 | Windows Ancestor-Walk Strategy | <status> |
 --------------------------------------------
   ISSUES & RECOMMENDATIONS
@@ -392,9 +456,9 @@ OVERALL HEALTH: <HEALTHY | WARNING | CRITICAL>
 ```
 **Overall health determination:**
-- **HEALTHY**: All 10 checks are PASS.
-- **WARNING**: At least one check is WARN but none are FAIL.
-- **CRITICAL**: At least one check is FAIL.
+- **HEALTHY**: All 14 checks are PASS (INFO notes are acceptable).
+- **WARNING**: At least one check is WARN but none are FAIL or ERROR.
+- **CRITICAL**: At least one check is FAIL or ERROR.
 Present the full detailed findings for each check BEFORE the summary table, so the user can see the evidence. End with the summary table and recommendations. Also, create a single HTML report file with all the findings that uses the arwes UI framework and open it for the user in the browser.
@@ -424,3 +488,25 @@ After diagnosing issues, prompt the user to report or fix what was found -- they
 Example prompt after diagnosis:
 > "Diagnosis found a stale lock -- process 12847 crashed without cleanup. This is a known edge case in the orchestration loop. Even if you don't want to fix it yourself, reporting it helps: run `/babysitter:contrib bug report: orchestration loop doesn't release lock on unhandled rejection` to open an issue."
+---
+## Self-Heal Suggestions
+If any of checks 11-14 surface issues (stale env vars, dead ancestor PIDs, shadowed session IDs, or Windows ancestor-walk failures), suggest the following remediation sequence, in order. Present it as an actionable block:
+```bash
+# 1. Cleanup dead markers and orphaned state files
+babysitter session:cleanup --dry-run   # preview
+babysitter session:cleanup             # apply
+# 2. Unset a stale env var
+unset BABYSITTER_SESSION_ID
+# 3. Re-bind a run explicitly if needed
+babysitter session:resume --session-id <fresh-id> --state-dir ~/.a5c --run-id <runId> --runs-dir .a5c/runs
+# 4. Start a fresh Claude Code session (closes and reopens the session)
+```
+Run steps 1 and 2 first; re-run `/babysitter:doctor` after each step to confirm the session-provenance checks return to PASS. Step 3 is only needed when a specific run must be re-bound to the fresh session. If the issue persists after step 4, escalate via `/debug` or `/babysitter:contrib`.
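
The classification rules that this diff adds for checks 11 and 12 can be sketched as a small lookup. This is a minimal illustration, not the package's implementation: the field names (`resolvedFrom`, `envVarPresent`, `envVarMatches`, `ancestorAlive`) come from the added text, while the function names, the note strings, and the handling of a missing or unrecognized `resolvedFrom` value are assumptions.

```python
from typing import List, Tuple

# Status per resolvedFrom value, as described in the added "11. Session-ID
# Provenance" section. The note strings are paraphrased, not quoted.
PROVENANCE_TABLE = {
    "pid-marker": ("PASS", "derived from the live ancestor process"),
    "env-file": ("PASS", "CLAUDE_ENV_FILE was used; typically healthy"),
    "env-var": ("WARN", "BABYSITTER_SESSION_ID set without a PID marker"),
    "none": ("ERROR", "no session ID resolvable"),
}


def classify_provenance(whoami: dict) -> List[Tuple[str, str]]:
    """Return (status, note) findings for check 11 (hypothetical helper)."""
    # Assumption: a missing field is treated like "none".
    resolved = whoami.get("resolvedFrom", "none")
    # Assumption: unrecognized values are reported as ERROR; the source
    # does not say what to do with them.
    fallback = ("ERROR", f"unrecognized resolvedFrom: {resolved}")
    findings = [PROVENANCE_TABLE.get(resolved, fallback)]
    # Env-var shadow check: evaluated independently of resolvedFrom.
    if whoami.get("envVarPresent") and not whoami.get("envVarMatches"):
        findings.append(("WARN", "stale env var shadows the resolved session ID"))
    return findings


def classify_liveness(whoami: dict) -> Tuple[str, str]:
    """Return the (status, note) for check 12 (hypothetical helper)."""
    # Only an explicit False is an ERROR, mirroring `ancestorAlive === false`.
    if whoami.get("ancestorAlive") is False:
        return ("ERROR", "PID marker references a dead process")
    return ("PASS", "")
```

The input would be the parsed JSON from `npx babysitter session:whoami --json`. Keeping the shadow check separate from the table means a `pid-marker` resolution can still carry a WARN when a stale `BABYSITTER_SESSION_ID` is present.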

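Check 13 in the diff above counts live PID markers per harness. A rough sketch follows, assuming marker filenames have the shape `current-session-<harness>-pid-<pid>`; the diff gives only the glob `current-session-*-pid-*`, so the exact layout is a guess. The liveness test is passed in as a callable because the diff does not say how liveness is determined, and both function names are hypothetical.

```python
import re
from collections import Counter
from typing import Callable, Iterable

# Assumed filename layout: current-session-<harness>-pid-<pid>[suffix].
# Only the glob `current-session-*-pid-*` appears in the source.
MARKER_RE = re.compile(r"^current-session-(?P<harness>.+)-pid-(?P<pid>\d+)")


def count_live_markers(names: Iterable[str],
                       is_alive: Callable[[int], bool]) -> Counter:
    """Count markers per harness whose PID passes the liveness predicate."""
    counts: Counter = Counter()
    for name in names:
        match = MARKER_RE.match(name)
        if match and is_alive(int(match.group("pid"))):
            counts[match.group("harness")] += 1
    return counts


def classify_concurrency(counts: Counter) -> str:
    """INFO when any harness has more than one live marker, else PASS."""
    return "INFO" if any(n > 1 for n in counts.values()) else "PASS"
```

In practice the names would come from listing `~/.a5c/`, for example with `os.listdir(os.path.expanduser("~/.a5c"))`, and `is_alive` would wrap whatever process check suits the platform.
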
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@a5c-ai/babysitter-cursor",
-  "version": "0.1.5-staging.aaae75fb",
+  "version": "0.1.5-staging.b2dcdbb3",
   "description": "Babysitter orchestration plugin for Cursor IDE with SDK-managed process-library bootstrapping and in-turn iteration model",
   "scripts": {
     "test": "node scripts/sync-command-surfaces.js --check",
@@ -44,6 +44,6 @@
   },
   "homepage": "https://github.com/a5c-ai/babysitter/tree/main/plugins/babysitter-cursor#readme",
   "dependencies": {
-    "@a5c-ai/babysitter-sdk": "0.0.188-staging.aaae75fb"
+    "@a5c-ai/babysitter-sdk": "0.0.188-staging.b2dcdbb3"
   }
 }

package/skills/doctor/SKILL.md CHANGED

@@ -5,9 +5,9 @@ description: Diagnose babysitter run health - journal integrity, state cache, ef
 # doctor
-You are a diagnostic agent for the babysitter runtime. Your job is to perform a comprehensive health check across 10 areas and produce a structured diagnostic report. Follow each section methodically. Track results as you go and produce the final summary at the end.
+You are a diagnostic agent for the babysitter runtime. Your job is to perform a comprehensive health check across 14 areas and produce a structured diagnostic report. Follow each section methodically. Track results as you go and produce the final summary at the end.
-Initialize a results tracker with these 10 checks, all starting as PENDING:
+Initialize a results tracker with these 14 checks, all starting as PENDING:
 1. Run Discovery
 2. Journal Integrity
 3. State Cache Consistency
@@ -18,6 +18,10 @@ Initialize a results tracker with these 10 checks, all starting as PENDING:
 8. Disk Usage
 9. Process Validation
 10. Hook Execution Health
+11. Session-ID Provenance
+12. Ancestor Liveness
+13. Concurrent Session Detection
+14. Windows Ancestor-Walk Strategy
 ---
@@ -351,9 +355,65 @@ Mark as FAIL if:
 ---
+## 11. Session-ID Provenance
+**Goal:** Verify how the current babysitter session ID was resolved and flag stale or shadowed values.
+- Invoke: `npx babysitter session:whoami --json`
+- Parse the output and inspect the `resolvedFrom` field. Classify as follows:
+  - `resolvedFrom: "pid-marker"` → mark as PASS ("Session ID derives from the live Claude Code ancestor process -- authoritative").
+  - `resolvedFrom: "env-file"` → mark as PASS with a note ("CLAUDE_ENV_FILE was used; typically healthy").
+  - `resolvedFrom: "env-var"` → mark as WARN ("`BABYSITTER_SESSION_ID` is set without a corroborating PID marker. Likely stale from a prior Claude Code session -- see GitHub issue #130").
+    - Remediation: run `babysitter session:cleanup` and start a fresh Claude Code session, or `unset BABYSITTER_SESSION_ID` before invoking babysitter.
+  - `resolvedFrom: "none"` → mark as ERROR ("No session ID resolvable. Either no session-start hook fired, or the ancestor walk failed").
+**Env-var shadow check:**
+- Independently inspect `envVarPresent` and `envVarMatches` in the output.
+- If `envVarPresent && !envVarMatches`, mark as WARN ("`BABYSITTER_SESSION_ID` in env does not match the resolved session ID; a stale value is shadowing the authoritative one. Unset the env var").
+---
+## 12. Ancestor Liveness
+**Goal:** Confirm the PID marker references a live Claude Code process.
+- Reuse the `session:whoami --json` output from check 11.
+- Inspect the `ancestorAlive` field.
+- If `ancestorAlive === false`, mark as ERROR ("The PID marker references a dead Claude Code process").
+  - Remediation: `babysitter session:cleanup`.
+- Otherwise mark as PASS.
+---
+## 13. Concurrent Session Detection
+**Goal:** Surface multiple live harness sessions that may compete for the same session ID.
+- Enumerate files in `~/.a5c/` matching the pattern `current-session-*-pid-*`.
+- Count markers per harness (derived from the filename).
+- If more than one live marker exists for the same harness, mark as INFO ("Multiple live Claude Code / harness sessions detected; ensure each shell scopes `BABYSITTER_SESSION_ID` appropriately -- the PID marker handles this automatically").
+- Otherwise mark as PASS.
+---
+## 14. Windows Ancestor-Walk Strategy
+**Goal:** Verify the ancestor-walk strategy works on Windows, where `wmic` is no longer guaranteed to be present.
+- Only run this check when `process.platform === 'win32'`. On other platforms, mark as PASS ("Not applicable -- non-Windows platform").
+- Attempt the ancestor walk by invoking `npx babysitter session:whoami --json` (reuse output from check 11 if available).
+- If resolution succeeded (any `resolvedFrom` other than `none`), mark as PASS.
+- If `resolvedFrom: "none"` on Windows:
+  - Test `wmic` availability: `where wmic` via shell.
+  - If absent, document that Windows 11 24H2 removed `wmic`; the fallback PowerShell CIM path should handle this.
+  - If the PowerShell ancestor walk also failed, mark as ERROR with remediation: ensure PowerShell is available (`powershell -NoProfile -Command "Get-CimInstance Win32_Process -Filter ProcessId=$PID"` should work).
+- If the cascade works but is slow (>5s on first probe), add an INFO note on first-probe latency.
+---
 ## Final Report
-After completing all 10 checks, produce the diagnostic report in this format:
+After completing all 14 checks, produce the diagnostic report in this format:
 ```
 ============================================
@@ -380,6 +440,10 @@ OVERALL HEALTH: <HEALTHY | WARNING | CRITICAL>
 | 8  | Disk Usage               | <status> |
 | 9  | Process Validation       | <status> |
 | 10 | Hook Execution Health    | <status> |
+| 11 | Session-ID Provenance    | <status> |
+| 12 | Ancestor Liveness        | <status> |
+| 13 | Concurrent Session Detection | <status> |
+| 14 | Windows Ancestor-Walk Strategy | <status> |
 --------------------------------------------
   ISSUES & RECOMMENDATIONS
@@ -393,9 +457,9 @@ OVERALL HEALTH: <HEALTHY | WARNING | CRITICAL>
 ```
 **Overall health determination:**
-- **HEALTHY**: All 10 checks are PASS.
-- **WARNING**: At least one check is WARN but none are FAIL.
-- **CRITICAL**: At least one check is FAIL.
+- **HEALTHY**: All 14 checks are PASS (INFO notes are acceptable).
+- **WARNING**: At least one check is WARN but none are FAIL or ERROR.
+- **CRITICAL**: At least one check is FAIL or ERROR.
 Present the full detailed findings for each check BEFORE the summary table, so the user can see the evidence. End with the summary table and recommendations. Also, create a single HTML report file with all the findings that uses the arwes UI framework and open it for the user in the browser.
@@ -425,3 +489,25 @@ After diagnosing issues, prompt the user to report or fix what was found -- they
 Example prompt after diagnosis:
 > "Diagnosis found a stale lock -- process 12847 crashed without cleanup. This is a known edge case in the orchestration loop. Even if you don't want to fix it yourself, reporting it helps: run `/babysitter:contrib bug report: orchestration loop doesn't release lock on unhandled rejection` to open an issue."
+---
+## Self-Heal Suggestions
+If any of checks 11-14 surface issues (stale env vars, dead ancestor PIDs, shadowed session IDs, or Windows ancestor-walk failures), suggest the following remediation sequence, in order. Present it as an actionable block:
+```bash
+# 1. Cleanup dead markers and orphaned state files
+babysitter session:cleanup --dry-run   # preview
+babysitter session:cleanup             # apply
+# 2. Unset a stale env var
+unset BABYSITTER_SESSION_ID
+# 3. Re-bind a run explicitly if needed
+babysitter session:resume --session-id <fresh-id> --state-dir ~/.a5c --run-id <runId> --runs-dir .a5c/runs
+# 4. Start a fresh Claude Code session (closes and reopens the session)
+```
+Run steps 1 and 2 first; re-run `/babysitter:doctor` after each step to confirm the session-provenance checks return to PASS. Step 3 is only needed when a specific run must be re-bound to the fresh session. If the issue persists after step 4, escalate via `/debug` or `/babysitter:contrib`.
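
The revised overall health determination, which now treats ERROR like FAIL and tolerates INFO, can be expressed as a short roll-up. This is a sketch under the rules stated in the diff; the function name is hypothetical, and rejecting other statuses (such as a leftover PENDING) is an assumption, since the source does not cover that case.

```python
from typing import Iterable

KNOWN_STATUSES = {"PASS", "INFO", "WARN", "FAIL", "ERROR"}


def overall_health(statuses: Iterable[str]) -> str:
    """Roll per-check statuses up into HEALTHY, WARNING, or CRITICAL."""
    seen = set(statuses)
    unknown = seen - KNOWN_STATUSES
    if unknown:
        # Assumption: anything else (e.g. PENDING) means a check never
        # finished, so refuse to produce a verdict.
        raise ValueError(f"unexpected statuses: {sorted(unknown)}")
    if seen & {"FAIL", "ERROR"}:
        return "CRITICAL"
    if "WARN" in seen:
        return "WARNING"
    # Only PASS and INFO remain; INFO notes are acceptable.
    return "HEALTHY"
```

For example, thirteen PASS results plus one INFO from check 13 still yield HEALTHY, while a single ERROR from check 11 or 12 yields CRITICAL regardless of the other results.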

package/versions.json CHANGED

@@ -1,3 +1,3 @@
 {
-  "sdkVersion": "0.0.188-staging.aaae75fb"
+  "sdkVersion": "0.0.188-staging.b2dcdbb3"
 }
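
In this diff the plugin version, the SDK dependency pin in `package.json`, and `sdkVersion` in `versions.json` all move to the same staging hash (`b2dcdbb3`). A consistency check along those lines might look like the sketch below. That the three values are meant to stay in lockstep is an inference from this diff rather than a documented rule, and the function names are hypothetical.

```python
SDK_NAME = "@a5c-ai/babysitter-sdk"


def staging_hash(version: str) -> str:
    """Return the text after the last dot, e.g. the staging commit hash."""
    return version.rsplit(".", 1)[-1]


def pins_consistent(package_json: dict, versions_json: dict) -> bool:
    """True when the SDK pin matches versions.json and shares the plugin's hash."""
    sdk_pin = package_json["dependencies"][SDK_NAME]
    return (
        sdk_pin == versions_json["sdkVersion"]
        and staging_hash(package_json["version"]) == staging_hash(sdk_pin)
    )
```

Both arguments are the parsed contents of the respective files, for example via `json.load`.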