npm - context-mode - Versions diffs - 1.0.111 → 1.0.112 - Mend

context-mode 1.0.111 → 1.0.112

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (150)

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +1 -1
package/.openclaw-plugin/index.ts +3 -2
package/.openclaw-plugin/openclaw.plugin.json +1 -1
package/.openclaw-plugin/package.json +1 -1
package/README.md +152 -34
package/bin/statusline.mjs +144 -127
package/build/adapters/base.d.ts +8 -5
package/build/adapters/base.js +8 -18
package/build/adapters/claude-code/index.d.ts +24 -3
package/build/adapters/claude-code/index.js +44 -11
package/build/adapters/codex/hooks.d.ts +10 -5
package/build/adapters/codex/hooks.js +10 -5
package/build/adapters/codex/index.d.ts +17 -5
package/build/adapters/codex/index.js +337 -37
package/build/adapters/codex/paths.d.ts +1 -0
package/build/adapters/codex/paths.js +12 -0
package/build/adapters/cursor/index.d.ts +6 -0
package/build/adapters/cursor/index.js +83 -2
package/build/adapters/detect.d.ts +1 -1
package/build/adapters/detect.js +29 -6
package/build/adapters/omp/index.d.ts +65 -0
package/build/adapters/omp/index.js +182 -0
package/build/adapters/omp/plugin.d.ts +75 -0
package/build/adapters/omp/plugin.js +220 -0
package/build/adapters/openclaw/mcp-tools.d.ts +54 -0
package/build/adapters/openclaw/mcp-tools.js +198 -0
package/build/adapters/openclaw/plugin.d.ts +130 -0
package/build/adapters/openclaw/plugin.js +629 -0
package/build/adapters/openclaw/workspace-router.d.ts +29 -0
package/build/adapters/openclaw/workspace-router.js +64 -0
package/build/adapters/opencode/plugin.d.ts +145 -0
package/build/adapters/opencode/plugin.js +457 -0
package/build/adapters/pi/extension.d.ts +26 -0
package/build/adapters/pi/extension.js +552 -0
package/build/adapters/pi/index.d.ts +57 -0
package/build/adapters/pi/index.js +173 -0
package/build/adapters/pi/mcp-bridge.d.ts +113 -0
package/build/adapters/pi/mcp-bridge.js +251 -0
package/build/adapters/types.d.ts +11 -6
package/build/cli.js +186 -170
package/build/db-base.d.ts +15 -2
package/build/db-base.js +50 -5
package/build/executor.d.ts +2 -0
package/build/executor.js +15 -2
package/build/runPool.d.ts +36 -0
package/build/runPool.js +51 -0
package/build/runtime.js +64 -5
package/build/search/auto-memory.js +6 -4
package/build/security.js +30 -10
package/build/server.d.ts +23 -1
package/build/server.js +652 -174
package/build/session/analytics.d.ts +404 -1
package/build/session/analytics.js +1347 -42
package/build/session/db.d.ts +114 -5
package/build/session/db.js +275 -27
package/build/session/event-emit.d.ts +48 -0
package/build/session/event-emit.js +101 -0
package/build/session/extract.d.ts +1 -0
package/build/session/extract.js +79 -12
package/build/session/purge.d.ts +111 -0
package/build/session/purge.js +138 -0
package/build/store.d.ts +7 -0
package/build/store.js +69 -6
package/build/util/claude-config.d.ts +26 -0
package/build/util/claude-config.js +91 -0
package/build/util/hook-config.d.ts +4 -0
package/build/util/hook-config.js +39 -0
package/cli.bundle.mjs +411 -208
package/configs/antigravity/GEMINI.md +0 -3
package/configs/claude-code/CLAUDE.md +1 -4
package/configs/codex/AGENTS.md +1 -4
package/configs/codex/config.toml +3 -0
package/configs/codex/hooks.json +8 -0
package/configs/cursor/context-mode.mdc +0 -3
package/configs/gemini-cli/GEMINI.md +0 -3
package/configs/jetbrains-copilot/copilot-instructions.md +0 -3
package/configs/kilo/AGENTS.md +0 -3
package/configs/kiro/KIRO.md +0 -3
package/configs/omp/SYSTEM.md +85 -0
package/configs/omp/mcp.json +7 -0
package/configs/openclaw/AGENTS.md +0 -3
package/configs/opencode/AGENTS.md +0 -3
package/configs/pi/AGENTS.md +0 -3
package/configs/qwen-code/QWEN.md +1 -4
package/configs/vscode-copilot/copilot-instructions.md +0 -3
package/configs/zed/AGENTS.md +0 -3
package/hooks/codex/posttooluse.mjs +9 -2
package/hooks/codex/precompact.mjs +69 -0
package/hooks/codex/sessionstart.mjs +13 -9
package/hooks/codex/stop.mjs +1 -2
package/hooks/codex/userpromptsubmit.mjs +1 -2
package/hooks/core/routing.mjs +237 -18
package/hooks/cursor/afteragentresponse.mjs +1 -1
package/hooks/cursor/hooks.json +31 -0
package/hooks/cursor/posttooluse.mjs +1 -1
package/hooks/cursor/sessionstart.mjs +5 -5
package/hooks/cursor/stop.mjs +1 -1
package/hooks/ensure-deps.mjs +12 -13
package/hooks/gemini-cli/aftertool.mjs +1 -1
package/hooks/gemini-cli/beforeagent.mjs +1 -1
package/hooks/gemini-cli/precompress.mjs +3 -2
package/hooks/gemini-cli/sessionstart.mjs +9 -9
package/hooks/jetbrains-copilot/posttooluse.mjs +1 -1
package/hooks/jetbrains-copilot/precompact.mjs +3 -2
package/hooks/jetbrains-copilot/sessionstart.mjs +9 -9
package/hooks/kiro/agentspawn.mjs +5 -5
package/hooks/kiro/posttooluse.mjs +2 -2
package/hooks/kiro/userpromptsubmit.mjs +1 -1
package/hooks/posttooluse.mjs +45 -0
package/hooks/precompact.mjs +17 -0
package/hooks/pretooluse.mjs +23 -0
package/hooks/routing-block.mjs +0 -12
package/hooks/run-hook.mjs +16 -3
package/hooks/session-db.bundle.mjs +27 -18
package/hooks/session-extract.bundle.mjs +2 -2
package/hooks/session-helpers.mjs +101 -64
package/hooks/sessionstart.mjs +51 -2
package/hooks/vscode-copilot/posttooluse.mjs +1 -1
package/hooks/vscode-copilot/precompact.mjs +3 -2
package/hooks/vscode-copilot/sessionstart.mjs +9 -9
package/openclaw.plugin.json +1 -1
package/package.json +14 -8
package/server.bundle.mjs +349 -147
package/skills/UPSTREAM-CREDITS.md +0 -51
package/skills/context-mode-ops/SKILL.md +0 -299
package/skills/context-mode-ops/agent-teams.md +0 -198
package/skills/context-mode-ops/communication.md +0 -224
package/skills/context-mode-ops/marketing.md +0 -124
package/skills/context-mode-ops/release.md +0 -214
package/skills/context-mode-ops/review-pr.md +0 -269
package/skills/context-mode-ops/tdd.md +0 -329
package/skills/context-mode-ops/triage-issue.md +0 -266
package/skills/context-mode-ops/validation.md +0 -307
package/skills/diagnose/SKILL.md +0 -122
package/skills/diagnose/scripts/hitl-loop.template.sh +0 -41
package/skills/grill-me/SKILL.md +0 -15
package/skills/grill-with-docs/ADR-FORMAT.md +0 -47
package/skills/grill-with-docs/CONTEXT-FORMAT.md +0 -77
package/skills/grill-with-docs/SKILL.md +0 -93
package/skills/improve-codebase-architecture/DEEPENING.md +0 -37
package/skills/improve-codebase-architecture/INTERFACE-DESIGN.md +0 -44
package/skills/improve-codebase-architecture/LANGUAGE.md +0 -53
package/skills/improve-codebase-architecture/SKILL.md +0 -76
package/skills/tdd/SKILL.md +0 -114
package/skills/tdd/deep-modules.md +0 -33
package/skills/tdd/interface-design.md +0 -31
package/skills/tdd/mocking.md +0 -59
package/skills/tdd/refactoring.md +0 -10
package/skills/tdd/tests.md +0 -61

package/skills/context-mode-ops/triage-issue.md DELETED

@@ -1,266 +0,0 @@
-# Triage Issue Workflow
-## Trigger
-User says: "triage issue #N", "fix issue #N", "analyze issue #N"
-## Step-by-Step
-### 1. Gather Intelligence (ONE batch call)
-Use `ctx_batch_execute` to gather everything in ONE call:
-```javascript
-commands: [
-  { label: "issue-body", command: "gh issue view {N} --json title,body,labels,state,comments,author,createdAt" },
-  { label: "issue-comments", command: "gh issue view {N} --comments" },
-  { label: "recent-related-prs", command: "gh pr list --state all --limit 10 --json number,title,state,headRefName" },
-  { label: "source-tree", command: "find src -type f -name '*.ts' | sort" },
-  { label: "test-tree", command: "find tests -type f -name '*.test.ts' | sort" },
-  { label: "open-issues", command: "gh issue list --state open --limit 20 --json number,title,labels" }
-],
-queries: [
-  "issue title description problem",
-  "affected adapter platform",
-  "error message stack trace",
-  "environment variables mentioned",
-  "OS platform specific",
-  "related PRs and issues"
-]
-```
-### 2. Classify Domains
-From the gathered intelligence, identify:
-- [ ] **Affected adapters** — which of the 12 platforms?
-- [ ] **Affected OS** — macOS, Linux, Windows, or all?
-- [ ] **Core modules** — server, store, executor, session, hooks?
-- [ ] **Issue type** — bug, feature request, question, discussion?
-- [ ] **Severity** — breaking (can't use tool), degraded (works but wrong), cosmetic
-### 3. Spawn Agent Army
-Based on classification, spawn from [agent-teams.md](agent-teams.md):
-```
-ALWAYS spawn:
-├── Context Mode Architect (reviews everything)
-├── QA Engineer (runs all tests)
-├── DX Engineer (checks user-facing quality)
-IF adapter X is affected:
-├── {X} Architect
-├── {X} Staff Engineer
-IF OS-specific:
-├── OS Compatibility Architect
-├── {macOS|Linux|Windows} Staff Engineer
-IF domain-specific:
-├── {Domain} Architect
-└── (Staff Engineer if code changes needed)
-```
-**Example: Issue #208 "CLI upgrade full support for Opencode/Kilocode"**
-```
-Agents to spawn:
-1. Context Mode Architect
-2. QA Engineer
-3. DX Engineer
-4. OpenCode Architect
-5. OpenCode Staff Engineer
-6. Kilo Architect
-7. Kilo Staff Engineer
-8. Hooks Architect (CLI upgrade touches hooks)
-9. OS Compatibility Architect (CLI runs on all OS)
-```
-### 4. Claim Verification — BLOCKING GATE
-<claim_verification_enforcement>
-STOP. Before ANY agent writes implementation code, the claim in the issue MUST be verified
-with hard evidence. We shipped inheritEnvKeys because an LLM said Claude Code strips env vars
-— it doesn't. We got burned shipping a fix for an unverified claim. Never again.
-</claim_verification_enforcement>
-**Every issue makes a claim. Verify it BEFORE coding.**
-| Issue Type | Required Evidence | How to Get It |
-|------------|-------------------|---------------|
-| **Bug report** | Reproduce locally with a failing test or command | Run the exact steps from the report. If it doesn't fail, the bug may not exist. |
-| **Feature request claiming behavior X** | Prove behavior X actually happens | Check official docs, source code, or web search. NOT LLM knowledge — LLMs hallucinate platform behavior. |
-| **Feature request claiming perf issue** | Benchmark the actual impact | Measure before/after. No "it should be faster" — show numbers. |
-| **"Tool X sets env var Y"** | Find it in official source | `ctx_fetch_and_index` the platform's docs/source. Grep their repo. If you can't find it, it probably doesn't exist. |
-**Verification Steps:**
-1. **Architect agents** must produce a `CLAIM_VERDICT` before any Staff Engineer writes code:
-   ```
-   CLAIM: "{exact claim from the issue}"
-   EVIDENCE: {link to official doc, source file, or reproduction output}
-   VERDICT: CONFIRMED | UNCONFIRMED | HALLUCINATED
-   ```
-2. If `VERDICT: UNCONFIRMED` — do NOT implement. Instead, comment on the issue:
-   ```
-   We couldn't reproduce/verify this claim. Could you provide:
-   - Debug output from: npx context-mode doctor (or ctx-debug.sh)
-   - Exact steps to reproduce
-   - Platform version and OS
-   We want to fix this but need to confirm the problem exists first.
-   ```
-3. If `VERDICT: HALLUCINATED` — the reporter (or their LLM) made up a behavior that doesn't exist. Comment kindly explaining the misunderstanding. Close with "working as intended" if appropriate.
-4. Only `VERDICT: CONFIRMED` proceeds to the Investigation Phase below.
-**The `ctx-debug.sh` script exists for exactly this purpose.** When in doubt, ask the reporter to run it and paste the output.
-### 5. Investigation Phase (Parallel)
-All agents investigate simultaneously:
-**Architects** research:
-- Read relevant source files
-- Check if claimed behavior actually exists
-- Validate ENV vars against real platform docs (use WebSearch + Context7)
-- Review related closed issues for prior art
-- Report: FINDINGS with specific file:line references
-**Staff Engineers** prepare (TDD-first per [tdd.md](tdd.md)):
-- Read the code that needs changing
-- **RED**: Write a failing test that reproduces the bug / specifies new behavior
-- Run test — verify it **FAILS** (if it passes, the test is useless)
-- **GREEN**: Write minimal code to make the test pass
-- Run test — verify it **PASSES**
-- **REFACTOR**: Clean up while keeping tests green
-- Repeat for each behavior (vertical slices, never horizontal)
-- Run full affected adapter tests
-- Report: DRAFT_FIX with RED→GREEN evidence for each behavior
-### 6. Ping-Pong Review
-Route Staff Engineer outputs to their paired Architects:
-```
-EM reads Staff Engineer result
-  → Sends to Architect via Agent(SendMessage)
-  → Architect reviews: APPROVED or CHANGES_NEEDED
-  → If CHANGES_NEEDED: route back to Staff Engineer
-  → Max 2 rounds, then EM decides
-```
-### 7. Validate (QA Engineer)
-QA Engineer runs the full validation matrix:
-```shell
-# All adapter tests
-npx vitest run tests/adapters/
-# Core tests
-npx vitest run tests/core/
-# Full suite
-npm test
-# TypeScript
-npm run typecheck
-```
-Report as a matrix:
-```
-Adapter Tests:
-  ✓ claude-code    ✓ gemini-cli    ✓ opencode
-  ✓ openclaw       ✓ kilo          ✓ codex
-  ✓ vscode-copilot ✓ cursor        ✓ antigravity
-  ✓ kiro           ✓ pi            ✓ zed
-Core Tests:    ✓ routing  ✓ search  ✓ server  ✓ cli
-TypeScript:    ✓ no errors
-Full Suite:    ✓ 47/47 passed
-```
-### 8. Push Directly to `next`
-**Do NOT open a PR.** Push fixes directly to the `next` branch:
-```bash
-# Ensure we're on next
-git checkout next
-git pull origin next
-# Apply changes from worktree agents
-# ... (merge worktree changes)
-# Commit with issue reference
-git commit -m "fix: {concise description} (closes #{N})
-- {what was broken}
-- {what was fixed}
-- {which adapters/modules affected}
-Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>"
-# Push to next
-git push origin next
-```
-### 9. Comment on Issue & Close
-After pushing to `next`, comment and **close the issue immediately**:
-```bash
-gh issue comment {N} --body "$(cat <<'EOF'
-Hey @{author}! 👋
-We investigated this and pushed a fix to the `next` branch ({commit_sha}).
-**What was happening:** {technical explanation of the root cause}
-**What we fixed:** {technical explanation of the fix}
-**Affected area:** {adapter/module names}
-This will ship in the next release. Once it's out, could you please test it in your setup and let us know if it resolves the issue? 🙏
-If the fix doesn't work for you, feel free to reopen this issue.
-Thanks for reporting this!
-EOF
-)"
-# Close the issue — fix is pushed, job done
-gh issue close {N}
-```
-## Decision Tree: Fix vs. Wontfix vs. Needs Info
-```
-Issue makes a claim about platform behavior?
-├── YES → Run Claim Verification (Step 4) FIRST
-│   ├── CONFIRMED → Fix it (steps 5-9 above)
-│   ├── UNCONFIRMED → Request evidence (ctx-debug.sh output, repro steps)
-│   └── HALLUCINATED → Explain kindly, close if appropriate
-│
-Issue is clear and reproducible (no behavioral claim)?
-├── YES → Fix it (steps 5-9 above)
-├── UNCLEAR → Comment asking for reproduction steps
-│   └── Template: "Could you share the exact command/config that triggers this?"
-└── BY DESIGN → Explain why, close with "working as intended" label
-    └── Be kind — explain the design decision
-```
-## Edge Cases
-### Issue references a feature that doesn't exist
-The issue author may have been told by an LLM that a feature exists when it doesn't. Use [validation.md](validation.md) ENV verification to catch this. Comment explaining the misunderstanding kindly.
-### Issue is a duplicate
-Link to the original issue, close as duplicate, thank the reporter.
-### Issue is actually a feature request
-Re-label, add to backlog discussion, don't close — let the community weigh in.
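
The claim-verification gate in the deleted `triage-issue.md` (Step 4) can be read as a small state machine: a `CLAIM_VERDICT` block goes in, and exactly one next action comes out. The following is a minimal sketch of that idea, not code from the package. The function name `parseClaimVerdict`, the `NEXT_ACTION` map, and the action strings are hypothetical; only the three field names and the three verdict values come from the workflow text.

```javascript
// Hypothetical helper: parse a CLAIM_VERDICT block and pick the next triage action.
// Verdict values are taken from the workflow text; action names are illustrative.
const NEXT_ACTION = new Map([
  ["CONFIRMED", "investigate"],          // proceed to the Investigation Phase
  ["UNCONFIRMED", "request-evidence"],   // comment on the issue, do not implement
  ["HALLUCINATED", "explain-and-close"], // explain kindly, close if appropriate
]);

function parseClaimVerdict(text) {
  const fields = {};
  for (const line of text.split(/\r?\n/)) {
    // Accept indented lines, since the template is nested inside a numbered list.
    const match = /^\s*(CLAIM|EVIDENCE|VERDICT):\s*(.*)$/.exec(line);
    if (match) fields[match[1]] = match[2].trim();
  }
  const action = NEXT_ACTION.get(fields.VERDICT);
  if (!action) {
    throw new Error(`Unknown or missing VERDICT: ${fields.VERDICT}`);
  }
  // Assumption: a CONFIRMED verdict with no evidence should be rejected,
  // because the gate exists to demand hard evidence.
  if (fields.VERDICT === "CONFIRMED" && !fields.EVIDENCE) {
    throw new Error("CONFIRMED requires EVIDENCE");
  }
  return { ...fields, action };
}

const sample = [
  'CLAIM: "Tool X sets env var Y"',
  "EVIDENCE: not found in official docs",
  "VERDICT: UNCONFIRMED",
].join("\n");
console.log(parseClaimVerdict(sample).action); // prints "request-evidence"
```

Rejecting unknown verdicts outright, rather than defaulting to "investigate", keeps the gate blocking in the sense the workflow describes.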

package/skills/context-mode-ops/validation.md DELETED

@@ -1,307 +0,0 @@
-# Validation Patterns
-Cross-cutting validation rules used by ALL workflows (triage, review, release).
-## Problem Verification — FIRST GATE
-<problem_verification_enforcement>
-This is the FIRST validation step, before anything else. We shipped inheritEnvKeys because
-we trusted an LLM claim that Claude Code strips environment variables — it does not.
-We got burned shipping a fix for an unverified claim. Never again.
-Every bug report, feature request, and behavioral claim MUST be proven true before code is written.
-</problem_verification_enforcement>
-### For Bug Reports
-**Reproduce it or reject it.** Run the exact reproduction steps from the issue. If it doesn't fail, the bug may not exist.
-```
-Step 1: Extract the claimed reproduction steps from the issue
-Step 2: Run them locally (use ctx_execute or a test)
-Step 3: Record the ACTUAL output
-Step 4: Compare actual vs. claimed behavior
-Step 5: VERDICT:
-  → REPRODUCED: Bug is real, proceed to fix
-  → NOT_REPRODUCED: Ask reporter for ctx-debug.sh output and exact repro steps
-  → INVALID: Reporter's environment is misconfigured, help them fix it
-```
-### For Feature Requests
-**Verify the underlying claim.** Feature requests always contain an implicit claim ("X behaves this way", "Y is slow", "Z doesn't support W"). Prove the claim first.
-```
-Step 1: Identify the claim (e.g., "Claude Code strips env vars from child processes")
-Step 2: Find HARD EVIDENCE — official docs, source code, or measured benchmarks
-  → Use ctx_fetch_and_index on official docs/repos
-  → Use ctx_execute to run actual tests
-  → NEVER trust LLM knowledge about platform behavior — LLMs hallucinate this constantly
-Step 3: VERDICT:
-  → CONFIRMED: Claim is true, proceed to design
-  → UNCONFIRMED: Cannot verify — ask reporter for evidence before implementing
-  → DEBUNKED: Claim is false — comment on issue explaining the misunderstanding
-```
-### Requesting Evidence from Reporters
-When a claim cannot be verified, comment on the issue BEFORE implementing:
-```markdown
-We want to address this but need to verify the underlying behavior first.
-Could you provide:
-1. Output from: `npx context-mode doctor` (or run `ctx-debug.sh`)
-2. Exact reproduction steps
-3. Platform version, adapter, and OS
-We'll investigate as soon as we can confirm the issue. Thanks for reporting!
-```
-### Evidence Log
-Every triage MUST produce a verification entry:
-```
-CLAIM: "{exact claim}"
-SOURCE: {issue number or PR}
-EVIDENCE: {link to doc, test output, or benchmark result}
-VERDICT: CONFIRMED | UNCONFIRMED | DEBUNKED
-ACTION: {proceed | request-info | close-as-invalid}
-```
----
-## ENV Variable Verification
-LLMs frequently hallucinate environment variables. Every ENV var in an issue or PR must be verified.
-### Verification Protocol
-For EACH environment variable mentioned:
-```
-Step 1: GREP — Does it exist in context-mode source?
-  → rg "{ENV_VAR}" src/
-  → If found: VERIFIED (we already use it)
-  → If not found: continue to Step 2
-Step 2: GREP ADAPTERS — Is it in the adapter detect logic?
-  → Read src/adapters/detect.ts
-  → Check the verified env vars comment block at the top
-  → If listed: VERIFIED (we know about it)
-Step 3: WEBSEARCH — Does the platform document it?
-  → WebSearch: "{PLATFORM} {ENV_VAR} environment variable"
-  → Check official docs, GitHub repos, release notes
-  → If found in official source: REAL but we don't use it yet
-Step 4: CONTEXT7 — Library documentation check
-  → resolve-library-id for the platform
-  → query-docs for the ENV var
-  → Cross-reference with Step 3
-Step 5: VERDICT
-  → VERIFIED: We use it and it's real
-  → REAL_NEW: Platform has it but we don't use it yet
-  → HALLUCINATED: No evidence it exists — flag it
-  → DEPRECATED: Used to exist but was removed
-```
-### Known Verified ENV Vars (Reference)
-| Platform | Verified ENV Vars | Source |
-|----------|------------------|--------|
-| Claude Code | `CLAUDE_PROJECT_DIR`, `CLAUDE_SESSION_ID` | src/adapters/detect.ts |
-| Gemini CLI | `GEMINI_PROJECT_DIR`, `GEMINI_CLI` | src/adapters/detect.ts |
-| OpenCode | `OPENCODE`, `OPENCODE_PID` | src/adapters/detect.ts |
-| OpenClaw | `OPENCLAW_HOME`, `OPENCLAW_CLI` | src/adapters/detect.ts |
-| Kilo | `KILO`, `KILO_PID` | src/adapters/detect.ts |
-| Codex | `CODEX_CI`, `CODEX_THREAD_ID` | src/adapters/detect.ts |
-| VS Code Copilot | `VSCODE_PID`, `VSCODE_CWD` | src/adapters/detect.ts |
-| Cursor | `CURSOR_TRACE_ID`, `CURSOR_CLI` | src/adapters/detect.ts |
-| Override | `CONTEXT_MODE_PLATFORM` | src/adapters/detect.ts |
-Any ENV var NOT in this table must go through the full verification protocol.
-## Adapter Test Matrix
-### Full Matrix Run
-```shell
-# Run ALL adapter tests
-npx vitest run tests/adapters/
-# Individual adapter (for targeted testing)
-npx vitest run tests/adapters/claude-code.test.ts
-npx vitest run tests/adapters/gemini-cli.test.ts
-npx vitest run tests/adapters/opencode.test.ts
-npx vitest run tests/adapters/openclaw.test.ts
-npx vitest run tests/adapters/kilo.test.ts
-npx vitest run tests/adapters/codex.test.ts
-npx vitest run tests/adapters/vscode-copilot.test.ts
-npx vitest run tests/adapters/cursor.test.ts
-npx vitest run tests/adapters/antigravity.test.ts
-npx vitest run tests/adapters/kiro.test.ts
-npx vitest run tests/adapters/zed.test.ts
-# Detection logic
-npx vitest run tests/adapters/detect.test.ts
-npx vitest run tests/adapters/client-map.test.ts
-```
-### Report Format
-```
-ADAPTER TEST MATRIX
-═══════════════════
-claude-code     ✓ 5/5    gemini-cli      ✓ 4/4
-opencode        ✓ 6/6    openclaw        ✓ 3/3
-kilo            ✓ 4/4    codex           ✓ 3/3
-vscode-copilot  ✓ 4/4    cursor          ✓ 3/3
-antigravity     ✓ 2/2    kiro            ✓ 3/3
-pi              ✓ 2/2    zed             ✓ 2/2
-detect          ✓ 8/8    client-map      ✓ 6/6
-───────────────────────────────────────────
-TOTAL: {N}/{N} passed | 0 failed
-```
-## Core Module Tests
-```shell
-# Core tests
-npx vitest run tests/core/routing.test.ts
-npx vitest run tests/core/search.test.ts
-npx vitest run tests/core/server.test.ts
-npx vitest run tests/core/cli.test.ts
-# Module tests
-npx vitest run tests/store.test.ts
-npx vitest run tests/executor.test.ts
-npx vitest run tests/security.test.ts
-npx vitest run tests/formatters.test.ts
-# Hook tests
-npx vitest run tests/hooks/
-# Full suite
-npm test
-```
-## OS Compatibility Checks
-### Path Handling
-```javascript
-// WRONG — breaks on Windows
-const configPath = homedir() + "/.config/opencode/config.json";
-// CORRECT — works everywhere
-const configPath = path.join(homedir(), ".config", "opencode", "config.json");
-```
-Grep for potential issues:
-```shell
-# String concatenation with path separators
-rg "homedir\(\)\s*\+" src/
-rg '"/\.' src/
-rg "'\\./" src/
-# Direct slash usage in paths (should use path.join)
-rg 'path\s*=.*"/' src/ --type ts
-```
-### Temp Directory
-```javascript
-// WRONG — hardcoded /tmp
-const tmpFile = "/tmp/context-mode-output.txt";
-// CORRECT — uses OS temp dir
-const tmpFile = path.join(os.tmpdir(), "context-mode-output.txt");
-```
-Grep for hardcoded temp:
-```shell
-rg '"/tmp/' src/
-rg "'/tmp/" src/
-```
-### Native Bindings (better-sqlite3)
-Check that `better-sqlite3` is in `optionalDependencies` (not `dependencies`) and the code handles the case where it's not available:
-```shell
-rg "better-sqlite3" src/ --type ts
-rg "optionalDependencies" package.json
-```
-### Process Spawn
-```javascript
-// WRONG — shell: true behaves differently on Windows
-spawn("command", { shell: true });
-// CORRECT — explicit shell selection
-spawn("command", { shell: process.platform === "win32" ? "cmd.exe" : "/bin/sh" });
-```
-## Hook Format Validation
-Each platform has different hook formats. Verify changes match:
-| Platform | Hook Format | Key Differences |
-|----------|------------|-----------------|
-| Claude Code | `hooks.json` in plugin dir | `PreToolUse`, `PostToolUse`, `PreCompact`, `SessionStart` |
-| Gemini CLI | `~/.gemini/settings.json` | `BeforeTool`, `AfterTool`, `PreCompress`, `SessionStart` + `matcher` |
-| VS Code Copilot | `.github/hooks/*.json` | Same as Claude Code but separate file |
-| Cursor | `.cursor/hooks.json` | No `SessionStart` (injects via file instead) |
-| OpenCode | `opencode.json` | Uses `agents` section, not traditional hooks |
-| OpenClaw | `openclaw.plugin.json` | Extension model, not hook-based |
-## Security Checks
-### Sandbox Escape
-```shell
-# File writing attempts through ctx_execute
-rg "writeFile|appendFile|createWriteStream" src/executor.ts
-# Path traversal
-rg "\.\.\/" src/ --type ts
-# Command injection vectors
-rg 'exec\(.*\$\{' src/ --type ts
-rg 'spawn\(.*\$\{' src/ --type ts
-```
-### Information Disclosure
-```shell
-# Sensitive paths
-rg "process\.env\b" src/ --type ts | grep -v "test"
-# Home directory exposure
-rg "homedir\(\)" src/ --type ts
-```
-## TypeScript Validation
-```bash
-# Full type check
-npm run typecheck
-# Should report 0 errors
-# If errors exist, they MUST be fixed before shipping
-```
-## Pre-Ship Checklist
-Every change, regardless of workflow, must pass:
-- [ ] **Problem verified** — CLAIM_VERDICT is CONFIRMED with hard evidence (this is gate zero)
-- [ ] `npm run typecheck` — 0 errors
-- [ ] `npm test` — all pass
-- [ ] Adapter tests — all 12 pass (or N/A if untouched)
-- [ ] ENV vars — all verified against real platform source
-- [ ] Path handling — no hardcoded separators
-- [ ] Hook format — matches target platform's schema
-- [ ] No security regressions
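
The ENV Variable Verification protocol in the deleted `validation.md` ends in one of four verdicts, and the reference table lets known variables skip the full protocol. A rough sketch of that decision follows. The variable names are copied from the "Known Verified ENV Vars" table; the function `envVarVerdict`, the shape of the `evidence` object, and the order in which the flags are checked are assumptions made for illustration, not part of the package.

```javascript
// Names copied from the "Known Verified ENV Vars" reference table.
const KNOWN_VERIFIED = new Set([
  "CLAUDE_PROJECT_DIR", "CLAUDE_SESSION_ID",
  "GEMINI_PROJECT_DIR", "GEMINI_CLI",
  "OPENCODE", "OPENCODE_PID",
  "OPENCLAW_HOME", "OPENCLAW_CLI",
  "KILO", "KILO_PID",
  "CODEX_CI", "CODEX_THREAD_ID",
  "VSCODE_PID", "VSCODE_CWD",
  "CURSOR_TRACE_ID", "CURSOR_CLI",
  "CONTEXT_MODE_PLATFORM",
]);

// Hypothetical evidence shape: booleans a reviewer records after Steps 1-4.
//   inSource        - Step 1: found by grepping the context-mode source
//   inDetect        - Step 2: listed in the adapter detect logic
//   inOfficialDocs  - Steps 3-4: documented by the platform
//   removedUpstream - the platform used to have it but removed it
function envVarVerdict(name, evidence = {}) {
  if (KNOWN_VERIFIED.has(name)) return "VERIFIED";
  if (evidence.inSource || evidence.inDetect) return "VERIFIED";
  // Assumption: removal takes precedence over old documentation still online.
  if (evidence.removedUpstream) return "DEPRECATED";
  if (evidence.inOfficialDocs) return "REAL_NEW";
  return "HALLUCINATED";
}

console.log(envVarVerdict("CLAUDE_PROJECT_DIR")); // prints "VERIFIED"
// "EXAMPLE_MADE_UP_VAR" is a placeholder name, not a real variable.
console.log(envVarVerdict("EXAMPLE_MADE_UP_VAR")); // prints "HALLUCINATED"
```

Defaulting to `HALLUCINATED` when no evidence is recorded mirrors the protocol's stance that a variable is unverified until a real source is found.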

package/skills/diagnose/SKILL.md DELETED

@@ -1,122 +0,0 @@
----
-name: diagnose
-description: Disciplined diagnosis loop for hard bugs and performance regressions. Reproduce → minimise → hypothesise → instrument → fix → regression-test. Use when user says "diagnose this" / "debug this", reports a bug, says something is broken/throwing/failing, or describes a performance regression.
----
-# Diagnose
-A discipline for hard bugs. Skip phases only when explicitly justified.
-When exploring the codebase, use the project's domain glossary to get a clear mental model of the relevant modules, and check ADRs in the area you're touching.
-## Phase 1 — Build a feedback loop
-**This is the skill.** Everything else is mechanical. If you have a fast, deterministic, agent-runnable pass/fail signal for the bug, you will find the cause — bisection, hypothesis-testing, and instrumentation all just consume that signal. If you don't have one, no amount of staring at code will save you.
-Spend disproportionate effort here. **Be aggressive. Be creative. Refuse to give up.**
-### Ways to construct one — try them in roughly this order
-1. **Failing test** at whatever seam reaches the bug — unit, integration, e2e.
-2. **Curl / HTTP script** against a running dev server.
-3. **CLI invocation** with a fixture input, diffing stdout against a known-good snapshot.
-4. **Headless browser script** (Playwright / Puppeteer) — drives the UI, asserts on DOM/console/network.
-5. **Replay a captured trace.** Save a real network request / payload / event log to disk; replay it through the code path in isolation.
-6. **Throwaway harness.** Spin up a minimal subset of the system (one service, mocked deps) that exercises the bug code path with a single function call.
-7. **Property / fuzz loop.** If the bug is "sometimes wrong output", run 1000 random inputs and look for the failure mode.
-8. **Bisection harness.** If the bug appeared between two known states (commit, dataset, version), automate "boot at state X, check, repeat" so you can `git bisect run` it.
-9. **Differential loop.** Run the same input through old-version vs new-version (or two configs) and diff outputs.
-10. **HITL bash script.** Last resort. If a human must click, drive _them_ with `scripts/hitl-loop.template.sh` so the loop is still structured. Captured output feeds back to you.
-Build the right feedback loop, and the bug is 90% fixed.
-### Iterate on the loop itself
-Treat the loop as a product. Once you have _a_ loop, ask:
-- Can I make it faster? (Cache setup, skip unrelated init, narrow the test scope.)
-- Can I make the signal sharper? (Assert on the specific symptom, not "didn't crash".)
-- Can I make it more deterministic? (Pin time, seed RNG, isolate filesystem, freeze network.)
-A 30-second flaky loop is barely better than no loop. A 2-second deterministic loop is a debugging superpower.
-### Non-deterministic bugs
-The goal is not a clean repro but a **higher reproduction rate**. Loop the trigger 100×, parallelise, add stress, narrow timing windows, inject sleeps. A 50%-flake bug is debuggable; 1% is not — keep raising the rate until it's debuggable.
-### When you genuinely cannot build a loop
-Stop and say so explicitly. List what you tried. Ask the user for: (a) access to whatever environment reproduces it, (b) a captured artifact (HAR file, log dump, core dump, screen recording with timestamps), or (c) permission to add temporary production instrumentation. Do **not** proceed to hypothesise without a loop.
-Do not proceed to Phase 2 until you have a loop you believe in.
-## Phase 2 — Reproduce
-Run the loop. Watch the bug appear.
-Confirm:
-- [ ] The loop produces the failure mode the **user** described — not a different failure that happens to be nearby. Wrong bug = wrong fix.
-- [ ] The failure is reproducible across multiple runs (or, for non-deterministic bugs, reproducible at a high enough rate to debug against).
-- [ ] You have captured the exact symptom (error message, wrong output, slow timing) so later phases can verify the fix actually addresses it.
-Do not proceed until you reproduce the bug.
-## Phase 3 — Hypothesise
-Generate **3–5 ranked hypotheses** before testing any of them. Single-hypothesis generation anchors on the first plausible idea.
-Each hypothesis must be **falsifiable**: state the prediction it makes.
-> Format: "If <X> is the cause, then <changing Y> will make the bug disappear / <changing Z> will make it worse."
-If you cannot state the prediction, the hypothesis is a vibe — discard or sharpen it.
-**Show the ranked list to the user before testing.** They often have domain knowledge that re-ranks instantly ("we just deployed a change to #3"), or know hypotheses they've already ruled out. Cheap checkpoint, big time saver. Don't block on it — proceed with your ranking if the user is AFK.
-## Phase 4 — Instrument
-Each probe must map to a specific prediction from Phase 3. **Change one variable at a time.**
-Tool preference:
-1. **Debugger / REPL inspection** if the env supports it. One breakpoint beats ten logs.
-2. **Targeted logs** at the boundaries that distinguish hypotheses.
-3. Never "log everything and grep".
-**Tag every debug log** with a unique prefix, e.g. `[DEBUG-a4f2]`. Cleanup at the end becomes a single grep. Untagged logs survive; tagged logs die.
-**Perf branch.** For performance regressions, logs are usually wrong. Instead: establish a baseline measurement (timing harness, `performance.now()`, profiler, query plan), then bisect. Measure first, fix second.
-## Phase 5 — Fix + regression test
-Write the regression test **before the fix** — but only if there is a **correct seam** for it.
-A correct seam is one where the test exercises the **real bug pattern** as it occurs at the call site. If the only available seam is too shallow (single-caller test when the bug needs multiple callers, unit test that can't replicate the chain that triggered the bug), a regression test there gives false confidence.
-**If no correct seam exists, that itself is the finding.** Note it. The codebase architecture is preventing the bug from being locked down. Flag this for the next phase.
-If a correct seam exists:
-1. Turn the minimised repro into a failing test at that seam.
-2. Watch it fail.
-3. Apply the fix.
-4. Watch it pass.
-5. Re-run the Phase 1 feedback loop against the original (un-minimised) scenario.
-## Phase 6 — Cleanup + post-mortem
-Required before declaring done:
-- [ ] Original repro no longer reproduces (re-run the Phase 1 loop)
-- [ ] Regression test passes (or absence of seam is documented)
-- [ ] All `[DEBUG-...]` instrumentation removed (`grep` the prefix)
-- [ ] Throwaway prototypes deleted (or moved to a clearly-marked debug location)
-- [ ] The hypothesis that turned out correct is stated in the commit / PR message — so the next debugger learns
-**Then ask: what would have prevented this bug?** If the answer involves architectural change (no good test seam, tangled callers, hidden coupling) hand off to the `/improve-codebase-architecture` skill with the specifics. Make the recommendation **after** the fix is in, not before — you have more information now than when you started.
----
-_Vendored from [mattpocock/skills](https://github.com/mattpocock/skills) @ `b843cb5` — MIT License. See [skills/UPSTREAM-CREDITS.md](../UPSTREAM-CREDITS.md) for refresh instructions._
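
The "Non-deterministic bugs" advice in the deleted `diagnose/SKILL.md` (loop the trigger many times and keep raising the reproduction rate) can be illustrated with a few lines. This is a sketch under stated assumptions, not part of the skill: `measureFailureRate` is a hypothetical name, the trigger is assumed to be synchronous and to signal failure by throwing, and the every-fourth-call failure below is a deterministic stand-in for a real flaky code path.

```javascript
// Hypothetical helper: run a trigger repeatedly and report how often it fails.
// Assumes a synchronous trigger that throws on failure; an async trigger
// would need to be awaited inside the loop instead.
function measureFailureRate(trigger, runs = 100) {
  let failures = 0;
  for (let i = 0; i < runs; i++) {
    try {
      trigger(i);
    } catch {
      failures++;
    }
  }
  return { runs, failures, rate: failures / runs };
}

// Stand-in for a flaky code path: fails on every fourth call.
let calls = 0;
function flakyTrigger() {
  calls++;
  if (calls % 4 === 0) throw new Error("simulated flake");
}

const result = measureFailureRate(flakyTrigger, 100);
console.log(result.rate); // prints 0.25
```

Re-running the same measurement after each change to the loop (added stress, narrower timing window) gives a number to compare, which is what makes "raise the rate" an actionable step.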