@rune-kit/rune 2.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (155)
  1. package/LICENSE +21 -0
  2. package/README.md +357 -0
  3. package/agents/.gitkeep +0 -0
  4. package/agents/architect.md +29 -0
  5. package/agents/asset-creator.md +11 -0
  6. package/agents/audit.md +11 -0
  7. package/agents/autopsy.md +11 -0
  8. package/agents/brainstorm.md +11 -0
  9. package/agents/browser-pilot.md +11 -0
  10. package/agents/coder.md +29 -0
  11. package/agents/completion-gate.md +11 -0
  12. package/agents/constraint-check.md +11 -0
  13. package/agents/context-engine.md +11 -0
  14. package/agents/cook.md +11 -0
  15. package/agents/db.md +11 -0
  16. package/agents/debug.md +11 -0
  17. package/agents/dependency-doctor.md +11 -0
  18. package/agents/deploy.md +11 -0
  19. package/agents/design.md +11 -0
  20. package/agents/docs-seeker.md +11 -0
  21. package/agents/fix.md +11 -0
  22. package/agents/hallucination-guard.md +11 -0
  23. package/agents/incident.md +11 -0
  24. package/agents/integrity-check.md +11 -0
  25. package/agents/journal.md +11 -0
  26. package/agents/launch.md +11 -0
  27. package/agents/logic-guardian.md +11 -0
  28. package/agents/marketing.md +11 -0
  29. package/agents/onboard.md +11 -0
  30. package/agents/perf.md +11 -0
  31. package/agents/plan.md +11 -0
  32. package/agents/preflight.md +11 -0
  33. package/agents/problem-solver.md +11 -0
  34. package/agents/rescue.md +11 -0
  35. package/agents/research.md +11 -0
  36. package/agents/researcher.md +29 -0
  37. package/agents/review-intake.md +11 -0
  38. package/agents/review.md +11 -0
  39. package/agents/reviewer.md +28 -0
  40. package/agents/safeguard.md +11 -0
  41. package/agents/sast.md +11 -0
  42. package/agents/scanner.md +28 -0
  43. package/agents/scope-guard.md +11 -0
  44. package/agents/scout.md +11 -0
  45. package/agents/sentinel.md +11 -0
  46. package/agents/sequential-thinking.md +11 -0
  47. package/agents/session-bridge.md +11 -0
  48. package/agents/skill-forge.md +11 -0
  49. package/agents/skill-router.md +11 -0
  50. package/agents/surgeon.md +11 -0
  51. package/agents/team.md +11 -0
  52. package/agents/test.md +11 -0
  53. package/agents/trend-scout.md +11 -0
  54. package/agents/verification.md +11 -0
  55. package/agents/video-creator.md +11 -0
  56. package/agents/watchdog.md +11 -0
  57. package/agents/worktree.md +11 -0
  58. package/commands/.gitkeep +0 -0
  59. package/commands/rune.md +168 -0
  60. package/compiler/__tests__/openclaw-adapter.test.js +140 -0
  61. package/compiler/__tests__/parser.test.js +55 -0
  62. package/compiler/adapters/antigravity.js +59 -0
  63. package/compiler/adapters/claude.js +37 -0
  64. package/compiler/adapters/cursor.js +67 -0
  65. package/compiler/adapters/generic.js +60 -0
  66. package/compiler/adapters/index.js +45 -0
  67. package/compiler/adapters/openclaw.js +150 -0
  68. package/compiler/adapters/windsurf.js +60 -0
  69. package/compiler/bin/rune.js +288 -0
  70. package/compiler/doctor.js +153 -0
  71. package/compiler/emitter.js +240 -0
  72. package/compiler/parser.js +208 -0
  73. package/compiler/transformer.js +69 -0
  74. package/compiler/transforms/branding.js +27 -0
  75. package/compiler/transforms/cross-references.js +29 -0
  76. package/compiler/transforms/frontmatter.js +38 -0
  77. package/compiler/transforms/hooks.js +68 -0
  78. package/compiler/transforms/subagents.js +36 -0
  79. package/compiler/transforms/tool-names.js +60 -0
  80. package/contexts/dev.md +34 -0
  81. package/contexts/research.md +43 -0
  82. package/contexts/review.md +55 -0
  83. package/extensions/ai-ml/PACK.md +517 -0
  84. package/extensions/analytics/PACK.md +557 -0
  85. package/extensions/backend/PACK.md +678 -0
  86. package/extensions/chrome-ext/PACK.md +995 -0
  87. package/extensions/content/PACK.md +381 -0
  88. package/extensions/devops/PACK.md +520 -0
  89. package/extensions/ecommerce/PACK.md +280 -0
  90. package/extensions/gamedev/PACK.md +393 -0
  91. package/extensions/mobile/PACK.md +273 -0
  92. package/extensions/saas/PACK.md +805 -0
  93. package/extensions/security/PACK.md +536 -0
  94. package/extensions/trading/PACK.md +597 -0
  95. package/extensions/ui/PACK.md +947 -0
  96. package/package.json +47 -0
  97. package/skills/.gitkeep +0 -0
  98. package/skills/adversary/SKILL.md +271 -0
  99. package/skills/asset-creator/SKILL.md +157 -0
  100. package/skills/audit/SKILL.md +466 -0
  101. package/skills/autopsy/SKILL.md +200 -0
  102. package/skills/ba/SKILL.md +279 -0
  103. package/skills/brainstorm/SKILL.md +266 -0
  104. package/skills/browser-pilot/SKILL.md +168 -0
  105. package/skills/completion-gate/SKILL.md +151 -0
  106. package/skills/constraint-check/SKILL.md +165 -0
  107. package/skills/context-engine/SKILL.md +176 -0
  108. package/skills/cook/SKILL.md +636 -0
  109. package/skills/db/SKILL.md +256 -0
  110. package/skills/debug/SKILL.md +240 -0
  111. package/skills/dependency-doctor/SKILL.md +235 -0
  112. package/skills/deploy/SKILL.md +174 -0
  113. package/skills/design/DESIGN-REFERENCE.md +365 -0
  114. package/skills/design/SKILL.md +462 -0
  115. package/skills/doc-processor/SKILL.md +254 -0
  116. package/skills/docs/SKILL.md +336 -0
  117. package/skills/docs-seeker/SKILL.md +166 -0
  118. package/skills/fix/SKILL.md +192 -0
  119. package/skills/git/SKILL.md +285 -0
  120. package/skills/hallucination-guard/SKILL.md +204 -0
  121. package/skills/incident/SKILL.md +241 -0
  122. package/skills/integrity-check/SKILL.md +169 -0
  123. package/skills/journal/SKILL.md +190 -0
  124. package/skills/launch/SKILL.md +330 -0
  125. package/skills/logic-guardian/SKILL.md +240 -0
  126. package/skills/marketing/SKILL.md +229 -0
  127. package/skills/mcp-builder/SKILL.md +311 -0
  128. package/skills/onboard/SKILL.md +298 -0
  129. package/skills/perf/SKILL.md +297 -0
  130. package/skills/plan/SKILL.md +520 -0
  131. package/skills/preflight/SKILL.md +231 -0
  132. package/skills/problem-solver/SKILL.md +284 -0
  133. package/skills/rescue/SKILL.md +434 -0
  134. package/skills/research/SKILL.md +122 -0
  135. package/skills/review/SKILL.md +354 -0
  136. package/skills/review-intake/SKILL.md +222 -0
  137. package/skills/safeguard/SKILL.md +188 -0
  138. package/skills/sast/SKILL.md +190 -0
  139. package/skills/scaffold/SKILL.md +276 -0
  140. package/skills/scope-guard/SKILL.md +150 -0
  141. package/skills/scout/SKILL.md +232 -0
  142. package/skills/sentinel/SKILL.md +320 -0
  143. package/skills/sentinel-env/SKILL.md +226 -0
  144. package/skills/sequential-thinking/SKILL.md +234 -0
  145. package/skills/session-bridge/SKILL.md +287 -0
  146. package/skills/skill-forge/SKILL.md +317 -0
  147. package/skills/skill-router/SKILL.md +267 -0
  148. package/skills/surgeon/SKILL.md +203 -0
  149. package/skills/team/SKILL.md +397 -0
  150. package/skills/test/SKILL.md +271 -0
  151. package/skills/trend-scout/SKILL.md +145 -0
  152. package/skills/verification/SKILL.md +201 -0
  153. package/skills/video-creator/SKILL.md +201 -0
  154. package/skills/watchdog/SKILL.md +166 -0
  155. package/skills/worktree/SKILL.md +140 -0
@@ -0,0 +1,168 @@
1
+ ---
2
+ name: browser-pilot
3
+ description: Playwright browser automation. Navigates URLs, takes screenshots, checks accessibility tree, interacts with UI elements, and reports findings.
4
+ metadata:
5
+ author: runedev
6
+ version: "0.2.0"
7
+ layer: L3
8
+ model: sonnet
9
+ group: media
10
+ tools: "Read, Bash, Glob, Grep"
11
+ ---
12
+
13
+ # browser-pilot
14
+
15
+ ## Purpose
16
+
17
+ Browser automation for testing and verification using MCP Playwright tools. Navigates to URLs, captures accessibility snapshots and screenshots, interacts with UI elements (click, type, fill form), and reports findings with visual evidence.
18
+
19
+ ## Called By (inbound)
20
+
21
+ - `test` (L2): e2e and visual testing
22
+ - `deploy` (L2): verify live deployment
23
+ - `debug` (L2): capture browser console errors
24
+ - `marketing` (L2): screenshot for assets
25
+ - `launch` (L1): verify live site after deployment
26
+ - `perf` (L2): Lighthouse / Core Web Vitals measurement
27
+
28
+ ## Calls (outbound)
29
+
30
+ None — pure L3 utility using Playwright MCP tools.
31
+
32
+ ## Executable Instructions
33
+
34
+ ### Step 1: Receive Task
35
+
36
+ Accept input from calling skill:
37
+ - `url` — target URL to open
38
+ - `task` — what to do: `screenshot` | `check_elements` | `fill_form` | `test_flow` | `console_errors`
39
+ - `interactions` — optional list of actions (click X, type Y into Z, etc.)
40
+
41
+ ### Step 2: Navigate
42
+
43
+ Open the target URL using the Playwright MCP navigate tool:
44
+
45
+ ```
46
+ mcp__plugin_playwright_playwright__browser_navigate({ url: "<url>" })
47
+ ```
48
+
49
+ Wait for the page to load. If navigation fails (timeout or error), report UNREACHABLE and stop further page work; Step 7 (Close) still applies.
50
+
51
+ ### Step 3: Snapshot
52
+
53
+ Capture the accessibility tree to understand page structure:
54
+
55
+ ```
56
+ mcp__plugin_playwright_playwright__browser_snapshot()
57
+ ```
58
+
59
+ Use the snapshot to:
60
+ - Identify interactive elements (buttons, inputs, links)
61
+ - Find specific elements referenced in the task
62
+ - Detect accessibility issues (missing labels, roles)
63
+
64
+ ### Step 4: Interact
65
+
66
+ Based on the task, perform interactions using Playwright MCP tools:
67
+
68
+ - **Click**: `mcp__plugin_playwright_playwright__browser_click({ ref: "<ref>", element: "<description>" })`
69
+ - **Type**: `mcp__plugin_playwright_playwright__browser_type({ ref: "<ref>", text: "<value>" })`
70
+ - **Fill form**: `mcp__plugin_playwright_playwright__browser_fill_form({ fields: [...] })`
71
+ - **Navigate back**: `mcp__plugin_playwright_playwright__browser_navigate_back()`
72
+ - **Select option**: `mcp__plugin_playwright_playwright__browser_select_option({ ref: "<ref>", values: [...] })`
73
+
74
+ Limit: max 20 interactions per session. If the task requires more, stop and report partial results.
75
+
76
+ After each interaction, take a new snapshot to verify the result before proceeding.
77
+
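The interaction cap can be sketched as below. This is a minimal illustration, not part of the skill: `run_interactions` and the `perform` callback are hypothetical names, and `perform` is assumed to execute one action through the Playwright MCP tools and take the follow-up snapshot.

```python
MAX_INTERACTIONS = 20  # limit stated in Step 4


def run_interactions(actions, perform):
    """Run actions in order and stop at the cap.

    Returns (log, remaining): the log of (action, result) pairs that ran,
    and the actions that were skipped because the cap was reached.
    """
    actions = list(actions)
    log = []
    for index, action in enumerate(actions):
        if index >= MAX_INTERACTIONS:
            # Cap reached: hand back what is left so the report can be PARTIAL.
            return log, actions[index:]
        try:
            result = perform(action)  # hypothetical: act, then re-snapshot
        except Exception as exc:
            result = f"error: {exc}"
        log.append((action, result))
    return log, []
```

A non-empty `remaining` list is the signal to set the report status to PARTIAL and to list what was not tested.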
78
+ ### Step 5: Screenshot
79
+
80
+ Capture visual evidence:
81
+
82
+ ```
83
+ mcp__plugin_playwright_playwright__browser_take_screenshot({ type: "png" })
84
+ ```
85
+
86
+ For full-page capture (landing pages, long content):
87
+
88
+ ```
89
+ mcp__plugin_playwright_playwright__browser_take_screenshot({ type: "png", fullPage: true })
90
+ ```
91
+
92
+ Save with a descriptive filename if the `filename` param is supported.
93
+
94
+ ### Step 6: Report
95
+
96
+ Compile findings into a structured report:
97
+
98
+ ```
99
+ ## Browser Report: [url]
100
+
101
+ - **Task**: [task description]
102
+ - **Status**: SUCCESS | PARTIAL | FAILED
103
+
104
+ ### Page Info
105
+ - HTTP Status: [status]
106
+ - Load outcome: [loaded | timeout | error]
107
+
108
+ ### Accessibility Findings
109
+ - [finding from snapshot — missing labels, broken roles, etc.]
110
+
111
+ ### Interaction Log
112
+ - [action taken] → [result: success | element not found | error]
113
+
114
+ ### Console Errors
115
+ - [error message — source]
116
+
117
+ ### Screenshots
118
+ - [screenshot path or description]
119
+
120
+ ### Summary
121
+ - [overall assessment — what works, what failed, any critical issues]
122
+ ```
123
+
124
+ ### Step 7: Close
125
+
126
+ Always close the browser when done:
127
+
128
+ ```
129
+ mcp__plugin_playwright_playwright__browser_close()
130
+ ```
131
+
132
+ This step is mandatory even if earlier steps fail. Use a try-finally pattern in your reasoning.
133
+
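The try-finally pattern mentioned above looks roughly like this in code. It is a sketch under assumptions: `run_session`, `steps`, and `close_browser` are hypothetical stand-ins for the skill's steps and for the `browser_close` MCP call.

```python
def run_session(steps, close_browser):
    """Run each step; always call close_browser, even when a step fails."""
    report = {"status": "SUCCESS", "errors": []}
    try:
        for step in steps:
            step()
    except Exception as exc:
        # A failed step ends the session but must not skip the close.
        report["status"] = "FAILED"
        report["errors"].append(str(exc))
    finally:
        close_browser()  # Step 7 runs on success and on failure
    return report
```

The `finally` branch is the point of the sketch: the close call does not depend on how the earlier steps ended.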
134
+ ## Output Format
135
+
136
+ Structured Browser Report with task status, page info, accessibility findings, interaction log, console errors, screenshots, and summary. See Step 6 Report above for full template.
137
+
138
+ ## Constraints
139
+
140
+ 1. MUST close browser when done — Step 7 is non-optional even if earlier steps fail
141
+ 2. MUST NOT exceed 20 interactions per session
142
+ 3. MUST NOT store credentials or sensitive data in interaction logs
143
+ 4. MUST take screenshot evidence before reporting visual findings
144
+
145
+ ## Sharp Edges
146
+
147
+ Known failure modes for this skill. Check these before declaring done.
148
+
149
+ | Failure Mode | Severity | Mitigation |
150
+ |---|---|---|
151
+ | Not closing browser when done (including on error) | CRITICAL | Constraint 1: Step 7 browser_close() is mandatory — treat as try-finally |
152
+ | Storing credentials or tokens in interaction logs | HIGH | Constraint 3: redact all sensitive values before logging |
153
+ | Exceeding 20 interactions without stopping and reporting partial | MEDIUM | Constraint 2: stop at 20, report what was tested and what remains |
154
+ | Reporting visual findings without screenshot evidence | MEDIUM | Constraint 4: screenshot before reporting — "looks broken" without screenshot is invalid |
155
+
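One way to honor Constraint 3 is to redact log entries before they are written. The sketch below is illustrative only: the `SENSITIVE_MARKERS` list and the `redact` helper are assumptions, not something the skill defines, and a real list would need to match the fields the target site uses.

```python
# Assumption: key names containing these markers hold sensitive values.
SENSITIVE_MARKERS = ("password", "token", "secret", "api_key")


def redact(entry):
    """Return a copy of a log entry with sensitive values masked."""
    return {
        key: "[REDACTED]" if any(m in key.lower() for m in SENSITIVE_MARKERS) else value
        for key, value in entry.items()
    }
```

Redacting by key name is a deliberately simple choice; it will miss secrets typed into fields with neutral names, so it complements rather than replaces care in what gets logged.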
156
+ ## Done When
157
+
158
+ - URL navigated successfully (or UNREACHABLE reported)
159
+ - Page snapshot captured for accessibility context
160
+ - All requested interactions completed (or partial with reason if >20)
161
+ - Screenshot taken as visual evidence
162
+ - Console errors captured if task requested them
163
+ - Browser closed (Step 7 executed)
164
+ - Browser Report emitted with status, findings, and screenshot reference
165
+
166
+ ## Cost Profile
167
+
168
+ ~500-1500 tokens input, ~300-800 tokens output. Sonnet for interaction logic.
@@ -0,0 +1,151 @@
1
+ ---
2
+ name: completion-gate
3
+ description: "Validates agent claims against evidence trail. Catches 'done' without proof, 'tests pass' without output, 'fixed' without verification. Called by cook and team at workflow end."
4
+ user-invocable: false
5
+ metadata:
6
+ author: runedev
7
+ version: "1.1.0"
8
+ layer: L3
9
+ model: haiku
10
+ group: validation
11
+ tools: "Read, Bash, Glob, Grep"
12
+ ---
13
+
14
+ # completion-gate
15
+
16
+ ## Purpose
17
+
18
+ The lie detector for agent claims. Validates that what an agent says it did actually happened — with evidence. Catches the #1 failure mode in AI coding: claiming completion without proof.
19
+
20
+ <HARD-GATE>
21
+ Every claim requires evidence. No evidence = UNCONFIRMED = BLOCK.
22
+ "I ran the tests and they pass" without stdout = UNCONFIRMED.
23
+ "I fixed the bug" without before/after diff = UNCONFIRMED.
24
+ "Build succeeds" without build output = UNCONFIRMED.
25
+ </HARD-GATE>
26
+
27
+ ## Triggers
28
+
29
+ - Called by `cook` in Phase 5d (quality gate)
30
+ - Called by `team` before merging stream results
31
+ - Called by any skill that reports "done" to an orchestrator
32
+ - Auto-trigger: when agent says "done", "complete", "fixed", "passing"
33
+
34
+ ## Calls (outbound)
35
+
36
+ None — pure validator. Reads evidence, produces verdict.
37
+
38
+ ## Called By (inbound)
39
+
40
+ - `cook` (L1): Phase 5d — validate completion claims before commit
41
+ - `team` (L1): validate cook reports from parallel streams
42
+
43
+ ## Execution
44
+
45
+ ### Step 1 — Collect Claims
46
+
47
+ Parse the agent's output for completion claims. Common claim patterns:
48
+
49
+ ```
50
+ CLAIM PATTERNS:
51
+ "tests pass" / "all tests passing" / "test suite green"
52
+ "build succeeds" / "build complete" / "compiles clean"
53
+ "no lint errors" / "lint clean"
54
+ "fixed" / "resolved" / "bug is gone"
55
+ "implemented" / "feature complete" / "done"
56
+ "no security issues" / "sentinel passed"
57
+ ```
58
+
59
+ Extract each claim as: `{ claim: string, source_skill: string }`
60
+
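A minimal sketch of claim extraction, assuming plain case-insensitive phrase matching over the agent's output. The phrase lists come from the claim patterns above; the `collect_claims` name and the choice to report one entry per claim type are assumptions made for illustration.

```python
import re

CLAIM_PATTERNS = {
    "tests pass": ["tests pass", "all tests passing", "test suite green"],
    "build succeeds": ["build succeeds", "build complete", "compiles clean"],
    "lint clean": ["no lint errors", "lint clean"],
    "fixed": ["fixed", "resolved", "bug is gone"],
    "implemented": ["implemented", "feature complete", "done"],
    "no security issues": ["no security issues", "sentinel passed"],
}


def collect_claims(text, source_skill):
    """Return one {claim, source_skill} record per claim type found in text."""
    lowered = text.lower()
    claims = []
    for claim_type, phrases in CLAIM_PATTERNS.items():
        # Whole-phrase match so "prefixed" does not count as "fixed".
        if any(re.search(rf"\b{re.escape(p)}\b", lowered) for p in phrases):
            claims.append({"claim": claim_type, "source_skill": source_skill})
    return claims
```

Phrase matching only covers the listed wordings; rephrased claims need new entries, which is the gap the Sharp Edges table calls out.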
61
+ ### Step 2 — Match Evidence
62
+
63
+ For each claim, look for corresponding evidence in the conversation context:
64
+
65
+ | Claim Type | Required Evidence | Where to Find |
66
+ |---|---|---|
67
+ | "tests pass" | Test runner stdout with pass count | Bash output from test command |
68
+ | "build succeeds" | Build command stdout showing success | Bash output from build command |
69
+ | "lint clean" | Linter stdout (even if empty = 0 errors) | Bash output from lint command |
70
+ | "fixed" | Git diff showing the change + test proving fix | Edit/Write tool calls + test output |
71
+ | "implemented" | Files created/modified matching the plan | Write/Edit tool calls vs plan |
72
+ | "no security issues" | Sentinel report with PASS verdict | Sentinel skill output |
73
+ | "coverage ≥ X%" | Coverage tool output with actual percentage | Test runner with coverage flag |
74
+
75
+ ### Step 3 — Validate Each Claim
76
+
77
+ For each claim + evidence pair:
78
+
79
+ ```
80
+ IF evidence exists AND evidence supports claim:
81
+ → CONFIRMED
82
+ IF evidence exists BUT contradicts claim:
83
+ → CONTRADICTED (most serious — agent is wrong)
84
+ IF no evidence found:
85
+ → UNCONFIRMED (agent may be right but didn't prove it)
86
+ ```
87
+
88
+ ### Step 4 — Report
89
+
90
+ ```
91
+ ## Completion Gate Report
92
+ - **Status**: CONFIRMED | UNCONFIRMED | CONTRADICTED
93
+ - **Claims Checked**: [count]
94
+ - **Confirmed**: [count] | **Unconfirmed**: [count] | **Contradicted**: [count]
95
+
96
+ ### Claim Validation
97
+ | # | Claim | Evidence | Verdict |
98
+ |---|---|---|---|
99
+ | 1 | "All tests pass" | Bash: `npm test` → "42 passed, 0 failed" | CONFIRMED |
100
+ | 2 | "Build succeeds" | No build command output found | UNCONFIRMED |
101
+ | 3 | "No lint errors" | Bash: `npm run lint` → "3 errors" | CONTRADICTED |
102
+
103
+ ### Gaps (if any)
104
+ - Claim 2: Re-run `npm run build` and capture output
105
+ - Claim 3: Agent claimed clean but lint shows 3 errors — fix required
106
+
107
+ ### Verdict
108
+ CONTRADICTED: 1 claim contradicted, 1 lacks evidence. Cannot proceed to commit.
109
+ ```
110
+
111
+ ## Verdict Rules
112
+
113
+ ```
114
+ ALL claims CONFIRMED → overall CONFIRMED (proceed)
115
+ ANY claim CONTRADICTED → overall CONTRADICTED (BLOCK — fix the contradiction)
116
+ ANY claim UNCONFIRMED → overall UNCONFIRMED (BLOCK — provide evidence)
117
+ (applies only when no claim is CONTRADICTED; a contradiction takes precedence)
118
+ ```
119
+
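The per-claim check from Step 3 and the aggregation rules above can be sketched together. The function names are hypothetical, and treating an empty claim list as CONFIRMED is an assumption the skill does not state.

```python
def validate_claim(evidence_found, supports_claim):
    """Classify one claim from whether evidence exists and whether it agrees."""
    if not evidence_found:
        return "UNCONFIRMED"
    return "CONFIRMED" if supports_claim else "CONTRADICTED"


def overall_verdict(verdicts):
    """Aggregate per-claim verdicts; CONTRADICTED outranks UNCONFIRMED."""
    if "CONTRADICTED" in verdicts:
        return "CONTRADICTED"
    if "UNCONFIRMED" in verdicts:
        return "UNCONFIRMED"
    return "CONFIRMED"  # assumption: also returned for an empty list
```

Checking for CONTRADICTED first encodes the precedence: a mix of unconfirmed and contradicted claims yields CONTRADICTED.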
120
+ ## Output Format
121
+
122
+ Completion Gate Report with status (CONFIRMED/UNCONFIRMED/CONTRADICTED), claim validation table, gaps, and verdict. See Step 4 Report above for full template.
123
+
124
+ ## Constraints
125
+
126
+ 1. MUST check every completion claim against actual tool output — not agent narrative
127
+ 2. MUST flag missing evidence as UNCONFIRMED — absence of proof is not proof of absence
128
+ 3. MUST flag contradictions as CONTRADICTED — this is more serious than missing evidence
129
+ 4. MUST NOT accept "I verified it" as evidence — show the command output
130
+ 5. MUST be fast (haiku) — this runs on every cook completion
131
+
132
+ ## Sharp Edges
133
+
134
+ | Failure Mode | Severity | Mitigation |
135
+ |---|---|---|
136
+ | Agent rephrases claim to avoid detection | MEDIUM | Pattern matching covers common phrasings — extend as new patterns emerge |
137
+ | Evidence from a DIFFERENT test run (stale) | HIGH | Check that evidence timestamp/context matches current changes |
138
+ | Agent pre-generates evidence by running commands proactively | LOW | This is actually GOOD behavior — we want agents to provide evidence |
139
+ | Completion-gate itself claims "all confirmed" without evidence | CRITICAL | Gate report MUST include the evidence table — no table = report is invalid |
140
+
141
+ ## Done When
142
+
143
+ - All completion claims extracted from agent output
144
+ - Each claim matched against tool output evidence
145
+ - Verdict table emitted with claim/evidence/verdict for each item
146
+ - Overall verdict: CONFIRMED / UNCONFIRMED / CONTRADICTED
147
+ - If not CONFIRMED: specific gaps listed with remediation steps
148
+
149
+ ## Cost Profile
150
+
151
+ ~500-1000 tokens input, ~200-500 tokens output. Haiku for speed. Runs frequently as part of cook's quality phase.
@@ -0,0 +1,165 @@
1
+ ---
2
+ name: constraint-check
3
+ description: "Meta-validator for HARD-GATEs. Verifies that skills' mandatory constraints were followed during a workflow. Called by cook, team, and audit to audit discipline compliance."
4
+ user-invocable: false
5
+ metadata:
6
+ author: runedev
7
+ version: "1.1.0"
8
+ layer: L3
9
+ model: haiku
10
+ group: validation
11
+ tools: "Read, Glob, Grep"
12
+ ---
13
+
14
+ # constraint-check
15
+
16
+ ## Purpose
17
+
18
+ The internal affairs department for Rune skills. Checks whether HARD-GATEs and mandatory constraints were actually followed during a workflow — not just claimed to be followed. Reads the constraint definitions from skill files and audits the conversation trail for compliance.
19
+
20
+ While `completion-gate` checks if claims have evidence, `constraint-check` checks if the PROCESS was followed. Did you actually write tests before code? Did you actually get plan approval? Did you actually run sentinel?
21
+
22
+ ## Triggers
23
+
24
+ - Called by `cook` (L1) at end of workflow as discipline audit
25
+ - Called by `team` (L1) to verify stream agents followed constraints
26
+ - Called by `audit` (L2) during quality dimension assessment
27
+ - `/rune constraint-check` — manual audit of current session
28
+
29
+ ## Calls (outbound)
30
+
31
+ None — pure read-only validator.
32
+
33
+ ## Called By (inbound)
34
+
35
+ - `cook` (L1): end-of-workflow discipline audit
36
+ - `team` (L1): verify stream agent compliance
37
+ - `audit` (L2): quality dimension
38
+ - User: manual session audit
39
+
40
+ ## Execution
41
+
42
+ ### Step 1 — Identify Active Skills
43
+
44
+ Parse the conversation/workflow to identify which skills were invoked:
45
+
46
+ ```
47
+ Extract from context:
48
+ - Skills invoked via Skill tool (exact list)
49
+ - Skills referenced in agent narrative
50
+ - Phase progression (cook phases completed)
51
+ ```
52
+
53
+ ### Step 2 — Load Constraint Definitions
54
+
55
+ For each invoked skill, extract HARD-GATEs and numbered constraints:
56
+
57
+ ```
58
+ For each skill in invoked_skills:
59
+ Read: skills/<skill>/SKILL.md
60
+ Extract:
61
+ - <HARD-GATE> blocks → mandatory, violation = BLOCK
62
+ - ## Constraints numbered list → required, violation = WARN
63
+ - ## Mesh Gates table → required gates
64
+ ```
65
+
66
+ ### Step 3 — Audit Compliance
67
+
68
+ Check each constraint against the conversation evidence:
69
+
70
+ | Constraint Type | How to Verify | Evidence Source |
71
+ |---|---|---|
72
+ | "MUST write tests BEFORE code" | Test file Write/Edit timestamps before implementation Write/Edit | Tool call ordering |
73
+ | "MUST get user approval" | User message containing "go"/"yes"/"proceed" after plan | Conversation history |
74
+ | "MUST run verification" | Bash command with test/lint/build output | Tool call results |
75
+ | "MUST show actual output" | Stdout captured in agent response | Agent messages |
76
+ | "MUST NOT modify files outside scope" | Git diff files vs plan file list | Git + plan comparison |
77
+ | "Iron Law: delete code written before its test" | No implementation code exists before test creation | Tool call ordering |
78
+
79
+ ### Step 4 — Classify Violations
80
+
81
+ | Violation Type | Severity | Meaning |
82
+ |---------------|----------|---------|
83
+ | HARD-GATE violation | BLOCK | Skill says this is non-negotiable |
84
+ | Constraint violation | WARN | Skill says this is required but not fatal |
85
+ | Best practice skip | INFO | Recommended but optional |
86
+
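A sketch of how the severities above might roll up into the report status. The mapping is an assumption: the skill lists the three status values and the three severities but does not spell out how one maps to the other, and `overall_status` is a hypothetical name.

```python
def overall_status(severities):
    """Map a list of violation severities to a report status.

    Assumed mapping: any BLOCK is critical, any WARN is a violation,
    and INFO alone leaves the session compliant.
    """
    if "BLOCK" in severities:
        return "CRITICAL_VIOLATION"
    if "WARN" in severities:
        return "VIOLATIONS_FOUND"
    return "COMPLIANT"
```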
87
+ ### Step 5 — Report
88
+
89
+ ```
90
+ ## Constraint Check Report
91
+ - **Status**: COMPLIANT | VIOLATIONS_FOUND | CRITICAL_VIOLATION
92
+ - **Skills Audited**: [count]
93
+ - **Constraints Checked**: [count]
94
+ - **Violations**: [count by severity]
95
+
96
+ ### HARD-GATE Violations (BLOCK)
97
+ - [skill:test] Iron Law: implementation code written at tool_call #12 BEFORE test file created at #15
98
+ - [skill:cook] Plan Gate: Phase 4 started without user approval message
99
+
100
+ ### Constraint Violations (WARN)
101
+ - [skill:verification] Constraint 2: "All tests pass" claimed at message #20 without stdout evidence
102
+ - [skill:sentinel] Constraint 3: files scanned list not included in report
103
+
104
+ ### Compliance Summary
105
+ | Skill | HARD-GATEs | Constraints | Status |
106
+ |-------|-----------|-------------|--------|
107
+ | cook | 3/3 ✓ | 6/7 (1 WARN) | WARN |
108
+ | test | 0/1 ✗ | 8/9 (1 WARN) | BLOCK |
109
+ | verification | 1/1 ✓ | 4/6 (2 WARN) | WARN |
110
+ | sentinel | 1/1 ✓ | 7/7 ✓ | PASS |
111
+
112
+ ### Remediation
113
+ - BLOCK: test Iron Law — delete implementation, restart with test-first
114
+ - WARN: verification — re-run and capture stdout
115
+ ```
116
+
117
+ ## Constraint Catalog (Quick Reference)
118
+
119
+ Key HARD-GATEs across skills that constraint-check audits:
120
+
121
+ | Skill | HARD-GATE | Check Method |
122
+ |---|---|---|
123
+ | test | Tests BEFORE code (Iron Law) | Tool call ordering |
124
+ | cook | Scout before plan, plan before code | Phase progression |
125
+ | plan | Every code phase has test entry | Plan content |
126
+ | verification | Evidence for every claim | Stdout capture |
127
+ | sentinel | BLOCK = halt pipeline | No commit after BLOCK |
128
+ | preflight | BLOCK = halt pipeline | No commit after BLOCK |
129
+ | debug | No code changes during debug | No Write/Edit in debug |
130
+ | debug | 3-fix escalation | Fix attempt counter |
131
+ | brainstorm | No implementation before approval | User message check |
132
+
133
+ ## Output Format
134
+
135
+ Constraint Check Report with status (COMPLIANT/VIOLATIONS_FOUND/CRITICAL_VIOLATION), HARD-GATE violations, constraint violations, compliance summary table, and remediation steps. See Step 5 Report above for full template.
136
+
137
+ ## Constraints
138
+
139
+ 1. MUST check all HARD-GATEs for every invoked skill — not just the ones that seem relevant
140
+ 2. MUST use tool call ordering (not agent narrative) to verify temporal constraints
141
+ 3. MUST distinguish HARD-GATE violations (BLOCK) from constraint violations (WARN)
142
+ 4. MUST report specific evidence for each violation — not just "violated"
143
+ 5. MUST NOT accept agent's self-report as compliance evidence — check independently
144
+
145
+ ## Sharp Edges
146
+
147
+ | Failure Mode | Severity | Mitigation |
148
+ |---|---|---|
149
+ | Agent self-reports compliance and constraint-check trusts it | CRITICAL | Constraint 5: check tool calls independently, not agent narrative |
150
+ | Only checking cook constraints, missing test/sentinel/etc | HIGH | Constraint 1: audit ALL invoked skills, not just the orchestrator |
151
+ | Temporal check wrong (tool calls reordered in context) | MEDIUM | Use tool call sequence numbers, not message ordering |
152
+ | Too strict on optional steps (INFO treated as BLOCK) | LOW | Step 4 classification: only HARD-GATE = BLOCK, constraints = WARN |
153
+
154
+ ## Done When
155
+
156
+ - All invoked skills identified from context
157
+ - HARD-GATEs and constraints extracted from each skill's SKILL.md
158
+ - Each constraint checked against conversation evidence
159
+ - Violations classified as BLOCK/WARN/INFO
160
+ - Compliance summary table emitted per skill
161
+ - Remediation steps listed for each violation
162
+
163
+ ## Cost Profile
164
+
165
+ ~1000-2000 tokens input, ~500-1000 tokens output. Haiku for speed — reads skill files and checks tool call ordering.
@@ -0,0 +1,176 @@
1
+ ---
2
+ name: context-engine
3
+ description: "Context window management. Auto-triggered when context is filling up. Triggers smart compaction and preserves critical information across compaction boundaries. Called by L1 orchestrators at context thresholds."
4
+ user-invocable: false
5
+ metadata:
6
+ author: runedev
7
+ version: "0.4.0"
8
+ layer: L3
9
+ model: haiku
10
+ group: state
11
+ tools: "Read, Glob, Grep"
12
+ ---
13
+
14
+ # context-engine
15
+
16
+ ## Purpose
17
+
18
+ Context window management for long sessions. Detects when context is approaching limits, triggers smart compaction preserving critical decisions and progress, and coordinates with session-bridge to save state before compaction. Prevents the common failure mode of losing important context mid-workflow.
19
+
20
+ ### Behavioral Contexts
21
+
22
+ Context-engine also manages **behavioral mode injection** via `contexts/` directory. Three modes are available:
23
+
24
+ | Mode | File | When to Use |
25
+ |------|------|-------------|
26
+ | `dev` | `contexts/dev.md` | Active coding — bias toward action, code-first |
27
+ | `research` | `contexts/research.md` | Investigation — read widely, evidence-based |
28
+ | `review` | `contexts/review.md` | Code review — systematic, severity-labeled |
29
+
30
+ **Mode activation**: Orchestrators (cook, team, rescue) can set the active mode by writing to `.rune/active-context.md`. The session-start hook injects the active context file into the session. Mode switches mid-session are supported — the orchestrator updates the file and references the new behavioral rules.
31
+
32
+ **Default**: If no `.rune/active-context.md` exists, no behavioral mode is injected (standard Claude behavior).
33
+
34
+ ## Triggers
35
+
36
+ - Called by `cook` and `team` automatically at context boundaries
37
+ - Auto-trigger: when tool call count exceeds threshold or context utilization is high
38
+ - Auto-trigger: before compaction events
39
+
40
+ ## Calls (outbound)
41
+
42
+ Exception (L3→L3 coordination):
43
+ - `session-bridge` (L3): coordinate state save when context critical
44
+
45
+ ## Called By (inbound)
46
+
47
+ - Auto-triggered at phase boundaries and context thresholds by L1 orchestrators
48
+
49
+ ## Execution
50
+
51
+ ### Step 1 — Count tool calls
52
+
53
+ Count total tool calls made so far in this session. This is the ONLY reliable metric — token usage is not exposed by Claude Code and any estimate will be dangerously inaccurate.
54
+
55
+ Do NOT attempt to estimate token percentages. Tool count is a directional proxy, not a precise measurement.
56
+
57
+ ### Step 2 — Classify health
58
+
59
+ Map tool call count to health level:
60
+
61
+ ```
62
+ GREEN (<50 calls) — Healthy, continue normally
63
+ YELLOW (50-80 calls) — Load only essential files going forward
64
+ ORANGE (80-120 calls) — Recommend /compact at next logical boundary
65
+ RED (>120 calls) — Trigger immediate compaction, save state first
66
+ ```
67
+
68
+ These thresholds are directional heuristics, not precise limits. Sessions with many large file reads may hit context limits earlier; sessions with mostly Grep/Glob may go longer.
69
+
70
+ #### Large-File Adjustment
71
+
72
+ Projects with large source files (Python modules often 500-1500 LOC, Java files similarly) consume significantly more context per `Read` call. When the session qualifies as large-file (see Detection below), apply a roughly 0.8x multiplier to all thresholds, rounded to convenient values:
73
+
74
+ ```
75
+ Adjusted thresholds (large-file sessions):
76
+ GREEN (<40 calls) — Healthy, continue normally
77
+ YELLOW (40-65 calls) — Load only essential files going forward
78
+ ORANGE (65-100 calls) — Recommend /compact at next logical boundary
79
+ RED (>100 calls) — Trigger immediate compaction, save state first
80
+ ```
81
+
82
+ Detection: count `Read` tool calls that returned >500 lines. If ≥3 such calls → activate large-file thresholds for the remainder of the session.
83
+
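The classification and the large-file switch can be sketched as follows. The function names are hypothetical, and because the published ranges share their endpoints, the sketch assumes a count equal to the YELLOW upper bound (80, or 65 when adjusted) is classified as ORANGE.

```python
STANDARD = (50, 80, 120)    # GREEN below 50, RED above 120
LARGE_FILE = (40, 65, 100)  # adjusted thresholds from the table above


def use_large_file_thresholds(read_line_counts):
    """True once at least 3 Read calls returned more than 500 lines."""
    return sum(1 for lines in read_line_counts if lines > 500) >= 3


def classify_health(tool_calls, large_file=False):
    """Map a tool call count to GREEN, YELLOW, ORANGE, or RED."""
    green_max, yellow_max, orange_max = LARGE_FILE if large_file else STANDARD
    if tool_calls < green_max:
        return "GREEN"
    if tool_calls < yellow_max:  # assumption: the shared endpoint goes to ORANGE
        return "YELLOW"
    if tool_calls <= orange_max:
        return "ORANGE"
    return "RED"
```

For example, 70 calls is YELLOW under the standard thresholds but ORANGE once the large-file thresholds are active.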
84
+ ### Step 3 — If YELLOW
85
+
86
+ Emit advisory to the calling orchestrator:
87
+
88
+ > "[X] tool calls. Load only essential files. Avoid reading full files when Grep will do."
89
+
90
+ Do NOT trigger compaction yet. Continue execution.
91
+
92
+ ### Step 4 — If ORANGE
93
+
94
+ Emit recommendation to the calling orchestrator:
95
+
96
+ > "[X] tool calls. Recommend /compact at next phase boundary (after current module completes)."
97
+
98
+ Identify the next safe boundary (end of current loop iteration, end of current file being processed) and flag it.
99
+
100
+ ### Step 5 — If RED
101
+
102
+ Immediately trigger state save via `rune:session-bridge` (Save Mode) before any compaction occurs.
103
+
104
+ Pass to session-bridge:
105
+ - Current task and phase description
106
+ - List of files touched this session
107
+ - Decisions made (architectural choices, conventions established)
108
+ - Remaining tasks not yet started
109
+
110
+ After session-bridge confirms save, emit:
111
+
112
+ > "Context CRITICAL ([X] tool calls, likely near limit). State saved to .rune/. Run /compact now."
113
+
114
+ Block further tool calls until compaction is acknowledged.
115
+
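The ordering that matters at RED, save first and signal compaction only after the save is confirmed, can be sketched like this. `handle_red` and the `save_state` callback are hypothetical; `save_state` stands in for the call to `session-bridge` and is assumed to return a truthy value on a confirmed save.

```python
def handle_red(tool_calls, state, save_state):
    """Save state via the callback, then return the compaction advisory."""
    if not save_state(state):
        # Never signal compaction without a confirmed save.
        raise RuntimeError("state save not confirmed; do not compact")
    return (
        f"Context CRITICAL ({tool_calls} tool calls, likely near limit). "
        "State saved to .rune/. Run /compact now."
    )
```

Raising on an unconfirmed save keeps the advisory from being emitted in the one case the Sharp Edges table marks as CRITICAL.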
116
+ ### Step 6 — Report
117
+
118
+ Emit the context health report to the calling skill.
119
+
120
+ ## Context Health Levels
121
+
122
+ ```
123
+ GREEN (<50 calls) — Healthy, continue normally
124
+ YELLOW (50-80 calls) — Load only essential files
125
+ ORANGE (80-120 calls) — Recommend /compact at next logical boundary
126
+ RED (>120 calls) — Save state NOW via session-bridge, compact immediately
127
+ ```
128
+
129
+ Note: These are tool call counts, NOT token percentages. Claude Code does not expose context utilization to skills. Tool count is a directional signal only.
130
+
131
+ ## Output Format
132
+
133
+ ```
134
+ ## Context Health
135
+ - **Tool Calls**: [count]
136
+ - **Status**: GREEN | YELLOW | ORANGE | RED
137
+ - **Recommendation**: continue | load-essential-only | compact-at-boundary | compact-immediately
138
+ - **Note**: Tool count is a directional proxy. Check CLI status bar for actual context usage.
139
+
140
+ ### Critical Context (preserved on compaction)
141
+ - Task: [current task]
142
+ - Phase: [current phase]
143
+ - Decisions: [count saved to .rune/]
144
+ - Files touched: [list]
145
+ - Blockers: [if any]
146
+ ```
147
+
148
+ ## Constraints
149
+
150
+ 1. MUST preserve context fidelity — no summarizing away critical details
151
+ 2. MUST flag context conflicts between skills — never silently pick one
152
+ 3. MUST NOT inject stale context from previous sessions without marking it as historical
153
+
154
+ ## Sharp Edges
155
+
156
+ Known failure modes for this skill. Check these before declaring done.
157
+
158
+ | Failure Mode | Severity | Mitigation |
159
+ |---|---|---|
160
+ | Triggering compaction without saving state first | CRITICAL | Step 5 (RED): session-bridge MUST run before any compaction — state loss is irreversible |
161
+ | Blocking tool calls when context is ORANGE (not RED) | MEDIUM | ORANGE = recommend only; blocking is only for RED (>120 calls) |
162
+ | Injecting stale context from previous session without marking it historical | HIGH | Constraint 3: all loaded context must include session date marker |
163
+ | Premature compaction from over-estimated utilization | MEDIUM | Tool count is directional only — sessions with heavy Read calls may need lower thresholds; only block at confirmed RED |
164
+ | Not activating large-file adjustment on Python/Java codebases | MEDIUM | Track Read calls returning >500 lines; if ≥3 occur, switch to adjusted (0.8x) thresholds for the session |
165
+
166
+ ## Done When
167
+
168
+ - Tool call count captured
169
+ - Health level classified from count thresholds (GREEN / YELLOW / ORANGE / RED)
170
+ - Appropriate advisory emitted matching health level (no advisory for GREEN)
171
+ - If RED: session-bridge called and confirmed saved before compaction signal
172
+ - Context Health Report emitted with tool count, status, and recommendation
173
+
174
+ ## Cost Profile
175
+
176
+ ~200-500 tokens input, ~100-200 tokens output. Haiku for minimal overhead. Runs frequently as a background monitor.