azclaude-copilot 0.1.0

Files changed (108)
  1. package/.claude-plugin/marketplace.json +27 -0
  2. package/.claude-plugin/plugin.json +17 -0
  3. package/LICENSE +21 -0
  4. package/README.md +477 -0
  5. package/bin/cli.js +1027 -0
  6. package/bin/copilot.js +228 -0
  7. package/hooks/README.md +3 -0
  8. package/hooks/hooks.json +40 -0
  9. package/package.json +41 -0
  10. package/templates/CLAUDE.md +51 -0
  11. package/templates/agents/cc-cli-integrator.md +104 -0
  12. package/templates/agents/cc-template-author.md +109 -0
  13. package/templates/agents/cc-test-maintainer.md +101 -0
  14. package/templates/agents/code-reviewer.md +136 -0
  15. package/templates/agents/loop-controller.md +118 -0
  16. package/templates/agents/orchestrator-init.md +196 -0
  17. package/templates/agents/test-writer.md +129 -0
  18. package/templates/capabilities/evolution/cycle2-knowledge.md +87 -0
  19. package/templates/capabilities/evolution/cycle3-topology.md +128 -0
  20. package/templates/capabilities/evolution/detect.md +103 -0
  21. package/templates/capabilities/evolution/evaluate.md +90 -0
  22. package/templates/capabilities/evolution/generate.md +123 -0
  23. package/templates/capabilities/evolution/re-derivation.md +77 -0
  24. package/templates/capabilities/intelligence/debate.md +104 -0
  25. package/templates/capabilities/intelligence/elo.md +122 -0
  26. package/templates/capabilities/intelligence/experiment.md +86 -0
  27. package/templates/capabilities/intelligence/opro.md +84 -0
  28. package/templates/capabilities/intelligence/pipeline.md +149 -0
  29. package/templates/capabilities/level-builders/level1-claudemd.md +52 -0
  30. package/templates/capabilities/level-builders/level2-mcp.md +58 -0
  31. package/templates/capabilities/level-builders/level3-skills.md +276 -0
  32. package/templates/capabilities/level-builders/level4-memory.md +72 -0
  33. package/templates/capabilities/level-builders/level5-agents.md +123 -0
  34. package/templates/capabilities/level-builders/level6-hooks.md +119 -0
  35. package/templates/capabilities/level-builders/level7-extmcp.md +60 -0
  36. package/templates/capabilities/level-builders/level8-orchestrated.md +98 -0
  37. package/templates/capabilities/manifest.md +58 -0
  38. package/templates/capabilities/shared/5-layer-agent.md +206 -0
  39. package/templates/capabilities/shared/completion-rule.md +44 -0
  40. package/templates/capabilities/shared/context-artifacts.md +96 -0
  41. package/templates/capabilities/shared/domain-advisor-generator.md +205 -0
  42. package/templates/capabilities/shared/friction-log.md +43 -0
  43. package/templates/capabilities/shared/multi-cli-paths.md +56 -0
  44. package/templates/capabilities/shared/native-tools.md +199 -0
  45. package/templates/capabilities/shared/plan-tracker.md +69 -0
  46. package/templates/capabilities/shared/pressure-test.md +88 -0
  47. package/templates/capabilities/shared/quality-check.md +83 -0
  48. package/templates/capabilities/shared/reflexes.md +159 -0
  49. package/templates/capabilities/shared/review-reception.md +70 -0
  50. package/templates/capabilities/shared/security.md +174 -0
  51. package/templates/capabilities/shared/semantic-boundary-check.md +140 -0
  52. package/templates/capabilities/shared/session-rhythm.md +42 -0
  53. package/templates/capabilities/shared/tdd.md +54 -0
  54. package/templates/capabilities/shared/vocabulary-transform.md +63 -0
  55. package/templates/commands/add.md +152 -0
  56. package/templates/commands/audit.md +123 -0
  57. package/templates/commands/blueprint.md +115 -0
  58. package/templates/commands/copilot.md +157 -0
  59. package/templates/commands/create.md +156 -0
  60. package/templates/commands/debate.md +75 -0
  61. package/templates/commands/deps.md +112 -0
  62. package/templates/commands/doc.md +100 -0
  63. package/templates/commands/dream.md +120 -0
  64. package/templates/commands/evolve.md +170 -0
  65. package/templates/commands/explain.md +25 -0
  66. package/templates/commands/find.md +100 -0
  67. package/templates/commands/fix.md +122 -0
  68. package/templates/commands/hookify.md +100 -0
  69. package/templates/commands/level-up.md +48 -0
  70. package/templates/commands/loop.md +62 -0
  71. package/templates/commands/migrate.md +119 -0
  72. package/templates/commands/persist.md +73 -0
  73. package/templates/commands/pulse.md +87 -0
  74. package/templates/commands/refactor.md +97 -0
  75. package/templates/commands/reflect.md +107 -0
  76. package/templates/commands/reflexes.md +141 -0
  77. package/templates/commands/setup.md +97 -0
  78. package/templates/commands/ship.md +131 -0
  79. package/templates/commands/snapshot.md +70 -0
  80. package/templates/commands/test.md +86 -0
  81. package/templates/hooks/post-tool-use.js +175 -0
  82. package/templates/hooks/stop.js +85 -0
  83. package/templates/hooks/user-prompt.js +96 -0
  84. package/templates/scripts/env-scan.sh +46 -0
  85. package/templates/scripts/import-graph.sh +88 -0
  86. package/templates/scripts/validate-boundaries.sh +180 -0
  87. package/templates/skills/agent-creator/SKILL.md +91 -0
  88. package/templates/skills/agent-creator/examples/sample-agent.md +80 -0
  89. package/templates/skills/agent-creator/references/agent-engineering-guide.md +596 -0
  90. package/templates/skills/agent-creator/references/quality-checklist.md +42 -0
  91. package/templates/skills/agent-creator/scripts/scaffold.sh +144 -0
  92. package/templates/skills/architecture-advisor/SKILL.md +92 -0
  93. package/templates/skills/architecture-advisor/references/database-decisions.md +61 -0
  94. package/templates/skills/architecture-advisor/references/decision-matrices.md +122 -0
  95. package/templates/skills/architecture-advisor/references/rendering-decisions.md +39 -0
  96. package/templates/skills/architecture-advisor/scripts/detect-scale.sh +67 -0
  97. package/templates/skills/debate/SKILL.md +36 -0
  98. package/templates/skills/debate/references/acemad-protocol.md +72 -0
  99. package/templates/skills/env-scanner/SKILL.md +41 -0
  100. package/templates/skills/security/SKILL.md +44 -0
  101. package/templates/skills/security/references/security-details.md +48 -0
  102. package/templates/skills/session-guard/SKILL.md +33 -0
  103. package/templates/skills/skill-creator/SKILL.md +82 -0
  104. package/templates/skills/skill-creator/examples/sample-skill.md +74 -0
  105. package/templates/skills/skill-creator/references/quality-checklist.md +36 -0
  106. package/templates/skills/skill-creator/references/skill-engineering-guide.md +365 -0
  107. package/templates/skills/skill-creator/scripts/scaffold.sh +75 -0
  108. package/templates/skills/test-first/SKILL.md +41 -0
@@ -0,0 +1,83 @@
1
+ ---
2
+ name: quality-check
3
+ description: >
4
+ Post-setup quality verification. Runs after /setup or /level-up to verify
5
+ generated output is correct. Triggers on: "verify setup", "check quality",
6
+ "did setup work correctly", "validate environment".
7
+ tokens: ~80
8
+ ---
9
+
10
+ ## Post-Setup Quality Check
11
+
12
+ Run after `/setup` or any `/level-up`. Verifies generated files are correct, not just present.
13
+
14
+ ---
15
+
16
+ ### Environment Check
17
+
18
+ ```bash
19
+ # Required structure
20
+ echo "=== AZCLAUDE Environment Check ==="
21
+
22
+ echo "--- Core files ---"
23
+ [ -f CLAUDE.md ] && echo "✓ CLAUDE.md" || echo "✗ CLAUDE.md MISSING"
24
+ [ -f .claude/memory/goals.md ] && echo "✓ goals.md" || echo "✗ goals.md MISSING"
25
+ [ -d .claude/capabilities ] && echo "✓ capabilities/" || echo "✗ capabilities/ MISSING"
26
+ [ -d .claude/commands ] && echo "✓ commands/" || echo "✗ commands/ MISSING"
27
+ [ -f .claude/capabilities/manifest.md ] && echo "✓ manifest.md" || echo "✗ manifest.md MISSING"
28
+
29
+ echo "--- Commands ---"
30
+ for cmd in dream setup fix add review test plan evolve debate persist level-up ship status explain loop; do
31
+ [ -f ".claude/commands/$cmd.md" ] && echo "✓ /$cmd" || echo "✗ /$cmd MISSING"
32
+ done
33
+
34
+ echo "--- Memory dirs ---"
35
+ [ -d .claude/memory/sessions ] && echo "✓ memory/sessions/" || echo "✗ memory/sessions/ MISSING"
36
+ [ -d ops/observations ] && echo "✓ ops/observations/" || echo "✗ ops/observations/ MISSING"
37
+ ```
38
+
39
+ ---
40
+
41
+ ### Content Accuracy Check
42
+
43
+ After the environment check passes, verify that CLAUDE.md has been filled in (not left as the raw template):
44
+
45
+ ```bash
46
+ # Detect unfilled placeholders
47
+ grep -q '{{' CLAUDE.md && echo "✗ CLAUDE.md has unfilled placeholders — re-run /setup" || echo "✓ CLAUDE.md filled"
48
+
49
+ # Verify goals.md has today's date (not empty)
50
+ grep -q "$(date +%Y-%m-%d)" .claude/memory/goals.md && echo "✓ goals.md has today's date" || echo "✗ goals.md missing date"
51
+ ```
52
+
53
+ ---
54
+
55
+ ### Command Quality Check (RECIPE vs REFERENCE)
56
+
57
+ For each command in `.claude/commands/`:
58
+
59
+ ```bash
60
+ for skill in .claude/commands/*.md; do
61
+ name=$(basename "$skill")
62
+ # Check for pushy description (3+ trigger variants)
63
+ triggers=$(grep -c "Triggers on:" "$skill" 2>/dev/null) || triggers=0
64
+ # Check body length
65
+ lines=$(wc -l < "$skill")
66
+ [ "$lines" -gt 500 ] && echo "⚠ $name: $lines lines — exceeds 500 limit"
67
+ # Check for frontmatter
68
+ head -n 1 "$skill" | grep -q '^---' && echo "✓ $name: has frontmatter" || echo "✗ $name: MISSING frontmatter"
69
+ done
70
+ ```
71
+
72
+ ---
73
+
74
+ ### Pass Criteria
75
+
76
+ All checks must show ✓ before declaring setup complete.
77
+
78
+ ```
79
+ Bad: "Setup complete!"
80
+ Good: "Environment check: 15/15 ✓. Content check: CLAUDE.md filled, goals.md dated. Skills: 15 installed, all pass RECIPE test."
81
+ ```
82
+
83
+ Run this check automatically at the end of every `/setup` and `/level-up`.
@@ -0,0 +1,159 @@
1
+ ---
2
+ name: reflexes
3
+ description: >
4
+ Load when /evolve finds repeated tool-use patterns, when reviewing learned behaviors,
5
+ when the observer has accumulated enough observations, or when promoting reflexes to
6
+ skills/agents. Load when: "reflexes", "learned behaviors", "what has copilot learned",
7
+ "observation patterns", "promote reflex", "reflex status".
8
+ tokens: ~250
9
+ ---
10
+
11
+ # Reflexes — Learned Behavioral Patterns
12
+
13
+ A reflex is an atomic learned behavior extracted from session observations.
14
+ Smaller than a skill, more concrete than a pattern. Confidence-scored.
15
+
16
+ ## Reflex Model
17
+
18
+ ```yaml
19
+ ---
20
+ id: grep-before-edit
21
+ trigger: "when modifying code files"
22
+ action: "Search with Grep first, confirm with Read, then Edit"
23
+ confidence: 0.7
24
+ domain: workflow
25
+ scope: project
26
+ evidence_count: 8
27
+ last_observed: 2026-03-18
28
+ ---
29
+ ```
30
+
31
+ ### Properties
32
+ - **Atomic** — one trigger, one action (not multi-step)
33
+ - **Confidence-scored** — 0.3 (tentative) → 0.5 (moderate) → 0.7 (strong) → 0.9 (certain)
34
+ - **Domain-tagged** — code-style, testing, git, debugging, workflow, file-patterns, security
35
+ - **Scoped** — `project` (default) or `global`
36
+ - **Evidence-backed** — tracks observation count
37
+
38
+ ## Confidence Rules
39
+
40
+ | Observations | Initial confidence |
41
+ |-------------|-------------------|
42
+ | 3-5 | 0.3 (tentative) |
43
+ | 6-10 | 0.5 (moderate) |
44
+ | 11-20 | 0.7 (strong) |
45
+ | 21+ | 0.85 (near-certain) |
46
+
47
+ Adjustments:
48
+ - +0.05 per confirming observation
49
+ - -0.10 per contradicting observation (user corrects behavior)
50
+ - -0.02 per week without observation (automatic decay)
51
+ - Confidence never exceeds 0.95
52
+ - Confidence < 0.15 after decay → auto-pruned by `/reflexes clear` or `/evolve` Cycle 2
53
+
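The confidence rules above can be sketched as two small functions. This is only an illustration: the function names, the clamp at 0, and the rounding to two decimals are assumptions, not taken from the package source.

```javascript
// Hypothetical helpers illustrating the confidence table and adjustments.
const MAX_CONFIDENCE = 0.95; // "Confidence never exceeds 0.95"

// Map an observation count to the initial confidence from the table.
function initialConfidence(observations) {
  if (observations >= 21) return 0.85; // near-certain
  if (observations >= 11) return 0.7;  // strong
  if (observations >= 6) return 0.5;   // moderate
  if (observations >= 3) return 0.3;   // tentative
  return null; // fewer than 3 observations: no reflex is created
}

// Apply one confirming (+0.05) or contradicting (-0.10) observation.
function adjustConfidence(confidence, kind) {
  if (kind !== "confirm" && kind !== "contradict") {
    throw new Error(`unknown observation kind: ${kind}`);
  }
  const delta = kind === "confirm" ? 0.05 : -0.1;
  const capped = Math.min(MAX_CONFIDENCE, confidence + delta);
  // Assumption: clamp at 0 and round to 2 decimals to hide float noise.
  return Math.round(Math.max(0, capped) * 100) / 100;
}
```

For example, `adjustConfidence(0.93, "confirm")` is capped at 0.95 rather than rising to 0.98.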
54
+ ## Confidence Decay
55
+
56
+ Reflexes that aren't confirmed decay over time. This prevents stale patterns from
57
+ accumulating. The decay formula:
58
+
59
+ ```
60
+ effective_confidence = base_confidence - (0.02 × weeks_since_last_observed)
61
+ ```
62
+
63
+ Example: a reflex with confidence 0.5 not observed for 10 weeks → 0.5 - 0.2 = 0.3 (demoted to tentative).
64
+
65
+ **Auto-pruning**: `/reflexes clear` and `/evolve` Cycle 2 remove reflexes where
66
+ effective_confidence < 0.15. This keeps the reflex library lean and relevant.
67
+
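As a minimal sketch of the decay formula and the pruning threshold, the snippet below computes `effective_confidence` from a reflex's frontmatter fields. Counting only whole elapsed weeks, clamping at 0, and the function names are assumptions made for illustration.

```javascript
// Hypothetical decay helpers; constants come from the rules above.
const DECAY_PER_WEEK = 0.02;
const PRUNE_THRESHOLD = 0.15;
const MS_PER_WEEK = 7 * 24 * 60 * 60 * 1000;

// effective_confidence = base_confidence - (0.02 × weeks_since_last_observed)
function effectiveConfidence(baseConfidence, lastObserved, now) {
  const elapsedMs = new Date(now) - new Date(lastObserved);
  // Assumption: only whole weeks count, and future dates decay nothing.
  const weeks = Math.max(0, Math.floor(elapsedMs / MS_PER_WEEK));
  const value = baseConfidence - DECAY_PER_WEEK * weeks;
  // Assumption: clamp at 0 and round to 2 decimals to hide float noise.
  return Math.round(Math.max(0, value) * 100) / 100;
}

// True when a reflex has decayed below the auto-prune threshold.
function shouldPrune(reflex, now) {
  const value = effectiveConfidence(reflex.confidence, reflex.last_observed, now);
  return value < PRUNE_THRESHOLD;
}
```

With `last_observed: 2026-03-18` and a current date of 2026-05-27 (ten weeks later), a reflex at 0.5 evaluates to 0.3, matching the worked example above.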
68
+ ## Reflex Frontmatter
69
+
70
+ Every reflex file includes a `last_observed` date, which drives the decay calculation:
71
+
72
+ ```yaml
73
+ ---
74
+ id: grep-before-edit
75
+ trigger: "when modifying code files"
76
+ action: "Search with Grep first, confirm with Read, then Edit"
77
+ confidence: 0.7
78
+ domain: workflow
79
+ scope: project
80
+ evidence_count: 8
81
+ last_observed: 2026-03-18
82
+ created: 2026-03-10
83
+ ---
84
+ ```
85
+
86
+ ## Scope Rules
87
+
88
+ | Pattern type | Scope | Examples |
89
+ |-------------|-------|---------|
90
+ | Language/framework conventions | **project** | "Use React hooks", "Django REST patterns" |
91
+ | File structure preferences | **project** | "Tests in `__tests__/`" |
92
+ | Code style | **project** | "Functional over class-based" |
93
+ | Security practices | **global** | "Validate user input", "Sanitize SQL" |
94
+ | Tool workflow preferences | **global** | "Grep before Edit", "Read before Write" |
95
+ | Git practices | **global** | "Conventional commits", "Small focused commits" |
96
+
97
+ **Default to `project` scope.** Promote to global only when seen in 2+ projects with confidence >= 0.8.
98
+
99
+ ## Storage
100
+
101
+ ```
102
+ .claude/memory/reflexes/
103
+ ├── observations.jsonl ← raw tool-use observations (auto-captured by hook)
104
+ ├── project/ ← project-scoped reflexes
105
+ │ ├── grep-before-edit.md
106
+ │ └── prefer-functional.md
107
+ └── global/ ← universal reflexes (promoted)
108
+ └── validate-user-input.md
109
+ ```
110
+
111
+ ## Observation Format (observations.jsonl)
112
+
113
+ ```json
114
+ {"ts":"2026-03-18T10:30:00Z","tool":"Edit","file":"src/auth.js","session":"abc","event":"complete"}
115
+ {"ts":"2026-03-18T10:30:05Z","tool":"Bash","cmd":"npm test","session":"abc","event":"complete"}
116
+ ```
117
+
118
+ Captured automatically by PostToolUse hook. Truncated to last 500 entries.
119
+ Auto-purged after 30 days. Secret patterns scrubbed before writing.
120
+
121
+ ## Pattern Detection (run by /evolve or /reflexes analyze)
122
+
123
+ Detect these patterns from observations.jsonl:
124
+
125
+ 1. **Tool sequences** — same tool chain repeated 3+ times (Grep → Read → Edit)
126
+ 2. **User corrections** — user immediately undoes/redoes an action
127
+ 3. **Error → fix pairs** — error output followed by a fix pattern
128
+ 4. **File co-access** — same files always read/edited together
129
+ 5. **Naming conventions** — consistent naming in created files
130
+
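The first pattern type (tool sequences) could be detected roughly as follows. This sketch assumes entries appear in chronological order, groups them by the `session` field, and counts sliding windows of three tools; the window length and the function name are illustrative choices, not the package's actual detector.

```javascript
// Hypothetical detector for repeated tool chains in observations.jsonl.
function detectToolSequences(jsonl, length = 3, minCount = 3) {
  // Collect the ordered list of tools used in each session.
  const bySession = new Map();
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip malformed lines instead of failing the whole scan
    }
    if (!entry.tool) continue;
    const key = entry.session ?? "unknown";
    if (!bySession.has(key)) bySession.set(key, []);
    bySession.get(key).push(entry.tool);
  }

  // Count every window of `length` consecutive tools (windows may overlap).
  const counts = new Map();
  for (const tools of bySession.values()) {
    for (let i = 0; i + length <= tools.length; i++) {
      const sequence = tools.slice(i, i + length).join(" → ");
      counts.set(sequence, (counts.get(sequence) ?? 0) + 1);
    }
  }

  // Keep only chains seen at least `minCount` times (3+ in the rules above).
  return [...counts]
    .filter(([, count]) => count >= minCount)
    .map(([sequence, count]) => ({ sequence, count }));
}
```

The other four pattern types (corrections, error → fix pairs, co-access, naming) need more context than the tool name alone, so they are left out of this sketch.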
131
+ ## Evolution Path
132
+
133
+ ```
134
+ Observations (raw)
135
+ → 3+ occurrences detected
136
+ → Reflex created (confidence 0.3-0.85)
137
+ → /evolve clusters related reflexes
138
+ → Strong cluster (3+ reflexes, avg confidence > 0.7)
139
+ → Evolved into skill, command, or agent
140
+ ```
141
+
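The two thresholds that gate evolution can be written as simple predicates. The sketch below assumes each reflex is an object with a numeric `confidence` field and that the number of projects a reflex was seen in is tracked elsewhere; both function names are hypothetical.

```javascript
// A cluster is "strong" with 3+ reflexes and average confidence > 0.7.
function isStrongCluster(reflexes) {
  if (reflexes.length < 3) return false;
  const total = reflexes.reduce((sum, reflex) => sum + reflex.confidence, 0);
  return total / reflexes.length > 0.7;
}

// A reflex may be promoted to global scope when it was seen in 2+ projects
// with confidence >= 0.8. `projectCount` is an assumed input.
function isPromotionCandidate(reflex, projectCount) {
  return projectCount >= 2 && reflex.confidence >= 0.8;
}
```

Note that the cluster rule is a strict inequality, so three reflexes at exactly 0.7 do not qualify.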
142
+ ## Integration with /evolve
143
+
144
+ When `/evolve` runs Cycle 1 (Detect):
145
+ 1. Read `.claude/memory/reflexes/observations.jsonl`
146
+ 2. Detect patterns (3+ occurrences minimum)
147
+ 3. Create/update reflex files in `.claude/memory/reflexes/project/`
148
+ 4. Check for promotion candidates (seen in 2+ projects, confidence >= 0.8)
149
+ 5. Cluster related reflexes → generate skills if cluster is strong enough
150
+
151
+ ## Reading Reflexes Before Acting
152
+
153
+ Agents and commands should read reflexes before implementing:
154
+ ```
155
+ Read .claude/memory/reflexes/project/ → follow strong reflexes (confidence >= 0.7)
156
+ Read .claude/memory/reflexes/global/ → follow universal reflexes
157
+ ```
158
+
159
+ Reflexes with confidence < 0.5 are suggestions only — do not enforce.
@@ -0,0 +1,70 @@
1
+ ---
2
+ name: review-reception
3
+ description: >
4
+ Load when receiving code review feedback and deciding how to respond.
5
+ Load when tempted to say "You're absolutely right!", "Great point!", or
6
+ "I'll fix that right away" before actually evaluating the suggestion.
7
+ Load when a reviewer suggests something that seems wrong or unnecessary.
8
+ Load when unsure whether to implement a suggestion or push back on it.
9
+ tokens: ~80
10
+ ---
11
+
12
+ ## Receiving Code Review — Technical Evaluation, Not Performance
13
+
14
+ When you receive a /audit result or code review feedback, your job is
15
+ **technical evaluation**, not agreement. These are different things.
16
+
17
+ ---
18
+
19
+ ## What NOT to do
20
+
21
+ Do NOT say:
22
+ - "You're absolutely right!"
23
+ - "Great point, I'll fix that!"
24
+ - "Absolutely, that makes total sense!"
25
+
26
+ These responses are performative. They signal compliance without evaluation.
27
+ A reviewer who is wrong deserves a correct response, not a polite one.
28
+
29
+ ---
30
+
31
+ ## What to do instead
32
+
33
+ For each piece of feedback:
34
+
35
+ **1. Evaluate technically first**
36
+ - Is this suggestion correct? Does it actually improve the code?
37
+ - Is this a blocking issue or a stylistic preference?
38
+ - Does this conflict with project conventions in CLAUDE.md?
39
+ - Is this YAGNI? Would implementing this add complexity for a hypothetical future case?
40
+
41
+ **2. Implement if correct**
42
+ If the suggestion is right: implement it, run tests, confirm it works.
43
+ Say what you did — "Fixed: extracted the validation logic to its own function (auth.js:42)."
44
+
45
+ **3. Push back if wrong**
46
+ If the suggestion is incorrect, unnecessary, or conflicts with project constraints:
47
+ Say so directly and explain why.
48
+
49
+ Example: "I'm not going to extract that function — it's only called once and extracting it adds indirection without reducing complexity. The current structure matches the pattern in CLAUDE.md line 8."
50
+
51
+ **4. Ask for clarification if unclear**
52
+ If the feedback is ambiguous: ask one specific question. Don't implement something you don't understand.
53
+
54
+ ---
55
+
56
+ ## YAGNI Check
57
+
58
+ Before implementing any suggestion, ask: is this solving a real current problem
59
+ or a hypothetical future one? If hypothetical — say so and skip it.
60
+
61
+ ---
62
+
63
+ ## Completion Rule
64
+
65
+ After processing all feedback:
66
+ - List what was implemented (with file:line)
67
+ - List what was rejected (with reason)
68
+ - List what needs clarification
69
+
70
+ Do NOT say "all feedback addressed" without showing the list.
@@ -0,0 +1,174 @@
1
+ ---
2
+ name: security
3
+ description: >
4
+ Security hardening rules for AZCLAUDE environments. Covers hook integrity,
5
+ path sanitization, context injection protection, credential handling,
6
+ shared-skill verification, agent permission scoping.
7
+ Triggers on: security review, credential handling, hook modification,
8
+ importing shared skills, deploying, handling secrets, untrusted project.
9
+ tokens: ~200
10
+ ---
11
+
12
+ ## Security Model
13
+
14
+ AZCLAUDE runs code and modifies files. These rules prevent common attack vectors.
15
+
16
+ ---
17
+
18
+ ### 1. Hook Integrity
19
+
20
+ Global hooks in `~/.claude/settings.json` execute on **every prompt in every project**.
21
+ A tampered hook = arbitrary code execution.
22
+
23
+ **Protections:**
24
+ - `npx azclaude` writes a SHA-256 integrity hash to `~/.claude/.azclaude-integrity`
25
+ - On subsequent installs, hash is verified — warns if hooks were modified externally
26
+ - The `_azclaude: true` marker confirms hooks were installed by AZCLAUDE, not manually injected
27
+
28
+ **If integrity check fails:**
29
+ ```
30
+ ⚠ Hook integrity mismatch — hooks in ~/.claude/settings.json were modified
31
+ since last AZCLAUDE install. Verify manually before continuing.
32
+ ```
33
+
34
+ Do NOT silently overwrite. Show the diff. Let the user decide.
35
+
36
+ ---
37
+
38
+ ### 2. Path Sanitization
39
+
40
+ File paths that contain shell metacharacters can cause **command injection** in hooks
41
+ that use those paths (e.g., PostToolUse auto-format: `prettier "$CLAUDE_FILE_PATH"`).
42
+
43
+ **Rejected characters**: `;`, `|`, `&`, `` ` ``, `$`, `(`, `)`, `>`, `<`
44
+
45
+ **In PostToolUse hooks** — always add the guard:
46
+ ```bash
47
+ case "$CLAUDE_FILE_PATH" in
48
+ *[';''|''&''`''$''('')''>''<']*) exit 0 ;;
49
+ esac
50
+ ```
51
+
52
+ This runs before the formatter. Malicious paths exit silently — no command injection.
53
+
54
+ **In cli.js** — `sanitizePath()` rejects paths before any `fs.writeFileSync()`:
55
+ - Log a warning: `⚠ Rejected path with shell metacharacters: {path}`
56
+ - Do NOT write the file
57
+
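The package's real `sanitizePath()` is not shown here, so the following is only a sketch of what such a guard might look like, assuming it returns `null` for rejected paths and leaves the decision not to write to the caller.

```javascript
// Characters from the rejected list above: ; | & ` $ ( ) > <
const SHELL_METACHARACTERS = /[;|&`$()<>]/;

// Hypothetical guard: returns the path unchanged, or null if it is unsafe.
function sanitizePath(filePath) {
  if (SHELL_METACHARACTERS.test(filePath)) {
    console.warn(`⚠ Rejected path with shell metacharacters: ${filePath}`);
    return null; // caller must skip the write
  }
  return filePath;
}
```

A caller would check the result before writing, for example `if (sanitizePath(target) !== null) { /* write the file */ }`.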
58
+ ---
59
+
60
+ ### 3. Context Injection Protection
61
+
62
+ Files injected into Claude's context (goals.md, knowledge-index.md) can contain
63
+ **indirect prompt injection** — instructions designed to manipulate Claude's behavior.
64
+
65
+ **Sanitization patterns** — strip or warn on these before injection:
66
+ - `ignore.*previous.*instructions`
67
+ - `curl.*|.*bash` or `wget.*|.*sh`
68
+ - `system prompt` or `you are now`
69
+ - `<script>` or HTML injection attempts
70
+ - Base64-encoded blocks longer than 500 characters
71
+
72
+ **Implementation**: The UserPromptSubmit hook pipes goals.md through a sanitizer
73
+ before echoing it. Suspicious lines are replaced with `[REDACTED — suspicious content]`.
74
+
75
+ **Rule**: Never inject raw file content from untrusted sources into Claude's context
76
+ without scanning for injection patterns first.
77
+
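A line-based sanitizer along these lines might look like the sketch below. It is not the hook's actual code: case-insensitive matching, treating `|` as a literal pipe, and approximating "Base64-encoded block" as a run of more than 500 Base64 characters are all assumptions.

```javascript
const REDACTED = "[REDACTED — suspicious content]";

// Patterns from the list above, written as JavaScript regular expressions.
const INJECTION_PATTERNS = [
  /ignore.*previous.*instructions/i,
  /curl.*\|.*bash/i,
  /wget.*\|.*sh/i,
  /system prompt/i,
  /you are now/i,
  /<script\b/i,
  /[A-Za-z0-9+\/]{501,}/, // assumption: long Base64-looking run
];

// Replace every suspicious line and keep the rest untouched.
function sanitizeContext(text) {
  return text
    .split("\n")
    .map((line) => (INJECTION_PATTERNS.some((p) => p.test(line)) ? REDACTED : line))
    .join("\n");
}
```

Replacing whole lines rather than only the matched span keeps the output easy to audit, at the cost of occasionally redacting a legitimate line that merely mentions one of these phrases.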
78
+ ---
79
+
80
+ ### 4. Credential Handling
81
+
82
+ **Non-negotiable rules:**
83
+ - Credentials go in environment variables — never in committed files
84
+ - `.env` files: always in `.gitignore`, never staged by `/ship`
85
+ - `.mcp.json`: use `${ENV_VAR}` syntax for all secrets, never plaintext
86
+ - API keys in generated configs: use placeholder `${API_KEY}` with a comment
87
+
88
+ **Audit trail:**
89
+ - When an agent handles credentials, log to `.claude/memory/security-events.md`:
90
+ ```
91
+ ## {date} — {agent-name}
92
+ Action: {what was done with credentials}
93
+ Files involved: {list}
94
+ ```
95
+ - Append only. Never overwrite security event logs.
96
+
97
+ **Before any deploy or ship operation**, scan for exposed secrets:
98
+ ```bash
99
+ grep -rn "AKIA\|sk-\|ghp_\|glpat-\|xoxb-" . \
100
+ --include='*.js' --include='*.py' --include='*.ts' \
101
+ --include='*.json' --include='*.yaml' --include='*.env' \
102
+ 2>/dev/null | grep -v node_modules | grep -v '/\.git/'
103
+ ```
104
+
105
+ If any match: **block the operation** and show the file:line.
106
+
107
+ ---
108
+
109
+ ### 5. Shared-Skill Verification
110
+
111
+ Skills imported from `~/shared-skills/` could be tampered with between projects.
112
+
113
+ **On export** (skill promoted to shared after k=10):
114
+ ```bash
115
+ sha256sum "$SKILL_FILE" >> ~/shared-skills/.checksums
116
+ ```
117
+
118
+ **On import** (skill copied into project):
119
+ ```bash
120
+ EXPECTED=$(grep -F -- "$SKILL_FILE" ~/shared-skills/.checksums | tail -n 1 | cut -d' ' -f1)
121
+ ACTUAL=$(sha256sum "$SKILL_FILE" | cut -d' ' -f1)
122
+ if [ "$EXPECTED" != "$ACTUAL" ]; then
123
+ echo "⚠ Checksum mismatch for $SKILL_FILE — file may have been tampered with"
124
+ fi
125
+ ```
126
+
127
+ Never import a skill that fails checksum verification without user approval.
128
+
129
+ ---
130
+
131
+ ### 6. Agent Permission Scoping
132
+
133
+ **Principle of least privilege** — every agent gets only what it needs:
134
+
135
+ | Agent type | Recommended permissions |
136
+ |-----------|------------------------|
137
+ | Reviewer | `tools: Read, Glob, Grep, Bash` + `disallowedTools: Write, Edit` |
138
+ | Implementer | `tools: Read, Write, Edit, Bash, Glob, Grep` |
139
+ | Orchestrator | `tools: Read, Agent` + `disallowedTools: Write, Edit` |
140
+ | Experiment | `isolation: worktree` (cannot touch main branch) |
141
+
142
+ **Never give Write permission to review agents.**
143
+ **Never give Agent-spawning to implementation agents.**
144
+
145
+ If an agent needs elevated permissions: document why in its Layer 4 (CONSTRAINTS).
146
+
147
+ ---
148
+
149
+ ### 7. PreToolUse Code Pattern Monitoring
150
+
151
+ Catch insecure code patterns at write time — before they reach the codebase.
152
+ Cheaper than catching them at `/ship` or `/audit` time.
153
+
154
+ **Patterns to flag on Edit/Write operations:**
155
+
156
+ | Pattern | Risk | Action |
157
+ |---|---|---|
158
+ | `eval(` / `new Function(` | Code injection | Warn — suggest alternative |
159
+ | `os.system(` / `subprocess.call(` with `shell=True` | Command injection | Warn — suggest subprocess.run with list args |
160
+ | `child_process.exec(` | Command injection | Warn — suggest execFile or spawn |
161
+ | `dangerouslySetInnerHTML` | XSS | Warn — suggest sanitized alternative |
162
+ | `document.write(` / `.innerHTML =` | DOM XSS | Warn — suggest textContent or createElement |
163
+ | `pickle.load(` / `pickle.loads(` | Deserialization | Warn — suggest json or msgpack |
164
+ | `AKIA[0-9A-Z]{16}` / `sk-` / `ghp_` | Hardcoded secret | Block — must use env var |
165
+
166
+ **Implementation:** Add a PreToolUse hook with matcher `Edit|Write|MultiEdit`:
167
+ ```bash
168
+ # In the hook script, scan the new content for patterns:
169
+ echo "$CLAUDE_TOOL_INPUT" | grep -qE 'eval\(|os\.system\(|pickle\.load' && \
170
+ echo "⚠ Security: potentially unsafe pattern detected. Review before proceeding."
171
+ ```
172
+
173
+ **Rule:** Warn, don't block (except hardcoded secrets). The developer may have a
174
+ valid reason. But make the pattern visible so it gets reviewed.
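The full pattern table, including the warn/block distinction, could be expressed as data plus one scan function. This is a sketch under assumptions: the rule shape, the function name, and the word-boundary tightening on `eval(` and `sk-` are illustrative, and how the hook receives the new content is left out.

```javascript
// Rules mirror the table above; only hardcoded secrets block.
const RULES = [
  { pattern: /\beval\(|new Function\(/, risk: "Code injection", action: "warn" },
  { pattern: /os\.system\(|subprocess\.call\([^)]*shell\s*=\s*True/, risk: "Command injection", action: "warn" },
  { pattern: /child_process\.exec\(/, risk: "Command injection", action: "warn" },
  { pattern: /dangerouslySetInnerHTML/, risk: "XSS", action: "warn" },
  { pattern: /document\.write\(|\.innerHTML\s*=(?!=)/, risk: "DOM XSS", action: "warn" },
  { pattern: /pickle\.loads?\(/, risk: "Deserialization", action: "warn" },
  // Assumption: \b before "sk-" avoids matching words such as "task-".
  { pattern: /AKIA[0-9A-Z]{16}|\bsk-|ghp_/, risk: "Hardcoded secret", action: "block" },
];

// Return the strongest action plus the list of risks found in the content.
function scanContent(content) {
  const findings = RULES.filter((rule) => rule.pattern.test(content));
  let action = "allow";
  if (findings.length > 0) action = "warn";
  if (findings.some((rule) => rule.action === "block")) action = "block";
  return { action, risks: findings.map((rule) => rule.risk) };
}
```

Keeping the rules as a table of objects makes it straightforward to add a pattern or change its action without touching the scan logic.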
@@ -0,0 +1,140 @@
1
+ ---
2
+ name: semantic-boundary-check
3
+ description: >
4
+ Load during /evolve Cycle 3 (topology) or when boundary validator finds lexical
5
+ overlaps. Detects deeper behavioral duplication that grep cannot catch: same
6
+ behavior in different wording, hidden role overlap between commands/skills/agents,
7
+ conceptual redundancy across extension types. Load when validate-boundaries.sh
8
+ reports warnings, when adding a new skill/command/agent, or when /evolve
9
+ detects friction from competing extensions.
10
+ tokens: ~300
11
+ ---
12
+
13
+ # Semantic Boundary Check
14
+
15
+ The lexical validator (`validate-boundaries.sh`) catches obvious overlaps by
16
+ comparing trigger keywords. This capability catches what grep misses:
17
+ **same behavior, different words.**
18
+
19
+ ## When to Run
20
+
21
+ 1. After `validate-boundaries.sh` reports any warnings
22
+ 2. During `/evolve` Cycle 3 (topology optimization)
23
+ 3. When creating a new command, skill, or agent (via skill-creator or agent-creator)
24
+ 4. When a user reports "two things seem to do the same job"
25
+
26
+ ## Protocol
27
+
28
+ ### Step 1: Read All Extension Descriptions
29
+
30
+ Read the **full body** (not just frontmatter) of every extension:
31
+ ```bash
32
+ ls .claude/commands/*.md .claude/skills/*/SKILL.md .claude/agents/*.md .claude/capabilities/shared/*.md 2>/dev/null
33
+ ```
34
+
35
+ For each file, extract:
36
+ - **What it does** (from workflow/steps section)
37
+ - **When it fires** (from description/triggers)
38
+ - **What it produces** (output format, files written)
39
+ - **What it reads** (input files, state files)
40
+
41
+ ### Step 2: Build Overlap Matrix
42
+
43
+ For each pair of extensions across different types, answer:
44
+
45
+ | Question | If YES → overlap risk |
46
+ |----------|----------------------|
47
+ | Do they read the same input files? | Medium — shared data source |
48
+ | Do they write to the same output files? | High — competing writers |
49
+ | Do they trigger on the same user intent? | High — user confusion |
50
+ | Do they produce the same type of output? | High — redundant work |
51
+ | Would removing one change nothing for the user? | Critical — one is redundant |
52
+
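The first two questions in the matrix are mechanical and can be checked from the file lists extracted in Step 1. The sketch below assumes each extension has been summarized as an object with `reads` and `writes` arrays, which is a hypothetical shape; the remaining three questions need judgment and are not automated here.

```javascript
// Return the items present in both arrays.
function sharedItems(a, b) {
  const seen = new Set(a);
  return b.filter((item) => seen.has(item));
}

// Hypothetical check for the two file-based questions in the matrix.
function fileOverlapRisk(extA, extB) {
  const sharedWrites = sharedItems(extA.writes, extB.writes);
  if (sharedWrites.length > 0) {
    return { risk: "high", reason: "competing writers", files: sharedWrites };
  }
  const sharedReads = sharedItems(extA.reads, extB.reads);
  if (sharedReads.length > 0) {
    return { risk: "medium", reason: "shared data source", files: sharedReads };
  }
  return { risk: "none", reason: "no shared files", files: [] };
}
```

Shared writes are checked first because the matrix rates competing writers above a shared data source.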
53
+ ### Step 3: Classify Each Overlap
54
+
55
+ | Classification | Definition | Action |
56
+ |---------------|------------|--------|
57
+ | **REDUNDANT** | Same behavior, different name. Removing one changes nothing. | Merge into one, delete the other |
58
+ | **OVERLAPPING** | Partial behavior overlap. Each does something unique + something shared. | Extract shared part into a capability, keep unique parts |
59
+ | **COMPLEMENTARY** | Different behavior, same trigger domain. Both needed but confusing. | Clarify descriptions to distinguish when each fires |
60
+ | **CLEAN** | No behavioral overlap. Different purpose, different triggers. | No action needed |
61
+
62
+ ### Step 4: Output Report
63
+
64
+ ```markdown
65
+ ## Semantic Boundary Report
66
+
67
+ ### REDUNDANT (merge these)
68
+ - {extension A} ↔ {extension B}: {what they both do}
69
+ Action: merge into {recommended name}
70
+
71
+ ### OVERLAPPING (extract shared)
72
+ - {extension A} ↔ {extension B}: shared behavior = {what}
73
+ Action: extract {shared part} into capabilities/shared/{name}.md
74
+
75
+ ### COMPLEMENTARY (clarify triggers)
76
+ - {extension A} ↔ {extension B}: same domain, different purpose
77
+ Action: update descriptions to distinguish
78
+
79
+ ### CLEAN
80
+ - {N} extension pairs checked, no overlap
81
+ ```
82
+
83
+ ### Step 5: Apply Fixes
84
+
85
+ For REDUNDANT pairs:
86
+ 1. Keep the extension with more usage evidence (check patterns.md, reflexes)
87
+ 2. Merge unique content from the other into the keeper
88
+ 3. Delete the redundant one
89
+ 4. Update manifest.md
90
+
91
+ For OVERLAPPING pairs:
92
+ 1. Extract shared behavior into `capabilities/shared/{name}.md`
93
+ 2. Have both extensions reference the shared capability
94
+ 3. Remove duplicated content from each
95
+
96
+ For COMPLEMENTARY pairs:
97
+ 1. Add "Do NOT use when..." section to each
98
+ 2. Add cross-references: "For {other task}, use {other extension} instead"
99
+ 3. Make descriptions more specific (narrower triggers)
100
+
101
+ ## Examples of What Grep Misses
102
+
103
+ ### Same behavior, different words
104
+ ```
105
+ Command /fix: "Reproduce → Investigate → Fix → Verify"
106
+ Agent code-reviewer: "Check for bugs, logic errors, edge cases"
107
+ ```
108
+ These don't share trigger keywords but both deal with "finding and fixing problems."
109
+ The difference: /fix actually fixes, code-reviewer only reports. → COMPLEMENTARY, not redundant.
110
+
111
+ ### Hidden output conflict
112
+ ```
113
+ Skill test-first: "Write failing test before implementation"
114
+ Command /test: "Run tests, classify failures"
115
+ ```
116
+ Both touch testing but at different phases. → CLEAN (different lifecycle stages).
117
+
118
+ ### Partial overlap
119
+ ```
120
+ Capability completion-rule.md: "Never say 'should work' — show test output"
121
+ Agent code-reviewer constraint: "Never approve code without checking tests pass"
122
+ ```
123
+ Both enforce "prove it works." → OVERLAPPING. Extract shared enforcement rule.
124
+
125
+ ## Integration with /evolve
126
+
127
+ During Cycle 3 (topology):
128
+ 1. Run `validate-boundaries.sh` (lexical check)
129
+ 2. If warnings > 0: load this capability, run semantic check on flagged pairs
130
+ 3. If warnings = 0: run semantic check on any extensions modified in last 3 sessions
131
+ 4. Apply fixes for REDUNDANT and OVERLAPPING findings
132
+ 5. Log findings to `ops/evolution-log.md`
133
+
134
+ ## Rules
135
+
136
+ - Always read full file bodies, not just frontmatter
137
+ - Classify before acting — don't merge without confirming REDUNDANT
138
+ - When in doubt, classify as COMPLEMENTARY (add clarification, don't merge)
139
+ - Never delete an extension without checking if it's referenced in copilot.md or CLAUDE.md
140
+ - Record every merge/split decision in `.claude/memory/decisions.md`
@@ -0,0 +1,42 @@
1
+ ---
2
+ name: session-rhythm
3
+ description: >
4
+ Load when starting a new session and unsure where to begin. Load when context
5
+ was just reset or compacted. Load when the user says "let's get started" or
6
+ "where were we". Load when goals.md exists but you haven't read it yet.
7
+ Load before closing a session to ensure state is saved.
8
+ tokens: ~80
9
+ ---
10
+
11
+ ## Session Rhythm
12
+
13
+ ### ORIENT (first response of every session)
14
+
15
+ 1. **IDE diagnostics** — use `mcp__ide__getDiagnostics` if available.
16
+ If available: report `IDE: N errors, M warnings` — surface blockers before any work starts.
17
+ If unavailable: skip.
18
+ 2. Read `.claude/memory/goals.md` — know what's active
19
+ 3. Read last 3 friction logs in `ops/observations/` if they exist
20
+ 4. State: current thread, next action, any blockers
21
+
22
+ **TaskCreate** for the session's primary goal — makes intent visible to the user.
23
+
24
+ Lead with what matters now — not the full history.
25
+
26
+ ### WORK
27
+
28
+ - Reference code as `file:line` — never describe location in prose
29
+ - **TaskUpdate → in_progress** when a task begins, **→ completed** when done
30
+ - Apply Completion Rule to every task (see shared/completion-rule.md)
31
+ - Apply TDD Iron Law to every code task (see shared/tdd.md)
32
+ - If a task blocks: name the blocker, don't guess around it
33
+
34
+ ### PERSIST (last response before session ends — or run /persist)
35
+
36
+ 1. **TaskUpdate → completed** for all finished tasks
37
+ 2. Update `.claude/memory/goals.md` — current threads, done, next actions, blockers
38
+ 3. Write friction log to `ops/observations/{date}-friction.md`
39
+ 4. Append session summary to `.claude/memory/sessions/`
40
+
41
+ Do not skip PERSIST even if the session was short.
42
+ Session state lost = next session starts blind.