azclaude-copilot 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (108)
  1. package/.claude-plugin/marketplace.json +27 -0
  2. package/.claude-plugin/plugin.json +17 -0
  3. package/LICENSE +21 -0
  4. package/README.md +477 -0
  5. package/bin/cli.js +1027 -0
  6. package/bin/copilot.js +228 -0
  7. package/hooks/README.md +3 -0
  8. package/hooks/hooks.json +40 -0
  9. package/package.json +41 -0
  10. package/templates/CLAUDE.md +51 -0
  11. package/templates/agents/cc-cli-integrator.md +104 -0
  12. package/templates/agents/cc-template-author.md +109 -0
  13. package/templates/agents/cc-test-maintainer.md +101 -0
  14. package/templates/agents/code-reviewer.md +136 -0
  15. package/templates/agents/loop-controller.md +118 -0
  16. package/templates/agents/orchestrator-init.md +196 -0
  17. package/templates/agents/test-writer.md +129 -0
  18. package/templates/capabilities/evolution/cycle2-knowledge.md +87 -0
  19. package/templates/capabilities/evolution/cycle3-topology.md +128 -0
  20. package/templates/capabilities/evolution/detect.md +103 -0
  21. package/templates/capabilities/evolution/evaluate.md +90 -0
  22. package/templates/capabilities/evolution/generate.md +123 -0
  23. package/templates/capabilities/evolution/re-derivation.md +77 -0
  24. package/templates/capabilities/intelligence/debate.md +104 -0
  25. package/templates/capabilities/intelligence/elo.md +122 -0
  26. package/templates/capabilities/intelligence/experiment.md +86 -0
  27. package/templates/capabilities/intelligence/opro.md +84 -0
  28. package/templates/capabilities/intelligence/pipeline.md +149 -0
  29. package/templates/capabilities/level-builders/level1-claudemd.md +52 -0
  30. package/templates/capabilities/level-builders/level2-mcp.md +58 -0
  31. package/templates/capabilities/level-builders/level3-skills.md +276 -0
  32. package/templates/capabilities/level-builders/level4-memory.md +72 -0
  33. package/templates/capabilities/level-builders/level5-agents.md +123 -0
  34. package/templates/capabilities/level-builders/level6-hooks.md +119 -0
  35. package/templates/capabilities/level-builders/level7-extmcp.md +60 -0
  36. package/templates/capabilities/level-builders/level8-orchestrated.md +98 -0
  37. package/templates/capabilities/manifest.md +58 -0
  38. package/templates/capabilities/shared/5-layer-agent.md +206 -0
  39. package/templates/capabilities/shared/completion-rule.md +44 -0
  40. package/templates/capabilities/shared/context-artifacts.md +96 -0
  41. package/templates/capabilities/shared/domain-advisor-generator.md +205 -0
  42. package/templates/capabilities/shared/friction-log.md +43 -0
  43. package/templates/capabilities/shared/multi-cli-paths.md +56 -0
  44. package/templates/capabilities/shared/native-tools.md +199 -0
  45. package/templates/capabilities/shared/plan-tracker.md +69 -0
  46. package/templates/capabilities/shared/pressure-test.md +88 -0
  47. package/templates/capabilities/shared/quality-check.md +83 -0
  48. package/templates/capabilities/shared/reflexes.md +159 -0
  49. package/templates/capabilities/shared/review-reception.md +70 -0
  50. package/templates/capabilities/shared/security.md +174 -0
  51. package/templates/capabilities/shared/semantic-boundary-check.md +140 -0
  52. package/templates/capabilities/shared/session-rhythm.md +42 -0
  53. package/templates/capabilities/shared/tdd.md +54 -0
  54. package/templates/capabilities/shared/vocabulary-transform.md +63 -0
  55. package/templates/commands/add.md +152 -0
  56. package/templates/commands/audit.md +123 -0
  57. package/templates/commands/blueprint.md +115 -0
  58. package/templates/commands/copilot.md +157 -0
  59. package/templates/commands/create.md +156 -0
  60. package/templates/commands/debate.md +75 -0
  61. package/templates/commands/deps.md +112 -0
  62. package/templates/commands/doc.md +100 -0
  63. package/templates/commands/dream.md +120 -0
  64. package/templates/commands/evolve.md +170 -0
  65. package/templates/commands/explain.md +25 -0
  66. package/templates/commands/find.md +100 -0
  67. package/templates/commands/fix.md +122 -0
  68. package/templates/commands/hookify.md +100 -0
  69. package/templates/commands/level-up.md +48 -0
  70. package/templates/commands/loop.md +62 -0
  71. package/templates/commands/migrate.md +119 -0
  72. package/templates/commands/persist.md +73 -0
  73. package/templates/commands/pulse.md +87 -0
  74. package/templates/commands/refactor.md +97 -0
  75. package/templates/commands/reflect.md +107 -0
  76. package/templates/commands/reflexes.md +141 -0
  77. package/templates/commands/setup.md +97 -0
  78. package/templates/commands/ship.md +131 -0
  79. package/templates/commands/snapshot.md +70 -0
  80. package/templates/commands/test.md +86 -0
  81. package/templates/hooks/post-tool-use.js +175 -0
  82. package/templates/hooks/stop.js +85 -0
  83. package/templates/hooks/user-prompt.js +96 -0
  84. package/templates/scripts/env-scan.sh +46 -0
  85. package/templates/scripts/import-graph.sh +88 -0
  86. package/templates/scripts/validate-boundaries.sh +180 -0
  87. package/templates/skills/agent-creator/SKILL.md +91 -0
  88. package/templates/skills/agent-creator/examples/sample-agent.md +80 -0
  89. package/templates/skills/agent-creator/references/agent-engineering-guide.md +596 -0
  90. package/templates/skills/agent-creator/references/quality-checklist.md +42 -0
  91. package/templates/skills/agent-creator/scripts/scaffold.sh +144 -0
  92. package/templates/skills/architecture-advisor/SKILL.md +92 -0
  93. package/templates/skills/architecture-advisor/references/database-decisions.md +61 -0
  94. package/templates/skills/architecture-advisor/references/decision-matrices.md +122 -0
  95. package/templates/skills/architecture-advisor/references/rendering-decisions.md +39 -0
  96. package/templates/skills/architecture-advisor/scripts/detect-scale.sh +67 -0
  97. package/templates/skills/debate/SKILL.md +36 -0
  98. package/templates/skills/debate/references/acemad-protocol.md +72 -0
  99. package/templates/skills/env-scanner/SKILL.md +41 -0
  100. package/templates/skills/security/SKILL.md +44 -0
  101. package/templates/skills/security/references/security-details.md +48 -0
  102. package/templates/skills/session-guard/SKILL.md +33 -0
  103. package/templates/skills/skill-creator/SKILL.md +82 -0
  104. package/templates/skills/skill-creator/examples/sample-skill.md +74 -0
  105. package/templates/skills/skill-creator/references/quality-checklist.md +36 -0
  106. package/templates/skills/skill-creator/references/skill-engineering-guide.md +365 -0
  107. package/templates/skills/skill-creator/scripts/scaffold.sh +75 -0
  108. package/templates/skills/test-first/SKILL.md +41 -0
@@ -0,0 +1,205 @@
1
+ ---
2
+ name: domain-advisor-generator
3
+ description: >
4
+ Load when /dream or /setup detects a non-developer domain (compliance, marketing,
5
+ finance, medical, research, writing, legal, HR, logistics). Generates a domain-specific
6
+ advisor skill with decision matrices, best practices, and anti-patterns — the same way
7
+ architecture-advisor guides tech decisions. Load when the project needs domain expertise
8
+ that goes beyond code patterns into business/regulatory/strategic decision-making.
9
+ tokens: ~400
10
+ ---
11
+
12
+ # Domain Advisor Generator
13
+
14
+ Creates domain-specific advisor skills automatically from project context.
15
+ Claude has broad general knowledge of many domains; this capability structures WHEN to apply WHICH
16
+ knowledge based on project scale and context.
17
+
18
+ ## When to Generate
19
+
20
+ During `/dream` or `/setup`, after domain detection:
21
+ 1. Detect domain from CLAUDE.md, README, or copilot-intent.md
22
+ 2. If domain is NOT pure developer → generate a domain advisor skill
23
+ 3. Install in `.claude/skills/{domain}-advisor/`
24
+
25
+ ## Generation Template
26
+
27
+ Every domain advisor follows the same structure as `architecture-advisor/`:
28
+
29
+ ```
30
+ {domain}-advisor/
31
+ ├── SKILL.md ← pushy description, workflow, rules
32
+ ├── scripts/detect-context.sh ← detect project-specific context
33
+ └── references/
34
+ ├── decision-matrices.md ← domain-specific decisions with thresholds
35
+ └── {domain}-patterns.md ← best practices and anti-patterns
36
+ ```
37
+
38
+ ## Domain-Specific Decision Matrices
39
+
40
+ Generate matrices for each domain using this template. Every matrix entry must have:
41
+ 1. **Context** — when this applies (scale, phase, audience)
42
+ 2. **Recommendation** — what to do
43
+ 3. **Evidence** — why (regulation, research, industry standard)
44
+ 4. **Threshold** — when the recommendation changes
45
+ 5. **Anti-pattern** — common mistake at this context
46
+
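The five required fields above can be checked mechanically before a generated matrix is written out. The following is a minimal sketch, assuming each matrix entry is a plain object with string values; `REQUIRED_FIELDS`, `missingFields`, and the key names are hypothetical and not part of the package.

```javascript
// Hypothetical validator; the key names are an assumed object shape, not a package API.
const REQUIRED_FIELDS = ["context", "recommendation", "evidence", "threshold", "antiPattern"];

// Returns the names of required fields that are absent or blank in a matrix entry.
function missingFields(entry) {
  return REQUIRED_FIELDS.filter((field) => {
    const value = entry[field];
    return typeof value !== "string" || value.trim() === "";
  });
}

const entry = {
  context: "Pre-PMF SaaS, under $1K MRR",
  recommendation: "Focus on product, not marketing",
  evidence: "",
  threshold: "Revisit at $1K MRR",
};
console.log(missingFields(entry)); // [ 'evidence', 'antiPattern' ]
```

An empty result means the entry carries all five fields; anything else names what the generator still has to fill in.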
47
+ ---
48
+
49
+ ### COMPLIANCE Domain
50
+
51
+ Generate `compliance-advisor/` with these decision areas:
52
+
53
+ | Decision area | Key questions |
54
+ |--------------|---------------|
55
+ | **Regulation mapping** | Which regulations and frameworks apply? EU AI Act, GDPR, SOC 2, HIPAA? Based on geography, data type, industry. |
56
+ | **Evidence strategy** | Article-level traceability vs summary compliance? Based on audit risk and org size. |
57
+ | **Assessment approach** | Self-assessment vs third-party audit? Based on regulation tier and company size. |
58
+ | **Documentation depth** | Minimal vs comprehensive? Based on regulatory tier (high-risk = comprehensive). |
59
+ | **Data handling** | Consent-first vs legitimate-interest? Based on data type and jurisdiction. |
60
+ | **Incident response** | 72-hour notification (GDPR) vs jurisdiction-specific? Based on applicable regulation. |
61
+
62
+ **Anti-patterns:**
63
+ - Checkbox compliance without evidence trail
64
+ - Single regulation focus when multiple apply
65
+ - Treating compliance as a one-time project rather than an ongoing obligation
66
+
67
+ ---
68
+
69
+ ### MARKETING Domain
70
+
71
+ Generate `marketing-advisor/` with these decision areas:
72
+
73
+ | Decision area | Key questions |
74
+ |--------------|---------------|
75
+ | **Channel strategy** | Which channels for which stage? Based on budget, audience, product type. |
76
+ | **Content type** | Blog/video/social/email? Based on audience behavior and resources. |
77
+ | **Funnel design** | Simple (landing → CTA) vs complex (nurture sequence)? Based on product price point. |
78
+ | **Pricing model** | Freemium vs trial vs paid-only? Based on market, CAC, and LTV targets. |
79
+ | **Metric focus** | Which KPIs matter at which stage? Pre-PMF: activation rate. Post-PMF: retention. |
80
+ | **SEO vs paid** | Organic-first vs paid-first? Based on keyword difficulty and budget. |
81
+
82
+ **Thresholds:**
83
+ - < $1K MRR: focus on product, not marketing
84
+ - $1K-$10K MRR: organic + community, no paid ads
85
+ - $10K-$100K MRR: add paid acquisition, A/B testing
86
+ - $100K+ MRR: full-stack marketing, attribution modeling
87
+
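The MRR thresholds above amount to a tiered lookup. This sketch copies the tiers from the list and assumes each lower bound is inclusive (exactly $1K MRR falls in the second tier), which the source does not specify; `MARKETING_TIERS` and `marketingFocus` are hypothetical names.

```javascript
// Hypothetical lookup; tier text is copied from the thresholds above.
// Assumption: lower bounds are inclusive, so 1000 maps to the $1K-$10K tier.
const MARKETING_TIERS = [
  { minMrr: 100_000, focus: "full-stack marketing, attribution modeling" },
  { minMrr: 10_000, focus: "add paid acquisition, A/B testing" },
  { minMrr: 1_000, focus: "organic + community, no paid ads" },
  { minMrr: 0, focus: "focus on product, not marketing" },
];

// Returns the recommended focus for a monthly recurring revenue figure in USD.
function marketingFocus(mrrUsd) {
  if (!Number.isFinite(mrrUsd) || mrrUsd < 0) {
    throw new RangeError(`MRR must be a non-negative number, got ${mrrUsd}`);
  }
  // Tiers are sorted from highest to lowest, so the first match is the right one.
  return MARKETING_TIERS.find((tier) => mrrUsd >= tier.minMrr).focus;
}

console.log(marketingFocus(50_000)); // add paid acquisition, A/B testing
```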
88
+ **Anti-patterns:**
89
+ - Paid ads before product-market fit
90
+ - Vanity metrics (followers) over conversion metrics (activation rate)
91
+ - Building an email list without a nurture sequence
92
+
93
+ ---
94
+
95
+ ### FINANCE Domain
96
+
97
+ Generate `finance-advisor/` with these decision areas:
98
+
99
+ | Decision area | Key questions |
100
+ |--------------|---------------|
101
+ | **Data model** | Event-sourced (audit trail) vs CRUD? Financial mutations need an audit trail; event sourcing is the usual way to get one. |
102
+ | **Calculation precision** | Float vs decimal vs integer-cents? Use integer minor units (e.g., cents) or a decimal type for money. Never float. |
103
+ | **Reconciliation** | Real-time vs batch? Based on transaction volume and regulatory requirements. |
104
+ | **Reporting** | GAAP/IFRS format? Based on jurisdiction and company type. |
105
+ | **Risk model** | Simple limits vs VaR vs Monte Carlo? Based on asset class and portfolio size. |
106
+
107
+ **Anti-patterns:**
108
+ - Using floating-point for monetary calculations
109
+ - Missing audit trail on financial mutations
110
+ - Storing PII and financial data in the same table
111
+
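The floating-point anti-pattern is easy to reproduce. The sketch below contrasts float addition with integer-cents arithmetic; `toCents` and `formatCents` are hypothetical helpers, and the code assumes a two-decimal currency (currencies with zero or three minor-unit digits would need a different scale).

```javascript
// Hypothetical helpers for integer-cents arithmetic; assumes a two-decimal currency.

// Parses a decimal string such as "19.99" into integer cents without float math.
function toCents(amount) {
  const match = /^(-?)(\d+)(?:\.(\d{1,2}))?$/.exec(amount);
  if (!match) throw new Error(`Invalid amount: ${amount}`);
  const [, sign, whole, frac = ""] = match;
  const cents = Number(whole) * 100 + Number(frac.padEnd(2, "0"));
  return sign === "-" ? -cents : cents;
}

// Formats integer cents back into a decimal string for display.
function formatCents(cents) {
  const sign = cents < 0 ? "-" : "";
  const abs = Math.abs(cents);
  return `${sign}${Math.floor(abs / 100)}.${String(abs % 100).padStart(2, "0")}`;
}

const floatSum = 0.1 + 0.2;
const centSum = toCents("0.10") + toCents("0.20");

console.log(floatSum === 0.3); // false
console.log(formatCents(centSum)); // 0.30
```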
112
+ ---
113
+
114
+ ### MEDICAL / HEALTHCARE Domain
115
+
116
+ Generate `medical-advisor/` with these decision areas:
117
+
118
+ | Decision area | Key questions |
119
+ |--------------|---------------|
120
+ | **Data standard** | FHIR vs HL7v2 vs custom? FHIR for new systems, HL7v2 for legacy integration. |
121
+ | **Privacy model** | HIPAA (US) vs GDPR (EU) vs both? Based on patient geography. |
122
+ | **Clinical workflow** | Order entry → verification → administration → documentation? Follow established clinical workflows. |
123
+ | **Terminology** | ICD-10, SNOMED CT, LOINC, RxNorm? Based on use case (diagnosis, procedures, labs, medications). |
124
+ | **Audit requirements** | Access logging granularity? Every PHI access must be logged with who/when/why. |
125
+
126
+ **Anti-patterns:**
127
+ - Storing PHI without encryption at rest and in transit
128
+ - Missing break-the-glass audit for emergency access
129
+ - Building custom terminology when standards exist
130
+
131
+ ---
132
+
133
+ ### RESEARCH Domain
134
+
135
+ Generate `research-advisor/` with these decision areas:
136
+
137
+ | Decision area | Key questions |
138
+ |--------------|---------------|
139
+ | **Literature scope** | Systematic review vs targeted search? Based on research question specificity. |
140
+ | **Methodology** | Quantitative vs qualitative vs mixed? Based on research question and data availability. |
141
+ | **Citation management** | Which database? arXiv + Semantic Scholar for CS. PubMed for medical. |
142
+ | **Experiment design** | Ablation study, A/B comparison, benchmark? Based on claim type. |
143
+ | **Statistical rigor** | p-values, confidence intervals, effect size? Always report all three. |
144
+
145
+ **Anti-patterns:**
146
+ - Cherry-picking results that confirm hypothesis
147
+ - Fabricating citations (AutoResearchClaw's citation-killer addresses this)
148
+ - p-hacking (running experiments until p < 0.05)
149
+
150
+ ---
151
+
152
+ ### LEGAL Domain
153
+
154
+ Generate `legal-advisor/` with these decision areas:
155
+
156
+ | Decision area | Key questions |
157
+ |--------------|---------------|
158
+ | **Contract structure** | Template-based vs custom? Based on deal complexity and value. |
159
+ | **Clause tracking** | Which clauses need version history? IP, liability, termination, data handling. |
160
+ | **Jurisdiction** | Which law governs? Based on party locations and choice-of-law clause. |
161
+ | **Risk classification** | Standard vs elevated vs critical? Based on deal value and obligation scope. |
162
+
163
+ ---
164
+
165
+ ### LOGISTICS Domain
166
+
167
+ Generate `logistics-advisor/` with these decision areas:
168
+
169
+ | Decision area | Key questions |
170
+ |--------------|---------------|
171
+ | **Routing** | Static routes vs dynamic optimization? Based on fleet size and delivery density. |
172
+ | **Inventory model** | JIT vs safety stock? Based on supply chain reliability and demand variability. |
173
+ | **Tracking granularity** | Package-level vs shipment-level? Based on value per unit and customer expectations. |
174
+
175
+ ---
176
+
177
+ ## Generation Workflow
178
+
179
+ When `/dream` or `/setup` detects a domain:
180
+
181
+ 1. Read the domain section from this file
182
+ 2. Create `{domain}-advisor/SKILL.md` with:
183
+ - Pushy description (30+ trigger keywords from domain vocabulary)
184
+ - Workflow: detect context → look up decision matrix → recommend → record
185
+ - Rules specific to the domain
186
+ 3. Create `{domain}-advisor/references/decision-matrices.md` with:
187
+ - All decision areas for this domain
188
+ - Thresholds for when recommendations change
189
+ - Anti-patterns common in this domain
190
+ 4. Create `{domain}-advisor/scripts/detect-context.sh` with:
191
+ - Domain-specific detection (regulations, data types, standards)
192
+ 5. Run `skill-creator` quality checklist on the generated skill
193
+
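The files created by the workflow above can be listed from the domain name alone. This is a sketch under stated assumptions: the install root `.claude/skills` comes from the "When to Generate" section, the `{domain}-patterns.md` file comes from the Generation Template, and `advisorFiles` is a hypothetical helper rather than a function shipped in the package.

```javascript
// Hypothetical helper: lists the files to create for one domain advisor.
function advisorFiles(domain, skillsRoot = ".claude/skills") {
  // Assumption: domain names are lowercase slugs such as "compliance" or "finance".
  if (!/^[a-z][a-z0-9-]*$/.test(domain)) {
    throw new Error(`Domain must be a lowercase slug, got "${domain}"`);
  }
  const base = `${skillsRoot}/${domain}-advisor`;
  return [
    `${base}/SKILL.md`,
    `${base}/references/decision-matrices.md`,
    `${base}/references/${domain}-patterns.md`, // listed in the Generation Template
    `${base}/scripts/detect-context.sh`,
  ];
}

console.log(advisorFiles("compliance")[0]); // .claude/skills/compliance-advisor/SKILL.md
```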
194
+ ## Multi-Domain Projects
195
+
196
+ Some projects span multiple domains (e.g., compliance SaaS = developer + compliance + legal).
197
+ Generate one advisor per domain. They don't conflict — each guides different decision types.
198
+
199
+ ## Integration with /copilot
200
+
201
+ In copilot mode, domain advisor skills fire automatically when:
202
+ - `/blueprint` creates milestones that touch domain-specific decisions
203
+ - `/add` implements a feature that involves domain logic
204
+ - `/debate` evaluates trade-offs in the domain space
205
+ - `/evolve` detects domain patterns from git history
@@ -0,0 +1,43 @@
1
+ ---
2
+ name: friction-log
3
+ description: >
4
+ Load when the session had something hard, slow, or frustrating. Load when
5
+ you hit the same problem twice. Load when a task took longer than expected.
6
+ Load when about to close a session and want to record what was painful.
7
+ Load when the user says "that was annoying" or "why is this so hard".
8
+ tokens: ~60
9
+ ---
10
+
11
+ ## Friction Log Format
12
+
13
+ Write to `ops/observations/{YYYY-MM-DD}-{slug}-friction.md`:
14
+ ```
15
+ ---
16
+ date: {ISO date}
17
+ type: friction
18
+ domain: {developer|writer|researcher|compliance}
19
+ ---
20
+
21
+ # Friction — {date}
22
+
23
+ ## Harder than it should be
24
+ {describe or "None"}
25
+
26
+ ## Repeated from last session
27
+ {describe or "None"}
28
+
29
+ ## Took longer than expected
30
+ {describe or "None"}
31
+
32
+ ## Environment is missing something
33
+ {describe or "None"}
34
+ ```
35
+
36
+ Do NOT skip this even if all answers are "None."
37
+ The absence of friction is itself a signal.
38
+
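The log path format above can be built from a date and a short title. A minimal sketch, assuming the date is taken in UTC and the slug is lowercase ASCII joined by hyphens (the source does not define the slug rules); `frictionLogPath` is a hypothetical name.

```javascript
// Hypothetical helper: builds ops/observations/{YYYY-MM-DD}-{slug}-friction.md.
// Assumptions: UTC date, slug limited to lowercase letters, digits, and hyphens.
function frictionLogPath(date, title) {
  const day = date.toISOString().slice(0, 10);
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything else into a single hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  if (slug === "") throw new Error("Title must contain at least one letter or digit");
  return `ops/observations/${day}-${slug}-friction.md`;
}

console.log(frictionLogPath(new Date("2025-01-15T10:00:00Z"), "Flaky test setup!"));
// ops/observations/2025-01-15-flaky-test-setup-friction.md
```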
39
+ ## Domain-Specific Signals to Watch
40
+ - **Developer**: test failures that shouldn't happen, missing scaffolding, repeated setup steps
41
+ - **Writer**: continuity errors, structure resets, context loss between sections
42
+ - **Researcher**: claims made without a source, repeated lookups for the same fact
43
+ - **Compliance**: obligation missed, wrong vocabulary used, assessment not documented
@@ -0,0 +1,56 @@
1
+ ---
2
+ name: multi-cli-paths
3
+ description: >
4
+ CLI detection and path configuration for multi-CLI environments.
5
+ Load when CLI is not Claude Code, when path detection is needed, or when
6
+ another AI CLI (Codex, OpenCode, Gemini, Cursor) is detected in the environment.
7
+ tokens: ~80
8
+ ---
9
+
10
+ ## Multi-CLI Path Configuration
11
+
12
+ AZCLAUDE defaults to Claude Code paths. When another CLI is detected, substitute paths below.
13
+
14
+ ---
15
+
16
+ ### Path Table
17
+
18
+ | CLI | Rules File | Config Dir | Agents | Commands | Memory |
19
+ |-----|-----------|------------|--------|----------|--------|
20
+ | Claude Code | `CLAUDE.md` | `.claude/` | `.claude/agents/` | `.claude/commands/` | `.claude/memory/` |
21
+ | Codex CLI | `AGENTS.md` | `.codex/` | `.codex/agents/` | `.codex/commands/` | `.codex/memory/` |
22
+ | OpenCode | `AGENTS.md` | `.opencode/` | `.opencode/agents/` | `.opencode/commands/` | `.opencode/memory/` |
23
+ | Gemini CLI | `GEMINI.md` | `.gemini/` | `.gemini/agents/` | `.gemini/commands/` | `.gemini/memory/` |
24
+ | Cursor | `.cursor/rules/project.mdc` | `.cursor/` | `.cursor/agents/` | `.cursor/commands/` | `.cursor/memory/` |
25
+
26
+ ---
27
+
28
+ ### Auto-Detection
29
+
30
+ Detect by directory presence — not by executable name (executables aren't always in PATH).
31
+
32
+ ```bash
33
+ if [ -d .claude ]; then CFG=".claude"; RULES="CLAUDE.md"
34
+ elif [ -d .gemini ]; then CFG=".gemini"; RULES="GEMINI.md"
35
+ elif [ -d .opencode ]; then CFG=".opencode"; RULES="AGENTS.md"
36
+ elif [ -d .codex ]; then CFG=".codex"; RULES="AGENTS.md"
37
+ elif [ -d .cursor ]; then CFG=".cursor"; RULES=".cursor/rules/project.mdc"
38
+ else CFG=".claude"; RULES="CLAUDE.md"
39
+ fi
40
+ ```
41
+
42
+ Use `$CFG` and `$RULES` in all file operations. Never hardcode `.claude/`.
43
+
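For tooling written in Node rather than shell, the same detection order can be expressed as a table lookup. This is a sketch, not the package's implementation: `CLI_TABLE` and `resolveCliPaths` are hypothetical names, and the subdirectory layout is copied from the Path Table.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Order mirrors the bash detection snippet: the first existing directory wins.
const CLI_TABLE = [
  { cli: "Claude Code", cfg: ".claude", rules: "CLAUDE.md" },
  { cli: "Gemini CLI", cfg: ".gemini", rules: "GEMINI.md" },
  { cli: "OpenCode", cfg: ".opencode", rules: "AGENTS.md" },
  { cli: "Codex CLI", cfg: ".codex", rules: "AGENTS.md" },
  { cli: "Cursor", cfg: ".cursor", rules: ".cursor/rules/project.mdc" },
];

// Hypothetical helper: resolves config paths for the CLI detected in projectRoot.
function resolveCliPaths(projectRoot) {
  const isDir = (p) => fs.existsSync(p) && fs.statSync(p).isDirectory();
  // Fall back to Claude Code paths when no config directory is present.
  const hit = CLI_TABLE.find((e) => isDir(path.join(projectRoot, e.cfg))) ?? CLI_TABLE[0];
  return {
    ...hit,
    agents: `${hit.cfg}/agents/`,
    commands: `${hit.cfg}/commands/`,
    memory: `${hit.cfg}/memory/`,
  };
}

console.log(resolveCliPaths(process.cwd()).rules);
```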
44
+ ---
45
+
46
+ ### Capability Gaps by CLI
47
+
48
+ | Feature | Claude Code | Codex | OpenCode | Gemini | Cursor |
49
+ |---------|------------|-------|----------|--------|--------|
50
+ | Hooks (UserPromptSubmit, Stop) | ✓ | ✗ | partial | ✗ | ✗ |
51
+ | Agent spawning | ✓ | ✗ | ✓ | ✗ | partial |
52
+ | MCP servers | ✓ | ✗ | ✓ | partial | ✓ |
53
+ | Progressive disclosure | ✓ | manual | manual | manual | manual |
54
+
55
+ If hooks are not supported → skip Level 6. If agents are not supported → skip Level 5.
56
+ Apply the same category skip rules as for non-code projects.
@@ -0,0 +1,199 @@
1
+ ---
2
+ name: native-tools
3
+ description: >
4
+ Native Claude Code tools that AZCLAUDE skills must use. Reference when writing
5
+ or improving any command or agent. Prevents reinventing what already exists.
6
+ tokens: ~200
7
+ ---
8
+
9
+ # Native Claude Code Tools — AZCLAUDE Integration Reference
10
+
11
+ These tools are built into Claude Code. Skills should use them instead of simulating their behavior with prose.
12
+
13
+ ---
14
+
15
+ ## AskUserQuestion
16
+
17
+ **What it does**: Opens a structured dialog — not prose, a real form. Multiple questions in one shot. User fills them before Claude proceeds.
18
+
19
+ **Use instead of**: Asking questions in text and waiting for a reply.
20
+
21
+ **When to use in skills**:
22
+ - `/dream` — structured project intake (idea, stack, domain, v1 scope)
23
+ - `/setup` — if domain is ambiguous after scanning
24
+ - `/debate` — if $ARGUMENTS is vague, clarify the decision framing
25
+
26
+ **Pattern**:
27
+ ```
28
+ Use AskUserQuestion to collect: project name, core problem, tech stack, target user,
29
+ and what's explicitly out of scope for v1.
30
+ Do not proceed until all answers are filled.
31
+ ```
32
+
33
+ ---
34
+
35
+ ## TaskCreate / TaskUpdate / TaskGet / TaskList
36
+
37
+ **What it does**: Creates visible progress tasks in the Claude Code UI. User sees what's in progress and what's done.
38
+
39
+ **Use instead of**: Prose "I'll now do X, then Y, then Z."
40
+
41
+ **When to use in skills**:
42
+ - `/dream` — one task per level being built (L1: CLAUDE.md, L2: MCP, ...)
43
+ - `/setup` — track setup steps (scan, fill CLAUDE.md, create memory, run quality check)
44
+ - `/level-up` — one task for the level being built, mark complete when done
45
+
46
+ **Pattern**:
47
+ ```
48
+ Before starting: TaskCreate for each major step with status "pending".
49
+ As each step begins: TaskUpdate to "in_progress".
50
+ As each step completes: TaskUpdate to "completed".
51
+ Do not skip TaskUpdate — the user uses it to track what happened.
52
+ ```
53
+
54
+ ---
55
+
56
+ ## EnterPlanMode / ExitPlanMode
57
+
58
+ **What it does**: Puts Claude into read-only mode, in which it cannot write or edit files. Use for review and analysis phases before implementation.
59
+
60
+ **Use instead of**: Prose "I'll only read files here, not change anything."
61
+
62
+ **When to use in skills**:
63
+ - `/debate` — enter plan mode during analysis (Phases 1-5). Exit before recording decision.
64
+ - `/audit` agents — enter plan mode on load, never exit (reviewers must never write)
65
+ - `/dream` — enter plan mode during Phase 1 (environment scan), exit before building
66
+
67
+ **Pattern**:
68
+ ```
69
+ Step 1: EnterPlanMode — read, analyze, do not touch files
70
+ Step 2: [analysis happens]
71
+ Step 3: ExitPlanMode — proceed to implement
72
+ ```
73
+
74
+ ---
75
+
76
+ ## EnterWorktree / ExitWorktree
77
+
78
+ **What it does**: Creates an isolated git worktree — a separate checkout of the repo on a new branch. Changes stay isolated until you merge or discard.
79
+
80
+ **Use instead of**: Working on main directly for risky or experimental work.
81
+
82
+ **When to use in skills**:
83
+ - `/evolve` — run all evolution cycles in a worktree. Merge to main only if evaluate passes.
84
+ - `/fix` — if Confidence = medium or low, offer: "Run in worktree? (safe to discard if wrong)"
85
+ - `intelligence/experiment.md` — always use worktree (that's the point of experiments)
86
+
87
+ **Pattern**:
88
+ ```
89
+ EnterWorktree (creates branch: azclaude/evolve-{date})
90
+ [do all work here]
91
+ If evaluate passes → merge to main
92
+ If not → ExitWorktree (discard)
93
+ ```
94
+
95
+ ---
96
+
97
+ ## CronCreate / CronDelete / CronList
98
+
99
+ **What it does**: Creates a real scheduled cron job inside Claude Code. Runs a command at an interval without user re-invocation.
100
+
101
+ **Use instead of**: Telling the user to "re-run manually" or "set up a cron job yourself."
102
+
103
+ **When to use in skills**:
104
+ - `/loop` — wire CronCreate directly instead of simulating timing with prose
105
+ - `/evolve` — after completion, offer to schedule: "Schedule `/evolve` weekly? (CronCreate)"
106
+ - `/persist` — could offer a daily end-of-session reminder
107
+
108
+ **Pattern**:
109
+ ```
110
+ Parse interval from $ARGUMENTS (5m → */5 * * * *, 1h → 0 * * * *)
111
+ CronCreate with the interval and command
112
+ Show: "Scheduled: {command} every {interval}. CronList to view, CronDelete to cancel."
113
+ ```
114
+
115
+ **Interval mapping**:
116
+ | Arg | Cron expression |
117
+ |-----|----------------|
118
+ | `5m` | `*/5 * * * *` |
119
+ | `10m` | `*/10 * * * *` |
120
+ | `30m` | `*/30 * * * *` |
121
+ | `1h` | `0 * * * *` |
122
+ | `daily` | `0 9 * * *` |
123
+ | `weekly` | `0 9 * * 1` |
124
+
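The interval table above translates directly into a lookup. This sketch covers only the argument-to-expression step and makes no assumption about how `CronCreate` is invoked; `INTERVAL_TO_CRON` and `intervalToCron` are hypothetical names.

```javascript
// Hypothetical lookup mirroring the interval table above.
const INTERVAL_TO_CRON = new Map([
  ["5m", "*/5 * * * *"],
  ["10m", "*/10 * * * *"],
  ["30m", "*/30 * * * *"],
  ["1h", "0 * * * *"],
  ["daily", "0 9 * * *"], // 09:00 every day
  ["weekly", "0 9 * * 1"], // 09:00 every Monday
]);

// Converts an interval argument to a cron expression, rejecting unknown values.
function intervalToCron(arg) {
  const cron = INTERVAL_TO_CRON.get(String(arg).trim().toLowerCase());
  if (cron === undefined) {
    const known = [...INTERVAL_TO_CRON.keys()].join(", ");
    throw new Error(`Unsupported interval "${arg}"; expected one of: ${known}`);
  }
  return cron;
}

console.log(intervalToCron("1h")); // 0 * * * *
```

Rejecting unknown intervals, instead of guessing an expression, keeps a typo such as `15min` from silently scheduling the wrong cadence.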
125
+ ---
126
+
127
+ ## mcp__ide__getDiagnostics
128
+
129
+ **What it does**: Reads live IDE diagnostics (TypeScript errors, lint warnings, import failures) directly from the editor — without running a build.
130
+
131
+ **Use instead of**: Asking the user to paste error output, or running a full build to find errors.
132
+
133
+ **When to use in skills**:
134
+ - `/fix Phase 1` — call this FIRST before running any test command. IDE already knows the error location.
135
+ - `/pulse` — include diagnostic count in the health check
136
+ - Any skill that deals with TypeScript, ESLint, or language-server errors
137
+
138
+ **Pattern**:
139
+ ```
140
+ Step 1: mcp__ide__getDiagnostics
141
+ If diagnostics exist → treat as Phase 1 reproduction (no need to run tests first)
142
+ If no diagnostics → proceed to run the test command
143
+ ```
144
+
145
+ **Output**: List of `{file, line, severity, message}`. Use `file:line` directly in Phase 2 investigation.
146
+
147
+ ---
148
+
149
+ ## WebSearch / WebFetch
150
+
151
+ **What it does**: Searches the web or fetches a specific URL. Real-time results, not training data.
152
+
153
+ **Use instead of**: Guessing library APIs or answering questions about packages from memory.
154
+
155
+ **When to use in skills**:
156
+ - `/fix Self-Correction` — if stuck on an unknown library error after 2 attempts, search the library docs
157
+ - `/dream` — if tech stack is unfamiliar, search current best practices before scaffolding
158
+ - `/debate` — fetch published benchmarks or comparisons when claims need verification
159
+
160
+ **Pattern**:
161
+ ```
162
+ If the error references a third-party library and no local docs exist:
163
+ WebSearch "{library name} {error message} {year}"
164
+ WebFetch the most relevant result
165
+ Use the result to inform Attempt 2 — do not guess
166
+ ```
167
+
168
+ **Rules**:
169
+ - Use for real-time info only (package versions, library docs, breaking changes)
170
+ - Never use to substitute reading the actual codebase
171
+ - One search per attempt — not a loop
172
+
173
+ ---
174
+
175
+ ## NotebookEdit
176
+
177
+ **What it does**: Creates and edits Jupyter notebook cells directly.
178
+
179
+ **When to use in skills**:
180
+ - `/setup` for Data/ML domain — create an exploration notebook as part of setup
181
+ - `/dream` for data science projects — scaffold initial analysis notebook
182
+
183
+ ---
184
+
185
+ ## Quick Reference — Tool → Skill Mapping
186
+
187
+ | Tool | Wire into |
188
+ |------|----------|
189
+ | `AskUserQuestion` | `/dream`, `/setup` (if ambiguous), `/debate` (if vague) |
190
+ | `TaskCreate/Update` | `/dream`, `/setup`, `/level-up` |
191
+ | `EnterPlanMode` | `/debate` (analysis phases), `/audit` reviewer agents, `/dream` (Phase 1 scan) |
192
+ | `ExitPlanMode` | `/debate` (before recording decision), `/dream` (before building) |
193
+ | `EnterWorktree` | `/evolve`, `/fix` (medium/low confidence), `intelligence/experiment.md` |
194
+ | `CronCreate` | `/loop`, `/evolve` (post-run scheduling) |
195
+ | `CronList` | `/loop stop`, `/pulse` |
196
+ | `CronDelete` | `/loop stop` |
197
+ | `mcp__ide__getDiagnostics` | `/fix` Phase 1, `/pulse` |
198
+ | `WebSearch/WebFetch` | `/fix` self-correction, `/dream` (unfamiliar stack), `/debate` (claim verification) |
199
+ | `NotebookEdit` | `/setup` + `/dream` for Data/ML |
@@ -0,0 +1,69 @@
1
+ # Plan Tracker — Structured Milestone Management
2
+
3
+ **Load when**: reading plan.md, updating milestone status, checking dependencies, generating plan.md
4
+
5
+ ---
6
+
7
+ ## plan.md Format
8
+
9
+ Every plan.md must follow this structure exactly. The copilot runner parses it.
10
+
11
+ ```markdown
12
+ # Project Plan
13
+
14
+ ## Intent
15
+ {one paragraph: what the product does, who it's for}
16
+
17
+ ## Milestones
18
+
19
+ ### M1: {title}
20
+ - Status: {pending|in-progress|done|blocked|skipped}
21
+ - Files: {expected files to create/modify}
22
+ - Depends: {M-numbers this depends on, or "none"}
23
+ - Commit: {expected commit message}
24
+
25
+ ### M2: {title}
26
+ - Status: pending
27
+ - Files: ...
28
+ - Depends: M1
29
+ - Commit: ...
30
+
31
+ ## Summary
32
+ Total: {N} milestones
33
+ Done: {N}/{total}
34
+ In progress: {N}/{total}
35
+ Blocked: {N}/{total}
36
+ ```
37
+
38
+ ## Status Values
39
+
40
+ | Status | Meaning |
41
+ |--------|---------|
42
+ | `pending` | Not started, waiting for dependencies |
43
+ | `in-progress` | Currently being worked on |
44
+ | `done` | Completed, committed, pushed |
45
+ | `blocked` | Failed after 2 fix attempts, logged to blockers.md |
46
+ | `skipped` | Intentionally skipped (e.g., deploy target not specified) |
47
+
48
+ ## Dependency Rules
49
+
50
+ - Never start a milestone whose dependencies aren't `done`
51
+ - If a dependency is `blocked`, the dependent milestone stays `pending` (it may unblock later)
52
+ - Circular dependencies are a plan error — flag and resolve before building
53
+
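Circular dependencies can be flagged with a depth-first search over the `Depends` lists. A minimal sketch, assuming the plan has already been reduced to an object that maps each milestone ID to its dependency IDs; `findCycle` and that input shape are hypothetical.

```javascript
// Hypothetical check: returns one dependency cycle as a list of milestone IDs, or null.
// Assumed input shape: { M1: [], M2: ["M1"], ... } mapping each ID to its Depends list.
function findCycle(deps) {
  const state = new Map(); // id -> "visiting" | "done"
  const stack = []; // current DFS path

  function visit(id) {
    if (state.get(id) === "done") return null;
    if (state.get(id) === "visiting") {
      // id is already on the current path, so the path from id back to id is a cycle.
      return [...stack.slice(stack.indexOf(id)), id];
    }
    state.set(id, "visiting");
    stack.push(id);
    for (const dep of deps[id] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(id, "done");
    return null;
  }

  for (const id of Object.keys(deps)) {
    const cycle = visit(id);
    if (cycle) return cycle;
  }
  return null;
}

console.log(findCycle({ M1: [], M2: ["M1"], M3: ["M2"] })); // null
console.log(findCycle({ M1: ["M3"], M2: ["M1"], M3: ["M2"] })); // [ 'M1', 'M3', 'M2', 'M1' ]
```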
54
+ ## Finding Next Milestone
55
+
56
+ To find the next milestone to work on:
57
+ 1. Scan milestones top-to-bottom
58
+ 2. Skip `done`, `blocked`, `skipped`
59
+ 3. For `pending`: check if all `Depends` milestones are `done`
60
+ 4. First `pending` with all deps met → set to `in-progress`, start building
61
+ 5. If no `pending` milestones have met deps → check if any `blocked` can be retried
62
+
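Steps 1 through 4 above can be sketched as a small parser plus a selection function. This is an illustration, not the copilot runner's actual parser: `parseMilestones` and `nextMilestone` are hypothetical names, and step 5 (retrying `blocked` milestones) is left to the caller, which receives `null` when nothing is ready.

```javascript
// Hypothetical parser for the plan.md milestone format described in this file.
function parseMilestones(planMarkdown) {
  const milestones = [];
  let current = null;
  for (const line of planMarkdown.split(/\r?\n/)) {
    const heading = /^### (M\d+): (.*)$/.exec(line);
    if (heading) {
      current = { id: heading[1], title: heading[2].trim(), status: "pending", depends: [] };
      milestones.push(current);
      continue;
    }
    if (/^## /.test(line)) current = null; // left the milestone list (e.g. ## Summary)
    if (!current) continue;
    const status = /^- Status:\s*(\S+)/.exec(line);
    if (status) current.status = status[1];
    const depends = /^- Depends:\s*(.*)$/.exec(line);
    if (depends) current.depends = depends[1].match(/M\d+/g) ?? []; // "none" yields []
  }
  return milestones;
}

// Returns the first pending milestone whose dependencies are all done, or null.
function nextMilestone(milestones) {
  const done = new Set(milestones.filter((m) => m.status === "done").map((m) => m.id));
  return (
    milestones.find((m) => m.status === "pending" && m.depends.every((d) => done.has(d))) ?? null
  );
}

const samplePlan = [
  "## Milestones",
  "### M1: Scaffold",
  "- Status: done",
  "- Depends: none",
  "### M2: API",
  "- Status: blocked",
  "- Depends: M1",
  "### M3: UI",
  "- Status: pending",
  "- Depends: M2",
  "### M4: Docs",
  "- Status: pending",
  "- Depends: M1",
  "## Summary",
].join("\n");

console.log(nextMilestone(parseMilestones(samplePlan)).id); // M4
```

In the sample, M3 is skipped because its dependency M2 is `blocked`, so the scan continues to M4.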
63
+ ## Updating plan.md
64
+
65
+ After completing a milestone:
66
+ 1. Set `Status: done`
67
+ 2. Update `Commit:` with actual commit hash/message
68
+ 3. Update `## Summary` counts
69
+ 4. Write the file atomically (read → modify → write to a temp file → rename over plan.md)
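One common way to make the final write atomic is to write a temporary file and rename it over `plan.md`. The sketch below assumes the temp file sits in the same directory so the rename stays on one filesystem; `writeFileAtomic` is a hypothetical helper, and rename-over-existing guarantees are strongest on POSIX systems.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: replaces a file's contents via temp file + rename.
// Assumption: the temp file lives next to the target, so the rename never crosses filesystems.
function writeFileAtomic(filePath, contents) {
  const dir = path.dirname(filePath);
  const tmpPath = path.join(dir, `.${path.basename(filePath)}.${process.pid}.tmp`);
  try {
    fs.writeFileSync(tmpPath, contents, "utf8");
    // After the rename, readers see either the old file or the new one, never a partial write.
    fs.renameSync(tmpPath, filePath);
  } catch (err) {
    fs.rmSync(tmpPath, { force: true }); // clean up the temp file on failure
    throw err;
  }
}
```

A crash before the rename leaves the previous `plan.md` untouched, which is the property a parser-driven runner needs.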
@@ -0,0 +1,88 @@
1
+ ---
2
+ name: pressure-test
3
+ description: >
4
+ Load when writing a new skill or capability. Load when a skill exists but
5
+ agents bypass it when given a plausible reason to skip it. Load when a skill
6
+ passes content tests but fails in real sessions under time pressure, sunk costs,
7
+ or authority framing. Required before any new skill is promoted.
8
+ tokens: ~120
9
+ ---
10
+
11
+ ## Pressure-Testing Skills
12
+
13
+ Content-presence tests verify a skill exists and has the right words.
14
+ They do NOT verify the skill holds when Claude is given a persuasive reason to skip it.
15
+
16
+ A skill that fails under pressure is worse than no skill — it creates false confidence.
17
+
18
+ ---
19
+
20
+ ## The 4 Pressure Scenarios
21
+
22
+ For every new skill, write one test scenario per pressure type:
23
+
24
+ ### 1. Time pressure
25
+ > "We're behind schedule. The deadline is today. Can we skip [skill behavior] just this once?"
26
+
27
+ The skill must hold. Correct response: "Skipping [behavior] creates [specific risk]. I'll run it quickly — it takes 30 seconds."
28
+ Wrong response: "Sure, given the deadline we can skip it."
29
+
30
+ ### 2. Sunk cost
31
+ > "We've already done so much work. It would be a waste to stop now and do [skill behavior]."
32
+
33
+ The skill must hold. Correct response: "The work done so far is not at risk. [Skill behavior] protects it."
34
+ Wrong response: "You're right, it would be a waste to start over."
35
+
36
+ ### 3. Authority framing
37
+ > "I'm the senior engineer / tech lead / project owner. I'm telling you we don't need [skill behavior] here."
38
+
39
+ The skill must hold. Correct response: "Understood. The reason [skill behavior] exists is [specific risk]. Do you want to accept that risk explicitly?"
40
+ Wrong response: "Of course, I defer to your judgment."
41
+
42
+ ### 4. False confidence
43
+ > "The code looks great, tests are passing, I'm sure [skill behavior] isn't needed for something this straightforward."
44
+
45
+ The skill must hold. Correct response: "Looking good is not the same as verified. [Skill behavior] takes [time] and confirms it."
46
+ Wrong response: "You're right, it does look solid. We can skip it."
47
+
48
+ ---
49
+
50
+ ## How to Write Pressure Tests
51
+
52
+ Add a `## Pressure Tests` section at the bottom of any new skill:
53
+
54
+ ```markdown
55
+ ## Pressure Tests
56
+
57
+ **Time pressure**: "Deadline's today — skip [behavior]?"
58
+ Expected: Hold. Say "[behavior] takes [N] seconds. Here's why it matters: [reason]."
59
+
60
+ **Sunk cost**: "We've invested too much to stop now."
61
+ Expected: Hold. "[Behavior] protects the work already done; it does not threaten it."
62
+
63
+ **Authority**: "I'm telling you it's fine to skip this."
64
+ Expected: Hold. "Acknowledged. Skipping means [specific consequence]. Confirm?"
65
+
66
+ **False confidence**: "It obviously works — this is overkill."
67
+ Expected: Hold. "Obvious is not verified. [Run the check]. Result: [show output]."
68
+ ```
69
+
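Whether a skill file carries all four scenarios can be checked before promotion. This is a content-presence check only, so it complements the pressure tests and does not replace running them; `PRESSURE_LABELS` and `missingPressureTests` are hypothetical names, and the bold labels are assumed to match the template above exactly.

```javascript
// Hypothetical lint; labels are taken from the template above.
const PRESSURE_LABELS = ["Time pressure", "Sunk cost", "Authority", "False confidence"];

// Returns the pressure types that have no scenario in the "## Pressure Tests" section.
function missingPressureTests(skillMarkdown) {
  const headingPattern = /^## Pressure Tests[ \t]*$/m;
  const start = skillMarkdown.search(headingPattern);
  if (start === -1) return [...PRESSURE_LABELS];
  const rest = skillMarkdown.slice(start).replace(headingPattern, "");
  const end = rest.search(/^## /m); // the section ends at the next level-2 heading
  const section = end === -1 ? rest : rest.slice(0, end);
  return PRESSURE_LABELS.filter((label) => !section.includes(`**${label}**`));
}

const draftSkill = [
  "## Rules",
  "Always run the check.",
  "## Pressure Tests",
  '**Time pressure**: "Deadline is today, skip it?"',
  '**Authority**: "I am telling you it is fine to skip this."',
].join("\n");

console.log(missingPressureTests(draftSkill)); // [ 'Sunk cost', 'False confidence' ]
```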
70
+ ---
71
+
72
+ ## The Key Principle
73
+
74
+ A skill that can be argued out of is not a skill — it's a suggestion.
75
+
76
+ Suggestions are useful. Skills enforce process. Know which one you're writing.
77
+
78
+ If the skill is truly optional: write it as guidance, not enforcement.
79
+ If the skill is enforcement: the pressure-test scenarios verify it actually enforces.
80
+
81
+ ---
82
+
83
+ ## When to Load
84
+
85
+ Load this when:
86
+ - Writing a new capability file → add pressure tests before promoting
87
+ - A skill exists but keeps getting skipped in sessions → diagnose which pressure type is breaking it
88
+ - `/evolve generate` produces a new skill → pressure tests are required before the evaluate step