azclaude-copilot 0.4.2 → 0.4.4

package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  <p align="center">
2
2
  <h1 align="center">AZCLAUDE</h1>
3
- <p align="center"><strong>AI coding environment that learns, evolves, and builds autonomously.</strong></p>
3
+ <p align="center"><strong>A complete AI coding environment built on Claude Code's native architecture.</strong></p>
4
4
  <p align="center">
5
5
  <a href="https://www.npmjs.com/package/azclaude-copilot"><img src="https://img.shields.io/npm/v/azclaude-copilot.svg" alt="npm version"></a>
6
6
  <a href="https://github.com/haytamAroui/AZ-CLAUDE-COPILOT/actions/workflows/tests.yml"><img src="https://github.com/haytamAroui/AZ-CLAUDE-COPILOT/actions/workflows/tests.yml/badge.svg" alt="tests"></a>
@@ -9,278 +9,364 @@
9
9
  </p>
10
10
  <p align="center">
11
11
  <a href="#install">Install</a> ·
12
- <a href="#three-ways-to-use-it">Use It</a> ·
12
+ <a href="#the-core-idea">Core Idea</a> ·
13
13
  <a href="#what-you-get">What You Get</a> ·
14
- <a href="#evidence-based-intelligence">Intelligence</a> ·
14
+ <a href="#memory-system">Memory</a> ·
15
15
  <a href="#all-26-commands">Commands</a> ·
16
+ <a href="#autonomous-mode">Autonomous Mode</a> ·
16
17
  <a href="DOCS.md">Full Docs</a>
17
18
  </p>
18
19
  </p>
19
20
 
20
21
  ---
21
22
 
22
- ## What is AZCLAUDE?
23
+ ## The Core Idea
23
24
 
24
- An AI coding environment you install into any project. It gives Claude Code (or Gemini CLI, Codex, OpenCode, Cursor) **26 commands, 8 auto-invoked skills, 7 agents, memory across sessions, learned reflexes, and self-evolving infrastructure**.
25
+ **CLAUDE.md and markdown memory files are the best way to work with an LLM.**
25
26
 
26
- Zero dependencies. One install. Works on any stack.
27
+ Not vector databases. Not API wrappers. Not prompt templates. Plain markdown files, structured and injected at exactly the right moment.
28
+
29
+ Claude Code exposes this natively: `CLAUDE.md` for conventions, hooks for automation, `.claude/` for state. AZCLAUDE implements the full architecture on top of it — every file, every hook, every pattern proven to work.
30
+
31
+ ```
32
+ Without AZCLAUDE:                    With AZCLAUDE:
33
+ ─────────────────                    ──────────────
34
+ Claude starts every session blind.   Claude reads goals.md before your first message.
35
+ No project conventions.              CLAUDE.md has your stack, domain, and rules.
36
+ Repeats the same mistakes.           antipatterns.md prevents known failures.
37
+ Forgets what was decided.            decisions.md logs every architecture choice.
38
+ Builds the same agent repeatedly.    patterns.md encodes what worked.
39
+ Can't work autonomously.             /copilot builds, tests, commits, ships — unattended.
40
+ ```
41
+
42
+ One install. Any stack. Zero dependencies.
27
43
 
28
44
  ---
29
45
 
30
46
  ## Install
31
47
 
32
- **Step 1 — core install** (26 commands, memory, reflexes, evolution):
33
-
34
48
  ```bash
35
- npx azclaude-copilot
49
+ npx azclaude-copilot # core install — 26 commands, memory, hooks, reflexes
50
+ npx azclaude-copilot --full # full install — adds debate engine, pipeline, ELO
51
+ ```
52
+
53
+ Then in Claude Code:
54
+
55
+ ```
56
+ /setup # analyze your project, fill CLAUDE.md, build environment
36
57
  ```
37
58
 
38
- **Step 2 full install** (adds Level 5+: debate, pipeline, ELO — optional):
59
+ That's it. Your project now has AZCLAUDE in `.claude/`.
39
60
 
40
61
  ```bash
41
- npx azclaude-copilot --full
62
+ npx azclaude-copilot doctor # 32 checks — verify everything is wired correctly
42
63
  ```
43
64
 
44
- **Step 3 — configure your project** (open Claude Code, then run):
65
+ ---
66
+
67
+ ## What You Get
68
+
69
+ **26 commands** · **8 auto-invoked skills** · **10 agents** · **3 hooks** · **memory across sessions** · **learned reflexes** · **self-evolving environment**
45
70
 
46
71
  ```
47
- /setup
72
+ .claude/
73
+ ├── CLAUDE.md ← dispatch table: conventions, stack, routing
74
+ ├── commands/ ← 26 slash commands (/add, /fix, /audit, /copilot...)
75
+ ├── skills/ ← 8 skills (test-first, security, architecture-advisor...)
76
+ ├── agents/ ← 10 agents (orchestrator, code-reviewer, test-writer...)
77
+ ├── capabilities/ ← 37 files, lazy-loaded via manifest.md (~380 tokens/task)
78
+ ├── hooks/
79
+ │ ├── post-tool-use.js ← writes breadcrumb to goals.md on every edit
80
+ │ ├── user-prompt.js ← injects goals.md + checkpoint before your first message
81
+ │ └── stop.js ← migrates In-progress → Done, trims, resets counter
82
+ └── memory/
83
+     ├── goals.md ← rolling ledger of what changed and why
84
+     ├── checkpoints/ ← WHY decisions were made (/snapshot)
85
+     ├── patterns.md ← what worked — agents read this before implementing
86
+     ├── antipatterns.md ← what broke — prevents repeating failures
87
+     ├── decisions.md ← architecture choices logged by /debate
88
+     ├── blockers.md ← what's stuck and why
89
+     └── reflexes/ ← learned behavioral patterns (confidence-scored)
48
90
  ```
49
91
 
50
- That's it. Your project now has AZCLAUDE in `.claude/`.
51
-
52
92
  ---
53
93
 
54
94
  ## Three Ways to Use It
55
95
 
56
- ### 1. `/setup` — Configure an existing project
57
-
58
- Open Claude Code in your project, then run:
96
+ ### 1. `/setup` — wire an existing project
59
97
 
60
98
  ```
61
99
  /setup
62
100
  ```
63
101
 
64
- Analyzes your project's stack, domain, and scale. Fills CLAUDE.md. Generates project-specific skills and agents. Creates memory structure.
102
+ Scans your codebase, detects domain + stack + scale, fills CLAUDE.md, creates goals.md, generates project-specific skills and agents. Run once. After that, every Claude Code session opens with full project context.
65
103
 
66
- ### 2. `/dream` — Start from an idea
104
+ ### 2. `/dream` — start from an idea
67
105
 
68
106
  ```
69
- /dream
70
- > "Build a compliance SaaS with trilingual support"
107
+ /dream "Build a compliance SaaS — FastAPI, Supabase, trilingual"
71
108
  ```
72
109
 
73
- Scaffolds the full project: CLAUDE.md, skills, agents, memory, milestones. You build from there.
110
+ Structured intake → environment scan → builds CLAUDE.md, memory, skills, agents, milestones level by level. If you have a non-developer domain (compliance, finance, medical, legal), it generates a domain-specific advisor skill with decision matrices automatically.
74
111
 
75
- ### 3. `/copilot` — Full autonomous mode
112
+ ### 3. `/copilot` — walk away, come back to a product
76
113
 
77
114
  ```bash
78
115
  npx azclaude-copilot . "Build a compliance SaaS with trilingual support"
79
116
  ```
80
117
 
81
- Walk away. AZCLAUDE plans, builds, tests, commits, evolves, and deploys. Come back to a working product with full git history.
118
+ Restarts Claude Code sessions in a loop until `COPILOT_COMPLETE`. Each session: reads state, picks next milestone, implements, tests, commits, evolves. No human input needed.
82
119
 
83
- ### Day-to-day commands (in Claude Code terminal)
120
+ ### Day-to-day
84
121
 
85
122
  ```bash
86
- /add [feature] # add a feature with TDD
123
+ /add [feature] # add a feature: pre-analyzes scope, follows patterns
87
124
  /fix [bug] # reproduce → investigate → fix → verify
88
- /audit # spec-first code review
89
- /test # run tests, classify failures
90
- /evolve # detect gaps, generate fixes, learn
125
+ /audit # spec-first code review, read-only
126
+ /test # framework detection, exit-code gate, failure classification
127
+ /evolve # scan for gaps, generate fixes, create agents from evidence
91
128
  /ship # tests → secrets scan → commit → push → deploy
92
- /pulse # health check — what's the state of things?
129
+ /pulse # health check — recent changes, current level, next steps
130
+ /debate [topic] # adversarial decision protocol with evidence scoring
131
+ /blueprint [plan] # read-only analysis → plan.md with milestones
132
+ /snapshot # save WHY you made decisions — run every 15-20 turns
93
133
  ```
94
134
 
95
- ### CLI commands
135
+ ---
136
+
137
+ ## Memory System
138
+
139
+ The core insight: **Claude needs to see two things at the start of every session — what changed, and why decisions were made.** Everything else is noise.
96
140
 
97
- ```bash
98
- npx azclaude-copilot # core install (26 commands, memory, reflexes)
99
- npx azclaude-copilot --full # full install (adds debate, pipeline, ELO)
100
- npx azclaude-copilot doctor # 32-check health audit
101
- npx azclaude-copilot . "intent" 30 # copilot with 30 session limit
102
- npx azclaude-copilot . # resume existing copilot run
141
+ ### How it works (zero user input)
142
+
143
+ ```
144
+ Every edit: PostToolUse hook → breadcrumb appended to goals.md
145
+ (timestamp, file, diff stats, one-line summary)
146
+
147
+ Session end: Stop hook → In-progress migrates to Done
148
+ Trims to 20 Done entries, archives overflow
149
+ Resets counters
150
+
151
+ Session start: UserPromptSubmit hook → injects before your first message:
152
+ ┌─ goals.md (capped: 30 in-progress + 20 done)
153
+ ├─ latest checkpoint (capped at 50 lines)
154
+ ├─ plan status: X/N done, Y in-progress, Z blocked [copilot mode]
155
+ └─ learned reflexes with confidence ≥ 0.8, max 5 [strict profile]
103
156
  ```
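The session-start injection described above can be sketched as a pure function. This is a minimal illustration, not the package's actual hook code: the name `buildSessionContext`, its input shape, the reflex fields, and the choice to keep the *first* 50 checkpoint lines are all assumptions. Only the caps (50 checkpoint lines, confidence ≥ 0.8, at most 5 reflexes, strict profile only) come from the README.

```javascript
// Hypothetical sketch of session-start injection; names and input shape are assumed.
const CHECKPOINT_MAX_LINES = 50;   // README: latest checkpoint capped at 50 lines
const REFLEX_MIN_CONFIDENCE = 0.8; // README: reflexes with confidence >= 0.8
const REFLEX_MAX_COUNT = 5;        // README: max 5 reflexes

function buildSessionContext({ goals, checkpoint, reflexes, profile }) {
  const parts = [goals];

  // Assumption: "capped" means keeping the first 50 lines of the checkpoint.
  parts.push(checkpoint.split("\n").slice(0, CHECKPOINT_MAX_LINES).join("\n"));

  // Reflex guidance is injected only under the strict profile.
  if (profile === "strict") {
    const top = reflexes
      .filter((r) => r.confidence >= REFLEX_MIN_CONFIDENCE)
      .sort((a, b) => b.confidence - a.confidence)
      .slice(0, REFLEX_MAX_COUNT);
    for (const r of top) {
      parts.push(`reflex ${r.id}: ${r.action}`);
    }
  }
  return parts.join("\n\n");
}
```

Because every input is capped before concatenation, the injected text has an upper bound that does not depend on how long the project has been running.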
104
157
 
105
- ---
158
+ **Token cost: ~500 tokens fixed.** goals.md auto-rotates at 30 entries — oldest 15 archived, newest 15 kept. Same cost at session 5 or session 500.
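The rotation rule stated above (rotate at 30 entries, archive the oldest 15, keep the newest 15) could look roughly like the following. This is a sketch under assumptions: `rotateEntries` is a hypothetical name, entries are assumed to be ordered oldest-first, and the real hook may parse and archive goals.md differently.

```javascript
// Hypothetical sketch of goals.md rotation; entries are assumed oldest-first.
const ROTATE_AT = 30;   // README: rotation triggers at 30 entries
const KEEP_NEWEST = 15; // README: newest 15 kept, oldest 15 archived

function rotateEntries(entries) {
  if (entries.length < ROTATE_AT) {
    // Below the threshold nothing is archived.
    return { kept: entries.slice(), archived: [] };
  }
  const cut = entries.length - KEEP_NEWEST;
  return { kept: entries.slice(cut), archived: entries.slice(0, cut) };
}
```

Since the kept list is reset to 15 entries whenever it reaches 30, the injected size stays bounded whether the project is at session 5 or session 500.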
106
159
 
107
- ## What You Get
160
+ ### Manual layer (you control)
108
161
 
109
- 26 commands, 8 skills, 7 agents, memory, reflexes, evolution. Here's how the layers work:
162
+ ```bash
163
+ /snapshot # save reasoning snapshot — WHY decisions were made
164
+ # every 15-20 turns on complex work
165
+ # auto-injected at next session start
166
+
167
+ /persist # end-of-session: update goals.md, write session narrative
168
+ # run before closing
110
169
 
170
+ /pulse # read current state — what's healthy, what needs attention
111
171
  ```
112
- +-----------------------------------------------------------+
113
- | LAYER 1: THE RUNNER (bin/copilot.js) |
114
- | Node.js. Stateless. Dumb on purpose. |
115
- | Restarts Claude Code sessions until COPILOT_COMPLETE. |
116
- | Reads nothing. Decides nothing. Just loops. |
117
- +-----------------------------------------------------------+
118
- | LAYER 2: THE BRAIN (AZCLAUDE inside each session) |
119
- | Reads goals.md, plan.md, checkpoint, patterns, |
120
- | blockers, decisions, reflexes, context artifacts |
121
- | Decides what to do next. Builds. Tests. Commits. |
122
- | Updates all state files before session ends. |
123
- +-----------------------------------------------------------+
124
- | LAYER 3: THE ENVIRONMENT (accumulates across sessions) |
125
- | Project agents emerge from git evidence (/evolve) |
126
- | Reflexes learned from tool-use observations |
127
- | Skills created when patterns repeat |
128
- | Conventions solidify in CLAUDE.md |
129
- | The environment gets smarter every session. |
130
- +-----------------------------------------------------------+
172
+
173
+ ### Hook profiles
174
+
175
+ ```bash
176
+ AZCLAUDE_HOOK_PROFILE=minimal claude # goals.md tracking only
177
+ AZCLAUDE_HOOK_PROFILE=standard claude # all features (default)
178
+ AZCLAUDE_HOOK_PROFILE=strict claude # all + reflex guidance injection
131
179
  ```
132
180
 
133
- Runner loops. AZCLAUDE accumulates. Claude thinks.
181
+ | Feature | minimal | standard | strict |
182
+ |---------|---------|----------|--------|
183
+ | goals.md tracking + memory rotation | ✓ | ✓ | ✓ |
184
+ | Checkpoint injection | ✓ | ✓ | ✓ |
185
+ | Reflex observations (observations.jsonl) | — | ✓ | ✓ |
186
+ | Cost tracking | — | ✓ | ✓ |
187
+ | Plan status (copilot mode) | — | ✓ | ✓ |
188
+ | Reflex guidance (confidence ≥ 0.8) | — | — | ✓ |
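The profile table above can be read as a simple lookup. The sketch below transcribes it into a map. The feature key names, the function `featuresFor`, and the decision to throw on an unknown profile are illustrative assumptions; only the `AZCLAUDE_HOOK_PROFILE` variable, the three profile names, and the default of `standard` come from the README.

```javascript
// Feature matrix transcribed from the table; key names are illustrative.
const BASE = ["goalsTracking", "checkpointInjection"];
const STANDARD = [...BASE, "reflexObservations", "costTracking", "planStatus"];
const PROFILE_FEATURES = new Map([
  ["minimal", BASE],
  ["standard", STANDARD],
  ["strict", [...STANDARD, "reflexGuidance"]],
]);

function featuresFor(env) {
  // README: standard is the default profile.
  const profile = env.AZCLAUDE_HOOK_PROFILE || "standard";
  const features = PROFILE_FEATURES.get(profile);
  if (!features) {
    // Assumption: unknown profiles are rejected rather than silently defaulted.
    throw new Error(`Unknown hook profile: ${profile}`);
  }
  return new Set(features);
}
```

A hook would then guard optional work with a check such as `featuresFor(process.env).has("costTracking")`.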
189
+
190
+ ### State files — the runner is stateless, these files ARE the state
191
+
192
+ | File | Written by | Read by | Purpose |
193
+ |------|-----------|---------|---------|
194
+ | `CLAUDE.md` | /setup, /dream | Every session | Conventions, routing, project identity |
195
+ | `memory/goals.md` | Hooks | Every session start | File breadcrumbs + session state |
196
+ | `memory/checkpoints/` | /snapshot | Every session start | WHY decisions were made |
197
+ | `memory/patterns.md` | /evolve, agents | Agents, /add, /fix | What works — follow this |
198
+ | `memory/antipatterns.md` | /evolve, agents | Agents, /add, /fix | What broke — avoid this |
199
+ | `memory/decisions.md` | /debate | All agents | Architecture choices — never re-debate |
200
+ | `memory/blockers.md` | /copilot | /copilot, /debate | What's stuck and why |
201
+ | `memory/reflexes/` | Hooks, /reflexes | /evolve, agents | Learned behavioral patterns |
202
+ | `plan.md` | /blueprint | /copilot, /add | Milestone tracker with status |
203
+ | `copilot-report.md` | /copilot | Human | Final autonomous run summary |
134
204
 
135
205
  ---
136
206
 
137
- ## The Pipeline
207
+ ## Evolution System
138
208
 
139
- Every command detects copilot mode automatically (`[ -f .claude/copilot-intent.md ]`) and skips human interaction -- no approval gates, no prompts, no pauses.
209
+ `/evolve` finds gaps in the environment and fixes them. Three cycles:
140
210
 
211
+ **Cycle 1 — Environment Evolution**
212
+ - Detects: stale patterns, friction signals, context rot (poisoning / distraction / confusion / clash)
213
+ - Generates: fixes for each gap
214
+ - Evaluates: quality-gates before merging (syntax, self-applicability, pressure-test resilience)
215
+
216
+ **Cycle 2 — Knowledge Consolidation** (every 3+ sessions)
217
+ - Harvests patterns.md and sessions/ by recency + importance
218
+ - Prunes stale entries, consolidates redundant patterns
219
+ - Enriches agent definitions with accumulated learnings
220
+ - Auto-prunes reflexes where confidence < 0.15
221
+
222
+ **Cycle 3 — Topology Optimization** (when friction detected)
223
+ - Measures agent influence in pipelines
224
+ - Identifies merge candidates (overlapping agents)
225
+ - Tests changes in isolated worktree before adopting
226
+
227
+ **Agent emergence from git evidence:**
141
228
  ```
142
- Session 1: /dream -> /blueprint -> /add M1 -> /add M2 -> /add M3 -> /snapshot
143
- Session 2: /evolve -> /add M4 -> /add M5 -> /add M6 -> /snapshot
144
- Session 3: /evolve -> /add M7 -> /add M8 -> /add M9 -> /snapshot
145
- Session 4: /evolve -> /audit -> /ship -> COPILOT_COMPLETE
229
+ Session 1: 0 project agents. Build basic structure.
230
+ Git: 3 commits touching fastapi/, next/, supabase/
231
+
232
+ Session 2: /evolve reads git log
233
+ 15 files in fastapi/ → cc-fastapi agent created
234
+ 8 files in next/ with i18n patterns → cc-frontend-i18n agent created
235
+
236
+ Session 3: Compliance logic repeating across 6 files → cc-compliance-engine agent
237
+ 3 agents, all from real code — not guessing
238
+
239
+ Session 4: Full evolved environment. /audit → /ship → COPILOT_COMPLETE
146
240
  ```
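One way to picture "agents from git evidence" is to count changed files per top-level directory and propose an agent where activity clusters. The sketch below is purely illustrative: the README does not state a threshold or a naming rule, so `minFiles`, the `cc-<dir>` naming, and the function `proposeAgents` are assumptions, and the real `/evolve` also looks at patterns inside files (such as i18n usage), which this does not model.

```javascript
// Hypothetical: propose agents for directories with many changed files.
// changedFiles is a list of repo-relative paths, e.g. from `git log --name-only`.
function proposeAgents(changedFiles, minFiles) {
  const counts = new Map();
  for (const file of changedFiles) {
    const slash = file.indexOf("/");
    if (slash === -1) continue; // skip root-level files
    const dir = file.slice(0, slash);
    counts.set(dir, (counts.get(dir) || 0) + 1);
  }
  return [...counts]
    .filter(([, n]) => n >= minFiles) // minFiles is an assumed threshold
    .map(([dir, n]) => ({ agent: `cc-${dir}`, files: n }));
}
```

Keeping the threshold as a parameter makes the evidence requirement explicit instead of hard-coding a number the README does not specify.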
147
241
 
148
- ### Per Milestone
242
+ Skills and agents that are project-generic get promoted to `~/shared-skills/` — improvements discovered in one project become available to all your projects.
149
243
 
150
- 1. Read milestone from `plan.md` (description, expected files, dependencies)
151
- 2. Implement using `/add` (follows `patterns.md`, reads context artifacts, uses project agents)
152
- 3. Run tests -- fix if failing (2 attempts max)
153
- 4. If still failing -- log to `blockers.md`, skip, continue
154
- 5. Commit: `{type}: {what} -- {why}`
155
- 6. Push + update `plan.md` status to `done`
156
- 7. `/snapshot` (compaction protection)
244
+ ---
157
245
 
158
- ### Self-Healing
246
+ ## Intelligence Layer
159
247
 
160
- When builds fail:
161
- - Re-read error, check `antipatterns.md`, try alternative approach
162
- - Record failure to `antipatterns.md` (every failure teaches the environment)
163
- - Record success to `patterns.md`
164
- - If stuck, `/debate` finds alternative approach from blocker context
248
+ ### 8 Skills (auto-invoked — no slash command needed)
165
249
 
166
- ### Blocker Recovery
250
+ | Skill | Triggers on |
251
+ |-------|------------|
252
+ | `session-guard` | Session start, context reset, idle detection |
253
+ | `test-first` | Writing/fixing code in TDD projects (signal-based — only if project has tests) |
254
+ | `env-scanner` | Project setup, stack detection |
255
+ | `security` | Credentials, auth, payments, .env files, secrets, before /ship |
256
+ | `debate` | Decisions, trade-offs, "which is better", architecture comparisons |
257
+ | `skill-creator` | "Create a skill", repeated workflows, new capability |
258
+ | `agent-creator` | "Create an agent", agent boundaries, 5-layer structure |
259
+ | `architecture-advisor` | Architecture decisions, DB choice, rendering strategy, testing approach — by project scale |
167
260
 
168
- After all non-blocked milestones complete:
169
- - Retry blocked milestones with full project context now available
170
- - Often unblocked by later work
171
- - If still stuck, `/debate` evaluates; if no solution, mark `skipped`
261
+ ### Architecture Advisor — 8 Evidence-Based Decision Matrices
172
262
 
173
- ---
263
+ Not "which is popular" — which is right for **your project's scale**:
174
264
 
175
- ## What Makes It Different
265
+ | Decision | SMALL (< 50 files) | MEDIUM (50-500 files) | LARGE (500+ files) |
266
+ |----------|-------------------|----------------------|-------------------|
267
+ | Architecture | Flat modules | Modular monolith | Monolith + targeted microservices |
268
+ | Database | SQLite | PostgreSQL | PostgreSQL + Redis + search |
269
+ | Testing | Test-after critical paths | TDD for business logic | Full TDD |
270
+ | API | tRPC (internal) | REST | REST + GraphQL (mobile) |
271
+ | Auth | Clerk / Supabase | Auth0 | Keycloak (self-hosted) |
272
+ | State | useState | TanStack Query | Zustand + XState |
273
+ | Rendering | SSG or SPA | SSR / ISR | ISR + edge caching |
274
+ | Deploy | Vercel / Railway | Managed containers | AWS/GCP with IaC |
176
275
 
177
- | Feature | Claude Code | Ralph Loop | Lovable | Cursor | AZCLAUDE |
178
- |---------|------------|------------|---------|--------|-----------------|
179
- | Autonomous loop | -- | Yes | -- | -- | Yes |
180
- | Memory across sessions | -- | Git only | -- | -- | Goals + checkpoints + patterns + reflexes |
181
- | Self-evolving agents | -- | -- | -- | -- | Yes (from git evidence) |
182
- | Learned reflexes | -- | -- | -- | -- | Yes (confidence-scored) |
183
- | Convention enforcement | -- | -- | -- | -- | Yes (CLAUDE.md + patterns.md) |
184
- | Architecture advisor | -- | -- | -- | -- | Yes (8 decision matrices) |
185
- | Domain advisor generation | -- | -- | -- | -- | Yes (7 domains) |
186
- | Context artifact discovery | -- | -- | -- | -- | Yes (schemas, specs, configs) |
187
- | Any stack | Yes | Yes | Next.js only | Yes | Yes |
188
- | You own the code | Yes | Yes | -- | Yes | Yes |
189
- | Zero dependencies | Yes | Yes | -- | -- | Yes (0 in package.json) |
190
- | Deploy included | -- | -- | Yes | -- | Yes |
276
+ Every recommendation includes the threshold where it changes and the anti-pattern to avoid at that scale.
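The scale buckets in the matrix can be expressed as a small classifier plus a lookup. This is a sketch, not the skill's actual logic: `classifyScale` and `DATABASE_BY_SCALE` are hypothetical names, and because the table lists both "50-500" and "500+", assigning exactly 500 files to LARGE is an assumption.

```javascript
// Hypothetical scale classifier based on the file-count thresholds in the table.
function classifyScale(fileCount) {
  if (fileCount < 50) return "SMALL";
  if (fileCount < 500) return "MEDIUM";
  return "LARGE"; // assumption: exactly 500 files counts as LARGE
}

// One row of the matrix (Database), transcribed as a lookup.
const DATABASE_BY_SCALE = {
  SMALL: "SQLite",
  MEDIUM: "PostgreSQL",
  LARGE: "PostgreSQL + Redis + search",
};

function recommendDatabase(fileCount) {
  return DATABASE_BY_SCALE[classifyScale(fileCount)];
}
```

The other rows of the matrix would follow the same pattern, with one lookup object per decision.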
191
277
 
192
- ---
278
+ ### Domain Advisor Generator — 7 Non-Tech Domains
193
279
 
194
- ## Evidence-Based Intelligence
280
+ When `/dream` or `/setup` detects a non-developer domain, a domain-specific advisor skill is generated automatically — with decision matrices, thresholds, and anti-patterns:
195
281
 
196
- ### Reflexes -- Learned Behavioral Patterns
282
+ | Domain | What gets generated |
283
+ |--------|-------------------|
284
+ | Compliance | Regulation mapping, evidence strategy, article-level traceability, audit trail |
285
+ | Finance | Event-sourced data model, integer-cents precision, reconciliation, risk model |
286
+ | Medical | FHIR vs HL7, HIPAA vs GDPR privacy model, clinical workflow, terminology |
287
+ | Marketing | Channel strategy, funnel design, pricing model, metric focus by revenue stage |
288
+ | Research | Literature scope, methodology, experiment design, statistical rigor |
289
+ | Legal | Contract structure, clause tracking, jurisdiction, risk classification |
290
+ | Logistics | Routing, inventory model, tracking granularity |
197
291
 
198
- AZCLAUDE observes tool-use patterns across sessions and extracts atomic behaviors called reflexes. Each reflex is confidence-scored, domain-tagged, and evidence-backed.
292
+ ### Reflexes: Learned Behavioral Patterns
293
+
294
+ Every tool use is observed. Patterns that repeat become reflexes:
199
295
 
200
296
  ```yaml
201
297
  id: grep-before-edit
202
298
  trigger: "when modifying code files"
203
299
  action: "Search with Grep first, confirm with Read, then Edit"
204
- confidence: 0.7 # 0.3 tentative -> 0.9 certain
300
+ confidence: 0.7 # 0.3 tentative → 0.9 near-certain
205
301
  evidence_count: 8
302
+ domain: workflow
206
303
  ```
207
304
 
208
- - PostToolUse hook captures observations to `observations.jsonl` automatically
209
- - 3+ occurrences of a pattern creates a reflex
210
- - Confidence decays at -0.02/week without observation (stale patterns auto-prune)
211
- - Strong reflex clusters evolve into skills or agents via `/evolve`
212
- - Global scope promotion when seen in 2+ projects with confidence >= 0.8
305
+ - `PostToolUse` hook captures observations to `reflexes/observations.jsonl` automatically
306
+ - 3+ occurrences creates a reflex at confidence 0.3
307
+ - Confidence rises with confirming observations, decays -0.02/week without use
308
+ - Strong clusters (3+ reflexes, avg confidence > 0.7) evolve into skills or agents
309
+ - Global promotion when seen in 2+ projects at confidence ≥ 0.8
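The confidence numbers in this list can be worked through with a short sketch. It assumes linear decay applied per idle week and uses the 0.15 prune threshold mentioned under Cycle 2 of `/evolve`. The function names are hypothetical, and the size of the confidence increase per confirming observation is not stated in the README, so it is not modeled here.

```javascript
// Hypothetical reflex lifecycle helpers; only the constants come from the README.
const INITIAL_CONFIDENCE = 0.3; // new reflex after 3+ occurrences
const WEEKLY_DECAY = 0.02;      // decay per week without use
const PRUNE_BELOW = 0.15;       // /evolve auto-prunes below this

function decayedConfidence(confidence, idleWeeks) {
  const value = Math.max(0, confidence - WEEKLY_DECAY * idleWeeks);
  return Math.round(value * 100) / 100; // round to 2 decimals to avoid float noise
}

function shouldPrune(confidence) {
  return confidence < PRUNE_BELOW;
}
```

Under these assumptions a freshly created reflex that is never confirmed again drops from 0.3 to 0.14 after 8 idle weeks, at which point it falls below the prune threshold.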
213
310
 
214
- ### Architecture Advisor -- 8 Decision Matrices
311
+ ### Context Artifacts: Non-Code Project Knowledge
215
312
 
216
- Auto-fires on architecture decisions. Claude knows every framework; this skill guides **when to use which** based on project scale (SMALL/MEDIUM/LARGE):
313
+ Before implementing, AZCLAUDE discovers and reads non-code knowledge that informs implementation:
217
314
 
218
- | Decision area | Example guidance |
219
- |--------------|-----------------|
220
- | Architecture | SMALL: flat modules. MEDIUM: modular monolith. LARGE: monolith + targeted microservices |
221
- | Database | SMALL: SQLite. MEDIUM+: PostgreSQL. Cache: Redis. Search: Postgres FTS first |
222
- | Rendering | Marketing: SSG. Dashboards: SSR. Admin: SPA. Products: ISR |
223
- | Testing | MVP: test-after critical paths. MEDIUM: TDD for business logic. LARGE: full TDD |
224
- | API design | Internal: tRPC. Public: REST. Mobile: GraphQL. Real-time: WebSocket/SSE |
225
- | State mgmt | Simple: useState. Server data: TanStack Query. Complex: Zustand. Workflows: XState |
226
- | Deployment | MVP: Vercel/Railway. Scale: AWS/GCP with IaC |
227
- | Auth | Small: Clerk/Supabase. Large: Auth0/Keycloak |
315
+ | Type | Examples | Why it matters |
316
+ |------|---------|---------------|
317
+ | Database schemas | `prisma/schema.prisma`, `schema.sql` | Know table structure before writing queries |
318
+ | API specs | `openapi.yaml`, `swagger.json`, `.proto` | Know endpoints before building integrations |
319
+ | Infra configs | `terraform/`, `docker-compose.yml` | Know deployment constraints before architecture decisions |
320
+ | Architecture docs | `docs/architecture.md`, ADRs | Know design decisions before proposing changes |
321
+ | Domain knowledge | `knowledge/`, business rules, regulations | Know domain constraints before implementing logic |
228
322
 
229
- Every recommendation includes the **threshold where it changes** and the **anti-pattern** to avoid.
323
+ ---
230
324
 
231
- ### Domain Advisor Generator -- 7 Non-Tech Domains
325
+ ## Autonomous Mode
232
326
 
233
- When `/dream` detects a non-developer domain, it auto-generates a domain-specific advisor skill with decision matrices, best practices, and anti-patterns:
327
+ ### `/copilot`: describe a product, come back to working code
234
328
 
235
- | Domain | Generated decisions |
236
- |--------|-------------------|
237
- | Compliance | Regulation mapping, evidence strategy, assessment approach, documentation depth |
238
- | Marketing | Channel strategy, funnel design, pricing model, KPI focus by revenue stage |
239
- | Finance | Data model (event-sourced), calculation precision (integer-cents), reconciliation |
240
- | Medical | Data standard (FHIR vs HL7), privacy model (HIPAA vs GDPR), terminology |
241
- | Research | Literature scope, methodology, experiment design, statistical rigor |
242
- | Legal | Contract structure, clause tracking, jurisdiction, risk classification |
243
- | Logistics | Routing, inventory model, tracking granularity |
329
+ ```bash
330
+ npx azclaude-copilot . "Build a compliance SaaS with trilingual support"
331
+ # or resume existing run:
332
+ npx azclaude-copilot .
333
+ ```
244
334
 
245
- ### Agent Emergence
335
+ Node.js runner restarts Claude Code sessions in a loop until `COPILOT_COMPLETE`.
246
336
 
247
- Copilot starts with **zero project agents**. They emerge from the work.
337
+ **Three-tier intelligent team (v0.4+):**
248
338
 
249
339
  ```
250
- Session 1 (milestones 1-3):
251
- 0 project agents. Build basic structure.
252
- Git: 3 commits touching fastapi/, next/, supabase/
253
-
254
- Session 2 (milestones 4-6):
255
- /evolve reads git log
256
- 15 files in fastapi/ -> creates cc-fastapi agent
257
- 8 files in next/ with i18n patterns -> creates cc-frontend-i18n agent
258
- 2 agents, both from real code patterns
259
-
260
- Session 3 (milestones 7-9):
261
- Compliance logic repeating across 6 files -> creates cc-compliance-engine agent
262
- 3 project agents, all from evidence
263
- Environment specialized for THIS project
264
-
265
- Session 4:
266
- Full evolved environment. /audit -> /ship -> deploy. COPILOT_COMPLETE
340
+ Orchestrator        Problem-Architect                       Milestone-Builder
341
+ ─────────────       ─────────────────                       ─────────────────
342
+ Reads plan.md   →   Analyzes milestone                  →   Pre-reads all files
343
+ Selects wave        Returns Team Spec:                      Implements
344
+ Dispatches          • agents needed                         Runs tests
345
+ Monitors            • skills to load                        Self-corrects (budget)
346
+ Triggers /evolve    • files to pre-read                     Commits + reports back
347
+ Never writes code   • Files Written (parallel safety)
348
+                       pre-conditions, risks
349
+                     • complexity (SIMPLE/MEDIUM/COMPLEX)
350
+                     Never implements
267
351
  ```
268
352
 
269
- System agents (code-reviewer, test-writer, orchestrator-init) run the framework. Project agents emerge from the work. Two separate layers.
270
-
271
- ### Context Artifacts
353
+ **Copilot pipeline:**
354
+ ```
355
+ Session 1: /dream → /blueprint (architect annotates milestones) → M1, M2, M3 → /snapshot
356
+ Session 2: /evolve (new agents unblock plan) → M4+M5 parallel → M6 → /snapshot
357
+ Session 3: /evolve → M7, M8, M9 → /snapshot
358
+ Session 4: /evolve → /audit → /ship → COPILOT_COMPLETE
359
+ ```
272
360
 
273
- Before implementing any feature, AZCLAUDE scans for non-code knowledge that informs implementation:
361
+ **Every 3 milestones:** `/reflexes analyze` + `/evolve` + orchestrator re-evaluates blocked milestones.
274
362
 
275
- | Type | Examples | Why it matters |
276
- |------|---------|---------------|
277
- | Database schemas | schema.sql, prisma/schema.prisma | Know table structure before writing queries |
278
- | API specs | openapi.yaml, swagger.json, .proto files | Know endpoints before building integrations |
279
- | Infra configs | terraform/, docker-compose.yml | Know deployment constraints before architecture decisions |
280
- | Architecture docs | docs/architecture.md, ADRs | Know design decisions before proposing changes |
281
- | Domain knowledge | knowledge/, business rules | Know domain constraints before implementing logic |
363
+ **Exit conditions:**
282
364
 
283
- Artifact discovery runs automatically in copilot mode. `/evolve` checks for stale references.
365
+ | Condition | Exit code |
366
+ |-----------|-----------|
367
+ | `COPILOT_COMPLETE` in goals.md | 0 — product shipped |
368
+ | Max sessions reached (default: 20) | 1 — resume with `npx azclaude-copilot .` |
369
+ | All milestones blocked | 1 — needs human intervention |
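The exit table can be sketched as a decision function the runner might evaluate between sessions. This is an illustration under assumptions, not the contents of the package's runner: `copilotExit`, the milestone object shape, and reading "all milestones blocked" as "every milestone that is not done is blocked" are hypothetical. The marker string, the exit codes, and the default of 20 sessions come from the table.

```javascript
// Hypothetical runner exit decision; field names and milestone shape are assumed.
const DEFAULT_MAX_SESSIONS = 20; // README: default max sessions

function copilotExit({ goalsText, sessionsRun, milestones, maxSessions = DEFAULT_MAX_SESSIONS }) {
  if (goalsText.includes("COPILOT_COMPLETE")) {
    return { code: 0, reason: "product shipped" };
  }
  // Assumption: "all milestones blocked" means every unfinished milestone is blocked.
  const remaining = milestones.filter((m) => m.status !== "done");
  if (remaining.length > 0 && remaining.every((m) => m.status === "blocked")) {
    return { code: 1, reason: "all milestones blocked" };
  }
  if (sessionsRun >= maxSessions) {
    return { code: 1, reason: "max sessions reached" };
  }
  return null; // no exit condition met: start another session
}
```

Returning `null` for "keep looping" keeps the function free of side effects, so the loop itself stays as simple as the README describes.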
284
370
 
285
371
  ---
286
372
 
@@ -290,220 +376,115 @@ Artifact discovery runs automatically in copilot mode. `/evolve` checks for stal
290
376
 
291
377
  | Command | What it does |
292
378
  |---------|-------------|
293
- | `/copilot` | Autonomous milestone execution. Plan, build, test, commit, evolve, ship. Zero human input. |
294
- | `/dream` | Idea to full project scaffold. Rules, memory, skills, agents -- built level by level. |
379
+ | `/copilot` | Autonomous milestone execution. Delegates to orchestrator team. Zero human input. |
380
+ | `/dream` | Idea → full project scaffold. CLAUDE.md, memory, skills, agents, built level by level. |
295
381
  | `/setup` | Analyze existing project. Detect domain + stack + scale. Build environment. |
296
- | `/add` | Add a feature. In copilot mode: uses milestone spec directly. |
297
- | `/fix` | REPRODUCE, INVESTIGATE, HYPOTHESIZE, FIX -- show passing tests. |
298
- | `/audit` | Spec-first review (read-only). In copilot mode: reviews against copilot-intent.md. |
382
+ | `/add` | Add a feature. Pre-analyzes scope via intelligent-dispatch before touching code. |
383
+ | `/fix` | REPRODUCE → INVESTIGATE → HYPOTHESIZE → FIX. Show passing tests. Never guesses. |
384
+ | `/audit` | Spec-first code review (read-only). Injects decisions.md + patterns.md as checklist. |
299
385
  | `/test` | IDE diagnostics, framework detection, exit-code gate, failure classification. |
300
- | `/blueprint` | Read-only analysis. Structured plan.md with milestones. In copilot mode: skips approval. |
301
- | `/ship` | Tests, secrets scan, commit, push. In copilot mode: auto-deploys. |
302
- | `/refactor` | Restructure safely. Tests before + after. Worktree isolation for risky changes. |
386
+ | `/blueprint` | Read-only analysis → structured plan.md. Architect annotates each milestone in copilot mode. |
387
+ | `/ship` | Risk scan → tests → secrets scan → commit → push. Auto-deploys in copilot mode. |
388
+ | `/refactor` | Safe restructuring. Tests before + after. Worktree isolation for high-risk changes. |
303
389
  | `/doc` | Generate docs from code. Matches existing style. |
304
- | `/migrate` | Upgrade deps/frameworks. Researches breaking changes. |
390
+ | `/migrate` | Upgrade deps/frameworks. Researches breaking changes. Worktree for major versions. |
305
391
  | `/deps` | Audit: outdated, vulnerable, unused packages. |
306
392
 
307
393
  ### Think and Improve
308
394
 
309
395
  | Command | What it does |
310
396
  |---------|-------------|
311
- | `/debate` | Adversarial debate with evidence scoring (AceMAD protocol). Order-independent, length-independent. |
312
- | `/evolve` | Scan for gaps, generate fixes, quality-gate them. Create agents from evidence. 3 cycles. |
313
- | `/reflexes` | View/analyze learned behavioral patterns. Confidence scoring. Promote to global scope. |
314
- | `/level-up` | Show current level (0-10), build the next one. |
315
- | `/find` | Search across commands, ~/shared-skills/, capabilities. |
316
- | `/create` | Build a new command with frontmatter and tests. |
317
- | `/reflect` | Self-improve CLAUDE.md from conversation friction. |
318
- | `/hookify` | Generate hooks from friction patterns. 5 hook types. |
397
+ | `/debate` | Adversarial debate with evidence scoring (AceMAD). Order-independent, length-independent. |
398
+ | `/evolve` | Detect gaps → generate fixes → quality-gate → create agents from evidence. 3 cycles. |
399
+ | `/reflexes` | View, analyze, promote learned behavioral patterns. Confidence scoring. |
400
+ | `/level-up` | Show current level (0-10), build the next one progressively. |
401
+ | `/find` | Search across commands, `~/shared-skills/`, capabilities manifest. |
402
+ | `/create` | Build a new command with frontmatter, trigger variants, and tests. |
403
+ | `/reflect` | Self-improve CLAUDE.md from conversation friction and session history. |
404
+ | `/hookify` | Generate hooks from friction patterns. 5 hook types (block / warn / remind / inject / track). |
319
405
 
320
406
  ### Memory and Session
321
407
 
322
408
  | Command | What it does |
323
409
  |---------|-------------|
324
- | `/snapshot` | Mid-session snapshot: WHY + decisions + what's next. |
325
- | `/persist` | End-of-session: goals, friction log, session summary. |
326
- | `/pulse` | Health check + recent changes + current level. |
327
- | `/explain` | Code or error to plain language. |
410
+ | `/snapshot` | Mid-session: WHY + decisions + what's next. Auto-injected at next session start. |
411
+ | `/persist` | End-of-session: update goals.md, write session narrative to `sessions/`. |
412
+ | `/pulse` | Health check: recent changes, current level, reflexes, blockers, next steps. |
413
+ | `/explain` | Code or error to plain language. 2-3 paragraphs max. |
328
414
  | `/loop` | Repeat any command on an interval via CronCreate. |
329
415
 
330
416
  ---
331
417
 
332
- ## 8 Skills (Auto-Invoked)
418
+ ## 10 Agents
333
419
 
334
- Skills fire automatically based on context -- no slash command needed.
420
+ **Framework agents** (ship with AZCLAUDE, always available):
335
421
 
336
- | Skill | Triggers on |
337
- |-------|------------|
338
- | session-guard | Session start, context reset, idle detection |
339
- | test-first | Writing/fixing code in TDD projects |
340
- | env-scanner | Project setup, stack detection |
341
- | debate | Decisions, trade-offs, comparisons |
342
- | security | Credentials, auth, payments, secrets |
343
- | skill-creator | "Create a skill", repeated workflows |
344
- | agent-creator | "Create an agent", agent boundaries |
345
- | architecture-advisor | Architecture decisions, which pattern/DB/framework for this project size |
422
+ | Agent | Role |
423
+ |-------|------|
424
+ | `orchestrator` | Tech lead for `/copilot`. Owns plan.md. Dispatches, monitors, triggers /evolve. Never writes code. |
425
+ | `problem-architect` | Pre-flight analyst. Returns Team Spec (agents/skills/files/risks/complexity) before every dispatch. Never implements. |
426
+ | `milestone-builder` | Base builder. Pre-reads all files, implements, verifies, self-corrects (fix budget), commits, reports. |
427
+ | `orchestrator-init` | Runs once during `/setup`. Scans project, fills CLAUDE.md, creates goals.md. Exits permanently. |
428
+ | `loop-controller` | Level 10 autonomous agent. 3 cycles: evolution, knowledge consolidation, topology optimization. |
429
+ | `code-reviewer` | Spec-first review. Stage 1: spec compliance. Stage 2: quality. Read-only. Never modifies files. |
430
+ | `test-writer` | Reads existing test patterns. Matches framework, style, naming. Writes and runs tests. |
431
+ | `cc-template-author` | Writes AZCLAUDE template files with proper structure. |
432
+ | `cc-cli-integrator` | Integrates new features into `bin/cli.js`. |
433
+ | `cc-test-maintainer` | Maintains `tests/test-features.sh` with correct grep patterns. |
346
434
 
347
- Each skill has: `SKILL.md` (lean workflow), `references/` (deep content), `scripts/` (deterministic detection).
435
+ **Project agents** (emerge from your git history via `/evolve`):
436
+ - Named `cc-{area}`, scoped to specific directories
437
+ - Created when 3+ files in the same area change together across 2+ commits
438
+ - Every agent has exactly 5 layers: persona, scope, tools, constraints, domain knowledge
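
The co-change rule in the list above (3+ files in one area, across 2+ commits) can be sketched roughly as follows. This is only an illustration: the diff does not show how `/evolve` actually reads git history or defines an "area", so the `candidateAreas` helper, its parameters, and the "first two path segments" definition of an area are all assumptions.

```javascript
// Hypothetical sketch of the co-change threshold; not the /evolve implementation.
// Assumption: an "area" is the first two segments of a file path.
function candidateAreas(commits, minFiles = 3, minCommits = 2) {
  const qualifyingCommits = new Map(); // area -> number of commits that touched >= minFiles files there
  for (const files of commits) {
    const perArea = new Map(); // area -> files changed in this commit
    for (const file of files) {
      const area = file.split('/').slice(0, 2).join('/');
      perArea.set(area, (perArea.get(area) || 0) + 1);
    }
    for (const [area, count] of perArea) {
      if (count >= minFiles) {
        qualifyingCommits.set(area, (qualifyingCommits.get(area) || 0) + 1);
      }
    }
  }
  return [...qualifyingCommits]
    .filter(([, count]) => count >= minCommits)
    .map(([area]) => area);
}

// Each inner array stands for the files changed in one commit (made-up data).
const commits = [
  ['src/api/a.js', 'src/api/b.js', 'src/api/c.js', 'README.md'],
  ['src/api/a.js', 'src/api/b.js', 'src/api/d.js'],
  ['src/ui/x.js'],
];
console.log(candidateAreas(commits));
```

Only `src/api` meets both thresholds in this sample, so under these assumptions it would be the one area proposed for a `cc-{area}` agent.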
348
439
 
349
440
  ---
350
441
 
351
- ## Memory System
352
-
353
- Three layers work silently. Context compaction stops being a problem.
354
-
355
- ```
356
- +--------------------------------------------------------------+
357
- | AUTOMATIC LAYER |
358
- | (zero user input required) |
359
- | |
360
- | PostToolUse hook --> goals.md --> UserPromptSubmit |
361
- | (fires on every edit) (rolling ledger) (injects before |
362
- | your message) |
363
- | |
364
- | Stop hook --> migrates "In progress" to "Done" |
365
- +--------------------------------------------------------------+
366
- | MANUAL LAYER |
367
- | (user triggers when ready) |
368
- | |
369
- | /snapshot --> checkpoints/{timestamp}.md |
370
- | (WHY you made decisions -- every 15-20 turns) |
371
- | |
372
- | /persist --> sessions/{date}-{topic}.md |
373
- | (full session narrative -- before closing) |
374
- +--------------------------------------------------------------+
375
- ```
376
-
377
- | Layer | Mechanism | Survives compaction | Automatic |
378
- |-------|-----------|-------------------|-----------|
379
- | File breadcrumb | PostToolUse -> goals.md | Yes -- WHERE + WHAT changed | Yes, every edit |
380
- | Reasoning snapshot | /snapshot -> checkpoints/ | Yes -- WHY decisions were made | Manual, every 15-20 turns |
381
- | Session narrative | /persist -> sessions/ | Yes -- full summary + next actions | Manual, before closing |
382
-
383
- `UserPromptSubmit` hook injects before your first message every session:
384
-
385
- | What | When | Profile |
386
- |------|------|---------|
387
- | `goals.md` (capped at 20 done + 30 in-progress) | Every session | all |
388
- | Latest checkpoint (capped at 50 lines) | Every session | all |
389
- | Plan status: `X/N done, Y in-progress, Z blocked` | Copilot mode only | standard + strict |
390
- | Learned reflexes (confidence ≥ 0.8, max 5) | Always | strict only |
442
+ ## What Makes It Different
391
443
 
392
- Token cost: ~500 tokens fixed. goals.md is bounded at 80 lines by auto-rotation — same cost at session 5 or session 500.
444
+ | Feature | Claude Code alone | AZCLAUDE |
445
+ |---------|------------------|---------|
446
+ | Project memory | Starts fresh every session | goals.md + checkpoints injected automatically |
447
+ | Conventions | Ad-hoc, re-explained each time | CLAUDE.md — loaded before every task |
448
+ | Learned behavior | None | Reflexes extracted from tool-use, confidence-scored |
449
+ | Architecture decisions | Re-debated every time | decisions.md — logged once, referenced forever |
450
+ | Failed approaches | Repeated | antipatterns.md — agents read before implementing |
451
+ | Domain knowledge | Generic | Domain advisors generated for compliance, finance, medical, legal... |
452
+ | Agent specialization | None | Project agents emerge from git evidence, not guessing |
453
+ | Autonomous building | Not possible | /copilot — three-tier intelligent team |
454
+ | Self-improvement | Not possible | /evolve — 3-cycle environment evolution |
455
+ | Any stack | Yes | Yes |
456
+ | You own the code | Yes | Yes |
457
+ | Zero dependencies | — | Yes (0 in package.json) |
393
458
 
394
459
  ---
395
460
 
396
461
  ## Security
397
462
 
398
- Zero dependencies in `package.json`. The only external binary is `claude` CLI (installed separately). This eliminates supply-chain risk entirely.
399
-
400
- ### 6 Security Layers
401
-
402
- 1. **Hook integrity** -- SHA-256 hash verified on every run
403
- 2. **Command injection protection** -- shell metacharacters rejected in file paths
404
- 3. **Prompt injection defense** -- suspicious patterns stripped from context injection (`curl|bash`, `ignore previous instructions`, base64 blocks)
405
- 4. **Skill checksums** -- portable skills SHA-256 hashed, imports fail if tampered
406
- 5. **Credential auditing** -- `/ship` blocks on `.env`, API keys, tokens before any git push
407
- 6. **Agent scoping** -- review agents read-only (`EnterPlanMode`), experiments in isolated worktrees
463
+ Zero dependencies in `package.json`. The only external binary is `claude` (installed separately). No supply-chain risk.
408
464
 
409
- ### Hook Profiles
465
+ **6 layers:**
466
+ 1. **Hook integrity** — SHA-256 hash verified on every run
467
+ 2. **Command injection protection** — shell metacharacters rejected in file paths
468
+ 3. **Prompt injection defense** — strips `curl|bash`, `ignore previous instructions`, base64 blocks from context injection
469
+ 4. **Skill checksums** — portable skills SHA-256 hashed, imports fail if tampered
470
+ 5. **Credential auditing** — `/ship` blocks on `.env`, `AKIA*`, `sk-*`, `ghp_*` before any git push
471
+ 6. **Agent scoping** — review agents read-only (`EnterPlanMode`), experiments in isolated worktrees (`EnterWorktree`)
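
As a rough illustration of the hash checks in layers 1 and 4, the sketch below compares a SHA-256 digest of some content against an expected value using Node's built-in `crypto` module. Where AZCLAUDE stores its expected hashes is not shown in this diff, so `verifyIntegrity` and its arguments are hypothetical names used only for the example.

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-256 digest of a string or Buffer.
function sha256(data) {
  return crypto.createHash('sha256').update(data).digest('hex');
}

// Hypothetical helper: true only when the content hashes to the expected digest.
function verifyIntegrity(contents, expectedHex) {
  const actual = Buffer.from(sha256(contents), 'hex');
  const expected = Buffer.from(expectedHex, 'hex');
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return actual.length === expected.length && crypto.timingSafeEqual(actual, expected);
}

const original = 'console.log("hook");';
const recorded = sha256(original); // digest recorded at install time (assumed workflow)

console.log(verifyIntegrity(original, recorded));
console.log(verifyIntegrity(original + ' // tampered', recorded));
```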
410
472
 
411
- Control hook behavior via environment variable:
412
-
413
- ```bash
414
- AZCLAUDE_HOOK_PROFILE=minimal claude # goals.md tracking only
415
- AZCLAUDE_HOOK_PROFILE=standard claude # all features (default)
416
- AZCLAUDE_HOOK_PROFILE=strict claude # all + reflex guidance injection
417
- ```
418
-
419
- | Feature | minimal | standard | strict |
420
- |---------|---------|----------|--------|
421
- | goals.md tracking | ✓ | ✓ | ✓ |
422
- | Checkpoint injection | ✓ | ✓ | ✓ |
423
- | Reflex observations | — | ✓ | ✓ |
424
- | Cost tracking | — | ✓ | ✓ |
425
- | Plan status (copilot) | — | ✓ | ✓ |
426
- | Reflex guidance (≥0.8) | — | — | ✓ |
427
- | Memory rotation | ✓ | ✓ | ✓ |
428
-
429
- ### Doctor Audit
430
-
431
- ```bash
432
- npx azclaude-copilot doctor # 32 checks: hooks, settings, commands, memory
433
- npx azclaude-copilot doctor --audit # efficiency + security score
434
- ```
435
-
436
- See [SECURITY.md](SECURITY.md) for full details including known limitations and copilot-mode mitigations.
437
-
438
- ---
439
-
440
- ## Exit Conditions
441
-
442
- | Condition | Exit code |
443
- |-----------|-----------|
444
- | `COPILOT_COMPLETE` in goals.md | 0 -- product shipped |
445
- | Max sessions reached (default: 20) | 1 -- resume with `npx azclaude-copilot .` |
446
- | All milestones blocked | 1 -- needs human intervention |
447
-
448
- ---
449
-
450
- ## Project Structure
451
-
452
- ```
453
- azclaude-copilot/
454
- ├── bin/
455
- │ ├── cli.js <- installer, doctor, demo
456
- │ └── copilot.js <- autonomous runner (Node.js, cross-platform)
457
- ├── templates/
458
- │ ├── CLAUDE.md <- dispatch table template
459
- │ ├── hooks/ <- pure Node.js, cross-platform
460
- │ │ ├── user-prompt.js <- injects goals.md + checkpoint at session start
461
- │ │ ├── post-tool-use.js <- writes file + diff stat on every edit
462
- │ │ └── stop.js <- migrates In progress -> Done
463
- │ ├── agents/ (7) <- system + project agents
464
- │ ├── capabilities/ <- 36 files, lazy-loaded via manifest.md
465
- │ ├── commands/ (26) <- all 26 commands including /copilot, /reflexes
466
- │ ├── skills/ (8) <- auto-invoked SKILL.md files + architecture-advisor
467
- │ └── scripts/env-scan.sh
468
- ├── ROADMAP.md <- 5-phase build spec
469
- ├── DOCS.md <- full user guide
470
- ├── SECURITY.md <- security policy + architecture
471
- ├── tests/
472
- │ └── test-features.sh ← 1135 tests
473
- ```
474
-
475
- ---
476
-
477
- ## State Files
478
-
479
- The runner is stateless. These files ARE the state.
480
-
481
- | File | Written by | Read by | Purpose |
482
- |------|-----------|---------|---------|
483
- | `.claude/copilot-intent.md` | Runner | /dream, /copilot | Original product description |
484
- | `.claude/plan.md` | /blueprint | /copilot, /add | Milestone tracker with status |
485
- | `.claude/memory/goals.md` | Hooks | Every session start | File breadcrumbs + session state |
486
- | `.claude/memory/checkpoints/*` | /snapshot | Every session start | Reasoning snapshots |
487
- | `.claude/memory/patterns.md` | /evolve, agents | Agents, /add | What works |
488
- | `.claude/memory/antipatterns.md` | /evolve, agents | Agents, /add | What broke |
489
- | `.claude/memory/decisions.md` | /debate | Agents | Architecture choices |
490
- | `.claude/memory/blockers.md` | /copilot | /copilot, /debate | What's stuck and why |
491
- | `.claude/memory/reflexes/` | /reflexes, hooks | /evolve, agents | Learned behavioral patterns |
492
- | `.claude/copilot-report.md` | /copilot | Human | Final summary |
473
+ See [SECURITY.md](SECURITY.md) for full details.
493
474
 
494
475
  ---
495
476
 
496
477
  ## Verified
497
478
 
498
- 1135 tests. Every template, command, capability, agent, and CLI feature verified.
479
+ 1151 tests. Every template, command, capability, agent, hook, and CLI feature verified.
499
480
 
500
481
  ```bash
501
482
  bash tests/test-features.sh
502
- # Results: 1135 passed, 0 failed, 1135 total
483
+ # Results: 1151 passed, 0 failed, 1151 total
503
484
  ```
504
485
 
505
486
  ---
506
487
 
507
488
  ## License
508
489
 
509
- MIT -- [haytamAroui](https://github.com/haytamAroui)
490
+ MIT [haytamAroui](https://github.com/haytamAroui)
package/bin/cli.js CHANGED
@@ -118,7 +118,7 @@ function substitutePaths(content, cfg) {
118
118
 
119
119
  // ─── Hook Scripts ─────────────────────────────────────────────────────────────
120
120
 
121
- const HOOK_SCRIPTS = ['user-prompt.js', 'stop.js', 'post-tool-use.js'];
121
+ const HOOK_SCRIPTS = ['user-prompt.js', 'stop.js', 'post-tool-use.js', 'pre-tool-use.js'];
122
122
 
123
123
  function copyHookScripts(dstDir) {
124
124
  fs.mkdirSync(dstDir, { recursive: true });
@@ -139,9 +139,11 @@ function buildHookEntries(scriptsDir) {
139
139
  const userPromptScript = path.join(scriptsDir, 'user-prompt.js');
140
140
  const stopScript = path.join(scriptsDir, 'stop.js');
141
141
  const postToolUseScript = path.join(scriptsDir, 'post-tool-use.js');
142
+ const preToolUseScript = path.join(scriptsDir, 'pre-tool-use.js');
142
143
  return {
143
- UserPromptSubmit: [{ matcher: '', hooks: [{ type: 'command', command: `"${nodeExe}" "${userPromptScript}"` }] }],
144
- Stop: [{ matcher: '', hooks: [{ type: 'command', command: `"${nodeExe}" "${stopScript}"` }] }],
144
+ UserPromptSubmit: [{ matcher: '', hooks: [{ type: 'command', command: `"${nodeExe}" "${userPromptScript}"` }] }],
145
+ Stop: [{ matcher: '', hooks: [{ type: 'command', command: `"${nodeExe}" "${stopScript}"` }] }],
146
+ PreToolUse: [{ matcher: 'Write|Edit|MultiEdit', hooks: [{ type: 'command', command: `"${nodeExe}" "${preToolUseScript}"` }] }],
145
147
  PostToolUse: [{ matcher: 'Write|Edit|Read|Bash|Grep', hooks: [{ type: 'command', command: `"${nodeExe}" "${postToolUseScript}"` }] }],
146
148
  };
147
149
  }
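
To make the effect of this hunk easier to see, here is a minimal standalone sketch of the object shape that `buildHookEntries` returns after the change. It is not the function from `bin/cli.js`: `sketchHookEntries` is a hypothetical stand-in, and `nodeExe` defaulting to `process.execPath` is an assumption, since its definition lies outside the hunk.

```javascript
const path = require('path');

// Hypothetical stand-in for buildHookEntries; nodeExe default is an assumption.
function sketchHookEntries(scriptsDir, nodeExe = process.execPath) {
  const command = (file) => `"${nodeExe}" "${path.join(scriptsDir, file)}"`;
  const entry = (matcher, file) => [
    { matcher, hooks: [{ type: 'command', command: command(file) }] },
  ];
  return {
    UserPromptSubmit: entry('', 'user-prompt.js'),
    Stop: entry('', 'stop.js'),
    // New in this diff: runs before write-type tools only.
    PreToolUse: entry('Write|Edit|MultiEdit', 'pre-tool-use.js'),
    PostToolUse: entry('Write|Edit|Read|Bash|Grep', 'post-tool-use.js'),
  };
}

const entries = sketchHookEntries('/tmp/hooks', 'node');
console.log(JSON.stringify(entries.PreToolUse, null, 2));
```

The narrower `PreToolUse` matcher means the security hook is never spawned for `Read`, `Bash`, or `Grep`, which `PostToolUse` still observes.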
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "azclaude-copilot",
3
- "version": "0.4.2",
3
+ "version": "0.4.4",
4
4
  "description": "AI coding environment — 26 commands, 8 skills, 10 agents, memory, reflexes, evolution. Install once, works on any stack.",
5
5
  "bin": {
6
6
  "azclaude": "./bin/cli.js",
@@ -155,6 +155,7 @@ Cheaper than catching them at `/ship` or `/audit` time.
155
155
 
156
156
  | Pattern | Risk | Action |
157
157
  |---|---|---|
158
+ | `${{ github.event.` in `run:` steps | GitHub Actions workflow injection | Warn — use `${{ github.event.pull_request.title }}` only in env:, never directly in run: |
158
159
  | `eval(` / `new Function(` | Code injection | Warn — suggest alternative |
159
160
  | `os.system(` / `subprocess.call(` with `shell=True` | Command injection | Warn — suggest subprocess.run with list args |
160
161
  | `child_process.exec(` | Command injection | Warn — suggest execFile or spawn |
@@ -166,9 +167,10 @@ Cheaper than catching them at `/ship` or `/audit` time.
166
167
  **Implementation:** Add a PreToolUse hook with matcher `Edit|Write|MultiEdit`:
167
168
  ```bash
168
169
  # In the hook script, scan the new content for patterns:
169
- echo "$CLAUDE_TOOL_INPUT" | grep -qE 'eval\(|os\.system\(|pickle\.load' && \
170
+ echo "$CLAUDE_TOOL_INPUT" | grep -qE 'eval\(|os\.system\(|pickle\.load|\$\{\{.*github\.event\.' && \
170
171
  echo "⚠ Security: potentially unsafe pattern detected. Review before proceeding."
171
172
  ```
172
173
 
173
174
  **Rule:** Warn, don't block (except hardcoded secrets). The developer may have a
174
175
  valid reason. But make the pattern visible so it gets reviewed.
176
+ **GitHub Actions injection** is an exception — always treat `${{ github.event.* }}` in `run:` steps as HIGH risk because it allows attackers to inject arbitrary shell commands via PR titles, issue bodies, or comment text.
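
The distinction drawn above, where the expression is acceptable in `env:` but risky in `run:`, can be approximated with a line-based heuristic like the one below. This is a sketch under stated assumptions, not part of the package: `findRunInjections` is a hypothetical helper, it relies on indentation rather than a real YAML parser, and it will miss unusual layouts.

```javascript
// Heuristic sketch: report 1-based line numbers where ${{ github.event.* }}
// appears inside a run: step. Indentation-based; not a YAML parser.
function findRunInjections(yamlText) {
  const expr = /\$\{\{\s*github\.event\./;
  const hits = [];
  let runIndent = -1; // column of the current `run:` key, or -1 when outside a run block
  yamlText.split('\n').forEach((line, i) => {
    const indent = line.search(/\S/);
    if (indent === -1) return; // blank line: keep current state
    const m = line.match(/^(\s*(?:-\s+)?)run:\s*(.*)$/);
    if (m) {
      runIndent = m[1].length;
      if (expr.test(m[2])) hits.push(i + 1); // inline command on the run: line
      return;
    }
    if (runIndent !== -1 && indent > runIndent) {
      if (expr.test(line)) hits.push(i + 1); // continuation line of a block scalar
    } else {
      runIndent = -1; // sibling or parent key: the run block has ended
    }
  });
  return hits;
}

// Plain strings (not template literals) so ${{ ... }} is not interpolated by JavaScript.
const workflow = [
  'steps:',
  '  - name: unsafe',
  '    run: echo "${{ github.event.pull_request.title }}"',
  '  - name: safe',
  '    env:',
  '      TITLE: ${{ github.event.pull_request.title }}',
  '    run: |',
  '      echo "$TITLE"',
].join('\n');

console.log(findRunInjections(workflow));
```

In the sample, only the third line is reported: the `env:` assignment passes the value through an environment variable, so the shell never parses attacker-controlled text as part of the command.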
@@ -0,0 +1,163 @@
1
+ #!/usr/bin/env node
2
+ 'use strict';
3
+ /**
4
+ * AZCLAUDE — PreToolUse security hook
5
+ * Fires BEFORE Edit, Write, MultiEdit operations.
6
+ * Scans content for security patterns: injection, XSS, deserialization, secrets.
7
+ * Warnings → stderr (exit 0, Claude continues).
8
+ * Hardcoded secrets → exit 2 (Claude Code blocks the write).
9
+ * Silent for all other tools, node_modules, .git, .md files.
10
+ * No dependencies. Pure synchronous fs. Cross-platform (Windows/macOS/Linux).
11
+ */
12
+ const fs = require('fs');
13
+ const path = require('path');
14
+ const os = require('os');
15
+
16
+ // ── Parse stdin ──────────────────────────────────────────────────────────────
17
+ let toolName = '';
18
+ let filePath = '';
19
+ let content = '';
20
+ try {
21
+ const raw = fs.readFileSync(0, 'utf8'); // fd 0 = stdin
22
+ const data = JSON.parse(raw);
23
+ toolName = data.tool_name || '';
24
+ filePath = data.tool_input?.file_path || data.tool_input?.path || '';
25
+ // Edit uses new_string; Write uses content (MultiEdit is handled below)
26
+ content = data.tool_input?.new_string || data.tool_input?.content || '';
27
+ // MultiEdit: scan all edits
28
+ if (!content && Array.isArray(data.tool_input?.edits)) {
29
+ content = data.tool_input.edits.map(e => e.new_string || '').join('\n');
30
+ }
31
+ } catch (_) {
32
+ process.exit(0); // malformed JSON — stay out of the way
33
+ }
34
+
35
+ // ── Gate: only act on write-type tools ──────────────────────────────────────
36
+ const WRITE_TOOLS = new Set(['Edit', 'Write', 'MultiEdit']);
37
+ if (!WRITE_TOOLS.has(toolName)) process.exit(0);
38
+
39
+ // ── Gate: skip noisy paths ───────────────────────────────────────────────────
40
+ if (filePath) {
41
+ const rel = path.relative(process.cwd(), path.resolve(filePath));
42
+ if (/node_modules[\\/]/.test(rel)) process.exit(0);
43
+ if (/\.git[\\/]/.test(rel)) process.exit(0);
44
+ if (/\.md$/i.test(filePath)) process.exit(0);
45
+ }
46
+
47
+ // ── Gate: nothing to scan ────────────────────────────────────────────────────
48
+ if (!content) process.exit(0);
49
+
50
+ // ── Security rules ───────────────────────────────────────────────────────────
51
+ // Each rule: { id, test, message, block }
52
+ // block:true → exit 2 (Claude Code refuses the write).
53
+ // block:false → exit 0 (warn on stderr, allow).
54
+ const RULES = [
55
+ {
56
+ id: 'gh-actions-injection',
57
+ test: /\$\{\{\s*github\.event\./,
58
+ message: 'GitHub Actions expression in run: context — injection risk. Validate event data before use.',
59
+ block: false,
60
+ },
61
+ {
62
+ id: 'child-process-exec',
63
+ test: /child_process\.exec\s*\(/,
64
+ message: 'child_process.exec() detected — command injection risk. Prefer child_process.execFile() or spawnSync() with argument arrays.',
65
+ block: false,
66
+ },
67
+ {
68
+ id: 'new-function',
69
+ test: /new\s+Function\s*\(/,
70
+ message: 'new Function() detected — dynamic code execution risk. Avoid constructing functions from strings.',
71
+ block: false,
72
+ },
73
+ {
74
+ id: 'eval',
75
+ test: /\beval\s*\(/,
76
+ message: 'eval() detected — code injection risk. Use safer alternatives (JSON.parse, Function constructors avoided).',
77
+ block: false,
78
+ },
79
+ {
80
+ id: 'dangerously-set-inner-html',
81
+ test: /dangerouslySetInnerHTML/,
82
+ message: 'dangerouslySetInnerHTML detected — XSS risk. Sanitize HTML with DOMPurify or avoid entirely.',
83
+ block: false,
84
+ },
85
+ {
86
+ id: 'dom-xss',
87
+ test: /document\.write\s*\(|\.innerHTML\s*=/,
88
+ message: 'document.write() or .innerHTML = detected — DOM XSS risk. Use textContent or a sanitization library.',
89
+ block: false,
90
+ },
91
+ {
92
+ id: 'pickle-deserialization',
93
+ test: /pickle\.loads?\s*\(/,
94
+ message: 'pickle.load()/pickle.loads() detected — deserialization risk. Never unpickle untrusted data.',
95
+ block: false,
96
+ },
97
+ {
98
+ id: 'hardcoded-secret',
99
+ test: /AKIA[A-Z0-9]{16}|sk-[a-z0-9]{20,}|ghp_[A-Za-z0-9]{36}/,
100
+ message: 'Hardcoded secret pattern detected',
101
+ block: true,
102
+ },
103
+ ];
104
+
105
+ // ── Session dedup ─────────────────────────────────────────────────────────────
106
+ // Store warned file+rule combos in a temp JSON file keyed by session PID.
107
+ // Clean up temp files older than 24 h at startup.
108
+ const SESSION_ID = process.ppid || process.pid;
109
+ const DEDUP_PATH = path.join(os.tmpdir(), `.azclaude-sec-${SESSION_ID}`);
110
+ const MAX_AGE_MS = 24 * 60 * 60 * 1000;
111
+
112
+ // Cleanup stale dedup files (best-effort, never fatal)
113
+ try {
114
+ const tmpFiles = fs.readdirSync(os.tmpdir());
115
+ for (const f of tmpFiles) {
116
+ if (!f.startsWith('.azclaude-sec-')) continue;
117
+ const fp = path.join(os.tmpdir(), f);
118
+ const age = Date.now() - fs.statSync(fp).mtimeMs;
119
+ if (age > MAX_AGE_MS) { try { fs.unlinkSync(fp); } catch (_) {} }
120
+ }
121
+ } catch (_) {}
122
+
123
+ let dedup = {};
124
+ try { dedup = JSON.parse(fs.readFileSync(DEDUP_PATH, 'utf8')); } catch (_) {}
125
+
126
+ function saveDedup() {
127
+ try { fs.writeFileSync(DEDUP_PATH, JSON.stringify(dedup)); } catch (_) {}
128
+ }
129
+
130
+ // ── Scan ─────────────────────────────────────────────────────────────────────
131
+ const displayName = filePath
132
+ ? path.relative(process.cwd(), path.resolve(filePath)) || filePath
133
+ : '(inline content)';
134
+
135
+ let didBlock = false;
136
+
137
+ for (const rule of RULES) {
138
+ if (!rule.test.test(content)) continue;
139
+
140
+ const dedupKey = `${displayName}:${rule.id}`;
141
+
142
+ if (rule.block) {
143
+ // Always emit the block message — secrets must never be silently swallowed
144
+ process.stderr.write(
145
+ `\n✗ SECURITY BLOCK: ${rule.message} in ${displayName}.\n` +
146
+ ` Use environment variables instead: process.env.MY_SECRET\n` +
147
+ ` Refusing to write. Fix before proceeding.\n\n`
148
+ );
149
+ didBlock = true;
150
+ continue; // check remaining rules before exiting
151
+ }
152
+
153
+ // Warning — deduplicated per session
154
+ if (dedup[dedupKey]) continue;
155
+ dedup[dedupKey] = true;
156
+ saveDedup();
157
+
158
+ process.stderr.write(
159
+ `\n⚠ SECURITY: ${rule.message.split(' — ')[0]} in ${displayName} — ${rule.message.includes(' — ') ? rule.message.split(' — ')[1] : rule.message}\n`
160
+ );
161
+ }
162
+
163
+ process.exit(didBlock ? 2 : 0);
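
The warn-versus-block contract of the hook above can be illustrated in isolation. The sketch below copies two of the rule patterns from the file and wraps them in a hypothetical `scan` function that returns the exit code the hook would use; it leaves out stdin parsing, path gating, and session dedup, and the test key is a synthetic string that merely matches the pattern.

```javascript
// Two rules copied from the hook above; the scan() wrapper is hypothetical.
const RULES = [
  { id: 'eval', test: /\beval\s*\(/, block: false },
  {
    id: 'hardcoded-secret',
    test: /AKIA[A-Z0-9]{16}|sk-[a-z0-9]{20,}|ghp_[A-Za-z0-9]{36}/,
    block: true,
  },
];

// Returns the ids of warning rules that matched and the exit code:
// 2 if any blocking rule matched, otherwise 0.
function scan(content) {
  const matched = RULES.filter((rule) => rule.test.test(content));
  return {
    warnings: matched.filter((rule) => !rule.block).map((rule) => rule.id),
    exitCode: matched.some((rule) => rule.block) ? 2 : 0,
  };
}

// Synthetic value shaped like an AWS access key ID; not a real credential.
const fakeKey = 'AKIA' + 'A'.repeat(16);

console.log(scan('const result = eval(userInput);'));
console.log(scan(`const key = "${fakeKey}";`));
```

The first call yields a warning with exit code 0, so the write proceeds; the second yields exit code 2, which the header comment of the hook describes as the signal for Claude Code to block the write.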