npm - azclaude-copilot - Versions diffs - 0.4.2 → 0.4.4 - Mend

azclaude-copilot 0.4.2 → 0.4.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5) hide show

package/README.md +319 -338
package/bin/cli.js +5 -3
package/package.json +1 -1
package/templates/capabilities/shared/security.md +3 -1
package/templates/hooks/pre-tool-use.js +163 -0

package/README.md CHANGED Viewed

@@ -1,6 +1,6 @@
 <p align="center">
   <h1 align="center">AZCLAUDE</h1>
-  <p align="center"><strong>AI coding environment that learns, evolves, and builds autonomously.</strong></p>
+  <p align="center"><strong>A complete AI coding environment — built on Claude Code's native architecture.</strong></p>
   <p align="center">
     <a href="https://www.npmjs.com/package/azclaude-copilot"><img src="https://img.shields.io/npm/v/azclaude-copilot.svg" alt="npm version"></a>
     <a href="https://github.com/haytamAroui/AZ-CLAUDE-COPILOT/actions/workflows/tests.yml"><img src="https://github.com/haytamAroui/AZ-CLAUDE-COPILOT/actions/workflows/tests.yml/badge.svg" alt="tests"></a>
@@ -9,278 +9,364 @@
   </p>
   <p align="center">
     <a href="#install">Install</a> ·
-    <a href="#three-ways-to-use-it">Use It</a> ·
+    <a href="#the-core-idea">Core Idea</a> ·
     <a href="#what-you-get">What You Get</a> ·
-    <a href="#evidence-based-intelligence">Intelligence</a> ·
+    <a href="#memory-system">Memory</a> ·
     <a href="#all-26-commands">Commands</a> ·
+    <a href="#autonomous-mode">Autonomous Mode</a> ·
     <a href="DOCS.md">Full Docs</a>
   </p>
 </p>
 ---
-## What is AZCLAUDE?
+## The Core Idea
-An AI coding environment you install into any project. It gives Claude Code (or Gemini CLI, Codex, OpenCode, Cursor) **26 commands, 8 auto-invoked skills, 7 agents, memory across sessions, learned reflexes, and self-evolving infrastructure**.
+**CLAUDE.md and markdown memory files are the best way to work with an LLM.**
-Zero dependencies. One install. Works on any stack.
+Not vector databases. Not API wrappers. Not prompt templates. Plain markdown files, structured and injected at exactly the right moment.
+Claude Code exposes this natively: `CLAUDE.md` for conventions, hooks for automation, `.claude/` for state. AZCLAUDE implements the full architecture on top of it — every file, every hook, every pattern proven to work.
+```
+Without AZCLAUDE:                     With AZCLAUDE:
+─────────────────                     ──────────────
+Claude starts every session blind.    Claude reads goals.md before your first message.
+No project conventions.               CLAUDE.md has your stack, domain, and rules.
+Repeats the same mistakes.            antipatterns.md prevents known failures.
+Forgets what was decided.             decisions.md logs every architecture choice.
+Builds the same agent repeatedly.     patterns.md encodes what worked.
+Can't work autonomously.              /copilot builds, tests, commits, ships — unattended.
+```
+One install. Any stack. Zero dependencies.
 ---
 ## Install
-**Step 1 — core install** (26 commands, memory, reflexes, evolution):
 ```bash
-npx azclaude-copilot
+npx azclaude-copilot          # core install — 26 commands, memory, hooks, reflexes
+npx azclaude-copilot --full   # full install — adds debate engine, pipeline, ELO
+```
+Then in Claude Code:
+```
+/setup    # analyze your project, fill CLAUDE.md, build environment
 ```
-**Step 2 — full install** (adds Level 5+: debate, pipeline, ELO — optional):
+That's it. Your project now has AZCLAUDE in `.claude/`.
 ```bash
-npx azclaude-copilot --full
+npx azclaude-copilot doctor   # 32 checks — verify everything is wired correctly
 ```
-**Step 3 — configure your project** (open Claude Code, then run):
+---
+## What You Get
+**26 commands** · **8 auto-invoked skills** · **10 agents** · **3 hooks** · **memory across sessions** · **learned reflexes** · **self-evolving environment**
 ```
-/setup
+.claude/
+├── CLAUDE.md                 ← dispatch table: conventions, stack, routing
+├── commands/                 ← 26 slash commands (/add, /fix, /audit, /copilot...)
+├── skills/                   ← 8 skills (test-first, security, architecture-advisor...)
+├── agents/                   ← 10 agents (orchestrator, code-reviewer, test-writer...)
+├── capabilities/             ← 37 files, lazy-loaded via manifest.md (~380 tokens/task)
+├── hooks/
+│   ├── post-tool-use.js      ← writes breadcrumb to goals.md on every edit
+│   ├── user-prompt.js        ← injects goals.md + checkpoint before your first message
+│   └── stop.js               ← migrates In-progress → Done, trims, resets counter
+└── memory/
+    ├── goals.md              ← rolling ledger of what changed and why
+    ├── checkpoints/          ← WHY decisions were made (/snapshot)
+    ├── patterns.md           ← what worked — agents read this before implementing
+    ├── antipatterns.md       ← what broke — prevents repeating failures
+    ├── decisions.md          ← architecture choices logged by /debate
+    ├── blockers.md           ← what's stuck and why
+    └── reflexes/             ← learned behavioral patterns (confidence-scored)
 ```
-That's it. Your project now has AZCLAUDE in `.claude/`.
 ---
 ## Three Ways to Use It
-### 1. `/setup` — Configure an existing project
-Open Claude Code in your project, then run:
+### 1. `/setup` — wire an existing project
 ```
 /setup
 ```
-Analyzes your project's stack, domain, and scale. Fills CLAUDE.md. Generates project-specific skills and agents. Creates memory structure.
+Scans your codebase, detects domain + stack + scale, fills CLAUDE.md, creates goals.md, generates project-specific skills and agents. Run once. After that, every Claude Code session opens with full project context.
-### 2. `/dream` — Start from an idea
+### 2. `/dream` — start from an idea
 ```
-/dream
-> "Build a compliance SaaS with trilingual support"
+/dream "Build a compliance SaaS — FastAPI, Supabase, trilingual"
 ```
-Scaffolds the full project: CLAUDE.md, skills, agents, memory, milestones. You build from there.
+Structured intake → environment scan → builds CLAUDE.md, memory, skills, agents, milestones level by level. If you have a non-developer domain (compliance, finance, medical, legal), it generates a domain-specific advisor skill with decision matrices automatically.
-### 3. `/copilot` — Full autonomous mode
+### 3. `/copilot` — walk away, come back to a product
 ```bash
 npx azclaude-copilot . "Build a compliance SaaS with trilingual support"
 ```
-Walk away. AZCLAUDE plans, builds, tests, commits, evolves, and deploys. Come back to a working product with full git history.
+Restarts Claude Code sessions in a loop until `COPILOT_COMPLETE`. Each session: reads state, picks next milestone, implements, tests, commits, evolves. No human input needed.
-### Day-to-day commands (in Claude Code terminal)
+### Day-to-day
 ```bash
-/add [feature]   # add a feature with TDD
+/add [feature]    # add a feature — pre-analyzes scope, follows patterns
 /fix [bug]        # reproduce → investigate → fix → verify
-/audit            # spec-first code review
-/test             # run tests, classify failures
-/evolve           # detect gaps, generate fixes, learn
+/audit            # spec-first code review, read-only
+/test             # framework detection, exit-code gate, failure classification
+/evolve           # scan for gaps, generate fixes, create agents from evidence
 /ship             # tests → secrets scan → commit → push → deploy
-/pulse            # health check — what's the state of things?
+/pulse            # health check — recent changes, current level, next steps
+/debate [topic]   # adversarial decision protocol with evidence scoring
+/blueprint [plan] # read-only analysis → plan.md with milestones
+/snapshot         # save WHY you made decisions — run every 15-20 turns
 ```
-### CLI commands
+---
+## Memory System
+The core insight: **Claude needs to see two things at the start of every session — what changed, and why decisions were made.** Everything else is noise.
-```bash
-npx azclaude-copilot                # core install (26 commands, memory, reflexes)
-npx azclaude-copilot --full         # full install (adds debate, pipeline, ELO)
-npx azclaude-copilot doctor         # 32-check health audit
-npx azclaude-copilot . "intent" 30  # copilot with 30 session limit
-npx azclaude-copilot .              # resume existing copilot run
+### How it works (zero user input)
+```
+Every edit:  PostToolUse hook → breadcrumb appended to goals.md
+             (timestamp, file, diff stats, one-line summary)
+Session end: Stop hook → In-progress migrates to Done
+             Trims to 20 Done entries, archives overflow
+             Resets counters
+Session start: UserPromptSubmit hook → injects before your first message:
+               ┌─ goals.md (capped: 30 in-progress + 20 done)
+               ├─ latest checkpoint (capped at 50 lines)
+               ├─ plan status: X/N done, Y in-progress, Z blocked  [copilot mode]
+               └─ learned reflexes with confidence ≥ 0.8, max 5    [strict profile]
 ```
----
+**Token cost: ~500 tokens fixed.** goals.md auto-rotates at 30 entries — oldest 15 archived, newest 15 kept. Same cost at session 5 or session 500.
-## What You Get
+### Manual layer (you control)
-26 commands, 8 skills, 7 agents, memory, reflexes, evolution. Here's how the layers work:
+```bash
+/snapshot     # save reasoning snapshot — WHY decisions were made
+              # every 15-20 turns on complex work
+              # auto-injected at next session start
+/persist      # end-of-session: update goals.md, write session narrative
+              # run before closing
+/pulse        # read current state — what's healthy, what needs attention
 ```
-+-----------------------------------------------------------+
-|  LAYER 1: THE RUNNER (bin/copilot.js)                     |
-|  Node.js. Stateless. Dumb on purpose.                     |
-|  Restarts Claude Code sessions until COPILOT_COMPLETE.    |
-|  Reads nothing. Decides nothing. Just loops.              |
-+-----------------------------------------------------------+
-|  LAYER 2: THE BRAIN (AZCLAUDE inside each session)        |
-|  Reads goals.md, plan.md, checkpoint, patterns,           |
-|  blockers, decisions, reflexes, context artifacts          |
-|  Decides what to do next. Builds. Tests. Commits.         |
-|  Updates all state files before session ends.             |
-+-----------------------------------------------------------+
-|  LAYER 3: THE ENVIRONMENT (accumulates across sessions)   |
-|  Project agents emerge from git evidence (/evolve)        |
-|  Reflexes learned from tool-use observations              |
-|  Skills created when patterns repeat                      |
-|  Conventions solidify in CLAUDE.md                        |
-|  The environment gets smarter every session.              |
-+-----------------------------------------------------------+
+### Hook profiles
+```bash
+AZCLAUDE_HOOK_PROFILE=minimal  claude   # goals.md tracking only
+AZCLAUDE_HOOK_PROFILE=standard claude   # all features (default)
+AZCLAUDE_HOOK_PROFILE=strict   claude   # all + reflex guidance injection
 ```
-Runner loops. AZCLAUDE accumulates. Claude thinks.
+| Feature | minimal | standard | strict |
+|---------|---------|----------|--------|
+| goals.md tracking + memory rotation | ✓ | ✓ | ✓ |
+| Checkpoint injection | ✓ | ✓ | ✓ |
+| Reflex observations (observations.jsonl) | — | ✓ | ✓ |
+| Cost tracking | — | ✓ | ✓ |
+| Plan status (copilot mode) | — | ✓ | ✓ |
+| Reflex guidance (confidence ≥ 0.8) | — | — | ✓ |
+### State files — the runner is stateless, these files ARE the state
+| File | Written by | Read by | Purpose |
+|------|-----------|---------|---------|
+| `CLAUDE.md` | /setup, /dream | Every session | Conventions, routing, project identity |
+| `memory/goals.md` | Hooks | Every session start | File breadcrumbs + session state |
+| `memory/checkpoints/` | /snapshot | Every session start | WHY decisions were made |
+| `memory/patterns.md` | /evolve, agents | Agents, /add, /fix | What works — follow this |
+| `memory/antipatterns.md` | /evolve, agents | Agents, /add, /fix | What broke — avoid this |
+| `memory/decisions.md` | /debate | All agents | Architecture choices — never re-debate |
+| `memory/blockers.md` | /copilot | /copilot, /debate | What's stuck and why |
+| `memory/reflexes/` | Hooks, /reflexes | /evolve, agents | Learned behavioral patterns |
+| `plan.md` | /blueprint | /copilot, /add | Milestone tracker with status |
+| `copilot-report.md` | /copilot | Human | Final autonomous run summary |
 ---
-## The Pipeline
+## Evolution System
-Every command detects copilot mode automatically (`[ -f .claude/copilot-intent.md ]`) and skips human interaction -- no approval gates, no prompts, no pauses.
+`/evolve` finds gaps in the environment and fixes them. Three cycles:
+**Cycle 1 — Environment Evolution**
+- Detects: stale patterns, friction signals, context rot (poisoning / distraction / confusion / clash)
+- Generates: fixes for each gap
+- Evaluates: quality-gates before merging (syntax, self-applicability, pressure-test resilience)
+**Cycle 2 — Knowledge Consolidation** (every 3+ sessions)
+- Harvests patterns.md and sessions/ by recency + importance
+- Prunes stale entries, consolidates redundant patterns
+- Enriches agent definitions with accumulated learnings
+- Auto-prunes reflexes where confidence < 0.15
+**Cycle 3 — Topology Optimization** (when friction detected)
+- Measures agent influence in pipelines
+- Identifies merge candidates (overlapping agents)
+- Tests changes in isolated worktree before adopting
+**Agent emergence from git evidence:**
 ```
-Session 1:  /dream -> /blueprint -> /add M1 -> /add M2 -> /add M3 -> /snapshot
-Session 2:  /evolve -> /add M4 -> /add M5 -> /add M6 -> /snapshot
-Session 3:  /evolve -> /add M7 -> /add M8 -> /add M9 -> /snapshot
-Session 4:  /evolve -> /audit -> /ship -> COPILOT_COMPLETE
+Session 1: 0 project agents. Build basic structure.
+           Git: 3 commits touching fastapi/, next/, supabase/
+Session 2: /evolve reads git log
+           15 files in fastapi/ → cc-fastapi agent created
+           8 files in next/ with i18n patterns → cc-frontend-i18n agent created
+Session 3: Compliance logic repeating across 6 files → cc-compliance-engine agent
+           3 agents, all from real code — not guessing
+Session 4: Full evolved environment. /audit → /ship → COPILOT_COMPLETE
 ```
-### Per Milestone
+Skills and agents that are project-generic get promoted to `~/shared-skills/` — improvements discovered in one project become available to all your projects.
-1. Read milestone from `plan.md` (description, expected files, dependencies)
-2. Implement using `/add` (follows `patterns.md`, reads context artifacts, uses project agents)
-3. Run tests -- fix if failing (2 attempts max)
-4. If still failing -- log to `blockers.md`, skip, continue
-5. Commit: `{type}: {what} -- {why}`
-6. Push + update `plan.md` status to `done`
-7. `/snapshot` (compaction protection)
+---
-### Self-Healing
+## Intelligence Layer
-When builds fail:
-- Re-read error, check `antipatterns.md`, try alternative approach
-- Record failure to `antipatterns.md` (every failure teaches the environment)
-- Record success to `patterns.md`
-- If stuck, `/debate` finds alternative approach from blocker context
+### 8 Skills (auto-invoked — no slash command needed)
-### Blocker Recovery
+| Skill | Triggers on |
+|-------|------------|
+| `session-guard` | Session start, context reset, idle detection |
+| `test-first` | Writing/fixing code in TDD projects (signal-based — only if project has tests) |
+| `env-scanner` | Project setup, stack detection |
+| `security` | Credentials, auth, payments, .env files, secrets, before /ship |
+| `debate` | Decisions, trade-offs, "which is better", architecture comparisons |
+| `skill-creator` | "Create a skill", repeated workflows, new capability |
+| `agent-creator` | "Create an agent", agent boundaries, 5-layer structure |
+| `architecture-advisor` | Architecture decisions, DB choice, rendering strategy, testing approach — by project scale |
-After all non-blocked milestones complete:
-- Retry blocked milestones with full project context now available
-- Often unblocked by later work
-- If still stuck, `/debate` evaluates; if no solution, mark `skipped`
+### Architecture Advisor — 8 Evidence-Based Decision Matrices
----
+Not "which is popular" — which is right for **your project's scale**:
-## What Makes It Different
+| Decision | SMALL (< 50 files) | MEDIUM (50-500 files) | LARGE (500+ files) |
+|----------|-------------------|----------------------|-------------------|
+| Architecture | Flat modules | Modular monolith | Monolith + targeted microservices |
+| Database | SQLite | PostgreSQL | PostgreSQL + Redis + search |
+| Testing | Test-after critical paths | TDD for business logic | Full TDD |
+| API | tRPC (internal) | REST | REST + GraphQL (mobile) |
+| Auth | Clerk / Supabase | Auth0 | Keycloak (self-hosted) |
+| State | useState | TanStack Query | Zustand + XState |
+| Rendering | SSG or SPA | SSR / ISR | ISR + edge caching |
+| Deploy | Vercel / Railway | Managed containers | AWS/GCP with IaC |
-| Feature | Claude Code | Ralph Loop | Lovable | Cursor | AZCLAUDE |
-|---------|------------|------------|---------|--------|-----------------|
-| Autonomous loop | -- | Yes | -- | -- | Yes |
-| Memory across sessions | -- | Git only | -- | -- | Goals + checkpoints + patterns + reflexes |
-| Self-evolving agents | -- | -- | -- | -- | Yes (from git evidence) |
-| Learned reflexes | -- | -- | -- | -- | Yes (confidence-scored) |
-| Convention enforcement | -- | -- | -- | -- | Yes (CLAUDE.md + patterns.md) |
-| Architecture advisor | -- | -- | -- | -- | Yes (8 decision matrices) |
-| Domain advisor generation | -- | -- | -- | -- | Yes (7 domains) |
-| Context artifact discovery | -- | -- | -- | -- | Yes (schemas, specs, configs) |
-| Any stack | Yes | Yes | Next.js only | Yes | Yes |
-| You own the code | Yes | Yes | -- | Yes | Yes |
-| Zero dependencies | Yes | Yes | -- | -- | Yes (0 in package.json) |
-| Deploy included | -- | -- | Yes | -- | Yes |
+Every recommendation includes the threshold where it changes and the anti-pattern to avoid at that scale.
----
+### Domain Advisor Generator — 7 Non-Tech Domains
-## Evidence-Based Intelligence
+When `/dream` or `/setup` detects a non-developer domain, a domain-specific advisor skill is generated automatically — with decision matrices, thresholds, and anti-patterns:
-### Reflexes -- Learned Behavioral Patterns
+| Domain | What gets generated |
+|--------|-------------------|
+| Compliance | Regulation mapping, evidence strategy, article-level traceability, audit trail |
+| Finance | Event-sourced data model, integer-cents precision, reconciliation, risk model |
+| Medical | FHIR vs HL7, HIPAA vs GDPR privacy model, clinical workflow, terminology |
+| Marketing | Channel strategy, funnel design, pricing model, metric focus by revenue stage |
+| Research | Literature scope, methodology, experiment design, statistical rigor |
+| Legal | Contract structure, clause tracking, jurisdiction, risk classification |
+| Logistics | Routing, inventory model, tracking granularity |
-AZCLAUDE observes tool-use patterns across sessions and extracts atomic behaviors called reflexes. Each reflex is confidence-scored, domain-tagged, and evidence-backed.
+### Reflexes — Learned Behavioral Patterns
+Every tool use is observed. Patterns that repeat become reflexes:
 ```yaml
 id: grep-before-edit
 trigger: "when modifying code files"
 action: "Search with Grep first, confirm with Read, then Edit"
-confidence: 0.7       # 0.3 tentative -> 0.9 certain
+confidence: 0.7       # 0.3 tentative → 0.9 near-certain
 evidence_count: 8
+domain: workflow
 ```
-- PostToolUse hook captures observations to `observations.jsonl` automatically
-- 3+ occurrences of a pattern creates a reflex
-- Confidence decays at -0.02/week without observation (stale patterns auto-prune)
-- Strong reflex clusters evolve into skills or agents via `/evolve`
-- Global scope promotion when seen in 2+ projects with confidence >= 0.8
+- `PostToolUse` hook captures observations to `reflexes/observations.jsonl` automatically
+- 3+ occurrences creates a reflex at confidence 0.3
+- Confidence rises with confirming observations, decays -0.02/week without use
+- Strong clusters (3+ reflexes, avg confidence > 0.7) evolve into skills or agents
+- Global promotion when seen in 2+ projects at confidence ≥ 0.8
-### Architecture Advisor -- 8 Decision Matrices
+### Context Artifacts — Non-Code Project Knowledge
-Auto-fires on architecture decisions. Claude knows every framework; this skill guides **when to use which** based on project scale (SMALL/MEDIUM/LARGE):
+Before implementing, AZCLAUDE discovers and reads non-code knowledge that informs implementation:
-| Decision area | Example guidance |
-|--------------|-----------------|
-| Architecture | SMALL: flat modules. MEDIUM: modular monolith. LARGE: monolith + targeted microservices |
-| Database | SMALL: SQLite. MEDIUM+: PostgreSQL. Cache: Redis. Search: Postgres FTS first |
-| Rendering | Marketing: SSG. Dashboards: SSR. Admin: SPA. Products: ISR |
-| Testing | MVP: test-after critical paths. MEDIUM: TDD for business logic. LARGE: full TDD |
-| API design | Internal: tRPC. Public: REST. Mobile: GraphQL. Real-time: WebSocket/SSE |
-| State mgmt | Simple: useState. Server data: TanStack Query. Complex: Zustand. Workflows: XState |
-| Deployment | MVP: Vercel/Railway. Scale: AWS/GCP with IaC |
-| Auth | Small: Clerk/Supabase. Large: Auth0/Keycloak |
+| Type | Examples | Why it matters |
+|------|---------|---------------|
+| Database schemas | `prisma/schema.prisma`, `schema.sql` | Know table structure before writing queries |
+| API specs | `openapi.yaml`, `swagger.json`, `.proto` | Know endpoints before building integrations |
+| Infra configs | `terraform/`, `docker-compose.yml` | Know deployment constraints before architecture decisions |
+| Architecture docs | `docs/architecture.md`, ADRs | Know design decisions before proposing changes |
+| Domain knowledge | `knowledge/`, business rules, regulations | Know domain constraints before implementing logic |
-Every recommendation includes the **threshold where it changes** and the **anti-pattern** to avoid.
+---
-### Domain Advisor Generator -- 7 Non-Tech Domains
+## Autonomous Mode
-When `/dream` detects a non-developer domain, it auto-generates a domain-specific advisor skill with decision matrices, best practices, and anti-patterns:
+### `/copilot` — describe a product, come back to working code
-| Domain | Generated decisions |
-|--------|-------------------|
-| Compliance | Regulation mapping, evidence strategy, assessment approach, documentation depth |
-| Marketing | Channel strategy, funnel design, pricing model, KPI focus by revenue stage |
-| Finance | Data model (event-sourced), calculation precision (integer-cents), reconciliation |
-| Medical | Data standard (FHIR vs HL7), privacy model (HIPAA vs GDPR), terminology |
-| Research | Literature scope, methodology, experiment design, statistical rigor |
-| Legal | Contract structure, clause tracking, jurisdiction, risk classification |
-| Logistics | Routing, inventory model, tracking granularity |
+```bash
+npx azclaude-copilot . "Build a compliance SaaS with trilingual support"
+# or resume existing run:
+npx azclaude-copilot .
+```
-### Agent Emergence
+Node.js runner restarts Claude Code sessions in a loop until `COPILOT_COMPLETE`.
-Copilot starts with **zero project agents**. They emerge from the work.
+**Three-tier intelligent team (v0.4+):**
 ```
-Session 1 (milestones 1-3):
-  0 project agents. Build basic structure.
-  Git: 3 commits touching fastapi/, next/, supabase/
-Session 2 (milestones 4-6):
-  /evolve reads git log
-  15 files in fastapi/ -> creates cc-fastapi agent
-  8 files in next/ with i18n patterns -> creates cc-frontend-i18n agent
-  2 agents, both from real code patterns
-Session 3 (milestones 7-9):
-  Compliance logic repeating across 6 files -> creates cc-compliance-engine agent
-  3 project agents, all from evidence
-  Environment specialized for THIS project
-Session 4:
-  Full evolved environment. /audit -> /ship -> deploy. COPILOT_COMPLETE
+Orchestrator          Problem-Architect          Milestone-Builder
+─────────────         ─────────────────          ─────────────────
+Reads plan.md    →    Analyzes milestone    →     Pre-reads all files
+Selects wave          Returns Team Spec:          Implements
+Dispatches            • agents needed             Runs tests
+Monitors              • skills to load            Self-corrects (budget)
+Triggers /evolve      • files to pre-read         Commits + reports back
+Never writes code     • Files Written (parallel safety)
+                      • pre-conditions, risks
+                      • complexity (SIMPLE/MEDIUM/COMPLEX)
+                      Never implements
 ```
-System agents (code-reviewer, test-writer, orchestrator-init) run the framework. Project agents emerge from the work. Two separate layers.
-### Context Artifacts
+**Copilot pipeline:**
+```
+Session 1:  /dream → /blueprint (architect annotates milestones) → M1, M2, M3 → /snapshot
+Session 2:  /evolve (new agents unblock plan) → M4+M5 parallel → M6 → /snapshot
+Session 3:  /evolve → M7, M8, M9 → /snapshot
+Session 4:  /evolve → /audit → /ship → COPILOT_COMPLETE
+```
-Before implementing any feature, AZCLAUDE scans for non-code knowledge that informs implementation:
+**Every 3 milestones:** `/reflexes analyze` + `/evolve` + orchestrator re-evaluates blocked milestones.
-| Type | Examples | Why it matters |
-|------|---------|---------------|
-| Database schemas | schema.sql, prisma/schema.prisma | Know table structure before writing queries |
-| API specs | openapi.yaml, swagger.json, .proto files | Know endpoints before building integrations |
-| Infra configs | terraform/, docker-compose.yml | Know deployment constraints before architecture decisions |
-| Architecture docs | docs/architecture.md, ADRs | Know design decisions before proposing changes |
-| Domain knowledge | knowledge/, business rules | Know domain constraints before implementing logic |
+**Exit conditions:**
-Artifact discovery runs automatically in copilot mode. `/evolve` checks for stale references.
+| Condition | Exit code |
+|-----------|-----------|
+| `COPILOT_COMPLETE` in goals.md | 0 — product shipped |
+| Max sessions reached (default: 20) | 1 — resume with `npx azclaude-copilot .` |
+| All milestones blocked | 1 — needs human intervention |
 ---
@@ -290,220 +376,115 @@ Artifact discovery runs automatically in copilot mode. `/evolve` checks for stal
 | Command | What it does |
 |---------|-------------|
-| `/copilot` | Autonomous milestone execution. Plan, build, test, commit, evolve, ship. Zero human input. |
-| `/dream` | Idea to full project scaffold. Rules, memory, skills, agents -- built level by level. |
+| `/copilot` | Autonomous milestone execution. Delegates to orchestrator team. Zero human input. |
+| `/dream` | Idea → full project scaffold. CLAUDE.md, memory, skills, agents — built level by level. |
 | `/setup` | Analyze existing project. Detect domain + stack + scale. Build environment. |
-| `/add` | Add a feature. In copilot mode: uses milestone spec directly. |
-| `/fix` | REPRODUCE, INVESTIGATE, HYPOTHESIZE, FIX -- show passing tests. |
-| `/audit` | Spec-first review (read-only). In copilot mode: reviews against copilot-intent.md. |
+| `/add` | Add a feature. Pre-analyzes scope via intelligent-dispatch before touching code. |
+| `/fix` | REPRODUCE → INVESTIGATE → HYPOTHESIZE → FIX. Show passing tests. Never guesses. |
+| `/audit` | Spec-first code review (read-only). Injects decisions.md + patterns.md as checklist. |
 | `/test` | IDE diagnostics, framework detection, exit-code gate, failure classification. |
-| `/blueprint` | Read-only analysis. Structured plan.md with milestones. In copilot mode: skips approval. |
-| `/ship` | Tests, secrets scan, commit, push. In copilot mode: auto-deploys. |
-| `/refactor` | Restructure safely. Tests before + after. Worktree isolation for risky changes. |
+| `/blueprint` | Read-only analysis → structured plan.md. Architect annotates each milestone in copilot mode. |
+| `/ship` | Risk scan → tests → secrets scan → commit → push. Auto-deploys in copilot mode. |
+| `/refactor` | Safe restructuring. Tests before + after. Worktree isolation for high-risk changes. |
 | `/doc` | Generate docs from code. Matches existing style. |
-| `/migrate` | Upgrade deps/frameworks. Researches breaking changes. |
+| `/migrate` | Upgrade deps/frameworks. Researches breaking changes. Worktree for major versions. |
 | `/deps` | Audit: outdated, vulnerable, unused packages. |
 ### Think and Improve
 | Command | What it does |
 |---------|-------------|
-| `/debate` | Adversarial debate with evidence scoring (AceMAD protocol). Order-independent, length-independent. |
-| `/evolve` | Scan for gaps, generate fixes, quality-gate them. Create agents from evidence. 3 cycles. |
-| `/reflexes` | View/analyze learned behavioral patterns. Confidence scoring. Promote to global scope. |
-| `/level-up` | Show current level (0-10), build the next one. |
-| `/find` | Search across commands, ~/shared-skills/, capabilities. |
-| `/create` | Build a new command with frontmatter and tests. |
-| `/reflect` | Self-improve CLAUDE.md from conversation friction. |
-| `/hookify` | Generate hooks from friction patterns. 5 hook types. |
+| `/debate` | Adversarial debate with evidence scoring (AceMAD). Order-independent, length-independent. |
+| `/evolve` | Detect gaps → generate fixes → quality-gate → create agents from evidence. 3 cycles. |
+| `/reflexes` | View, analyze, promote learned behavioral patterns. Confidence scoring. |
+| `/level-up` | Show current level (0-10), build the next one progressively. |
+| `/find` | Search across commands, `~/shared-skills/`, capabilities manifest. |
+| `/create` | Build a new command with frontmatter, trigger variants, and tests. |
+| `/reflect` | Self-improve CLAUDE.md from conversation friction and session history. |
+| `/hookify` | Generate hooks from friction patterns. 5 hook types (block / warn / remind / inject / track). |
 ### Memory and Session
 | Command | What it does |
 |---------|-------------|
-| `/snapshot` | Mid-session snapshot: WHY + decisions + what's next. |
-| `/persist` | End-of-session: goals, friction log, session summary. |
-| `/pulse` | Health check + recent changes + current level. |
-| `/explain` | Code or error to plain language. |
+| `/snapshot` | Mid-session: WHY + decisions + what's next. Auto-injected at next session start. |
+| `/persist` | End-of-session: update goals.md, write session narrative to `sessions/`. |
+| `/pulse` | Health check — recent changes, current level, reflexes, blockers, next steps. |
+| `/explain` | Code or error to plain language. 2-3 paragraphs max. |
 | `/loop` | Repeat any command on an interval via CronCreate. |
 ---
-## 8 Skills (Auto-Invoked)
+## 10 Agents
-Skills fire automatically based on context -- no slash command needed.
+**Framework agents** (ship with AZCLAUDE, always available):
-| Skill | Triggers on |
-|-------|------------|
-| session-guard | Session start, context reset, idle detection |
-| test-first | Writing/fixing code in TDD projects |
-| env-scanner | Project setup, stack detection |
-| debate | Decisions, trade-offs, comparisons |
-| security | Credentials, auth, payments, secrets |
-| skill-creator | "Create a skill", repeated workflows |
-| agent-creator | "Create an agent", agent boundaries |
-| architecture-advisor | Architecture decisions, which pattern/DB/framework for this project size |
+| Agent | Role |
+|-------|------|
+| `orchestrator` | Tech lead for `/copilot`. Owns plan.md. Dispatches, monitors, triggers /evolve. Never writes code. |
+| `problem-architect` | Pre-flight analyst. Returns Team Spec (agents/skills/files/risks/complexity) before every dispatch. Never implements. |
+| `milestone-builder` | Base builder. Pre-reads all files, implements, verifies, self-corrects (fix budget), commits, reports. |
+| `orchestrator-init` | Runs once during `/setup`. Scans project, fills CLAUDE.md, creates goals.md. Exits permanently. |
+| `loop-controller` | Level 10 autonomous agent. 3 cycles: evolution, knowledge consolidation, topology optimization. |
+| `code-reviewer` | Spec-first review. Stage 1: spec compliance. Stage 2: quality. Read-only. Never modifies files. |
+| `test-writer` | Reads existing test patterns. Matches framework, style, naming. Writes and runs tests. |
+| `cc-template-author` | Writes AZCLAUDE template files with proper structure. |
+| `cc-cli-integrator` | Integrates new features into `bin/cli.js`. |
+| `cc-test-maintainer` | Maintains `tests/test-features.sh` with correct grep patterns. |
-Each skill has: `SKILL.md` (lean workflow), `references/` (deep content), `scripts/` (deterministic detection).
+**Project agents** (emerge from your git history via `/evolve`):
+- Named `cc-{area}`, scoped to specific directories
+- Created when 3+ files in the same area change together across 2+ commits
+- Every agent has exactly 5 layers: persona, scope, tools, constraints, domain knowledge
 ---
-## Memory System
-Three layers work silently. Context compaction stops being a problem.
-```
-+--------------------------------------------------------------+
-|                     AUTOMATIC LAYER                          |
-|               (zero user input required)                     |
-|                                                              |
-|   PostToolUse hook --> goals.md --> UserPromptSubmit          |
-|   (fires on every edit) (rolling ledger) (injects before     |
-|                                           your message)      |
-|                                                              |
-|   Stop hook --> migrates "In progress" to "Done"             |
-+--------------------------------------------------------------+
-|                      MANUAL LAYER                            |
-|               (user triggers when ready)                     |
-|                                                              |
-|   /snapshot --> checkpoints/{timestamp}.md                   |
-|   (WHY you made decisions -- every 15-20 turns)              |
-|                                                              |
-|   /persist --> sessions/{date}-{topic}.md                    |
-|   (full session narrative -- before closing)                 |
-+--------------------------------------------------------------+
-```
-| Layer | Mechanism | Survives compaction | Automatic |
-|-------|-----------|-------------------|-----------|
-| File breadcrumb | PostToolUse -> goals.md | Yes -- WHERE + WHAT changed | Yes, every edit |
-| Reasoning snapshot | /snapshot -> checkpoints/ | Yes -- WHY decisions were made | Manual, every 15-20 turns |
-| Session narrative | /persist -> sessions/ | Yes -- full summary + next actions | Manual, before closing |
-`UserPromptSubmit` hook injects before your first message every session:
-| What | When | Profile |
-|------|------|---------|
-| `goals.md` (capped at 20 done + 30 in-progress) | Every session | all |
-| Latest checkpoint (capped at 50 lines) | Every session | all |
-| Plan status: `X/N done, Y in-progress, Z blocked` | Copilot mode only | standard + strict |
-| Learned reflexes (confidence ≥ 0.8, max 5) | Always | strict only |
+## What Makes It Different
-Token cost: ~500 tokens fixed. goals.md is bounded at 80 lines by auto-rotation — same cost at session 5 or session 500.
+| Feature | Claude Code alone | AZCLAUDE |
+|---------|------------------|---------|
+| Project memory | Starts fresh every session | goals.md + checkpoints injected automatically |
+| Conventions | Ad-hoc, re-explained each time | CLAUDE.md — loaded before every task |
+| Learned behavior | None | Reflexes extracted from tool-use, confidence-scored |
+| Architecture decisions | Re-debated every time | decisions.md — logged once, referenced forever |
+| Failed approaches | Repeated | antipatterns.md — agents read before implementing |
+| Domain knowledge | Generic | Domain advisors generated for compliance, finance, medical, legal... |
+| Agent specialization | None | Project agents emerge from git evidence, not guessing |
+| Autonomous building | Not possible | /copilot — three-tier intelligent team |
+| Self-improvement | Not possible | /evolve — 3-cycle environment evolution |
+| Any stack | Yes | Yes |
+| You own the code | Yes | Yes |
+| Zero dependencies | — | Yes (0 in package.json) |
 ---
 ## Security
-Zero dependencies in `package.json`. The only external binary is `claude` CLI (installed separately). This eliminates supply-chain risk entirely.
-### 6 Security Layers
-1. **Hook integrity** -- SHA-256 hash verified on every run
-2. **Command injection protection** -- shell metacharacters rejected in file paths
-3. **Prompt injection defense** -- suspicious patterns stripped from context injection (`curl|bash`, `ignore previous instructions`, base64 blocks)
-4. **Skill checksums** -- portable skills SHA-256 hashed, imports fail if tampered
-5. **Credential auditing** -- `/ship` blocks on `.env`, API keys, tokens before any git push
-6. **Agent scoping** -- review agents read-only (`EnterPlanMode`), experiments in isolated worktrees
+Zero dependencies in `package.json`. The only external binary is `claude` (installed separately). No supply-chain risk.
-### Hook Profiles
+**6 layers:**
+1. **Hook integrity** — SHA-256 hash verified on every run
+2. **Command injection protection** — shell metacharacters rejected in file paths
+3. **Prompt injection defense** — strips `curl|bash`, `ignore previous instructions`, base64 blocks from context injection
+4. **Skill checksums** — portable skills SHA-256 hashed, imports fail if tampered
+5. **Credential auditing** — `/ship` blocks on `.env`, `AKIA*`, `sk-*`, `ghp_*` before any git push
+6. **Agent scoping** — review agents read-only (`EnterPlanMode`), experiments in isolated worktrees (`EnterWorktree`)
-Control hook behavior via environment variable:
-```bash
-AZCLAUDE_HOOK_PROFILE=minimal  claude   # goals.md tracking only
-AZCLAUDE_HOOK_PROFILE=standard claude   # all features (default)
-AZCLAUDE_HOOK_PROFILE=strict   claude   # all + reflex guidance injection
-```
-| Feature | minimal | standard | strict |
-|---------|---------|----------|--------|
-| goals.md tracking | ✓ | ✓ | ✓ |
-| Checkpoint injection | ✓ | ✓ | ✓ |
-| Reflex observations | — | ✓ | ✓ |
-| Cost tracking | — | ✓ | ✓ |
-| Plan status (copilot) | — | ✓ | ✓ |
-| Reflex guidance (≥0.8) | — | — | ✓ |
-| Memory rotation | ✓ | ✓ | ✓ |
-### Doctor Audit
-```bash
-npx azclaude-copilot doctor          # 32 checks: hooks, settings, commands, memory
-npx azclaude-copilot doctor --audit  # efficiency + security score
-```
-See [SECURITY.md](SECURITY.md) for full details including known limitations and copilot-mode mitigations.
----
-## Exit Conditions
-| Condition | Exit code |
-|-----------|-----------|
-| `COPILOT_COMPLETE` in goals.md | 0 -- product shipped |
-| Max sessions reached (default: 20) | 1 -- resume with `npx azclaude-copilot .` |
-| All milestones blocked | 1 -- needs human intervention |
----
-## Project Structure
-```
-azclaude-copilot/
-├── bin/
-│   ├── cli.js                       <- installer, doctor, demo
-│   └── copilot.js                   <- autonomous runner (Node.js, cross-platform)
-├── templates/
-│   ├── CLAUDE.md                    <- dispatch table template
-│   ├── hooks/                       <- pure Node.js, cross-platform
-│   │   ├── user-prompt.js           <- injects goals.md + checkpoint at session start
-│   │   ├── post-tool-use.js         <- writes file + diff stat on every edit
-│   │   └── stop.js                  <- migrates In progress -> Done
-│   ├── agents/              (7)     <- system + project agents
-│   ├── capabilities/                <- 36 files, lazy-loaded via manifest.md
-│   ├── commands/            (26)    <- all 26 commands including /copilot, /reflexes
-│   ├── skills/              (8)     <- auto-invoked SKILL.md files + architecture-advisor
-│   └── scripts/env-scan.sh
-├── ROADMAP.md                       <- 5-phase build spec
-├── DOCS.md                          <- full user guide
-├── SECURITY.md                      <- security policy + architecture
-├── tests/
-│   └── test-features.sh          ← 1135 tests
-```
----
-## State Files
-The runner is stateless. These files ARE the state.
-| File | Written by | Read by | Purpose |
-|------|-----------|---------|---------|
-| `.claude/copilot-intent.md` | Runner | /dream, /copilot | Original product description |
-| `.claude/plan.md` | /blueprint | /copilot, /add | Milestone tracker with status |
-| `.claude/memory/goals.md` | Hooks | Every session start | File breadcrumbs + session state |
-| `.claude/memory/checkpoints/*` | /snapshot | Every session start | Reasoning snapshots |
-| `.claude/memory/patterns.md` | /evolve, agents | Agents, /add | What works |
-| `.claude/memory/antipatterns.md` | /evolve, agents | Agents, /add | What broke |
-| `.claude/memory/decisions.md` | /debate | Agents | Architecture choices |
-| `.claude/memory/blockers.md` | /copilot | /copilot, /debate | What's stuck and why |
-| `.claude/memory/reflexes/` | /reflexes, hooks | /evolve, agents | Learned behavioral patterns |
-| `.claude/copilot-report.md` | /copilot | Human | Final summary |
+See [SECURITY.md](SECURITY.md) for full details.
 ---
 ## Verified
-1135 tests. Every template, command, capability, agent, and CLI feature verified.
+1151 tests. Every template, command, capability, agent, hook, and CLI feature verified.
 ```bash
 bash tests/test-features.sh
-# Results: 1135 passed, 0 failed, 1135 total
+# Results: 1151 passed, 0 failed, 1151 total
 ```
 ---
 ## License
-MIT -- [haytamAroui](https://github.com/haytamAroui)
+MIT — [haytamAroui](https://github.com/haytamAroui)

package/bin/cli.js CHANGED Viewed

@@ -118,7 +118,7 @@ function substitutePaths(content, cfg) {
 // ─── Hook Scripts ─────────────────────────────────────────────────────────────
-const HOOK_SCRIPTS = ['user-prompt.js', 'stop.js', 'post-tool-use.js'];
+const HOOK_SCRIPTS = ['user-prompt.js', 'stop.js', 'post-tool-use.js', 'pre-tool-use.js'];
 function copyHookScripts(dstDir) {
   fs.mkdirSync(dstDir, { recursive: true });
@@ -139,9 +139,11 @@ function buildHookEntries(scriptsDir) {
   const userPromptScript  = path.join(scriptsDir, 'user-prompt.js');
   const stopScript        = path.join(scriptsDir, 'stop.js');
   const postToolUseScript = path.join(scriptsDir, 'post-tool-use.js');
+  const preToolUseScript  = path.join(scriptsDir, 'pre-tool-use.js');
   return {
-    UserPromptSubmit: [{ matcher: '',           hooks: [{ type: 'command', command: `"${nodeExe}" "${userPromptScript}"` }] }],
-    Stop:             [{ matcher: '',           hooks: [{ type: 'command', command: `"${nodeExe}" "${stopScript}"` }]       }],
+    UserPromptSubmit: [{ matcher: '',                           hooks: [{ type: 'command', command: `"${nodeExe}" "${userPromptScript}"` }]  }],
+    Stop:             [{ matcher: '',                           hooks: [{ type: 'command', command: `"${nodeExe}" "${stopScript}"` }]         }],
+    PreToolUse:       [{ matcher: 'Write|Edit|MultiEdit',      hooks: [{ type: 'command', command: `"${nodeExe}" "${preToolUseScript}"` }]  }],
     PostToolUse:      [{ matcher: 'Write|Edit|Read|Bash|Grep', hooks: [{ type: 'command', command: `"${nodeExe}" "${postToolUseScript}"` }] }],
   };
 }

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "azclaude-copilot",
-  "version": "0.4.2",
+  "version": "0.4.4",
   "description": "AI coding environment — 26 commands, 8 skills, 10 agents, memory, reflexes, evolution. Install once, works on any stack.",
   "bin": {
     "azclaude": "./bin/cli.js",

package/templates/capabilities/shared/security.md CHANGED Viewed

@@ -155,6 +155,7 @@ Cheaper than catching them at `/ship` or `/audit` time.
 | Pattern | Risk | Action |
 |---|---|---|
+| `${{ github.event.` in `run:` steps | GitHub Actions workflow injection | Warn — use `${{ github.event.pull_request.title }}` only in env:, never directly in run: |
 | `eval(` / `new Function(` | Code injection | Warn — suggest alternative |
 | `os.system(` / `subprocess.call(` with `shell=True` | Command injection | Warn — suggest subprocess.run with list args |
 | `child_process.exec(` | Command injection | Warn — suggest execFile or spawn |
@@ -166,9 +167,10 @@ Cheaper than catching them at `/ship` or `/audit` time.
 **Implementation:** Add a PreToolUse hook with matcher `Edit|Write|MultiEdit`:
 ```bash
 # In the hook script, scan the new content for patterns:
-echo "$CLAUDE_TOOL_INPUT" | grep -qE 'eval\(|os\.system\(|pickle\.load' && \
+echo "$CLAUDE_TOOL_INPUT" | grep -qE 'eval\(|os\.system\(|pickle\.load|\$\{\{.*github\.event\.' && \
   echo "⚠ Security: potentially unsafe pattern detected. Review before proceeding."
 ```
 **Rule:** Warn, don't block (except hardcoded secrets). The developer may have a
 valid reason. But make the pattern visible so it gets reviewed.
+**GitHub Actions injection** is an exception — always treat `${{ github.event.* }}` in `run:` steps as HIGH risk because it allows attackers to inject arbitrary shell commands via PR titles, issue bodies, or comment text.

package/templates/hooks/pre-tool-use.js ADDED Viewed

@@ -0,0 +1,163 @@
+#!/usr/bin/env node
+'use strict';
+/**
+ * AZCLAUDE — PreToolUse security hook
+ * Fires BEFORE Edit, Write, MultiEdit operations.
+ * Scans content for security patterns: injection, XSS, deserialization, secrets.
+ * Warnings → stderr (exit 0, Claude continues).
+ * Hardcoded secrets → exit 2 (Claude Code blocks the write).
+ * Silent for all other tools, node_modules, .git, .md files.
+ * No dependencies. Pure synchronous fs. Cross-platform (Windows/macOS/Linux).
+ */
+const fs   = require('fs');
+const path = require('path');
+const os   = require('os');
+// ── Parse stdin ──────────────────────────────────────────────────────────────
+let toolName = '';
+let filePath = '';
+let content  = '';
+try {
+  const raw  = fs.readFileSync(0, 'utf8'); // fd 0 = stdin
+  const data = JSON.parse(raw);
+  toolName   = data.tool_name || '';
+  filePath   = data.tool_input?.file_path || data.tool_input?.path || '';
+  // Edit uses new_string; Write/MultiEdit use content
+  content    = data.tool_input?.new_string || data.tool_input?.content || '';
+  // MultiEdit: scan all edits
+  if (!content && Array.isArray(data.tool_input?.edits)) {
+    content = data.tool_input.edits.map(e => e.new_string || '').join('\n');
+  }
+} catch (_) {
+  process.exit(0); // malformed JSON — stay out of the way
+}
+// ── Gate: only act on write-type tools ──────────────────────────────────────
+const WRITE_TOOLS = new Set(['Edit', 'Write', 'MultiEdit']);
+if (!WRITE_TOOLS.has(toolName)) process.exit(0);
+// ── Gate: skip noisy paths ───────────────────────────────────────────────────
+if (filePath) {
+  const rel = path.relative(process.cwd(), path.resolve(filePath));
+  if (/node_modules[\\/]/.test(rel)) process.exit(0);
+  if (/\.git[\\/]/.test(rel))        process.exit(0);
+  if (/\.md$/i.test(filePath))        process.exit(0);
+}
+// ── Gate: nothing to scan ────────────────────────────────────────────────────
+if (!content) process.exit(0);
+// ── Security rules ───────────────────────────────────────────────────────────
+// Each rule: { id, test, message, block }
+// block:true → exit 2 (Claude Code refuses the write).
+// block:false → exit 0 (warn on stderr, allow).
+const RULES = [
+  {
+    id:      'gh-actions-injection',
+    test:    /\$\{\{\s*github\.event\./,
+    message: 'GitHub Actions expression in run: context — injection risk. Validate event data before use.',
+    block:   false,
+  },
+  {
+    id:      'child-process-exec',
+    test:    /child_process\.exec\s*\(/,
+    message: 'child_process.exec() detected — command injection risk. Prefer child_process.execFile() or spawnSync() with argument arrays.',
+    block:   false,
+  },
+  {
+    id:      'new-function',
+    test:    /new\s+Function\s*\(/,
+    message: 'new Function() detected — dynamic code execution risk. Avoid constructing functions from strings.',
+    block:   false,
+  },
+  {
+    id:      'eval',
+    test:    /\beval\s*\(/,
+    message: 'eval() detected — code injection risk. Use safer alternatives (JSON.parse, Function constructors avoided).',
+    block:   false,
+  },
+  {
+    id:      'dangerously-set-inner-html',
+    test:    /dangerouslySetInnerHTML/,
+    message: 'dangerouslySetInnerHTML detected — XSS risk. Sanitize HTML with DOMPurify or avoid entirely.',
+    block:   false,
+  },
+  {
+    id:      'dom-xss',
+    test:    /document\.write\s*\(|\.innerHTML\s*=/,
+    message: 'document.write() or .innerHTML = detected — DOM XSS risk. Use textContent or a sanitization library.',
+    block:   false,
+  },
+  {
+    id:      'pickle-deserialization',
+    test:    /pickle\.loads?\s*\(/,
+    message: 'pickle.load()/pickle.loads() detected — deserialization risk. Never unpickle untrusted data.',
+    block:   false,
+  },
+  {
+    id:      'hardcoded-secret',
+    test:    /AKIA[A-Z0-9]{16}|sk-[a-z0-9]{20,}|ghp_[A-Za-z0-9]{36}/,
+    message: 'Hardcoded secret pattern detected',
+    block:   true,
+  },
+];
+// ── Session dedup ─────────────────────────────────────────────────────────────
+// Store warned file+rule combos in a temp JSON file keyed by session PID.
+// Clean up temp files older than 24 h at startup.
+const SESSION_ID  = process.ppid || process.pid;
+const DEDUP_PATH  = path.join(os.tmpdir(), `.azclaude-sec-${SESSION_ID}`);
+const MAX_AGE_MS  = 24 * 60 * 60 * 1000;
+// Cleanup stale dedup files (best-effort, never fatal)
+try {
+  const tmpFiles = fs.readdirSync(os.tmpdir());
+  for (const f of tmpFiles) {
+    if (!f.startsWith('.azclaude-sec-')) continue;
+    const fp  = path.join(os.tmpdir(), f);
+    const age = Date.now() - fs.statSync(fp).mtimeMs;
+    if (age > MAX_AGE_MS) { try { fs.unlinkSync(fp); } catch (_) {} }
+  }
+} catch (_) {}
+let dedup = {};
+try { dedup = JSON.parse(fs.readFileSync(DEDUP_PATH, 'utf8')); } catch (_) {}
+function saveDedup() {
+  try { fs.writeFileSync(DEDUP_PATH, JSON.stringify(dedup)); } catch (_) {}
+}
+// ── Scan ─────────────────────────────────────────────────────────────────────
+const displayName = filePath
+  ? path.relative(process.cwd(), path.resolve(filePath)) || filePath
+  : '(inline content)';
+let didBlock = false;
+for (const rule of RULES) {
+  if (!rule.test.test(content)) continue;
+  const dedupKey = `${displayName}:${rule.id}`;
+  if (rule.block) {
+    // Always emit the block message — secrets must never be silently swallowed
+    process.stderr.write(
+      `\n✗ SECURITY BLOCK: ${rule.message} in ${displayName}.\n` +
+      `  Use environment variables instead: process.env.MY_SECRET\n` +
+      `  Refusing to write. Fix before proceeding.\n\n`
+    );
+    didBlock = true;
+    continue; // check remaining rules before exiting
+  }
+  // Warning — deduplicated per session
+  if (dedup[dedupKey]) continue;
+  dedup[dedupKey] = true;
+  saveDedup();
+  process.stderr.write(
+    `\n⚠ SECURITY: ${rule.message.split(' — ')[0]} in ${displayName} — ${rule.message.includes(' — ') ? rule.message.split(' — ')[1] : rule.message}\n`
+  );
+}
+process.exit(didBlock ? 2 : 0);