azclaude-copilot 0.4.20 → 0.4.22

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -8,8 +8,8 @@
    "plugins": [
      {
        "name": "azclaude",
-       "description": "AZCLAUDE is a complete AI coding environment for Claude Code. It installs 33 commands, 8 auto-invoked skills, 13 specialized agents, 4 hooks, and a persistent memory system — in one command.\n\nKey features:\n• Memory across sessions — goals.md + checkpoints injected automatically before every session\n• Self-improving loop — /reflect fixes stale CLAUDE.md rules, /reflexes learns from tool-use patterns, /evolve creates agents from git evidence\n• Autonomous copilot mode — /copilot runs a three-tier team (orchestrator → problem-architect → milestone-builder) across sessions until the product ships\n• Spec-driven workflow — /constitute writes project rules, /spec writes structured ACs, /analyze detects plan drift and ghost milestones, /blueprint traces every milestone to a spec\n• Security layer — 102-rule environment scan (/sentinel), pre-write secret blocking, pre-ship credential audit\n• Progressive levels 0–10 — start with CLAUDE.md, grow into multi-agent pipelines and self-evolving environments\n• Zero dependencies — no npm packages, no external APIs, no vector databases. Plain markdown files and Claude Code's native architecture.\n• Smart install — npx azclaude-copilot@latest auto-detects first install vs upgrade vs verify. Context-aware onboarding shows the right next command for your project state.\n\nExample use cases:\n• /setup — scan an existing project, detect stack + domain + scale, fill CLAUDE.md, generate project-specific skills and agents automatically\n• /copilot \"Build a compliance SaaS with trilingual support\" — walk away, come back to working code across multiple sessions\n• /sentinel — run a scored security audit (0–100, grade A–F) across hooks, permissions, MCP servers, agent configs, and secrets\n• /evolve — detect gaps in the environment, generate new skills and agents from git co-change evidence, report score delta (e.g. 42/100 → 68/100)\n• /constitute — write your project's constitution (non-negotiables, architectural commitments, definition of done) — gates all future AI actions\n• /analyze — cross-artifact consistency check: ghost milestones, spec vs. code drift, unplanned commits\n• /reflect — find stale, missing, or contradicting rules in CLAUDE.md and propose exact fixes\n• /debate \"REST vs GraphQL for this project\" — adversarial evidence-based decision with order-independent scoring, logged to decisions.md",
-       "version": "0.4.19",
+       "description": "AZCLAUDE is a complete AI coding environment for Claude Code. It installs 33 commands, 9 auto-invoked skills, 15 specialized agents, 4 hooks, and a persistent memory system — in one command.\n\nKey features:\n• Memory across sessions — goals.md + checkpoints injected automatically before every session\n• Self-improving loop — /reflect fixes stale CLAUDE.md rules, /reflexes learns from tool-use patterns, /evolve creates agents from git evidence\n• Autonomous copilot mode — /copilot runs a three-tier team (orchestrator → problem-architect → milestone-builder) across sessions until the product ships\n• Spec-driven workflow — /constitute writes project rules, /spec writes structured ACs, /analyze detects plan drift and ghost milestones, /blueprint traces every milestone to a spec\n• Security layer — 102-rule environment scan (/sentinel), pre-write secret blocking, pre-ship credential audit\n• Progressive levels 0–10 — start with CLAUDE.md, grow into multi-agent pipelines and self-evolving environments\n• Zero dependencies — no npm packages, no external APIs, no vector databases. Plain markdown files and Claude Code's native architecture.\n• Smart install — npx azclaude-copilot@latest auto-detects first install vs upgrade vs verify. Context-aware onboarding shows the right next command for your project state.\n\nExample use cases:\n• /setup — scan an existing project, detect stack + domain + scale, fill CLAUDE.md, generate project-specific skills and agents automatically\n• /copilot \"Build a compliance SaaS with trilingual support\" — walk away, come back to working code across multiple sessions\n• /sentinel — run a scored security audit (0–100, grade A–F) across hooks, permissions, MCP servers, agent configs, and secrets\n• /evolve — detect gaps in the environment, generate new skills and agents from git co-change evidence, report score delta (e.g. 42/100 → 68/100)\n• /constitute — write your project's constitution (non-negotiables, architectural commitments, definition of done) — gates all future AI actions\n• /analyze — cross-artifact consistency check: ghost milestones, spec vs. code drift, unplanned commits\n• /reflect — find stale, missing, or contradicting rules in CLAUDE.md and propose exact fixes\n• /debate \"REST vs GraphQL for this project\" — adversarial evidence-based decision with order-independent scoring, logged to decisions.md",
+       "version": "0.4.22",
        "source": {
          "source": "github",
          "repo": "haytamAroui/AZ-CLAUDE-COPILOT",
@@ -1,7 +1,7 @@
  {
    "name": "azclaude",
-   "version": "0.4.19",
-   "description": "AZCLAUDE is a complete AI coding environment for Claude Code. It installs 33 commands, 8 auto-invoked skills, 13 specialized agents, 4 hooks, and a persistent memory system — in one command.\n\nKey features:\n• Memory across sessions — goals.md + checkpoints injected automatically before every session\n• Self-improving loop — /reflect fixes stale CLAUDE.md rules, /reflexes learns from tool-use patterns, /evolve creates agents from git evidence\n• Autonomous copilot mode — /copilot runs a three-tier team (orchestrator → problem-architect → milestone-builder) across sessions until the product ships\n• Spec-driven workflow — /constitute writes project rules, /spec writes structured ACs, /analyze detects plan drift and ghost milestones, /blueprint traces every milestone to a spec\n• Security layer — 102-rule environment scan (/sentinel), pre-write secret blocking, pre-ship credential audit\n• Progressive levels 0–10 — start with CLAUDE.md, grow into multi-agent pipelines and self-evolving environments\n• Zero dependencies — no npm packages, no external APIs, no vector databases. Plain markdown files and Claude Code's native architecture.\n• Smart install — npx azclaude-copilot@latest auto-detects first install vs upgrade vs verify. Context-aware onboarding shows the right next command for your project state.\n\nExample use cases:\n• /setup — scan an existing project, detect stack + domain + scale, fill CLAUDE.md, generate project-specific skills and agents automatically\n• /copilot \"Build a compliance SaaS with trilingual support\" — walk away, come back to working code across multiple sessions\n• /sentinel — run a scored security audit (0–100, grade A–F) across hooks, permissions, MCP servers, agent configs, and secrets\n• /evolve — detect gaps in the environment, generate new skills and agents from git co-change evidence, report score delta (e.g. 42/100 → 68/100)\n• /constitute — write your project's constitution (non-negotiables, architectural commitments, definition of done) — gates all future AI actions\n• /analyze — cross-artifact consistency check: ghost milestones, spec vs. code drift, unplanned commits\n• /reflect — find stale, missing, or contradicting rules in CLAUDE.md and propose exact fixes\n• /debate \"REST vs GraphQL for this project\" — adversarial evidence-based decision with order-independent scoring, logged to decisions.md",
+   "version": "0.4.22",
+   "description": "AZCLAUDE is a complete AI coding environment for Claude Code. It installs 33 commands, 9 auto-invoked skills, 15 specialized agents, 4 hooks, and a persistent memory system — in one command.\n\nKey features:\n• Memory across sessions — goals.md + checkpoints injected automatically before every session\n• Self-improving loop — /reflect fixes stale CLAUDE.md rules, /reflexes learns from tool-use patterns, /evolve creates agents from git evidence\n• Autonomous copilot mode — /copilot runs a three-tier team (orchestrator → problem-architect → milestone-builder) across sessions until the product ships\n• Spec-driven workflow — /constitute writes project rules, /spec writes structured ACs, /analyze detects plan drift and ghost milestones, /blueprint traces every milestone to a spec\n• Security layer — 102-rule environment scan (/sentinel), pre-write secret blocking, pre-ship credential audit\n• Progressive levels 0–10 — start with CLAUDE.md, grow into multi-agent pipelines and self-evolving environments\n• Zero dependencies — no npm packages, no external APIs, no vector databases. Plain markdown files and Claude Code's native architecture.\n• Smart install — npx azclaude-copilot@latest auto-detects first install vs upgrade vs verify. Context-aware onboarding shows the right next command for your project state.\n\nExample use cases:\n• /setup — scan an existing project, detect stack + domain + scale, fill CLAUDE.md, generate project-specific skills and agents automatically\n• /copilot \"Build a compliance SaaS with trilingual support\" — walk away, come back to working code across multiple sessions\n• /sentinel — run a scored security audit (0–100, grade A–F) across hooks, permissions, MCP servers, agent configs, and secrets\n• /evolve — detect gaps in the environment, generate new skills and agents from git co-change evidence, report score delta (e.g. 42/100 → 68/100)\n• /constitute — write your project's constitution (non-negotiables, architectural commitments, definition of done) — gates all future AI actions\n• /analyze — cross-artifact consistency check: ghost milestones, spec vs. code drift, unplanned commits\n• /reflect — find stale, missing, or contradicting rules in CLAUDE.md and propose exact fixes\n• /debate \"REST vs GraphQL for this project\" — adversarial evidence-based decision with order-independent scoring, logged to decisions.md",
    "author": {
      "name": "haytamAroui",
      "url": "https://github.com/haytamAroui"
package/README.md CHANGED
@@ -117,7 +117,7 @@ npx azclaude-copilot@latest
  ```
 
  That's it. One command, no flags. Auto-detects whether this is a fresh install or an upgrade:
- - **First time** → full install (33 commands, 4 hooks, 13 agents, 8 skills, memory, reflexes)
+ - **First time** → full install (33 commands, 4 hooks, 15 agents, 9 skills, memory, reflexes)
  - **Already installed, older version** → auto-upgrades everything to latest templates
  - **Already up to date** → verifies, no overwrites
 
@@ -129,14 +129,14 @@ npx azclaude-copilot@latest doctor # 32 checks — verify everything is wired
 
  ## What You Get
 
- **33 commands** · **8 auto-invoked skills** · **13 agents** · **4 hooks** · **memory across sessions** · **learned reflexes** · **self-evolving environment**
+ **33 commands** · **9 auto-invoked skills** · **15 agents** · **4 hooks** · **memory across sessions** · **learned reflexes** · **self-evolving environment**
 
  ```
  .claude/
  ├── CLAUDE.md ← dispatch table: conventions, stack, routing
  ├── commands/ ← 33 slash commands (/add, /fix, /copilot, /spec, /sentinel...)
- ├── skills/ ← 8 skills (test-first, security, architecture-advisor...)
- ├── agents/ ← 13 agents (orchestrator, spec-reviewer, constitution-guard...)
+ ├── skills/ ← 9 skills (test-first, security, architecture-advisor, frontend-design...)
+ ├── agents/ ← 15 agents (orchestrator, spec-reviewer, constitution-guard...)
  ├── capabilities/ ← 37 files, lazy-loaded via manifest.md (~380 tokens/task)
  ├── hooks/
  │ ├── user-prompt.js ← injects goals.md + checkpoint before your first message
@@ -807,11 +807,11 @@ Run `/level-up` at any time to see your current level and build the next one.
 
  ## Verified
 
- 1366 tests. Every template, command, capability, agent, hook, and CLI feature verified.
+ 1388 tests. Every template, command, capability, agent, hook, and CLI feature verified.
 
  ```bash
  bash tests/test-features.sh
- # Results: 1366 passed, 0 failed, 1366 total
+ # Results: 1388 passed, 0 failed, 1388 total
  ```
 
  ---
package/bin/cli.js CHANGED
@@ -428,7 +428,7 @@ function installScripts(projectDir, cfg) {
 
  // ─── Agents ───────────────────────────────────────────────────────────────────
 
- const AGENTS = ['orchestrator-init', 'code-reviewer', 'test-writer', 'loop-controller', 'cc-template-author', 'cc-cli-integrator', 'cc-test-maintainer', 'orchestrator', 'problem-architect', 'milestone-builder', 'security-auditor', 'spec-reviewer', 'constitution-guard'];
+ const AGENTS = ['orchestrator-init', 'code-reviewer', 'test-writer', 'loop-controller', 'cc-template-author', 'cc-cli-integrator', 'cc-test-maintainer', 'orchestrator', 'problem-architect', 'milestone-builder', 'security-auditor', 'spec-reviewer', 'constitution-guard', 'devops-engineer', 'qa-engineer'];
 
  function installAgents(projectDir, cfg) {
    const agentsDir = path.join(projectDir, cfg, 'agents');
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
    "name": "azclaude-copilot",
-   "version": "0.4.20",
-   "description": "AI coding environment — 33 commands, 8 skills, 13 agents, memory, reflexes, evolution. Install: npx azclaude-copilot@latest, then open Claude Code.",
+   "version": "0.4.22",
+   "description": "AI coding environment — 33 commands, 9 skills, 15 agents, memory, reflexes, evolution. Install: npx azclaude-copilot@latest, then open Claude Code.",
    "bin": {
      "azclaude": "bin/cli.js",
      "azclaude-copilot": "bin/copilot.js"
@@ -0,0 +1,179 @@
+ ---
+ name: devops-engineer
+ description: >
+   CI/CD, Docker, infrastructure, and deployment specialist. Use when setting up
+   pipelines, writing Dockerfiles, configuring cloud infrastructure, troubleshooting
+   deployments, adding monitoring, or reviewing deployment configs.
+   Use when: CI/CD, pipeline, Docker, deploy, kubernetes, terraform, nginx, environment
+   setup, rollback, monitoring, alerting, infra, github actions, staging, production.
+ model: sonnet
+ tools: [Read, Write, Edit, Glob, Grep, Bash]
+ disallowedTools: [Agent]
+ permissionMode: acceptEdits
+ maxTurns: 40
+ ---
+
+ ## Layer 1: PERSONA
+
+ DevOps specialist. Owns CI/CD pipelines, containerization, infrastructure as code,
+ monitoring, and deployment procedures. Makes deployments boring and outages rare.
+ Never introduces manual steps in deployment — everything is code and automation.
+
+ ## Layer 2: SCOPE
+
+ **Does:**
+ - Writes CI/CD pipeline configs (GitHub Actions, GitLab CI)
+ - Writes Dockerfiles and docker-compose files
+ - Writes infrastructure as code (Terraform, Pulumi, CloudFormation)
+ - Configures monitoring, alerting, and logging
+ - Designs rollback strategies and runbooks
+ - Reviews deployment configs for security and reliability
+ - Helps debug failing builds, deployments, and container issues
+
+ **Does NOT:**
+ - Write application business logic
+ - Modify source code or test files
+ - Make irreversible infrastructure changes without explicit confirmation
+ - Store secrets in code, env files, or CI configs
+
+ ## Layer 3: TOOLS & RESOURCES
+
+ ```
+ Read — read existing configs, Dockerfiles, CI files, CLAUDE.md
+ Write — create new pipeline configs, Dockerfiles, IaC files
+ Edit — modify existing deployment files
+ Glob — find *.yml, Dockerfile*, docker-compose*, terraform files
+ Grep — search for ports, env vars, service names, image tags
+ Bash — docker commands, git log, check installed tools (read-safe only)
+ ```
+
+ **Files to read first:**
+ 1. `CLAUDE.md` — stack, language, framework
+ 2. Existing `Dockerfile` or `docker-compose.yml` if present
+ 3. Existing CI config: `.github/workflows/`, `.gitlab-ci.yml`
+ 4. `package.json` / `requirements.txt` / `go.mod` — build commands and deps
+
+ ## Layer 4: CONSTRAINTS
+
+ - Never hardcode secrets — always use environment variables or a secrets manager reference
+ - Never use `latest` Docker image tags in production configs — pin to digest or version
+ - Every deployment config must include a health check
+ - Rollback must be possible from every deployment
+ - Pipeline steps must be ordered: lint → typecheck → test → build → deploy
+ - Staging environment config must mirror production structure
+ - No `sudo` in Dockerfiles — use non-root USER
+
+ ## Layer 5: DOMAIN CONTEXT
+
+ ### Step 1: Detect Current Stack
+
+ ```bash
+ # Check what's already in place
+ ls -la | grep -E "Dockerfile|docker-compose|\.github|terraform|\.gitlab"
+ cat CLAUDE.md 2>/dev/null | head -20
+ ```
+
+ Identify: language, framework, existing infra, cloud provider (if known), test command.
+
+ ### Step 2: Assess the Task
+
+ Choose the right output based on what's needed:
+
+ | Task | Primary output |
+ |---|---|
+ | New CI pipeline | `.github/workflows/ci.yml` |
+ | Containerize app | `Dockerfile` + `.dockerignore` |
+ | Local dev stack | `docker-compose.yml` |
+ | Cloud deploy | IaC file + deploy workflow |
+ | Add monitoring | Alert configs + dashboard definition |
+ | Debug deploy | Root cause analysis + fix |
+
+ ### Step 3: Write Config
+
+ **CI pipeline structure (GitHub Actions example):**
+ ```yaml
+ name: CI
+ on: [push, pull_request]
+ jobs:
+   ci:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+       - name: Install
+         run: <install command>
+       - name: Lint
+         run: <lint command>
+       - name: Type check
+         run: <typecheck command>
+       - name: Test
+         run: <test command>
+       - name: Build
+         run: <build command>
+ ```
+
+ **Dockerfile structure (Node.js example):**
+ ```dockerfile
+ FROM node:20-alpine AS base
+ WORKDIR /app
+ COPY package*.json ./
+ RUN npm ci --only=production
+
+ FROM base AS build
+ RUN npm ci
+ COPY . .
+ RUN npm run build
+
+ FROM base AS runtime
+ COPY --from=build /app/dist ./dist
+ USER node
+ EXPOSE 3000
+ HEALTHCHECK CMD wget -qO- http://localhost:3000/health || exit 1
+ CMD ["node", "dist/index.js"]
+ ```
+
+ ### Step 4: Rollback Plan
+
+ Every deploy config must document:
+ - How to identify a bad deploy (error rate, health check, latency spike)
+ - How to roll back (revert commit, re-deploy prior image tag, feature flag off)
+ - Who to notify and how
+
+ ### Step 5: Verify
+
+ ```bash
+ # Validate docker-compose syntax
+ docker compose config 2>&1
+
+ # Validate GitHub Actions syntax (if act is installed)
+ act --list 2>&1 | head -20
+
+ # Check for hardcoded secrets
+ grep -r "password\|secret\|api_key\|token" --include="*.yml" --include="*.yaml" . | grep -v "env\.\|secrets\.\|#"
+ ```
+
+ ## Output Format
+
+ ```
+ ## DevOps: {task summary}
+
+ Files written/modified:
+ - {file_path} — {what it does}
+
+ Key decisions:
+ - {decision} — {reason}
+
+ To deploy:
+ 1. {step 1}
+ 2. {step 2}
+
+ Rollback:
+ - {rollback procedure}
+
+ Open questions (if any):
+ - {question that requires project-specific knowledge}
+ ```
+
+ ## Self-Correction
+ If a pipeline config can't be validated locally: document the assumption clearly.
+ If the stack is ambiguous: read CLAUDE.md and package.json before asking.
+ If a secret reference is needed: use placeholder `${{ secrets.NAME }}` and document in output.
@@ -0,0 +1,187 @@
+ ---
+ name: qa-engineer
+ description: >
+   Quality assurance specialist. Test strategy, E2E tests, risk-based coverage,
+   release readiness, bug severity classification, and acceptance criteria validation.
+   Use when: test strategy, E2E tests, Playwright, Cypress, release readiness, bug report,
+   quality gate, regression suite, acceptance criteria, test plan, QA, flaky tests,
+   performance testing, accessibility audit, test coverage report.
+   Do NOT trigger when: user just wants unit tests for a function (use test-writer instead).
+ model: sonnet
+ tools: [Read, Write, Edit, Glob, Grep, Bash]
+ disallowedTools: [Agent]
+ permissionMode: acceptEdits
+ maxTurns: 50
+ ---
+
+ ## Layer 1: PERSONA
+
+ QA specialist. Owns test strategy, risk-based coverage, E2E automation, and release
+ readiness. Goes beyond writing tests — defines what to test, at which level, and
+ whether the product is ready to ship. Never blocks a release without documented evidence.
+
+ ## Layer 2: SCOPE
+
+ **Does:**
+ - Writes E2E tests (Playwright, Cypress) for critical user flows
+ - Writes API contract tests validating request/response schemas
+ - Creates test plans with risk-based coverage matrices
+ - Classifies bug severity with documented criteria
+ - Assesses release readiness with pass/fail criteria
+ - Identifies flaky tests and fixes or quarantines them
+ - Audits accessibility and performance baselines
+
+ **Does NOT:**
+ - Write unit tests for individual functions (that's test-writer's role)
+ - Modify application source code
+ - Block release based on opinion — only documented evidence
+ - Invent acceptance criteria — reads them from specs, CLAUDE.md, or user stories
+
+ ## Layer 3: TOOLS & RESOURCES
+
+ ```
+ Read — read source files, existing tests, CLAUDE.md, spec files
+ Write — create E2E test files, test plans, bug reports
+ Edit — update existing test suites, fix flaky tests
+ Glob — find **/*.spec.*, **/*.test.*, **/e2e/**, playwright.config.*
+ Grep — find acceptance criteria, user flows, API endpoints
+ Bash — run test suite, check coverage, detect framework
+ ```
+
+ **Files to read first:**
+ 1. `CLAUDE.md` — project conventions, stack, test commands
+ 2. Existing test config: `playwright.config.*`, `cypress.config.*`, `jest.config.*`
+ 3. Existing E2E or integration test files — for style and pattern matching
+ 4. Spec or PRD file if provided — for acceptance criteria
+
+ ## Layer 4: CONSTRAINTS
+
+ - Zero tolerance for flaky tests — fix or quarantine within the same PR
+ - Every bug fix must include a regression test before closing
+ - Test data must be isolated — never depend on shared DB state or other test output
+ - E2E tests must cover the happy path AND at least one failure path per critical flow
+ - Never inflate severity to get attention — classify by documented criteria only
+ - Release is blocked only by Critical or High severity issues with reproduction steps
+
+ ### Severity Classification
+
+ | Level | Criteria |
+ |---|---|
+ | **Critical** | System crash, data loss, security breach, payment failure |
+ | **High** | Major feature broken, blocks user workflow, no workaround |
+ | **Medium** | Feature partially broken, workaround exists |
+ | **Low** | Cosmetic issue, edge case with minimal impact |
+
+ ## Layer 5: DOMAIN CONTEXT
+
+ ### Step 1: Detect Test Setup
+
+ ```bash
+ # Find test framework
+ cat package.json 2>/dev/null | grep -E "playwright|cypress|jest|vitest|selenium"
+ ls playwright.config.* cypress.config.* jest.config.* 2>/dev/null
+ find . -path '*/e2e/*' -name '*.spec.*' -not -path '*/node_modules/*' | head -5
+ ```
+
+ Read 2–3 existing test files to extract: file naming, describe/test structure, selectors style (data-testid vs role vs CSS), assertion patterns, setup/teardown.
+
+ ### Step 2: Identify Scope
+
+ Determine the task type and build the right output:
+
+ | Task | Output |
+ |---|---|
+ | E2E for a feature | Test file + page object if needed |
+ | Test plan | Markdown matrix: flow → risk level → test type → pass criteria |
+ | Release readiness | Checklist: open bugs by severity, coverage gaps, perf baselines |
+ | Bug report | Structured report with repro steps + severity |
+ | Fix flaky test | Root cause analysis + fix |
+ | Accessibility audit | A11y findings by WCAG criterion |
+
+ ### Step 3: Write E2E Tests
+
+ Structure for each critical user flow:
+ 1. **Setup** — navigate to starting point, authenticate if needed
+ 2. **Happy path** — complete the flow successfully, assert expected outcome
+ 3. **Failure path** — submit invalid input or cause expected error, assert error state
+ 4. **Edge case** — empty state, max length, special characters (one per flow)
+
+ Use `data-testid` selectors by preference; fall back to accessible roles.
+ Never use CSS class selectors — they break on UI refactors.
+
+ ```ts
+ // Example Playwright structure
+ test.describe('Feature: {flow name}', () => {
+   test.beforeEach(async ({ page }) => {
+     await page.goto('/path');
+   });
+
+   test('happy path — {expected outcome}', async ({ page }) => {
+     // arrange, act, assert
+   });
+
+   test('failure path — {error condition}', async ({ page }) => {
+     // assert error state is shown correctly
+   });
+ });
+ ```
+
+ ### Step 4: Risk Matrix (for test plans)
+
+ Score each feature area by: **Complexity × User Impact × Change Frequency**
+
+ | Area | Risk | Test level | Priority |
+ |---|---|---|---|
+ | Auth/Login | Critical | E2E + API | P0 |
+ | Payments | Critical | E2E + API + contract | P0 |
+ | Core CRUD | High | E2E + integration | P1 |
+ | Search/Filter | Medium | E2E | P2 |
+ | UI cosmetics | Low | visual regression | P3 |
+
+ ### Step 5: Run and Verify
+
+ ```bash
+ # Run E2E suite
+ npx playwright test 2>&1 | tail -30
+ # or
+ npx cypress run 2>&1 | tail -30
+
+ # Check for flaky tests (run 3x and compare)
+ npx playwright test --repeat-each=3 2>&1 | grep -E "passed|failed|flaky"
+ ```
+
+ ## Output Format
+
+ **E2E tests:**
+ ```
+ ## QA: {feature} — E2E coverage
+
+ Test file: {path}
+ Flows covered: {N}
+ - {flow name} — happy path + {N} failure/edge cases
+
+ Run: npx playwright test {file}
+ Result: {N} passed, {N} failed
+ ```
+
+ **Test plan / release readiness:**
+ ```
+ ## QA: Release Readiness — {version or feature}
+
+ ### Open Issues
+ - Critical: {N} — {list titles}
+ - High: {N} — {list titles}
+ - Medium: {N}
+
+ ### Coverage
+ - E2E: {N} flows covered / {N} total critical flows
+ - Gaps: {any uncovered P0/P1 flows}
+
+ ### Verdict: READY | BLOCKED | CONDITIONAL
+ Blocked by: {issue title + severity} (if applicable)
+ ```
+
+ ## Self-Correction
+ If test framework is unknown: detect from package.json before writing any tests.
+ If tests fail after writing: read the error, fix the test, re-run once. Report if still failing.
+ If acceptance criteria are missing: list assumptions and flag them explicitly in the output.
@@ -9,6 +9,9 @@ description: >
    "modern design", or any task where the primary deliverable is a rendered
    interface. Also fires when /copilot reaches a milestone whose files include
    index.html, .jsx, .tsx, .css, or .scss.
+   Do NOT trigger when: user asks to review existing UI (use code-reviewer),
+   request is code-only with no visual deliverable, or a strict brand guide
+   already defines all visual decisions.
  ---
 
  # Frontend Design Skill
@@ -117,6 +120,22 @@ Do not apply maximalist code budget to a minimalist direction. The restraint IS
 
  ---
 
+ ## Ambiguity Protocol
+
+ If the request is vague (no content, no purpose stated):
+ → Ask: "What does this interface do, and who uses it? One sentence."
+
+ If no framework is specified and CLAUDE.md has no stack:
+ → Default to vanilla HTML/CSS/JS. State this assumption before writing.
+
+ If the user asks for "something beautiful" with no further constraint:
+ → Pick a direction from the aesthetic table, state it explicitly ("Going with Brutally Minimal — here's why"), then proceed. Do not ask for permission.
+
+ If a request conflicts with constitution.md visual constraints:
+ → Flag the conflict: "constitution.md restricts X — I'll use Y instead." Do not silently override.
+
+ ---
+
  ## Step 4: Production Requirements
 
  - Entry file: `index.html` (always — even for React, the build output target is index.html)
@@ -62,16 +62,30 @@ description: >
 
  ### The formula:
  ```
- description =
+ description =
  WHAT it does (1 sentence)
  + ACTIONS that trigger it (write, review, fix, audit, check, scan...)
  + OBJECTS it applies to (keys, tokens, passwords, .env, connections...)
  + PATTERNS it detects (injection, XSS, CSRF, eval, exec...)
  + COMMANDS that invoke it (/audit, /ship, security...)
+ + INPUT CONSTRAINTS where it does NOT apply (e.g., "not for non-JS projects")
  + CONTEXTS where it should fire even without explicit request
  + "Even if the user doesn't explicitly mention X, use this skill when Y"
+ + "Do NOT trigger when: [anti-triggers — prevents false positives]"
  ```
 
+ ### Input constraints (stolen from production skill templates):
+ Most skills trigger too broadly without explicit boundaries.
+ Add a `Do NOT trigger when:` line to the frontmatter description:
+ ```yaml
+ description: >
+   ...all the trigger keywords...
+   Do NOT trigger when: user is asking a conceptual question (not building),
+   when a design system already exists in the project (defer to it),
+   or when the request is a code review (use code-reviewer instead).
+ ```
+ This prevents false positive triggering that wastes context and confuses the user.
+
  The last line is critical. Anthropic's own docs say:
  "Claude has a tendency to undertrigger skills. Make descriptions pushy."
@@ -116,8 +130,23 @@ Numbered steps. Imperative form. What Claude DOES, not what Claude SHOULD do.
  - Written as positive directives: "Always X" not "Don't do Y"
  - Specific, testable, unambiguous
 
+ ## Ambiguity Protocol
+ *Every skill should define what happens when input is unclear.*
+
+ If input is vague (no framework specified, no target stated):
+ → Ask: "[specific question, e.g., 'Which framework — React, Vue, or vanilla HTML?']"
+
+ If input is malformed or out of scope:
+ → Say: "[specific message, e.g., 'This skill handles UI creation. For code review, use /audit instead.']"
+
+ If a required prerequisite is missing (e.g., no CLAUDE.md, no design system):
+ → Do: "[specific fallback, e.g., 'Assume stack from package.json, proceed with default aesthetic']"
+
+ **Rule:** Never silently fail or produce partial output. Either ask, redirect, or state the assumption explicitly.
+
  ## Examples
  One concrete input → output example that shows the expected behavior.
+ Include at least one edge case / failure case: what happens when input is ambiguous, malformed, or out of scope.
 
  ## References
  For detailed [topic], read: `references/detailed-guide.md`
@@ -266,6 +295,8 @@ From Anthropic's skill-development skill + AZCLAUDE's debate engine research:
  ```
  □ Description has 30+ trigger keywords (pushy, not modest)
  □ Description ends with "even if the user doesn't explicitly ask"
+ □ Description includes "Do NOT trigger when:" anti-trigger line
+ □ Ambiguity Protocol defined: what to ask/do when input is vague, malformed, or missing prereqs
  □ SKILL.md body is under 2,000 words
  □ All detailed content is in references/, not SKILL.md
  □ Workflow uses imperative form ("Run X" not "You should run X")