@paths.design/caws-cli 3.2.4 → 3.3.1

@@ -1,820 +1,1046 @@
1
- # CAWS v1.0 — Engineering-Grade Operating System for Coding Agents
1
+ # CAWS - Agent Workflow Guide
2
2
 
3
- ## Purpose
3
+ **Coding Agent Workflow System** - Engineering-grade operating system for AI-assisted development
4
4
 
5
- Our "engineering-grade" operating system for coding agents that (1) forces planning before code, (2) bakes in tests as first-class artifacts, (3) creates explainable provenance, and (4) enforces quality via automated CI gates. It's expressed as a Working Spec + Ruleset the agent must follow, with schemas, templates, scripts, and verification hooks that enable better collaboration between agent and our human in the loop.
5
+ **Version**: 3.1.0
6
+ **Last Updated**: October 8, 2025
6
7
 
7
- ## 1) Core Framework
8
+ ---
9
+
10
+ ## Purpose & Philosophy
11
+
12
+ CAWS is an engineering-grade operating system for coding agents that:
13
+
14
+ 1. **Forces planning before code** - No implementation without a validated working spec
15
+ 2. **Treats tests as first-class artifacts** - Tests drive implementation, not the other way around
16
+ 3. **Creates explainable provenance** - Every change is tracked and attributable
17
+ 4. **Enforces quality via automated gates** - CI/CD validates coverage, mutation scores, and contracts
18
+
19
+ This guide teaches agents how to collaborate effectively with humans using CAWS tooling and conventions.
20
+
21
+ ---
22
+
23
+ ## Quick Start for Agents
24
+
25
+ ### Your First CAWS Project
26
+
27
+ When you encounter a CAWS project, follow this sequence:
28
+
29
+ 1. **Check for working spec**: Look for `.caws/working-spec.yaml`
30
+ 2. **Understand the scope**: Read the `scope.in` and `scope.out` to know boundaries
31
+ 3. **Check risk tier**: Tier 1 (critical), Tier 2 (standard), Tier 3 (low risk)
32
+ 4. **Review acceptance criteria**: These are your implementation targets
33
+ 5. **Validate before starting**: Run `caws validate` to ensure spec is valid
34
+
35
+ ### The Golden Rule
36
+
37
+ **Never write implementation code until:**
38
+
39
+ - Working spec exists and validates
40
+ - Test plan is defined
41
+ - Acceptance criteria are clear
42
+ - Scope boundaries are understood
43
+
44
+ ---
45
+
46
+ ## Core Concepts
47
+
48
+ ### Risk Tiers - Your Quality Contract
49
+
50
+ Risk tiers drive rigor and determine quality gates:
51
+
52
+ | Tier | Use Case | Coverage | Mutation | Contracts | Review |
53
+ | --------- | --------------------------- | -------- | -------- | --------- | -------- |
54
+ | **🔴 T1** | Auth, billing, migrations | 90%+ | 70%+ | Required | Manual |
55
+ | **🟡 T2** | Features, APIs, data writes | 80%+ | 50%+ | Required | Optional |
56
+ | **🟢 T3** | UI, internal tools | 70%+ | 30%+ | Optional | Optional |
57
+
58
+ **As an agent, you must:**
59
+
60
+ - Infer and declare the tier in your plan
61
+ - Meet or exceed tier requirements
62
+ - Request human review for Tier 1 changes
63
+ - Never downgrade a tier without human approval
64
+
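The tier table can also be read as data. The TypeScript sketch below encodes the thresholds from the table and lists which gates a set of measured results would fail. The names (`TierPolicy`, `MeasuredResults`, `checkTierGates`) are illustrative assumptions and are not part of the CAWS CLI; the real gate logic may differ.

```typescript
// Thresholds are copied from the risk-tier table; all names are illustrative.
type RiskTier = 1 | 2 | 3;

interface TierPolicy {
  minCoverage: number; // fraction of 1, so 0.9 means 90%
  minMutationScore: number;
  contractsRequired: boolean;
  manualReviewRequired: boolean;
}

const TIER_POLICIES: Record<RiskTier, TierPolicy> = {
  1: { minCoverage: 0.9, minMutationScore: 0.7, contractsRequired: true, manualReviewRequired: true },
  2: { minCoverage: 0.8, minMutationScore: 0.5, contractsRequired: true, manualReviewRequired: false },
  3: { minCoverage: 0.7, minMutationScore: 0.3, contractsRequired: false, manualReviewRequired: false },
};

interface MeasuredResults {
  coverage: number;
  mutationScore: number;
  contractTestsPass: boolean;
  manualReviewDone: boolean;
}

// Returns one message per failed gate; an empty array means every gate passes.
function checkTierGates(tier: RiskTier, results: MeasuredResults): string[] {
  const policy = TIER_POLICIES[tier];
  const failures: string[] = [];
  if (results.coverage < policy.minCoverage) {
    failures.push(`coverage ${results.coverage} is below ${policy.minCoverage}`);
  }
  if (results.mutationScore < policy.minMutationScore) {
    failures.push(`mutation score ${results.mutationScore} is below ${policy.minMutationScore}`);
  }
  if (policy.contractsRequired && !results.contractTestsPass) {
    failures.push("contract tests are required and did not pass");
  }
  if (policy.manualReviewRequired && !results.manualReviewDone) {
    failures.push("manual review is required and has not been completed");
  }
  return failures;
}

// A Tier 1 change with 85% coverage fails only the coverage gate.
const tierOneFailures = checkTierGates(1, {
  coverage: 0.85,
  mutationScore: 0.75,
  contractTestsPass: true,
  manualReviewDone: true,
});
console.log(tierOneFailures);
```

Keeping the thresholds in one table-like constant makes it harder for a tier to be silently downgraded in code.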
65
+ ### Key Invariants (Never Violate These)
66
+
67
+ 1. **Atomic Change Budget**: Stay within `max_files` and `max_loc` limits
68
+ 2. **In-Place Refactors**: No shadow files (`enhanced-*`, `new-*`, `v2-*`, etc.)
69
+ 3. **Deterministic Code**: Use injected time/uuid/random for testability
70
+ 4. **Secure Prompts**: Never include secrets, `.env` files, or keys in context
71
+ 5. **Provenance**: All changes are tracked and attributable
72
+
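The shadow-file invariant lends itself to a simple name check. The sketch below flags changed files whose base name starts with one of the prefixes listed above. The prefix list and the function names are assumptions for illustration; the naming check that CAWS actually ships may use different patterns.

```typescript
// Prefixes taken from the invariant above; extend the list as needed.
const SHADOW_PREFIXES: string[] = ["enhanced-", "new-", "v2-"];

// True when the file's base name (not its directory) starts with a shadow prefix.
function isShadowFile(path: string): boolean {
  const baseName = (path.split("/").pop() ?? "").toLowerCase();
  return SHADOW_PREFIXES.some((prefix) => baseName.startsWith(prefix));
}

// Filters a list of changed paths down to the suspected shadow files.
function findShadowFiles(changedPaths: string[]): string[] {
  return changedPaths.filter(isShadowFile);
}

const suspects = findShadowFiles([
  "src/auth/login.ts",
  "src/auth/enhanced-login.ts",
  "src/new-ui/button.ts", // directory names are ignored, so this is not flagged
  "src/v2-client.ts",
]);
console.log(suspects);
```

A prefix check is coarse: a legitimately named file such as `new-feature.ts` would also be flagged, so a real guard probably needs an allow-list or a comparison against existing file stems.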
73
+ ### The Working Spec - Your Blueprint
74
+
75
+ Every task needs a working spec at `.caws/working-spec.yaml`:
76
+
77
+ ```yaml
78
+ id: FEAT-001
79
+ title: 'Add user authentication flow'
80
+ risk_tier: 1
81
+ mode: feature
82
+ change_budget:
83
+   max_files: 25
84
+   max_loc: 1000
85
+ blast_radius:
86
+   modules: ['auth', 'api']
87
+   data_migration: false
88
+ operational_rollback_slo: '5m'
89
+ scope:
90
+   in: ['src/auth/', 'tests/auth/', 'package.json']
91
+   out: ['src/billing/', 'node_modules/']
92
+ invariants:
93
+   - 'System maintains data consistency during rollback'
94
+   - 'Authentication state is never stored in localStorage'
95
+   - 'All auth tokens expire within 24h'
96
+ acceptance:
97
+   - id: 'A1'
98
+     given: 'User is logged out'
99
+     when: 'User submits valid credentials'
100
+     then: 'User is logged in and redirected to dashboard'
101
+   - id: 'A2'
102
+     given: 'User has invalid session token'
103
+     when: 'User attempts to access protected route'
104
+     then: 'User is redirected to login with error message'
105
+ non_functional:
106
+   a11y: ['keyboard-navigation', 'screen-reader-labels']
107
+   perf: { api_p95_ms: 250, lcp_ms: 2500 }
108
+   security: ['input-validation', 'csrf-protection', 'rate-limiting']
109
+ contracts:
110
+   - type: 'openapi'
111
+     path: 'docs/api/auth.yaml'
112
+ ```
113
+
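To make the shape of the working spec concrete, here is a hedged TypeScript sketch of a few structural checks. The ID, mode, and acceptance-ID rules are taken from the JSON Schema in the earlier version of this guide (the removed side of this diff); whether `caws validate` in the current version enforces exactly these rules is an assumption. The `WorkingSpec` interface covers only a subset of fields, and `validateSpec` is a hypothetical helper.

```typescript
interface AcceptanceCriterion {
  id: string;
  given: string;
  when: string;
  then: string;
}

// Subset of the working spec fields, enough for the checks below.
interface WorkingSpec {
  id: string;
  title: string;
  risk_tier: number;
  mode: string;
  change_budget: { max_files: number; max_loc: number };
  scope: { in: string[]; out: string[] };
  invariants: string[];
  acceptance: AcceptanceCriterion[];
}

const ALLOWED_MODES = ["refactor", "feature", "fix", "doc", "chore"];

// Returns a list of problems; an empty list means the checked rules hold.
function validateSpec(spec: WorkingSpec): string[] {
  const errors: string[] = [];
  if (!/^[A-Z]+-\d+$/.test(spec.id)) errors.push("id must look like FEAT-001");
  if (![1, 2, 3].includes(spec.risk_tier)) errors.push("risk_tier must be 1, 2, or 3");
  if (!ALLOWED_MODES.includes(spec.mode)) errors.push(`unknown mode: ${spec.mode}`);
  if (spec.change_budget.max_files < 1 || spec.change_budget.max_loc < 1) {
    errors.push("change_budget values must be at least 1");
  }
  if (spec.scope.in.length === 0) errors.push("scope.in must list at least one path");
  if (spec.invariants.length === 0) errors.push("at least one invariant is required");
  if (spec.acceptance.length === 0) errors.push("at least one acceptance criterion is required");
  for (const criterion of spec.acceptance) {
    if (!/^A\d+$/.test(criterion.id)) errors.push(`acceptance id ${criterion.id} must look like A1`);
  }
  return errors;
}

// Values mirror the YAML example above.
const exampleSpec: WorkingSpec = {
  id: "FEAT-001",
  title: "Add user authentication flow",
  risk_tier: 1,
  mode: "feature",
  change_budget: { max_files: 25, max_loc: 1000 },
  scope: { in: ["src/auth/", "tests/auth/", "package.json"], out: ["src/billing/", "node_modules/"] },
  invariants: ["All auth tokens expire within 24h"],
  acceptance: [
    { id: "A1", given: "User is logged out", when: "User submits valid credentials", then: "User is logged in and redirected to dashboard" },
  ],
};
console.log(validateSpec(exampleSpec));
```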
114
+ ---
115
+
116
+ ## Your Development Workflow
117
+
118
+ ### Phase 1: Plan (Before Any Code)
119
+
120
+ **Goal**: Create a validated working spec and test plan.
121
+
122
+ ```bash
123
+ # 1. Create or validate working spec
124
+ caws validate --suggestions
125
+
126
+ # 2. If issues exist, use auto-fix for safe corrections
127
+ caws validate --auto-fix
128
+
129
+ # 3. Review acceptance criteria - these are your targets
130
+ grep -A 20 "acceptance:" .caws/working-spec.yaml
131
+ ```
132
+
133
+ **What to include in your plan:**
134
+
135
+ 1. **Design sketch**: Sequence diagram or API table
136
+ 2. **Test matrix**: Unit/contract/integration/e2e with edge cases
137
+ 3. **Data plan**: Fixtures, factories, seed strategy
138
+ 4. **Observability**: Logs/metrics/traces for production verification
139
+
140
+ **Output**: `feature.plan.md` committed to repo
141
+
142
+ ### Phase 2: Implement (Test-Driven)
143
+
144
+ **Goal**: Write tests first, then implementation.
145
+
146
+ **Order of operations:**
147
+
148
+ 1. **Contracts first** (if applicable)
149
+
150
+ ```bash
151
+ # Generate types from OpenAPI/GraphQL
152
+ npm run generate:types
153
+
154
+ # Add contract tests before implementation
155
+ # Location: tests/contract/
156
+ ```
157
+
158
+ 2. **Unit tests next**
159
+
160
+ ```bash
161
+ # Write failing tests for each acceptance criterion
162
+ # Location: tests/unit/
163
+
164
+ # Run tests to confirm they fail
165
+ npm test
166
+ ```
167
+
168
+ 3. **Implementation**
169
+
170
+ ```bash
171
+ # Implement to make tests pass
172
+ # Stay within scope.in boundaries
173
+ # Keep the total change within the max_files and max_loc budget
174
+ ```
175
+
176
+ 4. **Integration/E2E tests**
177
+
178
+ ```bash
179
+ # Add integration tests for persistence/transactions
180
+ # Location: tests/integration/
181
+
182
+ # Add E2E smoke tests for critical paths
183
+ # Location: tests/e2e/
184
+ ```
185
+
186
+ **Implementation rules:**
187
+
188
+ - ✅ **DO**: Edit existing modules, use injected dependencies, write deterministic code
189
+ - ❌ **DON'T**: Create shadow files, hardcode timestamps/UUIDs, exceed change budget
190
+
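Staying inside `scope.in` can be checked mechanically before a PR is opened. The sketch below treats each scope entry as an exact file or a directory prefix, which is a simplifying assumption (the real tooling may use globs or regular expressions), and `isInScope` / `outOfScopeFiles` are hypothetical helper names.

```typescript
interface Scope {
  in: string[];
  out: string[];
}

// True when `file` equals the entry or lives under it as a directory.
function matchesEntry(file: string, entry: string): boolean {
  const dirPrefix = entry.endsWith("/") ? entry : entry + "/";
  return file === entry || file.startsWith(dirPrefix);
}

// A file is in scope when it matches scope.in and does not match scope.out.
function isInScope(file: string, scope: Scope): boolean {
  if (scope.out.some((entry) => matchesEntry(file, entry))) return false;
  return scope.in.some((entry) => matchesEntry(file, entry));
}

// Lists the changed files that fall outside the declared scope.
function outOfScopeFiles(changedFiles: string[], scope: Scope): string[] {
  return changedFiles.filter((file) => !isInScope(file, scope));
}

// Scope values mirror the working spec example earlier in this guide.
const authScope: Scope = {
  in: ["src/auth/", "tests/auth/", "package.json"],
  out: ["src/billing/", "node_modules/"],
};
const offenders = outOfScopeFiles(
  ["src/auth/login.ts", "package.json", "src/billing/invoice.ts", "README.md"],
  authScope,
);
console.log(offenders);
```

Checking `scope.out` first means an explicit exclusion wins even if a broader `scope.in` entry would otherwise match.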
191
+ ### Phase 3: Verify (Must Pass Before PR)
192
+
193
+ **Goal**: Ensure all quality gates pass locally.
194
+
195
+ ```bash
196
+ # Run full verification suite
197
+ npm run verify
198
+
199
+ # Or run individual checks
200
+ npm run lint # Code style
201
+ npm run typecheck # Type safety
202
+ npm test # All tests
203
+ npm run test:coverage # Coverage thresholds
204
+ npm run test:mutation # Mutation testing
205
+ npm run test:contract # Contract validation
206
+ npm run test:e2e # End-to-end smoke tests
207
+ ```
208
+
209
+ **Quality gates by tier:**
210
+
211
+ **Tier 1:**
8
212
 
9
- ### Risk Tiering Drives Rigor
213
+ - Branch coverage ≥ 90%
214
+ - Mutation score ≥ 70%
215
+ - All contract tests pass
216
+ - Manual code review completed
217
+ - No SAST/secret scan violations
10
218
 
11
- • **Tier 1** (Core/critical path, auth/billing, migrations): highest rigor; mutation ≥ 70, branch cov ≥ 90, contract tests mandatory, chaos tests optional, manual review required.
12
- • **Tier 2** (Common features, data writes, cross-service APIs): mutation ≥ 50, branch cov ≥ 80, contracts mandatory if any external API, e2e smoke required.
13
- • **Tier 3** (Low risk, read-only UI, internal tooling): mutation ≥ 30, branch cov ≥ 70, integration happy-path + unit thoroughness, e2e optional.
219
+ **Tier 2:**
14
220
 
15
- Agent must infer and declare tier in the plan; human reviewer may bump it up, never down.
221
+ - Branch coverage ≥ 80%
222
+ - Mutation score ≥ 50%
223
+ - Contract tests pass (if external APIs)
224
+ - E2E smoke tests pass
16
225
 
17
- ### New Invariants (Repository-Level "Operating Envelope")
226
+ **Tier 3:**
18
227
 
19
- 1. **Atomic Change Budget**
20
- - _Invariant:_ "A PR must fit into one of: `refactor`, `feature`, `fix`, `doc`, `chore`—and must touch only files that the Working Spec's `scope.in` names."
21
- - _Reason:_ Kills scope-creep; enables deterministic review.
22
- - _Gate:_ CI rejects PRs that modify files outside `scope.in` unless `spec_delta` is present.
228
+ - Branch coverage ≥ 70%
229
+ - Mutation score ≥ 30%
230
+ - Integration happy-path tests pass
23
231
 
24
- 2. **In-place Refactor (No Shadow Copies)**
25
- - _Invariant:_ Refactors perform **in-place** edits with AST codemods; **no parallel files** (e.g., `enhanced-*.ts`).
26
- - _Gate:_ a naming linter blocks new files that share stem with suffix/prefix (`enhanced|new|v2|copy|final`).
232
+ ### Phase 4: Document & Deliver
27
233
 
28
- 3. **Determinism & Idempotency**
29
- - _Invariant:_ All new code must be testable with injected clock/uuid/random; repeated requests must be safe (where applicable) and asserted in tests.
30
- - _Gate:_ mutation tests + property tests include at least one idempotency predicate for Tier ≥2.
234
+ **Goal**: Create comprehensive PR with all artifacts.
31
235
 
32
- 4. **Prompt & Tool Security Envelope** (for agent workflows)
33
- - _Invariant:_ Agents operate with **tool allow-lists**, **redacted secrets**, and **context firebreaks** (no raw secrets in model context; never post `.env`, keys, or tokens back into diffs).
34
- - _Gate:_ prompt-lint and secret-scan on the agent prompt files + PR diffs.
236
+ **PR checklist:**
35
237
 
36
- 5. **Supply-chain Provenance**
37
- - _Invariant:_ Every CI build produces an SBOM + SLSA-style attestation attached to the PR.
38
- - _Gate:_ trust score requires valid SBOM/attestation.
238
+ ```markdown
239
+ ## Working Spec
240
+
241
+ - [ ] `.caws/working-spec.yaml` attached and validates
242
+ - [ ] Risk tier appropriate for change impact
243
+ - [ ] Acceptance criteria met
244
+
245
+ ## Tests
246
+
247
+ - [ ] Test plan documented
248
+ - [ ] Coverage meets tier requirements
249
+ - [ ] Mutation score meets tier requirements
250
+ - [ ] Contract tests pass (if applicable)
251
+ - [ ] E2E smoke tests pass (if applicable)
252
+
253
+ ## Documentation
254
+
255
+ - [ ] README updated (if public API changed)
256
+ - [ ] Migration notes (if database changes)
257
+ - [ ] Rollback plan documented
258
+ - [ ] Changelog updated (semver impact noted)
259
+
260
+ ## Quality Gates
261
+
262
+ - [ ] All lints pass
263
+ - [ ] Type checks pass
264
+ - [ ] No secret scan violations
265
+ - [ ] No SAST violations
266
+ - [ ] Performance budgets met
267
+
268
+ ## Provenance
269
+
270
+ - [ ] Commits follow conventional commits format
271
+ - [ ] PR title references ticket ID
272
+ - [ ] Provenance updated: `caws provenance update`
273
+ ```
274
+
275
+ ---
276
+
277
+ ## CLI Commands Reference
278
+
279
+ ### Project Initialization
280
+
281
+ ```bash
282
+ # Interactive wizard (recommended for new projects)
283
+ caws init --interactive
284
+
285
+ # Initialize in existing directory
286
+ caws init .
287
+
288
+ # Use project template
289
+ caws init my-project --template=extension
290
+ ```
291
+
292
+ ### Validation
293
+
294
+ ```bash
295
+ # Check working spec validity
296
+ caws validate
297
+
298
+ # Get helpful suggestions for fixing issues
299
+ caws validate --suggestions
300
+
301
+ # Auto-fix safe validation issues
302
+ caws validate --auto-fix
303
+
304
+ # Quiet mode for CI
305
+ caws validate --quiet
306
+ ```
307
+
308
+ ### Scaffolding
309
+
310
+ ```bash
311
+ # Add CAWS components to existing project
312
+ caws scaffold
313
+
314
+ # Only essential components
315
+ caws scaffold --minimal
316
+
317
+ # Include specific features
318
+ caws scaffold --with-codemods
319
+ caws scaffold --with-oidc
320
+ ```
39
321
 
40
- ### Required Inputs (No Code Until Present)
322
+ ### Provenance Tracking
41
323
 
42
- • **Working Spec YAML** (see schema below) with user story, scope, invariants, acceptance tests, non-functional budgets, risk tier.
43
- • **Interface Contracts**: OpenAPI/GraphQL SDL/proto/Pact provider/consumer stubs.
44
- • **Test Plan**: unit cases, properties, fixtures, integration flows, e2e smokes; data setup/teardown; flake controls.
45
- • **Change Impact Map**: touched modules, migrations, roll-forward/rollback.
46
- • **A11y/Perf/Sec budgets**: keyboard path(s), axe rules to enforce; perf budget (TTI/LCP/API latency); SAST/secret scanning & deps policy.
324
+ ```bash
325
+ # Initialize provenance tracking
326
+ caws provenance init
327
+
328
+ # Install git hooks for automatic tracking
329
+ caws hooks install --backup
330
+
331
+ # Update provenance manually
332
+ caws provenance update --commit <hash>
333
+
334
+ # View beautiful dashboard
335
+ caws provenance show --format=dashboard
336
+
337
+ # Analyze AI effectiveness
338
+ caws provenance analyze-ai
339
+ ```
340
+
341
+ ---
342
+
343
+ ## Mode Matrix - Know Your Context
47
344
 
48
- If any are missing, agent must generate a draft and request confirmation inside the PR description before implementing.
345
+ Different modes have different rules:
49
346
 
50
- ### The Loop: Plan → Implement → Verify → Document
347
+ | Mode | Contracts | New Files | Required Artifacts |
348
+ | ------------ | --------------- | ---------------------- | ----------------------------------------- |
349
+ | **feature** | Required first | Allowed in scope.in | Migration plan, feature flag, perf budget |
350
+ | **refactor** | Must not change | Discouraged (use codemods) | Codemod script + semantic diff report |
351
+ | **fix** | Unchanged | Discouraged | Red test → green; root cause note |
352
+ | **doc** | N/A | Allowed (docs only) | Updated README/usage snippets |
353
+ | **chore** | N/A | Limited (build/tools) | Version updates, dependency changes |
51
354
 
52
- #### 2.1 Plan (agent output, committed as feature.plan.md)
355
+ ### Feature Mode (Most Common)
53
356
 
54
- • **Design sketch**: sequence diagram or pseudo-API table.
55
- • **Test matrix**: aligned to user intent (unit/contract/integration/e2e) with edge cases and property predicates.
56
- • **Data plan**: factories/fixtures, seed strategy, anonymized sample payloads.
57
- • **Observability plan**: logs/metrics/traces; which spans and attributes will verify correctness in prod.
357
+ **When to use**: Adding new functionality
58
358
 
59
- #### 2.2 Implement (rules)
359
+ **Requirements**:
360
+
361
+ 1. Define contracts first (OpenAPI/GraphQL/etc.)
362
+ 2. Write consumer/provider tests before implementation
363
+ 3. Include migration plan if database changes
364
+ 4. Add feature flag for gradual rollout
365
+ 5. Set performance budgets
366
+
367
+ **Example workflow**:
368
+
369
+ ```bash
370
+ # 1. Define contract
371
+ vim docs/api/new-feature.yaml
372
+
373
+ # 2. Generate types
374
+ npm run generate:types
375
+
376
+ # 3. Write contract tests
377
+ vim tests/contract/new-feature.test.ts
378
+
379
+ # 4. Implement
380
+ vim src/features/new-feature.ts
381
+
382
+ # 5. Verify
383
+ npm run verify
384
+ ```
60
385
 
61
- • **Contract-first**: generate/validate types from OpenAPI/SDL; add contract tests (Pact/WireMock/MSW) before impl.
62
- • **Unit focus**: pure logic isolated; mocks only at boundaries you own (clock, fs, network).
63
- • **State seams**: inject time/uuid/random; ensure determinism; guard for idempotency where relevant.
64
- • **Migration discipline**: forwards-compatible; provide up/down, dry-run, and backfill strategy.
386
+ ### Refactor Mode (High Risk)
65
387
 
66
- ### Mode Matrix
388
+ **When to use**: Restructuring without behavior change
67
389
 
68
- | Mode | Contracts | New Files | Required Artifacts |
69
- | ------------ | ------------------------------------------------------------------- | ------------------------------------------------------------------------------ | ------------------------------------------------ |
70
- | **refactor** | Must not change | Discouraged; only when splitting modules with 1:1 mapping and codemod provided | Codemod script + semantic diff report |
71
- | **feature** | Required first; consumer/provider tests green before implementation | Allowed; must be listed in scope.in | Migration plan, feature flag, performance budget |
72
- | **fix** | Unchanged | Discouraged; prefer in-place edits | Red test → green; root cause note in PR |
73
- | **doc** | N/A | Allowed for documentation files | Updated README/usage snippets |
74
- | **chore** | N/A | Limited to build/tooling changes | Version updates, dependency changes |
390
+ **Requirements**:
75
391
 
76
- ### Cursor/Codex Execution Guard
392
+ 1. Contracts must not change
393
+ 2. Provide codemod script for automatic transformation
394
+ 3. Include semantic diff report
395
+ 4. Prove no behavior change with tests
396
+ 5. Update all imports automatically
77
397
 
78
- Add a commit policy hook to reject commit sets that introduce duplicate stems:
398
+ **Example workflow**:
79
399
 
80
400
  ```bash
81
- # .git/hooks/pre-commit (or CI script)
82
- PATTERN='/(copy|final|enhanced|v2)[.-]|/(new-)| - copy\.'
83
- git diff --cached --name-only | grep -E "$PATTERN" && {
84
- echo "❌ Disallowed filename pattern. Use in-place refactor or codemod."
85
- exit 1
401
+ # 1. Write codemod
402
+ vim codemod/rename-user-service.ts
403
+
404
+ # 2. Dry run
405
+ npx jscodeshift -d -t codemod/rename-user-service.ts src/
406
+
407
+ # 3. Apply
408
+ npx jscodeshift -t codemod/rename-user-service.ts src/
409
+
410
+ # 4. Verify tests still pass
411
+ npm test
412
+
413
+ # 5. Generate semantic diff
414
+ npm run semver-check
415
+ ```
416
+
417
+ ### Fix Mode (Urgent)
418
+
419
+ **When to use**: Fixing bugs
420
+
421
+ **Requirements**:
422
+
423
+ 1. Write failing test that reproduces bug
424
+ 2. Implement minimal fix
425
+ 3. Document root cause in PR
426
+ 4. Avoid new files - prefer in-place edits
427
+
428
+ **Example workflow**:
429
+
430
+ ```bash
431
+ # 1. Write failing test
432
+ vim tests/unit/user-service.test.ts
433
+ npm test # Should fail
434
+
435
+ # 2. Fix
436
+ vim src/services/user-service.ts
437
+ npm test # Should pass
438
+
439
+ # 3. Document
440
+ vim .caws/working-spec.yaml # Add root cause note
441
+ ```
442
+
443
+ ---
444
+
445
+ ## Common Patterns & Best Practices
446
+
447
+ ### Pattern: Deterministic Testing
448
+
449
+ **Problem**: Tests that use `Date.now()`, `Math.random()`, or `crypto.randomUUID()` are non-deterministic.
450
+
451
+ **Solution**: Inject time/random/UUID generators.
452
+
453
+ ```typescript
454
+ // ❌ Bad - Non-deterministic
455
+ class OrderService {
456
+ createOrder(items) {
457
+ return {
458
+ id: crypto.randomUUID(),
459
+ timestamp: Date.now(),
460
+ items,
461
+ };
462
+ }
463
+ }
464
+
465
+ // ✅ Good - Deterministic
466
+ class OrderService {
467
+ constructor(
468
+ private clock: Clock,
469
+ private idGenerator: IdGenerator
470
+ ) {}
471
+
472
+ createOrder(items) {
473
+ return {
474
+ id: this.idGenerator.generate(),
475
+ timestamp: this.clock.now(),
476
+ items,
477
+ };
478
+ }
86
479
  }
480
+
481
+ // Test with injected dependencies
482
+ test('createOrder generates valid order', () => {
483
+ const clock = new FixedClock('2025-01-01T00:00:00Z');
484
+ const idGen = new SequentialIdGenerator();
485
+ const service = new OrderService(clock, idGen);
486
+
487
+ const order = service.createOrder([item1, item2]);
488
+
489
+ expect(order.id).toBe('00000001');
490
+ expect(order.timestamp).toBe('2025-01-01T00:00:00Z');
491
+ });
87
492
  ```
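The test above relies on `Clock`, `IdGenerator`, `FixedClock`, and `SequentialIdGenerator`, which this guide does not define. A minimal sketch of what they might look like follows, assuming timestamps are ISO-8601 strings and IDs are zero-padded counters (both inferred from the expected values in the test). `SystemClock` is an added, hypothetical production counterpart.

```typescript
// Abstractions the service depends on instead of calling Date or crypto directly.
interface Clock {
  now(): string; // ISO-8601 timestamp
}

interface IdGenerator {
  generate(): string;
}

// Test double: always returns the timestamp it was constructed with.
class FixedClock implements Clock {
  private readonly isoTimestamp: string;

  constructor(isoTimestamp: string) {
    this.isoTimestamp = isoTimestamp;
  }

  now(): string {
    return this.isoTimestamp;
  }
}

// Test double: returns "00000001", "00000002", ... in order.
class SequentialIdGenerator implements IdGenerator {
  private counter = 0;

  generate(): string {
    this.counter += 1;
    return String(this.counter).padStart(8, "0");
  }
}

// Production implementation: reads the real system time.
class SystemClock implements Clock {
  now(): string {
    return new Date().toISOString();
  }
}

const fixedClock = new FixedClock("2025-01-01T00:00:00Z");
const sequentialIds = new SequentialIdGenerator();
console.log(fixedClock.now(), sequentialIds.generate());
```

Because the service only sees the two interfaces, tests can pin time and IDs while production code wires in the real implementations.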
88
493
 
89
- #### 2.3 Verify (must pass locally and in CI)
90
-
91
- • **Static checks**: typecheck, lint (code + tests), import hygiene, dead-code scan, secret scan.
92
- • **Tests**:
93
- • **Unit**: fast, deterministic; cover branches and edge conditions; property-based where feasible.
94
- • **Contract**: consumer/provider; versioned and stored under apps/contracts/.
95
- • **Integration**: real DB or Testcontainers; seed data via factories; verify persistence, transactions, retries/timeouts.
96
- • **E2E smoke**: Playwright/Cypress; critical user paths only; semantic selectors; screenshot+trace on failure.
97
- • **Mutation testing**: minimum scores per tier; non-conformant builds fail.
98
- • **Non-functional checks**: axe rules; Lighthouse CI budgets or API latency budgets; SAST/dep scan clean.
99
- • **Flake policy**: tests that intermittently fail are quarantined within 24h with an open ticket; no retries as policy, only as temporary band-aid with expiry.
100
-
101
- #### 2.4 Document & Deliver
102
-
103
- • **PR bundle** (template below) with:
104
- • Working Spec YAML
105
- • Test Plan & Coverage/Mutation summary, Contract artifacts
106
- • Risk assessment, Rollback plan, Observability notes (dashboards/queries)
107
- • Changelog (semver impact), Migration notes
108
- • Traceability: PR title references ticket; commits follow conventional commits; each test cites the requirement ID in test name or annotation.
109
- • Explainability: agent includes a 10-line "rationale" and "known-limits" section.
110
-
111
- ## 2) Machine-Enforceable Implementation
112
-
113
- ### A) Executable Schemas & Validation
114
-
115
- #### Working Spec JSON Schema
116
-
117
- ```json
118
- {
119
- "$schema": "https://json-schema.org/draft/2020-12/schema",
120
- "title": "CAWS Working Spec",
121
- "type": "object",
122
- "required": [
123
- "id",
124
- "title",
125
- "risk_tier",
126
- "mode",
127
- "change_budget",
128
- "blast_radius",
129
- "operational_rollback_slo",
130
- "scope",
131
- "invariants",
132
- "acceptance",
133
- "non_functional",
134
- "contracts"
135
- ],
136
- "properties": {
137
- "id": { "type": "string", "pattern": "^[A-Z]+-\\d+$" },
138
- "title": { "type": "string", "minLength": 8 },
139
- "risk_tier": { "type": "integer", "enum": [1, 2, 3] },
140
- "mode": { "type": "string", "enum": ["refactor", "feature", "fix", "doc", "chore"] },
141
- "change_budget": {
142
- "type": "object",
143
- "properties": {
144
- "max_files": { "type": "integer", "minimum": 1 },
145
- "max_loc": { "type": "integer", "minimum": 1 }
146
- },
147
- "required": ["max_files", "max_loc"],
148
- "additionalProperties": false
149
- },
150
- "blast_radius": {
151
- "type": "object",
152
- "properties": {
153
- "modules": { "type": "array", "items": { "type": "string" } },
154
- "data_migration": { "type": "boolean" }
155
- },
156
- "required": ["modules", "data_migration"],
157
- "additionalProperties": false
158
- },
159
- "operational_rollback_slo": { "type": "string", "pattern": "^[0-9]+m$|^[0-9]+h$" },
160
- "threats": { "type": "array", "items": { "type": "string" } },
161
- "scope": {
162
- "type": "object",
163
- "required": ["in", "out"],
164
- "properties": {
165
- "in": { "type": "array", "items": { "type": "string" }, "minItems": 1 },
166
- "out": { "type": "array", "items": { "type": "string" } }
167
- }
168
- },
169
- "invariants": { "type": "array", "items": { "type": "string" }, "minItems": 1 },
170
- "acceptance": {
171
- "type": "array",
172
- "minItems": 1,
173
- "items": {
174
- "type": "object",
175
- "required": ["id", "given", "when", "then"],
176
- "properties": {
177
- "id": { "type": "string", "pattern": "^A\\d+$" },
178
- "given": { "type": "string" },
179
- "when": { "type": "string" },
180
- "then": { "type": "string" }
181
- }
182
- }
183
- },
184
- "non_functional": {
185
- "type": "object",
186
- "properties": {
187
- "a11y": { "type": "array", "items": { "type": "string" } },
188
- "perf": {
189
- "type": "object",
190
- "properties": {
191
- "api_p95_ms": { "type": "integer", "minimum": 1 },
192
- "lcp_ms": { "type": "integer", "minimum": 1 }
193
- },
194
- "additionalProperties": false
195
- },
196
- "security": { "type": "array", "items": { "type": "string" } }
197
- },
198
- "additionalProperties": false
199
- },
200
- "contracts": {
201
- "type": "array",
202
- "minItems": 1,
203
- "items": {
204
- "type": "object",
205
- "required": ["type", "path"],
206
- "properties": {
207
- "type": { "type": "string", "enum": ["openapi", "graphql", "proto", "pact"] },
208
- "path": { "type": "string" }
494
+ ### Pattern: Guard Clauses for Safety
495
+
496
+ **Problem**: Deep nesting makes code hard to read and error-prone.
497
+
498
+ **Solution**: Use guard clauses and early returns.
499
+
500
+ ```typescript
501
+ // ❌ Bad - Deep nesting
502
+ function processOrder(order) {
503
+ if (order) {
504
+ if (order.items && order.items.length > 0) {
505
+ if (order.user) {
506
+ if (order.user.active) {
507
+ // Process order
508
+ return calculateTotal(order.items);
509
+ } else {
510
+ throw new Error('User not active');
209
511
  }
512
+ } else {
513
+ throw new Error('No user');
210
514
  }
211
- },
212
- "observability": {
213
- "type": "object",
214
- "properties": {
215
- "logs": { "type": "array", "items": { "type": "string" } },
216
- "metrics": { "type": "array", "items": { "type": "string" } },
217
- "traces": { "type": "array", "items": { "type": "string" } }
218
- }
219
- },
220
- "migrations": { "type": "array", "items": { "type": "string" } },
221
- "rollback": { "type": "array", "items": { "type": "string" } }
222
- },
223
- "additionalProperties": false
515
+ } else {
516
+ throw new Error('No items');
517
+ }
518
+ } else {
519
+ throw new Error('No order');
520
+ }
224
521
  }
225
- ```
226
522
 
227
- #### Provenance Manifest Schema
228
-
229
- ```json
230
- {
231
- "$schema": "https://json-schema.org/draft/2020-12/schema",
232
- "type": "object",
233
- "required": [
234
- "agent",
235
- "model",
236
- "model_hash",
237
- "tool_allowlist",
238
- "commit",
239
- "artifacts",
240
- "results",
241
- "approvals",
242
- "sbom",
243
- "attestation"
244
- ],
245
- "properties": {
246
- "agent": { "type": "string" },
247
- "model": { "type": "string" },
248
- "model_hash": { "type": "string" },
249
- "tool_allowlist": { "type": "array", "items": { "type": "string" } },
250
- "prompts": { "type": "array", "items": { "type": "string" } },
251
- "commit": { "type": "string" },
252
- "artifacts": { "type": "array", "items": { "type": "string" } },
253
- "results": {
254
- "type": "object",
255
- "properties": {
256
- "coverage_branch": { "type": "number" },
257
- "mutation_score": { "type": "number" },
258
- "tests_passed": { "type": "integer" },
259
- "contracts": {
260
- "type": "object",
261
- "properties": { "consumer": { "type": "boolean" }, "provider": { "type": "boolean" } }
262
- },
263
- "a11y": { "type": "string" },
264
- "perf": { "type": "object" }
265
- },
266
- "additionalProperties": true
267
- },
268
- "approvals": { "type": "array", "items": { "type": "string" } },
269
- "sbom": { "type": "string" },
270
- "attestation": { "type": "string" }
523
+ // ✅ Good - Guard clauses
524
+ function processOrder(order) {
525
+ if (!order) {
526
+ throw new Error('No order');
527
+ }
528
+
529
+ if (!order.items || order.items.length === 0) {
530
+ throw new Error('No items');
271
531
  }
272
- }
273
- ```
274
532
 
275
- #### Tier Policy Configuration
276
-
277
- ```json
278
- {
279
- "1": {
280
- "min_branch": 0.9,
281
- "min_mutation": 0.7,
282
- "requires_contracts": true,
283
- "requires_manual_review": true,
284
- "max_files": 40,
285
- "max_loc": 1500,
286
- "allowed_modes": ["feature", "refactor", "fix"]
287
- },
288
- "2": {
289
- "min_branch": 0.8,
290
- "min_mutation": 0.5,
291
- "requires_contracts": true,
292
- "max_files": 25,
293
- "max_loc": 1000,
294
- "allowed_modes": ["feature", "refactor", "fix"]
295
- },
296
- "3": {
297
- "min_branch": 0.7,
298
- "min_mutation": 0.3,
299
- "requires_contracts": false,
300
- "max_files": 15,
301
- "max_loc": 600,
302
- "allowed_modes": ["feature", "refactor", "fix", "doc", "chore"]
533
+ if (!order.user) {
534
+ throw new Error('No user');
303
535
  }
536
+
537
+ if (!order.user.active) {
538
+ throw new Error('User not active');
539
+ }
540
+
541
+ // Now safe to process
542
+ return calculateTotal(order.items);
304
543
  }
305
544
  ```
306
545
 
307
- ### B) CI/CD Quality Gates (Automated)
546
+ ### Pattern: Contract-First Development
547
+
548
+ **Problem**: API changes break consumers unexpectedly.
549
+
550
+ **Solution**: Define contracts first, generate types, test before implementing.
551
+
552
+ ```bash
553
+ # 1. Define OpenAPI contract
554
+ cat > docs/api/users.yaml << EOF
555
+ openapi: 3.0.0
556
+ paths:
557
+   /users:
558
+     get:
559
+       responses:
560
+         200:
561
+           content:
562
+             application/json:
563
+               schema:
564
+                 type: array
565
+                 items:
566
+                   \$ref: '#/components/schemas/User'
567
+ components:
568
+   schemas:
569
+     User:
570
+       type: object
571
+       required: [id, email, name]
572
+       properties:
573
+         id: { type: string }
574
+         email: { type: string }
575
+         name: { type: string }
576
+ EOF
577
+
578
+ # 2. Generate TypeScript types
579
+ npx openapi-typescript docs/api/users.yaml -o src/types/api.ts
580
+
581
+ # 3. Write contract test
582
+ cat > tests/contract/users.test.ts << EOF
583
+ import { validateAgainstSchema } from './helpers/validate-schema'; // hypothetical local helper; @pact-foundation/pact does not export a function with this name
584
+
585
+ test('GET /users returns valid user array', async () => {
586
+ const response = await fetch('/api/users');
587
+ const data = await response.json();
588
+
589
+ await validateAgainstSchema(data, 'docs/api/users.yaml', '/users');
590
+ });
591
+ EOF
592
+
593
+ # 4. Implement
594
+ # src/api/users.ts now has type safety and contract validation
595
+ ```
596
+
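As a library-free illustration of what a contract test asserts, the sketch below checks a parsed response against the `User` schema from the contract above: an array of objects with string `id`, `email`, and `name`. The type guards `isUser` and `isUserArray` are hypothetical names; a real project would more likely derive this check from the OpenAPI file with a schema validator than write it by hand.

```typescript
// Mirrors components.schemas.User from the contract above.
interface User {
  id: string;
  email: string;
  name: string;
}

// Type guard: true when the value has all required User fields as strings.
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["id"] === "string" &&
    typeof record["email"] === "string" &&
    typeof record["name"] === "string"
  );
}

// Type guard for the GET /users response body: an array of User objects.
function isUserArray(value: unknown): value is User[] {
  return Array.isArray(value) && value.every((item) => isUser(item));
}

// Simulated response bodies, as they would come out of JSON.parse.
const validBody: unknown = JSON.parse('[{"id":"u1","email":"a@example.com","name":"Ada"}]');
const invalidBody: unknown = JSON.parse('[{"id":"u2","email":"b@example.com"}]');
console.log(isUserArray(validBody), isUserArray(invalidBody));
```

Typing the parsed body as `unknown` forces the check to run before the data is used as `User[]`, which is the same guarantee a contract test gives at the API boundary.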
597
+ ---
598
+
599
+ ## Troubleshooting Common Issues
600
+
601
+ ### Validation Errors
602
+
603
+ #### Error: `risk_tier is required`
308
604
 
309
- #### Complete GitHub Actions Pipeline
605
+ **Cause**: Working spec missing risk tier.
606
+
607
+ **Fix**:
310
608
 
311
609
  ```yaml
312
- name: CAWS Quality Gates
313
- on:
314
- pull_request:
315
- types: [opened, synchronize, reopened, ready_for_review]
316
-
317
- jobs:
318
- naming_guard:
319
- runs-on: ubuntu-latest
320
- steps:
321
- - uses: actions/checkout@v4
322
- - name: Block shadow file patterns
323
- run: |
324
- BAD=$(git diff --name-only origin/${{ github.base_ref }}... | \
325
- grep -E '/(copy|final|enhanced|v2)[.-]|/(new-)|(^|/)_.+\.| - copy\.' || true)
326
- if [ -n "$BAD" ]; then
327
- echo "❌ Shadow/duplicate filename patterns detected:"
328
- echo "$BAD"
329
- exit 1
330
- fi
331
-
332
- scope_guard:
333
- runs-on: ubuntu-latest
334
- steps:
335
- - uses: actions/checkout@v4
336
- - name: Ensure changes are within scope.in
337
- run: |
338
- yq -o=json '.caws/working-spec.yaml' > .caws/ws.json
339
- jq -r '.scope.in[]' .caws/ws.json | sed 's|^|^|; s|$|/|' > .caws/paths.txt
340
- CHANGED=$(git diff --name-only origin/${{ github.base_ref }}...)
341
- OUT=""
342
- for f in $CHANGED; do
343
- if ! grep -q -E -f .caws/paths.txt <<< "$f"; then OUT="$OUT\n$f"; fi
344
- done
345
- if [ -n "$OUT" ]; then
346
- echo -e "❌ Files outside scope.in:\n$OUT"
347
- echo "If intentional, add a Spec Delta to .caws/working-spec.yaml and include affected paths."
348
- exit 1
349
- fi
350
-
351
- budget_guard:
352
- runs-on: ubuntu-latest
353
- steps:
354
- - uses: actions/checkout@v4
355
- - name: Enforce max files/LOC from change_budget
356
- run: |
357
- yq -o=json '.caws/working-spec.yaml' > .caws/ws.json
358
- MAXF=$(jq -r '.change_budget.max_files' .caws/ws.json)
359
- MAXL=$(jq -r '.change_budget.max_loc' .caws/ws.json)
360
- FILES=$(git diff --name-only origin/${{ github.base_ref }}... | wc -l)
361
- LOC=$(git diff --unified=0 origin/${{ github.base_ref }}... | grep -E '^\+|^-' | wc -l)
362
- echo "Files:$FILES LOC:$LOC (budget Files:$MAXF LOC:$MAXL)"
363
- [ "$FILES" -le "$MAXF" ] && [ "$LOC" -le "$MAXL" ] || (echo "❌ Budget exceeded"; exit 1)
364
-
365
- setup:
366
- runs-on: ubuntu-latest
367
- outputs:
368
- risk: ${{ steps.risk.outputs.tier }}
369
- steps:
370
- - uses: actions/checkout@v4
371
- - uses: actions/setup-node@v4
372
- with: { node-version: '20' }
373
- - run: npm ci
374
- - name: Parse Working Spec
375
- id: risk
376
- run: |
377
- pipx install yq
378
- yq -o=json '.caws/working-spec.yaml' > .caws/working-spec.json
379
- echo "tier=$(jq -r .risk_tier .caws/working-spec.json)" >> $GITHUB_OUTPUT
380
- - name: Validate Spec
381
- run: node apps/tools/caws/validate.js .caws/working-spec.json
382
-
383
- static:
384
- needs: setup
385
- runs-on: ubuntu-latest
386
- steps:
387
- - uses: actions/checkout@v4
388
- - uses: actions/setup-node@v4
389
- with: { node-version: '20' }
390
- - run: npm ci
391
- - run: npm run typecheck && npm run lint && npm run dep:policy && npm run sast && npm run secret:scan
392
-
393
- unit:
394
- needs: setup
395
- runs-on: ubuntu-latest
396
- steps:
397
- - uses: actions/checkout@v4
398
- - uses: actions/setup-node@v4
399
- with: { node-version: '20' }
400
- - run: npm ci
401
- - run: npm run test:unit -- --coverage
402
- - name: Enforce Branch Coverage
403
- run: node apps/tools/caws/gates.js coverage --tier ${{ needs.setup.outputs.risk }}
404
-
405
- mutation:
406
- needs: unit
407
- runs-on: ubuntu-latest
408
- steps:
409
- - uses: actions/checkout@v4
410
- - uses: actions/setup-node@v4
411
- with: { node-version: '20' }
412
- - run: npm ci
413
- - run: npm run test:mutation
414
- - run: node apps/tools/caws/gates.js mutation --tier ${{ needs.setup.outputs.risk }}
415
-
416
- contracts:
417
- needs: setup
418
- runs-on: ubuntu-latest
419
- steps:
420
- - uses: actions/checkout@v4
421
- - uses: actions/setup-node@v4
422
- with: { node-version: '20' }
423
- - run: npm ci
424
- - run: npm run test:contract
425
- - run: node apps/tools/caws/gates.js contracts --tier ${{ needs.setup.outputs.risk }}
426
-
427
- integration:
428
- needs: [setup]
429
- runs-on: ubuntu-latest
430
- services:
431
- postgres: { image: postgres:16, env: { POSTGRES_PASSWORD: pass }, ports: ["5432:5432"], options: >-
432
- --health-cmd="pg_isready -U postgres" --health-interval=10s --health-timeout=5s --health-retries=5 }
433
- steps:
434
- - uses: actions/checkout@v4
435
- - uses: actions/setup-node@v4
436
- with: { node-version: '20' }
437
- - run: npm ci
438
- - run: npm run test:integration
439
-
440
- e2e_a11y:
441
- needs: [integration]
442
- runs-on: ubuntu-latest
443
- steps:
444
- - uses: actions/checkout@v4
445
- - uses: actions/setup-node@v4
446
- with: { node-version: '20' }
447
- - run: npm ci
448
- - run: npm run test:e2e:smoke
449
- - run: npm run test:axe
610
+ # Add to .caws/working-spec.yaml
611
+ risk_tier: 2 # Choose 1, 2, or 3 based on impact
612
+ ```
450
613
 
451
- perf:
452
- if: needs.setup.outputs.risk != '3'
453
- needs: [integration]
454
- runs-on: ubuntu-latest
455
- steps:
456
- - uses: actions/checkout@v4
457
- - uses: actions/setup-node@v4
458
- with: { node-version: '20' }
459
- - run: npm ci
460
- - run: npm run perf:budgets
461
-
462
- provenance_trust:
463
- needs: [naming_guard, scope_guard, budget_guard, static, unit, mutation, contracts, integration, e2e_a11y, perf]
464
- runs-on: ubuntu-latest
465
- steps:
466
- - uses: actions/checkout@v4
467
- - uses: actions/setup-node@v4
468
- with: { node-version: '20' }
469
- - run: npm ci
470
- - name: Generate SBOM
471
- run: npx @cyclonedx/cyclonedx-npm --output-file .agent/sbom.json
472
- - name: Create Attestation
473
- run: node apps/tools/caws/attest.js > .agent/attestation.json
474
- - name: Prompt/Tool lint
475
- run: node apps/tools/caws/prompt-lint.js .agent/prompts/*.md --allowlist .agent/tools-allow.json
476
- - name: Generate Provenance
477
- run: node apps/tools/caws/provenance.js > .agent/provenance.json
478
- - name: Validate Provenance
479
- run: node apps/tools/caws/validate-prov.js .agent/provenance.json
480
- - name: Compute Trust Score
481
- run: node apps/tools/caws/gates.js trust --tier ${{ needs.setup.outputs.risk }}
614
+ #### Error: `Invalid ID format`
615
+
616
+ **Cause**: ID doesn't match `PREFIX-NUMBER` pattern.
617
+
618
+ **Fix**:
619
+
620
+ ```yaml
621
+ # Bad
622
+ id: feature-001
623
+ id: FEAT001
624
+ id: feat_001
625
+
626
+ # Good
627
+ id: FEAT-001
628
+ id: FIX-042
629
+ id: REFACTOR-003
482
630
  ```
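A minimal sketch of the `PREFIX-NUMBER` check, assuming the prefix is uppercase letters and the number is one or more digits. The regular expression and the `isValidSpecId` helper are illustrative only; the pattern the CAWS validator actually enforces may differ.

```typescript
// Assumed pattern: uppercase letters, a hyphen, then digits.
// This is a guess at the rule, not the validator's own regex.
const ID_PATTERN = /^[A-Z]+-\d+$/;

// Hypothetical helper: true when the id matches the assumed pattern.
function isValidSpecId(id: string): boolean {
  return ID_PATTERN.test(id);
}

const samples = ['feature-001', 'FEAT001', 'feat_001', 'FEAT-001', 'FIX-042', 'REFACTOR-003'];
for (const id of samples) {
  console.log(`${id}: ${isValidSpecId(id) ? 'valid' : 'invalid'}`);
}
```

Under this assumed pattern, the three "Bad" ids are rejected and the three "Good" ids are accepted.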
483
631
 
484
- ### C) Repository Scaffold
632
+ #### Error: `scope.in is required`
633
+
634
+ **Cause**: Missing scope definition.
635
+
636
+ **Fix**:
485
637
 
638
+ ```yaml
639
+ scope:
640
+ in: ['src/features/auth/', 'tests/auth/']
641
+ out: ['node_modules/', 'dist/']
486
642
  ```
487
- .caws/
488
- policy/tier-policy.json
489
- schemas/{working-spec.schema.json, provenance.schema.json}
490
- templates/{pr.md, feature.plan.md, test-plan.md}
491
- apps/contracts/ # OpenAPI/GraphQL/Pact
492
- docs/ # human docs; ADRs
493
- src/
494
- tests/
495
- unit/
496
- contract/
497
- integration/
498
- e2e/
499
- axe/
500
- mutation/
501
- apps/tools/caws/
502
- validate.ts
503
- gates.ts # thresholds, trust score
504
- provenance.ts
505
- prompt-lint.js # prompt hygiene & tool allowlist
506
- attest.js # SBOM + SLSA attestation generator
507
- tools-allow.json # allowed tools for agents
508
- codemod/ # AST transformation scripts for refactor mode
509
- rename.ts # example codemod for renaming modules
510
- .agent/ # provenance artifacts (generated)
511
- sbom.json
512
- attestation.json
513
- provenance.json
514
- tools-allow.json
515
- .github/
516
- workflows/caws.yml
517
- CODEOWNERS
643
+
644
+ ### Scope Violations
645
+
646
+ #### Error: `File outside scope: src/unrelated.ts`
647
+
648
+ **Cause**: PR touches files not listed in `scope.in`.
649
+
650
+ **Fix Option 1 - Update scope**:
651
+
652
+ ```yaml
653
+ scope:
654
+ in:
655
+ - 'src/features/auth/'
656
+ - 'src/unrelated.ts' # Add file to scope
518
657
  ```
519
658
 
520
- ## 3) Templates & Examples
659
+ **Fix Option 2 - Split PR**:
660
+ Split changes into separate PRs with different scopes.
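The scope check can be sketched as a prefix match against `scope.in`. `filesOutsideScope` is a hypothetical helper, and treating entries that end in `/` as directory prefixes (and all others as exact file paths) is an assumption rather than the documented CAWS matching rule.

```typescript
// Hypothetical helper: returns the changed files that no scope.in entry covers.
// Assumption: entries ending in '/' are directory prefixes; others are exact paths.
function filesOutsideScope(changed: string[], scopeIn: string[]): string[] {
  return changed.filter(
    (file) =>
      !scopeIn.some((entry) => (entry.endsWith('/') ? file.startsWith(entry) : file === entry)),
  );
}

const scopeIn = ['src/features/auth/', 'tests/auth/'];
const changed = ['src/features/auth/login.ts', 'tests/auth/login.test.ts', 'src/unrelated.ts'];

// Only src/unrelated.ts falls outside the declared scope.
console.log(filesOutsideScope(changed, scopeIn));
```

Running a check like this locally before opening the PR makes it easier to decide between the two fix options early.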
661
+
662
+ ### Budget Exceeded
663
+
664
+ #### Error: `35 files changed, exceeds budget of 25`
665
+
666
+ **Cause**: The change touches more files than `change_budget.max_files` allows.
667
+
668
+ **Fix Option 1 - Split PR**:
669
+ Break into smaller, focused PRs.
521
670
 
522
- ### Working Spec YAML Template
671
+ **Fix Option 2 - Increase budget**:
523
672
 
524
673
  ```yaml
525
- id: { { PROJECT_ID } }
526
- title: '{{PROJECT_TITLE}}'
527
- risk_tier: { { PROJECT_TIER } }
528
- mode: { { PROJECT_MODE } }
529
674
  change_budget:
530
- max_files: { { MAX_FILES } }
531
- max_loc: { { MAX_LOC } }
532
- blast_radius:
533
- modules: [{ { BLAST_MODULES } }]
534
- data_migration: { { DATA_MIGRATION } }
535
- operational_rollback_slo: '{{ROLLBACK_SLO}}'
536
- threats: { { PROJECT_THREATS } }
537
- scope:
538
- in: [{ { SCOPE_IN } }]
539
- out: [{ { SCOPE_OUT } }]
540
- invariants: { { PROJECT_INVARIANTS } }
541
- acceptance: { { ACCEPTANCE_CRITERIA } }
542
- non_functional:
543
- a11y: [{ { A11Y_REQUIREMENTS } }]
544
- perf: { api_p95_ms: { { PERF_BUDGET } } }
545
- security: [{ { SECURITY_REQUIREMENTS } }]
546
- contracts:
547
- - type: { { CONTRACT_TYPE } }
548
- path: '{{CONTRACT_PATH}}'
549
- observability:
550
- logs: [{ { OBSERVABILITY_LOGS } }]
551
- metrics: [{ { OBSERVABILITY_METRICS } }]
552
- traces: [{ { OBSERVABILITY_TRACES } }]
553
- migrations: { { MIGRATION_PLAN } }
554
- rollback: [{ { ROLLBACK_PLAN } }]
675
+ max_files: 40 # Only if justified
676
+ max_loc: 1200
555
677
  ```
556
678
 
557
- ### PR Description Template
679
+ **Note**: Prefer splitting over increasing budget.
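A sketch of a budget check, under the assumption that it compares a diff's file and line counts with `change_budget`. The `budgetViolations` helper and the `DiffStats` shape are hypothetical; the message wording mirrors the error shown above.

```typescript
// Mirrors the change_budget fields from the working spec.
interface ChangeBudget {
  max_files: number;
  max_loc: number;
}

// Hypothetical summary of a diff: files touched and lines added or removed.
interface DiffStats {
  files: number;
  loc: number;
}

// Returns one message per exceeded limit; an empty array means within budget.
function budgetViolations(stats: DiffStats, budget: ChangeBudget): string[] {
  const violations: string[] = [];
  if (stats.files > budget.max_files) {
    violations.push(`${stats.files} files changed, exceeds budget of ${budget.max_files}`);
  }
  if (stats.loc > budget.max_loc) {
    violations.push(`${stats.loc} lines changed, exceeds budget of ${budget.max_loc}`);
  }
  return violations;
}

console.log(budgetViolations({ files: 35, loc: 900 }, { max_files: 25, max_loc: 1000 }));
```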
558
680
 
559
- ```markdown
560
- ## Summary
681
+ ### Test Coverage Failures
561
682
 
562
- {{PR_SUMMARY}}
683
+ #### Error: `Branch coverage 75% below tier 2 requirement of 80%`
563
684
 
564
- ## Working Spec
685
+ **Cause**: Insufficient test coverage.
565
686
 
566
- - Risk Tier: {{RISK_TIER}}
567
- - Mode: {{PR_MODE}}
568
- - Invariants: {{INVARIANTS}}
687
+ **Fix**:
569
688
 
570
- ## Tests
689
+ 1. Run coverage report: `npm run test:coverage`
690
+ 2. Identify untested branches in the HTML report
691
+ 3. Add tests for uncovered paths
692
+ 4. Re-run: `npm run test:coverage`
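How a tier gate might produce the coverage message above, as a rough sketch. The `checkGate` helper and its signature are hypothetical. The required percentage is passed in rather than looked up, because this section only states the tier 2 values (80% branch coverage, 50% mutation score).

```typescript
// Hypothetical gate: returns an error message when the metric is below the
// required percentage, or null when the gate passes.
function checkGate(
  metric: string,
  actualPct: number,
  tier: number,
  requiredPct: number,
): string | null {
  return actualPct < requiredPct
    ? `${metric} ${actualPct}% below tier ${tier} requirement of ${requiredPct}%`
    : null;
}

console.log(checkGate('Branch coverage', 75, 2, 80)); // failing case from this section
console.log(checkGate('Branch coverage', 85, 2, 80)); // passing case returns null
```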
693
+
694
+ #### Error: `Mutation score 45% below tier 2 requirement of 50%`
571
695
 
572
- - Unit: {{UNIT_COVERAGE}}% (target {{TARGET_COVERAGE}}%)
573
- - Mutation: {{MUTATION_SCORE}}% (target {{TARGET_MUTATION}}%)
574
- - Integration: {{INTEGRATION_TESTS}} flows
575
- - E2E smoke: {{E2E_TESTS}} ({{E2E_STATUS}})
576
- - A11y: {{A11Y_SCORE}} ({{A11Y_STATUS}})
696
+ **Cause**: Tests run the code but don't assert enough to detect injected faults, so mutants survive.
577
697
 
578
- ## Non-functional
698
+ **Fix**:
579
699
 
580
- - API p95: {{API_PERF}}ms (budget {{API_BUDGET}}ms)
581
- - Security: {{SAST_STATUS}}
700
+ 1. Run mutation report: `npm run test:mutation`
701
+ 2. Review surviving mutants
702
+ 3. Add assertions that would catch those mutations
703
+ 4. Re-run: `npm run test:mutation`
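To illustrate step 3, here is a self-contained sketch of a surviving mutant and the assertion that kills it. The `isAdult` function, the hand-written mutant, and both "suites" are invented for this example; a real mutation tool generates mutants automatically and runs your actual test suite against them.

```typescript
// Code under test (hypothetical example).
const isAdult = (age: number): boolean => age >= 18;

// A typical mutant: the boundary operator >= is replaced with >.
const isAdultMutant = (age: number): boolean => age > 18;

// A "suite" here is a function that returns true when all its checks pass.
type Suite = (fn: (age: number) => boolean) => boolean;

// Weak suite: never probes the boundary value, so the mutant passes too.
const weakSuite: Suite = (fn) => fn(30) === true && fn(5) === false;

// Strong suite: adds an assertion at the boundary (age 18).
const strongSuite: Suite = (fn) => weakSuite(fn) && fn(18) === true;

// A mutant is "killed" when the suite fails against it.
console.log('weak suite kills mutant:', !weakSuite(isAdultMutant));
console.log('strong suite kills mutant:', !strongSuite(isAdultMutant));
```

The weak suite passes against both versions, so the mutant survives; adding the boundary assertion makes the suite fail on the mutant while still passing on the original.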
582
704
 
583
- ## Migration & Rollback
705
+ ---
706
+
707
+ ## Provenance & AI Tracking
708
+
709
+ CAWS automatically tracks all AI-assisted changes for transparency and quality analysis.
710
+
711
+ ### Automatic Tracking via Git Hooks
712
+
713
+ When you commit, hooks automatically:
584
714
 
585
- {{MIGRATION_NOTES}}
715
+ 1. Detect whether the change was AI-assisted (Cursor Composer, Chat, Tab)
716
+ 2. Extract quality metrics (coverage, mutation score)
717
+ 3. Link commits to working spec
718
+ 4. Update provenance journal
586
719
 
587
- ## Known Limits
720
+ ### Viewing Provenance
588
721
 
589
- {{KNOWN_LIMITS}}
722
+ ```bash
723
+ # Dashboard view with insights
724
+ caws provenance show --format=dashboard
725
+
726
+ # JSON output for tooling
727
+ caws provenance show --format=json
728
+
729
+ # Analyze AI effectiveness
730
+ caws provenance analyze-ai
590
731
  ```
591
732
 
592
- ## 4) Agent Conduct Rules (Hard Constraints)
733
+ ### Dashboard Insights
593
734
 
594
- 1. **Spec adherence**: Do not implement beyond scope.in; if discovered dependency changes spec, open "Spec delta" in PR and update tests first.
595
- 2. **No hidden state/time/net**: All non-determinism injected and controlled in tests.
596
- 3. **Explainable mocks**: Only mock boundaries; never mock the function under test; document any mock behavior in comments.
597
- 4. **Idempotency & error paths**: Provide tests for retries/timeouts/cancel; assert invariants on error.
598
- 5. **Observability parity**: Every key acceptance path emits logs/metrics/traces; tests assert on them when feasible (e.g., fake exporter assertions).
599
- 6. **Data safety**: No real PII in fixtures; factories generate realistic but synthetic data.
600
- 7. **Accessibility required**: For UI changes: keyboard path test + axe scan; for API: error messages human-readable and localizable.
601
- 8. **Performance ownership**: Include micro-bench (where hot path) or budget check; document algorithmic complexity if changed.
602
- 9. **Docs as code**: Update README/usage snippets; add example code; regenerate typed clients from contracts.
603
- 10. **Rollback ready**: Feature-flag new behavior; write a reversible migration or provide kill-switch.
735
+ The provenance dashboard shows:
604
736
 
605
- ## 5) Trust & Telemetry
737
+ - Total commits and AI-assisted percentage
738
+ - Quality score trends over time
739
+ - AI contribution breakdown (Composer vs Tab completions)
740
+ - Acceptance rate for AI-assisted changes
741
+ - Recent activity timeline
742
+ - Smart recommendations for improvement
606
743
 
607
- • **Provenance manifest** (.agent/provenance.json): agent name/version, prompts, model, commit SHAs, test results hashes, generated files list, and human approvals. Stored with the PR for auditability.
608
- • **Trust score per PR**: composite of rubric + gates + historical flake rate; expose in a PR check and weekly dashboard.
609
- • **Drift watch**: monitor contract usage in prod; alert if undocumented fields appear.
744
+ ---
610
745
 
611
- ## 6) Operational Excellence
746
+ ## Integration with Cursor IDE
612
747
 
613
- ### Flake Management
748
+ CAWS provides deep Cursor IDE integration via hooks and rules.
614
749
 
615
- **Detector**: compute week-over-week pass variance per spec ID.
616
- • **Policy**: >0.5% variance → auto-label flake:quarantine, open ticket with owner + expiry (7 days).
617
- • **Implementation**: Store test run hashes in .agent/provenance.json; nightly job aggregates and posts a table to dashboard.
750
+ ### Cursor Rules (`.cursor/rules/`)
618
751
 
619
- ### Waivers & Escalation
752
+ CAWS includes modular MDC rule files:
620
753
 
621
- **Temporary waiver requires**:
622
- waivers.yml with: gate, reason, owner, expiry ISO date (≤ 14 days), compensating control.
623
- PR must link to ticket; trust score maximum capped at 79 with active waivers.
624
- **Escalation**: unresolved flake/waiver past expiry auto-blocks merges across the repo until cleared.
754
+ 1. **01-working-style.mdc** - Working style and risk limits
755
+ 2. **02-quality-gates.mdc** - Tests, linting, commit discipline
756
+ 3. **03-naming-and-refactor.mdc** - Naming conventions, anti-duplication
757
+ 4. **04-logging-language-style.mdc** - Logging clarity, emoji policy
758
+ 5. **05-safe-defaults-guards.mdc** - Defensive coding patterns
759
+ 6. **06-typescript-conventions.mdc** - TS/JS specific rules
760
+ 7. **07-process-ops.mdc** - Server and process management
761
+ 8. **08-solid-and-architecture.mdc** - SOLID principles
762
+ 9. **09-docstrings.mdc** - Cross-language documentation
763
+ 10. **10-authorship-and-attribution.mdc** - File attribution
625
764
 
626
- ### Security & Performance Checks
765
+ **These rules guide your behavior in Cursor automatically.**
627
766
 
628
- **Secrets**: run gitleaks/trufflehog on changed files; CAWS gate blocks any hit above low severity.
629
- • **SAST**: language-appropriate tools; gate requires zero criticals.
630
- • **Performance**: k6 scripts for API budgets; LHCI for web budgets; regressions fail gate.
631
- • **Migrations**: lint for reversibility; dry-run in CI; forward-compat contract tests.
767
+ ### Cursor Hooks (`.cursor/hooks/`)
632
768
 
633
- ## 7) Language & Tooling Ecosystem
769
+ Real-time quality enforcement:
634
770
 
635
- ### TypeScript Stack (Recommended)
771
+ - **validate-command** - Blocks dangerous commands (`rm -rf /`, force push)
772
+ - **validate-file-read** - Prevents reading secrets (`.env`, keys)
773
+ - **validate-file-write** - Enforces naming conventions
774
+ - **post-edit** - Auto-formats code after changes
636
775
 
637
- **Testing**: Jest/Vitest, fast-check, Playwright, Testcontainers, Stryker, MSW or Pact
638
- • **Quality**: ESLint + types, LHCI, axe-core
639
- • **CI**: GitHub Actions with Node 20
776
+ ### Disabling Temporarily
640
777
 
641
- ### Python Stack
778
+ ```bash
779
+ # If you need to bypass hooks temporarily
780
+ # Cursor Settings → Hooks → Disable
642
781
 
643
- **Testing**: pytest, hypothesis, Playwright (Python), Testcontainers-py, mutmut, Schemathesis
644
- **Quality**: bandit/semgrep, Lighthouse CI, axe-core
782
+ # Note: --no-verify is BANNED for git commits
783
+ # Fix the issue instead of bypassing hooks
784
+ ```
645
785
 
646
- ### JVM Stack
786
+ ---
647
787
 
648
- **Testing**: JUnit5, jqwik, Testcontainers, PIT (mutation), Pact-JVM
649
- • **Quality**: OWASP dependency check, SonarQube, axe-core
788
+ ## Project Templates
650
789
 
651
- **Note**: Mutation testing is non-negotiable for tiers ≥2; it's the only reliable guard against assertion theater.
790
+ CAWS includes templates for common project types.
652
791
 
653
- ## 8) Review Rubric (Scriptable Scoring)
792
+ ### VS Code Extension
654
793
 
655
- | Category | Weight | Criteria | 0 | 1 | 2 |
656
- | --------------------------------- | ------ | ----------------------------------- | ----------------- | ------------------ | --------------------------- |
657
- | Spec clarity & invariants | ×5 | Clear, testable invariants | Missing/unclear | Basic coverage | Comprehensive + edge cases |
658
- | Contract correctness & versioning | ×5 | Schema accuracy + versioning | Errors present | Minor issues | Perfect + versioned |
659
- | Unit thoroughness & edge coverage | ×5 | Branch coverage + property tests | <70% coverage | Meets tier minimum | >90% + properties |
660
- | Integration realism | ×4 | Real containers + seeds | Mocked heavily | Basic containers | Full stack + realistic data |
661
- | E2E relevance & stability | ×3 | Critical paths + semantic selectors | Brittle selectors | Basic coverage | Semantic + stable |
662
- | Mutation adequacy | ×4 | Score vs tier threshold | <50% | Meets minimum | >80% |
663
- | A11y pathways & results | ×3 | Keyboard + axe clean | Major issues | Basic compliance | Full WCAG + keyboard |
664
- | Perf/Resilience | ×3 | Budgets + timeouts/retries | No checks | Basic budgets | Full resilience |
665
- | Observability | ×3 | Logs/metrics/traces asserted | Missing | Basic emission | Asserted in tests |
666
- | Migration safety & rollback | ×3 | Reversible + kill-switch | No rollback | Basic revert | Full rollback + testing |
667
- | Docs & PR explainability | ×3 | Clear rationale + limits | Minimal | Basic docs | Comprehensive + ADR |
668
- | **Mode compliance** | ×3 | Changes match declared `mode` | Violations | Minor drift | Full compliance |
669
- | **Scope & budget discipline** | ×3 | Diff within `scope.in` & budget | Exceeded | Near limit | Within limits |
670
- | **Supply-chain attestations** | ×2 | SBOM + SLSA attestation | Missing | Partial | Complete & valid |
794
+ ```bash
795
+ caws init my-extension --template=extension
796
+ ```
671
797
 
672
- **Target**: ≥ 82/100 (weighted sum). Calculator in `apps/tools/caws/rubric.ts`.
798
+ **Optimized for:**
673
799
 
674
- ## 9) Anti-patterns (Explicitly Rejected)
800
+ - Risk tier: 2 (high user impact)
801
+ - Webview security (CSP enforcement)
802
+ - Activation performance (<1s)
803
+ - Budget: 25 files, 1000 lines
675
804
 
676
- **Over-mocked integration tests**: mocking ORM or HTTP client where containerized integration is feasible.
677
- • **UI tests keyed on CSS classes**: brittle selectors instead of semantic roles/labels.
678
- • **Coupling tests to implementation details**: private method calls, internal sequence assertions.
679
- • **"Retry until green" CI culture**: quarantines without expiry or owner.
680
- • **100% coverage mandates**: without mutation testing or risk awareness.
805
+ ### React Library
681
806
 
682
- ## 13) Failure-Mode Cards (Common Traps & Recovery)
807
+ ```bash
808
+ caws init my-lib --template=library
809
+ ```
683
810
 
684
- Add a small section of "If you see X, do Y":
811
+ **Optimized for:**
685
812
 
686
- 1. **Symptom:** Large rename + re-exports create `*-copy.ts` or `enhanced-*.ts`.
687
- **Action:** Switch to **refactor mode**. Generate `codemod/rename.ts` that updates imports/exports in place. Validate with `tsc --noEmit` and run mutation tests to ensure unchanged behavior.
813
+ - Risk tier: 2 (API stability)
814
+ - Bundle size limits
815
+ - Tree-shakeable exports
816
+ - Budget: 20 files, 800 lines
688
817
 
689
- 2. **Symptom:** Contract change proliferates across services.
690
- **Action:** Declare **blast_radius.modules**; create consumer **Pact** tests first. Stage changes behind a feature flag; ship provider compatibility for both old/new fields.
818
+ ### API Service
691
819
 
692
- 3. **Symptom:** Flaky time-based tests.
693
- **Action:** Inject `Clock` and use fixed timestamps; assert **idempotency** with property tests.
820
+ ```bash
821
+ caws init my-api --template=api
822
+ ```
694
823
 
695
- 4. **Symptom:** Agent proposes new external tool/library.
696
- **Action:** Fail unless added to `tool_allowlist`. Require SBOM delta review and perf/a11y/security notes in the PR.
824
+ **Optimized for:**
697
825
 
698
- ## 10) Cursor/Codex Agent Integration
826
+ - Risk tier: 1 (data integrity)
827
+ - Backward compatibility
828
+ - Performance budgets
829
+ - Budget: 40 files, 1500 lines
699
830
 
700
- ### Agent Commands
831
+ ### CLI Tool
701
832
 
702
- • `agent plan` → emits plan + test matrix
703
- `agent verify` → runs local gates; generates provenance
704
- • `agent prove` → creates provenance manifest
705
- • `agent doc` → updates README/changelog from spec
833
+ ```bash
834
+ caws init my-cli --template=cli
835
+ ```
706
836
 
707
- ### Guardrails
837
+ **Optimized for:**
708
838
 
709
- **Templates**: Inject Working Spec YAML + PR template on "New Feature" command
710
- • **Scaffold**: Pre-wire tests/\* skeletons with containers and contracts
711
- • **Context discipline**: Restrict writes to spec-touched modules; deny outside scope unless spec updated
712
- **Feedback loop**: PR comments show coverage, mutation diff, contract verification summary
839
+ - Risk tier: 3 (low risk)
840
+ - Error handling
841
+ - Help text and UX
842
+ - Budget: 15 files, 600 lines
713
843
 
714
- ## 11) Adoption Roadmap
844
+ ---
715
845
 
716
- ### Foundation Setup
846
+ ## Advanced Topics
717
847
 
718
- - [ ] Add .caws/ directory with schemas and templates
719
- - [ ] Create apps/tools/caws/ validation scripts
720
- - [ ] Wire basic GitHub Actions workflow
721
- - [ ] Add CODEOWNERS for Tier-1 paths
848
+ ### Codemods for Refactoring
722
849
 
723
- ### Quality Gates Implementation
850
+ When refactoring, use codemods instead of manual edits:
724
851
 
725
- - [ ] Enable Testcontainers for integration tests
726
- - [ ] Add mutation testing with tier thresholds
727
- - [ ] Implement trust score calculation
728
- - [ ] Add axe + Playwright smoke for UI changes
852
+ ```bash
853
+ # Install jscodeshift
854
+ npm install -g jscodeshift
729
855
 
730
- ### Operational Excellence
856
+ # Create codemod
857
+ vim codemod/rename-function.ts
731
858
 
732
- - [ ] Publish provenance manifest with PRs
733
- - [ ] Implement flake detector and quarantine process
734
- - [ ] Add waiver system with trust score caps
735
- - [ ] Socialize review rubric and block merges <80
859
+ # Dry run to preview changes
860
+ jscodeshift -d -t codemod/rename-function.ts src/
736
861
 
737
- ### Continuous Improvement
862
+ # Apply transformation
863
+ jscodeshift -t codemod/rename-function.ts src/
738
864
 
739
- - [ ] Monitor drift in contract usage
740
- - [ ] Refine tooling based on feedback
741
- - [ ] Expand language support as needed
742
- - [ ] Track trust score trends and flake rates
865
+ # Verify tests pass
866
+ npm test
867
+ ```
743
868
 
744
- ## 12) Trust Score Formula
869
+ **Example codemod:**
745
870
 
746
871
  ```typescript
747
- const weights = {
748
- coverage: 0.2,
749
- mutation: 0.2,
750
- contracts: 0.16,
751
- a11y: 0.08,
752
- perf: 0.08,
753
- flake: 0.08,
754
- mode: 0.06,
755
- scope: 0.06,
756
- supplychain: 0.04,
872
+ // codemod/rename-function.ts
873
+ export default function transformer(file, api) {
874
+ const j = api.jscodeshift;
875
+ const root = j(file.source);
876
+
877
+ // Find all calls to oldFunction
878
+ root
879
+ .find(j.CallExpression, {
880
+ callee: { name: 'oldFunction' },
881
+ })
882
+ .forEach((path) => {
883
+ // Rename to newFunction
884
+ path.value.callee.name = 'newFunction';
885
+ });
886
+
887
+ return root.toSource();
888
+ }
889
+ ```
890
+
891
+ ### Feature Flags
892
+
893
+ For gradual rollouts, use feature flags:
894
+
895
+ ```typescript
896
+ // Define flags
897
+ const flags = {
898
+ newAuthFlow: process.env.FEATURE_NEW_AUTH === 'true',
757
899
  };
758
900
 
759
- function trustScore(tier: string, prov: Provenance) {
760
- const wsum = Object.values(weights).reduce((a, b) => a + b, 0);
761
- const score =
762
- weights.coverage * normalize(prov.results.coverage_branch, tiers[tier].min_branch, 0.95) +
763
- weights.mutation * normalize(prov.results.mutation_score, tiers[tier].min_mutation, 0.9) +
764
- weights.contracts *
765
- (tiers[tier].requires_contracts
766
- ? prov.results.contracts.consumer && prov.results.contracts.provider
767
- ? 1
768
- : 0
769
- : 1) +
770
- weights.a11y * (prov.results.a11y === 'pass' ? 1 : 0) +
771
- weights.perf * budgetOk(prov.results.perf) +
772
- weights.flake * (prov.results.flake_rate <= 0.005 ? 1 : 0.5) +
773
- weights.mode * (prov.results.mode_compliance === 'full' ? 1 : 0.5) +
774
- weights.scope * (prov.results.scope_within_budget ? 1 : 0) +
775
- weights.supplychain * (prov.results.sbom_valid && prov.results.attestation_valid ? 1 : 0);
776
- return Math.round((score / wsum) * 100);
901
+ // Use in code
902
+ if (flags.newAuthFlow) {
903
+ return handleAuthV2(credentials);
904
+ } else {
905
+ return handleAuthV1(credentials);
777
906
  }
778
907
  ```
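One way to keep both flag paths testable is to read the flag from an injected environment object instead of `process.env` directly. This is a sketch under that assumption; `readFlags`, `chooseAuthHandler`, and the `Env` type are hypothetical names, not part of CAWS.

```typescript
// Minimal stand-in for process.env so tests can pass any values they like.
type Env = Record<string, string | undefined>;

// Same comparison as the snippet above: only the exact string 'true' enables the flag.
function readFlags(env: Env): { newAuthFlow: boolean } {
  return { newAuthFlow: env.FEATURE_NEW_AUTH === 'true' };
}

// Hypothetical selector: reports which handler version would run.
function chooseAuthHandler(env: Env): 'v1' | 'v2' {
  return readFlags(env).newAuthFlow ? 'v2' : 'v1';
}

console.log(chooseAuthHandler({ FEATURE_NEW_AUTH: 'true' }));
console.log(chooseAuthHandler({}));
```

Because the comparison is strict, values such as `'TRUE'` or `'1'` leave the flag off, which keeps the old path as the default when the variable is unset or mistyped.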
779
908
 
780
- This v1.0 combines the philosophical foundation of our system with the practical, executable implementation details needed for immediate adoption. The framework provides both the "why" (quality principles) and the "how" (automated enforcement) needed for engineering-grade AI coding agents.
909
+ ### Performance Budgets
910
+
911
+ Set budgets in working spec:
912
+
913
+ ```yaml
914
+ non_functional:
915
+ perf:
916
+ api_p95_ms: 250 # API latency budget
917
+ lcp_ms: 2500 # Largest Contentful Paint
918
+ tti_ms: 3500 # Time to Interactive
919
+ bundle_kb: 50 # JavaScript bundle size
920
+ ```
921
+
922
+ **Enforce in CI:**
923
+
924
+ ```bash
925
+ # Lighthouse CI
926
+ npm run lighthouse:ci
927
+
928
+ # Bundle size check
929
+ npm run build
930
+ du -k dist/main.js | awk '{if ($1 > 50) exit 1}'
931
+ ```
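The same comparison can be sketched for every key under `non_functional.perf`. The `overBudget` helper and the measured values are hypothetical; how your CI collects the measurements is outside this sketch.

```typescript
// Budgets and measurements keyed by metric name (for example api_p95_ms).
type PerfMetrics = Record<string, number>;

// Returns one message per metric whose measured value exceeds its budget.
// Metrics without a measurement are skipped rather than treated as failures.
function overBudget(measured: PerfMetrics, budget: PerfMetrics): string[] {
  const violations: string[] = [];
  for (const key of Object.keys(budget)) {
    const limit = budget[key];
    const value = measured[key];
    if (limit !== undefined && value !== undefined && value > limit) {
      violations.push(`${key}: ${value} exceeds budget ${limit}`);
    }
  }
  return violations;
}

const budget: PerfMetrics = { api_p95_ms: 250, lcp_ms: 2500, tti_ms: 3500, bundle_kb: 50 };
const measured: PerfMetrics = { api_p95_ms: 240, lcp_ms: 2700, tti_ms: 3400, bundle_kb: 50 };

// Only lcp_ms is over budget in this made-up measurement set.
console.log(overBudget(measured, budget));
```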
781
932
 
782
933
  ---
783
934
 
784
- ## 🚀 Quick Start Guide
935
+ ## FAQ for Agents
936
+
937
+ ### Q: Can I skip writing tests if the change is small?
938
+
939
+ **A: No.** Tests are required regardless of change size. Even a one-line fix needs:
940
+
941
+ 1. A failing test that reproduces the bug
942
+ 2. The fix
943
+ 3. The passing test
944
+
945
+ ### Q: Can I create `enhanced-foo.ts` alongside `foo.ts` for refactoring?
946
+
947
+ **A: No.** Shadow files are forbidden. Instead:
948
+
949
+ 1. Edit `foo.ts` in place
950
+ 2. Or create a codemod to transform `foo.ts`
951
+ 3. Or refactor with a different canonical name
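A quick way to spot shadow files before committing, as a sketch. The pattern list is an assumption built from the name shapes this guide forbids (`enhanced-*`, `new-*`, `v2-*`) plus a `-copy` suffix; the naming guard CAWS actually runs may use a different list, and `isShadowFile` is a hypothetical helper.

```typescript
// Assumed shadow-file patterns, matched against the file name only:
// a leading enhanced-, new-, or v2- prefix, or a -copy suffix before the extension.
const SHADOW_PATTERN = /^(enhanced|new|v2)-|-copy\./;

// Hypothetical helper: true when the file name looks like a shadow copy.
function isShadowFile(path: string): boolean {
  const base = path.split('/').pop() ?? path;
  return SHADOW_PATTERN.test(base);
}

for (const file of ['src/foo.ts', 'src/enhanced-foo.ts', 'src/foo-copy.ts', 'src/renew-token.ts']) {
  console.log(`${file}: ${isShadowFile(file) ? 'shadow file' : 'ok'}`);
}
```

Anchoring the prefix check to the start of the file name avoids false positives on names such as `renew-token.ts`.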
952
+
953
+ ### Q: What if the working spec doesn't exist?
954
+
955
+ **A: Create one.** Before any implementation:
956
+
957
+ 1. Create `.caws/working-spec.yaml`
958
+ 2. Fill in all required fields
959
+ 3. Run `caws validate --suggestions`
960
+ 4. Request human approval
961
+ 5. Then implement
962
+
963
+ ### Q: Can I exceed the change budget if the task requires it?
964
+
965
+ **A: Split the task.** If you need more than `max_files` or `max_loc`:
966
+
967
+ 1. Break into multiple smaller PRs
968
+ 2. Each with its own working spec
969
+ 3. Each staying within budget
970
+
971
+ Only increase budget with human approval and strong justification.
972
+
973
+ ### Q: What if lints fail but I think they're wrong?
974
+
975
+ **A: Fix the lints.** Never use `--no-verify`. If you believe a lint rule is wrong:
976
+
977
+ 1. Fix the code to satisfy the lint
978
+ 2. Or request human discussion of the lint rule
979
+ 3. Human can update lint config if appropriate
980
+
981
+ ### Q: Can I commit without updating provenance?
982
+
983
+ **A: Hooks do it automatically.** If hooks are installed, provenance updates on every commit. If hooks aren't installed:
984
+
985
+ 1. Install them: `caws hooks install`
986
+ 2. Or manually update: `caws provenance update`
987
+
988
+ ---
989
+
990
+ ## Additional Resources
991
+
992
+ ### Documentation
993
+
994
+ - **Complete Guide**: `docs/agents/full-guide.md` - Comprehensive CAWS reference
995
+ - **Tutorial**: `docs/agents/tutorial.md` - Step-by-step learning path
996
+ - **Examples**: `docs/agents/examples.md` - Real-world project examples
997
+
998
+ ### Project-Specific
999
+
1000
+ - **Getting Started**: `.caws/GETTING_STARTED.md` - Generated per project
1001
+ - **Templates**: `.caws/templates/` - Feature plans, test plans, PR templates
1002
+ - **Examples**: `.caws/examples/` - Working spec examples
1003
+
1004
+ ### Cursor Rules
1005
+
1006
+ - **Rules Directory**: `.cursor/rules/` - Modular MDC rule files
1007
+ - **Rules README**: `.cursor/rules/README.md` - Rule system documentation
1008
+
1009
+ ---
785
1010
 
786
- ### For New Projects
1011
+ ## Summary Checklist
787
1012
 
788
- 1. Copy this template to your project root
789
- 2. Run `caws init` to scaffold the project structure
790
- 3. Customize the Working Spec YAML for your project
791
- 4. Set up your CI/CD pipeline with the provided GitHub Actions
1013
+ Before starting any work:
792
1014
 
793
- ### For Existing Projects
1015
+ - [ ] Working spec exists and validates
1016
+ - [ ] Risk tier is appropriate
1017
+ - [ ] Acceptance criteria are clear
1018
+ - [ ] Scope boundaries are defined
1019
+ - [ ] Test plan is documented
794
1020
 
795
- 1. Copy the relevant sections to your existing project
796
- 2. Run `caws scaffold` to add missing components
797
- 3. Update your existing workflows to include the CAWS gates
1021
+ During implementation:
798
1022
 
799
- ### Customization
1023
+ - [ ] Write tests first (TDD)
1024
+ - [ ] Stay within scope.in boundaries
1025
+ - [ ] Keep under change budget
1026
+ - [ ] Use guard clauses and safe defaults
1027
+ - [ ] Inject dependencies for testability
1028
+ - [ ] No shadow files (no `enhanced-*`, `new-*`, `v2-*`)
800
1029
 
801
- - **Project ID**: Update `{{PROJECT_ID}}` with your ticket system prefix
802
- - **Title**: Describe your project in `{{PROJECT_TITLE}}`
803
- - **Tier**: Set appropriate risk tier (1-3) in `{{PROJECT_TIER}}`
804
- - **Mode**: Choose from `refactor`, `feature`, `fix`, `doc`, `chore`
805
- - **Budget**: Set reasonable file/LOC limits in `change_budget`
806
- - **Scope**: Define what files/features are in/out of scope
807
- - **Contracts**: Specify API contracts (OpenAPI, GraphQL, etc.)
1030
+ Before submitting PR:
808
1031
 
809
- ### Support
1032
+ - [ ] All tests pass: `npm test`
1033
+ - [ ] Coverage meets tier requirements
1034
+ - [ ] Mutation score meets tier requirements
1035
+ - [ ] Lints pass: `npm run lint`
1036
+ - [ ] Types check: `npm run typecheck`
1037
+ - [ ] Contracts validate (if applicable)
1038
+ - [ ] Performance budgets met
1039
+ - [ ] No secret scan violations
1040
+ - [ ] Provenance updated
810
1041
 
811
- - 📖 Full documentation: See sections above
812
- - 🛠️ Tools: `apps/tools/caws/` contains all utilities
813
- - 🎯 Examples: Check `docs/` for implementation examples
814
- - 🤝 Community: Follow the agent conduct rules for collaboration
1042
+ **Questions?** Check the full guide or ask your human collaborator.
815
1043
 
816
1044
  ---
817
1045
 
818
- **Author**: @darianrosebrook
819
- **Version**: 1.0.0
820
- **License**: MIT
1046
+ _This guide is your companion for CAWS-driven development. Bookmark it, reference it often, and use it to deliver high-quality, well-tested, explainable code._