safeword 0.7.7 → 0.8.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (34)
  1. package/dist/{check-3TTR7WPD.js → check-2QCPMURS.js} +3 -3
  2. package/dist/{chunk-V5T3TEEQ.js → chunk-2P7QXQFL.js} +2 -2
  3. package/dist/{chunk-LETSGOTR.js → chunk-OXQIEKC7.js} +12 -7
  4. package/dist/{chunk-LETSGOTR.js.map → chunk-OXQIEKC7.js.map} +1 -1
  5. package/dist/{chunk-4NJCU6Z7.js → chunk-ZFRO5LB5.js} +2 -2
  6. package/dist/cli.js +6 -6
  7. package/dist/{diff-XJFCAA4Q.js → diff-6LJGYHY5.js} +3 -3
  8. package/dist/{reset-WPXUWP6Y.js → reset-VHNADDMA.js} +3 -3
  9. package/dist/{setup-DLS6K6EO.js → setup-QJNVWHTK.js} +3 -3
  10. package/dist/sync-TIBNJXB2.js +9 -0
  11. package/dist/{upgrade-4ESTGNXG.js → upgrade-GZSLDUEF.js} +4 -4
  12. package/package.json +15 -14
  13. package/templates/SAFEWORD.md +2 -4
  14. package/templates/doc-templates/feature-spec-template.md +1 -1
  15. package/templates/doc-templates/task-spec-template.md +1 -1
  16. package/templates/doc-templates/test-definitions-feature.md +1 -1
  17. package/templates/guides/planning-guide.md +431 -0
  18. package/templates/guides/testing-guide.md +439 -0
  19. package/templates/scripts/lint-md.sh +0 -0
  20. package/templates/skills/safeword-systematic-debugger/SKILL.md +1 -1
  21. package/templates/skills/safeword-tdd-enforcer/SKILL.md +31 -10
  22. package/dist/sync-AAG4SP5F.js +0 -9
  23. package/templates/guides/development-workflow.md +0 -627
  24. package/templates/guides/tdd-best-practices.md +0 -624
  25. package/templates/guides/test-definitions-guide.md +0 -343
  26. package/templates/guides/user-story-guide.md +0 -265
  27. package/dist/{check-3TTR7WPD.js.map → check-2QCPMURS.js.map} +0 -0
  28. package/dist/{chunk-V5T3TEEQ.js.map → chunk-2P7QXQFL.js.map} +0 -0
  29. package/dist/{chunk-4NJCU6Z7.js.map → chunk-ZFRO5LB5.js.map} +0 -0
  30. package/dist/{diff-XJFCAA4Q.js.map → diff-6LJGYHY5.js.map} +0 -0
  31. package/dist/{reset-WPXUWP6Y.js.map → reset-VHNADDMA.js.map} +0 -0
  32. package/dist/{setup-DLS6K6EO.js.map → setup-QJNVWHTK.js.map} +0 -0
  33. package/dist/{sync-AAG4SP5F.js.map → sync-TIBNJXB2.js.map} +0 -0
  34. package/dist/{upgrade-4ESTGNXG.js.map → upgrade-GZSLDUEF.js.map} +0 -0
@@ -46,10 +46,8 @@ Training data is stale. Follow this sequence:
 
  | Trigger | Guide |
  | --------------------------------------------------------- | ----------------------------------------------- |
- | Starting ANY feature, bug fix, or enhancement | `./.safeword/guides/development-workflow.md` |
- | Need to write OR review user stories | `./.safeword/guides/user-story-guide.md` |
- | Need to write OR review test definitions | `./.safeword/guides/test-definitions-guide.md` |
- | Writing tests, doing TDD, or test is failing | `./.safeword/guides/tdd-best-practices.md` |
+ | Starting feature/task OR writing specs/test definitions | `./.safeword/guides/planning-guide.md` |
+ | Choosing test type, doing TDD, OR test is failing | `./.safeword/guides/testing-guide.md` |
  | Creating OR updating a design doc | `./.safeword/guides/design-doc-guide.md` |
  | Making architectural decision OR writing ADR | `./.safeword/guides/architecture-guide.md` |
  | Designing data models, schemas, or database changes | `./.safeword/guides/data-architecture-guide.md` |
@@ -1,6 +1,6 @@
  # Feature Spec: [Feature Name] (Issue #[number])
 
- **Guide**: `@./.safeword/guides/user-story-guide.md` - Best practices, INVEST criteria, and examples
+ **Guide**: `@./.safeword/guides/planning-guide.md` - Best practices, INVEST criteria, and examples
  **Template**: `@./.safeword/templates/feature-spec-template.md`
 
  **Feature**: [Brief description of the feature]
@@ -1,6 +1,6 @@
  # Task: [Name]
 
- **Guide**: `@./.safeword/guides/development-workflow.md`
+ **Guide**: `@./.safeword/guides/planning-guide.md`
  **Template**: `@./.safeword/templates/task-spec-template.md`
 
  ---
@@ -1,6 +1,6 @@
  # Test Definitions: [Feature Name] (Issue #[number])
 
- **Guide**: `@./.safeword/guides/test-definitions-guide.md` - Structure, status tracking, and TDD workflow
+ **Guide**: `@./.safeword/guides/testing-guide.md` - Structure, status tracking, and TDD workflow
  **Template**: `@./.safeword/templates/test-definitions-feature.md`
 
  **Feature**: [Brief description of the feature]
@@ -0,0 +1,431 @@
+ # Planning Guide
+
+ How to write specs, user stories, and test definitions before implementation.
+
+ ---
+
+ ## Artifact Levels
+
+ **Triage first - answer IN ORDER, stop at first match:**
+
+ | Question | Level | Artifacts |
+ | ---------------------------------------- | -------------- | ---------------------------------------------------- |
+ | User-facing feature with business value? | **L2 Feature** | Feature Spec + Test Definitions (+ Design Doc if 3+) |
+ | Bug, improvement, internal, or refactor? | **L1 Task** | Task Spec with inline tests |
+ | Typo, config, or trivial change? | **L0 Micro** | Minimal Task Spec, existing tests |
+
+ **Locations:**
+
+ - Specs: `.safeword/planning/specs/`
+ - Test definitions: `.safeword/planning/test-definitions/`
+
+ **If none fit:** Break down the work. A single task spanning all three levels should be split into separate L2 feature + L1 tasks.
+
+ ---
+
+ ## Templates
+
+ | Need | Template |
+ | ------------------------------- | ---------------------------------------------------- |
+ | L2 Feature spec | `@./.safeword/templates/feature-spec-template.md` |
+ | L1/L0 Task spec | `@./.safeword/templates/task-spec-template.md` |
+ | L2 Test definitions | `@./.safeword/templates/test-definitions-feature.md` |
+ | Complex feature design | `@./.safeword/templates/design-doc-template.md` |
+ | Architectural decision | `@./.safeword/templates/architecture-template.md` |
+ | Context anchor for complex work | `@./.safeword/templates/ticket-template.md` |
+ | Execution scratch pad | `@./.safeword/templates/work-log-template.md` |
+
+ ---
+
+ ## Part 1: User Stories
+
+ ### When to Use Each Format
+
+ | Format | Best For | Example Trigger |
+ | ------------------------------ | ------------------------------------------- | ---------------------------- |
+ | Standard (As a/I want/So that) | User-facing features, UI flows | "User can do X" |
+ | Given-When-Then | API behavior, state transitions, edge cases | "When X happens, then Y" |
+ | Job Story | Problem-solving, user motivation unclear | "User needs to accomplish X" |
+
+ **Decision rule:** Default to Standard. Use Given-When-Then for APIs or complex state. Use Job Story when focusing on the problem, not the solution.
+
+ **Edge cases:**
+
+ - API with UI? → Standard for UI, Given-When-Then for API contract tests
+ - Unclear user role? → Job Story to focus on the problem first, convert to Standard later
+ - Technical task (refactor, upgrade)? → Skip story format, use Technical Task template
+
+ ### Standard Format (Recommended)
+
+ ```text
+ As a [role/persona]
+ I want [capability/feature]
+ So that [business value/benefit]
+
+ Acceptance Criteria:
+ - [Specific, testable condition 1]
+ - [Specific, testable condition 2]
+ - [Specific, testable condition 3]
+
+ Out of Scope:
+ - [What this story explicitly does NOT include]
+ ```
+
+ ### Given-When-Then Format (Behavior-Focused)
+
+ ```text
+ Given [initial context/state]
+ When [action/event occurs]
+ Then [expected outcome]
+
+ And [additional context/outcome]
+ But [exception/edge case]
+ ```
+
+ **Example:**
+
+ ```text
+ Given I am an authenticated API user
+ When I POST to /api/campaigns with valid JSON
+ Then I receive a 201 Created response with campaign ID
+ And the campaign appears in my GET /api/campaigns list
+ But invalid JSON returns 400 with descriptive error messages
+ ```
+
+ ### Job Story Format (Outcome-Focused)
+
+ ```text
+ When [situation/context]
+ I want to [motivation/job-to-be-done]
+ So I can [expected outcome]
+ ```
+
+ **Example:**
+
+ ```text
+ When I'm debugging a failing test
+ I want to see the exact LLM prompt and response
+ So I can identify whether the issue is prompt engineering or code logic
+ ```
+
+ ---
+
+ ## INVEST Validation
+
+ Before saving any story, verify it passes all six criteria:
+
+ - [ ] **Independent** - Can be completed without depending on other stories
+ - [ ] **Negotiable** - Details emerge through conversation, not a fixed contract
+ - [ ] **Valuable** - Delivers clear value to user or business
+ - [ ] **Estimable** - Team can estimate effort (not too vague, not too detailed)
+ - [ ] **Small** - Completable in one sprint/iteration (typically 1-5 days)
+ - [ ] **Testable** - Clear acceptance criteria define when it's done
+
+ **If a story fails any criteria, it's not ready - refine or split it.**
+
+ ---
+
+ ## Writing Good Acceptance Criteria
+
+ **✅ GOOD - Specific, user-facing, testable:**
+
+ - User can switch campaigns without page reload
+ - Response time is under 200ms
+ - Current campaign is visually highlighted
+ - Error message explains what went wrong
+
+ **❌ BAD - Vague, technical, or implementation:**
+
+ - Campaign switching works ← Too vague
+ - Use Zustand for state ← Implementation detail
+ - Database is fast ← Not user-facing
+ - Code is clean ← Not testable
+
+ ---
+
+ ## Size Guidelines
+
+ | Indicator | Too Big | Just Right | Too Small |
+ | ------------------- | ------- | ---------- | --------- |
+ | Acceptance Criteria | 6+ | 1-5 | 0 |
+ | Personas/Screens | 3+ | 1-2 | N/A |
+ | Duration | 6+ days | 1-5 days | <1 hour |
+ | **Action** | Split | ✅ Ship | Combine |
+
+ **Decision rule:** When borderline, err on the side of splitting. Smaller stories are easier to estimate and complete.
+
+ ---
+
+ ## Technical Constraints Section
+
+ **Purpose:** Capture non-functional requirements that inform test definitions.
+
+ **When to use:** Fill in constraints BEFORE writing test definitions. Delete sections that don't apply.
+
+ | Category | What It Captures | Examples |
+ | -------------- | -------------------------------- | ----------------------------------------------- |
+ | Performance | Speed, throughput, capacity | Response time < 200ms, 1000 concurrent users |
+ | Security | Auth, validation, rate limiting | Sanitized inputs, session required, 100 req/min |
+ | Compatibility | Browsers, devices, accessibility | Chrome 100+, iOS 14+, WCAG 2.1 AA |
+ | Data | Privacy, retention, compliance | GDPR delete in 72h, 90-day log retention |
+ | Dependencies | Existing systems, restrictions | Use AuthService, no new packages |
+ | Infrastructure | Resources, offline, deployment | < 512MB memory, offline-capable |
+
+ **Include a constraint if:**
+
+ - It affects how you write tests
+ - It limits implementation choices
+ - Violating it would fail an audit or break SLAs
+
+ ---
+
+ ## User Story Examples
+
+ ### ✅ GOOD Story
+
+ ```text
+ As a player with multiple campaigns
+ I want to switch between campaigns from the sidebar
+ So that I can quickly resume different games
+
+ Acceptance Criteria:
+ - [ ] Sidebar shows all campaigns with last-played date
+ - [ ] Clicking campaign loads it within 200ms
+ - [ ] Current campaign is highlighted
+
+ Out of Scope:
+ - Campaign merging/deletion (separate story)
+ ```
+
+ ### ❌ BAD Story (Too Big)
+
+ ```text
+ As a user
+ I want a complete campaign management system
+ So that I can organize my games
+
+ Acceptance Criteria:
+ - [ ] Create, edit, delete campaigns
+ - [ ] Share campaigns with other players
+ - [ ] Export/import campaign data
+ - [ ] Search and filter campaigns
+ - [ ] Tag campaigns by theme
+ ```
+
+ **Problem:** This is 5+ separate stories. Split it.
+
+ ### ❌ BAD Story (No Value)
+
+ ```text
+ As a developer
+ I want to refactor the GameStore
+ So that code is cleaner
+ ```
+
+ **Problem:** Developer is not a user. "Cleaner code" is not user-facing value.
+
+ ### ✅ BETTER (Technical Task)
+
+ ```text
+ Technical Task: Refactor GameStore to use Immer
+
+ Why: Prevent state mutation bugs (3 bugs in last sprint)
+ Effort: 2-3 hours
+ Test: All existing tests pass, no new mutations
+ ```
+
+ ---
+
+ ## Part 2: Test Definitions
+
+ ### How to Fill Out Test Definitions
+
+ 1. Read `@./.safeword/templates/test-definitions-feature.md`
+ 2. Read user story's Technical Constraints section (if exists)
+ 3. Fill in feature name, issue number, test file path
+ 4. Organize tests into logical suites
+ 5. Create numbered tests (Test 1.1, Test 1.2, etc.)
+ 6. Add status for each test
+ 7. Include detailed steps and expected outcomes
+ 8. Add summary with coverage breakdown
+ 9. Save to `.safeword/planning/test-definitions/feature-[slug].md`
+
+ ---
+
+ ## Test Status Indicators
+
+ Use these consistently:
+
+ - **✅ Passing** - Test is implemented and passing
+ - **⏭️ Skipped** - Test is intentionally skipped (add rationale)
+ - **❌ Not Implemented** - Test is defined but not yet written
+ - **🔴 Failing** - Test exists but is currently failing
+
+ ---
+
+ ## Test Definition Naming
+
+ **✅ GOOD - Descriptive and specific:**
+
+ - "Render all three panes"
+ - "Cmd+J toggles AI pane visibility"
+ - "State persistence across sessions"
+
+ **❌ BAD - Vague or technical:**
+
+ - "Test 1" (no description)
+ - "Check state" (too vague)
+ - "Verify useUIStore hook" (implementation detail)
+
+ ---
+
+ ## Writing Test Steps
+
+ **✅ GOOD - Clear, actionable steps:**
+
+ ```text
+ **Steps**:
+ 1. Toggle AI pane visible
+ 2. Get bounding box for AI pane
+ 3. Get bounding box for Editor pane
+ 4. Compare X coordinates
+ ```
+
+ **❌ BAD - Vague or incomplete:**
+
+ ```text
+ **Steps**:
+ 1. Check panes
+ 2. Verify order
+ ```
+
+ ---
+
+ ## Writing Expected Outcomes
+
+ **✅ GOOD - Specific, testable assertions:**
+
+ ```text
+ **Expected**:
+ - AI pane X coordinate < Editor pane X coordinate
+ - Explorer pane X coordinate > Editor pane X coordinate
+ - All coordinates are positive numbers
+ ```
+
+ **❌ BAD - Vague expectations:**
+
+ ```text
+ **Expected**:
+ - Panes are in correct order
+ - Everything works
+ ```
+
+ ---
+
+ ## Organizing Test Suites
+
+ Group related tests:
+
+ - **Layout/Structure** - DOM structure, element presence, positioning
+ - **User Interactions** - Clicks, keyboard shortcuts, drag/drop
+ - **State Management** - State changes, persistence, reactivity
+ - **Accessibility** - ARIA labels, keyboard navigation, focus
+ - **Edge Cases** - Error handling, boundary conditions
+ - **Technical Constraints** - Non-functional requirements from user story
+
+ ---
+
+ ## Coverage Summary
+
+ **Always include:**
+
+ - Total test count
+ - Breakdown by status (passing, skipped, not implemented, failing)
+ - Percentages for each category
+ - Rationale for skipped tests
+
+ **Example:**
+
+ ```text
+ **Total**: 20 tests
+ **Passing**: 9 tests (45%)
+ **Skipped**: 4 tests (20%)
+ **Not Implemented**: 7 tests (35%)
+ **Failing**: 0 tests
+ ```
+
+ ---
+
+ ## Testing Technical Constraints
+
+ User stories include Technical Constraints. These MUST have corresponding tests.
+
+ | Constraint Category | Test Type | What to Verify |
+ | ------------------- | -------------------------- | --------------------------------------------- |
+ | Performance | Load/timing tests | Response times, throughput, capacity |
+ | Security | Security tests | Input sanitization, auth, rate limiting |
+ | Compatibility | Cross-browser/device tests | Browser versions, mobile, accessibility |
+ | Data | Compliance tests | Retention, deletion, privacy rules |
+ | Dependencies | Integration tests | Required services work, no forbidden packages |
+ | Infrastructure | Resource tests | Memory limits, offline behavior |
+
+ ---
+
+ ## Test Definition Example
+
+ ```markdown
+ ### Test 3.1: Cmd+J toggles AI pane visibility ✅
+
+ **Status**: ✅ Passing
+ **Description**: Verifies Cmd+J keyboard shortcut toggles AI pane
+
+ **Steps**:
+
+ 1. Verify AI pane hidden initially (default state)
+ 2. Press Cmd+J (Mac) or Ctrl+J (Windows/Linux)
+ 3. Verify AI pane becomes visible
+ 4. Press Cmd+J again
+ 5. Verify AI pane becomes hidden
+
+ **Expected**:
+
+ - AI pane starts hidden
+ - After first toggle: AI pane visible
+ - After second toggle: AI pane hidden
+ ```
+
+ ---
+
+ ## File Naming Convention
+
+ **Specs:** `.safeword/planning/specs/feature-[slug].md` or `task-[slug].md`
+
+ **Test definitions:** `.safeword/planning/test-definitions/feature-[slug].md`
+
+ **Good filenames:**
+
+ - `feature-campaign-switching.md`
+ - `task-fix-login-timeout.md`
+
+ **Bad filenames:**
+
+ - `user-story-1.md` ← Not descriptive
+ - `STORY_CAMPAIGN_FINAL_v2.md` ← Bloated
+
+ ---
+
+ ## Quick Reference
+
+ **User Story Red Flags (INVEST Violations):**
+
+ - No acceptance criteria → Too vague
+ - More than 5 acceptance criteria → Split into multiple stories
+ - Technical implementation details → Wrong audience
+ - Missing "So that" → No clear value
+
+ **Test Definition Red Flags:**
+
+ - Test name doesn't describe behavior → Rename
+ - Steps are vague → Add detail
+ - No expected outcomes → Add assertions
+ - No coverage summary → Add totals