safeword 0.7.7 → 0.8.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (34)
  1. package/dist/{check-3TTR7WPD.js → check-2QCPMURS.js} +3 -3
  2. package/dist/{chunk-V5T3TEEQ.js → chunk-2P7QXQFL.js} +2 -2
  3. package/dist/{chunk-LETSGOTR.js → chunk-OXQIEKC7.js} +12 -7
  4. package/dist/{chunk-LETSGOTR.js.map → chunk-OXQIEKC7.js.map} +1 -1
  5. package/dist/{chunk-4NJCU6Z7.js → chunk-ZFRO5LB5.js} +2 -2
  6. package/dist/cli.js +6 -6
  7. package/dist/{diff-XJFCAA4Q.js → diff-6LJGYHY5.js} +3 -3
  8. package/dist/{reset-WPXUWP6Y.js → reset-VHNADDMA.js} +3 -3
  9. package/dist/{setup-DLS6K6EO.js → setup-QJNVWHTK.js} +3 -3
  10. package/dist/sync-TIBNJXB2.js +9 -0
  11. package/dist/{upgrade-4ESTGNXG.js → upgrade-GZSLDUEF.js} +4 -4
  12. package/package.json +15 -14
  13. package/templates/SAFEWORD.md +2 -4
  14. package/templates/doc-templates/feature-spec-template.md +1 -1
  15. package/templates/doc-templates/task-spec-template.md +1 -1
  16. package/templates/doc-templates/test-definitions-feature.md +1 -1
  17. package/templates/guides/planning-guide.md +431 -0
  18. package/templates/guides/testing-guide.md +439 -0
  19. package/templates/scripts/lint-md.sh +0 -0
  20. package/templates/skills/safeword-systematic-debugger/SKILL.md +1 -1
  21. package/templates/skills/safeword-tdd-enforcer/SKILL.md +31 -10
  22. package/dist/sync-AAG4SP5F.js +0 -9
  23. package/templates/guides/development-workflow.md +0 -627
  24. package/templates/guides/tdd-best-practices.md +0 -624
  25. package/templates/guides/test-definitions-guide.md +0 -343
  26. package/templates/guides/user-story-guide.md +0 -265
  27. /package/dist/{check-3TTR7WPD.js.map → check-2QCPMURS.js.map} +0 -0
  28. /package/dist/{chunk-V5T3TEEQ.js.map → chunk-2P7QXQFL.js.map} +0 -0
  29. /package/dist/{chunk-4NJCU6Z7.js.map → chunk-ZFRO5LB5.js.map} +0 -0
  30. /package/dist/{diff-XJFCAA4Q.js.map → diff-6LJGYHY5.js.map} +0 -0
  31. /package/dist/{reset-WPXUWP6Y.js.map → reset-VHNADDMA.js.map} +0 -0
  32. /package/dist/{setup-DLS6K6EO.js.map → setup-QJNVWHTK.js.map} +0 -0
  33. /package/dist/{sync-AAG4SP5F.js.map → sync-TIBNJXB2.js.map} +0 -0
  34. /package/dist/{upgrade-4ESTGNXG.js.map → upgrade-GZSLDUEF.js.map} +0 -0
package/templates/guides/test-definitions-guide.md
@@ -1,343 +0,0 @@
- # Test Definitions Guide for Claude Code
-
- ## How to Fill Out Feature Test Definitions
-
- **Template:** `@.safeword/templates/test-definitions-feature.md`
-
- **When user asks:** "Create test definitions for issue #N" or "Create test definitions for [feature]"
-
- **What you do:**
-
- 1. Read `@.safeword/templates/test-definitions-feature.md`
- 2. Read the user story's Technical Constraints section (if it exists)
- 3. Fill in feature name, issue number, test file path
- 4. Organize tests into logical suites (e.g., "Layout Structure", "User Interactions", "Technical Constraints")
- 5. Create numbered tests (Test 1.1, Test 1.2, etc.)
- 6. Add status for each test (✅/⏭️/❌/🔴)
- 7. Include detailed steps and expected outcomes
- 8. Add summary with coverage breakdown
- 9. Save to project location (e.g., `planning/test-definitions/45-feature-name-test-definitions.md`)
-
- **DO include:**
-
- - Status tracking per test (✅ Passing / ⏭️ Skipped / ❌ Not Implemented / 🔴 Failing)
- - Detailed steps (numbered list)
- - Expected outcomes (bullet points)
- - Coverage summary with percentages
- - Skipped tests rationale
- - Test execution commands
-
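To make steps 1-9 concrete, here is a minimal sketch of how a filled-out test-definitions file could be laid out. The feature name, issue number, file path, and counts are illustrative placeholders, not part of the shipped template.

```markdown
# Test Definitions: Pane Layout (Issue #45)

**Test file**: `tests/pane-layout.spec.ts`
**Last Updated**: YYYY-MM-DD

## Suite 1: Layout Structure

Verifies the three-pane DOM structure.

### Test 1.1: Render all three panes ✅

**Status**: ✅ Passing

**Steps**:

1. Load the app
2. Query for AI, Editor, and Explorer panes

**Expected**:

- All three panes are present in the DOM

### Test 1.2: AI pane sits left of Editor pane ❌

**Status**: ❌ Not Implemented

## Summary

**Total**: 2 tests
**Passing**: 1 test (50%)
**Not Implemented**: 1 test (50%)
```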
- ---
-
- ## Test Status Indicators
-
- Use these consistently:
-
- - **✅ Passing** - Test is implemented and passing
- - **⏭️ Skipped** - Test is intentionally skipped (add rationale in summary)
- - **❌ Not Implemented** - Test is defined but not yet written
- - **🔴 Failing** - Test exists but is currently failing
-
- ---
-
- ## Test Naming Conventions
-
- **✅ GOOD - Descriptive and specific:**
-
- - "Render all three panes"
- - "Cmd+J toggles AI pane visibility"
- - "State persistence across sessions"
- - "Button appearance reflects pane state"
-
- **❌ BAD - Vague or technical:**
-
- - "Test 1" (no description)
- - "Check state" (too vague)
- - "Verify useUIStore hook" (implementation detail)
-
- ---
-
- ## Writing Test Steps
-
- **✅ GOOD - Clear, actionable steps:**
-
- ```text
- **Steps**:
- 1. Toggle AI pane visible
- 2. Get bounding box for AI pane
- 3. Get bounding box for Editor pane
- 4. Compare X coordinates
- ```
-
- **❌ BAD - Vague or incomplete:**
-
- ```text
- **Steps**:
- 1. Check panes
- 2. Verify order
- ```
-
- ---
-
- ## Writing Expected Outcomes
-
- **✅ GOOD - Specific, testable assertions:**
-
- ```text
- **Expected**:
- - AI pane X coordinate < Editor pane X coordinate
- - Explorer pane X coordinate > Editor pane X coordinate
- - All coordinates are positive numbers
- ```
-
- **❌ BAD - Vague expectations:**
-
- ```text
- **Expected**:
- - Panes are in correct order
- - Everything works
- ```
-
- ---
-
- ## Organizing Test Suites
-
- **Group related tests into suites:**
-
- - **Layout/Structure** - DOM structure, element presence, positioning
- - **User Interactions** - Clicks, keyboard shortcuts, drag/drop
- - **State Management** - State changes, persistence, reactivity
- - **Accessibility** - ARIA labels, keyboard navigation, focus management
- - **Edge Cases** - Error handling, boundary conditions, race conditions
- - **Technical Constraints** - Non-functional requirements from user story (see below)
-
- **Each suite should have:**
-
- - Clear name describing what it tests
- - Brief description (1-2 sentences)
- - Related tests grouped logically
-
- ---
-
- ## Coverage Summary Best Practices
-
- **Always include:**
-
- - Total test count
- - Breakdown by status (passing, skipped, not implemented, failing)
- - Percentages for each category
- - Coverage by feature table
- - Rationale for skipped tests
-
- **Example:**
-
- ```text
- **Total**: 20 tests
- **Passing**: 9 tests (45%)
- **Skipped**: 4 tests (20%)
- **Not Implemented**: 7 tests (35%)
- **Failing**: 0 tests
- ```
-
- ---
-
- ## Skipped Tests Rationale
-
- **Always explain why tests are skipped:**
-
- **✅ GOOD - Clear reasoning:**
-
- - "Conflicts with Phase 2 design decision (AI pane hidden by default)"
- - "Requires complex drag simulation, tested manually"
- - "Blocked by upstream dependency (Issue #42)"
-
- **❌ BAD - No explanation:**
-
- - "Skipped"
- - "Not needed"
- - "TODO"
-
- ---
-
- ## Test Execution Section
-
- **Include practical commands:**
-
- ```bash
- # Run all tests for this feature
- npm run test:e2e -- tests/feature-name.spec.ts
-
- # Run specific test
- npm run test:e2e -- tests/feature-name.spec.ts --grep "specific test name"
- ```
-
- ---
-
- ## TDD Workflow Integration
-
- **Test definitions should be created:**
-
- 1. **Before implementation** (TDD: write tests first)
- 2. **During planning** (alongside user stories)
- 3. **After user stories** (tests verify acceptance criteria)
-
- **Update test definitions:**
-
- - Mark tests ✅ as they pass
- - Add ⏭️ with rationale if skipping
- - Mark 🔴 if tests fail
- - Update "Last Updated" date
-
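As an illustration of this update flow (not taken from the guide itself), a single test entry might move through the red/green cycle like this; the test name is hypothetical:

```markdown
<!-- Red: test defined and written, implementation pending -->
### Test 2.3: Pane visibility persists across reload 🔴
**Status**: 🔴 Failing

<!-- Green: implementation complete, status flipped, "Last Updated" bumped -->
### Test 2.3: Pane visibility persists across reload ✅
**Status**: ✅ Passing
```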
- ---
-
- ## Relationship to User Stories
-
- **Test definitions should:**
-
- - Map directly to user story acceptance criteria
- - Cover all acceptance criteria from user stories
- - Include additional edge cases and error scenarios
- - Reference test file locations mentioned in user stories
- - **Include tests for Technical Constraints** (non-functional requirements)
-
- **Example:**
-
- - User Story AC: "AI pane is visible when toggled"
- - Test: "Test 3.1: Cmd+J toggles AI pane visibility ✅"
-
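A sketch of how one acceptance criterion can fan out into several tests; the test numbers and names below are hypothetical:

```markdown
- User Story AC: "AI pane is visible when toggled"
  - Test 3.1: Cmd+J toggles AI pane visibility
  - Test 3.2: Toggle button appearance reflects pane state
  - Test 6.2: Toggle still works while the editor has focus (edge case)
```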
- ---
-
- ## Testing Technical Constraints
-
- **Purpose:** User stories include a Technical Constraints section with non-functional requirements. These constraints MUST have corresponding tests.
-
- **See:** `@.safeword/guides/user-story-guide.md` for constraint categories and examples.
-
- ### Constraint Categories → Test Types
-
- | Constraint Category | Test Type | What to Verify |
- | --- | --- | --- |
- | Performance | Load/timing tests | Response times, throughput, capacity limits |
- | Security | Security tests | Input sanitization, auth checks, rate limiting |
- | Compatibility | Cross-browser/device tests | Browser versions, mobile, accessibility |
- | Data | Compliance tests | Retention, deletion, privacy rules |
- | Dependencies | Integration tests | Required services work, no forbidden packages |
- | Infrastructure | Resource tests | Memory limits, offline behavior, deploy constraints |
-
- ### Writing Constraint Tests
-
- **✅ GOOD - Specific, measurable:**
-
- ```markdown
- ### Test 7.1: API response under load ✅
-
- **Status**: ✅ Passing
- **Category**: Performance
- **Constraint**: API response < 200ms at P95
-
- **Steps**:
-
- 1. Spawn 100 concurrent requests to /api/data
- 2. Measure response times for all requests
- 3. Calculate P95 latency
-
- **Expected**:
-
- - P95 response time < 200ms
- - No requests time out
- - No 5xx errors under load
- ```
-
- **❌ BAD - Vague, untestable:**
-
- ```markdown
- ### Test 7.1: Performance test
-
- **Steps**: Check if fast
- **Expected**: Good performance
- ```
-
- ### When to Skip Constraint Tests
-
- **Document skipped constraint tests with rationale:**
-
- - "Performance: Tested in CI pipeline, not in unit tests"
- - "Compatibility: Manual testing on BrowserStack, see QA checklist"
- - "Security: Covered by automated SAST scan (Snyk)"
-
- **Never skip without explanation.**
-
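For instance, a skipped constraint test entry could be recorded like this; the test name and rationale are illustrative only:

```markdown
### Test 7.3: Renders correctly on iOS 14 Safari ⏭️

**Status**: ⏭️ Skipped
**Category**: Compatibility
**Rationale**: Manual testing on BrowserStack, see QA checklist
```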
- ---
-
- ## Example: Good Test Definition
-
- ```markdown
- ### Test 3.1: Cmd+J toggles AI pane visibility ✅
-
- **Status**: ✅ Passing
- **Description**: Verifies Cmd+J keyboard shortcut toggles AI pane
-
- **Steps**:
-
- 1. Verify AI pane hidden initially (default state)
- 2. Press Cmd+J (Mac) or Ctrl+J (Windows/Linux)
- 3. Verify AI pane becomes visible
- 4. Press Cmd+J again
- 5. Verify AI pane becomes hidden
-
- **Expected**:
-
- - AI pane starts hidden
- - After first toggle: AI pane visible
- - After second toggle: AI pane hidden
- - Toggle action triggers state change in uiStore
- ```
-
- ---
-
- ## Common Mistakes to Avoid
-
- ❌ **Don't test implementation details:**
-
- - Bad: "Verify useState hook updates"
- - Good: "Verify pane becomes visible when toggled"
-
- ❌ **Don't write vague steps:**
-
- - Bad: "Check if it works"
- - Good: "Click button and verify modal appears"
-
- ❌ **Don't skip rationale for skipped tests:**
-
- - Always explain WHY a test is skipped
-
- ❌ **Don't forget coverage summary:**
-
- - Always include totals and percentages
-
- ❌ **Don't duplicate test descriptions:**
-
- - Each test should have a unique, descriptive name
-
- ---
-
- ## LLM Instruction Design
-
- **Important:** Test definitions are instructions that LLMs read and follow. Apply best practices for clarity.
-
- **See:** `@.safeword/guides/llm-guide.md` for a comprehensive framework, including:
-
- - MECE decision trees (mutually exclusive, collectively exhaustive)
- - Explicit definitions (never assume LLMs know what you mean)
- - Concrete examples over abstract rules
- - Edge cases must be explicit
- - Actionable over vague language
-
- ---
-
- ## Key Takeaways
-
- - Map each user story acceptance criterion to specific tests
- - Include tests for technical constraints (performance, security, etc.)
- - Test behavior, not implementation details
- - Every skipped test needs a documented rationale
package/templates/guides/user-story-guide.md
@@ -1,265 +0,0 @@
- # User Story Guide for Claude Code
-
- ## How to Fill Out Feature User Stories
-
- **Template:** `@.safeword/templates/feature-spec-template.md`
-
- **When user asks:** "Create user stories for issue #N" or "Create user stories for [feature]"
-
- **What you do:**
-
- 1. Read `@.safeword/templates/feature-spec-template.md`
- 2. Fill in feature name, issue number, status
- 3. Create numbered stories (Story 1, Story 2, etc.)
- 4. Add acceptance criteria with ✅/❌ checkboxes for tracking
- 5. Include test file references
- 6. Add summary with completion % and phases
- 7. Save to project location (e.g., `planning/specs/feature-45-name.md`)
-
- **DO include:**
-
- - Status tracking (✅/❌ per story and AC)
- - Test file references
- - Implementation status
- - Completion percentage
- - Phase tracking
- - Next steps
-
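A minimal sketch of a spec produced by these steps, reusing the campaign-switching story from the examples later in this guide; the issue number, phase, and percentages are placeholders:

```markdown
# Feature: Campaign Switching (Issue #45)

**Status**: In Progress (Phase 1 of 2, 33% complete)

## Story 1: Switch campaigns from the sidebar ❌

**As a** player with multiple campaigns
**I want** to switch between campaigns from the sidebar
**So that** I can quickly resume different games

**Acceptance Criteria:**

- [x] Sidebar shows all campaigns with last-played date ✅
- [ ] Clicking campaign loads it within 200ms ❌
- [ ] Current campaign is highlighted ❌

**Tests**: `tests/campaign-switching.spec.ts`

## Next Steps

- Implement the campaign load handler (Story 1, second criterion)
```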
- ---
-
- ## INVEST Validation (Do This Before Saving)
-
- After filling out the story, mentally check:
-
- ✅ **Independent** - Can it be done without other stories?
- ✅ **Negotiable** - Are details left for conversation?
- ✅ **Valuable** - Does "So that" clearly state value?
- ✅ **Estimable** - Can the team estimate this (1-5 days)?
- ✅ **Small** - Completable in one sprint?
- ✅ **Testable** - Are acceptance criteria specific?
-
- **If ANY check fails → Refine or split the story**
-
- ---
-
- ## Writing Good Acceptance Criteria
-
- **✅ GOOD - Specific, user-facing, testable:**
-
- - User can switch campaigns without page reload
- - Response time is under 200ms
- - Current campaign is visually highlighted
- - Error message explains what went wrong
-
- **❌ BAD - Vague, technical, or implementation-focused:**
-
- - Campaign switching works ← Too vague
- - Use Zustand for state ← Implementation detail
- - Database is fast ← Not user-facing
- - Code is clean ← Not testable
-
- ---
-
- ## Size Guidelines
-
- | Indicator | Too Big | Just Right | Too Small |
- | --- | --- | --- | --- |
- | Acceptance Criteria | 6+ | 1-5 | 0 |
- | Personas/Screens | 3+ | 1-2 | N/A |
- | Duration | 6+ days | 1-5 days | <1 hour |
- | **Action** | Split | ✅ Ship | Combine |
-
- **Decision rule:** When borderline (e.g., 5 AC but 2 personas), err on the side of splitting. Smaller stories are easier to estimate and complete.
-
- ---
-
- ## Examples
-
- ### ✅ GOOD Story
-
- **As a** player with multiple campaigns
- **I want** to switch between campaigns from the sidebar
- **So that** I can quickly resume different games
-
- **Acceptance Criteria:**
-
- - [ ] Sidebar shows all campaigns with last-played date
- - [ ] Clicking campaign loads it within 200ms
- - [ ] Current campaign is highlighted
-
- ### ❌ BAD Story (Too Big)
-
- **As a** user
- **I want** a complete campaign management system
- **So that** I can organize my games
-
- **Acceptance Criteria:**
-
- - [ ] Create, edit, delete campaigns
- - [ ] Share campaigns with other players
- - [ ] Export/import campaign data
- - [ ] Search and filter campaigns
- - [ ] Tag campaigns by theme
-
- **Problem:** This is 5+ separate stories. Split it.
-
- ### ❌ BAD Story (No Value)
-
- **As a** developer
- **I want** to refactor the GameStore
- **So that** code is cleaner
-
- **Problem:**
-
- - Developer is not a user
- - "Cleaner code" is not user-facing value
- - This is a technical task, not a user story
-
- ### ✅ BETTER (Technical Story)
-
- **Technical Task:** Refactor GameStore to use Immer
-
- **Why:** Prevent state mutation bugs (3 bugs in last sprint)
- **Effort:** 2-3 hours
- **Test:** All existing tests pass, no new mutations
-
- ---
-
- ## Conversation Starter, Not Contract
-
- **Remember:** A user story is a placeholder for a conversation.
-
- **During planning, discuss:**
-
- - Edge cases not in acceptance criteria
- - Technical approach (but don't document it in the story)
- - Open questions or dependencies
- - How to split if too big
-
- **The story should NOT contain:**
-
- - Technical implementation details
- - Test strategies
- - UI mockups (link to them instead)
- - Definition of done (that's team-wide)
-
- ---
-
- ## Technical Constraints Section
-
- **Purpose:** Capture non-functional requirements that inform test definitions. These are NOT user stories but constrain how stories are implemented.
-
- **When to use:** Fill in constraints BEFORE writing test definitions. Delete sections that don't apply—keep it lean.
-
- ### Categories
-
- | Category | What It Captures | Examples |
- | --- | --- | --- |
- | Performance | Speed, throughput, capacity | Response time < 200ms, 1000 concurrent users |
- | Security | Auth, validation, rate limiting | Sanitized inputs, session required, 100 req/min |
- | Compatibility | Browsers, devices, accessibility | Chrome 100+, iOS 14+, WCAG 2.1 AA |
- | Data | Privacy, retention, compliance | GDPR delete in 72h, 90-day log retention |
- | Dependencies | Existing systems, restrictions | Use AuthService, no new packages |
- | Infrastructure | Resources, offline, deployment | < 512MB memory, offline-capable |
-
- ### ✅ GOOD Constraints (Specific, Testable)
-
- ```markdown
- ### Performance
-
- - [ ] API response < 200ms at P95 under 100 concurrent users
- - [ ] Initial page load < 3s on simulated 3G
-
- ### Security
-
- - [ ] All user inputs sanitized via DOMPurify
- - [ ] Rate limited: 100 requests/min per IP
- ```
-
- ### ❌ BAD Constraints (Vague, Untestable)
-
- ```markdown
- ### Performance
-
- - [ ] Should be fast ← How fast? Under what conditions?
- - [ ] Good performance ← Not measurable
-
- ### Security
-
- - [ ] Secure ← What does this mean?
- - [ ] Protected from hackers ← Not specific
- ```
-
- ### Decision Rule
-
- **Include a constraint if:**
-
- - It affects how you write tests (performance tests, security tests)
- - It limits implementation choices (must use X, can't use Y)
- - Violating it would fail an audit or break SLAs
-
- **Skip if:**
-
- - It's a project-wide standard already in ARCHITECTURE.md
- - It's obvious (don't document "code must compile")
-
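A quick worked application of this rule, using hypothetical constraints:

```markdown
- "Must reuse the existing AuthService; no new auth packages" → include (Dependencies: limits implementation choices and needs an integration test)
- "Code must pass ESLint" → skip (project-wide standard, enforced outside this feature)
```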
- ### Tie-Breaking
-
- **If a constraint fits multiple categories:** Choose the most specific one.
-
- | Constraint | Could Fit | Best Category | Why |
- | --- | --- | --- | --- |
- | API rate limit 100 req/min | Security, Performance | Security | Rate limiting is access control |
- | Page load < 3s on 3G | Performance, Compatibility | Performance | Speed is primary concern |
- | Must work offline | Infrastructure, Compatibility | Infrastructure | Offline is deployment concern |
-
- **Edge case:** If truly cross-cutting (e.g., "GDPR compliance" spans Data + Security), pick one and add a note: "See also: Security constraints"
-
- ---
-
- ## LLM Optimization Tips
-
- **Core principle:** User stories are instructions that LLMs read and follow. Apply LLM instruction design best practices.
-
- **See:** `@.safeword/guides/llm-guide.md` for a comprehensive framework on writing LLM-consumable documentation.
-
- **When filling templates:**
-
- - Use specific, concrete language (not vague)
- - Avoid generic phrases ("improve UX", "make better")
- - Include numbers where relevant (200ms, 3 items, under 5 clicks)
- - Use concrete examples over abstract rules
- - Define all terms explicitly
- - Write for humans, not robots
-
- **Token efficiency:**
-
- - The template is 9 lines (minimal prompt caching cost)
- - No nested sections (flat structure)
- - No validation metadata in the file
-
- ---
-
- ## File Naming Convention
-
- Save specs as: `.safeword/planning/specs/feature-[slug].md`
-
- **Good filenames:**
-
- - `campaign-switching.md`
- - `export-character-pdf.md`
- - `stress-tracking.md`
-
- **Bad filenames:**
-
- - `user-story-1.md` ← Not descriptive
- - `STORY_CAMPAIGN_SWITCHING_FINAL_v2.md` ← Bloated
-
- ---
-
- ## Key Takeaways
-
- - INVEST checklist: Independent, Negotiable, Valuable, Estimable, Small, Testable
- - "As a [role], I want [action], so that [value]"—always include the "so that"
- - 1-5 acceptance criteria per story; split if more (see Size Guidelines)
- - Include technical constraints (performance, security, etc.) when relevant