npm - safeword - Versions diffs - 0.8.4 → 0.8.5 - Mend

safeword 0.8.4 → 0.8.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31) hide show

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "safeword",
-  "version": "0.8.4",
+  "version": "0.8.5",
   "description": "CLI for setting up and managing safeword development environments",
   "type": "module",
   "bin": {
@@ -19,6 +19,18 @@
   "engines": {
     "node": ">=18"
   },
+  "scripts": {
+    "build": "tsup",
+    "dev": "tsup --watch",
+    "test": "vitest run",
+    "test:e2e": "vitest run tests/e2e/",
+    "test:watch": "vitest",
+    "test:coverage": "vitest run --coverage",
+    "typecheck": "tsc --noEmit",
+    "lint": "eslint src tests",
+    "clean": "rm -rf dist",
+    "prepublishOnly": "npm run build && npm test"
+  },
   "dependencies": {
     "commander": "^12.1.0"
   },
@@ -36,16 +48,5 @@
     "claude-code"
   ],
   "author": "",
-  "license": "MIT",
-  "scripts": {
-    "build": "tsup",
-    "dev": "tsup --watch",
-    "test": "vitest run",
-    "test:e2e": "vitest run tests/e2e/",
-    "test:watch": "vitest",
-    "test:coverage": "vitest run --coverage",
-    "typecheck": "tsc --noEmit",
-    "lint": "eslint src tests",
-    "clean": "rm -rf dist"
-  }
-}
+  "license": "MIT"
+}

package/templates/cursor/rules/safeword-brainstorming.mdc ADDED Viewed

@@ -0,0 +1,198 @@
+---
+description: Use before implementation when refining rough ideas into specs. Guides collaborative design through Socratic questioning, alternative exploration, and incremental validation. Triggers: 'brainstorm', 'design', 'explore options', 'figure out', 'think through', 'what approach'.
+alwaysApply: false
+---
+# Brainstorming Ideas Into Specs
+Turn rough ideas into validated specs through Socratic dialogue.
+**Iron Law:** ONE QUESTION AT A TIME. EXPLORE ALTERNATIVES BEFORE DECIDING.
+## When to Use
+Answer IN ORDER. Stop at first match:
+1. Rough idea needs refinement? → Use this skill
+2. Multiple approaches possible? → Use this skill
+3. Unclear requirements? → Use this skill
+4. Clear task, obvious approach? → Skip (use enforcing-tdd directly)
+5. Pure research/investigation? → Skip
+---
+## Phase 1: CONTEXT
+**Purpose:** Understand what exists before asking questions.
+**Protocol:**
+1. Check project state (files, recent commits, existing specs)
+2. Review relevant docs in `.safeword/planning/`
+3. Identify constraints and patterns already established
+**Exit Criteria:**
+- [ ] Reviewed relevant codebase areas
+- [ ] Checked existing specs/designs
+- [ ] Ready to ask informed questions
+---
+## Phase 2: QUESTION
+**Iron Law:** ONE QUESTION PER MESSAGE
+**Protocol:**
+1. Ask one focused question
+2. Prefer multiple choice (2-4 options) when possible
+3. Open-ended is fine for exploratory topics
+4. Focus on: purpose, constraints, success criteria, scope
+**Question Types (in order of preference):**
+| Type            | When                    | Example                                              |
+| --------------- | ----------------------- | ---------------------------------------------------- |
+| Multiple choice | Clear options exist     | "Should this be (A) real-time or (B) polling-based?" |
+| Yes/No          | Binary decision         | "Do we need offline support?"                        |
+| Bounded open    | Need specifics          | "What's the max number of items to display?"         |
+| Open-ended      | Exploring problem space | "What problem are you trying to solve?"              |
+**Exit Criteria:**
+- [ ] Understand the core problem/goal
+- [ ] Know key constraints
+- [ ] Have success criteria
+- [ ] Scope boundaries are clear
+---
+## Phase 3: ALTERNATIVES
+**Iron Law:** ALWAYS PRESENT 2-3 OPTIONS BEFORE DECIDING
+**Protocol:**
+1. Present 2-3 approaches with trade-offs
+2. Lead with your recommendation and why
+3. Be explicit about what each gives up
+4. Let user choose (or suggest hybrid)
+**Format:**
+```text
+I'd recommend Option A because [reason].
+**Option A: [Name]**
+- Approach: [how it works]
+- Pros: [benefits]
+- Cons: [drawbacks]
+**Option B: [Name]**
+- Approach: [how it works]
+- Pros: [benefits]
+- Cons: [drawbacks]
+Which direction feels right?
+```
+**Exit Criteria:**
+- [ ] Presented 2-3 viable approaches
+- [ ] Gave clear recommendation with reasoning
+- [ ] User selected approach (or hybrid)
+---
+## Phase 4: DESIGN
+**Iron Law:** PRESENT IN 200-300 WORD SECTIONS. VALIDATE EACH.
+**Protocol:**
+1. Present design incrementally (not all at once)
+2. After each section: "Does this look right so far?"
+3. Cover: architecture, components, data flow, error handling
+4. Apply YAGNI ruthlessly - remove anything "just in case"
+5. Go back and clarify when something doesn't fit
+**Sections (present one at a time):**
+1. **Overview** - What we're building, high-level approach
+2. **Components** - Key pieces and responsibilities
+3. **Data Flow** - How data moves through system
+4. **Edge Cases** - Error handling, boundaries
+5. **Out of Scope** - What we're explicitly NOT doing
+**Exit Criteria:**
+- [ ] Each section validated by user
+- [ ] Design is complete and coherent
+- [ ] YAGNI applied (no speculative features)
+- [ ] Ready to create spec
+---
+## Phase 5: SPEC
+**Purpose:** Convert validated design into structured spec.
+**Protocol:**
+1. Determine level (L0/L1/L2) using triage questions
+2. Create spec using appropriate template
+3. Commit the spec
+**Triage:**
+| Question                                 | If Yes →                     |
+| ---------------------------------------- | ---------------------------- |
+| User-facing feature with business value? | **L2** → Feature Spec        |
+| Bug, improvement, internal, or refactor? | **L1** → Task Spec           |
+| Typo, config, or trivial change?         | **L0** → Task Spec (minimal) |
+**Exit Criteria:**
+- [ ] Spec created in correct location
+- [ ] L2: Test definitions created
+- [ ] Spec committed to git
+---
+## Phase 6: HANDOFF
+**Protocol:**
+1. Summarize what was created
+2. Ask: "Ready to start implementation with TDD?"
+3. If yes → Invoke enforcing-tdd skill
+**Exit Criteria:**
+- [ ] User confirmed spec is complete
+- [ ] Handed off to enforcing-tdd (if continuing)
+---
+## Key Principles
+| Principle                     | Why                                     |
+| ----------------------------- | --------------------------------------- |
+| One question at a time        | Prevents overwhelm, gets better answers |
+| Multiple choice preferred     | Faster to answer, reduces ambiguity     |
+| Alternatives before decisions | Avoids premature commitment             |
+| Incremental validation        | Catches misunderstandings early         |
+| YAGNI ruthlessly              | Scope creep kills projects              |
+---
+## Anti-Patterns
+| Don't                          | Do                               |
+| ------------------------------ | -------------------------------- |
+| Dump full design at once       | Present in 200-300 word sections |
+| Ask 5 questions in one message | Ask ONE question                 |
+| Skip alternatives              | Always present 2-3 options       |
+| Accept vague requirements      | Probe until concrete             |
+| Add "nice to have" features    | Put them in "Out of Scope"       |

package/templates/cursor/rules/safeword-debugging.mdc ADDED Viewed

@@ -0,0 +1,202 @@
+---
+description: Four-phase debugging framework that ensures root cause identification before fixes. Use when encountering bugs, test failures, unexpected behavior, or when previous fix attempts failed. Enforces investigate-first discipline ('debug this', 'fix this error', 'test is failing', 'not working').
+alwaysApply: false
+---
+# Systematic Debugger
+Find root cause before fixing. Symptom fixes are failure.
+**Iron Law:** NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
+## When to Use
+Answer IN ORDER. Stop at first match:
+1. Bug, error, or test failure? → Use this skill
+2. Unexpected behavior? → Use this skill
+3. Previous fix didn't work? → Use this skill (especially important)
+4. Performance problem? → Use this skill
+5. None of above? → Skip this skill
+**Use especially when:**
+- Under time pressure (emergencies make guessing tempting)
+- "Quick fix" seems obvious (red flag)
+- Already tried 1+ fixes that didn't work
+## The Four Phases
+Complete each phase before proceeding.
+### Phase 1: Root Cause Investigation
+**BEFORE attempting ANY fix:**
+**1. Read Error Messages Completely**
+```text
+Don't skip past errors. They often contain the exact solution.
+- Full stack trace (note line numbers, file paths)
+- Error codes and messages
+- Warnings that preceded the error
+```
+**2. Reproduce Consistently**
+| Can reproduce?  | Action                                               |
+| --------------- | ---------------------------------------------------- |
+| Yes, every time | Proceed to step 3                                    |
+| Sometimes       | Gather more data - when does it happen vs not?       |
+| Never           | Cannot debug what you cannot reproduce - gather logs |
+**3. Check Recent Changes**
+```bash
+git diff HEAD~5          # Recent code changes
+git log --oneline -10    # Recent commits
+```
+What changed that could cause this? Dependencies? Config? Environment?
+**4. Trace Data Flow (Root Cause Tracing)**
+When error is deep in call stack:
+```text
+Symptom: Error at line 50 in utils.js
+    ↑ Called by handler.js:120
+    ↑ Called by router.js:45
+    ↑ Called by app.js:10 ← ROOT CAUSE: bad input here
+```
+**Technique:**
+1. Find where error occurs (symptom)
+2. Ask: "What called this with bad data?"
+3. Trace up until you find the SOURCE
+4. Fix at source, not at symptom
+**5. Multi-Component Systems**
+When system has multiple layers (API → service → database):
+```bash
+# Log at EACH boundary before proposing fixes
+echo "=== Layer 1 (API): request=$REQUEST ==="
+echo "=== Layer 2 (Service): input=$INPUT ==="
+echo "=== Layer 3 (DB): query=$QUERY ==="
+```
+Run once to find WHERE it breaks. Then investigate that layer.
+### Phase 2: Pattern Analysis
+**1. Find Working Examples**
+Locate similar working code in same codebase. What works that's similar?
+**2. Identify Differences**
+| Working code     | Broken code    | Could this matter? |
+| ---------------- | -------------- | ------------------ |
+| Uses async/await | Uses callbacks | Yes - timing       |
+| Validates input  | No validation  | Yes - bad data     |
+List ALL differences. Don't assume "that can't matter."
+### Phase 3: Hypothesis Testing
+**1. Form Single Hypothesis**
+Write it down: "I think X is the root cause because Y"
+Be specific:
+- ❌ "Something's wrong with the database"
+- ✅ "Connection pool exhausted because connections aren't released in error path"
+**2. Test Minimally**
+| Rule                     | Why                    |
+| ------------------------ | ---------------------- |
+| ONE change at a time     | Isolate what works     |
+| Smallest possible change | Avoid side effects     |
+| Don't bundle fixes       | Can't tell what helped |
+**3. Evaluate Result**
+| Result          | Action                                  |
+| --------------- | --------------------------------------- |
+| Fixed           | Phase 4 (verify)                        |
+| Not fixed       | NEW hypothesis (return to 3.1)          |
+| Partially fixed | Found one issue, continue investigating |
+### Phase 4: Implementation
+**1. Create Failing Test**
+Before fixing, write test that fails due to the bug:
+```javascript
+it('handles empty input without crashing', () => {
+  // This test should FAIL before fix, PASS after
+  expect(() => processData('')).not.toThrow();
+});
+```
+**2. Implement Fix**
+- Address ROOT CAUSE identified in Phase 1
+- ONE change
+- No "while I'm here" improvements
+**3. Verify**
+- [ ] New test passes
+- [ ] Existing tests still pass
+- [ ] Issue actually resolved (not just test passing)
+**4. If Fix Doesn't Work**
+| Fix attempts | Action                                   |
+| ------------ | ---------------------------------------- |
+| 1-2          | Return to Phase 1 with new information   |
+| 3+           | STOP - Question architecture (see below) |
+**5. After 3+ Failed Fixes: Question Architecture**
+Pattern indicating architectural problem:
+- Each fix reveals new coupling/shared state
+- Fixes require "massive refactoring"
+- Each fix creates new symptoms elsewhere
+**STOP and ask:**
+- Is this pattern fundamentally sound?
+- Should we refactor vs. continue patching?
+- Discuss with user before more fix attempts
+## Red Flags - STOP Immediately
+If you catch yourself thinking:
+| Thought                                        | Reality                           |
+| ---------------------------------------------- | --------------------------------- |
+| "Quick fix for now, investigate later"         | Investigate NOW or you never will |
+| "Just try changing X"                          | That's guessing, not debugging    |
+| "I'll add multiple fixes and test"             | Can't isolate what worked         |
+| "I don't fully understand but this might work" | You need to understand first      |
+| "One more fix attempt" (after 2+ failures)     | 3+ failures = wrong approach      |
+**ALL mean: STOP. Return to Phase 1.**
+## Quick Reference
+| Phase             | Key Question                          | Success Criteria                   |
+| ----------------- | ------------------------------------- | ---------------------------------- |
+| 1. Root Cause     | "WHY is this happening?"              | Understand cause, not just symptom |
+| 2. Pattern        | "What's different from working code?" | Identified key differences         |
+| 3. Hypothesis     | "Is my theory correct?"               | Confirmed or formed new theory     |
+| 4. Implementation | "Does the fix work?"                  | Test passes, issue resolved        |

package/templates/cursor/rules/safeword-enforcing-tdd.mdc ADDED Viewed

@@ -0,0 +1,207 @@
+---
+description: Use when implementing features, fixing bugs, or making code changes. Ensures scope is defined before coding, then enforces RED → GREEN → REFACTOR test discipline. Triggers: 'implement', 'add', 'build', 'create', 'fix', 'change', 'feature', 'bug'.
+alwaysApply: false
+---
+# TDD Enforcer
+Scope work before coding. Write tests before implementation.
+**Iron Law:** NO IMPLEMENTATION UNTIL SCOPE IS DEFINED AND TEST FAILS
+## When to Use
+Answer IN ORDER. Stop at first match:
+1. Implementing new feature? → Use this skill
+2. Fixing bug? → Use this skill
+3. Adding enhancement? → Use this skill
+4. Refactoring? → Use this skill
+5. Research/investigation only? → Skip this skill
+---
+## Phase 0: TRIAGE
+**Purpose:** Determine work level and ensure scope exists.
+### Step 1: Identify Level
+Answer IN ORDER. Stop at first match:
+| Question                                 | If Yes →       |
+| ---------------------------------------- | -------------- |
+| User-facing feature with business value? | **L2 Feature** |
+| Bug, improvement, internal, or refactor? | **L1 Task**    |
+| Typo, config, or trivial change?         | **L0 Micro**   |
+### Step 2: Check/Create Artifacts
+| Level  | Required Artifacts                                              | Test Location                   |
+| ------ | --------------------------------------------------------------- | ------------------------------- |
+| **L2** | Feature Spec + Test Definitions (+ Design Doc if 3+ components) | `test-definitions/feature-*.md` |
+| **L1** | Task Spec                                                       | Inline in spec                  |
+| **L0** | Task Spec (minimal)                                             | Existing tests                  |
+**Locations:**
+- Specs: `.safeword/planning/specs/`
+- Test definitions: `.safeword/planning/test-definitions/`
+### Exit Criteria
+- [ ] Level identified (L0/L1/L2)
+- [ ] Spec exists with "Out of Scope" defined
+- [ ] L2: Test definitions file exists
+- [ ] L1: Test scenarios in spec
+- [ ] L0: Existing test coverage confirmed
+---
+## Phase 1: RED
+**Iron Law:** NO IMPLEMENTATION UNTIL TEST FAILS FOR THE RIGHT REASON
+**Protocol:**
+1. Pick ONE test from spec (L1) or test definitions (L2)
+2. Write test code
+3. Run test
+4. Verify: fails because behavior missing (not syntax error)
+5. Commit: `test: [behavior]`
+**For L0:** No new test needed. Confirm existing tests pass, then proceed to Phase 2.
+**Exit Criteria:**
+- [ ] Test written and executed
+- [ ] Test fails for RIGHT reason (behavior missing)
+- [ ] Committed: `test: [behavior]`
+**Red Flags → STOP:**
+| Flag                    | Action                           |
+| ----------------------- | -------------------------------- |
+| Test passes immediately | Rewrite - you're testing nothing |
+| Syntax error            | Fix syntax, not behavior         |
+| Wrote implementation    | Delete it, return to test        |
+| Multiple tests          | Pick ONE                         |
+---
+## Phase 2: GREEN
+**Iron Law:** ONLY WRITE CODE THE TEST REQUIRES
+**Protocol:**
+1. Write minimal code to pass test
+2. Run test → verify pass
+3. Commit: `feat:` or `fix:`
+**Exit Criteria:**
+- [ ] Test passes
+- [ ] No extra code
+- [ ] No hardcoded/mock values
+- [ ] Committed
+### Verification Gate
+**Before claiming GREEN:** Evidence before claims, always.
+```text
+✅ CORRECT                          ❌ WRONG
+─────────────────────────────────   ─────────────────────────────────
+Run: npm test                       "Tests should pass now"
+Output: ✓ 34/34 tests pass          "I'm confident this works"
+Claim: "All tests pass"             "Tests pass" (no output shown)
+```
+**The Rule:** If you haven't run the verification command in this response, you cannot claim it passes.
+**Red Flags → STOP:**
+| Flag                        | Action                                 |
+| --------------------------- | -------------------------------------- |
+| "should", "probably" claims | Run command, show output first         |
+| "Done!" before verification | Run command, show output first         |
+| "Just in case" code         | Delete it                              |
+| Multiple functions          | Delete extras                          |
+| Refactoring                 | Stop - that's Phase 3                  |
+| Test still fails            | Debug (→ debugging skill if stuck)     |
+| Hardcoded value             | Implement real logic (see below)       |
+### Anti-Pattern: Mock Implementations
+LLMs sometimes hardcode values to pass tests. This is not TDD.
+```typescript
+// ❌ BAD - Hardcoded to pass test
+function calculateDiscount(amount, tier) {
+  return 80; // Passes test but isn't real
+}
+// ✅ GOOD - Actual logic
+function calculateDiscount(amount, tier) {
+  if (tier === 'VIP') return amount * 0.8;
+  return amount;
+}
+```
+Fix mocks immediately. The next test cycle will catch them, but they're technical debt.
+---
+## Phase 3: REFACTOR
+**Protocol:**
+1. Tests pass before changes
+2. Improve code (rename, extract, dedupe)
+3. Tests pass after changes
+4. Commit if changed: `refactor: [improvement]`
+**Exit Criteria:**
+- [ ] Tests still pass
+- [ ] Code cleaner (or no changes needed)
+- [ ] Committed (if changed)
+**NOT Allowed:** New behavior, changing assertions, adding tests.
+---
+## Phase 4: ITERATE
+```text
+More tests in spec/test-definitions?
+├─ Yes → Return to Phase 1
+└─ No → All "Done When" / AC checked?
+        ├─ Yes → Complete
+        └─ No → Update spec, return to Phase 0
+```
+For L2: Update test definition status (✅/⏭️/❌/🔴) as tests pass.
+---
+## Quick Reference
+| Phase       | Key Question                     | Gate                          |
+| ----------- | -------------------------------- | ----------------------------- |
+| 0. TRIAGE   | What level? Is scope defined?    | Spec exists with boundaries   |
+| 1. RED      | Does test fail for right reason? | Test fails (behavior missing) |
+| 2. GREEN    | Does minimal code pass?          | Test passes, no extras        |
+| 3. REFACTOR | Is code clean?                   | Tests still pass              |
+| 4. ITERATE  | More tests?                      | All done → complete           |
+---
+## Integration
+| Scenario                | Handoff           |
+| ----------------------- | ----------------- |
+| Test fails unexpectedly | → debugging skill |
+| Review needed           | → quality-reviewer    |
+| Scope expanding         | → Update spec first   |