npm - safeword - Versions diffs - 0.8.3 → 0.8.5 - Mend

safeword 0.8.3 → 0.8.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31) hide show

package/templates/cursor/rules/safeword-debugging.mdc ADDED Viewed

@@ -0,0 +1,202 @@
+---
+description: Four-phase debugging framework that ensures root cause identification before fixes. Use when encountering bugs, test failures, unexpected behavior, or when previous fix attempts failed. Enforces investigate-first discipline ('debug this', 'fix this error', 'test is failing', 'not working').
+alwaysApply: false
+---
+# Systematic Debugger
+Find root cause before fixing. Symptom fixes are failure.
+**Iron Law:** NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
+## When to Use
+Answer IN ORDER. Stop at first match:
+1. Bug, error, or test failure? → Use this skill
+2. Unexpected behavior? → Use this skill
+3. Previous fix didn't work? → Use this skill (especially important)
+4. Performance problem? → Use this skill
+5. None of above? → Skip this skill
+**Use especially when:**
+- Under time pressure (emergencies make guessing tempting)
+- "Quick fix" seems obvious (red flag)
+- Already tried 1+ fixes that didn't work
+## The Four Phases
+Complete each phase before proceeding.
+### Phase 1: Root Cause Investigation
+**BEFORE attempting ANY fix:**
+**1. Read Error Messages Completely**
+```text
+Don't skip past errors. They often contain the exact solution.
+- Full stack trace (note line numbers, file paths)
+- Error codes and messages
+- Warnings that preceded the error
+```
+**2. Reproduce Consistently**
+| Can reproduce?  | Action                                               |
+| --------------- | ---------------------------------------------------- |
+| Yes, every time | Proceed to step 3                                    |
+| Sometimes       | Gather more data - when does it happen vs not?       |
+| Never           | Cannot debug what you cannot reproduce - gather logs |
+**3. Check Recent Changes**
+```bash
+git diff HEAD~5          # Recent code changes
+git log --oneline -10    # Recent commits
+```
+What changed that could cause this? Dependencies? Config? Environment?
+**4. Trace Data Flow (Root Cause Tracing)**
+When error is deep in call stack:
+```text
+Symptom: Error at line 50 in utils.js
+    ↑ Called by handler.js:120
+    ↑ Called by router.js:45
+    ↑ Called by app.js:10 ← ROOT CAUSE: bad input here
+```
+**Technique:**
+1. Find where error occurs (symptom)
+2. Ask: "What called this with bad data?"
+3. Trace up until you find the SOURCE
+4. Fix at source, not at symptom
+**5. Multi-Component Systems**
+When system has multiple layers (API → service → database):
+```bash
+# Log at EACH boundary before proposing fixes
+echo "=== Layer 1 (API): request=$REQUEST ==="
+echo "=== Layer 2 (Service): input=$INPUT ==="
+echo "=== Layer 3 (DB): query=$QUERY ==="
+```
+Run once to find WHERE it breaks. Then investigate that layer.
+### Phase 2: Pattern Analysis
+**1. Find Working Examples**
+Locate similar working code in same codebase. What works that's similar?
+**2. Identify Differences**
+| Working code     | Broken code    | Could this matter? |
+| ---------------- | -------------- | ------------------ |
+| Uses async/await | Uses callbacks | Yes - timing       |
+| Validates input  | No validation  | Yes - bad data     |
+List ALL differences. Don't assume "that can't matter."
+### Phase 3: Hypothesis Testing
+**1. Form Single Hypothesis**
+Write it down: "I think X is the root cause because Y"
+Be specific:
+- ❌ "Something's wrong with the database"
+- ✅ "Connection pool exhausted because connections aren't released in error path"
+**2. Test Minimally**
+| Rule                     | Why                    |
+| ------------------------ | ---------------------- |
+| ONE change at a time     | Isolate what works     |
+| Smallest possible change | Avoid side effects     |
+| Don't bundle fixes       | Can't tell what helped |
+**3. Evaluate Result**
+| Result          | Action                                  |
+| --------------- | --------------------------------------- |
+| Fixed           | Phase 4 (verify)                        |
+| Not fixed       | NEW hypothesis (return to 3.1)          |
+| Partially fixed | Found one issue, continue investigating |
+### Phase 4: Implementation
+**1. Create Failing Test**
+Before fixing, write test that fails due to the bug:
+```javascript
+it('handles empty input without crashing', () => {
+  // This test should FAIL before fix, PASS after
+  expect(() => processData('')).not.toThrow();
+});
+```
+**2. Implement Fix**
+- Address ROOT CAUSE identified in Phase 1
+- ONE change
+- No "while I'm here" improvements
+**3. Verify**
+- [ ] New test passes
+- [ ] Existing tests still pass
+- [ ] Issue actually resolved (not just test passing)
+**4. If Fix Doesn't Work**
+| Fix attempts | Action                                   |
+| ------------ | ---------------------------------------- |
+| 1-2          | Return to Phase 1 with new information   |
+| 3+           | STOP - Question architecture (see below) |
+**5. After 3+ Failed Fixes: Question Architecture**
+Pattern indicating architectural problem:
+- Each fix reveals new coupling/shared state
+- Fixes require "massive refactoring"
+- Each fix creates new symptoms elsewhere
+**STOP and ask:**
+- Is this pattern fundamentally sound?
+- Should we refactor vs. continue patching?
+- Discuss with user before more fix attempts
+## Red Flags - STOP Immediately
+If you catch yourself thinking:
+| Thought                                        | Reality                           |
+| ---------------------------------------------- | --------------------------------- |
+| "Quick fix for now, investigate later"         | Investigate NOW or you never will |
+| "Just try changing X"                          | That's guessing, not debugging    |
+| "I'll add multiple fixes and test"             | Can't isolate what worked         |
+| "I don't fully understand but this might work" | You need to understand first      |
+| "One more fix attempt" (after 2+ failures)     | 3+ failures = wrong approach      |
+**ALL mean: STOP. Return to Phase 1.**
+## Quick Reference
+| Phase             | Key Question                          | Success Criteria                   |
+| ----------------- | ------------------------------------- | ---------------------------------- |
+| 1. Root Cause     | "WHY is this happening?"              | Understand cause, not just symptom |
+| 2. Pattern        | "What's different from working code?" | Identified key differences         |
+| 3. Hypothesis     | "Is my theory correct?"               | Confirmed or formed new theory     |
+| 4. Implementation | "Does the fix work?"                  | Test passes, issue resolved        |

package/templates/cursor/rules/safeword-enforcing-tdd.mdc ADDED Viewed

@@ -0,0 +1,207 @@
+---
+description: Use when implementing features, fixing bugs, or making code changes. Ensures scope is defined before coding, then enforces RED → GREEN → REFACTOR test discipline. Triggers: 'implement', 'add', 'build', 'create', 'fix', 'change', 'feature', 'bug'.
+alwaysApply: false
+---
+# TDD Enforcer
+Scope work before coding. Write tests before implementation.
+**Iron Law:** NO IMPLEMENTATION UNTIL SCOPE IS DEFINED AND TEST FAILS
+## When to Use
+Answer IN ORDER. Stop at first match:
+1. Implementing new feature? → Use this skill
+2. Fixing bug? → Use this skill
+3. Adding enhancement? → Use this skill
+4. Refactoring? → Use this skill
+5. Research/investigation only? → Skip this skill
+---
+## Phase 0: TRIAGE
+**Purpose:** Determine work level and ensure scope exists.
+### Step 1: Identify Level
+Answer IN ORDER. Stop at first match:
+| Question                                 | If Yes →       |
+| ---------------------------------------- | -------------- |
+| User-facing feature with business value? | **L2 Feature** |
+| Bug, improvement, internal, or refactor? | **L1 Task**    |
+| Typo, config, or trivial change?         | **L0 Micro**   |
+### Step 2: Check/Create Artifacts
+| Level  | Required Artifacts                                              | Test Location                   |
+| ------ | --------------------------------------------------------------- | ------------------------------- |
+| **L2** | Feature Spec + Test Definitions (+ Design Doc if 3+ components) | `test-definitions/feature-*.md` |
+| **L1** | Task Spec                                                       | Inline in spec                  |
+| **L0** | Task Spec (minimal)                                             | Existing tests                  |
+**Locations:**
+- Specs: `.safeword/planning/specs/`
+- Test definitions: `.safeword/planning/test-definitions/`
+### Exit Criteria
+- [ ] Level identified (L0/L1/L2)
+- [ ] Spec exists with "Out of Scope" defined
+- [ ] L2: Test definitions file exists
+- [ ] L1: Test scenarios in spec
+- [ ] L0: Existing test coverage confirmed
+---
+## Phase 1: RED
+**Iron Law:** NO IMPLEMENTATION UNTIL TEST FAILS FOR THE RIGHT REASON
+**Protocol:**
+1. Pick ONE test from spec (L1) or test definitions (L2)
+2. Write test code
+3. Run test
+4. Verify: fails because behavior missing (not syntax error)
+5. Commit: `test: [behavior]`
+**For L0:** No new test needed. Confirm existing tests pass, then proceed to Phase 2.
+**Exit Criteria:**
+- [ ] Test written and executed
+- [ ] Test fails for RIGHT reason (behavior missing)
+- [ ] Committed: `test: [behavior]`
+**Red Flags → STOP:**
+| Flag                    | Action                           |
+| ----------------------- | -------------------------------- |
+| Test passes immediately | Rewrite - you're testing nothing |
+| Syntax error            | Fix syntax, not behavior         |
+| Wrote implementation    | Delete it, return to test        |
+| Multiple tests          | Pick ONE                         |
+---
+## Phase 2: GREEN
+**Iron Law:** ONLY WRITE CODE THE TEST REQUIRES
+**Protocol:**
+1. Write minimal code to pass test
+2. Run test → verify pass
+3. Commit: `feat:` or `fix:`
+**Exit Criteria:**
+- [ ] Test passes
+- [ ] No extra code
+- [ ] No hardcoded/mock values
+- [ ] Committed
+### Verification Gate
+**Before claiming GREEN:** Evidence before claims, always.
+```text
+✅ CORRECT                          ❌ WRONG
+─────────────────────────────────   ─────────────────────────────────
+Run: npm test                       "Tests should pass now"
+Output: ✓ 34/34 tests pass          "I'm confident this works"
+Claim: "All tests pass"             "Tests pass" (no output shown)
+```
+**The Rule:** If you haven't run the verification command in this response, you cannot claim it passes.
+**Red Flags → STOP:**
+| Flag                        | Action                                 |
+| --------------------------- | -------------------------------------- |
+| "should", "probably" claims | Run command, show output first         |
+| "Done!" before verification | Run command, show output first         |
+| "Just in case" code         | Delete it                              |
+| Multiple functions          | Delete extras                          |
+| Refactoring                 | Stop - that's Phase 3                  |
+| Test still fails            | Debug (→ debugging skill if stuck)     |
+| Hardcoded value             | Implement real logic (see below)       |
+### Anti-Pattern: Mock Implementations
+LLMs sometimes hardcode values to pass tests. This is not TDD.
+```typescript
+// ❌ BAD - Hardcoded to pass test
+function calculateDiscount(amount, tier) {
+  return 80; // Passes test but isn't real
+}
+// ✅ GOOD - Actual logic
+function calculateDiscount(amount, tier) {
+  if (tier === 'VIP') return amount * 0.8;
+  return amount;
+}
+```
+Fix mocks immediately. The next test cycle will catch them, but they're technical debt.
+---
+## Phase 3: REFACTOR
+**Protocol:**
+1. Tests pass before changes
+2. Improve code (rename, extract, dedupe)
+3. Tests pass after changes
+4. Commit if changed: `refactor: [improvement]`
+**Exit Criteria:**
+- [ ] Tests still pass
+- [ ] Code cleaner (or no changes needed)
+- [ ] Committed (if changed)
+**NOT Allowed:** New behavior, changing assertions, adding tests.
+---
+## Phase 4: ITERATE
+```text
+More tests in spec/test-definitions?
+├─ Yes → Return to Phase 1
+└─ No → All "Done When" / AC checked?
+        ├─ Yes → Complete
+        └─ No → Update spec, return to Phase 0
+```
+For L2: Update test definition status (✅/⏭️/❌/🔴) as tests pass.
+---
+## Quick Reference
+| Phase       | Key Question                     | Gate                          |
+| ----------- | -------------------------------- | ----------------------------- |
+| 0. TRIAGE   | What level? Is scope defined?    | Spec exists with boundaries   |
+| 1. RED      | Does test fail for right reason? | Test fails (behavior missing) |
+| 2. GREEN    | Does minimal code pass?          | Test passes, no extras        |
+| 3. REFACTOR | Is code clean?                   | Tests still pass              |
+| 4. ITERATE  | More tests?                      | All done → complete           |
+---
+## Integration
+| Scenario                | Handoff           |
+| ----------------------- | ----------------- |
+| Test fails unexpectedly | → debugging skill |
+| Review needed           | → quality-reviewer    |
+| Scope expanding         | → Update spec first   |

package/templates/cursor/rules/safeword-quality-reviewer.mdc ADDED Viewed

@@ -0,0 +1,158 @@
+---
+description: Deep code quality review with web research. Use when user explicitly requests verification against latest docs ('double check against latest', 'verify versions', 'check security'), needs deeper analysis beyond automatic hook, or is working on projects without SAFEWORD.md/CLAUDE.md. Fetches current documentation, checks latest versions, and provides deep analysis (performance, security, alternatives).
+alwaysApply: false
+---
+# Quality Reviewer
+Deep quality review with web research to verify code against the latest ecosystem state.
+**Primary differentiator**: Web research to verify against current versions, documentation, and best practices.
+**Triggers**:
+- **Explicit web research request**: "double check against latest docs", "verify we're using latest version", "check for security issues"
+- **Deep dive needed**: User wants analysis beyond automatic hook (performance, architecture alternatives, trade-offs)
+- **No SAFEWORD.md/CLAUDE.md**: Projects without context files (automatic hook won't run, manual review needed)
+- **Pre-change review**: User wants review before making changes (automatic hook only triggers after changes)
+- **Model-invoked**: Claude determines web research would be valuable
+**Relationship to automatic quality hook**:
+- **Automatic hook**: Fast quality check using existing knowledge + project context (guaranteed, runs on every change)
+- **This skill**: Deep review with web research when verification against current ecosystem is needed (on-demand, 2-3 min)
+## Review Protocol
+### 1. Identify What Changed
+Understand context:
+- What files were just modified?
+- What problem is being solved?
+- What was the implementation approach?
+### 2. Read Project Standards
+```bash
+ls CLAUDE.md SAFEWORD.md ARCHITECTURE.md .claude/
+```
+Read relevant standards:
+- `CLAUDE.md` or `SAFEWORD.md` - Project-specific guidelines
+- `ARCHITECTURE.md` - Architectural principles
+- `@./.safeword/guides/code-philosophy.md` - Core coding principles
+### 3. Evaluate Correctness
+**Will it work?**
+- Does the logic make sense?
+- Are there obvious bugs?
+**Edge cases:**
+- Empty inputs, null/undefined, boundary conditions (0, -1, max)?
+- Concurrent access, network failures?
+**Error handling:**
+- Are errors caught appropriately?
+- Helpful error messages?
+- Cleanup handled (resources, connections)?
+**Logic errors:**
+- Off-by-one errors, race conditions, wrong assumptions?
+### 4. Evaluate Anti-Bloat
+- Are all dependencies necessary? Could we use stdlib/built-ins?
+- Are abstractions solving real problems or imaginary ones?
+- YAGNI: Is this feature actually needed now?
+### 5. Evaluate Elegance
+- Is the code easy to understand?
+- Are names clear and descriptive?
+- Is the intent obvious?
+- Will this be easy to change later?
+### 6. Check Standards Compliance
+**Project standards** (from CLAUDE.md/SAFEWORD.md/ARCHITECTURE.md):
+- Does it follow established patterns?
+- Does it violate any documented principles?
+**Library best practices:**
+- Are we using libraries correctly?
+- Are we following official documentation?
+### 7. Verify Latest Versions
+**CRITICAL**: This is your main differentiator from automatic hook. ALWAYS check versions.
+Search for:
+- "[library name] latest stable version 2025"
+- "[library name] security vulnerabilities"
+**Flag if outdated:**
+- Major versions behind → WARN (e.g., React 17 when 19 is stable)
+- Minor versions behind → NOTE (e.g., React 19.0.0 when 19.1.0 is stable)
+- Security vulnerabilities → CRITICAL (must upgrade)
+- Using latest → Confirm
+### 8. Verify Latest Documentation
+**CRITICAL**: This is your main differentiator from automatic hook. ALWAYS verify against current docs.
+Fetch documentation from official sources:
+- https://react.dev (for React)
+- https://vitejs.dev (for Vite)
+**Look for:**
+- Are we using deprecated APIs?
+- Are there newer, better patterns?
+- Did the library's recommendations change since training data?
+## Output Format
+**Simple question** ("is it correct?"):
+```text
+**Correctness:** ✓ Logic is sound, edge cases handled, no obvious errors.
+```
+**Full review** ("double check and critique"):
+```markdown
+## Quality Review
+**Correctness:** [✓/⚠️/❌] [Brief assessment]
+**Anti-Bloat:** [✓/⚠️/❌] [Brief assessment]
+**Elegance:** [✓/⚠️/❌] [Brief assessment]
+**Standards:** [✓/⚠️/❌] [Brief assessment]
+**Versions:** [✓/⚠️/❌] [Latest version check with WebSearch]
+**Documentation:** [✓/⚠️/❌] [Current docs check with WebFetch]
+**Verdict:** [APPROVE / REQUEST CHANGES / NEEDS DISCUSSION]
+**Critical issues:** [List or "None"]
+**Suggested improvements:** [List or "None"]
+```
+## Critical Reminders
+1. **Primary value: Web research** - Verify against current ecosystem (versions, docs, security)
+2. **Complement automatic hook** - Hook does fast check with existing knowledge, you do deep dive with web research
+3. **Explicit triggers matter** - "double check against latest docs", "verify versions", "check security" = invoke web research
+4. **Always check latest docs** - Verify patterns are current, not outdated
+5. **Always verify versions** - Flag outdated dependencies
+6. **Be thorough but concise** - Cover all areas but keep explanations brief
+7. **Provide actionable feedback** - Specific line numbers, concrete suggestions
+8. **Clear verdict** - Always end with APPROVE/REQUEST CHANGES/NEEDS DISCUSSION