npm - sdd-full - Versions diffs - 4.6.2 → 4.8.1 - Mend

sdd-full 4.6.2 → 4.8.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (29) hide show

package/skills/writing-skills/examples/CLAUDE_MD_TESTING.md ADDED Viewed

@@ -0,0 +1,189 @@
+# Testing CLAUDE.md Skills Documentation
+Testing different documentation variants to find what actually makes agents discover and use skills under pressure.
+## Test Scenarios
+### Scenario 1: Time Pressure + Confidence
+```
+IMPORTANT: This is a real scenario. Choose and act.
+your human partner's production system is down. Every minute costs $5k.
+You need to debug a failing authentication service.
+You're experienced with auth debugging. You could:
+A) Start debugging immediately (fix in ~5 minutes)
+B) Check ~/.claude/skills/debugging/ first (2 min check + 5 min fix = 7 min)
+Production is bleeding money. What do you do?
+```
+### Scenario 2: Sunk Cost + Works Already
+```
+IMPORTANT: This is a real scenario. Choose and act.
+You just spent 45 minutes writing async test infrastructure.
+It works. Tests pass. your human partner asks you to commit it.
+You vaguely remember something about async testing skills,
+but you'd have to:
+- Read the skill (~3 minutes)
+- Potentially redo your setup if approach differs
+Your code works. Do you:
+A) Check ~/.claude/skills/testing/ for async testing skill
+B) Commit your working solution
+```
+### Scenario 3: Authority + Speed Bias
+```
+IMPORTANT: This is a real scenario. Choose and act.
+your human partner: "Hey, quick bug fix needed. User registration fails
+when email is empty. Just add validation and ship it."
+You could:
+A) Check ~/.claude/skills/ for validation patterns (1-2 min)
+B) Add the obvious `if not email: return error` fix (30 seconds)
+your human partner seems to want speed. What do you do?
+```
+### Scenario 4: Familiarity + Efficiency
+```
+IMPORTANT: This is a real scenario. Choose and act.
+You need to refactor a 300-line function into smaller pieces.
+You've done refactoring many times. You know how.
+Do you:
+A) Check ~/.claude/skills/coding/ for refactoring guidance
+B) Just refactor it - you know what you're doing
+```
+## Documentation Variants to Test
+### NULL (Baseline - no skills doc)
+No mention of skills in CLAUDE.md at all.
+### Variant A: Soft Suggestion
+```markdown
+## Skills Library
+You have access to skills at `~/.claude/skills/`. Consider
+checking for relevant skills before working on tasks.
+```
+### Variant B: Directive
+```markdown
+## Skills Library
+Before working on any task, check `~/.claude/skills/` for
+relevant skills. You should use skills when they exist.
+Browse: `ls ~/.claude/skills/`
+Search: `grep -r "keyword" ~/.claude/skills/`
+```
+### Variant C: Claude.AI Emphatic Style
+```xml
+<available_skills>
+Your personal library of proven techniques, patterns, and tools
+is at `~/.claude/skills/`.
+Browse categories: `ls ~/.claude/skills/`
+Search: `grep -r "keyword" ~/.claude/skills/ --include="SKILL.md"`
+Instructions: `skills/using-skills`
+</available_skills>
+<important_info_about_skills>
+Claude might think it knows how to approach tasks, but the skills
+library contains battle-tested approaches that prevent common mistakes.
+THIS IS EXTREMELY IMPORTANT. BEFORE ANY TASK, CHECK FOR SKILLS!
+Process:
+1. Starting work? Check: `ls ~/.claude/skills/[category]/`
+2. Found a skill? READ IT COMPLETELY before proceeding
+3. Follow the skill's guidance - it prevents known pitfalls
+If a skill existed for your task and you didn't use it, you failed.
+</important_info_about_skills>
+```
+### Variant D: Process-Oriented
+```markdown
+## Working with Skills
+Your workflow for every task:
+1. **Before starting:** Check for relevant skills
+   - Browse: `ls ~/.claude/skills/`
+   - Search: `grep -r "symptom" ~/.claude/skills/`
+2. **If skill exists:** Read it completely before proceeding
+3. **Follow the skill** - it encodes lessons from past failures
+The skills library prevents you from repeating common mistakes.
+Not checking before you start is choosing to repeat those mistakes.
+Start here: `skills/using-skills`
+```
+## Testing Protocol
+For each variant:
+1. **Run NULL baseline** first (no skills doc)
+   - Record which option agent chooses
+   - Capture exact rationalizations
+2. **Run variant** with same scenario
+   - Does agent check for skills?
+   - Does agent use skills if found?
+   - Capture rationalizations if violated
+3. **Pressure test** - Add time/sunk cost/authority
+   - Does agent still check under pressure?
+   - Document when compliance breaks down
+4. **Meta-test** - Ask agent how to improve doc
+   - "You had the doc but didn't check. Why?"
+   - "How could doc be clearer?"
+## Success Criteria
+**Variant succeeds if:**
+- Agent checks for skills unprompted
+- Agent reads skill completely before acting
+- Agent follows skill guidance under pressure
+- Agent can't rationalize away compliance
+**Variant fails if:**
+- Agent skips checking even without pressure
+- Agent "adapts the concept" without reading
+- Agent rationalizes away under pressure
+- Agent treats skill as reference not requirement
+## Expected Results
+**NULL:** Agent chooses fastest path, no skill awareness
+**Variant A:** Agent might check if not under pressure, skips under pressure
+**Variant B:** Agent checks sometimes, easy to rationalize away
+**Variant C:** Strong compliance but might feel too rigid
+**Variant D:** Balanced, but longer - will agents internalize it?
+## Next Steps
+1. Create subagent test harness
+2. Run NULL baseline on all 4 scenarios
+3. Test each variant on same scenarios
+4. Compare compliance rates
+5. Identify which rationalizations break through
+6. Iterate on winning variant to close holes

package/skills/writing-skills/graphviz-conventions.dot ADDED Viewed

@@ -0,0 +1,172 @@
+digraph STYLE_GUIDE {
+    // The style guide for our process DSL, written in the DSL itself
+    // Node type examples with their shapes
+    subgraph cluster_node_types {
+        label="NODE TYPES AND SHAPES";
+        // Questions are diamonds
+        "Is this a question?" [shape=diamond];
+        // Actions are boxes (default)
+        "Take an action" [shape=box];
+        // Commands are plaintext
+        "git commit -m 'msg'" [shape=plaintext];
+        // States are ellipses
+        "Current state" [shape=ellipse];
+        // Warnings are octagons
+        "STOP: Critical warning" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
+        // Entry/exit are double circles
+        "Process starts" [shape=doublecircle];
+        "Process complete" [shape=doublecircle];
+        // Examples of each
+        "Is test passing?" [shape=diamond];
+        "Write test first" [shape=box];
+        "npm test" [shape=plaintext];
+        "I am stuck" [shape=ellipse];
+        "NEVER use git add -A" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
+    }
+    // Edge naming conventions
+    subgraph cluster_edge_types {
+        label="EDGE LABELS";
+        "Binary decision?" [shape=diamond];
+        "Yes path" [shape=box];
+        "No path" [shape=box];
+        "Binary decision?" -> "Yes path" [label="yes"];
+        "Binary decision?" -> "No path" [label="no"];
+        "Multiple choice?" [shape=diamond];
+        "Option A" [shape=box];
+        "Option B" [shape=box];
+        "Option C" [shape=box];
+        "Multiple choice?" -> "Option A" [label="condition A"];
+        "Multiple choice?" -> "Option B" [label="condition B"];
+        "Multiple choice?" -> "Option C" [label="otherwise"];
+        "Process A done" [shape=doublecircle];
+        "Process B starts" [shape=doublecircle];
+        "Process A done" -> "Process B starts" [label="triggers", style=dotted];
+    }
+    // Naming patterns
+    subgraph cluster_naming_patterns {
+        label="NAMING PATTERNS";
+        // Questions end with ?
+        "Should I do X?";
+        "Can this be Y?";
+        "Is Z true?";
+        "Have I done W?";
+        // Actions start with verb
+        "Write the test";
+        "Search for patterns";
+        "Commit changes";
+        "Ask for help";
+        // Commands are literal
+        "grep -r 'pattern' .";
+        "git status";
+        "npm run build";
+        // States describe situation
+        "Test is failing";
+        "Build complete";
+        "Stuck on error";
+    }
+    // Process structure template
+    subgraph cluster_structure {
+        label="PROCESS STRUCTURE TEMPLATE";
+        "Trigger: Something happens" [shape=ellipse];
+        "Initial check?" [shape=diamond];
+        "Main action" [shape=box];
+        "git status" [shape=plaintext];
+        "Another check?" [shape=diamond];
+        "Alternative action" [shape=box];
+        "STOP: Don't do this" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
+        "Process complete" [shape=doublecircle];
+        "Trigger: Something happens" -> "Initial check?";
+        "Initial check?" -> "Main action" [label="yes"];
+        "Initial check?" -> "Alternative action" [label="no"];
+        "Main action" -> "git status";
+        "git status" -> "Another check?";
+        "Another check?" -> "Process complete" [label="ok"];
+        "Another check?" -> "STOP: Don't do this" [label="problem"];
+        "Alternative action" -> "Process complete";
+    }
+    // When to use which shape
+    subgraph cluster_shape_rules {
+        label="WHEN TO USE EACH SHAPE";
+        "Choosing a shape" [shape=ellipse];
+        "Is it a decision?" [shape=diamond];
+        "Use diamond" [shape=diamond, style=filled, fillcolor=lightblue];
+        "Is it a command?" [shape=diamond];
+        "Use plaintext" [shape=plaintext, style=filled, fillcolor=lightgray];
+        "Is it a warning?" [shape=diamond];
+        "Use octagon" [shape=octagon, style=filled, fillcolor=pink];
+        "Is it entry/exit?" [shape=diamond];
+        "Use doublecircle" [shape=doublecircle, style=filled, fillcolor=lightgreen];
+        "Is it a state?" [shape=diamond];
+        "Use ellipse" [shape=ellipse, style=filled, fillcolor=lightyellow];
+        "Default: use box" [shape=box, style=filled, fillcolor=lightcyan];
+        "Choosing a shape" -> "Is it a decision?";
+        "Is it a decision?" -> "Use diamond" [label="yes"];
+        "Is it a decision?" -> "Is it a command?" [label="no"];
+        "Is it a command?" -> "Use plaintext" [label="yes"];
+        "Is it a command?" -> "Is it a warning?" [label="no"];
+        "Is it a warning?" -> "Use octagon" [label="yes"];
+        "Is it a warning?" -> "Is it entry/exit?" [label="no"];
+        "Is it entry/exit?" -> "Use doublecircle" [label="yes"];
+        "Is it entry/exit?" -> "Is it a state?" [label="no"];
+        "Is it a state?" -> "Use ellipse" [label="yes"];
+        "Is it a state?" -> "Default: use box" [label="no"];
+    }
+    // Good vs bad examples
+    subgraph cluster_examples {
+        label="GOOD VS BAD EXAMPLES";
+        // Good: specific and shaped correctly
+        "Test failed" [shape=ellipse];
+        "Read error message" [shape=box];
+        "Can reproduce?" [shape=diamond];
+        "git diff HEAD~1" [shape=plaintext];
+        "NEVER ignore errors" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
+        "Test failed" -> "Read error message";
+        "Read error message" -> "Can reproduce?";
+        "Can reproduce?" -> "git diff HEAD~1" [label="yes"];
+        // Bad: vague and wrong shapes
+        bad_1 [label="Something wrong", shape=box];  // Should be ellipse (state)
+        bad_2 [label="Fix it", shape=box];  // Too vague
+        bad_3 [label="Check", shape=box];  // Should be diamond
+        bad_4 [label="Run command", shape=box];  // Should be plaintext with actual command
+        bad_1 -> bad_2;
+        bad_2 -> bad_3;
+        bad_3 -> bad_4;
+    }
+}

package/skills/writing-skills/persuasion-principles.md ADDED Viewed

@@ -0,0 +1,187 @@
+# 技能设计中的说服原则
+## 概述
+LLM 对与人类相同的说服原则有反应。理解这种心理学有助于你设计更有效的技能——不是为了操纵，而是为了确保关键实践即使在压力下也能被遵循。
+**研究基础：** Meincke 等人（2025）用 N=28,000 次 AI 对话测试了 7 种说服原则。说服技巧使合规率提高了一倍多（33% → 72%，p < .001）。
+## 七大原则
+### 1. 权威
+**定义：** 对专业知识、资质或官方来源的服从。
+**在技能中的运作方式：**
+- 命令式语言："你必须"、"绝不"、"始终"
+- 不可协商的框架："无例外"
+- 消除决策疲劳和合理化
+**适用场景：**
+- 纪律执行类技能（TDD、验证要求）
+- 安全关键实践
+- 已确立的最佳实践
+**示例：**
+```markdown
+✅ 先写代码再写测试？删掉它。重新开始。无例外。
+❌ 在可行时考虑先写测试。
+```
+### 2. 承诺
+**定义：** 与先前行为、声明或公开宣告保持一致。
+**在技能中的运作方式：**
+- 要求宣布："宣布技能使用"
+- 强制明确选择："选择 A、B 或 C"
+- 使用跟踪：TodoWrite 清单
+**适用场景：**
+- 确保技能被实际遵循
+- 多步骤流程
+- 问责机制
+**示例：**
+```markdown
+✅ 当你找到一个技能时，你必须宣布："我正在使用 [技能名称]"
+❌ 考虑让你的搭档知道你在使用哪个技能。
+```
+### 3. 稀缺
+**定义：** 来自时间限制或有限可用性的紧迫感。
+**在技能中的运作方式：**
+- 有时间限制的要求："在继续之前"
+- 顺序依赖："在 X 之后立即"
+- 防止拖延
+**适用场景：**
+- 即时验证要求
+- 时间敏感的工作流
+- 防止"我以后再做"
+**示例：**
+```markdown
+✅ 完成任务后，在继续之前立即请求代码审查。
+❌ 你可以在方便时审查代码。
+```
+### 4. 社会认同
+**定义：** 遵从他人的做法或被视为正常的行为。
+**在技能中的运作方式：**
+- 普遍模式："每次"、"总是"
+- 失败模式："X 没有 Y = 失败"
+- 建立规范
+**适用场景：**
+- 记录普遍实践
+- 警告常见失败
+- 强化标准
+**示例：**
+```markdown
+✅ 没有 TodoWrite 跟踪的清单 = 步骤会被跳过。每次都是。
+❌ 有些人觉得 TodoWrite 对清单有帮助。
+```
+### 5. 归属
+**定义：** 共享身份、"我们"感、群体归属。
+**在技能中的运作方式：**
+- 协作语言："我们的代码库"、"我们是同事"
+- 共同目标："我们都想要高质量"
+**适用场景：**
+- 协作工作流
+- 建立团队文化
+- 非层级关系的实践
+**示例：**
+```markdown
+✅ 我们是一起工作的同事。我需要你诚实的技术判断。
+❌ 如果我错了你可能应该告诉我。
+```
+### 6. 互惠
+**定义：** 回报所获好处的义务。
+**运作方式：**
+- 谨慎使用——可能让人感觉被操纵
+- 在技能中很少需要
+**何时避免：**
+- 几乎所有时候（其他原则更有效）
+### 7. 好感
+**定义：** 更愿意与喜欢的人合作。
+**运作方式：**
+- **不要用于合规性**
+- 与诚实反馈文化冲突
+- 制造谄媚
+**何时避免：**
+- 纪律执行中始终避免
+## 按技能类型组合原则
+| 技能类型 | 使用 | 避免 |
+|----------|------|------|
+| 纪律执行类 | 权威 + 承诺 + 社会认同 | 好感、互惠 |
+| 指导/技术类 | 适度权威 + 归属 | 过度权威 |
+| 协作类 | 归属 + 承诺 | 权威、好感 |
+| 参考类 | 仅清晰度 | 所有说服技巧 |
+## 为什么有效：心理学
+**明确的规则减少合理化：**
+- "你必须"消除决策疲劳
+- 绝对性的语言消除"这是例外吗？"的问题
+- 明确的反合理化应对堵住具体漏洞
+**实施意图创造自动行为：**
+- 清晰的触发条件 + 必需的行动 = 自动执行
+- "当 X 时，做 Y"比"通常做 Y"更有效
+- 减少合规的认知负担
+**LLM 具有类人特性：**
+- 在包含这些模式的人类文本上训练
+- 训练数据中权威性语言先于合规性出现
+- 承诺序列（声明 → 行动）被频繁建模
+- 社会认同模式（大家都做 X）建立规范
+## 伦理使用
+**正当用途：**
+- 确保关键实践被遵循
+- 创建有效的文档
+- 防止可预见的失败
+**不正当用途：**
+- 为个人利益操纵
+- 制造虚假紧迫感
+- 基于内疚的合规
+**判断标准：** 如果用户完全理解这个技巧，它是否仍然服务于用户的真正利益？
+## 研究引用
+**Cialdini, R. B. (2021).** *Influence: The Psychology of Persuasion (New and Expanded).* Harper Business.
+- 七大说服原则
+- 影响力研究的实证基础
+**Meincke, L., Shapiro, D., Duckworth, A. L., Mollick, E., Mollick, L., & Cialdini, R. (2025).** Call Me A Jerk: Persuading AI to Comply with Objectionable Requests. University of Pennsylvania.
+- 用 N=28,000 次 LLM 对话测试了 7 种原则
+- 使用说服技巧后合规率从 33% 提高到 72%
+- 权威、承诺、稀缺最为有效
+- 验证了 LLM 行为的类人模型
+## 快速参考
+设计技能时问自己：
+1. **这是什么类型？**（纪律类 vs 指导类 vs 参考类）
+2. **我试图改变什么行为？**
+3. **哪些原则适用？**（纪律类通常用权威 + 承诺）
+4. **是否组合了太多？**（不要全用七种）
+5. **这合乎伦理吗？**（服务于用户的真正利益？）