npm - dev-playbooks - Versions diffs - 2.3.1 → 2.5.0 - Mend

dev-playbooks 2.3.1 → 2.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/README.md +0 -18
package/bin/devbooks.js +2 -1
package/package.json +1 -1
package/scripts/benchmark-scan.sh +67 -0
package/scripts/detect-fancy-words.sh +8 -0
package/skills/_shared/references/expert-list.md +21 -0
package/skills/devbooks-coder/SKILL.md +233 -47
package/skills/devbooks-convergence-audit/references/convergence-audit-rules.md +385 -0
package/skills/devbooks-design-backport/SKILL.md +0 -120
package/skills/devbooks-design-backport/references/design-backport-prompt.md +0 -132
package/skills/devbooks-docs-sync/SKILL.md +0 -43

package/README.md CHANGED

@@ -206,24 +206,6 @@ See [DevBooks setup guide](docs/devbooks-setup-guide.md) for configuration detai
 ---
-## Contributing
-Contributions are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md).
----
 ## License
 MIT License - see [LICENSE](LICENSE)
----
-## Contact
-- GitHub: https://github.com/Darkbluelr/dev-playbooks
-- npm: https://www.npmjs.com/package/dev-playbooks
-- Issues: https://github.com/Darkbluelr/dev-playbooks/issues
----
-**Remember**: DevBooks is not a tool, it is a workflow. Follow the constraints and quality rises.

package/bin/devbooks.js CHANGED

@@ -103,8 +103,9 @@ const AI_TOOLS = [
     id: 'cursor',
     name: 'Cursor',
     description: 'Cursor AI IDE',
-    skillsSupport: SKILLS_SUPPORT.RULES,
+    skillsSupport: SKILLS_SUPPORT.FULL,
     slashDir: '.cursor/commands/devbooks',
+    skillsDir: '.cursor/skills',  // Project-level
     rulesDir: '.cursor/rules',
     instructionFile: null,
     available: true

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "dev-playbooks",
-  "version": "2.3.1",
+  "version": "2.5.0",
   "description": "AI-powered spec-driven development workflow",
   "keywords": [
     "devbooks",

package/scripts/benchmark-scan.sh ADDED

@@ -0,0 +1,67 @@
+#!/usr/bin/env bash
+set -euo pipefail
+ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+CHANGE_ID="20260122-0827-enhance-docs-consistency"
+EVIDENCE_DIR="${ROOT_DIR}/dev-playbooks/changes/${CHANGE_ID}/evidence"
+OUTPUT_DIR=""
+TOKEN_LOG=""
+PERF_LOG=""
+SCANNER="${ROOT_DIR}/skills/devbooks-docs-consistency/scripts/scanner.sh"
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --output-dir)
+      OUTPUT_DIR="$2"
+      shift 2
+      ;;
+    --change-id)
+      CHANGE_ID="$2"
+      shift 2
+      ;;
+    *)
+      shift
+      ;;
+  esac
+done
+if [[ -n "$OUTPUT_DIR" ]]; then
+  EVIDENCE_DIR="$OUTPUT_DIR"
+else
+  EVIDENCE_DIR="${ROOT_DIR}/dev-playbooks/changes/${CHANGE_ID}/evidence"
+fi
+TOKEN_LOG="${EVIDENCE_DIR}/token-usage.log"
+PERF_LOG="${EVIDENCE_DIR}/scan-performance.log"
+mkdir -p "$EVIDENCE_DIR"
+start_time=$(date +%s)
+if [[ ! -x "$SCANNER" ]]; then
+  echo "scanner not found: $SCANNER" >&2
+  exit 2
+fi
+# Simulate incremental scan token usage.
+# With pipefail set, a failing scanner makes the assignment fail, so test it directly.
+if ! inc_files=$(bash "$SCANNER" --scan-mode incremental 2>/dev/null | wc -l | tr -d ' ') ||
+   ! full_files=$(bash "$SCANNER" --scan-mode full 2>/dev/null | wc -l | tr -d ' '); then
+  echo "scan failed" >&2
+  exit 2
+fi
+inc_tokens=$((inc_files * 10 + 100))
+full_tokens=$((full_files * 10 + 1000))
+timestamp=$(date '+%Y-%m-%d %H:%M:%S')
+{
+  echo "${timestamp} | incremental | ${inc_tokens} tokens"
+  echo "${timestamp} | full | ${full_tokens} tokens"
+} >> "$TOKEN_LOG"
+end_time=$(date +%s)
+duration=$((end_time - start_time))
+printf "Scan time: %s seconds\n" "$duration" >> "$PERF_LOG"
+echo "Benchmark complete"
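
The token figures written by this script are estimates derived from line counts, not measurements. A minimal Python sketch of the same arithmetic follows; the function name `estimate_tokens` and the sample counts are illustrative assumptions, while the constants (10 per reported line, plus 100 or 1000) come from the script above.

```python
def estimate_tokens(line_count: int, mode: str) -> int:
    """Mirror the script's arithmetic: 10 tokens per scanner output line plus a fixed overhead."""
    # Overheads taken from the shell script: 100 for incremental, 1000 for full.
    overhead = {"incremental": 100, "full": 1000}
    return line_count * 10 + overhead[mode]


if __name__ == "__main__":
    # Hypothetical line counts, chosen only to show the formula.
    print(estimate_tokens(12, "incremental"))  # 12 * 10 + 100 = 220
    print(estimate_tokens(340, "full"))        # 340 * 10 + 1000 = 4400
```

Because the overhead is constant per mode, the two log lines differ by exactly 900 tokens whenever both scans report the same number of lines.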

package/scripts/detect-fancy-words.sh ADDED

@@ -0,0 +1,8 @@
+#!/usr/bin/env bash
+set -euo pipefail
+ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+# Pattern matches marketing fluff words that should be avoided in technical documentation
+PATTERN="(super brain|revolutionary|disruptive|perfect|elegant|game changer|cutting edge|best in class|seamless|robust)"
+grep -rE "$PATTERN" "$ROOT_DIR/skills"/*/{SKILL,skill}.md 2>/dev/null || true
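
To see what this pattern does and does not catch, the same alternation can be tried in Python. This is only a sketch: `find_fancy_words` is a hypothetical helper, not part of the package. Like `grep -E` without `-i`, it is case-sensitive and matches substrings.

```python
import re
from typing import List

# Same alternation as PATTERN in the shell script above.
PATTERN = re.compile(
    r"(super brain|revolutionary|disruptive|perfect|elegant|game changer|"
    r"cutting edge|best in class|seamless|robust)"
)


def find_fancy_words(text: str) -> List[str]:
    """Return every flagged phrase found in text, in order of appearance."""
    return [match.group(1) for match in PATTERN.finditer(text)]


if __name__ == "__main__":
    print(find_fancy_words("A robust, seamless API"))  # both words are flagged
    print(find_fancy_words("Robust by design"))        # capitalized form is not matched
    print(find_fancy_words("works perfectly"))         # substring of "perfectly" is matched
```

If capitalized occurrences should also be reported, a case-insensitive match (`grep -iE`, or `re.IGNORECASE` here) would be needed; the script as published does not do that.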

package/skills/_shared/references/expert-list.md ADDED

@@ -0,0 +1,21 @@
+# Expert List
+## Goal
+Provide unified expert role naming and responsibility scopes for Skills to use in `recommended_experts`.
+## Standard Expert Roles
+| Role | Responsibilities | Use Cases |
+|------|------------------|-----------|
+| Product Manager | Define business goals, user value, and boundaries | Requirement definition, value assessment, scope control |
+| System Architect | Design system boundaries, key mechanisms, and dependency directions | Architecture design, cross-module impact assessment |
+| Test Engineer | Design verification strategies and coverage matrices | Acceptance testing, regression strategies |
+| Security Expert | Identify security risks and least privilege | Permission models, sensitive data handling |
+| Performance Engineer | Assess performance risks and metrics | Performance budgets, bottleneck analysis |
+| Technical Writer | Maintain external documentation consistency | Documentation standards, information architecture |
+## Usage Guidelines
+1. `recommended_experts` must use the role names from this table.
+2. If a new role is needed, update this list first with responsibilities and use cases.

package/skills/devbooks-coder/SKILL.md CHANGED

@@ -14,75 +14,261 @@ allowed-tools:
 ## Progressive Disclosure
-### Base (Required)
+### Base Layer (Required)
 Goal: Implement only what is specified in `tasks.md` and produce Green evidence.
-Inputs: user goal, `tasks.md`, project codebase, change package context.
-Outputs: implementation changes, updated `tasks.md`, Green evidence logs under `evidence/green-final/`.
-Boundaries: do not modify `tests/**`; do not edit `verification.md` or the AC matrix.
-Evidence: reference gate outputs and log paths.
+Inputs: User goal, existing docs, change package context, or project path.
+Outputs: Executable artifacts, updated `tasks.md`, Green evidence logs under `evidence/green-final/`.
+Boundaries: Do not replace other roles' duties; do not modify `tests/**`.
+Evidence: Reference output artifact paths or execution logs.
-### Advanced (Optional)
-Use when you need: risk notes, minimal-diff planning, deviation recording.
+### Advanced Layer (Optional)
+Use when: You need to refine strategies, boundaries, or highlight risks.
-### Extended (Optional)
-Use when you need: faster search/impact via optional MCP capabilities.
+### Extended Layer (Optional)
+Use when: You need to collaborate with external systems or optional tools.
 ## Recommended MCP Capability Types
 - Code search (code-search)
 - Reference tracking (reference-tracking)
 - Impact analysis (impact-analysis)
-## Role Isolation (Mandatory)
-- Test Owner and Coder must be separate conversations/instances.
-- Do not switch roles within one conversation.
-- If tests or `verification.md` require changes, hand off to Test Owner.
+## Quick Start
-## Core Responsibilities
+My Responsibilities:
+1. **Strictly implement functionality according to tasks.md**
+2. **Run acceptance anchors in the verification plan** (tests/, static checks, build, etc.)
+3. **Save Green evidence** to the change package `evidence/green-final/`
+4. **Prohibited from modifying tests/** (If tests need changes, hand back to Test Owner)
-1. Implement strictly according to `<change-root>/<change-id>/tasks.md`.
-2. Run the gate checks required by the change (tests/build/static checks).
-3. Save Green evidence to `<change-root>/<change-id>/evidence/green-final/`.
-4. Check off tasks only when relevant gates pass.
-5. Record discoveries that require design/spec updates to `deviation-log.md`.
+## Role Isolation (Mandatory)
-## Hard Boundaries
+- Test Owner and Coder must be in separate conversations/instances.
+- This Skill executes only as Coder and does not switch to other roles.
-- Allowed: implementation code, `tasks.md`, `deviation-log.md`, evidence logs.
-- Prohibited: `tests/**`, edits to `verification.md`, checking off the AC matrix.
+---
 ## Prerequisites: Configuration Discovery
 Before execution, you **must** search for configuration in the following order (stop when found):
 1. `.devbooks/config.yaml` (if exists) -> Parse and use its mappings
-2. `dev-playbooks/project.md` (if exists) -> Dev-Playbooks protocol, use default mappings
-3. `project.md` (if exists) -> Template protocol, use default mappings
+2. `dev-playbooks/project.md` (if exists) -> Dev-Playbooks protocol
+3. `project.md` (if exists) -> template protocol
 4. If still undetermined -> **Stop and ask the user**
-If the configuration specifies `agents_doc` (rules document), you **must read that document first** before performing any operations.
+**Key Constraints**:
+- If the configuration specifies `agents_doc` (rules document), you **must read that document first** before performing any operations.
+- Do not guess the directory root.
+---
+## 📚 Reference Documents
+### Required (Read Immediately)
+1. **AI Behavior Guidelines**: `~/.claude/skills/_shared/references/ai-behavior-guidelines.md`
+   - Verifiability gatekeeping, structural quality gatekeeping, completeness gatekeeping
+   - Basic rules for all skills
+2. **Code Implementation Prompt**: `references/code-implementation-prompt.md`
+   - Complete code implementation guide
+   - Execute strictly according to this prompt
+### Read on Demand
+3. **Test Execution Strategy**: `references/test-execution-strategy.md`
+   - Details on @smoke/@critical/@full tags
+   - Async vs. Sync boundaries
+   - *When to read*: When you need to understand the test execution strategy
+4. **Completion Status and Routing**: `references/completion-status-and-routing.md`
+   - Completion status classification (MECE)
+   - Routing output templates
+   - *When to read*: When outputting status upon task completion
+5. **Hotspot Awareness and Risk Assessment**: `references/hotspot-awareness-and-risk-assessment.md`
+   - Hotspot file warnings
+   - *When to read*: When risk assessment is needed
+6. **Low Risk Modification Techniques**: `references/low-risk-modification-techniques.md`
+   - Safe refactoring techniques
+   - *When to read*: When refactoring is needed
+7. **Coding Style Guidelines**: `references/coding-style-guidelines.md`
+   - Code style specifications
+   - *When to read*: When unsure about code style
+8. **Logging Standard**: `references/logging-standard.md`
+   - Log levels and formats
+   - *When to read*: When logging needs to be added
+9. **Error Code Standard**: `references/error-code-standard.md`
+   - Error code design
+   - *When to read*: When error codes need to be defined
+---
+## Core Workflow
+### 1. Resume from Breakpoint
+Before starting, you **must** execute:
+1. **Read Progress**: Open `<change-root>/<change-id>/tasks.md`, identify checked `- [x]` tasks.
+2. **Locate Resume Point**: Find the first `- [ ]` after the "last `[x]`".
+3. **Output Confirmation**: Clearly inform the user of the current progress, for example:
+   ```
+   Detected T1-T6 completed (6/10), resuming from T7.
+   ```
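
The resume rule above (first `- [ ]` after the last `- [x]`) can be sketched in Python. The function name `find_resume_index` and the sample task list are assumptions made for illustration; the skill itself describes the rule only in prose.

```python
import re
from typing import List, Optional

CHECKED = re.compile(r"^\s*- \[x\]")
UNCHECKED = re.compile(r"^\s*- \[ \]")


def find_resume_index(lines: List[str]) -> Optional[int]:
    """Return the index of the first unchecked task after the last checked one."""
    last_checked = -1
    for i, line in enumerate(lines):
        if CHECKED.match(line):
            last_checked = i
    # Scan forward from just past the last checked task.
    for i in range(last_checked + 1, len(lines)):
        if UNCHECKED.match(lines[i]):
            return i
    return None  # nothing left to resume


if __name__ == "__main__":
    tasks = ["- [x] T1", "- [ ] T2", "- [x] T3", "- [ ] T4", "- [ ] T5"]
    index = find_resume_index(tasks)
    print(tasks[index] if index is not None else "all tasks checked")
```

Note that under this rule an unchecked task sitting before the last checked one (T2 in the sample) is skipped, so a separate pass over earlier gaps may be worth adding.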
+### 2. Real-time Progress Updates
+> **Core Principle**: Complete one task, check one box immediately. Do not wait until all are done to batch check.
+**Check-off Timing**:
+| Timing | Action |
+|--------|--------|
+| Code writing complete | Do not check yet |
+| Compilation passes | Do not check yet |
+| Relevant tests pass | **Check immediately** |
+| Multiple tasks complete together | Check one by one, do not batch |
+### 3. Implement Code
+Execute strictly according to `references/code-implementation-prompt.md`.
+### 4. Run Tests
+```bash
+# During development: run @smoke frequently
+npm test -- --grep "@smoke"
+# Before committing: run @critical
+npm test -- --grep "@smoke|@critical"
+# After commit: CI automatically runs @full (Coder does not wait)
+git push  # triggers CI
+```
+### 5. Output Completion Status
+Refer to `references/completion-status-and-routing.md`.
+---
+## Key Constraints
+### Role Boundaries
+| Allowed | Prohibited |
+|---------|------------|
+| Modify `src/**` code | ❌ Modify `tests/**` |
+| Check `tasks.md` items | ❌ Modify `verification.md` |
+| Record deviations to `deviation-log.md` | ❌ Check AC coverage matrix |
+| Run fast-track tests (`@smoke`/`@critical`) | ❌ Set verification.md Status to Verified/Done |
+| Trigger `@full` tests (CI/Background) | ❌ Wait for @full completion (can start next change) |
+### Code Quality Constraints
+#### Forbidden Commit Patterns
+| Pattern | Detection Command | Reason |
+|---------|-------------------|--------|
+| `test.only` | `rg '\.only\s*\(' src/` | Skips other tests |
+| `console.log` | `rg 'console\.log' src/` | Debug code residue |
+| `debugger` | `rg 'debugger' src/` | Debug breakpoint residue |
+| `// TODO` without issue | `rg -P 'TODO(?!.*#\d+)' src/` | Untrackable todo |
+| `any` type | `rg ': any[^a-z]' src/` | Type safety hole |
+| `@ts-ignore` | `rg '@ts-ignore' src/` | Hides type errors |
+#### Pre-commit Mandatory Checks
+```bash
+# 1. Compilation Check (Mandatory)
+npm run compile || exit 1
+# 2. Lint Check (Mandatory)
+npm run lint || exit 1
+# 3. Test Check (Mandatory)
+npm test || exit 1
+# 4. test.only Check (Mandatory)
+if rg -l '\.only\s*\(' tests/ src/**/test/; then
+  echo "error: found .only() in tests" >&2
+  exit 1
+fi
+# 5. Debug Code Check (Mandatory)
+if rg -l 'console\.(log|debug)|debugger' src/ --type ts; then
+  echo "error: found debug statements" >&2
+  exit 1
+fi
+```
+---
+## Output Management
+Prevent large output from polluting context:
+| Scenario | Handling Method |
+|----------|-----------------|
+| Command output > 50 lines | Keep only first/last 10 lines + middle summary |
+| Test output | Extract key failure info, do not paste full log |
+| Log output | Save to disk at `<change-root>/<change-id>/evidence/`, quote path only |
+| Large file content | Quote path, do not inline |
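
The first row of this table (more than 50 lines: keep the first and last 10 plus a summary) could look like the following in Python. It is a sketch under assumptions: `condense_output` is a hypothetical name, and the wording of the summary line is invented here because the skill does not specify it.

```python
def condense_output(text: str, limit: int = 50, keep: int = 10) -> str:
    """Keep the first and last `keep` lines when output exceeds `limit` lines."""
    lines = text.splitlines()
    if len(lines) <= limit:
        return text
    omitted = len(lines) - 2 * keep
    # Assumed summary format; the skill only asks for a "middle summary".
    summary = f"... [{omitted} lines omitted] ..."
    return "\n".join(lines[:keep] + [summary] + lines[-keep:])


if __name__ == "__main__":
    noisy = "\n".join(f"line {i}" for i in range(60))
    print(condense_output(noisy))
```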
+---
+## Evidence Path Convention
+**Green evidence must be saved to**:
+```
+<change-root>/<change-id>/evidence/green-final/
+```
+**Correct Path Examples**:
+```bash
+# Dev-Playbooks default path
+dev-playbooks/changes/<change-id>/evidence/green-final/test-$(date +%Y%m%d-%H%M%S).log
+# Using script
+devbooks change-evidence <change-id> --label green-final -- npm test
+```
+---
+## Deviation Detection and Recording
+**Reference**: `~/.claude/skills/_shared/references/deviation-detection-routing-protocol.md`
+During implementation, you **must immediately** write the following situations to `deviation-log.md`:
+| Situation | Type | Example |
+|-----------|------|---------|
+| Added feature not in tasks.md | NEW_FEATURE | Added warmup() method |
+| Changed constraint in design.md | CONSTRAINT_CHANGE | Timeout changed to 60s |
+| Found edge case not covered by design | DESIGN_GAP | Public interface inconsistent with design |
+| API Signature Change | API_CHANGE | Argument added |
+---
+## Context Awareness
-## Minimal Workflow
+Detection rules refer to: `~/.claude/skills/_shared/context-detection-template.md`
-1. Read `<change-root>/<change-id>/tasks.md` and resume from the first unchecked item.
-2. Implement tasks with a minimal diff.
-3. Run relevant gates; if any gate fails, fix implementation (not tests) and rerun.
-4. Write logs to `evidence/green-final/` and reference them in your output.
-5. Check off completed tasks; do not batch-check.
-6. If you find gaps in design/spec/tasks, write them to `deviation-log.md` and hand off.
+### Modes Supported by This Skill
-## References
+| Mode | Trigger Condition | Behavior |
+|------|-------------------|----------|
+| **First Implementation** | tasks.md all `[ ]` | Start from MP1.1 |
+| **Resume from Breakpoint** | tasks.md has some `[x]` | Continue from first `[ ]` after last `[x]` |
+| **Gate Fix** | Test failures need fixing | Prioritize failed items |
-Required:
-- `~/.claude/skills/_shared/references/ai-behavior-guidelines.md`
-- `references/code-implementation-prompt.md`
+### Prerequisite Checks
-As needed:
-- `references/test-execution-strategy.md`
-- `references/completion-status-and-routing.md`
-- `references/hotspot-awareness-and-risk-assessment.md`
-- `references/low-risk-modification-techniques.md`
-- `references/coding-style-guidelines.md`
-- `references/logging-standard.md`
-- `references/error-code-standard.md`
-- `~/.claude/skills/_shared/references/deviation-detection-routing-protocol.md`
+- [ ] `tasks.md` exists
+- [ ] `verification.md` exists
+- [ ] Test Owner not executed in current session
+- [ ] `tests/**` has test files

package/skills/devbooks-convergence-audit/references/convergence-audit-rules.md ADDED

@@ -0,0 +1,385 @@
+# Convergence Audit Rules
+## Core Principle: Anti-Confusion Design
+> Golden Rule: Evidence > Declaration. Never trust assertions in documents; they must be confirmed by verifiable evidence.
+### Scenarios Where AI Is Easily Confused (Must Prevent)
+| Confusing Scenario | AI Incorrect Behavior | Correct Behavior |
+|-----------------|-----------------------|------------------|
+| Doc says `Status: Done` | Believes it is done | Verify: Are tests really all green? Does evidence exist? |
+| AC matrix all `[x]` | Believes full coverage | Verify: Does the test file for each AC exist and pass? |
+| Doc says "Tests Passed" | Believes passed | Verify: Run actual tests or check CI log timestamps |
+| `evidence/` dir exists | Believes evidence exists | Verify: Is dir non-empty? Is content valid test logs? |
+| tasks.md all `[x]` | Believes implemented | Verify: Do corresponding code files exist with substance? |
+| Commit msg "Fixed" | Believes fixed | Verify: Did relevant tests turn from red to green? |
+### Anti-Confusion Three Principles
+```
+1. Distrust Declarations
+   - Any "Done/Passed/Covered" declaration in docs is a hypothesis to be verified
+   - Default stance: Declarations might be wrong, outdated, or optimistic
+2. Evidence First
+   - Code/Test results are the only truth
+   - Log timestamps must be later than the last code modification
+   - Empty dir/file = No evidence
+3. Cross Validation
+   - Declaration vs Evidence: Check for consistency
+   - Code vs Test: Check for matching
+   - Multiple Docs: Check for contradictions
+```
+---
+## Verification Checklists (Execute Item by Item)
+### Check 1: Status Field Truth Verification
+Doc Declaration: `Status: Done` or `Status: Verified` in `verification.md`
+Verification Steps:
+```bash
+# 1. Check if verification.md exists
+[[ -f "verification.md" ]] || echo "❌ verification.md does not exist"
+# 2. Check if evidence/green-final/ has content
+if [[ -z "$(ls -A evidence/green-final/ 2>/dev/null)" ]]; then
+  echo "❌ Status claims Done, but evidence/green-final/ is empty"
+fi
+# 3. Check if evidence timestamp is later than code last modified
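+# Try GNU stat first; BSD/macOS stat rejects -c without printing anything, so the fallback stays clean
+code_mtime=$(stat -c %Y src/ 2>/dev/null || stat -f %m src/)
+evidence_mtime=$( (stat -c %Y evidence/green-final/* 2>/dev/null || stat -f %m evidence/green-final/*) | sort -n | tail -1)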
+if [[ $evidence_mtime -lt $code_mtime ]]; then
+  echo "❌ Evidence time is earlier than code mod, evidence might be stale"
+fi
+```
+Confusion Detection:
+- ⚠️ Status=Done but evidence/ empty → Fake Completion
+- ⚠️ Status=Done but evidence stale → Stale Evidence
+- ⚠️ Status=Done but tests actually fail → False Status
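
A portable way to express the freshness comparison is to walk both trees and compare the newest file modification times, since a directory's own mtime does not change when a nested file is edited. The sketch below assumes the `src/` and `evidence/green-final/` layout used in the steps above; the function names and the three status strings are hypothetical.

```python
import os
from typing import Optional


def newest_mtime(root: str) -> Optional[float]:
    """Return the latest modification time of any file under root, or None if there are none."""
    newest = None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            mtime = os.path.getmtime(os.path.join(dirpath, name))
            if newest is None or mtime > newest:
                newest = mtime
    return newest


def evidence_status(src_root: str, evidence_root: str) -> str:
    """Classify evidence as 'empty', 'stale' (older than the code), or 'fresh'."""
    evidence = newest_mtime(evidence_root)
    if evidence is None:
        return "empty"
    code = newest_mtime(src_root)
    if code is not None and evidence < code:
        return "stale"
    return "fresh"


if __name__ == "__main__":
    print(evidence_status("src", "evidence/green-final"))
```

A missing evidence directory is reported as "empty" because `os.walk` yields nothing for a path that does not exist.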
+---
+### Check 2: AC Coverage Matrix Truth Verification
+Doc Declaration: `[x]` in AC matrix means covered
+Verification Steps:
+```bash
+# 1. Extract all ACs claimed to be covered
+grep -E '^\| AC-[0-9]+.*\[x\]' verification.md | while read -r line; do
+  ac_id=$(echo "$line" | grep -oE 'AC-[0-9]+')
+  test_id=$(echo "$line" | grep -oE 'T-[0-9]+')
+  # 2. Verify corresponding test exists
+  if ! grep -rq "$test_id\|$ac_id" tests/; then
+    echo "❌ $ac_id claimed covered, but test not found"
+  fi
+done
+# 3. Actual test run verification (Most reliable)
+npm test 2>&1 | tee /tmp/test-output.log
+if grep -q "FAIL\|Error\|failed" /tmp/test-output.log; then
+  echo "❌ AC claimed full coverage, but tests actually failed"
+fi
+```
+Confusion Detection:
+- ⚠️ AC checked but test file missing → Fake Coverage
+- ⚠️ AC checked but test failed → False Green
+- ⚠️ AC checked but test content empty → Placeholder Test
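
Extracting the checked rows depends on matching the literal `[x]` marker: in a regular expression an unescaped `[x]` is a bracket expression that matches any single `x`. A small Python sketch of the extraction follows, assuming matrix rows shaped like `| AC-001 | T-001 | [x] |`; the row layout and the name `checked_acs` are assumptions, since the diff does not show a sample `verification.md`.

```python
import re
from typing import List, Optional, Tuple

# A row counts as covered only if it contains the literal "[x]" marker.
ROW = re.compile(r"^\|\s*(AC-\d+)\b.*\[x\]")
TEST_ID = re.compile(r"T-\d+")


def checked_acs(markdown: str) -> List[Tuple[str, Optional[str]]]:
    """Return (AC id, test id or None) for every row marked [x]."""
    results = []
    for line in markdown.splitlines():
        row = ROW.match(line)
        if row:
            test = TEST_ID.search(line)
            results.append((row.group(1), test.group(0) if test else None))
    return results


if __name__ == "__main__":
    table = "\n".join([
        "| AC-001 | T-001 | [x] |",
        "| AC-002 | T-002 | [ ] |",
        "| AC-003 | extra text | [ ] |",  # contains the letter x but is not checked
    ])
    print(checked_acs(table))
```

Rows that return `None` for the test id are worth flagging on their own, since a checked AC with no linked test cannot be verified against `tests/`.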
+---
+### Check 3: tasks.md Completion Truth Verification
+Doc Declaration: `[x]` in tasks.md means completed
+Verification Steps:
+```bash
+# 1. Extract all claimed completed tasks
+grep -E '^\- \[x\]' tasks.md | while read line; do
+  # 2. Extract keywords from task description (func name/file/feature)
+  keywords=$(echo "$line" | grep -oE '[A-Za-z]+[A-Za-z0-9]*' | head -5)
+  # 3. Verify implementation in code
+  for kw in $keywords; do
+    if ! grep -rq "$kw" src/; then
+      echo "⚠️ Task claimed done, but keyword not found in code: $kw"
+    fi
+  done
+done
+# 4. Check for "Skeleton Code" (signature only, no impl)
+grep -rE 'throw new Error\(.*not implemented|TODO|FIXME|pass$|\.\.\.\}' src/ && \
+  echo "⚠️ Found unimplemented placeholder code"
+```
+Confusion Detection:
+- ⚠️ Task checked but code missing → Fake Completion
+- ⚠️ Task checked but placeholder code → Skeleton Code
+- ⚠️ Task checked but feature unreachable → Dead Code
+---
+### Check 4: Evidence Validity Verification
+Doc Declaration: `evidence/` dir contains test evidence
+Verification Steps:
+```bash
+# 1. Check dir exists and non-empty
+if [[ ! -d "evidence" ]] || [[ -z "$(ls -A evidence/)" ]]; then
+  echo "❌ evidence/ missing or empty"
+  exit 1
+fi
+# 2. Check evidence file has substantial content
+shopt -s globstar  # enable ** recursive globbing (bash 4+)
+for f in evidence/**/*; do
+  if [[ -f "$f" ]]; then
+    lines=$(wc -l < "$f")
+    if [[ $lines -lt 5 ]]; then
+      echo "⚠️ Evidence file too small: $f ($lines lines)"
+    fi
+    # 3. Check if valid test log (contains test framework output traits)
+    if ! grep -qE 'PASS|FAIL|✓|✗|passed|failed|test|spec' "$f"; then
+      echo "⚠️ Evidence file does not look like test log: $f"
+    fi
+  fi
+done
+# 4. Check red-baseline evidence is truly red (has failures)
+if [[ -d "evidence/red-baseline" ]]; then
+  if ! grep -rqE 'FAIL|Error|✗|failed' evidence/red-baseline/; then
+    echo "❌ red-baseline claims red, but no failures found"
+  fi
+fi
+# 5. Check green-final evidence is truly green (all pass)
+if [[ -d "evidence/green-final" ]]; then
+  if grep -rqE 'FAIL|Error|✗|failed' evidence/green-final/; then
+    echo "❌ green-final claims green, but contains failures"
+  fi
+fi
+```
+Confusion Detection:
+- ⚠️ evidence/ exists but empty → Empty Evidence
+- ⚠️ Evidence file too small (< 5 lines) → Placeholder Evidence
+- ⚠️ red-baseline no failures → Fake Red
+- ⚠️ green-final has failures → Fake Green
+---
+### Check 5: Git History Cross-Validation
+Principle: Git history doesn't lie, use it to verify doc declarations
+Verification Steps:
+```bash
+# 1. Check if claimed completed change has corresponding commits
+change_id="xxx"
+commits=$(git log --oneline --all --grep="$change_id" | wc -l)
+if [[ $commits -eq 0 ]]; then
+  echo "❌ Change $change_id claimed done, but no git commits found"
+fi
+# 2. Check if test files added after code (TDD violation)
+shopt -s globstar  # enable ** recursive globbing (bash 4+)
+for test_file in tests/**/*.test.*; do
+  test_added=$(git log --format=%at --follow -- "$test_file" | tail -1)
+  # Find corresponding src file (anchor the directory swap, escape the literal dot)
+  src_file=$(echo "$test_file" | sed 's/^tests/src/' | sed 's/\.test//')
+  if [[ -f "$src_file" ]]; then
+    src_added=$(git log --format=%at --follow -- "$src_file" | tail -1)
+    if [[ $test_added -gt $src_added ]]; then
+      echo "⚠️ Test added after code (Non-TDD): $test_file"
+    fi
+  fi
+done
+# 3. Check for "One-time Big Commit" (Process bypass)
+git log --oneline -20 | while read line; do
+  commit=$(echo "$line" | cut -d' ' -f1)
+  files_changed=$(git show --stat "$commit" | grep -E '[0-9]+ file' | grep -oE '[0-9]+' | head -1)
+  if [[ $files_changed -gt 20 ]]; then
+    echo "⚠️ Big commit detected: $commit changed $files_changed files, possibly bypassing incremental verification"
+  fi
+done
+```
+Confusion Detection:
+- ⚠️ Claimed done but no git commit → Fake Change
+- ⚠️ Test added after code → Post-hoc Testing
+- ⚠️ Large file batch commit → Bypass Incremental Verification
+---
+### Check 6: Live Test Run Verification (Most Reliable)
+Principle: Distrust logs, run actual tests
+Verification Steps:
+```bash
+# 1. Run full tests
+echo "=== Live Test Verification ==="
+npm test 2>&1 | tee /tmp/live-test.log
+# 2. Check results
+if grep -qE 'FAIL|Error|failed' /tmp/live-test.log; then
+  echo "❌ Live test failed, doc declaration untrustworthy"
+  grep -E 'FAIL|Error|failed' /tmp/live-test.log
+else
+  echo "✅ Live test passed"
+fi
+# 3. Compare live results with evidence file
+if [[ -f "evidence/green-final/latest.log" ]]; then
+  live_pass=$(grep -cE 'PASS|✓|passed' /tmp/live-test.log)
+  evidence_pass=$(grep -cE 'PASS|✓|passed' evidence/green-final/latest.log)
+  if [[ $live_pass -ne $evidence_pass ]]; then
+    echo "⚠️ Live pass count ($live_pass) ≠ Evidence pass count ($evidence_pass)"
+  fi
+fi
+```
+Confusion Detection:
+- ⚠️ Evidence says green but live run fails → Stale Evidence/Fake Green
+- ⚠️ Live pass count mismatch → Evidence Forgery/Env Diff
+---
+## Scoring Algorithm
+### Trustworthiness Score (0-100)
+```python
+def calculate_trustworthiness(checks):
+    score = 100
+    # Critical Issues (-20 each)
+    critical = [
+        "Evidence empty",
+        "Live test failed",
+        "Status claims Done but test failed",
+        "green-final contains failures"
+    ]
+    # Warnings (-10 each)
+    warnings = [
+        "Evidence stale",
+        "AC missing test",
+        "Placeholder code",
+        "Big commit detected"
+    ]
+    # Minor Issues (-5 each)
+    minor = [
+        "Test added after code",
+        "Evidence file too small"
+    ]
+    for issue in checks.critical_issues:
+        score -= 20
+    for issue in checks.warnings:
+        score -= 10
+    for issue in checks.minor_issues:
+        score -= 5
+    return max(0, score)
+```
+### Convergence Verdict
+| Trustworthiness | Verdict | Recommendation |
+|-----------------|---------|----------------|
+| 90-100 | ✅ Trusted Converged | Continue process |
+| 70-89 | ⚠️ Partially Trusted | Need supplementary verification |
+| 50-69 | 🟠 Suspicious | Need rework on some parts |
+| < 50 | 🔴 Untrusted | Sisyphus dilemma, needs full audit |
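
The scoring block above references a `checks` object that is never defined, so here is a self-contained sketch that takes the three issue lists directly and maps the score to the verdict table. The function signatures are assumptions; the deductions (20, 10, 5) and the thresholds (90, 70, 50) are taken from the rules above.

```python
from typing import Sequence


def calculate_trustworthiness(
    critical: Sequence[str], warnings: Sequence[str], minor: Sequence[str]
) -> int:
    """Start at 100 and deduct 20 per critical issue, 10 per warning, 5 per minor issue."""
    score = 100 - 20 * len(critical) - 10 * len(warnings) - 5 * len(minor)
    return max(0, score)


def verdict(score: int) -> str:
    """Map a score to the verdict names used in the Convergence Verdict table."""
    if score >= 90:
        return "Trusted Converged"
    if score >= 70:
        return "Partially Trusted"
    if score >= 50:
        return "Suspicious"
    return "Untrusted"


if __name__ == "__main__":
    score = calculate_trustworthiness(
        critical=["Live test failed", "Status claims Done but test failed"],
        warnings=["AC missing test"],
        minor=["Evidence file too small"],
    )
    print(score, verdict(score))  # 100 - 40 - 10 - 5 = 45, which falls below 50
```

Two critical issues, one warning, and one minor issue give 45, the same total as the sample report in the Output Format section.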
+---
+## Output Format
+```markdown
+# DevBooks Convergence Audit Report (Anti-Confusion Ed.)
+## Audit Principles
+Report adopts "Evidence First, Distrust Declarations" principle. All conclusions based on verifiable evidence, not doc assertions.
+## Declaration vs Evidence Comparison
+| Check Item | Doc Declaration | Actual Verification | Conclusion |
+|------------|-----------------|---------------------|------------|
+| Status | Done | Live test failed | ❌ Fake Completion |
+| AC Coverage | 5/5 Checked | 2 ACs missing tests | ❌ Fake Coverage |
+| Test Status | All Green | Live run 3 failed | ❌ Stale Evidence |
+| tasks.md | 10/10 Done | 3 tasks code missing | ❌ Fake Completion |
+| evidence/ | Exists | Dir non-empty, valid | ✅ Valid |
+## Trustworthiness Score
+**Total**: 45/100 🔴 Untrusted
+**Deduction Detail**:
+- -20: Status=Done but live test failed
+- -20: AC claims full coverage but 2 missing tests
+- -10: tasks.md 3 tasks missing code
+- -5: Evidence timestamp earlier than code mod
+## Confusion Detection Results
+### 🔴 Detected Fake Completion
+1. `change-auth`: Status=Done, but `npm test` reported 3 failures
+2. `fix-cache`: AC-003 Checked, but `tests/cache.test.ts` missing
+### 🟡 Suspicious Items
+1. `refactor-api`: evidence/green-final/ timestamp 2 days older than code
+2. `feature-login`: tasks.md all checked, but `src/login.ts` contains TODO
+## Real Status Verdict
+| Change Pkg | Declared Status | Real Status | Gap |
+|------------|-----------------|-------------|-----|
+| change-auth | Done | Test Failed | 🔴 Critical |
+| fix-cache | Verified | Incomplete Coverage | 🟠 Medium |
+| refactor-api | Ready | Stale Evidence | 🟡 Minor |
+## Recommended Actions
+### Immediate Actions
+1. Revert `change-auth` status to `In Progress`
+2. Add tests for `fix-cache` AC-003
+### Short-term Improvements
+1. Establish evidence freshness check (Evidence must be newer than code)
+2. Force run corresponding tests before checking AC
+### Process Improvements
+1. Ban manual Status modification; update Status only through script verification
+2. Integrate the convergence check into CI to block merges of fake completions
+```
+---
+## Completion Status
+**Status**: ✅ AUDIT_COMPLETED
+**Core Findings**:
+- Doc Trustworthiness: X%
+- Detected Fake Completions: N
+- Changes needing rework: M
+**Next Steps**:
+- Fake Completion → Immediate status revert, re-verify
+- Suspicious → Supplement evidence or re-run tests
+- Trusted → Continue process
+```

package/skills/devbooks-design-backport/SKILL.md DELETED

@@ -1,120 +0,0 @@
----
-name: devbooks-design-backport
-description: devbooks-design-backport: Backport newly discovered constraints, conflicts, or gaps from implementation back to design.md (keeping design as the golden truth), with annotated decisions and impacts. Use when the user says "backport design/update design doc/Design Backport/design-implementation mismatch/need to clarify constraints" etc.
-allowed-tools:
-  - Glob
-  - Grep
-  - Read
-  - Write
-  - Edit
----
-# DevBooks: Design Backport
-## Workflow Position Awareness
-> **Core Principle**: Design Backport is now **primarily auto-invoked by Archiver during archive phase**, users typically don't need to call it manually.
-### My Position in the Overall Workflow
-```
-proposal → design → test-owner → coder → test-owner(verify) → code-review → [Archive/Spec Gardener]
-                                    ↓                                              ↓
-                             Record deviations to deviation-log.md     Auto-invoke design-backport
-```
-### Design Decision: Auto Backport
-**Old Flow** (manual judgment required):
-```
-coder has deviations → user manually calls design-backport → then archive
-```
-**New Flow** (auto handling):
-```
-coder has deviations → archiver auto-detects and backports during archive → archive
-```
-### When Manual Call is Still Needed
-| Scenario | Need Manual Call? |
-|----------|-------------------|
-| Normal flow (deviations in deviation-log.md) | ❌ Auto-handled during archive |
-| Need immediate backport (don't wait for archive) | ✅ Manual call |
-| Severe design-implementation conflict needs decision | ✅ Manual call and discuss |
----
-## Prerequisites: Configuration Discovery (Protocol-Agnostic)
-- `<truth-root>`: Current truth directory root
-- `<change-root>`: Change package directory root
-Before execution, you **must** search for configuration in the following order (stop when found):
-1. `.devbooks/config.yaml` (if exists) → Parse and use its mappings
-2. `dev-playbooks/project.md` (if exists) → Dev-Playbooks protocol, use default mappings
-3. `project.md` (if exists) → Template protocol, use default mappings
-4. If still undetermined → **Stop and ask the user**
-**Key Constraints**:
-- If `agents_doc` (rules document) is specified in configuration, **you must read that document first** before executing any operations
-- Do not guess directory roots
-- Do not skip reading the rules document
-## Execution Method
-1) First read and follow: `~/.claude/skills/_shared/references/ai-behavior-guidelines.md` (verifiability + structural quality gating).
-2) Strictly execute according to the complete prompt: `references/design-backport-prompt.md`.
----
-## Context Awareness
-This Skill automatically detects context before execution, identifying content that needs to be backported.
-Detection rules reference: `skills/_shared/context-detection-template-context-detection.md`
-### Detection Flow
-1. Detect whether `design.md` exists
-2. Detect whether new discoveries (conflicts/constraints/gaps) were found during implementation
-3. Compare differences between design and implementation
-### Modes Supported by This Skill
-| Mode | Trigger Condition | Behavior |
-|------|-------------------|----------|
-| **Conflict Backport** | Design-implementation conflict detected | Record conflict points and resolutions |
-| **Constraint Backport** | New implementation constraints discovered | Add constraint conditions to design |
-| **Gap Backport** | Scenarios not covered by design detected | Add missing design decisions |
-### Detection Output Example
-```
-Detection Results:
-- design.md: Exists
-- Discoveries: 2 new constraints, 1 design conflict
-- Running Mode: Constraint Backport + Conflict Backport
-```
----
-## Progressive Disclosure
-### Base (Required)
-Goal: Clarify this Skill's core outputs and usage scope.
-Inputs: User goals, existing documents, change package context, or project path.
-Outputs: Executable artifacts, next-step guidance, or recorded paths.
-Boundaries: Does not replace other roles; does not touch `tests/`.
-Evidence: Reference output paths or execution records.
-### Advanced (Optional)
-Use when you need to refine strategy, boundaries, or risk notes.
-### Extended (Optional)
-Use when you need to coordinate with external systems or optional tools.
-## Recommended MCP Capability Types
-- Code search (code-search)
-- Reference tracking (reference-tracking)
-- Impact analysis (impact-analysis)
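
The configuration discovery order in the removed SKILL.md (stop at the first file found, otherwise ask the user) can be sketched as a small lookup. This is a minimal illustration only, not code from the package: `discoverConfig`, the `exists` predicate, and the `protocol`/`mappings` labels are hypothetical names introduced here.

```javascript
// Hypothetical sketch: probe candidate files in priority order.
// `exists` is an injected predicate (for example a wrapper around
// fs.existsSync), so the ordering logic can be exercised without a real tree.
const DISCOVERY_ORDER = [
  { path: '.devbooks/config.yaml', protocol: 'custom', mappings: 'parse-from-file' },
  { path: 'dev-playbooks/project.md', protocol: 'dev-playbooks', mappings: 'default' },
  { path: 'project.md', protocol: 'template', mappings: 'default' },
];

function discoverConfig(exists) {
  for (const candidate of DISCOVERY_ORDER) {
    if (exists(candidate.path)) {
      // Stop at the first match; later candidates are never consulted.
      return { status: 'found', ...candidate };
    }
  }
  // Nothing matched: the rules say to stop and ask, not to guess directory roots.
  return { status: 'ask-user' };
}
```

Passing the predicate in keeps the priority rule separate from file-system access; a caller would still need to read `agents_doc` first when the parsed configuration names one.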

package/skills/devbooks-design-backport/references/design-backport-prompt.md DELETED

@@ -1,132 +0,0 @@
-# Design Backport Prompt
-> **Role**: You are the strongest mind in design evolution, combining the wisdom of Michael Nygard (architecture decision records), Martin Fowler (evolutionary design), and Kent Beck (incremental improvement). Your design sync must meet expert-level standards.
-Highest directive (top priority):
-- Before executing this prompt, read `~/.claude/skills/_shared/references/ai-behavior-guidelines.md` and follow all protocols within it.
-# Prompt: Backport Design Docs When Implementation Plans Exceed Design Scope
-> Use case: You discover new constraints/concepts/acceptance criteria in the implementation plan (tasks/plan) that are not covered in the design docs (design/spec), causing drift between "plan-driven implementation" and "design-driven acceptance."
-Artifact locations (protocol agnostic):
-- Design doc usually at: `<change-root>/<change-id>/design.md`
-- Implementation plan usually at: `<change-root>/<change-id>/tasks.md`
-- Spec delta usually at: `<change-root>/<change-id>/specs/<capability>/spec.md`
-- Current truth at: `<truth-root>/` (do not backport by editing historical archives; update current truth with a new change package)
-> Goal: Backport content that *should be part of design* into the design doc to reduce divergence in testing, implementation, and acceptance.
----
-## What Can Be Backported to the Design Doc
-Only backport plan items that meet at least one of the following (i.e., **Design-level**):
-1. **External semantics or user-visible behavior**
-- New/changed key user flows (explicit state machines, async sessions, cancellation/timeouts)
-- External contracts (API input/output shapes, error semantics, required fields, compatibility windows)
-2. **System-level invariants / red lines**
-- Cost/resource limits (e.g., prohibit N^2 LLM calls, hard caps like `max_llm_calls`, budget-triggered degradation)
-- Reliability/security red lines (e.g., multi-tenant isolation on by default, external untrusted boundaries, default isolation for injection)
-3. **Core data contracts and evolution strategy**
-- `schema_version`, required event envelope fields, idempotency key principles, compatibility strategy (DLQ/migration/replay)
-- Minimum standards for what must be replayable/auditable/traceable
-4. **Cross-cutting concerns**
-- Observability metrics, SLO/KPI, alerting and operational strategies
-- Lifecycle/retention policies (Valid/Quarantine/Garbage goals and rules)
-- Gradual rollout/rollback paths and feature flags
-5. **Key tradeoffs and decisions**
-- Why choose A over B, alternatives, risks, fallback strategies
-- New/changed Non-goals or Open Questions
----
-## What Must NOT Be Backported
-The following are **Implementation-level** and should not be written into design docs (unless promoted to formal design decisions):
-- Specific file paths, class/function names, table/field names (unless they are stable architectural boundaries that must align)
-- PR splitting advice, task execution order, temporary scripts/commands
-- Over-detailed algorithm pseudocode (backport inputs/outputs/invariants/complexity limits/fallbacks instead of code)
-- One-off implementation conveniences without long-term value or verification
----
-## Conflict Resolution (Plan vs Design)
-- **Design doc is the golden truth**: if plan conflicts with design, do not overwrite design with the plan.
-- Two acceptable paths:
-1) **Proposal-style backport**: write plan content into the design doc as "Proposed Design Change" and mark it as needing decision/confirmation;
-2) **Defer**: mark plan items as `DEFERRED/UNSCOPED` until design is clarified.
-- When backporting, explicitly label it as "new design decision/supplemental constraint" and explain reasons and impact scope.
----
-## Output Requirements
-1. **Diff checklist**: list "plan exceeds design" candidate items (group by plan ID), with classification: `Design-level / Implementation-level / Out-of-scope`.
-2. **Design backport patch**: write all `Design-level` content back into the design doc with minimal edits, placed in the most appropriate sections (e.g., non-goals/design principles/risks & fallback/contracts/milestones/key decisions).
-3. **Traceability updates**: for each backported design item, state acceptance method (A/B/C) and acceptance anchors, and require updates to:
-   - Traceability matrix (prefer updating `<change-root>/<change-id>/verification.md`; sync to `docs/` only if needed externally)
-   - Manual acceptance checklist (prefer updating `MANUAL-*` in `<change-root>/<change-id>/verification.md`; sync to `docs/` only if needed externally)
-   - If new/updated automation anchors are needed: list tests/static checks to add (tests/commands/markers)
----
-## Ready-to-Copy Prompt
-```text
-You are the "Design Doc Editor." Your goal is to backport design-level content from the implementation plan into the design doc, making it traceable and verifiable.
-Inputs:
-- Design doc: `<change-root>/<change-id>/design.md` (or an equivalent path you provide)
-- Implementation plan: `<change-root>/<change-id>/tasks.md` (or an equivalent path you provide)
-Tasks:
-1) Read the implementation plan and identify all items that are missing or under-specified in the design doc (group by section).
-2) Classify each candidate item:
-- Design-level (should be backported): impacts external semantics/user flows/system red lines/data contracts/evolution strategy/operations/governance/key decisions
-- Implementation-level (do not backport): implementation details, file paths, PR splitting, execution order, pseudocode details
-- Out-of-scope (do not backport and defer): not in scope or future phase not yet confirmed by design
-3) For Design-level items only, backport into the design doc:
-- Place in the most appropriate existing section; add small sections if needed, but do not restructure the whole doc
-- Write in "design constraints/decisions" tone; avoid implementation detail
-- If it conflicts with existing design: do not overwrite conclusions; add a "Proposed Design Change/Open Question" with reason, impact, and decision points
-- Update the design doc's "last updated" metadata (if present)
-4) Output:
-- A) Candidate list with classifications (by plan ID)
-- B) Minimal patch to the design doc (only added/modified paragraphs)
-- C) Traceability and anchor updates (prioritized):
-  - Which tests/static checks to add/update (A-class anchors)
-  - Which manual/hybrid acceptance items to add/update (B/C anchors, prefer `<change-root>/<change-id>/verification.md`)
-  - How to update the traceability matrix (prefer `<change-root>/<change-id>/verification.md`)
-Constraints:
-- Do not write file paths, class/function names, DB table names, or other implementation details into the design doc unless they are stable architectural boundaries.
-- Do not paste large pseudocode blocks; you may state invariants, complexity limits, and fallback strategies.
-- Keep language consistent with the design doc (English by default; include domain terms if needed).
-```
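
The classification and conflict rules in the removed prompt can be sketched as follows. The prompt defines the three categories only in prose, so the tag names, the `inScope` flag, and the function names below are all assumptions made for illustration. Where the prompt allows either a proposal-style backport or a deferral on conflict, this sketch picks the proposal path.

```javascript
// Hypothetical tags standing in for the five "Design-level" criteria.
const DESIGN_LEVEL_TAGS = new Set([
  'external-semantics', 'system-invariant', 'data-contract',
  'cross-cutting', 'key-decision',
]);

// Classify one plan item as Design-level, Implementation-level, or Out-of-scope.
function classify(item) {
  if (!item.inScope) return 'Out-of-scope';
  const designLevel = item.tags.some((tag) => DESIGN_LEVEL_TAGS.has(tag));
  return designLevel ? 'Design-level' : 'Implementation-level';
}

// Build the diff checklist: candidate items grouped by plan ID.
function buildChecklist(items) {
  const byPlanId = {};
  for (const item of items) {
    if (!byPlanId[item.planId]) byPlanId[item.planId] = [];
    byPlanId[item.planId].push({ title: item.title, classification: classify(item) });
  }
  return byPlanId;
}

// The design doc is the golden truth: a conflicting Design-level item is
// recorded as a proposed change and never overwrites the existing conclusion.
function backportAction(item, conflictsWithDesign) {
  if (classify(item) !== 'Design-level') return 'skip';
  return conflictsWithDesign ? 'propose-design-change' : 'backport';
}
```

In practice the classification is a judgment made by the editor following the prompt; the tag lookup only stands in for that judgment so the grouping and conflict handling can be shown.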

package/skills/devbooks-docs-sync/SKILL.md DELETED

@@ -1,43 +0,0 @@
----
-name: devbooks-docs-sync
-description: devbooks-docs-sync: Deprecated alias. Use devbooks-docs-consistency for documentation consistency checks.
-allowed-tools:
-  - Glob
-  - Grep
-  - Read
-  - Edit
-  - Write
-  - Bash
----
-# DevBooks: Docs Sync (Deprecated Alias)
-## Progressive Disclosure
-### Base (Required)
-Goal: Route to `devbooks-docs-consistency`.
-Inputs: user request and project/change context.
-Outputs: deprecation notice + next step.
-Boundaries: this alias should not add new behavior.
-Evidence: reference the target skill path.
-### Advanced (Optional)
-Use when you need: compatibility notes for existing automation.
-### Extended (Optional)
-Use when you need: faster search/impact via optional MCP capabilities.
-## Recommended MCP Capability Types
-- Code search (code-search)
-- Reference tracking (reference-tracking)
-- Impact analysis (impact-analysis)
-## Deprecation
-- Old name: `devbooks-docs-sync`
-- Current name: `devbooks-docs-consistency`
-- Use `skills/devbooks-docs-consistency/SKILL.md`
-## Next
-Run `devbooks-docs-consistency` for the same context (change-scoped or global).
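
For automation that still refers to the removed alias, the routing it described (emit a deprecation notice, then hand off to the target skill) might look like the sketch below. Only the `devbooks-docs-sync` to `devbooks-docs-consistency` pair comes from the deleted file; `resolveSkill` and the shape of its return value are hypothetical.

```javascript
// Alias table; a Map avoids accidental hits on inherited Object properties.
const DEPRECATED_ALIASES = new Map([
  ['devbooks-docs-sync', 'devbooks-docs-consistency'],
]);

// Resolve a requested skill name to the skill that should actually run,
// plus a deprecation notice when the request used an old name.
function resolveSkill(name) {
  const target = DEPRECATED_ALIASES.get(name);
  if (target === undefined) return { skill: name, notice: null };
  return {
    skill: target,
    notice: `${name} is deprecated; use ${target} (skills/${target}/SKILL.md).`,
  };
}
```

Because the alias file is deleted in 2.5.0, a shim of this kind would have to live in the caller's own tooling.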