@brunosps00/dev-workflow 0.9.0 → 0.10.0

Files changed (72)
  1. package/README.md +18 -19
  2. package/lib/constants.js +2 -10
  3. package/lib/migrate-gsd.js +1 -1
  4. package/package.json +1 -1
  5. package/scaffold/en/commands/dw-autopilot.md +6 -6
  6. package/scaffold/en/commands/dw-brainstorm.md +1 -1
  7. package/scaffold/en/commands/dw-bugfix.md +1 -0
  8. package/scaffold/en/commands/dw-code-review.md +1 -0
  9. package/scaffold/en/commands/dw-commit.md +6 -0
  10. package/scaffold/en/commands/dw-create-techspec.md +2 -0
  11. package/scaffold/en/commands/dw-deep-research.md +6 -0
  12. package/scaffold/en/commands/dw-deps-audit.md +1 -0
  13. package/scaffold/en/commands/dw-find-skills.md +4 -4
  14. package/scaffold/en/commands/dw-fix-qa.md +1 -0
  15. package/scaffold/en/commands/dw-generate-pr.md +1 -0
  16. package/scaffold/en/commands/dw-help.md +9 -29
  17. package/scaffold/en/commands/dw-intel.md +1 -1
  18. package/scaffold/en/commands/dw-refactoring-analysis.md +2 -1
  19. package/scaffold/en/commands/dw-review-implementation.md +28 -2
  20. package/scaffold/en/commands/dw-run-plan.md +2 -2
  21. package/scaffold/en/templates/idea-onepager.md +1 -1
  22. package/scaffold/pt-br/commands/dw-autopilot.md +6 -6
  23. package/scaffold/pt-br/commands/dw-brainstorm.md +1 -1
  24. package/scaffold/pt-br/commands/dw-bugfix.md +1 -0
  25. package/scaffold/pt-br/commands/dw-code-review.md +1 -0
  26. package/scaffold/pt-br/commands/dw-commit.md +6 -0
  27. package/scaffold/pt-br/commands/dw-create-techspec.md +2 -0
  28. package/scaffold/pt-br/commands/dw-deep-research.md +6 -0
  29. package/scaffold/pt-br/commands/dw-deps-audit.md +1 -0
  30. package/scaffold/pt-br/commands/dw-find-skills.md +4 -4
  31. package/scaffold/pt-br/commands/dw-fix-qa.md +1 -0
  32. package/scaffold/pt-br/commands/dw-generate-pr.md +1 -0
  33. package/scaffold/pt-br/commands/dw-help.md +9 -29
  34. package/scaffold/pt-br/commands/dw-intel.md +1 -1
  35. package/scaffold/pt-br/commands/dw-refactoring-analysis.md +2 -1
  36. package/scaffold/pt-br/commands/dw-review-implementation.md +21 -2
  37. package/scaffold/pt-br/commands/dw-run-plan.md +2 -2
  38. package/scaffold/pt-br/templates/idea-onepager.md +1 -1
  39. package/scaffold/skills/dw-codebase-intel/SKILL.md +1 -0
  40. package/scaffold/skills/dw-codebase-intel/references/api-design-discipline.md +138 -0
  41. package/scaffold/skills/dw-debug-protocol/SKILL.md +106 -0
  42. package/scaffold/skills/dw-debug-protocol/references/error-categorization.md +127 -0
  43. package/scaffold/skills/dw-debug-protocol/references/non-reproducible-strategy.md +108 -0
  44. package/scaffold/skills/dw-debug-protocol/references/six-step-triage.md +139 -0
  45. package/scaffold/skills/dw-debug-protocol/references/stop-the-line.md +52 -0
  46. package/scaffold/skills/dw-git-discipline/SKILL.md +120 -0
  47. package/scaffold/skills/dw-git-discipline/references/atomic-commits-discipline.md +158 -0
  48. package/scaffold/skills/dw-git-discipline/references/branch-hygiene.md +150 -0
  49. package/scaffold/skills/dw-git-discipline/references/trunk-based-pattern.md +82 -0
  50. package/scaffold/skills/dw-memory/SKILL.md +1 -2
  51. package/scaffold/skills/dw-simplification/SKILL.md +142 -0
  52. package/scaffold/skills/dw-simplification/references/behavior-preserving.md +148 -0
  53. package/scaffold/skills/dw-simplification/references/chestertons-fence.md +152 -0
  54. package/scaffold/skills/dw-simplification/references/complexity-metrics.md +147 -0
  55. package/scaffold/skills/dw-source-grounding/SKILL.md +128 -0
  56. package/scaffold/skills/dw-source-grounding/references/citation-protocol.md +108 -0
  57. package/scaffold/skills/dw-source-grounding/references/freshness-check.md +108 -0
  58. package/scaffold/skills/dw-source-grounding/references/source-priority.md +146 -0
  59. package/scaffold/skills/dw-verify/SKILL.md +0 -1
  60. package/scaffold/skills/vercel-react-best-practices/SKILL.md +4 -0
  61. package/scaffold/skills/vercel-react-best-practices/references/perf-discipline.md +122 -0
  62. package/scaffold/skills/webapp-testing/SKILL.md +5 -0
  63. package/scaffold/skills/webapp-testing/references/security-boundary.md +115 -0
  64. package/scaffold/skills/webapp-testing/references/three-workflow-patterns.md +144 -0
  65. package/scaffold/en/commands/dw-execute-phase.md +0 -149
  66. package/scaffold/en/commands/dw-plan-checker.md +0 -144
  67. package/scaffold/en/commands/dw-quick.md +0 -103
  68. package/scaffold/en/commands/dw-resume.md +0 -84
  69. package/scaffold/pt-br/commands/dw-execute-phase.md +0 -149
  70. package/scaffold/pt-br/commands/dw-plan-checker.md +0 -144
  71. package/scaffold/pt-br/commands/dw-quick.md +0 -103
  72. package/scaffold/pt-br/commands/dw-resume.md +0 -84
package/scaffold/skills/dw-simplification/references/behavior-preserving.md
@@ -0,0 +1,148 @@
1
+ # Behavior-preserving refactor — the test gate that protects you
2
+
3
+ The promise of simplification: the code is cleaner; nothing observable changed. The risk: you broke a subtle behavior nobody had a test for.
4
+
5
+ ## The test gate
6
+
7
+ Before simplifying ANY non-trivial code, ensure tests cover what you're about to change. Three cases:
8
+
9
+ ### Case A — tests exist and pass
10
+
11
+ Run them. They pass before, they pass after. Done.
12
+
13
+ ### Case B — tests exist but are incomplete
14
+
15
+ Run them. They pass before. Apply your simplification. They still pass — but you don't trust them to catch your specific concern.
16
+
17
+ Action: write a characterization test FIRST that pins the behavior you're worried about. Then simplify.
18
+
19
+ ### Case C — no tests at all
20
+
21
+ Don't simplify. The "preserve behavior exactly" claim is unverifiable. Either:
22
+ - Write characterization tests first (see below), then simplify.
23
+ - Or open a separate task ("add tests for X") and don't touch the code now.
24
+
25
+ ## Characterization tests
26
+
27
+ A characterization test documents what the code currently does, without claiming whether that's correct. It's a freeze of behavior at a point in time.
28
+
29
+ Pattern:
30
+
31
+ ```javascript
32
+ // Characterization test — not asserting CORRECTNESS,
33
+ // asserting BEHAVIOR-AT-TIME-OF-WRITING.
34
+ describe('legacyFormatter (characterization)', () => {
35
+ it('handles empty input by returning "(none)"', () => {
36
+ expect(legacyFormatter('')).toBe('(none)');
37
+ });
38
+
39
+ it('preserves trailing whitespace in output', () => {
40
+ expect(legacyFormatter('hello ')).toBe('hello ');
41
+ });
42
+
43
+ it('returns null when input is undefined', () => {
44
+ expect(legacyFormatter(undefined)).toBe(null);
45
+ });
46
+ });
47
+ ```
48
+
49
+ Goal: capture the WEIRD behaviors. Boring behaviors don't need tests; nobody breaks them. The weird parts are the load-bearing ones.
50
+
51
+ ## How to find weird behaviors to characterize
52
+
53
+ 1. Run the function with a range of inputs (empty, null, undefined, very long, special chars, edge values).
54
+ 2. Note any output that surprised you. That's a weird behavior.
55
+ 3. Write a test that pins it.
56
+
57
+ Example session:
58
+
59
+ ```bash
60
+ node -e "console.log(JSON.stringify(require('./src/legacy').format('')))"
61
+ # Output: "(none)" ← weird, document it.
62
+
63
+ node -e "console.log(JSON.stringify(require('./src/legacy').format('hello ')))"
64
+ # Output: "hello " ← preserves trailing whitespace? unusual, document.
65
+
66
+ node -e "console.log(JSON.stringify(require('./src/legacy').format(undefined)))"
67
+ # Output: null ← returns null not "(none)" for undefined? document.
68
+ ```
69
+
70
+ Tests written; now you can simplify. If after your refactor any of these tests fail, you broke behavior — even if your version "looks better".
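The probing session above can also be scripted. What follows is a minimal sketch in Python, assuming a hypothetical `legacy_format` function that mimics the three behaviors observed above; in practice you would import the real function instead. `characterize` and `PROBE_INPUTS` are illustrative names, not part of any library.

```python
def legacy_format(value):
    """Hypothetical stand-in for the legacy function being characterized."""
    if value is None:
        return None
    if value == "":
        return "(none)"
    return str(value)


def characterize(func, inputs):
    """Record what func does today for each input, including raised exceptions."""
    observed = {}
    for item in inputs:
        try:
            observed[repr(item)] = ("returned", func(item))
        except Exception as exc:  # a raised error is behavior too
            observed[repr(item)] = ("raised", type(exc).__name__)
    return observed


# Edge values: empty, missing, trailing whitespace, very long, special chars, numbers.
PROBE_INPUTS = ["", None, "hello ", "x" * 10_000, "\u00e9\n\t", 0, -1]

# Freeze current behavior before touching the code.
baseline = characterize(legacy_format, PROBE_INPUTS)


def simplified_format(value):
    """Candidate simplification; it must reproduce the baseline exactly."""
    if value is None:
        return None
    return str(value) or "(none)"


# The gate: any difference means the refactor changed observable behavior.
assert characterize(simplified_format, PROBE_INPUTS) == baseline
```

Recording exceptions alongside return values matters here: a function that used to raise and now returns quietly has changed behavior just as much as one that returns a different string.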
71
+
72
+ ## Refactor with the test gate
73
+
74
+ ```
75
+ 1. Run tests → GREEN (baseline)
76
+ 2. Apply simplification (one logical change)
77
+ 3. Run tests → expect GREEN
78
+ - If RED → revert, understand why, retry differently
79
+ 4. Commit (atomic — one simplification per commit)
80
+ 5. Repeat for next simplification
81
+ ```
82
+
83
+ DO NOT batch multiple simplifications into one commit. If something breaks, you can't bisect which change caused it.
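As a rough illustration of the loop above, here is a sketch in Python. The three callables are assumptions standing in for whatever your project uses (for example, a test runner invocation, an edit, and a `git checkout` of the touched files); `gated_simplification` is a hypothetical helper, not an existing tool.

```python
from typing import Callable


def gated_simplification(
    run_tests: Callable[[], bool],
    apply_change: Callable[[], None],
    revert_change: Callable[[], None],
) -> str:
    """Apply one simplification only if tests are green before and after."""
    if not run_tests():
        # Step 1 failed: there is no trustworthy baseline to compare against.
        return "baseline-red"
    apply_change()  # step 2: exactly one logical change
    if run_tests():  # step 3
        return "commit"  # step 4: safe to commit this change on its own
    revert_change()  # RED: undo first, then work out why before retrying
    return "reverted"
```

Returning a status string instead of committing directly keeps the sketch free of side effects; the caller decides what "commit" means in its own workflow.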
84
+
85
+ ## Rollback patterns
86
+
87
+ If the simplification went wrong AND the issue surfaces only post-merge:
88
+
89
+ ```bash
90
+ git revert <simplification-sha>
91
+ ```
92
+
93
+ Atomic commits make this clean. The revert is one commit; tests pass again; you can investigate at leisure.
94
+
95
+ If multiple simplifications were stacked and one broke:
96
+
97
+ ```bash
98
+ # Find which one
99
+ git bisect start
100
+ git bisect bad HEAD
101
+ git bisect good <known-good-sha>
102
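+ # At each step, run the tests and mark the commit with `git bisect good` or `git bisect bad` (or automate with `git bisect run <test-command>`); finish with `git bisect reset`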
103
+ ```
104
+
105
+ Bisect requires that EACH commit be testable in isolation — another reason for atomic commits.
106
+
107
+ ## Codemod tooling per language
108
+
109
+ For refactors >500 lines, manual edits are dangerous. Use AST-based codemods:
110
+
111
+ ### TypeScript / JavaScript
112
+ - **jscodeshift** — Facebook's AST-based codemod runner. Many community codemods at `npmjs.com/package/<framework>-codemods`.
113
+ - **ts-morph** — programmatic TypeScript AST manipulation. Good for one-off refactors.
114
+ - **ESLint --fix** — for any rule that has a fixer (most of them); narrow scope.
115
+
116
+ ```bash
117
+ # Run a codemod from a local transform file (the path below is an example)
118
+ npx jscodeshift -t ./codemods/extract-method.js src/
119
+ ```
120
+
121
+ ### Python
122
+ - **libcst** — Instagram's concrete syntax tree library. Preserves comments + formatting.
123
+ - **ast** module + custom walker for simpler cases.
124
+ - **`ruff check --fix`** for known auto-fixable rules.
125
+
126
+ ### C# / .NET
127
+ - **Roslyn analyzers + code fixes** — write a custom analyzer with a fix provider; runs across the solution.
128
+ - **Visual Studio Refactor menu** for IDE-driven multi-file renames/extractions.
129
+
130
+ ### Rust
131
+ - **rust-analyzer** has refactor support: rename, extract function, inline variable.
132
+ - **rustfmt + `cargo clippy --fix`** for style/idiom-level fixes.
133
+
134
+ ## Don't
135
+
136
+ - Don't simplify and update tests in the same commit. The test changes hide what you changed in the code.
137
+ - Don't treat a passing lint run as "behavior preserved". Lint catches style issues and statically detectable mistakes; tests catch behavior.
138
+ - Don't trust "no test failed therefore behavior preserved" if test coverage is <60%. Add tests first.
139
+ - Don't pre-emptively simplify for a future hypothetical maintainer. Simplify when you have a present reason (a bug, a read-time confusion).
140
+
141
+ ## When the refactor is too risky
142
+
143
+ If steps 2-3 of the test gate consistently fail (you can't pin the behavior, or your simplification keeps breaking subtle things):
144
+
145
+ - The code is genuinely complex for a reason; leave it.
146
+ - Or: rewrite-with-tests as a separate, larger task — not in-flight simplification.
147
+
148
+ The right call sometimes is "this code is ugly but works; ship around it; revisit when you have hours, not minutes."
package/scaffold/skills/dw-simplification/references/chestertons-fence.md
@@ -0,0 +1,152 @@
1
+ # Chesterton's Fence — understand WHY before changing
2
+
3
+ > "If you find a fence in the middle of a field, don't tear it down until you know why it was put there." — G.K. Chesterton (paraphrased)
4
+
5
+ In code, the "fence" is anything that looks unnecessary on first read. Most code that looks unnecessary is actually load-bearing for a case the casual reader hasn't encountered.
6
+
7
+ ## The protocol
8
+
9
+ Before removing or rewriting anything:
10
+
11
+ ### 1. What does it actually do?
12
+
13
+ Don't trust the name. Names lie. Names get repurposed. Read the body. Trace the call graph.
14
+
15
+ If you can't say in one sentence what the code does (not what it's named, what it DOES), you don't yet understand it.
16
+
17
+ ### 2. Why was it added?
18
+
19
+ ```bash
20
+ # When did this line/block enter the codebase?
21
+ git log -L <start>,<end>:<file> --oneline --no-patch | tail -5   # oldest commits are listed last
22
+
23
+ # Or for a whole file
24
+ git log --follow --oneline <file> | tail -10   # oldest commits are listed last
25
+
26
+ # Read the introducing commit
27
+ git show <commit-hash>
28
+ ```
29
+
30
+ The commit message + PR title + linked issue often have the rationale. Common findings:
31
+
32
+ - "Fix race condition when X happens during Y" — the seemingly redundant check is the fix.
33
+ - "Workaround for [framework] bug #1234" — the weird code is a workaround.
34
+ - "Required for [external system] compatibility" — the cast/wrapper exists for a reason.
35
+ - "TODO: simplify after we deprecate Y" — there's an explicit plan, follow it.
36
+
37
+ ### 3. What breaks if it's gone?
38
+
39
+ ```bash
40
+ # Find callers
41
+ rg -F "<symbol>" --type <lang>
42
+
43
+ # Run the test suite
44
+ <test-runner>
45
+ ```
46
+
47
+ If tests pass with the code removed, it might be safe — but tests have gaps. Cross-check with:
48
+ - Search the whole codebase for the symbol/string
49
+ - Check `.dw/intel/files.json` for `imports` referencing it
50
+ - For exported symbols, check downstream consumers (other repos, npm, etc.)
51
+
52
+ If even one of the three answers is "I don't know", DON'T REMOVE.
53
+
54
+ ## Case studies
55
+
56
+ ### Case 1 — "Redundant" early return
57
+
58
+ Code:
59
+
60
+ ```python
61
+ def process(item):
62
+     if item is None:
63
+         return
64
+     if item.id is None:
65
+         return
66
+     # ...rest...
67
+ ```
68
+
69
+ Looks like the second `if` could merge into the first via `if item is None or item.id is None`. But:
70
+
71
+ - `item is None` is a system error (caller passed None).
72
+ - `item.id is None` is a domain state (item not yet persisted).
73
+
74
+ The two cases warrant different downstream behavior in the future (e.g., logging the second but not the first). Merging them collapses the distinction. Chesterton's Fence: leave alone unless you also remove the future flexibility intentionally.
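To make the distinction concrete, here is a sketch of what the function might look like once the two guards diverge. The logging call, the logger name, and the return value are assumptions for illustration only; the original snippet elides the rest of the function.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("items")  # hypothetical logger name


def process(item):
    if item is None:
        # System error: the caller passed nothing. Still handled silently.
        return None
    if item.id is None:
        # Domain state: the item exists but has not been persisted yet.
        # Only this guard gained logging; a merged condition would have
        # forced both cases to share one behavior.
        logger.info("skipping unpersisted item")
        return None
    return f"processed:{item.id}"  # placeholder for the elided rest


result = process(SimpleNamespace(id=7))
```

Had the guards been merged into one `or` condition, adding the log line would first require splitting them apart again.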
75
+
76
+ ### Case 2 — "Useless" type cast
77
+
78
+ Code:
79
+
80
+ ```typescript
81
+ const value = (someApi.fetch() as unknown) as MyType;
82
+ ```
83
+
84
+ Looks like `as MyType` would suffice. But the double-cast was added because the API's actual return type is incompatible with `MyType` and a direct cast errors. Removing the `as unknown` step breaks compilation.
85
+
86
+ The fix isn't to remove the double-cast; it's to either fix `MyType` (if the API is right) or fix the API typings (if `MyType` is right). Chesterton's Fence: read the commit message — usually says "compiler workaround for incompatible types".
87
+
88
+ ### Case 3 — "Duplicated" validation
89
+
90
+ Code:
91
+
92
+ ```javascript
93
+ function createUser(payload) {
94
+ if (!payload.email) throw new ValidationError('email required');
95
+ // ...
96
+ }
97
+
98
+ function updateUser(id, payload) {
99
+ if (!payload.email) throw new ValidationError('email required');
100
+ // ...
101
+ }
102
+ ```
103
+
104
+ Looks like duplication; "extract" to `validatePayload(payload)`. But:
105
+
106
+ - `createUser` requires email.
107
+ - `updateUser` may eventually allow email-less updates (e.g., just change name).
108
+
109
+ Today they happen to share validation; tomorrow they might not. The "duplication" preserves independence of evolution. Don't extract a shared validator until at least 3 callers exist with truly identical rules.
110
+
111
+ ### Case 4 — "Dead" branch
112
+
113
+ Code:
114
+
115
+ ```typescript
116
+ if (process.env.LEGACY_MIGRATION_MODE === 'true') {
117
+ return legacyTransform(input);
118
+ }
119
+ return modernTransform(input);
120
+ ```
121
+
122
+ Looks dead because nobody sets `LEGACY_MIGRATION_MODE` in test/dev. But:
123
+
124
+ - Production sets it on the data-import service.
125
+ - The migration runs once a quarter when new historical data lands.
126
+
127
+ Removing the branch means the next quarterly migration breaks silently. Chesterton's Fence: check production env vars / feature flags / runtime configs before deleting "dead" branches.
128
+
129
+ ## When you DO act
130
+
131
+ After all three protocol steps, if:
132
+
133
+ 1. You understand what the code does.
134
+ 2. You know why it was added (and the reason no longer applies, or you have evidence it never applied).
135
+ 3. You've verified nothing breaks.
136
+
137
+ THEN you can simplify. Cite the rationale in the commit message:
138
+
139
+ ```
140
+ refactor(auth): remove fallback to legacy session API
141
+
142
+ The fallback was added in 2022-04 for compatibility with the v1
143
+ session service, which was decommissioned in 2024-09 (PR #4567).
144
+ No callers reference the legacy code path; tests confirm the
145
+ modern path covers all scenarios. Removed.
146
+ ```
147
+
148
+ Future maintainers can see why the change was safe. They can also see — if it turns out NOT to be safe — exactly what assumption was wrong.
149
+
150
+ ## Anti-pattern
151
+
152
+ Removing code because "it looks unused" without doing the three steps. Speed is not the goal; correctness is. A 30-minute investigation to avoid a 3-day production rollback is a great trade.
package/scaffold/skills/dw-simplification/references/complexity-metrics.md
@@ -0,0 +1,147 @@
1
+ # Complexity metrics — when each one actually matters
2
+
3
+ Metrics are signal, not verdict. A function with cyclomatic complexity 12 isn't automatically "bad" — but it's worth a second look. This file describes the four common metrics, when each matters, and how to measure cheaply.
4
+
5
+ ## The four metrics
6
+
7
+ ### 1. Cyclomatic complexity
8
+
9
+ Counts linearly independent paths through a function. Start at 1; each decision point (`if`, `for`, `while`, `case`, `&&`, `||`, `?:`) adds one. A plain `else` adds nothing because it introduces no new condition.
10
+
11
+ | Score | What it means | Action |
12
+ |-------|---------------|--------|
13
+ | 1-5 | Simple, easily testable | Leave alone |
14
+ | 6-10 | Moderate; usually fine | Test thoroughly |
15
+ | 11-20 | Complex; review case-by-case | Strong candidate to simplify |
16
+ | 21+ | Very complex; high bug rate empirically | Definitely simplify (or split) |
17
+
18
+ Caveat: long switch/match statements over enum values can score high but be naturally clear. Cyclomatic complexity over-penalizes them.
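For intuition, the count can be approximated in a few lines. This is a simplified sketch using Python's standard `ast` module, not a replacement for a real tool: it scores a whole source string rather than each function separately, ignores `match` statements and comprehension conditions, and the function and sample names are illustrative.

```python
import ast


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.IfExp, ast.For, ast.While, ast.ExceptHandler)):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` holds three values but adds only two branches.
            score += len(node.values) - 1
    return score


SAMPLE = '''
def classify(order):
    if order is None or order.total <= 0:
        return "invalid"
    elif order.total > 1000:
        return "large"
    else:
        return "normal"
'''

# if (+1), or (+1), elif (+1), plain else (+0), on top of the base of 1.
score = cyclomatic_complexity(SAMPLE)
```

Note that the `elif` counts because it carries its own condition, while the trailing `else` does not.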
19
+
20
+ ### 2. Cognitive complexity
21
+
22
+ Counts how hard the function is to **understand** (not just trace through). Nesting amplifies; flat structures don't. `goto`, `break N levels`, and recursion add weight.
23
+
24
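+ This metric is generally considered a better proxy for readability (and, by extension, defect risk) than cyclomatic complexity. Several tools compute it (SonarQube, ESLint with the `sonarjs` plugin, Clippy's `cognitive_complexity` lint). Note that `radon` for Python reports cyclomatic complexity and a maintainability index, not cognitive complexity.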
25
+
26
+ | Score | Action |
27
+ |-------|--------|
28
+ | 0-9 | Fine |
29
+ | 10-15 | Review |
30
+ | 16+ | Refactor |
31
+
32
+ ### 3. Nesting depth
33
+
34
+ Counts the maximum depth of `if`/`for`/`try` nesting in a function.
35
+
36
+ | Depth | Action |
37
+ |-------|--------|
38
+ | 1-2 | Fine |
39
+ | 3 | Acceptable; consider early returns |
40
+ | 4 | Smell; usually flatten via guard clauses |
41
+ | 5+ | Almost always refactor |
42
+
43
+ Easiest to fix via early returns:
44
+
45
+ ```javascript
46
+ // Before — depth 4
47
+ function process(req) {
48
+   if (req) {
49
+     if (req.user) {
50
+       if (req.user.active) {
51
+         if (req.body) {
52
+           return doWork(req.body);
53
+         }
54
+       }
55
+     }
56
+   }
57
+   return null;
58
+ }
59
+
60
+ // After — depth 1
61
+ function process(req) {
62
+   if (!req) return null;
63
+   if (!req.user) return null;
64
+   if (!req.user.active) return null;
65
+   if (!req.body) return null;
66
+   return doWork(req.body);
67
+ }
68
+ ```
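Nesting depth is also cheap to measure yourself. Below is a rough sketch using Python's `ast` module on Python equivalents of the two versions above. The names are illustrative, and one known limitation applies: an `elif` is represented as an `if` nested inside the `else` branch, so this sketch counts `elif` chains as deeper than they read.

```python
import ast

# Constructs treated as adding one nesting level in this sketch.
NESTING_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.With)


def max_nesting_depth(source: str) -> int:
    """Return the deepest level of nested control-flow blocks in source."""

    def depth(node: ast.AST, current: int) -> int:
        deepest = current
        for child in ast.iter_child_nodes(node):
            level = current + 1 if isinstance(child, NESTING_NODES) else current
            deepest = max(deepest, depth(child, level))
        return deepest

    return depth(ast.parse(source), 0)


NESTED = '''
def process(req):
    if req:
        if req.user:
            if req.user.active:
                if req.body:
                    return do_work(req.body)
    return None
'''

FLAT = '''
def process(req):
    if not req:
        return None
    if not req.user:
        return None
    if not req.user.active:
        return None
    if not req.body:
        return None
    return do_work(req.body)
'''

nested_depth = max_nesting_depth(NESTED)
flat_depth = max_nesting_depth(FLAT)
```

Both versions contain four conditions, so their cyclomatic scores match; only the nesting measurement distinguishes them.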
69
+
70
+ ### 4. Fan-out (outgoing dependencies)
71
+
72
+ Number of distinct other modules a function calls. High fan-out = many touchpoints; refactor risky.
73
+
74
+ | Fan-out | Action |
75
+ |---------|--------|
76
+ | 0-3 | Fine |
77
+ | 4-7 | Acceptable |
78
+ | 8+ | Function probably has too many concerns; consider splitting |
79
+
80
+ ## When each metric matters
81
+
82
+ | Symptom | Metric to check |
83
+ |---------|-----------------|
84
+ | "This function has a lot of `if`s" | Cyclomatic + nesting depth |
85
+ | "I keep getting lost reading this" | Cognitive complexity |
86
+ | "Tests miss edge cases" | Cyclomatic (each path needs a test) |
87
+ | "Refactor keeps breaking adjacent code" | Fan-out + fan-in (how many callers depend on the function) |
88
+ | "Function is 200 lines" | Length + nesting (length alone is weak signal) |
89
+
90
+ ## How to measure cheaply
91
+
92
+ ### TypeScript / JavaScript
93
+
94
+ ```bash
95
+ # Cognitive complexity via ESLint (assumes eslint-plugin-sonarjs is installed and registered in your ESLint config)
96
+ npx eslint --rule 'sonarjs/cognitive-complexity: ["error", 15]' src/
97
+
98
+ # Or per-file analysis
99
+ npx complexity-report src/foo.ts
100
+
101
+ # Cyclomatic complexity via ESLint's built-in `complexity` rule
102
+ npx eslint --rule 'complexity: ["warn", 10]' src/
103
+ ```
104
+
105
+ ### Python
106
+
107
+ ```bash
108
+ # Radon: cyclomatic complexity and maintainability index
109
+ pip install radon
110
+ radon cc src/ -a # cyclomatic; -a appends the overall average
111
+ radon mi src/ # maintainability index
112
+
113
+ # Or via ruff
114
+ ruff check --select C901 src/
115
+ ```
116
+
117
+ ### C# / .NET
118
+
119
+ ```bash
120
+ # Code metrics via Roslyn
121
+ dotnet build /p:CodeAnalysisRuleSet=...
122
+ # Or in Visual Studio: Analyze → Calculate Code Metrics
123
+ ```
124
+
125
+ ### Rust
126
+
127
+ ```bash
128
+ # Clippy + complexity lints
129
+ cargo clippy -- -W clippy::cognitive_complexity
130
+
131
+ # Or rust-code-analysis CLI
132
+ rust-code-analysis-cli -m -p src/
133
+ ```
134
+
135
+ ## Don't
136
+
137
+ - **Don't chase a score.** "Cyclomatic 9 vs 10" is meaningless. The function either reads clearly or doesn't.
138
+ - **Don't refactor purely to lower a number.** If a cyclomatic score of 16+ comes from a flat switch over 16 enum cases, leaving it alone is fine.
139
+ - **Don't measure the whole repo at once and act on the top 100.** Most will be false positives. Focus on functions you're touching.
140
+ - **Don't ignore the trend.** A file going from average cognitive 8 → 14 over 6 months is a smell, even if no individual function crossed the threshold.
141
+
142
+ ## Healthy use of metrics
143
+
144
+ - Run on PRs that touched complex code; flag if a function moved to a worse bucket.
145
+ - Run on long-lived hot files quarterly; spot drift.
146
+ - Set CI gates ONLY at gross thresholds (e.g., cognitive >25), not at edge thresholds (>10).
147
+ - Treat metrics as conversation starters, not pass/fail gates.
package/scaffold/skills/dw-source-grounding/SKILL.md
@@ -0,0 +1,128 @@
1
+ ---
2
+ name: dw-source-grounding
3
+ description: Discipline of grounding architectural and dependency decisions in versioned official documentation, with mandatory citations. Other commands invoke this skill when they need to decide based on framework/library behavior — never on hallucinated APIs or stale Stack Overflow answers. Adapted from addyosmani/agent-skills (MIT).
4
+ allowed-tools:
5
+ - Read
6
+ - Bash
7
+ - Grep
8
+ - Glob
9
+ - WebFetch
10
+ ---
11
+
12
+ # dw-source-grounding
13
+
14
+ Behavioral protocol for grounding decisions in **versioned, official documentation** — and citing those sources verifiably. Used by `dw-create-techspec`, `dw-deps-audit`, `dw-deep-research` whenever a decision depends on what a framework or library actually does at the version installed in the project.
15
+
16
+ ## Why this skill exists
17
+
18
+ Decisions made on hallucinated APIs or 2-year-old Stack Overflow answers cause silent breakage. The cost shows up later — code that "worked in testing" because the agent matched an older version's API, then breaks in production where the real version is different. This skill enforces a four-step protocol that prevents that class of failure.
19
+
20
+ ## When to Use
21
+
22
+ Read this skill when:
23
+
24
+ - A command needs to recommend a library/framework version (`dw-deps-audit` brainstorm phase).
25
+ - A command must propose architectural patterns that depend on framework specifics (`dw-create-techspec`).
26
+ - A command is researching a topic that has version-specific answers (`dw-deep-research`).
27
+ - You're about to cite an API, CLI flag, configuration option, or behavior — and you want the citation to be verifiable later.
28
+
29
+ Do NOT use when:
30
+
31
+ - The decision doesn't depend on external documentation (e.g., naming a variable inside a single function).
32
+ - The library/framework version is irrelevant to the answer (e.g., "use a hash map for O(1) lookup").
33
+ - You're writing examples that are intentionally generic / pseudocode.
34
+
35
+ ## The Protocol — Detect → Fetch → Implement → Cite
36
+
37
+ ### 1. Detect — read the actual version first
38
+
39
+ Before researching anything, read the project's manifest and identify the EXACT version of the library/framework that matters:
40
+
41
+ | Stack | File | Field |
42
+ |-------|------|-------|
43
+ | Node/TS | `package.json` | `dependencies`, `devDependencies` |
44
+ | Python | `pyproject.toml`, `requirements*.txt`, `Pipfile.lock` | each dep with version |
45
+ | .NET | `*.csproj`, `packages.lock.json` | `PackageReference Version="..."` |
46
+ | Rust | `Cargo.toml`, `Cargo.lock` | `[dependencies]` |
47
+
48
+ Record the version. If a range (`^4.18.0`), note the lockfile-resolved version.
49
+
50
+ If no manifest exists OR the dep is not yet in the manifest (e.g., choosing what to install), record "no version yet — choosing fresh".
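For the Node/TS row, the Detect step could look roughly like the sketch below. It assumes an npm lockfile in the v2/v3 layout, where a top-level `packages` map is keyed by `node_modules/<name>`; other package managers store this differently. `detect_version` and `load_json` are hypothetical helper names.

```python
import json
from pathlib import Path
from typing import Optional


def load_json(path: Path) -> Optional[dict]:
    """Return parsed JSON, or None when the file does not exist."""
    return json.loads(path.read_text()) if path.exists() else None


def detect_version(manifest: Optional[dict], lock: Optional[dict], package: str) -> dict:
    """Report the declared range and the lockfile-resolved version of a package."""
    declared = None
    for field in ("dependencies", "devDependencies"):
        declared = (manifest or {}).get(field, {}).get(package, declared)
    if declared is None:
        # No manifest, or the dependency has not been added yet.
        return {"declared": None, "resolved": None, "note": "no version yet, choosing fresh"}
    entry = (lock or {}).get("packages", {}).get(f"node_modules/{package}", {})
    return {"declared": declared, "resolved": entry.get("version"), "note": "ok"}
```

A caller would typically pass `load_json(Path("package.json"))` and `load_json(Path("package-lock.json"))`. Keeping the lookup separate from file access makes it easy to exercise with plain dictionaries.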
51
+
52
+ ### 2. Fetch — pull the matching version's official docs
53
+
54
+ Authority hierarchy:
55
+
56
+ 1. **Official docs** for the EXACT version (or nearest stable). E.g., `18.react.dev/reference/react` (the React 18 docs), not the `react.dev` default.
57
+ 2. **Official changelog / migration guide** when transitioning across versions.
58
+ 3. **Web standards** (MDN, RFCs, W3C) for cross-implementation behavior.
59
+ 4. **Compatibility tables** (caniuse, MDN browser-compat-data) for API support across runtimes.
60
+
61
+ Forbidden as primary source:
62
+
63
+ - Stack Overflow answers (use only as discovery, then verify via official).
64
+ - Tutorial blogs (frequently outdated; never authoritative).
65
+ - AI training data (your training is months/years stale).
66
+ - README screenshots from random GitHub repos.
67
+
68
+ Fetch via `WebFetch` or `mcp__context7__*` if available. If both fail, surface to the user that you're falling back to training-data knowledge AND mark the citation `[source: training-data, unverified]`.
69
+
70
+ ### 3. Implement — apply the documented pattern
71
+
72
+ Use exactly the API the documented version provides. Don't mix patterns from multiple versions ("this useEffect example is from React 16; you're on 18.3"). When the doc shows multiple acceptable patterns, pick the simplest that matches the project's style.
73
+
74
+ If the doc presents migration warnings (e.g., "deprecated in v5, use X instead"), follow the new path unless the project explicitly pins to the old version for a documented reason.
75
+
76
+ ### 4. Cite — record the source verifiably
77
+
78
+ Every decision that depended on an external source ends with a citation block:
79
+
80
+ ```
81
+ [source: <url>, version: <X.Y>, retrieved: <YYYY-MM-DD>]
82
+ ```
83
+
84
+ Examples:
85
+
86
+ ```
87
+ [source: https://react.dev/reference/react/useEffect, version: 18.3, retrieved: 2026-05-07]
88
+ [source: https://docs.python.org/3.12/library/asyncio-task.html, version: 3.12, retrieved: 2026-05-07]
89
+ [source: https://docs.aws.amazon.com/sdk-for-javascript/v3/developer-guide/welcome.html, version: SDK v3, retrieved: 2026-05-07]
90
+ ```
91
+
92
+ When citing in PRDs, techspecs, decision logs, or deps-audit reports, the citation is mandatory adjacent to each claim. A future engineer reading the doc can click and verify.
93
+
94
+ ## How `dw-create-techspec` uses this
95
+
96
+ Before writing the "Architectural Decisions" section, the techspec command:
97
+
98
+ 1. Lists every framework/library decision the techspec depends on (e.g., "use Server Actions for mutations").
99
+ 2. For each, runs the protocol: detects version, fetches official doc, cites verifiably.
100
+ 3. Writes each decision as: `<decision> — <one-line rationale> — [source: ...]`.
101
+
102
+ If the protocol can't reach official docs (offline, paywall, dead link), the techspec prefixes the decision with `⚠ training-data fallback` so the human reviewer knows to verify.
103
+
104
+ ## How `dw-deps-audit` uses this
105
+
106
+ In the brainstorm phase (Conservative/Balanced/Bold per package), each option's "target version" cites the source where that version's release notes were checked. This catches the "agent recommends v5 because it sounds modern, but v5 dropped Node 18 support" class of error.
107
+
108
+ ## How `dw-deep-research` uses this
109
+
110
+ `dw-deep-research` already does multi-source research; this skill adds the citation discipline. Each finding line ends with a `[source: ...]` block. The output report's bibliography is built from these citations automatically.
111
+
112
+ ## Anti-patterns
113
+
114
+ 1. Citing Stack Overflow as primary source. (Use as DISCOVERY, then fetch the official doc the SO answer points to.)
115
+ 2. Citing "the docs" without a URL. The whole point is verifiability.
116
+ 3. Citing a doc URL that isn't pinned to a version (e.g., `react.dev` instead of `18.react.dev/reference/react`).
117
+ 4. Pretending knowledge is current when it's training data. Mark unverified.
118
+ 5. Citing your own previous answer in this session as authority. The chain has to terminate at an external source.
119
+
120
+ ## References
121
+
122
+ - `references/citation-protocol.md` — exact format of `[source: ...]` blocks; how to consolidate multiple citations in a single decision; how to track citation freshness over time.
123
+ - `references/source-priority.md` — full hierarchy with examples; when secondary sources are acceptable.
124
+ - `references/freshness-check.md` — how to validate a doc URL still applies to the version in use; how to detect doc drift between when you fetched and when the user reads the artifact.
125
+
126
+ ## Inspired by
127
+
128
+ Adapted from [`addyosmani/agent-skills/source-driven-development`](https://github.com/addyosmani/agent-skills) by Addy Osmani (MIT license). Core protocol (Detect → Fetch → Implement → Cite) and source authority hierarchy preserved. dev-workflow integration: invoked by `dw-create-techspec`, `dw-deps-audit`, `dw-deep-research` via Complementary Skills, and citation format aligned with our existing report frontmatter conventions (`type: ...`, `schema_version: ...`).
package/scaffold/skills/dw-source-grounding/references/citation-protocol.md
@@ -0,0 +1,108 @@
1
+ # Citation protocol — exact format of `[source: ...]` blocks
2
+
3
+ Citations are not optional decoration. They turn a decision into a verifiable claim. Future engineers reading a techspec/deps-audit/research report click the URL and confirm.
4
+
5
+ ## Required format
6
+
7
+ Single citation:
8
+
9
+ ```
10
+ [source: <url>, version: <X.Y>, retrieved: <YYYY-MM-DD>]
11
+ ```
12
+
13
+ All three fields required:
14
+
15
+ - `<url>` — direct URL to the section/page that supports the claim. Not the docs homepage.
16
+ - `<version>` — the version of the library/framework the URL applies to. If the URL is version-pinned (e.g., `?version=18`), the value here matches.
17
+ - `<retrieved>` — ISO date you fetched. Establishes when the doc was current.
18
+
19
+ Examples (good):
20
+
21
+ ```
22
+ [source: https://react.dev/reference/react/useEffect, version: 18.3, retrieved: 2026-05-07]
23
+ [source: https://nextjs.org/docs/app/building-your-application/data-fetching/server-actions-and-mutations, version: 14.2, retrieved: 2026-05-07]
24
+ [source: https://docs.python.org/3.12/library/asyncio.html, version: 3.12, retrieved: 2026-05-07]
25
+ [source: https://docs.docker.com/compose/compose-file/build/, version: Compose Spec, retrieved: 2026-05-07]
26
+ ```
27
+
28
+ Examples (bad):
29
+
30
+ ```
31
+ [source: react.dev] # no version, no date, not pinned
32
+ [source: docs, retrieved: today] # vague URL, vague date
33
+ [source: https://stackoverflow.com/q/12345] # not authoritative — see source-priority.md
34
+ [the React docs say ...] # not a citation block; future reader can't verify
35
+ ```
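The required format is regular enough to check mechanically. The following is a minimal sketch of such a check in Python; `parse_citation` and the regular expression are assumptions for illustration, not part of the package. It validates the shape of a citation only and says nothing about whether the source is authoritative.

```python
import re
from datetime import date
from typing import Optional

# All three fields are required, in this order, with an http(s) URL.
CITATION_RE = re.compile(
    r"\[source: (?P<url>https?://\S+), "
    r"version: (?P<version>[^,\]]+), "
    r"retrieved: (?P<retrieved>\d{4}-\d{2}-\d{2})\]"
)


def parse_citation(text: str) -> Optional[dict]:
    """Return the three fields of a well-formed citation block, else None."""
    match = CITATION_RE.fullmatch(text.strip())
    if match is None:
        return None
    try:
        retrieved = date.fromisoformat(match["retrieved"])
    except ValueError:
        return None  # looks like a date but is not a real calendar day
    return {
        "url": match["url"],
        "version": match["version"].strip(),
        "retrieved": retrieved,
    }
```

Running this over the good examples yields a dictionary for each; the bad examples all return `None` because a field is missing or the URL is not a full link.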
36
+
37
+ ## Multiple citations on one decision
38
+
39
+ When a decision rests on more than one source (e.g., "use Server Actions for mutations" depends on the React 19 form actions API + the Next.js 14 routing layer), list each:
40
+
41
+ ```
42
+ Decision: use Server Actions for the form submission flow.
43
+
44
+ Rationale: Server Actions provide automatic revalidation
45
+ [source: https://nextjs.org/docs/app/building-your-application/data-fetching/server-actions-and-mutations, version: 14.2, retrieved: 2026-05-07]
46
+ and align with React 19's `useActionState`
47
+ [source: https://react.dev/reference/react/useActionState, version: 19.0, retrieved: 2026-05-07].
48
+ ```
49
+
50
+ Inline form for short claims:
51
+
52
+ ```
53
+ - Postgres 16 introduces `pg_stat_io` for IO observability
54
+ [source: https://www.postgresql.org/docs/16/monitoring-stats.html, version: 16, retrieved: 2026-05-07].
55
+ ```
56
+
57
+ ## Where citations live
58
+
59
+ Per artifact:
60
+
61
+ | Artifact | Citations live in |
62
+ |----------|-------------------|
63
+ | PRD (`prd.md`) | "Open Questions" section when the answer needs research, OR inline next to specific functional requirements |
64
+ | TechSpec (`techspec.md`) | Inline next to every architectural decision; consolidated in a "Sources" section at the end |
65
+ | Deps-audit report | Adjacent to each package's recommended version |
66
+ | Deep-research report | Inline next to every finding; consolidated bibliography auto-generated |
67
+ | ADR | Mandatory in the "References" section |
68
+
69
+ ## Freshness notation
70
+
71
+ If the citation is older than 90 days when the artifact is consumed, suspect drift. Best practice:
72
+
73
+ - Re-verify the citation before acting on the decision.
74
+ - If the cited URL has moved or been deprecated, update the artifact: `[source: <new-url>, version: ..., retrieved: <new-date>, supersedes: <old-url>]`.
75
+
76
+ For artifacts that age (e.g., a 6-month-old ADR), agents reading them downstream should flag stale citations rather than silently trust them.
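A downstream agent could apply the 90-day rule with a check along these lines. This is a sketch; `is_stale` and `MAX_AGE_DAYS` are hypothetical names, and the threshold simply mirrors the number given above.

```python
from datetime import date

MAX_AGE_DAYS = 90  # threshold taken from the freshness rule above


def is_stale(retrieved: date, consumed_on: date, max_age_days: int = MAX_AGE_DAYS) -> bool:
    """True when the citation is older than the allowed age at consumption time."""
    return (consumed_on - retrieved).days > max_age_days
```

Passing the consumption date explicitly, rather than calling `date.today()` inside the function, keeps the check deterministic and easy to test.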
77
+
78
+ ## Unverified fallback
79
+
80
+ When official docs are unreachable (network down, paywall, deleted page) AND the agent must still produce an answer, mark the citation explicitly:
81
+
82
+ ```
83
+ [source: training-data, unverified, claim: <X>, last-known-version: <Y>]
84
+ ```
85
+
86
+ This signals to the human reviewer that the answer needs verification before commit.
87
+
88
+ NEVER use `unverified` as a default. The four-step protocol (Detect → Fetch → Implement → Cite) demands fetch attempts; `unverified` is the failsafe, not the path.
89
+
90
+ ## Consolidation in reports
91
+
92
+ When a report has dozens of citations (e.g., a deep-research output), build a numbered bibliography at the end:
93
+
94
+ ```
95
+ ## Sources
96
+
97
+ [1] https://react.dev/reference/react/useEffect, version: 18.3, retrieved: 2026-05-07
98
+ [2] https://nextjs.org/docs/app/building-your-application/data-fetching/server-actions-and-mutations, version: 14.2, retrieved: 2026-05-07
99
+ [3] https://docs.python.org/3.12/library/asyncio.html, version: 3.12, retrieved: 2026-05-07
100
+ ```
101
+
102
+ In-line, refer to bibliography entries: `... uses Server Actions [2] for mutations ...`.
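One way to build such a bibliography from inline citations is sketched below. The helper name `consolidate` and the `example` URLs in the usage note are placeholders, not real sources; the sketch assumes every citation uses the bracketed `[source: ...]` form.

```python
import re
from typing import List, Tuple

INLINE_CITATION_RE = re.compile(r"\[source: ([^\]]+)\]")


def consolidate(report: str) -> Tuple[str, List[str]]:
    """Swap inline citations for [n] markers and return the numbered sources."""
    sources: List[str] = []

    def to_marker(match: "re.Match[str]") -> str:
        entry = match.group(1)
        if entry not in sources:
            sources.append(entry)  # first appearance decides the number
        return f"[{sources.index(entry) + 1}]"

    body = INLINE_CITATION_RE.sub(to_marker, report)
    bibliography = [f"[{n}] {entry}" for n, entry in enumerate(sources, start=1)]
    return body, bibliography
```

Repeated citations of the same source collapse to a single entry, so a claim cited twice points at the same number both times.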
103
+
104
+ ## Don't
105
+
106
+ - Don't cite an internal Slack thread or private wiki as if it were authoritative — the citation must point to something the reader can read.
107
+ - Don't cite a URL that requires login (paywalls); find an open-access equivalent or note the access barrier in the citation.
108
+ - Don't backfill citations after the fact ("I'll add the source later"). Cite as you decide; if you can't cite, you don't have a verified decision.