@eltonssouza/development-utility-kit 0.10.0

Files changed (131)
  1. package/.claude/agents/README.md +24 -0
  2. package/.claude/agents/analyst.md +198 -0
  3. package/.claude/agents/backend-developer.md +126 -0
  4. package/.claude/agents/brain-keeper.md +229 -0
  5. package/.claude/agents/code-reviewer.md +181 -0
  6. package/.claude/agents/database-engineer.md +94 -0
  7. package/.claude/agents/devops-engineer.md +141 -0
  8. package/.claude/agents/frontend-developer.md +97 -0
  9. package/.claude/agents/gate-keeper.md +118 -0
  10. package/.claude/agents/migrator.md +291 -0
  11. package/.claude/agents/mobile-developer.md +80 -0
  12. package/.claude/agents/n8n-specialist.md +94 -0
  13. package/.claude/agents/product-owner.md +115 -0
  14. package/.claude/agents/qa-engineer.md +232 -0
  15. package/.claude/agents/release-engineer.md +204 -0
  16. package/.claude/agents/scaffold.md +87 -0
  17. package/.claude/agents/security-engineer.md +199 -0
  18. package/.claude/agents/sprint-runner.md +46 -0
  19. package/.claude/agents/stack-resolver.md +104 -0
  20. package/.claude/agents/tech-lead.md +182 -0
  21. package/.claude/agents/update-template.md +54 -0
  22. package/.claude/agents/ux-designer.md +118 -0
  23. package/.claude/hooks/flow-guard.js +261 -0
  24. package/.claude/hooks/flow-state.js +197 -0
  25. package/.claude/local/CLAUDE.md +71 -0
  26. package/.claude/settings.json +55 -0
  27. package/.claude/skills/README.md +331 -0
  28. package/.claude/skills/active-project/SKILL.md +131 -0
  29. package/.claude/skills/api-integration-test/SKILL.md +84 -0
  30. package/.claude/skills/auto-test-guard/SKILL.md +239 -0
  31. package/.claude/skills/auto-test-guard/resources/backend-tests.md +20 -0
  32. package/.claude/skills/auto-test-guard/resources/e2e-tests.md +24 -0
  33. package/.claude/skills/auto-test-guard/resources/execution-report.md +49 -0
  34. package/.claude/skills/auto-test-guard/resources/frontend-tests.md +18 -0
  35. package/.claude/skills/auto-test-guard/resources/initial-setup.md +108 -0
  36. package/.claude/skills/auto-test-guard/resources/run-suite.md +48 -0
  37. package/.claude/skills/auto-test-guard/resources/senior-gate.md +19 -0
  38. package/.claude/skills/brain-keeper/SKILL.md +62 -0
  39. package/.claude/skills/brain-keeper/obsidian/app.json +9 -0
  40. package/.claude/skills/brain-keeper/obsidian/appearance.json +4 -0
  41. package/.claude/skills/brain-keeper/obsidian/core-plugins.json +20 -0
  42. package/.claude/skills/brain-keeper/obsidian/daily-notes.json +5 -0
  43. package/.claude/skills/brain-keeper/obsidian/graph.json +32 -0
  44. package/.claude/skills/brain-keeper/obsidian/snippets/folder-colors.css +90 -0
  45. package/.claude/skills/brain-keeper/obsidian/templates.json +5 -0
  46. package/.claude/skills/brain-keeper/templates/README.md +51 -0
  47. package/.claude/skills/brain-keeper/templates/adr.md +40 -0
  48. package/.claude/skills/brain-keeper/templates/bug.md +35 -0
  49. package/.claude/skills/brain-keeper/templates/daily.md +38 -0
  50. package/.claude/skills/brain-keeper/templates/feature.md +62 -0
  51. package/.claude/skills/brain-keeper/templates/meeting.md +34 -0
  52. package/.claude/skills/brain-keeper/templates/tech-debt.md +21 -0
  53. package/.claude/skills/caveman/SKILL.md +189 -0
  54. package/.claude/skills/create-stack-pack/SKILL.md +281 -0
  55. package/.claude/skills/grill-me/SKILL.md +80 -0
  56. package/.claude/skills/pair-debug/SKILL.md +288 -0
  57. package/.claude/skills/prd-ready-check/SKILL.md +86 -0
  58. package/.claude/skills/project-manager/SKILL.md +334 -0
  59. package/.claude/skills/quality-standards/SKILL.md +203 -0
  60. package/.claude/skills/quick-feature/SKILL.md +266 -0
  61. package/.claude/skills/run-sprint/SKILL.md +41 -0
  62. package/.claude/skills/scaffold/SKILL.md +60 -0
  63. package/.claude/skills/stack-discovery/SKILL.md +161 -0
  64. package/.claude/skills/test-coverage-auditor/SKILL.md +87 -0
  65. package/.claude/skills/to-issues/SKILL.md +163 -0
  66. package/.claude/skills/to-prd/SKILL.md +130 -0
  67. package/.claude/skills/update-template/SKILL.md +256 -0
  68. package/.claude/stacks/CODEOWNERS +30 -0
  69. package/.claude/stacks/README.md +97 -0
  70. package/.claude/stacks/_template.md +116 -0
  71. package/.claude/stacks/dotnet/aspire-9.md +528 -0
  72. package/.claude/stacks/go/gin-1.10.md +570 -0
  73. package/.claude/stacks/java/spring-boot-3.md +376 -0
  74. package/.claude/stacks/java/spring-boot-4.md +438 -0
  75. package/.claude/stacks/node/express-5.md +538 -0
  76. package/.claude/stacks/python/django-5.md +483 -0
  77. package/.claude/stacks/python/fastapi-0.115.md +522 -0
  78. package/.claude/stacks/typescript/angular-18.md +420 -0
  79. package/.claude/stacks/typescript/angular-19.md +397 -0
  80. package/.claude/stacks/typescript/angular-21.md +494 -0
  81. package/CLAUDE.md +472 -0
  82. package/README.md +412 -0
  83. package/bin/cli.js +848 -0
  84. package/bin/lib/adr.js +146 -0
  85. package/bin/lib/backup.js +62 -0
  86. package/bin/lib/detect-stack.js +476 -0
  87. package/bin/lib/doctor.js +527 -0
  88. package/bin/lib/help.js +328 -0
  89. package/bin/lib/identity.js +108 -0
  90. package/bin/lib/lint-allowlist.json +15 -0
  91. package/bin/lib/lint.js +798 -0
  92. package/bin/lib/local-dir.js +68 -0
  93. package/bin/lib/manifest.js +236 -0
  94. package/bin/lib/sync-all.js +394 -0
  95. package/bin/lib/version-check.js +398 -0
  96. package/dashboard/db.js +321 -0
  97. package/dashboard/package.json +22 -0
  98. package/dashboard/public/app.js +853 -0
  99. package/dashboard/public/content/docs/agents-reference.en.md +911 -0
  100. package/dashboard/public/content/docs/architecture-overview.en.md +252 -0
  101. package/dashboard/public/content/docs/autonomy-matrix.en.md +186 -0
  102. package/dashboard/public/content/docs/cli-reference.en.md +538 -0
  103. package/dashboard/public/content/docs/git-flow.en.md +525 -0
  104. package/dashboard/public/content/docs/honcho-memory.en.md +394 -0
  105. package/dashboard/public/content/docs/hooks-reference.en.md +404 -0
  106. package/dashboard/public/content/docs/pipeline.en.md +414 -0
  107. package/dashboard/public/content/docs/plugins.en.md +289 -0
  108. package/dashboard/public/content/docs/quality-gate.en.md +315 -0
  109. package/dashboard/public/content/docs/skills-reference.en.md +484 -0
  110. package/dashboard/public/content/docs/stack-rules.en.md +362 -0
  111. package/dashboard/public/content/docs/troubleshooting.en.md +565 -0
  112. package/dashboard/public/content/manifest.json +114 -0
  113. package/dashboard/public/content/manual/backend.en.md +1053 -0
  114. package/dashboard/public/content/manual/existing-project.en.md +848 -0
  115. package/dashboard/public/content/manual/frontend.en.md +1008 -0
  116. package/dashboard/public/content/manual/fullstack.en.md +1459 -0
  117. package/dashboard/public/content/manual/mobile.en.md +837 -0
  118. package/dashboard/public/content/manual/quickstart.en.md +169 -0
  119. package/dashboard/public/index.html +217 -0
  120. package/dashboard/public/style.css +857 -0
  121. package/dashboard/public/vendor/marked.min.js +69 -0
  122. package/dashboard/rtk.js +143 -0
  123. package/dashboard/server-app.js +421 -0
  124. package/dashboard/server.js +104 -0
  125. package/dashboard/test/sprint1.test.js +406 -0
  126. package/dashboard/test/sprint2.test.js +571 -0
  127. package/dashboard/test/sprint3.test.js +560 -0
  128. package/package.json +33 -0
  129. package/scripts/hooks/subagent-telemetry.sh +14 -0
  130. package/scripts/hooks/telemetry-writer.js +250 -0
  131. package/scripts/latest-versions.json +56 -0
@@ -0,0 +1,289 @@
1
+ # External plugins — credit, choice rationale, integration
2
+
3
+ `development-utility-kit` is **not a closed framework**. It is a curated harness that stands on the shoulders of four external pieces of Claude Code community work: **grill-me**, **caveman**, **impeccable**, and **rtk**. Each one was chosen deliberately, each one solves a specific problem, and each one is integrated in a way that preserves the original author's intent while making it useful inside this opinionated pipeline.
4
+
5
+ This page exists because users frequently ask:
6
+
7
+ - *"Why this plugin and not another?"*
8
+ - *"How does it work inside `duk`?"*
9
+ - *"What if I disable it — what breaks?"*
10
+ - *"Who do I credit?"*
11
+
12
+ The answer to all four questions is documented here, plugin by plugin. For end-to-end usage in real workflows, see [skills-reference](skills-reference) and [pipeline](pipeline). For our own internal architecture choices, see [architecture-overview](architecture-overview) and the ADR vault under `docs/brain/decisions/`.
13
+
14
+ ---
15
+
16
+ ## Selection principles
17
+
18
+ Before listing the plugins, here are the four criteria used when accepting an external piece into the harness. They are applied retroactively to all current plugins and prospectively to any future ones:
19
+
20
+ 1. **Solves a problem the harness alone cannot solve well.** No NIH ("not invented here") — if someone built a better mousetrap, we use it. But also no "shiny new tool just because" — every plugin must be load-bearing in at least one pipeline stage.
21
+ 2. **Author is reachable and license is permissive.** Open source under MIT / Apache 2 / similar. Author publishes on GitHub. We can fork if upstream stalls, and we attribute clearly.
22
+ 3. **Drop-in integration without forking the harness.** Either invoked via `npx <plugin>@<pinned-version>` (Impeccable) or wrapped in a thin adapter (RTK, in `dashboard/rtk.js`). Skills like `grill-me` and `caveman` are adapted **once** at adoption time and become first-class `.claude/skills/` of the harness.
23
+ 4. **Distinct from our internal layer.** A plugin never overlaps with an existing skill or agent's primary responsibility. If a community plugin starts duplicating an internal piece, the internal one is the source of truth and the plugin is deprecated from the harness (with a clear ADR).
24
+
25
+ A plugin that no longer satisfies one of those criteria gets removed, regardless of how widely used it is. ADR-018 documents the precedent (npx-style distribution for `duk` itself was inspired by `impeccable` — a meta-precedent of accepting external inspiration into our own architecture).
26
+
27
+ ---
28
+
29
+ ## grill-me — relentless discovery interview
30
+
31
+ **Original author:** [Matt Pocock](https://github.com/mattpocock) — [aihero.dev](https://www.aihero.dev/my-grill-me-skill-has-gone-viral)
32
+ **Upstream repo:** [mattpocock/skills](https://github.com/mattpocock/skills)
33
+ **License:** MIT (verify upstream before any redistribution)
34
+ **Where it lives in the harness:** `.claude/skills/grill-me/SKILL.md` (adapted)
35
+ **Pinned version:** internal — we vendored the skill's interview mechanic and adapted it; we do not auto-update from upstream
36
+ **Related ADR:** ADR-011, ADR-013, ADR-017, ADR-019
37
+
38
+ ### What it does
39
+
40
+ `grill-me` is a discovery interview skill: it interrogates the user one decision branch at a time, before any code is written and before any technical plan is committed. For each question, it pre-computes a recommended answer the user can simply confirm (`y/n` style), and when the answer can come from the code or git history, it reads instead of asking. The interview ends when there is enough certainty to write down requirements without guessing.
41
+
42
+ Matt's original design surfaced hidden requirements that prompt-only conversations missed. We kept the mechanic exactly as he designed it.
43
+
44
+ ### Why we picked it
45
+
46
+ Three pre-existing problems in our pipeline made grill-me an obvious fit:
47
+
48
+ - **PRD authoring used to start from a single user paragraph.** Result: under-specified scope, edge cases discovered late, sprint rework. Grill-me forces the user to confront ambiguity *before* the PRD exists.
49
+ - **Planning agents (`analyst`, `architect`) need a structured input.** Free-form prompts make them hallucinate constraints. Grill-me produces a structured `DISCOVERY_*.md` that `analyst` can consume deterministically.
50
+ - **The interview pattern was already proven viral.** Reproducing it from scratch would have taken weeks, with no clear UX improvement.
51
+
52
+ We did not modify the interview mechanic, the recommended-answer pattern, the rubber-duck framing, or the conversational tone. Those are Matt's design.
53
+
54
+ ### How it integrates
55
+
56
+ The `grill-me → to-prd → to-issues → analyst → run-sprint` chain is our pipeline's discovery → delivery spine (ADR-008). Our adaptation:
57
+
58
+ | What we kept from Matt | What we added on top |
59
+ |---|---|
60
+ | One decision branch per question | Persistence to `docs/discovery/DISCOVERY_<NAME>.md` (ADR-017) |
61
+ | Recommended answers user just confirms | Hand-off contract: `analyst` reads the discovery file and converts to `PLAN_*.md` with goal-ready DoD (ADR-013) |
62
+ | Reading code / git log when possible | Caller signal (`caller: sprint-runner\|autonomous\|grill-me`) so autonomous flows skip the human interview gate (ADR-013) |
63
+ | Brutal directness on hidden assumptions | Hard gate: `analyst` refuses to produce a PLAN without a `DISCOVERY_*.md` on the human path |
64
+
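The hard gate and caller signal in the table above can be sketched as a small decision function. This is a hypothetical illustration only: `canProducePlan`, `AUTONOMOUS_CALLERS`, and the `discoveryFiles` parameter are assumed names, not the harness's real API, and the file-name pattern is inferred from the `DISCOVERY_<NAME>.md` convention.

```javascript
// Hypothetical sketch of the discovery gate; all names are illustrative.
// Callers that represent autonomous flows skip the human interview gate.
const AUTONOMOUS_CALLERS = new Set(['sprint-runner', 'autonomous']);

// Decide whether a plan may be produced.
// `caller` is the caller signal; `discoveryFiles` lists file names found
// under docs/discovery/ (assumed to be collected by the caller).
function canProducePlan({ caller, discoveryFiles }) {
  if (AUTONOMOUS_CALLERS.has(caller)) {
    return { allowed: true, reason: 'autonomous caller skips the interview gate' };
  }
  // Human path: at least one DISCOVERY_<NAME>.md must exist.
  const hasDiscovery = discoveryFiles.some((name) => /^DISCOVERY_.+\.md$/.test(name));
  return hasDiscovery
    ? { allowed: true, reason: 'discovery file present' }
    : { allowed: false, reason: 'no DISCOVERY_*.md on the human path' };
}
```

On the human path, an empty `docs/discovery/` yields `allowed: false`, which corresponds to `analyst` refusing to write a PLAN.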
65
+ ### How to use it correctly
66
+
67
+ - **Trigger from chat**: say `"grill me"`, `"me entrevista sobre <topic>"`, `"stress-test o plano"`, or any variant listed among the triggers in the skill description. The skill will start asking.
68
+ - **Trigger from pipeline**: the discovery gate is skipped if you say `"quick-feature"` (small change) or if you are already inside `sprint-runner`. It is meant for new features, not for line edits.
69
+ - **Stop early if the answer is obvious**: grill-me will not insist on questions where the recommended answer is high-confidence. Just confirm and move on.
70
+ - **End artifact**: a `docs/discovery/DISCOVERY_<NAME>.md` file. From there: `"to-prd"` to materialise the PRD, then `"to-issues"` to break into GitHub issues, then `"executa Sprint 1"` to start coding.
71
+
72
+ ### What breaks if you disable it
73
+
74
+ - `analyst` refuses to produce a `PLAN_*.md` on the human path (it requires a `DISCOVERY_*.md` per ADR-013). Workaround: produce the discovery manually and place it in `docs/discovery/` with the expected format.
75
+ - `to-prd` has nothing to read. Workaround: write the PRD yourself.
76
+ - Small features are unaffected (`quick-feature` bypasses discovery by design).
77
+
78
+ ### Alternatives considered
79
+
80
+ - **Just prompt engineering** — works for trivial features but does not surface decision branches the user did not think to mention.
81
+ - **Free-form interview by `analyst`** — analyst is biased toward "let's write the plan now" and tends to skip exploration. Grill-me's role separation (one tool that *only* explores) is the design choice.
82
+ - **External requirements management (Jira, Linear)** — orthogonal; we still use these for storing the output (`to-issues` creates the tickets).
83
+
84
+ ---
85
+
86
+ ## caveman — telegraphic style as default
87
+
88
+ **Original author:** [Julius Brussee](https://github.com/JuliusBrussee)
89
+ **Upstream repo:** [JuliusBrussee/caveman](https://github.com/JuliusBrussee/caveman)
90
+ **License:** MIT (verify upstream before any redistribution)
91
+ **Where it lives in the harness:** `.claude/skills/caveman/SKILL.md` (adapted)
92
+ **Pinned version:** internal — same as grill-me, vendored and adapted
93
+ **Related ADR:** none directly; ADR-018 references the `npx skills add` pattern that inspired our installer
94
+
95
+ ### What it does
96
+
97
+ `caveman` compresses Claude's output by removing articles, fillers, redundant connectors, and stylistic flourishes — without removing technical substance. The compression levels (LITE, FULL, ULTRA) give the user control over how aggressive the cut is:
98
+
99
+ - **LITE** — light cut, preserves sentence structure. Good for prose documents (`.md`).
100
+ - **FULL** — standard cut, removes most fillers. Good for chat answers.
101
+ - **ULTRA** — aggressive cut, telegraphic. Good for code, files, terminal output.
102
+
103
+ Julius's original published results: ~65–75% token savings in typical Claude Code usage with zero loss of technical correctness.
104
+
105
+ ### Why we picked it
106
+
107
+ Three reasons converged:
108
+
109
+ - **Cost.** Opus inference is expensive. A 65–75% reduction in output tokens compounds across every session, every PR review, every plan. Real money over a year.
110
+ - **Signal-to-noise ratio.** Long replies bury the actual answer in fluff. Senior developers using Claude Code as a thought-partner want answers, not essays. Caveman enforces that culture.
111
+ - **No information loss.** This is the key. We tested it on stack traces, ADRs, code diffs, security audits — caveman preserves every technical token (paths, exact commands, schema, error codes). It removes "essentially", "basically", "as we discussed earlier", "it is worth noting that", which carry zero information.
112
+
113
+ ### How it integrates
114
+
115
+ Caveman is the **default style** in the harness (ULTRA for `.java`, `.ts`, `.py`, `.sh`, `.yml`, `.json`; FULL for `.md`, LITE for prose responses). It is `always_on: true` in our adaptation.
116
+
117
+ | What we kept from Julius | What we added on top |
118
+ |---|---|
119
+ | LITE / FULL / ULTRA levels with the same compression rules | Default-on policy in `.claude/skills/caveman/SKILL.md` |
120
+ | Auto-clarity carve-outs (when the answer is genuinely complex, caveman steps back) | Disable command `"stop caveman"` / `"normal mode"` for moments the user needs full prose |
121
+ | Preservation of YAML frontmatter, symbol names, paths, exact commands | Plugin-level interaction with `code-reviewer` (review feedback uses FULL by default for legibility) |
122
+
123
+ ### How to use it correctly
124
+
125
+ - **Already on, by default.** You will see it in every reply. Nothing to enable.
126
+ - **Switch level inline**: `"caveman lite"`, `"caveman full"`, `"caveman ultra"`.
127
+ - **Disable for a single reply**: `"stop caveman"` or `"back to normal"`. The next prompt re-enables it.
128
+ - **Never disable system-wide.** The default is the policy; disabling produces verbose output that costs more and conveys less. If you find caveman truncating something important, it is a bug in the compression — open an issue, do not just disable it.
129
+
130
+ ### What breaks if you disable it
131
+
132
+ - Token cost rises 3–4× on typical tasks. Real number, measured on Java/Angular sessions.
133
+ - Replies become harder to scan in the chat UI.
134
+ - No functional regression — purely a cost/UX hit.
135
+
136
+ ### Alternatives considered
137
+
138
+ - **Just asking for "concise" mode in the prompt** — works for one turn, decays over multi-turn conversations. Caveman is a skill, not a hint.
139
+ - **Post-processing replies with a regex** — does not understand context (truncates valid technical content). Caveman is LLM-aware compression.
140
+ - **No compression** — cost compounds. Not viable for production-grade harness use.
141
+
142
+ ---
143
+
144
+ ## impeccable — design refinement gate
145
+
146
+ **Original author:** [Paul Bakaus](https://github.com/pbakaus)
147
+ **Upstream repo:** [pbakaus/impeccable](https://github.com/pbakaus/impeccable)
148
+ **License:** MIT (verify upstream before any redistribution)
149
+ **Where it lives in the harness:** invoked at runtime via `npx impeccable@2.1.9` from `scripts/impeccable-gate.mjs` — **NOT vendored**
150
+ **Pinned version:** `impeccable@2.1.9` (bump explicit only, via ADR — see `update-template/SKILL.md` §"Impeccable version pin")
151
+ **Related ADR:** ADR-010 (Impeccable design gate), ADR-018 (npx distribution pattern)
152
+
153
+ ### What it does
154
+
155
+ `impeccable` audits frontend code for visual design anti-patterns: inconsistent spacing, missing focus states, color contrast issues, missing aria-labels, unprincipled font scale jumps, etc. It runs in three modes:
156
+
157
+ - **polish** — generate refactor suggestions inline
158
+ - **harden** — make a passing build stricter (turn warnings into errors)
159
+ - **audit** — full report, blocking findings flagged by severity
160
+
161
+ We use **audit** in the senior+ gate (ADR-010) at WARN level first, BLOCK level once the catalog is clean. The gate runs only on changed files via `--changed-only`.
162
+
163
+ ### Why we picked it
164
+
165
+ - **Filling a gap not covered by our existing gate.** Coverage, mutation, ESLint, Lighthouse, Playwright — all good, none catches "the button looks wrong but the test still passes". Impeccable does.
166
+ - **Author's npx distribution pattern was elegant.** `npx skills add pbakaus/impeccable` directly inspired our own `npx @eltonssouza/development-utility-kit install` installer (ADR-018). When you copy someone else's architecture decision, you also credit them.
167
+ - **Visual quality is a moving target.** The catalog of anti-patterns evolves. Impeccable is maintained by an active author; we get catalog updates for free by bumping the pin.
168
+
169
+ ### How it integrates
170
+
171
+ | Layer | Integration point |
172
+ |---|---|
173
+ | Senior+ gate | `gate-keeper` agent runs `node scripts/impeccable-gate.mjs src --mode=warn` (or `--mode=block` once the catalog is clean) |
174
+ | Frontend developer guidance | `.claude/agents/frontend-developer.md` instructs to use `/impeccable polish\|harden\|audit` inline before passing to `gate-keeper` |
175
+ | Update policy | `.claude/skills/update-template/SKILL.md` §"Impeccable version pin" — bump explicit, via ADR; never silently |
176
+ | Designer override | Decisions on visual identity still go to `ux-designer` and `product-owner`. Impeccable does not override design intent; it audits whether the intent is implemented consistently. |
177
+
178
+ ### How to use it correctly
179
+
180
+ - **As a developer**: `/impeccable polish` inline while building a component; iterate suggestions.
181
+ - **As a reviewer**: `/impeccable audit` before opening a PR; PR description references the report.
182
+ - **In CI**: `gate-keeper` runs `--mode=warn` today; once `frontend-developer` adopts the catalog, it switches to `--mode=block` via ADR-035 (pending).
183
+ - **As a designer**: do **not** treat impeccable findings as design decisions. Decisions on visual identity are in design system docs, owned by `ux-designer` / `product-owner`. Impeccable enforces consistency, not creativity.
184
+
185
+ ### What breaks if you disable it
186
+
187
+ - Visual regressions slip past the senior+ gate. Other quality gates (a11y, Lighthouse) catch *some* of them but not all.
188
+ - `frontend-developer` loses an in-loop tool for self-correction.
189
+ - The senior+ gate stays green more easily — which is a regression, not a win.
190
+
191
+ ### Alternatives considered
192
+
193
+ - **ESLint plugins for accessibility / style** — catch a subset (a11y mostly). We still use them. Impeccable layers on top.
194
+ - **Storybook + visual regression (Chromatic, Percy)** — orthogonal; great for catching diffs but not for *naming* the anti-pattern. Impeccable explains why; visual regression only shows what changed.
195
+ - **Designers manually reviewing every PR** — does not scale; introduces a human gate where mechanical was possible (per ADR-034: mechanical → CLI).
196
+
197
+ ---
198
+
199
+ ## rtk — Rust Token Killer (CLI proxy)
200
+
201
+ **Original author:** [rtk-ai](https://github.com/rtk-ai) team
202
+ **Upstream repo:** [rtk-ai/rtk](https://github.com/rtk-ai/rtk)
203
+ **License:** verify upstream before any redistribution
204
+ **Where it lives in the harness:** invoked via `rtk` binary (must be on PATH); wrapper in `dashboard/rtk.js`; consumed by `duk dashboard`
205
+ **Pinned version:** none — uses whatever `rtk` the user has installed; `rtk gain --format json` is the only call we make
206
+ **Related ADR:** ADR-014 (Vanilla JS + Chart.js dashboard); the RTK widget itself was added later, no dedicated ADR
207
+
208
+ ### What it does
209
+
210
+ RTK is a CLI proxy for LLM calls. It sits between Claude Code (or any LLM client) and the Anthropic API, cutting redundant tokens via local deduplication and other techniques. Operationally: you install `rtk`, you point your client at the proxy, and you get 60–90% savings on dev operations.
211
+
212
+ It is independent of `caveman` (which compresses Claude's output before display). RTK compresses tokens at the API boundary. Both can run simultaneously and stack.
213
+
214
+ ### Why we picked it
215
+
216
+ - **Dev-operation cost is a real budget line.** Caveman cuts output tokens; RTK cuts input tokens and redundant calls. Together they make Claude-Code-driven development financially sustainable at team scale.
217
+ - **Already proven in the Claude Code community.** We did not build a wrapper; we built a *widget* that surfaces RTK's own metrics in the local dashboard, so the team sees the savings in real time.
218
+ - **Zero coupling.** If RTK is not on PATH, the dashboard widget just shows `null` and life goes on. No hard dependency.
219
+
220
+ ### How it integrates
221
+
222
+ | Layer | Integration point |
223
+ |---|---|
224
+ | CLI bootstrap | `bin/cli.js` runs `rtk gain` before booting the dashboard server; output prints once in the terminal as a status banner |
225
+ | Dashboard widget | `dashboard/public/index.html` has an RTK card; `/api/rtk` endpoint calls `rtk.js` which spawns `rtk gain --format json` |
226
+ | Daily metric | Same endpoint returns `getRtkDaily()` for the time-series chart |
227
+
228
+ Safety design (`dashboard/rtk.js`):
229
+
230
+ - 5-second hard timeout — if `rtk` hangs, the widget shows null instead of locking the dashboard.
231
+ - Spawn with `stdio: 'ignore'` for stderr — RTK's status messages don't pollute the dashboard logs.
232
+ - If `rtk` is not installed or fails, the endpoint returns `null`; the UI gracefully degrades to "no data".
233
+
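A minimal sketch of the three safety properties listed above, assuming a Node wrapper around `child_process.spawn`. It is an illustration, not the contents of `dashboard/rtk.js`: the function name `getRtkGain` and its parameters are hypothetical. Only the `rtk gain --format json` invocation, the 5-second timeout, the ignored stderr, and the `null` fallback come from the text.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper: resolves to the parsed JSON from `rtk gain --format json`,
// or to null when rtk is missing, fails, times out, or prints invalid JSON.
function getRtkGain(command = 'rtk', timeoutMs = 5000) {
  return new Promise((resolve) => {
    let stdout = '';
    let settled = false;
    const finish = (value) => {
      if (!settled) {
        settled = true;
        resolve(value);
      }
    };

    const child = spawn(command, ['gain', '--format', 'json'], {
      // stdin ignored, stdout captured, stderr ignored so status messages
      // never reach the dashboard logs.
      stdio: ['ignore', 'pipe', 'ignore'],
      // Hard timeout: Node kills the child once this elapses.
      timeout: timeoutMs,
    });

    child.stdout?.setEncoding('utf8');
    child.stdout?.on('data', (chunk) => {
      stdout += chunk;
    });

    // Emitted when the binary cannot be started (for example, not on PATH).
    child.on('error', () => finish(null));

    child.on('close', (code) => {
      // A killed child reports a null exit code, so a timeout also lands here.
      if (code !== 0) return finish(null);
      try {
        finish(JSON.parse(stdout));
      } catch {
        finish(null);
      }
    });
  });
}
```

Because every failure path resolves to `null` instead of rejecting, an endpoint built on a helper like this can return the value directly and let the UI render its empty state.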
234
+ ### How to use it correctly
235
+
236
+ - **Install rtk separately** following the upstream README. We do not bundle it.
237
+ - **Configure rtk-ai once per machine** (typically a token + endpoint config). After that, `rtk gain` works system-wide.
238
+ - **Look at the dashboard widget** to see daily savings. The number is *yours*, not the harness's — RTK measures actual usage on your machine.
239
+ - **Do not try to disable rtk from inside the harness.** It is an optional dependency by design — if you do not want it, do not install it; the widget will not appear and nothing else breaks.
240
+
241
+ ### What breaks if you disable it
242
+
243
+ - The RTK widget on `duk dashboard` shows no data (just the chart container with empty state).
244
+ - `bin/cli.js` line 600 prints "Note: rtk not found or returned no output — skipping gain report." in the terminal once at startup. Harmless.
245
+ - No functional regression to any pipeline stage. Pure observability feature.
246
+
247
+ ### Alternatives considered
248
+
249
+ - **Helicone, Langfuse, OpenLLMetry** — full observability platforms. Heavier (require a backend), give you more (request-level tracing). We chose RTK because the harness's value proposition is local + zero-config — RTK respects that, the others do not.
250
+ - **Anthropic's own usage reports** — only shows total spend, not redundancy savings. Different question.
251
+ - **No widget at all** — the team would not know the savings exist. Visibility drives adoption of the cost-cutting practices.
252
+
253
+ ---
254
+
255
+ ## Roadmap — future plugins
256
+
257
+ Plugins we are watching but have not adopted yet. Each entry includes the criterion blocking adoption (per §"Selection principles") so the decision is auditable.
258
+
259
+ | Plugin | What it does | Blocking criterion |
260
+ |---|---|---|
261
+ | **anthropic-claude-skills** (multiple authors) | Catalog of community skills | Need a curation step — adopting wholesale dilutes the harness's design; cherry-pick after evaluating each against ADR-025 |
262
+ | **claude-flow** (rUv-FANN) | Multi-agent swarm orchestration | Overlaps with `project-manager` orchestrate mode (ADR-033); revisit if our orchestrate hits a hard cap |
263
+ | **awesome-claude-code skills** | Various productivity skills | Per-skill evaluation needed; no blanket integration policy |
264
+
265
+ If you have a plugin candidate, the path is: open an issue on the harness repo, document which problem it solves that the harness currently solves badly or not at all, and which criterion of §"Selection principles" it satisfies. The `tech-lead` agent (Opus) makes the final call, recorded as an ADR.
266
+
267
+ ---
268
+
269
+ ## Credits and attribution
270
+
271
+ `development-utility-kit` is a meta-skill harness — it ships its own skills and agents but stands on the four plugins above. The integration is ours; the credit is theirs.
272
+
273
+ If you build on top of this harness:
274
+
275
+ - **Star the upstream repos**: [grill-me](https://github.com/mattpocock/skills) · [caveman](https://github.com/JuliusBrussee/caveman) · [impeccable](https://github.com/pbakaus/impeccable) · [rtk](https://github.com/rtk-ai/rtk)
276
+ - **Cite the original authors** in any redistribution or derived work.
277
+ - **Open issues upstream** when you find bugs in the plugins' core behavior — not on our repo. We only own the integration shim.
278
+
279
+ Any mistake in the way we describe or integrate these plugins' work is ours, not theirs.
280
+
281
+ ---
282
+
283
+ ## See also
284
+
285
+ - [skills-reference](skills-reference) — how `grill-me` and `caveman` behave as harness skills (operational details)
286
+ - [pipeline](pipeline) — where `grill-me` and impeccable plug into the discovery → delivery flow
287
+ - [quality-gate](quality-gate) — where impeccable runs in the senior+ gate
288
+ - [architecture-overview](architecture-overview) — the 3-layer model that hosts these plugins
289
+ - ADR-010 (impeccable gate), ADR-011 (grill-me as opt-in), ADR-014 (dashboard), ADR-018 (npx distribution pattern), ADR-034 (mechanical → CLI principle)
@@ -0,0 +1,315 @@
1
+ # Quality Gate Senior+
2
+
3
+ The senior+ gate is **non-negotiable**. A task is complete only after `auto-test-guard` reports GREEN on **all** items, with no exceptions. `tech-lead` blocks the merge if any item fails.
4
+
5
+ This policy exists because the harness was built to deliver production code, not a fragile MVP. Coverage below threshold, a high-severity vulnerability, broken a11y, or a poor Lighthouse score does not pass, regardless of deadline pressure. When the gate blocks, the correct path is to **fix the code**, not lower the threshold.
6
+
7
+ ## Full threshold table
8
+
9
+ | Metric | Threshold | Tool |
10
+ |---|---|---|
11
+ | Backend coverage (lines) | >= **85%** | JaCoCo |
12
+ | Backend coverage (branches) | >= **80%** | JaCoCo |
13
+ | **Backend mutation score** | >= **70%** in `domain/` + `application/` | **PIT (Pitest)** |
14
+ | Frontend coverage (statements) | >= **85%** | Jest --coverage |
15
+ | Frontend coverage (branches) | >= **80%** | Jest --coverage |
16
+ | SpotBugs | 0 CRITICAL, 0 HIGH | SpotBugs Maven plugin |
17
+ | SonarLint/SonarQube (if configured) | 0 CRITICAL, 0 HIGH, 0 unreviewed hotspot | Sonar |
18
+ | OWASP dependency-check | 0 CVE with CVSS >= 7.0 | OWASP DC plugin |
19
+ | npm audit | 0 HIGH, 0 CRITICAL | `npm audit --audit-level=high` |
20
+ | ESLint frontend | 0 errors, 0 warnings on new code | `eslint --max-warnings 0` |
21
+ | Playwright E2E | 100% green, critical flows covered | Playwright |
22
+ | Browser console in E2E | 0 errors | Chrome MCP |
23
+ | **A11y violations (component)** | 0 `serious` / 0 `critical` | jest-axe |
24
+ | **A11y violations (E2E)** | 0 `serious` / 0 `critical` | @axe-core/playwright |
25
+ | **Performance score (Lighthouse)** | >= **0.80** (median of 3 runs) | @lhci/cli |
26
+ | **LCP (Largest Contentful Paint)** | <= **2500ms** | @lhci/cli |
27
+ | **CLS (Cumulative Layout Shift)** | <= **0.1** | @lhci/cli |
28
+ | **TBT (Total Blocking Time)** | <= **300ms** | @lhci/cli |
29
+ | **Testing pyramid ratio (E2E)** | <= **30%** of total (hard-fail above; ideal <= 15%) | `auto-test-guard` count |
30
+
31
+ ## How to run each validation
32
+
33
+ The commands below are run automatically by `gate-keeper`; they are documented here so devs can reproduce locally before the gate.
34
+
35
+ ### Backend (Java + Spring Boot)
36
+
37
+ ```bash
38
+ # Unit tests + JaCoCo report and threshold check (both bound to the verify phase)
39
+ ./mvnw verify
40
+
41
+ # Mutation testing (PIT) — enable pitest profile
42
+ ./mvnw verify -Ppitest
43
+
44
+ # SpotBugs static analysis
45
+ ./mvnw spotbugs:check
46
+
47
+ # OWASP dependency audit
48
+ ./mvnw dependency-check:check
49
+
50
+ # Sonar (if configured)
51
+ ./mvnw verify sonar:sonar -Dsonar.host.url=<url>
52
+ ```
53
+
54
+ ### Frontend (Angular)
55
+
56
+ ```bash
57
+ # Jest with coverage
58
+ npm test -- --coverage
59
+
60
+ # Dependency audit
61
+ npm audit --audit-level=high
62
+
63
+ # ESLint with no warnings
64
+ npx eslint src --max-warnings 0
65
+
66
+ # Playwright E2E
67
+ npx playwright test
68
+
69
+ # Lighthouse CI
70
+ npx lhci autorun
71
+ ```
72
+
73
+ ### Test pyramid
74
+
75
+ ```bash
76
+ # Count done by gate-keeper
77
+ # total_e2e / (total_unit + total_integration + total_e2e) <= 0.30
78
+ ```
79
+
80
+ ## Minimal configuration
81
+
82
+ Projects generated by the harness come preconfigured. For legacy projects adopted via `update-template`, check the following.
83
+
84
+ ### pom.xml — JaCoCo + PIT + SpotBugs + OWASP
85
+
86
+ ```xml
87
+ <!-- JaCoCo -->
88
+ <plugin>
89
+ <groupId>org.jacoco</groupId>
90
+ <artifactId>jacoco-maven-plugin</artifactId>
91
+ <executions>
92
+ <execution>
93
+ <goals><goal>prepare-agent</goal></goals>
94
+ </execution>
95
+ <execution>
96
+ <id>report</id>
97
+ <phase>verify</phase>
98
+ <goals><goal>report</goal></goals>
99
+ </execution>
100
+ <execution>
101
+ <id>jacoco-check</id>
102
+ <goals><goal>check</goal></goals>
103
+ <configuration>
104
+ <rules>
105
+ <rule>
106
+ <element>BUNDLE</element>
107
+ <limits>
108
+ <limit><counter>LINE</counter><value>COVEREDRATIO</value><minimum>0.85</minimum></limit>
109
+ <limit><counter>BRANCH</counter><value>COVEREDRATIO</value><minimum>0.80</minimum></limit>
110
+ </limits>
111
+ </rule>
112
+ </rules>
113
+ </configuration>
114
+ </execution>
115
+ </executions>
116
+ </plugin>
117
+
118
+ <!-- PIT (mutation testing) -->
119
+ <profile>
120
+ <id>pitest</id>
121
+ <build>
122
+ <plugins>
123
+ <plugin>
124
+ <groupId>org.pitest</groupId>
125
+ <artifactId>pitest-maven</artifactId>
126
+ <configuration>
127
+ <targetClasses>
128
+ <param>com.company.project.domain.*</param>
129
+ <param>com.company.project.application.*</param>
130
+ </targetClasses>
131
+ <mutationThreshold>70</mutationThreshold>
132
+ <outputFormats><param>HTML</param><param>XML</param></outputFormats>
133
+ </configuration>
134
+ <executions>
135
+ <execution>
136
+ <goals><goal>mutationCoverage</goal></goals>
137
+ </execution>
138
+ </executions>
139
+ </plugin>
140
+ </plugins>
141
+ </build>
142
+ </profile>
143
+
144
+ <!-- SpotBugs -->
145
+ <plugin>
146
+ <groupId>com.github.spotbugs</groupId>
147
+ <artifactId>spotbugs-maven-plugin</artifactId>
148
+ <configuration>
149
+ <effort>Max</effort>
150
+ <threshold>High</threshold>
151
+ <failOnError>true</failOnError>
152
+ </configuration>
153
+ </plugin>
154
+
155
+ <!-- OWASP dependency-check -->
156
+ <plugin>
157
+ <groupId>org.owasp</groupId>
158
+ <artifactId>dependency-check-maven</artifactId>
159
+ <configuration>
160
+ <failBuildOnCVSS>7</failBuildOnCVSS>
161
+ </configuration>
162
+ </plugin>
163
+ ```
164
+
165
+ ### Jest config (jest.config.ts)
166
+
167
+ ```typescript
168
+ export default {
169
+ coverageThreshold: {
170
+ global: {
171
+ statements: 85,
172
+ branches: 80,
173
+ functions: 85,
174
+ lines: 85,
175
+ },
176
+ },
177
+ collectCoverageFrom: [
178
+ 'src/**/*.ts',
179
+ '!src/**/*.spec.ts',
180
+ '!src/main.ts',
181
+ ],
182
+ };
183
+ ```
184
+
185
+ ### Playwright + axe-core (playwright.config.ts)
186
+
187
+ ```typescript
188
+ import { defineConfig } from '@playwright/test';
189
+
190
+ export default defineConfig({
191
+ reporter: [['html'], ['list']],
192
+ use: {
193
+ baseURL: 'http://localhost:4200',
194
+ trace: 'on-first-retry',
195
+ },
196
+ projects: [
197
+ { name: 'chromium', use: { browserName: 'chromium' } },
198
+ ],
199
+ });
200
+ ```
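The a11y rows of the threshold table block only on `serious` and `critical` violations. A minimal sketch of that filter follows, assuming the violations come from an axe-core scan (for example the `violations` array returned by `new AxeBuilder({ page }).analyze()` from `@axe-core/playwright`). The `AxeViolation` interface and `blockingViolations` helper are hypothetical names for illustration, not part of the harness.

```typescript
// Minimal subset of an axe-core violation; real results carry more fields.
interface AxeViolation {
  id: string;
  impact?: 'minor' | 'moderate' | 'serious' | 'critical' | null;
}

// Impacts that fail the gate according to the threshold table.
const BLOCKING_IMPACTS = new Set<string>(['serious', 'critical']);

// Returns only the violations that should block the gate.
function blockingViolations(violations: AxeViolation[]): AxeViolation[] {
  return violations.filter(
    (v) => v.impact != null && BLOCKING_IMPACTS.has(v.impact),
  );
}
```

In a Playwright spec, asserting that `blockingViolations(results.violations)` is empty would let `minor` and `moderate` findings through while still failing on the two blocking levels.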
201
+
202
+ ### Lighthouse CI (lighthouserc.json)
203
+
204
+ ```json
205
+ {
206
+ "ci": {
207
+ "collect": {
208
+ "url": ["http://localhost:4200"],
209
+ "numberOfRuns": 3
210
+ },
211
+ "assert": {
212
+ "assertions": {
213
+ "categories:performance": ["error", { "minScore": 0.80, "aggregationMethod": "median" }],
214
+ "largest-contentful-paint": ["error", { "maxNumericValue": 2500 }],
215
+ "cumulative-layout-shift": ["error", { "maxNumericValue": 0.1 }],
216
+ "total-blocking-time": ["error", { "maxNumericValue": 300 }]
217
+ }
218
+ }
219
+ }
220
+ }
221
+ ```
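The threshold table states the performance score as the median of 3 runs. The arithmetic can be sketched as below; `median` and `performancePasses` are hypothetical helpers for illustration only, and Lighthouse CI performs its own aggregation when it evaluates assertions.

```typescript
// Median of a non-empty list of numbers.
function median(values: number[]): number {
  if (values.length === 0) throw new Error('median of empty list');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// True when the median score meets the 0.80 threshold from the table.
function performancePasses(scores: number[], minScore = 0.8): boolean {
  return median(scores) >= minScore;
}
```

With runs scoring 0.78, 0.85, and 0.81, the median is 0.81, so one slow run does not fail the gate on its own.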
222
+
223
+ ## Test pyramid
224
+
225
+ The pyramid is a structural rule, not just a metric. Hard-fail if E2E exceeds 30% of total.
226
+
227
+ ```
228
+ /\
229
+ / \
230
+ /E2E \ <= 30% of total (ideal <= 15%)
231
+ /------\
232
+ / \
233
+ /Integration\ ~20-30%
234
+ /------------\
235
+ / \
236
+ / Unit \ ~50-70%
237
+ /------------------\
238
+ ```
239
+
240
+ Why this ratio?
241
+ - **Unit**: fast, isolated, great coverage of domain rules.
242
+ - **Integration**: validates boundaries (HTTP, DB, queue). Slower than unit, cheaper than E2E.
243
+ - **E2E**: expensive, fragile, validates user flow end-to-end. Use sparingly.
244
+
245
+ When E2E exceeds 30%, it usually means domain rules are being tested through UI clicks, which is an anti-pattern. The fix is to move the logic into a domain service and cover it with unit tests.
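The ratio rule can be sketched in a few lines. This is an illustration under stated assumptions, not the `gate-keeper` implementation: the `TestCounts` shape and both function names are hypothetical, and only the 30% hard limit and 15% ideal come from the threshold table.

```typescript
// Hypothetical shape: number of tests per pyramid layer.
interface TestCounts {
  unit: number;
  integration: number;
  e2e: number;
}

type PyramidVerdict = 'ideal' | 'pass' | 'fail';

// Limits from the threshold table: hard-fail above 30%, ideal at or below 15%.
const E2E_HARD_LIMIT = 0.3;
const E2E_IDEAL_LIMIT = 0.15;

// total_e2e / (total_unit + total_integration + total_e2e)
function e2eRatio(counts: TestCounts): number {
  const total = counts.unit + counts.integration + counts.e2e;
  return total === 0 ? 0 : counts.e2e / total;
}

function pyramidVerdict(counts: TestCounts): PyramidVerdict {
  const ratio = e2eRatio(counts);
  if (ratio > E2E_HARD_LIMIT) return 'fail';
  return ratio <= E2E_IDEAL_LIMIT ? 'ideal' : 'pass';
}
```

A suite with 50 unit, 20 integration, and 30 E2E tests sits exactly at the limit and still passes; adding more E2E tests without adding unit or integration tests tips it into a hard fail.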
246
+
247
+ ## Blocking flow
248
+
249
+ ```
250
+ sprint-runner finishes task
251
+
252
+
253
+ gate-keeper runs all checks
254
+
255
+ ├─ all GREEN ──────────────► code-reviewer ──► tech-lead ──► merge
256
+
257
+ └─ any failure ─────────────► block + return to responsible task
258
+
259
+
260
+ developer fixes
261
+
262
+
263
+ gate-keeper runs again
264
+ ```
265
+
266
+ At the end of the PLAN, before the final merge, `prd-ready-check` re-runs the full gate and adds these checks:
267
+ - Zero `@Disabled` / `.skip()` in tests
268
+ - Zero `// TODO` in production code
269
+ - Zero hardcoded secrets (regex against common patterns + ggshield if available)
270
+ - ADRs referenced by the PLAN have Status: Accepted
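The first two checks are plain text scans. A rough sketch of how such a scan might look is below; the `SourceFile` and `Finding` shapes, the rule names, and the regular expressions are assumptions for illustration, not the patterns `prd-ready-check` actually uses. Secret detection and ADR status checks are left out.

```typescript
// Hypothetical input: a file path, its text, and whether it is test or production code.
interface SourceFile {
  path: string;
  content: string;
  kind: 'test' | 'production';
}

interface Finding {
  path: string;
  line: number; // 1-based
  rule: string;
}

// Illustrative patterns only; a real scan would need to be stricter.
const TEST_RULES = [{ rule: 'disabled-test', pattern: /@Disabled\b|\.skip\(/ }];
const PRODUCTION_RULES = [{ rule: 'todo-comment', pattern: /\/\/\s*TODO\b/ }];

function scan(files: SourceFile[]): Finding[] {
  const findings: Finding[] = [];
  for (const file of files) {
    const rules = file.kind === 'test' ? TEST_RULES : PRODUCTION_RULES;
    file.content.split('\n').forEach((text, index) => {
      for (const { rule, pattern } of rules) {
        if (pattern.test(text)) {
          findings.push({ path: file.path, line: index + 1, rule });
        }
      }
    });
  }
  return findings;
}
```

An empty result from `scan` corresponds to the "zero" requirement in the list; any finding would block the final merge.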
271
+
272
+ ## How to request an exception
273
+
274
+ **You do not request one.** There is no exception to the senior+ gate.
275
+
276
+ What exists is the path of **documented technical debt**:
277
+
278
+ 1. If a threshold genuinely cannot be reached this sprint (e.g. external legacy lib with no way to mock), `tech-lead` creates an entry in `docs/brain/tech-debt.md` with:
279
+ - Threshold that was missed
280
+ - Technical reason
281
+ - Recovery plan (sprint, owner)
282
+ - Risk explicitly accepted
283
+ 2. An ADR is created documenting the temporary exception.
284
+ 3. Even so, the gate **runs** — the exception shows in the report as "P0 deferred".
285
+
286
+ Without an ADR + tech-debt entry, no exception. `gate-keeper` has no key to open the door.
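The required contents of a tech-debt entry can be modeled as a small record with a completeness check. This is a sketch only: the `TechDebtEntry` field names and the `missingFields` helper are hypothetical, and the source describes the entry as prose in `docs/brain/tech-debt.md`, not as a typed structure.

```typescript
// Hypothetical model of one tech-debt entry, mirroring the four required items plus the ADR.
interface TechDebtEntry {
  threshold: string;      // threshold that was missed
  reason: string;         // technical reason
  recoveryPlan: { sprint: string; owner: string };
  acceptedRisk: string;   // risk explicitly accepted
  adr?: string;           // identifier of the ADR documenting the temporary exception
}

// Lists the required items that are absent or blank.
function missingFields(entry: TechDebtEntry): string[] {
  const missing: string[] = [];
  if (!entry.threshold.trim()) missing.push('threshold');
  if (!entry.reason.trim()) missing.push('reason');
  if (!entry.recoveryPlan.sprint.trim()) missing.push('recoveryPlan.sprint');
  if (!entry.recoveryPlan.owner.trim()) missing.push('recoveryPlan.owner');
  if (!entry.acceptedRisk.trim()) missing.push('acceptedRisk');
  if (!entry.adr || !entry.adr.trim()) missing.push('adr');
  return missing;
}

// An item may be reported as "P0 deferred" only when nothing is missing.
function isDeferrable(entry: TechDebtEntry): boolean {
  return missingFields(entry).length === 0;
}
```

Keeping the ADR reference in the same record makes the "ADR + tech-debt entry" pairing a single check rather than two lookups.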
287
+
288
+ ## ADR references
289
+
290
+ - **ADR-007**: a11y + Lighthouse + test pyramid thresholds.
291
+ - **ADR-008**: standard senior+ flow (V5.18.0+).
292
+
293
+ Both live in the harness `docs/decisions/` and are copied to new projects via scaffold.
294
+
295
+ ## Standard senior+ flow
296
+
297
+ Summary of what `gate-keeper` enforces, in order:
298
+
299
+ 1. `product-owner` decides requirements.
300
+ 2. `analyst` produces PLAN_*.md with goal-ready DoD.
301
+ 3. `architect` proposes ADR when there is a macro decision; `tech-lead` approves.
302
+ 4. `sprint-runner` executes Sprint N delegating to `backend-developer` + `frontend-developer` + `database-engineer` in parallel; `qa-engineer` writes tests.
303
+ 5. `gate-keeper` generates missing tests + runs the full senior+ gate (coverage, mutation, a11y, Lighthouse, pyramid).
304
+ 6. `code-reviewer` does initial review.
305
+ 7. `tech-lead` does final review → approves merge OR returns.
306
+
307
+ **The human is never interrupted**, except in the 4 PO cases or 3 TL cases (see [Autonomy matrix](autonomy-matrix)).
308
+
309
+ ## Cross-references
310
+
311
+ - [Pipeline](pipeline) — where the gate runs in the end-to-end flow
312
+ - [Agents reference](agents-reference) — role of `gate-keeper`, `qa-engineer`, `tech-lead`
313
+ - [Stack rules](stack-rules) — test conventions per stack
314
+ - [Architecture overview](architecture-overview) — macro model
315
+ - [Autonomy matrix](autonomy-matrix) — authority of `gate-keeper` to block without human