@ngockhoale/ukit 1.5.2 → 1.5.5

Files changed (33)
  1. package/CHANGELOG.md +57 -5
  2. package/README.md +2 -2
  3. package/manifests/platform.full.yaml +2 -2
  4. package/package.json +1 -1
  5. package/src/cli/commands/doctor.js +14 -2
  6. package/src/cli/commands/install.js +2 -2
  7. package/src/cli/commands/uninstall.js +1 -1
  8. package/src/index/taskRouting.js +117 -1
  9. package/templates/.claude/agents/bug-debugger.md +48 -19
  10. package/templates/.claude/agents/code-reviewer.md +86 -0
  11. package/templates/.claude/agents/feature-implementer.md +59 -18
  12. package/templates/.claude/hooks/skill-router.sh +1 -1
  13. package/templates/.claude/hooks/verification-guard.sh +1 -1
  14. package/templates/.claude/skills/next-step/SKILL.md +1 -1
  15. package/templates/.claude/ukit/index/post-edit-verify.mjs +3 -2
  16. package/templates/.claude/ukit/index/route-task.mjs +8 -4
  17. package/templates/.claude/ukit/runtime/output-compression.mjs +37 -1
  18. package/templates/AGENTS.md +7 -0
  19. package/templates/CLAUDE.md +8 -1
  20. package/templates/docs/AI_HANDOFF/ACTIVE.md +9 -0
  21. package/templates/docs/AI_HANDOFF/HISTORY.md +4 -0
  22. package/templates/docs/AI_HANDOFF/INDEX.md +13 -0
  23. package/templates/docs/AI_HANDOFF/PLAN.md +75 -0
  24. package/templates/docs/AI_HANDOFF/RULES.md +127 -0
  25. package/templates/docs/AI_HANDOFF/archive/.gitkeep +0 -0
  26. package/templates/docs/AI_HANDOFF/tasks/.gitkeep +0 -0
  27. package/templates/docs/AI_HANDOFF/tasks/_TEMPLATE.md +72 -0
  28. package/templates/docs/INSTALL.md +2 -2
  29. package/templates/docs/PROJECT.md +1 -1
  30. package/templates/docs/UKIT_USAGE_GUIDE.md +1 -1
  31. package/templates/docs/WORKLOG.md +11 -0
  32. package/templates/ukit/storage/config.json +93 -1
  33. package/templates/docs/AI_HANDOFF.md +0 -118
package/CHANGELOG.md CHANGED
@@ -2,23 +2,75 @@
2
2
 
3
3
  All notable changes to UKit are documented here.
4
4
 
5
+ ## 1.5.5 - 2026-05-30
6
+
7
+ ### Added
8
+
9
+ - **Handoff Quality Gate** — opt-in, tool-agnostic, file-based, 4-phase pipeline (Idea+Plan → Create Tasks → Implement+Test → Review+Test) for tasks routed through `docs/AI_HANDOFF/`. Daily ad-hoc prompts are NOT affected and continue with the existing lightweight workflow.
10
+ - New agent `templates/.claude/agents/code-reviewer.md`: independent reviewer with a model-isolation check. Refuses to review when `EXECUTOR_MODEL` equals reviewer's own model, when either is missing/unknown, or when re-run of Verification Commands fails. Emits a structured `## Reviewer Verdict` block.
11
+ - New config block `handoff.*` in `templates/ukit/storage/config.json` with Vietnamese `_help` entries: `handoff.plan.requireTestPlan`, `handoff.executor.testFirstRequired`, `handoff.reviewer.model` (default `unic-smart`), `handoff.reviewer.blockOnCritical`, etc. The `*.modelHint` fields are explicitly documented as *labels for humans, not enforced selectors* — model enforcement happens via executor self-report + reviewer refusal.
12
+ - New `## 4. Test Plan (REQUIRED — TDD-style)` section in `templates/docs/AI_HANDOFF/PLAN.md`.
13
+ - New `templates/docs/AI_HANDOFF/tasks/_TEMPLATE.md`: every task file MUST embed its own `## Test Cases` (table with happy + ≥1 edge case + regression for bug fixes) + `## Test Files` + `## Verification Commands` + `## Acceptance Criteria`. Tasks without these are `needs_breakdown`, not `ready`.
14
+ - `## Discussion` section pattern on task files: structured AI-to-AI comment thread (date · role · tool/model) so planner/executor/reviewer can talk back to each other through files — no out-of-band channels.
15
+ - New task status states in `INDEX.md`: `pending_review`, `changes_requested`, `critical_block`, `approved`, `approved_minor`, plus Owner/Reviewer columns.
16
+
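The readiness rule for task files described above can be sketched as a small heading check. This is only an illustration: `taskReadiness` and `REQUIRED_SECTIONS` are hypothetical names that do not appear in the package, and matching the four headings by exact text is an assumption about how `_TEMPLATE.md` is written.

```javascript
// Hypothetical helper, not part of UKit: classify a task file by its headings.
// The four section names come from the changelog entry; exact-match is assumed.
const REQUIRED_SECTIONS = [
  '## Test Cases',
  '## Test Files',
  '## Verification Commands',
  '## Acceptance Criteria',
];

function taskReadiness(markdown) {
  // Collect every level-2 heading found in the task file.
  const headings = new Set(
    markdown
      .split('\n')
      .map((line) => line.trim())
      .filter((line) => line.startsWith('## ')),
  );
  const missing = REQUIRED_SECTIONS.filter((section) => !headings.has(section));
  // A task missing any required section is not ready for an executor.
  return { status: missing.length === 0 ? 'ready' : 'needs_breakdown', missing };
}

const draft = '# TASK-001\n\n## Test Cases\n\n## Test Files\n';
console.log(taskReadiness(draft));
```

For the draft above, the helper reports `needs_breakdown` and lists the two sections that are still missing.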
17
+ ### Changed
18
+
19
+ - `templates/.claude/agents/feature-implementer.md`: now operates in two explicit modes. **Daily mode** (DEFAULT): unchanged lightweight flow — tests only when touched code has coverage, no reviewer trigger. **Handoff mode** (when task lives under `docs/AI_HANDOFF/tasks/`): test-first → green → reviewer; cannot claim `STATUS: DONE` without fresh PASS in turn. Report format now includes `EXECUTOR_TOOL`, `EXECUTOR_MODEL`, `EXECUTOR_SUBAGENT` self-report.
20
+ - `templates/.claude/agents/bug-debugger.md`: same two-mode split. Handoff mode requires regression-test-first (RED before fix, GREEN after); Daily mode unchanged.
21
+ - `templates/docs/AI_HANDOFF/RULES.md`: rewritten around the 4-phase model. Added `## Hard rule — All work stays in docs/AI_HANDOFF/` (no out-of-band AI communication). Added Phase 2 TDD-embedded requirement. Auto-compact 80-line rule scoped to state files (ACTIVE/INDEX/tasks); PLAN.md and RULES.md exempt.
22
+ - `templates/CLAUDE.md` + `templates/AGENTS.md`: added scoped `## Handoff Quality Gate` section with explicit OPT-IN scope language so adapter targets (Claude Code, Kilo Code, Codex, OpenCode, future tools) only apply Quality Gate to handoff work, not daily prompts.
23
+
24
+ ### Fixed
25
+
26
+ - Fixed output-history deduplication for `promptCache: false` — the second call with semantically equivalent (but noisy) output now returns the cached first summary instead of recomputing. Root cause: `normalizeOutputSummaryForDedupe` wasn't stripping `FAIL`/`PASS`/`Test Files`/`Tests`/`Duration`/`Start at` lines from the dedup key, so slight token-budget differences between two similar payloads produced different keys and broke cache hits. Added `findOutputHistoryEntry` lookup in `main()` to serve cached summaries without prompt-cache.
27
+
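The root cause described in the fix above, volatile test-runner lines leaking into the dedup key, can be illustrated with a minimal sketch. The body of `normalizeOutputSummaryForDedupe` is not shown in this diff, so `dedupeKey` below is a hypothetical stand-in, and prefix matching on trimmed lines is an assumption rather than the package's actual logic.

```javascript
// Line prefixes the changelog entry names as noise in the dedup key.
const VOLATILE_PREFIXES = ['FAIL', 'PASS', 'Test Files', 'Tests', 'Duration', 'Start at'];

// Hypothetical stand-in for the real normalizer (its body is not in this diff).
function dedupeKey(summary) {
  return summary
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    // Drop lines that change between otherwise identical runs.
    .filter((line) => !VOLATILE_PREFIXES.some((prefix) => line.startsWith(prefix)))
    .join('\n');
}

const first = 'error: expected 2, got 3\nTests  1 failed\nDuration  1.21s';
const second = 'error: expected 2, got 3\nTests  1 failed\nDuration  0.98s';
console.log(dedupeKey(first) === dedupeKey(second)); // true
```

A plain prefix match is deliberately crude: it would also drop an unrelated line that happens to start with `PASS`, so a real normalizer would probably need a stricter pattern.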
28
+ ### Why
29
+
30
+ User had two real concerns: (1) cheap-model executors (e.g. Kilo Code) miss small things — fixed by mandatory test-first + independent reviewer with different model; (2) UKit cannot dictate which model any external tool uses — fixed by making the contract self-report-based (executor writes `EXECUTOR_MODEL`, reviewer compares and refuses on match/unknown). The Quality Gate is opt-in via the `docs/AI_HANDOFF/` folder so daily ad-hoc work is unaffected.
31
+
32
+ ## 1.5.4 - 2026-05-28
33
+
34
+ ### Fixed
35
+
36
+ - Fixed stale `docs/AI_HANDOFF.md` references in next-step skill, INSTALL.md, UKIT_USAGE_GUIDE.md, PROJECT.md (template + local).
37
+ - Fixed mirror `route-task.mjs` not showing `handoff=` in summary — wrong intent order (docsSpecific before handoff) + missing handoffFile field + missing display segment.
38
+ - Added `WORKLOG_ARCHIVE.md` to `.gitignore`.
39
+
40
+ ### Changed
41
+
42
+ - Compacted `docs/MEMORY.md` from 118KB/298 lines to 7.8KB/78 lines. Old entries archived to `docs/MEMORY_ARCHIVE.md`.
43
+ - Cleared handoff task queue after all cycle 002 tasks resolved.
44
+
45
+ ## 1.5.3 - 2026-05-28
46
+
47
+ ### Changed
48
+
49
+ - Restructured `docs/AI_HANDOFF.md` into `docs/AI_HANDOFF/` folder: ACTIVE.md (snapshot), RULES.md (flow + token budget), PLAN.md (brainstorm), INDEX.md (task index), tasks/ (per-task files), archive/ (completed cycles, max 3).
50
+ - Each task is now an isolated file in `tasks/TASK-xxx.md` — AI reads only the task it implements, not the entire backlog.
51
+ - Added token budget rule: combined handoff reads must stay under 200 lines per request.
52
+ - Added auto-compact rule: if any handoff file exceeds 80 lines, trigger `clear handoff`.
53
+ - Updated template mirror `templates/docs/AI_HANDOFF/` to match new folder structure.
54
+ - Updated `taskRouting.js` handoffFile target from `docs/AI_HANDOFF.md` to `docs/AI_HANDOFF/ACTIVE.md`.
55
+ - Updated `install.js`, `doctor.js`, `uninstall.js` to check for `docs/AI_HANDOFF/ACTIVE.md`.
56
+
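The two budget rules above (200 combined lines, 80 lines per file) can be sketched as a pure function over per-file line counts. The constant names and warning strings mirror `checkHandoffBudget` in `taskRouting.js` from this diff; `classifyHandoffBudget` itself is a hypothetical name, and the sample line counts are invented for illustration.

```javascript
const HANDOFF_BUDGET_MAX_LINES = 200; // combined budget across handoff files
const HANDOFF_FILE_MAX_LINES = 80;    // per-file auto-compact threshold

// Hypothetical pure variant of the budget check: no file I/O, just arithmetic.
function classifyHandoffBudget(fileDetails) {
  const totalLines = fileDetails.reduce((sum, f) => sum + f.lines, 0);
  const oversizedFiles = fileDetails
    .filter((f) => f.lines > HANDOFF_FILE_MAX_LINES)
    .map((f) => f.file);

  let warning = null;
  if (totalLines > HANDOFF_BUDGET_MAX_LINES) {
    warning = 'over-budget';      // the combined total takes precedence
  } else if (oversizedFiles.length > 0) {
    warning = 'file-oversized';   // total is fine, but one file is too long
  }
  return { totalLines, oversizedFiles, warning };
}

console.log(classifyHandoffBudget([
  { file: 'ACTIVE.md', lines: 40 },
  { file: 'INDEX.md', lines: 30 },
  { file: 'tasks/TASK-001.md', lines: 95 },
]));
```

With these sample counts the total is 165 lines, so the combined budget holds, but `tasks/TASK-001.md` exceeds 80 lines and the warning is `file-oversized`. Both comparisons are strict, so a total of exactly 200 lines or a file of exactly 80 lines does not trigger a warning.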
5
57
  ## 1.5.2 - 2026-05-28
6
58
 
7
59
  ### Added
8
60
 
9
61
  - Added first-class `ukit:handoff` prompt detection and routing for explicit handoff/brainstorm-to-task flows.
10
- - Added `intentMode: handoff` to route summaries so hooks and helpers can prioritize `docs/AI_HANDOFF.md` automatically.
11
- - Added a reusable generic `docs/AI_HANDOFF.md` handoff template for installed projects.
62
+ - Added `intentMode: handoff` to route summaries so hooks and helpers can prioritize handoff file automatically.
63
+ - Added a reusable generic handoff template for installed projects.
12
64
 
13
65
  ### Changed
14
66
 
15
- - Handoff prompts now route through `docs-quality` skill instead of generic `update-status`, making `docs/AI_HANDOFF.md` the primary coordination artifact.
16
- - Updated installed `next-step` skill guidance to read `docs/AI_HANDOFF.md` first for explicit handoff prompts.
67
+ - Handoff prompts now route through `docs-quality` skill instead of generic `update-status`.
68
+ - Updated installed `next-step` skill guidance to read handoff file first for explicit handoff prompts.
17
69
  - Updated `route-task` mirror and hook behavior in both source and installed artifact so `ukit install` users get consistent handoff routing.
18
70
 
19
71
  ### Fixed
20
72
 
21
- - Handoff authoring is now advisory (exit 0) in Safe Patch, so large `docs/AI_HANDOFF.md` batches no longer block the handoff workflow.
73
+ - Handoff authoring is now advisory (exit 0) in Safe Patch, so large handoff batches no longer block the handoff workflow.
22
74
  - Hard runtime/shared-risk file broad rewrites still block by default unless `advisoryOnly=true` or `UKIT_SAFE_PATCH_ADVISORY=1` is set.
23
75
 
24
76
  ### Tests
package/README.md CHANGED
@@ -32,7 +32,7 @@ ukit install
32
32
  4. Fill in the generated docs baseline:
33
33
  - `docs/PROJECT.md`
34
34
  - `docs/MEMORY.md`
35
- - `docs/AI_HANDOFF.md`
35
+ - `docs/AI_HANDOFF/`
36
36
  - `docs/WORKLOG.md`
37
37
  5. Open your AI tool and work in natural language.
38
38
 
@@ -90,7 +90,7 @@ UKit v1.3.1 keeps the same shared runtime contract while adding Safe Patch Proto
90
90
  - install globally with `npm install -g @ngockhoale/ukit`
91
91
  - keep using the exact same human workflow inside projects: `ukit install`
92
92
  - preserve the same `ukit` binary, hooks, and install-first orchestration while standardizing the runtime root as hidden `.ukit/`
93
- - install `docs/AI_HANDOFF.md` as the default cross-AI handoff file for plan → task breakdown → implementation continuity, with explicit sections so one AI can plan, another can refine tasks, and another can implement them reliably
93
+ - install `docs/AI_HANDOFF/` as the cross-AI handoff folder with per-task isolation: ACTIVE.md (snapshot), INDEX.md (task index), tasks/ (one file per task) so each AI reads only the task it needs, with token budget rules in RULES.md
94
94
  - auto-route open-ended “what next?” / “continue” prompts to the `next-step` skill with a visible freshness cue when status may be stale
95
95
  - auto-route explicit handoff/wrap-up requests to the `update-status` skill while skipping trivial/no-state-change tasks
96
96
  - keep concrete debug/implementation/review prompts primary, so project status never replaces source/index-first task work
package/manifests/platform.full.yaml CHANGED
@@ -145,8 +145,8 @@ items:
145
145
 
146
146
  - id: docs-ai-handoff
147
147
  type: config
148
- sourceTemplate: docs/AI_HANDOFF.md
149
- targetPath: docs/AI_HANDOFF.md
148
+ sourceTemplate: docs/AI_HANDOFF
149
+ targetPath: docs/AI_HANDOFF
150
150
  requires:
151
151
  - docs-project
152
152
  - docs-memory
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@ngockhoale/ukit",
3
- "version": "1.5.2",
3
+ "version": "1.5.5",
4
4
  "description": "Install/update an index-first AI workspace for Claude Code, Antigravity, OpenAI Codex, and OpenCode.",
5
5
  "license": "MIT",
6
6
  "type": "module",
package/src/cli/commands/doctor.js CHANGED
@@ -1,4 +1,5 @@
1
1
  import path from 'node:path';
2
+ import fs from 'node:fs/promises';
2
3
  import { pathExists, readJsonIfExists } from '../../core/fileOps.js';
3
4
  import { buildPathConfig } from '../../core/paths.js';
4
5
  import { buildRuntimePaths } from '../../core/runtimePaths.js';
@@ -45,6 +46,15 @@ export async function runDoctor({ packageRoot, projectRoot, argv = [] }) {
45
46
  : [];
46
47
  const codexAdapterTracked = trackedPaths.some((entry) => entry.startsWith('.codex/'));
47
48
 
49
+ const WORKLOG_MAX_LINES = 600;
50
+ let worklogLineCount = 0;
51
+ try {
52
+ const content = await fs.readFile(path.join(projectRoot, 'docs', 'WORKLOG.md'), 'utf8');
53
+ worklogLineCount = content.split('\n').length;
54
+ } catch {
55
+ // file may not exist
56
+ }
57
+
48
58
  const checks = {
49
59
  manifestLoaded: Boolean(manifest?.name),
50
60
  templatesDirExists: await pathExists(pathConfig.templatesRoot),
@@ -62,8 +72,9 @@ export async function runDoctor({ packageRoot, projectRoot, argv = [] }) {
62
72
  sessionMemoryDirExists: await pathExists(runtimePaths.sessionsDir),
63
73
  docsProjectExists: await pathExists(path.join(projectRoot, 'docs', 'PROJECT.md')),
64
74
  docsMemoryExists: await pathExists(path.join(projectRoot, 'docs', 'MEMORY.md')),
65
- docsAiHandoffExists: await pathExists(path.join(projectRoot, 'docs', 'AI_HANDOFF.md')),
75
+ docsAiHandoffExists: await pathExists(path.join(projectRoot, 'docs', 'AI_HANDOFF', 'ACTIVE.md')),
66
76
  docsWorklogExists: await pathExists(path.join(projectRoot, 'docs', 'WORKLOG.md')),
77
+ docsWorklogUnderBudget: worklogLineCount <= WORKLOG_MAX_LINES,
67
78
  allProvidersConfigured: providers.allSupported,
68
79
  ...(codexAdapterTracked
69
80
  ? {
@@ -104,8 +115,9 @@ export async function runDoctor({ packageRoot, projectRoot, argv = [] }) {
104
115
  console.log(`[UKit] ${ok(checks.sessionMemoryDirExists)} .ukit/storage/memory/sessions/`);
105
116
  console.log(`[UKit] ${ok(checks.docsProjectExists)} docs/PROJECT.md`);
106
117
  console.log(`[UKit] ${ok(checks.docsMemoryExists)} docs/MEMORY.md`);
107
- console.log(`[UKit] ${ok(checks.docsAiHandoffExists)} docs/AI_HANDOFF.md`);
118
+ console.log(`[UKit] ${ok(checks.docsAiHandoffExists)} docs/AI_HANDOFF/`);
108
119
  console.log(`[UKit] ${ok(checks.docsWorklogExists)} docs/WORKLOG.md`);
120
+ console.log(`[UKit] ${ok(checks.docsWorklogUnderBudget)} docs/WORKLOG.md under budget (${worklogLineCount}/${WORKLOG_MAX_LINES} lines)`);
109
121
  if (codexAdapterTracked) {
110
122
  console.log(`[UKit] ${ok(checks.codexReadmeExists)} .codex/README.md`);
111
123
  console.log(`[UKit] ${ok(checks.codexSettingsExists)} .codex/settings.json`);
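One detail of the new worklog check is worth illustrating: `content.split('\n').length` counts a trailing newline as an extra, empty line. The sketch below compares that expression with an alternative that ignores one trailing newline. Both function names are hypothetical, and whether the off-by-one matters for a 600-line budget is a judgment call rather than a claim about the package's intent.

```javascript
// Same counting expression the doctor check uses on docs/WORKLOG.md.
function countLinesLikeDoctor(content) {
  return content.split('\n').length;
}

// Hypothetical alternative: ignore a single trailing newline, and treat
// an empty file as zero lines.
function countTextLines(content) {
  if (content === '') return 0;
  const body = content.endsWith('\n') ? content.slice(0, -1) : content;
  return body.split('\n').length;
}

console.log(countLinesLikeDoctor('a\nb\n')); // 3
console.log(countTextLines('a\nb\n'));       // 2
```

A file that ends with a newline therefore reports one more line under the first expression than most editors would show.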
package/src/cli/commands/install.js CHANGED
@@ -239,7 +239,7 @@ export async function runInstall({ packageRoot, projectRoot, packageVersion, arg
239
239
  const docsLabels = [
240
240
  'docs/PROJECT.md',
241
241
  'docs/MEMORY.md',
242
- 'docs/AI_HANDOFF.md',
242
+ 'docs/AI_HANDOFF/ACTIVE.md',
243
243
  'docs/WORKLOG.md',
244
244
  ];
245
245
 
@@ -254,7 +254,7 @@ export async function runInstall({ packageRoot, projectRoot, packageVersion, arg
254
254
  if (missingDocs.length > 0) {
255
255
  console.log(`[UKit] Missing docs — fill these in before first use: ${missingDocs.join(', ')}`);
256
256
  } else {
257
- console.log('[UKit] Docs baseline ready: docs/PROJECT.md, docs/MEMORY.md, docs/AI_HANDOFF.md, docs/WORKLOG.md');
257
+ console.log('[UKit] Docs baseline ready: docs/PROJECT.md, docs/MEMORY.md, docs/AI_HANDOFF/, docs/WORKLOG.md');
258
258
  console.log('[UKit] Fill them once with real project context for the best results.');
259
259
  }
260
260
 
package/src/cli/commands/uninstall.js CHANGED
@@ -47,5 +47,5 @@ export async function runUninstall({ projectRoot, argv = [] }) {
47
47
  }
48
48
 
49
49
  console.log(`[UKit] Uninstall complete. Removed ${result.removed}/${result.attempted} managed paths.`);
50
- console.log('[UKit] Note: docs/PROJECT.md, docs/MEMORY.md, docs/AI_HANDOFF.md, docs/WORKLOG.md contain user content and were preserved. Delete manually if needed.');
50
+ console.log('[UKit] Note: docs/PROJECT.md, docs/MEMORY.md, docs/AI_HANDOFF/, docs/WORKLOG.md contain user content and were preserved. Delete manually if needed.');
51
51
  }
package/src/index/taskRouting.js CHANGED
@@ -139,6 +139,10 @@ export async function deriveTaskRoute({
139
139
  contextRecommendation,
140
140
  verificationRecommendation,
141
141
  });
142
+ const handoffBudget = intentMode === 'handoff'
143
+ ? await checkHandoffBudget(absoluteRoot)
144
+ : null;
145
+ const worklogBudget = await checkWorklogBudget(absoluteRoot);
142
146
  const routeSummary = buildRouteSummary({
143
147
  activeSkills,
144
148
  routingContext: {
@@ -157,6 +161,8 @@ export async function deriveTaskRoute({
157
161
  contextRecommendation,
158
162
  verificationRecommendation,
159
163
  nextAction,
164
+ handoffBudget,
165
+ worklogBudget,
160
166
  });
161
167
  const approachSelector = routeSummary?.approachSelector ?? null;
162
168
 
@@ -180,6 +186,8 @@ export async function deriveTaskRoute({
180
186
  verificationRecommendation,
181
187
  nextAction,
182
188
  routeSummary,
189
+ handoffBudget,
190
+ worklogBudget,
183
191
  ...(degradedWarnings.length > 0 ? { degradedWarnings } : {}),
184
192
  };
185
193
  }
@@ -190,6 +198,8 @@ export function buildRouteSummary({
190
198
  contextRecommendation = null,
191
199
  verificationRecommendation = null,
192
200
  nextAction = null,
201
+ handoffBudget = null,
202
+ worklogBudget = null,
193
203
  } = {}) {
194
204
  const autonomyLevel = routingContext.autonomyLevel ?? 'balanced';
195
205
  const delegationRecommendation = deriveDelegationRecommendation({
@@ -243,7 +253,7 @@ export function buildRouteSummary({
243
253
  ),
244
254
  );
245
255
  const nextActionCommand = compactHelperLane ? null : nextAction?.command ?? null;
246
- const handoffFile = routingContext.intentMode === 'handoff' ? 'docs/AI_HANDOFF.md' : null;
256
+ const handoffFile = routingContext.intentMode === 'handoff' ? 'docs/AI_HANDOFF/ACTIVE.md' : null;
247
257
  const summaryLine = [
248
258
  routingContext.taskType ? `task=${routingContext.taskType}` : null,
249
259
  handoffFile ? `handoff=${handoffFile}` : null,
@@ -253,6 +263,8 @@ export function buildRouteSummary({
253
263
  editGuardHint ? `editGuard=${editGuardHint}` : null,
254
264
  delegationRecommendation?.hint ? `delegate=${delegationRecommendation.hint}` : null,
255
265
  policyMode ? `policy=${policyMode}` : null,
266
+ handoffBudget?.warning ? `budget=${handoffBudget.warning}` : null,
267
+ worklogBudget?.warning ? `budget=${worklogBudget.warning}` : null,
256
268
  ].filter(Boolean).join(' | ');
257
269
 
258
270
  return {
@@ -270,6 +282,8 @@ export function buildRouteSummary({
270
282
  continuationState,
271
283
  intentMode: routingContext.intentMode ?? null,
272
284
  handoffFile,
285
+ handoffBudget,
286
+ worklogBudget,
273
287
  delegateHint: delegationRecommendation?.hint ?? null,
274
288
  nextActionType: nextAction?.type ?? null,
275
289
  nextActionCommand,
@@ -1480,6 +1494,108 @@ function unique(values) {
1480
1494
  return [...new Set(values.filter(Boolean))];
1481
1495
  }
1482
1496
 
1497
+ const HANDOFF_BUDGET_MAX_LINES = 200;
1498
+ const HANDOFF_FILE_MAX_LINES = 80;
1499
+ const WORKLOG_BUDGET_MAX_LINES = 600;
1500
+ const WORKLOG_BUDGET_MAX_ENTRIES = 30;
1501
+
1502
+ export async function checkHandoffBudget(rootDir) {
1503
+ const handoffDir = path.join(rootDir, 'docs', 'AI_HANDOFF');
1504
+ const checkFiles = ['ACTIVE.md', 'INDEX.md'];
1505
+ let totalLines = 0;
1506
+ const fileDetails = [];
1507
+
1508
+ for (const fileName of checkFiles) {
1509
+ const filePath = path.join(handoffDir, fileName);
1510
+ try {
1511
+ const content = await fs.readFile(filePath, 'utf8');
1512
+ const lines = content.split('\n').length;
1513
+ totalLines += lines;
1514
+ fileDetails.push({ file: fileName, lines });
1515
+ } catch {
1516
+ // File may not exist yet — not an error
1517
+ }
1518
+ }
1519
+
1520
+ // Also count task files
1521
+ const tasksDir = path.join(handoffDir, 'tasks');
1522
+ try {
1523
+ const taskFiles = await fs.readdir(tasksDir);
1524
+ for (const taskFile of taskFiles) {
1525
+ if (!taskFile.endsWith('.md')) continue;
1526
+ const content = await fs.readFile(path.join(tasksDir, taskFile), 'utf8');
1527
+ const lines = content.split('\n').length;
1528
+ totalLines += lines;
1529
+ fileDetails.push({ file: `tasks/${taskFile}`, lines });
1530
+ }
1531
+ } catch {
1532
+ // tasks dir may not exist
1533
+ }
1534
+
1535
+ const oversizedFiles = fileDetails.filter((f) => f.lines > HANDOFF_FILE_MAX_LINES);
1536
+ const overBudget = totalLines > HANDOFF_BUDGET_MAX_LINES;
1537
+
1538
+ let warning = null;
1539
+ if (overBudget) {
1540
+ warning = 'over-budget';
1541
+ } else if (oversizedFiles.length > 0) {
1542
+ warning = 'file-oversized';
1543
+ }
1544
+
1545
+ return {
1546
+ totalLines,
1547
+ maxLines: HANDOFF_BUDGET_MAX_LINES,
1548
+ fileDetails,
1549
+ oversizedFiles: oversizedFiles.map((f) => f.file),
1550
+ overBudget,
1551
+ warning,
1552
+ action: warning ? 'clear-handoff' : null,
1553
+ };
1554
+ }
1555
+
1556
+ export async function checkWorklogBudget(rootDir) {
1557
+ const worklogPath = path.join(rootDir, 'docs', 'WORKLOG.md');
1558
+ try {
1559
+ const content = await fs.readFile(worklogPath, 'utf8');
1560
+ const lines = content.split('\n').length;
1561
+ const entryMatches = content.match(/^## \d{4}-\d{2}-\d{2}/gm) ?? [];
1562
+ const entryCount = entryMatches.length;
1563
+
1564
+ const overLineBudget = lines > WORKLOG_BUDGET_MAX_LINES;
1565
+ const overEntryBudget = entryCount > WORKLOG_BUDGET_MAX_ENTRIES;
1566
+ const overBudget = overLineBudget || overEntryBudget;
1567
+
1568
+ let warning = null;
1569
+ if (overLineBudget && overEntryBudget) {
1570
+ warning = 'worklog-lines-and-entries-over';
1571
+ } else if (overLineBudget) {
1572
+ warning = 'worklog-lines-over';
1573
+ } else if (overEntryBudget) {
1574
+ warning = 'worklog-entries-over';
1575
+ }
1576
+
1577
+ return {
1578
+ lines,
1579
+ maxLines: WORKLOG_BUDGET_MAX_LINES,
1580
+ entryCount,
1581
+ maxEntries: WORKLOG_BUDGET_MAX_ENTRIES,
1582
+ overBudget,
1583
+ warning,
1584
+ action: warning ? 'compact-worklog' : null,
1585
+ };
1586
+ } catch {
1587
+ return {
1588
+ lines: 0,
1589
+ maxLines: WORKLOG_BUDGET_MAX_LINES,
1590
+ entryCount: 0,
1591
+ maxEntries: WORKLOG_BUDGET_MAX_ENTRIES,
1592
+ overBudget: false,
1593
+ warning: null,
1594
+ action: null,
1595
+ };
1596
+ }
1597
+ }
1598
+
1483
1599
  const DELEGATABLE_IMPLEMENTATION_SKILL_IDS = new Set([
1484
1600
  'delivery',
1485
1601
  'frontend',
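The entry count in `checkWorklogBudget` depends on the pattern `/^## \d{4}-\d{2}-\d{2}/gm`, so only level-2 headings that begin with an ISO-style date are counted. A small sketch with an invented worklog shows which headings qualify; `countWorklogEntries` is a hypothetical wrapper, not a function from the package.

```javascript
// Same pattern as checkWorklogBudget: "## " followed by YYYY-MM-DD at line start.
const ENTRY_HEADING = /^## \d{4}-\d{2}-\d{2}/gm;

// Hypothetical wrapper around the match-and-count step.
function countWorklogEntries(content) {
  return (content.match(ENTRY_HEADING) ?? []).length;
}

// Invented sample content, for illustration only.
const worklog = [
  '# Worklog',
  '## 2026-05-30 handoff quality gate',
  '- shipped reviewer agent',
  '### 2026-05-29 not counted: three hashes',
  '## Notes (not counted: no date)',
  '## 2026-05-28 folder restructure',
].join('\n');

console.log(countWorklogEntries(worklog)); // 2
```

Entries written with a different heading level or date format would not count toward the 30-entry budget, so only the line budget would catch growth in such a file.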
package/templates/.claude/agents/bug-debugger.md CHANGED
@@ -8,50 +8,79 @@ tools: ["Read", "Grep", "Glob", "Bash", "Edit", "TodoWrite"]
8
8
 
9
9
  Systematic debugging — understand before fixing.
10
10
 
11
+ **Two modes:**
12
+ - **Daily/ad-hoc** (DEFAULT): bug not coming from `docs/AI_HANDOFF/` → reproduce → fix → verify (no mandatory regression test if no pre-existing coverage; original lightweight flow).
13
+ - **Handoff mode**: bug task lives in `docs/AI_HANDOFF/tasks/TASK-xxx.md` → activate Quality Gate: regression-test-first → green → reviewer.
14
+
11
15
  ## Workflow
12
16
 
13
17
  ### 1. Reproduce (required)
14
18
 
15
- - Run the failing command/action
16
- - Capture exact error message and stack trace
17
- - If not reproducible → document conditions and ask user
19
+ - Run the failing command/action.
20
+ - Capture exact error message and stack trace.
21
+ - If not reproducible → document conditions and ask user.
18
22
 
19
23
  ### 2. Trace Root Cause
20
24
 
21
- - Read error location and surrounding code
22
- - Trace data flow: input → processing → failure point
23
- - Identify: is this a logic error, state error, or integration error?
25
+ - Read error location and surrounding code.
26
+ - Trace data flow: input → processing → failure point.
27
+ - Identify: logic / state / integration error?
28
+
29
+ ### 3. Regression Test First (RED) — Handoff mode
30
+
31
+ - Write a regression test that reproduces the bug as a failing test.
32
+ - Run it: must FAIL with the original error/signature.
33
+ - If you truly cannot write a regression test (pure UI glitch, env-only issue), document why and attach a manual repro script.
34
+ - **Daily mode**: write a regression test only if the file already has tests; otherwise rely on the original repro command for verification.
24
35
 
25
- ### 3. Fix
36
+ ### 4. Fix (GREEN)
26
37
 
27
- - Apply smallest reliable fix at the root cause
28
- - Do NOT patch symptoms — fix the cause
38
+ - Apply smallest reliable fix at the root cause.
39
+ - Do NOT patch symptoms — fix the cause.
40
+ - Re-run the regression test: must PASS.
29
41
 
30
- ### 4. Verify
42
+ ### 5. Verify
31
43
 
32
- - Re-run the original failing command → must pass
33
- - Run related tests: `yarn test [relevant-file]`
34
- - Check no regression in adjacent functionality
44
+ - Re-run the original failing command → must pass.
45
+ - Run related tests: `yarn test [relevant-file]`.
46
+ - If shared code touched, run wider suite.
47
+ - Check no regression in adjacent functionality.
35
48
 
36
- ### 5. Report
49
+ ### 6. Report
37
50
 
38
51
  ```
39
52
  STATUS: DONE | BLOCKED | PARTIAL
53
+ EXECUTOR_TOOL: [claude-code | kilo-code | codex | opencode | other]
54
+ EXECUTOR_MODEL: [exact model name you are running as. "unknown" if you cannot tell.]
55
+ EXECUTOR_SUBAGENT: [subagent name within your host, if any, else "-"]
40
56
  SUMMARY: [1-2 sentences — root cause and fix]
41
57
  ROOT_CAUSE: [what caused the bug]
58
+ REGRESSION_TEST:
59
+ file: [path]
60
+ red_before: [exact error captured]
61
+ green_after: [pass output line]
42
62
  FILES_CHANGED:
43
63
  - [file path]: [what changed]
44
- VERIFIED: [original failing command now passes + test output]
64
+ VERIFICATION:
65
+ command: [exact command]
66
+ result: [N pass / M fail / exit code]
45
67
  ISSUES: [any remaining risks or edge cases, or "none"]
46
- NEXT: [follow-up needed, or "nothing, bug fixed"]
68
+ HANDOFF_TO_REVIEWER: yes | no (with reason)
69
+ NEXT: [follow-up needed, or "ready for review"]
47
70
  ```
48
71
 
72
+ ### 7. Trigger Reviewer — Handoff mode ONLY
73
+
74
+ Daily mode: skip. Handoff mode: set task status `pending_review` in `INDEX.md`; a reviewer session (model from `handoff.reviewer.model`, MUST differ from this debugger's model) will pick it up.
75
+
49
76
  ## Rules
50
77
 
51
- - Don't patch blindly; confirm root cause with evidence
78
+ - **Iron law (Handoff mode):** no `DONE` without (a) regression test passing and (b) original failing command passing, both in this turn.
79
+ - **Daily mode:** original rule unchanged: the original failing command must pass; a regression test is optional unless prior coverage exists.
80
+ - Don't patch blindly — confirm root cause with evidence.
52
81
  - For bug triage, use graduated doc budget:
53
82
  - obvious/simple bug: `docs/MEMORY.md` only
54
83
  - non-trivial bug: `docs/MEMORY.md` + `docs/PROJECT.md` + `docs/CODE_MAP.md`
55
84
  - read `docs/WORKLOG.md` only recent relevant entries
56
- - Keep fix scope minimal — no drive-by refactors
57
- - If root cause is unclear after 5 minutes of tracing → ask user for more context
85
+ - Keep fix scope minimal — no drive-by refactors.
86
+ - If root cause is unclear after 5 minutes of tracing → ask user for more context.
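The two completion rules above (Handoff mode versus Daily mode) can be sketched as a single predicate. Everything here is hypothetical: `mayClaimDone` and its boolean fields are illustrative names, and the agent template states these rules in prose rather than enforcing them in code. The "both in this turn" freshness requirement is not modeled.

```javascript
// Hypothetical predicate for the "no DONE without evidence" rules.
function mayClaimDone({ mode, originalCommandPassed, regressionTestPassed, hasPriorCoverage = false }) {
  // In every mode the original failing command must now pass.
  if (originalCommandPassed !== true) return false;

  if (mode === 'handoff') {
    // Handoff mode: the regression test must also pass.
    return regressionTestPassed === true;
  }

  // Daily mode: a regression test is only expected when coverage already existed.
  return hasPriorCoverage ? regressionTestPassed === true : true;
}

console.log(mayClaimDone({ mode: 'handoff', originalCommandPassed: true, regressionTestPassed: false })); // false
console.log(mayClaimDone({ mode: 'daily', originalCommandPassed: true, regressionTestPassed: false }));   // true
```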
package/templates/.claude/agents/code-reviewer.md ADDED
@@ -0,0 +1,86 @@
1
+ ---
2
+ name: code-reviewer
3
+ description: "Independent reviewer for handoff Phase 3. Use after executor reports STATUS: DONE on a handoff task. MUST run with a model different from the executor (configured in .ukit/storage/config.json → handoff.reviewer.model, default unic-smart). Produces a verdict: APPROVED | APPROVED-WITH-MINOR | CHANGES-REQUESTED | CRITICAL."
4
+ model: inherit
5
+ color: yellow
6
+ tools: ["Read", "Grep", "Glob", "Bash"]
7
+ ---
8
+
9
+ You are the independent reviewer for UKit's handoff Quality Gate. Your model is configured in `.ukit/storage/config.json` → `handoff.reviewer.model` and MUST differ from the executor's model. If the host can bind a model from config, use it; otherwise note in the verdict which model you are running as.
10
+
11
+ **Do not invent issues. Do not rubber-stamp.** Every finding must point at a specific file + line + concrete failure mode.
12
+
13
+ ## Inputs you expect
14
+
15
+ - Path to task file: `docs/AI_HANDOFF/tasks/TASK-xxx.md` (has Test Plan §4 + Verification Commands + Executor Report at bottom).
16
+ - The executor's `STATUS: DONE` report with FILES_CHANGED + VERIFICATION block.
17
+ - The diff (use `git diff` or read FILES_CHANGED directly).
18
+
19
+ If any input is missing, return `CHANGES-REQUESTED` with reason "incomplete handoff package".
20
+
21
+ ## Review order
22
+
23
+ 1. **Test Plan adherence** — Were all tests in §4 actually implemented? Run them yourself: `<task Verification Commands>`. Fresh PASS required, no trusting executor's output blindly.
24
+ 2. **Correctness** — Does the diff implement the requested behavior? Any obvious wrong assumptions, stale refs, missing cases?
25
+ 3. **Regression risk** — What existing behavior could this break? Are shared paths/tests/contracts still aligned? Run the wider test suite if shared code was touched.
26
+ 4. **Safety / security / data loss** — Destructive actions, auth/permission, path handling, unsafe shell/DB/file ops.
27
+ 5. **Performance / scale** — Accidental N+1, repeated I/O, large scans inside hot paths.
28
+ 6. **Maintainability** — Duplicated logic, dead branches, misleading naming, drift between docs/tests/source.
29
+
30
+ ## Severity ladder
31
+
32
+ - **CRITICAL** — security hole, data loss risk, broken core behavior, test was faked (no real assertion), or verification command does NOT actually pass when you re-run it. Hard-blocks the handoff.
33
+ - **CHANGES-REQUESTED** — Important issues: missing edge-case test, regression risk in shared code, wrong abstraction at scope boundary. Executor must fix and re-submit.
34
+ - **APPROVED-WITH-MINOR** — Minor naming / doc / style issues. Logged on task file but handoff allowed.
35
+ - **APPROVED** — Clean.
36
+
37
+ ## Output (append to task file as `## Reviewer Verdict`)
38
+
39
+ ```
40
+ ## Reviewer Verdict
41
+
42
+ VERDICT: APPROVED | APPROVED-WITH-MINOR | CHANGES-REQUESTED | CRITICAL
43
+ REVIEWER_MODEL: [model name actually used]
44
+ EXECUTOR_MODEL: [from executor report]
45
+ VERIFICATION_RERUN:
46
+ command: [exact command]
47
+ result: [N pass / M fail]
48
+ TEST_PLAN_COVERAGE: [all-followed | partial — list gaps | missing — list]
49
+ FINDINGS:
50
+ critical:
51
+ - file: <path:line> — <what fails, how>
52
+ important:
53
+ - file: <path:line> — <what risk, evidence>
54
+ minor:
55
+ - file: <path:line> — <what to clean up>
56
+ NEXT_STATUS_FOR_INDEX: approved | approved_minor | changes_requested | critical_block
57
+ NOTES: [1-2 sentences for human reviewer if needed]
58
+ ```
59
+
60
+ After writing the verdict, update `docs/AI_HANDOFF/INDEX.md` row for this task: set Status = NEXT_STATUS_FOR_INDEX, set Reviewer = your model name.
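The step above, turning a written verdict into an `INDEX.md` status, can be sketched as a lookup. The pairing of verdicts to statuses is inferred from the matching order of the `VERDICT` and `NEXT_STATUS_FOR_INDEX` options in the template, so it is an assumption, and `nextIndexStatus` is a hypothetical helper rather than package code.

```javascript
// Verdict -> INDEX.md status, paired by the order the template lists them (assumed).
const STATUS_BY_VERDICT = new Map([
  ['APPROVED', 'approved'],
  ['APPROVED-WITH-MINOR', 'approved_minor'],
  ['CHANGES-REQUESTED', 'changes_requested'],
  ['CRITICAL', 'critical_block'],
]);

// Hypothetical helper: read the VERDICT line from a task file's text.
function nextIndexStatus(taskFileText) {
  // Require exactly one token after "VERDICT:" so an unfilled template
  // line ("APPROVED | APPROVED-WITH-MINOR | ...") is rejected.
  const match = taskFileText.match(/^VERDICT:\s*(\S+)\s*$/m);
  if (!match) return null;
  return STATUS_BY_VERDICT.get(match[1]) ?? null;
}

const filled = '## Reviewer Verdict\n\nVERDICT: CHANGES-REQUESTED\nREVIEWER_MODEL: unic-smart\n';
console.log(nextIndexStatus(filled)); // changes_requested
```

Returning `null` for a missing or unrecognized verdict leaves the decision to the caller, which seems safer than guessing a status.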
61
+
62
+ ## Model isolation check (FIRST thing you do)
63
+
64
+ UKit cannot force any tool to use a specific model. The contract is enforced HERE, by you, via self-report comparison.
65
+
66
+ 1. Read `EXECUTOR_MODEL` and `EXECUTOR_TOOL` and `EXECUTOR_SUBAGENT` from the Executor Report at the bottom of the task file.
67
+ 2. Identify your own model. Your model SHOULD match `handoff.reviewer.model` in `.ukit/storage/config.json`. State both in the verdict.
68
+ 3. Apply this table:
69
+
70
+ | Executor model | Your model | Action |
71
+ |-----------------------|-----------------------|--------------------------------------------------------------------------------------------|
72
+ | named, != yours | named | proceed with review |
73
+ | named, == yours | named | REFUSE -> VERDICT = CHANGES-REQUESTED, reason "reviewer model must differ from executor" |
74
+ | "unknown" | named | proceed but mark `NOTES: executor model unverified - human, please confirm before merge` |
75
+ | named | "unknown" | REFUSE -> VERDICT = CHANGES-REQUESTED, reason "reviewer cannot verify own model" |
76
+ | missing field | any | REFUSE -> VERDICT = CHANGES-REQUESTED, reason "executor did not self-report model - re-run with v1.5.5+ contract" |
77
+
78
+ Same model is the most common silent failure. Do not skip this check.
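The decision table above can be sketched as a pure function. `isolationDecision` is a hypothetical name, and three choices are assumptions the table does not spell out: model names are compared as exact strings after trimming, a missing reviewer model is treated like "unknown", and when both sides are "unknown" the reviewer-side refusal wins.

```javascript
// Hypothetical encoding of the model-isolation table; not package code.
function isolationDecision(executorModel, reviewerModel) {
  const text = (v) => (v === undefined || v === null ? '' : String(v).trim());
  const exec = text(executorModel);
  const mine = text(reviewerModel);
  const isUnknown = (v) => v.toLowerCase() === 'unknown';
  const refuse = (reason) => ({ action: 'refuse', verdict: 'CHANGES-REQUESTED', reason });

  // Missing executor field: refuse outright.
  if (exec === '') return refuse('executor did not self-report model');
  // Reviewer cannot name its own model (missing is treated as unknown: assumption).
  if (mine === '' || isUnknown(mine)) return refuse('reviewer cannot verify own model');
  // Executor said "unknown": proceed, but flag it for a human.
  if (isUnknown(exec)) return { action: 'proceed', note: 'executor model unverified' };
  // Same model on both sides defeats the independent review.
  if (exec === mine) return refuse('reviewer model must differ from executor');
  return { action: 'proceed', note: null };
}

console.log(isolationDecision('model-a', 'unic-smart').action);    // proceed
console.log(isolationDecision('unic-smart', 'unic-smart').action); // refuse
```

Here `model-a` is a placeholder name; `unic-smart` is the default reviewer model named in the config description.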
79
+
80
+ ## Rules
81
+
82
+ - **Always re-run** the task's Verification Commands. If they fail, VERDICT = CRITICAL regardless of executor claims.
83
+ - If executor said `TEST_PLAN_FOLLOWED: N/A` without a real justification, downgrade to at least CHANGES-REQUESTED.
84
+ - Never approve when test file has no real `expect`/`assert` - that is a fake test -> CRITICAL.
85
+ - Keep the verdict block <= 30 lines. Findings are bullet points, not essays.
86
+ - The same-model refusal above is non-negotiable: bypassing it defeats the entire Quality Gate.