npm - @ngockhoale/ukit - Versions diffs - 1.5.2 → 1.5.5 - Mend

@ngockhoale/ukit 1.5.2 → 1.5.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (33)

package/CHANGELOG.md +57 -5
package/README.md +2 -2
package/manifests/platform.full.yaml +2 -2
package/package.json +1 -1
package/src/cli/commands/doctor.js +14 -2
package/src/cli/commands/install.js +2 -2
package/src/cli/commands/uninstall.js +1 -1
package/src/index/taskRouting.js +117 -1
package/templates/.claude/agents/bug-debugger.md +48 -19
package/templates/.claude/agents/code-reviewer.md +86 -0
package/templates/.claude/agents/feature-implementer.md +59 -18
package/templates/.claude/hooks/skill-router.sh +1 -1
package/templates/.claude/hooks/verification-guard.sh +1 -1
package/templates/.claude/skills/next-step/SKILL.md +1 -1
package/templates/.claude/ukit/index/post-edit-verify.mjs +3 -2
package/templates/.claude/ukit/index/route-task.mjs +8 -4
package/templates/.claude/ukit/runtime/output-compression.mjs +37 -1
package/templates/AGENTS.md +7 -0
package/templates/CLAUDE.md +8 -1
package/templates/docs/AI_HANDOFF/ACTIVE.md +9 -0
package/templates/docs/AI_HANDOFF/HISTORY.md +4 -0
package/templates/docs/AI_HANDOFF/INDEX.md +13 -0
package/templates/docs/AI_HANDOFF/PLAN.md +75 -0
package/templates/docs/AI_HANDOFF/RULES.md +127 -0
package/templates/docs/AI_HANDOFF/archive/.gitkeep +0 -0
package/templates/docs/AI_HANDOFF/tasks/.gitkeep +0 -0
package/templates/docs/AI_HANDOFF/tasks/_TEMPLATE.md +72 -0
package/templates/docs/INSTALL.md +2 -2
package/templates/docs/PROJECT.md +1 -1
package/templates/docs/UKIT_USAGE_GUIDE.md +1 -1
package/templates/docs/WORKLOG.md +11 -0
package/templates/ukit/storage/config.json +93 -1
package/templates/docs/AI_HANDOFF.md +0 -118

package/templates/.claude/agents/feature-implementer.md CHANGED

@@ -8,48 +8,89 @@ tools: ["Read", "Edit", "Write", "Grep", "Glob", "Bash", "TodoWrite"]
 Implement requested behavior with minimal scope drift.
+**Two modes — auto-detect at start:**
+- **Daily/ad-hoc mode** (DEFAULT): task didn't come from `docs/AI_HANDOFF/` → use the original lightweight workflow. Tests only when touched code already has coverage. No reviewer trigger.
+- **Handoff mode**: task file is `docs/AI_HANDOFF/tasks/TASK-xxx.md` OR user explicitly invokes handoff (e.g. "execute task TASK-001") → activate full Quality Gate: test-first → green → reviewer.
+If unsure, ask the user. Don't apply Handoff mode rules to a quick one-off fix.
 ## Workflow
 ### 1. Understand (< 30 seconds)
-- Infer intent directly from user request (build/test/docs flow)
+- Infer intent directly from user request (build/test/docs flow).
 - Apply graduated doc budget:
   - trivial: no docs
   - simple: `docs/MEMORY.md` only
   - non-trivial: `docs/MEMORY.md` + `docs/PROJECT.md` + `docs/CODE_MAP.md`
-- Identify target files and existing patterns
-- If confidence is low or risk is high, ask one short clarifying question before deeper analysis
+- Identify target files and existing patterns.
+- If task came from handoff, read `tasks/TASK-xxx.md` and locate its **Test Plan** + **Verification Commands**.
+- If confidence is low or risk is high, ask one short clarifying question before deeper analysis.
 ### 2. Plan Approach (< 1 minute)
-- List files to create/modify (max diff)
-- Implement directly
+- List files to create/modify (max diff).
+- **Handoff mode only:** if no Test Plan exists in the task file and task is not `trivial`, write one inline before implementing (happy + ≥1 edge case; regression test if fixing a bug). In daily mode, skip this step.
+### 3. Test First (RED) — Handoff mode
+- Write the test(s) from §2 / from task Test Plan.
+- Run them: must FAIL for the expected reason. Capture output.
+- If test passes immediately → test is wrong or behavior already exists. Fix the test or stop and report.
+- **Daily mode**: skip this step unless touched code already has tests (then follow original rule).
-### 3. Implement
-- Smallest correct change set
-- Reuse existing code before creating new
-- No unrelated changes or speculative refactors
-- Follow project conventions (check `.claude/skills/` for patterns)
+### 4. Implement (GREEN)
-### 4. Verify
+- Smallest correct change set to make the test pass.
+- Reuse existing code before creating new.
+- No unrelated changes or speculative refactors.
+- Follow project conventions (check `.claude/skills/` for patterns).
-- Run existing tests if touched behavior has coverage: `yarn test` or project-specific
-- For SQL changes: verify with `EXPLAIN ANALYZE` on non-trivial queries
-- Check no lint errors introduced
+### 5. Verify (REQUIRED before DONE in Handoff mode; targeted in Daily mode)
-### 5. Report
+- **Handoff mode**: run the task's Verification Commands fresh in this turn. Capture full output. If ANY test fails → status is `PARTIAL` or `BLOCKED`, never `DONE`.
+- **Daily mode**: run existing tests if touched behavior has coverage; lint clean; targeted verification only.
+- For SQL changes: verify with `EXPLAIN ANALYZE` on non-trivial queries.
+- Check no lint errors introduced.
+### 6. Report
 ```
 STATUS: DONE | BLOCKED | PARTIAL
+EXECUTOR_TOOL: [claude-code | kilo-code | codex | opencode | other]
+EXECUTOR_MODEL: [exact model name you are running as — e.g. unic-code, claude-sonnet-4-5, gpt-5-mini. If you truly cannot tell, write "unknown" — reviewer treats unknown as suspicious and asks the human to confirm.]
+EXECUTOR_SUBAGENT: [name of the subagent you are, if your host has multiple — e.g. "Kilo:code", "Claude:feature-implementer". Otherwise "-".]
 SUMMARY: [1-2 sentences of what was implemented]
+TEST_PLAN_FOLLOWED: [task §4 / inline / N/A — reason]
 FILES_CHANGED:
   - [file path]: [what changed]
-VERIFIED: [test output / manual check result]
+TESTS_ADDED:
+  - [test file]: [test names]
+VERIFICATION:
+  command: [exact command run]
+  result: [N pass / M fail / exit code]
+  output_excerpt: |
+    [last 5-10 lines of test output]
 ISSUES: [any problems or edge cases, or "none"]
-NEXT: [follow-up needed, or "nothing — task complete"]
+HANDOFF_TO_REVIEWER: yes | no — reason
+NEXT: [follow-up needed, or "ready for review"]
 ```
+> **Self-report rule:** UKit cannot force any tool/host to use a specific model. Your self-reported `EXECUTOR_MODEL` is how the reviewer (in another tool or subagent) knows what to compare against its own model. Misreporting → reviewer refuses and asks the human to confirm.
+### 7. Trigger Reviewer — Handoff mode ONLY
+- Daily mode: skip this step entirely. Just report and stop.
+- Handoff mode + `STATUS: DONE` + `handoff.reviewer.enabled=true`:
+  - Set task status to `pending_review` in `docs/AI_HANDOFF/INDEX.md`.
+  - The next AI session (any tool, model from `handoff.reviewer.model`, MUST differ from executor) will pick `pending_review` task and run review.
+  - Do NOT dispatch reviewer in-process unless your host explicitly supports it AND can guarantee a different model — file-based handoff is the default.
 ## Rules
-- Add/update tests only when touched behavior already has coverage
+- **Iron law (Handoff mode):** no `DONE` without fresh PASS output in the current turn.
+- **Daily mode:** original rule — add tests only when touched behavior already has coverage.
+- If Handoff Test Plan says `N/A`, document why in the report and ensure manual verification ran.
+- Never silently skip reviewer phase in Handoff mode; if disabled, say so explicitly in NEXT.
+- Detection rule: if the task came from `docs/AI_HANDOFF/tasks/`, you are in Handoff mode. Otherwise Daily mode.
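
The detection rule and the Handoff-mode "iron law" above can be sketched as two small predicates. This is only an illustrative sketch: `detectMode`, `handoffAllowsDone`, and the `TASK-…` filename pattern are hypothetical names chosen here, and the agent template itself describes the rule in prose rather than shipping such code.

```javascript
// Hypothetical helpers (not part of UKit) that mirror the prose rules.
// Assumption: task files are named docs/AI_HANDOFF/tasks/TASK-<id>.md.
const HANDOFF_TASK_PATTERN = /^docs\/AI_HANDOFF\/tasks\/TASK-[^/]+\.md$/;

function detectMode({ taskFile = null, explicitHandoff = false } = {}) {
  // Handoff mode: the user explicitly invoked handoff, or the task file
  // lives under docs/AI_HANDOFF/tasks/.
  if (explicitHandoff) return 'handoff';
  if (typeof taskFile === 'string' && HANDOFF_TASK_PATTERN.test(taskFile)) {
    return 'handoff';
  }
  // Everything else stays on the lightweight daily workflow.
  return 'daily';
}

function handoffAllowsDone({ freshRun = false, failures = 0 } = {}) {
  // Iron law: DONE requires PASS output produced in the current turn.
  return freshRun && failures === 0;
}

console.log(detectMode({ taskFile: 'docs/AI_HANDOFF/tasks/TASK-001.md' }));
console.log(detectMode({ taskFile: 'src/cli/commands/doctor.js' }));
console.log(handoffAllowsDone({ freshRun: true, failures: 1 }));
```

When the inputs are ambiguous, the template asks the agent to check with the user rather than guess, so a real implementation would probably need a third "ask" outcome that this sketch leaves out.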

package/templates/.claude/hooks/skill-router.sh CHANGED

@@ -3,7 +3,7 @@
 # Used from UserPromptSubmit and PreToolUse so end users do not need to name skills.
 INPUT=$(cat)
-PROJECT_ROOT="${CLAUDE_PROJECT_DIR:-$(pwd)}"
+PROJECT_ROOT="${CLAUDE_PROJECT_DIR:-$PWD}"
 STATE_FILE="$PROJECT_ROOT/.claude/ukit/skill-router-state.json"
 HOOK_DIR="$(cd "$(dirname "$0")" && pwd)"
 THRESHOLD_SCRIPT="$HOOK_DIR/../ukit/runtime/compact-threshold.mjs"

package/templates/.claude/hooks/verification-guard.sh CHANGED

@@ -3,7 +3,7 @@
 # and does not jump to blanket verification when targeted evidence already exists.
 INPUT=$(cat)
-PROJECT_ROOT="${CLAUDE_PROJECT_DIR:-$(pwd)}"
+PROJECT_ROOT="${CLAUDE_PROJECT_DIR:-$PWD}"
 STATE_FILE="$PROJECT_ROOT/.claude/ukit/skill-router-state.json"
 ROUTE_CACHE_FILE="$PROJECT_ROOT/.claude/ukit/route-cache.json"
 PROGRESS_FILE="$PROJECT_ROOT/.claude/ukit/verification-progress.json"

package/templates/.claude/skills/next-step/SKILL.md CHANGED

@@ -44,7 +44,7 @@ If stale or missing, downgrade confidence and verify with the smallest current t
 ## Input Order
 Read only what is needed:
-1. `docs/AI_HANDOFF.md` first when the prompt explicitly names handoff / `ukit:handoff` / brainstorm-to-task flow
+1. `docs/AI_HANDOFF/ACTIVE.md` first when the prompt explicitly names handoff / `ukit:handoff` / brainstorm-to-task flow
 2. `docs/STATUS.md` (or existing root `STATUS.md` fallback) for normal status/continue prompts
 3. `docs/TASKS.md` only for queued-task prompts, “continue” with no active status, or when status points at queued work
 4. `docs/CODE_MAP.md` only when navigation is needed

package/templates/.claude/ukit/index/post-edit-verify.mjs CHANGED

@@ -18,7 +18,8 @@ function getFilePath(payload = {}) {
 }
 function isHandoffAuthoringFile(relativePath) {
-  return relativePath === 'docs/AI_HANDOFF.md';
+  return relativePath === 'docs/AI_HANDOFF.md'
+    || relativePath.startsWith('docs/AI_HANDOFF/');
 }
 async function listManifestPaths(backupsRoot) {
@@ -200,7 +201,7 @@ async function main() {
     process.stdout.write(`[ukit-safe-patch] ${result.message}\n`);
   } else if (result.status === 'advisory') {
     process.stderr.write(`[ukit-safe-patch] ${result.message}\n`);
-    process.stderr.write('[ukit-safe-patch] handoff authoring advisory — change is already written; continue updating docs/AI_HANDOFF.md.\n');
+    process.stderr.write('[ukit-safe-patch] handoff authoring advisory — change is already written; continue updating docs/AI_HANDOFF/.\n');
   }
   if (result.status === 'blocked') {
     const advisory = isSafePatchAdvisoryOnly(runtimeConfig);
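
To see which paths the widened predicate accepts, the function from the first hunk can be exercised on its own. The sketch below copies `isHandoffAuthoringFile` as shown in the diff; `toPosixPath` and the sample paths are hypothetical additions, based on the assumption that callers pass forward-slash relative paths.

```javascript
// Copied from the hunk above: the legacy single file plus anything under the new directory.
function isHandoffAuthoringFile(relativePath) {
  return relativePath === 'docs/AI_HANDOFF.md'
    || relativePath.startsWith('docs/AI_HANDOFF/');
}

// Hypothetical helper (not in the diff): normalize Windows separators,
// since startsWith('docs/AI_HANDOFF/') would not match backslash paths.
function toPosixPath(relativePath) {
  return String(relativePath ?? '').replace(/\\/g, '/');
}

const samples = [
  'docs/AI_HANDOFF.md',
  'docs/AI_HANDOFF/tasks/TASK-001.md',
  'docs\\AI_HANDOFF\\INDEX.md',
  'docs/AI_HANDOFF_NOTES.md',
];
for (const sample of samples) {
  console.log(sample, isHandoffAuthoringFile(toPosixPath(sample)));
}
```

The trailing slash in the prefix matters: a sibling file such as `docs/AI_HANDOFF_NOTES.md` is not treated as a handoff authoring file.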

package/templates/.claude/ukit/index/route-task.mjs CHANGED

@@ -481,6 +481,7 @@ function formatDisplayRouteSummary(routeSummary = null, routingContext = {}) {
   const segments = [
     taskSegment,
+    extractRouteLineSegment(line, 'handoff'),
     extractRouteLineSegment(line, 'targets'),
     extractRouteLineSegment(line, 'tests'),
     extractRouteLineSegment(line, 'styles'),
@@ -831,14 +832,14 @@ function deriveIntentMode({ promptText = '', commandText = '', targetFile = null
   const openEndedStatus = hasOpenEndedStatusSignal(lower, raw) || taskQueueNext;
   const concreteTask = hasConcreteTaskSignal(lower, raw, targetFile, { taskQueueNext });
-  if (docsSpecific) {
-    return 'docs-specific';
-  }
   if (hasHandoffSignal(lower, raw)) {
     return 'handoff';
   }
+  if (docsSpecific) {
+    return 'docs-specific';
+  }
   if (statusUpdate) {
     return 'status-update';
   }
@@ -1350,8 +1351,10 @@ function buildRouteSummary({
       ),
   );
   const nextActionCommand = compactHelperLane ? null : nextAction?.command ?? null;
+  const handoffFile = routingContext.intentMode === 'handoff' ? 'docs/AI_HANDOFF/ACTIVE.md' : null;
   const line = [
     routingContext.taskType ? `task=${routingContext.taskType}` : null,
+    handoffFile ? `handoff=${handoffFile}` : null,
     formatCompactSegment('targets', primaryTargets),
     formatCompactSegment('tests', relatedTests),
     formatCompactSegment('styles', styleFiles),
@@ -1374,6 +1377,7 @@ function buildRouteSummary({
     completionState,
     continuationState,
     intentMode: routingContext.intentMode ?? null,
+    handoffFile,
     delegateHint: delegationRecommendation?.hint ?? null,
     nextActionType: nextAction?.type ?? null,
     nextActionCommand,
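
The reordering in `deriveIntentMode` means a prompt that carries both a handoff signal and a docs-specific signal now routes to `handoff`. The sketch below illustrates that precedence along with the new `handoff=` route segment. It is a simplified stand-in: `deriveIntentModeSketch`, its boolean flags, and `handoffSegment` are hypothetical, whereas the real function derives its signals from prompt text and has further branches not visible in this diff.

```javascript
// Simplified stand-in for deriveIntentMode; the flags are hypothetical inputs.
function deriveIntentModeSketch({ handoffSignal = false, docsSpecific = false, statusUpdate = false } = {}) {
  if (handoffSignal) return 'handoff';      // checked first as of this diff
  if (docsSpecific) return 'docs-specific'; // used to be checked before handoff
  if (statusUpdate) return 'status-update';
  return null; // remaining branches of the real function are omitted
}

// Mirrors the handoffFile logic added to buildRouteSummary.
function handoffSegment(intentMode) {
  const handoffFile = intentMode === 'handoff' ? 'docs/AI_HANDOFF/ACTIVE.md' : null;
  return handoffFile ? `handoff=${handoffFile}` : null;
}

const mode = deriveIntentModeSketch({ handoffSignal: true, docsSpecific: true });
console.log(mode, handoffSegment(mode));
```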

package/templates/.claude/ukit/runtime/output-compression.mjs CHANGED

@@ -1082,7 +1082,11 @@ function normalizeOutputSummaryForDedupe(summary) {
   return String(summary ?? '')
     .split(/\r?\n/)
     .map((line) => String(line ?? '').trim())
-    .filter((line) => line && !/^- Full output:\s+/i.test(line))
+    .filter((line) => line
+      && !/^- Full output:\s+/i.test(line)
+      && !/^-\s*FAIL(?:ED)?\b/i.test(line)
+      && !/^-\s*PASS\b/i.test(line)
+      && !/^-\s*(?:Test Files|Tests|Duration|Start at)\b/i.test(line))
     .join('\n');
 }
@@ -1131,6 +1135,16 @@ async function appendOutputHistory(projectRoot, entry) {
   return nextDocument;
 }
+async function findOutputHistoryEntry(projectRoot, command, summary) {
+  const runtimePaths = buildRuntimePaths(projectRoot);
+  const current = normalizeOutputHistoryDocument(await readJson(runtimePaths.outputHistoryPath, { entries: [] }));
+  const lookupKey = [
+    String(command ?? '').trim(),
+    normalizeOutputSummaryForDedupe(summary),
+  ].join('\n').trim().toLowerCase();
+  return current.entries.find((candidate) => buildOutputHistoryDedupeKey(candidate) === lookupKey) ?? null;
+}
 function shouldCompress(config) {
   return Boolean(config?.tokenPipeline?.outputCompression);
 }
@@ -1244,6 +1258,28 @@ async function main() {
     exitCode,
     projectRoot,
   });
+  const historyMatch = await findOutputHistoryEntry(projectRoot, result.command, result.summary);
+  if (historyMatch?.summary) {
+    await appendOutputHistory(projectRoot, {
+      timestamp: Date.now(),
+      command: historyMatch.command,
+      profile: historyMatch.profile,
+      summary: historyMatch.summary,
+      tokensBefore: historyMatch.tokensBefore,
+      tokensAfter: historyMatch.tokensAfter,
+      savedTokens: historyMatch.savedTokens,
+      exitCode,
+      rawSaved: historyMatch.rawSaved,
+      rawPath: historyMatch.rawPath,
+      rawBytes: historyMatch.rawBytes,
+      recoveryReason: historyMatch.recoveryReason,
+      truncated: historyMatch.truncated,
+    });
+    process.stdout.write(String(historyMatch.summary));
+    return;
+  }
   const recoveryReason = buildRawOutputRecoveryReason({
     exitCode,
     tokensBefore: result.tokensBefore,
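
The effect of the extra filters is easier to see with two test-run summaries that differ only in their pass/fail and timing lines. The sketch below copies `normalizeOutputSummaryForDedupe` from the first hunk. `lookupKeySketch` is a hypothetical helper modeled on the key built in `findOutputHistoryEntry`, and the sample summaries are invented for illustration. Whether `buildOutputHistoryDedupeKey` produces the same shape is not visible in this diff.

```javascript
// Copied from the hunk above.
function normalizeOutputSummaryForDedupe(summary) {
  return String(summary ?? '')
    .split(/\r?\n/)
    .map((line) => String(line ?? '').trim())
    .filter((line) => line
      && !/^- Full output:\s+/i.test(line)
      && !/^-\s*FAIL(?:ED)?\b/i.test(line)
      && !/^-\s*PASS\b/i.test(line)
      && !/^-\s*(?:Test Files|Tests|Duration|Start at)\b/i.test(line))
    .join('\n');
}

// Hypothetical: same shape as the lookupKey in findOutputHistoryEntry.
function lookupKeySketch(command, summary) {
  return [String(command ?? '').trim(), normalizeOutputSummaryForDedupe(summary)]
    .join('\n').trim().toLowerCase();
}

// Invented sample summaries: only volatile lines differ between the runs.
const firstRun = '- Tests 3 passed\n- Duration 1.2s\n- suite: auth';
const secondRun = '- Tests 2 passed\n- FAIL auth login\n- Duration 0.9s\n- suite: auth';

console.log(lookupKeySketch('yarn test', firstRun) === lookupKeySketch('yarn test', secondRun)); // true
```

Under this normalization a passing run and a failing run can share a key, in which case `main()` replays the stored summary and records only the current `exitCode`. That looks like a deliberate trade-off in favor of token savings, but the diff does not state the intent.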

package/templates/AGENTS.md CHANGED

@@ -83,6 +83,13 @@ For clearly non-code specialist lanes (docs-only, status, task queue), skip the
 - Threshold-based compact pressure is internal orchestration; do not expose it to users.
 - For Codex Desktop long sessions, UKit can use soft auto-compact handoffs. Default `compact.codexContext.compactTarget=150` means about 150 compact handoff lines (120-150 preferred, hard max 170), not 150 tokens.
+## Handoff Quality Gate — OPT-IN
+Activates ONLY when work goes through `docs/AI_HANDOFF/` (user says "execute task TASK-xxx" or target is `docs/AI_HANDOFF/tasks/*.md`). Daily prompts → unchanged lightweight flow, no test-first/reviewer overhead.
+In Handoff mode: read `docs/AI_HANDOFF/RULES.md` for the 4-phase spec (Idea+Plan → Create Tasks → Implement+Test → Review+Test), state machine, comment thread, and self-reported model contract. Config: `.ukit/storage/config.json` → `handoff.*`.
 ## Context + Verification Budget
 - **Trivial**: no docs, no index query unless the file target is unclear.

package/templates/CLAUDE.md CHANGED

@@ -89,6 +89,13 @@ For clearly non-code specialist lanes (docs-only, status, task queue), skip the
 - Preserve UTF-8 BOM/no-BOM and LF/CRLF for existing multilingual/user-authored files.
 - Use `node .claude/ukit/index/safe-patch.mjs` internally when normal Edit/Write may normalize bytes or when anchor-based matching is needed.
+## Handoff Quality Gate — OPT-IN
+Activates ONLY when a task goes through `docs/AI_HANDOFF/` (user says "execute task TASK-xxx" or the target is `docs/AI_HANDOFF/tasks/*.md`). Daily prompts → untouched; the existing flow stays as it is.
+In Handoff mode: read `docs/AI_HANDOFF/RULES.md` for the 4 phases (Idea+Plan → Create Tasks → Implement+Test → Review+Test), the state machine, the comment thread, and the self-reported model contract. Config: `.ukit/storage/config.json` → `handoff.*`.
 ## Context + Verification Budget
 - **Trivial**: no docs.
@@ -96,7 +103,7 @@ For clearly non-code specialist lanes (docs-only, status, task queue), skip the
 - **Non-trivial**: `docs/MEMORY.md` + `docs/PROJECT.md` + `docs/CODE_MAP.md`.
 - `docs/STATUS.md`: use for open-ended status/continue prompts or meaningful continuation context; stale status is orientation only.
 - `docs/TASKS.md`: use only for queued-task prompts or when status points at queued work; safely clean exact duplicates/completed overflow by default without deleting unfinished human-authored tasks.
-- `docs/WORKLOG.md`: only recent relevant entries.
+- `docs/WORKLOG.md`: only recent relevant entries. Follow the Budget Rules at the top of the file; archive oldest entries to `docs/WORKLOG_ARCHIVE.md` when over limits.
 - Follow routed verification policy: targeted first, widen only when risk/shared scope justifies it, ask before blanket broad runs.
 ## Living Status Workflow

package/templates/docs/AI_HANDOFF/ACTIVE.md ADDED

@@ -0,0 +1,9 @@
+# Active Handoff Cycle
+- Status: `ready_for_next_cycle`
+- Phase: `cleared`
+- Updated:
+- Updated by:
+- Human decision needed: `no`
+<!-- Snapshot only. Rules in RULES.md. Tasks in tasks/. Plan brainstorm in PLAN.md. -->

package/templates/docs/AI_HANDOFF/HISTORY.md ADDED

@@ -0,0 +1,4 @@
+# Handoff History
+| Cycle | Date | Summary |
+|-------|------|---------|

package/templates/docs/AI_HANDOFF/INDEX.md ADDED

@@ -0,0 +1,13 @@
+# Handoff Task Index
+<!--
+Status values (see RULES.md §Status state machine):
+ready | in_progress | pending_review | changes_requested | critical_block | approved | approved_minor | blocked | done
+Owner = the tool currently holding the task: claude-code | kilo-code | codex | opencode | -
+-->
+| ID | Title | Priority | Size | Status | Owner | Reviewer | File |
+|----|-------|----------|------|--------|-------|----------|------|
+Updated:

package/templates/docs/AI_HANDOFF/PLAN.md ADDED

@@ -0,0 +1,75 @@
+# Handoff Plan
+Status: `empty`
+<!--
+This file is used ONLY for the Handoff Quality Gate flow.
+Daily prompts / quick fixes → no need to touch this file; UKit still runs the existing flow as usual.
+Activates only when the user explicitly pushes work through handoff (e.g. "đưa vào handoff", "gom ý tưởng X").
+QUALITY GATE: every task that is split out must come with a Test Plan (see the section below).
+No Test Plan → the task may not move to status `ready`.
+-->
+## 1. Intent / Goal
+<!-- 1-2 sentences describing what the user wants to achieve. Do not paste the original prompt verbatim. -->
+## 2. Scope
+- In scope:
+- Out of scope:
+- Risk surface (shared files/modules at risk):
+## 3. Approach
+<!-- Brief approach. Reuse existing code before creating new code. -->
+## 4. Test Plan (REQUIRED — TDD-style)
+List the tests to be written BEFORE the code. Each test must have:
+| # | Type | Test name | File | Expect | Pre-state |
+|---|------|----------|------|--------|-----------|
+| 1 | unit / integration / regression / e2e | `<test name describing the behavior>` | `<path/to/file.test.js>` | `<specific expected output>` | `<input/fixture>` |
+Required at minimum:
+- **Happy path**: the main behavior works correctly.
+- **Edge case**: at least 1 (null/empty/boundary/concurrent…).
+- **Regression** (if fixing a bug): a test that fails before the fix and passes after it.
+If the task cannot be tested (config-only, doc-only, throw-away prototype): write `Test plan: N/A - reason: <…>` and attach a manual verification approach.
+## 5. Verification Commands
+The exact commands the executor will run:
+```bash
+# example:
+# yarn test path/to/file.test.js
+# yarn test --run
+# node scripts/smoke.mjs
+```
+## 6. Acceptance Criteria
+- [ ] All tests in the Test Plan PASS (with output included in the report).
+- [ ] No regressions in related suites.
+- [ ] Reviewer (a separate model) reports `APPROVED` or `APPROVED-WITH-MINOR`.
+- [ ] Docs/CHANGELOG updated if user-facing.
+## 7. Task Split (Phase 2 — TDD-embedded, MANDATORY)
+Once the human approves the plan, the AI creates each `tasks/TASK-xxx.md` following the structure in `tasks/_TEMPLATE.md`.
+**Every TASK file MUST have:**
+- `## Test Cases`: a test table (type, name, expected, fixture) for the task's slice. At minimum happy + 1 edge + regression if fixing a bug.
+- `## Test Files`: the specific paths of the test files to create/modify.
+- `## Verification Commands`: commands that both executor and reviewer run fresh.
+- `## Acceptance Criteria`.
+A task without concrete Test Cases + Test Files → mark it `needs_breakdown`; do not give it status `ready`.
+Update `INDEX.md`: add a row for each new task with `Status: ready`, `Owner: -`.

package/templates/docs/AI_HANDOFF/RULES.md ADDED

@@ -0,0 +1,127 @@
+# Handoff Rules
+## Token Budget (MANDATORY)
+- **Combined handoff reads must stay under 200 lines per request.**
+- Read order: `ACTIVE.md` (if needed) → `INDEX.md` (scan tasks) → single `tasks/TASK-xxx.md` (implement one task).
+- Do NOT read `RULES.md` every request — only when you need flow clarification.
+- Do NOT read multiple task files in one request.
+- If ACTIVE.md + INDEX.md + task file would exceed budget, read only the task file.
+- Auto-compact: if any **state file** (`ACTIVE.md`, `INDEX.md`, or any single `tasks/TASK-xxx.md`) exceeds 80 lines, trigger `clear handoff` / split task. `PLAN.md` and `RULES.md` are reference/spec — exempt.
+## How Human Submits Ideas
+- Natural language is enough: `ukit:handoff`, `gom ý tưởng`, `chia task`, `đưa vào handoff`.
+- If request is already a concrete task (clear file/logic/output, small enough to do now), bypass handoff and execute directly.
+- If request is broad/ambiguous/multi-step, use handoff.
+## Hard rule — All work stays in `docs/AI_HANDOFF/`
+All communication between AIs during handoff goes ONLY through files under `docs/AI_HANDOFF/`:
+- `PLAN.md`: brainstorm + overall Test Plan (Phase 1).
+- `INDEX.md`: task table + status (read/written in every phase).
+- `tasks/TASK-xxx.md`: where each task lives: Goal + Test Cases + Verification + Executor Report + Reviewer Verdict + **Discussion thread**.
+- `ACTIVE.md`: snapshot of the current cycle.
+- `archive/`: old cycles.
+**Forbidden**: AIs sending questions/comments through another chat tool, through commit messages, or through files outside this directory. Reason: cross-tool/cross-subagent work can only be synchronized through files. An AI that does not read this folder is not participating in the handoff.
+### Discussion thread (AI-to-AI comments)
+When it needs to ask back / push back / make a suggestion to another phase, the AI writes into the task file's `## Discussion` section (template in `tasks/_TEMPLATE.md`). Format:
+```
+### <YYYY-MM-DD> · <role: planner|executor|reviewer> · <tool/model>
+<content; address @planner / @executor / @reviewer if there is a recipient>
+```
+The next phase MUST read Discussion before continuing; treat it as an inbox.
+## Handoff Flow (tool-agnostic, file-based state machine)
+UKit handoff works through **file state**. You choose which tool to use for each phase (Claude Code / Kilo Code / Codex / OpenCode / any future tool); all are fine. UKit only cares about the **model's role**, not the tool.
+3 phases × 3 model roles:
+- **Plan**: the strongest model you have (reasoning model). Can run in any tool that supports planning well.
+- **Execute**: a cheap-but-still-smart model (code model). Can be Kilo's code subagent, OpenCode's build agent, or Claude Code's feature-implementer.
+- **Review**: a **DIFFERENT MODEL from the executor** (a reasoning model is usually better). Can be a different tool, or the same tool with a subagent on a different model (e.g. Kilo has separate code and review subagents).
+Both deployment models are valid:
+- **Cross-tool**: e.g. Claude (plan) → Kilo (execute) → Claude (review). Bridged through files.
+- **Same-tool different-subagent**: e.g. Kilo:plan → Kilo:code → Kilo:review, as long as the 3 subagents use different MODELS for their respective roles.
+Each tool/subagent reads the same `INDEX.md` + `tasks/TASK-xxx.md` → picks a task by `status` → updates the status when done.
+> **Important (UKit does not enforce the model):** `handoff.executor.cheapSmartModelHint` and `handoff.reviewer.model` in `.ukit/storage/config.json` are only **labels** recording what you WANT to use. Which tool uses which model is your choice in that tool's settings. UKit enforces the contract by requiring the executor to SELF-REPORT `EXECUTOR_MODEL` in the Executor Report; the reviewer compares it with its own model and refuses if they match. So if in Kilo you set both the code subagent and the review subagent to the same model → the reviewer will refuse on its own rather than silently pass.
+### Status state machine
+```
+brainstorm ──[plan approved]──▶ ready ──[executor pick]──▶ in_progress
+                                                              ├─[PASS]──▶ pending_review
+                                                              └─[FAIL]──▶ blocked
+pending_review ──[reviewer]──▶ approved | approved_minor ──▶ done
+                            ├▶ changes_requested ──[fix]──▶ in_progress
+                            └▶ critical_block      ──[fix]──▶ in_progress
+```
+### 4 Phases
+**Phase 1 — Idea + Plan** (smart/reasoning model)
+- Human submits ideas (natural language).
+- AI writes into `PLAN.md`: §1 Intent, §2 Scope, §3 Approach, **§4 Test Plan (mandatory, TDD-style)**, §5 Verification Commands, §6 Acceptance Criteria.
+- Output: a complete PLAN.md, awaiting human approval.
+**Phase 2 — Create Tasks (TDD-embedded, MANDATORY)** (smart/reasoning model, usually the same as Phase 1)
+- Human approves the plan → AI splits `PLAN.md §7` into multiple `tasks/TASK-xxx.md` files.
+- **Every TASK file MUST have its own Test Plan**, not just a pointer back to PLAN.md. Specifically:
+  - `§ Test Cases`: a test table (type, test name, expected) for this task's part: happy + ≥1 edge case + regression (if fixing a bug).
+  - `§ Test Files`: the specific paths of the test files to create/modify (e.g. `tests/auth/login.test.js`).
+  - `§ Verification Commands`: the commands the executor will run to confirm PASS.
+  - `§ Acceptance Criteria`: checklist.
+- If, during the split, a task cannot be given concrete Test Cases + Test Files → that task is not yet `ready`; mark it `needs_breakdown`.
+- Update `INDEX.md`: add a row for each task with status `ready`.
+- This is the **human-approval cut point**: once this phase is done, the executor is allowed to pick tasks.
+- Goal: the executor (cheap-smart model) reads the task file and IMMEDIATELY knows which tests to write first, without having to infer them.
+**Phase 3 — Implement + Test** (cheap-smart/code model)
+- User: "execute next task" / "làm TASK-001" / "implement task 1".
+- Executor reads `INDEX.md` → picks a `ready` task → sets it to `in_progress` → **writes tests first → RED → implement → GREEN** → runs the Verification Commands fresh in the current turn → appends `## Executor Report` (including `EXECUTOR_TOOL`/`EXECUTOR_MODEL`/`EXECUTOR_SUBAGENT` + verification output) to the end of the task file → sets status to `pending_review`.
+- Must NOT claim DONE without a fresh PASS.
+**Phase 4 — Review + Test** (reviewer model, DIFFERENT from the executor's model)
+- User: "review pending tasks" / "review TASK-001".
+- Reviewer reads INDEX → picks a `pending_review` task → reads the task file + diff → **compares its own model with `EXECUTOR_MODEL`, refuses if they match or it is unknown** → **re-runs the Verification Commands fresh** (does not trust the executor) → applies the `code-review` skill → appends `## Reviewer Verdict` to the task file (verdict + findings + reviewer model used) → sets status:
+  - `approved` / `approved_minor` → allows `done`.
+  - `changes_requested` → executor must fix Important findings → repeat Phases 3-4.
+  - `critical_block` → executor MUST fix → repeat Phases 3-4.
+If `handoff.reviewer.enabled=false`, Phase 4 is skipped but the reason must be logged in the task; skipping Phase 4 means dropping the last safety net.
+## Task Gate
+A task is `ready` only when it has:
+- Clear target files
+- Clear action
+- Dependencies stated
+- **Test Plan** (PLAN.md §4): happy path + ≥1 edge case (+ regression test if fixing a bug); or `N/A` with a reason
+- Verification command (the command the executor will run)
+- Acceptance criteria
+Missing any → `needs_breakdown`, `blocked`, or `needs_human`.
+## Clear Handoff
+1. Archive current cycle → `archive/cycle-NNN.md`.
+2. If archive > 3 files → delete oldest, append 1-line summary to `HISTORY.md`.
+3. Reset `ACTIVE.md` to empty template.
+4. Clear `INDEX.md`.
+5. Delete all files in `tasks/`.
+6. Clear `PLAN.md`.
+## Docs Sync
+After cycle, update affected docs only: `WORKLOG.md`, `PROJECT.md`, `CODE_MAP.md`, `CHANGELOG.md`.
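
The status state machine and the reviewer's model check described in RULES.md can be sketched as a transition table plus one predicate. This is an illustrative sketch only: `TRANSITIONS`, `canTransition`, and `reviewerMayProceed` are hypothetical names, UKit keeps this state in Markdown files rather than code, and the empty transition lists for `blocked` and `done` are an assumption because the diagram shows no outgoing edges for them.

```javascript
// Transition table transcribed from the "Status state machine" diagram.
// Assumption: `blocked` and `done` have no outgoing transitions.
const TRANSITIONS = {
  brainstorm: ['ready'],
  ready: ['in_progress'],
  in_progress: ['pending_review', 'blocked'],
  pending_review: ['approved', 'approved_minor', 'changes_requested', 'critical_block'],
  approved: ['done'],
  approved_minor: ['done'],
  changes_requested: ['in_progress'],
  critical_block: ['in_progress'],
  blocked: [],
  done: [],
};

function canTransition(from, to) {
  return (TRANSITIONS[from] ?? []).includes(to);
}

// The reviewer refuses when the executor's self-reported model is missing,
// "unknown", or the same as the reviewer's own model.
function reviewerMayProceed(executorModel, reviewerModel) {
  const executor = String(executorModel ?? '').trim().toLowerCase();
  const reviewer = String(reviewerModel ?? '').trim().toLowerCase();
  if (!executor || executor === 'unknown') return false;
  return executor !== reviewer;
}

console.log(canTransition('in_progress', 'pending_review'));
console.log(canTransition('ready', 'done'));
console.log(reviewerMayProceed('gpt-5-mini', 'claude-sonnet-4-5'));
```

Because the check relies on self-reported names, it only compares strings. As the rules note, nothing here can verify which model a tool is actually running.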

package/templates/docs/AI_HANDOFF/archive/.gitkeep ADDED

File without changes

package/templates/docs/AI_HANDOFF/tasks/.gitkeep ADDED

File without changes

package/templates/docs/AI_HANDOFF/tasks/_TEMPLATE.md ADDED

@@ -0,0 +1,72 @@
+# TASK-XXX — <short title>
+<!--
+Template for each task. The planner copies this file when splitting PLAN.md into separate tasks.
+This file MUST keep the structure: Goal + Test Cases + Test Files + Verification + Acceptance.
+Every AI (planner / executor / reviewer) reads and writes THIS file. No exchanges outside the file.
+-->
+- Status: `ready`  <!-- ready | in_progress | pending_review | changes_requested | critical_block | approved | approved_minor | blocked | done -->
+- Owner: `-`       <!-- tool currently holding the task -->
+- Reviewer: `-`    <!-- model name the reviewer uses, set in Phase 4 -->
+- Parent plan: `docs/AI_HANDOFF/PLAN.md` §<section>
+## Goal
+<!-- 1-2 sentences describing what this slice does. -->
+## Target Files
+- `<path/to/source.js>` — <what changes>
+## Test Cases (REQUIRED — TDD)
+| # | Type | Test name | Expected | Pre-state / Fixture |
+|---|------|----------|----------|---------------------|
+| 1 | unit | `<describe behavior>` | `<concrete expected>` | `<input>` |
+| 2 | edge | `<null/empty/boundary>` | `<expected>` | `<input>` |
+| 3 | regression (if bug fix) | `<reproduces bug>` | RED before fix, GREEN after | `<repro input>` |
+## Test Files
+- `<tests/path/to/file.test.js>`: contains the tests above.
+## Verification Commands
+```bash
+yarn test tests/path/to/file.test.js
+```
+## Acceptance Criteria
+- [ ] All tests in §Test Cases PASS.
+- [ ] No regressions in related suites.
+- [ ] Reviewer verdict is APPROVED or APPROVED-WITH-MINOR.
+- [ ] Docs/CHANGELOG updated if user-facing.
+## Dependencies
+- (none) <!-- or the TASK-xxx that must be done first -->
+---
+## Discussion
+<!--
+AIs talk to each other HERE, not through other tools.
+Format of each comment:
+### <date> · <role: planner|executor|reviewer> · <tool/model>
+<content: questions, notes, suggestions, push-back to the previous phase>
+Replies go one level down (####). State "→ @planner" / "→ @executor" / "→ @reviewer" explicitly if there is a specific recipient.
+-->
+(no comments yet)
+---
+<!--
+The Phase 3 executor appends `## Executor Report` BELOW this separator.
+The Phase 4 reviewer appends `## Reviewer Verdict` BELOW the Executor Report.
+-->

package/templates/docs/INSTALL.md CHANGED

@@ -47,7 +47,7 @@ End users do not need to manage any of that manually.
 Complete these files before first serious use:
 - `docs/PROJECT.md`
 - `docs/MEMORY.md`
-- `docs/AI_HANDOFF.md`
+- `docs/AI_HANDOFF/`
 - `docs/WORKLOG.md`
 ### 4) Open your AI tool
@@ -97,7 +97,7 @@ ukit install
 Check that the docs baseline files exist and are filled in:
 - `docs/PROJECT.md`
 - `docs/MEMORY.md`
-- `docs/AI_HANDOFF.md`
+- `docs/AI_HANDOFF/`
 - `docs/WORKLOG.md`
 ---

package/templates/docs/PROJECT.md CHANGED

@@ -44,7 +44,7 @@
 1. Run `ukit memory recall "<current task>"` for non-trivial work; reuse relevant `## Previous Context` before asking the user to restate prior decisions
 2. Read `docs/MEMORY.md` — architecture decisions, active constraints, known bugs
-3. Read `docs/AI_HANDOFF.md` when continuing cross-AI planning, task breakdown, or task implementation handoff work
+3. Read `docs/AI_HANDOFF/ACTIVE.md` when continuing cross-AI planning, task breakdown, or task implementation handoff work
 4. Read `docs/CODE_MAP.md` if it exists — structural navigation index
 5. Use the installed source-code index / routed helpers to localize the smallest relevant file + test set first
 6. Scan recent `docs/WORKLOG.md` entries if continuing prior work

package/templates/docs/UKIT_USAGE_GUIDE.md CHANGED

@@ -138,7 +138,7 @@ Project đang ở đâu, làm gì tiếp?
 Expected UKit behavior:
 1. auto-load the hidden next-step lane
-2. read `docs/AI_HANDOFF.md` when the team is passing planning, task breakdown, or implementation context between AIs
+2. read `docs/AI_HANDOFF/ACTIVE.md` when the team is passing planning, task breakdown, or implementation context between AIs
 3. verify the handoff against source/index before treating it as authoritative
 4. suggest only a few actionable next candidates
 5. if the prompt names a concrete bug/feature/review target, keep the concrete workflow primary instead of producing a global roadmap