npm - tink-harness - Versions diffs - 1.7.1 → 1.8.0 - Mend

tink-harness 1.7.1 → 1.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +15 -0
package/README.ko.md +13 -1
package/README.md +13 -1
package/VERSIONING.md +1 -1
package/commands/cast.md +65 -19
package/docs/pr/2026-06-09-v1.8.0.ko.md +44 -0
package/docs/work-state.ko.md +6 -0
package/docs/work-state.md +6 -0
package/package.json +1 -1
package/skills/tink/SKILL.md +20 -19
package/templates/claude/commands/tink/cast.md +65 -19
package/templates/claude/skills/tink/SKILL.md +20 -19
package/templates/codex/skills/tink-core/RULES.md +27 -14
package/templates/tink/harnesses/HARNESS.md +4 -0
package/templates/tink/harnesses/delegation-brief.md +30 -0
package/templates/tink/harnesses/goal-checkpoint.md +30 -0
package/templates/tink/harnesses/index.json +60 -0
package/templates/tink/harnesses/plan-consensus.md +30 -0
package/templates/tink/harnesses/requirements-interview.md +30 -0
package/templates/tink/rules/index.json +56 -0

package/.claude-plugin/plugin.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "tink",
   "description": "A small harness layer for Claude Code and Codex.",
-  "version": "1.7.1",
+  "version": "1.8.0",
   "author": {
     "name": "dotori"
   }

package/CHANGELOG.md CHANGED

@@ -7,6 +7,21 @@ All notable changes to Tink are tracked here.
 No unreleased changes yet.
+## [1.8.0] - 2026-06-09
+### Added
+- Added four visible-thinking harnesses selected through `/tink:cast` and `$tink:cast`: `requirements-interview`, `plan-consensus`, `goal-checkpoint`, and `delegation-brief`.
+- Added optional current-run artifact guidance for `.tink/current/goals.json` and `.tink/current/delegation.md`.
+- Added rule graph routing for ambiguous requirements, broad plans, long runs, and safe delegation briefs without adding new public commands.
+- Added Korean PR history draft for the v1.8.0 release in `docs/pr/2026-06-09-v1.8.0.ko.md`.
+### Changed
+- Updated Claude Code and Codex cast guidance so GJC-style interview, consensus planning, goal checkpoint, and delegation concepts stay inside Tink's small harness model.
+- Updated README and work-state docs to explain the new harnesses and optional run-state files.
 ## [1.7.1] - 2026-06-09
 ### Fixed

package/README.ko.md CHANGED

@@ -8,7 +8,7 @@ A small harness layer for Claude Code and Codex.
 Tink picks the harness that fits the current task, makes run state visible, and improves the harness set from failures and feedback that come up in real use.
-**Latest package:** v1.7.1 - Fixes a bug where Claude Code commands were deleted when the "Both" surface and "Clean Codex picker" were selected together. Latest minor release notes: [v1.7.0](https://github.com/dotoricode/tink-harness/releases/tag/v1.7.0).
+**Latest package:** v1.8.0 - Adds visible-thinking harnesses for requirements interviews, consensus planning, goal checkpoints, and delegation briefs. Latest minor release notes: [v1.8.0](https://github.com/dotoricode/tink-harness/releases/tag/v1.8.0).
 [English](README.md) · **한국어**
@@ -59,6 +59,14 @@ npx tink-harness@latest update
 After updating, if the Codex skill, schema, or Windows warnings look off, see `docs/update-troubleshooting.ko.md` or `docs/update-troubleshooting.md`.
+## What's new in 1.8.0
+This minor release brings the GJC-style approach of exposing thinking steps into Tink's small harness model.
+- `/tink:cast` and `$tink:cast` can select `requirements-interview`, `plan-consensus`, `goal-checkpoint`, and `delegation-brief`.
+- Long runs can be kept as visible state in `.tink/current/goals.json`, and handoff or parallel-review plans in `.tink/current/delegation.md`.
+- These harnesses do not automatically start workers, tmux panes, or worktrees. Delegation stays a brief until a separately approved workflow runs it.
 ## What's new in 1.7.1
 This patch fixes an issue where Claude Code commands and skills were deleted when the "Both" surface and "Clean Codex picker" were selected together.
@@ -124,6 +132,8 @@ Use `/tink:*` in Claude Code and `$tink:*` in Codex. The old `$
 Tink now also creates `.tink/current/contract.json` for non-trivial tasks. This file holds the task type, risks, success criteria, forbidden actions, and verification commands.
+For bigger or fuzzier work, `cast` can pick harnesses that expose more of the agent's thinking steps as files. Ambiguous ideas use `requirements-interview`, big plans use `plan-consensus`, long runs use `goal-checkpoint`, and safe handoffs use `delegation-brief`. All of them are reusable harnesses chosen by `/tink:cast` or `$tink:cast`, not separate CLI commands.
 ### `/tink:verify` / `$tink:verify`
 Runs the verification listed in `contract.json` for real and records the evidence.
@@ -156,6 +166,8 @@ Tink writes files you can inspect directly.
 - `.tink/maintenance/`: records of verification, friction, and weave signals
 - `.tink/memory/`: approved mistakes, preferences, and lessons
+Depending on the selected harness, `.tink/current/goals.json` may be added for the current run's goal checkpoints and `.tink/current/delegation.md` for handoff packets. Tink prepares these briefs as visible state, but it does not start workers, tmux panes, or worktrees unless a separately approved workflow does so.
 The rule graph is kept small. Tink picks mandatory rules first, pulls in only the optional rules that match task facts or keywords, and records the rule ids already read per phase so the same guidance is not repeated.
 Design notes live in `docs/`. The default compatibility baseline is in `docs/compatibility-policy.md`, and new work should consider Claude Code and Codex as well as macOS and Windows. Repo Signal behavior is described in `docs/repo-signals.ko.md` or `docs/repo-signals.md`, and the lightweight graph-rule adoption plan is in `docs/graph-rule-adoption-plan.ko.md`. External context safety criteria are in `docs/mcp-safe-profile.md`. To read or review `.tink/current/` state, start with `docs/work-state.ko.md` or `docs/work-state.md`. The next update stabilization plan is in `docs/phase-5-update-confidence.ko.md` and `docs/phase-5-update-confidence.md`. The broader idea implementation audit and roadmap is in `docs/tink-idea-implementation-plan.ko.md`.

package/README.md CHANGED

@@ -24,7 +24,7 @@
   <a href="https://github.com/dotoricode/tink-harness/stargazers"><img src="https://img.shields.io/github/stars/dotoricode/tink-harness?style=social" alt="GitHub stars"></a>
 </p>
-<p><strong>Latest package:</strong> v1.7.1 - Fixes accidental deletion of Claude Code commands when "Both" surface and "Clean Codex picker" were selected together. Latest minor release notes: <a href="https://github.com/dotoricode/tink-harness/releases/tag/v1.7.0">v1.7.0</a>.</p>
+<p><strong>Latest package:</strong> v1.8.0 - Adds visible-thinking harnesses for requirements interviews, consensus planning, goal checkpoints, and delegation briefs. Latest minor release notes: <a href="https://github.com/dotoricode/tink-harness/releases/tag/v1.8.0">v1.8.0</a>.</p>
 **English** · [한국어](README.ko.md)
@@ -124,6 +124,14 @@ To quickly verify the updated install, see `docs/update-verification-recipe.md`
 If an update looks stale or incomplete, see `docs/update-troubleshooting.md` or `docs/update-troubleshooting.ko.md`.
+## What's new in 1.8.0
+This minor release brings GJC-style visible thinking into Tink without adding new commands.
+- `/tink:cast` and `$tink:cast` can now select `requirements-interview`, `plan-consensus`, `goal-checkpoint`, and `delegation-brief`.
+- Long runs can record `.tink/current/goals.json`; handoff or parallel-review plans can record `.tink/current/delegation.md`.
+- Tink still does not start workers, tmux panes, or worktrees from these harnesses. Delegation remains a visible brief unless another approved workflow runs it.
 ## What's new in 1.7.1
 This patch fixes a destructive interaction between the "Both" surface selection and "Clean Codex picker."
@@ -195,6 +203,8 @@ In Tink, `cast` is the main path. It reads the task, chooses or drafts the right
 Use it when the task is more than a quick answer.
+For bigger or fuzzier work, `cast` can expose more of the agent's thinking as files without adding new commands. Ambiguous ideas can start with `requirements-interview`, broad plans with `plan-consensus`, long runs with `goal-checkpoint`, and safe handoffs with `delegation-brief`. These are reusable harnesses selected by `/tink:cast` or `$tink:cast`, not separate CLI workflows.
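
The mapping in the paragraph above can be pictured as a small routing function. This is only a sketch under stated assumptions: in Tink the choice is made by the agent following the `cast` instructions, not by code like this, and the `TaskFacts` type, its field names, and `suggestHarnesses` are hypothetical names introduced for illustration. The rule that ambiguous work starts with `requirements-interview` alone follows the `cast` guidance in this release.

```typescript
// Hypothetical task facts; these field names are assumptions, not Tink's schema.
type TaskFacts = {
  ambiguous: boolean; // vague idea or underspecified prompt
  broadPlan: boolean; // plan, architecture decision, large refactor, migration
  longRun: boolean; // work splits into several durable milestones
  needsHandoff: boolean; // parallel review, verification, or handoff would reduce risk
};

type VisibleThinkingHarness =
  | "requirements-interview"
  | "plan-consensus"
  | "goal-checkpoint"
  | "delegation-brief";

function suggestHarnesses(facts: TaskFacts): VisibleThinkingHarness[] {
  // Ambiguous work starts with the interview alone; others may follow after the user clarifies.
  if (facts.ambiguous) return ["requirements-interview"];

  const picked: VisibleThinkingHarness[] = [];
  if (facts.broadPlan) picked.push("plan-consensus");
  if (facts.longRun) picked.push("goal-checkpoint");
  if (facts.needsHandoff) picked.push("delegation-brief");
  return picked;
}
```

An empty result here simply means none of the four visible-thinking harnesses applies and the usual harness selection continues.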
 ### `/tink:verify` / `$tink:verify`
 `verify` runs the checks promised in `.tink/current/contract.json`.
@@ -237,6 +247,8 @@ Tink uses files you can inspect:
 - `.tink/maintenance/`: verification, friction, and weave signals that help repeated failures become approved improvements
 - `.tink/memory/`: approved mistakes, preferences, and lessons
+When selected, current-run artifacts may also include `.tink/current/goals.json` for goal checkpoints or `.tink/current/delegation.md` for handoff packets. Tink prepares those briefs as visible state; it does not start workers, tmux panes, or worktrees unless a separate approved workflow does so.
 The rule graph stays small on purpose. Tink loads matching mandatory rules first, retrieves only relevant optional rules by task facts or keywords, and records loaded rule ids by phase so the same guidance is not repeated in one run.
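
As a rough illustration of that loading order, the sketch below selects keyword-matching rules, puts mandatory ones first, and skips any rule id already recorded for the run. The `Rule` shape, the `LoadedByPhase` record, and `selectRules` are assumptions made for this example; the actual rule graph format in `.tink/rules/index.json` is not shown in this diff and may differ.

```typescript
// Hypothetical rule shape; field names are assumptions for illustration only.
type Rule = { id: string; mandatory: boolean; keywords: string[] };
// Rule ids recorded per phase, for example { plan: ["r1"], execute: ["r2"] }.
type LoadedByPhase = { [phase: string]: string[] };

function selectRules(rules: Rule[], taskText: string, phase: string, loaded: LoadedByPhase): Rule[] {
  const text = taskText.toLowerCase();

  // Collect every rule id already loaded in any phase of this run.
  const already: string[] = [];
  Object.keys(loaded).forEach((p) => loaded[p].forEach((id) => already.push(id)));

  // Keep rules whose keywords appear in the task text and that were not loaded before.
  const matches = (r: Rule) => r.keywords.some((k) => text.indexOf(k.toLowerCase()) !== -1);
  const fresh = rules.filter((r) => matches(r) && already.indexOf(r.id) === -1);

  // Mandatory matches come first, then optional matches.
  const ordered = fresh.filter((r) => r.mandatory).concat(fresh.filter((r) => !r.mandatory));

  // Record the newly loaded ids under the current phase.
  loaded[phase] = (loaded[phase] || []).concat(ordered.map((r) => r.id));
  return ordered;
}
```

Calling `selectRules` again in a later phase with the same `loaded` record returns only rules that have not been loaded yet, which is one way to avoid repeating guidance within a run.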
 Design notes live in `docs/`. The compatibility baseline is `docs/compatibility-policy.md`: every new slice should consider Claude Code and Codex, plus macOS and Windows. Repo signal behavior is described in `docs/repo-signals.md` or `docs/repo-signals.ko.md`. The lightweight graph-rule adoption plan is `docs/graph-rule-adoption-plan.ko.md`. External context safety is described in `docs/mcp-safe-profile.md` and `docs/external-context-policy.md`. To read or review `.tink/current/` state, start with `docs/work-state.md` or `docs/work-state.ko.md`. Update confidence is still documented in `docs/phase-5-update-confidence.md` or `docs/phase-5-update-confidence.ko.md`. The planned work-unit list is `docs/planned-work-units.md` or `docs/planned-work-units.ko.md`, with details in the verification evidence, harness lifecycle, memory decision, context change, and update diagnosis docs. The broader Korean idea audit and roadmap is `docs/tink-idea-implementation-plan.ko.md`.

package/VERSIONING.md CHANGED

@@ -1,6 +1,6 @@
 # Versioning
-Current version: `1.7.1`
+Current version: `1.8.0`
 Tink follows semver from `1.0.0` onward.

package/commands/cast.md CHANGED

@@ -147,6 +147,11 @@ After approval, create `.tink/current/` with these files before doing deeper wor
 - `context-metrics-evaluation.json`: measured or estimated context-efficiency scores, formulas, evidence refs, and limits
 - `excluded-context.md`: notable omitted files, tools, sources, or claims and why they were excluded
+Optional current-run artifacts are created only when their harness is selected:
+- `goals.json`: current-run goals for `goal-checkpoint`; keep 2-6 goals, one active goal, status, done criteria, verification, and evidence.
+- `delegation.md`: handoff or parallel-work packets for `delegation-brief`; include packet scope, forbidden actions, expected evidence, and reconciliation notes. Do not start tmux panes, worktrees, workers, or external agents from this harness.
 Create `contract.json` before loading harness bodies. It should be short, factual, and based on the user request plus visible project context:
 ```json
@@ -389,34 +394,39 @@ A task is trivial only when ALL of the following are true:
    - docs
    - ship/release
    - new pattern not covered yet
-6. Pick the best existing harness set using the context budget policy below. Prefer 1-3 harnesses, but do not use a hard cap when several tiny harnesses add useful checks without crowding context. When the task is ambiguous (Stitch goal-ambiguity is expected to trigger), start with a single best-fit harness; add a second only after the user clarifies. Do not bundle 2+ harnesses for ambiguous tasks upfront.
+6. Consider GJC-style visible-thinking overlays as normal Tink harnesses, not as new command surfaces:
+   - If the request is an ambiguous idea, early product concept, or underspecified implementation prompt, prefer `requirements-interview` before planning or coding. This is the default harness when Stitch is expected to trigger for goal ambiguity.
+   - If the request asks for a plan, architecture decision, large refactor, migration, or broad public contract change, consider `plan-consensus`.
+   - If the work naturally splits into multiple durable milestones, add `goal-checkpoint` and create `.tink/current/goals.json` after approval.
+   - If parallel review, verification, or handoff would reduce risk, add `delegation-brief` and create `.tink/current/delegation.md` after approval. This harness prepares briefs only; it never starts tmux, worktrees, workers, or external agents.
+7. Pick the best existing harness set using the context budget policy below. Prefer 1-3 harnesses, but do not use a hard cap when several tiny harnesses add useful checks without crowding context. When the task is ambiguous (Stitch goal-ambiguity is expected to trigger), start with `requirements-interview` alone; add a second harness only after the user clarifies. Do not bundle 2+ harnesses for ambiguous tasks upfront.
    After selecting, run a quick quality check using the index metadata for each chosen harness:
    - If fewer than 2 words in `use_when` match the current task description (case-insensitive) → treat as a Stitch harness-mismatch signal
    - If `checks` is empty or missing → treat as a Stitch harness-mismatch signal
    - If `asks` is empty or missing and the task goal is not self-evident → treat as a Stitch goal-ambiguity signal
-   Feed any signals into the Stitch evaluation at step 11.
-7. Add any rule graph check candidates to `contract.json` verification if they are relevant and cheap. For risky commands, set `approval_required: true`.
-8. Add opt-in guard candidates to `notes.md` only as suggestions. Do not register enforcement hooks unless the user separately approves.
-9. Run the synthesis probe on the initial harness choice. The probe produces one of three outcomes: strong fit (0-1 yes), generic fit (2-3 yes), or no fit (4-5 yes or no harness matches).
-10. If the probe finds no fit, load `harness-synthesis` and draft a domain-specific harness for this run instead of forcing a bad fit.
-11. If the probe finds a generic fit (2-3 yes), propose a run-only draft harness or domain rules alongside the built-in harness. Do not save it by default.
-12. If too many tools, skills, agents, or harnesses are available, load `harness-curation` and choose the smallest effective set before loading more context.
-13. If lightweight signals show a recurring operating habit, use `harness-curation` (its habit calibration section) to make one advisory recommendation without loading a separate body.
-14. If the user points to research, notes, examples, prior failures, or "what I learned today", synthesize from those inputs. Extract behavior-shaping rules and reusable procedure, not a summary.
-15. Run Stitch once before committing to `.tink/current/`. If it triggers, show exactly one proposal before approval. Call `AskUserQuestion` as described in the Interaction policy section.
-16. Ask for explicit approval before non-trivial work.
-17. After approval, read only the selected harness files and any approved run-only draft.
-18. Create `.tink/current/` files from the run state contract, including `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, `context-metrics-evaluation.json`, and `excluded-context.md`.
-19. Execute the first safe step immediately:
+   Feed any signals into the Stitch evaluation at step 16.
+8. Add any rule graph check candidates to `contract.json` verification if they are relevant and cheap. For risky commands, set `approval_required: true`.
+9. Add opt-in guard candidates to `notes.md` only as suggestions. Do not register enforcement hooks unless the user separately approves.
+10. Run the synthesis probe on the initial harness choice. The probe produces one of three outcomes: strong fit (0-1 yes), generic fit (2-3 yes), or no fit (4-5 yes or no harness matches).
+11. If the probe finds no fit, load `harness-synthesis` and draft a domain-specific harness for this run instead of forcing a bad fit.
+12. If the probe finds a generic fit (2-3 yes), propose a run-only draft harness or domain rules alongside the built-in harness. Do not save it by default.
+13. If too many tools, skills, agents, or harnesses are available, load `harness-curation` and choose the smallest effective set before loading more context.
+14. If lightweight signals show a recurring operating habit, use `harness-curation` (its habit calibration section) to make one advisory recommendation without loading a separate body.
+15. If the user points to research, notes, examples, prior failures, or "what I learned today", synthesize from those inputs. Extract behavior-shaping rules and reusable procedure, not a summary.
+16. Run Stitch once before committing to `.tink/current/`. If it triggers, show exactly one proposal before approval. Call `AskUserQuestion` as described in the Interaction policy section.
+17. Ask for explicit approval before non-trivial work.
+18. After approval, read only the selected harness files and any approved run-only draft.
+19. Create `.tink/current/` files from the run state contract, including `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, `context-metrics-evaluation.json`, and `excluded-context.md`. If selected, also create `goals.json` for `goal-checkpoint` and `delegation.md` for `delegation-brief`.
+20. Execute the first safe step immediately:
    - inspect relevant files,
    - run a read-only diagnostic,
    - draft the first artifact,
    - or reproduce the issue.
-20. Keep `steps.json`, `notes.md`, `contract.json`, and `session.json` current as work progresses.
-21. Before final, run `/tink:verify` behavior for required contract checks or state why verification is blocked.
-22. If the task exposed a repeated mistake or reusable improvement, use the Reusable State Save Gate approval payload below. Save only after separate user approval.
+21. Keep `steps.json`, `notes.md`, `contract.json`, and `session.json` current as work progresses. When present, keep `goals.json` and `delegation.md` aligned with actual status and evidence.
+22. Before final, run `/tink:verify` behavior for required contract checks or state why verification is blocked.
+23. If the task exposed a repeated mistake or reusable improvement, use the Reusable State Save Gate approval payload below. Save only after separate user approval.
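
The quick quality check described in step 7 can be sketched as follows. This is a hedged illustration, not Tink's implementation: the `HarnessMeta` type, the assumption that `use_when` is a single string, the ASCII-only word splitting, and the choice to count distinct matching words are all assumptions made for this example.

```typescript
// Assumed shape of one harness entry's index metadata; the real schema may differ.
type HarnessMeta = { use_when?: string; checks?: string[]; asks?: string[] };
type StitchSignal = "harness-mismatch" | "goal-ambiguity";

function qualitySignals(meta: HarnessMeta, task: string, goalSelfEvident: boolean): StitchSignal[] {
  const signals: StitchSignal[] = [];
  // Simplified tokenizer: lowercase, split on anything that is not an ASCII letter or digit.
  const words = (s: string) => s.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
  const taskWords = words(task);

  // Count distinct `use_when` words that also appear in the task description.
  const matched: string[] = [];
  words(meta.use_when || "").forEach((w) => {
    if (taskWords.indexOf(w) !== -1 && matched.indexOf(w) === -1) matched.push(w);
  });

  if (matched.length < 2) signals.push("harness-mismatch");
  if (!meta.checks || meta.checks.length === 0) signals.push("harness-mismatch");
  if ((!meta.asks || meta.asks.length === 0) && !goalSelfEvident) signals.push("goal-ambiguity");
  return signals;
}
```

Any returned signals would then be fed into the Stitch evaluation at step 16; whether duplicate signals should be merged is left open in this sketch.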
 ## Synthesis probe
@@ -637,6 +647,42 @@ Before saving, score the candidate 1-5 on specificity, actionability, verifiabil
 }
 ```
+## `goals.json` template
+Use only when `goal-checkpoint` is selected:
+```json
+{
+  "goals": [
+    {
+      "id": "G1",
+      "status": "active",
+      "description": "",
+      "done_criteria": [],
+      "verification": [],
+      "evidence": [],
+      "next_action": ""
+    }
+  ]
+}
+```
+## `delegation.md` template
+Use only when `delegation-brief` is selected:
+```md
+# Delegation brief
+## Shared constraints
+-
+## Packets
+### Packet 1: <owner label>
+- Scope:
+- Forbidden:
+- Expected evidence:
+- Reconciliation notes:
+```
 ## Meaning of `context`
 When listing harnesses, define `context` once:

package/docs/pr/2026-06-09-v1.8.0.ko.md ADDED

@@ -0,0 +1,44 @@
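+# feat: add GJC-style visible-thinking harnesses
+## Problem
+Tink already recorded run state through `.tink/current/` and the rule graph, but ambiguous-requirements interviews, consensus-style plan reviews, long-run goal checkpoints, and safe delegation briefs had to be handled with generic harnesses or ad hoc drafts.
+Porting GJC's `deep-interview`, `ralplan`, `ultragoal`, and `team` ideas directly as a runtime could clash with Tink's small command surface. Tink needed a way to absorb the same effect in its own style without adding a new CLI or worker runtime.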
+## Fix
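+Added four reusable harnesses that `/tink:cast` and `$tink:cast` can pick.
+- `requirements-interview`: narrows an ambiguous idea one question at a time to clarify success criteria and forbidden conditions.
+- `plan-consensus`: reviews large plans through a Planner, Architect, Critic, Final flow.
+- `goal-checkpoint`: splits a long run into 2-6 goals with completion evidence and records them in `.tink/current/goals.json`.
+- `delegation-brief`: records scope, forbidden actions, and evidence requirements for parallel review or handoff in `.tink/current/delegation.md`.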
+## Summary
+No new public command or Codex visible skill was added. GJC-style thinking steps run only inside Tink's existing `cast` selection process and run-state files.
+## Changes
+- Synced the three copies of the Claude Code `cast` command and added the new harness selection rules and optional artifact templates.
+- Added the same selection rules to the Codex core rules so `$tink:cast` behaves the same way.
+- Registered the new harnesses and routing rules in the `templates/tink/harnesses/index.json` and `.tink/rules/index.json` seeds.
+- Added descriptions of `goals.json` and `delegation.md` to README/README.ko and the work-state docs.
+- Bumped the package version to `1.8.0` and aligned the changelog and release metadata.
+## Behavior
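+Ambiguous work does not have to jump straight into implementation; `requirements-interview` can narrow the success criteria first.
+Large design or refactor plans can get one more critical review through `plan-consensus`.
+Long work records goal checkpoints only within the current run. No repo-wide long-term goal database is created.
+The delegation brief covers only the documentation that precedes worker execution. tmux panes, worktrees, and external agents are not started automatically.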
+## Testing
+- `npm_config_cache=/private/tmp/tink-npm-cache npm test`
+- `git diff --check`
+- clean tarball install smoke: `npm exec --yes --package <tarball> -- tink-harness install --lang=ko --yes --scope=repo`

package/docs/work-state.ko.md CHANGED

@@ -30,6 +30,11 @@ Tink는 실행 상태를 파일로 남겨서 사람이 빠르게 네 가지를
 7. `.tink/current/notes.md`
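    - Read the last safe point, recovery notes, and short verification summaries.
+Optional files may be added when specific harnesses are selected.
+- `.tink/current/goals.json`: current-run goal checkpoints created by `goal-checkpoint`.
+- `.tink/current/delegation.md`: handoff or parallel-work packets created by `delegation-brief`.
 ## How to read Context
 Look at `context-pack.md` first. It should be readable without knowing the schema.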
@@ -95,6 +100,7 @@ Tink는 실행 상태를 파일로 남겨서 사람이 빠르게 네 가지를
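 - Excluded context makes skipped or risky inputs visible.
 - Verification evidence stays short and repeatable.
 - Notes state the last safe point and the next action.
+- If goals or a delegation brief are present, their status, forbidden actions, and evidence are clear.
 ## What to avoid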

package/docs/work-state.md CHANGED

@@ -28,6 +28,11 @@ Start here when resuming, reviewing, or handing off a run:
 7. `.tink/current/notes.md`
    - Read the last safe point, recovery notes, and compact verification summaries.
+Optional files may appear for specific harnesses:
+- `.tink/current/goals.json`: current-run goal checkpoints created by `goal-checkpoint`.
+- `.tink/current/delegation.md`: handoff or parallel-work packets created by `delegation-brief`.
 ## How To Read Context
 Use `context-pack.md` first. It should be readable without knowing the schema.
@@ -93,6 +98,7 @@ A good run state has these properties:
 - The excluded context file makes skipped or unsafe inputs visible.
 - Verification evidence is compact and repeatable.
 - Notes say the last safe point and next action.
+- If present, goals and delegation briefs have explicit status, forbidden actions, and evidence.
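
For `goals.json`, "explicit status and evidence" could be checked mechanically along these lines. The sketch assumes the field names from the `goals.json` template in `cast.md` and the "2-6 goals, one active goal" guidance from the same file; the `checkGoals` helper, the string element types, and the `"done"` status value are assumptions for illustration, since the diff only shows `"active"`.

```typescript
// Field names follow the goals.json template; element types are assumed to be strings.
type Goal = {
  id: string;
  status: string;
  description: string;
  done_criteria: string[];
  verification: string[];
  evidence: string[];
  next_action: string;
};

function checkGoals(goals: Goal[]): string[] {
  const problems: string[] = [];
  if (goals.length < 2 || goals.length > 6) {
    problems.push("expected 2-6 goals, found " + goals.length);
  }
  const active = goals.filter((g) => g.status === "active");
  if (active.length !== 1) {
    problems.push("expected exactly one active goal, found " + active.length);
  }
  goals.forEach((g) => {
    if (g.done_criteria.length === 0) problems.push(g.id + ": missing done criteria");
    // "done" is an assumed status value; the template only shows "active".
    if (g.status === "done" && g.evidence.length === 0) problems.push(g.id + ": done without evidence");
  });
  return problems;
}
```

An empty list means the file passes these particular checks; it says nothing about whether the goals themselves are well chosen.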
 ## What To Avoid

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "tink-harness",
-  "version": "1.7.1",
+  "version": "1.8.0",
   "description": "Self-growing harnesses for Claude Code and Codex.",
   "license": "MIT",
   "type": "module",

package/skills/tink/SKILL.md CHANGED

@@ -43,25 +43,26 @@ Use only these commands:
 5. Prefer the smallest useful harness set. Use context footprint, not a universal hard cap: tiny harnesses may stack, large harnesses load one at a time after approval, and meta harnesses should reduce or replace context rather than pile on.
 6. If `.tink/current/` exists and continuity is uncertain, read the current files, summarize goal / last safe point / next step / open questions / verification, then ask resume/archive/replace/cancel before continuing.
 7. Run the synthesis probe on the initial harness choice. The probe produces one of three outcomes: strong fit (0-1 yes), generic fit (2-3 yes), or no fit (4-5 yes or no harness matches). Strong fit keeps the harness; generic fit adds a run-only draft; no fit loads `harness-synthesis`.
-8. If no existing harness fits, use `harness-synthesis` to draft a narrow domain-specific harness instead of forcing a bad fit.
-9. If too many tools, skills, agents, or harnesses are available, use `harness-curation` to choose the smallest effective set before loading more context.
-10. If lightweight signals show recurring context, token, prompt-quality, output-length, reset, or evidence habits, use `harness-curation` to make one advisory recommendation.
-11. When research notes, examples, prior failures, or user corrections are available, extract behavior-shaping rules: triggers, decision rules, checks, stop conditions, recovery, and evidence.
-12. Run Stitch once before committing to `.tink/current/`: evaluate every time, show exactly one proposal only for high-impact quality or safety branches, and use configured language.
-13. Use soft Stitch choices `Approve`, `Add requirements`, `Continue as-is` or localized equivalents; use hard choices `Approve`, `Add requirements`, `Cancel` only.
-14. Hard gates must not offer `Continue as-is` or `이대로 진행`, and Stitch may change method or order but not the user's goal without separate approval.
-15. Treat Reusable State Save Gate as a separate hard approval gate for `.tink/memory/*`, `.tink/harnesses/*`, `.tink/rules/*`, `.tink/config.json`, `.claude/`, and template/plugin files that affect future installs.
-16. Current-run approval never authorizes reusable-state writes; before saving reusable state, show operation, destination files, exact entry or patch summary, reusable reason, sensitive content excluded, and rollback/removal path.
-17. Before saving a reusable rule graph update, run a structural gate: duplicate, breadth, evidence, verification, Claude Code/Codex compatibility, macOS/Windows compatibility, and portable commands. AI may propose a rule; saving it still requires separate approval.
-18. `/tink:frog` may inspect rule quality as well as harness quality. Prefer keep, rewrite, split, merge, or needs-evidence recommendations before any removal proposal.
-19. Ask for approval before applying, saving, purging, honing, or installing enforcement hooks.
-20. After approval, create `.tink/current/plan.md`, `checks.md`, `steps.json`, `notes.md`, `answers.md`, `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, and `excluded-context.md`.
-21. Do not stop at recommendation. Execute the first safe step after run state exists.
-22. Run `/tink:verify` behavior before final when `contract.json` lists required checks.
-23. Store reusable memory or rule updates only after separate Reusable State Save Gate approval.
-24. If a check fails, update `.tink/current/notes.md`, state the failure, last safe point, and next single action. Append compact friction to `.tink/maintenance/friction.jsonl` when it exists. Feed repeated failures to `/tink:weave`.
-25. Keep context compact. Do not paste raw logs or full diffs.
-26. Use calm, clear, concise language. Prefer plain everyday words over technical terms if a simpler word works. No jokes.
+8. Treat GJC-style visible-thinking workflows as ordinary Tink harness choices, not new commands: use `requirements-interview` for ambiguous ideas, `plan-consensus` for broad plans or architecture, `goal-checkpoint` for long runs with 2-6 current-run goals, and `delegation-brief` for safe handoff or parallel-work briefs.
+9. If no existing harness fits, use `harness-synthesis` to draft a narrow domain-specific harness instead of forcing a bad fit.
+10. If too many tools, skills, agents, or harnesses are available, use `harness-curation` to choose the smallest effective set before loading more context.
+11. If lightweight signals show recurring context, token, prompt-quality, output-length, reset, or evidence habits, use `harness-curation` to make one advisory recommendation.
+12. When research notes, examples, prior failures, or user corrections are available, extract behavior-shaping rules: triggers, decision rules, checks, stop conditions, recovery, and evidence.
+13. Run Stitch once before committing to `.tink/current/`: evaluate every time, show exactly one proposal only for high-impact quality or safety branches, and use configured language.
+14. Use soft Stitch choices `Approve`, `Add requirements`, `Continue as-is` or localized equivalents; use hard choices `Approve`, `Add requirements`, `Cancel` only.
+15. Hard gates must not offer `Continue as-is` or `이대로 진행`, and Stitch may change method or order but not the user's goal without separate approval.
+16. Treat Reusable State Save Gate as a separate hard approval gate for `.tink/memory/*`, `.tink/harnesses/*`, `.tink/rules/*`, `.tink/config.json`, `.claude/`, and template/plugin files that affect future installs.
+17. Current-run approval never authorizes reusable-state writes; before saving reusable state, show operation, destination files, exact entry or patch summary, reusable reason, sensitive content excluded, and rollback/removal path.
+18. Before saving a reusable rule graph update, run a structural gate: duplicate, breadth, evidence, verification, Claude Code/Codex compatibility, macOS/Windows compatibility, and portable commands. AI may propose a rule; saving it still requires separate approval.
+19. `/tink:frog` may inspect rule quality as well as harness quality. Prefer keep, rewrite, split, merge, or needs-evidence recommendations before any removal proposal.
+20. Ask for approval before applying, saving, purging, honing, or installing enforcement hooks.
+21. After approval, create `.tink/current/plan.md`, `checks.md`, `steps.json`, `notes.md`, `answers.md`, `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, and `excluded-context.md`. If selected, also create `goals.json` for `goal-checkpoint` and `delegation.md` for `delegation-brief`.
+22. Do not stop at recommendation. Execute the first safe step after run state exists.
+23. Run `/tink:verify` behavior before final when `contract.json` lists required checks.
+24. Store reusable memory or rule updates only after separate Reusable State Save Gate approval.
+25. If a check fails, update `.tink/current/notes.md`, state the failure, last safe point, and next single action. Append compact friction to `.tink/maintenance/friction.jsonl` when it exists. Feed repeated failures to `/tink:weave`.
+26. Keep context compact. Do not paste raw logs or full diffs.
+27. Use calm, clear, concise language. Prefer plain everyday words over technical terms if a simpler word works. No jokes.
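
The three synthesis probe outcomes in step 7 follow directly from the count of "yes" answers. A minimal sketch, assuming the probe yields a yes count from 0 to 5 plus a flag for whether any harness matched; `classifyProbe` and `followUp` are hypothetical helper names, not part of Tink.

```typescript
type ProbeOutcome = "strong fit" | "generic fit" | "no fit";

// Thresholds come from the skill text: 0-1 yes, 2-3 yes, 4-5 yes or no harness match.
function classifyProbe(yesCount: number, harnessMatched: boolean): ProbeOutcome {
  if (!harnessMatched || yesCount >= 4) return "no fit";
  if (yesCount >= 2) return "generic fit";
  return "strong fit";
}

// What the skill text says happens next for each outcome.
function followUp(outcome: ProbeOutcome): string {
  if (outcome === "strong fit") return "keep the selected harness";
  if (outcome === "generic fit") return "add a run-only draft";
  return "load harness-synthesis";
}
```

For example, `followUp(classifyProbe(3, true))` yields the run-only draft path, while a probe with no matching harness goes to `harness-synthesis` regardless of the count.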
 ## Quality bar
 The user should not have to repeat themselves. If the same mistake appears twice, propose `/tink:weave`, a rule graph update, an opt-in guard candidate, or a memory update through `/tink:cast`.

package/templates/claude/commands/tink/cast.md CHANGED

@@ -147,6 +147,11 @@ After approval, create `.tink/current/` with these files before doing deeper wor
 - `context-metrics-evaluation.json`: measured or estimated context-efficiency scores, formulas, evidence refs, and limits
 - `excluded-context.md`: notable omitted files, tools, sources, or claims and why they were excluded
+Optional current-run artifacts are created only when their harness is selected:
+- `goals.json`: current-run goals for `goal-checkpoint`; keep 2-6 goals, one active goal, status, done criteria, verification, and evidence.
+- `delegation.md`: handoff or parallel-work packets for `delegation-brief`; include packet scope, forbidden actions, expected evidence, and reconciliation notes. Do not start tmux panes, worktrees, workers, or external agents from this harness.
 Create `contract.json` before loading harness bodies. It should be short, factual, and based on the user request plus visible project context:
 ```json
@@ -389,34 +394,39 @@ A task is trivial only when ALL of the following are true:
    - docs
    - ship/release
    - new pattern not covered yet
-6. Pick the best existing harness set using the context budget policy below. Prefer 1-3 harnesses, but do not use a hard cap when several tiny harnesses add useful checks without crowding context. When the task is ambiguous (Stitch goal-ambiguity is expected to trigger), start with a single best-fit harness; add a second only after the user clarifies. Do not bundle 2+ harnesses for ambiguous tasks upfront.
+6. Consider GJC-style visible-thinking overlays as normal Tink harnesses, not as new command surfaces:
+   - If the request is an ambiguous idea, early product concept, or underspecified implementation prompt, prefer `requirements-interview` before planning or coding. This is the default harness when Stitch is expected to trigger for goal ambiguity.
+   - If the request asks for a plan, architecture decision, large refactor, migration, or broad public contract change, consider `plan-consensus`.
+   - If the work naturally splits into multiple durable milestones, add `goal-checkpoint` and create `.tink/current/goals.json` after approval.
+   - If parallel review, verification, or handoff would reduce risk, add `delegation-brief` and create `.tink/current/delegation.md` after approval. This harness prepares briefs only; it never starts tmux, worktrees, workers, or external agents.
+7. Pick the best existing harness set using the context budget policy below. Prefer 1-3 harnesses, but do not use a hard cap when several tiny harnesses add useful checks without crowding context. When the task is ambiguous (Stitch goal-ambiguity is expected to trigger), start with `requirements-interview` alone; add a second harness only after the user clarifies. Do not bundle 2+ harnesses for ambiguous tasks upfront.
    After selecting, run a quick quality check using the index metadata for each chosen harness:
    - If fewer than 2 words in `use_when` match the current task description (case-insensitive) → treat as a Stitch harness-mismatch signal
    - If `checks` is empty or missing → treat as a Stitch harness-mismatch signal
    - If `asks` is empty or missing and the task goal is not self-evident → treat as a Stitch goal-ambiguity signal
-   Feed any signals into the Stitch evaluation at step 11.
-7. Add any rule graph check candidates to `contract.json` verification if they are relevant and cheap. For risky commands, set `approval_required: true`.
-8. Add opt-in guard candidates to `notes.md` only as suggestions. Do not register enforcement hooks unless the user separately approves.
-9. Run the synthesis probe on the initial harness choice. The probe produces one of three outcomes: strong fit (0-1 yes), generic fit (2-3 yes), or no fit (4-5 yes or no harness matches).
-10. If the probe finds no fit, load `harness-synthesis` and draft a domain-specific harness for this run instead of forcing a bad fit.
-11. If the probe finds a generic fit (2-3 yes), propose a run-only draft harness or domain rules alongside the built-in harness. Do not save it by default.
-12. If too many tools, skills, agents, or harnesses are available, load `harness-curation` and choose the smallest effective set before loading more context.
-13. If lightweight signals show a recurring operating habit, use `harness-curation` (its habit calibration section) to make one advisory recommendation without loading a separate body.
-14. If the user points to research, notes, examples, prior failures, or "what I learned today", synthesize from those inputs. Extract behavior-shaping rules and reusable procedure, not a summary.
-15. Run Stitch once before committing to `.tink/current/`. If it triggers, show exactly one proposal before approval. Call `AskUserQuestion` as described in the Interaction policy section.
-16. Ask for explicit approval before non-trivial work.
-17. After approval, read only the selected harness files and any approved run-only draft.
-18. Create `.tink/current/` files from the run state contract, including `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, `context-metrics-evaluation.json`, and `excluded-context.md`.
-19. Execute the first safe step immediately:
+   Feed any signals into the Stitch evaluation at step 16.
+8. Add any rule graph check candidates to `contract.json` verification if they are relevant and cheap. For risky commands, set `approval_required: true`.
+9. Add opt-in guard candidates to `notes.md` only as suggestions. Do not register enforcement hooks unless the user separately approves.
+10. Run the synthesis probe on the initial harness choice. The probe produces one of three outcomes: strong fit (0-1 yes), generic fit (2-3 yes), or no fit (4-5 yes or no harness matches).
+11. If the probe finds no fit, load `harness-synthesis` and draft a domain-specific harness for this run instead of forcing a bad fit.
+12. If the probe finds a generic fit (2-3 yes), propose a run-only draft harness or domain rules alongside the built-in harness. Do not save it by default.
+13. If too many tools, skills, agents, or harnesses are available, load `harness-curation` and choose the smallest effective set before loading more context.
+14. If lightweight signals show a recurring operating habit, use `harness-curation` (its habit calibration section) to make one advisory recommendation without loading a separate body.
+15. If the user points to research, notes, examples, prior failures, or "what I learned today", synthesize from those inputs. Extract behavior-shaping rules and reusable procedure, not a summary.
+16. Run Stitch once before committing to `.tink/current/`. If it triggers, show exactly one proposal before approval. Call `AskUserQuestion` as described in the Interaction policy section.
+17. Ask for explicit approval before non-trivial work.
+18. After approval, read only the selected harness files and any approved run-only draft.
+19. Create `.tink/current/` files from the run state contract, including `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, `context-metrics-evaluation.json`, and `excluded-context.md`. If selected, also create `goals.json` for `goal-checkpoint` and `delegation.md` for `delegation-brief`.
+20. Execute the first safe step immediately:
    - inspect relevant files,
    - run a read-only diagnostic,
    - draft the first artifact,
    - or reproduce the issue.
-20. Keep `steps.json`, `notes.md`, `contract.json`, and `session.json` current as work progresses.
-21. Before final, run `/tink:verify` behavior for required contract checks or state why verification is blocked.
-22. If the task exposed a repeated mistake or reusable improvement, use the Reusable State Save Gate approval payload below. Save only after separate user approval.
+21. Keep `steps.json`, `notes.md`, `contract.json`, and `session.json` current as work progresses. When present, keep `goals.json` and `delegation.md` aligned with actual status and evidence.
+22. Before final, run `/tink:verify` behavior for required contract checks or state why verification is blocked.
+23. If the task exposed a repeated mistake or reusable improvement, use the Reusable State Save Gate approval payload below. Save only after separate user approval.
 ## Synthesis probe
@@ -637,6 +647,42 @@ Before saving, score the candidate 1-5 on specificity, actionability, verifiabil
 }
 ```
+## `goals.json` template
+Use only when `goal-checkpoint` is selected:
+```json
+{
+  "goals": [
+    {
+      "id": "G1",
+      "status": "active",
+      "description": "",
+      "done_criteria": [],
+      "verification": [],
+      "evidence": [],
+      "next_action": ""
+    }
+  ]
+}
+```
+## `delegation.md` template
+Use only when `delegation-brief` is selected:
+```md
+# Delegation brief
+## Shared constraints
+-
+## Packets
+### Packet 1: <owner label>
+- Scope:
+- Forbidden:
+- Expected evidence:
+- Reconciliation notes:
+```
 ## Meaning of `context`
 When listing harnesses, define `context` once:

package/templates/claude/skills/tink/SKILL.md CHANGED

@@ -43,25 +43,26 @@ Use only these commands:
 5. Prefer the smallest useful harness set. Use context footprint, not a universal hard cap: tiny harnesses may stack, large harnesses load one at a time after approval, and meta harnesses should reduce or replace context rather than pile on.
 6. If `.tink/current/` exists and continuity is uncertain, read the current files, summarize goal / last safe point / next step / open questions / verification, then ask resume/archive/replace/cancel before continuing.
 7. Run the synthesis probe on the initial harness choice. The probe produces one of three outcomes: strong fit (0-1 yes), generic fit (2-3 yes), or no fit (4-5 yes or no harness matches). Strong fit keeps the harness; generic fit adds a run-only draft; no fit loads `harness-synthesis`.
-8. If no existing harness fits, use `harness-synthesis` to draft a narrow domain-specific harness instead of forcing a bad fit.
-9. If too many tools, skills, agents, or harnesses are available, use `harness-curation` to choose the smallest effective set before loading more context.
-10. If lightweight signals show recurring context, token, prompt-quality, output-length, reset, or evidence habits, use `harness-curation` to make one advisory recommendation.
-11. When research notes, examples, prior failures, or user corrections are available, extract behavior-shaping rules: triggers, decision rules, checks, stop conditions, recovery, and evidence.
-12. Run Stitch once before committing to `.tink/current/`: evaluate every time, show exactly one proposal only for high-impact quality or safety branches, and use configured language.
-13. Use soft Stitch choices `Approve`, `Add requirements`, `Continue as-is` or localized equivalents; use hard choices `Approve`, `Add requirements`, `Cancel` only.
-14. Hard gates must not offer `Continue as-is` or `이대로 진행`, and Stitch may change method or order but not the user's goal without separate approval.
-15. Treat Reusable State Save Gate as a separate hard approval gate for `.tink/memory/*`, `.tink/harnesses/*`, `.tink/rules/*`, `.tink/config.json`, `.claude/`, and template/plugin files that affect future installs.
-16. Current-run approval never authorizes reusable-state writes; before saving reusable state, show operation, destination files, exact entry or patch summary, reusable reason, sensitive content excluded, and rollback/removal path.
-17. Before saving a reusable rule graph update, run a structural gate: duplicate, breadth, evidence, verification, Claude Code/Codex compatibility, macOS/Windows compatibility, and portable commands. AI may propose a rule; saving it still requires separate approval.
-18. `/tink:frog` may inspect rule quality as well as harness quality. Prefer keep, rewrite, split, merge, or needs-evidence recommendations before any removal proposal.
-19. Ask for approval before applying, saving, purging, honing, or installing enforcement hooks.
-20. After approval, create `.tink/current/plan.md`, `checks.md`, `steps.json`, `notes.md`, `answers.md`, `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, and `excluded-context.md`.
-21. Do not stop at recommendation. Execute the first safe step after run state exists.
-22. Run `/tink:verify` behavior before final when `contract.json` lists required checks.
-23. Store reusable memory or rule updates only after separate Reusable State Save Gate approval.
-24. If a check fails, update `.tink/current/notes.md`, state the failure, last safe point, and next single action. Append compact friction to `.tink/maintenance/friction.jsonl` when it exists. Feed repeated failures to `/tink:weave`.
-25. Keep context compact. Do not paste raw logs or full diffs.
-26. Use calm, clear, concise language. Prefer plain everyday words over technical terms if a simpler word works. No jokes.
+8. Treat GJC-style visible-thinking workflows as ordinary Tink harness choices, not new commands: use `requirements-interview` for ambiguous ideas, `plan-consensus` for broad plans or architecture, `goal-checkpoint` for long runs with 2-6 current-run goals, and `delegation-brief` for safe handoff or parallel-work briefs.
+9. If no existing harness fits, use `harness-synthesis` to draft a narrow domain-specific harness instead of forcing a bad fit.
+10. If too many tools, skills, agents, or harnesses are available, use `harness-curation` to choose the smallest effective set before loading more context.
+11. If lightweight signals show recurring context, token, prompt-quality, output-length, reset, or evidence habits, use `harness-curation` to make one advisory recommendation.
+12. When research notes, examples, prior failures, or user corrections are available, extract behavior-shaping rules: triggers, decision rules, checks, stop conditions, recovery, and evidence.
+13. Run Stitch once before committing to `.tink/current/`: evaluate every time, show exactly one proposal only for high-impact quality or safety branches, and use configured language.
+14. Use soft Stitch choices `Approve`, `Add requirements`, `Continue as-is` or localized equivalents; use hard choices `Approve`, `Add requirements`, `Cancel` only.
+15. Hard gates must not offer `Continue as-is` or `이대로 진행`, and Stitch may change method or order but not the user's goal without separate approval.
+16. Treat Reusable State Save Gate as a separate hard approval gate for `.tink/memory/*`, `.tink/harnesses/*`, `.tink/rules/*`, `.tink/config.json`, `.claude/`, and template/plugin files that affect future installs.
+17. Current-run approval never authorizes reusable-state writes; before saving reusable state, show operation, destination files, exact entry or patch summary, reusable reason, sensitive content excluded, and rollback/removal path.
+18. Before saving a reusable rule graph update, run a structural gate: duplicate, breadth, evidence, verification, Claude Code/Codex compatibility, macOS/Windows compatibility, and portable commands. AI may propose a rule; saving it still requires separate approval.
+19. `/tink:frog` may inspect rule quality as well as harness quality. Prefer keep, rewrite, split, merge, or needs-evidence recommendations before any removal proposal.
+20. Ask for approval before applying, saving, purging, honing, or installing enforcement hooks.
+21. After approval, create `.tink/current/plan.md`, `checks.md`, `steps.json`, `notes.md`, `answers.md`, `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, and `excluded-context.md`. If selected, also create `goals.json` for `goal-checkpoint` and `delegation.md` for `delegation-brief`.
+22. Do not stop at recommendation. Execute the first safe step after run state exists.
+23. Run `/tink:verify` behavior before final when `contract.json` lists required checks.
+24. Store reusable memory or rule updates only after separate Reusable State Save Gate approval.
+25. If a check fails, update `.tink/current/notes.md`, state the failure, last safe point, and next single action. Append compact friction to `.tink/maintenance/friction.jsonl` when it exists. Feed repeated failures to `/tink:weave`.
+26. Keep context compact. Do not paste raw logs or full diffs.
+27. Use calm, clear, concise language. Prefer plain everyday words over technical terms if a simpler word works. No jokes.
 ## Quality bar
 The user should not have to repeat themselves. If the same mistake appears twice, propose `/tink:weave`, a rule graph update, an opt-in guard candidate, or a memory update through `/tink:cast`.
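
The three synthesis probe outcomes in rule 7 can be sketched as a small classifier. This is a minimal sketch, assuming the probe is a list of five yes/no answers; the function name `classify_probe` and the list-of-booleans input are hypothetical and not part of Tink itself.

```python
from typing import Optional, Sequence


def classify_probe(answers: Optional[Sequence[bool]]) -> str:
    """Map synthesis probe answers to one of the three outcomes.

    `answers` holds one boolean per probe question (assumed to be five).
    Pass None when no harness matches at all, which counts as no fit.
    """
    if answers is None:
        return "no fit"
    yes_count = sum(1 for answer in answers if answer)
    if yes_count <= 1:
        return "strong fit"   # keep the chosen harness
    if yes_count <= 3:
        return "generic fit"  # add a run-only draft alongside it
    return "no fit"           # load harness-synthesis instead


outcome = classify_probe([True, True, False, False, False])
```

Keeping the thresholds in one function makes it easy to see that every yes count from 0 to 5 maps to exactly one outcome.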

package/templates/codex/skills/tink-core/RULES.md CHANGED

@@ -26,20 +26,21 @@ Accept legacy `$tink <action>` spelling for compatibility, but present `$tink:<a
 6. If `.tink/current/` exists and continuity is uncertain, read `plan.md`, `checks.md`, `steps.json`, `notes.md`, `answers.md`, and `contract.json` when present; summarize goal, last safe point, next step, open questions, and verification; then ask resume/archive/replace/cancel before continuing.
 7. Run the synthesis probe before committing to `.tink/current/`. Strong fit keeps the harness; generic fit adds a run-only draft; no fit loads `harness-synthesis`.
 8. If too many tools, skills, agents, or harnesses are available, use `harness-curation` to choose the smallest effective set before loading more context.
-9. Run Stitch once before committing to `.tink/current/`: evaluate every time, show exactly one proposal only for high-impact quality or safety branches, and use the configured language.
-10. For non-trivial `$tink:cast` runs, ask for current-run approval before creating `.tink/current/`, loading harness bodies, editing files, or executing the first step. Codex must not silently treat a command invocation as approval.
-11. Use `request_user_input` for choice prompts when available. Otherwise stop and ask one concise blocking approval question directly in chat. Do not continue until the user answers.
-12. Treat reusable saves as a separate hard approval gate for `.tink/memory/*`, `.tink/harnesses/*`, `.tink/rules/*`, `.tink/config.json`, Codex skill files, and template/plugin files that affect future installs.
-13. Current-run approval never authorizes reusable-state writes. Before saving reusable state, show operation, destination files, exact entry or patch summary, reusable reason, sensitive content excluded, and rollback/removal path.
-14. Before saving a reusable rule graph update, run a structural gate: duplicate, breadth, evidence, verification, Claude Code/Codex compatibility, macOS/Windows compatibility, and portable commands. AI may propose a rule; saving it still requires separate approval.
-15. `$tink:frog` may inspect rule quality as well as harness quality. Prefer keep, rewrite, split, merge, or needs-evidence recommendations before any removal proposal.
-16. After approval, create `.tink/current/plan.md`, `checks.md`, `steps.json`, `notes.md`, `answers.md`, `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, `context-metrics-evaluation.json`, and `excluded-context.md`.
-17. Do not stop at recommendation. Execute the first safe step after run state exists.
-18. Run `$tink:verify` behavior before final when `contract.json` lists required checks.
-19. Store reusable memory or rule updates under `.tink/` only after separate approval.
-20. If a check fails, update `.tink/current/notes.md`, state the failure, last safe point, and next single action. Append compact friction to `.tink/maintenance/friction.jsonl` when it exists. Feed repeated failures to `$tink:weave`.
-21. Keep context compact. Do not paste raw logs or full diffs.
-22. Use calm, clear, concise language. Prefer plain everyday words over technical terms. No jokes.
+9. Treat GJC-style visible-thinking workflows as ordinary Tink harness choices, not new commands: use `requirements-interview` for ambiguous ideas, `plan-consensus` for broad plans or architecture, `goal-checkpoint` for long runs with 2-6 current-run goals, and `delegation-brief` for safe handoff or parallel-work briefs.
+10. Run Stitch once before committing to `.tink/current/`: evaluate every time, show exactly one proposal only for high-impact quality or safety branches, and use the configured language.
+11. For non-trivial `$tink:cast` runs, ask for current-run approval before creating `.tink/current/`, loading harness bodies, editing files, or executing the first step. Codex must not silently treat a command invocation as approval.
+12. Use `request_user_input` for choice prompts when available. Otherwise stop and ask one concise blocking approval question directly in chat. Do not continue until the user answers.
+13. Treat reusable saves as a separate hard approval gate for `.tink/memory/*`, `.tink/harnesses/*`, `.tink/rules/*`, `.tink/config.json`, Codex skill files, and template/plugin files that affect future installs.
+14. Current-run approval never authorizes reusable-state writes. Before saving reusable state, show operation, destination files, exact entry or patch summary, reusable reason, sensitive content excluded, and rollback/removal path.
+15. Before saving a reusable rule graph update, run a structural gate: duplicate, breadth, evidence, verification, Claude Code/Codex compatibility, macOS/Windows compatibility, and portable commands. AI may propose a rule; saving it still requires separate approval.
+16. `$tink:frog` may inspect rule quality as well as harness quality. Prefer keep, rewrite, split, merge, or needs-evidence recommendations before any removal proposal.
+17. After approval, create `.tink/current/plan.md`, `checks.md`, `steps.json`, `notes.md`, `answers.md`, `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, `context-metrics-evaluation.json`, and `excluded-context.md`. If selected, also create `.tink/current/goals.json` for `goal-checkpoint` and `.tink/current/delegation.md` for `delegation-brief`.
+18. Do not stop at recommendation. Execute the first safe step after run state exists.
+19. Run `$tink:verify` behavior before final when `contract.json` lists required checks.
+20. Store reusable memory or rule updates under `.tink/` only after separate approval.
+21. If a check fails, update `.tink/current/notes.md`, state the failure, last safe point, and next single action. Append compact friction to `.tink/maintenance/friction.jsonl` when it exists. Feed repeated failures to `$tink:weave`.
+22. Keep context compact. Do not paste raw logs or full diffs.
+23. Use calm, clear, concise language. Prefer plain everyday words over technical terms. No jokes.
 ## Codex Approval Protocol
@@ -107,6 +108,18 @@ Create run state before deeper work:
 - `.tink/current/context-metrics-evaluation.json`: measured or estimated context-efficiency scores, formulas, evidence refs, and limits
 - `.tink/current/excluded-context.md`: notable omitted context and why it was left out
+Optional current-run artifacts:
+- `.tink/current/goals.json`: create only when `goal-checkpoint` is selected. Keep 2-6 goals, one active goal, status, done criteria, verification, evidence, and next action.
+- `.tink/current/delegation.md`: create only when `delegation-brief` is selected. Include packet scope, forbidden actions, expected evidence, and reconciliation notes. Do not start tmux panes, worktrees, workers, or external agents from this harness.
+GJC-style harness selection rules:
+- Ambiguous ideas, early product concepts, and underspecified implementation prompts should start with `requirements-interview`, usually alone until the user clarifies enough to plan or code.
+- Plan requests, architecture decisions, large refactors, migrations, or broad public contract changes should consider `plan-consensus`.
+- Runs that naturally split into multiple durable milestones should add `goal-checkpoint`.
+- Parallel review, verification, or handoff should use `delegation-brief` only to prepare briefs; worker execution needs separate user approval and tooling.
 When useful, enrich `context-map.json.included[]` and `context-map.json.excluded[]` entries with Context Budget Ledger fields: `role`, `cost`, `reuse_signal`, `verification_link`, `staleness`, and `evidence_kind`. Use them to keep the first context pack small, mark stale or avoid-next-time context, and connect `verification_target` entries to command checks, manual checks, evidence refs, or verification hints. Do not claim any 90% efficiency score without measurement evidence.
 When writing `context-metrics-evaluation.json`, include `run`, `evaluator`, `target_threshold_percent`, `measurement_status`, `scope`, `limits`, and `scores[]`. Each score should include `name`, `score_percent`, `formula`, `numerator`, `denominator`, and `evidence_refs`. If the score comes only from fixture or current-run evidence, record that scope and do not claim production-wide 90% without run-history or telemetry evidence.
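
The field list for `context-metrics-evaluation.json` can be illustrated with a short builder. This is a sketch under stated assumptions: the field names come from the rule text, while the helper `make_score`, the metric name `context_reuse`, and every value shown are hypothetical placeholders.

```python
import json


def make_score(name, numerator, denominator, formula, evidence_refs):
    """Build one `scores[]` entry, deriving score_percent from the raw counts."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return {
        "name": name,
        "score_percent": round(100.0 * numerator / denominator, 1),
        "formula": formula,
        "numerator": numerator,
        "denominator": denominator,
        "evidence_refs": list(evidence_refs),
    }


# All values below are illustrative, not taken from a real run.
evaluation = {
    "run": "example-run",
    "evaluator": "manual",
    "target_threshold_percent": 90,
    "measurement_status": "estimated",
    "scope": "current-run",
    "limits": ["Current-run evidence only; no run-history or telemetry."],
    "scores": [
        make_score(
            "context_reuse",
            numerator=9,
            denominator=12,
            formula="reused_items / included_items * 100",
            evidence_refs=["context-map.json"],
        )
    ],
}

serialized = json.dumps(evaluation, indent=2)
```

Recording `numerator` and `denominator` next to `score_percent` lets a reader recompute the score, and the `scope` and `limits` fields carry the caveat that current-run evidence does not support a production-wide claim.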

package/templates/tink/harnesses/HARNESS.md CHANGED

@@ -9,6 +9,10 @@
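 - **[research](./research.md)** (small): Compare options, read docs, and gather evidence. Separate guesses from findings and state the next action.
 - **[review](./review.md)** (small): Review changes, risks, and PRs. Record only findings backed by observation.
 - **[docs](./docs.md)** (tiny): README, guides, and PRDs. Make the reader and the next action clear.
+- **[requirements-interview](./requirements-interview.md)** (small): Narrow an ambiguous idea one question at a time until success conditions and forbidden conditions are clear.
+- **[plan-consensus](./plan-consensus.md)** (small): Check large design or refactoring plans through a Planner → Architect → Critic → Final flow.
+- **[goal-checkpoint](./goal-checkpoint.md)** (small): Split a long run into 2-6 goals with completion evidence and record them in `.tink/current/goals.json`.
+- **[delegation-brief](./delegation-brief.md)** (small): Lay out scope, forbidden actions, and evidence requirements for parallel work or handoff. Workers are not started automatically.
 - **[ship](./ship.md)** (small): PR preparation, release, and deployment. State risks and rollback. The safety guard is switched on in advance when cast starts.
 ## Meta harnesses for management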

package/templates/tink/harnesses/delegation-brief.md ADDED

@@ -0,0 +1,30 @@
+# delegation-brief
+## When to use
+Prepare safe parallel work or handoff briefs without starting workers automatically.
+## Ask first
+- What work can be split without conflicts?
+- What files, commands, or decisions are forbidden for each worker?
+- What evidence should each worker return?
+## Plan
+1. Identify independent work packets and shared constraints.
+2. Write `.tink/current/delegation.md` with packet scope, forbidden actions, expected evidence, and merge notes.
+3. Assign each packet a clear owner label such as reviewer, verifier, or implementer.
+4. Keep execution manual unless the user separately approves tooling.
+5. Reconcile returned evidence before changing the final plan or status.
+## Checks
+- Each packet has non-overlapping scope or an explicit conflict note.
+- No tmux panes, worktrees, workers, or external agents are started by this harness.
+- Evidence requirements are concrete and compact.
+- Do not repeat questions already answered in `.tink/current/answers.md`.
+## Done means
+- `delegation.md` is ready for a human or another agent to act on.
+- Shared constraints and forbidden actions are visible.
+- Merge or reconciliation steps are named.
+## If it fails, Tink back
+Return to the last packet with clear ownership. State the conflict, last safe point, and next single split or merge action.
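
The first check in this harness (non-overlapping scope or an explicit conflict note) could be verified mechanically before a brief is handed off. The following is a minimal sketch, assuming each packet is a dictionary with hypothetical `owner`, `scope`, and `conflict_note` keys; the harness itself only defines a Markdown template, so this structure is an illustration rather than a Tink format.

```python
from itertools import combinations


def find_scope_conflicts(packets):
    """Return (owner_a, owner_b, shared_paths) for each pair of packets
    whose scopes overlap and where neither packet carries a conflict note.

    Only exact path matches are compared; directory containment is ignored.
    """
    conflicts = []
    for first, second in combinations(packets, 2):
        shared = sorted(set(first["scope"]) & set(second["scope"]))
        has_note = first.get("conflict_note") or second.get("conflict_note")
        if shared and not has_note:
            conflicts.append((first["owner"], second["owner"], shared))
    return conflicts


# Hypothetical packets using the owner labels suggested in the plan.
packets = [
    {"owner": "reviewer", "scope": ["src/a.py", "src/b.py"]},
    {"owner": "verifier", "scope": ["src/b.py"]},
    {"owner": "implementer", "scope": ["src/c.py"]},
]
conflicts = find_scope_conflicts(packets)
```

The function only reports problems. In keeping with the harness, it starts no workers and changes no files.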

package/templates/tink/harnesses/goal-checkpoint.md ADDED

@@ -0,0 +1,30 @@
+# goal-checkpoint
+## When to use
+Break a long run into a small set of durable current-run goals with explicit completion evidence.
+## Ask first
+- What final outcome should the goal list prove?
+- Which 2-6 goals can be completed independently?
+- What evidence marks each goal complete?
+## Plan
+1. Split the run into 2-6 goals.
+2. Write `.tink/current/goals.json` with goal id, description, status, done criteria, verification, and evidence.
+3. Mark exactly one goal as active when work begins.
+4. Checkpoint each goal as complete, blocked, or deferred with evidence.
+5. Keep `steps.json` aligned with the active goal.
+## Checks
+- Goals are few enough to scan and specific enough to verify.
+- Each goal has completion evidence, not just a task label.
+- Blocked goals include the smallest unblock action.
+- Do not repeat questions already answered in `.tink/current/answers.md`.
+## Done means
+- Every required goal is complete or explicitly blocked/deferred.
+- `goals.json` matches the final status reported to the user.
+- Verification evidence is attached to completed goals.
+## If it fails, Tink back
+Return to the last completed goal. State the active goal, failure, last safe point, and next single action.
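
The constraints this harness places on `goals.json` (2-6 goals, exactly one active goal while work is underway, evidence on completed goals) can be checked with a few lines of code. This is a sketch, assuming the document shape from the `goals.json` template in `cast.md`; the function `validate_goals` and the `work_underway` flag are hypothetical.

```python
def validate_goals(doc, work_underway=True):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    goals = doc.get("goals", [])

    if not 2 <= len(goals) <= 6:
        problems.append(f"expected 2-6 goals, found {len(goals)}")

    active = [g for g in goals if g.get("status") == "active"]
    if work_underway and len(active) != 1:
        problems.append(f"expected exactly one active goal, found {len(active)}")

    for goal in goals:
        goal_id = goal.get("id", "<no id>")
        if not goal.get("done_criteria"):
            problems.append(f"{goal_id}: missing done criteria")
        if goal.get("status") == "complete" and not goal.get("evidence"):
            problems.append(f"{goal_id}: complete without evidence")
    return problems


# Illustrative document; the descriptions and evidence are made up.
example = {
    "goals": [
        {"id": "G1", "status": "complete", "description": "Draft plan",
         "done_criteria": ["plan.md written"], "evidence": ["plan.md"]},
        {"id": "G2", "status": "active", "description": "Implement change",
         "done_criteria": ["checks pass"], "evidence": []},
    ]
}
problems = validate_goals(example)
```

Returning a list of problems rather than raising on the first one fits the harness's emphasis on reporting status and evidence together.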

package/templates/tink/harnesses/index.json CHANGED

@@ -74,6 +74,66 @@
       "Plain language"
     ]
   },
+  {
+    "name": "requirements-interview",
+    "kind": "built-in",
+    "context": "small",
+    "use_when": "Clarify an ambiguous idea before planning or implementation.",
+    "asks": [
+      "What decision or artifact should the clarified requirements support?",
+      "What is the highest-impact unknown?"
+    ],
+    "checks": [
+      "One blocking question at a time",
+      "Success conditions explicit before implementation",
+      "Assumptions recorded instead of hidden"
+    ]
+  },
+  {
+    "name": "plan-consensus",
+    "kind": "built-in",
+    "context": "small",
+    "use_when": "Plan broad architecture, large refactors, or work where critique can materially change the approach.",
+    "asks": [
+      "What decision must the final plan settle?",
+      "What constraints or compatibility boundaries matter?"
+    ],
+    "checks": [
+      "Planner, Architect, Critic, and Final stages represented",
+      "Rejected alternatives tied to concrete tradeoffs",
+      "Final plan ready for implementation"
+    ]
+  },
+  {
+    "name": "goal-checkpoint",
+    "kind": "built-in",
+    "context": "small",
+    "use_when": "Break a long run into 2-6 current-run goals with explicit completion evidence.",
+    "asks": [
+      "What final outcome should the goal list prove?",
+      "What evidence marks each goal complete?"
+    ],
+    "checks": [
+      ".tink/current/goals.json created when selected",
+      "Exactly one active goal while work is underway",
+      "Blocked goals include the smallest unblock action"
+    ]
+  },
+  {
+    "name": "delegation-brief",
+    "kind": "built-in",
+    "context": "small",
+    "use_when": "Prepare safe parallel work or handoff briefs without starting workers automatically.",
+    "asks": [
+      "What work can be split without conflicts?",
+      "What evidence should each worker return?"
+    ],
+    "checks": [
+      ".tink/current/delegation.md created when selected",
+      "No tmux panes, worktrees, or workers started automatically",
+      "Packet scopes and forbidden actions visible"
+    ]
+  },
   {
     "name": "ship",
     "kind": "built-in",
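
The `use_when`, `asks`, and `checks` fields in these index entries are what the quick quality check in `cast.md` inspects after harness selection. The following is a minimal sketch of that check. The tokenization (lowercased alphanumeric words, counted once each) is an assumption, since the source only says "words" and "case-insensitive", and the function names and signal strings are hypothetical.

```python
import re


def words(text):
    """Lowercased alphanumeric tokens, deduplicated."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def quality_signals(entry, task_description, goal_self_evident=False):
    """Return Stitch signals for one selected harness index entry."""
    signals = []
    overlap = words(entry.get("use_when", "")) & words(task_description)
    if len(overlap) < 2:
        signals.append("harness-mismatch: fewer than 2 use_when words match")
    if not entry.get("checks"):
        signals.append("harness-mismatch: checks empty or missing")
    if not entry.get("asks") and not goal_self_evident:
        signals.append("goal-ambiguity: asks empty or missing")
    return signals


entry = {
    "name": "goal-checkpoint",
    "use_when": "Break a long run into 2-6 current-run goals with explicit completion evidence.",
    "asks": ["What final outcome should the goal list prove?"],
    "checks": ["Exactly one active goal while work is underway"],
}
signals = quality_signals(entry, "Split this long migration run into goals")
```

Because common words such as "a" or "into" also count as matches under this tokenization, a stricter variant might filter stop words; the source does not say whether it does.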

package/templates/tink/harnesses/plan-consensus.md ADDED

@@ -0,0 +1,30 @@
+# plan-consensus
+## When to use
+Plan broad architecture, large refactors, or work where a second-pass critique can materially change the approach.
+## Ask first
+- What decision must the final plan settle?
+- What constraints or compatibility boundaries matter?
+- What level of detail should the plan reach before implementation?
+## Plan
+1. Planner: draft the smallest complete plan with scope, sequence, and checks.
+2. Architect: challenge interfaces, data flow, compatibility, and migration concerns.
+3. Critic: look for missing tests, unsafe assumptions, and overbuilt steps.
+4. Final: merge the useful objections into one implementation-ready plan.
+5. Record the final plan in `.tink/current/plan.md` and unresolved objections in `notes.md`.
+## Checks
+- Final plan names the goal, scope, non-goals, and acceptance evidence.
+- Critique changes the plan or is explicitly rejected with a reason.
+- The plan does not require a subagent, tmux worker, or separate runtime to be valid.
+- Do not repeat questions already answered in `.tink/current/answers.md`.
+## Done means
+- A single final plan can be handed to an implementer without more design choices.
+- Major risks and verification steps are visible.
+- Rejected alternatives are short and tied to concrete tradeoffs.
+## If it fails, Tink back
+Return to the latest complete stage. State which role found the blocker, the last safe point, and the next single revision.

package/templates/tink/harnesses/requirements-interview.md ADDED

@@ -0,0 +1,30 @@
+# requirements-interview
+## When to use
+Clarify an ambiguous idea before planning or implementation.
+## Ask first
+- What decision or artifact should the clarified requirements support?
+- What is the highest-impact unknown?
+- What must not be assumed?
+## Plan
+1. State the current understanding in one paragraph.
+2. Ask one question at a time, starting with the uncertainty that changes scope or success criteria most.
+3. Record each answer in `.tink/current/answers.md`.
+4. Convert settled answers into `contract.json` success conditions, forbidden actions, or verification notes.
+5. Stop interviewing when the next safe step is clear enough to plan.
+## Checks
+- Only one blocking question is asked at a time.
+- Success conditions are explicit before implementation starts.
+- Important assumptions are recorded instead of hidden.
+- Do not repeat questions already answered in `.tink/current/answers.md`.
+## Done means
+- The goal, success conditions, and non-goals are clear enough for the next harness.
+- Remaining uncertainty is named as an assumption or open question.
+- `contract.json` can state what done means.
+## If it fails, Tink back
+Return to the last answered question. State what is still ambiguous, the last safe point, and the next single question.

package/templates/tink/rules/index.json CHANGED

@@ -44,6 +44,62 @@
       "reason": "Bug fixes should keep reproduction and regression evidence close to the change.",
       "risk": "Skipping reproduction can turn a bug fix into an unverified code edit."
     },
+    {
+      "id": "harness:requirements-interview",
+      "type": "harness",
+      "target": "requirements-interview",
+      "load": "retrievable",
+      "phase": "classification",
+      "budget_cost": 2,
+      "keywords": ["ambiguous", "idea", "clarify", "requirements", "interview", "question"],
+      "when": { "task_type": ["research", "docs", "new_pattern"], "risk": ["goal_ambiguity"] },
+      "select_harnesses": ["requirements-interview"],
+      "checks": ["One blocking question at a time", "Success conditions explicit before implementation", "Assumptions recorded instead of hidden"],
+      "reason": "Ambiguous ideas should be narrowed into visible requirements before planning or implementation.",
+      "risk": "Starting with a generic harness can hide assumptions and choose the wrong next step."
+    },
+    {
+      "id": "harness:plan-consensus",
+      "type": "harness",
+      "target": "plan-consensus",
+      "load": "retrievable",
+      "phase": "classification",
+      "budget_cost": 2,
+      "keywords": ["plan", "architecture", "refactor", "migration", "consensus", "critic"],
+      "when": { "task_type": ["research", "docs", "code_change"], "risk": ["broad_contract"] },
+      "select_harnesses": ["plan-consensus"],
+      "checks": ["Planner, Architect, Critic, and Final stages represented", "Final plan ready for implementation"],
+      "reason": "Broad planning benefits from a structured critique pass before run state commits to an approach.",
+      "risk": "A one-pass plan can miss compatibility, migration, or verification gaps."
+    },
+    {
+      "id": "harness:goal-checkpoint",
+      "type": "harness",
+      "target": "goal-checkpoint",
+      "load": "retrievable",
+      "phase": "classification",
+      "budget_cost": 2,
+      "keywords": ["goal", "checkpoint", "milestone", "multi-step", "long run"],
+      "when": { "task_type": ["code_change", "release", "publish", "docs"], "risk": ["long_run"] },
+      "select_harnesses": ["goal-checkpoint"],
+      "checks": [".tink/current/goals.json created when selected", "Exactly one active goal while work is underway"],
+      "reason": "Long runs need durable current-run goals with completion evidence.",
+      "risk": "A long checklist can lose the active goal, blocked state, or evidence trail."
+    },
+    {
+      "id": "harness:delegation-brief",
+      "type": "harness",
+      "target": "delegation-brief",
+      "load": "retrievable",
+      "phase": "classification",
+      "budget_cost": 2,
+      "keywords": ["delegate", "handoff", "parallel", "worker", "reviewer", "verifier"],
+      "when": { "task_type": ["review", "release", "publish", "research"], "risk": ["parallel_work"] },
+      "select_harnesses": ["delegation-brief"],
+      "checks": [".tink/current/delegation.md created when selected", "No tmux panes, worktrees, or workers started automatically"],
+      "reason": "Parallel review or handoff should be captured as visible packets before any worker tooling is approved.",
+      "risk": "Starting workers before a brief can create overlapping edits or unclear evidence."
+    },
     {
       "id": "harness:ship",
       "type": "harness",