tink-harness 1.13.0 → 1.14.0
- package/.claude-plugin/plugin.json +1 -1
- package/CHANGELOG.md +5 -0
- package/README.ko.md +1 -1
- package/README.md +1 -1
- package/VERSIONING.md +1 -1
- package/bin/install.js +126 -10
- package/commands/cast.md +52 -19
- package/docs/planned-work-units.ko.md +8 -7
- package/docs/planned-work-units.md +8 -7
- package/docs/swarm-fast-lane.ko.md +17 -16
- package/docs/swarm-fast-lane.md +17 -16
- package/package.json +1 -1
- package/templates/claude/commands/tink/cast.md +52 -19
- package/templates/codex/skills/tink-core/RULES.md +51 -17
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,11 @@
 
 All notable changes to Tink are tracked here.
 
+## [1.14.0] - 2026-06-19
+
+- Added `CLAUDE_CONFIG_DIR` support: global installs now respect the env var (set via direnv or shell) so commands and skills land in the right config directory instead of always defaulting to `~/.claude`.
+- Added `tink-harness update --all-repos`: finds every repo under the home directory that has Tink installed and updates each one. Uses `direnv exec` when available so per-repo `.envrc` overrides (including `CLAUDE_CONFIG_DIR`) are applied automatically; falls back to parsing simple `export` lines from `.envrc` otherwise.
+
 ## [1.13.0] - 2026-06-19
 
 - Added focused opt-in harnesses for recurring agent workflows: `issue-triage`, `bug-diagnosis-loop`, `review-two-axis`, `decision-map`, and `architecture-deepening`.
package/README.ko.md
CHANGED
@@ -10,7 +10,7 @@ Tink는 사소하지 않은 모든 에이전트 작업을 눈에 보이는 파
 
 <sub>Claude Code와 Codex를 위한 작은 하네스 레이어</sub>
 
-**최신 패키지:** v1.
+**최신 패키지:** v1.14.0 — 글로벌 설치 시 `CLAUDE_CONFIG_DIR` 환경변수를 반영하고, `update --all-repos`로 홈 하위 모든 Tink 레포를 한 번에 업데이트할 수 있게 됐습니다. direnv가 있으면 레포별 `.envrc`를 자동으로 로드합니다. 전체 변경 이력은 [CHANGELOG](CHANGELOG.md)를 확인하세요.
 
 [English](README.md) · **한국어** · [변경 이력](CHANGELOG.md)
 
package/README.md
CHANGED
@@ -24,7 +24,7 @@
 <a href="https://github.com/dotoricode/tink-harness/stargazers"><img src="https://img.shields.io/github/stars/dotoricode/tink-harness?style=social" alt="GitHub stars"></a>
 </p>
 
-<p><strong>Latest package:</strong> v1.
+<p><strong>Latest package:</strong> v1.14.0 - Tink respects <code>CLAUDE_CONFIG_DIR</code> for global installs and adds <code>update --all-repos</code> to refresh every Tink-installed repo in one command, with direnv support for per-repo env overrides. See <a href="CHANGELOG.md">CHANGELOG</a> for release history.</p>
 
 **English** · [한국어](README.ko.md) · [Changelog](CHANGELOG.md)
 
package/VERSIONING.md
CHANGED
package/bin/install.js
CHANGED
|
@@ -126,7 +126,7 @@ function argValue(name) {
|
|
|
126
126
|
}
|
|
127
127
|
|
|
128
128
|
function usage() {
|
|
129
|
-
console.log(`Tink installer for Claude Code and Codex\n\nUsage:\n tink-harness [install] [--scope=repo|global] [--global] [--lang=en|ko|zh] [--yes] [--with-hook] [--clean-codex-picker] [--dry-run] [--force]\n tink-harness update [--scope=repo|global] [--global] [--lang=en|ko|zh] [--yes] [--clean-codex-picker] [--dry-run] [--force]\n tink-harness dashboard [--no-open]\n\nIf the command is not installed yet, use:\n npx tink-harness@latest [install]\n npx tink-harness@latest update\n\nCommands:\n install Install Tink.\n update Update Tink to the latest templates. Asks only the agent surface; Tink-owned files always refresh, user-modified harness/memory/config files are kept.\n dashboard Generate the harness health report from local .tink records and open it in your browser. Use --no-open to skip opening.\n\nDefault interactive flow:\n 1. Select language\n 2. Show TINK wizard\n 3. Select Claude Code, Codex, or both\n 4. Select components\n 5. Select repo/global installation scope\n 6. Select Advanced options\n 7. Select git tracking policy for project state\n\nAdvanced options:\n --dry-run Preview only. Show what would be written or removed, but do not change files.\n --force Overwrite user-modified files. Use only when you want official templates to replace local edits.\n --clean-codex-picker Codex-only cleanup. Remove repo-local Claude Tink surfaces that show as Source Command Tink entries.\n\nEnvironment:\n TINK_INSTALL_SURFACES=claude|codex|all\n TINK_CLEAN_CODEX_PICKER=1\n\nScopes:\n repo Install shared .tink files into the current project.\n global Install shared .tink files into your home directory.\n`);
|
|
129
|
+
console.log(`Tink installer for Claude Code and Codex\n\nUsage:\n tink-harness [install] [--scope=repo|global] [--global] [--lang=en|ko|zh] [--yes] [--with-hook] [--clean-codex-picker] [--dry-run] [--force]\n tink-harness update [--scope=repo|global] [--global] [--lang=en|ko|zh] [--yes] [--clean-codex-picker] [--dry-run] [--force]\n tink-harness update --all-repos\n tink-harness dashboard [--no-open]\n\nIf the command is not installed yet, use:\n npx tink-harness@latest [install]\n npx tink-harness@latest update\n\nCommands:\n install Install Tink.\n update Update Tink to the latest templates. Asks only the agent surface; Tink-owned files always refresh, user-modified harness/memory/config files are kept.\n dashboard Generate the harness health report from local .tink records and open it in your browser. Use --no-open to skip opening.\n\nDefault interactive flow:\n 1. Select language\n 2. Show TINK wizard\n 3. Select Claude Code, Codex, or both\n 4. Select components\n 5. Select repo/global installation scope\n 6. Select Advanced options\n 7. Select git tracking policy for project state\n\nAdvanced options:\n --dry-run Preview only. Show what would be written or removed, but do not change files.\n --force Overwrite user-modified files. Use only when you want official templates to replace local edits.\n --clean-codex-picker Codex-only cleanup. Remove repo-local Claude Tink surfaces that show as Source Command Tink entries.\n --all-repos Update all repos with Tink under the home directory. Uses direnv if available to load per-repo .envrc.\n\nEnvironment:\n TINK_INSTALL_SURFACES=claude|codex|all\n TINK_CLEAN_CODEX_PICKER=1\n CLAUDE_CONFIG_DIR Override ~/.claude for global installs (e.g. set by direnv per project)\n CODEX_HOME Override ~/.codex for Codex skill installs\n\nScopes:\n repo Install shared .tink files into the current project.\n global Install shared .tink files into your home directory.\n`);
|
|
130
130
|
}
|
|
131
131
|
|
|
132
132
|
function findTinkRoot() {
|
|
@@ -228,6 +228,15 @@ function codexHome() {
   return process.env.CODEX_HOME || path.join(os.homedir(), '.codex');
 }
 
+// CLAUDE_CONFIG_DIR replaces ~/.claude for global installs (like direnv per-project overrides).
+// Repo-scope installs always use <repo>/.claude regardless of this env var.
+function claudeDir(target) {
+  if (process.env.CLAUDE_CONFIG_DIR && target === os.homedir()) {
+    return process.env.CLAUDE_CONFIG_DIR;
+  }
+  return path.join(target, '.claude');
+}
+
 function legacyComponentOptionsFor(agent, language) {
   const options = COMPONENTS[language].filter((item) => {
     if (item.value === 'commands') return includesClaude(agent);
|
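Review note: the `claudeDir` rule added in this hunk can be exercised in isolation. The block below is a minimal sketch, not the package's code. `resolveClaudeDir` and its injected `env` and `home` parameters are hypothetical names, and the `path.resolve` normalization is an assumption added here; the hunk itself compares the raw strings with `===`.

```javascript
const os = require('node:os');
const path = require('node:path');

// Hypothetical pure variant of claudeDir(): env and home are passed in so the
// rule can be checked without touching process.env or the real home directory.
function resolveClaudeDir(target, env = process.env, home = os.homedir()) {
  // Assumption: normalize both sides so a trailing slash does not defeat the
  // comparison. The hunk compares target and os.homedir() directly.
  const isGlobal = path.resolve(target) === path.resolve(home);
  if (env.CLAUDE_CONFIG_DIR && isGlobal) {
    return env.CLAUDE_CONFIG_DIR;
  }
  return path.join(target, '.claude');
}

// Illustrative paths only.
const home = path.join(path.sep, 'home', 'dev');
const env = { CLAUDE_CONFIG_DIR: path.join(home, '.config', 'claude-work') };

// Global install: the override wins.
const globalDir = resolveClaudeDir(home, env, home);
// Repo-scope install: the override is ignored and <repo>/.claude is used.
const repoDir = resolveClaudeDir(path.join(home, 'src', 'app'), env, home);
console.log(globalDir);
console.log(repoDir);
```

Passing `env` and `home` explicitly is only a testing convenience; the behavior for the two scopes follows the comment lines in the hunk.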
@@ -364,8 +373,8 @@ function locationSummary(agent, scope) {
|
|
|
364
373
|
return [
|
|
365
374
|
`Repo target: ${repoTarget}`,
|
|
366
375
|
`Shared .tink target: ${path.join(installTarget, '.tink')}`,
|
|
367
|
-
includesClaude(agent) ? `Claude Code command target: ${path.join(installTarget, '
|
|
368
|
-
includesClaude(agent) ? `Claude Code skill target: ${path.join(installTarget, '
|
|
376
|
+
includesClaude(agent) ? `Claude Code command target: ${path.join(claudeDir(installTarget), 'commands/tink')}` : null,
|
|
377
|
+
includesClaude(agent) ? `Claude Code skill target: ${path.join(claudeDir(installTarget), 'skills/tink')}` : null,
|
|
369
378
|
includesCodex(agent) ? `Codex skills target: ${path.join(codexHome(), 'skills')}` : null,
|
|
370
379
|
includesCodex(agent) ? `Codex picker cleanup target: ${path.join(process.cwd(), '.claude')}` : null
|
|
371
380
|
].filter(Boolean).join('\n');
|
|
@@ -710,12 +719,12 @@ function copyDir(src, dest, base) {
|
|
|
710
719
|
|
|
711
720
|
function copyTinkCommands(templateRoot, target) {
|
|
712
721
|
const commandSrc = path.join(templateRoot, 'claude/commands/tink');
|
|
713
|
-
const commandDest = path.join(target, '
|
|
714
|
-
const flatCommandDest = path.join(target, '
|
|
722
|
+
const commandDest = path.join(claudeDir(target), 'commands/tink');
|
|
723
|
+
const flatCommandDest = path.join(claudeDir(target), 'commands');
|
|
715
724
|
const legacyFlatCommands = ['tink-setup.md', 'tink-forge.md', 'tink-list.md', 'tink-purge.md', 'tink-hone.md'];
|
|
716
725
|
const legacyNamespaceCommands = ['forge.md', 'purge.md', 'hone.md'];
|
|
717
726
|
const legacyTinyCommands = ['tiny-setup.md', 'tiny-use.md', 'tiny-list.md', 'tiny-save.md'];
|
|
718
|
-
const legacyDirs = [path.join(flatCommandDest, 'tiny'), path.join(target, '
|
|
727
|
+
const legacyDirs = [path.join(flatCommandDest, 'tiny'), path.join(claudeDir(target), 'skills/tiny')];
|
|
719
728
|
for (const name of legacyFlatCommands) {
|
|
720
729
|
const legacy = path.join(flatCommandDest, name);
|
|
721
730
|
if (fs.existsSync(legacy)) {
|
|
@@ -863,7 +872,7 @@ function hookCommandFor(scope, target) {
|
|
|
863
872
|
}
|
|
864
873
|
|
|
865
874
|
function registerClaudeHook(target, scope, base) {
|
|
866
|
-
const settingsPath = path.join(target, '
|
|
875
|
+
const settingsPath = path.join(claudeDir(target), 'settings.json');
|
|
867
876
|
const settings = readJsonFile(settingsPath, {});
|
|
868
877
|
const command = hookCommandFor(scope, target);
|
|
869
878
|
settings.hooks ||= {};
|
|
@@ -893,7 +902,7 @@ function copySelected(scope, components, agent) {
|
|
|
893
902
|
}
|
|
894
903
|
if (wantsClaudeSkill(components)) {
|
|
895
904
|
if (includesClaude(agent) && !cleanupCodexPicker) {
|
|
896
|
-
copyDir(path.join(templateRoot, 'claude/skills'), path.join(target, '
|
|
905
|
+
copyDir(path.join(templateRoot, 'claude/skills'), path.join(claudeDir(target), 'skills'), target);
|
|
897
906
|
}
|
|
898
907
|
}
|
|
899
908
|
if (wantsCodexSkills(components)) {
|
|
@@ -995,8 +1004,8 @@ function doneLineFor(agent) {
|
|
|
995
1004
|
|
|
996
1005
|
function updateResultSummary(agent, targets) {
|
|
997
1006
|
const locations = [
|
|
998
|
-
includesClaude(agent) ? `Claude Code commands: ${path.join(targets.installTarget, '
|
|
999
|
-
includesClaude(agent) ? `Claude Code skill: ${path.join(targets.installTarget, '
|
|
1007
|
+
includesClaude(agent) ? `Claude Code commands: ${path.join(claudeDir(targets.installTarget), 'commands/tink')}` : null,
|
|
1008
|
+
includesClaude(agent) ? `Claude Code skill: ${path.join(claudeDir(targets.installTarget), 'skills/tink')}` : null,
|
|
1000
1009
|
includesCodex(agent) ? `Codex skills: ${path.join(targets.codexTarget, 'skills')}` : null,
|
|
1001
1010
|
`Tink shared files: ${path.join(targets.installTarget, '.tink')}`
|
|
1002
1011
|
].filter(Boolean);
|
|
@@ -1216,12 +1225,119 @@ async function resolveChoices() {
|
|
|
1216
1225
|
return { agent, scope, components, gitPolicy, hookScope, language };
|
|
1217
1226
|
}
|
|
1218
1227
|
|
|
1228
|
+
function findAllTinkRepos() {
|
|
1229
|
+
const found = [];
|
|
1230
|
+
const skip = new Set(['node_modules', '.git', 'vendor', 'dist', 'build', 'out', 'target', '.cache']);
|
|
1231
|
+
|
|
1232
|
+
function scan(dir, depth) {
|
|
1233
|
+
if (depth > 4) return;
|
|
1234
|
+
let entries;
|
|
1235
|
+
try { entries = fs.readdirSync(dir, { withFileTypes: true }); } catch { return; }
|
|
1236
|
+
let hasTink = false;
|
|
1237
|
+
for (const entry of entries) {
|
|
1238
|
+
if (!entry.isDirectory()) continue;
|
|
1239
|
+
if (entry.name === '.tink') { hasTink = true; continue; }
|
|
1240
|
+
if (skip.has(entry.name) || entry.name.startsWith('.')) continue;
|
|
1241
|
+
scan(path.join(dir, entry.name), depth + 1);
|
|
1242
|
+
}
|
|
1243
|
+
if (hasTink) found.push(dir);
|
|
1244
|
+
}
|
|
1245
|
+
|
|
1246
|
+
scan(os.homedir(), 0);
|
|
1247
|
+
return found;
|
|
1248
|
+
}
|
|
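Review note: the scan in `findAllTinkRepos` is depth-limited and skips hidden and build directories. The sketch below repeats the same logic with the root directory as a parameter so it can run against a throwaway layout; `findTinkDirs`, `maxDepth`, and the directory names are hypothetical.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Same traversal rules as the hunk above, but rooted at an arbitrary directory.
function findTinkDirs(rootDir, maxDepth = 4) {
  const skip = new Set(['node_modules', '.git', 'vendor', 'dist', 'build', 'out', 'target', '.cache']);
  const found = [];

  function scan(dir, depth) {
    if (depth > maxDepth) return;
    let entries;
    try {
      entries = fs.readdirSync(dir, { withFileTypes: true });
    } catch {
      return; // unreadable directories are skipped silently
    }
    let hasTink = false;
    for (const entry of entries) {
      if (!entry.isDirectory()) continue;
      if (entry.name === '.tink') { hasTink = true; continue; }
      if (skip.has(entry.name) || entry.name.startsWith('.')) continue;
      scan(path.join(dir, entry.name), depth + 1);
    }
    if (hasTink) found.push(dir);
  }

  scan(rootDir, 0);
  return found;
}

// Throwaway layout: two reachable repos, one under node_modules, one under a
// hidden directory, and one nested deeper than the depth limit.
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'tink-scan-'));
const layout = ['app/.tink', 'work/api/.tink', 'app/node_modules/dep/.tink', '.hidden/x/.tink', 'a/b/c/d/e/.tink'];
for (const rel of layout) {
  fs.mkdirSync(path.join(tmp, rel), { recursive: true });
}
const hits = findTinkDirs(tmp).map((dir) => path.relative(tmp, dir)).sort();
console.log(hits);
fs.rmSync(tmp, { recursive: true, force: true });
```

Only `app` and `work/api` are reported here: the other three `.tink` directories sit behind a skipped name or below the depth cutoff.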
1249
|
+
|
|
1250
|
+
function isDirenvAvailable() {
|
|
1251
|
+
return spawnSync('direnv', ['version'], { encoding: 'utf8' }).status === 0;
|
|
1252
|
+
}
|
|
1253
|
+
|
|
1254
|
+
function parseEnvrc(envrcPath, repoDir) {
|
|
1255
|
+
if (!fs.existsSync(envrcPath)) return {};
|
|
1256
|
+
const env = {};
|
|
1257
|
+
for (const line of fs.readFileSync(envrcPath, 'utf8').split('\n')) {
|
|
1258
|
+
const m = line.match(/^\s*export\s+([A-Z_][A-Z0-9_]*)=(.*)/);
|
|
1259
|
+
if (!m) continue;
|
|
1260
|
+
let val = m[2].trim().replace(/^["']|["']$/g, '');
|
|
1261
|
+
val = val
|
|
1262
|
+
.replace(/\$HOME|\bHOME\b/g, os.homedir())
|
|
1263
|
+
.replace(/\$PWD|\bPWD\b/g, repoDir)
|
|
1264
|
+
.replace(/^~/, os.homedir());
|
|
1265
|
+
env[m[1]] = val;
|
|
1266
|
+
}
|
|
1267
|
+
return env;
|
|
1268
|
+
}
|
|
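Review note: the fallback parser handles only simple `export NAME=value` lines. The sketch below applies the same regular expressions to an in-memory string to show what is and is not expanded; `parseEnvrcText`, the sample paths, and the sample variable names are hypothetical.

```javascript
// Same matching and substitution rules as parseEnvrc() in the hunk above, but
// reading from a string and taking the home directory as a parameter.
function parseEnvrcText(text, repoDir, home) {
  const env = {};
  for (const line of text.split('\n')) {
    const m = line.match(/^\s*export\s+([A-Z_][A-Z0-9_]*)=(.*)/);
    if (!m) continue;
    let val = m[2].trim().replace(/^["']|["']$/g, '');
    val = val
      .replace(/\$HOME|\bHOME\b/g, home)
      .replace(/\$PWD|\bPWD\b/g, repoDir)
      .replace(/^~/, home);
    env[m[1]] = val;
  }
  return env;
}

const home = '/home/dev';
const repo = '/home/dev/src/app';
const envrcText = [
  'export CLAUDE_CONFIG_DIR="$HOME/.claude-work"', // quoted, $HOME expanded
  'export TOOL_CACHE=$PWD/.cache',                 // $PWD becomes the repo dir
  'export NOTES=~/notes',                          // leading ~ expanded
  'export lower_case=ignored',                     // lowercase names do not match
  'layout node',                                   // direnv stdlib calls are not evaluated
  'export BRACED=${HOME}/x',                       // braces are not understood
].join('\n');

const env = parseEnvrcText(envrcText, repo, home);
console.log(env);
```

The `${HOME}` case comes out as `${/home/dev}/x`, because only the bare word inside the braces is replaced. Cases like this are a reason to prefer the `direnv exec` path when direnv is installed.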
1269
|
+
|
|
1270
|
+
async function runAllRepos() {
|
|
1271
|
+
const allRepos = findAllTinkRepos();
|
|
1272
|
+
const sourceRoot = path.resolve(root);
|
|
1273
|
+
const repos = allRepos.filter((r) => path.resolve(r) !== sourceRoot);
|
|
1274
|
+
|
|
1275
|
+
if (repos.length === 0) {
|
|
1276
|
+
console.log('No repos with Tink installed found under home directory.');
|
|
1277
|
+
return;
|
|
1278
|
+
}
|
|
1279
|
+
|
|
1280
|
+
const hasDirenv = isDirenvAvailable();
|
|
1281
|
+
const installScript = path.join(root, 'bin/install.js');
|
|
1282
|
+
|
|
1283
|
+
console.log(`Found ${repos.length} repo(s) with Tink installed:\n`);
|
|
1284
|
+
for (const repo of repos) {
|
|
1285
|
+
const envrc = path.join(repo, '.envrc');
|
|
1286
|
+
const envVars = hasDirenv ? {} : parseEnvrc(envrc, repo);
|
|
1287
|
+
const claudeTarget = envVars.CLAUDE_CONFIG_DIR
|
|
1288
|
+
? envVars.CLAUDE_CONFIG_DIR
|
|
1289
|
+
: path.join(repo, '.claude');
|
|
1290
|
+
const note = fs.existsSync(envrc)
|
|
1291
|
+
? hasDirenv
|
|
1292
|
+
? `(direnv)`
|
|
1293
|
+
: envVars.CLAUDE_CONFIG_DIR
|
|
1294
|
+
? `(.envrc → CLAUDE_CONFIG_DIR=${envVars.CLAUDE_CONFIG_DIR})`
|
|
1295
|
+
: `(.envrc, no CLAUDE_CONFIG_DIR)`
|
|
1296
|
+
: '';
|
|
1297
|
+
console.log(` ${repo} ${note}`);
|
|
1298
|
+
console.log(` → ${claudeTarget}/commands/tink`);
|
|
1299
|
+
}
|
|
1300
|
+
console.log('');
|
|
1301
|
+
|
|
1302
|
+
for (const repo of repos) {
|
|
1303
|
+
console.log(`▶ ${path.basename(repo)} (${repo})`);
|
|
1304
|
+
const envrc = path.join(repo, '.envrc');
|
|
1305
|
+
const extraEnv = hasDirenv ? {} : parseEnvrc(envrc, repo);
|
|
1306
|
+
const mergedEnv = { ...process.env, ...extraEnv };
|
|
1307
|
+
|
|
1308
|
+
let result;
|
|
1309
|
+
if (hasDirenv && fs.existsSync(envrc)) {
|
|
1310
|
+
result = spawnSync(
|
|
1311
|
+
'direnv', ['exec', repo, 'node', installScript, 'update', '--yes', '--scope=repo'],
|
|
1312
|
+
{ cwd: repo, env: process.env, stdio: 'inherit', encoding: 'utf8' }
|
|
1313
|
+
);
|
|
1314
|
+
} else {
|
|
1315
|
+
result = spawnSync(
|
|
1316
|
+
process.execPath, [installScript, 'update', '--yes', '--scope=repo'],
|
|
1317
|
+
{ cwd: repo, env: mergedEnv, stdio: 'inherit', encoding: 'utf8' }
|
|
1318
|
+
);
|
|
1319
|
+
}
|
|
1320
|
+
|
|
1321
|
+
if (result.status !== 0) {
|
|
1322
|
+
console.error(` ✗ failed (exit ${result.status})`);
|
|
1323
|
+
} else {
|
|
1324
|
+
console.log(` ✓ done`);
|
|
1325
|
+
}
|
|
1326
|
+
console.log('');
|
|
1327
|
+
}
|
|
1328
|
+
}
|
|
1329
|
+
|
|
1219
1330
|
async function main() {
|
|
1220
1331
|
if (command === 'help' || args.includes('--help')) {
|
|
1221
1332
|
usage();
|
|
1222
1333
|
process.exit(0);
|
|
1223
1334
|
}
|
|
1224
1335
|
|
|
1336
|
+
if (command === 'update' && args.includes('--all-repos')) {
|
|
1337
|
+
await runAllRepos();
|
|
1338
|
+
return;
|
|
1339
|
+
}
|
|
1340
|
+
|
|
1225
1341
|
if (command === 'dashboard') {
|
|
1226
1342
|
runDashboard();
|
|
1227
1343
|
return;
|
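Review note: `runAllRepos` chooses between two ways of launching the per-repo update. The sketch below isolates that choice as a pure function. `buildUpdateCommand` and its option names are hypothetical, while the argument lists are copied from the hunk above.

```javascript
// Returns the command line that would be spawned for one repo.
function buildUpdateCommand({ repo, installScript, hasDirenv, hasEnvrc, nodePath = process.execPath }) {
  const updateArgs = [installScript, 'update', '--yes', '--scope=repo'];
  if (hasDirenv && hasEnvrc) {
    // `direnv exec DIR COMMAND...` runs the command with DIR's environment loaded.
    return { cmd: 'direnv', args: ['exec', repo, 'node', ...updateArgs] };
  }
  // No direnv, or no .envrc: run node directly. In the hunk the environment is
  // then process.env merged with whatever the fallback parser extracted.
  return { cmd: nodePath, args: updateArgs };
}

const viaDirenv = buildUpdateCommand({
  repo: '/home/dev/src/app',
  installScript: '/pkg/bin/install.js',
  hasDirenv: true,
  hasEnvrc: true,
});
const direct = buildUpdateCommand({
  repo: '/home/dev/src/app',
  installScript: '/pkg/bin/install.js',
  hasDirenv: true,
  hasEnvrc: false,
  nodePath: 'node',
});
console.log(viaDirenv.cmd, viaDirenv.args.join(' '));
console.log(direct.cmd, direct.args.join(' '));
```

As far as I understand direnv, `direnv exec` refuses to load an `.envrc` that has not been approved with `direnv allow`, so an unapproved repo would probably show up as a failed update rather than run with a partial environment. That behavior is not visible in this diff and is stated here as an assumption.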
package/commands/cast.md
CHANGED
|
@@ -160,6 +160,38 @@ Optional current-run artifacts are created only when their harness is selected:
|
|
|
160
160
|
- `goals.json`: current-run goals for `goal-checkpoint`; keep 2-6 goals, one active goal, status, done criteria, verification, and evidence.
|
|
161
161
|
- `delegation.md`: handoff or parallel-work packets for `delegation-brief`; include packet scope, forbidden actions, expected evidence, and reconciliation notes. Do not start tmux panes, worktrees, workers, or external agents from this harness.
|
|
162
162
|
|
|
163
|
+
## Evidence Split
|
|
164
|
+
Evidence Split is a base-run habit, not a separate harness. It keeps real work small while the task is happening by splitting broad or uncertain work into evidence-sized packets.
|
|
165
|
+
|
|
166
|
+
Use Evidence Split at cast time and again during implementation when:
|
|
167
|
+
- the first plan has several uncertain facts,
|
|
168
|
+
- implementation starts coupling several files or concepts,
|
|
169
|
+
- a check fails and the next action is unclear,
|
|
170
|
+
- context is becoming broad or stale,
|
|
171
|
+
- independent verification, review, or handoff would reduce risk.
|
|
172
|
+
|
|
173
|
+
Skip it for tiny, obvious edits where a packet would not change the next action.
|
|
174
|
+
|
|
175
|
+
Packet vocabulary:
|
|
176
|
+
- `probe`: answer one unknown with 1-3 inputs.
|
|
177
|
+
- `patch`: make one narrow implementation change.
|
|
178
|
+
- `verify`: prove one success condition or failure recovery.
|
|
179
|
+
- `review`: inspect one risk, regression, or omission.
|
|
180
|
+
- `decision`: record one branch, chosen option, and evidence.
|
|
181
|
+
|
|
182
|
+
Represent packets in existing run state:
|
|
183
|
+
- `steps.json`: packetized steps and status.
|
|
184
|
+
- `context-map.json`: the input files, sources, or excluded context for each packet.
|
|
185
|
+
- `notes.md`: why work was split or re-split during implementation.
|
|
186
|
+
- `delegation.md`: only when `delegation-brief` is selected or another human/agent packet is explicitly needed.
|
|
187
|
+
|
|
188
|
+
Safety defaults:
|
|
189
|
+
- Do not start workers, tmux panes, worktrees, or external agents automatically.
|
|
190
|
+
- Packet outputs are evidence, risks, recommendations, or patch candidates by default; direct edits require the main agent's normal approval and ownership.
|
|
191
|
+
- Do not let multiple packets edit the same file concurrently.
|
|
192
|
+
- Keep secrets, public contracts, broad refactors, release/publish actions, and final reconciliation under the main agent's control.
|
|
193
|
+
- Keep each packet to 1-3 primary inputs when possible.
|
|
194
|
+
|
|
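Review note: the packet vocabulary and safety defaults above can be read as a small set of checks. The sketch below is one possible reading. The packet shape (`id`, `type`, `inputs`, `edits`) is hypothetical, since the diff does not define a `steps.json` schema, and it treats any two packets that edit the same file as a conflict.

```javascript
// Packet types named in the "Packet vocabulary" list.
const PACKET_TYPES = new Set(['probe', 'patch', 'verify', 'review', 'decision']);

// Returns a list of human-readable problems; an empty list means the packets
// respect the type vocabulary, the 1-3 input guideline, and single-file ownership.
function checkPackets(packets) {
  const problems = [];
  const editors = new Map(); // file -> id of the first packet that edits it
  for (const p of packets) {
    if (!PACKET_TYPES.has(p.type)) {
      problems.push(`${p.id}: unknown type "${p.type}"`);
    }
    if (p.inputs.length < 1 || p.inputs.length > 3) {
      problems.push(`${p.id}: expected 1-3 inputs, got ${p.inputs.length}`);
    }
    for (const file of p.edits ?? []) {
      if (editors.has(file)) {
        problems.push(`${p.id}: ${file} is already edited by ${editors.get(file)}`);
      } else {
        editors.set(file, p.id);
      }
    }
  }
  return problems;
}

// Illustrative packets only.
const packets = [
  { id: 'p1', type: 'probe', inputs: ['bin/install.js'] },
  { id: 'p2', type: 'patch', inputs: ['bin/install.js', 'README.md'], edits: ['bin/install.js'] },
  { id: 'p3', type: 'patch', inputs: ['bin/install.js'], edits: ['bin/install.js'] },
  { id: 'p4', type: 'refactor', inputs: ['a', 'b', 'c', 'd'] },
];
const problems = checkPackets(packets);
console.log(problems);
```

The text says "when possible" for the input limit, so a real check would likely warn rather than fail on that rule.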
163
195
|
Create `contract.json` before loading harness bodies. It should be short, factual, and based on the user request plus visible project context:
|
|
164
196
|
|
|
165
197
|
```json
|
|
@@ -480,12 +512,13 @@ This is the Lane 3 full path from Quick triage. Lanes 1 and 2 intentionally skip
|
|
|
480
512
|
- new pattern not covered yet
|
|
481
513
|
|
|
482
514
|
These are task types, not harness names. Generic types (code change, bug fix, research, review, docs) default to the base run; a harness is added only when a specialized one genuinely fits.
|
|
483
|
-
6.
|
|
515
|
+
6. Apply the Evidence Split check before choosing harnesses. If it changes the next action, represent the first packets in `steps.json` and connect each packet to context or verification evidence in `context-map.json`. Keep this check lightweight and skip it for tiny work.
|
|
516
|
+
7. Consider GJC-style visible-thinking overlays as normal Tink harnesses, not as new command surfaces:
|
|
484
517
|
- If the request is an ambiguous idea, early product concept, or underspecified implementation prompt, prefer `requirements-interview` before planning or coding. This is the default harness when Stitch is expected to trigger for goal ambiguity or missing acceptance criteria.
|
|
485
518
|
- If the request asks for a plan, architecture decision, large refactor, migration, or broad public contract change, consider `plan-consensus`.
|
|
486
519
|
- If the work naturally splits into multiple durable milestones, add `goal-checkpoint` and create `.tink/current/goals.json` after approval.
|
|
487
520
|
- If parallel review, independent verification, or handoff would reduce risk, add `delegation-brief` and create `.tink/current/delegation.md` after approval. This harness prepares briefs only; it never starts tmux, worktrees, workers, or external agents.
|
|
488
|
-
|
|
521
|
+
8. Consider focused work harnesses only when their trigger is strong enough to change the procedure:
|
|
489
522
|
- Use `issue-triage` for issue/PR/QA intake, ready-for-agent briefs, needs-info/wontfix decisions, or vertical issue slices.
|
|
490
523
|
- Use `bug-diagnosis-loop` for hard bugs, regressions, intermittent failures, or performance problems where a red-capable loop must come before code changes.
|
|
491
524
|
- Use `review-two-axis` for PR/branch/diff review when Standards and Spec should be reported separately.
|
|
@@ -496,7 +529,7 @@ This is the Lane 3 full path from Quick triage. Lanes 1 and 2 intentionally skip
|
|
|
496
529
|
- `goal-checkpoint` is REQUIRED (not optional) when ANY of these is true: the Goals list has 2+ goals; 2+ harnesses run sequentially; the plan is expected to need 4+ steps; or the work spans multiple components/directories. Create `goals.json` after approval.
|
|
497
530
|
- `plan-consensus` must be explicitly considered for any from-scratch implementation, reimplementation, migration, or public contract/API design. If skipped, record a one-line reason in the 오버레이 점검 line.
|
|
498
531
|
- The context budget and the "prefer 1-3 harnesses" guidance never justify dropping a REQUIRED overlay: overlays are cheap state files, not extra loaded context. A large task judged "fine with default harnesses" because the synthesis probe found a fit is a selection bug - the probe only answers whether a custom procedure is needed, not whether overlays are needed.
|
|
499
|
-
|
|
532
|
+
9. Pick the smallest effective set using the context budget policy below: the base run plus 0-3 specialized harnesses. When no specialized harness fits, select the base run alone - do not force a generic fit. Do not use a hard cap when several tiny harnesses add useful checks without crowding context. When the task is ambiguous (Stitch goal-ambiguity is expected to trigger), start with `requirements-interview` alone; add a second harness only after the user clarifies. Do not bundle 2+ harnesses for ambiguous tasks upfront.
|
|
500
533
|
|
|
501
534
|
After selecting, run a quick quality check using the index metadata for each chosen harness:
|
|
502
535
|
- If fewer than 2 words in `use_when` match the current task description (case-insensitive) → treat as a Stitch harness-mismatch signal
|
|
@@ -504,26 +537,26 @@ This is the Lane 3 full path from Quick triage. Lanes 1 and 2 intentionally skip
|
|
|
504
537
|
- If `asks` is empty or missing and the task goal is not self-evident → treat as a Stitch goal-ambiguity signal
|
|
505
538
|
Feed any signals into the Stitch evaluation at step 16.
|
|
506
539
|
|
|
507
|
-
|
|
508
|
-
|
|
509
|
-
|
|
510
|
-
|
|
511
|
-
|
|
512
|
-
|
|
513
|
-
|
|
514
|
-
|
|
515
|
-
|
|
516
|
-
|
|
517
|
-
|
|
518
|
-
|
|
519
|
-
|
|
540
|
+
10. Add any rule graph check candidates to `contract.json` verification if they are relevant and cheap. For risky commands, set `approval_required: true`.
|
|
541
|
+
11. Add opt-in guard candidates to `notes.md` only as suggestions. Do not register enforcement hooks unless the user separately approves.
|
|
542
|
+
12. Run the synthesis probe on the initial harness choice. The probe produces one of three outcomes: strong fit (0-1 yes), generic fit (2-3 yes), or no fit (4-5 yes or no harness matches).
|
|
543
|
+
13. If the probe finds no fit, load `harness-synthesis` and draft a domain-specific harness for this run instead of forcing a bad fit.
|
|
544
|
+
14. If the probe finds a generic fit (2-3 yes), propose a run-only draft harness or domain rules alongside the base run or selected harness. Do not save it by default.
|
|
545
|
+
15. If too many tools, skills, agents, or harnesses are available, load `harness-curation` and choose the smallest effective set before loading more context.
|
|
546
|
+
16. If lightweight signals show a recurring operating habit, use `harness-curation` (its habit calibration section) to make one advisory recommendation without loading a separate body.
|
|
547
|
+
17. If the user points to research, notes, examples, prior failures, or "what I learned today", synthesize from those inputs. Extract behavior-shaping rules and reusable procedure, not a summary.
|
|
548
|
+
18. Run Stitch once before committing to `.tink/current/`. If it triggers, show exactly one proposal before approval. Call `AskUserQuestion` as described in the Interaction policy section.
|
|
549
|
+
19. Ask for explicit approval before non-trivial work.
|
|
550
|
+
20. After approval, read only the selected harness files and any approved run-only draft.
|
|
551
|
+
21. Create `.tink/current/` files from the run state contract, including `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, `context-metrics-evaluation.json`, and `excluded-context.md`. If selected, also create `goals.json` for `goal-checkpoint` and `delegation.md` for `delegation-brief`.
|
|
552
|
+
22. Execute the first safe step immediately:
|
|
520
553
|
- inspect relevant files,
|
|
521
554
|
- run a read-only diagnostic,
|
|
522
555
|
- draft the first artifact,
|
|
523
556
|
- or reproduce the issue.
|
|
524
|
-
|
|
525
|
-
|
|
526
|
-
|
|
557
|
+
23. Keep `steps.json`, `notes.md`, `contract.json`, and `session.json` current as work progresses. Re-run Evidence Split when new uncertainty, coupling, failed checks, or context sprawl appears; update packetized steps and context evidence before continuing. When present, keep `goals.json` and `delegation.md` aligned with actual status and evidence. When the Progress display trigger applies, end every response with the progress block.
|
|
558
|
+
24. Before final, run `/tink:verify` behavior for required contract checks or state why verification is blocked.
|
|
559
|
+
25. If the task exposed a repeated mistake or reusable improvement, use the Reusable State Save Gate approval payload below. Save only after separate user approval.
|
|
527
560
|
|
|
528
561
|
|
|
529
562
|
## Synthesis probe
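Review note: step 12 in the list above maps the probe's yes-count to one of three outcomes, and steps 13 and 14 attach a follow-up to two of them. The sketch below encodes that mapping. The function names are hypothetical, the five-question range is inferred from the "4-5 yes" wording, and the action for a strong fit is an assumption.

```javascript
// Thresholds taken from step 12: 0-1 yes, 2-3 yes, 4-5 yes or no harness match.
function classifyProbe(yesCount, harnessMatched = true) {
  if (!Number.isInteger(yesCount) || yesCount < 0 || yesCount > 5) {
    throw new RangeError('yesCount must be an integer from 0 to 5');
  }
  if (!harnessMatched || yesCount >= 4) return 'no fit';
  if (yesCount >= 2) return 'generic fit';
  return 'strong fit';
}

// Follow-up actions paraphrased from steps 13 and 14.
function nextAction(outcome) {
  switch (outcome) {
    case 'no fit':
      return 'load harness-synthesis and draft a domain-specific harness';
    case 'generic fit':
      return 'propose a run-only draft harness or domain rules';
    default:
      return 'keep the initial harness choice'; // assumption, not stated in the diff
  }
}

for (let yes = 0; yes <= 5; yes += 1) {
  const outcome = classifyProbe(yes);
  console.log(yes, outcome, '->', nextAction(outcome));
}
```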
|
package/docs/planned-work-units.ko.md
CHANGED
|
@@ -91,17 +91,18 @@ Standalone CLI를 더 짧게 입력하고, 로컬 health report를 더 쉽게
|
|
|
91
91
|
- `dashboard`는 기본적으로 로컬 정적 파일만 만든다. 서버, watcher, hidden cache, 자동 하네스 수정은 하지 않는다.
|
|
92
92
|
- 생성 파일 경로가 플랫폼별로 안정화된 뒤에만 선택적인 open/export flag를 검토한다.
|
|
93
93
|
|
|
94
|
-
##
|
|
94
|
+
## Evidence Split / Parallel Evidence
|
|
95
95
|
|
|
96
|
-
작업
|
|
96
|
+
작업 병렬화보다 먼저, Tink의 기본 작업 루프에 Evidence Split을 넣는다. Tink를 별도 multi-agent runtime으로 만들지 않고, 큰 작업을 작은 증거 packet으로 나누는 기본 동작부터 안정화한다. 상세 연구 기록은 `docs/swarm-fast-lane.ko.md`와 `docs/swarm-fast-lane.md`에 둔다.
|
|
97
97
|
|
|
98
|
-
-
|
|
99
|
-
-
|
|
98
|
+
- `/tink:cast`와 `$tink:cast`는 하네스 선택 전에 `probe`, `patch`, `verify`, `review`, `decision` packet으로 나눌 수 있는지 점검한다.
|
|
99
|
+
- 실제 작업 중에도 불확실성, 검증 실패, context 확대, 변경 결합이 생기면 다시 packet으로 나눈다.
|
|
100
|
+
- packet은 전체 작업이 아니라 1-3개 입력만 가진 작은 단위를 본다.
|
|
101
|
+
- 외부 worker가 필요할 때도 기본적으로 파일을 직접 수정하지 않고 evidence와 patch candidate만 반환한다.
|
|
100
102
|
- 메인 에이전트만 최종 patch 선택, 파일 수정, 검증을 책임진다.
|
|
101
103
|
- 성공 지표는 "항상 더 빠름"이 아니라 main context 감소, 재작업 감소, 실패 조기 발견, 검증 통과율 유지 또는 개선으로 둔다.
|
|
102
|
-
- 초기 모드는
|
|
103
|
-
-
|
|
104
|
-
- worker 출력은 300단어 이하, evidence-only, confidence 포함으로 제한한다.
|
|
104
|
+
- 초기 모드는 core behavior인 Evidence Split으로 두고, 실제 worker runtime은 별도 후속 작업으로 미룬다.
|
|
105
|
+
- worker 출력은 future runtime에서도 300단어 이하, evidence-only, confidence 포함으로 제한한다.
|
|
105
106
|
- public contract, secrets, 넓은 repo scan, 동일 파일 동시 수정이 필요한 작업에서는 선택하지 않는다.
|
|
106
107
|
|
|
107
108
|
## 제외
|
package/docs/planned-work-units.md
CHANGED
|
@@ -91,17 +91,18 @@ Make the standalone CLI easier to type and make the local health report easier t
|
|
|
91
91
|
- Keep `dashboard` local and static by default: no server, watcher, hidden cache, or automatic harness edits.
|
|
92
92
|
- Allow an optional open/export flag only after the generated file path behavior is stable across platforms.
|
|
93
93
|
|
|
94
|
-
##
|
|
94
|
+
## Evidence Split / Parallel Evidence
|
|
95
95
|
|
|
96
|
-
|
|
96
|
+
Before adding parallel workers, add Evidence Split to Tink's default work loop. Tink should not become a separate multi-agent runtime; it should first make large work divisible into small evidence packets. The research notes live in `docs/swarm-fast-lane.ko.md` and `docs/swarm-fast-lane.md`.
|
|
97
97
|
|
|
98
|
-
-
|
|
99
|
-
-
|
|
98
|
+
- `/tink:cast` and `$tink:cast` check whether work should split into `probe`, `patch`, `verify`, `review`, or `decision` packets before harness selection.
|
|
99
|
+
- During implementation, Tink re-splits work when uncertainty, failed checks, context sprawl, or coupled changes appear.
|
|
100
|
+
- Packets see only 1-3 inputs, not the whole task.
|
|
101
|
+
- If external workers are used later, they do not edit files by default; they return evidence and patch candidates.
|
|
100
102
|
- The main agent owns final patch selection, file edits, and verification.
|
|
101
103
|
- Success is measured by less main-agent context, less rework, earlier failure detection, and equal or better verification pass rate, not by claiming universal raw speed.
|
|
102
|
-
-
|
|
103
|
-
-
|
|
104
|
-
- Worker output is capped at 300 words and must include evidence and confidence.
|
|
104
|
+
- The initial implementation is the core Evidence Split behavior; actual worker runtime remains deferred.
|
|
105
|
+
- Future worker output should be capped at 300 words and include evidence and confidence.
|
|
105
106
|
- Do not select it for unclear public contracts, secrets, broad repository scans, or same-file concurrent edits.
|
|
106
107
|
|
|
107
108
|
## Excluded
|
package/docs/swarm-fast-lane.ko.md
CHANGED
|
@@ -1,16 +1,16 @@
|
|
|
1
|
-
#
|
|
1
|
+
# Evidence Split / Parallel Evidence 연구 계획
|
|
2
2
|
|
|
3
|
-
이 문서는 멀티
|
|
3
|
+
이 문서는 멀티 에이전트 작업 병렬화의 전 단계로, Tink가 큰 작업을 작은 evidence packet으로 나누는 기본 동작을 갖도록 제한하는 연구 계획이다. 목표는 "에이전트를 많이 띄우기"가 아니라, 작은 컨텍스트 패킷으로 조사, 수정, 검증, 리뷰, 결정을 분리해 메인 에이전트의 재작업과 전체 컨텍스트 부담을 줄이는 것이다.
|
|
4
4
|
|
|
5
5
|
## 문제 정의
|
|
6
6
|
|
|
7
7
|
일반적인 멀티 에이전트 병렬화는 토큰을 더 많이 쓴다. 각 worker가 같은 문맥을 다시 읽고, 서로 다른 수정이 충돌하며, 메인 에이전트가 합산 비용을 다시 치르기 때문이다.
|
|
8
8
|
|
|
9
|
-
|
|
9
|
+
Evidence Split은 이 문제를 반대로 접근한다.
|
|
10
10
|
|
|
11
|
-
-
|
|
12
|
-
-
|
|
13
|
-
- worker가 기본적으로 직접 수정하지 않는다.
|
|
11
|
+
- packet이 전체 작업을 이해하지 않는다.
|
|
12
|
+
- packet이 넓은 파일을 읽지 않는다.
|
|
13
|
+
- 외부 worker가 쓰이더라도 기본적으로 직접 수정하지 않는다.
|
|
14
14
|
- worker 출력은 짧은 evidence와 patch candidate로 제한한다.
|
|
15
15
|
- 메인 에이전트만 최종 경로를 선택하고 파일을 수정한다.
|
|
16
16
|
|
|
@@ -80,14 +80,16 @@ worker는 파일 수정 없이 관련 파일, 위험, 테스트 후보만 찾는
|
|
|
80
80
|
|
|
81
81
|
worker에게 의도적으로 불완전한 최소 컨텍스트만 준다. 목적은 좋은 구현이 아니라, 작은 정보로도 잡히는 문제를 싸게 찾는 것이다.
|
|
82
82
|
|
|
83
|
-
##
|
|
83
|
+
## Core Behavior 계약
|
|
84
84
|
|
|
85
|
-
|
|
85
|
+
Evidence Split은 별도 하네스가 아니라 `/tink:cast`와 `$tink:cast`의 기본 동작이다. 다음 조건에서 사용한다.
|
|
86
86
|
|
|
87
87
|
- 작업이 2-5개의 독립 packet으로 나뉜다.
|
|
88
88
|
- 각 packet은 입력 파일 또는 질문이 1-3개로 제한된다.
|
|
89
|
-
-
|
|
90
|
-
-
|
|
89
|
+
- packet type은 `probe`, `patch`, `verify`, `review`, `decision` 중 하나다.
|
|
90
|
+
- 실제 작업 중 불확실성, 검증 실패, context 확대, 변경 결합이 생기면 다시 packet으로 나눈다.
|
|
91
|
+
- 외부 worker의 출력은 future runtime에서도 300단어 이하로 제한한다.
|
|
92
|
+
- 외부 worker는 기본적으로 직접 파일을 수정하지 않는다.
|
|
91
93
|
- worker 출력에는 evidence, 추천 행동, confidence가 포함된다.
|
|
92
94
|
- 메인 에이전트가 최종 patch와 검증을 책임진다.
|
|
93
95
|
|
|
@@ -118,15 +120,14 @@ worker에게 의도적으로 불완전한 최소 컨텍스트만 준다. 목적
|
|
|
118
120
|
|
|
119
121
|
첫 구현 slice는 다음을 완료로 본다.
|
|
120
122
|
|
|
121
|
-
- `
|
|
122
|
-
-
|
|
123
|
-
- worker packet 형식이 `.tink/current/delegation.md` 또는 별도 run artifact로 표현된다.
|
|
123
|
+
- Evidence Split이 Tink core rules와 `/tink:cast`, `$tink:cast` 문서에 기본 동작으로 들어간다.
|
|
124
|
+
- packet 형식이 `steps.json`, `context-map.json`, `notes.md`, 필요 시 `.tink/current/delegation.md`로 표현된다.
|
|
124
125
|
- worker 직접 수정은 기본 비활성이다.
|
|
125
|
-
-
|
|
126
|
+
- 작은 작업에서는 생략 가능하다는 lightweight rule이 있다.
|
|
126
127
|
- 검증은 "더 빠름"을 단정하지 않고, context 감소와 재작업 감소 근거를 기록한다.
|
|
127
128
|
|
|
128
129
|
## 열린 질문
|
|
129
130
|
|
|
130
131
|
- 실제 worker 실행은 Codex/Claude Code의 기존 기능을 얇게 호출할지, Tink는 packet 문서화까지만 할지 결정해야 한다.
|
|
131
|
-
- worker 결과 schema를 `delegation-brief`에 통합할지, 별도
|
|
132
|
-
- fast
|
|
132
|
+
- worker 결과 schema를 `delegation-brief`에 통합할지, 별도 runtime artifact로 둘지 결정해야 한다.
|
|
133
|
+
- `swarm-fast-lane` 이름은 연구 문서의 임시 이름으로만 남기고, 사용자 문구는 Evidence Split 또는 Parallel Evidence를 우선한다.
|
package/docs/swarm-fast-lane.md
CHANGED
|
@@ -1,16 +1,16 @@
|
|
|
1
|
-
#
|
|
1
|
+
# Evidence Split / Parallel Evidence Research Plan
|
|
2
2
|
|
|
3
|
-
This document describes
|
|
3
|
+
This document describes the step before multi-agent parallelism: Tink should first split large work into small evidence packets without becoming a separate runtime. The goal is not to spawn more agents by default. The goal is to separate probe, patch, verify, review, and decision work into tiny context packets so the main agent reduces rework and context load.
|
|
4
4
|
|
|
5
5
|
## Problem
|
|
6
6
|
|
|
7
7
|
Naive multi-agent parallelism usually spends more tokens. Each worker rereads context, independent edits conflict, and the main agent still pays a reconciliation cost.
|
|
8
8
|
|
|
9
|
-
|
|
9
|
+
Evidence Split inverts that model.
|
|
10
10
|
|
|
11
|
-
-
|
|
12
|
-
-
|
|
13
|
-
-
|
|
11
|
+
- Packets do not understand the whole task.
|
|
12
|
+
- Packets do not read broad context.
|
|
13
|
+
- If external workers are used, they do not edit files by default.
|
|
14
14
|
- Worker output is limited to short evidence and patch candidates.
|
|
15
15
|
- The main agent chooses the final path and owns file edits.
|
|
16
16
|
|
|
@@ -80,14 +80,16 @@ Workers look only for reasons the current implementation approach will fail. Thi
|
|
|
80
80
|
|
|
81
81
|
Workers intentionally receive incomplete minimal context. The point is not high-quality implementation; it is cheaply detecting problems that are visible with little information.
|
|
82
82
|
|
|
83
|
-
##
|
|
83
|
+
## Core Behavior Contract
|
|
84
84
|
|
|
85
|
-
|
|
85
|
+
Evidence Split is not a separate harness. It is default behavior inside `/tink:cast` and `$tink:cast`. Use it when:
|
|
86
86
|
|
|
87
87
|
- the task splits into 2-5 independent packets
|
|
88
88
|
- each packet is limited to 1-3 input files or questions
|
|
89
|
-
- each
|
|
90
|
-
-
|
|
89
|
+
- each packet type is `probe`, `patch`, `verify`, `review`, or `decision`
|
|
90
|
+
- work should be re-split during implementation because uncertainty, failed checks, context sprawl, or coupled changes appeared
|
|
91
|
+
- future worker output is limited to 300 words
|
|
92
|
+
- external workers do not edit files by default
|
|
91
93
|
- worker output includes evidence, recommended action, and confidence
|
|
92
94
|
- the main agent owns final patching and verification
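The contract bullets above can be read as a handful of mechanical checks. The following is a minimal sketch only: the in-memory packet shape (`type`, `inputs`) and the function names `checkPackets` and `withinWordCap` are assumptions made for illustration. The source stores packets in run files and does not describe a validator.

```javascript
// Hypothetical sketch: the packet shape ({ type, inputs }) is an assumption,
// not a Tink schema. The numeric limits come from the contract bullets.
const PACKET_TYPES = new Set(["probe", "patch", "verify", "review", "decision"]);

// Returns a list of reasons why the contract does not hold (empty when it does).
function checkPackets(packets) {
  const problems = [];
  if (packets.length < 2 || packets.length > 5) {
    problems.push(`expected 2-5 packets, got ${packets.length}`);
  }
  packets.forEach((packet, index) => {
    if (!PACKET_TYPES.has(packet.type)) {
      problems.push(`packet ${index}: unknown type "${packet.type}"`);
    }
    const inputCount = Array.isArray(packet.inputs) ? packet.inputs.length : 0;
    if (inputCount < 1 || inputCount > 3) {
      problems.push(`packet ${index}: expected 1-3 inputs, got ${inputCount}`);
    }
  });
  return problems;
}

// Future worker output is limited to 300 words; this counts whitespace-separated words.
function withinWordCap(text, cap = 300) {
  const words = text.trim().split(/\s+/).filter(Boolean);
  return words.length <= cap;
}
```

A non-empty result from `checkPackets` would suggest that Evidence Split is not a good fit for the task as currently split, rather than that something failed.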
|
|
93
95
|
|
|
@@ -118,15 +120,14 @@ The first version can start with estimates, but run artifacts should record evid
|
|
|
118
120
|
|
|
119
121
|
The first implementation slice is done when:
|
|
120
122
|
|
|
121
|
-
-
|
|
122
|
-
-
|
|
123
|
-
- worker packet format is represented in `.tink/current/delegation.md` or another run artifact
|
|
123
|
+
- Evidence Split is documented as default behavior in Tink core rules and `/tink:cast`, `$tink:cast`
|
|
124
|
+
- packet format is represented in `steps.json`, `context-map.json`, `notes.md`, and optionally `.tink/current/delegation.md`
|
|
124
125
|
- direct worker edits are disabled by default
|
|
125
|
-
-
|
|
126
|
+
- tiny work can skip the packet ceremony
|
|
126
127
|
- verification records context reduction and rework reduction evidence instead of claiming raw speed
|
|
127
128
|
|
|
128
129
|
## Open Questions
|
|
129
130
|
|
|
130
131
|
- Should actual worker execution call existing Codex/Claude Code features, or should Tink only document packets?
|
|
131
|
-
- Should worker result schema extend `delegation-brief`, or should
|
|
132
|
-
-
|
|
132
|
+
- Should worker result schema extend `delegation-brief`, or should it use a separate runtime artifact?
|
|
133
|
+
- Keep `swarm-fast-lane` only as a research placeholder; prefer Evidence Split or Parallel Evidence in user-facing copy.
|
package/package.json
CHANGED
|
package/templates/claude/commands/tink/cast.md
CHANGED
|
@@ -160,6 +160,38 @@ Optional current-run artifacts are created only when their harness is selected:
|
|
|
160
160
|
- `goals.json`: current-run goals for `goal-checkpoint`; keep 2-6 goals, one active goal, status, done criteria, verification, and evidence.
|
|
161
161
|
- `delegation.md`: handoff or parallel-work packets for `delegation-brief`; include packet scope, forbidden actions, expected evidence, and reconciliation notes. Do not start tmux panes, worktrees, workers, or external agents from this harness.
|
|
162
162
|
|
|
163
|
+
## Evidence Split
|
|
164
|
+
Evidence Split is a base-run habit, not a separate harness. It keeps real work small while the task is happening by splitting broad or uncertain work into evidence-sized packets.
|
|
165
|
+
|
|
166
|
+
Use Evidence Split at cast time and again during implementation when:
|
|
167
|
+
- the first plan has several uncertain facts,
|
|
168
|
+
- implementation starts coupling several files or concepts,
|
|
169
|
+
- a check fails and the next action is unclear,
|
|
170
|
+
- context is becoming broad or stale,
|
|
171
|
+
- independent verification, review, or handoff would reduce risk.
|
|
172
|
+
|
|
173
|
+
Skip it for tiny, obvious edits where a packet would not change the next action.
|
|
174
|
+
|
|
175
|
+
Packet vocabulary:
|
|
176
|
+
- `probe`: answer one unknown with 1-3 inputs.
|
|
177
|
+
- `patch`: make one narrow implementation change.
|
|
178
|
+
- `verify`: prove one success condition or failure recovery.
|
|
179
|
+
- `review`: inspect one risk, regression, or omission.
|
|
180
|
+
- `decision`: record one branch, chosen option, and evidence.
|
|
181
|
+
|
|
182
|
+
Represent packets in existing run state:
|
|
183
|
+
- `steps.json`: packetized steps and status.
|
|
184
|
+
- `context-map.json`: the input files, sources, or excluded context for each packet.
|
|
185
|
+
- `notes.md`: why work was split or re-split during implementation.
|
|
186
|
+
- `delegation.md`: only when `delegation-brief` is selected or another human/agent packet is explicitly needed.
|
|
187
|
+
|
|
188
|
+
Safety defaults:
|
|
189
|
+
- Do not start workers, tmux panes, worktrees, or external agents automatically.
|
|
190
|
+
- Packet outputs are evidence, risks, recommendations, or patch candidates by default; direct edits require the main agent's normal approval and ownership.
|
|
191
|
+
- Do not let multiple packets edit the same file concurrently.
|
|
192
|
+
- Keep secrets, public contracts, broad refactors, release/publish actions, and final reconciliation under the main agent's control.
|
|
193
|
+
- Keep each packet to 1-3 primary inputs when possible.
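To make the "existing run state" idea concrete, here is one possible way packets could be projected onto `steps.json` and `context-map.json`. This is a sketch under assumptions: the source names the files but not their schemas, so every field name below (`id`, `goal`, `status`, `inputs`, `excluded`) and the helper `packetize` are hypothetical.

```javascript
// Hypothetical field names throughout: the source names the run files
// (steps.json, context-map.json) but does not define their schemas.
function packetize(packets) {
  // One step per packet, in order, all starting as pending.
  const steps = packets.map((packet, index) => ({
    id: `packet-${index + 1}`,
    type: packet.type, // probe | patch | verify | review | decision
    goal: packet.goal,
    status: "pending",
  }));

  // Per-packet context: what the packet may read and what is deliberately left out.
  const contextMap = Object.fromEntries(
    packets.map((packet, index) => [
      `packet-${index + 1}`,
      { inputs: [...packet.inputs], excluded: packet.excluded ?? [] },
    ])
  );

  return { steps, contextMap };
}

const runState = packetize([
  { type: "probe", goal: "Find where the config dir is resolved", inputs: ["bin/install.js"] },
  { type: "verify", goal: "Confirm the override is honored", inputs: ["bin/install.js"], excluded: ["docs/"] },
]);
console.log(JSON.stringify(runState, null, 2));
```

Keeping the step list and the context map keyed by the same packet id is one way to connect each packet to its context evidence without adding a new run file.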
|
|
194
|
+
|
|
163
195
|
Create `contract.json` before loading harness bodies. It should be short, factual, and based on the user request plus visible project context:
|
|
164
196
|
|
|
165
197
|
```json
|
|
@@ -480,12 +512,13 @@ This is the Lane 3 full path from Quick triage. Lanes 1 and 2 intentionally skip
|
|
|
480
512
|
- new pattern not covered yet
|
|
481
513
|
|
|
482
514
|
These are task types, not harness names. Generic types (code change, bug fix, research, review, docs) default to the base run; a harness is added only when a specialized one genuinely fits.
|
|
483
|
-
6.
|
|
515
|
+
6. Apply the Evidence Split check before choosing harnesses. If it changes the next action, represent the first packets in `steps.json` and connect each packet to context or verification evidence in `context-map.json`. Keep this check lightweight and skip it for tiny work.
|
|
516
|
+
7. Consider GJC-style visible-thinking overlays as normal Tink harnesses, not as new command surfaces:
|
|
484
517
|
- If the request is an ambiguous idea, early product concept, or underspecified implementation prompt, prefer `requirements-interview` before planning or coding. This is the default harness when Stitch is expected to trigger for goal ambiguity or missing acceptance criteria.
|
|
485
518
|
- If the request asks for a plan, architecture decision, large refactor, migration, or broad public contract change, consider `plan-consensus`.
|
|
486
519
|
- If the work naturally splits into multiple durable milestones, add `goal-checkpoint` and create `.tink/current/goals.json` after approval.
|
|
487
520
|
- If parallel review, independent verification, or handoff would reduce risk, add `delegation-brief` and create `.tink/current/delegation.md` after approval. This harness prepares briefs only; it never starts tmux, worktrees, workers, or external agents.
|
|
488
|
-
|
|
521
|
+
8. Consider focused work harnesses only when their trigger is strong enough to change the procedure:
|
|
489
522
|
- Use `issue-triage` for issue/PR/QA intake, ready-for-agent briefs, needs-info/wontfix decisions, or vertical issue slices.
|
|
490
523
|
- Use `bug-diagnosis-loop` for hard bugs, regressions, intermittent failures, or performance problems where a red-capable loop must come before code changes.
|
|
491
524
|
- Use `review-two-axis` for PR/branch/diff review when Standards and Spec should be reported separately.
|
|
@@ -496,7 +529,7 @@ This is the Lane 3 full path from Quick triage. Lanes 1 and 2 intentionally skip
|
|
|
496
529
|
- `goal-checkpoint` is REQUIRED (not optional) when ANY of these is true: the Goals list has 2+ goals; 2+ harnesses run sequentially; the plan is expected to need 4+ steps; or the work spans multiple components/directories. Create `goals.json` after approval.
|
|
497
530
|
- `plan-consensus` must be explicitly considered for any from-scratch implementation, reimplementation, migration, or public contract/API design. If skipped, record a one-line reason in the 오버레이 점검 line.
|
|
498
531
|
- The context budget and the "prefer 1-3 harnesses" guidance never justify dropping a REQUIRED overlay: overlays are cheap state files, not extra loaded context. A large task judged "fine with default harnesses" because the synthesis probe found a fit is a selection bug - the probe only answers whether a custom procedure is needed, not whether overlays are needed.
|
|
499
|
-
|
|
532
|
+
9. Pick the smallest effective set using the context budget policy below: the base run plus 0-3 specialized harnesses. When no specialized harness fits, select the base run alone - do not force a generic fit. Do not use a hard cap when several tiny harnesses add useful checks without crowding context. When the task is ambiguous (Stitch goal-ambiguity is expected to trigger), start with `requirements-interview` alone; add a second harness only after the user clarifies. Do not bundle 2+ harnesses for ambiguous tasks upfront.
|
|
500
533
|
|
|
501
534
|
After selecting, run a quick quality check using the index metadata for each chosen harness:
|
|
502
535
|
- If fewer than 2 words in `use_when` match the current task description (case-insensitive) → treat as a Stitch harness-mismatch signal
|
|
@@ -504,26 +537,26 @@ This is the Lane 3 full path from Quick triage. Lanes 1 and 2 intentionally skip
|
|
|
504
537
|
- If `asks` is empty or missing and the task goal is not self-evident → treat as a Stitch goal-ambiguity signal
|
|
505
538
|
Feed any signals into the Stitch evaluation at step 16.
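The quality check above is a simple word-overlap test plus an emptiness test. A minimal sketch follows, assuming `use_when` is a plain string and `asks` is a string or array in the index metadata. Neither type is stated in the source, and the tokenizer (runs of letters and digits) and the function name `qualitySignals` are also illustrative choices. Only the two rules visible in the diff are covered.

```javascript
// Assumptions: `use_when` is a string, `asks` is a string or array, and
// "word" means a run of letters or digits. None of this is specified by the source.
function wordsOf(text) {
  return new Set((text ?? "").toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? []);
}

function qualitySignals(harness, taskDescription, goalIsSelfEvident) {
  const signals = [];

  // Rule 1: fewer than 2 case-insensitive word matches means harness mismatch.
  const taskWords = wordsOf(taskDescription);
  let matches = 0;
  for (const word of wordsOf(harness.use_when)) {
    if (taskWords.has(word)) matches += 1;
  }
  if (matches < 2) signals.push("harness-mismatch");

  // Rule 2: empty or missing `asks` with a non-obvious goal means goal ambiguity.
  const asksEmpty = harness.asks == null || harness.asks.length === 0;
  if (asksEmpty && !goalIsSelfEvident) signals.push("goal-ambiguity");

  return signals;
}
```

The returned signal names mirror the wording of the check. How Stitch consumes them is not shown in this diff.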
|
|
506
539
|
|
|
507
|
-
|
|
508
|
-
|
|
509
|
-
|
|
510
|
-
|
|
511
|
-
|
|
512
|
-
|
|
513
|
-
|
|
514
|
-
|
|
515
|
-
|
|
516
|
-
|
|
517
|
-
|
|
518
|
-
|
|
519
|
-
|
|
540
|
+
10. Add any rule graph check candidates to `contract.json` verification if they are relevant and cheap. For risky commands, set `approval_required: true`.
|
|
541
|
+
11. Add opt-in guard candidates to `notes.md` only as suggestions. Do not register enforcement hooks unless the user separately approves.
|
|
542
|
+
12. Run the synthesis probe on the initial harness choice. The probe produces one of three outcomes: strong fit (0-1 yes), generic fit (2-3 yes), or no fit (4-5 yes or no harness matches).
|
|
543
|
+
13. If the probe finds no fit, load `harness-synthesis` and draft a domain-specific harness for this run instead of forcing a bad fit.
|
|
544
|
+
14. If the probe finds a generic fit (2-3 yes), propose a run-only draft harness or domain rules alongside the base run or selected harness. Do not save it by default.
|
|
545
|
+
15. If too many tools, skills, agents, or harnesses are available, load `harness-curation` and choose the smallest effective set before loading more context.
|
|
546
|
+
16. If lightweight signals show a recurring operating habit, use `harness-curation` (its habit calibration section) to make one advisory recommendation without loading a separate body.
|
|
547
|
+
17. If the user points to research, notes, examples, prior failures, or "what I learned today", synthesize from those inputs. Extract behavior-shaping rules and reusable procedure, not a summary.
|
|
548
|
+
18. Run Stitch once before committing to `.tink/current/`. If it triggers, show exactly one proposal before approval. Call `AskUserQuestion` as described in the Interaction policy section.
|
|
549
|
+
19. Ask for explicit approval before non-trivial work.
|
|
550
|
+
20. After approval, read only the selected harness files and any approved run-only draft.
|
|
551
|
+
21. Create `.tink/current/` files from the run state contract, including `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, `context-metrics-evaluation.json`, and `excluded-context.md`. If selected, also create `goals.json` for `goal-checkpoint` and `delegation.md` for `delegation-brief`.
|
|
552
|
+
22. Execute the first safe step immediately:
|
|
520
553
|
- inspect relevant files,
|
|
521
554
|
- run a read-only diagnostic,
|
|
522
555
|
- draft the first artifact,
|
|
523
556
|
- or reproduce the issue.
|
|
524
|
-
|
|
525
|
-
|
|
526
|
-
|
|
557
|
+
23. Keep `steps.json`, `notes.md`, `contract.json`, and `session.json` current as work progresses. Re-run Evidence Split when new uncertainty, coupling, failed checks, or context sprawl appears; update packetized steps and context evidence before continuing. When present, keep `goals.json` and `delegation.md` aligned with actual status and evidence. When the Progress display trigger applies, end every response with the progress block.
|
|
558
|
+
24. Before final, run `/tink:verify` behavior for required contract checks or state why verification is blocked.
|
|
559
|
+
25. If the task exposed a repeated mistake or reusable improvement, use the Reusable State Save Gate approval payload below. Save only after separate user approval.
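Steps 12-14 above describe a three-way branch on the synthesis probe. It can be sketched as a small pure function, assuming the probe is a set of five yes/no questions (implied by the 0-5 range, but not listed in this excerpt). The function name `classifyProbe` and the returned field names are hypothetical.

```javascript
// Sketch of the probe outcome mapping from steps 12-14.
// Assumption: the probe yields a count of "yes" answers between 0 and 5.
function classifyProbe(yesCount, harnessMatched = true) {
  if (!Number.isInteger(yesCount) || yesCount < 0 || yesCount > 5) {
    throw new RangeError("yesCount must be an integer from 0 to 5");
  }
  if (!harnessMatched || yesCount >= 4) {
    // No fit: draft a domain-specific harness instead of forcing a bad fit.
    return { outcome: "no-fit", next: "load harness-synthesis" };
  }
  if (yesCount >= 2) {
    // Generic fit: propose a run-only draft alongside the current choice; do not save it.
    return { outcome: "generic-fit", next: "propose run-only draft" };
  }
  // Strong fit: keep the base run or selected harness as is.
  return { outcome: "strong-fit", next: "keep selection" };
}
```

Treating "no harness matches" as a separate boolean keeps the yes-count branch readable and matches the wording "4-5 yes or no harness matches".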
|
|
527
560
|
|
|
528
561
|
|
|
529
562
|
## Synthesis probe
|
package/templates/codex/skills/tink-core/RULES.md
CHANGED
|
@@ -26,23 +26,24 @@ Accept legacy `$tink <action>` spelling for compatibility, but present `$tink:<a
|
|
|
26
26
|
6. If `.tink/current/` exists and continuity is uncertain, read `plan.md`, `checks.md`, `steps.json`, `notes.md`, `answers.md`, and `contract.json` when present; summarize goal, last safe point, next step, open questions, and verification; then ask resume/archive/replace/cancel before continuing.
|
|
27
27
|
7. Run the synthesis probe before committing to `.tink/current/`. Strong fit keeps the harness; generic fit adds a run-only draft; no fit loads `harness-synthesis`.
|
|
28
28
|
8. If too many tools, skills, agents, or harnesses are available, use `harness-curation` to choose the smallest effective set before loading more context.
|
|
29
|
-
9. Treat
|
|
30
|
-
10.
|
|
31
|
-
11.
|
|
32
|
-
12.
|
|
33
|
-
13.
|
|
34
|
-
14.
|
|
35
|
-
15.
|
|
36
|
-
16.
|
|
37
|
-
17.
|
|
38
|
-
18.
|
|
39
|
-
19.
|
|
40
|
-
20.
|
|
41
|
-
21.
|
|
42
|
-
22.
|
|
43
|
-
23.
|
|
44
|
-
24.
|
|
45
|
-
25.
|
|
29
|
+
9. Treat Evidence Split as a base-run habit, not a harness: for non-trivial work, first ask whether the task should be split into `probe`, `patch`, `verify`, `review`, or `decision` packets. Use it at cast time and again during implementation when uncertainty grows, a check fails, context gets broad, or several changes start to couple. Keep it lightweight for tiny tasks and skip it when it would add ceremony without changing the next action.
|
|
30
|
+
10. Treat visible-thinking and focused work workflows as ordinary Tink harness choices, not new commands. Actively consider them when their trigger changes the procedure: use `requirements-interview` for ambiguity, unclear scope, or missing acceptance criteria; `plan-consensus` for broad plans, migrations, API/schema/contract changes, or tradeoffs; `goal-checkpoint` for multi-file, multi-phase, resumed, release, or long runs; `delegation-brief` for handoff, independent verification, parallel review, or another agent/human brief; `issue-triage` for issue/PR/QA intake or vertical slices; `bug-diagnosis-loop` for hard bugs that need a red-capable loop before code changes; `review-two-axis` for Standards/Spec diff review; `decision-map` for multi-session unresolved decisions; and `architecture-deepening` for deep module, interface, seam, leverage, locality, or testability work.
|
|
31
|
+
11. Run Stitch once before committing to `.tink/current/`: evaluate every time, show exactly one proposal only for high-impact quality or safety branches, and use the configured language.
|
|
32
|
+
12. For non-trivial `$tink:cast` runs, ask for current-run approval before creating `.tink/current/`, loading harness bodies, editing files, or executing the first step. Codex must not silently treat a command invocation as approval.
|
|
33
|
+
13. Use `request_user_input` for choice prompts when available. Otherwise stop and ask one concise blocking approval question directly in chat. Do not continue until the user answers.
|
|
34
|
+
14. Treat reusable saves as a separate hard approval gate for `.tink/memory/*`, `.tink/harnesses/*`, `.tink/rules/*`, `.tink/config.json`, Codex skill files, and template/plugin files that affect future installs.
|
|
35
|
+
15. Current-run approval never authorizes reusable-state writes. Before saving reusable state, show operation, destination files, exact entry or patch summary, reusable reason, sensitive content excluded, and rollback/removal path.
|
|
36
|
+
16. Before saving a reusable rule graph update, run a structural gate: duplicate, breadth, evidence, verification, Claude Code/Codex compatibility, macOS/Windows compatibility, and portable commands. AI may propose a rule; saving it still requires separate approval.
|
|
37
|
+
17. `$tink:frog` may inspect rule quality as well as harness quality. Prefer keep, rewrite, split, merge, or needs-evidence recommendations before any removal proposal.
|
|
38
|
+
18. For `$tink:weave` or `$tink:frog`, prepare the harness health summary before ranking candidates. If `.tink/tools/generate-harness-lifecycle-summary.mjs` exists, run `node .tink/tools/generate-harness-lifecycle-summary.mjs` from the repo root and then read `.tink/maintenance/harness-lifecycle.json`. If the generator is missing, continue from compact run, queue, ledger, and friction evidence.
|
|
39
|
+
19. When `.tink/maintenance/harness-lifecycle.json` or another file following `.tink/schemas/harness-lifecycle.schema.json` exists, treat it as a plain harness health summary. Use `confidence`, `evidence_grade`, `evidence_handles`, and `safe_next_action` to prioritize `$tink:weave` or `$tink:frog` candidates, but do not treat it as approval. Low-confidence entries stay as observation. Harness edits, rule updates, memory saves, merges, archives, and deletions still require the reusable-state approval gate.
|
|
40
|
+
20. After approval, create `.tink/current/plan.md`, `checks.md`, `steps.json`, `notes.md`, `answers.md`, `contract.json`, `session.json`, `context-pack.md`, `context-map.json`, `context-metrics-evaluation.json`, and `excluded-context.md`. If selected, also create `.tink/current/goals.json` for `goal-checkpoint` and `.tink/current/delegation.md` for `delegation-brief`. Evidence Split packets live in these run files; do not add a new public command or standalone runtime file for them.
|
|
41
|
+
21. Do not stop at recommendation. Execute the first safe step after run state exists.
|
|
42
|
+
22. Run `$tink:verify` behavior before final when `contract.json` lists required checks. If `.tink/config.json` has `completion_policy: "strict"`, do not call the run done until required checks are represented in `.tink/current/verification.json`, `.tink/current/evidence.md` exists, and remaining risk is stated.
|
|
43
|
+
23. Store reusable memory or rule updates under `.tink/` only after separate approval.
|
|
44
|
+
24. If a check fails, update `.tink/current/notes.md`, state the failure, last safe point, and next single action. Append compact friction to `.tink/maintenance/friction.jsonl` when it exists. Feed repeated failures to `$tink:weave`.
|
|
45
|
+
25. Keep context compact. Do not paste raw logs or full diffs.
|
|
46
|
+
26. Use calm, clear, concise language. Prefer plain everyday words over technical terms. No jokes.
|
|
46
47
|
|
|
47
48
|
## Codex Approval Protocol
|
|
48
49
|
|
|
@@ -120,6 +121,39 @@ Optional current-run artifacts:
|
|
|
120
121
|
- `.tink/current/goals.json`: create only when `goal-checkpoint` is selected. Keep 2-6 goals, one active goal, status, done criteria, verification, evidence, and next action.
|
|
121
122
|
- `.tink/current/delegation.md`: create only when `delegation-brief` is selected. Include packet scope, forbidden actions, expected evidence, and reconciliation notes. Do not start tmux panes, worktrees, workers, or external agents from this harness.
|
|
122
123
|
|
|
124
|
+
## Evidence Split
|
|
125
|
+
|
|
126
|
+
Evidence Split is Tink's default way to keep real work small while it is happening. It is not a separate harness and it does not imply parallel execution.
|
|
127
|
+
|
|
128
|
+
Use Evidence Split when a task is non-trivial and any of these signals appears:
|
|
129
|
+
- the first plan has several uncertain facts,
|
|
130
|
+
- implementation starts coupling several files or concepts,
|
|
131
|
+
- a check fails and the next action is unclear,
|
|
132
|
+
- context is becoming broad or stale,
|
|
133
|
+
- independent verification, review, or handoff would reduce risk.
|
|
134
|
+
|
|
135
|
+
Skip it for tiny, obvious edits where a packet would not change the next action.
|
|
136
|
+
|
|
137
|
+
Packet vocabulary:
|
|
138
|
+
- `probe`: answer one unknown with 1-3 inputs.
|
|
139
|
+
- `patch`: make one narrow implementation change.
|
|
140
|
+
- `verify`: prove one success condition or failure recovery.
|
|
141
|
+
- `review`: inspect one risk, regression, or omission.
|
|
142
|
+
- `decision`: record one branch, chosen option, and evidence.
|
|
143
|
+
|
|
144
|
+
Represent packets in existing run state:
|
|
145
|
+
- `steps.json`: packetized steps and status.
|
|
146
|
+
- `context-map.json`: the input files, sources, or excluded context for each packet.
|
|
147
|
+
- `notes.md`: why work was split or re-split during implementation.
|
|
148
|
+
- `delegation.md`: only when `delegation-brief` is selected or another human/agent packet is explicitly needed.
|
|
149
|
+
|
|
150
|
+
Safety defaults:
|
|
151
|
+
- Do not start workers, tmux panes, worktrees, or external agents automatically.
|
|
152
|
+
- Packet outputs are evidence, risks, recommendations, or patch candidates by default; direct edits require the main agent's normal approval and ownership.
|
|
153
|
+
- Do not let multiple packets edit the same file concurrently.
|
|
154
|
+
- Keep secrets, public contracts, broad refactors, release/publish actions, and final reconciliation under the main agent's control.
|
|
155
|
+
- Keep each packet to 1-3 primary inputs when possible.
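The safety default about same-file concurrent edits lends itself to a quick pre-check before any packets are handed out. The sketch below is illustrative only: the packet shape (`id`, `edits`) and the helper `findEditConflicts` are assumptions, and the source does not say Tink performs such a check automatically.

```javascript
// Hypothetical packet shape: { id, edits: [file paths the packet would change] }.
// Returns the files that more than one packet wants to edit.
function findEditConflicts(packets) {
  const owners = new Map(); // file path -> ids of packets that want to edit it
  for (const packet of packets) {
    // A Set avoids counting one packet twice if it lists a file more than once.
    for (const file of new Set(packet.edits ?? [])) {
      if (!owners.has(file)) owners.set(file, []);
      owners.get(file).push(packet.id);
    }
  }
  return [...owners]
    .filter(([, ids]) => ids.length > 1)
    .map(([file, ids]) => ({ file, packets: ids }));
}
```

If the result is non-empty, one reasonable response under the rules above is to merge the conflicting packets or sequence them, with the main agent owning the final edit.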
|
|
156
|
+
|
|
123
157
|
GJC-style harness selection rules:
|
|
124
158
|
|
|
125
159
|
- Ambiguous ideas, early product concepts, vague bug reports, broad "make it better" requests, and underspecified implementation prompts should start with `requirements-interview`, usually alone until the user clarifies enough to plan or code.
|