docket-agent 0.2.0 → 0.3.0

package/README.md CHANGED
@@ -7,7 +7,10 @@
7
7
  Before your agent acts, it checks a one-page rule file you wrote: allow, ask,
8
8
  or deny. After, it leaves a tamper-evident record. Anything you didn't write
9
9
  down, the agent must ask about. Plain Markdown in your repo; works with
10
- Claude, Codex, Cursor, and any MCP client.
10
+ Claude, ChatGPT/Codex, Gemini, Cursor, OpenClaw, Hermes, and any MCP client.
11
+
12
+ **Install:** `npm install -g docket-agent` · **Docs:**
13
+ [shahcolate.github.io/docket/docs.html](https://shahcolate.github.io/docket/docs.html)
11
14
 
12
15
  Zero dependencies · plain Markdown + JSONL · MIT
13
16
 
@@ -176,6 +179,42 @@ $ docket compile --target cursor --write # → .cursor/rules/docket.mdc
176
179
  Same loops, every tool. **A model switch is a recompile, not a re-teach** —
177
180
  try the new tool, point it at the same files, keep working.
178
181
 
182
+ ## Fifty loops, flat context
183
+
184
+ Compiling every brief and procedure into the context file stops scaling
185
+ around a handful of loops — the rules start crowding out the work. So
186
+ **rules scale on disk, not in context**:
187
+
188
+ ```console
189
+ $ docket compile --index --target claude --write
190
+ ✓ compiled index of 23 loops → CLAUDE.md
191
+ ```
192
+
193
+ `--index` compiles the protocol plus **one line per loop** — name,
194
+ description, and the loop's `triggers` — instead of the loops themselves.
195
+ The agent routes each task to its loop, then pulls just that loop in full:
196
+
197
+ ```console
198
+ $ docket match "draft an appeal for my denied claim"
199
+ 1 candidate loop for "draft an appeal for my denied claim"
200
+
201
+ appeal Build the appeal, cite the policy — stop before send.
202
+ score 14 — name: appeal · trigger: denied claim, denial letter
203
+
204
+ $ docket match "wire funds to a vendor"
205
+ NO LOOP "wire funds to a vendor"
206
+ No loop covers this task. Work outside a loop defaults to ask
207
+ ```
208
+
209
+ Routing is deterministic and scored — loop name, author-written `triggers`
210
+ phrases, warrant targets, description overlap — and it **fails closed**: no
211
+ match doesn't mean "best guess", it means *stop and ask*, exit code `2`,
212
+ same as the warrant. And enforcement never needed context residency at all:
213
+ the warrant check runs outside the model and injects the one matched rule
214
+ exactly when it becomes relevant. What stays resident is a table of
215
+ contents; the window holds one open chapter; the checker never forgets any
216
+ of it.
217
+
179
218
  ## Agents can use it natively (MCP)
180
219
 
181
220
  `docket mcp` is a zero-config MCP server. Add it to Claude Code:
@@ -190,11 +229,12 @@ or to any MCP client:
190
229
  { "mcpServers": { "docket": { "command": "npx", "args": ["docket-agent", "mcp"] } } }
191
230
  ```
192
231
 
193
- The agent gets four tools:
232
+ The agent gets five tools:
194
233
 
195
234
  | Tool | What it does |
196
235
  |---|---|
197
236
  | `docket_list_loops` | discover your loops |
237
+ | `docket_match_loop` | route a task to the loop that covers it — ranked, fail-closed |
198
238
  | `docket_loop_context` | pull a loop's five layers before starting |
199
239
  | `docket_warrant_check` | allow / ask / deny, **before** acting — auto-logged |
200
240
  | `docket_record` | add a verifiable record entry when it finishes or stops |
@@ -202,6 +242,76 @@ The agent gets four tools:
202
242
  Warrant checks made by the agent land in the record too. *"Did the agent
203
243
  even ask?"* becomes a grep.
204
244
 
245
+ ## Make it mechanical (Claude Code hooks)
246
+
247
+ Compiled context tells the agent the rules; MCP makes checking cheap. For
248
+ the tool calls you actually fear, make the warrant **mechanical** — wire it
249
+ into Claude Code's permission system as a PreToolUse hook, in
250
+ `.claude/settings.json`:
251
+
252
+ ```json
253
+ {
254
+ "hooks": {
255
+ "PreToolUse": [
256
+ {
257
+ "matcher": "Bash|Write|Edit",
258
+ "hooks": [{ "type": "command", "command": "npx docket-agent hook claude" }]
259
+ }
260
+ ]
261
+ }
262
+ }
263
+ ```
264
+
265
+ Every matched tool call now passes through the warrant *before it runs*:
266
+ **deny** blocks the call and tells the model why, **ask** makes Claude Code
267
+ prompt you, and **allow** stays silent — docket only ever *tightens* the
268
+ gate; it never bypasses Claude Code's own permission prompts. Without
269
+ `--loop` the hook routes each call with the same scoring as `docket match`
270
+ and stays out of the way when no loop claims the call (pin one loop with
271
+ `--loop <name>`; add `--strict` to force an ask instead). Every check lands
272
+ on the record with `via: "hook"` — enforcement and evidence in one move.
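
As a rough illustration of the deny / ask / allow behaviour described above, the sketch below builds the text a hook would write to stdout for each verdict. The JSON field names are taken from `emitDecision` in `hook.js` in this same diff; the helper name `hookOutput` is hypothetical and is not part of docket.

```javascript
// Hypothetical helper (not docket's code): what a PreToolUse hook prints
// for each warrant verdict. Field names mirror emitDecision in hook.js.
function hookOutput(verdict, reason) {
  // Only deny and ask produce output. An allow stays silent so the hook
  // never loosens Claude Code's own permission prompts.
  if (verdict !== 'deny' && verdict !== 'ask') return '';
  return (
    JSON.stringify({
      hookSpecificOutput: {
        hookEventName: 'PreToolUse',
        permissionDecision: verdict,
        permissionDecisionReason: reason,
      },
    }) + '\n'
  );
}

const denied = hookOutput('deny', 'docket loop "appeal": sending is reserved');
const allowed = hookOutput('allow', 'no objection');
console.log(denied.trim());
console.log(JSON.stringify(allowed));
```

The design point is the asymmetry: silence on allow means the hook can only add a block or a prompt, never remove one.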
273
+
274
+ ## OpenClaw and Hermes
275
+
276
+ **[OpenClaw](https://docs.openclaw.ai)** injects your workspace's `AGENTS.md`
277
+ into the agent's system prompt at the start of every session — so compile
278
+ straight into the workspace (fitting, given the story that opens this README):
279
+
280
+ ```console
281
+ $ cd ~/.openclaw/workspace
282
+ $ npx docket-agent init
283
+ $ npx docket-agent new followup --template client-follow-up
284
+ $ npx docket-agent compile --target agents --write
285
+ ```
286
+
287
+ Docket only manages its own marked block inside `AGENTS.md` — your existing
288
+ rules, `SOUL.md`, and the rest of the workspace stay untouched. OpenClaw can
289
+ also run the MCP server for native checks and record entries: add `docket`
290
+ as an MCP server in your OpenClaw config with
291
+ `command: npx, args: ["-y", "docket-agent", "mcp", "--dir", "~/.openclaw/workspace"]`.
292
+
293
+ **[Hermes](https://hermes-agent.nousresearch.com/docs/)** (Nous Research)
294
+ reads `AGENTS.md` context files too — run the same three commands in the
295
+ directory Hermes works from. For native tools, add docket under the MCP
296
+ servers section of `~/.hermes/config.yaml`:
297
+
298
+ ```yaml
299
+ docket:
300
+ command: npx
301
+ args: ["-y", "docket-agent", "mcp", "--dir", "/path/to/your/project"]
302
+ ```
303
+
304
+ Any other agent that reads `AGENTS.md`, `CLAUDE.md`, `GEMINI.md`, or speaks
305
+ MCP gets the same treatment — one loop file, every agent under the same
306
+ warrant.
307
+
308
+ ## Documentation
309
+
310
+ The full guide — concepts, loop-file reference, the verdict algorithm,
311
+ matching semantics, record internals, CLI reference, and per-tool setup —
312
+ lives at **[the docs site](https://shahcolate.github.io/docket/docs.html)**.
313
+ The normative format definition is the [Loop File Spec](spec/SPEC.md).
314
+
205
315
  ## Five questions, then the loop exists
206
316
 
207
317
  `docket new <name>` interviews you:
@@ -271,11 +381,11 @@ Read the [Loop File Spec](spec/SPEC.md) — it's short on purpose.
271
381
 
272
382
  ## Roadmap
273
383
 
384
+ - [x] `docket check` as a Claude Code PreToolUse hook — shipped as `docket hook claude`
274
385
  - [ ] Signed record heads (attest the chain tip, share the attestation)
275
- - [ ] `docket check` as a Claude Code PreToolUse hook recipe
276
386
  - [ ] Loop inheritance (`extends:`) for team baselines
277
387
  - [ ] Record export → human-readable work summaries
278
- - [ ] Adapters: OpenAI custom instructions, Gemini, Windsurf
388
+ - [ ] Adapters: OpenAI custom instructions, Windsurf
279
389
 
280
390
  ## Contributing
281
391
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "docket-agent",
3
- "version": "0.2.0",
3
+ "version": "0.3.0",
4
4
  "description": "The permission layer and paper trail for AI agents. Your agent checks a rule file before it acts - allow, ask, or deny - and leaves a tamper-evident record after.",
5
5
  "type": "module",
6
6
  "bin": {
package/spec/SPEC.md CHANGED
@@ -67,6 +67,7 @@ in frontmatter because tools enforce structure well.
67
67
  | `description` | string | no | one line, shown in listings and compiled context |
68
68
  | `version` | number | no | spec version, default `1` |
69
69
  | `warrant` | map | no | see below |
70
+ | `triggers` | list of strings | no | phrases that mark a task as this loop's job; used only for routing (see below) |
70
71
  | `reserved` | list of strings | no | what stays with the human, always |
71
72
  | `record` | list of strings | no | what the agent must report when it finishes or stops |
72
73
 
@@ -170,6 +171,43 @@ system degrades toward the human, never away.
170
171
  `docket check` exits `0` for allow, `2` for ask, `3` for deny (and `1` for
171
172
  usage errors), so shells, hooks, and CI can gate on the warrant directly.
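
A minimal sketch of gating on those exit codes from a script, assuming the caller already has the numeric exit status of `docket check`. The names `verdictFromExit` and `mayProceed` are hypothetical, and treating any unlisted status as an error is this sketch's choice rather than something the spec states.

```javascript
// Exit codes as documented above: 0 allow, 2 ask, 3 deny, 1 usage error.
const VERDICT_BY_EXIT = new Map([
  [0, 'allow'],
  [2, 'ask'],
  [3, 'deny'],
]);

// Hypothetical helper: translate an exit status into a verdict.
// Anything unlisted (1, null after a signal, other codes) becomes 'error'.
function verdictFromExit(code) {
  return VERDICT_BY_EXIT.get(code) ?? 'error';
}

// Only an explicit allow lets the caller continue; everything else stops.
function mayProceed(code) {
  return verdictFromExit(code) === 'allow';
}

console.log([0, 1, 2, 3, null].map(verdictFromExit));
```

In Node, the status could come from `spawnSync(...).status` in `node:child_process`; because that value is `null` when the process is killed by a signal, the fallback to `'error'` keeps the gate closed in that case too.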
172
173
 
174
+ ## Routing: which loop covers this task?
175
+
176
+ With more than a handful of loops, the agent should not hold every brief and
177
+ procedure in context — it holds an index and pulls one loop at a time (see
178
+ *Compiled context* below). Something then has to answer "which one?", and it
179
+ must be deterministic: `docket match "<task>"` / the `docket_match_loop` MCP
180
+ tool.
181
+
182
+ Scoring is lexical, integer-weighted, and reuses the warrant's cautious
183
+ matcher (patterns split into alternatives; content words compare under
184
+ stemming):
185
+
186
+ | Signal | Weight | Notes |
187
+ |---|---|---|
188
+ | loop `name`, read as a phrase (dashes as spaces) | +5 | qualifies on its own |
189
+ | each matching `triggers` entry | +4 | qualifies on its own |
190
+ | each matching warrant pattern (any list) | +1 | capped at +3 per loop |
191
+ | each distinct content word shared with `description` | +1 | capped at +3 per loop |
192
+
193
+ Candidates need a score of **3** or more; they rank by score, then name, and
194
+ implementations should return a short list (default 3) for the agent — or the
195
+ human — to make the final pick from.
196
+
197
+ Two rules matter more than the weights:
198
+
199
+ - **The asymmetry principle inverts at routing time.** The warrant matches
200
+ allow-entries strictly because a false allow is an incident. Routing
201
+ matches generously because a false candidate costs one extra index line —
202
+ and a routing miss is still caught downstream by the warrant.
203
+ - **Retrieval fails closed.** When nothing clears the bar, the answer is not
204
+ "best guess" — it is *no loop covers this task, ask the human*. `docket
205
+ match` exits `2` (the same exit as an `ask` verdict) so hooks can gate on
206
+ it; `0` means matched, `1` a usage error.
207
+
208
+ Routing is advisory and read-only: a match is not an action, so it is not
209
+ written to the record. The warrant checks that follow are.
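
The weights, caps, threshold, and tie-break described in this section can be sketched as below. This is an illustration under stated assumptions, not docket's matcher: matching here is a plain case-insensitive substring test, a "content word" is assumed to be any alphanumeric run longer than three characters, and `warrantPatterns` is a hypothetical flattened field. Scores from this sketch will therefore differ from what `docket match` reports.

```javascript
// Hypothetical sketch of the routing weights; not docket's real matcher.
// Assumption: a pattern "matches" when the task contains it as a
// case-insensitive substring. The spec's matcher also splits patterns into
// alternatives and compares content words under stemming.
const contains = (task, phrase) =>
  task.toLowerCase().includes(phrase.toLowerCase());

// Assumption: a content word is an alphanumeric run longer than 3 characters.
const contentWords = (text) =>
  new Set((text.toLowerCase().match(/[a-z0-9]+/g) ?? []).filter((w) => w.length > 3));

function scoreLoop(loop, task) {
  let score = 0;
  // Loop name read as a phrase (dashes as spaces): +5.
  if (contains(task, loop.name.replace(/-/g, ' '))) score += 5;
  // Each matching trigger: +4.
  for (const trigger of loop.triggers ?? []) {
    if (contains(task, trigger)) score += 4;
  }
  // Each matching warrant pattern: +1, capped at +3 per loop.
  const warrantHits = (loop.warrantPatterns ?? []).filter((p) => contains(task, p)).length;
  score += Math.min(warrantHits, 3);
  // Each distinct content word shared with the description: +1, capped at +3.
  const taskWords = contentWords(task);
  const shared = [...contentWords(loop.description ?? '')].filter((w) => taskWords.has(w)).length;
  score += Math.min(shared, 3);
  return score;
}

// Keep candidates scoring 3 or more, rank by score then name, return a short list.
function routeTask(loops, task, limit = 3) {
  return loops
    .map((loop) => ({ loop, score: scoreLoop(loop, task) }))
    .filter((c) => c.score >= 3)
    .sort((a, b) => b.score - a.score || a.loop.name.localeCompare(b.loop.name))
    .slice(0, limit);
}

const loops = [
  {
    name: 'appeal',
    description: 'Build the appeal, cite the policy',
    triggers: ['denied claim', 'denial letter'],
  },
];
const matched = routeTask(loops, 'draft an appeal for my denied claim');
const unmatched = routeTask(loops, 'wire funds to a vendor');
console.log(matched.map((c) => c.loop.name), unmatched.length);
```

An empty result is the fail-closed case: the caller should stop and ask rather than fall back to the nearest loop.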
210
+
173
211
  ## The record
174
212
 
175
213
  The record is the audit half of the trust story: *what did the agent see,
@@ -244,9 +282,30 @@ and (with `--write`) inserts or replaces that block in the target file:
244
282
  Content outside the markers is never touched. Because every target renders
245
283
  from the same loops, moving to a new tool is a recompile, not a re-teach.
246
284
 
285
+ ### The index: rules scale on disk, not in context
286
+
287
+ The full render puts every brief and procedure in the agent's context on
288
+ every turn — O(loops × loop size), which crowds out the actual work as loops
289
+ accumulate. `docket compile --index` renders the same managed block in
290
+ **tiers** instead:
291
+
292
+ - **Tier 0 — protocol** (invariant with loop count): find the loop, load it,
293
+ check the warrant before acting, ask when nothing covers the task.
294
+ - **Tier 1 — index**: one line per loop — name, description, triggers. The
295
+ routing table.
296
+ - **Tier 2 — the active loop**: loaded on demand via `docket compile --loop
297
+ <name>` or `docket_loop_context`, only for the task at hand.
298
+
299
+ Enforcement never needed residency at all: the warrant check runs outside the
300
+ model, and its verdict text carries the one matched rule into the
301
+ conversation exactly when it becomes relevant. The index and the full render
302
+ use the same markers, so switching modes replaces the block rather than
303
+ stacking a second one. `docket compile` prints a token estimate and suggests
304
+ `--index` when the full render grows past a few thousand tokens.
305
+
247
306
  ## MCP tools
248
307
 
249
- `docket mcp` serves four tools over stdio (newline-delimited JSON-RPC,
308
+ `docket mcp` serves five tools over stdio (newline-delimited JSON-RPC,
250
309
  protocol `2024-11-05`). MCP hosts often spawn servers with a cwd far from
251
310
  your project, so the server resolves its project from `--dir <path>` (or
252
311
  `DOCKET_DIR`), falling back to walking up from cwd — and it always answers
@@ -256,6 +315,7 @@ before the handshake.
256
315
  | Tool | Purpose |
257
316
  |---|---|
258
317
  | `docket_list_loops` | discover the loops |
318
+ | `docket_match_loop` | route a task to the loop that covers it (ranked, fail-closed) |
259
319
  | `docket_loop_context` | fetch a loop's five layers before starting work |
260
320
  | `docket_warrant_check` | get an allow/ask/deny verdict **before** acting; auto-recorded as a `check` entry |
261
321
  | `docket_record` | append a `note` entry to the record |
package/src/cli.js CHANGED
@@ -4,10 +4,12 @@ import { cmdInit } from './commands/init.js';
4
4
  import { cmdNew, cmdTemplates } from './commands/new.js';
5
5
  import { cmdList, cmdShow } from './commands/list.js';
6
6
  import { cmdCheck } from './commands/check.js';
7
+ import { cmdMatch } from './commands/match.js';
7
8
  import { cmdRecord } from './commands/record.js';
8
9
  import { cmdCompile } from './commands/compile.js';
9
10
  import { cmdReview } from './commands/review.js';
10
11
  import { cmdMcp } from './commands/mcp.js';
12
+ import { cmdHook } from './commands/hook.js';
11
13
 
12
14
  const HELP = `
13
15
  ${bold('docket')} — brief the agent, warrant the actions, keep the record
@@ -22,6 +24,8 @@ ${bold('Getting started')}
22
24
  ${bold('Working with loops')}
23
25
  ${cyan('list')} list your loops
24
26
  ${cyan('show')} <loop> print a loop's five layers
27
+ ${cyan('match')} <task…> which loop covers this task? ranked, with why —
28
+ exit 0 = matched, 2 = no loop covers it (ask)
25
29
  ${cyan('check')} <loop> <action> <target>
26
30
  ask the warrant: allow, ask, or deny?
27
31
  (actions: read, draft, change, send)
@@ -37,9 +41,14 @@ ${bold('The record')}
37
41
  ${cyan('record verify')} verify the hash chain end to end
38
42
 
39
43
  ${bold('Portability')}
40
- ${cyan('compile')} [--target claude|agents|cursor|raw] [--loop <name>] [--write]
44
+ ${cyan('compile')} [--target claude|agents|gemini|cursor|raw] [--loop <name>] [--index] [--write]
41
45
  render loops into CLAUDE.md / AGENTS.md / Cursor rules
46
+ (--index: one line per loop + the protocol, instead of
47
+ full loops — keeps context flat as rule count grows)
42
48
  ${cyan('mcp')} run the MCP server (stdio) for agent integration
49
+ ${cyan('hook')} claude [--loop <name>] [--strict]
50
+ Claude Code PreToolUse hook: gate tool calls on
51
+ the warrant — deny blocks, ask prompts the human
43
52
 
44
53
  ${dim('Every loop answers five questions: what must it know, how is the work')}
45
54
  ${dim('done, what may it do without asking, where does it stop, and what')}
@@ -70,6 +79,8 @@ export async function main(argv) {
70
79
  return cmdList(rest);
71
80
  case 'show':
72
81
  return cmdShow(rest);
82
+ case 'match':
83
+ return cmdMatch(rest);
73
84
  case 'check':
74
85
  return cmdCheck(rest);
75
86
  case 'record':
@@ -80,6 +91,8 @@ export async function main(argv) {
80
91
  return cmdReview(rest);
81
92
  case 'mcp':
82
93
  return cmdMcp(rest);
94
+ case 'hook':
95
+ return cmdHook(rest);
83
96
  default:
84
97
  console.error(`docket: unknown command "${command}" — try \`docket help\``);
85
98
  return 1;
@@ -1,11 +1,16 @@
1
1
  import path from 'node:path';
2
2
  import { parseArgs } from '../lib/args.js';
3
3
  import { requireDocketDir, listLoops, loadLoop } from '../lib/loop.js';
4
- import { renderBlock, compileToFile, TARGETS } from '../lib/compile.js';
4
+ import { renderBlock, renderIndexBlock, compileToFile, TARGETS } from '../lib/compile.js';
5
5
  import { dim, green } from '../lib/ui.js';
6
6
 
7
+ // Above this, the full render starts crowding out the actual work — suggest
8
+ // the index. ~4 chars per token is close enough to warn honestly.
9
+ const TOKEN_HINT_AT = 2500;
10
+ const estimateTokens = (text) => Math.round(text.length / 4);
11
+
7
12
  export function cmdCompile(argv) {
8
- const { flags } = parseArgs(argv, { booleans: ['write'] });
13
+ const { flags } = parseArgs(argv, { booleans: ['write', 'index'] });
9
14
  const target = flags.target ?? 'raw';
10
15
  if (!TARGETS[target]) {
11
16
  console.error(`docket: unknown target "${target}" — targets: ${Object.keys(TARGETS).join(', ')}`);
@@ -20,26 +25,51 @@ export function cmdCompile(argv) {
20
25
  );
21
26
  return 1;
22
27
  }
28
+ if (flags.loop && flags.index) {
29
+ console.error(
30
+ 'docket: --index compiles the routing table over all loops; --loop previews one full loop — pick one'
31
+ );
32
+ return 1;
33
+ }
23
34
  const loops = flags.loop ? [loadLoop(docketDir, flags.loop)] : listLoops(docketDir);
24
35
  if (!loops.length) {
25
36
  console.error('docket: no loops to compile — create one with `docket new <name>`');
26
37
  return 1;
27
38
  }
28
39
 
40
+ const block = flags.index ? renderIndexBlock(loops) : renderBlock(loops);
41
+ // The hint goes to stderr so `docket compile > file` stays clean.
42
+ const hintIndex = () => {
43
+ if (flags.index || flags.loop) return;
44
+ const tokens = estimateTokens(block);
45
+ if (tokens < TOKEN_HINT_AT) return;
46
+ console.error(
47
+ dim(
48
+ ` ~${tokens} tokens will sit in the agent's context on every turn — \`docket compile --index\`\n` +
49
+ ` compiles the protocol plus one line per loop instead; full loops load on demand`
50
+ )
51
+ );
52
+ };
53
+
29
54
  if (!flags.write || target === 'raw') {
30
- console.log(renderBlock(loops));
55
+ console.log(block);
31
56
  if (flags.write && target === 'raw') {
32
57
  console.error(dim('(raw target always prints to stdout)'));
33
58
  }
59
+ hintIndex();
34
60
  return 0;
35
61
  }
36
62
 
37
63
  const rootDir = path.dirname(docketDir);
38
- const file = compileToFile(rootDir, target, loops);
64
+ const file = compileToFile(rootDir, target, loops, { index: flags.index });
65
+ const what = flags.index
66
+ ? `index of ${loops.length} loop${loops.length === 1 ? '' : 's'}`
67
+ : `${loops.length} loop${loops.length === 1 ? '' : 's'}`;
39
68
  console.log(
40
69
  green('✓') +
41
- ` compiled ${loops.length} loop${loops.length === 1 ? '' : 's'} → ${path.relative(process.cwd(), file)} ${dim(`(${TARGETS[target].label})`)}`
70
+ ` compiled ${what} → ${path.relative(process.cwd(), file)} ${dim(`(${TARGETS[target].label})`)}`
42
71
  );
43
72
  console.log(dim(' re-run after editing loops; the docket block is replaced in place'));
73
+ hintIndex();
44
74
  return 0;
45
75
  }
@@ -0,0 +1,148 @@
1
+ // `docket hook claude` — the warrant as a Claude Code PreToolUse hook.
2
+ //
3
+ // The compiled context makes the rules known; the MCP tools make checking
4
+ // cheap; this makes it MECHANICAL. Claude Code pipes every matched tool call
5
+ // here as JSON before it runs; docket answers in the hook protocol:
6
+ //
7
+ // deny → the call is blocked, the reason goes back to the model
8
+ // ask → Claude Code prompts the human before running the call
9
+ // allow → we stay SILENT (exit 0, no output)
10
+ //
11
+ // Silence on allow is deliberate: emitting an "allow" decision would bypass
12
+ // Claude Code's own permission prompts. Docket must only ever tighten the
13
+ // gate, never loosen it — a docket allow means "the warrant has no
14
+ // objection", not "skip the other locks".
15
+
16
+ import { parseArgs } from '../lib/args.js';
17
+ import { findDocketDir, listLoops, loadLoop, loopExists, loopNames } from '../lib/loop.js';
18
+ import { checkWarrant } from '../lib/warrant.js';
19
+ import { matchLoops } from '../lib/match.js';
20
+ import { recordCheck } from '../lib/record.js';
21
+
22
+ // Verbs for the tools Claude Code ships. Anything not listed — Bash, MCP
23
+ // tools, tools that don't exist yet — is treated as `send`, the most
24
+ // consequential verb: its allow list is the one loop authors keep shortest,
25
+ // so unknown tools fall toward ask, never toward allow.
26
+ const ACTION_FOR_TOOL = {
27
+ Read: 'read',
28
+ Glob: 'read',
29
+ Grep: 'read',
30
+ LS: 'read',
31
+ NotebookRead: 'read',
32
+ WebFetch: 'read',
33
+ WebSearch: 'read',
34
+ TodoRead: 'read',
35
+ Write: 'change',
36
+ Edit: 'change',
37
+ MultiEdit: 'change',
38
+ NotebookEdit: 'change',
39
+ TodoWrite: 'change',
40
+ };
41
+ const DEFAULT_ACTION = 'send';
42
+
43
+ // The warrant matches plain words, so give it the most human part of the
44
+ // tool input — the command, the path, the url — prefixed with the tool name.
45
+ export function describeTarget(toolName, input) {
46
+ const detail =
47
+ input && typeof input === 'object'
48
+ ? [input.command, input.file_path, input.url, input.path, input.pattern, input.query, input.description]
49
+ .find((v) => typeof v === 'string' && v.trim())
50
+ : null;
51
+ const text = detail ?? (input && typeof input === 'object' ? JSON.stringify(input) : '');
52
+ return `${toolName}${text ? `: ${text}` : ''}`.slice(0, 300);
53
+ }
54
+
55
+ function emitDecision(verdict, reason) {
56
+ process.stdout.write(
57
+ JSON.stringify({
58
+ hookSpecificOutput: {
59
+ hookEventName: 'PreToolUse',
60
+ permissionDecision: verdict,
61
+ permissionDecisionReason: reason,
62
+ },
63
+ }) + '\n'
64
+ );
65
+ }
66
+
67
+ function readStdin() {
68
+ return new Promise((resolve, reject) => {
69
+ let data = '';
70
+ process.stdin.setEncoding('utf8');
71
+ process.stdin.on('data', (chunk) => (data += chunk));
72
+ process.stdin.on('end', () => resolve(data));
73
+ process.stdin.on('error', reject);
74
+ });
75
+ }
76
+
77
+ // Exit codes follow the hook contract, not the warrant's: the DECISION rides
78
+ // in the JSON on stdout. Exit 1 is "misconfigured" — Claude Code shows the
79
+ // human our stderr without blocking the call.
80
+ export async function cmdHook(argv) {
81
+ const { flags, positional } = parseArgs(argv, { booleans: ['strict'] });
82
+ if (positional[0] !== 'claude') {
83
+ console.error('usage: docket hook claude [--loop <name>] [--strict] [--dir <project>]');
84
+ return 1;
85
+ }
86
+
87
+ let event;
88
+ try {
89
+ event = JSON.parse(await readStdin());
90
+ } catch {
91
+ console.error('docket hook: stdin was not hook JSON — wire this command under hooks.PreToolUse');
92
+ return 1;
93
+ }
94
+ if (event.hook_event_name && event.hook_event_name !== 'PreToolUse') return 0;
95
+ const toolName = typeof event.tool_name === 'string' ? event.tool_name : '';
96
+ if (!toolName) return 0;
97
+
98
+ const startDir = flags.dir ?? process.env.DOCKET_DIR ?? event.cwd ?? process.cwd();
99
+ const docketDir = findDocketDir(startDir);
100
+ if (!docketDir) {
101
+ // Only loud when the config names a loop: a global hook in a project
102
+ // that doesn't use docket should cost nothing.
103
+ if (flags.loop) {
104
+ console.error(`docket hook: --loop ${flags.loop} given but no .docket directory found from ${startDir}`);
105
+ return 1;
106
+ }
107
+ return 0;
108
+ }
109
+
110
+ const action = ACTION_FOR_TOOL[toolName] ?? DEFAULT_ACTION;
111
+ const target = describeTarget(toolName, event.tool_input);
112
+
113
+ let loop;
114
+ if (flags.loop) {
115
+ if (!loopExists(docketDir, flags.loop)) {
116
+ console.error(
117
+ `docket hook: no loop named "${flags.loop}" — have: ${loopNames(docketDir).join(', ') || '(none)'}`
118
+ );
119
+ return 1;
120
+ }
121
+ loop = loadLoop(docketDir, flags.loop);
122
+ } else {
123
+ // No loop pinned in the config: route on the target. A routed loop
124
+ // governs; no route means no loop claims this call — pass through to
125
+ // Claude Code's own permissions (or ask, under --strict).
126
+ const [candidate] = matchLoops(listLoops(docketDir), target, { limit: 1 });
127
+ if (!candidate) {
128
+ if (flags.strict) {
129
+ emitDecision(
130
+ 'ask',
131
+ `docket: no loop covers "${target}" and this project runs hooks in strict mode — a human must approve work outside the loops.`
132
+ );
133
+ }
134
+ return 0;
135
+ }
136
+ loop = candidate.loop;
137
+ }
138
+
139
+ const result = checkWarrant(loop, action, target);
140
+ recordCheck(docketDir, loop.name, action, target, result, { via: 'hook' });
141
+
142
+ if (result.verdict === 'deny') {
143
+ emitDecision('deny', `docket loop "${loop.name}" (${result.rule}): ${result.reason}`);
144
+ } else if (result.verdict === 'ask') {
145
+ emitDecision('ask', `docket loop "${loop.name}" (${result.rule}): ${result.reason}`);
146
+ }
147
+ return 0;
148
+ }
@@ -34,7 +34,11 @@ export function cmdShow(argv) {
34
34
  console.log();
35
35
  };
36
36
 
37
- console.log(`${bold(cyan(loop.name))} — ${loop.description}\n${dim(loop.file)}\n`);
37
+ console.log(`${bold(cyan(loop.name))} — ${loop.description}\n${dim(loop.file)}`);
38
+ if (loop.triggers.length) {
39
+ console.log(dim(`triggers: ${loop.triggers.join(' · ')}`));
40
+ }
41
+ console.log();
38
42
  section('Brief — what it knows before it starts', loop.brief);
39
43
  section('Procedure — how the work is done', loop.procedure);
40
44
 
@@ -0,0 +1,48 @@
1
+ import { parseArgs } from '../lib/args.js';
2
+ import { requireDocketDir, listLoops } from '../lib/loop.js';
3
+ import { matchLoops } from '../lib/match.js';
4
+ import { bold, cyan, dim, yellow } from '../lib/ui.js';
5
+
6
+ // Exit codes mirror the warrant's contract: 0 = a loop covers this,
7
+ // 2 = nothing does (which means ask), 1 = usage error. Hooks can gate on it.
8
+ export function cmdMatch(argv) {
9
+ const { flags, positional } = parseArgs(argv);
10
+ const intent = positional.join(' ').trim();
11
+ if (!intent) {
12
+ console.error('usage: docket match <the task, in plain words…>');
13
+ return 1;
14
+ }
15
+ const limit = Number.parseInt(flags.limit ?? '3', 10);
16
+ if (!Number.isInteger(limit) || limit < 1) {
17
+ console.error('docket: --limit must be a positive integer');
18
+ return 1;
19
+ }
20
+
21
+ const docketDir = requireDocketDir();
22
+ const loops = listLoops(docketDir);
23
+ if (!loops.length) {
24
+ console.error('docket: no loops defined — create one with `docket new <name>`');
25
+ return 1;
26
+ }
27
+
28
+ const candidates = matchLoops(loops, intent, { limit });
29
+ if (!candidates.length) {
30
+ console.log(`${yellow(bold('NO LOOP'))} "${intent}"`);
31
+ console.log(' No loop covers this task. Work outside a loop defaults to ask —');
32
+ console.log(' check with a human, or write the loop: docket new <name>');
33
+ return 2;
34
+ }
35
+
36
+ console.log(
37
+ bold(`${candidates.length} candidate loop${candidates.length === 1 ? '' : 's'}`) +
38
+ dim(` for "${intent}"`) +
39
+ '\n'
40
+ );
41
+ for (const c of candidates) {
42
+ const why = c.hits.map((h) => `${h.field}: ${h.pattern}`).join(' · ');
43
+ console.log(` ${cyan(c.loop.name.padEnd(22))} ${c.loop.description}`);
44
+ console.log(dim(` ${''.padEnd(22)} score ${c.score} — ${why}`));
45
+ }
46
+ console.log(dim('\nload the winner before working: docket show <loop> · docket compile --loop <loop>'));
47
+ return 0;
48
+ }
@@ -6,6 +6,7 @@ import readline from 'node:readline';
6
6
  import { parseArgs } from '../lib/args.js';
7
7
  import { requireDocketDir, listLoops, loadLoop, loopExists, loopNames, ACTIONS } from '../lib/loop.js';
8
8
  import { checkWarrant } from '../lib/warrant.js';
9
+ import { matchLoops } from '../lib/match.js';
9
10
  import { appendRecord, collectRecordFields, recordCheck } from '../lib/record.js';
10
11
  import { renderLoop } from '../lib/compile.js';
11
12
  import { VERSION } from '../lib/pkg.js';
@@ -17,6 +18,22 @@ const TOOLS = [
17
18
  'List the loops the human has defined. Each loop is one recurring task with brief, procedure, warrant, record, and reserved layers.',
18
19
  inputSchema: { type: 'object', properties: {}, additionalProperties: false },
19
20
  },
21
+ {
22
+ name: 'docket_match_loop',
23
+ description:
24
+ 'Find which loop covers a task BEFORE starting it. Give the task in plain words; returns the best-matching loops, ranked, with why each matched. Then call docket_loop_context on the one that fits. If nothing matches, no loop covers the task — ask the human instead of guessing.',
25
+ inputSchema: {
26
+ type: 'object',
27
+ properties: {
28
+ intent: {
29
+ type: 'string',
30
+ description: 'the task about to start, in plain words (e.g. "draft an appeal for the denied claim")',
31
+ },
32
+ },
33
+ required: ['intent'],
34
+ additionalProperties: false,
35
+ },
36
+ },
20
37
  {
21
38
  name: 'docket_loop_context',
22
39
  description:
@@ -77,6 +94,34 @@ export function handleToolCall(docketDir, name, args = {}) {
77
94
  if (!loops.length) return textResult('No loops defined yet.');
78
95
  return textResult(loops.map((l) => `${l.name}: ${l.description}`).join('\n'));
79
96
  }
97
+ case 'docket_match_loop': {
98
+ const intent = typeof args.intent === 'string' ? args.intent.trim() : '';
99
+ if (!intent) return textResult('give the task in plain words via `intent`', true);
100
+ const loops = listLoops(docketDir);
101
+ if (!loops.length) return textResult('No loops defined yet.');
102
+ const candidates = matchLoops(loops, intent);
103
+ if (!candidates.length) {
104
+ return textResult(
105
+ `No loop covers "${intent}". Do not guess or proceed without one — work outside a loop ` +
106
+ `defaults to ask. Tell the human what you want to do and which loop (if any) should own it.`
107
+ );
108
+ }
109
+ const lines = candidates.map(
110
+ (c, i) =>
111
+ `${i + 1}. ${c.loop.name} — ${c.loop.description || '(no description)'} ` +
112
+ `(score ${c.score}: ${c.hits.map((h) => `${h.field} ~ ${h.pattern}`).join(', ')})`
113
+ );
114
+ return textResult(
115
+ [
116
+ `Candidate loops for "${intent}":`,
117
+ '',
118
+ ...lines,
119
+ '',
120
+ 'Call docket_loop_context on the loop that fits, and work under it. If none of these',
121
+ 'actually covers the task, ask the human — do not guess.',
122
+ ].join('\n')
123
+ );
124
+ }
80
125
  case 'docket_loop_context': {
81
126
  const loop = loadLoop(docketDir, args.loop);
82
127
  return textResult(renderLoop(loop));
@@ -99,6 +99,39 @@ export function renderBlock(loops) {
    return `${BEGIN}\n${header}\n\n${body}\n${END}`;
  }

+ // Tiered rendering, for when the full render outgrows the context window.
+ // Rules scale on disk, not in context: what stays resident is the protocol
+ // (invariant with loop count) plus a one-line-per-loop routing table. Full
+ // loops load on demand — and enforcement never needed residency at all,
+ // because the warrant check runs outside the model.
+ export function renderIndexEntry(loop) {
+   const triggers = loop.triggers.length ? ` *(triggers: ${loop.triggers.join('; ')})*` : '';
+   return `- **${loop.name}** — ${loop.description || '(no description)'}${triggers}`;
+ }
+
+ export function renderIndexBlock(loops) {
+   const header = [
+     '## Docket loops (index)',
+     '',
+     `${loops.length} loop${loops.length === 1 ? ' is' : 's are'} defined. This is the index, not the rules — each`,
+     "loop's full brief, procedure, and warrant load on demand.",
+     '',
+     'Before starting any task:',
+     '',
+     '1. Find the loop that covers it below — by its triggers, or ask docket:',
+     ' `docket match "<the task in plain words>"` (MCP: `docket_match_loop`).',
+     '2. Load that loop in full — `docket compile --loop <name>` (MCP:',
+     ' `docket_loop_context`) — and follow its brief and procedure.',
+     '3. Before any read/draft/change/send that matters, check the warrant:',
+     ' `docket check <loop> <action> "<target>"` (MCP: `docket_warrant_check`).',
+     '',
+     'If no loop covers the task, stop and ask the human before proceeding.',
+     'Unlisted means ask. Silence is never permission.',
+   ].join('\n');
+   const body = loops.map(renderIndexEntry).join('\n');
+   return `${BEGIN}\n${header}\n\n${neutralizeMarkers(body)}\n${END}`;
+ }
+
  // Locate the managed block: first BEGIN at a line start, LAST END at a line
  // start. Content is marker-neutralized at render time, so a matching END is
  // always a real one.
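
To make the index format concrete, here is what the routing table looks like for two loops. The function body follows `renderIndexEntry` from the hunk above, with a plain hyphen as the separator. The sample loop objects are invented for illustration and carry only the fields the renderer reads.

```javascript
// Follows renderIndexEntry in the hunk above: one routing-table line per loop.
function renderIndexEntry(loop) {
  const triggers = loop.triggers.length ? ` *(triggers: ${loop.triggers.join('; ')})*` : '';
  return `- **${loop.name}** - ${loop.description || '(no description)'}${triggers}`;
}

// Invented sample loops; real ones come from parsed loop files.
const loops = [
  { name: 'insurance-appeal', description: 'Build the appeal.', triggers: ['insurance appeal', 'denied claim'] },
  { name: 'scratch', description: '', triggers: [] },
];

const body = loops.map(renderIndexEntry).join('\n');
console.log(body);
// - **insurance-appeal** - Build the appeal. *(triggers: insurance appeal; denied claim)*
// - **scratch** - (no description)
```

The resident cost is one line per loop, so the context file grows with the number of loops rather than with the length of their briefs and procedures.
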
@@ -116,11 +149,13 @@ function findBlock(text) {
    return { start: beginMatch.index, end: endIdx + 1 + END.length };
  }

- export function compileToFile(rootDir, target, loops) {
+ export function compileToFile(rootDir, target, loops, { index = false } = {}) {
    const spec = TARGETS[target];
    if (!spec || !spec.file) throw new Error(`target "${target}" cannot be written to a file`);
    const filePath = path.join(rootDir, spec.file);
-   const block = renderBlock(loops);
+   // Same markers either way, so switching between full and index render
+   // replaces the managed block instead of stacking a second one.
+   const block = index ? renderIndexBlock(loops) : renderBlock(loops);
    let existing = '';
    if (fs.existsSync(filePath)) existing = fs.readFileSync(filePath, 'utf8');

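
The "same markers either way" comment can be illustrated with a small in-memory sketch. The marker strings and the `upsertBlock` helper below are hypothetical (the real `BEGIN`/`END` values and the rest of `compileToFile` are not shown in this diff), and the lookup uses plain `indexOf`/`lastIndexOf` rather than the line-start matching that `findBlock` describes.

```javascript
// Hypothetical marker strings; the real BEGIN/END values are not shown in this diff.
const BEGIN = '<!-- docket:begin -->';
const END = '<!-- docket:end -->';

// Replace the managed block if one exists, otherwise append the new block.
function upsertBlock(existing, block) {
  const start = existing.indexOf(BEGIN);
  const endIdx = existing.lastIndexOf(END);
  if (start === -1 || endIdx === -1 || endIdx < start) {
    return existing ? `${existing.replace(/\n*$/, '\n\n')}${block}\n` : `${block}\n`;
  }
  return existing.slice(0, start) + block + existing.slice(endIdx + END.length);
}

const full = `${BEGIN}\nfull render\n${END}`;
const index = `${BEGIN}\nindex render\n${END}`;

const file = upsertBlock('# Notes\n', full); // first compile appends the block
const switched = upsertBlock(file, index);   // recompiling with the index replaces it
console.log(switched);
```

Because both renders share one pair of markers, the second write finds the first block and overwrites it, and text outside the markers is left alone.
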
package/src/lib/loop.js CHANGED
@@ -6,6 +6,9 @@
  // warrant — what it may read / draft / change / send (frontmatter)
  // record — the evidence the agent owes when it stops (frontmatter)
  // reserved — what stays with the human, always (frontmatter)
+ //
+ // Plus optional routing metadata:
+ // triggers — phrases that mark a task as this loop's job (frontmatter)

  import fs from 'node:fs';
  import path from 'node:path';
@@ -122,6 +125,7 @@ export function parseLoop(text, { file } = {}) {
      description: typeof meta.description === 'string' ? meta.description : '',
      version,
      warrant,
+     triggers: asStringList(meta.triggers, 'triggers'),
      reserved: asStringList(meta.reserved, 'reserved'),
      record: asStringList(meta.record, 'record'),
      brief: sections.brief ?? '',
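
`asStringList` itself is outside this hunk, so its behaviour is not visible here. The sketch below is one plausible normalizer, under two assumptions: a missing field becomes an empty list (code elsewhere in this diff reads `loop.triggers.length` without a guard), and non-string entries are rejected. The name `asStringListSketch` and the error wording are hypothetical.

```javascript
// Hypothetical stand-in for asStringList; the real implementation is not shown.
function asStringListSketch(value, field) {
  // Assumption: an absent optional field is treated as an empty list.
  if (value === undefined || value === null) return [];
  if (!Array.isArray(value) || value.some((v) => typeof v !== 'string')) {
    throw new Error(`"${field}" must be a list of strings`);
  }
  // Trim entries and drop empty ones.
  return value.map((v) => v.trim()).filter(Boolean);
}

console.log(asStringListSketch(undefined, 'triggers'));
console.log(asStringListSketch([' denied claim ', 'denial letter'], 'triggers'));
```

Under that assumption, loop files written before this release parse with `triggers` as an empty list and simply contribute no trigger hits.
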
@@ -0,0 +1,88 @@
+ // Loop routing: which loop covers this task?
+ //
+ // Rules scale on disk, not in context — the agent holds a one-line-per-loop
+ // index and pulls one loop at a time, so something has to answer "which one?"
+ // deterministically. Scoring is lexical and integer-weighted, reusing the
+ // warrant's cautious matcher.
+ //
+ // The warrant's asymmetry principle inverts at routing time. Over-retrieval
+ // costs one extra index line pulled into context; under-retrieval just means
+ // the agent works without its procedure — and the warrant check still catches
+ // the miss downstream. So matching is generous. But when NOTHING clears the
+ // bar, the answer is not "best guess": it is "no loop covers this — ask".
+ // Retrieval fails closed, exactly like the warrant.
+
+ import { matchPattern, contentWords, sameWord } from './warrant.js';
+ import { ACTIONS } from './loop.js';
+
+ // Integer weights, most author-intentional signal first. A name or trigger
+ // hit qualifies a loop on its own; description overlap and warrant-target
+ // hits must accumulate to MIN_SCORE, so one shared word ("email") never
+ // routes on its own.
+ const WEIGHT = { name: 5, trigger: 4, warrant: 1, description: 1 };
+ const WARRANT_CAP = 3;
+ const DESCRIPTION_CAP = 3;
+ export const MIN_SCORE = 3;
+
+ // Distinct content words of `a` that also appear (under stemming) in `b`.
+ function overlapCount(a, b) {
+   const aWords = [...new Set(contentWords(a.toLowerCase()))];
+   const bWords = contentWords(b.toLowerCase());
+   return aWords.filter((aw) => bWords.some((bw) => sameWord(aw, bw))).length;
+ }
+
+ export function scoreLoop(loop, intent) {
+   const hits = [];
+   let score = 0;
+
+   // The loop's own name, read as a phrase ("insurance-appeal" → "insurance appeal").
+   if (matchPattern(loop.name.replace(/-/g, ' '), intent)) {
+     score += WEIGHT.name;
+     hits.push({ field: 'name', pattern: loop.name });
+   }
+
+   // Triggers are the author saying "tasks like this are mine" — the loudest
+   // routing signal a loop file can carry.
+   for (const trigger of loop.triggers) {
+     if (matchPattern(trigger, intent)) {
+       score += WEIGHT.trigger;
+       hits.push({ field: 'trigger', pattern: trigger });
+     }
+   }
+
+   // Warrant targets are routing evidence too: a loop that names "denial
+   // letter" under read probably owns tasks about denial letters. Capped so a
+   // long warrant can't outshout an explicit trigger on another loop.
+   let warrantHits = 0;
+   for (const key of [...ACTIONS, 'ask', 'never']) {
+     for (const pattern of loop.warrant[key]) {
+       if (warrantHits >= WARRANT_CAP) break;
+       if (matchPattern(pattern, intent)) {
+         warrantHits += 1;
+         score += WEIGHT.warrant;
+         hits.push({ field: `warrant.${key}`, pattern });
+       }
+     }
+   }
+
+   const shared = overlapCount(intent, loop.description);
+   if (shared > 0) {
+     score += Math.min(shared, DESCRIPTION_CAP) * WEIGHT.description;
+     hits.push({
+       field: 'description',
+       pattern: `${shared} shared word${shared === 1 ? '' : 's'}`,
+     });
+   }
+
+   return { score, hits };
+ }
+
+ // Rank loops against an intent; only candidates at or above MIN_SCORE count.
+ // Deterministic: score descending, then name — same intent, same ranking.
+ export function matchLoops(loops, intent, { limit = 3 } = {}) {
+   return loops
+     .map((loop) => ({ loop, ...scoreLoop(loop, intent) }))
+     .filter((c) => c.score >= MIN_SCORE)
+     .sort((a, b) => b.score - a.score || a.loop.name.localeCompare(b.loop.name))
+     .slice(0, limit);
+ }
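
The weight arithmetic is easier to follow with numbers. The sketch below copies the constants from the file above but swaps in simplified stand-ins: `toyMatch` (a pattern matches when all of its words occur in the intent) replaces `matchPattern` from `warrant.js`, which is not shown in this diff, and description overlap uses exact word equality with a tiny hypothetical stopword set instead of stemming. Warrant-target hits are omitted.

```javascript
// Constants copied from the file above.
const WEIGHT = { name: 5, trigger: 4, warrant: 1, description: 1 };
const DESCRIPTION_CAP = 3;
const MIN_SCORE = 3;

// Hypothetical stopword list, for illustration only.
const STOP = new Set(['a', 'an', 'the', 'with', 'by', 'my', 'to']);
const words = (s) => s.toLowerCase().split(/[^a-z0-9']+/).filter(Boolean);
const content = (s) => words(s).filter((w) => !STOP.has(w));

// Assumption: stand-in for matchPattern. Every word of the pattern must occur in the intent.
function toyMatch(pattern, intent) {
  const have = new Set(words(intent));
  const need = words(pattern);
  return need.length > 0 && need.every((w) => have.has(w));
}

// Simplified scorer: name, triggers, and capped description overlap.
function toyScore(loop, intent) {
  let score = 0;
  if (toyMatch(loop.name.replace(/-/g, ' '), intent)) score += WEIGHT.name;
  for (const t of loop.triggers) {
    if (toyMatch(t, intent)) score += WEIGHT.trigger;
  }
  const desc = new Set(content(loop.description));
  const shared = [...new Set(content(intent))].filter((w) => desc.has(w)).length;
  score += Math.min(shared, DESCRIPTION_CAP) * WEIGHT.description;
  return score;
}

// Invented sample loop.
const loop = {
  name: 'client-follow-up',
  description: 'Follow up with a client by email',
  triggers: ['client email'],
};

const weak = toyScore(loop, 'email my landlord');            // one shared word only
const strong = toyScore(loop, 'draft a client email today'); // trigger hit plus overlap
console.log(weak, weak >= MIN_SCORE);     // 1 false
console.log(strong, strong >= MIN_SCORE); // 6 true
```

A single shared word scores 1 and stays below `MIN_SCORE`, so the loop is not offered; one trigger hit (4) clears the bar by itself, and the two overlapping description words add 2 more.
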
@@ -43,14 +43,14 @@ function stemCandidates(word) {
    return c;
  }

- function sameWord(a, b) {
+ export function sameWord(a, b) {
    for (const cand of stemCandidates(a)) {
      if (stemCandidates(b).has(cand)) return true;
    }
    return false;
  }

- function contentWords(s) {
+ export function contentWords(s) {
    return s.split(/[^a-z0-9']+/).filter((w) => w && !STOPWORDS.has(w));
  }

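
The two newly exported helpers are what the routing module's `overlapCount` builds on. The sketch below shows them working together. `sameWord`, `contentWords`, and `overlapCount` follow the bodies shown in this diff; `STOPWORDS` and `stemCandidates` are hypothetical stand-ins (a short stopword set and a naive suffix stripper), since the real ones sit outside the hunk.

```javascript
// Hypothetical stand-ins: the real STOPWORDS and stemCandidates are outside this hunk.
const STOPWORDS = new Set(['a', 'an', 'the', 'for', 'my', 'to']);
function stemCandidates(word) {
  const c = new Set([word]);
  for (const suffix of ['ing', 'ed', 'es', 's']) {
    if (word.length > suffix.length + 2 && word.endsWith(suffix)) {
      c.add(word.slice(0, -suffix.length));
    }
  }
  return c;
}

// Bodies as shown in the hunk above.
function sameWord(a, b) {
  for (const cand of stemCandidates(a)) {
    if (stemCandidates(b).has(cand)) return true;
  }
  return false;
}
function contentWords(s) {
  return s.split(/[^a-z0-9']+/).filter((w) => w && !STOPWORDS.has(w));
}

// As in the routing module: distinct content words of `a` that also appear in `b`.
function overlapCount(a, b) {
  const aWords = [...new Set(contentWords(a.toLowerCase()))];
  const bWords = contentWords(b.toLowerCase());
  return aWords.filter((aw) => bWords.some((bw) => sameWord(aw, bw))).length;
}

const same = sameWord('appeals', 'appeal');
const shared = overlapCount('Draft appeals for my denied claims', 'Build the appeal for a claim');
console.log(same, shared); // true 2
```

With this toy stemmer, "appeals" and "claims" line up with "appeal" and "claim", while "draft" and "denied" find no partner, so the overlap is 2.
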
@@ -2,6 +2,9 @@
  name: client-follow-up
  description: Follow up with a client with the whole history in the room — promises, tone, and the language they already approved.
  version: 1
+ triggers:
+   - follow up with a client, client follow-up
+   - client email, client status update, check in with the account
  warrant:
    read:
      - account history
@@ -2,6 +2,9 @@
  name: cross-tool-memory
  description: One context you own, readable from Claude, GPT, Kimi, or Codex — a model switch is a recompile, not a re-teach.
  version: 1
+ triggers:
+   - update the shared memory, remember this across tools
+   - regenerate CLAUDE.md, AGENTS.md, or rules files
  warrant:
    read:
      - the loops in this .docket directory
@@ -2,6 +2,9 @@
  name: insurance-appeal
  description: Build the appeal, cite the policy, assemble the evidence packet — stop before send.
  version: 1
+ triggers:
+   - insurance appeal, appeal a denial
+   - denied claim, denial letter, claim dispute
  warrant:
    read:
      - policy documents
@@ -2,6 +2,9 @@
  name: marketing-brain
  description: Marketing memory that compounds week over week — the messaging that already worked, the objections that keep coming back, the founder's actual voice.
  version: 1
+ triggers:
+   - marketing copy, launch post, landing page
+   - positioning, messaging, campaign draft
  warrant:
    read:
      - the positioning doc
@@ -2,6 +2,9 @@
  name: ticket-handoff
  description: Turn messy work into tickets another human — or another agent — can pick up cold: source, owner, status, blocker, warrant, record.
  version: 1
+ triggers:
+   - file a ticket, create tickets from this
+   - handoff, hand this off, triage the backlog
  warrant:
    read:
      - the conversation or incident being handed off
@@ -2,6 +2,9 @@
  name: travel-morning
  description: Plan a morning in an unfamiliar city around how you actually travel — not how a guidebook thinks you should.
  version: 1
+ triggers:
+   - plan a morning, plan the morning in a city
+   - itinerary, sightseeing plan, what to do before noon
  warrant:
    read:
      - maps and transit schedules
@@ -2,6 +2,9 @@
  name: weekly-planning
  description: Propose the week — priorities, tradeoffs, and what has to move — but change nothing.
  version: 1
+ triggers:
+   - plan the week, weekly plan, weekly planning
+   - review the calendar, sort out priorities for the week
  warrant:
    read:
      - calendar