demian-cli 1.0.3

@@ -0,0 +1,2773 @@
1
+ # demian Architecture
2
+
3
+ > Status: canonical integrated architecture with implemented v1 multi-agent core
4
+ > Package name: `demian`
5
+ > Package path: `demian`
6
+ > Inputs: `nodejs/architecture-by-claude.md`, `nodejs/architecture-by-codex.md`, `claude/demian`, `codex/demian`, `.documents/multi-agent-architecture.md`, `.documents/add-codex-provider-by-codex.md`, `.documents/add-claudecode-provider-by-claude.md`, `.documents/add-claudecode-provider-by-codex.md`, `.documents/efficient-context.md`
7
+
8
+ `demian` is a small local coding-agent runtime. The model is replaceable; the runtime owns local execution, policy, observability, and safety boundaries.
9
+
10
+ The integrated design takes the strongest parts of both source designs:
11
+
12
+ - From the Claude design: philosophy, decision history, provider rationale, tradeoff tables, command hook protocol, detailed tool safety notes.
13
+ - From the Codex design: executable runtime shape, Node-first CLI, streaming, multimodal input, sandbox modes, persistent grants, network-free tests.
14
+
15
+ Where the earlier designs disagree, this document chooses the path that best fits the new integrated package. In particular, `streaming`, `multimodal`, `sandbox`, and `persistentGrants` are included in the integrated v0 because they are already represented in `codex/demian`.
16
+
17
+ This revision also folds in the target multi-agent design. The important design boundary is that demian does not become an autonomous swarm framework. It becomes an opt-in, root-owned delegation runtime: the root session owns authority, the main agent talks to the user, and sub agents run as bounded child invocations under the same permission, transcript, sandbox, and cancellation services.
18
+
19
+ ---
20
+
21
+ ## 1. Identity
22
+
23
+ `demian` is not:
24
+
25
+ - an IDE
26
+ - an LLM gateway
27
+ - a generic agent framework
28
+ - a general autonomous multi-agent orchestration platform
29
+ - a LangChain or Vercel AI SDK wrapper
30
+
31
+ `demian` is:
32
+
33
+ - a local-first coding-agent runtime
34
+ - a provider-neutral session loop
35
+ - a tool execution boundary
36
+ - a hook and permission policy layer
37
+ - an opt-in root-owned delegation runtime
38
+ - an observable CLI package
39
+
40
+ Canonical single-agent runtime flow:
41
+
42
+ ```text
43
+ user prompt
44
+ -> config + agent resolution
45
+ -> provider request
46
+ -> model response
47
+ -> optional tool call
48
+ -> hooks
49
+ -> permission
50
+ -> local tool execution
51
+ -> tool result
52
+ -> provider request
53
+ -> final answer
54
+ ```
55
+
56
+ Canonical multi-agent runtime flow:
57
+
58
+ ```text
59
+ root session owns authority
60
+ -> main agent talks to the user
61
+ -> main agent may call delegate_agent
62
+ -> child AgentInvocation runs a bounded AgentSession
63
+ -> child tool and agent permissions route to the root session
64
+ -> compact child result returns to main as a tool result
65
+ -> final main answer
66
+ ```
67
+
68
+ The two technical horns remain:
69
+
70
+ - **Hooks**: lifecycle observers and policy gates.
71
+ - **Tools**: the only capabilities that can touch the workspace or process execution.
72
+
73
+ Agents add a third runtime concept, but not a third side-effect boundary. An agent is a **policy capsule**: provider profile reference, system prompt, visible tools, permission defaults, delegation policy, and catalog metadata. Sub agents are invoked through a stable `delegate_agent` tool, but the invocation itself is session-backed because it has model turns, tool calls, permissions, transcript events, and bounded memory.
74
+
75
+ The model requests work. Hooks and permissions decide whether local work may happen. Tools execute it. Agents organize model behavior. Events and transcripts remember the full chronology.
76
+
77
+ ---
78
+
79
+ ## 2. Design Principles
80
+
81
+ | Principle | Meaning |
82
+ |-----------|---------|
83
+ | Local-first execution | Provider calls may be remote, but file and process execution are local and bounded to `cwd`. |
84
+ | OpenAI-compatible first | OpenAI, Gemini, Ollama, LM Studio, vLLM, llama.cpp, OpenRouter, Together, Groq, and Azure OpenAI share one provider path. Provider-native exceptions such as Anthropic, Codex, and Claude Code stay isolated adapters. |
85
+ | OpenAI-shaped internal messages | Internal history uses `system`, `user`, `assistant`, and `tool` messages. Anthropic converts at the adapter boundary. |
86
+ | Explicit side-effect boundary | Every side-effecting action passes through hook dispatch and permission evaluation. |
87
+ | Hard/global deny dominates | Built-in hard deny and global explicit deny override session grants, persistent grants, and `--yes`. Agent-local deny can be overridden only by explicit root-user approval. |
88
+ | Root owns authority | Permissions, grants, prompts, transcript, cancellation, cwd, and sandbox policy belong to the root session, not to individual agents. |
89
+ | Agents are policy capsules | An agent definition includes prompt, provider profile reference, visible tools, defaults, delegation rules, and catalog metadata. |
90
+ | Visibility is not permission | A tool can be installed without being visible to an agent, and a grant can exist without making a hidden tool callable by that agent. |
91
+ | Delegation is tool-entry, session-backed | The main model sees a `delegate_agent` tool, while the runtime creates a child AgentSession with its own bounded context and memory. |
92
+ | Observable runtime | Events, transcript JSONL, tool previews, retry events, and permission events are first-class. |
93
+ | Single-agent remains first-class | Multi-agent mode is opt-in and must not change the default single-agent UX or safety model. |
94
+ | Small core | Native Gemini, Vertex AI, MCP, plugins, background agents, and marketplace features are extensions, not core dependencies. |
95
+
96
+ Measurable architecture property:
97
+
98
+ ```text
99
+ Adding a new OpenAI-compatible provider must require zero changes to Session Runner,
100
+ tools, hooks, permissions, messages, and transcript code.
101
+ ```
102
+
103
+ Gemini is the first proof of this property.
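One way to read this property is that a new OpenAI-compatible backend is only a config entry. The sketch below illustrates that idea under stated assumptions: `ProviderProfile`, `ChatEndpoint`, and `resolveChatEndpoint` are hypothetical names, not part of the package, and the model names are placeholders. The base URLs are the publicly documented OpenAI-compatible endpoints for Gemini and Ollama and should be verified before use.

```typescript
// Hypothetical profile shape for illustration; not demian's actual config schema.
interface ProviderProfile {
  type: "openai-compatible"
  baseURL: string
  apiKeyEnv?: string // name of the environment variable that holds the key
  model: string
}

// Adding a backend is one more entry like these; the resolver below does not change.
const geminiProfile: ProviderProfile = {
  type: "openai-compatible",
  baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
  apiKeyEnv: "GEMINI_API_KEY",
  model: "gemini-2.0-flash", // placeholder model name
}

const ollamaProfile: ProviderProfile = {
  type: "openai-compatible",
  baseURL: "http://localhost:11434/v1",
  model: "llama3.1", // placeholder model name; local servers usually need no key
}

interface ChatEndpoint {
  url: string
  headers: Record<string, string>
  model: string
}

// Builds the chat-completions URL and headers from a profile and an environment map.
function resolveChatEndpoint(
  profile: ProviderProfile,
  env: Record<string, string | undefined>,
): ChatEndpoint {
  const url = profile.baseURL.replace(/\/+$/, "") + "/chat/completions"
  const headers: Record<string, string> = { "content-type": "application/json" }
  const key = profile.apiKeyEnv ? env[profile.apiKeyEnv] : undefined
  if (key) headers["authorization"] = `Bearer ${key}`
  return { url, headers, model: profile.model }
}
```

In this sketch the session loop would only ever see a `ChatEndpoint`, which is the sense in which no runner, tool, hook, or transcript code needs to change.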
104
+
105
+ ---
106
+
107
+ ## 3. Scope
108
+
109
+ ### v0 Includes
110
+
111
+ - CLI package named `demian`
112
+ - Node 22+ runtime with TypeScript type-stripping support
113
+ - Bun-compatible package shape where practical
114
+ - OpenAI-compatible provider with `chat()` and optional `stream()`
115
+ - Codex provider using local Codex ChatGPT login and Responses API
116
+ - Claude Code external runtime using Claude Agent SDK by default, with explicit CLI fallback
117
+ - Gemini through the OpenAI-compatible endpoint
118
+ - Anthropic adapter
119
+ - OpenAI-shaped message history
120
+ - Six built-in tools
121
+ - Hook lifecycle plus command hooks
122
+ - Four built-in safety hooks
123
+ - Agent registry with `build` and `plan`
124
+ - Permission engine with session grants
125
+ - Persistent grants with TTL
126
+ - Bash sandbox modes: `off`, `read-only`, `workspace-write`
127
+ - Multimodal user input through repeated `--image`
128
+ - Runtime events and JSONL transcripts
129
+ - Network-free tests
130
+
131
+ ### Implemented v1 Multi-Agent Core
132
+
133
+ - Mode switch: `single-agent` by default, `multi-agent` by config or flag
134
+ - Extensible `ToolRegistry` and `AgentRegistry`
135
+ - `AgentDefinition` as a policy capsule
136
+ - Config and programmatic agent registration
137
+ - Provider profile references per agent, with provider configs still root-owned
138
+ - RootSession above SessionRunner in multi-agent mode
139
+ - Shared root grants for main and child invocations
140
+ - Stable `delegate_agent` tool exposed only in multi-agent mode
141
+ - Child AgentSessionRunner with bounded context, compressed memory, and synchronous v1 execution
142
+ - Structured handoff packets instead of full main-history sharing
143
+ - Compact child results returned to main plus full transcript/artifact references
144
+ - Parent-child invocation events in one root transcript
145
+ - Context budget knobs for main and sub agents
146
+
147
+ ### Deferred
148
+
149
+ | Deferred item | Reason | Future extension |
150
+ |---------------|--------|------------------|
151
+ | Native Gemini SDK | OpenAI-compatible path is sufficient for coding-agent v0 | `providers/gemini.ts` |
152
+ | Vertex AI | Different auth and enterprise flow | `providers/vertex.ts` |
153
+ | MCP | Core loop should stabilize first | MCP-backed Tool wrapper |
154
+ | Plugin marketplace | Requires trust and manifest design | Plugin loader |
155
+ | Filesystem agent discovery | Markdown loader and trust UX should land with the trust store | `agents/*.md` loader |
156
+ | External tool module loading | Loading JS/MJS is arbitrary local code execution | Trust-gated tool loader |
157
+ | Trust persistence | Should be introduced with filesystem discovery | `.demian/trust.json` |
158
+ | Background subagents | Permission ordering, cancellation, and transcript UX need the synchronous model first | Background invocation queue |
159
+ | Autonomous swarm routing | demian is a bounded local coding runtime, not a general multi-agent framework | Explicit router feature if proven necessary |
160
+ | Parallel tool calls | Deterministic ordering matters first | Read-only parallel lane |
161
+ | Full secret redaction | Cannot be guaranteed with regex only | Redaction policy registry |
162
+ | Provider raw advanced features | Can weaken provider portability | Explicit `ProviderCapabilities` |
163
+ | Per-agent secrets | Provider credentials should remain config/provider-resolver owned | Secret-scoped provider profiles |
164
+
165
+ ---
166
+
167
+ ## 4. System Architecture
168
+
169
+ ```text
170
+ ┌──────────────────────────────────────────────────────────────┐
171
+ │ CLI │
172
+ │ flags / config / provider resolution / permission prompt │
173
+ └──────────────────────────────┬───────────────────────────────┘
174
+ v
175
+ ┌──────────────────────────────────────────────────────────────┐
176
+ │ Session Runner │
177
+ │ OpenAI-shaped history / turns / streaming / tool lifecycle │
178
+ └───────┬──────────────┬───────────────┬───────────────┬────────┘
179
+ v v v v
180
+ ┌────────────┐ ┌────────────┐ ┌─────────────┐ ┌──────────────┐
181
+ │ Provider │ │ Tool │ │ Hook │ │ Permission │
182
+ │ + retry │ │ Registry │ │ Dispatcher │ │ Engine │
183
+ └─────┬──────┘ └─────┬──────┘ └──────┬──────┘ └──────┬───────┘
184
+ v v v v
185
+ [LLM APIs] [Local OS] [Config/hooks] [Agent policy]
186
+
187
+ OpenAIProvider
188
+ -> OpenAI
189
+ -> Gemini OpenAI-compatible
190
+ -> Ollama / LM Studio / vLLM / llama.cpp
191
+ -> OpenRouter / Together / Groq / Azure OpenAI
192
+
193
+ AnthropicProvider
194
+ -> Anthropic API
195
+
196
+ CodexProvider
197
+ -> local Codex auth store
198
+ -> ChatGPT-backed Codex Responses API
199
+
200
+ Claude Code external runtime
201
+ -> Claude Agent SDK by default
202
+ -> explicit claude -p CLI fallback
203
+ -> local Claude Code login / Agent SDK plan auth
204
+ ```
205
+
206
+ The Session Runner is the kernel. Provider, tools, hooks, permissions, sandbox, and transcript are replaceable services around it.
207
+
208
+ In multi-agent mode, RootSession becomes the outer lifetime:
209
+
210
+ ```text
211
+ UI process
212
+ -> RootSession
213
+ owns provider registry, agent registry, tool registry
214
+ owns permission coordinator, grants, event bus, transcript writer
215
+ owns root abort signal, cwd, sandbox policy, context budgets
216
+
217
+ -> AgentInvocation(main)
218
+ -> SessionRunner(main)
219
+ -> direct tool call
220
+ -> delegate_agent
221
+ -> AgentInvocation(child)
222
+ -> AgentSessionRunner(child)
223
+ -> bounded child model context
224
+ -> child tool calls through same hooks/permissions
225
+ -> child memory compression
226
+ -> child final answer
227
+ -> compact child result as tool message
228
+ -> final main answer
229
+ ```
230
+
231
+ This design uses a child session model instead of a simple inner loop inside the main SessionRunner. The gain is that a reviewer, builder, or researcher can preserve bounded working memory across repeated delegations, can use a different provider profile, and can have its own prompt and turn budget. The cost is extra lifecycle, transcript, and compression machinery. That cost is accepted because agent memory and provider identity are real runtime state, not just prompt text.
232
+
233
+ ---
234
+
235
+ ## 5. Directory Layout
236
+
237
+ ```text
238
+ nodejs/
239
+ package.json
240
+ tsconfig.json
241
+ README.md
242
+ architecture.md
243
+ architecture-by-claude.md
244
+ architecture-by-codex.md
245
+ bin/
246
+ demian.js
247
+ demian-cli.js
248
+ demian-plain.js
249
+ src/
250
+ index.ts
251
+ cli.ts
252
+ tui.ts
253
+ config.ts
254
+ session.ts
255
+ root-session.ts
256
+ events.ts
257
+ transcript.ts
258
+ messages.ts
259
+ multimodal.ts
260
+ id.ts
261
+ util.ts
262
+
263
+ providers/
264
+ types.ts
265
+ retry.ts
266
+ openai.ts
267
+ anthropic.ts
268
+ codex.ts
269
+ codex-auth.ts
270
+ codex-state.ts
271
+ codex-stream.ts
272
+ claudecode.ts
273
+ claudecode-auth.ts
274
+ claudecode-stream.ts
275
+
276
+ tools/
277
+ types.ts
278
+ validation.ts
279
+ registry.ts
280
+ output.ts
281
+ read-file.ts
282
+ write-file.ts
283
+ edit-file.ts
284
+ bash.ts
285
+ grep.ts
286
+ glob.ts
287
+ delegate-agent.ts
288
+
289
+ hooks/
290
+ types.ts
291
+ dispatcher.ts
292
+ command.ts
293
+ builtin/
294
+ index.ts
295
+ block-dangerous-bash.ts
296
+ protect-env-files.ts
297
+ mask-secrets.ts
298
+ inject-env-info.ts
299
+
300
+ permissions/
301
+ types.ts
302
+ engine.ts
303
+ grants.ts
304
+ persistent-grants.ts
305
+ prompt.ts
306
+
307
+ agents/
308
+ types.ts
309
+ registry.ts
310
+ build.ts
311
+ plan.ts
312
+ prompts/
313
+ build.txt
314
+ plan.txt
315
+
316
+ workspace/
317
+ paths.ts
318
+ diff.ts
319
+
320
+ sandbox/
321
+ types.ts
322
+ index.ts
323
+ macos.ts
324
+ linux.ts
325
+ env-only.ts
326
+
327
+ ui/
328
+ settings.ts
329
+ markdown/
330
+ render.ts
331
+ plain/
332
+ interactive.ts
333
+ tui/
334
+ app.ts
335
+ controller.ts
336
+ store.ts
337
+
338
+ test/
339
+ provider.test.ts
340
+ session.test.ts
341
+ tools.test.ts
342
+ permission.test.ts
343
+ persistent-grants.test.ts
344
+ multimodal.test.ts
345
+ hooks.test.ts
346
+ sandbox.test.ts
347
+ multi-agent.test.ts
348
+ ```
349
+
350
+ ### Registry Openness
351
+
352
+ Addable units:
353
+
354
+ ```text
355
+ provider config
356
+ -> only from demian config
357
+
358
+ tool definition
359
+ -> built-in
360
+ -> programmatic registration
361
+ -> user/project filesystem scopes later behind trust
362
+
363
+ agent definition
364
+ -> built-in
365
+ -> config.agents
366
+ -> programmatic registration
367
+ -> user/project markdown scopes later behind trust
368
+ ```
369
+
370
+ Provider is intentionally not an agent-provided addable unit. Endpoints, credentials, auth headers, and quirks stay in config. This choice gives up the convenience of self-contained agent files, but it keeps data routing and cost visible to the user.
371
+
372
+ Registry APIs:
373
+
374
+ ```ts
375
+ export class ToolRegistry {
376
+ register(tool: ToolDefinition, source?: RegistrySource): void
377
+ get(name: string): ToolDefinition | undefined
378
+ list(): ToolDefinition[]
379
+ filter(names: string[]): ToolRegistry
380
+ }
381
+
382
+ export class AgentRegistry {
383
+ register(agent: AgentDefinition, source?: RegistrySource): void
384
+ get(name: string): AgentDefinition
385
+ list(filter?: AgentListFilter): AgentDefinition[]
386
+ primary(): AgentDefinition[]
387
+ callable(): AgentDefinition[]
388
+ catalog(): AgentCatalogEntry[]
389
+ }
390
+
391
+ export interface RegistrySource {
392
+ scope: "builtin" | "user" | "project" | "programmatic"
393
+ path?: string
394
+ trusted: boolean
395
+ }
396
+ ```
397
+
398
+ Collision policy:
399
+
400
+ ```text
401
+ built-in name wins
402
+ -> user/project/programmatic contribution with same name is rejected
403
+ -> warning event is emitted
404
+ ```
405
+
406
+ Automatic discovery must not silently change built-in behavior. Explicit override can be added later as a separate config field if needed.
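The collision policy above can be sketched as a small generic registry. This is an illustration only: `SketchRegistry` and its `warnings` array (a stand-in for the warning event) are hypothetical, `register` returns a boolean here purely to make the outcome visible, and the handling of collisions between two non-built-in contributions is an assumption the source does not specify.

```typescript
type Scope = "builtin" | "user" | "project" | "programmatic"

interface RegistrySource {
  scope: Scope
  path?: string
  trusted: boolean
}

interface NamedDefinition {
  name: string
}

// Hypothetical registry that enforces "built-in name wins".
class SketchRegistry<T extends NamedDefinition> {
  private entries = new Map<string, { def: T; source: RegistrySource }>()
  // Stand-in for emitting a warning event on the event bus.
  readonly warnings: string[] = []

  register(def: T, source: RegistrySource = { scope: "programmatic", trusted: true }): boolean {
    const existing = this.entries.get(def.name)
    if (existing && existing.source.scope === "builtin" && source.scope !== "builtin") {
      // Reject the contribution instead of silently replacing built-in behavior.
      this.warnings.push(`rejected ${source.scope} "${def.name}": built-in name wins`)
      return false
    }
    // Assumption: any other collision lets the later registration replace the earlier one.
    this.entries.set(def.name, { def, source })
    return true
  }

  get(name: string): T | undefined {
    return this.entries.get(name)?.def
  }
}
```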
407
+
408
+ Migration guidance:
409
+
410
+ - Use `codex/demian/src/messages.ts`, `session.ts`, `cli.ts`, `config.ts`, and `providers/openai.ts` as primary implementation sources.
411
+ - Use `claude/demian/src/sandbox/` as the shape for a sandbox adapter directory.
412
+ - Preserve the Claude document's decision rationale and safety tables in this canonical architecture.
413
+ - Do not copy `node_modules`.
414
+
415
+ ---
416
+
417
+ ## 6. Core Types
418
+
419
+ ### Provider
420
+
421
+ ```ts
422
+ export interface Provider {
423
+ id: "openai-compatible" | "anthropic" | string
424
+ chat(req: ChatRequest): Promise<ChatResponse>
425
+ stream?(req: ChatRequest): AsyncIterable<ChatStreamEvent>
426
+ }
427
+
428
+ export interface ChatRequest {
429
+ messages: Message[]
430
+ tools: Tool[]
431
+ model: string
432
+ maxTokens?: number
433
+ temperature?: number
434
+ signal: AbortSignal
435
+ }
436
+
437
+ export interface ChatResponse {
438
+ message: AssistantMessage
439
+ toolCalls: ToolCall[]
440
+ stopReason: "end_turn" | "tool_use" | "max_tokens" | "content_filter" | "error"
441
+ usage?: TokenUsage
442
+ raw: unknown
443
+ }
444
+
445
+ export type ChatStreamEvent =
446
+ | { type: "text_delta"; text: string; raw?: unknown }
447
+ | { type: "tool_call_delta"; index: number; id?: string; name?: string; arguments?: string; raw?: unknown }
448
+ | { type: "done"; response: ChatResponse }
449
+ ```
450
+
451
+ Invariant:
452
+
453
+ ```text
454
+ stopReason === "tool_use" iff toolCalls.length > 0
455
+ ```
456
+
457
+ Provider safety blocks become `content_filter` when detectable. Unknown malformed responses become `error`.
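A minimal sketch of how an OpenAI-compatible adapter might normalize a response into `stopReason` while keeping the invariant. The function name `normalizeStopReason` is hypothetical. The `finish_reason` strings are the ones documented for the OpenAI Chat Completions API. Treating any response with parsed tool calls as `tool_use`, even when the provider reported something else, is a simplification chosen for this sketch.

```typescript
type StopReason = "end_turn" | "tool_use" | "max_tokens" | "content_filter" | "error"

// Maps an OpenAI-style finish_reason plus the number of parsed tool calls
// to the internal stop reason.
function normalizeStopReason(
  finishReason: string | null | undefined,
  toolCallCount: number,
): StopReason {
  // Invariant: stopReason is "tool_use" exactly when tool calls are present.
  if (toolCallCount > 0) return "tool_use"
  switch (finishReason) {
    case "stop":
      return "end_turn"
    case "length":
      return "max_tokens"
    case "content_filter":
      return "content_filter"
    default:
      // Includes "tool_calls" with no parsed calls, null, and unknown values:
      // all are treated as malformed responses.
      return "error"
  }
}
```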
458
+
459
+ ### Message
460
+
461
+ ```ts
462
+ export type Message =
463
+ | SystemMessage
464
+ | UserMessage
465
+ | AssistantMessage
466
+ | ToolMessage
467
+
468
+ export interface SystemMessage {
469
+ role: "system"
470
+ content: string
471
+ }
472
+
473
+ export interface UserMessage {
474
+ role: "user"
475
+ content: string | UserContentPart[]
476
+ }
477
+
478
+ export interface AssistantMessage {
479
+ role: "assistant"
480
+ content?: string | null
481
+ toolCalls?: ToolCall[]
482
+ }
483
+
484
+ export interface ToolMessage {
485
+ role: "tool"
486
+ toolCallId: string
487
+ name: string
488
+ content: string
489
+ isError?: boolean
490
+ }
491
+
492
+ export interface ToolCall {
493
+ id: string
494
+ name: string
495
+ input: unknown
496
+ }
497
+
498
+ export type UserContentPart =
499
+ | { type: "text"; text: string }
500
+ | { type: "image_url"; image_url: { url: string; detail?: "auto" | "low" | "high" } }
501
+ ```
502
+
503
+ OpenAI-compatible providers receive this shape almost directly. Anthropic converts system messages, tool calls, tool results, and image parts inside `AnthropicProvider`.
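The adapter-boundary conversion could look roughly like the sketch below. It is not the package's `AnthropicProvider`: `toAnthropic` is a hypothetical helper, the message types are trimmed copies of the ones above, and user content is limited to plain strings (image parts are omitted). The block shapes (`tool_use`, `tool_result`, top-level `system`) follow the Anthropic Messages API.

```typescript
type ToolCall = { id: string; name: string; input: unknown }

// Trimmed internal message shape: user content is string-only in this sketch.
type Message =
  | { role: "system"; content: string }
  | { role: "user"; content: string }
  | { role: "assistant"; content?: string | null; toolCalls?: ToolCall[] }
  | { role: "tool"; toolCallId: string; name: string; content: string; isError?: boolean }

type AnthropicBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string; is_error: boolean }

interface AnthropicMessage {
  role: "user" | "assistant"
  content: AnthropicBlock[]
}

function toAnthropic(messages: Message[]): { system: string; messages: AnthropicMessage[] } {
  const system: string[] = []
  const out: AnthropicMessage[] = []
  for (const m of messages) {
    if (m.role === "system") {
      system.push(m.content) // system text moves to the top-level field
      continue
    }
    if (m.role === "user") {
      out.push({ role: "user", content: [{ type: "text", text: m.content }] })
      continue
    }
    if (m.role === "assistant") {
      const blocks: AnthropicBlock[] = []
      if (m.content) blocks.push({ type: "text", text: m.content })
      for (const c of m.toolCalls ?? []) {
        blocks.push({ type: "tool_use", id: c.id, name: c.name, input: c.input })
      }
      out.push({ role: "assistant", content: blocks })
      continue
    }
    // role === "tool": results travel in a user message; consecutive results share one message.
    const block: AnthropicBlock = {
      type: "tool_result",
      tool_use_id: m.toolCallId,
      content: m.content,
      is_error: m.isError === true,
    }
    const last = out[out.length - 1]
    if (last && last.role === "user" && last.content.every((b) => b.type === "tool_result")) {
      last.content.push(block)
    } else {
      out.push({ role: "user", content: [block] })
    }
  }
  return { system: system.join("\n\n"), messages: out }
}
```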
504
+
505
+ ### Tool
506
+
507
+ ```ts
508
+ export interface Tool {
509
+ name: string
510
+ description: string
511
+ inputSchema: JsonSchema
512
+ execute(input: unknown, ctx: ToolContext): Promise<ToolResult>
513
+ }
514
+
515
+ export interface ToolDefinition extends Tool {
516
+ metadata?: {
517
+ sideEffect?: "none" | "workspace" | "process" | "network" | "external"
518
+ defaultDecision?: "allow" | "ask" | "deny"
519
+ trust?: "builtin" | "user" | "project" | "programmatic"
520
+ }
521
+ }
522
+
523
+ export interface AgentInvocationInput {
524
+ agent: string
525
+ task: string
526
+ context?: string
527
+ contextRefs?: string[]
528
+ relevantFiles?: string[]
529
+ constraints?: string[]
530
+ expectedOutput?: string
531
+ maxTurns?: number
532
+ returnMode?: "brief" | "normal"
533
+ }
534
+
535
+ export interface ToolContext {
536
+ rootSessionId?: string
537
+ sessionId: string
538
+ invocationId?: string
539
+ parentInvocationId?: string
540
+ agent?: string
541
+ callId: string
542
+ cwd: string
543
+ signal: AbortSignal
544
+ emit(event: RuntimeEvent): void
545
+ ask(req: PermissionRequest): Promise<PermissionAnswer>
546
+ sandbox?: SandboxConfig
547
+ dryRun?: boolean
548
+ runner?: {
549
+ runAgentSession(input: AgentInvocationInput): Promise<ToolResult>
550
+ }
551
+ }
552
+
553
+ export interface ToolResult {
554
+ ok: boolean
555
+ content: string
556
+ metadata?: Record<string, unknown>
557
+ }
558
+ ```
559
+
560
+ Inputs are `unknown` until validated at runtime. JSON Schema sent to the model is guidance, not trust.
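As an illustration of treating inputs as `unknown` until validated, the sketch below checks a hypothetical `read_file` input by hand. The field names `path` and `maxBytes` and the `Validation` result type are assumptions made for the example; the package's real schemas and its `validation.ts` may differ.

```typescript
// Hypothetical input shape for a read_file-style tool.
interface ReadFileInput {
  path: string
  maxBytes?: number
}

type Validation<T> = { ok: true; value: T } | { ok: false; error: string }

// Narrows an untrusted model-supplied value to ReadFileInput, or explains why not.
function validateReadFileInput(input: unknown): Validation<ReadFileInput> {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return { ok: false, error: "input must be an object" }
  }
  const rec = input as Record<string, unknown>
  const path = rec["path"]
  const maxBytes = rec["maxBytes"]
  if (typeof path !== "string" || path.length === 0) {
    return { ok: false, error: "path must be a non-empty string" }
  }
  if (maxBytes === undefined) return { ok: true, value: { path } }
  if (typeof maxBytes !== "number" || !Number.isInteger(maxBytes) || maxBytes <= 0) {
    return { ok: false, error: "maxBytes must be a positive integer" }
  }
  return { ok: true, value: { path, maxBytes } }
}
```

Only the validated `value` would be handed to `execute()` in this sketch; the raw model output never is.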
561
+
562
+ External tools are addable capabilities, not permission authorities. Even if an external tool calls `ctx.ask()`, the request routes to the root PermissionCoordinator. External tools also cannot start sub agents directly in v1; the supported path is for the main model to call `delegate_agent`, which uses the narrow `runner.runAgentSession()` bridge owned by the runtime.
563
+
564
+ ### Hook
565
+
566
+ ```ts
567
+ export type HookEvent =
568
+ | "SessionStart"
569
+ | "SessionEnd"
570
+ | "UserPromptSubmit"
571
+ | "BeforeModelRequest"
572
+ | "AfterModelResponse"
573
+ | "PreToolUse"
574
+ | "PostToolUse"
575
+ | "ToolError"
576
+ | "PermissionRequest"
577
+ | "Stop"
578
+
579
+ export type HookKind = "builtin" | "command" | "script"
580
+
581
+ export interface Hook {
582
+ name: string
583
+ event: HookEvent
584
+ kind: HookKind
585
+ match?: HookMatch
586
+ command?: string
587
+ modulePath?: string
588
+ timeoutMs?: number
589
+ }
590
+
591
+ export interface HookResult {
592
+ decision?: "allow" | "warn" | "block"
593
+ message?: string
594
+ patch?: unknown
595
+ metadata?: Record<string, unknown>
596
+ }
597
+ ```
598
+
599
+ Failure policy:
600
+
601
+ | Hook kind | Failure behavior |
602
+ |-----------|------------------|
603
+ | `builtin` | fail-closed |
604
+ | `command` | fail-open with warning |
605
+ | `script` | fail-open with warning |
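The failure policy in the table can be sketched as a wrapper around a single hook call. The names `SketchHook` and `runHookWithFailurePolicy` are hypothetical, and the hook runs synchronously here for brevity; a real dispatcher would presumably be asynchronous and enforce `timeoutMs`.

```typescript
type HookKind = "builtin" | "command" | "script"

interface HookResult {
  decision?: "allow" | "warn" | "block"
  message?: string
}

// Hypothetical synchronous hook for illustration.
interface SketchHook {
  name: string
  kind: HookKind
  run(): HookResult
}

// Applies the per-kind failure policy when a hook throws.
function runHookWithFailurePolicy(hook: SketchHook): HookResult {
  try {
    return hook.run()
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err)
    if (hook.kind === "builtin") {
      // Fail-closed: a broken safety hook must not let the action through.
      return { decision: "block", message: `builtin hook ${hook.name} failed: ${detail}` }
    }
    // Fail-open with warning: user-supplied hooks cannot wedge the session.
    return { decision: "warn", message: `${hook.kind} hook ${hook.name} failed: ${detail}` }
  }
}
```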
606
+
607
+ ### Permission
608
+
609
+ ```ts
610
+ export type Decision = "allow" | "ask" | "deny"
611
+
612
+ export interface PermissionRule {
613
+ tool: string
614
+ match?: {
615
+ pathGlob?: string
616
+ commandPrefix?: string
617
+ }
618
+ decision: Decision
619
+ reason?: string
620
+ }
621
+
622
+ export interface PermissionAnswer {
623
+ decision: Decision
624
+ always?: boolean
625
+ reason?: string
626
+ }
627
+
628
+ export interface PermissionCoordinator {
629
+ evaluateTool(req: ToolPermissionRequest): Promise<PermissionAnswer>
630
+ evaluateAgent(req: AgentPermissionRequest): Promise<PermissionAnswer>
631
+ ask(req: PermissionRequest): Promise<PermissionAnswer>
632
+ }
633
+
634
+ export interface ToolPermissionRequest {
635
+ rootSessionId?: string
636
+ invocationId?: string
637
+ agent: string
638
+ agentPath?: string[]
639
+ tool: string
640
+ input: unknown
641
+ effectiveRules: PermissionRule[]
642
+ }
643
+
644
+ export interface AgentPermissionRequest {
645
+ rootSessionId: string
646
+ parentInvocationId: string
647
+ agentPath: string[]
648
+ targetAgent: string
649
+ providerProfile: string
650
+ handoffPreview: string
651
+ }
652
+ ```
653
+
654
+ The current implementation realizes this coordinator contract through `RootSession` plus shared `SessionGrants`, shared `permissionPrompt`, and a shared root `TranscriptWriter`. A standalone `permissions/coordinator.ts` class is still a valid extraction point, but the first implementation keeps the coordination in the root runtime to avoid an extra abstraction before the behavior is stable.
655
+
656
+ `always` is represented as `{ decision: "allow", always: true }`, matching the Codex implementation shape.
657
+
658
+ Single-agent permission evaluation order:
659
+
660
+ ```text
661
+ 1. built-in hard deny
662
+ 2. explicit deny rules
663
+ 3. session grants and unexpired persistent grants
664
+ 4. most specific allow/ask rule
665
+ 5. agent default decision
666
+ 6. built-in default ask
667
+ ```
668
+
669
+ Multi-agent permission evaluation separates safety deny from agent-local policy:
670
+
671
+ ```text
672
+ 1. built-in hard deny
673
+ 2. global explicit deny rules
674
+ 3. root session grants and unexpired persistent grants
675
+ 4. most specific rule from the current caller agent
676
+ 5. if an agent-local deny matched, ask the user whether to expand global authority
677
+ 6. root default decision
678
+ 7. built-in default ask
679
+ ```
680
+
681
+ The invariant is narrower and stronger than "all deny wins":
682
+
683
+ ```text
684
+ hard deny or global explicit deny cannot be overridden by any agent or grant
685
+ ```
686
+
687
+ Agent-local deny is a behavioral default. It can be overridden only by a user-approved root grant, never by another agent.
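The multi-agent evaluation order can be sketched as a single function that returns the deciding step. Everything here is illustrative: `EvalInput`, `Outcome`, and `evaluate` are hypothetical names, rules match only on tool name and an optional command prefix, and "most specific" is approximated as "longest matching prefix".

```typescript
type Decision = "allow" | "ask" | "deny"

interface Rule { tool: string; commandPrefix?: string; decision: Decision }
interface Grant { key: string; expiresAt?: number } // no expiresAt = session grant

interface EvalInput {
  tool: string
  command?: string
  grantKey: string
  hardDeny: (tool: string, command?: string) => boolean
  globalRules: Rule[]
  agentRules: Rule[]
  grants: Grant[]
  rootDefault?: Decision
  now: number
}

interface Outcome { decision: Decision; step: number; expandsAgentDeny?: boolean }

function matches(rule: Rule, tool: string, command?: string): boolean {
  if (rule.tool !== tool) return false
  return rule.commandPrefix === undefined || (command ?? "").startsWith(rule.commandPrefix)
}

// Assumption: the matching rule with the longest command prefix is the most specific.
function mostSpecific(rules: Rule[], tool: string, command?: string): Rule | undefined {
  let best: Rule | undefined
  for (const r of rules) {
    if (!matches(r, tool, command)) continue
    if (!best || (r.commandPrefix?.length ?? -1) > (best.commandPrefix?.length ?? -1)) best = r
  }
  return best
}

function evaluate(i: EvalInput): Outcome {
  if (i.hardDeny(i.tool, i.command)) return { decision: "deny", step: 1 }
  if (i.globalRules.some((r) => r.decision === "deny" && matches(r, i.tool, i.command))) {
    return { decision: "deny", step: 2 }
  }
  const live = (g: Grant) => g.expiresAt === undefined || g.expiresAt > i.now
  if (i.grants.some((g) => g.key === i.grantKey && live(g))) return { decision: "allow", step: 3 }
  const rule = mostSpecific(i.agentRules, i.tool, i.command)
  if (rule && rule.decision !== "deny") return { decision: rule.decision, step: 4 }
  // Agent-local deny is a default: only the root user may expand authority.
  if (rule) return { decision: "ask", step: 5, expandsAgentDeny: true }
  if (i.rootDefault) return { decision: i.rootDefault, step: 6 }
  return { decision: "ask", step: 7 }
}

// Example input: a "git push" bash call with no rules or grants yet.
const exampleInput: EvalInput = {
  tool: "bash",
  command: "git push",
  grantKey: "bash:git push",
  hardDeny: (_tool, command) => (command ?? "").startsWith("rm -rf /"),
  globalRules: [],
  agentRules: [],
  grants: [],
  now: 1000,
}
```

Because grants are checked at step 3, a user-approved root grant takes effect before an agent-local deny is consulted, while steps 1 and 2 are reached before any grant is read.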
688
+
689
+ Grant key policy:
690
+
691
+ | Tool | Grant key |
692
+ |------|-----------|
693
+ | `bash` | first two command tokens |
694
+ | `write_file` | parent directory |
695
+ | `edit_file` | parent directory |
696
+ | other tools | stable JSON stringification of relevant input |
697
+ | agent invocation | `agent:<agentName>:<providerProfile>` |
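A minimal sketch of the grant key policy in the table. The helper names and the input field names `command` and `path` are assumptions, as is the `tool:` prefix on tool keys; only the `agent:<agentName>:<providerProfile>` format is taken from the table.

```typescript
// Deterministic JSON: object keys are sorted so equal inputs give equal keys.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`
  if (value !== null && typeof value === "object") {
    const rec = value as Record<string, unknown>
    const keys = Object.keys(rec).filter((k) => rec[k] !== undefined).sort()
    return `{${keys.map((k) => `${JSON.stringify(k)}:${stableStringify(rec[k])}`).join(",")}}`
  }
  return JSON.stringify(value) ?? "null"
}

// Parent directory of a POSIX-style path, without touching the filesystem.
function parentDir(p: string): string {
  const i = p.lastIndexOf("/")
  if (i < 0) return "."
  return i === 0 ? "/" : p.slice(0, i)
}

function toolGrantKey(tool: string, input: unknown): string {
  const rec = (input !== null && typeof input === "object" ? input : {}) as Record<string, unknown>
  if (tool === "bash") {
    const raw = rec["command"]
    const command = typeof raw === "string" ? raw : ""
    // First two command tokens, e.g. "git status" for "git status --short".
    const tokens = command.trim().split(/\s+/).filter(Boolean).slice(0, 2)
    return `bash:${tokens.join(" ")}`
  }
  if (tool === "write_file" || tool === "edit_file") {
    const raw = rec["path"]
    return `${tool}:${parentDir(typeof raw === "string" ? raw : "")}`
  }
  return `${tool}:${stableStringify(input)}`
}

function agentGrantKey(agentName: string, providerProfile: string): string {
  return `agent:${agentName}:${providerProfile}`
}
```

Keying `bash` on two tokens means a grant for `git status` does not cover `git push`, and keying file writes on the parent directory means one approval covers sibling files but not other directories.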
698
+
699
+ ### AgentDefinition
700
+
701
+ The v0 `Agent` shape is enough for one runner. The multi-agent design normalizes agents into `AgentDefinition` and adapts them back to the current runner shape when needed.
702
+
703
+ ```ts
704
+ export interface AgentDefinition {
705
+ name: string
706
+ displayName?: string
707
+ description: string
708
+ mode?: "primary" | "subagent" | "all"
709
+ hidden?: boolean
710
+
711
+ provider?: {
712
+ profile?: string
713
+ inheritRoot?: boolean
714
+ }
715
+
716
+ prompt: {
717
+ system: string
718
+ append?: string
719
+ }
720
+
721
+ tools: {
722
+ visible: string[]
723
+ }
724
+
725
+ authority?: {
726
+ tools?: string[]
727
+ permissions?: PermissionRule[]
728
+ defaultDecision?: "allow" | "ask" | "deny"
729
+ }
730
+
731
+ permissions: PermissionRule[]
732
+ defaultDecision?: "allow" | "ask" | "deny"
733
+
734
+ delegation?: {
735
+ callable?: boolean
736
+ canDelegate?: boolean
737
+ allowedAgents?: string[]
738
+ deniedAgents?: string[]
739
+ maxDepth?: number
740
+ invocationDecision?: "allow" | "ask" | "deny"
741
+ }
742
+
743
+ catalog?: {
744
+ category?: "builder" | "planner" | "reviewer" | "researcher" | "utility" | string
745
+ cost?: "free" | "cheap" | "standard" | "expensive"
746
+ useWhen?: string[]
747
+ avoidWhen?: string[]
748
+ triggers?: string[]
749
+ }
750
+ }
751
+ ```
752
+
753
+ Mode semantics:
754
+
755
+ | Mode | Main agent selection | Sub agent invocation | Intent |
756
+ |------|----------------------|----------------------|--------|
757
+ | `primary` | yes | no | Talks directly to the user |
758
+ | `subagent` | no | yes | Delegation-only specialist |
759
+ | `all` | yes | yes | Can run directly or be delegated to |
760
+
761
+ Provider fields are references only. An agent may point at `config.providers["gemini-fast"]`, but it may not define `baseURL`, credentials, custom headers, auth, quirks, or a direct model override. If an agent needs a different model, config should define a separate provider profile for that model. This prevents agent files from becoming hidden network or credential entrypoints.
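The reference-only rule could be enforced at registration time with a check along these lines. This is a sketch under assumptions: `checkAgentProviderRef` is a hypothetical helper, and apart from `baseURL` the forbidden field names are guesses at how credentials, headers, auth, quirks, and a model override might be spelled in an agent file.

```typescript
// Fields an agent definition may not carry; names other than baseURL are assumed spellings.
const FORBIDDEN_PROVIDER_FIELDS = ["baseURL", "apiKey", "headers", "auth", "quirks", "model"]

// Returns a list of problems; an empty list means the reference is acceptable.
function checkAgentProviderRef(provider: unknown, knownProfiles: readonly string[]): string[] {
  if (provider === undefined) return [] // no reference: the agent uses the root provider
  if (typeof provider !== "object" || provider === null || Array.isArray(provider)) {
    return ["provider must be an object"]
  }
  const rec = provider as Record<string, unknown>
  const errors: string[] = []
  for (const field of FORBIDDEN_PROVIDER_FIELDS) {
    if (field in rec) errors.push(`agent provider may not define ${field}`)
  }
  const profile = rec["profile"]
  if (profile !== undefined && (typeof profile !== "string" || !knownProfiles.includes(profile))) {
    errors.push(`unknown provider profile: ${String(profile)}`)
  }
  return errors
}
```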
762
+
763
+ ### RootSession And AgentSession
764
+
765
+ ```ts
766
+ export interface RootSession {
767
+ id: string
768
+ cwd: string
769
+ mode: "single-agent" | "multi-agent"
770
+ agents: AgentRegistry
771
+ tools: ToolRegistry
772
+ providers: ProviderRegistry
773
+ permissions: PermissionCoordinator
774
+ events: EventBus
775
+ transcript: RootTranscriptWriter
776
+ signal: AbortSignal
777
+ agentSessions: AgentSessionStore
778
+ }
779
+
780
+ export interface AgentInvocation {
781
+ id: string
782
+ rootSessionId: string
783
+ agentSessionId: string
784
+ parentInvocationId?: string
785
+ depth: number
786
+ agentName: string
787
+ providerProfile: string
788
+ model: string
789
+ prompt: string
790
+ status: "created" | "running" | "completed" | "failed" | "cancelled"
791
+ }
792
+
793
+ export interface AgentSession {
794
+ id: string
795
+ rootSessionId: string
796
+ agentName: string
797
+ providerProfile: string
798
+ model: string
799
+ messages: Message[]
800
+ memory: AgentSessionMemory
801
+ contextPolicy: AgentContextPolicy
802
+ updatedAt: number
803
+ }
804
+
805
+ export interface AgentSessionMemory {
806
+ summary: string
807
+ findings: string[]
808
+ decisions: string[]
809
+ openQuestions: string[]
810
+ relevantFiles: string[]
811
+ lastResults: Array<{ task: string; resultPreview: string; ts: number }>
812
+ }
813
+
814
+ export interface AgentContextPolicy {
815
+ maxTurns: number
816
+ maxContextTokens?: number
817
+ maxInputTokens?: number // legacy alias
818
+ maxMessages?: number
819
+ summaryTargetTokens?: number
820
+ recentTurns?: number
821
+ compactAtRatio?: number
822
+ compressAtRatio?: number // legacy alias
823
+ compression: "compact-summary-and-recent" | "summary-and-recent" | "none"
824
+ }
825
+ ```
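The legacy aliases in `AgentContextPolicy` suggest a resolution step before the budget is used. The sketch below shows one plausible reading, not the package's behavior: it assumes the newer field wins when both are set, and that compaction triggers once used tokens reach `maxContextTokens * compactAtRatio`. `BudgetFields` and `shouldCompact` are hypothetical names.

```typescript
// Subset of AgentContextPolicy that this sketch reads.
interface BudgetFields {
  maxContextTokens?: number
  maxInputTokens?: number // legacy alias
  compactAtRatio?: number
  compressAtRatio?: number // legacy alias
  compression: "compact-summary-and-recent" | "summary-and-recent" | "none"
}

// Decides whether the child context should be compacted before the next turn.
function shouldCompact(usedTokens: number, policy: BudgetFields): boolean {
  if (policy.compression === "none") return false
  // Assumption: the newer field takes precedence over its legacy alias.
  const limit = policy.maxContextTokens ?? policy.maxInputTokens
  const ratio = policy.compactAtRatio ?? policy.compressAtRatio
  // Without both a limit and a ratio there is no threshold to compare against.
  if (limit === undefined || ratio === undefined) return false
  return usedTokens >= limit * ratio
}
```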
826
+
827
+ Child session reuse key:
828
+
829
+ ```text
830
+ rootSessionId + agentName + providerProfile + cwd
831
+ ```
832
+
833
+ The first implementation keeps child memory in memory only. Disk persistence is deferred until explicit resume support exists. This chooses predictable current-session continuity over surprising long-lived agent memory.
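An in-memory store keyed by the reuse key might look like the following sketch. `InMemoryAgentSessionStore`, `SketchAgentSession`, and the JSON-array key encoding are assumptions made for illustration; the real `AgentSessionStore` is not shown in this document.

```typescript
interface SessionKeyParts {
  rootSessionId: string
  agentName: string
  providerProfile: string
  cwd: string
}

// Trimmed stand-in for AgentSession.
interface SketchAgentSession {
  id: string
  key: string
  messages: string[]
  updatedAt: number
}

// Encoding the parts as a JSON array avoids ambiguity from separators inside paths or names.
function sessionKey(p: SessionKeyParts): string {
  return JSON.stringify([p.rootSessionId, p.agentName, p.providerProfile, p.cwd])
}

class InMemoryAgentSessionStore {
  private sessions = new Map<string, SketchAgentSession>()
  private nextId = 1

  // Returns the existing child session for this key, or creates a fresh one.
  loadOrCreate(parts: SessionKeyParts, now: number): SketchAgentSession {
    const key = sessionKey(parts)
    let session = this.sessions.get(key)
    if (!session) {
      session = { id: `as_${this.nextId++}`, key, messages: [], updatedAt: now }
      this.sessions.set(key, session)
    }
    return session
  }
}
```

Nothing is written to disk, so the store and every child memory in it disappear with the process, which matches the in-memory-only behavior described above.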
834
+
835
+ In the current CLI and TUI wiring, a RootSession is created for each submitted task. Programmatic callers can keep a RootSession object and reuse it across runs, but interactive long-lived RootSession reuse is left as a follow-up. This intentionally lands the safer synchronous delegation behavior first; cross-prompt subagent memory can be added once the UI can clearly show that memory is being retained.
836
+
837
+ ---
838
+
839
+ ## 7. Session Lifecycle
840
+
841
+ Startup:
842
+
843
+ ```text
844
+ CLI
845
+ -> parse flags
846
+ -> resolve cwd
847
+ -> load config
848
+ -> merge defaults, user config, project config, explicit config, flags
849
+ -> resolve agent mode
850
+ -> resolve agent
851
+ -> resolve provider + model
852
+ -> create RootSession services when multi-agent mode is active
853
+ -> create EventBus
854
+ -> create TranscriptWriter
855
+ -> create SessionRunner
856
+ -> run
857
+ ```
858
+
859
+ Mode-specific startup:
860
+
861
+ ```text
862
+ single-agent:
863
+ one selected agent
864
+ one provider/model for the run
865
+ delegate_agent is not registered or visible
866
+ callable agent catalog is not injected
867
+
868
+ multi-agent:
869
+ selected agent becomes the main agent
870
+ RootSession is intended to span the interactive conversation (the current CLI and TUI create one per submitted task)
871
+ provider/model flags apply to the main invocation only
872
+ callable subagent catalog is injected into the main system prompt
873
+ delegate_agent is visible only if the main agent can delegate
874
+ ```
875
+
876
+ Initial messages:
877
+
878
+ ```ts
879
+ const messages: Message[] = [
880
+ { role: "system", content: buildSystemPrompt(agent, envInfo) },
881
+ { role: "user", content: await buildUserContent(prompt, images, multimodalConfig) },
882
+ ]
883
+ ```
884
+
885
+ One model turn:
886
+
887
+ ```text
888
+ BeforeModelRequest hooks
889
+ -> provider.stream() when enabled and supported
890
+ otherwise provider.chat()
891
+ -> AfterModelResponse hooks
892
+ -> append assistant message
893
+ -> if content_filter: emit and stop
894
+ -> if no tool calls: final answer
895
+ -> run tool calls sequentially
896
+ -> append tool messages
897
+ -> next turn
898
+ ```
899
+
900
+ Tool lifecycle:
901
+
902
+ ```text
903
+ emit tool.requested
904
+ -> validate tool exists and is allowed for agent
905
+ -> validate input schema
906
+ -> PreToolUse hooks
907
+ block -> append tool error message
908
+ patch -> patch input, then revalidate
909
+ warn -> continue with warning metadata
910
+ -> PermissionEngine.evaluate()
911
+ deny -> append tool error message
912
+ ask -> prompt user or apply --yes
913
+ allow -> continue
914
+ -> emit tool.started
915
+ -> dry-run check for write_file, edit_file, bash
916
+ -> tool.execute()
917
+ -> output cap
918
+ -> PostToolUse hooks
919
+ -> emit tool.completed or tool.failed
920
+ -> append role="tool" message
921
+ ```
922
+
923
+ Tool calls are sequential in v0. This keeps permission prompts, transcript order, and model continuation deterministic.
924
+
925
+ ### Delegation Lifecycle
926
+
927
+ Synchronous v1 delegation flow:
928
+
929
+ ```text
930
+ main model requests delegate_agent
931
+ -> emit agent.invocation.requested
932
+ -> validate multi-agent mode
933
+ -> validate parent can delegate
934
+ -> resolve target agent
935
+ -> reject hidden, non-callable, self, cycle, or max-depth violations
936
+ -> evaluate agent invocation permission through root PermissionCoordinator
937
+ -> resolve child provider profile
938
+ -> load or create child AgentSession
939
+ -> build structured handoff packet
940
+ -> assemble bounded child context from memory + recent child turns + handoff
941
+ -> run child AgentSessionRunner with root-owned services
942
+ -> update compressed child memory
943
+ -> emit agent.invocation.completed | failed | cancelled
944
+ -> return compact child result as the delegate_agent tool message
945
+ ```
946
+
947
+ Delegation is intentionally synchronous in v1. The gain is deterministic permission ordering, simple cancellation, and a main model that can reason over the child result immediately. The cost is that delegated work cannot run in the background yet; that is deferred until the transcript and permission UX prove stable.
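
The rejection step in the flow above (hidden, non-callable, self, cycle, max-depth) can be sketched as a pure check. This is a minimal illustration, not demian's actual implementation: `AgentInfo`, `checkDelegation`, and the convention that `agentPath` lists the active invocation chain with the main agent first are all assumptions made for the example.

```ts
// Hypothetical registry entry; demian's real agent registry type may differ.
interface AgentInfo {
  name: string
  hidden: boolean
  callable: boolean
}

type RejectReason = "unknown" | "hidden" | "non-callable" | "self" | "cycle" | "max-depth"

type DelegationCheck = { ok: true } | { ok: false; reason: RejectReason }

// agentPath lists the active invocation chain, main agent first; its last
// entry is the parent that is asking to delegate.
function checkDelegation(
  registry: Map<string, AgentInfo>,
  agentPath: string[],
  target: string,
  maxDepth: number,
): DelegationCheck {
  const agent = registry.get(target)
  if (!agent) return { ok: false, reason: "unknown" }
  if (agent.hidden) return { ok: false, reason: "hidden" }
  if (!agent.callable) return { ok: false, reason: "non-callable" }
  if (agentPath[agentPath.length - 1] === target) return { ok: false, reason: "self" }
  if (agentPath.includes(target)) return { ok: false, reason: "cycle" }
  // The child would sit at depth agentPath.length + 1.
  if (agentPath.length + 1 > maxDepth) return { ok: false, reason: "max-depth" }
  return { ok: true }
}
```

Running these checks before the permission prompt means the user is never asked to approve a delegation that the runtime would refuse anyway.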
948
+
949
+ ### Handoff Context
950
+
951
+ The child does not receive the full main conversation by default, because that would be costly and could send more data than the user expected to a different provider. It also does not start from a blank prompt, because that would make specialist agents brittle. The compromise is a structured handoff packet.
952
+
953
+ ```ts
954
+ export interface HandoffContext {
955
+ runtime: {
956
+ rootSessionId: string
957
+ agentSessionId: string
958
+ parentInvocationId: string
959
+ parentAgent: string
960
+ childAgent: string
961
+ cwd: string
962
+ currentUserRequest: string
963
+ effectiveTools: string[]
964
+ }
965
+ delegation: AgentInvocationInput
966
+ summary?: {
967
+ rootConversation?: string
968
+ childMemory?: string
969
+ parentFindings?: string[]
970
+ decisions?: string[]
971
+ }
972
+ }
973
+ ```
974
+
975
+ Only human-useful fields become model-visible: current user request, parent/child agent names, cwd, task, context, relevant files, constraints, expected output, and compact summaries. IDs remain transcript/debug metadata unless useful to the model.
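
As a rough illustration of that filtering, the sketch below renders a handoff packet into model-visible text while leaving the IDs out. `HandoffSketch` is a simplified local stand-in for `HandoffContext` and `AgentInvocationInput`, and `renderHandoff` and the label wording are hypothetical; the real prompt layout may differ.

```ts
// Simplified local stand-in for HandoffContext; the delegation fields mirror
// the delegate_agent input but are assumptions for this example.
interface HandoffSketch {
  runtime: {
    rootSessionId: string
    agentSessionId: string
    parentInvocationId: string
    parentAgent: string
    childAgent: string
    cwd: string
    currentUserRequest: string
  }
  delegation: {
    task: string
    context?: string
    relevantFiles?: string[]
    constraints?: string[]
    expectedOutput?: string
  }
  summary?: { rootConversation?: string; childMemory?: string }
}

function renderHandoff(h: HandoffSketch): string {
  const d = h.delegation
  const lines: string[] = [
    `User request: ${h.runtime.currentUserRequest}`,
    `Delegation: ${h.runtime.parentAgent} -> ${h.runtime.childAgent}`,
    `Working directory: ${h.runtime.cwd}`,
    `Task: ${d.task}`,
  ]
  if (d.context) lines.push(`Context: ${d.context}`)
  if (d.relevantFiles?.length) lines.push(`Relevant files: ${d.relevantFiles.join(", ")}`)
  if (d.constraints?.length) lines.push(`Constraints: ${d.constraints.join("; ")}`)
  if (d.expectedOutput) lines.push(`Expected output: ${d.expectedOutput}`)
  if (h.summary?.rootConversation) lines.push(`Conversation summary: ${h.summary.rootConversation}`)
  if (h.summary?.childMemory) lines.push(`Earlier notes: ${h.summary.childMemory}`)
  // rootSessionId, agentSessionId, and parentInvocationId are deliberately not rendered.
  return lines.join("\n")
}
```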
976
+
977
+ ### Context Budgets And Request-Time Compaction
978
+
979
+ Main and sub agent context budgets are runtime variables, not hard-coded constants.
980
+
981
+ ```ts
982
+ export interface ContextBudgetConfig {
983
+ main: {
984
+ maxContextTokens?: number
985
+ maxInputTokens?: number // legacy alias
986
+ compactAtRatio?: number
987
+ compressAtRatio?: number // legacy alias
988
+ summaryTargetTokens: number
989
+ recentTurns: number
990
+ minRecentTurns: number
991
+ maxRawMessages?: number
992
+ keepRawDelegateResults: number
993
+ maxDelegateResultTokens: number
994
+ ledgerTargetTokens: number
995
+ }
996
+ subAgent: {
997
+ maxContextTokens: number
998
+ maxInputTokens: number
999
+ compactAtRatio?: number
1000
+ compressAtRatio?: number // legacy alias
1001
+ summaryTargetTokens: number
1002
+ recentTurns: number
1003
+ minRecentTurns?: number
1004
+ maxHandoffTokens: number
1005
+ maxDelegateContextTokens: number
1006
+ maxResultTokensToMain: number
1007
+ compression: "compact-summary-and-recent" | "summary-and-recent" | "none"
1008
+ }
1009
+ }
1010
+ ```
1011
+
1012
+ Recommended defaults:
1013
+
1014
+ ```text
1015
+ main.maxContextTokens = 48000
1016
+ subAgent.maxContextTokens = 12000
1017
+ main.compactAtRatio = 0.7
1018
+ subAgent.compactAtRatio = 0.7
1019
+ ```
1020
+
1021
+ `maxContextTokens` is the maximum model-visible input context that demian should send on one provider request. `compactAtRatio` controls when history compaction starts: `compactAtTokens = floor(maxContextTokens * compactAtRatio)`, which is `floor(maxContextTokens * 0.7)` with the recommended default ratio. When the estimated request context reaches that threshold, `SessionRunner` compacts older history before calling the provider.
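
The threshold arithmetic can be written out as a small helper. This is a sketch under the stated defaults; `BudgetSketch`, `compactAtTokens`, and `shouldCompact` are illustrative names, and the token estimate is assumed to come from elsewhere in the runtime.

```ts
// Only the two fields needed for the threshold; the full budget config has more.
interface BudgetSketch {
  maxContextTokens: number
  compactAtRatio?: number
}

const DEFAULT_COMPACT_AT_RATIO = 0.7

// Token count at which compaction should start.
function compactAtTokens(budget: BudgetSketch): number {
  const ratio = budget.compactAtRatio ?? DEFAULT_COMPACT_AT_RATIO
  return Math.floor(budget.maxContextTokens * ratio)
}

// "Reaches the threshold" is read as greater than or equal to it.
function shouldCompact(estimatedTokens: number, budget: BudgetSketch): boolean {
  return estimatedTokens >= compactAtTokens(budget)
}
```

With the recommended defaults, the main agent starts compacting at 33600 estimated tokens and a sub agent at 8400.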
1022
+
1023
+ The first implementation performs deterministic in-memory request-time compaction. It preserves the current system prompt, a compact system summary of older turns, recent raw turns, and tool-call adjacency for the retained tail. It emits `context.compiled` for each model request and `context.compacted` when older messages are folded into a summary. Full JSONL-backed `ContextStore` persistence remains the next step; transcripts still preserve the audit chronology.
1024
+
1025
+ Interactive UIs also expose manual history compaction with `/compact`. Unlike automatic request-time compaction, `/compact` rewrites the interactive in-memory `history` immediately to the same compact-summary-plus-recent shape that would be used for a model request. The command is handled by the UI controller, is not sent to the model as a user message, and is not stored in prompt history. If there is not enough conversation history to compact, the UI reports that compaction was skipped. In non-interactive one-shot mode there is no retained history, so `/compact` is a no-op with a user-facing notice.
1026
+
1027
+ ---
1028
+
1029
+ ## 8. Provider Strategy
1030
+
1031
+ ### OpenAIProvider
1032
+
1033
+ `OpenAIProvider` is the default provider implementation. It uses the common Chat Completions subset and a vendor quirk map.
1034
+
1035
+ Authentication is configurable per provider:
1036
+
1037
+ - default: `Authorization: Bearer <apiKey>` for OpenAI, Gemini, OpenRouter, local gateways, and most OpenAI-compatible services
1038
+ - Azure OpenAI API-key auth: `api-key: <apiKey>`, inferred automatically when the `baseURL` host ends in `.openai.azure.com` or `.services.ai.azure.com`
1039
+ - custom compatible gateways may set a custom auth header while keeping the same provider path
1040
+
1041
+ This split is required because Azure OpenAI's v1 endpoint is OpenAI-compatible at the API shape level, but its REST API-key authentication uses the `api-key` header. Users should not need to set `auth.type` for normal Azure OpenAI endpoints; explicit `auth` config is only an override for custom gateways or future Microsoft Entra ID bearer-token flows.
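
One way to picture the inference is the sketch below. It assumes the suffix match is done on the URL's hostname rather than the full string, and `resolveAuthHeader` and `AuthOverride` are hypothetical names that only stand in for the explicit `auth` override mentioned above.

```ts
interface AuthHeader {
  name: string
  value: string
}

// Hypothetical shape for an explicit auth override on a custom gateway.
interface AuthOverride {
  headerName: string
  valuePrefix?: string
}

const AZURE_HOST_SUFFIXES = [".openai.azure.com", ".services.ai.azure.com"]

function resolveAuthHeader(baseURL: string, apiKey: string, override?: AuthOverride): AuthHeader {
  // An explicit override always wins over inference.
  if (override) {
    return { name: override.headerName, value: `${override.valuePrefix ?? ""}${apiKey}` }
  }
  const host = new URL(baseURL).hostname.toLowerCase()
  const isAzure = AZURE_HOST_SUFFIXES.some((suffix) => host.endsWith(suffix))
  return isAzure
    ? { name: "api-key", value: apiKey }
    : { name: "Authorization", value: `Bearer ${apiKey}` }
}
```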
1042
+
1043
+ Payload subset:
1044
+
1045
+ - `model`
1046
+ - `messages`
1047
+ - `tools`
1048
+ - `tool_choice: "auto"` unless omitted by quirk
1049
+ - `temperature` unless omitted by quirk
1050
+ - `max_tokens` unless omitted by quirk
1051
+ - `stream` and `stream_options` only for streaming
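
A minimal sketch of how a quirk map could shape that payload follows. The quirk flag names (`omitToolChoice`, `omitTemperature`, `omitMaxTokens`), the `buildChatPayload` helper, and the `stream_options.include_usage` value are assumptions for illustration; the document only states which fields may be omitted.

```ts
// Hypothetical quirk flags; the real quirk map may use different names.
interface VendorQuirks {
  omitToolChoice?: boolean
  omitTemperature?: boolean
  omitMaxTokens?: boolean
}

interface ChatRequestSketch {
  model: string
  messages: unknown[]
  tools: unknown[]
  temperature?: number
  maxTokens?: number
  stream?: boolean
}

function buildChatPayload(req: ChatRequestSketch, quirks: VendorQuirks = {}): Record<string, unknown> {
  return {
    model: req.model,
    messages: req.messages,
    tools: req.tools,
    ...(quirks.omitToolChoice ? {} : { tool_choice: "auto" }),
    ...(quirks.omitTemperature || req.temperature === undefined ? {} : { temperature: req.temperature }),
    ...(quirks.omitMaxTokens || req.maxTokens === undefined ? {} : { max_tokens: req.maxTokens }),
    // Streaming fields appear only on streaming requests; include_usage is an assumption.
    ...(req.stream ? { stream: true, stream_options: { include_usage: true } } : {}),
  }
}
```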
1052
+
1053
+ Supported presets:
1054
+
1055
+ | Key | Type | baseURL | API key |
1056
+ |-----|------|---------|---------|
1057
+ | `openai` | openai-compatible | `https://api.openai.com/v1` | `OPENAI_API_KEY` |
1058
+ | `gemini` | openai-compatible | `https://generativelanguage.googleapis.com/v1beta/openai/` | `GOOGLE_API_KEY` |
1059
+ | `ollama` | openai-compatible | `http://localhost:11434/v1` | placeholder |
1060
+ | `lmstudio` | openai-compatible | `http://localhost:1234/v1` | placeholder |
1061
+ | `vllm` | openai-compatible | `http://localhost:8000/v1` | placeholder |
1062
+ | `llamacpp` | openai-compatible | `http://localhost:8080/v1` | placeholder |
1063
+ | `openrouter` | openai-compatible | `https://openrouter.ai/api/v1` | `OPENROUTER_API_KEY` |
1064
+ | `together` | openai-compatible | `https://api.together.xyz/v1` | `TOGETHER_API_KEY` |
1065
+ | `groq` | openai-compatible | `https://api.groq.com/openai/v1` | `GROQ_API_KEY` |
1066
+ | `azure` | openai-compatible | configured Azure endpoint | `AZURE_OPENAI_API_KEY` |
1067
+ | `anthropic` | anthropic | Anthropic API | `ANTHROPIC_API_KEY` |
1068
+ | `codex` | codex | `https://chatgpt.com/backend-api/codex` | local Codex ChatGPT login |
1069
+ | `claudecode` | claudecode external runtime | local Claude Agent SDK / Claude Code CLI | local Claude Code login or Agent SDK plan auth |
1070
+
1071
+ Provider adapters are stateless from the upstream API's point of view. `SessionRunner` sends the current compiled message context and the agent-visible tool schemas on each model request; adapters translate that shape but do not own history truncation. Context compression and tool-output compaction belong in the session/root-session context layer so provider-specific adapters cannot silently change what the model sees.
1072
+
1073
+ ### CodexProvider
1074
+
1075
+ `CodexProvider` is a first-class provider, not an OpenAI-compatible preset. It uses the Responses API over the ChatGPT-backed Codex endpoint and reuses the user's local Codex login.
1076
+
1077
+ The runtime boundary remains unchanged:
1078
+
1079
+ - demian owns hooks, permissions, transcripts, sandboxing, cancellation, tool execution, and delegation
1080
+ - Codex is used only as the model provider
1081
+ - demian must not spawn `codex exec` as a nested agent runtime
1082
+
1083
+ Implementation split:
1084
+
1085
+ ```text
1086
+ src/providers/codex.ts
1087
+ provider orchestration and Responses request mapping
1088
+
1089
+ src/providers/codex-auth.ts
1090
+ Codex auth file/keyring loading, ChatGPT token headers, refresh
1091
+
1092
+ src/providers/codex-state.ts
1093
+ Codex-local metadata such as installation_id
1094
+
1095
+ src/providers/codex-stream.ts
1096
+ Responses SSE parsing and ChatStreamEvent accumulation
1097
+ ```
1098
+
1099
+ Default config:
1100
+
1101
+ ```json
1102
+ {
1103
+ "codex": {
1104
+ "type": "codex",
1105
+ "model": "gpt-5.1-codex",
1106
+ "baseURL": "https://chatgpt.com/backend-api/codex",
1107
+ "authStore": "auto",
1108
+ "allowApiKeyFallback": false,
1109
+ "promptCacheKey": "root-session",
1110
+ "responses": {
1111
+ "store": false,
1112
+ "include": ["reasoning.encrypted_content"],
1113
+ "reasoning": {
1114
+ "effort": "medium",
1115
+ "summary": "auto"
1116
+ }
1117
+ }
1118
+ }
1119
+ }
1120
+ ```
1121
+
1122
+ Auth rules:
1123
+
1124
+ - `authStore: "auto"` tries the Codex keyring location first, then `{codexHome}/auth.json`.
1125
+ - File auth is fully readable and refreshable; refreshed tokens are written atomically with `0600` permissions.
1126
+ - Native keyring access is adapter-backed. The built-in macOS adapter can read Codex's `Codex Auth` entry. Keyring writes require a safe adapter implementation; if refreshed keyring tokens cannot be written, the provider asks the user to refresh with `codex login`.
1127
+ - Keyring identity must match Codex CLI: `service = "Codex Auth"`, `account = "cli|" + first 16 hex chars of sha256(canonical codexHome)`.
1128
+ - API-key fallback is disabled by default because OpenAI Platform API-key usage is not ChatGPT/Codex subscription usage.
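
The keyring identity rule above can be sketched with Node's `crypto` module. The helper name `codexKeyringAccount` is illustrative, and the sketch assumes the caller has already canonicalized `codexHome` (for example by resolving symlinks), since the exact canonicalization is not spelled out here.

```ts
import { createHash } from "node:crypto"

const CODEX_KEYRING_SERVICE = "Codex Auth"

// Assumes canonicalCodexHome is already canonical; this function only hashes it.
function codexKeyringAccount(canonicalCodexHome: string): string {
  const digest = createHash("sha256").update(canonicalCodexHome, "utf8").digest("hex")
  // Account is "cli|" followed by the first 16 hex characters of the digest.
  return `cli|${digest.slice(0, 16)}`
}
```

Because the account name depends on the Codex home path, two different `codexHome` directories map to two different keyring entries under the same service.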
1129
+
1130
+ Codex request headers:
1131
+
1132
+ ```text
1133
+ Authorization: Bearer <access_token>
1134
+ ChatGPT-Account-ID: <account_id>
1135
+ X-OpenAI-Fedramp: true # only when token claims require it
1136
+ x-codex-installation-id: <installation_id>
1137
+ originator: demian
1138
+ user-agent: demian
1139
+ ```
1140
+
1141
+ Responses payload rules:
1142
+
1143
+ - input is converted from demian's OpenAI-shaped history to Responses `message`, `function_call`, and `function_call_output` items
1144
+ - tools become Responses function tools with `strict: false`
1145
+ - `parallel_tool_calls` is `false`
1146
+ - `store` is `false`
1147
+ - `previous_response_id` is omitted
1148
+ - `prompt_cache_key` defaults to the root session id
1149
+ - `client_metadata["x-codex-installation-id"]` mirrors the installation id header
1150
+ - when reasoning is configured, include `reasoning.encrypted_content`
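
The first rule, converting OpenAI-shaped history into Responses input items, might look roughly like the sketch below. `ChatMessage` is a simplified stand-in for demian's message type, and keeping the system prompt as a `message` item is a simplification; a real adapter may route it elsewhere.

```ts
// Simplified, hypothetical history shape.
type ChatMessage =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string; toolCalls?: { id: string; name: string; arguments: string }[] }
  | { role: "tool"; toolCallId: string; content: string }

type ResponsesItem =
  | { type: "message"; role: "system" | "user" | "assistant"; content: string }
  | { type: "function_call"; call_id: string; name: string; arguments: string }
  | { type: "function_call_output"; call_id: string; output: string }

function toResponsesInput(history: ChatMessage[]): ResponsesItem[] {
  const items: ResponsesItem[] = []
  for (const msg of history) {
    if (msg.role === "tool") {
      // Tool results are linked back to their call by call_id.
      items.push({ type: "function_call_output", call_id: msg.toolCallId, output: msg.content })
      continue
    }
    if (msg.content) items.push({ type: "message", role: msg.role, content: msg.content })
    if (msg.role === "assistant") {
      for (const call of msg.toolCalls ?? []) {
        items.push({ type: "function_call", call_id: call.id, name: call.name, arguments: call.arguments })
      }
    }
  }
  return items
}
```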
1151
+
1152
+ Refresh rules:
1153
+
1154
+ - decode JWT claims with base64url padding restoration
1155
+ - refresh proactively near access-token expiry and reactively once after `/responses` 401
1156
+ - share one refresh promise per `CodexAuthStore`
1157
+ - reuse `CodexAuthStore` instances across normal provider resolutions for the same `codexHome`, `authStore`, and refresh settings
1158
+ - classify refresh 401s as `refresh_token_expired`, `refresh_token_invalidated`, `refresh_token_reused`, or `other`
1159
+ - for `refresh_token_reused`, reload the selected auth store once and retry with newer tokens if another process already rotated them
1160
+
1161
+ Streaming rules:
1162
+
1163
+ - `response.output_text.delta` emits `text_delta`
1164
+ - `response.function_call_arguments.delta` is accumulated as raw argument chunks
1165
+ - function-call JSON is parsed only after the matching completed item is available
1166
+ - `response.completed` emits the final `ChatResponse`
1167
+ - `response.failed` and `response.incomplete` become normalized provider errors
1168
+ - `AbortSignal` is passed through `fetch()` and SSE consumption
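
The argument-accumulation rules can be illustrated with a small buffer keyed by item id. `FunctionCallAccumulator` and its method names are hypothetical, and treating an empty buffer as `{}` is an assumption made for the example.

```ts
class FunctionCallAccumulator {
  private readonly chunks = new Map<string, string[]>()

  // Called for each argument delta; chunks are kept raw and never parsed early.
  onDelta(itemId: string, delta: string): void {
    const existing = this.chunks.get(itemId)
    if (existing) existing.push(delta)
    else this.chunks.set(itemId, [delta])
  }

  // Called once the matching completed item has arrived; parses the joined JSON.
  onItemDone(itemId: string): unknown {
    const raw = (this.chunks.get(itemId) ?? []).join("")
    this.chunks.delete(itemId)
    return JSON.parse(raw === "" ? "{}" : raw)
  }
}
```

Deferring `JSON.parse` until the item is complete avoids failing on partial JSON, since chunk boundaries can fall anywhere inside a string or number.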
1169
+
1170
+ ### Claude Code External Runtime
1171
+
1172
+ `claudecode` is selectable like a provider, but it resolves to an
1173
+ `ExternalAgentRuntime` instead of the native `Provider.chat()` path. Claude Code
1174
+ owns its own agent loop, built-in tool execution, context, and session ids, so
1175
+ demian dispatches it through `ExternalAgentSessionRunner`.
1176
+
1177
+ Runtime boundary:
1178
+
1179
+ - `anthropic` remains the API-key Claude provider and uses Anthropic API
1180
+ billing.
1181
+ - `claudecode` uses local Claude Code login or Agent SDK plan auth. demian
1182
+ removes `ANTHROPIC_API_KEY` and `ANTHROPIC_AUTH_TOKEN` from the Claude Code
1183
+ child env by default and keeps `useBareMode: false` so subscription/Agent SDK
1184
+ auth is not silently replaced by API-key billing.
1185
+ - Claude Code built-in tools execute inside Claude Code. demian keeps the
1186
+ user-facing approval boundary through SDK `canUseTool`, but demian
1187
+ `PreToolUse`/`PostToolUse` hooks apply only to demian-native tools.
1188
+ - Runtime events for Claude Code tools carry `executor: "claudecode"` and
1189
+ `qualifiedName: "claudecode.<tool>"`, and the UI surfaces a CC badge.
1190
+ - demian custom tools are not automatically exposed to Claude Code before the
1191
+ later tool-bridge phase.
1192
+
1193
+ Implementation split:
1194
+
1195
+ ```text
1196
+ src/execution.ts
1197
+ dispatches native providers to SessionRunner and claudecode to ExternalAgentSessionRunner
1198
+
1199
+ src/external-runtime/claudecode-sdk.ts
1200
+ Claude Agent SDK adapter, SDK option mapping, auth preflight, usage ledger, budget guard
1201
+
1202
+ src/external-runtime/claudecode-cli.ts
1203
+ explicit CLI fallback using spawn args/stdin, stream-json parsing, capability-gated argv
1204
+
1205
+ src/external-runtime/claudecode-permissions.ts
1206
+ SDK canUseTool bridge into demian PermissionEngine and Asker
1207
+
1208
+ src/external-runtime/session-map.ts
1209
+ Claude Code session keys, instruction hash, permission policy hash, split reasons
1210
+
1211
+ src/external-runtime/session-lock.ts
1212
+ advisory resume lock with stale timeout and PID liveness checks
1213
+
1214
+ src/external-runtime/usage-ledger.ts
1215
+ demian-attributable process/daily/monthly cost ledger
1216
+ ```
1217
+
1218
+ Default config:
1219
+
1220
+ ```json
1221
+ {
1222
+ "claudecode": {
1223
+ "type": "claudecode",
1224
+ "runtime": "agent-sdk",
1225
+ "model": "sonnet",
1226
+ "cliPath": "~/.local/bin/claude",
1227
+ "cwdMode": "session",
1228
+ "permissionMode": "default",
1229
+ "defaultDecision": "by-category",
1230
+ "historyPolicy": "passthrough-resume",
1231
+ "onInvalidResume": "fresh",
1232
+ "attachmentFallback": "block",
1233
+ "allowSubagents": false,
1234
+ "sanitizeApiKeyEnv": true,
1235
+ "authPreflight": true,
1236
+ "useBareMode": false,
1237
+ "usageLedgerScope": "process",
1238
+ "sessionLock": true,
1239
+ "abortPolicy": "record-only"
1240
+ }
1241
+ }
1242
+ ```
1243
+
1244
+ Session and history rules:
1245
+
1246
+ - demian stores returned Claude Code session ids under a key made from root
1247
+ session, agent, provider profile, cwd, model, instruction hash, and Claude
1248
+ Code-relevant permission policy hash.
1249
+ - Matching turns resume with the stored Claude Code session id.
1250
+ - Provider, cwd, model, instruction, agent, profile, or relevant permission
1251
+ changes start a fresh Claude Code session and emit `session.context.split`.
1252
+ - Invalid resume errors follow `onInvalidResume`: `fresh`, `auto-recover`, or
1253
+ interactive `prompt` with a 30 second timeout.
1254
+ - `historyPolicy: "stateless"` suppresses resume and starts a fresh Claude Code
1255
+ session for every turn.
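
The key and split rules above can be sketched as follows. `SessionKeyParts`, `sessionKey`, and `splitReasons` are illustrative names, and the JSON array encoding of the key is an assumption chosen so that field values containing separators cannot collide.

```ts
interface SessionKeyParts {
  rootSessionId: string
  agent: string
  providerProfile: string
  cwd: string
  model: string
  instructionHash: string
  permissionPolicyHash: string
}

const KEY_FIELDS: (keyof SessionKeyParts)[] = [
  "rootSessionId",
  "agent",
  "providerProfile",
  "cwd",
  "model",
  "instructionHash",
  "permissionPolicyHash",
]

// Stable key under which a returned Claude Code session id could be stored.
function sessionKey(parts: SessionKeyParts): string {
  return JSON.stringify(KEY_FIELDS.map((field) => parts[field]))
}

// Fields that changed between two turns; an empty list means the turn can resume.
function splitReasons(prev: SessionKeyParts, next: SessionKeyParts): (keyof SessionKeyParts)[] {
  return KEY_FIELDS.filter((field) => prev[field] !== next[field])
}
```

A non-empty result is the kind of payload a `session.context.split` event could carry to explain why a fresh session was started.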
1256
+
1257
+ Permission and tool rules:
1258
+
1259
+ - Permission lookup is executor-aware: `claudecode.Tool`, `*.Tool`,
1260
+ `claudecode.*`, `*`, then `defaultDecision`.
1261
+ - `defaultDecision: "by-category"` allows read-only `Read`/`Glob`/`Grep` and
1262
+ asks for mutating or unknown tools.
1263
+ - Prefix-less rules remain backward compatible for demian-native tools. Use
1264
+ `demian doctor policies --upgrade-namespaces` to migrate policy files to
1265
+ explicit `demian.<tool>` names.
1266
+ - `claudecode-plan` is advisory. It displays Claude Code's plan as the answer
1267
+ and exposes an opt-in use-plan action that prefixes the next user turn with
1268
+ the plan text instead of executing automatically.
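
The executor-aware lookup order from the first rule can be sketched like this. `lookupDecision` and `byCategory` are hypothetical helpers, and the rule table is modeled as a plain `Map` for the example; demian's policy format may be richer.

```ts
type Decision = "allow" | "ask" | "deny"

const READ_ONLY_TOOLS = new Set(["Read", "Glob", "Grep"])

// Sketch of defaultDecision: "by-category".
function byCategory(tool: string): Decision {
  return READ_ONLY_TOOLS.has(tool) ? "allow" : "ask"
}

function lookupDecision(
  rules: Map<string, Decision>,
  executor: string,
  tool: string,
  defaultDecision: (tool: string) => Decision = byCategory,
): Decision {
  // Most specific first: executor.Tool, *.Tool, executor.*, then *.
  const candidates = [`${executor}.${tool}`, `*.${tool}`, `${executor}.*`, "*"]
  for (const key of candidates) {
    const hit = rules.get(key)
    if (hit !== undefined) return hit
  }
  return defaultDecision(tool)
}
```

Note that a tool-specific wildcard such as `*.Bash` is consulted before an executor-wide rule such as `claudecode.*`.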
1269
+
1270
+ ### Legacy ClaudeCodeProvider
1271
+
1272
+ `ClaudeCodeProvider` is the old direct `/v1/messages` implementation. This
1273
+ path is unsupported because it reconstructs Claude Code OAuth behavior outside
1274
+ the public Claude Code Agent SDK/CLI surface. It is retained only behind
1275
+ `type: "claudecode-api-legacy"` for staged migration, and startup is blocked
1276
+ unless `DEMIAN_ENABLE_UNSUPPORTED_CLAUDECODE_API=1` is set.
1277
+
1278
+ Legacy runtime boundary:
1279
+
1280
+ - demian owns hooks, permissions, transcripts, sandboxing, cancellation, tool execution, and delegation
1281
+ - Claude Code is used only as the model provider
1282
+ - demian does not spawn `claude` in this legacy path
1283
+ - Anthropic API-key fallback is explicit because it uses normal Anthropic API billing, not Claude Code subscription auth
1284
+
1285
+ Implementation split:
1286
+
1287
+ ```text
1288
+ src/providers/claudecode.ts
1289
+ provider orchestration, Messages request mapping, OAuth/API-key header selection, 401 retry
1290
+
1291
+ src/providers/claudecode-auth.ts
1292
+ Claude Code credential discovery, keyring/file/env loading, OAuth refresh, API-key fallback guard
1293
+
1294
+ src/providers/claudecode-stream.ts
1295
+ Anthropic Messages SSE parsing and ChatStreamEvent accumulation
1296
+ ```
1297
+
1298
+ Legacy config shape:
1299
+
1300
+ ```json
1301
+ {
1302
+ "claudecode": {
1303
+ "type": "claudecode-api-legacy",
1304
+ "model": "claude-sonnet-4-6",
1305
+ "maxTokens": 8192,
1306
+ "authStore": "auto",
1307
+ "allowApiKeyFallback": false,
1308
+ "allowEnvOAuthToken": true,
1309
+ "refresh": {
1310
+ "proactiveRefreshMinutes": 30,
1311
+ "cache": "claude-store"
1312
+ }
1313
+ }
1314
+ }
1315
+ ```
1316
+
1317
+ Auth rules:
1318
+
1319
+ - `CLAUDE_CODE_OAUTH_TOKEN` is accepted first when `allowEnvOAuthToken` is true. This is useful for explicit token setup, but it cannot be refreshed after a 401 because it has no local refresh token.
1320
+ - `authStore: "auto"` tries the Claude Code keyring location first, then `${CLAUDE_CONFIG_DIR:-~/.claude}/.credentials.json`.
1321
+ - File auth is fully readable and refreshable; refreshed tokens are written atomically with `0600` permissions.
1322
+ - Native keyring access is adapter-backed. The built-in macOS adapter reads `service = "Claude Code-credentials"` and `account = <OS username>`.
1323
+ - Keyring writes require a safe adapter implementation; if refreshed keyring tokens cannot be written, the provider asks the user to refresh through Claude Code login.
1324
+ - OAuth credentials must contain `claudeAiOauth.accessToken`; when scopes are present, they must include `user:inference`.
1325
+ - Refresh uses Claude Code's OAuth client id `9d1c250a-e61b-44d9-88ed-5944d1962f5e` and defaults to `https://platform.claude.com/v1/oauth/token`.
1326
+ - `refresh.cache: "demian-cache"` is reserved but not implemented; the current provider supports only `refresh.cache: "claude-store"`.
1327
+ - API-key fallback is disabled by default because `ANTHROPIC_API_KEY` does not use Claude Code subscription auth.
1328
+
1329
+ Claude Code OAuth request headers:
1330
+
1331
+ ```text
1332
+ Authorization: Bearer <access_token>
1333
+ Accept: application/json
1334
+ Content-Type: application/json
1335
+ anthropic-version: 2023-06-01
1336
+ anthropic-beta: claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05
1337
+ User-Agent: claude-cli/2.1.133 (external, claude-vscode)
1338
+ x-app: cli
1339
+ X-Claude-Code-Session-Id: <uuid>
1340
+ ```
1341
+
1342
+ API-key fallback request headers:
1343
+
1344
+ ```text
1345
+ x-api-key: <api_key>
1346
+ Accept: application/json
1347
+ Content-Type: application/json
1348
+ anthropic-version: 2023-06-01
1349
+ User-Agent: claude-cli/2.1.133 (external, claude-vscode)
1350
+ x-app: cli
1351
+ X-Claude-Code-Session-Id: <uuid>
1352
+ ```
1353
+
1354
+ The default Claude Code user-agent and beta headers are aligned with the locally inspected Claude Code VS Code extension `anthropic.claude-code` version `2.1.133`. That extension launches Claude Code with `CLAUDE_CODE_ENTRYPOINT=claude-vscode`, which yields API user-agent `claude-cli/2.1.133 (external, claude-vscode)`. Its first-party OAuth Messages requests include the Claude Code, OAuth, interleaved-thinking, context-management, and prompt-caching-scope beta headers.
1355
+
1356
+ Messages payload rules:
1357
+
1358
+ - input is converted with the Anthropic adapter mapping: top-level `system`, `messages`, `tool_use`, `tool_result`, image blocks, and normalized stop reasons
1359
+ - tools become Anthropic tool schemas
1360
+ - `max_tokens` defaults to `8192` unless `maxTokens` is configured
1361
+ - `stream` is set per request
1362
+ - OAuth requests prepend the Claude Code system prefix: `You are Claude Code, Anthropic's official CLI for Claude.`
1363
+ - API-key fallback requests do not inject the Claude Code system prefix or OAuth beta header
1364
+
1365
+ Refresh and error rules:
1366
+
1367
+ - refresh proactively near access-token expiry and reactively once after `/messages` 401 when a refresh token exists
1368
+ - share one refresh promise per `ClaudeCodeAuthStore`
1369
+ - reuse `ClaudeCodeAuthStore` instances across normal provider resolutions for the same Claude config directory, auth store, and refresh settings
1370
+ - classify refresh failures as expired, invalidated, reused, invalid grant, or other
1371
+ - initial streaming 429/5xx responses are retried through the provider-independent retry path before SSE consumption starts
1372
+ - 429 responses are surfaced with Claude Code-specific guidance, request id when present, and `Retry-After` when available
1373
+
1374
+ Streaming rules:
1375
+
1376
+ - `content_block_delta` text emits `text_delta`
1377
+ - tool input JSON deltas are accumulated as raw argument chunks
1378
+ - `message_delta` accumulates usage and stop metadata
1379
+ - `message_stop` emits the final normalized `ChatResponse`
1380
+ - Anthropic stream `error` events become normalized provider errors, with `rate_limit_error` mapped to HTTP 429
1381
+ - `AbortSignal` is passed through `fetch()` and SSE consumption
1382
+
1383
+ ### Agent Provider Resolution
1384
+
1385
+ In multi-agent mode, an invocation resolves provider identity from profile references only:
1386
+
1387
+ ```text
1388
+ if agent.provider.profile exists:
1389
+ use config.providers[profile]
1390
+ else if agent.provider.inheritRoot !== false:
1391
+ use root provider profile
1392
+ else:
1393
+ use config.defaultProvider
1394
+ ```
1395
+
1396
+ Rules:
1397
+
1398
+ - profile key must exist in `config.providers`
1399
+ - agent files cannot define provider endpoint, credential, auth header, quirks, or model override
1400
+ - provider/model is fixed at invocation start
1401
+ - root CLI/TUI `--provider` and `--model` override the main invocation only
1402
+ - sub agent model specialization is represented as another configured provider profile
1403
+ - resolved profile and model are recorded in events and transcript
1404
+
1405
+ This is a deliberate trust boundary. It gives agents model specialization without letting an agent definition route data to a new endpoint or introduce a secret dependency.
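
The resolution order and the first rule (the profile key must exist) translate fairly directly into TypeScript. `AgentProviderRef`, `ConfigSketch`, and `resolveProfileKey` are illustrative names; the sketch only returns the profile key and leaves provider construction out.

```ts
interface AgentProviderRef {
  profile?: string
  inheritRoot?: boolean
}

interface ConfigSketch {
  defaultProvider: string
  providers: Record<string, unknown>
}

function resolveProfileKey(
  agentProvider: AgentProviderRef | undefined,
  rootProfile: string,
  config: ConfigSketch,
): string {
  let key: string
  if (agentProvider?.profile) key = agentProvider.profile
  else if (agentProvider?.inheritRoot !== false) key = rootProfile
  else key = config.defaultProvider
  // An agent can only reference a profile that the user's config already defines.
  if (!Object.prototype.hasOwnProperty.call(config.providers, key)) {
    throw new Error(`Unknown provider profile: ${key}`)
  }
  return key
}
```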
1406
+
1407
+ ### Gemini
1408
+
1409
+ Gemini is first-class but not a separate adapter.
1410
+
1411
+ ```json
1412
+ {
1413
+ "gemini": {
1414
+ "type": "openai-compatible",
1415
+ "model": "gemini-model-name",
1416
+ "baseURL": "https://generativelanguage.googleapis.com/v1beta/openai/",
1417
+ "apiKeyEnv": "GOOGLE_API_KEY"
1418
+ }
1419
+ }
1420
+ ```
1421
+
1422
+ Decision:
1423
+
1424
+ - Use Google's OpenAI-compatible endpoint.
1425
+ - Do not add `@google/generative-ai`.
1426
+ - Do not add `GeminiProvider` in v0.
1427
+ - Do not auto-fallback on provider safety blocks.
1428
+
1429
+ Gemini quirks:
1430
+
1431
+ | Quirk | Policy |
1432
+ |-------|--------|
1433
+ | Some parameters may be ignored or unsupported | Omit unsupported parameters through quirks |
1434
+ | Safety filters may block output | Map to `content_filter` when detectable |
1435
+ | Tool calling quality varies by model | Surface clear model/provider errors |
1436
+ | Streaming chunk boundaries can differ | Normalize in provider stream parser |
1437
+
1438
+ Safety block policy:
1439
+
1440
+ ```text
1441
+ Gemini content filter / safety block
1442
+ -> stopReason: "content_filter" or "error"
1443
+ -> emit model.content_filter
1444
+ -> explain provider safety filter to user
1445
+ -> do not auto-fallback
1446
+ ```
1447
+
1448
+ ### AnthropicProvider
1449
+
1450
+ Anthropic is a separate adapter because its native message model differs.
1451
+
1452
+ `AnthropicProvider` should use the official `@anthropic-ai/sdk`, scoped to this adapter only.
1453
+
1454
+ Reasoning:
1455
+
1456
+ - Anthropic is not on the OpenAI-compatible path, so using its SDK does not weaken the single-path strategy for OpenAI-compatible vendors.
1457
+ - Anthropic's native protocol has distinct concepts: top-level `system`, `tool_use`, `tool_result`, content blocks, and vendor-specific streaming events.
1458
+ - The SDK gives stronger type coverage for these native shapes and absorbs provider API changes better than a hand-written `fetch` client.
1459
+ - Keeping SDK usage inside `src/providers/anthropic.ts` prevents dependency concerns from leaking into Session Runner, tools, hooks, permissions, or config.
1460
+ - Implementation should keep SDK construction adapter-local and test-injectable: `AnthropicProvider` accepts a client for fixtures and loads/constructs the SDK client only when the Anthropic provider is resolved or used. This preserves network-free tests and keeps the OpenAI-compatible path independent from Anthropic initialization.
1461
+
1462
+ The adapter converts:
1463
+
1464
+ - OpenAI-shaped `system` messages to Anthropic top-level `system`
1465
+ - assistant `toolCalls` to `tool_use` blocks
1466
+ - `role: "tool"` messages to `tool_result` blocks
1467
+ - OpenAI-style image content to Anthropic image blocks when supported
1468
+
1469
+ Fixture coverage is required before treating Anthropic as production-ready. Tests should cover plain text, tool use, tool result continuation, image content conversion, stop reason mapping, and streaming event normalization.
1470
+
1471
+ ### Retry
1472
+
1473
+ Retry is provider-independent and lives in `providers/retry.ts`.
1474
+
1475
+ ```text
1476
+ retry: 429, 5xx, ETIMEDOUT, ECONNRESET, ENOTFOUND
1477
+ do not retry: 4xx except 429
1478
+ backoff: 1s, 2s, 4s, 8s, 16s + jitter
1479
+ Retry-After: respected, capped at 30s
1480
+ AbortSignal: respected immediately
1481
+ ```
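
A minimal sketch of that policy as two pure helpers follows. The names are illustrative, the jitter range (up to 250 ms) is an assumption because the schedule above does not specify it, and reading the five listed delays as a cap of five retries is also an assumption.

```ts
const MAX_RETRIES = 5 // 1s, 2s, 4s, 8s, 16s
const RETRY_AFTER_CAP_MS = 30_000
const MAX_JITTER_MS = 250 // assumed; the jitter range is not specified
const RETRYABLE_CODES = new Set(["ETIMEDOUT", "ECONNRESET", "ENOTFOUND"])

function isRetryable(status?: number, code?: string): boolean {
  if (status !== undefined) return status === 429 || (status >= 500 && status <= 599)
  return code !== undefined && RETRYABLE_CODES.has(code)
}

// attempt is zero-based; returns undefined once the retry budget is spent.
function nextDelayMs(
  attempt: number,
  retryAfterSeconds?: number,
  random: () => number = Math.random,
): number | undefined {
  if (attempt < 0 || attempt >= MAX_RETRIES) return undefined
  if (retryAfterSeconds !== undefined) {
    return Math.min(retryAfterSeconds * 1000, RETRY_AFTER_CAP_MS)
  }
  return 1000 * 2 ** attempt + Math.floor(random() * MAX_JITTER_MS)
}
```

Passing `random` as a parameter keeps the schedule testable without network access or real timers.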
1482
+
1483
+ ---
1484
+
1485
+ ## 9. Tool Catalog
1486
+
1487
+ | Tool | Input | Output/limit | `build` | `plan` |
1488
+ |------|-------|--------------|---------|--------|
1489
+ | `read_file` | `path`, `offset?`, `limit?` | 1MB / 2000 lines | allow | allow |
1490
+ | `grep` | `pattern`, `path?`, `glob?` | 200 matches | allow | allow |
1491
+ | `glob` | `pattern`, `path?` | 1000 paths | allow | allow |
1492
+ | `write_file` | `path`, `content` | summary | ask | deny |
1493
+ | `edit_file` | `path`, `oldString`, `newString`, `replaceAll?` | summary + diff | ask | deny |
1494
+ | `bash` | `command`, `workdir?`, `timeoutMs?` | 32KB output | ask | deny |
1495
+ | `delegate_agent` | `agent`, `task`, context fields | compact child result + transcript refs | multi only | deny |
1496
+
1497
+ Output cap:
1498
+
1499
+ ```text
1500
+ if output.size <= 32KB:
1501
+ return output
1502
+ else:
1503
+ write full output to .demian/tmp/output-<callId>.txt
1504
+ return head 16KB + marker + tail 16KB
1505
+ ```
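
The head-and-tail truncation can be sketched as below. Two simplifications are assumed: sizes are measured in string length (UTF-16 code units) rather than UTF-8 bytes, and writing the spill file is left to the caller. `capOutput`, `CappedOutput`, and the marker wording are hypothetical.

```ts
const CAP = 32 * 1024
const HALF = 16 * 1024

interface CappedOutput {
  text: string
  truncated: boolean
  spillPath?: string // where the caller should persist the full output
}

function capOutput(output: string, callId: string): CappedOutput {
  if (output.length <= CAP) return { text: output, truncated: false }
  const spillPath = `.demian/tmp/output-${callId}.txt`
  const omitted = output.length - 2 * HALF
  const marker = `\n... [${omitted} characters omitted; full output at ${spillPath}] ...\n`
  return {
    text: output.slice(0, HALF) + marker + output.slice(-HALF),
    truncated: true,
    spillPath,
  }
}
```

Keeping both ends matters for command output, where the first lines usually show what ran and the last lines usually show how it ended.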
1506
+
1507
+ Tool details:
1508
+
1509
+ - `read_file`: rejects cwd escape, binary files, and files over 1MB; returns line-numbered output; supports paging.
1510
+ - `write_file`: rejects cwd escape; creates parent directory; replaces existing file; `.env` edits are blocked by hook.
1511
+ - `edit_file`: exact string replace; rejects no-op edits; errors on 0 matches; requires `replaceAll` for multiple matches; emits diff metadata.
1512
+ - `bash`: rejects workdir outside cwd; default timeout 30s; max timeout 600s; captures stdout/stderr; uses sandbox launch policy.
1513
+ - `grep`: uses `rg` first; falls back to a walker; excludes `.git`, `node_modules`, and build output.
1514
+ - `glob`: searches inside cwd; excludes `.git` and `node_modules`; caps at 1000 paths; prefers recent files first.
1515
+ - `delegate_agent`: appears only in multi-agent mode, validates callable agents through registry and delegation policy, gates invocation permission, runs the child AgentSessionRunner, and returns a compact result.
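
The `edit_file` matching rules above can be expressed as one pure function over file content. This is a sketch with hypothetical names and error messages; rejecting an empty `oldString` is an extra assumption, and path checks and diff metadata are omitted.

```ts
function applyEdit(content: string, oldString: string, newString: string, replaceAll = false): string {
  if (oldString === "") throw new Error("oldString must not be empty")
  if (oldString === newString) throw new Error("no-op edit: oldString equals newString")
  // split/join treats oldString literally, so no regex or "$" patterns are interpreted.
  const parts = content.split(oldString)
  const matches = parts.length - 1
  if (matches === 0) throw new Error("oldString not found")
  if (matches > 1 && !replaceAll) {
    throw new Error(`oldString matches ${matches} times; set replaceAll to replace every match`)
  }
  return parts.join(newString)
}
```

Requiring an exact, unambiguous match means a stale or hallucinated `oldString` produces an error message for the model instead of a silent wrong edit.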
1516
+
1517
+ `delegate_agent` is exposed to the provider as one stable tool rather than one generated tool per sub agent:
1518
+
1519
+ ```ts
1520
+ const delegateAgentTool: Tool = {
1521
+ name: "delegate_agent",
1522
+ description: "Delegate a bounded task to a configured demian agent and return its result.",
1523
+ inputSchema: {
1524
+ type: "object",
1525
+ properties: {
1526
+ agent: { type: "string", enum: ["<callable-agent-name>"] },
1527
+ task: { type: "string" },
1528
+ context: { type: "string" },
1529
+ contextRefs: { type: "array", items: { type: "string" } },
1530
+ relevantFiles: { type: "array", items: { type: "string" } },
1531
+ constraints: { type: "array", items: { type: "string" } },
1532
+ expectedOutput: { type: "string" },
1533
+ maxTurns: { type: "integer", minimum: 1 },
1534
+ returnMode: { type: "string", enum: ["brief", "normal"] }
1535
+ },
1536
+ required: ["agent", "task"]
1537
+ },
1538
+ execute: ...
1539
+ }
1540
+ ```
1541
+
1542
+ The alternative was generated virtual tools such as `reviewer(task)` and `builder(task)`. That might give some models more obvious routing hints, but it increases provider payload size and scatters permission targets across many dynamic tool names. The stable tool is chosen for a smaller core, a central permission target, and schema-enum validation of callable agent names. Generated virtual tools can be added later if model routing quality proves weak.
1543
+
1544
+ Dangerous bash patterns blocked by the built-in `block-dangerous-bash` hook include:
1545
+
1546
+ - `rm -rf /`
1547
+ - `rm -rf ~`
1548
+ - `rm -rf $HOME`
1549
+ - fork-bomb patterns
1550
+ - `dd` writes to device paths
1551
+ - destructive `curl` or `wget` pipe patterns
1552
+
1553
+ This list is not a complete security boundary; it is a hardening layer before permission prompts.
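As an illustration of the hardening layer described above, the following is a minimal sketch of a pattern check. It is not demian's actual hook: the regular expressions, the labels, and the `findDangerousPattern` name are assumptions chosen for the example, and a real pattern table would need to be broader.

```typescript
// Hypothetical pattern table; the categories mirror the list above, the regexes are illustrative.
const DANGEROUS_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "rm -rf on root or home", pattern: /\brm\s+-[a-zA-Z]*r[a-zA-Z]*f[a-zA-Z]*\s+(\/|~|\$HOME)(\s|$)/ },
  { label: "fork bomb", pattern: /:\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:/ },
  { label: "dd to device", pattern: /\bdd\b[^|;]*\bof=\/dev\// },
  { label: "pipe download to shell", pattern: /\b(curl|wget)\b[^|;]*\|\s*(sudo\s+)?(ba|z)?sh\b/ },
]

// Returns the label of the first matching pattern, or undefined when nothing matches.
function findDangerousPattern(command: string): string | undefined {
  for (const { label, pattern } of DANGEROUS_PATTERNS) {
    if (pattern.test(command)) return label
  }
  return undefined
}
```

A command such as `rm -rf ./build` is not matched by this sketch, which is the intended behavior: the check targets a few catastrophic forms and leaves everything else to the permission prompt.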
1554
+
1555
+ ---
1556
+
1557
+ ## 10. Hooks
1558
+
1559
+ Built-in hooks:
1560
+
1561
+ | Hook | Event | Purpose | Failure |
1562
+ |------|-------|---------|---------|
1563
+ | `block-dangerous-bash` | `PreToolUse(bash)` | Block destructive command patterns | fail-closed |
1564
+ | `protect-env-files` | `PreToolUse(write_file/edit_file)` | Block secret file modification | fail-closed |
1565
+ | `mask-secrets` | `AfterModelResponse` | Mask obvious secret-looking model text | fail-closed |
1566
+ | `inject-env-info` | `SessionStart` | Inject cwd, OS, provider, agent, tools, sandbox context | fail-closed |
1567
+
1568
+ Command hook stdin:
1569
+
1570
+ ```json
1571
+ {
1572
+ "event": "PreToolUse",
1573
+ "rootSessionId": "root_123",
1574
+ "sessionId": "ses_123",
1575
+ "invocationId": "inv_456",
1576
+ "agentPath": ["orchestrator", "builder"],
1577
+ "callId": "call_456",
1578
+ "agent": "builder",
1579
+ "cwd": "/repo",
1580
+ "toolName": "bash",
1581
+ "toolInput": {
1582
+ "command": "npm test"
1583
+ }
1584
+ }
1585
+ ```
1586
+
1587
+ Command hook stdout:
1588
+
1589
+ ```json
1590
+ { "decision": "warn", "message": "Tests may take a while." }
1591
+ ```
1592
+
1593
+ Empty stdout means pass-through. A non-zero exit from a command hook fails open: the tool call is not blocked, and a warning is emitted.
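The pass-through and fail-open rules can be sketched as a small interpreter for a finished command hook. This is a hedged illustration: the `HookOutcome` shape and the `interpretCommandHook` name are hypothetical, and treating malformed JSON as a warning is an assumption that the protocol above does not state.

```typescript
// Outcome shapes are hypothetical; only "decision" and "message" come from the protocol above.
type HookOutcome =
  | { kind: "pass" }
  | { kind: "decision"; decision: string; message?: string }
  | { kind: "warning"; message: string }

function interpretCommandHook(exitCode: number, stdout: string): HookOutcome {
  // Non-zero exit fails open: report a warning instead of blocking the tool call.
  if (exitCode !== 0) {
    return { kind: "warning", message: `command hook exited with code ${exitCode}` }
  }
  const text = stdout.trim()
  // Empty stdout means pass-through.
  if (text === "") return { kind: "pass" }

  let parsed: unknown
  try {
    parsed = JSON.parse(text)
  } catch {
    // Assumption: malformed output is also treated as fail-open with a warning.
    return { kind: "warning", message: "command hook printed invalid JSON" }
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { kind: "warning", message: "command hook output was not a JSON object" }
  }
  const { decision, message } = parsed as { decision?: unknown; message?: unknown }
  if (typeof decision !== "string") {
    return { kind: "warning", message: "command hook output had no decision" }
  }
  return typeof message === "string"
    ? { kind: "decision", decision, message }
    : { kind: "decision", decision }
}
```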
1594
+
1595
+ ---
1596
+
1597
+ ## 11. Permissions, Agents, And Delegation
1598
+
1599
+ Ask UI:
1600
+
1601
+ ```text
1602
+ Allow bash: npm test?
1603
+ [y]es once / [N]o / [a]lways globally using configured grant scope:
1604
+ ```
1605
+
1606
+ Multi-agent prompt examples:
1607
+
1608
+ ```text
1609
+ Allow agent reviewer using provider gemini-fast?
1610
+ It will receive: current user request, compact handoff context, context refs, relevant file paths.
1611
+ [y]es once / [N]o / [a]lways globally using configured grant scope:
1612
+ ```
1613
+
1614
+ ```text
1615
+ Allow builder -> edit_file: src/parser.ts?
1616
+ [y]es once / [N]o / [a]lways globally using configured grant scope:
1617
+ ```
1618
+
1619
+ `--yes` auto-allows `ask`, but built-in hard deny, global explicit deny, and hook blocks still apply. `--yes` does not create persistent grants unless the user explicitly chose an `always` path.
1620
+
1621
+ ### Visibility vs Authority
1622
+
1623
+ `tools.visible` controls what the provider sees in the tool list. Permission rules and grants control whether a requested tool call can execute.
1624
+
1625
+ ```text
1626
+ single-agent:
1627
+ visible tools == authority tools
1628
+
1629
+ multi-agent orchestrator:
1630
+ visible tools: [read_file, grep, glob, delegate_agent]
1631
+ authority tools: [read_file, grep, glob, delegate_agent]
1632
+
1633
+ builder subagent:
1634
+ visible tools: [read_file, write_file, edit_file, bash, grep, glob]
1635
+ effective tools:
1636
+ builder.visible ∩ registered tools ∩ global safety policy
1637
+ ```
1638
+
1639
+ The gain is that a main orchestrator can stay narrow while a builder sub agent can still perform implementation work after root-user approval. The cost is conceptual complexity: a grant may exist but still be unusable by an agent that cannot see the tool. The design accepts that complexity because it keeps model capability exposure separate from user authority.
1640
+
1641
+ ### Effective Authority
1642
+
1643
+ Sub agent effective authority is:
1644
+
1645
+ ```text
1646
+ root global safety policy
1647
+ ∩ child visible tool list
1648
+ ∩ child agent-local policy defaults
1649
+ + user-approved global grants
1650
+ ```
1651
+
1652
+ No child invocation can bypass cwd boundaries, sandbox policy, built-in hard deny, global explicit deny, provider profile validation, or hook blocks. If the child requests a tool that is visible but not yet allowed, the root UI asks the user. If the user chooses `always`, the grant becomes a root/global grant reusable by main and sibling agents when the tool is visible to them.
1653
+
1654
+ This deliberately gives up per-agent grant isolation. The advantage is a simpler user model: "I approved this operation scope for this root conversation/project." The safety invariant remains that hard/global deny cannot be expanded.
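The ordering in this section can be sketched as a single decision function. The field names and the `resolvePermission` function are hypothetical, and two points are assumptions rather than documented behavior: that an agent-local `deny` cannot be lifted by a global grant, and that hook blocks are folded into the same hard-deny input.

```typescript
type Decision = "allow" | "ask" | "deny"

// Hypothetical, pre-computed inputs for one requested tool call.
interface PermissionCheck {
  hardDenied: boolean      // built-in hard deny, global explicit deny, or hook block
  visibleToAgent: boolean  // tool appears in the requesting agent's tools.visible
  agentDefault: Decision   // agent-local policy default for this tool
  hasGlobalGrant: boolean  // the user previously chose "always" at root scope
  yesFlag: boolean         // --yes was passed
}

function resolvePermission(check: PermissionCheck): Decision {
  // Hard and global denies can never be expanded by grants or --yes.
  if (check.hardDenied) return "deny"
  // A grant is unusable by an agent that cannot see the tool.
  if (!check.visibleToAgent) return "deny"
  // Assumption: an agent-local deny is final.
  if (check.agentDefault === "deny") return "deny"
  if (check.agentDefault === "allow" || check.hasGlobalGrant) return "allow"
  // Remaining case is "ask": --yes auto-allows, otherwise the root UI prompts.
  return check.yesFlag ? "allow" : "ask"
}
```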
1655
+
1656
+ ### Agent Invocation Permission
1657
+
1658
+ Starting a child agent is itself a permission target:
1659
+
1660
+ ```text
1661
+ targetType: "agent"
1662
+ targetName: "reviewer"
1663
+ operation: "invoke"
1664
+ grant key: "agent:reviewer:gemini-fast"
1665
+ ```
1666
+
1667
+ Allowing an agent invocation does not pre-allow child tool calls. It only allows sending the bounded handoff context to the child provider and starting the child model loop.
1668
+
1669
+ Default policy:
1670
+
1671
+ - built-in cheap read-only subagents may default to allow
1672
+ - external project subagents default to ask
1673
+ - expensive provider profiles default to ask unless configured otherwise
1674
+ - hidden or non-callable agents are denied
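A minimal sketch of how the default invocation policy and the grant key could be derived from agent metadata. The `AgentSummary` fields, the `"expensive"` cost value, and both function names are assumptions for illustration; only the `agent:<name>:<profile>` key shape comes from the example above.

```typescript
type InvocationDecision = "allow" | "ask" | "deny"

// Hypothetical summary of the registry metadata needed for the decision.
interface AgentSummary {
  name: string
  providerProfile: string
  source: "built-in" | "external"
  cost: "cheap" | "standard" | "expensive"
  readOnly: boolean
  callable: boolean
}

// Builds the grant key in the "agent:<name>:<profile>" form shown above.
function agentGrantKey(agent: AgentSummary): string {
  return `agent:${agent.name}:${agent.providerProfile}`
}

function defaultInvocationDecision(agent: AgentSummary): InvocationDecision {
  // Hidden or non-callable agents are always denied.
  if (!agent.callable) return "deny"
  // Built-in, cheap, read-only sub agents may default to allow.
  if (agent.source === "built-in" && agent.cost === "cheap" && agent.readOnly) return "allow"
  // External agents and expensive profiles fall through to ask.
  return "ask"
}
```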
1675
+
1676
+ ### Delegation Validation
1677
+
1678
+ ```ts
1679
+ function validateDelegateInvocation(input: {
1680
+ targetAgentName: string
1681
+ callerChain: string[]
1682
+ maxDepth: number
1683
+ registry: AgentRegistry
1684
+ }): void {
1685
+ const target = input.registry.get(input.targetAgentName)
1686
+ if (!isCallable(target)) throw new Error(`Agent ${input.targetAgentName} is not callable`)
1687
+ if (input.callerChain.includes(input.targetAgentName)) {
1688
+ throw new Error(`Cycle detected: ${[...input.callerChain, input.targetAgentName].join(" -> ")}`)
1689
+ }
1690
+ if (input.callerChain.length >= input.maxDepth) {
1691
+ throw new Error(`Max delegation depth (${input.maxDepth}) exceeded`)
1692
+ }
1693
+ }
1694
+ ```
1695
+
1696
+ Default max depth is `1`. Child agents default to `delegation.canDelegate: false`. This gives demian specialist help without opening recursive orchestration as the default behavior.
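To make the depth-1 default concrete, here is a self-contained sketch of the same three checks with a stub registry. It assumes that `callerChain` lists only agents that were themselves delegated to, so the root main agent starts with an empty chain; if the chain included the main agent, the depth comparison would need adjusting. `checkDelegation` and `RegisteredAgent` are hypothetical names.

```typescript
interface RegisteredAgent {
  name: string
  callable: boolean
}

// Returns an error message, or undefined when the delegation is allowed.
function checkDelegation(
  registry: Map<string, RegisteredAgent>,
  targetAgentName: string,
  callerChain: string[],
  maxDepth: number,
): string | undefined {
  const target = registry.get(targetAgentName)
  // Unknown agents are treated the same as non-callable ones.
  if (!target || !target.callable) return `Agent ${targetAgentName} is not callable`
  if (callerChain.includes(targetAgentName)) {
    return `Cycle detected: ${[...callerChain, targetAgentName].join(" -> ")}`
  }
  if (callerChain.length >= maxDepth) return `Max delegation depth (${maxDepth}) exceeded`
  return undefined
}

const registry = new Map<string, RegisteredAgent>([
  ["reviewer", { name: "reviewer", callable: true }],
  ["builder", { name: "builder", callable: true }],
])

// Main agent delegating to reviewer: allowed at depth 1.
const fromMain = checkDelegation(registry, "reviewer", [], 1)
// Reviewer trying to delegate onward to builder: rejected at depth 1.
const fromChild = checkDelegation(registry, "builder", ["reviewer"], 1)
```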
1697
+
1698
+ ### Callable Catalog
1699
+
1700
+ In multi-agent mode, the main system prompt receives a compact catalog:
1701
+
1702
+ ```text
1703
+ Available sub agents (use delegate_agent to invoke):
1704
+
1705
+ - reviewer (reviewer, cheap): Read-only code reviewer.
1706
+ Use when: find bugs and missing tests after implementation.
1707
+ Avoid when: editing files directly.
1708
+
1709
+ - builder (builder, standard): Implement scoped code changes.
1710
+ Use when: a focused implementation task is ready.
1711
+ ```
1712
+
1713
+ Catalog budget defaults to about `2000` tokens. If over budget, demian drops `useWhen` and `avoidWhen` first, keeping agent names and descriptions. `triggers` are registry metadata for future routing and are not included in v1 prompt text.
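The budget rule can be sketched as a two-pass render: try the full catalog, then fall back to names and descriptions only. The token estimate (roughly four characters per token) and the `CatalogEntry`, `renderCatalog`, and `buildCatalog` names are assumptions; demian's real estimator may differ.

```typescript
interface CatalogEntry {
  name: string
  category: string
  cost: string
  description: string
  useWhen?: string[]
  avoidWhen?: string[]
}

// Rough estimate (assumption): about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

function renderCatalog(entries: CatalogEntry[], withHints: boolean): string {
  const blocks = entries.map((entry) => {
    const lines = [`- ${entry.name} (${entry.category}, ${entry.cost}): ${entry.description}`]
    if (withHints) {
      if (entry.useWhen?.length) lines.push(`  Use when: ${entry.useWhen.join("; ")}`)
      if (entry.avoidWhen?.length) lines.push(`  Avoid when: ${entry.avoidWhen.join("; ")}`)
    }
    return lines.join("\n")
  })
  return ["Available sub agents (use delegate_agent to invoke):", "", ...blocks].join("\n")
}

// Drops useWhen/avoidWhen hints first when the full catalog exceeds the budget.
function buildCatalog(entries: CatalogEntry[], budgetTokens = 2000): string {
  const full = renderCatalog(entries, true)
  return estimateTokens(full) <= budgetTokens ? full : renderCatalog(entries, false)
}
```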
1714
+
1715
+ ### `build`
1716
+
1717
+ Normalized v1 shape:
1718
+
1719
+ ```ts
1720
+ export const build: AgentDefinition = {
1721
+ name: "build",
1722
+ description: "General coding agent.",
1723
+ mode: "primary",
1724
+ prompt: { system: buildPrompt },
1725
+ tools: { visible: ["read_file", "write_file", "edit_file", "bash", "grep", "glob"] },
1726
+ permissions: [
1727
+ { tool: "read_file", decision: "allow" },
1728
+ { tool: "grep", decision: "allow" },
1729
+ { tool: "glob", decision: "allow" },
1730
+ { tool: "write_file", decision: "ask" },
1731
+ { tool: "edit_file", decision: "ask" },
1732
+ { tool: "bash", decision: "ask" },
1733
+ { tool: "*", match: { pathGlob: "**/.env" }, decision: "deny", reason: "secrets" },
1734
+ { tool: "*", match: { pathGlob: "**/.env.*" }, decision: "deny", reason: "secrets" },
1735
+ { tool: "*", match: { pathGlob: "node_modules/**" }, decision: "deny", reason: "vendored" }
1736
+ ]
1737
+ }
1738
+ ```
1739
+
1740
+ ### `plan`
1741
+
1742
+ ```ts
1743
+ export const plan: AgentDefinition = {
1744
+ name: "plan",
1745
+ description: "Read-only planning agent.",
1746
+ mode: "primary",
1747
+ prompt: { system: planPrompt },
1748
+ tools: { visible: ["read_file", "grep", "glob"] },
1749
+ permissions: [
1750
+ { tool: "read_file", decision: "allow" },
1751
+ { tool: "grep", decision: "allow" },
1752
+ { tool: "glob", decision: "allow" },
1753
+ { tool: "*", decision: "deny", reason: "plan agent is read-only" }
1754
+ ]
1755
+ }
1756
+ ```
1757
+
1758
+ Existing `build` and `plan` behavior must remain unchanged in single-agent mode. The new shape is a normalization layer, not a behavioral migration.
1759
+
1760
+ ---
1761
+
1762
+ ## 12. Config
1763
+
1764
+ Config precedence:
1765
+
1766
+ ```text
1767
+ built-in defaults
1768
+ -> ~/.demian/config.json
1769
+ -> <cwd>/.demian/config.json
1770
+ -> --config <path>
1771
+ -> CLI flags
1772
+ ```
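One way to read the precedence chain is as a left-to-right merge in which later layers override earlier ones. The sketch below assumes a deep merge for nested objects and whole-value replacement for arrays; the document does not specify either, and `mergeConfig` and `resolveConfig` are hypothetical names.

```typescript
type ConfigObject = { [key: string]: unknown }

function isPlainObject(value: unknown): value is ConfigObject {
  return typeof value === "object" && value !== null && !Array.isArray(value)
}

// Later layers win; nested objects are merged key by key (assumption).
function mergeConfig(base: ConfigObject, override: ConfigObject): ConfigObject {
  const result: ConfigObject = { ...base }
  for (const [key, value] of Object.entries(override)) {
    const existing = result[key]
    result[key] = isPlainObject(existing) && isPlainObject(value) ? mergeConfig(existing, value) : value
  }
  return result
}

// Layers are passed in precedence order: defaults, user, workspace, --config, CLI flags.
function resolveConfig(layers: ConfigObject[]): ConfigObject {
  return layers.reduce((acc, layer) => mergeConfig(acc, layer), {} as ConfigObject)
}
```

For example, a user config that sets only `sandbox.network` would keep the built-in `sandbox.mode` under this merge strategy, while a `--max-turns` flag would override `maxTurns` from every file.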
1773
+
1774
+ Default shape:
1775
+
1776
+ ```json
1777
+ {
1778
+ "agentMode": "single-agent",
1779
+ "defaultAgent": "build",
1780
+ "defaultProvider": "openai",
1781
+ "maxTurns": 25,
1782
+ "streaming": {
1783
+ "enabled": true
1784
+ },
1785
+ "sandbox": {
1786
+ "mode": "workspace-write",
1787
+ "network": "inherit"
1788
+ },
1789
+ "persistentGrants": {
1790
+ "enabled": true,
1791
+ "scope": "project",
1792
+ "ttlMs": 604800000
1793
+ },
1794
+ "multimodal": {
1795
+ "maxImageBytes": 8388608,
1796
+ "detail": "auto"
1797
+ },
1798
+ "delegation": {
1799
+ "maxDepth": 1,
1800
+ "defaultInvocationDecision": "ask",
1801
+ "subInvocationMaxTurns": 10
1802
+ },
1803
+ "context": {
1804
+ "main": {
1805
+ "maxContextTokens": 48000,
1806
+ "compactAtRatio": 0.7,
1807
+ "summaryTargetTokens": 3000,
1808
+ "recentTurns": 6,
1809
+ "minRecentTurns": 2,
1810
+ "keepRawDelegateResults": 2,
1811
+ "maxDelegateResultTokens": 1000,
1812
+ "ledgerTargetTokens": 1200,
1813
+ "maxInlineToolResultTokens": 2000,
1814
+ "retrievalMaxTokens": 3000,
1815
+ "reserveForImagesTokens": 4000,
1816
+ "compression": "deterministic"
1817
+ },
1818
+ "subAgent": {
1819
+ "memoryScope": "root-session",
1820
+ "maxContextTokens": 12000,
1821
+ "maxInputTokens": 12000,
1822
+ "summaryTargetTokens": 1200,
1823
+ "recentTurns": 3,
1824
+ "minRecentTurns": 1,
1825
+ "compactAtRatio": 0.7,
1826
+ "maxHandoffTokens": 2000,
1827
+ "maxDelegateContextTokens": 1000,
1828
+ "maxResultTokensToMain": 1000,
1829
+ "compression": "compact-summary-and-recent"
1830
+ }
1831
+ },
1832
+ "trust": {
1833
+ "userTools": false,
1834
+ "project": false
1835
+ },
1836
+ "providers": {
1837
+ "openai": {
1838
+ "type": "openai-compatible",
1839
+ "model": "openai-model-name",
1840
+ "baseURL": "https://api.openai.com/v1",
1841
+ "apiKeyEnv": "OPENAI_API_KEY"
1842
+ },
1843
+ "gemini": {
1844
+ "type": "openai-compatible",
1845
+ "model": "gemini-model-name",
1846
+ "baseURL": "https://generativelanguage.googleapis.com/v1beta/openai/",
1847
+ "apiKeyEnv": "GOOGLE_API_KEY"
1848
+ },
1849
+ "ollama": {
1850
+ "type": "openai-compatible",
1851
+ "model": "local-coder-model",
1852
+ "baseURL": "http://localhost:11434/v1",
1853
+ "apiKey": "ollama",
1854
+ "quirks": {
1855
+ "omitTemperature": true
1856
+ }
1857
+ },
1858
+ "lmstudio": {
1859
+ "type": "openai-compatible",
1860
+ "model": "local-coder-model",
1861
+ "baseURL": "http://localhost:1234/v1",
1862
+ "apiKey": "lm-studio"
1863
+ },
1864
+ "openrouter": {
1865
+ "type": "openai-compatible",
1866
+ "model": "provider/model-name",
1867
+ "baseURL": "https://openrouter.ai/api/v1",
1868
+ "apiKeyEnv": "OPENROUTER_API_KEY"
1869
+ },
1870
+ "azure": {
1871
+ "type": "openai-compatible",
1872
+ "model": "azure-deployment-name",
1873
+ "baseURL": "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1",
1874
+ "apiKeyEnv": "AZURE_OPENAI_API_KEY"
1875
+ },
1876
+ "anthropic": {
1877
+ "type": "anthropic",
1878
+ "model": "anthropic-model-name",
1879
+ "apiKeyEnv": "ANTHROPIC_API_KEY"
1880
+ },
1881
+ "codex": {
1882
+ "type": "codex",
1883
+ "model": "gpt-5.1-codex",
1884
+ "baseURL": "https://chatgpt.com/backend-api/codex",
1885
+ "authStore": "auto",
1886
+ "allowApiKeyFallback": false,
1887
+ "promptCacheKey": "root-session",
1888
+ "responses": {
1889
+ "store": false,
1890
+ "include": ["reasoning.encrypted_content"],
1891
+ "reasoning": {
1892
+ "effort": "medium",
1893
+ "summary": "auto"
1894
+ }
1895
+ }
1896
+ },
1897
+ "claudecode": {
1898
+ "type": "claudecode",
1899
+ "runtime": "agent-sdk",
1900
+ "model": "sonnet",
1901
+ "cliPath": "~/.local/bin/claude",
1902
+ "cwdMode": "session",
1903
+ "permissionMode": "default",
1904
+ "defaultDecision": "by-category",
1905
+ "historyPolicy": "passthrough-resume",
1906
+ "onInvalidResume": "fresh",
1907
+ "attachmentFallback": "block",
1908
+ "allowSubagents": false,
1909
+ "sanitizeApiKeyEnv": true,
1910
+ "authPreflight": true,
1911
+ "useBareMode": false,
1912
+ "usageLedgerScope": "process",
1913
+ "sessionLock": true,
1914
+ "abortPolicy": "record-only"
1915
+ }
1916
+ }
1917
+ }
1918
+ ```
1919
+
1920
+ Most model names are placeholders in this architecture. Config owns fast-changing model selection; the Codex default should track a concrete Codex model slug present in local fixtures or provider metadata, and the Claude Code default should track a Claude Code-compatible Agent SDK model alias.
1921
+
1922
+ Mode resolution precedence:
1923
+
1924
+ ```text
1925
+ 1. CLI flag: --mode / --single-agent / --multi-agent
1926
+ 2. config.agentMode
1927
+ 3. config.mode, accepted as a compatibility alias
1928
+ 4. config.multiAgent.enabled === true -> multi-agent
1929
+ 5. built-in default -> single-agent
1930
+ ```
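The five-step resolution order maps directly onto a chain of early returns. This is a sketch under the assumption that CLI flags have already been normalized to an `AgentMode` value; `ModeInputs` and `resolveAgentMode` are hypothetical names.

```typescript
type AgentMode = "single-agent" | "multi-agent"

interface ModeInputs {
  cliMode?: AgentMode // from --mode, --single-agent, or --multi-agent
  config: {
    agentMode?: AgentMode
    mode?: AgentMode // compatibility alias
    multiAgent?: { enabled?: boolean }
  }
}

function resolveAgentMode({ cliMode, config }: ModeInputs): AgentMode {
  if (cliMode) return cliMode
  if (config.agentMode) return config.agentMode
  if (config.mode) return config.mode
  if (config.multiAgent?.enabled === true) return "multi-agent"
  return "single-agent" // built-in default
}
```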
1931
+
1932
+ Context budget precedence:
1933
+
1934
+ ```text
1935
+ built-in defaults
1936
+ -> config file
1937
+ -> environment/application variables
1938
+ -> CLI flags
1939
+ -> programmatic embedding options
1940
+ ```
1941
+
1942
+ Recommended environment variable names:
1943
+
1944
+ | Variable | Meaning |
1945
+ |----------|---------|
1946
+ | `DEMIAN_MAIN_MAX_CONTEXT_TOKENS` | main agent model-visible context budget |
1947
+ | `DEMIAN_MAIN_COMPACT_AT_RATIO` | main context compaction trigger; default `0.7` |
1948
+ | `DEMIAN_MAIN_SUMMARY_TARGET_TOKENS` | target size for main rolling summary |
1949
+ | `DEMIAN_MAIN_KEEP_RAW_DELEGATE_RESULTS` | number of raw delegate results kept in main history |
1950
+ | `DEMIAN_MAIN_MAX_DELEGATE_RESULT_TOKENS` | max child result tokens returned to main |
1951
+ | `DEMIAN_SUB_MAX_CONTEXT_TOKENS` | sub agent model-visible context budget |
1952
+ | `DEMIAN_SUB_COMPACT_AT_RATIO` | sub agent compaction trigger; default `0.7` |
1953
+ | `DEMIAN_SUB_SUMMARY_TARGET_TOKENS` | target size for sub memory summary |
1954
+ | `DEMIAN_SUB_RECENT_TURNS` | raw recent child turns kept |
1955
+ | `DEMIAN_SUB_MAX_HANDOFF_TOKENS` | max child-visible handoff packet size |
1956
+ | `DEMIAN_SUB_MAX_DELEGATE_CONTEXT_TOKENS` | max main-provided delegate context size |
1957
+ | `DEMIAN_SUB_MAX_RESULT_TOKENS_TO_MAIN` | max compact result returned to main |
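As a hedged illustration of how a ratio variable from this table might be read and applied, the sketch below parses an override with a fallback and derives the token count at which compaction would start. The validation range, the rounding, and the function names are assumptions; the defaults are taken from the config shape above.

```typescript
// Reads a ratio in (0, 1] from an environment-like map, falling back when unset or invalid.
function readRatio(env: Record<string, string | undefined>, name: string, fallback: number): number {
  const raw = env[name]
  if (raw === undefined || raw.trim() === "") return fallback
  const value = Number(raw)
  return Number.isFinite(value) && value > 0 && value <= 1 ? value : fallback
}

// Token count at which compaction is triggered (assumption: rounded to the nearest token).
function compactionThreshold(maxContextTokens: number, compactAtRatio: number): number {
  return Math.round(maxContextTokens * compactAtRatio)
}

// With the documented defaults, the main context compacts at about 70% of 48000 tokens.
const mainRatio = readRatio({}, "DEMIAN_MAIN_COMPACT_AT_RATIO", 0.7)
const mainThreshold = compactionThreshold(48000, mainRatio)
```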
1958
+
1959
+ ### Agent Registration, Discovery, And Trust
1960
+
1961
+ Current implementation supports external agents through `config.agents` and programmatic `AgentRegistry.register()`. This gives users and tests an extension point without loading arbitrary project code.
1962
+
1963
+ Config agent example:
1964
+
1965
+ ```json
1966
+ {
1967
+ "agents": {
1968
+ "reviewer": {
1969
+ "name": "reviewer",
1970
+ "description": "Read-only code reviewer.",
1971
+ "mode": "subagent",
1972
+ "provider": {
1973
+ "profile": "gemini-fast"
1974
+ },
1975
+ "prompt": {
1976
+ "system": "You are a read-only reviewer. Report concrete issues with file references."
1977
+ },
1978
+ "tools": {
1979
+ "visible": ["read_file", "grep", "glob"]
1980
+ },
1981
+ "permissions": [
1982
+ { "tool": "read_file", "decision": "allow" },
1983
+ { "tool": "grep", "decision": "allow" },
1984
+ { "tool": "glob", "decision": "allow" },
1985
+ { "tool": "*", "decision": "deny", "reason": "reviewer is read-only" }
1986
+ ],
1987
+ "delegation": {
1988
+ "callable": true,
1989
+ "canDelegate": false
1990
+ },
1991
+ "catalog": {
1992
+ "category": "reviewer",
1993
+ "cost": "cheap",
1994
+ "useWhen": ["Find bugs and risks without editing files."]
1995
+ }
1996
+ }
1997
+ }
1998
+ }
1999
+ ```
2000
+
2001
+ Filesystem discovery remains a target design rather than current code.
2002
+
2003
+ Recommended filesystem layout:
2004
+
2005
+ ```text
2006
+ ~/.demian/
2007
+ agents/<name>.md
2008
+ tools/<name>.mjs
2009
+
2010
+ <cwd>/.demian/
2011
+ agents/<name>.md
2012
+ tools/<name>.mjs
2013
+ ```
2014
+
2015
+ Agent markdown is data. Tool modules are code. They therefore use different trust gates:
2016
+
2017
+ | Source | Agent markdown | Tool module |
2018
+ |--------|----------------|-------------|
2019
+ | built-in | trusted | trusted |
2020
+ | user scope | load by default or warn once | prompt or config opt-in |
2021
+ | project scope | load data with warning | require `--trust-project` or prompt |
2022
+ | programmatic | caller responsibility | caller responsibility |
2023
+
2024
+ Trust decisions are separate from permission grants:
2025
+
2026
+ ```text
2027
+ ~/.demian/trust.json
2028
+ <cwd>/.demian/trust.json
2029
+ ```
2030
+
2031
+ Trust records should include absolute path, trusted decision, decision time, and `mtimeMs`. If the file changes, prompt again. mtime validation is chosen for the first filesystem loader because it is simple; hash validation is a possible later hardening step.
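The mtime rule can be sketched as a pure lookup that decides whether a stored decision still applies. The `TrustRecord` field names other than `mtimeMs` are assumptions, as is the `checkTrust` name; the caller is assumed to supply the file's current modification time.

```typescript
// Hypothetical record shape; only mtimeMs is named in the text above.
interface TrustRecord {
  path: string      // absolute path of the agent or tool file
  trusted: boolean  // the user's decision
  decidedAt: string // ISO timestamp of the decision
  mtimeMs: number   // file modification time when the decision was made
}

type TrustCheck = "trusted" | "denied" | "prompt"

function checkTrust(records: TrustRecord[], absolutePath: string, currentMtimeMs: number): TrustCheck {
  const record = records.find((candidate) => candidate.path === absolutePath)
  // No record yet: the user has to decide.
  if (!record) return "prompt"
  // The file changed since the decision: prompt again.
  if (record.mtimeMs !== currentMtimeMs) return "prompt"
  return record.trusted ? "trusted" : "denied"
}
```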
2032
+
2033
+ Future agent markdown example:
2034
+
2035
+ ```markdown
2036
+ ---
2037
+ name: reviewer
2038
+ description: Read-only code reviewer.
2039
+ mode: all
2040
+ provider:
2041
+ profile: gemini-fast
2042
+ tools:
2043
+ visible: [read_file, grep, glob]
2044
+ permissions:
2045
+ - { tool: read_file, decision: allow }
2046
+ - { tool: grep, decision: allow }
2047
+ - { tool: glob, decision: allow }
2048
+ - { tool: "*", decision: deny, reason: "reviewer is read-only" }
2049
+ delegation:
2050
+ callable: true
2051
+ canDelegate: false
2052
+ catalog:
2053
+ category: reviewer
2054
+ cost: cheap
2055
+ useWhen:
2056
+ - Find bugs and risks without editing files.
2057
+ ---
2058
+ You are a read-only reviewer. Report concrete issues with file references.
2059
+ Do not modify files.
2060
+ ```
2061
+
2062
+ Future tool module example:
2063
+
2064
+ ```ts
2065
+ import type { Tool } from "demian-cli"
2066
+
2067
+ const tool: Tool = {
2068
+ name: "lsp_references",
2069
+ description: "Find references for a symbol through a local LSP bridge.",
2070
+ inputSchema: {
2071
+ type: "object",
2072
+ properties: {
2073
+ file: { type: "string" },
2074
+ symbol: { type: "string" }
2075
+ },
2076
+ required: ["file", "symbol"]
2077
+ },
2078
+ async execute(input, ctx) {
2079
+ return { ok: true, content: "..." }
2080
+ }
2081
+ }
2082
+
2083
+ export default tool
2084
+ ```
2085
+
2086
+ Loader validation:
2087
+
2088
+ - names must be stable tool-safe identifiers
2089
+ - built-in name collisions are rejected
2090
+ - duplicate external contribution names are resolved by precedence order; the losing duplicate is rejected with a warning
2091
+ - every `tools.visible` entry must exist in the registered tool catalog
2092
+ - provider profile references must exist in `config.providers`
2093
+ - inline provider fields are rejected
2094
+ - `mode: "subagent"` cannot be selected as the main agent
2095
+ - default tool export must expose `name`, `description`, `inputSchema`, and `execute`
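Several of these checks can be expressed as one validation pass that collects error messages. The sketch below covers the agent-side rules only; the identifier pattern, the `AgentCandidate` and `LoaderContext` shapes, and the `validateAgent` name are assumptions made for the example.

```typescript
interface AgentCandidate {
  name: string
  providerProfile?: string
  visibleTools: string[]
}

interface LoaderContext {
  builtInNames: Set<string>
  registeredTools: Set<string>
  providerProfiles: Set<string>
}

// Assumed pattern for a "stable tool-safe identifier".
const SAFE_NAME = /^[a-zA-Z][a-zA-Z0-9_-]*$/

function validateAgent(agent: AgentCandidate, ctx: LoaderContext): string[] {
  const errors: string[] = []
  if (!SAFE_NAME.test(agent.name)) errors.push(`invalid agent name: ${agent.name}`)
  if (ctx.builtInNames.has(agent.name)) errors.push(`name collides with built-in: ${agent.name}`)
  for (const tool of agent.visibleTools) {
    if (!ctx.registeredTools.has(tool)) errors.push(`unknown visible tool: ${tool}`)
  }
  if (agent.providerProfile !== undefined && !ctx.providerProfiles.has(agent.providerProfile)) {
    errors.push(`unknown provider profile: ${agent.providerProfile}`)
  }
  return errors
}
```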
2096
+
2097
+ ---
2098
+
2099
+ ## 13. CLI
2100
+
2101
+ Usage:
2102
+
2103
+ ```text
2104
+ demian [flags]
2105
+ demian-cli [flags]
2106
+ demian-plain [flags]
2107
+ ```
2108
+
2109
+ All human-facing commands are interface-first:
2110
+
2111
+ - `demian` and `demian-cli` launch the Ink terminal UI described in `architecture-tui.md`.
2112
+ - `demian-plain` launches an interactive plain terminal flow with raw Markdown output.
2113
+ - The user enters the first message after entering the CLI or TUI, not as a required positional argument.
2114
+
2115
+ Prompt arguments may remain as a compatibility shortcut, but they are not the primary UX. The canonical interactive path is command first, settings confirmation second, message input third.
2116
+
2117
+ The UI process does not exit after one assistant answer. After each `SessionRunner` run completes, the same CLI or TUI returns to standby and accepts the next message until the user exits.
2118
+
2119
+ Interactive mode keeps conversational history in memory for the life of the UI process. Each submitted message starts a new `SessionRunner` run with the previous non-system messages passed as `history`; the current system prompt is rebuilt for the run.
2120
+
2121
+ Examples:
2122
+
2123
+ ```sh
2124
+ demian
2125
+ demian-cli --agent plan
2126
+ demian-plain
2127
+ demian-plain --provider gemini --model gemini-model-name
2128
+ demian --stream --image screenshot.png --sandbox workspace-write
2129
+ ```
2130
+
2131
+ Cost optimization scenario:
2132
+
2133
+ ```sh
2134
+ demian --agent plan --provider gemini --model gemini-model-name
2135
+ demian --agent build --provider openai --model openai-model-name
2136
+ ```
2137
+
2138
+ ### Interactive Startup Flow
2139
+
2140
+ All commands share the same resolution order before the first message:
2141
+
2142
+ ```text
2143
+ built-in defaults
2144
+ -> user config
2145
+ -> workspace config
2146
+ -> --config
2147
+ -> saved UI preferences
2148
+ -> CLI flags
2149
+ -> interactive provider/model override
2150
+ -> message input
2151
+ -> SessionRunner
2152
+ ```
2153
+
2154
+ Provider and model are selected before `SessionRunner` starts. This keeps the runtime provider-neutral and avoids changing model identity in the middle of a tool loop.
2155
+
2156
+ Saved UI preferences live in project-local `.demian/preferences.json`. They store only the selected provider key and model name, never API keys or copied provider config. Preferences are written after an explicit interactive or flag-sourced provider/model selection and are reused by `demian`, `demian-cli`, and `demian-plain`.
2157
+
2158
+ ### Plain CLI Flow
2159
+
2160
+ `demian-plain` is interactive when stdin is a TTY:
2161
+
2162
+ ```text
2163
+ demian-plain
2164
+ -> load config
2165
+ -> show resolved default provider/model/agent/cwd
2166
+ -> ask whether to use defaults, select provider, or edit model
2167
+ -> collect message prompt
2168
+ -> run SessionRunner
2169
+ -> print final raw Markdown answer
2170
+ -> return to message prompt
2171
+ -> repeat until /exit or /quit
2172
+ ```
2173
+
2174
+ Plain CLI provider/model UX:
2175
+
2176
+ - Show the currently resolved provider and model before asking for the message.
2177
+ - `Enter` accepts each default.
2178
+ - `?` lists configured provider keys.
2179
+ - Typing a provider key selects that provider and resets the model to that provider's configured default.
2180
+ - Typing a model value overrides the selected provider's configured model for this run.
2181
+ - CLI flags preselect values and mark them as flag-sourced in the prompt.
2182
+
2183
+ Plain CLI message commands:
2184
+
2185
+ - `/compact`: compact the current interactive conversation history immediately. The command is handled locally, is not sent to the model, and is not stored as a recalled prompt.
2186
+ - `/stop`: stop the currently running task by aborting its active run.
2187
+ - `/exit` or `/quit`: exit the interactive plain CLI. If a task is running, stop it first and then exit.
2188
+ - Any non-command message while standby starts a new task run.
2189
+
2190
+ Plain CLI I/O contract:
2191
+
2192
+ - Settings prompts, retries, streaming deltas, tool progress, permission prompts, and hook warnings go to stderr.
2193
+ - Final assistant raw Markdown goes to stdout.
2194
+ - In non-interactive stdin/stdout contexts, `demian-plain` must not hang waiting for provider/model input. It should require an explicit automation input path such as a prompt argument, `--prompt`, or stdin message support.
2195
+
2196
+ ### TUI Flow
2197
+
2198
+ `demian` and `demian-cli` show provider/model context inside the UI before the first message:
2199
+
2200
+ ```text
2201
+ demian
2202
+ -> render TUI
2203
+ -> show default provider/model/agent/cwd in status area
2204
+ -> allow provider/model changes through shortcuts
2205
+ -> collect message in prompt composer
2206
+ -> run SessionRunner
2207
+ -> return to composer
2208
+ -> repeat until /exit or /quit
2209
+ ```
2210
+
2211
+ TUI provider/model UX is defined in `architecture-tui.md`.
2212
+
2213
+ The TUI also exposes main agent selection before a message starts. Pressing `a` in an empty prompt opens the main agent selector; `Up`/`Down` moves through primary agents, `Enter` applies the selection, and `Esc` cancels. This mirrors the provider selector instead of adding a separate startup wizard. The gain is a small, repeatable settings loop that works between tasks. The tradeoff is that prompt input beginning with bare `a`, `p`, or `m` in an empty composer is reserved for settings shortcuts; users can type any other character first or change settings before composing.
2214
+
2215
+ The TUI composer keeps an in-memory prompt history for the lifetime of the UI process. `Up` recalls older submitted prompts; `Down` moves toward newer prompts and eventually restores the current draft. Control commands such as `/compact`, `/stop`, `/exit`, and `/quit` are not stored in prompt history, and prompt recall does not change `SessionRunner.history` or transcript behavior.
2216
+
2217
+ The TUI message composer accepts `/compact` while standby. The command forces deterministic compaction of the current interactive conversation history, then returns to the prompt composer. The transcript view records a local system block summarizing the token estimate before and after compaction, the number of preserved messages, the number of dropped messages, and the summary token estimate. `/compact` is disabled while a task is running; during running state, the command bar accepts only `/stop` and `/exit` because the active task owns the current `AbortController`.
2218
+
2219
+ TUI permission prompts are modal inside the running state. If a tool requires approval, the bottom bar renders the permission prompt before the generic running command bar, because the active keybindings are `y`, `a`, `n`, and `Enter` until the permission request is resolved. Tool input is summarized in human-readable form; `bash` shows the exact command, file tools show paths, and text-heavy fields show character counts plus a short preview.
2220
+
2221
+ In multi-agent mode, the TUI should render child activity as nested progress without making it modal unless a permission request is pending:
2222
+
2223
+ ```text
2224
+ Agent reviewer started
2225
+ tool read_file completed
2226
+ tool grep completed
2227
+ Agent reviewer completed
2228
+ ```
2229
+
2230
+ The same permission bar priority applies to child requests. A pending `builder -> edit_file` prompt must override generic running controls so the active keybindings are unambiguous.
2231
+
2232
+ Flags:
2233
+
2234
+ | Flag | Meaning |
2235
+ |------|---------|
2236
+ | `--mode <single\|multi>` | explicit agent mode |
2237
+ | `--single-agent` | alias for `--mode single` |
2238
+ | `--multi-agent` | alias for `--mode multi` |
2239
+ | `--agent <name>` | selected agent in single mode; main agent in multi mode |
2240
+ | `--provider <name>` | provider key |
2241
+ | `--model <name>` | model override |
2242
+ | `--max-turns <n>` | loop limit |
2243
+ | `--main-max-context-tokens <n>` | override main model-visible context budget |
2244
+ | `--main-compact-at <ratio>` | override main compaction trigger |
2245
+ | `--sub-max-context-tokens <n>` | override sub agent model-visible context budget |
2246
+ | `--sub-compact-at <ratio>` | override sub agent compaction trigger |
2247
+ | `--cwd <path>` | workspace directory |
2248
+ | `--yes`, `-y` | auto-allow ask permissions |
2249
+ | `--dry-run` | stop before `write_file`, `edit_file`, or `bash` execution |
2250
+ | `--stream` | use provider streaming when available |
2251
+ | `--no-stream` | force non-streaming chat |
2252
+ | `--image <path-or-url>` | attach image; repeatable |
2253
+ | `--sandbox <mode>` | `off`, `read-only`, or `workspace-write` |
2254
+ | `--persistent-grants` | enable persisted `always` grants |
2255
+ | `--no-persistent-grants` | keep grants session-local |
2256
+ | `--no-transcript` | disable transcript writes |
2257
+ | `--config <path>` | load extra config file |
2258
+ | `--trust-user-tools` | allow loading user-scope external tool modules |
2259
+ | `--trust-project` | allow loading trusted project `.demian` contributions |
2260
+
2261
+ Recommended discovery commands:
2262
+
2263
+ | Command | Meaning |
2264
+ |---------|---------|
2265
+ | `demian list-agents` | show registered agents, source, mode, provider profile, cost/category |
2266
+ | `demian list-tools` | show registered tools, source, side-effect metadata |
2267
+
2268
+ Multi-agent mode is opt-in. Users should not encounter extra model calls, handoff data sharing, or child permission prompts unless they explicitly selected multi-agent mode through config or flags.
2269
+
2270
+ Plain CLI stdout:
2271
+
2272
+ - final assistant answer only
2273
+
2274
+ Plain CLI stderr:
2275
+
2276
+ - retry status
2277
+ - streaming text deltas
2278
+ - tool progress
2279
+ - permission prompts
2280
+ - hook warnings
2281
+
2282
+ ---
2283
+
2284
+ ## 14. Streaming
2285
+
2286
+ Streaming is a provider capability, not a separate runtime mode.
2287
+
2288
+ Rules:
2289
+
2290
+ - If `streaming.enabled` is true and `provider.stream` exists, use streaming.
2291
+ - Emit `model.text.delta` events for text deltas.
2292
+ - Accumulate deltas into the final `AssistantMessage`.
2293
+ - Accumulate tool-call deltas until a complete tool call is available.
2294
+ - Emit a final `done` event carrying a normalized `ChatResponse`.
2295
+ - `--no-stream` forces `chat()`.
2296
+
2297
+ The transcript records normalized events, not raw SSE lines.
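The accumulation rules can be sketched as a fold over normalized stream events. Only `model.text.delta` is named in this document; the `tool_call.delta` event, its fields, and the `accumulate` function are hypothetical stand-ins for whatever the provider adapter emits.

```typescript
type StreamEvent =
  | { type: "model.text.delta"; text: string }
  // Hypothetical event: a fragment of one tool call, keyed by its position in the response.
  | { type: "tool_call.delta"; index: number; id?: string; name?: string; argumentsDelta: string }

interface AccumulatedToolCall {
  id: string
  name: string
  arguments: string // JSON text, complete only once the stream has ended
}

function accumulate(events: StreamEvent[]): { text: string; toolCalls: AccumulatedToolCall[] } {
  let text = ""
  const calls = new Map<number, AccumulatedToolCall>()
  for (const event of events) {
    if (event.type === "model.text.delta") {
      text += event.text
      continue
    }
    const call = calls.get(event.index) ?? { id: "", name: "", arguments: "" }
    if (event.id) call.id = event.id
    if (event.name) call.name = event.name
    call.arguments += event.argumentsDelta
    calls.set(event.index, call)
  }
  // Preserve the provider's tool-call order by index.
  const toolCalls = Array.from(calls.entries())
    .sort((a, b) => a[0] - b[0])
    .map((entry) => entry[1])
  return { text, toolCalls }
}
```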
2298
+
2299
+ ---
2300
+
2301
+ ## 15. Multimodal Input
2302
+
2303
+ Multimodal support is user-message input only in v0.
2304
+
2305
+ Rules:
2306
+
2307
+ - `--image <url>` passes through as OpenAI-compatible `image_url`.
2308
+ - `--image <path>` reads a local file, validates size, and converts to a data URL.
2309
+ - Repeated image flags preserve order after the text prompt.
2310
+ - `maxImageBytes` defaults to 8 MiB (`8388608` bytes).
2311
+ - Unsupported providers should fail clearly or degrade only when explicitly configured.
2312
+
2313
+ Native Gemini multimodal APIs remain deferred. OpenAI-compatible image input is enough for the integrated v0.
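A minimal sketch of the two `--image` paths, written as a pure function so the file read stays with the caller. The `ImageSource` union, the `imageContentPart` name, and the hand-rolled base64 encoder are assumptions for illustration; the output follows the OpenAI-compatible `image_url` content part shape.

```typescript
type ImageSource =
  | { kind: "url"; url: string }
  | { kind: "file"; mimeType: string; bytes: Uint8Array }

interface ImagePart {
  type: "image_url"
  image_url: { url: string; detail: string }
}

const BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

// Small dependency-free base64 encoder so the sketch runs without platform helpers.
function toBase64(bytes: Uint8Array): string {
  let out = ""
  for (let i = 0; i < bytes.length; i += 3) {
    const triple = ((bytes[i] ?? 0) << 16) | ((bytes[i + 1] ?? 0) << 8) | (bytes[i + 2] ?? 0)
    out += BASE64.charAt((triple >> 18) & 63) + BASE64.charAt((triple >> 12) & 63)
    out += i + 1 < bytes.length ? BASE64.charAt((triple >> 6) & 63) : "="
    out += i + 2 < bytes.length ? BASE64.charAt(triple & 63) : "="
  }
  return out
}

function imageContentPart(source: ImageSource, maxImageBytes = 8388608, detail = "auto"): ImagePart {
  // URLs pass through unchanged.
  if (source.kind === "url") return { type: "image_url", image_url: { url: source.url, detail } }
  // Local files are size-checked, then embedded as a data URL.
  if (source.bytes.length > maxImageBytes) {
    throw new Error(`image is ${source.bytes.length} bytes; limit is ${maxImageBytes}`)
  }
  const url = `data:${source.mimeType};base64,${toBase64(source.bytes)}`
  return { type: "image_url", image_url: { url, detail } }
}
```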
2314
+
2315
+ ---
2316
+
2317
+ ## 16. Sandbox
2318
+
2319
+ Sandbox applies to `bash` execution.
2320
+
2321
+ ```ts
2322
+ export type SandboxMode = "off" | "read-only" | "workspace-write"
2323
+ export type SandboxNetwork = "inherit" | "deny"
2324
+
2325
+ export interface SandboxConfig {
2326
+ mode: SandboxMode
2327
+ network?: SandboxNetwork
2328
+ }
2329
+
2330
+ export interface SandboxLaunch {
2331
+ command: string
2332
+ args: string[]
2333
+ adapter: "off" | "macos-sandbox-exec" | "linux-bwrap" | "env-only"
2334
+ env: Record<string, string>
2335
+ }
2336
+ ```
2337
+
2338
+ Behavior:
2339
+
2340
+ - `off`: run through shell directly.
2341
+ - `read-only`: deny writes where platform support exists.
2342
+ - `workspace-write`: allow writes in cwd and temp directories.
2343
+ - `network: "deny"`: enforced on a best-effort basis, only where the platform adapter supports it.
2344
+ - `env-only`: fallback that records intent but cannot enforce OS isolation.
2345
+
2346
+ The runtime must emit sandbox adapter metadata so transcripts show whether a command was truly sandboxed.
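Adapter selection and the "truly sandboxed" flag can be sketched as two small functions. The platform-to-adapter mapping is inferred from the adapter names and is an assumption, as are the `available` probe results and the function names.

```typescript
type SandboxMode = "off" | "read-only" | "workspace-write"
type SandboxAdapter = "off" | "macos-sandbox-exec" | "linux-bwrap" | "env-only"

// `platform` follows Node's process.platform values; `available` is a hypothetical probe result.
function chooseAdapter(
  mode: SandboxMode,
  platform: string,
  available: { sandboxExec: boolean; bwrap: boolean },
): SandboxAdapter {
  if (mode === "off") return "off"
  if (platform === "darwin" && available.sandboxExec) return "macos-sandbox-exec"
  if (platform === "linux" && available.bwrap) return "linux-bwrap"
  // Fallback: record the intent without OS-level enforcement.
  return "env-only"
}

// True only when the adapter enforces isolation at the OS level.
function isEnforced(adapter: SandboxAdapter): boolean {
  return adapter === "macos-sandbox-exec" || adapter === "linux-bwrap"
}
```

Recording `isEnforced` alongside the adapter name in the transcript would let a reader distinguish an `env-only` run from an isolated one at a glance.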
2347
+
2348
+ ---
2349
+
2350
+ ## 17. Persistent Grants
2351
+
2352
+ Persistent grants are a convenience cache, not a security override.
2353
+
2354
+ Storage:
2355
+
2356
+ ```text
2357
+ .demian/grants.json
2358
+ ```
2359
+
2360
+ Rules:
2361
+
2362
+ - TTL is enforced on load.
2363
+ - Grants are scoped to project by default.
2364
+ - Grants use the same grant key policy as session grants.
2365
+ - Built-in hard deny and global explicit deny still win.
2366
+ - `--no-persistent-grants` disables disk-backed grants.
2367
+ - `--yes` does not create persistent grants unless the answer is explicitly `always`.
2368
+
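Enforcing TTL and project scope at load time could look like the sketch below. The `PersistentGrant` record shape (`key`, `projectRoot`, `expiresAt`) and the `loadActiveGrants` name are assumptions; the source does not specify the layout of `.demian/grants.json`.

```typescript
// Hypothetical on-disk record; the real grants.json schema may differ.
interface PersistentGrant {
  key: string // same grant key policy as session grants
  projectRoot: string // scope: grants apply to one project by default
  expiresAt: number // epoch milliseconds
}

// Parse the file contents and keep only grants that are well-formed,
// in scope for this project, and not yet expired.
function loadActiveGrants(raw: string, projectRoot: string, now: number): PersistentGrant[] {
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return [] // an unreadable file grants nothing
  }
  if (!Array.isArray(parsed)) return []
  return parsed.filter(
    (g): g is PersistentGrant =>
      typeof g?.key === "string" &&
      typeof g?.projectRoot === "string" &&
      typeof g?.expiresAt === "number" &&
      g.projectRoot === projectRoot &&
      g.expiresAt > now,
  )
}
```

Failing closed on malformed input matches the framing of grants as a convenience cache: losing the cache only costs the user an extra prompt.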
2369
+ ---
2370
+
2371
+ ## 18. Events And Transcript
2372
+
2373
+ Runtime events:
2374
+
2375
+ - `session.started`
2376
+ - `session.ended`
2377
+ - `user.message`
2378
+ - `model.request`
2379
+ - `model.text`
2380
+ - `model.text.delta`
2381
+ - `model.usage`
2382
+ - `model.content_filter`
2383
+ - `context.compiled`
2384
+ - `context.compacted`
2385
+ - `provider.retry`
2386
+ - `tool.requested`
2387
+ - `tool.started`
2388
+ - `tool.completed`
2389
+ - `tool.failed`
2390
+ - `hook.fired`
2391
+ - `hook.failed`
2392
+ - `permission.requested`
2393
+ - `permission.granted`
2394
+ - `permission.denied`
2395
+ - `agent.invocation.requested`
2396
+ - `agent.invocation.started`
2397
+ - `agent.invocation.tool_policy.applied`
2398
+ - `agent.session.context_compacted`
2399
+ - `agent.session.memory_updated`
2400
+ - `agent.invocation.completed`
2401
+ - `agent.invocation.failed`
2402
+ - `agent.invocation.cancelled`
2403
+
2404
+ Most runtime events should accept optional invocation fields:
2405
+
2406
+ ```ts
2407
+ export interface InvocationEventFields {
2408
+   rootSessionId?: string
2409
+   agentSessionId?: string
2410
+   invocationId?: string
2411
+   parentInvocationId?: string
2412
+   agent?: string
2413
+   agentPath?: string[]
2414
+ }
2415
+ ```
2416
+
2417
+ Permission events include target type:
2418
+
2419
+ ```ts
2420
+ {
2421
+   type: "permission.requested"
2422
+   rootSessionId: string
2423
+   invocationId: string
2424
+   agent: string
2425
+   targetType: "tool" | "agent"
2426
+   targetName: string
2427
+   decision: Decision
2428
+   ts: number
2429
+ }
2430
+ ```
2431
+
2432
+ Single-agent transcript path remains compatible:
2433
+
2434
+ ```text
2435
+ .demian/transcripts/<sessionId>/session.jsonl
2436
+ ```
2437
+
2438
+ Preferred multi-agent transcript path:
2439
+
2440
+ ```text
2441
+ .demian/transcripts/<rootSessionId>/session.jsonl
2442
+ ```
2443
+
2444
+ All main and child events are written in chronological order to one root transcript. Events carry `agentSessionId`, `invocationId`, and `parentInvocationId`, so a UI or debug tool can reconstruct the invocation tree and per-agent memory timeline.
2445
+
2446
+ Alternative considered:
2447
+
2448
+ ```text
2449
+ .demian/transcripts/<rootSessionId>/main.jsonl
2450
+ .demian/transcripts/<rootSessionId>/agents/<invocationId>.jsonl
2451
+ ```
2452
+
2453
+ That would physically separate child logs, but it introduces cross-file ordering problems for permissions and tool execution. One root JSONL is chosen because audit order matters more than file separation. Per-agent views can be generated later from the canonical root log.
2454
+
2455
+ Example:
2456
+
2457
+ ```jsonl
2458
+ {"type":"session.started","sessionId":"ses_123","rootSessionId":"root_1","agent":"build","provider":"gemini","cwd":"/repo","ts":1730000000000}
2459
+ {"type":"user.message","sessionId":"ses_123","rootSessionId":"root_1","text":"refactor parser","ts":1730000000100}
2460
+ {"type":"agent.invocation.started","rootSessionId":"root_1","agentSessionId":"as_1","invocationId":"inv_1","agent":"reviewer","provider":"gemini-fast","model":"model","depth":1,"ts":1730000000150}
2461
+ {"type":"tool.requested","rootSessionId":"root_1","invocationId":"inv_1","agent":"reviewer","callId":"call_1","name":"read_file","input":{"path":"package.json"},"ts":1730000000200}
2462
+ {"type":"tool.completed","rootSessionId":"root_1","invocationId":"inv_1","agent":"reviewer","callId":"call_1","name":"read_file","ok":true,"preview":"...","ts":1730000000300}
2463
+ {"type":"agent.invocation.completed","rootSessionId":"root_1","agentSessionId":"as_1","invocationId":"inv_1","agent":"reviewer","preview":"no issues found","ts":1730000000400}
2464
+ {"type":"session.ended","sessionId":"ses_123","rootSessionId":"root_1","reason":"completed","ts":1730000002000}
2465
+ ```
2466
+
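Because every event carries invocation fields, a debug tool can rebuild the invocation tree from the single root log. The sketch below is one possible reader, assuming the field names shown in `InvocationEventFields`; `buildInvocationTree` and the `InvocationNode` shape are hypothetical.

```typescript
interface TranscriptEvent {
  type: string
  invocationId?: string
  parentInvocationId?: string
  agent?: string
}

interface InvocationNode {
  invocationId: string
  agent?: string
  eventTypes: string[] // event types in transcript (chronological) order
  children: InvocationNode[]
}

function buildInvocationTree(jsonl: string): InvocationNode[] {
  const nodes = new Map<string, InvocationNode>()
  const parentOf = new Map<string, string>()

  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue
    const event = JSON.parse(line) as TranscriptEvent
    if (!event.invocationId) continue // root-level events such as session.started

    let node = nodes.get(event.invocationId)
    if (!node) {
      node = { invocationId: event.invocationId, eventTypes: [], children: [] }
      nodes.set(event.invocationId, node)
    }
    if (node.agent === undefined) node.agent = event.agent
    node.eventTypes.push(event.type)
    if (event.parentInvocationId) parentOf.set(event.invocationId, event.parentInvocationId)
  }

  // Attach each invocation to its parent; invocations without a known parent are roots.
  const roots: InvocationNode[] = []
  nodes.forEach((node, id) => {
    const parentId = parentOf.get(id)
    const parent = parentId ? nodes.get(parentId) : undefined
    if (parent) parent.children.push(node)
    else roots.push(node)
  })
  return roots
}
```

Per-agent views can be derived the same way by grouping on `agentSessionId` instead of `invocationId`.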
2467
+ Transcripts may contain sensitive information. `mask-secrets` is a useful guard, not a complete redaction guarantee.
2468
+
2469
+ ---
2470
+
2471
+ ## 19. Safety Model
2472
+
2473
+ Guaranteed:
2474
+
2475
+ - No local side effects without tool calls.
2476
+ - No side-effecting tool execution without hooks and permission evaluation.
2477
+ - Cwd escape is rejected by built-in file tools.
2478
+ - `.env` edits are blocked by default.
2479
+ - Dangerous bash patterns are blocked before permission prompts.
2480
+ - `plan` agent is read-only.
2481
+ - Built-in hard deny and global explicit deny beat grants and `--yes`.
2482
+ - Single-agent mode does not expose delegation.
2483
+ - Sub agent tool calls route permission prompts to the root/main UI.
2484
+ - User-approved `always` grants from child requests become root/global grants.
2485
+ - Sub agents cannot bypass provider profile validation, cwd, sandbox, hook, hard deny, or global deny policy.
2486
+ - Child agent memory is bounded: it is compressed before it can grow past its limit.
2487
+ - Root abort cancels the active child invocation.
2488
+ - Provider content filters are surfaced.
2489
+ - Transcript records runtime decisions.
2490
+
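The cwd-escape guarantee above can be illustrated with a lexical path check. This is a sketch only: `resolveInsideCwd` is a hypothetical helper, and it does not resolve symlinks, which a real file tool would also need to consider.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path"

// Resolve a requested path against cwd and reject anything that lands outside it.
function resolveInsideCwd(cwd: string, requested: string): string {
  const root = resolve(cwd)
  const target = resolve(root, requested)
  const rel = relative(root, target)
  // Outside cwd if the relative path climbs upward, or (on Windows) points at another drive.
  const escapes = rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel)
  if (escapes) throw new Error(`path escapes cwd: ${requested}`)
  return target
}
```

Comparing with `".." + sep` rather than a bare `".."` prefix keeps names such as `..hidden` inside the workspace valid.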
2491
+ Not guaranteed:
2492
+
2493
+ - Perfect OS isolation on every platform.
2494
+ - Complete network exfiltration prevention.
2495
+ - Complete destructive command detection.
2496
+ - Complete secret redaction.
2497
+ - Provider-native safety policy control.
2498
+ - Correctness of model-generated code.
2499
+ - Correctness of model-selected sub agents.
2500
+ - Fairness or optimal scheduling for background agents, because background execution is deferred.
2501
+ - Perfect network exfiltration prevention by arbitrary external tool modules.
2502
+
2503
+ Important trust distinction:
2504
+
2505
+ ```text
2506
+ external agent markdown can change model behavior
2507
+ external tool module can execute local code at load/run time
2508
+ ```
2509
+
2510
+ Agent markdown and tool modules therefore need different trust gates. Permission grants are about runtime actions; trust decisions are about loading contributions.
2511
+
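The precedence implied by the guarantees in this section (hard deny and global deny beat grants and `--yes`) can be sketched as a single ordered check. The input flags and the `SketchDecision` values are assumptions for illustration; the runtime's real `Decision` type is not shown in this section.

```typescript
// Hypothetical decision values and inputs, named to avoid implying the real Decision type.
type SketchDecision = "allow" | "deny" | "ask"

interface PermissionInput {
  hardDenied: boolean // matched a built-in hard deny (for example a dangerous bash pattern)
  globallyDenied: boolean // matched a global explicit deny rule
  granted: boolean // covered by a session or persistent grant
  autoYes: boolean // the run was started with --yes
}

// Deny checks run first, so no grant or flag can widen them.
function decide(input: PermissionInput): SketchDecision {
  if (input.hardDenied) return "deny"
  if (input.globallyDenied) return "deny"
  if (input.granted) return "allow"
  if (input.autoYes) return "allow"
  return "ask"
}
```

Trust decisions about loading external agents or tool modules would sit outside this function entirely, since they gate what gets loaded rather than what a loaded tool may do.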
2512
+ ---
2513
+
2514
+ ## 20. Testing Strategy
2515
+
2516
+ The default test suite must be network-free.
2517
+
2518
+ Priority:
2519
+
2520
+ | # | Area |
2521
+ |---|------|
2522
+ | 1 | OpenAI-shaped message history |
2523
+ | 2 | OpenAIProvider payload and response normalization |
2524
+ | 3 | SSE streaming parser |
2525
+ | 4 | Gemini config preset path |
2526
+ | 5 | content filter mapping |
2527
+ | 6 | retry behavior and `Retry-After` |
2528
+ | 7 | Anthropic conversion fixtures |
2529
+ | 8 | Permission deny takes precedence over grant |
2530
+ | 9 | Persistent grant TTL and scope |
2531
+ | 10 | Tool input validation |
2532
+ | 11 | Tool path boundary checks |
2533
+ | 12 | Hook dispatcher block/warn/patch |
2534
+ | 13 | Session loop with mock provider |
2535
+ | 14 | Transcript event ordering |
2536
+ | 15 | Multimodal data URL conversion |
2537
+ | 16 | Sandbox launch construction |
2538
+ | 17 | Single-agent regression with delegation absent |
2539
+ | 18 | External agent/tool registration and built-in collision rejection |
2540
+ | 19 | Agent provider profile validation |
2541
+ | 20 | Callable catalog prompt and `delegate_agent.agent` schema enum |
2542
+ | 21 | Hidden/non-callable/self/cycle/depth delegation rejection |
2543
+ | 22 | Child invocation through mock providers |
2544
+ | 23 | Child tool permission routed through root coordinator |
2545
+ | 24 | User-approved child `always` grant reusable by main |
2546
+ | 25 | Hard/global deny cannot be expanded by grants |
2547
+ | 26 | Structured handoff excludes full main history |
2548
+ | 27 | Delegate context and result caps |
2549
+ | 28 | Child memory compression and repeated invocation reuse |
2550
+ | 29 | Main delegate-result ledger compaction |
2551
+ | 30 | Cross-provider handoff permission prompt |
2552
+ | 31 | Root transcript preserves parent-child event order |
2553
+ | 32 | Codex config preset, Responses payload mapping, and stream parser |
2554
+ | 33 | Codex local auth discovery, refresh classification, and installation metadata |
2555
+ | 34 | Codex API-key fallback guard and keyring account derivation |
2556
+ | 35 | Claude Code external runtime config, Agent SDK adapter, and legacy direct API gate |
2557
+ | 36 | Claude Code permission bridge, session mapping, invalid resume recovery, and usage ledger |
2558
+ | 37 | Claude Code CLI fallback capability detection, stream-json parser, and cancellation diagnostics |
2559
+ | 38 | Trust persistence and mtime invalidation |
2560
+ | 39 | Mode precedence across flags and config aliases |
2561
+ | 40 | Root abort cancels active child run |
2562
+
2563
+ Optional integration tests:
2564
+
2565
+ - real OpenAI-compatible endpoint smoke test
2566
+ - real Gemini OpenAI-compatible endpoint smoke test
2567
+ - real local Ollama smoke test
2568
+ - real Codex provider smoke test with an explicit local Codex login opt-in
2569
+ - real Claude Code provider smoke test with an explicit local Claude Code login opt-in
2570
+
2571
+ Optional integration tests must require explicit environment variables and must not run in default CI.
2572
+
2573
+ Multi-agent mock fixture:
2574
+
2575
+ ```text
2576
+ main provider:
2577
+   turn 1 -> delegate_agent(reviewer, task)
2578
+   turn 2 -> final answer using child result
2579
+
2580
+ child provider:
2581
+   turn 1 -> read_file(...)
2582
+   turn 2 -> final child answer
2583
+ ```
2584
+
2585
+ This fixture exercises delegation, child tool use, permission routing, compact result return, and transcript chronology without network access.
2586
+
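The fixture above could be driven by scripted providers along these lines. This is a simplified, synchronous sketch: `scriptedProvider`, `runAgent`, and the `Turn` shape are hypothetical, and the real providers would be asynchronous and go through the permission and transcript services.

```typescript
type Turn =
  | { kind: "tool_call"; name: string; input: Record<string, unknown> }
  | { kind: "final"; text: string }

type Tool = (input: Record<string, unknown>) => string

// A provider that replays a fixed list of turns instead of calling a model.
function scriptedProvider(turns: Turn[]): () => Turn {
  let index = 0
  return () => {
    const turn = turns[index]
    if (!turn) throw new Error("mock script exhausted")
    index += 1
    return turn
  }
}

// Minimal agent loop: run tool calls until the provider returns a final answer.
// Tool results are discarded here because the scripted turns do not depend on them.
function runAgent(agent: string, nextTurn: () => Turn, tools: Record<string, Tool>, log: string[]): string {
  for (;;) {
    const turn = nextTurn()
    if (turn.kind === "final") {
      log.push(`${agent}:final`)
      return turn.text
    }
    const tool = tools[turn.name]
    if (!tool) throw new Error(`unknown tool: ${turn.name}`)
    log.push(`${agent}:${turn.name}`)
    tool(turn.input)
  }
}

function runFixture(): { answer: string; log: string[] } {
  const log: string[] = []
  const child = scriptedProvider([
    { kind: "tool_call", name: "read_file", input: { path: "package.json" } },
    { kind: "final", text: "no issues found" },
  ])
  const main = scriptedProvider([
    { kind: "tool_call", name: "delegate_agent", input: { agent: "reviewer", task: "review parser" } },
    { kind: "final", text: "review complete" },
  ])
  const childTools: Record<string, Tool> = { read_file: () => "{}" }
  // Delegation is just another tool from the main agent's point of view.
  const mainTools: Record<string, Tool> = {
    delegate_agent: () => runAgent("reviewer", child, childTools, log),
  }
  const answer = runAgent("build", main, mainTools, log)
  return { answer, log }
}
```

The shared `log` array plays the role of the root transcript here: child entries land between the main agent's delegation call and its final answer, which is the chronology the transcript ordering test would assert.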
2587
+ ---
2588
+
2589
+ ## 21. Implementation Sequence
2590
+
2591
+ | Step | Work | Verification |
2592
+ |------|------|--------------|
2593
+ | 1 | Create package skeleton | `demian --help` smoke |
2594
+ | 2 | Port message, event, transcript, util modules | transcript golden test |
2595
+ | 3 | Port config loader and provider resolver | config merge tests |
2596
+ | 4 | Port OpenAIProvider, retry, streaming parser | provider fixture tests |
2597
+ | 5 | Add Gemini and local provider presets | config fixture test |
2598
+ | 6 | Port tool registry and read/search tools | temp workspace tests |
2599
+ | 7 | Port write/edit/bash with output cap | temp workspace and timeout tests |
2600
+ | 8 | Integrate sandbox adapter directory | launch construction tests |
2601
+ | 9 | Port hook dispatcher and built-ins | hook decision tests |
2602
+ | 10 | Port permission engine, session grants, persistent grants | deny/grant tests |
2603
+ | 11 | Port agents and prompts | agent policy tests |
2604
+ | 12 | Port Session Runner with streaming and multimodal | mock session tests |
2605
+ | 13 | Add Anthropic adapter fixtures | conversion tests |
2606
+ | 14 | Add interactive CLI/TUI startup, provider/model selection, and prompt UX | pseudo-TTY smoke tests |
2607
+ | 15 | Add Codex provider | Codex auth, payload, stream, and refresh fixture tests |
2608
+ | 16 | Update README | manual review |
2609
+
2610
+ Multi-agent follow-up sequence:
2611
+
2612
+ | Step | Work | Verification |
2613
+ |------|------|--------------|
2614
+ | 16 | Add `register()` to tool and agent registries | collision and duplicate tests |
2615
+ | 17 | Introduce `AgentDefinition` normalization | `build` and `plan` snapshots unchanged |
2616
+ | 18 | Validate agent provider profile references | unknown profile rejects or disables agent |
2617
+ | 19 | Add context budget config/env parsing | precedence tests |
2618
+ | 20 | Move grants into root PermissionCoordinator | existing grant tests pass |
2619
+ | 21 | Add RootSession and invocation IDs | events include root and invocation context |
2620
+ | 22 | Add in-memory AgentSessionStore | repeated child invocation reuses memory |
2621
+ | 23 | Implement synchronous child AgentSessionRunner | mock child returns answer to main |
2622
+ | 24 | Implement `delegate_agent` tool and catalog injection | absent in single-agent mode, present only when allowed |
2623
+ | 25 | Add cycle, depth, hidden, and callable validation | rejected before child model call |
2624
+ | 26 | Add structured handoff and result caps | large context/result handled by policy |
2625
+ | 27 | Add child memory compression | compaction event and memory update emitted |
2626
+ | 28 | Add root transcript writer fields | parent-child chronology golden test |
2627
+ | 29 | Add CLI/TUI mode flags and nested activity | single-agent smoke unchanged |
2628
+ | 30 | Add discovery and trust persistence | mtime change prompts again |
2629
+
2630
+ ---
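Step 25 (cycle, depth, hidden, and callable validation) might reduce to a check like the one below, run before any child model call. The `CatalogEntry` fields, the `maxDepth` parameter, and the reason strings are assumptions; only `agentPath` is taken from the event fields defined earlier.

```typescript
// Hypothetical catalog entry; the real AgentDefinition carries much more.
interface CatalogEntry {
  name: string
  callable: boolean
  hidden: boolean
}

type DelegationCheck = { ok: true } | { ok: false; reason: string }

// agentPath lists the agents from the main agent down to the current caller.
function validateDelegation(
  target: string,
  agentPath: string[],
  catalog: CatalogEntry[],
  maxDepth: number,
): DelegationCheck {
  const entry = catalog.find((a) => a.name === target)
  if (!entry) return { ok: false, reason: "unknown" }
  if (entry.hidden) return { ok: false, reason: "hidden" }
  if (!entry.callable) return { ok: false, reason: "non-callable" }
  if (agentPath[agentPath.length - 1] === target) return { ok: false, reason: "self" }
  if (agentPath.includes(target)) return { ok: false, reason: "cycle" }
  // The main agent is depth 0, so the new child would run at depth agentPath.length.
  if (agentPath.length > maxDepth) return { ok: false, reason: "depth" }
  return { ok: true }
}
```

Returning a reason instead of throwing lets the runtime surface the rejection to the calling model as a normal tool failure.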
2631
+
2632
+ ## 22. Dependency Policy
2633
+
2634
+ Core should be dependency-light:
2635
+
2636
+ ```json
2637
+ {
2638
+   "dependencies": {
2639
+     "@anthropic-ai/sdk": "version-from-package-manager"
2640
+   },
2641
+   "engines": {
2642
+     "node": ">=22.18"
2643
+   }
2644
+ }
2645
+ ```
2646
+
2647
+ Rationale:
2648
+
2649
+ - Native `fetch` is enough for OpenAI-compatible providers.
2650
+ - Gemini support does not require a Google SDK.
2651
+ - Anthropic is intentionally different: it uses `@anthropic-ai/sdk` because the native protocol is not OpenAI-compatible and SDK types help manage API drift.
2652
+ - Tests can run with Node's built-in test runner and native TypeScript type stripping.
2653
+ - Vendor SDK usage must stay adapter-local.
2654
+
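To show why native `fetch` is sufficient, here is a sketch of preparing an OpenAI-compatible chat request without any vendor SDK. The `ProviderProfile` fields and helper names are hypothetical. The Azure branch covers only the `api-key` header choice recorded in the decision log; Azure-specific URL shaping is out of scope here.

```typescript
// Hypothetical profile shape; real config profiles may carry more fields.
interface ProviderProfile {
  baseURL: string
  apiKey: string
  model: string
}

interface ChatMessage {
  role: "system" | "user" | "assistant"
  content: string
}

interface PreparedRequest {
  url: string
  method: "POST"
  headers: Record<string, string>
  body: string
}

// Default to bearer auth; infer Azure's api-key header from the endpoint host.
function authHeaders(baseURL: string, apiKey: string): Record<string, string> {
  const host = new URL(baseURL).hostname
  if (host.endsWith(".openai.azure.com")) return { "api-key": apiKey }
  return { authorization: `Bearer ${apiKey}` }
}

function prepareChatRequest(profile: ProviderProfile, messages: ChatMessage[]): PreparedRequest {
  return {
    url: `${profile.baseURL.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers: { "content-type": "application/json", ...authHeaders(profile.baseURL, profile.apiKey) },
    body: JSON.stringify({ model: profile.model, messages }),
  }
}
```

The prepared object can be handed to the built-in client as `fetch(req.url, req)`. Under this sketch, switching to Gemini means changing `baseURL` to its OpenAI-compatible endpoint in config, with no adapter code.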
2655
+ Avoid in core:
2656
+
2657
+ - `@google/generative-ai`
2658
+ - `@google-cloud/vertexai`
2659
+ - LangChain
2660
+ - Vercel AI SDK
2661
+ - LlamaIndex
2662
+ - LiteLLM SDK
2663
+
2664
+ Acceptable later:
2665
+
2666
+ - A small JSON schema validator if handwritten validation becomes too large.
2667
+ - Optional platform sandbox helpers behind adapter boundaries.
2668
+
2669
+ ---
2670
+
2671
+ ## 23. Decision Log
2672
+
2673
+ | Decision | Final choice | Reason |
2674
+ |----------|--------------|--------|
2675
+ | Package name | `demian` | One integrated package, no lineage suffix |
2676
+ | Canonical doc | `nodejs/architecture.md` | Replaces separate design lineages for new work |
2677
+ | Default provider path | OpenAI-compatible | Broad vendor coverage with one adapter |
2678
+ | OpenAI-compatible auth | Default bearer, Azure `api-key` inferred from endpoint | API shape is shared, but auth headers differ by provider; normal Azure users should not need extra config |
2679
+ | Gemini | Config-only via OpenAIProvider | Core code change should be zero |
2680
+ | Internal messages | OpenAI-shaped | Matches default provider path |
2681
+ | Anthropic | Separate SDK-backed adapter | Native shape differs enough to isolate conversion; SDK helps manage protocol changes |
2682
+ | Tool names | `read_file`, `write_file`, `edit_file`, `bash`, `grep`, `glob` | Stable and explicit side-effect boundary |
2683
+ | Runtime | Node-first, Bun-compatible where practical | Current Codex implementation is dependency-light |
2684
+ | Streaming | Included in v0 | Already implemented and important for CLI UX |
2685
+ | Multimodal | Included for user input | Already represented and useful for UI/code review |
2686
+ | Sandbox | Included for bash launch | Codex modes plus Claude adapter shape |
2687
+ | Persistent grants | Included with TTL | Usability without weakening deny |
2688
+ | Interactive command entry | Commands enter UI/CLI before message input | Avoids forcing long prompts into shell arguments and creates room for provider/model selection |
2689
+ | Provider/model selection timing | Before `SessionRunner` starts | Keeps provider identity stable during one tool loop |
2690
+ | Provider/model selection helper | Shared `src/ui/settings.ts` | Keeps plain CLI and TUI selection behavior identical |
2691
+ | Provider/model source labels | `config`, `saved`, `flag`, `interactive` | Shows config, remembered preference, flag, and current UI provenance clearly |
2692
+ | TUI permission bar priority | Permission prompt overrides the running command bar | Prevents hidden approval shortcuts when a tool waits for `y/a/n/Enter` |
2693
+ | TUI tool input summary | `src/ui/tool-summary.ts` | Shows bash commands and paths clearly instead of raw JSON-first output |
2694
+ | TUI prompt recall | `Up`/`Down` over in-memory submitted prompts | Reuse previous inputs without persisting UI history or changing model conversation history |
2695
+ | TUI main agent selector | `a` opens a primary-agent selector in the empty prompt state | Lets users change the main agent inside TUI without restarting |
2696
+ | Provider/model preferences | `.demian/preferences.json` | Reuse the last explicit selection without mutating config files or storing secrets |
2697
+ | Multi-agent mode | opt-in `single-agent` / `multi-agent` mode | Preserves current UX and avoids surprise extra model calls |
2698
+ | Multi-agent authority | RootSession owns permissions, grants, transcript, cwd, sandbox, and abort | Keeps user authority centralized across main and sub agents |
2699
+ | Sub execution | child AgentSessionRunner with bounded memory | Preserves specialist continuity without copying full main history |
2700
+ | Delegation exposure | one `delegate_agent` tool | Stable provider payload and centralized invocation permission |
2701
+ | Agent shape | `AgentDefinition` policy capsule | Puts prompt, provider profile, tools, defaults, delegation, and catalog in one validated unit |
2702
+ | Visibility and authority | Split `tools.visible` from grants/rules | Keeps model tool exposure separate from user-approved authority |
2703
+ | Agent provider config | profile references only | Prevents agent files from introducing endpoints or credentials |
2704
+ | Child context | structured handoff packet plus compressed child memory | Gives sub agents useful context without full main-history sharing |
2705
+ | Child result | compact result to main, full details in transcript/artifacts | Controls main context growth while preserving audit detail |
2706
+ | Permission grants | root/global grants, no per-agent grants in v1 | User approvals behave consistently across main and child invocations |
2707
+ | Transcript shape | one root JSONL with invocation fields | Preserves chronological audit order |
2708
+ | External trust | trust decisions separate from permission grants | Loading code/data and executing actions stay separate concerns |
2709
+
2710
+ ---
2711
+
2712
+ ## 24. Key Tradeoffs
2713
+
2714
+ | Decision | Gained | Given up |
2715
+ |----------|--------|----------|
2716
+ | OpenAI-compatible default | 10+ vendors through one path | Internal shape resembles OpenAI API |
2717
+ | OpenAI-shaped internal history | Minimal conversion for default path | Anthropic adapter does more work |
2718
+ | Gemini config-only | No new dependency, no new adapter | Native Gemini features are flattened |
2719
+ | Anthropic SDK adapter | Better native protocol typing and API-change handling | One adapter-local runtime dependency |
2720
+ | Six snake_case tools | Clear action boundaries | Slightly longer names |
2721
+ | Hard/global deny dominates grants | Strong safety guarantee for real safety boundaries | Agent-local deny is no longer an absolute cap after user approval |
2722
+ | Streaming in v0 | Better CLI feedback | More provider parser complexity |
2723
+ | Sandbox in v0 | Safer bash defaults and clear metadata | Platform enforcement varies |
2724
+ | Persistent grants in v0 | Less repeated prompting | Requires TTL and scope discipline |
2725
+ | Multimodal in v0 | Screenshots and UI tasks are possible | Provider support varies |
2726
+ | Interactive provider/model selection | Clearer default visibility and fewer config mistakes | Adds a pre-session state machine to CLI and TUI |
2727
+ | Root-owned multi-agent runtime | Clear authority, cancellation, and audit ownership | Adds a RootSession layer above SessionRunner |
2728
+ | Child AgentSessionRunner | Specialist continuity and provider/model isolation | More lifecycle and memory compression logic |
2729
+ | One `delegate_agent` tool | Small, stable core and centralized permission target | Model must choose target through an `agent` argument |
2730
+ | Policy-capsule agents | Agent behavior and safety are validated together | Larger agent definition shape |
2731
+ | Visibility vs permission split | Narrow orchestrators can delegate to capable specialists | Users and implementers must understand grants do not expose hidden tools |
2732
+ | Global root grants | Fewer repeated prompts and consistent user authority | No per-agent grant isolation in v1 |
2733
+ | Provider profile references | No hidden endpoints or credentials in agent files | Per-agent models require named config profiles |
2734
+ | Structured handoff | Bounded data sharing and clearer provider prompts | Child may need tools or refs for details not copied into context |
2735
+ | Compact child results | Main context stays small | Main may need transcript/artifact refs for full detail |
2736
+ | Synchronous v1 delegation | Deterministic permission and transcript order | No background child tasks yet |
2737
+ | Trust file mtime validation | Simple trust invalidation in v1 | Weaker than content-hash validation |
2738
+
2739
+ ---
2740
+
2741
+ ## 25. Summary
2742
+
2743
+ `demian` integrates the two prior designs into one canonical package architecture.
2744
+
2745
+ The final shape is:
2746
+
2747
+ - OpenAI-compatible protocol is the default provider path.
2748
+ - Internal history is OpenAI-shaped.
2749
+ - Gemini is a config entry, not a new adapter.
2750
+ - Anthropic is isolated in its adapter.
2751
+ - Hooks and permissions guard every local side effect.
2752
+ - Tools are small, explicit, and workspace-bound.
2753
+ - Streaming, multimodal input, sandbox modes, and persistent grants are part of the integrated v0.
2754
+ - Single-agent mode remains the default.
2755
+ - Multi-agent mode is an opt-in root-owned delegation extension.
2756
+ - Agents are policy capsules, not arbitrary provider or credential entrypoints.
2757
+ - `delegate_agent` is the stable model-facing delegation tool.
2758
+ - Sub agents run as bounded child sessions with compressed memory and structured handoff context.
2759
+ - Child tool calls, child invocation prompts, grants, events, transcripts, sandbox, and abort all route through the root session.
2760
+ - Runtime decisions are visible through events and transcripts.
2761
+
2762
+ The package should stay small enough to understand, strict enough to trust, and open enough to add new OpenAI-compatible vendors without touching the core loop.
2763
+
2764
+ ---
2765
+
2766
+ ## References
2767
+
2768
+ - `nodejs/architecture-by-claude.md`
2769
+ - `nodejs/architecture-by-codex.md`
2770
+ - `claude/architecture-demian.md`
2771
+ - `codex/architecture-demian.md`
2772
+ - `.documents/multi-agent-architecture.md`
2773
+ - Google AI for Developers, Gemini API OpenAI compatibility: `https://ai.google.dev/gemini-api/docs/openai`