director-cli 0.3.0__tar.gz → 0.4.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (36)
  1. director_cli-0.4.0/CHANGELOG.md +62 -0
  2. {director_cli-0.3.0 → director_cli-0.4.0}/PKG-INFO +15 -9
  3. {director_cli-0.3.0 → director_cli-0.4.0}/README.md +14 -8
  4. {director_cli-0.3.0 → director_cli-0.4.0}/director/README.md +7 -3
  5. {director_cli-0.3.0 → director_cli-0.4.0}/director/__init__.py +1 -1
  6. director_cli-0.4.0/director/claudecode.py +269 -0
  7. {director_cli-0.3.0 → director_cli-0.4.0}/director/cli.py +13 -1
  8. {director_cli-0.3.0 → director_cli-0.4.0}/director/config.example.toml +8 -2
  9. {director_cli-0.3.0 → director_cli-0.4.0}/director/config.py +1 -2
  10. director_cli-0.4.0/director/init.py +134 -0
  11. {director_cli-0.3.0 → director_cli-0.4.0}/director/opencode.py +41 -8
  12. {director_cli-0.3.0 → director_cli-0.4.0}/director/setup.py +1 -19
  13. {director_cli-0.3.0 → director_cli-0.4.0}/pyproject.toml +1 -1
  14. director_cli-0.3.0/CHANGELOG.md +0 -25
  15. {director_cli-0.3.0 → director_cli-0.4.0}/.gitignore +0 -0
  16. {director_cli-0.3.0 → director_cli-0.4.0}/LICENSE +0 -0
  17. {director_cli-0.3.0 → director_cli-0.4.0}/director/__main__.py +0 -0
  18. {director_cli-0.3.0 → director_cli-0.4.0}/director/agent_templates/brainstorm.md +0 -0
  19. {director_cli-0.3.0 → director_cli-0.4.0}/director/agent_templates/executor.md +0 -0
  20. {director_cli-0.3.0 → director_cli-0.4.0}/director/agent_templates/explorer.md +0 -0
  21. {director_cli-0.3.0 → director_cli-0.4.0}/director/agent_templates/opencode.json +0 -0
  22. {director_cli-0.3.0 → director_cli-0.4.0}/director/agent_templates/planner.md +0 -0
  23. {director_cli-0.3.0 → director_cli-0.4.0}/director/agent_templates/reviewer.md +0 -0
  24. {director_cli-0.3.0 → director_cli-0.4.0}/director/agent_templates/test-author.md +0 -0
  25. {director_cli-0.3.0 → director_cli-0.4.0}/director/bench.py +0 -0
  26. {director_cli-0.3.0 → director_cli-0.4.0}/director/cost.py +0 -0
  27. {director_cli-0.3.0 → director_cli-0.4.0}/director/dag.py +0 -0
  28. {director_cli-0.3.0 → director_cli-0.4.0}/director/gates.py +0 -0
  29. {director_cli-0.3.0 → director_cli-0.4.0}/director/gitutil.py +0 -0
  30. {director_cli-0.3.0 → director_cli-0.4.0}/director/metrics.py +0 -0
  31. {director_cli-0.3.0 → director_cli-0.4.0}/director/models.py +0 -0
  32. {director_cli-0.3.0 → director_cli-0.4.0}/director/plan.py +0 -0
  33. {director_cli-0.3.0 → director_cli-0.4.0}/director/report.py +0 -0
  34. {director_cli-0.3.0 → director_cli-0.4.0}/director/review.py +0 -0
  35. {director_cli-0.3.0 → director_cli-0.4.0}/director/run.py +0 -0
  36. {director_cli-0.3.0 → director_cli-0.4.0}/director/state.py +0 -0
@@ -0,0 +1,62 @@
1
+ # CHANGELOG
2
+
3
+
4
+ ## v0.4.0 (2026-06-25)
5
+
6
+ ### Continuous Integration
7
+
8
+ - Install build in semantic-release build_command (container lacks it)
9
+ ([`0bdd056`](https://github.com/manziman/director/commit/0bdd05666f86a93ec37cd7738075eea7ac50143a))
10
+
11
+ Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
12
+
13
+ - Make releases manual-only (workflow_dispatch, not every push to main)
14
+ ([`bddcb57`](https://github.com/manziman/director/commit/bddcb5753a66f11059d47e78a4b54168d5f1b4e0))
15
+
16
+ Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
17
+
18
+ - Push releases to protected main via a RELEASE_TOKEN PAT
19
+ ([#4](https://github.com/manziman/director/pull/4),
20
+ [`776db5b`](https://github.com/manziman/director/commit/776db5b3e2ab23acedd820ebba47b538a823cacf))
21
+
22
+ main requires PRs, so the default GITHUB_TOKEN can't push semantic-release's version commit + tag.
23
+ Use a RELEASE_TOKEN secret (a PAT with Contents:write whose owner is on the main ruleset bypass
24
+ list) for checkout + the semantic-release action. Falls back to GITHUB_TOKEN when unset. PyPI
25
+ publishing is unaffected (OIDC).
26
+
27
+ ### Features
28
+
29
+ - Add Claude Code agent-runtime provider
30
+ ([`71b0819`](https://github.com/manziman/director/commit/71b081998969ec9e0db7f16884eba015cf9fad2f))
31
+
32
+ Per-role runtime via the tier model-string prefix (claude-code/<model> → claude CLI; anything else →
33
+ opencode). Includes the live-caught dispatch fix and the prefix-selection refactor from PR review.
34
+ Tests stub the subprocess — no model/network in CI.
35
+
36
+ - Add interactive `director init` command
37
+ ([`8e8ba7d`](https://github.com/manziman/director/commit/8e8ba7da4489ee7da10b70b8d755fb3eb861e440))
38
+
39
+ `director init` configures .director/config.toml by asking questions instead of copying a static
40
+ example: it discovers models via `opencode models`, lets the user pick one per role, prompts for
41
+ the test/lint/typecheck gate commands, and writes a minimal config (with a pointer to
42
+ config.example.toml for advanced sections). `sync-agents` no longer seeds the config; the "missing
43
+ config" error now points at `director init`.
44
+
45
+ Built end-to-end with director dogfooding itself (planner: Opus 4.8, executor: local Qwen3.6-27B):
46
+ 5/5 nodes at the executor tier first attempt, integration gate green.
47
+
48
+
49
+ ## v0.3.0 (2026-06-24)
50
+
51
+ ### Documentation
52
+
53
+ - Move design lessons inline; drop the standalone lessons doc
54
+ ([`134c202`](https://github.com/manziman/director/commit/134c202e9e1df98bc20d1f7a51ab7619dd2a55fe))
55
+
56
+ The cross-phase lessons now live as comments at their point in the code (gitignore handling in
57
+ setup/gates/opencode, deterministic-gates in gates/review/opencode, config-as-object in
58
+ config/bench, non-fatal cleanup in bench, terminal-state scheduling in run). CONTRIBUTING points
59
+ to those locations; the two process notes (offline-stubs-vs-live, wall-clock cutoffs) live in the
60
+ tests docstring and CONTRIBUTING. The standalone docs/lessons-learned.md is removed from the repo.
61
+
62
+ Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: director-cli
3
- Version: 0.3.0
3
+ Version: 0.4.0
4
4
  Summary: Model-agnostic decomposition coding harness — a thin orchestrator over OpenCode
5
5
  Project-URL: Homepage, https://github.com/manziman/director
6
6
  Project-URL: Repository, https://github.com/manziman/director
@@ -78,11 +78,14 @@ director never manages provider keys itself — that lives in your OpenCode conf
78
78
  ```bash
79
79
  cd your-repo
80
80
 
81
- # 1. Install director's role agents into .opencode/ and seed .director/config.toml
81
+ # 1. Install director's role agents (+ gitignore, starter opencode.json) into .opencode/
82
82
  director sync-agents
83
83
 
84
- # 2. Edit .director/config.toml — bind roles to models, set your gate commands.
85
- # (sync-agents seeded it from the bundled, fully-commented example.)
84
+ # 2. Create .director/config.toml interactively: director init asks which model to
85
+ # use per role and what your gate commands are, then writes the config for you.
86
+ director init
87
+ # (See director/config.example.toml for the full/advanced schema if you want to
88
+ # hand-tune beyond what init prompts for.)
86
89
  $EDITOR .director/config.toml
87
90
 
88
91
  # 3. Plan: brainstorm → spec → test-gated task DAG (two approval gates)
@@ -117,17 +120,20 @@ director run
117
120
  | `director run [--parallel N] [--max-attempts K]` | Execute the DAG: each node in an isolated git worktree, gated by tests/lint/typecheck, auto-merged on pass; escalates a stuck node one tier up. |
118
121
  | `director status` | Per-node progress, attempts, cost, and the executor-tier completion rate. |
119
122
  | `director bench "<task>" --profiles a,b,c` | Run the **same** task (same frozen acceptance tests) across profile variants and diff cost / quality / wall-time. |
120
- | `director sync-agents` | (Re)install the role agents into `<repo>/.opencode` and seed `.director/config.toml`. |
123
+ | `director init [--repo .]` | Interactively create `.director/config.toml`: asks which model to use per role and what your gate commands are. |
124
+ | `director sync-agents` | (Re)install the role agents into `<repo>/.opencode` (plus a gitignore and a starter `opencode.json`). |
121
125
 
122
126
  All state lives under `.director/` (resumable, debuggable): `plan.json`, `state.json`,
123
127
  `costs.jsonl`, `metrics.jsonl`, per-call `logs/`, and `bench/`.
124
128
 
125
129
  ## Configuration
126
130
 
127
- `director sync-agents` seeds `.director/config.toml` from a complete, commented example
128
- (also at [`director/config.example.toml`](director/config.example.toml)). A config is
129
- just roles → `provider/model` strings, the deterministic gate commands, per-model
130
- pricing, and run limits; the example shows how to bind the executor tier to a local
131
+ `director init` interactively creates `.director/config.toml`: it asks which model
132
+ to use for each role and what your deterministic gate commands are, then writes the
133
+ config for you. A config is just roles → `provider/model` strings, the deterministic
134
+ gate commands, per-model pricing, and run limits. For the full/advanced schema, see
135
+ the complete, commented [`director/config.example.toml`](director/config.example.toml):
136
+ it shows how to bind the executor tier to a local
131
137
  model (≈ $0 implementation), a low-cost cloud model (zero local infra), or a frontier
132
138
  model (the expensive baseline). See [`director/README.md`](director/README.md) for the
133
139
  full architecture (gates, two-stage review, red-green hardening, metrics).
@@ -53,11 +53,14 @@ director never manages provider keys itself — that lives in your OpenCode conf
53
53
  ```bash
54
54
  cd your-repo
55
55
 
56
- # 1. Install director's role agents into .opencode/ and seed .director/config.toml
56
+ # 1. Install director's role agents (+ gitignore, starter opencode.json) into .opencode/
57
57
  director sync-agents
58
58
 
59
- # 2. Edit .director/config.toml — bind roles to models, set your gate commands.
60
- # (sync-agents seeded it from the bundled, fully-commented example.)
59
+ # 2. Create .director/config.toml interactively: director init asks which model to
60
+ # use per role and what your gate commands are, then writes the config for you.
61
+ director init
62
+ # (See director/config.example.toml for the full/advanced schema if you want to
63
+ # hand-tune beyond what init prompts for.)
61
64
  $EDITOR .director/config.toml
62
65
 
63
66
  # 3. Plan: brainstorm → spec → test-gated task DAG (two approval gates)
@@ -92,17 +95,20 @@ director run
92
95
  | `director run [--parallel N] [--max-attempts K]` | Execute the DAG: each node in an isolated git worktree, gated by tests/lint/typecheck, auto-merged on pass; escalates a stuck node one tier up. |
93
96
  | `director status` | Per-node progress, attempts, cost, and the executor-tier completion rate. |
94
97
  | `director bench "<task>" --profiles a,b,c` | Run the **same** task (same frozen acceptance tests) across profile variants and diff cost / quality / wall-time. |
95
- | `director sync-agents` | (Re)install the role agents into `<repo>/.opencode` and seed `.director/config.toml`. |
98
+ | `director init [--repo .]` | Interactively create `.director/config.toml`: asks which model to use per role and what your gate commands are. |
99
+ | `director sync-agents` | (Re)install the role agents into `<repo>/.opencode` (plus a gitignore and a starter `opencode.json`). |
96
100
 
97
101
  All state lives under `.director/` (resumable, debuggable): `plan.json`, `state.json`,
98
102
  `costs.jsonl`, `metrics.jsonl`, per-call `logs/`, and `bench/`.
99
103
 
100
104
  ## Configuration
101
105
 
102
- `director sync-agents` seeds `.director/config.toml` from a complete, commented example
103
- (also at [`director/config.example.toml`](director/config.example.toml)). A config is
104
- just roles → `provider/model` strings, the deterministic gate commands, per-model
105
- pricing, and run limits; the example shows how to bind the executor tier to a local
106
+ `director init` interactively creates `.director/config.toml`: it asks which model
107
+ to use for each role and what your deterministic gate commands are, then writes the
108
+ config for you. A config is just roles → `provider/model` strings, the deterministic
109
+ gate commands, per-model pricing, and run limits. For the full/advanced schema, see
110
+ the complete, commented [`director/config.example.toml`](director/config.example.toml):
111
+ it shows how to bind the executor tier to a local
106
112
  model (≈ $0 implementation), a low-cost cloud model (zero local infra), or a frontier
107
113
  model (the expensive baseline). See [`director/README.md`](director/README.md) for the
108
114
  full architecture (gates, two-stage review, red-green hardening, metrics).
@@ -12,7 +12,8 @@ director plan "<task>" --auto --no-critique # gates auto-pass, fully hands-off
12
12
  director run [--repo .] [--parallel N] [--max-attempts K]
13
13
  director status [--repo .]
14
14
  director bench "<task>" --profiles all-frontier,cheap-cloud,local-first [--plan-profile P]
15
- director sync-agents [--repo .] # (re)install role agents into <repo>/.opencode
15
+ director init [--repo .] # interactively create .director/config.toml (per-role models + gate commands)
16
+ director sync-agents [--repo .] # (re)install role agents into <repo>/.opencode (+ gitignore, starter opencode.json)
16
17
  ```
17
18
 
18
19
  ## Flow
@@ -91,8 +92,11 @@ Per-profile metrics streams and a `summary.json` land in `.director/bench/`.
91
92
  Roles bind to `provider/model` strings in `.director/config.toml` (`[tiers]`).
92
93
  Code/logs name only roles. `director` passes the resolved model via `opencode run
93
94
  --agent <role> --model <tier>`, so **switching executor models is a config edit,
94
- never a code change.** `sync-agents` seeds `.director/config.toml` from the bundled
95
- `config.example.toml`; edit it to bind roles to models. For `bench`, create
95
+ never a code change.** `director init` interactively creates `.director/config.toml`,
96
+ asking which model to use per role and what your gate commands are; `sync-agents` only
97
+ installs the role agents (plus a gitignore and a starter `opencode.json`) and no longer
98
+ writes the config. See the bundled `config.example.toml` for the full/advanced schema.
99
+ For `bench`, create
96
100
  `.director/profiles/<name>.toml` variants (copy `config.toml`, change the executor tier).
97
101
 
98
102
  ## Deliberate deviations from the spec
@@ -7,4 +7,4 @@ typecheck, exit codes — never an LLM judge) decide what merges. Roles bind to
7
7
  model tiers in `.director/config.toml`; nothing here knows "local" vs "cloud".
8
8
  """
9
9
 
10
- __version__ = "0.3.0"
10
+ __version__ = "0.4.0"
@@ -0,0 +1,269 @@
1
+ """Headless Claude Code driver.
2
+
3
+ Wraps `claude -p <message>` with bundled system-prompt templates and parses the
4
+ JSON / NDJSON output into a structured RunResult. Stdlib only."""
5
+
6
+ from __future__ import annotations
7
+
8
+ import contextlib
9
+ import importlib.resources as ir
10
+ import json
11
+ import subprocess
12
+ from pathlib import Path
13
+
14
+ from director.opencode import _CLEAN_ENV, RunResult
15
+
16
+ # --------------------------------------------------------------------------- #
17
+ # system_prompt_for
18
+ # --------------------------------------------------------------------------- #
19
+
20
+
21
+ def system_prompt_for(agent: str) -> str:
22
+ """Return the bundled template for *agent* with YAML frontmatter stripped."""
23
+ filename = agent.replace("_", "-") + ".md"
24
+ tpl = ir.files("director.agent_templates").joinpath(filename).read_text()
25
+ stripped_tpl = tpl.lstrip("\n")
26
+ lines = stripped_tpl.splitlines()
27
+ if lines and lines[0] == "---":
28
+ closer = -1
29
+ for i in range(1, len(lines)):
30
+ if lines[i] == "---":
31
+ closer = i
32
+ break
33
+ if closer != -1:
34
+ return "\n".join(lines[closer + 1 :]).strip()
35
+ return tpl.strip()
36
+
37
+
38
+ # --------------------------------------------------------------------------- #
39
+ # run_claude
40
+ # --------------------------------------------------------------------------- #
41
+
42
+
43
+ def run_claude(
44
+ *,
45
+ agent: str,
46
+ model: str,
47
+ message: str,
48
+ cwd: str | Path,
49
+ log_path: str | Path,
50
+ timeout: int,
51
+ ) -> RunResult:
52
+ """Invoke Claude Code headlessly. Never raises on CLI / model failure."""
53
+ log_path = Path(log_path)
54
+ log_path.parent.mkdir(parents=True, exist_ok=True)
55
+ err_path = log_path.with_suffix(log_path.suffix + ".stderr")
56
+
57
+ cmd = [
58
+ "claude",
59
+ "-p",
60
+ message,
61
+ "--output-format",
62
+ "json",
63
+ "--model",
64
+ model,
65
+ "--append-system-prompt",
66
+ system_prompt_for(agent),
67
+ "--dangerously-skip-permissions",
68
+ ]
69
+
70
+ timed_out = False
71
+ with open(log_path, "wb") as out, open(err_path, "wb") as err:
72
+ proc = subprocess.Popen(cmd, cwd=str(cwd), stdout=out, stderr=err, env=_CLEAN_ENV)
73
+ try:
74
+ rc = proc.wait(timeout=timeout)
75
+ except subprocess.TimeoutExpired:
76
+ with contextlib.suppress(AttributeError, OSError):
77
+ proc.kill()
78
+ with contextlib.suppress(Exception):
79
+ proc.wait()
80
+ rc = 124
81
+ timed_out = True
82
+
83
+ return _parse_claude(log_path, rc, timed_out)
84
+
85
+
86
+ # --------------------------------------------------------------------------- #
87
+ # _parse_claude
88
+ # --------------------------------------------------------------------------- #
89
+
90
+
91
+ def _parse_claude(log_path: Path, rc: int, timed_out: bool) -> RunResult:
92
+ """Shape-tolerant parser for Claude Code JSON / NDJSON output."""
93
+ text_parts: list[str] = []
94
+ tokens = {
95
+ "input": 0,
96
+ "output": 0,
97
+ "reasoning": 0,
98
+ "cache_read": 0,
99
+ "cache_write": 0,
100
+ "total": 0,
101
+ }
102
+ n_steps: int | None = None
103
+ tool_calls: list[tuple[str, str]] = []
104
+ tool_events: list[dict] = []
105
+ error: str | None = None
106
+
107
+ raw = log_path.read_text(errors="replace")
108
+ stripped = raw.strip()
109
+
110
+ # Try single JSON object first.
111
+ records: list[dict] = []
112
+ if stripped:
113
+ try:
114
+ obj = json.loads(stripped)
115
+ if isinstance(obj, dict):
116
+ records.append(obj)
117
+ elif isinstance(obj, list):
118
+ for item in obj:
119
+ if isinstance(item, dict):
120
+ records.append(item)
121
+ except json.JSONDecodeError:
122
+ # Fall through to NDJSON line-by-line.
123
+ pass
124
+
125
+ if not records:
126
+ for line in raw.splitlines():
127
+ line = line.strip()
128
+ if not line.startswith("{"):
129
+ continue
130
+ try:
131
+ obj = json.loads(line)
132
+ if isinstance(obj, dict):
133
+ records.append(obj)
134
+ except json.JSONDecodeError:
135
+ continue
136
+
137
+ # --- aggregate across all parsed records ---------------------------------
138
+ total_cost = 0.0
139
+ has_num_turns = False
140
+ assistant_count = 0
141
+
142
+ for rec in records:
143
+ # -- text -------------------------------------------------------------
144
+ result_val = rec.get("result")
145
+ if isinstance(result_val, str):
146
+ text_parts.append(result_val)
147
+
148
+ msg = rec.get("message") or {}
149
+ content = msg.get("content") if isinstance(msg, dict) else None
150
+ if isinstance(content, str):
151
+ text_parts.append(content)
152
+ elif isinstance(content, list):
153
+ for seg in content:
154
+ if isinstance(seg, dict) and seg.get("type") == "text":
155
+ t = seg.get("text", "")
156
+ if isinstance(t, str):
157
+ text_parts.append(t)
158
+
159
+ # -- tool calls inside message.content ---------------------------------
160
+ if isinstance(content, list):
161
+ for seg in content:
162
+ if isinstance(seg, dict) and seg.get("type") == "tool_use":
163
+ _collect_tool(seg, tool_calls, tool_events)
164
+
165
+ # -- top-level tool records -------------------------------------------
166
+ rec_type = rec.get("type", "")
167
+ if rec_type in ("tool_use", "tool_result"):
168
+ name = rec.get("name") or rec.get("tool", "?")
169
+ status = rec.get("status") or rec.get("state", "?")
170
+ _collect_tool_entry(str(name), str(status), rec, tool_calls, tool_events)
171
+
172
+ # -- tokens -----------------------------------------------------------
173
+ usage = rec.get("usage")
174
+ if not isinstance(usage, dict):
175
+ msg_usage = (msg if isinstance(msg, dict) else {}).get("usage")
176
+ if isinstance(msg_usage, dict):
177
+ usage = msg_usage
178
+ if isinstance(usage, dict):
179
+ tokens["input"] += int(usage.get("input_tokens") or 0)
180
+ tokens["output"] += int(usage.get("output_tokens") or 0)
181
+ cr = usage.get("cache_read_input_tokens")
182
+ if cr is not None:
183
+ tokens["cache_read"] += int(cr)
184
+ cw = usage.get("cache_creation_input_tokens")
185
+ if cw is not None:
186
+ tokens["cache_write"] += int(cw)
187
+ reasoning_val = usage.get("reasoning") or usage.get("reasoning_tokens")
188
+ if reasoning_val is not None:
189
+ tokens["reasoning"] += int(reasoning_val)
190
+
191
+ # -- cost -------------------------------------------------------------
192
+ tc = rec.get("total_cost_usd")
193
+ if tc is not None:
194
+ total_cost += float(tc)
195
+
196
+ # -- n_steps ----------------------------------------------------------
197
+ nt = rec.get("num_turns")
198
+ if nt is not None and not has_num_turns:
199
+ n_steps = int(nt)
200
+ has_num_turns = True
201
+
202
+ if rec_type == "assistant":
203
+ assistant_count += 1
204
+
205
+ # -- error ------------------------------------------------------------
206
+ if rec.get("is_error"):
207
+ error = str(rec.get("error") or rec.get("subtype") or "error occurred")
208
+ elif rec_type in ("error",) and (rec.get("error") or rec.get("subtype")):
209
+ error = str(rec.get("error") or rec.get("subtype"))
210
+
211
+ # -- finalize tokens ------------------------------------------------------
212
+ if tokens["total"] == 0:
213
+ total_tok = None
214
+ for rec in records:
215
+ usage = rec.get("usage")
216
+ if not isinstance(usage, dict):
217
+ msg_usage = (
218
+ rec.get("message", {}) if isinstance(rec.get("message"), dict) else {}
219
+ ).get("usage")
220
+ if isinstance(msg_usage, dict):
221
+ usage = msg_usage
222
+ if isinstance(usage, dict):
223
+ tt = usage.get("total_tokens")
224
+ if tt is not None:
225
+ total_tok = int(tt)
226
+ if total_tok is not None:
227
+ tokens["total"] = total_tok
228
+ else:
229
+ tokens["total"] = tokens["input"] + tokens["output"]
230
+
231
+ # -- finalize n_steps -----------------------------------------------------
232
+ if n_steps is None:
233
+ n_steps = assistant_count if assistant_count > 0 else 0
234
+
235
+ # -- finalize error -------------------------------------------------------
236
+ text = "".join(text_parts).strip()
237
+ if rc != 0 and not text and error is None:
238
+ error = f"non-zero exit code {rc}"
239
+
240
+ return RunResult(
241
+ returncode=rc,
242
+ text=text,
243
+ tokens=tokens,
244
+ cost_reported=total_cost,
245
+ n_steps=n_steps,
246
+ tool_calls=tool_calls,
247
+ tool_events=tool_events,
248
+ error=error,
249
+ timed_out=timed_out,
250
+ log_path=str(log_path),
251
+ )
252
+
253
+
254
+ def _collect_tool(seg: dict, tool_calls: list[tuple[str, str]], tool_events: list[dict]) -> None:
255
+ name = seg.get("name", "?")
256
+ status = "?"
257
+ _collect_tool_entry(str(name), status, seg, tool_calls, tool_events)
258
+
259
+
260
+ def _collect_tool_entry(
261
+ name: str,
262
+ status: str,
263
+ record: dict,
264
+ tool_calls: list[tuple[str, str]],
265
+ tool_events: list[dict],
266
+ ) -> None:
267
+ tool_calls.append((name, status))
268
+ blob = json.dumps(record, default=str)[:2000].lower()
269
+ tool_events.append({"name": name.lower(), "status": status, "blob": blob})
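Note on the parser above: `_parse_claude` first tries to read the whole log as one JSON document and only then falls back to line-by-line NDJSON. The sketch below isolates that loading step so it can be tried on its own. `load_records` is a hypothetical name used for illustration, not a function in the package, and the sample inputs are invented.

```python
import json


def load_records(raw: str) -> list[dict]:
    """Return the JSON objects found in raw, trying whole-document JSON first."""
    records: list[dict] = []
    stripped = raw.strip()
    if stripped:
        try:
            obj = json.loads(stripped)
        except json.JSONDecodeError:
            obj = None  # not a single document; fall back to line-by-line below
        if isinstance(obj, dict):
            records.append(obj)
        elif isinstance(obj, list):
            # Keep only the dict items of a top-level array.
            records.extend(item for item in obj if isinstance(item, dict))
    if not records:
        for line in raw.splitlines():
            line = line.strip()
            if not line.startswith("{"):
                continue  # skip banners, blank lines, and other non-JSON noise
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # a truncated or malformed line is ignored, not fatal
            if isinstance(obj, dict):
                records.append(obj)
    return records


single = load_records('{"result": "ok"}')
mixed = load_records('warming up\n{"type": "assistant"}\n{broken\n{"num_turns": 3}\n')
```

A log that mixes noise with valid lines still yields the valid records, which fits the "never raises on CLI / model failure" contract stated in the `run_claude` docstring.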
@@ -1,4 +1,4 @@
1
- """director CLI — plan | run | status | bench | sync-agents."""
1
+ """director CLI — plan | run | status | bench | sync-agents | init."""
2
2
 
3
3
  from __future__ import annotations
4
4
 
@@ -78,6 +78,14 @@ def cmd_sync_agents(args) -> int:
78
78
  return 0
79
79
 
80
80
 
81
+ def cmd_init(args) -> int:
82
+ from director.init import run_init
83
+
84
+ path = run_init(args.repo)
85
+ print(f"Wrote {path}")
86
+ return 0
87
+
88
+
81
89
  def build_parser() -> argparse.ArgumentParser:
82
90
  p = argparse.ArgumentParser(prog="director", description=__doc__)
83
91
  p.add_argument("--version", action="version", version=f"director {__version__}")
@@ -150,6 +158,10 @@ def build_parser() -> argparse.ArgumentParser:
150
158
  psa = sub.add_parser("sync-agents", help="(re)install role agents into <repo>/.opencode")
151
159
  psa.add_argument("--repo", default=".")
152
160
  psa.set_defaults(func=cmd_sync_agents)
161
+
162
+ pi = sub.add_parser("init", help="interactively configure .director/config.toml")
163
+ pi.add_argument("--repo", default=".")
164
+ pi.set_defaults(func=cmd_init)
153
165
  return p
154
166
 
155
167
 
@@ -9,8 +9,13 @@
9
9
  # turn it into a zero-local-infra "cheap-cloud" setup or an "all-frontier" baseline.
10
10
 
11
11
  [tiers]
12
- # Roles bound to resolved OpenCode model strings ("provider/model"). Code, prompts,
13
- # and logs refer ONLY to these role names — never to a specific model.
12
+ # Each role is bound to a "<provider>/<model>" string. The provider segment also
13
+ # selects the agent runtime:
14
+ # - any OpenCode provider (lmstudio, openrouter, amazon-bedrock, …) → OpenCode (default)
15
+ # - "claude-code/<model>" → the Claude Code CLI (`claude`); <model> is passed to
16
+ # `claude --model` (e.g. "claude-code/opus", "claude-code/sonnet")
17
+ # Mix freely per role — e.g. a claude-code planner alongside a local-opencode executor.
18
+ # Code, prompts, and logs refer ONLY to role names — never to a specific model.
14
19
  planner = "amazon-bedrock/us.anthropic.claude-opus-4-7" # decomposition + DAG (use your strongest model)
15
20
  test_author = "amazon-bedrock/us.anthropic.claude-opus-4-7" # tests are the contract → strongest
16
21
  executor = "lmstudio/qwen3.6-27b-mtp" # implements each node. The cheap tier.
@@ -23,6 +28,7 @@ escalation = "amazon-bedrock/anthropic.claude-sonnet-4-6" # per-task fallbac
23
28
  # executor = "openrouter/deepseek/deepseek-v4-pro"
24
29
  # all-frontier baseline (expensive control): set executor = reviewer = escalation
25
30
  # to the same frontier model as the planner.
31
+ # Claude Code: e.g. planner = "claude-code/opus" (drive planning via the `claude` CLI)
26
32
 
27
33
  # Only needed if a tier above points at a local OpenAI-compatible endpoint.
28
34
  [providers.local]
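Note on the `[tiers]` comments above: the provider segment of a tier string selects the runtime. A minimal sketch of that rule follows, assuming only what the comments state (a `claude-code/` prefix routes to the `claude` CLI with the prefix stripped, everything else goes to OpenCode unchanged). `select_runtime` is a hypothetical helper for illustration and is not director's actual dispatch code.

```python
CLAUDE_PREFIX = "claude-code/"  # same value as the constant added to director/opencode.py


def select_runtime(tier: str) -> tuple[str, str]:
    """Hypothetical helper: return (runtime, model) for a tier model string."""
    if tier.startswith(CLAUDE_PREFIX):
        # The remainder is what would be passed to `claude --model`.
        return "claude", tier[len(CLAUDE_PREFIX):]
    # Any other string is treated as an OpenCode "provider/model" value, untouched.
    return "opencode", tier


planner = select_runtime("claude-code/opus")
executor = select_runtime("lmstudio/qwen3.6-27b-mtp")
```

Because the check is a plain prefix match, an OpenCode string with several slashes (such as the `openrouter/deepseek/deepseek-v4-pro` example in the file) passes through whole.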
@@ -77,8 +77,7 @@ def load(repo: Path) -> Config:
77
77
  path = Path(repo) / ".director" / "config.toml"
78
78
  if not path.exists():
79
79
  raise FileNotFoundError(
80
- f"{path} not found. Run `director sync-agents` to seed it from the bundled "
81
- f"example, then edit it."
80
+ f"{path} not found. Run `director init` to create it interactively."
82
81
  )
83
82
  return load_file(path)
84
83
 
@@ -0,0 +1,134 @@
1
+ """Interactive `director init`: discover models, prompt for tiers/gates, render TOML.
2
+
3
+ This module wires the interactive `director init` flow. It discovers available
4
+ models by shelling out to `opencode models`, prompts the user to bind each role
5
+ to a model (or falls back to free-text entry when discovery is unavailable),
6
+ prompts for the deterministic gate commands, and renders a minimal
7
+ `.director/config.toml`. The renderer is pure and its output round-trips through
8
+ `director.config.load_file`.
9
+ """
10
+
11
+ from __future__ import annotations
12
+
13
+ import subprocess
14
+ from pathlib import Path
15
+
16
+ from director.config import ROLES
17
+
18
+
19
+ def parse_models(text: str) -> list[str]:
20
+ """Parse `opencode models` output into a deduped, ordered list of model ids.
21
+
22
+ Lines are stripped; blank lines and lines without a `/` are dropped. The
23
+ first occurrence of each model id is kept and later duplicates discarded.
24
+ """
25
+ seen: set[str] = set()
26
+ models: list[str] = []
27
+ for line in text.split("\n"):
28
+ stripped = line.strip()
29
+ if not stripped:
30
+ continue
31
+ if "/" not in stripped:
32
+ continue
33
+ if stripped in seen:
34
+ continue
35
+ seen.add(stripped)
36
+ models.append(stripped)
37
+ return models
38
+
39
+
40
+ def discover_models() -> list[str]:
41
+ """Run `opencode models` and parse its output; return [] on any failure."""
42
+ try:
43
+ result = subprocess.run(
44
+ ["opencode", "models"],
45
+ capture_output=True,
46
+ text=True,
47
+ check=False,
48
+ )
49
+ except FileNotFoundError:
50
+ return []
51
+ if result.returncode != 0:
52
+ return []
53
+ return parse_models(result.stdout)
54
+
55
+
56
+ def prompt_model(role: str, models: list[str]) -> str:
57
+ """Prompt the user to bind `role` to a model, looping until a valid choice."""
58
+ if models:
59
+ for i, model in enumerate(models, start=1):
60
+ print(f" {i}) {model}")
61
+ while True:
62
+ answer = input(f"select model for {role}: ").strip()
63
+ if not answer:
64
+ continue
65
+ try:
66
+ n = int(answer)
67
+ except ValueError:
68
+ print("invalid selection")
69
+ continue
70
+ if 1 <= n <= len(models):
71
+ return models[n - 1]
72
+ print("invalid selection")
73
+ else:
74
+ while True:
75
+ answer = input(f"enter model for {role}: ").strip()
76
+ if answer:
77
+ return answer
78
+
79
+
80
+ def prompt_gate(name: str) -> str:
81
+ """Prompt once for the `name` gate command; blank means skip and is valid."""
82
+ return input(f"command for {name} gate (blank to skip): ").strip()
83
+
84
+
85
+ def render_config(tiers: dict[str, str], gates: dict[str, str]) -> str:
86
+ """Render a minimal `.director/config.toml` text from tiers and gates."""
87
+
88
+ def emit(table: dict[str, str]) -> list[str]:
89
+ lines = []
90
+ for key, value in table.items():
91
+ escaped = value.replace("\\", "\\\\").replace('"', '\\"')
92
+ lines.append(f'{key} = "{escaped}"')
93
+ return lines
94
+
95
+ parts: list[str] = []
96
+ parts.append("[tiers]")
97
+ parts.extend(emit(tiers))
98
+ parts.append("")
99
+ parts.append("[gates]")
100
+ parts.extend(emit(gates))
101
+ parts.append("")
102
+ parts.append("# Advanced options (pricing, limits, review) are omitted here.")
103
+ parts.append("# See the bundled config.example.toml for the full schema.")
104
+ return "\n".join(parts) + "\n"
105
+
106
+
107
+ def run_init(repo: str) -> Path:
108
+ """Orchestrate the interactive init flow and write `.director/config.toml`."""
109
+ cfg_path = Path(repo) / ".director" / "config.toml"
110
+
111
+ if cfg_path.exists():
112
+ answer = input("config.toml exists; overwrite? [y/N] ").strip().lower()
113
+ if answer not in ("y", "yes"):
114
+ print("aborted; nothing was written.")
115
+ return cfg_path
116
+
117
+ models = discover_models()
118
+ if not models:
119
+ print(
120
+ "warning: `opencode models` was unavailable or returned no models; "
121
+ "falling back to free-text entry."
122
+ )
123
+
124
+ tiers: dict[str, str] = {}
125
+ for role in ROLES:
126
+ tiers[role] = prompt_model(role, models)
127
+
128
+ gates: dict[str, str] = {}
129
+ for name in ("test", "lint", "typecheck"):
130
+ gates[name] = prompt_gate(name)
131
+
132
+ cfg_path.parent.mkdir(parents=True, exist_ok=True)
133
+ cfg_path.write_text(render_config(tiers, gates))
134
+ return cfg_path
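Note on `render_config` above: values are written as TOML basic strings, so backslashes are escaped first and double quotes second. The sketch below reproduces just that escaping for a single table to show what a gate command containing quotes looks like once rendered. `render_table` is a hypothetical stand-in written for this note, and the gate commands are invented examples.

```python
def render_table(name: str, table: dict[str, str]) -> str:
    """Render one TOML table of string values, escaping backslashes, then quotes."""
    lines = [f"[{name}]"]
    for key, value in table.items():
        # Order matters: escaping quotes first would have its new backslashes doubled.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{key} = "{escaped}"')
    return "\n".join(lines) + "\n"


gates = {"test": 'pytest -k "not slow"', "lint": "", "typecheck": "mypy ."}
rendered = render_table("gates", gates)
print(rendered)
```

A skipped gate (blank answer) is rendered as an empty string rather than omitted, which matches `prompt_gate` treating blank input as valid.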
@@ -19,6 +19,11 @@ from pathlib import Path
  # changed-file count. Suppressing it at the source keeps every worktree clean.
  _CLEAN_ENV = {**os.environ, "PYTHONDONTWRITEBYTECODE": "1"}
 
+ # A tier model string prefixed with this routes to the Claude Code runtime; the
+ # remainder is the `claude --model` value (e.g. "claude-code/opus" → claude --model
+ # opus). Everything else is an OpenCode "provider/model" string (the default).
+ CLAUDE_PREFIX = "claude-code/"
+
 
 
  @dataclass
  class RunResult:
@@ -38,7 +43,7 @@ class RunResult:
          return self.returncode == 0 and self.error is None and not self.timed_out
 
 
- def run_agent(
+ def _run_opencode(
      *,
      agent: str,
      model: str,
@@ -47,17 +52,11 @@ def run_agent(
      log_path: str | Path,
      timeout: int,
  ) -> RunResult:
-     """Invoke an OpenCode agent headlessly in `cwd`. NDJSON events go to
-     `log_path`; OpenCode logs go to `log_path + '.stderr'`. Never raises on a
-     model/agent failure — inspect RunResult.ok / .error / .timed_out."""
+     """Run the opencode CLI for a single agent invocation."""
      log_path = Path(log_path)
      log_path.parent.mkdir(parents=True, exist_ok=True)
      err_path = log_path.with_suffix(log_path.suffix + ".stderr")
 
-     # --dir pins the project/worktree explicitly. Without it, `opencode run`
-     # resolves the project root by walking up for a .git and can land on an
-     # ENCLOSING repo (so edits leak out of an isolated worktree). Worktrees are
-     # also placed outside the repo tree (see run.py) to make this airtight.
      cmd = [
          "opencode",
          "run",
@@ -87,6 +86,40 @@ def run_agent(
      return _parse(log_path, rc, timed_out)
 
 
+ def run_agent(
+     *,
+     agent: str,
+     model: str,
+     message: str,
+     cwd: str | Path,
+     log_path: str | Path,
+     timeout: int,
+ ) -> RunResult:
+     """Invoke an agent headlessly in `cwd`. The runtime is chosen by the `model`
+     string: a `claude-code/<model>` tier routes to the Claude Code runtime (with
+     the prefix stripped), anything else to OpenCode (the default). Never raises on
+     a failure — inspect RunResult.ok / .error / .timed_out."""
+     if model.startswith(CLAUDE_PREFIX):
+         from director.claudecode import run_claude
+
+         return run_claude(
+             agent=agent,
+             model=model[len(CLAUDE_PREFIX) :],
+             message=message,
+             cwd=cwd,
+             log_path=log_path,
+             timeout=timeout,
+         )
+     return _run_opencode(
+         agent=agent,
+         model=model,
+         message=message,
+         cwd=cwd,
+         log_path=log_path,
+         timeout=timeout,
+     )
+
+
  def _parse(log_path: Path, rc: int, timed_out: bool) -> RunResult:
      text_parts: list[str] = []
      tokens = {"input": 0, "output": 0, "reasoning": 0, "total": 0}
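
Note on the new `run_agent` dispatch: the runtime is selected purely from the tier's model string. The sketch below isolates that prefix rule so it can be read without the subprocess plumbing. `split_runtime` is a hypothetical helper, not a function in the package, and `"openrouter/some-model"` is a placeholder model string used only for illustration.

```python
CLAUDE_PREFIX = "claude-code/"  # same constant value as in the diff


def split_runtime(model: str) -> tuple[str, str]:
    """Hypothetical helper: return (runtime name, model value handed to it)."""
    if model.startswith(CLAUDE_PREFIX):
        # Strip the routing prefix; the remainder is the `claude --model` value.
        return "claude-code", model[len(CLAUDE_PREFIX):]
    # Anything else is treated as an OpenCode "provider/model" string.
    return "opencode", model


print(split_runtime("claude-code/opus"))       # ('claude-code', 'opus')
print(split_runtime("openrouter/some-model"))  # ('opencode', 'openrouter/some-model')
```

Because the rule is a plain prefix check, existing configs whose tiers do not start with `claude-code/` keep routing to OpenCode unchanged.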
@@ -7,8 +7,7 @@ into <repo>/.opencode/agents/ (a.k.a. the `director sync-agents` step).
 
  Provider auth (Bedrock/OpenRouter keys, the LM Studio endpoint) is the operator's
  responsibility — it lives in the user's *global* OpenCode config. We add the
- project-local agent files, a starter opencode.json if the repo has none, and seed
- a ready-to-edit .director/config.toml from the bundled example (if none exists).
+ project-local agent files, a starter opencode.json if the repo has none.
  """
 
  from __future__ import annotations
@@ -25,11 +24,6 @@ AGENT_FILES = (
      "reviewer.md",
  )
 
- # A complete, commented example config ships inside the package. sync-agents seeds
- # it to <repo>/.director/config.toml (if missing) so a pip-installed user has a
- # ready-to-edit config rather than nothing.
- CONFIG_EXAMPLE = "config.example.toml"
-
  # Runtime artifacts director writes into <repo>/.director/. These must never be
  # committed: director's own commits use `git add -A` (to capture whatever the
  # executor created in the allowlist), which in a repo without this ignore file
@@ -55,10 +49,6 @@ def _template(name: str) -> str:
      return ir.files("director.agent_templates").joinpath(name).read_text()
 
 
- def _example_config() -> str:
-     return ir.files("director").joinpath(CONFIG_EXAMPLE).read_text()
-
-
  def ensure_director_gitignore(repo: str | Path) -> None:
      """Seed <repo>/.director/.gitignore so director's `git add -A` commits never
      sweep its own runtime files into the repo. Idempotent; safe to call on every
@@ -90,12 +80,4 @@ def sync_agents(repo: str | Path) -> list[str]:
      if not oc.exists():
          oc.write_text(_template("opencode.json"))
          written.append(str(oc.relative_to(repo)))
-
-     # Seed a ready-to-edit config from the bundled example — but never clobber an
-     # existing config the user may have edited.
-     cfg = repo / ".director" / "config.toml"
-     if not cfg.exists():
-         cfg.parent.mkdir(parents=True, exist_ok=True)
-         cfg.write_text(_example_config())
-         written.append(str(cfg.relative_to(repo)))
      return written
@@ -74,7 +74,7 @@ quote-style = "double"
  [tool.semantic_release]
  version_variables = ["director/__init__.py:__version__"]
  commit_parser = "conventional"
- build_command = "python -m build"
+ build_command = "python -m pip install build && python -m build"
  commit_message = "chore(release): {version} [skip ci]\n\nAutomated release."
  tag_format = "v{version}"
 
@@ -1,25 +0,0 @@
- # Changelog
-
- All notable changes to this project are documented here. This file is maintained
- automatically by [python-semantic-release](https://python-semantic-release.readthedocs.io/)
- from [Conventional Commits](https://www.conventionalcommits.org/); the entry below is the
- pre-automation baseline.
-
- <!-- version list -->
-
- ## v0.3.0 (2026-06-24)
-
- Initial public baseline (project renamed from its `foreman` codename to **director**).
-
- ### Features
-
- - **plan / run / status orchestrator** over OpenCode: a strong planner decomposes a task
- into an atomic DAG with acceptance tests written first; a cheaper executor implements
- each node in an isolated git worktree; deterministic gates (tests/lint/typecheck) decide
- merges, with a per-task escalation ladder.
- - **Approval gates + methodology** (brainstorm/spec gate, plan gate; `--auto` self-critique),
- two-stage cost-gated code review, and red-green test-hash hardening.
- - **TDD hardening & measurement:** watch-it-fail transcript verification, flake control
- (re-run node tests on success), a `.director/metrics.jsonl` stream, and `director bench`
- to compare cost/quality/wall-time across profiles on identical acceptance tests.
- - Three shipped profiles: `local-first`, `cheap-cloud`, `all-frontier`.
File without changes
File without changes