duet-cli 0.1.0__tar.gz

duet_cli-0.1.0/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Volkan Altan
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,360 @@
1
+ Metadata-Version: 2.4
2
+ Name: duet-cli
3
+ Version: 0.1.0
4
+ Summary: Two CLI agents in conversation. One Python file. Stdlib only.
5
+ Author: Volkan Altan
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/volkan/duet
8
+ Project-URL: Repository, https://github.com/volkan/duet
9
+ Keywords: agents,cli,claude,codex,llm,orchestration,pair-programming
10
+ Classifier: Development Status :: 4 - Beta
11
+ Classifier: Environment :: Console
12
+ Classifier: Intended Audience :: Developers
13
+ Classifier: Programming Language :: Python :: 3
14
+ Classifier: Topic :: Software Development
15
+ Requires-Python: >=3.9
16
+ Description-Content-Type: text/markdown
17
+ License-File: LICENSE
18
+ Provides-Extra: yaml
19
+ Requires-Dist: pyyaml; extra == "yaml"
20
+ Dynamic: license-file
21
+
22
+ # duet
23
+
24
+ Two CLI agents in conversation. One Python file. Stdlib only.
25
+
26
+ `duet.py` runs two command-line coding agents, usually Claude and Codex, in
27
+ alternating turns. You can also pair two agents from the same backend, such as
28
+ Codex planner + Codex coder or Claude coder + Claude reviewer. One agent can
29
+ plan or review while the other implements. The loop stops when they agree, the
30
+ turn limit is reached, a timeout happens, or you stop them.
31
+
32
+ Use duet when you want:
33
+
34
+ - A planner/reviewer agent to keep pressure on an implementation agent.
35
+ - A second agent to inspect test failures, issue text, or review output.
36
+ - A transcript and run directory you can inspect after the agents finish.
37
+ - Isolation through an optional git worktree while the partner agent edits.
38
+
39
+ ## Quick Start
40
+
41
+ Pair-programming pattern: plan with codex in its own session first, then hand
42
+ the session id to duet — codex implements with the plan in context while
43
+ claude reviews each turn.
44
+
45
+ ```bash
46
+ cd ~/code/myrepo
47
+
48
+ # Find the codex session you just planned in:
49
+ # ls -lt ~/.codex/sessions/ | head
50
+ # or look for `session id: <uuid>` on `codex exec`'s stderr.
51
+
52
+ ./duet.py \
53
+ --resume-codex <codex-session-id> \
54
+ --worktree \
55
+ --reasoning max \
56
+ --task "Implement the plan from your codex planning session."
57
+ ```
58
+
59
+ Four flags carry their weight; everything else is a default. Codex (resumed,
60
+ with the plan in context) speaks first as the coder. Claude reviews each
61
+ turn as the planner. The worktree keeps the host checkout clean until you
62
+ merge. Sentinel + rationale convergence rules are baked into both role
63
+ prompts — you do not need to restate them in `--task`.
64
+
65
+ Resume flags attach to the matching backend even when you override
66
+ `--lead`/`--partner`. A resumed Claude agent is normalized to lead so duet can
67
+ extract its latest message as the seed; a resumed Codex agent is normalized to
68
+ partner/coder so it speaks first with its existing plan in context.
69
+
70
+ The symmetric `--resume-claude <session-id>` does the inverse — plan in
71
+ claude, hand off to codex — and is duet's founding workflow, documented in
72
+ [docs/USAGE.md](docs/USAGE.md).
73
+
74
+ Have a task in words but no prior planning session? Let codex plan inside
75
+ the loop while claude implements:
76
+
77
+ ```bash
78
+ ./duet.py \
79
+ --recap \
80
+ --task "Add Codex fast mode for duet-managed Codex runs, don't miss any doc files" \
81
+ --lead claude:coder \
82
+ --partner codex:planner \
83
+ --worktree --worktree-for lead \
84
+ --turns 4
85
+ ```
86
+
87
+ `--recap` keeps the live output compact and the worktree keeps the host
88
+ checkout clean until you merge.
89
+
90
+ ```bash
91
+ # Run a fresh task in a target project.
92
+ ./duet.py --task "Implement fizzbuzz in Go with tests" \
93
+ --lead claude:coder --partner codex:planner \
94
+ --cwd ~/code/scratch
95
+
96
+ # Seed duet from Claude Code's real /review output.
97
+ ./duet.py --recap --task-from-cmd 'claude -p /review' \
98
+ --lead claude:reviewer --partner codex:coder \
99
+ --worktree \
100
+ --cwd .
101
+ ```
102
+
103
+ In the review recipe, Claude's `/review` runs once to produce the kickoff
104
+ critique. Duet then hands that critique to Codex, preserves both agent
105
+ sessions, and manages the back-and-forth until convergence or the turn limit.
106
+
107
+ Install the `duet` command from a clone (the remaining `make` targets run the checks and the build):
108
+
109
+ ```bash
110
+ make install # symlinks duet.py to ~/.local/bin/duet
111
+ make ci # everything the CI gate runs: unit + reasoning + smoke + complexity
112
+ make test # unit tests (tests/test_duet.py) + scripts/smoke.sh dry-run checks
113
+ make unit-test # only the stdlib unittest suite under tests/
114
+ make smoke-test # only scripts/smoke.sh dry-run regression checks
115
+ make complexity # cyclomatic-complexity/length gate (single-file sprawl guard)
116
+ make build # sdist + wheel into dist/ (needs: python3 -m pip install build)
117
+ make loop-test # slow real Claude/Codex loop checks; writes runs/test-loop/
118
+ ```
119
+
120
+ Or skip the clone — the PyPI package is `duet-cli` (bare `duet` is taken) and
121
+ the command it installs is `duet`:
122
+
123
+ ```bash
124
+ uvx --from duet-cli duet --task "..." # one-shot run, isolated, no install
125
+ pipx install duet-cli # puts the duet command on PATH
126
+ pipx install 'duet-cli[yaml]' # include PyYAML for --config foo.yaml
127
+ ```
128
+
129
+ Plain `pip install duet-cli` works too, but the installed top-level module is
130
+ named `duet`, which collides with Google's PyPI `duet` package in a shared
131
+ environment — pipx/uvx isolation avoids that.
132
+
133
+ Claude Code plugin — adds the `/duet` command (it shells out to the `duet`
134
+ CLI, so install the binary with one of the methods above as well):
135
+
136
+ ```text
137
+ /plugin marketplace add volkan/duet
138
+ /plugin install duet@volkan-duet
139
+ ```
140
+
141
+ CI (`.github/workflows/ci.yml`) runs `make ci`'s checks on every PR across
142
+ Python 3.9/3.11/3.13. To make them block merges, mark them required in branch
143
+ protection — see [`.github/BRANCH_PROTECTION.md`](.github/BRANCH_PROTECTION.md)
144
+ (admins can still force-merge).
145
+
146
+ ## How It Works
147
+
148
+ Each agent keeps its own conversation memory:
149
+
150
+ - Claude resumes with `claude -p --resume <session_id>`.
151
+ - Codex resumes with `codex exec resume <session_id>` when duet captured one
152
+ from Codex's stderr, or `codex exec resume --last` in the working directory
153
+ as a fallback for older builds that don't print a session id.
154
+
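The two resume paths above can be sketched as plain argv builders. This is a minimal illustration, not duet's actual code: the function names are hypothetical, and only the command strings come from the bullets above.

```python
from typing import List, Optional


def claude_resume_cmd(session_id: str) -> List[str]:
    """Argv for resuming a Claude session by id."""
    return ["claude", "-p", "--resume", session_id]


def codex_resume_cmd(session_id: Optional[str]) -> List[str]:
    """Argv for resuming Codex.

    Pins to the captured session id when one exists; otherwise falls back
    to the recency-based `--last` form described for older Codex builds.
    """
    if session_id:
        return ["codex", "exec", "resume", session_id]
    return ["codex", "exec", "resume", "--last"]
```

Keeping the fallback in one place makes it easy to see which runs are pinned to a session id and which depend on the working directory.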
155
+ On each turn, duet sends the latest reply from one agent to the other. It
156
+ continues until both agents accept convergence in back-to-back turns, `--turns`
157
+ is reached, a timeout happens, or you press Ctrl-C. A convergence proposal must
158
+ include an `LGTM rationale:` explaining why the work is done, followed by the
159
+ sentinel `<<<LGTM>>>` on its own line; a bare sentinel is ignored.
160
+
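As a rough sketch of the convergence rule just described, the check below accepts a reply only when a non-empty `LGTM rationale:` appears before a line holding nothing but the sentinel. The function names and the exact matching rules (for example, requiring text after the colon) are assumptions for illustration; duet's real parser may differ.

```python
import re

SENTINEL = "<<<LGTM>>>"
# Assumption: a rationale must carry at least one non-space character.
RATIONALE_RE = re.compile(r"^LGTM rationale:\s*\S", re.MULTILINE)


def is_convergence_proposal(reply: str) -> bool:
    """True when a rationale precedes the sentinel on its own line."""
    lines = reply.splitlines()
    for idx, line in enumerate(lines):
        if line.strip() == SENTINEL:
            before = "\n".join(lines[:idx])
            if RATIONALE_RE.search(before):
                return True
    return False


def converged(previous_reply: str, current_reply: str) -> bool:
    """Both agents must propose convergence in back-to-back turns."""
    return is_convergence_proposal(previous_reply) and is_convergence_proposal(
        current_reply
    )
```

A bare sentinel, or a sentinel embedded in the middle of a sentence, returns `False` under this sketch.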
161
+ Finished runs record the reason in `state.json`: user interruption stays
162
+ `force_stop`, per-turn agent timeouts are `timeout`, and non-timeout agent
163
+ command failures or malformed required output are `agent_error`.
164
+
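One way to picture the three finish reasons is a small classifier over the exception that ended the loop. The mapping below is an assumption about how such a classifier could look; only the reason strings `force_stop`, `timeout`, and `agent_error` come from the text above.

```python
import subprocess


def classify_finish(exc: BaseException) -> str:
    """Map the exception that ended a run to a finish reason string."""
    if isinstance(exc, KeyboardInterrupt):
        # The user pressed Ctrl-C.
        return "force_stop"
    if isinstance(exc, subprocess.TimeoutExpired):
        # An agent turn exceeded its time budget.
        return "timeout"
    # Non-timeout command failures and malformed required output.
    return "agent_error"
```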
165
+ If you pass `--verify-cmd`, duet runs that shell command before counting a
166
+ valid convergence proposal. Exit code 0 allows the proposal to count; any
167
+ non-zero exit, timeout, or execution error feeds a capped failure block to the
168
+ next agent turn. `--dry-run` records and prints the configured command but
169
+ does not execute it.
170
+
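The verify gate can be approximated with `subprocess.run`. This is a sketch under stated assumptions: the function name, the 600-second default timeout, and the 4000-character cap are hypothetical, and `--dry-run` handling is left out because the text above does not say how a dry run affects convergence.

```python
import subprocess
from typing import Optional, Tuple

MAX_FAILURE_CHARS = 4000  # assumed cap; the real limit is not documented here


def run_verify(cmd: str, timeout_s: float = 600.0) -> Tuple[bool, Optional[str]]:
    """Run a shell command and return (passed, failure_block).

    Exit code 0 passes. Any non-zero exit, timeout, or execution error
    yields a capped text block that can be fed to the next agent turn.
    """
    try:
        proc = subprocess.run(
            cmd, shell=True, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return False, f"verify command timed out after {timeout_s}s: {cmd}"
    except OSError as exc:
        return False, f"verify command could not run: {exc}"
    if proc.returncode == 0:
        return True, None
    # Keep the tail of the output, where test failures usually appear.
    output = (proc.stdout + proc.stderr)[-MAX_FAILURE_CHARS:]
    return False, f"verify command exited {proc.returncode}: {cmd}\n{output}"
```

Keeping the tail rather than the head of the output is a design choice in this sketch, not something the README specifies.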
171
+ After normal loop endings, duet opens a `force> ` prompt. Press Enter to
172
+ finish, or type feedback to force another round; duet sends the next agent the
173
+ previous reply plus your feedback, including any appended worktree handoff
174
+ block and diff.
175
+
176
+ ## Common Recipes
177
+
178
+ Call Claude Code's real `/review` skill through duet:
179
+
180
+ ```bash
181
+ ./duet.py --recap --task-from-cmd 'claude -p /review' \
182
+ --lead claude:reviewer --partner codex:coder \
183
+ --worktree \
184
+ --cwd ~/workspace/project \
185
+ --turns 6
186
+ ```
187
+
188
+ The `/review` skill supplies the initial findings; duet handles the subsequent
189
+ Codex fix turn, Claude verification turn, worktree diff handoff, and any extra
190
+ rounds.
191
+
192
+ With the `/duet` Claude Code command installed (the plugin from the install
193
+ section above, or the manual skill copy in [docs/USAGE.md](docs/USAGE.md)),
194
+ plain `/duet` runs that same `/review` kickoff recipe.
195
+
196
+ Let duet run the upstream command inside the target project:
197
+
198
+ ```bash
199
+ ./duet.py --task-from-cmd 'npm test 2>&1' \
200
+ --lead claude:coder --partner codex:planner \
201
+ --cwd ~/workspace/project \
202
+ --worktree --worktree-for lead
203
+ ```
204
+
205
+ Use a repeatable config:
206
+
207
+ ```bash
208
+ ./duet.py --config duet.example.yaml
209
+ ```
210
+
211
+ Useful packaged configs:
212
+
213
+ - `examples/pr-review.yaml` - deep review of the latest commit.
214
+ - `examples/codex-test-fix.yaml` - Codex planner diagnoses failing checks; Codex coder fixes them in a worktree.
215
+
216
+ Same-backend peering:
217
+
218
+ ```bash
219
+ # Codex planner + Codex coder. The worktree gives one Codex peer a separate cwd.
220
+ ./duet.py --task "Fix the issue" \
221
+ --lead codex:planner --partner codex:coder \
222
+ --worktree --turns 6
223
+
224
+ # Claude coder + Claude reviewer.
225
+ ./duet.py --task "Review and fix the current change" \
226
+ --lead claude:coder --partner claude:reviewer \
227
+ --turns 6
228
+ ```
229
+
230
+ For `codex`/`codex` runs in one cwd, duet requires both peers to produce Codex
231
+ session UUIDs on their first turns. If either peer would fall back to
232
+ `codex exec resume --last`, duet aborts rather than risk resuming the other
233
+ peer's session. Use `--worktree` when in doubt.
234
+
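The abort rule for same-cwd `codex`/`codex` runs amounts to a guard like the one below. The names are hypothetical and the UUID check uses Python's `uuid` module, which may be more permissive than whatever duet applies.

```python
import uuid
from typing import Optional


def is_session_uuid(value: Optional[str]) -> bool:
    """True when value parses as a UUID."""
    if not value:
        return False
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def check_same_cwd_codex_pair(
    lead_session: Optional[str], partner_session: Optional[str], same_cwd: bool
) -> None:
    """Raise when two Codex peers share a cwd and either lacks a pinned session."""
    if same_cwd and not (
        is_session_uuid(lead_session) and is_session_uuid(partner_session)
    ):
        raise RuntimeError(
            "both Codex peers need session UUIDs in a shared cwd; "
            "falling back to `codex exec resume --last` could resume the wrong peer"
        )
```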
235
+ Require a mechanical check before convergence:
236
+
237
+ ```bash
238
+ ./duet.py \
239
+ --task "Fix the issue" \
240
+ --lead claude:coder \
241
+ --partner codex:reviewer \
242
+ --worktree --worktree-for lead \
243
+ --verify-cmd 'make test'
244
+ ```
245
+
246
+ Check an in-progress run from another terminal:
247
+
248
+ ```bash
249
+ ./duet.py --status .duet/runs/<id>/
250
+ ```
251
+
252
+ Gate convergence on P0/P1 review findings:
253
+
254
+ ```bash
255
+ ./duet.py --task "Fix the issue" \
256
+ --lead claude:coder --partner codex:triage-reviewer \
257
+ --cwd ~/workspace/project
258
+ ```
259
+
260
+ Review a recent implementation - Codex reviews at max effort, Claude applies
261
+ only requested fixes:
262
+
263
+ ```bash
264
+ ./duet.py --recap \
265
+ --task "Review the current main branch changes. Codex should act as reviewer: identify any blocking issues in the latest commit. Claude should act as coder: implement only the fixes Codex explicitly requests. Preserve project constraints and run make test before convergence." \
266
+ --lead claude:coder \
267
+ --partner codex:reviewer \
268
+ --reasoning max \
269
+ --worktree --worktree-for lead \
270
+ --turns 6
271
+ ```
272
+
273
+ The partner speaks first, so Codex (reviewer) opens turn 1 with its critique
274
+ and Claude (coder) responds in turn 2 with the fixes. `--worktree-for lead`
275
+ keeps the editable checkout under the coder. Keep `--codex-fast` off in this
276
+ recipe: Codex is the reviewer, so max effort is the point.
277
+
278
+ That same recipe is also packaged as a YAML config you can drop into any
279
+ repo — `examples/pr-review.yaml` reviews `HEAD`'s diff with the same
280
+ agent/effort/worktree pairing, with comments calling out which keys to swap
281
+ for variants (review uncommitted changes, review a specific PR by number,
282
+ faster iteration once review is mostly done).
283
+
284
+ Review the latest commit plus an untracked notes file by seeding both into the
285
+ task:
286
+
287
+ ```bash
288
+ ./duet.py --recap \
289
+ --task-from-cmd 'git show --stat --patch --no-ext-diff HEAD && printf "\n\n--- TODO.md ---\n" && cat TODO.md' \
290
+ --lead claude:coder \
291
+ --partner codex:reviewer \
292
+ --reasoning max \
293
+ --worktree --worktree-for lead \
294
+ --turns 6
295
+ ```
296
+
297
+ Fresh worktrees start from committed `HEAD`; commit the notes first if the coder
298
+ must edit them as a normal tracked file.
299
+
300
+ Deep planner, fast coder — Claude plans at high effort, Codex coder turns drop to low for latency (uses the default `claude:planner + codex:coder` pairing):
301
+
302
+ ```bash
303
+ ./duet.py --reasoning high --codex-fast \
304
+ --task "Fix the issue" \
305
+ --cwd ~/workspace/project
306
+ ```
307
+
308
+ Compact live debug view — see only what each turn produced, in real time:
309
+
310
+ ```bash
311
+ ./duet.py --recap --task "Fix the issue" \
312
+ --lead claude:coder --partner codex:planner \
313
+ --cwd ~/workspace/project
314
+ ```
315
+
316
+ ## Output
317
+
318
+ Every run writes a directory containing:
319
+
320
+ - `transcript.md` - the full conversation.
321
+ - `recap.md` - compact per-turn debug view when `--recap` is enabled; `--status` shows this path when present.
322
+ - `state.json` - run state, agent roles, session ids, finish reason, worktree metadata, and `recap_path` for recap runs.
323
+ - `turn-*.stderr.log` - live stderr from each agent invocation.
324
+ - `turn-*-verify.log` - verify command metadata, stdout, and stderr when `--verify-cmd` runs.
325
+ - `turn-*.pid` - present only while an agent or verify command is running.
326
+ - `wt/` - the git worktree, when `--worktree` is enabled.
327
+
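For inspecting a run directory from a script, a small helper can report which of the artifacts listed above are present. This sketch relies only on the file names and glob patterns from that list; the function name and the returned keys are hypothetical.

```python
from pathlib import Path
from typing import Dict


def summarize_run_dir(run_dir: str) -> Dict[str, object]:
    """Report which documented artifacts exist in a duet run directory."""
    root = Path(run_dir)
    return {
        "transcript": (root / "transcript.md").is_file(),
        "recap": (root / "recap.md").is_file(),
        "state": (root / "state.json").is_file(),
        "stderr_logs": sorted(p.name for p in root.glob("turn-*.stderr.log")),
        "verify_logs": sorted(p.name for p in root.glob("turn-*-verify.log")),
        # A pid file exists only while an agent or verify command is running.
        "running": any(root.glob("turn-*.pid")),
        "worktree": (root / "wt").is_dir(),
    }
```

For supported status output, `--status` remains the documented route; this helper is only a convenience for ad hoc checks.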
328
+ When a worktree agent replies, duet appends a handoff block to that reply before
329
+ the diff. The block names the exact worktree path and branch, warns that the
330
+ receiving agent's cwd may be a clean checkout, and includes `git -C <wt>` review
331
+ commands so verification happens against the edited tree.
332
+
333
+ When `--cwd` points outside the invocation directory and `--runs-dir` is not
334
+ set, artifacts go under the target project at `.duet/runs/<run_id>/`.
335
+
336
+ ## Documentation
337
+
338
+ Read [docs/USAGE.md](docs/USAGE.md) for the full reference: flags, sandbox and
339
+ network rules, worktree mode, output layout, `--status` / `--continue`, force
340
+ prompt behavior, session memory, the post-run "apply / iterate / discard"
341
+ checklist, and the optional `/duet` Claude Code command (plugin or manual
342
+ skill).
343
+
344
+ For contributor guidance, read [CLAUDE.md](CLAUDE.md). Codex-specific entry
345
+ notes live in [AGENTS.md](AGENTS.md).
346
+
347
+ ## Limits
348
+
349
+ - `duet --continue <run>` starts a fresh run from a prior `state.json`, restores
350
+ saved session ids, and reuses the previous worktree when available. It does
351
+ not append to the old transcript.
352
+ - Parallel Codex sessions in the same cwd are safe when duet captured a
353
+ session id from Codex's stderr — that turn pins to the UUID, not to recency.
354
+ When the UUID was not captured (old Codex builds, or continuing a pre-UUID
355
+ run), duet falls back to `codex exec resume --last`, which is cwd-based and
356
+ unsafe to share. `--worktree` isolates duet's Codex cwd from the host repo;
357
+ in `--last` fallback mode, do not start another Codex session inside that
358
+ same worktree while the run is active.
359
+ - Transcripts capture full agent text. Convergence detection only counts
360
+ rationale-backed sentinels outside fenced markdown code blocks.
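The fenced-block exclusion in the last bullet can be illustrated by filtering out fenced lines before looking for the sentinel. This is a simplified sketch: it recognizes only triple-backtick fences, does not check the rationale, and uses hypothetical function names.

```python
from typing import List

SENTINEL = "<<<LGTM>>>"
FENCE = "`" * 3  # triple-backtick fence marker


def lines_outside_fences(text: str) -> List[str]:
    """Return the lines of text that are not inside a fenced code block."""
    kept: List[str] = []
    in_fence = False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            # Fence lines toggle the state and are never kept themselves.
            in_fence = not in_fence
            continue
        if not in_fence:
            kept.append(line)
    return kept


def has_live_sentinel(text: str) -> bool:
    """True when the sentinel sits on its own line outside any fence."""
    return any(line.strip() == SENTINEL for line in lines_outside_fences(text))
```

With this filter, an agent can quote the sentinel inside a code block while discussing it without ending the run.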