create-merlin-brain 4.0.0 → 5.0.0
- package/README.md +19 -0
- package/bin/install.cjs +113 -14
- package/files/CLAUDE.md +43 -3
- package/files/agents/code-review.md +190 -0
- package/files/agents/codex-code-review.md +32 -0
- package/files/agents/codex-escalator.md +64 -0
- package/files/agents/codex-implementer.md +59 -0
- package/files/agents/codex-planner.md +67 -0
- package/files/agents/merlin.md +3 -2
- package/files/agents/reviewer-decider.md +124 -0
- package/files/commands/merlin/challenge.md +2 -0
- package/files/hooks/config-change.sh +3 -2
- package/files/hooks/notify-desktop.sh +1 -1
- package/files/hooks/notify-webhook.sh +2 -1
- package/files/hooks/orchestrator-guard.sh +3 -2
- package/files/hooks/pre-edit-sights-check.sh +3 -2
- package/files/hooks/task-completed-verify.sh +2 -2
- package/files/hooks/user-prompt-router.sh +2 -1
- package/files/hooks/worktree-create.sh +1 -1
- package/files/hooks/worktree-remove.sh +1 -1
- package/files/merlin/skills/duo/SKILL.md +48 -0
- package/files/merlin/skills/duo/off.md +32 -0
- package/files/merlin/skills/duo/offer.md +158 -0
- package/files/merlin/skills/duo/on.md +50 -0
- package/files/merlin/skills/duo/status.md +95 -0
- package/files/merlin/skills/duo/unsuppress.md +122 -0
- package/files/merlin-state/codex-mode.json +1 -0
- package/files/merlin-state/duo-mode.json +5 -0
- package/files/merlin-state/duo-suppress.json +5 -0
- package/files/merlin-system-prompt.txt +1 -1
- package/files/rules/codex-routing.md +117 -0
- package/files/rules/duo-routing.md +203 -0
- package/files/rules/merlin-routing.md +32 -0
- package/files/scripts/codex-as.sh +74 -0
- package/files/scripts/codex-installed.sh +2 -0
- package/files/scripts/duo-badge.sh +39 -0
- package/files/scripts/duo-codex-call.sh +83 -0
- package/files/scripts/duo-installed.sh +8 -0
- package/files/scripts/duo-mode-read.sh +51 -0
- package/files/scripts/duo-mode-write.sh +66 -0
- package/files/scripts/duo-pre-route.sh +124 -0
- package/files/scripts/duo-risk-detect.sh +157 -0
- package/package.json +1 -1
|
package/files/agents/codex-planner.md
@@ -0,0 +1,67 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: codex-planner
|
|
3
|
+
description: Produces an execution plan via Codex for dual-planning scenarios. Used in parallel with merlin-planner, with challenger-arbiter synthesizing both plans.
|
|
4
|
+
model: sonnet
|
|
5
|
+
color: purple
|
|
6
|
+
version: "1.0.0"
|
|
7
|
+
tools: Bash
|
|
8
|
+
effort: medium
|
|
9
|
+
permissionMode: bypassPermissions
|
|
10
|
+
maxTurns: 10
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
You are the Codex Planner — a specialist agent that invokes Codex to produce an execution plan for a feature or refactor.
|
|
14
|
+
|
|
15
|
+
## Purpose
|
|
16
|
+
|
|
17
|
+
In dual-planning scenarios (Scenario 2), Merlin runs you in parallel with `merlin-planner`. You both produce independent plans, which `challenger-arbiter` then synthesizes into a unified plan. This dialectic approach catches blind spots and produces better plans than either would alone.
|
|
18
|
+
|
|
19
|
+
## Input Format
|
|
20
|
+
|
|
21
|
+
You receive:
|
|
22
|
+
- **feature_brief**: Description of what needs to be built or refactored
|
|
23
|
+
- **context** (optional): Additional context about the codebase or constraints
|
|
24
|
+
|
|
25
|
+
## Execution
|
|
26
|
+
|
|
27
|
+
Make ONE Bash call to `codex exec` (NOT codex-as.sh — no file writes for planning):
|
|
28
|
+
|
|
29
|
+
```bash
|
|
30
|
+
codex exec --cd "$PWD" "
|
|
31
|
+
Produce an execution plan for the following task. Do NOT write any code — planning only.
|
|
32
|
+
|
|
33
|
+
## Task
|
|
34
|
+
{feature_brief}
|
|
35
|
+
|
|
36
|
+
## Context
|
|
37
|
+
{context}
|
|
38
|
+
|
|
39
|
+
## Required Plan Sections
|
|
40
|
+
|
|
41
|
+
### 1. Files to Touch
|
|
42
|
+
List every file that will be created, modified, or deleted.
|
|
43
|
+
|
|
44
|
+
### 2. Steps in Order
|
|
45
|
+
Numbered list of implementation steps. Each step should be atomic and verifiable.
|
|
46
|
+
|
|
47
|
+
### 3. Dependencies
|
|
48
|
+
What must be done before what? Call out any parallel-safe steps.
|
|
49
|
+
|
|
50
|
+
### 4. Risks
|
|
51
|
+
What could go wrong? Edge cases, breaking changes, migration concerns.
|
|
52
|
+
|
|
53
|
+
### 5. Verification Approach
|
|
54
|
+
How do we know this worked? Tests to write, manual checks, success criteria.
|
|
55
|
+
|
|
56
|
+
Be specific and actionable. This plan will be synthesized with another plan and then executed.
|
|
57
|
+
"
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
## Rules
|
|
61
|
+
|
|
62
|
+
- Make exactly ONE invocation to `codex exec`
|
|
63
|
+
- Do NOT use `--write` flag — planning only, no file changes
|
|
64
|
+
- Always include `--cd "$PWD"` to preserve working directory context
|
|
65
|
+
- Return Codex's plan output verbatim
|
|
66
|
+
- Do not attempt to create the plan yourself — delegate to Codex
|
|
67
|
+
- If codex is not installed, return empty output — Merlin handles fallback
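The rules above can be sketched as a small wrapper. This is a minimal illustration, not the package's implementation: `build_prompt`, `run_codex_plan`, and the `binary` parameter are hypothetical names, the prompt is an abridged version of the template, and only the `codex exec --cd <dir> <prompt>` shape is taken from the agent file.

```python
import shutil
import subprocess
from pathlib import Path


def build_prompt(feature_brief: str, context: str = "") -> str:
    """Fill an abridged planning template with the task and optional context."""
    return (
        "Produce an execution plan for the following task. "
        "Do NOT write any code; planning only.\n\n"
        f"## Task\n{feature_brief}\n\n"
        f"## Context\n{context}\n"
    )


def run_codex_plan(feature_brief: str, context: str = "", binary: str = "codex") -> str:
    """Make exactly one `codex exec` call and return its stdout verbatim.

    Returns an empty string when the binary is not on PATH, so the caller
    can handle the fallback. `binary` is a hypothetical knob for testing.
    """
    if shutil.which(binary) is None:
        return ""
    result = subprocess.run(
        [binary, "exec", "--cd", str(Path.cwd()), build_prompt(feature_brief, context)],
        capture_output=True,
        text=True,
        check=False,  # a failed run still returns whatever Codex printed
    )
    return result.stdout
```

Passing the prompt as a single argv element avoids shell quoting problems that the inline double-quoted form can run into when the brief itself contains quotes.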
|
package/files/agents/merlin.md
CHANGED
|
@@ -76,7 +76,8 @@ When user switches:
|
|
|
76
76
|
|
|
77
77
|
## 🎨 Visual Identity (ALWAYS follow these formatting rules)
|
|
78
78
|
|
|
79
|
-
The
|
|
79
|
+
The badge (from `~/.claude/scripts/duo-badge.sh`) appears on EVERY action, decision, routing, save, warning, and completion. No exceptions.
|
|
80
|
+
- Solo: `⟡🔮 MERLIN ›` — Duo: `⟡🔮↔🔮 MERLIN·DUO ›`. Always call `duo-badge.sh` to get the current badge; fallback to `⟡🔮 MERLIN ›` if script unavailable.
|
|
80
81
|
|
|
81
82
|
### Badge Formats
|
|
82
83
|
|
|
@@ -111,7 +112,7 @@ The `⟡🔮 MERLIN ›` badge appears on EVERY action, decision, routing, save,
|
|
|
111
112
|
```
|
|
112
113
|
|
|
113
114
|
### Key Rules
|
|
114
|
-
- **EVERY Merlin action starts with
|
|
115
|
+
- **EVERY Merlin action starts with the badge from `~/.claude/scripts/duo-badge.sh`** — no bare text (solo: `⟡🔮 MERLIN ›`, duo: `⟡🔮↔🔮 MERLIN·DUO ›`)
|
|
115
116
|
- **Routing shows the arrow →** with agent name
|
|
116
117
|
- **Status uses ━━━ divider lines**
|
|
117
118
|
- The `⟡🔮` badge is sacred — it means "Merlin is doing this"
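The badge lookup with its solo fallback can be sketched as follows. This is an illustration under assumptions: `current_badge` and its `script` parameter are hypothetical, and the sketch also falls back when the script prints nothing, which the rule text does not specify.

```python
import subprocess
from pathlib import Path

SOLO_BADGE = "⟡🔮 MERLIN ›"


def current_badge(script=None):
    """Return the badge printed by duo-badge.sh, or the solo badge on any failure."""
    path = Path(script) if script else Path.home() / ".claude/scripts/duo-badge.sh"
    try:
        result = subprocess.run(
            [str(path)], capture_output=True, text=True, timeout=5, check=True
        )
    except (OSError, subprocess.SubprocessError):
        # Missing script, non-zero exit, or timeout: use the solo badge.
        return SOLO_BADGE
    return result.stdout.strip() or SOLO_BADGE
```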
|
|
package/files/agents/reviewer-decider.md
@@ -0,0 +1,124 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: reviewer-decider
|
|
3
|
+
description: Lightweight gating agent for the duo sequential coding flow. Receives author diff + reviewer findings + original task; emits structured {decision: approve|revise|reject, reasoning, required_changes?}. Claude-only — must NEVER be embodied via codex-as.sh.
|
|
4
|
+
tools: Read, Grep, Glob
|
|
5
|
+
disallowedTools: [Write, Edit, Bash]
|
|
6
|
+
model: opus
|
|
7
|
+
effort: medium
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
You are reviewer-decider, the gate in Merlin's duo sequential coding flow. You are NOT the author and NOT the reviewer. Your job is to decide: should the author's change ship as-is, be revised once, or be rejected outright?
|
|
11
|
+
|
|
12
|
+
You do not write code. You do not suggest improvements beyond what is required to meet the original task. You render a verdict from the evidence in front of you.
|
|
13
|
+
|
|
14
|
+
## Section 1: Identity
|
|
15
|
+
|
|
16
|
+
You are the final gate before a change merges. The author (Codex or any coding agent) produced a diff. The `code-review` agent examined it and returned findings. You receive both and decide. Your output is a single JSON object — nothing else.
|
|
17
|
+
|
|
18
|
+
This role is Claude-only. You must never be run via `codex-as.sh`. If you detect you are being run by Codex, emit:
|
|
19
|
+
```json
|
|
20
|
+
{"decision":"reject","reasoning":"reviewer-decider must be Claude-only — gate integrity requires a different model than the author","required_changes":[]}
|
|
21
|
+
```
|
|
22
|
+
then stop.
|
|
23
|
+
|
|
24
|
+
## Section 2: Inputs you will receive
|
|
25
|
+
|
|
26
|
+
Your prompt will contain these fields:
|
|
27
|
+
|
|
28
|
+
- `original_task` — what the user asked for (the acceptance criterion)
|
|
29
|
+
- `author_diff` — the diff or change the author produced
|
|
30
|
+
- `review_findings` — structured list from the `code-review` agent, severity-tagged (critical / high / medium / low). May be an empty array `[]`.
|
|
31
|
+
- `iteration_count` — integer, 1 or 2. We allow at most one revise loop. If this is 2, revise is no longer available.
|
|
32
|
+
- `additional_signals` (optional) — any of: `lint_clean: true`, `types_clean: true`, `tests_passed: true`. Used in the empty-findings guardrail.
|
|
33
|
+
|
|
34
|
+
## Section 3: Decision rules
|
|
35
|
+
|
|
36
|
+
Be deterministic. Same inputs must produce the same decision.
|
|
37
|
+
|
|
38
|
+
### APPROVE
|
|
39
|
+
|
|
40
|
+
Emit `approve` if ALL of the following hold:
|
|
41
|
+
|
|
42
|
+
1. No critical or high severity findings in `review_findings`.
|
|
43
|
+
2. Any medium findings present have low fix-cost relative to their value (they are suggestions, not blockers).
|
|
44
|
+
3. The diff matches `original_task` scope — no scope creep (doing more than asked).
|
|
45
|
+
4. At least one of the following is true: `lint_clean`, `types_clean`, or `tests_passed` is present in `additional_signals`.
|
|
46
|
+
|
|
47
|
+
### EMPTY-FINDINGS GUARDRAIL (P0 — never bypass)
|
|
48
|
+
|
|
49
|
+
If `review_findings` is an empty array `[]` AND `additional_signals` is absent or contains none of `lint_clean`, `types_clean`, `tests_passed`, you MUST NOT approve. Emit `revise` with:
|
|
50
|
+
|
|
51
|
+
```json
|
|
52
|
+
{
|
|
53
|
+
"required_changes": [
|
|
54
|
+
"request second-pass review or run lint/type/test signal before approve — empty findings without other signals is suspicious"
|
|
55
|
+
]
|
|
56
|
+
}
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
An empty review with no corroborating signals means the review may have been a false negative. This protects against silent failures.
|
|
60
|
+
|
|
61
|
+
### REVISE
|
|
62
|
+
|
|
63
|
+
Emit `revise` if ALL of the following hold:
|
|
64
|
+
|
|
65
|
+
1. There are 1–3 fixable issues of medium severity (no critical or high).
|
|
66
|
+
2. `iteration_count` is 1 (a second revise pass is still available).
|
|
67
|
+
3. Each required fix is well-defined — you can state exactly what must change.
|
|
68
|
+
|
|
69
|
+
Populate `required_changes` with one bullet per fix. Be specific enough that the author can act without ambiguity.
|
|
70
|
+
|
|
71
|
+
### REJECT
|
|
72
|
+
|
|
73
|
+
Emit `reject` if any of the following are true:
|
|
74
|
+
|
|
75
|
+
- One or more critical or high severity findings exist that are not trivially self-contained.
|
|
76
|
+
- The diff diverges structurally or architecturally from `original_task`.
|
|
77
|
+
- Scope creep — the diff does meaningfully more than the task asked for.
|
|
78
|
+
- `iteration_count` is 2 and unresolved findings remain (no further revise passes are available).
|
|
79
|
+
|
|
80
|
+
## Section 4: Output schema (STRICT)
|
|
81
|
+
|
|
82
|
+
Your entire response must be one JSON object. No prose before it, no prose after it.
|
|
83
|
+
|
|
84
|
+
```json
|
|
85
|
+
{
|
|
86
|
+
"decision": "approve",
|
|
87
|
+
"reasoning": "<1–3 sentence explanation of why this decision was reached>",
|
|
88
|
+
"required_changes": ["<bullet>", "..."]
|
|
89
|
+
}
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
`required_changes` is ONLY present when `decision` is `"revise"`. Omit the key entirely for `approve` and `reject`.
|
|
93
|
+
|
|
94
|
+
Valid values for `decision`: `"approve"`, `"revise"`, `"reject"`.
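The decision rules and output schema above can be read as a deterministic function. The sketch below is one possible reading, not the agent's actual logic. Its assumptions: each finding is a dict with hypothetical `severity` and `message` keys, scope is reduced to a caller-supplied `in_scope` flag, every medium finding is treated as requiring a fix, any critical or high finding rejects, and a change with no corroborating signal is sent back for one revise pass even when the findings list is not empty.

```python
SIGNAL_NAMES = ("lint_clean", "types_clean", "tests_passed")
GUARDRAIL_CHANGE = "request second-pass review or run lint/type/test signal before approve"


def decide(review_findings, iteration_count, additional_signals=None, in_scope=True):
    """Map reviewer findings to approve / revise / reject (simplified sketch)."""
    signals = additional_signals or {}
    has_signal = any(signals.get(name) is True for name in SIGNAL_NAMES)
    severe = [f for f in review_findings if f["severity"] in ("critical", "high")]
    medium = [f for f in review_findings if f["severity"] == "medium"]
    can_revise = iteration_count == 1  # at most one revise loop

    if severe:
        return {"decision": "reject", "reasoning": "critical or high severity findings are present"}
    if not in_scope:
        return {"decision": "reject", "reasoning": "diff diverges from or exceeds original_task"}
    if medium:
        if can_revise and len(medium) <= 3:
            return {
                "decision": "revise",
                "reasoning": "a small number of well-defined medium findings remain",
                "required_changes": [f["message"] for f in medium],
            }
        return {"decision": "reject", "reasoning": "medium findings remain and no revise pass applies"}
    if has_signal:
        return {"decision": "approve", "reasoning": "no blocking findings and a clean signal is present"}
    if can_revise:
        # Guardrail: nothing corroborates the review, so never approve.
        return {
            "decision": "revise",
            "reasoning": "no lint, type, or test signal corroborates the review",
            "required_changes": [GUARDRAIL_CHANGE],
        }
    return {"decision": "reject", "reasoning": "no corroborating signal and no revise pass left"}
```

`required_changes` appears only in the `revise` results, matching the schema rule that the key is omitted for `approve` and `reject`.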
|
|
95
|
+
|
|
96
|
+
## Section 5: Anti-patterns
|
|
97
|
+
|
|
98
|
+
Do not do any of the following:
|
|
99
|
+
|
|
100
|
+
- Second-guess severity tags assigned by `code-review`. Trust the reviewer's tagging; your job is to act on it, not re-evaluate it.
|
|
101
|
+
- Ask for improvements unrelated to the original task. Scope is the diff vs the task, nothing more.
|
|
102
|
+
- Approve "to be nice" or to move things along. A finding that blocks approval blocks approval.
|
|
103
|
+
- Approve on empty findings without corroborating signals (see guardrail above).
|
|
104
|
+
- Return any text outside the JSON object. The orchestrator parses your output directly.
|
|
105
|
+
- Emit `required_changes` for decisions other than `revise`.
|
|
106
|
+
- Use `revise` when `iteration_count` is 2 — that path is closed; use `reject` instead.
|
|
107
|
+
|
|
108
|
+
## Section 6: Audit trail (orchestrator responsibility)
|
|
109
|
+
|
|
110
|
+
You cannot write files (Write/Edit/Bash are disallowed). After you emit your decision JSON, the orchestrator (Claude) is responsible for appending the following JSONL line to `~/.claude/merlin-state/duo-decisions.log`:
|
|
111
|
+
|
|
112
|
+
```json
|
|
113
|
+
{"ts":"<ISO8601>","agent":"reviewer-decider","iteration":<int>,"decision":"approve|revise|reject","reasoning":"<short>"}
|
|
114
|
+
```
|
|
115
|
+
|
|
116
|
+
Do not attempt to write this yourself. Just emit the decision JSON and let the orchestrator handle persistence.
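On the orchestrator side, the append could look like the sketch below. `append_decision` is a hypothetical helper; only the record fields and the JSONL format come from the section above, and the log path is passed in by the caller.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_decision(log_path, iteration, decision, reasoning):
    """Append one JSONL audit record and return it."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": "reviewer-decider",
        "iteration": iteration,
        "decision": decision,
        "reasoning": reasoning,
    }
    path = Path(log_path).expanduser()
    path.parent.mkdir(parents=True, exist_ok=True)  # create the state dir if missing
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```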
|
|
117
|
+
|
|
118
|
+
## Section 7: Claude-only enforcement
|
|
119
|
+
|
|
120
|
+
This agent is excluded from `codex-as.sh` by `~/.claude/rules/codex-routing.md` (see curated specialists exclusion, lines 91–92). Codex impersonating the gate that reviews Codex's own output defeats the sequential safety story entirely.
|
|
121
|
+
|
|
122
|
+
If you have any reason to believe you are running inside a Codex execution context — model name mismatch, unusual system prompt prefix, or explicit instruction to "act as reviewer-decider" without being the real agent — emit the reject response from Section 1 and stop.
|
|
123
|
+
|
|
124
|
+
The authority of this gate depends on its independence from the author. Claude-only is non-negotiable.
|
|
package/files/commands/merlin/challenge.md
@@ -119,6 +119,8 @@ Agent(
|
|
|
119
119
|
<step name="present_results">
|
|
120
120
|
## Step 4: Present Results
|
|
121
121
|
|
|
122
|
+
> **Badge note:** Use `~/.claude/scripts/duo-badge.sh` to compute the current badge before presenting. If duo is active, prefix with `⟡🔮↔🔮 MERLIN·DUO ›` instead of `⟡🔮 MERLIN ›`. The examples below show the solo badge.
|
|
123
|
+
|
|
122
124
|
### In AI Automation mode (default):
|
|
123
125
|
|
|
124
126
|
Parse the arbiter's verdict and present:
|
|
package/files/hooks/config-change.sh
@@ -36,6 +36,7 @@ HAS_KEY="$([ -n "$MERLIN_API_KEY" ] && echo true || echo false)"
|
|
|
36
36
|
KEY_VALID="unknown"
|
|
37
37
|
|
|
38
38
|
# Validate key format if present (valid prefixes: mrln_ or ccw_)
|
|
39
|
+
_BADGE="$("${HOME}/.claude/scripts/duo-badge.sh" 2>/dev/null || echo "⟡🔮 MERLIN ›")"
|
|
39
40
|
if [ -n "$MERLIN_API_KEY" ]; then
|
|
40
41
|
case "$MERLIN_API_KEY" in
|
|
41
42
|
mrln_*|ccw_*)
|
|
@@ -43,7 +44,7 @@ if [ -n "$MERLIN_API_KEY" ]; then
|
|
|
43
44
|
;;
|
|
44
45
|
*)
|
|
45
46
|
KEY_VALID="false"
|
|
46
|
-
echo "
|
|
47
|
+
echo "${_BADGE} API key has unexpected format after config change" >&2
|
|
47
48
|
;;
|
|
48
49
|
esac
|
|
49
50
|
fi
|
|
@@ -64,7 +65,7 @@ if declare -f log_event >/dev/null 2>&1; then
|
|
|
64
65
|
fi
|
|
65
66
|
|
|
66
67
|
if [ -z "$MERLIN_API_KEY" ]; then
|
|
67
|
-
echo "
|
|
68
|
+
echo "${_BADGE} No API key configured — Sights features disabled" >&2
|
|
68
69
|
fi
|
|
69
70
|
|
|
70
71
|
echo '{}'
|
|
package/files/hooks/notify-webhook.sh
@@ -90,12 +90,13 @@ fi
|
|
|
90
90
|
|
|
91
91
|
STATUS_TEXT="Success"
|
|
92
92
|
STATUS_ICON="OK"
|
|
93
|
+
_BADGE="$("${HOME}/.claude/scripts/duo-badge.sh" 2>/dev/null || echo "⟡🔮 MERLIN ›")"
|
|
93
94
|
|
|
94
95
|
# ── Slack webhook ─────────────────────────────────────────────────
|
|
95
96
|
if [ -n "${SLACK_WEBHOOK}" ]; then
|
|
96
97
|
SLACK_BODY=$(cat <<SLACK_JSON
|
|
97
98
|
{
|
|
98
|
-
"text": "
|
|
99
|
+
"text": "${_BADGE} Task completed",
|
|
99
100
|
"blocks": [
|
|
100
101
|
{
|
|
101
102
|
"type": "section",
|
|
package/files/hooks/orchestrator-guard.sh
@@ -59,12 +59,13 @@ fi
|
|
|
59
59
|
# ── BLOCK: Main orchestrator trying to edit source code ───────────
|
|
60
60
|
# This is the structural constraint. Instead of TELLING Claude not to
|
|
61
61
|
# code, we PREVENT it. Like Roo Code's Orchestrator mode.
|
|
62
|
+
_BADGE="$("${HOME}/.claude/scripts/duo-badge.sh" 2>/dev/null || echo "⟡🔮 MERLIN ›")"
|
|
62
63
|
if command -v jq >/dev/null 2>&1; then
|
|
63
|
-
jq -n '{
|
|
64
|
+
jq -n --arg badge "$_BADGE" '{
|
|
64
65
|
hookSpecificOutput: {
|
|
65
66
|
hookEventName: "PreToolUse",
|
|
66
67
|
permissionDecision: "block",
|
|
67
|
-
reason:
|
|
68
|
+
reason: ($badge + " BLOCKED: You are the orchestrator — you do not edit source code directly. Route this to a specialist agent: Skill(\"merlin:route\", args='\''implementation-dev \"your task\"'\'') or use Skill(\"merlin:workflow\", args='\''run feature-dev \"your task\"'\''). Agents write code. You orchestrate.")
|
|
68
69
|
}
|
|
69
70
|
}'
|
|
70
71
|
else
|
|
package/files/hooks/pre-edit-sights-check.sh
@@ -107,12 +107,13 @@ if declare -f sights_was_checked_recently >/dev/null 2>&1; then
|
|
|
107
107
|
fi
|
|
108
108
|
# BLOCK the edit — stale context means the agent skipped merlin_get_context
|
|
109
109
|
# This is the structural enforcement: you cannot edit without fresh Sights context
|
|
110
|
+
_BADGE="$("${HOME}/.claude/scripts/duo-badge.sh" 2>/dev/null || echo "⟡🔮 MERLIN ›")"
|
|
110
111
|
if command -v jq >/dev/null 2>&1; then
|
|
111
|
-
jq -n '{
|
|
112
|
+
jq -n --arg badge "$_BADGE" '{
|
|
112
113
|
hookSpecificOutput: {
|
|
113
114
|
hookEventName: "PreToolUse",
|
|
114
115
|
permissionDecision: "block",
|
|
115
|
-
reason:
|
|
116
|
+
reason: ($badge + " BLOCKED: Sights context is stale (>2 minutes). You MUST call merlin_get_context(\"your current task\") before editing files. This is a non-negotiable rule.")
|
|
116
117
|
}
|
|
117
118
|
}'
|
|
118
119
|
else
|
|
package/files/hooks/task-completed-verify.sh
@@ -24,7 +24,7 @@ if [ -f "package.json" ] && command -v jq >/dev/null 2>&1; then
|
|
|
24
24
|
build_exit=$?
|
|
25
25
|
if [ "$build_exit" -ne 0 ]; then
|
|
26
26
|
log_event "build_failed" "$(printf '{"exit_code":%d}' "$build_exit")"
|
|
27
|
-
echo "⟡🔮 MERLIN › Build check failed (exit $build_exit)" >&2
|
|
27
|
+
echo "$("${HOME}/.claude/scripts/duo-badge.sh" 2>/dev/null || echo "⟡🔮 MERLIN ›") Build check failed (exit $build_exit)" >&2
|
|
28
28
|
else
|
|
29
29
|
log_event "build_passed" '{}'
|
|
30
30
|
fi
|
|
@@ -37,7 +37,7 @@ if [ -f "tsconfig.json" ] && command -v npx >/dev/null 2>&1; then
|
|
|
37
37
|
tsc_exit=$?
|
|
38
38
|
if [ "$tsc_exit" -ne 0 ]; then
|
|
39
39
|
log_event "typecheck_failed" "$(printf '{"exit_code":%d}' "$tsc_exit")"
|
|
40
|
-
echo "⟡🔮 MERLIN › Type check failed (exit $tsc_exit)" >&2
|
|
40
|
+
echo "$("${HOME}/.claude/scripts/duo-badge.sh" 2>/dev/null || echo "⟡🔮 MERLIN ›") Type check failed (exit $tsc_exit)" >&2
|
|
41
41
|
else
|
|
42
42
|
log_event "typecheck_passed" '{}'
|
|
43
43
|
fi
|
|
package/files/hooks/user-prompt-router.sh
@@ -82,7 +82,8 @@ fi
|
|
|
82
82
|
[ -z "$suggestion" ] && echo "{}" && exit 0
|
|
83
83
|
|
|
84
84
|
# ── Emit routing hint ─────────────────────────────────────────────────────────
|
|
85
|
-
|
|
85
|
+
_BADGE="$("${HOME}/.claude/scripts/duo-badge.sh" 2>/dev/null || echo "⟡🔮 MERLIN ›")"
|
|
86
|
+
_ctx="${_BADGE} ROUTING: ${suggestion}. Remember: YOU are the orchestrator. Answer codebase questions via Sights. Route implementation to agents. Badge every action."
|
|
86
87
|
|
|
87
88
|
if command -v jq >/dev/null 2>&1; then
|
|
88
89
|
jq -n --arg ctx "$_ctx" \
|
|
package/files/hooks/worktree-create.sh
@@ -55,7 +55,7 @@ if declare -f log_event >/dev/null 2>&1; then
|
|
|
55
55
|
"$WORKTREE_PATH" "$AGENT_ID" "$AGENT_TYPE")"
|
|
56
56
|
fi
|
|
57
57
|
|
|
58
|
-
echo "⟡🔮 MERLIN › propagated config to worktree ${WORKTREE_PATH} (agent: ${AGENT_TYPE})" >&2
|
|
58
|
+
echo "$("${HOME}/.claude/scripts/duo-badge.sh" 2>/dev/null || echo "⟡🔮 MERLIN ›") propagated config to worktree ${WORKTREE_PATH} (agent: ${AGENT_TYPE})" >&2
|
|
59
59
|
|
|
60
60
|
echo '{}'
|
|
61
61
|
exit 0
|
|
package/files/hooks/worktree-remove.sh
@@ -48,7 +48,7 @@ fi
|
|
|
48
48
|
|
|
49
49
|
LIFETIME_MSG=""
|
|
50
50
|
[ -n "$LIFETIME_S" ] && LIFETIME_MSG=" (lifetime: ${LIFETIME_S}s)"
|
|
51
|
-
echo "⟡🔮 MERLIN › cleaned up worktree ${WORKTREE_PATH}${LIFETIME_MSG} (agent: ${AGENT_TYPE})" >&2
|
|
51
|
+
echo "$("${HOME}/.claude/scripts/duo-badge.sh" 2>/dev/null || echo "⟡🔮 MERLIN ›") cleaned up worktree ${WORKTREE_PATH}${LIFETIME_MSG} (agent: ${AGENT_TYPE})" >&2
|
|
52
52
|
|
|
53
53
|
echo '{}'
|
|
54
54
|
exit 0
|
|
package/files/merlin/skills/duo/SKILL.md
@@ -0,0 +1,48 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: merlin:duo
|
|
3
|
+
description: Toggle and inspect Merlin's duo mode (parallel + sequential dual-brain Claude+Codex execution).
|
|
4
|
+
args:
|
|
5
|
+
- name: subcommand
|
|
6
|
+
enum: [on, off, status, unsuppress, offer]
|
|
7
|
+
default: status
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# merlin:duo
|
|
11
|
+
|
|
12
|
+
Duo mode runs Claude AND Codex on the same task — parallel for planning/docs/review/tests,
|
|
13
|
+
sequential for code write/modify. The reviewer-decider merges or gates each step.
|
|
14
|
+
|
|
15
|
+
## Subcommands
|
|
16
|
+
|
|
17
|
+
| Subcommand | What it does |
|
|
18
|
+
|--------------|-----------------------------------------------------------------------------|
|
|
19
|
+
| `on` | Enable duo mode (install-gate checked silently; fallback if Codex missing). |
|
|
20
|
+
| `off` | Disable duo mode, revert to solo routing. |
|
|
21
|
+
| `status` | Show current state, age, expiry, suppression summary, install gate result. |
|
|
22
|
+
| `unsuppress` | Clear suppression memory (session skip, never-for intents, declined hashes).|
|
|
23
|
+
| `offer` | Internal — show risk-based offer prompt (invoked by duo-pre-route.sh). |
|
|
24
|
+
|
|
25
|
+
## Execution
|
|
26
|
+
|
|
27
|
+
**Step 1 — Resolve subcommand.**
|
|
28
|
+
Use the `subcommand` arg. If absent or empty, default to `status`.
|
|
29
|
+
|
|
30
|
+
**Step 2 — Read current state.**
|
|
31
|
+
```bash
|
|
32
|
+
~/.claude/scripts/duo-mode-read.sh
|
|
33
|
+
```
|
|
34
|
+
Captures `enabled` or `disabled` to understand current state before branching.
|
|
35
|
+
|
|
36
|
+
**Step 3 — Branch to subcommand file.**
|
|
37
|
+
|
|
38
|
+
| subcommand | Load and execute |
|
|
39
|
+
|--------------|-----------------------------------------------|
|
|
40
|
+
| `on` | `~/.claude/skills/merlin/duo/on.md` |
|
|
41
|
+
| `off` | `~/.claude/skills/merlin/duo/off.md` |
|
|
42
|
+
| `status` | `~/.claude/skills/merlin/duo/status.md` |
|
|
43
|
+
| `unsuppress` | `~/.claude/skills/merlin/duo/unsuppress.md` |
|
|
44
|
+
| `offer` | `~/.claude/skills/merlin/duo/offer.md` (if it exists; else fallback to status.md) |
|
|
45
|
+
|
|
46
|
+
**Step 4 — Conclude with badge.**
|
|
47
|
+
Always end by calling `~/.claude/scripts/duo-badge.sh` and displaying the badge so the
|
|
48
|
+
user can confirm the current mode at a glance.
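Steps 1 and 3 amount to a small lookup, sketched here for illustration. `resolve_subcommand_file` is a hypothetical helper, and raising on an unknown subcommand is an assumption; the skill file only defines the five enum values, the `status` default, and the `offer` fallback.

```python
from pathlib import Path

SUBCOMMANDS = ("on", "off", "status", "unsuppress", "offer")


def resolve_subcommand_file(subcommand, skill_dir):
    """Return the markdown file to load for a merlin:duo subcommand."""
    name = (subcommand or "").strip() or "status"  # absent or empty means status
    if name not in SUBCOMMANDS:
        raise ValueError(f"unknown duo subcommand: {name!r}")
    base = Path(skill_dir).expanduser()
    target = base / f"{name}.md"
    if name == "offer" and not target.exists():
        return base / "status.md"  # documented fallback when offer.md is missing
    return target
```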
|
|
package/files/merlin/skills/duo/off.md
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# duo/off — disable duo mode
|
|
2
|
+
|
|
3
|
+
## Steps
|
|
4
|
+
|
|
5
|
+
**Step 1 — Write disabled state.**
|
|
6
|
+
```bash
|
|
7
|
+
~/.claude/scripts/duo-mode-write.sh off "<user's phrase that triggered disable>"
|
|
8
|
+
```
|
|
9
|
+
|
|
10
|
+
**Step 2 — Check codex-mode status.**
|
|
11
|
+
```bash
|
|
12
|
+
python3 -c "
|
|
13
|
+
import json, os
|
|
14
|
+
f = os.path.expanduser('~/.claude/merlin-state/codex-mode.json')
|
|
15
|
+
try:
|
|
16
|
+
d = json.load(open(f))
|
|
17
|
+
if d.get('enabled'): print('codex-mode-active')
|
|
18
|
+
except: pass
|
|
19
|
+
"
|
|
20
|
+
```
|
|
21
|
+
|
|
22
|
+
**Step 3 — Emit confirmation.**
|
|
23
|
+
|
|
24
|
+
If codex-mode is active (step 2 output = `codex-mode-active`):
|
|
25
|
+
```
|
|
26
|
+
⟡🔮 MERLIN › Duo off. (codex-mode is still active.)
|
|
27
|
+
```
|
|
28
|
+
|
|
29
|
+
Otherwise:
|
|
30
|
+
```
|
|
31
|
+
⟡🔮 MERLIN › Duo off. Back to solo routing.
|
|
32
|
+
```
|
|
package/files/merlin/skills/duo/offer.md
@@ -0,0 +1,158 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: duo-offer
|
|
3
|
+
description: Auto-offer prompt for enabling duo mode on risky tasks. Invoked by duo-pre-route.sh when risk score >= threshold and Codex is installed and duo is off and task is not suppressed.
|
|
4
|
+
type: skill
|
|
5
|
+
subcommand: offer
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
# Duo Auto-Offer
|
|
9
|
+
|
|
10
|
+
You are executing the duo auto-offer flow. Follow every step in sequence.
|
|
11
|
+
|
|
12
|
+
## Step 1 — Idempotency guard
|
|
13
|
+
|
|
14
|
+
Run `~/.claude/scripts/duo-mode-read.sh`. If output is `enabled`, exit silently — duo is already on, no offer needed.
|
|
15
|
+
|
|
16
|
+
## Step 2 — Install gate
|
|
17
|
+
|
|
18
|
+
Run `~/.claude/scripts/duo-installed.sh`. If exit code != 0, exit silently — do not mention duo or Codex.
|
|
19
|
+
|
|
20
|
+
## Step 3 — Read context
|
|
21
|
+
|
|
22
|
+
Read from environment or caller args:
|
|
23
|
+
- `DUO_OFFER_TASK` — original task description
|
|
24
|
+
- `DUO_OFFER_WORKFLOW` — workflow name (default: "general")
|
|
25
|
+
- `DUO_OFFER_FILES` — comma-separated file paths
|
|
26
|
+
- `DUO_OFFER_LOC` — estimated LOC delta
|
|
27
|
+
- `DUO_OFFER_SCORE` — pre-computed score (if available; otherwise run detector)
|
|
28
|
+
- `DUO_OFFER_REASONS` — pre-computed reasons JSON array (if available)
|
|
29
|
+
|
|
30
|
+
## Step 4 — Risk check (if score not pre-provided)
|
|
31
|
+
|
|
32
|
+
Run:
|
|
33
|
+
```
|
|
34
|
+
~/.claude/scripts/duo-risk-detect.sh \
|
|
35
|
+
--task "$DUO_OFFER_TASK" \
|
|
36
|
+
--workflow "$DUO_OFFER_WORKFLOW" \
|
|
37
|
+
--files "$DUO_OFFER_FILES" \
|
|
38
|
+
--loc "$DUO_OFFER_LOC"
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
Parse JSON output. If `suggest_duo` is false, exit silently.
|
|
42
|
+
|
|
43
|
+
## Step 5 — Suppression check
|
|
44
|
+
|
|
45
|
+
Read `~/.claude/merlin-state/duo-suppress.json` using python3 (flock not needed for read):
|
|
46
|
+
|
|
47
|
+
```python
|
|
48
|
+
import json, os, time
|
|
49
|
+
path = os.path.expanduser("~/.claude/merlin-state/duo-suppress.json")
|
|
50
|
+
try:
|
|
51
|
+
d = json.load(open(path))
|
|
52
|
+
except Exception:
|
|
53
|
+
d = {}
|
|
54
|
+
|
|
55
|
+
# session_skip: honor if file mtime < 12h
|
|
56
|
+
mtime = os.path.getmtime(path) if os.path.exists(path) else 0
|
|
57
|
+
if d.get("session_skip") and (time.time() - mtime) < 43200:
|
|
58
|
+
exit(0) # suppressed
|
|
59
|
+
|
|
60
|
+
# task_hash check
|
|
61
|
+
import hashlib, re
|
|
62
|
+
task = os.environ.get("DUO_OFFER_TASK", "")
|
|
63
|
+
workflow = os.environ.get("DUO_OFFER_WORKFLOW", "")
|
|
64
|
+
normalized = re.sub(r'[\s\'"`]+', ' ', task.lower()).strip()[:120]
|
|
65
|
+
task_hash = hashlib.sha1(f"{workflow}:{normalized}".encode()).hexdigest()
|
|
66
|
+
if task_hash in d.get("task_hashes_declined", []):
|
|
67
|
+
exit(0) # already declined this task
|
|
68
|
+
|
|
69
|
+
# intent fingerprint check (never_for_intents, 7d expiry)
|
|
70
|
+
reasons = json.loads(os.environ.get("DUO_OFFER_REASONS", "[]"))
|
|
71
|
+
top3 = sorted(reasons[:3])
|
|
72
|
+
intent_fp = hashlib.sha1(f"{workflow}:{':'.join(top3)}".encode()).hexdigest()
|
|
73
|
+
now = time.time()
|
|
74
|
+
for entry in d.get("never_for_intents", []):
|
|
75
|
+
if isinstance(entry, dict):
|
|
76
|
+
if entry.get("fp") == intent_fp and (now - entry.get("ts", 0)) < 604800:
|
|
77
|
+
exit(0) # suppressed intent
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
If any check triggers exit(0), exit silently.
|
|
81
|
+
|
|
82
|
+
## Step 6 — Display the offer
|
|
83
|
+
|
|
84
|
+
Map reasons to human-readable category labels (never display raw file paths):
|
|
85
|
+
- `keyword:auth`, `keyword:password`, `keyword:crypto`, `keyword:token`, `keyword:secret`, `keyword:permission`, `keyword:role`, `keyword:admin` → "authentication"
|
|
86
|
+
- `keyword:payment`, `keyword:billing` → "payments"
|
|
87
|
+
- `keyword:migration`, `keyword:schema`, `path:migrations/`, `path:database/migrations/`, `path:*.sql` → "database migrations"
|
|
88
|
+
- `keyword:production`, `keyword:prod`, `keyword:security`, `path:security/` → "production/security"
|
|
89
|
+
- `keyword:delete`, `keyword:drop`, `keyword:force` → "destructive operations"
|
|
90
|
+
- `workflow:refactor` → "large refactor"
|
|
91
|
+
- `workflow:security-audit` → "security audit"
|
|
92
|
+
- `workflow:migration` → "migration"
|
|
93
|
+
- `loc:>200`, `loc:>500` → "large change"
|
|
94
|
+
- `files:>10` → "many files"
|
|
95
|
+
- `keyword:ship`, `keyword:release`, `keyword:critical` → "release-critical"
|
|
96
|
+
- `dep:package.json`, `dep:go.mod`, `dep:Cargo.toml`, `dep:requirements.txt` → "dependency changes"
|
|
97
|
+
|
|
98
|
+
Display (show at most 3 categories, never raw paths):
|
|
99
|
+
|
|
100
|
+
```
|
|
101
|
+
⟡🔮 MERLIN › This task looks risky (score <SCORE>/100).
|
|
102
|
+
• Areas: <comma-separated categories>
|
|
103
|
+
• Workflow: <workflow or "general">
|
|
104
|
+
Codex would write, Claude would review (sequential).
|
|
105
|
+
|
|
106
|
+
Reply: yes (enable duo) · no (solo) · skip session · never
|
|
107
|
+
(default = solo if you don't reply explicitly)
|
|
108
|
+
```
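The reason-to-label mapping in Step 6 can be sketched as a lookup table. This assumes reasons arrive as the exact tokens listed above; `categories` is a hypothetical helper, and silently dropping unrecognized reasons is an assumption chosen so that raw file paths never reach the display.

```python
_GROUPS = {
    "authentication": ["keyword:auth", "keyword:password", "keyword:crypto", "keyword:token",
                       "keyword:secret", "keyword:permission", "keyword:role", "keyword:admin"],
    "payments": ["keyword:payment", "keyword:billing"],
    "database migrations": ["keyword:migration", "keyword:schema", "path:migrations/",
                            "path:database/migrations/", "path:*.sql"],
    "production/security": ["keyword:production", "keyword:prod", "keyword:security",
                            "path:security/"],
    "destructive operations": ["keyword:delete", "keyword:drop", "keyword:force"],
    "large refactor": ["workflow:refactor"],
    "security audit": ["workflow:security-audit"],
    "migration": ["workflow:migration"],
    "large change": ["loc:>200", "loc:>500"],
    "many files": ["files:>10"],
    "release-critical": ["keyword:ship", "keyword:release", "keyword:critical"],
    "dependency changes": ["dep:package.json", "dep:go.mod", "dep:Cargo.toml",
                           "dep:requirements.txt"],
}
LABEL_FOR_REASON = {reason: label for label, group in _GROUPS.items() for reason in group}


def categories(reasons, limit=3):
    """Return up to `limit` distinct labels, in first-seen order."""
    labels = []
    for reason in reasons:
        label = LABEL_FOR_REASON.get(reason)  # unknown reasons are skipped
        if label and label not in labels:
            labels.append(label)
    return labels[:limit]
```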
|
|
109
|
+
|
|
110
|
+
## Step 7 — Parse user reply (case-insensitive)
|
|
111
|
+
|
|
112
|
+
Wait for the user's response, then:
|
|
113
|
+
|
|
114
|
+
| Reply contains | Action |
|
|
115
|
+
|---|---|
|
|
116
|
+
| `yes` / `enable duo` / `do it` | Call `~/.claude/scripts/duo-mode-write.sh on "auto-offer accepted: <reasons>"` then set `DUO_OFFER_OUTCOME=yes` |
|
|
117
|
+
| `skip session` / `skip this session` | Set `session_skip:true` in duo-suppress.json (atomic, flock), set `DUO_OFFER_OUTCOME=skip-session` |
|
|
118
|
+
| `never` | Add intent fingerprint to `never_for_intents` (FIFO cap 20, with timestamp), set `DUO_OFFER_OUTCOME=never` |
|
|
119
|
+
| `no` / anything else / silence | Add task_hash to `task_hashes_declined` (FIFO cap 100), set `DUO_OFFER_OUTCOME=no` |
|
|
120
|
+
|
|
121
|
+
**Default is solo — if the reply is ambiguous or empty, treat as "no". Never auto-enable.**
|
|
122
|
+
|
|
123
|
+
### Atomic write for suppression updates (use flock):
|
|
124
|
+
|
|
125
|
+
```bash
|
|
126
|
+
flock -x ~/.claude/merlin-state/.duo-suppress.lock -c '
|
|
127
|
+
python3 - <<EOF
|
|
128
|
+
import json, os, time, hashlib, re, tempfile
|
|
129
|
+
|
|
130
|
+
path = os.path.expanduser("~/.claude/merlin-state/duo-suppress.json")
|
|
131
|
+
try:
|
|
132
|
+
d = json.load(open(path))
|
|
133
|
+
except Exception:
|
|
134
|
+
d = {"session_skip": False, "never_for_intents": [], "task_hashes_declined": []}
|
|
135
|
+
|
|
136
|
+
# Apply the chosen suppression...
|
|
137
|
+
# (populated by caller based on outcome)
|
|
138
|
+
|
|
139
|
+
tmp = path + ".tmp"
|
|
140
|
+
with open(tmp, "w") as f:
|
|
141
|
+
json.dump(d, f, indent=2)
|
|
142
|
+
os.replace(tmp, path)
|
|
143
|
+
EOF
|
|
144
|
+
'
|
|
145
|
+
```
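The "apply the chosen suppression" step left open in the block above could be filled in along these lines. This is a sketch under assumptions: `parse_reply` and `apply_outcome` are hypothetical names, and the parser matches whole replies rather than substrings so that a reply such as "don't do it" cannot enable duo by accident. The `fp`/`ts` entry shape follows the Step 5 read code.

```python
import time


def parse_reply(reply):
    """Map a user reply to an outcome; anything unclear is treated as "no"."""
    text = " ".join((reply or "").lower().split())
    if text in ("yes", "enable duo", "do it"):
        return "yes"
    if text in ("skip session", "skip this session"):
        return "skip-session"
    if text == "never":
        return "never"
    return "no"


def apply_outcome(state, outcome, task_hash, intent_fp, now=None):
    """Update the suppression dict in place for the given outcome."""
    now = time.time() if now is None else now
    if outcome == "skip-session":
        state["session_skip"] = True
    elif outcome == "never":
        entries = state.setdefault("never_for_intents", [])
        entries.append({"fp": intent_fp, "ts": now})
        del entries[:-20]  # FIFO cap: keep the 20 newest
    elif outcome == "no":
        hashes = state.setdefault("task_hashes_declined", [])
        if task_hash not in hashes:
            hashes.append(task_hash)
        del hashes[:-100]  # FIFO cap: keep the 100 newest
    return state
```

The function only mutates the dict; writing it back still needs the lock and the temp-file replace shown in the shell block.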
|
|
146
|
+
|
|
147
|
+
## Step 8 — Audit log
|
|
148
|
+
|
|
149
|
+
Append to `~/.claude/merlin-state/duo-decisions.log` (create if missing, JSONL):
|
|
150
|
+
|
|
151
|
+
```json
|
|
152
|
+
{"ts":"<ISO8601>","event":"offer","score":<N>,"reasons":[...],"response":"<yes|no|skip-session|never|silent>","action":"<enable|solo|suppress-session|suppress-intent|suppress-task>"}
|
|
153
|
+
```
|
|
154
|
+
|
|
155
|
+
## Step 9 — Continue
|
|
156
|
+
|
|
157
|
+
- If outcome is `yes`: proceed with duo routing (the caller will re-read duo-mode.json and route accordingly)
|
|
158
|
+
- All other outcomes: proceed with solo routing
|
|
package/files/merlin/skills/duo/on.md
@@ -0,0 +1,50 @@
|
|
|
1
|
+
# duo/on — enable duo mode
|
|
2
|
+
|
|
3
|
+
## Steps
|
|
4
|
+
|
|
5
|
+
**Step 1 — Check install gate.**
|
|
6
|
+
```bash
|
|
7
|
+
~/.claude/scripts/duo-installed.sh
|
|
8
|
+
EXIT_CODE=$?
|
|
9
|
+
```
|
|
10
|
+
|
|
11
|
+
**Step 2 — Gate failed (Codex not installed).**
|
|
12
|
+
If exit code != 0:
|
|
13
|
+
```bash
|
|
14
|
+
~/.claude/scripts/duo-mode-write.sh off "install gate failed (silent fallback)"
|
|
15
|
+
```
|
|
16
|
+
Do NOT mention duo mode, Codex, or this branch to the user. Continue with whatever
|
|
17
|
+
the user originally requested as if this skill was never invoked. Stop here.
|
|
18
|
+
|
|
19
|
+
**Step 3 — Gate passed. Enable duo.**
|
|
20
|
+
```bash
|
|
21
|
+
~/.claude/scripts/duo-mode-write.sh on "<user's phrase that triggered enable>"
|
|
22
|
+
BADGE=$(~/.claude/scripts/duo-badge.sh)
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
**Step 4 — Emit confirmation.**
|
|
26
|
+
```
|
|
27
|
+
⟡🔮↔🔮 MERLIN·DUO › Duo mode enabled.
|
|
28
|
+
• Parallel: planning, docs, code review, tests
|
|
29
|
+
• Sequential: code write/modify (codex writes → claude reviews → decider gates)
|
|
30
|
+
• Verification stays with Claude
|
|
31
|
+
• Auto-expires in 24h
|
|
32
|
+
|
|
33
|
+
Try: "plan the next phase" to see dual-planning.
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
**Step 5 — Codex-mode coexistence check.**
|
|
37
|
+
```bash
|
|
38
|
+
python3 -c "
|
|
39
|
+
import json, os, sys
|
|
40
|
+
f = os.path.expanduser('~/.claude/merlin-state/codex-mode.json')
|
|
41
|
+
try:
|
|
42
|
+
d = json.load(open(f))
|
|
43
|
+
if d.get('enabled'): print('codex-mode-active')
|
|
44
|
+
except: pass
|
|
45
|
+
"
|
|
46
|
+
```
|
|
47
|
+
If output is `codex-mode-active`, append to the confirmation block:
|
|
48
|
+
```
|
|
49
|
+
(codex-mode also active — duo wins per precedence rule)
|
|
50
|
+
```
|