mindforge-cc 1.0.5 → 2.0.0-alpha.4
- package/.agent/CLAUDE.md +53 -0
- package/.agent/mindforge/auto.md +22 -0
- package/.agent/mindforge/browse.md +26 -0
- package/.agent/mindforge/costs.md +11 -0
- package/.agent/mindforge/cross-review.md +17 -0
- package/.agent/mindforge/execute-phase.md +5 -3
- package/.agent/mindforge/qa.md +16 -0
- package/.agent/mindforge/remember.md +14 -0
- package/.agent/mindforge/research.md +11 -0
- package/.agent/mindforge/steer.md +13 -0
- package/.agent/workflows/publish-release.md +36 -0
- package/.claude/CLAUDE.md +53 -0
- package/.claude/commands/mindforge/auto.md +22 -0
- package/.claude/commands/mindforge/browse.md +26 -0
- package/.claude/commands/mindforge/costs.md +11 -0
- package/.claude/commands/mindforge/cross-review.md +17 -0
- package/.claude/commands/mindforge/execute-phase.md +5 -3
- package/.claude/commands/mindforge/qa.md +16 -0
- package/.claude/commands/mindforge/remember.md +14 -0
- package/.claude/commands/mindforge/research.md +11 -0
- package/.claude/commands/mindforge/steer.md +13 -0
- package/.mindforge/MINDFORGE-V2-SCHEMA.json +47 -0
- package/.mindforge/browser/daemon-protocol.md +24 -0
- package/.mindforge/browser/qa-engine.md +16 -0
- package/.mindforge/browser/session-manager.md +18 -0
- package/.mindforge/browser/visual-verify-spec.md +31 -0
- package/.mindforge/engine/autonomous/auto-executor.md +266 -0
- package/.mindforge/engine/autonomous/headless-adapter.md +66 -0
- package/.mindforge/engine/autonomous/node-repair.md +190 -0
- package/.mindforge/engine/autonomous/progress-reporter.md +58 -0
- package/.mindforge/engine/autonomous/steering-manager.md +64 -0
- package/.mindforge/engine/autonomous/stuck-detector.md +89 -0
- package/.mindforge/memory/MEMORY-SCHEMA.md +155 -0
- package/.mindforge/memory/decision-library.jsonl +0 -0
- package/.mindforge/memory/engine/capture-protocol.md +36 -0
- package/.mindforge/memory/engine/global-sync-spec.md +42 -0
- package/.mindforge/memory/engine/retrieval-spec.md +44 -0
- package/.mindforge/memory/knowledge-base.jsonl +7 -0
- package/.mindforge/memory/pattern-library.jsonl +1 -0
- package/.mindforge/memory/team-preferences.jsonl +4 -0
- package/.mindforge/models/model-registry.md +48 -0
- package/.mindforge/models/model-router.md +30 -0
- package/.mindforge/personas/research-agent.md +24 -0
- package/.planning/browser-daemon.log +32 -0
- package/.planning/decisions/ADR-021-autonomy-boundary.md +17 -0
- package/.planning/decisions/ADR-022-node-repair-hierarchy.md +19 -0
- package/.planning/decisions/ADR-023-gate-3-timing.md +15 -0
- package/CHANGELOG.md +68 -0
- package/MINDFORGE.md +26 -3
- package/README.md +54 -18
- package/bin/autonomous/auto-runner.js +95 -0
- package/bin/autonomous/headless.js +36 -0
- package/bin/autonomous/progress-stream.js +49 -0
- package/bin/autonomous/repair-operator.js +213 -0
- package/bin/autonomous/steer.js +71 -0
- package/bin/autonomous/stuck-monitor.js +77 -0
- package/bin/browser/browser-daemon.js +139 -0
- package/bin/browser/daemon-manager.js +91 -0
- package/bin/browser/qa-engine.js +47 -0
- package/bin/browser/qa-report-writer.js +32 -0
- package/bin/browser/regression-writer.js +27 -0
- package/bin/browser/screenshot-store.js +49 -0
- package/bin/browser/session-manager.js +93 -0
- package/bin/browser/visual-verify-executor.js +89 -0
- package/bin/install.js +4 -4
- package/bin/installer-core.js +24 -24
- package/bin/memory/cli.js +99 -0
- package/bin/memory/global-sync.js +107 -0
- package/bin/memory/knowledge-capture.js +278 -0
- package/bin/memory/knowledge-indexer.js +172 -0
- package/bin/memory/knowledge-store.js +319 -0
- package/bin/memory/session-memory-loader.js +137 -0
- package/bin/migrations/0.1.0-to-0.5.0.js +2 -3
- package/bin/migrations/0.5.0-to-0.6.0.js +1 -1
- package/bin/migrations/0.6.0-to-1.0.0.js +3 -3
- package/bin/migrations/migrate.js +15 -11
- package/bin/models/anthropic-provider.js +77 -0
- package/bin/models/cost-tracker.js +118 -0
- package/bin/models/gemini-provider.js +79 -0
- package/bin/models/model-client.js +98 -0
- package/bin/models/model-router.js +111 -0
- package/bin/models/openai-provider.js +78 -0
- package/bin/research/research-engine.js +115 -0
- package/bin/review/cross-review-engine.js +81 -0
- package/bin/review/finding-synthesizer.js +116 -0
- package/bin/review/review-report-writer.js +49 -0
- package/bin/updater/self-update.js +13 -13
- package/docs/adr/ADR-024-browser-localhost-only.md +17 -0
- package/docs/adr/ADR-025-visual-verify-failure-treatment.md +19 -0
- package/docs/adr/ADR-026-session-persistence-security.md +20 -0
- package/docs/architecture/README.md +4 -2
- package/docs/publishing-guide.md +78 -0
- package/docs/reference/commands.md +17 -2
- package/docs/reference/sdk-api.md +6 -1
- package/docs/user-guide.md +93 -9
- package/docs/usp-features.md +56 -8
- package/package.json +3 -2
@@ -0,0 +1,266 @@
# MindForge v2 — Auto-Executor Engine

## Purpose
Orchestrate complete autonomous execution of a MindForge phase without
human intervention. The auto-executor is the brain behind `/mindforge:auto`.

## Design principles

### Principle 1 — Fresh context per task
Every task is executed by a new subagent with a fresh context window.
Never accumulate context across tasks. The only state that persists between
tasks is written to `.planning/` files (HANDOFF.json, AUDIT.jsonl, SUMMARY files).

### Principle 2 — Durable execution
Auto mode is designed to survive interruption. Every task completion writes
to HANDOFF.json before moving to the next. If the session dies:
- HANDOFF.json shows exactly where execution stopped
- Next `/mindforge:auto` call resumes from the last completed task
- No work is repeated, no work is lost
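
The resume rule above can be illustrated with a minimal sketch. It assumes the task list is an ordered array of plan IDs and that completion is known as a list of IDs (for example, derived from which SUMMARY files exist). `findResumePoint` is a hypothetical helper for illustration, not a function shipped in the package.

```javascript
// Hypothetical helper: pick the first plan with no recorded completion.
// `planIds` is the ordered task list; `completedIds` stands in for the plans
// that already have a SUMMARY file (an assumption made for this sketch).
function findResumePoint(planIds, completedIds) {
  const done = new Set(completedIds);
  const next = planIds.find((id) => !done.has(id));
  return next === undefined ? null : next; // null means nothing is left to run
}

const phasePlans = ['3-01', '3-02', '3-03', '3-04'];
console.log(findResumePoint(phasePlans, ['3-01', '3-02']));
console.log(findResumePoint(phasePlans, phasePlans));
```

Because the lookup is driven only by what was durably recorded, an interrupted task is simply picked again on the next run and completed tasks are never re-dispatched.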

### Principle 3 — Governance is non-negotiable
Compliance gates run between every wave. CRITICAL security findings stop
the loop immediately. Tier 3 changes (auth/payment/PII code patterns) trigger
ESCALATE — they are never auto-approved in autonomous mode.

### Principle 4 — Signal over silence
Auto mode never silently fails. Every decision (RETRY, DECOMPOSE, PRUNE,
ESCALATE) is written to AUDIT.jsonl with full context. The progress stream
reports every state change in real time.

---

## Auto-executor state machine

```
IDLE
  │
  ▼ /mindforge:auto [phase N]
PRE_FLIGHT_CHECK
  │ fail → ESCALATE with specific error
  │ pass
  ▼
PHASE_ASSESSMENT
  │ PLAN files exist?
  │   NO → AUTO_PLAN (discuss if ambiguity > 3.5, then plan)
  │   YES → check if any are incomplete
  ▼
DEPENDENCY_RESOLUTION
  │ Build wave DAG from PLAN files
  │ Identify completed tasks (SUMMARY files exist)
  │ Resume from first incomplete task
  ▼
WAVE_EXECUTION_LOOP ←──────────────────────────────┐
  │                                                │
  │ For each wave:                                 │
  │   Dispatch N tasks in parallel (subagents)     │
  │   Each task: EXECUTE → VERIFY → COMMIT         │
  │                                                │
  │   Task result:                                 │
  │     SUCCESS → write SUMMARY, AUDIT, HANDOFF    │
  │     FAILURE → NODE_REPAIR (see node-repair.md) │
  │       REPAIR result:                           │
  │         RECOVERED → continue                   │
  │         DEFERRED → add to DEFERRED-ITEMS.md    │
  │         ESCALATE → stop, notify, write report  │
  │                                                │
  │   Poll steering-queue.jsonl at task boundaries │
  │   Apply any queued steering guidance           │
  │                                                │
  │ After each wave:                               │
  │   Run compliance gates (Gate 1-5)              │
  │   CRITICAL finding → ESCALATE                  │
  │   Update HANDOFF.json wave completion          │
  │   Push if AUTO_PUSH_ON_WAVE_COMPLETE=true      │
  │                                                │
  └─────────────── all waves complete ─────────────┘
  │
  ▼
POST_EXECUTION
  │ Run automated verification (no human UAT in auto mode)
  │ Write AUTONOMOUS-REPORT-[phase]-[timestamp].md
  │ Update STATE.md
  │ Send Slack notification (if configured)
  ▼
COMPLETE (or ESCALATED)
```
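
The DEPENDENCY_RESOLUTION step ("Build wave DAG from PLAN files") can be sketched as a layered topological sort. This is an illustrative sketch only: the `{ id, dependencies }` plan shape and the `buildWaves` name are assumptions, and the real engine may resolve waves differently.

```javascript
// Group plans into waves: a plan joins the first wave in which all of its
// dependencies have already been scheduled. The input shape is assumed.
function buildWaves(plans) {
  const scheduled = new Set();
  const waves = [];
  let remaining = plans.slice();
  while (remaining.length > 0) {
    const ready = remaining.filter((p) => p.dependencies.every((d) => scheduled.has(d)));
    if (ready.length === 0) {
      // Nothing can run: either a cycle or a dependency that does not exist.
      throw new Error('Unresolvable dependencies among: ' +
        remaining.map((p) => p.id).join(', '));
    }
    waves.push(ready.map((p) => p.id));
    ready.forEach((p) => scheduled.add(p.id));
    remaining = remaining.filter((p) => !scheduled.has(p.id));
  }
  return waves;
}

const examplePlans = [
  { id: '01', dependencies: [] },
  { id: '02', dependencies: [] },
  { id: '03', dependencies: ['01'] },
  { id: '04', dependencies: ['02', '03'] },
];
console.log(buildWaves(examplePlans)); // [ [ '01', '02' ], [ '03' ], [ '04' ] ]
```

Plans inside one wave have no dependencies on each other, which is what makes the parallel dispatch in WAVE_EXECUTION_LOOP safe.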

---

## Pre-flight check protocol

Before starting any autonomous execution, verify:

```bash
# 1. Health check — must be healthy
/mindforge:health
# Expected: ✅ Installation integrity, ✅ State consistency

# 2. Schema version check
SCHEMA_VER=$(node -e "try{const h=require('./.planning/HANDOFF.json');
  console.log(h.schema_version)}catch{console.log('missing')}")
[ "${SCHEMA_VER}" = "1.0.0" ] || {
  echo "⚠️ HANDOFF.json schema outdated (${SCHEMA_VER}). Run /mindforge:migrate"
  exit 1
}

# 3. Uncommitted changes check
# HARDENED: Exclude .planning/ and MINDFORGE.md
DIRTY=$(git status --porcelain | \
  grep -v "^??" | \
  grep -v "^.. \.planning/" | \
  grep -v "^.. MINDFORGE\.md" | \
  wc -l | tr -d ' ')

if [ "${DIRTY}" -gt 0 ]; then
  echo "❌ ${DIRTY} uncommitted change(s) in application code."
  echo " Commit or stash before running auto mode:"
  git status --porcelain | grep -v "^??" | grep -v "^.. \.planning/"
  exit 1
fi
echo "✅ Working tree clean (application code)"

# 4. Phase PLAN files check
PLAN_COUNT=$(ls .planning/phases/${PHASE_NUM}/PLAN-${PHASE_NUM}-*.md 2>/dev/null | wc -l | tr -d ' ')
if [ "${PLAN_COUNT}" -eq 0 ]; then
  echo "ℹ️ No PLAN files for Phase ${PHASE_NUM}. Will auto-plan first."
  # Trigger auto-plan path
fi

# 5. Governance configuration check — warn if Tier 2/3 approvers not configured
APPROVERS=$(grep "TIER2_APPROVERS=" .mindforge/governance/GOVERNANCE-CONFIG.md 2>/dev/null | \
  grep -v "senior-engineer" | head -1)
[ -n "${APPROVERS}" ] || {
  echo "⚠️ Tier 2/3 approvers not configured. Tier 3 changes will ESCALATE immediately."
}

# 6. Timeout sanity check
TIMEOUT_MINS="${AUTO_MODE_DEFAULT_TIMEOUT_MINUTES:-120}"
echo "ℹ️ Timeout: ${TIMEOUT_MINS} minutes from now ($(date -d "+${TIMEOUT_MINS} minutes" '+%H:%M'))"
```

---

## Task dispatch model

### Subagent context package (fresh per task)
Each task's subagent receives ONLY:
1. CLAUDE.md (persona instructions)
2. The specific persona file (e.g., developer.md)
3. Relevant loaded skills (JIT-loaded per triggers)
4. The specific PLAN-N-MM.md file
5. CONVENTIONS.md (org coding standards)
6. Referenced architecture sections only (not full ARCHITECTURE.md)
7. Any steering guidance from steering-queue.jsonl
8. Implicit knowledge from HANDOFF.json relevant to this task

This is the minimum-context principle from the v1 context-injector — carried
forward and enforced strictly in auto mode to prevent context rot.

### Fresh context enforcement
```bash
# Auto mode subagent spawn — each task gets a new conversation
# (In Claude Code: each task is a new sub-session with /clear between tasks)
# Context budget per task: max 60,000 tokens
# If context estimate > 60K: DECOMPOSE the task before execution

CONTEXT_EST=$(estimate_task_context "${PLAN_FILE}")
if [ "${CONTEXT_EST}" -gt 60000 ]; then
  echo "Context estimate ${CONTEXT_EST} exceeds 60K — triggering pre-execution DECOMPOSE"
  decompose_plan "${PLAN_FILE}"
fi
```

### Compliance gate timing — CRITICAL DISTINCTION

### Gate 3 (secret detection) — runs PRE-COMMIT, not post-wave
Gate 3 must run on the STAGED diff before every commit.
A committed secret is a git history violation even if caught 30 seconds later.

```bash
# Before every git commit in auto mode:
STAGED_DIFF=$(git diff --cached)

# Secret detection on staged content
SECRET_FOUND=$(echo "${STAGED_DIFF}" | \
  grep -E "(sk-[a-zA-Z0-9]{20,}|AKIA[A-Z0-9]{16}|ghp_[a-zA-Z0-9]{36}|xoxb-[a-zA-Z0-9-]+)" | \
  head -1)

if [ -n "${SECRET_FOUND}" ]; then
  # DO NOT COMMIT
  git reset HEAD # Unstage everything
  echo "🔴 GATE 3 VIOLATION: Secret detected in staged changes"
  echo " Pattern: ${SECRET_FOUND:0:30}***"
  echo " Auto mode ESCALATING — secret must be removed before continuing"
  write_escalation "Gate 3: secret credential pattern in staged diff"
  write_audit_gate3_violation
  notify_slack_critical
  exit 3 # Exit code 3 = gate failure
fi
```
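
For readers working in the package's JavaScript layer, the same check can be sketched in Node.js. The regular expression is copied from the shell snippet above. Everything else is an assumption for illustration: the `findStagedSecrets` name, the choice to inspect only added lines, and the decision to report a four-character preview instead of the first 30 characters of the matching line.

```javascript
// Credential patterns copied from the Gate 3 shell check above.
const SECRET_PATTERN = /(sk-[a-zA-Z0-9]{20,}|AKIA[A-Z0-9]{16}|ghp_[a-zA-Z0-9]{36}|xoxb-[a-zA-Z0-9-]+)/;

// Hypothetical helper: scan a unified diff and report added lines that
// contain a credential-like token. Removed lines and "+++" headers are skipped.
function findStagedSecrets(diffText) {
  const findings = [];
  diffText.split('\n').forEach((line, index) => {
    if (!line.startsWith('+') || line.startsWith('+++')) return;
    const match = SECRET_PATTERN.exec(line);
    if (match) {
      // Keep only a short prefix so logs and audit entries never hold the secret.
      findings.push({ line: index + 1, preview: match[0].slice(0, 4) + '***' });
    }
  });
  return findings;
}

// Obviously fake values, built at runtime so no real-looking key sits in the file.
const sampleDiff = [
  '+++ b/config.js',
  '+const region = "eu-west-1";',
  '+const key = "AKIA' + 'A'.repeat(16) + '";',
  '-const old = "ghp_' + 'a'.repeat(36) + '";',
].join('\n');
console.log(findStagedSecrets(sampleDiff));
```

Restricting the scan to added lines is a design choice of this sketch: a removed line cannot introduce a new secret into history, whereas the shell version greps the whole diff and would also flag removals.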

### Gates 1, 2, 4, 5 — run POST-WAVE (as before)
These gates check the wave's overall output, not individual commits.
They run after all tasks in a wave complete.

---

## Progress state file: `.planning/auto-state.json`

Written after every task. The source of truth for progress display and resumption.

```json
{
  "schema_version": "2.0.0",
  "auto_mode_active": true,
  "session_id": "auto-sess-uuid",
  "phase": 3,
  "started_at": "ISO-8601",
  "timeout_at": "ISO-8601",
  "elapsed_ms": 1083000,
  "estimated_remaining_ms": 741000,
  "wave_current": 2,
  "wave_total": 3,
  "tasks_completed": 4,
  "tasks_total": 8,
  "tasks_failed": 0,
  "node_repairs": 0,
  "escalations": 0,
  "steering_items_applied": 1,
  "token_consumed_estimate": 82400,
  "last_commit": "abc1234ef",
  "last_task": "Plan 3-04",
  "current_task": "Plan 3-05",
  "current_task_started_at": "ISO-8601",
  "gate_failures": 0,
  "deferred_items": [],
  "status": "running|paused|completed|escalated|timeout",
  "_warning": "Never store secrets in this file."
}
```

---

## AUDIT entries for auto mode

Three new AUDIT event types:

```json
{ "event": "auto_mode_started",
  "phase": 3, "session_id": "auto-sess-uuid",
  "plans_total": 8, "waves_total": 3,
  "timeout_minutes": 120 }

{ "event": "auto_mode_completed",
  "phase": 3, "session_id": "auto-sess-uuid",
  "tasks_completed": 8, "tasks_total": 8,
  "node_repairs": 0, "escalations": 0,
  "duration_ms": 1834000, "commits": ["abc1234", "def5678"] }

{ "event": "auto_mode_escalated",
  "phase": 3, "session_id": "auto-sess-uuid",
  "reason": "CRITICAL security finding in Plan 3-06",
  "last_completed_task": "Plan 3-05",
  "next_task": "Plan 3-06",
  "resume_command": "/mindforge:auto --phase 3 --resume" }
```
@@ -0,0 +1,66 @@
# MindForge v2 — Headless Adapter

## Purpose
Enable MindForge execution in non-interactive environments (CI/CD, GitHub
Actions, remote build servers). The headless adapter handles SIGTERM,
prevents terminal UI rendering, and enforces strict exit codes.

---

## Headless characteristics

### 1. Zero-interaction
All prompts are auto-answered with defaults (or `yes` for security gates
if `TIER_1_AUTO_APPROVE=true`). If a Tier 3 change is detected, the adapter
exits immediately with code 10 (Escalation Required).

### 2. Output redirection
Standard terminal UI (Progress stream) is disabled.
Instead, a structured JSON stream is written to `stdout`.
Informational logs go to `stderr`.

### 3. Graceful termination (SIGTERM/SIGINT) - HARDENED
When running in environments like GitHub Actions, the runner might receive
a SIGTERM (timeout).
**Hardening rule:** On SIGTERM, the engine MUST:
- Finish the current `git commit` if in progress.
- Write the current state to HANDOFF.json.
- Upload current `.planning/` artifacts as a "resume package".
- Prevent the race condition where `SIGTERM` kills the process mid-write.

```javascript
// bin/autonomous/headless.js logic
process.on('SIGTERM', async () => {
  console.error('⚠️ Received SIGTERM. Snapshotting state for resumption...');
  await autoExecutor.pause(); // Flushes all state buffers
  process.exit(0); // Exit 0 to show graceful pause vs failure
});
```

---

## CI Configuration (`.mindforge/ci.yaml`)

Typical GitHub Actions setup:

```yaml
steps:
  - uses: actions/checkout@v4
  - name: Run MindForge Auto
    run: npx mindforge auto --phase 3 --headless
    env:
      MINDFORGE_TOKEN: ${{ secrets.MINDFORGE_TOKEN }}
      AUTO_PUSH_ON_WAVE_COMPLETE: true
```

---

## Exit codes

| Code | Meaning |
|------|---------|
| 0 | Success (Phase complete) |
| 1 | General Error (Check logs) |
| 3 | Gate/Compliance Failure |
| 10 | Escalation Required (Human needed) |
| 124 | Timeout (Max duration exceeded) |
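
A CI wrapper could translate a final run status into the exit codes in the table above. The sketch below is illustrative only: the codes come from the table, while the status names on the left and the `exitCodeFor` helper are assumptions loosely modelled on the `status` field of `auto-state.json`.

```javascript
// Exit codes from the table above. The status keys are assumed names.
const EXIT_CODES = Object.freeze({
  completed: 0,
  error: 1,
  gate_failure: 3,
  escalated: 10,
  timeout: 124,
});

// Hypothetical helper: unknown statuses fall back to the general error code.
function exitCodeFor(status) {
  return Object.prototype.hasOwnProperty.call(EXIT_CODES, status)
    ? EXIT_CODES[status]
    : EXIT_CODES.error;
}

console.log(exitCodeFor('escalated')); // 10
console.log(exitCodeFor('something-unexpected')); // 1
```

Keeping the mapping in one frozen object makes it easy for a pipeline step to branch on "needs a human" (10) versus "retry later" (124) without parsing logs.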
@@ -0,0 +1,190 @@
# MindForge v2 — Node Repair Operator

## Purpose
When a task fails in auto mode, the node repair operator decides whether to
RETRY, DECOMPOSE, PRUNE, or ESCALATE — in that order of preference.
The goal: recover autonomously without escalating to the human unless truly necessary.

## The four repair strategies

### RETRY — Re-execute the same plan with a fresh context

**When to use:**
- First failure of any kind (default first response)
- Test failures that look transient (no deterministic root cause identifiable)
- Timeout (task ran > task timeout without completing)
- `error: Cannot find module` or similar environment setup errors

**How it works:**
1. Clear any partial file changes from the failed attempt: `git checkout -- .`
2. Read the failure output carefully for error signals
3. Inject the error output as additional context for the retry subagent:
   ```
   [RETRY CONTEXT — previous attempt failed]
   Error observed: [exact error message]
   This is attempt 2/2. Fix this specific error.
   ```
4. Re-dispatch the task with fresh context + error context
5. If retry succeeds: write `node_repair_type: RETRY` to AUDIT entry
6. If retry fails: proceed to DECOMPOSE

**Budget:** Max 1 retry per task (configurable via `AUTO_NODE_REPAIR_BUDGET`)

---

### DECOMPOSE — Split the failed task into smaller tasks

**When to use:**
- RETRY failed (scope was too broad for one pass)
- Context estimate > 60K tokens (before even attempting)
- Verify step fails on multiple distinct criteria simultaneously
- Task has files from 2+ distinct domains (auth + database + UI in one plan)

**How it works:**

Step 1 — Analyse the failed plan:
Read the `<action>` and `<files>` fields. Identify:
- How many distinct concerns are in this plan?
- What is the minimal first step that would unblock the rest?
- What is the logical split that creates two independent tasks?

Step 2 — Create two replacement PLAN files:

Original: `PLAN-3-05.md` (failed)
  → `PLAN-3-05a.md` (first part — the foundation)
  → `PLAN-3-05b.md` (second part — depends on 05a)

```xml
<!-- PLAN-3-05a.md -->
<task type="auto">
  <n>[Original name] — Part A: [foundation concern]</n>
  <persona>developer</persona>
  <phase>3</phase>
  <plan>05a</plan>
  <dependencies>04</dependencies>
  <decomposed_from>05</decomposed_from>
  <files>[subset of original files — foundation only]</files>
  <action>[Action for part A only — narrower scope]</action>
  <verify>[Verify step for part A specifically]</verify>
  <done>[Part A definition of done]</done>
</task>
```

Step 3 — Insert the new plans into the wave execution at the current position
Step 4 — Update dependency chain: any plan that depended on the original plan now depends on the second sub-plan (Xb). Xa gets original dependencies.
Step 5 — Write AUDIT: `{ "event": "node_decomposed", "original": "3-05", "into": ["3-05a", "3-05b"] }`
Step 6 — Execute 3-05a. If it succeeds, execute 3-05b.
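
Steps 3 and 4 above (insert the sub-plans in place, then rewire dependencies) can be sketched as a pure function over an in-memory plan list. The `{ id, dependencies }` shape, the `decomposedFrom` field name, and the `decomposeInPlace` helper are assumptions for illustration. The source only specifies the rule that Xa inherits the original dependencies and dependents move to Xb.

```javascript
// Hypothetical sketch of DECOMPOSE steps 3 and 4 on an ordered plan list.
function decomposeInPlace(plans, originalId) {
  const original = plans.find((p) => p.id === originalId);
  if (!original) throw new Error('Unknown plan: ' + originalId);
  const partA = originalId + 'a';
  const partB = originalId + 'b';

  const result = [];
  for (const plan of plans) {
    if (plan.id === originalId) {
      // Xa inherits the original dependencies; Xb waits on Xa.
      result.push({ id: partA, dependencies: original.dependencies.slice(), decomposedFrom: originalId });
      result.push({ id: partB, dependencies: [partA], decomposedFrom: originalId });
    } else {
      // Anything that depended on X now depends on Xb.
      const dependencies = plan.dependencies.map((d) => (d === originalId ? partB : d));
      result.push({ ...plan, dependencies });
    }
  }
  return result;
}

const before = [
  { id: '04', dependencies: [] },
  { id: '05', dependencies: ['04'] },
  { id: '06', dependencies: ['05'] },
];
console.log(decomposeInPlace(before, '05'));
```

Pointing dependents at Xb rather than Xa preserves the original ordering guarantee: downstream plans still start only after the whole of the original scope is done.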

**Budget:** Max 1 decomposition per original plan. If 3-05a also fails: PRUNE or ESCALATE.

---

### PRUNE — Skip and defer to a follow-up task

**When to use:**
- Plan is not on the critical path (other plans don't depend on it)
- RETRY and DECOMPOSE both failed
- Plan is a "nice-to-have" improvement, not core functionality

**How it works:**
1. Mark the plan as `status: PRUNED` in auto-state.json
2. Write to `.planning/phases/[N]/DEFERRED-ITEMS.md`:
   ```markdown
   # Deferred Items — Phase [N]

   ## PRUNED-[plan-id]: [task name]
   **Reason:** RETRY + DECOMPOSE both failed. Non-critical path.
   **Last error:** [error from final attempt]
   **Retry when:** [suggested condition — e.g., "after database schema is stable"]
   **Manual steps:** [what a human would need to do to complete this]
   ```
3. Log AUDIT: `{ "event": "node_pruned", "plan": "3-05", "reason": "..." }`
4. Send Slack notification if `AUTO_NOTIFY_ON_ESCALATION=true`:
   "⚠️ Auto mode pruned Plan 3-05 — non-critical, deferred to follow-up"
5. Continue auto mode with the next task

**Guard:** PRUNE only if no other plans declare `<dependencies>` on this plan (check wave DAG and physical PLAN files).
If other plans depend on this one: ESCALATE instead (cannot skip a critical path dependency).

---

### ESCALATE — Stop, save state, notify human

**When to use (ANY of these):**
- Tier 3 change detected (auth/payment/PII code — requires human compliance approval)
- CRITICAL security finding in compliance gates
- Plan is on critical path and RETRY + DECOMPOSE both failed
- Gate 3 violation (secrets detected in diff)
- Node repair budget exhausted (ALL RETRY and DECOMPOSE attempts failed)
- Timeout exceeded AND work in progress (clean timeout → exit 0, mid-task timeout → ESCALATE)
- Human explicitly requested via `CTRL+C` pause

**How it works:**
1. Stop execution immediately (do NOT start the next task)
2. `git stash` any uncommitted partial changes
3. Write comprehensive ESCALATION-[timestamp].md:
   ```markdown
   # Auto Mode Escalation — Phase [N]
   **Time:** [ISO-8601]
   **Trigger:** [exact escalation reason]
   **Last completed task:** Plan [N]-[MM] (commit: [sha])
   **Blocked on:** Plan [N]-[MM+1] — [task name]
   **Error details:** [full error output]
   **Required human action:** [exactly what needs to happen]
   **Resume command:** /mindforge:auto --phase [N] --resume
   ```
4. Update auto-state.json: `"status": "escalated"`
5. Update HANDOFF.json: `"next_task": "ESCALATED — see .planning/phases/[N]/ESCALATION-[ts].md"`
6. Write AUDIT: `{ "event": "auto_mode_escalated", "reason": "...", "resume_command": "..." }`
7. Send Slack notification with the ESCALATION.md content (if configured)
8. Exit auto mode with status message printed to terminal

## Repair decision tree

```
Task fails
    │
    ▼
Is this a Tier 3 governance trigger?
    YES → ESCALATE immediately (never auto-approve auth/payment/PII)
    NO
    │
    ▼
Is this attempt 1?
    YES → RETRY (inject error context)
    NO (retry also failed)
    │
    ▼
Is plan decomposable (2+ concerns)?
    YES → DECOMPOSE into sub-plans
    NO
    │
    ▼
Is plan on critical path (other plans depend on it)?
    YES → ESCALATE (cannot skip dependency)
    NO
    │
    ▼
PRUNE (defer to DEFERRED-ITEMS.md)
```
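
The decision tree above maps directly onto a small function. The sketch below follows the tree as drawn and nothing more: the field names (`tier3`, `attempt`, `concerns`, `hasDependents`) and the `chooseRepair` helper are assumptions, and it does not model the per-plan decomposition budget described earlier.

```javascript
// Hypothetical encoding of the repair decision tree, checked top to bottom.
function chooseRepair({ tier3, attempt, concerns, hasDependents }) {
  if (tier3) return 'ESCALATE';             // never auto-approve auth/payment/PII
  if (attempt === 1) return 'RETRY';        // first failure: retry with error context
  if (concerns >= 2) return 'DECOMPOSE';    // retry failed and the plan can be split
  return hasDependents ? 'ESCALATE' : 'PRUNE'; // critical path cannot be skipped
}

console.log(chooseRepair({ tier3: false, attempt: 1, concerns: 1, hasDependents: true }));
console.log(chooseRepair({ tier3: false, attempt: 2, concerns: 1, hasDependents: false }));
```

Ordering the checks this way means the governance test always wins, even on a first failure that would otherwise qualify for a cheap retry.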

## Repair AUDIT schema

Every repair action writes an AUDIT entry:

```json
{
  "id": "uuid",
  "timestamp": "ISO-8601",
  "event": "node_repair",
  "session_id": "auto-sess-uuid",
  "phase": 3,
  "plan": "05",
  "repair_type": "RETRY|DECOMPOSE|PRUNE|ESCALATE",
  "attempt_number": 2,
  "original_error": "[first 200 chars of error output]",
  "repair_outcome": "recovered|failed|deferred|escalated",
  "decomposed_into": ["05a", "05b"],
  "agent": "mindforge-auto-repair"
}
```
@@ -0,0 +1,58 @@
# MindForge v2 — Progress Reporter

## Purpose
Provide real-time visibility into autonomous execution. The reporter
translates the raw AUDIT.jsonl stream into a user-friendly terminal UI
and a persistent `AUTONOMOUS-REPORT.md` file.

---

## Terminal UI (Progress stream)

The progress stream uses standard TTY escape codes (via `ora` or similar)
to show current status without flooding the terminal.

```
MindForge v2 [PHASE 3: API Hardening]
──────────────────────────────────────────────────────────────────
[WAVE 2/3] [TASK 5/8]
[██████████████░░░░░░] 75% Complete

Current: Plan 3-05 — Implement JWT Refresh Logic
Status: Running (attempt 1) [elapsed: 1:42]
Token Est: 18,400

Latest:
  ✅ Plan 3-04 complete (abc1234)
  ✅ Gate 1-2 Passed
  ⚠️ Steering applied: "Use jsonwebtoken lib"
──────────────────────────────────────────────────────────────────
Use /mindforge:steer "..." to guide the agent mid-flight.
```
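
The bar line in the mock-up above could be produced by a helper along these lines. This is a sketch under assumptions: the `renderProgressBar` name, the 20-cell width, and deriving the percentage from completed versus total tasks are all illustrative, and the real reporter may compute progress differently.

```javascript
// Hypothetical helper: render a fixed-width bar from completed/total counts.
function renderProgressBar(completed, total, width = 20) {
  // Clamp to [0, 1] so bad inputs cannot produce a negative repeat count.
  const ratio = total > 0 ? Math.min(Math.max(completed / total, 0), 1) : 0;
  const filled = Math.round(ratio * width);
  const bar = '█'.repeat(filled) + '░'.repeat(width - filled);
  return `[${bar}] ${Math.round(ratio * 100)}% Complete`;
}

console.log(renderProgressBar(4, 8));
```

Rendering from the two counters in `auto-state.json` keeps the bar and the `[TASK n/m]` label consistent with each other, since both are derived from the same numbers.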

---

## Autonomous report generation

At the end of an auto-mode session (Success or Escalation), a markdown
report is generated in `.planning/phases/[N]/AUTONOMOUS-REPORT-[timestamp].md`.

### Content requirements:
1. **Summary**: Overall status, start/end time, total duration.
2. **Stats**: Tasks completed, tokens consumed (estimate), commits made.
3. **Audit log**: Filtered list of major events (starts, completions, repairs, gates).
4. **Repairs**: Detailed breakdown of any RETRY or DECOMPOSE actions.
5. **Steering**: List of applied steering instructions and their outcomes.
6. **Escalation**: If failed, clear instructions on why and how to resume.

---

## JSON Output (Stream)

For integration with other tools or web UIs, the progress reporter
can output line-delimited JSON messages to a separate file or pipe.

```json
{"type":"progress","phase":3,"wave":2,"task":5,"status":"running","ts":"ISO-8601"}
{"type":"event","id":"abc-123","event":"gate_passed","gate":2}
```
@@ -0,0 +1,64 @@
# MindForge v2 — Steering Manager

## Purpose
Allow a human to "steer" the autonomous agent without stopping it.
Steering guidance is injected at the next task boundary, allowing for
mid-course corrections, scope changes, or specific technical preferences.

---

## Steering protocol

### 1. Queuing guidance
Human uses `/mindforge:steer "[instruction]"` or writes to `.planning/steering-queue.jsonl`.
Instructions are timestamped and queued.

```json
{ "id": "steer-uuid", "timestamp": "ISO-8601", "instruction": "Use PostgreSQL instead of SQLite", "scope": "global|current_phase|current_task" }
```

### 2. Injection point
The Auto-Executor MUST check for queued steering guidance at
every task boundary (before starting a new subagent).

```javascript
// auto-executor injection logic
const guidance = await steeringManager.popPending();
if (guidance) {
  const currentPlan = await loadCurrentPlan();
  const modifiedPlan = applyGuidanceToPlan(currentPlan, guidance);
  await savePlan(modifiedPlan);
  writeAudit('steering_applied', guidance);
}
```

### 3. Application logic — HARDENED
When guidance is applied to a task:

- **Constraint:** Steering cannot override core security constraints (Gates 1-5).
- **Injection:** Guidance is added to the `<action>` field of the PLAN-N-MM.md,
  prefixed with `[STEERING GUIDANCE — DO NOT IGNORE]`.
- **Precedence:** Steering instruction takes precedence over the original
plan description if they conflict.
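
The injection rule above (guidance goes into the `<action>` field with a fixed prefix) can be sketched as a string transformation. This is not the package's `applyGuidanceToPlan`: the `injectGuidance` name and the choice to place the guidance at the start of the field are assumptions made for illustration, and only the prefix text is taken from the rule above.

```javascript
// Prefix text taken from the application rules above.
const STEERING_PREFIX = '[STEERING GUIDANCE — DO NOT IGNORE]';

// Hypothetical helper: prepend one steering instruction to the first
// <action> field of a plan's text, leaving the original action intact.
function injectGuidance(planText, instruction) {
  const openTag = '<action>';
  const start = planText.indexOf(openTag);
  if (start === -1) throw new Error('Plan has no <action> field');
  const insertAt = start + openTag.length;
  return (
    planText.slice(0, insertAt) +
    `${STEERING_PREFIX} ${instruction}\n` +
    planText.slice(insertAt)
  );
}

const planSnippet = '<task type="auto">\n  <action>Create the schema</action>\n</task>';
console.log(injectGuidance(planSnippet, 'Use PostgreSQL instead of SQLite'));
```

Putting the guidance ahead of the original action text keeps the original instructions visible for audit while making the steering the first thing the subagent reads.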

---

## Steering commands

### `/mindforge:steer "..."`
Adds global guidance to the queue. Available even if auto mode is not running.

### `/mindforge:steer --task 3-05 "..."`
Targeted guidance for a specific future task.

### `/mindforge:steer --cancel`
Clear the entire steering queue.

---

## Steering feedback loop

When steering is applied, the Progress Reporter (terminal UI) shows:
`🚢 Steering applied: "Use PostgreSQL instead of SQLite"`

The applied guidance is also recorded in `AUTONOMOUS-REPORT.md`.