loki-mode 7.41.5 → 7.43.0
- package/README.md +18 -1
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/app-runner.sh +174 -8
- package/autonomy/completion-council.sh +38 -16
- package/autonomy/hooks/migration-hooks.sh +131 -7
- package/autonomy/loki +66 -43
- package/autonomy/run.sh +73 -2
- package/dashboard/__init__.py +1 -1
- package/dashboard/server.py +102 -0
- package/dashboard/static/index.html +9 -9
- package/docs/INSTALLATION.md +70 -1
- package/events/bus.py +9 -6
- package/loki-ts/dist/loki.js +2 -2
- package/mcp/__init__.py +1 -1
- package/mcp/lsp_proxy.py +274 -89
- package/mcp/server.py +26 -2
- package/memory/vector_index.py +6 -1
- package/package.json +1 -1
- package/plugins/loki-mode/.claude-plugin/plugin.json +1 -1
- package/providers/codex.sh +21 -1
- package/references/core-workflow.md +7 -0
- package/references/quality-control.md +6 -0
- package/skills/agents.md +1 -0
package/README.md
CHANGED
|
@@ -29,7 +29,7 @@ _The free, source-available autonomous coding agent by [Autonomi](https://www.au
 - **Production quality built in** -- 11 quality gates (`skills/quality-gates.md`), blind 3-reviewer code review (`run.sh:run_code_review()`), anti-sycophancy checks
 - **Standalone verification: `loki verify`** -- Run Loki's deterministic gates (build, tests, static analysis, secret scan, dependency audit) against any branch or PR diff, including code written by other agents or humans. CI-ready exit codes (0 VERIFIED, 1 CONCERNS, 2 BLOCKED), machine-readable evidence at `.loki/verify/evidence.json`. Inconclusive evidence is never reported as VERIFIED (v7.27.0).
 - **Living spec and pre-build interrogation** -- `loki spec` locks a spec and detects drift deterministically (`spec.lock`, `drift-report.json`, and a `SPEC_DRIFT` finding in `loki verify` with CI exit codes), so you can tell when the build diverges from what was agreed. `loki grill` runs a Devil's-Advocate interrogation of the spec before you build, surfacing gaps and contradictions early (v7.28.0).
-- **Mid-flight model switching
+- **Mid-flight model switching** -- switch the model a live run uses from the dashboard (applies at the next iteration, current run only). A Fable tier lever exists in the CLI, dashboard, and override paths, but Claude Fable 5 is not yet available at the API, so selecting Fable currently collapses to Opus at every dispatch chokepoint and the `loki plan` quote reflects Opus accordingly. For every model lever (session pin, mid-flight override, architect pass) and every `LOKI_MAX_TIER` path, the `loki plan` quote, the dashboard's reported model, and the actual dispatched model agree, with the ceiling enforced (v7.31.0; Fable-to-Opus collapse v7.39.1).
 - **A calmer CLI** -- the help surface is ~20 grouped workflow entries instead of a 70-command wall; merged commands live on as aliases that forward byte-identically with a one-line stderr pointer, so no script breaks (v7.31.0).
 - **Guided first build: `loki quickstart`** -- four quick questions (setup check, one-line idea, template pick, plan review) and your build starts; pressing Enter through every step builds the sample Todo app. The plan step quotes the real cost/time estimate before anything is spent, and `loki demo` now confirms its estimate the same way. If no AI provider CLI is installed, Loki offers to install Claude Code (consent-gated, interactive terminals only) (v7.29.0).
 - **Live App Preview** -- The dashboard embeds the locally-running app in an iframe so you can interact with it immediately during a build. Use `loki preview` (alias `loki open`) to print the URL and open it in your browser. Local-first: no hosted service, no vendor lock (v7.24.0).
@@ -391,6 +391,23 @@ Run `loki --help` for all options. Full reference: [CLI Reference](wiki/CLI-Refe
 
 ---
 
+<details>
+<summary><strong>Configuration env vars (intelligent defaults, opt-out knobs)</strong></summary>
+
+Loki Mode's accuracy and autonomy behaviors are default-on. Each is an opt-out escape hatch, not a setting you have to discover. The most relevant knobs from the v7.41.x accuracy/autonomy hardening:
+
+| Env var | Default | Effect |
+|---------|---------|--------|
+| `LOKI_REVIEW_INCONCLUSIVE_BLOCK` | `1` | Blocks completion when a code-review round returns zero usable verdicts (an all-empty review proves nothing). Set `0` to record the inconclusive result without blocking. |
+| `LOKI_COMPLETION_TEST_CAPTURE` | `1` | Captures fresh test results before the verified-completion evidence gate evaluates. Set `0` to skip the pre-gate capture. |
+| `LOKI_AUTO_DOCS` | `true` | Generates the `.loki/docs/` suite before the documentation gate scores it (bounded: once per run when docs are missing, and again only when >10 commits stale). Set `false` to opt out. |
+| `LOKI_CAVEMAN` | `1` (on) | Output-token compressor for free-form generation only (never trust-gate subcalls). Set `0` to opt out. |
+| `LOKI_CAVEMAN_LEVEL` | inferred | Compression level for the compressor. Auto-inferred per invocation from the run's RARV tier; set explicitly (`lite` / `full` / `ultra`) to override the inference. |
+
+This is a subset. See the [wiki](wiki/Home.md) for the full env-var reference and the RARV-C closure knobs (`LOKI_INJECT_FINDINGS`, `LOKI_OVERRIDE_COUNCIL`, `LOKI_AUTO_LEARNINGS`, `LOKI_HANDOFF_MD`).
+
+</details>
+
 <details>
 <summary><strong>BMAD Method Integration</strong></summary>
 
|
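The opt-out knobs in the README table above can be read with a small helper. This is a minimal sketch, not Loki's own parsing: the helper name `knob_enabled` is hypothetical, and treating only the two documented opt-out spellings (`0` and `false`) as "off" is an assumption.

```python
import os

# Defaults copied from the README table; the helper itself is illustrative.
DEFAULTS = {
    "LOKI_REVIEW_INCONCLUSIVE_BLOCK": "1",
    "LOKI_COMPLETION_TEST_CAPTURE": "1",
    "LOKI_AUTO_DOCS": "true",
    "LOKI_CAVEMAN": "1",
}

# Assumption: only the documented opt-out spellings disable a knob.
OPT_OUT_VALUES = {"0", "false"}


def knob_enabled(name, env=None):
    """Return True unless the variable is set to a documented opt-out value."""
    env = os.environ if env is None else env
    value = env.get(name, DEFAULTS[name])
    return value.strip().lower() not in OPT_OUT_VALUES
```

For example, `knob_enabled("LOKI_CAVEMAN", {"LOKI_CAVEMAN": "0"})` is false, while an unset variable falls back to its default-on value.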
package/SKILL.md
CHANGED
|
@@ -3,7 +3,7 @@ name: loki-mode
 description: Autonomous spec-driven build system with a built-in trust layer. It does not call work done until it is verified (RARV-C closure loop, 11 quality gates, completion council, verified-completion evidence gate). Triggers on "Loki Mode". Takes a spec (PRD, GitHub issue, OpenAPI doc, etc.) to deployed product with minimal human intervention. Provider-agnostic. Requires --dangerously-skip-permissions flag.
 ---
 
-# Loki Mode v7.
+# Loki Mode v7.43.0
 
 **You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.**
 
@@ -398,4 +398,4 @@ See `CHANGELOG.md` entries [7.5.7], [7.5.8], [7.5.13] for the per-fix list and r
 
 ---
 
-**v7.
+**v7.43.0 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
package/VERSION
CHANGED
|
@@ -1 +1 @@
-7.
+7.43.0
package/autonomy/app-runner.sh
CHANGED
|
@@ -156,6 +156,112 @@ _rewrite_detection_port() {
     _write_detection "$d_type" "$d_command"
 }
 
+# Collect the transitive descendant tree of a PID (children, grandchildren, ...).
+#
+# Echoes one PID per line, deepest-LAST is NOT guaranteed; order is breadth-first
+# from the root. The root PID itself is NOT included. Used by the non-setsid stop
+# fallback (BUG 1): the app is started as `( ... ) &` WITHOUT setsid, so on stock
+# macOS the whole tree (subshell -> bash -lc -> npm -> sh -> node -> workers)
+# inherits the ORCHESTRATOR's process group. A `kill -- -PGID` would therefore
+# signal run.sh and the Claude agent driving it (self-termination), so we MUST
+# walk parent->child links from OUR pid only. This guarantees we never signal a
+# process outside our own subtree: every returned pid has our root as an ancestor.
+#
+# Snapshot semantics: the caller MUST collect the full tree BEFORE sending any
+# signal. If we TERM top-down while walking, grandchildren reparent to init and
+# `pgrep -P <dead-parent>` returns nothing, re-creating the orphaned-worker bug
+# this fix exists to close.
+_app_runner_collect_descendants() {
+    local root="$1"
+    # Guard against empty / init / kernel pids: walking from 0/1 would sweep
+    # unrelated processes. A valid app pid is always > 1.
+    case "$root" in
+        ''|0|1) return 0 ;;
+    esac
+    if ! [[ "$root" =~ ^[0-9]+$ ]]; then
+        return 0
+    fi
+
+    local -a frontier=("$root")
+    local -a found=()
+    local pid child
+    local -a kids
+    # Bound iterations defensively against a pathological/looping tree.
+    local guard=0
+    while [ "${#frontier[@]}" -gt 0 ] && [ "$guard" -lt 10000 ]; do
+        guard=$(( guard + 1 ))
+        pid="${frontier[0]}"
+        frontier=("${frontier[@]:1}")
+        # Direct children of pid.
+        kids=()
+        while IFS= read -r child; do
+            [ -n "$child" ] && kids+=("$child")
+        done < <(pgrep -P "$pid" 2>/dev/null)
+        local k
+        for k in "${kids[@]:-}"; do
+            [ -n "$k" ] || continue
+            found+=("$k")
+            frontier+=("$k")
+        done
+    done
+
+    local f
+    for f in "${found[@]:-}"; do
+        [ -n "$f" ] && printf '%s\n' "$f"
+    done
+}
+
+# Signal an EXPLICIT, pre-captured set of PIDs with a given signal.
+#
+# Usage: _app_runner_signal_pids <SIGNAL> <pid> [pid ...]
+#
+# Why an explicit list and not "(re-)walk from root": a worker that traps
+# SIGTERM (a Node server doing graceful shutdown is the textbook case) survives
+# the TERM phase while its intermediate ancestors (npm, sh) die. Once the
+# ancestors die, the surviving worker reparents to init, so re-deriving the tree
+# from the now-dead root via `pgrep -P` would return NOTHING -- the KILL phase
+# would be skipped and the orphaned, port-holding worker would live on. That is
+# exactly the orphaned-worker bug (BUG 1) resurfacing at the force-kill phase.
+# The fix: the caller snapshots root + all descendants ONCE before any signal,
+# and every phase (TERM, aliveness, KILL) operates over that frozen list.
+#
+# Safety: the caller builds the list from _app_runner_collect_descendants, which
+# only ever follows parent->child links from OUR pid, so the list can never
+# contain a process outside our own subtree. We signal pids individually (never
+# a process group) because in the non-setsid path the app inherits the
+# orchestrator's process group; a group signal would kill run.sh and the agent.
+# Pids are signaled in REVERSE capture order so descendants (captured after the
+# root) are signaled before the root.
+_app_runner_signal_pids() {
+    local sig="$1"; shift
+    local -a pids=("$@")
+    local i p
+    for (( i=${#pids[@]}-1; i>=0; i-- )); do
+        p="${pids[$i]}"
+        case "$p" in
+            ''|0|1) continue ;;
+        esac
+        kill "-${sig}" "$p" 2>/dev/null || true
+    done
+}
+
+# True (0) if ANY pid in the EXPLICIT pre-captured list is still alive.
+# Used by the non-setsid stop grace-wait so a deep worker that outlived the main
+# subshell does not let us fall through to "stopped" prematurely. Operates over
+# the frozen snapshot for the same reason _app_runner_signal_pids does.
+_app_runner_any_alive() {
+    local p
+    for p in "$@"; do
+        case "$p" in
+            ''|0|1) continue ;;
+        esac
+        if kill -0 "$p" 2>/dev/null; then
+            return 0
+        fi
+    done
+    return 1
+}
+
 # Fix #2 (finding #597): reconcile the recorded port with the port the app
 # ACTUALLY bound, using the listen line in app.log as the source of truth. This
 # corrects the dashboard Live Preview even when the app ignores PORT and picks
|
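The breadth-first walk in `_app_runner_collect_descendants` can be sketched outside the shell. The version below is an illustration under stated assumptions: `children_of` is a hypothetical callable standing in for `pgrep -P`, and `TABLE` is an invented process table used only to exercise it.

```python
from collections import deque


def collect_descendants(root, children_of):
    """Breadth-first walk of parent->child links; the root itself is excluded."""
    # Mirror the shell guard: never walk from an empty, 0, or 1 pid.
    if not isinstance(root, int) or root <= 1:
        return []
    found = []
    frontier = deque([root])
    steps = 0
    while frontier and steps < 10000:  # same defensive bound as the shell loop
        steps += 1
        pid = frontier.popleft()
        for child in children_of(pid):
            found.append(child)
            frontier.append(child)
    return found


# Hypothetical process table: subshell -> bash -> npm -> two node processes -> worker.
TABLE = {100: [101], 101: [102], 102: [103, 104], 103: [105]}
```

Calling `collect_descendants(100, lambda p: TABLE.get(p, []))` returns every pid below the root in breadth-first order, which is the list a caller would freeze before sending any signal.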
@@ -887,32 +993,82 @@ app_runner_stop() {
         fi
     fi
 
+    # BUG 1 fix: on the non-setsid fallback (the DEFAULT path on stock macOS,
+    # which has no setsid) capture the FULL process subtree -- root + every
+    # transitive descendant -- ONCE, BEFORE sending any signal. The old
+    # `pkill -TERM -P <pid>` reached only ONE level of children, so deep workers
+    # (npm -> sh -> node -> workers) holding the listening socket survived as
+    # orphans and kept the port bound, blocking the next start.
+    #
+    # Capturing once is load-bearing: a worker that traps SIGTERM survives the
+    # TERM phase while its intermediate ancestors die, then reparents to init.
+    # Re-deriving the tree from the now-dead root would return nothing and skip
+    # the KILL phase, leaving the port-holder alive. Every phase below (TERM,
+    # grace-wait, KILL) operates over this one frozen snapshot instead.
+    local -a _stop_snapshot=()
+    if [ "$_APP_RUNNER_HAS_SETSID" != true ]; then
+        _stop_snapshot=("$_APP_RUNNER_PID")
+        local _snap_d
+        while IFS= read -r _snap_d; do
+            [ -n "$_snap_d" ] && _stop_snapshot+=("$_snap_d")
+        done < <(_app_runner_collect_descendants "$_APP_RUNNER_PID")
+    fi
+
     # Send SIGTERM to process and children
     if [ "$_APP_RUNNER_HAS_SETSID" = true ]; then
+        # setsid path: the app is its own process group leader, so a group
+        # signal reaches the whole tree safely. Unchanged.
         kill -TERM "-$_APP_RUNNER_PID" 2>/dev/null || kill -TERM "$_APP_RUNNER_PID" 2>/dev/null || true
     else
-
-
+        # Group-kill is NOT used here: in this path the app inherits the
+        # orchestrator's process group, so a group signal would kill run.sh and
+        # the agent driving it. Signal the frozen snapshot, descendants first.
+        _app_runner_signal_pids TERM "${_stop_snapshot[@]}"
     fi
 
-    # Wait up to 5 seconds for graceful shutdown
+    # Wait up to 5 seconds for graceful shutdown. Key the wait on the WHOLE
+    # snapshot being alive (not just the main pid): a deep worker can outlive the
+    # main subshell, and treating the main pid's exit as "done" is exactly what
+    # let workers leak before. setsid path keeps the simpler main-pid check.
     local waited=0
     while [ "$waited" -lt 5 ]; do
-        if
-            break
+        if [ "$_APP_RUNNER_HAS_SETSID" = true ]; then
+            kill -0 "$_APP_RUNNER_PID" 2>/dev/null || break
+        else
+            _app_runner_any_alive "${_stop_snapshot[@]}" || break
         fi
         sleep 1
         waited=$(( waited + 1 ))
     done
 
     # Force kill if still running
-
+    local _still_alive=false
+    if [ "$_APP_RUNNER_HAS_SETSID" = true ]; then
+        kill -0 "$_APP_RUNNER_PID" 2>/dev/null && _still_alive=true
+    else
+        _app_runner_any_alive "${_stop_snapshot[@]}" && _still_alive=true
+    fi
+    if [ "$_still_alive" = true ]; then
         log_warn "App Runner: process did not stop gracefully, sending SIGKILL"
         if [ "$_APP_RUNNER_HAS_SETSID" = true ]; then
             kill -KILL "-$_APP_RUNNER_PID" 2>/dev/null || kill -KILL "$_APP_RUNNER_PID" 2>/dev/null || true
         else
-
-
+            # BUG 1 fix (KILL phase): SIGKILL the SAME frozen snapshot (root +
+            # all descendants captured pre-signal), so a TERM-trapping worker
+            # that reparented to init is still force-killed. SIGKILL cannot be
+            # trapped, so this is the terminal guarantee that no port-holder
+            # survives. The snapshot does the real work. The fresh walk below only
+            # adds anything while the root is still alive (a worker spawned during
+            # shutdown); once the root is dead it is empty and the snapshot covers.
+            _app_runner_signal_pids KILL "${_stop_snapshot[@]}"
+            local -a _kill_fresh=()
+            local _kf
+            while IFS= read -r _kf; do
+                [ -n "$_kf" ] && _kill_fresh+=("$_kf")
+            done < <(_app_runner_collect_descendants "$_APP_RUNNER_PID")
+            if [ "${#_kill_fresh[@]}" -gt 0 ]; then
+                _app_runner_signal_pids KILL "${_kill_fresh[@]}"
+            fi
         fi
     fi
 
|
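The three-phase stop over one frozen snapshot (TERM, grace-wait, KILL) can be modelled without real processes. This is a hedged sketch: `stop_with_snapshot` and the `FakeProcs` table are hypothetical names, the sketch polls instead of sleeping, and it omits the extra fresh walk the shell code performs in the KILL phase.

```python
def stop_with_snapshot(snapshot, send, is_alive, wait_rounds=5):
    """TERM, grace-wait, then KILL, all over one pre-captured pid list."""
    # Descendants were captured after the root, so reverse order signals them first.
    for pid in reversed(snapshot):
        send(pid, "TERM")
    for _ in range(wait_rounds):
        if not any(is_alive(pid) for pid in snapshot):
            return "stopped"
        # A real implementation would sleep here; the sketch only re-polls.
    for pid in reversed(snapshot):
        send(pid, "KILL")
    return "killed"


class FakeProcs:
    """Hypothetical process table; pids in `traps_term` ignore TERM."""

    def __init__(self, pids, traps_term):
        self.alive = set(pids)
        self.traps_term = set(traps_term)
        self.log = []

    def send(self, pid, sig):
        self.log.append((pid, sig))
        if sig == "KILL" or pid not in self.traps_term:
            self.alive.discard(pid)

    def is_alive(self, pid):
        return pid in self.alive
```

Because the KILL phase iterates the same list that was captured before the first signal, a worker that ignored TERM is still reached even after its ancestors have exited.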
@@ -1094,6 +1250,11 @@ app_runner_watchdog() {
     # it restarts the stack under the same crash-count circuit breaker.
     if [ "$_APP_RUNNER_IS_DOCKER" = true ] && echo "$_APP_RUNNER_METHOD" | grep -q "docker compose"; then
         if app_runner_health_check; then
+            # BUG 3 fix: the breaker is meant to fire on 5 CONSECUTIVE failures.
+            # A confirmed-healthy observation clears any accumulated count so a
+            # long-lived stack that recovered from a few transient blips is not
+            # tripped permanently on cumulative (non-consecutive) crashes.
+            _APP_RUNNER_CRASH_COUNT=0
             return 0
         fi
         _APP_RUNNER_CRASH_COUNT=$(( _APP_RUNNER_CRASH_COUNT + 1 ))
@@ -1125,6 +1286,11 @@ app_runner_watchdog() {
 
     # Process alive, nothing to do
     if kill -0 "$_APP_RUNNER_PID" 2>/dev/null; then
+        # BUG 3 fix: a confirmed-alive observation clears the accumulated crash
+        # count so the breaker fires only on 5 CONSECUTIVE deaths, not on 5
+        # cumulative crashes that were each successfully recovered over a long
+        # session (which would trip the breaker on a HEALTHY app).
+        _APP_RUNNER_CRASH_COUNT=0
         return 0
     fi
 
|
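The consecutive-failure rule from the BUG 3 fix can be written as a small counter. This is a sketch only: the class name `CrashBreaker` is hypothetical, and the `>=` comparison against the limit is an assumption, since the hunk shows the reset but not the threshold check.

```python
class CrashBreaker:
    """Trips after `limit` consecutive failed observations."""

    def __init__(self, limit=5):
        self.limit = limit
        self.count = 0

    def observe(self, healthy):
        """Record one watchdog observation; return True when the breaker trips."""
        if healthy:
            # A confirmed-healthy observation clears the accumulated count.
            self.count = 0
            return False
        self.count += 1
        return self.count >= self.limit
```

With the reset in place, four failures, one healthy check, and four more failures never trip the breaker; without it, the eight cumulative failures would.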
|
package/autonomy/completion-council.sh
CHANGED
|
@@ -710,8 +710,18 @@ print('true' if ratio > budget else 'false')
         ((member++))
     done
 
-    # Anti-sycophancy check: if unanimous APPROVE, run devil's advocate
+    # Anti-sycophancy check: if unanimous APPROVE, run devil's advocate.
+    #
+    # Audit-trail snapshots (these do NOT affect the live vote): capture whether
+    # the council was unanimous BEFORE the decrement below, and whether the DA
+    # actually fired and flipped the verdict. The transcript fields
+    # _ct_triggered/_ct_flipped used to be re-derived from approve_count AFTER
+    # this block decremented it, so on rounds where the DA fired AND flipped they
+    # were mis-recorded as false/false, corrupting the trust-metrics audit trail.
+    local _da_was_unanimous="false"
+    local _da_flipped="false"
     if [ $approve_count -eq $COUNCIL_SIZE ] && [ $COUNCIL_SIZE -ge 2 ]; then
+        _da_was_unanimous="true"
         log_warn "Unanimous approval detected - running anti-sycophancy check..."
         local contrarian_verdict
         contrarian_verdict=$(council_devils_advocate "$evidence_file" "$vote_dir")
@@ -731,6 +741,7 @@ print('true' if ratio > budget else 'false')
             log_warn "Overriding to require one more iteration for verification"
             approve_count=$((approve_count - 1))
             reject_count=$((reject_count + 1))
+            _da_flipped="true"
         fi
     fi
 
@@ -795,20 +806,18 @@ with open(state_file, 'w') as f:
             >/dev/null 2>&1 || true
     fi
 
-    # Write transcript for this council round (Path A: council_vote path)
+    # Write transcript for this council round (Path A: council_vote path).
+    #
+    # Drive contrarian_triggered/_flipped off the snapshots captured in the
+    # anti-sycophancy block ABOVE, not off the now-mutated approve_count. The DA
+    # fires exactly when the council was unanimous (_da_was_unanimous), and it
+    # flips exactly when it did not confirm the approval (_da_flipped). Re-deriving
+    # from approve_count was wrong because the flip path already decremented it,
+    # so triggered/flipped were both recorded as false on flip rounds.
     local _ct_outcome
     _ct_outcome=$([ $approve_count -ge $effective_threshold ] && echo "APPROVED" || echo "REJECTED")
-    local _ct_triggered="
-    local _ct_flipped="
-    if [ $approve_count -eq $COUNCIL_SIZE ] && [ $COUNCIL_SIZE -ge 2 ]; then
-        _ct_triggered="true"
-    fi
-    # contrarian_flipped: DA voted REJECT/CANNOT_VALIDATE causing approve_count drop
-    # Detect by checking if approve dropped from unanimous (COUNCIL_SIZE) to less
-    # We infer flip if triggered AND final approve < COUNCIL_SIZE
-    if [ "$_ct_triggered" = "true" ] && [ $approve_count -lt $COUNCIL_SIZE ]; then
-        _ct_flipped="true"
-    fi
+    local _ct_triggered="$_da_was_unanimous"
+    local _ct_flipped="$_da_flipped"
     council_write_transcript "${ITERATION_COUNT:-0}" "$_ct_outcome" "$_ct_triggered" "$_ct_flipped" "$effective_threshold"
 
     if [ $approve_count -ge $effective_threshold ]; then
|
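The audit-trail bug described in the comments above comes from reading a counter after it has been mutated. The sketch below contrasts the two approaches under stated assumptions: `council_round` and `rederived_flags` are hypothetical names, and the boolean `da_confirms` stands in for the devil's-advocate verdict check, which the hunks do not show.

```python
def council_round(approve, reject, council_size, da_confirms):
    """Return (approve, reject, triggered, flipped) for one council round."""
    # Snapshot the audit flags before the vote counts are mutated.
    was_unanimous = approve == council_size and council_size >= 2
    flipped = False
    if was_unanimous and not da_confirms:
        approve -= 1
        reject += 1
        flipped = True
    return approve, reject, was_unanimous, flipped


def rederived_flags(approve, council_size):
    """The old approach: derive both flags from the post-decrement count."""
    triggered = approve == council_size and council_size >= 2
    flipped = triggered and approve < council_size
    return triggered, flipped
```

On a flip round with three members, the snapshot version records triggered and flipped as true, while re-deriving from the decremented count of 2 records both as false.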
@@ -1510,8 +1519,17 @@ council_evidence_gate() {
         if committed_files=$(git diff --name-only "$base_sha" HEAD 2>/dev/null); then
             :
         else
-            # Base present but
-            #
+            # Base present but UNREACHABLE (e.g. shallow clone, history rewrite,
+            # or `git reset --hard` -- a documented live hazard). The diff vs the
+            # run-start SHA cannot be computed, so we can no longer prove that the
+            # committed-union diff is empty. Treat this as INCONCLUSIVE, not as
+            # positive empty-diff fabrication evidence: an agent that committed
+            # all its work leaves a clean working tree, and `git diff HEAD` would
+            # read empty -> a false BLOCK. We still fall back to the working-tree
+            # diff vs HEAD to capture any uncommitted work, but the empty-diff
+            # block is suppressed below via the diff_inconclusive guard.
+            diff_inconclusive="true"
+            diff_inconclusive_reason="base_unreachable"
             committed_files=$(git diff --name-only HEAD 2>/dev/null || echo "")
         fi
         unstaged_files=$(git diff --name-only HEAD 2>/dev/null || echo "")
@@ -1534,7 +1552,11 @@ council_evidence_gate() {
         else
             diff_files=0
         fi
-
+        # Only treat an empty union as positive fabrication evidence when the
+        # baseline was CONCLUSIVE. If the base SHA was unreachable (history
+        # rewrite / reset --hard), a clean committed tree yields an empty
+        # working-tree diff that must NOT read as empty-diff fabrication.
+        if [ "$diff_files" -eq 0 ] && [ "$diff_inconclusive" != "true" ]; then
             diff_fails="true"
         fi
     fi
|
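The guard added to `council_evidence_gate` reduces to one rule: an empty diff only counts against the run when the baseline could actually be computed. A minimal sketch, assuming a hypothetical function `evaluate_empty_diff` and a boolean `base_reachable` in place of the `git diff` exit status:

```python
def evaluate_empty_diff(diff_files, base_reachable):
    """Classify an evidence-gate diff result without over-reading an empty diff."""
    diff_inconclusive = not base_reachable
    return {
        # An empty diff fails the gate only when the baseline was conclusive.
        "diff_fails": diff_files == 0 and not diff_inconclusive,
        "diff_inconclusive": diff_inconclusive,
        "reason": "base_unreachable" if diff_inconclusive else None,
    }
```

So a clean tree after a history rewrite yields `diff_fails` false with reason `base_unreachable`, rather than a false block.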
|
package/autonomy/hooks/migration-hooks.sh
CHANGED
|
@@ -317,14 +317,38 @@ hook_pre_healing_modify() {
     if [[ -f "$heal_dir/friction-map.json" ]]; then
         local blocked
         blocked=$(python3 -c "
-import json, sys
+import json, os, sys
 file_path = sys.argv[1]
 strict = sys.argv[2] == 'true'
 with open(sys.argv[3]) as f:
     data = json.load(f)
+
+# Path-aware match (not raw substring 'in', which over-matched app.py against
+# myapp.py and under-matched src/foo.py against a foo.py:10 location). Friction
+# locations are formatted 'path:line' (or just 'path'); strip a trailing
+# ':<line>' then compare by basename and normalized path so the same file is
+# matched regardless of how it was referenced.
+def norm(p):
+    # Drop a trailing ':<line>' (and optional ':<col>') suffix from a location.
+    parts = p.rsplit(':', 1)
+    while len(parts) == 2 and parts[1].isdigit():
+        p = parts[0]
+        parts = p.rsplit(':', 1)
+    return p
+
+def matches(target, loc):
+    loc = norm(loc)
+    if not target or not loc:
+        return False
+    # Exact normalized-path match, or same basename. Basename equality is the
+    # path-aware replacement for substring containment.
+    if os.path.normpath(target) == os.path.normpath(loc):
+        return True
+    return os.path.basename(target) == os.path.basename(loc)
+
 for friction in data.get('frictions', []):
     loc = friction.get('location', '')
-    if file_path
+    if matches(file_path, loc):
         cls = friction.get('classification', 'unknown')
         safe = friction.get('safe_to_remove', False)
         if cls in ('business_rule', 'unknown') and not safe:
|
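The `norm` and `matches` helpers from the hunk above are restated here as a standalone module so the cases named in the comment can be checked directly. The logic follows the added lines; only the parameter names and docstrings are new.

```python
import os


def norm(location):
    """Strip trailing ':<line>' and ':<col>' suffixes from a friction location."""
    parts = location.rsplit(":", 1)
    while len(parts) == 2 and parts[1].isdigit():
        location = parts[0]
        parts = location.rsplit(":", 1)
    return location


def matches(target, location):
    """True on an exact normalized-path match or on equal basenames."""
    location = norm(location)
    if not target or not location:
        return False
    if os.path.normpath(target) == os.path.normpath(location):
        return True
    return os.path.basename(target) == os.path.basename(location)
```

`matches("src/foo.py", "foo.py:10")` is true and `matches("app.py", "myapp.py")` is false, which are the two cases the old substring test got wrong. One consequence of the basename rule is that same-named files in different directories, such as `lib/util.py` and `src/util.py:3`, also match.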
@@ -343,9 +367,101 @@ print('OK')
         fi
     fi
 
+    # Capture a pre-edit snapshot so post_healing_modify can revert ONLY the
+    # healing edit on test failure (not unrelated uncommitted changes, and not
+    # via git checkout which discards everything). Keyed by file path.
+    _heal_snapshot_save "$heal_dir" "$file_path"
+
+    return 0
+}
+
+# Snapshot path helper: maps a target file path to its snapshot blob location.
+# Uses a flat directory with the path's basename plus a hash of the full path
+# to avoid collisions between same-named files in different directories.
+_heal_snapshot_path() {
+    local heal_dir="$1"
+    local file_path="$2"
+    local key
+    key=$(printf '%s' "$file_path" | cksum | awk '{print $1"-"$2}')
+    printf '%s/snapshots/%s.%s' "$heal_dir" "$(basename "$file_path")" "$key"
+}
+
+# Save a pre-edit snapshot of file_path. If the file does not exist yet (the
+# healing edit will CREATE it), write a sentinel marker instead so the revert
+# path knows to remove the file rather than restore content.
+#
+# Pairing contract: hook_pre_healing_modify (which calls this) MUST run for a
+# file before hook_post_healing_modify reverts it. The snapshot is refreshed on
+# every pre call, so a post without a matching fresh pre could restore a stale
+# blob. On the success path the snapshot is intentionally left in place; the
+# next pre overwrites it.
+_heal_snapshot_save() {
+    local heal_dir="$1"
+    local file_path="$2"
+    [[ -z "$file_path" ]] && return 0
+    local snap_dir="$heal_dir/snapshots"
+    mkdir -p "$snap_dir" 2>/dev/null || return 0
+    local snap
+    snap=$(_heal_snapshot_path "$heal_dir" "$file_path")
+    if [[ -f "$file_path" ]]; then
+        cp "$file_path" "$snap" 2>/dev/null || return 0
+        rm -f "$snap.absent" 2>/dev/null || true
+    else
+        # File does not exist pre-edit: record an "absent" marker, drop any
+        # stale content snapshot.
+        rm -f "$snap" 2>/dev/null || true
+        : > "$snap.absent" 2>/dev/null || true
+    fi
     return 0
 }
 
+# Restore file_path from its pre-edit snapshot, reverting ONLY the healing edit.
+# Echoes an accurate human-readable message describing what actually happened
+# (content restored / healing-added file removed / could not revert). Returns 0
+# when the revert succeeded as reported, 1 when it could not be performed.
+_heal_snapshot_restore() {
+    local heal_dir="$1"
+    local file_path="$2"
+    if [[ -z "$file_path" ]]; then
+        echo "No file path given; nothing reverted."
+        return 1
+    fi
+    local snap
+    snap=$(_heal_snapshot_path "$heal_dir" "$file_path")
+
+    if [[ -f "$snap" ]]; then
+        # Pre-edit content snapshot exists: restore exactly that content, which
+        # preserves any unrelated uncommitted changes present before the edit.
+        if cp "$snap" "$file_path" 2>/dev/null; then
+            echo "Healing edit reverted to pre-edit snapshot."
+            return 0
+        fi
+        echo "Could not restore pre-edit snapshot for ${file_path}; file left as-is."
+        return 1
+    fi
+
+    if [[ -f "$snap.absent" ]]; then
+        # File did not exist pre-edit: the healing edit created it. Remove only
+        # that file, not unrelated state.
+        if [[ ! -e "$file_path" ]]; then
+            echo "Healing-added file ${file_path} no longer present; nothing to remove."
+            return 0
+        fi
+        if rm -f "$file_path" 2>/dev/null; then
+            echo "Healing-added file ${file_path} removed."
+            return 0
+        fi
+        echo "Could not remove healing-added file ${file_path}; file left as-is."
+        return 1
+    fi
+
+    # No snapshot was captured (pre_healing_modify did not run for this file).
+    # Be honest: do not claim a revert that did not happen, and do NOT fall back
+    # to a destructive git checkout.
+    echo "No pre-edit snapshot found for ${file_path}; could not revert (left as-is)."
+    return 1
+}
+
 # Hook: post_healing_modify - runs AFTER agent modifies a file in healing mode
 # Verifies characterization tests still pass after modification
 hook_post_healing_modify() {
|
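The save/restore pairing above, including the "absent" marker for files the healing edit creates, can be sketched as follows. This is an illustration, not the shipped helper: the function names are hypothetical, a SHA-1 prefix stands in for the `cksum`-based key, and the boolean return replaces the human-readable messages.

```python
import hashlib
import os
import shutil


def snapshot_path(heal_dir, file_path):
    # Assumption: a SHA-1 prefix stands in for the cksum key used by the shell helper.
    key = hashlib.sha1(file_path.encode()).hexdigest()[:12]
    return os.path.join(heal_dir, "snapshots", f"{os.path.basename(file_path)}.{key}")


def snapshot_save(heal_dir, file_path):
    """Capture pre-edit content, or an 'absent' marker if the file does not exist."""
    snap = snapshot_path(heal_dir, file_path)
    os.makedirs(os.path.dirname(snap), exist_ok=True)
    if os.path.isfile(file_path):
        shutil.copyfile(file_path, snap)
        if os.path.exists(snap + ".absent"):
            os.remove(snap + ".absent")
    else:
        # The edit will create the file: drop any stale blob and leave a marker.
        if os.path.exists(snap):
            os.remove(snap)
        open(snap + ".absent", "w").close()


def snapshot_restore(heal_dir, file_path):
    """Return True if the edit was reverted, False if no snapshot was captured."""
    snap = snapshot_path(heal_dir, file_path)
    if os.path.isfile(snap):
        shutil.copyfile(snap, file_path)
        return True
    if os.path.isfile(snap + ".absent"):
        if os.path.exists(file_path):
            os.remove(file_path)
        return True
    return False
```

Returning false when nothing was captured mirrors the shell helper's refusal to claim a revert that did not happen.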
@@ -384,9 +500,17 @@ hook_post_healing_modify() {
         test_output=$(cat "$test_result_file")
         rm -f "$test_result_file"
 
-        # Revert the
-
-
+        # Revert ONLY the healing edit using the pre-edit snapshot captured by
+        # hook_pre_healing_modify. Do NOT use `git checkout -- "$file_path"`:
+        # that discards ALL uncommitted changes to the file (not just the
+        # healing edit) and silently no-ops for an untracked file while still
+        # claiming the change was reverted. Report exactly what happened.
+        local revert_msg
+        # _heal_snapshot_restore returns nonzero when it could not revert; we
+        # surface the outcome via its message (recorded below) rather than a
+        # code, and must not let a nonzero return abort under set -e.
+        revert_msg=$(_heal_snapshot_restore "$heal_dir" "$file_path") || true
+        echo "HOOK_BLOCKED: Characterization tests failed after healing modification to ${file_path}. ${revert_msg}"
         echo "Test output: ${test_output}"
 
         # Record failure in failure-modes.json
@@ -404,12 +528,12 @@ data.setdefault('modes', []).append({
     'trigger': 'healing_modification',
     'file': sys.argv[2],
     'behavior': 'Characterization tests failed after modification',
-    'recovery':
+    'recovery': sys.argv[3],
     'is_intentional': False
 })
 with open(sys.argv[1], 'w') as f:
     json.dump(data, f, indent=2)
-" "$heal_dir/failure-modes.json" "$file_path" 2>/dev/null || true
+" "$heal_dir/failure-modes.json" "$file_path" "$revert_msg" 2>/dev/null || true
         fi
 
         return 1