loki-mode 7.19.1 → 7.19.2
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/completion-council.sh +282 -0
- package/autonomy/config.example.yaml +26 -0
- package/autonomy/run.sh +55 -0
- package/dashboard/__init__.py +1 -1
- package/docs/INSTALLATION.md +1 -1
- package/docs/UNCERTAINTY-ESCALATION-PLAN.md +396 -0
- package/loki-ts/dist/loki.js +2 -2
- package/mcp/__init__.py +1 -1
- package/package.json +1 -1
- package/skills/quality-gates.md +85 -0
package/SKILL.md
CHANGED
@@ -3,7 +3,7 @@ name: loki-mode
 description: Autonomous spec-to-product system. Triggers on "Loki Mode". Takes a spec (PRD, GitHub issue, OpenAPI doc, etc.) to deployed product via the RARV-C closure loop, with minimal human intervention. Provider-agnostic. Requires --dangerously-skip-permissions flag.
 ---
 
-# Loki Mode v7.19.
+# Loki Mode v7.19.2
 
 **You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.**
 
@@ -383,4 +383,4 @@ See `CHANGELOG.md` entries [7.5.7], [7.5.8], [7.5.13] for the per-fix list and r
 
 ---
 
-**v7.19.
+**v7.19.2 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
package/VERSION
CHANGED
@@ -1 +1 @@
-7.19.
+7.19.2
package/autonomy/completion-council.sh
CHANGED
@@ -28,6 +28,10 @@
 # LOKI_COUNCIL_CONVERGENCE_WINDOW - Iterations to track for convergence (default: 3)
 # LOKI_COUNCIL_STAGNATION_LIMIT - Max iterations with no git changes (default: 5)
 # LOKI_COUNCIL_DONE_SIGNAL_LIMIT - Max total done signals before force stop (default: 10)
+# LOKI_UNCERTAINTY_ESCALATION - Proactive stuck-escalation decision (default: 1; set 0 to disable, byte-identical)
+# LOKI_UNCERTAINTY_ROUNDS - Consecutive co-occurrence rounds before escalate (default: 2)
+# LOKI_UNCERTAINTY_NOCHANGE_MIN - Proxy 1 threshold on consecutive_no_change (default: COUNCIL_STAGNATION_LIMIT - 1)
+# LOKI_UNCERTAINTY_SPLIT_ROUNDS - Proxy 3 trailing split-round run length (default: 2)
 #
 # Usage:
 # source autonomy/completion-council.sh
@@ -221,6 +225,7 @@ council_track_iteration() {
     _COUNCIL_TOTAL_DONE_SIGNALS="$COUNCIL_TOTAL_DONE_SIGNALS" \
     _COUNCIL_ITERATION="${ITERATION_COUNT:-0}" \
     _COUNCIL_FILES_CHANGED="$files_changed" \
+    _COUNCIL_DIFF_HASH="$combined_hash" \
     python3 -c "
 import json, os
 state_file = os.environ['_COUNCIL_STATE_FILE']
@@ -234,6 +239,7 @@ state['done_signals'] = int(os.environ['_COUNCIL_DONE_SIGNALS'])
 state['total_done_signals'] = int(os.environ['_COUNCIL_TOTAL_DONE_SIGNALS'])
 state['last_track_iteration'] = int(os.environ['_COUNCIL_ITERATION'])
 state['files_changed'] = int(os.environ['_COUNCIL_FILES_CHANGED'])
+state['last_diff_hash'] = os.environ['_COUNCIL_DIFF_HASH']
 with open(state_file, 'w') as f:
     json.dump(state, f, indent=2)
 " || log_warn "Failed to update council tracking state"
@@ -263,6 +269,282 @@ council_circuit_breaker_triggered() {
     return 1
 }
 
+#===============================================================================
+# Uncertainty-Gated Escalation - pure stuck-detection DECISION function
+#
+# Returns 0 = escalate now, 1 = do not escalate. Reads ONLY persisted state
+# (the council state.json for the three proxies, plus its own uncertainty.json
+# for the ring buffer + co-occurrence streak + debounce flag). Mutates only its
+# own uncertainty.json (atomic temp+mv). Fires NO notifications and touches NO
+# PAUSE file: the run.sh action site interprets the return code and performs the
+# side effects. This keeps the function sourceable and testable in isolation.
+#
+# Three proxies (all read from state, no live shell vars, no git calls):
+#   P1 (no-change)    : state.json consecutive_no_change >= NOCHANGE_MIN
+#                       (default COUNCIL_STAGNATION_LIMIT - 1, i.e. approaching
+#                       the circuit-breaker limit).
+#   P2 (oscillation)  : current state.json last_diff_hash recurs at distance >= 2
+#                       in the ring buffer (A -> B -> A). Immediate repeat (A -> A)
+#                       is P1's territory and is excluded.
+#   P3 (council split): the trailing SPLIT_ROUNDS entries of state.json verdicts
+#                       are all result == "REJECTED" with approve >= 1.
+# Escalate iff >= 2 proxies are hot AND that has held for ROUNDS consecutive
+# rounds AND we have not already escalated this episode (debounce). Re-arm when
+# co-occurrence drops below 2 in any later round.
+#===============================================================================
+
+# Resolve the uncertainty.json path co-located with the council state root so a
+# sourced test (which sets COUNCIL_STATE_DIR to a throwaway dir) reads and writes
+# in that same throwaway dir, never the developer's real cwd. In production
+# COUNCIL_STATE_DIR is "${TARGET_DIR}/.loki/council", so its parent is the right
+# ".loki" and this lands at ".loki/state/uncertainty.json".
+_uncertainty_state_path() {
+    local base_dir="${COUNCIL_STATE_DIR:-${TARGET_DIR:-.}/.loki/council}"
+    local loki_root
+    loki_root="$(dirname "$base_dir")"
+    echo "$loki_root/state/uncertainty.json"
+}
+
+# Read uncertainty.json (or emit a default object if missing/corrupt) to stdout.
+_uncertainty_read_state() {
+    local file="$1"
+    _UNC_FILE="$file" python3 -c "
+import json, os
+f = os.environ['_UNC_FILE']
+default = {
+    'schema_version': '1.0.0',
+    'consecutive_co_occur': 0,
+    'escalated_episode': False,
+    'escalated_at_iteration': 0,
+    'diff_hash_ring': [],
+    'last_round_iteration': -1,
+    'last_proxies': {'p1': False, 'p2': False, 'p3': False},
+}
+try:
+    with open(f) as fh:
+        state = json.load(fh)
+    if not isinstance(state, dict):
+        state = {}
+except (json.JSONDecodeError, FileNotFoundError, OSError):
+    state = {}
+for k, v in default.items():
+    state.setdefault(k, v)
+print(json.dumps(state))
+" 2>/dev/null || echo '{}'
+}
+
+# Write a JSON string (read from _UNC_PAYLOAD) to uncertainty.json atomically
+# (temp + mv), mirroring evidence-block.json.
+_uncertainty_write_state() {
+    local file="$1"
+    local payload="$2"
+    local dir tmp
+    dir="$(dirname "$file")"
+    mkdir -p "$dir" 2>/dev/null || true
+    tmp="${file}.tmp.$$"
+    if _UNC_PAYLOAD="$payload" _UNC_TMP="$tmp" python3 -c "
+import json, os
+payload = os.environ['_UNC_PAYLOAD']
+tmp = os.environ['_UNC_TMP']
+state = json.loads(payload)
+with open(tmp, 'w') as fh:
+    json.dump(state, fh, indent=2)
+" 2>/dev/null; then
+        mv "$tmp" "$file" 2>/dev/null || { rm -f "$tmp" 2>/dev/null; return 1; }
+        return 0
+    fi
+    rm -f "$tmp" 2>/dev/null || true
+    return 1
+}
+
+uncertainty_should_escalate() {
+    # Knob first: opt-out is byte-identical to prior behavior. No read, no write,
+    # no state-file creation when disabled.
+    [ "${LOKI_UNCERTAINTY_ESCALATION:-1}" = "0" ] && return 1
+
+    # Tunable knobs (read inline; defaults documented in the env-var block).
+    local rounds_needed="${LOKI_UNCERTAINTY_ROUNDS:-2}"
+    local split_rounds="${LOKI_UNCERTAINTY_SPLIT_ROUNDS:-2}"
+    local nochange_min="${LOKI_UNCERTAINTY_NOCHANGE_MIN:-}"
+    if [ -z "$nochange_min" ]; then
+        nochange_min=$(( ${COUNCIL_STAGNATION_LIMIT:-5} - 1 ))
+        [ "$nochange_min" -lt 1 ] && nochange_min=1
+    fi
+    # Bounded constants.
+    local ring_size=6
+
+    # Resolve state file locations (council state root co-located).
+    local council_dir="${COUNCIL_STATE_DIR:-${TARGET_DIR:-.}/.loki/council}"
+    local state_json="$council_dir/state.json"
+    local unc_file
+    unc_file="$(_uncertainty_state_path)"
+    local iteration="${ITERATION_COUNT:-0}"
+
+    # Load prior uncertainty state.
+    local prior
+    prior="$(_uncertainty_read_state "$unc_file")"
+
+    # Compute the new state and decision entirely in python from persisted inputs.
+    # Echoes one line: "<rc> <new_json>" where rc is 0 (escalate) or 1 (no).
+    local result
+    result=$(_UNC_PRIOR="$prior" \
+        _UNC_STATE_JSON="$state_json" \
+        _UNC_ITERATION="$iteration" \
+        _UNC_ROUNDS="$rounds_needed" \
+        _UNC_SPLIT_ROUNDS="$split_rounds" \
+        _UNC_NOCHANGE_MIN="$nochange_min" \
+        _UNC_RING_SIZE="$ring_size" \
+        python3 -c "
+import json, os
+
+prior = json.loads(os.environ['_UNC_PRIOR'])
+iteration = int(os.environ['_UNC_ITERATION'])
+rounds_needed = int(os.environ['_UNC_ROUNDS'])
+split_rounds = int(os.environ['_UNC_SPLIT_ROUNDS'])
+nochange_min = int(os.environ['_UNC_NOCHANGE_MIN'])
+ring_size = int(os.environ['_UNC_RING_SIZE'])
+
+# Load council state.json (proxies). Missing/corrupt -> proxies cold.
+try:
+    with open(os.environ['_UNC_STATE_JSON']) as fh:
+        cstate = json.load(fh)
+    if not isinstance(cstate, dict):
+        cstate = {}
+except (json.JSONDecodeError, FileNotFoundError, OSError):
+    cstate = {}
+
+ring = prior.get('diff_hash_ring', [])
+if not isinstance(ring, list):
+    ring = []
+last_round = prior.get('last_round_iteration', -1)
+try:
+    last_round = int(last_round)
+except (TypeError, ValueError):
+    last_round = -1
+
+# Idempotency: a repeated call at the same iteration must not double-mutate.
+# Recompute proxies and re-emit the prior decision without pushing the ring or
+# advancing the streak again.
+same_round = (iteration == last_round)
+
+# --- Proxy 1: no-change approaching circuit breaker ---
+try:
+    no_change = int(cstate.get('consecutive_no_change', 0))
+except (TypeError, ValueError):
+    no_change = 0
+p1 = no_change >= nochange_min
+
+# --- Proxy 2: diff-hash recurrence at distance >= 2 (genuine oscillation) ---
+cur_hash = cstate.get('last_diff_hash', '')
+p2 = False
+if cur_hash:
+    # Genuine oscillation (A -> B -> A) requires TWO things:
+    # 1. cur_hash recurs in the ring excluding the most-recent entry
+    #    (distance >= 2; distance 1 immediate-repeat is P1's territory), AND
+    # 2. the most-recent ring entry (the previous round's hash) is DIFFERENT
+    #    from cur_hash, i.e. there is an intervening distinct hash.
+    # Without (2), pure stagnation (A, A, A, ...) fills the ring with the same
+    # hash and would falsely fire P2 from the SAME root condition as P1, letting
+    # a single condition (no-change) light two proxies and escalate alone. That
+    # contradicts the 2-of-3 independent-proxy safety guarantee. Requiring an
+    # intervening distinct hash keeps A,B,A hot and A,A,A cold.
+    prev_hash = ring[-1] if ring else ''
+    if prev_hash != cur_hash:
+        for h in ring[:-1]:
+            if h == cur_hash:
+                p2 = True
+                break
+
+# --- Proxy 3: persistent council split (trailing REJECTED with approve>=1) ---
+verdicts = cstate.get('verdicts', [])
+if not isinstance(verdicts, list):
+    verdicts = []
+split_run = 0
+for v in reversed(verdicts):
+    if not isinstance(v, dict):
+        break
+    try:
+        approve = int(v.get('approve', 0))
+    except (TypeError, ValueError):
+        approve = 0
+    if v.get('result') == 'REJECTED' and approve >= 1:
+        split_run += 1
+    else:
+        break
+p3 = split_run >= split_rounds
+
+hot_count = (1 if p1 else 0) + (1 if p2 else 0) + (1 if p3 else 0)
+co_occur = hot_count >= 2
+
+streak = prior.get('consecutive_co_occur', 0)
+try:
+    streak = int(streak)
+except (TypeError, ValueError):
+    streak = 0
+escalated_episode = bool(prior.get('escalated_episode', False))
+escalated_at = prior.get('escalated_at_iteration', 0)
+try:
+    escalated_at = int(escalated_at)
+except (TypeError, ValueError):
+    escalated_at = 0
+
+new_state = dict(prior)
+new_state['schema_version'] = prior.get('schema_version', '1.0.0')
+new_state['last_proxies'] = {'p1': p1, 'p2': p2, 'p3': p3}
+
+if same_round:
+    # No mutation of ring/streak; report no-escalate on the repeat call so we
+    # never fire twice for one round. Proxy snapshot is refreshed (harmless).
+    new_state['diff_hash_ring'] = ring
+    new_state['consecutive_co_occur'] = streak
+    new_state['escalated_episode'] = escalated_episode
+    new_state['escalated_at_iteration'] = escalated_at
+    new_state['last_round_iteration'] = last_round
+    rc = 1
+else:
+    # Advance the ring with this round's hash (bounded).
+    if cur_hash:
+        ring = ring + [cur_hash]
+        if len(ring) > ring_size:
+            ring = ring[-ring_size:]
+
+    if co_occur:
+        streak += 1
+    else:
+        # Re-arm on clear: a resolved episode may legitimately re-escalate later.
+        streak = 0
+        escalated_episode = False
+
+    rc = 1
+    if co_occur and streak >= rounds_needed and not escalated_episode:
+        rc = 0
+        escalated_episode = True
+        escalated_at = iteration
+
+    new_state['diff_hash_ring'] = ring
+    new_state['consecutive_co_occur'] = streak
+    new_state['escalated_episode'] = escalated_episode
+    new_state['escalated_at_iteration'] = escalated_at
+    new_state['last_round_iteration'] = iteration
+
+print(str(rc) + ' ' + json.dumps(new_state))
+" 2>/dev/null) || return 1
+
+    [ -z "$result" ] && return 1
+
+    local rc new_json
+    rc="${result%% *}"
+    new_json="${result#* }"
+
+    # Persist the new state atomically (failure to persist must not escalate).
+    _uncertainty_write_state "$unc_file" "$new_json" || return 1
+
+    case "$rc" in
+        0) return 0 ;;
+        *) return 1 ;;
+    esac
+}
+
 #===============================================================================
 # Council Voting - 3 independent reviewers check completion
 #===============================================================================
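The proxy 2 rule in the hunk above (A -> B -> A counts as oscillation, A -> A -> A does not) can be exercised outside the shell wrapper. The following is a minimal standalone sketch, not the shipped code: the function names `oscillation_hot` and `advance_ring` are hypothetical, and only the ring size of 6 and the two conditions are taken from the diff.

```python
from typing import List

RING_SIZE = 6  # same bound as the ring_size constant in the diff


def oscillation_hot(ring: List[str], cur_hash: str) -> bool:
    """Hypothetical helper: True for A -> B -> A, False for A -> A."""
    if not cur_hash:
        return False
    prev_hash = ring[-1] if ring else ""
    if prev_hash == cur_hash:
        # Immediate repeat is plain stagnation, which the no-change proxy owns.
        return False
    # Recurrence at distance >= 2: look at everything except the latest entry.
    return cur_hash in ring[:-1]


def advance_ring(ring: List[str], cur_hash: str, size: int = RING_SIZE) -> List[str]:
    """Hypothetical helper: append this round's hash and keep the last `size`."""
    if not cur_hash:
        return list(ring)
    return (ring + [cur_hash])[-size:]


if __name__ == "__main__":
    print(oscillation_hot(["A", "B"], "A"))  # revert to an earlier diff state
    print(oscillation_hot(["A", "A"], "A"))  # pure stagnation
```

As in the diff, the check runs against the ring as it stood before the current round's hash is appended, which is why the most recent entry is treated as the previous round.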
package/autonomy/config.example.yaml
CHANGED
@@ -80,6 +80,32 @@ completion:
   # Ignore ALL completion signals (runs forever)
   perpetual_mode: false
 
+  # Uncertainty-gated escalation (v7.19.2, default-on).
+  # When >=2 of 3 stuck-proxies (no-change counter, diff-hash oscillation,
+  # persistent council split) co-occur for `uncertainty.rounds` consecutive
+  # rounds, Loki escalates proactively via PAUSE + notification + handoff
+  # instead of silently burning iterations.
+  # IMPORTANT: when autonomy_mode is "perpetual" (the default), PAUSE is
+  # auto-cleared by the consumer so escalation degrades to notify-only; it
+  # does NOT halt the run. These are heuristics, not true metacognition.
+  #
+  # uncertainty:
+  #   # Master toggle. Set to 0 to disable (byte-identical when off).
+  #   escalation: 1
+  #
+  #   # Consecutive rounds where >=2 of 3 proxies must co-occur before
+  #   # escalating. Recommended range 2-3. Higher = less noise, later warning.
+  #   rounds: 2
+  #
+  #   # Proxy 1 threshold: consecutive_no_change must reach this value to mark
+  #   # the no-change proxy hot. Default is COUNCIL_STAGNATION_LIMIT - 1
+  #   # (one below the circuit-breaker limit). Leave unset to use the default.
+  #   # nochange_min: 4
+  #
+  #   # Proxy 3 threshold: trailing council verdicts that must be
+  #   # REJECTED-with-at-least-one-approver (split) to mark the split proxy hot.
+  #   split_rounds: 2
+
 #===============================================================================
 # Model & Routing Settings
 #===============================================================================
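The `rounds` setting described in the config comments interacts with the once-per-episode debounce, and the combination is easier to see as a small state machine. The sketch below is an illustration under assumptions, not the shipped implementation: `EscalationState` and `step` are hypothetical names, the state is held in memory rather than in `uncertainty.json`, and `step` returns `True` to mean "escalate" where the shell function returns rc 0.

```python
from dataclasses import dataclass


@dataclass
class EscalationState:
    """Hypothetical in-memory stand-in for the persisted uncertainty state."""
    streak: int = 0
    escalated_episode: bool = False
    last_round_iteration: int = -1


def step(state: EscalationState, hot_count: int, iteration: int,
         rounds_needed: int = 2) -> bool:
    """Advance one round; return True only when escalation should fire now."""
    if iteration == state.last_round_iteration:
        # Repeat call for the same iteration: never advance or fire twice.
        return False
    co_occur = hot_count >= 2  # at least two of the three proxies are hot
    if co_occur:
        state.streak += 1
    else:
        # Episode cleared: reset the streak and re-arm the debounce flag.
        state.streak = 0
        state.escalated_episode = False
    state.last_round_iteration = iteration
    if co_occur and state.streak >= rounds_needed and not state.escalated_episode:
        state.escalated_episode = True
        return True
    return False


if __name__ == "__main__":
    s = EscalationState()
    hot_counts = [2, 2, 3, 1, 2, 2]
    print([step(s, h, i) for i, h in enumerate(hot_counts, start=1)])
```

With the default of two rounds, a stuck episode fires once on its second consecutive co-occurring round, stays quiet while the episode continues, and can fire again only after a round with fewer than two hot proxies.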
package/autonomy/run.sh
CHANGED
@@ -124,6 +124,26 @@
 # LOKI_NOTIFICATIONS - Enable desktop notifications (default: true)
 # LOKI_NOTIFICATION_SOUND - Play sound with notifications (default: true)
 #
+# Uncertainty-Gated Escalation (v7.19.2, default-on):
+# LOKI_UNCERTAINTY_ESCALATION - Master on/off for proactive stuck-escalation (default: 1; set 0 to
+# disable; byte-identical when off). Decision lives in
+# completion-council.sh (uncertainty_should_escalate); action in run.sh.
+# NOTE: AUTONOMY_MODE defaults to "perpetual"; in perpetual mode PAUSE
+# is auto-cleared by check_human_intervention, so escalation degrades
+# to notify-only (notification fires, run does NOT halt).
+# LOKI_UNCERTAINTY_ROUNDS - Consecutive rounds where >=2 of 3 proxies must co-occur before
+# escalating (default: 2; recommended range 2-3). Debounces transient
+# noise: a single hot proxy never escalates alone.
+# LOKI_UNCERTAINTY_NOCHANGE_MIN - Proxy 1 threshold: consecutive_no_change value that marks p1 hot.
+# (default: COUNCIL_STAGNATION_LIMIT - 1, i.e. one below the circuit-
+# breaker limit so escalation fires before the breaker ends the run).
+# Floored at 1 at runtime.
+# LOKI_UNCERTAINTY_SPLIT_ROUNDS - Proxy 3 threshold: number of consecutive trailing council verdicts
+# that must be REJECTED-with-approver (split) to mark p3 hot
+# (default: 2). Between council votes p3 may be stale; it is always
+# fresh when proxy 1 is hot because proxy 1 hot forces a circuit-
+# breaker vote that refreshes verdicts.
+#
 # Human Intervention (Auto-Claude pattern):
 # PAUSE file: touch .loki/PAUSE - pauses after current session
 # HUMAN_INPUT.md: echo "instructions" > .loki/HUMAN_INPUT.md
@@ -286,6 +306,10 @@ parse_simple_yaml() {
     set_from_yaml "$file" "completion.council.check_interval" "LOKI_COUNCIL_CHECK_INTERVAL"
     set_from_yaml "$file" "completion.council.min_iterations" "LOKI_COUNCIL_MIN_ITERATIONS"
     set_from_yaml "$file" "completion.council.stagnation_limit" "LOKI_COUNCIL_STAGNATION_LIMIT"
+    set_from_yaml "$file" "completion.uncertainty.escalation" "LOKI_UNCERTAINTY_ESCALATION"
+    set_from_yaml "$file" "completion.uncertainty.rounds" "LOKI_UNCERTAINTY_ROUNDS"
+    set_from_yaml "$file" "completion.uncertainty.nochange_min" "LOKI_UNCERTAINTY_NOCHANGE_MIN"
+    set_from_yaml "$file" "completion.uncertainty.split_rounds" "LOKI_UNCERTAINTY_SPLIT_ROUNDS"
 
     # Model
     set_from_yaml "$file" "model.prompt_repetition" "LOKI_PROMPT_REPETITION"
@@ -428,6 +452,10 @@ parse_yaml_with_yq() {
         "completion.council.check_interval:LOKI_COUNCIL_CHECK_INTERVAL"
         "completion.council.min_iterations:LOKI_COUNCIL_MIN_ITERATIONS"
         "completion.council.stagnation_limit:LOKI_COUNCIL_STAGNATION_LIMIT"
+        "completion.uncertainty.escalation:LOKI_UNCERTAINTY_ESCALATION"
+        "completion.uncertainty.rounds:LOKI_UNCERTAINTY_ROUNDS"
+        "completion.uncertainty.nochange_min:LOKI_UNCERTAINTY_NOCHANGE_MIN"
+        "completion.uncertainty.split_rounds:LOKI_UNCERTAINTY_SPLIT_ROUNDS"
         "model.prompt_repetition:LOKI_PROMPT_REPETITION"
         "model.confidence_routing:LOKI_CONFIDENCE_ROUTING"
         "model.autonomy_mode:LOKI_AUTONOMY_MODE"
@@ -12390,6 +12418,33 @@ if __name__ == "__main__":
             council_track_iteration "$log_file"
         fi
 
+        # Uncertainty-gated escalation (v7.19.2, Slice B action).
+        # The decision lives in completion-council.sh:uncertainty_should_escalate
+        # (pure, debounced once-per-stuck-episode, knob-first on
+        # LOKI_UNCERTAINTY_ESCALATION). This block only ACTS when the function
+        # returns rc 0. The type guard keeps it a silent no-op if the decision
+        # function is not present (byte-identical when the feature is absent/off).
+        if type uncertainty_should_escalate &>/dev/null && uncertainty_should_escalate; then
+            log_error "[Uncertainty] Escalating to human: >=2 of 3 stuck-signals co-occurred for N rounds (no-change / oscillation / council-split). PAUSE written; handoff saved."
+            log_warn "[Uncertainty] To opt out of proactive escalation: set LOKI_UNCERTAINTY_ESCALATION=0"
+            # Structured handoff doc before the bare PAUSE (mirrors GATE precedent).
+            write_structured_handoff "uncertainty_escalation"
+            notify_intervention_needed "Uncertainty escalation: >=2 of 3 stuck-signals co-occurred for N rounds"
+            # Marker file for dashboard / external consumers. Empty touch has no
+            # partial-write window, so atomic temp+mv is not required here.
+            mkdir -p "${TARGET_DIR:-.}/.loki/signals"
+            touch "${TARGET_DIR:-.}/.loki/signals/UNCERTAINTY_ESCALATION"
+            # PAUSE is consumed by check_human_intervention: it halts in
+            # non-perpetual mode; in perpetual mode it auto-clears + notifies.
+            # That degrade is free; we add no consumer logic here.
+            touch "${TARGET_DIR:-.}/.loki/PAUSE"
+            # Perpetual-mode honesty: detect with the SAME vars the existing PAUSE
+            # consumer uses (run.sh check_human_intervention), print-only.
+            if [ "$AUTONOMY_MODE" = "perpetual" ] || [ "$PERPETUAL_MODE" = "true" ]; then
+                log_warn "[Uncertainty] Perpetual mode: PAUSE will be auto-cleared; this is notify-only and will NOT halt the run."
+            fi
+        fi
+
         # Check for success - ONLY stop on explicit completion promise
         # There's never a "complete" product - always improvements, bugs, features
         if [ $exit_code -eq 0 ]; then
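The `LOKI_UNCERTAINTY_SPLIT_ROUNDS` knob documented in the run.sh header counts only the trailing run of "split" verdicts, so a unanimous rejection at the end of the history breaks the run. A minimal sketch of that counting rule follows, assuming the `verdicts` entries carry the `result` and `approve` keys shown in the diff; `split_run_length` and `split_hot` are hypothetical names used only for illustration.

```python
from typing import Any, List


def split_run_length(verdicts: List[Any]) -> int:
    """Count trailing verdicts that are REJECTED with at least one approver."""
    run = 0
    for v in reversed(verdicts):
        if not isinstance(v, dict):
            break
        try:
            approve = int(v.get("approve", 0))
        except (TypeError, ValueError):
            approve = 0
        if v.get("result") == "REJECTED" and approve >= 1:
            run += 1
        else:
            # An approval or a unanimous rejection ends the trailing run.
            break
    return run


def split_hot(verdicts: List[Any], split_rounds: int = 2) -> bool:
    """Hypothetical helper: proxy 3 is hot once the trailing run is long enough."""
    return split_run_length(verdicts) >= split_rounds


if __name__ == "__main__":
    history = [
        {"result": "REJECTED", "approve": 0},  # unanimous rejection, not a split
        {"result": "REJECTED", "approve": 1},
        {"result": "REJECTED", "approve": 2},
    ]
    print(split_run_length(history), split_hot(history))
```

Because the scan stops at the first non-split entry from the end, older split rounds earlier in the history do not contribute once the council has either approved or rejected unanimously.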
package/dashboard/__init__.py
CHANGED
package/docs/INSTALLATION.md
CHANGED
package/docs/UNCERTAINTY-ESCALATION-PLAN.md
ADDED
@@ -0,0 +1,396 @@
+# Uncertainty-Gated Escalation (Loki Mode v7.19.2)
+
+Design only. No implementation code lands with this document. Every hook point
+below was read from live source; line numbers drift, so the verified anchors in
+section 1 are the contract a dev re-confirms before editing.
+
+## Goal
+
+When Loki is likely stuck or thrashing, escalate to the human PROACTIVELY via
+the EXISTING pause + notify + handoff machinery, instead of silently burning
+iterations until max-iterations. No new metacognition: reuse three proxy signals
+that already exist. Escalate only when at least two of the three co-occur for N
+consecutive rounds. Default-on, opt-out with LOKI_UNCERTAINTY_ESCALATION=0,
+byte-identical when off.
+
+## Architectural spine: split DECISION from ACTION
+
+This is the load-bearing decision and everything else falls out of it.
+
+- DECISION: a pure-ish function `uncertainty_should_escalate` lives in
+  `autonomy/completion-council.sh` next to the other `council_*` state helpers.
+  It reads ONLY persisted state (`state.json`, `convergence.log`, and its own
+  `.loki/state/uncertainty.json`), mutates only its own state file, and returns
+  rc 0 (escalate now) / rc 1 (do not). It fires NO notifications and touches NO
+  PAUSE file. This makes it sourceable and testable exactly like
+  `council_evidence_gate` (completion-council.sh:907): a test writes a fake
+  `state.json` into a throwaway dir and asserts the return code, with zero real
+  side effects on the developer's machine.
+- ACTION: the run.sh call site (new region right after
+  `council_track_iteration`, run.sh:12389-12391) interprets rc 0 and performs
+  the side effects: loud terminal line, `write_structured_handoff`,
+  `notify_intervention_needed`, write a `signals/UNCERTAINTY_ESCALATION` marker,
+  and `touch .loki/PAUSE`. It also emits the perpetual-mode honesty line.
+
+Consequence: the two code slices live in DIFFERENT files (decision in
+completion-council.sh, action in run.sh), so the dev fleet can build them in
+parallel without collision.
+
+---
+
+## 1. Verified hook points (read from live source)
+
+All paths relative to repo root `/Users/lokesh/git/loki-mode`.
+
+### Proxy 1 - circuit-breaker no-change counter
+- Var declared: `autonomy/completion-council.sh:70` (`COUNCIL_CONSECUTIVE_NO_CHANGE=0`).
+- Incremented: `completion-council.sh:178`; reset: `:180`. Driven by a combined
+  hash of `git diff --stat HEAD` + staged diff + last commit hash
+  (`:165-182`).
+- Limit knob: `COUNCIL_STAGNATION_LIMIT` (`:56`, default 5).
+- Persisted: written into `state.json` as `consecutive_no_change`
+  (`completion-council.sh:232` -> `:237` json.dump). THIS is what the decision
+  function reads (not the live shell var, which is out of scope in a sourced
+  test).
+- Updated every iteration via `council_track_iteration` (run.sh:12390).
+
+### Proxy 2 - file-churn oscillation / reverts
+- Existing data: `convergence.log` is appended at
+  `completion-council.sh:215` with line format
+  `timestamp|iteration|files_changed|consecutive_no_change|done_signals`.
+  CRITICAL: `files_changed` (`:208`) is a COUNT
+  (`git diff --name-only HEAD | wc -l`), NOT file identities. A count cannot
+  detect "same files back and forth."
+- The combined diff hash exists at `completion-council.sh:175`
+  (`combined_hash`), persisted only transiently in the shell var
+  `COUNCIL_LAST_DIFF_HASH` (`:73`, `:182`) - immediate-repeat only.
+- DECISION (see section 5 limits): proxy 2 is implemented as DIFF-HASH
+  RECURRENCE-AT-DISTANCE. We persist a small ring buffer (last
+  ~6 hashes) of `combined_hash` in `uncertainty.json`. Proxy 2 fires when the
+  current hash equals a hash seen 2+ rounds back (A -> B -> A pattern). The
+  immediate repeat (A -> A) is already proxy 1, so recurrence-at-distance is the
+  genuine oscillation/revert signal. This is a tiny, justified addition (one
+  bounded array in an existing JSON file), NOT heavy new tracking. The hash to
+  read is the same `combined_hash` proxy 1 already computes; the decision
+  function recomputes it cheaply from `git diff --stat HEAD` or, preferably,
+  `council_track_iteration` writes it into `state.json` (`last_diff_hash`) so the
+  decision function stays pure (no git calls). See slice A for which.
+
+### Proxy 3 - persistent council split
+- approve_count computed in `council_vote` (`completion-council.sh:270`,
+  tallied `:388`, anti-sycophancy adjust `:417`).
+- effective_threshold: `completion-council.sh:293`
+  (`(COUNCIL_SIZE * 2 + 2) / 3`, the ceiling(2/3) formula).
+- Persisted: each council round appends to `state['verdicts']`
+  (`completion-council.sh:449-455`) with keys `iteration`, `timestamp`,
+  `approve`, `reject`, `result` (`APPROVED`/`REJECTED`). NOTE: threshold is NOT
+  stored. That is fine: `result == "REJECTED"` already encodes
+  `approve < threshold`. A split round = `result == "REJECTED" AND approve >= 1`
+  (council could not converge: at least one approver, still short of threshold).
+  Do NOT go looking for a stored threshold; it is not there by design.
+- CADENCE: `verdicts` only appends when the council actually VOTES, which is
+  every `COUNCIL_CHECK_INTERVAL` OR when the circuit breaker forces a vote
+  (`council_should_stop`, completion-council.sh:2045-2051; circuit check
+  :2039-2043). So proxy 3 is STALE between votes. This is acceptable because in
+  the stuck regime we care about, proxy 1 going hot
+  (`consecutive_no_change >= COUNCIL_STAGNATION_LIMIT`) is exactly what TRIPS the
+  circuit breaker (`council_circuit_breaker_triggered`,
+  completion-council.sh:252) and forces a council vote, which refreshes proxy 3.
+  Verified: `council_should_stop` sets `should_check=true` when
+  `circuit_triggered=true` (:2047-2048). Document the between-votes staleness as
+  a known limit (section 5).
+
+### notify_intervention_needed
+- `autonomy/run.sh:2328`. Signature: `notify_intervention_needed "$reason"`;
+  thin wrapper over `send_notification "Intervention Needed" "$reason"
+  "critical"`.
+
+### PAUSE consume / clear path (perpetual-mode crux)
+- Consumer: `check_human_intervention` (run.sh:12701), PAUSE branch
+  `:12708`.
+- Perpetual auto-clear: `:12711-12730`. In perpetual mode PAUSE is
+  auto-cleared (`:12727 rm -f`) and `notify_intervention_needed` STILL fires
+  (`:12726`). Only `BUDGET_EXCEEDED` (`:12712`) is carved out from
+  auto-clear.
+- Non-perpetual: PAUSE triggers `handle_pause` (run.sh:12842) and waits
+  (`:12732-12742`).
+- Consumed once per loop turn from the main loop: `check_human_intervention`
+  is called at run.sh:11528, return-code switch `:11530-11533`
+  (1 = restart loop, 2 = stop).
+- IMPLICATION: escalation only WRITES PAUSE. The existing consumer halts (or, in
+  perpetual mode, auto-clears + notifies). Perpetual degrade is therefore FREE -
+  no new consumer logic. We detect perpetual at OUR site using the same vars
+  (`AUTONOMY_MODE` / `PERPETUAL_MODE`, run.sh:12711) only to print the honest
+  "notify-only; PAUSE will not halt this run" line.
+
+### write_structured_handoff
+- `autonomy/run.sh:8816`. Verified single live definition (the
+  "active definition is below" comment at :8811 refers to
+  `load_handoff_context`, not a second handoff def; grep shows one
+  `write_structured_handoff()`). Signature:
+  `write_structured_handoff "$reason"`; writes
+  `.loki/memory/handoffs/<ts>.json` + `.md`.
+
+### Loop point for the escalation check
+- Slot the ACTION immediately AFTER `council_track_iteration` in the main loop:
+  run.sh:12388-12391. At this point proxy 1 and proxy 2 are freshly written for
+  this iteration, and proxy 3 is fresh exactly when it matters (circuit-forced
+  vote). This is BEFORE the completion-promise / council checks
+  (run.sh:12408+), so escalation is evaluated every iteration.
+
+### Mirror precedent (action shape)
+- Gate-escalation block run.sh:12308-12318 is the precedent to clone: write a
+  `signals/` marker (`:12310`), call a handoff hook with its own opt-out
+  (`:12314`), then `touch .loki/PAUSE` (`:12317`). Our action mirrors this with
|
|
145
|
+
`write_structured_handoff` + `notify_intervention_needed` +
|
|
146
|
+
`signals/UNCERTAINTY_ESCALATION` + `touch .loki/PAUSE`.
|
|
147
|
+
|
|
148
|
+
---
|
|
149
|
+
|
|
150
|
+
## 2. Escalation decision function design
|
|
151
|
+
|
|
152
|
+
### Inputs (all read from persisted state, no live shell vars)
|
|
153
|
+
1. `p1` = proxy 1 hot: from `state.json.consecutive_no_change`. Hot when
|
|
154
|
+
`>= LOKI_UNCERTAINTY_NOCHANGE_MIN` (default = `COUNCIL_STAGNATION_LIMIT` - 1,
|
|
155
|
+
i.e. "approaching circuit-breaker"). Reading slightly below the breaker limit
|
|
156
|
+
lets us escalate BEFORE the breaker forces an end-state.
|
|
157
|
+
2. `p2` = proxy 2 hot: diff-hash recurrence-at-distance. Hot when the current
|
|
158
|
+
`last_diff_hash` matches a hash at distance >= 2 in the ring buffer.
|
|
159
|
+
3. `p3` = proxy 3 hot: persistent split. Read the last `K` entries of
|
|
160
|
+
`state.json.verdicts`; count consecutive trailing rounds where
|
|
161
|
+
`result == "REJECTED" AND approve >= 1`. Hot when that run length
|
|
162
|
+
`>= LOKI_UNCERTAINTY_SPLIT_ROUNDS` (default 2).
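
To make input 2 concrete, here is a small sketch of the recurrence-at-distance check. It assumes the ring holds previous hashes only (oldest first, current hash not yet appended), so the newest ring entry sits at distance 1. The function name `hash_recurs_at_distance` is hypothetical; the plan does not name this helper.

```bash
# Hypothetical helper: is the current hash a recurrence at distance >= 2?
# $1 = current hash; remaining args = ring buffer, oldest first, NOT yet
# including the current hash. The newest ring entry is distance 1.
hash_recurs_at_distance() (
  current=$1
  shift
  total=$#
  idx=0
  for h in "$@"; do
    idx=$((idx + 1))
    distance=$((total - idx + 1))
    if [ "$distance" -ge 2 ] && [ "$h" = "$current" ]; then
      exit 0   # seen two or more rounds back: the A -> B -> A pattern
    fi
  done
  exit 1       # unseen, or only an immediate repeat (proxy 1's territory)
)

# Usage: ring is (A, B) and the current hash is A again.
if hash_recurs_at_distance A A B; then echo "p2 hot"; fi
```

Skipping distance 1 is what keeps an immediate repeat (A, A) from counting as oscillation.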
|
|
163
|
+
|
|
164
|
+
### Co-occurrence + N-round debounce
|
|
165
|
+
- Per round (= per iteration; "round" is defined as one main-loop iteration),
|
|
166
|
+
compute `hot_count = p1 + p2 + p3`.
|
|
167
|
+
- `co_occur = (hot_count >= 2)`.
|
|
168
|
+
- Maintain `consecutive_co_occur` in `uncertainty.json`:
|
|
169
|
+
- if `co_occur`: increment; else reset to 0.
|
|
170
|
+
- Escalate (rc 0) when `consecutive_co_occur >= LOKI_UNCERTAINTY_ROUNDS`
|
|
171
|
+
(the N knob, default 2; recommended range 2-3) AND not already escalated this
|
|
172
|
+
episode (debounce flag, below).
|
|
173
|
+
- A single noisy proxy can NEVER escalate alone (requires hot_count >= 2).
|
|
174
|
+
|
|
175
|
+
### Debounce (escalate once per stuck-episode)
|
|
176
|
+
- `uncertainty.json` carries `escalated_episode: true|false`.
|
|
177
|
+
- On escalate, set `escalated_episode = true` and record
|
|
178
|
+
`escalated_at_iteration`.
|
|
179
|
+
- Suppress re-fire while `escalated_episode == true`.
|
|
180
|
+
- RE-ARM (reset `escalated_episode = false` and `consecutive_co_occur = 0`) when
|
|
181
|
+
`co_occur` becomes false in any later round (a proxy cleared => the episode is
|
|
182
|
+
considered resolved; a new stuck episode may legitimately re-escalate). State
|
|
183
|
+
the reset condition explicitly so a dev does not "helpfully" keep it latched.
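
The co-occurrence counter, the debounce latch, and the re-arm rule above fit in one small state machine. The sketch below keeps the two fields in shell variables purely to stay self-contained; the plan persists them in `uncertainty.json`. The function name `uncertainty_round` and its 0/1 argument convention are hypothetical.

```bash
# In-memory sketch of the co-occurrence counter, debounce, and re-arm.
# The plan persists these fields in uncertainty.json; shell variables
# stand in here only to keep the example self-contained.
consecutive_co_occur=0
escalated_episode=0
ROUNDS=${LOKI_UNCERTAINTY_ROUNDS:-2}

# Hypothetical function: args are p1 p2 p3 as 0/1. Returns 0 to escalate.
uncertainty_round() {
  hot_count=$(( $1 + $2 + $3 ))
  if [ "$hot_count" -ge 2 ]; then
    consecutive_co_occur=$((consecutive_co_occur + 1))
  else
    # A proxy cleared: the episode is considered resolved, so re-arm.
    consecutive_co_occur=0
    escalated_episode=0
    return 1
  fi
  if [ "$consecutive_co_occur" -ge "$ROUNDS" ] && [ "$escalated_episode" -eq 0 ]; then
    escalated_episode=1   # latch: at most one escalation per episode
    return 0
  fi
  return 1
}
```

With the default N of 2, two consecutive rounds with any two proxies hot return 0 once; further hot rounds return 1 until a round with `hot_count < 2` resets both fields.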
|
|
184
|
+
|
|
185
|
+
### State persistence
|
|
186
|
+
- File: `.loki/state/uncertainty.json` (singular; the `uncertainty-*.json` glob
|
|
187
|
+
in the brief maps to this one file - keep it single to avoid an unbounded
|
|
188
|
+
directory). Schema:
|
|
189
|
+
```json
|
|
190
|
+
{
|
|
191
|
+
"schema_version": "1.0.0",
|
|
192
|
+
"consecutive_co_occur": 0,
|
|
193
|
+
"escalated_episode": false,
|
|
194
|
+
"escalated_at_iteration": 0,
|
|
195
|
+
"diff_hash_ring": ["<h>", "<h>", "..."],
|
|
196
|
+
"last_round_iteration": 0,
|
|
197
|
+
"last_proxies": {"p1": false, "p2": false, "p3": false}
|
|
198
|
+
}
|
|
199
|
+
```
|
|
200
|
+
- Ring buffer bounded to 6 entries (constant). All writes atomic temp+mv,
|
|
201
|
+
mirroring evidence-block.json (`completion-council.sh:1059-1086`).
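
For reference, the temp+mv pattern looks roughly like this. It is a generic sketch, not a copy of the evidence-block writer; the helper name `atomic_write` is hypothetical.

```bash
# Hypothetical helper: write $2 to path $1 atomically (temp file + mv).
# The temp file is created in the target directory so that mv is a
# same-filesystem rename rather than a cross-device copy.
atomic_write() {
  target=$1
  content=$2
  dir=$(dirname "$target")
  mkdir -p "$dir" || return 1
  tmp=$(mktemp "$dir/.tmp.XXXXXX") || return 1
  if printf '%s\n' "$content" > "$tmp" && mv -f "$tmp" "$target"; then
    return 0
  fi
  rm -f "$tmp"   # clean up the temp file on any failure
  return 1
}
```

A reader of the state file then sees either the old contents or the new contents, never a partially written file.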
|
|
202
|
+
|
|
203
|
+
### Knob-first byte-identical guard
|
|
204
|
+
First line of `uncertainty_should_escalate`, BEFORE any read or write:
|
|
205
|
+
```
|
|
206
|
+
[ "${LOKI_UNCERTAINTY_ESCALATION:-1}" = "0" ] && return 1
|
|
207
|
+
```
|
|
208
|
+
(rc 1 = do-not-escalate; mirrors `council_evidence_gate`'s knob-first guard at
|
|
209
|
+
completion-council.sh:909). When off: zero file reads, zero writes, zero state
|
|
210
|
+
file creation => byte-identical.
|
|
211
|
+
|
|
212
|
+
### Knobs summary (all opt-out / tunable, none required)
|
|
213
|
+
- `LOKI_UNCERTAINTY_ESCALATION` (default 1) - master on/off.
|
|
214
|
+
- `LOKI_UNCERTAINTY_ROUNDS` (default 2) - N consecutive co-occurrence rounds.
|
|
215
|
+
- `LOKI_UNCERTAINTY_NOCHANGE_MIN` (default `COUNCIL_STAGNATION_LIMIT - 1`) - p1
|
|
216
|
+
threshold.
|
|
217
|
+
- `LOKI_UNCERTAINTY_SPLIT_ROUNDS` (default 2) - p3 split run length.
|
|
218
|
+
|
|
219
|
+
---
|
|
220
|
+
|
|
221
|
+
## 3. Disjoint dev slices (parallel-safe)
|
|
222
|
+
|
|
223
|
+
Binding constraints for EVERY slice: no version bumps (do not touch VERSION /
|
|
224
|
+
CHANGELOG), no git commits, no emojis, no em-dashes or en-dashes (ASCII hyphen
|
|
225
|
+
only), atomic temp+mv for all state writes, knob-first opt-out where the slice
|
|
226
|
+
touches the hot loop.
|
|
227
|
+
|
|
228
|
+
### Slice A - decision function + state schema (completion-council.sh)
|
|
229
|
+
- Region: add `uncertainty_should_escalate` and a tiny
|
|
230
|
+
`_uncertainty_read_state` / `_uncertainty_write_state` pair near the other
|
|
231
|
+
`council_*` state helpers (after `council_circuit_breaker_triggered`,
|
|
232
|
+
i.e. around completion-council.sh:265, BEFORE `council_vote` at :270).
|
|
233
|
+
- Also add ONE line inside `council_track_iteration` to persist
|
|
234
|
+
`state['last_diff_hash'] = combined_hash` (extend the python block at
|
|
235
|
+
completion-council.sh:224-238 by adding the env var + one assignment) so the
|
|
236
|
+
decision function reads the hash from state.json and stays pure (no git in the
|
|
237
|
+
decision path). This is the only edit inside an existing function; keep it to a
|
|
238
|
+
single key add to minimize collision with run.sh slice.
|
|
239
|
+
- Owns: `.loki/state/uncertainty.json` schema, ring buffer, co-occurrence +
|
|
240
|
+
debounce logic, all four knobs' defaults.
|
|
241
|
+
- File-region disjoint from slice B (different file).
|
|
242
|
+
|
|
243
|
+
### Slice B - action + wiring (run.sh)
|
|
244
|
+
- Region: new block right after `council_track_iteration` call
|
|
245
|
+
(run.sh:12389-12391).
|
|
246
|
+
- Logic:
|
|
247
|
+
```
|
|
248
|
+
if type uncertainty_should_escalate >/dev/null 2>&1 && uncertainty_should_escalate; then
|
|
249
|
+
# loud line (section 6), write_structured_handoff "uncertainty_escalation",
|
|
250
|
+
# notify_intervention_needed, signals/UNCERTAINTY_ESCALATION marker,
|
|
251
|
+
# touch .loki/PAUSE, perpetual honesty line.
|
|
252
|
+
fi
|
|
253
|
+
```
|
|
254
|
+
- Clone the GATE_ESCALATION shape (run.sh:12308-12318) for marker + handoff +
|
|
255
|
+
touch ordering.
|
|
256
|
+
- Perpetual detection: read `AUTONOMY_MODE` / `PERPETUAL_MODE`
|
|
257
|
+
(same as run.sh:12711) ONLY to print the honest notify-only line.
|
|
258
|
+
- File-region disjoint from slices A, C, D.
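
A fleshed-out sketch of the slice B action, runnable on its own. Everything not named in this plan is an assumption: the wrapper name `uncertainty_escalation_action`, the stub bodies, the empty marker file, the reason string passed to notify, the step order (which follows the loud line, handoff, notify, marker, PAUSE sequence), and the exact perpetual test (the real condition lives at run.sh:12711).

```bash
# Stand-in stubs so the sketch runs alone; in run.sh these already exist.
log_error() { echo "ERROR: $*" >&2; }
log_warn()  { echo "WARN: $*" >&2; }
write_structured_handoff()   { echo "handoff: $1"; }
notify_intervention_needed() { echo "notify: $1"; }

# Hypothetical wrapper name; the plan inlines this block in the main loop.
uncertainty_escalation_action() {
  loki_dir=${1:-.loki}
  log_error "[Uncertainty] Escalating to human: >=2 of 3 stuck-signals co-occurred for N rounds (no-change / oscillation / council-split). PAUSE written; handoff saved."
  log_warn "[Uncertainty] To opt out of proactive escalation: set LOKI_UNCERTAINTY_ESCALATION=0"
  write_structured_handoff "uncertainty_escalation"
  notify_intervention_needed "uncertainty_escalation"   # reason string is a placeholder
  mkdir -p "$loki_dir/signals"
  : > "$loki_dir/signals/UNCERTAINTY_ESCALATION"        # empty marker; real contents not specified
  touch "$loki_dir/PAUSE"
  # Assumed perpetual test; the authoritative condition is at run.sh:12711.
  if [ "${AUTONOMY_MODE:-}" = "perpetual" ] || [ "${PERPETUAL_MODE:-}" = "true" ]; then
    log_warn "[Uncertainty] Perpetual mode: PAUSE will be auto-cleared; this is notify-only and will NOT halt the run."
  fi
}
```

In the main loop this would sit behind the `type ... && uncertainty_should_escalate` guard shown in the Logic block, so a missing decision function stays a silent no-op.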
|
|
259
|
+
|
|
260
|
+
### Slice C - tests (tests/test-uncertainty-escalation.sh)
|
|
261
|
+
- New file. Sources the real `uncertainty_should_escalate` from
|
|
262
|
+
completion-council.sh, stubs `log_*`, runs per-case throwaway dirs. Models
|
|
263
|
+
tests/test-evidence-gate.sh exactly. Asserts decision-only (no real notify /
|
|
264
|
+
no real PAUSE because it calls the DECISION function, not the run.sh action).
|
|
265
|
+
- File-region disjoint (new file).
|
|
266
|
+
|
|
267
|
+
### Slice D - docs + knob registration
|
|
268
|
+
- Register the four knobs in the config-comment block (the env-var doc region
|
|
269
|
+
around run.sh:91-128 and the yaml mapping near :282/:424) and
|
|
270
|
+
`autonomy/config.example.yaml`. Add a short section to the user-facing docs.
|
|
271
|
+
- Keep edits to comment / config blocks; do not touch the hot loop. If this
|
|
272
|
+
collides with slice B's run.sh edits, sequence D after B (the only soft
|
|
273
|
+
dependency). Otherwise fully disjoint.
|
|
274
|
+
|
|
275
|
+
Recommended parallelism: A, C, D in parallel; B after A's function signature is
|
|
276
|
+
agreed (C can mock the signature meanwhile). 4 slices, 3 files + 1 new test +
|
|
277
|
+
docs.
|
|
278
|
+
|
|
279
|
+
---
|
|
280
|
+
|
|
281
|
+
## 4. Test plan (model: tests/test-evidence-gate.sh)
|
|
282
|
+
|
|
283
|
+
Harness: source the real completion-council.sh with `log_*` stubbed; call
|
|
284
|
+
`uncertainty_should_escalate` inside per-case `mktemp -d` dirs, each writing its
|
|
285
|
+
own `.loki/state/uncertainty.json` + `.loki/council/state.json` +
|
|
286
|
+
`.loki/council/convergence.log`. Assert BOTH rc and the mutated
|
|
287
|
+
`uncertainty.json` side effects. Loud SKIP (exit 0) if the function is not yet
|
|
288
|
+
defined (mirrors evidence-gate's absent-impl banner). Each case sets
|
|
289
|
+
`COUNCIL_STATE_DIR` and `ITERATION_COUNT` explicitly.
|
|
290
|
+
|
|
291
|
+
Cases:
|
|
292
|
+
1. PROXY READ - p1 only hot: `consecutive_no_change` >= min, hash unique,
|
|
293
|
+
verdicts approved. Assert `last_proxies.p1 == true`, others false, rc 1
|
|
294
|
+
(NO escalate on 1 proxy). Proves proxy 1 is read.
|
|
295
|
+
2. PROXY READ - p2 only hot: write a recurrence-at-distance hash ring
|
|
296
|
+
(A,B,A), p1/p3 cold. Assert `p2 == true`, rc 1. Proves proxy 2 is read
|
|
297
|
+
from the ring. Also feed an immediate-repeat ring (A,A) and assert `p2 == false`.
|
|
298
|
+
3. PROXY READ - p3 only hot: verdicts trailing K = REJECTED with approve>=1 for
|
|
299
|
+
SPLIT_ROUNDS rounds. Assert `p3 == true`, rc 1. Proves proxy 3 reads
|
|
300
|
+
`result`/`approve` (and does NOT require a stored threshold).
|
|
301
|
+
4. CO-OCCURRENCE x N escalates: set p1 + p3 hot for N consecutive calls
|
|
302
|
+
(loop the function N times, advancing iteration). Assert rc 0 on the Nth
|
|
303
|
+
call, `escalated_episode == true`. Proves >=2-for-N escalates.
|
|
304
|
+
5. 1-PROXY-NEVER: keep only one proxy hot for many rounds. Assert rc 1 every
|
|
305
|
+
round, `escalated_episode == false`. Proves a single noisy proxy cannot
|
|
306
|
+
escalate.
|
|
307
|
+
6. DEBOUNCE (no re-fire): after case-4 escalation, call again with the SAME hot
|
|
308
|
+
proxies. Assert rc 1 (suppressed) while `escalated_episode == true`. Proves
|
|
309
|
+
escalate-once-per-episode.
|
|
310
|
+
7. RE-ARM: after escalation, feed one round with co_occur false (clear a proxy),
|
|
311
|
+
assert `escalated_episode == false` + `consecutive_co_occur == 0`; then feed
|
|
312
|
+
N hot rounds again, assert rc 0. Proves reset-on-clear and re-escalation of a
|
|
313
|
+
new episode.
|
|
314
|
+
8. OPT-OUT BYTE-IDENTICAL: `LOKI_UNCERTAINTY_ESCALATION=0`. Assert rc 1 AND that
|
|
315
|
+
`.loki/state/uncertainty.json` is NOT created / NOT modified (snapshot the
|
|
316
|
+
dir before/after; mtime + existence). Proves byte-identical when off.
|
|
317
|
+
9. PERPETUAL DEGRADE-TO-NOTIFY: this is a run.sh ACTION behavior, so test it as a
|
|
318
|
+
thin integration shim: stub `notify_intervention_needed`, `handle_pause`,
|
|
319
|
+
`handle_dashboard_crash` to record calls; set `AUTONOMY_MODE=perpetual`;
|
|
320
|
+
`touch .loki/PAUSE`; call the real `check_human_intervention`
|
|
321
|
+
(run.sh:12701). Assert PAUSE is auto-cleared AND notify was called (proves
|
|
322
|
+
the degrade path is the EXISTING consumer at run.sh:12725-12727, so escalation
|
|
323
|
+
degrades to notify-only under perpetual). This case sources run.sh's
|
|
324
|
+
`check_human_intervention` with its deps stubbed, or asserts via a focused
|
|
325
|
+
harness; if sourcing run.sh wholesale is impractical, assert the contract by
|
|
326
|
+
reading the consumer branch and documenting it as a code-path test.
|
|
327
|
+
|
|
328
|
+
All cases: throwaway git repos isolated via `GIT_CONFIG_GLOBAL=/dev/null`
|
|
329
|
+
(mirror test-evidence-gate.sh:107-115). Skip-not-fail on missing git/python3.
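
As an illustration of the harness shape, here is a sketch of case 8 (opt-out byte-identical). The `uncertainty_should_escalate` below is a stand-in reduced to the knob-first guard plus an unconditional state write, so the assertion has something to catch; the real test would source the function from completion-council.sh. The name `case_opt_out` is hypothetical, and this version checks file existence only, not mtime.

```bash
# Stand-in for the real function, reduced to the knob-first guard plus a
# state write, so the opt-out assertion has something to catch.
uncertainty_should_escalate() {
  [ "${LOKI_UNCERTAINTY_ESCALATION:-1}" = "0" ] && return 1
  mkdir -p .loki/state
  echo '{}' > .loki/state/uncertainty.json
  return 1
}

# Case 8 sketch: with the knob off, rc is 1 and no file appears.
# The function body is a subshell, so cd and the knob do not leak out.
case_opt_out() (
  work=$(mktemp -d) || exit 1
  cd "$work" || exit 1
  before=$(find . | sort)
  LOKI_UNCERTAINTY_ESCALATION=0
  rc=0
  uncertainty_should_escalate || rc=$?
  after=$(find . | sort)
  cd / && rm -rf "$work"
  [ "$rc" -eq 1 ] && [ "$before" = "$after" ]
)

if case_opt_out; then echo "PASS case 8"; else echo "FAIL case 8"; fi
```

Comparing a sorted file listing before and after the call catches any file the function creates while the knob is off.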
|
|
330
|
+
|
|
331
|
+
---
|
|
332
|
+
|
|
333
|
+
## 5. Honest limits
|
|
334
|
+
|
|
335
|
+
- PERPETUAL-MODE = NOTIFY-ONLY. If Loki runs in perpetual / auto-continue mode,
|
|
336
|
+
the existing consumer (`check_human_intervention`, run.sh:12725-12727)
|
|
337
|
+
auto-clears PAUSE and continues. Escalation therefore DEGRADES to a
|
|
338
|
+
notification (notify still fires) plus a handoff doc; it does NOT halt the run.
|
|
339
|
+
We detect this at the action site and print it honestly. We deliberately do
|
|
340
|
+
NOT add a no-auto-clear carve-out for our marker (the BUDGET_EXCEEDED carve-out
|
|
341
|
+
at run.sh:12712 shows it is technically possible) because that is scope creep
|
|
342
|
+
and would break "byte-identical when off." Out of scope for v7.19.2; candidate
|
|
343
|
+
follow-up.
|
|
344
|
+
- PROXY 2 IS COUNT-BLIND BY ORIGIN. `convergence.log` stores `files_changed` as
|
|
345
|
+
a count (completion-council.sh:208), not identities, so it cannot by itself see
|
|
346
|
+
"same files back and forth." We approximate oscillation with diff-hash
|
|
347
|
+
recurrence-at-distance, which catches A -> B -> A state cycling but CANNOT
|
|
348
|
+
distinguish a genuine revert from a coincidental return to an identical tree
|
|
349
|
+
state, and will MISS oscillation that changes content each pass (hash differs
|
|
350
|
+
every round). It is a heuristic, not a true revert detector.
|
|
351
|
+
- PROXY 3 STALENESS BETWEEN VOTES. The verdicts array only updates on actual
|
|
352
|
+
council votes (every `COUNCIL_CHECK_INTERVAL` or circuit-forced). Sampled every
|
|
353
|
+
iteration, p3 can be stale between votes. We rely on the circuit-breaker
|
|
354
|
+
coupling (proxy 1 hot forces a vote, refreshing p3) so p3 is fresh exactly in
|
|
355
|
+
the regime we escalate on; outside that regime p3 may lag by up to
|
|
356
|
+
`COUNCIL_CHECK_INTERVAL` iterations.
|
|
357
|
+
- PROXIES FALSE-FIRE AND MISS. All three are heuristics. A legitimately hard
|
|
358
|
+
refactor that produces no net diff for several rounds while the council
|
|
359
|
+
remains split can false-fire; a fast-thrashing failure that keeps changing
|
|
360
|
+
different files with shifting hashes can be missed. Requiring >=2 co-occurring
|
|
361
|
+
for N rounds reduces, but does not eliminate, false fires. The cost of a false
|
|
362
|
+
fire is bounded: one notification + one handoff + one PAUSE (auto-cleared in
|
|
363
|
+
perpetual), opt-out at the site.
|
|
364
|
+
- THESE ARE PROXIES, NOT TRUE METACOGNITION. The system does not know it is
|
|
365
|
+
stuck; it infers stuckness from three correlated symptoms of stuckness. There
|
|
366
|
+
is no model of confidence, no self-estimate of progress. This is intentional
|
|
367
|
+
(no new metacognition) and is the honest ceiling on what this feature can
|
|
368
|
+
claim.
|
|
369
|
+
|
|
370
|
+
---
|
|
371
|
+
|
|
372
|
+
## 6. Rails (the v7.19.1 evidence-gate rails, mirrored)
|
|
373
|
+
|
|
374
|
+
A default-on hook in the hot loop must be bounded, loud, and self-rescuing.
|
|
375
|
+
|
|
376
|
+
- BOUNDED: the decision function does O(1) work - reads two small JSON files,
|
|
377
|
+
scans the last K verdicts and a 6-entry ring. No git subprocess in the decision
|
|
378
|
+
path (hash comes from state.json via slice A's one-line add). No network. No
|
|
379
|
+
unbounded loop. Cannot hang. The action runs at most ONCE per stuck episode
|
|
380
|
+
(debounce), not every iteration.
|
|
381
|
+
- LOUD TERMINAL LINE at the escalation site (run.sh, slice B):
|
|
382
|
+
```
|
|
383
|
+
log_error "[Uncertainty] Escalating to human: >=2 of 3 stuck-signals co-occurred for N rounds (no-change / oscillation / council-split). PAUSE written; handoff saved."
|
|
384
|
+
log_warn "[Uncertainty] To opt out of proactive escalation: set LOKI_UNCERTAINTY_ESCALATION=0"
|
|
385
|
+
```
|
|
386
|
+
And, only when perpetual, the honesty line:
|
|
387
|
+
```
|
|
388
|
+
log_warn "[Uncertainty] Perpetual mode: PAUSE will be auto-cleared; this is notify-only and will NOT halt the run."
|
|
389
|
+
```
|
|
390
|
+
- OPT-OUT NAMED AT THE SITE: the opt-out env var is printed on the line above,
|
|
391
|
+
right where escalation happens, so a terminal user with no dashboard can
|
|
392
|
+
self-rescue in one step (mirrors completion-council.sh:1055).
|
|
393
|
+
- KNOB-FIRST: `LOKI_UNCERTAINTY_ESCALATION=0` short-circuits the decision
|
|
394
|
+
function before any read/write (section 2), and `type ... >/dev/null` guards
|
|
395
|
+
the run.sh call so an unbuilt function is a silent no-op. Byte-identical when
|
|
396
|
+
off, proven by test case 8.
|
package/loki-ts/dist/loki.js
CHANGED
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
// @bun
|
|
2
|
-
var f8=Object.defineProperty;var u8=($)=>$;function c8($,Q){this[$]=u8.bind(null,Q)}var g=($,Q)=>{for(var K in Q)f8($,K,{get:Q[K],enumerable:!0,configurable:!0,set:c8.bind(Q,K)})};var k=($,Q)=>()=>($&&(Q=$($=0)),Q);var X1=import.meta.require;var F$={};g(F$,{lokiDir:()=>P,homeLokiDir:()=>o1,findRepoRootForVersion:()=>d1,REPO_ROOT:()=>f});import{resolve as n,dirname as l1}from"path";import{fileURLToPath as p8}from"url";import{existsSync as L1}from"fs";import{homedir as l8}from"os";function d8(){let $=j$;for(let Q=0;Q<6;Q++){if(L1(n($,"VERSION"))&&L1(n($,"autonomy/run.sh")))return $;let K=l1($);if(K===$)break;$=K}return n(j$,"..","..","..")}function d1($){let Q=$;for(let K=0;K<6;K++){if(L1(n(Q,"VERSION"))&&L1(n(Q,"autonomy/run.sh")))return Q;let Z=l1(Q);if(Z===Q)break;Q=Z}return n($,"..","..","..")}function P(){return process.env.LOKI_DIR??n(process.cwd(),".loki")}function o1(){return n(l8(),".loki")}var j$,f;var y=k(()=>{j$=l1(p8(import.meta.url));f=d8()});import{readFileSync as o8}from"fs";import{resolve as n8,dirname as a8}from"path";import{fileURLToPath as s8}from"url";function k1(){if($1!==null)return $1;let $="7.19.
|
|
2
|
+
var f8=Object.defineProperty;var u8=($)=>$;function c8($,Q){this[$]=u8.bind(null,Q)}var g=($,Q)=>{for(var K in Q)f8($,K,{get:Q[K],enumerable:!0,configurable:!0,set:c8.bind(Q,K)})};var k=($,Q)=>()=>($&&(Q=$($=0)),Q);var X1=import.meta.require;var F$={};g(F$,{lokiDir:()=>P,homeLokiDir:()=>o1,findRepoRootForVersion:()=>d1,REPO_ROOT:()=>f});import{resolve as n,dirname as l1}from"path";import{fileURLToPath as p8}from"url";import{existsSync as L1}from"fs";import{homedir as l8}from"os";function d8(){let $=j$;for(let Q=0;Q<6;Q++){if(L1(n($,"VERSION"))&&L1(n($,"autonomy/run.sh")))return $;let K=l1($);if(K===$)break;$=K}return n(j$,"..","..","..")}function d1($){let Q=$;for(let K=0;K<6;K++){if(L1(n(Q,"VERSION"))&&L1(n(Q,"autonomy/run.sh")))return Q;let Z=l1(Q);if(Z===Q)break;Q=Z}return n($,"..","..","..")}function P(){return process.env.LOKI_DIR??n(process.cwd(),".loki")}function o1(){return n(l8(),".loki")}var j$,f;var y=k(()=>{j$=l1(p8(import.meta.url));f=d8()});import{readFileSync as o8}from"fs";import{resolve as n8,dirname as a8}from"path";import{fileURLToPath as s8}from"url";function k1(){if($1!==null)return $1;let $="7.19.2";if(typeof $==="string"&&$.length>0)return $1=$,$1;try{let Q=a8(s8(import.meta.url)),K=d1(Q);$1=o8(n8(K,"VERSION"),"utf-8").trim()}catch{$1="unknown"}return $1}var $1=null;var n1=k(()=>{y()});var E$={};g(E$,{runOrThrow:()=>t8,run:()=>j,commandVersion:()=>i8,commandExists:()=>v,ShellError:()=>a1});async function j($,Q={}){let K=Bun.spawn({cmd:[...$],stdout:"pipe",stderr:"pipe",env:Q.env?{...process.env,...Q.env}:process.env,cwd:Q.cwd}),Z,z;if(Q.timeoutMs&&Q.timeoutMs>0)Z=setTimeout(()=>{try{K.kill("SIGTERM")}catch{}z=setTimeout(()=>{try{K.kill("SIGKILL")}catch{}},2000)},Q.timeoutMs);try{let[H,X,q]=await Promise.all([new Response(K.stdout).text(),new Response(K.stderr).text(),K.exited]);return{stdout:H,stderr:X,exitCode:q}}finally{if(Z)clearTimeout(Z);if(z)clearTimeout(z)}}async function t8($,Q={}){let K=await j($,Q);if(K.exitCode!==0)throw new 
a1(`command failed (${K.exitCode}): ${$.join(" ")}`,K.exitCode,K.stdout,K.stderr);return K}async function v($){let Q=r8($),K=await j(["sh","-c",`command -v ${Q}`],{timeoutMs:5000});if(K.exitCode===0)return K.stdout.trim()||null;return null}function r8($){if(!/^[A-Za-z0-9._/-]+$/.test($))throw Error(`refused to shell-escape suspect token: ${$}`);return $}async function i8($,Q="--version"){if(!await v($))return null;let Z=await j([$,Q],{timeoutMs:5000});if(Z.exitCode!==0)return null;return((Z.stdout||Z.stderr).split(/\r?\n/)[0]?.trim()??"")||null}var a1;var d=k(()=>{a1=class a1 extends Error{message;exitCode;stdout;stderr;constructor($,Q,K,Z){super($);this.message=$;this.exitCode=Q;this.stdout=K;this.stderr=Z;this.name="ShellError"}}});function a($){return e8?"":$}var e8,T,N,w,ZK,_,R,h,J;var c=k(()=>{e8=(process.env.NO_COLOR??"").length>0;T=a("\x1B[0;31m"),N=a("\x1B[0;32m"),w=a("\x1B[1;33m"),ZK=a("\x1B[0;34m"),_=a("\x1B[0;36m"),R=a("\x1B[1m"),h=a("\x1B[2m"),J=a("\x1B[0m")});import{existsSync as U7}from"fs";async function Q1(){if(B1!==void 0)return B1;let $="/opt/homebrew/bin/python3.12";if(U7($))return B1=$,$;let Q=await v("python3.12");if(Q)return B1=Q,Q;let K=await v("python3");return B1=K,K}async function K1($,Q={}){let K=await Q1();if(!K)return{stdout:"",stderr:"python3 not found",exitCode:127};return j([K,"-c",$],Q)}var B1;var H1=k(()=>{d()});var d$={};g(d$,{runStatus:()=>N7});import{existsSync as b,readFileSync as q1,readdirSync as v$,statSync as f$}from"fs";import{resolve as D,basename as P7}from"path";import{homedir as L7}from"os";async function j7(){if(await v("jq"))return!0;return process.stdout.write(`${T}Error: jq is required but not installed.${J}
|
|
3
3
|
`),process.stdout.write(`Install with:
|
|
4
4
|
`),process.stdout.write(` brew install jq (macOS)
|
|
5
5
|
`),process.stdout.write(` apt install jq (Debian/Ubuntu)
|
|
@@ -787,4 +787,4 @@ Set LOKI_LEGACY_BASH=1 to force the bash CLI for every command.
|
|
|
787
787
|
`),2}default:return process.stderr.write(`Unknown command: ${Q}
|
|
788
788
|
`),process.stderr.write(v8),2}}g$();process.on("SIGINT",()=>process.exit(130));process.on("SIGTERM",()=>process.exit(143));var p3=await c3(Bun.argv.slice(2));process.exit(p3);
|
|
789
789
|
|
|
790
|
-
//# debugId=
|
|
790
|
+
//# debugId=A2F8B15FD75062F064756E2164756E21
|
package/mcp/__init__.py
CHANGED
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "loki-mode",
|
|
3
|
-
"version": "7.19.
|
|
3
|
+
"version": "7.19.2",
|
|
4
4
|
"description": "Loki Mode by Autonomi. Autonomous spec-to-product system: takes a PRD, GitHub issue, OpenAPI/JSON/YAML, or one-line brief to a deployed app via the RARV-C closure loop with 11 quality gates. Provider-agnostic (Claude Code, OpenAI Codex, Cline, Aider).",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"agent",
|
package/skills/quality-gates.md
CHANGED
|
@@ -202,6 +202,91 @@ crash via the primitive's `finally` cleanup.
|
|
|
202
202
|
|
|
203
203
|
---
|
|
204
204
|
|
|
205
|
+
## Uncertainty-gated escalation (v7.19.2, default-on)
|
|
206
|
+
|
|
207
|
+
When Loki is likely stuck or thrashing, it escalates proactively to the human
|
|
208
|
+
via the existing PAUSE + notification + handoff machinery, rather than silently
|
|
209
|
+
burning iterations until max-iterations. No new metacognition: the system
|
|
210
|
+
reuses three proxy signals that already exist and escalates only when at least
|
|
211
|
+
two of the three co-occur for N consecutive rounds.
|
|
212
|
+
|
|
213
|
+
### Trigger condition
|
|
214
|
+
|
|
215
|
+
Three proxy signals are evaluated each iteration:
|
|
216
|
+
|
|
217
|
+
- **Proxy 1 (no-change counter):** `consecutive_no_change` in council state.json
|
|
218
|
+
reaches `LOKI_UNCERTAINTY_NOCHANGE_MIN` (default: `COUNCIL_STAGNATION_LIMIT - 1`,
|
|
219
|
+
i.e. one below the circuit-breaker limit so escalation fires before the
|
|
220
|
+
breaker ends the run).
|
|
221
|
+
- **Proxy 2 (diff-hash oscillation):** the current iteration's combined diff
|
|
222
|
+
hash matches a hash seen 2+ rounds back in a bounded ring buffer (A -> B -> A
|
|
223
|
+
pattern). Detects oscillation/revert cycling; does not fire on the trivial
|
|
224
|
+
immediate-repeat case which proxy 1 already covers.
|
|
225
|
+
- **Proxy 3 (persistent council split):** the last `LOKI_UNCERTAINTY_SPLIT_ROUNDS`
|
|
226
|
+
consecutive council verdicts are all REJECTED-with-at-least-one-approver
|
|
227
|
+
(split verdict). Stale between council votes; fresh exactly when proxy 1 is
|
|
228
|
+
hot, because proxy 1 hot forces a circuit-breaker vote that refreshes verdicts.
|
|
229
|
+
|
|
230
|
+
Escalation fires when `hot_count >= 2` (at least two proxies hot simultaneously)
|
|
231
|
+
for `LOKI_UNCERTAINTY_ROUNDS` consecutive rounds AND the episode has not already
|
|
232
|
+
been escalated (one escalation per stuck-episode, with re-arm when co-occurrence
|
|
233
|
+
clears).
|
|
234
|
+
|
|
235
|
+
### Action
|
|
236
|
+
|
|
237
|
+
When the trigger condition is met, the run.sh action block:
|
|
238
|
+
|
|
239
|
+
1. Prints a loud terminal line with the opt-out env var.
|
|
240
|
+
2. Calls `write_structured_handoff "uncertainty_escalation"` (saves
|
|
241
|
+
`.loki/memory/handoffs/<ts>.json` and `.md`).
|
|
242
|
+
3. Calls `notify_intervention_needed` with a structured reason string.
|
|
243
|
+
4. Writes a `.loki/signals/UNCERTAINTY_ESCALATION` marker file.
|
|
244
|
+
5. Touches `.loki/PAUSE`.
|
|
245
|
+
|
|
246
|
+
### Knobs
|
|
247
|
+
|
|
248
|
+
```bash
|
|
249
|
+
LOKI_UNCERTAINTY_ESCALATION=0 # Disable entirely. Byte-identical when off:
|
|
250
|
+
# zero reads, zero writes, no state file.
|
|
251
|
+
# Default: 1 (enabled). Toggle value is 0/1,
|
|
252
|
+
# not false/true.
|
|
253
|
+
LOKI_UNCERTAINTY_ROUNDS=2 # Consecutive co-occurrence rounds required.
|
|
254
|
+
# Recommended range 2-3. Default: 2.
|
|
255
|
+
LOKI_UNCERTAINTY_NOCHANGE_MIN=N # Proxy 1 threshold. Unset = auto-computed as
|
|
256
|
+
# COUNCIL_STAGNATION_LIMIT - 1 (floored at 1).
|
|
257
|
+
LOKI_UNCERTAINTY_SPLIT_ROUNDS=2 # Proxy 3 trailing split-round run length.
|
|
258
|
+
# Default: 2.
|
|
259
|
+
```
|
|
260
|
+
|
|
261
|
+
Configurable via `config.yaml` under `completion.uncertainty.*` (see
|
|
262
|
+
`autonomy/config.example.yaml`).
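
The auto-computed default for `LOKI_UNCERTAINTY_NOCHANGE_MIN` described in the knob comments can be sketched as follows. The helper name `resolve_nochange_min` is hypothetical, and the fallback of 5 for an unset `COUNCIL_STAGNATION_LIMIT` is an assumption taken from the documented `LOKI_COUNCIL_STAGNATION_LIMIT` default.

```bash
# Hypothetical helper: resolve the proxy 1 threshold as documented above.
# The fallback of 5 for COUNCIL_STAGNATION_LIMIT mirrors the documented
# LOKI_COUNCIL_STAGNATION_LIMIT default and is an assumption here.
resolve_nochange_min() (
  if [ -n "${LOKI_UNCERTAINTY_NOCHANGE_MIN:-}" ]; then
    echo "$LOKI_UNCERTAINTY_NOCHANGE_MIN"   # explicit value wins
    exit 0
  fi
  min=$(( ${COUNCIL_STAGNATION_LIMIT:-5} - 1 ))
  [ "$min" -lt 1 ] && min=1                 # floor at 1
  echo "$min"
)
```

The floor matters when the stagnation limit is set to 1: without it the threshold would be 0 and proxy 1 would be hot on every iteration.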
|
|
263
|
+
|
|
264
|
+
### Honest limits
|
|
265
|
+
|
|
266
|
+
- **Perpetual-mode = notify-only by default.** `AUTONOMY_MODE` defaults to
|
|
267
|
+
`perpetual`. In perpetual mode the existing consumer (`check_human_intervention`)
|
|
268
|
+
auto-clears PAUSE and continues. Escalation therefore degrades to a notification
|
|
269
|
+
plus a handoff document; it does NOT halt the run. The terminal prints an explicit
|
|
270
|
+
warning at the escalation site: "Perpetual mode: PAUSE will be auto-cleared; this
|
|
271
|
+
is notify-only and will NOT halt the run."
|
|
272
|
+
- **Proxy 2 is count-blind by origin.** `convergence.log` records only a changed-file count, not file identities, so proxy 2 approximates oscillation with
|
|
273
|
+
diff-hash recurrence-at-distance; it cannot distinguish a genuine revert from
|
|
274
|
+
a coincidental identical tree state, and misses oscillation where the hash
|
|
275
|
+
differs every round.
|
|
276
|
+
- **Proxy 3 is stale between council votes.** Verdicts are only appended when the
|
|
277
|
+
council actually votes (every `COUNCIL_CHECK_INTERVAL` or circuit-forced). In
|
|
278
|
+
practice p3 is always fresh in the regime that matters (proxy 1 hot forces a
|
|
279
|
+
vote), but it may lag by up to `COUNCIL_CHECK_INTERVAL` iterations otherwise.
|
|
280
|
+
- **These are heuristics, not true metacognition.** The system does not know it
|
|
281
|
+
is stuck; it infers stuckness from three correlated symptoms. A legitimately
|
|
282
|
+
hard refactor that produces no net diff for several rounds while the council
|
|
283
|
+
remains split can false-fire. Requiring >=2 co-occurring for N rounds reduces
|
|
284
|
+
but does not eliminate false fires. The cost of a false fire is bounded: one
|
|
285
|
+
notification + one handoff + one PAUSE (auto-cleared in perpetual), opt-out
|
|
286
|
+
at the site.
|
|
287
|
+
|
|
288
|
+
---
|
|
289
|
+
|
|
205
290
|
## Guardrails Execution Modes
|
|
206
291
|
|
|
207
292
|
- **Blocking**: Guardrail completes before agent starts (use for expensive operations)
|