loki-mode 5.32.2 → 5.34.0
- package/SKILL.md +21 -28
- package/VERSION +1 -1
- package/autonomy/completion-council.sh +371 -7
- package/autonomy/hooks/quality-gate.sh +1 -1
- package/autonomy/hooks/track-metrics.sh +2 -2
- package/autonomy/hooks/validate-bash.sh +24 -24
- package/autonomy/loki +301 -32
- package/autonomy/run.sh +213 -18
- package/dashboard/__init__.py +1 -1
- package/dashboard/control.py +7 -3
- package/dashboard/server.py +149 -4
- package/dashboard/static/index.html +5 -5
- package/docs/INSTALLATION.md +10 -11
- package/docs/TOOL-INTEGRATION.md +1 -1
- package/docs/dashboard-guide.md +1 -1
- package/events/emit.sh +16 -9
- package/memory/engine.py +24 -16
- package/memory/retrieval.py +28 -19
- package/memory/storage.py +29 -3
- package/memory/vector_index.py +5 -5
- package/package.json +1 -1
package/SKILL.md
CHANGED
|
@@ -3,7 +3,7 @@ name: loki-mode
|
|
|
3
3
|
description: Multi-agent autonomous startup system. Triggers on "Loki Mode". Takes PRD to deployed product with zero human intervention. Requires --dangerously-skip-permissions flag.
|
|
4
4
|
---
|
|
5
5
|
|
|
6
|
-
# Loki Mode v5.
|
|
6
|
+
# Loki Mode v5.34.0
|
|
7
7
|
|
|
8
8
|
**You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.**
|
|
9
9
|
|
|
@@ -23,17 +23,14 @@ Execute these steps IN ORDER at the start of EVERY turn:
|
|
|
23
23
|
{"pid": null, "startedAt": "<ISO timestamp>", "provider": "<provider>",
|
|
24
24
|
"invokedVia": "skill", "status": "running"}
|
|
25
25
|
|
|
26
|
-
2. Read .loki/
|
|
27
|
-
- IF file missing: You are starting fresh. Create it.
|
|
28
|
-
|
|
29
|
-
3. Read .loki/state/orchestrator.json
|
|
26
|
+
2. Read .loki/state/orchestrator.json
|
|
30
27
|
- Extract: currentPhase, tasksCompleted, tasksFailed
|
|
31
28
|
|
|
32
|
-
|
|
29
|
+
3. Read .loki/queue/pending.json
|
|
33
30
|
- IF empty AND phase incomplete: Generate tasks for current phase
|
|
34
31
|
- IF empty AND phase complete: Advance to next phase
|
|
35
32
|
|
|
36
|
-
|
|
33
|
+
4. Check .loki/PAUSE - IF exists: Stop work, wait for removal.
|
|
37
34
|
Check .loki/STOP - IF exists: End session, update session.json status to "stopped".
|
|
38
35
|
```
|
|
39
36
|
|
|
@@ -47,17 +44,10 @@ Every action follows this cycle. No exceptions.
|
|
|
47
44
|
REASON: What is the highest priority unblocked task?
|
|
48
45
|
|
|
|
49
46
|
v
|
|
50
|
-
PRE-ACT ATTENTION: Goal alignment check (prevents context drift)
|
|
51
|
-
- Re-read .loki/queue/current-task.json
|
|
52
|
-
- Verify: "Does my planned action serve task.goal?"
|
|
53
|
-
- Check: "Am I solving the original problem, not a tangent?"
|
|
54
|
-
- IF drift detected: Log to .loki/signals/DRIFT_DETECTED, return to REASON
|
|
55
|
-
|
|
|
56
|
-
v
|
|
57
47
|
ACT: Execute it. Write code. Run commands. Commit atomically.
|
|
58
48
|
|
|
|
59
49
|
v
|
|
60
|
-
REFLECT: Did it work?
|
|
50
|
+
REFLECT: Did it work? Log outcome.
|
|
61
51
|
|
|
|
62
52
|
v
|
|
63
53
|
VERIFY: Run tests. Check build. Validate against spec.
|
|
@@ -74,12 +64,6 @@ VERIFY: Run tests. Check build. Validate against spec.
|
|
|
74
64
|
After 5 failures: Log to dead-letter queue, move to next task.
|
|
75
65
|
```
|
|
76
66
|
|
|
77
|
-
**Why PRE-ACT ATTENTION matters** (from planning-with-files pattern):
|
|
78
|
-
- Context drift is silent - agents don't notice they've drifted off-task
|
|
79
|
-
- Forcing goal re-read before each action catches drift early
|
|
80
|
-
- Prevents "correct solution to wrong problem" failure mode
|
|
81
|
-
- Cost: One file read per action. Benefit: Catches misalignment before wasted work.
|
|
82
|
-
|
|
83
67
|
---
|
|
84
68
|
|
|
85
69
|
## PRIORITY 3: Autonomy Rules
|
|
@@ -104,7 +88,7 @@ These rules are ABSOLUTE. Violating them is a critical failure.
|
|
|
104
88
|
|-----------|------|------------------|------------------------|------------------|--------|
|
|
105
89
|
| PRD analysis, architecture, system design | **planning** | opus | opus | effort=xhigh | thinking=high |
|
|
106
90
|
| Feature implementation, complex bugs | **development** | opus | sonnet | effort=high | thinking=medium |
|
|
107
|
-
| Code review (
|
|
91
|
+
| Code review (planned: 3 parallel reviewers) | **development** | opus | sonnet | effort=high | thinking=medium |
|
|
108
92
|
| Integration tests, E2E, deployment | **development** | opus | sonnet | effort=high | thinking=medium |
|
|
109
93
|
| Unit tests, linting, docs, simple fixes | **fast** | sonnet | haiku | effort=low | thinking=low |
|
|
110
94
|
|
|
@@ -142,7 +126,6 @@ GROWTH ──[continuous improvement loop]──> GROWTH
|
|
|
142
126
|
|
|
143
127
|
- Load only 1-2 skill modules at a time (from skills/00-index.md)
|
|
144
128
|
- Use Task tool with subagents for exploration (isolates context)
|
|
145
|
-
- After 25 iterations: Consolidate learnings to CONTINUITY.md
|
|
146
129
|
- IF context feels heavy: Create `.loki/signals/CONTEXT_CLEAR_REQUESTED`
|
|
147
130
|
|
|
148
131
|
---
|
|
@@ -152,11 +135,9 @@ GROWTH ──[continuous improvement loop]──> GROWTH
|
|
|
152
135
|
| File | Read | Write |
|
|
153
136
|
|------|------|-------|
|
|
154
137
|
| `.loki/session.json` | Session start | Session start (register), session end (update status) |
|
|
155
|
-
| `.loki/CONTINUITY.md` | Every turn | Every turn |
|
|
156
138
|
| `.loki/state/orchestrator.json` | Every turn | On phase change |
|
|
157
139
|
| `.loki/queue/pending.json` | Every turn | When claiming/completing tasks |
|
|
158
|
-
| `.loki/queue/current-task.json` | Before each ACT
|
|
159
|
-
| `.loki/signals/DRIFT_DETECTED` | Never | When goal drift detected |
|
|
140
|
+
| `.loki/queue/current-task.json` | Before each ACT | When claiming task |
|
|
160
141
|
| `.loki/specs/openapi.yaml` | Before API work | After API changes |
|
|
161
142
|
| `skills/00-index.md` | Session start | Never |
|
|
162
143
|
| `.loki/memory/index.json` | Session start | On topic change |
|
|
@@ -168,6 +149,7 @@ GROWTH ──[continuous improvement loop]──> GROWTH
|
|
|
168
149
|
| `.loki/queue/dead-letter.json` | Session start | On task failure (5+ attempts) |
|
|
169
150
|
| `.loki/signals/CONTEXT_CLEAR_REQUESTED` | Never | When context heavy |
|
|
170
151
|
| `.loki/signals/HUMAN_REVIEW_NEEDED` | Never | When human decision required |
|
|
152
|
+
| `.loki/state/checkpoints/` | After task completion | Automatic + manual via `loki checkpoint` |
|
|
171
153
|
|
|
172
154
|
---
|
|
173
155
|
|
|
@@ -245,7 +227,6 @@ LOKI_PROMPT_INJECTION=true loki sandbox prompt "start the app"
|
|
|
245
227
|
|
|
246
228
|
| Type | File | Behavior |
|
|
247
229
|
|------|------|----------|
|
|
248
|
-
| **Hint** | `.loki/CONTINUITY.md` "Mistakes & Learnings" | Passive memory - remembered but not acted upon |
|
|
249
230
|
| **Directive** | `.loki/HUMAN_INPUT.md` | Active instruction (requires `LOKI_PROMPT_INJECTION=true`) |
|
|
250
231
|
|
|
251
232
|
**Example directive** (only works with `LOKI_PROMPT_INJECTION=true`):
|
|
@@ -267,4 +248,16 @@ Auto-detected or force with `LOKI_COMPLEXITY`:
|
|
|
267
248
|
|
|
268
249
|
---
|
|
269
250
|
|
|
270
|
-
|
|
251
|
+
## Planned Features
|
|
252
|
+
|
|
253
|
+
The following features are documented in skill modules but not yet fully automated:
|
|
254
|
+
|
|
255
|
+
| Feature | Status | Notes |
|
|
256
|
+
|---------|--------|-------|
|
|
257
|
+
| PRE-ACT goal drift detection | Planned | Agent-level attention check before each action; no automated enforcement yet |
|
|
258
|
+
| CONTINUITY.md working memory | Planned | Referenced in run.sh prompts but not automatically managed |
|
|
259
|
+
| GitHub issue import | Planned | Config flags exist (`LOKI_GITHUB_IMPORT`); `gh` CLI integration partial |
|
|
260
|
+
| Quality gates 3-reviewer system | Planned | Instructions in `skills/quality-gates.md`; not automated |
|
|
261
|
+
| Benchmarks (HumanEval, SWE-bench) | Infrastructure only | Runner scripts and datasets exist in `benchmarks/`; no published results |
|
|
262
|
+
|
|
263
|
+
**v5.34.0 | checkpoint/restore, GitHub Action provider-agnostic | ~260 lines core**
|
package/VERSION
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
5.
|
|
1
|
+
5.34.0
|
package/autonomy/completion-council.sh
CHANGED
|
@@ -220,15 +220,12 @@ council_vote() {
|
|
|
220
220
|
local verdicts=""
|
|
221
221
|
|
|
222
222
|
# Run council members (sequentially for reliability, parallel if provider supports it)
|
|
223
|
+
# Roles cycle through the 3 core roles for councils larger than 3 members
|
|
224
|
+
local _council_roles=("requirements_verifier" "test_auditor" "devils_advocate")
|
|
223
225
|
local member=1
|
|
224
226
|
while [ $member -le $COUNCIL_SIZE ]; do
|
|
225
|
-
local
|
|
226
|
-
|
|
227
|
-
1) role="requirements_verifier" ;;
|
|
228
|
-
2) role="test_auditor" ;;
|
|
229
|
-
3) role="devils_advocate" ;;
|
|
230
|
-
*) role="generalist" ;;
|
|
231
|
-
esac
|
|
227
|
+
local role_index=$(( (member - 1) % ${#_council_roles[@]} ))
|
|
228
|
+
local role="${_council_roles[$role_index]}"
|
|
232
229
|
|
|
233
230
|
log_info "Council member $member/$COUNCIL_SIZE ($role) reviewing..."
|
|
234
231
|
|
|
@@ -269,6 +266,7 @@ council_vote() {
|
|
|
269
266
|
log_warn "Anti-sycophancy: Devil's advocate REJECTED unanimous approval"
|
|
270
267
|
log_warn "Overriding to require one more iteration for verification"
|
|
271
268
|
approve_count=$((approve_count - 1))
|
|
269
|
+
reject_count=$((reject_count + 1))
|
|
272
270
|
else
|
|
273
271
|
log_info "Anti-sycophancy: Devil's advocate confirmed approval"
|
|
274
272
|
fi
|
|
@@ -455,6 +453,9 @@ council_member_review() {
|
|
|
455
453
|
devils_advocate)
|
|
456
454
|
role_instruction="You are the DEVIL'S ADVOCATE. Your job is to find reasons the project is NOT complete. Look for: missing error handling, security issues, performance problems, missing documentation, untested edge cases, hardcoded values, TODO comments. Be skeptical."
|
|
457
455
|
;;
|
|
456
|
+
*)
|
|
457
|
+
role_instruction="You are a REVIEWER. Evaluate project completion from a general perspective. Check code quality, completeness, test coverage, and overall readiness. Be thorough and honest."
|
|
458
|
+
;;
|
|
458
459
|
esac
|
|
459
460
|
|
|
460
461
|
local prompt="You are a council member reviewing project completion.
|
|
@@ -624,6 +625,15 @@ council_heuristic_review() {
|
|
|
624
625
|
((issues++))
|
|
625
626
|
fi
|
|
626
627
|
;;
|
|
628
|
+
*)
|
|
629
|
+
# Generic reviewer: combine checks from all roles
|
|
630
|
+
if echo "$evidence" | grep -q "No PRD available"; then
|
|
631
|
+
((issues++))
|
|
632
|
+
fi
|
|
633
|
+
if echo "$evidence" | grep -qiE "(fail|error|FAIL)"; then
|
|
634
|
+
((issues++))
|
|
635
|
+
fi
|
|
636
|
+
;;
|
|
627
637
|
esac
|
|
628
638
|
|
|
629
639
|
if [ $issues -gt 0 ]; then
|
|
@@ -635,6 +645,360 @@ council_heuristic_review() {
|
|
|
635
645
|
fi
|
|
636
646
|
}
|
|
637
647
|
|
|
648
|
+
#===============================================================================
|
|
649
|
+
# Council Evaluate Member - Evaluate a single member's assessment
|
|
650
|
+
#
|
|
651
|
+
# Checks test results, git convergence, and error logs to produce a vote.
|
|
652
|
+
# This is the core evaluation logic used by council_aggregate_votes().
|
|
653
|
+
#
|
|
654
|
+
# Arguments:
|
|
655
|
+
# $1 - member role (requirements_verifier, test_auditor, devils_advocate)
|
|
656
|
+
# $2 - evaluation criteria description
|
|
657
|
+
#
|
|
658
|
+
# Returns: prints "COMPLETE <reason>" or "CONTINUE <reason>"
|
|
659
|
+
#===============================================================================
|
|
660
|
+
|
|
661
|
+
council_evaluate_member() {
|
|
662
|
+
local role="$1"
|
|
663
|
+
local criteria="${2:-general completion check}"
|
|
664
|
+
local loki_dir="${TARGET_DIR:-.}/.loki"
|
|
665
|
+
local vote="COMPLETE"
|
|
666
|
+
local reasons=""
|
|
667
|
+
|
|
668
|
+
# Check 1: Do tests pass? Look for test results in .loki/
|
|
669
|
+
local test_failures=0
|
|
670
|
+
for test_log in "$loki_dir"/logs/test-*.log "$loki_dir"/logs/*test*.log; do
|
|
671
|
+
if [ -f "$test_log" ]; then
|
|
672
|
+
local fail_count
|
|
673
|
+
fail_count=$(grep -ciE "(FAIL|ERROR|failed|error:)" "$test_log" 2>/dev/null || echo "0")
|
|
674
|
+
test_failures=$((test_failures + fail_count))
|
|
675
|
+
fi
|
|
676
|
+
done
|
|
677
|
+
if [ "$test_failures" -gt 0 ]; then
|
|
678
|
+
vote="CONTINUE"
|
|
679
|
+
reasons="${reasons}test failures found ($test_failures); "
|
|
680
|
+
fi
|
|
681
|
+
|
|
682
|
+
# Check 2: Has git diff changed since last iteration? (convergence check)
|
|
683
|
+
# If code is still changing, work may not be done
|
|
684
|
+
local current_diff_hash
|
|
685
|
+
current_diff_hash=$(git diff --stat HEAD 2>/dev/null | (md5sum 2>/dev/null || md5 -r 2>/dev/null) | cut -d' ' -f1 || echo "unknown")
|
|
686
|
+
if [ "$COUNCIL_CONSECUTIVE_NO_CHANGE" -eq 0 ] && [ "$ITERATION_COUNT" -gt "$COUNCIL_MIN_ITERATIONS" ]; then
|
|
687
|
+
# Code is still actively changing -- likely not done
|
|
688
|
+
vote="CONTINUE"
|
|
689
|
+
reasons="${reasons}code still changing between iterations; "
|
|
690
|
+
fi
|
|
691
|
+
|
|
692
|
+
# Check 3: Are there uncaught errors in logs?
|
|
693
|
+
local error_count=0
|
|
694
|
+
if [ -d "$loki_dir/logs" ]; then
|
|
695
|
+
for log_file in "$loki_dir"/logs/*.log; do
|
|
696
|
+
if [ -f "$log_file" ]; then
|
|
697
|
+
local errs
|
|
698
|
+
errs=$(tail -50 "$log_file" 2>/dev/null | grep -ciE "(uncaught|unhandled|panic|fatal|segfault|traceback)" 2>/dev/null || echo "0")
|
|
699
|
+
error_count=$((error_count + errs))
|
|
700
|
+
fi
|
|
701
|
+
done
|
|
702
|
+
fi
|
|
703
|
+
if [ "$error_count" -gt 0 ]; then
|
|
704
|
+
vote="CONTINUE"
|
|
705
|
+
reasons="${reasons}uncaught errors in logs ($error_count); "
|
|
706
|
+
fi
|
|
707
|
+
|
|
708
|
+
# Role-specific checks
|
|
709
|
+
case "$role" in
|
|
710
|
+
requirements_verifier)
|
|
711
|
+
# Check if pending tasks remain
|
|
712
|
+
if [ -f "$loki_dir/queue/pending.json" ]; then
|
|
713
|
+
local pending
|
|
714
|
+
pending=$(_QUEUE_FILE="$loki_dir/queue/pending.json" python3 -c "import json, os; print(len(json.load(open(os.environ['_QUEUE_FILE']))))" 2>/dev/null || echo "0")
|
|
715
|
+
if [ "$pending" -gt 0 ]; then
|
|
716
|
+
vote="CONTINUE"
|
|
717
|
+
reasons="${reasons}$pending tasks still pending; "
|
|
718
|
+
fi
|
|
719
|
+
fi
|
|
720
|
+
;;
|
|
721
|
+
test_auditor)
|
|
722
|
+
# Check if any test log exists at all
|
|
723
|
+
local has_tests=false
|
|
724
|
+
for f in "$loki_dir"/logs/test-*.log "$loki_dir"/logs/*test*.log; do
|
|
725
|
+
[ -f "$f" ] && has_tests=true && break
|
|
726
|
+
done
|
|
727
|
+
if [ "$has_tests" = "false" ]; then
|
|
728
|
+
vote="CONTINUE"
|
|
729
|
+
reasons="${reasons}no test results found; "
|
|
730
|
+
fi
|
|
731
|
+
;;
|
|
732
|
+
devils_advocate)
|
|
733
|
+
# Check for TODO/FIXME markers
|
|
734
|
+
local todo_count
|
|
735
|
+
todo_count=$(grep -rl "TODO\|FIXME\|HACK\|XXX" . --include="*.ts" --include="*.js" --include="*.py" --include="*.sh" 2>/dev/null | wc -l | tr -d ' ')
|
|
736
|
+
if [ "$todo_count" -gt 5 ]; then
|
|
737
|
+
vote="CONTINUE"
|
|
738
|
+
reasons="${reasons}$todo_count files with TODO/FIXME markers; "
|
|
739
|
+
fi
|
|
740
|
+
;;
|
|
741
|
+
esac
|
|
742
|
+
|
|
743
|
+
# Clean up trailing separator
|
|
744
|
+
reasons="${reasons%; }"
|
|
745
|
+
if [ -z "$reasons" ]; then
|
|
746
|
+
reasons="all checks passed for $role ($criteria)"
|
|
747
|
+
fi
|
|
748
|
+
|
|
749
|
+
echo "$vote $reasons"
|
|
750
|
+
}
|
|
751
|
+
|
|
752
|
+
#===============================================================================
|
|
753
|
+
# Council Aggregate Votes - Collect votes from all members
|
|
754
|
+
#
|
|
755
|
+
# Runs council_evaluate_member() for each council member, tallies votes,
|
|
756
|
+
# and writes results to COUNCIL_STATE_DIR/votes/round-N.json.
|
|
757
|
+
#
|
|
758
|
+
# 2/3 majority needed for COMPLETE verdict.
|
|
759
|
+
#
|
|
760
|
+
# Returns: prints "COMPLETE" or "CONTINUE"
|
|
761
|
+
#===============================================================================
|
|
762
|
+
|
|
763
|
+
council_aggregate_votes() {
|
|
764
|
+
local round="${ITERATION_COUNT:-0}"
|
|
765
|
+
local vote_output_dir="$COUNCIL_STATE_DIR/votes"
|
|
766
|
+
mkdir -p "$vote_output_dir"
|
|
767
|
+
|
|
768
|
+
local complete_count=0
|
|
769
|
+
local continue_count=0
|
|
770
|
+
local total_members=$COUNCIL_SIZE
|
|
771
|
+
local votes_json="["
|
|
772
|
+
local first=true
|
|
773
|
+
|
|
774
|
+
local _council_roles=("requirements_verifier" "test_auditor" "devils_advocate")
|
|
775
|
+
local member=1
|
|
776
|
+
while [ $member -le $total_members ]; do
|
|
777
|
+
local role_index=$(( (member - 1) % ${#_council_roles[@]} ))
|
|
778
|
+
local role="${_council_roles[$role_index]}"
|
|
779
|
+
|
|
780
|
+
local result
|
|
781
|
+
result=$(council_evaluate_member "$role" "round $round evaluation")
|
|
782
|
+
local vote_value
|
|
783
|
+
vote_value=$(echo "$result" | cut -d' ' -f1)
|
|
784
|
+
local vote_reason
|
|
785
|
+
vote_reason=$(echo "$result" | cut -d' ' -f2-)
|
|
786
|
+
|
|
787
|
+
if [ "$vote_value" = "COMPLETE" ]; then
|
|
788
|
+
((complete_count++))
|
|
789
|
+
else
|
|
790
|
+
((continue_count++))
|
|
791
|
+
fi
|
|
792
|
+
|
|
793
|
+
log_info " Evaluate member $member ($role): $vote_value - $vote_reason"
|
|
794
|
+
|
|
795
|
+
# Build JSON array entry
|
|
796
|
+
if [ "$first" = "true" ]; then
|
|
797
|
+
first=false
|
|
798
|
+
else
|
|
799
|
+
votes_json="${votes_json},"
|
|
800
|
+
fi
|
|
801
|
+
# Escape double quotes in reason for JSON safety
|
|
802
|
+
local safe_reason
|
|
803
|
+
safe_reason=$(echo "$vote_reason" | sed 's/"/\\"/g')
|
|
804
|
+
votes_json="${votes_json}{\"member\":$member,\"role\":\"$role\",\"vote\":\"$vote_value\",\"reason\":\"$safe_reason\"}"
|
|
805
|
+
|
|
806
|
+
((member++))
|
|
807
|
+
done
|
|
808
|
+
votes_json="${votes_json}]"
|
|
809
|
+
|
|
810
|
+
# Calculate threshold: 2/3 majority
|
|
811
|
+
local threshold=$(( (total_members * 2 + 2) / 3 )) # ceiling of 2/3
|
|
812
|
+
local verdict="CONTINUE"
|
|
813
|
+
if [ "$complete_count" -ge "$threshold" ]; then
|
|
814
|
+
verdict="COMPLETE"
|
|
815
|
+
fi
|
|
816
|
+
|
|
817
|
+
# Write round results to JSON file
|
|
818
|
+
local round_file="$vote_output_dir/round-${round}.json"
|
|
819
|
+
_ROUND="$round" \
|
|
820
|
+
_COMPLETE="$complete_count" \
|
|
821
|
+
_CONTINUE="$continue_count" \
|
|
822
|
+
_TOTAL="$total_members" \
|
|
823
|
+
_THRESHOLD="$threshold" \
|
|
824
|
+
_VERDICT="$verdict" \
|
|
825
|
+
_VOTES="$votes_json" \
|
|
826
|
+
python3 -c "
|
|
827
|
+
import json, os
|
|
828
|
+
from datetime import datetime, timezone
|
|
829
|
+
round_data = {
|
|
830
|
+
'round': int(os.environ['_ROUND']),
|
|
831
|
+
'timestamp': datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ'),
|
|
832
|
+
'complete_votes': int(os.environ['_COMPLETE']),
|
|
833
|
+
'continue_votes': int(os.environ['_CONTINUE']),
|
|
834
|
+
'total_members': int(os.environ['_TOTAL']),
|
|
835
|
+
'threshold': int(os.environ['_THRESHOLD']),
|
|
836
|
+
'verdict': os.environ['_VERDICT'],
|
|
837
|
+
'votes': json.loads(os.environ['_VOTES'])
|
|
838
|
+
}
|
|
839
|
+
with open('$round_file', 'w') as f:
|
|
840
|
+
json.dump(round_data, f, indent=2)
|
|
841
|
+
" || log_warn "Failed to write round vote file"
|
|
842
|
+
|
|
843
|
+
log_info "Aggregate vote: $complete_count COMPLETE / $continue_count CONTINUE (threshold: $threshold) -> $verdict"
|
|
844
|
+
|
|
845
|
+
echo "$verdict"
|
|
846
|
+
}
|
|
847
|
+
|
|
848
|
+
#===============================================================================
|
|
849
|
+
# Council Devils Advocate (Enhanced) - Skeptical re-evaluation on unanimous COMPLETE
|
|
850
|
+
#
|
|
851
|
+
# When council_aggregate_votes() returns unanimous COMPLETE, one member
|
|
852
|
+
# re-evaluates with a skeptical perspective. If any issue is found, the
|
|
853
|
+
# verdict flips to CONTINUE.
|
|
854
|
+
#
|
|
855
|
+
# Arguments:
|
|
856
|
+
# $1 - round number
|
|
857
|
+
#
|
|
858
|
+
# Returns: prints "OVERRIDE_CONTINUE" if flipped, or "CONFIRMED_COMPLETE" if upheld
|
|
859
|
+
#===============================================================================
|
|
860
|
+
|
|
861
|
+
council_devils_advocate_review() {
|
|
862
|
+
local round="${1:-$ITERATION_COUNT}"
|
|
863
|
+
local loki_dir="${TARGET_DIR:-.}/.loki"
|
|
864
|
+
|
|
865
|
+
log_warn "Unanimous COMPLETE detected - running devil's advocate re-evaluation..."
|
|
866
|
+
|
|
867
|
+
local issues_found=0
|
|
868
|
+
local issue_details=""
|
|
869
|
+
|
|
870
|
+
# Skeptical check 1: Are tests actually running and passing?
|
|
871
|
+
local has_test_results=false
|
|
872
|
+
for f in "$loki_dir"/logs/test-*.log "$loki_dir"/logs/*test*.log; do
|
|
873
|
+
if [ -f "$f" ]; then
|
|
874
|
+
has_test_results=true
|
|
875
|
+
# Look for test runner output indicating pass
|
|
876
|
+
if ! tail -30 "$f" 2>/dev/null | grep -qiE "(passed|success|ok|all tests)"; then
|
|
877
|
+
((issues_found++))
|
|
878
|
+
issue_details="${issue_details}test log $(basename "$f") has no clear pass indicator; "
|
|
879
|
+
fi
|
|
880
|
+
fi
|
|
881
|
+
done
|
|
882
|
+
if [ "$has_test_results" = "false" ]; then
|
|
883
|
+
((issues_found++))
|
|
884
|
+
issue_details="${issue_details}no test result logs found at all; "
|
|
885
|
+
fi
|
|
886
|
+
|
|
887
|
+
# Skeptical check 2: Are there still failing tasks in the queue?
|
|
888
|
+
if [ -f "$loki_dir/queue/failed.json" ]; then
|
|
889
|
+
local failed_count
|
|
890
|
+
failed_count=$(_QUEUE_FILE="$loki_dir/queue/failed.json" python3 -c "import json, os; print(len(json.load(open(os.environ['_QUEUE_FILE']))))" 2>/dev/null || echo "0")
|
|
891
|
+
if [ "$failed_count" -gt 0 ]; then
|
|
892
|
+
((issues_found++))
|
|
893
|
+
issue_details="${issue_details}$failed_count tasks in failed queue; "
|
|
894
|
+
fi
|
|
895
|
+
fi
|
|
896
|
+
|
|
897
|
+
# Skeptical check 3: TODO/FIXME/HACK density
|
|
898
|
+
local todo_count
|
|
899
|
+
todo_count=$(grep -rl "TODO\|FIXME\|HACK\|XXX" . --include="*.ts" --include="*.js" --include="*.py" --include="*.sh" 2>/dev/null | wc -l | tr -d ' ')
|
|
900
|
+
if [ "$todo_count" -gt 3 ]; then
|
|
901
|
+
((issues_found++))
|
|
902
|
+
issue_details="${issue_details}$todo_count files still contain TODO/FIXME markers; "
|
|
903
|
+
fi
|
|
904
|
+
|
|
905
|
+
# Skeptical check 4: Large number of uncommitted changes
|
|
906
|
+
local uncommitted
|
|
907
|
+
uncommitted=$(git status --porcelain 2>/dev/null | wc -l | tr -d ' ')
|
|
908
|
+
if [ "$uncommitted" -gt 10 ]; then
|
|
909
|
+
((issues_found++))
|
|
910
|
+
issue_details="${issue_details}$uncommitted uncommitted files; "
|
|
911
|
+
fi
|
|
912
|
+
|
|
913
|
+
# Skeptical check 5: Recent error events
|
|
914
|
+
if [ -f "$loki_dir/events.jsonl" ]; then
|
|
915
|
+
local recent_errors
|
|
916
|
+
recent_errors=$(tail -50 "$loki_dir/events.jsonl" 2>/dev/null | grep -ciE "\"level\":\s*\"error\"" 2>/dev/null || echo "0")
|
|
917
|
+
if [ "$recent_errors" -gt 0 ]; then
|
|
918
|
+
((issues_found++))
|
|
919
|
+
issue_details="${issue_details}$recent_errors recent error events; "
|
|
920
|
+
fi
|
|
921
|
+
fi
|
|
922
|
+
|
|
923
|
+
# Record the devil's advocate result
|
|
924
|
+
issue_details="${issue_details%; }"
|
|
925
|
+
local da_file="$COUNCIL_STATE_DIR/votes/devils-advocate-round-${round}.json"
|
|
926
|
+
_ROUND="$round" \
|
|
927
|
+
_ISSUES="$issues_found" \
|
|
928
|
+
_DETAILS="${issue_details:-none}" \
|
|
929
|
+
python3 -c "
|
|
930
|
+
import json, os
|
|
931
|
+
from datetime import datetime, timezone
|
|
932
|
+
da_result = {
|
|
933
|
+
'round': int(os.environ['_ROUND']),
|
|
934
|
+
'timestamp': datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ'),
|
|
935
|
+
'issues_found': int(os.environ['_ISSUES']),
|
|
936
|
+
'details': os.environ['_DETAILS'],
|
|
937
|
+
'override': int(os.environ['_ISSUES']) > 0
|
|
938
|
+
}
|
|
939
|
+
with open('$da_file', 'w') as f:
|
|
940
|
+
json.dump(da_result, f, indent=2)
|
|
941
|
+
" || log_warn "Failed to write devil's advocate result"
|
|
942
|
+
|
|
943
|
+
if [ "$issues_found" -gt 0 ]; then
|
|
944
|
+
log_warn "Devil's advocate found $issues_found issues: $issue_details"
|
|
945
|
+
log_warn "Overriding unanimous COMPLETE -> CONTINUE"
|
|
946
|
+
echo "OVERRIDE_CONTINUE"
|
|
947
|
+
else
|
|
948
|
+
log_info "Devil's advocate confirmed: no issues found, COMPLETE upheld"
|
|
949
|
+
echo "CONFIRMED_COMPLETE"
|
|
950
|
+
fi
|
|
951
|
+
}
|
|
952
|
+
|
|
953
|
+
#===============================================================================
|
|
954
|
+
# Council Evaluate - Unified entry point for council voting pipeline
|
|
955
|
+
#
|
|
956
|
+
# Orchestrates the full evaluation:
|
|
957
|
+
# 1. Run council_aggregate_votes() to collect all member votes
|
|
958
|
+
# 2. If unanimous COMPLETE, run council_devils_advocate_review()
|
|
959
|
+
# 3. Return final verdict
|
|
960
|
+
#
|
|
961
|
+
# Returns 0 if COMPLETE (should stop), 1 if CONTINUE
|
|
962
|
+
#===============================================================================
|
|
963
|
+
|
|
964
|
+
council_evaluate() {
|
|
965
|
+
if [ "$COUNCIL_ENABLED" != "true" ]; then
|
|
966
|
+
return 1
|
|
967
|
+
fi
|
|
968
|
+
|
|
969
|
+
log_info "Running council evaluation pipeline (round $ITERATION_COUNT)..."
|
|
970
|
+
|
|
971
|
+
# Step 1: Aggregate votes from all members
|
|
972
|
+
local aggregate_result
|
|
973
|
+
aggregate_result=$(council_aggregate_votes)
|
|
974
|
+
|
|
975
|
+
if [ "$aggregate_result" = "COMPLETE" ]; then
|
|
976
|
+
# Step 2: Check if unanimous -- compare complete_count to COUNCIL_SIZE
|
|
977
|
+
# Re-derive complete count from the round file
|
|
978
|
+
local round_file="$COUNCIL_STATE_DIR/votes/round-${ITERATION_COUNT}.json"
|
|
979
|
+
local complete_count=0
|
|
980
|
+
if [ -f "$round_file" ]; then
|
|
981
|
+
complete_count=$(_RF="$round_file" python3 -c "import json, os; print(json.load(open(os.environ['_RF'])).get('complete_votes', 0))" 2>/dev/null || echo "0")
|
|
982
|
+
fi
|
|
983
|
+
|
|
984
|
+
if [ "$complete_count" -eq "$COUNCIL_SIZE" ] && [ "$COUNCIL_SIZE" -ge 3 ]; then
|
|
985
|
+
# Step 3: Unanimous -- run devil's advocate
|
|
986
|
+
local da_result
|
|
987
|
+
da_result=$(council_devils_advocate_review "$ITERATION_COUNT")
|
|
988
|
+
if [ "$da_result" = "OVERRIDE_CONTINUE" ]; then
|
|
989
|
+
log_warn "Council evaluate: devil's advocate overrode unanimous COMPLETE"
|
|
990
|
+
return 1 # CONTINUE
|
|
991
|
+
fi
|
|
992
|
+
fi
|
|
993
|
+
|
|
994
|
+
log_info "Council evaluate: verdict is COMPLETE"
|
|
995
|
+
return 0 # COMPLETE (should stop)
|
|
996
|
+
fi
|
|
997
|
+
|
|
998
|
+
log_info "Council evaluate: verdict is CONTINUE"
|
|
999
|
+
return 1 # CONTINUE
|
|
1000
|
+
}
|
|
1001
|
+
|
|
638
1002
|
#===============================================================================
|
|
639
1003
|
# Main Entry Point - Should the loop stop?
|
|
640
1004
|
#===============================================================================
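The vote arithmetic added to completion-council.sh above can be checked in isolation. The following is a minimal sketch, not the package's code: the helper names `council_threshold` and `council_role_for` are hypothetical, and the role lookup uses a `case` statement instead of the `_council_roles` array so that it also runs under a plain POSIX shell. Only the two integer expressions are taken from the diff.

```shell
# Hypothetical helpers; only the arithmetic mirrors the diff above.

# Smallest number of COMPLETE votes that reaches a 2/3 majority,
# i.e. the integer ceiling of (2 * members) / 3.
council_threshold() {
    echo $(( ($1 * 2 + 2) / 3 ))
}

# Role for a 1-based member number, cycling through the three core roles.
council_role_for() {
    case $(( ($1 - 1) % 3 )) in
        0) echo "requirements_verifier" ;;
        1) echo "test_auditor" ;;
        2) echo "devils_advocate" ;;
    esac
}

# Council sizes 3, 4, 5, 6 give thresholds 2, 3, 4, 4.
for size in 3 4 5 6; do
    echo "size=$size threshold=$(council_threshold "$size")"
done

# Member 4 wraps around to the first role.
echo "member 4 role: $(council_role_for 4)"
```

With the cycling rule, a council of five would hold two requirements verifiers, two test auditors, and one devil's advocate, and four of the five would need to vote COMPLETE.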
|
package/autonomy/hooks/quality-gate.sh
CHANGED
|
@@ -29,7 +29,7 @@ fi
|
|
|
29
29
|
|
|
30
30
|
# Check for TODO/FIXME in recent changes
|
|
31
31
|
if [ -d "$CWD/.git" ]; then
|
|
32
|
-
TODOS=$(git -C "$CWD" diff HEAD~1 2>/dev/null | grep -c "TODO
|
|
32
|
+
TODOS=$(git -C "$CWD" diff HEAD~1 2>/dev/null | grep -c -E "TODO|FIXME" 2>/dev/null | tr -d '[:space:]') || TODOS=0
|
|
33
33
|
if [ "$TODOS" -gt 0 ]; then
|
|
34
34
|
GATE_RESULTS+=("new_todos: $TODOS")
|
|
35
35
|
fi
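The changed line above depends on how `grep -c` reports an empty result: it prints `0` and exits with status 1 when nothing matches, so the exit status alone does not separate "no matches" from a real failure, and a fallback such as `|| echo "0"` placed inside the command substitution would append a second `0`. A minimal sketch of a counting helper that always prints exactly one integer follows; the name `count_matches` and the variable `_n` are hypothetical and do not appear in the package.

```shell
# count_matches PATTERN: reads stdin and prints exactly one integer.
# Hypothetical helper, shown only to illustrate the grep -c behaviour.
count_matches() {
    # grep -c prints the count even when it is 0 (exit status 1), so the
    # captured output is kept and the exit status is deliberately ignored.
    _n=$(grep -c -E "$1" 2>/dev/null) || true
    # If grep itself failed and printed nothing, fall back to 0.
    echo "${_n:-0}"
}

printf 'a TODO\nb\nc FIXME\n' | count_matches "TODO|FIXME"   # prints 2
printf 'clean\n' | count_matches "TODO|FIXME"                 # prints 0
```

Because the result is always a single integer, it can be used directly in a numeric test such as `[ "$TODOS" -gt 0 ]`.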
|
package/autonomy/hooks/track-metrics.sh
CHANGED
|
@@ -11,7 +11,7 @@ METRICS_DIR="$CWD/.loki/metrics"
|
|
|
11
11
|
mkdir -p "$METRICS_DIR"
|
|
12
12
|
|
|
13
13
|
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
|
|
14
|
-
tool_escaped=$(printf '%s' "$TOOL_NAME" |
|
|
15
|
-
echo "{\"timestamp\":\"$TIMESTAMP\",\"tool\"
|
|
14
|
+
tool_escaped=$(printf '%s' "$TOOL_NAME" | python3 -c 'import sys,json; print(json.dumps(sys.stdin.read()),end="")')
|
|
15
|
+
echo "{\"timestamp\":\"$TIMESTAMP\",\"tool\":$tool_escaped,\"event\":\"PostToolUse\"}" >> "$METRICS_DIR/tool-usage.jsonl"
|
|
16
16
|
|
|
17
17
|
exit 0
|
package/autonomy/hooks/validate-bash.sh
CHANGED
|
@@ -8,29 +8,36 @@ INPUT=$(cat)
|
|
|
8
8
|
COMMAND=$(echo "$INPUT" | python3 -c "import sys,json; d=json.load(sys.stdin); print(d.get('tool_input',{}).get('command',''))")
|
|
9
9
|
CWD=$(echo "$INPUT" | python3 -c "import sys,json; print(json.load(sys.stdin).get('cwd',''))")
|
|
10
10
|
|
|
11
|
-
# Dangerous command patterns
|
|
11
|
+
# Dangerous command patterns (matched anywhere in the command string)
|
|
12
|
+
# Safe paths like /tmp/ and relative paths (./) are excluded below
|
|
12
13
|
BLOCKED_PATTERNS=(
|
|
13
|
-
"
|
|
14
|
-
"
|
|
15
|
-
"
|
|
14
|
+
"rm -rf /"
|
|
15
|
+
"rm -rf ~"
|
|
16
|
+
"rm -rf \\\$HOME"
|
|
16
17
|
"> /dev/sd"
|
|
17
|
-
"
|
|
18
|
-
"
|
|
19
|
-
"
|
|
18
|
+
"mkfs[. ]"
|
|
19
|
+
"dd if=/dev/zero"
|
|
20
|
+
"chmod -R 777 /"
|
|
21
|
+
)
|
|
22
|
+
|
|
23
|
+
# Safe path patterns that override rm -rf / matches
|
|
24
|
+
SAFE_PATTERNS=(
|
|
25
|
+
"rm -rf /tmp/"
|
|
20
26
|
)
|
|
21
27
|
|
|
22
28
|
# Check for blocked patterns
|
|
23
29
|
for pattern in "${BLOCKED_PATTERNS[@]}"; do
|
|
24
30
|
if echo "$COMMAND" | grep -qE "$pattern"; then
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
31
|
+
# Check if a safe pattern also matches (whitelist override)
|
|
32
|
+
is_safe=false
|
|
33
|
+
for safe in "${SAFE_PATTERNS[@]}"; do
|
|
34
|
+
if echo "$COMMAND" | grep -qE "$safe"; then
|
|
35
|
+
is_safe=true
|
|
36
|
+
break
|
|
37
|
+
fi
|
|
38
|
+
done
|
|
39
|
+
"$is_safe" && continue
|
|
40
|
+
printf '%s' '{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"deny","permissionDecisionReason":"Blocked: potentially dangerous command pattern detected"}}'
|
|
34
41
|
exit 2
|
|
35
42
|
fi
|
|
36
43
|
done
|
|
@@ -42,13 +49,6 @@ printf '%s' "{\"timestamp\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\",\"command\":$(ech
|
|
|
42
49
|
echo >> "$LOG_DIR/bash-audit.jsonl"
|
|
43
50
|
|
|
44
51
|
# Allow command
|
|
45
|
-
|
|
46
|
-
{
|
|
47
|
-
"hookSpecificOutput": {
|
|
48
|
-
"hookEventName": "PreToolUse",
|
|
49
|
-
"permissionDecision": "allow"
|
|
50
|
-
}
|
|
51
|
-
}
|
|
52
|
-
EOF
|
|
52
|
+
printf '%s' '{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"allow"}}'
|
|
53
53
|
|
|
54
54
|
exit 0
|
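The blocked-pattern check with its safe-path override in validate-bash.sh above can be exercised outside the hook. This is a minimal sketch under stated assumptions: the function name `is_blocked` is hypothetical, the patterns are copied from the diff, and the JSON output and audit logging of the real hook are omitted.

```shell
# is_blocked COMMAND: returns 0 if the command should be denied, 1 otherwise.
# Hypothetical wrapper around the pattern lists shown in the diff.
is_blocked() {
    cmd="$1"
    for pattern in "rm -rf /" "rm -rf ~" 'rm -rf \$HOME' "> /dev/sd" \
                   "mkfs[. ]" "dd if=/dev/zero" "chmod -R 777 /"; do
        if printf '%s\n' "$cmd" | grep -qE "$pattern"; then
            # Allow-list override: a match on the safe pattern skips this
            # blocked pattern, as the hook does with SAFE_PATTERNS.
            if printf '%s\n' "$cmd" | grep -qE "rm -rf /tmp/"; then
                continue
            fi
            return 0
        fi
    done
    return 1
}

is_blocked "rm -rf /"            && echo "denied: rm -rf /"
is_blocked "mkfs.ext4 /dev/sda1" && echo "denied: mkfs"
is_blocked "rm -rf /tmp/build"   || echo "allowed: rm -rf /tmp/build"
is_blocked "rm -rf ./build"      || echo "allowed: rm -rf ./build"
```

Both lists are matched as substrings anywhere in the command string, so a single command containing both `rm -rf /tmp/` and a blocked form would pass the override in this sketch; anchoring or tokenizing the command would be one way to tighten that.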