loki-mode 5.48.2 → 5.49.1

package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Loki Mode
2
2
 
3
- **The Flagship Product of [Autonomi](https://www.autonomi.dev/) -- The First Truly Autonomous Multi-Agent Startup System**
3
+ **The Flagship Product of [Autonomi](https://www.autonomi.dev/) -- An Autonomous Multi-Agent Development System**
4
4
 
5
5
  [![npm version](https://img.shields.io/npm/v/loki-mode)](https://www.npmjs.com/package/loki-mode)
6
6
  [![npm downloads](https://img.shields.io/npm/dw/loki-mode)](https://www.npmjs.com/package/loki-mode)
@@ -9,17 +9,15 @@
9
9
  [![GitHub Marketplace](https://img.shields.io/badge/Marketplace-Loki%20Mode-purple?logo=github)](https://github.com/marketplace/actions/loki-mode-code-review)
10
10
  [![Autonomi](https://img.shields.io/badge/Autonomi-autonomi.dev-5B4EEA)](https://www.autonomi.dev/)
11
11
  [![Agent Types](https://img.shields.io/badge/Agent%20Types-41-blue)]()
12
- [![Loki Mode](https://img.shields.io/badge/Loki%20Mode-98.78%25%20Pass%401-blueviolet)](benchmarks/results/)
13
- [![HumanEval](https://img.shields.io/badge/HumanEval-98.17%25%20Pass%401-brightgreen)](benchmarks/results/)
14
- [![SWE-bench](https://img.shields.io/badge/SWE--bench-99.67%25%20Patch%20Gen-brightgreen)](benchmarks/results/)
12
+ [![Benchmarks](https://img.shields.io/badge/Benchmarks-Infrastructure%20Ready-blue)](benchmarks/)
15
13
 
16
- **Current Version: v5.47.0**
14
+ **Current Version: v5.49.0**
17
15
 
18
16
  **[Autonomi](https://www.autonomi.dev/)** | **[Documentation](https://www.autonomi.dev/docs)** | **[GitHub](https://github.com/asklokesh/loki-mode)**
19
17
 
20
- > **PRD Deployed Product in Zero Human Intervention**
18
+ > **PRD to Deployed Product with Minimal Human Intervention**
21
19
  >
22
- > Loki Mode transforms a Product Requirements Document into a fully built, tested, deployed, and revenue-generating product while you sleep. No manual steps. No intervention. Just results.
20
+ > Loki Mode transforms a Product Requirements Document into a fully built, tested, and deployed product with autonomous multi-agent execution. Human oversight for deployment credentials, domain setup, and critical decisions.
23
21
 
24
22
  ---
25
23
 
@@ -79,7 +77,7 @@ jobs:
79
77
  runs-on: ubuntu-latest
80
78
  steps:
81
79
  - uses: actions/checkout@v4
82
- - uses: asklokesh/loki-mode@v5.38
80
+ - uses: asklokesh/loki-mode@v5
83
81
  with:
84
82
  github_token: ${{ secrets.GITHUB_TOKEN }}
85
83
  mode: review # review, fix, or test
@@ -163,40 +161,27 @@ See [skills/providers.md](skills/providers.md) for full provider documentation.
163
161
 
164
162
  ---
165
163
 
166
- ## Benchmark Results
164
+ ## Benchmarks
167
165
 
168
- ### Three-Way Comparison (HumanEval)
166
+ Benchmark infrastructure is included for HumanEval and SWE-bench evaluation. Results are self-reported from the included test harness and have not been independently verified.
169
167
 
170
- | System | Pass@1 | Details |
171
- |--------|--------|---------|
172
- | **Loki Mode (Multi-Agent)** | **98.78%** | 162/164 problems, RARV cycle recovered 2 |
173
- | Direct Claude | 98.17% | 161/164 problems (baseline) |
174
- | MetaGPT | 85.9-87.7% | Published benchmark |
168
+ | Benchmark | Result | Notes |
169
+ |-----------|--------|-------|
170
+ | HumanEval | 162/164 (98.78%) | Self-reported, max 3 retries per problem |
171
+ | SWE-bench | 299/300 patches generated | Patch generation only -- SWE-bench evaluator not yet run to verify correctness |
175
172
 
176
- **Loki Mode beats MetaGPT by +11-13%** thanks to the RARV (Reason-Act-Reflect-Verify) cycle.
173
+ **Note:** SWE-bench "patch generation" means the system produced a patch file, not that the patch correctly resolves the issue. The SWE-bench evaluator should be run to determine actual resolution rates.
177
174
 
178
- ### Full Results
179
-
180
- | Benchmark | Score | Details |
181
- |-----------|-------|---------|
182
- | **Loki Mode HumanEval** | **98.78% Pass@1** | 162/164 (multi-agent with RARV) |
183
- | **Direct Claude HumanEval** | **98.17% Pass@1** | 161/164 (single agent baseline) |
184
- | **Direct Claude SWE-bench** | **99.67% patch gen** | 299/300 problems |
185
- | **Loki Mode SWE-bench** | **99.67% patch gen** | 299/300 problems |
186
- | Model | Claude Opus 4.5 | |
187
-
188
- **Key Finding:** Multi-agent RARV matches single-agent performance on both benchmarks after timeout optimization. The 4-agent pipeline (Architect->Engineer->QA->Reviewer) achieves the same 99.67% patch generation as direct Claude.
189
-
190
- See [benchmarks/results/](benchmarks/results/) for full methodology and solutions.
175
+ See [benchmarks/](benchmarks/) for the test harness and raw results.
191
176
 
192
177
  ---
193
178
 
194
179
  ## What is Loki Mode?
195
180
 
196
- Loki Mode is a multi-provider AI skill that orchestrates **41 specialized AI agent types** across **7 swarms** to autonomously build, test, deploy, and scale complete startups. Works with **Claude Code**, **OpenAI Codex CLI**, and **Google Gemini CLI**. It dynamically spawns only the agents you need—**5-10 for simple projects, 100+ for complex startups**—working in parallel with continuous self-verification.
181
+ Loki Mode is a multi-provider AI skill that orchestrates **41 specialized AI agent types** across **8 swarms** to autonomously build, test, and deploy software projects. Works with **Claude Code**, **OpenAI Codex CLI**, and **Google Gemini CLI**. It dynamically spawns agents as needed -- typically **5-10 for simple projects, more for complex ones** -- working in parallel with continuous self-verification.
197
182
 
198
183
  ```
199
- PRD → Research → Architecture → Development → Testing → Deployment → Marketing → Revenue
184
+ PRD → Research → Architecture → Development → Testing → Deployment → Marketing
200
185
  ```
201
186
 
202
187
  **Just say "Loki Mode" and point to a PRD. Walk away. Come back to a deployed product.**
@@ -205,11 +190,11 @@ PRD → Research → Architecture → Development → Testing → Deployment →
205
190
 
206
191
  ## Why Loki Mode?
207
192
 
208
- ### **Better Than Anything Out There**
193
+ ### **How It Works**
209
194
 
210
195
  | What Others Do | What Loki Mode Does |
211
196
  |----------------|---------------------|
212
- | **Single agent** writes code linearly | **100+ agents** work in parallel across engineering, ops, business, data, product, and growth |
197
+ | **Single agent** writes code linearly | **Multiple agents** work in parallel across engineering, ops, business, data, product, and growth |
213
198
  | **Manual deployment** required | **Autonomous deployment** to AWS, GCP, Azure, Vercel, Railway with blue-green and canary strategies |
214
199
  | **No testing** or basic unit tests | **7 automated quality gates**: input/output guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage |
215
200
  | **Code only** - you handle the rest | **Full business operations**: marketing, sales, legal, HR, finance, investor relations |
@@ -221,8 +206,8 @@ PRD → Research → Architecture → Development → Testing → Deployment →
221
206
 
222
207
  ### **Core Advantages**
223
208
 
224
- 1. **Truly Autonomous**: RARV (Reason-Act-Reflect-Verify) cycle with self-verification achieves 2-3x quality improvement
225
- 2. **Massively Parallel**: 100+ agents working simultaneously, not sequential single-agent bottlenecks
209
+ 1. **Self-Verifying**: RARV (Reason-Act-Reflect-Verify) cycle with continuous self-verification catches errors early
210
+ 2. **Parallel Execution**: Multiple agents working simultaneously, not sequential single-agent bottlenecks
226
211
  3. **Production-Ready**: Not just code—handles deployment, monitoring, incident response, and business operations
227
212
  4. **Self-Improving**: Learns from mistakes, updates continuity logs, prevents repeated errors
228
213
  5. **Zero Babysitting**: Auto-resumes on rate limits, recovers from failures, runs until completion
@@ -255,7 +240,7 @@ PRD → Research → Architecture → Development → Testing → Deployment →
255
240
  | **GitHub Integration** | Issue import, PR creation, status sync | [GitHub Integration](skills/github-integration.md) |
256
241
  | **Distribution** | npm, Homebrew, Docker installation | [Installation Guide](docs/INSTALLATION.md) |
257
242
  | **Research Foundation** | OpenAI, DeepMind, Anthropic patterns | [Acknowledgements](docs/ACKNOWLEDGEMENTS.md) |
258
- | **Benchmarks** | HumanEval 98.78%, SWE-bench 99.67% | [Benchmark Results](benchmarks/results/) |
243
+ | **Benchmarks** | HumanEval and SWE-bench infrastructure included | [Benchmark Harness](benchmarks/) |
259
244
  | **Comparisons** | vs Auto-Claude, Cursor | [Auto-Claude](docs/auto-claude-comparison.md), [Cursor](docs/cursor-comparison.md) |
260
245
 
261
246
  ---
@@ -424,7 +409,7 @@ Loki Mode doesn't just write code—it **thinks, acts, learns, and verifies**:
424
409
  └─ Apply learning and RETRY from REASON
425
410
  ```
426
411
 
427
- **Result:** 2-3x quality improvement through continuous self-verification.
412
+ **Result:** Improved quality through continuous self-verification and multi-reviewer code review.
428
413
 
429
414
  ### **Perpetual Improvement Mode**
430
415
 
@@ -561,7 +546,7 @@ graph TB
561
546
  **Key components:**
562
547
  - **RARV+C Cycle** -- Reason, Act, Reflect, Verify, Compound. Every iteration follows this loop. Failed verification triggers retry from Reason.
563
548
  - **Provider Layer** -- Claude Code (full parallel agents, Task tool, MCP), Codex CLI and Gemini CLI (sequential, degraded mode).
564
- - **Agent Swarms** -- 41 specialized agent types across 7 swarms, spawned on demand based on project complexity.
549
+ - **Agent Swarms** -- 41 specialized agent types across 8 swarms, spawned on demand based on project complexity.
565
550
  - **Completion Council** -- 3 members vote on whether the project is done. Anti-sycophancy devil's advocate on unanimous votes.
566
551
  - **Memory System** -- Episodic traces, semantic patterns, procedural skills. Progressive disclosure reduces context usage by 60-80%.
567
552
  - **Dashboard** -- FastAPI server reading `.loki/` flat files, with real-time web UI for task queue, agents, logs, and council state. Now with TLS/HTTPS, OIDC/SSO, and RBAC (v5.36.0-v5.37.0).
@@ -609,7 +594,7 @@ Config search order: `.loki/config.yaml` (project) -> `~/.config/loki-mode/confi
609
594
 
610
595
  ## Agent Swarms (41 Types)
611
596
 
612
- Loki Mode has **41 predefined agent types** organized into **7 specialized swarms**. The orchestrator spawns only what you need -- simple projects use 5-10 agents, complex startups spawn 100+.
597
+ Loki Mode has **41 predefined agent types** organized into **8 specialized swarms**. The orchestrator spawns only what you need -- simple projects typically use 5-10 agents, complex ones may use more.
613
598
 
614
599
  <img width="5309" height="979" alt="Agent Swarms Visualization" src="https://github.com/user-attachments/assets/7d18635d-a606-401f-8d9f-430e6e4ee689" />
615
600
 
@@ -981,7 +966,7 @@ Built for the [Claude Code](https://claude.ai) ecosystem, powered by Anthropic's
981
966
 
982
967
  Loki Mode is the flagship product of **[Autonomi](https://www.autonomi.dev/)** -- a platform for autonomous AI systems. Like Alphabet is to Google, Autonomi is the parent brand under which Loki Mode and future products operate.
983
968
 
984
- **Why Autonomi?** Loki Mode proved that multi-agent autonomous systems can build real software from a PRD with zero human intervention. Autonomi is the expansion of that vision into a broader platform of autonomous services and products.
969
+ **Why Autonomi?** Loki Mode proved that multi-agent autonomous systems can build real software from a PRD with minimal human intervention. Autonomi is the expansion of that vision into a broader platform of autonomous services and products.
985
970
 
986
971
  - **[autonomi.dev](https://www.autonomi.dev/)** -- Main website
987
972
  - **[Documentation](https://www.autonomi.dev/docs)** -- Full documentation
package/SKILL.md CHANGED
@@ -1,9 +1,9 @@
1
1
  ---
2
2
  name: loki-mode
3
- description: Multi-agent autonomous startup system. Triggers on "Loki Mode". Takes PRD to deployed product with zero human intervention. Requires --dangerously-skip-permissions flag.
3
+ description: Multi-agent autonomous startup system. Triggers on "Loki Mode". Takes PRD to deployed product with minimal human intervention. Requires --dangerously-skip-permissions flag.
4
4
  ---
5
5
 
6
- # Loki Mode v5.48.2
6
+ # Loki Mode v5.49.1
7
7
 
8
8
  **You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.**
9
9
 
@@ -263,4 +263,4 @@ The following features are documented in skill modules but not yet fully automat
263
263
  | Quality gates 3-reviewer system | Implemented (v5.35.0) | 5 specialist reviewers in `skills/quality-gates.md`; execution in run.sh |
264
264
  | Benchmarks (HumanEval, SWE-bench) | Infrastructure only | Runner scripts and datasets exist in `benchmarks/`; no published results |
265
265
 
266
- **v5.48.2 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
266
+ **v5.49.1 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
package/VERSION CHANGED
@@ -1 +1 @@
1
- 5.48.2
1
+ 5.49.1
@@ -142,7 +142,7 @@ GROWTH ──[continuous improvement loop]──> GROWTH
142
142
  - `Bash` - Command execution
143
143
  - `platform-orchestrator` - Deployment and service management
144
144
 
145
- **The 37 agent types are ROLES defined through prompts, not subagent_types.**
145
+ **The 41 agent types are ROLES defined through prompts, not subagent_types.**
146
146
 
147
147
  ---
148
148
 
@@ -158,7 +158,7 @@ skills/
158
158
  quality-gates.md # 7-gate system, anti-sycophancy
159
159
  testing.md # Playwright, E2E, property-based
160
160
  production.md # CI/CD, batch processing
161
- agents.md # 37 agent types, A2A patterns
161
+ agents.md # 41 agent types, A2A patterns
162
162
  parallel-workflows.md # Git worktrees, parallel streams
163
163
  troubleshooting.md # Error recovery, fallbacks
164
164
  artifacts.md # Code generation patterns
@@ -432,6 +432,10 @@ app_runner_start() {
432
432
  (cd "$dir" && bash -c "$_APP_RUNNER_METHOD" >> "$_APP_RUNNER_DIR/app.log" 2>&1) &
433
433
  fi
434
434
  _APP_RUNNER_PID=$!
435
+ # Register with central PID registry if available
436
+ if type register_pid &>/dev/null; then
437
+ register_pid "$_APP_RUNNER_PID" "app-runner" "method=$_APP_RUNNER_METHOD"
438
+ fi
435
439
 
436
440
  # Write PID file
437
441
  echo "$_APP_RUNNER_PID" > "$_APP_RUNNER_DIR/app.pid"
@@ -497,6 +501,11 @@ app_runner_stop() {
497
501
  kill -KILL "-$_APP_RUNNER_PID" 2>/dev/null || kill -KILL "$_APP_RUNNER_PID" 2>/dev/null || true
498
502
  fi
499
503
 
504
+ # Unregister from central PID registry
505
+ if type unregister_pid &>/dev/null && [ -n "$_APP_RUNNER_PID" ]; then
506
+ unregister_pid "$_APP_RUNNER_PID"
507
+ fi
508
+
500
509
  rm -f "$_APP_RUNNER_DIR/app.pid"
501
510
  _write_app_state "stopped"
502
511
  log_info "App Runner: application stopped"
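The `register_pid` and `unregister_pid` helpers called in these hunks are defined outside this diff. The sketch below shows one possible shape for such a registry, assuming one JSON file per PID in a `pids/` directory with `label` and `ppid` fields, which is the layout the `cleanup` command later in this diff reads. The `demo_` function names, the `DEMO_PIDS_DIR` variable, and the `extra` field are hypothetical and not taken from the package.

```bash
#!/usr/bin/env bash
# Hypothetical registry location; the real one is assumed to be "$LOKI_DIR/pids".
DEMO_PIDS_DIR="${DEMO_PIDS_DIR:-$(mktemp -d)/pids}"

# Record a process as <pid>.json. Arguments: pid, label, optional extra info.
demo_register_pid() {
    local pid="$1" label="${2:-unknown}" extra="${3:-}"
    # Accept only plain non-negative integers as PIDs.
    case "$pid" in ''|*[!0-9]*) return 1 ;; esac
    mkdir -p "$DEMO_PIDS_DIR"
    # Compact JSON (no space after colons) so a sed-based fallback parser can read it.
    # Labels containing quotes or backslashes would need escaping; omitted here.
    printf '{"pid":%s,"label":"%s","ppid":%s,"extra":"%s"}\n' \
        "$pid" "$label" "$$" "$extra" > "$DEMO_PIDS_DIR/$pid.json"
}

# Remove the registry entry for a PID; succeeds even if the entry is already gone.
demo_unregister_pid() {
    local pid="$1"
    case "$pid" in ''|*[!0-9]*) return 1 ;; esac
    rm -f "$DEMO_PIDS_DIR/$pid.json"
}
```

Storing the registering shell's PID as `ppid` is what would let a later cleanup pass decide whether the entry belongs to a crashed session.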
@@ -45,6 +45,14 @@ COUNCIL_MIN_ITERATIONS=${LOKI_COUNCIL_MIN_ITERATIONS:-3}
45
45
  COUNCIL_CONVERGENCE_WINDOW=${LOKI_COUNCIL_CONVERGENCE_WINDOW:-3}
46
46
  COUNCIL_STAGNATION_LIMIT=${LOKI_COUNCIL_STAGNATION_LIMIT:-5}
47
47
 
48
+ # Error budget: severity-aware completion (v5.49.0)
49
+ # SEVERITY_THRESHOLD: minimum severity that blocks completion (critical, high, medium, low)
50
+ # "critical" = only critical issues block (most permissive)
51
+ # "low" = all issues block (strictest, default for backwards compat)
52
+ # ERROR_BUDGET: fraction of non-blocking issues allowed (0.0 = none, 0.1 = 10% tolerance)
53
+ COUNCIL_SEVERITY_THRESHOLD=${LOKI_COUNCIL_SEVERITY_THRESHOLD:-low}
54
+ COUNCIL_ERROR_BUDGET=${LOKI_COUNCIL_ERROR_BUDGET:-0.0}
55
+
48
56
  # Internal state
49
57
  COUNCIL_STATE_DIR=""
50
58
  COUNCIL_PRD_PATH=""
@@ -235,6 +243,38 @@ council_vote() {
235
243
  local vote_result
236
244
  vote_result=$(echo "$verdict" | grep -oE "VOTE:\s*(APPROVE|REJECT)" | grep -oE "APPROVE|REJECT" | head -1)
237
245
 
246
+ # Extract severity-categorized issues (v5.49.0 error budget)
247
+ local member_issues=""
248
+ member_issues=$(echo "$verdict" | grep -oE "ISSUES:\s*(CRITICAL|HIGH|MEDIUM|LOW):.*" || true)
249
+
250
+ # If error budget is active and member rejected, check if rejection
251
+ # is based only on issues below the severity threshold
252
+ if [ "$vote_result" = "REJECT" ] && [ "$COUNCIL_SEVERITY_THRESHOLD" != "low" ] && [ -n "$member_issues" ]; then
253
+ local has_blocking_issue=false
254
+ local severity_order="critical high medium low"
255
+ local threshold_reached=false
256
+
257
+ while IFS= read -r issue_line; do
258
+ local issue_severity
259
+ issue_severity=$(echo "$issue_line" | grep -oE "(CRITICAL|HIGH|MEDIUM|LOW)" | head -1 | tr '[:upper:]' '[:lower:]')
260
+ # Check if this severity meets or exceeds the threshold
261
+ for sev in $severity_order; do
262
+ if [ "$sev" = "$COUNCIL_SEVERITY_THRESHOLD" ]; then
263
+ threshold_reached=true
264
+ fi
265
+ if [ "$sev" = "$issue_severity" ] && [ "$threshold_reached" = "false" ]; then
266
+ has_blocking_issue=true
267
+ break
268
+ fi
269
+ done
270
+ done <<< "$member_issues"
271
+
272
+ if [ "$has_blocking_issue" = "false" ]; then
273
+ log_info " Member $member ($role): REJECT overridden to APPROVE (issues below ${COUNCIL_SEVERITY_THRESHOLD} threshold)"
274
+ vote_result="APPROVE"
275
+ fi
276
+ fi
277
+
238
278
  if [ "$vote_result" = "APPROVE" ]; then
239
279
  ((approve_count++))
240
280
  log_info " Member $member ($role): APPROVE"
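Reading the loop in this hunk, `threshold_reached` is set before the severity match is tested and is not reset between issue lines. As a result, an issue exactly at the threshold does not appear to count as blocking, and issues on later lines may be skipped. The sketch below shows an alternative comparison that treats "at or above the threshold" as blocking by mapping each severity to a numeric rank. The function names `severity_rank` and `is_blocking_issue` are hypothetical; this is an illustration under those assumptions, not the package's implementation.

```bash
#!/usr/bin/env bash
# Map a severity name (any case) to a rank; lower rank means more severe.
# Unknown names map to 99.
severity_rank() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        critical) echo 0 ;;
        high)     echo 1 ;;
        medium)   echo 2 ;;
        low)      echo 3 ;;
        *)        echo 99 ;;
    esac
}

# Exit status 0 when the issue severity ($1) is at or above the threshold ($2).
# An unknown threshold ranks 99, so every issue blocks (strict fallback).
is_blocking_issue() {
    local issue_rank threshold_rank
    issue_rank=$(severity_rank "$1")
    threshold_rank=$(severity_rank "$2")
    [ "$issue_rank" -le "$threshold_rank" ]
}
```

Because each call computes both ranks from scratch, no state carries over from one issue line to the next.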
@@ -618,23 +658,37 @@ council_member_review() {
618
658
  ;;
619
659
  esac
620
660
 
661
+ local severity_instruction=""
662
+ if [ "$COUNCIL_SEVERITY_THRESHOLD" != "low" ]; then
663
+ severity_instruction="
664
+ ERROR BUDGET: This council uses severity-aware evaluation.
665
+ - Categorize each issue as CRITICAL, HIGH, MEDIUM, or LOW severity
666
+ - Blocking threshold: ${COUNCIL_SEVERITY_THRESHOLD} and above
667
+ - Only issues at ${COUNCIL_SEVERITY_THRESHOLD} severity or above should cause REJECT
668
+ - Issues below threshold are acceptable (error budget: ${COUNCIL_ERROR_BUDGET})
669
+ - List issues as ISSUES: SEVERITY:description (one per line)"
670
+ fi
671
+
621
672
  local prompt="You are a council member reviewing project completion.
622
673
 
623
674
  ${role_instruction}
624
675
 
625
676
  EVIDENCE:
626
677
  ${evidence}
678
+ ${severity_instruction}
627
679
 
628
680
  INSTRUCTIONS:
629
681
  1. Review the evidence carefully
630
682
  2. Determine if the project meets completion criteria
631
683
  3. Output EXACTLY one line starting with VOTE:APPROVE or VOTE:REJECT
632
684
  4. Output EXACTLY one line starting with REASON: explaining your decision
633
- 5. Be honest - do not approve incomplete work
685
+ 5. If issues found, output lines starting with ISSUES: SEVERITY:description
686
+ 6. Be honest - do not approve incomplete work
634
687
 
635
- Output format (exactly two lines):
688
+ Output format:
636
689
  VOTE:APPROVE or VOTE:REJECT
637
- REASON: your reasoning here"
690
+ REASON: your reasoning here
691
+ ISSUES: CRITICAL:description (optional, one per line per issue)"
638
692
 
639
693
  local verdict_file="$vote_dir/member-${member_id}.txt"
640
694
 
@@ -1300,5 +1354,5 @@ council_get_dashboard_state() {
1300
1354
  state_json=$(cat "$COUNCIL_STATE_DIR/state.json" 2>/dev/null || echo "{}")
1301
1355
  fi
1302
1356
 
1303
- echo "\"council\": {\"enabled\": true, \"size\": $COUNCIL_SIZE, \"threshold\": $COUNCIL_THRESHOLD, \"check_interval\": $COUNCIL_CHECK_INTERVAL, \"consecutive_no_change\": $COUNCIL_CONSECUTIVE_NO_CHANGE, \"done_signals\": $COUNCIL_DONE_SIGNALS, \"iteration\": $ITERATION_COUNT, \"state\": $state_json}"
1357
+ echo "\"council\": {\"enabled\": true, \"size\": $COUNCIL_SIZE, \"threshold\": $COUNCIL_THRESHOLD, \"check_interval\": $COUNCIL_CHECK_INTERVAL, \"consecutive_no_change\": $COUNCIL_CONSECUTIVE_NO_CHANGE, \"done_signals\": $COUNCIL_DONE_SIGNALS, \"iteration\": $ITERATION_COUNT, \"severity_threshold\": \"$COUNCIL_SEVERITY_THRESHOLD\", \"error_budget\": $COUNCIL_ERROR_BUDGET, \"state\": $state_json}"
1304
1358
  }
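The council prompt in this diff asks each member for a `VOTE:` line, a `REASON:` line, and optional `ISSUES: SEVERITY:description` lines. The sketch below parses such a verdict with the same `grep -oE` approach used in `council_vote`, assuming `[[:space:]]` in place of `\s` for portability across grep implementations. The sample verdict text and the function names are invented for illustration.

```bash
#!/usr/bin/env bash
# Invented sample verdict in the format the prompt requests.
sample_verdict='VOTE:REJECT
REASON: two endpoints lack tests
ISSUES: HIGH:missing auth check on /admin
ISSUES: LOW:README typo'

# Print APPROVE or REJECT from the first VOTE line, or nothing if absent.
parse_vote() {
    printf '%s\n' "$1" \
        | grep -oE 'VOTE:[[:space:]]*(APPROVE|REJECT)' \
        | grep -oE 'APPROVE|REJECT' \
        | head -1
}

# Print the lowercase severity of each ISSUES line, one per line.
parse_issue_severities() {
    printf '%s\n' "$1" \
        | grep -oE 'ISSUES:[[:space:]]*(CRITICAL|HIGH|MEDIUM|LOW):' \
        | grep -oE 'CRITICAL|HIGH|MEDIUM|LOW' \
        | tr '[:upper:]' '[:lower:]'
}

parse_vote "$sample_verdict"
parse_issue_severities "$sample_verdict"
```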
@@ -30,11 +30,21 @@ BLOCKED_PATTERNS=(
30
30
  "wget.*\|.*sh"
31
31
  "curl.*\|.*bash"
32
32
  "wget.*\|.*bash"
33
+ # Config self-protection: prevent agents from corrupting internal state
34
+ "rm -rf \.loki"
35
+ "rm -rf \./\.loki"
36
+ "rm .*\.loki/council/"
37
+ "rm .*\.loki/config\.yaml"
38
+ "rm .*\.loki/logs/bash-audit"
39
+ "rm .*\.loki/session\.lock"
40
+ "> \.loki/council/"
41
+ "> \.loki/config\.yaml"
33
42
  )
34
43
 
35
- # Safe path patterns that override rm -rf / matches
44
+ # Safe path patterns that override blocked pattern matches
36
45
  SAFE_PATTERNS=(
37
46
  "rm -rf /tmp/"
47
+ "rm -rf \.loki/queue/dead-letter"
38
48
  )
39
49
 
40
50
  # Check for blocked patterns
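The matching code that consumes `BLOCKED_PATTERNS` and `SAFE_PATTERNS` is not shown in this hunk. The sketch below shows how a safe-pattern override could work, assuming the patterns are extended regular expressions checked with `grep -E` and that safe patterns are tested first. The `DEMO_` arrays hold a subset of the patterns above, and the `is_blocked` function is hypothetical.

```bash
#!/usr/bin/env bash
# Subset of the patterns from the hunk above, for illustration only.
DEMO_BLOCKED=(
    'rm -rf \.loki'
    'rm .*\.loki/config\.yaml'
    '> \.loki/config\.yaml'
)
DEMO_SAFE=(
    'rm -rf /tmp/'
    'rm -rf \.loki/queue/dead-letter'
)

# Exit status 0 when the command should be blocked, 1 when it is allowed.
is_blocked() {
    local cmd="$1" pattern
    # Safe patterns win: a match here allows the command outright.
    for pattern in "${DEMO_SAFE[@]}"; do
        if printf '%s\n' "$cmd" | grep -Eq -- "$pattern"; then
            return 1
        fi
    done
    for pattern in "${DEMO_BLOCKED[@]}"; do
        if printf '%s\n' "$cmd" | grep -Eq -- "$pattern"; then
            return 0
        fi
    done
    return 1
}
```

With this ordering, `rm -rf .loki/queue/dead-letter` is allowed even though it also matches `rm -rf \.loki`. One caveat of substring matching is that a compound command containing a safe fragment would pass as well, so a real checker may need stricter matching.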
package/autonomy/loki CHANGED
@@ -9,6 +9,7 @@
9
9
  # Usage:
10
10
  # loki start [PRD] - Start Loki Mode (optionally with PRD)
11
11
  # loki stop - Stop execution immediately
12
+ # loki cleanup - Kill orphaned processes from crashed sessions
12
13
  # loki pause - Pause after current session
13
14
  # loki resume - Resume paused execution
14
15
  # loki status - Show current status
@@ -312,6 +313,7 @@ show_help() {
312
313
  echo " init Build a PRD interactively or from templates"
313
314
  echo " issue <url|num> Generate PRD from GitHub issue and optionally start"
314
315
  echo " stop Stop execution immediately"
316
+ echo " cleanup Kill orphaned processes from crashed sessions"
315
317
  echo " pause Pause after current session"
316
318
  echo " resume Resume paused execution"
317
319
  echo " status [--json] Show current status (--json for machine-readable)"
@@ -704,6 +706,28 @@ except: pass
704
706
  rm -f "$LOKI_DIR/dashboard/dashboard.pid"
705
707
  fi
706
708
 
709
+ # Kill any remaining registered processes (2s graceful window matches run.sh)
710
+ if [ -d "$LOKI_DIR/pids" ]; then
711
+ for entry_file in "$LOKI_DIR/pids"/*.json; do
712
+ [ -f "$entry_file" ] || continue
713
+ local reg_pid
714
+ reg_pid=$(basename "$entry_file" .json)
715
+ case "$reg_pid" in ''|*[!0-9]*) continue ;; esac
716
+ if kill -0 "$reg_pid" 2>/dev/null; then
717
+ kill "$reg_pid" 2>/dev/null || true
718
+ local w=0
719
+ while [ $w -lt 4 ] && kill -0 "$reg_pid" 2>/dev/null; do
720
+ sleep 0.5
721
+ w=$((w + 1))
722
+ done
723
+ if kill -0 "$reg_pid" 2>/dev/null; then
724
+ kill -9 "$reg_pid" 2>/dev/null || true
725
+ fi
726
+ fi
727
+ rm -f "$entry_file"
728
+ done
729
+ fi
730
+
707
731
  # Emit session stop event
708
732
  emit_event session cli stop "reason=user_requested"
709
733
  # Emit success pattern for clean stop (SYN-018)
@@ -730,6 +754,86 @@ except: pass
730
754
  fi
731
755
  }
732
756
 
757
+ # Kill orphaned processes from crashed sessions
758
+ cmd_cleanup() {
759
+ local pids_dir="$LOKI_DIR/pids"
760
+ local killed=0
761
+ local stale=0
762
+
763
+ if [ ! -d "$pids_dir" ]; then
764
+ echo "No PID registry found. Nothing to clean up."
765
+ exit 0
766
+ fi
767
+
768
+ echo -e "${BOLD}Scanning for orphaned processes...${NC}"
769
+
770
+ for entry_file in "$pids_dir"/*.json; do
771
+ [ -f "$entry_file" ] || continue
772
+ local pid
773
+ pid=$(basename "$entry_file" .json)
774
+ case "$pid" in
775
+ ''|*[!0-9]*) continue ;;
776
+ esac
777
+
778
+ local label=""
779
+ local ppid_val=""
780
+ # Parse JSON fields (python3 with shell fallback)
781
+ if command -v python3 >/dev/null 2>&1; then
782
+ label=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1])).get('label','unknown'))" "$entry_file" 2>/dev/null) || label="unknown"
783
+ ppid_val=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1])).get('ppid',''))" "$entry_file" 2>/dev/null) || true
784
+ else
785
+ label=$(sed 's/.*"label":"//' "$entry_file" 2>/dev/null | sed 's/".*//' | head -1) || label="unknown"
786
+ ppid_val=$(sed 's/.*"ppid"://' "$entry_file" 2>/dev/null | sed 's/[,}].*//' | head -1) || true
787
+ fi
788
+
789
+ if kill -0 "$pid" 2>/dev/null; then
790
+ # Process is alive - check if parent is dead (orphan)
791
+ local is_orphan=false
792
+ # Validate ppid_val is numeric before using with kill
793
+ case "$ppid_val" in ''|*[!0-9]*) ppid_val="" ;; esac
794
+ if [ -n "$ppid_val" ] && ! kill -0 "$ppid_val" 2>/dev/null; then
795
+ is_orphan=true
796
+ fi
797
+
798
+ if [ "$is_orphan" = true ] || [ "${1:-}" = "--force" ]; then
799
+ echo -e " ${RED}Killing${NC} PID=$pid label=$label (parent $ppid_val dead)"
800
+ kill "$pid" 2>/dev/null || true
801
+ sleep 0.5
802
+ if kill -0 "$pid" 2>/dev/null; then
803
+ kill -9 "$pid" 2>/dev/null || true
804
+ fi
805
+ rm -f "$entry_file"
806
+ killed=$((killed + 1))
807
+ else
808
+ echo -e " ${GREEN}Alive${NC} PID=$pid label=$label (parent $ppid_val alive)"
809
+ fi
810
+ else
811
+ # Process is dead - clean up stale entry
812
+ rm -f "$entry_file"
813
+ stale=$((stale + 1))
814
+ fi
815
+ done
816
+
817
+ echo ""
818
+ echo "Results: $killed orphan(s) killed, $stale stale entries cleaned"
819
+
820
+ # Also kill orphaned loki-run temp scripts
821
+ local temp_killed=0
822
+ if pgrep -f "loki-run-" >/dev/null 2>&1; then
823
+ if ! is_session_running; then
824
+ echo "Killing orphaned loki-run temp scripts..."
825
+ pkill -f "loki-run-" 2>/dev/null || true
826
+ sleep 0.5
827
+ pkill -9 -f "loki-run-" 2>/dev/null || true
828
+ temp_killed=1
829
+ fi
830
+ fi
831
+
832
+ if [ $killed -eq 0 ] && [ $stale -eq 0 ] && [ $temp_killed -eq 0 ]; then
833
+ echo -e "${GREEN}System is clean. No orphans found.${NC}"
834
+ fi
835
+ }
836
+
733
837
  # Pause after current session
734
838
  cmd_pause() {
735
839
  if [ ! -d "$LOKI_DIR" ]; then
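The stop path above polls up to four times at 0.5 s intervals before sending SIGKILL, while `cmd_cleanup` sleeps once for 0.5 s. The sketch below shows one way to express the shared pattern (SIGTERM, bounded wait, then SIGKILL) as a single helper. The name `terminate_gracefully` and its `max_polls` parameter are hypothetical and do not appear in the package.

```bash
#!/usr/bin/env bash
# Send SIGTERM, poll up to max_polls times at 0.5 s intervals, then SIGKILL.
# Arguments: pid, optional max_polls (default 4, a 2-second window).
terminate_gracefully() {
    local pid="$1" max_polls="${2:-4}" polls=0
    # Accept only plain non-negative integers as PIDs.
    case "$pid" in ''|*[!0-9]*) return 1 ;; esac
    # Nothing to do if the process is already gone.
    kill -0 "$pid" 2>/dev/null || return 0
    kill "$pid" 2>/dev/null || true
    while [ "$polls" -lt "$max_polls" ] && kill -0 "$pid" 2>/dev/null; do
        sleep 0.5
        polls=$((polls + 1))
    done
    # Still alive after the graceful window: force-kill.
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null || true
    fi
    return 0
}
```

A caller would pass the registered PID, for example `terminate_gracefully "$reg_pid" 4`, and remove the registry entry afterwards.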
@@ -4497,6 +4601,9 @@ main() {
4497
4601
  stop)
4498
4602
  cmd_stop
4499
4603
  ;;
4604
+ cleanup)
4605
+ cmd_cleanup "$@"
4606
+ ;;
4500
4607
  pause)
4501
4608
  cmd_pause
4502
4609
  ;;