npm - loki-mode - Versions diffs - 5.49.0 → 5.49.2 - Mend

loki-mode 5.49.0 → 5.49.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (28)

package/README.md +70 -121
package/SKILL.md +3 -3
package/VERSION +1 -1
package/autonomy/CONSTITUTION.md +4 -4
package/autonomy/app-runner.sh +9 -0
package/autonomy/loki +107 -0
package/autonomy/run.sh +170 -4
package/dashboard/__init__.py +1 -1
package/dashboard/server.py +172 -20
package/dashboard/static/index.html +1 -1
package/docs/COMPARISON.md +15 -15
package/docs/COMPETITIVE-ANALYSIS.md +4 -4
package/docs/INSTALLATION.md +20 -12
package/docs/alternative-installations.md +145 -0
package/docs/auto-claude-comparison.md +1 -1
package/docs/cursor-comparison.md +7 -7
package/docs/thick2thin.md +2 -2
package/mcp/__init__.py +1 -1
package/package.json +1 -1
package/references/agent-types.md +2 -2
package/references/agents.md +1 -1
package/references/competitive-analysis.md +1 -1
package/references/core-workflow.md +1 -1
package/skills/00-index.md +1 -1
package/skills/agents.md +3 -3
package/skills/artifacts.md +1 -1
package/skills/parallel-workflows.md +1 -1
package/skills/quality-gates.md +4 -2

package/README.md CHANGED

@@ -1,6 +1,6 @@
 # Loki Mode
-**The Flagship Product of [Autonomi](https://www.autonomi.dev/) -- The First Truly Autonomous Multi-Agent Startup System**
+**The Flagship Product of [Autonomi](https://www.autonomi.dev/) -- An Autonomous Multi-Agent Development System**
 [![npm version](https://img.shields.io/npm/v/loki-mode)](https://www.npmjs.com/package/loki-mode)
 [![npm downloads](https://img.shields.io/npm/dw/loki-mode)](https://www.npmjs.com/package/loki-mode)
@@ -9,17 +9,15 @@
 [![GitHub Marketplace](https://img.shields.io/badge/Marketplace-Loki%20Mode-purple?logo=github)](https://github.com/marketplace/actions/loki-mode-code-review)
 [![Autonomi](https://img.shields.io/badge/Autonomi-autonomi.dev-5B4EEA)](https://www.autonomi.dev/)
 [![Agent Types](https://img.shields.io/badge/Agent%20Types-41-blue)]()
-[![Loki Mode](https://img.shields.io/badge/Loki%20Mode-98.78%25%20Pass%401-blueviolet)](benchmarks/results/)
-[![HumanEval](https://img.shields.io/badge/HumanEval-98.17%25%20Pass%401-brightgreen)](benchmarks/results/)
-[![SWE-bench](https://img.shields.io/badge/SWE--bench-99.67%25%20Patch%20Gen-brightgreen)](benchmarks/results/)
+[![Benchmarks](https://img.shields.io/badge/Benchmarks-Infrastructure%20Ready-blue)](benchmarks/)
-**Current Version: v5.47.0**
+**Current Version: v5.49.2**
 **[Autonomi](https://www.autonomi.dev/)** | **[Documentation](https://www.autonomi.dev/docs)** | **[GitHub](https://github.com/asklokesh/loki-mode)**
-> **PRD → Deployed Product in Zero Human Intervention**
+> **PRD to Deployed Product with Minimal Human Intervention**
 >
-> Loki Mode transforms a Product Requirements Document into a fully built, tested, deployed, and revenue-generating product while you sleep. No manual steps. No intervention. Just results.
+> Loki Mode transforms a Product Requirements Document into a fully built, tested, and deployed product with autonomous multi-agent execution. Human oversight for deployment credentials, domain setup, and critical decisions.
 ---
@@ -27,7 +25,7 @@
 [![asciicast](https://asciinema.org/a/AjjnjzOeKLYItp6s.svg)](https://asciinema.org/a/AjjnjzOeKLYItp6s)
-*Click to watch Loki Mode v5.42 -- CLI commands, dashboard, 8 parallel agents, 7-gate quality, Completion Council, memory system*
+*Click to watch Loki Mode v5.42 -- CLI commands, dashboard, 8 parallel agents, 9-gate quality, Completion Council, memory system*
 ---
@@ -41,98 +39,38 @@
 ---
-## Usage
-### Option 1: npm (Recommended)
+## Installation
 ```bash
-npm install -g loki-mode
-loki start ./my-prd.md
+git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
 ```
-### Option 2: Claude Code Skill
+That's it. Claude Code auto-discovers skills in `~/.claude/skills/`.
+### Use It
 ```bash
-git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
 claude --dangerously-skip-permissions
-# Then say: Loki Mode with PRD at ./my-prd.md
+# Then say: "Loki Mode with PRD at ./my-prd.md"
 ```
-### Option 3: GitHub Action
+### Update
-Add automated AI code review to your pull requests:
-```yaml
-# .github/workflows/loki-review.yml
-name: Loki Code Review
-on:
-  pull_request:
-    types: [opened, synchronize]
-permissions:
-  contents: read
-  pull-requests: write
-jobs:
-  review:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: asklokesh/loki-mode@v5.38
-        with:
-          github_token: ${{ secrets.GITHUB_TOKEN }}
-          mode: review          # review, fix, or test
-          provider: claude      # claude, codex, or gemini
-          max_iterations: 3     # sets LOKI_MAX_ITERATIONS env var
-          budget_limit: '5.00'  # max cost in USD (maps to --budget flag)
-        env:
-          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
-```
-**Prerequisites:**
-- An API key for your chosen provider (set as a repository secret):
-  - Claude: `ANTHROPIC_API_KEY`
-  - Codex: `OPENAI_API_KEY`
-  - Gemini: `GOOGLE_API_KEY`
-- The action automatically installs `loki-mode` and `@anthropic-ai/claude-code` (for the Claude provider)
-**Action Inputs:**
-| Input | Default | Description |
-|-------|---------|-------------|
-| `mode` | `review` | `review`, `fix`, or `test` |
-| `provider` | `claude` | `claude`, `codex`, or `gemini` |
-| `budget_limit` | `5.00` | Max cost in USD (maps to `--budget` CLI flag) |
-| `budget` | | Alias for `budget_limit` |
-| `max_iterations` | `3` | Sets `LOKI_MAX_ITERATIONS` env var |
-| `github_token` | (required) | GitHub token for PR comments |
-| `prd_file` | | Path to PRD file relative to repo root |
-| `auto_confirm` | `true` | Skip confirmation prompts (always true in CI) |
-| `install_claude` | `true` | Auto-install Claude Code CLI if not present |
-| `node_version` | `20` | Node.js version |
-**Using with a PRD file (fix/test modes):**
-```yaml
-- uses: asklokesh/loki-mode@v5
-  with:
-    mode: fix
-    prd_file: 'docs/my-prd.md'
-    github_token: ${{ secrets.GITHUB_TOKEN }}
-  env:
-    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+```bash
+cd ~/.claude/skills/loki-mode && git pull
 ```
-**Modes:**
+### Troubleshooting
-| Mode | Description |
-|------|-------------|
-| `review` | Analyze PR diff, post structured review as PR comment |
-| `fix` | Automatically fix issues found in the codebase |
-| `test` | Run autonomous test generation and validation |
+| Problem | Fix |
+|---------|-----|
+| `SKILL.md` not found | Verify: `ls ~/.claude/skills/loki-mode/SKILL.md` |
+| Claude doesn't recognize "Loki Mode" | Restart Claude Code after cloning |
+| Permission denied on clone | Check SSH keys or use HTTPS URL above |
-Also available via **Homebrew**, **Docker**, **VS Code Extension**, and **direct shell script**. See the [Installation Guide](docs/INSTALLATION.md) for all 7 installation methods and detailed instructions.
+### Other Installation Methods
+Also available via **npm**, **Homebrew**, **Docker**, **GitHub Action**, and **VS Code Extension**. See [docs/alternative-installations.md](docs/alternative-installations.md) for details and limitations of each method.
 ### Multi-Provider Support (v5.0.0)
@@ -163,55 +101,66 @@ See [skills/providers.md](skills/providers.md) for full provider documentation.
 ---
-## Benchmark Results
-### Three-Way Comparison (HumanEval)
-| System | Pass@1 | Details |
-|--------|--------|---------|
-| **Loki Mode (Multi-Agent)** | **98.78%** | 162/164 problems, RARV cycle recovered 2 |
-| Direct Claude | 98.17% | 161/164 problems (baseline) |
-| MetaGPT | 85.9-87.7% | Published benchmark |
+## Benchmarks
-**Loki Mode beats MetaGPT by +11-13%** thanks to the RARV (Reason-Act-Reflect-Verify) cycle.
+Benchmark infrastructure is included for HumanEval and SWE-bench evaluation. Results are self-reported from the included test harness and have not been independently verified.
-### Full Results
+| Benchmark | Result | Notes |
+|-----------|--------|-------|
+| HumanEval | 162/164 (98.78%) | Self-reported, max 3 retries per problem |
+| SWE-bench | 299/300 patches generated | Patch generation only -- SWE-bench evaluator not yet run to verify correctness |
-| Benchmark | Score | Details |
-|-----------|-------|---------|
-| **Loki Mode HumanEval** | **98.78% Pass@1** | 162/164 (multi-agent with RARV) |
-| **Direct Claude HumanEval** | **98.17% Pass@1** | 161/164 (single agent baseline) |
-| **Direct Claude SWE-bench** | **99.67% patch gen** | 299/300 problems |
-| **Loki Mode SWE-bench** | **99.67% patch gen** | 299/300 problems |
-| Model | Claude Opus 4.5 | |
+**Note:** SWE-bench "patch generation" means the system produced a patch file, not that the patch correctly resolves the issue. The SWE-bench evaluator should be run to determine actual resolution rates.
-**Key Finding:** Multi-agent RARV matches single-agent performance on both benchmarks after timeout optimization. The 4-agent pipeline (Architect->Engineer->QA->Reviewer) achieves the same 99.67% patch generation as direct Claude.
-See [benchmarks/results/](benchmarks/results/) for full methodology and solutions.
+See [benchmarks/](benchmarks/) for the test harness and raw results.
 ---
 ## What is Loki Mode?
-Loki Mode is a multi-provider AI skill that orchestrates **41 specialized AI agent types** across **7 swarms** to autonomously build, test, deploy, and scale complete startups. Works with **Claude Code**, **OpenAI Codex CLI**, and **Google Gemini CLI**. It dynamically spawns only the agents you need—**5-10 for simple projects, 100+ for complex startups**—working in parallel with continuous self-verification.
+Loki Mode is a multi-provider AI skill that orchestrates **41 specialized AI agent types** across **8 swarms** to autonomously build, test, and deploy software projects. Works with **Claude Code**, **OpenAI Codex CLI**, and **Google Gemini CLI**. It dynamically spawns agents as needed -- typically **5-10 for simple projects, more for complex ones** -- working in parallel with continuous self-verification.
 ```
-PRD → Research → Architecture → Development → Testing → Deployment → Marketing → Revenue
+PRD → Research → Architecture → Development → Testing → Deployment → Marketing
 ```
 **Just say "Loki Mode" and point to a PRD. Walk away. Come back to a deployed product.**
 ---
+## Current Limitations
+Loki Mode is powerful but not magic. Be aware of these honest limitations:
+| Area | What Works | What Doesn't (Yet) |
+|------|-----------|---------------------|
+| **Code Generation** | Generates full-stack applications from PRDs | Complex domain logic may need human review and correction |
+| **Deployment** | Generates deployment configs and scripts | Does not have cloud credentials -- human must provide and authorize |
+| **Testing** | 9 automated quality gates, blind review | Test quality depends on AI-generated assertions; mutation testing is heuristic |
+| **Business Ops** | Generates marketing copy, legal templates | Does not actually send emails, file legal documents, or process payments |
+| **Multi-Provider** | Claude (full), Codex (degraded), Gemini (degraded) | Codex and Gemini lack parallel agents and Task tool -- sequential only |
+| **Memory System** | Episodic, semantic, procedural memory tiers | Vector search requires optional `sentence-transformers` dependency |
+| **Enterprise Security** | TLS, OIDC, RBAC, audit trail, SIEM configs | Self-signed certs only; production deployments need real certificates |
+| **Dashboard** | Real-time status, task queue, agent monitoring | Single-machine only; no multi-node dashboard clustering |
+| **Benchmarks** | HumanEval 98.78%, SWE-bench 299/300 patches | Self-reported; SWE-bench counts patch generation, not verified resolution |
+**What "autonomous" means in practice:**
+- Loki Mode runs without prompting between RARV cycles
+- It does NOT have access to your cloud accounts, payment systems, or external services unless you provide credentials
+- Human oversight is expected for: deployment credentials, domain setup, API keys, and critical business decisions
+- The system is as good as the underlying AI model -- it can make mistakes, especially on novel or complex problems
+---
 ## Why Loki Mode?
-### **Better Than Anything Out There**
+### **How It Works**
 | What Others Do | What Loki Mode Does |
 |----------------|---------------------|
-| **Single agent** writes code linearly | **100+ agents** work in parallel across engineering, ops, business, data, product, and growth |
+| **Single agent** writes code linearly | **Multiple agents** work in parallel across engineering, ops, business, data, product, and growth |
 | **Manual deployment** required | **Autonomous deployment** to AWS, GCP, Azure, Vercel, Railway with blue-green and canary strategies |
-| **No testing** or basic unit tests | **7 automated quality gates**: input/output guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage |
+| **No testing** or basic unit tests | **9 automated quality gates**: input/output guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage, mock detection, mutation detection |
 | **Code only** - you handle the rest | **Full business operations**: marketing, sales, legal, HR, finance, investor relations |
 | **Stops on errors** | **Self-healing**: circuit breakers, dead letter queues, exponential backoff, automatic recovery |
 | **No visibility** into progress | **Real-time dashboard** with agent monitoring, task queues, and live status updates |
@@ -221,8 +170,8 @@ PRD → Research → Architecture → Development → Testing → Deployment →
 ### **Core Advantages**
-1. **Truly Autonomous**: RARV (Reason-Act-Reflect-Verify) cycle with self-verification achieves 2-3x quality improvement
-2. **Massively Parallel**: 100+ agents working simultaneously, not sequential single-agent bottlenecks
+1. **Self-Verifying**: RARV (Reason-Act-Reflect-Verify) cycle with continuous self-verification catches errors early
+2. **Parallel Execution**: Multiple agents working simultaneously, not sequential single-agent bottlenecks
 3. **Production-Ready**: Not just code—handles deployment, monitoring, incident response, and business operations
 4. **Self-Improving**: Learns from mistakes, updates continuity logs, prevents repeated errors
 5. **Zero Babysitting**: Auto-resumes on rate limits, recovers from failures, runs until completion
@@ -249,13 +198,13 @@ PRD → Research → Architecture → Development → Testing → Deployment →
 | **OpenClaw Bridge (v5.38.0)** | Multi-agent coordination protocol | [OpenClaw Integration](docs/openclaw-integration.md) |
 | **41 Agent Types** | Engineering, Ops, Business, Data, Product, Growth, Orchestration | [Agent Definitions](references/agent-types.md) |
 | **RARV Cycle** | Reason-Act-Reflect-Verify workflow | [Core Workflow](references/core-workflow.md) |
-| **Quality Gates** | 7-gate system: guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage | [Quality Control](references/quality-control.md) |
+| **Quality Gates** | 9-gate system: guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage, mock detection, mutation detection | [Quality Control](references/quality-control.md) |
 | **Memory System (v5.15.0)** | Complete 3-tier memory with progressive disclosure | [Memory Architecture](references/memory-system.md) |
 | **Parallel Workflows** | Git worktree-based parallelism | [Parallel Workflows](skills/parallel-workflows.md) |
 | **GitHub Integration** | Issue import, PR creation, status sync | [GitHub Integration](skills/github-integration.md) |
 | **Distribution** | npm, Homebrew, Docker installation | [Installation Guide](docs/INSTALLATION.md) |
 | **Research Foundation** | OpenAI, DeepMind, Anthropic patterns | [Acknowledgements](docs/ACKNOWLEDGEMENTS.md) |
-| **Benchmarks** | HumanEval 98.78%, SWE-bench 99.67% | [Benchmark Results](benchmarks/results/) |
+| **Benchmarks** | HumanEval and SWE-bench infrastructure included | [Benchmark Harness](benchmarks/) |
 | **Comparisons** | vs Auto-Claude, Cursor | [Auto-Claude](docs/auto-claude-comparison.md), [Cursor](docs/cursor-comparison.md) |
 ---
@@ -424,7 +373,7 @@ Loki Mode doesn't just write code—it **thinks, acts, learns, and verifies**:
    └─ Apply learning and RETRY from REASON
 ```
-**Result:** 2-3x quality improvement through continuous self-verification.
+**Result:** Improved quality through continuous self-verification and multi-reviewer code review.
 ### **Perpetual Improvement Mode**
@@ -561,7 +510,7 @@ graph TB
 **Key components:**
 - **RARV+C Cycle** -- Reason, Act, Reflect, Verify, Compound. Every iteration follows this loop. Failed verification triggers retry from Reason.
 - **Provider Layer** -- Claude Code (full parallel agents, Task tool, MCP), Codex CLI and Gemini CLI (sequential, degraded mode).
-- **Agent Swarms** -- 41 specialized agent types across 7 swarms, spawned on demand based on project complexity.
+- **Agent Swarms** -- 41 specialized agent types across 8 swarms, spawned on demand based on project complexity.
 - **Completion Council** -- 3 members vote on whether the project is done. Anti-sycophancy devil's advocate on unanimous votes.
 - **Memory System** -- Episodic traces, semantic patterns, procedural skills. Progressive disclosure reduces context usage by 60-80%.
 - **Dashboard** -- FastAPI server reading `.loki/` flat files, with real-time web UI for task queue, agents, logs, and council state. Now with TLS/HTTPS, OIDC/SSO, and RBAC (v5.36.0-v5.37.0).
@@ -609,7 +558,7 @@ Config search order: `.loki/config.yaml` (project) -> `~/.config/loki-mode/confi
 ## Agent Swarms (41 Types)
-Loki Mode has **41 predefined agent types** organized into **7 specialized swarms**. The orchestrator spawns only what you need—simple projects use 5-10 agents, complex startups spawn 100+.
+Loki Mode has **41 predefined agent types** organized into **8 specialized swarms**. The orchestrator spawns only what you need -- simple projects typically use 5-10 agents, complex ones may use more.
 <img width="5309" height="979" alt="Agent Swarms Visualization" src="https://github.com/user-attachments/assets/7d18635d-a606-401f-8d9f-430e6e4ee689" />
@@ -676,7 +625,7 @@ references/                    # Deep documentation (23KB+ files)
 | **2. Architecture** | Tech stack selection with self-reflection |
 | **3. Infrastructure** | Provision cloud, CI/CD, monitoring |
 | **4. Development** | Implement with TDD, parallel code review |
-| **5. QA** | 7 quality gates, security audit, load testing |
+| **5. QA** | 9 quality gates, security audit, load testing |
 | **6. Deployment** | Blue-green deploy, auto-rollback on errors |
 | **7. Business** | Marketing, sales, legal, support setup |
 | **8. Growth** | Continuous optimization, A/B testing, feedback loops |
@@ -981,7 +930,7 @@ Built for the [Claude Code](https://claude.ai) ecosystem, powered by Anthropic's
 Loki Mode is the flagship product of **[Autonomi](https://www.autonomi.dev/)** -- a platform for autonomous AI systems. Like Alphabet is to Google, Autonomi is the parent brand under which Loki Mode and future products operate.
-**Why Autonomi?** Loki Mode proved that multi-agent autonomous systems can build real software from a PRD with zero human intervention. Autonomi is the expansion of that vision into a broader platform of autonomous services and products.
+**Why Autonomi?** Loki Mode proved that multi-agent autonomous systems can build real software from a PRD with minimal human intervention. Autonomi is the expansion of that vision into a broader platform of autonomous services and products.
 - **[autonomi.dev](https://www.autonomi.dev/)** -- Main website
 - **[Documentation](https://www.autonomi.dev/docs)** -- Full documentation

package/SKILL.md CHANGED

@@ -1,9 +1,9 @@
 ---
 name: loki-mode
-description: Multi-agent autonomous startup system. Triggers on "Loki Mode". Takes PRD to deployed product with zero human intervention. Requires --dangerously-skip-permissions flag.
+description: Multi-agent autonomous startup system. Triggers on "Loki Mode". Takes PRD to deployed product with minimal human intervention. Requires --dangerously-skip-permissions flag.
 ---
-# Loki Mode v5.49.0
+# Loki Mode v5.49.2
 **You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.**
@@ -263,4 +263,4 @@ The following features are documented in skill modules but not yet fully automat
 | Quality gates 3-reviewer system | Implemented (v5.35.0) | 5 specialist reviewers in `skills/quality-gates.md`; execution in run.sh |
 | Benchmarks (HumanEval, SWE-bench) | Infrastructure only | Runner scripts and datasets exist in `benchmarks/`; no published results |
-**v5.49.0 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
+**v5.49.2 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**

package/VERSION CHANGED

@@ -1 +1 @@
-5.49.0
+5.49.2

package/autonomy/CONSTITUTION.md CHANGED

@@ -142,7 +142,7 @@ GROWTH ──[continuous improvement loop]──> GROWTH
 - `Bash` - Command execution
 - `platform-orchestrator` - Deployment and service management
-**The 37 agent types are ROLES defined through prompts, not subagent_types.**
+**The 41 agent types are ROLES defined through prompts, not subagent_types.**
 ---
@@ -155,10 +155,10 @@ SKILL.md (~190 lines)         # Always loaded: RARV cycle, autonomy rules
 skills/
   00-index.md                  # Module routing table
   model-selection.md           # Task tool, parallelization
-  quality-gates.md             # 7-gate system, anti-sycophancy
+  quality-gates.md             # 9-gate system, anti-sycophancy
   testing.md                   # Playwright, E2E, property-based
   production.md                # CI/CD, batch processing
-  agents.md                    # 37 agent types, A2A patterns
+  agents.md                    # 41 agent types, A2A patterns
   parallel-workflows.md        # Git worktrees, parallel streams
   troubleshooting.md           # Error recovery, fallbacks
   artifacts.md                 # Code generation patterns
@@ -196,7 +196,7 @@ Main Worktree (orchestrator)
 ---
-## Quality Gates (7-Gate System)
+## Quality Gates (9-Gate System)
 ### Gate 1: Static Analysis
 ```yaml

package/autonomy/app-runner.sh CHANGED

@@ -432,6 +432,10 @@ app_runner_start() {
         (cd "$dir" && bash -c "$_APP_RUNNER_METHOD" >> "$_APP_RUNNER_DIR/app.log" 2>&1) &
     fi
     _APP_RUNNER_PID=$!
+    # Register with central PID registry if available
+    if type register_pid &>/dev/null; then
+        register_pid "$_APP_RUNNER_PID" "app-runner" "method=$_APP_RUNNER_METHOD"
+    fi
     # Write PID file
     echo "$_APP_RUNNER_PID" > "$_APP_RUNNER_DIR/app.pid"
@@ -497,6 +501,11 @@ app_runner_stop() {
         kill -KILL "-$_APP_RUNNER_PID" 2>/dev/null || kill -KILL "$_APP_RUNNER_PID" 2>/dev/null || true
     fi
+    # Unregister from central PID registry
+    if type unregister_pid &>/dev/null && [ -n "$_APP_RUNNER_PID" ]; then
+        unregister_pid "$_APP_RUNNER_PID"
+    fi
     rm -f "$_APP_RUNNER_DIR/app.pid"
     _write_app_state "stopped"
     log_info "App Runner: application stopped"
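
The `register_pid` and `unregister_pid` helpers called in these hunks are defined elsewhere, and their bodies do not appear in this part of the diff. A minimal sketch of what such a registry could look like, assuming the one-file-per-PID layout (`$LOKI_DIR/pids/<pid>.json` with `label` and `ppid` fields) that the `loki cleanup` command reads. The `demo_` function names, the `extra` field, and the use of `$$` as the recorded parent are assumptions for illustration, not the package's actual schema:

```bash
#!/usr/bin/env bash
# Hypothetical sketch, not the package's implementation.
LOKI_DIR="${LOKI_DIR:-.loki}"

demo_register_pid() {
    local pid="$1" label="${2:-unknown}" extra="${3:-}"
    # Refuse anything that is not a plain decimal PID
    case "$pid" in ''|*[!0-9]*) return 1 ;; esac
    mkdir -p "$LOKI_DIR/pids"
    # Compact JSON (no spaces after colons) so a sed-based reader can parse it.
    # Assumes label and extra contain no double quotes or backslashes.
    printf '{"pid":%s,"label":"%s","ppid":%s,"extra":"%s"}\n' \
        "$pid" "$label" "$$" "$extra" > "$LOKI_DIR/pids/$pid.json"
}

demo_unregister_pid() {
    local pid="$1"
    case "$pid" in ''|*[!0-9]*) return 1 ;; esac
    # rm -f succeeds even if the entry was already removed
    rm -f "$LOKI_DIR/pids/$pid.json"
}
```

Guarding the call sites with `type register_pid &>/dev/null`, as the hunks above do, keeps `app-runner.sh` usable when it is sourced without the registry functions loaded.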

package/autonomy/loki CHANGED

@@ -9,6 +9,7 @@
 # Usage:
 #   loki start [PRD]      - Start Loki Mode (optionally with PRD)
 #   loki stop             - Stop execution immediately
+#   loki cleanup          - Kill orphaned processes from crashed sessions
 #   loki pause            - Pause after current session
 #   loki resume           - Resume paused execution
 #   loki status           - Show current status
@@ -312,6 +313,7 @@ show_help() {
     echo "  init             Build a PRD interactively or from templates"
     echo "  issue <url|num>  Generate PRD from GitHub issue and optionally start"
     echo "  stop             Stop execution immediately"
+    echo "  cleanup          Kill orphaned processes from crashed sessions"
     echo "  pause            Pause after current session"
     echo "  resume           Resume paused execution"
     echo "  status [--json]  Show current status (--json for machine-readable)"
@@ -704,6 +706,28 @@ except: pass
             rm -f "$LOKI_DIR/dashboard/dashboard.pid"
         fi
+        # Kill any remaining registered processes (2s graceful window matches run.sh)
+        if [ -d "$LOKI_DIR/pids" ]; then
+            for entry_file in "$LOKI_DIR/pids"/*.json; do
+                [ -f "$entry_file" ] || continue
+                local reg_pid
+                reg_pid=$(basename "$entry_file" .json)
+                case "$reg_pid" in ''|*[!0-9]*) continue ;; esac
+                if kill -0 "$reg_pid" 2>/dev/null; then
+                    kill "$reg_pid" 2>/dev/null || true
+                    local w=0
+                    while [ $w -lt 4 ] && kill -0 "$reg_pid" 2>/dev/null; do
+                        sleep 0.5
+                        w=$((w + 1))
+                    done
+                    if kill -0 "$reg_pid" 2>/dev/null; then
+                        kill -9 "$reg_pid" 2>/dev/null || true
+                    fi
+                fi
+                rm -f "$entry_file"
+            done
+        fi
         # Emit session stop event
         emit_event session cli stop "reason=user_requested"
         # Emit success pattern for clean stop (SYN-018)
@@ -730,6 +754,86 @@ except: pass
     fi
 }
+# Kill orphaned processes from crashed sessions
+cmd_cleanup() {
+    local pids_dir="$LOKI_DIR/pids"
+    local killed=0
+    local stale=0
+    if [ ! -d "$pids_dir" ]; then
+        echo "No PID registry found. Nothing to clean up."
+        exit 0
+    fi
+    echo -e "${BOLD}Scanning for orphaned processes...${NC}"
+    for entry_file in "$pids_dir"/*.json; do
+        [ -f "$entry_file" ] || continue
+        local pid
+        pid=$(basename "$entry_file" .json)
+        case "$pid" in
+            ''|*[!0-9]*) continue ;;
+        esac
+        local label=""
+        local ppid_val=""
+        # Parse JSON fields (python3 with shell fallback)
+        if command -v python3 >/dev/null 2>&1; then
+            label=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1])).get('label','unknown'))" "$entry_file" 2>/dev/null) || label="unknown"
+            ppid_val=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1])).get('ppid',''))" "$entry_file" 2>/dev/null) || true
+        else
+            label=$(sed 's/.*"label":"//' "$entry_file" 2>/dev/null | sed 's/".*//' | head -1) || label="unknown"
+            ppid_val=$(sed 's/.*"ppid"://' "$entry_file" 2>/dev/null | sed 's/[,}].*//' | head -1) || true
+        fi
+        if kill -0 "$pid" 2>/dev/null; then
+            # Process is alive - check if parent is dead (orphan)
+            local is_orphan=false
+            # Validate ppid_val is numeric before using with kill
+            case "$ppid_val" in ''|*[!0-9]*) ppid_val="" ;; esac
+            if [ -n "$ppid_val" ] && ! kill -0 "$ppid_val" 2>/dev/null; then
+                is_orphan=true
+            fi
+            if [ "$is_orphan" = true ] || [ "${1:-}" = "--force" ]; then
+                echo -e "  ${RED}Killing${NC} PID=$pid label=$label (parent $ppid_val dead)"
+                kill "$pid" 2>/dev/null || true
+                sleep 0.5
+                if kill -0 "$pid" 2>/dev/null; then
+                    kill -9 "$pid" 2>/dev/null || true
+                fi
+                rm -f "$entry_file"
+                killed=$((killed + 1))
+            else
+                echo -e "  ${GREEN}Alive${NC}  PID=$pid label=$label (parent $ppid_val alive)"
+            fi
+        else
+            # Process is dead - clean up stale entry
+            rm -f "$entry_file"
+            stale=$((stale + 1))
+        fi
+    done
+    echo ""
+    echo "Results: $killed orphan(s) killed, $stale stale entries cleaned"
+    # Also kill orphaned loki-run temp scripts
+    local temp_killed=0
+    if pgrep -f "loki-run-" >/dev/null 2>&1; then
+        if ! is_session_running; then
+            echo "Killing orphaned loki-run temp scripts..."
+            pkill -f "loki-run-" 2>/dev/null || true
+            sleep 0.5
+            pkill -9 -f "loki-run-" 2>/dev/null || true
+            temp_killed=1
+        fi
+    fi
+    if [ $killed -eq 0 ] && [ $stale -eq 0 ] && [ $temp_killed -eq 0 ]; then
+        echo -e "${GREEN}System is clean. No orphans found.${NC}"
+    fi
+}
 # Pause after current session
 cmd_pause() {
     if [ ! -d "$LOKI_DIR" ]; then
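
The decision logic in `cmd_cleanup` above reduces to a three-way classification per registry entry: stale (process gone), orphan (process alive, recorded parent gone), or alive. A minimal sketch of that classification in isolation, assuming compact JSON with a numeric `ppid` field. The function name `demo_classify_entry` and the extra `invalid` result are hypothetical. Unlike the fallback in the hunk, this version uses `sed -n ... p`, which prints nothing when the field is absent:

```bash
#!/usr/bin/env bash
# Hypothetical sketch: classify one registry entry the way cmd_cleanup does.
demo_classify_entry() {
    local entry_file="$1" pid ppid_val
    pid=$(basename "$entry_file" .json)
    # File name must be a plain decimal PID
    case "$pid" in ''|*[!0-9]*) echo "invalid"; return 0 ;; esac
    # Extract the numeric ppid field; empty if the field is missing
    ppid_val=$(sed -n 's/.*"ppid":\([0-9][0-9]*\).*/\1/p' "$entry_file" 2>/dev/null | head -1)
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "stale"       # recorded process no longer exists
    elif [ -n "$ppid_val" ] && ! kill -0 "$ppid_val" 2>/dev/null; then
        echo "orphan"      # process alive, recorded parent gone
    else
        echo "alive"
    fi
}
```

Note that `kill -0` also fails when the target belongs to another user, so a liveness check of this kind is only reliable for processes owned by the caller.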
@@ -4497,6 +4601,9 @@ main() {
         stop)
             cmd_stop
             ;;
+        cleanup)
+            cmd_cleanup "$@"
+            ;;
         pause)
             cmd_pause
             ;;
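
Both the `loki stop` hunk and `cmd_cleanup` above follow the same termination pattern: send SIGTERM, wait briefly, then escalate to SIGKILL if the process is still alive. A minimal standalone sketch of that pattern, assuming a hypothetical helper name `demo_terminate` and a configurable poll count (neither appears in the package):

```bash
#!/usr/bin/env bash
# Hypothetical helper illustrating the TERM -> poll -> KILL pattern.
demo_terminate() {
    local pid="$1" max_polls="${2:-4}" polls=0
    # Reject non-numeric input before it reaches kill
    case "$pid" in ''|*[!0-9]*) return 2 ;; esac
    # Nothing to do if the process is already gone
    kill -0 "$pid" 2>/dev/null || return 0
    # Polite request first
    kill "$pid" 2>/dev/null || true
    # Poll every 0.5s until the process exits or the budget runs out
    while [ "$polls" -lt "$max_polls" ] && kill -0 "$pid" 2>/dev/null; do
        sleep 0.5
        polls=$((polls + 1))
    done
    # Escalate only if the process ignored SIGTERM
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null || true
    fi
    return 0
}
```

Calling `demo_terminate "$pid" 4` gives the same 2-second window (4 polls of 0.5 s) as the `loki stop` hunk, while `demo_terminate "$pid" 1` approximates the single 0.5 s wait in `cmd_cleanup`.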