loki-mode 5.49.0 → 5.49.2

package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Loki Mode
2
2
 
3
- **The Flagship Product of [Autonomi](https://www.autonomi.dev/) -- The First Truly Autonomous Multi-Agent Startup System**
3
+ **The Flagship Product of [Autonomi](https://www.autonomi.dev/) -- An Autonomous Multi-Agent Development System**
4
4
 
5
5
  [![npm version](https://img.shields.io/npm/v/loki-mode)](https://www.npmjs.com/package/loki-mode)
6
6
  [![npm downloads](https://img.shields.io/npm/dw/loki-mode)](https://www.npmjs.com/package/loki-mode)
@@ -9,17 +9,15 @@
9
9
  [![GitHub Marketplace](https://img.shields.io/badge/Marketplace-Loki%20Mode-purple?logo=github)](https://github.com/marketplace/actions/loki-mode-code-review)
10
10
  [![Autonomi](https://img.shields.io/badge/Autonomi-autonomi.dev-5B4EEA)](https://www.autonomi.dev/)
11
11
  [![Agent Types](https://img.shields.io/badge/Agent%20Types-41-blue)]()
12
- [![Loki Mode](https://img.shields.io/badge/Loki%20Mode-98.78%25%20Pass%401-blueviolet)](benchmarks/results/)
13
- [![HumanEval](https://img.shields.io/badge/HumanEval-98.17%25%20Pass%401-brightgreen)](benchmarks/results/)
14
- [![SWE-bench](https://img.shields.io/badge/SWE--bench-99.67%25%20Patch%20Gen-brightgreen)](benchmarks/results/)
12
+ [![Benchmarks](https://img.shields.io/badge/Benchmarks-Infrastructure%20Ready-blue)](benchmarks/)
15
13
 
16
- **Current Version: v5.47.0**
14
+ **Current Version: v5.49.2**
17
15
 
18
16
  **[Autonomi](https://www.autonomi.dev/)** | **[Documentation](https://www.autonomi.dev/docs)** | **[GitHub](https://github.com/asklokesh/loki-mode)**
19
17
 
20
- > **PRD to Deployed Product in Zero Human Intervention**
18
+ > **PRD to Deployed Product with Minimal Human Intervention**
21
19
  >
22
- > Loki Mode transforms a Product Requirements Document into a fully built, tested, deployed, and revenue-generating product while you sleep. No manual steps. No intervention. Just results.
20
+ > Loki Mode transforms a Product Requirements Document into a fully built, tested, and deployed product with autonomous multi-agent execution. Human oversight for deployment credentials, domain setup, and critical decisions.
23
21
 
24
22
  ---
25
23
 
@@ -27,7 +25,7 @@
27
25
 
28
26
  [![asciicast](https://asciinema.org/a/AjjnjzOeKLYItp6s.svg)](https://asciinema.org/a/AjjnjzOeKLYItp6s)
29
27
 
30
- *Click to watch Loki Mode v5.42 -- CLI commands, dashboard, 8 parallel agents, 7-gate quality, Completion Council, memory system*
28
+ *Click to watch Loki Mode v5.42 -- CLI commands, dashboard, 8 parallel agents, 9-gate quality, Completion Council, memory system*
31
29
 
32
30
  ---
33
31
 
@@ -41,98 +39,38 @@
41
39
 
42
40
  ---
43
41
 
44
- ## Usage
45
-
46
- ### Option 1: npm (Recommended)
42
+ ## Installation
47
43
 
48
44
  ```bash
49
- npm install -g loki-mode
50
- loki start ./my-prd.md
45
+ git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
51
46
  ```
52
47
 
53
- ### Option 2: Claude Code Skill
48
+ That's it. Claude Code auto-discovers skills in `~/.claude/skills/`.
49
+
50
+ ### Use It
54
51
 
55
52
  ```bash
56
- git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
57
53
  claude --dangerously-skip-permissions
58
- # Then say: Loki Mode with PRD at ./my-prd.md
54
+ # Then say: "Loki Mode with PRD at ./my-prd.md"
59
55
  ```
60
56
 
61
- ### Option 3: GitHub Action
57
+ ### Update
62
58
 
63
- Add automated AI code review to your pull requests:
64
-
65
- ```yaml
66
- # .github/workflows/loki-review.yml
67
- name: Loki Code Review
68
-
69
- on:
70
- pull_request:
71
- types: [opened, synchronize]
72
-
73
- permissions:
74
- contents: read
75
- pull-requests: write
76
-
77
- jobs:
78
- review:
79
- runs-on: ubuntu-latest
80
- steps:
81
- - uses: actions/checkout@v4
82
- - uses: asklokesh/loki-mode@v5.38
83
- with:
84
- github_token: ${{ secrets.GITHUB_TOKEN }}
85
- mode: review # review, fix, or test
86
- provider: claude # claude, codex, or gemini
87
- max_iterations: 3 # sets LOKI_MAX_ITERATIONS env var
88
- budget_limit: '5.00' # max cost in USD (maps to --budget flag)
89
- env:
90
- ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
91
- ```
92
-
93
- **Prerequisites:**
94
- - An API key for your chosen provider (set as a repository secret):
95
- - Claude: `ANTHROPIC_API_KEY`
96
- - Codex: `OPENAI_API_KEY`
97
- - Gemini: `GOOGLE_API_KEY`
98
- - The action automatically installs `loki-mode` and `@anthropic-ai/claude-code` (for the Claude provider)
99
-
100
- **Action Inputs:**
101
-
102
- | Input | Default | Description |
103
- |-------|---------|-------------|
104
- | `mode` | `review` | `review`, `fix`, or `test` |
105
- | `provider` | `claude` | `claude`, `codex`, or `gemini` |
106
- | `budget_limit` | `5.00` | Max cost in USD (maps to `--budget` CLI flag) |
107
- | `budget` | | Alias for `budget_limit` |
108
- | `max_iterations` | `3` | Sets `LOKI_MAX_ITERATIONS` env var |
109
- | `github_token` | (required) | GitHub token for PR comments |
110
- | `prd_file` | | Path to PRD file relative to repo root |
111
- | `auto_confirm` | `true` | Skip confirmation prompts (always true in CI) |
112
- | `install_claude` | `true` | Auto-install Claude Code CLI if not present |
113
- | `node_version` | `20` | Node.js version |
114
-
115
- **Using with a PRD file (fix/test modes):**
116
-
117
- ```yaml
118
- - uses: asklokesh/loki-mode@v5
119
- with:
120
- mode: fix
121
- prd_file: 'docs/my-prd.md'
122
- github_token: ${{ secrets.GITHUB_TOKEN }}
123
- env:
124
- ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
59
+ ```bash
60
+ cd ~/.claude/skills/loki-mode && git pull
125
61
  ```
126
62
 
127
- **Modes:**
63
+ ### Troubleshooting
128
64
 
129
- | Mode | Description |
130
- |------|-------------|
131
- | `review` | Analyze PR diff, post structured review as PR comment |
132
- | `fix` | Automatically fix issues found in the codebase |
133
- | `test` | Run autonomous test generation and validation |
65
+ | Problem | Fix |
66
+ |---------|-----|
67
+ | `SKILL.md` not found | Verify: `ls ~/.claude/skills/loki-mode/SKILL.md` |
68
+ | Claude doesn't recognize "Loki Mode" | Restart Claude Code after cloning |
69
+ | Permission denied on clone | Check SSH keys or use HTTPS URL above |
134
70
 
135
- Also available via **Homebrew**, **Docker**, **VS Code Extension**, and **direct shell script**. See the [Installation Guide](docs/INSTALLATION.md) for all 7 installation methods and detailed instructions.
71
+ ### Other Installation Methods
72
+
73
+ Also available via **npm**, **Homebrew**, **Docker**, **GitHub Action**, and **VS Code Extension**. See [docs/alternative-installations.md](docs/alternative-installations.md) for details and limitations of each method.
136
74
 
137
75
  ### Multi-Provider Support (v5.0.0)
138
76
 
@@ -163,55 +101,66 @@ See [skills/providers.md](skills/providers.md) for full provider documentation.
163
101
 
164
102
  ---
165
103
 
166
- ## Benchmark Results
167
-
168
- ### Three-Way Comparison (HumanEval)
169
-
170
- | System | Pass@1 | Details |
171
- |--------|--------|---------|
172
- | **Loki Mode (Multi-Agent)** | **98.78%** | 162/164 problems, RARV cycle recovered 2 |
173
- | Direct Claude | 98.17% | 161/164 problems (baseline) |
174
- | MetaGPT | 85.9-87.7% | Published benchmark |
104
+ ## Benchmarks
175
105
 
176
- **Loki Mode beats MetaGPT by +11-13%** thanks to the RARV (Reason-Act-Reflect-Verify) cycle.
106
+ Benchmark infrastructure is included for HumanEval and SWE-bench evaluation. Results are self-reported from the included test harness and have not been independently verified.
177
107
 
178
- ### Full Results
108
+ | Benchmark | Result | Notes |
109
+ |-----------|--------|-------|
110
+ | HumanEval | 162/164 (98.78%) | Self-reported, max 3 retries per problem |
111
+ | SWE-bench | 299/300 patches generated | Patch generation only -- SWE-bench evaluator not yet run to verify correctness |
179
112
 
180
- | Benchmark | Score | Details |
181
- |-----------|-------|---------|
182
- | **Loki Mode HumanEval** | **98.78% Pass@1** | 162/164 (multi-agent with RARV) |
183
- | **Direct Claude HumanEval** | **98.17% Pass@1** | 161/164 (single agent baseline) |
184
- | **Direct Claude SWE-bench** | **99.67% patch gen** | 299/300 problems |
185
- | **Loki Mode SWE-bench** | **99.67% patch gen** | 299/300 problems |
186
- | Model | Claude Opus 4.5 | |
113
+ **Note:** SWE-bench "patch generation" means the system produced a patch file, not that the patch correctly resolves the issue. The SWE-bench evaluator should be run to determine actual resolution rates.
187
114
 
188
- **Key Finding:** Multi-agent RARV matches single-agent performance on both benchmarks after timeout optimization. The 4-agent pipeline (Architect->Engineer->QA->Reviewer) achieves the same 99.67% patch generation as direct Claude.
189
-
190
- See [benchmarks/results/](benchmarks/results/) for full methodology and solutions.
115
+ See [benchmarks/](benchmarks/) for the test harness and raw results.
191
116
 
192
117
  ---
193
118
 
194
119
  ## What is Loki Mode?
195
120
 
196
- Loki Mode is a multi-provider AI skill that orchestrates **41 specialized AI agent types** across **7 swarms** to autonomously build, test, deploy, and scale complete startups. Works with **Claude Code**, **OpenAI Codex CLI**, and **Google Gemini CLI**. It dynamically spawns only the agents you need—**5-10 for simple projects, 100+ for complex startups**—working in parallel with continuous self-verification.
121
+ Loki Mode is a multi-provider AI skill that orchestrates **41 specialized AI agent types** across **8 swarms** to autonomously build, test, and deploy software projects. Works with **Claude Code**, **OpenAI Codex CLI**, and **Google Gemini CLI**. It dynamically spawns agents as needed -- typically **5-10 for simple projects, more for complex ones** -- working in parallel with continuous self-verification.
197
122
 
198
123
  ```
199
- PRD → Research → Architecture → Development → Testing → Deployment → Marketing → Revenue
124
+ PRD → Research → Architecture → Development → Testing → Deployment → Marketing
200
125
  ```
201
126
 
202
127
  **Just say "Loki Mode" and point to a PRD. Walk away. Come back to a deployed product.**
203
128
 
204
129
  ---
205
130
 
131
+ ## Current Limitations
132
+
133
+ Loki Mode is powerful but not magic. Be aware of these honest limitations:
134
+
135
+ | Area | What Works | What Doesn't (Yet) |
136
+ |------|-----------|---------------------|
137
+ | **Code Generation** | Generates full-stack applications from PRDs | Complex domain logic may need human review and correction |
138
+ | **Deployment** | Generates deployment configs and scripts | Does not have cloud credentials -- human must provide and authorize |
139
+ | **Testing** | 9 automated quality gates, blind review | Test quality depends on AI-generated assertions; mutation testing is heuristic |
140
+ | **Business Ops** | Generates marketing copy, legal templates | Does not actually send emails, file legal documents, or process payments |
141
+ | **Multi-Provider** | Claude (full), Codex (degraded), Gemini (degraded) | Codex and Gemini lack parallel agents and Task tool -- sequential only |
142
+ | **Memory System** | Episodic, semantic, procedural memory tiers | Vector search requires optional `sentence-transformers` dependency |
143
+ | **Enterprise Security** | TLS, OIDC, RBAC, audit trail, SIEM configs | Self-signed certs only; production deployments need real certificates |
144
+ | **Dashboard** | Real-time status, task queue, agent monitoring | Single-machine only; no multi-node dashboard clustering |
145
+ | **Benchmarks** | HumanEval 98.78%, SWE-bench 299/300 patches | Self-reported; SWE-bench counts patch generation, not verified resolution |
146
+
147
+ **What "autonomous" means in practice:**
148
+ - Loki Mode runs without prompting between RARV cycles
149
+ - It does NOT have access to your cloud accounts, payment systems, or external services unless you provide credentials
150
+ - Human oversight is expected for: deployment credentials, domain setup, API keys, and critical business decisions
151
+ - The system is as good as the underlying AI model -- it can make mistakes, especially on novel or complex problems
152
+
153
+ ---
154
+
206
155
  ## Why Loki Mode?
207
156
 
208
- ### **Better Than Anything Out There**
157
+ ### **How It Works**
209
158
 
210
159
  | What Others Do | What Loki Mode Does |
211
160
  |----------------|---------------------|
212
- | **Single agent** writes code linearly | **100+ agents** work in parallel across engineering, ops, business, data, product, and growth |
161
+ | **Single agent** writes code linearly | **Multiple agents** work in parallel across engineering, ops, business, data, product, and growth |
213
162
  | **Manual deployment** required | **Autonomous deployment** to AWS, GCP, Azure, Vercel, Railway with blue-green and canary strategies |
214
- | **No testing** or basic unit tests | **7 automated quality gates**: input/output guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage |
163
+ | **No testing** or basic unit tests | **9 automated quality gates**: input/output guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage, mock detection, mutation detection |
215
164
  | **Code only** - you handle the rest | **Full business operations**: marketing, sales, legal, HR, finance, investor relations |
216
165
  | **Stops on errors** | **Self-healing**: circuit breakers, dead letter queues, exponential backoff, automatic recovery |
217
166
  | **No visibility** into progress | **Real-time dashboard** with agent monitoring, task queues, and live status updates |
@@ -221,8 +170,8 @@ PRD → Research → Architecture → Development → Testing → Deployment →
221
170
 
222
171
  ### **Core Advantages**
223
172
 
224
- 1. **Truly Autonomous**: RARV (Reason-Act-Reflect-Verify) cycle with self-verification achieves 2-3x quality improvement
225
- 2. **Massively Parallel**: 100+ agents working simultaneously, not sequential single-agent bottlenecks
173
+ 1. **Self-Verifying**: RARV (Reason-Act-Reflect-Verify) cycle with continuous self-verification catches errors early
174
+ 2. **Parallel Execution**: Multiple agents working simultaneously, not sequential single-agent bottlenecks
226
175
  3. **Production-Ready**: Not just code—handles deployment, monitoring, incident response, and business operations
227
176
  4. **Self-Improving**: Learns from mistakes, updates continuity logs, prevents repeated errors
228
177
  5. **Zero Babysitting**: Auto-resumes on rate limits, recovers from failures, runs until completion
@@ -249,13 +198,13 @@ PRD → Research → Architecture → Development → Testing → Deployment →
249
198
  | **OpenClaw Bridge (v5.38.0)** | Multi-agent coordination protocol | [OpenClaw Integration](docs/openclaw-integration.md) |
250
199
  | **41 Agent Types** | Engineering, Ops, Business, Data, Product, Growth, Orchestration | [Agent Definitions](references/agent-types.md) |
251
200
  | **RARV Cycle** | Reason-Act-Reflect-Verify workflow | [Core Workflow](references/core-workflow.md) |
252
- | **Quality Gates** | 7-gate system: guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage | [Quality Control](references/quality-control.md) |
201
+ | **Quality Gates** | 9-gate system: guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage, mock detection, mutation detection | [Quality Control](references/quality-control.md) |
253
202
  | **Memory System (v5.15.0)** | Complete 3-tier memory with progressive disclosure | [Memory Architecture](references/memory-system.md) |
254
203
  | **Parallel Workflows** | Git worktree-based parallelism | [Parallel Workflows](skills/parallel-workflows.md) |
255
204
  | **GitHub Integration** | Issue import, PR creation, status sync | [GitHub Integration](skills/github-integration.md) |
256
205
  | **Distribution** | npm, Homebrew, Docker installation | [Installation Guide](docs/INSTALLATION.md) |
257
206
  | **Research Foundation** | OpenAI, DeepMind, Anthropic patterns | [Acknowledgements](docs/ACKNOWLEDGEMENTS.md) |
258
- | **Benchmarks** | HumanEval 98.78%, SWE-bench 99.67% | [Benchmark Results](benchmarks/results/) |
207
+ | **Benchmarks** | HumanEval and SWE-bench infrastructure included | [Benchmark Harness](benchmarks/) |
259
208
  | **Comparisons** | vs Auto-Claude, Cursor | [Auto-Claude](docs/auto-claude-comparison.md), [Cursor](docs/cursor-comparison.md) |
260
209
 
261
210
  ---
@@ -424,7 +373,7 @@ Loki Mode doesn't just write code—it **thinks, acts, learns, and verifies**:
424
373
  └─ Apply learning and RETRY from REASON
425
374
  ```
426
375
 
427
- **Result:** 2-3x quality improvement through continuous self-verification.
376
+ **Result:** Improved quality through continuous self-verification and multi-reviewer code review.
428
377
 
429
378
  ### **Perpetual Improvement Mode**
430
379
 
@@ -561,7 +510,7 @@ graph TB
561
510
  **Key components:**
562
511
  - **RARV+C Cycle** -- Reason, Act, Reflect, Verify, Compound. Every iteration follows this loop. Failed verification triggers retry from Reason.
563
512
  - **Provider Layer** -- Claude Code (full parallel agents, Task tool, MCP), Codex CLI and Gemini CLI (sequential, degraded mode).
564
- - **Agent Swarms** -- 41 specialized agent types across 7 swarms, spawned on demand based on project complexity.
513
+ - **Agent Swarms** -- 41 specialized agent types across 8 swarms, spawned on demand based on project complexity.
565
514
  - **Completion Council** -- 3 members vote on whether the project is done. Anti-sycophancy devil's advocate on unanimous votes.
566
515
  - **Memory System** -- Episodic traces, semantic patterns, procedural skills. Progressive disclosure reduces context usage by 60-80%.
567
516
  - **Dashboard** -- FastAPI server reading `.loki/` flat files, with real-time web UI for task queue, agents, logs, and council state. Now with TLS/HTTPS, OIDC/SSO, and RBAC (v5.36.0-v5.37.0).
@@ -609,7 +558,7 @@ Config search order: `.loki/config.yaml` (project) -> `~/.config/loki-mode/confi
609
558
 
610
559
  ## Agent Swarms (41 Types)
611
560
 
612
- Loki Mode has **41 predefined agent types** organized into **7 specialized swarms**. The orchestrator spawns only what you need -- simple projects use 5-10 agents, complex startups spawn 100+.
561
+ Loki Mode has **41 predefined agent types** organized into **8 specialized swarms**. The orchestrator spawns only what you need -- simple projects typically use 5-10 agents, complex ones may use more.
613
562
 
614
563
  <img width="5309" height="979" alt="Agent Swarms Visualization" src="https://github.com/user-attachments/assets/7d18635d-a606-401f-8d9f-430e6e4ee689" />
615
564
 
@@ -676,7 +625,7 @@ references/ # Deep documentation (23KB+ files)
676
625
  | **2. Architecture** | Tech stack selection with self-reflection |
677
626
  | **3. Infrastructure** | Provision cloud, CI/CD, monitoring |
678
627
  | **4. Development** | Implement with TDD, parallel code review |
679
- | **5. QA** | 7 quality gates, security audit, load testing |
628
+ | **5. QA** | 9 quality gates, security audit, load testing |
680
629
  | **6. Deployment** | Blue-green deploy, auto-rollback on errors |
681
630
  | **7. Business** | Marketing, sales, legal, support setup |
682
631
  | **8. Growth** | Continuous optimization, A/B testing, feedback loops |
@@ -981,7 +930,7 @@ Built for the [Claude Code](https://claude.ai) ecosystem, powered by Anthropic's
981
930
 
982
931
  Loki Mode is the flagship product of **[Autonomi](https://www.autonomi.dev/)** -- a platform for autonomous AI systems. Like Alphabet is to Google, Autonomi is the parent brand under which Loki Mode and future products operate.
983
932
 
984
- **Why Autonomi?** Loki Mode proved that multi-agent autonomous systems can build real software from a PRD with zero human intervention. Autonomi is the expansion of that vision into a broader platform of autonomous services and products.
933
+ **Why Autonomi?** Loki Mode proved that multi-agent autonomous systems can build real software from a PRD with minimal human intervention. Autonomi is the expansion of that vision into a broader platform of autonomous services and products.
985
934
 
986
935
  - **[autonomi.dev](https://www.autonomi.dev/)** -- Main website
987
936
  - **[Documentation](https://www.autonomi.dev/docs)** -- Full documentation
package/SKILL.md CHANGED
@@ -1,9 +1,9 @@
1
1
  ---
2
2
  name: loki-mode
3
- description: Multi-agent autonomous startup system. Triggers on "Loki Mode". Takes PRD to deployed product with zero human intervention. Requires --dangerously-skip-permissions flag.
3
+ description: Multi-agent autonomous startup system. Triggers on "Loki Mode". Takes PRD to deployed product with minimal human intervention. Requires --dangerously-skip-permissions flag.
4
4
  ---
5
5
 
6
- # Loki Mode v5.49.0
6
+ # Loki Mode v5.49.2
7
7
 
8
8
  **You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.**
9
9
 
@@ -263,4 +263,4 @@ The following features are documented in skill modules but not yet fully automat
263
263
  | Quality gates 3-reviewer system | Implemented (v5.35.0) | 5 specialist reviewers in `skills/quality-gates.md`; execution in run.sh |
264
264
  | Benchmarks (HumanEval, SWE-bench) | Infrastructure only | Runner scripts and datasets exist in `benchmarks/`; no published results |
265
265
 
266
- **v5.49.0 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
266
+ **v5.49.2 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
package/VERSION CHANGED
@@ -1 +1 @@
1
- 5.49.0
1
+ 5.49.2
@@ -142,7 +142,7 @@ GROWTH ──[continuous improvement loop]──> GROWTH
142
142
  - `Bash` - Command execution
143
143
  - `platform-orchestrator` - Deployment and service management
144
144
 
145
- **The 37 agent types are ROLES defined through prompts, not subagent_types.**
145
+ **The 41 agent types are ROLES defined through prompts, not subagent_types.**
146
146
 
147
147
  ---
148
148
 
@@ -155,10 +155,10 @@ SKILL.md (~190 lines) # Always loaded: RARV cycle, autonomy rules
155
155
  skills/
156
156
  00-index.md # Module routing table
157
157
  model-selection.md # Task tool, parallelization
158
- quality-gates.md # 7-gate system, anti-sycophancy
158
+ quality-gates.md # 9-gate system, anti-sycophancy
159
159
  testing.md # Playwright, E2E, property-based
160
160
  production.md # CI/CD, batch processing
161
- agents.md # 37 agent types, A2A patterns
161
+ agents.md # 41 agent types, A2A patterns
162
162
  parallel-workflows.md # Git worktrees, parallel streams
163
163
  troubleshooting.md # Error recovery, fallbacks
164
164
  artifacts.md # Code generation patterns
@@ -196,7 +196,7 @@ Main Worktree (orchestrator)
196
196
 
197
197
  ---
198
198
 
199
- ## Quality Gates (7-Gate System)
199
+ ## Quality Gates (9-Gate System)
200
200
 
201
201
  ### Gate 1: Static Analysis
202
202
  ```yaml
@@ -432,6 +432,10 @@ app_runner_start() {
432
432
  (cd "$dir" && bash -c "$_APP_RUNNER_METHOD" >> "$_APP_RUNNER_DIR/app.log" 2>&1) &
433
433
  fi
434
434
  _APP_RUNNER_PID=$!
435
+ # Register with central PID registry if available
436
+ if type register_pid &>/dev/null; then
437
+ register_pid "$_APP_RUNNER_PID" "app-runner" "method=$_APP_RUNNER_METHOD"
438
+ fi
435
439
 
436
440
  # Write PID file
437
441
  echo "$_APP_RUNNER_PID" > "$_APP_RUNNER_DIR/app.pid"
@@ -497,6 +501,11 @@ app_runner_stop() {
497
501
  kill -KILL "-$_APP_RUNNER_PID" 2>/dev/null || kill -KILL "$_APP_RUNNER_PID" 2>/dev/null || true
498
502
  fi
499
503
 
504
+ # Unregister from central PID registry
505
+ if type unregister_pid &>/dev/null && [ -n "$_APP_RUNNER_PID" ]; then
506
+ unregister_pid "$_APP_RUNNER_PID"
507
+ fi
508
+
500
509
  rm -f "$_APP_RUNNER_DIR/app.pid"
501
510
  _write_app_state "stopped"
502
511
  log_info "App Runner: application stopped"
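
The two hunks above call `register_pid` and `unregister_pid` only when those functions are defined, but this diff does not show their implementation. The following is a minimal sketch, not the package's actual code. It assumes the registry keeps one `<pid>.json` file per process under `$LOKI_DIR/pids` with `label` and `ppid` fields, which is the layout `loki cleanup` reads later in this diff. The function bodies, the `extra` field, and the `.loki` default are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a file-based PID registry; not the package's actual code.
LOKI_DIR="${LOKI_DIR:-.loki}"   # assumed default location

# register_pid PID LABEL [EXTRA]
# Writes $LOKI_DIR/pids/<PID>.json describing the process.
register_pid() {
    local pid="$1" label="$2" extra="${3:-}"
    # Accept only purely numeric PIDs.
    case "$pid" in ''|*[!0-9]*) return 1 ;; esac
    mkdir -p "$LOKI_DIR/pids" || return 1
    # $$ is the PID of the registering shell, recorded as the parent.
    # Values are written without JSON escaping, so this sketch assumes
    # label and extra contain no double quotes or backslashes.
    printf '{"pid":%s,"label":"%s","ppid":%s,"extra":"%s"}\n' \
        "$pid" "$label" "$$" "$extra" > "$LOKI_DIR/pids/$pid.json"
}

# unregister_pid PID
# Removes the registry entry; succeeds even if the entry is already gone.
unregister_pid() {
    local pid="$1"
    case "$pid" in ''|*[!0-9]*) return 1 ;; esac
    rm -f "$LOKI_DIR/pids/$pid.json"
}
```

Writing compact JSON with no spaces after the colons keeps the entry readable by a plain `sed` fallback as well as by a JSON parser.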
package/autonomy/loki CHANGED
@@ -9,6 +9,7 @@
9
9
  # Usage:
10
10
  # loki start [PRD] - Start Loki Mode (optionally with PRD)
11
11
  # loki stop - Stop execution immediately
12
+ # loki cleanup - Kill orphaned processes from crashed sessions
12
13
  # loki pause - Pause after current session
13
14
  # loki resume - Resume paused execution
14
15
  # loki status - Show current status
@@ -312,6 +313,7 @@ show_help() {
312
313
  echo " init Build a PRD interactively or from templates"
313
314
  echo " issue <url|num> Generate PRD from GitHub issue and optionally start"
314
315
  echo " stop Stop execution immediately"
316
+ echo " cleanup Kill orphaned processes from crashed sessions"
315
317
  echo " pause Pause after current session"
316
318
  echo " resume Resume paused execution"
317
319
  echo " status [--json] Show current status (--json for machine-readable)"
@@ -704,6 +706,28 @@ except: pass
704
706
  rm -f "$LOKI_DIR/dashboard/dashboard.pid"
705
707
  fi
706
708
 
709
+ # Kill any remaining registered processes (2s graceful window matches run.sh)
710
+ if [ -d "$LOKI_DIR/pids" ]; then
711
+ for entry_file in "$LOKI_DIR/pids"/*.json; do
712
+ [ -f "$entry_file" ] || continue
713
+ local reg_pid
714
+ reg_pid=$(basename "$entry_file" .json)
715
+ case "$reg_pid" in ''|*[!0-9]*) continue ;; esac
716
+ if kill -0 "$reg_pid" 2>/dev/null; then
717
+ kill "$reg_pid" 2>/dev/null || true
718
+ local w=0
719
+ while [ $w -lt 4 ] && kill -0 "$reg_pid" 2>/dev/null; do
720
+ sleep 0.5
721
+ w=$((w + 1))
722
+ done
723
+ if kill -0 "$reg_pid" 2>/dev/null; then
724
+ kill -9 "$reg_pid" 2>/dev/null || true
725
+ fi
726
+ fi
727
+ rm -f "$entry_file"
728
+ done
729
+ fi
730
+
707
731
  # Emit session stop event
708
732
  emit_event session cli stop "reason=user_requested"
709
733
  # Emit success pattern for clean stop (SYN-018)
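
The stop hunk above sends SIGTERM, polls up to four times at 0.5 s intervals, then falls back to SIGKILL. That pattern can be sketched as a standalone helper. The name `terminate_with_grace` and its optional poll-count and interval parameters are hypothetical and do not appear in the package.

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating the SIGTERM -> poll -> SIGKILL pattern.
# terminate_with_grace PID [POLLS] [INTERVAL]
# Returns 2 for a non-numeric PID, 0 otherwise.
terminate_with_grace() {
    local pid="$1" polls="${2:-4}" interval="${3:-0.5}"
    # Reject empty or non-numeric input before passing it to kill.
    case "$pid" in ''|*[!0-9]*) return 2 ;; esac
    # Nothing to do if the process is already gone.
    kill -0 "$pid" 2>/dev/null || return 0
    # Ask politely first.
    kill "$pid" 2>/dev/null || true
    local waited=0
    while [ "$waited" -lt "$polls" ] && kill -0 "$pid" 2>/dev/null; do
        sleep "$interval"
        waited=$((waited + 1))
    done
    # Still alive after the grace window: force it.
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null || true
    fi
    return 0
}
```

With the defaults, the grace window is 4 x 0.5 s = 2 s, matching the comment in the hunk. One caveat: `kill -0` still succeeds for an exited child that its parent has not yet reaped, so the helper may send an unnecessary SIGKILL in that case.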
@@ -730,6 +754,86 @@ except: pass
730
754
  fi
731
755
  }
732
756
 
757
+ # Kill orphaned processes from crashed sessions
758
+ cmd_cleanup() {
759
+ local pids_dir="$LOKI_DIR/pids"
760
+ local killed=0
761
+ local stale=0
762
+
763
+ if [ ! -d "$pids_dir" ]; then
764
+ echo "No PID registry found. Nothing to clean up."
765
+ exit 0
766
+ fi
767
+
768
+ echo -e "${BOLD}Scanning for orphaned processes...${NC}"
769
+
770
+ for entry_file in "$pids_dir"/*.json; do
771
+ [ -f "$entry_file" ] || continue
772
+ local pid
773
+ pid=$(basename "$entry_file" .json)
774
+ case "$pid" in
775
+ ''|*[!0-9]*) continue ;;
776
+ esac
777
+
778
+ local label=""
779
+ local ppid_val=""
780
+ # Parse JSON fields (python3 with shell fallback)
781
+ if command -v python3 >/dev/null 2>&1; then
782
+ label=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1])).get('label','unknown'))" "$entry_file" 2>/dev/null) || label="unknown"
783
+ ppid_val=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1])).get('ppid',''))" "$entry_file" 2>/dev/null) || true
784
+ else
785
+ label=$(sed 's/.*"label":"//' "$entry_file" 2>/dev/null | sed 's/".*//' | head -1) || label="unknown"
786
+ ppid_val=$(sed 's/.*"ppid"://' "$entry_file" 2>/dev/null | sed 's/[,}].*//' | head -1) || true
787
+ fi
788
+
789
+ if kill -0 "$pid" 2>/dev/null; then
790
+ # Process is alive - check if parent is dead (orphan)
791
+ local is_orphan=false
792
+ # Validate ppid_val is numeric before using with kill
793
+ case "$ppid_val" in ''|*[!0-9]*) ppid_val="" ;; esac
794
+ if [ -n "$ppid_val" ] && ! kill -0 "$ppid_val" 2>/dev/null; then
795
+ is_orphan=true
796
+ fi
797
+
798
+ if [ "$is_orphan" = true ] || [ "${1:-}" = "--force" ]; then
799
+ echo -e " ${RED}Killing${NC} PID=$pid label=$label (parent $ppid_val dead)"
800
+ kill "$pid" 2>/dev/null || true
801
+ sleep 0.5
802
+ if kill -0 "$pid" 2>/dev/null; then
803
+ kill -9 "$pid" 2>/dev/null || true
804
+ fi
805
+ rm -f "$entry_file"
806
+ killed=$((killed + 1))
807
+ else
808
+ echo -e " ${GREEN}Alive${NC} PID=$pid label=$label (parent $ppid_val alive)"
809
+ fi
810
+ else
811
+ # Process is dead - clean up stale entry
812
+ rm -f "$entry_file"
813
+ stale=$((stale + 1))
814
+ fi
815
+ done
816
+
817
+ echo ""
818
+ echo "Results: $killed orphan(s) killed, $stale stale entries cleaned"
819
+
820
+ # Also kill orphaned loki-run temp scripts
821
+ local temp_killed=0
822
+ if pgrep -f "loki-run-" >/dev/null 2>&1; then
823
+ if ! is_session_running; then
824
+ echo "Killing orphaned loki-run temp scripts..."
825
+ pkill -f "loki-run-" 2>/dev/null || true
826
+ sleep 0.5
827
+ pkill -9 -f "loki-run-" 2>/dev/null || true
828
+ temp_killed=1
829
+ fi
830
+ fi
831
+
832
+ if [ $killed -eq 0 ] && [ $stale -eq 0 ] && [ $temp_killed -eq 0 ]; then
833
+ echo -e "${GREEN}System is clean. No orphans found.${NC}"
834
+ fi
835
+ }
836
+
733
837
  # Pause after current session
734
838
  cmd_pause() {
735
839
  if [ ! -d "$LOKI_DIR" ]; then
@@ -4497,6 +4601,9 @@ main() {
4497
4601
  stop)
4498
4602
  cmd_stop
4499
4603
  ;;
4604
+ cleanup)
4605
+ cmd_cleanup "$@"
4606
+ ;;
4500
4607
  pause)
4501
4608
  cmd_pause
4502
4609
  ;;
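
The decision logic in `cmd_cleanup` sorts each registry entry into one of three states: the process is dead (stale entry), the process is alive but its recorded parent is dead (orphan), or both are alive. A minimal sketch of that classification on its own follows. The function name `classify_entry` and the extra `invalid` result are hypothetical; only the `kill -0` checks and the numeric validation are taken from the diff.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the three-way classification used by cmd_cleanup.
# classify_entry PID [PPID] -> prints "stale", "orphan", or "alive"
classify_entry() {
    local pid="$1" ppid_val="${2:-}"
    # A non-numeric PID cannot be checked at all.
    case "$pid" in ''|*[!0-9]*) echo "invalid"; return 1 ;; esac
    # A non-numeric parent PID is treated as unknown, as in the diff.
    case "$ppid_val" in ''|*[!0-9]*) ppid_val="" ;; esac
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "stale"     # process gone: only the entry file remains
    elif [ -n "$ppid_val" ] && ! kill -0 "$ppid_val" 2>/dev/null; then
        echo "orphan"    # process alive, recorded parent gone
    else
        echo "alive"     # parent alive or unknown
    fi
}
```

`kill -0` only tests whether a signal could be delivered. It cannot tell a registered process from an unrelated one that reused the same PID, and it also fails for processes owned by another user, so the result is best treated as a heuristic.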