loki-mode 4.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (54) hide show
  1. package/LICENSE +21 -0
  2. package/README.md +691 -0
  3. package/SKILL.md +191 -0
  4. package/VERSION +1 -0
  5. package/autonomy/.loki/dashboard/index.html +2634 -0
  6. package/autonomy/CONSTITUTION.md +508 -0
  7. package/autonomy/README.md +201 -0
  8. package/autonomy/config.example.yaml +152 -0
  9. package/autonomy/loki +526 -0
  10. package/autonomy/run.sh +3636 -0
  11. package/bin/loki-mode.js +26 -0
  12. package/bin/postinstall.js +60 -0
  13. package/docs/ACKNOWLEDGEMENTS.md +234 -0
  14. package/docs/COMPARISON.md +325 -0
  15. package/docs/COMPETITIVE-ANALYSIS.md +333 -0
  16. package/docs/INSTALLATION.md +547 -0
  17. package/docs/auto-claude-comparison.md +276 -0
  18. package/docs/cursor-comparison.md +225 -0
  19. package/docs/dashboard-guide.md +355 -0
  20. package/docs/screenshots/README.md +149 -0
  21. package/docs/screenshots/dashboard-agents.png +0 -0
  22. package/docs/screenshots/dashboard-tasks.png +0 -0
  23. package/docs/thick2thin.md +173 -0
  24. package/package.json +48 -0
  25. package/references/advanced-patterns.md +453 -0
  26. package/references/agent-types.md +243 -0
  27. package/references/agents.md +1043 -0
  28. package/references/business-ops.md +550 -0
  29. package/references/competitive-analysis.md +216 -0
  30. package/references/confidence-routing.md +371 -0
  31. package/references/core-workflow.md +275 -0
  32. package/references/cursor-learnings.md +207 -0
  33. package/references/deployment.md +604 -0
  34. package/references/lab-research-patterns.md +534 -0
  35. package/references/mcp-integration.md +186 -0
  36. package/references/memory-system.md +467 -0
  37. package/references/openai-patterns.md +647 -0
  38. package/references/production-patterns.md +568 -0
  39. package/references/prompt-repetition.md +192 -0
  40. package/references/quality-control.md +437 -0
  41. package/references/sdlc-phases.md +410 -0
  42. package/references/task-queue.md +361 -0
  43. package/references/tool-orchestration.md +691 -0
  44. package/skills/00-index.md +120 -0
  45. package/skills/agents.md +249 -0
  46. package/skills/artifacts.md +174 -0
  47. package/skills/github-integration.md +218 -0
  48. package/skills/model-selection.md +125 -0
  49. package/skills/parallel-workflows.md +526 -0
  50. package/skills/patterns-advanced.md +188 -0
  51. package/skills/production.md +292 -0
  52. package/skills/quality-gates.md +180 -0
  53. package/skills/testing.md +149 -0
  54. package/skills/troubleshooting.md +109 -0
package/README.md ADDED
@@ -0,0 +1,691 @@
1
+ # Loki Mode
2
+
3
+ **The First Truly Autonomous Multi-Agent Startup System**
4
+
5
+ [![Claude Code](https://img.shields.io/badge/Claude-Code-orange)](https://claude.ai)
6
+ [![Agent Types](https://img.shields.io/badge/Agent%20Types-37-blue)]()
7
+ [![Loki Mode](https://img.shields.io/badge/Loki%20Mode-98.78%25%20Pass%401-blueviolet)](benchmarks/results/)
8
+ [![HumanEval](https://img.shields.io/badge/HumanEval-98.17%25%20Pass%401-brightgreen)](benchmarks/results/)
9
+ [![SWE-bench](https://img.shields.io/badge/SWE--bench-99.67%25%20Patch%20Gen-brightgreen)](benchmarks/results/)
10
+ [![License](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
11
+
12
+ **[Documentation Website](https://asklokesh.github.io/loki-mode/)** | **[Architecture](https://asklokesh.github.io/loki-mode/blog/#architecture)** | **[Research](https://asklokesh.github.io/loki-mode/blog/#research)** | **[Comparisons](https://asklokesh.github.io/loki-mode/blog/#comparisons)**
13
+
14
+ > **PRD → Deployed Product in Zero Human Intervention**
15
+ >
16
+ > Loki Mode transforms a Product Requirements Document into a fully built, tested, deployed, and revenue-generating product while you sleep. No manual steps. No intervention. Just results.
17
+
18
+ ---
19
+
20
+ ## Demo
21
+
22
+ [![asciicast](https://asciinema.org/a/EqNo5IVTaPJfCjLmnYgZ9TC3E.svg)](https://asciinema.org/a/EqNo5IVTaPJfCjLmnYgZ9TC3E)
23
+
24
+ *Click to watch Loki Mode build a complete Todo App from PRD - zero human intervention*
25
+
26
+ ---
27
+
28
+ ## Usage
29
+
30
+ ### Option 1: Claude Code Skill (Recommended)
31
+
32
+ ```bash
33
+ # Install
34
+ git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
35
+
36
+ # Run
37
+ claude --dangerously-skip-permissions
38
+
39
+ # Then say:
40
+ Loki Mode with PRD at ./my-prd.md
41
+ ```
42
+
43
+ ### Option 2: Shell Script
44
+
45
+ ```bash
46
+ # Clone repo
47
+ git clone https://github.com/asklokesh/loki-mode.git
48
+ cd loki-mode
49
+
50
+ # Run directly
51
+ ./autonomy/run.sh ./my-prd.md
52
+ ```
53
+
54
+ ### Coming Soon
55
+
56
+ The following installation methods are planned but not yet published:
57
+ - npm: `npm install -g loki-mode`
58
+ - Homebrew: `brew install loki-mode`
59
+ - Docker: `docker pull asklokesh/loki-mode`
60
+
61
+ See [Installation Guide](docs/INSTALLATION.md) for updates.
62
+
63
+ ---
64
+
65
+ ## Benchmark Results
66
+
67
+ ### Three-Way Comparison (HumanEval)
68
+
69
+ | System | Pass@1 | Details |
70
+ |--------|--------|---------|
71
+ | **Loki Mode (Multi-Agent)** | **98.78%** | 162/164 problems, RARV cycle recovered 2 |
72
+ | Direct Claude | 98.17% | 161/164 problems (baseline) |
73
+ | MetaGPT | 85.9-87.7% | Published benchmark |
74
+
75
+ **Loki Mode beats MetaGPT by +11-13%** thanks to the RARV (Reason-Act-Reflect-Verify) cycle.
76
+
77
+ ### Full Results
78
+
79
+ | Benchmark | Score | Details |
80
+ |-----------|-------|---------|
81
+ | **Loki Mode HumanEval** | **98.78% Pass@1** | 162/164 (multi-agent with RARV) |
82
+ | **Direct Claude HumanEval** | **98.17% Pass@1** | 161/164 (single agent baseline) |
83
+ | **Direct Claude SWE-bench** | **99.67% patch gen** | 299/300 problems |
84
+ | **Loki Mode SWE-bench** | **99.67% patch gen** | 299/300 problems |
85
+ | Model | Claude Opus 4.5 | |
86
+
87
+ **Key Finding:** Multi-agent RARV matches single-agent performance on both benchmarks after timeout optimization. The 4-agent pipeline (Architect->Engineer->QA->Reviewer) achieves the same 99.67% patch generation as direct Claude.
88
+
89
+ See [benchmarks/results/](benchmarks/results/) for full methodology and solutions.
90
+
91
+ ---
92
+
93
+ ## What is Loki Mode?
94
+
95
+ Loki Mode is a Claude Code skill that orchestrates **37 specialized AI agent types** across **6 swarms** to autonomously build, test, deploy, and scale complete startups. It dynamically spawns only the agents you need—**5-10 for simple projects, 100+ for complex startups**—working in parallel with continuous self-verification.
96
+
97
+ ```
98
+ PRD → Research → Architecture → Development → Testing → Deployment → Marketing → Revenue
99
+ ```
100
+
101
+ **Just say "Loki Mode" and point to a PRD. Walk away. Come back to a deployed product.**
102
+
103
+ ---
104
+
105
+ ## Why Loki Mode?
106
+
107
+ ### **Better Than Anything Out There**
108
+
109
+ | What Others Do | What Loki Mode Does |
110
+ |----------------|---------------------|
111
+ | **Single agent** writes code linearly | **100+ agents** work in parallel across engineering, ops, business, data, product, and growth |
112
+ | **Manual deployment** required | **Autonomous deployment** to AWS, GCP, Azure, Vercel, Railway with blue-green and canary strategies |
113
+ | **No testing** or basic unit tests | **14 automated quality gates**: security scans, load tests, accessibility audits, code reviews |
114
+ | **Code only** - you handle the rest | **Full business operations**: marketing, sales, legal, HR, finance, investor relations |
115
+ | **Stops on errors** | **Self-healing**: circuit breakers, dead letter queues, exponential backoff, automatic recovery |
116
+ | **No visibility** into progress | **Real-time dashboard** with agent monitoring, task queues, and live status updates |
117
+ | **"Done" when code is written** | **Never "done"**: continuous optimization, A/B testing, customer feedback loops, perpetual improvement |
118
+
119
+ ### **Core Advantages**
120
+
121
+ 1. **Truly Autonomous**: RARV (Reason-Act-Reflect-Verify) cycle with self-verification achieves 2-3x quality improvement
122
+ 2. **Massively Parallel**: 100+ agents working simultaneously, not sequential single-agent bottlenecks
123
+ 3. **Production-Ready**: Not just code—handles deployment, monitoring, incident response, and business operations
124
+ 4. **Self-Improving**: Learns from mistakes, updates continuity logs, prevents repeated errors
125
+ 5. **Zero Babysitting**: Auto-resumes on rate limits, recovers from failures, runs until completion
126
+ 6. **Efficiency Optimized**: ToolOrchestra-inspired metrics track cost per task, reward signals drive continuous improvement
127
+
128
+ ---
129
+
130
+ ## Features & Documentation
131
+
132
+ | Feature | Description | Documentation |
133
+ |---------|-------------|---------------|
134
+ | **CLI (v4.1.0)** | `loki` command for start/stop/pause/status | [CLI Commands](#cli-commands-v410) |
135
+ | **Config Files** | YAML configuration support | [autonomy/config.example.yaml](autonomy/config.example.yaml) |
136
+ | **Dashboard** | Realtime Kanban board, agent monitoring | [Dashboard Guide](docs/dashboard-guide.md) |
137
+ | **37 Agent Types** | Engineering, Ops, Business, Data, Product, Growth | [Agent Definitions](references/agent-types.md) |
138
+ | **RARV Cycle** | Reason-Act-Reflect-Verify workflow | [Core Workflow](references/core-workflow.md) |
139
+ | **Quality Gates** | 7-gate review system with anti-sycophancy | [Quality Control](references/quality-control.md) |
140
+ | **Memory System** | Episodic, semantic, procedural memory | [Memory Architecture](references/memory-system.md) |
141
+ | **Parallel Workflows** | Git worktree-based parallelism | [Parallel Workflows](skills/parallel-workflows.md) |
142
+ | **GitHub Integration** | Issue import, PR creation, status sync | [GitHub Integration](skills/github-integration.md) |
143
+ | **Distribution** | npm, Homebrew, Docker installation | [Installation Guide](docs/INSTALLATION.md) |
144
+ | **Research Foundation** | OpenAI, DeepMind, Anthropic patterns | [Acknowledgements](docs/ACKNOWLEDGEMENTS.md) |
145
+ | **Benchmarks** | HumanEval 98.78%, SWE-bench 99.67% | [Benchmark Results](benchmarks/results/) |
146
+ | **Comparisons** | vs Auto-Claude, Cursor | [Auto-Claude](docs/auto-claude-comparison.md), [Cursor](docs/cursor-comparison.md) |
147
+
148
+ ---
149
+
150
+ ## Dashboard & Real-Time Monitoring
151
+
152
+ Monitor your autonomous startup being built in real-time through the Loki Mode dashboard:
153
+
154
+ ### **Agent Monitoring**
155
+
156
+ <img width="1200" alt="Loki Mode Dashboard - Active Agents" src="docs/screenshots/dashboard-agents.png" />
157
+
158
+ **Track all active agents in real-time:**
159
+ - **Agent ID** and **Type** (frontend, backend, QA, DevOps, etc.)
160
+ - **Model Badge** (Sonnet, Haiku, Opus) with color coding
161
+ - **Current Work** being performed
162
+ - **Runtime** and **Tasks Completed**
163
+ - **Status** (active, completed)
164
+
165
+ ### **Task Queue Visualization**
166
+
167
+ <img width="1200" alt="Loki Mode Dashboard - Task Queue" src="docs/screenshots/dashboard-tasks.png" />
168
+
169
+ **Four-column kanban view:**
170
+ - **Pending**: Queued tasks waiting for agents
171
+ - **In Progress**: Currently being worked on
172
+ - **Completed**: Successfully finished (shows last 10)
173
+ - **Failed**: Tasks requiring attention
174
+
175
+ ### **Live Status Monitor**
176
+
177
+ ```bash
178
+ # Watch status updates in terminal
179
+ watch -n 2 cat .loki/STATUS.txt
180
+ ```
181
+
182
+ ```
183
+ ╔════════════════════════════════════════════════════════════════╗
184
+ ║ LOKI MODE STATUS ║
185
+ ╚════════════════════════════════════════════════════════════════╝
186
+
187
+ Phase: DEVELOPMENT
188
+
189
+ Active Agents: 47
190
+ ├─ Engineering: 18
191
+ ├─ Operations: 12
192
+ ├─ QA: 8
193
+ └─ Business: 9
194
+
195
+ Tasks:
196
+ ├─ Pending: 10
197
+ ├─ In Progress: 47
198
+ ├─ Completed: 203
199
+ └─ Failed: 0
200
+
201
+ Last Updated: 2026-01-04 20:45:32
202
+ ```
203
+
204
+ **Access the dashboard:**
205
+ ```bash
206
+ # Automatically opens when running autonomously
207
+ ./autonomy/run.sh ./docs/requirements.md
208
+
209
+ # Or open manually
210
+ open .loki/dashboard/index.html
211
+ ```
212
+
213
+ Auto-refreshes every 3 seconds. Works with any modern browser.
214
+
215
+ ---
216
+
217
+ ## Autonomous Capabilities
218
+
219
+ ### **RARV Cycle: Reason-Act-Reflect-Verify**
220
+
221
+ Loki Mode doesn't just write code—it **thinks, acts, learns, and verifies**:
222
+
223
+ ```
224
+ 1. REASON
225
+ └─ Read .loki/CONTINUITY.md including "Mistakes & Learnings"
226
+ └─ Check .loki/state/ and .loki/queue/
227
+ └─ Identify next task or improvement
228
+
229
+ 2. ACT
230
+ └─ Execute task, write code
231
+ └─ Commit changes atomically (git checkpoint)
232
+
233
+ 3. REFLECT
234
+ └─ Update .loki/CONTINUITY.md with progress
235
+ └─ Update state files
236
+ └─ Identify NEXT improvement
237
+
238
+ 4. VERIFY
239
+ └─ Run automated tests (unit, integration, E2E)
240
+ └─ Check compilation/build
241
+ └─ Verify against spec
242
+
243
+ IF VERIFICATION FAILS:
244
+ ├─ Capture error details (stack trace, logs)
245
+ ├─ Analyze root cause
246
+ ├─ UPDATE "Mistakes & Learnings" in CONTINUITY.md
247
+ ├─ Rollback to last good git checkpoint if needed
248
+ └─ Apply learning and RETRY from REASON
249
+ ```
250
+
251
+ **Result:** 2-3x quality improvement through continuous self-verification.
252
+
253
+ ### **Perpetual Improvement Mode**
254
+
255
+ There is **NEVER** a "finished" state. After completing the PRD, Loki Mode:
256
+ - Runs performance optimizations
257
+ - Adds missing test coverage
258
+ - Improves documentation
259
+ - Refactors code smells
260
+ - Updates dependencies
261
+ - Enhances user experience
262
+ - Implements A/B test learnings
263
+
264
+ **It keeps going until you stop it.**
265
+
266
+ ### **Auto-Resume & Self-Healing**
267
+
268
+ **Rate limits?** Exponential backoff and automatic resume.
269
+ **Errors?** Circuit breakers, dead letter queues, retry logic.
270
+ **Interruptions?** State checkpoints every 5 seconds—just restart.
271
+
272
+ ```bash
273
+ # Start autonomous mode
274
+ ./autonomy/run.sh ./docs/requirements.md
275
+
276
+ # Hit rate limit? Script automatically:
277
+ # ├─ Saves state checkpoint
278
+ # ├─ Waits with exponential backoff (60s → 120s → 240s...)
279
+ # ├─ Resumes from exact point
280
+ # └─ Continues until completion or max retries (default: 50)
281
+ ```
282
+
283
+ ---
284
+
285
+ ## Quick Start
286
+
287
+ ### **1. Install**
288
+
289
+ ```bash
290
+ # Option A: npm (recommended)
291
+ npm install -g loki-mode
292
+
293
+ # Option B: Homebrew (macOS/Linux)
294
+ brew tap asklokesh/tap && brew install loki-mode
295
+ loki-mode-install-skill # Set up Claude Code integration
296
+
297
+ # Option C: Docker
298
+ docker pull asklokesh/loki-mode:4.1.0
299
+
300
+ # Option D: Git clone
301
+ git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
302
+ ```
303
+
304
+ See [Installation Guide](docs/INSTALLATION.md) for detailed instructions.
305
+
306
+ ### **2. Create a PRD**
307
+
308
+ ```markdown
309
+ # Product: AI-Powered Todo App
310
+
311
+ ## Overview
312
+ Build a todo app with AI-powered task suggestions and deadline predictions.
313
+
314
+ ## Features
315
+ - User authentication (email/password)
316
+ - Create, read, update, delete todos
317
+ - AI suggests next tasks based on patterns
318
+ - Smart deadline predictions
319
+ - Mobile-responsive design
320
+
321
+ ## Tech Stack
322
+ - Next.js 14 with TypeScript
323
+ - PostgreSQL database
324
+ - OpenAI API for suggestions
325
+ - Deploy to Vercel
326
+ ```
327
+
328
+ Save as `my-prd.md`.
329
+
330
+ ### **3. Run Loki Mode**
331
+
332
+ ```bash
333
+ # Using the CLI (v4.1.0)
334
+ loki start ./my-prd.md
335
+
336
+ # Or using run.sh directly
337
+ ./autonomy/run.sh ./my-prd.md
338
+
339
+ # Or manual mode in Claude Code
340
+ claude --dangerously-skip-permissions
341
+ > Loki Mode with PRD at ./my-prd.md
342
+ ```
343
+
344
+ ### **4. Monitor Progress**
345
+
346
+ ```bash
347
+ # Check status
348
+ loki status
349
+
350
+ # Open dashboard in browser
351
+ loki dashboard
352
+
353
+ # Or watch terminal output
354
+ watch -n 2 cat .loki/STATUS.txt
355
+ ```
356
+
357
+ ### **5. Walk Away**
358
+
359
+ Seriously. Go get coffee. It'll be deployed when you get back.
360
+
361
+ **That's it.** No configuration. No manual steps. No intervention.
362
+
363
+ ---
364
+
365
+ ## CLI Commands (v4.1.0)
366
+
367
+ The `loki` CLI provides easy access to all Loki Mode features:
368
+
369
+ | Command | Description |
370
+ |---------|-------------|
371
+ | `loki start [PRD]` | Start Loki Mode with optional PRD file |
372
+ | `loki stop` | Stop execution immediately |
373
+ | `loki pause` | Pause after current session |
374
+ | `loki resume` | Resume paused execution |
375
+ | `loki status` | Show current status |
376
+ | `loki dashboard` | Open dashboard in browser |
377
+ | `loki import` | Import GitHub issues as tasks |
378
+ | `loki config show` | Show configuration |
379
+ | `loki config init` | Create config file from template |
380
+ | `loki version` | Show version |
381
+
382
+ ### Configuration File
383
+
384
+ Create a YAML config file for persistent settings:
385
+
386
+ ```bash
387
+ # Initialize config
388
+ loki config init
389
+
390
+ # Or copy template manually
391
+ cp ~/.claude/skills/loki-mode/autonomy/config.example.yaml .loki/config.yaml
392
+ ```
393
+
394
+ Config search order: `.loki/config.yaml` (project) -> `~/.config/loki-mode/config.yaml` (global)
395
+
396
+ ---
397
+
398
+ ## Agent Swarms (37 Types)
399
+
400
+ Loki Mode has **37 predefined agent types** organized into **6 specialized swarms**. The orchestrator spawns only what you need—simple projects use 5-10 agents, complex startups spawn 100+.
401
+
402
+ <img width="5309" height="979" alt="Agent Swarms Visualization" src="https://github.com/user-attachments/assets/7d18635d-a606-401f-8d9f-430e6e4ee689" />
403
+
404
+ ### **Engineering (8 types)**
405
+ `eng-frontend` `eng-backend` `eng-database` `eng-mobile` `eng-api` `eng-qa` `eng-perf` `eng-infra`
406
+
407
+ ### **Operations (8 types)**
408
+ `ops-devops` `ops-sre` `ops-security` `ops-monitor` `ops-incident` `ops-release` `ops-cost` `ops-compliance`
409
+
410
+ ### **Business (8 types)**
411
+ `biz-marketing` `biz-sales` `biz-finance` `biz-legal` `biz-support` `biz-hr` `biz-investor` `biz-partnerships`
412
+
413
+ ### **Data (3 types)**
414
+ `data-ml` `data-eng` `data-analytics`
415
+
416
+ ### **Product (3 types)**
417
+ `prod-pm` `prod-design` `prod-techwriter`
418
+
419
+ ### **Growth (4 types)**
420
+ `growth-hacker` `growth-community` `growth-success` `growth-lifecycle`
421
+
422
+ ### **Review (3 types)**
423
+ `review-code` `review-business` `review-security`
424
+
425
+ See [references/agents.md](references/agents.md) for complete agent type definitions.
426
+
427
+ ---
428
+
429
+ ## How It Works
430
+
431
+ ### **Skill Architecture (v3.0+)**
432
+
433
+ Loki Mode uses a **progressive disclosure architecture** to minimize context usage:
434
+
435
+ ```
436
+ SKILL.md (~190 lines) # Always loaded: core RARV cycle, autonomy rules
437
+ skills/
438
+ 00-index.md # Module routing table
439
+ agents.md # Agent dispatch, A2A patterns
440
+ production.md # HN patterns, batch processing, CI/CD
441
+ quality-gates.md # Review system, severity handling
442
+ testing.md # Playwright, E2E, property-based
443
+ model-selection.md # Task tool, parallelization
444
+ artifacts.md # Code generation patterns
445
+ patterns-advanced.md # Constitutional AI, debate
446
+ troubleshooting.md # Error recovery, fallbacks
447
+ references/ # Deep documentation (23KB+ files)
448
+ ```
449
+
450
+ **Why this matters:**
451
+ - Original 1,517-line SKILL.md consumed ~15% of context before any work began
452
+ - Now only ~1% of context for core skill + on-demand modules
453
+ - More room for actual code and reasoning
454
+
455
+ ### **Phase Execution**
456
+
457
+ | Phase | Description |
458
+ |-------|-------------|
459
+ | **0. Bootstrap** | Create `.loki/` directory structure, initialize state |
460
+ | **1. Discovery** | Parse PRD, competitive research via web search |
461
+ | **2. Architecture** | Tech stack selection with self-reflection |
462
+ | **3. Infrastructure** | Provision cloud, CI/CD, monitoring |
463
+ | **4. Development** | Implement with TDD, parallel code review |
464
+ | **5. QA** | 14 quality gates, security audit, load testing |
465
+ | **6. Deployment** | Blue-green deploy, auto-rollback on errors |
466
+ | **7. Business** | Marketing, sales, legal, support setup |
467
+ | **8. Growth** | Continuous optimization, A/B testing, feedback loops |
468
+
469
+ ### **Parallel Code Review**
470
+
471
+ Every code change goes through **3 specialized reviewers simultaneously**:
472
+
473
+ ```
474
+ IMPLEMENT → REVIEW (parallel) → AGGREGATE → FIX → RE-REVIEW → COMPLETE
475
+
476
+ ├─ code-reviewer (Sonnet) - Code quality, patterns, best practices
477
+ ├─ business-logic-reviewer (Sonnet) - Requirements, edge cases, UX
478
+ └─ security-reviewer (Sonnet) - Vulnerabilities, OWASP Top 10
479
+ ```
480
+
481
+ **Severity-based issue handling:**
482
+ - **Critical/High/Medium**: Block. Fix immediately. Re-review.
483
+ - **Low**: Add `// TODO(review): ...` comment, continue.
484
+ - **Cosmetic**: Add `// FIXME(nitpick): ...` comment, continue.
485
+
486
+ ### **Directory Structure**
487
+
488
+ ```
489
+ .loki/
490
+ ├── state/ # Orchestrator and agent states
491
+ ├── queue/ # Task queue (pending, in-progress, completed, dead-letter)
492
+ ├── memory/ # Episodic, semantic, and procedural memory
493
+ ├── metrics/ # Efficiency tracking and reward signals
494
+ ├── messages/ # Inter-agent communication
495
+ ├── logs/ # Audit logs
496
+ ├── config/ # Configuration files
497
+ ├── prompts/ # Agent role prompts
498
+ ├── artifacts/ # Releases, reports, backups
499
+ ├── dashboard/ # Real-time monitoring dashboard
500
+ └── scripts/ # Helper scripts
501
+ ```
502
+
503
+ ---
504
+
505
+ ## Example PRDs
506
+
507
+ Test Loki Mode with these pre-built PRDs in the `examples/` directory:
508
+
509
+ | PRD | Complexity | Est. Time | Description |
510
+ |-----|------------|-----------|-------------|
511
+ | `simple-todo-app.md` | Low | ~10 min | Basic todo app - tests core functionality |
512
+ | `api-only.md` | Low | ~10 min | REST API only - tests backend agents |
513
+ | `static-landing-page.md` | Low | ~5 min | HTML/CSS only - tests frontend/marketing |
514
+ | `full-stack-demo.md` | Medium | ~30-60 min | Complete bookmark manager - full test |
515
+
516
+ ```bash
517
+ # Example: Run with simple todo app
518
+ ./autonomy/run.sh examples/simple-todo-app.md
519
+ ```
520
+
521
+ ---
522
+
523
+ ## Configuration
524
+
525
+ ### **Autonomy Settings**
526
+
527
+ Customize the autonomous runner with environment variables:
528
+
529
+ ```bash
530
+ LOKI_MAX_RETRIES=100 \
531
+ LOKI_BASE_WAIT=120 \
532
+ LOKI_MAX_WAIT=7200 \
533
+ ./autonomy/run.sh ./docs/requirements.md
534
+ ```
535
+
536
+ | Variable | Default | Description |
537
+ |----------|---------|-------------|
538
+ | `LOKI_MAX_RETRIES` | 50 | Maximum retry attempts before giving up |
539
+ | `LOKI_BASE_WAIT` | 60 | Base wait time in seconds |
540
+ | `LOKI_MAX_WAIT` | 3600 | Maximum wait time (1 hour) |
541
+ | `LOKI_SKIP_PREREQS` | false | Skip prerequisite checks |
542
+
543
+ ### **Circuit Breakers**
544
+
545
+ ```yaml
546
+ # .loki/config/circuit-breakers.yaml
547
+ defaults:
548
+ failureThreshold: 5
549
+ cooldownSeconds: 300
550
+ ```
551
+
552
+ ### **External Alerting**
553
+
554
+ ```yaml
555
+ # .loki/config/alerting.yaml
556
+ channels:
557
+ slack:
558
+ webhook_url: "${SLACK_WEBHOOK_URL}"
559
+ severity: [critical, high]
560
+ pagerduty:
561
+ integration_key: "${PAGERDUTY_KEY}"
562
+ severity: [critical]
563
+ ```
564
+
565
+ ---
566
+
567
+ ## Requirements
568
+
569
+ - **Claude Code** with `--dangerously-skip-permissions` flag
570
+ - **Internet access** for competitive research and deployment
571
+ - **Cloud provider credentials** (for deployment phase)
572
+ - **Python 3** (for test suite)
573
+
574
+ **Optional but recommended:**
575
+ - Git (for version control and checkpoints)
576
+ - Node.js/npm (for dashboard and web projects)
577
+ - Docker (for containerized deployments)
578
+
579
+ ---
580
+
581
+ ## Integrations
582
+
583
+ ### **Vibe Kanban (Visual Dashboard)**
584
+
585
+ Integrate with [Vibe Kanban](https://github.com/BloopAI/vibe-kanban) for a visual kanban board:
586
+
587
+ ```bash
588
+ # 1. Start Vibe Kanban (terminal 1)
589
+ npx vibe-kanban
590
+
591
+ # 2. Run Loki Mode (terminal 2)
592
+ ./autonomy/run.sh ./prd.md
593
+
594
+ # 3. Export tasks to see them in Vibe Kanban (terminal 3)
595
+ ./scripts/export-to-vibe-kanban.sh
596
+
597
+ # 4. Optional: Auto-sync for real-time updates
598
+ ./scripts/vibe-sync-watcher.sh
599
+ ```
600
+
601
+ **Important:** Vibe Kanban integration requires manual export. Tasks don't automatically appear - you must run the export script to sync.
602
+
603
+ **Benefits:**
604
+ - Visual progress tracking of all active agents
605
+ - Manual intervention/prioritization when needed
606
+ - Code review with visual diffs
607
+ - Multi-project dashboard
608
+
609
+ See [integrations/vibe-kanban.md](integrations/vibe-kanban.md) for complete step-by-step setup guide and troubleshooting.
610
+
611
+ ---
612
+
613
+ ## Testing
614
+
615
+ Run the comprehensive test suite:
616
+
617
+ ```bash
618
+ # Run all tests
619
+ ./tests/run-all-tests.sh
620
+
621
+ # Or run individual test suites
622
+ ./tests/test-bootstrap.sh # Directory structure, state init
623
+ ./tests/test-task-queue.sh # Queue operations, priorities
624
+ ./tests/test-circuit-breaker.sh # Failure handling, recovery
625
+ ./tests/test-agent-timeout.sh # Timeout, stuck process handling
626
+ ./tests/test-state-recovery.sh # Checkpoints, recovery
627
+ ```
628
+
629
+ ---
630
+
631
+ ## Contributing
632
+
633
+ Contributions welcome! Please:
634
+ 1. Read [SKILL.md](SKILL.md) to understand the core architecture
635
+ 2. Review [skills/00-index.md](skills/00-index.md) for module organization (v3.0+)
636
+ 3. Check [references/agents.md](references/agents.md) for agent definitions
637
+ 4. Open an issue for bugs or feature requests
638
+ 5. Submit PRs with clear descriptions and tests
639
+
640
+ ---
641
+
642
+ ## License
643
+
644
+ MIT License - see [LICENSE](LICENSE) for details.
645
+
646
+ ---
647
+
648
+ ## Acknowledgments
649
+
650
+ Loki Mode incorporates research and patterns from leading AI labs and practitioners:
651
+
652
+ ### Research Foundation
653
+
654
+ | Source | Key Contribution |
655
+ |--------|------------------|
656
+ | [Anthropic: Building Effective Agents](https://www.anthropic.com/research/building-effective-agents) | Evaluator-optimizer pattern, parallelization |
657
+ | [Anthropic: Constitutional AI](https://www.anthropic.com/research/constitutional-ai-harmlessness-from-ai-feedback) | Self-critique against principles |
658
+ | [DeepMind: Scalable Oversight via Debate](https://deepmind.google/research/publications/34920/) | Debate-based verification |
659
+ | [DeepMind: SIMA 2](https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/) | Self-improvement loop |
660
+ | [OpenAI: Agents SDK](https://openai.github.io/openai-agents-python/) | Guardrails, tripwires, tracing |
661
+ | [NVIDIA ToolOrchestra](https://github.com/NVlabs/ToolOrchestra) | Efficiency metrics, reward signals |
662
+ | [CONSENSAGENT (ACL 2025)](https://aclanthology.org/2025.findings-acl.1141/) | Anti-sycophancy, blind review |
663
+ | [GoalAct](https://arxiv.org/abs/2504.16563) | Hierarchical planning |
664
+
665
+ ### Practitioner Insights
666
+
667
+ - **Boris Cherny** (Claude Code creator) - Self-verification loop, extended thinking
668
+ - **Simon Willison** - Sub-agents for context isolation, skills system
669
+ - **Hacker News Community** - [Production patterns](https://news.ycombinator.com/item?id=44623207) from real deployments
670
+
671
+ ### Inspirations
672
+
673
+ - [LerianStudio/ring](https://github.com/LerianStudio/ring) - Subagent-driven-development pattern
674
+ - [Awesome Agentic Patterns](https://github.com/nibzard/awesome-agentic-patterns) - 105+ production patterns
675
+
676
+ **[Full Acknowledgements](docs/ACKNOWLEDGEMENTS.md)** - Complete list of 50+ research papers, articles, and resources
677
+
678
+ Built for the [Claude Code](https://claude.ai) ecosystem, powered by Anthropic's Claude models (Sonnet, Haiku, Opus).
679
+
680
+ ---
681
+
682
+ **Ready to build a startup while you sleep?**
683
+
684
+ ```bash
685
+ git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
686
+ ./autonomy/run.sh your-prd.md
687
+ ```
688
+
689
+ ---
690
+
691
+ **Keywords:** claude-code, claude-skills, ai-agents, autonomous-development, multi-agent-system, sdlc-automation, startup-automation, devops, mlops, deployment-automation, self-healing, perpetual-improvement