loki-mode 5.52.0 → 5.52.2
- package/README.md +110 -869
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/hooks/validate-bash.sh +5 -2
- package/dashboard/__init__.py +1 -1
- package/dashboard/server.py +50 -1
- package/docs/INSTALLATION.md +1 -1
- package/docs/enterprise/migration.md +4 -4
- package/docs/enterprise/sdk-guide.md +11 -11
- package/docs/enterprise/security.md +2 -2
- package/docs/show-hn-post.md +47 -0
- package/mcp/__init__.py +39 -3
- package/mcp/server.py +230 -0
- package/package.json +11 -4
- package/src/integrations/linear/index.js +30 -0
- /package/dashboard/{secrets.py → app_secrets.py} +0 -0
package/README.md
CHANGED
|
@@ -1,972 +1,213 @@
|
|
|
1
1
|
# Loki Mode
|
|
2
2
|
|
|
3
|
-
**
|
|
3
|
+
**Autonomous multi-agent development with self-verification. PRD in, tested code out.**
|
|
4
4
|
|
|
5
5
|
[](https://www.npmjs.com/package/loki-mode)
|
|
6
6
|
[](https://www.npmjs.com/package/loki-mode)
|
|
7
7
|
[](https://github.com/asklokesh/loki-mode)
|
|
8
8
|
[](https://opensource.org/licenses/MIT)
|
|
9
|
-
[](https://github.com/marketplace/actions/loki-mode-code-review)
|
|
10
|
-
[](https://www.autonomi.dev/)
|
|
11
9
|
[]()
|
|
12
|
-
[** | **[Documentation](https://www.autonomi.dev/docs)** | **[GitHub](https://github.com/asklokesh/loki-mode)**
|
|
17
|
-
|
|
18
|
-
> **PRD to Deployed Product with Minimal Human Intervention**
|
|
19
|
-
>
|
|
20
|
-
> Loki Mode transforms a Product Requirements Document into a fully built, tested, and deployed product with autonomous multi-agent execution. Human oversight for deployment credentials, domain setup, and critical decisions.
|
|
21
|
-
|
|
22
|
-
---
|
|
23
|
-
|
|
24
|
-
## Demo
|
|
25
|
-
|
|
26
|
-
[](https://asciinema.org/a/AjjnjzOeKLYItp6s)
|
|
27
|
-
|
|
28
|
-
*Click to watch Loki Mode v5.42 -- CLI commands, dashboard, 8 parallel agents, 9-gate quality, Completion Council, memory system*
|
|
29
|
-
|
|
30
|
-
---
|
|
31
|
-
|
|
32
|
-
## Presentation
|
|
33
|
-
|
|
34
|
-

|
|
35
|
-
|
|
36
|
-
*9 slides: Problem, Solution, 41 Agents, RARV Cycle, Benchmarks, Multi-Provider, Full Lifecycle*
|
|
37
|
-
|
|
38
|
-
**[Download PPTX](docs/loki-mode-presentation.pptx)** for offline viewing
|
|
39
|
-
|
|
40
|
-
---
|
|
41
|
-
|
|
42
|
-
## Installation
|
|
43
|
-
|
|
44
|
-
### npm (Recommended)
|
|
45
|
-
|
|
46
|
-
```bash
|
|
47
|
-
npm install -g loki-mode
|
|
48
|
-
```
|
|
49
|
-
|
|
50
|
-
Installs the `loki` CLI and automatically sets up the skill for Claude Code, Codex CLI, and Gemini CLI.
|
|
51
|
-
|
|
52
|
-
### Homebrew
|
|
53
|
-
|
|
54
|
-
```bash
|
|
55
|
-
brew tap asklokesh/tap && brew install loki-mode
|
|
56
|
-
```
|
|
57
|
-
|
|
58
|
-
Installs the `loki` CLI. To also install the skill for interactive use:
|
|
59
|
-
|
|
60
|
-
```bash
|
|
61
|
-
loki setup-skill
|
|
62
|
-
```
|
|
63
|
-
|
|
64
|
-
### Quick Start
|
|
65
|
-
|
|
66
|
-
```bash
|
|
67
|
-
# CLI mode (works with any provider)
|
|
68
|
-
loki start ./prd.md
|
|
69
|
-
loki start ./prd.md --provider codex
|
|
70
|
-
loki start ./prd.md --provider gemini
|
|
71
|
-
|
|
72
|
-
# Interactive mode (inside your coding agent)
|
|
73
|
-
claude --dangerously-skip-permissions
|
|
74
|
-
# Then say: "Loki Mode with PRD at ./my-prd.md"
|
|
75
|
-
|
|
76
|
-
# Or in Codex CLI:
|
|
77
|
-
codex
|
|
78
|
-
# Then say: "Use Loki Mode with PRD at ./my-prd.md"
|
|
79
|
-
|
|
80
|
-
# Or in Gemini CLI:
|
|
81
|
-
gemini
|
|
82
|
-
# Then say: "Use Loki Mode with PRD at ./my-prd.md"
|
|
83
|
-
```
|
|
84
|
-
|
|
85
|
-
### Verify Installation
|
|
86
|
-
|
|
87
|
-
```bash
|
|
88
|
-
loki --version # Should print 5.52.0
|
|
89
|
-
loki doctor # Check skill symlinks and provider availability
|
|
90
|
-
```
|
|
91
|
-
|
|
92
|
-
### Other Methods
|
|
93
|
-
|
|
94
|
-
Git clone, Docker, GitHub Action, and VS Code Extension are also available. See [docs/alternative-installations.md](docs/alternative-installations.md).
|
|
95
|
-
|
|
96
|
-
### Update
|
|
97
|
-
|
|
98
|
-
```bash
|
|
99
|
-
npm update -g loki-mode # npm
|
|
100
|
-
brew upgrade loki-mode # Homebrew
|
|
101
|
-
```
|
|
102
|
-
|
|
103
|
-
### Multi-Provider Support (v5.0.0)
|
|
104
|
-
|
|
105
|
-
| Provider | Features | Parallel Agents | Task Tool |
|
|
106
|
-
|----------|----------|-----------------|-----------|
|
|
107
|
-
| Claude | Full | Yes (10+) | Yes |
|
|
108
|
-
| Codex | Degraded | No | No |
|
|
109
|
-
| Gemini | Degraded | No | No |
|
|
110
|
-
|
|
111
|
-
See [skills/providers.md](skills/providers.md) for full provider documentation.
|
|
112
|
-
|
|
113
|
-
---
|
|
114
|
-
|
|
115
|
-
## Benchmarks
|
|
116
|
-
|
|
117
|
-
Benchmark infrastructure is included for HumanEval and SWE-bench evaluation. Results are self-reported from the included test harness and have not been independently verified.
|
|
118
|
-
|
|
119
|
-
| Benchmark | Result | Notes |
|
|
120
|
-
|-----------|--------|-------|
|
|
121
|
-
| HumanEval | 162/164 (98.78%) | Self-reported, max 3 retries per problem |
|
|
122
|
-
| SWE-bench | 299/300 patches generated | Patch generation only -- SWE-bench evaluator not yet run to verify correctness |
|
|
123
|
-
|
|
124
|
-
**Note:** SWE-bench "patch generation" means the system produced a patch file, not that the patch correctly resolves the issue. The SWE-bench evaluator should be run to determine actual resolution rates.
|
|
10
|
+
[](https://www.autonomi.dev/)
|
|
125
11
|
|
|
126
|
-
|
|
12
|
+
**Current Version: v5.52.2**
|
|
127
13
|
|
|
128
14
|
---
|
|
129
15
|
|
|
130
|
-
## What
|
|
16
|
+
## What Is Loki Mode?
|
|
131
17
|
|
|
132
|
-
Loki Mode is a multi-
|
|
18
|
+
Loki Mode is a multi-agent system that transforms a Product Requirements Document into a built and tested product. It orchestrates 41 specialized agent types across 8 swarms -- engineering, operations, business, data, product, growth, review, and orchestration -- working in parallel with continuous self-verification.
|
|
133
19
|
|
|
134
|
-
|
|
135
|
-
PRD → Research → Architecture → Development → Testing → Deployment → Marketing
|
|
136
|
-
```
|
|
20
|
+
Every iteration follows the **RARV cycle**: Reason (read state, identify next task) -> Act (execute, commit) -> Reflect (update continuity, learn) -> Verify (run tests, check spec). If verification fails, the system captures the error as a learning and retries from Reason. This is the core differentiator: code is not "done" until it passes automated verification. See [Core Workflow](references/core-workflow.md).
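The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Loki Mode's actual implementation: `run_rarv`, `RarvState`, and the `act`/`verify` callables are hypothetical names, and the in-memory `learnings` list only stands in for whatever the real system persists between iterations.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RarvState:
    """Hypothetical in-memory stand-in for the continuity log."""
    learnings: List[str] = field(default_factory=list)
    attempts: int = 0


def run_rarv(
    act: Callable[[List[str]], str],
    verify: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Run Reason -> Act -> Reflect -> Verify until verification passes.

    `act` receives the learnings so far and returns a candidate result.
    `verify` returns None on success or an error message on failure.
    Returns the verified result, or None if every attempt failed.
    """
    state = RarvState()
    while state.attempts < max_attempts:
        # Reason: read prior learnings before deciding what to do.
        context = list(state.learnings)
        # Act: produce a candidate result.
        result = act(context)
        # Reflect: record that an attempt was made.
        state.attempts += 1
        # Verify: only a passing check counts as "done".
        error = verify(result)
        if error is None:
            return result
        # Capture the failure as a learning and retry from Reason.
        state.learnings.append(error)
    return None
```

For example, an `act` that returns `"v1"` on the first pass and `"v2"` once it has seen a learning, paired with a `verify` that only accepts `"v2"`, would fail once, record the error, and succeed on the second attempt.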
|
|
137
21
|
|
|
138
|
-
**
|
|
22
|
+
**What "autonomous" actually means:** The system runs RARV cycles without prompting. It does NOT have access to your cloud accounts, payment systems, or external services unless you provide credentials. Human oversight is expected for deployment credentials, domain setup, API keys, and critical decisions. The system can make mistakes, especially on novel or complex problems.
|
|
139
23
|
|
|
140
|
-
|
|
24
|
+
### What To Expect
|
|
141
25
|
|
|
142
|
-
|
|
26
|
+
| Project Type | Examples | Typical Duration | Experience |
|
|
27
|
+
|---|---|---|---|
|
|
28
|
+
| Simple | Landing page, todo app, single API | 5-30 min | Completes independently. Human reviews output. |
|
|
29
|
+
| Standard | CRUD app with auth, REST API + React frontend | 30-90 min | Completes most features. May need guidance on complex parts. |
|
|
30
|
+
| Complex | Microservices, real-time systems, ML pipelines | 2+ hours | Use as accelerator. Human reviews between phases. |
|
|
143
31
|
|
|
144
|
-
|
|
32
|
+
### Limitations
|
|
145
33
|
|
|
146
34
|
| Area | What Works | What Doesn't (Yet) |
|
|
147
35
|
|------|-----------|---------------------|
|
|
148
|
-
| **Code Generation** |
|
|
149
|
-
| **Deployment** | Generates
|
|
150
|
-
| **Testing** | 9 automated quality gates, blind review | Test quality depends on AI-generated assertions
|
|
151
|
-
| **
|
|
152
|
-
| **
|
|
153
|
-
| **
|
|
154
|
-
| **Enterprise Security** | TLS, OIDC, RBAC, audit trail, SIEM configs | Self-signed certs only; production deployments need real certificates |
|
|
155
|
-
| **Dashboard** | Real-time status, task queue, agent monitoring | Single-machine only; no multi-node dashboard clustering |
|
|
156
|
-
| **Benchmarks** | HumanEval 98.78%, SWE-bench 299/300 patches | Self-reported; SWE-bench counts patch generation, not verified resolution |
|
|
157
|
-
|
|
158
|
-
**What "autonomous" means in practice:**
|
|
159
|
-
- Loki Mode runs without prompting between RARV cycles
|
|
160
|
-
- It does NOT have access to your cloud accounts, payment systems, or external services unless you provide credentials
|
|
161
|
-
- Human oversight is expected for: deployment credentials, domain setup, API keys, and critical business decisions
|
|
162
|
-
- The system is as good as the underlying AI model -- it can make mistakes, especially on novel or complex problems
|
|
163
|
-
|
|
164
|
-
## What To Expect
|
|
165
|
-
|
|
166
|
-
| Project Type | Examples | Autonomy Level | Typical Experience |
|
|
167
|
-
|---|---|---|---|
|
|
168
|
-
| Simple | Landing page, todo app, static site, single API | High | Completes with minimal retries. Human reviews output. |
|
|
169
|
-
| Standard | CRUD app with auth, REST API + React frontend | Medium | Completes most features. Complex components may need guidance. |
|
|
170
|
-
| Complex | Microservices, real-time systems, ML pipelines | Guided | Use as accelerator. Human reviews between phases. |
|
|
171
|
-
|
|
172
|
-
"Autonomous" means the system runs RARV cycles without prompting. It does NOT mean zero oversight.
|
|
36
|
+
| **Code Generation** | Full-stack apps from PRDs | Complex domain logic may need human review |
|
|
37
|
+
| **Deployment** | Generates configs, Dockerfiles, CI/CD workflows | Does not deploy -- human provides cloud credentials and runs deploy |
|
|
38
|
+
| **Testing** | 9 automated quality gates, blind review | Test quality depends on AI-generated assertions |
|
|
39
|
+
| **Multi-Provider** | Claude (full), Codex/Gemini (sequential only) | Codex and Gemini lack parallel agents and Task tool |
|
|
40
|
+
| **Enterprise** | TLS, OIDC, RBAC, audit trail | Self-signed certs only; some features require env var activation |
|
|
41
|
+
| **Dashboard** | Real-time status, task queue, agents | Single-machine only; no multi-node clustering |
|
|
173
42
|
|
|
174
43
|
---
|
|
175
44
|
|
|
176
|
-
##
|
|
177
|
-
|
|
178
|
-
### **How It Works**
|
|
179
|
-
|
|
180
|
-
| What Others Do | What Loki Mode Does |
|
|
181
|
-
|----------------|---------------------|
|
|
182
|
-
| **Single agent** writes code linearly | **Multiple agents** work in parallel across engineering, ops, business, data, product, and growth |
|
|
183
|
-
| **Manual deployment** required | **Autonomous deployment** to AWS, GCP, Azure, Vercel, Railway with blue-green and canary strategies |
|
|
184
|
-
| **No testing** or basic unit tests | **9 automated quality gates**: input/output guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage, mock detection, mutation detection |
|
|
185
|
-
| **Code only** - you handle the rest | **Full business operations**: marketing, sales, legal, HR, finance, investor relations |
|
|
186
|
-
| **Stops on errors** | **Self-healing**: circuit breakers, dead letter queues, exponential backoff, automatic recovery |
|
|
187
|
-
| **No visibility** into progress | **Real-time dashboard** with agent monitoring, task queues, and live status updates |
|
|
188
|
-
| **"Done" when code is written** | **Never "done"**: continuous optimization, A/B testing, customer feedback loops, perpetual improvement |
|
|
189
|
-
| **No security controls** | **Enterprise-ready**: TLS/HTTPS, OIDC/SSO, RBAC, audit trails, SIEM integration, Prometheus metrics (v5.36.0-v5.38.0) |
|
|
190
|
-
| **Direct commits to main** | **Branch protection**: auto-create feature branches, clean PR workflow, never touches main directly (v5.37.0) |
|
|
191
|
-
|
|
192
|
-
### **Core Advantages**
|
|
193
|
-
|
|
194
|
-
1. **Self-Verifying**: RARV (Reason-Act-Reflect-Verify) cycle with continuous self-verification catches errors early
|
|
195
|
-
2. **Parallel Execution**: Multiple agents working simultaneously, not sequential single-agent bottlenecks
|
|
196
|
-
3. **Production-Ready**: Not just code—handles deployment, monitoring, incident response, and business operations
|
|
197
|
-
4. **Self-Improving**: Learns from mistakes, updates continuity logs, prevents repeated errors
|
|
198
|
-
5. **Zero Babysitting**: Auto-resumes on rate limits, recovers from failures, runs until completion
|
|
199
|
-
6. **Efficiency Optimized**: ToolOrchestra-inspired metrics track cost per task, reward signals drive continuous improvement
|
|
200
|
-
|
|
201
|
-
---
|
|
202
|
-
|
|
203
|
-
## Features & Documentation
|
|
204
|
-
|
|
205
|
-
| Feature | Description | Documentation |
|
|
206
|
-
|---------|-------------|---------------|
|
|
207
|
-
| **VS Code Extension** | Visual interface with sidebar, status bar | [Marketplace](https://marketplace.visualstudio.com/items?itemName=asklokesh.loki-mode) |
|
|
208
|
-
| **Multi-Provider (v5.0.0)** | Claude, Codex, Gemini support | [Provider Guide](skills/providers.md) |
|
|
209
|
-
| **CLI (v4.1.0)** | `loki` command for start/stop/pause/status | [CLI Commands](#cli-commands-v410) |
|
|
210
|
-
| **Config Files** | YAML configuration support | [autonomy/config.example.yaml](autonomy/config.example.yaml) |
|
|
211
|
-
| **Dashboard** | Realtime Kanban board, agent monitoring | [Dashboard Guide](docs/dashboard-guide.md) |
|
|
212
|
-
| **TLS/HTTPS (v5.36.0)** | Dashboard encryption with self-signed certs | [Network Security](docs/network-security.md) |
|
|
213
|
-
| **OIDC/SSO (v5.36.0)** | Google, Azure AD, Okta authentication | [Authentication Guide](docs/authentication.md) |
|
|
214
|
-
| **RBAC (v5.37.0)** | Admin, operator, viewer, auditor roles | [Authorization Guide](docs/authorization.md) |
|
|
215
|
-
| **Metrics Export (v5.38.0)** | Prometheus/OpenMetrics `/metrics` endpoint | [Metrics Guide](docs/metrics.md) |
|
|
216
|
-
| **Branch Protection (v5.37.0)** | Auto-create feature branches for PRs | [Git Workflow](docs/git-workflow.md) |
|
|
217
|
-
| **Audit Trail (v5.37.0)** | Agent action logging with integrity chain | [Audit Logging](docs/audit-logging.md) |
|
|
218
|
-
| **SIEM Integration (v5.38.0)** | Syslog forwarding for enterprise security | [SIEM Guide](docs/siem-integration.md) |
|
|
219
|
-
| **OpenClaw Bridge (v5.38.0)** | Multi-agent coordination protocol | [OpenClaw Integration](docs/openclaw-integration.md) |
|
|
220
|
-
| **41 Agent Types** | Engineering, Ops, Business, Data, Product, Growth, Orchestration | [Agent Definitions](references/agent-types.md) |
|
|
221
|
-
| **RARV Cycle** | Reason-Act-Reflect-Verify workflow | [Core Workflow](references/core-workflow.md) |
|
|
222
|
-
| **Quality Gates** | 9-gate system: guardrails, static analysis, blind review, anti-sycophancy, severity blocking, test coverage, mock detection, mutation detection | [Quality Control](references/quality-control.md) |
|
|
223
|
-
| **Memory System (v5.15.0)** | Complete 3-tier memory with progressive disclosure | [Memory Architecture](references/memory-system.md) |
|
|
224
|
-
| **Parallel Workflows** | Git worktree-based parallelism | [Parallel Workflows](skills/parallel-workflows.md) |
|
|
225
|
-
| **GitHub Integration** | Issue import, PR creation, status sync | [GitHub Integration](skills/github-integration.md) |
|
|
226
|
-
| **Distribution** | npm, Homebrew, Docker installation | [Installation Guide](docs/INSTALLATION.md) |
|
|
227
|
-
| **Research Foundation** | OpenAI, DeepMind, Anthropic patterns | [Acknowledgements](docs/ACKNOWLEDGEMENTS.md) |
|
|
228
|
-
| **Benchmarks** | HumanEval and SWE-bench infrastructure included | [Benchmark Harness](benchmarks/) |
|
|
229
|
-
| **Comparisons** | vs Auto-Claude, Cursor | [Auto-Claude](docs/auto-claude-comparison.md), [Cursor](docs/cursor-comparison.md) |
|
|
45
|
+
## Quick Start
|
|
230
46
|
|
|
231
|
-
|
|
47
|
+
**Requirements:** Node.js 18+, Python 3.8+, macOS/Linux/WSL2, and at least one AI CLI (Claude Code, Codex, or Gemini).
|
|
232
48
|
|
|
233
|
-
|
|
234
|
-
|
|
235
|
-
Loki Mode now includes production-ready security and compliance features for enterprise deployments:
|
|
236
|
-
|
|
237
|
-
### **Authentication & Authorization**
|
|
238
|
-
- **TLS/HTTPS Encryption**: Self-signed certificates for dashboard encryption (v5.36.0)
|
|
239
|
-
- **OIDC/SSO Integration**: Support for Google, Azure AD, and Okta authentication (v5.36.0)
|
|
240
|
-
- **RBAC Roles**: Four-tier role system (v5.37.0)
|
|
241
|
-
- **Admin**: Full control, configuration changes, user management
|
|
242
|
-
- **Operator**: Start/stop sessions, modify tasks, execute actions
|
|
243
|
-
- **Viewer**: Read-only dashboard access, view logs and metrics
|
|
244
|
-
- **Auditor**: Access audit logs, compliance reports, security events
|
|
245
|
-
|
|
246
|
-
### **Observability & Monitoring**
|
|
247
|
-
- **Prometheus/OpenMetrics**: `/metrics` endpoint for production monitoring (v5.38.0)
|
|
248
|
-
- Task completion rates, agent performance, memory usage
|
|
249
|
-
- Integration with Grafana, Datadog, New Relic
|
|
250
|
-
- **Audit Trail**: SHA-256 integrity chain for all agent actions (v5.37.0)
|
|
251
|
-
- Tamper-evident logging with cryptographic verification
|
|
252
|
-
- Complete action history: who did what, when, and why
|
|
253
|
-
- **SIEM Integration**: Syslog forwarding (RFC 5424) for enterprise security (v5.38.0)
|
|
254
|
-
- Send logs to Splunk, QRadar, ArcSight, Elastic SIEM
|
|
255
|
-
- Real-time security event detection and alerting
|
|
256
|
-
|
|
257
|
-
### **Development Safety**
|
|
258
|
-
- **Branch Protection**: Auto-create feature branches for all PR work (v5.37.0)
|
|
259
|
-
- Never commits directly to main/master
|
|
260
|
-
- Automatic branch naming: `loki/feature/<task-id>`
|
|
261
|
-
- Clean merge workflow with squash commits
|
|
262
|
-
- **OpenClaw Bridge**: Multi-agent coordination protocol integration (v5.38.0)
|
|
263
|
-
- Standardized inter-agent communication
|
|
264
|
-
- Cross-system orchestration support
|
|
265
|
-
|
|
266
|
-
### **Quick Start (Enterprise Mode)**
|
|
49
|
+
### CLI Mode
|
|
267
50
|
|
|
268
51
|
```bash
|
|
269
|
-
|
|
270
|
-
|
|
271
|
-
|
|
272
|
-
export LOKI_TLS_KEY=/path/to/key.pem
|
|
273
|
-
|
|
274
|
-
# Configure OIDC
|
|
275
|
-
export LOKI_OIDC_PROVIDER=google
|
|
276
|
-
export LOKI_OIDC_CLIENT_ID=your-client-id
|
|
277
|
-
export LOKI_OIDC_CLIENT_SECRET=your-client-secret
|
|
278
|
-
|
|
279
|
-
# Enable audit logging
|
|
280
|
-
export LOKI_AUDIT_ENABLED=true
|
|
281
|
-
export LOKI_AUDIT_INTEGRITY_CHECK=true
|
|
282
|
-
|
|
283
|
-
# Enable Prometheus metrics
|
|
284
|
-
export LOKI_METRICS_ENABLED=true
|
|
285
|
-
|
|
286
|
-
# Start with enterprise features
|
|
287
|
-
loki start --enterprise ./my-prd.md
|
|
52
|
+
npm install -g loki-mode
|
|
53
|
+
loki doctor # verify environment
|
|
54
|
+
loki start ./prd.md # uses Claude Code by default
|
|
288
55
|
```
|
|
289
56
|
|
|
290
|
-
|
|
291
|
-
|
|
292
|
-
---
|
|
293
|
-
|
|
294
|
-
## Dashboard & Real-Time Monitoring
|
|
295
|
-
|
|
296
|
-
Monitor your autonomous startup being built in real-time through the Loki Mode dashboard:
|
|
297
|
-
|
|
298
|
-
### **Agent Monitoring**
|
|
299
|
-
|
|
300
|
-
<img width="1200" alt="Loki Mode Dashboard - Active Agents" src="docs/screenshots/dashboard-agents.png" />
|
|
301
|
-
|
|
302
|
-
**Track all active agents in real-time:**
|
|
303
|
-
- **Agent ID** and **Type** (frontend, backend, QA, DevOps, etc.)
|
|
304
|
-
- **Model Badge** (Sonnet, Haiku, Opus) with color coding
|
|
305
|
-
- **Current Work** being performed
|
|
306
|
-
- **Runtime** and **Tasks Completed**
|
|
307
|
-
- **Status** (active, completed)
|
|
308
|
-
|
|
309
|
-
### **Task Queue Visualization**
|
|
310
|
-
|
|
311
|
-
<img width="1200" alt="Loki Mode Dashboard - Task Queue" src="docs/screenshots/dashboard-tasks.png" />
|
|
312
|
-
|
|
313
|
-
**Four-column kanban view:**
|
|
314
|
-
- **Pending**: Queued tasks waiting for agents
|
|
315
|
-
- **In Progress**: Currently being worked on
|
|
316
|
-
- **Completed**: Successfully finished (shows last 10)
|
|
317
|
-
- **Failed**: Tasks requiring attention
|
|
318
|
-
|
|
319
|
-
### **Live Status Monitor**
|
|
57
|
+
### Interactive Mode (inside Claude Code)
|
|
320
58
|
|
|
321
59
|
```bash
|
|
322
|
-
|
|
323
|
-
|
|
324
|
-
```
|
|
325
|
-
|
|
60
|
+
claude --dangerously-skip-permissions
|
|
61
|
+
# Then type: "Loki Mode" or "Loki Mode with PRD at ./my-prd.md"
|
|
326
62
|
```
|
|
327
|
-
╔════════════════════════════════════════════════════════════════╗
|
|
328
|
-
║ LOKI MODE STATUS ║
|
|
329
|
-
╚════════════════════════════════════════════════════════════════╝
|
|
330
|
-
|
|
331
|
-
Phase: DEVELOPMENT
|
|
332
63
|
|
|
333
|
-
|
|
334
|
-
├─ Engineering: 18
|
|
335
|
-
├─ Operations: 12
|
|
336
|
-
├─ QA: 8
|
|
337
|
-
└─ Business: 9
|
|
64
|
+
This is the easiest way to try it if you already have Claude Code installed. No separate `loki` CLI installation needed.
|
|
338
65
|
|
|
339
|
-
|
|
340
|
-
├─ Pending: 10
|
|
341
|
-
├─ In Progress: 47
|
|
342
|
-
├─ Completed: 203
|
|
343
|
-
└─ Failed: 0
|
|
344
|
-
|
|
345
|
-
Last Updated: 2026-01-04 20:45:32
|
|
346
|
-
```
|
|
66
|
+
### What Happens
|
|
347
67
|
|
|
348
|
-
|
|
349
|
-
```bash
|
|
350
|
-
# Automatically starts when running autonomously
|
|
351
|
-
./autonomy/run.sh ./docs/requirements.md
|
|
68
|
+
The system classifies your PRD complexity, assembles an agent team, and runs RARV cycles with 9 quality gates. Output is committed to a Git repo with source code, tests, deployment configs, and audit logs. The dashboard auto-starts at `http://localhost:57374` for real-time monitoring, or use `loki status` from the terminal.
|
|
352
69
|
|
|
353
|
-
|
|
354
|
-
open http://localhost:57374
|
|
355
|
-
# HTTPS mode (v5.36.0+):
|
|
356
|
-
open https://localhost:57374
|
|
357
|
-
```
|
|
70
|
+
**Other install methods:** Homebrew (`brew tap asklokesh/tap && brew install loki-mode`), Docker, Git clone, VS Code Extension. See [Installation Guide](docs/INSTALLATION.md).
|
|
358
71
|
|
|
359
|
-
|
|
72
|
+
**Cost:** Loki Mode uses your AI provider's API. Simple projects typically use a modest number of tokens; complex projects with parallel agents use more. Monitor token economics with `loki memory economics`. See [Token Economics](references/memory-system.md) for details.
|
|
360
73
|
|
|
361
74
|
---
|
|
362
75
|
|
|
363
|
-
##
|
|
76
|
+
## Presentation
|
|
364
77
|
|
|
365
|
-
|
|
78
|
+

|
|
366
79
|
|
|
367
|
-
|
|
80
|
+
*9 slides: Problem, Solution, 41 Agents, RARV Cycle, Benchmarks, Multi-Provider, Full Lifecycle* | **[Download PPTX](docs/loki-mode-presentation.pptx)**
|
|
368
81
|
|
|
369
|
-
|
|
370
|
-
1. REASON
|
|
371
|
-
└─ Read .loki/CONTINUITY.md including "Mistakes & Learnings"
|
|
372
|
-
└─ Check .loki/state/ and .loki/queue/
|
|
373
|
-
└─ Identify next task or improvement
|
|
374
|
-
|
|
375
|
-
2. ACT
|
|
376
|
-
└─ Execute task, write code
|
|
377
|
-
└─ Commit changes atomically (git checkpoint)
|
|
378
|
-
|
|
379
|
-
3. REFLECT
|
|
380
|
-
└─ Update .loki/CONTINUITY.md with progress
|
|
381
|
-
└─ Update state files
|
|
382
|
-
└─ Identify NEXT improvement
|
|
383
|
-
|
|
384
|
-
4. VERIFY
|
|
385
|
-
└─ Run automated tests (unit, integration, E2E)
|
|
386
|
-
└─ Check compilation/build
|
|
387
|
-
└─ Verify against spec
|
|
388
|
-
|
|
389
|
-
IF VERIFICATION FAILS:
|
|
390
|
-
├─ Capture error details (stack trace, logs)
|
|
391
|
-
├─ Analyze root cause
|
|
392
|
-
├─ UPDATE "Mistakes & Learnings" in CONTINUITY.md
|
|
393
|
-
├─ Rollback to last good git checkpoint if needed
|
|
394
|
-
└─ Apply learning and RETRY from REASON
|
|
395
|
-
```
|
|
82
|
+
---
|
|
396
83
|
|
|
397
|
-
|
|
84
|
+
## Architecture
|
|
398
85
|
|
|
399
|
-
|
|
86
|
+
<img width="5989" height="2875" alt="image" src="https://github.com/user-attachments/assets/c9798120-9587-4847-8e8d-8f421f984dfc" />
|
|
400
87
|
|
|
401
|
-
There is **NEVER** a "finished" state. After completing the PRD, Loki Mode:
|
|
402
|
-
- Runs performance optimizations
|
|
403
|
-
- Adds missing test coverage
|
|
404
|
-
- Improves documentation
|
|
405
|
-
- Refactors code smells
|
|
406
|
-
- Updates dependencies
|
|
407
|
-
- Enhances user experience
|
|
408
|
-
- Implements A/B test learnings
|
|
409
88
|
|
|
410
|
-
|
|
89
|
+
*Fallback: PRD -> Classifier -> Agent Team (41 types, 8 swarms) -> RARV Cycle <-> Memory System -> Quality Gates (pass/fail loop) -> Output*
|
|
411
90
|
|
|
412
|
-
|
|
91
|
+
See [full architecture documentation](docs/enterprise/architecture.md) for the detailed view.
|
|
413
92
|
|
|
414
|
-
**
|
|
415
|
-
**Errors?** Circuit breakers, dead letter queues, retry logic.
|
|
416
|
-
**Interruptions?** State checkpoints every 5 seconds—just restart.
|
|
93
|
+
**Key components:**
|
|
417
94
|
|
|
418
|
-
|
|
419
|
-
|
|
420
|
-
|
|
421
|
-
|
|
422
|
-
|
|
423
|
-
|
|
424
|
-
# ├─ Waits with exponential backoff (60s → 120s → 240s...)
|
|
425
|
-
# ├─ Resumes from exact point
|
|
426
|
-
# └─ Continues until completion or max retries (default: 50)
|
|
427
|
-
```
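The comments above describe waits that double on each rate-limit hit (60s → 120s → 240s...) up to a default of 50 retries. A rough sketch of that schedule follows. `backoff_delays` is a hypothetical helper, and the one-hour cap is an assumption added here so the delays stay bounded; the source does not state a cap.

```python
from typing import Iterator


def backoff_delays(
    base_seconds: int = 60,
    max_retries: int = 50,
    cap_seconds: int = 3600,  # assumed cap, not taken from the source
) -> Iterator[int]:
    """Yield the wait before each retry, doubling from `base_seconds`."""
    for attempt in range(max_retries):
        # 60, 120, 240, ... clamped to the assumed cap.
        yield min(base_seconds * 2 ** attempt, cap_seconds)
```

A caller would sleep for each yielded delay before retrying and give up once the generator is exhausted.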
|
|
95
|
+
- **RARV Cycle** -- Reason-Act-Reflect-Verify with self-correction on failure. [Core Workflow](references/core-workflow.md)
|
|
96
|
+
- **41 Agent Types** -- 8 swarms auto-composed by PRD complexity. [Agent Types](references/agent-types.md)
|
|
97
|
+
- **9 Quality Gates** -- Blind review, anti-sycophancy, severity blocking, mock/mutation detection. [Quality Gates](skills/quality-gates.md)
|
|
98
|
+
- **Memory System** -- Episodic, semantic, procedural tiers with progressive disclosure. [Memory Architecture](references/memory-system.md)
|
|
99
|
+
- **Dashboard** -- Real-time monitoring, API v2, WebSocket at port 57374. [Dashboard Guide](docs/dashboard-guide.md)
|
|
100
|
+
- **Enterprise Layer** -- OTEL, policy engine, audit trails, RBAC, SSO (requires env var activation). [Enterprise Guide](docs/enterprise/architecture.md)
|
|
428
101
|
|
|
429
102
|
---
|
|
430
103
|
|
|
431
|
-
## Quick Start
|
|
432
|
-
|
|
433
|
-
### **1. Write a PRD**
|
|
434
|
-
|
|
435
|
-
```markdown
|
|
436
|
-
# Product: AI-Powered Todo App
|
|
437
|
-
|
|
438
|
-
## Overview
|
|
439
|
-
Build a todo app with AI-powered task suggestions and deadline predictions.
|
|
440
|
-
|
|
441
104
|
## Features
|
|
442
|
-
- User authentication (email/password)
|
|
443
|
-
- Create, read, update, delete todos
|
|
444
|
-
- AI suggests next tasks based on patterns
|
|
445
|
-
- Smart deadline predictions
|
|
446
|
-
- Mobile-responsive design
|
|
447
|
-
|
|
448
|
-
## Tech Stack
|
|
449
|
-
- Next.js 14 with TypeScript
|
|
450
|
-
- PostgreSQL database
|
|
451
|
-
- OpenAI API for suggestions
|
|
452
|
-
- Deploy to Vercel
|
|
453
|
-
```
|
|
454
|
-
|
|
455
|
-
Save as `my-prd.md`.
|
|
456
105
|
|
|
457
|
-
|
|
458
|
-
|
|
459
|
-
|
|
460
|
-
|
|
461
|
-
|
|
106
|
+
| Category | Highlights | Docs |
|
|
107
|
+
|---|---|---|
|
|
108
|
+
| **Agents** | 41 types across 8 swarms, auto-composed by PRD complexity | [Agent Types](references/agent-types.md) |
|
|
109
|
+
| **Quality** | 9 gates: blind review, anti-sycophancy, mock/mutation detection | [Quality Gates](skills/quality-gates.md) |
|
|
110
|
+
| **Dashboard** | Real-time monitoring, API v2, WebSocket, auto-starts with `loki start` | [Dashboard Guide](docs/dashboard-guide.md) |
|
|
111
|
+
| **Memory** | 3-tier (episodic/semantic/procedural), knowledge graph, vector search | [Memory System](references/memory-system.md) |
|
|
112
|
+
| **Providers** | Claude (full), Codex (sequential), Gemini (sequential) | [Provider Guide](skills/providers.md) |
|
|
113
|
+
| **Enterprise** | TLS, OIDC/SSO, RBAC, OTEL, policy engine, audit trails | [Enterprise Guide](docs/enterprise/architecture.md) |
|
|
114
|
+
| **Integrations** | Jira, Slack, Teams, GitHub Actions (Linear: partial) | [Integration Cookbook](docs/enterprise/integration-cookbook.md) |
|
|
115
|
+
| **Deployment** | Helm, Docker Compose, Terraform configs (AWS/Azure/GCP) | [Deployment Guide](deploy/helm/README.md) |
|
|
116
|
+
| **SDKs** | Python (`loki-mode-sdk`), TypeScript (`loki-mode-sdk`) | [SDK Guide](docs/enterprise/sdk-guide.md) |
|
|
462
117
|
|
|
463
|
-
###
|
|
118
|
+
### Multi-Provider Support
|
|
464
119
|
|
|
465
|
-
|
|
466
|
-
|
|
467
|
-
|
|
468
|
-
|
|
120
|
+
| Provider | Install | Autonomous Flag | Parallel Agents |
|
|
121
|
+
|----------|---------|-----------------|-----------------|
|
|
122
|
+
| Claude Code | `npm i -g @anthropic-ai/claude-code` | `--dangerously-skip-permissions` | Yes (10+) |
|
|
123
|
+
| Codex CLI | `npm i -g @openai/codex` | `--full-auto` | No (sequential) |
|
|
124
|
+
| Gemini CLI | `npm i -g @google/gemini-cli` | `--approval-mode=yolo` | No (sequential) |
|
|
469
125
|
|
|
470
|
-
|
|
126
|
+
Claude gets full features (subagents, parallelization, MCP, Task tool). Codex and Gemini run in sequential mode -- one agent at a time, no Task tool. See [Provider Guide](skills/providers.md) for the full comparison.
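The provider table can also be read as a small lookup. The sketch below only restates the flags and parallelism column from the table; `provider_command` and `supports_parallel_agents` are hypothetical helpers, and this is not necessarily how `loki` itself launches a provider.

```python
from typing import Dict, List

# Executable name plus autonomous flag, copied from the table above.
AUTONOMOUS_COMMANDS: Dict[str, List[str]] = {
    "claude": ["claude", "--dangerously-skip-permissions"],
    "codex": ["codex", "--full-auto"],
    "gemini": ["gemini", "--approval-mode=yolo"],
}

# Only Claude is listed as supporting parallel agents.
PARALLEL_AGENTS: Dict[str, bool] = {
    "claude": True,
    "codex": False,
    "gemini": False,
}


def provider_command(provider: str) -> List[str]:
    """Return the argv for running a provider in its autonomous mode."""
    try:
        return list(AUTONOMOUS_COMMANDS[provider])
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None


def supports_parallel_agents(provider: str) -> bool:
    """Return True if the table lists the provider as parallel-capable."""
    return PARALLEL_AGENTS.get(provider, False)
```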
|
|
471
127
|
|
|
472
128
|
---
|
|
473
129
|
|
|
474
|
-
##
|
|
475
|
-
|
|
476
|
-
```mermaid
|
|
477
|
-
graph TB
|
|
478
|
-
PRD["PRD Document"] --> REASON
|
|
479
|
-
|
|
480
|
-
subgraph RARVC["RARV+C Cycle"]
|
|
481
|
-
direction TB
|
|
482
|
-
REASON["1. Reason"] --> ACT["2. Act"]
|
|
483
|
-
ACT --> REFLECT["3. Reflect"]
|
|
484
|
-
REFLECT --> VERIFY["4. Verify"]
|
|
485
|
-
VERIFY -->|"pass"| COMPOUND["5. Compound"]
|
|
486
|
-
VERIFY -->|"fail"| REASON
|
|
487
|
-
COMPOUND --> REASON
|
|
488
|
-
end
|
|
489
|
-
|
|
490
|
-
subgraph PROVIDERS["Provider Layer"]
|
|
491
|
-
CLAUDE["Claude Code<br/>(full features)"]
|
|
492
|
-
CODEX["Codex CLI<br/>(degraded)"]
|
|
493
|
-
GEMINI["Gemini CLI<br/>(degraded)"]
|
|
494
|
-
end
|
|
495
|
-
|
|
496
|
-
ACT --> PROVIDERS
|
|
497
|
-
|
|
498
|
-
subgraph AGENTS["Agent Swarms (41 types)"]
|
|
499
|
-
ENG["Engineering (8)"]
|
|
500
|
-
OPS["Operations (8)"]
|
|
501
|
-
BIZ["Business (8)"]
|
|
502
|
-
DATA["Data (3)"]
|
|
503
|
-
PROD["Product (3)"]
|
|
504
|
-
GROWTH["Growth (4)"]
|
|
505
|
-
REVIEW["Review (3)"]
|
|
506
|
-
ORCH["Orchestration (4)"]
|
|
507
|
-
end
|
|
508
|
-
|
|
509
|
-
PROVIDERS --> AGENTS
|
|
510
|
-
|
|
511
|
-
subgraph INFRA["Infrastructure"]
|
|
512
|
-
DASHBOARD["Dashboard<br/>(FastAPI + Web UI)<br/>TLS/HTTPS, OIDC, RBAC"]
|
|
513
|
-
MEMORY["Memory System<br/>(Episodic/Semantic/Procedural)"]
|
|
514
|
-
COUNCIL["Completion Council<br/>(3-member voting)"]
|
|
515
|
-
QUEUE["Task Queue<br/>(.loki/queue/)"]
|
|
516
|
-
METRICS["Metrics Export<br/>(Prometheus/OpenMetrics)"]
|
|
517
|
-
AUDIT["Audit Trail<br/>(SHA-256 integrity chain)"]
|
|
518
|
-
end
|
|
519
|
-
|
|
520
|
-
AGENTS --> QUEUE
|
|
521
|
-
VERIFY --> COUNCIL
|
|
522
|
-
REFLECT --> MEMORY
|
|
523
|
-
COMPOUND --> MEMORY
|
|
524
|
-
AGENTS --> AUDIT
|
|
525
|
-
DASHBOARD -.->|"reads"| QUEUE
|
|
526
|
-
DASHBOARD -.->|"reads"| MEMORY
|
|
527
|
-
DASHBOARD -.->|"reads"| AUDIT
|
|
528
|
-
DASHBOARD -.->|"exposes"| METRICS
|
|
529
|
-
```
|
|
530
|
-
|
|
531
|
-
**Key components:**
|
|
532
|
-
- **RARV+C Cycle** -- Reason, Act, Reflect, Verify, Compound. Every iteration follows this loop. Failed verification triggers retry from Reason.
|
|
533
|
-
- **Provider Layer** -- Claude Code (full parallel agents, Task tool, MCP), Codex CLI and Gemini CLI (sequential, degraded mode).
|
|
534
|
-
- **Agent Swarms** -- 41 specialized agent types across 8 swarms, spawned on demand based on project complexity.
|
|
535
|
-
- **Completion Council** -- 3 members vote on whether the project is done. Anti-sycophancy devil's advocate on unanimous votes.
|
|
536
|
-
- **Memory System** -- Episodic traces, semantic patterns, procedural skills. Progressive disclosure reduces context usage by 60-80%.
|
|
537
|
-
- **Dashboard** -- FastAPI server reading `.loki/` flat files, with real-time web UI for task queue, agents, logs, and council state. Now with TLS/HTTPS, OIDC/SSO, and RBAC (v5.36.0-v5.37.0).
|
|
538
|
-
- **Metrics Export** -- Prometheus/OpenMetrics endpoint for production monitoring (v5.38.0).
|
|
539
|
-
- **Audit Trail** -- SHA-256 integrity chain for tamper-evident logging of all agent actions (v5.37.0).
|
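The removed "Audit Trail" bullet mentions a SHA-256 integrity chain for tamper-evident logging. The diff does not show how Loki Mode builds that chain, so the following is only a minimal sketch of the general hash-chain technique. The names `GENESIS`, `entry_hash`, `append_entry`, and `verify_chain`, and the record fields, are hypothetical and not taken from the package.

```python
import hashlib
import json

# Hypothetical starting value for the chain (assumption, not from the package).
GENESIS = "0" * 64


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list, record: dict) -> None:
    """Append a record, linking it to the hash of the entry before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited record or broken link fails the check."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing an early record invalidates that entry and every later link, which is what makes the log tamper-evident rather than tamper-proof.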
|
540
|
-
|
|
541
|
-
---
|
|
542
|
-
|
|
543
|
-
## CLI Commands
|
|
544
|
-
|
|
545
|
-
The `loki` CLI provides easy access to all Loki Mode features:
|
|
130
|
+
## CLI
|
|
546
131
|
|
|
547
132
|
| Command | Description |
|
|
548
133
|
|---------|-------------|
|
|
549
|
-
| `loki start [PRD]` | Start
|
|
550
|
-
| `loki stop` | Stop execution
|
|
551
|
-
| `loki pause` | Pause after current session |
|
|
552
|
-
| `loki resume` | Resume paused execution |
|
|
134
|
+
| `loki start [PRD]` | Start with optional PRD file |
|
|
135
|
+
| `loki stop` | Stop execution |
|
|
136
|
+
| `loki pause` / `resume` | Pause/resume after current session |
|
|
553
137
|
| `loki status` | Show current status |
|
|
554
|
-
| `loki dashboard` | Open dashboard
|
|
138
|
+
| `loki dashboard` | Open web dashboard |
|
|
139
|
+
| `loki doctor` | Check environment and dependencies |
|
|
555
140
|
| `loki import` | Import GitHub issues as tasks |
|
|
556
|
-
| `loki
|
|
557
|
-
| `loki
|
|
558
|
-
| `loki audit logs` | View audit trail (v5.37.0) |
|
|
559
|
-
| `loki audit verify` | Verify log integrity chain (v5.37.0) |
|
|
560
|
-
| `loki metrics` | Display Prometheus metrics (v5.38.0) |
|
|
561
|
-
| `loki syslog test` | Test SIEM integration (v5.38.0) |
|
|
141
|
+
| `loki memory <cmd>` | Memory system CLI (index, timeline, search, consolidate) |
|
|
142
|
+
| `loki enterprise` | Enterprise feature management (tokens, OIDC) |
|
|
562
143
|
| `loki version` | Show version |
|
|
563
144
|
|
|
564
|
-
|
|
565
|
-
|
|
566
|
-
Create a YAML config file for persistent settings:
|
|
567
|
-
|
|
568
|
-
```bash
|
|
569
|
-
# Initialize config
|
|
570
|
-
loki config init
|
|
571
|
-
|
|
572
|
-
# Or copy template manually
|
|
573
|
-
cp ~/.claude/skills/loki-mode/autonomy/config.example.yaml .loki/config.yaml
|
|
574
|
-
```
|
|
575
|
-
|
|
576
|
-
Config search order: `.loki/config.yaml` (project) -> `~/.config/loki-mode/config.yaml` (global)
|
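The removed "Config search order" line lists the project file first and the global file second. A minimal sketch of that lookup, assuming the first existing file wins and the two files are not merged (the diff does not say which), could look like this. `find_config` is a hypothetical helper, not part of the `loki` CLI.

```python
from pathlib import Path
from typing import Optional


def find_config(project_dir: Path, home_dir: Path) -> Optional[Path]:
    """Return the first config file that exists, project before global."""
    candidates = [
        project_dir / ".loki" / "config.yaml",                 # project
        home_dir / ".config" / "loki-mode" / "config.yaml",    # global
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # no config found; the caller would fall back to defaults
```

Typical use would be `find_config(Path.cwd(), Path.home())`.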
|
145
|
+
Run `loki --help` for all commands. Full reference: [CLI Reference](wiki/CLI-Reference.md) | Configuration: [config.example.yaml](autonomy/config.example.yaml)
|
|
577
146
|
|
|
578
147
|
---
|
|
579
148
|
|
|
580
|
-
##
|
|
581
|
-
|
|
582
|
-
Loki Mode has **41 predefined agent types** organized into **8 specialized swarms**. The orchestrator spawns only what you need -- simple projects typically use 5-10 agents, complex ones may use more.
|
|
583
|
-
|
|
584
|
-
<img width="5309" height="979" alt="Agent Swarms Visualization" src="https://github.com/user-attachments/assets/7d18635d-a606-401f-8d9f-430e6e4ee689" />
|
|
585
|
-
|
|
586
|
-
### **Engineering (8 types)**
|
|
587
|
-
`eng-frontend` `eng-backend` `eng-database` `eng-mobile` `eng-api` `eng-qa` `eng-perf` `eng-infra`
|
|
588
|
-
|
|
589
|
-
### **Operations (8 types)**
|
|
590
|
-
`ops-devops` `ops-sre` `ops-security` `ops-monitor` `ops-incident` `ops-release` `ops-cost` `ops-compliance`
|
|
591
|
-
|
|
592
|
-
### **Business (8 types)**
|
|
593
|
-
`biz-marketing` `biz-sales` `biz-finance` `biz-legal` `biz-support` `biz-hr` `biz-investor` `biz-partnerships`
|
|
594
|
-
|
|
595
|
-
### **Data (3 types)**
|
|
596
|
-
`data-ml` `data-eng` `data-analytics`
|
|
597
|
-
|
|
598
|
-
### **Product (3 types)**
|
|
599
|
-
`prod-pm` `prod-design` `prod-techwriter`
|
|
600
|
-
|
|
601
|
-
### **Growth (4 types)**
|
|
602
|
-
`growth-hacker` `growth-community` `growth-success` `growth-lifecycle`
|
|
603
|
-
|
|
604
|
-
### **Review (3 types)**
|
|
605
|
-
`review-code` `review-business` `review-security`
|
|
606
|
-
|
|
607
|
-
### **Orchestration (4 types)**
|
|
608
|
-
`orch-planner` `orch-sub-planner` `orch-judge` `orch-coordinator`
|
|
609
|
-
|
|
610
|
-
See [Agent Types](references/agent-types.md) for the full list of 41 specialized agents with detailed capabilities.
|
|
611
|
-
|
|
612
|
-
---
|
|
613
|
-
|
|
614
|
-
## How It Works
|
|
615
|
-
|
|
616
|
-
### **Skill Architecture (v3.0+)**
|
|
617
|
-
|
|
618
|
-
Loki Mode uses a **progressive disclosure architecture** to minimize context usage:
|
|
619
|
-
|
|
620
|
-
```
|
|
621
|
-
SKILL.md (~190 lines) # Always loaded: core RARV cycle, autonomy rules
|
|
622
|
-
skills/
|
|
623
|
-
00-index.md # Module routing table
|
|
624
|
-
agents.md # Agent dispatch, A2A patterns
|
|
625
|
-
production.md # HN patterns, batch processing, CI/CD
|
|
626
|
-
quality-gates.md # Review system, severity handling
|
|
627
|
-
testing.md # Playwright, E2E, property-based
|
|
628
|
-
model-selection.md # Task tool, parallelization
|
|
629
|
-
artifacts.md # Code generation patterns
|
|
630
|
-
patterns-advanced.md # Constitutional AI, debate
|
|
631
|
-
troubleshooting.md # Error recovery, fallbacks
|
|
632
|
-
references/ # Deep documentation (23KB+ files)
|
|
633
|
-
```
|
|
149
|
+
## Enterprise
|
|
634
150
|
|
|
635
|
-
|
|
636
|
-
- Original 1,517-line SKILL.md consumed ~15% of context before any work began
|
|
637
|
-
- Now only ~1% of context for core skill + on-demand modules
|
|
638
|
-
- More room for actual code and reasoning
|
|
639
|
-
|
|
640
|
-
### **Phase Execution**
|
|
641
|
-
|
|
642
|
-
| Phase | Description |
|
|
643
|
-
|-------|-------------|
|
|
644
|
-
| **0. Bootstrap** | Create `.loki/` directory structure, initialize state |
|
|
645
|
-
| **1. Discovery** | Parse PRD, competitive research via web search |
|
|
646
|
-
| **2. Architecture** | Tech stack selection with self-reflection |
|
|
647
|
-
| **3. Infrastructure** | Provision cloud, CI/CD, monitoring |
|
|
648
|
-
| **4. Development** | Implement with TDD, parallel code review |
|
|
649
|
-
| **5. QA** | 9 quality gates, security audit, load testing |
|
|
650
|
-
| **6. Deployment** | Blue-green deploy, auto-rollback on errors |
|
|
651
|
-
| **7. Business** | Marketing, sales, legal, support setup |
|
|
652
|
-
| **8. Growth** | Continuous optimization, A/B testing, feedback loops |
|
|
653
|
-
|
|
654
|
-
### **Parallel Code Review**
|
|
655
|
-
|
|
656
|
-
Every code change goes through **3 specialized reviewers simultaneously**:
|
|
657
|
-
|
|
658
|
-
```
|
|
659
|
-
IMPLEMENT → REVIEW (parallel) → AGGREGATE → FIX → RE-REVIEW → COMPLETE
|
|
660
|
-
│
|
|
661
|
-
├─ code-reviewer (Sonnet) - Code quality, patterns, best practices
|
|
662
|
-
├─ business-logic-reviewer (Sonnet) - Requirements, edge cases, UX
|
|
663
|
-
└─ security-reviewer (Sonnet) - Vulnerabilities, OWASP Top 10
|
|
664
|
-
```
|
|
665
|
-
|
|
666
|
-
**Severity-based issue handling:**
|
|
667
|
-
- **Critical/High/Medium**: Block. Fix immediately. Re-review.
|
|
668
|
-
- **Low**: Add `// TODO(review): ...` comment, continue.
|
|
669
|
-
- **Cosmetic**: Add `// FIXME(nitpick): ...` comment, continue.
|
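The removed severity rules above amount to a small triage step: Critical, High, and Medium findings block, while Low and Cosmetic findings become code comments. A hedged sketch of that rule follows. The `triage` function and the issue dictionary shape (`severity`, `message`) are assumptions for illustration; only the severity names and comment prefixes come from the removed README text.

```python
# Severities that block until fixed and re-reviewed (from the removed README text).
BLOCKING = {"critical", "high", "medium"}

# Comment prefixes for non-blocking severities (from the removed README text).
COMMENT_TEMPLATES = {
    "low": "// TODO(review): {msg}",
    "cosmetic": "// FIXME(nitpick): {msg}",
}


def triage(issues):
    """Split review findings into blocking issues and comments to add.

    Each issue is a dict with hypothetical keys "severity" and "message".
    """
    blocking, comments = [], []
    for issue in issues:
        severity = issue["severity"].lower()
        if severity in BLOCKING:
            blocking.append(issue)
        elif severity in COMMENT_TEMPLATES:
            comments.append(COMMENT_TEMPLATES[severity].format(msg=issue["message"]))
        else:
            raise ValueError(f"unknown severity: {issue['severity']}")
    return blocking, comments
```

A non-empty `blocking` list would send the change back through fix and re-review; an empty one lets it continue with the comments attached.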
|
670
|
-
|
|
671
|
-
### **Directory Structure**
|
|
672
|
-
|
|
673
|
-
```
|
|
674
|
-
.loki/
|
|
675
|
-
├── state/ # Orchestrator and agent states
|
|
676
|
-
├── queue/ # Task queue (pending, in-progress, completed, dead-letter)
|
|
677
|
-
├── memory/ # Episodic, semantic, and procedural memory
|
|
678
|
-
├── metrics/ # Efficiency tracking and reward signals
|
|
679
|
-
├── messages/ # Inter-agent communication
|
|
680
|
-
├── logs/ # Audit logs
|
|
681
|
-
├── audit/ # Audit trail with SHA-256 integrity chain (v5.37.0)
|
|
682
|
-
├── security/ # TLS certificates, OIDC configs (v5.36.0)
|
|
683
|
-
├── rbac/ # Role definitions and permissions (v5.37.0)
|
|
684
|
-
├── config/ # Configuration files
|
|
685
|
-
├── prompts/ # Agent role prompts
|
|
686
|
-
├── artifacts/ # Releases, reports, backups
|
|
687
|
-
├── dashboard/ # Real-time monitoring dashboard
|
|
688
|
-
└── scripts/ # Helper scripts
|
|
689
|
-
```
|
|
690
|
-
|
|
691
|
-
### **Memory System (v5.15.0)**
|
|
692
|
-
|
|
693
|
-
Complete 3-tier memory architecture with progressive disclosure:
|
|
694
|
-
|
|
695
|
-
```
|
|
696
|
-
WORKING MEMORY (CONTINUITY.md)
|
|
697
|
-
|
|
|
698
|
-
v
|
|
699
|
-
EPISODIC MEMORY (.loki/memory/episodic/)
|
|
700
|
-
|
|
|
701
|
-
v (consolidation)
|
|
702
|
-
SEMANTIC MEMORY (.loki/memory/semantic/)
|
|
703
|
-
|
|
|
704
|
-
v
|
|
705
|
-
PROCEDURAL MEMORY (.loki/memory/skills/)
|
|
706
|
-
```
|
|
707
|
-
|
|
708
|
-
**Key Features:**
|
|
709
|
-
- **Progressive Disclosure**: 3-layer loading (index ~100 tokens, timeline ~500 tokens, full details) reduces context usage by 60-80%
|
|
710
|
-
- **Token Economics**: Track discovery vs read tokens, automatic threshold-based optimization
|
|
711
|
-
- **Vector Search**: Optional embedding-based similarity search (sentence-transformers)
|
|
712
|
-
- **Consolidation Pipeline**: Automatic episodic-to-semantic transformation
|
|
713
|
-
- **Task-Aware Retrieval**: Different memory strategies for exploration, implementation, debugging, review, and refactoring
|
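The "Progressive Disclosure" bullet above describes loading memory in three layers, cheapest first. The sketch below illustrates that idea only: it tries each layer in order and stops at the first one that answers, counting the tokens spent. The `progressive_retrieve` function, the lookup callables, and any cost given to the full-details layer are hypothetical; only the index and timeline figures (~100 and ~500 tokens) come from the removed README text.

```python
from typing import Callable, List, Optional, Tuple

# (layer name, approximate token cost, lookup function returning an answer or None)
Layer = Tuple[str, int, Callable[[str], Optional[str]]]


def progressive_retrieve(
    query: str, layers: List[Layer]
) -> Tuple[Optional[str], int, List[str]]:
    """Query layers from cheapest to most detailed, stopping at the first hit.

    Returns the answer (or None), the tokens spent, and the layers visited.
    """
    spent = 0
    visited = []
    for name, cost, lookup in layers:
        spent += cost
        visited.append(name)
        answer = lookup(query)
        if answer is not None:
            return answer, spent, visited
    return None, spent, visited
```

The saving comes from queries that the index or timeline can answer, since the most expensive layer is never loaded for them.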
|
714
|
-
|
|
715
|
-
**CLI Commands:**
|
|
716
|
-
```bash
|
|
717
|
-
loki memory index # View index layer
|
|
718
|
-
loki memory timeline # View compressed history
|
|
719
|
-
loki memory consolidate # Run consolidation pipeline
|
|
720
|
-
loki memory economics # View token usage metrics
|
|
721
|
-
loki memory retrieve "query" # Test task-aware retrieval
|
|
722
|
-
```
|
|
723
|
-
|
|
724
|
-
**API Endpoints:**
|
|
725
|
-
- `GET /api/memory/summary` - Memory summary
|
|
726
|
-
- `POST /api/memory/retrieve` - Query memories
|
|
727
|
-
- `POST /api/memory/consolidate` - Trigger consolidation
|
|
728
|
-
- `GET /api/memory/economics` - Token economics
|
|
729
|
-
|
|
730
|
-
See [references/memory-system.md](references/memory-system.md) for complete documentation.
|
|
731
|
-
|
|
732
|
-
---
|
|
733
|
-
|
|
734
|
-
## Example PRDs
|
|
735
|
-
|
|
736
|
-
Test Loki Mode with these pre-built PRDs in the `examples/` directory:
|
|
737
|
-
|
|
738
|
-
| PRD | Complexity | Est. Time | Description |
|
|
739
|
-
|-----|------------|-----------|-------------|
|
|
740
|
-
| `simple-todo-app.md` | Low | ~10 min | Basic todo app - tests core functionality |
|
|
741
|
-
| `api-only.md` | Low | ~10 min | REST API only - tests backend agents |
|
|
742
|
-
| `static-landing-page.md` | Low | ~5 min | HTML/CSS only - tests frontend/marketing |
|
|
743
|
-
| `full-stack-demo.md` | Medium | ~30-60 min | Complete bookmark manager - full test |
|
|
151
|
+
Enterprise features are included but require env var activation. Self-audit results: 35/45 capabilities working, 0 broken, 1,314 tests passing (683 npm + 631 pytest). 2 items partial, 3 scaffolding (OTEL/policy active only when configured). See [Audit Results](.loki/audit/integrity-audit-v5.52.0.md).
|
|
744
152
|
|
|
745
153
|
```bash
|
|
746
|
-
|
|
747
|
-
|
|
748
|
-
|
|
749
|
-
|
|
750
|
-
|
|
751
|
-
|
|
752
|
-
## Configuration
|
|
753
|
-
|
|
754
|
-
### **Autonomy Settings**
|
|
755
|
-
|
|
756
|
-
Customize the autonomous runner with environment variables:
|
|
757
|
-
|
|
758
|
-
```bash
|
|
759
|
-
LOKI_MAX_RETRIES=100 \
|
|
760
|
-
LOKI_BASE_WAIT=120 \
|
|
761
|
-
LOKI_MAX_WAIT=7200 \
|
|
762
|
-
./autonomy/run.sh ./docs/requirements.md
|
|
763
|
-
```
|
|
764
|
-
|
|
765
|
-
| Variable | Default | Description |
|
|
766
|
-
|----------|---------|-------------|
|
|
767
|
-
| `LOKI_PROVIDER` | claude | AI provider: claude, codex, gemini |
|
|
768
|
-
| `LOKI_MAX_RETRIES` | 50 | Maximum retry attempts before giving up |
|
|
769
|
-
| `LOKI_BASE_WAIT` | 60 | Base wait time in seconds |
|
|
770
|
-
| `LOKI_MAX_WAIT` | 3600 | Maximum wait time (1 hour) |
|
|
771
|
-
| `LOKI_SKIP_PREREQS` | false | Skip prerequisite checks |
|
|
772
|
-
| `LOKI_TLS_ENABLED` | false | Enable HTTPS/TLS for dashboard (v5.36.0) |
|
|
773
|
-
| `LOKI_OIDC_PROVIDER` | - | OIDC provider: google, azure, okta (v5.36.0) |
|
|
774
|
-
| `LOKI_RBAC_ENABLED` | false | Enable role-based access control (v5.37.0) |
|
|
775
|
-
| `LOKI_AUDIT_ENABLED` | false | Enable audit logging with integrity chain (v5.37.0) |
|
|
776
|
-
| `LOKI_METRICS_ENABLED` | false | Enable Prometheus /metrics endpoint (v5.38.0) |
|
|
777
|
-
| `LOKI_SYSLOG_ENABLED` | false | Enable syslog forwarding to SIEM (v5.38.0) |
|
|
778
|
-
| `LOKI_BRANCH_PROTECTION` | true | Auto-create feature branches (v5.37.0) |
|
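The removed table gives defaults for `LOKI_MAX_RETRIES` (50), `LOKI_BASE_WAIT` (60 seconds), and `LOKI_MAX_WAIT` (3600 seconds) but does not state how the wait grows between retries. The sketch below assumes capped exponential backoff, which is a common choice and may not match what `autonomy/run.sh` actually does. `read_int` and `wait_seconds` are hypothetical helpers.

```python
import os


def read_int(name: str, default: int) -> int:
    """Read an integer setting from the environment, falling back to a default."""
    return int(os.environ.get(name, default))


def wait_seconds(attempt: int, base_wait: int, max_wait: int) -> int:
    """Assumed policy: double the base wait per attempt, capped at max_wait."""
    return min(base_wait * (2 ** attempt), max_wait)


if __name__ == "__main__":
    # Defaults taken from the removed README table.
    max_retries = read_int("LOKI_MAX_RETRIES", 50)
    base = read_int("LOKI_BASE_WAIT", 60)
    cap = read_int("LOKI_MAX_WAIT", 3600)
    for attempt in range(min(max_retries, 8)):
        print(attempt, wait_seconds(attempt, base, cap))
```

Under this assumption and the default values, the wait reaches the one-hour cap on the seventh attempt (index 6) and stays there.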
|
779
|
-
|
|
780
|
-
### **Circuit Breakers**
|
|
781
|
-
|
|
782
|
-
```yaml
|
|
783
|
-
# .loki/config/circuit-breakers.yaml
|
|
784
|
-
defaults:
|
|
785
|
-
failureThreshold: 5
|
|
786
|
-
cooldownSeconds: 300
|
|
154
|
+
export LOKI_TLS_ENABLED=true
|
|
155
|
+
export LOKI_OIDC_PROVIDER=google
|
|
156
|
+
export LOKI_AUDIT_ENABLED=true
|
|
157
|
+
export LOKI_METRICS_ENABLED=true
|
|
158
|
+
loki enterprise status # check what's enabled
|
|
159
|
+
loki start ./prd.md # enterprise features activate via env vars
|
|
787
160
|
```
|
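The added Enterprise block activates features purely through environment variables. As a rough illustration of that pattern, the sketch below reads the four variables shown in the diff and reports what would be enabled. It is not the implementation of `loki enterprise status`; the `enterprise_status` function, its return shape, and the rule that only the string `true` (case-insensitive) enables a flag are assumptions.

```python
import os
from typing import Mapping, Optional


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Treat a variable as enabled only when set to "true" (assumed rule)."""
    return env.get(name, "false").strip().lower() == "true"


def enterprise_status(env: Optional[Mapping[str, str]] = None) -> dict:
    """Summarize which enterprise features the environment would activate."""
    env = os.environ if env is None else env
    return {
        "tls": _flag(env, "LOKI_TLS_ENABLED"),
        "oidc_provider": env.get("LOKI_OIDC_PROVIDER") or None,
        "audit": _flag(env, "LOKI_AUDIT_ENABLED"),
        "metrics": _flag(env, "LOKI_METRICS_ENABLED"),
    }
```

Passing an explicit mapping instead of relying on `os.environ` keeps the check easy to test.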
|
788
161
|
|
|
789
|
-
|
|
790
|
-
|
|
791
|
-
```yaml
|
|
792
|
-
# .loki/config/alerting.yaml
|
|
793
|
-
channels:
|
|
794
|
-
slack:
|
|
795
|
-
webhook_url: "${SLACK_WEBHOOK_URL}"
|
|
796
|
-
severity: [critical, high]
|
|
797
|
-
pagerduty:
|
|
798
|
-
integration_key: "${PAGERDUTY_KEY}"
|
|
799
|
-
severity: [critical]
|
|
800
|
-
```
|
|
162
|
+
[Enterprise Architecture](docs/enterprise/architecture.md) | [Security](docs/enterprise/security.md) | [Authentication](docs/authentication.md) | [Authorization](docs/authorization.md) | [Metrics](docs/metrics.md) | [Audit Logging](docs/audit-logging.md) | [SIEM](docs/siem-integration.md)
|
|
801
163
|
|
|
802
164
|
---
|
|
803
165
|
|
|
804
|
-
##
|
|
166
|
+
## Benchmarks
|
|
805
167
|
|
|
806
|
-
|
|
807
|
-
- **Internet access** for competitive research and deployment
|
|
808
|
-
- **Cloud provider credentials** (for deployment phase)
|
|
809
|
-
- **Python 3** (for test suite)
|
|
168
|
+
Results from the included test harness. Self-reported and not independently verified. Verification scripts included so you can reproduce. See [benchmarks/](benchmarks/) for methodology.
|
|
810
169
|
|
|
811
|
-
|
|
812
|
-
|
|
813
|
-
|
|
814
|
-
-
|
|
170
|
+
| Benchmark | Result | Notes |
|
|
171
|
+
|-----------|--------|-------|
|
|
172
|
+
| HumanEval | 162/164 (98.78%) | Max 3 retries per problem, RARV self-verification |
|
|
173
|
+
| SWE-bench | 299/300 patches generated | Patch generation only -- SWE-bench evaluator not yet run to confirm resolution |
|
|
815
174
|
|
|
816
175
|
---
|
|
817
176
|
|
|
818
|
-
##
|
|
819
|
-
|
|
820
|
-
### **Vibe Kanban (Visual Dashboard)**
|
|
821
|
-
|
|
822
|
-
Integrate with [Vibe Kanban](https://github.com/BloopAI/vibe-kanban) for a visual kanban board:
|
|
823
|
-
|
|
824
|
-
```bash
|
|
825
|
-
# 1. Start Vibe Kanban (terminal 1)
|
|
826
|
-
npx vibe-kanban
|
|
827
|
-
|
|
828
|
-
# 2. Run Loki Mode (terminal 2)
|
|
829
|
-
./autonomy/run.sh ./prd.md
|
|
830
|
-
|
|
831
|
-
# 3. Export tasks to see them in Vibe Kanban (terminal 3)
|
|
832
|
-
./scripts/export-to-vibe-kanban.sh
|
|
833
|
-
|
|
834
|
-
# 4. Optional: Auto-sync for real-time updates
|
|
835
|
-
./scripts/vibe-sync-watcher.sh
|
|
836
|
-
```
|
|
837
|
-
|
|
838
|
-
**Important:** Vibe Kanban integration requires manual export. Tasks don't automatically appear - you must run the export script to sync.
|
|
839
|
-
|
|
840
|
-
**Benefits:**
|
|
841
|
-
- Visual progress tracking of all active agents
|
|
842
|
-
- Manual intervention/prioritization when needed
|
|
843
|
-
- Code review with visual diffs
|
|
844
|
-
- Multi-project dashboard
|
|
845
|
-
|
|
846
|
-
See [integrations/vibe-kanban.md](integrations/vibe-kanban.md) for complete step-by-step setup guide and troubleshooting.
|
|
847
|
-
|
|
848
|
-
### **OpenClaw Bridge (v5.38.0)**
|
|
849
|
-
|
|
850
|
-
Loki Mode now supports the OpenClaw multi-agent coordination protocol for cross-system orchestration:
|
|
851
|
-
|
|
852
|
-
```bash
|
|
853
|
-
# Enable OpenClaw bridge
|
|
854
|
-
export LOKI_OPENCLAW_ENABLED=true
|
|
855
|
-
export LOKI_OPENCLAW_ENDPOINT=http://openclaw-server:8080
|
|
856
|
-
|
|
857
|
-
# Start with OpenClaw integration
|
|
858
|
-
loki start --openclaw ./prd.md
|
|
859
|
-
```
|
|
860
|
-
|
|
861
|
-
**Benefits:**
|
|
862
|
-
- Standardized inter-agent communication across different AI systems
|
|
863
|
-
- Coordinate with external agent frameworks (AutoGPT, MetaGPT, etc.)
|
|
864
|
-
- Share task queues and state between multiple orchestrators
|
|
865
|
-
- Cross-platform agent collaboration
|
|
177
|
+
## Research Foundation
|
|
866
178
|
|
|
867
|
-
|
|
179
|
+
| Source | What We Use From It |
|
|
180
|
+
|--------|---------------------|
|
|
181
|
+
| [Anthropic: Building Effective Agents](https://www.anthropic.com/research/building-effective-agents) | Evaluator-optimizer pattern, parallelization strategy |
|
|
182
|
+
| [Anthropic: Constitutional AI](https://www.anthropic.com/research/constitutional-ai-harmlessness-from-ai-feedback) | Self-critique against quality principles |
|
|
183
|
+
| [DeepMind: Scalable Oversight via Debate](https://deepmind.google/research/publications/34920/) | Debate-based verification in council review |
|
|
184
|
+
| [DeepMind: SIMA 2](https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/) | Self-improvement loop design |
|
|
185
|
+
| [OpenAI: Agents SDK](https://openai.github.io/openai-agents-python/) | Guardrails, tripwires, tracing patterns |
|
|
186
|
+
| [NVIDIA ToolOrchestra](https://github.com/NVlabs/ToolOrchestra) | Efficiency metrics, reward signal tracking |
|
|
187
|
+
| [CONSENSAGENT (ACL 2025)](https://aclanthology.org/2025.findings-acl.1141/) | Anti-sycophancy checks in blind review |
|
|
188
|
+
| [GoalAct](https://arxiv.org/abs/2504.16563) | Hierarchical planning for complex PRDs |
|
|
868
189
|
|
|
869
|
-
|
|
870
|
-
|
|
871
|
-
## Testing
|
|
190
|
+
**Practitioner insights:** Boris Cherny -- self-verification loop patterns | Simon Willison -- sub-agents for context isolation | [HN Community](https://news.ycombinator.com/item?id=44623207) -- production patterns from real deployments
|
|
872
191
|
|
|
873
|
-
|
|
874
|
-
|
|
875
|
-
```bash
|
|
876
|
-
# Run all tests
|
|
877
|
-
./tests/run-all-tests.sh
|
|
878
|
-
|
|
879
|
-
# Or run individual test suites
|
|
880
|
-
./tests/test-bootstrap.sh # Directory structure, state init
|
|
881
|
-
./tests/test-task-queue.sh # Queue operations, priorities
|
|
882
|
-
./tests/test-circuit-breaker.sh # Failure handling, recovery
|
|
883
|
-
./tests/test-agent-timeout.sh # Timeout, stuck process handling
|
|
884
|
-
./tests/test-state-recovery.sh # Checkpoints, recovery
|
|
885
|
-
```
|
|
192
|
+
**[Full Acknowledgements](docs/ACKNOWLEDGEMENTS.md)** -- 50+ research papers, articles, and resources
|
|
886
193
|
|
|
887
194
|
---
|
|
888
195
|
|
|
889
196
|
## Contributing
|
|
890
197
|
|
|
891
|
-
Contributions welcome! Please:
|
|
892
|
-
1. Read [SKILL.md](SKILL.md) to understand the core architecture
|
|
893
|
-
2. Review [skills/00-index.md](skills/00-index.md) for module organization (v3.0+)
|
|
894
|
-
3. Check [references/agent-types.md](references/agent-types.md) for agent definitions
|
|
895
|
-
4. Open an issue for bugs or feature requests
|
|
896
|
-
5. Submit PRs with clear descriptions and tests
|
|
897
|
-
|
|
898
|
-
**Dev setup:**
|
|
899
198
|
```bash
|
|
900
199
|
git clone https://github.com/asklokesh/loki-mode.git && cd loki-mode
|
|
901
|
-
npm install #
|
|
902
|
-
|
|
903
|
-
|
|
200
|
+
npm install && npm test # 683 tests, ~10 sec
|
|
201
|
+
python3 -m pytest # 631 tests, ~3 sec
|
|
202
|
+
bash tests/run-all-tests.sh # shell tests, ~2 min
|
|
904
203
|
```
|
|
905
204
|
|
|
906
|
-
See [CONTRIBUTING.md](CONTRIBUTING.md) for
|
|
907
|
-
|
|
908
|
-
---
|
|
205
|
+
See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
|
|
909
206
|
|
|
910
207
|
## License
|
|
911
208
|
|
|
912
|
-
MIT
|
|
913
|
-
|
|
914
|
-
---
|
|
915
|
-
|
|
916
|
-
## Acknowledgments
|
|
917
|
-
|
|
918
|
-
Loki Mode incorporates research and patterns from leading AI labs and practitioners:
|
|
919
|
-
|
|
920
|
-
### Research Foundation
|
|
921
|
-
|
|
922
|
-
| Source | Key Contribution |
|
|
923
|
-
|--------|------------------|
|
|
924
|
-
| [Anthropic: Building Effective Agents](https://www.anthropic.com/research/building-effective-agents) | Evaluator-optimizer pattern, parallelization |
|
|
925
|
-
| [Anthropic: Constitutional AI](https://www.anthropic.com/research/constitutional-ai-harmlessness-from-ai-feedback) | Self-critique against principles |
|
|
926
|
-
| [DeepMind: Scalable Oversight via Debate](https://deepmind.google/research/publications/34920/) | Debate-based verification |
|
|
927
|
-
| [DeepMind: SIMA 2](https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/) | Self-improvement loop |
|
|
928
|
-
| [OpenAI: Agents SDK](https://openai.github.io/openai-agents-python/) | Guardrails, tripwires, tracing |
|
|
929
|
-
| [NVIDIA ToolOrchestra](https://github.com/NVlabs/ToolOrchestra) | Efficiency metrics, reward signals |
|
|
930
|
-
| [CONSENSAGENT (ACL 2025)](https://aclanthology.org/2025.findings-acl.1141/) | Anti-sycophancy, blind review |
|
|
931
|
-
| [GoalAct](https://arxiv.org/abs/2504.16563) | Hierarchical planning |
|
|
932
|
-
|
|
933
|
-
### Practitioner Insights
|
|
934
|
-
|
|
935
|
-
- **Boris Cherny** (Claude Code creator) - Self-verification loop, extended thinking
|
|
936
|
-
- **Simon Willison** - Sub-agents for context isolation, skills system
|
|
937
|
-
- **Hacker News Community** - [Production patterns](https://news.ycombinator.com/item?id=44623207) from real deployments
|
|
938
|
-
|
|
939
|
-
### Inspirations
|
|
940
|
-
|
|
941
|
-
- [LerianStudio/ring](https://github.com/LerianStudio/ring) - Subagent-driven-development pattern
|
|
942
|
-
- [Awesome Agentic Patterns](https://github.com/nibzard/awesome-agentic-patterns) - 105+ production patterns
|
|
943
|
-
|
|
944
|
-
**[Full Acknowledgements](docs/ACKNOWLEDGEMENTS.md)** - Complete list of 50+ research papers, articles, and resources
|
|
945
|
-
|
|
946
|
-
Built for the [Claude Code](https://claude.ai) ecosystem, powered by Anthropic's Claude models (Sonnet, Haiku, Opus).
|
|
947
|
-
|
|
948
|
-
---
|
|
949
|
-
|
|
950
|
-
## Autonomi
|
|
951
|
-
|
|
952
|
-
Loki Mode is the flagship product of **[Autonomi](https://www.autonomi.dev/)** -- a platform for autonomous AI systems. Like Alphabet is to Google, Autonomi is the parent brand under which Loki Mode and future products operate.
|
|
953
|
-
|
|
954
|
-
**Why Autonomi?** Loki Mode proved that multi-agent autonomous systems can build real software from a PRD with minimal human intervention. Autonomi is the expansion of that vision into a broader platform of autonomous services and products.
|
|
955
|
-
|
|
956
|
-
- **[autonomi.dev](https://www.autonomi.dev/)** -- Main website
|
|
957
|
-
- **[Documentation](https://www.autonomi.dev/docs)** -- Full documentation
|
|
958
|
-
- **Loki Mode** -- Autonomous multi-agent startup system (this repo)
|
|
959
|
-
- More products coming soon
|
|
960
|
-
|
|
961
|
-
---
|
|
962
|
-
|
|
963
|
-
**Ready to build a startup while you sleep?**
|
|
964
|
-
|
|
965
|
-
```bash
|
|
966
|
-
git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
|
|
967
|
-
./autonomy/run.sh your-prd.md
|
|
968
|
-
```
|
|
209
|
+
MIT -- see [LICENSE](LICENSE).
|
|
969
210
|
|
|
970
211
|
---
|
|
971
212
|
|
|
972
|
-
|
|
213
|
+
[Autonomi](https://www.autonomi.dev/) | [Documentation](wiki/Home.md) | [Changelog](CHANGELOG.md) | [Installation](docs/INSTALLATION.md) | [Comparisons](references/competitive-analysis.md)
|