loki-mode 7.26.0 → 7.27.0
- package/README.md +12 -11
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/completion-council.sh +25 -0
- package/autonomy/lib/trust_metrics.py +636 -0
- package/autonomy/loki +93 -0
- package/autonomy/run.sh +113 -5
- package/autonomy/verify.sh +1075 -0
- package/dashboard/__init__.py +1 -1
- package/dashboard/static/index.html +1 -1
- package/docs/COMPARISON.md +9 -9
- package/docs/COMPETITIVE-ANALYSIS.md +18 -37
- package/docs/INSTALLATION.md +1 -1
- package/docs/auto-claude-comparison.md +9 -6
- package/docs/certification/01-core-concepts/lesson.md +3 -3
- package/docs/competitive/emergence-others-analysis.md +1 -1
- package/docs/competitive/replit-lovable-analysis.md +1 -1
- package/docs/cursor-comparison.md +1 -1
- package/docs/prd-purple-lab-platform.md +1 -1
- package/docs/show-hn-post.md +2 -2
- package/loki-ts/dist/loki.js +2 -2
- package/mcp/__init__.py +1 -1
- package/package.json +1 -1
- package/providers/codex.sh +3 -2
- package/references/agent-types.md +9 -9
- package/references/agents.md +8 -8
- package/references/business-ops.md +1 -1
- package/references/competitive-analysis.md +1 -1
- package/skills/agents.md +3 -3
- package/skills/providers.md +3 -3
package/README.md
CHANGED
@@ -18,7 +18,7 @@
 
 ---
 
-> **How it works:** Drop a spec -- a PRD, GitHub issue, OpenAPI/JSON/YAML, or one-line brief. Loki Mode classifies complexity (`run.sh:detect_complexity()`), assembles an agent team from 41 specialized
+> **How it works:** Drop a spec -- a PRD, GitHub issue, OpenAPI/JSON/YAML, or one-line brief. Loki Mode classifies complexity (`run.sh:detect_complexity()`), assembles an agent team from 41 specialized agent roles across 8 domains - prompt-defined specifications the orchestrator adopts per phase, with parallel review (blind council) and optional worktree streams on Claude Code, sequential on other providers - and runs autonomous RARV cycles (Reason - Act - Reflect - Verify, see `run.sh:run_autonomous()`) with 11 quality gates (see `skills/quality-gates.md`). Code is not "done" until it passes automated verification. Output is a Git repo with source, tests, configs, and audit logs.
 
 ---
 
@@ -26,6 +26,7 @@
 
 - **Spec-driven, autonomous, with a built-in trust layer** -- Hand Loki a spec, walk away, come back to working code with tests. The full RARV-C closure loop (Reason - Act - Reflect - Verify - Close) runs until the work is actually done, not just attempted. The verified-completion evidence gate (`skills/quality-gates.md`) refuses any "done" claim on an empty git diff against the run-start commit, and blocks completion when tests run red, so "complete" means proven, not promised.
 - **Production quality built in** -- 11 quality gates (`skills/quality-gates.md`), blind 3-reviewer code review (`run.sh:run_code_review()`), anti-sycophancy checks
+- **Standalone verification: `loki verify`** -- Run Loki's deterministic gates (build, tests, static analysis, secret scan, dependency audit) against any branch or PR diff, including code written by other agents or humans. CI-ready exit codes (0 VERIFIED, 1 CONCERNS, 2 BLOCKED), machine-readable evidence at `.loki/verify/evidence.json`. Inconclusive evidence is never reported as VERIFIED (v7.27.0).
 - **Live App Preview** -- The dashboard embeds the locally-running app in an iframe so you can interact with it immediately during a build. Use `loki preview` (alias `loki open`) to print the URL and open it in your browser. Local-first: no hosted service, no vendor lock (v7.24.0).
 - **Compose-first fullstack** -- When a spec needs more than one service (web + database + cache) Loki generates a 12-factor `docker-compose.yml` with healthchecks, `depends_on` wiring, env-var config, and a `.env.example`. The Live App Preview surfaces the web service URL (not a database port), and health reflects the web service's Docker healthcheck so a crashed app shows as crashed even when the database stays up. Single-service apps stay on a plain run command. All local-first, no hosted service (v7.26.0).
 - **Intelligent `loki start`** -- For interactive foreground runs the dashboard auto-opens in the browser (cross-platform; skipped in CI, SSH-without-TTY, and piped runs; opt out with `LOKI_NO_AUTO_OPEN=1`). The completion summary shows "Your app is live at <url>" so you know exactly where to try what Loki just built. The autonomous loop passes Claude Code's `--effort`, `--max-budget-usd`, and `--fallback-model` on every iteration (each gated on CLI support and individual opt-out env vars) for better long-run unattended execution (v7.25.0).
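Reviewer note: the new `loki verify` bullet documents three exit codes (0 VERIFIED, 1 CONCERNS, 2 BLOCKED). A minimal sketch of how a CI step might consume them is below. The function names `verdict_for_exit_code` and `ci_gate` are hypothetical, the "only VERIFIED passes" policy is an assumption of this sketch, and calling `loki verify` without further arguments is also an assumption, since the diff does not show the command's options.

```shell
#!/usr/bin/env bash
# Map an exit code to the verdict names listed in the README bullet.
# Any other code is reported as UNKNOWN (an assumption of this sketch).
verdict_for_exit_code() {
    case "$1" in
        0) echo "VERIFIED" ;;
        1) echo "CONCERNS" ;;
        2) echo "BLOCKED" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Run the given verification command, print the verdict, and succeed
# only when the verdict is VERIFIED (hypothetical CI policy).
ci_gate() {
    local rc=0
    "$@" || rc=$?
    local verdict
    verdict=$(verdict_for_exit_code "$rc")
    echo "verdict=$verdict"
    [ "$verdict" = "VERIFIED" ]
}

# Example, assuming `loki` is on PATH:
#   ci_gate loki verify
```

A stricter or looser policy (for instance, letting CONCERNS pass with a warning) would only change the final test in `ci_gate`.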
@@ -47,7 +48,7 @@ Loki drives a coding agent CLI and orchestrates real builds, so it needs a few t
 
 Required:
 
-- An agent provider CLI
+- An agent provider CLI: [Claude Code](https://docs.claude.com/en/docs/claude-code) (`claude`, Tier 1, recommended and E2E-verified - the provider Loki Mode is built for). Codex, Cline, and Aider are supported as experimental providers (wiring in place; not yet E2E-verified by us).
 - Python 3.10+ (`python3`) for the dashboard, memory system, and orchestration helpers.
 - Git 2.x (`git`) for checkpoints and worktrees.
 - `curl` for installation and network calls.
@@ -87,7 +88,7 @@ loki quick "build a landing page with a signup form"
 
 | Method | Command | Notes |
 |--------|---------|-------|
-| **Bun (recommended)** | `bun install -g loki-mode` | Fastest
+| **Bun (recommended)** | `bun install -g loki-mode` | Fastest startup for CLI commands. |
 | **Homebrew** | `brew tap asklokesh/tap && brew install loki-mode` | Auto-installs Bun as a dep |
 | **Docker** | `docker pull asklokesh/loki-mode:7.7.31 && docker run --rm asklokesh/loki-mode:7.7.31 start prd.md` | Bun pre-installed in image |
 | **npm (compat)** | `npm install -g loki-mode` | Works without Bun (bash fallback). Migrate any time with `loki self-update --to bun`. |
@@ -108,7 +109,7 @@ See the [Installation Guide](docs/INSTALLATION.md) for the long form.
 
 ## Runtime Architecture
 
-Loki Mode
+Loki Mode runs a dual runtime by deliberate design: the battle-tested Bash engine is the stable core (the autonomous loop, quality gates, and completion council stay on it; it receives bug fixes and hardening), and new product surfaces are built TypeScript/Bun-first as modules that wrap the engine rather than reimplement it. An earlier plan to make v8 Bun-only has been superseded by this stable-engine approach: rewriting the verified trust layer would risk the exact guarantees this product exists to provide, for no capability gain. Bash support is not going away.
 
 **What ships today:**
 
@@ -227,8 +228,8 @@ Every iteration: **Reason** (read state) - **Act** (execute, commit) - **Reflect
 </td>
 <td width="33%" valign="top">
 
-### 41 Agent
-8
+### 41 Agent Roles
+8 domains: engineering, operations, business, data, product, growth, review, orchestration. These are prompt-defined role specifications the orchestrator adopts per phase, auto-composed by PRD complexity; parallelism comes from the blind review council, the adversarial reviewer, and optional git-worktree streams on Claude Code, sequential on other providers.
 
 [Agent Types](references/agent-types.md)
 
@@ -331,14 +332,14 @@ Loki's autonomy and quality loop are the product; the underlying coding CLI is s
 
 | Provider | Status | Autonomous Flag | Parallel Agents | Install |
 |----------|--------|:-:|:-:|---------|
-| **Claude Code** | Active (Tier 1) | `--dangerously-skip-permissions` | Yes (10+) | `npm i -g @anthropic-ai/claude-code` |
-| **Codex CLI** |
-| **Cline CLI** |
-| **Aider** |
+| **Claude Code** | Active (Tier 1, E2E-verified) | `--dangerously-skip-permissions` | Yes (10+) | `npm i -g @anthropic-ai/claude-code` |
+| **Codex CLI** | Experimental (Tier 3) | `--full-auto --skip-git-repo-check` | Sequential | `npm i -g @openai/codex` |
+| **Cline CLI** | Experimental (Tier 2) | `-y` | Sequential | `npm i -g @anthropic-ai/cline` |
+| **Aider** | Experimental (Tier 3) | `--yes-always` | Sequential | `pip install aider-chat` |
 | **Google Gemini CLI** | DEPRECATED v7.5.18 | -- | -- | Upstream deprecated; runtime removed. `LOKI_PROVIDER=gemini` exits with migration message. |
 | **Anthropic Antigravity CLI** | Coming soon | -- | -- | Integration planned. |
 
-Claude gets full features (subagents, parallelization, MCP, Task tool).
+Status legend: "E2E-verified" means we run real spec-to-code builds on it ourselves. Claude Code is the primary, fully supported provider and the one Loki Mode is built for; it gets full features (subagents, parallelization, MCP, Task tool). "Experimental" means the wiring is in place but we have not produced an end-to-end verified build ourselves; treat as community-tested. Experimental providers run sequentially. Auto-failover switches providers when rate-limited. See [Provider Guide](skills/providers.md).
 
 ---
 
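Reviewer note: the updated provider table pairs each provider with an autonomous flag. A small lookup like the one below shows how that table could be encoded. This is only a sketch: the function name `autonomous_flags_for` and the short provider keys (`claude`, `codex`, `cline`, `aider`) are assumptions, and the package's real lookup (for example in `providers/codex.sh`) is not shown in this diff. The flag strings themselves are copied from the table.

```shell
#!/usr/bin/env bash
# Return the autonomous-mode flags listed in the README provider table.
# Provider keys are hypothetical short names chosen for this sketch.
autonomous_flags_for() {
    case "$1" in
        claude) echo "--dangerously-skip-permissions" ;;
        codex)  echo "--full-auto --skip-git-repo-check" ;;
        cline)  echo "-y" ;;
        aider)  echo "--yes-always" ;;
        *)
            # Unknown, deprecated, or planned providers have no flag in the table.
            echo "no autonomous flag known for provider: $1" >&2
            return 1
            ;;
    esac
}

# Example:
#   flags=$(autonomous_flags_for codex) && echo "codex flags: $flags"
```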
package/SKILL.md
CHANGED
@@ -3,7 +3,7 @@ name: loki-mode
 description: Autonomous spec-driven build system with a built-in trust layer. It does not call work done until it is verified (RARV-C closure loop, 11 quality gates, completion council, verified-completion evidence gate). Triggers on "Loki Mode". Takes a spec (PRD, GitHub issue, OpenAPI doc, etc.) to deployed product with minimal human intervention. Provider-agnostic. Requires --dangerously-skip-permissions flag.
 ---
 
-# Loki Mode v7.
+# Loki Mode v7.27.0
 
 **You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.**
 
@@ -383,4 +383,4 @@ See `CHANGELOG.md` entries [7.5.7], [7.5.8], [7.5.13] for the per-fix list and r
 
 ---
 
-**v7.
+**v7.27.0 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
package/VERSION
CHANGED
@@ -1 +1 @@
-7.
+7.27.0
package/autonomy/completion-council.sh
CHANGED
@@ -752,6 +752,18 @@ with open(state_file, 'w') as f:
         "threshold=$effective_threshold" \
         "result=$([ $approve_count -ge $effective_threshold ] && echo 'APPROVED' || echo 'REJECTED')" 2>/dev/null || true
 
+    # Trust-metrics: durable per-vote record for the council rejection / split
+    # rate. The council state.json verdicts[] array is per-run only; this log is
+    # the cross-run corpus. Additive, best-effort, stdout-silent.
+    if type record_trust_event_bash &>/dev/null; then
+        record_trust_event_bash "council_vote" \
+            "approve=$approve_count" \
+            "reject=$reject_count" \
+            "threshold=$effective_threshold" \
+            "result=$([ $approve_count -ge $effective_threshold ] && echo 'APPROVED' || echo 'REJECTED')" \
+            >/dev/null 2>&1 || true
+    fi
+
     # Write transcript for this council round (Path A: council_vote path)
     local _ct_outcome
     _ct_outcome=$([ $approve_count -ge $effective_threshold ] && echo "APPROVED" || echo "REJECTED")
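Reviewer note: the added block relies on two small Bash patterns, a threshold comparison that yields APPROVED or REJECTED, and a guarded call that only fires when the hook function exists and never fails the caller. The sketch below isolates both so they can be read and tested on their own. `council_outcome` and `call_if_defined` are hypothetical names used only for illustration and do not appear in the package.

```shell
#!/usr/bin/env bash
# Outcome rule used in the hunk: approved when approvals reach the threshold.
# Written as an explicit if/else with quoted operands.
council_outcome() {
    local approve_count="$1"
    local effective_threshold="$2"
    if [ "$approve_count" -ge "$effective_threshold" ]; then
        echo "APPROVED"
    else
        echo "REJECTED"
    fi
}

# Best-effort hook call: run the named function only if it is defined,
# silence its output, and always return success to the caller.
call_if_defined() {
    local fn="$1"
    shift
    if type "$fn" >/dev/null 2>&1; then
        "$fn" "$@" >/dev/null 2>&1 || true
    fi
}

# Example:
#   call_if_defined record_trust_event_bash "council_vote" \
#       "result=$(council_outcome 2 3)"
```

The explicit if/else avoids the usual caveat of the `A && B || C` idiom, where `C` also runs if `B` fails; with `echo` that is rarely an issue, so the original one-liner behaves the same in practice.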
@@ -1366,6 +1378,19 @@ print(json.dumps(items[:5]))
 }
 EVIDENCE_EOF
     mv "$ev_tmp" "$ev_file"
+
+    # Trust-metrics: durable per-block record. evidence-block.json is a single
+    # state file that is DELETED the moment the gate next passes, so it cannot
+    # be the cross-run corpus for the block rate. Append an event here, where a
+    # block is definitely happening. Additive, best-effort, stdout-silent.
+    if type record_trust_event_bash &>/dev/null; then
+        record_trust_event_bash "evidence_block" \
+            "reason=$reason" \
+            "diff_ok=$diff_ok" \
+            "tests_ok=$tests_ok" \
+            >/dev/null 2>&1 || true
+    fi
+
     return 1
 }
 
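Reviewer note: both hunks call `record_trust_event_bash` with an event name followed by `key=value` fields, and the comments describe it as an append-only, best-effort, cross-run log. Its implementation is not part of this diff, so the following is only a hypothetical stand-in that shows what such a recorder could look like. The function name `record_event_sketch`, the `TRUST_EVENTS_LOG` variable, the default path, and the tab-separated line format are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical append-only event recorder (NOT the package's implementation).
# Usage: record_event_sketch <event> [key=value ...]
record_event_sketch() {
    local event="$1"
    shift
    # Log location is an assumption; override with TRUST_EVENTS_LOG.
    local log_file="${TRUST_EVENTS_LOG:-./trust-events.log}"
    local ts
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)

    # Build one tab-separated line: timestamp, event name, then each field.
    local line
    line=$(printf '%s\t%s' "$ts" "$event")
    local field
    for field in "$@"; do
        line="$line$(printf '\t%s' "$field")"
    done

    # Best-effort append: a write failure is swallowed so the caller's
    # control flow (for example the gate's own `return 1`) is unaffected.
    { printf '%s\n' "$line" >> "$log_file"; } 2>/dev/null || true
    return 0
}

# Example with illustrative field values:
#   record_event_sketch "evidence_block" "reason=empty_diff" "diff_ok=false" "tests_ok=true"
```

Appending one line per event keeps the log usable as a cross-run record even when per-run state files are deleted, which is the property the diff comments call out.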