loki-mode 7.11.0 → 7.13.0
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/lib/wiki-ask.py +137 -0
- package/autonomy/lib/wiki-generator.py +322 -0
- package/autonomy/lib/wiki_index.py +258 -0
- package/autonomy/lib/wiki_llm.py +140 -0
- package/autonomy/loki +273 -12
- package/autonomy/run.sh +72 -0
- package/bin/loki +1 -1
- package/dashboard/__init__.py +1 -1
- package/dashboard/server.py +108 -0
- package/dashboard/static/index.html +394 -329
- package/docs/INSTALLATION.md +1 -1
- package/docs/R5-AUTO-WIKI-DESIGN.md +137 -0
- package/docs/R7-ZERO-CONFIG-FIRST-RUN-PLAN.md +137 -0
- package/loki-ts/dist/loki.js +224 -198
- package/mcp/__init__.py +1 -1
- package/package.json +1 -1
package/docs/INSTALLATION.md
CHANGED
@@ -0,0 +1,137 @@
# R5: Auto-wiki + Cited Codebase Q&A (Loki's DeepWiki) -- Design Note

Status: implemented in worktree (do not commit to main). Target release: R5 of the
competitive-stickiness arc. NO version bump in this worktree.

## Goal

A persistent, queryable per-project wiki generated from the codebase, surfaced in
the dashboard, with cited answers. Loki's answer to Devin DeepWiki. Sections cite
the real source files they were built from; `loki wiki ask` returns a grounded,
cited answer (citations = `file:line`). Generation is incremental: it skips when
the codebase has not changed (reuses the R1 codebase-signature idea).

## What already exists (verified against source, 2026-06-03)

| Asset | File | Reused? |
|---|---|---|
| `loki docs generate` (LLM-written README/ARCHITECTURE/...) | `autonomy/loki:20577` (`cmd_docs`) | Patterns reused: `_docs_scan_project`, `_docs_build_context`, `_docs_invoke_provider`, `_docs_write_manifest`. Not the command itself -- docs has no citations and no Q&A. |
| Proof-of-run generator (Python core, thin CLI readers) | `autonomy/lib/proof-generator.py`, bash `cmd_proof`, `loki-ts/src/commands/proof.ts` | Architecture precedent reused exactly: Python core does the heavy work; bash + Bun are thin readers; dashboard exposes read APIs. |
| PII redaction | `autonomy/lib/proof_redact.py` (`redact_tree`, `_redact_paths`) | Reused: wiki output is normalized to repo-relative paths and passed through the redactor so no `/Users/<name>/...` leaks. |
| Org knowledge graph | `memory/knowledge_graph.py` (`OrganizationKnowledgeGraph`) | Token-overlap scoring idea reused (`_tokenize` / `query_patterns`). NOT a codebase index -- it aggregates `.loki/memory/semantic/*.json` patterns across projects, keyed on `~/.loki/knowledge`. See "Honest reuse" below. |
| Memory retrieval | `memory/retrieval.py` (`MemoryRetrieval`) | Inspected. It retrieves memory entries (episodic/semantic/procedural), NOT source code. Not a code indexer. Not reused for code retrieval. |
| Dashboard read-API + traversal-safety | `dashboard/server.py:7191` (`_safe_proof_run_dir`) | Pattern reused for the wiki section/path param (`_safe_wiki_section`). |
| Dashboard web components | `dashboard-ui/components/*.js` (Web Components) | New `loki-wiki-browser.js` follows the same `LokiElement` convention; registered in `index.js`. |

### Honest reuse statement

The task says "reuse memory/knowledge_graph.py and memory/retrieval.py." Both were
read. Neither is a *codebase* index: `knowledge_graph.py` aggregates cross-project
*memory patterns* (`.loki/memory/semantic`), and `retrieval.py` retrieves *memory
entries*, not source files. There was no existing per-file code index to query for
grounded citations. So R5 adds a new lightweight, dependency-free code index
(`autonomy/lib/wiki_index.py`) and reuses the parts that genuinely fit: the
token-overlap retrieval scoring (ported from `knowledge_graph._tokenize` /
`query_patterns`), the docs scanner, the proof manifest/signature idea, and the
redactor. Reuse is stated where real; not fabricated to satisfy a constraint.

ChromaDB (`tools/index-codebase.py`, MEMORY.md) is an OPTIONAL future backend. The
core deliberately does NOT depend on it: it needs Docker + python3.12 and is not
CI-safe. Default retrieval is deterministic and dependency-light.
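
As a rough illustration of what "deterministic and dependency-light" retrieval can look like, the sketch below scores chunks by token overlap with the question, using only the standard library. It is not the code in `wiki_index.py`: the function names (`tokenize`, `score`, `retrieve`), the tokenizer regex, the tie-break rule, and the default `k` are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed tokenizer: identifier-like runs, lowercased. The real _tokenize may differ.
_TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(text):
    """Split text into lowercase identifier-like tokens (purely lexical)."""
    return [tok.lower() for tok in _TOKEN_RE.findall(text)]


def score(question_tokens, chunk_tokens):
    """Count chunk token occurrences that also appear in the question."""
    wanted = set(question_tokens)
    counts = Counter(chunk_tokens)
    return sum(n for tok, n in counts.items() if tok in wanted)


def retrieve(question, chunks, k=3):
    """Return the top-k chunks by overlap score, skipping zero-score chunks.

    Ties break on (file, start_line) so the same input always yields the
    same ordering: no randomness, no model call, no network.
    """
    q_tokens = tokenize(question)
    scored = [(score(q_tokens, tokenize(c["text"])), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda sc: (-sc[0], sc[1]["file"], sc[1]["start_line"]))
    return [c for _, c in scored[:k]]
```

Because every returned item is one of the input records, the caller already holds the `file` and `start_line` for each candidate citation before any LLM is involved.
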

## The grounding contract (the part that must be right)

Fabricated citations are made structurally impossible, not merely prompt-discouraged:

1. **Index**: `wiki_index.py` scans source files (git-tracked when available, else a
   filtered `find`), splits each into line-anchored chunks
   `{file, start_line, end_line, text}` where `file` is REPO-RELATIVE.
2. **Retrieve** (`ask`): deterministic token-overlap scoring (no LLM, no network)
   selects the top-K chunks for the question. Each is a record we own.
3. **Prompt**: the LLM sees NUMBERED chunks `[1]..[K]` and is told to cite by chunk
   index only (`[1]`, `[2]`). It never emits raw paths.
4. **Map + validate**: indices in the answer are mapped back to `{file, start_line}`
   from the retrieval records. Every citation is then validated against the
   filesystem (file exists AND start_line <= file length). Non-resolving citations
   are DROPPED. The LLM can only reference chunks we supplied, and only ones that
   resolve on disk -- so a fabricated citation cannot survive.
5. **generate**: per-section "sources" are CODE-DERIVED (the files the scanner read,
   the real def/class line numbers from a grep parse), not LLM-emitted. The LLM
   writes prose; the citation list comes from the index.

If the LLM is unavailable (CI, no provider), `ask` returns an EXTRACTIVE answer
(the top chunk snippets with their real citations) and `generate` writes a
template-based wiki whose citations are still the real scanned files. No fabrication
in any path.
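
The map-and-validate step (item 4) is the part that enforces the contract, so a small sketch may help. It assumes the retrieval records are dicts with `file` and `start_line` keys, as in the chunk shape above; the names `resolve_citations` and `count_lines` are hypothetical and not taken from `wiki-ask.py`.

```python
import os
import re

# Matches chunk-index markers such as [1] or [12] in the LLM's answer.
_CITE_RE = re.compile(r"\[(\d+)\]")


def count_lines(path):
    """Number of lines in a file, tolerant of non-UTF-8 bytes."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        return sum(1 for _ in fh)


def resolve_citations(answer, retrieved, repo_root):
    """Map [n] markers to `file:line` strings that resolve on disk.

    `retrieved` is the ordered list of chunk records shown to the LLM as
    [1]..[K]. Anything that does not map to a supplied record, or whose
    file/line does not exist under `repo_root`, is dropped.
    """
    citations = []
    seen = set()
    for match in _CITE_RE.finditer(answer):
        idx = int(match.group(1))
        if idx < 1 or idx > len(retrieved) or idx in seen:
            continue  # index was never supplied (e.g. a bogus [99]) or is a repeat
        seen.add(idx)
        rec = retrieved[idx - 1]
        path = os.path.join(repo_root, rec["file"])
        if not os.path.isfile(path):
            continue  # file no longer exists
        if rec["start_line"] > count_lines(path):
            continue  # line anchor is past the end of the file
        citations.append(f'{rec["file"]}:{rec["start_line"]}')
    return citations
```

The LLM's text is only ever used to pick indices; the path and line number in each citation come from the retrieval record, which is why an invented path has no way into the output.
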

## Mocking the LLM in CI

The Python core reads `LOKI_WIKI_LLM_STUB`:
- unset -> call the real provider via the same path `_docs_invoke_provider` uses
  (`claude -p` etc.), OR fall back to extractive/template if no provider on PATH.
- set to a file path -> read the stubbed completion from that file.
- set to any other value -> use it literally as the completion.

Tests set `LOKI_WIKI_LLM_STUB` so CI makes ZERO paid calls. This mirrors how the
proof tests fake `gh`/`open` via PATH and env.
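
The three-way rule for `LOKI_WIKI_LLM_STUB` could be implemented along these lines. This is a sketch, not the contents of `wiki_llm.py`: `resolve_completion`, the `call_provider` callback, and the `provider_bin` default are hypothetical, and treating an empty string the same as "unset" is an assumption the note does not settle.

```python
import os
import shutil


def resolve_completion(prompt, call_provider, env=None, provider_bin="claude"):
    """Return the completion text, or None when the caller should fall back.

    - stub names an existing file -> return that file's contents
    - stub is any other non-empty value -> return it literally
    - stub unset (or empty, by assumption) -> call the provider if its
      binary is on PATH, else return None so the caller can use the
      extractive/template fallback
    """
    env = os.environ if env is None else env
    stub = env.get("LOKI_WIKI_LLM_STUB")
    if stub:
        if os.path.isfile(stub):
            with open(stub, "r", encoding="utf-8") as fh:
                return fh.read()
        return stub
    if shutil.which(provider_bin) is None:
        return None
    return call_provider(prompt)
```

Passing `env` explicitly keeps the function testable without mutating the process environment; a test can supply `{"LOKI_WIKI_LLM_STUB": "Answer [1]."}` and assert on the result with no provider installed.
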

## Storage layout (per project, generated)

```
.loki/wiki/
  wiki.json             # structured: sections[], each with title, body, citations[]
  index.md              # human-readable rendered wiki (overview + module list)
  architecture.md       # rendered architecture section
  modules.md            # rendered key-modules section
  data-flow.md          # rendered data-flow section
  wiki-manifest.json    # signature (git sha + per-file content hash), generated_at
  code-index.json       # the chunk index (file, start_line, end_line, tokens)
```

NOTE: `.loki/wiki/` (this deliverable, per-project, generated, gitignored) is a
DIFFERENT namespace from the repo-root `wiki/` (the GitHub wiki in the release
workflow). R5 never touches the latter.

## Incremental regeneration (R1 signature idea)

`wiki-manifest.json` stores a `signature` = sha256 over (git HEAD sha + sorted list
of `path:content-hash` for every scanned source file). `loki wiki generate` computes
the current signature; if it equals the stored one, it SKIPS regeneration and prints
"up to date" (unless `--force`). This is the same cheap-incremental idea as the docs
manifest and the R1 codebase signature.
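
A minimal sketch of that signature, assuming the scanned files are available as a mapping of repo-relative path to bytes. The exact byte layout fed to sha256 (newline separators, sha256 for the per-file hash) is an assumption; only the "HEAD sha plus sorted `path:content-hash` list" shape comes from the note, and the function names are hypothetical.

```python
import hashlib


def content_hash(data):
    """Per-file content hash (sha256 assumed for the example)."""
    return hashlib.sha256(data).hexdigest()


def compute_signature(head_sha, files):
    """sha256 over the git HEAD sha plus a sorted list of path:content-hash.

    `files` maps repo-relative path -> file bytes. Sorting makes the
    result independent of scan order.
    """
    entries = sorted(f"{path}:{content_hash(data)}" for path, data in files.items())
    digest = hashlib.sha256()
    digest.update(head_sha.encode("utf-8"))
    for entry in entries:
        digest.update(b"\n")
        digest.update(entry.encode("utf-8"))
    return digest.hexdigest()


def should_regenerate(stored_signature, current_signature, force=False):
    """Regenerate when forced, when no signature is stored, or when it changed."""
    return force or stored_signature != current_signature
```

With this shape, editing one byte of one scanned file or moving HEAD changes the signature, while re-scanning an unchanged tree in a different order does not.
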

## Command surface

```
loki wiki generate [path] [--force]   # build/refresh .loki/wiki/ (incremental)
loki wiki show [section]              # print rendered wiki (or one section)
loki wiki ask "<question>"            # grounded, cited answer (file:line)
```

## Build surface (mirrors the proof precedent)

- `autonomy/lib/wiki_index.py` -- scan + chunk + token-overlap retrieve + signature
  (importable module; underscore name).
- `autonomy/lib/wiki-generator.py` -- generate wiki.json + rendered md (LLM or
  template), citations code-derived; subprocess-invoked (hyphen in name, like
  proof-generator.py).
- `autonomy/lib/wiki-ask.py` -- retrieve K chunks, prompt (stub-aware), map + validate
  citations, print grounded answer. Subprocess-invoked.
- bash `cmd_wiki` in `autonomy/loki` (generate|show|ask) -- thin dispatcher to Python.
- Bun `loki-ts/src/commands/wiki.ts` -- native `show` (reads `.loki/wiki/`); `generate`
  and `ask` delegate to the bash/Python core (heavy work, provider). Added to the
  `bin/loki` allowlist and `cli.ts` switch.
- Dashboard: `GET /api/wiki` (list sections + manifest), `GET /api/wiki/{section}`,
  `POST /api/wiki/ask` -- all traversal-safe; web component `loki-wiki-browser.js`.
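
"Traversal-safe" for the `{section}` path parameter can be illustrated with a two-layer check: an allowlist on the name, then a containment check on the resolved path. This is not the `_safe_wiki_section` implementation in `dashboard/server.py`, which is not shown in this note; `resolve_section_path` and the allowed-name pattern are assumptions for the example.

```python
import os
import re

# Assumed allowlist: short lowercase slugs such as "architecture" or "data-flow".
_SECTION_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def resolve_section_path(wiki_dir, section):
    """Return the absolute path of <wiki_dir>/<section>.md, or None if rejected.

    Layer 1 rejects any name with separators, dots, or other characters
    outside the slug pattern. Layer 2 resolves symlinks and confirms the
    result still lives under the wiki directory.
    """
    if not _SECTION_RE.fullmatch(section):
        return None
    root = os.path.realpath(wiki_dir)
    candidate = os.path.realpath(os.path.join(root, section + ".md"))
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

A route handler would return a 404 (or 400) when this yields `None` rather than attempting to open the path.
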

## Tests

- `tests/test_wiki_index.py` -- chunking is line-accurate; retrieval is deterministic;
  signature stable + changes on edit; repo-relative paths only.
- `tests/test_wiki_generator.py` -- generate on a fixture repo; citations point to REAL
  files; incremental skip when unchanged; LLM stubbed; no absolute paths (no PII).
- `tests/test_wiki_ask.py` -- `ask` returns grounded answer; every citation resolves on
  disk; a stub that emits a bogus `[99]` index is dropped (anti-fabrication).
- `tests/cli/test-wiki-command.sh` -- bash route generate/show/ask on a fixture, stubbed.
- `loki-ts/tests/commands/wiki.test.ts` -- Bun `show` parity with the rendered md.
- `tests/dashboard/test_wiki_routes.py` -- API list/get/ask + traversal rejection.

@@ -0,0 +1,137 @@
# R7: Zero-config killer first run (time-to-first-value)

Design note for the R7 release in the competitive-stickiness arc. Worktree
deliverable for the integrator to cherry-pick. NO version bumps here.

## Goal

Convert trials to habits. The #1 acquisition-to-retention gate is the first
run. Today a blank first run is mediocre and Loki's deep RARV-C / council can
feel heavy on run 1. R7 = a frictionless first run: a user types
`loki start "<one line>"` (or `loki start` in an existing repo) and sees a
VISIBLE valuable artifact in minutes, with depth opt-in later.

Honest "fast": we do NOT fake progress. We actually shorten the path by running
a lightweight execution profile first (capped iterations, completion council
off, simple complexity tier, heavy phases off) so the first visible artifact
plus a proof-of-run land quickly. "Go deeper" = re-run plain `loki start` for
the full RARV-C depth.

## Verified current behavior (real code, traced 2026-06-03)

- `cmd_start()` (`autonomy/loki:746`) is the unified entry. It parses args,
  calls `detect_arg_type()` (`autonomy/loki:667`), then dispatches:
  issue -> `cmd_run`; prd -> sets `prd_file`; empty -> no-PRD path; unknown
  -> treated as a PRD path for back-compat.
- `cmd_start` ends in `_loki_new_session_exec "$RUN_SH" ...` (`autonomy/loki:1678`).
  Every branch of `_loki_new_session_exec` (`autonomy/loki:167-186`) uses
  `exec`, so NOTHING after that line in `cmd_start` runs. Any end-of-run
  message must live in `run.sh`, not after the exec in `cmd_start`.
- `cmd_quick()` (`autonomy/loki:8849`) already synthesizes a PRD from a
  one-line task and sets the lightweight profile
  (`LOKI_MAX_ITERATIONS=3`, `LOKI_COMPLEXITY=simple`,
  `LOKI_COUNCIL_ENABLED=false`, heavy phases off), then execs `run.sh`.
- No-PRD + generated-PRD-reuse (v7.8.1): in `run.sh` around line 11102,
  `decide_generated_prd_action()` (`run.sh:4032`) returns reuse|update|generate
  for the no-arg in-repo path; signature persisted by
  `persist_prd_signature_if_present()` (`run.sh:4064`).
- Proof-of-run (R1): `generate_proof_of_run()` (`run.sh:4101`) wraps
  `autonomy/lib/proof-generator.py`. It runs at session end (`run.sh:13312`)
  on both success and failure, gated only by `LOKI_PROOF` (NOT by council
  state), writing `.loki/proofs/<run_id>/{proof.json,index.html}`. Viewable
  via `loki proof list` / `loki proof open <id>` (Bun-routed, `bin/loki:119`).

### The exact gap R7 closes (traced, not assumed)

`loki start "build a todo app"` TODAY:
1. `detect_arg_type("build a todo app")` returns `unknown` (has spaces, no
   extension, not a file, not an issue ref).
2. The PRD-not-found guard at `autonomy/loki:1243` and `:1268` only fires for
   `*.md|*.json|*.txt|*.yaml|*.yml`, so a brief with spaces slips past.
3. `prd_file="build a todo app"` is passed to `run.sh`, which fails:
   `[ERROR] PRD file not found: build a todo app`.

So the one-line-brief path is broken today. R7 makes it work. This is ADDITIVE:
no existing valid input (`.md` PRD, issue ref, single-token name) changes
behavior.

## Design (additive, no behavior change to existing inputs)

1. `detect_arg_type`: add a `brief` return ONLY for args that contain
   whitespace and match none of the file/issue/path patterns. A single-token
   `unknown` arg still falls back to PRD path (back-compat preserved).
2. `--brief "<text>"` explicit flag: deterministic escape hatch for the rare
   single-word brief (e.g. `loki start --brief "snake"`).
3. Shared helper `synthesize_brief_prd <file> <text>`: factored so `cmd_quick`
   and the new brief path write the same forward-looking PRD. The brief PRD is
   written to `.loki/brief-prd-$$.md` -- DISTINCT from `.loki/generated-prd.md`
   so it never pollutes the v7.8.1 generated-PRD-reuse signature logic
   (generated-prd is for codebase analysis of an existing repo; brief is a
   forward spec).
4. `cmd_start` brief sub-path: set the lightweight TTFV profile (same env as
   quick), synthesize the brief PRD, set `LOKI_TTFV=brief`, then continue
   through the normal exec path. Upfront framing ("fast first pass") is printed
   BEFORE the exec.
5. `cmd_start` no-arg in-repo path: UNCHANGED execution (existing no-PRD +
   reuse, full RARV-C depth), but set `LOKI_TTFV=repo` so the end-of-run
   what-next framing appears.
6. `run.sh` end-of-session: after proof generation, when `LOKI_TTFV` is set and
   stdout is a TTY, call `print_ttfv_next_steps <mode> <result>`. The wording
   BRANCHES on mode so the message always matches what actually ran:
   - `brief`: lightweight first pass, council off; proof has diffs/cost/time
     (NO council verdicts, because the council was disabled).
   - `repo`: full-depth codebase analysis, council on; proof has
     diffs/cost/time/council verdicts.

   Both point at `loki proof list` / `loki proof open` (the visible artifact)
   and the depth opt-in. Gated so it is silent in CI / pipes and never fires
   for normal PRD runs. Factored into `print_ttfv_next_steps` so it is
   unit-testable.
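
The classification rule in items 1 and 2 is easier to check when written out. The real `detect_arg_type` is a bash function in `autonomy/loki`; the Python below is only a model of the decision order described above. The issue-reference patterns, the "contains `/`" path test, and the `explicit_brief` parameter (standing in for the `--brief` flag) are assumptions, since the note does not list the actual patterns.

```python
import os
import re

# Extensions taken from the PRD-not-found guard quoted in this note.
PRD_EXTENSIONS = (".md", ".json", ".txt", ".yaml", ".yml")

# Assumed issue-reference shapes; the real bash patterns are not shown here.
_ISSUE_RE = re.compile(r"#\d+|[\w.-]+/[\w.-]+#\d+|https?://\S+/issues/\d+")


def detect_arg_type(arg, explicit_brief=False):
    """Classify a `loki start` argument as issue, prd, empty, brief, or unknown."""
    if explicit_brief:
        return "brief"  # --brief is the deterministic escape hatch
    if not arg.strip():
        return "empty"
    if _ISSUE_RE.fullmatch(arg):
        return "issue"
    if arg.lower().endswith(PRD_EXTENSIONS) or os.path.isfile(arg):
        return "prd"
    # New in R7: whitespace and nothing file-, issue-, or path-like.
    if any(ch.isspace() for ch in arg) and "/" not in arg:
        return "brief"
    return "unknown"  # single tokens keep the old PRD-path fallback
```

The ordering matters: the `brief` branch sits after every pre-existing check, which is what keeps the change additive. Any input that was classified as `issue`, `prd`, or `empty` before is still classified the same way.
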

Honesty note: the `brief` message intentionally does NOT advertise "council
verdicts" because brief mode runs with the council off (`_collect_council` in
proof-generator.py finds no council state, so that proof section is blank on the
brief path). The `repo` message claims verdicts because the full-depth path runs
the council. This keeps the end-of-run summary truthful per the no-fabrication
rule.
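
A sketch of the mode-branched message and its gating follows. The real `print_ttfv_next_steps` is a bash function in `run.sh`; this Python model uses hypothetical names (`ttfv_next_steps`, `maybe_print_next_steps`) and invented wording. Only the structure is taken from the note: the message branches on `LOKI_TTFV`, mentions council verdicts only in `repo` mode, and prints nothing when the variable is unset or stdout is not a TTY.

```python
def ttfv_next_steps(mode, result):
    """Return the what-next lines for a TTFV run, or [] for an unknown mode."""
    if mode == "brief":
        lines = [
            f"Fast first pass finished ({result}): lightweight profile, council off.",
            "Proof-of-run contains: diffs, cost, time.",
        ]
    elif mode == "repo":
        lines = [
            f"Full-depth codebase analysis finished ({result}): council on.",
            "Proof-of-run contains: diffs, cost, time, council verdicts.",
        ]
    else:
        return []
    lines.append("View it: loki proof list / loki proof open <id>")
    lines.append("Go deeper: re-run plain `loki start` or `loki start <prd.md>`.")
    return lines


def maybe_print_next_steps(env, result, is_tty):
    """Print the framing only for TTFV runs on a TTY; return whether it printed."""
    mode = env.get("LOKI_TTFV")
    if not mode or not is_tty:
        return False  # silent in CI, in pipes, and for normal PRD runs
    lines = ttfv_next_steps(mode, result)
    for line in lines:
        print(line)
    return bool(lines)
```

Keeping the text in a pure function that returns lines is what makes the truthfulness rule testable: a unit test can assert that the `brief` output never contains the word "verdicts".
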

### Why fast is honest

The brief path uses the same lightweight profile `cmd_quick` already ships:
3 iterations max, council off, simple tier, heavy phases (perf, a11y,
regression, UAT, web-research) off. That genuinely shortens the path to first
visible value. We do not print fake progress or claim work that did not happen;
the proof-of-run is generated from real `.loki/` state. Depth is opt-in: the
end-of-run message tells the user to re-run plain `loki start` (or
`loki start <prd.md>`) for the full council-gated build.

## Parity (bash + Bun)

`loki start` and `loki quick` are NOT in the Bun shim allowlist
(`bin/loki:119`), so dispatch is bash-only by design; this change is bash-only
for the CLI surface. The runtime pieces it reuses are already shared across
routes: `proof-generator.py` (one implementation, both routes) and the no-PRD /
generated-PRD-reuse path in `run.sh` (both routes source run.sh). No Bun CLI
change is required for parity.

## Files

- `autonomy/loki`: `detect_arg_type` brief return; `--brief` flag;
  `synthesize_brief_prd` helper; `cmd_quick` refactor to use it; `cmd_start`
  brief sub-path + `LOKI_TTFV` wiring; help text.
- `autonomy/run.sh`: end-of-session TTFV what-next block.
- `tests/cli/test_zero_config_first_run.sh`: new test suite.

## Tests (no paid runs; mock via early exit)

Following `tests/cli/test_start_run_unified.sh`: extract `detect_arg_type` and
`synthesize_brief_prd` in a subshell and assert on them; force `cmd_start` to
exit before `run.sh` boots via `--provider nonexistent-provider`.

- `detect_arg_type("build a todo app")` = `brief`; single tokens still `unknown`;
  `.md` still `prd`; issue refs still `issue`; empty still `empty`.
- `synthesize_brief_prd` writes a PRD containing the brief text and TTFV markers.
- `loki start "<brief>"` enters the brief path (lightweight env, not
  "PRD file not found").
- `loki start --brief "<one word>"` works.
- existing-repo no-arg path still routes to no-PRD (unchanged).
- `loki start <prd.md>` (real PRD) still routes to PRD mode (no regression).