loki-mode 7.45.1 → 7.47.0
- package/README.md +16 -12
- package/SKILL.md +5 -5
- package/VERSION +1 -1
- package/autonomy/CONSTITUTION.md +9 -2
- package/autonomy/completion-council.sh +113 -0
- package/autonomy/lib/sentrux-gate.sh +1 -1
- package/autonomy/loki +2 -2
- package/autonomy/run.sh +445 -96
- package/autonomy/spec-interrogation.sh +549 -0
- package/dashboard/__init__.py +1 -1
- package/dashboard/auth.py +117 -2
- package/dashboard/server.py +9 -10
- package/docs/ACKNOWLEDGEMENTS.md +1 -1
- package/docs/COMPARISON.md +10 -10
- package/docs/COMPETITIVE-ANALYSIS.md +2 -2
- package/docs/INSTALLATION.md +2 -2
- package/docs/OPEN-CORE-BOUNDARY.md +6 -5
- package/docs/P0-SWEEP-PLAN.md +163 -0
- package/docs/P2-SPEC-ROBUSTNESS-PLAN.md +192 -0
- package/docs/R9-OPEN-CORE-HOOKS-PLAN.md +2 -2
- package/docs/architecture/STATE-MACHINES.md +18 -19
- package/docs/architecture/bmad-loki-voice-agent-council-analysis.md +1 -1
- package/docs/auto-claude-comparison.md +16 -13
- package/docs/certification/01-core-concepts/lesson.md +12 -11
- package/docs/certification/01-core-concepts/quiz.md +6 -6
- package/docs/certification/05-troubleshooting/lesson.md +23 -13
- package/docs/certification/05-troubleshooting/quiz.md +3 -3
- package/docs/certification/README.md +1 -1
- package/docs/certification/answer-key.md +2 -2
- package/docs/certification/certification-exam.md +9 -9
- package/docs/competitive/bolt-new-analysis.md +2 -2
- package/docs/competitive/emergence-others-analysis.md +14 -14
- package/docs/competitive/replit-lovable-analysis.md +7 -7
- package/docs/cursor-comparison.md +15 -12
- package/docs/dashboard-guide.md +9 -7
- package/docs/enterprise/security.md +43 -3
- package/docs/prd-purple-lab-platform-v2.md +1 -1
- package/docs/prd-purple-lab-platform.md +3 -3
- package/docs/show-hn-post.md +3 -3
- package/loki-ts/dist/loki.js +2 -2
- package/mcp/__init__.py +1 -1
- package/package.json +2 -2
- package/plugins/loki-mode/.claude-plugin/plugin.json +2 -2
- package/plugins/loki-mode/README.md +1 -1
- package/references/magic-rarv-integration.md +1 -1
- package/references/quality-control.md +5 -5
- package/references/sdlc-phases.md +1 -2
- package/skills/00-index.md +1 -1
- package/skills/artifacts.md +1 -1
- package/skills/healing.md +1 -1
- package/skills/magic-modules.md +3 -3
- package/skills/quality-gates.md +52 -39
- package/skills/testing.md +1 -1
- package/web-app/dist/assets/{AdminPage-CKUOsWZW.js → AdminPage-CcCJ0Sjt.js} +1 -1
- package/web-app/dist/assets/{Avatar-CL9Id9Hi.js → Avatar-DK8kmayw.js} +1 -1
- package/web-app/dist/assets/{Badge-B12zwlD7.js → Badge-4uAWnemi.js} +1 -1
- package/web-app/dist/assets/{Button-CFLVoduT.js → Button-BBMk33tk.js} +1 -1
- package/web-app/dist/assets/ComparePage-bt9rwvST.js +1 -0
- package/web-app/dist/assets/{GitHubIssuesPanel-CSitxtAX.js → GitHubIssuesPanel-WDbH47UM.js} +1 -1
- package/web-app/dist/assets/{GitHubPRsPanel-BIT06FRo.js → GitHubPRsPanel-C2CiYtTx.js} +1 -1
- package/web-app/dist/assets/{HomePage-pU_0fGny.js → HomePage-BQk-MUjn.js} +4 -4
- package/web-app/dist/assets/{LoginPage-DTZtt2Yb.js → LoginPage-DMOZVGGL.js} +1 -1
- package/web-app/dist/assets/{MagicPage-10zfra8o.js → MagicPage-Bzp2Nt1z.js} +1 -1
- package/web-app/dist/assets/{MetricsPage-C-wiKUkv.js → MetricsPage-C39JVdsw.js} +1 -1
- package/web-app/dist/assets/{NotFoundPage-BDkcmhYe.js → NotFoundPage-6vT_U9UL.js} +1 -1
- package/web-app/dist/assets/{ProjectPage-CiCavQ8n.js → ProjectPage-BfFcZp-E.js} +3 -3
- package/web-app/dist/assets/{ProjectsPage-BLCXQwwC.js → ProjectsPage-CPMBf8Wt.js} +1 -1
- package/web-app/dist/assets/{SettingsPage-PkxtaMyg.js → SettingsPage-BnNN6ETl.js} +1 -1
- package/web-app/dist/assets/{ShowcasePage-iECp8Tha.js → ShowcasePage-WDrMf-cx.js} +1 -1
- package/web-app/dist/assets/{SystemSettingsPage-DS6Anno1.js → SystemSettingsPage-DX4jb2e8.js} +1 -1
- package/web-app/dist/assets/{TeamsPage-ls6h6bNL.js → TeamsPage-BCfqcXzu.js} +1 -1
- package/web-app/dist/assets/{TemplatesPage-Bk0QzlPt.js → TemplatesPage-CZvmimDj.js} +1 -1
- package/web-app/dist/assets/{TerminalOutput-4-1hWCtZ.js → TerminalOutput-BlRqFwWV.js} +1 -1
- package/web-app/dist/assets/{activity-DH3ih2nS.js → activity-CacZsUyr.js} +1 -1
- package/web-app/dist/assets/{bell-Gn17S6uv.js → bell-DK2qtHnk.js} +1 -1
- package/web-app/dist/assets/{bot-Cbycc3VE.js → bot-CkcUtHad.js} +1 -1
- package/web-app/dist/assets/{check-nIAqa-kf.js → check-CbCPjX3M.js} +1 -1
- package/web-app/dist/assets/{chevron-left-D2jcWDll.js → chevron-left-5NUKWw3i.js} +1 -1
- package/web-app/dist/assets/{circle-alert-CpL4Bhvt.js → circle-alert-S7uFoxC2.js} +1 -1
- package/web-app/dist/assets/{clock-IW4Wq86N.js → clock-CaQRrIrs.js} +1 -1
- package/web-app/dist/assets/{cloud-Cn8nNuH2.js → cloud-DBAX6c0r.js} +1 -1
- package/web-app/dist/assets/{code-xml-BiJBteXf.js → code-xml-De5-EXv3.js} +1 -1
- package/web-app/dist/assets/{copy-CnqkyNsi.js → copy-CUkT6k1v.js} +1 -1
- package/web-app/dist/assets/{database-CKSReqa5.js → database-BAWf1Gwt.js} +1 -1
- package/web-app/dist/assets/{dollar-sign-CDzDY64R.js → dollar-sign-Ji8zk86R.js} +1 -1
- package/web-app/dist/assets/{file-code-corner-Box4IwG1.js → file-code-corner-ChtXoBwS.js} +1 -1
- package/web-app/dist/assets/{file-plus-DpGqlXF8.js → file-plus-bFa37P76.js} +1 -1
- package/web-app/dist/assets/{folder-open-B57dAoBv.js → folder-open-DhXpXscO.js} +1 -1
- package/web-app/dist/assets/{git-commit-horizontal-BVbucmO5.js → git-commit-horizontal-DVPeDQ3j.js} +1 -1
- package/web-app/dist/assets/{globe-BkOnKl4x.js → globe-BPZgPeeu.js} +1 -1
- package/web-app/dist/assets/{hammer-DRbIQ4QU.js → hammer-jLCaujYH.js} +1 -1
- package/web-app/dist/assets/{index-CM_b_EhP.js → index-B-0iHBPO.js} +2 -2
- package/web-app/dist/assets/{layers-B78BiFiU.js → layers-B1vsrsFW.js} +1 -1
- package/web-app/dist/assets/{lightbulb-B-Itbm9g.js → lightbulb-C-uLoq9Y.js} +1 -1
- package/web-app/dist/assets/{loader-circle-Oq6NQhW2.js → loader-circle-JTfD-ZuM.js} +1 -1
- package/web-app/dist/assets/{lock-DbJ9zxbw.js → lock-G9rxD4gZ.js} +1 -1
- package/web-app/dist/assets/{mail-CzMRod6m.js → mail-BJ0PTN_V.js} +1 -1
- package/web-app/dist/assets/{package-WZ5osvej.js → package-CXClfLOO.js} +1 -1
- package/web-app/dist/assets/{plus-j08lFR-K.js → plus-EoL5OCB7.js} +1 -1
- package/web-app/dist/assets/{refresh-cw-CIr7E-g2.js → refresh-cw-BjREUnVq.js} +1 -1
- package/web-app/dist/assets/{rotate-ccw-gwoXxDeE.js → rotate-ccw-DahWX07H.js} +1 -1
- package/web-app/dist/assets/{save-B8fV_ZpE.js → save-Dek3gCn1.js} +1 -1
- package/web-app/dist/assets/{server-D5dO1paz.js → server-D6V1BAia.js} +1 -1
- package/web-app/dist/assets/{shield-alert-Du08zhdg.js → shield-alert-BtTK5Sxb.js} +1 -1
- package/web-app/dist/assets/{trash-2-DEKSVae5.js → trash-2-BT5o_g0r.js} +1 -1
- package/web-app/dist/assets/{trending-down-DBiXUtxJ.js → trending-down-D4Jk7KF3.js} +1 -1
- package/web-app/dist/assets/{trending-up-BgmK_tHq.js → trending-up-EQFTzhEo.js} +1 -1
- package/web-app/dist/assets/{upload-IaViyeVD.js → upload-JfI5lCSE.js} +1 -1
- package/web-app/dist/assets/{usePolling-PiRLqNu6.js → usePolling-BnhPUuGd.js} +1 -1
- package/web-app/dist/assets/{user-BB5J8wAF.js → user-DSUiUYtj.js} +1 -1
- package/web-app/dist/index.html +1 -1
- package/web-app/dist/assets/ComparePage-Dg0UdZAk.js +0 -1
@@ -0,0 +1,192 @@
+# P2 Spec Robustness Plan (P2-1 spec interrogation gate + P2-2 assumption ledger)
+
+Status: design for implementation. No version bump, no commit in this arc.
+
+## Goal
+
+Loki must stay accurate even when the input spec is WRONG, ambiguous, or
+incomplete. Today two building blocks already detect spec defects but neither
+feeds the autonomous loop:
+
+- `autonomy/grill.sh` invokes the provider once with a Devil's-Advocate prompt
+and writes 10-15 hardest spec questions to `.loki/grill/report.md`. It is
+CLI-only (`grep grill autonomy/run.sh` = 0 invocations) and nothing reads its
+output.
+- `autonomy/prd-analyzer.py` detects missing PRD dimensions and has a
+deterministic `_make_assumption()` map, writing `.loki/prd-observations.md`,
+which nothing reads. Its interactive Q&A is inert in non-TTY (autonomous) runs.
+
+The fix: run interrogation automatically in DISCOVERY, classify the findings,
+record every spec gap as a first-class ASSUMPTION in a tracked ledger, BLOCK
+completion while high-severity assumptions are unconfirmed-and-unacknowledged,
+and surface the ledger in the proof-of-done output. Defects are SURFACED as
+recorded assumptions, never silently autocorrected.
+
+## Core design decision: auto-acknowledgment lifecycle (prevents the trap)
+
+A naive "block completion while any high-severity assumption is unconfirmed"
+hard-blocks EVERY ambiguous run to max-iterations, because in autonomous
+(non-TTY) mode no human can ever set `confirmed=yes`. We never reach the
+"done, plus here is what I assumed" output the goal demands.
+
+Resolution: split the gate from the lifecycle.
+
+- The gate `council_assumption_ledger_gate` is a PURE function of ledger state.
+It blocks iff an entry has `severity=high AND confirmed=false AND
+acknowledged=false`.
+- The auto-acknowledgment lifecycle lives in run.sh (NOT in the gate). Once an
+assumption has been written into the ledger AND injected into the build prompt
+at least once, run.sh marks it `acknowledged=true`. Default-on; opt-out
+`LOKI_ASSUMPTIONS_REQUIRE_CONFIRM=1` keeps a human-in-the-loop path where only
+`confirmed=true` clears the block.
+
+This is the OPPOSITE of silent autocorrect: the assumption is recorded,
+injected into the agent's prompt, and surfaced in proof-of-done. Acknowledgment
+records "Loki has SEEN this gap and proceeded with a stated default", not "Loki
+hid it". The gate still has teeth on the first iteration (high-sev unacknowledged
+blocks) and in the require-confirm path.
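
The split between a pure gate and a separate acknowledgment lifecycle can be illustrated with a minimal Python sketch. This is not the shipped bash implementation: the `Assumption` dataclass and the `gate_blocks` / `auto_acknowledge` names are hypothetical stand-ins for the ledger entries, `council_assumption_ledger_gate`, and the run.sh lifecycle step described above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Assumption:
    """Hypothetical in-memory view of one ledger entry (fields from the plan)."""
    id: str
    severity: str            # "high" or "medium"
    confirmed: bool = False
    acknowledged: bool = False


def gate_blocks(ledger):
    """Pure gate: block iff some entry is high AND unconfirmed AND unacknowledged."""
    return any(
        a.severity == "high" and not a.confirmed and not a.acknowledged
        for a in ledger
    )


def auto_acknowledge(ledger, require_confirm=False):
    """Lifecycle step kept outside the gate.

    Assumed to run only after the assumptions were injected into the build
    prompt. With require_confirm=True (the human-in-the-loop path) nothing
    is acknowledged automatically.
    """
    if require_confirm:
        return list(ledger)
    return [replace(a, acknowledged=True) for a in ledger]


ledger = [Assumption("a-0001", "high"), Assumption("a-0002", "medium")]

first_iteration_blocked = gate_blocks(ledger)
after_ack_blocked = gate_blocks(auto_acknowledge(ledger))
confirm_path_blocked = gate_blocks(auto_acknowledge(ledger, require_confirm=True))
```

Because the gate only reads ledger state, it can be tested with fixture entries and no provider call; the lifecycle decision (auto-ack versus require-confirm) stays in one place.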
+
+## Severity rule (deterministic, no LLM)
+
+grill emits no severity. Classify by section / keyword on the read side:
+
+- HIGH: security blind spots; scale/reliability blind spots; missing or
+untestable acceptance criteria; any line containing contradiction keywords
+(contradict, conflict, inconsistent, mutually exclusive).
+- MEDIUM: ambiguities; unstated assumptions; underspecified behavior; all
+prd-analyzer missing-dimension assumptions.
+
+This guarantees a HIGH tier exists (so the gate has teeth) and is fully
+deterministic (so tests are reproducible).
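
A rough Python sketch of such a rule follows, assuming the section titles used by the grill report and plain case-insensitive substring matching. The function name `severity_for` and the exact matching strategy are illustrative assumptions, not the behavior of `spec_interrogation_severity_for`.

```python
# Keywords taken from the rule above; substring matching is an assumption.
CONTRADICTION_KEYWORDS = ("contradict", "conflict", "inconsistent", "mutually exclusive")
ACCEPTANCE_HINTS = ("acceptance criteria", "testable", "measurable")

# Section titles assumed to match the grill report headings.
HIGH_SECTIONS = ("security blind spots", "scale and reliability blind spots")
AMBIGUITY_SECTION = "ambiguities and missing acceptance criteria"


def severity_for(section, line):
    """Return "high" or "medium" from the section title and the question text."""
    sec = section.strip().lower()
    text = line.lower()
    # Contradiction keywords escalate a line in any section.
    if any(keyword in text for keyword in CONTRADICTION_KEYWORDS):
        return "high"
    # Security and scale/reliability blind spots are always high.
    if sec in HIGH_SECTIONS:
        return "high"
    # Ambiguities are high only when they touch acceptance criteria.
    if sec == AMBIGUITY_SECTION and any(hint in text for hint in ACCEPTANCE_HINTS):
        return "high"
    return "medium"
```

No model call is involved, so the same fixture report always yields the same severities.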
+
+## Taxonomy mapping (classification, read-side only)
+
+grill section -> finding class:
+- "Ambiguities and missing acceptance criteria" -> ambiguous (HIGH if the line
+references acceptance criteria / testable / measurable; else MEDIUM)
+- "Unstated assumptions" -> underspecified (MEDIUM)
+- "Security blind spots" -> missing (HIGH)
+- "Scale and reliability blind spots" -> missing (HIGH)
+- any line with a contradiction keyword (any section) -> contradictory (HIGH)
+- prd-analyzer missing dimensions -> missing (MEDIUM, deterministic default)
+
+"None identified." lines are skipped (no fabricated findings).
+
+grill output contract is NOT changed (it is parsed by the loki-grill skill).
+We classify a COPY of its markdown; grill.sh stays byte-identical.
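
As an illustration of the read-side mapping, here is a small Python sketch that classifies a copy of a report. The report layout (`##` headings followed by `- ` question bullets) is an assumption made for this example, since the grill output format is not shown here, and `classify_report` is a hypothetical name.

```python
SECTION_CLASS = {
    "ambiguities and missing acceptance criteria": "ambiguous",
    "unstated assumptions": "underspecified",
    "security blind spots": "missing",
    "scale and reliability blind spots": "missing",
}
CONTRADICTION_KEYWORDS = ("contradict", "conflict", "inconsistent", "mutually exclusive")


def classify_report(markdown):
    """Map each question bullet to a finding class; never invent findings."""
    findings = []
    section = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            # Assumed layout: each grill section starts with a markdown heading.
            section = line.lstrip("#").strip().lower()
            continue
        if not line.startswith("- "):
            continue
        question = line[2:].strip()
        # Skip empty bullets and "None identified." placeholders.
        if not question or question.lower().rstrip(".") == "none identified":
            continue
        if any(keyword in question.lower() for keyword in CONTRADICTION_KEYWORDS):
            finding_class = "contradictory"
        elif section in SECTION_CLASS:
            finding_class = SECTION_CLASS[section]
        else:
            continue  # unknown section: skip rather than guess a class
        findings.append({"section": section, "class": finding_class, "gap": question})
    return findings


sample = """## Unstated assumptions
- Which timezone do reports use?
## Security blind spots
- None identified.
## Scale and reliability blind spots
- The 99.9% uptime goal and the single-node deployment are mutually exclusive.
"""
findings = classify_report(sample)
```

The function takes the markdown text as input and returns new records, so the original report is left untouched.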
+
+## No-fabrication rule for ledger content
+
+A grill finding is a QUESTION, not a resolution. The ledger `assumption` field
+for a grill-derived gap is an honest "spec gives no answer; proceeding with the
+implementer default for <area>" plus `affects=<area>`. We do NOT invent a
+specific resolution the build will not actually follow. prd-analyzer assumptions
+reuse its existing deterministic `_make_assumption()` text verbatim.
+
+## Ledger schema (`.loki/assumptions/`)
+
+One JSON file per assumption: `.loki/assumptions/<id>.json`, plus a
+human-readable `.loki/assumptions/ledger.md` rollup regenerated on each write.
+Each entry:
+
+```json
+{
+"id": "a-0001",
+"gap": "<the spec defect / unanswered question, verbatim>",
+"assumption": "<honest stated default Loki proceeds with>",
+"why": "<why this assumption / where the gap came from: grill|prd-analyzer>",
+"severity": "high|medium",
+"class": "ambiguous|contradictory|underspecified|missing",
+"affects": "<area, e.g. security, acceptance-criteria, data-model>",
+"source": "grill|prd-analyzer",
+"confirmed": false,
+"acknowledged": false,
+"created_at": "<iso8601>"
+}
+```
+
+Stable id = `a-` + zero-padded counter over existing files (idempotent: a second
+DISCOVERY run with the same findings does not duplicate; dedupe on the `gap`
+text hash).
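
A minimal Python sketch of an idempotent writer along these lines is shown below. It assumes SHA-256 over the `gap` text for deduplication and a four-digit counter; `ledger_write` is a hypothetical stand-in for `spec_ledger_write`, and the `ledger.md` rollup is omitted.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def gap_hash(gap):
    """Hash of the gap text used as the dedupe key (SHA-256 is an assumption)."""
    return hashlib.sha256(gap.encode("utf-8")).hexdigest()


def ledger_write(ledger_dir, gap, assumption, why, severity, finding_class, affects, source):
    """Write one <id>.json entry unless an entry with the same gap already exists."""
    ledger_dir = Path(ledger_dir)
    ledger_dir.mkdir(parents=True, exist_ok=True)
    existing = sorted(ledger_dir.glob("a-*.json"))
    wanted = gap_hash(gap)
    for path in existing:
        entry = json.loads(path.read_text(encoding="utf-8"))
        if gap_hash(entry["gap"]) == wanted:
            return entry["id"]  # same finding seen before: no duplicate
    new_id = f"a-{len(existing) + 1:04d}"  # zero-padded counter over existing files
    entry = {
        "id": new_id,
        "gap": gap,
        "assumption": assumption,
        "why": why,
        "severity": severity,
        "class": finding_class,
        "affects": affects,
        "source": source,
        "confirmed": False,
        "acknowledged": False,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    (ledger_dir / f"{new_id}.json").write_text(json.dumps(entry, indent=2), encoding="utf-8")
    return new_id


with tempfile.TemporaryDirectory() as tmp:
    args = dict(
        assumption="spec gives no answer; proceeding with the implementer default for security",
        why="grill", severity="high", finding_class="missing",
        affects="security", source="grill",
    )
    first = ledger_write(tmp, gap="Is login rate-limited?", **args)
    second = ledger_write(tmp, gap="Is login rate-limited?", **args)
    third = ledger_write(tmp, gap="What is the retention period?", **args)
    file_count = len(list(Path(tmp).glob("a-*.json")))
```

Running the writer twice with the same gap returns the existing id, which is what keeps a repeated DISCOVERY pass from growing the ledger.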
+
+## Build surface (files + functions)
+
+1. NEW `autonomy/spec-interrogation.sh` (sourced by run.sh; standalone-testable):
+- `spec_interrogation_classify_report <report.md path>`: pure classifier.
+Reads grill markdown, emits one TSV/JSON finding per question line with
+class + severity. Takes a file so a fixture report drives the test with no
+`claude` call.
+- `spec_interrogation_severity_for <section> <line>`: deterministic severity.
+- `spec_ledger_write <gap> <assumption> <why> <severity> <class> <affects>
+<source>`: idempotent writer (dedupe on gap hash) -> `.loki/assumptions/`.
+- `spec_ledger_rebuild_md`: regenerate `.loki/assumptions/ledger.md`.
+- `spec_ledger_high_unresolved_count`: count entries with
+`severity=high AND confirmed=false AND acknowledged=false` (gate input;
+also reused by the summary).
+- `spec_ledger_acknowledge_all`: set `acknowledged=true` on all entries
+(auto-ack lifecycle helper; default path).
+- `spec_interrogation_run <spec_path>`: orchestrator. Default-on
+(`LOKI_SPEC_GRILL=0` opts out). Provider-aware: source grill.sh, call
+`grill_check_provider`; if provider absent, log honest message, skip the
+grill subcall (NO fabricated questions), but STILL fold prd-analyzer
+missing-dimension assumptions into the ledger so degrade surfaces something
+non-blocking. On provider present: run `grill_main` (writes report.md),
+classify it, write ledger entries. Always non-fatal to the run.
+
+2. `autonomy/run.sh` DISCOVERY (~13056, after prd-analyzer + council_init,
+before the iteration loop): source spec-interrogation.sh and call
+`spec_interrogation_run "$prd_path"`. This is the grep-able grill invocation
+the task requires. Best-effort (`|| true`), never blocks startup.
+
+3. `autonomy/run.sh` auto-ack lifecycle: after the build prompt is constructed
+each iteration (assumptions are injected into the prompt via build_prompt),
+call `spec_ledger_acknowledge_all` UNLESS
+`LOKI_ASSUMPTIONS_REQUIRE_CONFIRM=1`. Inject the high-severity assumption
+list into the build prompt (so the agent sees the gaps it must respect).
+
+4. `autonomy/completion-council.sh` `council_assumption_ledger_gate` (new),
+slotted into `council_evaluate` right after `council_evidence_gate`
+(mirrors 2510-2513). Same defensive `COUNCIL_STATE_DIR` default, opt-out
+`LOKI_ASSUMPTION_GATE=0`. Blocks iff `spec_ledger_high_unresolved_count > 0`.
+Writes `.loki/council/assumption-block.json` on block, removes it on pass.
+Also wired into the completion-promise route in run.sh (~14525 pattern) and
+the code_review gate chain (~15013) so the promise path cannot bypass it.
+
+5. `autonomy/run.sh` `build_completion_summary` (~2637): emit an
+"Assumptions recorded: N (M high-severity)" block into COMPLETION.txt and the
+ledger list, plus the count into completion.json. So "done" means "done, plus
+here are the N places your spec was ambiguous and what I assumed."
+
+6. NEW `tests/test-spec-interrogation.sh` (bash convention, ok/bad counters,
+source the module, mktemp fixtures):
+- (a) classifier on a fixture grill report writes classified findings to the
+ledger (ambiguous/contradictory/underspecified/missing + high/medium).
+- (b) a ledger with one high/confirmed:false/acknowledged:false entry makes
+`council_assumption_ledger_gate` return 1 (BLOCK) and write
+assumption-block.json.
+- (c) clean spec (no high-sev entries, or all acknowledged) -> gate returns 0
+(no spurious block), no block file.
+- (d) no provider -> `spec_interrogation_run` degrades cleanly: honest
+message, prd-analyzer assumptions still folded (medium, non-blocking), run
+proceeds, gate passes.
+
+## Gate reachability (resolved open question)
+
+The existing gates fire from THREE sites: `council_evaluate` (~2510), the
+completion-promise route (~14525), and the code_review gate chain (~15013). The
+new gate is wired into all three so high-sev unacknowledged assumptions cannot
+slip through the promise path.
+
+## Opt-out knobs (all default-on, intelligent)
+
+- `LOKI_SPEC_GRILL=0` -> skip interrogation entirely.
+- `LOKI_ASSUMPTION_GATE=0` -> gate is pass-through.
+- `LOKI_ASSUMPTIONS_REQUIRE_CONFIRM=1` -> require human `confirmed=true`
+(disables auto-ack); the human-in-the-loop path.
+
+No "user must decide the type" knob. Classification + severity are automatic.
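
The three knobs could be read along the following lines. This Python sketch only illustrates the default-on semantics; the `read_knobs` helper and its dictionary keys are hypothetical, and treating any value other than the documented one as "default" is an assumption.

```python
import os


def read_knobs(env=None):
    """Resolve the three documented environment knobs to booleans."""
    env = os.environ if env is None else env
    return {
        # Interrogation runs unless explicitly set to 0.
        "interrogate": env.get("LOKI_SPEC_GRILL", "1") != "0",
        # The assumption gate is enforced unless explicitly set to 0.
        "gate_enabled": env.get("LOKI_ASSUMPTION_GATE", "1") != "0",
        # Human confirmation is required only when explicitly set to 1.
        "require_confirm": env.get("LOKI_ASSUMPTIONS_REQUIRE_CONFIRM", "0") == "1",
    }


defaults = read_knobs({})
human_in_loop = read_knobs({"LOKI_ASSUMPTIONS_REQUIRE_CONFIRM": "1"})
all_off = read_knobs({"LOKI_SPEC_GRILL": "0", "LOKI_ASSUMPTION_GATE": "0"})
```

With an empty environment the sketch yields interrogation on, gate on, and auto-acknowledgment active, matching the default path described above.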
+
+## Constraints
+
+No emojis, no em dashes, no version bump, no commit, no push. Provider-aware,
+degrade cleanly, no fabricated questions when provider absent.
@@ -3,8 +3,8 @@
 Status: SEAMS implemented (this worktree). NOT a live hosted backend.
 
 R9 in the competitive-stickiness arc is the open-core monetization layer: keep
-Loki fully
-and paid plans would attach later. R9 ships the seams only. There is no Loki
+Loki fully source-available (BUSL-1.1) and free to self-host, while adding the
+SEAMS where hosted, enterprise, and paid plans would attach later. R9 ships the seams only. There is no Loki
 hosted service, no license-verification backend, and no paid gate on any
 existing feature. Every honest stub is labeled as such.
 
@@ -972,7 +972,7 @@ Source: `run.sh:7880-7881` (checklist_should_verify, checklist_verify)
 
 ## 7. Quality Gates
 
-### 7.1
+### 7.1 Eight-Gate Pipeline
 
 Source: `skills/quality-gates.md`
 
@@ -980,40 +980,39 @@ Source: `skills/quality-gates.md`
 Code Change
 |
 v
-Gate 1: Static Analysis (CodeQL, ESLint)
-|──BLOCK (
+Gate 1: Static Analysis (CodeQL, ESLint/Pylint, type-checker)
+|──BLOCK (severity ladder)──> [REJECTED]
 v
-Gate 2:
+Gate 2: Test Suite (pass/fail; red blocks; coverage % not measured this release)
 |──BLOCK──> [REJECTED]
 v
-Gate 3:
-|──BLOCK──> [REJECTED]
-v
-Gate 4: Integration Tests
-|──BLOCK──> [REJECTED]
-v
-Gate 5: 3-Reviewer Blind Review (see 7.3)
+Gate 3: Blind 3-Reviewer Review with severity blocking (see 7.3)
 |──BLOCK (Critical/High severity)──> [REJECTED]
 v
-Gate
-|──BLOCK (devil's advocate
+Gate 4: Anti-Sycophancy Devil's Advocate (on unanimous PASS)
+|──BLOCK (devil's advocate Crit/High findings)──> [REJECTED]
 v
-Gate
-|──BLOCK──> [REJECTED]
+Gate 5: Mock Integrity Detector
+|──BLOCK (HIGH findings)──> [REJECTED]
 v
-Gate
-|──BLOCK──> [REJECTED]
+Gate 6: Test Mutation Detector
+|──BLOCK (HIGH findings)──> [REJECTED]
 v
-Gate
+Gate 7: Documentation Coverage
 |──BLOCK──> [REJECTED]
 v
+Gate 8: Magic Modules Debate
+|──BLOCK (BLOCK-severity findings)──> [REJECTED]
+v
 [APPROVED]
 ```
 
+Backward-compatibility is a conditional healing-mode auditor, not a numbered gate.
+
 Gate status values: `passed`, `failed`, `skipped`
 Persistence: `.loki/dashboard-state.json` field `qualityGates`
 Severity levels: `critical`, `high`, `medium`, `low`
-Blocking threshold: Critical and High
+Blocking threshold: Critical and High block; Medium and Low are advisory.
 
 ### 7.2 Model Escalation
 
@@ -57,5 +57,5 @@ architecture, and adversarial review -- complementing Loki Mode's autonomous exe
 1. P0 must ship independently and prove value before P1/P2 begin
 2. No runtime dependency on BMAD repo -- adapter reads BMAD output artifacts only
 3. Zero regression on existing non-BMAD workflows
-4. All code must pass existing
+4. All code must pass existing 8-gate quality system
 5. Context budget: BMAD additions must stay under 15K tokens per iteration
@@ -120,21 +120,24 @@ Loki Mode implements CONSENSAGENT (ACL 2025):
 **Verdict: Loki Mode wins** - Research-backed quality assurance.
 
 ### 5. Quality Gates
-Loki Mode
+Loki Mode runs 8 deterministic quality gates plus full SDLC phase coverage.
+
+The 8 deterministic quality gates: static analysis (CodeQL, ESLint), test suite (pass/fail), blind 3-reviewer review with severity blocking, anti-sycophancy Devil's Advocate, mock-integrity, test-mutation, documentation coverage, and Magic Modules debate. (Backward-compatibility is a conditional healing-mode auditor, not a numbered gate.)
+
+Beyond the gates, the SDLC pipeline covers these phases:
 1. Static analysis (CodeQL, ESLint)
-2. Unit tests (
+2. Unit tests (test suite passes; coverage % not measured this release)
 3. API/Integration tests
 4. E2E tests (Playwright)
 5. Security scanning (OWASP)
-6.
-7.
-8.
-9.
-10.
-11.
-12.
-13.
-14. Continuous monitoring
+6. Parallel code review (3 reviewers)
+7. Performance/load testing
+8. Accessibility (WCAG)
+9. Regression testing
+10. UAT simulation
+11. Anti-sycophancy check
+12. Scale-aware review intensity
+13. Continuous monitoring
 
 **Auto-Claude:** Single QA validation loop (up to 50 iterations).
 
@@ -236,7 +239,7 @@ Loki Mode now incorporates proven patterns from Cursor's large-scale agent deplo
 3. **Anti-Sycophancy** - Blind review prevents false positives
 4. **Full SDLC** - Business, marketing, growth automation
 5. **Published Benchmarks** - Verify claims with reproducible tests
-6. **
+6. **Source-available (BUSL-1.1)** - Inspect and self-host the full code
 
 ---
 
@@ -253,7 +256,7 @@ Loki Mode now incorporates proven patterns from Cursor's large-scale agent deplo
 - Full spec-to-product lifecycle (not just coding)
 - 41 specialized agent roles
 - Anti-sycophancy measures
--
+- Source-available (BUSL-1.1) license
 - No subscription requirement
 - Verified benchmarks
 
@@ -105,19 +105,20 @@ Full agent type definitions are in `references/agent-types.md`.
 
 ## Quality Gates
 
-Loki Mode enforces
+Loki Mode enforces an 8-gate quality system. Code must pass all applicable gates before moving forward:
 
 | Gate | Name | Purpose |
 |------|------|---------|
-| 1 |
-| 2 |
-| 3 | Blind Review
-| 4 | Anti-Sycophancy
-| 5 |
-| 6 |
-| 7 |
-| 8 |
-
+| 1 | Static Analysis | CodeQL, ESLint/Pylint, type checking |
+| 2 | Test Suite (pass/fail) | Red blocks; coverage % not measured this release |
+| 3 | Blind Code Review (3-reviewer council + severity blocking) | 3 specialist reviewers in parallel, blind to each other; Critical/High = BLOCK; Medium/Low advisory |
+| 4 | Anti-Sycophancy / Devil's Advocate | If reviewers unanimously approve, run a Devil's Advocate reviewer |
+| 5 | Mock Integrity Detector | Flags tests that mock internal modules instead of real code |
+| 6 | Test Mutation Detector | Detects assertion value changes alongside implementation changes |
+| 7 | Documentation Coverage | README exists, docs freshness, API docs for packages |
+| 8 | Magic Modules Debate | Spec-vs-implementation debate on generated Magic Modules |
+
+A conditional backward-compatibility / legacy-healing auditor also runs in healing mode (not one of the 8 numbered gates).
 
 The blind review system (Gate 3) selects 3 reviewers from a pool of 5 named specialists:
 
@@ -179,4 +180,4 @@ Every Loki Mode project uses these files in the `.loki/` directory:
 
 ## Summary
 
-Loki Mode is an autonomous multi-agent system that follows the RARV cycle to build software from PRDs. It uses 41 agent types organized into 8 domains, enforces quality through
+Loki Mode is an autonomous multi-agent system that follows the RARV cycle to build software from PRDs. It uses 41 agent types organized into 8 domains, enforces quality through 8 gates with blind peer review, and maintains episodic/semantic/procedural memory for continuous learning. Projects are classified into simple, standard, or complex tiers that determine the number of phases executed.
@@ -45,7 +45,7 @@ D) test-coverage-auditor
 A) 3
 B) 5
 C) 7
-D)
+D) 8
 
 ---
 
@@ -67,12 +67,12 @@ D) complex
 
 ---
 
-**Question 8:** What
+**Question 8:** What does Gate 7 (Documentation Coverage) check?
 
-A)
-B)
-C)
-D)
+A) That unit test coverage is at least 80%
+B) That every function has an inline comment
+C) That a README exists, docs are fresh within 10 commits, and packages have API docs
+D) That cyclomatic complexity stays under 10
 
 ---
 
@@ -17,30 +17,40 @@ This module covers diagnosing and resolving common issues in Loki Mode: gate fai
 
 ## Quality Gate Failures
 
-When a quality gate fails, identify which gate triggered the failure
+When a quality gate fails, identify which gate triggered the failure (the 8-gate
+system is detailed in `skills/quality-gates.md`):
 
-**Gates 1-
+**Gates 1-2 (Static analysis and test suite):**
+- Gate 1 (Static Analysis): fix CodeQL/ESLint/Pylint/type-checker findings
+- Gate 2 (Test Suite): the test runner must pass; red blocks. Coverage % is not
+measured this release. Fix failing tests before proceeding (never delete or
+skip tests)
+
+**Gates 3-4 (Review gates):**
 - Check the review output for severity levels
-- Critical/High
+- Critical/High = BLOCK; Medium/Low advisory (recommended to fix)
 - Low/Cosmetic = TODO (informational)
 - If all 3 reviewers pass unanimously, Gate 4 runs Devil's Advocate
 
-**Gate
-- Unit tests must have 100% pass rate and >80% coverage
-- Integration tests must have 100% pass rate
-- Fix failing tests before proceeding (never delete or skip tests)
-
-**Gate 8 (Mock detector):**
+**Gate 5 (Mock integrity detector):**
 - Runs `tests/detect-mock-problems.sh`
 - Flags tests that mock internal modules instead of using real code
 - Flags tautological assertions and high internal mock ratios
-- Disable with `
+- Disable with `LOKI_GATE_MOCK=false` (not recommended)
 
-**Gate
+**Gate 6 (Test mutation detector):**
 - Runs `tests/detect-test-mutations.sh`
 - Detects assertion values changed alongside implementation (test fitting)
-- Detects low assertion density
-- Disable with `
+- Detects low assertion density
+- Disable with `LOKI_GATE_MUTATION=false` (not recommended)
+
+**Gate 7 (Documentation coverage):**
+- Checks README presence, docs freshness within 10 commits, and API docs for packages
+- Disable with `LOKI_GATE_DOC_COVERAGE=false` (not recommended for packages)
+
+**Gate 8 (Magic Modules debate):**
+- Runs the spec-vs-implementation debate on generated Magic Modules
+- BLOCK-severity findings block; disable with `LOKI_GATE_MAGIC_DEBATE=false`
 
 ## Circuit Breaker System
 
@@ -67,11 +67,11 @@ D) Removes the entire `.loki/` directory
 
 ---
 
-**Question 8:** Which environment variable disables Gate
+**Question 8:** Which environment variable disables Gate 5 (Mock Integrity Detector)?
 
-A) `
+A) `LOKI_GATE_MOCK=false`
 B) `LOKI_GATE_MOCK_DETECTOR=false`
-C) `
+C) `LOKI_DISABLE_GATE_5=true`
 D) `LOKI_NO_MOCK_DETECTION=true`
 
 ---
@@ -73,7 +73,7 @@ docs/certification/
 
 ## Cost and Licensing
 
-This certification program is **free and
+This certification program is **free and source-available**, released under the same license as Loki Mode (BUSL-1.1). No registration or payment is required.
 
 ## Version
 
@@ -12,10 +12,10 @@ This file contains answers for all module quizzes and the final certification ex
 | 2 | C | 41 agent types: 37 domain + 4 orchestration |
 | 3 | B | After 5 failures, the task moves to `.loki/queue/dead-letter.json` |
 | 4 | C | architecture-strategist is always one of the 3 selected reviewers |
-| 5 | D |
+| 5 | D | 8 quality gates (Static Analysis through Magic Modules Debate); backward-compatibility is a conditional healing-mode auditor, not one of the 8 |
 | 6 | B | Episodic, semantic, and procedural memory |
 | 7 | B | Simple tier uses 3 phases |
-| 8 | C | Gate 7
+| 8 | C | Gate 7 (Documentation Coverage) checks README presence, docs freshness within 10 commits, and API docs for packages; coverage % is not measured this release |
 | 9 | C | Claude Code supports full features; Codex and Gemini run in degraded mode |
 | 10 | B | If all 3 reviewers unanimously approve, a Devil's Advocate reviewer runs |
 
@@ -49,7 +49,7 @@ D) test-coverage-auditor
 A) 3
 B) 5
 C) 7
-D)
+D) 8
 
 ---
 
@@ -71,12 +71,12 @@ D) complex
 
 ---
 
-**Question 8:** What
+**Question 8:** What does Gate 7 (Documentation Coverage) check?
 
-A)
-B)
-C)
-D)
+A) That unit test coverage is at least 80%
+B) That every function has an inline comment
+C) That a README exists, docs are fresh within 10 commits, and packages have API docs
+D) That cyclomatic complexity stays under 10
 
 ---
 
@@ -439,11 +439,11 @@ D) Removes the entire `.loki/` directory
 
 ---
 
-**Question 48:** Which environment variable disables Gate
+**Question 48:** Which environment variable disables Gate 5 (Mock Integrity Detector)?
 
-A) `
+A) `LOKI_GATE_MOCK=false`
 B) `LOKI_SKIP_MOCK_CHECK=true`
-C) `
+C) `LOKI_DISABLE_GATE_5=true`
 D) `LOKI_NO_MOCK_DETECTION=true`
 
 ---
@@ -409,7 +409,7 @@ These are bolt.new weaknesses that Loki Mode already solves or can emphasize:
 
 #### R5: Advertise Production Readiness as Key Differentiator
 - **bolt.new's gap**: 70% done code, no tests, no review, $5-20K remediation
-- **Loki Mode's advantage**: RARV cycle,
+- **Loki Mode's advantage**: RARV cycle, 8 quality gates, 3-reviewer system, automated testing
 - **Action**: Create comparison content showing: "bolt.new gives you a prototype. Loki Mode gives you a product."
 - **Messaging**: "From PRD to production, not PRD to prototype"
 
@@ -529,7 +529,7 @@ The "vibe coding" market -- AI tools that generate code from natural language --
 | Lovable | Lovable | $100M ARR in 8 months | Design-quality prototypes |
 | Vercel | v0 | Part of Vercel ($3.5B+) | UI component generator |
 | Replit | Replit | $1.16B valuation | Browser IDE + AI |
-| Autonomi | Loki Mode |
+| Autonomi | Loki Mode | Source-available (BUSL-1.1), early stage | PRD-to-production system |
 
 ### 11.4 Strategic Implication for Loki Mode
 