loki-mode 7.41.5 → 7.43.0
- package/README.md +18 -1
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/app-runner.sh +174 -8
- package/autonomy/completion-council.sh +38 -16
- package/autonomy/hooks/migration-hooks.sh +131 -7
- package/autonomy/loki +66 -43
- package/autonomy/run.sh +73 -2
- package/dashboard/__init__.py +1 -1
- package/dashboard/server.py +102 -0
- package/dashboard/static/index.html +9 -9
- package/docs/INSTALLATION.md +70 -1
- package/events/bus.py +9 -6
- package/loki-ts/dist/loki.js +2 -2
- package/mcp/__init__.py +1 -1
- package/mcp/lsp_proxy.py +274 -89
- package/mcp/server.py +26 -2
- package/memory/vector_index.py +6 -1
- package/package.json +1 -1
- package/plugins/loki-mode/.claude-plugin/plugin.json +1 -1
- package/providers/codex.sh +21 -1
- package/references/core-workflow.md +7 -0
- package/references/quality-control.md +6 -0
- package/skills/agents.md +1 -0
package/providers/codex.sh
CHANGED
@@ -116,10 +116,17 @@ provider_version() {
 # Invocation function
 # Note: Codex uses positional prompt, not -p flag
 # Note: Reasoning effort is configured via environment or config, not CLI flag
+# v7.x: pin the resolved model explicitly via -m/--model. Without it, codex
+# falls back to the installed CLI's built-in default (e.g. gpt-5.5 on codex
+# 0.132.0), which silently ignores _codex_validate_model and makes the run.sh
+# cost table (priced for gpt-5.3-codex) wrong. --model is the documented model
+# selector and is readable in process listings.
 provider_invoke() {
     local prompt="$1"
     shift
-    codex exec --full-auto --skip-git-repo-check
+    codex exec --full-auto --skip-git-repo-check \
+        --model "$PROVIDER_MODEL_DEVELOPMENT" \
+        "$prompt" "$@"
 }
 
 # Model tier to effort level parameter (Codex uses effort, not separate models)
@@ -197,6 +204,18 @@ provider_invoke_with_tier() {
     local effort
     effort=$(resolve_model_for_tier "$tier")
 
+    # Resolve the model name by tier. These three vars can diverge via the
+    # generic LOKI_MODEL_* env (each validated by _codex_validate_model), so
+    # honor the tier rather than hardcoding development. Capability aliases
+    # (best/balanced/cheap) mirror resolve_model_for_tier's mapping.
+    local model
+    case "$tier" in
+        planning|best) model="$PROVIDER_MODEL_PLANNING" ;;
+        development|balanced) model="$PROVIDER_MODEL_DEVELOPMENT" ;;
+        fast|cheap) model="$PROVIDER_MODEL_FAST" ;;
+        *) model="$PROVIDER_MODEL_DEVELOPMENT" ;;
+    esac
+
     local extra_flags=()
     if [ "${LOKI_CODEX_WEB_SEARCH:-false}" = "true" ]; then
         extra_flags+=(--search)
@@ -211,6 +230,7 @@ provider_invoke_with_tier() {
         --ask-for-approval never \
         --sandbox danger-full-access \
         --skip-git-repo-check \
+        --model "$model" \
         "${extra_flags[@]}" \
         "$prompt" "$@"
 }
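Review note: the tier handling added to `provider_invoke_with_tier` can be exercised on its own. The sketch below is a hypothetical helper (`codex_model_for_tier` is not part of the package) that copies the `case` mapping from the hunk above. The model names are placeholder values chosen for illustration, not the package defaults.

```shell
# Hypothetical helper, not part of loki-mode: mirrors the tier-to-model case
# statement from provider_invoke_with_tier so the mapping can be checked alone.
codex_model_for_tier() {
    local tier="$1"
    case "$tier" in
        planning|best)        printf '%s\n' "$PROVIDER_MODEL_PLANNING" ;;
        development|balanced) printf '%s\n' "$PROVIDER_MODEL_DEVELOPMENT" ;;
        fast|cheap)           printf '%s\n' "$PROVIDER_MODEL_FAST" ;;
        # Unknown tiers fall back to the development model, as in the diff.
        *)                    printf '%s\n' "$PROVIDER_MODEL_DEVELOPMENT" ;;
    esac
}

# Placeholder values for illustration only (assumptions, not real defaults).
PROVIDER_MODEL_PLANNING="model-planning"
PROVIDER_MODEL_DEVELOPMENT="model-development"
PROVIDER_MODEL_FAST="model-fast"

codex_model_for_tier best      # prints model-planning
codex_model_for_tier cheap     # prints model-fast
codex_model_for_tier unknown   # prints model-development
```

Keeping the fallback branch identical to the `development` branch means a mistyped tier still yields a pinned `--model` value rather than an empty argument.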
package/references/core-workflow.md
CHANGED
@@ -74,6 +74,13 @@ Every iteration follows this cycle:
 
 The RARV cycle now closes with an explicit Critique step (RARV-C). After VERIFY, an override council of real provider judges (v7.5.4) issues a binding decision before the iteration is marked complete. See `references/quality-control.md` for the override council protocol.
 
+### Verified Completion: Evidence Required (v7.41.1, v7.41.5)
+
+Completion is gated on affirmative test evidence, not the absence of a detected failure.
+
+- **Test evidence captured before the gate reads it (v7.41.1).** Loki runs the project's own tests and persists `.loki/quality/test-results.json` before the completion evidence gate evaluates it, so absent test evidence can no longer silently pass the test axis. Default-on; opt out with `LOKI_COMPLETION_TEST_CAPTURE=0`. It reuses the quality-ladder run (no double test execution per iteration) and a project with no runner records `{"runner":"none","pass":true}`. Source: `autonomy/run.sh` (`ensure_completion_test_evidence`, `:7236`).
+- **Completion council heuristic fallback defaults to CONTINUE (v7.41.5).** When no AI provider is available for the council, the heuristic member evaluation starts each vote at CONTINUE and flips to COMPLETE only when no failure is detected AND affirmative positive evidence is present (the same non-red `test-results.json` signal the evidence hard gate uses). An empty `.loki/` with no test evidence no longer clears the threshold on "absence of failure". Legitimate finished projects (passing or genuinely no-test) still vote COMPLETE. Source: `autonomy/completion-council.sh` (`council_evaluate_member`, `:2044`-`:2063`, `:2127`-`:2140`).
+
 ---
 
 ## CONTINUITY.md - Working Memory Protocol
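Review note: the "default to CONTINUE" rule described in the hunk above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `completion-council.sh`: `heuristic_vote` and its arguments are hypothetical names, and "non-red" evidence is approximated as a `"pass":true` field in the evidence file.

```shell
# Hypothetical sketch of a heuristic vote that starts at CONTINUE.
# $1: "1" if a failure was detected, "0" otherwise
# $2: path to the test evidence file (e.g. .loki/quality/test-results.json)
heuristic_vote() {
    local failure_detected="$1"
    local evidence_file="$2"
    local vote="CONTINUE"   # pessimistic default

    # Flip only when no failure was seen AND positive evidence exists.
    # Assumption: positive evidence is a non-empty file with "pass": true.
    if [ "$failure_detected" = "0" ] && [ -s "$evidence_file" ] \
        && grep -Eq '"pass"[[:space:]]*:[[:space:]]*true' "$evidence_file"; then
        vote="COMPLETE"
    fi
    printf '%s\n' "$vote"
}

demo_dir=$(mktemp -d)
printf '%s\n' '{"runner":"none","pass":true}' > "$demo_dir/test-results.json"

heuristic_vote 0 "$demo_dir/missing.json"        # prints CONTINUE (no evidence)
heuristic_vote 0 "$demo_dir/test-results.json"   # prints COMPLETE
heuristic_vote 1 "$demo_dir/test-results.json"   # prints CONTINUE (failure seen)
rm -rf "$demo_dir"
```

The first call shows the behavior the change targets: with no evidence file, the absence of a failure alone is not enough to vote COMPLETE.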
package/references/quality-control.md
CHANGED
@@ -287,6 +287,12 @@ Task(subagent_type="general-purpose", model="opus",
 - ALWAYS re-run ALL 3 reviewers after fixes (not just the one that found the issue)
 - Wait for all reviews to complete before aggregating results
 
+### Inconclusive Reviews Block (v7.41.1)
+
+A code-review round must produce real verdicts to pass. `run_code_review` counts only reviewer outputs that exist, are non-empty, and carry a recognized `VERDICT:` line. If every reviewer returns no usable verdict (all NO_OUTPUT or unparseable), the round is treated as INCONCLUSIVE and BLOCKS rather than silently passing with zero findings. A bounded one-shot retry runs first; the block is opt-out via `LOKI_REVIEW_INCONCLUSIVE_BLOCK=0` (records, never blocks). APPROVE / PASS-with-concerns still pass.
+
+The reviewer-prompt diff excludes `.loki/` and `.git/` via git pathspec (`git diff <sha>..HEAD -- . ':(exclude).loki/'`). This mirrors the completion evidence gate and prevents a `.loki/`-tracked repo from ballooning the diff to the point that the reviewer model overflows and returns empty (the original NO_OUTPUT cause). Source: `autonomy/run.sh` (`run_code_review`, diff at `:2578`, inconclusive handling at `:8134`-`:8270`).
+
 ---
 
 ## Structured Prompting for Subagents
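Review note: the verdict-counting rule in the hunk above can be sketched like this. It is an illustration only: `review_round_status` and the three status strings are hypothetical, the real `run_code_review` also performs a bounded retry that is omitted here, and a "recognized" verdict is simplified to any line starting with `VERDICT:`.

```shell
# Hypothetical sketch; takes reviewer output files as arguments.
review_round_status() {
    local usable=0 f
    for f in "$@"; do
        # Count only outputs that exist, are non-empty, and carry a VERDICT: line.
        if [ -s "$f" ] && grep -q '^VERDICT:' "$f"; then
            usable=$((usable + 1))
        fi
    done

    if [ "$usable" -gt 0 ]; then
        printf 'CONCLUSIVE\n'
    elif [ "${LOKI_REVIEW_INCONCLUSIVE_BLOCK:-1}" = "0" ]; then
        printf 'INCONCLUSIVE_RECORDED\n'   # opt-out: record, never block
    else
        printf 'INCONCLUSIVE_BLOCK\n'      # default: block the round
    fi
}

unset LOKI_REVIEW_INCONCLUSIVE_BLOCK
demo_dir=$(mktemp -d)
: > "$demo_dir/reviewer1.txt"                               # empty output
printf 'no verdict here\n' > "$demo_dir/reviewer2.txt"      # unparseable
printf 'VERDICT: APPROVE\n' > "$demo_dir/reviewer3.txt"     # usable

review_round_status "$demo_dir/reviewer1.txt" "$demo_dir/reviewer2.txt"
# prints INCONCLUSIVE_BLOCK
review_round_status "$demo_dir"/reviewer*.txt
# prints CONCLUSIVE
rm -rf "$demo_dir"
```

One usable verdict is enough to make the round conclusive in this sketch. Whether the package requires more than one is not stated in the diff.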
package/skills/agents.md
CHANGED
@@ -135,6 +135,7 @@ Task(
 - WAIT for all 3 before aggregating
 - IF unanimous PASS: run Devil's Advocate reviewer (anti-sycophancy)
 - Critical/High = BLOCK, Medium = TODO, Low = informational
+- IF every reviewer returns no usable verdict (all NO_OUTPUT / unparseable): the round is INCONCLUSIVE and BLOCKS, never silently passes (v7.41.1, bounded retry first; opt out `LOKI_REVIEW_INCONCLUSIVE_BLOCK=0`). The reviewer diff excludes `.loki/` and `.git/` so a tracked `.loki/` cannot overflow the prompt into the empty-output that caused the original silent pass. See `skills/quality-gates.md` for the env knobs.
 
 ---
 