loki-mode 7.41.5 → 7.43.0

@@ -116,10 +116,17 @@ provider_version() {
 # Invocation function
 # Note: Codex uses positional prompt, not -p flag
 # Note: Reasoning effort is configured via environment or config, not CLI flag
+# v7.x: pin the resolved model explicitly via -m/--model. Without it, codex
+# falls back to the installed CLI's built-in default (e.g. gpt-5.5 on codex
+# 0.132.0), which silently ignores _codex_validate_model and makes the run.sh
+# cost table (priced for gpt-5.3-codex) wrong. --model is the documented model
+# selector and is readable in process listings.
 provider_invoke() {
     local prompt="$1"
     shift
-    codex exec --full-auto --skip-git-repo-check "$prompt" "$@"
+    codex exec --full-auto --skip-git-repo-check \
+        --model "$PROVIDER_MODEL_DEVELOPMENT" \
+        "$prompt" "$@"
 }
 
 # Model tier to effort level parameter (Codex uses effort, not separate models)
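The `provider_invoke` change above always passes `--model "$PROVIDER_MODEL_DEVELOPMENT"`. The following is a minimal sketch of the same argument layout, not the package's code: `print_codex_argv` is a hypothetical helper that prints the argument vector instead of running `codex`, so the result can be inspected. Skipping `--model` when the variable is empty is an assumption about desirable behavior; the diff itself does not guard against an empty value.

```shell
#!/bin/sh
# Hypothetical helper (not part of loki-mode): print the codex argv, one
# argument per line, instead of executing it.
# Usage: print_codex_argv <prompt> [extra args...]
print_codex_argv() {
    # "$@" starts as: prompt [extra args...]
    # Assumption: only pin the model when one was actually resolved, so an
    # empty value is never handed to codex as --model "".
    if [ -n "${PROVIDER_MODEL_DEVELOPMENT:-}" ]; then
        set -- --model "$PROVIDER_MODEL_DEVELOPMENT" "$@"
    fi
    # Flags below are the ones used by provider_invoke in the diff.
    set -- codex exec --full-auto --skip-git-repo-check "$@"
    printf '%s\n' "$@"
}

# Model name taken from the comment in the diff.
PROVIDER_MODEL_DEVELOPMENT="gpt-5.3-codex"
print_codex_argv "fix the failing test"
```

Printing one argument per line makes it easy to confirm that the prompt stays a single positional argument and that `--model` precedes it, which is the ordering the diff uses.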
@@ -197,6 +204,18 @@ provider_invoke_with_tier() {
     local effort
     effort=$(resolve_model_for_tier "$tier")
 
+    # Resolve the model name by tier. These three vars can diverge via the
+    # generic LOKI_MODEL_* env (each validated by _codex_validate_model), so
+    # honor the tier rather than hardcoding development. Capability aliases
+    # (best/balanced/cheap) mirror resolve_model_for_tier's mapping.
+    local model
+    case "$tier" in
+        planning|best) model="$PROVIDER_MODEL_PLANNING" ;;
+        development|balanced) model="$PROVIDER_MODEL_DEVELOPMENT" ;;
+        fast|cheap) model="$PROVIDER_MODEL_FAST" ;;
+        *) model="$PROVIDER_MODEL_DEVELOPMENT" ;;
+    esac
+
     local extra_flags=()
     if [ "${LOKI_CODEX_WEB_SEARCH:-false}" = "true" ]; then
         extra_flags+=(--search)
@@ -211,6 +230,7 @@ provider_invoke_with_tier() {
         --ask-for-approval never \
         --sandbox danger-full-access \
         --skip-git-repo-check \
+        --model "$model" \
         "${extra_flags[@]}" \
         "$prompt" "$@"
 }
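The tier-to-model `case` statement added to `provider_invoke_with_tier` can be exercised in isolation. This is a sketch under stated assumptions: `model_for_tier` is a hypothetical wrapper that does not exist in the package, and the three model names are placeholders rather than real model identifiers.

```shell
#!/bin/sh
# Placeholder values for illustration only; in loki-mode these variables are
# resolved elsewhere and validated by _codex_validate_model.
PROVIDER_MODEL_PLANNING="model-planning"
PROVIDER_MODEL_DEVELOPMENT="model-development"
PROVIDER_MODEL_FAST="model-fast"

# Hypothetical wrapper around the case statement from the diff: prints the
# model for a tier name or capability alias. Unknown tiers fall back to the
# development model, as in the diff.
model_for_tier() {
    case "$1" in
        planning|best)        printf '%s\n' "$PROVIDER_MODEL_PLANNING" ;;
        development|balanced) printf '%s\n' "$PROVIDER_MODEL_DEVELOPMENT" ;;
        fast|cheap)           printf '%s\n' "$PROVIDER_MODEL_FAST" ;;
        *)                    printf '%s\n' "$PROVIDER_MODEL_DEVELOPMENT" ;;
    esac
}

for tier in planning balanced cheap unknown-tier; do
    printf '%s -> %s\n' "$tier" "$(model_for_tier "$tier")"
done
```

Keeping the fallback arm identical to the `development` arm means an unrecognized tier still produces an explicit `--model` value rather than deferring to the CLI's built-in default.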
@@ -74,6 +74,13 @@ Every iteration follows this cycle:
 
 The RARV cycle now closes with an explicit Critique step (RARV-C). After VERIFY, an override council of real provider judges (v7.5.4) issues a binding decision before the iteration is marked complete. See `references/quality-control.md` for the override council protocol.
 
+### Verified Completion: Evidence Required (v7.41.1, v7.41.5)
+
+Completion is gated on affirmative test evidence, not the absence of a detected failure.
+
+- **Test evidence captured before the gate reads it (v7.41.1).** Loki runs the project's own tests and persists `.loki/quality/test-results.json` before the completion evidence gate evaluates it, so absent test evidence can no longer silently pass the test axis. Default-on; opt out with `LOKI_COMPLETION_TEST_CAPTURE=0`. It reuses the quality-ladder run (no double test execution per iteration) and a project with no runner records `{"runner":"none","pass":true}`. Source: `autonomy/run.sh` (`ensure_completion_test_evidence`, `:7236`).
+- **Completion council heuristic fallback defaults to CONTINUE (v7.41.5).** When no AI provider is available for the council, the heuristic member evaluation starts each vote at CONTINUE and flips to COMPLETE only when no failure is detected AND affirmative positive evidence is present (the same non-red `test-results.json` signal the evidence hard gate uses). An empty `.loki/` with no test evidence no longer clears the threshold on "absence of failure". Legitimate finished projects (passing or genuinely no-test) still vote COMPLETE. Source: `autonomy/completion-council.sh` (`council_evaluate_member`, `:2044`-`:2063`, `:2127`-`:2140`).
+
 ---
 
 ## CONTINUITY.md - Working Memory Protocol
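The "default to CONTINUE, flip only on affirmative evidence" rule described in the Verified Completion hunk above can be sketched as follows. This is not the code in `council_evaluate_member`: `heuristic_vote` is a hypothetical function, and treating "affirmative evidence" as a results file containing `"pass":true` is an assumption based on the `{"runner":"none","pass":true}` record quoted in the hunk.

```shell
#!/bin/sh
# Hypothetical sketch of the heuristic fallback vote.
# Usage: heuristic_vote <failure_detected: 0|1> <path to test-results.json>
heuristic_vote() {
    failure_detected="$1"
    results_file="$2"
    vote="CONTINUE"   # default: not complete until evidence says otherwise
    # Assumption: affirmative evidence means the file exists and records
    # "pass":true. A missing file is NOT treated as a pass.
    if [ "$failure_detected" = "0" ] && [ -f "$results_file" ] &&
       grep -Eq '"pass"[[:space:]]*:[[:space:]]*true' "$results_file"; then
        vote="COMPLETE"
    fi
    printf '%s\n' "$vote"
}

demo_dir=$(mktemp -d)
# No evidence captured yet: absence of failure alone is not enough.
heuristic_vote 0 "$demo_dir/test-results.json"
# A project with no test runner records a passing result.
printf '%s\n' '{"runner":"none","pass":true}' > "$demo_dir/test-results.json"
heuristic_vote 0 "$demo_dir/test-results.json"
# Evidence is present, but a failure was detected elsewhere.
heuristic_vote 1 "$demo_dir/test-results.json"
rm -rf "$demo_dir"
```

The three calls print `CONTINUE`, `COMPLETE`, and `CONTINUE` in that order. The pattern match is a simplification; a real implementation would more likely parse the JSON properly.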
@@ -287,6 +287,12 @@ Task(subagent_type="general-purpose", model="opus",
 - ALWAYS re-run ALL 3 reviewers after fixes (not just the one that found the issue)
 - Wait for all reviews to complete before aggregating results
 
+### Inconclusive Reviews Block (v7.41.1)
+
+A code-review round must produce real verdicts to pass. `run_code_review` counts only reviewer outputs that exist, are non-empty, and carry a recognized `VERDICT:` line. If every reviewer returns no usable verdict (all NO_OUTPUT or unparseable), the round is treated as INCONCLUSIVE and BLOCKS rather than silently passing with zero findings. A bounded one-shot retry runs first; the block is opt-out via `LOKI_REVIEW_INCONCLUSIVE_BLOCK=0` (records, never blocks). APPROVE / PASS-with-concerns still pass.
+
+The reviewer-prompt diff excludes `.loki/` and `.git/` via git pathspec (`git diff <sha>..HEAD -- . ':(exclude).loki/'`). This mirrors the completion evidence gate and prevents a `.loki/`-tracked repo from ballooning the diff to the point that the reviewer model overflows and returns empty (the original NO_OUTPUT cause). Source: `autonomy/run.sh` (`run_code_review`, diff at `:2578`, inconclusive handling at `:8134`-`:8270`).
+
 ---
 
 ## Structured Prompting for Subagents
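The inconclusive-review rule in the hunk above amounts to counting usable verdicts and blocking when the count is zero. A minimal sketch, with loud assumptions: `has_verdict` and `review_round_status` are hypothetical names, the set of recognized verdict tokens is invented for illustration (the source only mentions APPROVE and PASS-with-concerns), and the bounded one-shot retry is not modeled.

```shell
#!/bin/sh
# Assumption: illustrative token set, not the list run_code_review recognizes.
RECOGNIZED_VERDICTS='APPROVE|PASS|REJECT'

# True when the reviewer output exists, is non-empty, and carries a
# recognized VERDICT: line.
has_verdict() {
    [ -s "$1" ] && grep -Eq "^VERDICT:[[:space:]]*($RECOGNIZED_VERDICTS)" "$1"
}

# Usage: review_round_status <reviewer output file>...
review_round_status() {
    usable=0
    for f in "$@"; do
        if has_verdict "$f"; then
            usable=$((usable + 1))
        fi
    done
    if [ "$usable" -gt 0 ]; then
        echo "usable verdicts: $usable"
    elif [ "${LOKI_REVIEW_INCONCLUSIVE_BLOCK:-1}" = "0" ]; then
        echo "INCONCLUSIVE (recorded, not blocking)"
    else
        echo "INCONCLUSIVE (blocking)"
    fi
}

demo_dir=$(mktemp -d)
: > "$demo_dir/r1.txt"                          # empty output (NO_OUTPUT)
echo "looks fine to me" > "$demo_dir/r2.txt"    # no VERDICT: line
review_round_status "$demo_dir/r1.txt" "$demo_dir/r2.txt" "$demo_dir/missing.txt"
echo "VERDICT: APPROVE" > "$demo_dir/r1.txt"
review_round_status "$demo_dir/r1.txt" "$demo_dir/r2.txt" "$demo_dir/missing.txt"
rm -rf "$demo_dir"
```

The first call reports a blocking inconclusive round because none of the three outputs is usable; the second reports one usable verdict. Setting `LOKI_REVIEW_INCONCLUSIVE_BLOCK=0` switches the zero-verdict case to record-only, matching the opt-out described in the hunk.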
package/skills/agents.md CHANGED
@@ -135,6 +135,7 @@ Task(
 - WAIT for all 3 before aggregating
 - IF unanimous PASS: run Devil's Advocate reviewer (anti-sycophancy)
 - Critical/High = BLOCK, Medium = TODO, Low = informational
+- IF every reviewer returns no usable verdict (all NO_OUTPUT / unparseable): the round is INCONCLUSIVE and BLOCKS, never silently passes (v7.41.1, bounded retry first; opt out `LOKI_REVIEW_INCONCLUSIVE_BLOCK=0`). The reviewer diff excludes `.loki/` and `.git/` so a tracked `.loki/` cannot overflow the prompt into the empty-output that caused the original silent pass. See `skills/quality-gates.md` for the env knobs.
 
 ---