loki-mode 7.48.0 → 7.50.0

package/README.md CHANGED
@@ -106,7 +106,7 @@ loki quick "build a landing page with a signup form"
  | **Bun (recommended)** | `bun install -g loki-mode` | Fastest startup for CLI commands. |
  | **Homebrew** | `brew tap asklokesh/tap && brew install loki-mode` | Auto-installs Bun as a dep |
  | **Docker (easiest)** | `loki docker start prd.md` | Host wrapper: runs loki in the published image with zero config. Bind-mounts the current folder so `.loki` state, resume, and continuity work exactly like local. Auto-detects auth (`ANTHROPIC_API_KEY`, else your host Claude Code login). Needs loki + Docker on the host. See DOCKER_README.md |
- | **Docker (raw)** | `docker pull asklokesh/loki-mode:7.45.0 && docker run --rm -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" asklokesh/loki-mode:7.45.0 start prd.md` | Bun + Claude CLI pre-installed; needs an API key, or use docker compose with a .env file, see DOCKER_README.md |
+ | **Docker (raw)** | `docker pull asklokesh/loki-mode:7.50.0 && docker run --rm -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" asklokesh/loki-mode:7.50.0 start prd.md` | Bun + Claude CLI pre-installed; needs an API key, or use docker compose with a .env file, see DOCKER_README.md |
  | **npm (compat)** | `npm install -g loki-mode` | Works without Bun (bash fallback). Migrate any time with `loki self-update --to bun`. |
 
  **Upgrading:**
@@ -166,7 +166,7 @@ The next major release sunsets the Bash runtime entirely. There is no firm calen
  | Method | Command |
  |--------|---------|
  | **Homebrew** | `brew tap asklokesh/tap && brew install loki-mode` |
- | **Docker** | `docker pull asklokesh/loki-mode:7.45.0` |
+ | **Docker** | `docker pull asklokesh/loki-mode:7.50.0` |
  | **Inside Claude Code** | `claude --dangerously-skip-permissions` then type "Loki Mode" |
  | **Git clone** | `git clone https://github.com/asklokesh/loki-mode.git` |
 
package/SKILL.md CHANGED
@@ -3,7 +3,7 @@ name: loki-mode
  description: Autonomous spec-driven build system with a built-in trust layer. It does not call work done until it is verified (RARV-C closure loop, 8 quality gates, completion council, verified-completion evidence gate). Triggers on "Loki Mode". Takes a spec (PRD, GitHub issue, OpenAPI doc, etc.) to deployed product with minimal human intervention. Provider-agnostic. Requires --dangerously-skip-permissions flag.
  ---
 
- # Loki Mode v7.48.0
+ # Loki Mode v7.50.0
 
  **You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.**
 
@@ -407,4 +407,4 @@ See `CHANGELOG.md` entries [7.5.7], [7.5.8], [7.5.13] for the per-fix list and r
 
  ---
 
- **v7.48.0 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
+ **v7.50.0 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
package/VERSION CHANGED
@@ -1 +1 @@
- 7.48.0
+ 7.50.0
@@ -1569,6 +1569,19 @@ council_evidence_gate() {
      local test_fails="false"
      local test_runner="none"
      local test_pass="true"
+     # P1-1 (evidence-gate loophole): track WHY the test signal is not conclusive
+     # positive evidence, mirroring diff_inconclusive. A project that ran NO test
+     # suite (runner=="none") must NOT count as affirmative "tests are green"
+     # evidence -- absence of tests is not proof of correctness. We classify it as
+     # INCONCLUSIVE (not FAIL: a no-tests project is still allowed to complete, it
+     # just may not lean on tests as positive proof), so the no-tests "done" routes
+     # to the completion council's affirmative vote instead of silently passing on
+     # diff-alone. test_inconclusive is pass-through by construction: it never sets
+     # test_fails and never writes evidence-block.json, exactly like
+     # diff_inconclusive. Opt-out (LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE=1) reverts to
+     # the historical behavior where runner=="none" was an affirmative PASS.
+     local test_inconclusive="false"
+     local test_inconclusive_reason=""
      if [ -f "$tr_file" ]; then
          local test_status
          test_status=$(_TR_FILE="$tr_file" python3 -c "
@@ -1597,9 +1610,25 @@ else:
              test_fails="true"
          fi
          # INCONCLUSIVE => test_fails stays "false" => pass-through.
+         # No test suite ran: a present results file that records runner=="none"
+         # is not affirmative evidence. Route to council (inconclusive), not a
+         # silent diff-alone pass. Default-on; LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE=1
+         # restores the old affirmative-PASS behavior.
+         if [ "$test_runner" = "none" ] && [ "${LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE:-0}" != "1" ]; then
+             test_inconclusive="true"
+             test_inconclusive_reason="no_test_runner"
+         fi
+     else
+         # Missing test-results.json: no suite was recorded at all. Like the
+         # runner=="none" case this is not affirmative evidence, so classify it
+         # inconclusive (still pass-through: test_fails stays "false"). Preserves
+         # the historical "no file = no gate" non-blocking behavior while making
+         # the absence auditable instead of silently affirmative.
+         if [ "${LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE:-0}" != "1" ]; then
+             test_inconclusive="true"
+             test_inconclusive_reason="no_test_results"
+         fi
      fi
-     # Missing test-results.json (the else of the -f check) likewise leaves
-     # test_fails="false" => inconclusive => pass-through (no file = no gate).
 
      # --- v7.28.0: inconclusive-baseline lifecycle -------------------------------
      # When the gate cannot establish a diff baseline (no git repo, or no run-start
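The classification added in this hunk can be restated as a small truth table. The sketch below is a Python paraphrase of the shell logic, not code from the package: the function name `classify_test_signal` and the runner name `"pytest"` in the usage lines are hypothetical, chosen only for illustration. It assumes the same three inputs the shell uses: whether `test-results.json` exists, the recorded runner, and the `LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE` opt-out.

```python
import os


def classify_test_signal(results_present, runner, env=None):
    """Return (inconclusive, reason) for the no-tests cases in the hunk above.

    Hypothetical helper: mirrors the shell branches, it is not part of loki-mode.
    """
    env = os.environ if env is None else env
    # Opt-out restores the historical behavior: no-tests is never inconclusive.
    if env.get("LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE", "0") == "1":
        return (False, "")
    if not results_present:
        # Missing test-results.json: no suite was recorded at all.
        return (True, "no_test_results")
    if runner == "none":
        # Results file exists but records that no test suite ran.
        return (True, "no_test_runner")
    return (False, "")


print(classify_test_signal(True, "none", {}))
print(classify_test_signal(False, "none", {}))
print(classify_test_signal(True, "pytest", {}))
```

In every branch the result is only a label: as in the shell version, neither inconclusive case sets `test_fails`, so the gate still passes through.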
@@ -1635,12 +1664,63 @@ INCONCLUSIVE_EOF
          fi
      fi
 
+     # --- P1-1: durable, auditable evidence-gate details -------------------------
+     # Persist the full evidence picture on EVERY gate run (pass and block) so any
+     # completion claim is auditable after the fact: diff status, test runner +
+     # status, both inconclusive reasons, and the final verdict. Atomic temp+mv,
+     # under .loki/council/ (already excluded from the diff union by the gate's own
+     # ^\.loki/ filter, so it never makes the gate toothless). Best-effort: a write
+     # failure never changes the gate's decision. _write_evidence_details <verdict>
+     # where verdict is one of pass|block (the caller passes the decided verdict).
+     _write_evidence_details() {
+         local _verdict="$1"
+         mkdir -p "$COUNCIL_STATE_DIR" 2>/dev/null || true
+         local _det_file="$COUNCIL_STATE_DIR/evidence-gate-details.json"
+         local _det_tmp="${_det_file}.tmp"
+         local _det_ts
+         _det_ts=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
+         local _diff_ok _tests_ok
+         if [ "$diff_fails" = "true" ]; then _diff_ok="false"; else _diff_ok="true"; fi
+         if [ "$test_fails" = "true" ]; then _tests_ok="false"; else _tests_ok="true"; fi
+         cat > "$_det_tmp" << DETAILS_EOF
+ {
+   "recorded_at": "$_det_ts",
+   "iteration": ${ITERATION_COUNT:-0},
+   "verdict": "$_verdict",
+   "diff": {
+     "ok": $_diff_ok,
+     "base_sha": "${base_sha:-}",
+     "files_changed": $diff_files,
+     "inconclusive": $diff_inconclusive,
+     "inconclusive_reason": "$diff_inconclusive_reason"
+   },
+   "tests": {
+     "ok": $_tests_ok,
+     "runner": "$test_runner",
+     "pass": $test_pass,
+     "inconclusive": $test_inconclusive,
+     "inconclusive_reason": "$test_inconclusive_reason"
+   }
+ }
+ DETAILS_EOF
+         mv "$_det_tmp" "$_det_file" 2>/dev/null || rm -f "$_det_tmp" 2>/dev/null || true
+     }
+
      # --- Block decision: block iff DIFF FAILS or TEST FAILS ---
      if [ "$diff_fails" != "true" ] && [ "$test_fails" != "true" ]; then
          # Gate passes: remove any stale block report.
          if [ -f "$COUNCIL_STATE_DIR/evidence-block.json" ]; then
              rm -f "$COUNCIL_STATE_DIR/evidence-block.json"
          fi
+         # P1-1: when the gate passes ONLY because no test suite ran, say so out
+         # loud. The pass is pass-through (no-tests must not deadlock), but a
+         # completion that is not backed by any test evidence should never slip by
+         # silently. The durable detail is in evidence-gate-details.json; this is
+         # the human-visible honesty at the pass site.
+         if [ "$test_inconclusive" = "true" ]; then
+             log_warn "[Council] Evidence gate: completion not backed by test evidence (${test_inconclusive_reason}). Pass-through; set LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE=1 to treat no-tests as affirmative."
+         fi
+         _write_evidence_details "pass"
          return 0
      fi
 
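The "atomic temp+mv, best-effort" pattern used by `_write_evidence_details` carries over to other languages. The following is a minimal Python sketch of the same idea, not code from the package: `write_details_atomically` is a hypothetical name, and it differs from the shell in using a uniquely named temp file from `tempfile.mkstemp` rather than a fixed `.tmp` suffix. It writes to a temp file in the target directory, renames it into place, and never raises on I/O failure.

```python
import json
import os
import tempfile


def write_details_atomically(path, details):
    """Best-effort atomic JSON write: temp file in the same directory, then rename.

    Hypothetical helper. Returns True on success, False on any OSError, so a
    failed write cannot change the caller's decision.
    """
    directory = os.path.dirname(path) or "."
    try:
        os.makedirs(directory, exist_ok=True)
        # Same directory as the target so the rename stays on one filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    except OSError:
        return False
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(details, handle, indent=2)
        # os.replace overwrites an existing target in a single rename step.
        os.replace(tmp_path, path)
        return True
    except OSError:
        # Clean up the temp file; ignore a failure to do so.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        return False
```

Serializing with `json.dump` also escapes quotes and backslashes in values, which the unquoted heredoc interpolation in the shell version does not attempt.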
@@ -1718,6 +1798,9 @@ EVIDENCE_EOF
              >/dev/null 2>&1 || true
      fi
 
+     # P1-1: durable audit record for the block path too (see _write_evidence_details).
+     _write_evidence_details "block"
+
      return 1
  }
 
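Because the record is now written on both the pass and the block path, it can be inspected after the fact. The sketch below shows one way to summarize it in Python. It assumes only the field names visible in the heredoc above; `summarize_evidence` is a hypothetical helper, and every value in `SAMPLE` (timestamp, SHA, counts) is made up for illustration.

```python
import json

# Illustrative record only: the keys follow the heredoc, the values are invented.
SAMPLE = """
{
  "recorded_at": "2025-01-01T00:00:00Z",
  "iteration": 3,
  "verdict": "pass",
  "diff": {"ok": true, "base_sha": "abc123", "files_changed": 4,
           "inconclusive": false, "inconclusive_reason": ""},
  "tests": {"ok": true, "runner": "none", "pass": true,
            "inconclusive": true, "inconclusive_reason": "no_test_runner"}
}
"""


def summarize_evidence(details):
    """Return the verdict plus any inconclusive reasons as one short string."""
    notes = []
    for section in ("diff", "tests"):
        block = details.get(section, {})
        if block.get("inconclusive"):
            notes.append(f"{section} inconclusive: {block.get('inconclusive_reason', '')}")
    suffix = f" ({'; '.join(notes)})" if notes else ""
    return details.get("verdict", "unknown") + suffix


print(summarize_evidence(json.loads(SAMPLE)))
# prints: pass (tests inconclusive: no_test_runner)
```

A real reader would probably also want to catch `json.JSONDecodeError`, since the shell writer interpolates values without escaping them.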
@@ -170,6 +170,14 @@ class PrdAnalyzer:
          self.feature_count = 0
          self.scope = "unknown"
          self.score = 0.0
+         # Deterministic structural validation result (P2-5). Populated by
+         # validate_structure() during analyze(). Shape:
+         #   {"ok": bool, "issues": [str, ...], "warnings": [str, ...]}
+         # issues = structural problems that would likely yield a garbage
+         #   checklist (no headings, unparseable, basic contradictions).
+         # warnings = lower-confidence findings worth surfacing but not blocking
+         #   (e.g. a referenced local doc that does not exist).
+         self.structure = {"ok": True, "issues": [], "warnings": []}
 
      def load(self):
          """Load and validate the PRD file, optionally appending architecture doc."""
@@ -190,6 +198,11 @@ class PrdAnalyzer:
      def analyze(self):
          """Run all analysis dimensions and compute score."""
          self.load()
+         # P2-5: deterministic structural validation runs BEFORE the rest of the
+         # analysis (and therefore before the checklist is extracted downstream)
+         # so a malformed/contradictory spec is flagged early with an actionable
+         # message instead of silently producing a garbage checklist.
+         self.validate_structure()
          total_weight = 0.0
          earned_weight = 0.0
 
@@ -281,6 +294,157 @@ class PrdAnalyzer:
          elif word_count > 500 and self.scope == "small":
              self.scope = "medium"
 
+     def validate_structure(self):
+         """Deterministic structural validation of the spec (P2-5).
+
+         Runs before checklist extraction so a malformed/contradictory spec is
+         caught early with an actionable message rather than producing a garbage
+         checklist. All checks are regex/stdlib based and deterministic.
+
+         Severity policy: only a TRULY UNUSABLE spec (no readable text / binary
+         garbage) is an ISSUE (Status FAIL). Everything else is a WARNING, so a
+         shallow heuristic never marks a valid spec FAIL. This is deliberate:
+         the one-line-brief input mode is supported, and nothing downstream
+         currently blocks on FAIL anyway (see deferral note below), so WARNING
+         is the honest severity for "structure-thin but possibly valid input".
+
+         ISSUE (high confidence, Status FAIL):
+           1. Parseable / decodable text -- must contain readable word
+              characters and not be majority-undecodable bytes. A file of pure
+              punctuation or binary content cannot yield a real checklist.
+
+         WARNINGS (surfaced early, do not flip Status to FAIL):
+           2. Headings present -- at least one Markdown heading
+              (``# ...``) so sections can be located. An "all prose" spec with
+              zero structure yields a less reliable checklist, but a one-line
+              brief is still valid input -> WARNING, not FAIL.
+           3. Referenced LOCAL docs exist -- only explicit Markdown links to
+              LOCAL, RELATIVE files (``[text](./relative.md)``), resolved
+              against the PRD's parent directory. Specs legitimately describe
+              files to be BUILT, so a missing path is a WARNING; only docs the
+              author claims already exist are flagged, and only as a warning.
+           4. Trivial self-contradiction -- the same requirement phrased as
+              both "must X" and "must not X" on an identical short predicate.
+              This is a shallow LEXICAL heuristic only: it has no notion of the
+              subject, so "all data must be encrypted" + "public assets must
+              not be encrypted" collide on the predicate "be encrypted" even
+              though they do not actually conflict. Because of that
+              false-positive risk it is a WARNING, never an ISSUE. It does NOT
+              do semantic contradiction detection, cross-section reasoning, or
+              circular dependency analysis -- that deeper work lives in the
+              spec-interrogation pipeline, not here.
+
+         Populates ``self.structure`` = {"ok", "issues", "warnings"}. Empty/
+         missing-file cases are already raised by ``load()`` before this runs.
+         Never raises; never changes the process exit code (callers such as
+         run.sh invoke the analyzer best-effort and gate on the observations
+         file, not the exit status).
+         """
+         issues = []
+         warnings = []
+
+         text = self.content or ""
+
+         # --- Check 1: parseable / decodable -------------------------------
+         # load() already replaced undecodable bytes with U+FFFD and rejected
+         # empty content. A spec that is overwhelmingly replacement characters
+         # or has no word characters at all is effectively unparseable.
+         word_chars = len(re.findall(r"\w", text))
+         replacement_chars = text.count("�")
+         if word_chars == 0:
+             issues.append(
+                 "Spec contains no readable text (no word characters found). "
+                 "Provide a Markdown/plain-text spec with actual requirements."
+             )
+         elif replacement_chars > 0 and replacement_chars > word_chars:
+             issues.append(
+                 "Spec appears to be binary or wrong-encoding content "
+                 f"({replacement_chars} undecodable bytes vs {word_chars} text "
+                 "characters). Provide a UTF-8 Markdown/plain-text spec."
+             )
+
+         # --- Check 2: headings present ------------------------------------
+         # Use self.lines so the (optional) architecture doc counts too.
+         heading_count = sum(1 for ln in self.lines if re.match(r"^\s{0,3}#{1,6}\s+\S", ln))
+         if heading_count == 0:
+             warnings.append(
+                 "Spec has no Markdown headings (no '# ...' lines). The checklist "
+                 "will be guessed from unstructured prose, which is less reliable. "
+                 "Add section headings (e.g. ## Features, ## Acceptance Criteria) "
+                 "if this is more than a one-line brief."
+             )
+
+         # --- Check 3: referenced LOCAL relative docs exist ----------------
+         # Only flag explicit Markdown links to local, relative paths. URLs,
+         # anchors, mailto, and absolute paths are skipped. A PRD describes
+         # files to be BUILT, so a missing path is a WARNING, not a hard issue.
+         base_dir = self.prd_path.parent if self.prd_path.parent != Path("") else Path(".")
+         seen_targets = set()
+         for m in re.finditer(r"\[[^\]]+\]\(([^)]+)\)", text):
+             target = m.group(1).strip()
+             # Strip an optional title: [t](path "title")
+             target = target.split()[0] if target else target
+             if not target or target in seen_targets:
+                 continue
+             seen_targets.add(target)
+             low = target.lower()
+             # Skip non-local references.
+             if (
+                 "://" in target
+                 or low.startswith(("http:", "https:", "ftp:", "mailto:", "tel:", "#", "data:"))
+                 or target.startswith("/")
+                 or target.startswith("~")
+             ):
+                 continue
+             # Only consider links that look like a doc/asset reference, i.e.
+             # they have a file extension. A bare word in parens is more likely
+             # to be incidental than a real file reference.
+             stem = target.split("#")[0].split("?")[0]
+             if "." not in os.path.basename(stem):
+                 continue
+             candidate = (base_dir / stem)
+             try:
+                 exists = candidate.exists()
+             except OSError:
+                 exists = False
+             if not exists:
+                 warnings.append(
+                     f"Referenced local file not found: '{target}' "
+                     f"(resolved to '{candidate}'). If this doc is supposed to "
+                     "already exist, add it or fix the link; if it describes "
+                     "something to be built, this can be ignored."
+                 )
+
+         # --- Check 4: trivial self-contradiction (BASIC, shallow) ---------
+         # Catch only the most obvious lexical case: identical short predicate
+         # appearing as both "must <p>" and "must not <p>". This is a deliberate
+         # shallow heuristic. Real contradiction/circularity detection is out of
+         # scope here (see spec-interrogation pipeline).
+         must_pos = {}
+         must_neg = set()
+         for ln in self.lines:
+             low = ln.lower()
+             for mm in re.finditer(r"\bmust\s+not\s+([a-z][a-z0-9 _-]{2,40}?)(?=[.,;:)]|$)", low):
+                 must_neg.add(mm.group(1).strip())
+             for mm in re.finditer(r"\bmust\s+(?!not\b)([a-z][a-z0-9 _-]{2,40}?)(?=[.,;:)]|$)", low):
+                 pred = mm.group(1).strip()
+                 must_pos.setdefault(pred, ln.strip()[:120])
+         contradictions = sorted(set(must_pos) & must_neg)
+         for pred in contradictions[:5]:
+             warnings.append(
+                 f"Possible self-contradiction: the spec says both 'must {pred}' "
+                 f"and 'must not {pred}'. If these apply to the same subject, "
+                 "resolve the conflict. (Basic lexical check, ignores subject; "
+                 "may be a false positive -- review manually.)"
+             )
+
+         self.structure = {
+             "ok": len(issues) == 0,
+             "issues": issues,
+             "warnings": warnings,
+         }
+         return self.structure
+
      def generate_observations(self):
          """Generate the observations markdown content."""
          now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
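The false positive described in the docstring for check 4 is easy to reproduce in isolation. The sketch below lifts the two regular expressions from the hunk above into a standalone function. The name `find_lexical_contradictions` and the sample sentences are hypothetical, and the real method also records the source line and caps the output at five predicates, which this sketch omits.

```python
import re

# Patterns copied from check 4: a short predicate after "must not" / "must".
NEG = re.compile(r"\bmust\s+not\s+([a-z][a-z0-9 _-]{2,40}?)(?=[.,;:)]|$)")
POS = re.compile(r"\bmust\s+(?!not\b)([a-z][a-z0-9 _-]{2,40}?)(?=[.,;:)]|$)")


def find_lexical_contradictions(lines):
    """Return predicates that appear as both 'must <p>' and 'must not <p>'."""
    positive, negative = set(), set()
    for line in lines:
        low = line.lower()
        negative.update(m.group(1).strip() for m in NEG.finditer(low))
        positive.update(m.group(1).strip() for m in POS.finditer(low))
    return sorted(positive & negative)


# Different subjects, same predicate: flagged even though there is no conflict.
print(find_lexical_contradictions([
    "All data must be encrypted.",
    "Public assets must not be encrypted.",
]))
# prints: ['be encrypted']

# Different predicates: nothing is flagged.
print(find_lexical_contradictions([
    "The API must return JSON.",
    "The API must not return XML.",
]))
# prints: []
```

The first case shows why the diff reports these matches as warnings rather than issues: the check compares predicates only and has no view of the subject.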
@@ -309,9 +473,40 @@ class PrdAnalyzer:
              f"**Quality Score:** {self.score}/10",
              f"**Estimated Scope:** {self.scope} (~{self.feature_count} items detected)",
              f"",
+         ]
+
+         # P2-5: structural validation section. This is the durable, visible
+         # channel: run.sh invokes the analyzer with stderr discarded and the
+         # exit code swallowed, so the observations file is what downstream
+         # readers (and the operator) actually see.
+         struct = getattr(self, "structure", {"ok": True, "issues": [], "warnings": []})
+         status = "PASS" if struct.get("ok", True) else "FAIL"
+         lines.append("## Structural Validation")
+         lines.append("")
+         lines.append(f"**Status:** {status}")
+         lines.append("")
+         if struct.get("issues"):
+             lines.append("Structural issues detected (fix these before relying "
+                          "on the generated checklist):")
+             lines.append("")
+             for issue in struct["issues"]:
+                 lines.append(f"- {issue}")
+             lines.append("")
+         if struct.get("warnings"):
+             lines.append("Warnings (lower confidence, review manually):")
+             lines.append("")
+             for warn in struct["warnings"]:
+                 lines.append(f"- {warn}")
+             lines.append("")
+         if struct.get("ok") and not struct.get("warnings"):
+             lines.append("- Spec is parseable, has headings, and has no "
+                          "obvious self-contradictions.")
+             lines.append("")
+
+         lines.extend([
              f"## Strengths",
              f"",
-         ]
+         ])
          if strengths:
              lines.extend(strengths)
          else:
@@ -475,6 +670,25 @@ def main():
 
          write_atomic(args.output, observations)
          print(f"PRD analysis complete: score={analyzer.score}/10 scope={analyzer.scope}")
+         # P2-5: surface structural validation on stdout (run.sh keeps stdout in
+         # its log; only stderr is discarded). Exit code intentionally stays 0
+         # for a structurally-suspect-but-non-empty spec to match the
+         # best-effort, never-blocks contract other callers rely on.
+         struct = getattr(analyzer, "structure", {"ok": True, "issues": [], "warnings": []})
+         if not struct.get("ok", True):
+             print(
+                 "PRD structure check: FAIL ("
+                 + f"{len(struct.get('issues', []))} issue(s)) -- "
+                 + "see Structural Validation section in observations"
+             )
+         elif struct.get("warnings"):
+             print(
+                 "PRD structure check: PASS with "
+                 + f"{len(struct['warnings'])} warning(s) -- "
+                 + "see Structural Validation section in observations"
+             )
+         else:
+             print("PRD structure check: PASS")
          print(f"Observations written to: {args.output}")
 
      except FileNotFoundError as e: