npm - loki-mode - Versions diffs - 7.48.0 → 7.50.0 - Mend

loki-mode 7.48.0 → 7.50.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (25) hide show

package/README.md +2 -2
package/SKILL.md +2 -2
package/VERSION +1 -1
package/autonomy/completion-council.sh +85 -2
package/autonomy/prd-analyzer.py +215 -1
package/autonomy/prd-checklist.sh +315 -0
package/autonomy/run.sh +124 -0
package/autonomy/spec-interrogation.sh +224 -4
package/autonomy/spec.sh +25 -16
package/autonomy/verify.sh +108 -26
package/dashboard/__init__.py +1 -1
package/dashboard/api_v2.py +258 -12
package/dashboard/audit.py +202 -21
package/dashboard/server.py +64 -10
package/docs/INSTALLATION.md +2 -2
package/docs/siem-integration.md +102 -0
package/loki-ts/dist/loki.js +231 -230
package/mcp/__init__.py +1 -1
package/package.json +1 -1
package/plugins/loki-mode/.claude-plugin/plugin.json +1 -1
package/references/invariant-checks.md +109 -0
package/src/audit/crosslink.js +413 -0
package/src/audit/index.js +32 -0
package/src/observability/siem-export.js +424 -0
package/src/policies/cost.js +270 -1

package/README.md CHANGED Viewed

@@ -106,7 +106,7 @@ loki quick "build a landing page with a signup form"
 | **Bun (recommended)** | `bun install -g loki-mode` | Fastest startup for CLI commands. |
 | **Homebrew** | `brew tap asklokesh/tap && brew install loki-mode` | Auto-installs Bun as a dep |
 | **Docker (easiest)** | `loki docker start prd.md` | Host wrapper: runs loki in the published image with zero config. Bind-mounts the current folder so `.loki` state, resume, and continuity work exactly like local. Auto-detects auth (`ANTHROPIC_API_KEY`, else your host Claude Code login). Needs loki + Docker on the host. See DOCKER_README.md |
-| **Docker (raw)** | `docker pull asklokesh/loki-mode:7.45.0 && docker run --rm -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" asklokesh/loki-mode:7.45.0 start prd.md` | Bun + Claude CLI pre-installed; needs an API key, or use docker compose with a .env file, see DOCKER_README.md |
+| **Docker (raw)** | `docker pull asklokesh/loki-mode:7.50.0 && docker run --rm -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" asklokesh/loki-mode:7.50.0 start prd.md` | Bun + Claude CLI pre-installed; needs an API key, or use docker compose with a .env file, see DOCKER_README.md |
 | **npm (compat)** | `npm install -g loki-mode` | Works without Bun (bash fallback). Migrate any time with `loki self-update --to bun`. |
 **Upgrading:**
@@ -166,7 +166,7 @@ The next major release sunsets the Bash runtime entirely. There is no firm calen
 | Method | Command |
 |--------|---------|
 | **Homebrew** | `brew tap asklokesh/tap && brew install loki-mode` |
-| **Docker** | `docker pull asklokesh/loki-mode:7.45.0` |
+| **Docker** | `docker pull asklokesh/loki-mode:7.50.0` |
 | **Inside Claude Code** | `claude --dangerously-skip-permissions` then type "Loki Mode" |
 | **Git clone** | `git clone https://github.com/asklokesh/loki-mode.git` |

package/SKILL.md CHANGED Viewed

@@ -3,7 +3,7 @@ name: loki-mode
 description: Autonomous spec-driven build system with a built-in trust layer. It does not call work done until it is verified (RARV-C closure loop, 8 quality gates, completion council, verified-completion evidence gate). Triggers on "Loki Mode". Takes a spec (PRD, GitHub issue, OpenAPI doc, etc.) to deployed product with minimal human intervention. Provider-agnostic. Requires --dangerously-skip-permissions flag.
 ---
-# Loki Mode v7.48.0
+# Loki Mode v7.50.0
 **You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.**
@@ -407,4 +407,4 @@ See `CHANGELOG.md` entries [7.5.7], [7.5.8], [7.5.13] for the per-fix list and r
 ---
-**v7.48.0 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**
+**v7.50.0 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**

package/VERSION CHANGED Viewed

	@@ -1 +1 @@
1	- 7.48.0
1	+ 7.50.0

package/autonomy/completion-council.sh CHANGED Viewed

@@ -1569,6 +1569,19 @@ council_evidence_gate() {
     local test_fails="false"
     local test_runner="none"
     local test_pass="true"
+    # P1-1 (evidence-gate loophole): track WHY the test signal is not conclusive
+    # positive evidence, mirroring diff_inconclusive. A project that ran NO test
+    # suite (runner=="none") must NOT count as affirmative "tests are green"
+    # evidence -- absence of tests is not proof of correctness. We classify it as
+    # INCONCLUSIVE (not FAIL: a no-tests project is still allowed to complete, it
+    # just may not lean on tests as positive proof), so the no-tests "done" routes
+    # to the completion council's affirmative vote instead of silently passing on
+    # diff-alone. test_inconclusive is pass-through by construction: it never sets
+    # test_fails and never writes evidence-block.json, exactly like
+    # diff_inconclusive. Opt-out (LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE=1) reverts to
+    # the historical behavior where runner=="none" was an affirmative PASS.
+    local test_inconclusive="false"
+    local test_inconclusive_reason=""
     if [ -f "$tr_file" ]; then
         local test_status
         test_status=$(_TR_FILE="$tr_file" python3 -c "
@@ -1597,9 +1610,25 @@ else:
             test_fails="true"
         fi
         # INCONCLUSIVE => test_fails stays "false" => pass-through.
+        # No test suite ran: a present results file that records runner=="none"
+        # is not affirmative evidence. Route to council (inconclusive), not a
+        # silent diff-alone pass. Default-on; LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE=1
+        # restores the old affirmative-PASS behavior.
+        if [ "$test_runner" = "none" ] && [ "${LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE:-0}" != "1" ]; then
+            test_inconclusive="true"
+            test_inconclusive_reason="no_test_runner"
+        fi
+    else
+        # Missing test-results.json: no suite was recorded at all. Like the
+        # runner=="none" case this is not affirmative evidence, so classify it
+        # inconclusive (still pass-through: test_fails stays "false"). Preserves
+        # the historical "no file = no gate" non-blocking behavior while making
+        # the absence auditable instead of silently affirmative.
+        if [ "${LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE:-0}" != "1" ]; then
+            test_inconclusive="true"
+            test_inconclusive_reason="no_test_results"
+        fi
     fi
-    # Missing test-results.json (the else of the -f check) likewise leaves
-    # test_fails="false" => inconclusive => pass-through (no file = no gate).
     # --- v7.28.0: inconclusive-baseline lifecycle -------------------------------
     # When the gate cannot establish a diff baseline (no git repo, or no run-start
@@ -1635,12 +1664,63 @@ INCONCLUSIVE_EOF
         fi
     fi
+    # --- P1-1: durable, auditable evidence-gate details -------------------------
+    # Persist the full evidence picture on EVERY gate run (pass and block) so any
+    # completion claim is auditable after the fact: diff status, test runner +
+    # status, both inconclusive reasons, and the final verdict. Atomic temp+mv,
+    # under .loki/council/ (already excluded from the diff union by the gate's own
+    # ^\.loki/ filter, so it never makes the gate toothless). Best-effort: a write
+    # failure never changes the gate's decision. _write_evidence_details <verdict>
+    # where verdict is one of pass|block (the caller passes the decided verdict).
+    _write_evidence_details() {
+        local _verdict="$1"
+        mkdir -p "$COUNCIL_STATE_DIR" 2>/dev/null || true
+        local _det_file="$COUNCIL_STATE_DIR/evidence-gate-details.json"
+        local _det_tmp="${_det_file}.tmp"
+        local _det_ts
+        _det_ts=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
+        local _diff_ok _tests_ok
+        if [ "$diff_fails" = "true" ]; then _diff_ok="false"; else _diff_ok="true"; fi
+        if [ "$test_fails" = "true" ]; then _tests_ok="false"; else _tests_ok="true"; fi
+        cat > "$_det_tmp" << DETAILS_EOF
+{
+    "recorded_at": "$_det_ts",
+    "iteration": ${ITERATION_COUNT:-0},
+    "verdict": "$_verdict",
+    "diff": {
+        "ok": $_diff_ok,
+        "base_sha": "${base_sha:-}",
+        "files_changed": $diff_files,
+        "inconclusive": $diff_inconclusive,
+        "inconclusive_reason": "$diff_inconclusive_reason"
+    },
+    "tests": {
+        "ok": $_tests_ok,
+        "runner": "$test_runner",
+        "pass": $test_pass,
+        "inconclusive": $test_inconclusive,
+        "inconclusive_reason": "$test_inconclusive_reason"
+    }
+}
+DETAILS_EOF
+        mv "$_det_tmp" "$_det_file" 2>/dev/null || rm -f "$_det_tmp" 2>/dev/null || true
+    }
     # --- Block decision: block iff DIFF FAILS or TEST FAILS ---
     if [ "$diff_fails" != "true" ] && [ "$test_fails" != "true" ]; then
         # Gate passes: remove any stale block report.
         if [ -f "$COUNCIL_STATE_DIR/evidence-block.json" ]; then
             rm -f "$COUNCIL_STATE_DIR/evidence-block.json"
         fi
+        # P1-1: when the gate passes ONLY because no test suite ran, say so out
+        # loud. The pass is pass-through (no-tests must not deadlock), but a
+        # completion that is not backed by any test evidence should never slip by
+        # silently. The durable detail is in evidence-gate-details.json; this is
+        # the human-visible honesty at the pass site.
+        if [ "$test_inconclusive" = "true" ]; then
+            log_warn "[Council] Evidence gate: completion not backed by test evidence (${test_inconclusive_reason}). Pass-through; set LOKI_EVIDENCE_NO_TESTS_AFFIRMATIVE=1 to treat no-tests as affirmative."
+        fi
+        _write_evidence_details "pass"
         return 0
     fi
@@ -1718,6 +1798,9 @@ EVIDENCE_EOF
             >/dev/null 2>&1 || true
     fi
+    # P1-1: durable audit record for the block path too (see _write_evidence_details).
+    _write_evidence_details "block"
     return 1
 }

package/autonomy/prd-analyzer.py CHANGED Viewed

@@ -170,6 +170,14 @@ class PrdAnalyzer:
         self.feature_count = 0
         self.scope = "unknown"
         self.score = 0.0
+        # Deterministic structural validation result (P2-5). Populated by
+        # validate_structure() during analyze(). Shape:
+        #   {"ok": bool, "issues": [str, ...], "warnings": [str, ...]}
+        # issues   = structural problems that would likely yield a garbage
+        #            checklist (no headings, unparseable, basic contradictions).
+        # warnings = lower-confidence findings worth surfacing but not blocking
+        #            (e.g. a referenced local doc that does not exist).
+        self.structure = {"ok": True, "issues": [], "warnings": []}
     def load(self):
         """Load and validate the PRD file, optionally appending architecture doc."""
@@ -190,6 +198,11 @@ class PrdAnalyzer:
     def analyze(self):
         """Run all analysis dimensions and compute score."""
         self.load()
+        # P2-5: deterministic structural validation runs BEFORE the rest of the
+        # analysis (and therefore before the checklist is extracted downstream)
+        # so a malformed/contradictory spec is flagged early with an actionable
+        # message instead of silently producing a garbage checklist.
+        self.validate_structure()
         total_weight = 0.0
         earned_weight = 0.0
@@ -281,6 +294,157 @@ class PrdAnalyzer:
         elif word_count > 500 and self.scope == "small":
             self.scope = "medium"
+    def validate_structure(self):
+        """Deterministic structural validation of the spec (P2-5).
+        Runs before checklist extraction so a malformed/contradictory spec is
+        caught early with an actionable message rather than producing a garbage
+        checklist. All checks are regex/stdlib based and deterministic.
+        Severity policy: only a TRULY UNUSABLE spec (no readable text / binary
+        garbage) is an ISSUE (Status FAIL). Everything else is a WARNING, so a
+        shallow heuristic never marks a valid spec FAIL. This is deliberate:
+        the one-line-brief input mode is supported, and nothing downstream
+        currently blocks on FAIL anyway (see deferral note below), so WARNING
+        is the honest severity for "structure-thin but possibly valid input".
+        ISSUE (high confidence, Status FAIL):
+          1. Parseable / decodable text  -- must contain readable word
+             characters and not be majority-undecodable bytes. A file of pure
+             punctuation or binary content cannot yield a real checklist.
+        WARNINGS (surfaced early, do not flip Status to FAIL):
+          2. Headings present            -- at least one Markdown heading
+             (``# ...``) so sections can be located. An "all prose" spec with
+             zero structure yields a less reliable checklist, but a one-line
+             brief is still valid input -> WARNING, not FAIL.
+          3. Referenced LOCAL docs exist -- only explicit Markdown links to
+             LOCAL, RELATIVE files (``[text](./relative.md)``), resolved
+             against the PRD's parent directory. Specs legitimately describe
+             files to be BUILT, so a missing path is a WARNING; only docs the
+             author claims already exist are flagged, and only as a warning.
+          4. Trivial self-contradiction  -- the same requirement phrased as
+             both "must X" and "must not X" on an identical short predicate.
+             This is a shallow LEXICAL heuristic only: it has no notion of the
+             subject, so "all data must be encrypted" + "public assets must
+             not be encrypted" collide on the predicate "be encrypted" even
+             though they do not actually conflict. Because of that
+             false-positive risk it is a WARNING, never an ISSUE. It does NOT
+             do semantic contradiction detection, cross-section reasoning, or
+             circular dependency analysis -- that deeper work lives in the
+             spec-interrogation pipeline, not here.
+        Populates ``self.structure`` = {"ok", "issues", "warnings"}. Empty/
+        missing-file cases are already raised by ``load()`` before this runs.
+        Never raises; never changes the process exit code (callers such as
+        run.sh invoke the analyzer best-effort and gate on the observations
+        file, not the exit status).
+        """
+        issues = []
+        warnings = []
+        text = self.content or ""
+        # --- Check 1: parseable / decodable -------------------------------
+        # load() already replaced undecodable bytes with U+FFFD and rejected
+        # empty content. A spec that is overwhelmingly replacement characters
+        # or has no word characters at all is effectively unparseable.
+        word_chars = len(re.findall(r"\w", text))
+        replacement_chars = text.count("�")
+        if word_chars == 0:
+            issues.append(
+                "Spec contains no readable text (no word characters found). "
+                "Provide a Markdown/plain-text spec with actual requirements."
+            )
+        elif replacement_chars > 0 and replacement_chars > word_chars:
+            issues.append(
+                "Spec appears to be binary or wrong-encoding content "
+                f"({replacement_chars} undecodable bytes vs {word_chars} text "
+                "characters). Provide a UTF-8 Markdown/plain-text spec."
+            )
+        # --- Check 2: headings present ------------------------------------
+        # Use self.lines so the (optional) architecture doc counts too.
+        heading_count = sum(1 for ln in self.lines if re.match(r"^\s{0,3}#{1,6}\s+\S", ln))
+        if heading_count == 0:
+            warnings.append(
+                "Spec has no Markdown headings (no '# ...' lines). The checklist "
+                "will be guessed from unstructured prose, which is less reliable. "
+                "Add section headings (e.g. ## Features, ## Acceptance Criteria) "
+                "if this is more than a one-line brief."
+            )
+        # --- Check 3: referenced LOCAL relative docs exist ----------------
+        # Only flag explicit Markdown links to local, relative paths. URLs,
+        # anchors, mailto, and absolute paths are skipped. A PRD describes
+        # files to be BUILT, so a missing path is a WARNING, not a hard issue.
+        base_dir = self.prd_path.parent if self.prd_path.parent != Path("") else Path(".")
+        seen_targets = set()
+        for m in re.finditer(r"\[[^\]]+\]\(([^)]+)\)", text):
+            target = m.group(1).strip()
+            # Strip an optional title: [t](path "title")
+            target = target.split()[0] if target else target
+            if not target or target in seen_targets:
+                continue
+            seen_targets.add(target)
+            low = target.lower()
+            # Skip non-local references.
+            if (
+                "://" in target
+                or low.startswith(("http:", "https:", "ftp:", "mailto:", "tel:", "#", "data:"))
+                or target.startswith("/")
+                or target.startswith("~")
+            ):
+                continue
+            # Only consider links that look like a doc/asset reference, i.e.
+            # they have a file extension. A bare word in parens is more likely
+            # to be incidental than a real file reference.
+            stem = target.split("#")[0].split("?")[0]
+            if "." not in os.path.basename(stem):
+                continue
+            candidate = (base_dir / stem)
+            try:
+                exists = candidate.exists()
+            except OSError:
+                exists = False
+            if not exists:
+                warnings.append(
+                    f"Referenced local file not found: '{target}' "
+                    f"(resolved to '{candidate}'). If this doc is supposed to "
+                    "already exist, add it or fix the link; if it describes "
+                    "something to be built, this can be ignored."
+                )
+        # --- Check 4: trivial self-contradiction (BASIC, shallow) ---------
+        # Catch only the most obvious lexical case: identical short predicate
+        # appearing as both "must <p>" and "must not <p>". This is a deliberate
+        # shallow heuristic. Real contradiction/circularity detection is out of
+        # scope here (see spec-interrogation pipeline).
+        must_pos = {}
+        must_neg = set()
+        for ln in self.lines:
+            low = ln.lower()
+            for mm in re.finditer(r"\bmust\s+not\s+([a-z][a-z0-9 _-]{2,40}?)(?=[.,;:)]|$)", low):
+                must_neg.add(mm.group(1).strip())
+            for mm in re.finditer(r"\bmust\s+(?!not\b)([a-z][a-z0-9 _-]{2,40}?)(?=[.,;:)]|$)", low):
+                pred = mm.group(1).strip()
+                must_pos.setdefault(pred, ln.strip()[:120])
+        contradictions = sorted(set(must_pos) & must_neg)
+        for pred in contradictions[:5]:
+            warnings.append(
+                f"Possible self-contradiction: the spec says both 'must {pred}' "
+                f"and 'must not {pred}'. If these apply to the same subject, "
+                "resolve the conflict. (Basic lexical check, ignores subject; "
+                "may be a false positive -- review manually.)"
+            )
+        self.structure = {
+            "ok": len(issues) == 0,
+            "issues": issues,
+            "warnings": warnings,
+        }
+        return self.structure
     def generate_observations(self):
         """Generate the observations markdown content."""
         now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
@@ -309,9 +473,40 @@ class PrdAnalyzer:
             f"**Quality Score:** {self.score}/10",
             f"**Estimated Scope:** {self.scope} (~{self.feature_count} items detected)",
             f"",
+        ]
+        # P2-5: structural validation section. This is the durable, visible
+        # channel: run.sh invokes the analyzer with stderr discarded and the
+        # exit code swallowed, so the observations file is what downstream
+        # readers (and the operator) actually see.
+        struct = getattr(self, "structure", {"ok": True, "issues": [], "warnings": []})
+        status = "PASS" if struct.get("ok", True) else "FAIL"
+        lines.append("## Structural Validation")
+        lines.append("")
+        lines.append(f"**Status:** {status}")
+        lines.append("")
+        if struct.get("issues"):
+            lines.append("Structural issues detected (fix these before relying "
+                         "on the generated checklist):")
+            lines.append("")
+            for issue in struct["issues"]:
+                lines.append(f"- {issue}")
+            lines.append("")
+        if struct.get("warnings"):
+            lines.append("Warnings (lower confidence, review manually):")
+            lines.append("")
+            for warn in struct["warnings"]:
+                lines.append(f"- {warn}")
+            lines.append("")
+        if struct.get("ok") and not struct.get("warnings"):
+            lines.append("- Spec is parseable, has headings, and has no "
+                         "obvious self-contradictions.")
+            lines.append("")
+        lines.extend([
             f"## Strengths",
             f"",
-        ]
+        ])
         if strengths:
             lines.extend(strengths)
         else:
@@ -475,6 +670,25 @@ def main():
         write_atomic(args.output, observations)
         print(f"PRD analysis complete: score={analyzer.score}/10 scope={analyzer.scope}")
+        # P2-5: surface structural validation on stdout (run.sh keeps stdout in
+        # its log; only stderr is discarded). Exit code intentionally stays 0
+        # for a structurally-suspect-but-non-empty spec to match the
+        # best-effort, never-blocks contract other callers rely on.
+        struct = getattr(analyzer, "structure", {"ok": True, "issues": [], "warnings": []})
+        if not struct.get("ok", True):
+            print(
+                "PRD structure check: FAIL ("
+                + f"{len(struct.get('issues', []))} issue(s)) -- "
+                + "see Structural Validation section in observations"
+            )
+        elif struct.get("warnings"):
+            print(
+                "PRD structure check: PASS with "
+                + f"{len(struct['warnings'])} warning(s) -- "
+                + "see Structural Validation section in observations"
+            )
+        else:
+            print("PRD structure check: PASS")
         print(f"Observations written to: {args.output}")
     except FileNotFoundError as e: