loki-mode 7.14.0 → 7.16.0
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/lib/trust_trajectory.py +437 -0
- package/autonomy/loki +238 -5
- package/bin/loki +2 -1
- package/dashboard/__init__.py +1 -1
- package/dashboard/server.py +112 -0
- package/dashboard/static/index.html +26 -2
- package/dashboard/static/trust.html +271 -0
- package/docs/INSTALLATION.md +1 -1
- package/docs/OPEN-CORE-BOUNDARY.md +58 -0
- package/docs/R4-TRUST-TRAJECTORY-DESIGN.md +127 -0
- package/docs/R9-OPEN-CORE-HOOKS-PLAN.md +113 -0
- package/loki-ts/dist/loki.js +263 -221
- package/mcp/__init__.py +1 -1
- package/package.json +1 -1
|
@@ -0,0 +1,271 @@
|
|
|
1
|
+
<!DOCTYPE html>
|
|
2
|
+
<!--
|
|
3
|
+
Loki Mode - Trust trajectory panel (R4, zero-build standalone).
|
|
4
|
+
|
|
5
|
+
Self-contained: all CSS + JS inlined, no external resources. Fetches
|
|
6
|
+
/api/trust/trajectory and renders, per project over runs/time, whether the
|
|
7
|
+
agent is EARNING autonomy on THIS repo: council pass-rate, gate pass-rate,
|
|
8
|
+
iterations-to-completion, and (when recorded) human interventions, each with
|
|
9
|
+
an up/down/flat direction and an inline-SVG sparkline.
|
|
10
|
+
|
|
11
|
+
The story no competitor tells. Honest-data rule: with fewer than 2 runs this
|
|
12
|
+
shows "not enough history yet", never a fabricated trend.
|
|
13
|
+
-->
|
|
14
|
+
<html lang="en">
|
|
15
|
+
<head>
|
|
16
|
+
<meta charset="utf-8">
|
|
17
|
+
<meta name="viewport" content="width=device-width, initial-scale=1">
|
|
18
|
+
<title>Loki Mode - Trust Trajectory</title>
|
|
19
|
+
<style>
|
|
20
|
+
:root {
|
|
21
|
+
--bg: #0f1115; --panel: #171a21; --panel-2: #1d2129; --border: #2a2f3a;
|
|
22
|
+
--text: #e7e9ee; --muted: #9aa1ad; --faint: #6b7280; --accent: #6f7bf7;
|
|
23
|
+
--green: #34d399; --red: #f87171; --amber: #fbbf24;
|
|
24
|
+
--mono: ui-monospace, "SF Mono", "Menlo", "Consolas", monospace;
|
|
25
|
+
--sans: 'Inter', system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif;
|
|
26
|
+
}
|
|
27
|
+
* { box-sizing: border-box; }
|
|
28
|
+
body { margin: 0; background: var(--bg); color: var(--text); font-family: var(--sans); line-height: 1.5; }
|
|
29
|
+
a { color: var(--accent); text-decoration: none; }
|
|
30
|
+
a:hover { text-decoration: underline; }
|
|
31
|
+
.wrap { max-width: 960px; margin: 0 auto; padding: 40px 20px 80px; }
|
|
32
|
+
.head { display: flex; align-items: baseline; justify-content: space-between; margin-bottom: 8px; }
|
|
33
|
+
h1 { font-size: 24px; font-weight: 650; letter-spacing: -0.3px; margin: 0; }
|
|
34
|
+
h2 { font-size: 15px; font-weight: 600; color: var(--muted); margin: 30px 0 12px; text-transform: uppercase; letter-spacing: 0.5px; }
|
|
35
|
+
.head a { font-size: 13px; }
|
|
36
|
+
.sub { color: var(--muted); font-size: 14px; margin: 0 0 26px; }
|
|
37
|
+
.cards { display: flex; gap: 14px; flex-wrap: wrap; }
|
|
38
|
+
.card { flex: 1 1 200px; background: var(--panel); border: 1px solid var(--border); border-radius: 12px; padding: 16px 18px; }
|
|
39
|
+
.card .label { color: var(--muted); font-size: 12px; text-transform: uppercase; letter-spacing: 0.5px; }
|
|
40
|
+
.card .val { font-family: var(--mono); font-size: 26px; font-weight: 650; margin-top: 6px; }
|
|
41
|
+
.card .note { color: var(--faint); font-size: 12px; margin-top: 4px; }
|
|
42
|
+
.axes { display: flex; flex-direction: column; gap: 12px; }
|
|
43
|
+
.axis { background: var(--panel); border: 1px solid var(--border); border-radius: 12px; padding: 16px 18px; display: flex; align-items: center; gap: 16px; }
|
|
44
|
+
.axis .meta { flex: 1 1 auto; min-width: 0; }
|
|
45
|
+
.axis .name { font-size: 14px; font-weight: 600; }
|
|
46
|
+
.axis .desc { color: var(--faint); font-size: 12px; margin-top: 2px; }
|
|
47
|
+
.axis .spark { flex: 0 0 200px; }
|
|
48
|
+
.axis .verdict { flex: 0 0 150px; text-align: right; }
|
|
49
|
+
.axis .dir { font-family: var(--mono); font-size: 14px; font-weight: 650; }
|
|
50
|
+
.axis .tag { font-size: 12px; margin-top: 2px; }
|
|
51
|
+
.dir.up { color: var(--green); }
|
|
52
|
+
.dir.down { color: var(--green); }
|
|
53
|
+
.dir.bad { color: var(--red); }
|
|
54
|
+
.dir.flat { color: var(--muted); }
|
|
55
|
+
.tag.good { color: var(--green); }
|
|
56
|
+
.tag.bad { color: var(--red); }
|
|
57
|
+
.tag.flat { color: var(--muted); }
|
|
58
|
+
.tag.na { color: var(--faint); }
|
|
59
|
+
svg { display: block; width: 100%; height: 40px; }
|
|
60
|
+
table { width: 100%; border-collapse: collapse; font-size: 13px; }
|
|
61
|
+
th, td { text-align: left; padding: 8px 10px; border-bottom: 1px solid var(--border); }
|
|
62
|
+
th { color: var(--muted); font-weight: 600; font-size: 12px; text-transform: uppercase; letter-spacing: 0.4px; }
|
|
63
|
+
td.num, th.num { text-align: right; font-family: var(--mono); }
|
|
64
|
+
.badge { font-size: 12px; font-weight: 600; padding: 2px 8px; border-radius: 6px; border: 1px solid var(--border); }
|
|
65
|
+
.b-approve { color: var(--green); border-color: rgba(52,211,153,0.4); }
|
|
66
|
+
.b-reject { color: var(--red); border-color: rgba(248,113,113,0.4); }
|
|
67
|
+
.empty { color: var(--muted); background: var(--panel); border: 1px solid var(--border); border-radius: 12px; padding: 24px; text-align: center; }
|
|
68
|
+
.empty code { font-family: var(--mono); color: var(--text); background: var(--panel-2); padding: 2px 6px; border-radius: 5px; }
|
|
69
|
+
.headline { background: var(--panel); border: 1px solid var(--border); border-radius: 12px; padding: 18px; font-size: 15px; }
|
|
70
|
+
.headline.good { border-color: rgba(52,211,153,0.4); }
|
|
71
|
+
.headline.bad { border-color: rgba(248,113,113,0.4); }
|
|
72
|
+
.mono { font-family: var(--mono); }
|
|
73
|
+
.muted { color: var(--muted); }
|
|
74
|
+
</style>
|
|
75
|
+
</head>
|
|
76
|
+
<body>
|
|
77
|
+
<div class="wrap">
|
|
78
|
+
<div class="head">
|
|
79
|
+
<h1>Trust Trajectory</h1>
|
|
80
|
+
<a href="/">Back to dashboard</a>
|
|
81
|
+
</div>
|
|
82
|
+
<p class="sub">Is the agent earning autonomy on THIS repo? Council pass-rate, gate pass-rate, iterations-to-completion, and human interventions over your run history. Real council and RARV-C data, never a fabricated trend.</p>
|
|
83
|
+
<div id="content"><p class="sub">Loading...</p></div>
|
|
84
|
+
</div>
|
|
85
|
+
<script>
|
|
86
|
+
(function () {
|
|
87
|
+
"use strict";
|
|
88
|
+
function esc(s) {
|
|
89
|
+
s = (s === null || s === undefined) ? "" : String(s);
|
|
90
|
+
return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;")
|
|
91
|
+
.replace(/"/g, "&quot;").replace(/'/g, "&#39;");
|
|
92
|
+
}
|
|
93
|
+
function pct(n) {
|
|
94
|
+
if (n === null || n === undefined) return "n/a";
|
|
95
|
+
n = Number(n);
|
|
96
|
+
if (!isFinite(n)) return "n/a";
|
|
97
|
+
return (n * 100).toFixed(0) + "%";
|
|
98
|
+
}
|
|
99
|
+
function num(n) {
|
|
100
|
+
if (n === null || n === undefined) return "n/a";
|
|
101
|
+
n = Number(n);
|
|
102
|
+
if (!isFinite(n)) return "n/a";
|
|
103
|
+
return String(Math.round(n * 100) / 100);
|
|
104
|
+
}
|
|
105
|
+
function badgeClass(v) {
|
|
106
|
+
v = String(v || "").toUpperCase();
|
|
107
|
+
if (v.indexOf("APPROVE") === 0 || v.indexOf("COMPLETE") === 0 || v === "PASS" || v === "PASSED") return "b-approve";
|
|
108
|
+
if (v.indexOf("REJECT") === 0 || v.indexOf("BLOCK") === 0 || v === "FAIL") return "b-reject";
|
|
109
|
+
return "";
|
|
110
|
+
}
|
|
111
|
+
// Arrow glyphs only (no emoji): ^ up, v down, - flat.
|
|
112
|
+
function arrow(direction) {
|
|
113
|
+
if (direction === "up") return "^";
|
|
114
|
+
if (direction === "down") return "v";
|
|
115
|
+
return "-";
|
|
116
|
+
}
|
|
117
|
+
// Build an inline-SVG sparkline from a numeric series (nulls skipped).
|
|
118
|
+
function sparkline(values, higherIsBetter) {
|
|
119
|
+
var pts = [];
|
|
120
|
+
var idx = [];
|
|
121
|
+
for (var i = 0; i < values.length; i++) {
|
|
122
|
+
var v = values[i];
|
|
123
|
+
if (v === null || v === undefined || !isFinite(Number(v))) continue;
|
|
124
|
+
pts.push(Number(v)); idx.push(i);
|
|
125
|
+
}
|
|
126
|
+
if (pts.length === 0) return '<span class="muted" style="font-size:12px;">no data</span>';
|
|
127
|
+
var W = 200, H = 40, pad = 4;
|
|
128
|
+
var min = Math.min.apply(null, pts), max = Math.max.apply(null, pts);
|
|
129
|
+
var range = (max - min) || 1;
|
|
130
|
+
var coords = [];
|
|
131
|
+
for (var j = 0; j < pts.length; j++) {
|
|
132
|
+
var x = pts.length === 1 ? W / 2 : pad + (j / (pts.length - 1)) * (W - 2 * pad);
|
|
133
|
+
var y = H - pad - ((pts[j] - min) / range) * (H - 2 * pad);
|
|
134
|
+
coords.push(x.toFixed(1) + "," + y.toFixed(1));
|
|
135
|
+
}
|
|
136
|
+
// Color the line by whether the last value is better than the first.
|
|
137
|
+
var stroke = "#9aa1ad";
|
|
138
|
+
if (pts.length >= 2) {
|
|
139
|
+
var rising = pts[pts.length - 1] > pts[0];
|
|
140
|
+
var falling = pts[pts.length - 1] < pts[0];
|
|
141
|
+
if ((rising && higherIsBetter) || (falling && !higherIsBetter)) stroke = "#34d399";
|
|
142
|
+
else if ((rising && !higherIsBetter) || (falling && higherIsBetter)) stroke = "#f87171";
|
|
143
|
+
}
|
|
144
|
+
return '<svg viewBox="0 0 ' + W + ' ' + H + '" preserveAspectRatio="none">' +
|
|
145
|
+
'<polyline fill="none" stroke="' + stroke + '" stroke-width="2" points="' + coords.join(" ") + '"/>' +
|
|
146
|
+
'</svg>';
|
|
147
|
+
}
|
|
148
|
+
|
|
149
|
+
var AXIS_DESC = {
|
|
150
|
+
council_pass_rate: "Share of runs the 3-reviewer council approved",
|
|
151
|
+
gate_pass_rate: "Share of quality gates passed per run",
|
|
152
|
+
iterations: "RARV iterations needed to complete a run",
|
|
153
|
+
interventions: "Human interventions needed per run"
|
|
154
|
+
};
|
|
155
|
+
|
|
156
|
+
function renderAxis(key, ax, series) {
|
|
157
|
+
var label = ax.label || key;
|
|
158
|
+
var desc = AXIS_DESC[key] || "";
|
|
159
|
+
var values = series.map(function (s) { return s[key]; });
|
|
160
|
+
var spark = sparkline(values, !!ax.higher_is_better);
|
|
161
|
+
var verdict;
|
|
162
|
+
if (!ax.available) {
|
|
163
|
+
verdict = '<div class="dir flat">-</div><div class="tag na">no data</div>';
|
|
164
|
+
} else if (ax.insufficient) {
|
|
165
|
+
verdict = '<div class="dir flat">-</div><div class="tag na">need 2+ runs</div>';
|
|
166
|
+
} else {
|
|
167
|
+
var dir = ax.direction || "flat";
|
|
168
|
+
var dirClass = dir === "flat" ? "flat" : (ax.improving ? (dir === "up" ? "up" : "down") : "bad");
|
|
169
|
+
var tagClass = ax.improving === true ? "good" : (ax.improving === false ? "bad" : "flat");
|
|
170
|
+
var tagText = ax.improving === true ? "improving" : (ax.improving === false ? "regressing" : "stable");
|
|
171
|
+
var latestStr = ax.higher_is_better ? pct(ax.latest) : num(ax.latest);
|
|
172
|
+
verdict = '<div class="dir ' + dirClass + '">' + arrow(dir) + ' ' + esc(dir) + '</div>' +
|
|
173
|
+
'<div class="tag ' + tagClass + '">' + tagText + ' (now ' + esc(latestStr) + ')</div>';
|
|
174
|
+
}
|
|
175
|
+
return '<div class="axis">' +
|
|
176
|
+
'<div class="meta"><div class="name">' + esc(label) + '</div>' +
|
|
177
|
+
'<div class="desc">' + esc(desc) + (ax.higher_is_better ? " (higher is better)" : " (lower is better)") + '</div></div>' +
|
|
178
|
+
'<div class="spark">' + spark + '</div>' +
|
|
179
|
+
'<div class="verdict">' + verdict + '</div>' +
|
|
180
|
+
'</div>';
|
|
181
|
+
}
|
|
182
|
+
|
|
183
|
+
function renderRuns(series) {
|
|
184
|
+
var html = '<h2>Run history</h2>';
|
|
185
|
+
if (!series || series.length === 0) {
|
|
186
|
+
html += '<div class="empty">No completed runs yet. Trust trajectory comes' +
|
|
187
|
+
' from proof-of-run artifacts (<code>.loki/proofs/</code>), written at' +
|
|
188
|
+
' the end of each run.</div>';
|
|
189
|
+
return html;
|
|
190
|
+
}
|
|
191
|
+
var rows = "";
|
|
192
|
+
for (var i = 0; i < series.length; i++) {
|
|
193
|
+
var s = series[i];
|
|
194
|
+
var cp = s.council_pass_rate;
|
|
195
|
+
var verdict = (cp === 1) ? '<span class="badge b-approve">PASS</span>'
|
|
196
|
+
: (cp === 0 ? '<span class="badge b-reject">FAIL</span>' : '<span class="muted">-</span>');
|
|
197
|
+
rows += '<tr><td class="mono">' + esc(s.run_id) + '</td>' +
|
|
198
|
+
'<td class="muted">' + esc(s.generated_at || "") + '</td>' +
|
|
199
|
+
'<td>' + verdict + '</td>' +
|
|
200
|
+
'<td class="num">' + (s.gate_pass_rate === null || s.gate_pass_rate === undefined ? "-" : pct(s.gate_pass_rate)) + '</td>' +
|
|
201
|
+
'<td class="num">' + (s.iterations === null || s.iterations === undefined ? "-" : esc(s.iterations)) + '</td>' +
|
|
202
|
+
'<td class="num">' + (s.interventions === null || s.interventions === undefined ? "-" : esc(s.interventions)) + '</td></tr>';
|
|
203
|
+
}
|
|
204
|
+
html += '<table><thead><tr>' +
|
|
205
|
+
'<th>Run</th><th>When</th><th>Council</th><th class="num">Gates</th>' +
|
|
206
|
+
'<th class="num">Iters</th><th class="num">Interv.</th>' +
|
|
207
|
+
'</tr></thead><tbody>' + rows + '</tbody></table>';
|
|
208
|
+
return html;
|
|
209
|
+
}
|
|
210
|
+
|
|
211
|
+
function render(d) {
|
|
212
|
+
var c = document.getElementById("content");
|
|
213
|
+
if (d && d.available === false) {
|
|
214
|
+
c.innerHTML = '<div class="empty">Trust trajectory is unavailable in this' +
|
|
215
|
+
' install. ' + esc((d.notes && d.notes[0]) || "") + '</div>';
|
|
216
|
+
return;
|
|
217
|
+
}
|
|
218
|
+
var series = d.series || [];
|
|
219
|
+
if (d.insufficient) {
|
|
220
|
+
var msg = '<div class="headline"><strong>Not enough history yet.</strong><br>' +
|
|
221
|
+
'Trust trajectory needs 2 or more recorded runs to show a direction. ' +
|
|
222
|
+
esc(d.runs_count || 0) + ' run(s) recorded so far. Run <code>loki start</code> again' +
|
|
223
|
+
' and the trend appears here, derived from real council and gate results.</div>';
|
|
224
|
+
c.innerHTML = msg + renderRuns(series);
|
|
225
|
+
return;
|
|
226
|
+
}
|
|
227
|
+
var axes = d.axes || {};
|
|
228
|
+
var imp = d.improving_count || 0, reg = d.regressing_count || 0;
|
|
229
|
+
var hClass = (imp && !reg) ? "good" : (reg && !imp ? "bad" : "");
|
|
230
|
+
var headline;
|
|
231
|
+
if (imp && !reg) headline = "Trending more trustworthy: " + imp + (imp === 1 ? " axis" : " axes") + " improving, none regressing on this repo.";
|
|
232
|
+
else if (reg && !imp) headline = "Trust regressing: " + reg + (reg === 1 ? " axis" : " axes") + " regressing. Review recent runs.";
|
|
233
|
+
else if (imp || reg) headline = "Mixed: " + imp + " improving, " + reg + " regressing.";
|
|
234
|
+
else headline = "Stable: no significant change across axes yet.";
|
|
235
|
+
|
|
236
|
+
var cards = '<div class="cards">' +
|
|
237
|
+
'<div class="card"><div class="label">Runs analyzed</div>' +
|
|
238
|
+
'<div class="val">' + esc(d.runs_count || 0) + '</div>' +
|
|
239
|
+
'<div class="note">from .loki/proofs/</div></div>' +
|
|
240
|
+
'<div class="card"><div class="label">Improving axes</div>' +
|
|
241
|
+
'<div class="val" style="color:var(--green);">' + esc(imp) + '</div>' +
|
|
242
|
+
'<div class="note">good direction</div></div>' +
|
|
243
|
+
'<div class="card"><div class="label">Regressing axes</div>' +
|
|
244
|
+
'<div class="val" style="color:' + (reg ? 'var(--red)' : 'var(--muted)') + ';">' + esc(reg) + '</div>' +
|
|
245
|
+
'<div class="note">needs attention</div></div>' +
|
|
246
|
+
'</div>';
|
|
247
|
+
|
|
248
|
+
var axesHtml = '<h2>Earned-autonomy signals</h2><div class="axes">';
|
|
249
|
+
var order = ["council_pass_rate", "gate_pass_rate", "iterations", "interventions"];
|
|
250
|
+
for (var k = 0; k < order.length; k++) {
|
|
251
|
+
if (axes[order[k]]) axesHtml += renderAxis(order[k], axes[order[k]], series);
|
|
252
|
+
}
|
|
253
|
+
axesHtml += '</div>';
|
|
254
|
+
|
|
255
|
+
c.innerHTML = cards +
|
|
256
|
+
'<div class="headline ' + hClass + '" style="margin-top:14px;">' + esc(headline) + '</div>' +
|
|
257
|
+
axesHtml +
|
|
258
|
+
renderRuns(series);
|
|
259
|
+
}
|
|
260
|
+
function renderError(msg) {
|
|
261
|
+
document.getElementById("content").innerHTML =
|
|
262
|
+
'<div class="empty">Could not load trust data. ' + esc(msg || "") + "</div>";
|
|
263
|
+
}
|
|
264
|
+
fetch("/api/trust/trajectory", { headers: { "Accept": "application/json" } })
|
|
265
|
+
.then(function (r) { if (!r.ok) throw new Error("HTTP " + r.status); return r.json(); })
|
|
266
|
+
.then(function (d) { render(d || {}); })
|
|
267
|
+
.catch(function (e) { renderError(e && e.message); });
|
|
268
|
+
})();
|
|
269
|
+
</script>
|
|
270
|
+
</body>
|
|
271
|
+
</html>
|
package/docs/INSTALLATION.md
CHANGED
|
@@ -0,0 +1,58 @@
|
|
|
1
|
+
# Loki Mode open-core boundary
|
|
2
|
+
|
|
3
|
+
Loki Mode is and stays open source. This document draws the line between what is
|
|
4
|
+
free forever and what hosted/paid/enterprise plans would add on top. R9 ships
|
|
5
|
+
the SEAMS for that line; it does not ship a hosted backend, a license server, or
|
|
6
|
+
any paywall on existing functionality.
|
|
7
|
+
|
|
8
|
+
## Principle
|
|
9
|
+
|
|
10
|
+
OSS is fully functional with zero hosted backend. Every capability Loki has
|
|
11
|
+
today runs locally, free, with no account, no license key, and no network call
|
|
12
|
+
to any Loki service. Hosted/paid features are ADDITIVE convenience and
|
|
13
|
+
team/enterprise layers, never a removal or gating of something that is free
|
|
14
|
+
today.
|
|
15
|
+
|
|
16
|
+
## Free forever (OSS, the default)
|
|
17
|
+
|
|
18
|
+
Everything that exists today, including:
|
|
19
|
+
|
|
20
|
+
- The full RARV-C autonomous loop (`loki start`), all providers
|
|
21
|
+
(Claude/Cline/Codex/Aider), multi-project, dashboard, memory system.
|
|
22
|
+
- 3-reviewer council + RARV-C closure (the trust engine).
|
|
23
|
+
- proof-of-run generation and local inspection: `loki proof list|show|open`.
|
|
24
|
+
- Sharing a proof to a GitHub Gist: `loki proof share <id>` (uses your own `gh`
|
|
25
|
+
auth; no Loki service involved).
|
|
26
|
+
- Benchmark harness (`loki bench`), healing (`loki heal`), all CLI commands.
|
|
27
|
+
- Self-hosting the hosted publish endpoint: `loki proof share --hosted` posts to
|
|
28
|
+
YOUR `LOKI_HOSTED_ENDPOINT`. Running your own endpoint is free.
|
|
29
|
+
- Enterprise auth seams that already exist and are env-gated, not paywalled:
|
|
30
|
+
token auth (`LOKI_ENTERPRISE_AUTH`), OIDC/SSO (`LOKI_OIDC_*`), audit logging.
|
|
31
|
+
|
|
32
|
+
The default tier is `oss` (`LOKI_TIER` unset or `oss`). In OSS tier the
|
|
33
|
+
tier/license gate is a no-op that allows everything.
|
|
34
|
+
|
|
35
|
+
## What hosted / paid / enterprise would add (seams, not yet built)
|
|
36
|
+
|
|
37
|
+
These are the attachment points R9 reserves. None of them are live; none gate
|
|
38
|
+
any free feature.
|
|
39
|
+
|
|
40
|
+
| Capability | Seam (env / hook) | Status |
|
|
41
|
+
|---|---|---|
|
|
42
|
+
| Hosted proof publishing to a managed Loki URL (instead of a gist or self-hosted endpoint) | `LOKI_HOSTED_ENDPOINT` + `loki proof share --hosted` | Seam only. No official Loki endpoint exists. Operators can point it at their own. |
|
|
43
|
+
| Tier / license entitlement | `LOKI_TIER` (default `oss`), `LOKI_LICENSE_KEY`, `loki_tier_gate` (bash) / `tierGate` (Bun) | Seam only. No verification backend. OSS = allow-all no-op. |
|
|
44
|
+
| Managed team memory / cross-project sync | `LOKI_MANAGED_MEMORY` (pre-existing) | Pre-existing gated seam. |
|
|
45
|
+
| Enterprise SSO / RBAC / audit retention | `LOKI_ENTERPRISE_AUTH`, `LOKI_OIDC_*` | Pre-existing env seams (free to self-configure). |
|
|
46
|
+
|
|
47
|
+
A future hosted build would replace the honest stubs (no-op allow / "backend not
|
|
48
|
+
available" messages) with real verification and a managed endpoint. Until then,
|
|
49
|
+
the stubs are labeled honestly and never fabricate a hosted service or URL.
|
|
50
|
+
|
|
51
|
+
## Integrity rules (binding for any future hosted work)
|
|
52
|
+
|
|
53
|
+
1. Never gate or remove a feature that is free today.
|
|
54
|
+
2. Never fabricate a hosted URL, a successful license verification, or a hosted
|
|
55
|
+
service that does not exist. Honest "not available yet" messaging only.
|
|
56
|
+
3. OSS path must work with zero hosted env vars set.
|
|
57
|
+
4. Any artifact published via a hosted seam must pass through the same redactor
|
|
58
|
+
the gist path uses (`proof_redact`); never publish an unredacted artifact.
|
|
@@ -0,0 +1,127 @@
|
|
|
1
|
+
# R4: Visible Trust Trajectory - Design Note
|
|
2
|
+
|
|
3
|
+
Status: implemented in worktree (not yet merged). Author: R4 release team.
|
|
4
|
+
Verified against live source on 2026-06-03 (v7.8.3 worktree base; R1/R3/R5
|
|
5
|
+
already shipped, so the arc is further along than the loki-plan doc states).
|
|
6
|
+
|
|
7
|
+
## The story no competitor tells
|
|
8
|
+
|
|
9
|
+
Devin, Cursor, Windsurf, Claude Code, Aider et al. show you a single run.
|
|
10
|
+
None show you whether the agent is getting more trustworthy on YOUR repo over
|
|
11
|
+
time. Loki already runs a 3-reviewer council + RARV-C closure on every run and
|
|
12
|
+
persists the result. R4 makes the resulting TRUST TRAJECTORY visible:
|
|
13
|
+
|
|
14
|
+
- council approve-rate trending UP
|
|
15
|
+
- gate pass-rate trending UP
|
|
16
|
+
- iterations-to-completion trending DOWN
|
|
17
|
+
- human interventions trending DOWN
|
|
18
|
+
|
|
19
|
+
If the agent is earning autonomy on this repo, the trajectory shows it. That is
|
|
20
|
+
compounding, repo-specific proof of trust, which in turn drives stickiness.
|
|
21
|
+
|
|
22
|
+
## Honest-data rule (non-negotiable)
|
|
23
|
+
|
|
24
|
+
Every number derives from REAL persisted run records. Never fabricate a trend.
|
|
25
|
+
With fewer than 2 runs, the trajectory is reported as "not enough history yet"
|
|
26
|
+
(insufficient=true), never a fake direction.
|
|
27
|
+
|
|
28
|
+
## Data source (REUSED, not new)
|
|
29
|
+
|
|
30
|
+
R3 already established `.loki/proofs/<run_id>/proof.json` as the persistent,
|
|
31
|
+
one-per-run history record (written by `autonomy/lib/proof-generator.py` at run
|
|
32
|
+
completion, on both success and failure, unless `LOKI_PROOF=0`). The R3 cost
|
|
33
|
+
timeline endpoint (`dashboard/server.py` `/api/cost/timeline`) already mines
|
|
34
|
+
this exact directory for per-run cost history.
|
|
35
|
+
|
|
36
|
+
R4 mines the SAME directory for the trust signals already present in each
|
|
37
|
+
proof.json:
|
|
38
|
+
|
|
39
|
+
| Trust signal | proof.json path | Notes |
|
|
40
|
+
|-------------------------|------------------------------------------|-------|
|
|
41
|
+
| council pass (per run) | `council.final_verdict` | APPROVE/APPROVED/COMPLETE => pass |
|
|
42
|
+
| council ratio (per run) | `council.reviewers[].vote` (APPROVE/...) | secondary signal when verdict absent |
|
|
43
|
+
| gate pass-rate (per run)| `quality_gates.passed` / `.total` | already aggregated by generator |
|
|
44
|
+
| iterations (per run) | `iterations` (int or {count}) | iterations-to-completion |
|
|
45
|
+
| files changed (per run) | `files_changed.count` | context, not a trust axis |
|
|
46
|
+
| timestamp | `generated_at` (ISO 8601) | ordering axis |
|
|
47
|
+
|
|
48
|
+
Human interventions: there is no per-run intervention counter persisted in
|
|
49
|
+
proof.json today. Rather than fabricate one or add new instrumentation in this
|
|
50
|
+
slice, R4 reports interventions as a derived best-effort signal ONLY when the
|
|
51
|
+
proof carries it (`council.interventions` or top-level `interventions`), and
|
|
52
|
+
otherwise marks that axis `available=false` with an honest note. This keeps the
|
|
53
|
+
honest-data rule intact and leaves a clean seam for a future per-run
|
|
54
|
+
intervention counter (a one-line add in proof-generator.py).
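
A minimal sketch of how the signals in the table above might be read from one parsed proof.json. The helper name `extract_signals`, the top-level `run_id` key, and the exact fallback order are assumptions for illustration, not the contents of `trust_trajectory.py`; only the proof.json paths come from the table.

```python
from typing import Any, Optional

# Verdict prefixes treated as a council pass, per the table above.
PASS_PREFIXES = ("APPROVE", "COMPLETE")


def _is_pass(verdict: Any) -> bool:
    """APPROVE / APPROVED / COMPLETE all count as a pass."""
    return str(verdict or "").strip().upper().startswith(PASS_PREFIXES)


def extract_signals(proof: dict) -> dict:
    """Hypothetical helper: pull per-run trust signals out of one proof.json."""
    council = proof.get("council") or {}
    gates = proof.get("quality_gates") or {}

    # Primary signal: final verdict. Secondary: share of approving reviewers.
    council_rate: Optional[float] = None
    if council.get("final_verdict"):
        council_rate = 1.0 if _is_pass(council["final_verdict"]) else 0.0
    elif council.get("reviewers"):
        votes = [_is_pass(r.get("vote")) for r in council["reviewers"]]
        council_rate = sum(votes) / len(votes)

    total = gates.get("total")
    gate_rate = gates.get("passed", 0) / total if total else None

    # `iterations` may be a bare int or an object carrying `count`.
    iterations = proof.get("iterations")
    if isinstance(iterations, dict):
        iterations = iterations.get("count")

    # Interventions are reported only when the proof carries them;
    # otherwise the value stays None and the axis is unavailable.
    interventions = council.get("interventions", proof.get("interventions"))

    return {
        "run_id": proof.get("run_id"),  # assumed key name
        "generated_at": proof.get("generated_at"),
        "council_pass_rate": council_rate,
        "gate_pass_rate": gate_rate,
        "iterations": iterations,
        "interventions": interventions,
    }
```

Missing values stay `None` rather than defaulting to zero, so a later aggregation step can mark an axis `available=false` instead of plotting a made-up point.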
|
|
55
|
+
|
|
56
|
+
## Direction calculation (up / down / flat)
|
|
57
|
+
|
|
58
|
+
For each numeric axis across the time-ordered run series:
|
|
59
|
+
|
|
60
|
+
1. Split the series into an earlier half and a later half (median split; odd
|
|
61
|
+
counts drop the middle point so the two halves never overlap).
|
|
62
|
+
2. Compare the mean of the later half vs the earlier half.
|
|
63
|
+
3. delta = later_mean - earlier_mean. Direction:
|
|
64
|
+
- `flat` if |delta| <= epsilon (epsilon scaled per axis; rates use 0.01).
|
|
65
|
+
- `up` / `down` by sign of delta.
|
|
66
|
+
4. "Good direction" is axis-specific: higher is better for council/gate pass
|
|
67
|
+
rates; lower is better for iterations + interventions. The `improving`
|
|
68
|
+
boolean encodes whether the direction is the good one, so the UI can color
|
|
69
|
+
green/red without re-encoding the per-axis polarity.
|
|
70
|
+
|
|
71
|
+
Rationale for half-split vs least-squares slope: half-split is robust to a
|
|
72
|
+
single noisy run, needs no float regression in bash, and is trivially testable
|
|
73
|
+
with fixtures. A 2-run series degrades to last-vs-first, which is correct.
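
The half-split steps above can be sketched as follows. This is an illustration under stated assumptions, not the shipped module: the function name `direction`, dropping `None` values before the split, and applying the rate epsilon of 0.01 as the default for every axis are all choices made here for the example.

```python
from statistics import mean
from typing import Optional, Sequence


def direction(values: Sequence[Optional[float]],
              higher_is_better: bool,
              epsilon: float = 0.01) -> dict:
    """Half-split direction for one axis over a time-ordered series."""
    pts = [float(v) for v in values if v is not None]
    if len(pts) < 2:
        # Honest-data rule: no direction without at least two data points.
        return {"insufficient": True, "direction": None, "improving": None}

    half = len(pts) // 2
    earlier = pts[:half]
    later = pts[len(pts) - half:]  # odd counts drop the middle point
    delta = mean(later) - mean(earlier)

    if abs(delta) <= epsilon:
        return {"insufficient": False, "direction": "flat", "improving": None}

    going_up = delta > 0
    return {
        "insufficient": False,
        "direction": "up" if going_up else "down",
        # Good direction is axis-specific: up for rates, down for counts.
        "improving": going_up == higher_is_better,
    }
```

With two points, `half` is 1, so the comparison reduces to last-vs-first as the rationale describes. Returning `improving=None` for a flat axis lets a UI distinguish "stable" from "regressing" without re-deriving polarity.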
|
|
74
|
+
|
|
75
|
+
## Persistence (under .loki/metrics/, REUSED dir)
|
|
76
|
+
|
|
77
|
+
The aggregated trajectory is persisted to
|
|
78
|
+
`.loki/metrics/trust-trajectory.json` (schema_version 1). This is a derived
|
|
79
|
+
cache, written by the `loki trust` command and the dashboard endpoint so other
|
|
80
|
+
surfaces can read a single source of truth. It is NOT authoritative state: it
|
|
81
|
+
is always recomputable from `.loki/proofs/`. Deleting it loses nothing.
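
A minimal sketch of writing that derived cache. Only the path and `schema_version` come from the text above; the function name `write_trajectory_cache` and the temp-file-then-rename step are assumptions added here so a concurrent reader never sees a partial file.

```python
import json
import os
import tempfile
from pathlib import Path


def write_trajectory_cache(project_root: str, trajectory: dict) -> Path:
    """Write the derived cache; safe to delete, recomputable from proofs."""
    metrics_dir = Path(project_root) / ".loki" / "metrics"
    metrics_dir.mkdir(parents=True, exist_ok=True)
    target = metrics_dir / "trust-trajectory.json"

    payload = {"schema_version": 1, **trajectory}

    # Write to a temp file in the same directory, then rename over the
    # target, so the replacement happens in a single step.
    fd, tmp_name = tempfile.mkstemp(dir=str(metrics_dir), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)
    os.replace(tmp_name, target)
    return target
```

Because the file is a cache rather than state, a failed or skipped write costs nothing: the next `loki trust` call or dashboard request can rebuild it from `.loki/proofs/`.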
|
|
82
|
+
|
|
83
|
+
## Surfaces
|
|
84
|
+
|
|
85
|
+
1. CLI: `loki trust [--json]` (NEW Bun-native command, mirrors `loki kpis`
|
|
86
|
+
exactly). Falls through to a bash `cmd_trust` when bun is absent (kpis had
|
|
87
|
+
no bash fallback; R4 adds one because the Python derivation is shared and
|
|
88
|
+
trivial to call from bash, giving real bash+Bun parity).
|
|
89
|
+
- `loki kpis` stays a single-run snapshot. R4 does NOT duplicate it; `trust`
|
|
90
|
+
is the across-runs trajectory view. `loki kpis` output gains a one-line
|
|
91
|
+
pointer to `loki trust` (no behavior change).
|
|
92
|
+
|
|
93
|
+
2. Dashboard endpoint: `GET /api/trust/trajectory` (NEW, mirrors
|
|
94
|
+
`/api/cost/timeline`). Reads `.loki/proofs/*/proof.json`, returns the
|
|
95
|
+
per-run series + per-axis direction + insufficient flag.
|
|
96
|
+
|
|
97
|
+
3. Dashboard panel: standalone `dashboard/static/trust.html` + `/trust` route
|
|
98
|
+
(mirrors `cost.html` + `/cost`), plus a nav entry and SPA section in
|
|
99
|
+
`build-standalone.js` (mirrors the cost panel wiring exactly).
|
|
100
|
+
|
|
101
|
+
4. WS push: the `_push_loki_state_loop` broadcasts a `trust_update` message
|
|
102
|
+
when the trajectory's overall improving-count changes (mirrors the R3
|
|
103
|
+
`budget_status` transition push). No new channel; reuses manager.broadcast.
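
The transition-only push in item 4 could look roughly like this. The class name `TrustUpdateNotifier`, the message fields, and the choice to emit on the first observation are assumptions for illustration; the real loop lives in `_push_loki_state_loop` and is not shown in this diff.

```python
from typing import Optional


class TrustUpdateNotifier:
    """Emit a `trust_update` message only when improving_count changes."""

    def __init__(self) -> None:
        self._last_improving: Optional[int] = None

    def check(self, trajectory: dict) -> Optional[dict]:
        current = int(trajectory.get("improving_count") or 0)
        if current == self._last_improving:
            return None  # no transition, nothing to broadcast
        self._last_improving = current
        # Assumed message shape; a caller would hand this to the
        # existing broadcast mechanism.
        return {"type": "trust_update", "improving_count": current}
```

Comparing against the last seen value keeps the push loop quiet on every tick where nothing changed, which is the same idea as the `budget_status` transition push mentioned above.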
|
|
104
|
+
|
|
105
|
+
## Parity + no-duplication audit
|
|
106
|
+
|
|
107
|
+
- Data: reuses `.loki/proofs/` (R1/R3). No new run-time instrumentation.
|
|
108
|
+
- Endpoint: new route, but copies the `/api/cost/timeline` read pattern and
|
|
109
|
+
the `_proofs_dir()` / `_safe_json_read` helpers in spirit.
|
|
110
|
+
- Panel: new `trust.html`, structurally a sibling of `cost.html`.
|
|
111
|
+
- CLI: new `trust`, structurally a sibling of `kpis`. `kpis` unchanged except a
|
|
112
|
+
one-line see-also.
|
|
113
|
+
- Shared derivation: a single Python module
|
|
114
|
+
(`autonomy/lib/trust_trajectory.py`) is the source of truth; the dashboard
|
|
115
|
+
endpoint imports it, and the bash `cmd_trust` shells out to it. The Bun
|
|
116
|
+
command reimplements the same pure logic in TS (parity-tested), matching how
|
|
117
|
+
`kpis` has both a TS derivation and reads the same JSON the bash side writes.
|
|
118
|
+
|
|
119
|
+
## Test plan
|
|
120
|
+
|
|
121
|
+
- Python: `tests/test_trust_trajectory.py` - aggregation from fixture
|
|
122
|
+
proof.json files, direction calc (up/down/flat) per axis polarity, the
|
|
123
|
+
insufficient-history (<2 runs) case, no-PII (only derived numbers + run_id +
|
|
124
|
+
timestamps leave the function), malformed proof.json skipped not fatal.
|
|
125
|
+
- TS: `loki-ts/tests/metrics/trust.test.ts` - same aggregation + direction
|
|
126
|
+
parity on identical fixtures, insufficient case, JSON/human formatting.
|
|
127
|
+
- All mocked from on-disk fixtures. No provider calls, no paid calls.
|
|
@@ -0,0 +1,113 @@
|
|
|
1
|
+
# R9: Hosted/paid open-core hooks - design note
|
|
2
|
+
|
|
3
|
+
Status: SEAMS implemented (this worktree). NOT a live hosted backend.
|
|
4
|
+
|
|
5
|
+
R9 in the competitive-stickiness arc is the open-core monetization layer: keep
|
|
6
|
+
Loki fully open source and free, while adding the SEAMS where hosted, enterprise,
|
|
7
|
+
and paid plans would attach later. R9 ships the seams only. There is no Loki
|
|
8
|
+
hosted service, no license-verification backend, and no paid gate on any
|
|
9
|
+
existing feature. Every honest stub is labeled as such.
|
|
10
|
+
|
|
11
|
+
## What already existed (verified in source, pre-R9)
|
|
12
|
+
|
|
13
|
+
- proof-of-run `public_url` seam: `autonomy/lib/proof-generator.py:392` writes
|
|
14
|
+
`"deployment": {"deployed_url": deployed_url, "public_url": None}`. The
|
|
15
|
+
`public_url` field is reserved and always null today (no hosted publish wrote
|
|
16
|
+
it). R9 does NOT populate it (see "Deliberate gaps").
|
|
17
|
+
- `loki proof share --hosted` stub: BOTH routes errored "Hosted publishing is
|
|
18
|
+
not available yet (coming in R9)" -- bash `autonomy/loki` (share case in
|
|
19
|
+
`cmd_proof`) and Bun `loki-ts/src/commands/proof.ts` (`shareProof`). This was
|
|
20
|
+
the explicit seam to implement.
|
|
21
|
+
- `loki proof share` (gist): default opt-in publish via `gh gist create`
|
|
22
|
+
through `_loki_gist_upload` (bash) / `shareProof` (Bun). Redaction-preview +
|
|
23
|
+
confirm. This is the free path and stays byte-unchanged.
|
|
24
|
+
- `cmd_enterprise` (`autonomy/loki`): already env-driven feature flags
|
|
25
|
+
(`LOKI_ENTERPRISE_AUTH`, `LOKI_OIDC_ISSUER`/`LOKI_OIDC_CLIENT_ID`,
|
|
26
|
+
`LOKI_AUDIT_DISABLED`/`LOKI_ENTERPRISE_AUTH`). Good precedent: enterprise
|
|
27
|
+
features are env-gated seams, not paywalls. R9 follows the same pattern.
|
|
28
|
+
- No `LOKI_TIER`, `LOKI_LICENSE_KEY`, or `LOKI_HOSTED_ENDPOINT` existed anywhere
|
|
29
|
+
before R9. All three are new with this change.
|
|
30
|
+
- Redaction: the generator redacts the proof ONCE before writing index.html and
|
|
31
|
+
records `redaction.applied` in proof.json (`proof-generator.py`, module
|
|
32
|
+
`proof_redact`). The share path publishes the already-redacted artifact; it
|
|
33
|
+
does not run a second redaction pass.
|
|
34
|
+
|
|
35
|
+
## What R9 adds (seams, no backend)
|
|
36
|
+
|
|
37
|
+
1. Hosted proof-publish seam. `loki proof share --hosted <id>`:
|
|
38
|
+
- If `LOKI_HOSTED_ENDPOINT` is set: POST the ALREADY-REDACTED `index.html`
|
|
39
|
+
(the same bytes the gist path would publish) to that endpoint. On 2xx,
|
|
40
|
+
print the URL the endpoint returned (`url` or `public_url` JSON field), or,
|
|
41
|
+
if none, the endpoint itself + HTTP status. NEVER a fabricated URL.
|
|
42
|
+
- If `LOKI_HOSTED_ENDPOINT` is NOT set: print an honest "Hosted publishing
|
|
43
|
+
backend not available" message (there is no official Loki hosted service
|
|
44
|
+
yet), tell the user to set the env var or use the gist path, and exit
|
|
45
|
+
non-zero. We do NOT silently fall back to gist when the user explicitly asked
|
|
46
|
+
for `--hosted` (see "Fallback decision").
|
|
47
|
+
- If proof.json reports `redaction.applied == false`: refuse to publish.
|
|
48
|
+
- bash: `_loki_hosted_publish_proof` (curl). Bun:
|
|
49
|
+
`hostedPublishProof` (fetch). Parity-matched messages + exit codes.
|
|
50
|
+
|
|
51
|
+
2. Tier/license hook. `LOKI_TIER` (default `oss`) + optional `LOKI_LICENSE_KEY`:
|
|
52
|
+
- bash `loki_tier_gate <capability>`; Bun `tierGate(capability)` in
|
|
53
|
+
`loki-ts/src/util/tier.ts`.
|
|
54
|
+
- OSS (the default): always ALLOW, zero notes, no network, no license. This
|
|
55
|
+
is a pure no-op for every OSS user.
|
|
56
|
+
- Non-OSS without a license key: NOT allowed (honest -- we cannot verify an
|
|
57
|
+
entitlement; there is no backend). Never a fabricated grant.
|
|
58
|
+
- Non-OSS with a license key: allow, but flag that the verification backend
|
|
59
|
+
does not exist yet. We do NOT pretend the key was verified.
|
|
60
|
+
- WIRING: the gate is called ONLY from the opt-in `--hosted` seam. It is not
|
|
61
|
+
wired into any existing command path, so it cannot gate a free feature.
|
|
62
|
+
|
|
63
|
+
3. Open-core boundary doc: `docs/OPEN-CORE-BOUNDARY.md` -- what is free forever
|
|
64
|
+
vs. what hosted/paid would add.
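
The tier/license rules in item 2 reduce to a three-row decision table. A minimal sketch of that table follows; it is not the contents of `loki-ts/src/util/tier.ts`. The function name `sketchTierGate`, the `TierDecision` shape, the injected `env` parameter, and the note wording are all assumptions made for illustration, as is treating an empty `LOKI_TIER` the same as an unset one.

```typescript
// Hypothetical sketch of the tier-gate decision table. Names and the
// result shape are assumptions, not the real tierGate() API.
type Env = Record<string, string | undefined>;

interface TierDecision {
  allowed: boolean;
  notes: string[];
}

// Assumption: an unset or empty LOKI_TIER means the default tier, "oss".
function sketchCurrentTier(env: Env): string {
  const tier = (env.LOKI_TIER ?? "").trim();
  return tier === "" ? "oss" : tier;
}

function sketchTierGate(capability: string, env: Env): TierDecision {
  const tier = sketchCurrentTier(env);

  // OSS: always allow, zero notes, no network, no license.
  if (tier === "oss") {
    return { allowed: true, notes: [] };
  }

  const key = (env.LOKI_LICENSE_KEY ?? "").trim();

  // Non-OSS without a key: deny, because no entitlement can be verified.
  if (key === "") {
    return {
      allowed: false,
      notes: [`tier "${tier}" requires LOKI_LICENSE_KEY for ${capability}`],
    };
  }

  // Non-OSS with a key: allow, but say plainly that nothing was verified.
  return {
    allowed: true,
    notes: [`license key present but NOT verified for ${capability} (no backend yet)`],
  };
}
```

Passing `env` in rather than reading the process environment directly keeps the sketch testable without mutating global state; the real gate may well read the environment itself.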
|
|
65
|
+
|
|
66
|
+
## Fallback decision (reconciled)
|
|
67
|
+
|
|
68
|
+
The task phrased the fallback two ways. We follow the precise deliverable:
|
|
69
|
+
`share --hosted` with no endpoint prints "set LOKI_HOSTED_ENDPOINT or use gist"
|
|
70
|
+
and EXITS non-zero. We do NOT silently publish to gist when the user explicitly
|
|
71
|
+
asked for `--hosted`. Rationale: silent fallback would surprise a user who
|
|
72
|
+
intended a private/hosted destination by publishing to a public gist instead.
|
|
73
|
+
The plain `loki proof share <id>` (no flag) remains the gist path, unchanged.
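
The `--hosted` rules (no endpoint, unredacted proof, and what to print after the POST) can be sketched as two pure functions. This is an illustration under stated assumptions, not the code of `hostedPublishProof`: the function names, the `Outcome` type, the message wording, the exit code `1`, the order in which the two refusals are checked, and preferring `url` over `public_url` when both are present are all hypothetical.

```typescript
// Hypothetical sketch of the --hosted decision rules. All names, messages,
// and the exit code value are assumptions for illustration.
type Outcome =
  | { kind: "publish"; endpoint: string }
  | { kind: "refuse"; exitCode: number; message: string };

function planHostedPublish(
  endpoint: string | undefined,
  redactionApplied: boolean,
): Outcome {
  // No endpoint: say so and exit non-zero. Never fall back to gist here.
  if (endpoint === undefined || endpoint.trim() === "") {
    return {
      kind: "refuse",
      exitCode: 1,
      message:
        "Hosted publishing backend not available: set LOKI_HOSTED_ENDPOINT or use gist",
    };
  }
  // proof.json says redaction was not applied: refuse to publish.
  if (!redactionApplied) {
    return {
      kind: "refuse",
      exitCode: 1,
      message: "Refusing to publish: proof.json reports redaction.applied == false",
    };
  }
  return { kind: "publish", endpoint };
}

// Decide what to print after the POST. Only a URL the endpoint returned is
// ever reported as the published URL.
function describeHostedResult(
  endpoint: string,
  status: number,
  bodyText: string,
): string {
  if (status < 200 || status > 299) {
    return `Hosted publish failed: HTTP ${status} from ${endpoint}`;
  }
  try {
    const parsed: unknown = JSON.parse(bodyText);
    if (parsed !== null && typeof parsed === "object") {
      const record = parsed as Record<string, unknown>;
      for (const field of ["url", "public_url"]) {
        const value = record[field];
        if (typeof value === "string" && value !== "") return value;
      }
    }
  } catch {
    // Body was not JSON; fall through to the endpoint + status line.
  }
  return `${endpoint} (HTTP ${status})`;
}
```

Keeping the decision logic separate from the transport call means the refusal paths can be exercised without any network, which matches the mocked-endpoint approach the tests take.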
|
|
74
|
+
|
|
75
|
+
## OSS-unchanged guarantee
|
|
76
|
+
|
|
77
|
+
- The default `loki proof share` (no `--hosted`) gist path is byte-identical:
|
|
78
|
+
`--hosted` is captured as a mode flag during arg-parse and branches only AFTER
|
|
79
|
+
id + html validation; the gist code below it is untouched.
|
|
80
|
+
- `LOKI_TIER` unset vs set produces identical output/exit for existing commands
|
|
81
|
+
(asserted in tests).
|
|
82
|
+
- No existing env var, command, or default changed behavior.
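
The first guarantee rests on where the branch happens: `--hosted` is only recorded while parsing, and the hosted/gist split comes after the shared validation. A rough sketch of that ordering follows. The names `parseShareArgs` and `routeShare`, the route labels, and the injected `htmlExists` check are hypothetical; the real validation in `cmd_proof` and `proof.ts` is not shown in this plan.

```typescript
// Hypothetical sketch: capture --hosted as a mode flag, validate first,
// branch last. Names and route labels are assumptions for illustration.
interface ShareArgs {
  hosted: boolean;
  id: string | undefined;
}

function parseShareArgs(args: string[]): ShareArgs {
  let hosted = false;
  let id: string | undefined;
  for (const arg of args) {
    if (arg === "--hosted") {
      hosted = true; // recorded only; no behavior change at parse time
    } else if (id === undefined) {
      id = arg;
    }
  }
  return { hosted, id };
}

type Route = "usage-error" | "missing-html" | "hosted" | "gist";

function routeShare(args: string[], htmlExists: (id: string) => boolean): Route {
  const { hosted, id } = parseShareArgs(args);
  // Shared validation runs identically for both modes.
  if (id === undefined) return "usage-error";
  if (!htmlExists(id)) return "missing-html";
  // Only now does the flag matter; without it the gist path is reached as before.
  return hosted ? "hosted" : "gist";
}
```

With no `--hosted` in `args`, `routeShare` returns the same routes it would if the flag did not exist, which is the property the guarantee describes.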
|
|
83
|
+
|
|
84
|
+
## Deliberate gaps (honest, not omissions)
|
|
85
|
+
|
|
86
|
+
- No live Loki hosted backend, no SaaS, no license server. `--hosted` only works
|
|
87
|
+
against an endpoint the operator supplies. This is by design for R9.
|
|
88
|
+
- `public_url` in proof.json is NOT written back after a hosted publish.
|
|
89
|
+
Mutating the frozen R1 proof artifact post-hoc is a risk we deliberately avoid;
|
|
90
|
+
the published URL is printed to the user instead. A future release can wire
|
|
91
|
+
write-back once the artifact-mutation story is designed.
|
|
92
|
+
- The tier gate does not verify license keys (no backend). It is a seam only.
|
|
93
|
+
- No retries/backoff on the hosted POST (clean client stub, not a transport
|
|
94
|
+
library).
|
|
95
|
+
|
|
96
|
+
## Tests
|
|
97
|
+
|
|
98
|
+
`loki-ts/tests/commands/proof_hosted_r9.test.ts` (mock endpoint via Bun.serve,
|
|
99
|
+
no network): tier gate OSS allow-all + honest non-OSS; hosted POST hits the
|
|
100
|
+
mocked endpoint with the redacted payload (both bash + Bun routes); honest
|
|
101
|
+
no-endpoint message + non-zero exit; non-2xx honest error; unredacted-proof
|
|
102
|
+
refusal; license-key auth header; OSS-not-gated guarantee (identical output with
|
|
103
|
+
LOKI_TIER unset vs enterprise). Existing `proof.test.ts` parity unchanged.
|
|
104
|
+
|
|
105
|
+
## Files changed
|
|
106
|
+
|
|
107
|
+
- `autonomy/loki` (bash): `loki_tier_gate`, `_loki_hosted_publish_proof`,
|
|
108
|
+
`--hosted` branch in `cmd_proof` share, help text.
|
|
109
|
+
- `loki-ts/src/util/tier.ts` (new): `tierGate`, `currentTier`.
|
|
110
|
+
- `loki-ts/src/commands/proof.ts` (Bun): `hostedPublishProof`, `--hosted`
|
|
111
|
+
branch, help text.
|
|
112
|
+
- `docs/OPEN-CORE-BOUNDARY.md` (new).
|
|
113
|
+
- `loki-ts/tests/commands/proof_hosted_r9.test.ts` (new).
|